Interfaces de Voz avanzadas con VoiceXML - Iván Sixto | VoIP2DAY 2015
© 2015 Interactive Powers | www.ivrpowers.com
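Self-driving cars … Self voice services
[Diagram: an IVR (Interactive Voice Response) is presented as the autopilot of a voice service, as an autopilot is for a car. Input: DTMF keypad and microphone (Automatic Speech Recognition, ASR). Output: speaker (Text-to-Speech, TTS, or WAV audio). Connection: SIP / VoIP or TDM.]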
What is IVR?
In telephony, Interactive Voice Response, or IVR, is a phone
technology that allows a computer to detect voice and touch
tones using a normal phone call. The IVR system can respond with
pre-recorded or dynamically generated audio to further direct
callers on how to proceed. IVR systems can be used to control
almost any function where the interface can be broken down into a
series of simple menu choices. Once constructed, IVR systems
scale well to handle large call volumes.
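The idea of an interface "broken down into a series of simple menu choices" can be sketched as a small tree walked by key presses. This is only an illustration in Python: the `MENU` contents and the `run_ivr` helper are hypothetical and are not part of any IVR product or of VoiceXML itself.

```python
# Hypothetical menu tree: each node has a prompt and, optionally,
# sub-menus keyed by the DTMF digit that selects them.
MENU = {
    "prompt": "Press 1 for sales, 2 for support.",
    "choices": {
        "1": {"prompt": "Connecting you to sales."},
        "2": {
            "prompt": "Press 1 for billing, 2 for technical help.",
            "choices": {
                "1": {"prompt": "Connecting you to billing."},
                "2": {"prompt": "Connecting you to technical help."},
            },
        },
    },
}


def run_ivr(menu, digits):
    """Walk the menu tree with a sequence of key presses.

    Returns the list of prompts the caller would hear, in order.
    """
    played = [menu["prompt"]]
    node = menu
    for digit in digits:
        choices = node.get("choices")
        if not choices:
            break  # leaf reached: remaining key presses are ignored
        if digit not in choices:
            # Invalid key: stay on the same menu and repeat its prompt.
            played.append("Sorry, that is not a valid option. " + node["prompt"])
            continue
        node = choices[digit]
        played.append(node["prompt"])
    return played


if __name__ == "__main__":
    for line in run_ivr(MENU, "21"):
        print(line)
```

Calling `run_ivr(MENU, "21")` walks from the root menu into the support sub-menu and then to billing. A real IVR would play audio and wait for input instead of returning a list.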
IVR: Simple definition
[Diagram: the IVR acts as a Voice API between persons (Phone) and machines (Applications).]
IVR: Human-Machine Dialogue
[Diagram: human and machine each Listen, Process and Speak in turn. On the machine (IVR) side, Listen is ASR / SIV, Speak is TTS / WAV, and Process is VoiceXML.]
What is VoiceXML?
VoiceXML is a language for creating voice-user interfaces, particularly
for the telephone. It uses speech recognition (ASR) and touchtone
(DTMF keypad) for input, and pre-recorded audio and text-to-speech
synthesis (TTS) for output. It is based on the World Wide Web
Consortium’s (W3C’s) Extensible Markup Language (XML), and
leverages the web paradigm for application development and
deployment. By having a common language, application developers,
platform vendors, and tool providers all can benefit from code
portability and reuse.
VoiceXML: History
[Timeline with axis marks at 1998, 1999, 2000, 2001, 2002, 2006, 2010 and 2015.
Precursor meta-languages: Motorola VoxML, IBM SpeechML, Lucent Teleportal, AT&T Labs.
Milestones shown: VoiceXML 0.9, VoiceXML 1.0, W3C, VoiceXML 2.0, VoiceXML 2.1, VoiceXML 3.0 draft, Natural Language Understanding (NLU).]
W3C VoiceXML Open Standard
• W3C VoiceXML 2.0: Recommendation, March 2004
• W3C VoiceXML 2.1: Recommendation, June 2007
• W3C VoiceXML 3.0 (Draft): early stage of development, January 2006
Voice Browser or Web Browser
[Diagram: a Web Browser requests <html> pages and a Voice Browser requests <vxml> documents. Both reach a Web Server over http:// across the Internet / Web.]
HTML versus VXML
HTML                | VXML
Mouse + Display     | Phone + Keypad
HTML layout         | VXML layout
Images, video files | Audio, grammar files
Text                | Text (TTS)
Scripts             | Scripts
HTTP / HTTPS        | HTTP / HTTPS
RTP - SOAP - WSDL   | RTP - SOAP - WSDL - SIP
PBX versus IVR
Features      | PBX                 | IVR
Connect       | Phones / Extensions | Phones / Applications
Call Routing  | Person-to-Person    | Person-to-Machine
Configuration | Static (Dialplan)   | Dynamic (VoiceXML)
Interaction   | DTMF                | DTMF / TTS / ASR / NLU / SIV
3 dialogue levels: IVR … NLU
1. Key Tones (DTMF). Phone keys: “0…9 # *”
2. Direct Dialog (ASR/TTS). Deterministic dialogue: “Sales, Commercial, Support…”
3. Natural Language Understanding (NLU). Non-deterministic dialogue: “I want to contact a sales representative”
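As a rough illustration of the three levels, the sketch below sorts a caller's input into key tones, a closed vocabulary, or free-form speech. It is a toy in Python under stated assumptions: `DIRECT_OPTIONS` and `dialogue_level` are hypothetical names, and a real system would rely on ASR grammars and an NLU engine rather than string checks.

```python
# Keys available on a telephone keypad (level 1).
DTMF_KEYS = set("0123456789#*")

# Hypothetical closed vocabulary of a directed dialogue (level 2).
DIRECT_OPTIONS = {"sales", "commercial", "support"}


def dialogue_level(user_input):
    """Return 1 (DTMF), 2 (directed dialogue) or 3 (needs NLU)."""
    text = user_input.strip().lower()
    if text and all(ch in DTMF_KEYS for ch in text):
        return 1  # only keypad symbols were entered
    if text in DIRECT_OPTIONS:
        return 2  # exact match against the closed word list
    return 3  # anything else is treated as free-form speech


if __name__ == "__main__":
    for sample in ["2#", "Support", "I want to contact a sales representative"]:
        print(sample, "->", dialogue_level(sample))
```

The point of the split is that levels 1 and 2 have a finite, known set of valid inputs, while level 3 has to infer intent from an open-ended sentence.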
Extended IVR diagram
[Architecture diagram. Components: Phone, PBX (telephony Private Branch Exchange), IVR (Voice Browser), Speech Servers (TTS, ASR), NLU (Natural Language Understanding), SIV (Voice Biometrics) and Business Applications.
Link labels: SIP / TDM, VOIP | TDM, MRCP, HTTP, HTTPS, HTTP | API, API.
Content exchanged: VXML, VoiceXML + GRXML + BNF. An "IVR+" label also appears.]
Hello World!
<?xml version="1.0"?>
<!-- Variant 1: inline text, rendered by the TTS engine -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form>
    <block>Hello world!</block>
  </form>
</vxml>

<?xml version="1.0"?>
<!-- Variant 2: pre-recorded audio file -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        <audio src="helloworld.wav"/>
      </prompt>
    </block>
  </form>
</vxml>
DTMF menu
<?xml version="1.0"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <menu>
    <prompt>
      Hello. Choose among the following options:
      <!-- _dtmf and _prompt are filled in once per <choice> -->
      <enumerate>
        <value expr="_dtmf"/> for <value expr="_prompt"/>
      </enumerate>
    </prompt>
    <choice dtmf="1" next="page1.vxml"> Hotel </choice>
    <choice dtmf="2" next="page2.vxml"> Weather </choice>
    <choice dtmf="3" next="page3.vxml"> News </choice>
  </menu>
</vxml>
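Because a VoiceXML page is fetched over HTTP like any web page, a menu such as this one could also be generated by server-side code. The following Python sketch builds an equivalent document with the standard library's `xml.etree.ElementTree`. The helper name `build_dtmf_menu` and its argument format are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_dtmf_menu(greeting, options):
    """Build a VoiceXML <menu> document.

    options is a list of (label, target_url) pairs; DTMF keys 1..n are
    assigned in list order (an assumption of this sketch).
    """
    vxml = ET.Element("vxml", {"version": "2.1", "xmlns": VXML_NS})
    menu = ET.SubElement(vxml, "menu")
    prompt = ET.SubElement(menu, "prompt")
    prompt.text = greeting
    # <enumerate> template: "<key> for <label>", repeated per choice.
    template = ET.SubElement(prompt, "enumerate")
    key = ET.SubElement(template, "value", {"expr": "_dtmf"})
    key.tail = " for "
    ET.SubElement(template, "value", {"expr": "_prompt"})
    for index, (label, target_url) in enumerate(options, start=1):
        choice = ET.SubElement(
            menu, "choice", {"dtmf": str(index), "next": target_url}
        )
        choice.text = label
    return '<?xml version="1.0"?>\n' + ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_dtmf_menu(
        "Hello. Choose among the following options:",
        [("Hotel", "page1.vxml"), ("Weather", "page2.vxml"), ("News", "page3.vxml")],
    ))
```

Building the tree with a library rather than string concatenation keeps attribute quoting and element nesting well-formed by construction.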
Speech recognition (ASR)
<?xml version="1.0" encoding="ISO-8859-1"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en">
  <form>
    <field name="city">
      <prompt>Where do you want to travel to?</prompt>
      <option>New York</option>
      <option>Paris</option>
      <option>Berlin</option>
      <option>Madrid</option>
      <option>London</option>
    </field>
    <field name="travellers" type="number">
      <prompt>How many are traveling to <value expr="city"/>?</prompt>
    </field>
    <block>
      <!-- namelist entries must match the field names declared above -->
      <submit next="http://localhost/handler" namelist="city travellers"/>
    </block>
  </form>
</vxml>
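On the server side, the `<submit>` request arrives as an ordinary HTTP request carrying the listed variables. A minimal Python sketch of what a handler might do is shown below, assuming the variables arrive as URL query parameters named `city` and `travellers` (the field names of the form). The function `handle_submit` and its reply text are hypothetical.

```python
from urllib.parse import parse_qs
from xml.sax.saxutils import escape


def handle_submit(query_string):
    """Turn the submitted form values into a VoiceXML confirmation page."""
    params = parse_qs(query_string)
    # parse_qs returns a list per key; take the first value or a fallback.
    city = params.get("city", ["unknown"])[0]
    travellers = params.get("travellers", ["0"])[0]
    message = f"Booking a trip to {city} for {travellers} travellers."
    return (
        '<?xml version="1.0"?>\n'
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">\n'
        "  <form>\n"
        # escape() protects the document against &, < and > in user data.
        f"    <block>{escape(message)}</block>\n"
        "  </form>\n"
        "</vxml>\n"
    )


if __name__ == "__main__":
    print(handle_submit("city=New+York&travellers=2"))
```

The reply is itself a VoiceXML document, so the voice browser simply continues the dialogue with whatever the application returns.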
Advantages of VoiceXML
• VoiceXML is an open standard for IVR systems
• Language based on the XML / HTTP paradigm
• Supports compiled or dynamic dialogue grammars: GRXML, ABNF, …
• Integration and management of the TTS / ASR speech engines
• Compatible with all web programming languages: PHP / JSP / ASP / ...
• Universal access to databases and external systems (also for NLU)
• Allows real-time event handling