Standards Update:

VoiceXML 3

Dan Burnett, Ph.D.
Dir. of Speech Technologies, Voxeo
(Dir. of Standards, Voxeo)
Voxeo on Standards

        Develop ahead of standards

        Make it Open Source




        Lead in standards creation

        Lead in standards adoption
© Voxeo Corporation
Past Leadership
    W3C
      •  VoiceXML 2.0/2.1, SRGS 1.0, SISR
         1.0, SSML 1.0
      •  CCXML 1.0, SCXML 1.0, EMMA 1.0

    IETF
      •  MRCPv1 extensions, MRCPv2,
            P-charge-info, SIP security




Where we are now
    W3C
      •  VoiceXML 3, SSML 1.1, Pronunciation
            Alphabet Registry, Speech in HTML 5
      •  CCXML 1.0, SCXML 1.0, EMMA next, MMI
            architecture

    IETF, 3GPP
      •  MRCPv2, XMPP (incl. multi-party Jingle and
            multi-user chat), Media Control, SIP Overload,
            SIPREC, CODEC (Speex)

    JCP
      •  JSR 289, 309 – SIP servlets, media control
      •  JSR 154, 245 – Java servlets and JavaServer
         Pages
      •  XMPP SIP servlet – submitting to JCP
VoiceXML

        [Timeline figure: VoiceXML 1.0 (2000), VoiceXML 2.0 (2004),
         VoiceXML 2.1 (2007), VoiceXML 3 (2010)]

V3 Motivations

        FIA flexibility

        New features

        Extensibility

        Better integration with other W3C languages




V3 is . . .

        a restructured core

        some new features

        convenience elements to mimic VoiceXML 2.1




V3 Architecture

        Core functionality defined in modules

        Modules combined with convenience syntax into
         profiles




Core functionality defined in modules




        Module behavior defined precisely as state
         machines



Modules + Conv. Syntax = Profiles




        Modules grouped into profiles
        Legacy (V2.1), Basic, Maximal
        Convenience syntax simplifies authoring

Convenience Syntax

        New elements and attributes, but no new
         functionality

        Behavior defined in terms of core functionality

        For example, <menu> defined in terms of
         <form> with grammars and prompts
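
As a rough illustration of that idea, the sketch below shows a small menu next to a hand-written form that behaves in much the same way, using VoiceXML 2.1 syntax. The ids, prompt text, and targets are invented for illustration, and the actual V3 expansion is whatever the specification ends up defining.

```xml
<!-- Convenience syntax: a VoiceXML 2.1 menu (ids and targets are hypothetical) -->
<menu id="main">
  <prompt>Say sales or support.</prompt>
  <choice next="#sales">sales</choice>
  <choice next="#support">support</choice>
</menu>

<!-- Roughly equivalent long form: a field with a prompt and an inline grammar -->
<form id="main_expanded">
  <field name="dept">
    <prompt>Say sales or support.</prompt>
    <grammar version="1.0" root="dept_rule">
      <rule id="dept_rule">
        <one-of>
          <item>sales</item>
          <item>support</item>
        </one-of>
      </rule>
    </grammar>
    <filled>
      <!-- With no semantic tags, the field holds the recognized text -->
      <if cond="dept == 'sales'">
        <goto next="#sales"/>
      <else/>
        <goto next="#support"/>
      </if>
    </filled>
  </field>
</form>
```

The menu adds no capability the form lacks; it only saves the author from writing the grammar and the dispatch logic by hand.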




Convenience Syntax

        Definite candidates are
          •  menu/choice/enumerate/option
          •  error/help/noinput/nomatch shortcuts
          •  link

        Possible (but different) candidates might be
          •  if/else/elseif (using SCXML)
          •  transfer (using CCXML)




New Stuff

        New media, SIV functions

        Session root documents

        Real-time controls

        Author-specifiable transition controllers

        V2 eventing model now async & compatible
         with DOM Level 3



New Functionality – Video 

        Video -- <audio> replaced by <media>, which
         allows both audio and video


       <media type="audio/x-wav" src="http://www.example.com/resource.wav"/>

       <media type="video/3gpp" src="http://www.example.com/resource.3gp"/>


       <media>     <!-- inline SSML with audio media fallback-->
        <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">
          Ich bin ein Berliner.
        </speak>
        <media type="audio/x-wav" src="ichbineinberliner.wav"/>
       </media>




New Functionality – Media Control 

        Media control -- media clipping, speed, and
         volume control now possible without resorting
         to SSML


  <media type="audio/x-wav" soundLevel="+6.0dB" speed="50%" repeatCount="2"
  src="http://www.example.com/resource.wav"/>

  <media type="video/3gpp" clipBegin="2s" clipEnd="5s" repeatDur="25s"
  src="http://www.example.com/resource.3gp"/>




New Functionality – SIV 

        SIV – speaker authentication capabilities
         available as core functionality
          •  Enrollment – creates voice model, associates it with
                id in speaker database
          •  Identification – which voice model in speaker
                database is a match for the speech?
          •  Verification – for the claimed id,
                does the speech match the voice
                model in the speaker database?



New Control – Session Root 

        Just like application root
          <vxml session="blahblah.vxml" ...>



        Well, not exactly
          •  If not specified, no session root
          •  Session root change is ignored or causes error


        First, let’s review application roots




Application Root Review



  Document loaded          Application root in effect
  A: <vxml>                AppRoot A
  B: <vxml>                AppRoot B
  C: <vxml root="B">       AppRoot B
  D: <vxml root="E">       AppRoot E
  F: <vxml root="E">       AppRoot E
  G: <vxml>                AppRoot G
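
For reference, in VoiceXML 2.x a document names its root with the application attribute on <vxml>; the root="..." notation on this slide appears to be shorthand for that. A minimal sketch of a root and leaf pair follows. The two file names and the variable are hypothetical, and each <vxml> element would live in its own file.

```xml
<!-- approot.vxml (hypothetical name): loaded with no application attribute,
     so it acts as its own application root -->
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <var name="greeting" expr="'Welcome back'"/>
</vxml>

<!-- leaf.vxml (hypothetical name): names approot.vxml as its root, so the
     root's variables are visible in application scope -->
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml"
      application="approot.vxml">
  <form>
    <block>
      <prompt><value expr="application.greeting"/></prompt>
    </block>
  </form>
</vxml>
```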




Session Root



  Document loaded                                  Session root in effect
  A: <vxml>                                        No Session Root
  B: <vxml session="C">                            Session Root C
  D: <vxml>                                        Session Root C
  E: <vxml session="F">                            Session Root C
  G: <vxml session="H" requiresession="true">      error.badfetch




Real-time Controls

        Special grammars that are always active (not just in
         the wait state)
          •  Allows arbitrary speech/dtmf
          •  Immediate: volume, speed, skip
          •  At next event processing: cancel, goto
         <form>
           <rtc grammar="digit3.grxml" action="volume" params="+5"/>
           <field name="a"> ... </field>
           <field name="b">
             <cancelrtc grammar="digit3.grxml"/>
             ... 
           </field>
         </form>

        Acts as pre-filter on input stream,
         replacing matches with silence

Transition Controllers

        Inter-element transitions now under author
         control

        Controllers at form, document, application, and
         perhaps session levels
          •  e.g. form controller specifies which form item to
                execute next

        Controllers can be in SCXML or another flow
         control language

        Default controllers will give FIA behavior in
         Legacy Profile
Transition Controllers Example 1

    <!-- document-level transition controller controls inter-form transitions -->
    <vxml ...>
     <controller ...>
       <scxml:scxml version="1.0" ...>
         <!-- SCXML code determining which form to go to next -->
       </scxml:scxml>
     </controller>

      <form id="form_a" >
       ...
        <goto next="form_b"/>     <!-- goto is only a suggestion now -->
      </form>

     <form id="form_b" >
      ...
     </form>
     ...
    </vxml>
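
To make the placeholder comment above more concrete, a controller body could be a small SCXML 1.0 state machine with one state per form. The sketch below is only an assumption about how such a controller might look: the event names (form_a.done, form_b.done) and the mapping of states to form ids are invented here and are not taken from the VoiceXML 3 drafts.

```xml
<!-- Hypothetical controller body: one SCXML state per form.
     Event names and the state-to-form mapping are assumptions. -->
<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="form_a">
  <state id="form_a">
    <!-- when form_a reports completion, move on to form_b -->
    <transition event="form_a.done" target="form_b"/>
  </state>
  <state id="form_b">
    <!-- when form_b reports completion, the dialog is finished -->
    <transition event="form_b.done" target="exit"/>
  </state>
  <final id="exit"/>
</scxml>
```

With a controller like this in charge of sequencing, a goto inside a form becomes one input among several rather than a direct jump, which matches the "only a suggestion" comment in the slide.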



Transition Controllers Example 2


  <!-- form-level transition controller controls inter-field transitions -->
  <vxml ...>
   <form>
     <controller src="myformbehavior.scxml"/>

     <field name="field_a" > ... </field>
     <field name="field_b" > ... </field>
     <field name="field_c" > ... </field>
     <field name="field_d" > ... </field>
   </form>
   ...
  </vxml>




For More V3 Info

        Follow the work
          •  http://www.w3.org/Voice

        Check out our recent Developer Jam Session
          •  http://developers.voiceobjects.com/tech-topics/monthly-jam-sessions/

        Contact me
          •  dburnett at voxeo dot com


                               Dan Burnett, Ph.D.
                      Dir. of Speech Technologies, Voxeo
Voxeo Summit 2010: Standards Update: VoiceXML3
