Nuance
- 1. Voice Control combined with Speech-To-Text and NLU resulting in Smart UI
Reimund Schmald, Nuance
Stefan Seide, T-Systems
1 CONFIDENTIAL | © 2002-2011 Nuance Communications, Inc. All rights reserved. MOBILE SOLUTIONS
- 2. This is what we are working on!
Scene from Star Trek IV: The Voyage Home (1986)
- 3. Agenda
• Multi-Modal Input UE: Status and Trends in Mobile
• Voice enabled NLU: Requirements + Demo
• Hybrid Architecture, Programming Example
- 4. Starting with Keyboard
Type
Write
Speak
Swype
- 5. Multi-Modality in Apps
Example: amazon and iTranslate
- 6. Security and Personalization
“My voice is my password”
From completing a financial transaction to accessing sensitive content, Voice Biometrics offers security so you can proceed with confidence.
Through speaker identification, Voice Biometrics delivers a personalized experience where various user profiles are available, e.g. on shared devices such as the TV or tablets. Simply speak, and your personal settings are loaded.
- 7. Personalization – Across Devices
Text Dictionary (Local)
Speech Dictionary (Cloud)
- 8. Just the Mic Button
Requirements: High Quality SpeechToText + NLU
- 9. High Quality SpeechToText on NDEV
The industry’s FIRST developer program to offer Speech To Text and Text to Speech integration for any mobile app
8000+ developers registered to date
iOS, Android, WP 7
www.ndevmobile.com
- 10. NDEV Mobile: Client SDK Technical Aspects
SDK Components
– Recognizer object
– Audio engine
– End of Speech Detection
– Encoding (compresses request to conserve bandwidth)
– Network Transport
Server Components
– Authentication
– Recognizer
– Vocalizer TTS
Request flow
1. Client application invokes SDK
2. SDK captures request and encodes it
• Might use End of Speech, if enabled
3. SDK Network Transport sends utterance to NVC Servers
4. NVC Server authenticates Client app
5. Recognizer/TTS processes request
6. NVC Server redirects response to Client
7. SDK processes response and sends to Client app
8. Client app plays/shows response
[Diagram: Client Application; Dragon SDK (Recogniser Object, Audio Engine, End Of Speech, Encoding, Network Transport); NVC Hosted Server (Authentication, MREC with Search and Dictation, Vocalizer); arrows numbered 1 to 8]
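The request flow above can be sketched as a small in-process simulation. This is only an illustration of the division of labour between client, SDK and server, not the NDEV SDK's actual API: every name here (`Request`, `encode`, `server_handle`, `client_recognize`, the app key) is hypothetical, and `zlib` merely stands in for whatever speech codec the SDK uses.

```python
import zlib
from dataclasses import dataclass

# Hypothetical credential store; the real service's authentication is not described in the slides.
VALID_APP_KEYS = {"demo-app-key"}


@dataclass
class Request:
    app_key: str
    kind: str       # "dictation" or "tts"
    payload: bytes  # compressed audio (dictation) or compressed text (tts)


def encode(raw: bytes) -> bytes:
    """SDK side: compress the request to conserve bandwidth (zlib as a stand-in codec)."""
    return zlib.compress(raw)


def server_handle(req: Request) -> str:
    """Server side: authenticate the client app, then hand off to recognizer or TTS."""
    if req.app_key not in VALID_APP_KEYS:
        return "error: authentication failed"
    data = zlib.decompress(req.payload)
    if req.kind == "dictation":
        # A real recognizer would return a transcript; here we only report the audio size.
        return f"recognized {len(data)} bytes of audio"
    return f"synthesized speech for {data.decode()!r}"


def client_recognize(app_key: str, audio: bytes) -> str:
    """Client invokes the SDK, which encodes and 'sends' the request and returns the response."""
    req = Request(app_key=app_key, kind="dictation", payload=encode(audio))
    return server_handle(req)  # a function call stands in for the network transport
```

Calling `client_recognize("demo-app-key", audio_bytes)` walks through the whole path, while an unknown app key stops at the authentication step.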
- 11. NDev mobile Service Levels
FREE
- 12. Feature Comparison: Gold, Silver, Emerald
Silver | Gold | Emerald
Features
ASR Dictation & Search Models for 18 Languages ✓ ✓ ✓
Network TTS for 30+ Languages ✓ ✓ ✓
Bluetooth Support (8 kHz) ✓ ✓ ✓
SSL ✓ ✓
Customized Features ✓
Flexibility & Customization
UI ✓ ✓ ✓
Platforms
Android, iOS, WP7 ✓ ✓ ✓
HTTP ✓ ✓
Consulting Services Available
Availability & Support
Centralized Speech Resource & Support Forums ✓ ✓ ✓
Web Ticketing ✓ ✓
SLA ✓ ✓
Dedicated Support Contact Available
Cost
Development | Free for 90 days | Payment Options | Custom
Production | Free w/ cap | $0.009 trx or $0.24 flat | Custom
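To compare the two production prices quoted above, a short calculation helps. This is a sketch under assumptions: the slide does not say what the flat fee covers (per device, per user or per period), so the code only finds the number of transactions per flat-fee unit at which the flat option becomes cheaper. The function and constant names are illustrative.

```python
# Prices from the slide, in thousandths of a dollar to avoid floating-point rounding.
PER_TRX_MILLIS = 9    # $0.009 per transaction
FLAT_MILLIS = 240     # $0.24 flat (unit of the flat fee is not stated in the slide)


def cheaper_option(transactions: int) -> str:
    """Return which pricing option costs less for a given transaction count."""
    per_trx_total = transactions * PER_TRX_MILLIS
    if per_trx_total < FLAT_MILLIS:
        return "per-transaction"
    if per_trx_total > FLAT_MILLIS:
        return "flat"
    return "equal"


# Smallest transaction count at which per-transaction pricing exceeds the flat fee.
break_even = FLAT_MILLIS // PER_TRX_MILLIS + 1
```

With these figures, per-transaction pricing stays cheaper up to 26 transactions per flat-fee unit, and the flat fee wins from 27 onward.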
- 13. Different Levels of NLU
NUANCE PROPRIETARY NON-DISCLOSURE INFORMATION
Structured NLU
Embedded & Connected speech systems working together to determine what specific phone-related task the user is looking to complete.
“Send text to John Call me shortly”
“Search for New York Yankees”
“Update Facebook I am today in Berlin”
Unstructured NLU
Server-side natural language understanding platform that supports open-ended queries and intent classification.
“Is it raining in Berlin?”
“What movies are playing near me?”
“Make a reservation to Capital Grille in Burlington for 8 pm on Friday for 2 people”
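The structured examples on this slide follow fixed command shapes, which can be illustrated with a few regular expressions. This is a toy sketch, not how Nuance's structured NLU works internally: the intent names, slot names and patterns below are all assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical command grammar covering the three structured examples from the slide.
PATTERNS = [
    ("send_text", re.compile(r"^send text to (?P<recipient>\w+)\s+(?P<message>.+)$", re.I)),
    ("web_search", re.compile(r"^search for (?P<query>.+)$", re.I)),
    ("update_status", re.compile(r"^update (?P<network>\w+)\s+(?P<status>.+)$", re.I)),
]


def parse_command(utterance: str) -> Optional[dict]:
    """Return the intent and its slots, or None if no structured pattern matches."""
    text = utterance.strip()
    for intent, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return {"intent": intent, **match.groupdict()}
    return None  # open-ended query: a candidate for the unstructured system
```

An open-ended question such as “Is it raining in Berlin?” matches none of the patterns and returns `None`, which is the case the unstructured side is meant to cover.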
- 14. Deploying a Comprehensive Speech Solution
Both NLU systems can be combined to offer a comprehensive speech experience
Structured NLU + Unstructured NLU
Structured NLU: NVC Hybrid allows users to complete core phone functions (dialing, messaging, etc…)
Unstructured NLU: Dragon Go! allows for intelligent Web and Content access
All Web and media related queries can be passed to the unstructured NLU system (e.g. Dragon Go!)
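The combination described on this slide amounts to a dispatch decision per utterance. A minimal sketch, assuming the decision can be made from the first word alone: the verb list and the `route` function are hypothetical, and a production system would presumably rely on the recognizer's own classification rather than a keyword check.

```python
# Hypothetical set of leading verbs treated as core phone functions (dialing, messaging).
CORE_PHONE_VERBS = {"dial", "call", "text", "send", "message"}


def route(utterance: str) -> str:
    """Pick the NLU system for an utterance based on its first word."""
    words = utterance.lower().split()
    if words and words[0] in CORE_PHONE_VERBS:
        return "structured"    # handled as a core phone function
    return "unstructured"      # Web and media queries, and everything else
```

For example, `route("Dial 555 0100")` selects the structured path, while `route("What movies are playing near me?")` falls through to the unstructured one.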
- 15. Dragon Go! Directed Search
• A specific site is referenced in the query.
• Today we support 180+ content providers including…
• CNN
• eBay
• Engadget
• Facebook
• New York Times
• TechCrunch
• USA Today
• Regional Newspapers
- 16. Dragon Go! Intent Search
• CALL a business
• GET directions
• MAKE reservations
• PLAY music
• BUY tickets, products, music
• More…
- 17. Dragon Go! Category Search
• Music
• Movies
• Businesses
• Restaurants
• Sports
• News
• Shopping
• Weather
• More…
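The three Dragon Go! search types (directed, intent, category) can be told apart with a simple precedence check. The slides do not describe how Dragon Go! actually classifies queries, so this is a keyword-based sketch under that assumption; the lists reuse the site, verb and category names from the slides, and `classify` is a hypothetical helper.

```python
from typing import Optional, Tuple

# Names taken from the slides; the matching logic itself is an assumption.
KNOWN_SITES = ("cnn", "ebay", "engadget", "facebook", "new york times", "techcrunch", "usa today")
INTENT_VERBS = ("call", "get directions", "make", "play", "buy")
CATEGORIES = ("music", "movies", "businesses", "restaurants", "sports", "news", "shopping", "weather")


def classify(query: str) -> Tuple[str, Optional[str]]:
    """Return (search_type, matched_term) for a query, checking directed first."""
    q = query.lower().strip()
    for site in KNOWN_SITES:          # directed search: a specific site is referenced
        if site in q:
            return ("directed", site)
    for verb in INTENT_VERBS:         # intent search: the query starts with an action
        if q.startswith(verb + " "):
            return ("intent", verb)
    for category in CATEGORIES:       # category search: a topic keyword appears
        if category in q:
            return ("category", category)
    return ("unknown", None)
```

Checking for a referenced site first means a query like "iPad cases on eBay" is treated as directed search even though it could also fit the shopping category.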