Cloud Robotics for Building
Conversational Robots
Komei Sugiura
National Institute of Information and Communications Tech., Japan
Beyond the Language Barrier:
NICT’s free software and cloud services
1. Speech to speech translation system: VoiceTra (2010)
>1M downloads.
High performance in translation to/from Asian languages
2. MCML Speech interaction SDK (2013)
The SDK enables users to build WFST-based multilingual dialogue systems.
3. Smartphone dialogue apps (2011)
Spoken dialogues and recommendation in tourist
guidance domains
4. Cloud robotics platform rospeex (2013)
40K unique users. Top-level quality for dialogue-oriented TTS in Japanese.
[New] Automatic captioning SDK for developers
http://www2.nict.go.jp/astrec-ast/mcml-sdk/index_en.html
Free of charge, but authentication required
Video
Motivation:
How can we build communicative robots to help people?
Smartphones and other consumer devices
Speech interfaces benefit consumers
cf. Market size of speech recognition
¥88B@2013→¥170B@2018 (€1.5B)*
Show me today’s
schedule
* Estimation by NEDO, TSC Foresight Vol.8, 2015
Sushi restaurants
around here
Benefit for
QA/search
GPS, contacts, other context info.
Current communication with robots
Insufficient benefit to consumers
??
??
Throw them away.
Is there any milk
in the fridge?
• Bad recognition accuracy
• User needs to specify [what,
where, how] as well as start/end
conditions
ROSPEEX:
A CLOUD ROBOTICS PLATFORM FOR
MULTILINGUAL SPOKEN DIALOGUES
Background: Speech recognition/synthesis is a bottleneck
for reducing cost in human-robot interactions
• Synthesized speech sounds
monotonous and unfriendly
• Speech recognition does not work as well as expected
XIMERA 3
(Text-reading)
Voice
talent
Target = Interactions with service robots
Rospeex:
A cloud robotics platform for multilingual spoken dialogues
• >40,000 unique users have used rospeex
• WER = 7.9% (accuracy = 92.1%) on IWSLT tst2011 (1st place in IWSLT 2012, 2013, and 2014)
• Top-level quality dialogue-oriented TTS
Python & C++ samples
are available
rospeex Search
* Free of charge for research
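The WER and accuracy figures above are related by accuracy = 100% − WER (92.1% = 100% − 7.9%). As a minimal sketch, assuming the usual definition of WER as word-level edit distance divided by the number of reference words (the slide does not state the scoring details), the metric could be computed as follows. `word_error_rate` is a hypothetical helper for illustration and is not part of rospeex.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("is there any milk in the fridge",
                          "is there milk in the bridge")
    print("WER = %.3f, accuracy = %.3f" % (wer, 1.0 - wer))
```

Because insertions count as errors, WER computed this way can exceed 100% on very poor hypotheses, so "accuracy" in this sense can become negative.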
Rospeex’s positioning in robot dialogue quadrants
Cloud APIs
(Google, Microsoft, IBM,
NTT docomo, Wit.ai,…)
Free software
Commercial software
OpenHRI,
PocketSphinx, Festival
Cloud-based
Stand-alone
Robot middleware-compatible
Incompatible
Does not work with very low-spec PCs
Robotics-specific logs are lost
Authentication
Low quality
Expensive
Distribution of rospeex users
rospeex applications (40k unique users)
Conversational agents in elderly care
facilities, service robots, humanoids,
dialogue agents, speech interfaces for car
navigation systems or smart home devices,
…
Analysis: TTS requests depend heavily on individuals
• Question: Do developers use the same sentences for TTS? If so, we can
speed up synthesis by introducing a local cache.
Cache hit
Cache miss
• Analysis on top 88 users
– New requests = 50.4% on average
– An individual uses at most 200 unique sentences
Without a cloud platform, we
cannot conduct large-scale
analysis of robot developers
Introducing a cache will
reduce communication time
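The caching idea on this slide can be sketched as a small client-side wrapper. This is only an illustration under stated assumptions: `CachedTTS` and `fake_cloud_tts` are hypothetical names, and the real rospeex client API is not shown on the slide. The wrapper stores synthesized audio keyed by (language, text) and counts hits and misses, so the "new requests" ratio reported above could be measured in the same way.

```python
class CachedTTS:
    """Local cache in front of a (hypothetical) cloud TTS call."""

    def __init__(self, synthesize):
        # synthesize(text, language) -> bytes; assumed to be the slow
        # network call to the cloud service.
        self._synthesize = synthesize
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def speak(self, text, language="ja"):
        key = (language, text)
        if key in self._cache:
            self.hits += 1          # cache hit: no network round trip
            return self._cache[key]
        self.misses += 1            # cache miss: ask the cloud service
        audio = self._synthesize(text, language)
        self._cache[key] = audio
        return audio

    def new_request_rate(self):
        """Fraction of requests that had to go to the cloud."""
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


def fake_cloud_tts(text, language):
    """Stand-in for the cloud call so the sketch runs offline."""
    return ("%s:%s" % (language, text)).encode("utf-8")


if __name__ == "__main__":
    tts = CachedTTS(fake_cloud_tts)
    for sentence in ["hello", "hello", "goodbye"]:
        tts.speak(sentence, language="en")
    print("new requests: %.1f%%" % (100 * tts.new_request_rate()))
```

With at most about 200 unique sentences per developer, an unbounded dictionary like this stays small; a size limit or eviction policy would only matter for much larger vocabularies.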
MULTIMODAL SPOKEN DIALOGUES
WITH ROBOTS
Multimodal language understanding
Kollar+ 2010
HRI 2010 Best Paper
• Input: Text, LRF, Image
• Output: path planning
• E.g. “Go down the hallway”
Iwahashi & Sugiura+ 2010
• Input: Image and speech
• Output: object manipulation
• E.g. “Place-on Elmo”
Visual QA [2015-]
• Input: Image and question
• Output: Answer
• E.g. “How many elephants are there?” -> “2”
Video
LCore: Multimodal Robot Language Acquisition
[Iwahashi, Sugiura, et al. 2010]
Key features
• Fully grounded vocabulary
• Imitation learning
• Incremental & interactive learning
• Language independent
• Learning when to ask questions
HMM “Place-on”: Place X on Y
Imitation learning for spoken language understanding:
Re-ranking hypotheses using planned trajectories’ likelihood
• Transformation of reference-point-dependent HMMs*
– Input: verb ID, object ID(s)
e.g. <place-on, Object 1, Object 3>
– Transforms HMM from intrinsic coordinate system into world
coordinate system
HMM “Place-on”
World CS
Situation
Place X on Y
* Sugiura et al., IROS 2011 RoboCup Best Paper
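The coordinate transformation described above can be illustrated for a single HMM state. The sketch below is a simplification under explicit assumptions, not the authors' implementation: it assumes the intrinsic frame differs from the world frame by a planar rotation `theta` and a translation `origin` (for example, the position of the reference object), and it transforms only the position part of a Gaussian output distribution. All function names are hypothetical.

```python
import math


def rotation(theta):
    """2-D rotation matrix for an angle in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def gaussian_to_world(mean, cov, theta, origin):
    """Map a Gaussian from the intrinsic frame to the world frame.

    mean' = R mean + origin,  cov' = R cov R^T
    (assumed rigid transform; the slide does not give the exact form).
    """
    r = rotation(theta)
    rotated = matmul(r, [[m] for m in mean])          # column vector
    world_mean = [rotated[i][0] + origin[i] for i in range(2)]
    world_cov = matmul(matmul(r, cov), transpose(r))
    return world_mean, world_cov


if __name__ == "__main__":
    # One state of a "place-on" HMM, expressed relative to the reference object.
    mean, cov = gaussian_to_world(mean=[1.0, 0.0],
                                  cov=[[4.0, 0.0], [0.0, 1.0]],
                                  theta=math.pi / 2,
                                  origin=[2.0, 3.0])
    print(mean, cov)
```

Applying the same mapping to every state yields an HMM in world coordinates, from which a trajectory for a concrete <verb, object(s)> hypothesis can be planned and scored.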
HMM-based trajectory generation using dynamic features*
Maximum likelihood trajectory, defined in terms of:
• state sequence
• HMM parameters
• time series of (position, velocity, acceleration)
• vector of mean vectors
• matrix of covariance matrices of each OPDF
• matrix of coefficients in difference approximation
• time series of position
* Tokuda, K. et al., “Speech parameter generation algorithms for HMM-based speech synthesis”, 2000
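The quantities listed on this slide fit together as in the standard parameter generation formulation of Tokuda et al. The symbols below are assumptions chosen for this sketch, since the slide's own symbols are not available here: $q$ is the state sequence, $\lambda$ the HMM parameters, $x$ the time series of position, $o$ the time series of (position, velocity, acceleration), $\mu_q$ the vector of mean vectors, $\Sigma_q$ the block-diagonal matrix of covariance matrices, and $W$ the matrix of difference-approximation coefficients.

```latex
\begin{align}
  o &= W x \\
  \hat{x} &= \arg\max_{x} \; P(W x \mid q, \lambda) \\
  \log P(W x \mid q, \lambda)
    &= -\tfrac{1}{2} (W x - \mu_q)^{\top} \Sigma_q^{-1} (W x - \mu_q)
       + \text{const} \\
  W^{\top} \Sigma_q^{-1} W \, \hat{x} &= W^{\top} \Sigma_q^{-1} \mu_q
\end{align}
```

The last line follows from setting the gradient of the log-likelihood with respect to $x$ to zero. Because velocity and acceleration are tied to position through $W$, the solution is a smooth trajectory rather than a piecewise-constant sequence of state means.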
ROBOCUP@HOME
BUILDING DOMESTIC SERVICE ROBOTS
RoboCup@Home: Benchmark tests for domestic robots
• RoboCup@Home: The largest competition for domestic robots
– One of the major RoboCup leagues
– Focuses on human-robot interaction and mobile manipulation
– Robots are evaluated by 8 standardized and 3 demonstration tasks
• Scientific challenges
– Navigation in unknown environments (e.g. a real shop), handling
everyday objects, spoken dialogues in very noisy environments, …
RoboCup@Home Standard Platform Leagues start in 2017
• Many teams need low-cost standardized platforms
• Companies know of NAO’s success after it was selected as the soccer
Standard Platform (SoftBank bought Aldebaran for 100M USD)
Toyota HSR
• Main use case = partner robot for those who need care
• Lease-based
Softbank Pepper
• Already deployed in restaurants and shops
• Very low price
Both compatible with ROS
CFPs for HSR/Pepper users will open soon
Summary
• Data-driven approaches
• Multimodal spoken dialogue with robots
• RoboCup and domestic service robots
• …and we’re hiring!


20161014IROS_WS
