SlideShare una empresa de Scribd logo
1 de 49
Descargar para leer sin conexión
ChatBot 的智慧與靈魂
YUN-NUNG (VIVIAN) CHEN 陳縕儂
Apr 27th, 2017
@iThome #ChatBot Day
HTTP://VIVIANCHEN.IDV.TW
Intelligent Assistants 2
Apple Siri (2011) Google Now (2012)
Google Assistant (2016)
Facebook M & Bot (2015)
Google Home (2016)
Microsoft Cortana (2014)
Amazon Alexa/Echo
(2014)
Why We Need?
 Get things done
 E.g. set up alarm/reminder, take note
 Easy access to structured data, services and apps
 E.g. find docs/photos/restaurants
 Assist your daily schedule and routine
 E.g. commute alerts to/from work
 Be more productive in managing your work and personal life
3
“Hey Assistant”
Why Natural Language?
 Global Digital Statistics (2015 January)
4
Global Population
7.21B
Active Internet Users
3.01B
Active Social
Media Accounts
2.08B
Active Unique
Mobile Users
3.65B
The more natural and convenient input of devices evolves towards speech.
Intelligent Assistant Architecture 5
Reactive
Assistance
ASR, LU, Dialog, LG, TTS
Proactive
Assistance
Inferences, User
Modeling, Suggestions
Data
Back-end Data
Bases, Services and
Client Signals
Device/Service End-points
(Phone, PC, Xbox, Web Browser, Messaging Apps)
User Experience
“restaurant suggestions”“call taxi”
Dialogue System / ChatBot
 Dialogue systems are intelligent agents that are able to help users finish tasks
more efficiently via spoken interactions.
 Dialogue systems are being incorporated into various devices (smart-phones,
smart TVs, in-car navigating system, etc).
6
JARVIS – Iron Man’s Personal Assistant Baymax – Personal Healthcare Companion
Good dialogue systems assist users to access information conveniently and finish tasks efficiently.
App  Bot
 A bot is responsible for a “single” domain, similar to an app
7
Seamless and automatic information transferring across domains
 reduce duplicate information and interaction
愛食記
Map
LINE
Goal: Schedule a lunch with Vivian
KKBOX
GUI v.s. CUI (Conversational UI) 8
https://github.com/enginebai/Movie-lol-android
GUI v.s. CUI (Conversational UI)
網站/APP內GUI 即時通訊內的CUI
情境 探索式,使用者沒有特定目標 搜尋式,使用者有明確的指令
資訊量 多 少
資訊精準度 低 高
資訊呈現 結構式 非結構式
介面呈現 以圖像為主 以文字為主
介面操作 以點選為主 以文字或語音輸入為主
學習與引導
使用者需了解、學習、適應不同的介面操
作
如同使用即時通訊軟體,使用者無需另行
學習,只需遵循引導
服務入口 需另行下載App或進入特定網站
與即時通訊軟體整合,可適時在對話裡提
供服務
服務內容 需有明確架構供檢索 可接受龐雜、彈性的內容
人性化程度 低,如同操作機器 高,如同與人對話
9
Two Branches of Bots
Task-Oriented
 Personal assistant, helps users achieve a
certain task
 Combination of rules and statistical
components
 POMDP for spoken dialog systems (Williams
and Young, 2007)
 End-to-end trainable task-oriented dialogue
system (Wen et al., 2016)
 End-to-end reinforcement learning dialogue
system (Li et al., 2017; Zhao and Eskenazi, 2016)
Chit-Chat
 No specific goal, focus on natural
responses
 Using variants of generation model
 A neural conversation model (Vinyals and Le, 2015)
 Reinforcement learning for dialogue generation (Li
et al., 2016)
 Conversational contextual cues for response ranking
(AI-Rfou et al., 2016)
10
Dialogue System Framework
MODULAR SYSTEM
11
System Framework 12
Speech Recognition
Language Understanding (LU)
• Domain Identification
• User Intent Detection
• Slot Filling
Dialogue Management (DM)
• Dialogue State Tracking
• Policy Optimization
Generation
Hypothesis
are there any action movies to
see this weekend
Semantic Frame
request_movie
genre=action, date=this weekend
System Action/Policy
request_location
Text response
Where are you located?
Screen Display
location?
Text Input
Are there any action movies to see this weekend?
Speech Signal current bottleneck
 error propagation
Backend Database
1 2
3 4
Interaction Example
Intelligent Agent
Q: How does a dialogue system process this request?
Good Taiwanese eating places include Din Tai Fung,
etc. What do you want to choose?
好的台式餐廳包含鼎泰豐…,你想要選哪一間?
find a good eating place for taiwanese food
我想找好的台式食物
14
Language Understanding
System Framework 16
Speech Recognition
Language Understanding (LU)
• Domain Identification
• User Intent Detection
• Slot Filling
Dialogue Management (DM)
• Dialogue State Tracking
• Policy Optimization
Generation
Hypothesis
are there any action movies to
see this weekend
Semantic Frame
request_movie
genre=action, date=this weekend
System Action/Policy
request_location
Text response
Where are you located?
Screen Display
location?
Text Input
Are there any action movies to see this weekend?
Speech Signal
Backend Database
1. Domain Identification
Requires Predefined Domain Ontology
17
Organized Domain Knowledge (Database)
Restaurant DB Taxi DB Movie DB
Classification!
find a good eating place for taiwanese food
我想找好的台式食物
find a good eating place for taiwanese food
我想找好的台式食物
2. Intent Detection
Requires Predefined Schema
18
Restaurant DB
FIND_RESTAURANT
FIND_PRICE
FIND_TYPE
:
Classification!
find a good eating place for taiwanese food
我想找好的台式食物
3. Slot Filling
Requires Predefined Schema
19
Restaurant DB
Restaurant Rating Type
Rest 1 good Taiwanese
Rest 2 bad Thai
: : :
FIND_RESTAURANT
rating=“good”
type=“taiwanese”
SELECT restaurant {
rest.rating=“good”
rest.type=“taiwanese”
}Semantic Frame Sequence Labeling
O O B-rating O O O B-type O
Dialogue Management
State Tracking
Dialogue Management
Policy Optimization
System Framework 21
Speech Recognition
Language Understanding (LU)
• Domain Identification
• User Intent Detection
• Slot Filling
Dialogue Management (DM)
• Dialogue State Tracking
• Policy Optimization
Generation
Hypothesis
are there any action movies to
see this weekend
Semantic Frame
request_movie
genre=action, date=this weekend
System Action/Policy
request_location
Text response
Where are you located?
Screen Display
location?
Text Input
Are there any action movies to see this weekend?
Speech Signal
Backend Database
Dialogue State Tracking
Requires Hand-Crafted States
22
location rating type
loc, rating rating, type loc, type
all
i want it near to my office
我希望可以離我辦公室近一點
NULL
find a good eating place for taiwanese food
我想找好的台式食物
i want it near to my office
我希望可以離我辦公室近一點
find a good eating place for taiwanese food
我想找好的台式食物
Dialogue State Tracking
Requires Hand-Crafted States
23
location rating type
loc, rating rating, type loc, type
all
NULL
i want it near to my office
我希望可以離我辦公室近一點
find a good eating place for taiwanese food
我想找好的台式食物
Dialogue State Tracking
Handling Errors and Confidence
24
FIND_RESTAURANT
rating=“good”
type=“taiwanese”
FIND_RESTAURANT
rating=“good”
type=“thai”
FIND_RESTAURANT
rating=“good”
location rating type
loc, rating rating, type loc, type
all
NULL
?
?
find a good eating place for taiwanese food
我想找好的台式食物
rating=“good”,
type=“thai”
rating=“good”,
type=“taiwanese”
?
?
Dialogue Policy for Agent Action
 Inform(location=“Taipei 101”)
 “The nearest one is at Taipei 101”
 Request(location)
 “Where is your home?”
 Confirm(type=“taiwanese”)
 “Did you want Taiwanese food?”
 Task Completion / Information Display
 ticket booked, weather information
25
Generation
System Framework 27
Speech Recognition
Language Understanding (LU)
• Domain Identification
• User Intent Detection
• Slot Filling
Dialogue Management (DM)
• Dialogue State Tracking
• Policy Optimization
Generation
Hypothesis
are there any action movies to
see this weekend
Semantic Frame
request_movie
genre=action, date=this weekend
System Action/Policy
request_location
Text response
Where are you located?
Screen Display
location?
Text Input
Are there any action movies to see this weekend?
Speech Signal
Backend Database
Output / Natural Language Generation
 Goal: generate natural language or GUI given the selected dialogue
action for interactions
 Inform(location=“Taipei 101”)
 “The nearest one is at Taipei 101” v.s.
 Request(location)
 “Where is your home?” v.s.
 Confirm(type=“taiwanese”)
 “Did you want Taiwanese food?” v.s.
28
Challenges and Recent Trends
DEEP LEARNING / MULTIMODALITY/ DIALOGUE COVERAGE / DIALOGUE COMPLEXITY
29
Challenges
 Predefined semantic schema
Chen et al., “Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken Language Understanding,” in ACL-IJCNLP, 2015.
 Data without annotations
Chen et al., “Zero-Shot Learning of Intent Embeddings for Expansion by Convolutional Deep Structured Semantic Models,” in ICASSP, 2016.
 Semantic concept interpretation
Chen et al., “Deriving Local Relational Surface Forms from Dependency-Based Entity Embeddings for Unsupervised Spoken Language Understanding,” in SLT, 2014.
 Predefined dialogue states
Chen, et al., “End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Language Understanding,” in Interspeech, 2016.
 Error propagation
Hakkani-Tur et al., “Multi-Domain Joint Semantic Frame Parsing using Bi-directional RNN-LSTM,” in Interspeech, 2016.
 Cross-domain intention/bot hierarchy
Sun et al., “An Intelligent Assistant for High-Level Task Understanding,” in IUI, 2016.
Sun et al., “AppDialogue: Multi-App Dialogues for Intelligent Assistants,” in LREC, 2016.
Chen et al., “Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken Language Understanding,” in ICMI, 2016.
 Cross-domain information transferring
Kim et al., “New Transfer Learning Techniques For Disparate Label Sets,” in ACL-IJCNLP, 2015.
30
FIND_RESTAURANT
rating=“good” rating=5? 4?
HotelRest Flight
Travel
Trip
Planning
Deep Learning for LU
 IOB Sequence Labeling for Slot Filling
 Intent Classification
31
𝑤0 𝑤1 𝑤2 𝑤 𝑛
ℎ0
𝑓
ℎ1
𝑓
ℎ2
𝑓
ℎ 𝑛
𝑓
ℎ0
𝑏
ℎ1
𝑏
ℎ2
𝑏 ℎ 𝑛
𝑏
𝑦0 𝑦1 𝑦2 𝑦 𝑛
(a) LSTM (b) LSTM-LA (c) bLSTM-LA
(d) Intent LSTM
intent
𝑤0 𝑤1 𝑤2 𝑤 𝑛
ℎ0 ℎ1 ℎ2 ℎ 𝑛
𝑦0 𝑦1 𝑦2 𝑦 𝑛
𝑤0 𝑤1 𝑤2 𝑤 𝑛
ℎ0 ℎ1 ℎ2 ℎ 𝑛
𝑦0 𝑦1 𝑦2 𝑦 𝑛
𝑤0 𝑤1 𝑤2 𝑤 𝑛
ℎ0 ℎ1 ℎ2 ℎ 𝑛
Deep Learning for LU
 Joint multi-domain intent prediction and slot filling
 Information can mutually enhanced
32
Hakkani-Tur, et al., “Multi-Domain Joint Semantic Frame Parsing using Bi-directional RNN-LSTM,” in Interspeech, 2016.
ht-1 ht+1ht
taiwanese
B-type
food please
O O
hT+1
EOS
FIND_REST
Slot Filling Intent
Prediction
Contextual LU (Chen et al., 2016) 33
just sent email to bob about fishing this weekend
O O O O
B-contact_name
O
B-subject I-subject I-subject
U
S
I send_emailD communication
 send_email(contact_name=“bob”, subject=“fishing this weekend”)
are we going to fish this weekend
U1
S2
 send_email(message=“are we going to fish this weekend”)
send email to bob
U2
 send_email(contact_name=“bob”)
B-message
I-message
I-message I-message I-message
I-message I-message
B-contact_nameS1
Domain Identification  Intent Prediction  Slot Filling
Chen et al., "End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Language Understanding,” in Interspeech, 2016.
Supervised v.s. Reinforcement
 Supervised
 Reinforcement
Hello 
Agent
……
Agent
……. ……. ……
Bad
“Hello” Say “Hi”
“Bye bye” Say “Good bye”
Learning from teacher
Learning from critics
34
Scenario of Reinforcement Learning
Agent learns to take actions to maximize expected reward.
35
Environment
Observation ot Action at
Reward rt
If win, reward = 1
If loss, reward = -1
Otherwise, reward = 0
Next Move
Dialogue Policy Optimization
 Dialogue management in a RL framework
36
U s e r
Reward
ObservationAction
Environment
Agent
Natural Language Generation Language Understanding
Dialogue Manager
Slides credited by Pei-Hao Su
Optimized dialogue policy selects the best action that can maximize the future reward.
Correct rewards are a crucial factor in dialogue policy training
Reinforcement Learning Based Bot
User goal: Two tickets for “the witch” tomorrow 9:30 PM at regal meridian 16, Seattle.
37
RULE BASED SYSTEM
User: Can I get 2 tickets for the witch?
Agent: Which theater do you want?
User: Regal meridian 16
Agent: What time would you like to see it?
User: 9:30 pm
Agent: What date would you like to see it?
User: Tomorrow!
Agent: How many people are you?
User: Just two adults.
Agent: What is your location?
User: Seattle!
Agent: What kind of video format do you like?
User: Oh, any kind is okay.
Agent: Do you have any price constraint?
User: No, I don’t care.
Agent: Great - I was able to purchase 2 tickets for
you to see the witch tomorrow at regal meridian
16 theater in Seattle at 9:30 pm.
User: Thank you.
Agent: Thank you!
REINFORCEMENT LEARNING SYSTEM
User: Can I get 2 tickets for the witch?
Agent: What date would you like to see it?
User: Tomorrow
Agent: Which theater do you prefer?
User: Regal meridian 16
Agent: What is your location?
User: Seattle
Agent: What time would you like to see it?
User: 9:30 pm.
Agent: Great - I was able to purchase 2 tickets
for you to see the witch tomorrow at regal
meridian 16 theater in Seattle at 9:30 pm.
User: Thanks.
Agent: Thanks!
The system can learn how to efficiently interact
with users for task completion
Li et al., "End-to-End Task-Completion Neural Dialogue Systems,” in arXiv, 2017.
Brain Signal for Understanding
 Misunderstanding detection by brain signal
 Green: listen to the correct answer
 Red: listen to the wrong answer
38
Detecting misunderstanding via brain signal in order to correct the understanding results
Sridharan et al., “NeuroDialog: An EEG-Enabled Spoken Dialog Interface,” in ICMI, 2012.
Video for Intent Understanding 39
Proactive (from camera)
I want to see a movie on TV!
Intent: turn_on_tv
Sir, may I turn on the TV for you?
Proactively understanding user intent to initiate the dialogues.
App Behavior for Understanding
 Task: user intent prediction
 Challenge: language ambiguity
 User preference
 Some people prefer “Message” to “Email”
 Some people prefer “Outlook” to “Gmail”
 App-level contexts
 “Message” is more likely to follow “Camera”
 “Email” is more likely to follow “Excel”
40
send to vivian
v.s.
Email? Message?
Communication
Considering behavioral patterns in history to model understanding for intent prediction.
Chen et al., “Leveraging Behavior Patterns of Mobile Applications for Personalized Spoken Language Understanding,” in ICMI, 2015.
Evolution Roadmap 41
Single
domain
systems
Extended
systems
Multi-
domain
systems
Open
domain
systems
Dialogue breadth (coverage)
Dialoguedepth(complexity)
What is influenza?
I’ve got a cold what do I do?
Tell me a joke.
I feel sad…
Intent Expansion
 Transfer dialogue acts across domains
 Dialogue acts are similar for multiple domains
 Learning new intents by information from other domains
CDSSM
New Intent
Intent Representation
1
2
K
:
Embedding
Generation
K+1
K+2<change_calender>
Training Data
<change_note>
“adjust my note”
:
<change_setting>
“volume turn down”
The dialogue act representations can be automatically learned for other domains
postpone my meeting to five pm
Chen et al., “Zero-Shot Learning of Intent Embeddings for Expansion by Convolutional Deep Structured Semantic Models,” in ICASSP 2016.
42
Policy for Domain Adaptation
 Bayesian committee machine (BCM) enables estimated Q-
function to share knowledge across domains
QRDR
QHDH
QL DL
Committee Model
The policy from a new domain can be boosted by the committee policy
#dialogues
reward
Gašić, et al., “Policy Committee for Adaptation in Multi-Domain Spoken Dialogue Systems,” in ASRU, 2015.
43
Evolution Roadmap 44
Knowledge based system
Common sense system
Empathetic systems
Dialogue breadth (coverage)
Dialoguedepth(complexity)
What is influenza?
I’ve got a cold what do I do?
Tell me a joke.
I feel sad…
High-Level Intention for Dialogue Planning
 High-level intention may span several domains
Schedule a lunch with Vivian.
find restaurant check location contact play music
What kind of restaurants do you prefer?
The distance is …
Should I send the restaurant information to Vivian?
Users can interact via high-level descriptions and the
system learns how to plan the dialogues
Sun, et al., “An Intelligent Assistant for High-Level Task Understanding,” in IUI, 2016.
45
Empathy in Dialogue System
 Embed an empathy module
 Recognize emotion using multimodality
 Generate emotion-aware responses
46
Emotion Recognizer
vision
speech
text
Fung et al., “Towards Empathetic Human-Robot Interactions,” in arXiv, 2016.
Conclusion
47
Conclusion
 The conversational systems can manage information access via spoken interactions
 A domain is usually constrained by the backend database
 Semantic schema should be predefined
 Cross-domain knowledge and intention is difficult to handled
 Human-like dialogue system
 Robustness  error handling via interactions / reasoning
 Dialogue coverage  domain switching / open-domain system
 Dialogue complexity  common sense and empathy
48
Q & A
THANKS FOR YOUR ATTENTION!
49

Más contenido relacionado

La actualidad más candente

Statistical Learning from Dialogues for Intelligent Assistants
Statistical Learning from Dialogues for Intelligent AssistantsStatistical Learning from Dialogues for Intelligent Assistants
Statistical Learning from Dialogues for Intelligent AssistantsYun-Nung (Vivian) Chen
 
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...Yun-Nung (Vivian) Chen
 
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...Yun-Nung (Vivian) Chen
 
One Day for Bot 一天搞懂聊天機器人
One Day for Bot 一天搞懂聊天機器人One Day for Bot 一天搞懂聊天機器人
One Day for Bot 一天搞懂聊天機器人Yun-Nung (Vivian) Chen
 
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...Yun-Nung (Vivian) Chen
 
End-to-End Task-Completion Neural Dialogue Systems
End-to-End Task-Completion Neural Dialogue SystemsEnd-to-End Task-Completion Neural Dialogue Systems
End-to-End Task-Completion Neural Dialogue SystemsYun-Nung (Vivian) Chen
 
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...Yun-Nung (Vivian) Chen
 
2017 Tutorial - Deep Learning for Dialogue Systems
2017 Tutorial - Deep Learning for Dialogue Systems2017 Tutorial - Deep Learning for Dialogue Systems
2017 Tutorial - Deep Learning for Dialogue SystemsMLReview
 
An Intelligent Assistant for High-Level Task Understanding
An Intelligent Assistant for High-Level Task UnderstandingAn Intelligent Assistant for High-Level Task Understanding
An Intelligent Assistant for High-Level Task UnderstandingYun-Nung (Vivian) Chen
 
How the Context Matters Language and Interaction in Dialogues
How the Context Matters Language and Interaction in DialoguesHow the Context Matters Language and Interaction in Dialogues
How the Context Matters Language and Interaction in DialoguesYun-Nung (Vivian) Chen
 
Deep Dialog System Review
Deep Dialog System ReviewDeep Dialog System Review
Deep Dialog System ReviewNguyen Quang
 
Dialogue systems and personal assistants
Dialogue systems and personal assistantsDialogue systems and personal assistants
Dialogue systems and personal assistantsNatalia Konstantinova
 
Multiskill Conversational AI
Multiskill Conversational AIMultiskill Conversational AI
Multiskill Conversational AIDaniel Kornev
 
Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...krisztianbalog
 
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?krisztianbalog
 
Managing Dialog Strategy in Multiskill AI Assistant with Discourse Management
Managing Dialog Strategy in Multiskill AI Assistant with Discourse ManagementManaging Dialog Strategy in Multiskill AI Assistant with Discourse Management
Managing Dialog Strategy in Multiskill AI Assistant with Discourse ManagementDaniel Kornev
 
God Mode for designing scenario-driven skills for DeepPavlov Dream
God Mode for designing scenario-driven skills for DeepPavlov DreamGod Mode for designing scenario-driven skills for DeepPavlov Dream
God Mode for designing scenario-driven skills for DeepPavlov DreamDaniel Kornev
 
Managing Dialog Strategy In Multiskill AI Assistant.pdf
Managing Dialog Strategy In Multiskill AI Assistant.pdfManaging Dialog Strategy In Multiskill AI Assistant.pdf
Managing Dialog Strategy In Multiskill AI Assistant.pdfDaniel Kornev
 

La actualidad más candente (18)

Statistical Learning from Dialogues for Intelligent Assistants
Statistical Learning from Dialogues for Intelligent AssistantsStatistical Learning from Dialogues for Intelligent Assistants
Statistical Learning from Dialogues for Intelligent Assistants
 
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
 
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...
 
One Day for Bot 一天搞懂聊天機器人
One Day for Bot 一天搞懂聊天機器人One Day for Bot 一天搞懂聊天機器人
One Day for Bot 一天搞懂聊天機器人
 
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
 
End-to-End Task-Completion Neural Dialogue Systems
End-to-End Task-Completion Neural Dialogue SystemsEnd-to-End Task-Completion Neural Dialogue Systems
End-to-End Task-Completion Neural Dialogue Systems
 
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...
 
2017 Tutorial - Deep Learning for Dialogue Systems
2017 Tutorial - Deep Learning for Dialogue Systems2017 Tutorial - Deep Learning for Dialogue Systems
2017 Tutorial - Deep Learning for Dialogue Systems
 
An Intelligent Assistant for High-Level Task Understanding
An Intelligent Assistant for High-Level Task UnderstandingAn Intelligent Assistant for High-Level Task Understanding
An Intelligent Assistant for High-Level Task Understanding
 
How the Context Matters Language and Interaction in Dialogues
How the Context Matters Language and Interaction in DialoguesHow the Context Matters Language and Interaction in Dialogues
How the Context Matters Language and Interaction in Dialogues
 
Deep Dialog System Review
Deep Dialog System ReviewDeep Dialog System Review
Deep Dialog System Review
 
Dialogue systems and personal assistants
Dialogue systems and personal assistantsDialogue systems and personal assistants
Dialogue systems and personal assistants
 
Multiskill Conversational AI
Multiskill Conversational AIMultiskill Conversational AI
Multiskill Conversational AI
 
Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...
 
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
 
Managing Dialog Strategy in Multiskill AI Assistant with Discourse Management
Managing Dialog Strategy in Multiskill AI Assistant with Discourse ManagementManaging Dialog Strategy in Multiskill AI Assistant with Discourse Management
Managing Dialog Strategy in Multiskill AI Assistant with Discourse Management
 
God Mode for designing scenario-driven skills for DeepPavlov Dream
God Mode for designing scenario-driven skills for DeepPavlov DreamGod Mode for designing scenario-driven skills for DeepPavlov Dream
God Mode for designing scenario-driven skills for DeepPavlov Dream
 
Managing Dialog Strategy In Multiskill AI Assistant.pdf
Managing Dialog Strategy In Multiskill AI Assistant.pdfManaging Dialog Strategy In Multiskill AI Assistant.pdf
Managing Dialog Strategy In Multiskill AI Assistant.pdf
 

Similar a Chatbot的智慧與靈魂

[系列活動] 一天搞懂對話機器人
[系列活動] 一天搞懂對話機器人[系列活動] 一天搞懂對話機器人
[系列活動] 一天搞懂對話機器人台灣資料科學年會
 
Yun-Nung (Vivian) Chen - 2015 - Unsupervised Learning and Modeling of Knowled...
Yun-Nung (Vivian) Chen - 2015 - Unsupervised Learning and Modeling of Knowled...Yun-Nung (Vivian) Chen - 2015 - Unsupervised Learning and Modeling of Knowled...
Yun-Nung (Vivian) Chen - 2015 - Unsupervised Learning and Modeling of Knowled...Association for Computational Linguistics
 
Building a Single User Experience
Building a Single User ExperienceBuilding a Single User Experience
Building a Single User ExperienceNina McHale
 
single ux il2011
single ux il2011single ux il2011
single ux il2011jjbattles
 
Building a Single User Experience
Building a Single User ExperienceBuilding a Single User Experience
Building a Single User ExperienceRachel Vacek
 
Understanding Human Conversations with AI
Understanding Human Conversations with AI Understanding Human Conversations with AI
Understanding Human Conversations with AI Rajath D M
 
Realizzare un Virtual Assistant con Bot Framework Azure e Unity
Realizzare un Virtual Assistant con Bot Framework Azure e UnityRealizzare un Virtual Assistant con Bot Framework Azure e Unity
Realizzare un Virtual Assistant con Bot Framework Azure e UnityMarco Parenzan
 
Text-mining and Automation
Text-mining and AutomationText-mining and Automation
Text-mining and Automationbenosteen
 
Google Cloud Platform - Cloud-Native Roadshow Stuttgart
Google Cloud Platform - Cloud-Native Roadshow StuttgartGoogle Cloud Platform - Cloud-Native Roadshow Stuttgart
Google Cloud Platform - Cloud-Native Roadshow StuttgartVMware Tanzu
 
Google Cloud Platform Munich
Google Cloud Platform MunichGoogle Cloud Platform Munich
Google Cloud Platform MunichVMware Tanzu
 
Computers have feelings too
Computers have feelings tooComputers have feelings too
Computers have feelings tooPaul Glavich
 
Software Bots as Superheroes in the SPACE of Developer Productivity
Software Bots as Superheroes in the SPACE of Developer ProductivitySoftware Bots as Superheroes in the SPACE of Developer Productivity
Software Bots as Superheroes in the SPACE of Developer ProductivityMargaret-Anne Storey
 
Blazing Cloud: Agile Product Development
Blazing Cloud: Agile Product DevelopmentBlazing Cloud: Agile Product Development
Blazing Cloud: Agile Product DevelopmentSarah Allen
 
.NET Fest 2017. Олександр Краковецький. Інструменти та технології Microsoft в...
.NET Fest 2017. Олександр Краковецький. Інструменти та технології Microsoft в....NET Fest 2017. Олександр Краковецький. Інструменти та технології Microsoft в...
.NET Fest 2017. Олександр Краковецький. Інструменти та технології Microsoft в...NETFest
 
Intelligent Apps - Amplifying Human Ingenuity
Intelligent Apps - Amplifying Human IngenuityIntelligent Apps - Amplifying Human Ingenuity
Intelligent Apps - Amplifying Human IngenuityDavid J Rosenthal
 
Yun-Nung (Vivian) Chen - 2015 - Matrix Factorization with Knowledge Graph Pr...
 Yun-Nung (Vivian) Chen - 2015 - Matrix Factorization with Knowledge Graph Pr... Yun-Nung (Vivian) Chen - 2015 - Matrix Factorization with Knowledge Graph Pr...
Yun-Nung (Vivian) Chen - 2015 - Matrix Factorization with Knowledge Graph Pr...Association for Computational Linguistics
 
Mobile Information Architecture
Mobile Information ArchitectureMobile Information Architecture
Mobile Information ArchitectureAndy Fitzgerald
 

Similar a Chatbot的智慧與靈魂 (20)

[系列活動] 一天搞懂對話機器人
[系列活動] 一天搞懂對話機器人[系列活動] 一天搞懂對話機器人
[系列活動] 一天搞懂對話機器人
 
Yun-Nung (Vivian) Chen - 2015 - Unsupervised Learning and Modeling of Knowled...
Yun-Nung (Vivian) Chen - 2015 - Unsupervised Learning and Modeling of Knowled...Yun-Nung (Vivian) Chen - 2015 - Unsupervised Learning and Modeling of Knowled...
Yun-Nung (Vivian) Chen - 2015 - Unsupervised Learning and Modeling of Knowled...
 
Building a Single User Experience
Building a Single User ExperienceBuilding a Single User Experience
Building a Single User Experience
 
single ux il2011
single ux il2011single ux il2011
single ux il2011
 
Building a Single User Experience
Building a Single User ExperienceBuilding a Single User Experience
Building a Single User Experience
 
Realizing AI Conversational Bot
Realizing AI Conversational BotRealizing AI Conversational Bot
Realizing AI Conversational Bot
 
Understanding Human Conversations with AI
Understanding Human Conversations with AI Understanding Human Conversations with AI
Understanding Human Conversations with AI
 
Realizzare un Virtual Assistant con Bot Framework Azure e Unity
Realizzare un Virtual Assistant con Bot Framework Azure e UnityRealizzare un Virtual Assistant con Bot Framework Azure e Unity
Realizzare un Virtual Assistant con Bot Framework Azure e Unity
 
Text-mining and Automation
Text-mining and AutomationText-mining and Automation
Text-mining and Automation
 
Google Cloud Platform - Cloud-Native Roadshow Stuttgart
Google Cloud Platform - Cloud-Native Roadshow StuttgartGoogle Cloud Platform - Cloud-Native Roadshow Stuttgart
Google Cloud Platform - Cloud-Native Roadshow Stuttgart
 
Google Cloud Platform Munich
Google Cloud Platform MunichGoogle Cloud Platform Munich
Google Cloud Platform Munich
 
Computers have feelings too
Computers have feelings tooComputers have feelings too
Computers have feelings too
 
Software Bots as Superheroes in the SPACE of Developer Productivity
Software Bots as Superheroes in the SPACE of Developer ProductivitySoftware Bots as Superheroes in the SPACE of Developer Productivity
Software Bots as Superheroes in the SPACE of Developer Productivity
 
Blazing Cloud: Agile Product Development
Blazing Cloud: Agile Product DevelopmentBlazing Cloud: Agile Product Development
Blazing Cloud: Agile Product Development
 
Yahoo Help Content Strategy - Chris Todd
Yahoo Help Content Strategy -  Chris ToddYahoo Help Content Strategy -  Chris Todd
Yahoo Help Content Strategy - Chris Todd
 
Intelligent ChatBot
Intelligent ChatBotIntelligent ChatBot
Intelligent ChatBot
 
.NET Fest 2017. Олександр Краковецький. Інструменти та технології Microsoft в...
.NET Fest 2017. Олександр Краковецький. Інструменти та технології Microsoft в....NET Fest 2017. Олександр Краковецький. Інструменти та технології Microsoft в...
.NET Fest 2017. Олександр Краковецький. Інструменти та технології Microsoft в...
 
Intelligent Apps - Amplifying Human Ingenuity
Intelligent Apps - Amplifying Human IngenuityIntelligent Apps - Amplifying Human Ingenuity
Intelligent Apps - Amplifying Human Ingenuity
 
Yun-Nung (Vivian) Chen - 2015 - Matrix Factorization with Knowledge Graph Pr...
 Yun-Nung (Vivian) Chen - 2015 - Matrix Factorization with Knowledge Graph Pr... Yun-Nung (Vivian) Chen - 2015 - Matrix Factorization with Knowledge Graph Pr...
Yun-Nung (Vivian) Chen - 2015 - Matrix Factorization with Knowledge Graph Pr...
 
Mobile Information Architecture
Mobile Information ArchitectureMobile Information Architecture
Mobile Information Architecture
 

Último

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 

Último (20)

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 

Chatbot的智慧與靈魂

  • 1. ChatBot 的智慧與靈魂 YUN-NUNG (VIVIAN) CHEN 陳縕儂 Apr 27th, 2017 @iThome #ChatBot Day HTTP://VIVIANCHEN.IDV.TW
  • 2. Intelligent Assistants 2 Apple Siri (2011) Google Now (2012) Google Assistant (2016) Facebook M & Bot (2015) Google Home (2016) Microsoft Cortana (2014) Amazon Alexa/Echo (2014)
  • 3. Why We Need?  Get things done  E.g. set up alarm/reminder, take note  Easy access to structured data, services and apps  E.g. find docs/photos/restaurants  Assist your daily schedule and routine  E.g. commute alerts to/from work  Be more productive in managing your work and personal life 3 “Hey Assistant”
  • 4. Why Natural Language?  Global Digital Statistics (2015 January) 4 Global Population 7.21B Active Internet Users 3.01B Active Social Media Accounts 2.08B Active Unique Mobile Users 3.65B The more natural and convenient input of devices evolves towards speech.
  • 5. Intelligent Assistant Architecture 5 Reactive Assistance ASR, LU, Dialog, LG, TTS Proactive Assistance Inferences, User Modeling, Suggestions Data Back-end Data Bases, Services and Client Signals Device/Service End-points (Phone, PC, Xbox, Web Browser, Messaging Apps) User Experience “restaurant suggestions”“call taxi”
  • 6. Dialogue System / ChatBot  Dialogue systems are intelligent agents that are able to help users finish tasks more efficiently via spoken interactions.  Dialogue systems are being incorporated into various devices (smart-phones, smart TVs, in-car navigating system, etc). 6 JARVIS – Iron Man’s Personal Assistant Baymax – Personal Healthcare Companion Good dialogue systems assist users to access information conveniently and finish tasks efficiently.
  • 7. App  Bot  A bot is responsible for a “single” domain, similar to an app 7 Seamless and automatic information transferring across domains  reduce duplicate information and interaction 愛食記 Map LINE Goal: Schedule a lunch with Vivian KKBOX
  • 8. GUI v.s. CUI (Conversational UI) 8 https://github.com/enginebai/Movie-lol-android
  • 9. GUI v.s. CUI (Conversational UI) 網站/APP內GUI 即時通訊內的CUI 情境 探索式,使用者沒有特定目標 搜尋式,使用者有明確的指令 資訊量 多 少 資訊精準度 低 高 資訊呈現 結構式 非結構式 介面呈現 以圖像為主 以文字為主 介面操作 以點選為主 以文字或語音輸入為主 學習與引導 使用者需了解、學習、適應不同的介面操 作 如同使用即時通訊軟體,使用者無需另行 學習,只需遵循引導 服務入口 需另行下載App或進入特定網站 與即時通訊軟體整合,可適時在對話裡提 供服務 服務內容 需有明確架構供檢索 可接受龐雜、彈性的內容 人性化程度 低,如同操作機器 高,如同與人對話 9
  • 10. Two Branches of Bots Task-Oriented  Personal assistant, helps users achieve a certain task  Combination of rules and statistical components  POMDP for spoken dialog systems (Williams and Young, 2007)  End-to-end trainable task-oriented dialogue system (Wen et al., 2016)  End-to-end reinforcement learning dialogue system (Li et al., 2017; Zhao and Eskenazi, 2016) Chit-Chat  No specific goal, focus on natural responses  Using variants of generation model  A neural conversation model (Vinyals and Le, 2015)  Reinforcement learning for dialogue generation (Li et al., 2016)  Conversational contextual cues for response ranking (AI-Rfou et al., 2016) 10
  • 12. System Framework 12 Speech Recognition Language Understanding (LU) • Domain Identification • User Intent Detection • Slot Filling Dialogue Management (DM) • Dialogue State Tracking • Policy Optimization Generation Hypothesis are there any action movies to see this weekend Semantic Frame request_movie genre=action, date=this weekend System Action/Policy request_location Text response Where are you located? Screen Display location? Text Input Are there any action movies to see this weekend? Speech Signal current bottleneck  error propagation Backend Database
  • 14. Interaction Example Intelligent Agent Q: How does a dialogue system process this request? Good Taiwanese eating places include Din Tai Fung, etc. What do you want to choose? 好的台式餐廳包含鼎泰豐…,你想要選哪一間? find a good eating place for taiwanese food 我想找好的台式食物 14
  • 16. System Framework 16 Speech Recognition Language Understanding (LU) • Domain Identification • User Intent Detection • Slot Filling Dialogue Management (DM) • Dialogue State Tracking • Policy Optimization Generation Hypothesis are there any action movies to see this weekend Semantic Frame request_movie genre=action, date=this weekend System Action/Policy request_location Text response Where are you located? Screen Display location? Text Input Are there any action movies to see this weekend? Speech Signal Backend Database
  • 17. 1. Domain Identification Requires Predefined Domain Ontology 17 Organized Domain Knowledge (Database) Restaurant DB Taxi DB Movie DB Classification! find a good eating place for taiwanese food 我想找好的台式食物
  • 18. find a good eating place for taiwanese food 我想找好的台式食物 2. Intent Detection Requires Predefined Schema 18 Restaurant DB FIND_RESTAURANT FIND_PRICE FIND_TYPE : Classification!
  • 19. find a good eating place for taiwanese food 我想找好的台式食物 3. Slot Filling Requires Predefined Schema 19 Restaurant DB Restaurant Rating Type Rest 1 good Taiwanese Rest 2 bad Thai : : : FIND_RESTAURANT rating=“good” type=“taiwanese” SELECT restaurant { rest.rating=“good” rest.type=“taiwanese” }Semantic Frame Sequence Labeling O O B-rating O O O B-type O
  • 20. Dialogue Management State Tracking Dialogue Management Policy Optimization
  • 21. System Framework 21 Speech Recognition Language Understanding (LU) • Domain Identification • User Intent Detection • Slot Filling Dialogue Management (DM) • Dialogue State Tracking • Policy Optimization Generation Hypothesis are there any action movies to see this weekend Semantic Frame request_movie genre=action, date=this weekend System Action/Policy request_location Text response Where are you located? Screen Display location? Text Input Are there any action movies to see this weekend? Speech Signal Backend Database
  • 22. Dialogue State Tracking Requires Hand-Crafted States 22 location rating type loc, rating rating, type loc, type all i want it near to my office 我希望可以離我辦公室近一點 NULL find a good eating place for taiwanese food 我想找好的台式食物 i want it near to my office 我希望可以離我辦公室近一點 find a good eating place for taiwanese food 我想找好的台式食物
  • 23. Dialogue State Tracking Requires Hand-Crafted States 23 location rating type loc, rating rating, type loc, type all NULL i want it near to my office 我希望可以離我辦公室近一點 find a good eating place for taiwanese food 我想找好的台式食物
  • 24. Dialogue State Tracking Handling Errors and Confidence 24 FIND_RESTAURANT rating=“good” type=“taiwanese” FIND_RESTAURANT rating=“good” type=“thai” FIND_RESTAURANT rating=“good” location rating type loc, rating rating, type loc, type all NULL ? ? find a good eating place for taiwanese food 我想找好的台式食物 rating=“good”, type=“thai” rating=“good”, type=“taiwanese” ? ?
  • 25. Dialogue Policy for Agent Action  Inform(location=“Taipei 101”)  “The nearest one is at Taipei 101”  Request(location)  “Where is your home?”  Confirm(type=“taiwanese”)  “Did you want Taiwanese food?”  Task Completion / Information Display  ticket booked, weather information 25
  • 27. System Framework 27 Speech Recognition Language Understanding (LU) • Domain Identification • User Intent Detection • Slot Filling Dialogue Management (DM) • Dialogue State Tracking • Policy Optimization Generation Hypothesis are there any action movies to see this weekend Semantic Frame request_movie genre=action, date=this weekend System Action/Policy request_location Text response Where are you located? Screen Display location? Text Input Are there any action movies to see this weekend? Speech Signal Backend Database
  • 28. Output / Natural Language Generation  Goal: generate natural language or GUI given the selected dialogue action for interactions  Inform(location=“Taipei 101”)  “The nearest one is at Taipei 101” v.s.  Request(location)  “Where is your home?” v.s.  Confirm(type=“taiwanese”)  “Did you want Taiwanese food?” v.s. 28
  • 29. Challenges and Recent Trends DEEP LEARNING / MULTIMODALITY/ DIALOGUE COVERAGE / DIALOGUE COMPLEXITY 29
  • 30. Challenges  Predefined semantic schema Chen et al., “Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken Language Understanding,” in ACL-IJCNLP, 2015.  Data without annotations Chen et al., “Zero-Shot Learning of Intent Embeddings for Expansion by Convolutional Deep Structured Semantic Models,” in ICASSP, 2016.  Semantic concept interpretation Chen et al., “Deriving Local Relational Surface Forms from Dependency-Based Entity Embeddings for Unsupervised Spoken Language Understanding,” in SLT, 2014.  Predefined dialogue states Chen, et al., “End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Language Understanding,” in Interspeech, 2016.  Error propagation Hakkani-Tur et al., “Multi-Domain Joint Semantic Frame Parsing using Bi-directional RNN-LSTM,” in Interspeech, 2016.  Cross-domain intention/bot hierarchy Sun et al., “An Intelligent Assistant for High-Level Task Understanding,” in IUI, 2016. Sun et al., “AppDialogue: Multi-App Dialogues for Intelligent Assistants,” in LREC, 2016. Chen et al., “Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken Language Understanding,” in ICMI, 2016.  Cross-domain information transferring Kim et al., “New Transfer Learning Techniques For Disparate Label Sets,” in ACL-IJCNLP, 2015. 30 FIND_RESTAURANT rating=“good” rating=5? 4? HotelRest Flight Travel Trip Planning
  • 31. Deep Learning for LU  IOB Sequence Labeling for Slot Filling  Intent Classification 31 𝑤0 𝑤1 𝑤2 𝑤 𝑛 ℎ0 𝑓 ℎ1 𝑓 ℎ2 𝑓 ℎ 𝑛 𝑓 ℎ0 𝑏 ℎ1 𝑏 ℎ2 𝑏 ℎ 𝑛 𝑏 𝑦0 𝑦1 𝑦2 𝑦 𝑛 (a) LSTM (b) LSTM-LA (c) bLSTM-LA (d) Intent LSTM intent 𝑤0 𝑤1 𝑤2 𝑤 𝑛 ℎ0 ℎ1 ℎ2 ℎ 𝑛 𝑦0 𝑦1 𝑦2 𝑦 𝑛 𝑤0 𝑤1 𝑤2 𝑤 𝑛 ℎ0 ℎ1 ℎ2 ℎ 𝑛 𝑦0 𝑦1 𝑦2 𝑦 𝑛 𝑤0 𝑤1 𝑤2 𝑤 𝑛 ℎ0 ℎ1 ℎ2 ℎ 𝑛
  • 32. Deep Learning for LU  Joint multi-domain intent prediction and slot filling  Information can mutually enhanced 32 Hakkani-Tur, et al., “Multi-Domain Joint Semantic Frame Parsing using Bi-directional RNN-LSTM,” in Interspeech, 2016. ht-1 ht+1ht taiwanese B-type food please O O hT+1 EOS FIND_REST Slot Filling Intent Prediction
  • 33. Contextual LU (Chen et al., 2016) 33 just sent email to bob about fishing this weekend O O O O B-contact_name O B-subject I-subject I-subject U S I send_emailD communication  send_email(contact_name=“bob”, subject=“fishing this weekend”) are we going to fish this weekend U1 S2  send_email(message=“are we going to fish this weekend”) send email to bob U2  send_email(contact_name=“bob”) B-message I-message I-message I-message I-message I-message I-message B-contact_nameS1 Domain Identification  Intent Prediction  Slot Filling Chen et al., "End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Language Understanding,” in Interspeech, 2016.
  • 34. Supervised v.s. Reinforcement  Supervised  Reinforcement Hello  Agent …… Agent ……. ……. …… Bad “Hello” Say “Hi” “Bye bye” Say “Good bye” Learning from teacher Learning from critics 34
  • 35. Scenario of Reinforcement Learning Agent learns to take actions to maximize expected reward. 35 Environment Observation ot Action at Reward rt If win, reward = 1 If loss, reward = -1 Otherwise, reward = 0 Next Move
  • 36. Dialogue Policy Optimization  Dialogue management in a RL framework 36 U s e r Reward ObservationAction Environment Agent Natural Language Generation Language Understanding Dialogue Manager Slides credited by Pei-Hao Su Optimized dialogue policy selects the best action that can maximize the future reward. Correct rewards are a crucial factor in dialogue policy training
  • 37. Reinforcement Learning Based Bot User goal: Two tickets for “the witch” tomorrow 9:30 PM at regal meridian 16, Seattle. 37 RULE BASED SYSTEM User: Can I get 2 tickets for the witch? Agent: Which theater do you want? User: Regal meridian 16 Agent: What time would you like to see it? User: 9:30 pm Agent: What date would you like to see it? User: Tomorrow! Agent: How many people are you? User: Just two adults. Agent: What is your location? User: Seattle! Agent: What kind of video format do you like? User: Oh, any kind is okay. Agent: Do you have any price constraint? User: No, I don’t care. Agent: Great - I was able to purchase 2 tickets for you to see the witch tomorrow at regal meridian 16 theater in Seattle at 9:30 pm. User: Thank you. Agent: Thank you! REINFORCEMENT LEARNING SYSTEM User: Can I get 2 tickets for the witch? Agent: What date would you like to see it? User: Tomorrow Agent: Which theater do you prefer? User: Regal meridian 16 Agent: What is your location? User: Seattle Agent: What time would you like to see it? User: 9:30 pm. Agent: Great - I was able to purchase 2 tickets for you to see the witch tomorrow at regal meridian 16 theater in Seattle at 9:30 pm. User: Thanks. Agent: Thanks! The system can learn how to efficiently interact with users for task completion Li et al., "End-to-End Task-Completion Neural Dialogue Systems,” in arXiv, 2017.
  • 38. Brain Signal for Understanding  Misunderstanding detection by brain signal  Green: listen to the correct answer  Red: listen to the wrong answer 38 Detecting misunderstanding via brain signal in order to correct the understanding results Sridharan et al., “NeuroDialog: An EEG-Enabled Spoken Dialog Interface,” in ICMI, 2012.
  • 39. Video for Intent Understanding 39 Proactive (from camera) I want to see a movie on TV! Intent: turn_on_tv Sir, may I turn on the TV for you? Proactively understanding user intent to initiate the dialogues.
  • 40. App Behavior for Understanding  Task: user intent prediction  Challenge: language ambiguity  User preference  Some people prefer “Message” to “Email”  Some people prefer “Outlook” to “Gmail”  App-level contexts  “Message” is more likely to follow “Camera”  “Email” is more likely to follow “Excel” 40 send to vivian v.s. Email? Message? Communication Considering behavioral patterns in history to model understanding for intent prediction. Chen et al., “Leveraging Behavior Patterns of Mobile Applications for Personalized Spoken Language Understanding,” in ICMI, 2015.
  • 41. Evolution Roadmap 41 Single domain systems Extended systems Multi- domain systems Open domain systems Dialogue breadth (coverage) Dialoguedepth(complexity) What is influenza? I’ve got a cold what do I do? Tell me a joke. I feel sad…
  • 42. Intent Expansion  Transfer dialogue acts across domains  Dialogue acts are similar for multiple domains  Learning new intents by information from other domains CDSSM New Intent Intent Representation 1 2 K : Embedding Generation K+1 K+2<change_calender> Training Data <change_note> “adjust my note” : <change_setting> “volume turn down” The dialogue act representations can be automatically learned for other domains postpone my meeting to five pm Chen et al., “Zero-Shot Learning of Intent Embeddings for Expansion by Convolutional Deep Structured Semantic Models,” in ICASSP 2016. 42
  • 43. Policy for Domain Adaptation  Bayesian committee machine (BCM) enables estimated Q- function to share knowledge across domains QRDR QHDH QL DL Committee Model The policy from a new domain can be boosted by the committee policy #dialogues reward Gašić, et al., “Policy Committee for Adaptation in Multi-Domain Spoken Dialogue Systems,” in ASRU, 2015. 43
  • 44. Evolution Roadmap 44 Knowledge based system Common sense system Empathetic systems Dialogue breadth (coverage) Dialoguedepth(complexity) What is influenza? I’ve got a cold what do I do? Tell me a joke. I feel sad…
  • 45. High-Level Intention for Dialogue Planning  High-level intention may span several domains Schedule a lunch with Vivian. find restaurant check location contact play music What kind of restaurants do you prefer? The distance is … Should I send the restaurant information to Vivian? Users can interact via high-level descriptions and the system learns how to plan the dialogues Sun, et al., “An Intelligent Assistant for High-Level Task Understanding,” in IUI, 2016. 45
  • 46. Empathy in Dialogue System  Embed an empathy module  Recognize emotion using multimodality  Generate emotion-aware responses 46 Emotion Recognizer vision speech text Fung et al., “Towards Empathetic Human-Robot Interactions,” in arXiv, 2016.
  • 48. Conclusion  The conversational systems can manage information access via spoken interactions  A domain is usually constrained by the backend database  Semantic schema should be predefined  Cross-domain knowledge and intention is difficult to handled  Human-like dialogue system  Robustness  error handling via interactions / reasoning  Dialogue coverage  domain switching / open-domain system  Dialogue complexity  common sense and empathy 48
  • 49. Q & A THANKS FOR YOUR ATTENTION! 49