Emotion-Driven Reinforcement Learning
Bob Marinier & John Laird
University of Michigan, Computer Science and Engineering
CogSci’08

Introduction
  • Interested in the functional benefits of emotion for a cognitive agent
  • Appraisal theories of emotion
  • PEACTIDM theory of cognitive control
  • Use emotion as a reward signal to a reinforcement learning agent
  • Demonstrates a functional benefit of emotion
  • Provides a theory of the origin of intrinsic reward

Outline
  • Background
  • Integration of emotion and cognition
  • Integration of emotion and reinforcement learning
  • Implementation in Soar
  • Learning task
  • Results

Appraisal Theories of Emotion
  • A situation is evaluated along a number of appraisal dimensions, many of which relate the situation to current goals
  • Novelty, goal relevance, goal conduciveness, expectedness, causal agency, etc.
  • Appraisals influence emotion
  • Emotion can then be coped with (via internal or external actions)
  [Figure: diagram relating Situation, Goals, Appraisals, Emotion, and Coping]

Appraisals to Emotions (Scherer 2001)

Cognitive Control: PEACTIDM (Newell 1990)

Unification of PEACTIDM and Appraisal Theories
  [Figure: the PEACTIDM cycle annotated with appraisals. Stage labels: Perceive, Encode, Attend, Comprehend, Intend, Decode, Motor. Information labels: Raw Perceptual Information, Stimulus Relevance, Stimulus chosen for processing, Current Situation Assessment, Action, Prediction, Motor Commands, Environmental Change. Appraisal labels: Suddenness, Unpredictability, Goal Relevance, Intrinsic Pleasantness, Causal Agent/Motive, Discrepancy, Conduciveness, Control/Power, Outcome Probability.]

Distinction between emotion, mood, and feeling (Marinier & Laird 2007)
  • Emotion: result of appraisals; is about the current situation
  • Mood: “average” over recent emotions; provides historical context
  • Feeling: emotion “+” mood; what the agent actually perceives

Emotion, mood, and feeling
  [Figure: Cognition generates Active Appraisals, which produce the Emotion. A Combination Function merges Emotion and Mood into the Feeling, and Cognition perceives the Feeling. Mood is subject to Pull and Decay.]
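
The "Pull" and "Decay" labels on the diagram suggest a simple per-cycle mood update. The following is a minimal sketch of one way that could work, not the authors' implementation: the function name `update_mood` and the rates `pull_rate` and `decay_rate` are hypothetical, and the slides do not give the actual values or update rule.

```python
from typing import List


def update_mood(
    mood: List[float],
    emotion: List[float],
    pull_rate: float = 0.1,    # hypothetical: fraction of the gap closed per cycle
    decay_rate: float = 0.01,  # hypothetical: fraction of mood lost per cycle
) -> List[float]:
    """Return the next mood vector, one value per appraisal dimension.

    Assumption: each dimension is handled independently. Mood first moves
    a fraction of the way toward the current emotion ("pull"), then shrinks
    toward zero ("decay"), so it behaves like a fading average of recent
    emotions.
    """
    if len(mood) != len(emotion):
        raise ValueError("mood and emotion must have the same number of dimensions")
    pulled = [m + pull_rate * (e - m) for m, e in zip(mood, emotion)]
    return [p * (1.0 - decay_rate) for p in pulled]


if __name__ == "__main__":
    mood = [0.0, 0.0]
    for _ in range(3):
        # A constant emotion drags the mood toward it a little each cycle.
        mood = update_mood(mood, emotion=[1.0, -0.5])
        print(mood)
```

With both rates small, a single strong emotion shifts mood only slightly, while a sustained emotion gradually dominates it, which matches the "historical context" role described above.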

Intrinsically Motivated Reinforcement Learning (Sutton & Barto 1998; Singh et al. 2004)
  [Figure: agent-environment diagrams. Labels: External Environment, Environment, Sensations, Actions, “Organism”, Internal Environment, Critic, States, Rewards, Decisions, Agent. In the emotion-driven version the Critic is labeled Appraisal Process and the reward is labeled +/- Feeling Intensity.]
  • Reward = Intensity * Valence
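
The reward rule on this slide can be written out directly. This is a sketch under stated assumptions: the name `feeling_reward` is hypothetical, intensity is taken to lie in [0, 1] (as the later slide on feeling intensity requires), and valence is assumed to lie in [-1, 1]. The slides do not say how valence itself is computed.

```python
def feeling_reward(intensity: float, valence: float) -> float:
    """Reward = Intensity * Valence, as stated on the slide.

    Assumptions: intensity is in [0, 1] and valence is in [-1, 1],
    so the reward is a signed value in [-1, 1].
    """
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be in [0, 1]")
    if not -1.0 <= valence <= 1.0:
        raise ValueError("valence must be in [-1, 1]")
    return intensity * valence


if __name__ == "__main__":
    print(feeling_reward(0.8, 1.0))   # strong positive feeling
    print(feeling_reward(0.5, -1.0))  # moderate negative feeling
```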

Extending Soar with Emotion (Marinier & Laird 2007)
  [Figure: Soar architecture. Labels: Symbolic Long-Term Memories (Procedural, Semantic, Episodic); Reinforcement Learning, Chunking, Semantic Learning, Episodic Learning; Appraisal Detector; Short-Term Memory (Situation, Goals); Decision Procedure; Visual Imagery; Perception; Action; Body.]

Extending Soar with Emotion (Marinier & Laird 2007)
  [Figure: the same architecture with the emotion extension. Additional labels: Appraisals; Emotion (.5, .7, 0, -.4, .3, …); Mood (.7, -.2, .8, .3, .6, …); Feeling (.9, .6, .5, -.1, .8, …); Feelings; +/- Intensity; Knowledge; Architecture.]

Learning task
  [Figure: maze with Start and Goal marked]

Learning task: Encoding

  Direction  Passable  On path  Progress
  North      false     false    true
  East       false     true     true
  West       false     false    true
  South      true      true     true

Learning task: Encoding & Appraisal

  Direction  Intrinsic Pleasantness  Goal Relevance  Unpredictability
  North      Low                     Low             High
  East       Low                     High            High
  West       Low                     Low             High
  South      Neutral                 High            Low
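
To make the step from encoding to appraisal concrete, here is a small sketch that reproduces the four example rows on these two slides. The rules in `appraise` are inferred only from those examples and are an assumption; the slides do not state how the agent actually generates appraisal values. The names `Encoding` and `appraise` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Encoding:
    """Encoded features of one adjacent cell, as listed on the Encoding slide."""
    passable: bool
    on_path: bool
    progress: bool


def appraise(enc: Encoding) -> Dict[str, str]:
    """Map an encoding to three appraisals.

    Assumed rules (consistent with the slide's examples, not confirmed):
      - impassable cells are unpleasant, passable cells are neutral
      - cells on the path are goal relevant
      - impassable cells are appraised as highly unpredictable
    The progress feature is not used by these three appraisals here.
    """
    return {
        "intrinsic_pleasantness": "Neutral" if enc.passable else "Low",
        "goal_relevance": "High" if enc.on_path else "Low",
        "unpredictability": "Low" if enc.passable else "High",
    }


# The four encodings from the Encoding slide.
ENCODINGS = {
    "North": Encoding(passable=False, on_path=False, progress=True),
    "East": Encoding(passable=False, on_path=True, progress=True),
    "West": Encoding(passable=False, on_path=False, progress=True),
    "South": Encoding(passable=True, on_path=True, progress=True),
}

if __name__ == "__main__":
    for direction, enc in ENCODINGS.items():
        print(direction, appraise(enc))
```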

Learning task: Attending, Comprehending & Appraisal
  South
  • Intrinsic Pleasantness: Neutral
  • Goal Relevance: High
  • Unpredictability: Low
  • Conduciveness: High
  • Control: High
  • …

Learning task: Tasking

Learning task: Tasking
  [Figure: Optimal Subtasks]

What is being learned?
  • When to Attend vs. Task
  • If Attending, what to Attend to
  • If Tasking, which subtask to create
  • When to Intend vs. Ignore

Learning Results

Results: With and without mood

Discussion
  • Agent learns both internal (tasking) and external (movement) actions
  • Emotion allows for more frequent rewards, and thus the agent learns faster than with standard RL
  • Mood “fills in the gaps”, allowing for even faster learning and less variability

Conclusion & Future Work
  • Demonstrated computational model that integrates emotion and cognitive control
  • Confirmed emotion can drive reinforcement learning
  • We have already successfully demonstrated similar learning in a more complex domain
  • Would like to explore multi-agent scenarios

Circumplex models
  • Emotions can be described in terms of intensity and valence, as in a circumplex model:
  [Figure: circumplex with an intensity axis (HIGH INTENSITY to LOW INTENSITY) and a valence axis (NEGATIVE VALENCE to POSITIVE VALENCE). Labels in the high-intensity half: alert, tense, excited, nervous, elated, stressed, happy, upset. Labels in the low-intensity half: sad, contented, depressed, serene, lethargic, relaxed, fatigued, calm. Adapted from Feldman Barrett & Russell (1998).]

Computing Feeling from Emotion and Mood
  • Assumption: Appraisal dimensions are independent
  • Limited range: Inputs and outputs are in [0,1] or [-1,1]
  • Distinguishability: Very different inputs should lead to very different outputs
  • Non-linear: Linearity would violate limited range and distinguishability
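
The slide lists constraints but not the combination function itself. As an illustration only, the sketch below uses a scaled hyperbolic tangent, which is one of many functions that is bounded and non-linear. It is not the authors' equation; `combine` and `feeling` are hypothetical names, and whether this choice satisfies the distinguishability criterion well enough is left open. Note that with a mood of zero it does not return the emotion unchanged.

```python
import math
from typing import List


def combine(emotion: float, mood: float) -> float:
    """Combine one emotion value and one mood value into a feeling value.

    Illustrative stand-in, not the function from the paper. For inputs in
    [-1, 1] the sum lies in [-2, 2]; tanh squashes it non-linearly, and
    dividing by tanh(2) rescales so the output stays in [-1, 1]. Inputs
    restricted to [0, 1] give outputs in [0, 1].
    """
    return math.tanh(emotion + mood) / math.tanh(2.0)


def feeling(emotion: List[float], mood: List[float]) -> List[float]:
    """Apply combine() to each appraisal dimension independently,
    following the slide's independence assumption."""
    if len(emotion) != len(mood):
        raise ValueError("emotion and mood must have the same number of dimensions")
    return [combine(e, m) for e, m in zip(emotion, mood)]


if __name__ == "__main__":
    # Leading values of the emotion and mood vectors shown on the Soar slide.
    print(feeling([0.5, 0.7, 0.0, -0.4, 0.3], [0.7, -0.2, 0.8, 0.3, 0.6]))
```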

Computing Feeling Intensity
  • Motivation: Intensity gives a summary of how important (i.e., how good or bad) the situation is
  • Limited range: Should map onto [0,1]
  • No dominant appraisal: No single value should drown out all the others
  • Can’t just multiply values, because if any are 0, then intensity is 0
  • Realization principle: Expected events should be less intense than unexpected events
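
One way to meet these constraints is to average the magnitudes of the appraisals and scale the result by a surprise factor. The sketch below is an assumption made for illustration, not the equation the authors present: the name `feeling_intensity`, the use of outcome probability and discrepancy to measure surprise, and the specific surprise formula are all hypothetical choices.

```python
from typing import Sequence


def feeling_intensity(
    outcome_probability: float,
    discrepancy: float,
    other_appraisals: Sequence[float],
) -> float:
    """Return an intensity in [0, 1].

    Assumptions (not from the slides):
      - outcome_probability and discrepancy are in [0, 1]
      - other_appraisals holds the remaining appraisal values in [-1, 1]
      - surprise is high when a confident prediction fails or an
        unlikely prediction comes true, and low when events go as expected
    Averaging magnitudes keeps any single appraisal from dominating,
    and one zero-valued appraisal does not force the mean to zero.
    """
    if not other_appraisals:
        raise ValueError("at least one appraisal value is required")
    op, de = outcome_probability, discrepancy
    surprise = (1.0 - op) * (1.0 - de) + op * de
    mean_magnitude = sum(abs(v) for v in other_appraisals) / len(other_appraisals)
    return surprise * mean_magnitude


if __name__ == "__main__":
    appraisals = [0.9, -0.4, 0.0, 0.6]
    expected = feeling_intensity(0.9, 0.1, appraisals)    # prediction held
    unexpected = feeling_intensity(0.9, 0.9, appraisals)  # prediction failed
    print(expected, unexpected)
```

In this sketch a fully expected event (outcome probability 1, discrepancy 0) yields zero intensity, which is a stronger reading of the realization principle than the slide requires.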

Editor's notes

  1. Be careful about how to say the agent generates appraisal values.
  2. Say that prediction is our extension.
  3. A cognitive architecture is a set of task-independent mechanisms that interact to give rise to behavior.
  4. In this environment, the agent’s sensing is limited: it can only see the cells immediately adjacent to it in the four cardinal directions. The agent has a sensor that tells it its Manhattan distance to the goal. However, the agent has no knowledge of the effects of its actions, and thus cannot evaluate possible actions relative to the goal until it has actually performed them. Even then, it cannot always blindly move closer to the goal because, given the shape of the maze, it must sometimes increase its Manhattan distance to the goal in order to make progress in the maze.
  5. Mention relaxation and direction.
  6. 15 episodes; 50 trials; cutoff at 10k dcs; median.
  7. 1st and 3rd quartiles shown. Both conditions reach optimality at the same time, but with mood the results are less variable.
  8. This is an extension of previous work. These constraints define a set of equations. This is one possible equation, an improvement on previous work, that seems to work well for our current models.
  9. This is an extension of previous work. It unifies intensity for all feelings in one equation (others use different equations for each “kind” of feeling). Again, these constraints define a set of possible functions, of which this is one that seems to work well for us.