Emotion-Driven
Reinforcement Learning
Bob Marinier & John Laird
University of Michigan, Computer Science and Engineering
CogSci’08




Introduction
• Interested in the functional benefits of emotion
  for a cognitive agent
 ▫ Appraisal theories of emotion
 ▫ PEACTIDM theory of cognitive control
• Use emotion as a reward signal to a
  reinforcement learning agent
 ▫ Demonstrates a functional benefit of emotion
 ▫ Provides a theory of the origin of intrinsic reward




Outline
• Background
 ▫ Integration of emotion and cognition
 ▫ Integration of emotion and reinforcement learning
 ▫ Implementation in Soar
• Learning task
• Results



Appraisal Theories of Emotion
 • A situation is evaluated along a number of appraisal
   dimensions, many of which relate the situation to
   current goals
   ▫ Novelty, goal relevance, goal conduciveness, expectedness,
     causal agency, etc.
 • Appraisals influence emotion
 • Emotion can then be coped with (via internal or
   external actions)
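 [Figure: cycle diagram. Situation and Goals lead to Appraisals, Appraisals
  lead to Emotion, Emotion leads to Coping, and Coping feeds back into
  Situation and Goals.]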


Appraisals to Emotions (Scherer 2001)

Appraisal dimension           Joy                 Fear          Anger
Suddenness                    High/medium         High          High
Unpredictability              High                High          High
Intrinsic pleasantness                            Low
Goal/need relevance           High                High          High
Cause: agent                                      Other/nature  Other
Cause: motive                 Chance/intentional                Intentional
Outcome probability           Very high           High          Very high
Discrepancy from expectation                      High          High
Conduciveness                 Very high           Low           Low
Control                                                         High
Power                                             Very low      High



Cognitive Control: PEACTIDM (Newell 1990)
Perceive    Obtain raw perception
Encode      Create domain-independent representation
Attend      Choose stimulus to process
Comprehend  Generate structures that relate stimulus to tasks and can be
            used to inform behavior
Task        Perform task maintenance
Intend      Choose an action, create prediction
Decode      Decompose action into motor commands
Motor       Execute motor commands



Unification of PEACTIDM and Appraisal Theories

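[Figure: the PEACTIDM cycle annotated with appraisals.
 Cycle: Perceive -> Raw Perceptual Information -> Encode -> Stimulus Relevance
 -> Attend -> Stimulus chosen for processing -> Comprehend -> Current Situation
 Assessment -> Intend -> Action -> Decode -> Motor Commands -> Motor ->
 Environmental Change -> Perceive.
 Appraisals shown with Stimulus Relevance: Suddenness, Unpredictability,
 Goal Relevance, Intrinsic Pleasantness.
 Appraisals shown with Current Situation Assessment: Causal Agent/Motive,
 Discrepancy, Conduciveness, Control/Power.
 Appraisal shown with Prediction: Outcome Probability.]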




Distinction between emotion, mood, and feeling
(Marinier & Laird 2007)
  • Emotion: Result of appraisals
    ▫ Is about the current situation
  • Mood: “Average” over recent emotions
    ▫ Provides historical context
  • Feeling: Emotion “+” Mood
    ▫ What agent actually perceives
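
The relationship between the three terms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide puts "Average" and "+" in quotes, so the real functions are not spelled out here. The sketch assumes each of emotion, mood, and feeling is a vector of appraisal values in [-1, 1], models mood as an exponential moving average (the `rate` parameter is hypothetical), and models the combination as a clamped sum. It will not reproduce the example values shown elsewhere in the deck.

```python
from typing import List, Sequence


def update_mood(mood: Sequence[float], emotion: Sequence[float],
                rate: float = 0.1) -> List[float]:
    """Move each mood dimension a fraction `rate` toward the current emotion.

    ASSUMPTION: an exponential moving average stands in for the slide's
    quoted "Average" over recent emotions.
    """
    return [m + rate * (e - m) for m, e in zip(mood, emotion)]


def combine_feeling(emotion: Sequence[float],
                    mood: Sequence[float]) -> List[float]:
    """Combine emotion and mood per dimension, clamped to [-1, 1].

    ASSUMPTION: a clamped sum stands in for the slide's quoted "+".
    """
    return [max(-1.0, min(1.0, e + m)) for e, m in zip(emotion, mood)]


# Hypothetical appraisal values for one decision cycle.
emotion = [0.5, 0.7, 0.0, -0.4, 0.3]
mood = update_mood([0.0] * 5, emotion)
feeling = combine_feeling(emotion, mood)
```

Keeping mood as a separate, slowly changing vector is what lets it carry historical context into cycles where the current emotion is weak or absent.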

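Intrinsically Motivated Reinforcement Learning
(Sutton & Barto 1998; Singh et al. 2004)
[Figure: two block diagrams.
 Left, standard reinforcement learning: the Environment contains a Critic;
 the Agent sends Actions and receives Rewards and States.
 Right, intrinsically motivated reinforcement learning: the External
 Environment exchanges Actions and Sensations with an “Organism” made up of
 an Internal Environment and the Agent. The Critic in the Internal
 Environment is the Appraisal Process. The Agent sends Decisions and
 receives States and Rewards, where the reward is the +/- Feeling Intensity.]
• Reward = Intensity * Valence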
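
The reward formula above is a single multiplication. A minimal sketch follows, assuming intensity lies in [0, 1] and valence in [-1, 1]; neither range is stated on the slide, and the function name is hypothetical.

```python
def feeling_reward(intensity: float, valence: float) -> float:
    """Reward = Intensity * Valence.

    ASSUMPTION: intensity is in [0, 1] and valence is in [-1, 1], so the
    reward is a signed value whose magnitude never exceeds the intensity.
    """
    return intensity * valence


# A strongly felt negative feeling gives a large negative reward.
negative = feeling_reward(intensity=0.5, valence=-1.0)
# A neutral valence gives no reward, however intense the feeling.
neutral = feeling_reward(intensity=0.9, valence=0.0)
```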


Extending Soar with Emotion
(Marinier & Laird 2007)
[Figure: Soar architecture. Symbolic Long-Term Memories (Procedural,
 Semantic, Episodic) with their learning mechanisms (Reinforcement Learning
 and Chunking, Semantic Learning, Episodic Learning); Short-Term Memory
 holding Situation and Goals; Decision Procedure; Appraisal Detector;
 Perception, Visual Imagery, and Action connected to the Body.]


Extending Soar with Emotion
(Marinier & Laird 2007)
[Figure: the same architecture with the emotion system shown in detail.
 The Appraisal Detector is shown with Emotion (.5,.7,0,-.4,.3,…),
 Mood (.7,-.2,.8,.3,.6,…), and Feeling (.9,.6,.5,-.1,.8,…). Feelings
 appear in Short-Term Memory alongside Situation and Goals. A legend
 distinguishes Knowledge from Architecture.]



Learning task


[Figure: task environment with marked Start and Goal locations]



Learning task: Encoding
Direction  Passable  On path  Progress
North      false     false    true
East       false     true     true
West       false     false    true
South      true      true     true



Learning task: Encoding & Appraisal
Direction  Intrinsic Pleasantness  Goal Relevance  Unpredictability
North      Low                     Low             High
East       Low                     High            High
West       Low                     Low             High
South      Neutral                 High            Low
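
The step from encoding to appraisal can be illustrated with a small data structure. The sketch below stores the four encoded directions from the Encoding slide and applies a rule inferred only from these four examples: impassable directions get low intrinsic pleasantness, and on-path directions get high goal relevance. That rule is an assumption, not something the slides state, and all names are hypothetical. Unpredictability is left out because the slides do not show which encoded feature determines it.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DirectionEncoding:
    """Domain-independent features of one direction, as on the Encoding slide."""
    passable: bool
    on_path: bool
    progress: bool


def appraise(enc: DirectionEncoding) -> Dict[str, str]:
    """Map an encoding to two appraisal values.

    ASSUMPTION: this rule is inferred from the four slide examples only.
    """
    return {
        "intrinsic_pleasantness": "Neutral" if enc.passable else "Low",
        "goal_relevance": "High" if enc.on_path else "Low",
    }


# Values copied from the Encoding slide.
ENCODINGS = {
    "North": DirectionEncoding(passable=False, on_path=False, progress=True),
    "East": DirectionEncoding(passable=False, on_path=True, progress=True),
    "West": DirectionEncoding(passable=False, on_path=False, progress=True),
    "South": DirectionEncoding(passable=True, on_path=True, progress=True),
}

appraisals = {name: appraise(enc) for name, enc in ENCODINGS.items()}
```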


Learning task: Attending, Comprehending & Appraisal
South
  Intrinsic Pleasantness: Neutral
  Goal Relevance: High
  Unpredictability: Low
  Conduciveness: High
  Control: High …



Learning task: Tasking
[Figure: task environment annotated with the Optimal Subtasks]




What is being learned?
• When to Attend vs. Task
• If Attending, what to Attend to
• If Tasking, which subtask to create
• When to Intend vs. Ignore


Learning Results
[Figure: line plot of median processing cycles (y-axis, 0 to 12000) against
 episode (x-axis, 1 to 15) for three conditions: Standard RL,
 Feeling=Emotion, and Feeling=Emotion+Mood]




Results: With and without mood
[Figure: line plot of median processing cycles (y-axis, 240 to 300) against
 episode (x-axis, 8 to 15) for Feeling=Emotion and Feeling=Emotion+Mood,
 with the Optimal level shown for comparison]




Discussion
• Agent learns both internal (tasking) and external
  (movement) actions
• Emotion provides more frequent rewards, so the
  agent learns faster than with standard RL
• Mood “fills in the gaps”, allowing for even faster
  learning and less variability
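
Why more frequent rewards speed up learning can be seen in a single value update. The sketch below uses a generic tabular Q-learning update, which is a stand-in chosen for illustration and not necessarily the learning rule used in the talk. State names, the reward value, and the `alpha` and `gamma` parameters are hypothetical. With a sparse reward, a step that does not reach the goal leaves the value table unchanged; with a feeling-based reward on every step, the same step already carries a learning signal.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

QTable = DefaultDict[Tuple[Hashable, str], float]


def q_update(q: QTable, state: Hashable, action: str, reward: float,
             next_state: Hashable, actions: Sequence[str],
             alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Apply one tabular Q-learning update and return the new value."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Internal and external choices named in the talk, used here as action labels.
ACTIONS = ["attend", "task", "intend", "ignore"]

# Sparse reward: nothing is learned from a step that does not reach the goal.
q_sparse: QTable = defaultdict(float)
q_update(q_sparse, "s0", "attend", 0.0, "s1", ACTIONS)

# Feeling-based reward on every step (hypothetical value 0.4).
q_dense: QTable = defaultdict(float)
q_update(q_dense, "s0", "attend", 0.4, "s1", ACTIONS)
```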




Conclusion & Future Work
• Demonstrated a computational model that integrates
  emotion and cognitive control
• Confirmed emotion can drive reinforcement learning
• We have already successfully demonstrated similar
  learning in a more complex domain
• Would like to explore multi-agent scenarios

Developer Data Modeling Mistakes: From Postgres to NoSQL
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 

Marinier Laird Cogsci 2008 Emotionrl Pres

  • 1. Emotion-Driven Reinforcement Learning Bob Marinier & John Laird University of Michigan, Computer Science and Engineering CogSci’08
  • 2. Introduction • Interested in the functional benefits of emotion for a cognitive agent ▫ Appraisal theories of emotion ▫ PEACTIDM theory of cognitive control • Use emotion as a reward signal to a reinforcement learning agent ▫ Demonstrates a functional benefit of emotion ▫ Provides a theory of the origin of intrinsic reward
  • 3. Outline • Background ▫ Integration of emotion and cognition ▫ Integration of emotion and reinforcement learning ▫ Implementation in Soar • Learning task • Results
  • 4. Appraisal Theories of Emotion • A situation is evaluated along a number of appraisal dimensions, many of which relate the situation to current goals ▫ Novelty, goal relevance, goal conduciveness, expectedness, causal agency, etc. • Appraisals influence emotion • Emotion can then be coped with (via internal or external actions) [Figure: cycle linking Situation and Goals, Appraisals, Emotion, and Coping]
  • 5. Appraisals to Emotions (Scherer 2001)
      Appraisal                    | Joy                | Fear         | Anger
      Suddenness                   | High/medium        | High         | High
      Unpredictability             | High               | High         | High
      Intrinsic pleasantness       |                    | Low          |
      Goal/need relevance          | High               | High         | High
      Cause: agent                 |                    | Other/nature | Other
      Cause: motive                | Chance/intentional |              | Intentional
      Outcome probability          | Very high          | High         | Very high
      Discrepancy from expectation |                    | High         | High
      Conduciveness                | Very high          | Low          | Low
      Control                      |                    |              | High
      Power                        |                    | Very low     | High
  • 6. Cognitive Control: PEACTIDM (Newell 1990)
      Perceive   | Obtain raw perception
      Encode     | Create domain-independent representation
      Attend     | Choose stimulus to process
      Comprehend | Generate structures that relate stimulus to tasks and can be used to inform behavior
      Task       | Perform task maintenance
      Intend     | Choose an action, create prediction
      Decode     | Decompose action into motor commands
      Motor      | Execute motor commands
  • 7. Unification of PEACTIDM and Appraisal Theories [Figure: the PEACTIDM cycle (Perceive, Encode, Attend, Comprehend, Intend, Decode, Motor) annotated with the data passed between stages (Raw Perceptual Information, Stimulus Relevance, Stimulus chosen for processing, Current Situation Assessment, Action, Prediction, Motor Commands, Environmental Change) and with the appraisals (Suddenness, Unpredictability, Goal Relevance, Intrinsic Pleasantness, Causal Agent/Motive, Discrepancy, Conduciveness, Control/Power, Outcome Probability)]
  • 8. Distinction between emotion, mood, and feeling (Marinier & Laird 2007) • Emotion: Result of appraisals ▫ Is about the current situation • Mood: “Average” over recent emotions ▫ Provides historical context • Feeling: Emotion “+” Mood ▫ What the agent actually perceives
  • 9. Intrinsically Motivated Reinforcement Learning (Sutton & Barto 1998; Singh et al. 2004) [Figure: the standard RL loop (Environment, Critic, Agent; Actions, States, Rewards) next to the intrinsically motivated version, in which an “Organism” contains the Agent and an Internal Environment; labels include External Environment, Sensations, Actions, Decisions, Critic, Appraisal Process, +/- Feeling Intensity, Rewards, States] • Reward = Intensity * Valence
  • 10. Extending Soar with Emotion (Marinier & Laird 2007) [Figure: Soar architecture. Symbolic Long-Term Memories (Procedural, Episodic, Semantic) with Reinforcement Learning, Chunking, Episodic Learning, and Semantic Learning; Short-Term Memory (Situation, Goals); Decision Procedure; Appraisal Detector; Visual Imagery; Perception; Action; Body]
  • 11. Extending Soar with Emotion (Marinier & Laird 2007) [Figure: the same architecture, with parts marked as Knowledge or Architecture and with the Appraisal Detector expanded: Emotion (.5,.7,0,-.4,.3,…) and Mood (.7,-.2,.8,.3,.6,…) combine into a Feeling (.9,.6,.5,-.1,.8,…), and Feelings are placed in Short-Term Memory alongside Situation, Goals]
  • 13. Learning task: Encoding
      Direction | Passable | On path | Progress
      North     | false    | false   | true
      East      | false    | true    | true
      West      | false    | false   | true
      South     | true     | true    | true
  • 14. Learning task: Encoding & Appraisal
      Direction | Intrinsic Pleasantness | Goal Relevance | Unpredictability
      North     | Low                    | Low            | High
      East      | Low                    | High           | High
      West      | Low                    | Low            | High
      South     | Neutral                | High           | Low
  • 15. Learning task: Attending, Comprehending & Appraisal • South ▫ Intrinsic Pleasantness: Neutral ▫ Goal Relevance: High ▫ Unpredictability: Low ▫ Conduciveness: High ▫ Control: High ▫ …
  • 17. Learning task: Tasking [Figure: Optimal Subtasks]
  • 18. What is being learned? • When to Attend vs. Task • If Attending, what to Attend to • If Tasking, which subtask to create • When to Intend vs. Ignore
  • 19. Learning Results [Plot: Median Processing Cycles (0 to 12000) vs. Episode (1 to 15) for Standard RL, Feeling=Emotion, and Feeling=Emotion+Mood]
  • 20. Results: With and without mood [Plot: Median Processing Cycles (240 to 300) vs. Episode (8 to 15) for Feeling=Emotion, Feeling=Emotion+Mood, and Optimal]
  • 21. Discussion • Agent learns both internal (tasking) and external (movement) actions • Emotion allows for more frequent rewards, so the agent learns faster than with standard RL • Mood “fills in the gaps”, allowing for even faster learning and less variability
  • 22. Conclusion & Future Work • Demonstrated a computational model that integrates emotion and cognitive control • Confirmed that emotion can drive reinforcement learning • We have already successfully demonstrated similar learning in a more complex domain • Would like to explore multi-agent scenarios
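
The "Reward = Intensity * Valence" rule and the emotion, mood, and feeling distinction from the slides can be illustrated with a minimal Python sketch. The slides do not say how mood is averaged, how emotion and mood are combined, or how intensity and valence are derived from the appraisals. The names `Feeling`, `update_mood`, and `combine`, the scalar value ranges, and the `rate` parameter are therefore hypothetical stand-ins, not the authors' model.

```python
from dataclasses import dataclass


@dataclass
class Feeling:
    """Scalar summary of a feeling (assumed representation, not from the slides)."""
    intensity: float  # assumed range [0, 1]
    valence: float    # assumed range [-1, 1]; the sign marks good vs. bad


def reward(feeling: Feeling) -> float:
    """Reward = Intensity * Valence, as stated on the slide."""
    return feeling.intensity * feeling.valence


def update_mood(mood: float, emotion: float, rate: float = 0.1) -> float:
    """Hypothetical mood update: move mood a fraction `rate` toward the
    current emotion, so mood acts as a running average of recent emotions."""
    return mood + rate * (emotion - mood)


def combine(emotion: float, mood: float) -> float:
    """Hypothetical stand-in for Emotion "+" Mood: a sum clipped to [-1, 1]."""
    return max(-1.0, min(1.0, emotion + mood))
```

For example, `reward(Feeling(intensity=0.5, valence=-0.5))` evaluates to `-0.25`. Because a feeling is available on every processing cycle, an agent using this signal receives a reward at each step instead of only when the goal is reached, which is the property the Discussion slide credits for the faster learning.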