This presentation provides an in-depth analysis of the emerging field of Emotion AI. It covers aspects ranging from emotion elicitation and modelling to sensing and recognition, paying special attention to the art of the possible: existing technologies for emotion sensing and AI models for the automatic recognition of human emotions.
PRESENTATION
ABOUT ME
Oresti Banos
Associate Professor, PhD
Research Center for ICT
University of Granada (Spain)
oresti@ugr.es
@orestibanos
http://orestibanos.com/
Research:
• smart ubiquitous sensing
• holistic behaviour modelling
• virtual coaching systems
OUTLINE
TODAY WE ARE GOING TO TALK ABOUT...
1. EMOTIONS
Some definitions
2. EMOTION AI
Context and prospects
3. EMOTION RECOGNITION
The engine of Emotion AI
4. EMOTION MODELS
Psychological foundations
5. EMOTION SENSING
Physical, physiological and questionnaires
6. EMOTION RECOGNITION PIPELINE
Transforming data into emotions
7. CHALLENGES AND OPPORTUNITIES
Harnessing limitations to create new avenues
8. CONCLUSIONS
Take home message
EMOTIONS
Definitions
“Emotion is a complex set of interactions
among subjective and objective factors, mediated
by neural/hormonal systems, which can:
• give rise to affective experiences such as
feelings of arousal, pleasure/displeasure;
• generate cognitive processes such as
emotionally relevant perceptual effects,
appraisals, labeling processes;
• activate widespread physiological adjustments
to the arousing conditions;
• lead to behavior that is often, but not always,
expressive, goal directed, and adaptive."
(Kleinginna & Kleinginna, 1981)
EMOTIONS
Definitions
“The co-occurring components that compose a
prototypical emotional episode include:
• overt behavior congruent with the emotion
(e.g., a smile or a facial expression of fear);
• attention directed toward the eliciting stimulus;
• cognitive appraisal of the meaning and
possible implications of the stimulus;
• attribution of the genesis of the episode to the
stimulus;
• the experience of the particular emotion, and
• physiological and endocrine changes consistent
with the particular emotion.”
(Ekkekakis, 2012)
EMOTIONS
Definitions
One simpler definition could be:
• Affects are the basic biological building blocks, which we share with babies and animals
• These can be expressed through emotions (for instance, facial expressions)
• Moods are emotional states that last over a comparatively long time, while emotions may be short-lived
• Your personal history of affects and emotions constitutes your feelings
Emotions do not always represent true affects: we use emotions to signal or hide affects
EMOTIONS
Why are they relevant anyway?
Emotions, which affect both human physiological
and psychological status, play a very important
role in human life
Positive emotions help improve human health
and work efficiency, while negative emotions may
cause health problems
Long-term accumulations of negative emotions
are predisposing factors for depression, which
might lead to suicide in the worst cases
EMOTION RECOGNITION
The key enabler of Emotion AI
Due to the complex mutual interaction of physiology and psychology in emotions, recognizing human emotions precisely and in a timely manner is still a modest capability, and it remains the target of scientific research and industry innovation despite the large number of efforts made by researchers in different interdisciplinary fields
EMOTION MODELS
Quantifying human emotions
Emotions should be defined and assessed quantitatively
Basic emotions were first proposed decades ago, but a precise definition has never been widely agreed upon by psychologists
Psychologists tend to model emotions in two different ways:
• Dividing emotions into discrete categories
• Using multiple dimensions to label emotions
EMOTION MODELS
Multi-Dimensional Emotion Space Models
(Russell, 1980)
Valence is the subjective spectrum of positive-to-negative evaluation of an experience an individual may have had, ranging from unpleasant (negative) to pleasant (positive)
Arousal is the self-perceived level of activation (whether the individual is likely to take an action under the mood state), ranging from passive (low) to active (high)
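To make the dimensional view concrete, the sketch below places a few discrete labels in the valence-arousal plane and maps a continuous estimate to the nearest label. The coordinates are illustrative assumptions for this example, not canonical positions from Russell's circumplex.

    import math

    # Illustrative (valence, arousal) coordinates in [-1, 1];
    # assumed values for this sketch, not canonical positions.
    EMOTION_COORDS = {
        "happy":   (0.8, 0.5),    # pleasant, activated
        "angry":   (-0.6, 0.8),   # unpleasant, highly activated
        "sad":     (-0.7, -0.4),  # unpleasant, deactivated
        "relaxed": (0.6, -0.6),   # pleasant, deactivated
    }

    def nearest_emotion(valence, arousal):
        """Map a continuous (valence, arousal) estimate to the closest label."""
        return min(EMOTION_COORDS,
                   key=lambda e: math.dist(EMOTION_COORDS[e], (valence, arousal)))

    print(nearest_emotion(0.7, 0.4))  # -> happy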
EMOTION MODELS
Multi-Dimensional Emotion Space Models
(Mehrabian, 1997)
The added dimension axis is named dominance, ranging from submissive to dominant
Dominance reflects the degree of control a person has in a certain emotional state
EMOTION SENSING
How to measure emotional signals
Emotions often refer to mental states that arise spontaneously rather than through conscious effort, and are often accompanied by physical and physiological changes in human organs and tissues such as the brain, heart, skin, blood flow and muscles, as well as in facial expressions, voice, etc.
EMOTION SENSING
How to measure emotional signals
Emotion sensing modalities:
• Physical: facial expression, speech, gestures, body posture
• Physiological: electroencephalogram (EEG), skin temperature (T), electrocardiogram (ECG), galvanic skin response (GSR), respiration (RSP)
• Questionnaires: PANAS, SAM, PAM, ESM
EMOTION RECOGNITION PIPELINE
Preprocessing
Time-domain digital filter
• Offset filter
• Moving average filter
• Median filter
• Detrending
Frequency-domain digital filter
• Butterworth/Chebyshev
• Notch/Comb
• Savitzky–Golay
[Figure: ECG preprocessing examples: a noisy ECG signal with baseline drift and its detrended version; a noisy ECG signal compared against the ideal and median-filtered signals]
https://www.slideshare.net/orestibl/biosignal-processing
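As a minimal sketch of these two preprocessing steps, assuming a toy 250 Hz signal as a stand-in for real ECG data, SciPy's detrend removes the linear baseline drift and medfilt suppresses impulsive noise:

    import numpy as np
    from scipy.signal import detrend, medfilt

    fs = 250                                  # assumed sampling rate (Hz)
    t = np.arange(0, 4, 1 / fs)
    clean = np.sin(2 * np.pi * 1.2 * t)       # toy periodic component (~72 bpm)
    noisy = clean + 0.5 * t + 0.1 * np.random.randn(t.size)  # drift + noise

    detrended = detrend(noisy)                # remove linear baseline drift
    filtered = medfilt(detrended, kernel_size=7)  # median filter against spikes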
EMOTION RECOGNITION PIPELINE
Segmentation
Process to divide the sensor data stream into smaller time
segments or data windows
The segmentation process is frequently called “windowing” as
each segment represents a data window or frame
In real-time applications, windows are defined concurrently
with data acquisition and processing, so data streams can be
effectively analysed “on-the-fly”
EMOTION RECOGNITION PIPELINE
Segmentation
Sliding window
• Signals are split into
windows of a fixed size
and with no inter-window
gaps
• An overlap between
adjacent windows is
sometimes tolerated
• Most widely used
approach
[Figure: fixed-size windows laid out contiguously (non-overlapping) and with adjacent windows partially overlapping]
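A minimal sliding-window sketch in NumPy; overlap is the fraction shared between adjacent windows, with 0.0 reproducing the non-overlapping case:

    import numpy as np

    def sliding_windows(signal, window_size, overlap=0.0):
        """Split a 1-D signal into fixed-size windows with optional overlap."""
        step = max(1, int(window_size * (1 - overlap)))
        return np.array([signal[i:i + window_size]
                         for i in range(0, len(signal) - window_size + 1, step)])

    # e.g. 2 s windows (500 samples at 250 Hz) with 50% overlap:
    windows = sliding_windows(np.arange(1000), window_size=500, overlap=0.5)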
EMOTION RECOGNITION PIPELINE
Segmentation
Event-defined window
• The segment start and
end is defined by a
detected event
• Additional processing is
required to identify the
events of interest
• Example: estimation of the R peaks from ECG using the Pan-Tompkins algorithm
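A full Pan-Tompkins implementation (band-pass filtering, differentiation, squaring, integration, adaptive thresholding) is beyond a slide-sized example, so the sketch below uses SciPy's generic peak finder as a simplified stand-in to build event-centred windows around the detected R peaks:

    import numpy as np
    from scipy.signal import find_peaks

    def windows_around_r_peaks(ecg, fs, half_width_s=0.3):
        """Event-defined segmentation: one window centred on each R peak.
        Simplified detector (a stand-in for Pan-Tompkins): peaks above
        60% of the maximum amplitude and at least 0.4 s apart."""
        peaks, _ = find_peaks(ecg, height=0.6 * np.max(ecg),
                              distance=int(0.4 * fs))
        half = int(half_width_s * fs)
        return [ecg[p - half:p + half]
                for p in peaks if half <= p <= len(ecg) - half]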
EMOTION RECOGNITION PIPELINE
Traditional ML (Feature Extraction)
Process of (numerically) characterising or transforming raw data
into more descriptive or informative data
Intended to facilitate the subsequent learning and generalization
steps, and in some cases lead to better human interpretations
Example: Location=prefrontal, Size=3cm, Density=60g/cm3, …
EMOTION RECOGNITION PIPELINE
Traditional ML (Feature Extraction)
Time-domain features: statistical values derived directly
from data window
Examples:
• Max
• Min
• Mean
• Median
• Variance
• Skewness
• Kurtosis
[Figure: X-axis acceleration signals over 4 s for JUMPING, WALKING and STANDING]
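A minimal sketch computing these statistics for one data window with NumPy and scipy.stats:

    import numpy as np
    from scipy.stats import skew, kurtosis

    def time_domain_features(window):
        """Statistical features computed directly on one data window."""
        return {
            "max": np.max(window), "min": np.min(window),
            "mean": np.mean(window), "median": np.median(window),
            "variance": np.var(window),
            "skewness": skew(window), "kurtosis": kurtosis(window),
        }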
EMOTION RECOGNITION PIPELINE
Traditional ML (Feature Extraction)
Frequency-domain features: derived from a transformed
version of the data window in the frequency domain
Examples:
• Fundamental frequency
• N-order harmonics
• Mean/Median/Mode frequency
• Spectral power/energy
• Entropy
• Cepstrum coefficients
[Figure: FFT magnitude spectra (0-25 Hz) of the X-axis acceleration signal for JUMPING, STANDING and WALKING]
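A minimal sketch deriving a few of these features from the magnitude spectrum of a window, assuming the sampling rate fs is known:

    import numpy as np

    def frequency_domain_features(window, fs):
        """Features derived from the FFT magnitude spectrum of one window."""
        mag = np.abs(np.fft.rfft(window))
        freqs = np.fft.rfftfreq(len(window), d=1 / fs)
        power = mag ** 2
        p = power / power.sum()                     # normalised spectrum
        return {
            "fundamental_freq": freqs[1 + np.argmax(mag[1:])],  # skip DC bin
            "spectral_energy": power.sum() / len(window),
            "spectral_entropy": -np.sum(p * np.log2(p + 1e-12)),
            "mean_freq": np.sum(freqs * p),
        }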
EMOTION RECOGNITION PIPELINE
Traditional ML (Feature Extraction)
The outcome of the feature extraction process is normally a feature
matrix
• Rows represent each data instance, chunk or segment
• Columns refer to the mathematical functions (features) computed on them
[Example feature matrix with columns F1: Mean and F2: Variance; each row holds the feature values computed for one data segment]
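Putting the previous steps together, a minimal sketch that stacks per-window features into such a matrix, reusing the hypothetical helpers sketched earlier in this pipeline:

    import numpy as np

    def build_feature_matrix(windows, fs):
        """One row per window (instance), one column per feature."""
        names, rows = [], []
        for w in windows:
            feats = {**time_domain_features(w),
                     **frequency_domain_features(w, fs)}
            names = sorted(feats)                # fixed column order
            rows.append([feats[k] for k in names])
        return np.array(rows), names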
EMOTION RECOGNITION PIPELINE
Traditional ML (Feature Extraction)
Feature space:
• Total number of features
extracted from the data
• Normally described as an array
(also known as feature matrix)
in which rows represent each
instance and columns the
feature type
• The dimensions (D) of the
feature space are given by the
number of features (N)
[Figure: an example feature matrix with columns F1: Mean and F2: Variance, plotted as a feature space in which each instance is a point and classes (e.g., sitting vs. climbing, positive vs. negative) form distinguishable clusters]
EMOTION RECOGNITION PIPELINE
Traditional ML (Feature Reduction)
Process to select relevant and informative
features
Different motivations
• General data reduction: limit storage
requirements and increase algorithm speed
• Feature set reduction: save resources in the
next round of data collection or during its
utilisation
• Performance improvement: gain in
predictive accuracy
• Data understanding: acquire knowledge
about the process that generated the data
or simply visualise the data
EMOTION RECOGNITION PIPELINE
Traditional ML (Feature Reduction)
Visualising the feature space can help determine which features (or combinations thereof) are most discriminative
Hyperdimensional feature spaces (#features > 3) need to be reduced for proper visualisation (e.g., PCA, ICA)
[Figure: 3D feature-space representation of mean acceleration along the X, Y and Z axes, separating standing, walking and jumping]
Do not trust statistics alone,
visualise your data!
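As a minimal sketch of such a reduction, assuming a feature matrix X (windows × features) and per-window labels from the extraction step, scikit-learn's PCA projects the space onto two principal components for plotting:

    import matplotlib.pyplot as plt
    from sklearn.decomposition import PCA

    def plot_feature_space_2d(X, labels):
        """Project a hyperdimensional feature matrix to 2-D and scatter it."""
        X2 = PCA(n_components=2).fit_transform(X)
        for cls in set(labels):
            idx = [i for i, lab in enumerate(labels) if lab == cls]
            plt.scatter(X2[idx, 0], X2[idx, 1], label=cls)
        plt.xlabel("PC 1"); plt.ylabel("PC 2")
        plt.legend(); plt.show()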
CHALLENGES
Existing limitations
• Successful elicitation of target emotions for further processing is difficult, since the perceived emotions may not be the induced emotions
• Models based on controlled experiments (posed emotions, contextless, etc.) cannot be accurately used to recognize emotions in real-world scenarios
• Though many features have been tried, there is still no clear evidence of which feature combinations, over which physiological signal combinations, are most relevant to emotion changes
CHALLENGES
Existing limitations
• The number of subjects is usually small (2-30), leading to limited sample sets that generate poor classification models
• Several factors in the preprocessing and analytical procedures constrain the choice of classifiers (e.g., if the number of samples is small, only linear classifiers are applicable and deep learning is not possible)
• Human movement influences the physiological signals measured from a user and, therefore, also the results of the emotion recognition
CONCLUSIONS
Take Home Message
Physical emotion AI (facial/speech emotion recognition) has attracted most of the attention; however, it comes with great limitations: emotions are "easy" to fake, gender/ethnicity bias, privacy concerns, etc.
Physiological emotion AI can overcome many of these constraints by using body signals and vital signs, mainly EEG, ECG/PPG, EDA, RESP and TEMP
Emotional responses are hardly detectable from a single physiological signal; hence, current and future trends should focus on multimodal physiological data fusion in combination with advanced deep learning methods
REFERENCES
For those who want to dig deeper...
Al-Nafjan, A., Hosny, M., Al-Ohali, Y., & Al-Wabil, A. (2017). Review and classification of emotion recognition based on EEG
brain-computer interface system research: a systematic review. Applied Sciences, 7(12), 1239.
Bang, J., Hur, T., Kim, D., Lee, J., Han, Y., Banos, O., ... & Lee, S. (2018). Adaptive Data Boosting Technique for Robust
Personalized Speech Emotion in Emotionally-Imbalanced Small-Sample Environments. Sensors, 18(11), 3744.
Banos, O., Damas, M., Pomares, H., Rojas, F., Delgado-Marquez, B., & Valenzuela, O. (2013). Human activity recognition
based on a sensor weighting hierarchical classifier. Soft Computing, 17(2), 333-343.
Banos, O., Damas, M., Guillen, A., Herrera, L. J., Pomares, H., Rojas, I., & Villalonga, C. (2015). Multi-sensor fusion based on
asymmetric decision weighting for robust activity recognition. Neural Processing Letters, 42(1), 5-26.
Bekkedal, M. Y., Rossi III, J., & Panksepp, J. (2011). Human brain EEG indices of emotions: delineating responses to affective
vocalizations by measuring frontal theta event-related synchronization. Neuroscience & Biobehavioral Reviews, 35(9), 1959-
1970.
Bradley, M. M., & Lang, P. J. (1994). Measuring emotion: the self-assessment manikin and the semantic differential. Journal
of behavior therapy and experimental psychiatry, 25(1), 49-59.
Bradley, M. M. & Lang, P. J. (2007). The International Affective Digitized Sounds (2nd Edition; IADS-2): Affective ratings of
sounds and instruction manual. Technical report B-3. University of Florida, Gainesville, Fl
Cheng, Z., Shu, L., Xie, J., & Chen, C. P. (2017, December). A novel ECG-based real-time detection method of negative
emotions in wearable applications. In 2017 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC)
(pp. 296-301). IEEE.
Chu, M., Nguyen, T., Pandey, V., Zhou, Y., Pham, H. N., Bar-Yoseph, R., ... & Khine, M. (2019). Respiration rate and volume
measurements using wearable strain sensors. NPJ digital medicine, 2(1), 1-9.
Cowen, A. S., et al. (2020). What music makes us feel: At least 13 dimensions organize subjective experiences associated with music across different cultures. Proceedings of the National Academy of Sciences.
Ekkekakis, P. (2012). Affect, mood, and emotion. Measurement in sport and exercise psychology, 321.
Ekman, P. (1992). An argument for basic emotions. Cognition & emotion, 6(3-4), 169-200.
Feng, H., Golshan, H. M., & Mahoor, M. H. (2018). A wavelet-based approach to emotion classification using EDA signals.
Expert Systems with Applications, 112, 77-86.
Guo, H. W., Huang, Y. S., Lin, C. H., Chien, J. C., Haraikawa, K., & Shieh, J. S. (2016, October). Heart rate variability signal
features for emotion recognition by using principal component analysis and support vectors machine. In 2016 IEEE 16th
International Conference on Bioinformatics and Bioengineering (BIBE) (pp. 274-277). IEEE.
Harmon-Jones, E., Harmon-Jones, C., Amodio, D. M., & Gable, P. A. (2011). Attitudes toward emotions. Journal of personality
and social psychology, 101(6), 1332.
Hui, T. K., & Sherratt, R. S. (2018). Coverage of emotion recognition for common wearable biosensors. Biosensors, 8(2), 30.
Izard, C. E. (2007). Basic emotions, natural kinds, emotion schemas, and a new paradigm. Perspectives on psychological
science, 2(3), 260-280.
Jiang, W., Wang, Z., Jin, J. S., Han, X., & Li, C. (2019). Speech Emotion Recognition with Heterogeneous Feature Unification of
Deep Neural Network. Sensors, 19(12), 2730.
Kerkeni, L., Serrestou, Y., Mbarki, M., Raoof, K., Mahjoub, M. A., & Cleder, C. (2019). Automatic Speech Emotion Recognition
Using Machine Learning. In Social Media and Machine Learning. IntechOpen.
Kleinginna, P. R., & Kleinginna, A. M. (1981). A categorized list of emotion definitions, with suggestions for a consensual
definition. Motivation and emotion, 5(4), 345-379.
Koelstra, S., Muhl, C., Soleymani, M., Lee, J. S., Yazdani, A., Ebrahimi, T., ... & Patras, I. (2011). Deap: A database for emotion
analysis; using physiological signals. IEEE transactions on affective computing, 3(1), 18-31.
Lan, Z., Sourina, O., Wang, L., & Liu, Y. (2016). Real-time EEG-based emotion monitoring using stable features. The Visual
Computer, 32(3), 347-358.
Lang, P. J., Bradley, M. M., & Cuthbert, B. N. (1999). International affective picture system (IAPS): Instruction manual and
affective ratings. The center for research in psychophysiology, University of Florida.
Li, S., & Deng, W. (2018). Deep facial expression recognition: A survey. arXiv preprint
Marín-Morales, J., Higuera-Trujillo, J. L., Greco, A., Guixeres, J., Llinares, C., Scilingo, E. P., ... & Valenza, G. (2018). Affective
computing in virtual reality: emotion recognition from brain and heartbeat dynamics using wearable sensors. Scientific
reports, 8(1), 1-15.
Mehrabian, A. (1997). Comparison of the PAD and PANAS as models for describing emotions and for differentiating anxiety
from depression. Journal of psychopathology and behavioral assessment, 19(4), 331-357.
Mirmohamadsadeghi, L., Yazdani, A., & Vesin, J. M. (2016, September). Using cardio-respiratory signals to recognize
emotions elicited by watching music video clips. In 2016 IEEE 18th International Workshop on Multimedia Signal Processing
(MMSP) (pp. 1-5). IEEE.
Morris, J. D. (1995). Observations: SAM: the Self-Assessment Manikin; an efficient cross-cultural measurement of emotional
response. Journal of advertising research, 35(6), 63-68.
Nasoz, F., Alvarez, K., Lisetti, C. L., & Finkelstein, N. (2004). Emotion recognition from physiological signals using wireless
sensors for presence technologies. Cognition, Technology & Work, 6(1), 4-14.
Plutchik, R. (2001). The nature of emotions: Human emotions have deep evolutionary roots, a fact that may explain their
complexity and provide tools for clinical practice. American scientist, 89(4), 344-350.
Pollak, J. P., Adams, P., & Gay, G. (2011). PAM: a photographic affect meter for frequent, in situ measurement of affect. In
Proceedings of the SIGCHI conference on Human factors in computing systems (pp. 725-734).
Russell, J. A. (1980). A circumplex model of affect. Journal of personality and social psychology, 39(6), 1161.
Semwal, N., Kumar, A., & Narayanan, S. (2017, February). Automatic speech emotion detection system using multi-domain
acoustic feature selection and classification models. In 2017 IEEE International Conference on Identity, Security and
Behavior Analysis (ISBA) (pp. 1-6). IEEE.
Siddiqi, M. H., Ali, M., Eldib, M. E. A., Khan, A., Banos, O., Khan, A. M., ... & Choo, H. (2018). Evaluating real-life performance
of the state-of-the-art in facial expression recognition using a novel YouTube-based datasets. Multimedia Tools and
Applications, 77(1), 917-937.
Shu, L., Xie, J., Yang, M., Li, Z., Li, Z., Liao, D., ... & Yang, X. (2018). A review of emotion recognition using physiological
signals. Sensors, 18(7), 2074.
Valenza, G., Citi, L., Lanatá, A., Scilingo, E. P., & Barbieri, R. (2014). Revealing real-time emotional responses: a personalized
assessment based on heartbeat dynamics. Scientific reports, 4, 4998.
Veen, F. van (2016, September 14). The Neural Network Zoo. The Asimov Institute. https://www.asimovinstitute.org/neural-network-zoo/
Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect:
the PANAS scales. Journal of personality and social psychology, 54(6), 1063.
Wegrzyn, M., Vogt, M., Kireclioglu, B., Schneider, J., & Kissler, J. (2017). Mapping the emotional face. How individual face
parts contribute to successful emotion recognition. PloS one, 12(5).
Wen, W., Liu, G., Cheng, N., Wei, J., Shangguan, P., & Huang, W. (2014). Emotion recognition based on multi-variant
correlation of physiological signals. IEEE Transactions on Affective Computing, 5(2), 126-140.
Wu, S., Xu, X., Shu, L., & Hu, B. (2017, November). Estimation of valence of emotion using two frontal EEG channels. In 2017
IEEE international conference on bioinformatics and biomedicine (BIBM) (pp. 1127-1130). IEEE.
Xu, Y., Hübener, I., Seipp, A. K., Ohly, S., & David, K. (2017, March). From the lab to the real-world: An investigation on the
influence of human movement on Emotion Recognition using physiological signals. In 2017 IEEE International Conference on
Pervasive Computing and Communications Workshops (PerCom Workshops) (pp. 345-350). IEEE.
Yang, W., Makita, K., Nakao, T., Kanayama, N., Machizawa, M. G., Sasaoka, T., ... & Iwanaga, M. (2018). Affective auditory
stimulus database: An expanded version of the International Affective Digitized Sounds (IADS-E). Behavior research
methods, 50(4), 1415-1429.
Zhang, Q., Chen, X., Zhan, Q., Yang, T., & Xia, S. (2017). Respiration-based emotion recognition with deep learning.
Computers in Industry, 92, 84-90.
Zhang, W., Shu, L., Xu, X., & Liao, D. (2017). Affective virtual reality system (avrs): Design and ratings of affective vr scenes.
In 2017 International Conference on Virtual Reality and Visualization (ICVRV) (pp. 311-314). IEEE.
Zhao, M., Adib, F., & Katabi, D. (2016, October). Emotion recognition using wireless signals. In Proceedings of the 22nd Annual International Conference on Mobile Computing and Networking (pp. 95-108).
Zheng, W. L., Zhu, J. Y., & Lu, B. L. (2017). Identifying stable patterns over time for emotion recognition from EEG. IEEE
Transactions on Affective Computing.
MANY THANKS!
CONTACT:
Oresti Banos
Room 26 (2nd floor), Faculty ETSIIT,
University of Granada,
E-18071 Granada, Spain
Phone: (+34) 958248598
Email / Web: oresti@ugr.es
http://orestibanos.com/