1
Exploring the Relationship
Between Multi-Modal Emotion
Semantics of Music
Ju-Chiang Wang, Yi-Hsuan Yang, Kaichun
Chang, Hsin-Min Wang, and Shyh-Kang Jeng
Academia Sinica,
National Taiwan University,
Taipei, Taiwan
2
Outline
• Introduction and Potentiality
• Methodology
– The ATB and AEG models
– Framework to combine the two models
• Evaluation and Result
• Conclusion
• In this presentation, mood and emotion
are used interchangeably
3
Introduction – Tag and Valence-Arousal (VA)
• Music emotion modeling follows two approaches: categorical
(mood tags) and dimensional (valence-arousal)
• Both share a unified goal of understanding the emotion
semantics of music
• (Arbitrary) mood tags can be mapped into the VA space
in an unsupervised, content-based manner, without any
training ground truth for the semantic mapping
• Automatically generate a semantically structured tag cloud
in the VA space
[Figure: the VA plane, with valence (negative to positive) on the
horizontal axis and arousal (low to high) on the vertical axis,
divided into quadrants 1 to 4]
4
Visualization of Music Mood (Laurier et al.)
Generated by SOM
5
Potentiality (Clarifying the Debate)
• A novice user may be unfamiliar with the VA model, so
displaying mood tags in the VA space would be helpful
• Facilitates applications such as tag-based music search
and browsing interfaces
• Dimension reduction for tag visualization may result in
dimensions that do not conform to valence and arousal
• The VA values of some affective terms are available,
but they were not elicited from music
• Affective terms are not cross-lingual and do not always
have exact translations in other languages
• Culture-dependent and corpus-dependent
6
Taxonomy of Music Mood (Xiao Hu, et al.)
Aggressive 侵略的;好鬥
Amiable 和藹可親的;厚道的
Autumnal 秋的;像秋天的
Bittersweet 苦樂參半的
Boisterous 喧鬧的;狂暴的
Brooding 徘徊不去的;沈思的
Calm 冷靜;鎮定
Campy 裝模作樣;
Cheerful 興高采烈的;情緒好的
Confident 有信心的,自負的
Dreamy 夢幻般的;愛作白日夢的;
Fiery (感情)激烈的,熱烈的
Fun 有趣的
Humorous 幽默的;滑稽的
Intense 強烈的;熱情的
Literate 有文化修養的
Nostalgic 鄉愁的
Passionate 熱情的;熱烈的;易怒的
Poignant 深刻的;辛酸的
Quirky 詭詐的;多變的;古怪的
Relaxed 鬆懈的;放鬆的
Rollicking 嬉耍的;愉快的
Rousing 使覺醒的;使奮起的
Rowdy 粗暴的;喧鬧的
Silly 愚蠢的;糊塗的;無聊的
Soothing 慰藉的;使人寬心的
Sweet 甜的;悅耳的
Tense 緊張的;引起緊張的
Visceral 出自內心深處的
Volatile 易發作的;輕浮的;飛逝的
Whimsical 想入非非的,怪誕的,古怪的
Wistful 渴望的;想往的;留戀的
Witty 機智的;說話風趣的
Wry 歪斜的;曲解的;堅持錯誤的
GAP GAP
7
Potentiality (Clarifying the Debate)
Machine Learning is necessary for such a task
8
Methodology of the Framework
• A probabilistic framework with two component models,
Acoustic Tag Bernoullis (ATB) and Acoustic Emotion
Gaussians (AEG)
– Computationally model the generative processes from acoustic
features to a mood tag and a VA value, respectively
• Based on the same acoustic feature space, the ATB and
AEG models can share and transfer semantic
information to each other
• Bridged by the acoustic feature space, one emotion
modality can be aligned to the other
• The first attempt to establish a joint model for exploring
the relationship between discrete mood categories and
the continuous emotion space
9
Construct Feature Reference Model
[Figure: frame-based features are extracted from the audio signals
of the music tracks in a universal music database; a global set of
frame vectors, randomly selected from each track, is used for EM
training of a global acoustic GMM (components A1 ... AK) for
acoustic feature encoding]
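The pipeline in this slide can be sketched in a few lines of NumPy. This is a minimal illustration only, not the authors' implementation: the diagonal covariances, the random initialization, the variance floor, the toy 4-dimensional features, and the names `sample_frames`, `log_gauss_diag`, and `fit_gmm_em` are all assumptions made for the example.

```python
import numpy as np

def sample_frames(tracks, per_track, rng):
    """Randomly pick up to `per_track` frame vectors from every track."""
    picked = []
    for frames in tracks:
        n = min(per_track, len(frames))
        idx = rng.choice(len(frames), size=n, replace=False)
        picked.append(frames[idx])
    return np.vstack(picked)

def log_gauss_diag(X, means, variances):
    """Log density of each row of X under K diagonal Gaussians, shape (N, K)."""
    diff = X[:, None, :] - means[None, :, :]
    maha = np.sum(diff ** 2 / variances, axis=2)
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)
    return -0.5 * (maha + log_norm)

def fit_gmm_em(X, K, n_iter, rng, var_floor=1e-6):
    """Plain EM for a diagonal-covariance GMM (assumed form, for illustration)."""
    N, _ = X.shape
    means = X[rng.choice(N, size=K, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + var_floor, (K, 1))
    weights = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each frame
        log_r = np.log(weights) + log_gauss_diag(X, means, variances)
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        Nk = resp.sum(axis=0) + 1e-12
        weights = Nk / Nk.sum()
        means = (resp.T @ X) / Nk[:, None]
        second = (resp.T @ (X ** 2)) / Nk[:, None]
        variances = np.maximum(second - means ** 2, var_floor)
    return weights, means, variances

# Toy stand-in for a music database: 3 "tracks" of 200 frames, 4-dim features
rng = np.random.default_rng(0)
tracks = [rng.normal(loc=c, size=(200, 4)) for c in (-3.0, 0.0, 3.0)]
global_set = sample_frames(tracks, per_track=50, rng=rng)
weights, means, variances = fit_gmm_em(global_set, K=3, n_iter=20, rng=rng)
```

Sampling a fixed number of frames per track keeps long tracks from dominating the global set, which is one plausible reason for the random selection shown in the figure.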
10
Represent a Song into Probabilistic Space
1
2
K-1
K…
Posterior
Probabilities over
the Acoustic GMM
…
A1
A2
AK-1
Acoustic GMM
AK
…
Feature Vectors
Histogram:
Acoustic GMM Posterior
prob
Each dim corresponds to a specific acoustic pattern
1 2 K-1 K…
11
Acoustic Tag Bernoullis (ATB)
• Given a mood-tagged music dataset with binary
labels for each mood tag
• Learn an ATB model that describes the generative process of
each song in the dataset from acoustic features to a mood tag
• Won on the AUC Clip metric in Mood Tag Classification
(MIREX 2009, 2010)
12
Acoustic Emotion Gaussians (AEG)
• Given a VA-annotated music dataset
• Learn AEG that describes the generative process of
each song in the dataset from acoustic features to the
VA space
• Presented in OS2; outperforms its rivals, SVR and MLR
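One way to picture the generative view from acoustic features to the VA space is as a mixture of K two-dimensional Gaussians whose mixing weights are the clip's acoustic GMM posterior. The sketch below assumes exactly that form; the parameter values and the names `gaussian_pdf_2d`, `va_density`, and `va_mean` are hypothetical and are not taken from the slides.

```python
import numpy as np

def gaussian_pdf_2d(point, mean, cov):
    """Density of a 2-D Gaussian at `point`."""
    diff = np.asarray(point, dtype=float) - mean
    inv = np.linalg.inv(cov)
    norm = 2.0 * np.pi * np.sqrt(np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ inv @ diff) / norm)

def va_density(point, theta, va_means, va_covs):
    """Assumed AEG-style density: sum_k theta_k * N(point | mu_k, Sigma_k)."""
    return sum(t * gaussian_pdf_2d(point, m, c)
               for t, m, c in zip(theta, va_means, va_covs))

def va_mean(theta, va_means):
    """Expected VA value of the mixture."""
    return theta @ va_means

# Hypothetical clip posterior and VA Gaussians (illustrative numbers only)
theta = np.array([0.5, 0.5])
va_means = np.array([[0.5, 0.5], [-0.5, -0.5]])
va_covs = np.array([0.1 * np.eye(2), 0.1 * np.eye(2)])

predicted = va_mean(theta, va_means)
density_at_origin = va_density([0.0, 0.0], theta, va_means, va_covs)
```

Because the output is a distribution rather than a single point, such a model can express how much listeners disagree about a clip, not only where its emotion lies on average.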
13
The Learning of VA GMM on MER60
14
Multi-Modal Emotion Semantic Mapping
• Three models are aligned: ATB, the acoustic GMM, and AEG
• Transfer the weights from a mood tag to the VA GMM
• The semantic mapping processes are transparent and
easy to observe and interpret
[Figure: mapping a tag into a VA Gaussian distribution]
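The weight transfer can be illustrated as follows, under stated assumptions: each tag is taken to carry one non-negative weight per acoustic component, those weights are normalized and applied to the K VA Gaussians, and the resulting mixture is collapsed into a single Gaussian by moment matching. The normalization, the moment-matching step, and the name `tag_to_va_gaussian` are choices made for this sketch; the slides do not specify how the single Gaussian is obtained.

```python
import numpy as np

def tag_to_va_gaussian(tag_weights, va_means, va_covs):
    """Collapse a tag-weighted VA mixture into one Gaussian (moment matching)."""
    w = np.asarray(tag_weights, dtype=float)
    w = w / w.sum()                      # normalize to mixing weights
    mean = w @ va_means                  # mixture mean
    diff = va_means - mean
    # Total covariance = weighted within-component + between-component spread
    within = np.einsum("k,kij->ij", w, va_covs)
    spread = np.einsum("k,ki,kj->ij", w, diff, diff)
    return mean, within + spread

# Hypothetical per-component tag weights and VA Gaussians
tag_weights = [0.3, 0.1]
va_means = np.array([[1.0, 0.0], [-1.0, 0.0]])
va_covs = np.array([0.1 * np.eye(2), 0.1 * np.eye(2)])

tag_mean, tag_cov = tag_to_va_gaussian(tag_weights, va_means, va_covs)
# tag_mean is [0.5, 0.0]; tag_cov is [[0.85, 0.0], [0.0, 0.1]]
```

The mean gives the tag's position in the VA plane, and the covariance indicates how widely the tag is spread across it.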
15
Evaluation – Corpora and Settings
• Two corpora used: MER60 and AMG1644
• MER60: jointly annotated corpus (MER60-alone setting)
– 60 music clips, each 30 seconds long
– 99 subjects in total, each clip annotated by 40 subjects
– The VA values are entered by clicking on the emotion space
on a computer display
– Query Last.fm and keep the top 50 mood tags for the 60 songs
• AMG1644: used for the separately annotated corpora
(AMG1644-MER60 setting)
– Crawl the audio of the “top songs” for 33 mood tags (AMG),
most of which are used in the MIREX mood classification task
– Yields 1,644 clips, each about 30 seconds long
16
Acoustic Features
• Adopt the bag-of-frames representation
• Extract frame-based musical features from audio
using MIRToolbox 1.3
• All frames of a clip are aggregated into the acoustic
GMM posterior, so emotion is analyzed at the clip level
rather than the frame level
• Frame-based features
– Dynamic, spectral, timbre, and tonal
– 70-dim concatenated feature vector for each frame
17
Result for the MER60-Alone Setting
• Visualized with Graphviz, using a Voronoi diagram-based
heuristic to avoid overlapping tags
18
Result for the AMG-MER Setting
• Visualized with Graphviz, using a Voronoi diagram-based
heuristic to avoid overlapping tags
19
Comparison with Psychologists
• Quantitative comparison
– Refer to the VA values of 30 affective terms proposed by
Whissell and Plutchik (WP) and by the Affective Norms for
English Words (ANEW)
– For each tag, measure the Euclidean distance between the
generated VA value and the psychologists’ value
• Baseline
– Set the generated VA value of each tag to the origin
– Represents an uninformative tag-to-VA mapping
Discussion
• The result is not sensitive to K
• Such a learning-based framework is scalable and can do
better if more annotated data is available
• Automatic discovery
– For instance, construct a balanced music audio corpus and have
Chinese speakers label it with Chinese mood tags
– Generate a Chinese mood tag cloud
• Inverse correlation between the VA intensity and the
covariance of a tag
– Tags lying on the outer circle would have larger font sizes
Result for the MER60-Alone Setting
22
Conclusion
• A novel framework that unifies the categorical and
dimensional emotion semantics of music
• Demonstrated how to map a mood tag to a 2-D VA
Gaussian and generate the corresponding tag cloud;
this can be further extended to arbitrary tags
• Verify whether an arbitrary tag is mood-related or not
• We will conduct user studies to evaluate the results
• Investigate acoustic feature representations further
for better generalization of the emotion modeling
23
Arbitrary Tag - MajorMiner Not Mood-related
24
Arbitrary Tag - MajorMiner Mood-related
