ACMMM 2013 reading: Large-scale visual sentiment ontology and detectors using adjective noun pairs

ACMMM2013 reading
@ Kanto CV 2014.2.23
Akisato Kimura (@_akisato)
NTT Communication Science Labs

Basic strategy
• Adjective noun pairs (ANPs)
– Adjectives play a significant role in conveying
sentiments, but are visually inconsistent.
– Combined phrases make the concepts more
detectable than a single adjective or noun alone
• cf. Recognition using visual phrases
[CVPR11]

Contributions
• Automatically construct a large-scale Visual
Sentiment Ontology (VSO) with 3000 ANPs
– With the help of psychological theories and web
mining techniques

• Propose SentiBank: a visual concept detector
library to detect the presence of 1200 ANPs
– Detector outputs are useful as attributes for
sentiment analysis of visual content

Framework
1. Select 24 fundamental words representing emotion
2. Retrieve images with each of the words as a query
3. Extract the tags associated with the images to
build ANPs ( = strong sentiment ADJs + all Ns)
4. Train ANP detectors and keep only detectors with
reasonable performance to form SentiBank

24 basic words for emotions
• Founded on Plutchik’s Wheel of Emotions
[Figure: Plutchik's wheel of emotions]
http://en.wikipedia.org/wiki/Plutchik%27s_Wheel_of_Emotions#Plutchik.27s_wheel_of_emotions

24 basic words for emotions (cont.)
• 8 basic emotions
x 3 degrees
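
As a concrete reading of "8 basic emotions x 3 degrees", the 24 query words can be enumerated as below. The word lists follow the commonly reproduced version of Plutchik's wheel (mild, basic, intense); the variable names are illustrative and not taken from the paper.

```python
# Plutchik's wheel: 8 basic emotions, each with a mild, basic and intense degree.
# Word lists follow the commonly reproduced wheel; names here are illustrative.
PLUTCHIK_WHEEL = {
    "joy": ("serenity", "joy", "ecstasy"),
    "trust": ("acceptance", "trust", "admiration"),
    "fear": ("apprehension", "fear", "terror"),
    "surprise": ("distraction", "surprise", "amazement"),
    "sadness": ("pensiveness", "sadness", "grief"),
    "disgust": ("boredom", "disgust", "loathing"),
    "anger": ("annoyance", "anger", "rage"),
    "anticipation": ("interest", "anticipation", "vigilance"),
}

# Flatten into the list of 24 basic words used as search queries.
BASIC_WORDS = [word for degrees in PLUTCHIK_WHEEL.values() for word in degrees]
```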

Sentiment word discovery
• Web mining strategy
– Retrieve images & videos from Flickr & YouTube
with each of 24 basic words as a query
– Extract their associated tags with the Lookapp tool
[Borth+ ICMR11]

Sentiment word discovery (cont.)
• Exploit various NLP techniques & resources
– Post-processing
• Remove stop words, perform stemming
• Top 100 tags are selected for each emotion

– Sentiment value computation (from -1: negative to +1: positive)
• SentiWordNet [Esuli+ 2006] and SentiStrength [Thelwall+ 2010]
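
The post-processing step above could look roughly like the following. This is a minimal sketch, not the authors' pipeline: the stop-word list is a tiny hypothetical one, and `naive_stem` is a crude stand-in for a real stemmer.

```python
from collections import Counter

# Hypothetical, tiny stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "on", "my"}

def naive_stem(word):
    """Crude suffix stripping, standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def top_tags(tag_lists, k=100):
    """Count stemmed, non-stop-word tags over all items; keep the k most frequent."""
    counts = Counter()
    for tags in tag_lists:
        for tag in tags:
            tag = tag.lower().strip()
            if tag and tag not in STOP_WORDS:
                counts[naive_stem(tag)] += 1
    return [tag for tag, _ in counts.most_common(k)]

# Tags of two items retrieved for one emotion word (made-up data).
tags_for_joy = [["happy", "smiles", "the", "beach"], ["smile", "Beach", "sunset"]]
top_two = top_tags(tags_for_joy, k=2)
```

With `k=100` this would give the per-emotion tag list described on the slide.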

ANP construction
• Consider all (ADJ, N) pairs
– Remove pairs that form a named entity with a changed meaning
(e.g. “hot” + “dog” → generic named entity)

• Fuse sentiment values
– Simple sum-up model : s(ANP) = s(ADJ) + s(N)
• If sgn(s(ADJ)) != sgn(s(N)), then s(ANP) = s(ADJ).

• Rank ANPs by their frequency
– Remove all ANPs with no images
– Resulting in 47K ANP candidates
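
The construction steps on this slide (pair, filter, fuse, rank) can be sketched as follows. All scores, image counts and the exclusion set are made-up illustrations, and the function names are hypothetical; only the fusion rule itself comes from the slide.

```python
from itertools import product

def sgn(x):
    """Sign of x: -1, 0 or +1."""
    return (x > 0) - (x < 0)

def fuse_sentiment(s_adj, s_noun):
    """Sum-up model; when the signs disagree, the adjective's value is used."""
    if sgn(s_adj) != sgn(s_noun):
        return s_adj
    return s_adj + s_noun

def build_anps(adj_scores, noun_scores, image_counts, excluded=frozenset()):
    """Pair every adjective with every noun, drop excluded or image-less pairs,
    and rank the remaining ANPs by image frequency."""
    anps = []
    for adj, noun in product(adj_scores, noun_scores):
        count = image_counts.get((adj, noun), 0)
        if (adj, noun) in excluded or count == 0:
            continue
        score = fuse_sentiment(adj_scores[adj], noun_scores[noun])
        anps.append((adj, noun, score, count))
    return sorted(anps, key=lambda anp: anp[3], reverse=True)

# Made-up sentiment values and image counts for illustration only.
adj_scores = {"beautiful": 0.8, "hot": 0.4, "creepy": -0.6}
noun_scores = {"flower": 0.2, "dog": 0.1}
image_counts = {("beautiful", "flower"): 900, ("beautiful", "dog"): 400,
                ("creepy", "dog"): 150, ("hot", "dog"): 5000}
ranked = build_anps(adj_scores, noun_scores, image_counts,
                    excluded={("hot", "dog")})
```

Here "hot dog" is dropped as a named entity despite its high count, and "creepy dog" keeps the adjective's negative value because the signs disagree.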

ANP construction (cont.)
• Ontology sampling
– Partition candidates into individual ADJ sets
– Sample a subset from each ADJ set
– Take ANPs with sufficient (>125) images

• Linking back to emotions
– For each ANP, count the images whose metadata contain both the ANP
and each of the 24 basic words, giving a 24-dim histogram
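
A minimal sketch of the emotion histogram, assuming each image is represented by the set of terms in its metadata (that representation is an assumption). The demo uses 3 words instead of the full 24 to stay short.

```python
def emotion_histogram(anp, images, basic_words):
    """For each basic emotion word, count the images whose metadata
    contain both that word and the ANP."""
    hist = [0] * len(basic_words)
    for meta in images:  # meta: set of metadata terms of one image (assumed form)
        if anp not in meta:
            continue
        for i, word in enumerate(basic_words):
            if word in meta:
                hist[i] += 1
    return hist

# Made-up metadata; the real histogram would have 24 bins.
demo_words = ["joy", "fear", "anger"]
demo_images = [
    {"happy dog", "joy"},
    {"happy dog", "joy", "fear"},
    {"sad cat", "anger"},
]
hist = emotion_histogram("happy dog", demo_images, demo_words)
```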

How reliable are ANP labels?
• Web annotation may not be reliable
– Using Flickr tags as pseudo ANP labels might incur
false positives

• Manual (=AMT) validation
– Randomly sample images of 200 ANPs
– Each image is validated by 3 Turkers, treated as
correct only if >= 2 Turkers agree
– Results: 97% correct
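
The 2-out-of-3 agreement rule can be written down directly. The votes below are made up for illustration and do not reproduce the reported 97%.

```python
def is_correct(votes, min_agree=2):
    """An image label counts as correct if at least min_agree Turkers accept it."""
    return sum(votes) >= min_agree

def label_accuracy(all_votes):
    """Fraction of images whose pseudo label passes the agreement rule."""
    return sum(is_correct(votes) for votes in all_votes) / len(all_votes)

# One tuple of 3 Turker votes per image (made-up data).
demo_votes = [
    (True, True, False),
    (True, False, False),
    (True, True, True),
    (False, True, True),
]
accuracy = label_accuracy(demo_votes)
```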

http://visual-sentiment-ontology.appspot.com

Training ANP detectors
• Various visual features
– Color histogram (3 colors x 256 dim), GIST (512 dim),
LBP (53 dim), BoW with spatial pyramid and max
pooling (1000 dim x 2 layers), attributes [Yu+ CVPR13]
(2000 dim)

• Training a linear SVM for every ANP
– Parameter tuning by cross validation (AP@20-based)
– Measure performance by AP@20, AUC & F-score.

• Several feature fusions
– Early fusion, late fusion, weighted early/late fusion
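
The fusion variants differ in where the combination happens: early fusion merges the feature vectors before a single classifier, late fusion merges the scores of per-feature classifiers. A minimal sketch follows; how the weights are chosen is not stated on the slide, so they are plain parameters here.

```python
def early_fusion(features, weights=None):
    """Concatenate per-feature vectors (optionally scaled by a per-feature
    weight) into one vector that feeds a single classifier."""
    if weights is None:
        weights = [1.0] * len(features)
    fused = []
    for weight, vec in zip(weights, features):
        fused.extend(weight * value for value in vec)
    return fused

def late_fusion(scores, weights=None):
    """Combine the scores of per-feature classifiers by a weighted average;
    uniform weights give plain late fusion."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# Toy vectors standing in for e.g. a color histogram and a GIST descriptor.
fused_vec = early_fusion([[1.0, 2.0], [3.0]])
fused_score = late_fusion([1.0, 3.0], weights=[3.0, 1.0])
```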

Detector performance
• Comparing visual features (left)
– 1st: attributes, 2nd: BoWs

• Comparing feature fusions (right)
– 1st: Weighted late fusion, but not dominant
– Adopt early fusion for implementation simplicity

Detectability issues
• Select only ANPs with good detection accuracy
– 1200 ANPs with AP@20>0 & F-score>0.6

• No correlation between detectability & occurrence
– Difficulty in detecting ANPs depends on the
content diversity and the level of abstraction
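
The selection rule can be sketched as below. AP@20 has several variants in the literature; this one normalises by the number of relevant items in the top 20, which is an assumption rather than the paper's stated definition. The metric values in the demo are made up.

```python
def ap_at_k(ranked_labels, k=20):
    """Average precision over the top-k items.
    ranked_labels: 1 (relevant) or 0 (not), sorted by decreasing detector score.
    Normalised by the number of relevant items in the top k (one common variant)."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels[:k], start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def select_detectors(metrics, min_f=0.6):
    """Keep ANPs whose detector has AP@20 > 0 and F-score > min_f."""
    return [anp for anp, (ap20, f_score) in metrics.items()
            if ap20 > 0 and f_score > min_f]

# Made-up (AP@20, F-score) pairs per ANP.
demo_metrics = {
    "beautiful flower": (0.9, 0.8),
    "misty woods": (0.4, 0.5),
    "great day": (0.0, 0.7),
}
kept = select_detectors(demo_metrics)
```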

Other issues
• Special visual features improve detectors
– ObjectBank [Li+ NIPS2010], facial features, aesthetic
features [Bhattacharya+ ACMMM13]

• Ontology structure
– Interactive process to combine 1200 ANPs into distinct
groups → 6 levels, 15 nodes at the top
• N: standard “is-a” relations
• ADJ: exclusive (“sad” vs “happy”) & strength (“nice”, “great”,
“awesome”)

– 41% of the nouns are not covered by ImageNet
• Related to abstract concepts (e.g. “violence”, “religion”)

SentiBank applications
• Sentiment prediction in image tweets
– Sentiment analysis relies on text-based tools
– 140 characters (in English) are too short
– Use SentiBank to complement and augment texts

• Emotion classification
– Demonstrate the performance on an emotion
dataset of art photos [Machajdik+ ACMMM10]

Sentiment prediction in tweets
• Data collection
– Gather tweets with images & popular hashtags
• #nuclearpower, #election, #championsleague, #cairo …

– AMT to obtain sentiment ground-truth
• 3 Turkers per tweet; their labels mostly agreed (below)

http://www.ee.columbia.edu/ln/dvmm/vso/download/twitter_dataset.html

Sentiment prediction in tweets (cont.)
• Visual-based classifier
– Use SentiBank as a mid-level representation
• Use ANP responses as an input feature
• Employ a linear classifier for the final output

– Compare SentiBank with low-level features

• Text-based classifier
– Naïve Bayes + SentiStrength

• Overall performance

• Detailed performance
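
The visual-based classifier above treats the vector of ANP detector responses as the input feature of a linear classifier. A minimal sketch of the prediction step, with 3 ANPs instead of 1200 and hypothetical weights (in practice they would be learned from the AMT-labeled tweets):

```python
def predict_sentiment(anp_responses, weights, bias=0.0):
    """Linear classifier on the SentiBank response vector:
    positive if w . x + b > 0, negative otherwise."""
    score = sum(w * x for w, x in zip(weights, anp_responses)) + bias
    label = "positive" if score > 0 else "negative"
    return label, score

# Detector responses for ("happy dog", "sunny beach", "abandoned house") on one image.
responses = [0.5, 0.25, 1.0]
# Hypothetical weights, for illustration only.
weights = [1.0, 1.0, -2.0]
label, score = predict_sentiment(responses, weights)
```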

Emotion classification
• Dataset
– 807 art photos retrieved from DeviantArt.com,
labeled with 8 emotion categories

Takeaway messages
• To appear in tomorrow’s meeting
