Topic and Text Analysis for Sentiment, Emotion,
and Computational Social Science
November 2012
Alice Oh
alice.oh@kaist.edu
Users & Information Lab
http://uilab.kaist.ac.kr
Thursday, December 6, 2012
Overview
• Topic modeling research
  • CIKM 2011: Distance-dependent Chinese restaurant franchise (ddCRF)
  • ICML 2012: Dirichlet process with mixed random measures (DP-MRM)
  • CIKM 2012: Recursive Chinese restaurant process for modeling topic hierarchies (rCRP)
  • NIPS Big Learning Workshop 2012: Distributed Online Learning for Latent Dirichlet Allocation (DoLDA)
• Computational social science research
  • WSDM 2011: Aspect sentiment unification model for online review analysis
  • ICWSM 2012: Social aspects of emotions in Twitter conversations
  • ACL 2012: Self-disclosure and relationship strength in Twitter conversations
Do you feel what I feel?
Social Aspects of Emotions in Twitter Conversations
Suin Kim, JinYeong Bak, Alice Oh
ICWSM 2012
Asking Research Questions
Human emotion is typically studied as a within-person, one-direction,
non-repetitive phenomenon; focus has traditionally been on how one
individual feels in reaction to various stimuli at a certain point of
time. But people recognize and inevitably react emotionally and
otherwise to expressions of emotion of other people. We propose
that organizational dyads and groups inhabit emotion cycles:
Emotions of an individual influence the emotions, thoughts and
behaviors of others; others’ reactions can then influence their
future interactions with the individual expressing the original
emotion, as well as that individual’s future emotions and
behaviors. People can mimic the emotions of others, thereby
extending the social presence of a specific emotion, but can also
respond to others’ emotions, extending the range of emotions
present.
Social Aspects of Emotions: Motivating Question
How are our emotions affected by others we talk to?
Social Aspects of Emotions: Research Questions
• How do we communicate our emotions?
  • Use a topic model on Twitter conversations to discover the “topics” that represent the eight emotions
  • Analyze the proportions of the total tweets for the emotions
• How do we influence other people’s emotions?
  • Analyze the emotion transitions of the tweets
  • Look for topics that change the emotions of the conversation partners
  • Find interesting patterns of emotion pairs
Social Aspects of Emotions: Data
• Twitter conversation data: approx 220k dyads who “reply” to each other,
1,670k conversational chains
Seed Words (We Feel Fine by Harris & Kamvar)
anticipation: hope, wait, await, inspir, excit, bore, readi, expect, nervou, calm, motiv, prepar, certain, anxiou, optimist, forese
joy: awesom, amaz, wonder, excit, glad, fine, beauti, high, lucki, super, perfect, complet, special, bless, safe, proud
anger: shit, bitch, ass, mean, damn, mad, jealou, piss, annoi, angri, upset, moron, rage, screw, stuck, irrit
surprise: amaz, wow, wonder, weird, lucki, differ, awkward, confus, holi, strang, shock, odd, embarrass, overwhelm, astound, astonish
fear: scare, stress, horror, nervou, terror, alarm, behind, panic, fear, afraid, desper, threaten, tens, terrifi, fright, anxiou
sadness: sorri, bad, aw, sad, wrong, hurt, blue, dead, lost, crush, weak, depress, wors, low, terribl, lone
disgust: sick, wrong, evil, fat, ugli, horribl, gross, terribl, selfish, miser, pathet, disgust, worthless, aw, asham, fuck
acceptance: okai, ok, same, alright, safe, lazi, relax, peac, content, normal, secur, complet, numb, fulfil, comfort, defeat
Dirichlet Forest Prior
• Dirichlet Forest Prior (Andrzejewski et al.)
• Mixture of Dirichlet tree distribution
• Dirichlet tree: Generalization of Dirichlet distribution
• Knowledge is expressed using Must-link and Cannot-link
primitives
• Must-link (love, sweetheart)
• Cannot-link (exciting, bored)
DF-LDA
Domain Knowledge in Dirichlet Forest Prior
Seed Words
• Must-link within a class
• Cannot-link between classes
Dirichlet Forest vs. Dirichlet
Fear (DF-LDA): don’t think but know why even wanna care worry understand
Fear (LDA): good exam lol luck just school haha i’m xx worry tomorrow
Surprise (DF-LDA): that very really cool wow wonder just some differ amazing
Surprise (LDA): just rt holy got thank did shit new love lol awesome buy oh
Sadness (DF-LDA): bad my real feel life aw sad kill lost dead hurt wrong sick
Sadness (LDA): lol just know sorry isn’t oh tweet did haha don’t thought think
Emotion Topics: How do we express emotions?
Joy / Anticipation
Topic 114: omg, love, haha, thank, really
Topic 107: love, thank, follow, wow
Topic 159: good, day, hope, morning, thank
Topic 158: love, thank, miss, hug
Topic 125: hope, better, feel, thank, soon
Topic 26: good, thank, hope, miss
Topic 146: come, wait, week, day, june
Topic 146: good, day, time, work
Anger
Topic 131: lmao, fuck, ass, bitch, shit
Topic 4: ass, yo, lmao, nigga
Topic 19: lmao, shit, damn, fuck, oh
Topic 13: shit, nigga, smh, yea
Fear
Topic 48: omg, oh, lmao, shit, scare
Topic 78: happen, heart, attack, hospital
Topic 27: don’t, come, night, sleep, outside
Topic 140: time, got, work, day
Surprise
Topic 172: yeag, know, think, true, funny
Topic 89: know, don’t, think, look
Topic 15: think, don’t, know, make, really
Topic 94: haha, dont, think, really
Sadness / Disgust
Topic 6: oh, sorry, haha, know, didnt
Topic 59: hurt, got, good, bad, pain
Topic 106: tweet, reply, didn’t, read, sorry
Topic 155: oh, really, make, feel
Topic 116: oh, fuck, don’t, ye, ew
Topic 116: look, haha, oh, know
Topic 22: don’t, oh, think, yeah, lmao
Topic 174: don’t, think, say, people
Acceptance
Topic 43: ok, oh, thank, cool, okay
Topic 102: know, try, let, ok
Topic 199: xx, thank, good, okay, follow
Topic 8: night, love, good, sleep
Neutral
Topic 180: com, www, http, check, youtube
Topic 156: twitter, facebook, people, account
Topic 184: account, google, app, work, email
Topic 67: food, chicken, cook, rt
Numbers shown on the slide: 29, 70, 21, 14, 5, 17, 7, 18, 19
Emotion Topics: How do we express emotions?
Joy / Anticipation
Topic 114: omg, love, haha, thank, really
Topic 107: love, thank, follow, wow
Topic 125: hope, better, feel, thank, soon
Topic 26: good, thank, hope, miss
Sadness
Topic 6: oh, sorry, know, didnt
Topic 59: hurt, got, good, bad, pain
Neutral
Topic 180: com, www, http, check, youtube
Topic 156: twitter, facebook, people, account
Topic labels: Greeting, Caring, Sympathy, IT/Tech
Emotion Transitions: Plutchik’s Wheel of Emotions
Emotion        Percentage   Adjacent value
Joy            39.7%        0.51
Acceptance     10.4%        0.23
Fear           2.6%         0.11
Surprise       7.4%         0.17
Anticipation   15.1%        0.26
Disgust        2.9%         0.11
Sadness        9.1%         0.19
Anger          12.8%        0.37
Other values on the figure: 0.31, 0.33, 0.32, 0.31, 0.33, 0.21, 0.34, 0.15, 0.14, 0.13, 0.15
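Transition values like those on the wheel can be estimated by counting consecutive emotion labels and normalising each row. The sketch below is an illustration under assumptions: each input sequence is taken to be the emotion labels of successive tweets, `transition_probabilities` is a hypothetical helper, and the toy labels are invented. The slides do not state exactly how the figure's values were computed.

```python
from collections import Counter, defaultdict


def transition_probabilities(label_sequences):
    """Row-normalised counts of consecutive emotion labels."""
    counts = defaultdict(Counter)
    for seq in label_sequences:
        # Pair each label with the one that follows it in the same sequence.
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1

    probs = {}
    for prev, row in counts.items():
        total = sum(row.values())
        probs[prev] = {nxt: n / total for nxt, n in row.items()}
    return probs


# Invented toy sequences, for illustration only.
sequences = [
    ["joy", "joy", "anger"],
    ["sadness", "joy"],
    ["joy", "joy"],
]
probs = transition_probabilities(sequences)
```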
Defining “Influence”
Conversation between User A and User B, in time order:
User A: Having a tough day today. RIP Harrison. I’ll miss you a ton :/
User B: Just pray about it. God will help you.
User A: Not really religious, but thanks man. :)
User B: If you need talk you know I’m here.
Labels on the slide: “emotion influencing tweet”; (Sadness), (Acceptance), (Anticipation)
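One way to operationalise this notion of influence is to scan a chain for windows where a user tweets, the partner replies, and the same user tweets again with a different emotion; the partner's reply is then treated as the emotion-influencing tweet. This is a sketch under that assumed reading. `Tweet` and `influence_triples` are hypothetical names, and the emotion labels in the example chain are invented for illustration.

```python
from typing import List, NamedTuple, Tuple


class Tweet(NamedTuple):
    user: str
    emotion: str


def influence_triples(chain: List[Tweet]) -> List[Tuple[str, str, str]]:
    """Return (emotion_before, influencer_emotion, emotion_after) triples."""
    triples = []
    for first, middle, last in zip(chain, chain[1:], chain[2:]):
        same_author = first.user == last.user
        partner_between = middle.user != first.user
        # Keep only windows where the author's emotion actually changed.
        if same_author and partner_between and first.emotion != last.emotion:
            triples.append((first.emotion, middle.emotion, last.emotion))
    return triples


# Invented labels for a four-tweet A-B-A-B chain.
chain = [
    Tweet("A", "sadness"),
    Tweet("B", "anticipation"),
    Tweet("A", "acceptance"),
    Tweet("B", "anticipation"),
]
triples = influence_triples(chain)
```

Only the first window qualifies here: in the second window User B's emotion is unchanged, so no influence is recorded.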
Emotion Influences: What can you say to make your partner feel better?
Transitions: Joy → Sadness, Sadness → Joy, Acceptance → Anger
Topic labels: Greeting, Sympathizing, Swearing, Complaining
Topic 117: tweet, people, don’t, read, post
Topic 59: hurt, got, bad, pain, feel
Topic 18: wear, look, think, love, black
Topic 24: love, thank, great, new, look
Topic 31: i’m, got, lmax, shit, da
Topic 13: lmao, shit, nigga, smh, yea
Emotion Influence: Joy to Anger
[Bar chart: x-axis Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral; y-axis 0 to 0.3]
Bar values as extracted: 0.041, 0.071, 0.082, 0.053, 0.265, 0.061, 0.081, 0.042, 0.051
Expressing Anger has a 26.5% chance of changing the partner’s emotion from Joy to Anger.
Emotion Influence: Sadness to Joy
[Bar chart: x-axis Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral; y-axis 0 to 0.4]
Bar values as extracted: 0.211, 0.23, 0.214, 0.209, 0.191, 0.237, 0.253, 0.358, 0.273
Expressing Joy has a 35.8% chance of changing the partner’s emotion from Sadness to Joy.
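A statement such as "expressing Anger has a 26.5% chance of changing the partner's emotion from Joy to Anger" can be read as a conditional relative frequency. The sketch below assumes that reading: for windows that start in the `before` emotion, it estimates how often the author ends in the `after` emotion, grouped by the emotion of the partner's tweet in between. `influence_probability` is a hypothetical helper, the windows are invented, and the estimator used in the talk is not given on the slides.

```python
from collections import Counter


def influence_probability(windows, before, after):
    """Estimate P(after | before, influencer emotion) per influencer emotion.

    windows: (emotion_before, influencer_emotion, emotion_after) triples,
    including windows in which the emotion did not change.
    """
    totals = Counter()
    hits = Counter()
    for start, influencer, end in windows:
        if start != before:
            continue
        totals[influencer] += 1
        if end == after:
            hits[influencer] += 1
    return {emotion: hits[emotion] / totals[emotion] for emotion in totals}


# Invented windows, for illustration only.
windows = [
    ("joy", "anger", "anger"),
    ("joy", "anger", "joy"),
    ("joy", "anger", "joy"),
    ("joy", "anger", "joy"),
    ("joy", "joy", "joy"),
    ("sadness", "joy", "joy"),
]
joy_to_anger = influence_probability(windows, before="joy", after="anger")
```

The denominator counts every window that starts in the `before` emotion, so unchanged windows must be kept in the input for the ratio to be meaningful.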
Outliers
A: Sorry to hear about your bags. If you would like us to get someone to contact you DM us your reference and contact number.
B: it's on it's way to manch. If the woman on the check in desk in Miami hadn't been trying to be all smart! Been no problem.
A: Sorry about that. Pleased to hear they located it quickly for you though.
B: mistakes happen.
Analyzing Self-Disclosure Behaviors in
Twitter Conversations Using Text Mining
Techniques (Presented at ACL 2012)
JinYeong Bak, Suin Kim, Alice Oh
{jy.bak, suin.kim}@kaist.ac.kr, alice.oh@kaist.edu
Department of Computer Science, KAIST
2012-07-11
Introduction
In social psychology
• Degree of self-disclosure in a relationship depends on the strength of the relationship
• Strategic self-disclosure can strengthen the relationship
[Illustration: speech bubbles “I like you too!” and “You’re my best friend!”]
Hypothesis
Twitter conversations also show a similar pattern
• Dyads with high relationship strength show more self-disclosure behavior
• Dyads with low relationship strength show less self-disclosure behavior
[Illustration: speech bubbles “I like you too!”, “You’re my best friend!”, “Hello~”, “Hi”]
Methodology
• Twitter Data
  • 131K users
  • 2M conversations
• Relationship Strength
  • Chain frequency (CF)
  • Chain length (CL)
• Self-Disclosure
  • Personal information
  • Open communication
  • Profanity
• Analysis with Topic Models
  • Latent Dirichlet allocation (LDA, [Blei, JMLR 2003])
  • Aspect and sentiment unification model (ASUM, [Jo, WSDM 2011])
Twitter Conversation
• A Twitter conversation chain
  • 3 or more tweets
  • at least one reply by each user
• Our Twitter conversation data
  • Oct 2011 to Dec 2011
  • 131K users
  • 2M chains
  • 11M tweets
Example of a conversation chain: https://twitter.com/#!/britneyspears
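The chain definition above (3 or more tweets, at least one reply by each user) can be checked with a small predicate. This is a sketch under assumptions: a chain is represented only by its author ids in reply order, the first tweet is not itself a reply, and exactly two users take part. `is_conversation_chain` is a hypothetical name.

```python
def is_conversation_chain(authors):
    """authors: author ids of the tweets in one chain, in reply order."""
    if len(authors) < 3:
        return False
    users = set(authors)
    # Assumption: chains are dyadic, so exactly two distinct users appear.
    if len(users) != 2:
        return False
    # Every tweet after the first is a reply; both users must have replied.
    repliers = set(authors[1:])
    return repliers == users


valid = is_conversation_chain(["A", "B", "A"])
one_sided = is_conversation_chain(["A", "B", "B"])
```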
Relationship Strength
• Social psychology literature states relationship strength can be measured by communication frequency and length [Granovetter, 1973; Levin and Cross, 2004]
• CF: chain frequency
  • The number of conversational chains between the dyad averaged per month
• CL: chain length
  • The length of conversational chains between the dyad averaged per month
• Relationship strength
  • A high CF or CL for a dyad means the relationship is strong
  • A low CF or CL for a dyad means the relationship is weak
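The two measures can be sketched for a single dyad as follows. The exact averaging is an assumption here: CF divides the number of chains by the number of months in the observation window, and CL averages the per-month mean chain length over months that contain at least one chain. `relationship_strength` and the sample data are hypothetical.

```python
from collections import defaultdict


def relationship_strength(chains, n_months):
    """chains: list of (month, n_tweets) pairs for one dyad.

    Returns (CF, CL) under the averaging assumptions stated above.
    """
    lengths_by_month = defaultdict(list)
    for month, n_tweets in chains:
        lengths_by_month[month].append(n_tweets)

    # CF: chains per month over the whole observation window.
    cf = len(chains) / n_months

    # CL: mean of the monthly mean chain lengths.
    monthly_means = [sum(v) / len(v) for v in lengths_by_month.values()]
    cl = sum(monthly_means) / len(monthly_means) if monthly_means else 0.0
    return cf, cl


# Invented data: three chains within a three-month window.
chains = [("2011-10", 3), ("2011-10", 5), ("2011-12", 4)]
cf, cl = relationship_strength(chains, n_months=3)
```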
Self-Disclosure
• Open communication - Openness
  • Negative openness
  • Nonverbal openness
  • Emotional openness
  • Receptive openness – difficult to find in tweets
  • General-style openness – not clearly defined in the literature
• Personal Information
  • Personally Identifiable Information (PII)
  • Personally Embarrassing Information (PEI)
• Profanity
  • nigga, ass, wtf, lmao
Self-Disclosure - Openness
Negative openness
• Method
  • We use ASUM with emoticons as seed words [“Aspect and sentiment unification model for online review analysis”, Jo, WSDM’11]
  • ASUM is an LDA-based joint model of topic and sentiment
  • ASUM takes unannotated data and classifies each sentence (tweet) as positive/negative/neutral
Self-Disclosure - Openness
Nonverbal openness
• Method
  • We look for emoticons, ‘lol’, ‘xxx’
  • Emoticons are like facial expressions -- :) :( :P
  • ‘lol’ (laughing out loud) and ‘xxx’ (kisses) are very frequently used in a similar manner to nonverbal openness
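The method above amounts to pattern matching on each tweet. A minimal sketch with the standard `re` module is shown below; the emoticon pattern is an assumption that covers only a few common Western-style emoticons such as those on the slide, and `has_nonverbal_cue` is a hypothetical helper rather than the detector used in the study.

```python
import re

# Eyes, optional nose, then a mouth: matches :) :( :P ;-) =D and similar.
EMOTICON = re.compile(r"[:;=][-']?[)(DPp]")
# The tokens 'lol' and 'xxx' as whole words, in any letter case.
TOKEN = re.compile(r"\b(?:lol|xxx)\b", re.IGNORECASE)


def has_nonverbal_cue(tweet):
    """True if the tweet contains an emoticon, 'lol', or 'xxx'."""
    return bool(EMOTICON.search(tweet) or TOKEN.search(tweet))


examples = [
    "Not really religious, but thanks man. :)",
    "that was great LOL",
    "see you tomorrow",
]
flags = [has_nonverbal_cue(text) for text in examples]
```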
Self-Disclosure - Openness
Emotional openness
• Method
  • Look for tweets that contain common expressions of feeling words [We feel fine (Harris, J, 2009)]
Self-Disclosure – Personal Information
• Personally Identifiable Information (PII): e.g., name, location, email address, job, social security number
• Personally Embarrassing Information (PEI): e.g., clinical history, sexual life, job loss, family problem
Self-Disclosure – Personal Information
Example of PII, PEI and Profanity topics
• Shown by high probability words in each topic
PII 1   PII 2     PEI 1    PEI 2    PEI 3    Profanity
san     tonight   pants    teeth    family   nigga
live    time      wear     doctor   brother  lmao
state   tomorrow  boobs    dr       sister   shit
texas   good      naked    dentist  uncle    ass
south   ill       wearing  tooth    cousin   bitch
Results
[Figure panels: sentiment, nonverbal, emotional, profanity, PII & PEI; x-axis: weak ↔ strong]
[Figure panels: emotional, PII & PEI; x-axis: weak ↔ strong]
Results: Interpretation
• Emotional openness
  • When they are not very close, they express frequent encouragements, or polite reactions to babies or pets
Results: Interpretation
• PII
  • When they meet new acquaintances, they use PII to introduce themselves
Results
Analyzing outliers: a dyad that is weakly linked but shows high self-disclosure
Distributed Online Learning for
Latent Dirichlet Allocation
JinYeong Bak, Dongwoo Kim, and Alice Oh
NIPS 2012
Workshop on Big Learning
Motivation
• Problem 1: Inference for LDA takes a long time
• Problem 2: Continuously expanding corpus necessitates continuous updates of model parameters
  • But updating of model parameters is not possible with plain LDA
  • Must re-train with the entire updated corpus
• Solution to 1: Distributed inference shortens inference time (Newman JMLR 2009, Wang WWW 2012)
• Solution to 2: Online (batch) learning enables updates to model parameters (Hoffman NIPS 2010)
• Our Approach: Combine distributed inference and online learning
Distributed Online LDA
• Based on variational inference
• Mini-batch updates via stochastic learning (variational EM)
• Distribute variational EM using MapReduce
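The combination of the three bullets above can be sketched in two steps: a reduce step that sums topic-word sufficient statistics from workers, and a stochastic update of the global topic parameters in the style of online LDA (Hoffman NIPS 2010). This is an illustration under assumptions, not the DoLDA implementation: the per-document E-step that produces the statistics is omitted, `merge_sufficient_stats` and `online_update` are hypothetical names, and the hyperparameter defaults (`eta`, `tau0`, `kappa`) are placeholders.

```python
def merge_sufficient_stats(partials):
    """Reduce step: sum K x V sufficient statistics from all workers."""
    n_topics, n_words = len(partials[0]), len(partials[0][0])
    total = [[0.0] * n_words for _ in range(n_topics)]
    for part in partials:
        for k in range(n_topics):
            for v in range(n_words):
                total[k][v] += part[k][v]
    return total


def online_update(lam, sstats, t, corpus_size, batch_size,
                  eta=0.01, tau0=1.0, kappa=0.7):
    """Blend the old topic parameters with the mini-batch estimate.

    rho_t = (tau0 + t) ** (-kappa) is the decaying step size; the
    mini-batch statistics are rescaled as if the batch were the corpus.
    """
    rho = (tau0 + t) ** (-kappa)
    scale = corpus_size / batch_size
    return [
        [(1 - rho) * lam[k][v] + rho * (eta + scale * sstats[k][v])
         for v in range(len(lam[k]))]
        for k in range(len(lam))
    ]


# Toy run: 2 topics, 2 words, two workers, first update (t = 0).
lam = [[1.0, 1.0], [1.0, 1.0]]
partials = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
sstats = merge_sufficient_stats(partials)
new_lam = online_update(lam, sstats, t=0, corpus_size=4, batch_size=2)
```

With `t=0` and `tau0=1.0` the step size is 1, so the toy update replaces the initial parameters entirely; later updates blend old and new values as the step size decays.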
Experimental Setup
• Data
  • 5.1M Twitter conversations
  • 4.8M English Wikipedia articles
• 60-node Hadoop system
  • Each node with 8 x 2.30GHz cores
Wikipedia Results
Topic 0: relativity, physics, einstein, quantum, gravity
Topic 22: channel, television, tv, cable, news
Topic 42: milk, chocolate, sugar, food, cream
Topic 65: god, bible, moses, chapter, genesis
Topic 94: party, election, president, member, elected
Topic 170: season, team, league, game, football
Topic 232: album, song, band, music, released
Minibatch   oLDA        DoLDA      Speedup
16,384      238666.25   47994.03   4.97
32,768      188508.71   33470.03   5.63
65,536      206290.27   26788.53   7.70
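The Speedup column is consistent with the ratio of the oLDA value to the DoLDA value for each mini-batch size. The short check below recomputes it from the figures in the table; the dictionary layout is just a convenience for this sketch, and the unit of the two timing columns is not stated on the slide.

```python
# (oLDA, DoLDA) values copied from the table, keyed by mini-batch size.
runs = {
    16384: (238666.25, 47994.03),
    32768: (188508.71, 33470.03),
    65536: (206290.27, 26788.53),
}

# Speedup is taken here as oLDA divided by DoLDA, rounded to two decimals.
speedups = {size: round(olda / dolda, 2) for size, (olda, dolda) in runs.items()}
```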
Twitter Temporal Patterns of Topics
<Topic words: party vote people politics obama>
[Plot: document proportion by day, x-axis from 10−10 to 12−04]
Conversation b1 on November 2, 2010
A: I wish I could vote today, but I have to work for 14 hours
B: is it legal for them not to give you time off to vote?
A: probably
Conversation b2 on March 31, 2012
A: Mitt Romney: "Obama should release the notes and transcripts of all his meetings with world leaders"
B: Why is he being held to higher standard than any other president.
A: did you see my Santorum 'slip' tweet? Is the media afraid to comment on it?
B: oh yes I did. I saw it mentioned yesterday also. disgusting and he should be raked over hot coals for it.
<Topic words: school mate class teacher grade>
[Plot: document proportion by day, x-axis from 11−07 to 12−01]
Conversation c1 on September 5, 2011
A: Oh god, miss Waite ran over to me up the school just now! :L on the plus subjects are now picked! :D
B: what did you pick??
A: english, RE, art and psychology! :) was unsure between history and psych but found out bubbles was teaching it so nooo! :L
Conversation c2 on October 12, 2011
A: :) My day's been okay! It feels long! But school' was okayish. I hope you have an awesome day! :D
B: that's good then! Ahh hope it's not cause anything bad happened? Thanks! Have a great sleep :)
A: no! Class was just boring lol and thanks! :) i will! Even though i have to wake up early tomorrow for a midterm! :S
CAVEAT
Big Data and social media data do not always give the right answers!
They contain much noise and much bias.
Sentiment analysis is also full of problems at the big-data level
because every small assumption can cause wide swings
in the final interpretation of the data.
They are valuable because they have opened up possibilities for
analyses of naturally-occurring data in huge amounts.
We need better methods and tools that are tailored for social media.
We need to ask the right questions that can be answered well despite
the biases of the social media data.
For details, visit our webpage:
http://uilab.kaist.ac.kr
Or email me:
alice.oh@kaist.edu
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 

Último (20)

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

Topic and text analysis for sentiment, emotion, and computational social science

  • 1. Topic and Text Analysis for Sentiment, Emotion, and Computational Social Science November 2012 Alice Oh alice.oh@kaist.edu Users & Information Lab http://uilab.kaist.ac.kr 1 Thursday, December 6, 2012
  • 2. Overview
      • Topic modeling research
          • CIKM 2011: Distance-dependent Chinese restaurant franchise (ddCRF)
          • ICML 2012: Dirichlet process with mixed random measures (DP-MRM)
          • CIKM 2012: Recursive Chinese restaurant process for modeling topic hierarchies (rCRP)
          • NIPS Big Learning Workshop 2012: Distributed Online Learning for Latent Dirichlet Allocation (DoLDA)
      • Computational social science research
          • WSDM 2011: Aspect sentiment unification model for online review analysis
          • ICWSM 2012: Social aspects of emotions in Twitter conversations
          • ACL 2012: Self-disclosure and relationship strength in Twitter conversations
  • 3. Do you feel what I feel? Social Aspects of Emotions in Twitter Conversations Suin Kim, JinYeong Bak, Alice Oh ICWSM 2012 3 Thursday, December 6, 2012
  • 6. Asking Research Questions Human emotion is typically studied as a within-person, one-direction, non-repetitive phenomenon; focus has traditionally been on how one individual feels in reaction to various stimuli at a certain point of time. But people recognize and inevitably react emotionally and otherwise to expressions of emotion of other people. We propose that organizational dyads and groups inhabit emotion cycles: Emotions of an individual influence the emotions, thoughts and behaviors of others; others’ reactions can then influence their future interactions with the individual expressing the original emotion, as well as that individual’s future emotions and behaviors. People can mimic the emotions of others, thereby extending the social presence of a specific emotion, but can also respond to others’ emotions, extending the range of emotions present. 5 Thursday, December 6, 2012
  • 7. Social Aspects of Emotions: Motivating Question How are our emotions affected by others we talk to? Thursday, December 6, 2012
  • 8. Social Aspects of Emotions: Research Questions
      • How do we communicate our emotions?
          • Use a topic model on Twitter conversations to discover the “topics” that represent the eight emotions
          • Analyze the proportions of the total tweets for the emotions
      • How do we influence other people’s emotions?
          • Analyze the emotion transitions of the tweets
          • Look for topics that change the emotions of the conversation partners
          • Find interesting patterns of emotion pairs
  • 9. Social Aspects of Emotions: Data • Twitter conversation data: approx. 220k dyads who “reply” to each other, 1,670k conversational chains
  • 10. Seed Words (We Feel Fine by Harris & Kamvar)
      anticipation: hope wait await inspir excit bore readi expect nervou calm motiv prepar certain anxiou optimist forese
      joy: awesom amaz wonder excit glad fine beauti high lucki super perfect complet special bless safe proud
      anger: shit bitch ass mean damn mad jealou piss annoi angri upset moron rage screw stuck irrit
      surprise: amaz wow wonder weird lucki differ awkward confus holi strang shock odd embarrass overwhelm astound astonish
      fear: scare stress horror nervou terror alarm behind panic fear afraid desper threaten tens terrifi fright anxiou
      sadness: sorri bad aw sad wrong hurt blue dead lost crush weak depress wors low terribl lone
      disgust: sick wrong evil fat ugli horribl gross terribl selfish miser pathet disgust worthless aw asham fuck
      acceptance: okai ok same alright safe lazi relax peac content normal secur complet numb fulfil comfort defeat
  • 11. Dirichlet Forest Prior • Dirichlet Forest Prior (Andrzejewski et al.) • Mixture of Dirichlet tree distribution • Dirichlet tree: Generalization of Dirichlet distribution • Knowledge is expressed using Must-link and Cannot-link primitives • Must-link (love, sweetheart) • Cannot-link (exciting, bored) 10 DF-LDA Thursday, December 6, 2012
  • 13. Domain Knowledge in Dirichlet Forest Prior • Seed words: the same eight emotion classes and word lists as on slide 10 • Must-link within a class • Cannot-link between classes
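The slide's rule (Must-link within a class, Cannot-link between classes) can be sketched as a small pair generator. This is an illustrative sketch, not the authors' code: the function name `build_constraints` and the two truncated seed lists are hypothetical. Some seed words on the slide appear in more than one class (for example `amaz` and `wonder` in both joy and surprise); the slides do not say how such conflicts were resolved, so the sketch simply assumes Must-links take precedence.

```python
from itertools import combinations, product

# Truncated seed lists (stemmed forms, as on the slide); for illustration only.
seed_words = {
    "joy": ["awesom", "glad", "proud"],
    "sadness": ["sorri", "sad", "hurt"],
}

def build_constraints(seeds):
    """Return (must_links, cannot_links) as sets of alphabetically sorted word pairs."""
    must, cannot = set(), set()
    # Must-link: every pair of distinct words inside one emotion class.
    for words in seeds.values():
        for a, b in combinations(sorted(set(words)), 2):
            must.add((a, b))
    # Cannot-link: every pair of words drawn from two different classes.
    for (_, w1), (_, w2) in combinations(seeds.items(), 2):
        for a, b in product(set(w1), set(w2)):
            if a != b:
                cannot.add(tuple(sorted((a, b))))
    # Assumption: a pair that is Must-linked in some class is never Cannot-linked.
    return must, cannot - must

must_links, cannot_links = build_constraints(seed_words)
```

With the two three-word classes above this yields 6 Must-links and 9 Cannot-links; with all eight classes of 16 words each, the number of pairs grows quickly.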
  • 14. Dirichlet Forest vs. Dirichlet
      Fear, DF-LDA: don’t think but know why even wanna care worry understand
      Fear, LDA: good exam lol luck just school haha i’m xx worry tomorrow
      Surprise, DF-LDA: that very really cool wow wonder just some differ amazing
      Surprise, LDA: just rt holy got thank did shit new love lol awesome buy oh
      Sadness, DF-LDA: bad my real feel life aw sad kill lost dead hurt wrong sick
      Sadness, LDA: lol just know sorry isn’t oh tweet did haha don’t thought think
  • 15. Emotion Topics How do we express emotions? JoyAnticipation Anger Topic 114 omg love haha thank really Topic 107 love thank follow wow Topic 159 good day hope morning thank Topic 158 love thank miss hug Topic 125 hope better feel thank soon Topic 26 good thank hope miss Topic 146 come wait week day june Topic 146 good day time work Topic 131 lmao fuck ass bitch shit Topic 4 ass yo lmao nigga Topic 19 lmao shit damn fuck oh Topic 13 shit nigga smh yea Fear Topic 48 omg oh lmao shit scare Topic 78 happen heart attack hospital Topic 27 don’t come night sleep outside Topic 140 time got work day Surprise Topic 172 yeag know think true funny Topic 89 know don’t think look Topic 15 think don’t know make really Topic 94 haha dont think really 29 70 21 14 5 Sadness Disgust Topic 6 oh sorry haha know didnt Topic 59 hurt got good bad pain Topic 106 tweet reply didn’t read sorry Topic 155 oh really make feel Topic 116 oh fuck don’t ye ew Topic 116 look haha oh know Topic 22 don’t oh think yeah lmao Topic 174 don’t think say people Acceptance Topic 43 ok oh thank cool okay Topic 102 know try let ok Topic 199 xx thank good okay follow Topic 8 night love good sleep 17 7 18 Neutral Topic 180 com www http check youtube Topic 156 twitter facebook people account Topic 184 account google app work email Topic 67 food chicken cook rt 19 13 Thursday, December 6, 2012
  • 16. Emotion Topics: How do we express emotions?
      Joy (Greeting): Topic 114: omg love haha thank really; Topic 107: love thank follow wow
      Anticipation (Caring): Topic 125: hope better feel thank soon; Topic 26: good thank hope miss
      Sadness (Sympathy): Topic 6: oh sorry know didnt; Topic 59: hurt got good bad pain
      Neutral (IT/Tech): Topic 180: com www http check youtube; Topic 156: twitter facebook people account
  • 17. Emotion Transitions (Plutchik’s Wheel of Emotions)
      Percentage and value shown at each emotion node: Joy 39.7% (0.51); Acceptance 10.4% (0.23); Fear 2.6% (0.11); Surprise 7.4% (0.17); Anticipation 15.1% (0.26); Disgust 2.9% (0.11); Sadness 9.1% (0.19); Anger 12.8% (0.37)
      [Figure: transition edges between emotions, labeled 0.31, 0.33, 0.32, 0.31, 0.33, 0.21, 0.34, 0.15, 0.14, 0.13, 0.15]
  • 19. Defining “Influence” emotion influencing tweet User A User B Having a tough day today. RIP Harrison. I’ll miss you a ton :/ Just pray about it. God will help you. Not really religious, but thanks man. :) If you need talk you know I’m here. Time (Sadness) (Acceptance) (Anticipation) 16 Thursday, December 6, 2012
  • 20. Topic 117 tweet people don’t read post Topic 59 hurt got bad pain feel Emotion Influences What can you say to make your partner feel better? Joy → SadnessSadness → Joy Topic 18 wear look think love black Topic 24 love thank great new look Acceptance → Anger Topic 31 i’m got lmax shit da Topic 13 lmao shit nigga smh yea Greeting Sympathizing Swearing Complaining 17 Thursday, December 6, 2012
  • 21. Emotion Influence
      [Figure: two bar charts, “Emotion Influence: Joy to Anger” and “Emotion Influence: Sadness to Joy”, each with bars for Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral]
      • Expressing Anger has a 26.5% chance of changing the partner’s emotion from Joy to Anger.
      • Expressing Joy has a 35.8% chance of changing the partner’s emotion from Sadness to Joy.
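One way to read the influence figures is as a conditional proportion over tweet triples (A's tweet, B's reply, A's next tweet). The sketch below assumes that reading; the slides do not give the exact normalization, and the function name `influence_probability` and the toy chains are hypothetical.

```python
from collections import Counter

def influence_probability(chains, src, dst):
    """For each partner emotion e, estimate P(next = dst | previous = src, partner = e).

    chains: list of conversations, each a list of (user, emotion) tuples in time order.
    """
    totals, hits = Counter(), Counter()
    for chain in chains:
        # Slide over consecutive triples: A's tweet, B's reply, A's next tweet.
        for (u1, e1), (u2, e2), (u3, e3) in zip(chain, chain[1:], chain[2:]):
            if u1 == u3 and u1 != u2 and e1 == src:
                totals[e2] += 1
                if e3 == dst:
                    hits[e2] += 1
    return {e: hits[e] / totals[e] for e in totals}

# Toy data, invented for illustration.
toy_chains = [
    [("A", "sadness"), ("B", "joy"), ("A", "joy")],
    [("A", "sadness"), ("B", "joy"), ("A", "sadness")],
    [("A", "sadness"), ("B", "anger"), ("A", "anger")],
]
sad_to_joy = influence_probability(toy_chains, "sadness", "joy")
```

On the toy data, a partner expressing joy moves the speaker from sadness to joy in one of two cases, and a partner expressing anger never does.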
  • 22. Outliers
      A: Sorry to hear about your bags. If you would like us to get someone to contact you DM us your reference and contact number.
      B: it's on it's way to manch. If the woman on the check in desk in Miami hadn't been trying to be all smart! Been no problem.
      A: Sorry about that. Pleased to hear they located it quickly for you though.
      B: mistakes happen.
  • 23. Analyzing Self-Disclosure Behaviors in Twitter Conversations Using Text Mining Techniques (Presented at ACL 2012) JinYeong Bak, Suin Kim, Alice Oh {jy.bak, suin.kim}@kaist.ac.kr, alice.oh@kaist.edu Department of Computer Science, KAIST Thursday, December 6, 2012
  • 24. 2012-07-11 Introduction
      In social psychology:
      • Degree of self-disclosure in a relationship depends on the strength of the relationship
      • Strategic self-disclosure can strengthen the relationship
      Example: “I like you too!” “You’re my best friend!”
  • 25. Hypothesis: Twitter conversations also show a similar pattern
      • Dyads with high relationship strength show more self-disclosure behavior (“I like you too!” “You’re my best friend!”)
      • Dyads with low relationship strength show less self-disclosure behavior (“Hello~” “Hi”)
  • 26. Methodology
      • Twitter Data: 131K users, 2M conversations
      • Relationship Strength: chain frequency (CF), chain length (CL)
      • Self-Disclosure: personal information, open communication, profanity
      • Analysis with Topic Models: latent Dirichlet allocation (LDA, [Blei, JMLR 2003]); aspect and sentiment unification model (ASUM, [Jo, WSDM 2011])
  • 27. Twitter Conversation
      • A Twitter conversation chain: 3 or more tweets, at least one reply by each user
      • Our Twitter conversation data: Oct 2011 to Dec 2011, 131K users, 2M chains, 11M tweets
      [Figure: example of a conversation chain, https://twitter.com/#!/britneyspears]
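The chain definition above (3 or more tweets, at least one reply by each user) can be checked with a short predicate. This sketch assumes the first tweet is the root and every later tweet is a reply to the one before it; the function name `is_conversation_chain` is hypothetical.

```python
def is_conversation_chain(tweets):
    """tweets: list of (user, text) tuples in reply order between two users."""
    users = {user for user, _ in tweets}
    # Assumption: the first tweet is the root, so only later tweets count as replies.
    repliers = {user for user, _ in tweets[1:]}
    return len(tweets) >= 3 and len(users) == 2 and repliers == users

ok = is_conversation_chain([("A", "hi"), ("B", "hello"), ("A", "how are you?")])
one_sided = is_conversation_chain([("A", "hi"), ("B", "hello"), ("B", "anyone there?")])
```

The first example qualifies; the second does not, because user A never replies.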
  • 28. Relationship Strength
      • Social psychology literature states that relationship strength can be measured by communication frequency and length [Granovetter, 1973; Levin and Cross, 2004]
      • CF (chain frequency): the number of conversational chains between the dyad, averaged per month
      • CL (chain length): the length of conversational chains between the dyad, averaged per month
      • Relationship strength: a high CF or CL for a dyad means the relationship is strong; a low CF or CL means the relationship is weak
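A minimal sketch of CF and CL for one dyad follows. The slide does not spell out how the monthly averaging is done, so two assumptions are made here: CF divides the number of chains by the number of months in the observation window, and CL averages the per-month mean chain length over months that contain at least one chain. The function name `relationship_strength` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

def relationship_strength(chains, n_months):
    """chains: list of (month, chain_length) tuples for one dyad.

    Returns (CF, CL) under the averaging assumptions stated above.
    """
    by_month = defaultdict(list)
    for month, length in chains:
        by_month[month].append(length)
    cf = len(chains) / n_months
    cl = mean(mean(lengths) for lengths in by_month.values()) if by_month else 0.0
    return cf, cl

# Toy dyad over the three-month window Oct 2011 to Dec 2011.
cf, cl = relationship_strength([("2011-10", 3), ("2011-10", 5), ("2011-12", 4)], n_months=3)
```

For the toy dyad this gives CF = 1 chain per month and CL = 4 tweets per chain.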
  • 29. 2012-07-11 Self-Disclosure } Open communication - Openness } Negative openness } Nonverbal openness } Emotional openness } Receptive openness – difficult to find in tweets } General-style openness – not clearly defined in the literature } Personal Information } Personally Identifiable Information (PII) } Personally Embarrassing Information (PEI) } Profanity } nigga, ass, wtf, lmao 26 Thursday, December 6, 2012
  • 30. Self-Disclosure - Openness: Negative openness
      • Method
          • We use ASUM with emoticons as seed words [“Aspect and sentiment unification model for online review analysis”, Jo, WSDM’11]
          • ASUM is an LDA-based joint model of topic and sentiment
          • ASUM takes unannotated data and classifies each sentence (tweet) as positive/negative/neutral
  • 31. Self-Disclosure - Openness: Nonverbal openness
      • Method
          • We look for emoticons, ‘lol’, ‘xxx’
          • Emoticons are like facial expressions: :) :( :P
          • ‘lol’ (laughing out loud) and ‘xxx’ (kisses) are very frequently used in a manner similar to nonverbal openness
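The lookup described above can be approximated with two regular expressions. The emoticon pattern below is a small illustrative set, not the list used in the study, and the function name `has_nonverbal_openness` is hypothetical.

```python
import re

# Eyes [:;=], optional nose [-'], then a mouth character. Slashes are left out
# on purpose so that "http://" is not mistaken for an emoticon.
EMOTICON = re.compile(r"[:;=][-']?[)(DPp]")
# 'lol' and 'xxx' as whole words, in any letter case.
TOKEN = re.compile(r"\b(lol|xxx)\b", re.IGNORECASE)

def has_nonverbal_openness(tweet):
    """Return True if the tweet contains an emoticon, 'lol', or 'xxx'."""
    return bool(EMOTICON.search(tweet) or TOKEN.search(tweet))

examples = ["see you soon :)", "LOL that was great", "thanks xxx",
            "details at http://uilab.kaist.ac.kr"]
flags = [has_nonverbal_openness(t) for t in examples]
```

A real emoticon lexicon would be much larger; the narrow pattern here trades recall for fewer false matches on URLs and timestamps.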
  • 32. Self-Disclosure - Openness: Emotional openness
      • Method: look for tweets that contain common feeling words [We Feel Fine (Harris, J, 2009)]
  • 33. Self-Disclosure – Personal Information
      • Personally Identifiable Information (PII). Ex) name, location, email address, job, social security number
      • Personally Embarrassing Information (PEI). Ex) clinical history, sexual life, job loss, family problem
  • 34. 2012-07-11 Self-Disclosure – Personal Information }   31 Thursday, December 6, 2012
  • 35. 2012-07-11 Self-Disclosure – Personal Information Example of PII, PEI and Profanity topics } Shown by high probability words in each topic PII 1 PII 2 PEI 1 PEI 2 PEI 3 Profanity san tonight pants teeth family nigga live time wear doctor brother lmao state tomorrow boobs dr sister shit texas good naked dentist uncle ass south ill wearing tooth cousin bitch 32 Thursday, December 6, 2012
  • 37. [Figure: panels labeled sentiment, nonverbal, emotional, profanity, and PII & PEI; axis: weak ↔ strong]
  • 38. [Figure: panels labeled emotional and PII & PEI; axis: weak ↔ strong]
  • 39. Results: Interpretation • Emotional openness: when dyads are not very close, they express frequent encouragement or polite reactions to babies or pets
  • 40. Results: Interpretation • PII: when people meet new acquaintances, they use PII to introduce themselves
  • 41. Results • Analyzing outliers: a dyad that is weakly linked but shows high self-disclosure
  • 42. Distributed Online Learning for Latent Dirichlet Allocation JinYeong Bak, Dongwoo Kim, and Alice Oh NIPS 2012 Workshop on Big Learning 39 Thursday, December 6, 2012
  • 43. Motivation • Problem 1: Inference for LDA takes a long time • Problem 2: Continuously expanding corpus necessitates continuous updates of model parameters • But updating of model parameters is not possible with plain LDA • Must re-train with the entire updated corpus • Solution to 1: Distributed inference shortens inference time (Newman JMLR 2009, Wang WWW 2012) • Solution to 2: Online (batch) learning enables updates to model parameters (Hoffman NIPS 2010) • Our Approach: Combine distributed inference and online learning 40 Thursday, December 6, 2012
  • 44. Distributed Online LDA • Based on variational inference • Mini-batch updates via stochastic learning (variational EM) • Distribute variational EM using MapReduce 41 Thursday, December 6, 2012
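The combination of mini-batch updates and MapReduce can be sketched as follows. The step size and the blending rule follow the online LDA update of Hoffman et al. (NIPS 2010); how DoLDA splits the work is only summarized on the slide, so the map/reduce division shown here (each mapper returns K x V expected word counts for its shard, the reducer sums them) is an assumption, and all function names are hypothetical. The E-step that produces the per-shard statistics is omitted.

```python
def learning_rate(t, tau0=1.0, kappa=0.7):
    """Step size rho_t = (tau0 + t) ** -kappa; kappa in (0.5, 1] for convergence."""
    return (tau0 + t) ** -kappa

def reduce_stats(shard_stats):
    """Reduce step: element-wise sum of the K x V statistics from every shard."""
    total = [row[:] for row in shard_stats[0]]
    for stats in shard_stats[1:]:
        for k, row in enumerate(stats):
            for v, x in enumerate(row):
                total[k][v] += x
    return total

def update_lambda(lam, shard_stats, t, corpus_size, batch_size, eta=0.01):
    """Blend the current topic-word parameter with the mini-batch estimate."""
    rho = learning_rate(t)
    stats = reduce_stats(shard_stats)
    scale = corpus_size / batch_size  # rescale the mini-batch to corpus size
    return [[(1 - rho) * l + rho * (eta + scale * s) for l, s in zip(lrow, srow)]
            for lrow, srow in zip(lam, stats)]

# One topic, two vocabulary words, statistics from two mappers (toy numbers).
shards = [[[1.0, 0.0]], [[1.0, 2.0]]]
new_lam = update_lambda([[1.0, 1.0]], shards, t=0, corpus_size=4, batch_size=2)
```

Because only the summed statistics cross the network, the reduce step is cheap relative to the per-document inference done in the mappers.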
  • 45. Experimental Setup • Data: 5.1M Twitter conversations • 4.8M English Wikipedia articles • 60 node Hadoop system • Each node with 8 x 2.30GHz cores 42 Thursday, December 6, 2012
  • 46. Wikipedia Results
      Topic 0: relativity physics einstein quantum gravity
      Topic 22: channel television tv cable news
      Topic 42: milk chocolate sugar food cream
      Topic 65: god bible moses chapter genesis
      Topic 94: party election president member elected
      Topic 170: season team league game football
      Topic 232: album song band music released

      Minibatch | oLDA      | DoLDA    | Speedup
      16,384    | 238666.25 | 47994.03 | 4.97
      32,768    | 188508.71 | 33470.03 | 5.63
      65,536    | 206290.27 | 26788.53 | 7.70
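The Speedup column is the ratio of the oLDA figure to the DoLDA figure for the same mini-batch size, which a few lines of arithmetic confirm. The slide does not state the unit of the two timing columns; the variable names below are hypothetical.

```python
# (oLDA, DoLDA) figures per mini-batch size, copied from the slide.
timings = {
    16384: (238666.25, 47994.03),
    32768: (188508.71, 33470.03),
    65536: (206290.27, 26788.53),
}

# Speedup = oLDA / DoLDA, rounded to two decimals as on the slide.
speedups = {size: round(olda / dolda, 2) for size, (olda, dolda) in timings.items()}
```

The ratios reproduce the reported 4.97, 5.63, and 7.70.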
  • 47. Twitter Temporal Patterns of Topics
      Topic words: party vote people politics obama
      [Figure: document proportion by day, 10−10 to 12−04]
      Conversation b1 on November 2, 2010
          A: I wish I could vote today, but I have to work for 14 hours
          B: is it legal for them not to give you time off to vote?
          A: probably
      Conversation b2 on March 31, 2012
          A: Mitt Romney: "Obama should release the notes and transcripts of all his meetings with world leaders"
          B: Why is he being held to higher standard than any other president.
          A: did you see my Santorum 'slip' tweet? Is the media afraid to comment on it?
          B: oh yes I did. I saw it mentioned yesterday also. disgusting and he should be raked over hot coals for it.
      Topic words: school mate class teacher grade
      [Figure: document proportion by day, 11−07 to 12−01]
      Conversation c1 on September 5, 2011
          A: Oh god, miss Waite ran over to me up the school just now! :L on the plus subjects are now picked! :D
          B: what did you pick??
          A: english, RE, art and psychology! :) was unsure between history and psych but found out bubbles was teaching it so nooo! :L
      Conversation c2 on October 12, 2011
          A: :) My day's been okay! It feels long! But school' was okayish. I hope you have an awesome day! :D
          B: that's good then! Ahh hope it's not cause anything bad happened? Thanks! Have a great sleep :)
          A: no! Class was just boring lol and thanks! :) i will! Even though i have to wake up early tomorrow for a midterm! :S
  • 48. CAVEAT 45 Big Data, social media data, do not always get the right answers! They contain much noise and much bias. Sentiment analysis is also full of problems at the big data-level because every small assumption can turn out to cause wide swings in the final interpretation of the data. They are valuable because they have opened up possibilities for analyses of naturally-occurring data in huge amounts. We need better methods and tools that are tailored for social media. We need to ask the right questions that can be answered well despite the biases of the social media data. Thursday, December 6, 2012
  • 49. For details, visit our webpage: http://uilab.kaist.ac.kr Or email me: alice.oh@kaist.edu Thursday, December 6, 2012