Machine learning approaches for understanding
social interactions on Twitter
May 6, 2014

Alice Oh

alice.oh@kaist.edu

aoh@seas.harvard.edu

http://uilab.kaist.ac.kr/members/aliceoh/
Our Research
• Topic Modeling

• ICML 2014: Hierarchical Dirichlet scaling process

• IJCAI 2013: Context-dependent conceptualization

• NIPS Big Learning Workshop 2012: Distributed online learning for latent Dirichlet allocation

• CIKM 2012: Recursive Chinese restaurant processes for modeling topic hierarchies

• ICML 2012: Dirichlet processes with mixed random measures

• Social Media Analysis

• ACL 2014 Workshop: Self-disclosure topic model

• WWW 2014: Computational analysis of agenda setting theory

• AAAI 2013: Hierarchical aspect-sentiment model

• ICWSM 2012: Social aspects of emotions in Twitter conversations

• ACL 2012: Self-disclosure and relationship strength in Twitter conversations

• WSDM 2011: Aspect sentiment unification model for online review analysis
Contact Information
• At Harvard until the end of July 2014 and open to
• Collaborations: writing papers, sharing data, etc.
• Discussions about topic modeling and computational social science
• Going back to KAIST in August
• http://uilab.kaist.ac.kr
• alice.oh@kaist.edu
• Can recommend students for intern, postdoc, and researcher positions
• Please consider attending
• ICWSM (program co-chair), Ann Arbor, MI
• ACL Workshop on Social Dynamics and Personal Attributes (co-organizer), Baltimore, MD
What is topic modeling?
Blei, Communications of the ACM, 2012
Motivation
• What are the topics discussed in the article?

• Is the article related to

• household finances?

• price of gasoline?

• price of Apple stock?

• How would you build an automatic system for answering these questions?
http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?hp
nascar, races, track, raceway, race, cars, fuel, auto, racing
economic, slowdown, sales, recession, costs, spending, save
fans, spectators, sports, leagues, teams, competition
Topics: multinomial over words
Topic Distributions
Input to LDA
Topics Discovered by LDA
nascar  0.12    spending   0.09    sports   0.12
races   0.1     economic   0.07    team     0.11
cars    0.1     recession  0.06    game     0.1
racing  0.09    save       0.05    player   0.1
track   0.08    money      0.05    athlete  0.09
speed   0.06    cut        0.04    win      0.07
...             ...                ...
money   0.002   speed      0.003   nascar   0.001
Topics: multinomial over vocabulary
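A minimal sketch of the generative story behind the table above, assuming each topic is a multinomial over the vocabulary and each document mixes topics with Dirichlet-distributed proportions. The toy topics, their probabilities, and the names `sample_dirichlet` and `generate_document` are invented for illustration and are not taken from the talk.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, alpha, length, rng):
    """Generate one document following LDA's generative process.

    topics: list of dicts mapping word -> probability (each sums to 1).
    alpha:  Dirichlet hyperparameters, one per topic.
    """
    theta = sample_dirichlet(alpha, rng)  # per-document topic proportions
    words = []
    for _ in range(length):
        # Pick a topic for this word position, then a word from that topic.
        z = rng.choices(range(len(topics)), weights=theta)[0]
        vocab, probs = zip(*topics[z].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words

# Hypothetical toy topics; the probabilities are made up for this example.
TOY_TOPICS = [
    {"nascar": 0.4, "races": 0.3, "cars": 0.3},
    {"spending": 0.4, "economic": 0.3, "recession": 0.3},
    {"sports": 0.4, "team": 0.3, "game": 0.3},
]

theta, doc = generate_document(
    TOY_TOPICS, alpha=[0.5, 0.5, 0.5], length=12, rng=random.Random(0)
)
```

Inference in LDA runs this story in reverse: given only the words, it estimates the topics and the per-document proportions.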
Graphical Representation of LDA
Topics
sales xxx slowdown
recession cars races
spending xxx save
costs fuel

Do you feel what I feel?
Social Aspects of Emotions in Twitter Conversations
Suin Kim, JinYeong Bak, Alice Oh
ICWSM 2012
Twitter conversation data
• Approx. 220k dyads who “reply” to each other and 1,670k conversational chains (we now have about 5x this amount)
Emotion cycles
We propose that organizational dyads and groups inhabit
emotion cycles: Emotions of an individual influence the
emotions, thoughts and behaviors of others; others’ reactions
can then influence their future interactions with the individual
expressing the original emotion, as well as that individual’s
future emotions and behaviors. People can mimic the
emotions of others, thereby extending the social presence of a
specific emotion, but can also respond to others’ emotions,
extending the range of emotions present.
Topic model with a twist
• Dirichlet forest prior (Andrzejewski et al.)
• Mixture of Dirichlet tree distributions
• Dirichlet tree: Generalization of Dirichlet distribution
• Knowledge is expressed using Must-link and Cannot-link primitives
• Must-link(love, sweetheart)
• Cannot-link(exciting, bored)
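One way to picture the Dirichlet tree mentioned above is as a tree whose internal nodes each draw Dirichlet-distributed branch shares, with a word's probability being the product of the shares along its path. The sketch below assumes that picture. The tree shape, the edge weights, and the function names are hypothetical, and it does not attempt to encode Cannot-link or the mixture over trees.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_dirichlet_tree(node, rng, mass=1.0, out=None):
    """Return a word -> probability dict drawn from a Dirichlet tree.

    A node is either a leaf (a word string) or a list of
    (edge_weight, child_node) pairs.
    """
    if out is None:
        out = {}
    if isinstance(node, str):
        out[node] = mass  # leaf: probability is the product of shares on its path
        return out
    branch = sample_dirichlet([w for w, _ in node], rng)
    for share, (_, child) in zip(branch, node):
        sample_dirichlet_tree(child, rng, mass * share, out)
    return out

# Hypothetical tree: "love" and "sweetheart" sit under a shared node with
# large, equal edge weights, so they tend to split that node's mass evenly.
TOY_TREE = [
    (2.0, [(50.0, "love"), (50.0, "sweetheart")]),
    (1.0, "exciting"),
    (1.0, "bored"),
]

phi = sample_dirichlet_tree(TOY_TREE, random.Random(0))
```

With every internal node removed, the same function reduces to an ordinary Dirichlet draw, which is the sense in which the tree generalizes the Dirichlet distribution.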
DF-LDA
Domain knowledge in Dirichlet forest prior
Seed Words
anticipation: hope, wait, await, inspir, excit, bore, readi, expect, nervou, calm, motiv, prepar, certain, anxiou, optimist, forese
joy: awesom, amaz, wonder, excit, glad, fine, beauti, high, lucki, super, perfect, complet, special, bless, safe, proud
anger: shit, bitch, ass, mean, damn, mad, jealou, piss, annoi, angri, upset, moron, rage, screw, stuck, irrit
surprise: amaz, wow, wonder, weird, lucki, differ, awkward, confus, holi, strang, shock, odd, embarrass, overwhelm, astound, astonish
fear: scare, stress, horror, nervou, terror, alarm, behind, panic, fear, afraid, desper, threaten, tens, terrifi, fright, anxiou
sadness: sorri, bad, aw, sad, wrong, hurt, blue, dead, lost, crush, weak, depress, wors, low, terribl, lone
disgust: sick, wrong, evil, fat, ugli, horribl, gross, terribl, selfish, miser, pathet, disgust, worthless, aw, asham, fuck
acceptance: okai, ok, same, alright, safe, lazi, relax, peac, content, normal, secur, complet, numb, fulfil, comfort, defeat
Must-link within a class; Cannot-link between classes
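The rule "Must-link within a class, Cannot-link between classes" can be sketched as a pair generator over seed classes. This is an illustrative reading, not the authors' code. In particular, skipping Cannot-links for stems that appear in both classes (for example `excit` under both anticipation and joy) is an assumption made here to avoid contradictory constraints, and `build_links` is a hypothetical helper.

```python
from itertools import combinations

def build_links(seed_classes):
    """Return (must_links, cannot_links) as sets of sorted word pairs."""
    must, cannot = set(), set()
    # Must-link: every pair of seed words inside the same emotion class.
    for words in seed_classes.values():
        for a, b in combinations(sorted(set(words)), 2):
            must.add((a, b))
    # Cannot-link: pairs drawn from two different classes, ignoring stems
    # that both classes share (assumption, see the lead-in).
    for c1, c2 in combinations(sorted(seed_classes), 2):
        w1, w2 = set(seed_classes[c1]), set(seed_classes[c2])
        shared = w1 & w2
        for a in w1 - shared:
            for b in w2 - shared:
                cannot.add(tuple(sorted((a, b))))
    # Never let a pair be both must-linked and cannot-linked.
    return must, cannot - must

# Small subset of the seed words listed above.
SEEDS = {
    "anticipation": ["hope", "wait", "excit"],
    "joy": ["glad", "proud", "excit"],
    "anger": ["mad", "rage", "upset"],
}

must, cannot = build_links(SEEDS)
```

The resulting pairs would then be handed to whatever prior construction is in use; how the Dirichlet forest prior consumes them is outside this sketch.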
Emotion Topics: How do we express emotions?
Joy, Anticipation, Anger
Topic 114: omg, love, haha, thank, really
Topic 107: love, thank, follow, wow
Topic 159: good, day, hope, morning, thank
Topic 158: love, thank, miss, hug
Topic 125: hope, better, feel, thank, soon
Topic 26: good, thank, hope, miss
Topic 146: come, wait, week, day, june
Topic 146: good, day, time, work
Topic 131: lmao, fuck, ass, bitch, shit
Topic 4: ass, yo, lmao, nigga
Topic 19: lmao, shit, damn, fuck, oh
Topic 13: shit, nigga, smh, yea
Fear
Topic 48: omg, oh, lmao, shit, scare
Topic 78: happen, heart, attack, hospital
Topic 27: don’t, come, night, sleep, outside
Topic 140: time, got, work, day
Surprise
Topic 172: yeag, know, think, true, funny
Topic 89: know, don’t, think, look
Topic 15: think, don’t, know, make, really
Topic 94: haha, dont, think, really
29 70 21 14 5
Sadness, Disgust
Topic 6: oh, sorry, haha, know, didnt
Topic 59: hurt, got, good, bad
Topic 106: tweet, reply, didn’t, read, sorry
Topic 155: oh, really, make, feel
Topic 116: oh, fuck, don’t, ye, ew
Topic 116: look, haha, oh, know
Topic 22: don’t, oh, think, yeah, lmao
Topic 174: don’t, think, say, people
Acceptance
Topic 43: ok, oh, thank, cool, okay
Topic 102: know, try, let, ok
Topic 199: xx, thank, good, okay, follow
Topic 8: night, love, good, sleep
17 7 18
Neutral
Topic 180: com, www, http, check, youtube
Topic 156: twitter, facebook, people, account
Topic 184: account, google, app, work, email
Topic 67: food, chicken, cook, rt
19
Emotion Topics: How do we express emotions?
Joy, Anticipation
Topic 114: omg, love, haha, thank, really
Topic 107: love, thank, follow, wow
Topic 125: hope, better, feel, thank, soon
Topic 26: good, thank, hope, miss
Sadness
Topic 6: oh, sorry, haha, know, didnt
Topic 59: hurt, got, good, bad
Neutral
Topic 180: com, www, http, check, youtube
Topic 156: twitter, facebook, people, account
Topic labels: Greeting, Caring, Sympathy, IT/Tech
Emotion-tagged conversations
A (Love): @amithpr @dhempe @OperaIndia - Would you have any update on
@mrunmaiy's health - hope she is recovering well?
B (neut): @labnol @dhempe she is recovering but slow. The injury is on the spine
therefore worrisome. Still in icu.
A (Sadness): @amithpr thanks for the update.. extremely said to hear that news..
B (neut): @labnol #prayformrun She is a fighter and will come out of this
B (neut): @AyeItsMeiMei just tell ur followers to report her for spam. then she'll be
kicked off twitter
A (Anger): @Jakeosaurous dude I didn't even do shit to her I'm just here tweeting &
she calls me a ugly bitch? I was like oh wow thanks?
B (neut): @AyeItsMeiMei yeah clearly shes so ugly she cant even use her real pic:P
so dont feel bad
A (Love): @Jakeosaurous haha. I don't care. She's getting spammed with hate.
Hahaha. (": thanks though.
B (neut): @AyeItsMeiMei np
Emotion Transitions
Plutchik’s Wheel of Emotions
Joy           39.7%   0.51
Acceptance    10.4%   0.23
Fear           2.6%   0.11
Surprise       7.4%   0.17
Anticipation  15.1%   0.26
Disgust        2.9%   0.11
Sadness        9.1%   0.19
Anger         12.8%   0.37
Other values shown in the figure: 0.31, 0.33, 0.32, 0.31, 0.33, 0.21, 0.34, 0.15, 0.14, 0.13, 0.15
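A rough sketch of how transition values like those in the figure might be estimated from emotion-tagged conversations: count consecutive pairs of emotion labels and normalize each row. The talk does not spell out the exact definition, so treating a transition as "one label followed by the next in the same sequence" is an assumption, as are the function name and the toy sequences.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequences):
    """Estimate P(next emotion | previous emotion) from label sequences.

    sequences: list of lists of emotion labels, each in time order.
    """
    counts = defaultdict(Counter)
    for labels in sequences:
        # Pair every label with the one that follows it.
        for prev, nxt in zip(labels, labels[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }

# Invented toy sequences, not data from the talk.
TOY_SEQUENCES = [
    ["Joy", "Joy", "Sadness", "Joy"],
    ["Sadness", "Sadness"],
]

probs = transition_probabilities(TOY_SEQUENCES)
```

Each row of `probs` sums to 1, so a value such as `probs["Joy"]["Joy"]` can be read as the share of Joy tweets that are followed by another Joy tweet.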
Defining “Influence”
emotion influencing tweet
User A
User B
Having a tough day today. RIP Harrison. I’ll miss you a ton :/
Just pray about it. God will help you.
Not really religious, but thanks man. :)
If you need talk you know I’m here.
Time
(Sadness)
(Acceptance)
(Anticipation)
Emotion Influences: What can you say to make your partner feel better?
Topic 117: tweet, people, don’t, read, post
Topic 59: hurt, got, bad, pain, feel
Joy → Sadness
Sadness → Joy
Topic 18: wear, look, think, love, black
Topic 24: love, thank, great, new, look
Anticipation → Surprise
Topic 96: music, listen, play, song, good
Topic 178: follow, tweet, people, twitter, thank
Acceptance → Anger
Topic 31: i’m, got, lmax, shit, da
Topic 13: lmao, shit, nigga, smh, yea
Disgust → Joy
Topic 61: watch, new, live, tv, tonight
Topic 63: watch, good, think, know, look
Topic labels: Suggesting, Greeting, Sympathy, Swear words, Complaining
Emotion Influence: Joy to Anger
Bar chart over Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral (y-axis 0 to 0.3)
Bar values as extracted: 0.041, 0.071, 0.082, 0.053, 0.265, 0.061, 0.081, 0.042, 0.051
Expressing Anger has a 26.5% chance of changing the partner’s emotion from Joy to Anger.
Emotion Influence: Sadness to Joy
Bar chart over Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral (y-axis 0 to 0.36)
Bar values as extracted: 0.211, 0.23, 0.214, 0.209, 0.191, 0.237, 0.253, 0.358, 0.273
Expressing Joy has a 35.8% chance of changing the partner’s emotion from Sadness to Joy.
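A statement such as "expressing Anger has a 26.5% chance of changing the partner's emotion from Joy to Anger" can be read as a conditional frequency. The sketch below assumes one such reading: among cases where the partner started in emotion `before` and the other user expressed emotion `expressed`, what fraction ended in emotion `after`. The triple format, the function name, and the toy data are hypothetical, and the authors' exact estimator may differ.

```python
def influence_probability(triples, before, after, expressed):
    """Fraction of (before, expressed, *) cases that end in `after`.

    triples: iterable of (partner_before, expressed_emotion, partner_after).
    Returns None when no case matches the conditioning pair.
    """
    relevant = [t for t in triples if t[0] == before and t[1] == expressed]
    if not relevant:
        return None
    hits = sum(1 for t in relevant if t[2] == after)
    return hits / len(relevant)

# Invented toy observations, not data from the talk.
TOY_TRIPLES = [
    ("Joy", "Anger", "Anger"),
    ("Joy", "Anger", "Joy"),
    ("Joy", "Anger", "Joy"),
    ("Joy", "Anger", "Joy"),
    ("Sadness", "Joy", "Joy"),
]

joy_to_anger = influence_probability(TOY_TRIPLES, "Joy", "Anger", "Anger")
sadness_to_joy = influence_probability(TOY_TRIPLES, "Sadness", "Joy", "Joy")
```

Repeating the call for each expressed emotion would give one bar per emotion, which matches the shape of the two charts described above.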
Self-disclosure topic model
JinYeong Bak, Chin-Yew Lin, and Alice Oh

ACL 2014 Workshop on Social Dynamics and Personal Attributes
Self-disclosure Research using Twitter
• People disclose personal and secretive information
• to build and maintain interpersonal relationships
• to get social support
• Twitter is a great source for naturally-occurring, large-scale, longitudinal data on self-disclosure behavior
• We develop a topic model for classifying self-disclosure behavior into three categories: G (general, no disclosure), M (medium disclosure), H (high disclosure)
• We look at the correlation of self-disclosure behavior and frequency of Twitter conversations in longitudinal data
Self-disclosure in Twitter conversations
Conversation 1:
So my brother is going to Roskilde Festival and my mother and sister is going to England.. That leaves me, my dad and my dog.
@cccc why aren't you going to england?
@dddd because my sister is going with 3 of her friends and my mom's just there... to be there. And my sister didn't want me to come :(

Conversation 2:
I'm moving out.
@xxxx ??? What's going on bb?
@yyyy Mother. Done with her. I am planning to get out now. There's nothing I can do, we dont get along
@xxxx I'm.sorry hunn. That's rough. Where are you going to go though?
@yyyy Probably stay at a friends place in the timebeing until I find a place to live!
@xxxx :/ well I'm glad your getting out if she is being horrible to you

Conversation 3:
Oh, prepregnancy pants, you are so uncomfortable.
@eeee You can put them on? Jealous.
@ffff they are cutting into my flesh and are giving me a ridiculous muffin top. It isn't pretty. But we have company coming over.
@eeee Yea, I tried yesterday. I got one pair of shorts to button painfully and my jeans just laughed at me.
Data
• Full data
• 88k users, 51k dyads
• 1.3M conversations
• 10.5M tweets
• Longitudinal data from August 2007 to July 2013
• Labeled data (gold standard for self-disclosure level)
• 101 conversations
• 673 tweets
Graphical Representation of SDTM
3 sets of topics, one for G, M, and H levels
By using a topic model, we can
• classify the levels of disclosure
• discover topics associated with each level
• generalize to other social media sites using the same set of seed words
Seed Words
• Medium level: frequent trigrams for personally identifiable information
• High level: automatically extracted from the sixbillionsecrets website
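For the medium-level seeds, "frequent trigrams" can be illustrated with a simple count over whitespace tokens. This is only a sketch of the counting step. The tokenization, the function name `frequent_trigrams`, and the toy texts are assumptions, and the talk does not say how candidate trigrams were filtered for personally identifiable information.

```python
from collections import Counter

def frequent_trigrams(texts, top_k):
    """Return the top_k most common word trigrams across the texts."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()
        # Slide a window of three consecutive tokens over the text.
        counts.update(zip(tokens, tokens[1:], tokens[2:]))
    return counts.most_common(top_k)

# Invented toy texts, not tweets from the dataset.
TOY_TEXTS = [
    "My name is Ann",
    "my name is Bob and my email is hidden",
]

top = frequent_trigrams(TOY_TEXTS, top_k=1)
```

On real data, a manual or rule-based pass over the most frequent trigrams would still be needed to keep only those that signal personal information.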
Classification Results
Direct Classification using the Models
Classification with SVM using Features Learned from Models
Self-disclosure topics
SD level & conversation frequency
Sociolinguistic Analysis of Twitter in Multilingual Societies
Suin Kim, Ingmar Weber, Li Wei, and Alice Oh
Under Review
Data
Visualization of the network
How are they connected?
• English monolinguals and X-EN bilinguals bridge the network
Closer look at Bilinguals: Which language do they choose?
Closer look at Bilinguals: Hashtag usage
Closer look at Bilinguals: Topics (Results of LDA)
Future directions
• Develop a model for predicting language choice in bilinguals

• Look at how English is used throughout the world

• Cognitive studies of first and second languages

• Self-disclosure and relationship building

• Email me for data sharing, collaborating, discussing, …

• alice.oh@kaist.edu

Talk at MIT HCI Seminar

  • 1. Machine learning approaches for understanding social interactions on Twitter May 6, 2014 Alice Oh alice.oh@kaist.edu aoh@seas.harvard.edu http://uilab.kaist.ac.kr/members/aliceoh/
  • 2. Our Research • Topic Modeling • ICML 2014: Hierarchical Dirichlet scaling process • IJCAI 2013: Context-dependent conceptualization • NIPS Big Learning Workshop 2012: Distributed online learning for latent Dirichlet allocation • CIKM 2012: Recursive Chinese restaurant processes for modeling topic hierarchies • ICML 2012: Dirichlet processes with mixed random measures • Social Media Analysis • ACL 2014 Workshop: Self-disclosure topic model • WWW 2014: Computational analysis of agenda setting theory • AAAI 2013: Hierarchical aspect-sentiment model • ICWSM 2012: Social aspects of emotions in Twitter conversations • ACL 2012: Self-disclosure and relationship strength in Twitter conversations • WSDM 2011: Aspect sentiment unification model for online review analysis 2
  • 3. Contact Information • At Harvard until end of July, 2014 and open for • Collaborations: writing papers, sharing data, etc. • Discussions about topic modeling and computational social science • Going back to KAIST in August • http://uilab.kaist.ac.kr • alice.oh@kaist.edu • Can recommend students for intern, postdoc, and researcher positions • Please consider attending • ICWSM (program co-chair), Ann Arbor, MI • ACL Workshop on Social Dynamics and Personal Attributes (co- organizer), Baltimore, MD 3
  • 4. What is topic modeling?
  • 5. Blei, Communications of the ACM, 2012
  • 7. Motivation • What are the topics discussed in the article? • Is the article related to • household finances? • price of gasoline? • price of Apple stock? • How would you build an automatic system for answering these questions?
  • 8. http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?hp nascar, races, track, raceway, race, cars, fuel, auto, racing economic, slowdown, sales, recession, costs, spending, save fans, spectators, sports, leagues, teams, competition 8
  • 9. http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html? nascar, races, track, raceway, race, cars, fuel, auto, racing economic, slowdown, sales, recession, costs, spending, save fans, spectators, sports, leagues, teams, competition Topics: multinomial over wordsTopic Distributions
  • 11. Topics Discovered by LDA nascar 0.12 spending 0.09 sports 0.12 races 0.1 economic 0.07 team 0.11 cars 0.1 recession 0.06 game 0.1 racing 0.09 save 0.05 player 0.1 track 0.08 money 0.05 athlete 0.09 speed 0.06 cut 0.04 win 0.07 ... ... ... money 0.002 speed 0.003 nascar 0.001 Topics: multinomial over vocabulary 11
  • 12. http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html? nascar, races, track, raceway, race, cars, fuel, auto, racing economic, slowdown, sales, recession, costs, spending, save fans, spectators, sports, leagues, teams, competition Topics: multinomial over wordsTopic Distributions
  • 13. Graphical Representation of LDA Topic Distributions nascar, races, track, raceway, race, cars, fuel, auto, racing economic, slowdown, sales, recession, costs, spending, save fans, spectators, sports, leagues, teams, competition Topics: multinomial over words Topics sales xxx slowdown recession cars races spending xxx save costs fuel
 13
  • 14. Do you feel what I feel? Social Aspects of Emotions in Twitter Conversations Suin Kim, JinYeong Bak, Alice Oh ICWSM 2012 14
  • 15. Twitter conversation data • Twitter conversation data: approx 220k dyads who “reply” to each other, 1,670k conversational chains (We now have about 5x this amount) ! 1! 2! 3! 4!
  • 17. Emotion cycles We propose that organizational dyads and groups inhabit emotion cycles: Emotions of an individual influence the emotions, thoughts and behaviors of others; others’ reactions can then influence their future interactions with the individual expressing the original emotion, as well as that individual’s future emotions and behaviors. People can mimic the emotions of others, thereby extending the social presence of a specific emotion, but can also respond to others’ emotions, extending the range of emotions present. 17
  • 18. Topic model with a twist • Dirichlet forest prior (Andrzejewski et al.) • Mixture of Dirichlet tree distribution • Dirichlet tree: Generalization of Dirichlet distribution • Knowledge is expressed using Must-link and Cannot-link primitives • Must-link(love, sweetheart) • Cannot-link(exciting, bored) 18 q ⌘ DF-LDA
  • 19. Domain knowledge in Dirichlet forest prior 19 Seed Words anticipation hope wait await inspir excit bore readi expect nervou calm motiv prepar certain anxiou optimist forese joy awesom amaz wonder excit glad fine beauti high lucki super perfect complet special bless safe proud anger shit bitch ass mean damn mad jealou piss annoi angri upset moron rage screw stuck irrit surprise amaz wow wonder weird lucki differ awkward confus holi strang shock odd embarrass overwhelm astound astonish fear scare stress horror nervou terror alarm behind panic fear afraid desper threaten tens terrifi fright anxiou sadness sorri bad aw sad wrong hurt blue dead lost crush weak depress wors low terribl lone disgust sick wrong evil fat ugli horribl gross terribl selfish miser pathet disgust worthless aw asham fuck acceptance okai ok same alright safe lazi relax peac content normal secur complet numb fulfil comfort defeat Must-link within a class Cannot-link between classes
  • 20. Emotion Topics How do we express emotions? JoyAnticipation Anger Topic 114 omg love haha thank really Topic 107 love thank follow wow Topic 159 good day hope morning thank Topic 158 love thank miss hug Topic 125 hope better feel thank soon Topic 26 good thank hope miss Topic 146 come wait week day june Topic 146 good day time work Topic 131 lmao fuck ass bitch shit Topic 4 ass yo lmao nigga Topic 19 lmao shit damn fuck oh Topic 13 shit nigga smh yea Fear Topic 48 omg oh lmao shit scare Topic 78 happen heart attack hospital Topic 27 don’t come night sleep outside Topic 140 time got work day Surprise Topic 172 yeag know think true funny Topic 89 know don’t think look Topic 15 think don’t know make really Topic 94 haha dont think really 29 70 21 14 5 Sadness Disgust Topic 6 oh sorry haha know didnt Topic 59 hurt got good bad Topic 106 tweet reply didn’t read sorry Topic 155 oh really make feel Topic 116 oh fuck don’t ye ew Topic 116 look haha oh know Topic 22 don’t oh think yeah lmao Topic 174 don’t think say people Acceptance Topic 43 ok oh thank cool okay Topic 102 know try let ok Topic 199 xx thank good okay follow Topic 8 night love good sleep 17 7 18 Neutral Topic 180 com www http check youtube Topic 156 twitter facebook people account Topic 184 account google app work email Topic 67 food chicken cook rt 19 20
  • 21. Emotion Topics How do we express emotions? JoyAnticipation Topic 114 omg love haha thank really Topic 107 love thank follow wow Topic 125 hope better feel thank soon Topic 26 good thank hope miss Sadness Topic 6 oh sorry haha know didnt Topic 59 hurt got good bad Neutral Topic 180 com www http check youtube Topic 156 twitter facebook people account GreetingCaring Sympathy
 IT/Tech 21
  • 22. Emotion-tagged conversations 22 A (Love): @amithpr @dhempe @OperaIndia - Would you have any update on @mrunmaiy's health - hope she is recovering well? B (neut): @labnol @dhempe she is recovering but slow. The injury is on the spine therefore worrisome. Still in icu. A (Sadness): @amithpr thanks for the update.. extremely said to hear that news.. B (neut): @labnol #prayformrun She is a fighter and will come out of this B (neut): @AyeItsMeiMei just tell ur followers to report her for spam. then she'll be kicked off twitter A (Anger): @Jakeosaurous dude I didn't even do shit to her I'm just here tweeting & she calls me a ugly bitch? I was like oh wow thanks? B (neut): @AyeItsMeiMei yeah clearly shes so ugly she cant even use her real pic:P so dont feel bad A (Love): @Jakeosaurous haha. I don't care. She's getting spammed with hate. Hahaha. (": thanks though. B (neut): @AyeItsMeiMei np
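  • 23. Emotion Transitions (Plutchik’s Wheel of Emotions). Each emotion with its percentage and the adjacent value from the figure: Joy 39.7% (0.51); Acceptance 10.4% (0.23); Fear 2.6% (0.11); Surprise 7.4% (0.17); Anticipation 15.1% (0.26); Disgust 2.9% (0.11); Sadness 9.1% (0.19); Anger 12.8% (0.37). Remaining edge values from the figure: 0.31, 0.33, 0.32, 0.31, 0.33, 0.21, 0.34, 0.15, 0.14, 0.13, 0.15.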
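The values on this slide can be read as entries of a first-order transition table over emotion labels. The following is a minimal sketch, assuming each conversation is available as an ordered list of emotion labels (one per tweet) and that a transition is counted between consecutive tweets. The function name `transition_probabilities` and the toy data are hypothetical and are not taken from the talk.

```python
from collections import Counter, defaultdict


def transition_probabilities(conversations):
    """Estimate P(next emotion | current emotion) from label sequences.

    `conversations` is a list of lists of emotion labels, one label per
    tweet, in chronological order (an assumed input format).
    """
    counts = defaultdict(Counter)
    for labels in conversations:
        # Pair every tweet's label with the label of the tweet that follows it.
        for current, nxt in zip(labels, labels[1:]):
            counts[current][nxt] += 1

    probs = {}
    for current, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[current] = {e: c / total for e, c in next_counts.items()}
    return probs


# Toy, made-up label sequences for illustration only.
toy_conversations = [
    ["Joy", "Joy", "Acceptance"],
    ["Sadness", "Joy", "Joy"],
    ["Anger", "Anger", "Sadness"],
]
transition_table = transition_probabilities(toy_conversations)
print(transition_table["Joy"])
```

On real data the same counting would be run over all emotion-tagged conversations; whether the authors normalized by row in this way is an assumption.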
  • 24. Defining “Influence”: the emotion-influencing tweet. Conversation between User A and User B over time: “Having a tough day today. RIP Harrison. I’ll miss you a ton :/” | “Just pray about it. God will help you.” | “Not really religious, but thanks man. :)” | “If you need talk you know I’m here.” Emotion labels shown along the timeline: Sadness, Acceptance, Anticipation.
  • 25. Topic 117 tweet people don’t read post Topic 59 hurt got bad pain feel Emotion Influences What can you say to make your partner feel better? Joy → SadnessSadness → Joy Topic 18 wear look think love black Topic 24 love thank great new look Anticipation → Surprise Topic 96 music listen play song good Topic 178 follow tweet people twitter thank Acceptance → Anger Topic 31 i’m got lmax shit da Topic 13 lmao shit nigga smh yea Disgust → Joy Topic 61 watch new live tv tonight Topic 63 watch good think know look Suggesting Greeting Sympathy
 Swear words Complaining 25
  • 26. Emotion Influence: Joy to Anger [bar chart over the partner’s expressed emotion: Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral; bar values in extraction order, which may not match the category order: 0.041, 0.071, 0.082, 0.053, 0.265, 0.061, 0.081, 0.042, 0.051]. Expressing Anger has a 26.5% chance of changing the partner’s emotion from Joy to Anger. Emotion Influence: Sadness to Joy [bar chart over the same categories; bar values in extraction order: 0.211, 0.23, 0.214, 0.209, 0.191, 0.237, 0.253, 0.358, 0.273]. Expressing Joy has a 35.8% chance of changing the partner’s emotion from Sadness to Joy.
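One way to read these percentages is as a conditional probability: given that a user expressed one emotion and the partner then replied with a particular emotion, how often does the user's next tweet carry the target emotion? The sketch below assumes that reading, and assumes conversations are lists of `(speaker, emotion)` turns. The function name `influence_probability` and the toy data are hypothetical; the talk does not spell out the exact estimator.

```python
from collections import Counter


def influence_probability(conversations, start, end, partner_emotion):
    """Estimate P(user moves start -> end | partner expressed partner_emotion).

    Each conversation is a chronological list of (speaker, emotion) turns.
    Only triples of the form user, partner, same user are counted.
    """
    triples = Counter()
    contexts = Counter()
    for turns in conversations:
        for (s1, e1), (s2, e_mid), (s3, e2) in zip(turns, turns[1:], turns[2:]):
            if s1 == s3 and s1 != s2:
                contexts[(e1, e_mid)] += 1
                triples[(e1, e_mid, e2)] += 1

    denominator = contexts[(start, partner_emotion)]
    if denominator == 0:
        return 0.0  # no observations for this context
    return triples[(start, partner_emotion, end)] / denominator


# Toy, made-up conversations for illustration only.
toy_turns = [
    [("A", "Sadness"), ("B", "Joy"), ("A", "Joy")],
    [("A", "Sadness"), ("B", "Joy"), ("A", "Sadness")],
    [("A", "Sadness"), ("B", "Anger"), ("A", "Sadness"),
     ("B", "Joy"), ("A", "Joy")],
]
sadness_to_joy = influence_probability(toy_turns, "Sadness", "Joy", "Joy")
print(round(sadness_to_joy, 3))
```

Returning 0.0 for an unseen context is a simplification; with sparse data a smoothed estimate may be preferable.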
  • 27. Self-disclosure topic model. JinYeong Bak, Chin-Yew Lin, and Alice Oh. ACL 2014 Workshop on Social Dynamics and Personal Attributes.
  • 28. Self-disclosure Research using Twitter • People disclose personal and secret information • to build and maintain interpersonal relationships • to get social support • Twitter is a great source of naturally occurring, large-scale, longitudinal data on self-disclosure behavior • We develop a topic model for classifying self-disclosure behavior into three levels: G (general, no disclosure), M (medium disclosure), H (high disclosure) • We look at the correlation between self-disclosure behavior and the frequency of Twitter conversations in longitudinal data
  • 29. Self-disclosure in Twitter conversations. Conversation 1: “So my brother is going to Roskilde Festival and my mother and sister is going to England.. That leaves me, my dad and my dog.” | “@cccc why aren't you going to england?” | “@dddd because my sister is going with 3 of her friends and my mom's just there... to be there. And my sister didn't want me to come :(” Conversation 2: “I'm moving out.” | “@xxxx ??? What's going on bb?” | “@yyyy Mother. Done with her. I am planning to get out now. There's nothing I can do, we dont get along” | “@xxxx I'm.sorry hunn. That's rough. Where are you going to go though?” | “@yyyy Probably stay at a friends place in the timebeing until I find a place to live!” | “@xxxx :/ well I'm glad your getting out if she is being horrible to you” Conversation 3: “Oh, prepregnancy pants, you are so uncomfortable.” | “@eeee You can put them on? Jealous.” | “@ffff they are cutting into my flesh and are giving me a ridiculous muffin top. It isn't pretty. But we have company coming over.” | “@eeee Yea, I tried yesterday. I got one pair of shorts to button painfully and my jeans just laughed at me.”
  • 30. Data • Full data • 88k users, 51k dyads • 1.3M conversations • 10.5M tweets • Longitudinal data from August 2007 to July 2013 • Labeled data (gold standard for self-disclosure level) • 101 conversations • 673 tweets 30
  • 31. Graphical Representation of SDTM: 3 sets of topics, one each for the G, M, and H levels. By using a topic model, we can • classify the levels of disclosure • discover topics associated with each level • generalize to other social media sites using the same set of seed words
  • 32. Seed Words • Medium level: frequent trigrams for personally identifiable information • High level: automatically extracted from the sixbillionsecrets website
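The medium-level seeds are described as frequent trigrams. A minimal sketch of trigram counting follows, assuming lowercased whitespace tokenization. The function name `top_trigrams` and the sample tweets are illustrative only; they are not the authors' data or their actual seed list.

```python
from collections import Counter


def top_trigrams(tweets, k=3):
    """Return the k most frequent word trigrams across the given tweets."""
    counts = Counter()
    for tweet in tweets:
        tokens = tweet.lower().split()
        # zip over three offset views of the token list yields all trigrams.
        counts.update(zip(tokens, tokens[1:], tokens[2:]))
    return counts.most_common(k)


# Made-up example tweets for illustration only.
sample_tweets = [
    "my name is Sam",
    "hi my name is Alex",
    "my name is not important",
]
print(top_trigrams(sample_tweets, k=1))
# → [(('my', 'name', 'is'), 3)]
```

In practice the candidate trigrams would still need filtering for those that actually introduce personally identifiable information; that step is not shown here.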
  • 33. Classification Results • Direct classification using the models • Classification with SVM using features learned from the models
  • 35. SD level & conversation frequency
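The relationship between self-disclosure level and conversation frequency is described earlier in the talk as a correlation. As a hedged illustration, the sketch below computes a Pearson correlation between a per-dyad self-disclosure score and a conversation count. The variable names and the numbers are made up, and the talk does not state which correlation measure was used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-dyad values: mean self-disclosure score and number of
# conversations. These numbers are invented for the example.
sd_scores = [0.1, 0.2, 0.3, 0.4]
conversation_counts = [2, 4, 6, 8]
r = pearson(sd_scores, conversation_counts)
print(round(r, 3))
```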
  • 36. Sociolinguistic Analysis of Twitter in Multilingual Societies. Suin Kim, Ingmar Weber, Li Wei, and Alice Oh. Under review.
  • 37. Data
  • 38. Data
  • 40. How are they connected? • English monolinguals and X-EN bilinguals bridge the network
  • 41. Closer look at Bilinguals: Which language do they choose?
  • 42. Closer look at Bilinguals: Hashtag usage
  • 43. Closer look at Bilinguals: Topics (Results of LDA)
  • 44. Closer look at Bilinguals: Topics (Results of LDA)
  • 45. Future directions • Develop a model for predicting language choice in bilinguals • Look at how English is used throughout the world • Cognitive studies of first and second languages • Self-disclosure and relationship building • Email me about data sharing, collaboration, or discussion • alice.oh@kaist.edu