What do you really mean when you tweet? Challenges for opinion mining on social media.

What do you really mean when you tweet?
Challenges for opinion mining on social media
Dr. Diana Maynard
University of Sheffield, UK

The Social Web

Information, thoughts
and opinions are shared
prolifically these days on
the social web

Who cares about social media though?

Isn't Twitter just full of
stupid messages about
Justin Bieber?

Well, social media has other uses too

http://socialmediatoday.com/node/1568271









One in six people have used social media to get information about an
emergency
One in two people would sign up for emails, text alerts, or applications
to receive any of the emergency information.
75% of people would use Facebook to post eyewitness information on
an emergency or newsworthy event; 22% would use blogs, 21% would
use Twitter
During an emergency, one in two people would use social media to let
loved ones know they are safe

It's all a bit new-fangled, isn't it?
●

Well actually, social media goes back a long way

●

The first email was sent in 1971

●

But it really goes back much further

●

●

The first documented postal service was in 550BC, although there was
evidence of written couriers long before that
However, communication speed is a little faster these days!

Drowning in information
• It can be difficult to get the
relevant information out of such
large volumes of data in a useful
way
• Social web analysis is all about the
users who are actively engaged
and generate content
• Social networks are pools of a
wide range of articulation
methods, from simple "I like it"
buttons to complete articles

Opinion Mining
• Along with NER, opinion
mining is a key component
in social web analysis
• NER: names of people,
organisations, locations
• Opinion mining: what
sentiments are being
expressed?

Opinion Mining is about finding out what people
think...

And one for the Portuguese speakers :-)

It's not just about product reviews
•

Much opinion mining research has been focused around
reviews of films, books, electronics etc.

• But there are many other uses
– companies want to know what people think
– finding out political and social opinions and moods
– investigating how public mood influences the stock market
– investigating and preserving community memories
– drawing inferences from social analytics

And taking it a step further
It allows us to answer questions like:
• What are the opinions on crucial social
events and the key people involved?
• How are these opinions distributed in
relation to demographic user data?
• How have these opinions evolved?
• Who are the opinion leaders?
• What is their impact and influence?

Analysing Public Mood
• Closely related to opinion mining is the
analysis of sentiment and mood
• Mood of the Nation project at Bristol
University
http://geopatterns.enm.bris.ac.uk/mood/
• Mood has proved more useful than
sentiment for things like stock market
prediction (fluctuations are driven mainly
by fear rather than by things like
happiness or sadness)

Derwent Capital Markets
●

●

●

●

Derwent Capital Markets launched a £25m fund in 2011 that made its
investments via social media analysis by evaluating whether people
are generally happy, sad, anxious or tired
DCM Capital used a proprietary algorithm to research the public
sentiment of stock, primarily through Twitter, to attempt to predict the
movements of the Dow Jones Industrial Average.
Bollen told the Sunday Times: "We recorded the sentiment of the
online community, but we couldn't prove if it was correct. So we
looked at the Dow Jones to see if there was a correlation. We believed
that if the markets fell, then the mood of people on Twitter would
fall.”
"But we realised it was the other way round — that a drop in the mood
or sentiment of the online community would precede a fall in the
market.”

But it didn't quite work out as planned...
●

●

●

●

●

●

It was later suggested that there are actually many flaws in Bollen's
work, and that it's impossible to predict the stock market in this way
The "Twitter Fund"─ formally, The Derwent Absolute Return Fund ─ was
launched in July 2011, but failed to survive the summer, despite posting
initial returns, and the company was sold for peanuts in Feb 2013
There's quite a lot of sloppiness in the reporting of methodology and
results, so it's not clear what can really be trusted
The advertised results are biased by selection (they picked the winners
after the race and tried to show correlation)
The accuracy claim is too general to be useful (you can't predict
individual stock prices, only the general trend)
However, most trading companies now use some form of social media
analysis to help with prediction, though it's usually quite shallow

Transatlantic Trends







This annual diplomatic report is a manually collected survey of US
and European public opnion
It informs politicians in international relations by revealing reasoning
behind multilateral negotiations
But it's expensive and time-consuming to create - the kind of thing
that global sentiment analysis can replace, and in real-time, instead
of annually

Twitter Gives you Flu!
●

●

●

Researchers at the University
of Rochester used
twitter analysis to predict who
would get flu
They looked at the role of
interactions between users on
social media on the real-life
spread of the disease
Researchers at Johns Hopkins
also reckon they can
do better at flu tracking via
Twitter analysis than the CDC.

The Social Oscars 2013
Brandwatch ran a project to investigate how closely public opinion
predicted/mirrored the results of the 2013 Oscars

Tracking opinions over time
●
●

●

●

Opinions can be extracted with a time stamp and/or a geo-location
We can then analyse changes to opinions about the same
entity/event over time, and other statistics
We can also measure the impact of an entity or event on the overall
sentiment about an entity or another event, over the course of time
(e.g. in politics)
Also possible to incorporate statistical (non-linguistic) techniques to
investigate dynamics of opinions, e.g. find statistical correlations
between interest in certain topics or entities/events and
number/impact/influence of tweets etc.

Viewing opinion changes over time

Mapping dynamics from social media: UK riots demo

Opinion mining is like “Ask the Audience”

But be careful!

Sentiment analyis isn't just about looking at the sentiment words
●

●
●

“It's a great movie if you have the taste and sensibilities of a 5-year-old
boy.”
“It's terrible Candidate X did so well in the debate last night.”
“I'd have liked the film a lot more if it had been a bit shorter.”

Situation is everything. If you and I are best friends, then my graceful
swearing at you is different than if it’s at my boss.

Death confuses opinion mining tools



Opinion mining
tools are good for a
general overview,
but not for some
situations

Whitney Houston wasn't very popular...

Why are many opinion mining tools unsuccessful?
• They don't work well at more than a very basic level
• They mainly use dictionary lookup for positive and negative
words
• They classify the tweets as positive or negative, but not with
respect to the keyword you're searching for
• First, the keyword search just retrieves any tweet mentioning
it, but not necessarily about it as a topic
• Second, there is no correlation between the keyword and the
sentiment: the sentiment refers to the tweet as a whole
• Sometimes this is fine, but it can also go horribly wrong

Why bother with opinion mining?
• It depends what kind of information you want
• Don't use opinion mining tools to help you win money on
quiz shows
• Recent research has shown that one knowledgeable
analyst is better than gathering general public sentiment
from lots of analysts and taking the majority opinion
• But only for some kinds of tasks
• If you want a general overview about public sentiment
on a topic like the Olympic Games or Justin Bieber, it'll
probably work out OK

Challenges imposed by social media
• Language: incorrect use of language makes NLP hard
●
Solution: specific pre-processing for Twitter. use shallow
analysis techniques with back-off strategies; incorporate
specific subcomponents for swear words, sarcasm etc.
• Relevance: topics and comments can rapidly diverge.
●

Solution: train a classifier or use clustering techniques

• Lack of context: hard to disambiguate entities
●
Solution: use metadata for further information, also
aggregation of data can be useful

Analysing language in social media
●

Sumbuddy: Hey, hao es your familie?
Guy: They got crushed by a bus and died.
Sumbuddy: Daz so sad...wanna get iscreem?

●

●

OMMMFG!!! JUST HEARD EMINEM'S “RAPGOD”. SMFH!!!
these other dudes might as well stop rapping if they not on
this level
@adambation Try reading this article , it looks like it would be
really helpful and not obvious at all #sarcasm
http://t.co/mo3vODoX

Short sentences in tweets
• Social media, and especially tweets, can be problematic because
sentences are very short and/or incomplete
• Typically, linguistic pre-processing tools such as tokenisers, POS
taggers and parsers do badly on such texts
• Even language identification tools can have problems
• Need for special NLP pre-processing tools

Lack of context causes ambiguity
Branching out from Lincoln park after dark ... Hello Russian Navy, it's
like the same thing but with glitter!

??

Getting the NEs right is crucial
Branching out from Lincoln park after dark ... Hello Russian Navy, it's like
the same thing but with glitter!

The Problem with NER
• Running standard IE tools (ANNIE) on 300 news articles – 87% Fmeasure

• Running ANNIE on some tweets - < 40% F-measure

Example: Persons in news articles

Language identification is tricky
●

Language identification tools such as TextCat need a decent
amount of text (around 20 words at least)

●

But Twitter has an average of only 10 tokens/tweet

●

Noisy nature of the words (abbreviations, misspellings).

●

Due to the length of the text, we can make the assumption that one
tweet is written in only one language

●

We have adapted the TextCat language identification plugin

●

Provided fingerprints for 5 languages: DE, EN, FR, ES, NL

●

You can extend it to new languages easily

Language detection examples
●

x

Tokenisation
• Plenty of “unusual”, but very important tokens in social
media:
– @Apple – mentions of company/brand/person names
– #fail, #SteveJobs – hashtags expressing sentiment, person
or company names
– :-(, :-), :-P – emoticons (punctuation and optionally letters)
– URLs
• Tokenisation is crucial for entity recognition and opinion
mining

Example
#WiredBizCon #nike vp said when @Apple saw what
http://nikeplus.com did, #SteveJobs was like wow I didn't expect
this at all.

Tokenising on white space doesn't work that well:

Nike and Apple are company names, but if we have tokens such
as #nike and @Apple, this will make the entity recognition
harder, as it will need to look at sub-token level

Tokenising on white space and punctuation characters doesn't
work well either: URLs get separated (http, nikeplus), as are
emoticons and email addresses

The TwitIE Tokeniser
●

●

●

●

Treat RTs and URLs as 1 token each
#nike is two tokens (# and nike) plus a separate annotation
Hashtag covering both. Same for @mentions -> UserID
Capitalisation is preserved, but an orthography feature is
added: all caps, lowercase, mixCase
Date and phone number normalisation, lowercasing, and
emoticons are optionally done later in separate modules

●

Consequently, tokenisation is faster and more generic

●

Also, more tailored to our NER module

Normalisation
• “RT @Bthompson WRITEZ: @libbyabrego honored?! Everybody
knows the libster is nice with it...lol...(thankkkks a bunch;))”
• OMG! I’m so guilty!!! Sprained biibii’s leg! ARGHHHHHH!!!!!!
• Similar to SMS normalisation
• For some later components to work well (POS tagger, parser), it
is necessary to produce a normalised version of each token
• BUT uppercasing, and letter and exclamation mark repetition
often convey strong sentiment, so we keep both versions of
tokens
• Syntactic normalisation: determine when @mentions and #tags
have syntactic value and should be kept in the sentence, vs
replies, retweets and topic tagging

A normalised example

●

●

Normaliser currently based on spelling correction and some lists of
common abbreviations
Outstanding issues:
●

●

Some abbreviations which span token boundaries (e.g. gr8, do n’t)
difficult to handle
Capitalisation and punctuation normalisation

What's in a hashtag?
●

●

●

Hashtags often contain smushed words
●
#SteveJobs
●
#CombineAFoodAndABand
●
#southamerica
For NER we want the individual tokens so
we can link them to the right entity
For opinion mining, individual words in
the hashtags often indicate sentiment,
sarcasm etc.
●
#greatidea
●
#worstdayever

How to analyse hashtags?
●

●

●

●

Camelcasing makes it relatively easy to separate the words,
using an adapted tokeniser, but many people don't bother
We use a simple approach based on dictionary matching the
longest consecutive strings, working L to R
●
#lifeisgreat -> #-life-is-great
●
#lovinglife -> #-loving-life
It's not foolproof, however
●
#greatstart -> #-greats-tart
To improve it, we could use contextual information, or we
could restrict matches to certain POS combinations (ADJ+N is
more likely than ADJ+V)

Irony and sarcasm
• I had never seen snow in Holland before but thanks to twitter and
facebook I now know what it looks like. Thanks guys, awesome!
• Life's too short, so be sure to read as many articles about celebrity
breakups as possible.
• I feel like there aren't enough singing competitions on TV .
#sarcasmexplosion
• I wish I was cool enough to stalk my ex-boyfriend ! #sarcasm
#bitchtweet
• On a bright note if downing gets injured we have Henderson to
come in

Sarcasm is a part of British culture
●

So much so that the BBC has its own webpage on sarcasm
designed to teach non-native English speakers how to be
sarcastic successfully in conversation

How do you know when someone is being
sarcastic?
• Use of hashtags in tweets such as #sarcasm, #irony, #whoknew etc.
• Large collections of tweets based on hashtags can be used to make
a training set for machine learning
• But you still have to know what to do with sarcasm once you've
found it
• Although sarcasm generally entails saying the opposite of what you
mean, it doesn't necessarily just invert the polarity of an opinion
• “It's not like I wanted to eat breakfast anyway” is negative when
uttered sarcastically, but non-opinionated when uttered neutrally.

Identifying the scope of sarcasm
I am not happy that I woke up at 5:15 this morning.

#greatstart #sarcasm

You are really mature. #lying #sarcasm

Experiment with sarcastic hashtags










Collected a corpus of 134 tweets containing the hashtag
#sarcasm
Manually annotated sentences with sentiment

266 sentences, of which 68 opinionated (25%)

62 negative, 6 positive
Also annotated the same corpus as if the sarcasm was absent
Compared how well our applications performed on each, with
and without sarcasm analysis
The results were a little surprising
Even when we KNEW the statement was sarcastic, we didn't
always get the polarity of the opinion right

Effect of sarcasm on sentiment analysis
Sarcastic corpus

Precision

Recall

F1

Opinionated

74.58

63.77

68.75

Opinion+polarity - Regular

20.34

17.39

18.75

Polarity-only - Regular

27.27

27.27

27.27

Opinion+polarity - Sarcastic

57.63

49.28

53.13

Polarity-only - Sarcastic

77.02

77.28

77.28

Regular corpus
Opinionated
Opinion+polarity - Regular

Precision
57.89
45.61

Recall
58.93
46.43

F1
58.41
46.02

Polarity-only - Regular

78.79

78.79

78.79

Opinion+polarity - Sarcastic

22.81

23.21

23.01

Polarity-only - Sarcastic

39.40

39.39

39.39

What about non-textual content?

We can also do opinion mining on images and
multimedia

Image-opinion identification
• Facial expression analysis/classification
–
Helps with facial similarity calculations and face
recognition
–
Can be used to predict sentiment/polarity
–
Can be combined with analysis text from
document
●

Coarse-grained opinion classification
–
Looking at image-feature classification for
abstract concepts (sentiment / privacy /
attractiveness)
–
e.g. looking at image colours, placement of
interesting images in the picture

Multimodal opinion analysis


Investigate correlation between images and
whole-document opinions








Do documents asserting specific opinions
get illustrated with the same imagery?
e.g. articles about euro-scepticism in the
UK might be illustrated with images of
specific Conservative peers….
Is there correlation between low-level
image features and specific opinions?

Investigate finer-grained (i.e. sub-document)
correlations between imagery and opinions


e.g. sentence-level correlations
incorporating analysis of the document
layout

Demo: extracting opinions from images

So where does this leave us?
●

Social media is a tricky but interesting medium to analyse

●

Opinion mining is ubiquitous, but it's still far from perfect

●

●

●
●

●

There are lots of linguistic and social quirks that fool sentiment
analysis tools.
The good news is that this means there are lots of interesting
problems for us to research
And it doesn’t mean we shouldn’t use existing opinion mining tools
The benefits of a modular approach mean that we can pick the bits
that are most useful
Take-away message: it is critical to use the right tool for the right job

Don't be misled by the advertising: caveat emptor!

Further information
• Research supported by the EU-funded ARCOMEM, uComp and
TrendMiner projects
• See http://www.arcomem.eu and http://www.trend-miner.eu for
more details
• More information about GATE at http://gate.ac.uk
• Opinion mining demo:
http://demos.gate.ac.uk/arcomem/opinions/
• Learn about the technical details in the STIL 2013 tutorial: Practical
Opinion Mining for social media (Wednesday 11.30am)

What do you really mean when you tweet? Challenges for opinion mining on social media.

Recomendados

Recomendados

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Similar a What do you really mean when you tweet? Challenges for opinion mining on social media.

Similar a What do you really mean when you tweet? Challenges for opinion mining on social media. (20)

Más de Diana Maynard

Más de Diana Maynard (15)

Último

Último (20)

What do you really mean when you tweet? Challenges for opinion mining on social media.