
What do you really mean when you tweet? Challenges for opinion mining on social media.

This talk, given at BRACIS 2013, introduces opinion mining and social media analytics, looking in particular at the challenges they pose for an NLP system. It investigates the impact of non-standard text in social media (sarcasm, swear words, non-words, short sentences, multiple languages and so on), which prevents current NLP tools from producing good analyses, and examines tools being developed in some current cutting-edge research projects, covering not only text-based research but also multimedia analysis.

Published in: Technology, Business

  1. 1. What do you really mean when you tweet? Challenges for opinion mining on social media Dr. Diana Maynard University of Sheffield, UK
  2. 2. The Social Web Information, thoughts and opinions are shared prolifically these days on the social web
  3. 3. Who cares about social media though? Isn't Twitter just full of stupid messages about Justin Bieber?
  4. 4. Well, social media has other uses too
  5. 5. ● One in six people have used social media to get information about an emergency ● One in two people would sign up for emails, text alerts, or applications to receive emergency information ● 75% of people would use Facebook to post eyewitness information on an emergency or newsworthy event; 22% would use blogs, 21% would use Twitter ● During an emergency, one in two people would use social media to let loved ones know they are safe
  6. 6. It's all a bit new-fangled, isn't it? ● Well actually, social media goes back a long way ● The first email was sent in 1971 ● But it really goes back much further ● The first documented postal service was in 550 BC, although there is evidence of couriers carrying written messages long before that ● However, communication speed is a little faster these days!
  7. 7. Let's rewind a little...
  8. 8. Drowning in information • It can be difficult to get the relevant information out of such large volumes of data in a useful way • Social web analysis is all about the users who are actively engaged and generate content • Social networks are pools of a wide range of articulation methods, from simple "I like it" buttons to complete articles
  9. 9. Opinion Mining • Along with NER, opinion mining is a key component in social web analysis • NER: names of people, organisations, locations • Opinion mining: what sentiments are being expressed?
  10. 10. Opinion Mining is about finding out what people think...
  11. 11. Amazon book reviews
  12. 12. TripAdvisor Hotel reviews
  13. 13. And one for the Portuguese speakers :-)
  14. 14. Rotten Tomatoes Film Reviews
  15. 15. It's not just about product reviews • Much opinion mining research has been focused around reviews of films, books, electronics etc. • But there are many other uses – companies want to know what people think – finding out political and social opinions and moods – investigating how public mood influences the stock market – investigating and preserving community memories – drawing inferences from social analytics
  16. 16. And taking it a step further It allows us to answer questions like: • What are the opinions on crucial social events and the key people involved? • How are these opinions distributed in relation to demographic user data? • How have these opinions evolved? • Who are the opinion leaders? • What is their impact and influence?
  17. 17. Analysing Public Mood • Closely related to opinion mining is the analysis of sentiment and mood • Mood of the Nation project at Bristol University • Mood has proved more useful than sentiment for things like stock market prediction (fluctuations are driven mainly by fear rather than by things like happiness or sadness)
  18. 18. Derwent Capital Markets ● Derwent Capital Markets launched a £25m fund in 2011 that made its investments via social media analysis, evaluating whether people are generally happy, sad, anxious or tired ● DCM Capital used a proprietary algorithm to research public sentiment about stocks, primarily through Twitter, in an attempt to predict the movements of the Dow Jones Industrial Average ● Bollen told the Sunday Times: "We recorded the sentiment of the online community, but we couldn't prove if it was correct. So we looked at the Dow Jones to see if there was a correlation. We believed that if the markets fell, then the mood of people on Twitter would fall." ● "But we realised it was the other way round: that a drop in the mood or sentiment of the online community would precede a fall in the market."
  19. 19. But it didn't quite work out as planned... ● It was later suggested that there are actually many flaws in Bollen's work, and that it's impossible to predict the stock market in this way ● The "Twitter Fund" (formally, The Derwent Absolute Return Fund) was launched in July 2011, but failed to survive the summer, despite posting initial returns, and the company was sold for peanuts in Feb 2013 ● There's quite a lot of sloppiness in the reporting of methodology and results, so it's not clear what can really be trusted ● The advertised results are biased by selection (they picked the winners after the race and tried to show correlation) ● The accuracy claim is too general to be useful (you can't predict individual stock prices, only the general trend) ● However, most trading companies now use some form of social media analysis to help with prediction, though it's usually quite shallow
  20. 20. Transatlantic Trends ● This annual diplomatic report is a manually collected survey of US and European public opinion ● It informs politicians in international relations by revealing the reasoning behind multilateral negotiations ● But it's expensive and time-consuming to create: the kind of thing that global sentiment analysis can replace, and in real time instead of annually
  21. 21. Twitter Gives you Flu! ● Researchers at the University of Rochester used Twitter analysis to predict who would get flu ● They looked at the role of interactions between users on social media in the real-life spread of the disease ● Researchers at Johns Hopkins also reckon they can do better at flu tracking via Twitter analysis than the CDC
  22. 22. The Social Oscars 2013 Brandwatch ran a project to investigate how closely public opinion predicted/mirrored the results of the 2013 Oscars
  23. 23. Tracking opinions over time ● Opinions can be extracted with a time stamp and/or a geo-location ● We can then analyse changes to opinions about the same entity/event over time, and other statistics ● We can also measure the impact of an entity or event on the overall sentiment about an entity or another event, over the course of time (e.g. in politics) ● Also possible to incorporate statistical (non-linguistic) techniques to investigate dynamics of opinions, e.g. find statistical correlations between interest in certain topics or entities/events and number/impact/influence of tweets etc.
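As a minimal sketch of the time-stamped tracking described on this slide, the snippet below averages opinion scores per entity per day. The record layout (ISO timestamp, entity, score in [-1, 1]), the function name and the sample data are all assumptions made for illustration; they are not taken from the talk or from any of the tools it mentions.

```python
from collections import defaultdict
from datetime import datetime


def mean_sentiment_by_day(opinions):
    """Average sentiment per (entity, day).

    `opinions` is an iterable of (iso_timestamp, entity, score) tuples,
    where score is assumed to lie in [-1, 1]. This layout is hypothetical.
    """
    buckets = defaultdict(list)
    for timestamp, entity, score in opinions:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        buckets[(entity, day)].append(score)
    return {key: sum(scores) / len(scores) for key, scores in buckets.items()}


# Made-up sample data, for illustration only.
sample = [
    ("2013-10-20T09:00:00", "Candidate X", 1.0),
    ("2013-10-20T18:30:00", "Candidate X", -1.0),
    ("2013-10-21T08:15:00", "Candidate X", -1.0),
]
trend = mean_sentiment_by_day(sample)
```

Bucketing by day is an arbitrary choice; the same grouping works for hours, weeks, or geo-location cells if the key is changed accordingly.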
  24. 24. Viewing opinion changes over time
  25. 25. Mapping dynamics from social media: UK riots demo
  26. 26. Opinion mining is like “Ask the Audience”
  27. 27. But be careful! Sentiment analysis isn't just about looking at the sentiment words ● “It's a great movie if you have the taste and sensibilities of a 5-year-old boy.” ● “It's terrible Candidate X did so well in the debate last night.” ● “I'd have liked the film a lot more if it had been a bit shorter.” ● Situation is everything. If you and I are best friends, then my graceful swearing at you is different than if it’s at my boss.
  28. 28. Death confuses opinion mining tools ● Opinion mining tools are good for a general overview, but not for some situations
  29. 29. Whitney Houston wasn't very popular...
  30. 30. Or was she?
  31. 31. Why are many opinion mining tools unsuccessful? • They don't work well at more than a very basic level • They mainly use dictionary lookup for positive and negative words • They classify the tweets as positive or negative, but not with respect to the keyword you're searching for • First, the keyword search just retrieves any tweet mentioning it, but not necessarily about it as a topic • Second, there is no correlation between the keyword and the sentiment: the sentiment refers to the tweet as a whole • Sometimes this is fine, but it can also go horribly wrong
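The failure mode described on this slide can be sketched as follows: a keyword search plus a dictionary lookup that labels the whole tweet. The word lists, function names and the second sample tweet are invented for illustration; this is a caricature of the kind of tool being criticised, not any particular product.

```python
import re

# Tiny hypothetical sentiment dictionaries, for illustration only.
POSITIVE = {"great", "good", "love", "awesome"}
NEGATIVE = {"terrible", "bad", "hate", "awful"}


def naive_polarity(tweet):
    """Label the tweet as a whole by counting dictionary hits."""
    words = re.findall(r"[a-z']+", tweet.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def keyword_search(keyword, tweets):
    """Return every tweet that mentions the keyword, with its whole-tweet label."""
    return [(t, naive_polarity(t)) for t in tweets if keyword.lower() in t.lower()]


movie = "It's a great movie if you have the taste and sensibilities of a 5-year-old boy."
tribute = "So sad and terrible to hear about Whitney Houston"  # made-up example
results = keyword_search("whitney", [movie, tribute])
```

Here `naive_polarity(movie)` comes out as positive because of "great", although the reviewer is being dismissive, and the tribute tweet is returned for the keyword "whitney" with a negative label even though the negativity is about the news, not the singer.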
  32. 32. Why bother with opinion mining? • It depends what kind of information you want • Don't use opinion mining tools to help you win money on quiz shows • Recent research has shown that one knowledgeable analyst is better than gathering general public sentiment from lots of analysts and taking the majority opinion • But only for some kinds of tasks • If you want a general overview about public sentiment on a topic like the Olympic Games or Justin Bieber, it'll probably work out OK
  33. 33. Challenges imposed by social media • Language: incorrect use of language makes NLP hard ● Solution: specific pre-processing for Twitter; use shallow analysis techniques with back-off strategies; incorporate specific subcomponents for swear words, sarcasm etc. • Relevance: topics and comments can rapidly diverge ● Solution: train a classifier or use clustering techniques • Lack of context: hard to disambiguate entities ● Solution: use metadata for further information; aggregation of data can also be useful
  34. 34. Analysing language in social media ● Sumbuddy: Hey, hao es your familie? Guy: They got crushed by a bus and died. Sumbuddy: Daz so sad...wanna get iscreem? ● OMMMFG!!! JUST HEARD EMINEM'S “RAPGOD”. SMFH!!! these other dudes might as well stop rapping if they not on this level ● @adambation Try reading this article , it looks like it would be really helpful and not obvious at all #sarcasm
  35. 35. Short sentences in tweets • Social media, and especially tweets, can be problematic because sentences are very short and/or incomplete • Typically, linguistic pre-processing tools such as tokenisers, POS taggers and parsers do badly on such texts • Even language identification tools can have problems • Need for special NLP pre-processing tools
  36. 36. Lack of context causes ambiguity Branching out from Lincoln park after dark ... Hello Russian Navy, it's like the same thing but with glitter! ??
  37. 37. Getting the NEs right is crucial Branching out from Lincoln park after dark ... Hello Russian Navy, it's like the same thing but with glitter!
  38. 38. The Problem with NER • Running standard IE tools (ANNIE) on 300 news articles – 87% F-measure • Running ANNIE on some tweets – < 40% F-measure
  39. 39. Example: Persons in news articles
  40. 40. Example: Persons in tweets
  41. 41. TwitIE to the rescue
  42. 42. Language identification is tricky ● Language identification tools such as TextCat need a decent amount of text (around 20 words at least) ● But Twitter has an average of only 10 tokens/tweet ● Noisy nature of the words (abbreviations, misspellings). ● Due to the length of the text, we can make the assumption that one tweet is written in only one language ● We have adapted the TextCat language identification plugin ● Provided fingerprints for 5 languages: DE, EN, FR, ES, NL ● You can extend it to new languages easily
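To illustrate what a language "fingerprint" can look like, here is a rough sketch of rank-ordered character n-gram profiles compared with an out-of-place distance, the general technique TextCat-style identifiers are usually described as using. It is not the adapted GATE plugin: the function names, the penalty value and the two one-sentence training samples are assumptions, and real fingerprints are built from far more text.

```python
from collections import Counter


def fingerprint(text, max_n=3, top=300):
    """Map each of the most frequent character n-grams (n = 1..max_n) to its rank."""
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(
        padded[i:i + n]
        for n in range(1, max_n + 1)
        for i in range(len(padded) - n + 1)
    )
    return {gram: rank for rank, (gram, _) in enumerate(counts.most_common(top))}


def out_of_place(doc_fp, lang_fp, penalty=300):
    """Sum of rank differences; n-grams missing from the profile cost `penalty`."""
    return sum(
        abs(rank - lang_fp[gram]) if gram in lang_fp else penalty
        for gram, rank in doc_fp.items()
    )


def identify(text, profiles):
    """Pick the single language whose profile is closest to the whole text."""
    doc_fp = fingerprint(text)
    return min(profiles, key=lambda lang: out_of_place(doc_fp, profiles[lang]))


# Toy training samples (hypothetical); far too small for real use.
SAMPLES = {
    "EN": "what do you really mean when you tweet about the weather",
    "NL": "wat bedoel je echt als je over het weer tweet",
}
PROFILES = {lang: fingerprint(sample) for lang, sample in SAMPLES.items()}
```

`identify` returns exactly one label per text, which mirrors the one-language-per-tweet assumption on the slide. With only around 10 tokens per tweet the document fingerprint is very sparse, which is why short, noisy input is hard for this family of methods.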
  43. 43. Language detection examples
  44. 44. Tokenisation • Plenty of “unusual”, but very important tokens in social media: – @Apple – mentions of company/brand/person names – #fail, #SteveJobs – hashtags expressing sentiment, person or company names – :-(, :-), :-P – emoticons (punctuation and optionally letters) – URLs • Tokenisation is crucial for entity recognition and opinion mining
  45. 45. Example #WiredBizCon #nike vp said when @Apple saw what did, #SteveJobs was like wow I didn't expect this at all. ● Tokenising on white space doesn't work that well: Nike and Apple are company names, but if we have tokens such as #nike and @Apple, this will make the entity recognition harder, as it will need to look at sub-token level ● Tokenising on white space and punctuation characters doesn't work well either: URLs get separated (http, nikeplus), as are emoticons and email addresses
  46. 46. The TwitIE Tokeniser ● Treat RTs and URLs as 1 token each ● #nike is two tokens (# and nike) plus a separate annotation Hashtag covering both. Same for @mentions -> UserID ● Capitalisation is preserved, but an orthography feature is added: all caps, lowercase, mixCase ● Date and phone number normalisation, lowercasing, and emoticons are optionally done later in separate modules ● Consequently, tokenisation is faster and more generic ● Also, more tailored to our NER module
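The tokenisation rules listed on this slide can be approximated with a single regular expression. The sketch below is not the TwitIE tokeniser itself (that is a GATE component); the pattern, the dictionary layout and the sample tweet with its placeholder URL are assumptions chosen to show the idea: URLs and RT stay whole, `#` and `@` are separate tokens, and Hashtag/UserID annotations span the symbol plus the following word.

```python
import re

# Order matters: URLs and RT are tried before generic words and symbols.
TOKEN_RE = re.compile(r"""
    (?P<url>https?://\S+)
  | (?P<rt>\bRT\b)
  | (?P<emoticon>[:;=][-o^']?[()\[\]DPp/\\|])
  | (?P<word>\w+(?:'\w+)?)
  | (?P<symbol>\S)
""", re.VERBOSE)


def orthography(text):
    """Orthography feature; capitalisation itself is left untouched."""
    if text.isupper():
        return "allCaps"
    if text.islower():
        return "lowercase"
    return "mixedCase"


def tokenise(text):
    tokens = []
    for m in TOKEN_RE.finditer(text):
        kind = m.lastgroup
        tokens.append({
            "text": m.group(), "kind": kind,
            "start": m.start(), "end": m.end(),
            "orth": orthography(m.group()) if kind == "word" else None,
        })
    # A '#' or '@' token directly followed by a word gets a covering annotation.
    annotations = []
    for prev, cur in zip(tokens, tokens[1:]):
        if (prev["text"] in ("#", "@") and cur["kind"] == "word"
                and prev["end"] == cur["start"]):
            label = "Hashtag" if prev["text"] == "#" else "UserID"
            annotations.append({"type": label, "start": prev["start"],
                                "end": cur["end"],
                                "text": text[prev["start"]:cur["end"]]})
    return tokens, annotations


# Sample tweet with a placeholder URL, for illustration only.
tokens, annotations = tokenise("RT @Apple: #nike rocks :-) http://example.com/x")
```

Keeping `nike` as its own token lets an entity recogniser match the company name directly, while the Hashtag annotation preserves the fact that it was tagged.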
  47. 47. Normalisation • “RT @Bthompson WRITEZ: @libbyabrego honored?! Everybody knows the libster is nice with a bunch;))” • OMG! I’m so guilty!!! Sprained biibii’s leg! ARGHHHHHH!!!!!! • Similar to SMS normalisation • For some later components to work well (POS tagger, parser), it is necessary to produce a normalised version of each token • BUT uppercasing, and letter and exclamation mark repetition often convey strong sentiment, so we keep both versions of tokens • Syntactic normalisation: determine when @mentions and #tags have syntactic value and should be kept in the sentence, vs replies, retweets and topic tagging
  48. 48. A normalised example ● Normaliser currently based on spelling correction and some lists of common abbreviations ● Outstanding issues: – Some abbreviations which span token boundaries (e.g. gr8, do n’t) are difficult to handle – Capitalisation and punctuation normalisation
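One way to read "we keep both versions of tokens" is sketched below: each token carries its original form, a normalised form for the POS tagger and parser, and flags recording the emphasis that normalisation would otherwise throw away. The abbreviation list, the field names and the rule of collapsing runs of three or more characters down to two are assumptions; the actual normaliser is described only as spelling correction plus abbreviation lists, and a spelling corrector would still be needed after this step.

```python
import re

# Hypothetical abbreviation list, for illustration only.
ABBREVIATIONS = {"gr8": "great", "u": "you", "omg": "oh my god"}


def normalise_token(token):
    """Return the original token alongside a normalised version and emphasis flags."""
    # Collapse runs of 3+ identical characters to 2 (e.g. HHHHHH -> HH).
    collapsed = re.sub(r"(.)\1{2,}", r"\1\1", token)
    lowered = collapsed.lower()
    return {
        "original": token,
        "normalised": ABBREVIATIONS.get(lowered, lowered),
        "elongated": collapsed != token,
        "all_caps": len(token) > 1 and token.isupper(),
    }


shout = normalise_token("ARGHHHHHH")
abbrev = normalise_token("gr8")
```

A downstream sentiment component can then use `elongated` and `all_caps` as intensity cues while the parser sees only the `normalised` field.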
  49. 49. TwitIE NER Results
  50. 50. Analysing Hashtags
  51. 51. What's in a hashtag? ● Hashtags often contain smushed words – #SteveJobs – #CombineAFoodAndABand – #southamerica ● For NER we want the individual tokens so we can link them to the right entity ● For opinion mining, individual words in the hashtags often indicate sentiment, sarcasm etc. – #greatidea – #worstdayever
  52. 52. How to analyse hashtags? ● Camelcasing makes it relatively easy to separate the words, using an adapted tokeniser, but many people don't bother ● We use a simple approach based on dictionary matching the longest consecutive strings, working L to R – #lifeisgreat -> #-life-is-great – #lovinglife -> #-loving-life ● It's not foolproof, however – #greatstart -> #-greats-tart ● To improve it, we could use contextual information, or we could restrict matches to certain POS combinations (ADJ+N is more likely than ADJ+V)
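The two strategies on this slide, camel-case splitting and greedy longest-match dictionary lookup from left to right, can be sketched in a few lines. The toy dictionary and the function names are assumptions; the sketch deliberately reproduces the #greatstart failure rather than fixing it.

```python
import re

# Toy dictionary (hypothetical); a real system would use a full word list.
DICTIONARY = {"life", "is", "great", "greats", "start", "tart", "loving", "love"}

# Capitalised word, run of capitals not followed by lowercase, lowercase run, or digits.
CAMEL_RE = re.compile(r"[A-Z][a-z]+|[A-Z]+(?![a-z])|[a-z]+|\d+")


def greedy_segment(body, dictionary=DICTIONARY):
    """Longest dictionary match first, working left to right; None if stuck."""
    words, i = [], 0
    while i < len(body):
        for j in range(len(body), i, -1):
            if body[i:j] in dictionary:
                words.append(body[i:j])
                i = j
                break
        else:
            return None  # no dictionary word starts at position i
    return words


def split_hashtag(tag):
    """Use camel casing when the author provided it, else fall back to the dictionary."""
    body = tag.lstrip("#")
    parts = CAMEL_RE.findall(body)
    if len(parts) > 1:
        return [p.lower() for p in parts]
    return greedy_segment(body.lower())
```

With this dictionary, `split_hashtag("#lifeisgreat")` gives `['life', 'is', 'great']`, while `split_hashtag("#greatstart")` gives `['greats', 'tart']` because "greats" is the longest match at the first position. Trying all segmentations and scoring them, for instance by the POS preferences mentioned on the slide, would be one way to avoid that.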
  53. 53. Irony and sarcasm • I had never seen snow in Holland before but thanks to twitter and facebook I now know what it looks like. Thanks guys, awesome! • Life's too short, so be sure to read as many articles about celebrity breakups as possible. • I feel like there aren't enough singing competitions on TV . #sarcasmexplosion • I wish I was cool enough to stalk my ex-boyfriend ! #sarcasm #bitchtweet • On a bright note if downing gets injured we have Henderson to come in
  54. 54. Sarcasm is a part of British culture ● So much so that the BBC has its own webpage on sarcasm designed to teach non-native English speakers how to be sarcastic successfully in conversation
  55. 55. BBC sarcasm quiz
  56. 56. How do you know when someone is being sarcastic? • Use of hashtags in tweets such as #sarcasm, #irony, #whoknew etc. • Large collections of tweets based on hashtags can be used to make a training set for machine learning • But you still have to know what to do with sarcasm once you've found it • Although sarcasm generally entails saying the opposite of what you mean, it doesn't necessarily just invert the polarity of an opinion • “It's not like I wanted to eat breakfast anyway” is negative when uttered sarcastically, but non-opinionated when uttered neutrally.
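A hedged sketch of the hashtag-based labelling idea: tweets carrying a marker hashtag become positive training examples, and the marker is stripped so a classifier cannot simply learn the label from it. The marker set comes from the slide; the function name and the choice to keep non-marker hashtags in the text are assumptions.

```python
import re

# Marker hashtags named on the slide.
SARCASM_TAGS = {"#sarcasm", "#irony", "#whoknew"}
HASHTAG_RE = re.compile(r"#\w+")


def label_by_hashtag(tweet):
    """Return (text_without_marker_tags, is_sarcastic) for one tweet."""
    tags = {t.lower() for t in HASHTAG_RE.findall(tweet)}
    is_sarcastic = bool(tags & SARCASM_TAGS)
    # Strip only the marker tags; other hashtags may carry useful signal.
    stripped = HASHTAG_RE.sub(
        lambda m: "" if m.group().lower() in SARCASM_TAGS else m.group(), tweet
    )
    return " ".join(stripped.split()), is_sarcastic


marked = label_by_hashtag("You are really mature. #lying #sarcasm")
missed = label_by_hashtag(
    "I feel like there aren't enough singing competitions on TV . #sarcasmexplosion"
)
```

The second tweet is labelled non-sarcastic because #sarcasmexplosion is not an exact marker, which shows how noisy hashtag-derived labels can be. Finding the sarcasm is also only the first step: as the slide notes, the polarity still has to be worked out afterwards.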
  57. 57. Identifying the scope of sarcasm I am not happy that I woke up at 5:15 this morning. #greatstart #sarcasm You are really mature. #lying #sarcasm
  58. 58. Experiment with sarcastic hashtags ● Collected a corpus of 134 tweets containing the hashtag #sarcasm ● Manually annotated sentences with sentiment – 266 sentences, of which 68 opinionated (25%) – 62 negative, 6 positive ● Also annotated the same corpus as if the sarcasm was absent ● Compared how well our applications performed on each, with and without sarcasm analysis ● The results were a little surprising ● Even when we KNEW the statement was sarcastic, we didn't always get the polarity of the opinion right
  59. 59. Effect of sarcasm on sentiment analysis
      Sarcastic corpus                Precision  Recall  F1
      Opinionated                     74.58      63.77   68.75
      Opinion+polarity - Regular      20.34      17.39   18.75
      Polarity-only - Regular         27.27      27.27   27.27
      Opinion+polarity - Sarcastic    57.63      49.28   53.13
      Polarity-only - Sarcastic       77.02      77.28   77.28

      Regular corpus                  Precision  Recall  F1
      Opinionated                     57.89      58.93   58.41
      Opinion+polarity - Regular      45.61      46.43   46.02
      Polarity-only - Regular         78.79      78.79   78.79
      Opinion+polarity - Sarcastic    22.81      23.21   23.01
      Polarity-only - Sarcastic       39.40      39.39   39.39
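For readers less familiar with the metric, F1 is the harmonic mean of precision and recall. The small helper below (the function name is ours, not from the talk) computes it on the percentage scale used in the table, so individual rows can be checked against their reported precision and recall.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (same scale in, same scale out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# "Opinionated" rows from the sarcastic and regular corpora respectively.
sarcastic_opinionated = round(f1(74.58, 63.77), 2)
regular_opinionated = round(f1(57.89, 58.93), 2)
```

Both values match the F1 column reported for those rows (68.75 and 58.41).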
  60. 60. What about non-textual content?
  61. 61. We can also do opinion mining on images and multimedia
  62. 62. Image-opinion identification ● Facial expression analysis/classification – Helps with facial similarity calculations and face recognition – Can be used to predict sentiment/polarity – Can be combined with analysis of the text from the document ● Coarse-grained opinion classification – Looking at image-feature classification for abstract concepts (sentiment / privacy / attractiveness) – e.g. looking at image colours, placement of interesting images in the picture
  63. 63. Multimodal opinion analysis ● Investigate correlation between images and whole-document opinions – Do documents asserting specific opinions get illustrated with the same imagery? – e.g. articles about euro-scepticism in the UK might be illustrated with images of specific Conservative peers… – Is there correlation between low-level image features and specific opinions? ● Investigate finer-grained (i.e. sub-document) correlations between imagery and opinions – e.g. sentence-level correlations incorporating analysis of the document layout
  64. 64. Demo: extracting opinions from images
  65. 65. So where does this leave us? ● Social media is a tricky but interesting medium to analyse ● Opinion mining is ubiquitous, but it's still far from perfect ● There are lots of linguistic and social quirks that fool sentiment analysis tools ● The good news is that this means there are lots of interesting problems for us to research ● And it doesn’t mean we shouldn’t use existing opinion mining tools ● The benefits of a modular approach mean that we can pick the bits that are most useful ● Take-away message: it is critical to use the right tool for the right job
  66. 66. Don't be misled by the advertising: caveat emptor!
  67. 67. Acknowledgements
  68. 68. Further information • Research supported by the EU-funded ARCOMEM, uComp and TrendMiner projects • See and for more details • More information about GATE at • Opinion mining demo: • Learn about the technical details in the STIL 2013 tutorial: Practical Opinion Mining for social media (Wednesday 11.30am)
  69. 69. Questions?