In this 40-minute presentation, Attensity’s CTO, Ian Hersey, speaks about the challenges and critical benefits of real-time social media analytics. Real-world examples illustrate the kinds of insights that natural language processing can uncover.
The talk was recorded at the 2012 Social Media Analytics Summit. Topics covered include:
- How people have become “human sensors” about all kinds of news
- The limitations of insights pulled from the Twitter stream
- The success of predictions based on social media data
- The application of natural language processing in social analytics
2. The Business Challenge
The BIG DATA wave
Driven by conversations on the Internet, Social Media, Mobile Apps
• 300 Million: Tweets per day
• 250 Billion: Emails per day
• 800 Million: Facebook users
• 126 Million: Blogs
• 1.97 Billion: Internet users worldwide
• 5 Trillion: SMS messages annually
• Millions of CRM Records
• 100s of Millions of Survey Verbatims
3. Social Media: “Human Sensors”
• News, firsthand, secondhand, thirdhand…
• Natural disasters
• Military movements
• Networks, velocity, acceleration
• Opinions
• Products
• Services
• Popular culture (TV, film, music)
• Politics
• Conversations, Comments, Recommendations
• Can sometimes predict (or explain) outcomes
5. Predictive Power
• Don’t take it to Vegas…
• 90+% success rate if data volumes are sufficient
• Successful business uses involve not just prediction, but engagement
• Product feedback
• Direct customer service
• Analysis of marketing campaign effectiveness (TV, film, music)
• Political outreach/mobilization
• The science (or rather, art) is still in its infancy
• Equally or more important are the “whys” behind the predictions/outcomes
6. GOP Florida – Newt Gingrich
There was a sustained campaign to drum up support for Newt Gingrich.
Selected Newt Gingrich topics were discussed more at length throughout the day. For example, being sued over his use of the Survivor song “Eye Of The Tiger” since 2009 captured the imagination of the Twittersphere…
Another topic had mileage throughout the day, particularly around mid-afternoon: a Ron Paul supporter was somewhat roughed up at a Gingrich rally. Later in the day Ron Paul’s team demanded an apology…
7. GOP Florida – Mitt Romney
There was a sustained campaign to drum up support for Newt Gingrich.
The most consistent theme throughout the day was Romney being a Populist.
Key criticisms included jabs aimed at Romney’s wealth, in the sense that money and privilege can win you leadership…
8. Some Major Technical Challenges
• Data scale and rates
• NLP – no “one size fits all” technology
• Multi-channel content acquisition, coverage and quality
• Domain and customer specificity in the metadata
• Combining structured and full-text queries
• Operation by non-linguists
9. Data Scale and Rates
• Experience with Hadoop, HBase and Solr
• Biggest issues
• “Enterprise friendliness”
• Cannot support low-latency processing
• No current commercial offerings with both SQL and full-text front ends
• “Real-time” analysis scenario
• Match a tweet according to an initial filter
• Do further analysis to determine whether it is “actionable” vs “just a mention”
• Figure out who to route it to with what kind of priority
• All within a handful of seconds from the time it was tweeted
• 2500 times or more per second
• Required development of real-time ingestion and orchestration framework
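The “real-time” scenario above (filter, decide actionable vs. mention, route with a priority) can be sketched roughly as follows. This is only an illustration, not Attensity's framework: the keyword sets, the queue names, the brand term "acme" and the priority scale are all hypothetical stand-ins, and a real system would use NLP analysis rather than keyword matching.

```python
import re
from dataclasses import dataclass
from typing import Optional, Set

# Hypothetical keyword lists (assumptions for illustration only).
FILTER_TERMS = {"acme"}                                  # initial filter
ACTIONABLE_CUES = {"broken", "refund", "cancel", "help"}  # actionable vs. mention
URGENT_CUES = {"refund", "cancel"}                        # raises the priority


@dataclass
class Routed:
    text: str
    queue: str      # hypothetical destination queue
    priority: int   # 1 = most urgent, 3 = least urgent


def tokens(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def route(text: str) -> Optional[Routed]:
    """Run one tweet through the three steps; return None if it is filtered out."""
    words = tokens(text)
    # Step 1: match the tweet against the initial filter.
    if not words & FILTER_TERMS:
        return None
    # Step 2: decide whether it is "actionable" or "just a mention".
    if not words & ACTIONABLE_CUES:
        return Routed(text, "mentions", 3)
    # Step 3: pick a destination and a priority.
    priority = 1 if words & URGENT_CUES else 2
    return Routed(text, "customer_service", priority)


if __name__ == "__main__":
    for tweet in ["Just saw an Acme ad",
                  "My Acme router is broken, help!",
                  "I want a refund from Acme"]:
        print(route(tweet))
```

At the rates quoted on the slide, each of these steps would have to run thousands of times per second, which is why the per-tweet work is kept to cheap set operations in this sketch.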
13. Natural Language Processing “Reads” Every Communication
I bought an iPad2 for my mom last week. She loves the weight, but
doesn’t like the color. She wishes it came in blue. She says if it came in
blue, then she’d buy one for all her friends.
Entities (brands, people, locations, times, products…)
Events and relationships (purchasing event, my mom…)
Sentiment
Suggestions
Intent (to purchase, to leave)
I:have:mom
I:buy[past]:Product.apple_product.iPad2
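The two extracted facts above use a `subject:verb[tense]:object` notation. As a minimal sketch, assuming that this colon-separated layout is all there is to the format (the slide does not define it further), such triples could be held and round-tripped like this. The `Triple` class and `parse_triple` function are hypothetical names, not part of any Attensity API.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Verb part: a verb optionally followed by a bracketed tense, e.g. "buy[past]".
_VERB = re.compile(r"(?P<verb>[^\[\]]+)(?:\[(?P<tense>[^\[\]]+)\])?")


@dataclass(frozen=True)
class Triple:
    subject: str
    verb: str
    obj: str
    tense: Optional[str] = None

    def render(self) -> str:
        """Format the triple back into the slide's colon-separated notation."""
        verb = f"{self.verb}[{self.tense}]" if self.tense else self.verb
        return f"{self.subject}:{verb}:{self.obj}"


def parse_triple(text: str) -> Triple:
    """Parse 'subject:verb[tense]:object'; the tense part is optional."""
    subject, verb_part, obj = text.split(":", 2)
    match = _VERB.fullmatch(verb_part)
    if match is None:
        raise ValueError(f"unrecognized verb part: {verb_part!r}")
    return Triple(subject, match["verb"], obj, match["tense"])


if __name__ == "__main__":
    for raw in ["I:have:mom", "I:buy[past]:Product.apple_product.iPad2"]:
        triple = parse_triple(raw)
        print(triple, "->", triple.render())
```

The dotted object `Product.apple_product.iPad2` is kept as an opaque string here; it appears to be a path into a product taxonomy, but the slide does not say so explicitly.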
14. Limitations of NLP
• Irony, sarcasm
• “slanguage”
• Who’s talking/tweeting?
• Agendas
• Impact (“opinions are like…”)
• Cross-/multi-language
• Single posts vs. “body of work”
18. Command Center Concepts and Overview
The Command Center is a highly branded shared experience providing a lens to real-time social media conversations
Command Center screens use a responsive design for the following resolutions
1920x1080 (Most televisions)
1024x768 (Compatible with desktop computers and tablets)
A Command Center implementation is made up of multiple Dashboards
Implementations are hosted by Attensity
Dashboards contain multiple Widgets
Widgets are configurable with lots of options
Endless combinations
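The containment described on this slide (a Command Center holds Dashboards, and Dashboards hold configurable Widgets) might be modeled along these lines. This is a sketch only: the class names mirror the slide's terms, but every field, widget kind and option shown is a hypothetical placeholder rather than the product's actual configuration schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Widget:
    kind: str                                               # hypothetical widget type
    options: Dict[str, Any] = field(default_factory=dict)   # free-form settings


@dataclass
class Dashboard:
    name: str
    widgets: List[Widget] = field(default_factory=list)


@dataclass
class CommandCenter:
    brand: str
    dashboards: List[Dashboard] = field(default_factory=list)

    def widget_count(self) -> int:
        """Total number of widgets across all dashboards."""
        return sum(len(d.widgets) for d in self.dashboards)


# Example configuration with made-up names and options.
center = CommandCenter(
    brand="ExampleCo",
    dashboards=[
        Dashboard("Overview", [
            Widget("volume_chart", {"window_minutes": 60}),
            Widget("tweet_stream", {"language": "en"}),
        ]),
        Dashboard("Sentiment", [Widget("sentiment_gauge")]),
    ],
)

if __name__ == "__main__":
    print(center.widget_count())
```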