Vidar Brekke presents on text analytics for enterprise and consumer applications. Text analytics involves applying statistical, linguistic, machine learning and data analysis techniques to uncover business value from unstructured text. It helps answer business questions faster by finding new insights in sources like social media, surveys and emails. However, text poses big data challenges as it comes in huge volumes, varies in formality and language, and is often "dirty". Nu-school text analytics uses machine learning trained on human-annotated data to better understand sentiment, sarcasm and evolving language than early word-spotting techniques. Accuracy is best measured against human judgments or using partial credit matrices that account for nuanced sentiment. Top uses of text analytics include brand monitoring
SOCIAL TEXT ANALYTICS FOR ENTERPRISE AND CONSUMER APPLICATIONS
1. Presented by Vidar Brekke,
Social Intent LLC
The International Association of Software
Architects. October 23, 2012
@ividar #nlproc
2. What is Text Analytics?
Processes that uncover business value in unstructured text via the
application of statistical, linguistic, machine learning, and data analysis
and visualization techniques
3. Text analytics helps answer
business questions faster and
cheaper than before, uncovering
new, hidden insights!
4. Text analytics is a Big Data problem
Volume, Velocity, Variety
• Social media, help inquiries, email, texts, surveys
• 10.2 million tweets sent during the first presidential debate
• Hundreds of languages
• Formal, informal or ridiculously informal
• Cryptic (vertical industry or criminal activity)
5. I’m So Intextuated With You
Unstructured text represents the
biggest opportunity and problem
in Big Data
Unlike most other enterprise data,
text is very dirty data
9. Low Signal/Noise Ratio + Naïve Metrics Lead to Wrong Conclusions
• Lack of relevance: Many conversations you think
are about you, aren’t.
• Poor accuracy: Many automated sentiment
solutions are as good as a coin flip.
• Generic: All analysis is applied the same way
across domains
• Language evolves: Slang and sarcasm are rampant in
social media. Dictionary-based approaches are
largely ineffective.
10. Relevancy: It’s not all about you.
Let me finish my drink before you drive me to the
Betty Ford clinic!
Call me a bigot, but white guys can’t sprint!
#london2012
My husband is such a baby. He won’t even taste raw
food.
Is Delta’s food prepared by Purina? So much for first
class.
11. Search and Destroy (the data you’re looking for)
Text analytics gained traction in the 1980s, but the use cases
were different from today's.
“Word spotting” – not different from a Google search.
Show me all documents containing:
Ford NOT Harrison
But it doesn’t scale
12. Booleans are like woodcarving with a chainsaw
Query: Ford NOT Harrison ….
…would miss this tweet
Carguy231: Me and a dozen others
have lined up outside the Harrison, NY
Ford dealership to test drive the new
Fusion!
13. Booleans are like woodcarving with a chainsaw
Query: Ford AND Fusion….
…would get this tweet
Roadrunner123: Stuck with my dad in
his ford listening to horrible jazz fusion
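The two Boolean failures above can be reproduced with a few lines of code. The following is a minimal sketch, assuming a simple case-insensitive word match; the function names are hypothetical and are not taken from any search product.

```python
import re

def tokens(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def ford_not_harrison(text):
    """Boolean query: Ford NOT Harrison."""
    words = tokens(text)
    return "ford" in words and "harrison" not in words

def ford_and_fusion(text):
    """Boolean query: Ford AND Fusion."""
    words = tokens(text)
    return "ford" in words and "fusion" in words

dealership_tweet = (
    "Me and a dozen others have lined up outside the Harrison, NY "
    "Ford dealership to test drive the new Fusion!"
)
jazz_tweet = "Stuck with my dad in his ford listening to horrible jazz fusion"

# The relevant tweet is dropped because it mentions Harrison, NY.
print(ford_not_harrison(dealership_tweet))  # False

# The irrelevant tweet is kept because both words appear somewhere in it.
print(ford_and_fusion(jazz_tweet))  # True
```

The query has no notion of what "Harrison" or "fusion" refers to, which is why adding more Boolean clauses tends to trade one kind of error for another.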
14. Sentiment Analysis
Early sentiment analysis tools also use word spotting.
“Awesome” = good
“Sucks” = bad
What about sarcasm, slang, new words?
Additionally, the analysis typically scores the overall polarity of the text,
rather than the sentiment toward a specific target.
“I love the new Camaro, it’s better than the Mustang”
15. You can’t use word spotting for sentiment detection
“It took all morning to sign the lease papers for my new Mustang!”
“I stood on line all morning to get the last Mustang on the lot!”
“The brakes on the Mustang are surprisingly unpredictable.”
“The TV ads for the Mustang are surprisingly unpredictable!”
“The Mustang has never been good”
“The Mustang has never been this good”
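To see why the sentence pairs above defeat word spotting, here is a rough sketch of a lexicon-based scorer. The lexicon is a hypothetical toy (only "awesome", "sucks" and "good" echo the slides), and the scoring rule of summing word polarities is an assumption about how such early tools worked.

```python
import re

# Hypothetical toy lexicon: +1 for "good" words, -1 for "bad" words.
LEXICON = {"awesome": 1, "good": 1, "love": 1, "sucks": -1, "unpredictable": -1}

def word_spot_score(text):
    """Sum the polarity of every lexicon word found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

pairs = [
    ("The brakes on the Mustang are surprisingly unpredictable.",
     "The TV ads for the Mustang are surprisingly unpredictable!"),
    ("The Mustang has never been good",
     "The Mustang has never been this good"),
]

for first, second in pairs:
    # Each pair gets identical scores even though the meanings differ.
    print(word_spot_score(first), word_spot_score(second))
```

Both sentences in each pair receive the same score, because the scorer sees the same lexicon words and ignores what they modify.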
16. Nu-School text analytics is based on Machine Learning
Training data help the system recognize patterns. We
estimate the statistical probability that a sentence is
positive, negative, etc.
What are training data?
These are samples of text annotated by humans in an effort to
show the machine what the right answer is
“I love my iPhone, but hate AT&T”
iPhone | Positive
AT&T | Negative
Much easier and quicker to support new languages than with
dictionary-based approaches
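As an illustration of learning from annotated samples, the sketch below trains a small multinomial Naive Bayes classifier with add-one smoothing. The slides do not name an algorithm, so Naive Bayes is an assumption chosen for brevity; the training sentences are hypothetical, and the classifier labels whole sentences rather than individual targets such as "iPhone" or "AT&T".

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of samples
        self.vocab = set()

    def train(self, examples):
        """examples: iterable of (text, human-assigned label) pairs."""
        for text, label in examples:
            self.label_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def probabilities(self, text):
        """Return a probability for each label, summing to 1."""
        total_docs = sum(self.label_counts.values())
        log_scores = {}
        for label, n_docs in self.label_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[label][word]
                    score += math.log((count + 1) / (total_words + len(self.vocab)))
            log_scores[label] = score
        # Convert log scores to normalized probabilities.
        top = max(log_scores.values())
        exps = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(exps.values())
        return {label: value / norm for label, value in exps.items()}

    def classify(self, text):
        probs = self.probabilities(text)
        return max(probs, key=probs.get)

# Hypothetical human-annotated training samples.
TRAINING = [
    ("I love my new phone", "positive"),
    ("this car is awesome", "positive"),
    ("great service and friendly staff", "positive"),
    ("I hate waiting on hold", "negative"),
    ("the battery sucks", "negative"),
    ("terrible service and rude staff", "negative"),
]

model = NaiveBayes()
model.train(TRAINING)
print(model.classify("I love this awesome car"))
print(model.probabilities("the battery sucks"))
```

Supporting another language in this setup means supplying annotated samples in that language, with no hand-built dictionary required.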
17. Test: What’s the sentiment here?
“Reuters reports that
Assad continues the
massacre of his own
people amid sanctions
from the international
community.”
18. How to evaluate a text analytics platform
The accuracy of a sentiment analysis system is, in
principle, how well it agrees with human judgments.
“I can’t believe the bar has a hidden gambling room in
the back!”
An automated system can never be better than
humans. Or can it?
19. Using Human Parallel Coding to Establish Gold Standards
Confusion Matrix: Human as Gold Standard
           POSITIVE  NEGATIVE  NEUTRAL  TOTAL
POSITIVE        365        24      159    548
NEGATIVE         57        81       65    203
NEUTRAL         274        60      415    749
TOTAL           696       165      639   1500
Raw Accuracy: 61.5%
If a human agrees with a machine around 60 percent of the time, the
machine is performing as well as a human being.
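Raw accuracy is the share of instances on the diagonal of the confusion matrix, where both coders chose the same label. The sketch below computes it from the cell values as they appear in this transcript. With these values the diagonal sums to 861 of 1500, which gives 57.4% rather than the 61.5% quoted on the slide; the sketch does not try to resolve that difference, and the transcribed cells may not match the original figure.

```python
# Cell values as transcribed from the slide; the order of rows and columns
# is POSITIVE, NEGATIVE, NEUTRAL.
CONFUSION = [
    [365, 24, 159],
    [57, 81, 65],
    [274, 60, 415],
]

def raw_accuracy(matrix):
    """Fraction of instances where both coders assigned the same label."""
    agreements = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return agreements / total

print(f"{raw_accuracy(CONFUSION):.1%}")  # 57.4%
```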
20. Using A Credit Matrix to Create Improved Measurement
Credit Matrix
           POSITIVE  NEGATIVE  NEUTRAL
POSITIVE       100%        0%      50%
NEGATIVE         0%      100%      50%
NEUTRAL         50%       50%     100%
Partial Credit Figure of Merit: 82.3%
Confusion Matrix: Human 1 as Gold Standard
           POSITIVE  NEGATIVE  NEUTRAL
POSITIVE        365        24      159
NEGATIVE         57        81       65
NEUTRAL         274        60      415
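One plausible reading of the Partial Credit Figure of Merit is a credit-weighted average: each cell of the confusion matrix earns the credit in the matching cell of the credit matrix, and the total is divided by the number of instances. That formula is an assumption, since the slides do not spell it out. Applied to the cell values as transcribed here it yields 76.0%, not the 82.3% shown on the slide, so either the formula or the transcribed values differ from what the author used.

```python
# Both matrices use the order POSITIVE, NEGATIVE, NEUTRAL for rows and columns.
CONFUSION = [
    [365, 24, 159],
    [57, 81, 65],
    [274, 60, 415],
]
CREDIT = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]

def partial_credit_figure_of_merit(confusion, credit):
    """Credit-weighted agreement: sum(count * credit) / total instances."""
    total = sum(sum(row) for row in confusion)
    earned = sum(
        confusion[i][j] * credit[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )
    return earned / total

print(f"{partial_credit_figure_of_merit(CONFUSION, CREDIT):.1%}")  # 76.0%
```

Under this weighting, a positive/neutral disagreement costs half as much as a positive/negative one, which is the asymmetry the credit matrix is meant to capture.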
21. Precision & Recall (sentiment as an example)
Precision is the fraction of retrieved instances
that are relevant
E.g., how many of the instances labeled as positive were
actually positive?
Recall is the fraction of relevant instances that are
retrieved
E.g., how many positive instances the system
detected, compared to all positive instances.
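Precision and recall for a single label can be read off a confusion matrix. The sketch below assumes that rows hold the gold-standard labels and columns hold the labels being evaluated; the slides do not state the orientation, and swapping it swaps the two numbers. The cell values are those transcribed from the confusion matrix slide.

```python
LABELS = ["POSITIVE", "NEGATIVE", "NEUTRAL"]

# Assumed orientation: rows = gold standard, columns = labels under evaluation.
CONFUSION = [
    [365, 24, 159],
    [57, 81, 65],
    [274, 60, 415],
]

def precision_recall(matrix, label_index):
    """Return (precision, recall) for one label."""
    correct = matrix[label_index][label_index]
    labeled_as = sum(row[label_index] for row in matrix)  # column total
    actually_are = sum(matrix[label_index])               # row total
    return correct / labeled_as, correct / actually_are

precision, recall = precision_recall(CONFUSION, LABELS.index("POSITIVE"))
print(f"precision={precision:.3f} recall={recall:.3f}")
```

For the positive label this divides the 365 agreed positives by the 696 instances labeled positive (precision) and by the 548 gold-standard positives (recall).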
22. Top business applications of text/content analytics*
*Alta Plana, 2011
• Brand / product / reputation management
• Market research and social media monitoring, i.e. what are people saying
about my brand or products
• Voice of the Customer / Customer Experience Management
• Do I need to step in and offer customer service?
• How many people recommend my brand vs. advocate against it?
• Search, Information Access, or Question Answering
• Which bloggers are negative toward Obamacare?
• Which of the hotels on Yelp.com get great reviews for the room service?
• What are some articles similar to this one?
• Competitive intelligence
• What competing products are people considering, and why?
• Is a competitor’s media spend generating purchase intent?
23. Growing areas where text analytics is being applied
Product development
Intelligence and counter-terrorism, law enforcement
Pharmaceutical drug discovery
Financial services and insurance
Media, publishing & advertising
Political research
CRM
24. Still awake?
There is money in text analytics.
Here’s a stock tip worth the price of admission
alone
(YMMV….)
25. Strange Bedfellows
Whenever Anne Hathaway's
name appeared with any
regularity in news
stories, Berkshire Hathaway A
shares rose in value.
The green cells here are where the two coders agree. We can use this to derive a “raw” accuracy score. We add up the total number of instances where the two coders agree (the green cells) and divide by the total number of instances (1500) to get a raw accuracy score of 61.5%. This raw accuracy score provides the first benchmark against which we can assess machine performance. Put concretely, if we can get a machine to classify documents for sentiment where a human would agree with its classifications around 60 percent of the time, our machine would be performing as well as a human being.
Remember, we said before that not all mistakes are equal. It depends on the use to which you’re putting the data. In most situations, however, it’s worse to mislabel something positive as negative than it is to mislabel something positive as neutral. This is true for both human and machine coders. We can factor in these relative weights by using what is called a Credit Matrix. This says that you get 100% when your label agrees with the gold standard. Ultimately, the PCFM (Partial Credit Figure of Merit) will establish the baseline against which we measure the performance of our machine learning algorithm.