Text and Sentiment Analytics for Business Intelligence
1. Text and Sentiment Analytics
For Business Intelligence
By Gan Keng Hoon
5 April 2018
2. Outline
Overview: Business Intelligence with Text &
Sentiment Analytics
Approach
Text Analytics
Sentiment Analytics
Activity: Dataset for Analytics
Related Research Work on News Sentiment
Goodbye
4. What Do Customers Think?
One key purpose of business intelligence:
Improve customer experience
This includes knowing exactly what consumers or clients think of
new and established products or services,
recent initiatives, and
customer service offerings.
5. Unstructured Feedback
How to do that?
- Collect structured and unstructured feedback.
- From social media, review platforms, complaint platforms, etc.
- Then take action.
Source: https://www.complaintboard.com/pos-malaysia-complaints-l35839.html
6. Analytics Technology
With technology, reading, browsing, and interpreting feedback becomes easier.
Text Analytics
is the process of analyzing unstructured text, extracting relevant
information, and transforming it into useful business intelligence.
Sentiment Analytics
determines if an expression is positive, negative, or neutral, and to what
degree.
11. A Hotel Business Scenario
Looks good: 155 people
say Very Good…
Not bad: customers
rated 4 stars and above
for location,
cleanliness ..
Source: http://www.tripadvisor.com.my
13. Many Questions To Be Answered…
Mr X: How is the Wifi?
Miss Y: Is the toilet really dirty?
Family Z: Is there a convenience store nearby?
Hotel Manager: I want to know all the complaints about the toilet!
14. Technology Needed
It is impossible to scan through every review manually.
Important details could be missed.
It is hard to visualize or summarize all the texts manually.
It is impossible to keep up with the new reviews generated each day.
*There are 438 reviews (as of 5/4/2018) for the mentioned hotel.
15. Text & Sentiment Analytics
Is the toilet really dirty?
Text Analytics
- Let’s extract and analyze some
texts to answer the question.
1. in the bathroom, used
toiletries (shampoo &
soap) were not thrown
and were left in the
shower area
2. dirty sink, and very
very dirty shower glass
wall.
3. the shower, it's clean...
Sentiment Analytics
- Let’s find some sentiments
about these texts.
16. Text Analytics
Text Preprocessing
Sentence Tokenizer
Stop Word Removal etc.
Feature Selection
Bag-of-Words Approach
Term Frequency-Inverse Document Frequency (TF-IDF)
Word N-grams etc.
Natural Language Processing
Part of Speech Tagging
Dependency Analysis etc.
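The preprocessing and feature selection steps listed above can be sketched in plain Python. This is a minimal illustration, not the pipeline used in the talk: the stop-word list is a tiny hand-picked assumption, the function names are hypothetical, and TF-IDF is computed with the basic tf × log(N/df) weighting (libraries often use smoothed variants).

```python
import math
import re
from collections import Counter

# The three review snippets used in this deck.
REVIEWS = [
    "in the bathroom, used toiletries (shampoo & soap) were not thrown "
    "and were left in the shower area",
    "dirty sink, and very very dirty shower glass wall.",
    "the shower, it's clean...",
]

# Tiny illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"in", "the", "and", "were", "it's", "a", "an", "is"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens):
    """Count how often each token occurs (word order is ignored)."""
    return Counter(tokens)


def bigrams(tokens):
    """Return adjacent word pairs (word n-grams with n = 2)."""
    return list(zip(tokens, tokens[1:]))


def tf_idf(docs):
    """docs is a list of token lists; returns one {term: weight} dict per doc."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


DOCS = [remove_stop_words(tokenize(r)) for r in REVIEWS]
WEIGHTS = tf_idf(DOCS)

for doc, w in zip(DOCS, WEIGHTS):
    top = sorted(w, key=w.get, reverse=True)[:3]
    print(bag_of_words(doc).most_common(2), "top TF-IDF terms:", top)
```

Because "shower" occurs in all three snippets, its TF-IDF weight is zero here, while "dirty" and "clean" stand out in the snippets that contain them.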
17. Entity Detection (or Aspect Selection)
Texts
1. in the bathroom, used
toiletries (shampoo &
soap) were not thrown and
were left in the shower
area
2. dirty sink, and very very
dirty shower glass wall.
3. the shower, it's clean...
…
Aspect
1. Bathroom
2. Toiletries
3. Shower area
4. Sink
5. Shower
6. Hair dryer
7. Wifi
8. Bed ….
- POS Tagging
- Noun Phrase Selection
- Feature Selection
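A rough sketch of how POS tagging followed by noun phrase selection could yield aspect candidates from the three texts. The tagger here is a hypothetical hand-made lookup table (TOY_POS) covering only the example words; a real system would use a trained POS tagger. Runs of consecutive nouns are treated as one candidate aspect.

```python
import re

REVIEWS = [
    "in the bathroom, used toiletries (shampoo & soap) were not thrown "
    "and were left in the shower area",
    "dirty sink, and very very dirty shower glass wall.",
    "the shower, it's clean...",
]

# Hypothetical toy POS lexicon for illustration only.
TOY_POS = {
    "bathroom": "NOUN", "toiletries": "NOUN", "shampoo": "NOUN", "soap": "NOUN",
    "shower": "NOUN", "area": "NOUN", "sink": "NOUN", "glass": "NOUN",
    "wall": "NOUN", "dirty": "ADJ", "clean": "ADJ", "used": "ADJ",
}


def pos_tag(text):
    """Tag each token; punctuation is kept as a token so it breaks phrases."""
    tokens = re.findall(r"[a-z']+|[^a-z'\s]", text.lower())
    return [(tok, TOY_POS.get(tok, "OTHER")) for tok in tokens]


def noun_phrases(tagged):
    """Group runs of consecutive NOUN tokens into candidate aspects."""
    phrases, current = [], []
    for token, tag in tagged:
        if tag == "NOUN":
            current.append(token)
        else:
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


# Collect unique aspect candidates across all reviews, in order of appearance.
ASPECTS = []
for review in REVIEWS:
    for phrase in noun_phrases(pos_tag(review)):
        if phrase not in ASPECTS:
            ASPECTS.append(phrase)

print(ASPECTS)
```

A feature selection step would then filter this candidate list, for example by keeping only aspects mentioned in many reviews.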
18. Sentiment Analytics
Basic Technique
1. Sentiment or Not.
Decide whether the text contains sentiment or not.
Differentiate between fact (objective) and opinion (subjective)
2. Target of the opinion.
Determine the target (entity/aspect) an opinion is about.
3. Sentiment.
Determine the opinion’s polarity (Good or Bad)
4. Scoring.
Calculate an overall score for a single target (e.g. the
bathroom) or for an entire review (many targets).
19. Sentiment Extraction
Texts
1. in the bathroom, used
toiletries (shampoo &
soap) were not thrown and
were left in the shower
area
2. dirty sink, and very very
dirty shower glass wall.
3. the shower, it's clean...
…
Aspect - Sentiment
1. Sink – dirty
2. Shower – clean
3. Shower glass wall – dirty
- POS Tagging
- Adjective Phrase Selection
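One simple way to pair aspects with sentiment words is a proximity heuristic, sketched below. It is an assumption for illustration, not necessarily the method behind this slide: each opinion adjective is attached to the noun phrase that directly follows it ("dirty sink"), otherwise to the nearest noun phrase before it ("the shower, it's clean"). The NOUNS and OPINION_ADJECTIVES sets are hypothetical stand-ins for a POS tagger and an opinion lexicon.

```python
import re

# Hypothetical word lists standing in for a POS tagger and opinion lexicon.
NOUNS = {"bathroom", "toiletries", "shampoo", "soap", "shower",
         "area", "sink", "glass", "wall"}
OPINION_ADJECTIVES = {"dirty", "clean"}


def tokenize(text):
    """Split into lowercase words; punctuation marks become separate tokens."""
    return re.findall(r"[a-z']+|[^a-z'\s]", text.lower())


def noun_phrase_spans(tokens):
    """Return (start, end, phrase) for each run of consecutive nouns."""
    spans, start = [], None
    for i, tok in enumerate(tokens + [""]):  # empty sentinel flushes the last run
        if tok in NOUNS:
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
    return spans


def aspect_sentiment_pairs(text):
    """Pair each opinion adjective with an adjacent or preceding noun phrase."""
    tokens = tokenize(text)
    spans = noun_phrase_spans(tokens)
    pairs = []
    for i, tok in enumerate(tokens):
        if tok not in OPINION_ADJECTIVES:
            continue
        following = [p for s, e, p in spans if s == i + 1]
        preceding = [p for s, e, p in spans if e <= i]
        if following:
            pairs.append((following[0], tok))
        elif preceding:
            pairs.append((preceding[-1], tok))
    return pairs


print(aspect_sentiment_pairs("dirty sink, and very very dirty shower glass wall."))
print(aspect_sentiment_pairs("the shower, it's clean..."))
```

The first text in the example contains no opinion adjective from the list, so it produces no pairs, which matches the slide showing pairs only for texts 2 and 3.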
20. Sentiment Polarity Detection & Scoring
Texts
1. in the bathroom, used
toiletries (shampoo &
soap) were not thrown and
were left in the shower
area
2. dirty sink, and very very
dirty shower glass wall.
3. the shower, it's clean...
…
Aspect - Sentiment
1. Sink – dirty (N:0.75)
2. Shower – clean (P:0.5)
3. Shower glass wall – dirty (N:0.75)
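Polarity detection and scoring can be sketched with a small lexicon lookup. The two lexicon entries reuse the values shown above (dirty N:0.75, clean P:0.5); everything else is an assumption for illustration, in particular the choice to aggregate by averaging signed scores, since the slide does not say how overall scores are combined.

```python
# Polarity lexicon: word -> (polarity, strength). Values taken from the slide.
POLARITY_LEXICON = {"dirty": ("N", 0.75), "clean": ("P", 0.5)}

# Aspect-sentiment pairs from the example texts.
PAIRS = [
    ("sink", "dirty"),
    ("shower", "clean"),
    ("shower glass wall", "dirty"),
]


def label(word):
    """Format a word's polarity the way the slide does, e.g. 'N:0.75'."""
    polarity, strength = POLARITY_LEXICON[word]
    return f"{polarity}:{strength}"


def signed_score(word):
    """Positive words score above zero, negative words below zero."""
    polarity, strength = POLARITY_LEXICON[word]
    return strength if polarity == "P" else -strength


def score_by_target(pairs):
    """Average signed score per aspect (assumed aggregation rule)."""
    collected = {}
    for aspect, word in pairs:
        collected.setdefault(aspect, []).append(signed_score(word))
    return {aspect: sum(v) / len(v) for aspect, v in collected.items()}


def overall_score(pairs):
    """Average signed score over all pairs in a review (assumed rule)."""
    scores = [signed_score(word) for _, word in pairs]
    return sum(scores) / len(scores) if scores else 0.0


for aspect, word in PAIRS:
    print(f"{aspect} - {word} ({label(word)})")
print("per target:", score_by_target(PAIRS))
print("overall:", round(overall_score(PAIRS), 3))
```

With these three pairs the overall score is negative, since two strong negative mentions outweigh one weaker positive mention.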
24. Related Research Work on News Sentiment
Label -> Train the Classifier -> Predict Label for new article
June Ling Ong Hui, Gan Keng Hoon, Wan Mohd Nazmee Wan Zainon: Effects of Word Class
and Text Position in Sentiment-based News Classification. 4th Information Systems
International Conference 2017, ISICO 2017, 6-8 November 2017, Bali, Indonesia, Procedia
Computer Science, Elsevier (2017).
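The Label -> Train the Classifier -> Predict flow can be illustrated with a small multinomial Naive Bayes classifier written from scratch. This is a generic sketch and not the classifier or features studied in the cited paper; the labeled headlines are invented purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Step 1: Label. Hypothetical headlines, invented for this example.
LABELED = [
    ("profits rise as exports grow strongly", "positive"),
    ("company wins award for excellent service", "positive"),
    ("shares fall after weak sales report", "negative"),
    ("factory closes and workers lose jobs", "negative"),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled):
    """Step 2: Train. Count documents per class and words per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, lab in labeled:
        class_counts[lab] += 1
        for tok in tokenize(text):
            word_counts[lab][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Step 3: Predict. Pick the class with the highest log-probability."""
    class_counts, word_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for lab, count in class_counts.items():
        logp = math.log(count / n_docs)  # class prior
        total = sum(word_counts[lab].values())
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                logp += math.log((word_counts[lab][tok] + 1) / (total + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = lab, logp
    return best_label


MODEL = train(LABELED)
print(predict(MODEL, "exports grow and profits rise"))
print(predict(MODEL, "weak sales and workers lose jobs"))
```

In practice the training set would contain many labeled articles, and the choice of features (for example which word classes or text positions to use) affects accuracy.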
27. Challenges Ahead
How to detect more in-depth sentiment.
Comparing features of a product.
Differentiating spam from credible comments. FAKE comments?!
Language problems
Usage of mixed languages.
Usage of non-standard languages.
28. THANK YOU
Drop me an email at: khgan@usm.my
Visit http://ir.cs.usm.my