3. Overview
Rationale, objective and method
Case study
• Comments Analysis
• Negative Comments Analysis
Limitations
Conclusions
Next steps
4. Rationale
The feedback from survey comments could be
useful to improve client satisfaction or improve
the survey.
Presently, survey comments collected by
Statistics Canada are either not used or are
extracted through time-consuming manual
processes.
The question is, can we automate the analysis of
survey comments using text mining techniques?
5. Objective
Develop an automated tool to monitor and
interpret feedback from survey comments using
basic text mining analysis techniques.
7. Method
Example comment:
“10th trip to Canada was GOOd, but the border Guards @at security were rude!!”
1. Clean
Remove punctuation, digits, upper-case letters and meaningless words:
“trip canada good border guards security rude”
2. Split
Separate individual words from the comment:
trip, canada, good, border, guards, security, rude
3. Classify
Classify words as positive or negative. Everything else is neutral.
Word   Class   PS   NS
good   +       1    0
rude   -       0    1
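The three steps above can be sketched in Python. This is only an illustration under stated assumptions, not the tool described in these slides: the stopword list and the `POSITIVE` and `NEGATIVE` word sets are hypothetical, and dropping whole tokens that contain a digit (such as "10th") is an assumption chosen to reproduce the cleaned example.

```python
import re

# Hypothetical word lists for illustration; the actual lexicon is not given.
STOPWORDS = {"to", "was", "but", "the", "at", "were"}
POSITIVE = {"good"}
NEGATIVE = {"rude"}


def clean(comment):
    """Step 1: lower-case, drop digits and punctuation, remove stopwords."""
    text = comment.lower()
    # Assumption: a token containing a digit (e.g. "10th") is dropped whole.
    text = re.sub(r"\S*\d\S*", " ", text)
    # Any remaining character that is not a letter or whitespace becomes a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    return " ".join(w for w in text.split() if w not in STOPWORDS)


def split(cleaned):
    """Step 2: separate the individual words of a cleaned comment."""
    return cleaned.split()


def classify(word):
    """Step 3: label a word as positive (+), negative (-) or neutral."""
    if word in POSITIVE:
        return "+"
    if word in NEGATIVE:
        return "-"
    return "neutral"


comment = "10th trip to Canada was GOOd, but the border Guards @at security were rude!!"
cleaned = clean(comment)
for word in split(cleaned):
    print(word, classify(word))
```

With these word lists, only "good" and "rude" receive a polarity; the other five words are neutral.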
8. Method
4. Net Score
Net Score = PS – NS, the difference between the positive score (PS) and the negative score (NS).
ID   Comment                                                                      Words                                          PS   NS   Net Score
01   10th trip to Canada was GOOd, but the border Guards @at security were rude   trip canada good border guards security rude   1    1    0
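A minimal Python sketch of the Net Score step, assuming PS and NS are simple counts of the positive and negative words in a comment (each word weighted 1, as the table suggests). The `POSITIVE` and `NEGATIVE` sets and the `score` function are hypothetical names used for illustration only.

```python
# Hypothetical lexicon; the word lists actually used are not given in the slides.
POSITIVE = {"good"}
NEGATIVE = {"rude"}


def score(words):
    """Return (PS, NS, net score) for a list of cleaned words."""
    ps = sum(1 for w in words if w in POSITIVE)  # positive score
    ns = sum(1 for w in words if w in NEGATIVE)  # negative score
    return ps, ns, ps - ns


words = "trip canada good border guards security rude".split()
ps, ns, net = score(words)
print(ps, ns, net)  # 1 1 0
```

One positive word and one negative word cancel out, so the example comment ends with a net score of 0.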
9. Method
5. Append
ID   Comment                                                                      PS   NS   Net Score   Year   Quarter   Country   Other…
01   10th trip to Canada was GOOd, but the border Guards @at security were rude   1    1    0           2016   1         USA
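The Append step can be read as a join between the comment scores and the survey variables, keyed on the comment ID. The Python sketch below assumes that reading; the dictionary layout, the field names and the `append_metadata` function are hypothetical.

```python
# Scores from the previous step and survey variables, both keyed by comment ID.
# The field names are illustrative assumptions.
scores = {"01": {"ps": 1, "ns": 1, "net_score": 0}}
survey = {"01": {"year": 2016, "quarter": 1, "country": "USA"}}


def append_metadata(scores, survey):
    """Build one row per comment: its ID, its scores, then its survey variables."""
    rows = []
    for comment_id, comment_scores in scores.items():
        row = {"id": comment_id}
        row.update(comment_scores)
        row.update(survey.get(comment_id, {}))  # no survey record: scores only
        rows.append(row)
    return rows


rows = append_metadata(scores, survey)
print(rows[0])
```

Keeping year, quarter and country on the same row as the net score is what makes it possible to break the results down by period or by traveller flow later on.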
10. Case Study
International Travel Survey (ITS) collects
information on travellers to and from Canada
Reference period: 2013 to 2015 (Quarterly data)
Comments were provided on around 20% of the
received survey questionnaires
11. Comments Analysis
23% of all provided comments were classified as
positive, while 7% were classified negative.
[Pie chart: percentage of responses by comment type. Negative 7%, Neutral 70%, Positive 23%]
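The slides do not state how a whole comment is labelled. The Python sketch below assumes the sign of the net score decides: above zero is positive, below zero is negative, and zero is neutral. The `label` and `shares` functions are hypothetical, and the net scores are made-up toy values, not survey results.

```python
from collections import Counter


def label(net_score):
    """Assumed rule: the sign of the net score decides the comment type."""
    if net_score > 0:
        return "positive"
    if net_score < 0:
        return "negative"
    return "neutral"


def shares(net_scores):
    """Percentage of comments of each type."""
    counts = Counter(label(n) for n in net_scores)
    total = len(net_scores)
    if total == 0:
        return {"positive": 0.0, "neutral": 0.0, "negative": 0.0}
    return {k: 100 * counts[k] / total for k in ("positive", "neutral", "negative")}


toy_net_scores = [1, 0, 0, -1, 0, 2, 0, 0, 0, 0]  # toy data for illustration
print(shares(toy_net_scores))
```

Under this assumed rule, a comment with one positive and one negative word has a net score of 0 and counts as neutral.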
12. Comments Analysis
Traveller comments trend by traveller flow and
comment type
[Line chart: % of comments by period, quarters 1 to 4 of 2013, 2014 and 2015. Y-axis from 0% to 50%. Series: Canadian - Negative, Canadian - Positive, Visitors - Negative, Visitors - Positive]
14. Negative Comments Analysis
Discover reasons for negative comments
7% of comments are classified as negative
[Diagram: negative words, and words from negative comments]
15. Negative Comments Analysis
Half of the negative comments are from visitors.
Most prevalent words found in the comments
classified as negative.
[Bar chart: count (axis 0 to 100) of neutral or positive words in negative comments: cost, hotel, accommodation, food, flight, canada]
[Bar chart: count (axis 0 to 60) of negative words: bad, sick, rude, unable, refused, expensive]
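Counting the most prevalent words in negative comments can be sketched in Python with `collections.Counter`. The sketch assumes each negative comment has already been cleaned and split into words; the `NEGATIVE` set, the `word_counts` function and the three toy comments are hypothetical, not survey data.

```python
from collections import Counter

# Hypothetical subset of a negative-word lexicon.
NEGATIVE = {"bad", "rude", "expensive"}


def word_counts(negative_comments):
    """Count negative words and all other words separately."""
    negative, other = Counter(), Counter()
    for words in negative_comments:
        for w in words:
            if w in NEGATIVE:
                negative[w] += 1
            else:
                other[w] += 1
    return negative, other


toy_comments = [  # toy data for illustration
    ["hotel", "expensive"],
    ["border", "guards", "rude"],
    ["food", "bad", "hotel", "expensive"],
]
negative, other = word_counts(toy_comments)
print(negative.most_common(1))  # [('expensive', 2)]
print(other["hotel"])           # 2
```

The negative words show the sentiment, while the neutral or positive words that accompany them (here "hotel") point to what the complaint is about.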
16. Limitations
Software
One-word analysis
Only English comments are analyzed
Words are equally weighted regardless of the
degree of polarity
17. Conclusion
We developed an automated in-house tool to
quantify, monitor, and extract meaning from
survey comments.
The tool provides an opportunity to analyze
survey comments efficiently.
Survey managers can use this extracted
information to improve respondents' experience.
For tourism stakeholders, the information can
be used to enhance services provided to
travellers.
18. Next Steps
Analyze census and/or other surveys’ comments
Run the same approach with R.
Compare French and English comments.
Explore additional methods
Correlation
N-gram
Visualization
Sentence segmentation
Predictive modelling
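As an illustration of the N-gram item above, the Python sketch below extracts pairs of adjacent words (bigrams) from a cleaned comment. The `ngrams` function is a hypothetical helper written for this example, not part of the tool described here.

```python
def ngrams(words, n=2):
    """Return the list of n consecutive-word tuples from a list of words."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


words = "trip canada good border guards security rude".split()
bigrams = ngrams(words, 2)
print(bigrams[-1])  # ('security', 'rude')
```

Working with word pairs instead of single words keeps some context, which one-word analysis cannot capture.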
Visitor Observations = 48516
Canadian Observations = 21641
Total Observations = 70157
Green and red: positive is higher than negative.
Flow: visitors respond better than Canadians.
3. Seasonality effect, briefly (suggestion only): how does the response trend change by season?
The price of the text mining module is very high.
Using R would have been possible, but the tm package and all of its dependencies would have been required.
Python and OpenNLP were not available to install.
Predictive modelling: that is, seeing whether a certain word has a meaning.