A Survey of Sentiment Mining Techniques
Khan Mostafa
Graduate Student, Computer Science, Stony Brook University, NY 11794, USA
Email: khan.mostafa@stonybrook.edu
Student ID# 109365509
ABSTRACT
A survey of publications addressing the challenges and techniques of sentiment mining.
1 INTRODUCTION
Text conveys subjective and objective information, as well as the sentiments associated with it.
Identifying the sentiment of a text is an intuitive task for a human. However, identifying collective
as well as individual sentiments across a large collection of textual data can be an enormous task.
This requires data mining and classification techniques that automatically associate sentiments with
textual data. Sentiment mining can be used to identify how people feel about a product, a topic or,
more generally, an entity. This is useful to manufacturers from a business point of view. In recent
years, there has been much academic research in sentiment analysis as well as practical commercial
application.
Generally, sentiment can be negative or positive. Nevertheless, not every text conveys sentiment;
some are merely objective statements. Thus, an application needs to classify texts as positive,
negative or neutral while mining sentiment.
Sentiment analysis has been studied from the perspectives of data mining, machine learning, natural
language processing and statistical analysis. In this article, I address several aspects of
sentiment mining. I survey several papers, starting with a text that familiarizes readers with the
basic ideas of automatic sentiment analysis. Then, I briefly address a well-cited paper that instigated
much of sentiment mining research as a specialized classification task. The next article discusses
using microblogging sites like Twitter for sentiment analysis and opinion mining. The following papers
discuss different approaches to sentiment classification. One focuses specifically on mining large
real-time streaming data, and the last paper touches on the case of ironic speech.
Each of the surveyed papers addresses a slightly different aspect of sentiment mining, and together
they cover much of the overall problem domain.
2 SENTIMENT ANALYSIS AND OPINION MINING
2.1 Automatic Sentiment Analysis in On-line Text
I open my survey with a relatively old text (Boiy, et al. 2007) on sentiment analysis, which
introduces readers to the basic concepts, methodology, techniques and challenges of the topic. The
authors first objectify sentiment by introducing concepts of emotions.
Emotions can occur in text as appraisal, direct expressions, elements of action and remarks.
Survey paper submitted for CSE590 Networks and Data Mining Techniques on Sep 26, 2013
Then they introduce readers to methodologies for identifying the emotion (and thus the sentiment) of a
text. They explore symbolic techniques and machine learning techniques. To employ machine learning
techniques, we first need to select features. Candidate features that are prevalent in the literature
include parts of speech (POS), unigrams, n-grams, lemmas, negations, opinion words and adjectives.
The authors then mention support vector machines, multinomial naïve Bayes and maximum
entropy as three example supervised methods.
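To make the feature-selection step concrete, the following is a minimal sketch of unigram and n-gram presence features. The whitespace tokenizer and the function names `ngrams` and `presence_features` are my own assumptions for illustration and are not taken from (Boiy, et al. 2007).

```python
from typing import List, Set


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return the contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def presence_features(text: str, max_n: int = 2) -> Set[str]:
    """Collect every n-gram (n = 1..max_n) that occurs in the text.

    Assumption: lowercasing plus whitespace splitting stands in for a real
    tokenizer; a practical system would also handle punctuation and lemmas.
    """
    tokens = text.lower().split()
    features: Set[str] = set()
    for n in range(1, max_n + 1):
        features.update(ngrams(tokens, n))
    return features
```

A set of such features can then be turned into a binary vector and passed to any of the supervised methods named above.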
The authors also focus on several challenges. One challenge is that, in many texts, a person expresses
sentiment about several topics, some negative and some positive. Therefore, it can be
useful to investigate the relation between topic and sentiment. Again, many texts are not subjective
but merely neutral, objective statements. So, before estimating sentiment polarity, it is useful to
identify whether a text really bears some sentiment. A similar challenge is cross-domain
classification. Another important issue is text quality: text gathered from the web, especially, is
intertwined with a fair amount of junk. This requires a decent amount of text filtering.
2.2 Thumbs up? Sentiment Classification using Machine Learning Techniques
Pang, et al. (Pang, Lee and Vaithyanathan 2002), amongst many, investigated sentiment
classification at an early stage and posed several challenges in the field. They aimed to “examine
whether it suffices to treat sentiment classification simply as a special case of topic-based
categorization or whether special sentiment-categorization methods need to be developed.” They
employed three machine learning techniques that perform well in topic categorization, namely
(a) naïve Bayes, (b) maximum entropy classification and (c) support vector machines, only to find
that they do not perform satisfactorily in sentiment classification. Thus, they ended with an open
question for researchers to investigate intensely.
2.3 Twitter as a Corpus for Sentiment Analysis and Opinion Mining
A. Pak and P. Paroubek (Pak and Paroubek 2010) study how a microblogging platform can be used for
sentiment analysis. They mined Twitter to automatically collect a corpus of negative and positive
(subjective) posts as well as objective (neutral) posts. They cleverly exploited emoticons
to associate sentiment with tweets; a similar approach was exemplified by J. Read (Read 2005). They
queried Twitter for two types of emoticons:
Happy emoticons: “:-)”, “:)”, “=)”, “:D” etc.
Sad emoticons: “:-(”, “:(”, “=(”, “;(” etc.
In addition, they collected objective (neutral) posts by retrieving posts from newspapers and
magazines.
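The emoticon-based labeling described above can be sketched as follows. The emoticon lists are the ones quoted in the text; the function name `emoticon_label` and the rule of discarding posts with mixed or no emoticons are assumptions made for this illustration, not details confirmed by (Pak and Paroubek 2010).

```python
from typing import Optional

# Emoticon lists as quoted in the survey text.
HAPPY = (":-)", ":)", "=)", ":D")
SAD = (":-(", ":(", "=(", ";(")


def emoticon_label(post: str) -> Optional[str]:
    """Assign a noisy sentiment label to a post from its emoticons.

    Assumption: posts with both kinds of emoticons, or with none,
    are left unlabeled (None) rather than guessed.
    """
    has_happy = any(e in post for e in HAPPY)
    has_sad = any(e in post for e in SAD)
    if has_happy and not has_sad:
        return "positive"
    if has_sad and not has_happy:
        return "negative"
    return None
```

Labels produced this way are noisy, which is why the surveyed system is evaluated on hand-annotated posts rather than on the automatically labeled corpus.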
Pak, et al. analyzed the collected corpus by first tagging its posts using TreeTagger (Schmid
1994) and then performing a pairwise comparison of the tag distributions over two sets. For the
subjective set vs. the objective set, they observed that POS tags are not evenly distributed and
postulated that this feature can be used to classify objective and subjective posts. A similar
observation was made for positive vs. negative posts.
For training a sentiment classifier, they used the presence of n-grams as a binary feature. They
claimed that high-order n-grams perform better at capturing sentiment, while unigrams have good
coverage of the data. While constructing n-grams, they attached negation to adjacent terms. They then
used a naïve Bayes classifier and claimed that it performs better than SVM or CRF (Lafferty,
McCallum and Fernando 2001). They trained two Bayes classifiers, (a) n-gram based and (b) POS based.
To obtain a final result, they estimate sentiment using both classifiers and calculate the
log-likelihood of each sentiment. To increase accuracy, they suggested discarding common n-grams. For
this, they only used
n-grams with low Shannon entropy values. They evaluated their system on hand-annotated real
Twitter posts.
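The two preprocessing ideas in this paragraph, attaching negation and filtering n-grams by Shannon entropy, can be illustrated with a short sketch. The entropy of an n-gram is computed here over its distribution across sentiment classes, so an n-gram that appears in only one class has entropy 0. The negation word list, the underscore joining, the function names and the threshold value are all hypothetical choices for this example; the surveyed paper attaches negation to adjacent terms in its own way, which this sketch simplifies.

```python
import math
from typing import Dict, List

# Hypothetical negation list, used only for this illustration.
NEGATIONS = {"not", "no", "never"}


def attach_negation(tokens: List[str]) -> List[str]:
    """Glue each negation word to the token that follows it."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        if tokens[i] in NEGATIONS and i + 1 < len(tokens):
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def ngram_entropy(class_counts: Dict[str, int]) -> float:
    """Shannon entropy (in bits) of an n-gram's distribution over classes."""
    total = sum(class_counts.values())
    entropy = 0.0
    for count in class_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def keep_ngram(class_counts: Dict[str, int], threshold: float = 0.5) -> bool:
    """Keep only n-grams whose entropy is below an assumed threshold."""
    return ngram_entropy(class_counts) < threshold
```

An n-gram that is spread evenly over positive and negative posts carries little sentiment information and has high entropy, which is the intuition behind discarding it.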
The methodology presented here suits this particular case well. In particular, automatically labeling
training data for the classifier is a clever corpus-building idea. Besides, the combination of
n-gram-based and POS-based classification goes a long way towards addressing the challenge of the
topic-sentiment relation. However, this methodology does not address how to handle streaming data that
changes over time.
2.4 Using Appraisal Taxonomies for Sentiment Analysis
In their paper on sentiment analysis, Whitelaw, et al. (Whitelaw, Garg and Argamon 2005) suggest
using appraisal taxonomies for sentiment classification. They argued that semantic analysis
approaches should go beyond (a) bags of words and (b) mood-classified words. They identified the need
for semantic analysis of attitude expressions and hypothesized that the atomic units of sentiment
expression are not individual words but rather appraisal groups.
They adopted four main types of attributes for appraisal groups: Attitude, Orientation, Graduation and
Polarity, taken from Martin and White’s Appraisal Theory (Martin and White 2005). They discussed
a semi-automated technique to construct a lexicon of appraisal groups. To do so, they used terms from
(Martin and White 2005) as seed terms and generated candidate expansions using WordNet and two
other thesauri. They used a coarse relevance ranking to list such terms, but manually
inspected each ranked list to produce the final set of terms. Then they tested several feature sets,
e.g. Words by Attitude, Systems by Attitude, Appraisal Group by Attitude & Orientation, etc. They
evaluated the effectiveness of the feature sets for movie review classification on IMDb movie reviews.
They found that the union of bag-of-words and appraisal group by attitude & orientation (BoW+G:AO)
yields the best result.
The approach demonstrated in this paper has several drawbacks in terms of scalability, especially as
the lexicon building involves much manual effort and the objective function for classification tends to
be computation-intensive. However, their work draws the attention of researchers towards an important
notion: sentiment analysis should concentrate more on key terms than on the whole corpus.
A similar observation was made by (Benamara, et al. 2007) and (Subrahmanian and Reforgiato 2008),
stating that “Adjectives and Adverbs are better than Adjectives Alone”. In addition, the essence of the
outcome of (Whitelaw, Garg and Argamon 2005) can be seen as analogous to what (Pak and
Paroubek 2010) exploit in their work by classifying sentiment based on both POS and word groups
(n-grams).
2.5 Joint Sentiment/Topic Model for Sentiment Analysis
Lin, et al. (Lin and He 2009) approached sentiment analysis from a slightly different perspective by
combining it with topic detection. They proposed an extension of the topic model Latent Dirichlet
Allocation (LDA) that adds a sentiment layer. Their model, the Joint Sentiment/Topic (JST) model,
is fully unsupervised and can detect sentiment and topic simultaneously at the document
level.
They describe, “The existing framework of LDA has three hierarchical layers, where topics are associated with
documents, and words are associated with topics. In order to model document sentiments, we propose a joint sentiment/topic
(JST) model by adding an additional sentiment layer between the document and the topic layer. Hence, JST is effectively
a four-layer model, where sentiment labels are associated with documents, under which topics are associated with sentiment
labels and words are associated with both sentiment labels and topics.” They observed that the
sentiment-document distribution plays an important role in determining the polarity of a document.
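The four-layer structure in the quotation can be read as a generative story: a document first draws a distribution over sentiment labels, each sentiment label has its own distribution over topics, and each word is drawn from a distribution indexed by both sentiment label and topic. The following is a minimal sketch of that story only, not of the inference procedure. All names and hyperparameter values (`gamma`, `alpha`, `beta`) are assumptions for illustration, and the Dirichlet sampler is built from the standard library rather than taken from the paper.

```python
import random
from typing import List, Sequence, Tuple


def dirichlet(alpha: Sequence[float], rng: random.Random) -> List[float]:
    """Sample from a Dirichlet distribution by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_word_distributions(n_sentiments: int, n_topics: int, vocab_size: int,
                              rng: random.Random, beta: float = 0.5):
    """One word distribution per (sentiment label, topic) pair."""
    return [[dirichlet([beta] * vocab_size, rng) for _ in range(n_topics)]
            for _ in range(n_sentiments)]


def generate_document(n_words: int, phi, vocab: Sequence[str], rng: random.Random,
                      gamma: float = 1.0, alpha: float = 1.0) -> List[Tuple[int, int, str]]:
    """Generate (sentiment label, topic, word) triples for one document."""
    n_sentiments, n_topics = len(phi), len(phi[0])
    # Document layer: distribution over sentiment labels.
    pi = dirichlet([gamma] * n_sentiments, rng)
    # Sentiment layer: one topic distribution per sentiment label.
    theta = [dirichlet([alpha] * n_topics, rng) for _ in range(n_sentiments)]
    doc = []
    for _ in range(n_words):
        label = rng.choices(range(n_sentiments), weights=pi)[0]
        topic = rng.choices(range(n_topics), weights=theta[label])[0]
        word = rng.choices(vocab, weights=phi[label][topic])[0]
        doc.append((label, topic, word))
    return doc
```

In this reading, the per-document distribution over sentiment labels is the quantity that would be inspected to decide a document's polarity.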
They also examined an alternative model, called Tying-JST, which uses a single topic-document
distribution as opposed to an individual distribution for each document in JST. However, Tying-JST
consistently performs worse than JST.
JST incorporates prior information into its model to enhance accuracy. They examined four model
priors: (a) paradigm word list, (b) mutual information, (c) full subjectivity lexicon and (d) filtered
subjectivity lexicon.
They evaluate accuracy for the different priors, which demonstrates a significant improvement
when prior information is incorporated, compared with results obtained without it.
The filtered subjectivity lexicon was found to be the best among the studied priors.
JST is presented as a novel text mining approach for sentiment analysis and topic extraction. By
simultaneously identifying topics, the model addresses the problem of the domain dependence of
subjectivity (i.e., a single word can have a negative connotation in one domain, whereas the same word
might be positive in another domain). However, the complexity of this approach can pose a major
challenge in large-scale commercial implementation. The method also considers document-level
sentiment, while many applications are interested in more granular sentiment, especially
sentiment towards entities.
2.6 Sentiment Knowledge Discovery in Twitter Streaming Data
Yet another perspective on sentiment analysis is investigated by (Bifet and Frank 2010), who address
the challenges of mining streaming “data whose nature or distribution changes over time”. The paper
specifically addresses the Twitter data stream, where data arrives at high speed and prediction
algorithms are required to perform in real time. It also covers specifics of the Twitter API and other
implementation details, which I leave out of this survey.
Regarding sentiment analysis, they note the challenges posed by the succinctness of tweets and the
possibility of sarcasm and irony. They also take advantage of the fact that many tweets are annotated
by their authors using emoticons, and use such tweets as training data for a sentiment classifier,
the same idea utilized by (Pak and Paroubek 2010). Before training, however, they preprocess tweets by
(a) replacing mentions with the tag USER and hyperlinks with the tag URL, and (b) removing emoticons.
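The tweet normalization steps can be sketched with regular expressions. The patterns for mentions and hyperlinks, the emoticon list and the function name `normalize_tweet` are assumptions for this illustration; (Bifet and Frank 2010) may define these tokens differently.

```python
import re

# Assumed patterns: "@name" for mentions, "http(s)://..." for hyperlinks.
MENTION = re.compile(r"@\w+")
HYPERLINK = re.compile(r"https?://\S+")
# Assumed emoticon list, reused from the examples given earlier in this survey.
EMOTICONS = (":-)", ":)", "=)", ":D", ":-(", ":(", "=(", ";(")


def normalize_tweet(text: str) -> str:
    """Replace mentions and hyperlinks with tags, then strip emoticons."""
    text = HYPERLINK.sub("URL", text)
    text = MENTION.sub("USER", text)
    for emoticon in EMOTICONS:
        text = text.replace(emoticon, "")
    # Collapse the whitespace left behind by removed emoticons.
    return " ".join(text.split())
```

Removing the emoticons matters because they were used to derive the training labels; leaving them in would let a classifier learn the label from the emoticon itself.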
The authors argue that the frequently used measure “prequential accuracy is not well-suited for data
streams with unbalanced data, and that a prequential estimate of Kappa should be used instead.” They
identify the reason as follows: the classes are not balanced, their balance can vary over time, and
one class is often much more frequent than the other. Hence, a more appropriate measure is one that
normalizes a classifier's accuracy by that of a chance predictor, such as the Kappa statistic
(Cohen 1960). They build on a suggestion by (Gama, Sebastião and Rodrigues 2009), who proposed
forgetting mechanisms for estimation, either by (a) sliding a window over the most recent observations
or (b) weighting observations with fading factors. The authors indicate that the output of the two
approaches is almost the same and thus suggest using a sliding window with the Kappa statistic. The
authors then experimented with three fast incremental methods for mining this data stream:
(a) multinomial naïve Bayes, (b) stochastic gradient descent (SGD) and (c) Hoeffding tree. On the basis
of their experiments, the authors suggested using SGD.
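A sliding-window Kappa estimate can be sketched as follows. Kappa is computed as (p0 - pc) / (1 - pc), where p0 is the observed accuracy in the window and pc is the agreement expected from a chance predictor with the same class proportions. The class name `SlidingKappa`, the window size and the convention of returning 0.0 when pc equals 1 are assumptions for this illustration rather than details from the surveyed paper.

```python
from collections import Counter, deque


class SlidingKappa:
    """Kappa statistic over the most recent (predicted, actual) pairs."""

    def __init__(self, window_size: int) -> None:
        # deque with maxlen drops the oldest pair automatically.
        self.window = deque(maxlen=window_size)

    def add(self, predicted: str, actual: str) -> None:
        self.window.append((predicted, actual))

    def kappa(self) -> float:
        n = len(self.window)
        if n == 0:
            return 0.0
        # Observed agreement: plain accuracy inside the window.
        p0 = sum(p == a for p, a in self.window) / n
        predicted = Counter(p for p, _ in self.window)
        actual = Counter(a for _, a in self.window)
        # Chance agreement from the marginal class frequencies.
        pc = sum(predicted[c] * actual[c] for c in actual) / (n * n)
        if pc == 1.0:
            return 0.0  # assumed convention: Kappa is undefined here
        return (p0 - pc) / (1.0 - pc)
```

For example, a classifier that always predicts the majority class on a window with nine positive and one negative post reaches 90% accuracy but a Kappa of 0, which is the imbalance problem the authors describe.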
This work successfully addresses the problem of streaming data, and its novel solution can be viewed
as an ideal solution.
2.7 The case of irony
The last paper I examine is a more recent one by Bosco, et al. (Bosco, Patti and Bolioli 2013),
a portion of which addresses the case of irony. From the perspective relevant here, irony can be
identified as
a polarity reverser. That being said, the question arises of how to identify irony (and other figures
of speech). The authors suggest that context knowledge is important for identifying irony. In Facebook
comment threads, diagonal comments can be marked as ironic. But in context-less circumstances
(e.g. Twitter), world knowledge is required. Again, the interpretation of ironic speech can be
subjective. Hence, the authors see the need for developing manually annotated corpora for irony
detection and pose this as an open question to investigate.
3 CONCLUSION
In this paper, I have tried to present the core ideas behind the surveyed texts. These texts are all
related to the sentiment mining and sentiment analysis problem domain and its challenges. Each of them
addresses a different aspect of this vast problem domain and provides insight into how to build a
complete solution for mining large text data and extracting sentiment from it. This survey describes
what sentiment is, how to classify it, and how to use data mining and machine learning techniques to
extract opinion from large corpora. It also discusses a few approaches addressing the challenges of
domain dependence, ironic speech, streaming data and so forth. It offers insights into identifying
opinions related to entities and tracing sentiment transitions over time.
4 REFERENCES
Benamara, Farah, Carmine Cesarano, Antonio Picariello, Diego Reforgiato, and VS Subrahmanian.
2007. "Sentiment Analysis: Adjectives and Adverbs are better than Adjectives Alone."
International Conference on Weblogs and Social Media. Boulder, CO USA: ICWSM.
Bifet, Albert, and Eibe Frank. 2010. "Sentiment knowledge discovery in twitter streaming data." In
Discovery Science, 1-15. Berlin Heidelberg: Springer.
Boiy, Erik, Pieter Hens, Koen Deschacht, and Marie-francine Moens. 2007. "Automatic Sentiment
Analysis in On-line Text." Proceedings of Conference on Electronic Publishing. Vienna, Austria:
ELPUB. 349-360.
Bosco, Cristina, Viviana Patti, and Andrea Bolioli. 2013. "Developing Corpora for Sentiment Analysis:
The Case of Irony and Senti-TUT." IEEE Intelligent Systems (IEEE Computer Society) 55-63.
Cohen, Jacob. 1960. "A coefficient of agreement for nominal scales." Educational and Psychological
Measurement 37-46.
Gama, João, Raquel Sebastião, and Pedro Pereira Rodrigues. 2009. "Issues in evaluation of stream
learning algorithms." Proceedings of the 15th ACM SIGKDD International Conference. ACM. 329-338.
Lafferty, John D., Andrew McCallum, and N.C. Fernando. 2001. "Conditional random fields:
Probabilistic." Proceedings of the Eighteenth International Conference on Machine Learning. San
Francisco, CA, USA.: Morgan Kaufmann Publishers Inc. 282-289.
Lin, Chenghua, and Yulan He. 2009. "Joint sentiment/topic model for sentiment analysis." Proceedings
of the 18th ACM conference on Information and knowledge management. ACM. 375-384.
Martin, J. R., and P. R. R. White. 2005. Language of Evaluation: Appraisal in English. London: Palgrave.
http://grammatics.com/appraisal/.
Pak, Alexander, and Patrick Paroubek. 2010. "Twitter as a Corpus for Sentiment Analysis and Opinion
Mining." Language Resources and Evaluation. 1320-1326.
Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. 2002. "Thumbs up? Sentiment Classification
using Machine Learning Techniques." Proceedings of the ACL-02 conference on Empirical methods in
natural language processing. Philadelphia, PA, USA: Association for Computational Linguistics.
79-86.
Read, Jonathon. 2005. "Using emoticons to reduce dependency." The Association for Computer Linguistics.
Schmid, Helmut. 1994. "Probabilistic part-of-speech tagging using decision trees." Proceedings of the
International. 44-49.
Subrahmanian, Venkatramana S., and Diego Reforgiato. 2008. "AVA: Adjective-verb-adverb
combinations for sentiment analysis." Intelligent Systems (IEEE) 23 (4): 43-50.
Whitelaw, Casey, Navendu Garg, and Shlomo Argamon. 2005. "Using appraisal groups for sentiment
analysis." Proceedings of the 14th ACM international conference on Information and knowledge management.
ACM. 625-631.