Aspect Miner
   Fine-grained feature level opinion mining
          from rated review corpora

              MSc Thesis Defense | February 2012



                     Stelios Karabasakis
       Dept. of Informatics and Telecommunications
       National and Kapodistrian University of Athens

in association with the Knowledge Discovery in Databases Laboratory
                           kddlab.di.uoa.gr
INTRODUCTION


                       Opinion Mining: an overview
 What is it? The task of recognizing and classifying the
  opinions and sentiments expressed in unstructured text.

 Use cases                                 Opinion sources
  • product comparison                      • news
  • opinion summarization                   • blogs
  • opinion-aware recommendation systems    • reviews  ← our focus in this work
  • opinion-aware online advertising        • user comments
  • reputation management                   • social networks
  • business intelligence                   • forums
  • government intelligence                 • discussion groups

Stelios Karabasakis   Aspect Miner: Fine-grained feature-level opinion mining from rated review corpora   Feb 2012   2
INTRODUCTION


                      Reviews

 • Popular form of user-generated content
     » consumers use them to make informed choices
     » businesses use them to gauge and monitor consumer sentiment

 • Covering many distinct domains, such as movies, books,
   hotels, restaurants, goods and services

INTRODUCTION


                                                    Ratings
 • Every online review typically
   carries a rating
     » picked by the review author
     » summarizes the sentiment of
         the text

 • Corpora of rated reviews are
     » abundant on the web
     » potentially useful for
       supervised opinion mining
     » largely ignored in the literature!


INTRODUCTION


                      Opinion Mining is challenging
 Not as simple as counting positive vs. negative words
     It is pointless to discuss why Hitchcock was a genius.

 Distinct opinions about different topics in the same sentence
     The top-notch production values are not enough to distract from a
     clichéd story that lacks heart and soul.

 Semantics of subjective expressions are domain-dependent
     unpredictable plot twist, gloomy atmosphere                                                 (movies)
     unpredictable service quality, gloomy room                                                  (hotels)


INTRODUCTION


          Opinion Mining is a text classification problem

 classification dimensions
     • subjectivity: factual vs. subjective statements
     • polarity: positive vs. negative sentiment
     • intensity: weak vs. strong sentiment

 classification granularity
     • binary
     • multiclass

 ? Motivating question
 How can we train a system to distinguish
 among multiple degrees of sentiment?




INTRODUCTION


                                     Classification levels
                      document level
 In “Game of Thrones” (2011), the transition from book to screen is
 remarkably successful. The carefully chosen location and cast, the
 top-notch cinematography and the seamlessness of its narrative come
 together brilliantly. The new HBO show offers compelling drama,
 even when rehashing old fantasy themes.                    → positive




INTRODUCTION


                                    Classification levels
                      sentence level
 In “Game of Thrones” (2011), the transition from
 book to screen is remarkably successful.                   → positive
 The carefully chosen location and cast, the top-notch
 cinematography and the seamlessness of its narrative
 come together brilliantly.                                 → positive
 The new HBO show offers compelling drama, even when
 rehashing old fantasy themes.                              → positive




INTRODUCTION


                                    Classification levels
                      feature level
         features = domain-specific ratable properties
 In “Game of Thrones” (2011), the transition from           → adaptation: positive
 book to screen is remarkably successful.
 The carefully chosen location and cast, the top-notch      → production: positive
 cinematography and the seamlessness of its narrative          cast: positive
 come together brilliantly.                                    direction: positive
                                                               plot: positive
 The new HBO show offers compelling drama, even when        → serialization: positive
 rehashing old fantasy themes.                                 subject: negative

                                                                  ? Motivating question
                                                                  How can we identify feature terms
                                                                  and the features they refer to?
INTRODUCTION


                                       Problem description

  Produce rich, fine-grained, feature-oriented review summaries
        by analyzing reviews at the sentence level and aggregating the results


                                  Sample summary
  “Avatar” (2009)                            aggregated summary of 90 reviews
   aspect       mentions   sentiment mean           sentiment dispersion
   direction      217      9/10 STRONGLY POSITIVE   17% UNANIMOUS AGREEMENT
   story          152      8/10 POSITIVE            32% GENERAL AGREEMENT
   acting         177      4/10 WEAKLY NEGATIVE     56% MIXED REACTION



INTRODUCTION


                                 Solution components
 a sentiment lexicon,                term            prior sentiment
 multiclass and adapted             masterpiece     10 (very strongly positive)
 to the target domain               good             8 (positive)
                                    mediocre         5 (very weakly negative)
                                    terrible         2 (strongly negative)

 a feature lexicon                  feature term     feature
 for the target domain              protagonist      CAST
                                    performance      CAST
                                    deliver          CAST
                                    camera           DIRECTION
                                    cinematography   DIRECTION
                                    dialogue         WRITING
                                    script           WRITING


        and a set of linguistic rules for sentence classification
INTRODUCTION


                                The Aspect Miner system
                      (a proof-of-concept implementation of our approach)

 Training subsystem:
   Training corpus (rated reviews) → Lexical Analyzer → Index of terms
   Index of terms → Feature identifier → Feature lexicon
   Index of terms → Term classifier   → Sentiment lexicon

 Classification:
   Text to classify → Sentence classifier (consults the feature and
   sentiment lexicons) → Result: Feature-level sentiments

             Key features: modular architecture, unsupervised,
                           domain agnostic, configurable granularity
INTRODUCTION


                      Aspect Miner implementation*
 • Implemented in Java with
     » NekoHTML for scraping
     » JDBC/MySQL for dataset storage
     » Lucene as a lexical analysis API and for indexing
     » Wordnet & JWNL for lemmatization
     » Stanford Parser for POS-tagging & typed dependency parsing
     » Mallet’s LDA implementation for topic modeling
     » GraphViz for visualizations


                                                                 * source code (MIT-licensed) available from
                                                                 github.com/skarabasakis/ImdbAspectMiner

INTRODUCTION


                                        Training dataset*
 107,646 movie reviews from IMDB.com, rated 1-10 stars
                                                  *available as an SQL dump from http://db.tt/vAthzJRL



            [histogram: number of reviews by review length in words;
             mean = 291 words, median = 228 words]


Sentiment Lexicon Construction
Designing a fine-grained term classifier
SENTIMENT LEXICON


                                                           Terms

A term is a (base form, part of speech) tuple
     » part of speech                    {VERB, NOUN, ADJECTIVE, ADVERB}

     » a term represents all inflected forms and spellings of a word
          e.g. {choose, chooses, chose, chosen, …} → [choose VERB]
               {localise, localize, …} → [localize VERB]

     » terms can be compound
         e.g. [work out VERB]                                      [common sense NOUN]
              [meet up with VERB]                                  [as a matter of fact ADVERB]




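This term representation can be sketched as follows (Python here for brevity; the actual system is in Java and resolves base forms through WordNet/JWNL, so the tiny irregular-form and spelling tables below are purely illustrative stand-ins):

```python
# Minimal sketch: normalize surface forms into (base form, POS) term tuples.
# The hand-rolled tables below are hypothetical; the real lookup uses WordNet.

IRREGULAR = {"chose": "choose", "chosen": "choose", "chooses": "choose"}
SPELLING = {"localise": "localize"}

def to_term(word, pos):
    """Map an inflected or variant spelling to its (base form, POS) term."""
    w = word.lower()
    w = IRREGULAR.get(w, w)   # collapse inflected forms
    w = SPELLING.get(w, w)    # collapse spelling variants
    return (w, pos)
```

All inflected forms and spellings of a word thus index the same lexicon entry.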
SENTIMENT LEXICON


                       Lexical analyzer
 Purpose: to extract terms from texts

 Pipeline (applied to each review in the training corpus):
   Tokenization → POS tagging → Named Entity identification →
   Lemmatization → Comparatives annotation → Negation scope
   resolution → Stop word removal → Open-class word filtering
   → Bags of terms (one per document)

 » Identifies the base form of words & compounds
     • Uses Wordnet to look up base forms

 » Eliminates non-subjective words
     • Stop words, including very common terms (be, have, …)
     • Named Entities (i.e. proper nouns)
     • all articles, pronouns, prepositions etc.

 » Eliminates words that would be misleading for sentiment classification
     • Comparatives & superlatives
     • Words within a negation scope


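The filtering stages at the end of the pipeline can be sketched as a single pass over tagged tokens. This is a simplified Python sketch, not the Java implementation: the (lemma, POS, in-negation-scope) triple format and the tiny stop list are assumptions made for illustration.

```python
# Hypothetical stop list; the real analyzer uses a much larger one.
STOP = {"be", "have", "the", "a", "not", "do", "until", "to", "in"}

def bag_of_terms(tagged_tokens):
    """tagged_tokens: (lemma, POS, in_negation_scope) triples.
    Returns a bag (term -> count) of indexable terms."""
    bag = {}
    for lemma, pos, negated in tagged_tokens:
        if negated:
            continue                      # drop words inside a negation scope
        if pos not in {"VERB", "NOUN", "ADJECTIVE", "ADVERB"}:
            continue                      # open-class word filtering
        if lemma in STOP:
            continue                      # stop word removal
        term = (lemma, pos)
        bag[term] = bag.get(term, 0) + 1
    return bag
```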
SENTIMENT LEXICON


                                  Lexical analysis example
                The most dramatic moment in the Sixth Sense does not occur until the
                final minutes and the jaw dropping twist Shyamalan has been building up to.

                       ↓ Lemmatize
                       ↓ Eliminate
                       ↓ Get indexable terms

                [the intermediate output of each step is shown as a figure
                 in the original slide]




SENTIMENT LEXICON


             Previous approaches to term classification
Lexicon-based approach
• Prior sentiment inferred from lexical associations
  (synonyms, antonyms, hypernyms etc.) in a dictionary
• High accuracy, limited coverage
• Notable example: Sentiwordnet (Esuli & Sebastiani 2006)
Corpus-based approach
• Prior sentiment inferred from correlation patterns
  (and, or, either…or, but etc.) in a training corpus
• Extended coverage, lower accuracy
• Notable examples: Hatzivassiloglou & McKeown 1997, Turney & Littman 2003,
     Popescu & Etzioni 2005, Ding, Liu & Yu 2008


SENTIMENT LEXICON


                         Ratings-based term classification
Our proposal: a ratings-based approach

• Requires a training set of rated reviews

• Prior sentiment inferred from the distribution of ratings
  among all the reviews where a term occurs, i.e. the
  rating histogram of the term

  [sample histogram shapes shown in the original slide:
   positive term, negative term, neutral term, polysemous term]

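Building rating histograms from a rated corpus is straightforward to sketch (Python for brevity; the input format — (rating, terms) pairs — is an assumption of this sketch):

```python
from collections import defaultdict

def rating_histograms(reviews):
    """reviews: iterable of (rating, terms-in-review) pairs.
    Returns term -> {rating: number of reviews with that rating
    in which the term occurs}."""
    hist = defaultdict(lambda: defaultdict(int))
    for rating, terms in reviews:
        for term in set(terms):          # count each term once per review
            hist[term][rating] += 1
    return hist
```

A term's histogram is then the per-rating row of this index, ready for weighting and classification.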
SENTIMENT LEXICON


                       IMDB dataset: Ratings distribution
                [dual-axis bar chart: number of reviews and number of
                 terms per rating value, 1–10]

                         Caution: Ratings are not evenly distributed
                                  across the training corpus.
SENTIMENT LEXICON


                             Rating frequency weighting

Why? Weighting is necessary to
     » eliminate training set biases
     » make rating frequencies comparable to each other

How? Multiply every rating frequency in a histogram
 by that rating’s weight w_r, calculated as follows:
     » c_r := cumulative term count of all reviews with rating r
     » We pick w_r in such a way that the products w_r · c_r are equal for all r
         • The most predominant rating in the dataset has w_r = 1
           (equivalently, w_r = c_max / c_r)
         • The less frequent the rating, the higher its weight

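The weighting rule follows directly from the two constraints above: if w_r · c_r must be equal for all ratings and the most frequent rating gets weight 1, then w_r = c_max / c_r. A sketch (Python for brevity):

```python
def rating_weights(term_counts):
    """term_counts: rating -> cumulative term count of all reviews with
    that rating.  Picks w_r so that w_r * c_r is equal for every rating,
    which gives the most frequent rating weight 1."""
    c_max = max(term_counts.values())
    return {r: c_max / c for r, c in term_counts.items()}

def weighted_histogram(hist, weights):
    """Multiply every rating frequency by that rating's weight."""
    return {r: f * weights[r] for r, f in hist.items()}
```

Less frequent ratings receive proportionally larger weights, as the slide states.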
SENTIMENT LEXICON


                                Some sample histograms
                                       extracted from the IMDB dataset




Stelios Karabasakis         Aspect Miner: Fine-grained feature-level opinion mining from rated review corpora   Feb 2012   23
SENTIMENT LEXICON


                                Designing a term classifier
                       input: weighted rating histogram for a term
                      output: one or more* sets of significant ratings
                                                                                  * if term is polysemous


                                                                      A weighted mean function can
                                                                      condense a set of significant
                                                                      ratings into a single rating.

                                                                      This rating indicates the term’s
                                                                      sentiment.


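The condensation step is a frequency-weighted mean over the histogram, which can be sketched in a few lines (Python for brevity):

```python
def mean_rating(hist):
    """Condense a (weighted) rating histogram into one representative
    rating: the frequency-weighted mean of the ratings."""
    total = sum(hist.values())
    return sum(r * f for r, f in hist.items()) / total
```

For example, a term seen once at rating 9, once at 7 and twice at 8 condenses to 8.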
SENTIMENT LEXICON


                                         Neutrality criterion
           For a term to be neutral, its rating histogram must
                   approximate a uniform distribution



                                         [uniformity is tested against a tolerance
                                          threshold θ, where 0 < θ ≤ 1]




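One way to operationalize "approximately uniform with tolerance θ" is to require the least frequent rating to reach at least θ times the most frequent one. This min/max formulation is an assumption of the sketch, not necessarily the thesis' exact criterion:

```python
def is_neutral(hist, theta=0.5, scale=range(1, 11)):
    """Heuristic uniformity test with tolerance 0 < theta <= 1.
    A rating absent from the histogram rules out uniformity."""
    freqs = [hist.get(r, 0) for r in scale]
    if min(freqs) == 0:
        return False
    return min(freqs) / max(freqs) >= theta
```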
SENTIMENT LEXICON


                             Term classification schemes
   Scheme 1: Peak Classifier
    Picks the histogram’s
     peak rating as the only
     significant rating




   Pros Simplest classifier possible. Useful as a comparison baseline.
        Surprisingly capable at classifying polarity (almost 2/3 accurate)
   Cons Can’t detect polysemy
        Poor at classifying intensity
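The Peak classifier is simple enough to state in one line (Python sketch):

```python
def peak_classifier(hist):
    """Return the histogram's peak rating as the only significant rating."""
    return {max(hist, key=hist.get)}
```

With no window or cutoff logic, it can only ever return one rating class, which is why it cannot detect polysemy.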
SENTIMENT LEXICON


                             Term classification schemes
   Scheme 2: Positive/Negative Area Classifier (PN)
    All ratings above a cutoff
     frequency are significant
        Cutoff frequency should
           be set a little bit above
           the frequency average.
    Returns separate sets for
       positive and negative
       ratings


   Pros Better at classifying intensity
        Makes an attempt at detecting polysemy
   Cons Weak terms can be mistaken for polysemous
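A PN sketch in Python: the cutoff sits "a little bit above" the average frequency, modeled here by a margin factor; the 10% margin and the 1–10 scale midpoint of 5.5 are assumptions of the sketch.

```python
def pn_classifier(hist, margin=1.1, scale=range(1, 11), neutral=5.5):
    """Positive/Negative Area classifier: every rating whose (weighted)
    frequency exceeds a cutoff slightly above the average is significant.
    Returns the (positive ratings, negative ratings) sets."""
    freqs = {r: hist.get(r, 0) for r in scale}
    cutoff = margin * sum(freqs.values()) / len(freqs)
    significant = {r for r, f in freqs.items() if f > cutoff}
    return ({r for r in significant if r > neutral},
            {r for r in significant if r < neutral})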
SENTIMENT LEXICON


                             Term classification schemes
   Scheme 3: Widest Window Classifier (WW)
    Looks for windows of
     consecutive significant ratings
    Ratings are added to windows
     from most to least frequent
    Significant rating windows must
     satisfy 2 constraints
        minimum coverage: windows must
         contain at least a minimum share of the samples
        be as wide as possible
    Returns as many rating classes
       as the windows it detects

   Pros Avoids detecting false polysemy
        Avoids biases exhibited by the other classification schemes
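The WW scheme can be sketched as a greedy pass over ratings in descending frequency order. The 75% coverage threshold and the exact stopping policy are assumptions of this sketch (the slide's coverage figure is not reproduced here):

```python
def ww_classifier(hist, min_coverage=0.75, scale=range(1, 11)):
    """Widest Window sketch: ratings are added from most to least
    frequent; a rating extends a window it is adjacent to, otherwise
    it opens a new window.  Growth stops once the windows jointly
    cover min_coverage of the samples."""
    total = sum(hist.get(r, 0) for r in scale) or 1
    windows = []                          # list of [lo, hi] inclusive ranges
    covered = 0.0
    for r in sorted(scale, key=lambda r: hist.get(r, 0), reverse=True):
        if covered / total >= min_coverage:
            break
        placed = False
        for w in windows:
            if r == w[0] - 1:
                w[0] = r; placed = True; break
            if r == w[1] + 1:
                w[1] = r; placed = True; break
        if not placed:
            windows.append([r, r])        # seed a new window
        covered += hist.get(r, 0)
    return [set(range(lo, hi + 1)) for lo, hi in windows]
```

A unimodal histogram yields one window (one rating class); a bimodal one yields two, i.e. a polysemy candidate.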
SENTIMENT LEXICON


               Classifier evaluation: Ratings Distribution
                                   We classified 33,000 terms
                          that appear ≥5 times in the IMDB dataset.
      Conclusion: WW classifier distributes rating classes more evenly

                [stacked bar chart: distribution of primary rating classes
                 for the PEAK, PN and WW classifiers]

SENTIMENT LEXICON


                            Classifier evaluation: Polarity
      We evaluate against a reference lexicon of 5272 terms
                 based on the MPQA and General Inquirer subjectivity lexicons.


   Classifier      Accuracy   Class      Precision   Recall   F1-Score
   PEAK            63.6%      POSITIVE   55.5%       44.2%    49.2%
                              NEGATIVE   67.3%       65.3%    66.3%
   PN              66.2%      POSITIVE   62.4%       58.4%    60.4%
                              NEGATIVE   68.4%       72.3%    70.3%
   WW              70.1%      POSITIVE   70.4%       86.2%    77.5%
                              NEGATIVE   69.6%       60.5%    64.8%
   SentiWordnet    73.2%      POSITIVE   63.6%       61.3%    62.4%
                              NEGATIVE   83.6%       48.3%    61.3%

    WW is the most accurate of the 3 proposed classifiers
    But not as accurate as SentiWordnet
    However, WW is more accurate for domain-specific terms



SENTIMENT LEXICON


                                              Classifier evaluation: Intensity
         We evaluate against a test set of 443 strong + 323 weak terms
                                          based on the General Inquirer subjectivity lexicon.

                              [bar chart: % of terms in the WW lexicon per
                               intensity class (1–5), for weak vs. strong
                               reference terms]

                              Using the WW classifier to classify intensity:
                               78% of strong terms are classified 3 and above
                               83% of weak terms are classified 3 and below


SENTIMENT LEXICON


                      The Aspect Miner sentiment lexicon*




          A reusable sentiment lexicon for the movie review domain
                                                                    * downloadable from
           github.com/skarabasakis/ImdbAspectMiner/blob/master/imdb_sentiment_lexicon.xls
Feature Identification
Using topic models for feature discovery
FEATURE IDENTIFICATION


                      Approaches to feature identification
The traditional approach: discovery through heuristics
• frequency: commonly occurring noun phrases are often features
     (Hu & Liu 2004)
• co-occurrence: terms commonly found near subjective expressions
     may be features (Kim & Hovy 2006, Qiu et al. 2011)
• language patterns: in phrases such as 'F of P' or 'P has F', P is a
     product and F is a feature (Popescu & Etzioni 2005)
• background knowledge: user annotations, ontologies, search
     engine results, Wikipedia data…
An up-and-coming approach: topic modeling

FEATURE IDENTIFICATION


                                        Topic Modeling
                  Probabilistic Topic Models can model the
                abstract topics that occur in a set of documents


                                                                                          documents are
                                                                                          mixtures of topics




                                                                                          topics are
                                                                                          distributions over words



FEATURE IDENTIFICATION


                                        Topic Modeling
Probabilistic topic models
• require that the user specifies a number of topics
     » Topics are just numbers – their semantic interpretation is not the model’s concern

• make an assumption about the probability distribution of topics
• define a probabilistic procedure for generating documents from topics
     » by inverting this procedure, we can infer topics from documents


A popular topic model: Latent Dirichlet Allocation (LDA)
• assumes that topics follow a Dirichlet prior distribution
     » i.e. each document is associated with just a small number of topics


FEATURE IDENTIFICATION


                                   Topics vs. Features
   ? Motivating question                                          Here are a few sample topics we
   Features are a form of topics. Can we                          got from running LDA on the
   use topic models to discover features?                         IMDB dataset

   ROLE               SCRIPT                      WAR                          POLICE                     CAR
   ACTOR              IDEA                        HERO                         CASE                       CHASE
   PERFORMANCE        DIALOGUE                    ATTACK                       MYSTERY                    SHOOT
   PLAY               WRITE                       GROUP                        VICTIM                     VEHICLE
   LEAD               PLOT                        AIRPLANE                     SOLVE                      COP
   CAST               SCREENPLAY                  BUNCH                        MURDER                     DRIVE
   SUPPORT            COME UP                     SOLDIER                      OFFICER                    KILL
   ACTRESS            CRAFT                       KILL                         SUSPECT                    STREET
   SHINE              EXPLAIN                     BOMB                         DETECTIVE                  BULLET
   STAR               HOLE                        ENEMY                        CRIME                      ROBBERY

       These topics are features.                                    These topics are themes.
           They are useful to us.                                We are not interested in them.
FEATURE IDENTIFICATION


                      Feature identification with LDA
Problem. Topics are global, features are local

Solution. Train topic model on shorter segments (e.g. sentences) rather
 than full documents.

Problem. Running LDA on such short segments produces noisy topics

Solution. Implement a bootstrap aggregation scheme to filter the noise:
   1. Train N topic models from different subsets of dataset
   2. Merge similar topics across models to produce a single meta-model
   » Intuition: Valid feature-topics should occur in >1 models and share many common top
       terms. Noisy topics should be isolated to specific models

FEATURE IDENTIFICATION


                                          Merging topics

                 COMEDY    0.200          COMEDY     0.180          COMEDY     0.380
                 JOKE      0.099          PARODY     0.168          PARODY     0.168
                 LAUGH     0.096    +     SATIRE     0.099    =     JOKE       0.160
                 FUN       0.088          JOKE       0.061          LAUGH      0.096
                 FORMULA   0.025          RIDICULE   0.054          SATIRE     0.099
                                                                    FUN        0.088
                                                                    RIDICULE   0.054   ← discarded
                                                                    FORMULA    0.025




       Topic Similarity for topics Tm, Tn
       » More common terms with higher
            probabilities  higher similarity


FEATURE IDENTIFICATION


                               Merging topic models
To merge two topic sets
• Merge every topic of set A into the most similar topic from set B
     » but only if that similarity is above the average pairwise similarity

To merge N topic sets
• Merge the first two, then merge the result with the third, and so on.
• At the end
     » discard topics with a low merging degree
     » if the same term ends up in more than one topic, keep it only in the
       topic where it has the highest probability
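The pairwise merge step above can be sketched as follows, with topics as term→probability dicts and a stand-in similarity measure. How unmatched topics are carried over is my assumption; the slide only specifies the "above average similarity" condition:

```python
def similarity(a, b):
    # Stand-in for the thesis's topic-similarity measure (assumption):
    # shared terms contribute the smaller of their probabilities.
    return sum(min(a[t], b[t]) for t in set(a) & set(b))


def merge_topics(a, b):
    # Probabilities of shared terms are summed, matching the slide's
    # example (COMEDY 0.200 + 0.180 -> 0.380); unique terms are kept.
    merged = dict(a)
    for term, p in b.items():
        merged[term] = merged.get(term, 0.0) + p
    return merged


def merge_topic_sets(set_a, set_b):
    """Merge every topic of set_a into its most similar topic of set_b,
    but only when that similarity is above the average pairwise similarity."""
    pairs = [similarity(a, b) for a in set_a for b in set_b]
    avg = sum(pairs) / len(pairs)
    result = list(set_b)
    for a in set_a:
        best, j = max((similarity(a, b), j) for j, b in enumerate(set_b))
        if best > avg:
            result[j] = merge_topics(a, set_b[j])
        else:
            result.append(a)  # unmatched topics carried over (assumption)
    return result


set_a = [{"comedy": 0.200, "joke": 0.099, "laugh": 0.096},
         {"soldier": 0.150, "bomb": 0.120}]
set_b = [{"comedy": 0.180, "parody": 0.168, "joke": 0.061},
         {"soldier": 0.140, "enemy": 0.110}]
merged = merge_topic_sets(set_a, set_b)
print(merged[0]["comedy"])  # ~0.380
```

Merging N sets then reduces to folding: merge the first two sets, merge that result with the third, and so on.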


FEATURE IDENTIFICATION


                                Movie feature lexicon
                      56 topics, manually labeled with 18 labels




Sentence classification
Utilizing language structure
for contextual sentiment estimation
and feature targeting
SENTENCE-LEVEL ANALYSIS


                                                    Sentiment
                      Sentiment: a (polarity, intensity) tuple, where
                        » polarity ∈ {+, −}
                        » intensity ∈ {1, 2, …, n}
                      giving 2n sentiment classes in total

   mbinary: R10 S1                        m3: R10 S3                            m5: R10 S5

   1                                1                                          1                     -5
   2                          We define a
                                    2          -3                              2                     -4
   3                   -1           3
                              mapping function                                 3                     -3
                                               -2
   4                                4                                          4                     -2
   5
                              to convert ratings to
                                    5          -1                              5                     -1
   6                          sentiment classes
                                    6          +1                              6                     +1
   7                                7                                          7                     +2
                              (preferably 1:1) +2
   8                   +1           8                                          8                     +3
   9                                9          +3                              9                     +4
  10                               10                                         10                     +5
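The three mappings can be captured by one closed-form function. The formula below is my reconstruction (it reproduces the m_binary, m3 and m5 tables), not notation from the thesis:

```python
import math

def rating_to_sentiment(rating, n):
    """Map a 1-10 review rating onto 2n sentiment classes
    {-n, ..., -1, +1, ..., +n}; there is no neutral class.

    Ratings 1-5 map to negative classes and 6-10 to positive ones,
    spreading each half evenly over the n intensity levels.
    """
    if rating <= 5:
        return -math.ceil((6 - rating) * n / 5)
    return math.ceil((rating - 5) * n / 5)

# m5 is the preferred one-to-one mapping:
print([rating_to_sentiment(r, 5) for r in range(1, 11)])
# [-5, -4, -3, -2, -1, 1, 2, 3, 4, 5]
```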

SENTENCE-LEVEL ANALYSIS


                                    Typed Dependencies
                                                                        Natalie Portman comes off as very believable,
Typed dependencies are binary                                               gaining empathy from the audience.
grammatical relations between
word pairs in a sentence
(de Marneffe et al., 2006)

    amod(relations, binary)
    type(governor,  dependent)


Typed dependency trees are
• semantically richer than syntax trees
• easier to process, because content words are connected directly
  rather than through function words
SENTENCE-LEVEL ANALYSIS


                                    Dependency types




                      Stanford Parser’s representation defines a
                          hierarchy of 48 dependency types
SENTENCE-LEVEL ANALYSIS


                      Contextual sentiment estimation
                      ? Motivating question
                      What is the contextual sentiment of a dependency,
                      given the prior sentiment of its constituents?

Examples

    "It is best to avoid watching          infmod(best/+2, avoid/−4) → −4
     any of the increasingly               xcomp(avoid/−4, watching/+2) → −2
     disappointing sequels."               advmod(disappointing/−2, increasingly/+3) → −3

Our model. We empirically developed and formally defined
• 6 outcome functions that model types of word interactions
• 42 dependency rules that cover all possible dependency patterns

SENTENCE-LEVEL ANALYSIS


                                   Outcome functions

     UNCHANGED: models an interaction where the base term imposes the sentiment
                       "It seems that they ran out of budget."

       STRONGER: the stronger term imposes the sentiment
                       "a mighty talent wasted in mass-produced rom-coms"

            AVG: both terms contribute equally to the sentiment
                       "intelligent and ambitious"

SENTENCE-LEVEL ANALYSIS


                                   Outcome functions

      INTENSIFY: models an interaction where the modifier increases the intensity of the base
                       "increasingly disappointing sequels"

        REFLECT: the modifier overrides the polarity and increases or decreases the intensity of the base
                       "impossible to enjoy unless you lower your expectations"

            NEG: the modifier diminishes or negates the base
                       "not a masterpiece, but not bad either"

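With sentiments encoded as signed integers on the ±5 scale, the outcome functions can be given a concrete form. REFLECT and INTENSIFY below are chosen to reproduce the slide examples; the exact formulas for AVG, STRONGER and NEG (in particular the shifted-negation constant) are my assumptions rather than the thesis's definitions:

```python
N = 5  # intensity scale, matching the R10 -> S5 rating mapping

def sign(s):
    return 1 if s > 0 else -1

def unchanged(base, other):
    # Base term imposes the sentiment; the other term is ignored.
    return base

def stronger(a, b):
    # The term with the higher intensity imposes its sentiment.
    return a if abs(a) >= abs(b) else b

def avg(a, b):
    # Both terms contribute equally; intensity never drops below 1.
    mean = (a + b) / 2
    return sign(mean or a) * max(1, round(abs(mean)))

def intensify(base, modifier):
    # Modifier bumps the base one intensity level, as in
    # "increasingly disappointing": -2 -> -3.
    return sign(base) * min(N, abs(base) + 1)

def reflect(base, modifier):
    # sign(base) * modifier reproduces both slide examples:
    #   infmod(best/+2, avoid/-4)    -> -4
    #   xcomp(avoid/-4, watching/+2) -> -2
    return sign(base) * modifier

def neg(base):
    # Shifted negation: "not a masterpiece" (+4) -> mildly negative (-1),
    # "not bad" (-3) -> mildly positive (+2). The constant is an assumption.
    return -sign(base) * max(1, N - abs(base))

print(reflect(2, -4), reflect(-4, 2), intensify(-2, 3), neg(4))  # -4 -2 -3 -1
```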
SENTENCE-LEVEL ANALYSIS


                        Dependency Rules: General form

                          td(pgov, pdep)  outcome_base

type label                    term patterns                                outcome function                     base specifier
                      A pattern may specify:                 one of the following:                               GOV or DEP
                      • a list of allowed parts of speech UNCHANGED NEGATED
                      • a white list of specific terms    STRONGER         AVG
                                                          INTENSIFY        REFLECT
                                                          POSITIVE         NEGATIVE

              Examples conj(*,*)  AVG_DEP
                                   advmod({n,a,r},*)  INTENSIFY_GOV
                                   amod(*,{too})  NEGATIVE_GOV
SENTENCE-LEVEL ANALYSIS


                                      Aspect Miner dependency rule set
1. Negation
   1.1        neg(*, *) → NEGATE_GOV
   1.2.1–6    det | prt | advmod | dobj | nsubj | dep (*, negTerms¹) → NEGATE_GOV
   1.3        pobj(negTerms¹, *) → NEGATE_DEP
   1.4        aux(*, negAux²) → NEGATE_GOV

2. Subjects
   2.1.1–2    nsubj | nsubjpass (*, *) → INTENSIFY_GOV
   2.2.1–2    csubj | csubjpass (*, *) → REFLECT_GOV

3. Objects
   3.1.1      dobj(negVerbs³, *) → NEGATE_DEP
   3.1.2      dobj(*, *) → REFLECT_GOV
   3.2        iobj(*, *) → UNCHANGED_GOV
   3.3        pobj(*, *) → UNCHANGED_DEP

4. Modifiers
   4.1.1–2    advmod | amod (*, {enough}) → POSITIVE_GOV
   4.2.1–2    advmod | amod (*, {too}) → NEGATIVE_GOV
   4.3        advmod({v}, *) → REFLECT_GOV
   4.4        advmod({n,a,r}, *) → INTENSIFY_GOV
   4.5        amod(*, *) → REFLECT_GOV
   4.6        infmod({a}, *) → REFLECT_GOV
   4.7        infmod({v,n,r}, *) → INTENSIFY_DEP
   4.8        partmod({a}, *) → REFLECT_DEP
   4.9        partmod({v,n,r}, *) → STRONGER_DEP
   4.10       quantmod(*, *) → INTENSIFY_GOV
   4.11       prt(*, *) → STRONGER_GOV
   4.12       prep(*, *) → REFLECT_GOV
   4.13       prep(*, {like}) → UNCHANGED_GOV

5. Clausal modifiers
   5.1        advcl({a}, *) → REFLECT_DEP
   5.2        advcl({v,n,r}, *) → UNCHANGED_DEP
   5.3        purpcl(*, *) → UNCHANGED_DEP

6. Clausal complements
   6.1.1–3    ccomp | xcomp | acomp (*, *) → REFLECT_GOV
   6.2.1–3    conj | appos | parataxis (*, *) → AVG_GOV
   6.3        dep(*, *) → STRONGER_DEP

¹ negTerms = {n't, no, not, never, none, nothing, nobody, noone, nowhere, without,
  hardly, barely, rarely, seldom, against, minus, sans}
² negAux = {should, could, would, might, ought}
³ negVerbs = {avoid, cease, decline, forget, fail, miss, neglect, refrain, refuse, stop}

SENTENCE-LEVEL ANALYSIS


                      Sentence classification algorithm
Initialization
• Generate dependency tree from sentence
• Annotate subjective terms with prior polarities from sentiment lexicon
• Annotate feature terms with labels from feature lexicon
Sentiment estimation
• Apply the closest-matching rule to every dependency relation in the tree
     » The resulting sentiment of the dependency replaces the previous sentiment of the governor node
     » Dependencies are processed in reverse postfix order (bottom to top and right to left)

Feature targeting
• The scope of a feature term is a subtree that contains the term and goes
     » all the way down to the leaves
     » all the way up to the closest clausal dependency
• the sentiment at the root of the subtree gets assigned to the feature
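A toy version of the bottom-up pass: each node carries a prior sentiment, each edge carries the outcome assigned by its matched rule, and every dependency's result replaces the governor's previous sentiment. The tree encoding is hypothetical; the real system works on full Stanford dependency trees with the 42-rule set:

```python
def sign(s):
    return 1 if s > 0 else -1

# Outcome functions, with REFLECT and INTENSIFY as on the earlier slides.
OUTCOMES = {
    "REFLECT":   lambda gov, dep: sign(gov) * dep,
    "INTENSIFY": lambda gov, dep: sign(gov) * min(5, abs(gov) + 1),
    "UNCHANGED": lambda gov, dep: gov,
}

def classify(node):
    """node = (prior_sentiment, [(outcome, child_node), ...]).

    Children are folded into the governor bottom-up and right-to-left
    (reverse postfix order); each dependency's result replaces the
    governor's previous sentiment.
    """
    sentiment, edges = node
    for outcome, child in reversed(edges):
        sentiment = OUTCOMES[outcome](sentiment, classify(child))
    return sentiment

# "It is best to avoid watching ...":
#   best/+2 --REFLECT--> avoid/-4 --REFLECT--> watching/+2
sentence = (2, [("REFLECT", (-4, [("REFLECT", (2, []))]))])
print(classify(sentence))  # -2: the sentence comes out negative
```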
SENTENCE-LEVEL ANALYSIS


                      Sentence classification example




SENTENCE-LEVEL ANALYSIS


                           Sentence polarity evaluation
     Test set: Sentence polarity dataset by Pang & Lee, 2005
                  (5331 positive + 5331 negative sentences from movie reviews)

Results
Polarity classification is accurate for
         71.5% of positive sentences
         76.9% of negative sentences
         74.2% of all sentences
Analysis of error causes
         39.0%        inaccurate dependency rule
         28.5%        misclassified term (or we picked the wrong sense)
         21.5%        erroneous sentence parsing
          8.5%        ambiguous sentence
          2.5%        dependency rules applied in the wrong order
SENTENCE-LEVEL ANALYSIS


                                       Comparative evaluation
                      Reference                              Method                                      Accuracy

                                                      Linguistic methods

                      Nakagawa, Inui & Kurohashi, 2010       majority voting                             62.9%
                      Ikeda & Takamura, 2008                 majority voting with negations              65.8%
                      Aspect Miner                           dependency rules                            74.2%

                                                   Learning-based methods

                      Andreevskaia & Bergler, 2008           naïve Bayes                                 69.0%
                      Nakagawa, Inui & Kurohashi, 2010       SVM (bag-of-features)                       76.4%
                      Arora, Mayfield et al., 2010           genetic programming                         76.9%
                      Ikeda & Takamura, 2008                 SVM (sentence-wise learning with
                                                             polarity shifting + n-grams)                77.0%
                      Nakagawa, Inui & Kurohashi, 2010       dependency tree CRFs                        77.3%


         Conclusion: Our method fares well among linguistic techniques,
          but does not match the accuracy of learning-based methods
Conclusions
Putting it all together
CONCLUSIONS

Training subsystem

• Lexical Analyzer: tokenization, POS tagging, named-entity identification,
  lemmatization, comparatives annotation, negation scope resolution,
  stop word removal and open-class word filtering turn the training corpus
  (rated reviews) into bags of terms (one per document).
• Term classifier: corpus statistics collection and term indexing feed
  term histogram generation; the PEAK, PN and WW classifiers produce
  the sentiment lexicon.
• Feature identifier: the training set is partitioned; LDA is trained on
  each partition, yielding topic models TM1 … TMN, which are aggregated
  and labeled with assistance to produce the feature lexicon.

Classification subsystem (sentence classifier)

• Text to classify → dependency parsing → dependency tree(s) →
  sentence & feature classification, driven by the sentiment lexicon,
  the feature lexicon and the dependency rule set.
• Result: feature–sentiment pairs.
CONCLUSIONS


                         Summary of contributions
• We showed the feasibility of granular prior polarity classification
  using review ratings
     » and developed a classifier that achieved at least 70% accuracy
       on the training dataset

• We suggested a bagging-inspired meta-algorithm for discovering
  feature topics with LDA

• We developed a reusable sentiment lexicon and feature lexicon
  for the movie review domain

• We created a set of linguistic rules and developed a methodology
  capable of fine-grained feature-level classification of sentences
     » and achieved 74.2% accuracy for polarity classification
       on our test dataset.



CONCLUSIONS


                          Suggested Improvements
Term classification
• Assigning a special class to intensifier terms
• Per-feature polysemy resolution

Feature identification
• Named entities as features
• Applying multi-grain topic models for the discovery of local topics,
  e.g. MG-LDA (Titov & McDonald, 2008)

Sentence-level classification
• Supervised learning of rules: replace the manually-built rule set with
  a set of rules inferred from frequent dependency patterns.

CONCLUSIONS


                                                            References
             For a complete list of references, see the full report (in Greek):
                                      http://j.mp/AspectMiner

B. Liu, "Sentiment analysis and subjectivity," Handbook of Natural Language Processing, 2010.
B. Pang and L. Lee, "Opinion mining and sentiment analysis," Foundations and Trends in Information
      Retrieval, vol. 2, no. 1–2, pp. 1–135, 2008.
A. Esuli and F. Sebastiani, "SentiWordNet: A publicly available lexical resource for opinion mining,"
      in Proceedings of LREC, 2006, vol. 6, pp. 417–422.
V. Hatzivassiloglou and K. R. McKeown, "Predicting the semantic orientation of adjectives," in
      Proceedings of the eighth conference of the European chapter of the Association for
      Computational Linguistics, 1997, pp. 174–181.
P. Turney, M. L. Littman, et al., "Measuring praise and criticism: Inference of semantic orientation
      from association," ACM Transactions on Information Systems (TOIS), 2003.
A. M. Popescu and O. Etzioni, "Extracting product features and opinions from reviews," in
      Proceedings of the conference on Human Language Technology and Empirical Methods in
      Natural Language Processing, 2005, pp. 339–346.
M. Hu and B. Liu, "Mining and summarizing customer reviews," in Proceedings of the tenth ACM
      SIGKDD international conference on Knowledge Discovery and Data Mining, 2004, pp. 168–177.
X. Ding, B. Liu, and P. S. Yu, "A holistic lexicon-based approach to opinion mining," in Proceedings
      of the international conference on Web Search and Web Data Mining, 2008, pp. 231–240.
I. Titov and R. McDonald, "Modeling online reviews with multi-grain topic models," in Proceedings
      of the 17th international conference on World Wide Web, 2008, pp. 111–120.
T. Nakagawa, K. Inui, and S. Kurohashi, "Dependency tree-based sentiment classification using
      CRFs with hidden variables," in Human Language Technologies: The 2010 Annual Conference
      of the North American Chapter of the Association for Computational Linguistics, 2010,
      pp. 786–794.
A. Andreevskaia and S. Bergler, "When specialists and generalists work together: Overcoming
      domain dependence in sentiment tagging," ACL-08: HLT, 2008.
D. Ikeda and H. Takamura, "Learning to shift the polarity of words for sentiment classification,"
      Computational Intelligence, vol. 25, no. 1, pp. 296–303, 2008.


Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 

Último (20)

FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipino
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptx
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 

Destacado

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by HubspotMarius Sescu
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Destacado (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Aspect Miner: Fine-grained, feature-level opinion mining from rated review corpora

  • 1. Aspect Miner: Fine-grained feature-level opinion mining from rated review corpora. MSc Thesis Defense | February 2012. Stelios Karabasakis, Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens, in association with the Knowledge Discovery in Databases Laboratory, kddlab.di.uoa.gr
  • 2. INTRODUCTION: Opinion Mining, an overview. What is it? The task of recognizing and classifying the opinions and sentiments expressed in unstructured text. Use cases: product comparison; opinion summarization; opinion-aware recommendation systems; opinion-aware online advertising; reputation management; business intelligence; government intelligence. Opinion sources: news; blogs; reviews (our focus in this work); user comments; social networks; forums; discussion groups. Stelios Karabasakis, Aspect Miner: Fine-grained feature-level opinion mining from rated review corpora, Feb 2012.
  • 3. INTRODUCTION: Reviews. A popular form of user-generated content: consumers use them to make informed choices, and businesses use them to gauge and monitor consumer sentiment. Reviews cover many distinct domains, such as movies, books, hotels, restaurants, goods, and services.
  • 4. INTRODUCTION: Ratings. Every online review typically carries a rating, picked by the review author, that summarizes the sentiment of the text. Corpora of rated reviews are abundant on the web, potentially useful for supervised opinion mining, and largely ignored in the literature!
  • 5. INTRODUCTION: Opinion Mining is challenging. It is not as simple as counting positive vs. negative words: "It is pointless to discuss why Hitchcock was a genius." Distinct opinions about different topics can appear in the same sentence: "The top-notch production values are not enough to distract from a clichéd story that lacks heart and soul." And the semantics of subjective expressions are domain-dependent: "unpredictable plot twist", "gloomy atmosphere" (movies) vs. "unpredictable service quality", "gloomy room" (hotels).
  • 6. INTRODUCTION: Opinion Mining is a text classification problem. Classification dimensions: subjectivity (factual vs. subjective statements); polarity (positive vs. negative sentiment); intensity (weak vs. strong sentiment). Classification granularity: binary or multiclass. Motivating question: how can we train a system to distinguish among multiple degrees of sentiment?
  • 7. INTRODUCTION: Classification levels, document level. Example review, classified positive as a whole: "In 'Game of Thrones' (2011), the transition from book to screen is remarkably successful. The carefully chosen location and cast, the top-notch cinematography and the seamlessness of its narrative come together brilliantly. The new HBO show offers compelling drama, even when rehashing old fantasy themes."
  • 8. INTRODUCTION: Classification levels, sentence level. The same example, with each sentence classified individually: all three sentences are classified positive.
  • 9. INTRODUCTION: Classification levels, feature level. Features = domain-specific ratable properties. In the same example: adaptation positive; production positive; cast positive; direction positive; plot positive; serialization positive; subject negative. Motivating question: how can we identify feature terms and the features they refer to?
  • 10. INTRODUCTION: Problem description. Produce rich, fine-grained, feature-oriented review summaries by analyzing reviews at the sentence level and aggregating the results. Sample summary for "Avatar" (2009), aggregated from 90 reviews (aspect / mentions / mean sentiment / sentiment dispersion): direction, 217 mentions, 9/10 strongly positive, 17% (unanimous agreement); story, 152 mentions, 8/10 positive, 32% (general agreement); acting, 177 mentions, 4/10 weakly negative, 56% (mixed reaction).
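The aggregation step behind such a summary can be sketched in a few lines. This is an illustrative sketch, not the thesis code: the `summarize` function, the input shape (a list of 1-10 ratings per feature), and the dispersion measure (population standard deviation normalized by its maximum on a 1-10 scale) are all assumptions made for the example.

```python
from statistics import mean, pstdev

def summarize(feature_sentiments):
    """Aggregate per-mention feature sentiments (1-10 ratings collected at
    the sentence level) into a per-feature summary: mention count, mean
    sentiment, and a dispersion percentage. The dispersion formula here is
    an assumed stand-in for the measure shown on the slide."""
    summary = {}
    for feature, ratings in feature_sentiments.items():
        summary[feature] = {
            "mentions": len(ratings),
            "mean_sentiment": round(mean(ratings), 1),
            # pstdev on a 1-10 scale is at most 4.5, so this yields 0-100%
            "dispersion": round(100 * pstdev(ratings) / 4.5),
        }
    return summary
```

Low dispersion then reads as "unanimous agreement", high dispersion as "mixed reaction".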
  • 11. INTRODUCTION: Solution components. (1) A sentiment lexicon, multiclass and adapted to the target domain, mapping terms to prior sentiments, e.g. masterpiece 10 (very strongly positive), good 8 (positive), mediocre 5 (very weakly negative), terrible 2 (strongly negative). (2) A feature lexicon for the target domain, mapping feature terms to features, e.g. protagonist, performance, deliver to CAST; camera, cinematography to DIRECTION; dialogue, script to WRITING. (3) A set of linguistic rules for sentence classification.
  • 12. INTRODUCTION: The Aspect Miner system, a proof-of-concept implementation of our approach. In the training subsystem, a training corpus of rated reviews passes through the Lexical Analyzer into an index of terms, which feeds a Feature identifier (producing the feature lexicon) and a Term classifier (producing the sentiment lexicon). Both lexicons drive the Sentence classifier, which turns text to classify into feature-level sentiments. Key features: modular architecture, unsupervised, domain agnostic, configurable granularity.
  • 13. INTRODUCTION: Aspect Miner implementation. Implemented in Java with NekoHTML for scraping; JDBC/MySQL for dataset storage; Lucene as a lexical analysis API and for indexing; Wordnet & JWNL for lemmatization; Stanford Parser for POS-tagging & typed dependency parsing; Mallet's LDA implementation for topic modeling; GraphViz for visualizations. Source code (MIT-licensed) available from github.com/skarabasakis/ImdbAspectMiner
  • 14. INTRODUCTION: Training dataset. 107,646 movie reviews from IMDB.com, rated 1-10 stars, available as an SQL dump from http://db.tt/vAthzJRL. Review length: mean = 291 words, median = 228 words.
  • 15. Sentiment Lexicon Construction Designing a fine-grained term classifier
  • 16. SENTIMENT LEXICON: Terms. A term is a (base form, part of speech) tuple, where the part of speech is one of {VERB, NOUN, ADJECTIVE, ADVERB}. A term represents all inflected forms and spellings of a word, e.g. {choose, chooses, chose, chosen, ...} maps to [choose VERB] and {localise, localize, ...} maps to [localize VERB]. Terms can be compound, e.g. [work out VERB], [common sense NOUN], [meet up with VERB], [as a matter of fact ADVERB].
  • 17. SENTIMENT LEXICON: Lexical analyzer. Purpose: to extract terms from texts. Pipeline stages: tokenization, POS tagging, Named Entity identification, lemmatization, comparatives annotation, negation scope resolution, stop word removal, open-class word filtering. The analyzer identifies the base form of words and compounds (using Wordnet to look up base forms); eliminates non-subjective words (stop words, including very common terms such as be and have; Named Entities, i.e. proper nouns; all articles, pronouns, prepositions etc.); and eliminates words that would be misleading for sentiment classification (comparatives and superlatives; words within a negation scope). Input: a training corpus of rated reviews. Output: bags of terms, one per document.
  • 18. SENTIMENT LEXICON: Lexical analysis example. "The most dramatic moment in the Sixth Sense does not occur until the final minutes and the jaw dropping twist Shyamalan has been building up to." The pipeline lemmatizes the words, eliminates non-subjective and misleading ones, and outputs the remaining indexable terms.
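A toy version of this pipeline can be sketched as follows. The tiny lemma table and stop list are illustrative stand-ins for the Wordnet lookups and filtering stages the real Java analyzer performs, and the capitalization test is only a crude approximation of Named Entity identification.

```python
# Illustrative lemma table and stop list (assumptions, not the real lexicon).
LEMMAS = {"chooses": "choose", "chose": "choose", "chosen": "choose",
          "dropping": "drop", "minutes": "minute"}
STOP_WORDS = {"the", "a", "an", "in", "of", "to", "and", "has", "been",
              "does", "not", "until", "most", "be", "have", "up"}

def analyze(sentence):
    """Tokenize, drop proper nouns (approximated as capitalized
    mid-sentence words), lemmatize via the toy table, filter stop words,
    and return the bag of indexable terms."""
    terms = []
    for i, tok in enumerate(sentence.split()):
        if i > 0 and tok[0].isupper():   # crude Named Entity filter
            continue
        word = LEMMAS.get(tok.strip(".,!?").lower(), tok.strip(".,!?").lower())
        if word and word not in STOP_WORDS:
            terms.append(word)
    return terms
```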
  • 19. SENTIMENT LEXICON: Previous approaches to term classification. Lexicon-based approach: prior sentiment inferred from lexical associations (synonyms, antonyms, hypernyms etc.) in a dictionary; high accuracy, limited coverage; notable example: SentiWordnet (Esuli & Sebastiani 2006). Corpus-based approach: prior sentiment inferred from correlation patterns (and, or, either...or, but etc.) in a training corpus; extended coverage, lower accuracy; notable examples: Hatzivassiloglou & McKeown 1997, Turney & Littman 2003, Popescu & Etzioni 2005, Ding, Liu & Yu 2008.
  • 20. SENTIMENT LEXICON: Ratings-based term classification. Our proposal: a ratings-based approach. It requires a training set of rated reviews; the prior sentiment of a term is inferred from the distribution of ratings among all the reviews where the term occurs, i.e. the rating histogram of the term. The shape of the histogram distinguishes positive, negative, neutral and polysemous terms.
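Collecting the rating histogram of each term is straightforward once the lexical analyzer has produced a bag of terms per review. A minimal sketch, where the input and output data shapes are assumptions:

```python
from collections import defaultdict

def term_rating_histograms(rated_reviews):
    """Build one rating histogram per term from a corpus of rated reviews.

    rated_reviews: iterable of (rating, bag_of_terms) pairs, e.g. the
    output of the lexical analyzer. Returns {term: {rating: frequency}}.
    """
    histograms = defaultdict(lambda: defaultdict(int))
    for rating, terms in rated_reviews:
        for term in terms:
            histograms[term][rating] += 1
    return histograms
```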
  • 21. SENTIMENT LEXICON: IMDB dataset, ratings distribution. Caution: ratings are not evenly distributed across the training corpus; both the number of reviews and the number of terms vary widely from rating to rating.
  • 22. SENTIMENT LEXICON: Rating frequency weighting. Why? Weighting is necessary to eliminate training set biases and to make rating frequencies comparable to each other. How? Multiply every rating frequency in a histogram with that rating's weight w_r, calculated as follows: let n_r be the cumulative term count of all reviews with rating r; we pick w_r in such a way that the products w_r * n_r are equal for all r. The most predominant rating in the dataset has w_r = 1; the less frequent a rating, the higher its weight.
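The weighting scheme follows directly from the definition: choosing w_r = max_r' n_r' / n_r makes every product w_r * n_r equal, gives the predominant rating a weight of 1, and gives rarer ratings proportionally higher weights. A sketch (function names are illustrative):

```python
def rating_weights(term_counts):
    """Compute a weight per rating so that weighted totals are equal.

    term_counts: dict mapping rating -> cumulative term count n_r over all
    reviews with that rating. The predominant rating gets weight 1; rarer
    ratings get proportionally higher weights, so w_r * n_r is the same
    for every rating r."""
    n_max = max(term_counts.values())
    return {r: n_max / n for r, n in term_counts.items()}

def weighted_histogram(raw_freqs, weights):
    """Multiply each raw rating frequency by that rating's weight."""
    return {r: raw_freqs.get(r, 0) * weights[r] for r in weights}
```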
  • 23. SENTIMENT LEXICON: Some sample histograms extracted from the IMDB dataset.
  • 24. SENTIMENT LEXICON: Designing a term classifier. Input: the weighted rating histogram of a term. Output: one or more sets of significant ratings (more than one if the term is polysemous). A weighted mean function can condense a set of significant ratings into a single rating, which indicates the term's sentiment.
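The condensation step can be a plain frequency-weighted mean over the significant ratings, rounded back onto the rating scale. This is a sketch; the exact weighting function is not fixed on the slide.

```python
def condense(significant):
    """Condense a set of significant ratings (rating -> weighted
    frequency) into a single sentiment rating via a frequency-weighted
    mean, rounded to the nearest rating on the 1-10 scale."""
    total = sum(significant.values())
    return round(sum(r * f for r, f in significant.items()) / total)
```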
  • 25. SENTIMENT LEXICON: Neutrality criterion. For a term to be neutral, its rating histogram must approximate a uniform distribution, i.e. every normalized rating frequency must lie close to the uniform value 1/10, within a tolerance parameter ε, where 0 < ε ≤ 1.
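The neutrality test amounts to a uniformity check on the weighted histogram. The formula on the slide did not survive extraction, so the tolerance form below is a cautious reconstruction: every relative frequency must stay within a factor ε (0 < ε ≤ 1) of the uniform value 1/10, and the default ε is an assumption.

```python
def is_neutral(hist, epsilon=0.5, n_ratings=10):
    """Neutrality criterion sketch: the normalized rating histogram must
    approximate the uniform distribution, i.e. every rating's relative
    frequency must lie within epsilon * (1/n_ratings) of 1/n_ratings."""
    total = sum(hist.values())
    uniform = 1.0 / n_ratings
    return all(
        abs(hist.get(r, 0) / total - uniform) <= epsilon * uniform
        for r in range(1, n_ratings + 1)
    )
```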
  • 26. SENTIMENT LEXICON: Term classification schemes. Scheme 1: Peak Classifier. Picks the histogram's peak rating as the only significant rating. Pros: the simplest classifier possible; useful as a comparison baseline; surprisingly capable at classifying polarity (almost 2/3 accurate). Cons: can't detect polysemy; poor at classifying intensity.
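The baseline scheme is essentially a one-liner over the weighted histogram:

```python
def peak_classifier(hist):
    """Peak Classifier: picks the histogram's peak rating as the only
    significant rating and returns it as a one-element set.
    hist maps rating -> weighted frequency."""
    return {max(hist, key=hist.get)}
```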
  • 27. SENTIMENT LEXICON: Term classification schemes. Scheme 2: Positive/Negative Area Classifier (PN). All ratings above a cutoff frequency are significant; the cutoff frequency should be set a little bit above the frequency average. Returns separate sets for positive and negative ratings. Pros: better at classifying intensity; makes an attempt at detecting polysemy. Cons: weak terms can be mistaken for polysemous.
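A sketch of the PN scheme. The exact cutoff ("a little bit above the frequency average") is left open on the slide, so the 10% margin below is an assumption, as is splitting the 1-10 scale between 5 and 6.

```python
def pn_area_classifier(hist, margin=1.1):
    """Positive/Negative Area classifier sketch: every rating whose
    weighted frequency exceeds the average frequency by a small margin is
    significant; significant ratings are split into a negative set (<= 5)
    and a positive set (>= 6)."""
    cutoff = margin * sum(hist.values()) / len(hist)
    significant = {r for r, f in hist.items() if f > cutoff}
    return ({r for r in significant if r > 5},   # positive ratings
            {r for r in significant if r <= 5})  # negative ratings
```

A weak term whose frequencies straddle the middle of the scale can end up with both sets non-empty, which is exactly the "mistaken for polysemous" failure mode the slide mentions.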
  • 28. SENTIMENT LEXICON: Term classification schemes. Scheme 3: Widest Window Classifier (WW). Looks for windows of consecutive significant ratings; ratings are added to windows from most to least frequent. Significant rating windows must satisfy two constraints: minimum coverage (a window must contain at least a minimum share of the samples) and maximal width (windows should be as wide as possible). Returns as many rating classes as the windows it detects. Pros: avoids detecting false polysemy; avoids biases exhibited by the other classification schemes.
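A simplified sketch of the WW scheme. The window-growing details (what happens when two windows meet, tie-breaking among equal frequencies) and the 25% coverage threshold are assumptions; the slide only fixes the two constraints.

```python
def widest_window_classifier(hist, min_coverage=0.25):
    """Widest Window classifier sketch: grow windows of consecutive
    significant ratings, adding ratings from most to least frequent, and
    keep every window covering at least `min_coverage` of all samples.
    Returns a list of (low, high) rating windows, one per detected sense
    of the term."""
    total = sum(hist.values())
    windows = []  # list of sets of consecutive ratings
    for r in sorted(hist, key=hist.get, reverse=True):
        for w in windows:
            if r + 1 in w or r - 1 in w:  # extends an existing window
                w.add(r)
                break
        else:
            windows.append({r})
    qualified = [w for w in windows
                 if sum(hist.get(r, 0) for r in w) >= min_coverage * total]
    return [(min(w), max(w)) for w in qualified]
```

On a bimodal histogram this yields two windows, i.e. a polysemous term with one negative and one positive sense.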
  • 29. SENTIMENT LEXICON: Classifier evaluation, ratings distribution. We classified 33,000 terms that appear ≥5 times in the IMDB dataset. Conclusion: the WW classifier distributes primary rating classes more evenly than PEAK and PN.
  • 30. SENTIMENT LEXICON: Classifier evaluation, polarity. We evaluate against a reference lexicon of 5272 terms based on the MPQA and General Inquirer subjectivity lexicons. Results (accuracy; precision/recall/F1 per class): PEAK 63.6% (positive 55.5/44.2/49.2, negative 67.3/65.3/66.3); PN 66.2% (positive 62.4/58.4/60.4, negative 68.4/72.3/70.3); WW 70.1% (positive 70.4/86.2/77.5, negative 69.6/60.5/64.8); SentiWordnet 73.2% (positive 63.6/61.3/62.4, negative 83.6/48.3/61.3). WW is the most accurate of the 3 proposed classifiers, though not as accurate as SentiWordnet overall; however, WW is more accurate for domain-specific terms.
  • 31. SENTIMENT LEXICON: Classifier evaluation, intensity. We evaluate against a test set of 443 strong + 323 weak terms based on the General Inquirer subjectivity lexicon. Using the WW classifier to classify intensity: 78% of strong terms are classified at intensity 3 and above, and 83% of weak terms at intensity 3 and below.
  • 32. SENTIMENT LEXICON: The Aspect Miner sentiment lexicon, a reusable sentiment lexicon for the movie review domain, downloadable from github.com/skarabasakis/ImdbAspectMiner/blob/master/imdb_sentiment_lexicon.xls
  • 33. Feature Identification Using topic models for feature discovery
  • 34. FEATURE IDENTIFICATION: Approaches to feature identification. The traditional approach: discovery through heuristics. Frequency: commonly occurring noun phrases are often features (Hu & Liu 2004). Co-occurrence: terms commonly found near subjective expressions may be features (Kim & Hovy 2006, Qiu et al. 2011). Language patterns: in phrases such as 'F of P' or 'P has F', P is a product and F is a feature (Popescu & Etzioni 2005). Background knowledge: user annotations, ontologies, search engine results, Wikipedia data. An up-and-coming approach: topic modeling.
  • 35. FEATURE IDENTIFICATION: Topic Modeling. Probabilistic topic models can model the abstract topics that occur in a set of documents: documents are mixtures of topics, and topics are distributions over words.
  • 36. FEATURE IDENTIFICATION: Topic Modeling. Probabilistic topic models require that the user specifies a number of topics (topics are just numbers; their semantic interpretation is not the model's concern), make an assumption about the probability distribution of topics, and define a probabilistic procedure for generating documents from topics; by inverting this procedure, we can infer topics from documents. A popular topic model: Latent Dirichlet Allocation (LDA), which assumes that topics follow a Dirichlet prior distribution, i.e. each document is associated with just a small number of topics.
• 37. FEATURE IDENTIFICATION: Topics vs. Features
  Motivating question: features are a form of topics. Can we use topic models to discover features?
  Here are a few sample topics we got from running LDA on the IMDB dataset:
  » ROLE, ACTOR, PERFORMANCE, PLAY, LEAD, CAST, SUPPORT, ACTRESS, SHINE, STAR
  » SCRIPT, IDEA, DIALOGUE, WRITE, PLOT, SCREENPLAY, COME UP, CRAFT, EXPLAIN, HOLE
  » WAR, HERO, ATTACK, GROUP, AIRPLANE, BUNCH, SOLDIER, KILL, BOMB, ENEMY
  » POLICE, CASE, MYSTERY, VICTIM, SOLVE, MURDER, OFFICER, SUSPECT, DETECTIVE, CRIME
  » CAR, CHASE, SHOOT, VEHICLE, COP, DRIVE, KILL, STREET, BULLET, ROBBERY
  The first two topics are features; they are useful to us. The last three are themes; we are not interested in them.
• 38. FEATURE IDENTIFICATION: Feature identification with LDA
  Problem: topics are global, features are local.
  Solution: train the topic model on shorter segments (e.g. sentences) rather than full documents.
  Problem: running LDA on such short segments produces noisy topics.
  Solution: implement a bootstrap aggregation scheme to filter the noise:
  1. Train N topic models from different subsets of the dataset
  2. Merge similar topics across models to produce a single meta-model
  » Intuition: valid feature-topics should occur in more than one model and share many common top terms. Noisy topics should be isolated to specific models.
• 39. FEATURE IDENTIFICATION: Merging topics
  Example (terms with probabilities):
    Topic A: COMEDY 0.200, JOKE 0.099, LAUGH 0.096, FUN 0.088, FORMULA 0.025
  + Topic B: COMEDY 0.180, PARODY 0.168, SATIRE 0.099, JOKE 0.061, RIDICULE 0.054
  = Merged: COMEDY 0.380, PARODY 0.168, JOKE 0.160, SATIRE 0.099, LAUGH 0.096, FUN 0.088, RIDICULE 0.054 (FORMULA 0.025 discarded)
  Topic similarity for topics Tm, Tn:
  » more common terms with higher probabilities → higher similarity
• 40. FEATURE IDENTIFICATION: Merging topic models
  To merge 2 topic sets
  » merge every topic of set A into the most similar topic from set B, but only if that similarity is above the average similarity
  To merge N topic sets
  » merge the first two, then merge the result with the third, and so on
  » at the end, discard topics with a low merging degree
  » if the same term ends up in more than one topic, only keep it in the topic where it has the highest probability
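  The two merging steps above can be sketched as follows. The exact similarity formula on the slide is not fully legible, so `topic_similarity` here is a stand-in (weighted term overlap) consistent with the stated intuition; summing the probabilities of shared terms and dropping the low-weight tail reproduces the COMEDY example.

```python
def topic_similarity(a, b):
    # Stand-in for the thesis formula: shared terms with high
    # probabilities in both topics yield high similarity
    return sum(min(a[t], b[t]) for t in a.keys() & b.keys())

def merge_topics(a, b, keep=7):
    # Sum probabilities of shared terms, carry the rest over,
    # then discard the lowest-weight tail
    merged = dict(a)
    for term, p in b.items():
        merged[term] = merged.get(term, 0.0) + p
    return dict(sorted(merged.items(), key=lambda kv: -kv[1])[:keep])

def merge_topic_sets(set_a, set_b):
    # Merge each topic of A into its most similar topic in B,
    # but only when similarity exceeds the average pairwise similarity
    sims = [(i, j, topic_similarity(a, b))
            for i, a in enumerate(set_a) for j, b in enumerate(set_b)]
    avg = sum(s for _, _, s in sims) / len(sims)
    result = [dict(t) for t in set_b]
    for i, a in enumerate(set_a):
        j, s = max(((j, s) for i2, j, s in sims if i2 == i),
                   key=lambda js: js[1])
        if s > avg:
            result[j] = merge_topics(result[j], a)
        else:
            result.append(dict(a))     # no good match: keep topic as-is
    return result

comedy_a = {"comedy": 0.200, "joke": 0.099, "laugh": 0.096,
            "fun": 0.088, "formula": 0.025}
comedy_b = {"comedy": 0.180, "parody": 0.168, "satire": 0.099,
            "joke": 0.061, "ridicule": 0.054}

merged = merge_topics(comedy_a, comedy_b)
# shared terms add up, e.g. comedy: 0.200 + 0.180 = 0.380
```

  Folding N topic sets is then just repeated application of `merge_topic_sets`, as the slide describes.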
• 41. FEATURE IDENTIFICATION: Movie feature lexicon
  56 topics, manually labeled with 18 labels
  • 42. Sentence classification Utilizing language structure for contextual sentiment estimation and feature targeting
• 43. SENTENCE-LEVEL ANALYSIS: Sentiment
  Sentiment: a (polarity, intensity) tuple, where polarity ∈ {+, −} and intensity ∈ {1, 2, …, n}, giving 2n classes.
  We define a mapping function (preferably 1:1) to convert ratings to sentiment classes:
  » m_binary: R10 → S1: 1–5 → −1; 6–10 → +1
  » m3: R10 → S3: 1–2 → −3; 3–4 → −2; 5 → −1; 6 → +1; 7–8 → +2; 9–10 → +3
  » m5: R10 → S5: 1 → −5; 2 → −4; 3 → −3; 4 → −2; 5 → −1; 6 → +1; 7 → +2; 8 → +3; 9 → +4; 10 → +5
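  All three mappings above are instances of one closed form. The thesis presents them as tables; the formula below is my reconstruction and reproduces every value shown on the slide.

```python
import math

def rating_to_sentiment(rating, n):
    """Map a 1..10 rating onto 2n sentiment classes {-n..-1, +1..+n}."""
    if not 1 <= rating <= 10:
        raise ValueError("rating must be in 1..10")
    if rating <= 5:
        # negative half: spread ratings 1..5 over intensities n..1
        return -math.ceil((6 - rating) * n / 5)
    # positive half: spread ratings 6..10 over intensities 1..n
    return math.ceil((rating - 5) * n / 5)

# m5 is the 1:1 case:
m5 = [rating_to_sentiment(r, 5) for r in range(1, 11)]
# [-5, -4, -3, -2, -1, 1, 2, 3, 4, 5]
```

  With n=1 this collapses to m_binary, and with n=3 it yields the grouped m3 mapping.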
• 44. SENTENCE-LEVEL ANALYSIS: Typed Dependencies
  Typed dependencies are binary grammatical relations between word pairs in a sentence (de Marneffe et al., 2006), written as type(governor, dependent), e.g. amod(relations, binary).
  Example sentence: "Natalie Portman comes off as very believable, gaining empathy from the audience."
  Typed dependency trees are
  » semantically richer than syntax trees
  » easier to process, because content words are connected directly rather than through function words
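  A typed dependency is naturally a small record, and a sentence's dependency tree is just an index of dependents under their governors. The sketch below is illustrative; the two relations are hand-written for the example fragment, not output of the Stanford Parser.

```python
from collections import defaultdict
from typing import NamedTuple

class Dependency(NamedTuple):
    type: str        # relation label, e.g. "amod"
    governor: str
    dependent: str

# A plausible hand-written fragment for "...comes off as very believable":
deps = [
    Dependency("advmod", "believable", "very"),
    Dependency("acomp", "comes", "believable"),
]

def children(deps):
    # Index dependents under their governor: content words are linked
    # directly, which makes bottom-up processing straightforward
    tree = defaultdict(list)
    for d in deps:
        tree[d.governor].append((d.type, d.dependent))
    return dict(tree)

tree = children(deps)
# tree["believable"] == [("advmod", "very")]
```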
• 45. SENTENCE-LEVEL ANALYSIS: Dependency types
  The Stanford Parser's representation defines a hierarchy of 48 dependency types.
• 46. SENTENCE-LEVEL ANALYSIS: Contextual sentiment estimation
  Motivating question: what is the contextual sentiment of a dependency, given the prior sentiment of its constituents?
  Examples, from "It is best to avoid watching any of the increasingly disappointing sequels.":
  » infmod(best/+2, avoid/−4) → −4
  » xcomp(avoid/−4, watching/+2) → −2
  » advmod(disappointing/−2, increasingly/+3) → −3
  Our model: we empirically developed and formally defined
  » 6 outcome functions that model types of word interactions
  » 42 dependency rules that cover all possible dependency patterns
• 47. SENTENCE-LEVEL ANALYSIS: Outcome functions
  » UNCHANGED: the base term imposes the sentiment ("It seems that they ran out of budget.")
  » STRONGER: the stronger term imposes the sentiment ("a mighty talent wasted in mass produced rom-coms")
  » AVG: both terms contribute equally to the sentiment ("intelligent and ambitious")
• 48. SENTENCE-LEVEL ANALYSIS: Outcome functions
  » INTENSIFY: the modifier increases the intensity of the base ("increasingly disappointing sequels")
  » REFLECT: the modifier overrides polarity, increases or decreases the intensity of the base ("impossible to enjoy unless you lower your expectations")
  » NEG: the modifier diminishes or negates the base ("not a masterpiece, but not bad either")
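  Representing sentiments as signed integers in {−5..−1, +1..+5}, the six outcome functions can be sketched as below. The exact arithmetic of the thesis functions is not reproduced on the slides, so these bodies are plausible stand-ins; `intensify` does, however, match the slide's worked example advmod(disappointing/−2, increasingly/+3) → −3.

```python
N = 5  # intensity levels: sentiments are ints in {-N..-1, +1..+N}

def sign(s):
    return 1 if s > 0 else -1

def unchanged(base, mod):
    # base term imposes the sentiment
    return base

def stronger(base, mod):
    # the term with the higher intensity imposes the sentiment
    return base if abs(base) >= abs(mod) else mod

def avg(base, mod):
    # both terms contribute equally; result never collapses to 0
    m = (base + mod) / 2
    s = sign(m) if m != 0 else sign(base)
    return s * max(1, round(abs(m)))

def intensify(base, mod):
    # modifier bumps the base's intensity one step, keeping its polarity
    return sign(base) * min(N, abs(base) + 1)

def reflect(base, mod):
    # modifier overrides polarity; intensity blends both terms
    return sign(mod) * min(N, max(1, (abs(base) + abs(mod)) // 2))

def negate(base, mod=None):
    # diminish and flip: "not bad" ends up mildly positive
    return -sign(base) * max(1, abs(base) - 2)

assert intensify(-2, +3) == -3   # "increasingly disappointing"
```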
• 49. SENTENCE-LEVEL ANALYSIS: Dependency rules, general form
  td(pgov, pdep) → outcome_base
  » td: the dependency type label
  » pgov, pdep: term patterns; a pattern may specify a list of allowed parts of speech and/or a white list of specific terms
  » outcome: one of the outcome functions UNCHANGED, NEGATE, STRONGER, AVG, INTENSIFY, REFLECT, POSITIVE, NEGATIVE
  » base specifier: GOV or DEP
  Examples:
  » conj(*, *) → AVG_DEP
  » advmod({n,a,r}, *) → INTENSIFY_GOV
  » amod(*, {too}) → NEGATIVE_GOV
• 50. SENTENCE-LEVEL ANALYSIS: The Aspect Miner dependency rule set
  (notation: td(gov pattern, dep pattern) → outcome, base)
  1. Negation
  » 1.1 neg(*, *) → NEGATE GOV
  » 1.2 det/prt/advmod/dobj/nsubj/dep(*, negTerms¹) → NEGATE GOV
  » 1.3 pobj(negTerms¹, *) → NEGATE DEP
  » 1.4 aux(*, negAux²) → NEGATE GOV
  2. Subjects
  » 2.1 nsubj/nsubjpass(*, *) → INTENSIFY GOV
  » 2.2 csubj/csubjpass(*, *) → REFLECT GOV
  3. Objects
  » 3.1.1 dobj(negVerbs³, *) → NEGATE DEP
  » 3.1.2 dobj(*, *) → REFLECT GOV
  » 3.2 iobj(*, *) → UNCHANGED GOV
  » 3.3 pobj(*, *) → UNCHANGED DEP
  4. Modifiers
  » 4.1 advmod/amod(*, {enough}) → POSITIVE GOV
  » 4.2 advmod/amod(*, {too}) → NEGATIVE GOV
  » 4.3 advmod({v}, *) → REFLECT GOV
  » 4.4 advmod({n,a,r}, *) → INTENSIFY GOV
  » 4.5 amod(*, *) → REFLECT GOV
  » 4.6 infmod({a}, *) → REFLECT GOV
  » 4.7 infmod({v,n,r}, *) → INTENSIFY DEP
  » 4.8 partmod({a}, *) → REFLECT DEP
  » 4.9 partmod({v,n,r}, *) → STRONGER DEP
  » 4.10 quantmod(*, *) → INTENSIFY GOV
  » 4.11 prt(*, *) → STRONGER GOV
  » 4.12 prep(*, *) → REFLECT GOV
  » 4.13 prep(*, {like}) → UNCHANGED GOV
  5. Clausal modifiers
  » 5.1 advcl({a}, *) → REFLECT DEP
  » 5.2 advcl({v,n,r}, *) → UNCHANGED DEP
  » 5.3 purpcl(*, *) → UNCHANGED DEP
  6. Clausal complements
  » 6.1 ccomp/xcomp/acomp(*, *) → REFLECT GOV
  » 6.2 conj/appos/parataxis(*, *) → AVG GOV
  » 6.3 dep(*, *) → STRONGER DEP
  ¹ negTerms = {n't, no, not, never, none, nothing, nobody, noone, nowhere, without, hardly, barely, rarely, seldom, against, minus, sans}
  ² negAux = {should, could, would, might, ought}
  ³ negVerbs = {avoid, cease, decline, forget, fail, miss, neglect, refrain, refuse, stop}
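  Rule lookup amounts to first-match search over patterns ordered from most to least specific. The sketch below encodes a small excerpt of the rule set (not all 42 rules); the tuple layout, the truncated word lists, and the fallback rule are my own simplifications for illustration.

```python
# A rule: (type, gov_pos, gov_words, dep_pos, dep_words) -> (outcome, base)
# None stands for "*" (matches anything).
NEG_TERMS = {"n't", "no", "not", "never", "hardly", "barely", "without"}

RULES = [
    (("neg",    None, None, None, None),            ("NEGATE", "GOV")),
    (("advmod", None, None, None, {"too"}),         ("NEGATIVE", "GOV")),
    (("advmod", None, None, None, NEG_TERMS),       ("NEGATE", "GOV")),
    (("advmod", {"v"}, None, None, None),           ("REFLECT", "GOV")),
    (("advmod", {"n", "a", "r"}, None, None, None), ("INTENSIFY", "GOV")),
    (("conj",   None, None, None, None),            ("AVG", "DEP")),
]

def match(rule, td_type, gov_pos, gov, dep_pos, dep):
    r_type, r_gpos, r_gw, r_dpos, r_dw = rule
    return (r_type == td_type
            and (r_gpos is None or gov_pos in r_gpos)
            and (r_gw   is None or gov in r_gw)
            and (r_dpos is None or dep_pos in r_dpos)
            and (r_dw   is None or dep in r_dw))

def lookup(td_type, gov_pos, gov, dep_pos, dep):
    # Rules are ordered most-specific first, so the first hit
    # is the closest matching rule
    for rule, outcome in RULES:
        if match(rule, td_type, gov_pos, gov, dep_pos, dep):
            return outcome
    return ("UNCHANGED", "GOV")  # hypothetical fallback
```

  For example, advmod over an adjective governor falls through the word-list rules to rule 4.4 (INTENSIFY GOV), while an advmod whose dependent is a negation term is caught earlier by the negation rule.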
• 51. SENTENCE-LEVEL ANALYSIS: Sentence classification algorithm
  Initialization
  » generate the dependency tree from the sentence
  » annotate subjective terms with prior polarities from the sentiment lexicon
  » annotate feature terms with labels from the feature lexicon
  Sentiment estimation
  » apply the closest matching rule to every dependency relation in the tree
  » the sentiment of the dependency replaces the previous sentiment of the governor node
  » dependencies are processed in reverse postfix order (bottom to top and right to left)
  Feature targeting
  » the scope of a feature term is a subtree that contains the term and extends all the way down to the leaves and all the way up to the closest clausal dependency
  » the sentiment at the root of the subtree gets assigned to the feature
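  The bottom-up pass can be sketched as a depth-first fold over the dependency tree. This is a minimal stand-alone sketch: the full rule lookup is collapsed into one toy outcome (the modifier intensifies its governor), and the tiny tree and priors are hand-written from the slide 46 example, not parser output.

```python
def sign(s):
    return 1 if s > 0 else -1

def apply_dep(gov_sent, dep_sent):
    # Toy contextual rule: a subjective dependent intensifies its
    # governor one step (real rules pick among six outcome functions)
    if gov_sent is None:
        return dep_sent
    if dep_sent is None:
        return gov_sent
    return sign(gov_sent) * min(5, abs(gov_sent) + 1)

def classify(tree, priors, node):
    # Reverse postfix order: children are fully resolved, right to
    # left, before their sentiment is folded into the governor
    sent = priors.get(node)
    for child in reversed(tree.get(node, [])):
        sent = apply_dep(sent, classify(tree, priors, child))
    return sent

# "increasingly disappointing sequels":
# advmod(disappointing, increasingly), amod(sequels, disappointing)
tree = {"sequels": ["disappointing"], "disappointing": ["increasingly"]}
priors = {"disappointing": -2, "increasingly": +3}
```

  Here `classify` first resolves "disappointing" to −3 (intensified by "increasingly"), and the feature term "sequels", having no prior of its own, inherits that −3 from its subtree, illustrating feature targeting.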
• 52. SENTENCE-LEVEL ANALYSIS: Sentence classification example (figure)
• 53. SENTENCE-LEVEL ANALYSIS: Sentence polarity evaluation
  Test set: sentence polarity dataset by Pang & Lee, 2005 (5331 positive + 5331 negative sentences from movie reviews)
  Results: polarity classification is accurate for
  » 71.5% of positive sentences
  » 76.9% of negative sentences
  » 74.2% of all sentences
  Analysis of error causes:
  » 39.0% inaccurate dependency rule
  » 28.5% misclassified term (or we picked the wrong sense)
  » 21.5% erroneous sentence parsing
  » 8.5% ambiguous sentence
  » 2.5% dependency rules applied in the wrong order
• 54. SENTENCE-LEVEL ANALYSIS: Comparative evaluation
  Linguistic methods
  » majority voting (Nakagawa, Inui & Kurohashi, 2010): 62.9%
  » majority voting with negations (Ikeda & Takamura, 2008): 65.8%
  » dependency rules (Aspect Miner): 74.2%
  Learning-based methods
  » naïve Bayes (Andreevskaia & Bergler, 2008): 69.0%
  » SVM, bag-of-features (Nakagawa, Inui & Kurohashi, 2010): 76.4%
  » genetic programming (Arora, Mayfield et al., 2010): 76.9%
  » SVM, sentence-wise learning with polarity shifting + n-grams (Ikeda & Takamura, 2008): 77.0%
  » dependency tree CRFs (Nakagawa, Inui & Kurohashi, 2010): 77.3%
  Conclusion: our method fares well among linguistic techniques, but does not match the accuracy of learning-based methods.
• 56. CONCLUSIONS: System architecture (diagram)
  Training subsystem: a lexical analyzer (tokenization, POS tagging, named entity identification, lemmatization, comparatives annotation, negation scope resolution, stop word removal, open-class word filtering) indexes the training corpus of rated reviews. From the corpus statistics, a term classifier (PEAK, PN and WW classifiers over collected term histograms) produces the sentiment lexicon. A feature identifier partitions the training set, trains topic models TM1…TMN with LDA on bags of open-class terms, aggregates them, and, with assisted labeling, produces the feature lexicon.
  Sentence classifier: dependency parsing plus the dependency rule set classify input text against both lexicons, yielding feature-sentiment pairs as the result.
• 57. CONCLUSIONS: Summary of contributions
  » We showed the feasibility of granular prior polarity classification using review ratings, and developed a classifier that achieved at least 70% accuracy on the training dataset.
  » We developed a reusable sentiment lexicon and feature lexicon for the movie review domain.
  » We suggested a bagging-inspired meta-algorithm for discovering feature topics with LDA.
  » We created a set of linguistic rules and developed a methodology that is capable of fine-grained feature-level classification of sentences, achieving 74.2% accuracy for polarity classification on our test dataset.
• 58. CONCLUSIONS: Suggested Improvements
  Term classification
  » assigning a special class to intensifier terms
  » per-feature polysemy resolution
  Feature identification
  » named entities as features
  » applying multi-grain topic models for discovery of local topics, e.g. MG-LDA (Titov & McDonald, 2008)
  Sentence-level classification
  » supervised learning of rules: replace the manually crafted rule set with a set of rules inferred from frequent dependency patterns
• 59. CONCLUSIONS: References
  For a complete list of references, see the full report (in Greek): http://j.mp/AspectMiner
  B. Liu, "Sentiment analysis and subjectivity," in Handbook of Natural Language Processing, 2nd ed., 2010.
  B. Pang and L. Lee, "Opinion mining and sentiment analysis," Foundations and Trends in Information Retrieval, vol. 2, no. 1-2, pp. 1-135, 2008.
  A. Esuli and F. Sebastiani, "SentiWordNet: A publicly available lexical resource for opinion mining," in Proceedings of LREC, 2006, vol. 6, pp. 417-422.
  V. Hatzivassiloglou and K. R. McKeown, "Predicting the semantic orientation of adjectives," in Proceedings of the eighth conference of the European chapter of the Association for Computational Linguistics, 1997, pp. 174-181.
  P. Turney, M. L. Littman, and others, "Measuring praise and criticism: Inference of semantic orientation from association," ACM Transactions on Information Systems (TOIS), 2003.
  M. Hu and B. Liu, "Mining and summarizing customer reviews," in Proceedings of the tenth ACM SIGKDD international conference on Knowledge Discovery and Data Mining, 2004, pp. 168-177.
  X. Ding, B. Liu, and P. S. Yu, "A holistic lexicon-based approach to opinion mining," in Proceedings of the international conference on Web Search and Web Data Mining, 2008, pp. 231-240.
  I. Titov and R. McDonald, "Modeling online reviews with multi-grain topic models," in Proceedings of the 17th international conference on World Wide Web, 2008, pp. 111-120.
  T. Nakagawa, K. Inui, and S. Kurohashi, "Dependency tree-based sentiment classification using CRFs with hidden variables," in Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2010, pp. 786-794.
  A. Andreevskaia and S. Bergler, "When specialists and generalists work together: Overcoming domain dependence in sentiment tagging," in ACL-08: HLT, 2008.
  A. M. Popescu and O. Etzioni, "Extracting product features and opinions from reviews," in Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, 2005, pp. 339-346.
  D. Ikeda and H. Takamura, "Learning to shift the polarity of words for sentiment classification," Computational Intelligence, vol. 25, no. 1, pp. 296-303, 2008.

Editor's notes

  1. experimental opinion mining system for user reviews
  2. Identifying compound terms brings us some of the benefits of n-grams, without the increased costs and noise
  3. If polysemous, the RC set with the highest frequency sum indicates the term’s primary sentiment.