The JDPA Sentiment Corpus
for the Automotive Domain
Miriam Eckert, Lyndsie Clark,
Nicolas Nicolov
J.D. Power and Associates
Jason S. Kessler
Indiana University
Overview
• 335 blog posts containing opinions about cars
– 223K tokens of blog data
• Goal of annotation project:
– Examples of how words interact to evaluate entities
– Annotations encode these interactions
• Entities are invoked physical objects and their
properties
– Not just cars, car parts
– People, locations, organizations, times
Excerpt from the corpus
“last night was nice. sean bought me caribou
and we went to my house to watch the baseball
game …
“… yesturday i helped me mom with brians
house and then we went and looked at a kia
spectra. it looked nice, but when we got up to it,
i wasn't impressed ...”
Outline
• Motivating example
• Overview of annotation types
– Some statistics
• Potential uses of corpus
• Comparison to other resources
Motivating example
"John recently purchased a Honda Civic. It had a great engine, a mildly disappointing stereo, and was very grippy. He also considered a BMW which, while highly priced, had a better stereo."
• Entity mentions and their semantic types
– PERSON: John, He
– CAR: Honda Civic, It, BMW
– CAR-PART: engine, stereo, stereo
– CAR-FEATURE: priced
• REFERS-TO links between coreferent mentions
• TARGET links from sentiment expressions and modifiers to their targets
• PART-OF and FEATURE-OF relations between entities
• Comparison annotated with DIMENSION, MORE, LESS
• Entity-level sentiment labels shown: positive, mixed
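One way to picture the layers in the example above is as labeled character spans plus labeled links between them. The sketch below is only an illustration, not the corpus's actual file format: the class names `Span` and `Link`, the label `SENTIMENT-EXPRESSION`, and the direction of each link are assumptions made for this example.

```python
from dataclasses import dataclass

# Shortened version of the motivating example.
TEXT = "John recently purchased a Honda Civic. It had a great engine."


@dataclass(frozen=True)
class Span:
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str  # semantic type or annotation kind


@dataclass(frozen=True)
class Link:
    label: str    # e.g. REFERS-TO, PART-OF, TARGET
    source: Span
    target: Span


def span(phrase: str, label: str) -> Span:
    """Wrap the first occurrence of `phrase` in TEXT as a labeled span."""
    start = TEXT.index(phrase)
    return Span(start, start + len(phrase), label)


def text_of(s: Span) -> str:
    """Return the surface string covered by a span."""
    return TEXT[s.start:s.end]


civic = span("Honda Civic", "CAR")
it = span("It", "CAR")
engine = span("engine", "CAR-PART")
great = span("great", "SENTIMENT-EXPRESSION")  # hypothetical label name

# Link directions are assumed for illustration.
links = [
    Link("REFERS-TO", it, civic),
    Link("PART-OF", engine, it),
    Link("TARGET", great, engine),
]

for link in links:
    print(f"{text_of(link.source)} --{link.label}--> {text_of(link.target)}")
```

Keeping spans as offsets into the raw text, rather than as copied strings, lets two mentions with the same surface form (the two "stereo" mentions in the full example) stay distinct.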
Outline
• Motivating example
• Overview of annotation types
– Some statistics
• Potential uses of corpus
• Comparison to other resources
Entity annotations
John recently purchased a Civic. It had a great engine and was priced well.
• Mentions: John (PERSON); Civic, It (CAR); engine (CAR-PART); priced (CAR-FEATURE)
• REFERS-TO: Civic and It refer to the same CAR entity
• >20 semantic types from
– ACE Entity Mention Detection Task
– Generic automotive types
Entity-relation annotations
• Relations between entities
• Entity-level sentiment annotations
• Sentiment flow between entities through relations
– My car has a great engine.
– Honda, known for its high standards, made my car.
• Example: engine (CAR-PART) PART-OF Civic (CAR); priced (CAR-FEATURE) FEATURE-OF Civic (CAR)
– Entity-level sentiment: positive
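The idea of sentiment flowing to an entity through its relations can be sketched as a small aggregation over PART-OF and FEATURE-OF links. The slides do not state an aggregation rule, so the one below (all positive gives positive, all negative gives negative, both gives mixed, none gives neutral) is an assumption for illustration, as are the function and variable names.

```python
def entity_level_sentiment(entity, belongs_to, polarities):
    """Aggregate polarities of an entity and everything linked to it.

    belongs_to: dict mapping a part or feature to the entity it belongs to
                (stands in for PART-OF and FEATURE-OF relations).
    polarities: dict mapping an entity name to a list of "positive" or
                "negative" evaluations expressed about it.
    """
    collected = set()
    seen = {entity}
    stack = [entity]
    while stack:
        node = stack.pop()
        collected.update(polarities.get(node, []))
        # Follow relations transitively, e.g. a part of a part.
        for child, parent in belongs_to.items():
            if parent == node and child not in seen:
                seen.add(child)
                stack.append(child)
    if collected == {"positive"}:
        return "positive"
    if collected == {"negative"}:
        return "negative"
    return "mixed" if collected else "neutral"


# "It had a great engine and was priced well."
belongs_to = {"engine": "Civic", "price": "Civic"}
polarities = {"engine": ["positive"], "price": ["positive"]}
print(entity_level_sentiment("Civic", belongs_to, polarities))
```

Under this assumed rule, a car with a positively evaluated part and a negatively evaluated feature would come out as mixed.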
Entity annotation type: statistics
• Inter-annotator agreement
– Among mentions: 83%
– Refers-to: 68%
• 61K mentions in corpus and 43K entities
• 103 documents annotated by around 3 annotators
A1: …Kia Rio…
A2: …Kia Rio…
MATCH
A1: …Kia Rio…
A2: …Kia Rio…
NOT A MATCH
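The slide does not define how a match between two annotators' mentions is decided or how the percentage is computed. As a rough sketch under stated assumptions, the code below treats mentions as (start, end) character offsets, offers both an exact and an overlap criterion, and reports the share of all mentions that find a partner in the other annotator's set. The function names and the formula are illustrative, not the authors' procedure.

```python
def exact_match(a, b):
    """Two spans match only if their boundaries are identical."""
    return a == b


def overlap_match(a, b):
    """Two spans match if they share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def agreement(spans_a, spans_b, match=exact_match):
    """Fraction of both annotators' spans that have a match on the other side."""
    matched_a = sum(any(match(a, b) for b in spans_b) for a in spans_a)
    matched_b = sum(any(match(a, b) for a in spans_a) for b in spans_b)
    total = len(spans_a) + len(spans_b)
    return (matched_a + matched_b) / total if total else 1.0


# Hypothetical offsets: A1 marks a longer first span than A2.
a1 = [(10, 17), (30, 36)]
a2 = [(14, 17), (30, 36)]

print(agreement(a1, a2))                       # exact criterion
print(agreement(a1, a2, match=overlap_match))  # lenient criterion
```

Which criterion is used changes the number considerably, so any agreement figure should be read together with its matching rule.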
Sentiment expressions
• Evaluations
• Target mentions
• Prior polarity:
– Semantic orientation given target
– positive, negative, neutral, mixed
• Examples
– great engine (prior polarity: positive)
– highly priced (prior polarity: negative)
– … a highly spec’ed … (prior polarity: positive)
Sentiment expressions
• Occurrences in corpus: 10K
• 13% are multi-word
• like no other, get up and go
• 49% are headed by adjectives
• 22% nouns (damage, good amount)
• 20% verbs (likes, upset)
• 5% adverbs (highly)
Sentiment expressions
• 75% of sentiment expression occurrences
have non-evaluative uses in corpus
• “light”
– …the car seemed too light to be safe…
– …vehicles in the light truck category…
• 77% of sentiment expression occurrences are
positive
• Inter-annotator agreement:
– 75% spans, 66% targets, 95% prior polarity
Modifiers -> contextual polarity
• NEGATORS
– not a good car
– not a very good car
• INTENSIFIERS
– a very good car (UPWARD)
– a kind of good car (DOWNWARD)
• NEUTRALIZERS
– if the car is good
– I hope the car is good
• COMMITTERS
– I am sure the car is good (UPWARD)
– I suspect the car is good (DOWNWARD)
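The four modifier categories above can be read as operations that turn a prior polarity into a contextual one. The sketch below makes that reading concrete. The slides only name the categories and their directions, so the numeric scale and the weights (1.5, 0.5, 1.25, 0.75) are invented for illustration, and `contextual_polarity` is a hypothetical helper.

```python
def contextual_polarity(prior, modifiers):
    """Apply modifiers, innermost first, to a prior polarity score.

    prior:     +1.0 for positive, -1.0 for negative.
    modifiers: list of (kind, direction) pairs; direction is "upward",
               "downward", or None where it does not apply.
    """
    score = prior
    for kind, direction in modifiers:
        if kind == "negator":
            score = -score                                   # flip orientation
        elif kind == "intensifier":
            score *= 1.5 if direction == "upward" else 0.5   # assumed weights
        elif kind == "committer":
            score *= 1.25 if direction == "upward" else 0.75 # assumed weights
        elif kind == "neutralizer":
            score = 0.0                                      # no evaluation asserted
        else:
            raise ValueError(f"unknown modifier kind: {kind}")
    return score


# "not a very good car": intensifier applies to "good", then the negator.
print(contextual_polarity(1.0, [("intensifier", "upward"), ("negator", None)]))
# "I hope the car is good": the neutralizer removes the commitment.
print(contextual_polarity(1.0, [("neutralizer", None)]))
```

Applying modifiers in scope order matters: in "not a very good car" the negator takes scope over the already intensified expression.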
Other annotations
• Speech events (not sourced from author)
– John thinks the car is good.
• Comparisons:
– Car X has a better engine than car Y.
– Handles a variety of cases
Outline
• Motivating example
• Overview of annotation types
– Some statistics
• Potential uses of corpus
• Comparison to other resources
Possible tasks
• Detecting mentions, sentiment expressions,
and modifiers
• Identifying targets of sentiment expressions,
modifiers
• Coreference resolution
• Finding part-of, feature-of, etc. relations
• Identifying errors/inconsistencies in data
Possible tasks
• Exploring how elements interact:
– Some idiot thinks this is a good car.
• Evaluating unsupervised sentiment systems or
those trained on other domains
• How do relations between entities transfer
sentiment?
– The car’s paint job is flawless but the safety record
is poor.
• Solution to one task may be useful in solving
another.
But wait, there’s more!
• 180 digital camera blog posts were annotated
• Total of 223,001 + 108,593 = 331,594 tokens
Outline
• Motivating example
– Elements combine to render entity-level
sentiment
• Overview of annotation types
– Some statistics
• Potential uses of corpus
• Comparison to other resources
Other resources
• MPQA Version 2.0
– Wiebe, Wilson and Cardie (2005)
– Largely professionally written news articles
– Subjective expression
• “beliefs, emotions, sentiments, speculations, etc.”
– Attitude, contextual sentiment on subjective
expressions
– Target, source annotations
– 226K tokens (JDPA: 332K)
Other resources
• Data sets provided by Bing Liu (2004, 2008)
– Customer-written consumer electronics product
reviews
– Contextual sentiment toward mention of product
– Comparison annotations
– 130K tokens (JDPA: 332K)
Thank you!
• Obtaining the corpus:
– Research and educational purposes
– ICWSM.JDPA.corpus@gmail.com
– June 2010
– Annotation guidelines:
http://www.cs.indiana.edu/~jaskessl
• Thanks to: Prof. Michael Gasser, Prof. James
Martin, Prof. Martha Palmer, Prof. Michael
Mozer, William Headden
Top 20 annotations by type
Inter-annotator agreement
