TweetSpector: Entity-based retrieval of Tweets [Demo]


This is a demonstration, which will be presented by Surender Reddy Yerva during the 35th Annual SIGIR (Special Interest Group on Information Retrieval) Conference, taking place in Portland, Oregon, USA from August 12-16, 2012. Extended Abstract: People readily express their opinions about products, companies, TV shows, etc. on Twitter. These tweet messages are thus a rich source of information that can be exploited to understand the sentiment about the products or services concerned. Retrieving the tweets related to a given entity is, however, a challenging task, as entity names are often (deliberately) ambiguous, e.g. Apple, Blackberry, Friends. Nevertheless, identifying the relevant entities is an essential first step towards reliable sentiment analysis, a step that existing systems such as TweetFeel and TwitterSentiment do not consider. While there are a number of techniques for identifying named entities in unstructured text, they are often not directly applicable in this case, as tweet messages are very short (at most 140 characters). This demonstrator introduces TweetSpector, a tool that addresses this retrieval task and makes it possible to link tweet messages to a given entity. Our retrieval methods rely on classification techniques that exploit our concise descriptions of entity-relevant information, also called entity profiles.


TweetSpector: Entity-based retrieval of Tweets

Surender Reddy Yerva, Zoltán Miklós, Flavia Grosan, Alexandru Tandrau, Karl Aberer
Swiss Federal Institute of Technology (EPFL)
Lausanne, Switzerland
{surenderreddy.yerva,zoltan.miklos,flavia.grosan,alexandru.tandrau,karl.aberer}@epfl.ch

Categories and Subject Descriptors
H.3.1 [Information Systems Applications]: Content Analysis and Indexing; H.3.5 [Information Systems Applications]: Online Information Services

Keywords
Entity, Disambiguation, Profiles, Twitter

1. EXTENDED ABSTRACT
People readily express their opinions about products, companies, TV shows, etc. on Twitter (http://www.twitter.com). These tweet messages are thus a rich source of information that can be exploited to understand the sentiment about the products or services concerned. Retrieving the tweets related to a given entity is, however, a challenging task, as entity names are often (deliberately) ambiguous, e.g. Apple, Blackberry, Friends. Nevertheless, identifying the relevant entities is an essential first step towards reliable sentiment analysis, a step that existing systems such as TweetFeel (http://www.tweetfeel.com) and TwitterSentiment (http://twittersentiment.appspot.com) do not consider.

While there are a number of techniques for identifying named entities in unstructured text, they are often not directly applicable in this case, as tweet messages are very short (at most 140 characters). This demonstrator introduces TweetSpector, a tool that addresses this retrieval task and makes it possible to link tweet messages to a given entity. Our retrieval methods rely on classification techniques that exploit our concise descriptions of entity-relevant information, also called entity profiles.

The demonstrator presents the following features of TweetSpector:
- Entity Profile Creation: TweetSpector supports automatic profile creation, where we apply named-entity recognition, NLTK, WordNet and Web data extraction techniques to construct a profile for an entity, given a relevant Web page. TweetSpector also enables manual profile construction, where users can construct arbitrary entity profiles, as well as manual and automatic updates of initially constructed profiles (thus the profiles are dynamic). The profiles can also be visualized using word clouds.
- Tweet Classification: TweetSpector displays the classification results in real time (see Figure 1). For example, a stream of tweets is displayed and, for each message, it is indicated whether or not it is related to the company Apple Inc. The classification techniques are substantially extended versions of our earlier work [1].
- User Feedback: Users can indicate whether the proposed classification is correct or not. This feedback is taken into account by the algorithms. TweetSpector can also take human input through crowdsourcing (through an interface to Amazon Mechanical Turk).
- Dashboard: TweetSpector can display performance metrics and statistical information related to the entity on a dashboard.

Figure 1: TweetSpector: Various Features

2. ACKNOWLEDGEMENTS
This work was partly funded by the NisB project (FP7-ICT-256955) and the European Commission in the PlanetData NoE (contract nr. 257641).

3. REFERENCES
[1] Surender Reddy Yerva, Zoltán Miklós, and Karl Aberer. Entity-based Classification of Twitter Messages. International Journal of Computer Science & Applications, 9(1):88–115, 2012.

Copyright is held by the author/owner(s).
SIGIR'12, August 12–16, 2012, Portland, Oregon, USA.
ACM 978-1-4503-1472-5/12/08.
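
The idea of linking tweets to an entity through an entity profile, and of folding user feedback back into that profile, can be illustrated with a small sketch. This is not the TweetSpector implementation, whose classifiers are described in [1]; it is a deliberately simple keyword-weight scorer written under our own assumptions. The class `EntityProfile`, its methods, the threshold, the feedback step size, and the keyword lists for "apple" are all hypothetical and chosen only for illustration.

```python
import re

# Simple tokenizer: lowercase words, digits, hashtags and mentions.
TOKEN_RE = re.compile(r"[a-z0-9#@']+")


def tokenize(text):
    """Lowercase a text and split it into simple tokens."""
    return TOKEN_RE.findall(text.lower())


class EntityProfile:
    """Hypothetical entity profile: keywords weighted for or against an entity."""

    def __init__(self, name, related_terms, unrelated_terms=()):
        self.name = name.lower()
        # Positive weight: evidence that a tweet is about the entity.
        self.weights = {term.lower(): 1.0 for term in related_terms}
        # Negative weight: evidence for another sense of the ambiguous name.
        self.weights.update({term.lower(): -1.0 for term in unrelated_terms})

    def score(self, tweet):
        """Sum the weights of the profile terms that occur in the tweet."""
        return sum(self.weights.get(tok, 0.0) for tok in set(tokenize(tweet)))

    def is_related(self, tweet, threshold=0.0):
        """Classify the tweet as related if its score exceeds the threshold."""
        return self.score(tweet) > threshold

    def add_feedback(self, tweet, related, step=0.5):
        """Shift the weights of the tweet's tokens according to a user label."""
        delta = step if related else -step
        for tok in set(tokenize(tweet)):
            if tok != self.name:  # the ambiguous name itself carries no evidence
                self.weights[tok] = self.weights.get(tok, 0.0) + delta


if __name__ == "__main__":
    apple = EntityProfile(
        "apple",
        related_terms=["iphone", "ipad", "mac", "ios", "store"],
        unrelated_terms=["pie", "juice", "tree", "fruit"],
    )
    print(apple.is_related("New iPhone at the Apple store today"))   # True
    print(apple.is_related("Baking an apple pie with fresh fruit"))  # False
    print(apple.is_related("Watching the keynote live"))             # False
    # A user marks a tweet as related; the profile is updated dynamically.
    apple.add_feedback("Apple keynote was great", related=True)
    print(apple.is_related("Watching the keynote live"))             # True
```

The sketch ignores the entity name itself when scoring and updating, since the name is exactly the ambiguous part. A profile built automatically from a Web page, as described above, would supply the keyword lists that are typed in by hand here.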
