Building a Microblog Corpus
for Search Result Diversification
AIRS 2013, Singapore, December 10

Ke Tao, Claudia Hauff, Geert-Jan Houben
Web Information Systems, TU Delft, the Netherlands

Delft
University of
Technology
Research Challenges
1. Diversification needed: users are likely to issue short queries,
which tend to be underspecified, when searching microblogs

2. Lack of a corpus for diversification studies: how can one build
a microblog corpus for evaluating research on diversification?
[Diagram: a query retrieves tweets into a search result; a diversification
strategy, evaluated against diversity judgments, turns it into a diversified
result]

Methodology

Overview

1. Data Source
• How can we find a good representative Twitter dataset?

2. Topic Selection
• How do we select the search topics?

3. Tweets Pooling
• Which tweets are we going to annotate?

4. Diversity Annotation
• How do we annotate the tweets with diversity characteristics?
Methodology – Data source
• From where?
• Twitter sampling API → around 1% of the whole Twitter stream (see the
sketch below)

• Duration
• From February 1st to March 31st, 2013
• Coincides with the TREC 2013 Microblog Track

• Tools
• Twitter Public Stream Sampling Tools by @lintool
• Amazon EC2 in EU
TREC 2013 Microblog Guideline: https://github.com/lintool/twitter-tools/wiki/TREC-2013-Track-Guidelines
Twitter Public Stream Sampling Tool: https://github.com/lintool/twitter-tools/wiki/Sampling-the-public-Twitter-stream
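
For illustration, a minimal sketch of consuming the 1% sample stream via the
Twitter streaming API (v1.1, as it existed in 2013). The OAuth credentials and
output path are placeholders; the study itself used the twitter-tools linked
above rather than this script:

```python
import json

import requests
from requests_oauthlib import OAuth1

SAMPLE_URL = "https://stream.twitter.com/1.1/statuses/sample.json"
# Placeholder credentials; substitute real OAuth tokens.
auth = OAuth1("CONSUMER_KEY", "CONSUMER_SECRET",
              "ACCESS_TOKEN", "ACCESS_SECRET")

def sample_stream(out_path: str) -> None:
    """Append each sampled tweet as one JSON line to out_path."""
    with requests.get(SAMPLE_URL, auth=auth, stream=True) as resp, \
            open(out_path, "a", encoding="utf-8") as out:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if not line:                # skip keep-alive newlines
                continue
            message = json.loads(line)
            if "text" in message:       # skip delete/limit notices
                out.write(json.dumps(message) + "\n")
```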

Methodology – Topic Selection

How do we select the search topics?
• Candidates in Wikipedia Current Events Portal
• Enough importance
• More than local interests

• Temporal Characteristics
• Evenly distributed over the two-month period
• Enables further analysis of temporal characteristics

• Selected
• 50 topics on trending news events
Wikipedia Current Events Portal: http://en.wikipedia.org/wiki/Portal:Current_events

Methodology – Tweets Pooling – 1/2

Maximize coverage & minimize effort
• Challenge in adopting existing pooling solutions
• Lack of access to multiple retrieval systems

• Topic Expansion
• Manually created query for each topic
• Aims at maximum coverage of tweets that are relevant to the topic

• Duplicate Filtering
• Filter out duplicate tweets (cosine similarity > 0.9), as sketched below
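
A minimal sketch of such a duplicate filter, assuming tweets are represented
as TF-IDF vectors; the 0.9 threshold comes from the slide, while the
vectorizer choice and helper name are our own:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def filter_duplicates(tweets: list[str], threshold: float = 0.9) -> list[str]:
    """Keep each tweet only if it is not too similar to an earlier kept one."""
    sims = cosine_similarity(TfidfVectorizer().fit_transform(tweets))
    kept: list[int] = []
    for i in range(len(tweets)):
        if all(sims[i, j] <= threshold for j in kept):
            kept.append(i)
    return [tweets[i] for i in kept]
```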

Methodology – Tweets Pooling – 2/2

Topic Expansion Example

Topic: "Hillary Clinton steps down as United States Secretary of State"
[Figure: possible variety of expressions for this topic in tweets]
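
Purely for illustration, a few hypothetical query variants for this topic;
the actual manual expansions are part of the released corpus, and these
strings are our own guesses:

```python
# Hypothetical expansions for the example topic (not the paper's queries).
variants = [
    '"Hillary Clinton" resigns',
    '"Hillary Clinton" "steps down"',
    'Clinton "Secretary of State" resignation',
]
# A pooling query could OR the variants together:
pooled_query = " OR ".join(f"({v})" for v in variants)
```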

Methodology – Diversity Annotation

Annotation Efforts

• 500 tweets for each topic
• No identification of subtopics beforehand
• Tweets about the general topic only (= no added value) are judged non-relevant

• No further checks on URL links → they may become unavailable over time

• 50 topics split between 2 annotators
• Subjective process
• Allows later comparative results
• 3 topics dropped – e.g. not enough diversity / too few relevant documents

Topic Analysis

The Topics and Subtopics 1/2
                       All topics   Annotator 1   Annotator 2
Avg. #subtopics            9.27         8.59          9.88
Std. dev. #subtopics       3.88         5.11          2.14
Min. #subtopics            2            2             6
Max. #subtopics            21           21            13

On average, we found about 9 subtopics per topic. The
subjectivity of the annotation is confirmed by the difference
between the two annotators in the standard deviation of the
number of subtopics per topic.
Topic Analysis

The Topics and Subtopics 2/2

The annotators spent 6.6 seconds on average to annotate a
tweet. Most tweets are assigned exactly one subtopic.
Topic Analysis

The relevance judgment 1/2
• Different diversity across topics
• 25 topics have fewer than 100 tweets with subtopics
• 6 topics have more than 350 tweets with subtopics

• Difference between the 2 annotators
• On average, 96 vs. 181 tweets with subtopic assignments

[Figure: number of RELEVANT vs. NONRELEVANT documents (0–500) per topic]

Topic Analysis

The relevance judgment 2/2
• Temporal persistence (timespan computation sketched below)
• Some topics are active during the entire timespan
• Northern Mali conflict
• Syrian civil war

• As short as 24 hours for some topics
• BBC Twitter account hacked
• Eiffel Tower evacuated due to bomb threat
[Figure: difference in days (0–60) between each topic's first and last
relevant tweet]
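
A minimal sketch of the statistic in the figure, assuming each relevant tweet
carries a parsed datetime under "created_at" (the field name is our
assumption):

```python
from datetime import datetime

def topic_timespan_days(relevant_tweets: list[dict]) -> int:
    """Days between a topic's earliest and latest relevant tweet."""
    times = [t["created_at"] for t in relevant_tweets]  # datetime objects
    return (max(times) - min(times)).days
```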

Topic Analysis

Diversity Difficulty
• The difficulty of diversifying the search results depends on
• Ambiguity or under-specification of topics
• Diverse content available in the corpus

• Golbus et al. proposed the diversity difficulty measure dd
• dd > 0.9: an arbitrary ranked list is likely to cover all subtopics
• dd < 0.5: subtopics are hard to discover with an untuned retrieval system
                                 All topics   Annotator 1   Annotator 2
Avg. diversity difficulty           0.71          0.72          0.70
Std. dev. diversity difficulty      0.07          0.06          0.07

Golbus et al.: Increasing evaluation sensitivity to diversity. Information Retrieval (2013) 16

Topic Analysis

Diversity Difficulty
• The difficulty of diversifying the search results depends on
• Ambiguity or under-specification of topics
• Diverse content available in the corpus

• Golbus et al. proposed the diversity difficulty measure dd
• dd > 0.9 indicates a diverse query
• dd < 0.5: subtopics are hard to discover with an untuned retrieval system

• Difference between long-/short-term topics
• Topics with a longer timespan (>50 days) are easier in terms of diversity
difficulty (0.73 vs. 0.70)
Golbus et al.: Increasing evaluation sensitivity to diversity. Information Retrieval (2013) 16

Diversification by De-Duplicating – 1/6

Lower redundancy, but higher diversity?

• In previous work, we were motivated by the fact that
• 20% of search results contain duplicate information to varying extents

• Therefore, we proposed to remove duplicates in order to achieve
lower redundancy in the top-k results
• Implemented with a machine learning framework
• Makes use of syntactical, semantic, and contextual features
• Eliminates the identified duplicate at the lower rank in the search result

Can de-duplication also achieve higher diversity?
Tao et al.: Groundhog Day: Near-duplicate Detection on Twitter. In Proceedings
of the 22nd International World Wide Web Conference, 2013.

Diversification by De-Duplicating – 2/6

Measures

• We adopt the following measures (two of them are sketched below):
• alpha-(n)DCG

• Precision-IA
• Subtopic-Recall
• Redundancy

Clarke et al.: Novelty and Diversity in Information Retrieval Evaluation. In Proceedings of
SIGIR, 2008.
Agrawal et al.: Diversifying Search Results. In Proceedings of WSDM, 2009.
Zhai et al.: Beyond Independent Relevance: Methods and Evaluation Metrics for Subtopic
Retrieval. In Proceedings of SIGIR, 2003.
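
For concreteness, a sketch of two of these measures under our own simplified
reading of the cited papers: Subtopic-Recall@k and an unnormalized
alpha-DCG@k (the slide's alpha-(n)DCG additionally normalizes by an ideal
ranking). Each entry of `ranking` is the set of subtopics a document was
judged relevant to (empty set = non-relevant):

```python
import math

def subtopic_recall(ranking: list[set], all_subtopics: set, k: int) -> float:
    """Fraction of subtopics covered by the top-k documents."""
    covered = set().union(*ranking[:k]) if ranking[:k] else set()
    return len(covered & all_subtopics) / len(all_subtopics)

def alpha_dcg(ranking: list[set], alpha: float = 0.5, k: int = 10) -> float:
    """Gain for a subtopic decays by (1 - alpha) each time it reappears."""
    seen: dict = {}                       # subtopic -> times seen so far
    score = 0.0
    for rank, subtopics in enumerate(ranking[:k], start=1):
        gain = sum((1 - alpha) ** seen.get(s, 0) for s in subtopics)
        score += gain / math.log2(rank + 1)
        for s in subtopics:
            seen[s] = seen.get(s, 0) + 1
    return score
```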
Diversification by De-Duplicating – 3/6

Baseline and De-Duplicate Strategies
• Baseline Strategies

• Automatic Run: using standard queries (no more than 3 terms)
• Filtered Auto: filters out duplicates w.r.t. cosine similarity

• Manual Run: manually created complex queries with automatic filtering

• De-Duplicate Strategies
• Sy = syntactical, Se = semantic, Co = contextual
• Four strategies: Sy, SyCo, SySe, SySeCo (removal step sketched below)
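
A minimal sketch of the rank-preserving removal step these strategies share,
assuming a pairwise duplicate classifier is available (here a stand-in
predicate; the paper's classifier is learned from the Sy/Se/Co features):

```python
from typing import Callable

def deduplicate(ranking: list[str],
                is_duplicate: Callable[[str, str], bool]) -> list[str]:
    """Drop the lower-ranked member of every duplicate pair."""
    kept: list[str] = []
    for doc in ranking:                   # ranking is best-first
        if not any(is_duplicate(doc, earlier) for earlier in kept):
            kept.append(doc)
    return kept
```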

Diversification by De-Duplicating – 4/6

Overall comparison

Overall, the de-duplicate strategies did achieve lower
redundancy. However, they did not achieve higher diversity.
Diversification by De-Duplicating – 5/6

Influence of Annotator Subjectivity

Diversification by De-Duplicating – 5/6

Influence of Annotator Subjectivity

The same general trends hold for both annotators.
alpha-nDCG scores are higher for Annotator 2, which can be
explained by Annotator 2 judging more documents as relevant
on average.

Diversification by De-Duplicating – 6/6

Influence of Temporal Persistence

Diversification by De-Duplicating – 6/6

Influence of Temporal Persistence

De-duplicate strategies can help for long-term topics,
whose vocabulary was richer, whereas only a small set of
terms was used for short-term topics.

Conclusions
• We have done:

• Created a microblog-based corpus for search result diversification
• Conducted a comprehensive analysis and showed its suitability
• Confirmed considerable subjectivity among annotators, although the trends
w.r.t. the different evaluation measures were largely independent of the
annotators

• We have made the corpus available via:
• http://wis.ewi.tudelft.nl/airs2013/

• What we will do:

• Apply diversification approaches that have been shown to perform well
in the Web search setting
• Propose diversification approaches specifically designed for search on
microblogging platforms
Thank you!
@wisdelft
http://ktao.nl

Ke Tao
@taubau



Speaker notes

  1. Animation: lack of corpus
  2. 3 dropped topics: G20 finance ministers meeting, UEFA Champions League, North Korea nullifying the armistice
  3. Basic statistics
  4. 6.6 seconds → slow at the start, faster later. Most of the tweets are assigned exactly one subtopic.
  5. Different from the TREC 2011/12 subtopics, no timestamp was considered for building this corpus.
  6. Diversity difficulty: TREC 2010: 0.727 (0.449, 0.994); TREC 2011: 0.809 (0.643, 0.977)
  7. Diversity difficulty
  8. Diversity difficulty