Aggregation for searching complex information spaces Mounia Lalmas [email_address]
Outline: INEX - INitiative for the Evaluation of XML Retrieval. Complexity of the information space(s)
A bit about myself
Three retrieval paradigms: Document Retrieval, Focused Retrieval, Aggregated Retrieval. Complexity of the information space(s)
Classical document retrieval: Query and Document corpus → Retrieval System → Ranked Documents. One homogeneous information space
Classical document retrieval process: Documents → Representation Function → Document Representation → Index. Query → Representation Function → Query Representation. Retrieval Function (over Query Representation and Index) → Ranked documents
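The pipeline on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the system discussed in the talk: the bag-of-words representation, the tf-idf style scoring, and the names `represent`, `build_index`, and `retrieve` are all assumptions chosen for brevity.

```python
import math
from collections import Counter, defaultdict

def represent(text):
    """Representation function (assumed): lowercase bag of words, term -> count."""
    return Counter(text.lower().split())

def build_index(docs):
    """Inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in represent(text).items():
            index[term][doc_id] = tf
    return index

def retrieve(query, index, n_docs):
    """Retrieval function (assumed): sum of tf * idf over query terms, best first."""
    scores = defaultdict(float)
    for term in represent(query):
        postings = index.get(term, {})
        if not postings:
            continue  # query term does not occur in the corpus
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Sort by descending score, breaking ties by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy corpus, invented for illustration only.
docs = {
    "d1": "xml retrieval returns elements",
    "d2": "document retrieval returns documents",
    "d3": "aggregated search blends verticals",
}
index = build_index(docs)
ranking = retrieve("xml retrieval", index, len(docs))
```

Here the same `represent` function is applied to both documents and query, mirroring the two "Representation Function" boxes on the slide; documents that share no term with the query do not appear in the ranking.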
Information retrieval process: Documents → Representation Function → Object representation → Index. Query → Representation Function → Query representation. Retrieval Function → Results. Surrounding factors: Task, Context, Interface, Interaction, Multimodality, Genre, Media, Language, Structure, Heterogeneity. (The Turn, Ingwersen & Järvelin, 2005)
Focused Retrieval: One information space. A more complex one and/or several of them
Focused Retrieval - Question & Answering
Focused Retrieval - Passage Retrieval: Document segmented into passages. Passages are returned as answers to a given query. Lots of work in the mid-90s
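A rough sketch of the idea, assuming fixed-size, non-overlapping word windows as passages and a crude term-overlap score. The slide does not say how passages were segmented or scored, so the names `segment`, `score`, and `best_passages` and the window size are hypothetical.

```python
def segment(text, size=6):
    """Split a document into consecutive windows of `size` words (assumed scheme)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(passage, query):
    """Count passage words that are query terms (deliberately simplistic)."""
    query_terms = set(query.lower().split())
    return sum(1 for word in passage.lower().split() if word in query_terms)

def best_passages(text, query, k=1, size=6):
    """Return up to k passages with a non-zero score, best first."""
    passages = segment(text, size)
    ranked = sorted(enumerate(passages),
                    key=lambda item: (-score(item[1], query), item[0]))
    return [p for _, p in ranked[:k] if score(p, query) > 0]

# Toy document, invented for illustration only.
doc = ("Classical systems return whole documents to the user. "
       "Passage retrieval instead returns short passages that answer the query. "
       "This reduces the effort needed to locate relevant text.")
top = best_passages(doc, "passage retrieval", k=1)
```

Windows that ignore sentence boundaries are the simplest possible choice; sentence-based or overlapping windows would be natural alternatives.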
Structure in documents: author, date
Logical structure - XML. Document: This is a heading / This is some text / This is a quote. Markup: <doc> <head>This is a heading</head> <text>This is some text</text> <quote>This is a quote</quote> </doc>. Tree: doc → head (This is a heading), text (This is some text), quote (This is a quote)
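The tree on this slide can be recovered from the markup with Python's standard `xml.etree.ElementTree` module. The helper `element_paths` below is a hypothetical name; it simply walks the tree and pairs each element's path with its text.

```python
import xml.etree.ElementTree as ET

# The example document from the slide.
xml_doc = ("<doc><head>This is a heading</head>"
           "<text>This is some text</text>"
           "<quote>This is a quote</quote></doc>")

root = ET.fromstring(xml_doc)

def element_paths(node, prefix=""):
    """Yield (path, text) for a node and all of its descendants, in document order."""
    path = f"{prefix}/{node.tag}"
    yield path, (node.text or "").strip()
    for child in node:
        yield from element_paths(child, path)

elements = list(element_paths(root))
```

Each entry in `elements` is a candidate retrieval unit, which is what makes element-level (rather than document-level) retrieval possible.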
Using the (XML) structure
Query languages for XML Retrieval (Sihem Amer-Yahia)
XML Retrieval: Book, Chapters, Sections, Subsections
Evaluation of XML Retrieval: INEX. Collaborative effort: participants contribute to the development of the collection. Ends with a yearly workshop, in December, in Dagstuhl, Germany. INEX has allowed a new community in XML information access to emerge. (Fuhr)
XML “Element” Retrieval (Courtesy of Norbert Goevert)
“Element” ranking algorithms. Combination of evidence: element score, document score, element size, … “Aggregation” in semi-complex information spaces. Approaches: vector space model, language model, extending DB model, polyrepresentation, probabilistic model, logistic regression, Bayesian network, divergence from randomness, Boolean model, machine learning, belief model, statistical model, natural language processing, structured text models
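One simple way to combine the three kinds of evidence named on this slide is a linear interpolation of element and document scores plus a size prior. The formula, the weights `alpha` and `size_weight`, and the candidate elements below are illustrative assumptions, not a method taken from the talk or from any specific INEX system.

```python
import math

def combined_score(element_score, document_score, element_size,
                   alpha=0.7, size_weight=0.1):
    """Interpolate element and document evidence, then add a log size prior.

    alpha and size_weight are hypothetical tuning parameters.
    """
    base = alpha * element_score + (1 - alpha) * document_score
    return base + size_weight * math.log(1 + element_size)

# (element path, element score, parent document score, size in words); invented values.
candidates = [
    ("article[1]/sec[2]", 0.80, 0.50, 300),
    ("article[1]/sec[2]/p[1]", 0.85, 0.50, 5),
    ("article[2]/sec[1]", 0.40, 0.90, 250),
]

ranked = sorted(candidates,
                key=lambda c: combined_score(c[1], c[2], c[3]),
                reverse=True)
```

With these numbers the five-word paragraph drops to last place despite having the highest raw element score, which illustrates why element size is often treated as evidence in its own right.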
Machine learning: relationship type. “Aggregation” in semi-complex information spaces
This is not the end. The complexity of the information space(s) increases: complexity of content/data; complexity of retrieval task/information need; complexity of context; complexity of presentation of results; …
Aggregated result - Relevance in context (Courtesy of Jaap Kamps)
Aggregated result - Element-biased table of content (Courtesy of Zoltan Szlavik)
Aggregated answers in XML retrieval and beyond … Let us be more adventurous and attempt to create the perfect answer … the beyond bit
Aggregated (virtual) documents. Special case: relevant in context. (Chiaramella & Roelleke)
Aggregated (virtual) documents
Yippy – Clustering search engine from Vivisimo clusty.com
Multi-document summarization http://newsblaster.cs.columbia.edu/
“Fictitious” document generation (Courtesy of Cecile Paris)
Aggregated views. Special case: element-biased table of content
Aggregated views (non-blended) http://au.alpha.yahoo.com/
Naver.com – Korean search engine
Aggregated views (blended)
Aggregated views (entities and relationships)
Research questions
Current work on aggregated search
Understanding: Log analysis
Understanding: Log analysis (Sushmita & Piwowarski)
Result presentation: User studies. Blended vs non-blended interfaces; 3 verticals (image, video, news); 3 positions; 3 vertical intents (high, medium, low). Positions shown: images on top, images in the middle, images at the bottom, images at top-right, images on the left, images at the bottom-right
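To make the notion of a blended interface concrete, the sketch below inserts one vertical block into a web result list at a named slot. It assumes that "top", "middle", and "bottom" are the blended positions; the function `blend` and the placeholder result labels are hypothetical and do not describe the interface used in the study.

```python
def blend(web_results, vertical_block, position):
    """Insert one vertical block into a ranked web result list at a named slot."""
    slots = {
        "top": 0,
        "middle": len(web_results) // 2,
        "bottom": len(web_results),
    }
    i = slots[position]
    return web_results[:i] + [vertical_block] + web_results[i:]

# Placeholder results, invented for illustration only.
web = ["w1", "w2", "w3", "w4"]
page = blend(web, "[images]", "middle")
```

A non-blended interface would instead keep the vertical results in a separate panel next to the web list, so no insertion position within the ranking is needed.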
Result presentation: User studies (Sushmita & Hideo)
Evaluation: Test collections. Existing test collections (ImageCLEF photo retrieval track, TREC web track, INEX ad-hoc track, TREC blog track, …): for topic t1, documents d1, d2, d3, …, dn with judgments R, N, R, …, R. (Simulated) verticals (Blog Vertical, Reference (Encyclopedia) Vertical, Image Vertical, General Web Vertical, Shopping Vertical, …): for topic t1, vertical V1 has documents d1, d2, …, dV1 with judgments R, N, …, R; vertical V2 has documents d1, d2, …, dV2 with judgments N, N, …, R; …; vertical Vk has documents d1, d2, …, dVk with judgments N, N, …, N
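The per-topic, per-vertical judgments on this slide map naturally onto a nested dictionary. The sketch below uses invented judgments and assumes that a vertical counts as relevant for a topic when it holds at least one document judged R; that rule and the name `relevant_verticals` are assumptions, since the slide does not define vertical relevance.

```python
# Hypothetical judgments: topic -> vertical -> {doc_id: "R" (relevant) or "N" (not)}.
judgments = {
    "t1": {
        "image": {"d1": "R", "d2": "N", "d3": "R"},
        "blog": {"d1": "N", "d2": "N"},
        "general_web": {"d1": "N", "d2": "R"},
    },
    "t2": {
        "image": {"d1": "N"},
        "blog": {"d1": "R"},
        "general_web": {"d1": "N", "d2": "N"},
    },
}

def relevant_verticals(topic_judgments):
    """Verticals holding at least one relevant document (assumed definition)."""
    return sorted(vertical for vertical, docs in topic_judgments.items()
                  if "R" in docs.values())

per_topic = {topic: relevant_verticals(j) for topic, j in judgments.items()}
average_relevant_verticals = sum(len(v) for v in per_topic.values()) / len(per_topic)
```

A statistic such as the average number of relevant verticals per topic can then be computed directly from this structure.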
Evaluation: Test collections (Zhou)
Statistics on topics:
number of topics: 150
average rel docs per topic: 110.3
average rel verticals per topic: 1.75
ratio of “General Web” topics: 29.3%
ratio of topics with two vertical intents: 66.7%
ratio of topics with more than two vertical intents: 4.0%
Collection statistics (quantity/media: text, image, video, total):
size (G): 2125, 41.1, 445.5, 2611.6
number of documents: 86,186,315; 670,439; 1,253*; 86,858,007
* There are on average more than 100 events/shots contained in each video clip (document).
There is related work. Topics: “combination of evidence”, “uncertainty theory”, “machine learning”, “agent technology”, poly-representation, cognitive overlap, complex information needs, structured query languages, aggregator operators, “distributed retrieval”, “data fusion”, “federated search”, “federated digital libraries”, “resource selection”, “heterogeneous collection”. Fields: Web search; Information retrieval and digital libraries; Cognitive science; Databases; Artificial intelligence
Final word - Aggregated search
Thank you

Más contenido relacionado

La actualidad más candente

Model of information retrieval (3)
Model  of information retrieval (3)Model  of information retrieval (3)
Model of information retrieval (3)9866825059
 
Konsep Dasar Information Retrieval - Edi faizal
Konsep Dasar Information Retrieval - Edi faizal Konsep Dasar Information Retrieval - Edi faizal
Konsep Dasar Information Retrieval - Edi faizal EdiFaizal2
 
Information retrieval introduction
Information retrieval introductionInformation retrieval introduction
Information retrieval introductionnimmyjans4
 
CS6007 information retrieval - 5 units notes
CS6007   information retrieval - 5 units notesCS6007   information retrieval - 5 units notes
CS6007 information retrieval - 5 units notesAnandh Arumugakan
 
Information retrieval-systems notes
Information retrieval-systems notesInformation retrieval-systems notes
Information retrieval-systems notesBAIRAVI T
 
Web_Mining_Overview_Nfaoui_El_Habib
Web_Mining_Overview_Nfaoui_El_HabibWeb_Mining_Overview_Nfaoui_El_Habib
Web_Mining_Overview_Nfaoui_El_HabibEl Habib NFAOUI
 
Tovek Presentation by Livio Costantini
Tovek Presentation by Livio CostantiniTovek Presentation by Livio Costantini
Tovek Presentation by Livio Costantinimaxfalc
 
Techniques of information retrieval
Techniques of information retrieval Techniques of information retrieval
Techniques of information retrieval Tariq Hassan
 
Text data mining1
Text data mining1Text data mining1
Text data mining1KU Leuven
 
4.4 text mining
4.4 text mining4.4 text mining
4.4 text miningKrish_ver2
 
A Robust Keywords Based Document Retrieval by Utilizing Advanced Encryption S...
A Robust Keywords Based Document Retrieval by Utilizing Advanced Encryption S...A Robust Keywords Based Document Retrieval by Utilizing Advanced Encryption S...
A Robust Keywords Based Document Retrieval by Utilizing Advanced Encryption S...IRJET Journal
 
Research on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesResearch on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesKausar Mukadam
 
Latest trends in AI and information Retrieval
Latest trends in AI and information Retrieval Latest trends in AI and information Retrieval
Latest trends in AI and information Retrieval Abhay Ratnaparkhi
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibEl Habib NFAOUI
 
information retrieval Techniques and normalization
information retrieval Techniques and normalizationinformation retrieval Techniques and normalization
information retrieval Techniques and normalizationAmeenababs
 
Tovek Presentation 2 by Livio Costantini
Tovek Presentation 2 by Livio CostantiniTovek Presentation 2 by Livio Costantini
Tovek Presentation 2 by Livio Costantinimaxfalc
 

La actualidad más candente (20)

Model of information retrieval (3)
Model  of information retrieval (3)Model  of information retrieval (3)
Model of information retrieval (3)
 
Web search engines
Web search enginesWeb search engines
Web search engines
 
Konsep Dasar Information Retrieval - Edi faizal
Konsep Dasar Information Retrieval - Edi faizal Konsep Dasar Information Retrieval - Edi faizal
Konsep Dasar Information Retrieval - Edi faizal
 
Information retrieval introduction
Information retrieval introductionInformation retrieval introduction
Information retrieval introduction
 
CS6007 information retrieval - 5 units notes
CS6007   information retrieval - 5 units notesCS6007   information retrieval - 5 units notes
CS6007 information retrieval - 5 units notes
 
Information retrieval-systems notes
Information retrieval-systems notesInformation retrieval-systems notes
Information retrieval-systems notes
 
Web_Mining_Overview_Nfaoui_El_Habib
Web_Mining_Overview_Nfaoui_El_HabibWeb_Mining_Overview_Nfaoui_El_Habib
Web_Mining_Overview_Nfaoui_El_Habib
 
IR
IRIR
IR
 
Tovek Presentation by Livio Costantini
Tovek Presentation by Livio CostantiniTovek Presentation by Livio Costantini
Tovek Presentation by Livio Costantini
 
Techniques of information retrieval
Techniques of information retrieval Techniques of information retrieval
Techniques of information retrieval
 
Text data mining1
Text data mining1Text data mining1
Text data mining1
 
4.4 text mining
4.4 text mining4.4 text mining
4.4 text mining
 
Text mining
Text miningText mining
Text mining
 
A Robust Keywords Based Document Retrieval by Utilizing Advanced Encryption S...
A Robust Keywords Based Document Retrieval by Utilizing Advanced Encryption S...A Robust Keywords Based Document Retrieval by Utilizing Advanced Encryption S...
A Robust Keywords Based Document Retrieval by Utilizing Advanced Encryption S...
 
Research on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesResearch on ontology based information retrieval techniques
Research on ontology based information retrieval techniques
 
Latest trends in AI and information Retrieval
Latest trends in AI and information Retrieval Latest trends in AI and information Retrieval
Latest trends in AI and information Retrieval
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_Habib
 
Automatic indexing
Automatic indexingAutomatic indexing
Automatic indexing
 
information retrieval Techniques and normalization
information retrieval Techniques and normalizationinformation retrieval Techniques and normalization
information retrieval Techniques and normalization
 
Tovek Presentation 2 by Livio Costantini
Tovek Presentation 2 by Livio CostantiniTovek Presentation 2 by Livio Costantini
Tovek Presentation 2 by Livio Costantini
 

Destacado

Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...
Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...
Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...Dirk Lewandowski
 
Interactive informationretrieval 토인모_201202
Interactive informationretrieval 토인모_201202Interactive informationretrieval 토인모_201202
Interactive informationretrieval 토인모_201202Jungah Park
 
Building a Digital Library
Building a Digital LibraryBuilding a Digital Library
Building a Digital Librarytomasz
 
Wnl 122 towards social sementic by samhati soor
Wnl 122 towards social sementic by samhati soorWnl 122 towards social sementic by samhati soor
Wnl 122 towards social sementic by samhati soorKishor Satpathy
 
How Much to Semanticize? Looking at the future of Library Data and the Semant...
How Much to Semanticize? Looking at the future of Library Data and the Semant...How Much to Semanticize? Looking at the future of Library Data and the Semant...
How Much to Semanticize? Looking at the future of Library Data and the Semant...Jenn Riley
 
Semantic Web Technologies For Digital Libraries
Semantic Web Technologies For Digital LibrariesSemantic Web Technologies For Digital Libraries
Semantic Web Technologies For Digital LibrariesNikesh Narayanan
 
Weblog, Digital Library, and Semantic Web Services Approach to Computer-Aided...
Weblog, Digital Library, and Semantic Web Services Approach to Computer-Aided...Weblog, Digital Library, and Semantic Web Services Approach to Computer-Aided...
Weblog, Digital Library, and Semantic Web Services Approach to Computer-Aided...Dr. Thiti Vacharasintopchai, ATSI-DX, CISA
 
Semantic web technologies and digital library search
Semantic web technologies and digital library searchSemantic web technologies and digital library search
Semantic web technologies and digital library searchRichard Nurse
 
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E..."Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...Yelp Engineering
 
Untangling spring week2
Untangling spring week2Untangling spring week2
Untangling spring week2Derek Jacoby
 
Some Take-Home Message about Machine Learning
Some Take-Home Message about Machine LearningSome Take-Home Message about Machine Learning
Some Take-Home Message about Machine LearningGianluca Bontempi
 
Graphical Models for chains, trees and grids
Graphical Models for chains, trees and gridsGraphical Models for chains, trees and grids
Graphical Models for chains, trees and gridspotaters
 
Applying Reinforcement Learning for Network Routing
Applying Reinforcement Learning for Network RoutingApplying Reinforcement Learning for Network Routing
Applying Reinforcement Learning for Network Routingbutest
 
Streamlining Technology to Reduce Complexity and Improve Productivity
Streamlining Technology to Reduce Complexity and Improve ProductivityStreamlining Technology to Reduce Complexity and Improve Productivity
Streamlining Technology to Reduce Complexity and Improve ProductivityKevin Fream
 
Machine Learning techniques
Machine Learning techniques Machine Learning techniques
Machine Learning techniques Jigar Patel
 
Power of Code: What you don’t know about what you know
Power of Code: What you don’t know about what you knowPower of Code: What you don’t know about what you know
Power of Code: What you don’t know about what you knowcdathuraliya
 
One Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database RevolutionOne Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database Revolutionmark madsen
 
07 history of cv vision paradigms - system - algorithms - applications - eva...
07  history of cv vision paradigms - system - algorithms - applications - eva...07  history of cv vision paradigms - system - algorithms - applications - eva...
07 history of cv vision paradigms - system - algorithms - applications - eva...zukun
 

Destacado (20)

Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...
Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...
Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...
 
Interactive informationretrieval 토인모_201202
Interactive informationretrieval 토인모_201202Interactive informationretrieval 토인모_201202
Interactive informationretrieval 토인모_201202
 
Relevance Assessment Tool
Relevance Assessment ToolRelevance Assessment Tool
Relevance Assessment Tool
 
Building a Digital Library
Building a Digital LibraryBuilding a Digital Library
Building a Digital Library
 
Wnl 122 towards social sementic by samhati soor
Wnl 122 towards social sementic by samhati soorWnl 122 towards social sementic by samhati soor
Wnl 122 towards social sementic by samhati soor
 
How Much to Semanticize? Looking at the future of Library Data and the Semant...
How Much to Semanticize? Looking at the future of Library Data and the Semant...How Much to Semanticize? Looking at the future of Library Data and the Semant...
How Much to Semanticize? Looking at the future of Library Data and the Semant...
 
Semantic Web Technologies For Digital Libraries
Semantic Web Technologies For Digital LibrariesSemantic Web Technologies For Digital Libraries
Semantic Web Technologies For Digital Libraries
 
Weblog, Digital Library, and Semantic Web Services Approach to Computer-Aided...
Weblog, Digital Library, and Semantic Web Services Approach to Computer-Aided...Weblog, Digital Library, and Semantic Web Services Approach to Computer-Aided...
Weblog, Digital Library, and Semantic Web Services Approach to Computer-Aided...
 
Semantic web technologies and digital library search
Semantic web technologies and digital library searchSemantic web technologies and digital library search
Semantic web technologies and digital library search
 
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E..."Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...
 
Untangling spring week2
Untangling spring week2Untangling spring week2
Untangling spring week2
 
Some Take-Home Message about Machine Learning
Some Take-Home Message about Machine LearningSome Take-Home Message about Machine Learning
Some Take-Home Message about Machine Learning
 
Graphical Models for chains, trees and grids
Graphical Models for chains, trees and gridsGraphical Models for chains, trees and grids
Graphical Models for chains, trees and grids
 
Applying Reinforcement Learning for Network Routing
Applying Reinforcement Learning for Network RoutingApplying Reinforcement Learning for Network Routing
Applying Reinforcement Learning for Network Routing
 
Streamlining Technology to Reduce Complexity and Improve Productivity
Streamlining Technology to Reduce Complexity and Improve ProductivityStreamlining Technology to Reduce Complexity and Improve Productivity
Streamlining Technology to Reduce Complexity and Improve Productivity
 
Machine Learning techniques
Machine Learning techniques Machine Learning techniques
Machine Learning techniques
 
Power of Code: What you don’t know about what you know
Power of Code: What you don’t know about what you knowPower of Code: What you don’t know about what you know
Power of Code: What you don’t know about what you know
 
Supervised Approach to Extract Sentiments from Unstructured Text
Supervised Approach to Extract Sentiments from Unstructured TextSupervised Approach to Extract Sentiments from Unstructured Text
Supervised Approach to Extract Sentiments from Unstructured Text
 
One Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database RevolutionOne Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database Revolution
 
07 history of cv vision paradigms - system - algorithms - applications - eva...
07  history of cv vision paradigms - system - algorithms - applications - eva...07  history of cv vision paradigms - system - algorithms - applications - eva...
07 history of cv vision paradigms - system - algorithms - applications - eva...
 

Similar a Aggregation for searching complex information spaces

Knowledge Representation on the Web
Knowledge Representation on the WebKnowledge Representation on the Web
Knowledge Representation on the WebRinke Hoekstra
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the partsCarole Goble
 
DB-IR-ranking
DB-IR-rankingDB-IR-ranking
DB-IR-rankingFELIX75
 
Chapter 1: Introduction to Information Storage and Retrieval
Chapter 1: Introduction to Information Storage and RetrievalChapter 1: Introduction to Information Storage and Retrieval
Chapter 1: Introduction to Information Storage and Retrievalcaptainmactavish1996
 
Semantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: IntroductionSemantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: IntroductionKent State University
 
eScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodeScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodDuncan Hull
 
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State a...
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State a...Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State a...
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State a...Ralf Stockmann
 
Navigation through citation network based on content similarity using cosine ...
Navigation through citation network based on content similarity using cosine ...Navigation through citation network based on content similarity using cosine ...
Navigation through citation network based on content similarity using cosine ...Salam Shah
 
Chapter 1 Introduction to Information Storage and Retrieval.pdf
Chapter 1 Introduction to Information Storage and Retrieval.pdfChapter 1 Introduction to Information Storage and Retrieval.pdf
Chapter 1 Introduction to Information Storage and Retrieval.pdfHabtamu100
 
Inteligent Catalogue Final
Inteligent Catalogue FinalInteligent Catalogue Final
Inteligent Catalogue Finalguestcaef1d
 
Research Interests : Their Dynamics, Structures and Applications in Personali...
Research Interests : Their Dynamics, Structures and Applications in Personali...Research Interests : Their Dynamics, Structures and Applications in Personali...
Research Interests : Their Dynamics, Structures and Applications in Personali...Yi Zeng
 
Twente ir-course 20-10-2010
Twente ir-course 20-10-2010Twente ir-course 20-10-2010
Twente ir-course 20-10-2010Arjen de Vries
 
CNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundationCNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundationJohn Doove
 
Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Peter Mika
 
Faceted search using Solr and Ontopia
Faceted search using Solr and OntopiaFaceted search using Solr and Ontopia
Faceted search using Solr and OntopiaGeir Ove Grønmo
 
Searching Heterogenous E Learning Resources
Searching Heterogenous E Learning ResourcesSearching Heterogenous E Learning Resources
Searching Heterogenous E Learning Resourcesimranlatif
 
Search Me: Using Lucene.Net
Search Me: Using Lucene.NetSearch Me: Using Lucene.Net
Search Me: Using Lucene.Netgramana
 

Similar a Aggregation for searching complex information spaces (20)

From federated to aggregated search
From federated to aggregated searchFrom federated to aggregated search
From federated to aggregated search
 
Knowledge Representation on the Web
Knowledge Representation on the WebKnowledge Representation on the Web
Knowledge Representation on the Web
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
 
DB-IR-ranking
DB-IR-rankingDB-IR-ranking
DB-IR-ranking
 
Chapter 1: Introduction to Information Storage and Retrieval
Chapter 1: Introduction to Information Storage and RetrievalChapter 1: Introduction to Information Storage and Retrieval
Chapter 1: Introduction to Information Storage and Retrieval
 
Semantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: IntroductionSemantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: Introduction
 
eScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodeScience: A Transformed Scientific Method
eScience: A Transformed Scientific Method
 
DB and IR Integration
DB and IR IntegrationDB and IR Integration
DB and IR Integration
 
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State a...
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State a...Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State a...
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State a...
 
Navigation through citation network based on content similarity using cosine ...
Navigation through citation network based on content similarity using cosine ...Navigation through citation network based on content similarity using cosine ...
Navigation through citation network based on content similarity using cosine ...
 
Podobnostní hledání v netextových datech (Pavel Zezula)
Podobnostní hledání v netextových datech (Pavel Zezula)Podobnostní hledání v netextových datech (Pavel Zezula)
Podobnostní hledání v netextových datech (Pavel Zezula)
 
Chapter 1 Introduction to Information Storage and Retrieval.pdf
Chapter 1 Introduction to Information Storage and Retrieval.pdfChapter 1 Introduction to Information Storage and Retrieval.pdf
Chapter 1 Introduction to Information Storage and Retrieval.pdf
 
Inteligent Catalogue Final
Inteligent Catalogue FinalInteligent Catalogue Final
Inteligent Catalogue Final
 
Research Interests : Their Dynamics, Structures and Applications in Personali...
Research Interests : Their Dynamics, Structures and Applications in Personali...Research Interests : Their Dynamics, Structures and Applications in Personali...
Research Interests : Their Dynamics, Structures and Applications in Personali...
 
Twente ir-course 20-10-2010
Twente ir-course 20-10-2010Twente ir-course 20-10-2010
Twente ir-course 20-10-2010
 
CNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundationCNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundation
 
Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012
 
Faceted search using Solr and Ontopia
Faceted search using Solr and OntopiaFaceted search using Solr and Ontopia
Faceted search using Solr and Ontopia
 
Searching Heterogenous E Learning Resources
Searching Heterogenous E Learning ResourcesSearching Heterogenous E Learning Resources
Searching Heterogenous E Learning Resources
 
Search Me: Using Lucene.Net
Search Me: Using Lucene.NetSearch Me: Using Lucene.Net
Search Me: Using Lucene.Net
 

Más de Mounia Lalmas-Roelleke

Engagement, Metrics & Personalisation at Scale
Engagement, Metrics &  Personalisation at ScaleEngagement, Metrics &  Personalisation at Scale
Engagement, Metrics & Personalisation at ScaleMounia Lalmas-Roelleke
 
Engagement, metrics and "recommenders"
Engagement, metrics and "recommenders"Engagement, metrics and "recommenders"
Engagement, metrics and "recommenders"Mounia Lalmas-Roelleke
 
Metrics, Engagement & Personalization
Aggregation for searching complex information spaces

  • 1. Aggregation for searching complex information spaces. Mounia Lalmas
  • 2.
  • 3.
  • 4. Three retrieval paradigms Document Retrieval Focused Retrieval Aggregated Retrieval Complexity of the information space (s)
  • 5. Classical document retrieval Retrieval System Query Document corpus Ranked Documents One homogeneous information space
  • 6. Classical document retrieval process Documents Query Ranked documents Representation Function Representation Function Query Representation Document Representation Retrieval Function Index
  • 7. Information retrieval process Documents Query Results Representation Function Representation Function Query representation Object representation Retrieval Function Index Task Context Interface Interaction Multimodality Genre Media Language Structure Heterogeneity The Turn, Ingwersen & Jarvelin, 2005
  • 8.
  • 9. Focused Retrieval - Question & Answering
  • 10.
  • 11.
  • 12. Logical structure - XML. Rendered document: "This is a heading", "This is some text", "This is a quote". Markup: <doc> <head>This is a heading</head> <text>This is some text</text> <quote>This is a quote</quote> </doc>. Element tree: doc (root) with children head, text, quote.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17. XML “Element” Retrieval (Courtesy of Norbert Goevert)
  • 18. “Element” ranking algorithms. Combination of evidence: element score, document score, element size, … “Aggregation” in semi-complex information spaces. Models used: vector space model, language model, extending DB model, polyrepresentation, probabilistic model, logistic regression, Bayesian network, divergence from randomness, Boolean model, machine learning, belief model, statistical model, natural language processing, structured text models.
  • 19.
  • 20.
  • 21. Aggregated result - Relevance in context (Courtesy of Jaap Kamps)
  • 22. Aggregated result - Element-biased table of content (Courtesy of Zoltan Szlavik)
  • 23.
  • 24. Aggregated (virtual) documents. [Figure: a ranked list of 13 elements grouped into 3 virtual documents.] Special case: relevant in context. Chiaramella & Roelleke
  • 25.
  • 26. Yippy – Clustering search engine from Vivisimo clusty.com
  • 28. “Fictitious” document generation (Courtesy of Cecile Paris)
  • 29. Aggregated views. [Figure: a ranked list of 13 elements arranged into 3 views.] Special case: element-biased table of content
  • 30. Aggregated views (non-blended) http://au.alpha.yahoo.com/
  • 31. Naver.com – Korean search engine
  • 33. Aggregated views (entities and relationships)
  • 34.
  • 35.
  • 36.
  • 37.
  • 38. Result presentation: User studies. Blended vs non-blended interfaces; 3 verticals (image, video, news); 3 positions; 3 vertical intents (high, medium, low). Image placements shown: on top, in the middle, at the bottom, at top-right, on the left, at the bottom-right.
  • 39.
  • 40. Evaluation: Test collections. [Figure: existing test collections (ImageCLEF photo retrieval track, TREC web track, INEX ad-hoc track, TREC blog track, …), each providing topics, documents d1 … dn and relevance judgments (R/N), are mapped onto (simulated) verticals: Blog, Reference (Encyclopedia), Image, General Web, Shopping. For each topic, documents and judgments are then listed per vertical V1 … Vk.]
  • 41. Evaluation: Test collections
       Statistics on topics:
         number of topics: 150
         average rel docs per topic: 110.3
         average rel verticals per topic: 1.75
         ratio of “General Web” topics: 29.3%
         ratio of topics with two vertical intents: 66.7%
         ratio of topics with more than two vertical intents: 4.0%
       Quantity by media (text / image / video / total):
         size (G): 2125 / 41.1 / 445.5 / 2611.6
         number of documents: 86,186,315 / 670,439 / 1,253* / 86,858,007
       * On average, each video clip (document) contains more than 100 events/shots.
       Zhou
  • 42.
  • 43.
  • 44.
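
The element ranking idea on slides 12 and 18 can be illustrated with a small sketch. It parses the example document from slide 12, treats every element (including the root) as a candidate retrieval unit, and combines the three kinds of evidence listed on slide 18: element score, document score and element size. The term-overlap scoring function, the linear combination and the weights `w_element`, `w_document` and `w_size` are assumptions made for illustration only; the slides do not prescribe a particular model.

```python
import xml.etree.ElementTree as ET

# The example document from slide 12.
XML_DOC = """<doc>
<head>This is a heading</head>
<text>This is some text</text>
<quote>This is a quote</quote>
</doc>"""


def tokens(text):
    """Lowercase whitespace tokenisation (a deliberate simplification)."""
    return text.lower().split()


def overlap(query_terms, text):
    """Toy relevance estimate: fraction of query terms present in the text."""
    if not query_terms:
        return 0.0
    words = set(tokens(text))
    return sum(term in words for term in query_terms) / len(query_terms)


def rank_elements(xml_string, query, w_element=0.6, w_document=0.3, w_size=0.1):
    """Score every element of one document; return (score, tag) pairs, best first.

    The weights are hypothetical and only illustrate combination of evidence.
    """
    root = ET.fromstring(xml_string)
    query_terms = tokens(query)
    doc_text = " ".join(root.itertext())
    doc_len = len(tokens(doc_text))
    doc_score = overlap(query_terms, doc_text)

    ranked = []
    for element in root.iter():  # yields the root and all of its descendants
        text = " ".join(element.itertext())
        size = len(tokens(text)) / doc_len  # element length relative to the document
        score = (w_element * overlap(query_terms, text)
                 + w_document * doc_score
                 + w_size * size)
        ranked.append((round(score, 3), element.tag))
    return sorted(ranked, reverse=True)


for score, tag in rank_elements(XML_DOC, "quote"):
    print(f"{score:.3f}  {tag}")
```

With these weights the size component favours larger elements, so the root can outrank a matching leaf; changing the weights shifts that balance, which is the sense in which what gets combined matters.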

Editor's notes

  1. Passage retrieval was an active research area in the mid-90s. The idea is that a document is decomposed into passages, and retrieval is then performed against the passages rather than the whole document. A main issue is the actual decomposition of the document into passages. There were three main techniques: sliding windows of words; using the discourse structure, e.g. every sentence is a passage or, more likely, every paragraph is a passage; and using a topic segmentation algorithm such as TextTiling to identify shifts in topic and to place passage boundaries at those shifts.
  2. Now I will concentrate on the use of structure, and in particular XML, to perform focused retrieval. But let me say a few words about structure and documents. Structure is everywhere in a document. I will mainly concentrate on structure in the sense of the hierarchical or logical structure.
  3. This illustrates the logical structure of a document and how it is represented in the XML markup language. XML is the de facto markup language for documents, and represents both the content and the structure of documents. Note that here doc, head, text, and quote are what are called XML elements. doc is the root element, or in other words the whole document.
  4. How can the structure be used for focused retrieval? Before describing the use of structure to retrieve so-called XML elements, I want to say a few words about complex information needs.
  5. To express information needs in the context of XML retrieval, we need query languages. These can be divided into four main groups, with expressivity increasing from the top of the slide to the bottom. Which query language is appropriate depends on the application and its users.
  6. Now, what XML retrieval is about: beforehand, we do not know at which level of the hierarchy the best answer to a given query is contained. So we do not know what the retrieval unit actually is, as any element can be returned. Elements are related to each other, and this may be useful for finding those that should be returned. Finally, we may have structural constraints to satisfy.
  7. This is an example of the outcome of the techniques described so far, where, as the result for a given query, we have a ranked list of elements.
  8. Lots of models from IR and approaches from other areas have been used to rank elements for a given query. It is not yet known which model is best. XML element retrieval is more of a combination-of-evidence problem, where what matters is what gets combined. We know that, whatever the model, combining the element score (estimated relevance), the document score, and some information about element size is what brings the best performance.
  9. At INEX, a number of user studies were carried out (Tassos Tombros at QMUL was heavily involved in these). One outcome was that, although users liked being returned XML elements, they preferred these to be grouped by the document containing them. This led to the definition of a retrieval task at INEX called relevance in context: rank the articles, then highlight the relevant elements within each. This can be implemented as a heat map. This is an example of an aggregated result.
  10. A second outcome of the user studies was that users like to be shown the context of the element they are reading. This can be done with the following interface, where on one side we display the element and on the other side the ToC of the document containing that element. As part of his PhD, Zoltan looked at ways to build a ToC that adapts to the element being displayed, where, for example, the structure of the elements around the displayed element is shown in more detail than that of elements further away. This is again an example of an aggregated result.
  11. This shows an initial ranking of elements, where elements in the same document are shown in the same colour. We could aggregate elements in various ways to form virtual documents. The terminology of virtual document came a while ago from a discussion between TR and YC, and Thomas has extended it in the so-called augmentation context. RiC is a special case of the above, where we group elements per document. There is nothing stopping us from grouping across documents.
  12. The notion of virtual documents is not new, for example on the web, although the aggregation aspect is not exactly the same. One example of virtual documents is the organization of results into clusters: a cluster is a virtual document. We can go further and provide a summary of the documents forming the clusters, which is what WebInEssence (no longer working) did. In the news domain, TDT at TREC is about detecting topics and tracking them so as to present the whole news item to users in one go.
  13. Aggregated retrieval may also consist of what I call aggregated views. Going back to our example, we could for instance have one window showing one type of result (title elements), another window showing abstract elements, and so on. Note that the ToC is a special case, where we have an element in one window and, in the other, a selected number of elements that correspond to the ToC for that element.
  14. Test collection generation: (1) start from existing test collections (topic, doc, judgment); (2) define a set of verticals (varied across genres and media); (3) document classification (mapping documents in existing test collections into the verticals defined); (4) duplicate topic detection (detecting duplicate topics across existing test collections, e.g. “golden gate bridge” in the image and web test collections); (5) extract topics with more than one vertical intent; (6) add topics with only the “General Web” vertical intent.
  15. Two tables + one figure. Table 1: statistics of the whole test collection. Figure 1: breakdown of text documents into different genres. Table 2: basic statistics of topics. These results are based on the AIRS paper.
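
The relevance in context task described in note 9 (rank articles, then show the relevant elements inside each) can be sketched as a regrouping of an element ranking. This is only a minimal illustration: the `ranked_elements` data, the element paths and the choice of scoring a document by its best element are all hypothetical and are not taken from INEX or from the talk.

```python
from collections import defaultdict

# Hypothetical element ranking: (document id, element path, estimated relevance).
ranked_elements = [
    ("d2", "/article/sec[1]", 0.91),
    ("d1", "/article/sec[3]/p[2]", 0.84),
    ("d2", "/article/sec[4]/p[1]", 0.60),
    ("d3", "/article/abstract", 0.55),
    ("d1", "/article/sec[1]", 0.40),
]


def relevance_in_context(elements):
    """Group elements by document and rank documents by their best element.

    Returns a list of (doc_id, [(path, score), ...]) pairs. Taking the maximum
    element score as the document score is an assumption of this sketch.
    """
    by_doc = defaultdict(list)
    for doc_id, path, score in elements:
        by_doc[doc_id].append((path, score))

    result = []
    for doc_id, items in by_doc.items():
        # Within a document, list the highest-scoring elements first.
        items.sort(key=lambda item: item[1], reverse=True)
        result.append((doc_id, items))

    # Rank documents by the score of their best element.
    result.sort(key=lambda entry: entry[1][0][1], reverse=True)
    return result


for doc_id, items in relevance_in_context(ranked_elements):
    print(doc_id)
    for path, score in items:
        print(f"  {score:.2f}  {path}")
```

Grouping across documents, as suggested in note 11, would amount to replacing the document id with another grouping key.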