Retrieval and Feedback Models for Blog Feed Search SIGIR 2008 Singapore Jonathan Elsas, Jaime Arguello, Jamie Callan & Jaime Carbonell LTI/SCS/CMU
Outline
Background
What is a Blog?
What is a Feed? <xml> <feed> <entry> <author>Peter …</> <title>Good, Evil…</> <content>I’ve said…</> </entry> <entry> <author>Peter …</> <title>Agreeing…</> <content>Some peo…</> </entry> …
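A minimal sketch of how a feed like the one above could be split into its entries, using Python's standard `xml.etree.ElementTree`. The sample feed is a hypothetical, well-formed stand-in for the abbreviated slide example (the slide elides closing tags and text), and `parse_entries` is an illustrative helper name, not something from the talk.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the abbreviated feed on the slide.
SAMPLE_FEED = """<feed>
  <entry>
    <author>Peter</author>
    <title>First post</title>
    <content>Text of the first post.</content>
  </entry>
  <entry>
    <author>Peter</author>
    <title>Second post</title>
    <content>Text of the second post.</content>
  </entry>
</feed>"""


def parse_entries(feed_xml):
    """Return one dict per <entry>, keeping author, title and content."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.iter("entry"):
        entries.append({
            "author": entry.findtext("author", default=""),
            "title": entry.findtext("title", default=""),
            "content": entry.findtext("content", default=""),
        })
    return entries


entries = parse_entries(SAMPLE_FEED)
print(len(entries), entries[0]["title"])
```

Real Atom or RSS feeds use namespaces and different element names, so a production parser would need to account for those.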
Blog-Feed Correspondence Blog Feed Post Entry HTML XML
Why are Blogs important? [http://www.technorati.com/about/]
The Task
Feed Search at TREC (a.k.a. Blog Distillation)
Feed Search at TREC Represent Ongoing Information Needs; Frequently Very General
Challenges in Feed Search
Challenges in Feed Search [diagram: a feed as a sequence of entries over time]
Challenges in Feed Search [diagram: entries of a "Space Exploration" feed plotted by time and topic: NASA, China’s plans for the moon, shuttle launch, My dog, Mars rover, Boeing]
Challenges in Feed Search [diagram: "Space Exploration" entries plotted by time and topic]
Challenges in Feed Search [diagram: time axis]
Challenges in Feed Search
[Mac] … post regularly about new products, features, or application software of Apple Mac computers.
[Music] … describing songs, biographies of musicians, musical styles and their influences of music on people are discussed.
[Wine] … such as tastings, reviews, food matching or pairing, and oenophile news and events.
[Food] … describing experiences eating cuisines, culinary delights, recipes, nutrition plans.
Our Approach
Challenges and Our Approach. Feeds: Retrieval Models. Information Needs (General & Ongoing): Feedback Models
Retrieval Models
Large Document (Feed) Model [Q] <?xml… … </…> `<?xml… … </…> <?xml… … </…> <?xml… <feed> <entry> <entry> <entry> <entry> <entry> … </…> <?xml… … </…> <?xml… … </…> <?xml… … </…> <?xml… <feed> <entry> <entry> <entry> <entry> <entry> … </…> Feed Document  Collection Ranked Feeds Rank by Indri’s standard retrieval model [Metzler and Croft, 2004; 2005]
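The large document model can be sketched as follows: concatenate each feed's entries into one document, then rank feeds with a language-model scorer. The sketch below assumes a plain Dirichlet-smoothed query-likelihood score as a simple stand-in for the retrieval model cited on the slide; it is not Indri's implementation, and the function names, the toy data, and the `mu=2500` default are assumptions made for illustration.

```python
import math
from collections import Counter


def query_log_likelihood(query_terms, doc_terms, coll_counts, coll_len, mu=2500.0):
    """Dirichlet-smoothed log P(Q | D); mu is an assumed smoothing parameter."""
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_coll = coll_counts.get(term, 0) / coll_len
        if p_coll == 0.0:
            continue  # term never occurs in the collection: ignore it
        score += math.log((doc_counts[term] + mu * p_coll) / (doc_len + mu))
    return score


def rank_feeds_large_document(query_terms, feeds, mu=2500.0):
    """feeds maps feed_id -> list of entries, each entry a list of tokens."""
    # Large document model: a feed is the concatenation of all its entries.
    feed_docs = {fid: [tok for entry in entries for tok in entry]
                 for fid, entries in feeds.items()}
    coll_counts = Counter(tok for doc in feed_docs.values() for tok in doc)
    coll_len = sum(coll_counts.values())
    scored = [(query_log_likelihood(query_terms, doc, coll_counts, coll_len, mu), fid)
              for fid, doc in feed_docs.items()]
    return [fid for _, fid in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


toy_feeds = {
    "space": [["nasa", "shuttle", "launch"], ["mars", "rover"]],
    "pets": [["my", "dog"], ["dog", "food"]],
}
ranking = rank_feeds_large_document(["mars", "rover"], toy_feeds)
print(ranking)
```

Because every entry contributes to a single bag of words, long feeds with a few on-topic entries and many off-topic ones are scored on their overall mixture.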
Large Document (Feed) Model [diagram: a feed treated as one document made up of its entries]
Small Document (Entry) Model Ranked Entries [Q] <entry> <entry> <entry> <entry> <?xml… <entry> Entry Document  Collection <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> Ranked Feeds document = entry Apply some rank aggregation function Rank By
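A hedged sketch of the "rank entries, then aggregate" idea: entries are retrieved as individual documents and their scores are combined per feed. The slide only says "some rank aggregation function", so the plain sum of entry scores used here is an assumption, as are the function name and the optional `top_n` cutoff.

```python
from collections import defaultdict


def aggregate_entry_ranking(ranked_entries, top_n=None):
    """ranked_entries: list of (entry_id, feed_id, score), best entry first.

    Returns (feed_id, aggregated_score) pairs, best feed first.
    Summing entry scores is an assumed aggregation function.
    """
    if top_n is not None:
        ranked_entries = ranked_entries[:top_n]  # only look at the top entries
    feed_scores = defaultdict(float)
    for _entry_id, feed_id, score in ranked_entries:
        feed_scores[feed_id] += score
    return sorted(feed_scores.items(), key=lambda kv: (-kv[1], kv[0]))


toy_entries = [
    ("e1", "A", 0.5),
    ("e2", "B", 0.4),
    ("e3", "B", 0.3),
    ("e4", "A", 0.1),
]
feed_ranking = aggregate_entry_ranking(toy_entries)
print([feed for feed, _ in feed_ranking])
```

In the toy data feed B wins because two of its entries score well, even though the single best entry belongs to feed A.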
Small Document (Entry) Model ReDDE Federated Search Algorithm [Si & Callan, 2003]
Entry Centrality [diagram axes: time, topic]
Small Document (Entry) Model Improves not only speed but also performance
Retrieval Model Results
Retrieval Model Results
Retrieval Model Results [chart: Mean Average Precision; Large Document (Feed) Model vs. Small Document (Entry) Models]
Retrieval Model Results [chart: Mean Average Precision by prior: Uniform, Log(Feed Length); labeled value: MAP 0.188]
Retrieval Model Results [chart: Mean Average Precision by prior: Uniform, Log(Feed Length); one Uniform cell marked n/a]
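For readers unfamiliar with the metric on these charts, here is a small sketch of the standard definition of Mean Average Precision. The function names and the toy ranking are illustrative only and are not taken from the talk's evaluation setup.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of precision values at the rank of each relevant item."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant items that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


ap = average_precision(["a", "b", "c", "d"], {"a", "c"})
print(round(ap, 4))
```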
Feedback Models
Query Expansion (PRF) [Q] BLOG06 Collection Related Terms from top K documents [Q + Terms] [Lavrenko & Croft, 2001]
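The pseudo-relevance feedback loop above can be sketched as: retrieve the top K documents, weight each one by its retrieval score, and keep the terms with the highest weighted frequency. This is a simplified estimate in the spirit of relevance models, not a faithful reimplementation of the cited method; the function name, the stopword handling, and the toy data are assumptions.

```python
from collections import Counter


def prf_expansion_terms(top_docs, num_terms=5, stopwords=frozenset()):
    """top_docs: list of (tokens, weight) for the top K retrieved documents.

    weight is a non-negative retrieval score; each term is scored by the
    weight-normalized sum of its relative frequency in each document.
    """
    total_weight = sum(weight for _, weight in top_docs)
    term_scores = Counter()
    if total_weight == 0:
        return []
    for tokens, weight in top_docs:
        if not tokens:
            continue
        counts = Counter(tokens)
        for term, count in counts.items():
            if term in stopwords:
                continue
            term_scores[term] += (weight / total_weight) * (count / len(tokens))
    return sorted(term_scores.items(), key=lambda kv: (-kv[1], kv[0]))[:num_terms]


toy_top_docs = [
    (["camera", "lens", "camera"], 3.0),
    (["lens", "film"], 1.0),
]
expansion = prf_expansion_terms(toy_top_docs, num_terms=3)
print([term for term, _ in expansion])
```

The selected terms would then be appended to the original query, typically with a lower weight than the original query terms.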
Query Expansion Example [Photography] PRF: photography, nude, erotic, art, girl, free, teen, fashion, women
Feedback Model Results [chart: Mean Average Precision; None vs. PRF]
Query Expansion (Wikipedia PRF) [Q] BLOG06 Collection [Q + Terms] [Lavrenko & Croft, 2001] Wikipedia [Diaz & Metzler, 2006] Related Terms from top K documents
Query Expansion Example [Photography] PRF: photography, nude, erotic, art, girl, free, teen, fashion, women. Wikipedia PRF: photography, director, special, film, art, camera, music, cinematographer, photographic
Feedback Model Results [chart: Mean Average Precision; None, PRF, Wiki. PRF]
Query Expansion (Wikipedia Link) [Q] BLOG06 Collection [Q + Terms] Wikipedia Related Terms from  link structure
Wikipedia Link-Based Query Expansion
Wikipedia Link-Based Expansion [diagram: query Q is run against Wikipedia]
Wikipedia Link-Based Expansion [diagram: ranked Wikipedia articles for Q; Relevance Set, Top R = 100; Working Set, Top W = 1000]
Wikipedia Link-Based Expansion Relevance Set, Top R = 100; Working Set, Top W = 1000. Extract anchor text from Working Set articles that link to the Relevance Set.
Wikipedia Link-Based Expansion Relevance Set, Top R = 500; Working Set, Top W = 1000. Extract anchor text from Working Set articles that link to the Relevance Set. Combines relevance and popularity. Relevance: an anchor phrase that links to a high-ranked article gets a high score. Popularity: an anchor phrase that links many times to a mid-ranked article also gets a high score.
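The procedure described above can be sketched as follows. The slides state only that the score combines relevance and popularity; the specific formula used here, adding `R - rank + 1` for every link into the Relevance Set, is an assumption chosen because it has both properties. The function name, the link-triple data layout, and the toy articles are also hypothetical.

```python
from collections import defaultdict


def score_anchor_phrases(ranked_articles, links, relevance_size=100, working_size=1000):
    """ranked_articles: Wikipedia article ids retrieved for Q, best first.
    links: iterable of (source_article, anchor_text, target_article).

    Only links that start in the Working Set (top W) and point into the
    Relevance Set (top R) contribute to an anchor phrase's score.
    """
    working_set = set(ranked_articles[:working_size])
    rank_in_relevance = {article: rank for rank, article
                         in enumerate(ranked_articles[:relevance_size], start=1)}
    scores = defaultdict(float)
    for source, anchor, target in links:
        if source in working_set and target in rank_in_relevance:
            # Higher-ranked targets add more (relevance); repeated links
            # to the same target accumulate (popularity).
            scores[anchor.lower()] += relevance_size - rank_in_relevance[target] + 1
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


toy_ranking = ["Photography", "Camera", "Film", "Lens"]
toy_links = [
    ("Camera", "photography", "Photography"),
    ("Film", "camera", "Camera"),
    ("Lens", "camera", "Camera"),
    ("Photography", "camera", "Camera"),
    ("Lens", "film stock", "Film"),  # target is outside the top R, so ignored
]
anchor_scores = score_anchor_phrases(toy_ranking, toy_links,
                                     relevance_size=2, working_size=4)
print(anchor_scores)
```

In the toy data "camera" outranks "photography" even though its target is ranked lower, because three separate links use that phrase.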
Query Expansion Example [Photography] PRF: photography, nude, erotic, art, girl, free, teen, fashion, women. Ideal: digital photography, depth of field, photographic film, photojournalism, cinematography
Feedback Model Results [chart: Mean Average Precision; None, PRF, Wiki. PRF, Wiki. Link]
Conclusion
Thank You! Student Travel Grant funding from:    ACM SIGIR,    Amit Singhal,    Microsoft Research
Entry Centrality GM Derivation where Entry Generation Likelihood: |E|
Query Expansion Examples [Music] PRF: Music, Country, Download, Free, MP3, Mp3andmore, Lyric, Listen, Song
Query Expansion Examples [Scottish Independence] PRF: scotland, independence, party, convention, politics, snp, national, people, scot
Query Expansion Examples [Machine Learning] PRF: learn, machine, credit, card, karaoke, journal, sex, model, sew
Query Generality Characteristics
Relevance Set Cohesiveness [diagram: Relevance Set, Top R = 100, within Wikipedia] $\text{Cohesiveness} = \dfrac{|L_{in}|}{|L_{in} \cup L_{out}|}$
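A small sketch of the cohesiveness ratio. The slide does not define its link sets in the surviving text, so this assumes that L_in holds links from Relevance Set articles to other Relevance Set articles and L_out holds links from Relevance Set articles to articles outside it. The function name and toy data are hypothetical.

```python
def relevance_set_cohesiveness(relevance_set, links):
    """relevance_set: article ids in the top R.
    links: iterable of (source_article, target_article) pairs.

    Returns |L_in| / |L_in U L_out| under the assumed definitions:
    L_in  = links between two Relevance Set articles,
    L_out = links leaving the Relevance Set.
    """
    relevance_set = set(relevance_set)
    l_in = set()
    l_out = set()
    for source, target in links:
        if source not in relevance_set:
            continue  # only links that start inside the Relevance Set count
        if target in relevance_set:
            l_in.add((source, target))
        else:
            l_out.add((source, target))
    total = len(l_in | l_out)
    return len(l_in) / total if total else 0.0


cohesiveness = relevance_set_cohesiveness(
    {"A", "B", "C"},
    [("A", "B"), ("B", "C"), ("C", "X"), ("X", "A")],
)
print(round(cohesiveness, 3))
```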
Relevant Set Cohesiveness
Is it the Queries? But none of these measures predicts whether Wikipedia expansion helps…

