Better Contextual Suggestions from ClueWeb12
Using Domain Knowledge Inferred from The Open Web

Thaer Samar†, Alejandro Bellogín*, Arjen P. de Vries†
† Centrum Wiskunde & Informatica, The Netherlands
* Universidad Autónoma de Madrid, Spain
{samar, arjen.de.vries}@cwi.nl, alejandro.bellogin@uam.es

Results and Analysis

General performance
• The TouristFiltered run is better than the GeoFiltered run
[Figure: Performance of TouristFiltered vs. GeoFiltered]
[Figure: Percentage of topics where the TouristFiltered run is better than, equal to, and worse than the GeoFiltered run]

Dimensions of relevance
• Effect of geographical (geo), description (desc), and document (doc) relevance on the performance of the TouristFiltered and GeoFiltered runs
• Considering desc and doc relevance only: both runs have almost the same effectiveness
• Considering geo relevance only: TouristFiltered is more geographically appropriate

Effect of TouristFiltered parts on the performance
• The TouristFiltered sub-collection consists of three parts: A. TouristListFiltered (TLF), B. TouristOutlinksFiltered (TOF), and C. AttractionFiltered (AF)
• Effect of each part on the performance: the best results come from the part obtained through a specialized service like Foursquare

Conclusions
• Applying domain knowledge about sites that are more likely to offer attractions leads to better suggestions
• The best results are obtained when identifying attractions through specialized services such as Foursquare

Contextual Suggestion Model
Goal: Apply domain knowledge inferred from the Open Web to get better suggestions from the Archived Web (ClueWeb12)
2. User Profiles Generation
For each user:
Aggregate descriptions of attractions rated by the user
Split descriptions into positive and negative profiles based on user ratings
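The profile generation step above could be sketched roughly as follows. The poster does not state the rating scale or the cut-off between positive and negative ratings, so the `threshold` parameter, the `(user, attraction_id, rating)` input format, and the function name `build_profiles` are all assumptions made for illustration.

```python
from collections import defaultdict


def build_profiles(ratings, descriptions, threshold=3):
    """Aggregate attraction descriptions into per-user positive/negative profiles.

    ratings: iterable of (user, attraction_id, rating) tuples (hypothetical format).
    descriptions: dict mapping attraction_id to its description text.
    threshold: assumed cut-off; ratings >= threshold count as positive.
    """
    collected = defaultdict(lambda: {"positive": [], "negative": []})
    for user, attraction_id, rating in ratings:
        text = descriptions.get(attraction_id)
        if text is None:
            continue  # no description available for this attraction
        side = "positive" if rating >= threshold else "negative"
        collected[user][side].append(text)
    # Join the collected descriptions into one text per profile.
    return {
        user: {side: " ".join(texts) for side, texts in sides.items()}
        for user, sides in collected.items()
    }
```

Each user ends up with two aggregated texts, which can then be turned into vectors in the representation step.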
3. Representation
Represent attractions and user profiles in a Vector Space Model:
1. clean attraction documents (remove HTML tags)
2. remove stop words from profiles and documents
3. transform documents and profiles into vectors of <term, frequency> pairs
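A minimal sketch of the three representation sub-steps, using only the Python standard library. The regex-based tag stripping, the tokenizer, and the tiny `STOP_WORDS` set are simplifying assumptions; the poster does not say which HTML cleaner or stop-word list was used.

```python
import re
from collections import Counter
from html import unescape

# Tiny illustrative stop-word list (assumption; a real list would be much longer).
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}


def clean_html(html):
    """Remove script/style blocks and all remaining tags, then decode entities."""
    text = re.sub(r"(?is)<(script|style).*?>.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    return unescape(text)


def to_vector(text):
    """Map text to <term, frequency> pairs, skipping stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)
```

For example, `to_vector(clean_html(page))` gives the term-frequency vector of an attraction page, and `to_vector(profile_text)` does the same for a user profile.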
4. Similarity of Attractions & User Profiles
For each set of attractions per (user, context) topic
Compute similarity between each attraction and positive and negative profiles
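The similarity step could look like the sketch below. The poster only says that a similarity is computed against the positive and negative profiles; cosine similarity and the ranking key (positive similarity minus negative similarity) are assumptions chosen here for illustration, not necessarily the authors' scoring.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_attractions(attractions, positive, negative):
    """Score each attraction of one (user, context) topic against both profiles.

    attractions: dict mapping attraction_id to its term-frequency vector.
    Returns (attraction_id, sim_positive, sim_negative) tuples, ordered by the
    assumed score sim_positive - sim_negative, highest first.
    """
    scored = [
        (attraction_id, cosine(vec, positive), cosine(vec, negative))
        for attraction_id, vec in attractions.items()
    ]
    return sorted(scored, key=lambda s: s[1] - s[2], reverse=True)
```

Keeping both similarities in the output leaves room for a different way of combining them.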
1. Find Attractions in ClueWeb12
GeoFiltered
Documents with an exact mention of the context, in the format “City, ST”
TouristFiltered
A. Tourist sites (TouristListFiltered): any document whose URL belongs to one of 7 manually selected tourist websites
{yelp, tripadvisor, wikitravel, zagat, expedia, orbitz, and travel.yahoo}
B. Expand A (TouristOutlinksFiltered): extract the outlinks of the documents in A and find those outlinks in ClueWeb12
C. Attraction oriented (AttractionFiltered):
   1. For a context such as “Miami, FL”, obtain 50 attractions (name and URL)
   2. If an attraction's URL is missing, search for “Attraction name Miami, FL”
   3. Get the hosts of all found URLs
   4. Search for them in ClueWeb12
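The selection criteria for GeoFiltered and for part A of TouristFiltered could be sketched as two simple predicates, plus the host extraction used in part C. The function names and the label-based host matching rule are assumptions; the poster lists site names without full domains, so the example uses a subset of that list and matches a name against consecutive dot-separated labels of the host.

```python
import re
from urllib.parse import urlsplit


def mentions_context(text, city, state):
    """GeoFiltered criterion: exact mention of the context as "City, ST"."""
    pattern = rf"\b{re.escape(city)}, {re.escape(state)}\b"
    return re.search(pattern, text) is not None


def host_of(url):
    """Return the lower-cased host of a URL, or "" if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_tourist_site(url, sites):
    """TouristListFiltered criterion: the URL's host contains one of the site names.

    A name such as "travel.yahoo" matches when it appears as consecutive
    dot-separated labels of the host (assumed matching rule).
    """
    host = f".{host_of(url)}."
    return any(f".{site}." in host for site in sites)


# Subset of the poster's site list, for illustration only.
SITES = ("yelp", "tripadvisor", "travel.yahoo")
```

The same `host_of` helper would cover the "get hosts of all found URLs" step of part C, after which documents can be selected by host.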
Two runs: one based on TouristFiltered and another based on GeoFiltered
Steps 2 to 4 are the same for the two runs
Step 1 is new; steps 2 to 4 are the same as in 2013
Note: “all” means that geo, desc, and doc relevance are all considered
