SlideShare una empresa de Scribd logo
1 de 64
Descargar para leer sin conexión
Evaluation Initiatives for
Entity-oriented Search
KEYSTONE Spring WG Meeting 2015 | Kosice, Slovakia, 2015
Krisztian Balog 

University of Stavanger
Evaluation Initiatives for
Entity-oriented Search
Krisztian Balog 

University of Stavanger
6%
36%
1%5% 12%
41%
Entity (“1978 cj5 jeep”)
Type (“doctors in barcelona”)
Attribute (“zip code waterville Maine”)
Relation (“tom cruise katie holmes”)
Other (“nightlife in Barcelona”)
Uninterpretable
Distribution of web search
queries (Pound et al., 2010)
Pound, Mika, and Zaragoza (2010). Ad-hoc object retrieval in the web of data. In WWW ’10.
28%
15%
10% 4%
14%
29% Entity
Entity+refiner
Category
Category+refiner
Other
Website
Distribution of web search
queries (Lin et al., 2011)
Lin, Pantel, Gamon, Kannan, and Fuxman (2012). Active objects. In WWW ’12.
Evaluation Initiatives for
Entity-oriented Search
Krisztian Balog 

University of Stavanger
Evaluation Initiatives for
Entity-oriented Search
- Shared tasks at world-wide evaluation
campaigns (TREC, CLEF)

- Provide Ingredients for evaluating a given task

- Data collection
- Gold-standard annotations
- Evaluation metrics
In this talk
- Keyword-based search in knowledge bases

- Query understanding with the help of
knowledge bases

- Test collections, metrics, evaluation campaigns
1

Entity Retrieval
Ad-hoc entity retrieval
- Input: keyword query

- “telegraphic” queries (neither well-formed nor
grammatically correct sentences or questions)
- Output: ranked list of entities

- Collection: semi-structured documents
meg ryan war
american embassy nairobi
ben franklin
Chernobylworst actor century
Sweden Iceland currency
Evaluation initiatives
- INEX Entity Ranking track (2007-09)

- Wikipedia
- TREC Entity track (2009-11)

- Web crawl, including Wikipedia
- Semantic Search Challenge (2010-11)

- BTC2009 (Semantic Web crawl, ~1 billion RDF triples)
- INEX Linked Data track (2012-13)

- Wikipedia enriched with RDF properties from DBpedia
and YAGO
Common denominator: DBpedia
bit.ly/dbpedia-entity
Balog and Neumayer (2013). A Test Collection for Entity Retrieval in DBpedia. In SIGIR ’13.
Query set
#q 

(orig)
#q

(used)
avg
q_len
avg 

#rel
INEX-XER

“US presidents since 1960”
55 55 5,5 29,8
TREC Entity

“Airlines that currently use Boeing 747 planes”
20 17 6,7 13,1
SemSearch ES

“Ben Franklin”
142 130 2,7 8,7
SemSearch LS

“Axis powers of World War II”
50 43 5,4 12,5
QALD-2

“Who is the mayor of Berlin?”
200 140 7,9 41,5
INEX-LD

“England football player highest paid”
100 100 4,8 37,6
Total 485 5,3 27
Semi-structured data
foaf:name Audi A4
rdfs:label Audi A4
rdfs:comment The Audi A4 is a compact executive car
produced since late 1994 by the German car
manufacturer Audi, a subsidiary of the
Volkswagen Group. The A4 has been built [...]
dbpprop:production 1994
2001
2005
2008
rdf:type dbpedia-owl:MeanOfTransportation
dbpedia-owl:Automobile
dbpedia-owl:manufacturer dbpedia:Audi
dbpedia-owl:class dbpedia:Compact_executive_car
owl:sameAs freebase:Audi A4
is dbpedia-owl:predecessor of dbpedia:Audi_A5
is dbpprop:similar of dbpedia:Cadillac_BLS
dbpedia:Audi_A4
Baseline models
- Standard document retrieval methods applied
on entity description documents

- E.g., language modeling (MLM)
P(e|q) / P(e)P(q|✓e) = P(e)
Y
t2q
P(t|✓e)n(t,q)
Entity prior

Probability of the entity 

being relevant to any query
Entity language model

Multinomial probability distribution
over the vocabulary of terms
Fielded models
- Fielded extensions of document retrieval
methods

- E.g., Mixture of Language Models (MLM)
mX
j=1
µj = 1
Field language model

Smoothed with a collection model built

from all document representations of the

same type in the collectionField weights
P(t|✓d) =
mX
j=1
µjP(t|✓dj )
Setting field weights
- Heuristically 

- Proportional to the length of text content in that field,
to the field’s individual performance, etc.
- Empirically (using training queries)

- Problems

- Number of possible fields is huge
- It is not possible to optimise their weights directly
- Entities are sparse w.r.t. different fields
- Most entities have only a handful of predicates
Predicate folding
- Idea: reduce the number of fields by grouping
them together based on, e.g., type
foaf:name Audi A4
rdfs:label Audi A4
rdfs:comment The Audi A4 is a compact executive car
produced since late 1994 by the German car
manufacturer Audi, a subsidiary of the
Volkswagen Group. The A4 has been built [...]
dbpprop:production 1994
2001
2005
2008
rdf:type dbpedia-owl:MeanOfTransportation
dbpedia-owl:Automobile
dbpedia-owl:manufacturer dbpedia:Audi
dbpedia-owl:class dbpedia:Compact_executive_car
owl:sameAs freebase:Audi A4
is dbpedia-owl:predecessor of dbpedia:Audi_A5
is dbpprop:similar of dbpedia:Cadillac_BLS
Name

Attributes

Out-relations

In-relations

Probabilistic Retrieval Model
for Semistructured data (PRMS)
- Extension to the Mixture of Language Models

- Find which document field each query term
may be associated with
Mapping probability

Estimated for each query term
P(t|✓d) =
mX
j=1
µjP(t|✓dj )
P(t|✓d) =
mX
j=1
P(dj|t)P(t|✓dj )
Kim, Xue, and Croft (2009). Probabilistic Retrieval Model for Semistructured data. In ECIR'09.
Estimating the mapping
probability
Term likelihood

Probability of a query term
occurring in a given field type

Prior field probability

Probability of mapping the query term 

to this field before observing collection
statistics
P(dj|t) =
P(t|dj)P(dj)
P(t)
X
dk
P(t|dk)P(dk)
P(t|Cj) =
P
d n(t, dj)
P
d |dj|
Example (IMDB)
cast 0,407
team 0,382
title 0,187
genre 0,927
title 0,07
location 0,002
cast 0,601
team 0,381
title 0,017
dj dj djP(t|dj) P(t|dj) P(t|dj)
meg ryan war
Baseline results
MAP
0
0,1
0,2
0,3
0,4
INEX-XER TREC Entity SemSearch ESSemSearch LS QALD-2 INEX-LD Total
LM LM title+content LM all fields PRMS
Related entity finding
- TREC Entity track (2009-2011)

- Input: natural language query

- Output: ranked list of entities

- Collection: combination of unstructured (web
crawl) and structured (DBpedia)

- Queries contain an entity (E), target type (T),
and a required relation (R)

- Entity and target type are annotated in the query
Examples
airlines that currently use Boeing 747 planes
target type input entity
Members of The Beaux Arts Trio
target type input entity
What countries does Eurail operate in?
target type input entity
Modeling related entity finding
- Exploiting the available annotations in a three-
component model
p(e|E, T, R) / p(e|E) · p(T|e) · p(R|E, e)
Context model
Type filtering
Co-occurrence
model
xxxx x xxx xx xxxxxx xx x xxx xx x xxxx
xx xxx x xxxxxx xxxxxx xx x xxx xx x xxxx
xx xxx x xxxxx xx x xxx xx xxxx xx xxx xx
x xxxxx xxx
xxxx x xxx xx xxxxxx xx x xxx xx x xxxx
xx xxx x xxxxxx xxxxxx xx x xxx xx x xxxx
xx xxx x xxxxx xx x xxx xx xxxx xx xxx xx
x xxxxx xxx
xxxx x xxx xx xxxxxx xx x xxx xx x xxxx
xx xxx x xxxxxx xxxxxx xx x xxx xx x xxxx
xx xxx x xxxxx xx x xxx xx xxxx xx xxx xx
x xxxxx xxx
Bron, Balog, and de Rijke (2010). Ranking Related Entities: Components and Analyses. In CIKM ’10.
Examples
airlines that currently use Boeing 747 planes
target type input entity
Members of The Beaux Arts Trio
target type input entity
What countries does Eurail operate in?
target type input entity
Can	
  we	
  obtain	
  such	
  

annotations	
  

automatically?
2

Understanding Queries
Wikification
- Recognizing concepts in text and linking them
to the corresponding entries in Wikipedia
Image taken from Milne and Witten (2008b). Learning to Link with Wikipedia. In CIKM '08.
This evaluation can also be considered as a large-scale test of our
Wikipedia link-based measure. Just the testing phase of the
experiment involved more than two million comparisons in order
to weight context articles and compare them to candidate senses.
When these operations were separated out from the rest of the
disambiguation process they where performed in three minutes (a
rate of about 11,000 every second) on the desktop machine.
4. LEARNING TO DETECT LINKS
This section describes a new approach to link detection. The
central difference between this and Mihalcea and Csomai’s
system is that Wikipedia articles are used to learn what terms
should and should not be linked, and the context surrounding the
terms is taken into account when doing so. Wikify’s detection
approach, in contrast, relies exclusively on link probability. If a
term is used as a link for a sufficient proportion of the Wikipedia
articles in which it is found, they consider it to be a link whenever
it is encountered in other documents—regardless of context. This
approach will always make mistakes, no matter what threshold is
chosen. No matter how small a terms link probability is, if it
exceeds zero then, by definition, there is some context in which
as is the case with Democrats and Democratic Party, several
terms link to the same concept if that concept is mentioned more
than once. Sometimes, if the disambiguation classifier found more
than one likely sense, terms may point to multiple concepts.
Democrats, for example, could refer to the party or to any
proponent of democracy.
These automatically identified Wikipedia articles provide training
instances for a classifier. Positive examples are the articles that
were manually linked to, while negative ones are those that were
not. Features of these articles—and the places where they were
mentioned—are used to inform the classifier about which topics
should and should not be linked. The features are as follows.
Link Probability. Mihalcea and Csomai’s link probability is a
proven feature. On its own it is able to recognize the majority of
links. Because each of our training instances involves several
candidate link locations (e.g. Hillary Clinton and Clinton in
Figure 4), there are multiple link probabilities. These are
combined into two separate features: the average and the
maximum. The former is expected to be more consistent, but the
latter may be more indicative of links. For example, Democratic
Figure 4: Associating document phrases with appropriate Wikipedia articles
Hilary Rodham
Clinton
Barack
Obama
Democratic Party
(United States)
Florida
(US State)
Nomination Voting
Delegate President of the
United States
Michigan
(US State)
Democrat
Entity linking
- Typically only named entities are annotated

- Reference KB can be different from Wikipedia

- Usages

- Improved retrieval, enabling semantic search
- Advanced UX/UI, help users to explore
- Knowledge base population/acceleration
Entity linking methods
- Mention detection

- Identifying entity mentions in text
- Candidate entity ranking

- Generating a set of candidate entries from the KB
for each mention
- Disambiguation

- Selecting the best entity for a mention
- Machine learning; features: commonness,
relatedness, context, …
Entity linking evaluation
ground truth system annotation AˆA
Košice is the biggest city in
eastern Slovakia and in 2013 was
the European Capital of Culture
together with Marseille, France. It
is situated on the river Hornád at
the eastern reaches of the Slovak
Ore Mountains, near the border
with Hungary.
Košice is the biggest city in
eastern Slovakia and in 2013 was
the European Capital of Culture
together with Marseille, France. It
is situated on the river Hornád at
the eastern reaches of the Slovak
Ore Mountains, near the border
with Hungary.
Entity linking evaluation
ground truth system annotation AˆA
Košice is the biggest city in
eastern Slovakia and in 2013 was
the European Capital of Culture
together with Marseille, France. It
is situated on the river Hornád at
the eastern reaches of the Slovak
Ore Mountains, near the border
with Hungary.
Košice is the biggest city in
eastern Slovakia and in 2013 was
the European Capital of Culture
together with Marseille, France. It
is situated on the river Hornád at
the eastern reaches of the Slovak
Ore Mountains, near the border
with Hungary.
Entity linking evaluation
Košice is the biggest city in
eastern Slovakia and in 2013 was
the European Capital of Culture
together with Marseille, France. It
is situated on the river Hornád at
the eastern reaches of the Slovak
Ore Mountains, near the border
with Hungary.
ground truth system annotation
Košice is the biggest city in
eastern Slovakia and in 2013 was
the European Capital of Culture
together with Marseille, France. It
is situated on the river Hornád at
the eastern reaches of the Slovak
Ore Mountains, near the border
with Hungary.
AˆA
P =
|A
T ˆA|
|A|
R =
|A
T ˆA|
| ˆA|
F =
2 · P · R
P + R
Entity linking for queries
- Challenges

- search queries are short
- limited context
- lack of proper grammar, spelling
- multiple interpretations
- needs to be fast
Example
the governator
movie
person
Example
the governator movie
Example
new york pizza manhattan
Example
new york pizza manhattan
ERD’14 challenge
- Task: finding query interpretations

- Input: keyword query

- Output: sets of sets of entities

- Reference KB: Freebase

- Annotations are to be performed by a web
service within a given time limit
Evaluation
P =
|I
T ˆI|
|I|
R =
|I
T ˆI|
|ˆI|
F =
2 · P · R
P + R
New York City, Manhattan
ground truth system annotationˆI I
new york pizza manhattan
New York-style pizza, Manhattan
New York City, Manhattan
New York-style pizza
ERD’14 results
Single
interpretation
is returned
Rank Team F1 latency
1
SMAPH
Team
0.7076 0.49
2 NTUNLP 0.6797 1.04
3
Seznam
Research
0.6693 3.91
http://web-ngram.research.microsoft.com/erd2014/LeaderBoard.aspx
3

Living Labs for IR Evaluation
http://living-labs.net
@livinglabsnet
Goals & focus
- Overall goal: make information retrieval
evaluation more realistic

- Evaluate retrieval methods in a live setting with real
users in their natural task environments
- Focus: medium-sized organizations with fair
amount of search volume

- Typically lack their own R&D department, but would
gain much from improved approaches
What is in it for
participants?
- Access to privileged commercial data 

- (Search and click-through data)
- Opportunity to test IR systems with real,
unsuspecting users in a live setting

- (Not the same as crowdsourcing!)
“Give us your ranking, we’ll have it clicked!”
Research aims
- Understanding of online evaluation and of the
generalization of retrieval techniques across
different use cases

- Specific research questions

- Are system rankings different when using historical
clicks from those using online experiments?
- Are system rankings different when using manual
relevance assessments (“expert judgments”) from
those using online experiments?
Key idea
- Focus on frequent (head) queries

- Enough traffic on them (both real-time and historical)
- Ranked result lists can be generated offline
- An API orchestrates all data exchange between 

live sites and experimental systems
Balog, Kelly, and Schuth (2014). Head First: Living Labs for Ad-hoc Search Evaluation. In CIKM '14.
Methodology
- Queries, candidate documents, historical search and
click data made available
- Rankings are generated for each query and uploaded
through an API
- When any of the test queries is fired, the live site
request rankings from the API and interleaves them
with that of the production system
- Participants get detailed feedback on user interactions
(clicks)
- Ultimate measure is the number of “wins” against the
production system
Interleaving
Result A1
Result B1
Result A2
Result B2
Result A3
Result B3
Result A1
Result A2
Result A3
Result B1
Result B2
Result B3
production system experimental system
System A System B
Partners & Use-cases
Product search
Local domain
search
Web search
Provider regiojatek.hu uva.nl seznam.cz
Data
raw queries and
(highly structured)
documents
raw queries and
(generally textual)
documents
pre-computed
document-query
features
Site traffic
relatively low 

(~4K sessions/day)
relatively low high
Info needs
(mostly)
transactional
(mostly)
navigational
vary
API doc: doc.living-labs.net
Guide for CLEF participants
Partners & Use-cases
Product search
Local domain
search
Web search
Provider regiojatek.hu uva.nl seznam.cz
Data
raw queries and
(highly structured)
documents
raw queries and
(generally textual)
documents
pre-computed
document-query
features
Site traffic
relatively low 

(~4K sessions/day)
relatively low high
Info needs
(mostly)
transactional
(mostly)
navigational
vary
Product search
- Ad-hoc retrieval over a product catalog

- Several thousand products

- Limited amount of text, lots of structure

- Categories, characters, brands, etc.
Homepage of the year

2014

in e-commerce category
Relational database
products
product_id
product_name
price
manufacturer_id
description
short_description
brandname
age_min
age_max
…
product_characters
product_id
character_id
characters
character_id
name
categories
category_id
name
parent
priority
description
…
product_categories
product_id
category_id
manufacturers
manufacturer_id
name
…
Logs
1402052038;7902b47fbbd45360a672670367cccf8d;IDX;28178,28188,33533,36450,32394,34188,76395,856
1402052039;4561d3d945d8981e177eac00f6025809;QRY;kisvakond puzzle;;
1;47702,07921,80875,36419,65736,09726,85683,09679,37294,15847,80878,00131,03994,17713
1402052040;fd593d671402dd5ea21e39d00aa30e26;PRD;16402
1402052040;fd593d671402dd5ea21e39d00aa30e26;PRC;16402;1;80875,36419,65736,09726,85683,09679,1
1402052041;fd593d671402dd5ea21e39d00aa30e26;IDX;34188,34561,34543,30649,27747,28188,28178,763
1402052043;44ee8beae78426b0eb5662d25191984c;PRD;71983
1402052043;44ee8beae78426b0eb5662d25191984c;PRC;71983;1;88986,73681,49141,02215,85632,85633,4
1402052043;897deef88cfd77d59cf2eebe6c9cab43;CAT;12;9;07975,07976,76715,99293,88844,64632,6452
1402052044;dc1db4bef11f027a0a8d161b8354a318;IDX;00841,34543,76395,28178,34188,33039,33533,339
1402052045;bb3a12f50e5c1a53765885a90a8d8992;IDX;00841,28188,27747,76395,30649,28178,34188,364
…
1402052039;4561d3d945d8981e177eac00f6025809;QRY;kisvakond puzzle;;
2;00132,13867,64253,13866,64252,69406,00127,36152,69405,14792,92991,00126,15758,20313
1402052068;4561d3d945d8981e177eac00f6025809;QRY;puzzle;gender:0|category_id:0|price:1001-2000
bonus:;1;47702,09679,36152,01918,19991,19989,06633,19995,99888,51805,36150,81522,05655,64
1402052076;4561d3d945d8981e177eac00f6025809;QRY;puzzle;gender:0|category_id:0|price:1001-2000
bonus:;8;32497,28307,08246,00064,32471,27995,34018,32491,61540
timestamp session_id event event attributes
Test queries
candidate products
Test queries
historical clicks
Product data
{
"content": {
"age_max": 10,
"age_min": 6,
"arrived": "2014-08-28",
"available": 1,
"brand": "Mattel",
"category": "Babu00e1k, kellu00e9kek",
"category_id": "25",
"characters": [],
"description": "A Monster Highu00ae iskola szu00f6rnycsemetu00e9i […]",
"gender": 2,
"main_category": "Baba, babakocsi",
"main_category_id": "3",
"photos": [
"http://regiojatek.hu/data/regio_images/normal/20777_0.jpg",
"http://regiojatek.hu/data/regio_images/normal/20777_1.jpg",
[…]
],
"price": 8675.0,
"product_name": "Monster High Scaris Paravu00e1rosi baba tu00f6bbfu00e9le",
"queries": {
"clawdeen": "0.037",
"monster": "0.222",
"monster high": "0.741"
},
"short_description": "A Monster Highu00ae iskola szu00f6rnycsemetu00e9i 

elsu0151 ku00fclfu00f6ldi u00fatjukra indulnak..."
},
"creation_time": "Mon, 11 May 2015 04:52:59 -0000",
"docid": "R-d43",
"site_id": "R",
frequent queries

that led to the product
Results
Method
impr. 

total
impr. 

per query
#clicks CTR
1 1884 18.84 581 0.31
2 1407 14.07 630 0.45
3 1334 13.34 483 0.36
Join us!
- Evaluation will continue to run even after the
CLEF deadline

- Additional use-cases are welcome
living-labs.net
Summary
- Entities as the unit of retrieval

- Understanding queries with the help of entities

- From traditional test-collection based
evaluations to evaluation as a service
Questions?
Contact | @krisztianbalog | krisztianbalog.com

Más contenido relacionado

La actualidad más candente

Entity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationEntity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationFaegheh Hasibi
 
Entity Linking in Queries: Efficiency vs. Effectiveness
Entity Linking in Queries: Efficiency vs. EffectivenessEntity Linking in Queries: Efficiency vs. Effectiveness
Entity Linking in Queries: Efficiency vs. EffectivenessFaegheh Hasibi
 
Gleaning Types for Literals in RDF with Application to Entity Summarization
Gleaning Types for Literals in RDF with Application to Entity SummarizationGleaning Types for Literals in RDF with Application to Entity Summarization
Gleaning Types for Literals in RDF with Application to Entity SummarizationKalpa Gunaratna
 
Linked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender SystemsLinked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender SystemsVito Ostuni
 
Everything you wanted to know about Dublin Core metadata
Everything you wanted to know about Dublin Core metadataEverything you wanted to know about Dublin Core metadata
Everything you wanted to know about Dublin Core metadataEduserv Foundation
 
Representing financial reports on the semantic web a faithful translation f...
Representing financial reports on the semantic web   a faithful translation f...Representing financial reports on the semantic web   a faithful translation f...
Representing financial reports on the semantic web a faithful translation f...Jie Bao
 
Question Answering with Lydia
Question Answering with LydiaQuestion Answering with Lydia
Question Answering with LydiaJae Hong Kil
 
What's next in Julia
What's next in JuliaWhat's next in Julia
What's next in JuliaJiahao Chen
 
Recommender Systems in the Linked Data era
Recommender Systems in the Linked Data eraRecommender Systems in the Linked Data era
Recommender Systems in the Linked Data eraRoku
 
Expressive Query Answering For Semantic Wikis (20min)
Expressive Query Answering For  Semantic Wikis (20min)Expressive Query Answering For  Semantic Wikis (20min)
Expressive Query Answering For Semantic Wikis (20min)Jie Bao
 
Verifying Integrity Constraints of a RDF-based WordNet
Verifying Integrity Constraints of a RDF-based WordNetVerifying Integrity Constraints of a RDF-based WordNet
Verifying Integrity Constraints of a RDF-based WordNetAlexandre Rademaker
 
Introduction to Ontology Engineering with Fluent Editor 2014
Introduction to Ontology Engineering with Fluent Editor 2014Introduction to Ontology Engineering with Fluent Editor 2014
Introduction to Ontology Engineering with Fluent Editor 2014Cognitum
 
Introduction to RDF
Introduction to RDFIntroduction to RDF
Introduction to RDFNarni Rajesh
 
Information extraction for Free Text
Information extraction for Free TextInformation extraction for Free Text
Information extraction for Free Textbutest
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationVladimir Alexiev, PhD, PMP
 
Information Extraction from the Web - Algorithms and Tools
Information Extraction from the Web - Algorithms and ToolsInformation Extraction from the Web - Algorithms and Tools
Information Extraction from the Web - Algorithms and ToolsBenjamin Habegger
 
Type Vector Representations from Text. DL4KGS@ESWC 2018
Type Vector Representations from Text. DL4KGS@ESWC 2018Type Vector Representations from Text. DL4KGS@ESWC 2018
Type Vector Representations from Text. DL4KGS@ESWC 2018Federico Bianchi
 
Dublin Core Basic Syntax Tutorial
Dublin Core Basic Syntax TutorialDublin Core Basic Syntax Tutorial
Dublin Core Basic Syntax TutorialEduserv Foundation
 

La actualidad más candente (20)

Entity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationEntity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and Evaluation
 
Entity Linking in Queries: Efficiency vs. Effectiveness
Entity Linking in Queries: Efficiency vs. EffectivenessEntity Linking in Queries: Efficiency vs. Effectiveness
Entity Linking in Queries: Efficiency vs. Effectiveness
 
Gleaning Types for Literals in RDF with Application to Entity Summarization
Gleaning Types for Literals in RDF with Application to Entity SummarizationGleaning Types for Literals in RDF with Application to Entity Summarization
Gleaning Types for Literals in RDF with Application to Entity Summarization
 
Linked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender SystemsLinked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender Systems
 
Everything you wanted to know about Dublin Core metadata
Everything you wanted to know about Dublin Core metadataEverything you wanted to know about Dublin Core metadata
Everything you wanted to know about Dublin Core metadata
 
Representing financial reports on the semantic web a faithful translation f...
Representing financial reports on the semantic web   a faithful translation f...Representing financial reports on the semantic web   a faithful translation f...
Representing financial reports on the semantic web a faithful translation f...
 
Question Answering with Lydia
Question Answering with LydiaQuestion Answering with Lydia
Question Answering with Lydia
 
What's next in Julia
What's next in JuliaWhat's next in Julia
What's next in Julia
 
Recommender Systems in the Linked Data era
Recommender Systems in the Linked Data eraRecommender Systems in the Linked Data era
Recommender Systems in the Linked Data era
 
Expressive Query Answering For Semantic Wikis (20min)
Expressive Query Answering For  Semantic Wikis (20min)Expressive Query Answering For  Semantic Wikis (20min)
Expressive Query Answering For Semantic Wikis (20min)
 
Verifying Integrity Constraints of a RDF-based WordNet
Verifying Integrity Constraints of a RDF-based WordNetVerifying Integrity Constraints of a RDF-based WordNet
Verifying Integrity Constraints of a RDF-based WordNet
 
Introduction to Ontology Engineering with Fluent Editor 2014
Introduction to Ontology Engineering with Fluent Editor 2014Introduction to Ontology Engineering with Fluent Editor 2014
Introduction to Ontology Engineering with Fluent Editor 2014
 
Introduction to RDF
Introduction to RDFIntroduction to RDF
Introduction to RDF
 
Information extraction for Free Text
Information extraction for Free TextInformation extraction for Free Text
Information extraction for Free Text
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
 
Information Extraction from the Web - Algorithms and Tools
Information Extraction from the Web - Algorithms and ToolsInformation Extraction from the Web - Algorithms and Tools
Information Extraction from the Web - Algorithms and Tools
 
Type Vector Representations from Text. DL4KGS@ESWC 2018
Type Vector Representations from Text. DL4KGS@ESWC 2018Type Vector Representations from Text. DL4KGS@ESWC 2018
Type Vector Representations from Text. DL4KGS@ESWC 2018
 
Topical_Facets
Topical_FacetsTopical_Facets
Topical_Facets
 
Dublin Core Basic Syntax Tutorial
Dublin Core Basic Syntax TutorialDublin Core Basic Syntax Tutorial
Dublin Core Basic Syntax Tutorial
 
RDF and OWL
RDF and OWLRDF and OWL
RDF and OWL
 

Similar a Evaluation Initiatives for Entity-oriented Search

Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)Sean Golliher
 
Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19ngamou
 
Knowledge Graph Maintenance
Knowledge Graph MaintenanceKnowledge Graph Maintenance
Knowledge Graph MaintenancePaul Groth
 
Towards Virtual Knowledge Graphs over Web APIs
Towards Virtual Knowledge Graphs over Web APIsTowards Virtual Knowledge Graphs over Web APIs
Towards Virtual Knowledge Graphs over Web APIsSpeck&Tech
 
Contextual Ontology Alignment - ESWC 2011
Contextual Ontology Alignment - ESWC 2011Contextual Ontology Alignment - ESWC 2011
Contextual Ontology Alignment - ESWC 2011Mariana Damova, Ph.D
 
Knowledge Graph Futures
Knowledge Graph FuturesKnowledge Graph Futures
Knowledge Graph FuturesPaul Groth
 
Frontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text AnalysisFrontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text AnalysisJonathan Stray
 
Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of...
Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of...Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of...
Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of...FedorNikolaev
 
Implementing the Open Government Directive using the technologies of the Soci...
Implementing the Open Government Directive using the technologies of the Soci...Implementing the Open Government Directive using the technologies of the Soci...
Implementing the Open Government Directive using the technologies of the Soci...George Thomas
 
Metadata Quality Assurance Part II. The implementation begins
Metadata Quality Assurance Part II. The implementation beginsMetadata Quality Assurance Part II. The implementation begins
Metadata Quality Assurance Part II. The implementation beginsPéter Király
 
Stretching the Life of Twitter Classifiers with Time-Stamped Semantic Graphs
Stretching the Life of Twitter Classifiers with Time-Stamped Semantic GraphsStretching the Life of Twitter Classifiers with Time-Stamped Semantic Graphs
Stretching the Life of Twitter Classifiers with Time-Stamped Semantic GraphsAmparo Elizabeth Cano Basave
 
Software Design Patterns Lecutre (Intro)
Software Design Patterns Lecutre (Intro)Software Design Patterns Lecutre (Intro)
Software Design Patterns Lecutre (Intro)VikramJothyPrakash1
 
bridging formal semantics and social semantics on the web
bridging formal semantics and social semantics on the webbridging formal semantics and social semantics on the web
bridging formal semantics and social semantics on the webFabien Gandon
 
KDIR2015-Entity Linking and Knowledge Discovery in Microblogs-Presentation
KDIR2015-Entity Linking and Knowledge Discovery in Microblogs-PresentationKDIR2015-Entity Linking and Knowledge Discovery in Microblogs-Presentation
KDIR2015-Entity Linking and Knowledge Discovery in Microblogs-PresentationPikakshi Manchanda
 
Eprints Special Session - DC-2006, Mexico
Eprints Special Session - DC-2006, MexicoEprints Special Session - DC-2006, Mexico
Eprints Special Session - DC-2006, MexicoEduserv Foundation
 
Aggregation for searching complex information spaces
Aggregation for searching complex information spacesAggregation for searching complex information spaces
Aggregation for searching complex information spacesMounia Lalmas-Roelleke
 
121004 linking open_data_with_drupal_v1
121004 linking open_data_with_drupal_v1121004 linking open_data_with_drupal_v1
121004 linking open_data_with_drupal_v1manujam
 
The Eprints Application Profile: a FRBR approach to modelling repository meta...
The Eprints Application Profile: a FRBR approach to modelling repository meta...The Eprints Application Profile: a FRBR approach to modelling repository meta...
The Eprints Application Profile: a FRBR approach to modelling repository meta...Julie Allinson
 

Similar a Evaluation Initiatives for Entity-oriented Search (20)

Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)
 
Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19
 
Knowledge Graph Maintenance
Knowledge Graph MaintenanceKnowledge Graph Maintenance
Knowledge Graph Maintenance
 
Towards Virtual Knowledge Graphs over Web APIs
Towards Virtual Knowledge Graphs over Web APIsTowards Virtual Knowledge Graphs over Web APIs
Towards Virtual Knowledge Graphs over Web APIs
 
Contextual Ontology Alignment - ESWC 2011
Contextual Ontology Alignment - ESWC 2011Contextual Ontology Alignment - ESWC 2011
Contextual Ontology Alignment - ESWC 2011
 
Knowledge Graph Futures
Knowledge Graph FuturesKnowledge Graph Futures
Knowledge Graph Futures
 
Frontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text AnalysisFrontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text Analysis
 
Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of...
Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of...Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of...
Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of...
 
Implementing the Open Government Directive using the technologies of the Soci...
Implementing the Open Government Directive using the technologies of the Soci...Implementing the Open Government Directive using the technologies of the Soci...
Implementing the Open Government Directive using the technologies of the Soci...
 
Eprints Application Profile
Eprints Application ProfileEprints Application Profile
Eprints Application Profile
 
Metadata Quality Assurance Part II. The implementation begins
Metadata Quality Assurance Part II. The implementation beginsMetadata Quality Assurance Part II. The implementation begins
Metadata Quality Assurance Part II. The implementation begins
 
Stretching the Life of Twitter Classifiers with Time-Stamped Semantic Graphs
Stretching the Life of Twitter Classifiers with Time-Stamped Semantic GraphsStretching the Life of Twitter Classifiers with Time-Stamped Semantic Graphs
Stretching the Life of Twitter Classifiers with Time-Stamped Semantic Graphs
 
Software Design Patterns Lecutre (Intro)
Software Design Patterns Lecutre (Intro)Software Design Patterns Lecutre (Intro)
Software Design Patterns Lecutre (Intro)
 
bridging formal semantics and social semantics on the web
bridging formal semantics and social semantics on the webbridging formal semantics and social semantics on the web
bridging formal semantics and social semantics on the web
 
KDIR2015-Entity Linking and Knowledge Discovery in Microblogs-Presentation
KDIR2015-Entity Linking and Knowledge Discovery in Microblogs-PresentationKDIR2015-Entity Linking and Knowledge Discovery in Microblogs-Presentation
KDIR2015-Entity Linking and Knowledge Discovery in Microblogs-Presentation
 
Eprints Special Session - DC-2006, Mexico
Eprints Special Session - DC-2006, MexicoEprints Special Session - DC-2006, Mexico
Eprints Special Session - DC-2006, Mexico
 
Aggregation for searching complex information spaces
Aggregation for searching complex information spacesAggregation for searching complex information spaces
Aggregation for searching complex information spaces
 
121004 linking open_data_with_drupal_v1
121004 linking open_data_with_drupal_v1121004 linking open_data_with_drupal_v1
121004 linking open_data_with_drupal_v1
 
A Survey of Entity Ranking over RDF Graphs
A Survey of Entity Ranking over RDF GraphsA Survey of Entity Ranking over RDF Graphs
A Survey of Entity Ranking over RDF Graphs
 
The Eprints Application Profile: a FRBR approach to modelling repository meta...
The Eprints Application Profile: a FRBR approach to modelling repository meta...The Eprints Application Profile: a FRBR approach to modelling repository meta...
The Eprints Application Profile: a FRBR approach to modelling repository meta...
 

Más de krisztianbalog

Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...krisztianbalog
 
Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...krisztianbalog
 
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?krisztianbalog
 
Personal Knowledge Graphs
Personal Knowledge GraphsPersonal Knowledge Graphs
Personal Knowledge Graphskrisztianbalog
 
Entities for Augmented Intelligence
Entities for Augmented IntelligenceEntities for Augmented Intelligence
Entities for Augmented Intelligencekrisztianbalog
 
On Entities and Evaluation
On Entities and EvaluationOn Entities and Evaluation
On Entities and Evaluationkrisztianbalog
 
Overview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search EditionOverview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search Editionkrisztianbalog
 
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF Lab
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF LabOverview of the Living Labs for IR Evaluation (LL4IR) CLEF Lab
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF Labkrisztianbalog
 
Time-aware Evaluation of Cumulative Citation Recommendation Systems
Time-aware Evaluation of Cumulative Citation Recommendation SystemsTime-aware Evaluation of Cumulative Citation Recommendation Systems
Time-aware Evaluation of Cumulative Citation Recommendation Systemskrisztianbalog
 
Multi-step Classification Approaches to Cumulative Citation Recommendation
Multi-step Classification Approaches to Cumulative Citation RecommendationMulti-step Classification Approaches to Cumulative Citation Recommendation
Multi-step Classification Approaches to Cumulative Citation Recommendationkrisztianbalog
 
Semistructured Data Seach
Semistructured Data SeachSemistructured Data Seach
Semistructured Data Seachkrisztianbalog
 
Collection Ranking and Selection for Federated Entity Search
Collection Ranking and Selection for Federated Entity SearchCollection Ranking and Selection for Federated Entity Search
Collection Ranking and Selection for Federated Entity Searchkrisztianbalog
 

Más de krisztianbalog (12)

Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
 
Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...
 
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
 
Personal Knowledge Graphs
Personal Knowledge GraphsPersonal Knowledge Graphs
Personal Knowledge Graphs
 
Entities for Augmented Intelligence
Entities for Augmented IntelligenceEntities for Augmented Intelligence
Entities for Augmented Intelligence
 
On Entities and Evaluation
On Entities and EvaluationOn Entities and Evaluation
On Entities and Evaluation
 
Overview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search EditionOverview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search Edition
 
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF Lab
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF LabOverview of the Living Labs for IR Evaluation (LL4IR) CLEF Lab
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF Lab
 
Time-aware Evaluation of Cumulative Citation Recommendation Systems
Time-aware Evaluation of Cumulative Citation Recommendation SystemsTime-aware Evaluation of Cumulative Citation Recommendation Systems
Time-aware Evaluation of Cumulative Citation Recommendation Systems
 
Multi-step Classification Approaches to Cumulative Citation Recommendation
Multi-step Classification Approaches to Cumulative Citation RecommendationMulti-step Classification Approaches to Cumulative Citation Recommendation
Multi-step Classification Approaches to Cumulative Citation Recommendation
 
Semistructured Data Seach
Semistructured Data SeachSemistructured Data Seach
Semistructured Data Seach
 
Collection Ranking and Selection for Federated Entity Search
Collection Ranking and Selection for Federated Entity SearchCollection Ranking and Selection for Federated Entity Search
Collection Ranking and Selection for Federated Entity Search
 

Último

Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
Multi Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleMulti Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleCeline George
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvRicaMaeCastro1
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17Celine George
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Developmentchesterberbo7
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Projectjordimapav
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmStan Meyer
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWQuiz Club NITW
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...Nguyen Thanh Tu Collection
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...DhatriParmar
 
week 1 cookery 8 fourth - quarter .pptx
week 1 cookery 8  fourth  -  quarter .pptxweek 1 cookery 8  fourth  -  quarter .pptx
week 1 cookery 8 fourth - quarter .pptxJonalynLegaspi2
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptxDhatriParmar
 
Mental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsMental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsPooky Knightsmith
 

Último (20)

Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
Multi Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleMulti Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP Module
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Development
 
Paradigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTAParadigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTA
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and Film
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITW
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
 
week 1 cookery 8 fourth - quarter .pptx
week 1 cookery 8  fourth  -  quarter .pptxweek 1 cookery 8  fourth  -  quarter .pptx
week 1 cookery 8 fourth - quarter .pptx
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
 
Mental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsMental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young minds
 

Evaluation Initiatives for Entity-oriented Search

  • 1. Evaluation Initiatives for Entity-oriented Search KEYSTONE Spring WG Meeting 2015 | Kosice, Slovakia, 2015 Krisztian Balog University of Stavanger
  • 2. Evaluation Initiatives for Entity-oriented Search Krisztian Balog University of Stavanger
  • 3.
  • 4. 6% 36% 1%5% 12% 41% Entity (“1978 cj5 jeep”) Type (“doctors in barcelona”) Attribute (“zip code waterville Maine”) Relation (“tom cruise katie holmes”) Other (“nightlife in Barcelona”) Uninterpretable Distribution of web search queries (Pound et al., 2010) Pound, Mika, and Zaragoza (2010). Ad-hoc object retrieval in the web of data. In WWW ’10.
  • 5. 28% 15% 10% 4% 14% 29% Entity Entity+refiner Category Category+refiner Other Website Distribution of web search queries (Lin et al., 2011) Lin, Pantel, Gamon, Kannan, and Fuxman (2012). Active objects. In WWW ’12.
  • 6. Evaluation Initiatives for Entity-oriented Search Krisztian Balog University of Stavanger
  • 7. Evaluation Initiatives for Entity-oriented Search - Shared tasks at world-wide evaluation campaigns (TREC, CLEF) - Provide Ingredients for evaluating a given task - Data collection - Gold-standard annotations - Evaluation metrics
  • 8. In this talk - Keyword-based search in knowledge bases - Query understanding with the help of knowledge bases - Test collections, metrics, evaluation campaigns
  • 10. Ad-hoc entity retrieval - Input: keyword query - “telegraphic” queries (neither well-formed nor grammatically correct sentences or questions) - Output: ranked list of entities - Collection: semi-structured documents meg ryan war american embassy nairobi ben franklin Chernobylworst actor century Sweden Iceland currency
  • 11. Evaluation initiatives - INEX Entity Ranking track (2007-09) - Wikipedia - TREC Entity track (2009-11) - Web crawl, including Wikipedia - Semantic Search Challenge (2010-11) - BTC2009 (Semantic Web crawl, ~1 billion RDF triples) - INEX Linked Data track (2012-13) - Wikipedia enriched with RDF properties from DBpedia and YAGO
  • 12. Common denominator: DBpedia bit.ly/dbpedia-entity Balog and Neumayer (2013). A Test Collection for Entity Retrieval in DBpedia. In SIGIR ’13. Query set #q 
 (orig) #q
 (used) avg q_len avg 
 #rel INEX-XER
 “US presidents since 1960” 55 55 5,5 29,8 TREC Entity
 “Airlines that currently use Boeing 747 planes” 20 17 6,7 13,1 SemSearch ES
 “Ben Franklin” 142 130 2,7 8,7 SemSearch LS
 “Axis powers of World War II” 50 43 5,4 12,5 QALD-2
 “Who is the mayor of Berlin?” 200 140 7,9 41,5 INEX-LD
 “England football player highest paid” 100 100 4,8 37,6 Total 485 5,3 27
  • 13.
  • 14. Semi-structured data foaf:name Audi A4 rdfs:label Audi A4 rdfs:comment The Audi A4 is a compact executive car produced since late 1994 by the German car manufacturer Audi, a subsidiary of the Volkswagen Group. The A4 has been built [...] dbpprop:production 1994 2001 2005 2008 rdf:type dbpedia-owl:MeanOfTransportation dbpedia-owl:Automobile dbpedia-owl:manufacturer dbpedia:Audi dbpedia-owl:class dbpedia:Compact_executive_car owl:sameAs freebase:Audi A4 is dbpedia-owl:predecessor of dbpedia:Audi_A5 is dbpprop:similar of dbpedia:Cadillac_BLS dbpedia:Audi_A4
  • 15. Baseline models - Standard document retrieval methods applied on entity description documents - E.g., language modeling (MLM) P(e|q) / P(e)P(q|✓e) = P(e) Y t2q P(t|✓e)n(t,q) Entity prior
 Probability of the entity 
 being relevant to any query Entity language model
 Multinomial probability distribution over the vocabulary of terms
  • 16. Fielded models - Fielded extensions of document retrieval methods - E.g., Mixture of Language Models (MLM) mX j=1 µj = 1 Field language model
 Smoothed with a collection model built
 from all document representations of the
 same type in the collectionField weights P(t|✓d) = mX j=1 µjP(t|✓dj )
  • 17. Setting field weights - Heuristically - Proportional to the length of text content in that field, to the field’s individual performance, etc. - Empirically (using training queries) - Problems - Number of possible fields is huge - It is not possible to optimise their weights directly - Entities are sparse w.r.t. different fields - Most entities have only a handful of predicates
  • 18. Predicate folding - Idea: reduce the number of fields by grouping them together based on, e.g., type foaf:name Audi A4 rdfs:label Audi A4 rdfs:comment The Audi A4 is a compact executive car produced since late 1994 by the German car manufacturer Audi, a subsidiary of the Volkswagen Group. The A4 has been built [...] dbpprop:production 1994 2001 2005 2008 rdf:type dbpedia-owl:MeanOfTransportation dbpedia-owl:Automobile dbpedia-owl:manufacturer dbpedia:Audi dbpedia-owl:class dbpedia:Compact_executive_car owl:sameAs freebase:Audi A4 is dbpedia-owl:predecessor of dbpedia:Audi_A5 is dbpprop:similar of dbpedia:Cadillac_BLS Name
 Attributes
 Out-relations
 In-relations

  • 19. Probabilistic Retrieval Model for Semistructured data (PRMS) - Extension to the Mixture of Language Models - Find which document field each query term may be associated with Mapping probability
 Estimated for each query term P(t|✓d) = mX j=1 µjP(t|✓dj ) P(t|✓d) = mX j=1 P(dj|t)P(t|✓dj ) Kim, Xue, and Croft (2009). Probabilistic Retrieval Model for Semistructured data. In ECIR'09.
  • 20. Estimating the mapping probability Term likelihood Probability of a query term occurring in a given field type
 Prior field probability
 Probability of mapping the query term 
 to this field before observing collection statistics P(dj|t) = P(t|dj)P(dj) P(t) X dk P(t|dk)P(dk) P(t|Cj) = P d n(t, dj) P d |dj|
  • 21. Example (IMDB) cast 0,407 team 0,382 title 0,187 genre 0,927 title 0,07 location 0,002 cast 0,601 team 0,381 title 0,017 dj dj djP(t|dj) P(t|dj) P(t|dj) meg ryan war
  • 22. Baseline results MAP 0 0,1 0,2 0,3 0,4 INEX-XER TREC Entity SemSearch ESSemSearch LS QALD-2 INEX-LD Total LM LM title+content LM all fields PRMS
  • 23. Related entity finding - TREC Entity track (2009-2011) - Input: natural language query - Output: ranked list of entities - Collection: combination of unstructured (web crawl) and structured (DBpedia) - Queries contain an entity (E), target type (T), and a required relation (R) - Entity and target type are annotated in the query
  • 24. Examples airlines that currently use Boeing 747 planes target type input entity Members of The Beaux Arts Trio target type input entity What countries does Eurail operate in? target type input entity
  • 25. Modeling related entity finding - Exploiting the available annotations in a three- component model p(e|E, T, R) / p(e|E) · p(T|e) · p(R|E, e) Context model Type filtering Co-occurrence model xxxx x xxx xx xxxxxx xx x xxx xx x xxxx xx xxx x xxxxxx xxxxxx xx x xxx xx x xxxx xx xxx x xxxxx xx x xxx xx xxxx xx xxx xx x xxxxx xxx xxxx x xxx xx xxxxxx xx x xxx xx x xxxx xx xxx x xxxxxx xxxxxx xx x xxx xx x xxxx xx xxx x xxxxx xx x xxx xx xxxx xx xxx xx x xxxxx xxx xxxx x xxx xx xxxxxx xx x xxx xx x xxxx xx xxx x xxxxxx xxxxxx xx x xxx xx x xxxx xx xxx x xxxxx xx x xxx xx xxxx xx xxx xx x xxxxx xxx Bron, Balog, and de Rijke (2010). Ranking Related Entities: Components and Analyses. In CIKM ’10.
  • 26. Examples airlines that currently use Boeing 747 planes target type input entity Members of The Beaux Arts Trio target type input entity What countries does Eurail operate in? target type input entity Can  we  obtain  such  
 annotations  
 automatically?
  • 28. Wikification - Recognizing concepts in text and linking them to the corresponding entries in Wikipedia Image taken from Milne and Witten (2008b). Learning to Link with Wikipedia. In CIKM '08. This evaluation can also be considered as a large-scale test of our Wikipedia link-based measure. Just the testing phase of the experiment involved more than two million comparisons in order to weight context articles and compare them to candidate senses. When these operations were separated out from the rest of the disambiguation process they where performed in three minutes (a rate of about 11,000 every second) on the desktop machine. 4. LEARNING TO DETECT LINKS This section describes a new approach to link detection. The central difference between this and Mihalcea and Csomai’s system is that Wikipedia articles are used to learn what terms should and should not be linked, and the context surrounding the terms is taken into account when doing so. Wikify’s detection approach, in contrast, relies exclusively on link probability. If a term is used as a link for a sufficient proportion of the Wikipedia articles in which it is found, they consider it to be a link whenever it is encountered in other documents—regardless of context. This approach will always make mistakes, no matter what threshold is chosen. No matter how small a terms link probability is, if it exceeds zero then, by definition, there is some context in which as is the case with Democrats and Democratic Party, several terms link to the same concept if that concept is mentioned more than once. Sometimes, if the disambiguation classifier found more than one likely sense, terms may point to multiple concepts. Democrats, for example, could refer to the party or to any proponent of democracy. These automatically identified Wikipedia articles provide training instances for a classifier. Positive examples are the articles that were manually linked to, while negative ones are those that were not. Features of these articles—and the places where they were mentioned—are used to inform the classifier about which topics should and should not be linked. The features are as follows. Link Probability. Mihalcea and Csomai’s link probability is a proven feature. On its own it is able to recognize the majority of links. Because each of our training instances involves several candidate link locations (e.g. Hillary Clinton and Clinton in Figure 4), there are multiple link probabilities. These are combined into two separate features: the average and the maximum. The former is expected to be more consistent, but the latter may be more indicative of links. For example, Democratic Figure 4: Associating document phrases with appropriate Wikipedia articles Hilary Rodham Clinton Barack Obama Democratic Party (United States) Florida (US State) Nomination Voting Delegate President of the United States Michigan (US State) Democrat
  • 29. Entity linking - Typically only named entities are annotated - Reference KB can be different from Wikipedia - Usages - Improved retrieval, enabling semantic search - Advanced UX/UI, help users to explore - Knowledge base population/acceleration
  • 30. Entity linking methods - Mention detection - Identifying entity mentions in text - Candidate entity ranking - Generating a set of candidate entries from the KB for each mention - Disambiguation - Selecting the best entity for a mention - Machine learning; features: commonness, relatedness, context, …
  • 31. Entity linking evaluation ground truth system annotation AˆA Košice is the biggest city in eastern Slovakia and in 2013 was the European Capital of Culture together with Marseille, France. It is situated on the river Hornád at the eastern reaches of the Slovak Ore Mountains, near the border with Hungary. Košice is the biggest city in eastern Slovakia and in 2013 was the European Capital of Culture together with Marseille, France. It is situated on the river Hornád at the eastern reaches of the Slovak Ore Mountains, near the border with Hungary.
  • 32. Entity linking evaluation ground truth system annotation AˆA Košice is the biggest city in eastern Slovakia and in 2013 was the European Capital of Culture together with Marseille, France. It is situated on the river Hornád at the eastern reaches of the Slovak Ore Mountains, near the border with Hungary. Košice is the biggest city in eastern Slovakia and in 2013 was the European Capital of Culture together with Marseille, France. It is situated on the river Hornád at the eastern reaches of the Slovak Ore Mountains, near the border with Hungary.
  • 33. Entity linking evaluation Košice is the biggest city in eastern Slovakia and in 2013 was the European Capital of Culture together with Marseille, France. It is situated on the river Hornád at the eastern reaches of the Slovak Ore Mountains, near the border with Hungary. ground truth system annotation Košice is the biggest city in eastern Slovakia and in 2013 was the European Capital of Culture together with Marseille, France. It is situated on the river Hornád at the eastern reaches of the Slovak Ore Mountains, near the border with Hungary. AˆA P = |A T ˆA| |A| R = |A T ˆA| | ˆA| F = 2 · P · R P + R
  • 34. Entity linking for queries - Challenges - search queries are short - limited context - lack of proper grammar, spelling - multiple interpretations - needs to be fast
  • 39. ERD’14 challenge - Task: finding query interpretations - Input: keyword query - Output: sets of sets of entities - Reference KB: Freebase - Annotations are to be performed by a web service within a given time limit
  • 40. Evaluation P = |I T ˆI| |I| R = |I T ˆI| |ˆI| F = 2 · P · R P + R New York City, Manhattan ground truth system annotationˆI I new york pizza manhattan New York-style pizza, Manhattan New York City, Manhattan New York-style pizza
  • 41. ERD’14 results Single interpretation is returned Rank Team F1 latency 1 SMAPH Team 0.7076 0.49 2 NTUNLP 0.6797 1.04 3 Seznam Research 0.6693 3.91 http://web-ngram.research.microsoft.com/erd2014/LeaderBoard.aspx
  • 42. 3
 Living Labs for IR Evaluation http://living-labs.net @livinglabsnet
  • 43. Goals & focus - Overall goal: make information retrieval evaluation more realistic - Evaluate retrieval methods in a live setting with real users in their natural task environments - Focus: medium-sized organizations with fair amount of search volume - Typically lack their own R&D department, but would gain much from improved approaches
  • 44. What is in it for participants? - Access to privileged commercial data - (Search and click-through data) - Opportunity to test IR systems with real, unsuspecting users in a live setting - (Not the same as crowdsourcing!) “Give us your ranking, we’ll have it clicked!”
  • 45. Research aims - Understanding of online evaluation and of the generalization of retrieval techniques across different use cases - Specific research questions - Are system rankings different when using historical clicks from those using online experiments? - Are system rankings different when using manual relevance assessments (“expert judgments”) from those using online experiments?
  • 46. Key idea - Focus on frequent (head) queries - Enough traffic on them (both real-time and historical) - Ranked result lists can be generated offline - An API orchestrates all data exchange between 
 live sites and experimental systems Balog, Kelly, and Schuth (2014). Head First: Living Labs for Ad-hoc Search Evaluation. In CIKM '14.
  • 47. Methodology - Queries, candidate documents, historical search and click data made available - Rankings are generated for each query and uploaded through an API - When any of the test queries is fired, the live site request rankings from the API and interleaves them with that of the production system - Participants get detailed feedback on user interactions (clicks) - Ultimate measure is the number of “wins” against the production system
  • 48. Interleaving Result A1 Result B1 Result A2 Result B2 Result A3 Result B3 Result A1 Result A2 Result A3 Result B1 Result B2 Result B3 production system experimental system System A System B
  • 49. Partners & Use-cases Product search Local domain search Web search Provider regiojatek.hu uva.nl seznam.cz Data raw queries and (highly structured) documents raw queries and (generally textual) documents pre-computed document-query features Site traffic relatively low 
 (~4K sessions/day) relatively low high Info needs (mostly) transactional (mostly) navigational vary
  • 51. Guide for CLEF participants
  • 52. Partners & Use-cases Product search Local domain search Web search Provider regiojatek.hu uva.nl seznam.cz Data raw queries and (highly structured) documents raw queries and (generally textual) documents pre-computed document-query features Site traffic relatively low 
 (~4K sessions/day) relatively low high Info needs (mostly) transactional (mostly) navigational vary
  • 53. Product search - Ad-hoc retrieval over a product catalog - Several thousand products - Limited amount of text, lots of structure - Categories, characters, brands, etc.
  • 54. Homepage of the year 2014 in e-commerce category
  • 56. Logs 1402052038;7902b47fbbd45360a672670367cccf8d;IDX;28178,28188,33533,36450,32394,34188,76395,856 1402052039;4561d3d945d8981e177eac00f6025809;QRY;kisvakond puzzle;; 1;47702,07921,80875,36419,65736,09726,85683,09679,37294,15847,80878,00131,03994,17713 1402052040;fd593d671402dd5ea21e39d00aa30e26;PRD;16402 1402052040;fd593d671402dd5ea21e39d00aa30e26;PRC;16402;1;80875,36419,65736,09726,85683,09679,1 1402052041;fd593d671402dd5ea21e39d00aa30e26;IDX;34188,34561,34543,30649,27747,28188,28178,763 1402052043;44ee8beae78426b0eb5662d25191984c;PRD;71983 1402052043;44ee8beae78426b0eb5662d25191984c;PRC;71983;1;88986,73681,49141,02215,85632,85633,4 1402052043;897deef88cfd77d59cf2eebe6c9cab43;CAT;12;9;07975,07976,76715,99293,88844,64632,6452 1402052044;dc1db4bef11f027a0a8d161b8354a318;IDX;00841,34543,76395,28178,34188,33039,33533,339 1402052045;bb3a12f50e5c1a53765885a90a8d8992;IDX;00841,28188,27747,76395,30649,28178,34188,364 … 1402052039;4561d3d945d8981e177eac00f6025809;QRY;kisvakond puzzle;; 2;00132,13867,64253,13866,64252,69406,00127,36152,69405,14792,92991,00126,15758,20313 1402052068;4561d3d945d8981e177eac00f6025809;QRY;puzzle;gender:0|category_id:0|price:1001-2000 bonus:;1;47702,09679,36152,01918,19991,19989,06633,19995,99888,51805,36150,81522,05655,64 1402052076;4561d3d945d8981e177eac00f6025809;QRY;puzzle;gender:0|category_id:0|price:1001-2000 bonus:;8;32497,28307,08246,00064,32471,27995,34018,32491,61540 timestamp session_id event event attributes
  • 60. { "content": { "age_max": 10, "age_min": 6, "arrived": "2014-08-28", "available": 1, "brand": "Mattel", "category": "Babu00e1k, kellu00e9kek", "category_id": "25", "characters": [], "description": "A Monster Highu00ae iskola szu00f6rnycsemetu00e9i […]", "gender": 2, "main_category": "Baba, babakocsi", "main_category_id": "3", "photos": [ "http://regiojatek.hu/data/regio_images/normal/20777_0.jpg", "http://regiojatek.hu/data/regio_images/normal/20777_1.jpg", […] ], "price": 8675.0, "product_name": "Monster High Scaris Paravu00e1rosi baba tu00f6bbfu00e9le", "queries": { "clawdeen": "0.037", "monster": "0.222", "monster high": "0.741" }, "short_description": "A Monster Highu00ae iskola szu00f6rnycsemetu00e9i 
 elsu0151 ku00fclfu00f6ldi u00fatjukra indulnak..." }, "creation_time": "Mon, 11 May 2015 04:52:59 -0000", "docid": "R-d43", "site_id": "R", frequent queries
 that led to the product
  • 61. Results Method impr. 
 total impr. 
 per query #clicks CTR 1 1884 18.84 581 0.31 2 1407 14.07 630 0.45 3 1334 13.34 483 0.36
  • 62. Join us! - Evaluation will continue to run even after the CLEF deadline - Additional use-cases are welcome living-labs.net
  • 63. Summary - Entities as the unit of retrieval - Understanding queries with the help of entities - From traditional test-collection based evaluations to evaluation as a service
  • 64. Questions? Contact | @krisztianbalog | krisztianbalog.com