This document presents entity linking in queries, as used for tasks such as ad-hoc document retrieval and query understanding. It distinguishes three related tasks: entity linking, semantic mapping, and interpretation finding. Semantic mapping allows mentions to overlap and links a mention to multiple entities, while interpretation finding returns sets of semantically related entity sets in which mentions do not overlap within a set. The document also covers test collections, evaluation metrics, and methods (mention detection, candidate entity ranking, and greedy interpretation finding) for linking entity mentions in queries to a knowledge base.
Entity Linking in Queries: Tasks and Evaluation
1. Entity Linking in Queries: Tasks and Evaluation
ICTIR conference, September 2015
Faegheh Hasibi, Krisztian Balog, and Svein Erik Bratsberg
2.
Entity linking
Definition from Wikipedia (the definition text on the slide is itself annotated with entity links, e.g., to):
• https://en.wikipedia.org/wiki/Natural_language_processing
• https://en.wikipedia.org/wiki/Statistical_classification
3.
Why entity linking in queries?
• ~70% of queries contain entities
• To exploit the semantic representation of queries
Improves:
• Ad-hoc document retrieval
• Entity retrieval
• Query understanding
• Understanding users’ tasks (TREC Tasks track)
J. Pound, P. Mika, and H. Zaragoza. Ad-hoc object retrieval in the web of data. In Proc. of WWW '10.
[Figure: the query “tom cruise movies” and its semantic representation: the entity Tom Cruise connected by a <relation> to movies]
4.
It is different …
Different from conventional entity linking:
• Limited or even no context
• A mention may be linked to more than one entity
Example: query “france world cup 98”
{France, FIFA world cup}
or
{France national football team, FIFA world cup}
5.
In this talk
How should entity linking be performed for queries?
➤ Task:
“Semantic Mapping” or “Interpretation Finding”?
➤ Evaluation metrics
➤ Test collections
➤ Methods
6.
In this talk
How should entity linking be performed for queries?
➤ Task:
“Semantic Mapping” or “Interpretation Finding”?
➤ Evaluation metrics
➤ Test collections
➤ Methods
7.
Entity linking
• Output is set of entities
• Each mention is linked to a single entity
• Mentions do not overlap
• Entities are explicitly mentioned
Example queries and their annotations:
obama mother → {Barack Obama}
the music man → {The Music Man}
new york pizza manhattan → {New York City, Manhattan}
8.
Semantic mapping
• Output is ranked list of entities
• Mentions can overlap and be linked to multiple entities
• Entities may not be explicitly mentioned
• Entities do not need to form semantically compatible sets
• False positives are not penalized
Example queries and their ranked entity lists:
obama mother → Ann Dunham, Barack Obama, …
the music man → The Music Man, The Music Man (1962 film), The Music Man (2003 film), …
new york pizza manhattan → New York City, New York-style pizza, Manhattan, Manhattan pizza, …
9.
Interpretation finding
• Output is set(s) of semantically related entity sets
• Each entity set is an interpretation of the query
• Mentions do not overlap within a set
Example queries and their interpretation sets:
obama mother → { {Barack Obama} }
the music man → { {The Music Man}, {The Music Man (1962 film)}, {The Music Man (2003 film)} }
new york pizza manhattan → { {New York City, Manhattan}, {New York-style pizza, Manhattan} }
D. Carmel, M.-W. Chang, E. Gabrilovich, B.J. P. Hsu, and K. Wang. ERD: Entity recognition and disambiguation challenge, 2014.
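To make the three output formats concrete, here is a minimal sketch in Python; the type alias and example values are illustrative only, mirroring the example queries on the slides:

```python
# Minimal sketch of the three output formats; values are illustrative.
from typing import FrozenSet, List, Set, Tuple

Entity = str  # a knowledge-base identifier, e.g., a Wikipedia page title

# Entity linking: a set of entities, one per non-overlapping mention.
entity_linking: Set[Entity] = {"Barack Obama"}

# Semantic mapping: a ranked list of (entity, score) pairs; scores made up.
semantic_mapping: List[Tuple[Entity, float]] = [
    ("Ann Dunham", 0.8),
    ("Barack Obama", 0.7),
]

# Interpretation finding: a set of entity sets; each inner (frozen)set
# is one complete interpretation of the query.
interpretations: Set[FrozenSet[Entity]] = {
    frozenset({"New York City", "Manhattan"}),
    frozenset({"New York-style pizza", "Manhattan"}),
}
```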
10.
Tasks summary
                                Entity Linking            Semantic Mapping          Interpretation Finding
Entities explicitly mentioned?  Yes                       No                        Yes
Mentions can overlap?           No                        Yes                       No*
Results format                  Set                       Ranked list               Sets of sets
Evaluation criteria             Mentioned entities found  Relevant entities found   Interpretations found
Evaluation metrics              Set-based                 Rank-based                Set-based
* Not within the same interpretation
✓ Entity linking requirements are relaxed in semantic mapping.
11.
In this talk
How should entity linking be performed for queries?
➤ Task:
“Semantic Mapping” or “Interpretation Finding”?
➤ Evaluation metrics
➤ Test collections
➤ Methods
12.
Evaluation
• Macro-averaged metrics (precision, recall, F-measure)
• Matching condition:
– Interpretation sets should exactly match the ground truth
[Figure: a system interpretation compared against a ground-truth interpretation]
☞ What if the system interpretation only partially matches the ground truth?
D. Carmel, M.-W. Chang, E. Gabrilovich, B.J. P. Hsu, and K. Wang. ERD: Entity recognition and disambiguation challenge, 2014.
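The slide does not spell out the formulas; a hedged reconstruction of the per-query, exact-match metrics (notation assumed here: Î_q is the system's interpretation set for query q, I_q the ground truth, and an interpretation counts as correct only on an exact match):

```latex
P_q = \frac{|\hat{I}_q \cap I_q|}{|\hat{I}_q|}, \qquad
R_q = \frac{|\hat{I}_q \cap I_q|}{|I_q|}, \qquad
F_q = \frac{2\,P_q R_q}{P_q + R_q}
```

The final scores are macro-averaged over the query set, e.g., P = (1/|Q|) Σ_q P_q; the speaker notes discuss a correction for queries with no interpretations in the ground truth, where precision would otherwise be undefined.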
14.
Lean evaluation
• Partial matches are not rewarded in interpretation-based evaluation
– E.g. {{New York City, Manhattan}} ≠ {{New York City}, {Manhattan}}
Solution: Combine interpretation-based metrics with entity-based metrics (see the sketch below).
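A sketch of the entity-based side of this combination, under the assumption that the entities of all interpretations are pooled per query (Ê_q and E_q below denote the pooled system and ground-truth entities; the exact definitions follow the paper):

```latex
\hat{E}_q = \bigcup_{S \in \hat{I}_q} S, \qquad
E_q = \bigcup_{S \in I_q} S, \qquad
P^{\mathrm{ent}}_q = \frac{|\hat{E}_q \cap E_q|}{|\hat{E}_q|}, \qquad
R^{\mathrm{ent}}_q = \frac{|\hat{E}_q \cap E_q|}{|E_q|}
```

Under these entity-based metrics, {{New York City, Manhattan}} and {{New York City}, {Manhattan}} receive full credit, so combining both metric families rewards partial matches without giving up the exact-match criterion.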
15.
In this talk
How should entity linking be performed for queries?
➤ Task:
“Semantic Mapping” or “Interpretation Finding”?
➤ Evaluation metrics
➤ Test collections
➤ Methods
16.
Test collections - ERD
The ERD challenge introduced two test collections: [1]
• ERD-dev (91 queries)
• ERD-test (500 queries)
– Unavailable for traditional offline evaluation
Annotation rules:
• The longest mention is used for entities
• Only proper noun entities are annotated (e.g., companies, locations)
• Overlapping mentions are not allowed within a single interpretation
☞ ERD-dev is not suitable for training purposes (too small)
[1] http://web-ngram.research.microsoft.com/erd2014/Datasets.aspx
17.
Test collections - YSQLE
Yahoo Search Query Log to Entities (YSQLE)
• 2398 queries, manually annotated with Wikipedia entities
• Designed for training and testing entity linking systems for queries
Issues:
• Not possible to automatically form interpretation sets
– E.g. Query “france world cup 1998”
• Linked entities are not necessarily mentioned explicitly
– E.g. Query “charlie sheen lohan” is annotated with Anger Management (TV series)
• Annotations are not always complete
– E.g. Query “louisville courier journal” is not annotated with Louisville, Kentucky
Yahoo! Webscope L24 dataset - Yahoo! search query log to entities, v1.0. http://webscope.sandbox.yahoo.com/
☞ YSQLE is meant for the semantic mapping task
18.
Test collections - Y-ERD
Y-ERD is manually re-annotated based on:
• YSQLE annotations
• ERD rules
Additional rules:
• Site search queries are not linked
– E.g. Query “facebook obama slur” is only linked to Barack Obama
• Clear policy about misspelled mentions
– Two versions of Y-ERD are made available
☞ Y-ERD is made publicly available at http://bit.ly/ictir2015-elq
19.
In this talk
How should entity linking be performed for queries?
➤ Task:
“Semantic Mapping” or “Interpretation Finding”?
➤ Evaluation metrics
➤ Test collections
➤ Methods
21.
Mention detection
Entity name variants are gathered from:
• KB: A manually curated knowledge base (DBpedia)
• WEB: Freebase Annotations of the ClueWeb Corpora (FACC)
E. Gabrilovich et al. FACC1: Freebase annotation of ClueWeb corpora, 2013.
[Figure: recall of the mention detection step for the KB and WEB name variant sources]
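As a concrete illustration, a minimal sketch of dictionary-based mention detection; the `surface_forms` dictionary and its contents are hypothetical stand-ins for the name variants gathered from the KB and WEB sources:

```python
# Minimal sketch: match every query n-gram against a surface-form
# dictionary (name variant -> candidate entities) built offline.
from typing import Dict, List, Set, Tuple

surface_forms: Dict[str, Set[str]] = {  # hypothetical example content
    "france": {"France", "France national football team"},
    "world cup": {"FIFA World Cup"},
}

def detect_mentions(query: str, max_len: int = 5) -> List[Tuple[str, Set[str]]]:
    """Return every query n-gram that matches a known name variant."""
    terms = query.lower().split()
    mentions = []
    for i in range(len(terms)):
        for j in range(i + 1, min(i + 1 + max_len, len(terms) + 1)):
            ngram = " ".join(terms[i:j])
            if ngram in surface_forms:
                mentions.append((ngram, surface_forms[ngram]))
    return mentions

print(detect_mentions("france world cup 98"))
# -> [('france', {...}), ('world cup', {'FIFA World Cup'})]
```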
22.
Methods
Pipeline architecture for the two tasks:
query → [Mention detection] → set of mentions → [Candidate entity ranking] → ranked list of entities → [Interpretation finding] → interpretations
• The ranked list of entities is the output of semantic mapping
• The interpretations are the output of interpretation finding, the goal of entity linking in queries
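To make the flow concrete, a minimal sketch of the composition in Python; all component functions are hypothetical stubs standing in for the stages on this slide:

```python
# Minimal sketch of the pipeline; the component functions are stubs.
from typing import FrozenSet, List, Set, Tuple

def detect_mentions(query: str) -> Set[str]:
    return {"jacksonville fl", "riverside"}  # stub: set of mentions

def rank_candidates(query: str, mentions: Set[str]) -> List[Tuple[str, str, float]]:
    return [("jacksonville fl", "Jacksonville Florida", 0.9)]  # stub

def find_interpretations(ranked: List[Tuple[str, str, float]]) -> Set[FrozenSet[str]]:
    return {frozenset(e for _, e, _ in ranked)}  # stub

def link_query(query: str) -> Set[FrozenSet[str]]:
    mentions = detect_mentions(query)          # query -> set of mentions
    ranked = rank_candidates(query, mentions)  # -> ranked list of entities
    # Stopping here yields semantic mapping; the final step yields
    # interpretation finding, the goal of entity linking in queries.
    return find_interpretations(ranked)

print(link_query("jacksonville fl riverside"))
```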
23.
Candidate entity ranking
Ranking entities using language models, i.e., by query likelihood:
• Mixture of Language Models (MLM)
• P(q): query length normalization; scores should be comparable across queries
- P. Ogilvie and J. Callan. Combining document representations for known-item search. In Proc. of SIGIR ’03, 2003.
- W. Kraaij and M. Spitters. Language models for topic tracking. In Language Modeling for Information Retrieval, 2003.
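A reconstruction of the underlying formulas, following standard MLM from the cited work; the field weights μ_j and smoothing details are assumptions, not taken from the slide:

```latex
P(q \mid \theta_e) = \prod_{t \in q} P(t \mid \theta_e), \qquad
P(t \mid \theta_e) = \sum_{j} \mu_j \, P(t \mid \theta_{e_j})
```

Dividing the query likelihood by P(q) (equivalently, normalizing by query length in log space) makes the scores comparable across queries of different lengths.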
24.
Candidate entity ranking
Combining MLM and commonness:
• Commonness: the probability of entity e being the link target of mention m
• Query-length-normalized MLM score
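A sketch of the two components; the multiplicative combination is one natural reading of the slide, not necessarily the exact formula from the paper:

```latex
\mathrm{commonness}(e, m) = \frac{n(m, e)}{\sum_{e'} n(m, e')}, \qquad
\mathrm{score}(e, m, q) = \mathrm{commonness}(e, m) \cdot \tilde{P}(q \mid \theta_e)
```

Here n(m, e) is the number of times mention m is observed linking to entity e, and P̃ denotes the query-length-normalized MLM score.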
25.
Candidate entity ranking
Semantic mapping results on YSQLE:
TAGME is an entity linking system:
• It should not be evaluated using rank-based metrics
• It should not be compared with semantic mapping results
P. Ferragina and U. Scaiella. TAGME: On-the-fly annotation of short text fragments. In Proc. of CIKM 2010.
26.
Methods
Pipeline architecture for the two tasks:
query → [Mention detection] → set of mentions → [Candidate entity ranking] → ranked list of entities → [Interpretation finding] → interpretations
• The ranked list of entities is the output of semantic mapping
• The interpretations are the output of interpretation finding, the goal of entity linking in queries
27.
Interpretation finding
Greedy Interpretation Finding (GIF):
Example query: “jacksonville fl riverside”
Mention             Entity                          Score
“jacksonville fl”   Jacksonville Florida            0.9
“jacksonville”      Jacksonville Florida            0.8
“riverside”         Riverside Park (Jacksonville)   0.6
“jacksonville fl”   Naval Station Jacksonville      0.2
“riverside”         Riverside (band)                0.1
Step 1: Pruning based on a score threshold (0.3)
Step 2: Pruning containment mentions (e.g., “jacksonville” is contained in “jacksonville fl”)
Step 3: Forming interpretation sets: { {Jacksonville Florida, Riverside Park (Jacksonville)} }
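The three steps can be sketched as follows; this is a simplified, single-interpretation variant reconstructed from the example, and details beyond the three steps (e.g., tie handling) are assumptions:

```python
# Minimal sketch of Greedy Interpretation Finding (GIF), reconstructed
# from the slide's three steps; not the paper's exact algorithm.
from typing import FrozenSet, List, Set, Tuple

Candidate = Tuple[str, str, float]  # (mention, entity, score)

def gif(candidates: List[Candidate], threshold: float = 0.3) -> Set[FrozenSet[str]]:
    # Step 1: prune candidates below the score threshold.
    kept = [c for c in candidates if c[2] >= threshold]
    # Step 2: prune containment mentions, e.g., "jacksonville" is
    # contained in the longer kept mention "jacksonville fl".
    mentions = {m for m, _, _ in kept}
    kept = [
        (m, e, s) for m, e, s in kept
        if not any(m != other and m in other for other in mentions)
    ]
    # Step 3: greedily keep the highest-scoring entity per mention and
    # form one interpretation set from the surviving entities.
    best = {}
    for m, e, s in kept:
        if m not in best or s > best[m][1]:
            best[m] = (e, s)
    return {frozenset(e for e, _ in best.values())}

cands = [
    ("jacksonville fl", "Jacksonville Florida", 0.9),
    ("jacksonville", "Jacksonville Florida", 0.8),
    ("riverside", "Riverside Park (Jacksonville)", 0.6),
    ("jacksonville fl", "Naval Station Jacksonville", 0.2),
    ("riverside", "Riverside (band)", 0.1),
]
print(gif(cands))
# -> {frozenset({'Jacksonville Florida', 'Riverside Park (Jacksonville)'})}
```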
29.
Take home messages
• Entity linking in queries is different from documents
• Different flavors, different evaluation criteria; which task should it be?
– Interpretation finding (yes)
– Semantic mapping (no)
• Ultimate goal should be interpretation finding
• SM and EL should not be compared to each other
• Resources are available at http://bit.ly/ictir2015-elq
Entity linking is a key enabling component for semantic search
It is important enough to have its own Wikipedia article, covering the task and the research that has been done in this area.
This text is the definition of EL from Wikipedia.
In fact, this piece of text represents entity linking quite well: not because of the text itself, but because the mentions in it are linked to entities.
Of course, this annotation was done by a human; the aim of entity linking is to do it automatically.
Evaluate a system's understanding of the tasks users aim to achieve.
This is the beginning of the story.
Both approaches have their place, but there is an important distinction to be made as they are designed to accomplish different tasks.
No distinction is made between a system A that does not return anything and a system B that returns meaningless or nonsensical suggestions.
Mentions can overlap.
Entities do not need to form semantically compatible sets.
False positives are not penalized in semantic mapping.
According to this definition, if the query does not have any interpretations in the ground truth (Î = ∅), then precision is undefined.
We correct for this behavior by defining precision and recall for interpretation-based evaluation:
Interpretation-based metrics: they capture the extent to which the interpretations of the query are identified.
As with any problem in information retrieval, the availability of public datasets is of key importance.
Terms that are meant to restrict the search to a certain site (such as Facebook or IMDB) should not be linked.
to rank items (here: entities) based on query likelihood:
It ranks entities based on the highest scoring mention, i.e., ranking is dependent not only on the query but on the specific mention as well:
Though we include results for TAGME, we note that this comparison, despite having been done in prior work (e.g., in [6]), is an unfair one.
TAGME is not a baseline for the semantic mapping task.
Recall what Y-ERD and ERD-dev are.
Semantic mapping and entity linking task should not be compared to each other.
Next: Exploiting multiple query interpretations in entity ranking
Some questions:
Q1: Why multiple interpretations?
ERD-dev is a better representation of Web search queries.
Y-ERD is limited to annotations from YSQLE and, yes, it has its own limitations.
Multiple interpretations are crucial; this is not just our idea, but also the ERD organizers' idea.
Q2: Why use both KB and WEB for mention detection?
Using the KB is an offline process and it comes for free.
Not all FACC entities are used for entity ranking (that would not be efficient).
In this case, the KB gives high-quality name variants and improves recall.
Q3: Are these evaluation metrics biased toward a system that tends to return fewer entities?
The naïve baseline for these datasets is an F-measure of 0.5.
A system should of course be better than this baseline.
Referring to the results table: the scores of the top-ranked results are low.
Q4: Why interpretation finding?
Closer to how humans think.
Part of understanding the ambiguity of queries.