Using BabelNet in Bridging the Gap
Between Natural Language Queries and
Linked Data Concepts
Khadija Elbedweihy, Stuart N. Wrigley, Fabio Ciravegna and Ziqi Zhang
OAK Research Group,
Department of Computer Science,
University of Sheffield, UK
Outline
• Motivation and Problem Statement
• Natural Language Query Approach
• Approach Steps
• Evaluation
• Results and Discussion
Motivation – Semantic Search
• Wikipedia states that Semantic Search:
“seeks to improve search accuracy by understanding
searcher intent and the contextual meaning of terms as
they appear in the searchable dataspace, whether on the
Web or within a closed system, to generate more
relevant results”
• Semantic search evaluations reported user preference for
free natural language as a query approach (simple, fast &
flexible) as opposed to controlled or view-based inputs.
Problem Statement
• Complete freedom increases difficulty of matching query
terms with the underlying data and ontologies.
• Word sense disambiguation (WSD) is core to the solution.
Question: “How tall is ..... ?”: property height
– tall is polysemous, should be first disambiguated:
– great in vertical dimension; tall people; tall buildings, etc.
– too improbable to admit of belief; a tall story, …
• Another difficulty: Named Entity (NE) recognition and
disambiguation.
Approach
• Free-NL semantic search approach, matching user query
terms with the underlying ontology using:

1) An extended-Lesk WSD approach.
2) A NE recogniser.
3) A set of advanced string similarity algorithms and
ontology-based heuristics to match disambiguated
query terms to ontology concepts and properties.
Extended-Lesk WSD approach
• WordNet is the predominant sense inventory; however, its fine
granularity makes high WSD performance difficult to achieve.

• BabelNet is a very large multilingual ontology with wide coverage, obtained from both WordNet and Wikipedia.
• For disambiguation, bags are extended with senses’
glosses and different lexical and semantic relations.
• These include synonyms, hyponyms, hypernyms, attribute, see
also and similar to relations.
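The bag-overlap idea behind a Lesk-style disambiguator can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sense labels, the hand-written bags, the stopword list and the function names are all assumptions, and a real system would fill the bags from BabelNet glosses and relations.

```python
import re

# Small illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {"a", "an", "the", "of", "in", "to", "is", "are", "how", "too"}

def tokenize(text):
    """Lowercase the text and return its alphabetic tokens as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))

def lesk_disambiguate(target, context, sense_bags):
    """Return the sense whose extended bag shares the most words with the context.

    sense_bags maps a sense label to a list of strings (gloss, synonyms,
    hyponyms, hypernyms, ...) that together form the extended bag.
    """
    context_words = tokenize(context) - {target.lower()} - STOPWORDS
    best_sense, best_overlap = None, -1
    for sense, texts in sense_bags.items():
        bag = set()
        for text in texts:
            bag |= tokenize(text)
        overlap = len(bag & context_words)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Toy, hand-written bags for two senses of "tall" (not real BabelNet data).
TALL_SENSES = {
    "tall#height": ["great in vertical dimension", "tall people tall buildings", "height high"],
    "tall#improbable": ["too improbable to admit of belief", "a tall story", "unlikely"],
}

chosen = lesk_disambiguate("tall", "How tall are the buildings in Dubai?", TALL_SENSES)
```

Ties are resolved in favour of the sense listed first, which is a simplification made only for this sketch.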
Extended-Lesk WSD approach
• Information added from a Wikipedia page (W), mapped
to a WordNet synset includes:
1. labels; page “Play (theatre)” → add play and theatre
2. set of pages redirecting to W; Playlet redirects to Play
3. set of pages linked from W; links in the page Play (theatre)
include literature, comedy, etc.

• Synonyms of synset S, associated with Wikipedia page W:
WordNet synonyms of S in addition to lemmas of
Wikipedia information of W.
Extended-Lesk WSD approach
Feature                                                 P      R      F1
Baseline                                                58.09  57.98  58.03
Synonyms                                                59.14  59.03  59.09
Syn + hypo                                              62.16  62.07  62.12
Syn + gloss examples (WN)                               61.97  61.86  61.92
Syn + gloss examples (Wiki)                             61.14  61.02  61.08
Syn + gloss examples (WN + Wiki)                        60.21  60.10  60.16
Syn + hyper                                             60.36  60.26  60.31
Syn + semRel                                            59.65  59.54  59.59
Syn + hypo + gloss(WN)                                  64.92  64.81  64.86
Syn + hypo + gloss(WN) + hyper                          65.28  65.18  65.23
Syn + hypo + gloss(WN) + hyper + semRel                 65.45  65.33  65.39
Syn + hypo + gloss(WN) + hyper + semRel + relGlosses    69.76  69.66  69.71

• Sentences with fewer than seven words: F-measure of 81.34%
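The F1 column can be cross-checked against the P and R columns, assuming F1 is the usual harmonic mean of precision and recall (the slides do not state the formula). The function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# P and R values taken from the first and last rows of the feature table.
baseline_f1 = round(f1_score(58.09, 57.98), 2)
best_f1 = round(f1_score(69.76, 69.66), 2)
```

Under that assumption the two values round to 58.03 and 69.71, matching the reported F1 figures for the baseline and for the full feature set.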
Approach – Steps
1. Recognition and disambiguation of Named Entities.
2. Parsing and Disambiguation of the NL query.
3. Matching query terms with ontology concepts and
properties.
4. Generation of candidate triples.
5. Integration of triples and generation of SPARQL queries.
1. Recognition and disambiguation of Named Entities
• Named entities recognised using AlchemyAPI.
• AlchemyAPI had the best recognition performance in the
NERD evaluation of state-of-the-art NE recognizers.
• However, AlchemyAPI exhibits poor disambiguation performance.
• Each NE is disambiguated using our BabelNet-based WSD
approach.
1. Recognition and disambiguation of Named Entities
• Example: “In which country does the Nile start?”
• Matches of Nile in BabelNet include:
– http://dbpedia.org/resource/Nile (singer)
– http://dbpedia.org/resource/Nile (TV series)
– http://dbpedia.org/resource/Nile (band)
– http://dbpedia.org/resource/Nile

• Match selected (Nile: river): its sense shares more
overlapping terms with the query (geography, area, culture,
continent) than the other senses do.
2. Parsing and Disambiguation of the NL query
• Stanford Parser used to gather lemmas and POS tags.
• Proper nouns identified by the parser and not recognized
by AlchemyAPI are disambiguated and added to the
recognized entities.

• Example: “In which country does the Nile start?”
– The algorithm does not miss the entity Nile, although it
was not recognized by AlchemyAPI.
2. Parsing and Disambiguation of the NL query
• Example: “Which software has been developed by
organizations founded in California?”
Output:
Word           Lemma       POS  Position
software       software    NP   1
developed      develop     VBN  2
organizations  organize    NNS  3
founded        find        VBN  4
California     California  NP   5

• Equivalent output generated using keywords or phrases.
3. Matching Query Terms with Ontology Concepts & Properties
• Noun phrases, nouns and adjectives are matched with
concepts and properties.

• Verbs are matched only with properties.
• Candidate ontology matches ordered using: Jaro-Winkler
and Double Metaphone string similarity algorithms.
• Jaro-Winkler threshold to accept a match is set to 0.791,
reported in the literature as the best threshold value.
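A minimal pure-Python sketch of Jaro-Winkler scoring with the 0.791 acceptance threshold is shown below. The prefix scale of 0.1, the four-character prefix cap, the case folding and the use of ">=" for acceptance are common conventions assumed here rather than details from the slides; accept_matches is a hypothetical helper name. Double Metaphone is not covered.

```python
def jaro(s1, s2):
    """Jaro similarity between two strings, in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order in s2.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3

def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted by the length of the common prefix (max 4)."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)

def accept_matches(term, labels, threshold=0.791):
    """Return (label, score) pairs at or above the threshold, best first."""
    scored = [(label, jaro_winkler(term.lower(), label.lower())) for label in labels]
    kept = [(label, score) for label, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

accepted = accept_matches("create", ["height", "creator"])
```

With these conventions, "create" against "creator" scores well above the threshold, while "create" against "height" falls below it and is discarded.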
3. Matching Query Terms with Ontology Concepts & Properties
• Matching process uses the following in order:
1. query term (e.g., created)
2. lemma (e.g., create)
3. derivationally related forms (creator)

• If no matches, disambiguate query term and use
expansion terms in order:
1. synonyms
2. hyponyms
3. hypernyms
4. semantic relations (e.g., height as an attribute for tall)
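The ordered fallback above can be sketched as a loop over stages that stops at the first stage yielding a match. Everything named here is illustrative: match_term, the exact-match lookup and the toy ontology entries are assumptions, and the expansion terms are supplied by hand instead of coming from a disambiguation step.

```python
def match_term(stages, find_matches):
    """Try each stage in order and stop at the first one that yields matches.

    stages is an ordered list of (stage_name, candidate_strings);
    find_matches maps a candidate string to a list of ontology matches.
    """
    for stage_name, candidates in stages:
        matches = []
        for candidate in candidates:
            matches.extend(find_matches(candidate))
        if matches:
            return stage_name, matches
    return None, []

# Hypothetical ontology lookup using exact label matching (toy data).
TOY_ONTOLOGY = {"creator": ["dbo:creator"], "height": ["dbo:height"]}

def toy_lookup(text):
    return TOY_ONTOLOGY.get(text, [])

created_stages = [
    ("query term", ["created"]),
    ("lemma", ["create"]),
    ("derivationally related forms", ["creator"]),
]
tall_stages = [
    ("query term", ["tall"]),
    ("lemma", ["tall"]),
    ("derivationally related forms", []),
    ("synonyms", []),
    ("hyponyms", []),
    ("hypernyms", []),
    ("semantic relations", ["height"]),
]

created_result = match_term(created_stages, toy_lookup)
tall_result = match_term(tall_stages, toy_lookup)
```

In a fuller version, find_matches would apply the string similarity measures instead of exact lookup, and the expansion stages would only be computed once the earlier stages had failed.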
4. Generation of Candidate Query Triples
• Structure of the ontology (taxonomy of classes and domain
and range of properties) used to link matched concepts and
properties and recognized entities to generate query triples.

Three-Terms Rule
• Every three consecutive terms are matched against a set of templates.

E.g., “Which television shows were created by Walt Disney?”
• Template (concept-property-instance) generates triples:
?television_show <dbo:creator> <res:Walt_Disney>
?television_show <dbp:creator> <res:Walt_Disney>
?television_show <dbo:creativeDirector> <res:Walt_Disney>
Three-Terms Rule
Examples of templates used in three-terms rule:
• concept-property-instance
– airports located in California
– actors born in Germany
• instance-property-instance
– Was Natalie Portman born in the United States?
• property-concept-instance
– birthdays of actors of television show Charmed
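One way to picture the three-terms rule is a sliding window over typed query terms that is checked against the template set. The sketch below is an assumption-laden illustration: the term typing is given by hand, the function names are invented, and only the concept-property-instance template is turned into triples, since the slides do not spell out how triple direction is decided for the other templates.

```python
TEMPLATES = {
    ("concept", "property", "instance"),
    ("instance", "property", "instance"),
    ("property", "concept", "instance"),
}

def matching_windows(typed_terms):
    """Yield each run of three consecutive terms whose kinds match a template."""
    for i in range(len(typed_terms) - 2):
        window = typed_terms[i:i + 3]
        kinds = tuple(kind for kind, _ in window)
        if kinds in TEMPLATES:
            yield kinds, window

def concept_property_instance_triples(concept_var, property_uris, instance_uri):
    """Build one candidate triple per property matched for the middle term."""
    return [f"?{concept_var} <{prop}> <{instance_uri}>" for prop in property_uris]

# "Which television shows were created by Walt Disney?" after term typing.
terms = [
    ("concept", "television_show"),
    ("property", "created"),
    ("instance", "res:Walt_Disney"),
]
windows = list(matching_windows(terms))
triples = concept_property_instance_triples(
    "television_show",
    ["dbo:creator", "dbp:creator", "dbo:creativeDirector"],
    "res:Walt_Disney",
)
```

The three strings in triples correspond to the candidate triples listed for the Walt Disney example.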
Two-Terms Rule
Two-Terms Rule, used when:
1) There are fewer than three derived terms
2) No match between query terms and three-term template
3) Matched template did not generate candidate triples
E.g., “In which films directed by Garry Marshall was Julia
Roberts starring?”
<Garry Marshall, Julia Roberts, starring> : matched to a
three-terms template but does not generate triples.
Two-Terms Rule
Question: “what is the area code of Berlin?”
• Template (property-instance) generates the triples:
<res:Berlin> <dbp:areaCode> ?area_code

<res:Berlin> <dbo:areaCode> ?area_code
Comparatives
Comparatives Scenarios:
1) Comparative used with a numeric datatype property:
e.g., “companies with more than 500,000 employees”
?company <dbp:numEmployees> ?employee
?company <dbp:numberOfEmployees> ?employee
?company a <dbo:Company>
FILTER (?employee > 500000)
Comparatives
2) Comparative is used with a concept:
e.g., “places with more than 2 caves”

• Generate the same triples for places with caves:
?place a <http://dbpedia.org/ontology/Place>.
?cave a <http://dbpedia.org/ontology/Cave>.
?place ?rel1 ?cave.
?cave ?rel1 ?place.

• Add the aggregate restriction:
GROUP BY ?place
HAVING (COUNT(?cave)>2).
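As a rough sketch, the aggregate restriction can be attached to the generated triple patterns with a small query builder. The function name and parameters are hypothetical, and only the place-to-cave direction is kept here because the slides do not state how the two directions are combined.

```python
def count_restricted_query(group_var, counted_var, patterns, minimum):
    """Assemble a SELECT query keeping groups with more than `minimum` counted items."""
    where_body = "\n".join(f"  {pattern}" for pattern in patterns)
    return (
        f"SELECT ?{group_var} WHERE {{\n"
        f"{where_body}\n"
        f"}}\n"
        f"GROUP BY ?{group_var}\n"
        f"HAVING (COUNT(?{counted_var}) > {minimum})"
    )

# "places with more than 2 caves"
caves_query = count_restricted_query(
    "place",
    "cave",
    [
        "?place a <http://dbpedia.org/ontology/Place> .",
        "?cave a <http://dbpedia.org/ontology/Cave> .",
        "?place ?rel1 ?cave .",
    ],
    2,
)
print(caves_query)
```

The same builder would cover the object-property scenario by passing country and official_language as the grouping and counted variables.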
Comparatives
3) Comparative is used with an object property
e.g., “countries with more than 2 official languages”

• Similarly, generate the same triples for country and
official language and add the restriction:
GROUP BY ?country
HAVING (COUNT(?official_language) > 2)

4) Generic Comparatives
e.g., “Which mountains are higher than the Nanga Parbat?”
Generic Comparatives
• Difficulty: identify the property referred to by the
comparative term.

1) Select best relation according to query context.
– Identify all numeric datatype properties associated
with the concept “mountain”, include:
“latS, longD, prominence, firstAscent, elevation, longM, …”
2) Disambiguate synsets of all properties and use WSD
approach to identify the most related synset to the query.
– property elevation is correctly selected
5. Integration of Triples and Generation of SPARQL Queries
• Generated triples integrated to produce SPARQL query.
• Query term positions used to order the generated triples.
• Triples originating from the same query term are
executed in order until an answer is found.
• Duplicates are removed while merging the triples.
• SELECT and WHERE clauses added in addition to any
aggregate restrictions or solution modifiers.
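The integration step can be sketched as: sort the candidate triples by the position of the query term they came from, drop duplicates, then wrap the result in SELECT and WHERE. This is a simplified illustration with hypothetical names; it does not model trying alternative triples from the same term one at a time until an answer is found, and it omits the PREFIX declarations a real endpoint would need.

```python
def build_select_query(select_var, positioned_triples, modifiers=()):
    """Order triples by query-term position, remove duplicates, add SELECT/WHERE.

    positioned_triples is a list of (term_position, triple_pattern) pairs;
    modifiers are optional trailing clauses such as GROUP BY or HAVING.
    """
    ordered = sorted(positioned_triples, key=lambda item: item[0])
    seen = set()
    unique = []
    for _, triple in ordered:
        if triple not in seen:
            seen.add(triple)
            unique.append(triple)
    body = "\n".join(f"  {triple} ." for triple in unique)
    query = f"SELECT ?{select_var} WHERE {{\n{body}\n}}"
    for modifier in modifiers:
        query += f"\n{modifier}"
    return query

company_query = build_select_query(
    "company",
    [
        (2, "?company <dbp:numberOfEmployees> ?employee"),
        (1, "?company a <dbo:Company>"),
        (2, "?company <dbp:numberOfEmployees> ?employee"),  # duplicate
    ],
)
print(company_query)
```

Python's sort is stable, so triples that share a term position keep the order in which they were generated.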
Evaluation
• Test data from 2nd Open Challenge at QALD-2.
• Results produced by QALD-2 evaluation tool.

• Very promising results: 76% of answered questions were correct.
Approach    Answered  Correct  Precision  Recall  F1
BELA        31        17       0.62       0.73    0.67
QAKiS       35        11       0.39       0.37    0.38
Alexandria  25        5        0.43       0.46    0.45
SenseAware  54        41       0.51       0.53    0.52
SemSeK      80        32       0.44       0.48    0.46
MHE         97        30       0.36       0.4     0.38
Discussion
• Design choices affected by priority for precision or recall:
1. Query Relaxation
e.g., “Give me all actors starring in Last Action Hero”
– Restricting results to actors harms recall
– Not all entities in LD are typed, let alone correctly typed
– Query relaxation favors recall but affects precision
e.g., “How many films did Leonardo DiCaprio star in?”
– Relaxation returns TV series as well as films, such as
res:Parenthood (1990 TV series).

• Decision: favor precision; keep restriction when specified.
Discussion
2. Best or All Matches
e.g., “software by organizations founded in California”
– Properties matched: foundation and foundationPlace
– Using only the best match (foundation) does not generate
all results → affects recall.
– Using all properties (some may not be relevant to the query)
would harm precision.
• Decision: use all matches, with a high value for the
similarity threshold; perform checks against the ontology
structure to ensure that only relevant matches are used.
Discussion
3. Query Expansion
• Can be useful for recall, when the query term is not
sufficient to return all answers.
• Example: use both “website” and “homepage” if either is
used in a query and both have matches in the ontology.
• Quality of expansion terms is influenced by the WSD approach;
wrong sense identification will lead to a noisy list of terms.
• Decision: perform query expansion only when no
matches found in the ontology for a term; or no results
generated using the identified matches.
Questions

Questions?
Additional Slides


General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...Nguyen Thanh Tu Collection
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxUmeshTimilsina1
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxannathomasp01
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...Amil baba
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Pooja Bhuva
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the ClassroomPooky Knightsmith
 

Último (20)

General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 

NLP & DBpedia

  • 1. Using BabelNet in Bridging the Gap Between Natural Language Queries and Linked Data Concepts Khadija Elbedweihy, Stuart N. Wrigley, Fabio Ciravegna and Ziqi Zhang OAK Research Group, Department of Computer Science, University of Sheffield, UK
  • 2. Outline • Motivation and Problem Statement • Natural Language Query Approach • Approach Steps • Evaluation • Results and Discussion
  • 3. Motivation – Semantic Search • Wikipedia states that Semantic Search: “seeks to improve search accuracy by understanding searcher intent and the contextual meaning of terms as they appear in the searchable dataspace, whether on the Web or within a closed system, to generate more relevant results” • Semantic search evaluations reported user preference for free natural language as a query approach (simple, fast & flexible) as opposed to controlled or view-based inputs.
  • 4. Problem Statement • Complete freedom increases difficulty of matching query terms with the underlying data and ontologies. • Word sense disambiguation (WSD) is core to the solution. Question: “How tall is ..... ?”: property height – tall is polysemous, should be first disambiguated: – great in vertical dimension; tall people; tall buildings, etc. – too improbable to admit of belief; a tall story, … • Another difficulty: Named Entity (NE) recognition and disambiguation.
  • 5. Approach • Free-NL semantic search approach, matching user query terms with the underlying ontology using: 1) An extended-Lesk WSD approach. 2) A NE recogniser. 3) A set of advanced string similarity algorithms and ontology-based heuristics to match disambiguated query terms to ontology concepts and properties.
  • 6. Extended-Lesk WSD approach • WordNet is predominant; however, its granularity is a problem for achieving high performance in WSD. • BabelNet is a very large multilingual ontology with wide coverage obtained from both WordNet and Wikipedia. • For disambiguation, bags are extended with senses’ glosses and different lexical and semantic relations. • Include synonyms, hyponyms, hypernyms, attribute, see also and similar to relations.
  • 7. Extended-Lesk WSD approach • Information added from a Wikipedia page (W), mapped to a WordNet synset includes: 1. labels; page “Play (theatre)” → add play and theatre 2. set of pages redirecting to W; Playlet redirects to Play 3. set of pages linked from W; links in the page Play (theatre) include literature, comedy, etc. • Synonyms of synset S, associated with Wikipedia page W: WordNet synonyms of S in addition to lemmas of the Wikipedia information of W.
  • 8. Extended-Lesk WSD approach
      Feature                                        P      R      F1
      Baseline                                       58.09  57.98  58.03
      Synonyms                                       59.14  59.03  59.09
      Syn + hypo                                     62.16  62.07  62.12
      Syn + gloss examples (WN)                      61.97  61.86  61.92
      Syn + gloss examples (Wiki)                    61.14  61.02  61.08
      Syn + gloss examples (WN + Wiki)               60.21  60.10  60.16
      Syn + hyper                                    60.36  60.26  60.31
      Syn + semRel                                   59.65  59.54  59.59
      Syn + hypo + gloss(WN)                         64.92  64.81  64.86
      Syn + hypo + gloss(WN) + hyper                 65.28  65.18  65.23
      Syn + hypo + gloss(WN) + hyper + semRel        65.45  65.33  65.39
      Syn+hypo+gloss(WN)+hyper+semRel+relGlosses     69.76  69.66  69.71
    • Sentences with fewer than seven words: F-measure of 81.34%
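
The overlap idea behind a Lesk-style disambiguator can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stopword list, the `extended_lesk` function and the two toy sense bags for "tall" are assumptions made up for the example, whereas a real system would build each bag from BabelNet glosses, synonyms and related synsets.

```python
from typing import Dict, Set

# Tiny illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {"a", "an", "the", "of", "in", "is", "to", "and", "by", "does", "which"}


def bag_of_words(text: str) -> Set[str]:
    """Lowercase the text, strip punctuation and drop stopwords."""
    tokens = (tok.strip(".,;:?!()\"'").lower() for tok in text.split())
    return {tok for tok in tokens if tok and tok not in STOPWORDS}


def extended_lesk(context: str, sense_bags: Dict[str, Set[str]]) -> str:
    """Return the sense whose extended bag shares the most words with the context."""
    context_bag = bag_of_words(context)
    return max(sense_bags, key=lambda sense: len(context_bag & sense_bags[sense]))


# Hypothetical extended bags (gloss words + synonyms + related terms).
TOY_SENSES = {
    "tall#height": {"great", "vertical", "dimension", "height", "high", "building", "people"},
    "tall#improbable": {"improbable", "belief", "story", "tale", "exaggerated"},
}

best = extended_lesk("How tall is the building?", TOY_SENSES)
print(best)
```

Adding more relations to each bag, as in the feature table above, only changes how `sense_bags` is populated; the scoring step stays the same.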
  • 9. Approach – Steps 1. Recognition and disambiguation of Named Entities. 2. Parsing and Disambiguation of the NL query. 3. Matching query terms with ontology concepts and properties. 4. Generation of candidate triples. 5. Integration of triples and generation of SPARQL queries.
  • 10. 1.Recognition and disambiguation of Named Entities • Named entities recognised using AlchemyAPI. • AlchemyAPI had the best recognition performance in the NERD evaluation of state-of-the-art NE recognizers. • AlchemyAPI exhibits poor disambiguation performance. • Each NE is disambiguated using our BabelNet-based WSD approach.
  • 11. 1.Recognition and disambiguation of Named Entities • Example: “In which country does the Nile start?” • Matches of Nile in BabelNet include: – http://dbpedia.org/resource/Nile (singer) – http://dbpedia.org/resource/Nile (TV series) – http://dbpedia.org/resource/Nile (band) – http://dbpedia.org/resource/Nile • Match selected (Nile: river): it has more overlapping terms between sense and query (geography, area, culture, continent) than the other senses.
  • 12. 2.Parsing and Disambiguation of the NL query • Stanford Parser used to gather lemmas and POS tags. • Proper nouns identified by the parser and not recognized by AlchemyAPI are disambiguated and added to the recognized entities. • Example: “In which country does the Nile start?” – The algorithm does not miss the entity Nile, although it was not recognized by AlchemyAPI.
  • 13. 2.Parsing and Disambiguation of the NL query • Example: “Which software has been developed by organizations founded in California?” Output:
      Word           Lemma       POS  Position
      software       software    NP   1
      developed      develop     VBN  2
      organizations  organize    NNS  3
      founded        find        VBN  4
      California     California  NP   5
    • Equivalent output generated using keywords or phrases.
  • 14. 3.Matching Query Terms with Ontology Concepts & Properties • Noun phrases, nouns and adjectives are matched with concepts and properties. • Verbs are matched only with properties. • Candidate ontology matches ordered using: Jaro-Winkler and Double Metaphone string similarity algorithms. • Jaro-Winkler threshold to accept a match is set to 0.791, shown in literature to be the best threshold value.
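
As a rough sketch of the threshold-based ranking described above, the snippet below implements the standard Jaro-Winkler similarity in plain Python and keeps only candidates scoring at least 0.791. The function names and the two candidate labels are illustrative assumptions; the Double Metaphone component mentioned on the slide is not covered here.

```python
from typing import List, Tuple


def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity between two strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1, matched2 = [False] * len1, [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted by the length of the common prefix (max 4)."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def rank_matches(term: str, labels: List[str], threshold: float = 0.791) -> List[Tuple[str, float]]:
    """Keep labels scoring at least `threshold`, best match first."""
    scored = [(label, jaro_winkler(term, label)) for label in labels]
    return sorted((p for p in scored if p[1] >= threshold), key=lambda p: p[1], reverse=True)


ranked = rank_matches("created", ["creator", "height"])
print(ranked)
```

With these toy labels only "creator" survives the threshold, since "height" shares too few characters with "created".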
  • 15. 3.Matching Query Terms with Ontology Concepts & Properties • Matching process uses the following in order: 1. query term (e.g., created) 2. lemma (e.g., create) 3. derivationally related forms (creator) • If no matches, disambiguate query term and use expansion terms in order: 1. synonyms 2. hyponyms 3. hypernyms 4. semantic relations (e.g., height as an attribute for tall)
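
The ordered fallback described above can be sketched as a loop that stops at the first group of terms producing any ontology match. Everything in this example is hypothetical: `match_term`, the toy index, and the expansion terms (for instance "high" as a synonym of "tall") are placeholders for what the parser, WordNet/BabelNet and the ontology lookup would supply.

```python
from typing import Callable, Dict, List, Sequence


def match_term(candidate_groups: Sequence[List[str]],
               lookup: Callable[[str], List[str]]) -> List[str]:
    """Try each group in order; return the matches of the first group that has any."""
    for group in candidate_groups:
        matches = [uri for term in group for uri in lookup(term)]
        if matches:
            return matches
    return []


# Toy ontology index (assumption): label -> matching ontology terms.
TOY_INDEX: Dict[str, List[str]] = {
    "height": ["dbo:height"],
    "creator": ["dbo:creator"],
}


def toy_lookup(term: str) -> List[str]:
    return TOY_INDEX.get(term, [])


# Order: query term, lemma, derivationally related forms,
# then synonyms, hyponyms, hypernyms, semantic relations.
groups_for_created = [["created"], ["create"], ["creator"]]
groups_for_tall = [["tall"], ["tall"], [], ["high"], [], [], ["height"]]

created_matches = match_term(groups_for_created, toy_lookup)
tall_matches = match_term(groups_for_tall, toy_lookup)
print(created_matches, tall_matches)
```

Stopping at the first non-empty group means expansion terms are only consulted when the cheaper, more literal forms fail.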
  • 16. 4. Generation of Candidate Query Triples • Structure of the ontology (taxonomy of classes and domain and range of properties) used to link matched concepts and properties and recognized entities to generate query triples. Three-Terms Rule • Each three consecutive terms matched with set of templates. E.g., “Which television shows were created by Walt Disney?” • Template (concept-property-instance) generates triples: ?television_show <dbo:creator> <res:Walt_Disney> ?television_show <dbp:creator> <res:Walt_Disney> ?television_show <dbo:creativeDirector> <res:Walt_Disney>
  • 17. Three-Terms Rule Examples of templates used in three-terms rule: • concept-property-instance – airports located in California – actors born in Germany • instance-property-instance – Was Natalie Portman born in the United States? • property-concept-instance – birthdays of actors of television show Charmed
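
A minimal sketch of the three-terms rule, assuming each query term has already been classified as a concept, property or instance and carries its candidate matches. The `Term` class, the `TEMPLATES` table and the single implemented template are illustrative assumptions; the other templates listed above would be registered in the same way.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Term:
    kind: str        # "concept", "property" or "instance"
    label: str       # label used to name the query variable
    uris: List[str]  # candidate ontology matches, best first


def concept_property_instance(concept: Term, prop: Term, instance: Term) -> List[str]:
    """Build one candidate triple per matched property."""
    var = "?" + concept.label.lower().replace(" ", "_")
    return [f"{var} <{p}> <{i}>" for i in instance.uris for p in prop.uris]


# Only one template is implemented in this sketch.
TEMPLATES = {("concept", "property", "instance"): concept_property_instance}


def three_terms_rule(terms: List[Term]) -> List[str]:
    """Slide a window of three consecutive terms over the query and apply templates."""
    triples: List[str] = []
    for start in range(len(terms) - 2):
        window = terms[start:start + 3]
        builder = TEMPLATES.get(tuple(t.kind for t in window))
        if builder:
            triples.extend(builder(*window))
    return triples


query_terms = [
    Term("concept", "television show", ["dbo:TelevisionShow"]),
    Term("property", "created", ["dbo:creator", "dbp:creator", "dbo:creativeDirector"]),
    Term("instance", "Walt Disney", ["res:Walt_Disney"]),
]
triples = three_terms_rule(query_terms)
print("\n".join(triples))
```

An empty result from this function corresponds to the situations in which the slides fall back to the two-terms rule.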
  • 18. Two-Terms Rule Two-Terms Rule, used when: 1) There are fewer than three derived terms 2) No match between query terms and three-term template 3) Matched template did not generate candidate triples E.g., “In which films directed by Garry Marshall was Julia Roberts starring?” <Garry Marshall, Julia Roberts, starring> : matched to a three-terms template but does not generate triples.
  • 19. Two-Terms Rule Two-Terms Rule Question: “what is the area code of Berlin?” • Template (property-instance) generates the triples: <res:Berlin> <dbp:areaCode> ?area_code <res:Berlin> <dbo:areaCode> ?area_code
  • 20. Comparatives Comparatives Scenarios: 1) Comparative used with a numeric datatype property: e.g., “companies with more than 500,000 employees” ?company <dbp:numEmployees> ?employee ?company <dbp:numberOfEmployees> ?employee ?company a <dbo:Company> FILTER (?employee > 500000)
  • 21. Comparatives 2) Comparative is used with a concept: e.g., “places with more than 2 caves” • Generate the same triples for places with caves: ?place a <http://dbpedia.org/ontology/Place>. ?cave a <http://dbpedia.org/ontology/Cave>. ?place ?rel1 ?cave. ?cave ?rel1 ?place. • Add the aggregate restriction: GROUP BY ?place HAVING (COUNT(?cave)>2).
  • 22. Comparatives 3) Comparative is used with an object property e.g., “countries with more than 2 official languages” • Similarly, generate the same triples for country and official language and add the restriction: GROUP BY ?country HAVING (COUNT(?official_language) > 2) 4) Generic Comparatives e.g., “Which mountains are higher than the Nanga Parbat?”
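
The first three comparative scenarios reduce to two kinds of restriction: a FILTER on a numeric value, or a GROUP BY / HAVING on a count. A small sketch of how such clauses might be assembled is shown below; the operator table and function names are assumptions for illustration and are not taken from the slides.

```python
# Hypothetical mapping from comparative phrases to SPARQL operators.
OPERATORS = {
    "more than": ">",
    "less than": "<",
    "at least": ">=",
    "at most": "<=",
}


def numeric_filter(var: str, comparative: str, value: int) -> str:
    """Scenario 1: comparative applied to a numeric datatype property."""
    return f"FILTER ({var} {OPERATORS[comparative]} {value})"


def count_restriction(group_var: str, counted_var: str, comparative: str, value: int) -> str:
    """Scenarios 2 and 3: comparative applied to a concept or an object property."""
    op = OPERATORS[comparative]
    return f"GROUP BY {group_var} HAVING (COUNT({counted_var}) {op} {value})"


employees = numeric_filter("?employee", "more than", 500000)
caves = count_restriction("?place", "?cave", "more than", 2)
languages = count_restriction("?country", "?official_language", "more than", 2)
print(employees)
print(caves)
print(languages)
```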
  • 23. Generic Comparatives • Difficulty: identify the property referred to by the comparative term. 1) Select best relation according to query context. – Identify all numeric datatype properties associated with the concept “mountain”, include: “latS, longD, prominence, firstAscent, elevation, longM, …” 2) Disambiguate synsets of all properties and use WSD approach to identify the most related synset to the query. – property elevation is correctly selected
  • 24. 5. Integration of Triples and Generation of SPARQL Queries • Generated triples integrated to produce SPARQL query. • Query term positions used to order the generated triples. • Triples originating from the same query term are executed in order until an answer is found. • Duplicates are removed while merging the triples. • SELECT and WHERE clauses added in addition to any aggregate restrictions or solution modifiers.
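
The merging step can be sketched as follows, assuming each candidate triple is tagged with the position of the query term it came from. The `build_select` helper and the prefixed-name notation are illustrative choices; trying alternative triples for the same term one by one until an answer is found is not modelled here.

```python
from typing import Iterable, Tuple

# Standard DBpedia namespaces, bound to the prefixes used on the slides.
PREFIXES = (
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
    "PREFIX res: <http://dbpedia.org/resource/>\n"
)


def build_select(target_var: str,
                 positioned_triples: Iterable[Tuple[int, str]],
                 modifiers: str = "") -> str:
    """Order triples by query-term position, drop duplicates, wrap in SELECT/WHERE."""
    seen = set()
    ordered = []
    for _, triple in sorted(positioned_triples, key=lambda pair: pair[0]):
        if triple not in seen:
            seen.add(triple)
            ordered.append(triple)
    body = " .\n  ".join(ordered)
    query = f"{PREFIXES}SELECT {target_var} WHERE {{\n  {body} .\n}}"
    if modifiers:
        # Aggregate restrictions or solution modifiers go after the WHERE block.
        query += "\n" + modifiers
    return query


candidate_triples = [
    (2, "?show dbo:creator res:Walt_Disney"),
    (1, "?show a dbo:TelevisionShow"),
    (2, "?show dbo:creator res:Walt_Disney"),  # duplicate, removed on merge
]
query = build_select("?show", candidate_triples)
print(query)
```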
  • 25. Evaluation • Test data from 2nd Open Challenge at QALD-2. • Results produced by QALD-2 evaluation tool. • Very promising results: 76% of answered questions are correct.
      Approach     Answered  Correct  Precision  Recall  F1
      BELA         31        17       0.62       0.73    0.67
      QAKiS        35        11       0.39       0.37    0.38
      Alexandria   25        5        0.43       0.46    0.45
      SenseAware   54        41       0.51       0.53    0.52
      SemSeK       80        32       0.44       0.48    0.46
      MHE          97        30       0.36       0.4     0.38
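
As a quick arithmetic check on the SenseAware row, the snippet below assumes that F1 is the harmonic mean of the reported precision and recall, and that the 76% figure is the share of answered questions that were correct. Both are assumptions about how the evaluation tool reports its numbers, and rounded inputs may not reproduce every row exactly.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# SenseAware row: 54 answered, 41 correct, P = 0.51, R = 0.53.
sense_aware_f1 = round(f1_score(0.51, 0.53), 2)
correct_share = round(41 / 54 * 100)
print(sense_aware_f1, correct_share)
```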
  • 26. Discussion • Design choices affected by priority for precision or recall: 1. Query Relaxation e.g., “Give me all actors starring in Last Action Hero” – Restricting results to actors harms recall – Not all entities in LD are typed, let alone correctly typed – Query relaxation favors recall but affects precision e.g. “How many films did Leonardo DiCaprio star in?” – Return TV series rather than only films such as res:Parenthood (1990 TV series). • Decision: favor precision; keep restriction when specified.
  • 27. Discussion 2. Best or All Matches e.g., “software by organizations founded in California” – Properties matched: foundation and foundationPlace – Using only the best match (foundation) does not generate all results → affects recall. – Using all properties (which may not be relevant to the query) would harm precision. • Decision: use all matches, with a high value for the similarity threshold; perform checks against the ontology structure to ensure that only relevant matches are used.
  • 28. Discussion 3. Query Expansion • Can be useful for recall, when the query term is not sufficient to return all answers. • Example: use “website” and “homepage” if any of them used in a query and both have matches in the ontology. • Quality of expansion terms influenced by WSD approach; wrong sense identification will lead to noisy list of terms. • Decision: perform query expansion only when no matches found in the ontology for a term; or no results generated using the identified matches.