2. Understand the changes in the database landscape toward more heterogeneous data scenarios, especially the Semantic Web.
Understand how Question Answering (QA) fits into this new scenario.
Get the fundamental pointers to develop your own QA system from the state of the art.
Focus on coverage.
2
3. Motivation & Context
Big Data, Linked Data & QA
Challenges for QA over LD
The Anatomy of a QA System
QA over Linked Data (Case Studies)
Evaluation of QA over Linked Data
Do-it-yourself (DIY): Core Resources
Trends
Take-away Message
4
5. Humans have built-in natural language communication capabilities.
A very natural way for humans to communicate information needs.
The archetypal AI system.
6
6. A research field in its own right.
Empirical bias: Focus on the development and
evaluation of approaches and systems to
answer questions over a knowledge base.
Multidisciplinary:
◦ Natural Language Processing
◦ Information Retrieval
◦ Knowledge Representation
◦ Databases
◦ Linguistics
◦ Artificial Intelligence
◦ Software Engineering
◦ ...
7
8. From the QA expert perspective
◦ QA depends on mastering different semantic computing
techniques.
9
9. Keyword Search:
◦ The user still carries most of the effort in interpreting the data.
◦ Satisfying an information need may require multiple search operations.
◦ Answer-driven information access.
◦ Input: keyword search
Typically the specification of simpler information needs.
◦ Output: documents, structured data.
QA:
◦ Delegates more ‘interpretation effort’ to the machine.
◦ Query-driven information access.
◦ Input: natural language query
Specification of complex information needs.
◦ Output: direct answer.
10
10. Structured Queries:
◦ A priori user effort in understanding the schemas behind
databases.
◦ Effort in mastering the syntax of a query language.
◦ Satisfying information needs may depend on multiple
querying operations.
◦ Input: Structured query
◦ Output: data records, aggregations, etc.
QA:
◦ Delegates more ‘semantic interpretation effort’ to the
machine.
◦ Input: natural language query
◦ Output: direct natural language answer
11
11. Keyword search:
◦ Simple information needs.
◦ Predictable search behavior.
◦ Vocabulary redundancy (large document collections,
Web).
Structured queries:
◦ Demand for absolute precision/recall guarantees.
◦ Small & centralized schemas.
◦ More data volume/less semantic heterogeneity.
QA:
◦ Specification of complex information needs.
◦ Heterogeneous and schema-less data.
◦ More automated semantic interpretation.
12
17. QA is usually associated with the delegation of
more of the ‘interpretation effort’ to the
machines.
QA supports the specification of more complex
information needs.
QA, Information Retrieval and Databases are
complementary perspectives.
18
19. Big Data: More complete data-based picture
of the world.
20
20. Heterogeneous, complex and large-scale databases.
Very-large and dynamic “schemas”.
21
(Figure: schema sizes grew from 10s-100s of attributes circa 2000 to 1,000s-1,000,000s of attributes circa 2013.)
21. Multiple perspectives (conceptualizations) of the reality.
Ambiguity, vagueness, inconsistency.
22
22. Franklin et al. (2005): From Databases to
Dataspaces.
Helland (2011): If You Have Too Much Data,
then “Good Enough” Is Good Enough.
Fundamental trends:
◦ Co-existence of heterogeneous data.
◦ Semantic best-effort behavior.
◦ Pay-as-you-go data integration.
◦ Co-existing query/search services.
23
25. Decentralization of content generation.
26
Eifrem, A NOSQL Overview And The Benefits Of Graph Database (2009)
26. Volume
Velocity
Variety
◦ Variety is the most interesting but usually neglected dimension.
27
27. The Semantic Web Vision:
Software which is able to understand meaning (intelligent, flexible).
QA as a way to support this vision.
28
28. From which university did the wife of
Barack Obama graduate?
With Linked Data we are still in the DB world
29
29. With Linked Data we are still in the DB world
(but slightly worse)
30
30. Addresses practical problems of data
accessibility in a data heterogeneity
scenario.
A fundamental part of the original
Semantic Web vision.
31
33. Example: What is the currency of the Czech Republic?
SELECT DISTINCT ?uri WHERE {
res:Czech_Republic dbo:currency ?uri .
}
Main challenges:
Mapping natural language expressions to
vocabulary elements (accounting for lexical and
structural differences).
Handling meaning variations (e.g. ambiguous or
vague expressions, anaphoric expressions).
34
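A minimal sketch (not from the original slides) of how such a gold query can be run against the public DBpedia SPARQL endpoint, assuming the Python SPARQLWrapper package is installed:

# Run the gold query for "What is the currency of the Czech Republic?" against DBpedia.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX res: <http://dbpedia.org/resource/>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT DISTINCT ?uri WHERE { res:Czech_Republic dbo:currency ?uri . }
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["uri"]["value"])   # expected: http://dbpedia.org/resource/Czech_koruna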
34. URIs are language independent identifiers.
Their only actual connection to natural language is by
the labels that are attached to them.
dbo:spouse rdfs:label “spouse”@en , “echtgenoot”@nl .
Labels, however, do not capture lexical variation:
wife of
husband of
married to
...
35
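A toy sketch of one common workaround (the phrase entries below are hypothetical, hand-written examples): a small lexicon that maps surface phrases to the vocabulary element they lexicalize, which is exactly the lexical variation that labels alone do not capture.

# Hypothetical relation lexicon: surface phrases -> vocabulary element.
RELATION_LEXICON = {
    "wife of":    "dbo:spouse",
    "husband of": "dbo:spouse",
    "married to": "dbo:spouse",
    "spouse":     "dbo:spouse",
}

def map_phrase(phrase):
    """Return the vocabulary element for a normalized natural language phrase, if known."""
    return RELATION_LEXICON.get(phrase.lower().strip())

print(map_phrase("wife of"))   # -> dbo:spouse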
35. Which Greek cities have more than 1 million inhabitants?
SELECT DISTINCT ?uri
WHERE {
?uri rdf:type dbo:City .
?uri dbo:country res:Greece .
?uri dbo:populationTotal ?p .
FILTER (?p > 1000000)
}
36
36. Often the conceptual granularity of language does
not coincide with that of the data schema.
When did Germany join the EU?
SELECT DISTINCT ?date
WHERE {
res:Germany dbp:accessioneudate ?date .
}
Who are the grandchildren of Bruce Lee?
SELECT DISTINCT ?uri
WHERE {
res:Bruce_Lee dbo:child ?c .
?c dbo:child ?uri .
}
37
37. In addition, there are expressions with a fixed,
dataset-independent meaning.
Who produced the most films?
SELECT DISTINCT ?uri
WHERE {
?x rdf:type dbo:Film .
?x dbo:producer ?uri .
}
GROUP BY ?uri
ORDER BY DESC(COUNT(?x))
OFFSET 0 LIMIT 1
38
38. Different datasets usually follow different schemas,
thus provide different ways of answering an
information need.
Example:
39
39. The meaning of expressions like the verbs to be, to
have, and prepositions of, with, etc. strongly
depends on the linguistic context.
Which museum has the most paintings?
?museum dbo:exhibits ?painting .
Which country has the most caves?
?cave dbo:location ?country .
40
40. The number of non-English actors on the web is
growing substantially.
◦ Accessing data.
◦ Creating and publishing data.
Semantic Web:
In principle very well suited for multilinguality, as URIs are
language-independent.
But adding multilingual labels is not common practice (less
than a quarter of the RDF literals have language tags, and
most of those tags are in English).
41
41. Requirement: Completeness and accuracy
(Wrong answers are worse than no answers)
In the context of the Semantic Web:
QA systems need to deal with heterogeneous
and imperfect data.
◦ Datasets are often incomplete.
◦ Different datasets sometimes contain duplicate information,
often using different vocabularies even when talking about
the same things.
◦ Datasets can also contain conflicting information and
inconsistencies.
42
42. Data is distributed among a large collection
of interconnected datasets.
Example: What are side effects of drugs used for the
treatment of Tuberculosis?
SELECT DISTINCT ?x
WHERE {
disease:1154 diseasome:possibleDrug ?d1.
?d1 a drugbank:drugs .
?d1 owl:sameAs ?d2.
?d2 sider:sideEffect ?x.
}
43
43. Requirement: Real-time answers, i.e. low
processing time.
In the context of the Semantic Web:
Datasets are huge.
◦ There are a lot of distributed datasets that might be
relevant for answering the question.
◦ Reported performance of current QA systems
amounts to ~20-30 seconds per question (on one
dataset).
44
44. Bridge the gap between natural languages and
data.
Deal with incomplete, noisy and heterogeneous
datasets.
Scale to a large number of huge datasets.
Use distributed and interlinked datasets.
Integrate structured and unstructured data.
Low maintenance cost (easily adaptable to new datasets and domains).
45
46. Natural Language Interfaces (NLI)
◦ Input: Natural language queries
◦ Output: Either processed or unprocessed answers
Processed: Direct answers (QA).
Unprocessed: Database records, text snippets,
documents.
47
(Diagram: QA as a subset of NLI.)
47. Categorization of questions and answers.
Important for:
◦ Understanding the challenges before attacking the
problem.
◦ Scoping the QA system.
Based on:
◦ Chin-Yew Lin: Question Answering.
◦ Farah Benamara: Question Answering Systems: State of
the Art and Future Directions.
48
48. The part of the question that says what is
being asked:
◦ Wh-words:
who, what, which, when, where, why, and how
◦ Wh-words + nouns, adjectives or adverbs:
“which party …”, “which actress …”, “how long …”, “how tall
…”.
49
49. Useful for distinguishing different processing
strategies
◦ FACTOID:
PREDICATIVE QUESTIONS:
“Who was the first man in space?”
“What is the highest mountain in Korea?”
“How far is Earth from Mars?”
“When did the Jurassic Period end?”
“Where is Taj Mahal?”
LIST:
“Give me all cities in Germany.”
SUPERLATIVE:
“What is the highest mountain?”
YES-NO:
“Was Margaret Thatcher a chemist?”
50
50. Useful for distinguishing different processing
strategies
◦ OPINION:
“What do most Americans think of gun control?”
◦ CAUSE & EFFECT:
“What is the most frequent cause for lung cancer?”
◦ PROCESS:
“How do I make a cheese cake?”
◦ EXPLANATION & JUSTIFICATION:
“Why did the revenue of IBM drop?”
◦ ASSOCIATION QUESTION:
“What is the connection between Barack Obama and
Indonesia?”
◦ EVALUATIVE OR COMPARATIVE QUESTIONS:
“What is the difference between impressionism and
expressionism?”
51
51. The class of object sought by the question:
Abbreviation
Entity: event, color, animal, plant, ...
Description, Explanation & Justification: definition, manner, reason, ... (“How, why …”)
Human: group, individual, ... (“Who …”)
Location: city, country, mountain, ... (“Where …”)
Numeric: count, distance, size, ... (“How many, how far, how long …”)
Temporal: date, time, ... (“When …”)
52. Question focus is the property or entity that is
being sought by the question
◦ “In which city was Barack Obama born?”
◦ “What is the population of Galway?”
Question topic: what the question is generally about:
◦ “What is the height of Mount Everest?”
(geography, mountains)
◦ “Which organ is affected by the Meniere’s disease?”
(medicine)
53
53. Structure level:
◦ Structured data (databases).
◦ Semi-structured data (e.g. XML).
◦ Unstructured data.
Data source:
◦ Single dataset (centralized).
◦ Enumerated list of multiple, distributed datasets.
◦ Web-scale.
54
54. Domain Scope:
◦ Open Domain
◦ Domain specific
Data Type:
◦ Text
◦ Image
◦ Sound
◦ Video
Multi-modal QA
55
55. Long answers
◦ Definition/justification based.
Short answers
◦ Phrases.
Exact answers
◦ Named entities, numbers, aggregate, yes/no.
56
56. Relevance: The degree to which the answer addresses the user’s information need.
Correctness: The degree to which the answer is factually correct.
Conciseness: The answer should not contain irrelevant information.
Completeness: The answer should be complete.
Simplicity: The answer should be easy to interpret.
Justification: Sufficient context should be provided to support the data consumer in determining the correctness of the answer.
57
57. Right: The answer is correct and complete.
Inexact: The answer is incomplete or incorrect.
Unsupported: The answer does not have an
appropriate evidence/justification.
Wrong: The answer is not appropriate for the
question.
58
58. Simple Extraction: Direct extraction of snippets
from the original document(s) / data records.
Combination: Combines excerpts from multiple
sentences, documents / multiple data records,
databases.
Summarization: Synthesis from large texts /
data collections.
Operational/functional: Depends on the
application of functional operators.
Reasoning: Depends on the application of an
inference process over the original data.
59
59. Semantic Tractability (Popescu et al., 2003):
Vocabulary overlap/distance between the query
and the answer.
Answer Locality (Webber et al., 2003): Whether answer fragments are distributed across different document fragments/documents or datasets/dataset records.
Derivability (Webber et al., 2003): Whether the answer is explicit or implicit; the level of reasoning required.
Semantic Complexity: Level of ambiguity and
discourse/data heterogeneity.
60
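As a rough illustration (not Popescu et al.’s exact formulation), semantic tractability can be approximated by the vocabulary overlap between the question and a candidate answer’s context, e.g. as a Jaccard score:

# Simple Jaccard overlap between question vocabulary and answer-context vocabulary.
def vocabulary_overlap(question, answer_context):
    q = set(question.lower().split())
    a = set(answer_context.lower().split())
    return len(q & a) / len(q | a) if q | a else 0.0

print(vocabulary_overlap("who is the wife of Barack Obama",
                         "Michelle Obama is the wife of Barack Obama"))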
61. Data pre-processing: Pre-processes the database data (includes
indexing, data cleaning, feature extraction).
Question Analysis: Performs syntactic analysis and detects/extracts the core features of the question (NER, answer type, etc.).
Data Matching: Matches terms in the question to entities in the data.
Query Construction: Generates structured query candidates considering the question-data mappings and the syntactic constraints in the query and in the database.
Scoring: Data matching and the query construction components
output several candidates that need to be scored and ranked
according to certain criteria.
Answer Retrieval & Extraction: Executes the query and extracts the
natural language answer from the result set.
62
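A skeleton of this generic pipeline in Python; the component names follow the slide, while the signatures and data structures are purely illustrative stubs:

def preprocess(dataset):
    """Indexing, data cleaning, feature extraction (stub)."""
    ...

def analyze_question(question):
    """Syntactic analysis, NER, answer type detection (stub)."""
    ...

def match_data(features, index):
    """Map question terms to entities/properties in the data (stub)."""
    ...

def construct_queries(features, mappings):
    """Build candidate structured queries (stub)."""
    ...

def score(candidates):
    """Score and rank candidate queries (stub)."""
    ...

def retrieve_answer(best_query, dataset):
    """Execute the query and extract the natural language answer (stub)."""
    ...

def answer(question, dataset, index):
    features   = analyze_question(question)
    mappings   = match_data(features, index)
    candidates = construct_queries(features, mappings)
    ranked     = score(candidates)
    return retrieve_answer(ranked[0], dataset) if ranked else None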
66. Key contributions:
◦ Using ontologies to interpret user questions
◦ Relies on a deep linguistic analysis that returns
semantic representations aligned to the
ontology vocabulary and structure
Evaluation: Geobase
68
67.
Ontology-based QA:
◦ Ontologies play a central role in interpreting user questions
◦ Output is a meaning representation that is aligned to the
ontology underlying the dataset that is queried
◦ ontological knowledge is used for drawing inferences, e.g. for
resolving ambiguities
Grammar-based QA:
◦ Rely on linguistic grammars that assign a syntactic and
semantic representation to lexical units
◦ Advantage: can deal with questions of arbitrary complexity
◦ Drawback: brittleness (fail if question cannot be parsed
because expressions or constructs are not covered by the
grammar)
68.
Ontology-independent entries
◦ mostly function words
quantifiers (some, every, two)
wh-words (who, when, where, which, how many)
negation (not)
◦ manually specified and re-usable for all domains
Ontology-specific entries
◦ content words and phrases corresponding to
concepts and properties in the ontology
◦ automatically generated from an ontology lexicon
69. Aim: capture rich and structured linguistic
information about how ontology elements are
lexicalized in a particular language
lemon (Lexicon Model for Ontologies)
http://lemon-model.net
◦ meta-model for describing ontology lexica with
RDF
◦ declarative (abstracting from specific syntactic
and semantic theories)
◦ separation of lexicon and ontology
71
70. Semantics by reference:
◦ The meaning of lexical entries is specified by
pointing to elements in the ontology.
Example:
72
72. Which cities have more than three universities?
SELECT DISTINCT ?x WHERE {
?x rdf:type dbo:City .
?y rdf:type dbo:University .
?y dbo:city ?x .
}
GROUP BY ?x
HAVING (COUNT(?y) > 3)
74
74. Key contributions:
◦ Constructs a query template that directly
mirrors the linguistic structure of the question
◦ Instantiates the template by matching natural
language expressions with ontology concepts
Evaluation: QALD 2012
76
75. In order to understand a user question, we
need to understand:
The words (dataset-specific)
Abraham Lincoln → res:Abraham_Lincoln
died in → dbo:deathPlace
The semantic structure (dataset-independent)
who → SELECT ?x WHERE { … }
the most N → ORDER BY DESC(COUNT(?N)) LIMIT 1
more than i N → HAVING COUNT(?N) > i
77
76. Goal: An approach that combines both an
analysis of the semantic structure and a mapping
of words to URIs.
Two-step approach:
◦ 1. Template generation
Parse question to produce a SPARQL template that directly
mirrors the structure of the question, including filters and
aggregation operations.
◦ 2. Template instantiation
Instantiate SPARQL template by matching natural language
expressions with ontology concepts using statistical entity
identification and predicate detection.
78
77. SPARQL template:
SELECT DISTINCT ?x WHERE {
?y rdf:type ?c .
?y ?p ?x .
}
ORDER BY DESC(COUNT(?y))
OFFSET 0 LIMIT 1
?c CLASS [films]
?p PROPERTY [produced]
Instantiations:
?c = <http://dbpedia.org/ontology/Film>
?p = <http://dbpedia.org/ontology/producer>
79
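A minimal sketch of the instantiation step (not the system’s actual code): the template is treated as a string with slots for ?c and ?p, which are filled with the top-ranked URI candidates; slot markers and the choice of candidates are simplifying assumptions.

# Fill the CLASS and PROPERTY slots of the template with the highest-ranked URIs.
TEMPLATE = """SELECT DISTINCT ?x WHERE {
  ?y rdf:type %(c)s .
  ?y %(p)s ?x .
}
ORDER BY DESC(COUNT(?y))
OFFSET 0 LIMIT 1"""

slots = {
    "c": "<http://dbpedia.org/ontology/Film>",      # best match for the CLASS slot [films]
    "p": "<http://dbpedia.org/ontology/producer>",  # best match for the PROPERTY slot [produced]
}
print(TEMPLATE % slots)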
80. 1. Natural language question is tagged with part-of-speech
information.
2. Based on POS tags, grammar entries are built on the fly.
◦ Grammar entries are pairs of:
tree structures (Lexicalized Tree Adjoining Grammar)
semantic representations (ext. Discourse Representation
Structures)
3. These lexical entries, together with domain-independent
lexical entries, are used for parsing the question (cf. Pythia).
4. The resulting semantic representation is translated into a
SPARQL template.
82
81. Domain-independent: who, the most
Domain-dependent: produced/VBD, films/NNS
SPARQL template 1:
SELECT DISTINCT ?x WHERE {
?x ?p ?y .
?y rdf:type ?c .
}
ORDER BY DESC(COUNT(?y)) LIMIT 1
?c CLASS [films]
?p PROPERTY [produced]
SPARQL template 2:
SELECT DISTINCT ?x WHERE {
?x ?p ?y .
}
ORDER BY DESC(COUNT(?y)) LIMIT 1
?p PROPERTY [films]
83
83. 1. For resources and classes, a generic approach to
entity detection is applied:
◦ Identify synonyms of the label using WordNet.
◦ Retrieve entities with a label similar to the slot label based
on string similarities (trigram, Levenshtein and substring
similarity).
2. For property labels, the label is additionally
compared to natural language expressions stored in
the BOA pattern library.
3. The highest ranking entities are returned as
candidates for filling the query slots.
85
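A simplified sketch of this matching step using NLTK’s WordNet interface and a trigram similarity (Levenshtein and substring similarity would be combined analogously); the candidate labels below are made up:

# Expand a slot label with WordNet synonyms and score candidate labels by trigram overlap.
from nltk.corpus import wordnet as wn   # assumes NLTK with the WordNet corpus installed

def synonyms(word):
    return {lemma.name().replace("_", " ") for syn in wn.synsets(word) for lemma in syn.lemmas()}

def trigrams(s):
    s = "  " + s.lower() + "  "
    return {s[i:i + 3] for i in range(len(s) - 2)}

def trigram_similarity(a, b):
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

def best_score(slot_label, candidate_label):
    terms = {slot_label} | synonyms(slot_label)
    return max(trigram_similarity(t, candidate_label) for t in terms)

for label in ["Film", "FilmFestival", "Filmography"]:   # hypothetical class labels
    print(label, round(best_score("film", label), 2))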
86. 1. Every entity receives a score considering
string similarity and prominence.
2. The score of a query is then computed as
the average of the scores of the entities used
to fill its slots.
3. In addition, type checks are performed:
◦ For all triples ?x rdf:type <class>, all query triples ?x p e
and e p ?x are checked w.r.t. whether domain/range of p is
consistent with <class>.
4. Of the remaining queries, the one with
highest score that returns a result is chosen.
88
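A minimal sketch of the scoring rule; the individual slot scores are made-up illustrations, chosen so that the averages reproduce the 0.76 / 0.60 query scores shown on the later example slide.

# Query score = average of the scores of the entities filling its slots.
def query_score(slot_scores):
    return sum(slot_scores.values()) / len(slot_scores)

print(query_score({"?c": 0.82, "?p": 0.70}))   # e.g. dbo:Film + dbo:producer -> 0.76
print(query_score({"?c": 0.50, "?p": 0.70}))   # e.g. dbo:FilmFestival + dbo:producer -> 0.60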
87. SELECT DISTINCT ?x WHERE {
?x <http://dbpedia.org/ontology/producer> ?y .
?y rdf:type <http://dbpedia.org/ontology/Film> .
}
ORDER BY DESC(COUNT(?y)) LIMIT 1
Score: 0.76
SELECT DISTINCT ?x WHERE {
?x <http://dbpedia.org/ontology/producer> ?y .
?y rdf:type <http://dbpedia.org/ontology/FilmFestival>.
}
ORDER BY DESC(COUNT(?y)) LIMIT 1
Score: 0.60
89
88. The created template structure does not
always coincide with how the data is actually
modelled.
Considering all possibilities of how the data could be modelled leads to a large number of templates (and even more queries) for a single question.
90
92. Two words are strongly similar if any of the
following holds:
◦ 1. They have a synset in common (e.g. “human” and
“person”)
◦ 2. A word is a hypernym/hyponym in the taxonomy of
the other word.
◦ 3. If there exists an allowable “is-a” path connecting a
synset associated with each word.
◦ 4. If any of the previous cases holds and the definition (gloss) of one of the synsets of the word (or of its direct hypernyms/hyponyms) includes the other word as one of its synonyms, the two words are said to be highly similar.
94
Lopez et al. 2006
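A sketch of the first two conditions using NLTK’s WordNet interface (the “is-a path” and gloss conditions are omitted for brevity); this is an illustration of the idea, not the cited system’s code.

from nltk.corpus import wordnet as wn

def strongly_similar(w1, w2):
    s1, s2 = set(wn.synsets(w1)), set(wn.synsets(w2))
    if s1 & s2:                                   # condition 1: a shared synset
        return True
    for syn in s1:                                # condition 2: direct hypernym/hyponym
        if (set(syn.hypernyms()) | set(syn.hyponyms())) & s2:
            return True
    return False

print(strongly_similar("car", "automobile"))      # True: "car" and "automobile" share a synset
print(strongly_similar("dog", "poodle"))          # True: "poodle" is a hyponym of "dog"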
101. • Most semantic models have dealt with particular types of
constructions, and have been carried out under very
simplifying assumptions, in true lab conditions.
• If these idealizations are removed it is not clear at all that
modern semantics can give a full account of all but the
simplest models/statements.
Sahlgren, 2013; Baroni et al. 2013
(Figure: Formal World vs. Real World.)
103
102. “Words occurring in similar (linguistic) contexts
are semantically related.”
If we can equate meaning with context, we can
simply record the contexts in which a word occurs
in a collection of texts (a corpus).
This can then be used as a surrogate of its
semantic representation.
104
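A toy sketch of this idea: collect each word’s neighbouring words in a small corpus and compare words by the cosine of their context vectors (corpus and window size are arbitrary choices for illustration).

from collections import Counter
from math import sqrt

corpus = [
    "the daughter of the president visited the school",
    "the child of the president visited the school",
    "the museum exhibits a famous painting",
]

def context_vector(word, window=2):
    vec = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        for i, t in enumerate(tokens):
            if t == word:
                for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                    if j != i:
                        vec[tokens[j]] += 1
    return vec

def cosine(v1, v2):
    dot = sum(v1[k] * v2[k] for k in v1)
    norm = sqrt(sum(c * c for c in v1.values())) * sqrt(sum(c * c for c in v2.values()))
    return dot / norm if norm else 0.0

print(cosine(context_vector("daughter"), context_vector("child")))     # high: shared contexts
print(cosine(context_vector("daughter"), context_vector("painting")))  # low: no shared contexts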
112. Minimize the impact of Ambiguity, Vagueness,
Synonymy.
Address the simplest matchings first (heuristics).
Semantic Relatedness as a primitive operation.
Distributional semantics as commonsense
knowledge.
114
113. Transform natural language queries into
triple patterns.
“Who is the daughter of Bill Clinton married to?”
115
118. Step 5: Determine Partial Ordered Dependency
Structure (PODS)
◦ Rules based.
Remove stop words.
Merge words into entities.
Reorder structure from core entity position.
PODS: Bill Clinton (INSTANCE) → daughter → married to → Person (ANSWER TYPE / QUESTION FOCUS)
120
119. Step 5: Determine Partial Ordered Dependency
Structure (PODS)
◦ Rules based.
Remove stop words.
Merge words into entities.
Reorder structure from core entity position.
Query features: Bill Clinton (INSTANCE) → daughter (PREDICATE) → married to (PREDICATE) → Person
121
120. Map query features into a query plan.
A query plan contains a sequence of:
◦ Search operations.
◦ Navigation operations.
Query features: (INSTANCE) (PREDICATE) (PREDICATE)
Query plan:
(1) INSTANCE SEARCH (Bill Clinton)
(2) DISAMBIGUATE ENTITY TYPE
(3) GENERATE ENTITY FACETS
(4) p1 <- SEARCH RELATED PREDICATE (Bill Clinton, daughter)
(5) e1 <- GET ASSOCIATED ENTITIES (Bill Clinton, p1)
(6) p2 <- SEARCH RELATED PREDICATE (e1, married to)
(7) e2 <- GET ASSOCIATED ENTITIES (e1, p2)
(8) POST PROCESS (Bill Clinton, e1, p1, e2, p2)
122
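One way to picture this (an illustrative sketch, not the system’s actual code): the query plan is plain data, an ordered list of operations, executed by a small interpreter over the graph.

# The plan as data; operation names follow the slide, the interpreter is illustrative.
plan = [
    ("INSTANCE_SEARCH",          ("Bill Clinton",)),
    ("DISAMBIGUATE_ENTITY_TYPE", ()),
    ("GENERATE_ENTITY_FACETS",   ()),
    ("SEARCH_RELATED_PREDICATE", ("pivot", "daughter")),   # result bound to p1
    ("GET_ASSOCIATED_ENTITIES",  ("pivot", "p1")),         # result bound to e1
    ("SEARCH_RELATED_PREDICATE", ("e1", "married to")),    # result bound to p2
    ("GET_ASSOCIATED_ENTITIES",  ("e1", "p2")),            # result bound to e2
    ("POST_PROCESS",             ("pivot", "e1", "p1", "e2", "p2")),
]

def execute(plan, operations, graph):
    # operations: dict mapping an operation name to a callable over the graph and the
    # intermediate state; each step stores its result under its step index.
    state = {}
    for step, (name, args) in enumerate(plan, start=1):
        state[step] = operations[name](graph, state, *args)
    return state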
121. Query: Bill Clinton → daughter → married to → Person
Linked Data: entity search matches the query term “Bill Clinton” to :Bill_Clinton.
123
122. Query: Bill Clinton → daughter → married to → Person
Linked Data: :Bill_Clinton (PIVOT ENTITY), with associated triples:
:Bill_Clinton :child :Chelsea_Clinton .
:Bill_Clinton :religion :Baptists .
:Bill_Clinton :almaMater :Yale_Law_School .
...
124
123. Query: Bill Clinton → daughter → married to → Person
Which properties of the pivot entity are semantically related to ‘daughter’?
sem_rel(daughter, child) = 0.054
sem_rel(daughter, religion) = 0.004
sem_rel(daughter, alma mater) = 0.001
125
124. Query: Bill Clinton → daughter → married to → Person
Linked Data: the best-matching property is selected: :Bill_Clinton :child :Chelsea_Clinton .
126
125. Computation of a measure of “semantic
proximity” between two terms.
Allows a semantic approximate matching
between query terms and dataset terms.
It supports a commonsense reasoning-like
behavior based on the knowledge embedded
in the corpus.
127
126. Use distributional semantics to semantically match
query terms to predicates and classes.
Distributional principle: Words that co-occur
together tend to have related meaning.
◦ Allows the creation of a comprehensive semantic model
from unstructured text.
◦ Based on statistical patterns over large amounts of text.
◦ No human annotations.
Distributional semantics can be used to compute a
semantic relatedness measure between two words.
128
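A minimal sketch of this matching step: rank the properties attached to the pivot entity by their relatedness to the query term, where sem_rel can be any distributional relatedness function (e.g. the cosine over context vectors sketched earlier).

def best_predicate(query_term, candidate_properties, sem_rel):
    # candidate_properties: list of (property URI, property label) pairs
    scored = [(prop, sem_rel(query_term, label)) for prop, label in candidate_properties]
    return max(scored, key=lambda pair: pair[1])

candidates = [(":child", "child"), (":religion", "religion"), (":almaMater", "alma mater")]
# e.g. best_predicate("daughter", candidates, sem_rel) -> (":child", 0.054) as on the slide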
127. Query: Bill Clinton → daughter → married to → Person
Linked Data: :Chelsea_Clinton (reached via :child) becomes the new PIVOT ENTITY.
129
128. Query: Bill Clinton → daughter → married to → Person
Linked Data: from :Chelsea_Clinton, ‘married to’ is matched to :spouse, yielding :Mark_Mezvinsky.
130
140. Semantic approximation in databases (as in
any IR system): semantic best-effort.
Need some level of user disambiguation,
refinement and feedback.
As we move in the direction of semantic
systems we should expect the need for
principled dialog mechanisms (like in human
communication).
Pull the user interaction back into the system.
144
144. Key contributions:
◦ Evidence-based QA system.
◦ Complex and high performance QA Pipeline.
◦ The system uses more than 50 scoring
components that produce scores which range
from probabilities and counts to categorical
features.
◦ Major cultural impact:
Before: QA as AI vision, academic exercise.
After: QA as an attainable software architecture in
the short term.
Evaluation: Jeopardy! Challenge
148
146. Question analysis: includes shallow and deep
parsing, extraction of logical forms, semantic role
labelling, coreference resolution, relations
extraction, named entity recognition, among
others.
Question decomposition: decomposition of the
question into separate phrases, which will generate
constraints that need to be satisfied by evidence
from the data.
150
148. Hypothesis generation:
◦ Primary search
Document and passage retrieval
SPARQL queries are used over triple stores.
◦ Candidate answer generation (maximizing recall).
Information extraction techniques are applied to the
search results to generate candidate answers.
Soft filtering: Application of lightweight (less
resource intensive) scoring algorithms to a larger
set of initial candidates to prune the list of
candidates before the more intensive scoring
components.
152
150. Hypothesis and evidence scoring:
◦ Supporting evidence retrieval: seeks additional evidence for each candidate answer from the data sources.
◦ Deep evidence scoring: determines the degree of certainty that the retrieved evidence supports the candidate answers; the scores are then combined into an overall evidence profile which groups individual features into aggregate evidence dimensions.
154
152. Answer merging: is a step that merges answer
candidates (hypotheses) with different surface
forms but with related content, combining their
scores.
Ranking and confidence estimation: ranks the
hypotheses and estimates their confidence based on
the scores, using machine learning approaches
over a training set. Multiple trained models cover
different question types.
156
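An illustrative sketch (not IBM’s implementation) of the idea of learning to combine evidence scores into a confidence value, here with a logistic regression over made-up features:

from sklearn.linear_model import LogisticRegression

# Each row: evidence scores for one candidate answer (e.g. passage support, type match, ...).
X_train = [[0.9, 0.8], [0.2, 0.1], [0.7, 0.9], [0.1, 0.3]]
y_train = [1, 0, 1, 0]                        # 1 = correct answer, 0 = incorrect

model = LogisticRegression()
model.fit(X_train, y_train)
print(model.predict_proba([[0.8, 0.7]])[0][1])   # confidence that this candidate is correct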
153. (Figure: the DeepQA architecture — question & topic analysis, question decomposition, primary search, candidate answer generation, hypothesis generation, evidence retrieval, deep evidence scoring, hypothesis and evidence scoring, synthesis, and final confidence merging & ranking; learned models help combine and weigh the evidence. Ferrucci et al. 2010; Farrell, 2011.)
157
www.medbiq.org/sites/default/files/presentations/2011/Farrell.pp
154. (Figure: DeepQA scale — one question yields multiple interpretations, 100s of possible answers, 1000s of pieces of evidence from 100s of sources, and 100,000s of scores from many simultaneous text analysis algorithms, merged into a final answer & confidence. UIMA is used for interoperability, UIMA-AS for scale-out and speed. Ferrucci et al. 2010; Farrell, 2011.)
158
www.medbiq.org/sites/default/files/presentations/2011/Farrell.pp
158. Measures how complete the answer set is.
The fraction of relevant instances that are
retrieved.
Which are the Jovian planets in the Solar System?
◦ Returned Answers:
Mercury
Jupiter
Saturn
Gold-standard:
– Jupiter
– Saturn
– Neptune
– Uranus
Recall = 2/4 = 0.5
162
159. Measures how accurate the answer set is.
The fraction of retrieved instances that are
relevant.
Which are the Jovian planets in the Solar System?
◦ Returned Answers:
Mercury
Jupiter
Saturn
Gold-standard:
– Jupiter
– Saturn
– Neptune
– Uranus
Precision = 2/3 ≈ 0.67
163
160. Measures the ranking quality.
The Reciprocal Rank of a query is 1/r, where r is the rank at which the system returns the first relevant result.
Which are the Jovian planets in the Solar System?
Returned Answers:
– Mercury
– Jupiter
– Saturn
Gold-standard:
– Jupiter
– Saturn
– Neptune
– Uranus
rr = 1/2 = 0.5
164
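The three measures on the Jovian-planets example from these slides, as a small Python sketch:

def precision(returned, gold):
    return len(set(returned) & set(gold)) / len(returned)

def recall(returned, gold):
    return len(set(returned) & set(gold)) / len(gold)

def reciprocal_rank(returned, gold):
    for rank, answer in enumerate(returned, start=1):
        if answer in gold:
            return 1.0 / rank
    return 0.0

returned = ["Mercury", "Jupiter", "Saturn"]
gold = ["Jupiter", "Saturn", "Neptune", "Uranus"]
print(recall(returned, gold))            # 0.5
print(precision(returned, gold))         # 0.666...
print(reciprocal_rank(returned, gold))   # 0.5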
161. Query execution time
Indexing time
Index size
Dataset adaptation effort (Indexing time)
Semantic enrichment/disambiguation
◦ # of operations/time
162. Question Answering over Linked Data
(QALD-CLEF)
INEX Linked Data Track
BioASQ
SemSearch
168
164. QALD is a series of evaluation campaigns on
question answering over linked data.
◦ QALD-1 (ESWC 2011)
◦ QALD-2 as part of the workshop
(Interacting with Linked Data (ESWC 2012))
◦ QALD-3 (CLEF 2013)
◦ QALD-4 (CLEF 2014)
It is aimed at all kinds of systems that mediate
between a user, expressing his or her
information need in natural language, and
semantic data.
170
165. QALD-4 is part of the Question Answering
track at CLEF 2014:
http://nlp.uned.es/clef-qa/
Tasks:
◦ 1. Multilingual question answering over DBpedia
◦ 2. Biomedical question answering on interlinked
data
◦ 3. Hybrid question answering
171
166. Task:
Given a natural language question or keywords,
either retrieve the correct answer(s) from a given
RDF repository, or provide a SPARQL query that
retrieves these answer(s).
◦ Dataset: DBpedia 3.9 (with multilingual labels)
◦ Questions: 200 training + 50 test
◦ Seven languages:
English, Spanish, German, Italian, French, Dutch,
Romanian
172
167. <question id = "36" answertype = "resource"
aggregation = "false"
onlydbo = "true" >
Through which countries does the Yenisei river flow?
Durch welche Länder fließt der Yenisei?
¿Por qué países fluye el río Yenisei?
...
PREFIX res: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?uri WHERE {
res:Yenisei_River dbo:country ?uri .
}
173
168. Datasets: SIDER, Diseasome, Drugbank
Questions: 25 training + 25 test
require integration of information from different
datasets
Example: What are the side effects of drugs used for Tuberculosis?
SELECT DISTINCT ?x WHERE {
disease:1154 diseasome:possibleDrug ?v2 .
?v2 a drugbank:drugs .
?v3 owl:sameAs ?v2 .
?v3 sider:sideEffect ?x .
}
174
169. Dataset: DBpedia 3.9 (with English abstracts)
Questions: 25 training + 10 test
require both structured data and free text from the
abstract to be answered
Example: Give me the currencies of all G8 countries.
SELECT DISTINCT ?uri WHERE {
?x text:"member of" text:"G8" .
?x dbo:currency ?uri .
}
175
170. Focuses on the combination of textual and structured data.
Datasets:
◦ English Wikipedia (MediaWiki XML Format)
◦ DBpedia 3.8 & YAGO2 (RDF)
◦ Links among the Wikipedia, DBpedia 3.8, and YAGO2 URIs.
Tasks:
◦ Ad-hoc Task: return a ranked list of results in response to a search
topic that is formulated as a keyword query (144 search topics).
◦ Jeopardy Task: Investigate retrieval techniques over a set of natural-language Jeopardy clues (105 search topics – 74 (2012) + 31 (2013)).
https://inex.mmci.uni-saarland.de/tracks/lod/
181
172. Focuses on entity search over Linked Datasets.
Datasets:
◦ Sample of Linked Data crawled from publicly available
sources (based on the Billion Triple Challenge 2009).
Tasks:
◦ Entity Search: Queries that refer to one particular entity. A tiny sample of the Yahoo! Search query log.
◦ List Search: The goal of this track is to select objects that match particular criteria. These queries have been hand-written by the organizing committee.
http://semsearch.yahoo.com/datasets.php#
183
173. List Search queries:
◦ republics of the former Yugoslavia
◦ ten ancient Greek city-kingdoms of Cyprus
◦ the four of the companions of the prophet
◦ Japanese-born players who have played in MLB
◦ ... where the British monarch is also head of state
◦ nations where Portuguese is an official language
◦ bishops who sat in the House of Lords
◦ Apollo astronauts who walked on the Moon
184
174. Entity Search queries:
◦ 1978 cj5 jeep
◦ employment agencies w. 14th street
◦ nyc zip code
◦ waterville Maine
◦ LOS ANGELES CALIFORNIA
◦ ibm
◦ KARL BENZ
◦ MIT
185
175. Balog & Neumayer, A Test Collection for Entity Search in DBpedia (2013).
186
176. Datasets:
◦ PubMed documents
Tasks:
◦ 1a: Large-Scale Online Biomedical Semantic
Indexing
Automatic annotation of PubMed documents.
Training data is provided.
◦ 1b: Introductory Biomedical Semantic QA
300 questions and related material (concepts, triples
and golden answers).
187
177. Metrics, Statistics, Tests - Tetsuya Sakai (IR)
◦ http://www.promise-noe.eu/documents/10156/26e7f254-1feb-4169-9204-1c53cc1fd2d7
Building Test Collections (IR Evaluation) - Ian Soboroff
◦ http://www.promise-noe.eu/documents/10156/951b6dfb-a404-46ce-b3bd-4bbe6b290bfd
188
179. DBpedia
◦ http://dbpedia.org/
YAGO
◦ http://www.mpi-inf.mpg.de/yago-naga/yago/
Freebase
◦ http://www.freebase.com/
Wikipedia dumps
◦ http://dumps.wikimedia.org/
ConceptNet
◦ http://conceptnet5.media.mit.edu/
Common Crawl
◦ http://commoncrawl.org/
Where to use:
◦ As a commonsense KB or as a data source
190
180. High domain coverage:
◦ ~95% of Jeopardy! Answers.
◦ ~98% of TREC answers.
Wikipedia is entity-centric.
Curated link structure.
Complementary tools:
◦ Wikipedia Miner
Where to use:
◦ Construction of distributional semantic models.
◦ As a commonsense KB
191
181. WordNet
◦ http://wordnet.princeton.edu/
Wiktionary
◦ http://www.wiktionary.org/
◦ API: https://www.mediawiki.org/wiki/API:Main_page
FrameNet
◦ https://framenet.icsi.berkeley.edu/fndrupal/
VerbNet
◦ http://verbs.colorado.edu/~mpalmer/projects/verbnet.html
English lexicon for DBpedia 3.8 (in the lemon format)
◦ http://lemon-model.net/lexica/dbpedia_en/
PATTY (collection of semantically-typed relational patterns)
◦ http://www.mpi-inf.mpg.de/yago-naga/patty/
BabelNet
◦ http://babelnet.org/
Where to use:
◦ Query expansion
◦ Semantic similarity
◦ Semantic relatedness
◦ Word sense disambiguation
192
190. Querying distributed linked data
Integration of structured and unstructured data
User interaction and context mechanisms
Integration of reasoning (deductive, inductive,
counterfactual, abductive ...) on QA approaches
and test collections
Measuring confidence and answer uncertainty
Multilinguality
Machine Learning
Reproducibility in QA research
201
191. Big/Linked Data demand new principled semantic
approaches to cope with the scale and heterogeneity of
data.
Part of the Semantic Web/AI vision can be addressed today
with a multi-disciplinary perspective:
◦ Linked Data, IR and NLP
The multidisciplinarity of the QA problem can serve to show what semantic computing has achieved and what can be transported to other types of information systems.
Challenges are moving from the construction of basic QA
systems to more sophisticated semantic functionalities.
Very active research area.
202
192. Unger, C., Freitas, A., Cimiano, P.,
Introduction to Question Answering for
Linked Data, Reasoning Web Summer School,
2014.
203
193. [1] Eifrem, A NOSQL Overview And The Benefits Of Graph Database, 2009
[2] Idehen, Understanding Linked Data via EAV Model, 2010.
[3] Kaufmann & Bernstein, How Useful are Natural Language Interfaces to the Semantic Web for
Casual End-users?, 2007
[4] Chin-Yew Lin, Question Answering.
[5] Farah Benamara, Question Answering Systems: State of the Art and Future Directions.
[6] Yahya et al., Robust Question Answering over the Web of Linked Data, CIKM, 2013.
[7] Freitas et al., Querying Heterogeneous Datasets on the Linked Data Web: Challenges, Approaches
and Trends, 2012.
[8] Freitas et al., Answering Natural Language Queries over Linked Data Graphs: A Distributional
Semantics Approach, 2014.
[9] Freitas et al., Querying Heterogeneous Datasets on the Linked Data Web: Challenges, Approaches
and Trends, 2012.
[10] Freitas & Curry, Natural Language Queries over Heterogeneous Linked Data Graphs: A
Distributional-Compositional Semantics Approach, IUI, 2014.
[11] Cimiano et al., Towards portable natural language interfaces to knowledge bases, 2008.
[12] Lopez et al., PowerAqua: fishing the semantic web, 2006.
[13] Damljanovic et al., Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and
Ontology-based Lookup through the User Interaction, 2010
[14] Unger et al. Template-based Question Answering over RDF Data, 2012.
[15] Cabrio et al., QAKiS: an Open Domain QA System based on Relational Patterns, 2012.
[16] Kaufmann & Bernstein, How Useful Are Natural Language Interfaces to the Semantic Web for Casual End-Users?, 2007.
[17] Popescu et al., Towards a theory of natural language interfaces to databases, 2003.
[18] Farrell, IBM Watson: A Brief Overview and Thoughts for Healthcare Education and Performance
Improvement, 2011.
204