Open hpi semweb-06-part7
1. Semantic Web Technologies
Lecture 6: Applications in the Web of Data
07: Semantic Search
Dr. Harald Sack
Hasso Plattner Institute for IT Systems Engineering
University of Potsdam
Spring 2013
This file is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0)
2. Lecture 6: Applications in the Web of Data
Open HPI - Course: Semantic Web Technologies
Semantic Web Technologies , Dr. Harald Sack, Hasso-Plattner-Institut, Universität Potsdam
3. 07 - Semantic Search
4. Meaning
[Figure: a sender and a receiver, each with their own experience, communicate within a shared context (pragmatics). Semiotic triangle: the symbol ("Armstrong") symbolizes a concept, the concept refers to an object, and the symbol stands for the object.]
Ogden, Richards: The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism (1923)
Image source: http://commons.wikimedia.org/wiki/User:McSmit
5. Armstrong
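6. [Figure: Entities and Ontologies. The entity Neil Armstrong (http://dbpedia.org/resource/Neil_Armstrong) is linked by "is a" relations into an ontology. Class labels: Kosmonaut, Astronaut, Person, Science Occupation, Employment. Relation labels: is a, same as, subClassOf, is NOT a, has an.]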
7. Classical Information Retrieval
[Figure: files of records form the set of documents.]
(acc. to Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York 1983)
Lecture Semantic Web, Dr. Harald Sack, Hasso-Plattner-Institut, Universität Potsdam
8. Classical Information Retrieval
[Figure: information requests form the set of queries; files of records form the set of documents.]
(acc. to Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York 1983)
9. Classical Information Retrieval
[Figure: information requests form the set of queries (via query formulation); files of records form the set of documents (via indexing). Queries and documents are both expressed in an indexing language and are compared by similarity.]
(acc. to Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York 1983)
10. Classical Information Retrieval
(simplified version)
[Figure: the user's "search" is expressed as search term(s) (keywords) in a search query; the query is matched against a search index built over the set of documents. The sample document is a German dictionary entry for "suchen" (to search), [Bd. 20, Sp. 835], which traces the verb through the Germanic languages (e.g. Gothic sokjan, Old High German suohhan) and gives its core meaning as 'to track something down, to make an effort to find it'.]
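The simplified model above (search terms looked up in a search index over a set of documents) can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the three sample documents, the whitespace tokenization, and the function names `build_index` and `search` are assumptions made for the example.

```python
from collections import defaultdict


def build_index(documents):
    """Build an inverted index: lowercase term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents (not taken from the slides).
documents = {
    "d1": "neil armstrong walked on the moon",
    "d2": "the apollo program reached the moon",
    "d3": "louis armstrong played the trumpet",
}
index = build_index(documents)
hits = search(index, "armstrong moon")
```

A purely keyword-based lookup like this matches strings, not meanings: the single term "armstrong" retrieves both the astronaut and the musician.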
11. Evaluation of Information Retrieval Systems
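R: relevant documents
P: retrieved documents
R ∩ P: relevant documents that have been retrieved
\[ \mathrm{Recall} = \frac{|R \cap P|}{|R|} \]
\[ \mathrm{Precision} = \frac{|R \cap P|}{|P|} \]
\[ F_{\alpha} = \frac{(1+\alpha) \cdot (\mathrm{Recall} \cdot \mathrm{Precision})}{\alpha \cdot (\mathrm{Recall} + \mathrm{Precision})} \]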
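As a worked example of these measures, the following sketch computes recall, precision, and the balanced F-measure (the case α = 1, i.e. the harmonic mean of recall and precision) from two sets of document ids. The document ids and the function name `evaluate` are assumptions made for the illustration.

```python
def evaluate(relevant, retrieved):
    """Return (recall, precision, f1) for sets of relevant and retrieved ids."""
    hits = relevant & retrieved  # relevant documents that have been retrieved
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    if recall + precision == 0:
        f1 = 0.0
    else:
        # Balanced F-measure (alpha = 1): harmonic mean of recall and precision.
        f1 = 2 * recall * precision / (recall + precision)
    return recall, precision, f1


# Hypothetical example: 4 relevant documents, 3 retrieved, 2 in common.
R = {"d1", "d2", "d3", "d4"}
P = {"d3", "d4", "d5"}
recall, precision, f1 = evaluate(R, P)
```

Here recall is 2/4, precision is 2/3, and the balanced F-measure is 4/7.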
12. Semantic Search
(One of many Definitions...)
• Annotation of (text-based) metadata with semantic entities
• Entity-based Information Retrieval
• Make use of semantic relations, e.g. content-based similarities of relationships
• Interoperable metadata via semantic annotations
  • for content-based description
  • for structural / technical description (Multimedia Ontologies)
Overall Goal:
Quantitative and qualitative improvement of Information Retrieval
13. Semantic Search
Semantic metadata improve traditional keyword-based retrieval through
(1) Query String Extension/Refinement
enables more precise or more complete search results
(2) Cross Referencing
complements search results with additional associated or similar information
(3) Exploratory Search
enables visualization and navigation of the search space
(4) Reasoning
complements search results with implicitly given information
14. Semantic Search
Query String Extension
• Keyword-based search does not deliver all results that are relevant for a query, because relevant documents might describe the queried content with synonyms or metaphors instead of the query terms.
• Extension of the original query string (Query Extension)
  • from dictionaries and thesauri: extend the query with synonyms, hyponyms, etc.
  • from domain ontologies: extend the query with meronyms, related concepts, etc.
Original query string: Bank
Possible extensions: Bank ∨ depository financial institution ∨ credit union ∨ acquirer ∨ federal reserve ∨ ...
Effect: increases recall
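Query extension can be sketched as a disjunctive (OR) search over the original term plus related terms. The thesaurus entries below reuse two of the extensions listed above; the dictionary `THESAURUS`, the sample documents, the naive substring matching, and the function names are assumptions made for this illustration.

```python
# Hypothetical mini-thesaurus; a real system would consult a dictionary,
# thesaurus, or domain ontology instead.
THESAURUS = {
    "bank": ["depository financial institution", "credit union"],
}


def expand_query(term, thesaurus):
    """Return the original term followed by its related terms."""
    return [term] + thesaurus.get(term.lower(), [])


def search_any(documents, terms):
    """Ids of documents containing at least one term (naive substring match)."""
    return {
        doc_id
        for doc_id, text in documents.items()
        if any(term.lower() in text.lower() for term in terms)
    }


documents = {
    "d1": "The bank raised its interest rates.",
    "d2": "A credit union is owned by its members.",
    "d3": "The trumpet player toured Europe.",
}
plain = search_any(documents, ["Bank"])
expanded = search_any(documents, expand_query("Bank", THESAURUS))
```

In this toy collection the plain query finds only d1, while the extended query also finds d2, which never mentions the word "bank". More relevant documents are retrieved, so recall goes up.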
15. Semantic Search
Query String Refinement
• Keyword-based search also delivers results that are not relevant for a query, because query terms and document terms might be ambiguous.
• Refinement of the original query string (Query Refinement)
  • from dictionaries and thesauri: disambiguate polysemic terms with hypernyms
  • from domain ontologies: disambiguate polysemic terms with holonyms
Original query string: Bank
Possible refinements:
(1) Bank ∧ financial institution
(2) Bank ∧ incline ∧ slope ∧ side
(3) Bank ∧ container
(4) Bank ∧ deposit ∧ repository
Effect: increases precision
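Query refinement can be sketched as a conjunctive (AND) search in which an additional term selects one sense of the ambiguous query term. The sample documents, the naive substring matching, and the function name `search_all` are assumptions made for this illustration.

```python
def search_all(documents, terms):
    """Ids of documents containing every term (naive substring match)."""
    return {
        doc_id
        for doc_id, text in documents.items()
        if all(term.lower() in text.lower() for term in terms)
    }


# Hypothetical documents that use "bank" in three different senses.
documents = {
    "d1": "The bank is a financial institution that grants loans.",
    "d2": "We rested on the bank where the slope meets the river.",
    "d3": "A blood bank is a repository for donated blood.",
}
plain = search_all(documents, ["Bank"])
refined = search_all(documents, ["Bank", "financial institution"])
```

The plain query returns all three documents, although only one matches the intended sense. The refined query returns d1 alone, so fewer irrelevant documents are retrieved and precision goes up.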
16. Semantic Search
Cross Referencing
• Provide search results that do not literally contain the query string but are closely related to the query by content
• Apply domain ontologies for determining related concepts
• Apply statistical analysis of large (text) document corpora
[Figure: the query string "Neil Armstrong" is mapped via named entity recognition (NER) to dbpedia:Neil_Armstrong, which is linked by dbprop:mission to dbpedia:Apollo_11. dbpedia:Michael_Collins and dbpedia:Buzz_Aldrin are linked by dbprop:mission to dbpedia:Apollo_11 as well.]
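The cross-referencing example can be sketched over a handful of triples. The three dbprop:mission triples follow the figure; the label lookup table that stands in for NER and the function names `link_entity` and `cross_references` are assumptions made for this illustration.

```python
# (subject, predicate, object) triples as named in the slide's example.
TRIPLES = [
    ("dbpedia:Neil_Armstrong", "dbprop:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Buzz_Aldrin", "dbprop:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Michael_Collins", "dbprop:mission", "dbpedia:Apollo_11"),
]

# Hypothetical stand-in for named entity recognition: label -> entity.
LABELS = {"neil armstrong": "dbpedia:Neil_Armstrong"}


def link_entity(query):
    """Map a query string to an entity, or None if it is not recognized."""
    return LABELS.get(query.strip().lower())


def cross_references(entity, triples, prop="dbprop:mission"):
    """Other subjects that share at least one `prop` value with `entity`."""
    shared = {o for s, p, o in triples if s == entity and p == prop}
    return sorted(
        {s for s, p, o in triples if p == prop and o in shared and s != entity}
    )


entity = link_entity("Neil Armstrong")
related = cross_references(entity, TRIPLES)
```

The related entities, Buzz Aldrin and Michael Collins, can then be offered alongside the direct results even though their documents need not contain the string "Neil Armstrong".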
17. Semantic Search
Exploratory Search
• Provide additional search results that do not necessarily contain the query string but are related by content to the query or to the results of the direct query
• Apply domain ontologies and heuristics to determine the relevance of facts
[Figure: dbpedia:Neil_Armstrong is linked by dbpedia-owl:mission to dbpedia:Apollo_11. dbpedia:Apollo_11 and dbpedia:Apollo_13 share the dcterms:subject category:Apollo_program. dbpedia:Apollo_13 and dbpedia:Space_Shuttle_Challenger share the rdf:type yago:Space_accidents_and_incidents.]
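One simple heuristic for exploratory search is to walk the graph outward from the query entity and rank what is found by hop distance. The sketch below does this with a breadth-first search that ignores edge direction. The edges are an assumed reading of the figure, and the hop-count heuristic and the function name `explore` are choices made for this illustration, not the lecture's method.

```python
from collections import deque

# Assumed reading of the figure's edges as (subject, predicate, object).
TRIPLES = [
    ("dbpedia:Neil_Armstrong", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Apollo_11", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "rdf:type", "yago:Space_accidents_and_incidents"),
    ("dbpedia:Space_Shuttle_Challenger", "rdf:type",
     "yago:Space_accidents_and_incidents"),
]


def explore(start, triples, max_hops):
    """Return {node: hop distance} for nodes within max_hops of start."""
    neighbours = {}
    for s, _, o in triples:
        # Treat every edge as undirected for navigation purposes.
        neighbours.setdefault(s, set()).add(o)
        neighbours.setdefault(o, set()).add(s)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue
        for nxt in neighbours.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    del distance[start]
    return distance


nearby = explore("dbpedia:Neil_Armstrong", TRIPLES, max_hops=3)
```

With three hops the walk reaches Apollo 11, the Apollo program category, and Apollo 13; the Challenger accident only appears with five hops, which a relevance heuristic would rank lower.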
18. Semantic Search
Reasoning
• Provide additional search results (and information) that do not necessarily contain the query string but are related to the query by content, where the relation need not be direct but can be derived via entailment
• Apply domain ontologies, reasoning algorithms, and heuristics to find new facts and determine the relevance of facts
19. Semantic Search
Reasoning
Example: query string = Neil Armstrong
(Hard) questions to solve via reasoning:
• Will there be the Moon or documents about the Moon in the search results?
• How is Neil Armstrong related to the Moon? (is he?)
• Was Neil Armstrong (really) on the Moon?
• ...
[Figure: RDF graph around the query entity. Nodes: dbpedia:Neil_Armstrong, dbpedia:Apollo_11, dbpedia:Moon, category:Missions_to_the_Moon, category:Exploration_of_the_Moon, category:Moon, category:Spaceflight, category:Animals_in_Space. Edge labels: dbpedia-owl:mission, dcterms:subject, skos:broader.]
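The question of how Neil Armstrong is related to the Moon can be sketched as a small inference over category links: follow the mission, collect its categories, and close them under skos:broader. The triples below are an assumed reading of the figure, and treating skos:broader as transitive is a simplification made here (SKOS itself only declares its super-property skos:broaderTransitive to be transitive). All function names are hypothetical.

```python
# Assumed reading of the figure's edges as (subject, predicate, object).
TRIPLES = [
    ("dbpedia:Neil_Armstrong", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Apollo_11", "dcterms:subject", "category:Missions_to_the_Moon"),
    ("category:Missions_to_the_Moon", "skos:broader",
     "category:Exploration_of_the_Moon"),
    ("category:Exploration_of_the_Moon", "skos:broader", "category:Moon"),
    ("dbpedia:Moon", "dcterms:subject", "category:Moon"),
]


def objects(subject, prop, triples):
    """All objects o with a triple (subject, prop, o)."""
    return {o for s, p, o in triples if s == subject and p == prop}


def broader_closure(category, triples):
    """All categories reachable from `category` via skos:broader."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in objects(current, "skos:broader", triples):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_categories(entity, triples):
    """Categories of the entity's missions, including all broader ones."""
    categories = set()
    for mission in objects(entity, "dbpedia-owl:mission", triples):
        for category in objects(mission, "dcterms:subject", triples):
            categories.add(category)
            categories |= broader_closure(category, triples)
    return categories


armstrong = inferred_categories("dbpedia:Neil_Armstrong", TRIPLES)
moon = objects("dbpedia:Moon", "dcterms:subject", TRIPLES)
related = bool(armstrong & moon)
```

The derived link shows only that Neil Armstrong and the Moon share a category through Apollo 11. Whether he was actually on the Moon cannot be concluded from these category links alone, which is why the slide calls such questions hard.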
20. 08 - Exploratory Semantic Search