Concept based semantic search
- 2. Content/agenda
1. What does "concept-based" mean?
2. Concept-tagging
3. Semantic search
• Faceted search
• Similarity search
4. Semantics as a means for 'interpretation'
5. Topic pages
6. Three levels of semantic search
© Semantic Web Company – http://www.semantic-web.at/ 2
- 3. What is a concept?
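The semiotic triangle
[Figure: semiotic triangle with the corners concept, label and object. Concept: a mental model of "A-Class" (another person may hold another mental model of "A-Class"). Labels: "A-Class", "A-Klasse", "W 176". Object: the referenced object (or another object).]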
- 4. Concept-based enterprise vocabulary
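[Figure: example enterprise vocabulary as a graph of concepts]
• http://voc.org.com/core/355: prefLabel "Vehicle manufacturing company"
• http://voc.org.com/core/54: prefLabel "compact car"
• http://voc.org.com/core/97: prefLabel "Daimler AG", further label "Daimler-Benz"; broader: core/355; related: core/176
• http://voc.org.com/core/176: prefLabel "A-Class", prefLabel (de) "A-Klasse", further label "W 176"; broader: core/54
• http://voc.org.com/core/77: prefLabel "AMG", further label "Mercedes-AMG"; narrower concept of core/97; related: core/44
• http://voc.org.com/core/44: prefLabel "A 250 Sport"; narrower concept of core/176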
Each concept has a unique URI and can have labels in multiple languages. Additionally, it can have various types of semantic relations with other concepts.
W3C's SKOS standard describes a predefined set of semantic relations, especially for controlled vocabularies.
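As a rough illustration of this data model, the following Python sketch stores a few of the example concepts in memory. The `Concept` class, its field names, the "en" language code and the treatment of "W 176" and "Daimler-Benz" as alternative labels are assumptions made for this sketch; they are not taken from the slides or from any particular SKOS library.

```python
from dataclasses import dataclass, field

BASE = "http://voc.org.com/core/"  # URI prefix used in the slide's example


@dataclass
class Concept:
    uri: str
    pref_labels: dict                                # language code -> preferred label
    alt_labels: list = field(default_factory=list)   # assumed: non-preferred labels
    broader: list = field(default_factory=list)      # URIs of broader concepts
    related: list = field(default_factory=list)      # URIs of related concepts


# A small part of the example vocabulary, keyed by URI.
vocabulary = {
    c.uri: c
    for c in [
        Concept(BASE + "54", {"en": "compact car"}),
        Concept(BASE + "97", {"en": "Daimler AG"}, alt_labels=["Daimler-Benz"]),
        Concept(BASE + "176", {"en": "A-Class", "de": "A-Klasse"},
                alt_labels=["W 176"], broader=[BASE + "54"], related=[BASE + "97"]),
        Concept(BASE + "44", {"en": "A 250 Sport"}, broader=[BASE + "176"]),
    ]
}


def narrower_of(uri):
    """Derive the narrower concepts as the inverse of the stored broader links."""
    return sorted(c.uri for c in vocabulary.values() if uri in c.broader)


def all_labels(uri):
    """Return every label of a concept, preferred labels first."""
    concept = vocabulary[uri]
    return list(concept.pref_labels.values()) + concept.alt_labels
```

Storing only the broader direction and deriving narrower from it keeps the two directions consistent, which is one possible design choice among several.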
- 5. Concept-tagging vs. Term-tagging
Concept-tagging is done on top of concepts which are already part of the enterprise vocabulary, thus contextualised and linked to other concepts.
Term-tagging means that tags are extracted from text (automatically via text mining) which are not part of the controlled vocabulary yet.
Term-tags can be inserted into the enterprise vocabulary. This extends and refines the vocabulary more and more.
[Figure: content from the CMS is processed by concept tagging and term tagging. Concept tags link to the enterprise vocabulary; 'term-tags' become a 'concept' as part of the enterprise vocabulary.]
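The difference between the two kinds of tagging can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the label table, the function names and the URI `core/999` are hypothetical, and the text-mining step that proposes candidate terms is not shown.

```python
import re

# Lower-cased label -> concept URI (a small excerpt of the example vocabulary).
LABEL_TO_CONCEPT = {
    "a-class": "http://voc.org.com/core/176",
    "a-klasse": "http://voc.org.com/core/176",
    "w 176": "http://voc.org.com/core/176",
    "a 250 sport": "http://voc.org.com/core/44",
}


def concept_tags(text):
    """Return the URIs of all vocabulary concepts whose labels occur in the text."""
    lowered = text.lower()
    found = set()
    for label, uri in LABEL_TO_CONCEPT.items():
        # Match the label only as a whole phrase, not inside a longer word.
        if re.search(r"(?<!\w)" + re.escape(label) + r"(?!\w)", lowered):
            found.add(uri)
    return found


def term_tags(text, candidates):
    """Keep candidate terms that occur in the text but are not yet in the vocabulary."""
    lowered = text.lower()
    return {t for t in candidates
            if t.lower() in lowered and t.lower() not in LABEL_TO_CONCEPT}


def promote(term, uri):
    """Insert a term-tag into the vocabulary so that it becomes a concept label."""
    LABEL_TO_CONCEPT[term.lower()] = uri
```

For example, for the text "The new W 176 comes with a panoramic roof." the concept tagger finds the A-Class concept, while "panoramic roof" remains a term-tag until `promote` adds it to the vocabulary.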
- 6. Concept-tagging: pre-condition for semantic search
[Figure: a search for "W 176" resolves to the concept with prefLabel "A-Class", which has a narrower concept with prefLabel "A 250 Sport". Documents containing "W 176" or "A 250 Sport" are tagged with these concepts.]
- 7. Traditional search methods vs. semantic search
Traditional: Can the search phrase be found literally in the document?
Semantic: Can the search phrase be found in the document by its meaning?
[Figure: a search for "W 176" resolves to the concept with prefLabel "A-Class" and its narrower concept with prefLabel "A 250 Sport". Two documents mention only "A 250 Sport"; they do not contain the search phrase literally, but are found via the narrower concept.]
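The contrast between literal and semantic matching can be sketched as a simple query expansion. The following Python example is only an illustration: the `CONCEPTS` table, the function names and the sample documents are assumptions, and plain substring matching stands in for a real text index.

```python
# Concepts keyed by preferred label, with all labels and narrower concepts.
CONCEPTS = {
    "A-Class": {"labels": ["A-Class", "A-Klasse", "W 176"], "narrower": ["A 250 Sport"]},
    "A 250 Sport": {"labels": ["A 250 Sport"], "narrower": []},
}


def literal_search(query, docs):
    """Traditional search: the phrase must occur literally in the document."""
    return [i for i, doc in enumerate(docs) if query.lower() in doc.lower()]


def expand(query):
    """Collect the labels of the matching concept and of all its narrower concepts."""
    q = query.lower()
    stack = [key for key, c in CONCEPTS.items()
             if q in (label.lower() for label in c["labels"])]
    labels, seen = [], set()
    while stack:
        key = stack.pop()
        if key in seen:
            continue
        seen.add(key)
        labels.extend(CONCEPTS[key]["labels"])
        stack.extend(CONCEPTS[key]["narrower"])
    # Fall back to the raw query if no concept carries this label.
    return labels or [query]


def semantic_search(query, docs):
    """Semantic search: any label of the concept or its narrower concepts may match."""
    terms = [t.lower() for t in expand(query)]
    return [i for i, doc in enumerate(docs) if any(t in doc.lower() for t in terms)]
```

With the documents "Test drive of the A 250 Sport" and "Quarterly report on trucks", a literal search for "W 176" finds nothing, while the semantic search returns the first document.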
- 8. Semantics as a means for interpretation
Semantics helps to make different language levels or various perspectives comparable.
Example: Vendors and their customers quite often talk different languages. Wrong or sometimes time-consuming 'translations' and interpretations have to be done by the customers themselves.
Example: The state of knowledge of employees can be quite divergent. Semantics as a search assistant can serve especially less experienced colleagues.
[Figure: a search for "W 176" resolves to the concept with prefLabel "A-Class" and its narrower concept with prefLabel "A 250 Sport", which occurs in the document found.]
- 9. Concept-based high-precision facet classification
Synonyms and hidden labels: Document #1 is also classified as 'Daimler AG' because 'Daimler-Benz' is also (an old) name for 'Daimler AG'.
Transitivity: Document #2 is categorized as 'vehicle manufacturer' too, because in our thesaurus 'AMG' is a narrower concept (is part) of 'Daimler', which is a 'vehicle manufacturer'.
[Figure: document #1 mentions "Daimler-Benz", document #2 mentions "AMG". Resulting facet COMPANY: Vehicle manufacturer (2), Daimler AG (2), AMG (1).]
Concept-/thesaurus-based facet classification of documents is as precise as the classification scheme used by the enterprise thesaurus itself. By taking all labels of the concepts and their transitive hierarchical relations into account, a more precise facet classification can be realised than with traditional term-based methods.
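The two effects, synonym resolution and transitivity, can be reproduced with a small Python sketch. The label table, the `BROADER` mapping, the function names and the two sample sentences are assumptions for illustration; naive substring matching stands in for proper text analysis.

```python
from collections import Counter

# Lower-cased label (including old names) -> preferred label of the concept.
LABEL_TO_PREF = {
    "daimler ag": "Daimler AG",
    "daimler-benz": "Daimler AG",
    "amg": "AMG",
    "mercedes-amg": "AMG",
}

# Preferred label -> preferred label of the broader concept.
BROADER = {"AMG": "Daimler AG", "Daimler AG": "Vehicle manufacturer"}


def facets(text):
    """Return the concepts found in the text plus all their transitive broader concepts."""
    lowered = text.lower()
    direct = {pref for label, pref in LABEL_TO_PREF.items() if label in lowered}
    result = set()
    for concept in direct:
        # Walk up the hierarchy; stop at the top or at an already visited concept.
        while concept is not None and concept not in result:
            result.add(concept)
            concept = BROADER.get(concept)
    return result


def facet_counts(docs):
    """Count, per facet value, how many documents fall under it."""
    counts = Counter()
    for doc in docs:
        counts.update(facets(doc))
    return counts
```

Applied to one document mentioning "Daimler-Benz" and one mentioning "AMG", the counts match the figure: Vehicle manufacturer (2), Daimler AG (2), AMG (1).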
- 10. Similarity search: efficient re-use of existing information
[Figure: two documents with no words in common. One mentions "AMG" and "A 250 Sport", the other "Mercedes-AMG" and "W 176". Both are linked to the same concepts: http://voc.org.com/core/77 (prefLabel "AMG", further label "Mercedes-AMG"), http://voc.org.com/core/176 ("A-Class", "W 176") and its narrower concept http://voc.org.com/core/44 ("A 250 Sport").]
Content authors as well as end users can benefit from similarity search (content recommendation), e.g. by 'skim reading' or by avoiding duplicated work. Even if two documents have no words in common, they can be classified as similar when using a concept-based text analysis.
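One simple way to illustrate this is to compare documents by the sets of concepts they mention rather than by their words. The sketch below uses Jaccard similarity, which is an assumption of this example and not necessarily the measure used by the product; the label table, the function names and the sample sentences are likewise hypothetical.

```python
CORE = "http://voc.org.com/core/"

# Lower-cased label -> concept URI.
LABEL_TO_CONCEPT = {
    "mercedes-amg": CORE + "77",
    "amg": CORE + "77",
    "a-class": CORE + "176",
    "w 176": CORE + "176",
    "a 250 sport": CORE + "44",
}

# Concept URI -> URI of its broader concept.
BROADER = {CORE + "44": CORE + "176"}


def concepts_in(text):
    """Concepts whose labels occur in the text, extended by their broader concepts."""
    lowered = text.lower()
    found = {uri for label, uri in LABEL_TO_CONCEPT.items() if label in lowered}
    for uri in list(found):
        while uri in BROADER:
            uri = BROADER[uri]
            found.add(uri)
    return found


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    return len(a & b) / len(a | b) if a | b else 0.0


def word_similarity(doc_a, doc_b):
    return jaccard(set(doc_a.lower().split()), set(doc_b.lower().split()))


def concept_similarity(doc_a, doc_b):
    return jaccard(concepts_in(doc_a), concepts_in(doc_b))
```

For "AMG tunes the A 250 Sport" and "Mercedes-AMG announces W 176 update", the word-based similarity is zero, whereas the concept-based similarity is clearly positive because both texts refer to AMG and to the A-Class family.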
- 11. Topic Pages: Mashups for a fast 360° view
Articles (Twitter, videos etc.) can be retrieved from various content sources.
[Figure: topic page mashup with the elements "http://", short description, related concepts, CMS, geo search and API.]
- 12. Linked Data: complex queries on top of standard technologies
Example: Find industry news which mention countries or regions in which our export volume increased by more than 10% over the last 5 years and which mention either one of our products and/or a competitor.
[Figure: (federated) SPARQL queries combine the data sources "Industry News" and "Export statistics".]
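To make the logic of this example query concrete, here is a plain-Python stand-in that joins two in-memory data sources. It is not SPARQL and does not query any endpoint; all figures, product names, competitor names and headlines are made-up sample data, and the function names are hypothetical.

```python
# Made-up export statistics: region -> (volume five years ago, volume now).
EXPORT_VOLUME = {
    "Austria": (100.0, 115.0),
    "Brazil": (80.0, 82.0),
}

# Hypothetical names, used only for this illustration.
OUR_PRODUCTS = {"ProductX"}
COMPETITORS = {"CompetitorY"}

# Made-up news items, already tagged with the concepts they mention.
NEWS = [
    {"title": "CompetitorY expands in Austria", "mentions": {"CompetitorY", "Austria"}},
    {"title": "ProductX launched in Brazil", "mentions": {"ProductX", "Brazil"}},
    {"title": "Weather in Austria", "mentions": {"Austria"}},
]


def growing_regions(threshold=0.10):
    """Regions whose export volume grew by more than the threshold."""
    return {region for region, (old, new) in EXPORT_VOLUME.items()
            if (new - old) / old > threshold}


def matching_news():
    """News mentioning a growing region and one of our products or a competitor."""
    regions = growing_regions()
    players = OUR_PRODUCTS | COMPETITORS
    return [item["title"] for item in NEWS
            if item["mentions"] & regions and item["mentions"] & players]
```

In a linked data setting, the same join would be expressed declaratively in one query across both sources instead of in application code.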
- 13. Conclusion 1: The three levels of semantic search
Year in which the underlying technology will be/has been rolled out:
• 2014, Linked Data based search: Semantics is explicitly available via linked knowledge models. Content from various sources and departments can be linked and mashed on top of an explicit metadata layer. Complex queries which use data from many sources can be made by using the standard query language SPARQL.
• 2011, Concept-based search: Semantics is explicitly available by using controlled vocabularies and thesauri. Thesauri are the basis for precise text analysis and for building a semantic index. Building knowledge models is especially cost-efficient for larger organisations since a more precise search can be provided.
• 2005, Term-based search (no standards): Semantics is calculated by text analysis. Example: Because "Dieter Zetsche" frequently occurs together with "Daimler AG" in a text, the algorithm assumes that those two phrases relate somehow to each other. Term-based methods are less precise than the two levels above.
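The term-based level can be illustrated by counting how often phrases occur together. The following Python sketch is a deliberately simple co-occurrence count; the function name, the phrase list and the sample sentences are assumptions, and real term-based systems use more elaborate statistics.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs, phrases):
    """Count, for each pair of phrases, in how many documents both occur."""
    counts = Counter()
    for doc in docs:
        lowered = doc.lower()
        # Sort so that each pair is always counted under the same key.
        present = sorted(p for p in phrases if p.lower() in lowered)
        counts.update(combinations(present, 2))
    return counts


docs = [
    "Dieter Zetsche spoke on behalf of Daimler AG.",
    "Daimler AG said that Dieter Zetsche will attend.",
    "The A-Class is built by Daimler AG.",
]
phrases = ["Dieter Zetsche", "Daimler AG", "A-Class"]
pairs = cooccurrence(docs, phrases)
```

A high count only signals that two phrases "relate somehow"; unlike a thesaurus relation, it does not say how, which is why this level is less precise.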
- 14. Conclusion 2: Explicit metadata layer
Metadata:
• Stored and processed separately from data
• Metadata management is part of the enterprise information management strategy
[Figure: a central metadata layer connects the data of Research, Production, Marketing/Sales and HR.]
- 15. “Thank you for your time and
please forward any comments
or questions to me to get more
information on our product or
linked data & vocabularies!”
Andreas Blumauer
Managing Partner
a.blumauer@semantic-web.at
Semantic Web Company GmbH
Mariahilfer Strasse 70/8
1070 Vienna
Austria
http://www.semantic-web.at/
http://poolparty.biz
http://twitter.com/semwebcompany