1. Enterprise Terminology Management
as a Basis for Powerful Semantic Services
in Content Publishing
Publishers’ Forum 2013
Berlin, 22 April 2013
Martin Kaltenböck
Semantic Web Company
www.semantic-web.at
Christian Dirschl
Wolters Kluwer Deutschland GmbH
www.wolterskluwer.de
@semwebcompany
2. Agenda of the Workshop
Challenges and Introduction
Solution: Linked Controlled Vocabularies
Terminology WKD Use Case (C. Dirschl, WKD)
Conclusion & Outlook: a new Business Model?
Semantic Services on Top of Terminology Mgnt.
Q&A and Open Discussion
… bring your own Use Cases!
© Semantic Web Company – http://www.semantic-web.at/
3. Semantic Web Company (SWC)
SWC FACTS
SEMANTIC INFORMATION MANAGEMENT
• Semantic Web Company founded 2001 in Vienna, Austria
• 20 experts in strategy, coding, consulting, research
• Product: PoolParty Suite (launched 2009)
• Serving global 500 companies
• EU- & US-based consulting services
Partner Network
4. SWC Customers (excerpt)
World Bank
Roche Diagnostics
Credit Suisse
Wolters Kluwer
Biogen Idec
Wood MacKenzie
UNIQA Insurance AG
Pearson
REEEP
British Museum
Education Services Australia
Daimler
A1 Telekom
6. What are the challenges?
We use different terminologies…
We use different languages…
We use different classification systems…
We use different metadata management systems…
We use different glossaries and definitions…
We use content from several data silos…
[Figure: the term "Innovation management" as used in different departments (HR, Marketing)]
7. Terminology = Controlled Vocabulary = SKOS Thesaurus
SKOS = Simple Knowledge Organization System
L(O)D = Linked (Open) Data
Linked Controlled Vocabularies = using L(O)D principles
Concept based tagging = semantic tagging = semantic annotation
URI = Uniform Resource Identifier
….
I am using a special Terminology ;)
8. What is a thesaurus, and how does it differ from a taxonomy or an ontology?
A thesaurus is expressive enough to improve most enterprise applications significantly, but not so complex that it cannot be created and maintained in a sustainable way.
Taxonomy – Thesaurus – Ontology
9. SKOS stands for 'Simple Knowledge Organization System'
• W3C Standard since 2009
• Based on Semantic Web standards
• Open for linking with additional linked data
http://www.w3.org/2004/02/skos/
10. What is a Concept? The Semiotic Triangle
[Figure: the semiotic triangle relating concept, object and label. Labels shown: "A-Class", "A-Klasse", "W 176". Also shown: a mental model of "A-Class", another mental model of "A-Class", and another object.]
11. Concept-tagging vs. Term-tagging
[Figure: content from the CMS is tagged against the enterprise vocabulary (concept tagging) and mined for new terms (term tagging); 'term-tags' become a 'concept' as part of the enterprise vocabulary]
Concept-tagging is done on top of concepts which are already part of the enterprise vocabulary, and are thus contextualised and linked to other concepts.
Term-tagging means that tags which are not yet part of the controlled vocabulary are extracted from text (automatically via text mining).
Term-tags can be inserted into the enterprise vocabulary. This extends and refines the vocabulary over time.
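The difference between the two tagging modes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the extraction used by any product named in these slides: the small vocabulary, the example.org URIs and the capitalised-word heuristic that stands in for text mining are all hypothetical.

```python
import re

# Hypothetical enterprise vocabulary: label (lower-cased) -> concept URI.
VOCABULARY = {
    "lean production": "http://example.org/concept/1",
    "lean manufacturing": "http://example.org/concept/1",  # synonym, same concept
    "innovation management": "http://example.org/concept/2",
}


def concept_tags(text, vocabulary):
    """Concept tagging: URIs of known concepts whose labels occur in the text."""
    lowered = text.lower()
    return sorted({uri for label, uri in vocabulary.items() if label in lowered})


def term_tags(text, vocabulary):
    """Term tagging: candidate terms that are not in the vocabulary yet.

    Assumption: runs of capitalised words serve as a naive stand-in
    for real text mining.
    """
    candidates = re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)+", text)
    return sorted({c for c in candidates if c.lower() not in vocabulary})


def promote(term, uri, vocabulary):
    """Insert a term-tag into the vocabulary so that it becomes a concept."""
    vocabulary[term.lower()] = uri


text = "Lean Production and Value Stream Mapping support innovation management."
```

Calling `term_tags(text, VOCABULARY)` surfaces "Value Stream Mapping" as a candidate; once an editor accepts it via `promote`, later documents are tagged with its concept URI instead.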
13. Using Linked (Open) Data Principles
Linked Data Principles (Tim Berners-Lee):
• Use URIs to denote things.
• Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
• Provide useful information about the thing when its URI is dereferenced, leveraging standards such as RDF and SPARQL.
• Include links to other related things (using their URIs) when publishing data on the Web.
Why Linked Controlled Vocabularies?
• To enable connected vocabularies across several departments (and different languages)
• To enrich a terminology in the areas of concepts, synonyms, definitions, relations…
• To enable contextualization / data integration by linking different terminologies
14. Linked Controlled Vocabularies
1. Each concept is in one or many concept schemes
2. Each concept has one URI
3. Each concept has one or more labels
4. (Poly-)hierarchical and non-hierarchical relations
5. Matching between concepts from various sources
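The five properties listed on this slide map fairly directly onto a small data structure. The following Python sketch is an assumption-laden illustration, not a tool's actual data model: the `Concept` class, the `to_turtle` helper and all example.org URIs are hypothetical, and only the SKOS property names and namespace come from the W3C standard. The labels reuse the "A-Class" example from the semiotic triangle.

```python
from dataclasses import dataclass, field

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"


@dataclass
class Concept:
    uri: str                                          # 2. exactly one URI
    schemes: list = field(default_factory=list)       # 1. one or many concept schemes
    pref_labels: dict = field(default_factory=dict)   # 3. language -> preferred label
    alt_labels: list = field(default_factory=list)    # 3. further labels as (text, language)
    broader: list = field(default_factory=list)       # 4. several parents = polyhierarchy
    related: list = field(default_factory=list)       # 4. non-hierarchical relation
    exact_match: list = field(default_factory=list)   # 5. match in another vocabulary


def to_turtle(c):
    """Serialise one concept as a Turtle snippet using SKOS properties."""
    lines = [f"<{c.uri}> a skos:Concept"]
    lines += [f"    skos:inScheme <{s}>" for s in c.schemes]
    lines += [f'    skos:prefLabel "{t}"@{lang}' for lang, t in c.pref_labels.items()]
    lines += [f'    skos:altLabel "{t}"@{lang}' for t, lang in c.alt_labels]
    lines += [f"    skos:broader <{u}>" for u in c.broader]
    lines += [f"    skos:related <{u}>" for u in c.related]
    lines += [f"    skos:exactMatch <{u}>" for u in c.exact_match]
    return " ;\n".join(lines) + " ."


# Hypothetical concept; every URI below is made up for illustration.
a_class = Concept(
    uri="http://example.org/cars/a-class",
    schemes=["http://example.org/cars"],
    pref_labels={"en": "A-Class", "de": "A-Klasse"},
    alt_labels=[("W 176", "en")],
    broader=["http://example.org/cars/compact", "http://example.org/cars/hatchback"],
    exact_match=["http://example.org/other-vocabulary/a-class"],
)

print(SKOS_PREFIX + to_turtle(a_class))
```

Two entries in `broader` are what makes the hierarchy a polyhierarchy; `exact_match` is the hook for linking to concepts from other sources.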
15. Linked Controlled Vocabularies
• Simple Knowledge Organization System (SKOS) is a W3C standard to develop enterprise vocabularies
• SKOS provides several properties for vocabulary linking (mapping):
– skos:exactMatch
– skos:closeMatch
– skos:broadMatch
– skos:narrowMatch
– skos:relatedMatch
http://www.w3.org/TR/2009/REC-skos-reference-20090818/
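A short sketch of how these mapping properties behave when read in the opposite direction. According to the SKOS Reference, skos:exactMatch, skos:closeMatch and skos:relatedMatch are symmetric, and skos:broadMatch is the inverse of skos:narrowMatch. The helper name `with_inverses` and the example.org URIs are assumptions made for illustration only.

```python
# Property to use when a mapping is read from object to subject.
INVERSE = {
    "skos:exactMatch": "skos:exactMatch",      # symmetric
    "skos:closeMatch": "skos:closeMatch",      # symmetric
    "skos:relatedMatch": "skos:relatedMatch",  # symmetric
    "skos:broadMatch": "skos:narrowMatch",     # inverse pair
    "skos:narrowMatch": "skos:broadMatch",     # inverse pair
}


def with_inverses(mappings):
    """Return the given mappings plus the statements entailed in reverse."""
    result = set(mappings)
    for subject, prop, obj in mappings:
        result.add((obj, INVERSE[prop], subject))
    return result


# Hypothetical mappings between two departmental vocabularies.
mappings = [
    ("http://example.org/hr/innovation-management",
     "skos:exactMatch",
     "http://example.org/marketing/innovation-mgmt"),
    ("http://example.org/regions/Belgium",
     "skos:broadMatch",
     "http://example.org/geo/Benelux"),
]

for triple in sorted(with_inverses(mappings)):
    print(triple)
```

Maintaining a mapping in one vocabulary is therefore enough for the linked vocabulary to find its way back, provided the consuming application applies these rules.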
17. Semantic Services on Top of Terminology Mgnt.
18. Semantic Services on Top of Terminology Management
Topic Pages & Dossier Pages
SEO / SEM
Semantic Search
Recommender Systems
Content Aggregation
Data Integration (Services)
Matchmaking Services
Smart Glossary Services
19. Smart Glossary Services
Example: Schools Online Thesaurus
Live-Demo
http://scot.curriculum.edu.au/
20. Dossier Pages: From 'Gopher' to 'Super-Mashups'
Live-Demo
http://www.reegle.info/countries
21. Topic Pages: Mashups providing a quick overview
[Figure: topic page mashup combining a short description, related concepts, geo-search and content (Twitter, videos etc.) from several different sources, connected via API, http:// and CMS]
22. Content Aggregation
Example: GBPN News Aggregator
Live-Demo
http://www.gbpn.org/newsroom/news-aggregator
23. SKOS & Linked data alignment
Live-Demo
http://bit.ly/semantic_search
24. The Business Perspective: Costs of Data Integration
Source: PricewaterhouseCoopers – Technology Forecast, Spring 2009
25. Semantic Search
[Figure: a search for "Innovation management methods" across the departments HR, Marketing/Sales, Research and Production]
Live-Demo
http://pilot4.poolparty.biz/alcedo/
26. Querying structured data AND unstructured data in one step
"Show me industry news which mention countries or regions to which our export volume has increased by at least 10% over the last 5 years, and which deal with one of our products and/or with one of our competitors."
[Figure: (Federated) SPARQL queries over industry news and export statistics]
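The logic of the example question can be sketched without a triple store. The slide names (federated) SPARQL as the mechanism; the Python below deliberately swaps that for plain in-memory data so that the two filtering steps are visible. All figures, news titles, product and competitor names are invented for illustration, and the news items are assumed to be concept-tagged already.

```python
# Structured data: export volume per region, five years apart (invented figures).
EXPORTS = {
    "Belgium": {2008: 100.0, 2013: 125.0},
    "Austria": {2008: 200.0, 2013: 205.0},
}

# Unstructured data: news items that were already concept-tagged (invented).
NEWS = [
    {"title": "New plant opens near Antwerp", "regions": {"Belgium"}, "topics": {"ProductA"}},
    {"title": "Vienna trade fair review", "regions": {"Austria"}, "topics": {"ProductA"}},
    {"title": "Belgian weather outlook", "regions": {"Belgium"}, "topics": set()},
]

OUR_PRODUCTS = {"ProductA"}
COMPETITORS = {"CompetitorX"}


def growing_regions(exports, start, end, min_growth=0.10):
    """Regions whose export volume grew by at least min_growth between two years."""
    return {
        region
        for region, volume in exports.items()
        if (volume[end] - volume[start]) / volume[start] >= min_growth
    }


def relevant_news(news, exports, start, end):
    """News mentioning a growing region AND one of our products or competitors."""
    regions = growing_regions(exports, start, end)
    wanted_topics = OUR_PRODUCTS | COMPETITORS
    return [
        item["title"]
        for item in news
        if item["regions"] & regions and item["topics"] & wanted_topics
    ]


print(relevant_news(NEWS, EXPORTS, 2008, 2013))
```

The join only works because regions in the statistics and regions in the news tags share the same identifiers, which is the role the controlled vocabulary plays.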
27. Terminology WKD Use Case (C. Dirschl, WKD)
28. Content Supply Chain
[Figure: Content Acquisition → Content Enrichment, Composing, Bundling → Publishing, Interfacing → Sales, Customer Service → Customer]
Content Acquisition
• Manually collecting data from different sources
• Most information is not publicly available
• 1:1 contractual relationships with authors
Content Enrichment, Composing/Bundling
• Using internal taxonomies and thesauri
• Mainly manual enrichment
• Linking of WK content only
Publishing, Interfacing
• Publishing mainly in the context of a distinct product
• Publishing of texts, not information
Sales, Customer Service
• Online libraries as isolated applications
• Hardly any integration with Web content
• Only first steps in integration of client software and content
29. Jurion Platform
• jDesk: Real integration in local processes
• jCloud: Secure access and mobility
• jStore: Access to many sources and immediate usage
• jBook: Individualisation of content
• jLink: Networking and Personalisation
• jCreate: Create and sell knowledge
• jSearch: Semantic search on legal information
30. Overview Search and Content Enrichment architecture
Sources: CMS, Customer Content, Metadata DB/Services, www… Crawler, 3rd Party Content, UGC (via import paths)
Enrichment: Classification*, Metadata Recognition, Content Enrichment
Preprocessing/Indexing: Concept Recognition*, Doc. Segmentation, Normalization, Index
Search:
• User Query
• Query Analysis: Concept Recognition*, Named Entity Recognition, Semantic expansion*, Link to Taxonomy*
• Search, yielding Search Result (Raw)
• Result Analysis: Relevance Ranking
• Refinement: Data organization (e.g. faceting), Further analysis (e.g. ontology, linked data)
• Search Result (Final)
• Search Feedback (e.g. ontology)
User Information
* Domain specific requirements
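The query analysis step (concept recognition followed by semantic expansion) can be illustrated with a toy thesaurus. This is a sketch under stated assumptions and says nothing about how the Jurion search is actually implemented: the concept identifiers, labels and the substring matching are hypothetical.

```python
# Hypothetical mini-thesaurus: concept id -> labels and narrower concepts.
THESAURUS = {
    "c:termination": {
        "labels": ["termination", "dismissal"],
        "narrower": ["c:summary-dismissal"],
    },
    "c:summary-dismissal": {
        "labels": ["summary dismissal"],
        "narrower": [],
    },
}


def recognise_concepts(query, thesaurus):
    """Concept recognition: ids of concepts with a label occurring in the query."""
    q = query.lower()
    return sorted(
        cid for cid, concept in thesaurus.items()
        if any(label in q for label in concept["labels"])
    )


def expand(concept_ids, thesaurus):
    """Semantic expansion: all labels of the concepts and of their narrower concepts."""
    terms, stack, seen = set(), list(concept_ids), set()
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        terms.update(thesaurus[cid]["labels"])
        stack.extend(thesaurus[cid]["narrower"])
    return sorted(terms)


concepts = recognise_concepts("rules on dismissal", THESAURUS)
print(concepts, expand(concepts, THESAURUS))
```

The expanded term list would then be handed to the search index, so that documents using a synonym or a more specific term are still found.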
31. Jurion – Autosuggest from dedicated knowledge domain database
Domain knowledge in PoolParty is the basis for autocomplete; no keywords, but detailed legal concepts are offered
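A minimal sketch of concept-based autosuggest, assuming the vocabulary has been exported as a label-to-concept table. The labels and concept identifiers are hypothetical, and the prefix lookup is a simple stand-in rather than the mechanism used in Jurion.

```python
from bisect import bisect_left

# Hypothetical export of a legal vocabulary: label -> concept id.
# Several labels may point to the same concept (synonyms).
LABEL_TO_CONCEPT = {
    "tax law": "c:tax-law",
    "tenancy law": "c:tenancy-law",
    "law of tenancy": "c:tenancy-law",
    "tenant protection": "c:tenant-protection",
}
SORTED_LABELS = sorted(LABEL_TO_CONCEPT)


def suggest(prefix, limit=5):
    """Return up to `limit` (label, concept id) pairs whose label starts with prefix."""
    prefix = prefix.lower()
    start = bisect_left(SORTED_LABELS, prefix)  # first label >= prefix
    suggestions = []
    for label in SORTED_LABELS[start:]:
        if not label.startswith(prefix):
            break  # sorted order: no later label can match
        suggestions.append((label, LABEL_TO_CONCEPT[label]))
        if len(suggestions) == limit:
            break
    return suggestions


print(suggest("ten"))
```

Because each suggestion carries a concept identifier rather than a bare string, picking "law of tenancy" or "tenancy law" leads to the same concept.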
32. PoolParty for Metadata Storage and Development
Tool for storing the domain knowledge vocabulary; independent of content and metadata database; sound basis for applied knowledge management
33. Pebbles for Additional Metadata Assignment
Vocabulary maintained in PoolParty is assigned to content via an editorial workflow; additional free metadata can also be applied
34. Pebbles as a means to include external knowledge
Leveraging the external knowledge available in the Semantic Web; automatic inclusion of e.g. synonyms, definitions and references
35. Linked Data Publishing
vocabulary.wolterskluwer.de
36. Cooperation between SWC and WKD
Metadata Management
Text Mining
Data Integration
Semantic Search
Thesaurus Management
Knowledge Extraction
Knowledge Model Creation
Knowledge Model Maintenance
Knowledge Model Development
Open Data Usage
Linked Data Usage
Wolters Kluwer
Semantic Web Company
39. Enterprise Terminologies: An Explicit Metadata Layer
• Metadata are stored and processed separately from data
• Metadata management is part of the enterprise information management strategy
[Figure: a metadata layer spanning HR, Marketing/Sales, Research and Production]
40. Linked enterprise vocabularies are the backbone for a semantic infrastructure
Information integration on semantic level
Application (integrated views)
[Figure: concepts from different departments connected by "broader", "related" and "match" links]
Concept URIs shown: http://company.com/research/1452, http://company.com/production/729, http://company.com/regions/Belgium, http://company.com/regions/Benelux
Labels shown: Lean manufacturing, Lean production
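How such links produce an integrated view can be sketched as follows. The four company.com URIs are taken from the slide; which relation connects which pair is an assumption here (a match between the research and production concepts, Benelux as broader than Belgium), and the document names and their tags are invented.

```python
# Assumed links between the concept URIs shown on the slide.
MATCH = {("http://company.com/research/1452", "http://company.com/production/729")}
BROADER = {"http://company.com/regions/Belgium": "http://company.com/regions/Benelux"}

# Invented documents from different departments, tagged with concept URIs.
DOCS = {
    "research-report.pdf": {"http://company.com/research/1452"},
    "shopfloor-manual.doc": {"http://company.com/production/729"},
    "belgium-sales.xls": {"http://company.com/regions/Belgium"},
}


def equivalents(uri):
    """The concept itself plus concepts directly matched to it (either direction)."""
    same = {uri}
    for a, b in MATCH:
        if a == uri:
            same.add(b)
        if b == uri:
            same.add(a)
    return same


def narrower(uri):
    """Concepts that name `uri` as their broader concept (one level only)."""
    return {child for child, parent in BROADER.items() if parent == uri}


def integrated_view(uri):
    """Documents tagged with the concept, a matched concept, or a narrower one."""
    wanted = equivalents(uri) | narrower(uri)
    return sorted(doc for doc, tags in DOCS.items() if tags & wanted)


print(integrated_view("http://company.com/research/1452"))
print(integrated_view("http://company.com/regions/Benelux"))
```

The sketch follows links one step only; a fuller version would traverse matches and the hierarchy transitively.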
41. A Business Model for Publishers?
Experienced publishers can provide support in each of these steps:
1. Publishers have expertise in their specific domain and can support others with this knowledge about adequate concepts and their usage.
2. Publishers can advise partners or customers on the processes involved in creating standardized data or transforming existing data into the desired format.
3. Publishers can take over the creation of taxonomies or thesauri by using existing resources or engaging their internal network of domain experts.
4. Publishers can support enrichment by planning and executing the linking with external (cloud) or internal (publisher's) resources, and by managing the quality of the linking.
5. Curation can be executed manually or automatically by specialized tools. Publishers might have more experience in improving data quality and have appropriate tools at hand.
6. The value of controlled vocabularies lies in the internal structural processes. They can improve the functionality of applications or enable additional services and even completely new applications. Publishers can help to exploit the potential of these data and to monetize the advantages of already existing applications by introducing proper showcases.
7. Maintenance is also an important topic that has to be taken into account, as language, data and information change over time. This service can be offered by publishers.
Publishers could therefore support the implementation of external linked data infrastructures through process consulting and content expertise.
Source: A systemic perspective on linked open vocabularies (Blumauer, Dirschl, Eck, Pellegrini)