Ontological representation of the telecom domain for advanced
AI applications
Felix Burkhardt, Joachim Stegmann: Deutsche Telekom AG
Till Plumbaum, Christian Sauer: DAI (Distributed Artificial Intelligence) Labor
Tilman Becker, Michael Feld: DFKI (German Research Center for AI)
Creating an ontology in the telecom domain
Overview
• Research and development project together with scientific partners
• DAI-Labor, Distributed Artificial Intelligence Laboratory, Berlin
• DFKI (German research center for AI), Saarbrücken
Contents
• Motivation
• Ontology Creation:
• Data sources
• Manual creation of upper ontology
• Retrieving suggestions for new concepts, synonyms, relations
• Comparing traditional machine learning with DNNs
• Ontology storage and maintenance
• Natural language interface
• Translating natural-language queries to SPARQL, answer generation
• Application examples
• Audio based semantic search, Chatbot integration
• Summary and Outlook
Motivation
• Statistical data mining is successful but has its limits
• Rule-based systems can be used when training data is sparse
• Data-based and rule-based approaches can be combined when rules are learned from data
• Modeling semantic knowledge explicitly can help knowledge retrieval applications
• One example is disambiguation for question answering in "chatbots", i.e. automatic dialog systems
Why are we developing our own ontology?
• An ontology is deeply connected to the company's knowledge and internal data and can exist in a vendor-independent format
• Separating the ontology from the supplier reduces dependence on a single supplier
• One common ontology about a company's domain can be updated from common data sources and reused by different applications
Data sources
• Several data sources are harvested:
• Forum posts
• Official website
• XML files (product specs)
• Chat logs
Manual creation of upper ontology
• A general ontology for a domain as big as the telecom domain is a challenge
• It needs to cover different areas, such as sales, infrastructure, customer support and many more
• Two design decisions were made:
1. Concentrate on one area after another, not all at once
• Starting with the customer support use case
2. Create an ontology/taxonomy for each area; relations between different areas are added when needed
• Each area is created by area experts and ontology experts
Figure: Early version of the ontology showing the broad concepts included
Retrieving suggestions for new concepts
• After the manual creation of the upper structure, we used crawler techniques to gather more concepts automatically
• Different sources were used
• Telekom Hilft Forum
• Telekom Product Website
• XML Data
• For each source we created a specialized crawler
• New sub-concepts, e.g. new VR hardware from the Telekom Product Website as part of the general Home concept
• New attributes from product data (e.g. XML), such as 5G as a new transmission speed
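The attribute harvesting from product XML can be illustrated with a minimal sketch. The XML layout, the element and attribute names, and the function `suggest_new_values` are all hypothetical, since the slides do not show the actual product export format; the sketch only shows the idea of comparing crawled values against what the ontology already knows.

```python
import xml.etree.ElementTree as ET

# Hypothetical product export; the real Telekom XML schema is not shown in the slides.
PRODUCT_XML = """
<products>
  <product name="ExamplePhoneA">
    <attribute name="transmissionSpeed">LTE</attribute>
    <attribute name="color">black</attribute>
  </product>
  <product name="ExamplePhoneB">
    <attribute name="transmissionSpeed">5G</attribute>
    <attribute name="color">silver</attribute>
  </product>
</products>
"""

# Attribute values assumed to be in the ontology already (illustrative only).
KNOWN_VALUES = {
    "transmissionSpeed": {"3G", "LTE"},
    "color": {"black", "silver"},
}


def suggest_new_values(xml_text, known):
    """Return sorted (attribute, value) pairs that are not yet known."""
    suggestions = set()
    root = ET.fromstring(xml_text)
    for product in root.iter("product"):
        for attr in product.iter("attribute"):
            name = attr.get("name")
            value = (attr.text or "").strip()
            if value and value not in known.get(name, set()):
                suggestions.add((name, value))
    return sorted(suggestions)


if __name__ == "__main__":
    for name, value in suggest_new_values(PRODUCT_XML, KNOWN_VALUES):
        print(f"suggest new value for {name}: {value}")
```

In this sketch the crawler only proposes candidates; whether a suggestion is accepted into the ontology would remain an editorial decision.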
• New concepts retrieved for the Home (Zuhause) concept include, for example, sub-concepts like EntertainTV or Geräte (Devices)
• These sub-concepts are then populated with devices and device information (also automated by crawling the relevant sources)
Figure: Concept Zuhause with automatically added sub-concepts
Figure: Sub-concept WLANundRouter with automatically added entities
Retrieving suggestions for new synonyms
• Customers tend to use different words for the same thing (synonyms)
• Our ontology should cover all those different words
• Important e.g. in a search use case
• To retrieve the synonyms (and also misspellings), we used shallow neural networks (word2vec) and fastText to learn from a large corpus of user conversations
• The corpus is the Telekom Hilft Forum with over 2 million text snippets
Figure: word2vec and fastText deliver different views on a concept
The ML approach allows us to get information about as yet unknown devices.
Word2Vec: based on Deep Learning for J, skip-gram version.
Word2Vec operates on words, fastText on characters, so fastText finds similar terms even for unknown words.
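Why a character-based model copes with unknown words can be shown with a small sketch. This is not fastText itself: it replaces the learned subword embeddings with a plain Jaccard overlap of character n-grams, and the vocabulary and the misspelling "routr" are made-up examples. The n-gram range of 3 to 6 and the `<`/`>` word boundary markers follow fastText's documented defaults.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word padded with boundary markers."""
    padded = f"<{word.lower()}>"
    return {
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    }


def ngram_similarity(a, b):
    """Jaccard overlap of the character n-gram sets (stand-in for embedding similarity)."""
    grams_a, grams_b = char_ngrams(a), char_ngrams(b)
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0


# Hypothetical vocabulary seen during training.
VOCABULARY = ["router", "vertrag", "rechnung"]


def closest_known_term(unknown_word):
    """A word-based lookup fails for unseen words; n-gram overlap still finds a neighbor."""
    return max(VOCABULARY, key=lambda term: ngram_similarity(unknown_word, term))


if __name__ == "__main__":
    print("routr" in VOCABULARY)        # word-based lookup: not found
    print(closest_known_term("routr"))  # character-based: nearest known term
```

A word-level model has no vector at all for "routr", while the subword view still relates it to "router" through shared fragments such as "rou" and "out".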
Comparing traditional machine learning with DNNs
• We investigated topic classification of the forum posts
• Compared "classical machine learning" with deep neural networks
• Both reached 55% accuracy, and 83% under the "one in three" criterion
• Also investigated subclustering with a DNN (4 subclusters per category)
Figure: Comparison. Labels: Apache Lucene preprocessing; NER / disambiguation; Multinomial Naive Bayes; Deep Temporal Convolutional Neural Network *
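The two figures quoted for the classifiers can be related with a short sketch. It assumes that "one in three" means the correct topic is among the three highest-ranked predictions (top-3 accuracy); the slides do not define the term, and the topic labels and rankings below are invented toy data, not the project's results.

```python
def top_k_accuracy(ranked_predictions, gold_labels, k):
    """Fraction of items whose gold label is among the first k ranked predictions."""
    if not gold_labels:
        raise ValueError("gold_labels must not be empty")
    hits = sum(
        1
        for ranked, gold in zip(ranked_predictions, gold_labels)
        if gold in ranked[:k]
    )
    return hits / len(gold_labels)


# Toy data: one ranked topic list per forum post (best guess first).
RANKED = [
    ["billing", "router", "mobile"],
    ["router", "tv", "billing"],
    ["tv", "mobile", "router"],
    ["mobile", "billing", "tv"],
]
GOLD = ["billing", "tv", "router", "contract"]

if __name__ == "__main__":
    print("accuracy (top 1):", top_k_accuracy(RANKED, GOLD, 1))
    print("one in three (top 3):", top_k_accuracy(RANKED, GOLD, 3))
```

Under this reading, the relaxed criterion is always at least as high as plain accuracy, which is consistent with the 55% and 83% reported above.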
Adding arbitrary relations
• Not done automatically yet.
• Term candidates from natural language harvesting might of course become alt-labels for relations
• For now, relations are added manually, derived from application use cases
Ontology storage and maintenance
• Started with Protégé and have now switched to PoolParty
• Scalability
• Interfaces
• NLP integration
• Maintenance
Translating natural-language queries to SPARQL
• Design time: The NL model generator builds example sentences from question templates and ontology entities
• Runtime: The Nuance Mix model processes the input sentence. The extracted parameters are converted into a SPARQL query and executed on the ontology. Results are converted back to text.
Figure: Architecture of the natural language interface.
Static generation: Ontology and Q&A Templates (Classification) feed the NL Model Generator, which produces the NL Model (Nuance .trsx file) and the entities list (cloud upload / download).
Dialog runtime: User input (text/speech), e.g. "Which smartphones have a changeable battery?", passes through NLU via Mix API (JSON), query generation (SPARQL query for response), facts and answer generation to the chatbot output, e.g. "The iPhone7 and the Samsung Galaxy S7".
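The design-time step, building example sentences from question templates and ontology entities, might look roughly like the following sketch. The sentence patterns, slot names and entity lists are hypothetical, and the output is a plain list of (intent, sentence) pairs; writing the actual Nuance .trsx file is not attempted here.

```python
from itertools import product

# Hypothetical sentence patterns per intent (intent = Q&A template ID).
QUESTION_TEMPLATES = {
    "SCOWWS": [
        "Which {subject} have a {object} {predicate}?",
        "Show me all {subject} with a {object} {predicate}",
    ],
}

# Hypothetical entity labels exported from the ontology, grouped by slot.
ENTITIES = {
    "subject": ["smartphones", "tablets"],
    "predicate": ["battery"],
    "object": ["changeable"],
}


def generate_examples(templates, entities):
    """Fill every pattern with every combination of slot values."""
    slots = sorted(entities)
    examples = []
    for intent, patterns in templates.items():
        for pattern in patterns:
            for combo in product(*(entities[slot] for slot in slots)):
                sentence = pattern.format(**dict(zip(slots, combo)))
                examples.append((intent, sentence))
    return examples


if __name__ == "__main__":
    for intent, sentence in generate_examples(QUESTION_TEMPLATES, ENTITIES):
        print(intent, "|", sentence)
```

Because the number of sentences grows with the product of all slot sizes, a real generator would probably sample combinations instead of enumerating them all.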
NL model example
• The Nuance Mix web interface allows the definition of intents and parameters.
• For each intent, several example sentences should be provided.
• Here, the intents are the different Q&A templates (question types).
Answer generation
• Upper part: The Q&A template database defines what the SPARQL queries look like for different linguistic question structures.
• Lower part: An example question is executed, and the intermediate and final results are presented.

Q&A template list (intent ID + SPARQL template + parameters):
SCOWWS_SELECT ?s WHERE {?s rdf:type/rdfs:subClassOf* dtag:%s. ?s dtag:%s ?v. FILTER(?v = "%s").}_subject1_predicate1_object1
SCOWWI_SELECT ?s WHERE {?s rdf:type/rdfs:subClassOf* dtag:%s. ?s dtag:%s ?v. FILTER(REGEX(str(?v), "%s")).}_subject1_predicate1_object1
…
SCOWWS: Question by Subject where Subject = Class; WITHOUT Computation; WITH Predicate, WITH Object (String)

Example run:
1) Ask question [via text or speech input]
2) Evaluation using the Mix.nlu model via WebSocket (NLU service): extract intent + concepts
JSON
"interpretations": [{
"action": {"intent": {"value": "SCOWWS"}},
"concepts": {
"object1": [{"literal": "changeable", "value": "changeable"}],
"predicate1": [{"literal": "battery", "value": "battery"}],
"subject1": [{"literal": "Smartphones", "value": "Smartphone"}]},
"literal": "Which smartphones have a changeable battery"}]
3) Reading question classification (intent ID + SPARQL template + parameters): based on the intent ID, create the final SPARQL query via the SPARQL template
SPARQL query
SELECT ?s WHERE {?s rdf:type/rdfs:subClassOf* dtag:Smartphone. ?s dtag:battery ?v. FILTER(?v = "changeable").}
4) Running the query on the ontology (via Jena)
Answer
dtag:Alcatel2051silver
dtag:SamsungGalaxyJ52016black
dtag:SamsungGalaxyXcover3SMG389Fsilver
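The runtime steps shown on this slide (reading the intent and concepts from the NLU result, filling the SPARQL template, and turning the result list into text) can be sketched as follows. The table layout, the function names and the value check are assumptions made for illustration; the slides state that the real system runs the query via Jena, which is left out here, so the sketch stops at the query string and renders a given result list.

```python
import json
import re

# Template table keyed by intent ID: (SPARQL template, ordered parameter names).
# The two templates are taken from the slide; the table layout is an assumption.
TEMPLATES = {
    "SCOWWS": (
        'SELECT ?s WHERE {?s rdf:type/rdfs:subClassOf* dtag:%s. '
        '?s dtag:%s ?v. FILTER(?v = "%s").}',
        ["subject1", "predicate1", "object1"],
    ),
    "SCOWWI": (
        'SELECT ?s WHERE {?s rdf:type/rdfs:subClassOf* dtag:%s. '
        '?s dtag:%s ?v. FILTER(REGEX(str(?v), "%s")).}',
        ["subject1", "predicate1", "object1"],
    ),
}

# NLU result in the shape shown on the slide.
NLU_RESULT = json.loads("""
{"interpretations": [{
  "action": {"intent": {"value": "SCOWWS"}},
  "concepts": {
    "object1": [{"literal": "changeable", "value": "changeable"}],
    "predicate1": [{"literal": "battery", "value": "battery"}],
    "subject1": [{"literal": "Smartphones", "value": "Smartphone"}]},
  "literal": "Which smartphones have a changeable battery"}]}
""")


def build_query(nlu_result):
    """Pick the template for the recognized intent and fill in the concept values."""
    interpretation = nlu_result["interpretations"][0]
    intent = interpretation["action"]["intent"]["value"]
    template, parameter_names = TEMPLATES[intent]
    values = []
    for name in parameter_names:
        value = interpretation["concepts"][name][0]["value"]
        # Assumed safeguard: only plain tokens may be pasted into the query text.
        if not re.fullmatch(r"[\w .-]+", value):
            raise ValueError(f"unsafe value for {name}: {value!r}")
        values.append(value)
    return template % tuple(values)


def render_answer(entities):
    """Turn a list of prefixed result names into a short text answer."""
    names = [entity.split(":", 1)[-1] for entity in entities]
    if not names:
        return "No matching entries found."
    if len(names) == 1:
        return names[0]
    return ", ".join(names[:-1]) + " and " + names[-1]


if __name__ == "__main__":
    print(build_query(NLU_RESULT))
    print(render_answer(["dtag:Alcatel2051silver", "dtag:SamsungGalaxyJ52016black"]))
```

Keeping the templates in a table keyed by intent ID means a new question type only needs a new entry and matching example sentences, without changes to the query-building code.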
Application examples: Audio based semantic search
• Human agents are supported by intelligent content suggestions
• Project highlights
• Intelligent Q&A is highly appreciated by DT agents
• A full export of the product XML data provides a good precondition for AI-based answer generation
• Expansion of the AI approach to support DT social media agents with recommended answers for several customer requests (Facebook & Twitter)
• Learnings:
• System performance depends on the quality of the data material (structured vs. unstructured)
• Social media data are highly unstructured, so time is needed for manual preprocessing; fully automated clustering and preprocessing is still under development
Application examples: Chatbot integration
Summary and Outlook
• We created an RDF-based ontology in the telecom domain to support our AI activities.
• It draws on several in-domain data sources such as product descriptions, chat logs and help forum posts
• On the one hand, the ontology will be queried directly by SPARQL queries that are derived from natural language searches.
• On the other hand, it is the semantic basis for a variety of other applications such as
• semantic search,
• agent content assistance,
• virtual digital assistant,
• social media mining and
• intelligent chatbot.
• The advantages of maintaining our own centralized ontology are manifold: by storing the knowledge in an open standard format
• we strengthen our independence from proprietary technology,
• can keep parts of the data private and on-site,
• and re-use the data more easily.
Thanks
Felix.Burkhardt@telekom.de

Más contenido relacionado

La actualidad más candente

II-SDV 2012 Expert System Driven Insights into Patent Quality and Competitive...
II-SDV 2012 Expert System Driven Insights into Patent Quality and Competitive...II-SDV 2012 Expert System Driven Insights into Patent Quality and Competitive...
II-SDV 2012 Expert System Driven Insights into Patent Quality and Competitive...
Dr. Haxel Consult
 
II-SDV 2012 Towards Unified Access Systems for Data Exploration
II-SDV 2012 Towards Unified Access Systems for Data ExplorationII-SDV 2012 Towards Unified Access Systems for Data Exploration
II-SDV 2012 Towards Unified Access Systems for Data Exploration
Dr. Haxel Consult
 
II-SDV 2012 Patent Prior-Art Searching with Latent Semantic Analysis
II-SDV 2012 Patent Prior-Art Searching with Latent Semantic AnalysisII-SDV 2012 Patent Prior-Art Searching with Latent Semantic Analysis
II-SDV 2012 Patent Prior-Art Searching with Latent Semantic Analysis
Dr. Haxel Consult
 
ICIC 2013 Conference Proceedings Ricardo Eito Brun Uni Madrid
ICIC 2013 Conference Proceedings Ricardo Eito Brun Uni MadridICIC 2013 Conference Proceedings Ricardo Eito Brun Uni Madrid
ICIC 2013 Conference Proceedings Ricardo Eito Brun Uni Madrid
Dr. Haxel Consult
 
AI-SDV 2021 - Holger Keibel; Daniele Puccinelli - Leveraging pre-trained lang...
AI-SDV 2021 - Holger Keibel; Daniele Puccinelli - Leveraging pre-trained lang...AI-SDV 2021 - Holger Keibel; Daniele Puccinelli - Leveraging pre-trained lang...
AI-SDV 2021 - Holger Keibel; Daniele Puccinelli - Leveraging pre-trained lang...
Dr. Haxel Consult
 
II-SDV 2012 Automatic Query Re-Ranking in a Patent Database by Local Frequenc...
II-SDV 2012 Automatic Query Re-Ranking in a Patent Database by Local Frequenc...II-SDV 2012 Automatic Query Re-Ranking in a Patent Database by Local Frequenc...
II-SDV 2012 Automatic Query Re-Ranking in a Patent Database by Local Frequenc...
Dr. Haxel Consult
 

La actualidad más candente (20)

II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...
II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...
II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...
 
II-SDV 2012 Expert System Driven Insights into Patent Quality and Competitive...
II-SDV 2012 Expert System Driven Insights into Patent Quality and Competitive...II-SDV 2012 Expert System Driven Insights into Patent Quality and Competitive...
II-SDV 2012 Expert System Driven Insights into Patent Quality and Competitive...
 
PoolParty 6.0 - Climbing the Semantic Ladder
PoolParty 6.0 - Climbing the Semantic LadderPoolParty 6.0 - Climbing the Semantic Ladder
PoolParty 6.0 - Climbing the Semantic Ladder
 
Semantic AI
Semantic AISemantic AI
Semantic AI
 
II-SDV 2017: Towards Semantic Search at the European Patent Office
II-SDV 2017: Towards Semantic Search at the European Patent OfficeII-SDV 2017: Towards Semantic Search at the European Patent Office
II-SDV 2017: Towards Semantic Search at the European Patent Office
 
IC-SDV 2019: Down-to-earth machine learning: What you always wanted your data...
IC-SDV 2019: Down-to-earth machine learning: What you always wanted your data...IC-SDV 2019: Down-to-earth machine learning: What you always wanted your data...
IC-SDV 2019: Down-to-earth machine learning: What you always wanted your data...
 
II-SDV 2012 Towards Unified Access Systems for Data Exploration
II-SDV 2012 Towards Unified Access Systems for Data ExplorationII-SDV 2012 Towards Unified Access Systems for Data Exploration
II-SDV 2012 Towards Unified Access Systems for Data Exploration
 
What can linked data do for digital libraries
What can linked data do for digital librariesWhat can linked data do for digital libraries
What can linked data do for digital libraries
 
AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...
AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...
AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...
 
RDF and OWL : the powerful duo | Tara Raafat
RDF and OWL : the powerful duo | Tara RaafatRDF and OWL : the powerful duo | Tara Raafat
RDF and OWL : the powerful duo | Tara Raafat
 
Semantic Technology in Publishing & Finance
Semantic Technology in Publishing & FinanceSemantic Technology in Publishing & Finance
Semantic Technology in Publishing & Finance
 
II-SDV 2012 Patent Prior-Art Searching with Latent Semantic Analysis
II-SDV 2012 Patent Prior-Art Searching with Latent Semantic AnalysisII-SDV 2012 Patent Prior-Art Searching with Latent Semantic Analysis
II-SDV 2012 Patent Prior-Art Searching with Latent Semantic Analysis
 
Sebastian Hellmann
Sebastian HellmannSebastian Hellmann
Sebastian Hellmann
 
Linked data for Enterprise Data Integration
Linked data for Enterprise Data IntegrationLinked data for Enterprise Data Integration
Linked data for Enterprise Data Integration
 
Big Data: Big Issues for IP
Big Data: Big Issues for IPBig Data: Big Issues for IP
Big Data: Big Issues for IP
 
Detecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and LinkuriousDetecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and Linkurious
 
ICIC 2013 Conference Proceedings Ricardo Eito Brun Uni Madrid
ICIC 2013 Conference Proceedings Ricardo Eito Brun Uni MadridICIC 2013 Conference Proceedings Ricardo Eito Brun Uni Madrid
ICIC 2013 Conference Proceedings Ricardo Eito Brun Uni Madrid
 
AI is Not Magic: It’s Time to Demystify and Apply Srinivasan Parthiban (VINGY...
AI is Not Magic: It’s Time to Demystify and Apply Srinivasan Parthiban (VINGY...AI is Not Magic: It’s Time to Demystify and Apply Srinivasan Parthiban (VINGY...
AI is Not Magic: It’s Time to Demystify and Apply Srinivasan Parthiban (VINGY...
 
AI-SDV 2021 - Holger Keibel; Daniele Puccinelli - Leveraging pre-trained lang...
AI-SDV 2021 - Holger Keibel; Daniele Puccinelli - Leveraging pre-trained lang...AI-SDV 2021 - Holger Keibel; Daniele Puccinelli - Leveraging pre-trained lang...
AI-SDV 2021 - Holger Keibel; Daniele Puccinelli - Leveraging pre-trained lang...
 
II-SDV 2012 Automatic Query Re-Ranking in a Patent Database by Local Frequenc...
II-SDV 2012 Automatic Query Re-Ranking in a Patent Database by Local Frequenc...II-SDV 2012 Automatic Query Re-Ranking in a Patent Database by Local Frequenc...
II-SDV 2012 Automatic Query Re-Ranking in a Patent Database by Local Frequenc...
 

Similar a Session 2.1 ontological representation of the telecom domain for advanced ai applications

A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...
A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...
A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...
Natalia Díaz Rodríguez
 
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
eswcsummerschool
 
S. Bartoli & F. Pompermaier – A Semantic Big Data Companion
S. Bartoli & F. Pompermaier – A Semantic Big Data CompanionS. Bartoli & F. Pompermaier – A Semantic Big Data Companion
S. Bartoli & F. Pompermaier – A Semantic Big Data Companion
Flink Forward
 

Similar a Session 2.1 ontological representation of the telecom domain for advanced ai applications (20)

AI-SDV 2021: Stefan Geissler - AI support for creating and maintaining vocabu...
AI-SDV 2021: Stefan Geissler - AI support for creating and maintaining vocabu...AI-SDV 2021: Stefan Geissler - AI support for creating and maintaining vocabu...
AI-SDV 2021: Stefan Geissler - AI support for creating and maintaining vocabu...
 
GenerativeAI and Automation - IEEE ACSOS 2023.pptx
GenerativeAI and Automation - IEEE ACSOS 2023.pptxGenerativeAI and Automation - IEEE ACSOS 2023.pptx
GenerativeAI and Automation - IEEE ACSOS 2023.pptx
 
Application of Library Management Software: NewGenLib
Application of Library Management Software: NewGenLibApplication of Library Management Software: NewGenLib
Application of Library Management Software: NewGenLib
 
open source nn frameworks on cellphones
open source nn frameworks on cellphonesopen source nn frameworks on cellphones
open source nn frameworks on cellphones
 
A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...
A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...
A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...
 
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
 
Large Language Models Bootcamp
Large Language Models BootcampLarge Language Models Bootcamp
Large Language Models Bootcamp
 
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerAutomatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
 
Maruti gollapudi cv
Maruti gollapudi cvMaruti gollapudi cv
Maruti gollapudi cv
 
An Introduction to Semantic Web Technology
An Introduction to Semantic Web TechnologyAn Introduction to Semantic Web Technology
An Introduction to Semantic Web Technology
 
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
 
RuleML2015 - Tutorial - Powerful Practical Semantic Rules in Rulelog - Funda...
RuleML2015 - Tutorial -  Powerful Practical Semantic Rules in Rulelog - Funda...RuleML2015 - Tutorial -  Powerful Practical Semantic Rules in Rulelog - Funda...
RuleML2015 - Tutorial - Powerful Practical Semantic Rules in Rulelog - Funda...
 
S. Bartoli & F. Pompermaier – A Semantic Big Data Companion
S. Bartoli & F. Pompermaier – A Semantic Big Data CompanionS. Bartoli & F. Pompermaier – A Semantic Big Data Companion
S. Bartoli & F. Pompermaier – A Semantic Big Data Companion
 
WP3 Further specification of Functionality and Interoperability - Gradmann / ...
WP3 Further specification of Functionality and Interoperability - Gradmann / ...WP3 Further specification of Functionality and Interoperability - Gradmann / ...
WP3 Further specification of Functionality and Interoperability - Gradmann / ...
 
Mobile Multi-domain Search over Structured Web Data
Mobile Multi-domain Search over Structured Web DataMobile Multi-domain Search over Structured Web Data
Mobile Multi-domain Search over Structured Web Data
 
Building and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache AirflowBuilding and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache Airflow
 
A Brief History of e-Learning Standards in the United States
A Brief History of e-Learning Standards in the United StatesA Brief History of e-Learning Standards in the United States
A Brief History of e-Learning Standards in the United States
 
Software Mining and Software Datasets
Software Mining and Software DatasetsSoftware Mining and Software Datasets
Software Mining and Software Datasets
 
Ef overview
Ef overviewEf overview
Ef overview
 
2015 bioinformatics python_introduction_wim_vancriekinge_vfinal
2015 bioinformatics python_introduction_wim_vancriekinge_vfinal2015 bioinformatics python_introduction_wim_vancriekinge_vfinal
2015 bioinformatics python_introduction_wim_vancriekinge_vfinal
 

Más de semanticsconference

Más de semanticsconference (20)

Linear books to open world adventure
Linear books to open world adventureLinear books to open world adventure
Linear books to open world adventure
 
Session 1.2 high-precision, context-free entity linking exploiting unambigu...
Session 1.2   high-precision, context-free entity linking exploiting unambigu...Session 1.2   high-precision, context-free entity linking exploiting unambigu...
Session 1.2 high-precision, context-free entity linking exploiting unambigu...
 
Session 4.3 semantic annotation for enhancing collaborative ideation
Session 4.3   semantic annotation for enhancing collaborative ideationSession 4.3   semantic annotation for enhancing collaborative ideation
Session 4.3 semantic annotation for enhancing collaborative ideation
 
Session 1.1 dalicc - data licenses clearance center
Session 1.1   dalicc - data licenses clearance centerSession 1.1   dalicc - data licenses clearance center
Session 1.1 dalicc - data licenses clearance center
 
Session 1.3 context information management across smart city knowledge domains
Session 1.3   context information management across smart city knowledge domainsSession 1.3   context information management across smart city knowledge domains
Session 1.3 context information management across smart city knowledge domains
 
Session 0.0 aussenac semanticsnl-pwebsem2017-v4
Session 0.0   aussenac semanticsnl-pwebsem2017-v4Session 0.0   aussenac semanticsnl-pwebsem2017-v4
Session 0.0 aussenac semanticsnl-pwebsem2017-v4
 
Session 0.0 keynote sandeep sacheti - final hi res
Session 0.0   keynote sandeep sacheti - final hi resSession 0.0   keynote sandeep sacheti - final hi res
Session 0.0 keynote sandeep sacheti - final hi res
 
Session 1.1 linked data applied: a field report from the netherlands
Session 1.1   linked data applied: a field report from the netherlandsSession 1.1   linked data applied: a field report from the netherlands
Session 1.1 linked data applied: a field report from the netherlands
 
Session 1.2 enrich your knowledge graphs: linked data integration with pool...
Session 1.2   enrich your knowledge graphs: linked data integration with pool...Session 1.2   enrich your knowledge graphs: linked data integration with pool...
Session 1.2 enrich your knowledge graphs: linked data integration with pool...
 
Session 1.4 connecting information from legislation and datasets using a ca...
Session 1.4   connecting information from legislation and datasets using a ca...Session 1.4   connecting information from legislation and datasets using a ca...
Session 1.4 connecting information from legislation and datasets using a ca...
 
Session 1.4 a distributed network of heritage information
Session 1.4   a distributed network of heritage informationSession 1.4   a distributed network of heritage information
Session 1.4 a distributed network of heritage information
 
Session 0.0 media panel - matthias priem - gtuo - semantics 2017
Session 0.0   media panel - matthias priem - gtuo - semantics 2017Session 0.0   media panel - matthias priem - gtuo - semantics 2017
Session 0.0 media panel - matthias priem - gtuo - semantics 2017
 
Session 1.3 semantic asset management in the dutch rail engineering and con...
Session 1.3   semantic asset management in the dutch rail engineering and con...Session 1.3   semantic asset management in the dutch rail engineering and con...
Session 1.3 semantic asset management in the dutch rail engineering and con...
 
Session 1.3 energy, smart homes & smart grids: towards interoperability...
Session 1.3   energy, smart homes & smart grids: towards interoperability...Session 1.3   energy, smart homes & smart grids: towards interoperability...
Session 1.3 energy, smart homes & smart grids: towards interoperability...
 
Session 1.2 improving access to digital content by semantic enrichment
Session 1.2   improving access to digital content by semantic enrichmentSession 1.2   improving access to digital content by semantic enrichment
Session 1.2 improving access to digital content by semantic enrichment
 
Session 2.3 semantics for safeguarding & security – a police story
Session 2.3   semantics for safeguarding & security – a police storySession 2.3   semantics for safeguarding & security – a police story
Session 2.3 semantics for safeguarding & security – a police story
 
Session 2.5 semantic similarity based clustering of license excerpts for im...
Session 2.5   semantic similarity based clustering of license excerpts for im...Session 2.5   semantic similarity based clustering of license excerpts for im...
Session 2.5 semantic similarity based clustering of license excerpts for im...
 
Session 4.2 unleash the triple: leveraging a corporate discovery interface....
Session 4.2   unleash the triple: leveraging a corporate discovery interface....Session 4.2   unleash the triple: leveraging a corporate discovery interface....
Session 4.2 unleash the triple: leveraging a corporate discovery interface....
 
Session 1.6 slovak public metadata governance and management based on linke...
Session 1.6   slovak public metadata governance and management based on linke...Session 1.6   slovak public metadata governance and management based on linke...
Session 1.6 slovak public metadata governance and management based on linke...
 
Session 5.6 towards a semantic outlier detection framework in wireless sens...
Session 5.6   towards a semantic outlier detection framework in wireless sens...Session 5.6   towards a semantic outlier detection framework in wireless sens...
Session 5.6 towards a semantic outlier detection framework in wireless sens...
 

Último

Cymulate (Breach and Attack Simulation).
Cymulate (Breach and Attack Simulation).Cymulate (Breach and Attack Simulation).
Cymulate (Breach and Attack Simulation).
luckyk1575
 

Último (12)

OC Streetcar Final Presentation-Downtown Santa Ana
OC Streetcar Final Presentation-Downtown Santa AnaOC Streetcar Final Presentation-Downtown Santa Ana
OC Streetcar Final Presentation-Downtown Santa Ana
 
Deciding The Topic of our Magazine.pptx.
Deciding The Topic of our Magazine.pptx.Deciding The Topic of our Magazine.pptx.
Deciding The Topic of our Magazine.pptx.
 
Understanding Poverty: A Community Questionnaire
Understanding Poverty: A Community QuestionnaireUnderstanding Poverty: A Community Questionnaire
Understanding Poverty: A Community Questionnaire
 
The Influence and Evolution of Mogul Press in Contemporary Public Relations.docx
The Influence and Evolution of Mogul Press in Contemporary Public Relations.docxThe Influence and Evolution of Mogul Press in Contemporary Public Relations.docx
The Influence and Evolution of Mogul Press in Contemporary Public Relations.docx
 
Breathing in New Life_ Part 3 05 22 2024.pptx
Breathing in New Life_ Part 3 05 22 2024.pptxBreathing in New Life_ Part 3 05 22 2024.pptx
Breathing in New Life_ Part 3 05 22 2024.pptx
 
DAY 0 8 A Revelation 05-19-2024 PPT.pptx
DAY 0 8 A Revelation 05-19-2024 PPT.pptxDAY 0 8 A Revelation 05-19-2024 PPT.pptx
DAY 0 8 A Revelation 05-19-2024 PPT.pptx
 
Cymulate (Breach and Attack Simulation).
Cymulate (Breach and Attack Simulation).Cymulate (Breach and Attack Simulation).
Cymulate (Breach and Attack Simulation).
 
ServiceNow CIS-Discovery Exam Dumps 2024
ServiceNow CIS-Discovery Exam Dumps 2024ServiceNow CIS-Discovery Exam Dumps 2024
ServiceNow CIS-Discovery Exam Dumps 2024
 
Oracle Database Administration I (1Z0-082) Exam Dumps 2024.pdf
Oracle Database Administration I (1Z0-082) Exam Dumps 2024.pdfOracle Database Administration I (1Z0-082) Exam Dumps 2024.pdf
Oracle Database Administration I (1Z0-082) Exam Dumps 2024.pdf
 
art integrated project of computer applications
art integrated project of computer applicationsart integrated project of computer applications
art integrated project of computer applications
 
ACM CHT Best Inspection Practices Kinben Innovation MIC Slideshare.pdf
ACM CHT Best Inspection Practices Kinben Innovation MIC Slideshare.pdfACM CHT Best Inspection Practices Kinben Innovation MIC Slideshare.pdf
ACM CHT Best Inspection Practices Kinben Innovation MIC Slideshare.pdf
 
05232024 Joint Meeting - Community Networking
05232024 Joint Meeting - Community Networking05232024 Joint Meeting - Community Networking
05232024 Joint Meeting - Community Networking
 

Session 2.1 ontological representation of the telecom domain for advanced ai applications

  • 1. Ontological representation of the telecom domain for advanced AI applications Felix Burkhardt, Joachim Stegmann: Deutsche Telekom AG Till Plumbaum, Christian Sauer: DAI (Distributed Artificial Intelligence) Labor Tilman Becker, Michael Feld: DFKI (German Research Center for AI)
  • 2. 2 Creating an ontology in the telecom domain Overview • Research and development project together with scientific partners • DAI-Labor, Distributed Artificial Intelligence Laboratory, Berlin • DFKI (German research center for AI), Saarbrücken
  • 3. 3 Creating an ontology in the telecom domain Contents • Motivation • Ontology Creation: • Datasources • Manual creation of upper ontology • Retrieving suggestions for new concepts, synonyms, relations • Comparing traditional machine learning with DNNs • Ontology storage and maintenance • Natural lanuguage interface • Translating Natural queries to SPARQL, Answer generation • Application examples • Audio based semantic search, Chatbot integration • Summary and Outlook
  • 4. 4 Creating an ontology in the telecom domain Motivation • Statistical data mining is successful but has its limits • Rule based systems can be used when training data is sparse • Data-based / rulse based can be combined when rules are learned from data • Modeling semantic knowledge explicitly can help knowledge retrieval applications • One example would be disambiguation for question answering in „chatbots“, i.e. automatic dialog systems
  • 5. Motivation: Why are we developing our own ontology? • An ontology is deeply connected to the company's knowledge and internal data and can exist in a vendor-independent format • Separating the ontology from the supplier lessens the dependence on any one supplier • One common ontology of a company's domain can be updated from common data sources and reused by different applications
  • 6. Data sources • Several data sources are harvested: • Forum posts • Official website • XML files (product specs) • Chat logs
  • 7. Manual creation of the upper ontology • A general ontology for a domain as big as the telecom domain is a challenge • It needs to cover different areas, such as sales, infrastructure, customer support and many more • Two design decisions were made: 1. Concentrate on one area after another, not all at once • Starting with the customer support use case 2. Create an ontology/taxonomy for each area; relations between different areas are added when needed • Each area is created by area experts and ontology experts [Figure: Early version of the ontology showing the broad concepts included]
  • 8. Retrieving suggestions for new concepts • After the manual creation of the upper structure, we used crawling techniques to gather more concepts automatically • Different sources were used • Telekom Hilft Forum • Telekom Product Website • XML data • For each source we created a specialized crawler • New sub-concepts, e.g. from the Telekom Product Website: new VR hardware as part of the general Home concept • New attributes from product data, e.g. from XML: 5G as a new transmission speed
  • 9. Retrieving suggestions for new concepts • New concepts retrieved for the Home (Zuhause) concept include, for example, sub-concepts like EntertainTV or Geräte (Devices) • These sub-concepts are then populated with devices and device information (also automated by crawling the relevant sources) [Figures: Concept Zuhause with automatically added sub-concepts; sub-concept WLANundRouter with automatically added entities]
  • 10. Retrieving suggestions for new synonyms • Customers tend to use different words for the same thing (synonyms) • Our ontology should cover all those different words • This is important in, e.g., a search use case • To retrieve the synonyms (and also misspellings) we used shallow neural nets (word2vec) and fastText to learn from a big corpus of user conversations • The corpus is the Telekom Hilft Forum with over 2 million text snippets [Figure: word2vec and fastText deliver different views on a concept]
  • 11. Retrieving suggestions for new synonyms • The ML approach allows us to get information for as yet unknown devices • Word2vec is based on Deeplearning4j, skip-gram version • Word2vec operates word-based, fastText character-based, so fastText finds similar terms even for unknown words
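The difference between word-based and character-based matching can be illustrated with a small sketch. This is not fastText itself and not the project's code: it replaces learned subword embeddings with a plain Jaccard overlap of character trigrams, and the vocabulary and the misspelled query are hypothetical examples.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, with boundary markers."""
    padded = f"<{word.lower()}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap of character n-grams (a crude stand-in for subword similarity)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical known terms and a hypothetical misspelling that is not among them.
vocabulary = {"router", "speedport", "smartphone"}
query = "speedprot"

# A purely word-based lookup has no entry for the unseen word ...
in_vocabulary = query in vocabulary

# ... while character n-grams still point to the closest known term.
best_match = max(vocabulary, key=lambda term: ngram_similarity(query, term))
```

A word-level model can only return neighbours for tokens it saw during training, whereas any new string shares some character n-grams with known terms, which is why the character-based view helps with misspellings and new device names.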
  • 12. Comparing traditional machine learning with DNNs • We investigated topic classification of the forum posts • Compared "classical machine learning" with deep neural nets • Both reached 55% accuracy and 83% "one in three" • Also investigated subclustering with a DNN (4 subclusters per category) [Diagram labels: Apache Lucene preprocessing; NER / disambiguation; Multinomial Naive Bayes; Deep Temporal Convolutional Neural Network; Comparison]
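The two figures on this slide can be read as top-1 and top-3 accuracy, assuming that "one in three" means the correct topic is among the three highest-ranked predictions; the slide does not define the term, so this reading is an assumption. A minimal sketch with hypothetical topic names and made-up rankings:

```python
def top_k_accuracy(ranked_predictions, true_labels, k=1):
    """Share of items whose true label is among the k highest-ranked predictions."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical classifier output: topics ordered from most to least likely.
ranked = [
    ["Mobilfunk", "Festnetz", "TV"],
    ["TV", "Mobilfunk", "Festnetz"],
    ["Festnetz", "TV", "Rechnung"],
    ["Rechnung", "Festnetz", "Mobilfunk"],
]
truth = ["Mobilfunk", "Festnetz", "Mobilfunk", "Rechnung"]

top1 = top_k_accuracy(ranked, truth, k=1)  # plain accuracy
top3 = top_k_accuracy(ranked, truth, k=3)  # "one in three" under the stated assumption
```

Under this reading, the top-3 figure is the more relevant one for use cases where a human agent is shown a short list of suggestions instead of a single answer.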
  • 13. Adding arbitrary relations • Not done automatically yet • Term candidates from natural language harvesting might become alt-labels for relations • For now, relations are added manually, derived from application use cases
  • 14. Ontology storage and maintenance • Started with Protégé and have now switched to PoolParty • Scalability • Interfaces • NLP integration • Maintenance
  • 15. Translating natural-language queries to SPARQL • Design time: the NL Model Generator builds example sentences from question templates and ontology entities • Runtime: the Nuance Mix model processes the input sentence. The extracted parameters are converted into a SPARQL query and executed on the ontology. Results are converted back to text. [Diagram labels: Static generation: Ontology, entities list, Q&A Templates (classification), NL Model Generator, NL Model (Nuance .trsx file), cloud upload / download. Dialog runtime: user input (text/speech), e.g. "Which smartphones have a changeable battery?"; NLU via Mix API (JSON); query generation (SPARQL query for response); facts; answer generation; chatbot output, e.g. "The iPhone7 and the Samsung Galaxy S7"]
  • 16. NL model example • The Nuance Mix web interface allows the definition of intents and parameters • For each intent, several example sentences should be provided • Here, the intents are the different Q&A templates (question types)
  • 17. Answer generation • Upper part: the Q&A template database defines how SPARQL queries look for different linguistic question structures • Lower part: an example question is executed, and the intermediate and final results are shown
    Q&A template list:
    SCOWWS_SELECT ?s WHERE {?s rdf:type/rdfs:subClassOf* dtag:%s. ?s dtag:%s ?v. FILTER(?v = "%s").}_subject1_predicate1_object1
    SCOWWI_SELECT ?s WHERE {?s rdf:type/rdfs:subClassOf* dtag:%s. ?s dtag:%s ?v. FILTER(REGEX(str(?v), "%s")).}_subject1_predicate1_object1
    …
    SCOWWS: question by subject where subject = class; WITHOUT computation; WITH predicate, WITH object (string)
    Processing steps: ask question (via text or speech input); evaluation using the Mix.nlu model via WebSocket (NLU service): 1) extract intent + concepts, 2) based on the intent ID, create the final SPARQL query via the SPARQL template (reading question classification: intent ID + SPARQL template + parameters); run the query on the ontology (via Jena)
    NLU result (JSON): "interpretations": [{ "action": {"intent": {"value": "SCOWWS"}}, "concepts": { "object1": [{"literal": "changeable", "value": "changeable"}], "predicate1": [{"literal": "battery", "value": "battery"}], "subject1": [{"literal": "Smartphones", "value": "Smartphone"}]}, "literal": "Which smartphones have a changeable battery"}]
    SPARQL query: SELECT ?s WHERE {?s rdf:type/rdfs:subClassOf* dtag:Smartphone. ?s dtag:battery ?v. FILTER(?v = "changeable").}
    Answer: dtag:Alcatel2051silver, dtag:SamsungGalaxyJ52016black, dtag:SamsungGalaxyXcover3SMG389Fsilver
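The template-filling step on this slide can be sketched as follows. The sketch only builds the query string; it does not call Nuance Mix or Jena. The JSON layout is a reconstruction from the slide and may differ from the real NLU response, the function name `build_query` is hypothetical, and the `dtag:` prefix declaration is omitted because its namespace is not given.

```python
import json

# Template table mirroring the slide: intent ID -> (SPARQL template, parameter order).
TEMPLATES = {
    "SCOWWS": (
        'SELECT ?s WHERE { ?s rdf:type/rdfs:subClassOf* dtag:%s . '
        '?s dtag:%s ?v . FILTER(?v = "%s") }',
        ("subject1", "predicate1", "object1"),
    ),
    "SCOWWI": (
        'SELECT ?s WHERE { ?s rdf:type/rdfs:subClassOf* dtag:%s . '
        '?s dtag:%s ?v . FILTER(REGEX(str(?v), "%s")) }',
        ("subject1", "predicate1", "object1"),
    ),
}


def build_query(nlu_json):
    """Pick the template for the recognised intent and fill in the concept values."""
    interpretation = json.loads(nlu_json)["interpretations"][0]
    intent = interpretation["action"]["intent"]["value"]
    template, order = TEMPLATES[intent]
    concepts = interpretation["concepts"]
    # Use the normalised "value" of each concept, in the order the template expects.
    values = tuple(concepts[name][0]["value"] for name in order)
    return template % values


# Assumed shape of the NLU result for the example question on the slide.
NLU_RESULT = json.dumps({
    "interpretations": [{
        "action": {"intent": {"value": "SCOWWS"}},
        "concepts": {
            "object1": [{"literal": "changeable", "value": "changeable"}],
            "predicate1": [{"literal": "battery", "value": "battery"}],
            "subject1": [{"literal": "Smartphones", "value": "Smartphone"}],
        },
        "literal": "Which smartphones have a changeable battery",
    }]
})

query = build_query(NLU_RESULT)
```

In a production setting the concept values would need to be validated or escaped before being inserted into the query string, since they originate from user input.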
  • 18. Application examples: Audio-based semantic search • Human agents are supported by intelligent content suggestions • Project highlights • Intelligent Q&A is highly appreciated by DT agents • Full export of product XML data provides a good precondition for AI-based answer generation • Expansion of the AI approach to support DT social media agents with recommended answers for several customer requests (Facebook & Twitter) • Learnings: • System performance depends on the quality of the data material (structured vs. unstructured) • Social media data are highly unstructured; time is needed for manual preprocessing, and fully automated clustering and preprocessing is still under development
  • 19. Application examples: Audio-based semantic search
  • 20. Application examples: Chatbot integration
  • 21. Summary and outlook • We created an RDF-based ontology in the telecom domain to support our AI activities • It draws on several in-domain data sources such as product descriptions, chat logs and help forum posts • On the one hand, the ontology will be queried directly by SPARQL queries that are derived from natural language searches • On the other hand, it is the semantic basis for a variety of other applications such as • semantic search, • agent content assistance, • virtual digital assistant, • social media mining and • intelligent chatbot • The advantages of maintaining our own centralized ontology are manifold: by storing the knowledge in an open standard format • we strengthen our independence from proprietary technology, • can keep parts of the data private and on-site, • and re-use the data more easily
  • 22. Thanks Felix.Burkhardt@telekom.de