Ontology-Based Data Access Mapping Generation
via Data, Schema, Query, and Mapping Knowledge
Pieter Heyvaert
pheyvaer.heyvaert@ugent.be
Semantic Web technologies rely on Linked Data
querying
visualizations
publishing
But not all data is accessible as Linked Data
databases
XML files
JSON files
Solutions to provide access exist
manual: completely done by the user
semi-automatic: users provide feedback
automatic: no user interaction required
But they have limitations
limited to specific use cases
limited support for complex use cases
PhD’s goal: improve access to Linked Data
Overview
problem
current solutions
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
How do we provide access?
non-Linked Data → ? → Linked Data
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
table: authors
Apply mappings on non-Linked Data
non-Linked Data → mapping → Linked Data
mapping: rules to generate RDF terms and triples using data and ontologies
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
table: authors
rule: create URL from id
rule: name is value for ex:fullname
rule: if genre is ‘fiction’
    class is ex:FictionAuthor
  else
    class is ex:NonFictionAuthor
generated Linked Data:
ex:0 a ex:FictionAuthor .
ex:0 ex:fullname "J.K. Rowling" .
ex:1 a ex:NonFictionAuthor .
ex:1 ex:fullname "George Orwell" .
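The three example rules can be sketched in a few lines of Python. This is only an illustration of what "applying a mapping" means for the authors table, not the mapping language or processor used in the PhD; the function name `map_author` and the tuple representation of triples are assumptions made for this sketch, and the `ex:` prefix is left unexpanded as on the slides.

```python
# Prefix as written on the slides; its full IRI is not expanded here.
PREFIX = "ex:"

# The example table "authors" as a list of rows.
authors = [
    {"id": 0, "name": "J.K. Rowling", "genre": "fiction"},
    {"id": 1, "name": "George Orwell", "genre": "non-fiction"},
]


def map_author(row):
    """Apply the three example rules to one row and return its triples."""
    # rule: create URL from id
    subject = PREFIX + str(row["id"])
    # rule: the class depends on the value of genre, not on the table
    if row["genre"] == "fiction":
        author_class = PREFIX + "FictionAuthor"
    else:
        author_class = PREFIX + "NonFictionAuthor"
    # rule: name is value for ex:fullname (emitted as a quoted literal)
    literal = '"' + row["name"] + '"'
    return [
        (subject, "a", author_class),
        (subject, PREFIX + "fullname", literal),
    ]


triples = [triple for row in authors for triple in map_author(row)]
for s, p, o in triples:
    print(s, p, o, ".")
```

The printed lines correspond to the four triples of the example, one class triple and one ex:fullname triple per author.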
Mappings need to be created
from scratch (single-scenario use case): mapping A
by reusing previous mappings (multi-scenario use case): mapping B + mapping C → mapping
(Semi-)automatic methods are preferred
mapping creation: manual or (semi-)automatic
Still a number of challenges left
dealing with complex data (schemas)
not all techniques work on single-scenario use cases
Dealing with complex data (schemas)
e.g., when the class of an entity does not depend on the table, but on a value
rule: if genre is ‘fiction’
    class is ex:FictionAuthor
  else
    class is ex:NonFictionAuthor
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
table: authors
Not all techniques work on single-scenario use cases
because they rely on readily available previous mappings
multi: the mapping of scenario A results in reuse for scenario B
single: no previous mapping (?) is available to reuse for scenario B
Current solutions
What knowledge is used?
How is this knowledge used?
What knowledge is not used?
What do current solutions use?
knowledge from the mapping process
existing knowledge outside the mapping process
Knowledge from mapping process is used
data
data schema
ontologies
not all elements are required
Existing knowledge is used
data
data schemas
mappings
ontologies
Linked Data
not all elements are required
How is all this knowledge used?
data schema + existing ontology
data + existing mapping
Data schema + existing ontology
1. data schema → new ontology
2. match new ontology with existing ontology
3. match → mapping
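The three steps of the "data schema + existing ontology" method (derive a new ontology from the data schema, match it with an existing ontology, generate a mapping from the match) could be sketched as follows. This is a deliberately naive illustration, not one of the surveyed solutions: the string-similarity matcher (`difflib.SequenceMatcher`), the 0.5 threshold, the function names, and the example ontology terms are all assumptions.

```python
from difflib import SequenceMatcher


def derive_terms(table, columns):
    """Step 1: derive a naive 'new ontology' (one class, one property per column)."""
    return {"class": table.capitalize(), "properties": list(columns)}


def best_match(term, candidates, threshold=0.5):
    """Step 2: return the most similar existing term, or None below the threshold."""
    scored = [
        (SequenceMatcher(None, term.lower(), c.lower()).ratio(), c)
        for c in candidates
    ]
    score, candidate = max(scored)
    return candidate if score >= threshold else None


def generate_mapping(table, columns, ontology_classes, ontology_properties):
    """Step 3: turn the matches into table -> class and column -> property rules."""
    new = derive_terms(table, columns)
    return {
        "class": best_match(new["class"], ontology_classes),
        "properties": {
            p: best_match(p, ontology_properties) for p in new["properties"]
        },
    }


# Hypothetical existing ontology terms, chosen only for illustration.
mapping = generate_mapping(
    "author",
    ["id", "name", "genre"],
    ontology_classes=["Author", "Book"],
    ontology_properties=["fullname", "genre", "title"],
)
print(mapping)
```

Columns without a sufficiently similar term (here `id`) stay unmapped, which is where user feedback would come in for a semi-automatic approach.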
Data + existing mapping
1. data → classes, properties
2. existing mapping → classes, properties, model
3. classes, properties, model → mapping
These methods are not combined
only a single method is used
combining multiple methods has not been explored
What knowledge do current solutions not use?
not all knowledge from previous mappings
neglect query workload
Not all knowledge from previous mappings is used
data transformations
to lowercase
substring
conditions: if-else rules
Query workload is neglected
queries to be executed on the non-existing Linked Dataset
queries contain knowledge
model
used ontologies
annotations
select * where {
?s a ex:FictionAuthor .
?s ex:fullname ?n .
}
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
table: authors
ontology to use: http://example.com
model + annotations: ex:FictionAuthor, ex:fullname
How can we use queries?
Research questions
discover existing knowledge
use discovered knowledge
Question 1: how can we discover existing knowledge that is relevant?
existing knowledge (?): mappings, ontologies, (Linked) Data, query workload, data schema
Question 2: how can we use the discovered knowledge to generate a new mapping?
existing knowledge: mappings, ontologies, (Linked) Data, query workload, data schema
mapping process: data, data schema, ontologies, query workload
existing knowledge + mapping process → mapping
Hypotheses
improve quality
decrease task complexity
Hypothesis 1: using existing knowledge improves
the quality of a new single-scenario mapping.
quality → fitness for use
Hypothesis 2: using existing knowledge
decreases the task complexity of the mapping process.
Liu and Li developed a model to measure task complexity.
5 characteristics that influence the task’s performance
Task complexity has 5 characteristics
input: e.g., data, ontologies, user feedback
output: Linked Data, mapping
process: steps, user actions
duration: time to complete task
presentation: user interface
Two aspects need to be tackled
discover existing knowledge
use knowledge
both can be tackled separately
Discover existing knowledge
infer knowledge from mapping process where possible
find relevant other existing knowledge via similarity metrics
Infer knowledge from mapping process
e.g., infer data schema from data
e.g., infer ontology from queries
Infer data schema from data
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
table: authors
inferred schema:
table: authors
columns: id, name, genre
id: index, integer
name: string
genre: string (‘fiction’ or ‘non-fiction’)
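A minimal sketch of inferring such a schema from the data, assuming the rows arrive as dictionaries of strings (for instance from a CSV reader). The function names, the `max_enum` cut-off, and the third row are hypothetical; the extra row is added only so that genre repeats and can be recognised as a small set of values.

```python
def infer_type(values):
    """Return 'integer' if every value parses as an int, else 'string'."""
    try:
        for value in values:
            int(value)
        return "integer"
    except ValueError:
        return "string"


def infer_schema(table, rows, max_enum=5):
    """Infer column names, types, uniqueness and small value sets from rows."""
    schema = {"table": table, "columns": {}}
    for column in rows[0]:
        values = [row[column] for row in rows]
        distinct = sorted(set(values))
        info = {
            "type": infer_type(values),
            "unique": len(distinct) == len(values),
        }
        # A repeating column with few distinct values is treated as an enumeration.
        if not info["unique"] and len(distinct) <= max_enum:
            info["values"] = distinct
        schema["columns"][column] = info
    return schema


rows = [
    {"id": "0", "name": "J.K. Rowling", "genre": "fiction"},
    {"id": "1", "name": "George Orwell", "genre": "non-fiction"},
    {"id": "2", "name": "Jane Austen", "genre": "fiction"},  # hypothetical extra row
]
schema = infer_schema("authors", rows)
print(schema)
```

On such a small sample every inference is tentative: a column that looks unique in three rows need not be unique in the full table.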
Infer ontology from queries
select * where {
?s a ex:FictionAuthor .
?s ex:fullname ?n .
}
http://example.com
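One way to read classes, properties, and the ontology off a query is sketched below. It is a naive line-based reader rather than a SPARQL parser: it assumes one triple pattern per line with a variable as subject, and the prefix table `PREFIXES` (mapping `ex` to http://example.com) is an assumption, since the example query declares no prefix.

```python
QUERY = """
select * where {
  ?s a ex:FictionAuthor .
  ?s ex:fullname ?n .
}
"""

# Assumed prefix table; the example query itself does not declare ex:.
PREFIXES = {"ex": "http://example.com"}


def terms_from_query(query):
    """Collect the classes and properties named in simple triple patterns."""
    classes, properties = set(), set()
    for line in query.splitlines():
        parts = line.strip().rstrip(".").split()
        # Keep only lines of the form "?subject predicate object".
        if len(parts) != 3 or not parts[0].startswith("?"):
            continue
        _subject, predicate, obj = parts
        if predicate in ("a", "rdf:type"):
            if not obj.startswith("?"):
                classes.add(obj)
        elif not predicate.startswith("?"):
            properties.add(predicate)
    return classes, properties


classes, properties = terms_from_query(QUERY)
# Resolve the prefixes of all found terms to the ontologies they come from.
ontologies = {PREFIXES[term.split(":")[0]] for term in classes | properties}
print(classes, properties, ontologies)
```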
Find relevant existing knowledge via similarity metrics
1. determine similarity between the mapping process and existing knowledge
2. consider the existing mapping in the mapping process
mapping process:
table: authors
columns: id, name, genre
id: index, integer, unique
name: string
genre: string (‘fiction’ or ‘non-fiction’)
existing:
table: author
columns: id, fullname, genres
id: index, integer
fullname: string
genres: string
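To make "determine similarity" concrete for the two schemas above, here is a sketch with two candidate metrics on column names: exact overlap (Jaccard) and an average of best fuzzy matches. Which metrics to use and how to combine them is exactly what the PhD leaves open, so the metric choice, the equal weights, and all function names are assumptions.

```python
from difflib import SequenceMatcher


def jaccard(a, b):
    """Share of column names that occur in both schemas."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def fuzzy(a, b):
    """Average, over the columns of a, of the best string similarity in b."""
    best = [max(SequenceMatcher(None, x, y).ratio() for y in b) for x in a]
    return sum(best) / len(best)


def combined(a, b, weights=(0.5, 0.5)):
    """Weighted combination of both metrics (the weights are arbitrary)."""
    return weights[0] * jaccard(a, b) + weights[1] * fuzzy(a, b)


new_columns = ["id", "name", "genre"]            # table: authors
existing_columns = ["id", "fullname", "genres"]  # table: author

exact = jaccard(new_columns, existing_columns)
approximate = fuzzy(new_columns, existing_columns)
print(exact, approximate, combined(new_columns, existing_columns))
```

Exact overlap only finds `id`, while the fuzzy metric also credits name/fullname and genre/genres, which illustrates why a single metric is unlikely to suffice.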
Similarity metrics on individual elements or combinations of elements
metrics on data schema, ontologies, data, and query workload
PhD:
Which metrics do we use?
How do we combine the different metrics?
Use knowledge
work with existing methods, e.g.:
data schema + existing ontology
data + existing mappings
PhD:
how do we include new knowledge?
how do we combine these methods?
Preliminary Results
RMLEditor
RMLWorkbench
mapping generation approaches
hierarchical data analysis
RMLEditor eases the creation of mappings
GUI so domain experts can create mappings
users can view the data, mappings, and RDF triples
usable by both non-SW and SW experts
PhD: present mappings to get feedback during mapping process
RMLWorkbench eases generation and publication
graphical user interface so domain experts can administer
Linked Data generation
publication workflow
PhD: manage elements of the mapping generation process
Identified mapping generation approaches
data-driven
schema-driven
model-driven
result-driven
PhD:
provides insights on how users work
this can be applied when developing a (semi-)automatic approach
Developed tool for data analysis on hierarchical data
efficient discovery of unique identifiers in hierarchical data
PhD: to infer knowledge within the mapping process
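As a rough illustration of what discovering unique identifiers in hierarchical data involves, the sketch below flattens nested records into paths and keeps the paths whose values never repeat. It is a brute-force stand-in, not the algorithm of the developed tool and not an efficient one; the records, the dotted-path notation, and the function names are hypothetical, and leaf values are assumed to be scalars.

```python
def flatten(record, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested dict."""
    for key, value in record.items():
        path = prefix + "." + key if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def unique_paths(records):
    """Return the paths present in every record whose values never repeat."""
    seen = {}
    for record in records:
        for path, value in flatten(record):
            seen.setdefault(path, []).append(value)
    return sorted(
        path
        for path, values in seen.items()
        if len(values) == len(records) and len(set(values)) == len(values)
    )


records = [  # hypothetical hierarchical (JSON-like) data
    {"id": 0, "person": {"name": "J.K. Rowling"}, "genre": "fiction"},
    {"id": 1, "person": {"name": "George Orwell"}, "genre": "non-fiction"},
    {"id": 2, "person": {"name": "Jane Austen"}, "genre": "fiction"},
]
identifiers = unique_paths(records)
print(identifiers)
```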
Evaluation Plan
mapping quality
task complexity
Evaluate mapping quality
existing benchmark RODI
great for tabular data
no support for other formats, such as hierarchical data formats
Evaluate task complexity via 5 characteristics
input: e.g., data, ontologies, user feedback
output: Linked Data, mapping
process: steps, user actions
duration: time to complete task
presentation: user interface
Current evaluations are limited to a single aspect
only duration
only number of user actions
only precision and recall
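For reference, precision and recall of a generated mapping can be computed by comparing the triples it produces against a reference set. The sketch below assumes triples are compared by exact equality, which is a simplification and not necessarily how a benchmark such as RODI scores results; the `generated` set is a made-up outcome with one wrong class and one missing triple.

```python
def precision_recall(generated, reference):
    """Precision and recall of generated triples against reference triples."""
    generated, reference = set(generated), set(reference)
    correct = generated & reference
    precision = len(correct) / len(generated) if generated else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


reference = {
    ("ex:0", "a", "ex:FictionAuthor"),
    ("ex:0", "ex:fullname", "J.K. Rowling"),
    ("ex:1", "a", "ex:NonFictionAuthor"),
    ("ex:1", "ex:fullname", "George Orwell"),
}
generated = {  # hypothetical output of a mapping under evaluation
    ("ex:0", "a", "ex:FictionAuthor"),
    ("ex:0", "ex:fullname", "J.K. Rowling"),
    ("ex:1", "a", "ex:FictionAuthor"),  # wrong class
}
precision, recall = precision_recall(generated, reference)
print(precision, recall)
```

Two of the three generated triples are correct (precision 2/3) and two of the four reference triples are found (recall 1/2). These numbers say nothing about how long the user needed or how many actions were required, which is the point of the slide.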
Roundup
improve single-scenario mappings by discovering and using existing knowledge
Which similarity metrics do we use for discovery?
How do we use and combine the different methods and knowledge?

User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
 
Servosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by PetrovicServosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by Petrovic
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptx
 
well logging & petrophysical analysis.pptx
well logging & petrophysical analysis.pptxwell logging & petrophysical analysis.pptx
well logging & petrophysical analysis.pptx
 
Organic farming with special reference to vermiculture
Organic farming with special reference to vermicultureOrganic farming with special reference to vermiculture
Organic farming with special reference to vermiculture
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024
 
Ai in communication electronicss[1].pptx
Ai in communication electronicss[1].pptxAi in communication electronicss[1].pptx
Ai in communication electronicss[1].pptx
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
 
AZOTOBACTER AS BIOFERILIZER.PPTX
AZOTOBACTER AS BIOFERILIZER.PPTXAZOTOBACTER AS BIOFERILIZER.PPTX
AZOTOBACTER AS BIOFERILIZER.PPTX
 
Biological classification of plants with detail
Biological classification of plants with detailBiological classification of plants with detail
Biological classification of plants with detail
 
User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)
 
Observational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsObservational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive stars
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyay
 
Four Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.pptFour Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.ppt
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 

Ontology-Based Data Access Mapping Generation using Data, Schema, Query, and Mapping Knowledge

  • 1. Ontology-Based Data Access Mapping Generation via Data, Schema, Query, and Mapping Knowledge. Pieter Heyvaert, pheyvaer.heyvaert@ugent.be
  • 2. Semantic Web technologies rely on Linked Data: querying, visualizations, publishing
  • 3. But not all data is accessible as Linked Data: databases, XML files, JSON files
  • 4. Solutions to provide access exist. Manual: completely done by the user. Semi-automatic: users provide feedback. Automatic: no user interaction required
  • 5. But they have limitations: limited to specific use cases; limited support for complex use cases
  • 6. PhD’s goal: improve access to Linked Data
  • 7. Overview: problem; current solutions; research questions; hypotheses; research methodology & approach; preliminary results; evaluation plan
  • 8. Overview: problem; current solutions; research questions; hypotheses; research methodology & approach; preliminary results; evaluation plan
  • 9. How do we provide access? non-Linked Data → ? → Linked Data
  • 10. How do we provide access? non-Linked Data → ? → Linked Data. Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction)
  • 11. Apply mappings on non-Linked Data: non-Linked Data → mapping → Linked Data. Mapping: rules to generate RDF terms and triples using data and ontologies
  • 12. Apply mappings on non-Linked Data: non-Linked Data → mapping → Linked Data. Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction). Rule: create URL from id. Rule: name is value for ex:fullname. Rule: if genre is ‘fiction’, class is ex:FictionAuthor, else class is ex:NonFictionAuthor
  • 13. Apply mappings on non-Linked Data: non-Linked Data → mapping → Linked Data. Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction). Resulting triples: ex:0 a ex:FictionAuthor . ex:0 ex:fullname ‘J.K. Rowling’ . ex:1 a ex:NonFictionAuthor . ex:1 ex:fullname ‘George Orwell’ .
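The three rules on slides 12 and 13 can be sketched in a few lines of Python. This is only an illustration of the idea, not the mapping language or engine used in the talk; the function name `map_author` and the in-memory representation of the table are assumptions made for this example.

```python
# The authors table from the slides, held as a list of dictionaries.
authors = [
    {"id": 0, "name": "J.K. Rowling", "genre": "fiction"},
    {"id": 1, "name": "George Orwell", "genre": "non-fiction"},
]


def map_author(row):
    """Apply the three slide rules to one row (hypothetical helper)."""
    # Rule 1: create the subject URL from the id.
    subject = f"ex:{row['id']}"
    # Rule 3: the class depends on a value, not on the table.
    if row["genre"] == "fiction":
        cls = "ex:FictionAuthor"
    else:
        cls = "ex:NonFictionAuthor"
    # Rule 2: the name is the value for ex:fullname.
    return [
        (subject, "a", cls),
        (subject, "ex:fullname", f"'{row['name']}'"),
    ]


triples = [triple for row in authors for triple in map_author(row)]

for s, p, o in triples:
    print(f"{s} {p} {o} .")
```

The printed lines correspond to the four triples listed on slide 13.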
  • 14. Mappings need to be created. From scratch (single-scenario use case): mapping A. By reusing previous mappings (multi-scenario use case): mapping B, mapping C → mapping
  • 15. (Semi-)automatic methods are preferred. Ways to create a mapping: manual, (semi-)automatic
  • 16. Still a number of challenges left: dealing with complex data (schemas); not all techniques work on single-scenario use cases
  • 17. Dealing with complex data (schemas): e.g., when the class of an entity does not depend on the table, but on a value. Rule: if genre is ‘fiction’, class is ex:FictionAuthor, else class is ex:NonFictionAuthor. Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction)
  • 18. Not all techniques work on single-scenario use cases, because they rely on readily available previous mappings. Multi-scenario: scenario A results in a mapping that scenario B can reuse. Single-scenario: scenario B has no previous mapping to reuse (?)
  • 19. Overview: problem; current solutions; research questions; hypotheses; research methodology & approach; preliminary results; evaluation plan
  • 20. Current solutions: What knowledge is used? How is this knowledge used? What knowledge is not used?
  • 21. What do current solutions use? Knowledge from the mapping process; existing knowledge outside the mapping process
  • 22. Knowledge from mapping process is used: data, data schema, ontologies (not all elements are required)
  • 23. Existing knowledge is used: data, data schemas, mappings, ontologies, Linked Data (not all elements are required)
  • 24. How is all this knowledge used? Data schema + existing ontology; data + existing mapping
  • 25. Data schema + existing ontology: (1) data schema → new ontology
  • 26. Data schema + existing ontology: (1) data schema → new ontology; (2) match new ontology with existing ontology
  • 27. Data schema + existing ontology: (1) data schema → new ontology; (2) match new ontology with existing ontology; (3) → mapping
  • 28. Data + existing mapping: (1) data → classes, properties
  • 29. Data + existing mapping: (1) data → classes, properties; (2) existing mapping → classes, properties, model
  • 30. Data + existing mapping: (1) data → classes, properties; (2) existing mapping → classes, properties, model; (3) → mapping
  • 31. These methods are not combined: only a single method is used; combining multiple methods has not been explored
  • 32. What knowledge do current solutions not use? Not all knowledge from previous mappings; they neglect the query workload
  • 33. Not all knowledge from previous mappings is used: data transformations (to lowercase, substring); conditions (if-else rules)
  • 34. Query workload is neglected: queries to be executed on the not-yet-existing Linked Dataset; queries contain knowledge (model, used ontologies, annotations)
  • 35. How can we use queries? Query: select * where { ?s a ex:FictionAuthor . ?s ex:fullname ?n . } Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction). Ontology to use: http://example.com. Model + annotations: ex:FictionAuthor, ex:fullname
  • 36. Overview: problem; current solutions; research questions; hypotheses; research methodology & approach; preliminary results; evaluation plan
  • 37. Research questions: discover existing knowledge; use discovered knowledge
  • 38. Question 1: how can we discover existing knowledge that is relevant? Existing knowledge: mappings, ontologies, (Linked) Data, query workload, data schema → ? → mapping
  • 39. Question 2: how can we use the discovered knowledge to generate a new mapping? Existing knowledge: mappings, ontologies, (Linked) Data, query workload, data schema. Mapping process: data, data schema, ontologies, query workload. Result: mapping
  • 40. Overview: problem statement; research questions; hypotheses; research methodology & approach; preliminary results; evaluation plan
  • 42. Hypothesis 1: using existing knowledge improves the quality of a new single-scenario mapping. Quality → fitness for use
  • 43. Hypothesis 2: using existing knowledge decreases the task complexity of the mapping process. Liu and Li developed a model to measure task complexity: 5 characteristics that influence the task’s performance
  • 44. Task complexity has 5 characteristics. Input: e.g., data, ontologies, user feedback. Output: Linked Data, mapping. Process: steps, user actions. Duration: time to complete task. Presentation: user interface
  • 45. Overview: problem statement; research questions; hypotheses; research methodology & approach; preliminary results; evaluation plan
  • 46. Two aspects need to be tackled: discover existing knowledge; use knowledge. Both can be tackled separately
  • 47. Discover existing knowledge: infer knowledge from mapping process where possible; find other relevant existing knowledge via similarity metrics
  • 48. Infer knowledge from mapping process: e.g., infer data schema from data; e.g., infer ontology from queries
  • 49. Infer data schema from data. Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction). Inferred schema: table: authors; columns: id, name, genre; id: index, integer; name: string; genre: string (‘fiction’ or ‘non-fiction’)
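As a rough sketch of what inferring a schema from data could look like, the Python snippet below derives a type, a uniqueness flag, and the distinct values for each column of the authors table. The function name `infer_schema` and the output layout are assumptions for illustration; the slides do not describe the actual inference procedure. With only two rows, the sketch cannot tell an enumerated column such as genre from a free-text one, so it merely reports the distinct values it sees.

```python
authors = [
    {"id": 0, "name": "J.K. Rowling", "genre": "fiction"},
    {"id": 1, "name": "George Orwell", "genre": "non-fiction"},
]


def infer_schema(table_name, rows):
    """Infer a simple per-column description from row data (hypothetical helper)."""
    columns = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = all(isinstance(v, int) and not isinstance(v, bool) for v in values)
        columns[column] = {
            "type": "integer" if is_int else "string",
            "unique": len(set(values)) == len(values),
            "distinct": sorted(set(values), key=str),
        }
    return {"table": table_name, "columns": columns}


schema = infer_schema("authors", authors)
for name, info in schema["columns"].items():
    print(name, info)
```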
  • 50. Infer ontology from queries: select * where { ?s a ex:FictionAuthor . ?s ex:fullname ?n . } → http://example.com
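One way to read knowledge out of the query on slide 50 is to collect the classes (objects of `a`) and properties it mentions, and then look up the namespaces behind their prefixes. The sketch below does this with plain string handling for simple triple patterns only; it is not a SPARQL parser. The prefix table `PREFIXES` (with a trailing slash on the namespace) and the function name `terms_from_query` are assumptions, since the slides do not expand the `ex:` prefix beyond naming http://example.com.

```python
QUERY = "select * where { ?s a ex:FictionAuthor . ?s ex:fullname ?n . }"

# Assumed prefix expansion; the slides only name http://example.com.
PREFIXES = {"ex": "http://example.com/"}


def terms_from_query(query):
    """Collect classes, properties, and namespaces from simple triple patterns."""
    body = query[query.index("{") + 1 : query.rindex("}")]
    classes, properties = set(), set()
    triple = []
    # A trailing "." sentinel flushes a final pattern that lacks its own dot.
    for token in body.split() + ["."]:
        if token != ".":
            triple.append(token)
            continue
        if len(triple) == 3:
            _, predicate, obj = triple
            if predicate == "a":
                classes.add(obj)
            elif not predicate.startswith("?"):
                properties.add(predicate)
        triple = []
    used_prefixes = {term.split(":")[0] for term in classes | properties}
    namespaces = {PREFIXES[p] for p in used_prefixes if p in PREFIXES}
    return classes, properties, namespaces


classes, properties, namespaces = terms_from_query(QUERY)
print(classes, properties, namespaces)
```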
  • 51. Find relevant existing knowledge via similarity metrics: 1. determine similarity between the mapping process and existing knowledge; 2. consider in mapping process. Two schemas are compared. First: table: authors; columns: id, name, genre; id: index, integer, unique; name: string; genre: string (‘fiction’ or ‘non-fiction’). Second: table: author; columns: id, fullname, genres; id: index, integer; fullname: string; genres: string
  • 52. Similarity metrics on different elements or combinations of elements: metrics on data schema, ontologies, data, and query workload. PhD: Which metrics do we use? How do we combine the different metrics?
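The slides leave open which similarity metrics are used. Purely as an illustration, the sketch below compares the column names of the two schemas from slide 51 using string similarity from Python's standard `difflib` module. The choice of metric, the averaging scheme, and the function names are all assumptions, not the approach proposed in the talk.

```python
from difflib import SequenceMatcher

# Column names of the two schemas shown on slide 51.
columns_a = ["id", "name", "genre"]
columns_b = ["id", "fullname", "genres"]


def name_similarity(a, b):
    """String similarity in [0, 1] between two column names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def schema_similarity(cols_a, cols_b):
    """Average, over the columns of cols_a, of the best name match in cols_b."""
    if not cols_a or not cols_b:
        return 0.0
    best = [max(name_similarity(a, b) for b in cols_b) for a in cols_a]
    return sum(best) / len(best)


score = schema_similarity(columns_a, columns_b)
print(f"schema similarity: {score:.2f}")
```

Note that this score is not symmetric: swapping the two schemas can give a different value. A real metric would likely also weigh column types, data values, and the other elements listed on slide 52.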
  • 53. Two aspects need to be tackled: discover existing knowledge; use knowledge
  • 54. Use knowledge: work with existing methods, e.g., data schema + existing ontology; data + existing mappings. PhD: How do we include new knowledge? How do we combine these methods?
  • 55. Overview: problem statement; research questions; hypotheses; research methodology & approach; preliminary results; evaluation plan
  • 56. Preliminary results: RMLEditor; RMLWorkbench; mapping generation approaches; hierarchical data analysis
  • 57. RMLEditor eases the creation of mappings: GUI so domain experts can create mappings; users can view the data, mappings, and RDF triples; usable by both non-SW and SW experts. PhD: present mappings to get feedback during mapping process
  • 58. RMLWorkbench eases generation and publication: graphical user interface so domain experts can administer the Linked Data generation and publication workflow. PhD: manage elements of the mapping generation process
  • 59. Identified mapping generation approaches: data-driven, schema-driven, model-driven, result-driven. PhD: provides insights on how users work; this can be applied when developing a (semi-)automatic approach
  • 60. Developed tool for data analysis on hierarchical data: efficient discovery of unique identifiers in hierarchical data. PhD: to infer knowledge within the mapping process
  • 61. Overview: problem; current solutions; research questions; hypotheses; research methodology & approach; preliminary results; evaluation plan
  • 63. Evaluate mapping quality: existing benchmark RODI; great for tabular data; no support for other formats, such as hierarchical data formats
  • 64. Evaluate task complexity via 5 characteristics. Input: e.g., data, ontologies, user feedback. Output: Linked Data, mapping. Process: steps, user actions. Duration: time to complete task. Presentation: user interface
  • 65. Current evaluations are limited to a single aspect: only duration; only number of user actions; only precision and recall
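For readers unfamiliar with the precision and recall mentioned on slide 65, the sketch below computes both over sets of triples. It is a generic illustration, not the procedure of RODI or of any evaluation cited in the talk. The reference triples are those of slide 13; the `generated` set is a hypothetical mapping output with one wrong class assignment.

```python
# Reference triples, taken from slide 13.
reference = {
    ("ex:0", "a", "ex:FictionAuthor"),
    ("ex:0", "ex:fullname", "J.K. Rowling"),
    ("ex:1", "a", "ex:NonFictionAuthor"),
    ("ex:1", "ex:fullname", "George Orwell"),
}

# Hypothetical output of a mapping that misclassifies the second author.
generated = {
    ("ex:0", "a", "ex:FictionAuthor"),
    ("ex:0", "ex:fullname", "J.K. Rowling"),
    ("ex:1", "a", "ex:FictionAuthor"),
    ("ex:1", "ex:fullname", "George Orwell"),
}


def precision_recall(generated, reference):
    """Precision: share of generated triples that are correct.
    Recall: share of reference triples that were generated."""
    generated, reference = set(generated), set(reference)
    correct = generated & reference
    precision = len(correct) / len(generated) if generated else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


precision, recall = precision_recall(generated, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```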
  • 66. Roundup: improve single-scenario mappings by discovering and using existing knowledge. What similarity metrics do we use for discovery? How do we use and combine the different methods and knowledge?