Mappings Validation
Data Quality Tutorial - SEMANTICS2016
Anastasia Dimou
Anastasia.Dimou@ugent.be ● @natadimou
Ghent University – iMinds
Linked (Open) Data
semantically annotated & interlinked data
using different vocabularies or ontologies
published in the form of RDF datasets
Linked (Open) Data
derive from originally heterogeneous
(semi-)structured data
e.g.
Eurostat from TSV
DBLP from DBLP database
DBpedia from Wikipedia
LinkedBrainz from MusicBrainz database
...
Linked Data Quality
in the context of Linked Data
generation and publication workflow
Linked Data Quality dimensions
Representational dimension
Intrinsic dimension
Accessibility dimension
Contextual dimension
A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, and S. Auer.
Quality Assessment for Linked Data: A Survey.
Semantic Web Journal, 2016.
Linked Data Quality dimensions
Representational dimension
data modeling
Intrinsic dimension
Linked Data generation
Accessibility dimension
Linked Data publication
Contextual dimension
Linked Data consumption
Linked Data Quality - Intrinsic Dimension
determines the RDF Dataset Quality
by assessing it for possible violations
with respect to
accuracy (e.g. malformed datatype literals)
consistency (e.g. disjoint classes/properties)
Instead of applying Quality Assessment
to the already published Linked Data
as part of Linked Data consumption
Apply Quality Assessment
to the Mappings
that generate the Linked Data
as part of Linked Data production
Linked Dataset Quality Assessment (DQA)
Mappings Quality Assessment (MQA)
Mapping & Dataset Quality Assessment Workflow
Mappings & Quality Assessment Evaluation Results
[Figure: running example with labels dbo:Person and xsd:date]
Linked Data Quality Assessment
Linked Data Quality Assessment (DQA)
RDFUnit http://rdfunit.aksw.org
test-driven data-debugging framework
based on SPARQL-patterns
D. Kontokostas, P. Westphal, S. Auer, S. Hellmann, J. Lehmann, R. Cornelissen, and A. J. Zaveri
Test-driven evaluation of linked data quality.
In Proceedings of the 23rd International Conference on World Wide Web
DQA with RDFUnit
…WHERE { ?resource %%P1%% ?c.
FILTER (DATATYPE(?c) != %%D1%%) }
…WHERE { ?resource dbo:birthDate ?c.
FILTER (DATATYPE(?c) != xsd:date) }
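The step from the pattern to the concrete test above is a placeholder substitution. A minimal sketch of that step, assuming plain string replacement; the `instantiate` helper and the `SELECT ?resource` head are illustrative assumptions (the slides show only the WHERE clause), not RDFUnit's actual API:

```python
def instantiate(pattern, bindings):
    """Replace every %%NAME%% placeholder in the pattern with its bound value."""
    query = pattern
    for name, value in bindings.items():
        query = query.replace("%%" + name + "%%", value)
    return query

# The SELECT head is an assumption; the slides elide everything before WHERE.
PATTERN = (
    "SELECT ?resource WHERE { ?resource %%P1%% ?c . "
    "FILTER (DATATYPE(?c) != %%D1%%) }"
)

# Bind the property and its expected datatype, as in the slide's example.
query = instantiate(PATTERN, {"P1": "dbo:birthDate", "D1": "xsd:date"})
print(query)
```

One generic pattern can thus be reused for every property whose expected datatype is known from the schema.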
10 domain violations
10 datatype violations
1,000,000 domain violations!!!
1,000,000 datatype violations!!!
Linked Data Quality Assessment (DQA)
Similar violations occur repeatedly
within a single Linked Data set
Linked Data Quality Assessment (DQA)
Sets of triples of a dataset have
repetitive patterns
DQA: Linked Data Quality Assessment
is applied by third parties
to already published Linked Data sets
DQA: Linked Data Quality Assessment
Adjustments are NOT applied
at the root of the problem
DQA: Linked Data Quality Assessment
Adjustments are overwritten
if a new version of the original data
is annotated and published as Linked Data
Instead of applying Quality Assessment
to the already published Linked Data set
as part of data consumption
Apply Quality Assessment to the Mappings
that generate the Linked Data
A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle
Assessing and Refining Mappings to RDF to Improve Dataset Quality.
In Proceedings of The Semantic Web - ISWC 2015
Mapping languages
formalize patterns into rules to generate
Linked Data from some original data
RDF Mapping Language (RML) http://rml.io
extends the W3C-recommended R2RML
specify the mapping rules to
generate Linked Data
from heterogeneous data sources
mapping rules are Linked Data sets too!
A. Dimou, M. Vander Sande, P. Colpaert, R. Verborgh, E. Mannens, and R. Van de Walle.
RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data.
In Proceedings of the 7th Workshop on Linked Data on the Web (LDOW2014), 2014.
RDF Mapping Language (RML) http://rml.io
<#Mapping>
  rr:subjectMap [ rr:class dbo:Event ;
                  rr:template "http://example.com/{Name}" ] ;
  rr:predicateObjectMap [ rr:predicate dbo:birthDate ;
                          rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] .
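To make concrete what such a rule does, here is a minimal sketch of a processor applying it to rows of source data. The `MAPPING` dictionary, the `generate` function and the two sample rows are hypothetical stand-ins, not the RML processor's implementation, and IRI encoding of template values is skipped for brevity:

```python
# Hypothetical in-memory form of the <#Mapping> rule above.
MAPPING = {
    "class": "dbo:Event",
    "template": "http://example.com/{Name}",
    "predicate": "dbo:birthDate",
    "reference": "Birth",
    "datatype": "xsd:gYear",
}

def generate(rows, mapping):
    """Apply the single mapping rule to every row and collect the triples."""
    triples = []
    for row in rows:
        # The subject IRI is built by filling the template with row values.
        subject = "<" + mapping["template"].format(**row) + ">"
        triples.append((subject, "rdf:type", mapping["class"]))
        # The object is a typed literal taken from the referenced column.
        literal = '"{}"^^{}'.format(row[mapping["reference"]], mapping["datatype"])
        triples.append((subject, mapping["predicate"], literal))
    return triples

# Made-up sample rows, for illustration only.
rows = [
    {"Name": "Alice", "Birth": "1980-05-17"},
    {"Name": "Bob", "Birth": "1975-11-02"},
]
triples = generate(rows, MAPPING)
for triple in triples:
    print(" ".join(triple))
```

Every row receives the same class and the same datatype, so whatever is wrong in the rule is repeated once per row.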
RDF Mapping Language (RML) http://rml.io
[Figure: workflow diagram with labels: data, map doc, Mapping Processor]
DQA: Linked Data Quality Assessment
[Figure: same diagram with added labels: violations, DQA]
MQA: Mapping Quality Assessment
[Figure: same diagram with added labels: violations, MQA]
D→MQA with RDFUnit over RML
…WHERE { ?resource %%P1%% ?c.
FILTER (DATATYPE(?c) != %%D1%%) }
…WHERE { ?resource dbo:birthDate ?c.
FILTER (DATATYPE(?c) != xsd:date) }
… WHERE {
?resource rr:predicateObjectMap ?poMap.
?poMap rr:predicate %%P1%%;
rr:objectMap ?objM.
?objM rr:datatype ?c.
FILTER (?c != %%D1%%) }
<#Mapping>
  rr:subjectMap [ rr:class dbo:Event ;
                  rr:template "http://example.com/{Name}" ] ;
  rr:predicateObjectMap [ rr:predicate dbo:birthDate ;
                          rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] .
MQA: Mapping Quality Assessment
MQA with RDFUnit over RML
ONLY 1 domain violation!!!
ONLY 1 datatype violation!!!
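The difference in counts follows directly from where the check runs. A minimal sketch, assuming the expected datatype per property is available from the schema; `EXPECTED`, `RULES`, `generate`, `dqa` and `mqa` are hypothetical names for illustration, not RDFUnit code:

```python
# Expected datatype per property; assumed to be derived from the ontology.
EXPECTED = {"dbo:birthDate": "xsd:date"}

# One mapping rule, reduced to the two fields the datatype test needs.
RULES = [{"predicate": "dbo:birthDate", "datatype": "xsd:gYear"}]

def generate(rules, n_rows):
    """Yield (subject, predicate, datatype) for every row and every rule."""
    for i in range(n_rows):
        for rule in rules:
            yield ("ex:row{}".format(i), rule["predicate"], rule["datatype"])

def dqa(triples):
    """Dataset-level check: one violation per offending generated triple."""
    return sum(1 for _, p, dt in triples if EXPECTED.get(p, dt) != dt)

def mqa(rules):
    """Mapping-level check: one violation per offending rule."""
    return sum(
        1 for r in rules
        if EXPECTED.get(r["predicate"], r["datatype"]) != r["datatype"]
    )

dataset_count = dqa(generate(RULES, 1_000_000))
mapping_count = mqa(RULES)
print(dataset_count, mapping_count)
```

The dataset-level count grows with the number of rows, while the mapping-level count depends only on the number of faulty rules.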
MDQA: Uniform Mapping & Dataset Quality Assessment
[Figure: workflow diagram with labels: data, map doc, Mapping Processor, violations, MDQA]
MQA: Mapping Quality Assessment
discover not only the violations
but also their origin
before they are even generated
MQA: Mapping Quality Assessment
easily apply structural adjustments
prevent the same violations from
appearing repeatedly over distinct entities
allow intuitively combining
different ontologies and vocabularies
MDQA: Uniform Mapping & Dataset Quality Assessment
<#Result>
  rut:testCase rut:datatypeError ;
  spin:violationRoot <#ObjectMap> ;
  spin:violationPath rr:datatype ;
  spin:violationValue xsd:gYear ;
  rut:missingValue xsd:date .
Uniform Mapping & Dataset Quality Assessment Workflow
[Figure: workflow diagram with labels: data, map doc, Mapping Processor, Mapping Refinements, violations, MDQA]
Correcting MQA violations with RML Editor
<#Result>
  rut:testCase rut:datatypeError ;
  spin:violationRoot <#ObjectMap> ;
  spin:violationPath rr:datatype ;
  spin:violationValue xsd:gYear ;
  rut:missingValue xsd:date .
DEL: <#ObjectMap> rr:datatype xsd:gYear.
ADD: <#ObjectMap> rr:datatype xsd:date.
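The DEL/ADD pair can be read off the violation result mechanically. A minimal sketch, assuming the result is available as a flat dictionary keyed by the property names shown above; the `refine` function and that dictionary layout are illustrative assumptions, not the actual refinement component:

```python
def refine(result):
    """Turn one violation result into a delete/add pair on the mapping."""
    root = result["spin:violationRoot"]
    path = result["spin:violationPath"]
    return [
        # Remove the triple that caused the violation ...
        ("DEL", root, path, result["spin:violationValue"]),
        # ... and add the value the test case reported as missing.
        ("ADD", root, path, result["rut:missingValue"]),
    ]

# The violation result from the slide, as a flat dictionary (assumed layout).
result = {
    "rut:testCase": "rut:datatypeError",
    "spin:violationRoot": "<#ObjectMap>",
    "spin:violationPath": "rr:datatype",
    "spin:violationValue": "xsd:gYear",
    "rut:missingValue": "xsd:date",
}

refinements = refine(result)
for op, s, p, o in refinements:
    print("{}: {} {} {}.".format(op, s, p, o))
```

Because the violation root points into the mapping document itself, the refinement is applied once and takes effect for every entity generated afterwards.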
MQA with RDFUnit over RML
<#Result>
  rut:testCase rut:datatypeError ;
  spin:violationRoot <#ObjectMap> ;
  spin:violationPath rr:datatype ;
  spin:violationValue xsd:float ;
  rut:missingValue xsd:int .
DEL: <#ObjectMap> rr:datatype xsd:gYear.
ADD: <#ObjectMap> rr:datatype xsd:date.
DEL: <#SubjectMap> rr:class dbo:Event.
ADD: <#SubjectMap> rr:class dbo:Person.
<#Mapping>
  rr:subjectMap [ rr:class dbo:Person ;
                  rr:template "http://example.com/{Name}" ] ;
  rr:predicateObjectMap [ rr:predicate dbo:birthDate ;
                          rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:date ] ] .
Uniform Mapping & Dataset Quality Assessment Workflow
[Figure: workflow diagram with labels: data, map doc, new map doc, Mapping Processor, Mapping Refinements, violations, MDQA (optional)]
Mapping Quality Assessment: Limitations
certain test cases inevitably
require the complete Linked Data set:
cardinality,
functionality,
symmetry
in defense of the mappings:
this is more of a data issue,
NOT affected by the mapping rules
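A functionality test illustrates why some checks need the generated data. A minimal sketch, assuming a hypothetical functional property `ex:birthPlace` and made-up triples: whether a subject ends up with two values depends on the source rows, which a mapping rule alone does not reveal.

```python
from collections import defaultdict

def functional_violations(triples, prop):
    """Return the subjects that have more than one distinct value for prop."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            values[s].add(o)
    return sorted(s for s, objs in values.items() if len(objs) > 1)

# Made-up generated triples; ex:birthPlace is assumed to be functional.
triples = [
    ("ex:Alice", "ex:birthPlace", "ex:Ghent"),
    ("ex:Alice", "ex:birthPlace", "ex:Leipzig"),
    ("ex:Bob", "ex:birthPlace", "ex:Ghent"),
]

violating = functional_violations(triples, "ex:birthPlace")
print(violating)
```

The same mapping rule generated all three triples; only the data for `ex:Alice` makes the property non-functional.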
Dataset vs. Mapping Quality Assessment: Number of Violations
              Dataset Assessment   Mappings Assessment
DBpedia EN    3.2M                 160
DBLP          8.1M                 8
* DBpedia and DBLP D2RQ mappings were translated to RML mappings
A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle
Assessing and Refining Mappings to RDF to Improve Dataset Quality.
In Proceedings of The Semantic Web - ISWC 2015
Dataset vs. Mapping Quality Assessment: Time
              Dataset Quality Assessment   Mappings Quality Assessment
              size      time               size      time
DBpedia EN    62M       16h                115K      11s
DBpedia NL    21M       1.5h               53K       6s
DBLP          12M       12h                368       12s
A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle
Assessing and Refining Mappings to RDF to Improve Dataset Quality.
In Proceedings of The Semantic Web - ISWC 2015
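The time gap in the table can be expressed as a ratio. A small worked calculation using only the figures above; the `to_seconds` helper and the unit parsing are illustrative assumptions:

```python
# Seconds per unit suffix used in the table ("h" for hours, "s" for seconds).
UNIT_SECONDS = {"h": 3600, "s": 1}

def to_seconds(value):
    """Convert a table entry such as '1.5h' or '11s' to seconds."""
    return float(value[:-1]) * UNIT_SECONDS[value[-1]]

# (dataset assessment time, mapping assessment time), copied from the table.
timings = {
    "DBpedia EN": ("16h", "11s"),
    "DBpedia NL": ("1.5h", "6s"),
    "DBLP": ("12h", "12s"),
}

speedup = {
    name: to_seconds(dataset) / to_seconds(mapping)
    for name, (dataset, mapping) in timings.items()
}
for name, ratio in speedup.items():
    print("{}: about {:.0f}x".format(name, ratio))
```

These ratios compare wall-clock times only; the two assessments also run over inputs of very different size, as the size columns show.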
Mapping Quality Assessment
              size      time
DBpedia EN    115K      11s
DBpedia NL    53K       6s
DBpedia All   511K      32s
A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle
Assessing and Refining Mappings to RDF to Improve Dataset Quality.
In Proceedings of The Semantic Web - ISWC 2015
DBpedia Mappings Quality Assessment
* http://mappings.dbpedia.org/validation
Live update of DBpedia Mapping Quality Assessment results every night! ☺
A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann
DBpedia Mappings Quality Assessment.
To be published in Proceedings of the 15th International Semantic Web Conference: Posters and Demos 2016
Violations
are related to the dataset's schema
(vocabularies or ontologies)
occur repeatedly
within a single RDF dataset
The situation worsens the more
ontologies and vocabularies
are reused and combined
Linked Data Quality Assessment
shifted from data consumption
to data publication
integrated systematically
in the publishing workflow
violations are identified,
resolved and will not re-appear
Linked Data of higher Quality is generated!!!

RDF Data Quality Assessment - connecting the piecesRDF Data Quality Assessment - connecting the pieces
RDF Data Quality Assessment - connecting the pieces
 
iLastic: Linked Data Generation Workflow and User Interface for iMinds Schola...
iLastic: Linked Data Generation Workflow and User Interface for iMinds Schola...iLastic: Linked Data Generation Workflow and User Interface for iMinds Schola...
iLastic: Linked Data Generation Workflow and User Interface for iMinds Schola...
 
Stream processing: The Matrix Revolutions
Stream processing: The Matrix RevolutionsStream processing: The Matrix Revolutions
Stream processing: The Matrix Revolutions
 
Data Source API in Spark
Data Source API in SparkData Source API in Spark
Data Source API in Spark
 
LOP – Capturing and Linking Open Provenance on LOD Cycle
LOP – Capturing and Linking Open Provenance on LOD CycleLOP – Capturing and Linking Open Provenance on LOD Cycle
LOP – Capturing and Linking Open Provenance on LOD Cycle
 
Building a scalable data science platform with R
Building a scalable data science platform with RBuilding a scalable data science platform with R
Building a scalable data science platform with R
 
Data Quality
Data QualityData Quality
Data Quality
 
QUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASES
QUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASESQUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASES
QUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASES
 
microsoft r server for distributed computing
microsoft r server for distributed computingmicrosoft r server for distributed computing
microsoft r server for distributed computing
 
Building a Scalable Data Science Platform with R
Building a Scalable Data Science Platform with RBuilding a Scalable Data Science Platform with R
Building a Scalable Data Science Platform with R
 
Scalable and privacy-preserving data integration - part 1
Scalable and privacy-preserving data integration - part 1Scalable and privacy-preserving data integration - part 1
Scalable and privacy-preserving data integration - part 1
 
Analysis of the Datasets
Analysis of the DatasetsAnalysis of the Datasets
Analysis of the Datasets
 
Database Survival Guide: Exploratory Webcast
Database Survival Guide: Exploratory WebcastDatabase Survival Guide: Exploratory Webcast
Database Survival Guide: Exploratory Webcast
 
JPJ1423 Keyword Query Routing
JPJ1423   Keyword Query RoutingJPJ1423   Keyword Query Routing
JPJ1423 Keyword Query Routing
 

Último

Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Jeffrey Haguewood
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 

Último (20)

Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 

Mappings Validation

  • 1. Mappings Validation Data Quality Tutorial - SEMANTICS2016 Anastasia Dimou Anastasia.Dimou@ugent.be ● @natadimou Ghent University – iMinds
  • 2. Linked (Open) Data semantically annotated & interlinked data using different vocabularies or ontologies published in the form of RDF datasets
  • 3. Linked (Open) Data derive from originally heterogeneous (semi-)structured data e.g. Eurostat from TSV DBLP from DBLP database DBpedia from Wikipedia LinkedBrainz from MusicBrainz database ... … …
  • 4. Linked Data Quality in the context of Linked Data generation and publication workflow
  • 5. Linked Data Quality dimensions Representational dimension Intrinsic dimension Accessibility dimension Contextual dimension A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, and S. Auer. Quality Assessment for Linked Data: A Survey. Semantic Web Journal, 2016.
  • 6. Linked Data Quality dimensions Representational dimension data modeling Intrinsic dimension Linked Data generation Accessibility dimension Linked Data publication Contextual dimension Linked Data consumption
  • 7. Linked Data Quality dimensions Representational dimension data modeling Intrinsic dimension Linked Data generation Accessibility dimension Linked Data publishing Contextual dimension Linked Data consumption
  • 8. Linked Data Quality - Intrinsic Dimension determines the RDF Dataset Quality by assessing it for possible violations with respect to accuracy (e.g. malformed datatype literals) consistency (e.g. disjoint classes/properties)
  • 9. Instead of applying Quality Assessment to the already published Linked Data as part of Linked Data consumption Apply Quality Assessment to the Mappings that generate the Linked Data as part of Linked Data production
  • 10. Linked Dataset Quality Assessment (DQA) Mappings Quality Assessment (MQA) Mapping & Dataset Quality Assessment Workflow Mappings & Quality Assessment Evaluation Results
  • 17. Linked Data Quality Assessment (DQA) RDFUnit http://rdfunit.aksw.org: a test-driven data-debugging framework based on SPARQL patterns. D. Kontokostas, P. Westphal, S. Auer, S. Hellmann, J. Lehmann, R. Cornelissen, and A. J. Zaveri. Test-driven evaluation of linked data quality. In Proceedings of the 23rd International Conference on World Wide Web, 2014.
  • 18. DQA with RDFUnit …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) }
  • 19. 10 domain violations 10 datatype violations
  • 21. Linked Data Quality Assessment (DQA) Similar violations occur repeatedly within a single Linked Data set
  • 22. Linked Data Quality Assessment (DQA) Sets of triples of a dataset have repetitive patterns
  • 24. DQA: Linked Data Quality Assessment is applied by third parties to already published Linked Data sets
  • 25. DQA: Linked Data Quality Assessment Adjustments are NOT applied at the root of the problem
  • 26. DQA: Linked Data Quality Assessment Adjustments are overwritten if a new version of the original data is annotated and published as Linked Data
  • 27. Instead of applying Quality Assessment to the already published Linked Data set as part of data consumption
  • 28. Apply Quality Assessment to the Mappings that generate the Linked Data. A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle. Assessing and Refining Mappings to RDF to Improve Dataset Quality. In Proceedings of The Semantic Web - ISWC 2015
  • 29. Linked Dataset Quality Assessment (DQA) Mappings Quality Assessment (MQA) Mapping & Dataset Quality Assessment Workflow Mappings & Quality Assessment Evaluation Results
  • 30. Mapping languages formalize patterns into rules to generate Linked Data from some original data
  • 31. RDF Mapping Language (RML) http://rml.io: extends the W3C-recommended R2RML; specifies the mapping rules to generate Linked Data from heterogeneous data sources; mapping rules are Linked Data sets too! A. Dimou, M. Vander Sande, P. Colpaert, R. Verborgh, E. Mannens, and R. Van de Walle. RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data. In Proceedings of the 7th Workshop on Linked Data on the Web (LDOW2014), 2014.
  • 32. RDF Mapping Language (RML) http://rml.io <#Mapping> rr:subjectMap [ rr:class dbo:Event ; rr:template "http://example.com/{Name}" ] ; rr:predicateObjectMap [ rr:predicate dbo:birthDate ; rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] .
  • 33. RDF Mapping Language (RML) http://rml.io
  • 34. RDF Mapping Language (RML) http://rml.io [diagram labels: data, map doc, Mapping Processor]
  • 35. DQA: Linked Data Quality Assessment [diagram labels: data, map doc, Mapping Processor, violations, DQA]
  • 39. DQA with RDFUnit over RML …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) }
  • 40. D→MQA with RDFUnit over RML …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) }
  • 41. D→MQA with RDFUnit over RML …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) } <#Mapping> rr:subjectMap [ rr:class dbo:Event ; rr:template "http://example.com/{Name}" ] ; rr:predicateObjectMap [ rr:predicate dbo:birthDate ; rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] .
  • 42. D→MQA with RDFUnit over RML …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) } … WHERE { ?resource rr:predicateObjectMap ?poMap. ?poMap rr:predicate %%P1%%; rr:objectMap ?objM. ?objM rr:datatype ?c. FILTER (?c != %%D1%%) } <#Mapping> rr:subjectMap [ rr:class dbo:Event ; rr:template "http://example.com/{Name}" ] ; rr:predicateObjectMap [ rr:predicate dbo:birthDate ; rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] .
  • 44. MQA with RDFUnit over RML …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) } … WHERE { ?resource rr:predicateObjectMap ?poMap. ?poMap rr:predicate %%P1%%; rr:objectMap ?objM. ?objM rr:datatype ?c. FILTER (?c != %%D1%%) } <#Mapping> rr:subjectMap [ rr:class dbo:Event ; rr:template "http://example.com/{Name}" ] ; rr:predicateObjectMap [ rr:predicate dbo:birthDate ; rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] . ONLY 1 domain violation! ONLY 1 datatype violation!
  • 45. MDQA: Uniform Mapping & Dataset Quality Assessment [diagram labels: data, map doc, Mapping Processor, violations, MDQA]
  • 46. Linked Dataset Quality Assessment (DQA) Mappings Quality Assessment (MQA) Mapping & Dataset Quality Assessment Workflow Mappings & Quality Assessment Evaluation Results
  • 47. MQA: Mapping Quality Assessment discover not only the violations but also their origin before they are even generated
  • 48. MQA: Mapping Quality Assessment: easily apply structural adjustments; prevent the same violations from appearing repeatedly over distinct entities; allow intuitively combining different ontologies and vocabularies
  • 49. MDQA: Uniform Mapping & Dataset Quality Assessment [diagram labels: data, map doc, Mapping Processor, violations, MDQA]
  • 50. MDQA: Uniform Mapping & Dataset Quality Assessment [diagram labels: data, map doc, Mapping Processor, violations, MDQA] <#Result> rut:testCase rut:datatypeError ; spin:violationRoot <#ObjectMap> ; spin:violationPath rr:datatype ; spin:violationValue xsd:gYear ; rut:missingValue xsd:date .
  • 51. Uniform Mapping & Dataset Quality Assessment Workflow [diagram labels: data, map doc, Mapping Processor, Mapping Refinements, violations, MDQA]
  • 52. Correcting MQA violations with RML Editor
  • 55. MDQA: Uniform Mapping & Dataset Quality Assessment [diagram labels: data, map doc, Mapping Processor, violations, MDQA] <#Result> rut:testCase rut:datatypeError ; spin:violationRoot <#ObjectMap> ; spin:violationPath rr:datatype ; spin:violationValue xsd:gYear ; rut:missingValue xsd:date . DEL: <#ObjectMap> rr:datatype xsd:gYear. ADD: <#ObjectMap> rr:datatype xsd:date.
  • 56. MQA with RDFUnit over RML <#Result> rut:testCase rut:datatypeError ; spin:violationRoot <#ObjectMap> ; spin:violationPath rr:datatype ; spin:violationValue xsd:float ; rut:missingValue xsd:int . DEL: <#ObjectMap> rr:datatype xsd:gYear. ADD: <#ObjectMap> rr:datatype xsd:date. DEL: <#SubjectMap> rr:class dbo:Event. ADD: <#SubjectMap> rr:class dbo:Person.
  • 57. MQA with RDFUnit over RML <#Result> rut:testCase rut:datatypeError ; spin:violationRoot <#ObjectMap> ; spin:violationPath rr:datatype ; spin:violationValue xsd:float ; rut:missingValue xsd:int . DEL: <#ObjectMap> rr:datatype xsd:gYear. ADD: <#ObjectMap> rr:datatype xsd:date. <#Mapping> rr:subjectMap [ rr:class dbo:Person ; rr:template "http://example.com/{Name}" ] ; rr:predicateObjectMap [ rr:predicate dbo:birthDate ; rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:date ] ] . DEL: <#SubjectMap> rr:class dbo:Event. ADD: <#SubjectMap> rr:class dbo:Person.
  • 60. Uniform Mapping & Dataset Quality Assessment Workflow
  • 62. Mapping Quality Assessment: Limitations certain test cases inevitably require the complete Linked Data set
  • 63. Mapping Quality Assessment: Limitations certain test cases inevitably require the complete Linked Data set cardinality, functionality, symmetricity
  • 64. Mapping Quality Assessment: Limitations certain test cases inevitably require the complete Linked Data set: cardinality, functionality, symmetricity. In defense of Mappings: these are mostly data issues, NOT affected by the mapping rules
  • 65. Linked Dataset Quality Assessment (DQA) Mappings Quality Assessment (MQA) Mapping & Dataset Quality Assessment Workflow Mappings & Quality Assessment Evaluation Results
  • 66. Dataset Vs Mapping Quality Assessment: Number of Violations. DBpedia EN: 3.2M violations in Dataset Assessment, 160 in Mappings Assessment. DBLP: 8.1M violations in Dataset Assessment, 8 in Mappings Assessment. *DBpedia and DBLP D2RQ Mappings were translated to RML mappings. A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle. Assessing and Refining Mappings to RDF to Improve Dataset Quality. In Proceedings of The Semantic Web - ISWC 2015
  • 67. Dataset Vs Mapping Quality Assessment: Time. DBpedia EN: Dataset Quality Assessment size 62M, time 16h; Mappings Quality Assessment size 115K, time 11s. DBpedia NL: Dataset Quality Assessment size 21M, time 1.5h; Mappings Quality Assessment size 53K, time 6s. DBLP: Dataset Quality Assessment size 12M, time 12h; Mappings Quality Assessment size 368, time 12s. A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle. Assessing and Refining Mappings to RDF to Improve Dataset Quality. In Proceedings of The Semantic Web - ISWC 2015
  • 68. Mapping Quality Assessment * http://mappings.dbpedia.org/validation Live update of DBpedia Mapping Quality Assessment results every night! ☺ Mapping Quality Assessment size and time. DBpedia EN: 115K, 11s. DBpedia NL: 53K, 6s. DBpedia All: 511K, 32s. A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle. Assessing and Refining Mappings to RDF to Improve Dataset Quality. In Proceedings of The Semantic Web - ISWC 2015
  • 69. * http://mappings.dbpedia.org/validation DBpedia Mappings Quality Assessment. A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann. DBpedia Mappings Quality Assessment. To be published in Proceedings of the 15th International Semantic Web Conference: Posters and Demos 2016. Live update of DBpedia Mapping Quality Assessment results every night! ☺
  • 70. Linked Dataset Quality Assessment (DQA) Mappings Quality Assessment (MQA) Mapping & Dataset Quality Assessment Workflow Mappings & Quality Assessment Evaluation Results
  • 71. Violations are related to the dataset's schema (vocabularies or ontologies) and occur repeatedly within a single RDF dataset. The situation worsens as more ontologies and vocabularies are reused and combined
  • 72. Linked Data Quality Assessment: shifted from data consumption to data publication; integrated systematically in the publishing workflow; violations are identified, resolved and will not re-appear. Linked Data of higher Quality is generated!
  • 73. Mappings Validation Data Quality Tutorial - SEMANTICS2016 Anastasia Dimou Anastasia.Dimou@ugent.be ● @natadimou Ghent University – iMinds
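
The datatype check that runs through slides 18 to 57 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not RDFUnit's or the RML processor's implementation: the mapping from slide 32 is held as plain (subject, predicate, object) tuples, the blank-node labels (_:sm, _:pom, _:om) and all function names are hypothetical, and the SPARQL fragment stays a fragment because its leading part is elided on the slides. The sketch fills in the %%P1%% and %%D1%% placeholders, applies the mapping-level version of the test, and derives the DEL/ADD refinement.

```python
from typing import NamedTuple

# Pattern text as shown on the slides (the part before WHERE is elided there).
DATASET_PATTERN = "WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) }"


def instantiate(pattern: str, bindings: dict) -> str:
    """Replace each %%NAME%% placeholder with its bound value."""
    for name, value in bindings.items():
        pattern = pattern.replace(f"%%{name}%%", value)
    return pattern


class Violation(NamedTuple):
    root: str      # node carrying the wrong value (spin:violationRoot)
    path: str      # property holding it (spin:violationPath)
    value: str     # value found in the mapping (spin:violationValue)
    expected: str  # value the test expected (rut:missingValue on the slides)


# Slide 32's mapping as triples; blank-node labels are hypothetical.
MAPPING = {
    ("<#Mapping>", "rr:subjectMap", "_:sm"),
    ("_:sm", "rr:class", "dbo:Event"),
    ("_:sm", "rr:template", '"http://example.com/{Name}"'),
    ("<#Mapping>", "rr:predicateObjectMap", "_:pom"),
    ("_:pom", "rr:predicate", "dbo:birthDate"),
    ("_:pom", "rr:objectMap", "_:om"),
    ("_:om", "rml:reference", '"Birth"'),
    ("_:om", "rr:datatype", "xsd:gYear"),
}


def objects(triples, subject, predicate):
    """All objects of (subject, predicate, ?o), sorted for stable output."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def datatype_violations(triples, predicate, expected):
    """Mapping-level test: follow rr:predicateObjectMap -> rr:objectMap ->
    rr:datatype for the given predicate and report datatypes != expected."""
    found = []
    for _resource, p, po_map in sorted(triples):
        if p != "rr:predicateObjectMap":
            continue
        if predicate not in objects(triples, po_map, "rr:predicate"):
            continue
        for obj_map in objects(triples, po_map, "rr:objectMap"):
            for datatype in objects(triples, obj_map, "rr:datatype"):
                if datatype != expected:
                    found.append(
                        Violation(obj_map, "rr:datatype", datatype, expected)
                    )
    return found


def refinement(v: Violation) -> list:
    """Turn a violation into the DEL/ADD pair used on the slides."""
    return [
        f"DEL: {v.root} {v.path} {v.value}.",
        f"ADD: {v.root} {v.path} {v.expected}.",
    ]


if __name__ == "__main__":
    print(instantiate(DATASET_PATTERN, {"P1": "dbo:birthDate", "D1": "xsd:date"}))
    for violation in datatype_violations(MAPPING, "dbo:birthDate", "xsd:date"):
        print("\n".join(refinement(violation)))
```

Because the check reads the mapping rather than the generated triples, it reports one violation per faulty rule, however many records that rule would later be applied to.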