TripleCheckMate: A Tool for Crowdsourcing the Quality Assessment of Linked Data
Dimitris Kontokostas, Amrapali Zaveri, Sören Auer and Jens Lehmann
KESW 2013 Oct 08, 2013
Outline
❏ Data Quality
❏ Data Quality Assessment Methodology
❏ Evaluation Methodology - Manual
❏ Phase I: Quality Problem Taxonomy
❏ Phase II: Crowdsourcing Quality Assessment
❏ TripleCheckMate
❏ Architecture
❏ Demo
❏ Conclusion & Future Work
Data Quality
● Data Quality (DQ) is defined as:
○ fitness for a certain use case*
● On the Data Web - varying quality of information covering various domains
● High quality datasets
○ curated over decades - life science domain
○ crowdsourcing process - extracted from unstructured and semi-structured information, e.g. DBpedia
* J. Juran. The Quality Control Handbook. McGraw-Hill, New York, 1974.
Data Quality Assessment Methodology
4-Step Methodology:
❏ Step 1: Resource selection
  ❏ Per Class
  ❏ Completely random
  ❏ Manual
❏ Step 2: Evaluation mode selection
  ❏ Manual
  ❏ Semi-automatic
  ❏ Automatic
❏ Step 3: Resource evaluation
❏ Step 4: DQ improvement
  ❏ Direct
  ❏ Indirect
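The three resource selection modes of Step 1 could be sketched roughly as follows. This is only an illustration: the helper name `select_resource`, its parameters, and the toy data are assumptions made for this example and are not taken from the TripleCheckMate code base.

```python
import random

def select_resource(resources_by_class, mode, rng, chosen_class=None, manual_uri=None):
    """Pick one resource URI according to the selection mode (hypothetical helper)."""
    if mode == "manual":
        # Manual: the evaluator names the resource directly.
        return manual_uri
    if mode == "per_class":
        # Per class: draw at random among the instances of one class.
        return rng.choice(resources_by_class[chosen_class])
    if mode == "random":
        # Completely random: draw from all resources, ignoring their class.
        everything = [uri for uris in resources_by_class.values() for uri in uris]
        return rng.choice(everything)
    raise ValueError(f"unknown selection mode: {mode}")

# Toy data, invented for illustration only.
SAMPLE = {
    "dbo:Person": ["dbr:Ada_Lovelace", "dbr:Alan_Turing"],
    "dbo:Place": ["dbr:Leipzig"],
}

if __name__ == "__main__":
    rng = random.Random(0)
    print(select_resource(SAMPLE, "per_class", rng, chosen_class="dbo:Person"))
```

Passing the random generator in explicitly keeps the selection reproducible, which may help when several evaluators should be able to revisit the same sample.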
Evaluation Methodology - Manual
❏ Phase I: Creation of quality problem taxonomy
❏ Phase II: Crowdsourcing quality assessment
Phase I: Quality Problem Taxonomy
A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, and S. Auer. Quality assessment methodologies
for Linked Open Data: A Review. Under review, available at
http://www.semantic-web-journal.net/content/quality-assessment-methodologies-linked-open-data.
Phase II: Crowdsourcing Quality Assessment
              Crowdsourcing                              Our Approach
Type          Human Intelligence Tasks (HITs)            Contest-based
Participants  Labor market                               Linked Data (LD) experts
Task          Detect quality issues in triples           Detect & classify quality issues in resources
Reward        Per task/triple                            Most no. of resources evaluated
Tool          Amazon Mechanical Turk, CrowdFlower etc.   TripleCheckMate
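The contest-based reward in the comparison above ("most no. of resources evaluated") amounts to a simple ranking. A minimal sketch, assuming evaluations are available as (evaluator, resource) pairs; the function name and the tie-breaking rule are assumptions for this example, not the contest's actual rules:

```python
from collections import defaultdict

def leaderboard(evaluations):
    """Rank evaluators by the number of distinct resources they evaluated."""
    seen = defaultdict(set)
    for evaluator, resource in evaluations:
        # A set ignores repeated evaluations of the same resource.
        seen[evaluator].add(resource)
    # Sort by count (descending), then by name so ties come out deterministically.
    return sorted(
        ((evaluator, len(resources)) for evaluator, resources in seen.items()),
        key=lambda item: (-item[1], item[0]),
    )

if __name__ == "__main__":
    sample = [
        ("alice", "dbr:Leipzig"),
        ("alice", "dbr:Berlin"),
        ("bob", "dbr:Leipzig"),
        ("alice", "dbr:Leipzig"),  # duplicate, counted once
    ]
    print(leaderboard(sample))
```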
TripleCheckMate - Architecture (1/2)
TripleCheckMate - Architecture (2/2)
● Built on Java / GWT
○ GWT compiles to native cross-browser HTML/JS
● Tomcat / Jetty & MySQL as minimal backend
○ store/retrieve evaluation data only
● Application logic is built on the client
○ SPARQL executed on client
○ Portable
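Because the SPARQL queries run on the client, fetching the triples of a resource needs no server-side logic beyond the endpoint itself. The tool does this in GWT-compiled JavaScript; the Python sketch below only illustrates the idea. The function names, the query shape, and the use of the public DBpedia endpoint are assumptions for this example. The `query` parameter is the one defined for GET requests by the SPARQL 1.1 Protocol.

```python
from urllib.parse import urlencode

def triples_query(resource_uri, limit=100):
    """Build a SPARQL query listing the predicates and objects of one resource."""
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in resource_uri for ch in '<>"{}|^`\\ '):
        raise ValueError("resource URI contains characters not allowed in an IRI")
    return f"SELECT ?p ?o WHERE {{ <{resource_uri}> ?p ?o }} LIMIT {int(limit)}"

def request_url(endpoint, query):
    """Encode the query as a GET request URL for a SPARQL endpoint."""
    return endpoint + "?" + urlencode({"query": query})

if __name__ == "__main__":
    # The endpoint is an assumption; any SPARQL endpoint could be substituted.
    q = triples_query("http://dbpedia.org/resource/Leipzig", limit=10)
    print(request_url("https://dbpedia.org/sparql", q))
```

The sketch only builds the request and does not send it, so it runs without network access.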
Evaluation storage schema
● Designed to support multiple campaigns and different ontologies
● Quality taxonomy is stored in the database, which makes it easy to adapt
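One way to picture a schema with these two properties is sketched below. Every table and column name here is hypothetical and was invented for this example; the actual TripleCheckMate schema is not shown on the slides. SQLite is used in place of MySQL so that the sketch runs without a database server.

```python
import sqlite3

# Hypothetical schema: campaigns are rows, and the taxonomy is a
# self-referencing table, so both can change without altering any code.
SCHEMA = """
CREATE TABLE campaign (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    sparql_endpoint TEXT NOT NULL
);
CREATE TABLE error_category (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES error_category(id),
    label TEXT NOT NULL
);
CREATE TABLE evaluation (
    id INTEGER PRIMARY KEY,
    campaign_id INTEGER NOT NULL REFERENCES campaign(id),
    evaluator TEXT NOT NULL,
    resource TEXT NOT NULL,
    predicate TEXT NOT NULL,
    object TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES error_category(id)
);
"""

def open_store():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def add_category(conn, label, parent_id=None):
    """Extend the taxonomy by inserting a row; returns the new category id."""
    cur = conn.execute(
        "INSERT INTO error_category (parent_id, label) VALUES (?, ?)",
        (parent_id, label),
    )
    return cur.lastrowid

if __name__ == "__main__":
    conn = open_store()
    root = add_category(conn, "Accuracy")  # labels are illustrative only
    add_category(conn, "Incorrect datatype", parent_id=root)
    print(conn.execute("SELECT id, parent_id, label FROM error_category").fetchall())
```

Keeping the taxonomy as data rather than as an enumeration in code is what would make adapting it a matter of inserting or updating rows.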
TripleCheckMate - Demo
http://tinyurl.com/TCM-Demo
http://tinyurl.com/TCM-Screencast
Conclusion & Future Work
● TripleCheckMate
○ Tool for crowdsourcing quality assessment
○ Linked Data quality assessment
○ Supports inter-rater agreement
○ Can be used with any Linked Dataset
● Future Work
○ Directly integrating semi-automatic methods
○ Improve efficiency of quality assessment
○ Include support for Patch Ontology* as output format
* M. Knuth, J. Hercher, and H. Sack. Collaboratively patching linked data. CoRR, 2012.
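Inter-rater agreement, listed among the tool's features above, can be quantified in several ways. The slides do not say which statistic TripleCheckMate uses, so the sketch below picks Cohen's kappa for two raters purely as an example; the function name and the sample labels are assumptions.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items in the same order."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty list of items")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's label frequencies, summed over labels.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined here,
        # and this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)

if __name__ == "__main__":
    rater_1 = ["correct", "correct", "error", "error"]
    rater_2 = ["correct", "error", "error", "error"]
    print(cohen_kappa(rater_1, rater_2))
```

With more than two evaluators per resource, a multi-rater statistic such as Fleiss' kappa would be the more natural choice.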
Thank You
Questions?
http://nl.dbpedia.org:8080/TripleCheckMate-Demo/
https://github.com/AKSW/TripleCheckMate
http://aksw.org/AmrapaliZaveri
zaveri@informatik.uni-leipzig.de
Twitter: @amrapaliz
