EKAW 2016
ACRyLIQ: Leveraging DBpedia for Adaptive
Crowdsourcing in Linked Data Quality Assessment
Umair ul Hassan, Amrapali Zaveri, Edgard Marx, Edward Curry, Jens Lehmann
Background
• Linked Data Quality Assessment (LDQA)
– Incomplete, inaccurate, and inconsistent data in Linked Open Data (LOD)
• Crowdsourcing LDQA
1. Generate micro-tasks to assess the quality of a Linked Data dataset
2. Recruit crowd workers to perform the LDQA tasks
3. Update the dataset based on crowd answers
Zaveri, Amrapali, et al. "Quality assessment for linked data: A survey." Semantic Web 7.1 (2015): 63-93.
Acosta, Maribel, et al. "Crowdsourcing linked data quality assessment." International Semantic Web Conference. Springer Berlin Heidelberg, 2013.
[Figure: the linked dataset yields LDQA tasks for crowd workers; their answers flow back to the dataset as updates.]
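The slides do not say how crowd answers are combined before the dataset is updated. As a minimal sketch, assuming simple majority voting (an assumption, not the authors' stated method), step 3 could look like this; the task ids and answer labels are hypothetical:

```python
from collections import Counter


def majority_vote(answers):
    """Return the most common answer for one task, or None if there is no clear winner."""
    if not answers:
        return None
    counts = Counter(answers).most_common()
    # A tie between the top two answers gives no decision.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def aggregate(task_answers):
    """Map each task id to its majority answer (assumed aggregation rule)."""
    return {task: majority_vote(answers) for task, answers in task_answers.items()}


# Hypothetical crowd answers for three LDQA micro-tasks.
crowd_answers = {
    "t1": ["correct", "correct", "incorrect"],
    "t2": ["incorrect", "incorrect", "incorrect"],
    "t3": ["correct", "incorrect"],
}
decisions = aggregate(crowd_answers)
```

Tasks with a tie return None here, so the corresponding triples would be left unchanged rather than updated on weak evidence.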
Research Challenge
• Workers have varying reliability and expertise depending on the domain and topics of a dataset
[Figure: linked dataset feeding crowdsourced LDQA tasks.]
• Research question: How can we estimate the reliability of crowd workers to achieve high accuracy on LDQA tasks through adaptive task assignment?
Existing Approach
• Use experts to create gold-standard tasks (GST)
• Estimate worker reliability and assign tasks
[Figure: domain experts supply correct responses for gold-standard LDQA tasks drawn from the linked dataset; 1) GST selection, 2) task assignment of the crowdsourced LDQA tasks.]
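The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the algorithm from the paper: reliability is taken to be the fraction of gold-standard tasks a worker answers correctly per topic, and assignment is a simple greedy rule with a per-worker capacity. All names and sample data are hypothetical.

```python
from collections import defaultdict


def estimate_reliability(responses, gold):
    """responses: list of (worker, task, answer); gold: task -> (topic, correct answer).

    Returns worker -> topic -> fraction of gold-standard tasks answered correctly.
    """
    correct = defaultdict(lambda: defaultdict(int))
    total = defaultdict(lambda: defaultdict(int))
    for worker, task, answer in responses:
        topic, truth = gold[task]
        total[worker][topic] += 1
        if answer == truth:
            correct[worker][topic] += 1
    return {
        w: {t: correct[w][t] / n for t, n in topics.items()}
        for w, topics in total.items()
    }


def assign_tasks(tasks, reliability, capacity):
    """Greedy rule (an assumption): give each task to the worker with the highest
    estimated reliability on its topic who still has capacity left."""
    load = {w: 0 for w in reliability}
    assignment = {}
    for task_id, topic in tasks:
        candidates = [w for w in reliability if load[w] < capacity]
        if not candidates:
            break  # every worker is at capacity
        best = max(candidates, key=lambda w: reliability[w].get(topic, 0.0))
        assignment[task_id] = best
        load[best] += 1
    return assignment


# Hypothetical gold-standard tasks and worker responses.
gold = {
    "g1": ("French", "yes"), "g2": ("French", "no"),
    "g3": ("Russian", "yes"), "g4": ("Russian", "no"),
}
responses = [
    ("alice", "g1", "yes"), ("alice", "g2", "no"),
    ("alice", "g3", "no"), ("alice", "g4", "no"),
    ("bob", "g1", "no"), ("bob", "g2", "no"),
    ("bob", "g3", "yes"), ("bob", "g4", "no"),
]
reliability = estimate_reliability(responses, gold)
tasks = [("t1", "French"), ("t2", "Russian"), ("t3", "French")]
assignment = assign_tasks(tasks, reliability, capacity=2)
```

With this sample data the French tasks go to the worker who scored best on French gold-standard tasks and the Russian task to the other worker, which is the behaviour adaptive assignment aims for.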
Proposed Approach
• Leverage DBpedia to generate knowledge-based questions (KBQs)
• Estimate worker reliability and assign tasks
[Figure: facts (i.e., triples) yield KBQs; 1) KBQ selection, 2) task assignment of the crowdsourced LDQA tasks from the linked dataset.]
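The slides do not describe how a fact becomes a question. One way to picture it, as a sketch only, is to turn each triple into a true/false statement, sometimes swapping the object for one taken from another fact with the same predicate. The function name, the swap rule, and the hand-written sample triples are assumptions for illustration; nothing here is fetched from DBpedia.

```python
import random


def make_kbqs(triples, n, seed=0):
    """Build up to n true/false questions from (subject, predicate, object) facts.

    With probability 0.5 the object is replaced by a distractor from another
    fact with the same predicate, making the expected answer False.
    """
    rng = random.Random(seed)
    facts = set(triples)
    objects_by_predicate = {}
    for _, p, o in triples:
        objects_by_predicate.setdefault(p, set()).add(o)

    questions = []
    for s, p, o in rng.sample(triples, min(n, len(triples))):
        # Only use distractors that do not produce another true fact.
        distractors = sorted(
            x for x in objects_by_predicate[p] if (s, p, x) not in facts
        )
        if distractors and rng.random() < 0.5:
            shown, truth = rng.choice(distractors), False
        else:
            shown, truth = o, True
        questions.append({
            "triple": (s, p, shown),
            "statement": f"{s} {p} {shown}",
            "answer": truth,
        })
    return questions


# Hand-written sample facts in DBpedia-style prefixed notation.
triples = [
    ("dbr:Paris", "dbo:country", "dbr:France"),
    ("dbr:Berlin", "dbo:country", "dbr:Germany"),
    ("dbr:Tokyo", "dbo:country", "dbr:Japan"),
]
questions = make_kbqs(triples, n=3)
```

Because the expected answer is derived from the facts themselves, no domain expert is needed to label the questions, which is the cost advantage the proposed approach relies on.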
Evaluation Methodology
• Languages dataset
– LDQA tasks: verify language tags for entities in the LinkedSpending dataset
– Topics: Chinese, English, French, Japanese, Russian
– KBQs: verify the language of DBpedia facts
– No. of tasks: 25
– No. of KBQs: 10
• Interlinks dataset
– LDQA tasks: verify relationships between entities as generated by OAEI
– Topics: Anatomy, Books, Economics, Geography, Nature
– KBQs: verify DBpedia facts based on SKOS relationships
– No. of tasks: 25
– No. of KBQs: 10
Evaluation Methodology
• Crowd Workers
– 60 workers from Amazon Mechanical Turk
– $1.50 for 30 minutes
– Answered 10 KBQs and 25 tasks for each of the two datasets
– Diverse reliability on Languages tasks
– Low reliability on Interlinks tasks
Results: Compared Approaches
KBQ approach generates reliability estimates similar to the GST approach
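The slide does not state how similarity between the two sets of estimates was measured. As one hedged illustration, the mean absolute difference between per-worker estimates could be used; the metric choice and the numbers below are placeholders for illustration, not results from the paper.

```python
def mean_absolute_difference(est_a, est_b):
    """Average absolute gap between two reliability estimates over shared workers."""
    shared = est_a.keys() & est_b.keys()
    if not shared:
        raise ValueError("no workers in common")
    return sum(abs(est_a[w] - est_b[w]) for w in shared) / len(shared)


# Placeholder estimates (NOT data from the study).
kbq_estimates = {"w1": 0.9, "w2": 0.5, "w3": 0.7}
gst_estimates = {"w1": 0.8, "w2": 0.5, "w3": 0.9}
mad = mean_absolute_difference(kbq_estimates, gst_estimates)
```

A value near zero would indicate that KBQ-based estimates track the gold-standard estimates closely.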
Results: Algorithm Parameters
Summary
• Strengths
– KBQs provide a quick and inexpensive method of estimating the reliability and expertise of workers
– Our approach is particularly suited for complex and knowledge-intensive tasks
• Limitations
– Assumes that LDQA tasks and KBQs are partitioned according to the same set of topics
– Assumes that all facts in DBpedia are correct
– Assumes that dataset topics are mutually exclusive
• Future work
– Scalability of the proposed approach needs to be validated
– Evaluate on a wider range of tasks and datasets
Thank you
Umair Ul Hassan, Amrapali Zaveri, Edgard Marx, Edward Curry, and Jens
Lehmann. “ACRyLIQ: Leveraging DBpedia for Adaptive Crowdsourcing in
Linked Data Quality Assessment”. In: 20th International Conference on
Knowledge Engineering and Knowledge Management. Springer
International Publishing. 2016
Questions:
umair.ulhassan@insight-centre.org
