DBGroup@UNIMO
Dot. Fabio Benedetti
Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia
D Day 2015 – Modena Italy
LODeX: Schema Summarization and automatic SPARQL query
generation for Linked Open Data sources
Fabio Benedetti
Department of Engineering “Enzo Ferrari”
University of Modena & Reggio Emilia
D-Day 2015 - Modena
[Schmachtenberg, Max, Christian Bizer, and Heiko Paulheim. "Adoption of the Linked Data Best Practices in
Different Topical Domains." The Semantic Web–ISWC 2014. Springer International Publishing, 2014. 245-260]
*Only 570 datasets belong to the LOD cloud;
the remaining datasets have no
incoming or outgoing links to the LOD cloud.
Domain                   2009              2014*
                         Number  %         Number  %
Cross-domain             41      13.95%    41      4.04%
Geographic               31      10.54%    21      2.07%
Government               49      16.67%    183     18.05%
Life sciences            41      13.95%    83      8.19%
Media                    25      8.50%     22      2.17%
Publications             87      29.59%    96      9.47%
Social web               0       0.00%     520     51.28%
User-generated content   20      6.80%     48      4.73%
Total                    294               1014
[Figure: distribution of datasets by domain, 2009 vs. 2014]
The Open Access trend encourages the
publication of Open Data in the form of
Linked Data,
but
discovering LOD sources of interest is a
complex task for a user.
Main issues
• There is no standard for documenting a dataset
• The structure of a dataset can be understood only
by exploring it manually
• Semantic Web technologies are extremely complex for
unskilled users
• Automatically extract and summarize a schema
(Schema Summary) that describes a LOD dataset
• Use the Schema Summary to support the user in the
information extraction task
Online & automatic extraction
• It requires no additional information from the user
• It works with SPARQL endpoints
– We have to handle the poor performance of these endpoints
The Schema Summary has to describe a dataset
• Ontology/Vocabulary (OWL & RDFS constraints)
• Open Data (e.g. generated from an existing RDBMS)
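A minimal sketch of how a client might query a SPARQL endpoint while guarding against slow responses. It follows the SPARQL 1.1 Protocol convention of passing the query in the `query` URL parameter. The function names, the example endpoint URL and the 30-second timeout are illustrative assumptions, not part of LODeX.

```python
import json
import urllib.parse
import urllib.request


def build_sparql_request(endpoint_url, query):
    """Build a GET request that carries the query in the `query` parameter."""
    url = endpoint_url + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(url)
    # Ask for the standard JSON serialization of SELECT results.
    request.add_header("Accept", "application/sparql-results+json")
    return request


def run_query(endpoint_url, query, timeout_seconds=30):
    """Send the query; return parsed JSON, or None on network errors,
    timeouts, or malformed JSON."""
    request = build_sparql_request(endpoint_url, query)
    try:
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            return json.load(response)
    except (OSError, ValueError):
        # URLError and socket timeouts are OSError subclasses;
        # a JSON decoding failure is a ValueError.
        return None


# Hypothetical endpoint, used only to show the request shape.
request = build_sparql_request(
    "http://example.org/sparql", "SELECT * WHERE { ?s ?p ?o } LIMIT 1"
)
```

Returning `None` instead of raising lets a caller skip an unresponsive endpoint and move on to the next one.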
Two main modules
• Extraction & Summarization
• Visualization & Querying
LODeX uses a NoSQL
Database as back-end
Input: URLs of SPARQL endpoints
Output: Interactive Schema Summary
[Figure: LODeX architecture diagram. Components shown: LOD Cloud, Indexes Extraction, Statistical Indexes, Post-processing, Schema Summary, NoSQL database, Schema Summary Visualization, Query Orchestrator, Sgvizler. Labeled flows: Endpoint URLs, SPARQL Queries, Basic Query, Results.]
Statistical Indexes
They consist of 9 indexes divided into three groups:
• General group
• Intensional group
• Extensional group
The IE process generates the SPARQL queries used to extract the
different indexes.
• An iterative algorithm extracts the intensional knowledge
• Pattern Strategy technique
– It produces a higher number of less complex
SPARQL queries
The IE process performs online index extraction while coping with the
performance issues of the SPARQL endpoints
[F. Benedetti, S. Bergamaschi, and L. Po, “Online index extraction from linked open data sources,” 2014, Linked Data for Information
Extraction (LD4IE) Workshop held at International Semantic Web Conference]
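The idea of trading one heavy query for many light ones can be sketched as follows. The slides do not list the actual patterns, so this decomposition is an assumption chosen for illustration: one `GROUP BY` aggregate is replaced by a paged class listing plus one small `COUNT` query per class. All function names are hypothetical.

```python
def single_complex_query():
    """One aggregate query: instance counts for every class at once."""
    return (
        "SELECT ?class (COUNT(?s) AS ?n) "
        "WHERE { ?s a ?class } GROUP BY ?class"
    )


def list_classes_query(limit=1000, offset=0):
    """First step: page through the distinct classes in a stable order."""
    return (
        "SELECT DISTINCT ?class WHERE { ?s a ?class } "
        f"ORDER BY ?class LIMIT {limit} OFFSET {offset}"
    )


def per_class_count_queries(class_iris):
    """Second step: one small COUNT query per class IRI."""
    return [
        f"SELECT (COUNT(?s) AS ?n) WHERE {{ ?s a <{iri}> }}"
        for iri in class_iris
    ]


# Example class IRIs; in practice they would come from the first step.
classes = [
    "http://xmlns.com/foaf/0.1/Person",
    "http://xmlns.com/foaf/0.1/Document",
]
queries = per_class_count_queries(classes)
```

Each small query is more likely to finish within an endpoint's time limit, and a failure affects a single class rather than the whole index.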
The elements composing the Schema Summary are:
• Classes
• Properties
• Attributes
An algorithm combines the information contained in the Statistical Indexes to
produce and store the Schema Summary
[F. Benedetti, S. Bergamaschi, and L. Po, “A visual summary for linked open data sources,” 2014, International
Semantic Web Conference (Posters & Demos)]
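One way to picture a Schema Summary is as a small graph. The sketch below assumes that classes are nodes, properties are edges between two classes, and attributes are literal-valued properties attached to a class. The slide only names the three elements, so this reading, the input shapes and every name in the code are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClassNode:
    iri: str
    instance_count: int = 0
    attributes: list = field(default_factory=list)  # literal-valued properties


@dataclass
class PropertyEdge:
    source: str  # IRI of the subject class
    iri: str     # IRI of the property
    target: str  # IRI of the object class


@dataclass
class SchemaSummary:
    classes: dict = field(default_factory=dict)
    properties: list = field(default_factory=list)


def build_summary(class_counts, object_links, literal_props):
    """Combine three hypothetical index tables into one summary graph."""
    summary = SchemaSummary()
    for iri, count in class_counts.items():
        summary.classes[iri] = ClassNode(iri, count)
    for class_iri, prop_iri in literal_props:
        summary.classes[class_iri].attributes.append(prop_iri)
    for source, prop_iri, target in object_links:
        summary.properties.append(PropertyEdge(source, prop_iri, target))
    return summary


summary = build_summary(
    class_counts={"ex:Person": 120, "ex:Paper": 45},
    object_links=[("ex:Person", "ex:authorOf", "ex:Paper")],
    literal_props=[("ex:Person", "ex:name"), ("ex:Paper", "ex:title")],
)
```

A structure like this serializes naturally to a JSON document, which may be one reason a document-oriented back-end is convenient.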
[Figure: diagram relating the Schema Summary, the Basic Query, the SPARQL compiler and the SPARQL query]
• Using the web application GUI, the user is
guided in building a Basic Query
• A refinement panel helps the user refine
the Basic Query
A SPARQL compiler automatically generates
the corresponding SPARQL query
Operators supported by the compiler:
• AND
• OPTIONAL
• FILTER
• ORDER BY
• LIMIT
• OFFSET
The query is sent to the SPARQL endpoint
and the results can be visualized in a
table, map or chart view (pie, bar, etc.)
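A compiler of this kind can be sketched as a function that turns a structured query description into SPARQL text. The dictionary layout used for the Basic Query below is an assumption made for illustration; the slides do not describe LODeX's internal representation.

```python
def compile_basic_query(basic_query):
    """Turn a dict-based Basic Query (hypothetical format) into SPARQL."""
    lines = []
    for prefix, namespace in basic_query.get("prefixes", {}).items():
        lines.append(f"PREFIX {prefix}: <{namespace}>")
    lines.append("SELECT " + " ".join(basic_query["select"]))
    lines.append("WHERE {")
    # AND: required triple patterns are listed one after another.
    for s, p, o in basic_query.get("required", []):
        lines.append(f"  {s} {p} {o} .")
    # OPTIONAL: each optional pattern gets its own OPTIONAL block.
    for s, p, o in basic_query.get("optional", []):
        lines.append(f"  OPTIONAL {{ {s} {p} {o} . }}")
    # FILTER: expressions are passed through unchanged.
    for expression in basic_query.get("filters", []):
        lines.append(f"  FILTER ({expression})")
    lines.append("}")
    # Solution modifiers follow the closing brace, in this order.
    if basic_query.get("order_by"):
        lines.append("ORDER BY " + " ".join(basic_query["order_by"]))
    if basic_query.get("limit") is not None:
        lines.append(f"LIMIT {basic_query['limit']}")
    if basic_query.get("offset") is not None:
        lines.append(f"OFFSET {basic_query['offset']}")
    return "\n".join(lines)


example = {
    "prefixes": {"foaf": "http://xmlns.com/foaf/0.1/"},
    "select": ["?person", "?name", "?email"],
    "required": [
        ("?person", "a", "foaf:Person"),
        ("?person", "foaf:name", "?name"),
    ],
    "optional": [("?person", "foaf:mbox", "?email")],
    "filters": ['regex(?name, "^A")'],
    "order_by": ["?name"],
    "limit": 10,
    "offset": 0,
}
sparql = compile_basic_query(example)
print(sparql)
```

Keeping the query as data until the last step means the GUI only edits a structure, and the SPARQL syntax is produced in one place.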
Try LODeX demo at: http://dbgroup.unimo.it/lodex2
[F. Benedetti, S. Bergamaschi, and L. Po, “Visual Querying LOD sources with LODeX,” 2014, submitted to the
Semantic Web journal]
Test Nov. 2014
Dataset URLs               559
Reachable datasets         302
SPARQL 1.1 compatible      206
Extraction completed       185

Task                       Correct answers
Schema Summary browsing    94% (32/34)
Query generation           88% (60/68)

Online survey with 17 anonymous
users:
• 8 skilled users
• 9 unskilled users
The survey is divided into two parts:
• Schema Summary browsing clarity
• Query generation
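The reported figures can be checked with a few lines of arithmetic. The sketch below recomputes the survey percentages and expresses each stage of the November 2014 test as a share of the 559 dataset URLs. The shares for the test stages are derived here and do not appear on the slide.

```python
def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


# Stages of the Nov. 2014 test, as reported.
funnel = [
    ("Dataset URLs", 559),
    ("Reachable datasets", 302),
    ("SPARQL 1.1 compatible", 206),
    ("Extraction completed", 185),
]
total = funnel[0][1]
for label, count in funnel:
    print(f"{label}: {count} ({percent(count, total)}% of all URLs)")

# Survey results: 32/34 and 60/68 correct answers.
browsing = percent(32, 34)    # 94.1, reported as 94%
generation = percent(60, 68)  # 88.2, reported as 88%
```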
• Modify the interface of LODeX according to the
results of the online survey
• Extend the VoID descriptor vocabulary in order
to represent the Statistical Indexes and publish our
data as LOD
– Build an observatory for the LOD cloud
• Define clustering techniques to reduce the size of
the Summary for huge datasets
Accepted papers
• Beneventano, D., Bergamaschi, S., Sorrentino, S., Vincini, M., Benedetti, F. “Semantic
annotation of the CEREALAB database by the AGROVOC linked dataset” (2014)
Ecological Informatics journal. Article in press.
• F. Benedetti, S. Bergamaschi, and L. Po, “Online index extraction from linked open
data sources” 2014, Linked Data for Information Extraction (LD4IE) Workshop held at
International Semantic Web Conference
• F. Benedetti, S. Bergamaschi, and L. Po, “A visual summary for linked open data
sources” 2014, International Semantic Web Conference (Posters & Demos)
Submitted papers
• F. Benedetti, S. Bergamaschi, and L. Po, “Visual Querying LOD sources with LODeX”
2014, submitted to Semantic Web – Interoperability, Usability, Applicability, an IOS
Press journal
European projects & schools
• Web Science Summer School - Southampton University (20-26 July 2014)
• RDA Research Data Alliance - RDA Fourth Plenary Meeting 22 - 24 September 2014 in
Amsterdam. I won an Early Career Scientist grant and I belong to the Big Data
Analytics Interest group.
• Keystone - COST Action IC1302. Autumn 2014 MC and WG Meetings “QUERYING THE
SEMANTIC WEB” 17-18 October 2014, Riva del Garda, TN.
Thanks for your attention!

Más contenido relacionado

Similar a LODeX: Schema Summarization and automatic SPARQL query generation for Linked Open Data sources​

Visual Querying LOD sources with LODeX
 Visual Querying LOD sources with LODeX Visual Querying LOD sources with LODeX
Visual Querying LOD sources with LODeXFabio Benedetti
 
A Survey of Exploratory Search Systems Based on LOD Resources
A Survey of Exploratory Search Systems Based on LOD ResourcesA Survey of Exploratory Search Systems Based on LOD Resources
A Survey of Exploratory Search Systems Based on LOD ResourcesKarwan Jacksi
 
Linked Open Data Visualization
Linked Open Data VisualizationLinked Open Data Visualization
Linked Open Data VisualizationLaura Po
 
Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...
Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...
Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...Paolo Nesi
 
CNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundationCNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundationJohn Doove
 
Linked data and semantic wikis
Linked data and semantic wikisLinked data and semantic wikis
Linked data and semantic wikisSören Auer
 
Semantic Web Methodologies, Best Practices and Ontology Engineering Applied t...
Semantic Web Methodologies, Best Practices and Ontology Engineering Applied t...Semantic Web Methodologies, Best Practices and Ontology Engineering Applied t...
Semantic Web Methodologies, Best Practices and Ontology Engineering Applied t...Ghislain ATEMEZING
 
[3.6] Beyond Data Sharing - Pieter van Gorp [3TU.Datacentrum Symposium 2014, ...
[3.6] Beyond Data Sharing - Pieter van Gorp [3TU.Datacentrum Symposium 2014, ...[3.6] Beyond Data Sharing - Pieter van Gorp [3TU.Datacentrum Symposium 2014, ...
[3.6] Beyond Data Sharing - Pieter van Gorp [3TU.Datacentrum Symposium 2014, ...3TU.Datacentrum
 
Linked Data at the OU - the story so far
Linked Data at the OU - the story so farLinked Data at the OU - the story so far
Linked Data at the OU - the story so farEnrico Daga
 
How To Structure Your Search Team for Success
How To Structure Your Search Team for SuccessHow To Structure Your Search Team for Success
How To Structure Your Search Team for SuccessOpenSource Connections
 
Describing Theses and Dissertations Using Schema.org
Describing Theses and Dissertations Using Schema.orgDescribing Theses and Dissertations Using Schema.org
Describing Theses and Dissertations Using Schema.orgOCLC
 
Proof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics InteroperabilityProof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics InteroperabilityOpen Cyber University of Korea
 
Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items
 Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items
Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata ItemsLviv Data Science Summer School
 
Linked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the SoftwareLinked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the SoftwareIMC Technologies
 

Similar a LODeX: Schema Summarization and automatic SPARQL query generation for Linked Open Data sources​ (20)

Visual Querying LOD sources with LODeX
 Visual Querying LOD sources with LODeX Visual Querying LOD sources with LODeX
Visual Querying LOD sources with LODeX
 
LOD2 Webinar Series: CubeViz
LOD2 Webinar Series: CubeViz LOD2 Webinar Series: CubeViz
LOD2 Webinar Series: CubeViz
 
A Survey of Exploratory Search Systems Based on LOD Resources
A Survey of Exploratory Search Systems Based on LOD ResourcesA Survey of Exploratory Search Systems Based on LOD Resources
A Survey of Exploratory Search Systems Based on LOD Resources
 
Linked Open Data Visualization
Linked Open Data VisualizationLinked Open Data Visualization
Linked Open Data Visualization
 
LOD2: State of Play WP3A - Knowledge Base Creation, Enrichment and Repair
LOD2: State of Play WP3A - Knowledge Base Creation, Enrichment and RepairLOD2: State of Play WP3A - Knowledge Base Creation, Enrichment and Repair
LOD2: State of Play WP3A - Knowledge Base Creation, Enrichment and Repair
 
Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...
Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...
Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...
 
CNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundationCNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundation
 
Linked data and semantic wikis
Linked data and semantic wikisLinked data and semantic wikis
Linked data and semantic wikis
 
Semantic Web Methodologies, Best Practices and Ontology Engineering Applied t...
Semantic Web Methodologies, Best Practices and Ontology Engineering Applied t...Semantic Web Methodologies, Best Practices and Ontology Engineering Applied t...
Semantic Web Methodologies, Best Practices and Ontology Engineering Applied t...
 
LOD2 Webinar: UnifiedViews
LOD2 Webinar: UnifiedViewsLOD2 Webinar: UnifiedViews
LOD2 Webinar: UnifiedViews
 
[3.6] Beyond Data Sharing - Pieter van Gorp [3TU.Datacentrum Symposium 2014, ...
[3.6] Beyond Data Sharing - Pieter van Gorp [3TU.Datacentrum Symposium 2014, ...[3.6] Beyond Data Sharing - Pieter van Gorp [3TU.Datacentrum Symposium 2014, ...
[3.6] Beyond Data Sharing - Pieter van Gorp [3TU.Datacentrum Symposium 2014, ...
 
Linked Data at the OU - the story so far
Linked Data at the OU - the story so farLinked Data at the OU - the story so far
Linked Data at the OU - the story so far
 
How To Structure Your Search Team for Success
How To Structure Your Search Team for SuccessHow To Structure Your Search Team for Success
How To Structure Your Search Team for Success
 
LOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and Repair
LOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and RepairLOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and Repair
LOD2 Plenary Vienna 2012: WP3 - Knowledge Base Creation, Enrichment and Repair
 
A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-...
A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-...A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-...
A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-...
 
bonino
boninobonino
bonino
 
Describing Theses and Dissertations Using Schema.org
Describing Theses and Dissertations Using Schema.orgDescribing Theses and Dissertations Using Schema.org
Describing Theses and Dissertations Using Schema.org
 
Proof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics InteroperabilityProof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics Interoperability
 
Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items
 Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items
Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items
 
Linked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the SoftwareLinked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the Software
 

Último

Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxseri bangash
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICEayushi9330
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusNazaninKarimi6
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑Damini Dixit
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)Areesha Ahmad
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
chemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdfchemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdfTukamushabaBismark
 
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verifiedDelhi Call girls
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Monika Rani
 
Introduction to Viruses
Introduction to VirusesIntroduction to Viruses
Introduction to VirusesAreesha Ahmad
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professormuralinath2
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Servicemonikaservice1
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Silpa
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryAlex Henderson
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPirithiRaju
 

Último (20)

Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
chemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdfchemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdf
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
Introduction to Viruses
Introduction to VirusesIntroduction to Viruses
Introduction to Viruses
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 

LODeX: Schema Summarization and automatic SPARQL query generation for Linked Open Data sources​

  • 1. DBGroup@UNIMO Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia 1 D Day 2015 – Modena Italy LODeX: Schema Summarization and automatic SPARQL query generation for Linked Open Data sources Fabio Benedetti Department of Engineering “Enzo Ferrari” University of Modena & Reggio Emilia D-Day 2015 - Modena
  • 2. DBGroup@UNIMO 3 Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia D Day 2015 – Modena Italy LODeX: Schema Summarization and automatic SPARQL query generation for Linked Open Data sources Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia 3 [Schmachtenberg, Max, Christian Bizer, and Heiko Paulheim. "Adoption of the Linked Data Best Practices in Different Topical Domains." The Semantic Web–ISWC 2014. Springer International Publishing, 2014. 245-260}
  • 3. DBGroup@UNIMO 4 Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia D Day 2015 – Modena Italy LODeX: Schema Summarization and automatic SPARQL query generation for Linked Open Data sources Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia 4 *Only 570 datasets belong to the LOD cloud, the remaining datasets do not contain ingoing/outgoing links to the LOD Cloud. 2009 2014* Domain Number % Number % Cross-domain 41 13.95% 41 4.04% Geographic 31 10.54% 21 2.07% Government 49 16.67% 183 18.05% Life sciences 41 13.95% 83 8.19% Media 25 8.50% 22 2.17% Publications 87 29.59% 96 9.47% Social web 0 0.00% 520 51.28% User-generated content 20 6.80% 48 4.73% Total 294 1014 2009 Domain Cross-domain Geographic Government Life sciences Media Publications Social web 2014
  • 4. DBGroup@UNIMO 5 Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia D Day 2015 – Modena Italy LODeX: Schema Summarization and automatic SPARQL query generation for Linked Open Data sources Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia 5 The Open Access trends encourage the publication of Open Data in form of Linked Data But discovering LOD sources of interest is a complex task for a user Main issues • Do not exist any standard to document a Dataset • The structure of the Dataset can be understood only manually exploring the Dataset • The Semantic Web technologies are extremely complex for unskilled user
  • 5. DBGroup@UNIMO 6 Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia D Day 2015 – Modena Italy LODeX: Schema Summarization and automatic SPARQL query generation for Linked Open Data sources Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia 6 • To automatically extract and summarize a schema (Schema Summary) able to describe a LOD Dataset • Use the Schema Summary to support the user in the information extraction task Online & Automatic extraction • It does not require any additional information by the user • It works with SPARQL endpoints – We have to handle the bad performance issues of these Datasets The Schema Summary has to describe a Dataset • Ontology/Vocabulary (OWL & RDFS constraints) • Open Data (i.e. generated from existing RDBMS)
  • 6. DBGroup@UNIMO 7 Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia D Day 2015 – Modena Italy LODeX: Schema Summarization and automatic SPARQL query generation for Linked Open Data sources Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia 7 Two main modules • Extraction & Summarization • Visualization & Querying LODeX uses a NoSQL Database as back-end Input URLs of SPARQL endpoints Output Interactive Schema Summary LOD Cloud SPARQL Queries Schema Summary NoSQL LODeX Post- processing Statistical Indexes LODeX Indexes Extraction Query Orchestrator Schema Summary Visualizzation Schema Summary Basic QueryResults Endpoint URLs Sgvizler SPARQL Queries
  • 7. DBGroup@UNIMO 8 Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia D Day 2015 – Modena Italy LODeX: Schema Summarization and automatic SPARQL query generation for Linked Open Data sources Dot. Fabio Benedetti Dip. Ing. “Enzo Ferrari” – University of Modena e Reggio Emilia 8 Statistical Indexes They are composed by 9 indexes divided in three groups: • General group • Intensional group • Extensional group The IE process is able to generate the SPARQL queries used to extract the different indexes. • Iterative algorithm able to extract the Intensional knowledge • Pattern Strategy technique – It is a technique able to produce an higher number of less complex SPARQL query The IE process is able to perform online index extraction handling the performance issues of the SPARQL endpoints [F. Benedetti, S. Bergamaschi, and L. Po, “Online index extraction from linked open data sources,” 2014, Linked Data for Information Extraction (LD4IE) Workshop held at International Semantic Web Conference]
  • 8. The elements composing the Schema Summary are:
      • Classes
      • Properties
      • Attributes
      An algorithm combines the information contained in the Statistical Indexes to produce and store the Schema Summary.
      [F. Benedetti, S. Bergamaschi, and L. Po, “A visual summary for linked open data sources,” 2014, International Semantic Web Conference (Posters & Demos)]
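
To make the three element types concrete, here is a minimal sketch of how index data could be combined into a document suitable for a NoSQL store. The index shapes (`class_counts`, `object_links`, `datatype_props`) and the output layout are assumptions made for illustration; the slides do not describe the actual LODeX indexes or algorithm at this level of detail.

```python
import json
from typing import Dict, Set, Tuple


def build_schema_summary(
    class_counts: Dict[str, int],                   # class URI -> number of instances
    object_links: Dict[Tuple[str, str, str], int],  # (source class, property, target class) -> count
    datatype_props: Set[Tuple[str, str]],           # (class, attribute) pairs
) -> dict:
    """Combine three hypothetical indexes into one JSON-like summary."""
    classes = []
    for uri, n in sorted(class_counts.items()):
        # Attributes are the literal-valued properties attached to this class.
        attributes = sorted(p for (c, p) in datatype_props if c == uri)
        classes.append({"uri": uri, "instances": n, "attributes": attributes})
    # Properties connect a source class to a target class.
    properties = [
        {"source": s, "property": p, "target": t, "count": n}
        for (s, p, t), n in sorted(object_links.items())
    ]
    return {"classes": classes, "properties": properties}


# Made-up example data.
summary = build_schema_summary(
    class_counts={"ex:City": 120, "ex:Person": 300},
    object_links={("ex:Person", "ex:livesIn", "ex:City"): 280},
    datatype_props={("ex:Person", "ex:name"), ("ex:City", "ex:population")},
)
print(json.dumps(summary, indent=2))
```

In this layout classes act as graph nodes, properties as edges between them, and attributes as node annotations, which matches the three element types listed on the slide.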
  • 9. [Diagram: Schema Summary, Basic Query, SPARQL compiler, SPARQL query]
      • Through the web application GUI, the user is guided in building a Basic Query
      • A refinement panel helps the user refine the Basic Query
      A SPARQL compiler automatically generates the corresponding SPARQL query.
      Operators supported by the compiler:
      • AND
      • OPTIONAL
      • FILTER
      • ORDER BY
      • LIMIT
      • OFFSET
      The query is sent to the SPARQL endpoint, and the results can be visualized in a table, map, or chart view (pie, bar, etc.).
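
As a rough illustration of what compiling a Basic Query could look like, the sketch below turns a small in-memory description into SPARQL text using the operators listed on the slide. The `BasicQuery` structure, its field names, and `compile_sparql` are hypothetical; the actual LODeX compiler and its input format are not described in the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BasicQuery:
    """Hypothetical in-memory form of a Basic Query built in the GUI."""
    class_uri: str
    required: Dict[str, str] = field(default_factory=dict)  # variable -> property URI (AND)
    optional: Dict[str, str] = field(default_factory=dict)  # variable -> property URI (OPTIONAL)
    filters: List[str] = field(default_factory=list)        # raw SPARQL filter expressions
    order_by: Optional[str] = None                           # variable name to sort by
    limit: Optional[int] = None
    offset: Optional[int] = None


def compile_sparql(q: BasicQuery) -> str:
    # Triple patterns in the same group are implicitly joined (AND).
    patterns = [f"?s a <{q.class_uri}> ."]
    for var, prop in q.required.items():
        patterns.append(f"?s <{prop}> ?{var} .")
    for var, prop in q.optional.items():
        patterns.append(f"OPTIONAL {{ ?s <{prop}> ?{var} }}")
    for expr in q.filters:
        patterns.append(f"FILTER ({expr})")

    variables = ["?s"] + [f"?{v}" for v in list(q.required) + list(q.optional)]
    lines = ["SELECT " + " ".join(variables), "WHERE {"]
    lines += ["  " + p for p in patterns]
    lines.append("}")
    # Solution modifiers are appended only when requested.
    if q.order_by is not None:
        lines.append(f"ORDER BY ?{q.order_by}")
    if q.limit is not None:
        lines.append(f"LIMIT {q.limit}")
    if q.offset is not None:
        lines.append(f"OFFSET {q.offset}")
    return "\n".join(lines)


# Made-up class and property URIs.
query = compile_sparql(BasicQuery(
    class_uri="http://example.org/City",
    required={"name": "http://example.org/name"},
    optional={"population": "http://example.org/population"},
    filters=['STRSTARTS(?name, "M")'],
    order_by="name",
    limit=10,
    offset=0,
))
print(query)
```

Keeping the Basic Query as plain data and generating the text in one place means the GUI never has to manipulate SPARQL strings directly.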
  • 11. Try the LODeX demo at: http://dbgroup.unimo.it/lodex2
      [F. Benedetti, S. Bergamaschi, and L. Po, “Visual Querying LOD sources with LODeX,” 2014, submitted to the Semantic Web journal]
  • 12. Test (Nov. 2014):
      Dataset URLs             559
      Reachable datasets       302
      SPARQL 1.1 compatible    206
      Extraction completed     185
      Online survey with 17 anonymous users:
      • 8 skilled users
      • 9 unskilled users
      The survey is divided into two parts:
      • Schema Summary browsing clarity
      • Query generation
      Task                       Correct answers
      Schema Summary browsing    94% (32/34)
      Query generation           88% (60/68)
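
The percentages on this slide can be recomputed from the raw counts. The short sketch below does so and also expresses each extraction stage as a share of the 559 dataset URLs; those shares are derived here and do not appear on the slide. The `percent` helper is introduced only for illustration.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Counts copied from the slide.
funnel = [
    ("Dataset URLs", 559),
    ("Reachable datasets", 302),
    ("SPARQL 1.1 compatible", 206),
    ("Extraction completed", 185),
]
total = funnel[0][1]
for label, n in funnel:
    print(f"{label}: {n} ({percent(n, total):.1f}% of all URLs)")

# Survey results copied from the slide: (correct answers, total answers).
survey = {
    "Schema Summary browsing": (32, 34),
    "Query generation": (60, 68),
}
for task, (correct, answers) in survey.items():
    print(f"{task}: {round(percent(correct, answers))}% ({correct}/{answers})")
```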
  • 13. Future work:
      • Modify the LODeX interface according to the results of the online survey
      • Extend the VoID descriptor vocabulary to represent the Statistical Indexes and publish our data as LOD
        – Build an observatory for the LOD cloud
      • Define clustering techniques to reduce the size of the Schema Summary for very large datasets
  • 14. Accepted papers
      • D. Beneventano, S. Bergamaschi, S. Sorrentino, M. Vincini, and F. Benedetti, “Semantic annotation of the CEREALAB database by the AGROVOC linked dataset,” 2014, Ecological Informatics. Article in press.
      • F. Benedetti, S. Bergamaschi, and L. Po, “Online index extraction from linked open data sources,” 2014, Linked Data for Information Extraction (LD4IE) Workshop held at International Semantic Web Conference
      • F. Benedetti, S. Bergamaschi, and L. Po, “A visual summary for linked open data sources,” 2014, International Semantic Web Conference (Posters & Demos)
      Submitted papers
      • F. Benedetti, S. Bergamaschi, and L. Po, “Visual Querying LOD sources with LODeX,” 2014, submitted to Semantic Web – Interoperability, Usability, Applicability, an IOS Press journal
      European projects & schools
      • Web Science Summer School, Southampton University (20-26 July 2014)
      • RDA (Research Data Alliance) Fourth Plenary Meeting, 22-24 September 2014, Amsterdam. I won an Early Career Scientist grant and I belong to the Big Data Analytics Interest Group.
      • Keystone, COST Action IC1302. Autumn 2014 MC and WG Meetings “Querying the Semantic Web,” 17-18 October 2014, Riva del Garda, TN.
  • 15. Thanks for your attention!