DESWeb 2014
ICDE 2014, Chicago IL, USA, March 3
balloon Fusion: SPARQL Rewriting Based on Unified Co-Reference Information
Kai Schlegel (kai.schlegel@googlemail.com)
Florian Stegmaier, Sebastian Bayerl, Michael Granitzer, Harald Kosch
Outline
• Motivation
• SPARQL Rewriting & Federation
• Intermediate Results
Supported by the European Commission under the Seventh Framework Program
“Linked Data is the heart of Semantic Web” - W3C Semantic Web Group
Motivation: Nothing-as-a-Service
• Easy access to Linked Data
  • Query Linked Open Data with SPARQL
  • Plethora of tools available
• Problems:
  • Business oriented
  • Complex setup
  • Maintenance
  • “Paper-only”
  • Not developer friendly
• Simple and “instant” SPARQL Query Federation (-as-a-Service)
Querying LOD
• How to get information about the German city “Passau”?
• Problem: LOD is not a single database!
Query against de.dbpedia.org:
SELECT ?p ?o WHERE {
  <http://de.dbpedia.org/resource/Passau> ?p ?o.
}
Result: relations, coordinates, leader, etc. What about the population?
Distributed Querying!
• Problem: Selection of appropriate endpoints
• Send query to some endpoints and aggregate the results?
SELECT ?p ?o WHERE {
  <http://de.dbpedia.org/resource/Passau> ?p ?o.
}
Endpoints: de.dbpedia.org, linkedgeodata.org
Misunderstanding: Co-Referencing
• Problem: Different identifiers for the same semantic concept
SELECT ?p ?o WHERE {
  <http://de.dbpedia.org/resource/Passau> ?p ?o.
}
Endpoints: de.dbpedia.org, linkedgeodata.org
Known problem in linguistics:
“It’s a spud!” “What?” “I mean potato!”
Co-referencing: Multiple expressions refer to the same thing.
Problem = Solution?
SPARQL-based crawling of co-reference information
Exploit co-reference information for
• accomplishing immediate SPARQL rewriting
• performing endpoint selection
• executing automatic query federation
Basic idea: Unify distributed co-reference information
Main principle: Semantic entities over identifiers!
Components
balloon toolsuite
balloon Overflight
• SPARQL-based crawling of LOD endpoints
• Query: Ask for subjects and objects which are related by a special predicate
• Simplified global view on
• Equivalence: owl:sameAs, skos:exactMatch, coref:coreferenceData, ...
• Graph database: Neo4j
• Equivalence cluster: Multiple synonym URIs representing the same semantic entity, including provenance
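The clustering idea can be sketched in a few lines of Python. This is only an illustration, not the balloon implementation: the slides store clusters in Neo4j, while the sketch uses an in-memory union-find. The function name `build_clusters`, the statement layout (subject, object, source endpoint) and the direction of the two sample links are assumptions.

```python
from collections import defaultdict


def build_clusters(statements):
    """Group URIs into equivalence clusters.

    statements: iterable of (subject, object, source_endpoint) tuples, one per
    crawled equivalence link such as owl:sameAs (hypothetical layout).
    Returns a list of (set_of_uris, set_of_source_endpoints) pairs.
    """
    statements = list(statements)
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        # Path halving keeps the trees shallow.
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    # Merge the two clusters touched by every equivalence link.
    for subj, obj, _ in statements:
        root_s, root_o = find(subj), find(obj)
        if root_s != root_o:
            parent[root_s] = root_o

    members = defaultdict(set)
    for uri in parent:
        members[find(uri)].add(uri)

    # Provenance: which endpoints contributed links to each cluster.
    provenance = defaultdict(set)
    for subj, _, source in statements:
        provenance[find(subj)].add(source)

    return [(members[root], provenance[root]) for root in members]


# Sample links (illustrative; the URIs are the ones used in the slides).
links = [
    ("http://de.dbpedia.org/resource/Passau",
     "http://linkedgeodata.org/triplify/node240057351",
     "http://linkedgeodata.org/sparql/"),
    ("http://linkedgeodata.org/triplify/node240057351",
     "http://rdf.freebase.com/ns/m.01h5td",
     "http://linkedgeodata.org/sparql/"),
]
clusters = build_clusters(links)
```

With these two links the three URIs end up in one cluster whose provenance is the LinkedGeoData endpoint.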
balloon Fusion
SPARQL Federation setup using co-reference information
SPARQL Transformation for each basic graph pattern (BGP):
1. Determine synonym URIs
2. Select suitable endpoints
3. Adapt sub-queries to endpoints
4. Federated querying
Example query:
SELECT ?p ?o WHERE {
  <http://de.dbpedia.org/resource/Passau> ?p ?o.
}
1. Determine synonym URIs
SELECT ?p ?o WHERE {
  <http://de.dbpedia.org/resource/Passau> ?p ?o.
}
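A minimal sketch of step 1, assuming the equivalence clusters are available as a plain dictionary from URI to cluster. The names `CLUSTER_INDEX` and `synonyms_in_query` are hypothetical, and the regular expression is a simplification: it picks up every `<...>` IRI in the query text and ignores prefixed names.

```python
import re

# Illustrative cluster for Passau, using the synonym URIs shown in the slides.
PASSAU_CLUSTER = frozenset({
    "http://de.dbpedia.org/resource/Passau",
    "http://linkedgeodata.org/triplify/node240057351",
    "http://rdf.freebase.com/ns/m.01h5td",
})

# Hypothetical index: every URI maps to its whole equivalence cluster.
CLUSTER_INDEX = {uri: PASSAU_CLUSTER for uri in PASSAU_CLUSTER}

# Simplified IRI matcher: anything between angle brackets without whitespace.
IRI_PATTERN = re.compile(r"<([^<>\s]+)>")


def synonyms_in_query(query, cluster_index):
    """Map each IRI found in the query text to its known synonym URIs."""
    result = {}
    for iri in IRI_PATTERN.findall(query):
        # An IRI without a cluster is only a synonym of itself.
        result[iri] = set(cluster_index.get(iri, {iri}))
    return result


QUERY = """SELECT ?p ?o WHERE {
  <http://de.dbpedia.org/resource/Passau> ?p ?o.
}"""
found = synonyms_in_query(QUERY, CLUSTER_INDEX)
```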
2. Select suitable endpoints
• Provenance-based selection (PBS)
  • Endpoints which are involved in cluster composition
• Namespace-based selection (NBS)
  • Prefix and namespace matching of synonym URIs
Summarized: origin of co-reference information (PBS) and origin of synonym URIs (NBS)
2. Select suitable endpoints (2)
Assumption:
• Provenance information only contains “linkedgeodata.org” as co-reference origin
• Namespaces for Freebase and DBpedia available (datahub.io)
Selected endpoints:
• PBS: LinkedGeoData endpoint
• NBS: DBpedia endpoint
• NBS: Freebase endpoint
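The two selection strategies might look roughly like this in Python. The function names and the `NAMESPACES` table are assumptions for illustration; in particular, the namespace strings are guesses at what a datahub.io entry could provide, while the endpoint URLs are the ones that appear in the federated query of the slides.

```python
# Hypothetical namespace registry (namespace prefix -> SPARQL endpoint).
NAMESPACES = {
    "http://de.dbpedia.org/resource/": "http://dbpedia.org/sparql",
    "http://rdf.freebase.com/ns/": "http://www.freebase.com/base/sparql",
}


def provenance_based(cluster_sources):
    """PBS: endpoints that contributed co-reference statements to the cluster."""
    return set(cluster_sources)


def namespace_based(synonyms, namespace_to_endpoint):
    """NBS: endpoints whose registered namespace is a prefix of a synonym URI.

    Returns a dict endpoint -> set of synonym URIs from that namespace.
    """
    selected = {}
    for uri in synonyms:
        for namespace, endpoint in namespace_to_endpoint.items():
            if uri.startswith(namespace):
                selected.setdefault(endpoint, set()).add(uri)
    return selected


synonyms = {
    "http://de.dbpedia.org/resource/Passau",
    "http://linkedgeodata.org/triplify/node240057351",
    "http://rdf.freebase.com/ns/m.01h5td",
}
pbs = provenance_based({"http://linkedgeodata.org/sparql/"})
nbs = namespace_based(synonyms, NAMESPACES)
```

Under the assumption stated on the slide, PBS yields the LinkedGeoData endpoint and NBS yields the DBpedia and Freebase endpoints, each paired with the synonym URI from its own namespace.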
3. Adapt sub-queries to endpoints
Original query:
SELECT ?p ?o WHERE {
  <http://de.dbpedia.org/resource/Passau> ?p ?o.
}
NBS, Freebase endpoint:
SELECT ?p ?o WHERE {
  <http://rdf.freebase.com/ns/m.01h5td> ?p ?o.
}
NBS, DBpedia endpoint:
SELECT ?p ?o WHERE {
  <http://de.dbpedia.org/resource/Passau> ?p ?o.
}
PBS, LinkedGeoData endpoint:
SELECT ?p ?o WHERE {
  { <http://rdf.freebase.com/ns/m.01h5td> ?p ?o. }
  UNION
  { <http://linkedgeodata.org/triplify/node240057351> ?p ?o. }
  UNION
  { <http://de.dbpedia.org/resource/Passau> ?p ?o. }
}
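The adaptation step can be sketched as plain string rewriting. This is a simplified illustration under stated assumptions: the function names are hypothetical, the pattern is rewritten by textual substitution of one URI, and an endpoint selected by both strategies is assumed to receive the PBS variant.

```python
def substitute(bgp, original, uris):
    """Rewrite one basic graph pattern for every URI in `uris`.

    A single URI yields a plain pattern; several URIs yield a UNION of groups.
    """
    variants = [bgp.replace(f"<{original}>", f"<{uri}>") for uri in sorted(uris)]
    if len(variants) == 1:
        return variants[0]
    return "\nUNION\n".join("{ " + variant + " }" for variant in variants)


def adapt_sub_queries(bgp, original, synonyms, pbs_endpoints, nbs_endpoints):
    """Return a dict endpoint -> graph pattern adapted to that endpoint.

    PBS endpoints may describe any synonym, so they are asked for all of them.
    NBS endpoints are asked only about the URIs from their own namespace.
    """
    plan = {}
    for endpoint in pbs_endpoints:
        plan[endpoint] = substitute(bgp, original, synonyms)
    for endpoint, own_uris in nbs_endpoints.items():
        # Assumption: the PBS variant wins if an endpoint is selected twice.
        plan.setdefault(endpoint, substitute(bgp, original, own_uris))
    return plan


PASSAU = "http://de.dbpedia.org/resource/Passau"
FREEBASE = "http://rdf.freebase.com/ns/m.01h5td"
LGD = "http://linkedgeodata.org/triplify/node240057351"

plan = adapt_sub_queries(
    f"<{PASSAU}> ?p ?o.",
    PASSAU,
    {PASSAU, FREEBASE, LGD},
    pbs_endpoints=["http://linkedgeodata.org/sparql/"],
    nbs_endpoints={
        "http://dbpedia.org/sparql": {PASSAU},
        "http://www.freebase.com/base/sparql": {FREEBASE},
    },
)
```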
4. Federated Querying
• W3C SPARQL 1.1 Federated Query Extension (SERVICE)
• (Partial) Query can be executed against a remote SPARQL endpoint
• Distributed sub-queries don’t contain SPARQL 1.1 features
SELECT ?p ?o WHERE {
  {
    SERVICE <http://dbpedia.org/sparql> {
      <http://de.dbpedia.org/resource/Passau> ?p ?o.
    }
  } UNION {
    SERVICE <http://www.freebase.com/base/sparql> {
      <http://rdf.freebase.com/ns/m.01h5td> ?p ?o.
    }
  } UNION {
    SERVICE <http://linkedgeodata.org/sparql/> {
      { <http://rdf.freebase.com/ns/m.01h5td> ?p ?o. }
      UNION
      { <http://linkedgeodata.org/triplify/node240057351> ?p ?o. }
      UNION
      { <http://de.dbpedia.org/resource/Passau> ?p ?o. }
    }
  }
}
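Assembling the federated query from per-endpoint patterns could be sketched as follows. The function name `federate` and the shape of its `plan` argument (endpoint URL mapped to graph pattern text) are assumptions. Each SERVICE clause is wrapped in its own group so that the UNION operands are group graph patterns, as the SPARQL 1.1 grammar requires.

```python
def federate(projection, plan):
    """Build one SPARQL 1.1 query that sends each pattern to its endpoint.

    projection: the variables to select, e.g. "?p ?o".
    plan: dict endpoint URL -> graph pattern text (hypothetical layout).
    """
    if not plan:
        raise ValueError("at least one endpoint is required")
    branches = [
        f"{{ SERVICE <{endpoint}> {{ {pattern} }} }}"
        for endpoint, pattern in plan.items()
    ]
    body = "\nUNION\n".join(branches)
    return f"SELECT {projection} WHERE {{\n{body}\n}}"


federated = federate("?p ?o", {
    "http://dbpedia.org/sparql":
        "<http://de.dbpedia.org/resource/Passau> ?p ?o.",
    "http://www.freebase.com/base/sparql":
        "<http://rdf.freebase.com/ns/m.01h5td> ?p ?o.",
})
print(federated)
```

The resulting text would still have to be submitted to an engine that supports the SERVICE keyword; that step is left out here.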
Optimizations (ongoing)
• Endpoint status check
  • Check routine in terms of availability and latency
• Minimize sub-queries
  • Group sub-queries with common endpoint
  • Push join to endpoint
• SPARQL features
  • Condense PBS UNION-construct of synonym URIs
  • SPARQL 1.1 VALUES or FILTER with IN operator
  • Not well implemented in Linked Data endpoints
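Condensing the UNION construct might look like the sketch below, which binds the synonym URIs to a fresh variable with VALUES or restricts it with FILTER ... IN. The function names and the variable name `?entity` are assumptions, and the rewriting is again plain text substitution.

```python
def condense_with_values(bgp, original, synonyms, var="?entity"):
    """Replace the original URI by a variable bound through VALUES."""
    if var in bgp:
        raise ValueError(f"{var} is already used in the pattern")
    iris = " ".join(f"<{uri}>" for uri in sorted(synonyms))
    return f"VALUES {var} {{ {iris} }}\n" + bgp.replace(f"<{original}>", var)


def condense_with_filter(bgp, original, synonyms, var="?entity"):
    """Replace the original URI by a variable restricted through FILTER ... IN."""
    if var in bgp:
        raise ValueError(f"{var} is already used in the pattern")
    iris = ", ".join(f"<{uri}>" for uri in sorted(synonyms))
    return bgp.replace(f"<{original}>", var) + f"\nFILTER({var} IN ({iris}))"


PASSAU = "http://de.dbpedia.org/resource/Passau"
SYNONYMS = {
    PASSAU,
    "http://linkedgeodata.org/triplify/node240057351",
    "http://rdf.freebase.com/ns/m.01h5td",
}
values_pattern = condense_with_values(f"<{PASSAU}> ?p ?o.", PASSAU, SYNONYMS)
filter_pattern = condense_with_filter(f"<{PASSAU}> ?p ?o.", PASSAU, SYNONYMS)
```

Both forms are SPARQL 1.1 features, so, as the slide notes, they may not be usable on endpoints with incomplete SPARQL 1.1 support; the UNION form remains the fallback.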
balloon Overflight Results
Results from a sounding balloon
balloon toolsuite
Statistics
• Datahub.io: Linked Open Data Cloud catalog
  • 337 datasets in total
  • 237 expose a SPARQL endpoint
  • 112 successfully queried for co-reference information
• Balloon dataset (first run)
  • 17.6M co-reference statements
  • 22.4M distinct URIs
  • 8.4M equivalence clusters (~2.68 identifiers per cluster)
• Pending analysis
  • Distribution of cluster sizes, number of different hosts per cluster
  • Main representative per cluster & false friends
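The ratios behind these figures can be checked quickly. The sketch uses only the rounded numbers from the slide, so the identifiers-per-cluster value comes out at about 2.67 rather than the reported ~2.68, which was presumably computed from the unrounded counts.

```python
datasets_total = 337
with_endpoint = 237
queried_ok = 112
distinct_uris = 22.4e6   # rounded figure from the slide
clusters = 8.4e6         # rounded figure from the slide

endpoint_share = with_endpoint / datasets_total   # share of datasets with an endpoint
success_share = queried_ok / with_endpoint        # share of endpoints crawled successfully
ids_per_cluster = distinct_uris / clusters        # average cluster size

print(f"{endpoint_share:.1%} of datasets expose a SPARQL endpoint")
print(f"{success_share:.1%} of those endpoints were queried successfully")
print(f"{ids_per_cluster:.2f} identifiers per cluster (rounded inputs)")
```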
Open Source:
• Demo, information and sources available (MIT License)
• X as a Service
  • SPARQL Rewriting (HTTP API)
  • Query Federation (SPARQL)
http://schlegel.github.io/balloon
Summary:
• SPARQL-based crawling of distributed co-reference information
• Exploit co-reference information for SPARQL federation
Single Point of Access
Any questions?
“Research is formalized curiosity. It is poking and prying with a purpose.” - Zora Neale Hurston
