METADATA AND CONTROL FEATURES
FOR LOW-COST LINKED DATA
PUBLISHING INFRASTRUCTURES
Public defense Drs. Miel Vander Sande
DEPARTMENT ELIS
RESEARCH GROUP IDLAB
http://wikipedia.org/wiki/Umberto_Eco
Uniform Resource Identifier
Web Client, Web Server
Request: “Give me http://wikipedia.org/wiki/Umberto_Eco”
Response: “Here’s the document http://wikipedia.org/wiki/Umberto_Eco”
HTTP protocol
+ 4 billion webpages
Wikipedia page of Umberto Eco
Diagram: the page is broken down into a graph. Node labels: Umberto Eco, Alessandria, The Name of The Rose, Renate Ramge. Edge labels: displays, name, author, birthplace, married.
Diagram, extended: a Film database is linked to the Book database. Added node labels: Sean Connery, Jean-Jacques Annaud. Added edge labels: stars, about, director.
Linked Data
Web applications execute queries to yield answers
What actor stars in films based on books by Umberto Eco?
Diagram: the query is answered over the edges author, name, about and stars, leading to Sean Connery.
Client, Server
Website: human readable + understandable
Web API: machine readable
Linked Data API: machine understandable
Around 10,000 published Linked Open Datasets.
But not many are directly queryable.
Most datasets require download
Unavailable for at least 1.5 days / month
It is partly an architectural problem with economic repercussions.
Many data publishers are under-resourced, looking for “good-enough” solutions.
Can we enable Web clients to query Linked Data directly,
while lowering infrastructure cost by simplifying Linked Data APIs,
thus making Linked Data publishing more democratic and more sustainable?
How do Web clients
1. query published Linked Data today
2. query a Linked Data API with lower server cost
3. discover and query multiple low-cost Linked Data APIs
4. reproduce query results
Client, Server, Linked Data API
Linked Dataset/Graph
umberto.jpg displays umberto-eco
umberto-eco name “Umberto Eco”
umberto-eco birthplace alessandria
renate.jpg displays renate-ramge
renate-ramge name “Renate Ramge”
renate-ramge married umberto-eco
the-name-of-the-rose name “The name of the rose”
alessandria name “Alessandria”
the-name-of-the-rose author umberto-eco
Triple: Subject Predicate Object
Triple: URI URI URI (or Value)
http://dbpedia.org/resource/Umberto_Eco http://dbpedia.org/ontology/birthPlace http://dbpedia.org/resource/Alessandria
dbr:Umberto_Eco dbo:birthPlace dbr:Alessandria
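As a minimal sketch of the triple structure above, a triple can be held as a three-field record and a prefixed name expanded to its full URI. The `Triple` type and the `expand` helper are hypothetical names chosen for illustration; the two namespaces are the ones shown in the full DBpedia URIs on the slide.

```python
from typing import NamedTuple

# Prefix table; both namespaces appear in the full URIs on the slide.
PREFIXES = {
    "dbr": "http://dbpedia.org/resource/",
    "dbo": "http://dbpedia.org/ontology/",
}


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


def expand(term: str) -> str:
    """Expand a prefixed name such as dbr:Alessandria into a full URI.

    Terms whose prefix is not in the table (for example full URIs) are
    returned unchanged.
    """
    prefix, _, local = term.partition(":")
    return PREFIXES[prefix] + local if prefix in PREFIXES else term


short = Triple("dbr:Umberto_Eco", "dbo:birthPlace", "dbr:Alessandria")
full = Triple(*(expand(term) for term in short))
print(full)
```

The prefixed and the full form denote the same triple; the prefix only shortens the notation.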
Queries over Linked Data are written in SPARQL
What book was written by Umberto Eco?
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?bookname
WHERE {
  ?person dbo:name "Umberto Eco".
  ?book dbo:author ?person;
        dbo:name ?bookname.
}
SELECT ?bookname: “I want to select a value”
?person dbo:name "Umberto Eco": “I’m looking for somebody named ‘Umberto Eco’”
?book dbo:author ?person: “Some book has that somebody as author”
dbo:name ?bookname: “That book must have a name”
Graph pattern
?person name “Umberto Eco”
?book author ?person
?book name ?bookname
Triple pattern: Variable URI Value (or URI)
Evaluating the graph pattern against the Linked Dataset, one triple pattern at a time:
?person name “Umberto Eco”
Four triples have the predicate name (umberto-eco, renate-ramge, the-name-of-the-rose, alessandria); only one also has the value “Umberto Eco”:
umberto-eco name “Umberto Eco”
?person: umberto-eco
?book author ?person becomes ?book author umberto-eco
One triple matches:
the-name-of-the-rose author umberto-eco
?book: the-name-of-the-rose
?book name ?bookname becomes the-name-of-the-rose name ?bookname
One triple matches:
the-name-of-the-rose name “The name of the rose”
?bookname: “The name of the rose”
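The walkthrough above can be sketched as a small recursive matcher. This is an illustrative toy under stated assumptions, not the evaluation algorithm of any particular SPARQL engine: triples are plain tuples, variables are strings starting with `?`, and the function names `match` and `evaluate` are hypothetical.

```python
DATASET = [
    ("umberto.jpg", "displays", "umberto-eco"),
    ("umberto-eco", "name", "Umberto Eco"),
    ("umberto-eco", "birthplace", "alessandria"),
    ("renate.jpg", "displays", "renate-ramge"),
    ("renate-ramge", "name", "Renate Ramge"),
    ("renate-ramge", "married", "umberto-eco"),
    ("the-name-of-the-rose", "name", "The name of the rose"),
    ("alessandria", "name", "Alessandria"),
    ("the-name-of-the-rose", "author", "umberto-eco"),
]


def match(pattern, triple, bindings):
    """Return the extended bindings if the triple matches the pattern, else None."""
    result = dict(bindings)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):
            # A variable either gets bound now or must agree with its earlier binding.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result


def evaluate(patterns, dataset, bindings=None):
    """Yield every set of bindings that satisfies all patterns, taken in order."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for triple in dataset:
        extended = match(first, triple, bindings)
        if extended is not None:
            yield from evaluate(rest, dataset, extended)


GRAPH_PATTERN = [
    ("?person", "name", "Umberto Eco"),
    ("?book", "author", "?person"),
    ("?book", "name", "?bookname"),
]

solutions = list(evaluate(GRAPH_PATTERN, DATASET))
print(solutions)
```

Each recursion level corresponds to one step of the walkthrough: a binding found for one pattern narrows the next pattern before it is matched.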
The same graph pattern can be evaluated in different orders:
?person name “Umberto Eco”
?book author ?person
?book name ?bookname
Order A: 1 + 1 + 1 = 3 operations
Order B: 4 + 1 + 4 = 9 operations
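The 1 and 4 in these sums correspond to how many triples each pattern matches in the example dataset. The sketch below only counts matches per pattern and sorts by that count; how the slide tallies the individual operations of each order is not spelled out, so the code does not try to reproduce the totals. The helper name `count_matches` is hypothetical.

```python
DATASET = [
    ("umberto.jpg", "displays", "umberto-eco"),
    ("umberto-eco", "name", "Umberto Eco"),
    ("umberto-eco", "birthplace", "alessandria"),
    ("renate.jpg", "displays", "renate-ramge"),
    ("renate-ramge", "name", "Renate Ramge"),
    ("renate-ramge", "married", "umberto-eco"),
    ("the-name-of-the-rose", "name", "The name of the rose"),
    ("alessandria", "name", "Alessandria"),
    ("the-name-of-the-rose", "author", "umberto-eco"),
]

PATTERNS = [
    ("?person", "name", "Umberto Eco"),
    ("?book", "author", "?person"),
    ("?book", "name", "?bookname"),
]


def count_matches(pattern, dataset):
    """Count the triples matching a pattern; terms starting with '?' are variables."""
    return sum(
        all(p.startswith("?") or p == t for p, t in zip(pattern, triple))
        for triple in dataset
    )


counts = {pattern: count_matches(pattern, DATASET) for pattern in PATTERNS}
# Most selective pattern first: fewer intermediate results to carry along.
ordered = sorted(PATTERNS, key=lambda pattern: counts[pattern])

for pattern in ordered:
    print(counts[pattern], pattern)
```

Starting with a pattern that has a single match keeps every later step small, while starting with `?book name ?bookname` produces four candidates that mostly lead nowhere.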
How do Web clients
1. query published Linked Data today
2. query a Linked Data API with lower server cost
3. discover and query multiple low-cost Linked Data APIs
4. reproduce query results
Client, Server, Linked Data API
Diagram labels: Client, Request, Network Traffic, Response, Server.
Request: filename. API: Data dump.
Request: URI. API: Linked Data document.
Request: SPARQL Query. API: SPARQL Endpoint. Response: results.
API offered by the server
data dump: low server cost, high client cost, high availability, high network traffic, out-of-date data
Linked Data documents: between the two extremes
SPARQL endpoint: high server cost, low client cost, low availability, low network traffic, live data
Hunting for trade-offs between client & server: Linked Data Fragments
A Linked Data Fragments API offers specific fragments of a Linked Dataset.
Each type of Linked Data Fragment is defined by 3 characteristics
data: What triples does it contain?
metadata: What do we know about it?
controls: How to access more data?
Triple Pattern Fragments: a low-cost API that enables clients to query
Placed on the same axis as data dump, Linked Data documents and SPARQL endpoint, triple pattern fragments offer low server cost, high availability and live data.
data: matches of a triple pattern (in pages)
metadata: total number of matches
controls: access to all other fragments
Example fragment: data (first 100), metadata (total count), controls (other fragments)
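The three parts of a Triple Pattern Fragment can be sketched as a small data structure. This is a rough illustration, not the TPF specification: the `Fragment` class, the `triple_pattern_fragment` function and the URL template string are hypothetical, and only the page size of 100 is taken from the slide.

```python
from dataclasses import dataclass

PAGE_SIZE = 100  # the slide shows "data (first 100)"


@dataclass
class Fragment:
    data: list        # matching triples on this page
    total_count: int  # metadata: total number of matches
    controls: dict    # how to reach other fragments and pages


def triple_pattern_fragment(dataset, subject=None, predicate=None, obj=None, page=1):
    """Build one page of the fragment for a triple pattern (None means variable)."""
    pattern = (subject, predicate, obj)
    matches = [
        triple for triple in dataset
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]
    start = (page - 1) * PAGE_SIZE
    # Hypothetical control: a template a client could fill in for any other pattern.
    controls = {"template": "/fragment{?subject,predicate,object,page}"}
    if start + PAGE_SIZE < len(matches):
        controls["next_page"] = page + 1
    return Fragment(matches[start:start + PAGE_SIZE], len(matches), controls)


# Made-up dataset with 250 book names, enough to need three pages.
dataset = [(f"book-{i}", "name", f"Title {i}") for i in range(250)]
dataset.append(("book-0", "author", "umberto-eco"))

first = triple_pattern_fragment(dataset, predicate="name")
print(len(first.data), first.total_count, first.controls.get("next_page"))
```

The server only filters by one pattern and slices a page, which is what keeps the per-request cost low; the count and the controls let the client plan and continue on its own.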
How clients evaluate SPARQL over Triple Pattern Fragments APIs
Diagram: Client (SPARQL Layer, Fragment Layer, HTTP Layer) and Server (TPF API), exchanging triple pattern fragments.
Give the client a SPARQL query and any fragment URI.
Clients look inside the fragment to see how to access the API.
Clients issue a request to the server for each triple pattern
and use the count metadata to determine in which order.
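The steps above can be sketched with a mock server that answers only single triple patterns, and a client that uses the returned counts to choose what to evaluate next. This is a greedy simplification written for illustration; it ignores paging and HTTP, and `MockTpfServer`, `bind` and `evaluate` are hypothetical names rather than the API of an existing TPF client.

```python
class MockTpfServer:
    """Stand-in for a TPF API: answers one triple pattern per request."""

    def __init__(self, triples):
        self.triples = triples
        self.requests = 0

    def fragment(self, pattern):
        self.requests += 1
        matches = [
            triple for triple in self.triples
            if all(p.startswith("?") or p == v for p, v in zip(pattern, triple))
        ]
        return {"data": matches, "count": len(matches)}


def bind(pattern, bindings):
    """Replace already-bound variables in a pattern by their values."""
    return tuple(bindings.get(term, term) for term in pattern)


def evaluate(server, patterns, bindings=None):
    """Client-side evaluation: always continue with the lowest-count pattern."""
    bindings = bindings or {}
    if not patterns:
        return [bindings]
    fragments = {p: server.fragment(bind(p, bindings)) for p in patterns}
    best = min(patterns, key=lambda p: fragments[p]["count"])
    rest = [p for p in patterns if p != best]
    solutions = []
    for triple in fragments[best]["data"]:
        extended = dict(bindings)
        for term, value in zip(best, triple):
            if term.startswith("?") and term not in bindings:
                extended[term] = value
        solutions.extend(evaluate(server, rest, extended))
    return solutions


server = MockTpfServer([
    ("umberto-eco", "name", "Umberto Eco"),
    ("renate-ramge", "name", "Renate Ramge"),
    ("the-name-of-the-rose", "name", "The name of the rose"),
    ("alessandria", "name", "Alessandria"),
    ("the-name-of-the-rose", "author", "umberto-eco"),
])
query = [
    ("?person", "name", "Umberto Eco"),
    ("?book", "author", "?person"),
    ("?book", "name", "?bookname"),
]
results = evaluate(server, query)
print(results, server.requests)
```

The query logic lives in the client; the server sees only a handful of cheap pattern lookups, which is the trade-off the slides describe.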
Figures cropped from “Querying Datasets on the Web with High Availability”. Systems compared: Virtuoso 6, Virtuoso 7, Fuseki–tdb, Fuseki–hdt, triple pattern fragments.
The query throughput is lower, but resilient to high client numbers.
Fig. 3.1: Server performance (log-log plot): executed SPARQL queries per hour (throughput, q/hr) against the number of clients (1 to 100).
The server uses much less CPU, lowering the cost of server infrastructure.
Fig. 3.5: Server processor usage per core (CPU use in %) against the number of clients (1 to 100).
Fig. 3.3: Query timeouts against the number of clients (1 to 100).
The server traffic is higher, but requests are significantly lighter.
Fig. 3.2: Server network traffic: data sent by server in MB against the number of clients (1 to 100).
For some queries, many requests are of type “is this triple in the dataset?”
Figure: the fraction of membership requests (0% to 100%) for 20 queries: linear (L1 to L5), star (S1 to S7), snowflake-shaped (F1 to F5) and complex (C1 to C3).
metadata: total number of matches + approximate membership filter
Approximate Membership Filter
Diagram: a set of URIs (100 MByte) next to its filter (1 Byte).
“Is this URI in the set?”
“No” or “Maybe.”
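A Bloom filter, one of the two filter types named in the results that follow, illustrates the “No” / “Maybe.” behaviour. The sketch below is a generic textbook Bloom filter, not the implementation evaluated in the thesis; the sizes, the SHA-256 based hashing and the names `BloomFilter` and `answer` are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array; answers 'definitely not in set' or 'possibly in set'."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def answer(bloom, uri):
    """Phrase the filter's verdict the way the slide does."""
    return "Maybe." if bloom.might_contain(uri) else "No"


bloom = BloomFilter()
for uri in ("http://dbpedia.org/resource/Umberto_Eco",
            "http://dbpedia.org/resource/Alessandria"):
    bloom.add(uri)

print(answer(bloom, "http://dbpedia.org/resource/Umberto_Eco"))
print(answer(bloom, "http://dbpedia.org/resource/Sean_Connery"))
```

A “No” is always correct, so the client can skip the membership request; a “Maybe.” can be a false positive, so the client still has to ask the server.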
>50% of the queries have fewer requests, < 20% have more requests.
Figure: Percentage of queries per AMF/query algorithm combination, in the order Original + Bloom, Original + GCS, Optimized + Bloom, Optimized + GCS:
Fewer Requests: 59%, 62%, 49%, 50%
Equal: 35%, 33%, 33%, 32%
More Requests: 6%, 5%, 18%, 17%
No queries have a reduction in execution time; a third even have an increase.
Figure: Percentage of queries per AMF/query algorithm combination, in the same order:
Equal: 84%, 69%, 67%, 62%
Lower Execution time: none
Higher Execution time: 16%, 31%, 33%, 38%
How do Web clients
1. query published Linked Data today
2. query a Linked Data API with lower server cost
3. discover and query multiple low-cost Linked Data APIs
4. reproduce query results
A Web of Linked Data
Diagram: many TPF APIs spread across the Web.
Are low-cost Triple Pattern Fragments APIs a good fit for a sustainable Web of Linked Data?
How to query multiple TPF APIs
How to discover relevant TPF APIs
Fragment mediator
A mediator enables the client to abstract multiple Triple Pattern Fragment APIs
Diagram: Client (SPARQL Layer, Fragment mediator, Fragment Layer A with HTTP Layer A, Fragment Layer B with HTTP Layer B) and two Servers (TPF API over Dataset A, TPF API over Dataset B).
Merge multiple Triple Pattern Fragments as one
Sum the count metadata
Eliminate sources that have no results
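The three mediator duties listed above can be sketched in a few lines. This is a simplified illustration under stated assumptions: sources are in-memory lists instead of remote TPF APIs, the film identifiers are made up, and `fragment` and `mediate` are hypothetical names.

```python
def fragment(dataset, pattern):
    """Answer a triple pattern over one source (None means variable)."""
    matches = [
        triple for triple in dataset
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]
    return {"data": matches, "count": len(matches)}


def mediate(sources, pattern):
    """Present the fragments of several sources as one virtual fragment."""
    per_source = {name: fragment(data, pattern) for name, data in sources.items()}
    # Eliminate sources that have no results for this pattern.
    relevant = {name: f for name, f in per_source.items() if f["count"] > 0}
    return {
        # Merge multiple fragments as one.
        "data": [triple for f in relevant.values() for triple in f["data"]],
        # Sum the count metadata.
        "count": sum(f["count"] for f in relevant.values()),
        "sources": sorted(relevant),
    }


sources = {
    "dataset-a": [
        ("the-name-of-the-rose", "author", "umberto-eco"),
        ("the-name-of-the-rose", "name", "The name of the rose"),
    ],
    "dataset-b": [
        ("the-name-of-the-rose-film", "stars", "sean-connery"),
        ("the-name-of-the-rose-film", "about", "the-name-of-the-rose"),
    ],
}

stars = mediate(sources, (None, "stars", None))
everything = mediate(sources, (None, None, None))
print(stars["count"], stars["sources"], everything["count"])
```

Because the merged result has the same shape as a single fragment, the query layer above it does not need to know how many APIs are behind it.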
Figure: Average Execution time per Query Group in seconds (axis ticks 1, 10, 100). Query groups: LD, CD, LS, C. Systems: Triple Pattern Fragments, ANAPSID, ANAPSID EG, FedX, SPLENDID.
Execution times on a public network are in range of the state of the art on a local network.
Figure: Percentage of Queries per System (0% to 100%). Systems: Triple Pattern Fragments, ANAPSID, ANAPSID EG, FedX (warm), SPLENDID. Legend: 100%, 90 - 100%, 10 - 90%, 0 - 10%, 0%.
Compared to the other systems, more queries retrieve >90% of the results.
Exploit the links in Linked Data to let APIs discover each other and inform the client.
Each Triple Pattern Fragments API creates a summary of the dataset.
Per Predicate, list first part of the Subject and Object URIs.
Keep a sample URI for each external domain
Example for the geonames.org TPF API: http://dbpedia.org, … located in http://geonames.org, …
Sample URI: http://dbpedia.org/resource/Louvre
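A summary of this kind can be sketched as follows. The code is an illustration under assumptions, not the summary format used in the thesis: all terms are taken to be URIs, the predicate URI and the geonames place URI are made-up examples, and `authority` and `summarize` are hypothetical names.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def authority(uri):
    """First part of a URI: scheme and host, for example http://dbpedia.org."""
    parts = urlsplit(uri)
    return f"{parts.scheme}://{parts.netloc}"


def summarize(triples, own_domain):
    """Per predicate, collect subject and object authorities; keep one sample
    URI for every domain other than the API's own."""
    summary = defaultdict(lambda: {"subjects": set(), "objects": set()})
    samples = {}
    for subject, predicate, obj in triples:
        summary[predicate]["subjects"].add(authority(subject))
        summary[predicate]["objects"].add(authority(obj))
        for uri in (subject, obj):
            domain = authority(uri)
            if domain != own_domain:
                samples.setdefault(domain, uri)
    return dict(summary), samples


LOCATED_IN = "http://example.org/locatedIn"  # hypothetical predicate URI
triples = [
    ("http://dbpedia.org/resource/Louvre", LOCATED_IN,
     "http://geonames.org/example-place"),  # hypothetical place URI
]

summary, samples = summarize(triples, own_domain="http://geonames.org")
print(summary[LOCATED_IN], samples)
```

The summary stays small because it stores URI prefixes instead of every URI, and the sample URIs give the API concrete links to follow when it looks for neighbouring APIs.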
Active / Reactive
Diagram: several TPF APIs send each other “Request External URI” messages.
Diagram: a TPF API that receives such a request asks “Where did this request come from?”
Diagram: the TPF APIs end up connected by links (labels: Link, triple pattern).
Figure: Number of needed requests (axis 0 to 800) and discovery process time in minutes (axis 0 to 7). Datasets: DBPedia subset, NY Times, LinkedMDB, Jamendo, Geonames, Semantic Web Dog Food, Drugbank, Kegg-ChEBI.
Figure: Percentage of Queries per Dataset (0% to 100%). Datasets: DBPedia, NYTimes, LinkedMDB, Jamendo, Geonames, SWDF, Drugbank, Kegg-chebi. Legend: 100%, 90 - 100%, 10 - 90%, 0 - 10%, 0%, Unknown.
The number of retrieved results is low and highly depends on what dataset is queried.
Figure: Execution time per Query in milliseconds (logarithmic, 1 to 1,000,000). Series: No discovery, With discovery.
Discovery reduces query time for most, but causes substantial overhead for some.
How do Web clients
1. query published Linked Data today
2. query a Linked Data API with lower server cost
3. discover and query multiple low-cost Linked Data APIs
4. reproduce query results
Diagram: the Book database and Film database graph around Umberto Eco, with both databases labeled 2017.
Diagram: the same graph with the labels 2017 and 2018; Tom Hanks now appears with a name and a stars edge.
What actor stars in films based on books by Umberto Eco?
2017: Sean Connery
2018: Sean Connery, Tom Hanks
Linked Datasets drift & produce different answers later on
Ensuring the reproducibility of query results over Linked Data.
Sustain the validity of claims
Backwards-compatible applications (Version 1.0, Version 2.0)
A pragmatic DBpedia archive can store 14 versions with 12% of the original size.
Figure: Original data size (GB) and Archived size (GB), axis 0 to 160, per DBpedia version: 2.0, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 2014, 2015-04, 2015-10.
Archive’s space (↓50%) and time-to-publish (↓20h / version) significantly decreased for twice the number of triples (6 billion).
Querying a Triple Pattern Fragments API without knowing what versions exist.
Diagram: Client (SPARQL Layer, Fragment Layer, HTTP Layer) and Server (TPF API) holding Dataset in 2017 and Dataset now. Query in 2017: “Dataset in 2017 please”
The Memento Framework enables clients to request URIs in the time dimension.
Multiple Triple Pattern Fragment APIs can be synced to a certain point in time.
Diagram: Client (SPARQL Layer, Fragment mediator, Fragment Layer A with Memento Layer A, Fragment Layer B with Memento Layer B) and two Servers (TPF API over Dataset A, TPF API over Dataset B). Query in 2017: “Dataset A in 2017 please”, “Dataset B in 2017 please”
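The “in 2017 please” request can be sketched with the `Accept-Datetime` header that Memento (RFC 7089) defines for datetime negotiation. The version dates below are made up, and the selection rule (latest version at or before the requested moment) is an assumption for illustration, since the choice of memento is left to the server; `accept_datetime_header` and `select_memento` are hypothetical names.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_header(moment):
    """Build the Accept-Datetime request header used for datetime negotiation."""
    as_utc = moment.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(as_utc, usegmt=True)}


def select_memento(versions, moment):
    """Assumed server-side rule: latest version at or before the requested
    moment, falling back to the oldest version if none is early enough."""
    earlier = [version for version in versions if version <= moment]
    return max(earlier) if earlier else min(versions)


def utc(year, month=1, day=1):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical archives of two datasets behind two TPF APIs.
archives = {
    "dataset-a": [utc(2016), utc(2017), utc(2018)],
    "dataset-b": [utc(2015, 6), utc(2017, 3), utc(2017, 9)],
}

requested = utc(2017, 6)
headers = accept_datetime_header(requested)
# Syncing: the same requested moment is sent to every source.
synced = {name: select_memento(versions, requested)
          for name, versions in archives.items()}
print(headers)
print(synced)
```

Sending the same moment to every API is what lets a query over several sources be repeated later against a consistent snapshot.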
Figure: two timelines, 2008 to 2016.
Multiple sources: “What is the number of awards won by Belgian academics?”
Single source: “What is the number of triples describing professor Jacques-Joseph Haus of Ghent University?”
When interpreting differences between facts, consider why facts change.
How do Web clients
1. query published Linked Data today
2. query a Linked Data API with lower server cost
3. discover and query multiple low-cost Linked Data APIs
4. reproduce query results
Embrace the Web and the diversity in publishers.
Many queries are answered within acceptable time, and the query algorithm can still improve.
Enable clients to be intelligent, not servers.
Triple Pattern Fragments trade bandwidth and time for low and stable CPU usage.
Rethink Web querying.
“Fast” is defined by the application and when it needs the results.
In a public Web setting, other query languages besides SPARQL might be (more) appropriate.
Continue the quest for metadata and interfaces to cover more query use cases.
From physical integration to virtual integration.
Triple Pattern Fragments is competitive as infrastructure for querying multiple APIs.
Publishing archives can ensure reproducibility, but caution is needed when interpreting change.
Lightweight APIs enable more Linked Data publishers with maintained control.
Blur the distinction between querying one or more APIs.
Exploiting Linked Data for API discovery is promising, but clients need to consume links more intelligently.
Selecting relevant sources is an open challenge, which could involve machine learning.
Dedicated discovery hubs that gather metadata will be necessary for scale.
Miel Vander Sande
PhD Student
IDLab - ELIS
E miel.vandersande@ugent.be
www.ugent.be
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Comparative Analysis of Text Summarization Techniques
Comparative Analysis of Text Summarization TechniquesComparative Analysis of Text Summarization Techniques
Comparative Analysis of Text Summarization Techniques
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 

Low-cost Linked Data publishing infrastructures

  • 1. METADATA AND CONTROL FEATURES FOR LOW-COST LINKED DATA PUBLISHING INFRASTRUCTURES Public defense Drs. Miel Vander Sande DEPARTMENT ELIS RESEARCH GROUP IDLAB
  • 2.
  • 5.
  • 7. Web Client Web Server
  • 8. Web Client, Web Server. Request: “Give me http://wikipedia.org/wiki/Umberto_Eco”
  • 9. Web Client, Web Server. Request: “Give me http://wikipedia.org/wiki/Umberto_Eco”. Response: “Here’s the document http://wikipedia.org/wiki/Umberto_Eco”
  • 10. Web Client, Web Server. Request: “Give me http://wikipedia.org/wiki/Umberto_Eco”. Response: “Here’s the document http://wikipedia.org/wiki/Umberto_Eco”. HTTP protocol
  • 11.
  • 12. + 4 billion webpages
  • 13. + 4 billion webpages
  • 14.
  • 16. Wikipedia page of Umberto Eco Umberto Eco Alessandria
  • 17. Wikipedia page of Umberto Eco The Name of The Rose Umberto Eco Renate Ramge Alessandria
  • 18. [Graph] Nodes: Wikipedia page of Umberto Eco, The Name of The Rose, Umberto Eco, Alessandria, Renate Ramge. Edge labels: displays, married, name, author, birthplace
  • 19. [Graph] Nodes: The Name of The Rose, Umberto Eco, Alessandria, Renate Ramge. Edge labels: displays, birthplace, married, name, author
  • 20. [Graph] Nodes: The Name of The Rose, Umberto Eco, Alessandria, Renate Ramge, Sean Connery, Jean-Jacques Annaud. Edge labels: displays, birthplace, married, name, author, stars, about, director
  • 21. [Graph] Film database and Book database. Nodes: The Name of The Rose, Umberto Eco, Alessandria, Renate Ramge, Sean Connery, Jean-Jacques Annaud. Edge labels: displays, birthplace, married, name, author, stars, about, director
  • 22. Linked Data. [Graph] Film database and Book database. Nodes: The Name of The Rose, Umberto Eco, Alessandria, Renate Ramge, Sean Connery, Jean-Jacques Annaud. Edge labels: displays, birthplace, married, name, author, stars, about, director
  • 23. Web applications execute queries to yield answers. What actor stars in films based on books by Umberto Eco?
  • 24. Web applications execute queries to yield answers. What actor stars in films based on books by Umberto Eco? Edge labels used: author, name, about, stars. Answer: Sean Connery
  • 27. Client and Server. Website: human readable + understandable. Web API: machine readable. Linked Data API: machine understandable
  • 28. Around 10,000 published Linked Open Datasets, but not many are directly queryable. Most datasets require download. Unavailable for at least 1.5 days / month
  • 29. It is partly an architectural problem with economic repercussions. Many data publishers are under-resourced, looking for “good-enough” solutions.
  • 30. It is partly an architectural problem with economic repercussions. Many data publishers are under-resourced, looking for “good-enough” solutions.
  • 31. Can we enable Web clients to query Linked Data directly,
  • 32. Can we enable Web clients to query Linked Data directly, while lowering infrastructure cost by simplifying Linked Data APIs,
  • 33. Can we enable Web clients to query Linked Data directly, while lowering infrastructure cost by simplifying Linked Data APIs, thus making Linked Data publishing more democratic and more sustainable?
  • 34. How do Web clients (1) query published Linked Data today, (2) query a Linked Data API with lower server cost, (3) discover and query multiple low-cost Linked Data APIs, (4) reproduce query results?
  • 35. How do Web clients (1) query published Linked Data today, (2) query a Linked Data API with lower server cost, (3) discover and query multiple low-cost Linked Data APIs, (4) reproduce query results?
  • 37. displays married The Name of The Rose Umberto Eco Alessandria Renate Ramge displays name author name name name birthplace
  • 38. umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 39. Linked Dataset/Graph: umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 40. Triple: Subject, Predicate, Object. Dataset: umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 41. Triple: URI, URI, URI (or Value). Example: http://dbpedia.org/resource/Umberto_Eco http://dbpedia.org/ontology/birthPlace http://dbpedia.org/resource/Alessandria. Dataset: umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 42. Triple: URI, URI, URI (or Value). Example: dbr:Umberto_Eco dbo:birthPlace dbr:Alessandria. Dataset: umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 43. Queries over Linked Data are written in SPARQL. What book was written by Umberto Eco? SELECT ?bookname WHERE { ?person dbo:name "Umberto Eco" . ?book dbo:author ?person ; dbo:name ?bookname . }
  • 44. Queries over Linked Data are written in SPARQL. What book was written by Umberto Eco? SELECT ?bookname WHERE { ?person dbo:name "Umberto Eco" . ?book dbo:author ?person ; dbo:name ?bookname . } Annotation: “I want to select a value”
  • 45. Queries over Linked Data are written in SPARQL. What book was written by Umberto Eco? SELECT ?bookname WHERE { ?person dbo:name "Umberto Eco" . ?book dbo:author ?person ; dbo:name ?bookname . } Annotations: “I want to select a value”; “I’m looking for somebody named ‘Umberto Eco’”
  • 46. Queries over Linked Data are written in SPARQL. What book was written by Umberto Eco? SELECT ?bookname WHERE { ?person dbo:name "Umberto Eco" . ?book dbo:author ?person ; dbo:name ?bookname . } Annotations: “I want to select a value”; “I’m looking for somebody named ‘Umberto Eco’”; “Some book has that somebody as author”
  • 47. Queries over Linked Data are written in SPARQL. What book was written by Umberto Eco? SELECT ?bookname WHERE { ?person dbo:name "Umberto Eco" . ?book dbo:author ?person ; dbo:name ?bookname . } Annotations: “I want to select a value”; “I’m looking for somebody named ‘Umberto Eco’”; “Some book has that somebody as author”; “That book must have a name”
  • 48. Graph pattern: ?person name “Umberto Eco”; ?book author ?person; ?book name ?bookname
  • 49. Triple pattern: Variable, URI, Value (or URI). Graph pattern: ?person name “Umberto Eco”; ?book author ?person; ?book name ?bookname
  • 50. Triple pattern: ?person name “Umberto Eco”. Dataset: umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 51. Triple pattern: ?person name “Umberto Eco”. Dataset: umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 52. Triple pattern: ?person name “Umberto Eco”. Dataset: umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 53. Triple pattern: ?person name “Umberto Eco”. Triples: umberto-eco name “Umberto Eco”; renate-ramge name “Renate Ramge”; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”
  • 54. Triple pattern: ?person name “Umberto Eco”. Triples: umberto-eco name “Umberto Eco”
  • 55. Triple pattern: ?person name “Umberto Eco”. Triples: umberto-eco name “Umberto Eco”. Bindings: ?person: umberto-eco
  • 56. Bindings: ?person: umberto-eco. Triple pattern: ?book author ?person. Dataset: umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 57. Bindings: ?person: umberto-eco. Triple pattern: ?book author umberto-eco. Dataset: umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 58. Bindings: ?person: umberto-eco; ?book: the-name-of-the-rose. Triple pattern: ?book author umberto-eco. Triples: the-name-of-the-rose author umberto-eco
  • 59. Bindings: ?person: umberto-eco; ?book: the-name-of-the-rose. Triple pattern: ?book name ?bookname. Dataset: umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 60. Bindings: ?person: umberto-eco; ?book: the-name-of-the-rose. Triple pattern: the-name-of-the-rose name ?bookname. Dataset: umberto.jpg displays umberto-eco; umberto-eco name “Umberto Eco”; umberto-eco birthplace alessandria; renate.jpg displays renate-ramge; renate-ramge name “Renate Ramge”; renate-ramge married umberto-eco; the-name-of-the-rose name “The name of the rose”; alessandria name “Alessandria”; the-name-of-the-rose author umberto-eco
  • 61. Bindings: ?person: umberto-eco; ?book: the-name-of-the-rose; ?bookname: “The name of the rose”. Triple pattern: the-name-of-the-rose name ?bookname. Triples: the-name-of-the-rose name “The name of the rose”
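The walkthrough in slides 50 to 61 can be sketched in a few lines of Python. This is only an illustration under assumptions: triples are plain tuples of strings, any term starting with `?` is treated as a variable, and the function names `extend` and `evaluate` are made up for this sketch. It is a naive nested-loop evaluation, not the algorithm of any particular SPARQL engine.

```python
# Toy dataset from the slides as (subject, predicate, object) tuples.
TRIPLES = [
    ("umberto.jpg", "displays", "umberto-eco"),
    ("umberto-eco", "name", "Umberto Eco"),
    ("umberto-eco", "birthplace", "alessandria"),
    ("renate.jpg", "displays", "renate-ramge"),
    ("renate-ramge", "name", "Renate Ramge"),
    ("renate-ramge", "married", "umberto-eco"),
    ("the-name-of-the-rose", "name", "The name of the rose"),
    ("alessandria", "name", "Alessandria"),
    ("the-name-of-the-rose", "author", "umberto-eco"),
]


def extend(pattern, triple, binding):
    """Return `binding` extended so that `pattern` equals `triple`, else None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):            # variable: bind it or check the binding
            if new.setdefault(term, value) != value:
                return None
        elif term != value:                 # constant: must match exactly
            return None
    return new


def evaluate(patterns, triples):
    """Evaluate the triple patterns one after another, carrying bindings along."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                new = extend(pattern, triple, binding)
                if new is not None:
                    next_bindings.append(new)
        bindings = next_bindings
    return bindings


QUERY = [
    ("?person", "name", "Umberto Eco"),
    ("?book", "author", "?person"),
    ("?book", "name", "?bookname"),
]

results = evaluate(QUERY, TRIPLES)
print([b["?bookname"] for b in results])  # ['The name of the rose']
```

The single result binds `?person`, `?book` and `?bookname` exactly as in the final slide of the walkthrough.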
  • 62. The same three triple patterns (?person name “Umberto Eco”; ?book author ?person; ?book name ?bookname) evaluated in two orders. Order A: 1 + 1 + 1 = 3 operations. Order B: 4 + 1 + 4 = 9 operations
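Why evaluation order matters can be illustrated with a small cost count. The sketch below assumes one possible cost metric, the number of intermediate bindings left after each pattern, and compares the query's written order with a hypothetical order that starts with the least selective pattern. The slide does not define what it counts as an operation or spell out its Order B, so the per-step numbers printed here are not claimed to be the slide's.

```python
TRIPLES = [
    ("umberto.jpg", "displays", "umberto-eco"),
    ("umberto-eco", "name", "Umberto Eco"),
    ("umberto-eco", "birthplace", "alessandria"),
    ("renate.jpg", "displays", "renate-ramge"),
    ("renate-ramge", "name", "Renate Ramge"),
    ("renate-ramge", "married", "umberto-eco"),
    ("the-name-of-the-rose", "name", "The name of the rose"),
    ("alessandria", "name", "Alessandria"),
    ("the-name-of-the-rose", "author", "umberto-eco"),
]


def extend(pattern, triple, binding):
    """Return `binding` extended so that `pattern` equals `triple`, else None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def intermediate_sizes(patterns):
    """Assumed cost metric: number of bindings that survive after each pattern."""
    bindings, sizes = [{}], []
    for pattern in patterns:
        bindings = [
            new
            for binding in bindings
            for triple in TRIPLES
            for new in [extend(pattern, triple, binding)]
            if new is not None
        ]
        sizes.append(len(bindings))
    return sizes


P_NAME = ("?person", "name", "Umberto Eco")
P_AUTHOR = ("?book", "author", "?person")
P_BOOKNAME = ("?book", "name", "?bookname")

selective_first = intermediate_sizes([P_NAME, P_AUTHOR, P_BOOKNAME])
unselective_first = intermediate_sizes([P_BOOKNAME, P_NAME, P_AUTHOR])

print(selective_first, sum(selective_first))      # [1, 1, 1] 3
print(unselective_first, sum(unselective_first))  # [4, 4, 1] 9
```

Starting with the pattern that has the fewest matches keeps every intermediate step small, which is the intuition behind ordering patterns by their match counts.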
  • 63. How do Web clients (1) query published Linked Data today, (2) query a Linked Data API with lower server cost, (3) discover and query multiple low-cost Linked Data APIs, (4) reproduce query results?
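  • 66. Request and response per interface. Data dump: request by filename. Linked Data document: request by URI. SPARQL endpoint: request with a SPARQL query, response with results
  • 67. Request and response between Client and Server, with Network Traffic in between. Data dump: request by filename. Linked Data document: request by URI. SPARQL endpoint: request with a SPARQL query, response with results
  • 68. Request and response between Client and Server, with Network Traffic in between. Data dump: request by filename. Linked Data document: request by URI. SPARQL endpoint: request with a SPARQL query, response with results
  • 69. Request and response between Client and Server, with Network Traffic in between. Data dump: request by filename. Linked Data document: request by URI. SPARQL endpoint: request with a SPARQL query, response with results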
  • 70. Hunting for trade-offs between client & server: Linked Data Fragments. API offered by the server: data dump, Linked Data documents, SPARQL endpoint. Each offers specific fragments of a Linked Dataset. Data dump end of the axis: low server cost, high client cost, high availability, high network traffic, out-of-date data. SPARQL endpoint end of the axis: high server cost, low client cost, low availability, low network traffic, live data
  • 71. Each type of Linked Data Fragment is defined by 3 characteristics. Data: What triples does it contain? Metadata: What do we know about it? Controls: How to access more data?
  • 72. A low-cost API that enables clients to query: Triple Pattern Fragments. Axis: data dump, Linked Data documents, triple pattern fragments, SPARQL endpoint. Properties: low server cost, high availability, live data
  • 73. A low-cost API that enables clients to query: Triple Pattern Fragments. Data: matches of a triple pattern (in pages). Metadata: total number of matches. Controls: access to all other fragments
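The three parts of a Triple Pattern Fragment (data, metadata, controls) can be sketched as a small server-side function. This is a minimal in-memory illustration, not the reference server: the `Fragment` class, the `fragment` function, the base URL `http://example.org/dataset` and the `page` parameter are assumptions made for this sketch. The `subject`, `predicate` and `object` parameter names follow the URI-template style used by public TPF servers, but that too should be checked against the actual API you target.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

# Small sample of the toy dataset from the slides.
TRIPLES = [
    ("umberto-eco", "name", "Umberto Eco"),
    ("renate-ramge", "name", "Renate Ramge"),
    ("the-name-of-the-rose", "name", "The name of the rose"),
    ("alessandria", "name", "Alessandria"),
    ("the-name-of-the-rose", "author", "umberto-eco"),
]


@dataclass
class Fragment:
    data: list         # one page of triples matching the pattern
    total_count: int   # metadata: total number of matches
    controls: dict     # how to reach other fragments and pages


def fragment(triples, s=None, p=None, o=None, page=1, page_size=100,
             base="http://example.org/dataset"):  # hypothetical base URL
    """Build one page of the fragment for the pattern (s, p, o); None is a wildcard."""
    matches = [t for t in triples
               if (s is None or t[0] == s)
               and (p is None or t[1] == p)
               and (o is None or t[2] == o)]
    start = (page - 1) * page_size
    params = {k: v for k, v in
              (("subject", s), ("predicate", p), ("object", o)) if v is not None}
    # The template lets a client build the URL of any other fragment.
    controls = {"template": base + "{?subject,predicate,object}"}
    if start + page_size < len(matches):
        controls["next"] = base + "?" + urlencode({**params, "page": page + 1})
    return Fragment(matches[start:start + page_size], len(matches), controls)


first = fragment(TRIPLES, p="name", page_size=2)
second = fragment(TRIPLES, p="name", page=2, page_size=2)
print(first.total_count, len(first.data), first.controls.get("next"))
```

Because the server only filters by a single pattern and slices a page, each request stays cheap; anything more complex is left to the client.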
  • 74.
  • 76. data (first 100) metadata (total count)
  • 77. data (first 100) controls (other fragments) metadata (total count)
  • 78. How clients evaluate SPARQL over Triple Pattern Fragments APIs. Client: SPARQL layer, fragment layer, HTTP layer. Server: TPF API. Exchanged: triple pattern fragment
  • 79. How clients evaluate SPARQL over Triple Pattern Fragments APIs. Client: SPARQL layer, fragment layer, HTTP layer. Server: TPF API. Exchanged: triple pattern fragment. Give the client a SPARQL query and any fragment URI.
  • 80. How clients evaluate SPARQL over Triple Pattern Fragments APIs. Client: SPARQL layer, fragment layer, HTTP layer. Server: TPF API. Exchanged: triple pattern fragment. Give the client a SPARQL query and any fragment URI. Clients look inside the fragment to see how to access the API.
  • 81. How clients evaluate SPARQL over Triple Pattern Fragments APIs. Client: SPARQL layer, fragment layer, HTTP layer. Server: TPF API. Exchanged: triple pattern fragment. Give the client a SPARQL query and any fragment URI. Clients look inside the fragment to see how to access the API. Clients issue a request to the server for each triple pattern
  • 82. How clients evaluate SPARQL over Triple Pattern Fragments APIs. Client: SPARQL layer, fragment layer, HTTP layer. Server: TPF API. Exchanged: triple pattern fragment. Give the client a SPARQL query and any fragment URI. Clients look inside the fragment to see how to access the API. Clients issue a request to the server for each triple pattern and use the count metadata to determine in which order.
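The client-side steps above (one request per triple pattern, then use the count metadata to decide the order) can be sketched as a greedy recursive procedure. This is a simplified sketch under assumptions: `fetch` stands in for an HTTP request to a TPF API and returns all matches at once instead of pages, and the names `fetch`, `substitute` and `solve` are invented for this example rather than taken from a real client library.

```python
TRIPLES = [
    ("umberto.jpg", "displays", "umberto-eco"),
    ("umberto-eco", "name", "Umberto Eco"),
    ("umberto-eco", "birthplace", "alessandria"),
    ("renate.jpg", "displays", "renate-ramge"),
    ("renate-ramge", "name", "Renate Ramge"),
    ("renate-ramge", "married", "umberto-eco"),
    ("the-name-of-the-rose", "name", "The name of the rose"),
    ("alessandria", "name", "Alessandria"),
    ("the-name-of-the-rose", "author", "umberto-eco"),
]

requests_made = 0


def fetch(pattern):
    """Stand-in for one HTTP request: returns (matching triples, total count)."""
    global requests_made
    requests_made += 1
    matches = [t for t in TRIPLES
               if all(term.startswith("?") or term == value
                      for term, value in zip(pattern, t))]
    return matches, len(matches)


def substitute(pattern, binding):
    """Replace already-bound variables in a pattern by their values."""
    return tuple(binding.get(term, term) for term in pattern)


def solve(patterns, binding=None):
    """Greedy evaluation: always continue with the pattern that has the lowest count."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    # One request per remaining pattern; the count metadata decides the order.
    fetched = [(fetch(substitute(p, binding)), p) for p in patterns]
    (matches, _count), chosen = min(fetched, key=lambda item: item[0][1])
    rest = [p for p in patterns if p is not chosen]
    bound = substitute(chosen, binding)
    for triple in matches:
        new = dict(binding)
        new.update({term: value for term, value in zip(bound, triple)
                    if term.startswith("?")})
        yield from solve(rest, new)


QUERY = [
    ("?person", "name", "Umberto Eco"),
    ("?book", "author", "?person"),
    ("?book", "name", "?bookname"),
]

solutions = list(solve(QUERY))
print(solutions, requests_made)
```

The sketch makes the trade-off visible: the server answers only simple pattern requests, while the client pays with extra requests and does the joining itself.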
  • 83. The query throughput is lower, but resilient to high client numbers. [Fig. 3.1: Server performance (log-log plot): executed SPARQL queries per hour (throughput, q/hr) versus number of clients; series include Virtuoso 6, Fuseki–tdb and triple pattern fragments]
  • 84. The server uses much less CPU, lowering the cost of server infrastructure. [Fig. 3.3: Query timeouts versus number of clients. Fig. 3.5: Server processor usage per core: CPU use (%) versus number of clients]
  • 85. The server traffic is higher, but requests are significantly lighter. [Fig. 3.2: Server network traffic: data sent by server in MB versus number of clients; series: Virtuoso 6, Virtuoso 7, Fuseki–tdb, Fuseki–hdt, triple pattern fragments]
  • 86. For some queries, many requests are of type “is this triple in the dataset?” [Chart: the fraction of membership requests (0% to 100%) for 20 queries: linear (L1 to L5), star (S1 to S7), snowflake-shaped (F1 to F5) and complex (C1 to C3)]
  • 87. Metadata: total number of matches
  • 88. Metadata: total number of matches + approximate membership filter. Approximate Membership Filter over a set of URIs. Sizes shown: 100 MByte, 1 Byte
  • 89. Metadata: total number of matches + approximate membership filter. Approximate Membership Filter over a set of URIs. Sizes shown: 100 MByte, 1 Byte. Question: “Is this URI in the set?”
  • 90. Metadata: total number of matches + approximate membership filter. Approximate Membership Filter over a set of URIs. Sizes shown: 100 MByte, 1 Byte. Question: “Is this URI in the set?” Answer: “No” or “Maybe.”
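The “No” or “Maybe” behaviour of an approximate membership filter can be shown with a minimal Bloom filter, one of the two filter types named on the next slide. The sizes, the number of hash functions and the SHA-256 based hashing below are arbitrary choices for this sketch and say nothing about the parameters used in the actual experiments.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        """False means "No"; True only means "Maybe"."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


uris = [
    "http://dbpedia.org/resource/Umberto_Eco",
    "http://dbpedia.org/resource/Alessandria",
]
amf = BloomFilter()
for uri in uris:
    amf.add(uri)

print(all(amf.might_contain(uri) for uri in uris))  # True
```

A client can test candidate URIs against such a filter locally and skip the membership request whenever the answer is “No”; a “Maybe” still has to be confirmed with the server.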
  • 91. >50% of the queries have fewer requests, <20% have more requests. Percentage of queries per AMF/query algorithm combination (fewer requests / equal / more requests). Original + Bloom: 59% / 35% / 6%. Original + GCS: 62% / 33% / 5%. Optimized + Bloom: 49% / 33% / 18%. Optimized + GCS: 50% / 32% / 17%
  • 92. No queries have a reduction in execution time; a third even have an increase. Percentage of queries per AMF/query algorithm combination (lower execution time / equal / higher execution time). Original + Bloom: 0% / 84% / 16%. Original + GCS: 0% / 69% / 31%. Optimized + Bloom: 0% / 67% / 33%. Optimized + GCS: 0% / 62% / 38%
  • 93. How do Web clients (1) query published Linked Data today, (2) query a Linked Data API with lower server cost, (3) discover and query multiple low-cost Linked Data APIs, (4) reproduce query results?
  • 94.
  • 95. A Web of Linked Data
  • 96. A Web of Linked Data: TPF API, TPF API, TPF API, TPF API, TPF API, TPF API, TPF API
  • 97. Are low-cost Triple Pattern Fragments APIs a good fit for a sustainable Web of Linked Data?
  • 98. Are low-cost Triple Pattern Fragments APIs a good fit for a sustainable Web of Linked Data? How to query multiple TPF APIs
  • 99. Are low-cost Triple Pattern Fragments APIs a good fit for a sustainable Web of Linked Data? How to query multiple TPF APIs. How to discover relevant TPF APIs
  • 100. A mediator enables the client to abstract multiple Triple Pattern Fragment APIs. Client: SPARQL layer, fragment mediator, fragment layers A and B, HTTP layers A and B. Server: one TPF API for Dataset A and one for Dataset B. The fragment mediator: merges multiple Triple Pattern Fragments as one; sums the count metadata; eliminates sources that have no results
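The three mediator tasks (merge fragments, sum the counts, drop sources without results) can be sketched as one function over several sources. This is an illustration under assumptions: each source is modelled as a Python callable returning `(triples, count)` for a pattern, the two toy datasets loosely mirror the book and film databases from the earlier slides, and the names `make_source` and `mediate` are hypothetical.

```python
BOOK_DB = [
    ("the-name-of-the-rose", "author", "umberto-eco"),
    ("umberto-eco", "name", "Umberto Eco"),
]
FILM_DB = [
    ("the-name-of-the-rose-film", "stars", "sean-connery"),
    ("sean-connery", "name", "Sean Connery"),
]


def make_source(triples):
    """Wrap a triple list as a stand-in for one TPF API."""
    def fetch(pattern):
        matches = [t for t in triples
                   if all(term.startswith("?") or term == value
                          for term, value in zip(pattern, t))]
        return matches, len(matches)
    return fetch


def mediate(pattern, sources):
    """Present several fragments for the same pattern as a single fragment."""
    merged, total, used = [], 0, []
    for name, fetch in sources.items():
        triples, count = fetch(pattern)
        if count == 0:
            continue              # eliminate sources that have no results
        merged.extend(triples)    # merge multiple fragments as one
        total += count            # sum the count metadata
        used.append(name)
    return merged, total, used


SOURCES = {"book database": make_source(BOOK_DB),
           "film database": make_source(FILM_DB)}

names = mediate(("?s", "name", "?o"), SOURCES)
stars = mediate(("?film", "stars", "?actor"), SOURCES)
print(names[1], names[2])  # 2 ['book database', 'film database']
print(stars[1], stars[2])  # 1 ['film database']
```

Because the merged result has the same shape as a single fragment, the SPARQL layer above it does not need to know how many APIs are involved.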
  • 101. Execution times on a public network are in range of the state of the art on a local network. [Chart: average execution time per query group in seconds (axis ticks 1, 10, 100); query groups LD, CD, LS, C; systems: Triple Pattern Fragments, ANAPSID, ANAPSID EG, FedX, SPLENDID]
  • 102. Compared to the other systems, more queries retrieve >90% of the results. [Chart: percentage of queries per system, by share of results retrieved (100%, 90 - 100%, 10 - 90%, 0 - 10%, 0%); systems: Triple Pattern Fragments, ANAPSID, ANAPSID EG, FedX (warm), SPLENDID]
  • 103. TPF API, TPF API, TPF API, TPF API, TPF API, TPF API, TPF API
  • 104. TPF API, TPF API, TPF API, TPF API, TPF API, TPF API, TPF API. Exploit the links in Linked Data to let APIs discover each other and inform the client.
  • 105. TPF API (geonames.org). Each Triple Pattern Fragments API creates a summary of the dataset.
  • 106. TPF API (geonames.org). Each Triple Pattern Fragments API creates a summary of the dataset. Per Predicate, list the first part of the Subject and Object URIs. Example entry: subjects http://dbpedia.org, …; predicate located in; objects http://geonames.org, …
  • 107. TPF API (geonames.org). Each Triple Pattern Fragments API creates a summary of the dataset. Per Predicate, list the first part of the Subject and Object URIs. Keep a sample URI for each external domain. Example entry: subjects http://dbpedia.org, …; predicate located in; objects http://geonames.org, …; sample URI http://dbpedia.org/resource/Louvre
  • 108. TPF API, TPF API, TPF API, TPF API, TPF API
  • 109. Active / Reactive. TPF API, TPF API, TPF API, TPF API, TPF API
  • 110. Active / Reactive. TPF API, TPF API, TPF API, TPF API, TPF API. Request External URI, Request External URI, Request External URI, Request External URI
  • 111. Active / Reactive. TPF API, TPF API, TPF API, TPF API, TPF API. Request External URI, Request External URI, Request External URI, Request External URI
  • 112. Active / Reactive. TPF API, TPF API, TPF API, TPF API, TPF API
  • 113. Active / Reactive. TPF API, TPF API, TPF API, TPF API, TPF API. Request External URI. Where did this request come from?
  • 114. Active / Reactive. TPF API, TPF API, TPF API, TPF API, TPF API. Request External URI
  • 115. Active / Reactive. TPF API, TPF API, TPF API, TPF API, TPF API. Request External URI
  • 116. TPF API, TPF API, TPF API, TPF API, TPF API. Triple pattern
  • 117. TPF API, TPF API, TPF API, TPF API, TPF API. Link, Link. Triple pattern
  • 118. [Charts per dataset (DBPedia subset, NY Times, LinkedMDB, Jamendo, Geonames, Semantic Web Dog Food, Drugbank, Kegg-ChEBI): number of needed requests (axis 0 to 800) and discovery process time in minutes (axis 0 to 7)]
  • 119. The number of retrieved results is low and highly depends on what dataset is queried. [Chart: percentage of queries per dataset (DBPedia, NYTimes, LinkedMDB, Jamendo, Geonames, SWDF, Drugbank, Kegg-chebi), by share of results retrieved (100%, 90 - 100%, 10 - 90%, 0 - 10%, 0%, Unknown)]
  • 120. Discovery reduces query time for most, but causes substantial overhead for some. [Chart: execution time per query in milliseconds (logarithmic, 1 to 1,000,000), no discovery versus with discovery]
  • 121. How do Web clients (1) query published Linked Data today, (2) query a Linked Data API with lower server cost, (3) discover and query multiple low-cost Linked Data APIs, (4) reproduce query results?
  • 122. [Graph] Film database and Book database. Nodes: The Name of The Rose, Umberto Eco, Alessandria, Renate Ramge, Sean Connery, Jean-Jacques Annaud. Edge labels: displays, birthplace, married, name, author, about, director, stars
  • 123. [Graph] Film database and Book database, labelled 2017 and 2017. Nodes: The Name of The Rose, Umberto Eco, Alessandria, Renate Ramge, Sean Connery, Jean-Jacques Annaud. Edge labels: displays, birthplace, married, name, author, about, director, stars
  • 124. [Graph] Film database and Book database, labelled 2017 and 2018. Nodes: The Name of The Rose, Umberto Eco, Alessandria, Renate Ramge, Sean Connery, Jean-Jacques Annaud, Tom Hanks. Edge labels: displays, birthplace, married, name, author, about, director, stars
  • 125. Linked Datasets drift & produce different answers later on. What actor stars in films based on books by Umberto Eco? Edge labels used: author, name, about, stars. 2017: Sean Connery
  • 126. Linked Datasets drift & produce different answers later on. What actor stars in films based on books by Umberto Eco? Edge labels used: author, name, about, stars. 2018: Sean Connery, Tom Hanks
  • 127. Ensuring the reproducibility of query results over Linked Data. Sustain the validity of claims. Backwards-compatible applications (Version 1.0, Version 2.0)
  • 128. A pragmatic DBpedia archive can store 14 versions with 12% of the original size. Archive’s space (↓50%) and time-to-publish (↓20h / version) significantly decreased for twice the number of triples (6 billion). [Chart: original data size (GB) versus archived size (GB), axis 0 to 160, for DBpedia versions 2.0, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 2014, 2015-04, 2015-10]
  • 129. Querying a Triple Pattern Fragments API without knowing what versions exist. Client: SPARQL layer, fragment layer, HTTP layer. Server: TPF API with Dataset in 2017 and Dataset now. Query in 2017; request: “Dataset in 2017 please”
  • 130. The Memento Framework enables clients to 
 request URIs in the time dimension.
  • 131. The Memento Framework enables clients to 
 request URIs in the time dimension.
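In the Memento framework (RFC 7089), a client expresses “this URI as it was at time T” with the `Accept-Datetime` request header, and the server's response carries a `Memento-Datetime` header. The sketch below only builds the request header; the helper name `memento_request_headers` is made up for this example, and no request is sent because no concrete archive endpoint is named in the slides.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def memento_request_headers(moment):
    """Build the datetime-negotiation header for a timezone-aware `moment`."""
    # Accept-Datetime uses the HTTP date format, expressed in GMT.
    as_utc = moment.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(as_utc, usegmt=True)}


headers = memento_request_headers(datetime(2017, 1, 1, tzinfo=timezone.utc))
print(headers)  # {'Accept-Datetime': 'Sun, 01 Jan 2017 00:00:00 GMT'}
```

A client would attach this header to its ordinary fragment request, so the “Dataset in 2017 please” message from the earlier slide becomes plain HTTP content negotiation in the time dimension.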
  • 132. Multiple Triple Pattern Fragment APIs can be synced to a certain point in time. Client: SPARQL layer, fragment mediator, fragment layers A and B, Memento layers A and B. Server: one TPF API for Dataset A and one for Dataset B. Query in 2017; requests: “Dataset A in 2017 please” and “Dataset B in 2017 please”
  • 133. [Two timelines, 2008 to 2016] Multiple sources: “What is the number of awards won by Belgian academics?” Single source: “What is the number of triples describing professor Jacques-Joseph Haus of Ghent University?”
  • 134. When interpreting differences between facts, consider why facts change. [Two timelines, 2008 to 2016] Multiple sources: “What is the number of awards won by Belgian academics?” Single source: “What is the number of triples describing professor Jacques-Joseph Haus of Ghent University?”
  • 135. How do Web clients (1) query published Linked Data today, (2) query a Linked Data API with lower server cost, (3) discover and query multiple low-cost Linked Data APIs, (4) reproduce query results?
  • 136. Embrace the Web and the diversity in publishers. Many queries are answered within acceptable time, and the query algorithm can still improve. Enable clients to be intelligent, not servers. Triple Pattern Fragments trade bandwidth and time for low and stable CPU usage.
  • 137. Rethink Web querying. “Fast” is defined by the application and when it needs the results. In a public Web setting, other query languages besides SPARQL might be (more) appropriate. Continue the quest for metadata and interfaces to cover more query use cases.
  • 138. From physical integration to virtual integration. Triple Pattern Fragments is competitive as infrastructure for querying multiple APIs. Publishing archives can ensure reproducibility, but caution is needed when interpreting change. Lightweight APIs enable more Linked Data publishers with maintained control.
  • 139. Blur the distinction between querying one or more APIs. Exploiting Linked Data for API discovery is promising, but clients need to consume links more intelligently. Selecting relevant sources is an open challenge, which could involve machine learning. Dedicated discovery hubs that gather metadata will be necessary for scale.
  • 140. Miel Vander Sande, PhD Student, IDLab - ELIS. E miel.vandersande@ugent.be. www.ugent.be