RDF Completeness Statements for Query Answering

Completeness Statements about RDF Data Sources
and Their Use for Query Answering
Fariz Darari
joint work with Werner Nutt, Giuseppe Pirrò, and Simon Razniewski
KRDB, Free University of Bozen-Bolzano, Italy

Context

Thousands of RDF data sources are today available on the Web.
Machine-readable qualitative descriptions of their content are crucial.
We focus on data completeness, an important aspect of data quality.

Problem

How to formalize and express completeness information about RDF data sources in a machine-readable way?
How to leverage such completeness information?

Contributions

1. Formal framework for expressing completeness information.
2. Study of query completeness from completeness information in various settings.

Completeness statement on the Web
Completeness statement on the Semantic Web
lv:lmdbdataset rdf:type void:Dataset.
lv:lmdbdataset c:hasComplStmt lv:st1.
lv:st1 c:hasPattern
[c:subject[spin:varName "m"]; c:predicate schema:actor; c:object[spin:varName "a"]].
lv:st1 c:hasCondition
[c:subject [spin:varName "m"]; c:predicate rdf:type; c:object schema:Movie].
lv:st1 c:hasCondition
[c:subject [spin:varName "m"]; c:predicate schema:director; c:object dbp:Tarantino].

Semantics of completeness statements
For each completeness statement, all the triple patterns defined
via hasPattern are collected into a set P1 and all the triple patterns defined
via hasCondition are collected into a set P2. A completeness statement is
interpreted as: CONSTRUCT {P1} WHERE {P1 . P2}
When a data source has a completeness statement (defined via
hasComplStmt), it means that if the query above is evaluated over
an “ideal” graph then all the results are in the data source.
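
As a minimal sketch of this interpretation, the snippet below renders a completeness statement as the corresponding CONSTRUCT query string. The function name `to_construct` and the representation of triple patterns as plain tuples of strings are assumptions made for illustration; they are not part of the framework or of CoRner.

```python
def to_construct(patterns, conditions):
    """Render a completeness statement as a SPARQL CONSTRUCT query string.

    patterns   -- triple patterns given via hasPattern (the set P1)
    conditions -- triple patterns given via hasCondition (the set P2)
    """
    def fmt(triples):
        # Join the terms of each triple with spaces, and triples with " . "
        return " . ".join(" ".join(t) for t in triples)

    where = fmt(list(patterns) + list(conditions))
    return f"CONSTRUCT {{ {fmt(patterns)} }} WHERE {{ {where} }}"


# The statement lv:st1 shown above: actors of movies directed by Tarantino.
p1 = [("?m", "schema:actor", "?a")]
p2 = [("?m", "rdf:type", "schema:Movie"),
      ("?m", "schema:director", "dbp:Tarantino")]

query = to_construct(p1, p2)
print(query)
```

Evaluating the resulting query over the "ideal" graph and finding every result in the data source is what it means for the statement to hold.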

Users visiting this source can prefer it
to other sources.

Checking query completeness
Given a query Q and a data source with completeness statements S:
1. Create a template answer graph GQ of Q.
2. Over GQ, evaluate all CONSTRUCT queries derived from S.
3. Check whether GQ can be obtained after the evaluation.
If yes, the query is complete; otherwise, it might be incomplete.
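
The three steps can be sketched in a few lines of Python for basic graph pattern queries. This is an illustrative sketch under stated assumptions, not the CoRner implementation: triple patterns are tuples of strings, variables start with "?", the template answer graph is built by replacing each variable with a fresh constant (the hypothetical "frozen:" prefix), and all helper names are invented here.

```python
def is_var(term):
    return term.startswith("?")


def match(pattern, triple, binding):
    """Extend binding so that pattern maps onto triple; None if impossible."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def evaluate(patterns, graph):
    """All variable bindings under which every pattern occurs in graph."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in graph:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


def construct(statement, graph):
    """Evaluate CONSTRUCT {P1} WHERE {P1 . P2} for one statement (P1, P2)."""
    head, conditions = statement
    result = set()
    for binding in evaluate(head + conditions, graph):
        for pattern in head:
            result.add(tuple(binding.get(term, term) for term in pattern))
    return result


def is_complete(query, statements):
    # Step 1: template answer graph GQ (variables become fresh constants).
    template = {tuple("frozen:" + t[1:] if is_var(t) else t for t in pattern)
                for pattern in query}
    # Step 2: evaluate every CONSTRUCT query derived from S over GQ.
    derived = set()
    for statement in statements:
        derived |= construct(statement, template)
    # Step 3: the query is complete if GQ is obtained again.
    return template <= derived


movie = ("?m", "rdf:type", "schema:Movie")
directed = ("?m", "schema:director", "dbp:Tarantino")
Q = [movie, directed, ("?m", "schema:actor", "dbp:Tarantino")]

movies_stmt = ([movie, directed], [])                             # Tarantino's movies
actors_stmt = ([("?m", "schema:actor", "?a")], [movie, directed])  # and their actors

print(is_complete(Q, [movies_stmt]))               # movies only
print(is_complete(Q, [movies_stmt, actors_stmt]))  # movies and their actors
```

With only the statement about Tarantino's movies, the actor triple of GQ cannot be reconstructed, so the check fails; adding the statement about the actors makes it succeed.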

However, a completeness statement such as "verified as complete"
is only human-readable!

Query completeness in a single data source scenario
@prefix c: <http://inf.unibz.it/ontologies/completeness#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix spin: <http://spinrdf.org/sp#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix dv: <http://dbpedia.org/void/> .
@prefix lv: <http://linkedmdb.org/void/> .
@prefix dbp: <http://dbpedia.org/resource/> .
@prefix schema: <http://schema.org/> .
dv:dbpdataset rdf:type void:Dataset.
dv:dbpdataset c:hasComplStmt dv:st1.
dv:st1 c:hasPattern
[c:subject [spin:varName "m"]; c:predicate rdf:type; c:object schema:Movie].
dv:st1 c:hasPattern
[c:subject [spin:varName "m"]; c:predicate schema:director; c:object dbp:Tarantino].

Endpoint IRI
DBPe

lv:lmdbdataset rdf:type void:Dataset;
c:hasComplStmt lv:st1, lv:st2.
lv:st1 c:hasPattern
[c:subject [spin:varName "m"]; c:predicate rdf:type; c:object schema:Movie].
lv:st1 c:hasPattern
[c:subject [spin:varName "m"]; c:predicate schema:director; c:object dbp:Tarantino].
lv:st2 c:hasPattern
[c:subject [spin:varName "m"]; c:predicate schema:actor; c:object [spin:varName "a"]].
lv:st2 c:hasCondition
[c:subject [spin:varName "m"]; c:predicate rdf:type; c:object schema:Movie].
lv:st2 c:hasCondition
[c:subject [spin:varName "m"]; c:predicate schema:director; c:object dbp:Tarantino].

Endpoint IRI
LMDBe

Q: Select all the movies for which
Tarantino is the director and also an actor

SELECT ?m
WHERE {?m rdf:type schema:Movie.
?m schema:director dbp:Tarantino.
?m schema:actor dbp:Tarantino}

SPARQL endpoint DBPe:
DBpedia is complete for all Tarantino's movies.
The answer is possibly incomplete.

SPARQL endpoint LMDBe:
LMDB is complete for all Tarantino's movies and also their actors.
The answer is complete.

Extensions
SPARQL queries with OPT
Completeness with RDFS inference
Federated query completeness

Work In Progress
SPARQL queries with negations and comparisons

Live, Web-based CoRner
Empirical evaluation of query completeness checking

Why is DBpedia not complete for the query?
The completeness statement in DBpedia says that it is complete
for Tarantino’s movies (dv:st1). However, the query asks for all
movies for which Tarantino is the director and also an actor.
It is not stated that DBpedia includes all the actors of
Tarantino’s movies. Therefore, DBpedia is possibly not complete
for this query.

Why is LinkedMDB complete?
The completeness statements in LMDB say that it is complete
for Tarantino’s movies (lv:st1) and also for their actors (lv:st2).

Implementation

CoRner:
Completeness Reasoner
http://rdfcorner.wordpress.com
