RDF Analysis with
Virtuoso

Thursday, 19 December 2013
Overview
Triple Store Benchmarking
Virtuoso
Virtuoso Connection
RDF/OO Mapper

Triple store
Benchmarking
How to...
BSBM (Berlin SPARQL Benchmark)
compares the performance of RDF and
Named Graph stores, as well as RDF-mapped relational databases

Lehigh University Benchmark (LUBM)
facilitates the evaluation of
Semantic Web repositories in a
standard way

Benchmark (1/5)
This analysis draws on an April 2013 BSBM experiment in which the
Berlin SPARQL Benchmark version 3.1 was used to measure the
performance of:
BigData (rev. 6528)
BigOwlim (v. 5.2)
TDB (v. 0.9.4)
Virtuoso6 (ver. 6.04)
Virtuoso7 (ver. 7.0)

Load times for the systems under test (SUTs), hh:mm:ss

SUT        10M       100M      200M      1B
BigData    00:02:39  00:25:35  00:59:25  -
BigOwlim   00:02:31  00:22:47  00:47:19  04:09:39
TDB        00:09:41  01:37:55  03:34:59
Virtuoso6  00:07:06  00:19:26  00:31:30  01:10:30
Virtuoso7  -         00:03:39  -         00:27:11
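
As a rough reading aid for the load times above, the sketch below converts an hh:mm:ss value into seconds and derives an average load rate. It assumes that the column labels (10M, 100M, 200M, 1B) denote the number of triples in each dataset; the function names are illustrative and are not part of BSBM.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'hh:mm:ss' string (fields may lack zero padding) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def triples_per_second(triples: int, hms: str) -> float:
    """Average load rate, assuming `triples` is the dataset size in triples."""
    return triples / to_seconds(hms)


# Virtuoso7 loading the 1B dataset in 00:27:11 (value taken from the table).
elapsed = to_seconds("00:27:11")  # 27 * 60 + 11 = 1631 seconds
rate = triples_per_second(1_000_000_000, "00:27:11")
print(elapsed, round(rate))
```

An average rate hides warm-up and indexing phases, so it is only useful for a coarse comparison between stores at the same dataset size.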
Benchmark (2/5)
The tables below summarize the query throughput
for each type of query over all 500 runs (in QpS)
Benchmark Query results: QpS (Queries per Second)
BigData

BigOwlim

TDB

100M

200M

100M

Query 1 49.955
Query 2 42.769
Query 3 37.280

49.520

Query 1 93.773 65.385
Query 2 115.960 65.158
Query 3 170.242 61.155

43.713
38.355

200M

Query 1 232.234 217.865
Query 2 109.445 110.019
Query 3 180.245 174.216


100M

200M

Query 1 119.048 94.877
Query 2 158.755 151.883
Query 3 84.660 70.492

Virtuoso7

Virtuoso6
100M

200M

100M

1B

Query 1 125.786 75.324
Query 2 68.929 68.820
Query 3 117.426 62.243
Benchmark (3/5)
Query 1
Find products for a given set of generic features

Query 2
Retrieve basic information about a specific product for
display purposes

Query 3
Find products having some specific features and not
having one feature
Benchmark (4/5)
Query 1

PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT DISTINCT ?product ?label
WHERE {
  ?product rdfs:label ?label .
  ?product a %ProductType% .
  ?product bsbm:productFeature %ProductFeature1% .
  ?product bsbm:productFeature %ProductFeature2% .
  ?product bsbm:productPropertyNumeric1 ?value1 .
  FILTER (?value1 > %x%)
}
ORDER BY ?label LIMIT 10

Query 2

PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT ?label ?comment ?producer ?productFeature ?propertyTextual1 ?propertyTextual2 ?propertyTextual3
 ?propertyNumeric1 ?propertyNumeric2 ?propertyTextual4 ?propertyTextual5 ?propertyNumeric4
WHERE {
  %ProductXYZ% rdfs:label ?label .
  %ProductXYZ% rdfs:comment ?comment .
  %ProductXYZ% bsbm:producer ?p .
  ?p rdfs:label ?producer .
  %ProductXYZ% dc:publisher ?p .
  %ProductXYZ% bsbm:productFeature ?f .
  ?f rdfs:label ?productFeature .
  %ProductXYZ% bsbm:productPropertyTextual1 ?propertyTextual1 .
  %ProductXYZ% bsbm:productPropertyTextual2 ?propertyTextual2 .
  %ProductXYZ% bsbm:productPropertyTextual3 ?propertyTextual3 .
  %ProductXYZ% bsbm:productPropertyNumeric1 ?propertyNumeric1 .
  %ProductXYZ% bsbm:productPropertyNumeric2 ?propertyNumeric2 .
  OPTIONAL { %ProductXYZ% bsbm:productPropertyTextual4 ?propertyTextual4 }
  OPTIONAL { %ProductXYZ% bsbm:productPropertyTextual5 ?propertyTextual5 }
  OPTIONAL { %ProductXYZ% bsbm:productPropertyNumeric4 ?propertyNumeric4 }
}

Query 3

PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?product ?label
WHERE {
  ?product rdfs:label ?label .
  ?product a %ProductType% .
  ?product bsbm:productFeature %ProductFeature1% .
  ?product bsbm:productPropertyNumeric1 ?p1 .
  FILTER ( ?p1 > %x% )
  ?product bsbm:productPropertyNumeric3 ?p3 .
  FILTER (?p3 < %y% )
  OPTIONAL {
    ?product bsbm:productFeature %ProductFeature2% . ?product rdfs:label ?testVar }
  FILTER (!bound(?testVar))
} ORDER BY ?label LIMIT 10
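
The %Name% tokens in the queries above are placeholders that have to be replaced with concrete values before a query can be sent. The sketch below shows one way to do that substitution for Query 1. It is not the BSBM test driver; the helper name `fill_template` and the parameter values are hypothetical and chosen only for illustration.

```python
import re

# Query 1 as a template, with BSBM-style %Name% placeholders.
QUERY_1_TEMPLATE = """\
PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT DISTINCT ?product ?label
WHERE {
  ?product rdfs:label ?label .
  ?product a %ProductType% .
  ?product bsbm:productFeature %ProductFeature1% .
  ?product bsbm:productFeature %ProductFeature2% .
  ?product bsbm:productPropertyNumeric1 ?value1 .
  FILTER (?value1 > %x%)
}
ORDER BY ?label LIMIT 10
"""


def fill_template(template: str, params: dict) -> str:
    """Replace every %Name% placeholder; fail loudly if a value is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"no value supplied for placeholder %{name}%")
        return str(params[name])

    return re.sub(r"%(\w+)%", substitute, template)


# Hypothetical parameter values, not taken from a generated BSBM dataset.
params = {
    "ProductType": "bsbm-inst:ProductType1",
    "ProductFeature1": "bsbm-inst:ProductFeature1",
    "ProductFeature2": "bsbm-inst:ProductFeature2",
    "x": 100,
}
query = fill_template(QUERY_1_TEMPLATE, params)
print(query)
```

Raising an error on a missing parameter avoids sending a query that still contains a literal placeholder, which the store would reject as a syntax error.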
Benchmark (5/5)
Hardware Configuration
Processors: 2 x Intel(R) Xeon(R) CPU E5-2650, 2.00GHz (8 cores and
hyperthreading), Sandy Bridge architecture
Memory: 256GB
Hard Disks: 3 x 1.8TB (7,200 rpm) SATA in RAID 0 (180MB/s sequential
throughput)
Software Configuration
Operating System: Linux version 3.3.4-3.fc16.x86_64
Filesystem: ext4

Java Version and JVM: Version 1.6.0_31, 64-Bit Server VM (build 20.6-b01)
BSBM generator and test driver version: bibm-0.7.8

Conclusions about 1st step
After the benchmarking phase, Virtuoso was chosen
as the platform because of its:
short loading times
high query throughput

Virtuoso
Starting with Virtuoso
Starting point
Installing Virtuoso
Getting Started

Starting point
Linux CentOS 6.4
run “sudo yum install gcc gmake autoconf automake libtool flex
bison gperf gawk m4 make openssl-devel readline-devel wget”
to install the build dependencies
It may be wise to open port 8890/tcp in the firewall
configuration to allow external access to Virtuoso's web-based interfaces, such as the Conductor
run “yum update” to refresh the package metadata and update the
installed packages

Installing Virtuoso
Download Virtuoso from SourceForge and unpack it with:
“tar xvpfz virtuoso-opensource-6.1.7.tar.gz”
A simple configuration is:
[user@centos virtuoso-opensource-6.1.7]$ ./configure --prefix=/usr/local/ --with-readline
The prefix /usr/local in the above command sets the base directory for
Virtuoso. The installation creates the following structure:
/usr/local/lib/: various libraries for Jena, Sesame, JDBC and hosting;
/usr/local/bin/: where the main executables (virtuoso-t, isql) live;
/usr/local/share/virtuoso/vad/: used to store VAD archives;
/usr/local/share/virtuoso/doc/: local offline documentation;
/usr/local/var/lib/virtuoso/db/: default location for a Virtuoso instance;
/usr/local/var/lib/virtuoso/vsp/: various VSP scripts.

Build and install with:
[user@centos virtuoso-opensource-6.1.7]$ nice make
[user@centos virtuoso-opensource-6.1.7]$ sudo make install
Getting Started (1/2)
Back up the virtuoso.ini file before editing it, so that
erroneous changes can be reverted
run “cd /usr/local/var/lib/virtuoso/db/” and “virtuoso-t -df”
to start the server
you can access the Conductor at
“http://localhost:8890/conductor/”
Two system users are available:
dba - the relational data and administrative account
dav - the WebDAV administrative account

Getting Started (2/2)
Conductor
Helps you to manage users, automate backups, install
VAD packages, execute SQL commands in a web-based
iSQL tool, configure the RDF Sponger, and more

SQL/ODBC Listener
Virtuoso provides a listener on port 1111/tcp. You can
connect directly to it and execute SQL statements with
the isql tool

Resource Usage
The defaults with Virtuoso Open-Source give:
• 160 MB process size in memory
• about 29 MB database
• total 237 MB footprint on disk
There are 20 threads for db and/or web-server use

Virtuoso Connection
Connections used
RESTful services
Jena Provider
Sesame Provider

REST (1/4)
HTTP PUT

Download an example dataset (e.g. geo_coordinates_en_uris_it.ttl from
DBpedia)
Load the sample data into a named graph identified by
<urn:graph:update:test:put>
> curl --digest --user dba:dba --verbose --url "http://localhost:8890/sparql-graph-crud-auth?graph-uri=urn:graph:update:test:put" -T /root/Desktop/Dataset/geo_coordinates_en_uris_it.ttl

Query the graph data:
SELECT *
FROM <urn:graph:update:test:put>
WHERE {?s ?p ?o}
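
The curl upload above can also be prepared from a script. The following is a minimal sketch using only the Python standard library; it assumes the same host, port, endpoint path and `dba` credentials as the slide, and it assumes that `text/turtle` is an acceptable content type for the `.ttl` file. The function names are illustrative.

```python
import urllib.parse
import urllib.request


def graph_crud_url(graph_uri, base="http://localhost:8890", authenticated=True):
    """Build the graph CRUD endpoint URL for one named graph."""
    path = "/sparql-graph-crud-auth" if authenticated else "/sparql-graph-crud"
    # Keep ':' unescaped so the URL matches the form used with curl.
    query = urllib.parse.urlencode({"graph-uri": graph_uri}, safe=":")
    return f"{base}{path}?{query}"


def build_put_request(graph_uri, turtle_bytes):
    """Prepare (but do not send) an HTTP PUT that replaces the graph content."""
    return urllib.request.Request(
        graph_crud_url(graph_uri),
        data=turtle_bytes,
        method="PUT",
        headers={"Content-Type": "text/turtle"},  # assumed content type
    )


def digest_opener(url, user, password):
    """Opener that answers HTTP digest challenges, like `curl --digest`."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, url, user, password)
    return urllib.request.build_opener(urllib.request.HTTPDigestAuthHandler(manager))


request = build_put_request("urn:graph:update:test:put", b"# Turtle data goes here\n")
print(request.get_method(), request.full_url)
```

With a running server, `digest_opener(request.full_url, "dba", "dba").open(request)` would send the prepared request; that call is left out of the sketch so that it runs without a Virtuoso instance.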

REST (2/4)
HTTP GET

Load the sample data into a named graph identified by
<urn:graph:update:test:get>
> curl --digest --user dba:dba --verbose --url "http://localhost:8890/sparql-graph-crud-auth?graph-uri=urn:graph:update:test:get" -T /root/Desktop/Dataset/geo_coordinates_en_uris_it.ttl

Retrieve the graph data:
> curl --verbose --url "http://localhost:8890/sparql-graph-crud?graph-uri=urn:graph:update:test:get"

REST (3/4)
HTTP DELETE

Load the sample data into a named graph identified by
<urn:graph:update:test:delete>
> curl --digest --user dba:dba --verbose --url "http://localhost:8890/sparql-graph-crud-auth?graph-uri=urn:graph:update:test:delete" -T /root/Desktop/Dataset/geo_coordinates_en_uris_it.ttl

Delete the graph data:
> curl --digest --user dba:dba --verbose --url "http://localhost:8890/sparql-graph-crud-auth?graph-uri=urn:graph:update:test:delete" -X DELETE

There are two ways to check that no triples remain after the deletion:
curl: > curl --verbose --url "http://localhost:8890/sparql-graph-crud?graph-uri=urn:graph:update:test:delete"
SPARQL:

SELECT *
FROM <urn:graph:update:test:delete>
WHERE {?s ?p ?o}
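
The SPARQL check above can be automated by counting the triples in the graph instead of listing them. The sketch below is one possible way to do this; it assumes that the server exposes a SPARQL protocol endpoint at `http://localhost:8890/sparql`, that it accepts the SPARQL 1.1 aggregate syntax, and that it can return results as SPARQL JSON. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

SPARQL_ENDPOINT = "http://localhost:8890/sparql"  # assumed endpoint location


def count_query(graph_uri: str) -> str:
    """SPARQL query that counts the triples in one named graph."""
    return f"SELECT (COUNT(*) AS ?n) FROM <{graph_uri}> WHERE {{ ?s ?p ?o }}"


def build_request(graph_uri: str) -> urllib.request.Request:
    """GET request following the SPARQL protocol, asking for JSON results."""
    query_string = urllib.parse.urlencode({"query": count_query(graph_uri)})
    return urllib.request.Request(
        f"{SPARQL_ENDPOINT}?{query_string}",
        headers={"Accept": "application/sparql-results+json"},
    )


def parse_count(results_json: str) -> int:
    """Extract the value bound to ?n from a SPARQL JSON results document."""
    bindings = json.loads(results_json)["results"]["bindings"]
    return int(bindings[0]["n"]["value"])


def graph_is_empty(graph_uri: str) -> bool:
    """Send the count query; this needs a running server instance."""
    with urllib.request.urlopen(build_request(graph_uri)) as response:
        return parse_count(response.read().decode("utf-8")) == 0
```

After the DELETE request, `graph_is_empty("urn:graph:update:test:delete")` would be expected to return `True` if the deletion succeeded.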

REST (4/4)
HTTP POST

Load the sample data into a named graph identified by
<urn:graph:update:test:post>
> curl --digest --user dba:dba --verbose --url "http://localhost:8890/sparql-graph-crud-auth?graph-uri=urn:graph:update:test:post" -X POST -T /root/Desktop/Dataset/geo_coordinates_en_uris_it.ttl

There are two ways to query the graph data:
curl: > curl --verbose --url "http://localhost:8890/sparql-graph-crud?graph-uri=urn:graph:update:test:post"
SPARQL:

SELECT *
FROM <urn:graph:update:test:post>
WHERE {?s ?p ?o}

What is Jena
Jena is an open source Semantic Web framework for Java
Provides an API to extract data from and write to RDF graphs
The graphs are represented as an abstract "model"
A model can be sourced with data from files, databases, URIs or a
combination of these
A model can also be queried through SPARQL and updated through
SPARUL

Virtuoso Jena provider
Virtuoso Jena Provider is a
Native Graph Model Storage
Provider for the Jena
Framework
It enables Jena-based applications
to query the Virtuoso
RDF Quad Store
Providers are
available for the
Jena 2.6.x and 2.10.x versions

Setup
Download the latest Virtuoso Jena Provider, the Virtuoso JDBC driver, associated classes and
sample programs from www.openlinksw.com
Edit the sample programs VirtuosoSPARQLExampleX.java, where X = 1 to 9
Set the JDBC connection parameters to a valid Virtuoso Server instance, using the form:
"jdbc:virtuoso://localhost:1111/charset=UTF-8/log_enable=2", "dba", "dba"
In Eclipse, start a new project and add the following JARs to the CLASSPATH:
axis.jar
commons-logging.jar
icu4j.jar
xercesImpl.jar
jena-arq.jar
jena-core.jar
jena-iri.jar
slf4j-api.jar
slf4j-simple.jar
virt_jena.jar
virtjdbc.jar

Testing
Once the Provider classes and sample programs have been
successfully compiled, the Provider can be tested using the included
sample programs.
Example 1
returns the contents of the RDF Quad store of
the targeted Virtuoso instance

Example 2
reads in the contents of FOAF URIs

Example 3
performs simple addition and deletion operations
on the content of the triple store
What is Sesame

Sesame is an open source Java framework for storing,
querying and reasoning with RDF and RDF Schema
It can be used as a database for RDF and RDF Schema, or as a
Java library for applications that need to work with RDF
internally

Virtuoso Sesame provider
Virtuoso Sesame Provider is a
Native Graph Model Storage
Provider for the Sesame
Framework
It allows applications to modify, query and
reason with the Virtuoso quad
store
The Sesame Repository API
offers a central access point for
connecting to the Virtuoso quad
store; it provides a Java-friendly
access point to Virtuoso,
abstracting the details of the
underlying machinery
The Provider has been tested
against the latest versions,
Sesame 2.7.x
Setup
Download the latest Virtuoso Sesame 2 Provider for the version
of Sesame being used, the Virtuoso JDBC driver, the Sesame
Framework, associated classes and sample programs from
www.openlinksw.com
In Eclipse, start a new project and add the following JARs to
the CLASSPATH:
virtjdbc.jar
virt_sesame.jar
slf4j-api.jar
slf4j-simple.jar
openrdf-sesame.jar
commons-io.jar

Testing
Once the Provider classes and sample program have been successfully compiled,
the Provider can be tested using the included sample programs
The following tests cover the essentials for connecting to and manipulating
data stored in a Virtuoso repository using the Sesame API
VirtuosoTest
Loading data from URL: http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com/foaf.rdf
Clearing triple store
Loading data from file: virtuoso_driver/data.nt
Loading UNICODE single triple
Loading single triple
Casted value type
Selecting property
Statement does not exists
Statement exists (by resultset size)
Statement exists (by hasStatement())
Retrieving namespaces
Retrieving statement (http://myopenlink.net/dataspace/person/kidehen http://myopenlink.net/foaf/name null)
Writing the statements to file: (/Users/src/virtuoso-opensource/binsrc/sesame2/results.n3.txt)
Retrieving graph ids
Retrieving triple store size
Sending ask query
Sending construct query
Sending describe query
Conclusions
At this stage of the analysis, either the Jena or the Sesame
provider can be used, because both fully support
triple manipulation

Operations        SESAME  JENA
Reading RDF       V       V
Writing RDF       V       V
Reasoning         V       V
SPARQL Support    V       V
Internal Storage  V       V
External Storage  V       V
RDF/OO Mapping
Why ?
Problem

The explosive development of the Web has brought
forward the need for semantically rich information: a
vision at the heart of the Semantic Web
Software applications that use RDF triples
often need to work with data stored in a semantic
repository
In such cases, using the APIs of these repositories directly can
be difficult

Solution

An object-RDF mapper is useful in applications
developed with an object-oriented approach, because it extends the features
of the OO paradigm to the RDF world
How?
The Bean!!

A class whose attributes correspond to
the semantic properties of the RDF class, together with
get and set methods

JavaBean classes are written in the Java programming
language according to a particular convention
They are used to encapsulate multiple objects into a single object (the
bean)
These objects can then be passed as a single bean object instead of as multiple
individual objects
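
To make the bean idea concrete, here is a small sketch of a bean-style class whose attributes are tied to RDF predicates and serialized as N-Triples lines. It is written in Python rather than Java and does not use any of the mapping tools discussed in this deck; the class name, the attribute-to-predicate mapping and the example subject URI are all hypothetical, and the literal escaping is deliberately simplified.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


class PersonBean:
    """Bean-style class: private fields plus get/set methods."""

    # Hypothetical mapping from bean attribute to RDF predicate.
    RDF_PROPERTIES = {"name": FOAF + "name", "nick": FOAF + "nick"}

    def __init__(self):
        self._name = None
        self._nick = None

    def get_name(self):
        return self._name

    def set_name(self, value):
        self._name = value

    def get_nick(self):
        return self._nick

    def set_nick(self, value):
        self._nick = value


def escape_literal(text: str) -> str:
    """Escape the characters that would break a quoted N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def to_ntriples(subject_uri: str, bean) -> list:
    """Emit one triple per mapped attribute that has a value."""
    lines = []
    for attribute, predicate in bean.RDF_PROPERTIES.items():
        value = getattr(bean, "get_" + attribute)()
        if value is not None:
            lines.append(f'<{subject_uri}> <{predicate}> "{escape_literal(value)}" .')
    return lines


person = PersonBean()
person.set_name("Alice")
for line in to_ntriples("http://example.org/people/alice", person):
    print(line)
```

A real mapper would also read triples back into beans and handle typed literals and object references; this sketch covers only the writing direction.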
Pro and con
Advantages

Beans are familiar: they are the
common currency of Java frameworks

Disadvantages

It is harder to use RDF in a natural
way. Pulling in and merging disparate data sources, and the
schemaless nature of RDF stores, do not work well when
forced into beans

RDF-Mapping tools
Elmo (ex Alibaba)
Jenabean
Sommer
RDFBeans
RDF2JAVA
RDFReactor
Elmo
Features
BSD, Java 5.0, store Sesame (generic API)
Additional functionality on top of the triple store: predictive caching (preloading
properties and saving query results for future queries), query expansion (for
handling owl:sameAs), dealing with metadata (reification)
JavaBeans concepts for a number of well known web ontologies including Dublin
Core, RSS and FOAF
Dynamic Runtime JavaBean creation based on RDFS/OWL
A set of tools related to the supported ontologies:
RDF crawler
a generic smusher framework
a generic validator framework with various smushers and a validator specific to FOAF

Code generation using Groovy script template
Use of annotated Java interfaces, implemented by dynamic classes at runtime
using Javassist
Jenabean
Jenabean uses Jena's flexible RDF/OWL API to persist Java beans. It
takes an unconventional approach to binding that is driven by the
Java object model rather than an OWL or RDF schema.
Features

It works against the Jena Model API
it should work with either of the two Jena backends (SDB, TDB)
a wrapper is needed to interact with other RDF stores
(SAIL, AllegroGraph)

Sommer
Sommer just thinks of java fields as named relations. It
makes those relations explicit with the @RDF annotation
Features

runtime via byte code rewriting
no generation of code
uses Java annotations
store: Sesame
vocabulary: any URIs
RDFBeans
Features
Does not depend on specific triplestore implementation: any supported by
RDF2Go API can be used
Cascade databinding to reduce development time and ensure referential
integrity of complex object data structures
Modular RDFBeans annotations: can be inherited from superclasses and
interfaces
No predefined ontologies and RDF-schemas are required for RDF data
Transactions support (triplestore-specific)
Extensible mechanism of mapping Java data types to RDF literals
Support of basic Java Collections, optionally represented with RDF containers
Support of indexed JavaBean properties
Support of RDF namespaces
RDF2JAVA
Features

good command line
generates code from RDFS
Java classes for RDFS classes: multiple inheritance
and multiple superclasses are not supported
very small, lightweight project
no longer maintained (software frozen but working)

RDFReactor
Features

Generates code from RDFS
experimental, partial generation from OWL
cardinality constraints
store: via RDF2Go Jena, Sesame and YARS are supported
uses Velocity for template generation

Conclusion
Features                           Elmo  Jenabean  Sommer  RDFBeans
Java Annotations                   V     V         V       V
Storage via Sesame                 V     X         V       V
Storage via Jena                   X     V         X       X
JavaBean generation based on RDFS  V     X         X       -
JavaBean generation based on OWL   V     V         X       -
Documentation                      V     X         X       V

Following the analysis of the mapping tools, Elmo was chosen
Elmo provides all the functionality needed to handle triples
within Virtuoso

The Virtuoso provider chosen was Sesame
Sesame interfaces easily with both Virtuoso and Elmo
The end

giovedì 19 dicembre 13

Más contenido relacionado

La actualidad más candente

Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Olaf Hartig
 
Semantic Media Management with Apache Marmotta
Semantic Media Management with Apache MarmottaSemantic Media Management with Apache Marmotta
Semantic Media Management with Apache MarmottaThomas Kurz
 
Harnessing the power of Nutch with Scala
Harnessing the power of Nutch with ScalaHarnessing the power of Nutch with Scala
Harnessing the power of Nutch with ScalaKnoldus Inc.
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...Olaf Hartig
 
IBM Spark Meetup - RDD & Spark Basics
IBM Spark Meetup - RDD & Spark BasicsIBM Spark Meetup - RDD & Spark Basics
IBM Spark Meetup - RDD & Spark BasicsSatya Narayan
 
Future of HCatalog - Hadoop Summit 2012
Future of HCatalog - Hadoop Summit 2012Future of HCatalog - Hadoop Summit 2012
Future of HCatalog - Hadoop Summit 2012Hortonworks
 
Introduction to LDP in Apache Marmotta
Introduction to LDP in Apache MarmottaIntroduction to LDP in Apache Marmotta
Introduction to LDP in Apache MarmottaSergio Fernández
 
Ckan tutorial odw2013 131109
Ckan tutorial odw2013 131109Ckan tutorial odw2013 131109
Ckan tutorial odw2013 131109Chengjen Lee
 
The Open Source and Cloud Part of Oracle Big Data Cloud Service for Beginners
The Open Source and Cloud Part of Oracle Big Data Cloud Service for BeginnersThe Open Source and Cloud Part of Oracle Big Data Cloud Service for Beginners
The Open Source and Cloud Part of Oracle Big Data Cloud Service for BeginnersEdelweiss Kammermann
 
Stream all the things
Stream all the thingsStream all the things
Stream all the thingsDean Wampler
 
Shrinking the silo boundary: data and schema in the Semantic Web
Shrinking the silo boundary: data and schema in the Semantic WebShrinking the silo boundary: data and schema in the Semantic Web
Shrinking the silo boundary: data and schema in the Semantic WebGordon Dunsire
 
Introduction to hadoop ecosystem
Introduction to hadoop ecosystem Introduction to hadoop ecosystem
Introduction to hadoop ecosystem Rupak Roy
 
(Big) Data Serialization with Avro and Protobuf
(Big) Data Serialization with Avro and Protobuf(Big) Data Serialization with Avro and Protobuf
(Big) Data Serialization with Avro and ProtobufGuido Schmutz
 

La actualidad más candente (20)

May 2013 HUG: HCatalog/Hive Data Out
May 2013 HUG: HCatalog/Hive Data OutMay 2013 HUG: HCatalog/Hive Data Out
May 2013 HUG: HCatalog/Hive Data Out
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
 
Semantic Media Management with Apache Marmotta
Semantic Media Management with Apache MarmottaSemantic Media Management with Apache Marmotta
Semantic Media Management with Apache Marmotta
 
Harnessing the power of Nutch with Scala
Harnessing the power of Nutch with ScalaHarnessing the power of Nutch with Scala
Harnessing the power of Nutch with Scala
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...
 
IBM Spark Meetup - RDD & Spark Basics
IBM Spark Meetup - RDD & Spark BasicsIBM Spark Meetup - RDD & Spark Basics
IBM Spark Meetup - RDD & Spark Basics
 
Future of HCatalog - Hadoop Summit 2012
Future of HCatalog - Hadoop Summit 2012Future of HCatalog - Hadoop Summit 2012
Future of HCatalog - Hadoop Summit 2012
 
Introduction to LDP in Apache Marmotta
Introduction to LDP in Apache MarmottaIntroduction to LDP in Apache Marmotta
Introduction to LDP in Apache Marmotta
 
Ckan tutorial odw2013 131109
Ckan tutorial odw2013 131109Ckan tutorial odw2013 131109
Ckan tutorial odw2013 131109
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Running R on Hadoop - CHUG - 20120815
Running R on Hadoop - CHUG - 20120815Running R on Hadoop - CHUG - 20120815
Running R on Hadoop - CHUG - 20120815
 
CKAN as an open-source data management solution for open data
CKAN as an open-source data management solution for open data CKAN as an open-source data management solution for open data
CKAN as an open-source data management solution for open data
 
The Open Source and Cloud Part of Oracle Big Data Cloud Service for Beginners
The Open Source and Cloud Part of Oracle Big Data Cloud Service for BeginnersThe Open Source and Cloud Part of Oracle Big Data Cloud Service for Beginners
The Open Source and Cloud Part of Oracle Big Data Cloud Service for Beginners
 
Stream all the things
Stream all the thingsStream all the things
Stream all the things
 
Shrinking the silo boundary: data and schema in the Semantic Web
Shrinking the silo boundary: data and schema in the Semantic WebShrinking the silo boundary: data and schema in the Semantic Web
Shrinking the silo boundary: data and schema in the Semantic Web
 
Introduction to hadoop ecosystem
Introduction to hadoop ecosystem Introduction to hadoop ecosystem
Introduction to hadoop ecosystem
 
Hive hcatalog
Hive hcatalogHive hcatalog
Hive hcatalog
 
Hadoop pig
Hadoop pigHadoop pig
Hadoop pig
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
(Big) Data Serialization with Avro and Protobuf
(Big) Data Serialization with Avro and Protobuf(Big) Data Serialization with Avro and Protobuf
(Big) Data Serialization with Avro and Protobuf
 

Similar a Virtuoso RDF Triple Store Analysis Benchmark & mapping tools RDF / OO

Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsJulien Nioche
 
Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and Friendslucenerevolution
 
Open Source Backup Conference 2014: Workshop bareos introduction, by Philipp ...
Open Source Backup Conference 2014: Workshop bareos introduction, by Philipp ...Open Source Backup Conference 2014: Workshop bareos introduction, by Philipp ...
Open Source Backup Conference 2014: Workshop bareos introduction, by Philipp ...NETWAYS
 
More on bpftrace for MariaDB DBAs and Developers - FOSDEM 2022 MariaDB Devroom
More on bpftrace for MariaDB DBAs and Developers - FOSDEM 2022 MariaDB DevroomMore on bpftrace for MariaDB DBAs and Developers - FOSDEM 2022 MariaDB Devroom
More on bpftrace for MariaDB DBAs and Developers - FOSDEM 2022 MariaDB DevroomValeriy Kravchuk
 
Bundling Packages and Deploying Applications with RPM
Bundling Packages and Deploying Applications with RPMBundling Packages and Deploying Applications with RPM
Bundling Packages and Deploying Applications with RPMAlexander Shopov
 
9 steps to install and configure postgre sql from source on linux
9 steps to install and configure postgre sql from source on linux9 steps to install and configure postgre sql from source on linux
9 steps to install and configure postgre sql from source on linuxchinkshady
 
EKAW - Publishing with Triple Pattern Fragments
EKAW - Publishing with Triple Pattern FragmentsEKAW - Publishing with Triple Pattern Fragments
EKAW - Publishing with Triple Pattern FragmentsRuben Taelman
 
Using R on High Performance Computers
Using R on High Performance ComputersUsing R on High Performance Computers
Using R on High Performance ComputersDave Hiltbrand
 
Spack - A Package Manager for HPC
Spack - A Package Manager for HPCSpack - A Package Manager for HPC
Spack - A Package Manager for HPCinside-BigData.com
 
Take your database source code and data under control
Take your database source code and data under controlTake your database source code and data under control
Take your database source code and data under controlMarcin Przepiórowski
 
Elasticsearch on Kubernetes
Elasticsearch on KubernetesElasticsearch on Kubernetes
Elasticsearch on KubernetesJoerg Henning
 
Orchestrated Functional Testing with Puppet-spec and Mspectator
Orchestrated Functional Testing with Puppet-spec and MspectatorOrchestrated Functional Testing with Puppet-spec and Mspectator
Orchestrated Functional Testing with Puppet-spec and MspectatorRaphaël PINSON
 
Orchestrated Functional Testing with Puppet-spec and Mspectator - PuppetConf ...
Orchestrated Functional Testing with Puppet-spec and Mspectator - PuppetConf ...Orchestrated Functional Testing with Puppet-spec and Mspectator - PuppetConf ...
Orchestrated Functional Testing with Puppet-spec and Mspectator - PuppetConf ...Puppet
 
Integrating ChatGPT with Apache Airflow
Integrating ChatGPT with Apache AirflowIntegrating ChatGPT with Apache Airflow
Integrating ChatGPT with Apache AirflowTatiana Al-Chueyr
 
RR & Docker @ MuensteR Meetup (Sep 2017)
RR & Docker @ MuensteR Meetup (Sep 2017)RR & Docker @ MuensteR Meetup (Sep 2017)
RR & Docker @ MuensteR Meetup (Sep 2017)Daniel Nüst
 
PostgreSQL 9.5 Foreign Data Wrappers
PostgreSQL 9.5 Foreign Data WrappersPostgreSQL 9.5 Foreign Data Wrappers
PostgreSQL 9.5 Foreign Data WrappersNicholas Kiraly
 

Similar a Virtuoso RDF Triple Store Analysis Benchmark & mapping tools RDF / OO (20)

Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and Friends
 
Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and Friends
 
Open Source Backup Conference 2014: Workshop bareos introduction, by Philipp ...
Open Source Backup Conference 2014: Workshop bareos introduction, by Philipp ...Open Source Backup Conference 2014: Workshop bareos introduction, by Philipp ...
Open Source Backup Conference 2014: Workshop bareos introduction, by Philipp ...
 
More on bpftrace for MariaDB DBAs and Developers - FOSDEM 2022 MariaDB Devroom
More on bpftrace for MariaDB DBAs and Developers - FOSDEM 2022 MariaDB DevroomMore on bpftrace for MariaDB DBAs and Developers - FOSDEM 2022 MariaDB Devroom
More on bpftrace for MariaDB DBAs and Developers - FOSDEM 2022 MariaDB Devroom
 
SFScon 2020 - Peter Hopfgartner - Open Data de luxe
SFScon 2020 - Peter Hopfgartner - Open Data de luxeSFScon 2020 - Peter Hopfgartner - Open Data de luxe
Virtuoso RDF Triple Store Analysis Benchmark & mapping tools RDF / OO

  • 2. Overview: Triple Store Benchmarking, Virtuoso, Virtuoso Connection, RDF/OO Mapper (Thursday, 19 December 2013)
  • 4. How to... BSBM (Berlin SPARQL Benchmark) compares the performance of RDF and Named Graph stores, as well as RDF-mapped relational databases. The Lehigh University Benchmark (LUBM) facilitates the evaluation of Semantic Web repositories in a standard way.
  • 5. Benchmark (1/5) This analysis draws on an April 2013 BSBM experiment in which the Berlin SPARQL Benchmark version 3.1 was used to measure the performance of the following systems under test (SUTs): BigData (rev. 6528), BigOwlim (v. 5.2), TDB (v. 0.9.4), Virtuoso6 (ver. 6.04) and Virtuoso7 (ver. 7.0).
      Load times for SUTs (hh:mm:ss), where "-" marks a dataset size that was not loaded:
      • BigData: 10M 00:2:39, 100M 00:25:35, 200M 00:59:25, 1B -
      • BigOwlim: 10M 00:2:31, 100M 00:22:47, 200M 00:47:19, 1B 4:9:39
      • TDB: 10M 00:9:41, 100M 1:37:55, 200M 3:34:59
      • Virtuoso6: 10M 00:7:06, 100M 00:19:26, 200M 00:31:30, 1B 1:10:30
      • Virtuoso7: 10M -, 100M 00:3:39, 200M -, 1B 00:27:11
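To make the load-time gap easier to read, the following minimal Python sketch converts the 100M column to seconds and expresses each system as a multiple of the fastest one. It assumes the 100M values are assigned to systems as the flattened table suggests when read column by column (TDB 1:37:55, Virtuoso6 00:19:26, Virtuoso7 00:3:39); the names `to_seconds`, `LOAD_100M` and `slowdown` are illustrative only.

```python
def to_seconds(hms: str) -> int:
    """Convert an h:m:s string such as '1:37:55' or '00:3:39' to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Load times for the 100M dataset, copied from the slide (assumed column reading).
LOAD_100M = {
    "BigData": "00:25:35",
    "BigOwlim": "00:22:47",
    "TDB": "1:37:55",
    "Virtuoso6": "00:19:26",
    "Virtuoso7": "00:3:39",
}

seconds = {sut: to_seconds(value) for sut, value in LOAD_100M.items()}

# The system with the shortest load time is the baseline.
fastest = min(seconds, key=seconds.get)

# How many times longer each system takes than the baseline, to one decimal.
slowdown = {sut: round(sec / seconds[fastest], 1) for sut, sec in seconds.items()}

if __name__ == "__main__":
    for sut, factor in sorted(slowdown.items(), key=lambda item: item[1]):
        print(f"{sut}: {seconds[sut]} s ({factor}x the load time of {fastest})")
```

Under that reading, every other system needs at least several times as long as the fastest one to load 100M triples, which is the "short loading times" argument used later in the conclusions.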
  • 6. Benchmark (2/5) The tables below summarize the query throughput, in queries per second (QpS), for each query type over all 500 runs. Benchmark Query results: QpS (Queries per Second) BigData BigOwlim TDB 100M 200M 100M Query 1 49.955 Query 2 42.769 Query 3 37.280 49.520 Query 1 93.773 65.385 Query 2 115.960 65.158 Query 3 170.242 61.155 43.713 38.355 200M Query 1 232.234 217.865 Query 2 109.445 110.019 Query 3 180.245 174.216 100M 200M Query 1 119.048 94.877 Query 2 158.755 151.883 Query 3 84.660 70.492 Virtuoso7 Virtuoso6 100M 200M 100M 1B Query 1 125.786 75.324 Query 2 68.929 68.820 Query 3 117.426 62.243
  • 7. Benchmark (3/5)
      • Query 1: Find products for a given set of generic features.
      • Query 2: Retrieve basic information about a specific product for display purposes.
      • Query 3: Find products having some specific features and not having one feature.
  • 8. Benchmark (4/5)
      Query 1
      PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
      PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      SELECT DISTINCT ?product ?label
      WHERE {
        ?product rdfs:label ?label .
        ?product a %ProductType% .
        ?product bsbm:productFeature %ProductFeature1% .
        ?product bsbm:productFeature %ProductFeature2% .
        ?product bsbm:productPropertyNumeric1 ?value1 .
        FILTER (?value1 > %x%)
      }
      ORDER BY ?label
      LIMIT 10

      Query 2
      PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
      PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX dc: <http://purl.org/dc/elements/1.1/>
      SELECT ?label ?comment ?producer ?productFeature ?propertyTextual1 ?propertyTextual2 ?propertyTextual3
             ?propertyNumeric1 ?propertyNumeric2 ?propertyTextual4 ?propertyTextual5 ?propertyNumeric4
      WHERE {
        %ProductXYZ% rdfs:label ?label .
        %ProductXYZ% rdfs:comment ?comment .
        %ProductXYZ% bsbm:producer ?p .
        ?p rdfs:label ?producer .
        %ProductXYZ% dc:publisher ?p .
        %ProductXYZ% bsbm:productFeature ?f .
        ?f rdfs:label ?productFeature .
        %ProductXYZ% bsbm:productPropertyTextual1 ?propertyTextual1 .
        %ProductXYZ% bsbm:productPropertyTextual2 ?propertyTextual2 .
        %ProductXYZ% bsbm:productPropertyTextual3 ?propertyTextual3 .
        %ProductXYZ% bsbm:productPropertyNumeric1 ?propertyNumeric1 .
        %ProductXYZ% bsbm:productPropertyNumeric2 ?propertyNumeric2 .
        OPTIONAL { %ProductXYZ% bsbm:productPropertyTextual4 ?propertyTextual4 }
        OPTIONAL { %ProductXYZ% bsbm:productPropertyTextual5 ?propertyTextual5 }
        OPTIONAL { %ProductXYZ% bsbm:productPropertyNumeric4 ?propertyNumeric4 }
      }

      Query 3
      PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
      PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      SELECT ?product ?label
      WHERE {
        ?product rdfs:label ?label .
        ?product a %ProductType% .
        ?product bsbm:productFeature %ProductFeature1% .
        ?product bsbm:productPropertyNumeric1 ?p1 .
        FILTER ( ?p1 > %x% )
        ?product bsbm:productPropertyNumeric3 ?p3 .
        FILTER (?p3 < %y% )
        OPTIONAL {
          ?product bsbm:productFeature %ProductFeature2% .
          ?product rdfs:label ?testVar
        }
        FILTER (!bound(?testVar))
      }
      ORDER BY ?label
      LIMIT 10
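The `%ProductType%`, `%ProductFeature1%` and `%x%` tokens in these queries are placeholders that are replaced with concrete values before a query is sent. The sketch below shows one way such a substitution could be done in Python for Query 1. It is not the BSBM test driver's code: the function `fill_template` and the example parameter values are hypothetical.

```python
import re

# Query 1 from the slide, with its %...% placeholders left in place.
QUERY_1_TEMPLATE = """
PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?product ?label
WHERE {
  ?product rdfs:label ?label .
  ?product a %ProductType% .
  ?product bsbm:productFeature %ProductFeature1% .
  ?product bsbm:productFeature %ProductFeature2% .
  ?product bsbm:productPropertyNumeric1 ?value1 .
  FILTER (?value1 > %x%)
}
ORDER BY ?label
LIMIT 10
"""

# Hypothetical parameter values, chosen only for illustration.
EXAMPLE_PARAMS = {
    "ProductType": "bsbm-inst:ProductType1",
    "ProductFeature1": "bsbm-inst:ProductFeature1",
    "ProductFeature2": "bsbm-inst:ProductFeature2",
    "x": "100",
}


def fill_template(template: str, params: dict) -> str:
    """Replace every %name% token with params[name]; fail loudly if one is missing."""

    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"no value supplied for placeholder %{name}%")
        return params[name]

    return re.sub(r"%(\w+)%", substitute, template)


if __name__ == "__main__":
    print(fill_template(QUERY_1_TEMPLATE, EXAMPLE_PARAMS))
```

Raising on a missing placeholder, rather than leaving the token in the text, keeps a half-filled query from reaching the SPARQL endpoint.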
  • 9. Benchmark (5/5)
      Hardware Configuration
      • Processors: 2 x Intel(R) Xeon(R) CPU E5-2650, 2.00GHz (8 cores and hyperthreading), Sandy Bridge architecture
      • Memory: 256GB
      • Hard Disks: 3 x 1.8TB (7,200 rpm) SATA in RAID 0 (180MB/s sequential throughput)
      Software Configuration
      • Operating System: Linux version 3.3.4-3.fc16.x86_64
      • Filesystem: ext4
      • Java Version and JVM: Version 1.6.0_31, 64-Bit Server VM (build 20.6-b01)
      • BSBM generator and test driver version: bibm-0.7.8
  • 10. Conclusions about 1st step Following the benchmarking phase, Virtuoso was chosen as the platform because of its short loading times and high query throughput.
  • 12. Starting with Virtuoso: Starting point, Installing Virtuoso, Getting Started
  • 13. Starting point
      • The target system is Linux CentOS 6.4.
      • Run “yum update” to refresh the package metadata and bring the installed packages up to date.
      • Run “sudo yum install gcc gmake autoconf automake libtool flex bison gperf gawk m4 make openssl-devel readline-devel wget” to install the build dependencies.
      • Consider opening port 8890/tcp in the firewall configuration to allow external access to Virtuoso's web-based interfaces, such as the Conductor.
  • 14. Installing Virtuoso Download Virtuoso from SourceForge and unpack it with “tar xvpfz virtuoso-opensource-6.1.7.tar.gz”. A simple configuration is:
      [user@centos virtuoso-opensource-6.1.7]$ ./configure --prefix=/usr/local/ --with-readline
      The prefix /usr/local in the command above is the base directory for Virtuoso. The installation creates the following structure:
      • /usr/local/lib/: various libraries for Jena, Sesame, JDBC and hosting
      • /usr/local/bin/: the main executables (virtuoso-t, isql)
      • /usr/local/share/virtuoso/vad/: VAD archives
      • /usr/local/share/virtuoso/doc/: local offline documentation
      • /usr/local/var/lib/virtuoso/db/: default location for a Virtuoso instance
      • /usr/local/var/lib/virtuoso/vsp/: various VSP scripts
      Building and installing:
      [user@centos virtuoso-opensource-6.1.7]$ nice make
      [user@centos virtuoso-opensource-6.1.7]$ sudo make install
  • 15. Getting Started (1/2)
      • Back up the virtuoso.ini file before editing it, in case a change turns out to be wrong.
      • Run “cd /usr/local/var/lib/virtuoso/db/” and then “virtuoso-t -df” to start the server.
      • The Conductor menu is available at “http://localhost:8890/conductor/”.
      • Two system users are available: dba, the relational data and administrative account, and dav, the WebDAV administrative account.
  • 16. Getting Started (2/2)
      Conductor: helps you manage users, automate backups, install VAD packages, execute SQL commands in a web-based iSQL tool, configure the RDF Sponger, and more.
      SQL/ODBC Listener: Virtuoso provides a listener on port 1111/tcp. You can connect to it directly and execute SQL statements with the isql tool.
      Resource Usage: the defaults with Virtuoso Open-Source give:
      • 160 MB process size in memory
      • about 29 MB database
      • total 237 MB footprint on disk
      There are 20 threads for database and/or web-server use.
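After starting the server, it can be useful to confirm that both listeners mentioned on these slides accept connections: 1111/tcp for SQL and 8890/tcp for the HTTP interfaces. The following is a minimal sketch using only Python's standard library; the names `VIRTUOSO_PORTS`, `is_port_open` and `listener_status` are illustrative and not part of Virtuoso.

```python
import socket

# Default ports named on the slides: SQL/ODBC listener and HTTP (Conductor, SPARQL).
VIRTUOSO_PORTS = {"sql": 1111, "http": 8890}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def listener_status(host: str = "localhost") -> dict:
    """Check each default Virtuoso listener on the given host."""
    return {name: is_port_open(host, port) for name, port in VIRTUOSO_PORTS.items()}


if __name__ == "__main__":
    for name, is_up in listener_status().items():
        print(f"{name}: {'listening' if is_up else 'not reachable'}")
```

A successful TCP connection only shows that something is listening on the port; it does not verify credentials or that the process is Virtuoso.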
  • 18. Connections used: RESTful services, JENA Provider, SESAME Provider
  • 19. Rest (1/4) HTTP PUT
      Download an example dataset (e.g. geo_coordinates_en_uris_it.ttl from DBpedia).
      Load the sample data into a named graph identified by <urn:graph:update:test:put>:
      > curl --digest --user dba:dba --verbose --url "http://localhost:8890/sparql-graph-crud-auth?graph-uri=urn:graph:update:test:put" -T /root/Desktop/Dataset/geo_coordinates_en_uris_it.ttl
      Query the graph data:
      SELECT * FROM <urn:graph:update:test:put> WHERE {?s ?p ?o}
  • 20. Rest (2/4) HTTP GET
      Load the sample data into a named graph identified by <urn:graph:update:test:get>:
      > curl --digest --user dba:dba --verbose --url "http://localhost:8890/sparql-graph-crud-auth?graph-uri=urn:graph:update:test:get" -T /root/Desktop/Dataset/geo_coordinates_en_uris_it.ttl
      Query the graph data:
      > curl --verbose --url "http://localhost:8890/sparql-graph-crud?graph-uri=urn:graph:update:test:get"
  • 21. Rest (3/4) HTTP DELETE
      Load the sample data into a named graph identified by <urn:graph:update:test:delete>:
      > curl --digest --user dba:dba --verbose --url "http://localhost:8890/sparql-graph-crud-auth?graph-uri=urn:graph:update:test:delete" -T /root/Desktop/Dataset/geo_coordinates_en_uris_it.ttl
      Delete the graph data:
      > curl --digest --user dba:dba --verbose --url "http://localhost:8890/sparql-graph-crud-auth?graph-uri=urn:graph:update:test:delete" -X DELETE
      There are two ways to confirm that no triples remain after the deletion:
      curl:
      > curl --verbose --url "http://localhost:8890/sparql-graph-crud?graph-uri=urn:graph:update:test:delete"
      SPARQL:
      SELECT * FROM <urn:graph:update:test:delete> WHERE {?s ?p ?o}
  • 22. Rest (4/4) HTTP POST
      Load the sample data into a named graph identified by <urn:graph:update:test:post>:
      > curl --digest --user dba:dba --verbose --url "http://localhost:8890/sparql-graph-crud-auth?graph-uri=urn:graph:update:test:post" -X POST -T /root/Desktop/Dataset/geo_coordinates_en_uris_it.ttl
      There are two ways to query the graph data:
      curl:
      > curl --verbose --url "http://localhost:8890/sparql-graph-crud?graph-uri=urn:graph:update:test:post"
      SPARQL:
      SELECT * FROM <urn:graph:update:test:post> WHERE {?s ?p ?o}
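The four REST slides share one pattern: writes (PUT, POST, DELETE) go to the authenticated endpoint /sparql-graph-crud-auth with digest credentials, while reads (GET) go to /sparql-graph-crud. A minimal Python sketch of that pattern with the standard library might look as follows. The helper names (`graph_url`, `build_request`, `digest_opener`) are hypothetical, and the `Content-Type: text/turtle` header is an assumption that the curl commands on the slides do not set.

```python
import urllib.parse
import urllib.request

BASE = "http://localhost:8890"  # Virtuoso HTTP listener, as on the slides


def graph_url(graph_uri: str, authenticated: bool = True) -> str:
    """Build the graph CRUD URL for a named graph, percent-encoding the graph URI."""
    endpoint = "/sparql-graph-crud-auth" if authenticated else "/sparql-graph-crud"
    return BASE + endpoint + "?" + urllib.parse.urlencode({"graph-uri": graph_uri})


def build_request(method: str, graph_uri: str, data: bytes = None) -> urllib.request.Request:
    """Create (but do not send) a request; only GET uses the unauthenticated endpoint."""
    headers = {"Content-Type": "text/turtle"} if data is not None else {}  # assumed header
    return urllib.request.Request(
        graph_url(graph_uri, authenticated=(method != "GET")),
        data=data,
        headers=headers,
        method=method,
    )


def digest_opener(user: str, password: str) -> urllib.request.OpenerDirector:
    """Return an opener that answers HTTP digest challenges, like curl --digest."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, BASE, user, password)
    return urllib.request.build_opener(urllib.request.HTTPDigestAuthHandler(manager))


def upload_turtle(path: str, graph_uri: str, user: str = "dba", password: str = "dba") -> int:
    """PUT a Turtle file into the named graph and return the HTTP status code."""
    with open(path, "rb") as handle:
        request = build_request("PUT", graph_uri, handle.read())
    with digest_opener(user, password).open(request) as response:
        return response.status
```

For example, `upload_turtle("geo_coordinates_en_uris_it.ttl", "urn:graph:update:test:put")` would correspond to the PUT slide, provided a Virtuoso instance is running with the default dba credentials. Nothing in the sketch contacts the server until `upload_turtle` is called.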
  • 23. What is Jena Jena is an open source Semantic Web framework for Java. It provides an API to extract data from and write to RDF graphs. The graphs are represented as an abstract "model". A model can be sourced with data from files, databases, URIs or a combination of these. A model can also be queried through SPARQL and updated through SPARUL.
  • 24. Virtuoso Jena provider The Virtuoso Jena Provider is a Native Graph Model Storage Provider for the Jena Framework. It lets applications built on the Jena RDF framework query the Virtuoso RDF Quad Store. Providers are available for the Jena 2.6.x and 2.10.x versions.
  • 25. Setup
      • Download the latest Virtuoso Jena Provider, the Virtuoso JDBC driver, the associated classes and the sample programs from www.openlinksw.com.
      • Edit the sample programs VirtuosoSPARQLExampleX.java, where X = 1 to 9.
      • Set the JDBC connection string to a valid Virtuoso Server instance, using the form: "jdbc:virtuoso://localhost:1111/charset=UTF-8/log_enable=2", "dba", "dba"
      • From Eclipse, start a new project and add the following JARs to the CLASSPATH: axis.jar, commons-logging.jar, icu4j.jar, xercesImpl.jar, jena-arq.jar, jena-core.jar, jena-iri.jar, slf4j-api.jar, slf4j-simple.jar, virt_jena.jar, virtjdbc.jar
  • 26. Testing Once the Provider classes and sample programs have been successfully compiled, the Provider can be tested using the included sample programs.
      • Example 1: returns the contents of the RDF Quad store of the targeted Virtuoso instance.
      • Example 2: reads in the contents of FOAF URIs.
      • Example 3: performs simple addition and deletion operations on the content of the triple store.
  • 27. What is Sesame Sesame is an open source Java framework for storing, querying and reasoning with RDF and RDF Schema. It can be used as a database for RDF and RDF Schema, or as a Java library for applications that need to work with RDF internally.
  • 28. Virtuoso Sesame provider The Virtuoso Sesame Provider is a Native Graph Model Storage Provider for the Sesame Framework. It allows applications to modify, query and reason with the Virtuoso quad store. The Sesame Repository API offers a central access point for connecting to the Virtuoso quad store; it provides a Java-friendly access point to Virtuoso, abstracting the details of the underlying machinery. The Provider has been tested against the latest versions, Sesame 2.7.x.
  • 29. Setup
      • Download the latest Virtuoso Sesame 2 Provider for the version of Sesame being used, the Virtuoso JDBC driver, the Sesame Framework, the associated classes and the sample programs from www.openlinksw.com.
      • From Eclipse, start a new project and add the following JARs to the CLASSPATH: virtjdbc.jar, virt_sesame.jar, slf4j-api.jar, slf4j-simple.jar, openrdf-sesame.jar, commons-io.jar
  • 30. Testing Once the Provider classes and sample programs have been successfully compiled, the Provider can be tested using the included sample programs. The following tests cover the essentials for connecting to and manipulating data stored in a Virtuoso repository using the Sesame API.
      VirtuosoTest
      • Loading data from URL: http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com/foaf.rdf
      • Clearing triple store
      • Loading data from file: virtuoso_driver/data.nt
      • Loading UNICODE single triple
      • Loading single triple
      • Casted value type
      • Selecting property
      • Statement does not exists
      • Statement exists (by resultset size)
      • Statement exists (by hasStatement())
      • Retrieving namespaces
      • Retrieving statement (http://myopenlink.net/dataspace/person/kidehen http://myopenlink.net/foaf/name null)
      • Writing the statements to file: (/Users/src/virtuoso-opensource/binsrc/sesame2/results.n3.txt)
      • Retrieving graph ids
      • Retrieving triple store size
      • Sending ask query
      • Sending construct query
      • Sending describe query
  • 31. Conclusions At this stage of the analysis, the Jena and Sesame providers are interchangeable, because both fully support triple manipulation. Operations supported (V = yes):
      • Reading RDF: SESAME V, JENA V
      • Writing RDF: SESAME V, JENA V
      • Reasoning: SESAME V, JENA V
      • SPARQL Support: SESAME V, JENA V
      • Internal Storage: SESAME V, JENA V
      • External Storage: SESAME V, JENA V
  • 33. Why?
      Problem: The explosive development of the Web has brought forward the need for semantically rich information, a vision at the heart of the Semantic Web. Software applications that use RDF triples often need to work with data stored in a semantic repository, and using the repository APIs directly can be difficult.
      Solution: An object-RDF mapper is useful in applications developed with an object-oriented approach, because it extends the features of the OO paradigm to the RDF world.
  • 34. How? The Bean!! A bean is a class whose attributes correspond to the semantic properties of the RDF class, with get and set methods for each of them. JavaBean classes are written in the Java programming language according to a particular convention. A bean encapsulates multiple objects in a single object, so they can be passed around as one bean instead of as multiple individual objects.
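The idea that each bean attribute corresponds to one semantic property can be illustrated without any of the mapping tools. The sketch below is a Python analog of a bean, not a JavaBean and not the API of Elmo, Jenabean or any other mapper: a hypothetical `Person` class whose fields are turned into triples, using the FOAF vocabulary's `name` and `mbox` properties as predicates.

```python
from dataclasses import dataclass, fields

FOAF = "http://xmlns.com/foaf/0.1/"  # FOAF vocabulary namespace


@dataclass
class Person:
    """Bean-like class: 'uri' identifies the resource, the other fields are properties."""
    uri: str
    name: str
    mbox: str


def to_triples(bean) -> list:
    """Map each attribute except 'uri' to a (subject, predicate, object) triple."""
    return [
        (bean.uri, FOAF + field.name, getattr(bean, field.name))
        for field in fields(bean)
        if field.name != "uri"
    ]


if __name__ == "__main__":
    # Illustrative data only.
    alice = Person(
        uri="http://example.org/people/alice",
        name="Alice",
        mbox="mailto:alice@example.org",
    )
    for triple in to_triples(alice):
        print(triple)
```

A real object-RDF mapper also handles the reverse direction (triples back to objects), datatypes and nested objects; this sketch covers only the attribute-to-predicate correspondence.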
  • 35. Pro and con
      Advantages: Beans are familiar; they are the common currency of Java frameworks.
      Disadvantages: It is harder to use RDF in a natural way. Pulling in and merging disparate data sources, and the schemaless aspect of RDF stores, do not work well when forced into beans.
  • 36. RDF-Mapping tools: Elmo (ex Alibaba), Jenabean, Sommer, RDFBeans, RDF2JAVA, RDFReactor
  • 37. Elmo Features
      • BSD, Java 5.0, store: Sesame (generic API)
      • Additional functionality on top of the triple store: predictive caching (preloading properties and saving query results for future queries), query expansion (for handling owl:sameAs), dealing with metadata (reification)
      • JavaBeans concepts for a number of well known web ontologies, including Dublin Core, RSS and FOAF
      • Dynamic runtime JavaBean creation based on RDFS/OWL
      • A set of tools related to the supported ontologies: an RDF crawler, a generic smusher framework, and a generic validator framework, with various smushers and a validator specific to FOAF
      • Code generation using Groovy script templates
      • Use of annotated Java interfaces, implemented by dynamic classes at runtime using Javassist
  • 38. Jenabean Jenabean uses Jena's flexible RDF/OWL API to persist Java beans. It takes an unconventional approach to binding that is driven by the Java object model rather than by an OWL or RDF schema.
      Features
      • It works against the Jena Model API.
      • It should interact with one of the two Jena backends (SDB, TDB).
      • A wrapper is needed to interact with another RDF store (SAIL, AllegroGraph).
  • 39. Sommer Sommer simply treats Java fields as named relations and makes those relations explicit with the @RDF annotation.
      Features
      • works at runtime via byte code rewriting
      • no code generation
      • uses Java annotations
      • store: Sesame
      • vocabulary: any URIs
  • 40. RDFBeans Features
      • Does not depend on a specific triplestore implementation: any store supported by the RDF2Go API can be used
      • Cascade databinding to reduce development time and ensure referential integrity of complex object data structures
      • Modular RDFBeans annotations: can be inherited from superclasses and interfaces
      • No predefined ontologies and RDF schemas are required for RDF data
      • Transaction support (triplestore-specific)
      • Extensible mechanism for mapping Java data types to RDF literals
      • Support for basic Java Collections, optionally represented with RDF containers
      • Support for indexed JavaBean properties
      • Support for RDF namespaces
  • 41. RDF2JAVA Features
      • good command line
      • generates code from RDFS
      • Java classes for RDFS classes: no multiple inheritance and no multiple super classes are supported
      • very tiny, lightweight project
      • not maintained anymore (software frozen but working)
  • 42. RDFReactor Features
      • generates code from RDFS
      • experimental, partial generation from OWL
      • cardinality constraints
      • store: via RDF2Go; Jena, Sesame and YARS are supported
      • uses Velocity for template generation
  • 43. Conclusion Feature comparison (V = yes, X = no, - = not reported):
      • Java Annotations: Elmo V, Jenabean V, Sommer V, RDFBeans V
      • Storage via Sesame: Elmo V, Jenabean X, Sommer V, RDFBeans V
      • Storage via Jena: Elmo X, Jenabean V, Sommer X, RDFBeans X
      • JavaBean generation based on RDFS: Elmo V, Jenabean X, Sommer X, RDFBeans -
      • JavaBean generation based on OWL: Elmo V, Jenabean V, Sommer X, RDFBeans -
      • Documentation: Elmo V, Jenabean X, Sommer X, RDFBeans V
      Following the analysis of the mapping tools, Elmo was chosen: it provides all the functionality needed for handling triples within Virtuoso. The Virtuoso provider chosen was SESAME, which interfaces easily with both Virtuoso and Elmo.
  • 44. The end