RDF processing APIs for Java
Ştefan Apostoaie,
Computer Science Faculty, “Al. I. Cuza” University, Iaşi, Romania
stefan.apostoaie@info.uaic.ro
Abstract. There are several RDF APIs for Java, the most popular being JRDF,
Jena, and Sesame. In this paper we compare them in terms of triple
storage, SPARQL support, programmer support (documentation, IDE
integration, learning curve, etc.), performance, interoperability, maturity and
licensing.
Keywords: API, RDF, SPARQL, Java
1. Introduction
There are a lot of RDF APIs for Java, but it is hard to choose one of them. Some are
simple, clean implementations offering basic RDF support, while others are complex
and offer full RDF functionality for projects of any size. We compare some of these
implementations based on their description, documentation and small examples.
1.1 Document structure
In this document we give a basic description of the following RDF APIs:
- JRDF,
- Jena,
- Sesame
Each API will be analyzed in terms of:
- how the triples are stored,
- SPARQL query support,
- programmer support,
- maturity,
- licensing.
2. JRDF¹
JRDF is an open source RDF framework for Java that uses an object-oriented model of
RDF graphs, including URIs, literals and blank nodes.
It offers the following features:
- a graph API including graph comparison and graph set-based operations
- creating and manipulating Graph objects (Statements, Resources, Nodes, etc.)
- in-memory and disk-based graphs with a standard system-level interface for
storing triples
- IoC support (using Spring 2)
- RDF datatypes
- local (where nodes are tied to a graph/store) and global (where they are not)
RDF statements
- query handling, including SPARQL support (results, transport, etc.).
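The graph comparison and set-based operations listed above can be illustrated with a minimal sketch that treats a graph as a plain set of triples. The `Triple` record and the `GraphSetOps` class are hypothetical names introduced here for illustration; they are not part of the JRDF API. Note that plain set equality is only a valid comparison for graphs without blank nodes; with blank nodes, comparing graphs requires an isomorphism check, which this sketch does not attempt.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical triple type for illustration; not a JRDF class.
record Triple(String subject, String predicate, String object) {}

// Set-based operations over graphs modelled as Set<Triple> (assumption: no blank nodes).
class GraphSetOps {
    // Triples present in either graph.
    static Set<Triple> union(Set<Triple> a, Set<Triple> b) {
        Set<Triple> result = new LinkedHashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Triples present in both graphs.
    static Set<Triple> intersection(Set<Triple> a, Set<Triple> b) {
        Set<Triple> result = new LinkedHashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Triples of a that do not occur in b.
    static Set<Triple> difference(Set<Triple> a, Set<Triple> b) {
        Set<Triple> result = new LinkedHashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Triple> g1 = Set.of(
                new Triple("ex:alice", "foaf:knows", "ex:bob"),
                new Triple("ex:alice", "foaf:name", "Alice"));
        Set<Triple> g2 = Set.of(
                new Triple("ex:alice", "foaf:knows", "ex:bob"));

        System.out.println("union size: " + union(g1, g2).size());
        System.out.println("intersection: " + intersection(g1, g2));
        System.out.println("difference: " + difference(g1, g2));
        // Graph comparison for ground graphs reduces to set equality.
        System.out.println("g1 equals union: " + g1.equals(union(g1, g2)));
    }
}
```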
JRDF comes in two variants:
1. JRDF GUI: reads RDF/XML and N3 files and lets the user query them using
SPARQL.
· To open it, run:
· java -jar jrdf-gui-0.x.x.jar
2. JRDF JAR: the library itself, used to load and process RDF/XML files.
· It contains an RDF/XML parser for processing RDF/XML files.
· It also offers a Graph interface with methods for adding, removing, and
finding triples.
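To make the idea of a graph interface for adding, removing, and finding triples concrete, here is a minimal in-memory sketch. The class `SimpleGraph`, its method signatures, and the use of `null` as a wildcard are assumptions made for illustration; the actual JRDF Graph interface may differ in names and types.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical triple type for illustration; not a JRDF class.
record Triple(String subject, String predicate, String object) {}

// Hypothetical in-memory graph; method names are illustrative, not JRDF's signatures.
class SimpleGraph {
    private final Set<Triple> triples = new LinkedHashSet<>();

    // Returns true if the triple was not already present.
    boolean add(Triple t) {
        return triples.add(t);
    }

    // Returns true if the triple was present and has been removed.
    boolean remove(Triple t) {
        return triples.remove(t);
    }

    // Finds all triples matching the given terms; null acts as a wildcard.
    List<Triple> find(String subject, String predicate, String object) {
        List<Triple> matches = new ArrayList<>();
        for (Triple t : triples) {
            if ((subject == null || subject.equals(t.subject()))
                    && (predicate == null || predicate.equals(t.predicate()))
                    && (object == null || object.equals(t.object()))) {
                matches.add(t);
            }
        }
        return matches;
    }

    int size() {
        return triples.size();
    }

    public static void main(String[] args) {
        SimpleGraph graph = new SimpleGraph();
        graph.add(new Triple("ex:alice", "foaf:knows", "ex:bob"));
        graph.add(new Triple("ex:alice", "foaf:name", "Alice"));
        // All statements about ex:alice, whatever the predicate and object.
        System.out.println(graph.find("ex:alice", null, null));
    }
}
```

The linear scan in `find` keeps the sketch short; a real store would normally answer such lookups from an index.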
JRDF is a new project, so we don't expect it to be very mature, but because it is
designed to reuse ideas from previous RDF API implementations such as Jena and
Sesame, it already offers a lot of features. It has only reached version 0.5.6, so it
may still contain a number of bugs.
The JRDF documentation helps users install and start using the API in a few steps.
The Javadoc and, for those interested, the source code are also published. There are
a few tutorials, but programmers must experiment with the framework themselves to
learn all of its features.
JRDF is released under the Apache Software License, Version 1.1, meaning it is free
and open source, and any redistribution must include the original JRDF license file.
3. Jena²
Jena is a framework that provides a programmatic environment for RDF, RDFS,
OWL, and SPARQL.
¹ http://jrdf.sourceforge.net/
² http://jena.sourceforge.net/
It includes:
· An RDF API
· Reading and writing RDF in RDF/XML, N3 and N-Triples
· An OWL API
· In-memory and persistent storage
· SPARQL query engine
It provides methods for reading and writing RDF files and for navigating and
querying a model. For queries, Jena uses ARQ, a query engine that implements
SPARQL.
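To illustrate what querying a model means at the lowest level, the following sketch evaluates a list of triple patterns against a list of triples and returns the variable bindings, roughly the core of a SPARQL basic graph pattern. It is a simplified illustration under our own assumptions, not ARQ's implementation: the `Triple` record and `PatternMatcher` class are hypothetical, terms are plain strings, and any term starting with `?` is treated as a variable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical triple type; also used for patterns, where "?x" marks a variable.
record Triple(String subject, String predicate, String object) {}

class PatternMatcher {
    static boolean isVariable(String term) {
        return term.startsWith("?");
    }

    // Unifies one pattern term with one data term, extending the bindings if needed.
    static boolean unify(String patternTerm, String value, Map<String, String> bindings) {
        if (!isVariable(patternTerm)) {
            return patternTerm.equals(value);
        }
        String bound = bindings.get(patternTerm);
        if (bound == null) {
            bindings.put(patternTerm, value);
            return true;
        }
        return bound.equals(value);
    }

    // Evaluates the patterns left to right, extending every partial solution
    // with each data triple that is compatible with it (a nested-loop join).
    static List<Map<String, String>> match(List<Triple> data, List<Triple> patterns) {
        List<Map<String, String>> solutions = new ArrayList<>();
        solutions.add(new HashMap<>());
        for (Triple pattern : patterns) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> partial : solutions) {
                for (Triple t : data) {
                    Map<String, String> candidate = new HashMap<>(partial);
                    if (unify(pattern.subject(), t.subject(), candidate)
                            && unify(pattern.predicate(), t.predicate(), candidate)
                            && unify(pattern.object(), t.object(), candidate)) {
                        next.add(candidate);
                    }
                }
            }
            solutions = next;
        }
        return solutions;
    }

    public static void main(String[] args) {
        List<Triple> data = List.of(
                new Triple("ex:alice", "foaf:knows", "ex:bob"),
                new Triple("ex:bob", "foaf:name", "Bob"),
                new Triple("ex:alice", "foaf:name", "Alice"));
        // Roughly: SELECT ?f ?n WHERE { ex:alice foaf:knows ?f . ?f foaf:name ?n }
        List<Triple> patterns = List.of(
                new Triple("ex:alice", "foaf:knows", "?f"),
                new Triple("?f", "foaf:name", "?n"));
        System.out.println(match(data, patterns));
    }
}
```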
The persistence of RDF and OWL data is done using two subsystems: SDB or TDB
(separate downloads).
SDB provides scalable storage and query of RDF datasets using conventional SQL
databases. SDB is designed specifically to support SPARQL. SDB supports Microsoft
SQL Server 2005, Oracle 10gR2, IBM DB2, PostgreSQL v8, MySQL 5.0, HSQLDB
1.8, H2 1, Apache Derby 10.2.
TDB is a high performance, non-transactional persistence engine using custom
indexing and storage.
Of the two, TDB is faster and simpler to set up.
Jena also supports RDB for legacy applications, but it's deprecated for new
development.
Jena is a relatively old project (it has reached version 2.6), which suggests that it
is quite mature and stable. Backward compatibility is also maintained, so we can move
to a new version of the framework even if our project was started with a previous
release.
On the Jena project web site we find many tutorials and HowTos that guide the
programmer through learning Jena. There we can find examples on how to create
a model and how to use RDF Readers and Writers, typed literals, ARP (Another RDF
Parser), Schemagen, and many others. Compared to JRDF, the Jena documentation is
far more useful and covers much more of the framework's functionality.
Jena is also free and open source, and can be redistributed under simple terms: the
original license file must be kept.
4. Sesame³
Sesame is an open source RDF framework with support for RDF Schema inferencing
and querying. It has been designed with flexibility in mind: it can be deployed on top
of a variety of storage systems (relational databases, in-memory, file systems, keyword
indexers, etc.), and it offers developers a large set of tools to leverage the power of
RDF and RDF Schema.
Sesame contains the following components:
- Sail API (Storage And Inference Layer) – low level System API for RDF stores and
inferencers. Its purpose is to abstract from the storage and inference details, so that
different kinds of stores and inferencers can be used.
³ http://www.openrdf.org/
- Rio (RDF I/O) – a set of parsers and writers for various RDF file formats. The
parsers can be used to translate RDF files to sets of statements, and the writers for the
reverse operation.
- Repository API – a higher level API that offers a large number of developer-
oriented methods for handling RDF data. This API should make the life of application
developers "as easy as possible".
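The parser and writer roles described for Rio above can be sketched for the simplest RDF serialization, N-Triples. This is a deliberately restricted illustration, not Rio's API: the `Statement` record and `NTriplesSketch` class are hypothetical, and the sketch only accepts IRI subjects and predicates and objects that are either IRIs or plain literals without escapes, language tags, or datatypes. Blank nodes are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical statement type. The object keeps its N-Triples form
// (with angle brackets or quotes) so that IRIs and literals stay distinguishable.
record Statement(String subject, String predicate, String object) {}

class NTriplesSketch {
    // One line of the restricted subset: <iri> <iri> (<iri> | "literal") .
    private static final Pattern LINE = Pattern.compile(
            "^\\s*<([^>]*)>\\s+<([^>]*)>\\s+(<[^>]*>|\"[^\"]*\")\\s*\\.\\s*$");

    // Parser role: translate a document into a list of statements.
    static List<Statement> parse(String document) {
        List<Statement> statements = new ArrayList<>();
        for (String line : document.split("\\R")) {
            if (line.isBlank() || line.trim().startsWith("#")) {
                continue; // skip empty lines and comments
            }
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException("Unsupported line: " + line);
            }
            statements.add(new Statement(m.group(1), m.group(2), m.group(3)));
        }
        return statements;
    }

    // Writer role: the reverse operation, one statement per line.
    static String write(List<Statement> statements) {
        StringBuilder out = new StringBuilder();
        for (Statement s : statements) {
            out.append('<').append(s.subject()).append("> <")
               .append(s.predicate()).append("> ")
               .append(s.object()).append(" .\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String doc = "<http://example.org/a> <http://example.org/knows> <http://example.org/b> .\n"
                + "<http://example.org/a> <http://example.org/name> \"Alice\" .\n";
        List<Statement> parsed = parse(doc);
        System.out.println(parsed.size() + " statements parsed");
        System.out.print(write(parsed));
    }
}
```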
Sesame can be used as a library or as a server. Used as a library, the setup is
straightforward. To use it as a server, the user has to set up some environment
variables and install a Java Servlet container. Java 5 or newer is required for
both variants.
For persistence, Sesame can use:
- Memory store (in-memory persistence): its contents can be saved to disk before
shutdown.
- Native store: slower than the memory store, but not limited by the size of
available memory. It uses B-Trees for indexing statements, and additional
indexes can be defined to speed up querying.
- RDBMS (Relational DataBase Management System): PostgreSQL and
MySQL are supported, accessed through JDBC.
- HTTP repository: not an actual store, but a proxy for a store on a
remote Sesame server.
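Why additional indexes speed up querying can be shown with a small sketch that keeps the same statements in two sorted orders. Java's `TreeSet` (a red-black tree) stands in for the on-disk B-Trees here, and the `IndexedStore` class, the `Triple` record, and the two index orders (subject-predicate-object and predicate-object-subject) are assumptions chosen for illustration, not Sesame's actual implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Predicate;

// Hypothetical triple type for illustration.
record Triple(String subject, String predicate, String object) {}

// Keeps every triple in two sorted indexes so that lookups by subject
// and lookups by predicate can both start from a contiguous range.
class IndexedStore {
    private final TreeSet<Triple> spo = new TreeSet<>(
            Comparator.comparing(Triple::subject)
                    .thenComparing(Triple::predicate)
                    .thenComparing(Triple::object));
    private final TreeSet<Triple> pos = new TreeSet<>(
            Comparator.comparing(Triple::predicate)
                    .thenComparing(Triple::object)
                    .thenComparing(Triple::subject));

    void add(Triple t) {
        spo.add(t);
        pos.add(t);
    }

    // Uses the subject-first index; "" is the smallest possible string,
    // so the lower bound sorts before every triple with this subject.
    List<Triple> bySubject(String s) {
        return scan(spo, new Triple(s, "", ""), t -> t.subject().equals(s));
    }

    // Uses the predicate-first index in the same way.
    List<Triple> byPredicate(String p) {
        return scan(pos, new Triple("", p, ""), t -> t.predicate().equals(p));
    }

    // Walks the index from the lower bound and stops at the first triple out of range.
    private static List<Triple> scan(TreeSet<Triple> index, Triple lowerBound,
                                     Predicate<Triple> inRange) {
        List<Triple> result = new ArrayList<>();
        for (Triple t : index.tailSet(lowerBound, true)) {
            if (!inRange.test(t)) {
                break;
            }
            result.add(t);
        }
        return result;
    }

    public static void main(String[] args) {
        IndexedStore store = new IndexedStore();
        store.add(new Triple("ex:alice", "foaf:knows", "ex:bob"));
        store.add(new Triple("ex:bob", "foaf:knows", "ex:carol"));
        store.add(new Triple("ex:alice", "foaf:name", "Alice"));
        System.out.println(store.byPredicate("foaf:knows"));
        System.out.println(store.bySubject("ex:alice"));
    }
}
```

With only the subject-first index, a lookup by predicate would have to scan every statement; the second index trades extra storage and slower writes for a direct range scan.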
Sesame reached version 2.0 in December 2007 after "two years of intensive
development". The most recent official version is 2.2.4, and a 2.3 preview version
is also available. The framework can therefore be considered mature, and if we have
problems using it we can turn to either community or commercial support.
The Sesame user guide walks the user through downloading, installing, and basic
usage of the library and the server. There are also tutorials for RQL and SeRQL,
Sesame's RDF query languages (the first will no longer be updated, the second is
still in development). Compared with the Jena documentation, the Sesame
documentation is less thorough. Despite this it is usable, and the framework can be
learned without too much effort.
A notable thing is that SPARQL is not prominent in the official documentation;
Sesame 2 nevertheless accepts SPARQL queries in addition to SeRQL.
Sesame is a complex RDF API implementation that can be used just as easily in
enterprise projects as in small, simple applications.
Sesame 2.x is available under a BSD-style license, which means that it's open-source
and free, provided the original license file is not removed.
References
1. http://jena.sourceforge.net/ main, license and documentation pages.
2. http://jrdf.sourceforge.net/ home and documentation pages.
3. http://www.openrdf.org/ home, user manual, documentation, and license pages.