1. Mark van Berkel, Founder
mark@hunchmanifest.com
@vberkel
March 2013, Hunch Manifest Inc
2. [Graph diagram: Mark, lastName “van Berkel”]
3. [Graph diagram: Mark, lastName “van Berkel”; Mark studied M.Eng.; Mark founded Hunch Manifest]
4. [Graph diagram, extended. Edge labels: lastName, studied, founded, included, offers (twice). Node labels as extracted: Mark, “van Berkel”, M.Eng. (twice), Hunch Manifest, Project, Servicedat, RHomeApi.com, Home.com]
5. [Graph diagram, extended. Edge labels: lastName, studied, founded, included, offers (twice), produced (twice). Node labels as extracted: Mark, “van Berkel”, M.Eng. (twice), Hunch Manifest, Project, Servicedat, RHomeApi.com, Home.com, Research Report, SAP Labs Prototype]
6. [Graph diagram, complete. Edge labels: lastName, studied, founded, presents, included, basedOn (three times), offers (twice), produced (twice). Node labels as extracted: Mark, “van Berkel”, M.Eng. (twice), Hunch Manifest, “Designing Semantic Web Apps”, Project, Servicedat, RHomeApi.com, Home.com, Research Report, SAP Labs Prototype]
7. “Making connections may be the noblest work of man”
Ralph Caplan, author, public speaker, and designer
“Digital technology and the Internet have suddenly opened up a dramatic flood of new connections and connectivity that’s confusing in its intensity and reach. Traditional media are being challenged by unexpected new media that have been spawned by these new connections.”
Bill Moggridge, Founder IDEO
“Big data without context is just noise”
Expert System ?
8. Why?
Basic semantic web applications vs idealistic designs
Data Collection
Basic Queries, CRUD operations
Advanced Queries, Transactions, using Contexts
Security
Publishing information
Platform and Infrastructure.
Ecosystem and complementary technologies.
Opportunities and Challenges
9. Search & Discovery: ALL kinds of information are unified
Do More: Better context, content for User Actions
Knowledge Graph: Generate assertions and automated reasoning
Business Control: Shift from techies to Knowledge workers (e.g. ODapps)
Adaptive & Robust: Easily extended, integrate syntax, structure, meaning.
Lower Costs: Integrate w/out rearchitecting, single model
Domain Rationale: Elegant method to solve data explosion
Source: http://www.mkbergman.com/1626/seven-arguments-for-semantic-technologies/
14. Start with a Graph database
Load some data (Data Collection)
RDB to RDF Mapping Language (R2RML)
Load some Linked Data, e.g. DBpedia.org
Custom Application Integration
Test the SPARQL Query
Find a Sem Web Library for your language
[see Ecosystem Slide]
16. A partial R2RML mapping document produces the triples from the EMP table.
http://d2rq.org can generate mappings and provides SPARQL read access to the content of relational databases
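The idea behind such a mapping can be sketched in plain Python. This is an illustration only, not R2RML itself (which is written in Turtle and executed by a mapping engine). The column names EMPNO and ENAME and both namespaces are assumptions made for the example.

```python
# Illustration only: mimics what an R2RML triples map does for one table.
# BASE, EX and the column names EMPNO / ENAME are assumptions.
BASE = "http://data.example.com/employee/"
EX = "http://example.com/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def emp_row_to_triples(row):
    """Return N-Triples lines for one EMP row given as a dict of columns."""
    # Plays the role of a subject map with a URI template on EMPNO.
    subject = f"<{BASE}{row['EMPNO']}>"
    return [
        # Plays the role of the class assigned by the subject map.
        f"{subject} <{RDF_TYPE}> <{EX}Employee> .",
        # Plays the role of a predicate-object map reading column ENAME.
        # No literal escaping is done here; real data would need it.
        f'{subject} <{EX}name> "{row["ENAME"]}" .',
    ]


rows = [{"EMPNO": 7369, "ENAME": "SMITH"}]
for row in rows:
    for line in emp_row_to_triples(row):
        print(line)
```

Each row becomes one subject URI plus one triple per mapped column, which is the pattern a mapping document declares instead of coding by hand.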
22. TopBraid Suite
Leverages emerging technology to help customers connect silos of data, systems and infrastructure and to build flexible applications from linked data models.
23. Ontology
Defines all the elements involved in a business ecosystem and organizes them by their relationship to each other.
Upper Level Ontology
Domain Specific Ontology
Generate Reasoning & Dynamic Insights
24. INSERT DATA { d:i8301 ab:homeTel "(718) 440-9821" .
ab:Person a rdfs:Class . }
Preview dynamic inserts
CONSTRUCT { ?person a ab:Person . }
WHERE { ?person ab:firstName ?firstName ;
ab:lastName ?lastName . }
Dynamic Insert
INSERT { ?person a ab:Person . }
WHERE { ?person ab:firstName ?firstName ;
ab:lastName ?lastName . }
DuCharme, Bob (2011-07-14). Learning SPARQL. O'Reilly Media - A. Kindle Edition.
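One way to keep the preview and the real insert in step is to generate both from the same template and WHERE clause. The sketch below does that and builds (without sending) an HTTP request in the style of the SPARQL 1.1 Protocol. The endpoint URL and the ab: namespace URI are assumptions, so substitute the values for your own store.

```python
import urllib.request

# Hypothetical update endpoint; every triplestore exposes its own URL.
UPDATE_ENDPOINT = "http://localhost:8080/update"
# Assumed namespace for the ab: prefix used on the slide.
PREFIX = "PREFIX ab: <http://learningsparql.com/ns/addressbook#>\n"

TEMPLATE = "{ ?person a ab:Person . }"
WHERE = "WHERE { ?person ab:firstName ?firstName ; ab:lastName ?lastName . }"


def preview_query():
    """CONSTRUCT query that shows the triples the INSERT would add."""
    return f"{PREFIX}CONSTRUCT {TEMPLATE}\n{WHERE}"


def insert_update():
    """INSERT update built from the same template and WHERE clause."""
    return f"{PREFIX}INSERT {TEMPLATE}\n{WHERE}"


def build_update_request(update, endpoint=UPDATE_ENDPOINT):
    """Build, but do not send, a POST carrying the update as its body."""
    return urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


request = build_update_request(insert_update())
print(preview_query())
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would execute it. That step is left out so the sketch runs without a server.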
25. SELECT * WHERE { ?person rdf:type ab:Person }
SELECT ?child ?predicate ?object WHERE {
?person rdf:type ab:Person .
?person ab:lastName "Smith" .
?person ab:child ?child .
?child ?predicate ?object .
}
Lots of options:
Can use FILTER, regex, data type tests, IN lists, LIMIT on the result count, OFFSET into the results, ORDER BY for sorting, CONCAT, etc.
DuCharme, Bob (2011-07-14). Learning SPARQL. O'Reilly Media - A. Kindle Edition.
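To see how the triple patterns in the second query join on shared variables, here is a toy emulation in plain Python. It is not a SPARQL engine. The sample data is invented, and prefixed names are kept as plain strings for brevity.

```python
# Invented sample data; a real store would hold full URIs and typed literals.
TRIPLES = {
    ("d:i1", "rdf:type", "ab:Person"),
    ("d:i1", "ab:lastName", "Smith"),
    ("d:i1", "ab:child", "d:i2"),
    ("d:i2", "ab:firstName", "Ann"),
    ("d:i3", "rdf:type", "ab:Person"),
    ("d:i3", "ab:lastName", "Jones"),
}


def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None plays the role of a variable."""
    return [
        t for t in TRIPLES
        if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])
    ]


def children_of_smiths():
    """Emulate the ?child ?predicate ?object query, one pattern at a time."""
    rows = []
    for person, _, _ in match(p="rdf:type", o="ab:Person"):
        if not match(s=person, p="ab:lastName", o="Smith"):
            continue  # ?person must also satisfy the lastName pattern
        for _, _, child in match(s=person, p="ab:child"):
            for _, predicate, obj in match(s=child):
                rows.append((child, predicate, obj))
    return sorted(rows)


print(children_of_smiths())
```

Each nested loop corresponds to one triple pattern, and reusing a bound name such as person or child is what makes the patterns join.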
26. DELETE and INSERT behave similarly
Specific Delete
DELETE DATA { d:i8301 ab:name "Tommy_Potter" .
d:i8301 ab:homeTel "(718) 440-9821" }
Test dynamic DELETE first with CONSTRUCT
CONSTRUCT { ?s ?p "Tommy_Potter" }
WHERE { ?s ?p "Tommy_Potter" }
And execute:
DELETE { ?s ?p "Tommy_Potter" }
WHERE { ?s ?p "Tommy_Potter" }
DuCharme, Bob (2011-07-14). Learning SPARQL. O'Reilly Media - A. Kindle Edition.
27. Uses DELETE and INSERT together
DELETE { ?s ab:email ?o }
INSERT { ?s foaf:mbox ?o }
WHERE { ?s ab:email ?o }
Alternative using RDF transactions follows
DuCharme, Bob (2011-07-14). Learning SPARQL. O'Reilly Media - A. Kindle Edition.
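The effect of the combined DELETE and INSERT can be emulated on an in-memory set of triples. This is a sketch with invented sample data, not how a triplestore implements it.

```python
def rename_predicate(triples, old, new):
    """Emulate DELETE { ?s old ?o } INSERT { ?s new ?o } WHERE { ?s old ?o }."""
    matched = {t for t in triples if t[1] == old}      # bindings from WHERE
    inserted = {(s, new, o) for s, _, o in matched}    # INSERT template
    return (triples - matched) | inserted              # delete, then insert


# Invented sample data for illustration.
store = {
    ("d:i8301", "ab:email", "tommy@example.org"),
    ("d:i8301", "ab:firstName", "Tommy"),
}
store = rename_predicate(store, "ab:email", "foaf:mbox")
print(sorted(store))
```

Both templates are filled from the same WHERE bindings, so every ab:email triple is replaced by exactly one foaf:mbox triple and nothing else changes.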
28. Transactions, depends on database and SPARQL version supported
Some graph databases are ACID compliant
▪ Atomicity, Consistency, Isolation, and Durability
Query with Reasoning
Federated Query with SPARQL SERVICE
Using specific or multiple GRAPHS
29. Transactions depend on the database; this is the format I use with AllegroGraph
<transaction>
<add>
<bnode>person4</bnode>
<uri>http://www.w3.org/1999/02/22-rdf-syntax-ns#type</uri>
<uri>http://www.franz.com/simple#person</uri>
</add>
<add>
<bnode>person4</bnode>
<uri>http://www.franz.com/simple#birth</uri>
<literal datatype="http://www.w3.org/2001/XMLSchema#date">1917-05-29</literal>
</add>
<remove>
<null/>
<uri>http://www.franz.com/simple#first-name</uri>
<null/>
</remove>
<clear>
<uri>http://franz.com/simple#context1</uri>
</clear>
</transaction>
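A transaction document like the one above can be assembled programmatically instead of by string concatenation. The sketch below uses Python's standard xml.etree.ElementTree and reproduces only the element names shown on the slide. The helper names are made up for the example, and how the document is submitted to the server is left out.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SIMPLE = "http://www.franz.com/simple#"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def term(kind, text=None, **attrib):
    """Create one term element: bnode, uri, literal or null (hypothetical helper)."""
    element = ET.Element(kind, **attrib)
    element.text = text
    return element


def statement(parent, action, subject, predicate, obj):
    """Append an <add> or <remove> element holding one statement pattern."""
    node = ET.SubElement(parent, action)
    node.extend([subject, predicate, obj])
    return node


tx = ET.Element("transaction")
statement(tx, "add", term("bnode", "person4"),
          term("uri", RDF_TYPE), term("uri", SIMPLE + "person"))
statement(tx, "add", term("bnode", "person4"),
          term("uri", SIMPLE + "birth"),
          term("literal", "1917-05-29", datatype=XSD_DATE))
# <null/> acts as a wildcard: remove every first-name statement.
statement(tx, "remove", term("null"),
          term("uri", SIMPLE + "first-name"), term("null"))

print(ET.tostring(tx, encoding="unicode"))
```

Building the tree this way keeps the document well-formed and escapes literal values automatically.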
30. Depends on database
AllegroGraph: with an HTTP query, simply specify &infer=true
rdfs++ reasoning
▪ rdf:type and rdfs:subClassOf
▪ rdfs:range and rdfs:domain
▪ rdfs:subPropertyOf
▪ owl:sameAs
▪ owl:inverseOf
▪ owl:TransitiveProperty
Queries take longer: well-designed queries will be subsecond, but times can be unpredictable and orders of magnitude longer
AllegroGraph now has a Materializer that applies a set of rules to generate triples and places them in the store
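As a minimal sketch of the HTTP approach, the helper below appends infer=true to a query URL as described on the slide. The repository URL is hypothetical, and the values accepted for infer may differ between server versions, so check your server's documentation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical repository URL; use the one your server reports.
REPOSITORY_URL = "http://localhost:10035/repositories/demo"


def query_url(sparql, infer=False, base=REPOSITORY_URL):
    """Return a GET URL for the query, adding infer=true when requested."""
    params = {"query": sparql}
    if infer:
        params["infer"] = "true"  # value taken from the slide
    return f"{base}?{urlencode(params)}"


url = query_url("SELECT * WHERE { ?person rdf:type ab:Person }", infer=True)
print(url)
print(parse_qs(urlsplit(url).query)["infer"])
```

Reasoning stays a per-request switch this way, so the same query can be timed with and without inference.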
31. Common newbie challenge is metadata
Where did the data come from?
When was it loaded?
Who created the data?
Data Provenance
From Triples
hr:Mark rdf:type ho:Person .
hr:Mark ho:lastName "van Berkel" .
hr:Mark ho:founded hr:HunchManifestInc .
32. To Quads
hr:Mark rdf:type ho:Person hr:Context123 .
hr:Mark ho:lastName "van Berkel" hr:Context123 .
hr:Mark ho:founded hr:HunchManifestInc hr:Context123 .
hr:Context123 ho:source "Mark's head" .
hr:Context123 ho:createdOn “2013-03-25”^^xsd:date .
hr:Context123 ho:createdBy hr:Mark .
Now you can SELECT and FILTER by source,
createdOn, createdBy.
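The filtering idea can be sketched with plain tuples: each statement carries a fourth element naming its context, and the metadata about that context decides which statements to keep. The second context and its values are invented for the example.

```python
from datetime import date

# (subject, predicate, object, context); hr:Context456 is invented.
QUADS = [
    ("hr:Mark", "rdf:type", "ho:Person", "hr:Context123"),
    ("hr:Mark", "ho:lastName", "van Berkel", "hr:Context123"),
    ("hr:Mark", "ho:founded", "hr:HunchManifestInc", "hr:Context123"),
    ("hr:Mark", "ho:lastName", "Berkel", "hr:Context456"),
]

# Metadata about each context, mirroring ho:source, ho:createdOn, ho:createdBy.
CONTEXT_INFO = {
    "hr:Context123": {"source": "Mark's head",
                      "createdOn": date(2013, 3, 25),
                      "createdBy": "hr:Mark"},
    "hr:Context456": {"source": "web scrape",
                      "createdOn": date(2012, 1, 1),
                      "createdBy": "hr:Crawler"},
}


def statements_where(keep):
    """Return triples whose context metadata satisfies the keep() test."""
    return [quad[:3] for quad in QUADS if keep(CONTEXT_INFO[quad[3]])]


recent = statements_where(lambda ctx: ctx["createdOn"] >= date(2013, 1, 1))
by_mark = statements_where(lambda ctx: ctx["createdBy"] == "hr:Mark")
print(recent)
```

In a quad store the same selection would be a query that joins on the graph name and filters on its provenance properties.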
33. Depends on technology
How I solve it with AllegroGraph
Authorized for DoD .mil network
Transport Layer security / encryption
HTTP or HTTPS
SSL handshake
Management of Access Control for various admin functions
User / Role Management
Triple/Quad Level Security
Fine grained flexible access for read / write
Restricts access according to security filters, user views
34. Linked Data Publishing covered last month by James
5 Stars for publishing
★ make your stuff available on the Web (whatever format) under an open license
★★ make it available as structured data (e.g., Excel instead of image scan of a table)
★★★ use non-proprietary formats (e.g., CSV instead of Excel)
★★★★ use URIs to denote things, so that people can point at your stuff
★★★★★ link your data to other data to provide context
A user-view of the database
35. Cloud platform providers are great for getting started
Azure, Amazon Web Services, etc
How to get data in?
Use RDFizers or google “XYZ to RDF”
OpenLink Virtuoso, great middleware
I like Mule ESB and Talend
▪ big data integration, open source software
Plan ahead and make it a Service Oriented Architecture
36. Platforms, Open source vs Proprietary
Tradeoffs include
Cost
Licenses
Support
Documentation
Integration with other tools
Inactive vs Active Development
Scalability
37. Range from Hosting, Open Source Projects, eCommerce standards, Graph databases, data integration, numerous code libraries.
More tools than I can list:
http://www.w3.org/2001/sw/wiki/Tools
http://semanticweb.org/wiki/Category:Tool
A couple of platforms to get going
Callimachus
Drupal with its RDFa plugin
Get a graph database / triplestore
Franz AllegroGraph, Neo4j, Bigdata, 4store, Jena, etc
Open source tools
Protégé, modeling tool
Apache ...
38. Respect to Apache: 0 employees & 2,663 volunteers:
www.any23.org – any URI to triples
jena.apache.org – Framework for building SemWeb Apps, APIs, graph storage, server, etc
stanbol.apache.org – set of components for Semantic Content Enhancement
incubator.apache.org/clerezza/ – Semantically Linked Data through RESTful Web Services
lucene.apache.org/solr/ – Search server which can use Triplestores
39. Scaling graph databases is a recent challenge
Horizontal, across clusters of machines
Vertical, e.g. supercomputers, YarcData's uRiKA
Semantic Web Stack
RDF, SPARQL, RDFS, OWL fairly mature
OWL and Rules standardized but not easy
Logic, Proof, Trust not mature
Finding people with Experience
Predicting a Return on Investment