In this webinar Thomas Cook, Sales Director, AnzoGraph DB, provides a history lesson on the origins of SPARQL, including its roots in the Semantic Web, and how linked open data is used to create Knowledge Graphs. Then, he dives into "What is RDF?", "What is a URI?" and "What is SPARQL?", wrapping up with a real-world demonstration via a Zeppelin notebook.
3. Origins of The Semantic Web
“The Semantic Web is an extension of the current web in which
information is given well-defined meaning, better enabling
computers and people to work in cooperation.”
Tim Berners-Lee, James Hendler, Ora Lassila: The Semantic Web, Scientific American, 284(5), pp. 34-43 (2001)
9. RDF (Resource Description Framework) is the data model of the Semantic Web. That means
that all data in Semantic Web technologies is represented as RDF.
RDF's simple data model and ability to model disparate, abstract concepts have also led to its
increasing use in knowledge management applications unrelated to Semantic Web activity.
What is RDF?
At the most atomic level,
RDF is made of triples.
A “triple” is a single fact with three parts: subject, predicate, and object.
E.g. “The Sky is Blue”
Subject: Sky
Predicate: Color
Object: Blue
https://en.wikipedia.org/wiki/Resource_Description_Framework
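As a rough illustration of the triple idea, here is a minimal Python sketch. The `Triple` class and its field names are invented for this example and are not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact: a subject, a predicate, and an object."""
    subject: str
    predicate: str
    obj: str


# "The Sky is Blue" from the slide, written as a single triple.
sky = Triple(subject="Sky", predicate="Color", obj="Blue")

print(f"{sky.subject} --{sky.predicate}--> {sky.obj}")
```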
10. RDF is not like the tabular data model of relational databases. Nor is it like the
trees of the XML world. Instead, RDF is a graph.
It’s a labeled, directed graph.
RDF Graph (figure): Alice --drives--> Tesla, Alice --friend_of--> Bill, Alice --resident_of--> Austin
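To make "labeled, directed graph" concrete, the sketch below stores the slide's three edges as plain tuples and builds an adjacency list from them. This is only one possible in-memory layout, chosen for illustration; the variable names are assumptions.

```python
from collections import defaultdict

# The three edges from the slide, as (subject, predicate, object) triples.
triples = [
    ("Alice", "drives", "Tesla"),
    ("Alice", "friend_of", "Bill"),
    ("Alice", "resident_of", "Austin"),
]

# Adjacency list: each subject maps to its outgoing (label, target) edges.
outgoing = defaultdict(list)
for s, p, o in triples:
    outgoing[s].append((p, o))

# Edges are directed: they leave Alice, and the predicate is the edge label.
for label, target in outgoing["Alice"]:
    print(f"Alice --{label}--> {target}")
```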
11. <Alice> <drives> <Tesla> .
<Alice> <friend_of> <Bill> .
<Alice> <resident_of> <Austin> .
<Tesla> <color> "blue" .
RDF Serializations – Turtle, N-Triples, RDFa, JSON-LD
(Figure: the same four triples drawn as a graph. Alice is linked to Tesla, Bill, and Austin; Tesla is linked by color to the literal "blue".)
12. Resource nodes: A resource is anything that can have things said about it. It’s easiest to think of a
resource as a thing rather than a value. In a visual representation, resources are represented by ovals.
Literal nodes: The term literal is a fancy word for value. In a visual representation, literals are
represented by rectangles.
Blank nodes: A blank node is a resource that has no URI of its own.
3 Types of Nodes
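The three node types can often be told apart by how a term is written. The sketch below assumes the slide's notation (angle brackets for resources, straight double quotes for literals) plus the `_:` prefix that Turtle and N-Triples use for blank-node labels. The function name `node_type` and the label `_:b0` are made up for the example.

```python
def node_type(term: str) -> str:
    """Classify a term by its written form."""
    if term.startswith("<") and term.endswith(">"):
        return "resource"
    if term.startswith('"'):
        return "literal"
    if term.startswith("_:"):
        return "blank"
    raise ValueError(f"unrecognized term: {term!r}")


for term in ["<Alice>", '"blue"', "_:b0"]:
    print(term, "is a", node_type(term), "node")
```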
13. <Alice>
Expressed as a full URI would look something more like:
<http://example.com/resource/person#Alice>
And
<drives>
Would be more like:
<http://example.com/resource/person#drives>
URIs – Uniform Resource Identifier
How can we uniquely ID resources universally? Add a URL to the start of your ID.
<http://example.com/resource/person#Alice> <http://example.com/resource/person#drives> <http://example.com/resource/car#Tesla> .
<http://example.com/resource/person#Alice> <http://example.com/resource/person#friend_of> <http://example.com/resource/person#Bill> .
<http://example.com/resource/person#Alice> <http://example.com/resource/person#resident_of> <http://example.com/resource/person#Austin> .
<http://example.com/resource/car#Tesla> <http://example.com/resource/car#color> "blue" .
14. SPARQL PREFIX abbreviation
BEFORE:
<http://example.com/resource/person#Alice> <http://example.com/resource/person#drives> <http://example.com/resource/car#Tesla
<http://example.com/resource/person#Alice> <http://example.com/resource/person#friend_of> <http://example.com/resource/person
<http://example.com/resource/person#Alice> <http://example.com/resource/person#resident_of> <http://example.com/resource#Aus
<http://example.com/resource#Tesla> <http://example.com/resource#color> "blue" .
With PREFIX we can get a much shorter representation with abbreviations.
AFTER:
PREFIX tslap: <http://example.com/resource/person#>
PREFIX tslar: <http://example.com/resource#>
PREFIX tslac: <http://example.com/resource/car#>
tslap:Alice tslap:drives tslac:Tesla .
tslap:Alice tslap:friend_of tslap:Bill .
tslap:Alice tslap:resident_of tslar:Austin .
tslar:Tesla tslar:color "blue" .
Same as the triples below without URIs, but now universally uniquely identified:
<Alice> <drives> <Tesla> .
<Alice> <friend_of> <Bill> .
<Alice> <resident_of> <Austin> .
<Tesla> <color> "blue" .
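A prefixed name is just shorthand: the prefix is swapped for its namespace URI and the local part is appended. The sketch below shows that expansion using the slide's tslap and tslar prefixes. The helper name `expand` is hypothetical, and real parsers handle more cases (escapes, an empty prefix, base URIs) than this does.

```python
PREFIXES = {
    "tslap": "http://example.com/resource/person#",
    "tslar": "http://example.com/resource#",
}


def expand(name: str, prefixes: dict = PREFIXES) -> str:
    """Expand a prefixed name such as 'tslap:Alice' into a full URI."""
    prefix, _, local = name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + local


print(expand("tslap:Alice"))   # http://example.com/resource/person#Alice
print(expand("tslar:Austin"))  # http://example.com/resource#Austin
```

Using a prefix that was never declared raises an error here, which mirrors what a SPARQL or Turtle parser would report.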
15. PREFIX Short For:
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
owl: http://www.w3.org/2002/07/owl#
xsd: http://www.w3.org/2001/XMLSchema#
dc: http://purl.org/dc/elements/1.1/
foaf: http://xmlns.com/foaf/0.1/
Common Prefixes
More common prefixes at http://prefix.cc
16. SPARQL stands for:
SPARQL Protocol And RDF Query Language
A query language and a protocol
What is SPARQL?
A SPARQL QUERY:
SELECT …
FROM ….
WHERE { … }
GROUP BY …
ORDER BY …
SELECT – identifies the values to return
FROM – selects the dataset to query
WHERE – the graph patterns to match
GROUP BY – the variables to group on when aggregating
ORDER BY – orders the result set
17. INSERT DATA { GRAPH <test1> {
<Alice> <drives> <Tesla> .
<Alice> <friend_of> <Bill> .
<Alice> <resident_of> <Austin>.
<Tesla> <color> "blue" .
}
}
Let’s INSERT some data
18. SELECT (count(*) as ?count)
FROM <test1>
WHERE {
?s ?p ?o .
}
RESULT:
count
--------
4
Let’s count how many triples are in the graph
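One way to picture the INSERT and COUNT steps is to model the graph as a Python set of tuples. This is only an analogy, not how a graph database stores data, and the names `test1` and `insert_data` are chosen here to echo the slides.

```python
# A named graph modeled as a set of (subject, predicate, object) tuples.
test1 = set()


def insert_data(graph, triples):
    """Rough analogue of INSERT DATA: add triples, ignoring duplicates."""
    graph.update(triples)


insert_data(test1, [
    ("Alice", "drives", "Tesla"),
    ("Alice", "friend_of", "Bill"),
    ("Alice", "resident_of", "Austin"),
    ("Tesla", "color", "blue"),
])

# Re-inserting an existing triple leaves the graph unchanged,
# because an RDF graph is a set of triples.
insert_data(test1, [("Alice", "drives", "Tesla")])

print(len(test1))  # 4, the analogue of SELECT (count(*) as ?count)
```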
19. SELECT ?s ?p ?o
FROM <test1>
WHERE {
?s ?p ?o .
}
Show all the triples
20. SELECT ?s
FROM <test1>
WHERE {
?s <drives> <Tesla> .
}
RESULT:
s
-------
Alice
1 rows
Use graph patterns to find data you want
Who drives a Tesla?
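Matching a single triple pattern can be sketched in a few lines of Python: terms starting with `?` are variables that bind to whatever value is in that position, and every other term must match exactly. The function `match` is a toy written for this illustration, not an engine API.

```python
graph = [
    ("Alice", "drives", "Tesla"),
    ("Alice", "friend_of", "Bill"),
    ("Alice", "resident_of", "Austin"),
    ("Tesla", "color", "blue"),
]


def match(graph, pattern):
    """Return one {variable: value} dict per triple that fits the pattern."""
    solutions = []
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            solutions.append(binding)
    return solutions


# Who drives a Tesla?
print(match(graph, ("?s", "drives", "Tesla")))  # [{'?s': 'Alice'}]
```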
21. SELECT ?s
FROM <test1>
WHERE {
?s <drives> ?car .
?car <color> "blue" .
}
Who drives a blue car?
Join operation
Use graph patterns to match data in the graph
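The join comes from the shared variable `?car`: a solution survives only if both patterns can be satisfied with the same value for it. The sketch below shows one naive way to evaluate that; the helpers `extend` and `query` are invented for illustration, and real engines use far more efficient join strategies.

```python
graph = [
    ("Alice", "drives", "Tesla"),
    ("Alice", "friend_of", "Bill"),
    ("Alice", "resident_of", "Austin"),
    ("Tesla", "color", "blue"),
]


def extend(binding, pattern, triple):
    """Extend a binding so the pattern equals the triple; None on conflict."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(graph, patterns):
    """Evaluate the patterns in order; every one must match (a join)."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            ext
            for binding in solutions
            for triple in graph
            if (ext := extend(binding, pattern, triple)) is not None
        ]
    return solutions


# Who drives a blue car?
blue = query(graph, [("?s", "drives", "?car"), ("?car", "color", "blue")])
print(blue)  # [{'?s': 'Alice', '?car': 'Tesla'}]
```

If any one pattern has no match (for example a predicate that is absent from the graph), the whole result is empty.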
23. SELECT ?s ?color ?year
FROM <test1>
WHERE {
?s <drives> ?car .
?car <color> ?color .
?car <year> ?year .
}
RESULT? No results. Why? <year> does not exist in our graph.
Every graph pattern in the WHERE clause must match for a result to be returned.
Who drives a car, and what are its color and year?
24. SELECT ?s ?color ?year
FROM <test1>
WHERE {
?s <drives> ?car .
?car <color> ?color .
OPTIONAL{?car <year> ?year . }
}
RESULT:
s | color | year
-------+-------+------
Alice | blue |
1 rows
Who drives a car, and what are its color and (if known) year?
Use OPTIONAL for graph patterns that might not exist
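OPTIONAL behaves like a left join: a solution is kept even when the optional pattern finds nothing, and the optional variable is simply left unbound. The Python sketch below imitates that behavior. The helpers `extend`, `join`, and `left_join` are hypothetical names for this example only.

```python
graph = [
    ("Alice", "drives", "Tesla"),
    ("Alice", "friend_of", "Bill"),
    ("Alice", "resident_of", "Austin"),
    ("Tesla", "color", "blue"),
]


def extend(binding, pattern, triple):
    """Extend a binding so the pattern equals the triple; None on conflict."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def join(solutions, pattern):
    """Required pattern: solutions without a match are dropped."""
    return [e for b in solutions for t in graph
            if (e := extend(b, pattern, t)) is not None]


def left_join(solutions, pattern):
    """Optional pattern: solutions without a match are kept unchanged."""
    result = []
    for b in solutions:
        matches = join([b], pattern)
        result.extend(matches or [b])
    return result


required = join(join([{}], ("?s", "drives", "?car")), ("?car", "color", "?color"))
rows = left_join(required, ("?car", "year", "?year"))
for row in rows:
    # ?year is unbound, so .get returns None
    print(row["?s"], row["?color"], row.get("?year"))
```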
25. • ORDER BY: This modifier sorts the result set in a particular order. It sorts query solutions on the
value of one or more variables.
• OFFSET: Using this modifier in conjunction with LIMIT and ORDER BY returns a slice of a sorted
solution set, for example, for paging.
• LIMIT: This modifier restricts the results to return a certain number of solutions.
• GROUP BY: This modifier is used with aggregate functions and specifies the key variables to use
to partition the solutions into groups. For information about AnzoGraph GROUP BY clause
extensions, see Advanced Grouping Sets.
• HAVING: This modifier is used with aggregate functions and further filters the results after
applying the aggregates.
SPARQL SELECT, like SQL, has several solution modifiers
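For paging, ORDER BY, OFFSET, and LIMIT amount to a sort followed by a slice. The sketch below uses made-up solution rows (the names and years are not from the webinar) to show the idea; the function name `page` is an assumption.

```python
# Hypothetical solution rows; names and years are invented for illustration.
rows = [
    {"s": "Alice", "year": 2019},
    {"s": "Bill", "year": 2015},
    {"s": "Carol", "year": 2021},
    {"s": "Dave", "year": 2017},
]


def page(rows, key, offset, limit):
    """ORDER BY key, then OFFSET and LIMIT, as a sort followed by a slice."""
    ordered = sorted(rows, key=lambda r: r[key])
    return ordered[offset:offset + limit]


# Second page of two rows, ordered by year.
second_page = page(rows, "year", offset=2, limit=2)
print([r["s"] for r in second_page])  # ['Alice', 'Carol']
```

Without the sort, the slice would not be stable from one run to the next, which is why OFFSET and LIMIT are normally paired with ORDER BY.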
26. The built-in SPARQL aggregate functions:
AVG: Calculates the average value for a numeric expression.
COUNT: Counts the number of values bound to the given
variable.
GROUP_CONCAT: Performs a string concatenation of all of the values that are
bound to the given variable.
MAX: Returns the maximum value from the specified set of values.
MIN: Returns the minimum value from the specified set of values.
SAMPLE: Returns an arbitrary value from the specified set of values.
SUM: Adds the specified values.
Aggregate Functions
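The aggregates, together with GROUP BY and HAVING, can be imitated with ordinary Python. The (city, year) rows below are invented for illustration, and SAMPLE is shown as "first value seen", which is just one valid choice since any value from the group is allowed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (city, year) solutions; the data is made up for this sketch.
rows = [("Austin", 2019), ("Austin", 2015), ("Dallas", 2021)]

# GROUP BY city: partition the values by the grouping key.
groups = defaultdict(list)
for city, year in rows:
    groups[city].append(year)

summary = {
    city: {
        "COUNT": len(years),
        "SUM": sum(years),
        "AVG": mean(years),
        "MIN": min(years),
        "MAX": max(years),
        "SAMPLE": years[0],
        # SPARQL's default GROUP_CONCAT separator is a single space.
        "GROUP_CONCAT": " ".join(str(y) for y in years),
    }
    for city, years in groups.items()
}

# HAVING: filter the groups after the aggregates have been computed.
having = {city: agg for city, agg in summary.items() if agg["COUNT"] > 1}

print(summary["Austin"])
print(list(having))  # ['Austin']
```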
27. There are four standard SPARQL query forms:
SELECT: Run SELECT queries when you want to find and return all of the data that
matches certain patterns.
CONSTRUCT: Run CONSTRUCT queries when you want to create or transform data
based on the existing data.
ASK: Run ASK queries when you want to know whether a certain pattern exists in the
data. ASK queries return only "true" or "false" to indicate whether a solution exists.
DESCRIBE: Run DESCRIBE queries when you want to view the RDF graph that
describes a particular resource.
Query Forms
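To contrast two of the forms, here is a small Python analogy over the same four example triples: ASK returns a boolean, while CONSTRUCT returns new triples built from matches. The function names and the "friendship is mutual" rule are assumptions made up for this sketch.

```python
graph = {
    ("Alice", "drives", "Tesla"),
    ("Alice", "friend_of", "Bill"),
    ("Alice", "resident_of", "Austin"),
    ("Tesla", "color", "blue"),
}


def ask(graph, s=None, p=None, o=None):
    """ASK analogue: True if any triple fits; None acts as a wildcard."""
    return any(
        (s is None or s == ts) and (p is None or p == tp) and (o is None or o == to)
        for ts, tp, to in graph
    )


def construct_mutual_friends(graph):
    """CONSTRUCT analogue: for each ?a friend_of ?b, build ?b friend_of ?a."""
    return {(o, "friend_of", s) for s, p, o in graph if p == "friend_of"}


print(ask(graph, p="drives", o="Tesla"))  # True
print(ask(graph, p="year"))               # False
print(construct_mutual_friends(graph))    # {('Bill', 'friend_of', 'Alice')}
```

The constructed triples are a query result, not an update: the original graph is left untouched unless the result is inserted separately.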