SlideShare una empresa de Scribd logo
1 de 79
Descargar para leer sin conexión
Overview of the SPARQL-Generate
language and latest developments
Maxime Lefrançois
Maxime.Lefrancois@emse.fr
http://maxime-lefrancois.info/
MINES Saint-Étienne – Institut Henri Fayol
Laboratoire Hubert Curien UMR CNRS 5516
2008 20102005 2014 20172009
2
Ph.D. KRR applied to linguistics
Agrégation in mechanical engineering
M2 Signal processing
M2 Computer science
Semantic Web and interoperabiliity
Knowledge
representation and
reasoning Semantic Web and
Web of data
Semantic
Interoperability on the
Web of Thing
Maxime Lefrançois
Maxime.Lefrancois@emse.fr
http://maxime-lefrancois.info/
MINES Saint-Étienne – Institut Henri Fayol
Laboratoire Hubert Curien UMR CNRS 5516
Connected Intelligence Team
Hubert Curien Laboratory
https://connected-intelligence.univ-st-etienne.fr/
13 permanent staff members – 11 Ph.D. students (2 open pos. 2020) – 1 postdoc (2 open pos. now)
Contact: Olivier.Boissier@emse.fr
Saint-Étienne
3
Saint-Étienne
4
5
How to reach semantic interoperability at the data
level between heterogeneous things and services?
Data
Services
HumansSmart Things
6
There is a multitude of data formats/models on the Web
7
M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats,
In Proc. Extended Semantic Web Conference, 2017
+ EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial 
Key step: generate some RDF
Requirements for RDF generation
8
M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats,
In Proc. Extended Semantic Web Conference, 2017
+ EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial 
- Transform multiple sources …
- ... having heterogeneous formats
- Be extensible to new data formats
- Be easy to use by Semantic Web experts
- Integrate in a typical semantic web engineering workflow
- Be flexible and easily maintainable
- Contextualize the transformation with an RDF Dataset
- Modularize / decouple / compose transformations
- Enable data transformation, aggregates, filters, ...
- Generate RDF Lists
- Performant
- Transform binary formats as well as textual formats
- Transform big documents (output HDT)
- Generate RDF streams from streams
Existing approaches
9
RDFizers (see https://www.w3.org/wiki/ConverterToRdf )
- A lot of tools are specific to one or a few formats
(44 referenced formats)
- Some frameworks support several/many formats
ad hoc methods,
little or no control on the structure of the output
=> may require an additional transformation
M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats,
In Proc. Extended Semantic Web Conference, 2017
+ EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial 
Existing approaches
10
Approaches based on mapping/transformation languages
M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats,
In Proc. Extended Semantic Web Conference, 2017
+ EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial 
<html xmlns=http://www.w3.org/1999/xhtml
xmlns:grddl='http://www.w3.org/2003/g/data-view#'
grddl:transformation="http://example.com/getAuthor.xsl" >
<head>
<title>Are You Experienced?</title>
[...]
</html>
<album xmlns:grddl='http://www.w3.org/2003/g/data-view#'
grddl:transformation="http://example.org/getAlbum.xsl" >
<artist mbid="">The Jimi Hendrix Experience</artist>
<name>Are You Experienced?</name>
...
</album>
GRDDL
11
(W3C REC 2007)
12
XSPARQL
(W3C member submission 2009)
13
R2RML
(W3C REC 2012)
<#TriplesMap2>
rr:logicalTable <#DeptTableView>;
rr:subjectMap [
rr:template "http://data.example.com/department/{DEPTNO}";
rr:class ex:Department;
];
rr:predicateObjectMap [
rr:predicate ex:name;
rr:objectMap [ rr:column "DNAME" ];
];
rr:predicateObjectMap [
rr:predicate ex:location;
rr:objectMap [ rr:column "LOC" ];
];
rr:predicateObjectMap [
rr:predicate ex:staff;
rr:objectMap [ rr:column "STAFF" ];
].
{
"@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
"url": "tree-ops.csv",
"dc:title": "Tree Operations",
"dcat:keyword": ["tree", "street", "maintenance"],
"dc:publisher": {
"schema:name": "Example Municipality",
"schema:url": {"@id": "http://example.org"}
},
"dc:license": {"@id": "http://opendefinition.org/licenses/cc-by/"},
"dc:modified": {"@value": "2010-12-31", "@type": "xsd:date"},
"tableSchema": {
"columns": [{
"name": "GID",
"titles": ["GID", "Generic Identifier"],
"dc:description": "An identifier for the operation on a tree.",
"datatype": "string",
"required": true
}, {
"name": "on_street",
"titles": "On Street",
"dc:description": "The street that the tree is on.",
"datatype": "string"
}, {
"name": "species",
"titles": "Species",
"dc:description": "The species of the tree.",
"datatype": "string"
}, {
"name": "trim_cycle",
"titles": "Trim Cycle",
"dc:description": "The operation performed on the tree.",
"datatype": "string"
}, {
"name": "inventory_date",
"titles": "Inventory Date",
"dc:description": "The date of the operation that was performed.",
"datatype": {"base": "date", "format": "M/d/yyyy"}
}], 14
CSVW
(W3C REC 2015)
RML + Extensions
15
(Dimou et al., 2013)
16
- Transform multiple sources …
- ... having heterogeneous formats
- Be extensible to new data formats
- Be easy to use by Semantic Web experts
- Integrate in a typical semantic web engineering workflow
- Be flexible and easily maintainable
- Contextualize the transformation with an RDF Dataset
- Modularize / decouple / compose transformations
- Enable data transformation, aggregates, filters, ...
- Generate RDF Lists
- Performant
- Transform binary formats as well as textual formats
- Transform big documents (output HDT)
- Generate RDF streams from streams
- Subject-centric?
- Include parts from different sources in one RDF Term?
(Dimou et al., 2013)
RML + Extensions – covers the REQs?
Main research questions
How to design a mapping language that…
…is expressive enough to cover all of our use cases?
…is still rather simple to use?
…can be easily extended to any source format?
17
RDF generation process
18
RDF generation process
19
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
RDF generation process
20
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
RDF generation process
21
?
?
?
Graph pattern
definition
ex:salary
ex:Director
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
RDF generation process
22
?
?
?
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
Graph pattern
definition
ex:salary
ex:Director
+ Select ontologies
First SPARQL-Generate query
23
?
?
?
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
Graph pattern
definition
ex:salary
ex:Director
+ Select
ontologies
The Graph Pattern Definition is here
24
?
?
?
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
Graph pattern
definition
ex:salary
ex:Director
+ Select
ontologies
Query RDF Dataset + « Documentset »
25
?
?
?
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
Graph pattern
definition
ex:salary
ex:Director
+ Select
ontologies
Iterator Functions
26
?
?
?
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
Graph pattern
definition
ex:salary
ex:Director
+ Select
ontologies
Binding Functions (that’s standard SPARQL)
27
?
?
?
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
Graph pattern
definition
ex:salary
ex:Director
+ Select
ontologies
iter&fun functions have dereferenceable URIs
28
?
?
?
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
Graph pattern
definition
ex:salary
ex:Director
+ Select
ontologies
It extends SPARQL, so we got for free ...
29
?
?
?
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
Graph pattern
definition
ex:salary
ex:Director
+ Select
ontologies
• Implementable on top of existing SPARQL engines
• Easy to learn when you know SPARQL
• You get to know SPARQL better when you use SPARQL-Generate
• Output template (Basic Graph Pattern) is not subject-centric
• SPARQL 1.1 FILTER BIND OPTIONAL,...
• SPARQL 1.1 Standard binding functions and operators
STR LANG LANGMATCHES DATATYPE BOUND IRI URI BNODE RAND ABS CEIL
FLOOR ROUND CONCAT SUBSTR STRLEN REPLACE UCASE LCASE
ENCODE_FOR_URI CONTAINS STRSTARTS STRENDS STRBEFORE STRAFTER YEAR
MONTH DAY HOURS MINUTES SECONDS TIMEZONE TZ NOW UUID STRUUID MD5
SHA1 SHA256 SHA384 SHA512 COALESCE IF STRLANG STRDT sameTerm isIRI
isURI isBLANK isLITERAL isNUMERIC REGEX + * / = > < || &&
• SPARQL 1.1 Binding functions extension mechanism
• SPARQL 1.1 ORDER LIMIT OFFSET
• SPARQL 1.1 GROUP BY, HAVING, Aggregate functions
COUNT SUM MIN MAX AVG SAMPLE GROUP_CONCAT
30
M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats,
In Proc. Extended Semantic Web Conference, 2017
+ EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial 
SPARQL-Generate is not just « theoretically cool »
https://w3id.org/sparql-generate/
Open-source Licence Apache 2.0 implementation based on Apache Jena
financed by ITEA, ANR, ENGIE
Expressive, Perfomant, extensible to other formats
API (available on Maven Central)
JAR
Web playground + tutorial
31
M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats,
In Proc. Extended Semantic Web Conference, 2017
+ EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial 
SPARQL-Generate is not just « theoretically cool »
https://w3id.org/sparql-generate/
Open-source Licence Apache 2.0 implementation based on Apache Jena
financed by ITEA, ANR, ENGIE
Expressive, Perfomant, extensible to other formats
API (available on Maven Central)
JAR
Web playground + tutorial
Sublime Text package
Functions have dereferenceable URIs
32
?
?
?
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
Graph pattern
definition
ex:salary
ex:Director
+ Select
ontologies
How are we going so far?
33
?
?
?
Selection patterns
Xpath, JSONpath, CSS selectors, regex, etc.
Graph pattern
definition
ex:salary
ex:Director
+ Select
ontologies
- Transform multiple sources …
- ... having heterogeneous formats
- Be extensible to new data formats
- Be easy to use by Semantic Web experts
- Integrate in a typical semantic web engineering workflow
- Be flexible and easily maintainable
- Contextualize the transformation with an RDF Dataset
- Modularize / decouple / compose transformations
- Enable data transformation, aggregates, filters, ...
- Generate RDF Lists
- Performant
- Transform binary formats as well as textual formats
- Transform big documents (output HDT)
- Generate RDF streams from streams
- Subject-centric?
- Include parts from different sources in one literal?
There’s more than
SOURCE and ITERATOR ...
34
What I didn’t tell yet...
Iterators can bind several variables
35
Iterators can be more powerful than that
36
Syntactic sugar helps to compact queries
37
Iterators work in batches
38
Iterators work in batches
39
Generate clauses can be nested
40
Generate queries can be modularized
41
One can generate RDF lists easily*
42
* NEW!
One can generate HDT directly*
43
20 seconds for a 20 MB gene2go sample, output is 17 MB of HDT
9 min, 20 sec for a 145 MB gene2go sample, output is "only" 114 MB HDT
* NEW!
There’s more than
GENERATE ...
44
What I didn’t tell yet...
45
FUNCTION
Functions on RDF terms, RDF graphs or SPARQL results
Subset of LDScript: a Linked Data Script Language
http://ns.inria.fr/sparql-extension/
Olivier Corby, Catherine Faron-Zucker and Fabien Gandon, LDScript: a Linked Data Script Language, ISWC 2017, spotlight paper
46
TEMPLATE
Generate Text from RDF
Extended subset of STTL: the SPARQL Template Transformation Language
https://ns.inria.fr/sparql-template/
Olivier Corby and Catherine Faron-Zucker. A Transformation Language for RDF based on SPARQL. Web Information Systems
and Technologies - Selected Extended Papers from WEBIST 2015. Springer-Verlag, 2015. Best paper nominee.
• Elements FORMAT GROUP BOX
• Binding function st:call-template(templateURI, param1, param2, ...)
47
TEMPLATE
Generate Text from RDF
Extended subset of STTL: the SPARQL Template Transformation Language
https://ns.inria.fr/sparql-template/
Olivier Corby and Catherine Faron-Zucker. A Transformation Language for RDF based on SPARQL. Web Information Systems
and Technologies - Selected Extended Papers from WEBIST 2015. Springer-Verlag, 2015. Best paper nominee.
• Elements FORMAT GROUP BOX also SPARQL-Generate clauses ITERATOR SOURCE
• Binding function st:call-template(templateURI, param1, param2, ...)
48
SELECT
SPARQL Select:
- with SPARQL-Generate clauses ITERATOR SOURCE
- with syntactic sugar, etc.
49
SELECT
SPARQL Select:
- with SPARQL-Generate clauses ITERATOR SOURCE
- with syntactic sugar, etc.
• SELECT queries output a list of solution bindings …
• … Like ITERATOR clauses
Use ITERATORS in sequence?
Let’s give you an idea of how an unoptimized query looks like
Use ITERATORS in sequence?
Let’s give you an idea of how an unoptimized query looks like
• GENERATE queries output RDF Graph …
• …
Calling a GENERATE query
• GENERATE queries output RDF Graph …
• … like FROM Clauses
FROM GENERATE {} and FROM GENERATE <>(param1, param2, …)
53
- Modularize, Decouple, Compose queries
GENERATE TEMPLATE
SELECT FUNCTION
can call
can call
can call
can call
There’s much more to do
than « just use it »
54
What I didn’t tell yet...
55
There’s much more to do than « just use it »
Two use cases that drove recent development
• Generation of documentation from an ontology
• Generation of an ontology from configuration (RDF or files)
Use it
• On the web, as JAR, in Sublime Text, as API
• Extend with your own functions and iterators (e.g., NER)
• Make students play with it
• Ask for help, contribute, report issues, propose enhancements
https://github.com/sparql-generate/sparql-generate/issues
56
There’s much more to do than « just use it »
Ideas
• How can SPARQL-Generate be compared to / used with RML?
• How can SPARQL-Generate be used by Helio?
• What query optimization techniques for SPARQL-Generate?
• Can we use (a subset of) SPARQL-Generate for OBDA?
• Can we qualify transformations w.r.t the input/output space?
• Can we identify/compute operators on these objects?
e.g., If input conforms to a schema, then does the output conforms to a SHACL shape?
What was the main purpose ...
57
What I didn’t tell yet...
58
Can we use RDF as a lingua franca?
59
Idea: just send RDF!
application/rdf+xml
text/turtle
application/ld+json
…
ERI
HDTQ
RDF data formats will never be the only ones on the web
Developers prefer JSON, …
60
Idea: just send RDF! JSON-LD!
Solution:
- to rely on *one* RDF syntax like JSON-LD
Requires:
- Global adoption of JSON-LD
- Maintaining during the development or the evolution phase
Abstract data model
(ontology)
+ JSON validation: JSON Schema
JSON-LD Context
+ RDF validation: SHACL/SHeX
61
Approx. modeling of the communication
between heterogeneous agents on the Web.
send
X
receivermediumsender
wants to send some content
e.g., state of a resource
M. Lefrançois, RDF presentation and correct content conveyance for legacy services and the web of things. In Proc. IOT'18
Generates a
representation of
the content
62
send
X
receivermedium
received
X
1 2
Transmits
the message via
the internet
3
Decodes and get
understanding
sender
wants to send some content
e.g., state of a resource
M. Lefrançois, RDF presentation and correct content conveyance for legacy services and the web of things. In Proc. IOT'18
Approx. modeling of the communication
between heterogeneous agents on the Web.
Generates a
representation of
the content
63
send
X
receivermedium
received
X
1 2
Transmits
the message via
the internet
3
Decodes and get
understanding
sender
Correct Content Conveyance iif:
1. all of the essential characteristics of the content is encoded in the message
2. the encoding and the decoding phase are symmetric
3. the message is not altered in the transmission medium
Approx. modeling of the communication
between heterogeneous agents on the Web.
Generates a
representation of
the content
64
Assumption:
content is always a RDF graph
send
receivermedium
received
1 2
Transmits
the message via
the internet
3
Decodes and get
understanding
sender
Correct Content Conveyance iif:
The RDF graph the sender encodes is equivalent to the RDF
graph the receiver obtained after decoding the message
65
Content Representation
RDF Graphs Typed octet streams
M. Lefrançois, RDF presentation and correct content conveyance for legacy services and the web of things. In Proc. IOT'18
66
RDF Presentations
RDF Graphs
RDF Presentation
Typed octet streams
67
Lifting, lowering, validating
RDF Graphs Typed octet streams
Validation
Representation validation
Lifting rule maps all the tos in p to their corresponding graph (unique)
Lowering rule maps all the graphs in p to a *chosen* corresponding typed stream
Validation rule checks if there is a pair with that graph
Representation validation rule checks if there is a pair with that tos
68
+ categorisation of tools
+ analysis of RDF formats
RDF Graphs
Validation
- SHACL
- ShEx
- SPIN
- ad hoc code...
Representation validation
- JSON Schema
- XML Schema
- ad hoc code...
Typed octet streams
M. Lefrançois, RDF presentation and correct content conveyance for legacy services and the web of things. In Proc. IOT'18
69
Usage scenarios on the Web
send
sender receiver
The sender can be…
Req/Res: client sends some content to the server
Req/Res: server responds to the client
Pub/Sub: client broadcasts some content to its subscribers
Both agents can also…
be constrained in some ways (battery, storage, comput,…)
implement some of the principles we devise (be semantically flexible)
rely on a third party to operate lifting/lowering/…
M. Lefrançois, RDF presentation and correct content conveyance for legacy services and the web of things. In Proc. IOT'18
70
Scenario 1: server/publisher sends its
message to a client/subscriber
client/
subscriber
How to lift ?
server/
publisher
receiver discovers how to lift (from the sender, or elsewhere)
sender can be constrained… / receiver can be constrained…
I lowered this way,
just lift accordingly
send
initiates interaction
71
Scenario 2: A client asks a server for the
representation of a resource
send
clientserver
server discovers a presentation suitable for the client
client can be constrained… / server can be constrained…
How to lower ?
I will lift this way,
please lower
accordingly
initiates interaction
72
Scenario 3: A client sends some encoded
content to a server
send
serverclient
client discovers a presentation suitable for the server
client can be constrained… / server can be constrained…
How to lower ?
initiates interaction
I will lift this way,
please lower
accordingly
Ontologies, langages, tools,
to ease the usage of RDF
as a lingua franca
Maxime Lefrançois
Maxime.Lefrancois@emse.fr
http://maxime-lefrancois.info/
MINES Saint-Étienne – Institut Henri Fayol
Laboratoire Hubert Curien UMR CNRS 5516
Take away message
1. RDF Transformation Language
2. Use it and contribute
3. Not just a tool
Saint-Étienne
74
Generates a
representation of
the content
75
send
receivermedium
received
1 2
Transmits
the message via
the internet
3
Decodes and get
understanding
sender
ontologies
Assumption:
content is always a RDF graph
76
Ontologies, contribution to standardization
SOSA/SSN ontology
Standard OGC et W3C
A. Haller, K. Janowicz, S. Cox, D. Le Phuoc, K. Taylor, M. Lefrançois, Semantic Sensor Network Ontology,
W3C Recommendation, W3C, 19 October 2017
77
Ontologies, contribution to standardization
BOT ontology
M. H. Rasmussen, M. Lefrançois, G. F. Schneider, P. Pauwels, BOT: the Building Topology Ontology of the W3C Linked
Building Data Group (submitted May 2019 to Semantic Web Journal)
78
Ontologies, contribution to standardization
SAREF + influences
SAREF patterns
+ instances
http://saref.etsi.org/@prefix seas: <http://w3id.org/seas/>.
Smart Grid
Core of SEAS
Ontology patterns
SimpleTransport
Smart Home
Building
HVAC
ETSI STF 556: Consolidation of SAREF and its community of industrial users, based on the experience of the EUREKA
ITEA 12004 SEAS
ETSI STF DP: Specification of the SAREF development framework and workflow, and development of the Community
SAREF Portal for user engagement
79
Ontologies, contribution to standardization
A work-in progress at international level
N. Seydoux, M. Lefrançois, L. Médini, . F. Schneider, P. Pauwels, Positionnement sur le Web Sémantique des Objets,
Ingénierie des Connaissances 2019 (best paper award)

Más contenido relacionado

La actualidad más candente

(Big) Data Serialization with Avro and Protobuf
(Big) Data Serialization with Avro and Protobuf(Big) Data Serialization with Avro and Protobuf
(Big) Data Serialization with Avro and Protobuf
Guido Schmutz
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
Cloudera, Inc.
 

La actualidad más candente (20)

XQuery
XQueryXQuery
XQuery
 
Spark SQL
Spark SQLSpark SQL
Spark SQL
 
(Big) Data Serialization with Avro and Protobuf
(Big) Data Serialization with Avro and Protobuf(Big) Data Serialization with Avro and Protobuf
(Big) Data Serialization with Avro and Protobuf
 
Simplify and Scale Data Engineering Pipelines with Delta Lake
Simplify and Scale Data Engineering Pipelines with Delta LakeSimplify and Scale Data Engineering Pipelines with Delta Lake
Simplify and Scale Data Engineering Pipelines with Delta Lake
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
 
DSpace-CRIS: a CRIS enhanced repository platform
DSpace-CRIS: a CRIS enhanced repository platformDSpace-CRIS: a CRIS enhanced repository platform
DSpace-CRIS: a CRIS enhanced repository platform
 
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...
 
RDF Data Model
RDF Data ModelRDF Data Model
RDF Data Model
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Customizing the look and-feel of DSpace
Customizing the look and-feel of DSpaceCustomizing the look and-feel of DSpace
Customizing the look and-feel of DSpace
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Apache Hive Tutorial
Apache Hive TutorialApache Hive Tutorial
Apache Hive Tutorial
 
Apache Calcite overview
Apache Calcite overviewApache Calcite overview
Apache Calcite overview
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilities
 
Apache Con 2021 : Apache Bookkeeper Key Value Store and use cases
Apache Con 2021 : Apache Bookkeeper Key Value Store and use casesApache Con 2021 : Apache Bookkeeper Key Value Store and use cases
Apache Con 2021 : Apache Bookkeeper Key Value Store and use cases
 
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
 
Spark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupSpark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark Meetup
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 

Similar a Overview of the SPARQL-Generate language and latest developments

Michael Lang Sr. Presentation
Michael Lang Sr. PresentationMichael Lang Sr. Presentation
Michael Lang Sr. Presentation
Mediabistro
 
Sparklis exploration et interrogation de points d'accès sparql par interactio...
Sparklis exploration et interrogation de points d'accès sparql par interactio...Sparklis exploration et interrogation de points d'accès sparql par interactio...
Sparklis exploration et interrogation de points d'accès sparql par interactio...
SemWebPro
 

Similar a Overview of the SPARQL-Generate language and latest developments (20)

Michael Lang Sr. Presentation
Michael Lang Sr. PresentationMichael Lang Sr. Presentation
Michael Lang Sr. Presentation
 
Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...
Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...
Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...
 
Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...
 
Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013
 
Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015
Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015
Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015
 
Bio2RDF presentation at Combine 2012
Bio2RDF presentation at Combine 2012Bio2RDF presentation at Combine 2012
Bio2RDF presentation at Combine 2012
 
Leveraging NLP and Deep Learning for Document Recommendations in the Cloud
Leveraging NLP and Deep Learning for Document Recommendations in the CloudLeveraging NLP and Deep Learning for Document Recommendations in the Cloud
Leveraging NLP and Deep Learning for Document Recommendations in the Cloud
 
Spark meetup TCHUG
Spark meetup TCHUGSpark meetup TCHUG
Spark meetup TCHUG
 
RSP4J: An API for RDF Stream Processing
RSP4J: An API for RDF Stream ProcessingRSP4J: An API for RDF Stream Processing
RSP4J: An API for RDF Stream Processing
 
Sparklis exploration et interrogation de points d'accès sparql par interactio...
Sparklis exploration et interrogation de points d'accès sparql par interactio...Sparklis exploration et interrogation de points d'accès sparql par interactio...
Sparklis exploration et interrogation de points d'accès sparql par interactio...
 
Web Spa
Web SpaWeb Spa
Web Spa
 
Towards efficient processing of RDF data streams
Towards efficient processing of RDF data streamsTowards efficient processing of RDF data streams
Towards efficient processing of RDF data streams
 
Towards efficient processing of RDF data streams
Towards efficient processing of RDF data streamsTowards efficient processing of RDF data streams
Towards efficient processing of RDF data streams
 
Lightening Fast Big Data Analytics using Apache Spark
Lightening Fast Big Data Analytics using Apache SparkLightening Fast Big Data Analytics using Apache Spark
Lightening Fast Big Data Analytics using Apache Spark
 
Sinux
SinuxSinux
Sinux
 
IEEE 2014 JAVA DATA MINING PROJECTS Scalable keyword search on large rdf data
IEEE 2014 JAVA DATA MINING PROJECTS Scalable keyword search on large rdf dataIEEE 2014 JAVA DATA MINING PROJECTS Scalable keyword search on large rdf data
IEEE 2014 JAVA DATA MINING PROJECTS Scalable keyword search on large rdf data
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And Visualization
 
Interactive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroDataInteractive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroData
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
 

Más de Maxime Lefrançois

Représentation des connaissances du DEC: Concepts fondamentaux du formalisme ...
Représentation des connaissances du DEC: Concepts fondamentaux du formalisme ...Représentation des connaissances du DEC: Concepts fondamentaux du formalisme ...
Représentation des connaissances du DEC: Concepts fondamentaux du formalisme ...
Maxime Lefrançois
 

Más de Maxime Lefrançois (8)

SPARQL-Generate, présentation SemWeb.Pro 2019
SPARQL-Generate, présentation SemWeb.Pro 2019SPARQL-Generate, présentation SemWeb.Pro 2019
SPARQL-Generate, présentation SemWeb.Pro 2019
 
Reference Knowledge Models for Smart Application
Reference Knowledge Models for Smart ApplicationReference Knowledge Models for Smart Application
Reference Knowledge Models for Smart Application
 
ITEA2 SEAS Knowledge Model Online Course
ITEA2 SEAS Knowledge Model Online CourseITEA2 SEAS Knowledge Model Online Course
ITEA2 SEAS Knowledge Model Online Course
 
Ph.D. Defense: Représentation des connaissances sémantiques lexicales de la T...
Ph.D. Defense: Représentation des connaissances sémantiques lexicales de la T...Ph.D. Defense: Représentation des connaissances sémantiques lexicales de la T...
Ph.D. Defense: Représentation des connaissances sémantiques lexicales de la T...
 
MTT-2013
MTT-2013MTT-2013
MTT-2013
 
Représentation des connaissances du DEC: Concepts fondamentaux du formalisme ...
Représentation des connaissances du DEC: Concepts fondamentaux du formalisme ...Représentation des connaissances du DEC: Concepts fondamentaux du formalisme ...
Représentation des connaissances du DEC: Concepts fondamentaux du formalisme ...
 
LefrancoisGandon@MSW11: ULiS
LefrancoisGandon@MSW11: ULiSLefrancoisGandon@MSW11: ULiS
LefrancoisGandon@MSW11: ULiS
 
LefrancoisGandon@MTT11: ILexicOn
LefrancoisGandon@MTT11: ILexicOnLefrancoisGandon@MTT11: ILexicOn
LefrancoisGandon@MTT11: ILexicOn
 

Último

Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
PirithiRaju
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
Areesha Ahmad
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
Areesha Ahmad
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 

Último (20)

Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Introduction to Viruses
Introduction to VirusesIntroduction to Viruses
Introduction to Viruses
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai YoungDubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 

Overview of the SPARQL-Generate language and latest developments

  • 1. Overview of the SPARQL-Generate language and latest developments Maxime Lefrançois Maxime.Lefrancois@emse.fr http://maxime-lefrancois.info/ MINES Saint-Étienne – Institut Henri Fayol Laboratoire Hubert Curien UMR CNRS 5516
  • 2. 2008 20102005 2014 20172009 2 Ph.D. KRR applied to linguistics Agrégation in mechanical engineering M2 Signal processing M2 Computer science Semantic Web and interoperabiliity Knowledge representation and reasoning Semantic Web and Web of data Semantic Interoperability on the Web of Thing Maxime Lefrançois Maxime.Lefrancois@emse.fr http://maxime-lefrancois.info/ MINES Saint-Étienne – Institut Henri Fayol Laboratoire Hubert Curien UMR CNRS 5516
  • 3. Connected Intelligence Team Hubert Curien Laboratory https://connected-intelligence.univ-st-etienne.fr/ 13 permanent staff members – 11 Ph.D. students (2 open pos. 2020) – 1 postdoc (2 open pos. now) Contact: Olivier.Boissier@emse.fr Saint-Étienne 3
  • 5. 5 How to reach semantic interoperability at the data level between heterogeneous things and services? Data Services HumansSmart Things
  • 6. 6 There is a multitude of data formats/models on the Web
  • 7. 7 M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats, In Proc. Extended Semantic Web Conference, 2017 + EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial  Key step: generate some RDF
  • 8. Requirements for RDF generation 8 M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats, In Proc. Extended Semantic Web Conference, 2017 + EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial  - Transform multiple sources … - ... having heterogeneous formats - Be extensible to new data formats - Be easy to use by Semantic Web experts - Integrate in a typical semantic web engineering workflow - Be flexible and easily maintainable - Contextualize the transformation with an RDF Dataset - Modularize / decouple / compose transformations - Enable data transformation, aggregates, filters, ... - Generate RDF Lists - Performant - Transform binary formats as well as textual formats - Transform big documents (output HDT) - Generate RDF streams from streams
  • 9. Existing approaches 9 RDFizers (see https://www.w3.org/wiki/ConverterToRdf ) - A lot of tools are specific to one or a few formats (44 referenced formats) - Some frameworks support several/many formats ad hoc methods, little or no control on the structure of the output => may require an additional transformation M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats, In Proc. Extended Semantic Web Conference, 2017 + EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial 
  • 10. Existing approaches 10 Approaches based on mapping/transformation languages M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats, In Proc. Extended Semantic Web Conference, 2017 + EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial 
  • 11. <html xmlns=http://www.w3.org/1999/xhtml xmlns:grddl='http://www.w3.org/2003/g/data-view#' grddl:transformation="http://example.com/getAuthor.xsl" > <head> <title>Are You Experienced?</title> [...] </html> <album xmlns:grddl='http://www.w3.org/2003/g/data-view#' grddl:transformation="http://example.org/getAlbum.xsl" > <artist mbid="">The Jimi Hendrix Experience</artist> <name>Are You Experienced?</name> ... </album> GRDDL 11 (W3C REC 2007)
  • 13. 13 R2RML (W3C REC 2012) <#TriplesMap2> rr:logicalTable <#DeptTableView>; rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}"; rr:class ex:Department; ]; rr:predicateObjectMap [ rr:predicate ex:name; rr:objectMap [ rr:column "DNAME" ]; ]; rr:predicateObjectMap [ rr:predicate ex:location; rr:objectMap [ rr:column "LOC" ]; ]; rr:predicateObjectMap [ rr:predicate ex:staff; rr:objectMap [ rr:column "STAFF" ]; ].
  • 14. { "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}], "url": "tree-ops.csv", "dc:title": "Tree Operations", "dcat:keyword": ["tree", "street", "maintenance"], "dc:publisher": { "schema:name": "Example Municipality", "schema:url": {"@id": "http://example.org"} }, "dc:license": {"@id": "http://opendefinition.org/licenses/cc-by/"}, "dc:modified": {"@value": "2010-12-31", "@type": "xsd:date"}, "tableSchema": { "columns": [{ "name": "GID", "titles": ["GID", "Generic Identifier"], "dc:description": "An identifier for the operation on a tree.", "datatype": "string", "required": true }, { "name": "on_street", "titles": "On Street", "dc:description": "The street that the tree is on.", "datatype": "string" }, { "name": "species", "titles": "Species", "dc:description": "The species of the tree.", "datatype": "string" }, { "name": "trim_cycle", "titles": "Trim Cycle", "dc:description": "The operation performed on the tree.", "datatype": "string" }, { "name": "inventory_date", "titles": "Inventory Date", "dc:description": "The date of the operation that was performed.", "datatype": {"base": "date", "format": "M/d/yyyy"} }], 14 CSVW (W3C REC 2015)
  • 15. RML + Extensions 15 (Dimou et al., 2013)
  • 16. 16 - Transform multiple sources … - ... having heterogeneous formats - Be extensible to new data formats - Be easy to use by Semantic Web experts - Integrate in a typical semantic web engineering workflow - Be flexible and easily maintainable - Contextualize the transformation with an RDF Dataset - Modularize / decouple / compose transformations - Enable data transformation, aggregates, filters, ... - Generate RDF Lists - Performant - Transform binary formats as well as textual formats - Transform big documents (output HDT) - Generate RDF streams from streams - Subject-centric? - Include parts from different sources in one RDF Term? (Dimou et al., 2013) RML + Extensions – covers the REQs?
  • 17. Main research questions How to design a mapping language that… …is expressive enough to cover all of our use cases? …is still rather simple to use? …can be easily extended to any source format? 17
  • 19. RDF generation process 19 Selection patterns Xpath, JSONpath, CSS selectors, regex, etc.
  • 20. RDF generation process 20 Selection patterns Xpath, JSONpath, CSS selectors, regex, etc.
  • 21. RDF generation process 21 ? ? ? Graph pattern definition ex:salary ex:Director Selection patterns Xpath, JSONpath, CSS selectors, regex, etc.
  • 22. RDF generation process 22 ? ? ? Selection patterns Xpath, JSONpath, CSS selectors, regex, etc. Graph pattern definition ex:salary ex:Director + Select ontologies
  • 23. First SPARQL-Generate query 23 ? ? ? Selection patterns Xpath, JSONpath, CSS selectors, regex, etc. Graph pattern definition ex:salary ex:Director + Select ontologies
  • 24. The Graph Pattern Definition is here 24 ? ? ? Selection patterns Xpath, JSONpath, CSS selectors, regex, etc. Graph pattern definition ex:salary ex:Director + Select ontologies
  • 25. Query RDF Dataset + « Documentset » 25 ? ? ? Selection patterns Xpath, JSONpath, CSS selectors, regex, etc. Graph pattern definition ex:salary ex:Director + Select ontologies
  • 26. Iterator Functions 26 ? ? ? Selection patterns Xpath, JSONpath, CSS selectors, regex, etc. Graph pattern definition ex:salary ex:Director + Select ontologies
  • 27. Binding Functions (that’s standard SPARQL) 27 ? ? ? Selection patterns Xpath, JSONpath, CSS selectors, regex, etc. Graph pattern definition ex:salary ex:Director + Select ontologies
  • 28. iter&fun functions have dereferenceable URIs 28 ? ? ? Selection patterns Xpath, JSONpath, CSS selectors, regex, etc. Graph pattern definition ex:salary ex:Director + Select ontologies
  • 29. It extends SPARQL, so we got for free ... 29 ? ? ? Selection patterns Xpath, JSONpath, CSS selectors, regex, etc. Graph pattern definition ex:salary ex:Director + Select ontologies • Implementable on top of existing SPARQL engines • Easy to learn when you know SPARQL • You get to know SPARQL better when you use SPARQL-Generate • Output template (Basic Graph Pattern) is not subject-centric • SPARQL 1.1 FILTER BIND OPTIONAL,... • SPARQL 1.1 Standard binding functions and operators STR LANG LANGMATCHES DATATYPE BOUND IRI URI BNODE RAND ABS CEIL FLOOR ROUND CONCAT SUBSTR STRLEN REPLACE UCASE LCASE ENCODE_FOR_URI CONTAINS STRSTARTS STRENDS STRBEFORE STRAFTER YEAR MONTH DAY HOURS MINUTES SECONDS TIMEZONE TZ NOW UUID STRUUID MD5 SHA1 SHA256 SHA384 SHA512 COALESCE IF STRLANG STRDT sameTerm isIRI isURI isBLANK isLITERAL isNUMERIC REGEX + * / = > < || && • SPARQL 1.1 Binding functions extension mechanism • SPARQL 1.1 ORDER LIMIT OFFSET • SPARQL 1.1 GROUP BY, HAVING, Aggregate functions COUNT SUM MIN MAX AVG SAMPLE GROUP_CONCAT
  • 30. 30 M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats, In Proc. Extended Semantic Web Conference, 2017 + EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial  SPARQL-Generate is not just « theoretically cool » https://w3id.org/sparql-generate/ Open-source Licence Apache 2.0 implementation based on Apache Jena financed by ITEA, ANR, ENGIE Expressive, Perfomant, extensible to other formats API (available on Maven Central) JAR Web playground + tutorial
  • 31. 31 M. Lefrançois, A. Zimmermann, N. Bakerally, A SPARQL extension for generating RDF from heterogeneous formats, In Proc. Extended Semantic Web Conference, 2017 + EKAW, Hackatons, ESWC Tutorial 2018, PFIA Tutorial 2019, OEG Tutorial  SPARQL-Generate is not just « theoretically cool » https://w3id.org/sparql-generate/ Open-source Licence Apache 2.0 implementation based on Apache Jena financed by ITEA, ANR, ENGIE Expressive, Perfomant, extensible to other formats API (available on Maven Central) JAR Web playground + tutorial Sublime Text package
  • 32. Functions have dereferenceable URIs 32 ? ? ? Selection patterns Xpath, JSONpath, CSS selectors, regex, etc. Graph pattern definition ex:salary ex:Director + Select ontologies
  • 33. How are we going so far? 33 ? ? ? Selection patterns Xpath, JSONpath, CSS selectors, regex, etc. Graph pattern definition ex:salary ex:Director + Select ontologies - Transform multiple sources … - ... having heterogeneous formats - Be extensible to new data formats - Be easy to use by Semantic Web experts - Integrate in a typical semantic web engineering workflow - Be flexible and easily maintainable - Contextualize the transformation with an RDF Dataset - Modularize / decouple / compose transformations - Enable data transformation, aggregates, filters, ... - Generate RDF Lists - Performant - Transform binary formats as well as textual formats - Transform big documents (output HDT) - Generate RDF streams from streams - Subject-centric? - Include parts from different sources in one literal?
  • 34. There’s more than SOURCE and ITERATOR ... 34 What I didn’t tell yet...
  • 35. Iterators can bind several variables 35
  • 36. Iterators can be more powerful than that 36
  • 37. Syntactic sugar helps to compact queries 37
  • 38. Iterators work in batches 38
  • 39. Iterators work in batches 39
  • 40. Generate clauses can be nested 40
  • 41. Generate queries can be modularized 41
  • 42. One can generate RDF lists easily* 42 * NEW!
  • 43. One can generate HDT directly* 43 20 seconds for a 20 MB gene2go sample, output is 17 MB of HDT 9 min, 20 sec for a 145 MB gene2go sample, output is "only" 114 MB HDT * NEW!
  • 44. There’s more than GENERATE ... 44 What I didn’t tell yet...
  • 45. 45 FUNCTION Functions on RDF terms, RDF graphs or SPARQL results Subset of LDScript: a Linked Data Script Language http://ns.inria.fr/sparql-extension/ Olivier Corby, Catherine Faron-Zucker and Fabien Gandon, LDScript: a Linked Data Script Language, ISWC 2017, spotlight paper
  • 46. 46 TEMPLATE Generate Text from RDF Extended subset of STTL: the SPARQL Template Transformation Language https://ns.inria.fr/sparql-template/ Olivier Corby and Catherine Faron-Zucker. A Transformation Language for RDF based on SPARQL. Web Information Systems and Technologies - Selected Extended Papers from WEBIST 2015. Springer-Verlag, 2015. Best paper nominee. • Elements FORMAT GROUP BOX • Binding function st:call-template(templateURI, param1, param2, ...)
  • 47. 47 TEMPLATE Generate Text from RDF Extended subset of STTL: the SPARQL Template Transformation Language https://ns.inria.fr/sparql-template/ Olivier Corby and Catherine Faron-Zucker. A Transformation Language for RDF based on SPARQL. Web Information Systems and Technologies - Selected Extended Papers from WEBIST 2015. Springer-Verlag, 2015. Best paper nominee. • Elements FORMAT GROUP BOX also SPARQL-Generate clauses ITERATOR SOURCE • Binding function st:call-template(templateURI, param1, param2, ...)
  • 48. 48 SELECT SPARQL Select: - with SPARQL-Generate clauses ITERATOR SOURCE - with syntactic sugar, etc.
  • 49. 49 SELECT SPARQL Select: - with SPARQL-Generate clauses ITERATOR SOURCE - with syntactic sugar, etc. • SELECT queries output a list of solution bindings … • … Like ITERATOR clauses
  • 50. Use ITERATORS in sequence? Let’s give you an idea of how an unoptimized query looks like
  • 51. Use ITERATORS in sequence? Let’s give you an idea of how an unoptimized query looks like • GENERATE queries output RDF Graph … • …
  • 52. Calling a GENERATE query • GENERATE queries output RDF Graph … • … like FROM Clauses FROM GENERATE {} and FROM GENERATE <>(param1, param2, …)
  • 53. 53 - Modularize, Decouple, Compose queries GENERATE TEMPLATE SELECT FUNCTION can call can call can call can call
  • 54. There’s much more to do than « just use it » 54 What I didn’t tell yet...
  • 55. 55 There’s much more to do than « just use it » Two use cases that drove recent development • Generation of documentation from an ontology • Generation of an ontology from configuration (RDF or files) Use it • On the web, as JAR, in Sublime Text, as API • Extend with your own functions and iterators (e.g., NER) • Make students play with it • Ask for help, contribute, report issues, propose enhancements https://github.com/sparql-generate/sparql-generate/issues
  • 56. 56 There’s much more to do than « just use it » Ideas • How can SPARQL-Generate be compared to / used with RML? • How can SPARQL-Generate be used by Helio? • What query optimization techniques for SPARQL-Generate? • Can we use (a subset of) SPARQL-Generate for OBDA? • Can we qualify transformations w.r.t the input/output space? • Can we identify/compute operators on these objects? e.g., If input conforms to a schema, then does the output conforms to a SHACL shape?
  • 57. What was the main purpose ... 57 What I didn’t tell yet...
  • 58. 58 Can we use RDF as a lingua franca?
  • 59. 59 Idea: just send RDF! application/rdf+xml text/turtle application/ld+json … ERI HDTQ RDF data formats will never be the only ones on the web Developers prefer JSON, …
  • 60. 60 Idea: just send RDF! JSON-LD! Solution: - to rely on *one* RDF syntax like JSON-LD Requires: - Global adoption of JSON-LD - Maintaining during the development or the evolution phase Abstract data model (ontology) + JSON validation: JSON Schema JSON-LD Context + RDF validation: SHACL/SHeX
  • 61. 61 Approx. modeling of the communication between heterogeneous agents on the Web. send X receivermediumsender wants to send some content e.g., state of a resource M. Lefrançois, RDF presentation and correct content conveyance for legacy services and the web of things. In Proc. IOT'18
  • 62. Generates a representation of the content 62 send X receivermedium received X 1 2 Transmits the message via the internet 3 Decodes and get understanding sender wants to send some content e.g., state of a resource M. Lefrançois, RDF presentation and correct content conveyance for legacy services and the web of things. In Proc. IOT'18 Approx. modeling of the communication between heterogeneous agents on the Web.
  • 63. Generates a representation of the content 63 send X receivermedium received X 1 2 Transmits the message via the internet 3 Decodes and get understanding sender Correct Content Conveyance iif: 1. all of the essential characteristics of the content is encoded in the message 2. the encoding and the decoding phase are symmetric 3. the message is not altered in the transmission medium Approx. modeling of the communication between heterogeneous agents on the Web.
  • 64. Generates a representation of the content 64 Assumption: content is always a RDF graph send receivermedium received 1 2 Transmits the message via the internet 3 Decodes and get understanding sender Correct Content Conveyance iif: The RDF graph the sender encodes is equivalent to the RDF graph the receiver obtained after decoding the message
  • 65. 65 Content Representation RDF Graphs Typed octet streams M. Lefrançois, RDF presentation and correct content conveyance for legacy services and the web of things. In Proc. IOT'18
  • 66. 66 RDF Presentations RDF Graphs RDF Presentation Typed octet streams
  • 67. 67 Lifting, lowering, validating RDF Graphs Typed octet streams Validation Representation validation Lifting rule maps all the tos in p to their corresponding graph (unique) Lowering rule maps all the graphs in p to a *chosen* corresponding typed stream Validation rule checks if there is a pair with that graph Representation validation rule checks if there is a pair with that tos
  • 68. 68 + categorisation of tools + analysis of RDF formats RDF Graphs Validation - SHACL - ShEx - SPIN - ad hoc code... Representation validation - JSON Schema - XML Schema - ad hoc code... Typed octet streams M. Lefrançois, RDF presentation and correct content conveyance for legacy services and the web of things. In Proc. IOT'18
  • 69. 69 Usage scenarios on the Web send sender receiver The sender can be… Req/Res: client sends some content to the server Req/Res: server responds to the client Pub/Sub: client broadcasts some content to its subscribers Both agents can also… be constrained in some ways (battery, storage, comput,…) implement some of the principles we devise (be semantically flexible) rely on a third party to operate lifting/lowering/… M. Lefrançois, RDF presentation and correct content conveyance for legacy services and the web of things. In Proc. IOT'18
  • 70. 70 Scenario 1: server/publisher sends its message to a client/subscriber client/ subscriber How to lift ? server/ publisher receiver discovers how to lift (from the sender, or elsewhere) sender can be constrained… / receiver can be constrained… I lowered this way, just lift accordingly send initiates interaction
  • 71. 71 Scenario 2: A client asks a server for the representation of a resource send clientserver server discovers a presentation suitable for the client client can be constrained… / server can be constrained… How to lower ? I will lift this way, please lower accordingly initiates interaction
  • 72. 72 Scenario 3: A client sends some encoded content to a server send serverclient client discovers a presentation suitable for the server client can be constrained… / server can be constrained… How to lower ? initiates interaction I will lift this way, please lower accordingly
  • 73. Ontologies, langages, tools, to ease the usage of RDF as a lingua franca Maxime Lefrançois Maxime.Lefrancois@emse.fr http://maxime-lefrancois.info/ MINES Saint-Étienne – Institut Henri Fayol Laboratoire Hubert Curien UMR CNRS 5516 Take away message 1. RDF Transformation Language 2. Use it and contribute 3. Not just a tool
  • 75. Generates a representation of the content 75 send receivermedium received 1 2 Transmits the message via the internet 3 Decodes and get understanding sender ontologies Assumption: content is always a RDF graph
  • 76. 76 Ontologies, contribution to standardization SOSA/SSN ontology Standard OGC et W3C A. Haller, K. Janowicz, S. Cox, D. Le Phuoc, K. Taylor, M. Lefrançois, Semantic Sensor Network Ontology, W3C Recommendation, W3C, 19 October 2017
  • 77. 77 Ontologies, contribution to standardization BOT ontology M. H. Rasmussen, M. Lefrançois, G. F. Schneider, P. Pauwels, BOT: the Building Topology Ontology of the W3C Linked Building Data Group (submitted May 2019 to Semantic Web Journal)
  • 78. 78 Ontologies, contribution to standardization SAREF + influences SAREF patterns + instances http://saref.etsi.org/@prefix seas: <http://w3id.org/seas/>. Smart Grid Core of SEAS Ontology patterns SimpleTransport Smart Home Building HVAC ETSI STF 556: Consolidation of SAREF and its community of industrial users, based on the experience of the EUREKA ITEA 12004 SEAS ETSI STF DP: Specification of the SAREF development framework and workflow, and development of the Community SAREF Portal for user engagement
  • 79. 79 Ontologies, contribution to standardization A work-in progress at international level N. Seydoux, M. Lefrançois, L. Médini, . F. Schneider, P. Pauwels, Positionnement sur le Web Sémantique des Objets, Ingénierie des Connaissances 2019 (best paper award)