SlideShare una empresa de Scribd logo
1 de 65
Descargar para leer sin conexión
Publishing and Consuming Geo-
spatial and Government Data on
the Semantic Web
Ghislain Auguste Atemezing
Multimedia Department, Eurecom
Supervisor:
Dr. Raphaël Troncy, MM Department, Eurecom
§  Open Government Data benefits
Ø Transparency in decision making
Ø Better governance or e-governance
Ø Eco-system of added value application
§  Barriers and challenges
Ø  Heterogeneity of data formats
Ø Variety of access method
Ø Lack of nomenclature
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
2
Thesis Context
In this thesis we explore how Semantic Web technologies
can be used for better integration and consumption of geo-
spatial data.
§  Introduction
§  Research Questions
§  Contributions
Ø Semantic publishing of geospatial data
Ø Visualizations of Government Linked Data
Ø Best Practices for metadata in vocabularies
§  Conclusion
§  Publications
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
3
Outline
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
4
Geospatial data: why it matters
“80% of needs for decisions from public authorities have
a geospatial component”.
(Philippe Grelot, IGN-France)
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
5
GeoData on the LOD Cloud
http://lod-cloud.net/versions/2014-08-30/
Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer,
Anja Jentzsch and Richard Cyganiak.
http://lod-cloud.net/
In 2011 19,43% à31 geo-datasets in LOD
http://lod-cloud.net/state/
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
6
French IGN- Reference Geodata
« ..describes the French national territory and the
occupation of its land, elaborates and updates perpetual
inventory of the forest resources »
ü  Different databases with overlapping information:
BD ORTHO, BD PARCELLAIRE, POINT ADRESSE, BD ALTI 25m,
BD TOPO; etc.
ü  CRS: LAMBERT93 or RGF93
Q: ”Give me all the
bridges in a radius of
2km from the "Eiffel
Tower“?
A: Not straightforward
§  How to efficiently represent and store geospatial data
on the Web to ensure interoperable applications?
Ø  the publication of real datasets
Ø  the interlinking with other datasets having overlapping coverage
§  What are the best options for a user to discover,
browse and interact with semantic content?
Ø  Can we propose a generic model for visualizing semantic
content?
§  What are the mechanisms to help preserving
structured data of a high quality on the Web?
Ø  How to improve reusing vocabularies for better interoperability
Ø  How to detect incompatibility between data and metadata
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
7
Research Questions
2015/04/10
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
8
Part 1
Semantic Publication of
Geo-spatial Data
2015/04/10
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
9
Modeling geospatial objects
2015/04/10
3 main components:
§  CRS
§  Features
§  Geometry
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
10
Coordinate Reference Systems (CRS)
2015/04/10
§  A representation of the locations of geographic
features within a common geographic framework
§  Each CRS is defined by its measurement framework
Ø  Geographic: spherical coordinates (unit: decimal degrees)
Ø  Planimetric: 2D planar surface (unit: meters) + a map projection
Ø  Additional measurement properties: ellipsoid, datum, standard
parallels, central meridians, etc.
§  Situation:
Ø  Several hundred Geographical Coordinate Systems
(WGS84, ETRS89, etc.)
Ø  Several thousands Projected Coordinate Systems
(UTM, Lambert93, etc.),
http://resources.esri.com/help/9.3/arcgisserver/apis/rest/pcs.html
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
11
CRS in France (Metropolitan + overseas)
2015/04/10
§  10 CRSs coverage (mostly RGF93)
§  10 projections (mostly used Lambert93)
§  3 different ellipsoids to define the CRS
The Web only used
WGS84
Publishers need converters
for their geodata
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
12
CRS Converters and Limitations
2015/04/10
§  Circé: A Converter Software from IGN
Ø  Allows conversion between some France CRSs
to WGS84
Ø  Closed Source, No open service
§  Existing Web tools (e.g., world coordinate converter)
Ø  No open service, closed source
Ø  Converting algorithms don’t usually map to any national
mapping authority.
§  Contribution: a Web based service to
perform conversion between various CRSs.
Ø  The integration of such a converter can ease the
process of the publishing different types of geometries
on the web regardless of their Coordinates systems.
§  WGS 84 <–> Lambert 93
§  WGS 84 <–> UTM
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
13
Developing a REST converter service
2015/04/10
•  Many regional's’
published data
involve with
local CRS
International
Communication
•  Interpret the
coordinates
between these
CRSs
A medium
•  Open services
for community
•  RESTFul Web
Service
A Converter
Service
GOOD accuracy of the
results
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
14
Vocabularies for Modeling Features
2015/04/10
§  Authority list of terms (e.g. Foursquare)
Ø No semantics at all!
§  SKOS Categories (e.g. GeoNames)
Ø Classes are skos:conceptScheme, codes are
skos:Concept
§  Domain specific ontologies (OrdS, GeoLD)
Ø Interconnected subdomain ontologies (transport,
hydrography, etc.)
§  Data driven ontologies (LGD, GeOnto)
Ø Deeper taxonomy to structure the ontology
Ø Many classes
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
15
Modeling Geometry: State of the Art
2015/04/10
§  Point (lat/long)
Ø  WGS 84 vocabulary described by W3C
§  Rectangle (“bounding box”)
Ø  Geopolitical Vocabulary (FAO)
§  Points in a List
Ø  Sequence of points (LinkedGeoData)
Ø  An object is “formedBy” a ListOfPoints (GeoLinkedData.es)
§  Literals WKT datatype in RDF
Ø  Ordnance Survey (UK), GeoSPARQL embedding CRS in literals
§  More structured representation of complex geometry
Ø  NeoGeo Vocabulary (GeoVocamp), http://geovocab.org/
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
16
Reusing Existing Ontologies (GeOnto)
2015/04/10
§  Ontology for geographic objects (POI)
Ø Output of a French (ANR) research project
Ø Obtained from NLP tools
§  Classes in French
Ø rdfs:labels in FR & EN
Ø No rdfs:comments
Ø Few owl:ObjectProperty
Ø 783 classes
§  Overlap with other vocabs
Ø Need for alignment
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
17
Aligning GeOnto with existing Ontologies
2015/04/10
§  Alignment of GeOnto with 5 ontologies and 2 simple
taxonomies
Ø  LGD, DBpedia, Schema.org, GeoNames, bdtopo
Ø  Foursquare, Google Places
§  Goal: finding owl:equivalentClass
Ø  Tool : Silk framework
Ø  Metrics : LevenshteinDistance, Jaro
Ø  Labels : @en des classes
Ø  Aggregation Function: Mean
§  Manual validation
Ø  For « rdfs:subClassOf »
Ø  Specific alignments with GeoNames codes
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
18
Alignment Process (GeoNames)– Results
2015/04/10
§ High precisions > 80%
§ BUT P(Schema.org) = 50%
Silk
•  Look for skos codes that
matches GeOnto classes
•  Verify the links <70%
•  Generate « sameAs » links
SPARL
Endpoint
•  Use SPARQL «Construct»
to generate a new graph.
Alignment
File
•  Export the RDF file
Vocab/
taxonomies
#Classes #Classes aligned
Bdtopo 237 153 (64.65%)
GeoNames 699 287 (41.06%)
Google
Place (*)
126 41 (32.54%)
Schema.org
(*)
296 52 (17.57%)
LGD 1294 178 (13.76%)
Foursquare 359 46 (12.81%)
DBpedia 366 42 (11.48%)
GeOnto entities are more
specific to France
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
19
Proposal: CheckList for modeling geodata
2015/04/10
§  Complex Geometry Coverage
Ø  Need to publish more data with complex geometries
Ø  Reuse and extend suitable ontologies (NeoGeo, GeoSPARQL)
§  Features MUST be connected to Geometry
Ø  Sometimes it may requires two namespaces
§  Serialization in other GIS formats
Ø  Provide serialization in other GIS formats (GML, WKT, KML, etc.)
§  Structured Representation
Ø  Use of structured representation for complex geometry
Ø  This covers some of the Use Cases at IGN
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
20
Proposal: Vocabulary for complex Geometry
2015/04/10
§  Extend and reuse existing vocabularies
§  Available at http://data.ignf.fr/def/geometrie
@prefix ngeo: <http://geovocab.org/geometry#>.
@prefix sf: <http://www.opengis.net/ont/sf#>.
[…]
geom:Geometry a owl:Class;
rdfs:comment "Primitive géométrique non instanciable, racine de
l'ontologie des primitives géométriques. Une géométrie est
associée à un système de coordonnées et un seul."@fr;
rdfs:label "Géométrie"@fr, "Geometry"@en;
owl:equivalentClass [ a owl:Restriction;
owl:onClass ignf:CoordinatesSystem;
owl:onProperty geom:crs;
owl:qualifiedCardinality "1"^^xsd:nonNegativeInteger];
rdfs:subClassOf ngeo:Geometry;
rdfs:subClassOf sf:Geometry.
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
21
Associating CRS to Geometries
2015/04/10
§  A vocabulary for describing CRS
Ø  Subset of ISO 19111 model
Ø  Available at http://data.ignf.fr/def/ingf
§  A dataset for French CRS
Ø  Convert from XML data published by IGN France to RDF
Ø  Eg: “Lambert 2 étendu” http://data.ign.fr/id/ignf/crs/NTFLAMB2E
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
22
Overview of the vocabularies and relationships
2015/04/10
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
23
A workflow for publishing Linked(Geo)Data
2015/04/10
§ DATALIFT: a set of modular tools for “lifting” raw data in RDF.
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
24
Publishing French Reference Geodata in RDF
2015/04/10
§  GEOFLA® DATASET ON FRENCH ADMINISTRATIVE UNITS IN
RDF
Ø  Features are of type http://data.ign.fr/geofla
§  FRENCH GAZETTEER DATASET
Ø  Features are of type http://data.ign.fr/topo
Mapping results with external dataset
DATASET
GEOFLA
DATASET
#mappings Tool
INSEE 37,020 SILK
DBPEDIA-FR 23, 252 (communes)
and 93 departments
LIMES
NUTS 105 links (14 comm.,
75 depts, 16 regions)
SILK
GADM 70 links (10 comm., 51
depts, 9 regions)
SILK
FRENCH
GAZETTEER
LINKED
GEODATA
654 links
(lgdo:Amenity) LIMES
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
25
French Open Addresses in RDF Mappings
2015/04/10
BANO2RDF
LGData
Amenities Paris (248052) Marseille (401404) Lyon (89061)
#matched	
   %matched	
   #matched	
   %matched	
   #matched	
   %matched	
  
Shop (778680) 21171	
   2.71	
   8556	
   1.098	
   3049	
   0.391	
  
Restaurant
(260675) 13567	
   5.204	
   2654	
   1.018	
   1882	
   0.721	
  
PostOffice (87731) 971	
   1.106	
   555	
   0.632	
   173	
   0.197	
  
School (318287) 883	
   0.277	
   411	
   0.129	
   197	
   0.061	
  
Parking (250516) 735	
   0.293	
   625	
   0.24	
   210	
   0.083	
  
PlaceOfWorship
(357445) 272	
   0.076	
   193	
   0.053	
   31	
   0.008	
  
PublicBuilding
(26735) 97	
   0.362	
   64	
   0.239	
   21	
   0.078	
  
Building (22283) 5	
   0.022	
   12	
   0.053	
   0	
   0	
  
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
26
Contribution to the LOD Cloud (FrLOD)
2015/04/10
§  340 millions of triples in RDF (10% of DBPedia 2014)
§  Part of our work is under reused by the W3C/OGC
Spatial Data Working Group
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
27
Why Visualizations Matters
2015/04/10
“Don’t ask what you can do for
the Semantic Web; ask what
The Semantic Web can do for you!”
(D. Karger, MIT CSAIL)
1.  How to build bridge to fill the gap
between traditional InfoVis tools and
Semantic Web technologies
2.  How can Semantic Web help in
visualization?
“If you use our Linked Data, please let
us know, or we might switch off!”
(Ordnance Survey)
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
28
Part 2
Generating Visualizations
For Linked Data
2015/04/10
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
29
Visualization Categories in Government
Portals
2015/04/10
§  Study of applications consuming Open Data
Ø 13 applications from UK (7), USA (3) and France (3)
Ø Domain: education, health, transport, government,
city, housing, criminality, foreign aid
§  Different dimensions
Ø Platform (web, mobile), data sources, which views are
available (maps, charts, timeline, etc.)
Ø URL policy for identifying data objects
Ø Licenses for the application / for the data
Ø Commercial / non-commercial
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
30
Relevant Features in Visualization Tools
2015/04/10
§  Data format given as input (csv, xml, shp, rdf, etc.)
§  Data access
(API, dump, etc.)
§  Language code
§  Type of view
§  External Libraries
§  License
§  Metadata:
author, organization
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
31
Describing an Application: Opendatacom
2015/04/10
Scope/Domain: Department for Communities and Local
Government, datasets access
Description: visualize available datasets (finance, housing,
deprivation, geography) by authorities or postcode.
On the dashboard, it provides graphs showing the national
distribution of a district and how the values for this local
authority compare with others in England.
Supported Platform: Web
URL Policy: http://{domain}/id/{...} with redirection to the
corresponding document at: http://{domain}/doc/{...}.
Hampshire County Council is:
http://opendatacommunities.org/id/county-council/hampshire
Data Sources: 36 datasets from DCLG, Administrative
Geography and Postcodes from Ordnance Survey.
Type of View: Graph, Map views.
Visualization Tools: google visualization API, raphael.js
License: Open Government license [OGL]
Business Value: Non commercial
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
32
DVIA: A vocabulary to describe Applications
2015/04/10
dvia: Application
dct:title
dvia:description
dvia:keyword
dvia:url
dct:creator
dvia:businessValue
dvia:scope
dvia:view
dctype:Software
dvia: Platform
dct:title
dvia:system
dvia:preferredNavi
gator
dvia:alternativeNav
igator
dvia: VisualTool
dct:title
dct:description
dvia:accessUrl
dvia: downloadUrl
dcat: Dataset
dct: title
dcat: accessURL
dct:references
dcat: keyword
org:Organization
dvia:consumes
dvia:platform
Prefixes:
@prefix dct: <http://purl.org/dc/terms/>.
@prefix dcat: <http://www.w3.org/ns/dcat#>.
@prefix dctype: <http://purl.org/dc/dcmitype/>.
@prefix org: <http://www.w3.org/ns/org#>.
@prefix dvia: <http://data.eurecom.fr/ontology/dvia#>.
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
33
DVIA in Real World Datasets
2015/04/10
§  4 applications re-using DVIA
Ø Use to populate 22 past events of
hack-athons in Europe, with 889
applications from Apps4Europe.
a. The  script  will  fetch  the  event  or  application  information  using  
an  AJAX-­‐request.  
b. After  the  server  has  responded  with  the  event/application  
information,  the  information  is  displayed  on  the  page  in  human  
readable  form  by  modifying  the  DOM  of  the  page.  Furthermore,  the  
same  information  is  presented  also  in  computer  readable  format  by  
embedding  RDFa  in  the  DOM.  
  
The embeddable script is non-­optimal in the sense that the event or application information is                                            
loaded only after the actual page (where the script is embedded) has loaded. This causes some                                               
additional delay before the page is fully rendered, but the problem can be alleviated by showing                                               
loading  indicators.  
  
CSS styling for the events and applications is provided using Bootstrap. All CSS rules have been                                               
made specific to the container div-­element of the plugin, in order to prevent the CSS from                                               
conflicting with the CSS of the page. In the future, another solution called shadow DOM could be                                                  
used to prevent conflicts between different components of the page, but at the moment the                                            
browser support is not satisfactory. Since the event and application information is directly                                      29
embedded on the page using DOM, the event organizer can add his own CSS styling to the                                                  
plugin  by  overriding  the  provided  CSS  rules.  
  
  
Fig  9.  A  screenshot  of  a  test  event  displayed  on  an  event  organizer’s  page.  The  content  inside  
the  red  square  is  injected  after  page  load  by  the  embeddable  script.  Note!  The  red  square  is  not  
visible  in  the  actual  page,  it  was  added  only  for  visualization  purposes.  
Ø Implementation of a universal JavaScript plugin to
embed RDFa in organizers events
Ø An extension for Wordpress uses DVIA.
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
34
Developing Web Applications: from specific
to generic approach
2015/04/10
§  Scenario 1
Ø  Known Datasets, Known
vocabularies à Specific
SPARQL queries
Ø  Visualizations: dataset
specific
§  Example
Ø  Datasets on schools in France
Ø  Vocabularies: geo vocab, data
cube, geometrie,
Ø  Application: PerfectSchool
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
35
Developing Web Applications: from specific
to generic approach
2015/04/10
§  Scenario 2
Ø  Unknown Datasets, Known
domains, so domain-
specific SPARQL queries
Ø  Visualizations: domain
specific
§  Example
Ø  Endpoints of geodatasets
Ø  Domain: geospatial
Ø  Application: GeoRDFviz
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
36
Developing Web Applications: from specific
to generic approach
2015/04/10
§  Scenario 3
Ø  Unknown Datasets, Unknown
domains, so generic SPARQL
queries
Ø  Visualizations: adapted to
domains specific
Ø  Any endpoints
Ø  Multiple domains: geodata,
statistics, persons, cross-
domains, etc..
Ø  Application: ?????
Related work on configuring Semantic Web widgets by data mapping. [1]
Application: Efficient search for Semantic News demonstrator
in Cultural Heritage Dataset
Tool: ClioPatria
[1] Hildebrand, Michiel, and Jacco Van Ossenbruggen. "Configuring semantic web interfaces by data mapping."
Visual Interfaces to the Social and the Semantic Web (VISSW 2009) 443 (2009): 96.
…but “method not apply to create
interfaces on top of arbitrary
SPARLQ endpoints”
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
37
Our Proposal
2015/04/10
Linked Data
Vizualization Wizard
(LDVizWiz)
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
38
Requirements of LDVizWiz (LDViz-”Wise”)
2015/04/10
§  Predefined categories associated to visual elements
§  Build on top of RDF standards
Ø e.g., SPARQL queries ; Semantic Web technologies
§  Reuse existing Visualization libraries
Ø e.g., Google Maps, Google Charts, D3.js, etc.
§  Reuse On-line Library of Information Visualization
(OLIVE)
§  Target to non “RDF/SPARQL speakers”
§  Input: Datasets published as LOD
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
39
Mapping Categories and vocabularies
2015/04/10
§  Geographic
information
Ø geo: vocab,
schema:Place, etc.
§  Temporal information
Ø Time, interval ontologies
§  Event information
Ø lode, event, sport, etc..
§  Agent/Person
Ø foaf:Person/foaf:Agent
§  Organization
information
Ø ORG vocabulary,
foaf:Organization
§  Statistics information
Ø Data cube, SDMX model
§  Knowledge information
Ø Schemas, classifications
using SKOS vocabulary
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
40
LDVizWiz Workflow
2015/04/10
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
41
Step 1: Categories Detection
2015/04/10
§  Detection of main categories in datasets
Ø ASK SPARQL queries on predefined categories
Ø Uses well-known vocabularies in LOV
Ø  Condition the type of visual elements
Detection
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
42
Experiment: Categories Detection
2015/04/10
Category Number %
GEO DATA 97 21.84%
EVENT DATA 16 3.60%
TIME DATA 27 6.08%
SKOS DATA 02 0.45%
ORG DATA 48 10.81%
PERSON DATA 59 13.28%
STAT DATA 29 6.6%
§  Applications
Ø  Automatic detection of
endpoints categories
Ø  More “trustable” than human
tagging
Ø  Map categories detected with
“suitable” visual elements for
the visualizations (e.g.,
TimeLine + maps for events
data)
(*) All the endpoints retrieved from sparqles.org
Detection
Ø  444 endpoints (*) analyzed, 278
good answers (62.61%) using
ASK queries.
Ø  Few taxonomies in SKOS, many
GEO DATA
§  Build candidate properties for visualization
Ø For pop-up menus
Ø For facet browsing
Ø For charts display
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
43
Step 2: Properties Aggregation
2015/04/10
§  Retrieve properties from external datasets
Ø So called “enriched properties”
Detection Aggregation
§  Goal: Exploit the “connectors” between
graphs
§  “connectors” are used to enrich a given graph
Ø e.g., owl:sameAs ; rdfs:seeAlso,
skos:exactMatch
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
44
Step 3: Publication
2015/04/10
§  Visualization Generator
Ø Recommend the visual elements based on categories
Ø Transform ASK queries to SELECT or CONSTRUCT
queries for input to visual library.
Detection Aggregation Publication
§  Visualization Publisher
Ø Export the description of visualization in RDF
Ø Add metadata for the visualization (charts) and steps
used to create it.
Ø e.g., dcat:Dataset, prov:wasDerivedFrom,
chart vocabulary, void:ExampleResource.
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
45
Current Implementation
2015/04/10
§  Javascript light version as “proof-of-concept”
§  Url: http://semantics.eurecom.fr/datalift/rdfViz/apps/
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
46
Part 3
Best Practices for metadata
in vocabularies
2015/04/10
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
47
W3C Govn’t LD Best Practices
2015/04/10
§  10 best practices to help government worldwide to
access and reuse their by taking benefit of Linked Data
mechanism.
§  4 steps to rich 5★ ratings datasets in TimBL scale.
Dataset
selection
Dataset
preparation
Dataset
conversion
in RDF
Dataset
publication and
advertisement
Four steps corresponding to the best practices
to publish Linked Data by Governments.
(1) (2)
(3)
(4)
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
48
LOV in a Nutshell : http://lov.okfn.org/dataset/lov/
2015/04/10
§  A curated list of vocabularies
Ø  More than 495 vocabularies
Ø  Each of them described by vocabulary-
of-a-friend (voaf)
Ø  Provide a dump in N3 of the different
versions of a vocabulary
Ø  Quasi linearity of the growth, started with
75 vocabularies
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
49
Focus on vocabularies: Disambiguating Vocabulary Prefixes
2015/04/10
Goal:
align services against Linked Open Vocabularies to
harmonize and manage vocabularies’ namespaces
§  Global namespaces
Ø With good practices to
recommend a prefix
Ø Have a more transparent list
of built-in prefixes
Ø All the services understand
each other with prefixes
Ø Some de facto prefixes
emerging: rdfs:, foaf:, rdf:,
owl:, skos:,
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
50
LOV vs PREFIX.CC-Alignment Findings
2015/04/10
14%	
  
11%	
  
19%	
  
56%	
  
Findings	
  during	
  alignment	
  process	
  
lov-­‐able	
  vocabs	
  
Intersect-­‐prefixes	
  
vocabs	
  in	
  LOV	
  
vocabs	
  in	
  prefix.cc	
  
Category Number
lov-able vocabs	
   227	
  
Intersect-prefixes	
   188	
  
vocabs in LOV	
   321	
  
vocabs in prefix.cc	
   925	
  
More than 200 prefixes in
prefix.cc are vocabularies
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
51
Vocabulary Search and Ranking
2015/04/10
Goal
Ranking vocabularies based on Information
Content Metrics
§  Metrics
Ø Information Content Metric (IC): value of
information associated with a given entity
Ø Partition Information Content Metric (PIC)
Ø Proposed a ranking based on IC and PIC
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
52
Ranking Algorithm
2015/04/10
Output ranking
Compute PIC score
Compute IC score
Grouping terms by namespace &
weight assignment
Candidate terms selection in LOV
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
53
Ranking Algorithm
2015/04/10
§  dcterms:
http://purl.org/dc/terms/
§  Candidate terms: 53 (39
properties + 14 classes)
§  wf = 1+ 2+3 = 6
§  PIC = 1724.844
§  foaf:
http://xmlns.com/foaf/0.1/
§  Candidate terms: 35 (26
properties + 9 classes)
§  wf = 1+ 2+ 3 = 6
§  PIC = 1033.197
PIC(dcterms) > PIC(foaf)
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
54
Ranking Algorithm
2015/04/10
§  Top-15 terms (IC value) §  Top-15 vocabs (PIC value)
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
55
Applications of the Ranking Metrics
2015/04/10
§  Vocabulary life-cycle management
Ø Help assessing the use of terms and vocabulary updates
Ø Monitoring the use of owl:deprecated or
http://www.w3.org/2003/06/sw-vocab-status/ns#:term_status
§  Semantic Web applications
Ø Vocabularies with higher PIC might be proposed to a user
as much as possible, e.g. for choosing properties to display
in a facetted browsing interface
§  Interlinking datasets
Ø Generate sameAs links between resources based on
vocabularies terms with lower IC value
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
56
Licenses Compatibility
2015/04/10
§  Approach:
Ø  Licenses retrieval both for the
dataset and for the
vocabularies used in the
dataset.
Ø  Deontic logic [1] to compute
compatibility using RDF
representation of licenses
9%#
3%#
47%#
1%#
3%#2%#
3%#
3%#
5%#
5%#
2%#
2%#
9%#
6%#
Crea/ve#Commons#A6ribu/on:
NonCommercial:ShareAlike#3.0#Unported##
Crea/ve#Commons#A6ribu/on:ShareAlike#3.0#
Unported##
Crea/ve#Commons#A6ribu/on#3.0#Unported##
Crea/ve#Commons#Public#Domain#Mark#1.0#
Licence#Ouverte#/#Open#Licence#
Crea/ve#Commons#A6ribu/on:ShareAlike#3.0#
United#States#
Crea/ve#Commons#Zero#Public#Domain#
Dedica/on#
Apache#License#Version#2.0#
ODC#Public#Domain#Dedica/on#and#Licence#
(PDDL)#
ISA#Open#Metadata#License#1.1#
W3C#SoSware#No/ce#and#License#
Crea/ve#Commons#A6ribu/on:ShareAlike#3.0#
United#States#
Crea/ve#Commons#A6ribu/on:NoDerivs#3.0#
Unported##
Crea/ve#Commons#A6ribu/on:
NonCommercial:ShareAlike#2.0#Generic#
Goal
Reasoning on Licenses for checking compatibilities
between vocabularies and datasets
[1] Governatori, Guido, et al. "One License to Compose Them All." The Semantic Web–ISWC 2013.
Springer Berlin Heidelberg, 2013. 151-166.
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
57
Our Proposal: LIVE Framework
2015/04/10
LOV
Licenses
retrieval
module
Licenses
compatibility
module
Check consistency of
licensing information
for dataset D
dataset D
retrieve vocabularies
used in the dataset
retrieve licenses
for selected vocabularies
vocabularies and data
licenses
LIVE framework
Warning: licenses are
not compatible
http://www.eurecom.fr/~atemezin/licenseChecker/
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
58
The Logic
2015/04/10
§  RDF licenses: http://purl.org/NET/rdflicense
§  Logic of deontic rules:
Ø constructive account of basic deontic modalities (obligation,
prohibition, permission)
Ø compute the set of all conclusions for each license and then
check whether incompatible conclusions are obtained.
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
59
Live Evaluation
2015/04/10
LIVE provides the compatibility in less than 5 seconds for 7 datasets
LIVE retrieves 48 vocabularies in less than 14 seconds
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
60
Conclusions
2015/04/10
§  We have contributed in managing vocabulary metadata
Ø  Disambiguating prefixes between vocabularies
Ø  Improving vocabulary search and ranking
Ø  Providing a license compatibility framework
Ø  Contributing to Best Practices standard
§  We have presented models for Geodata
Ø  For better handling complex geometries
Ø  Supporting *all* CRS
Ø  For easy querying in SPARQL
§  We have proposed a generic visualization wizard
Ø  Based on predefined categories
Ø  Targeted to lay-users
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
61
Future Work
2015/04/10
§  Managing Data updates and versioning
Ø  Versioning: integrate Memento protocol?
Ø  Spatio-temporal evolution
§  Multiple representation: need for metadata?
Ø  Level of detail
Ø  Geometry modeling rules and reasoning
§  Tracking Provenance of geodata
Ø  To ensure quality of published dataset
Ø  To ensure trust from application consumers
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
62
Future Work
2015/04/10
§  Visualizations
Ø  Extend categories and vocabularies for detection
Ø  Provide templates for generating “mash-ups” to combine
domains, an mash-up widget generator
Ø  Investigate the “importance” of a category in dataset
Ø  Provide a user evaluation
§  Metadata management
Ø  Publish a list of common recommended prefixes
Ø  Foster and support current effort towards a more sustainable
governance of vocabularies.
Ø  Compare (P)IC with other graph-based ranking (e.g. pagerank)
Ø  Investigate the dependency ranking between vocabularies
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
63
Publications
2015/04/10
§  Pierre-Yves Vandenbussche, G.A. Atemezing, Maria Poveda, Bernard Vatant:
Linked Open Vocabularies (LOV): a gateway to reusable semantic vocabularies
on the Web. Semantic Web Journal, under review, 2015.
§  G.A. Atemezing, Raphael Troncy: Modeling visualization tools and applications
on the Web. Semantic Web Journal, under review, 2015.
§  G.A. Atemezing et al.: Transforming meteorological data into linked data. In
Semantic Web journal, Special Issue on Linked Dataset descriptions, 2012. IOS
Press.
§  Guido Governatori, Ho-Pun Lam, Antonino Rotolo, Serena Villata, G.A.
Atemezing and Fabien Gandon: LIVE: a Tool for Checking Licenses
Compatibility between Vocabularies and Data.(ISWC 2014, Demo Track)
§  G.A. Atemezing and Raphael Troncy: Information content based ranking metric
for linked open vocabularies. (SEMANTICS 2014)
§  Ahmad Assaf, G.A. Atemezing, Raphael Troncy and Elena Cabrio: What are the
important properties of an entity? Comparing users and knowledge graph point
of view. In 11th Extended Semantic Web Conference (Demo Track, ESWC 2014)
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
64
Publications
2015/04/10
§  Francois Scharffe, G.A Atemezing, Raphael Troncy, Fabien Gandon, Serena
Villata, Bénédicte Bucher, Faycal Hamdi, Laurent Bihanic, Gabriel Kepeklian,
Franck Cotton, Jerome Euzenat, Zhengjie Fan, Pierre-Yves Vandenbussche
and Bernard Vatant: Enabling linked-data publication with the datalift platform.
(AAAI, W10:Semantic Cities, 2012)
§  G.A. Atemezing and Raphael Troncy: Vers une meilleure interopérabilité des
donneés geographiques francaises sur le Web de donneés. (IC 2012).
§  Houda Khrouf, G.A. Atemezing, Thomas Steiner, Giuseppe Rizzo and Raphael
Troncy: Confomaton: A conference enhancer with social media from the cloud.
(ESWC 2012, Demo Track)
§  Bernadette Hyland, G.A. Atemezing and Boris Villazón-Terrazas (editors): Best
Practices for Publishing Linked Data. W3C Working Group Note pub- lished on
January 9, 2014. URL: http://www.w3.org/TR/ld-bp/
§  Bernadette Hyland, G.A Atemezing, Michael Pendleton, Biplav Srivastava
(editors): Linked Data Glossary. W3C Working Group Note published on June
27, 2013. URL: www.w3.org/TR/ld-glossary/
Thank you
for your attention!
Credits layout
Mariella Sabatino: mll.sabatino@gmail.com

Más contenido relacionado

La actualidad más candente

Slides, thesis dissertation defense, deep generative neural networks for nove...
Slides, thesis dissertation defense, deep generative neural networks for nove...Slides, thesis dissertation defense, deep generative neural networks for nove...
Slides, thesis dissertation defense, deep generative neural networks for nove...
mehdi Cherti
 

La actualidad más candente (20)

Hierarchical Clustering
Hierarchical ClusteringHierarchical Clustering
Hierarchical Clustering
 
Bayesian Deep Learning
Bayesian Deep LearningBayesian Deep Learning
Bayesian Deep Learning
 
Machine learning clustering
Machine learning clusteringMachine learning clustering
Machine learning clustering
 
Gnn overview
Gnn overviewGnn overview
Gnn overview
 
Slides, thesis dissertation defense, deep generative neural networks for nove...
Slides, thesis dissertation defense, deep generative neural networks for nove...Slides, thesis dissertation defense, deep generative neural networks for nove...
Slides, thesis dissertation defense, deep generative neural networks for nove...
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
 
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
 
Autoencoders in Deep Learning
Autoencoders in Deep LearningAutoencoders in Deep Learning
Autoencoders in Deep Learning
 
Graph Neural Network - Introduction
Graph Neural Network - IntroductionGraph Neural Network - Introduction
Graph Neural Network - Introduction
 
Introduction to Graph neural networks @ Vienna Deep Learning meetup
Introduction to Graph neural networks @  Vienna Deep Learning meetupIntroduction to Graph neural networks @  Vienna Deep Learning meetup
Introduction to Graph neural networks @ Vienna Deep Learning meetup
 
Density based clustering
Density based clusteringDensity based clustering
Density based clustering
 
Andrew NG machine learning
Andrew NG machine learningAndrew NG machine learning
Andrew NG machine learning
 
Deep Belief Networks
Deep Belief NetworksDeep Belief Networks
Deep Belief Networks
 
Deep Learning for Graphs
Deep Learning for GraphsDeep Learning for Graphs
Deep Learning for Graphs
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
 
Graph R-CNN for Scene Graph Generation
Graph R-CNN for Scene Graph GenerationGraph R-CNN for Scene Graph Generation
Graph R-CNN for Scene Graph Generation
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and Classification
 
Master's Thesis Presentation
Master's Thesis PresentationMaster's Thesis Presentation
Master's Thesis Presentation
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications
 
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
 

Destacado

PhD Dissertation Presentation v 2.4
PhD Dissertation Presentation v 2.4PhD Dissertation Presentation v 2.4
PhD Dissertation Presentation v 2.4
Rob Hickey
 

Destacado (8)

Review session2
Review session2Review session2
Review session2
 
Phd Slides
Phd SlidesPhd Slides
Phd Slides
 
PhD Slides
PhD SlidesPhD Slides
PhD Slides
 
Identifying, annotating, and filtering arguments and opinions on the social w...
Identifying, annotating, and filtering arguments and opinions on the social w...Identifying, annotating, and filtering arguments and opinions on the social w...
Identifying, annotating, and filtering arguments and opinions on the social w...
 
PowerPoint Presentation Sample # 1
PowerPoint Presentation Sample # 1PowerPoint Presentation Sample # 1
PowerPoint Presentation Sample # 1
 
PhD Dissertation Presentation v 2.4
PhD Dissertation Presentation v 2.4PhD Dissertation Presentation v 2.4
PhD Dissertation Presentation v 2.4
 
Home automation using android mobiles
Home automation using android mobilesHome automation using android mobiles
Home automation using android mobiles
 
What REALLY Differentiates The Best Content Marketers From The Rest
What REALLY Differentiates The Best Content Marketers From The RestWhat REALLY Differentiates The Best Content Marketers From The Rest
What REALLY Differentiates The Best Content Marketers From The Rest
 

Similar a Phd defense slides

Web Mapping with Drupal
Web Mapping with DrupalWeb Mapping with Drupal
Web Mapping with Drupal
Ranel Padon
 
2014 05-gwf geo-appathon_ogc_a_trakas
2014 05-gwf geo-appathon_ogc_a_trakas2014 05-gwf geo-appathon_ogc_a_trakas
2014 05-gwf geo-appathon_ogc_a_trakas
dboi10
 
Integrating GIS utility data in the UK
Integrating GIS utility data in the UKIntegrating GIS utility data in the UK
Integrating GIS utility data in the UK
AntArch
 
SDSC Technology Forum: Increasing the Impact of High Resolution Topography Da...
SDSC Technology Forum: Increasing the Impact of High Resolution Topography Da...SDSC Technology Forum: Increasing the Impact of High Resolution Topography Da...
SDSC Technology Forum: Increasing the Impact of High Resolution Topography Da...
OpenTopography Facility
 

Similar a Phd defense slides (20)

Big Data for Local Context
Big Data for Local ContextBig Data for Local Context
Big Data for Local Context
 
2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...
2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...
2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...
 
Big Geo Data: Open Source and Open Standards
Big Geo Data: Open Source and Open StandardsBig Geo Data: Open Source and Open Standards
Big Geo Data: Open Source and Open Standards
 
FreeGIS.net, INSPIRE, Open Source Software and OGC standards
FreeGIS.net, INSPIRE, Open Source Software and OGC standardsFreeGIS.net, INSPIRE, Open Source Software and OGC standards
FreeGIS.net, INSPIRE, Open Source Software and OGC standards
 
Analysis Ready Data workshop - OGC presentation
Analysis Ready Data workshop - OGC presentation Analysis Ready Data workshop - OGC presentation
Analysis Ready Data workshop - OGC presentation
 
Geospatial Metadata and Spatial Data: It's all Greek to me!
Geospatial Metadata and Spatial Data: It's all Greek to me!Geospatial Metadata and Spatial Data: It's all Greek to me!
Geospatial Metadata and Spatial Data: It's all Greek to me!
 
Application packaging and systematic processing in earth observation exploita...
Application packaging and systematic processing in earth observation exploita...Application packaging and systematic processing in earth observation exploita...
Application packaging and systematic processing in earth observation exploita...
 
Web Mapping with Drupal
Web Mapping with DrupalWeb Mapping with Drupal
Web Mapping with Drupal
 
OGC Update for State of Geospatial Tech at T-Rex
OGC Update for State of Geospatial Tech at T-RexOGC Update for State of Geospatial Tech at T-Rex
OGC Update for State of Geospatial Tech at T-Rex
 
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD CloudAnalyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
 
2014 05-gwf geo-appathon_ogc_a_trakas
2014 05-gwf geo-appathon_ogc_a_trakas2014 05-gwf geo-appathon_ogc_a_trakas
2014 05-gwf geo-appathon_ogc_a_trakas
 
BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...
BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...
BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ...
 
From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...
From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...
From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...
 
Using R to Visualize Spatial Data: R as GIS - Guy Lansley
Using R to Visualize Spatial Data: R as GIS - Guy LansleyUsing R to Visualize Spatial Data: R as GIS - Guy Lansley
Using R to Visualize Spatial Data: R as GIS - Guy Lansley
 
Integrating GIS utility data in the UK
Integrating GIS utility data in the UKIntegrating GIS utility data in the UK
Integrating GIS utility data in the UK
 
SDSC Technology Forum: Increasing the Impact of High Resolution Topography Da...
SDSC Technology Forum: Increasing the Impact of High Resolution Topography Da...SDSC Technology Forum: Increasing the Impact of High Resolution Topography Da...
SDSC Technology Forum: Increasing the Impact of High Resolution Topography Da...
 
Ifgi presentation
Ifgi presentationIfgi presentation
Ifgi presentation
 
LPIS QA formalises mainstream GI in the CAP
LPIS QA formalises mainstream GI  in the CAPLPIS QA formalises mainstream GI  in the CAP
LPIS QA formalises mainstream GI in the CAP
 
Golden Age of Geospatial Data Science
Golden Age of Geospatial Data ScienceGolden Age of Geospatial Data Science
Golden Age of Geospatial Data Science
 
The development of a Geographic Information System for traffic route planni...
The development of a  Geographic  Information System for traffic route planni...The development of a  Geographic  Information System for traffic route planni...
The development of a Geographic Information System for traffic route planni...
 

Más de Ghislain Atemezing

Más de Ghislain Atemezing (10)

Trends on Data Graphs & Security for the Internet of Things
Trends on Data Graphs & Security for the Internet of ThingsTrends on Data Graphs & Security for the Internet of Things
Trends on Data Graphs & Security for the Internet of Things
 
Big Data & Taxonomies for Actionable Intelligence
Big Data & Taxonomies for Actionable IntelligenceBig Data & Taxonomies for Actionable Intelligence
Big Data & Taxonomies for Actionable Intelligence
 
Benchmarking Commercial RDF Stores with Publications Office Dataset
Benchmarking Commercial RDF Stores with Publications Office DatasetBenchmarking Commercial RDF Stores with Publications Office Dataset
Benchmarking Commercial RDF Stores with Publications Office Dataset
 
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and DataLIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data
 
publishing-ign-data
 publishing-ign-data publishing-ign-data
publishing-ign-data
 
cold2014-ldvizwiz
cold2014-ldvizwizcold2014-ldvizwiz
cold2014-ldvizwiz
 
Information Content based Ranking Metric for Linked Open Vocabularies
Information Content based Ranking Metric for Linked Open VocabulariesInformation Content based Ranking Metric for Linked Open Vocabularies
Information Content based Ranking Metric for Linked Open Vocabularies
 
Harmonizing services for LOD vocabularies: a case study
Harmonizing services for LOD vocabularies: a case studyHarmonizing services for LOD vocabularies: a case study
Harmonizing services for LOD vocabularies: a case study
 
Visualisation and linked data applications edf 2013
Visualisation and linked data applications edf 2013Visualisation and linked data applications edf 2013
Visualisation and linked data applications edf 2013
 
Comparing Vocabularies for Representing Geographical Features and Their Geometry
Comparing Vocabularies for Representing Geographical Features and Their GeometryComparing Vocabularies for Representing Geographical Features and Their Geometry
Comparing Vocabularies for Representing Geographical Features and Their Geometry
 

Último

DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
MayuraD1
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
Kamal Acharya
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
Neometrix_Engineering_Pvt_Ltd
 

Último (20)

DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to Computers
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
 
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxOrlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech students
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.ppt
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 

Phd defense slides

  • 1. Publishing and Consuming Geo- spatial and Government Data on the Semantic Web Ghislain Auguste Atemezing Multimedia Department, Eurecom Supervisor: Dr. Raphaël Troncy, MM Department, Eurecom
  • 2. §  Open Government Data benefits Ø Transparency in decision making Ø Better governance or e-governance Ø Eco-system of added value application §  Barriers and challenges Ø  Heterogeneity of data formats Ø Variety of access method Ø Lack of nomenclature G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 2 Thesis Context In this thesis we explore how Semantic Web technologies can be used for better integration and consumption of geo- spatial data.
  • 3. §  Introduction §  Research Questions §  Contributions Ø Semantic publishing of geospatial data Ø Visualizations of Government Linked Data Ø Best Practices for metadata in vocabularies §  Conclusion §  Publications G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 3 Outline
  • 4. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 4 Geospatial data: why it matters “80% of needs for decisions from public authorities have a geospatial component”. (Philippe Grelot, IGN-France)
  • 5. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 5 GeoData on the LOD Cloud http://lod-cloud.net/versions/2014-08-30/ Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/ In 2011 19,43% à31 geo-datasets in LOD http://lod-cloud.net/state/
  • 6. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 6 French IGN- Reference Geodata « ..describes the French national territory and the occupation of its land, elaborates and updates perpetual inventory of the forest resources » ü  Different databases with overlapping information: BD ORTHO, BD PARCELLAIRE, POINT ADRESSE, BD ALTI 25m, BD TOPO; etc. ü  CRS: LAMBERT93 or RGF93 Q: ”Give me all the bridges in a radius of 2km from the "Eiffel Tower“? A: Not straightforward
  • 7. §  How to efficiently represent and store geospatial data on the Web to ensure interoperable applications? Ø  the publication of real datasets Ø  the interlinking with other datasets having overlapping coverage §  What are the best options for a user to discover, browse and interact with semantic content? Ø  Can we propose a generic model for visualizing semantic content? §  What are the mechanisms to help preserving structured data of a high quality on the Web? Ø  How to improve reusing vocabularies for better interoperability Ø  How to detect incompatibility between data and metadata G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 7 Research Questions 2015/04/10
  • 8. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 8 Part 1 Semantic Publication of Geo-spatial Data 2015/04/10
  • 9. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 9 Modeling geospatial objects 2015/04/10 3 main components: §  CRS §  Features §  Geometry
  • 10. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 10 Coordinate Reference Systems (CRS) 2015/04/10 §  A representation of the locations of geographic features within a common geographic framework §  Each CRS is defined by its measurement framework Ø  Geographic: spherical coordinates (unit: decimal degrees) Ø  Planimetric: 2D planar surface (unit: meters) + a map projection Ø  Additional measurement properties: ellipsoid, datum, standard parallels, central meridians, etc. §  Situation: Ø  Several hundred Geographical Coordinate Systems (WGS84, ETRS89, etc.) Ø  Several thousands Projected Coordinate Systems (UTM, Lambert93, etc.), http://resources.esri.com/help/9.3/arcgisserver/apis/rest/pcs.html
  • 11. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 11 CRS in France (Metropolitan + overseas) 2015/04/10 §  10 CRSs coverage (mostly RGF93) §  10 projections (mostly used Lambert93) §  3 different ellipsoids to define the CRS The Web only used WGS84 Publishers need converters for their geodata
  • 12. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 12 CRS Converters and Limitations 2015/04/10 §  Circé: A Converter Software from IGN Ø  Allows conversion between some France CRSs to WGS84 Ø  Closed Source, No open service §  Existing Web tools (e.g., world coordinate converter) Ø  No open service, closed source Ø  Converting algorithms don’t usually map to any national mapping authority. §  Contribution: a Web based service to perform conversion between various CRSs. Ø  The integration of such a converter can ease the process of the publishing different types of geometries on the web regardless of their Coordinates systems.
  • 13. §  WGS 84 <–> Lambert 93 §  WGS 84 <–> UTM G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 13 Developing a REST converter service 2015/04/10 •  Many regional's’ published data involve with local CRS International Communication •  Interpret the coordinates between these CRSs A medium •  Open services for community •  RESTFul Web Service A Converter Service GOOD accuracy of the results
  • 14. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 14 Vocabularies for Modeling Features 2015/04/10 §  Authority list of terms (e.g. Foursquare) Ø No semantics at all! §  SKOS Categories (e.g. GeoNames) Ø Classes are skos:conceptScheme, codes are skos:Concept §  Domain specific ontologies (OrdS, GeoLD) Ø Interconnected subdomain ontologies (transport, hydrography, etc.) §  Data driven ontologies (LGD, GeOnto) Ø Deeper taxonomy to structure the ontology Ø Many classes
  • 15. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 15 Modeling Geometry: State of the Art 2015/04/10 §  Point (lat/long) Ø  WGS 84 vocabulary described by W3C §  Rectangle (“bounding box”) Ø  Geopolitical Vocabulary (FAO) §  Points in a List Ø  Sequence of points (LinkedGeoData) Ø  An object is “formedBy” a ListOfPoints (GeoLinkedData.es) §  Literals WKT datatype in RDF Ø  Ordnance Survey (UK), GeoSPARQL embedding CRS in literals §  More structured representation of complex geometry Ø  NeoGeo Vocabulary (GeoVocamp), http://geovocab.org/
  • 16. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 16 Reusing Existing Ontologies (GeOnto) 2015/04/10 §  Ontology for geographic objects (POI) Ø Output of a French (ANR) research project Ø Obtained from NLP tools §  Classes in French Ø rdfs:labels in FR & EN Ø No rdfs:comments Ø Few owl:ObjectProperty Ø 783 classes §  Overlap with other vocabs Ø Need for alignment
  • 17. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 17 Aligning GeOnto with existing Ontologies 2015/04/10 §  Alignment of GeOnto with 5 ontologies and 2 simple taxonomies Ø  LGD, DBpedia, Schema.org, GeoNames, bdtopo Ø  Foursquare, Google Places §  Goal: finding owl:equivalentClass Ø  Tool : Silk framework Ø  Metrics : LevenshteinDistance, Jaro Ø  Labels : @en des classes Ø  Aggregation Function: Mean §  Manual validation Ø  For « rdfs:subClassOf » Ø  Specific alignments with GeoNames codes
  • 18. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 18 Alignment Process (GeoNames)– Results 2015/04/10 § High precisions > 80% § BUT P(Schema.org) = 50% Silk •  Look for skos codes that matches GeOnto classes •  Verify the links <70% •  Generate « sameAs » links SPARL Endpoint •  Use SPARQL «Construct» to generate a new graph. Alignment File •  Export the RDF file Vocab/ taxonomies #Classes #Classes aligned Bdtopo 237 153 (64.65%) GeoNames 699 287 (41.06%) Google Place (*) 126 41 (32.54%) Schema.org (*) 296 52 (17.57%) LGD 1294 178 (13.76%) Foursquare 359 46 (12.81%) DBpedia 366 42 (11.48%) GeOnto entities are more specific to France
  • 19. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 19 Proposal: CheckList for modeling geodata 2015/04/10 §  Complex Geometry Coverage Ø  Need to publish more data with complex geometries Ø  Reuse and extend suitable ontologies (NeoGeo, GeoSPARQL) §  Features MUST be connected to Geometry Ø  Sometimes it may requires two namespaces §  Serialization in other GIS formats Ø  Provide serialization in other GIS formats (GML, WKT, KML, etc.) §  Structured Representation Ø  Use of structured representation for complex geometry Ø  This covers some of the Use Cases at IGN
  • 20. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 20 Proposal: Vocabulary for complex Geometry 2015/04/10 §  Extend and reuse existing vocabularies §  Available at http://data.ignf.fr/def/geometrie @prefix ngeo: <http://geovocab.org/geometry#>. @prefix sf: <http://www.opengis.net/ont/sf#>. […] geom:Geometry a owl:Class; rdfs:comment "Primitive géométrique non instanciable, racine de l'ontologie des primitives géométriques. Une géométrie est associée à un système de coordonnées et un seul."@fr; rdfs:label "Géométrie"@fr, "Geometry"@en; owl:equivalentClass [ a owl:Restriction; owl:onClass ignf:CoordinatesSystem; owl:onProperty geom:crs; owl:qualifiedCardinality "1"^^xsd:nonNegativeInteger]; rdfs:subClassOf ngeo:Geometry; rdfs:subClassOf sf:Geometry.
  • 21. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 21 Associating CRS to Geometries 2015/04/10 §  A vocabulary for describing CRS Ø  Subset of ISO 19111 model Ø  Available at http://data.ignf.fr/def/ingf §  A dataset for French CRS Ø  Convert from XML data published by IGN France to RDF Ø  Eg: “Lambert 2 étendu” http://data.ign.fr/id/ignf/crs/NTFLAMB2E
  • 22. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 22 Overview of the vocabularies and relationships 2015/04/10
  • 23. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 23 A workflow for publishing Linked(Geo)Data 2015/04/10 § DATALIFT: a set of modular tools for “lifting” raw data in RDF.
  • 24. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 24 Publishing French Reference Geodata in RDF 2015/04/10 §  GEOFLA® DATASET ON FRENCH ADMINISTRATIVE UNITS IN RDF Ø  Features are of type http://data.ign.fr/geofla §  FRENCH GAZETTEER DATASET Ø  Features are of type http://data.ign.fr/topo Mapping results with external dataset DATASET GEOFLA DATASET #mappings Tool INSEE 37,020 SILK DBPEDIA-FR 23, 252 (communes) and 93 departments LIMES NUTS 105 links (14 comm., 75 depts, 16 regions) SILK GADM 70 links (10 comm., 51 depts, 9 regions) SILK FRENCH GAZETTEER LINKED GEODATA 654 links (lgdo:Amenity) LIMES
  • 25. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 25 French Open Addresses in RDF Mappings 2015/04/10 BANO2RDF LGData Amenities Paris (248052) Marseille (401404) Lyon (89061) #matched   %matched   #matched   %matched   #matched   %matched   Shop (778680) 21171   2.71   8556   1.098   3049   0.391   Restaurant (260675) 13567   5.204   2654   1.018   1882   0.721   PostOffice (87731) 971   1.106   555   0.632   173   0.197   School (318287) 883   0.277   411   0.129   197   0.061   Parking (250516) 735   0.293   625   0.24   210   0.083   PlaceOfWorship (357445) 272   0.076   193   0.053   31   0.008   PublicBuilding (26735) 97   0.362   64   0.239   21   0.078   Building (22283) 5   0.022   12   0.053   0   0  
  • 26. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 26 Contribution to the LOD Cloud (FrLOD) 2015/04/10 §  340 millions of triples in RDF (10% of DBPedia 2014) §  Part of our work is under reused by the W3C/OGC Spatial Data Working Group
  • 27. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 27 Why Visualizations Matters 2015/04/10 “Don’t ask what you can do for the Semantic Web; ask what The Semantic Web can do for you!” (D. Karger, MIT CSAIL) 1.  How to build bridge to fill the gap between traditional InfoVis tools and Semantic Web technologies 2.  How can Semantic Web help in visualization? “If you use our Linked Data, please let us know, or we might switch off!” (Ordnance Survey)
  • 28. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 28 Part 2 Generating Visualizations For Linked Data 2015/04/10
  • 29. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 29 Visualization Categories in Government Portals 2015/04/10 §  Study of applications consuming Open Data Ø 13 applications from UK (7), USA (3) and France (3) Ø Domain: education, health, transport, government, city, housing, criminality, foreign aid §  Different dimensions Ø Platform (web, mobile), data sources, which views are available (maps, charts, timeline, etc.) Ø URL policy for identifying data objects Ø Licenses for the application / for the data Ø Commercial / non-commercial
  • 30. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 30 Relevant Features in Visualization Tools 2015/04/10 §  Data format given as input (csv, xml, shp, rdf, etc.) §  Data access (API, dump, etc.) §  Language code §  Type of view §  External Libraries §  License §  Metadata: author, organization
  • 31. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 31 Describing an Application: Opendatacom 2015/04/10 Scope/Domain: Department for Communities and Local Government, datasets access Description: visualize available datasets (finance, housing, deprivation, geography) by authorities or postcode. On the dashboard, it provides graphs showing the national distribution of a district and how the values for this local authority compare with others in England. Supported Platform: Web URL Policy: http://{domain}/id/{...} with redirection to the corresponding document at: http://{domain}/doc/{...}. Hampshire County Council is: http://opendatacommunities.org/id/county-council/hampshire Data Sources: 36 datasets from DCLG, Administrative Geography and Postcodes from Ordnance Survey. Type of View: Graph, Map views. Visualization Tools: google visualization API, raphael.js License: Open Government license [OGL] Business Value: Non commercial
  • 32. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 32 DVIA: A vocabulary to describe Applications 2015/04/10 dvia: Application dct:title dvia:description dvia:keyword dvia:url dct:creator dvia:businessValue dvia:scope dvia:view dctype:Software dvia: Platform dct:title dvia:system dvia:preferredNavi gator dvia:alternativeNav igator dvia: VisualTool dct:title dct:description dvia:accessUrl dvia: downloadUrl dcat: Dataset dct: title dcat: accessURL dct:references dcat: keyword org:Organization dvia:consumes dvia:platform Prefixes: @prefix dct: <http://purl.org/dc/terms/>. @prefix dcat: <http://www.w3.org/ns/dcat#>. @prefix dctype: <http://purl.org/dc/dcmitype/>. @prefix org: <http://www.w3.org/ns/org#>. @prefix dvia: <http://data.eurecom.fr/ontology/dvia#>.
  • 33. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 33 DVIA in Real World Datasets 2015/04/10 §  4 applications re-using DVIA Ø Use to populate 22 past events of hack-athons in Europe, with 889 applications from Apps4Europe. a. The  script  will  fetch  the  event  or  application  information  using   an  AJAX-­‐request.   b. After  the  server  has  responded  with  the  event/application   information,  the  information  is  displayed  on  the  page  in  human   readable  form  by  modifying  the  DOM  of  the  page.  Furthermore,  the   same  information  is  presented  also  in  computer  readable  format  by   embedding  RDFa  in  the  DOM.     The embeddable script is non-­optimal in the sense that the event or application information is                               loaded only after the actual page (where the script is embedded) has loaded. This causes some                                 additional delay before the page is fully rendered, but the problem can be alleviated by showing                                 loading  indicators.     CSS styling for the events and applications is provided using Bootstrap. All CSS rules have been                                 made specific to the container div-­element of the plugin, in order to prevent the CSS from                                 conflicting with the CSS of the page. In the future, another solution called shadow DOM could be                                   used to prevent conflicts between different components of the page, but at the moment the                               browser support is not satisfactory. Since the event and application information is directly                          29 embedded on the page using DOM, the event organizer can add his own CSS styling to the                                   plugin  by  overriding  the  provided  CSS  rules.       Fig  9.  A  screenshot  of  a  test  event  displayed  on  an  event  organizer’s  page.  The  content  inside   the  red  square  is  injected  after  page  load  by  the  embeddable  script.  Note!  The  red  square  is  not   visible  in  the  actual  page,  it  was  added  only  for  visualization  purposes.   Ø Implementation of a universal JavaScript plugin to embed RDFa in organizers events Ø An extension for Wordpress uses DVIA.
  • 34. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 34 Developing Web Applications: from specific to generic approach 2015/04/10 §  Scenario 1 Ø  Known Datasets, Known vocabularies à Specific SPARQL queries Ø  Visualizations: dataset specific §  Example Ø  Datasets on schools in France Ø  Vocabularies: geo vocab, data cube, geometrie, Ø  Application: PerfectSchool
  • 35. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 35 Developing Web Applications: from specific to generic approach 2015/04/10 §  Scenario 2 Ø  Unknown Datasets, Known domains, so domain- specific SPARQL queries Ø  Visualizations: domain specific §  Example Ø  Endpoints of geodatasets Ø  Domain: geospatial Ø  Application: GeoRDFviz
  • 36. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 36 Developing Web Applications: from specific to generic approach 2015/04/10 §  Scenario 3 Ø  Unknown Datasets, Unknown domains, so generic SPARQL queries Ø  Visualizations: adapted to domains specific Ø  Any endpoints Ø  Multiple domains: geodata, statistics, persons, cross- domains, etc.. Ø  Application: ????? Related work on configuring Semantic Web widgets by data mapping. [1] Application: Efficient search for Semantic News demonstrator in Cultural Heritage Dataset Tool: ClioPatria [1] Hildebrand, Michiel, and Jacco Van Ossenbruggen. "Configuring semantic web interfaces by data mapping." Visual Interfaces to the Social and the Semantic Web (VISSW 2009) 443 (2009): 96. …but “method not apply to create interfaces on top of arbitrary SPARLQ endpoints”
  • 37. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 37 Our Proposal 2015/04/10 Linked Data Vizualization Wizard (LDVizWiz)
  • 38. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 38 Requirements of LDVizWiz (LDViz-”Wise”) 2015/04/10 §  Predefined categories associated to visual elements §  Build on top of RDF standards Ø e.g., SPARQL queries ; Semantic Web technologies §  Reuse existing Visualization libraries Ø e.g., Google Maps, Google Charts, D3.js, etc. §  Reuse On-line Library of Information Visualization (OLIVE) §  Target to non “RDF/SPARQL speakers” §  Input: Datasets published as LOD
  • 39. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 39 Mapping Categories and vocabularies 2015/04/10 §  Geographic information Ø geo: vocab, schema:Place, etc. §  Temporal information Ø Time, interval ontologies §  Event information Ø lode, event, sport, etc.. §  Agent/Person Ø foaf:Person/foaf:Agent §  Organization information Ø ORG vocabulary, foaf:Organization §  Statistics information Ø Data cube, SDMX model §  Knowledge information Ø Schemas, classifications using SKOS vocabulary
  • 40. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 40 LDVizWiz Workflow 2015/04/10
  • 41. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 41 Step 1: Categories Detection 2015/04/10 §  Detection of main categories in datasets Ø ASK SPARQL queries on predefined categories Ø Uses well-known vocabularies in LOV Ø  Condition the type of visual elements Detection
  • 42. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 42 Experiment: Categories Detection 2015/04/10 Category Number % GEO DATA 97 21.84% EVENT DATA 16 3.60% TIME DATA 27 6.08% SKOS DATA 02 0.45% ORG DATA 48 10.81% PERSON DATA 59 13.28% STAT DATA 29 6.6% §  Applications Ø  Automatic detection of endpoints categories Ø  More “trustable” than human tagging Ø  Map categories detected with “suitable” visual elements for the visualizations (e.g., TimeLine + maps for events data) (*) All the endpoints retrieved from sparqles.org Detection Ø  444 endpoints (*) analyzed, 278 good answers (62.61%) using ASK queries. Ø  Few taxonomies in SKOS, many GEO DATA
  • 43. §  Build candidate properties for visualization Ø For pop-up menus Ø For facet browsing Ø For charts display G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 43 Step 2: Properties Aggregation 2015/04/10 §  Retrieve properties from external datasets Ø So called “enriched properties” Detection Aggregation §  Goal: Exploit the “connectors” between graphs §  “connectors” are used to enrich a given graph Ø e.g., owl:sameAs ; rdfs:seeAlso, skos:exactMatch
  • 44. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 44 Step 3: Publication 2015/04/10 §  Visualization Generator Ø Recommend the visual elements based on categories Ø Transform ASK queries to SELECT or CONSTRUCT queries for input to visual library. Detection Aggregation Publication §  Visualization Publisher Ø Export the description of visualization in RDF Ø Add metadata for the visualization (charts) and steps used to create it. Ø e.g., dcat:Dataset, prov:wasDerivedFrom, chart vocabulary, void:ExampleResource.
  • 45. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 45 Current Implementation 2015/04/10 §  Javascript light version as “proof-of-concept” §  Url: http://semantics.eurecom.fr/datalift/rdfViz/apps/
  • 46. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 46 Part 3 Best Practices for metadata in vocabularies 2015/04/10
  • 47. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 47 W3C Govn’t LD Best Practices 2015/04/10 §  10 best practices to help government worldwide to access and reuse their by taking benefit of Linked Data mechanism. §  4 steps to rich 5★ ratings datasets in TimBL scale. Dataset selection Dataset preparation Dataset conversion in RDF Dataset publication and advertisement Four steps corresponding to the best practices to publish Linked Data by Governments. (1) (2) (3) (4)
  • 48. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 48 LOV in a Nutshell : http://lov.okfn.org/dataset/lov/ 2015/04/10 §  A curated list of vocabularies Ø  More than 495 vocabularies Ø  Each of them described by vocabulary- of-a-friend (voaf) Ø  Provide a dump in N3 of the different versions of a vocabulary Ø  Quasi linearity of the growth, started with 75 vocabularies
  • 49. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 49 Focus on vocabularies: Disambiguating Vocabulary Prefixes 2015/04/10 Goal: align services against Linked Open Vocabularies to harmonize and manage vocabularies’ namespaces §  Global namespaces Ø With good practices to recommend a prefix Ø Have a more transparent list of built-in prefixes Ø All the services understand each other with prefixes Ø Some de facto prefixes emerging: rdfs:, foaf:, rdf:, owl:, skos:,
  • 50. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 50 LOV vs PREFIX.CC-Alignment Findings 2015/04/10 14%   11%   19%   56%   Findings  during  alignment  process   lov-­‐able  vocabs   Intersect-­‐prefixes   vocabs  in  LOV   vocabs  in  prefix.cc   Category Number lov-able vocabs   227   Intersect-prefixes   188   vocabs in LOV   321   vocabs in prefix.cc   925   More than 200 prefixes in prefix.cc are vocabularies
  • 51. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 51 Vocabulary Search and Ranking 2015/04/10 Goal Ranking vocabularies based on Information Content Metrics §  Metrics Ø Information Content Metric (IC): value of information associated with a given entity Ø Partition Information Content Metric (PIC) Ø Proposed a ranking based on IC and PIC
  • 52. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 52 Ranking Algorithm 2015/04/10 Output ranking Compute PIC score Compute IC score Grouping terms by namespace & weight assignment Candidate terms selection in LOV
  • 53. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 53 Ranking Algorithm 2015/04/10 §  dcterms: http://purl.org/dc/terms/ §  Candidate terms: 53 (39 properties + 14 classes) §  wf = 1+ 2+3 = 6 §  PIC = 1724.844 §  foaf: http://xmlns.com/foaf/0.1/ §  Candidate terms: 35 (26 properties + 9 classes) §  wf = 1+ 2+ 3 = 6 §  PIC = 1033.197 PIC(dcterms) > PIC(foaf)
  • 54. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 54 Ranking Algorithm 2015/04/10 §  Top-15 terms (IC value) §  Top-15 vocabs (PIC value)
  • 55. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 55 Applications of the Ranking Metrics 2015/04/10 §  Vocabulary life-cycle management Ø Help assessing the use of terms and vocabulary updates Ø Monitoring the use of owl:deprecated or http://www.w3.org/2003/06/sw-vocab-status/ns#:term_status §  Semantic Web applications Ø Vocabularies with higher PIC might be proposed to a user as much as possible, e.g. for choosing properties to display in a facetted browsing interface §  Interlinking datasets Ø Generate sameAs links between resources based on vocabularies terms with lower IC value
  • 56. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 56 Licenses Compatibility 2015/04/10 §  Approach: Ø  Licenses retrieval both for the dataset and for the vocabularies used in the dataset. Ø  Deontic logic [1] to compute compatibility using RDF representation of licenses 9%# 3%# 47%# 1%# 3%#2%# 3%# 3%# 5%# 5%# 2%# 2%# 9%# 6%# Crea/ve#Commons#A6ribu/on: NonCommercial:ShareAlike#3.0#Unported## Crea/ve#Commons#A6ribu/on:ShareAlike#3.0# Unported## Crea/ve#Commons#A6ribu/on#3.0#Unported## Crea/ve#Commons#Public#Domain#Mark#1.0# Licence#Ouverte#/#Open#Licence# Crea/ve#Commons#A6ribu/on:ShareAlike#3.0# United#States# Crea/ve#Commons#Zero#Public#Domain# Dedica/on# Apache#License#Version#2.0# ODC#Public#Domain#Dedica/on#and#Licence# (PDDL)# ISA#Open#Metadata#License#1.1# W3C#SoSware#No/ce#and#License# Crea/ve#Commons#A6ribu/on:ShareAlike#3.0# United#States# Crea/ve#Commons#A6ribu/on:NoDerivs#3.0# Unported## Crea/ve#Commons#A6ribu/on: NonCommercial:ShareAlike#2.0#Generic# Goal Reasoning on Licenses for checking compatibilities between vocabularies and datasets [1] Governatori, Guido, et al. "One License to Compose Them All." The Semantic Web–ISWC 2013. Springer Berlin Heidelberg, 2013. 151-166.
  • 57. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 57 Our Proposal: LIVE Framework 2015/04/10 LOV Licenses retrieval module Licenses compatibility module Check consistency of licensing information for dataset D dataset D retrieve vocabularies used in the dataset retrieve licenses for selected vocabularies vocabularies and data licenses LIVE framework Warning: licenses are not compatible http://www.eurecom.fr/~atemezin/licenseChecker/
  • 58. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 58 The Logic 2015/04/10 §  RDF licenses: http://purl.org/NET/rdflicense §  Logic of deontic rules: Ø constructive account of basic deontic modalities (obligation, prohibition, permission) Ø compute the set of all conclusions for each license and then check whether incompatible conclusions are obtained.
  • 59. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 59 Live Evaluation 2015/04/10 LIVE provides the compatibility in less than 5 seconds for 7 datasets LIVE retrieves 48 vocabularies in less than 14 seconds
  • 60. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 60 Conclusions 2015/04/10 §  We have contributed in managing vocabulary metadata Ø  Disambiguating prefixes between vocabularies Ø  Improving vocabulary search and ranking Ø  Providing a license compatibility framework Ø  Contributing to Best Practices standard §  We have presented models for Geodata Ø  For better handling complex geometries Ø  Supporting *all* CRS Ø  For easy querying in SPARQL §  We have proposed a generic visualization wizard Ø  Based on predefined categories Ø  Targeted to lay-users
  • 61. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 61 Future Work 2015/04/10 §  Managing Data updates and versioning Ø  Versioning: integrate Memento protocol? Ø  Spatio-temporal evolution §  Multiple representation: need for metadata? Ø  Level of detail Ø  Geometry modeling rules and reasoning §  Tracking Provenance of geodata Ø  To ensure quality of published dataset Ø  To ensure trust from application consumers
  • 62. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 62 Future Work 2015/04/10 §  Visualizations Ø  Extend categories and vocabularies for detection Ø  Provide templates for generating “mash-ups” to combine domains, an mash-up widget generator Ø  Investigate the “importance” of a category in dataset Ø  Provide a user evaluation §  Metadata management Ø  Publish a list of common recommended prefixes Ø  Foster and support current effort towards a more sustainable governance of vocabularies. Ø  Compare (P)IC with other graph-based ranking (e.g. pagerank) Ø  Investigate the dependency ranking between vocabularies
  • 63. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 63 Publications 2015/04/10 §  Pierre-Yves Vandenbussche, G.A. Atemezing, Maria Poveda, Bernard Vatant: Linked Open Vocabularies (LOV): a gateway to reusable semantic vocabularies on the Web. Semantic Web Journal, under review, 2015. §  G.A. Atemezing, Raphael Troncy: Modeling visualization tools and applications on the Web. Semantic Web Journal, under review, 2015. §  G.A. Atemezing et al.: Transforming meteorological data into linked data. In Semantic Web journal, Special Issue on Linked Dataset descriptions, 2012. IOS Press. §  Guido Governatori, Ho-Pun Lam, Antonino Rotolo, Serena Villata, G.A. Atemezing and Fabien Gandon: LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data.(ISWC 2014, Demo Track) §  G.A. Atemezing and Raphael Troncy: Information content based ranking metric for linked open vocabularies. (SEMANTICS 2014) §  Ahmad Assaf, G.A. Atemezing, Raphael Troncy and Elena Cabrio: What are the important properties of an entity? Comparing users and knowledge graph point of view. In 11th Extended Semantic Web Conference (Demo Track, ESWC 2014)
  • 64. G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web 64 Publications 2015/04/10 §  Francois Scharffe, G.A Atemezing, Raphael Troncy, Fabien Gandon, Serena Villata, Bénédicte Bucher, Faycal Hamdi, Laurent Bihanic, Gabriel Kepeklian, Franck Cotton, Jerome Euzenat, Zhengjie Fan, Pierre-Yves Vandenbussche and Bernard Vatant: Enabling linked-data publication with the datalift platform. (AAAI, W10:Semantic Cities, 2012) §  G.A. Atemezing and Raphael Troncy: Vers une meilleure interopérabilité des donneés geographiques francaises sur le Web de donneés. (IC 2012). §  Houda Khrouf, G.A. Atemezing, Thomas Steiner, Giuseppe Rizzo and Raphael Troncy: Confomaton: A conference enhancer with social media from the cloud. (ESWC 2012, Demo Track) §  Bernadette Hyland, G.A. Atemezing and Boris Villazón-Terrazas (editors): Best Practices for Publishing Linked Data. W3C Working Group Note pub- lished on January 9, 2014. URL: http://www.w3.org/TR/ld-bp/ §  Bernadette Hyland, G.A Atemezing, Michael Pendleton, Biplav Srivastava (editors): Linked Data Glossary. W3C Working Group Note published on June 27, 2013. URL: www.w3.org/TR/ld-glossary/
  • 65. Thank you for your attention! Credits layout Mariella Sabatino: mll.sabatino@gmail.com