SlideShare una empresa de Scribd logo
1 de 26
Descargar para leer sin conexión
Comparing Vocabularies for
 Representing Geographical
Features and Their Geometry
 Ghislain Atemezing / @gatemezing
    Raphaël Troncy / @rtroncy
Goal and Research Questions
 Goal: Provide recommendations for the French IGN in
 exposing their GIS database in the Linked Data world
 Study of geo vocabularies in the Web of Data
    LOD Cloud review
    Who are the GeoData providers?
    How features are generally modeled?
    How geometry is generally modeled?
    Illustrative scenario
 Align vocabularies when necessary
 Compare Triple Stores with geospatial indexing


   2012/11/12     5th International Terra Cognita Workshop - ISWC Boston   -2
GeoData: why it matters?

 “80% of needs for decisions from public
 authorities have a geospatial component”.
 (Philippe Grelot, IGN-France)




                                                                        Photo by Vilavelosa on flickr.com


  2012/11/12   5th International Terra Cognita Workshop - ISWC Boston       -3
Feature and Geometry

Term                        Definition                                             Source
                            An abstract representation of a real-world [INSPIRE
                            phenomenon related to a specific location Directive]
Spatial Object =            or geographical area . It should be noted Item 67
(Geographical) Feature      that the term has a different meaning in
                            the ISO 19100 series. It is also
                            synonymous with "(geographic) feature"
                            as used in the ISO 19100 series.




Feature                     A geographical feature, capable of holding NeoGeo
                            spatial relations.                         Vocab
                            A top-level geometry type. This class is
Geometry                    equivalent to the UML class GM_Object GeoSPARQL
                            defined in ISO 19107, and it is superclass [OGC]
                            of all geometry types.


       2012/11/12        5th International Terra Cognita Workshop - ISWC Boston   -4
GeoData on the LOD Cloud
                                                                                             31 datasets
                                                                                             19.43% triples




 http://lod-cloud.net/state
 Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/


     2012/11/12                    5th International Terra Cognita Workshop - ISWC Boston        -5
Where are Geo-Linked Data?
 DBpedia
 GeoNames
                                          Provider                           #Triples
 LinkedGeodata (OSM)
                                          DBpedia                            727 232 triples
 Freebase (Google)                        GeoNames                           5 240 032 (« features »)

 Ordnance Survey (UK), LinkedGeoData                                         60 356 364 triples

 GeoLinkedData (ES)                       Ordnance Survey                    6 295 triples

 GADM-RDF                                 Freebase                           8,5 MB (tsv fichiers)
                                          GeoLinkedData.es                   101 018 triples
 NUTS-RDF
                                          Projet GADM                        682 605 triples
 data.ign.fr (FR)
                                          Projet NUTS                        316 238 triples

   2012/11/12       5th International Terra Cognita Workshop - ISWC Boston         -6
Vocabularies for Space and Geography
                        http://lov.okfn.org/dataset/lov/details/vocabularySpace_Space.html




Only 5 vocabs are reused: W3C Geo (21 datasets), OS spatialrelations (10 datasets),
Geonames (5 datasets), UK administrative (3 datasets) and NeoGeo (3 datasets)

        2012/11/12            5th International Terra Cognita Workshop - ISWC Boston         -7
Vocabularies for Modeling Features (1/2)

 Authority list of terms (e.g. Foursquare)
   Less structured
   Represent categories of Points of Interest (POIs)
   Typically, one type as an API answer
   Need to express the semantics of the terms

 SKOS Categories (e.g. GeoNames)
   Classes are skos:conceptScheme
   Codes are skos:Concept
   Few classes … BUT many codes



  2012/11/12     5th International Terra Cognita Workshop - ISWC Boston   -8
Vocabularies for Modeling Features (2/2)

 Domain specific ontologies
   One ontology per subdomain
   (transport, administrative unit, hydrography, etc.)
   Interconnected ontologies
   (by explicit semantic e.g. owl:imports)
   UK (OS) – ES (GeoLinkedData)
 Some richer ontologies created by
 (semi-)automatic tools / NLP
   Deeper taxonomy to structure the ontology
   LinkedGeoData: 16 high-level classes, 1294 classes
   GeOnto: 2 high-level classes, 783 classes in total

  2012/11/12     5th International Terra Cognita Workshop - ISWC Boston   -9
Modeling Geometry
 Point (lat/long)
    WGS 84 vocabulary described by W3C

 Rectangle (“bounding box”)
    Geopolitical Vocabulary (FAO)

 Points in a List
    Sequence of points (LinkedGeoData)
    An object is “formedBy” a ListOfPoints (GeoLinkedData.es)

 Literals (GML datatype in RDF)
    Ordnance Survey (UK)

 More structured representation of complex geometry
    NeoGeo Vocabulary (GeoVocamp), http://geovocab.org/


   2012/11/12       5th International Terra Cognita Workshop - ISWC Boston   - 10
Scenario: 7th Arrondissement of Paris




  2012/11/12   5th International Terra Cognita Workshop - ISWC Boston   - 11
7th Arrondissement in DBpedia (a gml_Feature)
dbpedia:7th_arrondissement_of_Paris a gml:_Feature ;
    (gml IS NOT an ontology with OWL-flavour )
  a <http://dbpedia.org/class/yago/1900SummerOlympicVenuEs>
(Yago Class)

 rdfs:label "巴黎第七區"@zh; (14 different languages)
  dbpprop:commune "Paris" ;
  dbpprop:département dbpedia:Paris ;
  dbpprop:région dbpedia:Île-de-France_(region) ;
  grs:point "48.85916666666667 2.312777777777778" ;
  geo:geometry "POINT(2.31278 48.8592)" ; (fake property!)
  geo:lat "48.859165"^^xsd:float;
  geo:long "2.312778"^^xsd:float.




    2012/11/12     5th International Terra Cognita Workshop - ISWC Boston   - 12
7th Arrondissement in GeoNames (a A.ADM4)

gnr:6618613 a gn:Feature ; gn:name "Paris 07";
  gn:alternateName "7ème arrondissement";
  gn:featureClass gn:A [
    a skos:ConceptScheme ;
    rdfs:comment "country, state, region ..."@en .
  ] ;
  gn:featureColde gn:A.ADM4 [
    a skos:Concept ;
    rdfs:comment
    "a subdivision of a third-order administrative division"@en .
  ];
  gn:countryCode "FR";
  gn:population "57410";
  geo:lat "48.8565";
  geo:long "2.321".



      2012/11/12     5th International Terra Cognita Workshop - ISWC Boston   - 13
7th Arrondissement in LGD (a “Suburb”)

lgd:node248177663 a lgdo:Suburb ;
  rdfs:label "7th Arrondissement"@en , "7e Arrondissement" ;
  lgdo:contributor lgd:user13442 ;
  <http://linkedgeodata.org/ontology/ref%3AINSEE> 75107 ;
  lgdp:alt_name "VIIe Arrondissement" ;
  georss:point "48.8570281 2.3201953" ;
  geo:lat 48.8570281 ;
  geo:long 2.3201953 .




     2012/11/12     5th International Terra Cognita Workshop - ISWC Boston   - 14
French IGN
« ..describes the French national territory and the occupation
of its land, elaborates and updates perpetual inventory of the
forest resources »
  Different databases:
  BD ORTHO, BD PARCELLAIRE, POINT ADRESSE, BD ALTI 25m, BD
  TOPO; etc.
  Data in LAMBERT93 or RGF93

Q: ”Give me all the
bridges in a radius of
2km from the "Eiffel
Tower“?
A: Not straightforward

     2012/11/12     5th International Terra Cognita Workshop - ISWC Boston   - 15
Modeling Features in France (GeOnto)

 Ontology for geographic objects (POI)
   Output of a French (ANR) research project
   Obtained from NLP tools

 Classes in French
   rdfs:labels in FR & EN
   No rdfs:comments
   Few owl:ObjectProperty
   783 classes

 Overlap with other vocabs
   Need for alignment

  2012/11/12     5th International Terra Cognita Workshop - ISWC Boston   - 16
Alignment Methodology
 Alignment of GeOnto with 4 ontologies
 and 2 more simple taxonomies
    LGD, DBpedia, Schema.org, GeoNames
    Foursquare, Google Places

 Goal: finding owl:equivalentClass
    Tool : Silk
    Metrics : LevenshteinDistance, Jaro
    Labels : @en des classes
    Aggregation Function: Mean

 Manual validation
    For « rdfs:subClassOf »
    Specific alignments with GeoNames codes


   2012/11/12      5th International Terra Cognita Workshop - ISWC Boston   - 17
Alignment Process with GeoNames

geOnto:AGeoConcept a                                                      • Look for skos codes that
 owl:Class;                                                                 matches GeOnto classes
 rdfs:label “a label”@en;                                                 • Verify the links <70%
                                                               Silk       • Generate « sameAs » links
 rdfs:subClassOf gn:Feature;
 owl:equivalentClass
  [a owl:Restriction;
    owl:onProperty                                                        • Use SPARQL
                                                                            « Construct » to generate
       gn:featureCode;                                     SPARL            a new graph.
    owl:hasValue gn:CODE. ]                               Endpoint



                                                                          • Export the rdf file
                                                        Alignment
                                                            File




    2012/11/12   5th International Terra Cognita Workshop - ISWC Boston             - 18
Results/Evaluation
       Vocab/taxonomies          #Classes                                      #Classes
                                                                               aligned
       LGD                       owl:Class: 1294                                   178
       DBpedia                   owl:Class: 366                                     42
       Schema.org                owl:Class: 296                                     52
       GeoNames                  skos:Concept: 699                                 287
       Foursquare                359                                                46
       Google Place              126                                                41
       bdtopo                    owl:Class: 237                                    153

 High precisions > 80%
 BUT P(Schema.org) = 50%.
   Possible reasons: GeOnto entities are more specific to France
   Fine grain details for entities in Schema.org

  2012/11/12          5th International Terra Cognita Workshop - ISWC Boston        - 19
Topological Functions in GeoVocabs




  NeoGeo (Spatial) and OS Spatial have integrated in their modeling partial
  or full aspect of topological functions of GeoSPARQL.

   2012/11/12         5th International Terra Cognita Workshop - ISWC Boston   - 20
Triple Stores and geospatial indexing




Open Sahara, Parliament and Virtuoso are good choices                             Open Sahara
because they integrate many Geospatial Functions.                                 Parliament
                                                                                  Virtuoso


      2012/11/12         5th International Terra Cognita Workshop - ISWC Boston   - 21
Some Recommendations
 Complex Geometry Coverage
    Need to publish more data with complex geometries
    Select suitable ontologies (e.g: NeoGeo) or GeoSPARQL

 Features MUST be connected to Geometry
    Sometimes it may requires two namespaces

 Serialization and Triple Stores
    Provide serialization in other GIS formats (GML, WKT, KML, etc.)
    Store geodata in a triple store with many topological functions
    implemented (e.g: Open Sahara, Parliament, Virtuoso)

 Literal vs Structured Representation
    Use of structured representation for complex geometry
    This covers some of the Use Cases at IGN

   2012/11/12       5th International Terra Cognita Workshop - ISWC Boston   - 22
Conclusion

 Studied geo vocabularies in the Web of Data
   Multiplicity, attempt of comparison
   Alignment needed, starting from a new ontology
 Presented steps tailored for the French IGN
   Publishing following linked data principles …
   … including complex geometry
 Outline some recommendations
   When publishing data with complex geometry on the web
   Useful for any Geodata provider having similar
   requirements than IGN

  2012/11/12     5th International Terra Cognita Workshop - ISWC Boston   - 23
Future Work
 Publish a new version of GeOnto ontology
   Following the Best Practices on the LOD
   Reusing NeoGeo Vocabulary
   Use of W3C Geo for representing points
 « Lift » raw data in RDF
   Using GeOnto and external vocabularies
   Store graph in Virtuoso + IndexingSail service
 Continue mappings and alignments
   Schema.org, Foursquare, Google Place
   GeoSPARQL vocabulary
   Mappings at data level
  2012/11/12     5th International Terra Cognita Workshop - ISWC Boston   - 24
Thanks for your attention!



  Questions?
French IGN & Open Data Initiative

Provider of the
data.gouv.fr portal
21 datasets in SHAPE files




Want to publish their data in 5 stars
Data.ign.fr (experimental version)
Towards IGN LD with complex
geometries



   2012/11/12          5th International Terra Cognita Workshop - ISWC Boston   - 26

Más contenido relacionado

Similar a Comparing Vocabularies for Representing Geographical Features and Their Geometry

Representing and Querying Geospatial Information in the Semantic Web
Representing and Querying Geospatial Information in the Semantic WebRepresenting and Querying Geospatial Information in the Semantic Web
Representing and Querying Geospatial Information in the Semantic Web
Kostis Kyzirakos
 

Similar a Comparing Vocabularies for Representing Geographical Features and Their Geometry (20)

Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
 
Triplificating and linking XBRL financial data
Triplificating and linking XBRL financial dataTriplificating and linking XBRL financial data
Triplificating and linking XBRL financial data
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
 
LinkedGeoData and GeoKnow
LinkedGeoData and GeoKnowLinkedGeoData and GeoKnow
LinkedGeoData and GeoKnow
 
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
 
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
 
Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.
 
Processing Geospatial Data At Scale @locationtech
Processing Geospatial Data At Scale @locationtechProcessing Geospatial Data At Scale @locationtech
Processing Geospatial Data At Scale @locationtech
 
Knowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything ProjectKnowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything Project
 
Getting Started with Geospatial Data in MongoDB
Getting Started with Geospatial Data in MongoDBGetting Started with Geospatial Data in MongoDB
Getting Started with Geospatial Data in MongoDB
 
Mapping with Drupal
Mapping with DrupalMapping with Drupal
Mapping with Drupal
 
GFOSS DAY 2012 GeoNetwork Presentation
GFOSS DAY 2012 GeoNetwork PresentationGFOSS DAY 2012 GeoNetwork Presentation
GFOSS DAY 2012 GeoNetwork Presentation
 
Instance-Based Ontological Knowledge Acquisition
Instance-Based Ontological Knowledge AcquisitionInstance-Based Ontological Knowledge Acquisition
Instance-Based Ontological Knowledge Acquisition
 
Linked Open Data (LOD) part 3
Linked Open Data (LOD)  part 3Linked Open Data (LOD)  part 3
Linked Open Data (LOD) part 3
 
Drupal Day 2011 - Drupal and the rise of the documents
Drupal Day 2011 - Drupal and the rise of the documentsDrupal Day 2011 - Drupal and the rise of the documents
Drupal Day 2011 - Drupal and the rise of the documents
 
Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015
Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015
Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015
 
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
 
Representing and Querying Geospatial Information in the Semantic Web
Representing and Querying Geospatial Information in the Semantic WebRepresenting and Querying Geospatial Information in the Semantic Web
Representing and Querying Geospatial Information in the Semantic Web
 
2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin
 
Towards Virtual Knowledge Graphs over Web APIs
Towards Virtual Knowledge Graphs over Web APIsTowards Virtual Knowledge Graphs over Web APIs
Towards Virtual Knowledge Graphs over Web APIs
 

Más de Ghislain Atemezing

Más de Ghislain Atemezing (8)

Trends on Data Graphs & Security for the Internet of Things
Trends on Data Graphs & Security for the Internet of ThingsTrends on Data Graphs & Security for the Internet of Things
Trends on Data Graphs & Security for the Internet of Things
 
Big Data & Taxonomies for Actionable Intelligence
Big Data & Taxonomies for Actionable IntelligenceBig Data & Taxonomies for Actionable Intelligence
Big Data & Taxonomies for Actionable Intelligence
 
Benchmarking Commercial RDF Stores with Publications Office Dataset
Benchmarking Commercial RDF Stores with Publications Office DatasetBenchmarking Commercial RDF Stores with Publications Office Dataset
Benchmarking Commercial RDF Stores with Publications Office Dataset
 
Phd defense slides
Phd defense slidesPhd defense slides
Phd defense slides
 
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and DataLIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data
 
Information Content based Ranking Metric for Linked Open Vocabularies
Information Content based Ranking Metric for Linked Open VocabulariesInformation Content based Ranking Metric for Linked Open Vocabularies
Information Content based Ranking Metric for Linked Open Vocabularies
 
Harmonizing services for LOD vocabularies: a case study
Harmonizing services for LOD vocabularies: a case studyHarmonizing services for LOD vocabularies: a case study
Harmonizing services for LOD vocabularies: a case study
 
Visualisation and linked data applications edf 2013
Visualisation and linked data applications edf 2013Visualisation and linked data applications edf 2013
Visualisation and linked data applications edf 2013
 

Último

Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
KarakKing
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 

Último (20)

Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 

Comparing Vocabularies for Representing Geographical Features and Their Geometry

  • 1. Comparing Vocabularies for Representing Geographical Features and Their Geometry Ghislain Atemezing / @gatemezing Raphaël Troncy / @rtroncy
  • 2. Goal and Research Questions Goal: Provide recommendations for the French IGN in exposing their GIS database in the Linked Data world Study of geo vocabularies in the Web of Data LOD Cloud review Who are the GeoData providers? How features are generally modeled? How geometry is generally modeled? Illustrative scenario Align vocabularies when necessary Compare Triple Stores with geospatial indexing 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston -2
  • 3. GeoData: why it matters? “80% of needs for decisions from public authorities have a geospatial component”. (Philippe Grelot, IGN-France) Photo by Vilavelosa on flickr.com 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston -3
  • 4. Feature and Geometry Term Definition Source An abstract representation of a real-world [INSPIRE phenomenon related to a specific location Directive] Spatial Object = or geographical area . It should be noted Item 67 (Geographical) Feature that the term has a different meaning in the ISO 19100 series. It is also synonymous with "(geographic) feature" as used in the ISO 19100 series. Feature A geographical feature, capable of holding NeoGeo spatial relations. Vocab A top-level geometry type. This class is Geometry equivalent to the UML class GM_Object GeoSPARQL defined in ISO 19107, and it is superclass [OGC] of all geometry types. 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston -4
  • 5. GeoData on the LOD Cloud 31 datasets 19.43% triples http://lod-cloud.net/state Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/ 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston -5
  • 6. Where are Geo-Linked Data? DBpedia GeoNames Provider #Triples LinkedGeodata (OSM) DBpedia 727 232 triples Freebase (Google) GeoNames 5 240 032 (« features ») Ordnance Survey (UK), LinkedGeoData 60 356 364 triples GeoLinkedData (ES) Ordnance Survey 6 295 triples GADM-RDF Freebase 8,5 MB (tsv fichiers) GeoLinkedData.es 101 018 triples NUTS-RDF Projet GADM 682 605 triples data.ign.fr (FR) Projet NUTS 316 238 triples 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston -6
  • 7. Vocabularies for Space and Geography http://lov.okfn.org/dataset/lov/details/vocabularySpace_Space.html Only 5 vocabs are reused: W3C Geo (21 datasets), OS spatialrelations (10 datasets), Geonames (5 datasets), UK administrative (3 datasets) and NeoGeo (3 datasets) 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston -7
  • 8. Vocabularies for Modeling Features (1/2) Authority list of terms (e.g. Foursquare) Less structured Represent categories of Points of Interest (POIs) Typically, one type as an API answer Need to express the semantics of the terms SKOS Categories (e.g. GeoNames) Classes are skos:conceptScheme Codes are skos:Concept Few classes … BUT many codes 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston -8
  • 9. Vocabularies for Modeling Features (2/2) Domain specific ontologies One ontology per subdomain (transport, administrative unit, hydrography, etc.) Interconnected ontologies (by explicit semantic e.g. owl:imports) UK (OS) – ES (GeoLinkedData) Some richer ontologies created by (semi-)automatic tools / NLP Deeper taxonomy to structure the ontology LinkedGeoData: 16 high-level classes, 1294 classes GeOnto: 2 high-level classes, 783 classes in total 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston -9
  • 10. Modeling Geometry Point (lat/long) WGS 84 vocabulary described by W3C Rectangle (“bounding box”) Geopolitical Vocabulary (FAO) Points in a List Sequence of points (LinkedGeoData) An object is “formedBy” a ListOfPoints (GeoLinkedData.es) Literals (GML datatype in RDF) Ordnance Survey (UK) More structured representation of complex geometry NeoGeo Vocabulary (GeoVocamp), http://geovocab.org/ 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 10
  • 11. Scenario: 7th Arrondissement of Paris 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 11
  • 12. 7th Arrondissement in DBpedia (a gml_Feature) dbpedia:7th_arrondissement_of_Paris a gml:_Feature ; (gml IS NOT an ontology with OWL-flavour ) a <http://dbpedia.org/class/yago/1900SummerOlympicVenuEs> (Yago Class) rdfs:label "巴黎第七區"@zh; (14 different languages) dbpprop:commune "Paris" ; dbpprop:département dbpedia:Paris ; dbpprop:région dbpedia:Île-de-France_(region) ; grs:point "48.85916666666667 2.312777777777778" ; geo:geometry "POINT(2.31278 48.8592)" ; (fake property!) geo:lat "48.859165"^^xsd:float; geo:long "2.312778"^^xsd:float. 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 12
  • 13. 7th Arrondissement in GeoNames (a A.ADM4) gnr:6618613 a gn:Feature ; gn:name "Paris 07"; gn:alternateName "7ème arrondissement"; gn:featureClass gn:A [ a skos:ConceptScheme ; rdfs:comment "country, state, region ..."@en . ] ; gn:featureColde gn:A.ADM4 [ a skos:Concept ; rdfs:comment "a subdivision of a third-order administrative division"@en . ]; gn:countryCode "FR"; gn:population "57410"; geo:lat "48.8565"; geo:long "2.321". 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 13
  • 14. 7th Arrondissement in LGD (a “Suburb”) lgd:node248177663 a lgdo:Suburb ; rdfs:label "7th Arrondissement"@en , "7e Arrondissement" ; lgdo:contributor lgd:user13442 ; <http://linkedgeodata.org/ontology/ref%3AINSEE> 75107 ; lgdp:alt_name "VIIe Arrondissement" ; georss:point "48.8570281 2.3201953" ; geo:lat 48.8570281 ; geo:long 2.3201953 . 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 14
  • 15. French IGN « ..describes the French national territory and the occupation of its land, elaborates and updates perpetual inventory of the forest resources » Different databases: BD ORTHO, BD PARCELLAIRE, POINT ADRESSE, BD ALTI 25m, BD TOPO; etc. Data in LAMBERT93 or RGF93 Q: ”Give me all the bridges in a radius of 2km from the "Eiffel Tower“? A: Not straightforward 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 15
  • 16. Modeling Features in France (GeOnto) Ontology for geographic objects (POI) Output of a French (ANR) research project Obtained from NLP tools Classes in French rdfs:labels in FR & EN No rdfs:comments Few owl:ObjectProperty 783 classes Overlap with other vocabs Need for alignment 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 16
  • 17. Alignment Methodology Alignment of GeOnto with 4 ontologies and 2 more simple taxonomies LGD, DBpedia, Schema.org, GeoNames Foursquare, Google Places Goal: finding owl:equivalentClass Tool : Silk Metrics : LevenshteinDistance, Jaro Labels : @en des classes Aggregation Function: Mean Manual validation For « rdfs:subClassOf » Specific alignments with GeoNames codes 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 17
  • 18. Alignment Process with GeoNames geOnto:AGeoConcept a • Look for skos codes that owl:Class; matches GeOnto classes rdfs:label “a label”@en; • Verify the links <70% Silk • Generate « sameAs » links rdfs:subClassOf gn:Feature; owl:equivalentClass [a owl:Restriction; owl:onProperty • Use SPARQL « Construct » to generate gn:featureCode; SPARL a new graph. owl:hasValue gn:CODE. ] Endpoint • Export the rdf file Alignment File 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 18
  • 19. Results/Evaluation Vocab/taxonomies #Classes #Classes aligned LGD owl:Class: 1294 178 DBpedia owl:Class: 366 42 Schema.org owl:Class: 296 52 GeoNames skos:Concept: 699 287 Foursquare 359 46 Google Place 126 41 bdtopo owl:Class: 237 153 High precisions > 80% BUT P(Schema.org) = 50%. Possible reasons: GeOnto entities are more specific to France Fine grain details for entities in Schema.org 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 19
  • 20. Topological Functions in GeoVocabs NeoGeo (Spatial) and OS Spatial have integrated in their modeling partial or full aspect of topological functions of GeoSPARQL. 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 20
  • 21. Triple Stores and geospatial indexing Open Sahara, Parliament and Virtuoso are good choices Open Sahara because they integrate many Geospatial Functions. Parliament Virtuoso 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 21
  • 22. Some Recommendations Complex Geometry Coverage Need to publish more data with complex geometries Select suitable ontologies (e.g: NeoGeo) or GeoSPARQL Features MUST be connected to Geometry Sometimes it may requires two namespaces Serialization and Triple Stores Provide serialization in other GIS formats (GML, WKT, KML, etc.) Store geodata in a triple store with many topological functions implemented (e.g: Open Sahara, Parliament, Virtuoso) Literal vs Structured Representation Use of structured representation for complex geometry This covers some of the Use Cases at IGN 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 22
  • 23. Conclusion Studied geo vocabularies in the Web of Data Multiplicity, attempt of comparison Alignment needed, starting from a new ontology Presented steps tailored for the French IGN Publishing following linked data principles … … including complex geometry Outline some recommendations When publishing data with complex geometry on the web Useful for any Geodata provider having similar requirements than IGN 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 23
  • 24. Future Work Publish a new version of GeOnto ontology Following the Best Practices on the LOD Reusing NeoGeo Vocabulary Use of W3C Geo for representing points « Lift » raw data in RDF Using GeOnto and external vocabularies Store graph in Virtuoso + IndexingSail service Continue mappings and alignments Schema.org, Foursquare, Google Place GeoSPARQL vocabulary Mappings at data level 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 24
  • 25. Thanks for your attention! Questions?
  • 26. French IGN & Open Data Initiative Provider of the data.gouv.fr portal 21 datasets in SHAPE files Want to publish their data in 5 stars Data.ign.fr (experimental version) Towards IGN LD with complex geometries 2012/11/12 5th International Terra Cognita Workshop - ISWC Boston - 26