Methodological Guidelines for Publishing Linked Data
Boris Villazón-Terrazas, Oscar Corcho
Facultad de Informática, Universidad Politécnica de Madrid
Campus de Montegancedo sn, 28660 Boadilla del Monte, Madrid
http://www.oeg-upm.net
{bvillazon,ocorcho}@fi.upm.es
Phone: 34.91.3366605, Fax: 34.91.3524819
Slides available at: http://www.slideshare.net/boricles/


Acknowledgements: Asunción Gómez-Pérez, Luis M. Vilches,
Victor Saquicela, Alexander de León, and many others that we
may have omitted.
Work distributed under the license Creative Commons
Attribution-Noncommercial-Share Alike 3.0
Main References

Wood, David (Ed) Linking Government Data - 2011

Methodological Guidelines for Publishing Government Linked Data

Boris Villazón-Terrazas, Luis M. Vilches, Oscar Corcho, Asunción Gómez-Pérez




Best Practices for Publishing Linked Data

W3C Editor’s Draft – Government Linked Data Working Group

Michael Hausenblas, Bernadette Hyland, Boris Villazón-Terrazas

https://dvcs.w3.org/hg/gld/raw-file/bcb72f87b5cc/bp/index.html



Cookbook for Open Government Linked Data

W3C Editor’s Draft – Government Linked Data Working Group

Bernadette Hyland, Boris Villazón-Terrazas, Sarven Capadisli

http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook
Guidelines for Publishing Linked Data

• The process of publishing Linked Data has an
  iterative incremental life cycle model.



• The guidelines are based on our experience in the production
  of Linked Data in several governmental contexts, and have been
  applied in real case scenarios.




Specification
• Identification and analysis of the data
  sources

• URI design

• Definition of the license




Specification
            Identification and analysis of the data sources

We have to distinguish two cases:

• Open and publish data that government agencies have
  not yet opened up and published
   • Task that may require contacting specific government data
     owners to get access to their legacy data

• Reuse and leverage data already opened up and
  published by government agencies
   • Task to look for these data in public government catalogs
      • Open Government Data
      • datacatalogs.org
      • Open Government Catalog
Specification
           Identification and analysis of the data sources

After we have identified and selected the government data
   sources

• Search and compile all the available data and
  documentation about those resources

• Identify the schema of those resources including
  conceptual components and their relationships

• Identify the items in the domain, i.e., things whose
  properties and relations are described in the data
  sources
Specification
                    GeoLinkedData – Identification of the data sources

• IGN (National Geographic Institute of Spain)
   • Oracle & MySQL
   • Agreement with the IGN

• INE (National Statistic Institute of Spain)
   • Data sources available in a public data catalog
Specification
           GeoLinkedData – Analysis of the data sources




[Figure: analysed data source with Year, Province, and Industry Production Index]
Specification
                                                    URI Design

• Use meaningful URIs, instead of opaque URIs, when
  possible

• Separate TBox (ontology model) from ABox
  (instances) URIs.
   • Base URI
     http://data.gov.bo/
     http://health.data.gov.bo/
   • TBox URIs
     http://data.gov.bo/ontology/{class|property}
   • ABox URIs
     http://data.gov.bo/resource/
     http://data.gov.bo/resource/province/Tiraque
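
The URI pattern above can be sketched in a few lines of Python. This is only an illustration: the helper names `tbox_uri` and `abox_uri` are hypothetical, and the slug rule (spaces become underscores, everything else is percent-encoded) is an assumption, not part of the guidelines.

```python
from urllib.parse import quote

BASE = "http://data.gov.bo/"  # base URI taken from the slide


def _slug(name: str) -> str:
    """Assumed slug rule: trim, turn spaces into underscores, percent-encode the rest."""
    return quote(name.strip().replace(" ", "_"), safe="")


def tbox_uri(term: str) -> str:
    """URI of a class or property in the ontology model (TBox)."""
    return f"{BASE}ontology/{_slug(term)}"


def abox_uri(resource_type: str, name: str) -> str:
    """URI of an instance (ABox), grouped by resource type."""
    return f"{BASE}resource/{_slug(resource_type)}/{_slug(name)}"


print(tbox_uri("Province"))
print(abox_uri("province", "Tiraque"))
```

Keeping the two builders separate makes it hard to mint an instance URI inside the ontology namespace by accident.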


Specification
                                   GeoLinkedData - URI design

• Base URI
  http://linkeddata.es/
  http://geo.linkeddata.es/

• TBox URIs
  http://geo.linkeddata.es/ontology/{concept|property}
  http://geo.linkeddata.es/ontology/Provincia

• ABox URIs
  http://geo.linkeddata.es/resource/{resource type}/{resource name}
  http://geo.linkeddata.es/resource/Provincia/Madrid


Specification
                                      Definition of the license

• Several possibilities

   • The UK Open Government License

   • Open Database License

   • Public Domain Dedication and License

   • Open Data Commons Attribution License

   • The Creative Commons Licenses


It is also possible to reuse and apply an existing license
    of the government data sources.
Specification
                                    GeoLinkedData - Definition of the license

• Reusing the original license of the government data
  sources. IGN and INE data sources have their own
  license, similar to the Attribution-Share Alike 2.5 Generic
  License




  http://creativecommons.org/licenses/by-sa/2.5/


Modelling
                                                                            Ontology




•   An ontology is an engineering artifact, which provides:
     •   A set of terms
     •   A set of explicit assumptions regarding the intended meaning of the terms.
           • Almost always including concepts and their classification
           • Almost always including properties between concepts




•   Shared understanding of a domain of interest

•   Ontologies expressed in OWL or RDF(S), both based on RDF




Modelling
                               Reuse available vocabularies



• Search for suitable vocabularies
   • Linked Open Vocabularies

• Are there suitable vocabularies?
   • Yes: build the vocabulary by reusing available vocabularies
   • No: …
Modelling
                 Reuse available non-ontological resources

• Search for suitable non-ontological resources
   • Highly reliable Web sites
   • Domain-related sites
   • Government catalogs

• Are there suitable resources?
   • Yes: build the vocabulary by transforming available resources
   • No: build the vocabulary from scratch
Modelling
                               GeoLinkedData

[Figure: GeoLinkedData ontology network, with the following labels]
   • WGS84 Geo Positioning: an RDF vocabulary
   • scv:Dimension, scv:Item, scv:Dataset
   • Hydrographical phenomena (rivers, lakes, etc.)
   • Vocabulary for instants, intervals, durations, etc.
   • Names and international code systems for territories and groups
   • Ontology for OGC Geography Markup Language

Classes               33
Object Properties     44
Data Properties      318

http://neon-toolkit.org/
Modelling
     GeoLinkedData




Generation
• Transformation

• Data cleansing

• Linking




Generation
                                                Transformation

• Take the data sources selected in the specification
  activity and transform them to RDF according to the
  vocabulary created in the modelling activity

• Some tools
   • CSV and spreadsheets
      • RDF extension of Google Refine, XLWrap, RDF123, NOR2O
   • RDB
      • D2R Server, ODEMapster, W3C RDB2RDF WG – R2RML
   • XML
      • GRDDL, ReDeFer
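
To make the transformation step concrete, here is a minimal sketch that turns CSV rows into N-Triples with the Python standard library only. It does not reproduce any of the tools listed above. The sample rows, the observation URI pattern, and the property names `province`, `year`, and `value` are all assumptions made for the example.

```python
import csv
import io

RES = "http://geo.linkeddata.es/resource/"
ONT = "http://geo.linkeddata.es/ontology/"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Made-up sample data; the real INE figures are not reproduced here.
CSV_TEXT = "province,year,index\nMadrid,2009,85.3\nGranada,2009,79.1\n"


def rows_to_ntriples(csv_text: str) -> list[str]:
    """Emit three triples per CSV row: province link, year, and index value."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name, year, value = row["province"], row["year"], row["index"]
        # Hypothetical URI pattern for one observation.
        obs = f"<{RES}IndustryProductionIndex/{name}_{year}>"
        province = f"<{RES}Provincia/{name}>"
        triples.append(f"{obs} <{ONT}province> {province} .")
        triples.append(f'{obs} <{ONT}year> "{year}"^^<{XSD}gYear> .')
        triples.append(f'{obs} <{ONT}value> "{value}"^^<{XSD}decimal> .')
    return triples


for line in rows_to_ntriples(CSV_TEXT):
    print(line)
```

The sketch skips URI and literal escaping, which a real transformation would need for names containing spaces or quotes.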




Generation
                                  GeoLinkedData - Transformation



   • INE → NOR2O
   • IGN → ODEMapster
   • IGN (geospatial column) → Geometry2RDF
Generation
                                   GeoLinkedData - Transformation
[Figure: table with Industry Production Index, Year, and Province, transformed with NOR2O]
Generation
                                                      GeoLinkedData - Transformation
•   R2O is an extensible, fully declarative language to describe
    mappings between relational database schemas and ontologies.
•   The ODEMapster processor generates RDF instances from
    relational instances based on the mapping description
    expressed in the R2O document




    www.oeg-upm.net/index.php/en/downloads/9-r2o-odempaster
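
The idea of a declarative relational-to-RDF mapping can be sketched as follows. This is not R2O syntax and not how ODEMapster is implemented; the `MAPPING` dictionary, the table `rio`, the class `Rio`, the property `nombre`, and the sample row are all hypothetical. An in-memory SQLite database stands in for the relational source.

```python
import sqlite3

RES = "http://geo.linkeddata.es/resource/"
ONT = "http://geo.linkeddata.es/ontology/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical declarative mapping: one table to one class, columns to properties.
MAPPING = {
    "table": "rio",
    "class": ONT + "Rio",
    "id_column": "id",
    "columns": {"nombre": ONT + "nombre"},
}


def generate_triples(conn: sqlite3.Connection, mapping: dict) -> list[str]:
    """Walk the mapped table and emit one rdf:type triple plus one triple per column."""
    id_column = mapping["id_column"]
    columns = [id_column, *mapping["columns"]]
    sql = "SELECT " + ", ".join(columns) + " FROM " + mapping["table"]
    triples = []
    for row in conn.execute(sql):
        record = dict(zip(columns, row))
        subject = f"<{RES}Rio/{record[id_column]}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{mapping['class']}> .")
        for column, prop in mapping["columns"].items():
            triples.append(f'{subject} <{prop}> "{record[column]}" .')
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rio (id INTEGER PRIMARY KEY, nombre TEXT)")
conn.execute("INSERT INTO rio VALUES (1, 'Arroyo de la Vega')")  # made-up row
triples = generate_triples(conn, MAPPING)
print("\n".join(triples))
```

Because the mapping is data rather than code, changing the target vocabulary only means editing `MAPPING`, which is the main appeal of a declarative approach.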
Generation
                       GeoLinkedData - Transformation
• Creation of the R2O Mappings




Generation
GeoLinkedData - Transformation


            Excerpt of the R2O document




Generation
                                                      GeoLinkedData - Transformation

• Tool for generating RDF from geometrical information

• The geometry could be available in GML or WKT

• The RDF generated follows our Geometry Model




  http://www.oeg-upm.net/index.php/en/downloads/151-geometry2rdf
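
A minimal sketch of the geometry-to-RDF idea, restricted to WKT points. It does not use the authors' Geometry Model, which is not shown in the slides; instead it emits `lat`/`long` from the W3C WGS84 Geo vocabulary. The function name, the coordinates, and the assumption that the WKT is in longitude/latitude order are all illustrative.

```python
import re

GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"

# Accepts e.g. "POINT(-3.7 40.4)"; other WKT geometry types are out of scope here.
POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def wkt_point_to_triples(subject_uri: str, wkt: str) -> list[str]:
    """Convert a WKT POINT into two N-Triples lines (latitude and longitude)."""
    match = POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    lon, lat = match.groups()  # assumed axis order: x = longitude, y = latitude
    return [
        f'<{subject_uri}> <{GEO}lat> "{lat}" .',
        f'<{subject_uri}> <{GEO}long> "{lon}" .',
    ]


madrid = "http://geo.linkeddata.es/resource/Provincia/Madrid"
for line in wkt_point_to_triples(madrid, "POINT(-3.7 40.4)"):  # made-up coordinates
    print(line)
```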

Generation
 GeoLinkedData - Transformation


                   Oracle SDO_UTIL package




SELECT TO_CHAR(SDO_UTIL.TO_GML311GEOMETRY(geometry))
          AS Gml311Geometry
FROM "BCN200"."BCN200_0301L_RIO" c
WHERE c.Etiqueta='Arroyo'




Generation
GeoLinkedData - Transformation
Generation
                                                      Data Cleansing

• To find possible errors, such as those identified by Hogan et al.
   • HTTP-level issues, such as accessibility and dereferenceability,
     e.g., HTTP URIs return 40x/50x errors
   • reasoning issues, such as namespace without vocabulary,
     e.g., rss:item term invented
   • malformed/incompatible datatypes, e.g., “true” as xsd:int


• To fix the identified errors
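
One of the checks above, malformed datatypes, is easy to sketch. The code below tests whether a literal's lexical form fits its declared XSD datatype. It covers only three datatypes with simplified patterns (no value-range check for xsd:int), and the function names are hypothetical.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Simplified lexical patterns; datatypes not listed here are not checked.
LEXICAL_FORMS = {
    XSD + "int": re.compile(r"[+-]?\d+"),
    XSD + "boolean": re.compile(r"true|false|1|0"),
    XSD + "decimal": re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)"),
}


def is_well_formed(lexical: str, datatype: str) -> bool:
    """True if the lexical form matches the pattern of its declared datatype."""
    pattern = LEXICAL_FORMS.get(datatype)
    if pattern is None:
        return True  # unknown datatype: nothing to validate against
    return pattern.fullmatch(lexical) is not None


def find_malformed(literals: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return the (lexical form, datatype) pairs that fail the check."""
    return [(lex, dt) for lex, dt in literals if not is_well_formed(lex, dt)]


sample = [("true", XSD + "int"), ("42", XSD + "int"), ("true", XSD + "boolean")]
print(find_malformed(sample))
```

Run on the slide's example, the check flags “true” typed as xsd:int and accepts the other two literals.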




Generation
                            GeoLinkedData – Data Cleansing

• Errors
   • Some resources with the same name were mixed up. For
     example, Granada municipality belongs to Granada
     province, and La Granada municipality belongs to Barcelona
     province.

   • Autonomous communities that only have one province, e.g.,
     Murcia Region, were missing some municipalities, but their
     corresponding provinces, e.g., Murcia Province, have the
     correct number of municipalities.

   • Some hydrographical resources were missing some parts of their
     geometrical information.
Generation
                                                                                                          Linking


                     Identify suitable data sets                                       http://ckan.net
                         as linking targets




                       Discover relationships
                        between data items
LIMES                                              Silk Framework
http://aksw.org/Projects/limes                     http://www4.wiwiss.fu-berlin.de/bizer/silk/




                     Validate the relationships
                            discovered              sameAs Validator
                                                    http://oegdev.dia.fi.upm.es:8080/sameAs/
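
The "discover relationships" step can be illustrated with a naive label comparison. This is not how LIMES or Silk work internally; both use their own link specifications. The sketch only proposes candidate owl:sameAs pairs whose labels are nearly identical. The 0.9 threshold, the labels, and the `example.org` URI are assumptions.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def discover_links(source: dict, target: dict, threshold: float = 0.9) -> list:
    """Return (source URI, target URI) pairs whose labels are similar enough."""
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            if similarity(s_label, t_label) >= threshold:
                links.append((s_uri, t_uri))
    return links


source = {
    "http://geo.linkeddata.es/resource/Provincia/Madrid": "Madrid",
    "http://geo.linkeddata.es/resource/Provincia/Granada": "Granada",
}
target = {
    "http://dbpedia.org/resource/Madrid": "Madrid",
    "http://example.org/resource/La_Granada": "La Granada",  # hypothetical URI
}

links = discover_links(source, target)
for s_uri, t_uri in links:
    print(f"<{s_uri}> <{OWL_SAME_AS}> <{t_uri}> .")
```

Label similarity alone produces false positives on real data, which is why the candidates still need the validation step.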




Generation
                                                            GeoLinkedData - Linking


[Figure: GeoLinkedData resources linked to DBpedia and GeoNames]

   • DBpedia:        http://dbpedia.org/resource/Madrid
   • GeoLinkedData:  http://geo.linkeddata.es/.../Madrid
   • GeoNames:       http://sws.geonames.org/6355233/
Generation
                                                GeoLinkedData - Linking




http://oegdev.dia.fi.upm.es:8080/sameAs/
Publication
• Dataset publication

• Metadata publication

• Dataset discovery




Publication
                                           Dataset Publication

• Tools for storing RDF
   • Virtuoso Universal Server, Jena, Sesame, 4Store, YARS,
     OWLIM


• SPARQL endpoint and Linked Data frontend
   • Pubby, Talis Platform, Fuseki
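
Once a SPARQL endpoint is in place, clients query it over HTTP by passing the query text in a `query` parameter, as defined by the SPARQL Protocol. The sketch below only builds the request URL and sends nothing. The endpoint location is an assumption, and the query reuses the `Provincia` class from the URI design slide.

```python
from urllib.parse import urlencode

ENDPOINT = "http://geo.linkeddata.es/sparql"  # assumed endpoint location

QUERY = """
SELECT ?province WHERE {
  ?province a <http://geo.linkeddata.es/ontology/Provincia> .
} LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL with the query text URL-encoded."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```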




Publication
                                  Metadata Publication

• VoID makes it possible to express metadata about RDF
  datasets




• Open Provenance Model
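
A VoID description is itself a small RDF document. The sketch below assembles one in Turtle from a handful of values. The helper name and the dataset URI are hypothetical, the endpoint is assumed, and the license URI is the one shown on the GeoLinkedData license slide. The properties used (`void:sparqlEndpoint`, `void:uriSpace`, `dcterms:title`, `dcterms:license`) belong to the VoID and Dublin Core vocabularies.

```python
def void_description(dataset_uri: str, title: str, sparql_endpoint: str,
                     uri_space: str, license_uri: str) -> str:
    """Return a minimal VoID description of one dataset, serialised as Turtle."""
    return "\n".join([
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',
        f"    dcterms:license <{license_uri}> ;",
        f"    void:sparqlEndpoint <{sparql_endpoint}> ;",
        f'    void:uriSpace "{uri_space}" .',
    ])


turtle = void_description(
    dataset_uri="http://geo.linkeddata.es/dataset",  # hypothetical dataset URI
    title="GeoLinkedData",
    sparql_endpoint="http://geo.linkeddata.es/sparql",  # assumed endpoint
    uri_space="http://geo.linkeddata.es/resource/",
    license_uri="http://creativecommons.org/licenses/by-sa/2.5/",
)
print(turtle)
```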




Publication
                                                                                   Dataset discovery

• Register the dataset into CKAN Registry

• Generate sitemap files for your dataset, by using
  sitemap4rdf

• Submit the sitemap location to Google and Sindice




  http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
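
For orientation, a plain sitemap in the sitemaps.org format is just a list of resource URLs. The sketch below builds one with the standard library. It does not reproduce the output of sitemap4rdf and leaves out any dataset-specific extensions; the function name and the second URI are illustrative.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(resource_uris: list[str]) -> str:
    """Return a sitemaps.org <urlset> document listing the given URIs."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for uri in resource_uris:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = uri
    return ET.tostring(urlset, encoding="unicode")


sitemap = build_sitemap([
    "http://geo.linkeddata.es/resource/Provincia/Madrid",
    "http://geo.linkeddata.es/resource/Provincia/Granada",  # illustrative URI
])
print(sitemap)
```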


Publication
                                               GeoLinkedData – Dataset publication




[Figure: HTML, Linked Data, and SPARQL access, served by Pubby 0.3 (including provenance support) on top of Virtuoso 6.1.0]

http://www4.wiwiss.fu-berlin.de/pubby/
Publication
GeoLinkedData – Dataset discovery




Exploitation




Streaming resources
Exploitation
                                                                      GeoLinkedData

                      http://oegdev.dia.fi.upm.es/projects/map4rdf/


map4rdf:
   • Google Maps viewer of RDF resources
       • Resources with spatial information
   • Extensible with Google plugins
   • Used in other applications like Aemet, Goodrelations

[Figure: map4rdf → SPARQL → Triplestore]
DEMO
http://geo.linkeddata.es/browser




Provinces




Capital of Province




Provinces – Industry Production Index




Beaches





Más contenido relacionado

La actualidad más candente

Secrets of the catalog remix the remix
Secrets of the catalog remix the remixSecrets of the catalog remix the remix
Secrets of the catalog remix the remixrobin fay
 
Datalift lod2-paris-24032011
Datalift lod2-paris-24032011Datalift lod2-paris-24032011
Datalift lod2-paris-24032011Datalift
 
All About Access Points in RDA
All About Access Points in RDAAll About Access Points in RDA
All About Access Points in RDAShana McDanold
 
Resource Description & Access (RDA)
Resource Description & Access (RDA)Resource Description & Access (RDA)
Resource Description & Access (RDA)Buzz Haughton
 
Revealing Entities From Texts With a Hybrid Approach
Revealing Entities From Texts With a Hybrid ApproachRevealing Entities From Texts With a Hybrid Approach
Revealing Entities From Texts With a Hybrid ApproachJulien PLU
 
The tools of our trade: AACR2/RDA and MARC
The tools of our trade: AACR2/RDA and MARCThe tools of our trade: AACR2/RDA and MARC
The tools of our trade: AACR2/RDA and MARCAnn Chapman
 
GDG Meets U event - Big data & Wikidata - no lies codelab
GDG Meets U event - Big data & Wikidata -  no lies codelabGDG Meets U event - Big data & Wikidata -  no lies codelab
GDG Meets U event - Big data & Wikidata - no lies codelabCAMELIA BOBAN
 
Marc formats : Facilitating sharing of Catalogue Records
Marc formats : Facilitating sharing of Catalogue RecordsMarc formats : Facilitating sharing of Catalogue Records
Marc formats : Facilitating sharing of Catalogue RecordsOtuoma Peter
 

La actualidad más candente (9)

Secrets of the catalog remix the remix
Secrets of the catalog remix the remixSecrets of the catalog remix the remix
Secrets of the catalog remix the remix
 
A brief history of MARC
A brief history of MARCA brief history of MARC
A brief history of MARC
 
Datalift lod2-paris-24032011
Datalift lod2-paris-24032011Datalift lod2-paris-24032011
Datalift lod2-paris-24032011
 
All About Access Points in RDA
All About Access Points in RDAAll About Access Points in RDA
All About Access Points in RDA
 
Resource Description & Access (RDA)
Resource Description & Access (RDA)Resource Description & Access (RDA)
Resource Description & Access (RDA)
 
Revealing Entities From Texts With a Hybrid Approach
Revealing Entities From Texts With a Hybrid ApproachRevealing Entities From Texts With a Hybrid Approach
Revealing Entities From Texts With a Hybrid Approach
 
The tools of our trade: AACR2/RDA and MARC
The tools of our trade: AACR2/RDA and MARCThe tools of our trade: AACR2/RDA and MARC
The tools of our trade: AACR2/RDA and MARC
 
GDG Meets U event - Big data & Wikidata - no lies codelab
GDG Meets U event - Big data & Wikidata -  no lies codelabGDG Meets U event - Big data & Wikidata -  no lies codelab
GDG Meets U event - Big data & Wikidata - no lies codelab
 
Marc formats : Facilitating sharing of Catalogue Records
Marc formats : Facilitating sharing of Catalogue RecordsMarc formats : Facilitating sharing of Catalogue Records
Marc formats : Facilitating sharing of Catalogue Records
 

Destacado

SEEMP - Semantic Aspects and Interoperability
SEEMP - Semantic Aspects and InteroperabilitySEEMP - Semantic Aspects and Interoperability
SEEMP - Semantic Aspects and InteroperabilityBoris Villazón-Terrazas
 
Linguistic resources enhanced with geospatial Information
Linguistic resources enhanced with geospatial InformationLinguistic resources enhanced with geospatial Information
Linguistic resources enhanced with geospatial InformationBoris Villazón-Terrazas
 
Methodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked DataMethodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked DataBoris Villazón-Terrazas
 
Towards a Commons RDF Java library
Towards a Commons RDF Java libraryTowards a Commons RDF Java library
Towards a Commons RDF Java librarySergio Fernández
 
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...Boris Villazón-Terrazas
 

Destacado (11)

SEEMP - Semantic Aspects and Interoperability
SEEMP - Semantic Aspects and InteroperabilitySEEMP - Semantic Aspects and Interoperability
SEEMP - Semantic Aspects and Interoperability
 
Linguistic resources enhanced with geospatial Information
Linguistic resources enhanced with geospatial InformationLinguistic resources enhanced with geospatial Information
Linguistic resources enhanced with geospatial Information
 
Methodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked DataMethodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked Data
 
Geolinkeddata 07042011 1
Geolinkeddata 07042011 1Geolinkeddata 07042011 1
Geolinkeddata 07042011 1
 
Yet another SPARQL 1.1 brief introduction
Yet another SPARQL 1.1 brief introductionYet another SPARQL 1.1 brief introduction
Yet another SPARQL 1.1 brief introduction
 
Towards a Commons RDF Java library
Towards a Commons RDF Java libraryTowards a Commons RDF Java library
Towards a Commons RDF Java library
 
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...
 
Sitemap4rdf(v2 boris)
Sitemap4rdf(v2 boris)Sitemap4rdf(v2 boris)
Sitemap4rdf(v2 boris)
 
Ecuadorian Geospatial Linked Data
Ecuadorian Geospatial Linked Data Ecuadorian Geospatial Linked Data
Ecuadorian Geospatial Linked Data
 
iSOCO - Research Lab Brief Introduction
iSOCO - Research Lab Brief IntroductioniSOCO - Research Lab Brief Introduction
iSOCO - Research Lab Brief Introduction
 
Data Shapes and Data Transformations
Data Shapes and Data TransformationsData Shapes and Data Transformations
Data Shapes and Data Transformations
 

Similar a Methodological Guidelines for Publishing Linked Data

Linked Open Data Alignment and Enrichment Using Bootstrapping Based Techniques
Linked Open Data Alignment and Enrichment Using Bootstrapping Based TechniquesLinked Open Data Alignment and Enrichment Using Bootstrapping Based Techniques
Linked Open Data Alignment and Enrichment Using Bootstrapping Based TechniquesPrateek Jain
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?Ivan Herman
 
reegle - a new key portal for open energy data
reegle - a new key portal for open energy datareegle - a new key portal for open energy data
reegle - a new key portal for open energy datareeep
 
Prateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State UniversityPrateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State UniversityPrateek Jain
 
Getting Started with Knowledge Graphs
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge GraphsPeter Haase
 
Open Bibliographic Data and E-LIS
Open Bibliographic Data and E-LISOpen Bibliographic Data and E-LIS
Open Bibliographic Data and E-LISGiannis Tsakonas
 
Tsakonas-Robbio·Open Bibliographic Data E-Lis
Tsakonas-Robbio·Open Bibliographic Data E-LisTsakonas-Robbio·Open Bibliographic Data E-Lis
Tsakonas-Robbio·Open Bibliographic Data E-LisLIS EPI Meeting
 
Leslie Johnston: Library Big Data Repository Services, Open Repositories 2012
Leslie Johnston: Library Big Data Repository Services, Open Repositories 2012Leslie Johnston: Library Big Data Repository Services, Open Repositories 2012
Leslie Johnston: Library Big Data Repository Services, Open Repositories 2012lljohnston
 
First they have to find it: Getting Open Government Data Discovered and Used
First they have to find it: Getting Open Government Data Discovered and UsedFirst they have to find it: Getting Open Government Data Discovered and Used
First they have to find it: Getting Open Government Data Discovered and UsedRensselaer Polytechnic Institute
 
Linked Open Data in Libraries, Archives & Museums
Linked Open Data in Libraries, Archives & MuseumsLinked Open Data in Libraries, Archives & Museums
Linked Open Data in Libraries, Archives & MuseumsJon Voss
 
IASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesIASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesDr.-Ing. Thomas Hartmann
 
Knowledge Organization System (KOS) for biodiversity information resources, G...
Knowledge Organization System (KOS) for biodiversity information resources, G...Knowledge Organization System (KOS) for biodiversity information resources, G...
Knowledge Organization System (KOS) for biodiversity information resources, G...Dag Endresen
 
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip finalDeborah McGuinness
 
IASSIT Kansa Presentation
IASSIT Kansa PresentationIASSIT Kansa Presentation
IASSIT Kansa Presentationekansa
 
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinDBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinAnja Jentzsch
 

Similar a Methodological Guidelines for Publishing Linked Data (20)

PhD Proposal Defense - Prateek Jain
PhD Proposal Defense - Prateek JainPhD Proposal Defense - Prateek Jain
PhD Proposal Defense - Prateek Jain
 
Linked Data
Linked DataLinked Data
Linked Data
 
Linked Open Data Alignment and Enrichment Using Bootstrapping Based Techniques
Linked Open Data Alignment and Enrichment Using Bootstrapping Based TechniquesLinked Open Data Alignment and Enrichment Using Bootstrapping Based Techniques
Linked Open Data Alignment and Enrichment Using Bootstrapping Based Techniques
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?
 
reegle - a new key portal for open energy data
reegle - a new key portal for open energy datareegle - a new key portal for open energy data
reegle - a new key portal for open energy data
 
Prateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State UniversityPrateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State University
 
Prateek Jain's Dissertation Defense - Linked Open Data Alignment and Querying
Prateek Jain's Dissertation Defense - Linked Open Data Alignment and QueryingPrateek Jain's Dissertation Defense - Linked Open Data Alignment and Querying
Methodological Guidelines for Publishing Linked Data

  • 1. Methodological Guidelines for Publishing Linked Data
      Boris Villazón-Terrazas, Oscar Corcho
      Facultad de Informática, Universidad Politécnica de Madrid
      Campus de Montegancedo sn, 28660 Boadilla del Monte, Madrid
      http://www.oeg-upm.net
      {bvillazon,ocorcho}@fi.upm.es
      Phone: 34.91.3366605, Fax: 34.91.3524819
      Slides available at: http://www.slideshare.net/boricles/
      Acknowledgements: Asunción Gómez-Pérez, Luis M. Vilches, Victor Saquicela, Alexander de León, and many others that we may have omitted.
      Work distributed under the license Creative Commons Attribution-Noncommercial-Share Alike 3.0
  • 2. Main References
      • Wood, David (Ed), Linking Government Data, 2011: Methodological Guidelines for Publishing Government Linked Data. Boris Villazón-Terrazas, Luis M. Vilches, Oscar Corcho, Asunción Gómez-Pérez
      • Best Practices for Publishing Linked Data. W3C Editor’s Draft, Government Linked Data Working Group. Michael Hausenblas, Bernadette Hyland, Boris Villazón-Terrazas. https://dvcs.w3.org/hg/gld/raw-file/bcb72f87b5cc/bp/index.html
      • Cookbook for Open Government Linked Data. W3C Editor’s Draft, Government Linked Data Working Group. Bernadette Hyland, Boris Villazón-Terrazas, Sarven Capadisli. http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook
  • 3. Guidelines for Publishing Linked Data
      • The process of publishing Linked Data follows an iterative, incremental life cycle model.
      • The guidelines are based on our experience producing Linked Data in several governmental contexts and have been applied in real case scenarios.
  • 6. Specification
      • Identification and analysis of the data sources
      • URI design
      • Definition of the license
  • 7. Specification: Identification and analysis of the data sources
      We have to distinguish two cases:
      • Open and publish data that government agencies have not yet opened up and published
          • Task that may require contacting specific government data owners to get access to their legacy data
      • Reuse and leverage data already opened up and published by government agencies
          • Task to look for these data in public government catalogs
              • Open Government Data
              • datacatalogs.org
              • Open Government Catalog
  • 8. Specification: Identification and analysis of the data sources
      After we have identified and selected the government data sources:
      • Search and compile all the available data and documentation about those resources
      • Identify the schema of those resources, including conceptual components and their relationships
      • Identify the items in the domain, i.e., things whose properties and relations are described in the data sources
  • 9. Specification: GeoLinkedData – Identification of the data sources
      • IGN (National Geographic Institute of Spain): agreement with the IGN; Oracle & MySQL
      • INE (National Statistic Institute of Spain): data sources available in a public data catalog
  • 10. Specification: GeoLinkedData – Analysis of the data sources
      (figure: table with columns Year, Province, Industry Production Index)
  • 11. Specification: URI Design
      • Use meaningful URIs, instead of opaque URIs, when possible
      • Separate TBox (ontology model) URIs from ABox (instances) URIs
      • Base URI
          http://data.gov.bo/
          http://health.data.gov.bo/
      • TBox URIs
          http://data.gov.bo/ontology/{class|property}
      • ABox URIs
          http://data.gov.bo/resource/
          http://data.gov.bo/resource/province/Tiraque
  • 12. Specification: GeoLinkedData - URI design
      • Base URI
          http://linkeddata.es/
          http://geo.linkeddata.es/
      • TBox URIs
          http://geo.linkeddata.es/ontology/{concept|property}
          http://geo.linkeddata.es/ontology/Provincia
      • ABox URIs
          http://geo.linkeddata.es/resource/{r. type}/{r. name}
          http://geo.linkeddata.es/resource/Provincia/Madrid
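The URI pattern on slides 11 and 12 can be sketched as a pair of small helper functions. This is only an illustration under stated assumptions: the function names `tbox_uri` and `abox_uri` are hypothetical, and percent-encoding of names that contain spaces is an assumption, since the slides do not say how such names are handled.

```python
from urllib.parse import quote

# Base URI taken from slide 12.
BASE = "http://geo.linkeddata.es/"


def tbox_uri(term: str) -> str:
    """Build an ontology (TBox) URI: {base}ontology/{concept|property}."""
    return f"{BASE}ontology/{quote(term, safe='')}"


def abox_uri(resource_type: str, name: str) -> str:
    """Build an instance (ABox) URI: {base}resource/{type}/{name}.

    Percent-encoding is an assumption of this sketch, not something
    the slides prescribe.
    """
    return f"{BASE}resource/{quote(resource_type, safe='')}/{quote(name, safe='')}"


if __name__ == "__main__":
    print(tbox_uri("Provincia"))
    print(abox_uri("Provincia", "Madrid"))
```

Keeping the two builders separate mirrors the guideline of separating TBox URIs from ABox URIs: the ontology can evolve under `/ontology/` without touching instance identifiers under `/resource/`.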
  • 13. Specification: Definition of the license
      • Several possibilities
          • The UK Open Government License
          • Open Database License
          • Public Domain Dedication and License
          • Open Data Commons Attribution License
          • The Creative Commons Licenses
      • It is also possible to reuse and apply an existing license of the government data sources.
  • 14. Specification: GeoLinkedData - Definition of the license
      • Reusing the original license of the government data sources. The IGN and INE data sources have their own license, similar to the Attribution-Share Alike 2.5 Generic License: http://creativecommons.org/licenses/by-sa/2.5/
  • 16. Modelling: Ontology
      • An ontology is an engineering artifact, which provides:
          • A set of terms
          • A set of explicit assumptions regarding the intended meaning of the terms
              • Almost always including concepts and their classification
              • Almost always including properties between concepts
      • Shared understanding of a domain of interest
      • Ontologies are expressed in OWL or RDF(S), both based on RDF
  • 17. Modelling: Reuse available vocabularies
      • Search for suitable vocabularies (Linked Open Vocabularies)
      • Are there suitable vocabularies?
          • Yes: build the vocabulary by reusing available vocabularies
          • No: …
  • 18. Modelling: Reuse available non-ontological resources
      • Search for suitable non-ontological resources (highly reliable Web sites, domain-related sites, government catalogs)
      • Are there suitable resources?
          • Yes: build the vocabulary by transforming available resources
          • No: build the vocabulary from scratch
  • 19. Modelling: GeoLinkedData
      • Vocabularies shown in the figure:
          • WGS84 Geo Positioning: an RDF vocabulary
          • scv:Dimension, scv:Item, scv:Dataset
          • Hydrographical phenomena (rivers, lakes, etc.)
          • Vocabulary for instants, intervals, durations, etc.
          • Names and international code systems for territories and groups
          • Ontology for OGC Geography Markup Language
      • Classes: 33; Object Properties: 44; Data Properties: 318
      • http://neon-toolkit.org/
  • 20. Modelling: GeoLinkedData
  • 22. Generation
      • Transformation
      • Data cleansing
      • Linking
  • 23. Generation: Transformation
      • Take the data sources selected in the specification activity and transform them to RDF according to the vocabulary created in the modelling activity
      • Some tools
          • CSV and spreadsheets: RDF extension of Google Refine, XLWrap, RDF123, NOR2O
          • RDB: D2R Server, ODEMapster, W3C RDB2RDF WG – R2RML
          • XML: GRDDL, ReDeFer
  • 24. Generation: GeoLinkedData - Transformation
      • INE: NOR2O
      • IGN: ODEMapster
      • IGN geospatial column: Geometry2RDF
  • 25. Generation: GeoLinkedData - Transformation
      (figure: table with columns Year, Province, Industry Production Index, processed with NOR2O)
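As a rough illustration of the transformation step for tabular sources, the sketch below turns rows with the three columns shown on slide 25 into N-Triples using only the Python standard library. It is not how NOR2O works internally. The sample values, the `observation` URI pattern, and the property names `province`, `year`, and `industryProductionIndex` are all hypothetical and chosen only to fit the URI scheme from slide 12.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://geo.linkeddata.es/"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Made-up sample rows; the real INE figures are not in the slides.
SAMPLE_CSV = """Year,Province,IndustryProductionIndex
2008,Madrid,101.5
2008,Granada,98.2
"""


def rows_to_ntriples(csv_text):
    """Emit three N-Triples statements per CSV row."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        year = row["Year"]
        province_id = quote(row["Province"], safe="")
        index_value = row["IndustryProductionIndex"]
        province = f"<{BASE}resource/Provincia/{province_id}>"
        # Hypothetical URI pattern for one observation per province and year.
        observation = f"<{BASE}resource/observation/{province_id}/{year}>"
        triples.append(f"{observation} <{BASE}ontology/province> {province} .")
        triples.append(f'{observation} <{BASE}ontology/year> "{year}"^^<{XSD}gYear> .')
        triples.append(
            f'{observation} <{BASE}ontology/industryProductionIndex> '
            f'"{index_value}"^^<{XSD}decimal> .'
        )
    return triples


if __name__ == "__main__":
    for triple in rows_to_ntriples(SAMPLE_CSV):
        print(triple)
```

In practice the mapping from columns to vocabulary terms would come from the modelling activity rather than being hard-coded as it is here.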
  • 26. Generation: GeoLinkedData - Transformation
      • R2O is an extensible, fully declarative language to describe mappings between relational database schemas and ontologies.
      • The ODEMapster processor generates RDF instances from relational instances based on the mapping description expressed in the R2O document
      • www.oeg-upm.net/index.php/en/downloads/9-r2o-odempaster
  • 27. Generation: GeoLinkedData - Transformation
      • Creation of the R2O Mappings
  • 28. Generation: GeoLinkedData - Transformation
      • Excerpt of the R2O document
  • 29. Generation: GeoLinkedData - Transformation
      • Tool for generating RDF from geometrical information
      • The geometry could be available in GML or WKT
      • The RDF generated follows our Geometry Model
      • http://www.oeg-upm.net/index.php/en/downloads/151-geometry2rdf
  • 30. Generation: GeoLinkedData - Transformation
      • Oracle SDO_UTIL package
          SELECT TO_CHAR(SDO_UTIL.TO_GML311GEOMETRY(geometry)) AS Gml311Geometry
          FROM "BCN200"."BCN200_0301L_RIO" c
          WHERE c.Etiqueta = 'Arroyo'
  • 32. Generation: Data Cleansing
      • To find possible errors, as identified by Hogan et al.
          • HTTP-level issues, such as accessibility and dereferenceability, e.g., HTTP URIs return 40x/50x errors
          • Reasoning issues, such as namespace without vocabulary, e.g., rss:item term invented
          • Malformed/incompatible datatypes, e.g., “true” as xsd:int
      • To fix the identified errors
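The malformed-datatype case from slide 32 ("true" typed as xsd:int) can be detected with a simple lexical check. The sketch below is a minimal illustration, not the procedure used by Hogan et al.: it covers only three XSD datatypes, checks only the lexical shape (not value ranges such as the 32-bit limit of xsd:int), and the function name `is_well_formed` is hypothetical.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Lexical patterns for a few XSD datatypes; deliberately incomplete.
LEXICAL_FORMS = {
    XSD + "int": re.compile(r"[+-]?[0-9]+"),
    XSD + "boolean": re.compile(r"true|false|1|0"),
    XSD + "decimal": re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"),
}


def is_well_formed(lexical, datatype):
    """Return False when the literal cannot be a value of the datatype.

    Datatypes this sketch does not know about are not checked and
    therefore reported as well formed.
    """
    pattern = LEXICAL_FORMS.get(datatype)
    if pattern is None:
        return True
    return pattern.fullmatch(lexical) is not None


if __name__ == "__main__":
    print(is_well_formed("true", XSD + "int"))      # the error case from the slide
    print(is_well_formed("true", XSD + "boolean"))
```

A check like this only finds the errors; deciding whether to fix the literal or the datatype still requires looking at the source data.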
  • 33. Generation: GeoLinkedData – Data Cleansing
      • Errors
          • Some resources with the same name were mixed. For example, Granada municipality belongs to Granada province, and La Granada municipality belongs to Barcelona province.
          • Autonomous communities that only have one province, e.g., Murcia Region, missed some municipalities, but their corresponding provinces, e.g., Murcia Province, have the correct number of municipalities.
          • Some hydrographical resources missed some parts of their geometrical information.
  • 34. Generation: Linking
      • Identify suitable data sets as linking targets: http://ckan.net
      • Discover relationships between data items
          • LIMES: http://aksw.org/Projects/limes
          • Silk Framework: http://www4.wiwiss.fu-berlin.de/bizer/silk/
      • Validate the relationships discovered
          • sameAs Validator: http://oegdev.dia.fi.upm.es:8080/sameAs/
  • 35. Generation: GeoLinkedData - Linking
      • GeoLinked Data: http://geo.linkeddata.es/.../Madrid
      • DBPedia: http://dbpedia.org/resource/Madrid
      • GeoNames: http://sws.geonames.org/6355233/
  • 36. Generation: GeoLinkedData - Linking
      • http://oegdev.dia.fi.upm.es:8080/sameAs/
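To make the "discover relationships" step concrete, here is a deliberately naive sketch that proposes owl:sameAs candidates by comparing normalized labels. Silk and LIMES use configurable similarity metrics and are far more capable; this is not their algorithm. The function names and the label dictionaries are hypothetical, and the local URI reuses the example from slide 12.

```python
import unicodedata

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalize(label):
    """Lower-case a label and strip accents and surrounding whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


def candidate_links(source, target):
    """Propose sameAs triples for resources whose normalized labels are equal.

    `source` and `target` map resource URIs to labels.
    """
    index = {}
    for uri, label in target.items():
        index.setdefault(normalize(label), []).append(uri)
    links = []
    for uri, label in source.items():
        for match in index.get(normalize(label), []):
            links.append(f"<{uri}> <{OWL_SAME_AS}> <{match}> .")
    return links


if __name__ == "__main__":
    local = {"http://geo.linkeddata.es/resource/Provincia/Madrid": "Madrid"}
    remote = {
        "http://dbpedia.org/resource/Madrid": "madrid",
        "http://sws.geonames.org/6355233/": "Madrid",
    }
    for link in candidate_links(local, remote):
        print(link)
```

Exact matching on normalized labels keeps "Granada" and "La Granada" apart, but it cannot tell a province from a city of the same name, which is why the output should be treated as candidates and passed through a validation step such as the sameAs Validator.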
  • 38. Publication
      • Dataset publication
      • Metadata publication
      • Dataset discovery
  • 39. Publication: Dataset Publication
      • Tools for storing RDF
          • Virtuoso Universal Server, Jena, Sesame, 4Store, YARS, OWLIM
      • SPARQL endpoint and Linked Data frontend
          • Pubby, Talis Platform, Fuseki
  • 40. Publication: Metadata Publication
      • VoID allows expressing metadata about RDF datasets
      • Open Provenance Model
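A VoID description is itself a small RDF document. The sketch below assembles one in Turtle with plain string formatting, using a few terms from the VoID and Dublin Core vocabularies (`void:Dataset`, `void:sparqlEndpoint`, `void:triples`, `dcterms:title`, `dcterms:license`). The dataset URI, endpoint URL, and triple count in the usage example are hypothetical placeholders, and the function assumes the title contains no characters that need escaping in a Turtle string.

```python
def void_description(dataset_uri, title, sparql_endpoint, license_uri, triples):
    """Return a minimal VoID dataset description in Turtle."""
    return "\n".join([
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',
        f"    dcterms:license <{license_uri}> ;",
        f"    void:sparqlEndpoint <{sparql_endpoint}> ;",
        f"    void:triples {triples} .",
    ])


if __name__ == "__main__":
    print(void_description(
        dataset_uri="http://geo.linkeddata.es/dataset",        # hypothetical
        title="GeoLinkedData",
        sparql_endpoint="http://geo.linkeddata.es/sparql",     # hypothetical
        license_uri="http://creativecommons.org/licenses/by-sa/2.5/",
        triples=1000,                                          # made-up count
    ))
```

For anything beyond a quick illustration, an RDF library would be a safer way to serialize the description, since it handles escaping and prefix management.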
  • 41. Publication: Dataset discovery
      • Register the dataset into CKAN Registry
      • Generate sitemap files for your dataset, by using sitemap4rdf
      • Submit the sitemap location to Google and Sindice
      • http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
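For readers unfamiliar with sitemaps, the sketch below builds a plain sitemaps.org sitemap listing resource URIs. It does not reproduce the output of sitemap4rdf, whose exact format is not described in the slides; the function name `build_sitemap` is hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemaps.org <urlset> document with one <url> per URI."""
    # Serialize the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap(["http://geo.linkeddata.es/resource/Provincia/Madrid"]))
```

The resulting file would be placed on the dataset's web server and its location submitted to the search engines mentioned on the slide.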
  • 42. Publication: GeoLinkedData – Dataset publication
      • HTML, Linked Data and SPARQL access, including provenance support
      • Pubby 0.3: http://www4.wiwiss.fu-berlin.de/pubby/
      • Virtuoso 6.1.0
  • 46. Exploitation: GeoLinkedData
      • http://oegdev.dia.fi.upm.es/projects/map4rdf/
      • map4rdf:
          • Google maps viewer of RDF resources
          • Resources with spatial information
          • Extensible with Google plugins
          • Used in other applications like Aemet, Goodrelations
      • (figure: map4rdf, SPARQL, Triplestore)
  • 50. Provinces – Industry Production Index