Creating Knowledge out of Interlinked Data




LOD2 Webinar . 29.11.2011 . Page 1                    http://lod2.eu




        LOD2 is a large-scale integrating project co-funded by the European
        Commission within the FP7 Information and Communication Technologies
        Work Programme. This 4-year project comprises leading Linked Open
        Data technology researchers, companies, and service providers. Coming
        from across 12 countries the partners are coordinated by the Agile
        Knowledge Engineering and Semantic Web Research Group at the
        University of Leipzig, Germany.

        LOD2 will integrate and syndicate Linked Data with existing large-scale
        applications. The project shows the benefits in the scenarios of Media and
        Publishing, Corporate Data intranets and eGovernment.








        Once per month the LOD2 webinar series offers a free webinar about
        tools and services along the Linked Open Data Life Cycle.

        Stay with us and learn more about acquisition, editing, composing,
        connected applications – and finally publishing Linked Open Data.







Web-based Systems Group

• School of Business & Economics, Freie Universität Berlin
• Research focus: Linked Data technologies for extending the World
  Wide Web with a global data commons
• Funded Projects:
      •   LOD2 - Creating Knowledge out of Interlinked Data
      •   LATC - LOD Around The Clock
      •   PlanetData
• Visit us at: http://wbsg.de




LOD2 Webinar . 21.02.2012 . Page 4                            http://lod2.eu



Main Projects

• DBpedia is a community effort led by WBSG and AKSW to:
      •   Extract structured information from Wikipedia
      •   Make this information available on the Web under an open license
      •   Interlink the DBpedia dataset with other open datasets on the Web
      •   DBpedia Spotlight: Automatic annotation of free-text with DBpedia URIs


• D2R: Publishing relational databases on the Semantic Web

• Data Integration
      • R2R: Translates Web data that is represented using terms from different
        vocabularies into a single target vocabulary.
      • Silk: Tool for generating RDF links between data items.
      • LDIF: Translates heterogeneous Linked Data from the Web into a clean,
        local target representation while keeping track of data provenance.



Motivation

• The Web of Data is a single global data space because data sources
  are connected by links
• Over 30 billion triples published as Linked Open Data (09/19/2011)
• But:
      • Less than 500 million links
      • Most publishers only link to one other dataset




Figure: LOD data sets by the number of other data sources that are the target of outgoing RDF links.



Challenges for Link Discovery

• The Web of Data is heterogeneous
      • Many different vocabularies are in use
      • Different data formats
      • Many different ways to represent the same information




Figure: Distribution of the most widely used vocabularies



Challenges for Link Discovery

• Large range of domains
      • 277 data sources in the LOD cloud from a variety of domains
      • Linkage Rules are different in each domain
      • Writing a Linkage Rule for each of these domains is usually not trivial




Figure: Distribution of triples by domain




Challenges for Link Discovery

• Scalability
      •   The current LOD cloud contains 277 datasets (August 2011)
      •   Over 31 billion triples in total
      •   Infeasible to compare every possible entity pair
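
To make the scalability point concrete, the following is a rough sketch of why exhaustive comparison does not scale and how a simple blocking step can cut the number of candidate pairs. The entity counts, the sample labels, and the blocking key (first letter of the label) are illustrative assumptions. They are not figures from the LOD cloud and not a description of how Silk indexes data.

```python
from collections import defaultdict


def naive_comparisons(n_source: int, n_target: int) -> int:
    """Number of pairs when every source entity is compared to every target entity."""
    return n_source * n_target


def first_letter(label: str) -> str:
    """Hypothetical blocking key: the lowercased first character of a label."""
    return label[:1].lower()


def blocked_pairs(source, target, key):
    """Return only the pairs whose blocking keys are equal."""
    index = defaultdict(list)
    for t in target:
        index[key(t)].append(t)
    return [(s, t) for s in source for t in index.get(key(s), [])]


# Two data sets with one million entities each already give 10**12 pairs.
million_by_million = naive_comparisons(1_000_000, 1_000_000)

source = ["Alien", "Blade Runner", "Casablanca"]
target = ["alien (1979)", "Brazil", "casablanca", "Dune"]

pairs = blocked_pairs(source, target, first_letter)
print(len(pairs), "candidate pairs instead of", naive_comparisons(len(source), len(target)))
```

A blocking key this crude would miss matches whose labels start differently, so it trades recall for speed. The sketch only illustrates the idea of avoiding the full cross product.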




Figure: LOD datasets per domain



Link Discovery Tools

•    Tools enable data publishers to set links
•    Most tools generate links based on user-defined linkage rules
•    A linkage rule specifies the conditions data items must fulfill in order to be
     interlinked
•    Popular Link Discovery Tools:
      •   Silk Link Discovery Framework
      •   LIMES
      •   Others:
          http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/EquivalenceMining







Silk Link Discovery Framework

•    Tool for discovering links between data items within different Linked Data
     sources.
•    The Silk Link Specification Language (Silk-LSL) allows users to express complex
     linkage rules
•    Can be used to generate owl:sameAs links as well as other relationships
•    Scalability and high performance through efficient data handling
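
As an illustration of what a generated owl:sameAs link looks like, here is a minimal sketch that serializes link pairs as N-Triples lines. The two entity URIs are made-up placeholders, and the helper `to_ntriple` is a hypothetical name, not part of Silk. Only the owl:sameAs property URI is the standard one.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def to_ntriple(source_uri: str, target_uri: str, predicate: str = OWL_SAME_AS) -> str:
    """Format one RDF link as an N-Triples statement (URIs assumed to be already valid)."""
    return f"<{source_uri}> <{predicate}> <{target_uri}> ."


# Placeholder URIs for illustration only.
links = [("http://example.org/a/movie1", "http://example.org/b/film42")]
lines = [to_ntriple(s, t) for s, t in links]
print("\n".join(lines))
```

Passing a different `predicate` would express another relationship type instead of identity.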







Silk Versions

• Silk Single Machine
      • Generate links on a single machine
      • Local or remote data sets
• Silk MapReduce
      • Generate RDF links using a cluster of multiple machines
      • Based on Hadoop (Can be run on Amazon Elastic MapReduce)
• Silk Server
      • Provides an HTTP API for matching instances from an incoming stream
        of RDF data while keeping track of known entities
      • Can be used as an identity resolution component within applications that
        consume Linked Data from the Web







(Simplified) Linking Workflow




• Select Datasets
      •   Select two data sources
      •   Select the entity types to be interlinked
• Write Linkage Rule
      •   Specifies how two entities are compared
      •   Can be written manually or learned
• Generate Links
      •   Locally or on a Hadoop cluster
      •   Write links to a file or a triple store
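
The three workflow steps can be sketched end to end in a few lines. This is a toy illustration under stated assumptions: the in-memory data sets, the `type` and `title` fields, the rule `same_title`, and the tab-separated output format are all hypothetical and do not reflect Silk's data model or file formats.

```python
import io


def select_entities(dataset, entity_type):
    """Step 1: keep only the entities of the type to be interlinked."""
    return [e for e in dataset if e["type"] == entity_type]


def same_title(a, b):
    """Step 2: a hand-written rule, here equality of titles ignoring case and padding."""
    return a["title"].strip().lower() == b["title"].strip().lower()


def generate_links(source, target, rule, out):
    """Step 3: compare entities pairwise and write one line per link found."""
    count = 0
    for a in source:
        for b in target:
            if rule(a, b):
                out.write(f"{a['id']}\t{b['id']}\n")
                count += 1
    return count


dataset_a = [
    {"id": "a:1", "type": "Movie", "title": "Metropolis"},
    {"id": "a:2", "type": "Person", "title": "Fritz Lang"},
]
dataset_b = [
    {"id": "b:7", "type": "Movie", "title": " metropolis "},
    {"id": "b:8", "type": "Movie", "title": "Nosferatu"},
]

out = io.StringIO()  # stands in for a file or a triple store
n_links = generate_links(
    select_entities(dataset_a, "Movie"),
    select_entities(dataset_b, "Movie"),
    same_title,
    out,
)
print(n_links, "link(s) written")
```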







Linkage Rule Components

A linkage rule is represented as a tree consisting of 4 types of operators:

• RDF paths
      •   Similar to SPARQL 1.1 Property Paths
      •   Examples:
          ?movie/dbpedia:director/rdfs:label
          ?person/label[@lang='en']
• Transformations
      •   Transform the result set of an RDF path
      •   Variety of built-in transformations
      •   Examples: LowerCase, RegexReplace, Stem
• Similarity Metrics
      •   Similarity of two inputs based on a user-defined metric
      •   Examples: various string similarity metrics, geographic similarity, date similarity
• Aggregations
      •   Aggregate multiple similarity metrics
      •   Examples: Min, Max, Average, Quadratic Mean, Geometric Mean
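
The tree structure of a linkage rule can be sketched with four small operator classes, one per operator type. This is a simplified illustration, not Silk's implementation or the Silk-LSL syntax: the class names, the dictionary-based entities, the property names, and the use of Python's `difflib` as the string metric are all assumptions made for the example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict, List

Entity = Dict[str, str]


@dataclass
class Path:
    """Leaf operator: reads the value of one property from an entity."""
    prop: str

    def values(self, entity: Entity) -> List[str]:
        value = entity.get(self.prop)
        return [] if value is None else [value]


@dataclass
class Transform:
    """Applies a function to every value produced by its input operator."""
    fn: Callable[[str], str]
    input: object  # a Path or another Transform

    def values(self, entity: Entity) -> List[str]:
        return [self.fn(v) for v in self.input.values(entity)]


@dataclass
class Compare:
    """Similarity metric over the values of a source and a target operator."""
    source: object
    target: object
    metric: Callable[[str, str], float]

    def score(self, a: Entity, b: Entity) -> float:
        scores = [self.metric(x, y)
                  for x in self.source.values(a)
                  for y in self.target.values(b)]
        return max(scores, default=0.0)


@dataclass
class Aggregate:
    """Combines the scores of several child comparisons into one score."""
    fn: Callable[[List[float]], float]
    children: list

    def score(self, a: Entity, b: Entity) -> float:
        return self.fn([child.score(a, b) for child in self.children])


def string_similarity(x: str, y: str) -> float:
    return SequenceMatcher(None, x, y).ratio()


def equal(x: str, y: str) -> float:
    return 1.0 if x == y else 0.0


# Rule: minimum of (lowercased label similarity, release year equality).
rule = Aggregate(min, [
    Compare(Transform(str.lower, Path("label")),
            Transform(str.lower, Path("title")),
            string_similarity),
    Compare(Path("year"), Path("released"), equal),
])

a = {"label": "Blade Runner", "year": "1982"}
b = {"title": "BLADE RUNNER", "released": "1982"}
c = {"title": "Blade Runner 2049", "released": "2017"}

print(rule.score(a, b))  # 1.0: identical after lowercasing, same year
print(rule.score(a, c))  # 0.0: the year comparison fails and Min is used
```

Swapping `min` for an averaging function would make the rule tolerant of a single failing comparison, which is the kind of choice the aggregation operator expresses.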





Silk Workbench

• Silk Workbench is a web application which guides the user through
  the process of interlinking different data sources.
• Enables the user to manage different sets of data sources and linking
  tasks.
• Offers a graphical editor which enables the user to easily create and
  edit linkage rules
• Offers tools to evaluate the current linkage rule
• Includes support for learning linkage rules







Workspace

The Workspace holds a set of projects
consisting of:

•    Data Sources
      •   Holds all information that is needed
          by Silk to retrieve entities from it.
      •   Usually a file dump or a SPARQL
          endpoint
•    Linking Tasks
      •   Interlinks a type of entity between two
          data sources
      •   e.g. Interlinking movies in DBpedia
          and LinkedMDB







Linkage Rule Editor

•    Allows viewing and editing of linkage rules
•    Linkage Rules are shown as a tree
•    Editing using drag & drop.







Learning Linkage Rules

•    Linkage Rules can be learned interactively
•    Can be used to generate new linkage rules or to improve existing rules
•    Learned Linkage Rule can be viewed and edited by the user
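
The slide does not say how the Workbench learns rules, so the following is only a stand-in to show the general idea of learning from labeled examples: given similarity scores for pairs that a user has confirmed or rejected, pick the acceptance threshold with the best F-measure. The function name, the example scores, and the threshold search itself are assumptions and should not be read as Silk's learning algorithm.

```python
def learn_threshold(scored_pairs):
    """Return (threshold, f1) maximizing F1 over labeled (similarity, is_match) pairs."""
    best_t, best_f1 = 0.0, -1.0
    # Only the observed scores need to be tried as candidate thresholds.
    for t in sorted({score for score, _ in scored_pairs}):
        tp = sum(1 for s, m in scored_pairs if s >= t and m)
        fp = sum(1 for s, m in scored_pairs if s >= t and not m)
        fn = sum(1 for s, m in scored_pairs if s < t and m)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical user feedback: similarity score and whether the pair is a true match.
examples = [(0.95, True), (0.90, True), (0.70, False), (0.60, True), (0.30, False)]

threshold, f1 = learn_threshold(examples)
print(f"accept pairs with similarity >= {threshold} (F1 = {f1:.3f})")
```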







Demo 1: Interlinking Movies

• Interlinking movies between two data sources:
      •   DBpedia: Linked dataset extracted from Wikipedia
      •   LinkedMDB: Large dataset for movies
•    For demonstration, we assume that no existing links are available








                                      Demo





Demo 2: Cora

•    Cora:
      •   Dataset of citations to research papers from the Cora Computer Science research
          paper search engine
      •   Frequently used for evaluating the performance of interlinking approaches
•    A set of reference links is available
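
When reference links are available, generated links are commonly scored with precision, recall, and F-measure. The sketch below shows that computation on made-up identifiers. The identifiers and the function name `evaluate` are hypothetical, and the numbers are not results from the Cora data set.

```python
def evaluate(generated, reference):
    """Compare generated links against reference links; return (precision, recall, f1)."""
    generated, reference = set(generated), set(reference)
    true_positives = len(generated & reference)
    precision = true_positives / len(generated) if generated else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up citation identifiers for illustration.
reference = {("p1", "p2"), ("p3", "p4"), ("p5", "p6"), ("p7", "p8")}
generated = {("p1", "p2"), ("p3", "p4"), ("p9", "p10")}

precision, recall, f1 = evaluate(generated, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```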








                                      Demo





Availability

•    Silk can be downloaded from the official homepage at:

                            http://www4.wiwiss.fu-berlin.de/bizer/silk/

•    Support is provided through the official mailing list:

                          http://groups.google.com/group/silk-discussion

•    The latest source code is available from the project's Git repository and can
     be browsed online at:

                         http://www.assembla.com/code/silk/git/nodes/

•    Silk is licensed under the terms of the Apache Software





                                      Q&A






Credits

Jingle       R.E.M., Martin Kaltenböck, Florian Kondert
Coordination Thomas Thurner
             Martin Kaltenböck
Moderation   Martin Kaltenböck
Presented by Robert Isele








        Hope you enjoyed staying with us – if you need more detailed information,
        visit us at www.lod2.eu and let us know how we can improve to meet your
        expectations!

        Don’t forget to register for our next webinar

        March 2012 – LIMES + SAIM (University of Leipzig)
        24.04.2012 – D2R and Sparqlify (University of Leipzig, Free University Berlin)

        Have a great day and don’t forget ...








                                                      http://lod2.eu
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 

LOD2 Webinar Series: SILK

  • 1. Creating Knowledge out of Interlinked Data LOD2 Webinar . 29.11.2011 . Page 1 http://lod2.eu
  • 2. LOD2 is a large-scale integrating project co-funded by the European Commission within the FP7 Information and Communication Technologies Work Programme. This 4-year project comprises leading Linked Open Data technology researchers, companies, and service providers. Coming from 12 countries, the partners are coordinated by the Agile Knowledge Engineering and Semantic Web Research Group at the University of Leipzig, Germany. LOD2 will integrate and syndicate Linked Data with existing large-scale applications. The project shows the benefits in the scenarios of Media and Publishing, Corporate Data intranets and eGovernment.
  • 3. Once per month the LOD2 webinar series offers a free webinar about tools and services along the Linked Open Data Life Cycle. Stay with us and learn more about acquisition, editing, composing, connected applications – and finally publishing Linked Open Data.
  • 4. Web-based Systems Group • School of Business & Economics, Freie Universität Berlin • Research focus: Linked Data technologies for extending the World Wide Web with a global data commons • Funded Projects: • LOD2 - Creating Knowledge out of Interlinked Data • LATC - LOD Around The Clock • PlanetData • Visit us at: http://wbsg.de LOD2 Webinar . 21.02.2012 . Page 4 http://lod2.eu
  • 5. Main Projects • DBpedia is a community effort led by WBSG and AKSW to: • Extract structured information from Wikipedia • Make this information available on the Web under an open license • Interlink the DBpedia dataset with other open datasets on the Web • DBpedia Spotlight: Automatic annotation of free text with DBpedia URIs • D2R: Publishing relational databases on the Semantic Web • Data Integration • R2R: Translates Web data that is represented using terms from different vocabularies into a single target vocabulary. • Silk: Tool for generating RDF links between data items. • LDIF: Translates heterogeneous Linked Data from the Web into a clean, local target representation while keeping track of data provenance.
  • 6. Motivation • The Web of Data is a single global data space because data sources are connected by links • Over 30 billion triples published as Linked Open Data (09/19/2011) • But: • Less than 500 million links • Most publishers only link to one other dataset (Figure: LOD data sets by the number of other data sources that are target of outgoing RDF links.)
  • 7. Challenges for Link Discovery • The Web of Data is heterogeneous • Many different vocabularies are in use • Different data formats • Many different ways to represent the same information (Figure: Distribution of the most widely used vocabularies)
  • 8. Challenges for Link Discovery • Large range of domains • 277 data sources in the LOD cloud from a variety of domains • Linkage Rules are different in each domain • Writing a Linkage Rule for each of these domains is usually not trivial (Figure: Distribution of triples by domain)
  • 9. Challenges for Link Discovery • Scalability • The current LOD cloud contains 277 datasets (August 2011) • Over 31 billion triples in total • Infeasible to compare every possible entity pair (Figure: LOD datasets per domain)
  • 10. Link Discovery Tools • Tools enable data publishers to set links • Most tools generate links based on user-defined linkage rules • A linkage rule specifies the conditions data items must fulfill in order to be interlinked • Popular Link Discovery Tools: • Silk Link Discovery Framework • LIMES • Others: http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/EquivalenceMining
  • 11. Silk Link Discovery Framework • Tool for discovering links between data items within different Linked Data sources. • The Silk Link Specification Language (Silk-LSL) allows users to express complex linkage rules • Can be used to generate owl:sameAs links as well as other relationships • Scalability and high performance through efficient data handling
  • 12. Silk Versions • Silk Single Machine • Generate links on a single machine • Local or remote data sets • Silk MapReduce • Generate RDF links using a cluster of multiple machines • Based on Hadoop (can be run on Amazon Elastic MapReduce) • Silk Server • Provides an HTTP API for matching instances from an incoming stream of RDF data while keeping track of known entities • Can be used as an identity resolution component within applications that consume Linked Data from the Web
  • 13. (Simplified) Linking Workflow
      • Select Datasets
          • Select two data sources
          • Select the entity types to be interlinked
      • Write Linkage Rule
          • Specifies how two entities are compared
          • Can be written manually or learned
      • Generate Links
          • Locally or on a Hadoop cluster
          • Write links to a file or a triple store
  • 14. Linkage Rule Components: A linkage rule is represented as a tree consisting of 4 types of operators:
      • RDF paths
          • Similar to SPARQL 1.1 Property Paths
          • Examples: ?movie/dbpedia:director/rdfs:label and ?person/label[@lang='en']
      • Transformations
          • Transform the result set of an RDF path
          • Variety of built-in transformations
          • Examples: LowerCase, RegexReplace, Stem
      • Similarity Metrics
          • Similarity of two inputs based on a user-defined metric
          • Examples: various string similarity metrics, geographic similarity, date similarity
      • Aggregations
          • Aggregate multiple similarity metrics
          • Examples: Min, Max, Average, Quadratic Mean, Geometric Mean
  • 15. Silk Workbench • Silk Workbench is a web application which guides the user through the process of interlinking different data sources. • Enables the user to manage different sets of data sources and linking tasks. • Offers a graphical editor which enables the user to easily create and edit linkage rules • Offers tools to evaluate the current linkage rule • Includes support for learning linkage rules
  • 16. Workspace: The Workspace holds a set of projects consisting of: • Data Sources • Hold all information that is needed by Silk to retrieve entities from them. • Usually a file dump or a SPARQL endpoint • Linking Tasks • Interlink a type of entity between two data sources • e.g. interlinking movies in DBpedia and LinkedMDB
  • 17. Linkage Rule Editor • Allows viewing and editing linkage rules • Linkage Rules are shown as a tree • Editing using drag & drop.
  • 18. Learning Linkage Rules • Linkage Rules can be learned interactively • Can be used to generate new linkage rules or to improve existing rules • Learned linkage rules can be viewed and edited by the user
  • 19. Demo 1: Interlinking Movies • Interlinking movies between two data sources: • DBpedia: Linked dataset extracted from Wikipedia • LinkedMDB: Large dataset for movies • For demonstration, we assume that no existing links are available
  • 20. Demo
  • 21. Demo 2: Cora • Cora: • Dataset of citations to research papers from the Cora Computer Science research paper search engine • Frequently used for evaluating the performance of interlinking approaches • A set of reference links is available
  • 22. Demo
  • 23. Availability • Silk can be downloaded from the official homepage at: http://www4.wiwiss.fu-berlin.de/bizer/silk/ • Support is provided through the official mailing list: http://groups.google.com/group/silk-discussion • The latest source code is available from the project's Git repository and can be browsed online at: http://www.assembla.com/code/silk/git/nodes/ • Silk is licensed under the terms of the Apache Software
  • 24. Q&A
  • 25. Credits • Jingle: R.E.M., Martin Kaltenböck, Florian Kondert • Coordination: Thomas Thurner, Martin Kaltenböck • Moderation: Martin Kaltenböck • Presented by: Robert Isele
  • 26. Hope you enjoyed staying with us – if you need more detailed information, visit us at www.lod2.eu and let us know how we can improve to meet your expectations! Don’t forget to register for our next webinars: • March 2012: LIMES + SAIM (University of Leipzig) • 24.04.2012: D2R and Sparqlify (University of Leipzig, Free University Berlin) Have a great day and don’t forget ... http://lod2.eu
  • 27. http://lod2.eu
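
The linkage rule components from slide 14 and the reference-link evaluation mentioned for the Cora demo can be illustrated with a small sketch. This is not Silk code and does not use Silk-LSL: the class names, the flat dictionary entities, the example.org URIs, the 0.9 threshold and the use of Python's difflib as the string similarity metric are all assumptions made for illustration. It builds a rule tree from the four operator types (path, transformation, similarity metric, aggregation), compares every entity pair naively, and scores the resulting links against a hypothetical set of reference links.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Set, Tuple

Entity = Dict[str, str]   # hypothetical flat entity: property name -> value
Link = Tuple[str, str]    # (source URI, target URI)


@dataclass
class Path:
    """Leaf operator: selects one property value from an entity."""
    prop: str

    def __call__(self, entity: Entity) -> str:
        return entity.get(self.prop, "")


@dataclass
class Transform:
    """Applies a string transformation (e.g. lower-casing) to an input."""
    source: Callable[[Entity], str]
    func: Callable[[str], str]

    def __call__(self, entity: Entity) -> str:
        return self.func(self.source(entity))


@dataclass
class Comparison:
    """Similarity in [0, 1] between one input from each entity.

    difflib's ratio stands in for a real string similarity metric.
    """
    left: Callable[[Entity], str]
    right: Callable[[Entity], str]

    def __call__(self, a: Entity, b: Entity) -> float:
        return SequenceMatcher(None, self.left(a), self.right(b)).ratio()


@dataclass
class Aggregation:
    """Combines several similarity scores, e.g. with min, max or an average."""
    children: List[Callable[[Entity, Entity], float]]
    combine: Callable[[List[float]], float]

    def __call__(self, a: Entity, b: Entity) -> float:
        return self.combine([child(a, b) for child in self.children])


def generate_links(rule, sources: List[Entity], targets: List[Entity],
                   threshold: float = 0.9) -> Set[Link]:
    """Naive all-pairs comparison; only workable for toy-sized inputs."""
    return {(s["uri"], t["uri"])
            for s in sources for t in targets
            if rule(s, t) >= threshold}


def precision_recall(links: Set[Link], reference: Set[Link]) -> Tuple[float, float]:
    """Precision and recall of generated links against reference links."""
    correct = len(links & reference)
    precision = correct / len(links) if links else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Made-up movie records standing in for two data sources.
source_a = [
    {"uri": "http://example.org/a/alien", "label": "Alien", "year": "1979"},
    {"uri": "http://example.org/a/heat", "label": "HEAT", "year": "1995"},
]
source_b = [
    {"uri": "http://example.org/b/film1", "title": "alien", "year": "1979"},
    {"uri": "http://example.org/b/film2", "title": "Heat", "year": "1995"},
    {"uri": "http://example.org/b/film3", "title": "Aliens", "year": "1986"},
]

# Rule tree: min(similarity(lower(label), lower(title)), similarity(year, year))
rule = Aggregation(
    children=[
        Comparison(Transform(Path("label"), str.lower),
                   Transform(Path("title"), str.lower)),
        Comparison(Path("year"), Path("year")),
    ],
    combine=min,
)

links = generate_links(rule, source_a, source_b)
reference = {
    ("http://example.org/a/alien", "http://example.org/b/film1"),
    ("http://example.org/a/heat", "http://example.org/b/film2"),
}
print(sorted(links))
print(precision_recall(links, reference))
```

Using min as the aggregation means a pair is linked only if both the title and the year are similar enough, which keeps "Alien" (1979) apart from "Aliens" (1986). The all-pairs loop is exactly what slide 9 calls infeasible at LOD scale, so a real tool needs a way to avoid comparing every pair.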