Finding Data Sets



Anja Jentzsch, Freie Universität Berlin

17 April 2012
Tutorial: Practical Cross-Dataset Queries on the Web of Data
WWW2012, Lyon, France



Different motivations
•   Finding data sets
    •   Look for resources to link a data set to
    •   Find a data set with relevant data to consume / integrate


•   Finding vocabularies
    •   Find vocabularies to use to model data sets
    •   Find vocabularies to map your existing schema to




Different tool types
•   Search engines
    •   Find data sets based on keywords


•   Data catalogs / directories
    •   Explore data sets by browsing and faceted search


•   Data marketplaces
    •   Explore and consume data sets



Linked Data Search Engines
•   Resource descriptions are published as RDF documents
•   RDF search engines index these RDF documents
•   Process similar to that of search engines for HTML documents
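
The indexing step above can be sketched as a toy keyword index over RDF documents. This is a minimal illustration only, not how Sindice, Swoogle, or any other engine named in this deck is implemented. The document URLs, the triples, and the helper names `tokens`, `build_index`, and `search` are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical RDF documents: document URL -> list of (subject, predicate, object).
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DOCUMENTS = {
    "http://example.org/doc/berlin": [
        ("http://example.org/Berlin", RDFS_LABEL, "Berlin"),
        ("http://example.org/Berlin", "http://example.org/country", "http://example.org/Germany"),
    ],
    "http://example.org/doc/lyon": [
        ("http://example.org/Lyon", RDFS_LABEL, "Lyon"),
        ("http://example.org/Lyon", "http://example.org/country", "http://example.org/France"),
    ],
}


def tokens(term):
    """Lower-cased keywords of a term; for a URI only its local name is used."""
    if term.startswith(("http://", "https://")):
        term = re.split(r"[/#]", term)[-1]
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", term)]


def build_index(documents):
    """Inverted index: keyword -> set of document URLs mentioning it."""
    index = defaultdict(set)
    for url, triples in documents.items():
        for triple in triples:
            for term in triple:
                for tok in tokens(term):
                    index[tok].add(url)
    return index


def search(index, query):
    """Return the documents that contain every keyword of the query, sorted."""
    words = tokens(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)
```

As with an HTML search engine, the query is matched against an index built ahead of time rather than against the documents themselves. Here `search(build_index(DOCUMENTS), "lyon france")` would return only the Lyon document.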




http://sindice.com
http://sig.ma
http://swoogle.umbc.edu
http://kmi-web05.open.ac.uk/WatsonWUI/
http://factforge.net
Suitability
•   Look for resources to link a data set to
    •   Good


•   Find a data set with relevant data to consume
    •   Maybe good: depends on how the query is expressed


•   Find vocabularies to use to model data sets
    •   Not good: everything is indexed, too much noise



Data catalogs
•   Several governments and institutions are opening their catalogs
•   http://datacatalogs.org provides a manually curated index of 226 data catalogs




http://datacatalogs.org
The Data Hub
•   Manually curated list of more than 3,500 data sets, at least 326 of them Linked Data sets
•   Various metadata for each data set


•   Other views over (part of) its content
    •   Semantic CKAN (http://semantic.ckan.net)
    •   LATC Data Source Inventory
    •   LOD Cloud
    •   State of the LOD Cloud
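
A keyword search against a CKAN-based catalog such as the Data Hub might look like the sketch below. It assumes the catalog runs a CKAN version that exposes the Action API (`/api/3/action/package_search` with the `q` and `rows` parameters); whether a given catalog still offers this endpoint is an assumption. The sample response is invented for illustration, and no network call is made.

```python
import json
from urllib.parse import urlencode


def package_search_url(base_url, query, rows=5):
    """Build a CKAN Action API search URL (assumes the API is enabled)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract (name, title) pairs from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        return []
    return [(d["name"], d.get("title", "")) for d in payload["result"]["results"]]


# Invented response in the shape CKAN returns; not real catalog data.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [
            {"name": "example-linked-dataset", "title": "Example Linked Dataset"},
        ],
    },
})
```

In practice the URL would be fetched with an HTTP client and the response body passed to `dataset_titles`. Keeping URL construction and parsing separate from the network call makes both easy to check offline.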



http://thedatahub.org
http://dsi.lod-cloud.net
http://lod-cloud.net
http://lod-cloud.net/state/
Data Marketplaces
•   “Services that make it easy to find data from a range of secondary data sources,
    then consume or acquire the data in a usable and unified format. Several of these
    services are trying to create marketplaces for data, envisioning that data providers
    can offer their data sets for sale to data seekers.” (http://datamarket.com)




Kasabi
•   Data domain
    •   All purpose, incl. DBpedia, GeoNames, BBC Linked Data, …
•   Data population
    •   Public datasets
    •   User submitted datasets
•   Data size
    •   186 data sets
•   Data model
    •   RDF


http://kasabi.com
Freebase
•   Developed by Metaweb (USA), now part of Google
•   Free for up to 100K read API calls per day (10K write); paid for higher volumes
•   Data access
    •   REST API
    •   Linked Data endpoint (http://rdf.freebase.com)
    •   Triple uploader / RDF dumps
•   Data tools
    •   Web based – schema editor, review queue, viewers, …
    •   GridWorks (Google Refine)
        •   Exploring, data cleaning, transformation of tabular data
        •   Map data to Freebase schema & RDF export (3rd party extension)
http://www.freebase.com
Linked Open Vocabularies (LOV)
•   Initiative similar to the LOD Cloud but focused on vocabularies
•   250+ vocabularies




http://labs.mondeca.com/dataset/lov/
