An Introduction to
Linked Open Data for Museums
Presented by David Henry and Jarred Moore
MW2014
Limitations of Keyword Searching
Polysemy: One word with multiple meanings. E.g.
man
crane
bank
Synonymy: Multiple words with the same meaning.
buy OR purchase
create OR make
eliminate OR remove OR abolish
Signal-to-noise ratio: e.g. try searching for the term “Mississippi”
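The synonymy and polysemy problems can be seen in a few lines of code. This is only a toy sketch: the three catalogue records and the `keyword_search` helper are made up for illustration and are not taken from any real collection system.

```python
# Toy catalogue records, invented for this illustration.
RECORDS = [
    "Museum purchase of a construction crane model",
    "Donor agreed to buy the painting of a whooping crane",
    "Bank of the Mississippi River at St. Louis",
]


def keyword_search(term, records):
    """Return the records that contain the term as a whole word, ignoring case."""
    term = term.lower()
    return [record for record in records if term in record.lower().split()]


# Synonymy: searching "purchase" does not find the record that says "buy".
purchases = keyword_search("purchase", RECORDS)

# Polysemy: searching "crane" returns both the machine and the bird.
cranes = keyword_search("crane", RECORDS)
```

A plain string match has no way to know that "buy" and "purchase" mean the same thing, or that the two cranes are different things.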
What is Linked Open Data?
On the web, open license
Machine-readable data
Non-proprietary format
RDF Format
Linked RDF
Copyright and Licensing
If your content files are still under copyright and your institution is the
copyright owner, encourage your institution to license the content as
openly as possible.
CC0
CC-BY
CC-BY-SA
CC-BY-NC
What is RDF?
• “Resource Description Framework (RDF) is a standard model for data
interchange on the Web. RDF has features that facilitate data merging
even if the underlying schemas differ, and it specifically supports the
evolution of schemas over time without requiring all the data
consumers to be changed.” (from W3C)
• “…making statements about resources (in particular web resources)
in the form of subject-predicate-object expressions.” (Wikipedia)
What are Triples?
• Triples are statements of fact (or assertions) composed of a
subject, predicate, and object. For example:
Subject: “David Henry”
Predicate: “Lives in”
Object: “St. Louis”
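As a minimal sketch, the same statement can be held in code as a three-part record. The `Triple` type and the `describe` helper are names chosen for this illustration; they are not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject, a predicate, and an object."""
    subject: str
    predicate: str
    obj: str


def describe(triple: Triple) -> str:
    """Render a triple as a small arrow diagram."""
    return f"{triple.subject} --[{triple.predicate}]--> {triple.obj}"


statement = Triple("David Henry", "Lives in", "St. Louis")
print(describe(statement))
```

At this stage all three parts are plain strings; later slides replace the subject and predicate with URIs.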
What Questions Can RDF Answer?
Fact-Based
Interpretive
Theoretical
Subjective
Analytical
Fact-Based Questions
Fact-based questions ask Who, What, When, Where (not so much Why)
Who directed “Citizen Kane”?
What’s a daguerreotype?
Where did Van Gogh paint “Starry Night”?
Fact-Based Question:
Are there any daguerreotypes of the Mississippian mounds in St. Louis, Missouri?
Title: Group of people standing on a partially destroyed Big Mound.
Description: Group of people standing on a partially destroyed Big Mound.
Place: St. Louis, Missouri
Dates: 1869
Type(s): photo, Daguerreotype
Maker/Creator: Thomas M. Easterly
Subjects: Mississippian Culture, mounds
Identifier: PHO:17665
Permalink: http://collections.mohistory.org/resource/9952
Triples to Complex Graphs
[Figure: graph built from the record above. Node labels: Thomas M. Easterly, 1869, Subject, “Mississippian Culture”, Daguerreotype. Edge labels: hasSubject, hasLabel, hasType.]
“Thomas M. Easterly”
Name: Thomas M. Easterly
Birth Date: October 3, 1809
Death Date: March 12, 1882
Places of Residence:
Guilford, Vermont
Liberty, Missouri
St. Louis, Missouri
Bio: Thomas M. Easterly was one of the leading
American daguerreotypists ….
During the 1860s, improvements in photographic development
caused daguerreotypes to fall out of fashion. Easterly refused
to acknowledge these changes, believing that the highly detailed
daguerreotypes were far superior in beauty and permanence,
and urged the public to "save your old daguerreotypes for
you will never see their like again".
Exercise 1.
Time: 10-15 minutes
Activity:
• Break into groups of 2-3.
• Write out one or more research questions.
• For each question, draw an entity-relationship graph
that could provide an answer to the question.
What’s Wrong with the Good Ole Web?
What is a Uniform Resource Identifier?
Uniform Resource Locator (URL)
Purpose: To locate a web resource (document)
Uniform Resource Name (URN)
Purpose: To identify any resource
URI: In Linked Open Data, URIs act as both URLs and URNs
Principles of Linked Data
• Use URIs to denote things.
• Use HTTP URIs so that these things can be referred to and
looked up ("dereferenced") by people and user agents.
• Provide useful information about the thing when its URI is
dereferenced, leveraging standards such as RDF, SPARQL.
• Include links to other related things (using their URIs) when
publishing data on the Web.
To make this happen, subjects and predicates MUST be defined
by URIs. Objects may be URIs or literals.
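The rule about which parts of a triple must be URIs can be sketched as a simple check. The function names `is_http_uri`, `object_kind`, and `follows_rule` are assumptions made for this example, and the test for "is an HTTP URI" is deliberately crude.

```python
from urllib.parse import urlparse


def is_http_uri(value: str) -> bool:
    """Rough test: the value has an http(s) scheme and a host name."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def object_kind(value: str) -> str:
    """Objects may be either URIs or literals."""
    return "uri" if is_http_uri(value) else "literal"


def follows_rule(subject: str, predicate: str) -> bool:
    """Subjects and predicates must both be HTTP URIs."""
    return is_http_uri(subject) and is_http_uri(predicate)


# URIs taken from the deck's own example record and vocabulary.
ITEM = "http://collections.mohistory.org/resource/9952"
HAS_LABEL = "http://collections.mohistory.org/vocab/relators/hasLabel"
```

With this check, `follows_rule(ITEM, HAS_LABEL)` holds, while the earlier plain-text statement with “David Henry” as its subject does not.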
Triples to Complex Graphs
[Figure, shown in two steps: the same graph with identifiers in place of names. Node labels: Resource:9952 (then http://collections.mohistory.org/resource/9952), Resource:92142 (then ns1:Person_5678), Thomas M. Easterly, 1869, ns1:Subject_91011, “Mississippian Culture”, ns1:type_80345, “Daguerreotype”. Edge labels: nso:hasSubject, nso:hasLabel, nso:hasType.]
What two words are most commonly
found in a browser window?
Web links have a half-life of about ten years.
In other words, 50% of links that are 10 years
old are broken.
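To get a feel for the half-life figure, here is a small sketch that assumes link survival follows simple exponential decay. That decay model is an assumption of this example; the slide itself only gives the ten-year figure.

```python
def surviving_fraction(age_years: float, half_life_years: float = 10.0) -> float:
    """Fraction of links still working after age_years, assuming exponential decay."""
    return 0.5 ** (age_years / half_life_years)


# Under this assumption, a quarter of links survive two half-lives.
after_twenty_years = surviving_fraction(20)
```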
[Figure: five documents linking to one another.]
Link rot is a serious problem on the document-based web.
[Figure: entities linked by relationships. Node labels: Person, Person, Object, Place, Object. Edge labels: createdBy, createdBy, knows, livesAt.]
Link rot is even more serious on the web of data.
Rules for persistent URIs
Cool URIs
• No date context
• No ownership context
• No technology context
• Re-use existing identifiers
• Link multiple representations
• Implement 303 redirects for
real world objects
Not Cool URIs
• Avoid stating ownership
• Avoid version numbers
• Avoid query strings
• Avoid file extensions
Example URIs:
http://education.data.gov.uk/ministryofeducation/id/school/123456 (states ownership)
http://education.data.gov.uk/doc/school/v01/123456 (version number)
http://www.example.com/id/alice_brown (good)
http://data.nytimes.com/88843902954064461461 (mostly good)
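Some of the rules above can be checked mechanically. The sketch below flags query strings, file extensions, and version-like path segments. The function name `uri_problems` and the patterns it uses are assumptions of this example. Ownership and date context are not checked, because spotting them needs human judgment.

```python
import re
from urllib.parse import urlparse


def uri_problems(uri: str) -> list:
    """List the mechanically detectable 'not cool' features of a URI."""
    parts = urlparse(uri)
    segments = [segment for segment in parts.path.split("/") if segment]
    problems = []
    if parts.query:
        problems.append("query string")
    # A short suffix after a dot in the last segment looks like a file extension.
    if segments and re.search(r"\.[A-Za-z0-9]{1,5}$", segments[-1]):
        problems.append("file extension")
    # Segments such as "v01" or "v1.2" look like version numbers.
    if any(re.fullmatch(r"v\d+(\.\d+)*", segment) for segment in segments):
        problems.append("version number")
    return problems


for example in (
    "http://education.data.gov.uk/doc/school/v01/123456",
    "http://www.example.com/id/alice_brown",
):
    print(example, uri_problems(example))
```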
Writing RDF
RDF/XML:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ns1="http://otherdomain.org/">
  <rdf:Description rdf:about="http://mydomain.org/people/David_Henry">
    <ns1:livesIn>St. Louis, MO</ns1:livesIn>
  </rdf:Description>
</rdf:RDF>
Turtle:
@prefix ns0: <http://mydomain.org/people/> .
@prefix ns1: <http://otherdomain.org/> .
ns0:David_Henry ns1:livesIn "St. Louis, MO" .
N-Triples:
<http://mydomain.org/people/David_Henry> <http://otherdomain.org/livesIn> "St. Louis, MO" .
Plain statement:
“David Henry” “Lives In” “St. Louis”
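N-Triples is the simplest of these formats to produce by hand, since each line is one complete statement. The sketch below writes a single line. It assumes that anything starting with http:// or https:// is a URI and everything else is a plain literal, and it handles only the most common escapes. `nt_term` and `to_ntriples` are names made up for this example.

```python
def nt_term(value: str) -> str:
    """Format one term: URIs in angle brackets, anything else as a quoted literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def to_ntriples(subject: str, predicate: str, obj: str) -> str:
    """Serialize one triple as a single N-Triples line."""
    return f"{nt_term(subject)} {nt_term(predicate)} {nt_term(obj)} ."


line = to_ntriples(
    "http://mydomain.org/people/David_Henry",
    "http://otherdomain.org/livesIn",
    "St. Louis, MO",
)
print(line)
```

Note that the quotation marks in real RDF files must be straight quotes; curly quotes pasted from slides or word processors will not parse.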
Triples to Complex Graphs
[Figure: the graph for http://collections.mohistory.org/resource/9952. Node labels: Resource:92142, Thomas M. Easterly, 1869, ns1:Subject_91011, “Mississippian Culture”, ns1:type_80345, “Daguerreotype”. Edge labels: nso:hasSubject, nso:hasLabel, nso:hasType.]
Graph to RDF as Turtle
@prefix resource: <http://collections.mohistory.org/resource/> .
@prefix ns0: <http://collections.mohistory.org/vocab/relators/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
resource:9952 ns0:dateCreated "1869-01-01"^^xsd:date .
resource:9952 ns0:hasType <http://collections.mohistory.org/vocab/daguerreotype> .
resource:9952 ns0:createdBy resource:92142 .
resource:92142 ns0:hasLabel "Thomas M. Easterly" .
resource:9952 ns0:hasSubject resource:5215 .
resource:5215 ns0:hasLabel "Mississippian Culture" .
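Once the record is expressed as triples, the earlier fact-based question ("Are there any daguerreotypes of the Mississippian mounds?") becomes a matter of pattern matching. The sketch below copies the URI-valued triples from the Turtle above into Python tuples. The helpers `match` and `daguerreotypes_about` are assumptions made for this illustration; a real system would normally use a triple store and SPARQL instead.

```python
RES = "http://collections.mohistory.org/resource/"
NS0 = "http://collections.mohistory.org/vocab/relators/"
DAGUERREOTYPE = "http://collections.mohistory.org/vocab/daguerreotype"

TRIPLES = [
    (RES + "9952", NS0 + "hasType", DAGUERREOTYPE),
    (RES + "9952", NS0 + "createdBy", RES + "92142"),
    (RES + "92142", NS0 + "hasLabel", "Thomas M. Easterly"),
    (RES + "9952", NS0 + "hasSubject", RES + "5215"),
    (RES + "5215", NS0 + "hasLabel", "Mississippian Culture"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def daguerreotypes_about(triples, subject_label):
    """Find resources typed as daguerreotypes whose subject carries the given label."""
    subjects = {s for s, _, _ in match(triples, p=NS0 + "hasLabel", o=subject_label)}
    daguerreotypes = {s for s, _, _ in match(triples, p=NS0 + "hasType", o=DAGUERREOTYPE)}
    return sorted(
        s for s, _, o in match(triples, p=NS0 + "hasSubject")
        if s in daguerreotypes and o in subjects
    )


found = daguerreotypes_about(TRIPLES, "Mississippian Culture")
```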
Exercise 2.
Time: 15 minutes
Activity:
• Break into groups of 2-3.
• Using the graph defined in Exercise 1, define a set of
triples from the graph (Use your own URIs)
• Use the RDF validator at
http://www.rdfabout.com/demo/validator/
Core Vocabularies
• RDF & RDFS
  • Useful terms: rdf:type, rdfs:label
• SKOS (Simple Knowledge Organization System)
  • Useful terms: skos:broader, skos:narrower
• OWL (Web Ontology Language)
  • Useful terms: owl:sameAs, owl:differentFrom
• Dublin Core
  • Useful terms: dc:creator, dc:date, dc:subject
• FOAF (Friend of a Friend)
  • Useful terms: foaf:name, foaf:knows, foaf:img
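Each prefixed term in the list above is shorthand for a full URI. The sketch below expands a prefixed name using the standard namespace URIs for these vocabularies. The `dc` prefix is bound to the Dublin Core terms namespace, as it is elsewhere in this deck. The `expand` function is a name chosen for this example.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(prefixed_name: str) -> str:
    """Turn a prefixed name such as 'rdfs:label' into its full URI."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    return PREFIXES[prefix] + local_name


for term in ("rdf:type", "skos:broader", "dc:creator", "foaf:name"):
    print(term, "=>", expand(term))
```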
Vocabulary Types
Controlled Vocabulary: Simple list of terms,
e.g. DCMI Types list
Thesaurus: Hierarchical list of terms,
e.g. Library of Congress Subjects
Ontology: Hierarchical list of terms with relationship constraints,
e.g. CIDOC CRM
Example using CRM Core
[Figure: CIDOC CRM graph describing Rodin’s “Monument to Balzac”.
Entities: E21 Person (Rodin Auguste); E21 Person (Honoré de Balzac); E67 Birth (Rodin’s birth); E69 Death (Rodin’s death); E12 Production (Rodin making “Monument to Balzac” in 1898); E12 Production (bronze casting of “Monument to Balzac” in 1925); E84 Information Carrier (the “Monument to Balzac”, plaster); E84 Information Carrier (the “Monument to Balzac”, S1296); E40 Legal Body (Rudier (Vve Alexis) et Fils); E53 Place (France, nation); E52 Time-Span (1840, 1898, 1917, 1925); E55 Type (sculptors, plaster, bronze, companies).
Properties: P2 has type; P4 has time-span; P7 took place at; P14 carried out by; P16B was used for; P62 depicts; P98B was born; P100B died in; P108B was produced by; P120B occurs after; P134 continued.]
Implementing Linked Open
Data
Link existing data
• Low barrier to entry
• Controlled lists and
thesauri
• Not very descriptive
Manage data to fit an ontology
• High barrier to entry
• Ontologies
• Very descriptive
RDF facilitates the “evolution of schemas over time”
Finding Links
• Linked Open Vocabularies is a good starting point
• Other well-used sources include:
  • DBpedia: for a wide range of types (people, places, subjects,
    concepts)
  • id.loc.gov: for name authorities and subjects
  • viaf.org: for name authorities
  • geonames.org: for geographic locations
Problem: There are no universal vocabularies
A Note of Caution
When re-using existing URIs, be sure to use the URI that represents
the entity (thing/concept/person) and not the web resource.
For example:
http://id.loc.gov/authorities/subjects/sh85126887.html
Is NOT the same as:
http://id.loc.gov/authorities/subjects/sh85126887
Finding Links
• Matching predicates.
  • hasType => rdf:type, dcterms:type, crm:P2_has_type
  • createdBy => dc:creator, crm:P94i_was_created_by
  • dateCreated => dc:created, ?
• Matching value vocabularies.
  • “Daguerreotype” => http://dbpedia.org/resource/Daguerreotype
  • “Mississippian Culture” => http://id.loc.gov/authorities/subjects/sh85086218
  • “Thomas Easterly” => http://viaf.org/viaf/13114715/
Problem: There are no universal vocabularies
Triples to Complex Graphs
[Figure: the graph for http://collections.mohistory.org/resource/9952 with local predicates replaced by shared ones. Node labels: Resource:92142, Thomas M. Easterly, 1869, ns1:Subject_91011, “Mississippian Culture”, ns1:type_80345, “Daguerreotype”. Edge labels: dc:subject, rdfs:label, rdf:type.]
Graph to RDF as Turtle
@prefix resource: <http://collections.mohistory.org/resource/> .
@prefix ns0: <http://collections.mohistory.org/vocab/relators/> .
@prefix dc: <http://purl.org/dc/terms/> . # dc:creator; dc:created; dc:subject
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . # rdf:type
@prefix owl: <http://www.w3.org/2002/07/owl#> . # sameAs; differentFrom
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> . # date; integer
resource:9952 ns0:dateCreated "1869-01-01"^^xsd:date .
resource:9952 dc:date "1869-01-01"^^xsd:date .
resource:9952 ns0:hasType <http://collections.mohistory.org/vocab/daguerreotype> .
resource:9952 rdf:type <http://dbpedia.org/resource/Daguerreotype> .
resource:9952 ns0:createdBy resource:92142 .
resource:9952 dc:creator <http://viaf.org/viaf/13114715/> .
resource:92142 ns0:hasLabel "Thomas M. Easterly" .
resource:9952 ns0:hasSubject resource:5215 .
resource:9952 dc:subject <http://id.loc.gov/authorities/subjects/sh85086218> .
#resource:5215 ns0:hasLabel "Mississippian Culture" .
resource:92142 owl:sameAs <http://viaf.org/viaf/13114715/> .
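The owl:sameAs statement is what lets data from two sources be merged: anything said about the VIAF URI can be treated as said about the local URI. The sketch below shows one simple way to do that, by rewriting every URI to a single representative per sameAs group. The external foaf:name statement is invented for the example, and `canonical_map` and `merge` are hypothetical helper names. Real reasoners handle owl:sameAs far more thoroughly.

```python
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
LOCAL = "http://collections.mohistory.org/resource/92142"
VIAF = "http://viaf.org/viaf/13114715/"

LOCAL_DATA = [
    (LOCAL, "http://collections.mohistory.org/vocab/relators/hasLabel", "Thomas M. Easterly"),
    (LOCAL, SAME_AS, VIAF),
]
# Made-up statement standing in for data fetched from an external source.
EXTERNAL_DATA = [
    (VIAF, "http://xmlns.com/foaf/0.1/name", "Easterly, Thomas M."),
]


def canonical_map(triples):
    """Map every URI in a sameAs group to one representative (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            a, b = find(s), find(o)
            if a != b:
                # Keep the alphabetically smaller URI as the representative.
                parent[max(a, b)] = min(a, b)
    return {x: find(x) for x in list(parent)}


def merge(*datasets):
    """Combine datasets, rewriting equivalent URIs to their representative."""
    triples = [t for dataset in datasets for t in dataset]
    mapping = canonical_map(triples)
    return {
        (mapping.get(s, s), p, mapping.get(o, o))
        for s, p, o in triples
        if p != SAME_AS
    }


merged = merge(LOCAL_DATA, EXTERNAL_DATA)
```

After merging, both the local label and the external name hang off the same node.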
Exercise 3.
Time: 15-20 minutes
Activity:
• Break into groups of 2-3.
• Using the triples you defined in Exercise 2, find existing URIs
to link with your local URIs.
• Be prepared to explain why you chose the URIs you chose.
How Tos
• Embed schema.org data in a web page
• Publish static RDF files
• Manage local vocabularies and align them with existing vocabularies
• Contributing to a collection aggregator – e.g. Europeana or DPLA
• Publish existing database records as RDF
• Managing RDF data in a triple (or quad) store
Embedding schema.org
<div itemscope itemtype="http://schema.org/CreativeWork">
  <img src="http://collections.mohistory.org/resource/16679.jpg" class="item_image"
       width="300" itemprop="image" />
  <div id="record_detail">
    <p><b>Title:</b> <span itemprop="name">Lord Fitzwilliam and manservant, hunting
    on the Hunt Farm on Gravois Road.</span></p>
    <p><b>Description:</b> <span itemprop="description"></span></p>
    <p><b>Item:</b> <span itemprop="additionalType">Daguerreotype</span></p>
    <p><b>Dates:</b> <span itemprop="dateCreated">1855 to 1865</span></p>
  </div>
</div>
Copy and paste entire text
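To see what a machine can read out of markup like this, here is a rough sketch that pulls itemprop values from a shortened version of the record using only Python's standard html.parser module. It assumes flat markup where each property sits in its own element; nested items and the full microdata rules are not handled. `ItempropExtractor` is a name chosen for this example.

```python
from html.parser import HTMLParser


class ItempropExtractor(HTMLParser):
    """Collect itemprop values from simple, flat microdata markup."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" not in attrs:
            return
        if tag == "img":
            # For images, the property value is the src attribute.
            self.properties[attrs["itemprop"]] = attrs.get("src", "")
        else:
            self._current = attrs["itemprop"]
            self.properties[self._current] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Any closing tag ends the current property in this simplified sketch.
        self._current = None


SNIPPET = """
<div itemscope itemtype="http://schema.org/CreativeWork">
  <img src="http://collections.mohistory.org/resource/16679.jpg" itemprop="image" />
  <span itemprop="additionalType">Daguerreotype</span>
  <span itemprop="dateCreated">1855 to 1865</span>
</div>
"""

extractor = ItempropExtractor()
extractor.feed(SNIPPET)
properties = extractor.properties
```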
Publish static RDF files
• RDF files can be hand-written (what fun!) or rendered using templates
• Paths to RDF files can be submitted to RDF search engines such as
Sindice (http://sindice.com)
• Caution: Some content negotiation would be required.
• Remember: http://mydomain.org/resource/1234.rdf is NOT the same as
http://mydomain.org/resource/1234
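The content negotiation mentioned above can be sketched as a function that picks which document to send a client to (for example as the target of a 303 redirect) when it asks for the entity URI. The media-type-to-extension table and the .html and .ttl document URLs are assumptions of this example. The sketch also ignores q-values in the Accept header, which a real server should respect.

```python
# Assumed mapping from requested media type to document suffix.
EXTENSIONS = {
    "application/rdf+xml": ".rdf",
    "text/turtle": ".ttl",
    "text/html": ".html",
}


def representation_for(entity_uri: str, accept_header: str) -> str:
    """Choose the document URL that best fits the Accept header."""
    requested = [part.split(";")[0].strip() for part in accept_header.split(",")]
    for media_type in requested:
        if media_type in EXTENSIONS:
            return entity_uri + EXTENSIONS[media_type]
    # Fall back to the human-readable page.
    return entity_uri + ".html"


entity = "http://mydomain.org/resource/1234"
rdf_document = representation_for(entity, "application/rdf+xml")
html_document = representation_for(entity, "text/html,application/xhtml+xml;q=0.9")
```

The entity URI itself never changes; only the document that describes it differs per request.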
Manage local vocabularies and align
them with existing vocabularies
Tools include:
PoolParty
Tematres
Karma
Contributing to a collection aggregator –
e.g. Europeana or DPLA
Service
Hub
• Dataset A
• Dataset B
• Dataset C
Service
Hub
• Dataset 1
• Dataset 2
• Dataset 3
Content
Hub
• Dataset X
• Dataset Y
• Dataset Z
Publish existing database records as
RDF
Managing RDF data in a triple (or quad)
store
• Quad = triple + context
• Most stores feature a SPARQL interface to query across
all triples (quads) in a repository
• Tools:
• Sesame – from OpenRDF
• Virtuoso
• Mulgara
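The "quad = triple + context" idea can be sketched with a fourth field that records which graph a statement came from. The two graph URIs below are hypothetical, and the `find` helper is only a stand-in for the kind of query a store's SPARQL interface would answer.

```python
from collections import namedtuple

Quad = namedtuple("Quad", "subject predicate obj graph")

RES = "http://collections.mohistory.org/resource/"
NS0 = "http://collections.mohistory.org/vocab/relators/"
# Hypothetical graph (context) names used only in this example.
CATALOG = "http://example.org/graph/catalog"
LINKS = "http://example.org/graph/external-links"

QUADS = [
    Quad(RES + "9952", NS0 + "createdBy", RES + "92142", CATALOG),
    Quad(RES + "92142", NS0 + "hasLabel", "Thomas M. Easterly", CATALOG),
    Quad(RES + "92142", "http://www.w3.org/2002/07/owl#sameAs",
         "http://viaf.org/viaf/13114715/", LINKS),
]


def find(quads, subject=None, predicate=None, obj=None, graph=None):
    """Return quads matching the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj, graph)
    return [
        quad for quad in quads
        if all(want is None or want == have for want, have in zip(pattern, quad))
    ]


everything_about_easterly = find(QUADS, subject=RES + "92142")
catalog_only = find(QUADS, subject=RES + "92142", graph=CATALOG)
```

Keeping the context makes it possible to query across all statements or to limit a query to one source, for example to separate locally curated data from imported links.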
Questions?
dhenry@mohistory.org
jmoore@mohistory.org

Linked Open Data for Museums Introduction

  • 1. An Introduction to Linked Open Data for Museums David Henry Jarred Moore MW2014 Presented by
  • 2. An Introduction to Linked Open Data for Museums
  • 3. Limitations of Keyword Searching Polysemy: One word with multiple meanings. E.g. man crane bank Synonymy: Multiple words with the same meaning. buy OR purchase create OR make eliminate OR remove OR abolish Signal to noise ratio e.g. Try searching for the term “Mississippi”
  • 4. What is Linked Open Data? On the web, open license Machine-readable data Non-proprietary format RDF Format Linked RDF
  • 5. Copyright and Licensing If Your content files are still under copyright and your institution is the copyright owner, encourage your institution to license the content as openly as possible CCO CC-BY CC-BY-SA CC-BY-NC
  • 6. What is RDF? • “Resource Description Framework (RDF) is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed. “ (from W3C) • “…making Statements about resources (in particular web resources) in the form of subject-predicate-object expressions.” (Wikipedia)
  • 7. What are Triples? • Triples are statements of fact (or assertions) composed of a subject, predicate, and object. For example: “David Henry” Subject “Lives in” Predicate “St. Louis” Object
  • 8. What are Questions Answered by RDF? Fact-Based Interpretive Theoretical Subjective Analytical
  • 9. Fact Based Questions ask Who, What, When Where (Not so much Why) Fact-Based Questions Who directed “Citizen Kane’? What’s a daguerreotype? Where did Van Gogh paint ‘Starry Night’?
  • 10. Fact Based Question: Are there any daguerreotypes of the Mississippian mounds in St. Louis, Missouri? Title: Group of people standing on a partially destroyed Big Mound. Description: Group of people standing on a partially destroyed Big Mound. Place: St. Louis, Missouri Dates: 1869 Type(s): photo, Daguerreotype Maker/Creator: Thomas M. Easterly Subjects: Mississippian Culture, mounds Identifier: PHO:17665 Permalink: http://collections.mohistory.org/resource/9952
  • 11. Triples to Complex Graphs Thomas M. Easterly 1869Subject “Mississippian Culture” hasSubject hasLabel hasType Daguerreotype
  • 12. “Thomas M. Easterly” Name: Thomas M. Easterly Birth Date: October 3, 1809 Death Date: March 12, 1882 Places of Residence: Guilford, Vermont Liberty, Missouri St. Louis, Missouri Bio: Thomas M. Easterly was one of the leading American Daguerreotypists …. During the 1860s, improvements in photographic development caused daguerreotypes to become out of fashion. Easterly refused to acknowledge these changes believing the highly detailed daguerreotypes were far superior in terms of beauty or permanence urging the public to "save your old daguerreotypes for you will never see their like again".
  • 13. Exercise 1. Time: 10-15 minutes Activity: • Break into groups of 2-3. • Write out one or more research questions. • For each question, draw a entity-relationship graph that could provide an answer to the question
  • 14. What’s Wrong with the Good Ole Web?
  • 15. What is a Uniform Resource Identifier? Uniform Resource Locator ----- Purpose: To locate a web resource (document) Uniform Resource Name ----- Purpose: To identify any resourceIn Linked Open Data, URIs act as both URLs and URNs UR I
  • 16. Principles of Linked Data • Use URIs to denote things. • Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents. • Provide useful information about the thing when its URI is dereferenced, leveraging standards such as RDF, SPARQL. • Include links to other related things (using their URIs) when publishing data on the Web. To make this happen subjects and predicates MUST be defined by URIs. Objects may be URIs or literals.
  • 17. Triples to Complex Graphs Resource:9952 Thomas M. Easterly 1839 ns1:Subject_91011 “Mississippian Culture” nso:hasSubject nso:hasLabel nso:hasType “Daguerreotype” ns1:type_80345 Resource:92142
  • 18. Triples to Complex Graphs http://collections.mohistory.org/resource/9952 ns1:Person_5678 Thomas M. Easterly 1839 ns1:Subject_91011 “Mississippian Culture” nso:hasSubject nso:hasLabel nso:hasType “Daguerreotype” ns1:type_80345
  • 19. What two words are most commonly found in a browser window? Web links have a half life of about ten years. In other words, 50% of links that are 10 years old are broken.
  • 20. Document DocumentDocument DocumentDocument Link rot is a serious problem on the document-based web.
  • 22. Rules for persistent URI’sCoolURI’s • No date Context • No ownership context • No technology context • Re-use existing identifiers • Link multiple representations • Implement 303 redirects for real world objects NotCoolURI’s • Avoid stating ownership • Avoid version numbers • Avoid query strings • Avoid file extensions
  • 23. Example URI: http://education.data.gov.uk/ministryofeducation/id/school/123456 http://education.data.gov.uk/doc/school/v01/123456 states ownership version number good Mostly good http://www.example.com/id/alice_brown http://data.nytimes.com/88843902954064461461
  • 24. Writing RDF RDFXML Turtle NTriples <rdf:RDF xmlns:ns0=“http://mydomain.org/people/” xmlns:n1=http://otherdonain.org/> <description about=“ns0:David_Henry”> <ns1:livesIn>St. Louis, MO</ns1:livesIn> </description> @prefix ns0: <http://mydomain.org/people/> . @prefix ns1: <http://otherdomain.org/> . ns0:David_Henry ns1:livesIn “St. Louis, MO” . <http://mydomain.org/people/David_Henry> <http://otherdomain.org/livesIn> “St. Louis, MO” . “David Henry” “Lives In” “St. Louis”
  • 25. Triples to Complex Graphs http://collections.mohistory.org/resource/9952 Resource:92142 Thomas M. Easterly 1839 ns1:Subject_91011 “Mississippian Culture” nso:hasSubject nso:hasLabel nso:hasType “Daguerreotype” ns1:type_80345
  • 26. Graph to RDF as Turtle @prefix resource: <http://collections.mohistory.org/resource/> . @prefix ns0: <http://collections.mohistory.org/vocab/relators/> . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . resource:9952 ns0:dateCreated "1869-01-01"^^xsd:date . resource:9952 ns0:hasType <http://collections.mohistory.org/vocab/daguerreotype> . resource:9952 ns0:createdBy resource:92142 . resource:92142 ns0:hasLabel "Thomas M. Easterly" . resource:9952 ns0:hasSubject resource:5215 . resource:5215 ns0:hasLabel "Mississippian Culture" .
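A Turtle prefix is only shorthand: a parser concatenates the namespace URI and the local name. The sketch below shows that expansion for the prefixes on this slide; `expand` is a hypothetical helper. It also shows why the trailing `/` or `#` on a namespace matters, since the XML Schema namespace must end in `#` for `xsd:date` to expand to the standard datatype URI.

```python
PREFIXES = {
    "resource": "http://collections.mohistory.org/resource/",
    "ns0": "http://collections.mohistory.org/vocab/relators/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(prefixed_name: str, prefixes: dict = PREFIXES) -> str:
    """Expand a Turtle prefixed name such as 'resource:9952' to a full URI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix}")
    # Plain concatenation: the namespace must already end in '/' or '#'.
    return prefixes[prefix] + local


photo_uri = expand("resource:9952")
date_type = expand("xsd:date")
```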
  • 27. Exercise 2. Time: 15 minutes Activity: • Break into groups of 2-3. • Using the graph defined in Exercise 1, define a set of triples from the graph (Use your own URIs) • Use the RDF validator at http://www.rdfabout.com/demo/validator/
  • 28. What is Linked Open Data? On the web, open license Machine-readable data Non-proprietary format RDF Format Linked RDF
  • 29. Principles of Linked Data • Use URIs to denote things. • Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents. • Provide useful information about the thing when its URI is dereferenced, leveraging standards such as RDF, SPARQL. • Include links to other related things (using their URIs) when publishing data on the Web.
  • 32. Core Vocabularies • RDF & RDFS • Useful terms: rdf:type, rdfs:label • SKOS (Simple Knowledge Organization System) • Useful terms: skos:broader, skos:narrower • OWL (Web Ontology Language) • Useful terms: owl:sameAs, owl:differentFrom • Dublin Core • Useful terms: dc:creator, dc:date, dc:subject • FOAF • Useful terms: foaf:name, foaf:knows, foaf:img
  • 33. Vocabulary Types • Controlled vocabulary: simple list of terms, e.g. DCMI Types list • Thesaurus: hierarchical list of terms, e.g. Library of Congress Subjects • Ontology: hierarchical list of terms with relationship constraints, e.g. CIDOC CRM
  • 34. Example using CRM Core (graph diagram; node and edge labels only). Entities: E21 Person: Rodin Auguste • E67 Birth: Rodin’s birth • E52 Time-Span: 1840 • E69 Death: Rodin’s death • E52 Time-Span: 1917 • E53 Place: France (nation) • E55 Type: sculptors • E12 Production: Rodin making “Monument to Balzac” in 1898 • E52 Time-Span: 1898 • E21 Person: Honoré de Balzac • E84 Information Carrier: The “Monument to Balzac” (plaster) • E55 Type: plaster • E12 Production: Bronze casting “Monument to Balzac” in 1925 • E52 Time-Span: 1925 • E40 Legal Body: Rudier (Vve Alexis) et Fils • E55 Type: companies • E84 Information Carrier: The “Monument to Balzac” (S1296) • E55 Type: bronze. Properties: P2 has type • P4 has time-span • P7 took place at • P14 carried out by • P16B was used for • P62 depicts • P98B was born • P100B died in • P108B was produced by • P120B occurs after • P134 continued
  • 35. Implementing Linked Open Data. Link existing data: • Low barrier to entry • Controlled lists and thesauri • Not very descriptive. Manage data to fit an ontology: • High barrier to entry • Ontologies • Very descriptive. RDF facilitates the “evolution of schemas over time”.
  • 36. What is RDF? • “Resource Description Framework (RDF) is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed.” (from W3C) • “…making statements about resources (in particular web resources) in the form of subject-predicate-object expressions.” (Wikipedia)
  • 37. Triples to Complex Graphs http://collections.mohistory.org/resource/9952 Resource:92142 Thomas M. Easterly 1839 ns1:Subject_91011 “Mississippian Culture” nso:hasSubject nso:hasLabel nso:hasType “Daguerreotype” ns1:type_80345
  • 38. Finding Links • Linked Open Vocabularies is a good starting point • Other well-used sources include: • DBpedia: for a wide range of types (people, places, subjects, concepts) • id.loc.gov: for name authorities and subjects • viaf.org: for name authorities • geonames.org: for geographic locations. Problem: There are no universal vocabularies
  • 39. A Note of Caution When re-using existing URIs, be sure to use the URI that represents the entity (thing/concept/person) and not the web resource. For example: http://id.loc.gov/authorities/subjects/sh85126887.html is NOT the same as: http://id.loc.gov/authorities/subjects/sh85126887
  • 40. A Note of Caution When re-using existing URIs, be sure to use the URI that represents the entity (thing/concept/person) and not the web resource.
  • 41. Finding Links • Matching predicates. • hasType => rdf:type, dcterms:type, crm:P2_has_type • createdBy => dc:creator, crm:P94i_was_created_by • dateCreated => dc:created, ? • Matching value vocabularies. • “Daguerreotype” => http://dbpedia.org/resource/Daguerreotype • “Mississippian Culture” => http://id.loc.gov/authorities/subjects/sh85086218 • “Thomas Easterly” => http://viaf.org/viaf/13114715/ Problem: There are no universal vocabularies
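Once a mapping from local predicates to shared ones has been chosen, applying it is mechanical. The sketch below keeps every local triple and adds one aligned triple per mapped predicate, which is the approach the Turtle on slide 43 takes. The mapping picks a single candidate per local term (an assumption; the slide lists several options), and `align` is a hypothetical helper working on prefixed names as plain strings.

```python
# Local predicate -> one chosen standard predicate (assumed choices).
PREDICATE_MAP = {
    "ns0:hasType": "rdf:type",
    "ns0:createdBy": "dc:creator",
    "ns0:hasSubject": "dc:subject",
    "ns0:hasLabel": "rdfs:label",
}


def align(triples):
    """Return the original triples plus one aligned triple per mapped predicate."""
    aligned = list(triples)
    for s, p, o in triples:
        if p in PREDICATE_MAP:
            aligned.append((s, PREDICATE_MAP[p], o))
    return aligned


local = [
    ("resource:9952", "ns0:createdBy", "resource:92142"),
    ("resource:9952", "ns0:dateCreated", "1869-01-01"),
]
result = align(local)
```

Unmapped predicates such as `ns0:dateCreated` pass through unchanged, so nothing is lost while the mapping is still incomplete.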
  • 42. Triples to Complex Graphs http://collections.mohistory.org/resource/9952 Resource:92142 Thomas M. Easterly 1839 ns1:Subject_91011 “Mississippian Culture” dc:subject rdfs:label rdf:type “Daguerreotype” ns1:type_80345
  • 43. Graph to RDF as Turtle
    @prefix resource: <http://collections.mohistory.org/resource/> .
    @prefix ns0: <http://collections.mohistory.org/vocab/relators/> .
    @prefix dc: <http://purl.org/dc/terms/> . # dc:creator; dc:created; dc:subject
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . # rdf:type
    @prefix owl: <http://www.w3.org/2002/07/owl#> . # sameAs; differentFrom
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . # date; integer
    resource:9952 ns0:dateCreated "1869-01-01"^^xsd:date .
    resource:9952 dc:date "1869-01-01"^^xsd:date .
    resource:9952 ns0:hasType <http://collections.mohistory.org/vocab/daguerreotype> .
    resource:9952 rdf:type <http://dbpedia.org/resource/Daguerreotype> .
    resource:9952 ns0:createdBy resource:92142 .
    resource:9952 dc:creator <http://viaf.org/viaf/13114715/> .
    resource:92142 ns0:hasLabel "Thomas M. Easterly" .
    resource:9952 ns0:hasSubject resource:5215 .
    resource:9952 dc:subject <http://id.loc.gov/authorities/subjects/sh85086218> .
    #resource:5215 ns0:hasLabel "Mississippian Culture" .
    resource:92142 owl:sameAs <http://viaf.org/viaf/13114715/> .
  • 44. Exercise 3. Time: 15-20 minutes Activity: • Break into groups of 2-3. • Using the triples you defined in Exercise 2, find existing URIs to link with your local URIs. • Be prepared to explain why you chose the URIs you chose.
  • 45. How Tos • Embed schema.org data in a web page • Publish static RDF files • Manage local vocabularies and align them with existing vocabularies • Contribute to a collection aggregator – e.g. Europeana or DPLA • Publish existing database records as RDF • Manage RDF data in a triple (or quad) store
  • 46. Embedding schema.org <div itemscope itemtype="http://schema.org/CreativeWork"> <img src="http://collections.mohistory.org/resource/16679.jpg" class="item_image" width="300" itemprop="image" /> <div id="record_detail"> <p><b>Title:</b> <span itemprop="name">Lord Fitzwilliam and manservant, hunting on the Hunt Farm on Gravois Road.</span></p> <p><b>Description:</b> <span itemprop="description"></span></p> <p><b>Item:</b> <span itemprop="additionalType">Daguerreotype</span></p> <p><b>Dates:</b> <span itemprop="dateCreated">1855 to 1865</span></p> . Copy and paste entire text
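To see what a machine reads from markup like this, the `itemprop` values can be pulled out with Python's standard `html.parser` module. This is a simplified sketch and not a conforming microdata parser: it assumes flat elements without nested tags inside an `itemprop` element, and the `ItempropParser` class and the shortened `SAMPLE` markup are illustrative only.

```python
from html.parser import HTMLParser


class ItempropParser(HTMLParser):
    """Collect text (or src for images) for every element carrying itemprop."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop whose text is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if name is None:
            return
        if tag == "img":
            # For images the property value is the src attribute.
            self.properties[name] = attrs.get("src", "")
        else:
            self._current = name
            self.properties[name] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


SAMPLE = (
    '<div itemscope itemtype="http://schema.org/CreativeWork">'
    '<img src="http://collections.mohistory.org/resource/16679.jpg" itemprop="image" />'
    '<span itemprop="additionalType">Daguerreotype</span>'
    '<span itemprop="dateCreated">1855 to 1865</span>'
    "</div>"
)
parser = ItempropParser()
parser.feed(SAMPLE)
```

After `feed`, `parser.properties` holds one entry per `itemprop`, which is roughly the set of name/value pairs a search engine would extract for the `CreativeWork` item.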
  • 47. Publish static RDF files • RDF files can be hand-written (what fun!) or rendered using templates • Paths to RDF files can be submitted to RDF search engines such as Sindice (http://sindice.com) • Caution: Some content negotiation would be required. • Remember: http://mydomain.org/resource/1234.rdf is NOT the same as http://mydomain.org/resource/1234
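The content negotiation mentioned here usually means that a request for the entity URI is answered with a 303 redirect to whichever document suits the client. The sketch below shows only the decision logic, under simplifying assumptions: it ignores `q` weights in the `Accept` header, and the `.ttl`, `.rdf`, and `.html` document suffixes are a hypothetical file layout. `redirect_target` is not part of any web framework.

```python
def redirect_target(entity_uri: str, accept_header: str) -> tuple[int, str]:
    """Pick the document to redirect to for a request on an entity URI."""
    accept = accept_header.lower()
    if "text/turtle" in accept:
        return 303, entity_uri + ".ttl"
    if "application/rdf+xml" in accept:
        return 303, entity_uri + ".rdf"
    # Default: send browsers to the human-readable page.
    return 303, entity_uri + ".html"


status, location = redirect_target(
    "http://mydomain.org/resource/1234", "application/rdf+xml"
)
```

The entity URI itself never returns a document, which keeps `http://mydomain.org/resource/1234` (the thing) distinct from `http://mydomain.org/resource/1234.rdf` (a description of the thing).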
  • 48. Manage local vocabularies and align them with existing vocabularies. Tools include: • PoolParty • TemaTres • Karma
  • 49. Contributing to a collection aggregator – e.g. Europeana or DPLA Service Hub • Dataset A • Dataset B • Dataset C Service Hub • Dataset 1 • Dataset 2 • Dataset 3 Content Hub • Dataset X • Dataset Y • Dataset Z
  • 50. Publish existing database records as RDF
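Publishing database records as RDF comes down to minting a URI per row and emitting one triple per column. Here is a minimal sketch using an in-memory SQLite table; the `objects` table, its columns, and the `http://mydomain.org/resource/` base URI are all assumptions for illustration, while the two Dublin Core terms properties are real.

```python
import sqlite3

BASE = "http://mydomain.org/resource/"  # hypothetical base URI
TITLE = "http://purl.org/dc/terms/title"
CREATED = "http://purl.org/dc/terms/created"
GYEAR = "http://www.w3.org/2001/XMLSchema#gYear"


def rows_to_ntriples(conn: sqlite3.Connection) -> list[str]:
    """Turn each row of the (hypothetical) objects table into N-Triples lines."""
    lines = []
    query = "SELECT id, title, year FROM objects ORDER BY id"
    for obj_id, title, year in conn.execute(query):
        subject = f"<{BASE}{obj_id}>"
        # Escape backslashes and double quotes inside the literal.
        escaped = title.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} <{TITLE}> "{escaped}" .')
        lines.append(f'{subject} <{CREATED}> "{year}"^^<{GYEAR}> .')
    return lines


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.execute("INSERT INTO objects VALUES (1234, 'Big Mound', 1869)")
ntriples = rows_to_ntriples(conn)
```

The year is typed as `xsd:gYear` because the record holds only a year; a full date would use `xsd:date` instead.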
  • 51. Managing RDF data in a triple (or quad) store • Quad = triple + context • Most stores feature a SPARQL interface to query across all triples (quads) in a repository • Tools: • Sesame – from OpenRDF • Virtuoso • Mulgara
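The idea of "quad = triple + context" can be illustrated with a toy in-memory store. This sketch is not how Sesame, Virtuoso, or Mulgara work internally and it does not implement SPARQL; it only shows pattern matching with wildcards, which is the core operation a SPARQL query builds on. The `QuadStore` class and the `graph:` context names are hypothetical.

```python
class QuadStore:
    """A toy in-memory store; None in a pattern acts as a wildcard."""

    def __init__(self):
        self._quads = set()

    def add(self, s, p, o, context):
        self._quads.add((s, p, o, context))

    def match(self, s=None, p=None, o=None, context=None):
        pattern = (s, p, o, context)
        return sorted(
            q for q in self._quads
            if all(want is None or want == got for want, got in zip(pattern, q))
        )


store = QuadStore()
store.add("resource:9952", "dc:creator", "resource:92142", "graph:mohistory")
store.add("resource:9952", "dc:subject", "resource:5215", "graph:mohistory")
store.add("resource:92142", "owl:sameAs", "viaf:13114715", "graph:links")

creators = store.match(p="dc:creator")
local_quads = store.match(context="graph:mohistory")
```

The fourth element lets a query be restricted to one source, for example only the statements loaded from the museum's own data and not those from an external linking set.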

Editor's notes

  1. Add Thomas,mhm, host hotel, depository
  2. 3 min: Europeana and TED talk
  3. Below are some open data options from the Creative Commons. These are listed from the least restrictive at the top of this slide to the most restrictive at the bottom of this slide. CC0 is when a copyright owner waives their rights and dedicates the work to the public domain. CC-BY is when the only requirement is attribution to the owner when reusing. CC-BY-SA adds the additional criterion that others share alike under the same terms. CC-BY-NC further restricts re-use to non-commercial uses only. I put this in red because some open data purists believe that a non-commercial restriction does not qualify for open content status.
  4. Time: 10-15 minutes Activity: Break into groups of 2-3. Write out one or more research questions. For each question, draw an entity-relationship graph that could provide an answer to the question.
  5. RDF can be written in various formats including: RDF/XML, N-Triples, Turtle, JSON-LD
  6. See http://data.nytimes.com/77498966567276420453 for an example of crosslinking “Joan Baez”. The linkage uses the owl:sameAs predicate to link the URI for Joan Baez at the New York Times with the URI at DBpedia.
  7. Add FOAF
  8. DCMI Types: http://dublincore.org/documents/2000/07/11/dcmi-type-vocabulary/ • Library of Congress Subject Headings: http://id.loc.gov/authorities/subjects/sh85086237.html • CIDOC CRM: http://www.cidoc-crm.org/docs/cidoc_crm_version_5.1.2.pdf
  9. Is there a URI for the type "Daguerreotype"? 1) Try Linked Open Vocabularies. Result: nothing for "Daguerreotype" (ref: http://lov.okfn.org/dataset/lov/search/#s=Daguerreotype). Result 2: many hits for "photograph" (ref: http://lov.okfn.org/dataset/lov/search/#s=photograph); could use http://schema.org/photograph as a broad match. 2) Try the LOC Linked Data Service. Result: subject result for "Daguerreotype" (ref: http://id.loc.gov/search/?q=Daguerreotype&q=). Result 2: after filtering by the TGM, found http://id.loc.gov/vocabulary/graphicMaterials/tgm002852.html; a good result: well-used vocabulary, fits within a hierarchy. 3) Try DBpedia. Result: found a DBpedia resource (ref: http://lookup.dbpedia.org/api/search.asmx/PrefixSearch?QueryClass=&MaxHits=5&QueryString=daguerreotype).
  11. Embed schema.org data in a web page; publish static RDF files; manage local vocabularies and align them with existing vocabularies; contribute to a collection aggregator, e.g. Europeana or DPLA; publish existing database records as RDF; manage RDF data in a triple (or quad) store.