Linked Open Data in Libraries, Archives & Museums

Jon Voss – July 12, 2011 – NYPL Labs

http://www.flickr.com/photos/pict_u_re/2372235999

New York Public Library @jonvoss
July 12, 2011

Jon Voss
Historypin Strategic Partnerships Director
jon.voss@wearewhatwedo.org

Welcome
• Goal: a solid, basic, conceptual understanding of Linked
Open Data

• A chance to collaborate with others, share knowledge,
expertise, perspective; explore ideas

Linked Open Data in Cultural Context

• It’s not just Libraries, Archives & Museums

• http://mashupbreakdown.com/

• Linked Open Data has
evolved in the cultural
context of shared
information, music, movies

• From rock to rap to hip-hop
to mashups

• Changing expectations from
audiences, curators,
technologists

History & Mashup Culture

+

2010 National Archives Photo Contest

http://www.flickr.com/photos/37377809@N00/5304492185/in/pool-1633053@N21/

2009
Linked
Open
Data

photos by PhOtOnQuAnTiQuE, TED

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

LODLAM is a Growing Movement

• in its infancy, but picking up steam

• it requires experimentation

• small, niche, domain-specific implementations

• use cases, reasons for content providers to get excited about
contributing

LODLAM is a product of our increasingly
connected culture.

• it’s an unfolding story, but it’s awn...

• first funded projects in the US exploring Linked Open Data in the
humanities now underway: http://lod-lam.net

• June 2-3, 100 people gathered from around the world to
forward LODLAM in the next year

• and that’s just the beginning...

Linked Open Data

Linked

http://openlibrary.org/works/OL6048721W/Linked

Going from Tables to Graphs

http://www.flickr.com/photos/thomasjwoods-com/2264301251


• nodes and links in a graph


• As computing power increases, the ability to build
more and more complex graphs becomes a reality.

• Human vs. Machine readable

msulibraries lookbackmaps
msulibraries internetarchive
msulibraries librarycongress
lookbackmaps internetarchive
internetarchive librarycongress
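
The pairs above can be read as the links of a small graph. A minimal sketch in Python, assuming each pair is an undirected link between two nodes (the slide does not say whether the links have a direction); the function name `build_graph` is made up for this illustration:

```python
from collections import defaultdict

# Links copied from the slide; treating each pair as undirected is an assumption.
links = [
    ("msulibraries", "lookbackmaps"),
    ("msulibraries", "internetarchive"),
    ("msulibraries", "librarycongress"),
    ("lookbackmaps", "internetarchive"),
    ("internetarchive", "librarycongress"),
]


def build_graph(pairs):
    """Return an adjacency map: node -> set of directly linked nodes."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


graph = build_graph(links)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

The same five lines a person reads as a list become something a program can traverse, for example to ask which nodes are one link away from internetarchive.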

Introducing Triples

Nodes & Links

jonvoss --follows--> NYPL_Labs

• Quite simply: Subject, Predicate, Object

• gives us the ability to describe entities in a way that is
machine readable
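
As a rough illustration, the "jonvoss follows NYPL_Labs" statement can be held as a three-part record. This is only a sketch: the `Triple` class and the `objects_of` helper are hypothetical names, and plain strings stand in for the URIs a real dataset would use.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str  # "obj" rather than "object" to avoid shadowing the builtin


# The one statement shown on the slide.
triples = [Triple("jonvoss", "follows", "NYPL_Labs")]


def objects_of(statements, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [t.obj for t in statements
            if t.subject == subject and t.predicate == predicate]


print(objects_of(triples, "jonvoss", "follows"))
```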

What do we know about the person: Ed
Summers (aside from the fact that he
rocks)?
Bio: Hacker for libraries, digital archaeologist, pragmatist.

[Diagram: graph around Ed Summers, with links labeled "bio", "knows", "depiction of", "knows"]

http://inkdroid.org/ehs.rdf

Triples for machines
• triples are the building blocks of the Resource
Description Framework (RDF) and can be serialized in
many different ways, including RDF/XML, RDFa, N3,
Turtle, etc., but they all describe things in the
<subject><predicate><object> format.

• of course, we need to be consistent and
predictable for machines to understand us.

• we’re almost ready to talk to machines
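
To make "serialized" concrete, here is a small sketch that writes one triple as a line of N-Triples, one of the simplest RDF syntaxes. The `http://example.org/` namespace and the `to_ntriples` helper are assumptions for illustration only; the helper handles URI values and does no escaping or literal handling.

```python
def to_ntriples(subject, predicate, obj):
    """Serialize one triple of URIs as a single N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


BASE = "http://example.org/"  # hypothetical namespace, not a real vocabulary

line = to_ntriples(BASE + "jonvoss", BASE + "follows", BASE + "NYPL_Labs")
print(line)
```

Other serializations such as Turtle or RDF/XML would carry the same subject, predicate and object in a different notation.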

http://www.flickr.com/photos/oface/3306994117/

• consider graph demo: http://civilwardata150.net

• Civil War vocabulary, or a way to link and traverse
across datasets

• Regiments, battles, Freebase military schema

• Building apps

• How tools like Simile/Exhibit can use Linked
Data in coordination with Freebase (Conflict
History: http://conflicthistory.com/#/period/)

Now that we can see the code...

• Books

• Photos

• Information

Tim Berners-Lee’s 4 rules of Linked Data

• Use URIs as names for things
• Use HTTP URIs so that people can look up those
names.
• When someone looks up a URI, provide useful
information, using the standards (RDF*, SPARQL)
• Include links to other URIs, so that they can discover
more things.

http://www.w3.org/DesignIssues/LinkedData.html
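
The first two rules can be checked mechanically. A minimal sketch, assuming that "HTTP URI" also covers `https` (an assumption, not something the rules spell out); `is_http_uri` is a hypothetical helper name:

```python
from urllib.parse import urlparse


def is_http_uri(name):
    """True if `name` is a URI that can be looked up over HTTP(S)."""
    parts = urlparse(name)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


for candidate in ["http://inkdroid.org/ehs.rdf",
                  "urn:isbn:0451450523",
                  "Ed Summers"]:
    print(candidate, "->", is_http_uri(candidate))
```

A plain name or a non-HTTP identifier may still name a thing, but nobody can look it up in a browser, which is the point of rule two.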

Tim Berners-Lee: 5 Stars of Linked Data

• More thanks to Ed Summers: http://inkdroid.org/
journal/2010/06/04/the-5-stars-of-open-linked-
data/

• This is NOT all or nothing

A cautionary word about vocabularies

http://www.flickr.com/photos/sillygwailo/272291003/

• Caution: what libraries call vocabularies is not
necessarily what we mean...

• This is how we organize information and
triangulate the data we’re looking for

• How we agree on predicates

• Ontologies and authority files like FOAF, OWL,
http://id.loc.gov/, VIAF, etc.
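
Agreeing on predicates in practice means everyone using the same URI for the same relationship. A small sketch, using the FOAF namespace `http://xmlns.com/foaf/0.1/`; the `expand` helper and the `PREFIXES` table are hypothetical names for this illustration:

```python
# Shared prefix table: short names map to an agreed namespace URI.
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}


def expand(short_name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'foaf:knows' into a full URI."""
    prefix, sep, local = short_name.partition(":")
    if not sep or prefix not in prefixes or not local:
        raise ValueError(f"cannot expand {short_name!r}")
    return prefixes[prefix] + local


print(expand("foaf:knows"))
```

Two datasets that both use the expanded URI for "knows" can be merged without anyone guessing whether their "knows" columns mean the same thing.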

In summary: Linked

• Graphs
• Human AND Machine readable
• Vocabulary, agreed terms for organizing info
• Triples, RDF

The “Open” part of Linked Open Data

• Considerations and ramifications

• Difference between shared, published, open

• Legal tools

• Precedents/Examples

Expose yourself, be vulnerable
• This is the major cultural shift, the tide rising
amongst institutions, that data wants to be free in
a culture economy.

• There is value in sharing

• It does require a leap of faith, but risks and
rewards should be carefully considered and
calculated

• Excellent resource: JISC Open Bibliographic Data
Guide http://obd.jisc.ac.uk/

What will happen to your data?

• If you want people to do something with your
data/metadata, you have to put it out there

• But once you do, it’s [mostly] out of your control.
Yet it can be a part of something much greater
than any of the component parts

• Roots and Wings

• Lessig: Humility of the Web

What will happen to your data?
• Working with Open Data from NOAA at WhereCamp 2011.

http://www.nauticalcharts.noaa.gov/history/CivilWar/

Metadata vs. data, assets, digital
surrogates

• A key conceptual shift with Open Data is
looking at metadata and data as two separate
things, that can have different licensing and
permissions
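
One way to picture the split is a record that carries two separate rights fields. This is a sketch with made-up values: the `CatalogItem` class, the `metadata_export` helper, and the example title and rights statements are all hypothetical and do not describe any real collection item.

```python
from dataclasses import dataclass


@dataclass
class CatalogItem:
    title: str
    metadata_license: str  # license for the descriptive record
    asset_rights: str      # rights statement for the digital surrogate


def metadata_export(item):
    """Return only the descriptive record and its own license."""
    return {"title": item.title, "license": item.metadata_license}


# Hypothetical example values.
item = CatalogItem(title="Example photograph",
                   metadata_license="CC0",
                   asset_rights="All rights reserved")
print(metadata_export(item))
```

The metadata can then be published openly while the asset keeps whatever restrictions apply to it.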

http://www.loc.gov/pictures/collection/cwp/item/2003653763/

http://www.loc.gov/pictures/item/2003653763/marc/

What are the legal tools for
publishing Open Data?

Legal Tools

• http://creativecommons.org/licenses/

• http://www.opendatacommons.org/licenses/

Open Data
• CC-BY
• CC0
• Public Domain Mark
• Public Domain Dedication and License (PDDL)
• CC-BY-SA
• Attribution License (ODC-By)
• Open Database License (ODC-ODbL)

Published Data
• CC-BY-NC-ND
• CC-BY-NC
• CC-BY-ND
• CC-BY-NC-SA
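
Reading the slide as two columns, the split appears to follow a simple pattern: licenses with a NonCommercial (NC) or NoDerivatives (ND) clause land under "Published Data" and the rest under "Open Data". The sketch below encodes that reading; the rule is an inference from the slide layout, it only makes sense for the identifiers listed here, and `license_column` is a hypothetical helper name.

```python
def license_column(license_id):
    """Guess the slide column for one of the listed license identifiers."""
    parts = license_id.upper().split("-")
    restricted = "NC" in parts or "ND" in parts
    return "Published Data" if restricted else "Open Data"


# Column assignment as reconstructed from the flattened slide (an assumption).
slide = {
    "Open Data": ["CC-BY", "CC0", "PDDL", "CC-BY-SA", "ODC-By", "ODC-ODbL"],
    "Published Data": ["CC-BY-NC-ND", "CC-BY-NC", "CC-BY-ND", "CC-BY-NC-SA"],
}

for column, licenses in slide.items():
    for license_id in licenses:
        print(license_id, "->", license_column(license_id))
```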

Concerns and Limitations
• There is some argument about whether or not
metadata can be protected under copyright at all.
Copyright protects a creative work, and some
argue that metadata is scientific fact, rather than
creative work.

• Databases are protected differently in the EU and
US, for example.

• Public Domain and No Known Copyright...

• Issuing blanket copyright over all works on a
website, even though some may be in the public
domain

Examples and precedents

• Bibliographic data:

• British Library (CC0), University of Michigan
(CC0), Stanford (CC-BY) have published large,
raw datasets of bibliographic data they have
created (being careful not to publish OCLC or
other vendor controlled or licensed metadata)


• Civil War Data 150

• Metadata from contributing federal
institutions are largely considered to be Public
Domain.

• State, local, university & individual researchers
are considering policies for metadata
publishing on a case by case basis.


http://googleancientplaces.wordpress.com/public-domain/

Sciences leading the way vs. Humanities

• In the sciences, there have been a lot of advances
in the realm of Open Data, which will provide
models for humanities research as well

• Nano Publishing: the idea of publishing
datasets separately from research findings, so
that they can more easily be built upon and
integrated into other datasets. Several scientific
journals have already started this.

• Federally funded medical research must have a
data management plan and some funders are
requiring that data be published separately from
analysis and findings as Open Data

In summary: Open

• put it out there...
• published, shared, and/or open
• tools
• metadata vs. assets

Google Refine

• A tool for large datasets, cleaning and reconciling

• http://code.google.com/p/google-refine/

• Extremely powerful, though scripting language has
not yet been very well documented.

• Enables you to reconcile data against the 20 million+
known entities in Freebase
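
Reconciling, in the loosest sense, means matching messy local strings to known entities. The toy sketch below is not Google Refine's algorithm or the Freebase API; it only shows the idea with normalized exact matching. The identifiers `entity/1` and `entity/2` and the helper names are made up for illustration.

```python
def normalize(name):
    """Lowercase, drop periods and commas, and collapse whitespace."""
    cleaned = name.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


# Hypothetical lookup table standing in for a large entity database.
KNOWN_ENTITIES = {
    normalize("Library of Congress"): "entity/1",
    normalize("Internet Archive"): "entity/2",
}


def reconcile(raw_value):
    """Return the matching entity identifier, or None if nothing matches."""
    return KNOWN_ENTITIES.get(normalize(raw_value))


for raw in ["  library of  congress. ", "INTERNET ARCHIVE", "Smithsonian"]:
    print(repr(raw), "->", reconcile(raw))
```

A real reconciliation service would add fuzzy matching, type constraints and candidate scoring on top of this.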

What Would You Do?

• Conceptualizing domains, Linked Open
Data projects, collaborations, etc

Join the LODLAM movement

• #lodlam hashtag on Twitter
• http://groups.google.com/group/lod-lam
• http://lod-lam.net proceedings online and
on the road for the next year at various
annual meetings and conferences
• Contribute!

Thanks
@NYPL_Labs Team
@edsu & crew
Sloan Foundation, NEH, Internet Archive
Historypin

& all y’all.
