3. Welcome
• Goal: a solid, basic, conceptual understanding of Linked
Open Data
• A chance to collaborate with others, share knowledge,
expertise, perspective; explore ideas
4. Linked Open Data in Cultural Context
• It’s not just Libraries, Archives & Museums
• http://mashupbreakdown.com/
• Linked Open Data has
evolved in the cultural
context of shared
information, music, movies
• From rock to rap to hip-hop
to mashups
• Changing expectations from
audiences, curators,
technologists
5. History & Mashup Culture
+
2010 National Archives Photo Contest
http://www.flickr.com/photos/37377809@N00/5304492185/in/pool-1633053@N21/
6. 2009
Linked
Open
Data
photos by PhOtOnQuAnTiQuE, TED
7. Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
8.
9. LODLAM is a Growing Movement
• in its infancy, but picking up steam
• it requires experimentation
• small, niche, domain-specific implementations
• use cases, reasons for content providers to get excited about
contributing
10. LODLAM is a product of our increasingly
connected culture.
• it’s an unfolding story, but it’s on...
• first funded projects in the US exploring Linked Open Data in the
humanities now underway: http://lod-lam.net
• June 2-3, 100 people gathered from around the world to
forward LODLAM in the next year
11. LODLAM is a product of our increasingly
connected culture.
• and that’s just the beginning...
Linked Open Data
15. Going from Tables to Graphs
• As computing power increases, the ability to build
more and more complex graphs becomes a reality.
• Human vs. Machine readable
msulibraries lookbackmaps
msulibraries internetarchive
msulibraries librarycongress
lookbackmaps internetarchive
internetarchive librarycongress
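The table of pairs above can be read as a small graph. A minimal Python sketch, assuming each row on the slide is one link between two nodes and that links are undirected (the helper name `build_graph` and the adjacency layout are illustrative, not from the talk):

```python
from collections import defaultdict

# Edge list copied from the slide: each pair is one link between two nodes.
EDGES = [
    ("msulibraries", "lookbackmaps"),
    ("msulibraries", "internetarchive"),
    ("msulibraries", "librarycongress"),
    ("lookbackmaps", "internetarchive"),
    ("internetarchive", "librarycongress"),
]


def build_graph(edges):
    """Turn a table of pairs into an adjacency mapping (node -> set of neighbours)."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)  # assumption: links are treated as undirected
    return dict(graph)


graph = build_graph(EDGES)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

The same five rows that look flat in a table now answer a graph question directly, such as "what is lookbackmaps connected to?".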
16. Introducing Triples
Nodes & Links: jonvoss → follows → NYPL_Labs
• Quite simply: Subject, Predicate, Object
• gives us the ability to describe entities in a way that is
machine readable
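As a rough illustration, the "jonvoss follows NYPL_Labs" statement can be held as a three-part record. This is only a sketch: the `Triple` class and the `objects_of` helper are hypothetical names, and plain strings stand in for the URIs a real dataset would use.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# The statement from the slide: jonvoss follows NYPL_Labs.
triples = [Triple("jonvoss", "follows", "NYPL_Labs")]


def objects_of(triples, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


print(objects_of(triples, "jonvoss", "follows"))
```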
17. What do we know about the person: Ed
Summers (aside from the fact that he
rocks)?
Bio: Hacker for libraries, digital archaeologist, pragmatist.
bio
knows
depiction of
knows
http://inkdroid.org/ehs.rdf
18. Triples for machines
• triples follow the Resource Description Framework (RDF)
data model and can be serialized in many different ways,
including RDF/XML, RDFa, N3, Turtle, etc.,
but they all describe things in the
<subject> <predicate> <object> format.
• of course, we need to be consistent and
predictable for machines to understand us.
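To make the point about serializations concrete, here is a minimal sketch that writes one triple in two syntaxes, N-Triples and Turtle. The `http://example.org/` namespace and both helper functions are assumptions for illustration; a real project would more likely rely on an RDF library than on string formatting.

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only

# The same statement as before, now with every term named by a URI.
triple = (EX + "jonvoss", EX + "follows", EX + "NYPL_Labs")


def to_ntriples(s, p, o):
    """N-Triples: one statement per line, every term a full URI in angle brackets."""
    return f"<{s}> <{p}> <{o}> ."


def to_turtle(s, p, o, prefix="ex", namespace=EX):
    """Turtle: the same statement, with the shared namespace abbreviated.

    Assumes all three URIs start with `namespace`.
    """
    def short(uri):
        return f"{prefix}:{uri[len(namespace):]}"

    return f"@prefix {prefix}: <{namespace}> .\n{short(s)} {short(p)} {short(o)} ."


ntriples_text = to_ntriples(*triple)
turtle_text = to_turtle(*triple)
print(ntriples_text)
print(turtle_text)
```

The syntax differs, but both outputs carry the same subject, predicate, and object.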
19. • we’re almost ready to talk to machines
http://www.flickr.com/photos/oface/3306994117/
21. • consider graph demo: http://civilwardata150.net
• Civil War vocabulary, or a way to link and traverse
across datasets
• Regiments, battles, Freebase military schema
• Building apps
• How tools like Simile/Exhibit can use Linked
Data in coordination with Freebase (Conflict
History: http://conflicthistory.com/#/period/
22.
23.
24.
25. Now that we can see the code...
• Books
• Photos
• Information
26.
27.
28. Tim Berners-Lee’s 4 rules of Linked Data
• Use URIs as names for things
• Use HTTP URIs so that people can look up those
names.
• When someone looks up a URI, provide useful
information, using the standards (RDF*, SPARQL)
• Include links to other URIs, so that they can discover
more things.
http://www.w3.org/DesignIssues/LinkedData.html
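Rules 2 and 3 can be sketched in a few lines of Python: check that a name is an HTTP URI, then prepare a lookup that asks for RDF rather than HTML. The helper names are hypothetical, the URI is the one shown earlier for Ed Summers, and whether a given server honours the `Accept` header is an assumption. The request is only built here, not sent.

```python
from urllib.parse import urlparse
from urllib.request import Request


def is_http_uri(uri):
    """Rule 2: the name should be an HTTP(S) URI so that it can be looked up."""
    return urlparse(uri).scheme in ("http", "https")


def build_lookup_request(uri, media_type="application/rdf+xml"):
    """Rule 3: ask the server for a machine-readable (RDF) description."""
    if not is_http_uri(uri):
        raise ValueError(f"not an HTTP URI: {uri}")
    return Request(uri, headers={"Accept": media_type})


request = build_lookup_request("http://inkdroid.org/ehs.rdf")
print(request.full_url, request.get_header("Accept"))
# Passing `request` to urllib.request.urlopen would perform the actual lookup.
```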
29. Tim Berners-Lee: 5 Stars of Linked Data
• More thanks to Ed Summers: http://inkdroid.org/
journal/2010/06/04/the-5-stars-of-open-linked-
data/
• This is NOT all or nothing
30. A cautionary word about vocabularies
http://www.flickr.com/photos/sillygwailo/272291003/
31. A cautionary word about vocabularies
• Caution: what libraries call vocabularies is not
necessarily what we mean...
• This is how we organize information and
triangulate the data we’re looking for
• How we agree on predicates
• Vocabularies, ontologies, and authority files like FOAF, OWL,
http://id.loc.gov/, VIAF, etc.
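Agreeing on predicates in practice means agreeing on the URIs that name them. A minimal sketch, assuming the usual FOAF and OWL namespace URIs, of how a short name such as `foaf:knows` expands to the shared identifier (the `expand` helper is hypothetical):

```python
# Namespace URIs for two widely used vocabularies.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def expand(curie):
    """Expand a prefixed name like 'foaf:knows' into a full predicate URI."""
    prefix, _, local = curie.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown vocabulary prefix: {prefix}")
    return PREFIXES[prefix] + local


print(expand("foaf:knows"))
print(expand("owl:sameAs"))
```

Two datasets that both use the expanded `foaf:knows` URI are making the same kind of statement, which is what lets machines link across them.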
32. In summary: Linked
• Graphs
• Human AND Machine readable
• Vocabulary, agreed terms for organizing info
• Triples, RDF
33. The “Open” part of Linked
Open Data
• Considerations and ramifications
• Difference between shared, published, open
• Legal tools
• Precedents/Examples
34. Expose yourself, be vulnerable
• This is the major cultural shift, the tide rising
amongst institutions, that data wants to be free in
a culture economy.
• There is value in sharing
• It does require a leap of faith, but risks and
rewards should be carefully considered and
calculated
• Excellent resource: JISC Open Bibliographic Data
Guide http://obd.jisc.ac.uk/
35. What will happen to your data?
• If you want people to do something with your
data/metadata, you have to put it out there
• But once you do, it’s [mostly] out of your control.
Yet it can be a part of something much greater
than any of the component parts
• Roots and Wings
• Lessig: Humility of the Web
36. What will happen to your data?
• working with Open Data from NOAA at wherecamp 2011.
http://www.nauticalcharts.noaa.gov/history/CivilWar/
37. Metadata vs. data, assets, digital
surrogates
• A key conceptual shift with Open Data is
looking at metadata and data as two separate
things that can have different licensing and
permissions
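One way to picture this split is a record that carries two independent license fields. The class, field names, and sample values below are all invented for illustration and do not describe any institution's actual policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionItem:
    """A hypothetical record: descriptive metadata plus a digital surrogate."""
    title: str
    metadata_license: str  # terms for the catalogue description
    asset_license: str     # terms for the image, scan, or other surrogate


# Invented sample values: open metadata, restricted asset.
item = CollectionItem(
    title="Example photograph",
    metadata_license="CC0",
    asset_license="All rights reserved",
)


def license_for(item, part):
    """Look up the terms for one part of the record: 'metadata' or 'asset'."""
    if part == "metadata":
        return item.metadata_license
    if part == "asset":
        return item.asset_license
    raise ValueError(f"unknown part: {part}")


print(license_for(item, "metadata"), "/", license_for(item, "asset"))
```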
40. What are the legal tools for
publishing Open Data?
41. Legal Tools
• http://creativecommons.org/licenses/
• http://www.opendatacommons.org/licenses/
• Open Data: CC-BY, CC0, Public Domain Mark, Public Domain
Dedication and License (PDDL), Attribution License (ODC-By),
Open Database License (ODC-ODbL)
• Published Data: CC-BY-NC-ND, CC-BY-NC, CC-BY-ND, CC-BY-SA,
CC-BY-NC-SA
42. Concerns and Limitations
• There is some argument about whether or not
metadata can be protected under copyright at all.
Copyright protects a creative work, and some
argue that metadata is scientific fact, rather than
creative work.
• Databases are protected differently in the EU and
US, for example.
• Public Domain and No Known Copyright...
• Issuing blanket copyright over all works on a
website, even though some may be in the public
domain
43. Examples and precedents
• Bibliographic data:
• British Library (CC0), University of Michigan
(CC0), Stanford (CC-BY) have published large,
raw datasets of bibliographic data they have
created (being careful not to publish OCLC or
other vendor controlled or licensed metadata)
44. Examples and precedents
• Civil War Data 150
• Metadata from contributing federal
institutions are largely considered to be Public
Domain.
• State, local, university & individual researchers
are considering policies for metadata
publishing on a case by case basis.
46. Sciences leading the way vs. Humanities
• In the sciences, there have been a lot of advances
in the realm of Open Data, which will provide
models for humanities research as well
• Nano Publishing: the idea of publishing
datasets separately from research findings, so
that they can more easily be built upon and
integrated into other datasets. Several scientific
journals have already started doing this.
• Federally funded medical research must have a
data management plan and some funders are
requiring that data be published separately from
analysis and findings as Open Data
47. In summary: Open
• put it out there...
• published, shared, and/or open
• tools
• metadata vs. assets
48. Google Refine
• A tool for large datasets, cleaning and reconciling
• http://code.google.com/p/google-refine/
• Extremely powerful, though its scripting language is
not yet well documented.
• Enables you to reconcile data against the 20 million+
known entities in Freebase
49. What Would You Do?
• Conceptualizing domains, Linked Open
Data projects, collaborations, etc
50. Join the LODLAM movement
• #lodlam hashtag on Twitter
• http://groups.google.com/group/lod-lam
• http://lod-lam.net proceedings online and
on the road for the next year at various
annual meetings and conferences
• Contribute!
51. Thanks
@NYPL_Labs Team
@edsu & crew
Sloan Foundation, NEH, Internet Archive
Historypin
& all y’all.
Editor's Notes
exploring history on mobile apps
People much smarter than I were already on it. Earlier in 2009, the father of the World Wide Web, Tim Berners-Lee, was taking his message of Linked Open Data to the streets. How we can build a web of data... sounds familiar... and it seems to have worked out the first time... From a web of documents, to a web of data
and that web of data is already growing rapidly...
What if we begin to apply this to the vast amounts of data at libraries, archives, and museums?
It started for me with the book Linked, which was first published in 2002. I don’t think I read it until 2003 or so, but it changed my life. The explanations of mathematical graph and network theory in lay terms helped me to see how an understanding of interconnectedness would allow us to do amazing things with the disparate datasets around us.
--Our data and databases have been organized in tables
--which works, but only to a point
The World Wide Web is much more like a graph, and the ability to link to disparate datasets relies on our ability to understand data as nodes and links in a graph
Where did we get all that info about Ed? He published it here.
In the last several years, Creative Commons has provided standardized, portable legal tools that are easier for individuals and institutions to use. Also see the licenses by the Open Knowledge Foundation, designed for databases.