Talk given at Open Knowledge Foundation 'Opening Up Metadata: Challenges, Standards and Tools' Workshop, Queen Mary University of London, 13th June 2012.
Info on the event at http://openglam.org/2012/05/31/last-places-left-for-opening-up-metadata-challenges-standards-and-tools/
1. Introduction to APIs and Linked Data
Adrian Stevenson
Senior Technical Innovations Coordinator
Mimas, University of Manchester, UK
@adrianstevenson
2. Benefits of APIs for GLAMs
• Cross-searching
• Improved resource discovery
• Data not trapped in silos
• Findability on the Web – Google
• Data re-use
• Bringing data together - integration
• Enhanced services – e.g. Mashups
3. Metadata
• What is it? - Data about data
• How do you create it?
– Catalog card, text editor, Word, Excel, Access, XML Editor…
• Do you use standards?
– EAD – Encoded Archival Description
– Not using standards may have implications for interoperability & sustainability
• How do you move it around?
– CDs, Email attachments, FTP, APIs
4. What is an API?
• ‘Application Programming Interface’
– “API is an online interface that allows distributed systems to communicate with one another and exchange information”
– “APIs are carefully thought out pieces of code created by programmers .. that allow other applications to interact with their application”
5. APIs
• Allow machine readability of data
– Typically over the Web
• Provide other systems with access to content or functions
• Many types – e.g.
– Google, Facebook, Flickr, twitter APIs ….
– OAI-PMH
– Linked Data API, SPARQL
– Others include SOLR, SRU, Z39.50, SOAP, ….
6. APIs are Machine to Machine
• An API is a software-to-software interface, not a user interface
• E.g. cinema ticket websites use an API:
– Sends credit card info to remote application
– Remote application sends response back to ticket website saying OK to issue the tickets
• User sees one interface
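A minimal sketch of the cinema ticket exchange above, in Python. Everything here is hypothetical: the function names, the JSON fields and the approval rule are made up for illustration, and the "remote application" is just a local function standing in for what would normally be an HTTP call.

```python
import json


def payment_service(request_body: str) -> str:
    """Hypothetical remote application: takes a JSON request, returns a JSON response."""
    request = json.loads(request_body)
    # Toy rule standing in for a real card check (an assumption, not a real API).
    approved = request["amount_pence"] > 0 and len(request["card_number"]) == 16
    return json.dumps({"approved": approved})


def ticket_website_checkout(card_number: str, amount_pence: int) -> str:
    """The ticket website talks to the payment service software-to-software."""
    request_body = json.dumps(
        {"card_number": card_number, "amount_pence": amount_pence}
    )
    # In a real system this would be a request sent over the Web.
    response = json.loads(payment_service(request_body))
    # The user only ever sees this one message from the ticket website.
    return "Tickets issued" if response["approved"] else "Payment declined"


print(ticket_website_checkout("4111111111111111", 1250))  # prints: Tickets issued
```

The point of the sketch is that both sides exchange structured data (here JSON) rather than web pages, which is what makes the interface machine to machine.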
12. Open Expenses
http://benosteen.wordpress.com/2010/02/27/giving-the-mps-a-receipt-for-their-expenses-claim/
13. OAI-PMH
• Open Archives Initiative Protocol for Metadata Harvesting
• Mechanism for repositories and services to share metadata over the Web
• Facilitates cross-searching
• Works by use of 6 ‘verbs’
– E.g. ListMetadataFormats, ListRecords, GetRecord …
– http://archiveshub.ac.uk/api/OAI-PMH/2.0/hub?verb=Identify
– http://archiveshub.ac.uk/api/OAI-PMH/2.0/hub?verb=GetRecord&identifier=gb141vbh&metadataPrefix=oai_dc
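A short Python sketch of how an OAI-PMH request like the ones above might be put together, and how a response could be read. The base URL is the one on the slide and may have changed since 2012; the helper names and the sample XML are assumptions made for illustration, and nothing is fetched over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base URL taken from the slide; it may no longer resolve.
BASE_URL = "http://archiveshub.ac.uk/api/OAI-PMH/2.0/hub"

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_request(verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"Not an OAI-PMH verb: {verb}")
    return BASE_URL + "?" + urlencode({"verb": verb, **arguments})


def repository_name(identify_xml: str):
    """Pull the repository name out of an Identify response."""
    root = ET.fromstring(identify_xml)
    return root.findtext("oai:Identify/oai:repositoryName", namespaces=OAI_NS)


# Made-up, heavily trimmed Identify response, for illustration only.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify><repositoryName>Example Repository</repositoryName></Identify>
</OAI-PMH>"""

print(build_request("GetRecord", identifier="gb141vbh", metadataPrefix="oai_dc"))
print(repository_name(SAMPLE_IDENTIFY))
```

A harvester would fetch each URL, parse the XML in this way, and store the records locally for cross-searching.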
16. Linked Data
“The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web.”
“the Semantic Web is the goal or end result… Linked Data provides the means to reach that goal”
From ‘Linked Data: The Story So Far’ - Heath, Bizer and Berners-Lee 2009
17. The goal of Linked Data is to enable people to share structured data on the Web as easily as they can share documents today.
Bizer/Cyganiak/Heath Linked Data Tutorial, linkeddata.org
19. URIs and HTTP
• “A Uniform Resource Identifier (URI) provides a simple and extensible means for identifying a resource” – IETF RFC 3986
• HTTP URIs may be ‘de-referenced’ on the Web
• HTTP URIs are used for “real world” things
• http://adrianstevenson.com/id/me
• http://dbpedia.org/resource/Love
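As a rough illustration of de-referencing, the Python sketch below prepares an HTTP request for one of the URIs above and asks for RDF rather than HTML via the Accept header. The function names are hypothetical, and what comes back depends entirely on the server: Linked Data servers commonly answer a "real world" URI with a redirect to a document about the thing, but that is an assumption here, not something the code checks.

```python
from urllib.request import Request, urlopen


def rdf_request(uri: str, accept: str = "application/rdf+xml") -> Request:
    """Prepare a GET request that asks for an RDF representation of the URI."""
    return Request(uri, headers={"Accept": accept})


def dereference(uri: str):
    """Fetch the URI and report where we ended up and what format we got.

    Needs network access; urlopen follows redirects automatically.
    """
    with urlopen(rdf_request(uri), timeout=10) as response:
        return response.geturl(), response.headers.get("Content-Type")


request = rdf_request("http://dbpedia.org/resource/Love")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `dereference("http://dbpedia.org/resource/Love")` would actually send the request; it is left out of the sketch so that the example runs offline.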
20. RDF
• Resource Description Framework
– a language for representing information about resources on the Web
– RDF can be used to represent things identified on the Web, even when they cannot be directly retrieved on the Web
• Describes relations using ‘triples’
• http://www.w3.org/TR/REC-rdf-syntax/
21. Triples
• Triple statements
– ‘Things’ have ‘properties’ with ‘values’
– Subject – Predicate – Object
Keith Richards – Is Member Of – The Rolling Stones
Repository – Provides Access To – Archival Resource
• Triples are the basis of RDF and Linked Data
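The two example statements can be sketched as data in Python. The example.org URIs below are placeholders invented for illustration (they are not real identifiers for Keith Richards or any repository), and the output format is N-Triples, one of the plain-text ways of writing RDF down.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a thing, a property, and a value."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(triple: Triple) -> str:
    """Write a triple whose three parts are all URIs as one N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


# Placeholder URIs (assumptions): real data would use published identifiers.
statements = [
    Triple("http://example.org/id/keith-richards",
           "http://example.org/vocab/isMemberOf",
           "http://example.org/id/the-rolling-stones"),
    Triple("http://example.org/id/repository",
           "http://example.org/vocab/providesAccessTo",
           "http://example.org/id/archival-resource"),
]

for statement in statements:
    print(to_ntriples(statement))
```

Each printed line is a complete statement on its own, which is why triples from different sources can simply be pooled together.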
23. From RDF to Linked Data
• If something is identified, it can be linked to
• We take items from our datasets and link them to items from other datasets
BBC
Copac
VIAF
DBPedia
GeoNames
Archives Hub
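A toy Python sketch of what linking across datasets buys you. Two small sets of triples stand in for two separate datasets; all URIs under example.org are invented, and only the owl:sameAs property URI is a real one. The `describe` helper is hypothetical and follows sameAs links a single hop, which is a simplification.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Dataset 1 (made up): an archive's record for a person, plus one outbound link.
archive_data = {
    ("http://example.org/archive/person/1",
     "http://example.org/vocab/name", "Keith Richards"),
    ("http://example.org/archive/person/1",
     OWL_SAME_AS, "http://example.org/music/artist/kr"),
}

# Dataset 2 (made up): another publisher's data about the same person.
music_data = {
    ("http://example.org/music/artist/kr",
     "http://example.org/vocab/memberOf", "The Rolling Stones"),
}


def describe(uri, *datasets):
    """Gather property/value pairs for a thing across datasets via sameAs links."""
    merged = set().union(*datasets)
    aliases = {uri}
    for subject, predicate, obj in merged:
        if predicate == OWL_SAME_AS:
            if subject == uri:
                aliases.add(obj)
            if obj == uri:
                aliases.add(subject)
    return sorted(
        (predicate, obj)
        for subject, predicate, obj in merged
        if subject in aliases and predicate != OWL_SAME_AS
    )


for predicate, value in describe("http://example.org/archive/person/1",
                                 archive_data, music_data):
    print(predicate, "->", value)
```

Because the archive's item is identified and linked, the band membership held in the other dataset comes along for free once the two are merged.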
29. Key Benefit of Linked Data
• Web 2.0 mashups work against a fixed set of data sources
• Hand crafted by humans
• Don’t integrate well
• Linked Data promises an unbound global data space
• Easy dataset integration
• Generic ‘mesh-up’ tools
30. Benefits for GLAMs
• Cross-searching
• Improved resource discovery
• Data not trapped in silos
• Findability on the Web – Google
• Data re-use
• Bringing data together - integration
• Enhanced services
31. Linked Data Challenges
• Dirty data, URI persistence
• Steep learning curve
• Complexity
• How sustainable are the data sources?
• How scalable are triple stores?
• Can you track the provenance of data sources?
• Licensing
32. Contact
Adrian Stevenson
Mimas, University of Manchester, UK
adrian.stevenson@manchester.ac.uk
www.mimas.ac.uk
@adrianstevenson
www.linkedin.com/in/adrianstevenson
www.slideshare.net/adrianstevenson
33. CC License
• This presentation is available under a Creative Commons Non Commercial-Share Alike licence: http://creativecommons.org/licenses/by-nc/2.0/uk/
Editor's notes
EAD is an XML format based on ISAD(G) rules.
Emphasise interface. An interface is a common boundary between separate systems. APIs are specially crafted to expose only chosen functionality and/or data while safeguarding other parts of the application which provides the interface.
Some have more of an interoperability focus, some more proprietary.
What are APIs good for? One thing is mashups. Uses Open Beelden, i.e. Open Images (http://www.openimages.eu/api/), which is OAI-PMH and GPS data. Allows you to merge your snapshots at locations with snapshots from historical films: augmented reality in reverse. Looks a bit like Historypin.
Pronounced ‘Rikesmonumenten’, which is the National Heritage Museum. (Location-based) information on Holland’s 61,000 heritage sites.
How do you use APIs – documentation and some dev skills.
It’s a harvesting approach. Emphasise machine readability. Services have OAI-PMH interfaces that facilitate harvesting of data. Data can be put into a repository to allow cross-searching and distributed searching.
Has been described as a ‘data commons’, or more usually a Web of Data.
Step back a bit to HTML. The HTML web of documents doesn’t encourage re-use or reduce redundancy. There are network effects, but they could be much better.
Note this is a considerable simplification of the detail, in danger of misleading. Linked data exploits semantically meaningful tagging to encourage re-use, reduce redundancy etc.
http://www.w3.org/DesignIssues/LinkedData.html
Uses predicate logic. Goes back to Aristotle. Conceptualises things, and the relationships between things.
In hypertext web sites it is considered generally rather bad etiquette not to link to related external material. The value of your own information is very much a function of what it links to, as well as the inherent value of the information within the web page. So it is also in the Semantic Web. Remember, this is about machines linking: machines need identifiers; humans generally know when something is a place or when it is a person. BBC + DBPedia + GeoNames + Archives Hub + Copac + VIAF = the Web as an exploratory space. Users are very interested in related materials, according to Terry Catapano at SAA 2011. LD can really help with this.
Can get XSLT stylesheet here too!
Note that it is a machine readable interface as well as the human interface. Currently have a few hundred in Locah. There are 25,000 EAD records on the Hub service. We’re intending to put about 2,000 up for the Linking Lives Project.
‘Every story has a beginning’. Nice example of consumption of Archives Hub linked data.
Data can be integrated from many different sources. Users are very interested in related materials, according to Terry Catapano at SAA 2011. LD can really help with this.