1. Applications of XML in Libraries for Electronic
Resources
Karen A. Coombs
University of Houston
librarywebchic@gmail.com
2. XML formats you might see or use in libraries
• MARCXML
• MARCXML holdings
• ISO/FDIS 20775 - Holdings schema
• OpenURL XML formats
• XML Metadata Format for Books (info:ofi/fmt:xml:xsd:book)
• XML Metadata Format for Journals (info:ofi/fmt:xml:xsd:journal)
• Digital Library standards
• Dublin Core
• MODS
• METS
3. MARCXML
• XML version of a MARC record
• Uses fields, subfields and indicators
• Very complex and often difficult to work with
• Typical output of most APIs for library catalogs
• Difficult to interpret if you don’t know MARC
• OCLC Bibliographic Formats and Standards - http://www.oclc.org/bibformats/default.htm
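As a rough illustration of what working with fields, subfields and indicators looks like in practice, here is a minimal Python sketch using only the standard library. The record is a hand-written sample, not real catalog output, and the helper names (`get_subfields`, `get_indicators`) are made up for this example.

```python
import xml.etree.ElementTree as ET

# MARCXML namespace defined by the Library of Congress.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hand-written sample record (illustrative only, not real API output).
SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">ocm00000001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Ambient findability /</subfield>
    <subfield code="c">Peter Morville.</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="b">O'Reilly,</subfield>
    <subfield code="c">2005.</subfield>
  </datafield>
</record>"""


def get_subfields(record, tag, code):
    """Return the text of each subfield `code` found in datafields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [subfield.text for subfield in record.findall(path, MARC_NS)]


def get_indicators(record, tag):
    """Return an (ind1, ind2) pair for each datafield with `tag`."""
    path = f"marc:datafield[@tag='{tag}']"
    return [(df.get("ind1"), df.get("ind2")) for df in record.findall(path, MARC_NS)]


record = ET.fromstring(SAMPLE_RECORD)
title = get_subfields(record, "245", "a")
publisher = get_subfields(record, "260", "b")
title_indicators = get_indicators(record, "245")
print(title, publisher, title_indicators)
```

Knowing that 245 $a holds the title is exactly the MARC knowledge the format assumes; the XML itself carries no human-readable labels.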
5. MARCXML Holdings
• MARC format for holdings
• Most relevant for serials/journals
• Limited number of important fields
• 856 - Electronic Location and Access
• 853 - Captions and Pattern information
• 863 - Enumeration and Chronology
• 866 - Textual Statement of Holdings
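A small sketch of pulling two of these fields out of a MARCXML holdings record follows. It assumes the usual subfields (856 $u for the URI, 866 $a for the textual holdings statement); the record, URL and holdings statement are invented for illustration, and `summarize_holdings` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hand-written holdings record; the URL and holdings statement are made up.
SAMPLE_HOLDINGS = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="856" ind1="4" ind2="0">
    <subfield code="u">http://example.org/journal</subfield>
  </datafield>
  <datafield tag="866" ind1=" " ind2="0">
    <subfield code="a">v.1 (1990)-v.20 (2009)</subfield>
  </datafield>
</record>"""

# Tag -> subfield code read by this sketch
# (856 $u = URI, 866 $a = textual holdings statement).
FIELDS_OF_INTEREST = {"856": "u", "866": "a"}


def summarize_holdings(record):
    """Collect the chosen subfield values, grouped by MARC tag."""
    summary = {}
    for datafield in record.findall("marc:datafield", MARC_NS):
        tag = datafield.get("tag")
        code = FIELDS_OF_INTEREST.get(tag)
        if code is None:
            continue  # not one of the fields we care about
        path = f"marc:subfield[@code='{code}']"
        values = [sf.text for sf in datafield.findall(path, MARC_NS)]
        summary.setdefault(tag, []).extend(values)
    return summary


holdings = summarize_holdings(ET.fromstring(SAMPLE_HOLDINGS))
print(holdings)
```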
7. ISO/FDIS 20775
• Standard for transmitting holdings information
• Also contains information about the library with the holdings
• Being used by OCLC in WorldCat API
• Can contain information about complex serial holdings
• Can contain information about availability, availability policy, conditions and
charges
9. OpenURL XML formats
• Normally we think of OpenURL as a set of key/value pairs
http://www.crossref.org/openurl?url_ver=Z39.88-2004&req_dat=username:password&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Isolation of a common receptor for coxsackie B&rft.jtitle=Science&rft.aulast=Bergelson&rft.auinit=J&rft.date=1997&rft.volume=275&rft.spage=1320&rft.epage=1323
• Doesn’t have to be. Newer versions allow you to send the metadata as XML
rather than a set of key/value pairs
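The key/value form can be assembled from a citation with a few lines of Python. This is only a sketch: the resolver address `http://resolver.example.org/openurl` and the function name `build_openurl` are placeholders, and the citation values are the ones from the example above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_openurl(base_url, citation):
    """Build a key/value (KEV) OpenURL for a journal article citation."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    # Each citation element becomes an rft.* key describing the referent.
    params.update({f"rft.{key}": value for key, value in citation.items()})
    return base_url + "?" + urlencode(params)


citation = {
    "atitle": "Isolation of a common receptor for coxsackie B",
    "jtitle": "Science",
    "aulast": "Bergelson",
    "auinit": "J",
    "date": "1997",
    "volume": "275",
    "spage": "1320",
    "epage": "1323",
}

# Placeholder resolver address, not a real service.
openurl = build_openurl("http://resolver.example.org/openurl", citation)

# Reading the query string back shows the same key/value pairs.
decoded = parse_qs(urlsplit(openurl).query)
print(openurl)
```

Using `urlencode` takes care of escaping spaces and reserved characters, which hand-built OpenURLs often get wrong.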
10. Digital Library Standards for Metadata
• There are lots of different types of metadata for digital objects
• Descriptive
• Structural
• Administrative
• Technical
• Different types of metadata = different standards
• Dublin Core, MODS - Descriptive
• METS - Structural, Administrative
• PREMIS - Administrative
• MIX - Technical
11. Dublin Core
• Two different element sets: Simple and Qualified
• Simple
• 15 elements
• Extremely simplistic
• dc namespace
• Qualified
• Includes all the elements in Simple Dublin Core plus additional
elements and refinements
• description -> abstract
• Still fairly simple but better granularity
• dcterms namespace
12.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<records xmlns:dc="http://purl.org/dc/elements/1.1/">
<record>
<dc:creator>Morville, Peter.</dc:creator>
<dc:date>2005</dc:date>
<dc:description>Includes bibliographical references and index.</dc:description>
<dc:description>How do you find your way in an age of information overload? How can you filter
streams of complex information to pull out only what you want? Why does it matter how
information is structured when Google seems to magically bring up the right answer to your
questions? What does it mean to be "findable" in this day and age? This eye-opening new book
examines the convergence of information and connectivity. Written by Peter Morville, author of the
groundbreaking Information Architecture for the World Wide Web, the book defines our current age
as a state of unlimited findability. In other words, anyone can find anything at any time. </dc:description>
<dc:format>xiv, 188 : ill. (some col.) ; 23 cm.</dc:format>
<dc:identifier>0596007655 (pbk.)</dc:identifier>
<dc:identifier>9780596007652 (pbk.)</dc:identifier>
<dc:language xsi:type="http://purl.org/dc/terms/ISO639-2">eng</dc:language>
<dc:publisher>O'Reilly</dc:publisher>
<dc:subject xsi:type="http://purl.org/dc/terms/DDC">005.72</dc:subject>
<dc:subject xsi:type="http://purl.org/dc/terms/LCC">QA76.9.D26 M67 2005</dc:subject>
<dc:subject xsi:type="http://purl.org/dc/terms/LCSH">Database searching.</dc:subject>
<dc:subject xsi:type="http://purl.org/dc/terms/NLM">TK 5105.888 M892a 2005</dc:subject>
<dc:title>Ambient findability </dc:title>
<dc:type>Text</dc:type>
</record>
</records>
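A record like this is easy to read programmatically. The sketch below uses a shortened copy of the record and Python's standard library. One assumption to note: the sample declares the `xsi` namespace on the root element, which the record as shown above does not, because a namespace-aware parser rejects an undeclared prefix. The function names are made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DC_NS = "http://purl.org/dc/elements/1.1/"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Shortened copy of the record, with the xsi namespace declared (an assumption).
SAMPLE = """<records xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<record>
<dc:creator>Morville, Peter.</dc:creator>
<dc:date>2005</dc:date>
<dc:identifier>0596007655 (pbk.)</dc:identifier>
<dc:identifier>9780596007652 (pbk.)</dc:identifier>
<dc:subject xsi:type="http://purl.org/dc/terms/DDC">005.72</dc:subject>
<dc:subject xsi:type="http://purl.org/dc/terms/LCSH">Database searching.</dc:subject>
<dc:title>Ambient findability </dc:title>
</record>
</records>"""


def record_to_dict(record):
    """Map each Dublin Core element name to the list of its values."""
    fields = defaultdict(list)
    prefix = "{" + DC_NS + "}"
    for child in record:
        if child.tag.startswith(prefix):
            fields[child.tag[len(prefix):]].append((child.text or "").strip())
    return dict(fields)


def subjects_by_scheme(record):
    """Group subject values by the last path segment of their xsi:type URI."""
    schemes = defaultdict(list)
    for subject in record.findall("{%s}subject" % DC_NS):
        scheme = subject.get("{%s}type" % XSI_NS, "unspecified")
        schemes[scheme.rsplit("/", 1)[-1]].append(subject.text)
    return dict(schemes)


first_record = ET.fromstring(SAMPLE).find("record")
fields = record_to_dict(first_record)
subjects = subjects_by_scheme(first_record)
print(fields["title"], subjects)
```

Because every Simple Dublin Core element is repeatable, the values are kept as lists rather than single strings.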
13. METS
• Metadata Encoding and Transmission Standard
• Used for digital objects to “wrap-up” all metadata elements
• Can include other metadata schemes
• Provides structural metadata
• what files are part of the objects
• what is their purpose
14. MODS
• Metadata Object Description Schema
• Advantages
• Richer description than Dublin Core
• Element names more user-friendly than MARCXML
• Better separation of data and presentation than MARC and actual
datatyping of elements
• Typically used for describing digital library content but MARCXML can be
converted to MODS
15. XML from the Internet also useful to Libraries
• Feeds
• Standard formats for syndicating content
• RSS
• title, description, link, author, pubDate
• Atom
• title, summary, link, modified, dc:date
17.
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>Library Hi Tech </title>
<link>http://www.emeraldinsight.com/0737-8831.htm</link>
<description> Table of Contents from the most recently published issues of Library Hi Tech</description>
<language>en-us</language>
<copyright>2009 Emerald Group Publishing Ltd.</copyright>
<image>
<title>Library Hi Tech </title>
<url>http://www.emeraldinsight.com/info/pics/journals/lht-cover-xix.gif</url>
<width>120</width>
<height>157</height>
</image>
<item>
<title>Accessing information in a parliamentary environment: is the OPAC dead? : Table of Contents</title>
<link/>
<description> <B>Abstract:</B><BR/> <B>Purpose</B> - Access to library
collections in an era where users want to "get" rather than "find" offers particular challenges. This article
explores users' needs for bibliographic records in a primarily full text environment.<B>Design/methodology/approach</B> - The paper describes access to parliamentary and library information from
the Australian Parliament. It then outlines the approach taken to develop and implement a new search
system, ParlInfo, which applied a repository and search system that provides integrated access to
bibliographic and full text information. The system was launched in September 2008 and offers facets, alerts,
RSS feeds and other Web 2.0 functionality to offer both the Australian public and Parliamentary Network
users to access to library collections and parliamentary collections. <B>Findings</B> -.</description>
<author>Ms. Roxanne Missingham, Ms. Rina Brettell, Ms. Shirley White, Dr. Sarah Miskin</author>
<pubDate>Sun Jan 18 14:15:05 GMT 2009</pubDate>
</item>
</channel>
</rss>
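Reading the items out of a feed like this takes very little code. The following sketch works on a shortened copy of the feed. Note that this feed's `pubDate` is not in the RFC 822 layout RSS 2.0 normally uses, so the date pattern below is specific to this example and is an assumption about how the publisher formats dates; `parse_items` is a made-up name.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Shortened copy of the feed (descriptions and image omitted).
SAMPLE_RSS = """<rss version="2.0">
<channel>
<title>Library Hi Tech </title>
<link>http://www.emeraldinsight.com/0737-8831.htm</link>
<item>
<title>Accessing information in a parliamentary environment: is the OPAC dead? : Table of Contents</title>
<link/>
<author>Ms. Roxanne Missingham, Ms. Rina Brettell, Ms. Shirley White, Dr. Sarah Miskin</author>
<pubDate>Sun Jan 18 14:15:05 GMT 2009</pubDate>
</item>
</channel>
</rss>"""

# Date layout seen in this particular feed (not RFC 822).
FEED_DATE_FORMAT = "%a %b %d %H:%M:%S GMT %Y"


def parse_items(rss_text):
    """Return one dict per <item>, tagged with the channel (journal) title."""
    channel = ET.fromstring(rss_text).find("channel")
    journal = channel.findtext("title", default="").strip()
    items = []
    for item in channel.findall("item"):
        items.append({
            "journal": journal,
            "title": item.findtext("title", default="").strip(),
            # An empty <link/> yields "", which is normalized to None here.
            "link": item.findtext("link") or None,
            "published": datetime.strptime(item.findtext("pubDate"), FEED_DATE_FORMAT),
        })
    return items


items = parse_items(SAMPLE_RSS)
print(items[0]["journal"], items[0]["published"].isoformat())
```

The empty `<link/>` in this item is a reminder that publisher feeds are not always complete, so code consuming them should tolerate missing values.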
19.
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://purl.org/atom/ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:dc="http://purl.org/dc/elements/1.1/" version="0.3">
<title>Geological Magazine - Current Issue</title>
<link rel="alternate" href="http://journals.cambridge.org/action/displayJournal?jid=GEO" />
<info>Geological Magazine, Volume 146 Issue 01 Geological Magazine , established in 1864, is
one of the oldest and best-known periodicals in earth sciences. It publishes original scientific papers
covering the complete spectrum of geological topics, with high quality illustrations. Its worldwide
circulation and high production values, combined with Rapid Communications and Book Review
sections keep the journal at the forefront of the field. This journal is included in the Cambridge Journals
open access initiative, Cambridge Open Option. Offer readers unrestricted online access to your work,
click here for more details.</info>
<entry>
<title>Volume 146 Issue 01</title>
<link rel="alternate" href="http://journals.cambridge.org/action/displayIssue?jid=GEO&amp;volumeId=146&amp;issueId=01" />
<modified>2009-01-01T00:00:00Z</modified>
<summary type="text/plain" mode="xml">Geological Magazine, Volume 146 Issue 01 Geological
Magazine , established in 1864, is one of the oldest and best-known periodicals in earth sciences. It
publishes original scientific papers covering the complete spectrum of geological topics, with high
quality illustrations. Its worldwide circulation and high production values, combined with Rapid
Communications and Book Review sections keep the journal at the forefront of the field. This journal is
included in the Cambridge Journals open access initiative, Cambridge Open Option. Offer readers
unrestricted online access to your work, click here for more details.</summary>
<dc:date>2009-01-01T00:00:00Z</dc:date>
</entry>
</feed>
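Atom needs slightly different handling from RSS because its elements live in a namespace and the link is an attribute rather than element text. A minimal sketch against a shortened copy of this Atom 0.3 feed might look like the following; `parse_entries` is a made-up name.

```python
import xml.etree.ElementTree as ET

# Namespaces used by this Atom 0.3 feed.
NS = {
    "atom": "http://purl.org/atom/ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Shortened copy of the feed (info and summary text omitted).
SAMPLE_ATOM = """<feed xmlns="http://purl.org/atom/ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/" version="0.3">
<title>Geological Magazine - Current Issue</title>
<entry>
<title>Volume 146 Issue 01</title>
<link rel="alternate" href="http://journals.cambridge.org/action/displayIssue?jid=GEO&amp;volumeId=146&amp;issueId=01" />
<modified>2009-01-01T00:00:00Z</modified>
<dc:date>2009-01-01T00:00:00Z</dc:date>
</entry>
</feed>"""


def parse_entries(atom_text):
    """Return title, link and dates for each <entry> in an Atom 0.3 feed."""
    feed = ET.fromstring(atom_text)
    entries = []
    for entry in feed.findall("atom:entry", NS):
        link = entry.find("atom:link[@rel='alternate']", NS)
        entries.append({
            "title": entry.findtext("atom:title", namespaces=NS),
            "href": link.get("href") if link is not None else None,
            "modified": entry.findtext("atom:modified", namespaces=NS),
            "dc_date": entry.findtext("dc:date", namespaces=NS),
        })
    return entries


entries = parse_entries(SAMPLE_ATOM)
print(entries[0]["title"], entries[0]["href"])
```

The parser turns `&amp;` in the `href` attribute back into a plain `&`, so the extracted link can be used directly.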
20. Sources for data in XML format
• Syndicated Table of Contents feeds
• From Publisher websites - Emerald
• From ticTOCs project- http://www.tictocs.ac.uk
• WorldCat API
• Evergreen Catalogs (Georgia Pines, University of Prince Edward Island)
• xISSN services
• Serials Solutions API
23. WorldCat API
• Service Levels
• Default - limited set of indexes and limits; limited bibliographic data
returned
• Full - all indexes available in WorldCat; full bibliographic data
• Search formats
• OpenSearch
• SRU
• Response formats
• OpenSearch
• RSS
• Atom
• SRU
• MARCXML
• Dublin Core
24. SRU Query to WorldCat Search API
• Can search by ISSN or other fields, full MARC records can be returned
http://worldcat.org/webservices/catalog/search/sru?query=srw.in+all+%221041-7915%22&version=1.1&operation=searchRetrieve&wskey=key&recordSchema=info%3Asrw%2Fschema%2F1%2Fmarcxml&maximumRecords=10&startRecord=1&recordPacking=xml&servicelevel=default&sortKeys=relevance&resultSetTTL=300&recordXPath=
• query - srw query
Use SRU Explain Service (http://worldcat.org/webservices/catalog/) to help
construct your query
• wskey - API key
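Rather than escaping the query by hand, the URL can be assembled from its parameters. This sketch only builds the string and sends nothing; the parameter names and values are taken from the example above, while the function name `build_sru_url` and the placeholder key are assumptions.

```python
from urllib.parse import urlencode

SRU_BASE = "http://worldcat.org/webservices/catalog/search/sru"


def build_sru_url(issn, wskey, max_records=10):
    """Build an SRU searchRetrieve URL that searches by ISSN and asks for MARCXML."""
    params = {
        "query": f'srw.in all "{issn}"',  # same ISSN query as the example above
        "version": "1.1",
        "operation": "searchRetrieve",
        "wskey": wskey,
        "recordSchema": "info:srw/schema/1/marcxml",
        "maximumRecords": max_records,
        "startRecord": 1,
        "recordPacking": "xml",
        "servicelevel": "default",
    }
    return SRU_BASE + "?" + urlencode(params)


# "key" is a placeholder, as in the example above; use your own API key.
sru_url = build_sru_url("1041-7915", "key")
print(sru_url)
```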
25. An OpenSearch Query to WorldCat Search API
• Can only search by keywords and the data returned isn’t particularly useful when
dealing with serials/journals
http://worldcat.org/webservices/catalog/search/worldcat/opensearch?q=computers%20in%20libraries&format=atom&wskey=key
• q - your query
This is very simple; it really can’t be anything but a keyword search
• format - the format you want results returned in: Atom or RSS
• wskey - WorldCat Search API key
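The OpenSearch request is simple enough that a tiny helper covers it. This is a sketch that only builds the URL; `build_opensearch_url` is a made-up name and "key" stands in for a real API key.

```python
from urllib.parse import urlencode, quote

OPENSEARCH_BASE = "http://worldcat.org/webservices/catalog/search/worldcat/opensearch"


def build_opensearch_url(keywords, wskey, fmt="atom"):
    """Build a keyword search URL; fmt must be 'atom' or 'rss'."""
    if fmt not in ("atom", "rss"):
        raise ValueError("format must be 'atom' or 'rss'")
    params = {"q": keywords, "format": fmt, "wskey": wskey}
    # quote_via=quote encodes spaces as %20, matching the example above.
    return OPENSEARCH_BASE + "?" + urlencode(params, quote_via=quote)


opensearch_url = build_opensearch_url("computers in libraries", "key")
print(opensearch_url)
```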
26. xISSN Service
• Several types of Requests
• getForms - returns the ISSNs in the same group as the requested ISSN, along
with their production form information.
• Form is the ONIX production form code
• JB (Printed serial), JC (Serial distributed electronically by
carrier), JD (Electronic serial distributed online), MA (Microform)
• getEditions - returns a list of ISSNs in the same group as the requested ISSN.
• form, oclcnum, peerreview, publisher, rawcoverage, title
• getHistory - returns a list of ISSNs in the same group as the requested ISSN,
as well as ISSNs for preceding/succeeding groups
• getMetadata - returns metadata about the requested ISSN
• xISSN History Visualization Tool - generate a chart showing the history of a
journal with a given ISSN
29. Serials Solutions API
• Proprietary APIs
• Available for customers only
• API for 360 Link (OpenURL)
• Serials Solutions provides other APIs depending on which of their products
you subscribe to
• SFX OpenURL resolver also has an API
30. Query to Serials Solutions 360 Link XML API
http://<client identifier>.openurl.xml.serialssolutions.com/openurlxml?version=1.0&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rfr_id=info%3Asid%2Fsersol%3ARefinerQuery&url_ver=Z39.88-2004&rft_id=info%3Adoi%2F10.1037%2F0003-066X.59.1.29
• Standard OpenURL elements are passed
• In this case the DOI is providing the majority of the info
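Decoding the query string makes it easier to see which OpenURL elements are actually being sent. The sketch below works only on the query portion of the URL above (the host is left out because it contains a client-specific placeholder); the helper names are made up.

```python
from urllib.parse import parse_qs

# Query portion of the request shown above.
QUERY = (
    "version=1.0&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal"
    "&rfr_id=info%3Asid%2Fsersol%3ARefinerQuery&url_ver=Z39.88-2004"
    "&rft_id=info%3Adoi%2F10.1037%2F0003-066X.59.1.29"
)


def decode_openurl(query):
    """Return the percent-decoded parameters, keeping the first value of each key."""
    return {key: values[0] for key, values in parse_qs(query).items()}


def extract_doi(params):
    """Return the DOI carried in rft_id, or None if rft_id is not an info:doi URI."""
    rft_id = params.get("rft_id", "")
    prefix = "info:doi/"
    return rft_id[len(prefix):] if rft_id.startswith(prefix) else None


params = decode_openurl(QUERY)
doi = extract_doi(params)
print(params)
print(doi)
```

Once decoded, it is clear that the request carries no title, author or date at all; the resolver has to look everything up from the DOI.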
31. Other XML standards of interest
• COUNTER and SUSHI - http://www.niso.org/schemas/sushi/
Data can be transmitted in XML format
• ONIX
• For Books - http://www.editeur.org/onix.html
• For Serials - http://www.editeur.org/onixserials.html
• Actually a set of formats
• Much more complex than the books standard
32. Possible Applications
• Integrate journal table of contents into web pages
• Provide users with latest articles in their field by creating an aggregated feed
of important journals in a given field
• Provide better interfaces for resource discovery
• Display print journal holdings in-line with e-journal holdings
• Check for other versions/iterations of a journal during OpenURL resolution
(xISSN)
• Show users relationships between journals and title changes over time
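The aggregated-feed idea can be sketched in a few lines: parse several table of contents feeds, pool their items, and sort by date. The two feeds below are invented for illustration, and the sketch assumes well-behaved RSS 2.0 feeds with RFC 822 dates, which real publisher feeds do not always provide.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two made-up table of contents feeds for journals in the same field.
FEED_A = """<rss version="2.0"><channel><title>Journal A</title>
<item><title>Article A1</title><link>http://example.org/a1</link>
<pubDate>Mon, 19 Jan 2009 10:00:00 GMT</pubDate></item>
<item><title>Article A2</title><link>http://example.org/a2</link>
<pubDate>Fri, 02 Jan 2009 09:00:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Journal B</title>
<item><title>Article B1</title><link>http://example.org/b1</link>
<pubDate>Sat, 10 Jan 2009 12:00:00 GMT</pubDate></item>
</channel></rss>"""


def aggregate(feeds, limit=10):
    """Merge items from several RSS documents, newest first."""
    merged = []
    for feed_text in feeds:
        channel = ET.fromstring(feed_text).find("channel")
        journal = channel.findtext("title", default="")
        for item in channel.findall("item"):
            merged.append({
                "journal": journal,
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "published": parsedate_to_datetime(item.findtext("pubDate")),
            })
    merged.sort(key=lambda entry: entry["published"], reverse=True)
    return merged[:limit]


latest = aggregate([FEED_A, FEED_B])
for entry in latest:
    print(entry["published"].date(), entry["journal"], entry["title"])
```

The merged list could then be rendered into a web page or written back out as a single feed for a subject guide.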
33. Possible Applications
• Provide links to journal table of contents
• Use WorldCat API to search ISSN and retrieve 856
• Manipulate usage statistics information outside an ERM
• Show most popular journals, databases, ebooks to users
• Provide better interface for ILL staff to see holdings and loan rule information
for e-resources
• Better display of cross-references between print and electronic journal
holdings for users
34. Further Resources
• Auto-Populating an ILL form with the Serial Solutions Link Resolver API -
http://journal.code4lib.org/articles/108
• Dublin Core - http://dublincore.org/
• ISO/FDIS 20775 - Holdings schema - http://www.loc.gov/standards/iso20775/
• MARC Holdings - http://www.loc.gov/marc/holdings/echdhome.html
• MARCXML - http://www.loc.gov/standards/marcxml/
• MODS - http://www.loc.gov/standards/mods/
• METS - http://www.loc.gov/standards/mets/
• OCLC Developer’s Network - http://worldcat.org/devnet/wiki/Main_Page
• WorldCat Search API URI Evaluator - http://worldcat.org/webservices/catalog/evaluator.html
• xISSN Web Services Documentation - http://xissn.worldcat.org/xissnadmin/doc/api.htm