1. METADATA: a library perspective
NINES Summer Workshop 2008
Jody Perkins, Miami University Libraries
2. METADATA CREATION IN LIBRARIES:
YESTERDAY AND TODAY
Anglo-American Cataloguing Rules
Authority Control
Collocation
Authorized Headings
Content Standards
Controlled Vocabulary
Fields and Subfields
Hierarchical Relationships
Information Retrieval
Non-authorized Terms
Precision and Recall Ratios
Synthetic Languages
3. METADATA IN LIBRARIES: TODAY AND TOMORROW
Resource Description and Access
Findability
Clustering
Tag Clouds
Resource Discovery
Folksonomy
Natural language
Elements & Attributes
Application Profiles
Usage Guidelines
5. Challenges Ahead
METADATA FOR ALL:
Practical Aspects
METADATA IS US but not ONLY US:
Where we’re going
METADATA WAS US:
Where we’ve been
6. METADATA DEFINED
“Metadata consists of statements we make about resources to
help us find, identify, use, manage, evaluate, and preserve
them.”
Concepts come from three traditions:
–Database Management Systems (“Schemas of relational
databases”)
–Library Cataloging Traditions (MARC & AACR2)
–The World Wide Web (since the mid-1990s)
Stuart A. Sutton, Basic Semantics, International Conference on Dublin Core
and Metadata Applications, Singapore, 2007.
http://www.dc2007.sg/T1-BasicSemantics.pdf
9. PRIMARY FUNCTIONS OF METADATA
To support users in the following tasks:
Locate
Identify
Select
Acquire
Navigate
Svenonius, Elaine. 2001. The Intellectual Foundation of Information
Organization. Cambridge, Massachusetts, MIT Press.
10. CHARACTERISTICS OF HIGH QUALITY
METADATA
Completeness
Provenance
Accuracy
Conformance to Expectations
Consistency and Coherence
Timeliness
Thomas R. Bruce and Diane I. Hillmann, 2004. “The continuum of metadata quality:
defining, expressing, exploiting,” in: Diane I. Hillmann and Elaine Westbrooks
(editors). Metadata in Practice. Chicago: ALA Editions, pp. 238-256.
11. BASIC COMPONENTS OF
METADATA CREATION IN LIBRARIES
Schemas
Content Standards
Controlled Values
Encoding and Transmission Standards
Best Practices
12. SCHEMAS
General Purpose
Dublin Core
MARC, MODS
Specialized
VRA Core
EAD
TEI and many others . . .
20. AACR2: AREAS OF DESCRIPTION
Title and statement of responsibility
Edition
Publication
Physical description
Series
Notes
Standard number and terms of availability
21. MARC TAGGING SYSTEM (SIMPLIFIED)
0xx - Control and Fixed Data Fields
1xx - Main Entries (authors)
2xx - Title, Edition, Imprint
3xx - Physical Description
4xx - Series Statements
5xx - Notes
6xx - Subject Entries
7xx - Added Entries (other than subject or series)
8xx - Added Entries (Series)
841+ Holdings Information
9xx - Locally defined information
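The tag ranges above can be read as a lookup on the first digit of the tag. The following is a minimal sketch of that idea, using only the simplified labels from this slide; the function name `marc_block` is hypothetical, and the handling of 841 and above simply follows the "841+" line as written here rather than the full MARC 21 specification.

```python
# Simplified MARC tag blocks, copied from the slide above.
MARC_BLOCKS = {
    "0": "Control and Fixed Data Fields",
    "1": "Main Entries (authors)",
    "2": "Title, Edition, Imprint",
    "3": "Physical Description",
    "4": "Series Statements",
    "5": "Notes",
    "6": "Subject Entries",
    "7": "Added Entries (other than subject or series)",
    "8": "Added Entries (Series)",
    "9": "Locally defined information",
}


def marc_block(tag: str) -> str:
    """Return the simplified block label for a three-digit MARC tag."""
    if len(tag) != 3 or not (tag.isascii() and tag.isdigit()):
        raise ValueError(f"not a three-digit MARC tag: {tag!r}")
    # Assumption taken from the slide: within 8xx, tags 841 and up
    # carry holdings information rather than series added entries.
    if tag[0] == "8" and int(tag) >= 841:
        return "Holdings Information"
    return MARC_BLOCKS[tag[0]]


for tag in ("100", "245", "650", "830", "852"):
    print(tag, "->", marc_block(tag))
```

A lookup like this is only a reading aid; real cataloguing systems work from the full field definitions, not from the first digit alone.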
23. MARC RECORD
LEADER 00000cam 2200457 a 4500
001 68192310
005 20070112000000.0
008 060418s2007 alua b s001 0 eng
010 2006013112
020 0817315381 (alk. paper)
020 9780817315382 (alk. paper)
024 3 9780817315382
040 DLC|cDLC |dBAKER
043 n-us---
049 MIAA
050 00 PS1342.R4|bB87 2007
100 1 Bush, Harold K.|q(Harold Karl),|d1956-
245 10 Mark Twain and the spiritual crisis of his age /|cHarold K. Bush, Jr
260 Tuscaloosa :|bUniversity of Alabama Press,|cc2007
300 340 p. :|bill. ;|c24 cm
440 0 Studies in American literary realism and naturalism
504 Includes bibliographical references (p. [311]-331) and index
505 0 Mark Twain's roots : Hannibal, the river, and the west -- Mark Twain's wife : the moral ethos of the
600 10 Twain, Mark,|d1835-1910|xReligion
650 0 Christianity and literature |zUnited States|xHistory|y19th century
947 kmf
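In the record above, subfields within a field are separated by `|` followed by a one-character code, and the first subfield is shown without a code. The sketch below splits field data on that convention. It assumes the uncoded first subfield is `a` (true for the fields shown here, but an assumption about this display format), and the function name `split_subfields` is hypothetical.

```python
def split_subfields(field_data: str, delimiter: str = "|") -> list[tuple[str, str]]:
    """Split MARC field data into (code, value) pairs.

    Assumes the display convention of the record above: the first
    subfield is an implicit "a", and each later subfield starts with
    the delimiter followed by a one-character code.
    """
    first, *rest = field_data.split(delimiter)
    pairs = []
    if first.strip():
        pairs.append(("a", first.strip()))
    for chunk in rest:
        if chunk:  # skip empty chunks produced by a doubled delimiter
            pairs.append((chunk[0], chunk[1:].strip()))
    return pairs


title = split_subfields(
    "Mark Twain and the spiritual crisis of his age /|cHarold K. Bush, Jr"
)
print(title)
# [('a', 'Mark Twain and the spiritual crisis of his age /'), ('c', 'Harold K. Bush, Jr')]
```

The trailing punctuation (" /", " :", ",") is kept on purpose: in AACR2-style records it is part of the transcribed data, and deciding whether to strip it is a separate conversion choice.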
26. MARC RECORD EXAMPLE
Marko, Lynn and Christina Powell. 2001. Descriptive metadata strategy for TEI
headers: a University of Michigan Library case study. OCLC Systems & Services 17, no. 3:
117-121.
27. SAME RECORD AS A TEI HEADER
FRAGMENT
Marko, Lynn and Christina Powell. 2001. Descriptive metadata strategy for TEI
headers: a University of Michigan Library case study. OCLC Systems & Services 17, no. 3:
117-121.
28. LINKS TO BEST PRACTICE DOCUMENTS AND
PROJECTS USING MARC DATA IN TEI HEADERS
TEI Text Encoding in Libraries Guidelines for Best
Encoding Practices http://www.diglib.org/standards/tei.htm
(see section IV. The TEI Header)
Description of Text Encoding Initiatives (TEI)
Header Elements and Corresponding USMARC Fields.
Appendix to TEI/MARC Best Practices
http://etext.lib.virginia.edu/tei/tei-usmarc.html
Marko, Lynn and Christina Powell. 2001. Descriptive
metadata strategy for TEI headers: a University of
Michigan Library case study. OCLC Systems & Services
17, no. 3: 117-121.
30. METS
Metadata Encoding and Transmission Standard
“. . . a standard for packaging descriptive, administrative and
structural metadata. It allows for metadata which adheres to
existing standards (such as Dublin Core and MARC) to be
embedded in a METS record, or stored outside the METS
record and referenced. METS is therefore not a metadata
standard but rather a wrapper for associating existing
metadata of various types within a single object, document, or
collection structure.”
Source:
http://staffweb.library.northwestern.edu/dl/metadata/standardsinventory/metssummary.html
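The "wrapper" idea in the quotation above can be sketched in a few lines: a METS document holds a descriptive metadata section, and inside it sits a record in some other standard, here two Dublin Core elements. This is an illustrative sketch only. It uses the METS and Dublin Core namespaces and the `dmdSec`, `mdWrap`, `xmlData`, and `structMap` elements, but the function name `build_mets` and the ID value are hypothetical, and the output has not been validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("dc", DC_NS)


def qname(namespace: str, local: str) -> str:
    """Build an ElementTree qualified name such as {namespace}local."""
    return f"{{{namespace}}}{local}"


def build_mets(title: str, creator: str) -> ET.Element:
    """Wrap a tiny Dublin Core description inside a METS document."""
    mets = ET.Element(qname(METS_NS, "mets"))

    # Descriptive metadata section: METS supplies the wrapper,
    # Dublin Core supplies the actual descriptive elements.
    dmd = ET.SubElement(mets, qname(METS_NS, "dmdSec"), ID="DMD1")
    wrap = ET.SubElement(dmd, qname(METS_NS, "mdWrap"), MDTYPE="DC")
    xml_data = ET.SubElement(wrap, qname(METS_NS, "xmlData"))
    ET.SubElement(xml_data, qname(DC_NS, "title")).text = title
    ET.SubElement(xml_data, qname(DC_NS, "creator")).text = creator

    # Structural map: a single division pointing back at the description.
    struct_map = ET.SubElement(mets, qname(METS_NS, "structMap"))
    ET.SubElement(struct_map, qname(METS_NS, "div"), DMDID="DMD1")
    return mets


record = build_mets(
    "Mark Twain and the spiritual crisis of his age", "Bush, Harold K."
)
print(ET.tostring(record, encoding="unicode"))
```

The same wrapper could instead reference an externally stored record rather than embedding it, which is the second option the quotation describes.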
40. textMD
• Extension for METS <amdSec>
• XML Schema for text-based digital objects
• Used for technical metadata that specifies encoding, character attributes,
languages, markup, processing, pagination, display requirements
• www.loc.gov/standards/textMD/
42. BARRIERS TO INTEROPERABILITY:
Semantic differences
Different practices
Differences in representation
Different vocabularies
Multiple versions
Priscilla Caplan, “Metadata Fundamentals for All Librarians,”
(Chicago: ALA, 2003): 41-42.
43. INTEROPERABILITY IS CRITICAL FOR. . .
Federated Searching
Harvesting
Inter and Intra Institutional Collaboration
Future Proofing
44. SOME WAYS TO FACILITATE
INTEROPERABILITY:
Compliance with standards and best practices
Application Profiles
Framework
Conversion
Integration
Registries
Lois Mai Chan and Marcia Lei Zeng, "Metadata Interoperability and
Standardization - A Study of Methodology Part I." D-Lib Magazine, 12, no. 6
(2006). http://www.dlib.org/dlib/june06/chan/06chan.html (accessed August 30,
2007) and "Metadata Interoperability and Standardization - A Study of
Methodology Part II." D-Lib Magazine, 12, no. 6 (2006).
http://www.dlib.org/dlib/june06/zeng/06zeng.html (accessed August 30, 2007)
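Of the approaches listed above, conversion is the easiest to illustrate: a crosswalk maps elements of one schema to elements of another. The sketch below maps a handful of MARC tag and subfield pairs to Dublin Core element names. The mapping table is a simplified assumption made up for this example, not an official crosswalk, and the names `CROSSWALK` and `marc_to_dc` are hypothetical.

```python
# Illustrative crosswalk: (MARC tag, subfield code) -> Dublin Core element.
# Simplified assumption for this example, not an official mapping.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("650", "a"): "subject",
}


def marc_to_dc(fields):
    """Convert (tag, code, value) triples into a dict of Dublin Core value lists."""
    record = {}
    for tag, code, value in fields:
        element = CROSSWALK.get((tag, code))
        if element is None:
            continue  # unmapped data is dropped, so the conversion is lossy
        # Strip the trailing ISBD punctuation that MARC records carry.
        record.setdefault(element, []).append(value.rstrip(" /:,;"))
    return record


marc_fields = [
    ("100", "a", "Bush, Harold K."),
    ("245", "a", "Mark Twain and the spiritual crisis of his age /"),
    ("260", "b", "University of Alabama Press,"),
    ("260", "c", "c2007"),
    ("650", "a", "Christianity and literature"),
    ("300", "a", "340 p. :"),  # no mapping in this sketch, so it is dropped
]
dc = marc_to_dc(marc_fields)
for element, values in dc.items():
    print(element, values)
```

The dropped 300 field and the unconverted "c2007" date show two of the barriers named earlier in practice: semantic differences between schemas and differences in how values are represented.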
46. GENERAL GUIDELINES
Framework of Guidance for Building Good Digital Collections. 3rd
ed. 2007. Bethesda, MD : National Information Standards
Organization, NISO Framework Advisory Group. Accessed 23
July 2008. http://www.niso.org/publications/rp/framework3.pdf
NINCH Guide to Good Practice in the Digital Representation and
Management of Cultural Heritage Materials. 2002. Humanities
Advanced Technology and Information Institute (HATII),
University of Glasgow, and the National Initiative for a
Networked Cultural Heritage. Accessed 23 July 2008.
http://www.nyu.edu/its/humanities/ninchguide/
47. METADATA SCHEMAS
Dublin Core Metadata Element Set, Version 1.1. Dec. 18,
2006. Dublin Core Metadata Initiative. Accessed 23 July
2008. http://dublincore.org/documents/dces/
MARC and MARC Related Standards (MODS, MARCXML,
etc.) Accessed 23 July 2008. http://www.loc.gov/marc/
VRA Core Categories, Version 4.0. 2007. Visual Resources
Association Data Standards Committee. Accessed 23 July
2008. http://www.vraweb.org/projects/vracore4/index.html
48. CONTENT STANDARDS
American Library Association, et al. AACR2 Anglo-American
Cataloguing Rules, 2nd ed. Chicago : American Library
Association, 2002. Also available in a concise edition.
Society of American Archivists. Describing Archives: A
Content Standard. Chicago : Society of American
Archivists, 2004.
Visual Resources Association. Cataloging Cultural Objects: A
Guide to Describing Cultural Works and Their Images.
Chicago : American Library Association, 2006.
49. CONTROLLED VALUE SCHEMES
Controlled Vocabularies for Use in Rare Book and Special Collections
Cataloging. ACRL/RBMS. Accessed 23 July 2008.
http://www.rbms.info/committees/bibliographic_standards/controlled_vocabularies/index.shtml
FAST: Faceted Application of Subject Terminology. OCLC Online Computer
Library Center. Accessed 23 July 2008. Interface http://fast.oclc.org/
Information http://www.oclc.org/research/projects/fast/
Getty Vocabulary Program. J. Paul Getty Trust. Accessed 23 July 2008.
http://www.getty.edu/research/conducting_research/vocabularies/
Library of Congress Authorities. 2006. Library of Congress. Accessed 23
July 2008. http://authorities.loc.gov/
LC Thesaurus for Graphic Materials. 1995. Library of Congress. Prints and
Photographs Division. Accessed 23 July 2008.
http://lcweb.loc.gov/rr/print/tgm1
50. BEST PRACTICES
Best Practices for Shareable Metadata. NSF, August 2005. Digital Library
Federation and the National Science Digital Library. Accessed 23 July
2008. http://oai-best.comm.nsdl.org/cgi-bin/wiki.pl?PublicTOC
Description of Text Encoding Initiatives (TEI) Header Elements and
Corresponding USMARC Fields. Appendix to TEI/MARC Best Practices
Accessed 23 July 2008. http://etext.lib.virginia.edu/tei/tei-usmarc.html
TEI Text Encoding in Libraries Guidelines for Best Encoding Practices.
Accessed 23 July 2008. http://www.diglib.org/standards/tei.htm (see
section IV. The TEI Header)
Marko, Lynn and Christina Powell. 2001. Descriptive metadata strategy for
TEI headers: a University of Michigan Library case study. OCLC Systems
& Services 17, no. 3: 117-121.