Jack Brighton's presentation on the basics of moving image metadata, structured data, the PBCore metadata standard, and why it all matters to archivists. This was one part of an all-day workshop presented at the 2011 AMIA Conference in Austin, Texas, on Wednesday, November 16, 2011.
21. WILL Radio interview with John Brady Kiesling, former U.S. diplomat, on
“The War In Iraq: U.S. Foreign Policy And The Crisis Of International
Legitimacy,” Focus 580, September 23, 2005
Interviewer: Jack Brighton
Producer: Harriet Williamson
22. Descriptive Metadata
• title: Focus 580 on WILL-AM
• subject: The War In Iraq: U.S. Foreign Policy And The Crisis
Of International Legitimacy
• description: Interview with John Brady Kiesling, former
U.S. diplomat
• genre: politics, United States, Iraq, foreign policy, war
• date of broadcast: September 23, 2005
• etcetera...
23. Administrative Metadata
• creator: WILL Radio
• publisher: WILL Public Media
• copyright holder: University of Illinois
• rights summary: may be repurposed for non-commercial
and educational purposes with attribution to WILL
• etcetera...
24. Technical Metadata
• format: broadcast wav file
• sampling rate: 44.1 kHz
• bit depth: 16 bit
• tracks: mono
• file size: 248,673,242 bytes
• location: Enco DAD cut #36465
• etcetera...
27. media object as rss feed
<?xml version="1.0" encoding="UTF-8" ?>
<rss xmlns:itunes="http://www.itunes.com/DTDs/Podcast-1.0.dtd" version="2.0">
<channel>
<title>Focus 580 on WILL-AM</title>
<description>An intelligent interview program on current affairs</description>
<link>http://www.will.uiuc.edu/am/focus</link>
<language>en-us</language>
<copyright>Copyright 2005 University of Illinois</copyright>
<itunes:image href="http://www.will.uiuc.edu/am/focus/images/focuspodcast.jpg" />
<lastBuildDate>Fri, 23 Sep 2005 12:10:00 CST</lastBuildDate>
<pubDate>Fri, 23 Sep 2005 12:10:00 CST</pubDate>
<docs>http://blogs.law.harvard.edu/tech/rss</docs>
<webMaster>jackb@uiuc.edu</webMaster>
<item>
<title>The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy</title>
<link>http://will.uiuc.edu/am/focus</link>
<description>Interview with John Brady Kiesling, former U.S. diplomat</description>
<enclosure url="http://www.will.uiuc.edu/willmp3/focus050923a.mp3" length="24767532" type="audio/mpeg" />
<category>Current Events</category>
<pubDate>Fri, 23 Sep 2005 12:10:00 CST</pubDate>
</item>
</channel>
</rss>
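Because the feed is structured data, software can pull out the media file without a human reading the page. A minimal sketch of that idea, assuming Python's standard-library ElementTree parser and a trimmed copy of the feed above (the function name `list_enclosures` is made up for this illustration):

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the feed above; only the elements used below are kept.
FEED = """<rss version="2.0">
  <channel>
    <title>Focus 580 on WILL-AM</title>
    <item>
      <title>The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy</title>
      <enclosure url="http://www.will.uiuc.edu/willmp3/focus050923a.mp3" length="24767532" type="audio/mpeg" />
    </item>
  </channel>
</rss>"""


def list_enclosures(feed_xml):
    """Return one dict per <item> that carries an <enclosure>."""
    channel = ET.fromstring(feed_xml).find("channel")
    enclosures = []
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # an item without a media file
        enclosures.append({
            "title": item.findtext("title"),
            "url": enclosure.get("url"),
            "bytes": int(enclosure.get("length")),
            "mime_type": enclosure.get("type"),
        })
    return enclosures


for entry in list_enclosures(FEED):
    print(entry["title"], entry["url"], entry["bytes"])
```

This is roughly what a podcast client does with the feed: it finds each item, then follows the enclosure URL to the audio.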
30. as a Dublin Core record
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF SYSTEM "http://dublincore.org/documents/2002/07/31/dcmes-xml/dcmes-xml-dtd.dtd">
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description rdf:about="http://will.atlas.uiuc.edu/focus580/interview/focus050923a/">
<dc:title>
The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy
</dc:title>
<dc:creator>
WILL Public Media - http://will.illinois.edu
</dc:creator>
<dc:subject>
WILL; public affairs; public radio; interviews; talk show; NPR; Illinois; Indiana; University of Illinois; Iraq; United States; Foreign Policy
</dc:subject>
<dc:description>
Interview with John Brady Kiesling, former U.S. Diplomat
</dc:description>
<dc:publisher>
WILL-AM, University of Illinois
</dc:publisher>
<dc:contributor>
Jack Brighton, interviewer
</dc:contributor>
<dc:type>
Sound
</dc:type>
<dc:language>
en
</dc:language>
<dc:relation>
http://will.illinois.edu/media/focus050923a.mp3
</dc:relation>
<dc:rights>
© 2008 University of Illinois
</dc:rights>
</rdf:Description>
</rdf:RDF>
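A record like this can be read by any XML-aware tool. The sketch below, which assumes Python's standard-library ElementTree and a shortened copy of the record above, collects the Dublin Core elements into a dictionary. The helper name `dc_fields` is hypothetical; the two namespace URIs are the ones declared in the record.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the record above (three of its elements).
RECORD = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://will.atlas.uiuc.edu/focus580/interview/focus050923a/">
    <dc:title>
      The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy
    </dc:title>
    <dc:type>
      Sound
    </dc:type>
    <dc:language>
      en
    </dc:language>
  </rdf:Description>
</rdf:RDF>"""


def dc_fields(xml_text):
    """Map each Dublin Core element name to a list of its values."""
    root = ET.fromstring(xml_text)
    description = root.find(f"{{{RDF_NS}}}Description")
    fields = {}
    for child in description:
        # ElementTree tags look like "{namespace}localname".
        namespace, _, name = child.tag[1:].partition("}")
        if namespace == DC_NS:
            # Values are lists because DC elements are repeatable.
            fields.setdefault(name, []).append((child.text or "").strip())
    return fields


for name, values in dc_fields(RECORD).items():
    print(name, values)
```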
31. with more detail as a PBCore record:
<?xml version="1.0" encoding="UTF-8"?>
<PBCoreDescriptionDocument xmlns="http://www.pbcore.org/PBCore/PBCoreNamespace.html">
<pbcoreIdentifier>
<identifier>focus050923a</identifier>
<identifierSource>Illinois Public Media</identifierSource>
</pbcoreIdentifier>
<pbcoreTitle>
<title>The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy</title>
<titleType>Interview</titleType>
</pbcoreTitle>
<pbcoreSubject>
<subject>Iraq</subject>
<subjectAuthorityUsed>Illinois Public Media Subjects</subjectAuthorityUsed>
</pbcoreSubject>
<pbcoreSubject>
<subject>Insurgency--Iraq</subject>
<subjectAuthorityUsed>Library of Congress Subject Headings</subjectAuthorityUsed>
</pbcoreSubject>
<pbcoreDescription>
<description>WILL Radio interview with John Brady Kiesling, former U.S. diplomat, on “The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy,”</description>
<descriptionType>Abstract</descriptionType>
</pbcoreDescription>
<pbcoreCoverage>
<coverage>Iraq</coverage>
<coverageType>Spatial</coverageType>
</pbcoreCoverage>
<pbcoreCreator>
<creator>WILL-AM</creator>
<creatorRole>Creator</creatorRole>
</pbcoreCreator>
<pbcorePublisher>
<publisher>University of Illinois</publisher>
<publisherRole>Copyright Holder</publisherRole>
</pbcorePublisher>
<pbcoreRightsSummary>
<rightsSummary>Permission granted by the copyright holder to stream and download for non-profit and educational use.</rightsSummary>
</pbcoreRightsSummary>
<pbcoreInstantiation>
<pbcoreFormatID>
<formatIdentifier>focus050923a.mp3</formatIdentifier>
<formatIdentifierSource>Illinois Public Media</formatIdentifierSource>
</pbcoreFormatID>
<dateCreated>Fri, 23 Sep 2005 12:00:00 CST</dateCreated>
<dateIssued>Fri, 23 Sep 2005 12:10:00 CST</dateIssued>
<formatDigital>mp3</formatDigital>
<formatLocation>http://will.illinois.edu/media/focus050923a.mp3</formatLocation>
<formatMediaType>Sound</formatMediaType>
<formatGenerations>Copy: access</formatGenerations>
<formatDuration>00:52:08</formatDuration>
</pbcoreInstantiation>
</PBCoreDescriptionDocument>
44. “The opportunity before all of us is living up to the
dream of the Library of Alexandria and then taking it a
step further: Universal access to all knowledge.
Interestingly, it is now technically doable.”
Brewster Kahle, founder of the Internet Archive
48. Dublin Core
(ISO 15836)
The Dublin Core set of metadata elements provides a small and
fundamental group of text elements through which most resources
can be described and catalogued. Using only 15 base text fields, a
Dublin Core metadata record can describe physical resources such
as books, digital materials such as video, sound, image, or text files,
and composite media like web pages.
http://en.wikipedia.org/wiki/Dublin_Core
49. Simple Dublin Core
The Simple Dublin Core Metadata Element Set (DCMES) consists
of 15 metadata elements:
• Title
• Creator
• Subject
• Description
• Publisher
• Contributor
• Date
• Type
• Format
• Identifier
• Source
• Language
• Relation
• Coverage
• Rights
Reference: http://dublincore.org/documents/usageguide/elements.shtml
53. DC Structure:
• Resource: the media item in the abstract
• Manifestation: a physical or digital instance of the
resource
• A Resource can have one and only one
Manifestation
54. The One-to-One Principle
In general Dublin Core metadata describes one manifestation or
version of a resource, rather than assuming that manifestations
stand in for one another. For instance, a jpeg image of the Mona Lisa
has much in common with the original painting, but it is not the
same as the painting.
http://dublincore.org/documents/usageguide/
55. The One-to-One principle is inefficient and/or
messy when dealing with multiple manifestations of a
media resource.
Steven J. Miller, The One-To-One Principle: Challenges in Current
Practice
59. Other Dublin Core gaps
• DC has no place for genre
• DC says nothing about versioning or provenance
• DC says little about media format
60. PBCore
• built on Dublin Core
• created by the American public broadcasting community for use by public media
• funded by the Corporation for Public Broadcasting
pbcore.org
61. PBCore Strengths
• Simple like Dublin Core
• Improved for moving images
• Solves the one-to-one problem
• Strong on technical metadata
• Extensible
• User community growth
62. People using PBCore
• PBS & NPR & local stations
• Film Archives
• Northeast Historic Film
• Democracy Now!
• Fresh Air with Terry Gross
• WNYC
• The Dance Heritage Coalition
• Alliance for Community Media
• Open Media Project
• The Rock and Roll Hall of Fame
• The American Archive
• User community growing fast...
64. PBCore Versions
• 1.0 - first public release in 2005
• 1.1 - January 2007, minor fixes
• 1.2 - December 2008, added
pbcoreEssenceTrack
• 1.3 - August 2010, added the top-level
element pbcoreAssetType
• 2.0 - February 2011, many improvements
based on user-community input
• 2.1 - Hopefully soon
68. PBCore Structure:
• Asset: the media item in the abstract
• Instantiation: a physical or digital instance of the
asset
• A single Asset can have many Instantiations
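The one-to-many relationship can be sketched as a small data model. This is an illustration only, not PBCore's schema: the class and field names are invented, and the identifier `focus050923a.wav` for the broadcast master is an assumption. The locations and formats come from the earlier slides.

```python
from dataclasses import dataclass, field


@dataclass
class Instantiation:
    """One physical or digital instance of an asset."""
    identifier: str
    digital_format: str
    location: str


@dataclass
class Asset:
    """The media item in the abstract, with any number of instantiations."""
    identifier: str
    title: str
    instantiations: list = field(default_factory=list)

    def locations(self):
        """Answer the question 'Where is the media object?'"""
        return [inst.location for inst in self.instantiations]


asset = Asset(
    identifier="focus050923a",
    title="The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy",
)
# Broadcast master; the identifier is hypothetical.
asset.instantiations.append(
    Instantiation("focus050923a.wav", "broadcast wav", "Enco DAD cut #36465")
)
# Access copy.
asset.instantiations.append(
    Instantiation("focus050923a.mp3", "mp3", "http://will.illinois.edu/media/focus050923a.mp3")
)

print(asset.locations())
```

The descriptive metadata (title, subject, rights) is recorded once on the asset, while each copy carries only its own technical details.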
73. PBCore Instantiations
• Instantiations capture technical metadata for either
physical or digital media objects
• This is needed for provenance and preservation
• Also needed by software applications that handle
digital media
• Also important for answering the question “Where
is the media object?”
74. PBCore compared with Dublin Core
Dublin Core    PBCore
title          title
creator        creator
subject        subject
description    description
publisher      publisher
contributor    contributor
date           date
type           type
format         format
identifier     identifier
source         source
language       language
relation       relation
coverage       coverage
rights         rights
106. PBCore 1.x weaknesses
• No way to record separate dates for the Asset and its Instantiations
• No way to record information about multi-part Instantiations
• No way to record distinct rights information for different Instantiations
• No way to show relationships between Instantiations, e.g. that an MPEG4 file
was made from a master video file
• You can record that Harrison Ford is an Actor, but you can’t specify what role
he plays in the film
• You can’t identify clip information within an asset, e.g. where in the timeline a
subject or person occurs
• You can’t bundle together a collection of PBCore 1.x records for easy transfer
to other technical systems and collections
110. pbcoreIdentifier element:
PBCore 1.x
<pbcoreIdentifier>
<identifier>focus050923a</identifier>
<identifierSource>Illinois Public Media</identifierSource>
</pbcoreIdentifier>
PBCore 2.0
<pbcoreIdentifier source="Illinois Public Media"
ref="http://will.illinois.edu">focus050923a</pbcoreIdentifier>
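The change from child elements to attributes is mechanical, so it can be scripted. A minimal sketch, assuming Python's standard-library ElementTree; the function name `upgrade_identifier` is hypothetical, and the `ref` value has to be supplied by the cataloger because PBCore 1.x has no place to store it.

```python
import xml.etree.ElementTree as ET

OLD = """<pbcoreIdentifier>
<identifier>focus050923a</identifier>
<identifierSource>Illinois Public Media</identifierSource>
</pbcoreIdentifier>"""


def upgrade_identifier(old_xml, ref=None):
    """Rewrite a 1.x pbcoreIdentifier in the 2.0 attribute style."""
    old = ET.fromstring(old_xml)
    new = ET.Element("pbcoreIdentifier")
    # identifierSource becomes the source attribute.
    new.set("source", old.findtext("identifierSource", default=""))
    if ref:
        # ref is new in 2.0: a URI for the source.
        new.set("ref", ref)
    # The identifier itself becomes the element's text.
    new.text = old.findtext("identifier", default="")
    return ET.tostring(new, encoding="unicode")


print(upgrade_identifier(OLD, ref="http://will.illinois.edu"))
```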
111. Key Concept:
URI: Uniform Resource Identifier
a string of characters used to identify a name or a resource on the
Internet. Such identification enables interaction with representations of
the resource over a network (typically the World Wide Web) using
specific protocols.
114. pbcoreSubject element:
PBCore 1.x
<pbcoreSubject>
<subject>Insurgency--Iraq</subject>
<subjectAuthorityUsed>Library of Congress Subject Headings</subjectAuthorityUsed>
</pbcoreSubject>
PBCore 2.0
<pbcoreSubject source="Library of Congress Subject Headings"
ref="http://id.loc.gov/authorities/sh2008123892#concept">Insurgency--Iraq</pbcoreSubject>
115. PBCore 2.0 subject with time attributes
<pbcoreSubject source="Library of Congress Subject Headings"
ref="http://id.loc.gov/authorities/sh2008123892#concept"
startTime="00:23:14" endTime="00:27:28">Insurgency--Iraq</pbcoreSubject>
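With time attributes, an application can jump straight to the part of the program where a subject comes up. The sketch below assumes Python's standard-library ElementTree and hh:mm:ss timecodes as in the example above; `to_seconds` and `subject_segment` are made-up helper names.

```python
import xml.etree.ElementTree as ET

SUBJECT = """<pbcoreSubject source="Library of Congress Subject Headings"
ref="http://id.loc.gov/authorities/sh2008123892#concept"
startTime="00:23:14" endTime="00:27:28">Insurgency--Iraq</pbcoreSubject>"""


def to_seconds(timecode):
    """Convert an hh:mm:ss timecode to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timecode.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def subject_segment(subject_xml):
    """Return (subject, start, end) with the times in seconds."""
    element = ET.fromstring(subject_xml)
    start = to_seconds(element.get("startTime"))
    end = to_seconds(element.get("endTime"))
    return element.text, start, end


subject, start, end = subject_segment(SUBJECT)
print(f"{subject}: starts at {start} s, runs {end - start} s")
```

A media player could use the start offset to seek to the segment where the subject is discussed.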
128. pbcoreRightsSummary:
PBCore 1.x
<pbcoreRightsSummary>
<rightsSummary>Permission granted by the copyright holder to stream and download for
non-profit and educational use.</rightsSummary>
</pbcoreRightsSummary>
PBCore 2.0
<pbcoreRightsSummary>
<rightsSummary>Permission granted by the copyright holder to stream and download for non-profit
and educational use under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic (CC
BY-NC-SA 2.0)</rightsSummary>
<rightsLink>http://creativecommons.org/licenses/by-nc-sa/2.0/</rightsLink>
</pbcoreRightsSummary>
131. PBCore 2.0 Common Attributes
• source
• ref
• version
• annotation
• startTime
• endTime
• timeAnnotation
132. a PBCore 2.0 record:
<?xml version="1.0" encoding="UTF-8"?>
<PBCoreDescriptionDocument xmlns="http://www.pbcore.org/PBCore/PBCoreNamespace.html">
<pbcoreAssetType ref="http://pbcore.org/vocabularies/pbcoreAssetType#program">Program</pbcoreAssetType>
<pbcoreAssetDate>Fri, 23 Sep 2005 12:00:00 CST</pbcoreAssetDate>
<pbcoreIdentifier source="Illinois Public Media" ref="http://will.illinois.edu">focus050923a</pbcoreIdentifier>
<pbcoreTitle titleType="Main">The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy</pbcoreTitle>
<pbcoreSubject subjectType="topic" source="Illinois Public Media Subjects">Iraq</pbcoreSubject>
<pbcoreSubject subjectType="topic" source="Library of Congress Subject Headings" ref="http://id.loc.gov/authorities/sh2008123892#concept">Insurgency--Iraq</pbcoreSubject>
<pbcoreDescription descriptionType="Summary" ref="http://pbcore.org/vocabularies/pbcoreDescription/descriptionType#summary">WILL Radio interview with John Brady Kiesling, former U.S. diplomat, on “The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy”</pbcoreDescription>
<pbcoreGenre source="PBCore" ref="http://pbcore.org/vocabularies/pbcoreGenre#interview">Interview</pbcoreGenre>
<pbcoreCoverage>
<coverage source="ISO-3166" ref="http://www.geonames.org/countries/IQ/iraq.html">IRQ</coverage>
<coverageType>Spatial</coverageType>
</pbcoreCoverage>
<pbcoreCreator>
<creator affiliation="Illinois Public Media">Jack Brighton</creator>
<creatorRole>Interviewer</creatorRole>
</pbcoreCreator>
<pbcoreCreator>
<creator affiliation="University of Illinois" ref="http://will.illinois.edu/am">WILL-AM</creator>
<creatorRole>Production Unit</creatorRole>
</pbcoreCreator>
<pbcoreContributor>
<contributor ref="http://en.wikipedia.org/wiki/Brady_Kiesling">John Brady Kiesling</contributor>
<contributorRole source="PBCore" ref="http://pbcore.org/vocabularies/contributorRole#interviewee">Interviewee</contributorRole>
</pbcoreContributor>
<pbcorePublisher>
<publisher ref="http://illinois.edu">University of Illinois</publisher>
<publisherRole ref="http://pbcore.org/vocabularies/publisherRole#copyright-holder">Copyright Holder</publisherRole>
</pbcorePublisher>
<pbcoreRightsSummary>
<rightsSummary>Permission granted by the copyright holder to stream and download for non-profit and educational use under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic (CC BY-NC-SA 2.0)</rightsSummary>
<rightsLink>http://creativecommons.org/licenses/by-nc-sa/2.0/</rightsLink>
</pbcoreRightsSummary>
<pbcoreInstantiation>
<instantiationIdentifier source="Illinois Public Media" ></instantiationIdentifier>
<instantiationDate dateType="date created">Fri, 23 Sep 2005 12:00:00 CST</instantiationDate>
<instantiationDate dateType="date issued">Fri, 23 Sep 2005 12:10:00 CST</instantiationDate>
<instantiationDigital>mp3</instantiationDigital>
<instantiationLocation>http://will.illinois.edu/media/focus050923a.mp3</instantiationLocation>
<instantiationMediaType>Sound</instantiationMediaType>
<instantiationGenerations>Copy: access</instantiationGenerations>
<instantiationFileSize unitsOfMeasure="bytes">24767532</instantiationFileSize>
<instantiationDuration>00:52:08</instantiationDuration>
</pbcoreInstantiation>
</PBCoreDescriptionDocument>
138. Systems of meaning:
• Taxonomy: The use of Controlled Vocabularies
created and maintained by authorities, e.g.
Library of Congress Subject Headings, or
PBCore lists on the Open Metadata Registry
• Folksonomy: User tagging of content with
freestyle keywords for which no authority
beyond the user exists
• Taxonomy + Folksonomy = Powerful stuff