Presentation given at the Association of Moving Image Archivists Conference, November 14, 2009 in Savannah, GA. Part of the panel PBCore: What is it good for?
PBCore, METS, PREMIS, MODS, METSRights...oh my!
1. PBCore, METS, PREMIS, MODS, METSRights... oh my!
Kara van Malssen
Senior Research Fellow, NYU
Preserving Digital Public Television
AMIA 2008
2. A little bit about the Preserving Digital Public Television Project
GOALS:
• Identify at-risk born-digital public television content
• Build an OAIS-compliant prototype repository
• Explore and apply standards
• Create selection guidelines
• Research sustainability models, copyright encumbrances
3. Project Partners
• WNET
• WGBH
• PBS
• NYU
• Library of Congress
[Diagram legend: SIP site; Repository]
4. Submission Workflow
[Diagram labels: Producing Stations (Station A, Station B, Station C, WNET, WGBH); PBS; Satellite; Transmitting Stations (Station A through Station J); NYU PDPTV Prototype Repository]
5. NYU Goals:
• Create a prototype repository for long-term retention
• Aggregate content from partner stations + PBS for
sample programs
• Populate records with metadata that already
exists (in station databases, files, scheduling systems, etc.)
• Transform data and package content, while
preserving relationships between items
6. Important Vocabulary
OAIS Terms!
• The Repository: NYU prototype preservation repository
• OAIS: Open Archival Information System
• SIP: Submission Information Package
• AIP: Archival Information Package
7. Applying standards
• Normalize disparate metadata
• XML-based
• One uniform scheme
• Easier to manage over the long term
• Rules, vocabularies, schemas help
maintain consistency
8. Challenge of managing diverse SIPs:
• SIP Class 1: WNET National Broadcast (Nature)
• SIP Class 2: WGBH National Broadcasts
• SIP Class 3: WNET Local Broadcast (New York Voices)
• SIP Class 4: Religion and Ethics
[Diagram: each SIP class combines a different set of essence files (HD and SD broadcast masters and production masters, in mov/aiff/m2v, mov/data, mov, mxf, or mpeg), database exports (PODS, TEAMS, PRO TRACK, INMAGIC), and additional items (scripts, etc.)]
9. PDPTV metadata model
METS: Metadata Encoding and Transmission Standard
Structural and administrative
PBCore: Public Broadcasting Metadata Dictionary
Descriptive and technical
PREMIS: Preservation Metadata: Implementation Strategies
Technical preservation metadata
10. METS: Metadata Encoding and Transmission Standard
• Provides a structure to bundle all content (essence + metadata) in one AIP
• Identifies types of metadata, but not the terms to define them (with a few exceptions)
METS sections:
• dmdSec
• amdSec (techMD, rightsMD, sourceMD, digiprovMD)
• fileSec
• structMap
• behaviorSec
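As an illustration of how METS acts as the wrapper for an AIP, here is a minimal sketch that builds an empty METS skeleton with the sections named on this slide. The function name, the ID values, and the USE and TYPE attribute values are assumptions made for the example, not the project's actual conventions, and the result has not been validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def mets_tag(name):
    """Return a namespace-qualified METS tag name."""
    return f"{{{METS_NS}}}{name}"


def build_mets_skeleton(object_id):
    """Build an empty METS wrapper for one AIP (hypothetical helper)."""
    mets = ET.Element(mets_tag("mets"), {"OBJID": object_id})

    # Descriptive metadata section (would wrap MODS or PBCore).
    ET.SubElement(mets, mets_tag("dmdSec"), {"ID": "DMD1"})

    # Administrative metadata section with its four sub-sections.
    amd = ET.SubElement(mets, mets_tag("amdSec"), {"ID": "AMD1"})
    for name in ("techMD", "rightsMD", "sourceMD", "digiprovMD"):
        ET.SubElement(amd, mets_tag(name), {"ID": f"{name.upper()}1"})

    # File inventory: one group for the essence files (USE value assumed).
    file_sec = ET.SubElement(mets, mets_tag("fileSec"))
    ET.SubElement(file_sec, mets_tag("fileGrp"), {"USE": "essence"})

    # Structural map tying files together (TYPE values assumed).
    struct_map = ET.SubElement(mets, mets_tag("structMap"), {"TYPE": "physical"})
    ET.SubElement(struct_map, mets_tag("div"), {"TYPE": "program"})
    return mets


if __name__ == "__main__":
    skeleton = build_mets_skeleton("demo-aip-001")
    print(ET.tostring(skeleton, encoding="unicode"))
```

METS only supplies the containers here; the terms inside each section would come from PBCore, PREMIS, MODS, or METSRights.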
11. PBCore : What is it good for?
• Descriptive metadata elements that are
specific to public broadcasting
• Controlled vocabularies with broadcast terms
• Easy to map to from legacy station databases
• Granular technical metadata (PBCore 1.2)
➡ Accurately represents the file-specific metadata
➡ Can be auto-populated using technical metadata
extraction tools & stylesheets
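To make "easy to map to from legacy station databases" concrete, here is a minimal sketch of a field-by-field crosswalk. The legacy column names (PROG_TITLE and so on) are hypothetical, and the PBCore output is deliberately simplified: it follows the container-plus-value pattern of PBCore 1.x but omits the namespace and the type and role sub-elements, so it is not schema-valid.

```python
import xml.etree.ElementTree as ET

# Hypothetical legacy column name -> (PBCore container, value element).
FIELD_MAP = {
    "PROG_TITLE": ("pbcoreTitle", "title"),
    "PROG_DESC": ("pbcoreDescription", "description"),
    "PRODUCER": ("pbcoreCreator", "creator"),
}


def record_to_pbcore(record):
    """Convert one exported database row (a dict) to a simplified PBCore document."""
    doc = ET.Element("PBCoreDescriptionDocument")
    for column, (container, leaf) in FIELD_MAP.items():
        value = (record.get(column) or "").strip()
        if not value:
            continue  # skip empty or missing legacy fields
        wrapper = ET.SubElement(doc, container)
        ET.SubElement(wrapper, leaf).text = value
    return doc


if __name__ == "__main__":
    row = {"PROG_TITLE": "Example Program", "PROG_DESC": "", "PRODUCER": "Example Producer"}
    print(ET.tostring(record_to_pbcore(row), encoding="unicode"))
```

Keeping the crosswalk in a single table means each station database only needs its own FIELD_MAP, while the conversion code stays the same.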
12. PREMIS: Preservation Metadata: Implementation Strategies
Entities: Intellectual Entity, Object, Rights, Agents, Events
Object Entity:
• Creating application info
• Playback environment (hardware and software)
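As one example of a PREMIS entity, here is a minimal sketch of an Event record such as a fixity check. The element names follow the PREMIS 2.x data dictionary as best recalled; the namespace is omitted for brevity, and the identifier type "local" and the function name are assumptions for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def premis_event(event_id, event_type, outcome, when=None):
    """Build a simplified PREMIS event element (namespace omitted)."""
    when = when or datetime.now(timezone.utc)
    event = ET.Element("event")

    ident = ET.SubElement(event, "eventIdentifier")
    ET.SubElement(ident, "eventIdentifierType").text = "local"  # assumed type
    ET.SubElement(ident, "eventIdentifierValue").text = event_id

    ET.SubElement(event, "eventType").text = event_type
    ET.SubElement(event, "eventDateTime").text = when.isoformat()

    info = ET.SubElement(event, "eventOutcomeInformation")
    ET.SubElement(info, "eventOutcome").text = outcome
    return event


if __name__ == "__main__":
    evt = premis_event("evt-0001", "fixity check", "pass")
    print(ET.tostring(evt, encoding="unicode"))
```

In an AIP, records like this would typically sit inside the METS digiprovMD section.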
13. Issue of Redundancy between standards
[Venn diagram: METS, PBCore, and PREMIS overlap. Elements shown: Title, Creator, Description, Structure, Relationships, Agents, Rights, Checksums, File Format, File Size, Hardware, Software]
14. Putting it all together
[Venn diagram: METS, PBCore, and PREMIS, with MODS and METSRights added. Elements shown: Title, Creator, Description, Structure, Relationships, Agents, Rights, Checksums, File Format, File Size, Hardware, Software]
• Descriptive elements only map to MODS
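The "descriptive elements only map to MODS" idea can be sketched as a small crosswalk that copies title, creator, and description and ignores everything technical. The input uses a simplified, namespace-free PBCore layout (an assumption for this example), and the choice of MODS targets (titleInfo/title, name/namePart, abstract) is one plausible mapping, not necessarily the one the project used.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)


def mods_tag(name):
    """Return a namespace-qualified MODS tag name."""
    return f"{{{MODS_NS}}}{name}"


def pbcore_to_mods(pbcore_doc):
    """Copy only descriptive PBCore elements into a new MODS record."""
    mods = ET.Element(mods_tag("mods"))

    title = pbcore_doc.findtext("pbcoreTitle/title")
    if title:
        info = ET.SubElement(mods, mods_tag("titleInfo"))
        ET.SubElement(info, mods_tag("title")).text = title

    for creator in pbcore_doc.findall("pbcoreCreator/creator"):
        name = ET.SubElement(mods, mods_tag("name"))
        ET.SubElement(name, mods_tag("namePart")).text = creator.text

    description = pbcore_doc.findtext("pbcoreDescription/description")
    if description:
        ET.SubElement(mods, mods_tag("abstract")).text = description

    # Technical metadata (e.g. pbcoreInstantiation) is intentionally not copied.
    return mods


if __name__ == "__main__":
    sample = ET.fromstring(
        "<PBCoreDescriptionDocument>"
        "<pbcoreTitle><title>Example Program</title></pbcoreTitle>"
        "<pbcoreCreator><creator>Example Producer</creator></pbcoreCreator>"
        "<pbcoreInstantiation><formatFileSize>123</formatFileSize></pbcoreInstantiation>"
        "</PBCoreDescriptionDocument>"
    )
    print(ET.tostring(pbcore_to_mods(sample), encoding="unicode"))
```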
17. AIP creation simplified
1. Content submitted, verified
2. METS automatically generated (checksums
into METS attributes)
3. Source database exports automatically
converted to PBCore
4. Technical metadata extracted from files using
MediaInfo, converted to PBCore
5. MODS created from completed PBCore
6. Rights metadata (METSRights), preservation
metadata (PREMIS) created
7. AIP complete
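Step 2 above (checksums into METS attributes) can be sketched as follows. The slides do not say which checksum algorithm was used, so SHA-256 is an assumption here, as are the function names and the ID value; the METS namespace and the FLocat child are omitted for brevity.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large video files are never read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mets_file_entry(path, file_id):
    """Build a simplified METS <file> element carrying checksum and size attributes."""
    path = Path(path)
    return ET.Element("file", {
        "ID": file_id,
        "CHECKSUM": sha256_of(path),
        "CHECKSUMTYPE": "SHA-256",  # algorithm choice is an assumption
        "SIZE": str(path.stat().st_size),
    })


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.mov"
        sample.write_bytes(b"not a real video")
        print(ET.tostring(mets_file_entry(sample, "FILE1"), encoding="unicode"))
```

Recording the checksum at submission time gives later fixity checks a value to compare against.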
18. AIPs:
• AIP Class 1: Nationally distributed content (Nature)
• AIP Class 4: Religion and Ethics
[Diagram: each AIP holds essence file types (HD and SD broadcast masters and production masters, in mov/aiff/m2v, mov/data, mov, mxf, or mpeg), metadata (METS, PBCore, PREMIS, METSRights, MODS, original database exports), and additional items (scripts, etc.)]
19. Challenge of managing diverse SIPs:
• SIP Class 1: WNET National Broadcast (Nature)
• SIP Class 2: WGBH National Broadcasts
• SIP Class 3: WNET Local Broadcast (New York Voices)
• SIP Class 4: Religion and Ethics
[Diagram: each SIP class combines a different set of essence files (HD and SD broadcast masters and production masters, in mov/aiff/m2v, mov/data, mov, mxf, or mpeg), database exports (PODS, TEAMS, PRO TRACK, INMAGIC), and additional items (scripts, etc.)]