Metadata for Audiovisual Materials and its Role in Digital Projects
Jenn Riley
Metadata Librarian
Indiana University
Digital Library Program
OLAC/MOUG 2008, September 26 and 27, 2008

What we’re going to cover
• A lot! Get ready for a whirlwind tour.
• For many different metadata formats
▫ Brief introduction
▫ What it is for
▫ When is a good time to use it
▫ Usually an example

• Images, audio, and video
▫ Maps and other formats have their own standards too!

• We’ll focus mostly on standards cultural heritage
institutions use, and less on “industry” standards
Brief introduction to XML and types of metadata

Purpose
• XML = eXtensible Markup Language
• “Meta-language” for defining markup languages
for specific purposes
• Many metadata formats cultural heritage
institutions use are encoded in XML
• Specific XML languages can be defined in
several ways:
▫ DTD
▫ W3C XML Schema
▫ RELAX NG

XML terminology
• Element
▫ Often loosely called a “tag” (strictly, tags are the markers that open and close an element)
▫ Element name surrounded by angle brackets, e.g., <titleInfo>
▫ “Opens” with <titleInfo> and “closes” with </titleInfo>

• Attribute
▫ Name/value pair that applies to the element and its content
▫ Included within the opening tag, e.g., <titleInfo type="alternative">

All elements must be closed
• YES:
<title>Title of a Work</title>
<subtitle>And its Subtitle</subtitle>
• NO:
<title>Title of a Work
<subtitle>And its Subtitle

Elements must be properly nested
• YES:
<titleInfo>
<title>Spring and fall</title>
</titleInfo>
• NO:
<titleInfo>
<title>Spring and fall</titleInfo>
</title>
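
The two rules above (every element closed, elements properly nested) are what an XML parser checks before anything else. Here is a minimal sketch in Python, assuming the standard-library `xml.etree.ElementTree` parser. The helper name `is_well_formed` is made up for this illustration, and the test strings are condensed from the YES/NO examples on the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as one well-formed XML element."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Condensed from the slide examples
closed_and_nested = "<titleInfo><title>Spring and fall</title></titleInfo>"
unclosed = "<titleInfo><title>Spring and fall</titleInfo>"
badly_nested = "<titleInfo><title>Spring and fall</titleInfo></title>"

for label, text in [
    ("closed and nested", closed_and_nested),
    ("unclosed", unclosed),
    ("badly nested", badly_nested),
]:
    print(label, "->", is_well_formed(text))
```

Well-formedness is only the first hurdle. Checking a document against a DTD, W3C XML Schema, or RELAX NG definition is a separate validation step that this sketch does not attempt.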

Element content
• (What’s between the open and close tags)
• Text
<title>Spring and fall</title>

• Other elements

<titleInfo>
<title>Spring and fall</title>
<subTitle>a tone poem</subTitle>
</titleInfo>

• Both (mixed content)

<something>some text, <otherthing>other text</otherthing></something>

• Empty elements

<tableOfContents xlink:href=
"http://www.loc.gov/catdir/toc/99176484.html"/>
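
The four kinds of element content can also be told apart programmatically. The sketch below assumes Python's standard-library `xml.etree.ElementTree`. The function `describe_content` is a hypothetical helper, and the `xmlns:xlink` declaration is added here because a parser needs it even though the slide fragment leaves it out.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"


def describe_content(elem: ET.Element) -> str:
    """Classify an element as 'text', 'elements', 'mixed', or 'empty'."""
    # Text can sit directly inside the element or after any child (its "tail")
    has_text = bool((elem.text or "").strip()) or any(
        (child.tail or "").strip() for child in elem
    )
    has_children = len(elem) > 0
    if has_text and has_children:
        return "mixed"
    if has_children:
        return "elements"
    if has_text:
        return "text"
    return "empty"


title = ET.fromstring("<title>Spring and fall</title>")
title_info = ET.fromstring(
    "<titleInfo><title>Spring and fall</title>"
    "<subTitle>a tone poem</subTitle></titleInfo>"
)
mixed = ET.fromstring(
    "<something>some text, <otherthing>other text</otherthing></something>"
)
toc = ET.fromstring(
    f'<tableOfContents xmlns:xlink="{XLINK_NS}" '
    'xlink:href="http://www.loc.gov/catdir/toc/99176484.html"/>'
)

for elem in (title, title_info, mixed, toc):
    print(elem.tag, "->", describe_content(elem))

# An empty element can still carry information in its attributes
print(toc.get(f"{{{XLINK_NS}}}href"))
```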

Types of metadata
• Descriptive metadata
• Administrative metadata
▫ Technical metadata
▫ Preservation metadata
▫ Rights metadata

• Structural metadata
• Markup languages

How metadata is used


Levels of control
• Three general types of standards, as viewed by
libraries
▫ Data structure standards (e.g., MARC)
▫ Data content standards (e.g., AACR2r)
▫ Controlled vocabularies (e.g., LCSH)

• Mix and match to meet your needs
• Dividing lines not always clear, however
• We’ll be talking about data structure standards
today
General descriptive metadata standards

MARC
• Implementation of ISO 2709, ANSI/NISO Z39.2
• Originally released in the late 1960s
• MARC21 is the format used in the U.S.
▫ Other areas have other ISO 2709 implementations,
e.g., UNIMARC

• “Format integration” in the first half of the 1990s
• Typically used with AACR2, ISBD punctuation, and
LCSH, but this is not a requirement
• Use when you want integration of content into the
OPAC interface

MARC example
• This is actually a “human-readable” view of this
record, not its native storage format
• Notice
▫ 3-digit data fields
▫ Subfields introduced by $ (also sometimes
rendered as | or ‡)
▫ Indicators providing information about how to
interpret the data in the field

• Mixture of machine-readable and human-readable data
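
To make the tag, indicator, and subfield structure concrete, here is a rough Python sketch that splits one line of a human-readable MARC display into its parts. The sample line is invented for illustration and is not the record shown on the slide. The layout it assumes (three-digit tag, a space, two indicator characters, then `$`-delimited subfields) is only one of several display conventions, and the function name `parse_marc_line` is hypothetical.

```python
def parse_marc_line(line: str):
    """Split a display line like '245 10 $a ... $b ...' into its parts.

    Assumes: tag in columns 0-2, indicators in columns 4-5, and '$' used
    only as the subfield delimiter (never inside the data itself).
    """
    tag = line[:3]
    indicators = line[4:6]
    subfields = []
    # The first piece before the first '$' is just padding, so skip it
    for chunk in line[6:].split("$")[1:]:
        code, value = chunk[0], chunk[1:].strip()
        subfields.append((code, value))
    return tag, indicators, subfields


# Hypothetical title field: tag 245, indicators "1" and "0"
sample = "245 10 $a Spring and fall : $b a tone poem"
tag, indicators, subfields = parse_marc_line(sample)
print(tag, indicators, subfields)
```

The trailing " :" in subfield a is ISBD punctuation carried inside the data, which is one instance of the human-readable conventions mixed into the machine-readable structure.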

MARCXML
• Exact rendering of MARC in XML
• Generally used as interim step between MARC
and some other XML-based format
▫ Not intended to be generated directly by people

• Notice in the example
▫ Verbose syntax (only a small portion of the record
is represented here)
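
The verbosity is easy to see when a single field is generated in code. This sketch uses Python's standard-library `xml.etree.ElementTree` and the MARCXML namespace `http://www.loc.gov/MARC21/slim`. The field content is invented for illustration, and the helper name `datafield` is an assumption of this example. A real record would also carry a leader and control fields.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)


def datafield(tag, ind1, ind2, subfields):
    """Build one MARCXML <datafield> from a tag, two indicators, and subfields."""
    field = ET.Element(
        f"{{{MARC_NS}}}datafield", {"tag": tag, "ind1": ind1, "ind2": ind2}
    )
    for code, value in subfields:
        sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        sub.text = value
    return field


record = ET.Element(f"{{{MARC_NS}}}record")
# Hypothetical 245 (title) field
record.append(
    datafield("245", "1", "0", [("a", "Spring and fall :"), ("b", "a tone poem")])
)

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Note that the element names say nothing about meaning. The semantics still live in the numeric tag and the subfield codes, exactly as in MARC itself.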

Metadata Object Description Schema (MODS)
• Developed and maintained by the LC Network
Development and MARC Standards Office
• Inspired by MARC, but not equivalent
• Intended to be useful to a wider audience than
MARC
• Still a “bibliographic” focus
• Use when you want a library-type approach but
more interoperability than MARC and the
benefits of XML

MODS example
• Textual element names
• General MARC inspiration
• AACR2 used in this example, but not required by
MODS
• Fairly extensive scope
• But still “library-ish”
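
The "textual element names" and "MARC inspiration" points can be illustrated with a tiny crosswalk. The Python sketch below maps the $a and $b subfields of a MARC 245 field to MODS `title` and `subTitle` inside `titleInfo`. It is a deliberate simplification and not the full LC mapping: the function name `marc245_to_mods`, the sample data, and the choice to strip trailing ISBD punctuation are all assumptions of this example.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# MARC 245 subfield code -> MODS element name inside <titleInfo>
TITLE_SUBFIELDS = {"a": "title", "b": "subTitle"}


def marc245_to_mods(subfields):
    """Turn [(code, value), ...] from a MARC 245 field into a MODS <titleInfo>."""
    title_info = ET.Element(f"{{{MODS_NS}}}titleInfo")
    for code, value in subfields:
        name = TITLE_SUBFIELDS.get(code)
        if name is None:
            continue  # this sketch ignores every other subfield
        child = ET.SubElement(title_info, f"{{{MODS_NS}}}{name}")
        # Assumption: trailing ISBD punctuation is not wanted in the MODS value
        child.text = value.rstrip(" :/")
    return title_info


title_info = marc245_to_mods([("a", "Spring and fall :"), ("b", "a tone poem")])
print(ET.tostring(title_info, encoding="unicode"))
```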

Dublin Core
• Perhaps the most misunderstood metadata
standard!
• Dublin Core Metadata Element Set (DCMES)
▫ ANSI/NISO Z39.85, ISO 15836
▫ No element required
▫ All elements repeatable
▫ 1:1 principle

• Abstract Model is current focus

Dublin Core Metadata Element Set
• Unqualified – 15 elements
▫ This is the format most think of as “Dublin Core”

• Qualified
▫ Additional elements
▫ Element refinements
▫ Encoding schemes (vocabulary and syntax)
▫ All qualifiers must follow “dumb-down” principle
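
The "dumb-down" principle says that a consumer who ignores a qualifier should still be left with a correct, if less precise, statement. A minimal Python sketch of that idea for element refinements follows. The `REFINES` table lists a few refinements from the DCMI terms with the elements they refine, the sample record is invented, and the function name `dumb_down` is hypothetical. Additional elements and encoding schemes are not handled here.

```python
# A few DCMI element refinements and the unqualified element each refines
REFINES = {
    "alternative": "title",
    "abstract": "description",
    "tableOfContents": "description",
    "created": "date",
    "issued": "date",
    "extent": "format",
    "spatial": "coverage",
    "temporal": "coverage",
    "isPartOf": "relation",
}


def dumb_down(statements):
    """Replace each refinement with the element it refines; keep values as-is."""
    return [(REFINES.get(term, term), value) for term, value in statements]


# Hypothetical qualified record as (term, value) pairs
qualified = [
    ("title", "Spring and fall"),
    ("alternative", "Spring & fall"),
    ("created", "2008-09-26"),
    ("extent", "1 sound disc"),
]

simple = dumb_down(qualified)
for term, value in simple:
    print(term, ":", value)
```

The result repeats `title`, which is fine because all elements are repeatable. What is lost is the knowledge that the second title is an alternative one and that the date is specifically a creation date.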

Uses of DCMES
• “Core” across all knowledge domains
• Unqualified DC required for sharing metadata
via the Open Archives Initiative
• Generally used as format for sharing metadata
with others
• QDC occasionally used as a native metadata
format
▫ CONTENTdm
▫ DSpace

Dublin Core examples
• Relative simplicity of the formats
• QDC allows the specification of source
vocabulary, more specific element meanings
• These records generated via standard mappings
from MARC
▫ Obviously the mappings need some work
▫ But that doesn’t mean the target formats aren’t
useful!

• Remember, every format has its purpose
Still image descriptive metadata

Visual Resources Association Core Categories (VRA Core)
• Designed by visual resources specialists
• Distinguishes between collection, work, and
image
• Focus on creation, style, culture
• Best used on collections of reproductions of
works of art & architecture
• No infrastructure yet for easy sharing of work
records

VRA Core example
• Work and image in separate records
• Image record describes a digitized photograph of
an architectural site
• Separate elements for display and indexing
values
• Use of controlled vocabularies
• Connections to research relevant to the work

Categories for the Description of Works of Art (CDWA) Lite
• Version of the full CDWA, intended to help
museums share metadata about their collections
• Strong museum, curatorial focus
• Strong on culture, physical location
• Meant to describe original works, not surrogates
or reproductions
• Best used for unique materials owned and
managed by your institution

CDWA Lite example
• Separate elements for display and indexing
values
• Physical dimensions
• Current repository and provenance
• Inscription information
Music descriptive metadata

Different landscape for music than images
• No discipline-generated format has emerged
• Do we need one?
• Industry is a strong influence in this community
• “Music” is almost impossibly diverse
▫ Different cultures, traditions
▫ Different formats (sound, notation, visual + audio)
▫ Quickly changing environment

Some music metadata formats
• Variations2 – Indiana University
• Probado – Bavarian State Library
• Music Ontology – Music Information Retrieval
community
• ID3 tags - Industry
Overall, only very specialized applications choose
these over a format-neutral option.
Other “media” metadata standards

MPEG-7
• “Multimedia Content Description Interface”
• ISO/IEC standard
• From the Moving Picture Experts Group, which
is behind the MPEG-1 and MPEG-2 multimedia
content formats, and the MPEG-21 Multimedia
Framework
• Descriptions can be expressed in XML or
compressed binary form

Framework rather than element set
• “Description Definition Language”
▫ Based on W3C XML Schema
▫ Defines “description schemes”

• Pre-defined description schemes for video and audio
• Focus is more on “low-level” descriptors than
library-style bibliographic information
• Would preserve MPEG-7 information when
generated by an editing application
• Unlikely a library would choose it as a format for
descriptive metadata to support discovery

MPEG-7 scope
• Wide scope – intended to cover descriptive,
technical, rights, use, etc., information
• Many media formats
▫ Still pictures
▫ Graphics
▫ 3D models
▫ Audio
▫ Speech
▫ Video
▫ “Scenarios” combining these elements

• Note technical details of the audio waveform in the
example

Public Broadcasting Core (PB Core)
• Development funded by the Corporation for
Public Broadcasting
• Data to support the creation, management, and
discovery of “media items”
• 4 classes
▫ IntellectualContent
▫ IntellectualProperty
▫ Instantiation
▫ Extensions

• Likely the best choice for broadcasting archives

PB Core example
• Common descriptive information such as title,
subject, genre
• Audience level and rating
• Rights information
• Separates “instantiation” from intellectual
content
Technical and administrative metadata for A/V materials

Metadata for Images in XML (MIX)
• Implementation in XML of ANSI/NISO Z39.87
data dictionary
• Maintained by the Library of Congress Network
Development and MARC Standards Office
• Technical information needed to render the
image and data on how it was created
• Use for any still image format; most can be
generated automatically
• Note features such as compression level, pixel
dimensions, format-specific data, and bit rate

AES Core Audio
• Currently under development by the Audio
Engineering Society, not yet in general release
• Divides audio into face->region->stream
• Can be used for both analog and digital audio
• Use for any audio file; most can be generated
automatically
• Expectation is that most audio editing software
will be able to generate this format
• Note duration, sample rate, channel assignments

LC A/V Prototyping Project Audio (Source) Data Dictionary
• Developed in 2003
• Never implemented in a production
environment
• Use AES Core Audio instead when you can
▫ This is probably a reasonable choice in the
meantime

• Note encoding, duration, sample size, channel
information

LC A/V Prototyping Project VIDEOMD Data Dictionary
• Developed in 2003
• Never implemented in a production environment
• Just video information; assumes separate format for
the audio track
• Use if you can; no tools to create it for you
• This type of data stored internally in most video
editing software, but no real shared export formats
• Be on the lookout for new developments
• Note duration, sample rate, physical tape
characteristics, frame size/rate

AES Process History Metadata
• Currently under development by the Audio
Engineering Society, not yet in general release
• Records “processing events”
• Detailed information about device settings, signal
patches
• Used to support the digital preservation process
• Use for any audio file; most can be generated
automatically
• Expectation is that most audio editing software will
be able to generate this format
• Note device data, input/output channels, patch list
Structural metadata

Metadata Encoding and Transmission Standard (METS)

• “Wrapper” to package many types of metadata
together for a resource
• Structural metadata is its heart
• Expectation is that METS documents will be
generated programmatically
• Not many METS generation tools out there,
though
• Often used for exchange of data between
repositories, and for ingest into and export out
of a repository

METS example
• This example shows an “audio preservation
package”
▫ Collection-level descriptive metadata in MARCXML
▫ AES Core Audio technical metadata for analog source
and various digitized versions
▫ Audio decision lists
▫ AES Process History
▫ Audio and ADL files
▫ Structural information
▪ Relationships between different versions
▪ Milestones on the audio timeline
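
Since METS documents are expected to be generated programmatically, a skeleton generator is a natural illustration. The Python sketch below builds only a file section and a structural map that points at the listed files. The file IDs, URLs, and labels are invented, the helper names `m` and `build_package` are hypothetical, and a real audio preservation package would add the descriptive and administrative sections listed above.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(name: str) -> str:
    """Qualify a local element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"


def build_package(files):
    """Build a bare METS tree from a list of (file_id, url, label) tuples."""
    root = ET.Element(m("mets"))
    file_grp = ET.SubElement(ET.SubElement(root, m("fileSec")), m("fileGrp"))
    top_div = ET.SubElement(
        ET.SubElement(root, m("structMap")),
        m("div"),
        {"TYPE": "audio preservation package"},
    )
    for file_id, url, label in files:
        # Inventory entry: where the file lives
        file_el = ET.SubElement(file_grp, m("file"), {"ID": file_id})
        ET.SubElement(
            file_el, m("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url}
        )
        # Structure entry: one division per version, pointing back at the file
        version_div = ET.SubElement(top_div, m("div"), {"LABEL": label})
        ET.SubElement(version_div, m("fptr"), {"FILEID": file_id})
    return root


# Hypothetical files for one recording
package = build_package(
    [
        ("FILE1", "http://example.org/audio/preservation.wav", "Preservation master"),
        ("FILE2", "http://example.org/audio/delivery.mp3", "Delivery copy"),
    ]
)
print(ET.tostring(package, encoding="unicode"))
```

The split between the file inventory and the structural map is the point: the structural map says how the pieces relate, and it refers to files only by ID.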

SMPTE Material eXchange Format (MXF)
• Actually a family of standards
• Wrapper for metadata and media files
(“essence”)
• Industry-driven format designed for
interoperability between devices
• Low-level feature information
• Generated by media editing software
• Example shows part of a header and references
to essence files

Synchronized Multimedia Integration Language (SMIL)
• From the W3C, the body behind HTML and
XML
• For multimedia presentations
• Embedded media, transitions, timing
• Most media players support SMIL
• Note examples showing images in sequence and
in parallel
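
The difference between "in sequence" and "in parallel" shows up in how long a presentation runs: children of `seq` play one after another, while children of `par` play at the same time. The Python sketch below computes a total running time from a small SMIL fragment. The fragment and its file names are invented, the namespace is omitted for brevity, and `duration_seconds` is a hypothetical helper. It handles only `dur` values given in seconds with an `s` suffix and assumes a `par` block lasts as long as its longest child.

```python
import xml.etree.ElementTree as ET

# Hypothetical presentation: two slides in sequence, then a slide with narration
SMIL_DOC = """
<smil>
  <body>
    <seq>
      <img src="slide1.jpg" dur="5s"/>
      <img src="slide2.jpg" dur="5s"/>
      <par>
        <img src="slide3.jpg" dur="8s"/>
        <audio src="narration.mp3" dur="10s"/>
      </par>
    </seq>
  </body>
</smil>
""".strip()


def duration_seconds(elem: ET.Element) -> float:
    """Running time of a seq, par, or media element, in seconds."""
    if elem.tag == "seq":
        # Sequential children: durations add up
        return sum(duration_seconds(child) for child in elem)
    if elem.tag == "par":
        # Parallel children: the longest one determines the length
        return max((duration_seconds(child) for child in elem), default=0.0)
    # Media element: read its dur attribute, e.g. "5s"
    return float(elem.get("dur", "0s").rstrip("s"))


root = ET.fromstring(SMIL_DOC)
total = duration_seconds(root.find("body/seq"))
print(total)
```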

AES-31-3 Audio Decision List
• Used by editing software to record edits made to
audio files
• Text-based format that looks like XML in places
• Documents how files are stitched together to
create the output
• Uses a common “destination timeline” for all
files
• Non-standard extension for “markers” in
WaveLab
• Note in/out fade, “cuelist”
Music markup languages

Content, not “metadata”
• For encoding musical notation itself - the full
content
• Tend to include “header” with some descriptive
metadata
• Currently, two primary choices
▫ MusicXML
▪ Focus on industry, notation software
▫ Music Encoding Initiative (MEI)
▪ Inspired by the Text Encoding Initiative (TEI)
Implementation scenarios

Scenario 1: Audio/video course reserves
• Discovery
▫ MARC/AACR2 records in OPAC
▫ Course reserves module with descriptive data
extracted from MARC records
▫ Link from discovery system launches media player

• Delivery
▫ Locally-managed media streaming server
▫ (Optional) SMIL for navigation

Scenario 2: Digital music library
• High-end, specialized, online environment for music in a
variety of formats
• Work-based metadata model such as Variations2
optimized for music discovery
• Descriptive metadata records persistently link to media
files in tools that facilitate use of the content
• METS-based structural metadata for navigation within
and between media files
• Various forms of technical and administrative metadata
for long-term preservation of media files

Scenario 3: Broadcast archive
• Focus on management of media; discovery only
for staff and not for end-users
• PB Core as base metadata
• High-end media editing software generates AES,
MXF, other industry standard technical
metadata
• METS wrapper for connecting PB Core data to
structural and technical metadata for ingest into
preservation repository

Scenario 4: Online special collections
• Discovery
▫ MODS for item-level description of a variety of
formats (letters, photographs, oral histories)

• Delivery
▫ METS for structural data for multi-page objects
▫ Online page-turning interface
▫ PDF download

• Commonly used software such as CONTENTdm
does much of this in its own quirky way – we need
to keep pushing for system adherence to standards!

Thank you!
• jenlrile@indiana.edu
• These presentation slides: http://www.dlib.indiana.edu/~jenlrile/presentations/olac2008/olac.ppt
• Workshop handout: http://www.dlib.indiana.edu/~jenlrile/presentations/olac2008/handout.pdf

Metadata for Audiovisual Materials and its Role in Digital Projects

  • 1. Metadata for Audiovisual Materials and its Role in Digital Projects Jenn Riley Metadata Librarian Indiana University Digital Library Program
  • 2. 2 OLAC/MOUG 2008 September 26 and 27, 2008 What we’re going to cover • A lot! Get ready for a whirlwind tour. • For many different metadata formats ▫ ▫ ▫ ▫ Brief introduction What it is for When is a good time to use it Usually an example • Images, audio, and video ▫ Maps and other formats have their own standards too! • We’ll focus mostly on standards cultural heritage institutions use, and less on “industry” standards
  • 3. Brief introduction to XML and types of metadata
  • 4. 4 OLAC/MOUG 2008 September 26 and 27, 2008 Purpose • XML = eXtensible Markup Language • “Meta-language” for defining markup languages for specific purposes • Many metadata formats cultural heritage institutions use are encoded in XML • Specific XML languages can be defined in several ways: ▫ DTD ▫ W3C XML Schema ▫ RELAX NG
  • 5. 5 OLAC/MOUG 2008 September 26 and 27, 2008 XML terminology • Element ▫ ▫ ▫ • Also called a “tag” Element name surrounded by brackets, e.g., <titleInfo> “Opens” <titleInfo> and “closes” </titleInfo> Attribute ▫ ▫ Name/value pair that applies to the element and its content Included within the text in brackets, e.g., <titleInfo type="alternative">
  • 6. 6 OLAC/MOUG 2008 September 26 and 27, 2008 All elements must be closed • YES: <title>Title of a Work</title> <subtitle>And its Subtitle</subtitle> • NO: <title>Title of a Work <subtitle>And its Subtitle
  • 7. 7 OLAC/MOUG 2008 September 26 and 27, 2008 Elements must be properly nested • YES: <titleInfo> <title>Spring and fall</title> </titleInfo> • NO: <titleInfo> <title>Spring and fall</titleInfo> </title>
  • 8. 8 OLAC/MOUG 2008 September 26 and 27, 2008 Element content • (What’s between the open and close tags) • Text <title>Spring and fall</title> • Other elements <titleInfo> <title>Spring and fall</title> <subTitle>a tone poem</subTitle> </titleInfo> • Both (mixed content) <something>some text, <otherthing>other text</otherthing></something> • Empty elements <tableOfContents xlink:href= "http://www.loc.gov/catdir/toc/99176484.html"/>
  • 9. 9 OLAC/MOUG 2008 Types of metadata • Descriptive metadata • Administrative metadata ▫ Technical metadata ▫ Preservation metadata ▫ Rights metadata • Structural metadata • Markup languages September 26 and 27, 2008
  • 10. 10 OLAC/MOUG 2008 How metadata is used September 26 and 27, 2008
  • 11. 11 OLAC/MOUG 2008 September 26 and 27, 2008 Levels of control • Three general types of standards, as viewed by libraries ▫ Data structure standards (e.g., MARC) ▫ Data content standards (e.g., AACR2r) ▫ Controlled vocabularies (e.g., LCSH) • Mix and match to meet your needs • Dividing lines not always clear, however • We’ll be talking about data structure standards today
  • 13. MARC • Implementation of ISO 2709, ANSI/NISO Z39.2 • Originally released in the late 1960s • MARC21 is the format used in the U.S. ▫ Other areas have other ISO 2709 implementations, e.g., UNIMARC • “Format integration” in the first half of the 1990s • Typically used with AACR2, ISBD punctuation, and LCSH, but this is not a requirement • Use when you want integration of content into the OPAC interface
  • 14. MARC example • This is actually a “human-readable” view of this record, not its native storage format • Notice ▫ 3-digit data fields ▫ Subfields introduced by $ (also sometimes rendered as | or ‡) ▫ Indicators providing information about how to interpret the data in the field • Mixture of machine-readable and human-readable data
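
The three features noted above (3-digit tag, indicators, delimited subfields) can be pulled apart mechanically. A rough sketch, assuming a hypothetical display layout of `TAG II $a ...` with the two indicators in columns 5 and 6; real systems render these views differently, and the sample line is made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class DataField:
    tag: str                           # 3-digit field tag, e.g. "245"
    ind1: str                          # first indicator
    ind2: str                          # second indicator
    subfields: list[tuple[str, str]]   # (code, value) pairs, in record order


def parse_display_line(line: str, delimiter: str = "$") -> DataField:
    """Parse one data-field line of the assumed 'TAG II $a ...' display layout."""
    tag, indicators, rest = line[:3], line[4:6], line[6:]
    if not tag.isdigit():
        raise ValueError(f"expected a 3-digit tag, got {tag!r}")
    subfields = []
    # Each chunk after a delimiter starts with the one-character subfield code.
    for chunk in rest.split(delimiter)[1:]:
        if chunk:
            subfields.append((chunk[0], chunk[1:].strip()))
    return DataField(tag, indicators[0], indicators[1], subfields)


field = parse_display_line("245 10 $a Spring and fall : $b a tone poem.")
print(field)
```

Passing `delimiter="|"` or `delimiter="‡"` handles the other renderings of the subfield delimiter mentioned above.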
  • 15. MARCXML • Exact rendering of MARC in XML • Generally used as interim step between MARC and some other XML-based format ▫ Not intended to be generated directly by people • Notice in the example ▫ Verbose syntax (only a small portion of the record is represented here)
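
To show why MARCXML is verbose and usually machine-generated, here is a small sketch that writes one data field using the MARCXML `datafield`/`subfield` elements and the `http://www.loc.gov/MARC21/slim` namespace. The function name and the sample field values are illustrative assumptions, not part of the original example.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)


def datafield_to_marcxml(tag, ind1, ind2, subfields):
    """Build a MARCXML <datafield> from a tag, two indicators, and (code, value) pairs."""
    field = ET.Element(f"{{{MARC_NS}}}datafield", {"tag": tag, "ind1": ind1, "ind2": ind2})
    for code, value in subfields:
        sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        sub.text = value
    return field


element = datafield_to_marcxml(
    "245", "1", "0", [("a", "Spring and fall :"), ("b", "a tone poem.")]
)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

The tag, indicators, and subfield codes all survive as attributes, which is what makes the rendering exact and also what makes it long.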
  • 16. Metadata Object Description Schema (MODS) • Developed and maintained by the LC Network Development and MARC Standards Office • Inspired by MARC, but not equivalent • Intended to be useful to a wider audience than MARC • Still a “bibliographic” focus • Use when you want a library-type approach but more interoperability than MARC and the benefits of XML
  • 17. MODS example • Textual element names • General MARC inspiration • AACR2 used in this example, but not required by MODS • Fairly extensive scope • But still “library-ish”
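
For contrast with numeric MARC tags, the sketch below builds a tiny MODS fragment with textual element names (`titleInfo`, `title`, `subTitle`, `typeOfResource`). The `build_mods` helper and the sample values, including the alternative title, are assumptions made for illustration; a real record would carry many more elements.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)


def q(name: str) -> str:
    """Qualify a local element name with the MODS namespace."""
    return f"{{{MODS_NS}}}{name}"


def build_mods(title, subtitle=None, alternative=None, resource_type=None):
    """Build a minimal MODS record; only the title is required by this sketch."""
    mods = ET.Element(q("mods"))
    title_info = ET.SubElement(mods, q("titleInfo"))
    ET.SubElement(title_info, q("title")).text = title
    if subtitle:
        ET.SubElement(title_info, q("subTitle")).text = subtitle
    if alternative:
        # A second titleInfo, distinguished by the type attribute.
        alt = ET.SubElement(mods, q("titleInfo"), {"type": "alternative"})
        ET.SubElement(alt, q("title")).text = alternative
    if resource_type:
        ET.SubElement(mods, q("typeOfResource")).text = resource_type
    return mods


record = build_mods(
    "Spring and fall",
    subtitle="a tone poem",
    alternative="Spring & fall",              # made-up value
    resource_type="sound recording-musical",
)
print(ET.tostring(record, encoding="unicode"))
```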
  • 18. Dublin Core • Perhaps the most misunderstood metadata standard! • Dublin Core Metadata Element Set (DCMES) ▫ ANSI/NISO Z39.85, ISO 15836 ▫ No element required ▫ All elements repeatable ▫ 1:1 principle • Abstract Model is current focus
  • 19. Dublin Core Metadata Element Set • Unqualified – 15 elements ▫ This is the format most think of as “Dublin Core” • Qualified ▫ Additional elements ▫ Element refinements ▫ Encoding schemes (vocabulary and syntax) ▫ All qualifiers must follow “dumb-down” principle
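
The “dumb-down” principle means a qualified value must still make sense when its qualifier is stripped and it is treated as the broader unqualified element. A minimal sketch, covering only a handful of element refinements; the function name and the sample values are made up for illustration:

```python
# The 15 unqualified Dublin Core elements.
DC15 = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}

# A partial table: each element refinement maps to the element it refines.
REFINES = {
    "alternative": "title",
    "abstract": "description",
    "tableOfContents": "description",
    "created": "date",
    "issued": "date",
    "extent": "format",
    "medium": "format",
    "spatial": "coverage",
    "temporal": "coverage",
    "isPartOf": "relation",
}


def dumb_down(qualified):
    """Reduce (term, value) pairs to unqualified DC, dropping terms with no parent element."""
    simple = []
    for term, value in qualified:
        element = term if term in DC15 else REFINES.get(term)
        if element is None:
            continue  # e.g. additional elements such as "audience" have no unqualified form
        simple.append((element, value))
    return simple


qualified_record = [
    ("title", "Spring and fall"),
    ("alternative", "Spring & fall"),   # made-up value
    ("created", "1998"),                # made-up value
    ("audience", "Musicians"),          # made-up value
]
simple_record = dumb_down(qualified_record)
print(simple_record)
```

The result has two `title` values and one `date`, which is acceptable because every element is repeatable; the meaning is less specific but never wrong.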
  • 20. Uses of DCMES • “Core” across all knowledge domains • Unqualified DC required for sharing metadata via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) • Generally used as format for sharing metadata with others • QDC occasionally used as a native metadata format ▫ CONTENTdm ▫ DSpace
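
For sharing via the Open Archives Initiative, unqualified DC is carried in an `oai_dc:dc` container whose children are the 15 `dc:` elements. A small sketch of that container, assuming a hypothetical `build_oai_dc` helper and made-up values; the surrounding protocol envelope is omitted:

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def build_oai_dc(pairs):
    """Wrap (element, value) pairs in an oai_dc:dc container, preserving order and repeats."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for element, value in pairs:
        ET.SubElement(root, f"{{{DC_NS}}}{element}").text = value
    return root


shared = build_oai_dc([
    ("title", "Spring and fall"),
    ("title", "Spring & fall"),      # repeated element, made-up value
    ("type", "Sound"),
])
shared_xml = ET.tostring(shared, encoding="unicode")
print(shared_xml)
```

No element is required and any element may repeat, so the builder performs no validation beyond placing each value in the `dc` namespace.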
  • 21. Dublin Core examples • Relative simplicity of the formats • QDC allows the specification of source vocabulary, more specific element meanings • These records generated via standard mappings from MARC ▫ Obviously the mappings need some work ▫ But that doesn’t mean the target formats aren’t useful! • Remember, every format has its purpose
  • 23. Visual Resources Association Core Categories (VRA Core) • Designed by visual resources specialists • Distinguishes between collection, work, and image • Focus on creation, style, culture • Best used on collections of reproductions of works of art & architecture • No infrastructure yet for easy sharing of work records
  • 24. VRA Core example • Work and image in separate records • Image record describes a digitized photograph of an architectural site • Separate elements for display and indexing values • Use of controlled vocabularies • Connections to research relevant to the work
  • 25. Categories for the Description of Works of Art (CDWA) Lite • Version of the full CDWA, intended to help museums share metadata about their collections • Strong museum, curatorial focus • Strong on culture, physical location • Meant to describe original works, not surrogates or reproductions • Best used for unique materials owned and managed by your institution
  • 26. CDWA Lite example • Separate elements for display and indexing values • Physical dimensions • Current repository and provenance • Inscription information
  • 28. Different landscape for music than images • No discipline-generated format has emerged • Do we need one? • Industry is a strong influence in this community • “Music” is almost impossibly diverse ▫ Different cultures, traditions ▫ Different formats (sound, notation, visual + audio) ▫ Quickly changing environment
  • 29. Some music metadata formats • Variations2 – Indiana University • Probado – Bavarian State Library • Music Ontology – Music Information Retrieval community • ID3 tags – Industry • Overall, only very specialized applications choose these over a format-neutral option.
  • 31. MPEG-7 • “Multimedia Content Description Interface” • ISO/IEC standard • From the Moving Picture Experts Group, which is behind the MPEG-1 and MPEG-2 multimedia content formats, and the MPEG-21 Multimedia Framework • Descriptions can be expressed in XML or compressed binary form
  • 32. Framework rather than element set • “Description Definition Language” ▫ Based on W3C XML Schema ▫ Defines “description schemes” • Pre-defined description schemes for video and audio • Focus is more on “low-level” descriptors than library-style bibliographic information • A library would preserve MPEG-7 information when it is generated by an editing application • Unlikely a library would choose it as a format for descriptive metadata to support discovery
  • 33. MPEG-7 scope • Wide scope – intended to cover descriptive, technical, rights, use, etc., information • Many media formats ▫ Still pictures ▫ Graphics ▫ 3D models ▫ Audio ▫ Speech ▫ Video ▫ “Scenarios” combining these elements • Note technical details of the audio waveform in the example
  • 34. Public Broadcasting Core (PB Core) • Development funded by the Corporation for Public Broadcasting • Data to support the creation, management, and discovery of “media items” • 4 classes ▫ IntellectualContent ▫ IntellectualProperty ▫ Instantiation ▫ Extensions • Likely the best choice for broadcasting archives
  • 35. PB Core example • Common descriptive information such as title, subject, genre • Audience level and rating • Rights information • Separates “instantiation” from intellectual content
  • 37. Metadata for Images in XML (MIX) • Implementation in XML of ANSI/NISO Z39.87 data dictionary • Maintained by the Library of Congress Network Development and MARC Standards Office • Technical information needed to render the image and data on how it was created • Use for any still image format; most can be generated automatically • Note features such as compression level, pixel dimensions, format-specific data, and bit rate
  • 38. AES Core Audio • Currently under development by the Audio Engineering Society, not yet in general release • Divides audio into face->region->stream • Can be used for both analog and digital audio • Use for any audio file; most can be generated automatically • Expectation is that most audio editing software will be able to generate this format • Note duration, sample rate, channel assignments
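
“Most can be generated automatically” is possible because much technical metadata is already recorded in the file itself. The sketch below reads duration, sample rate, bit depth, and channel count from a WAV header with Python's standard `wave` module. It does not produce AES Core Audio or MIX; the dictionary keys are invented for illustration, and the silent test file is generated in memory so the example needs no external audio.

```python
import io
import wave


def make_test_wav(seconds=2, sample_rate=44100, channels=2, sample_width=2) -> bytes:
    """Create a silent PCM WAV file in memory (sample_width is in bytes per sample)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * (seconds * sample_rate * channels * sample_width))
    return buffer.getvalue()


def technical_metadata(wav_bytes: bytes) -> dict:
    """Extract basic technical characteristics from WAV data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,
            "duration_seconds": frames / rate,
        }


meta = technical_metadata(make_test_wav())
print(meta)
```

A production workflow would map values like these into the chosen technical metadata schema rather than a plain dictionary.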
  • 39. LC A/V Prototyping Project Audio (Source) Data Dictionary • Developed in 2003 • Never implemented in a production environment • Use AES Core Audio instead when you can ▫ This is probably a reasonable choice in the meantime • Note encoding, duration, sample size, channel information
  • 40. LC A/V Prototyping Project VIDEOMD Data Dictionary • Developed in 2003 • Never implemented in a production environment • Just video information; assumes separate format for the audio track • Use if you can; no tools to create it for you • This type of data stored internally in most video editing software, but no real shared export formats • Be on the lookout for new developments • Note duration, sample rate, physical tape characteristics, frame size/rate
  • 41. AES Process History Metadata • Currently under development by the Audio Engineering Society, not yet in general release • Records “processing events” • Detailed information about device settings, signal patches • Used to support the digital preservation process • Use for any audio file; most can be generated automatically • Expectation is that most audio editing software will be able to generate this format • Note device data, input/output channels, patch list
  • 43. Metadata Encoding and Transmission Standard (METS) • “Wrapper” to package many types of metadata together for a resource • Structural metadata is its heart • Expectation is that METS documents will be generated programmatically • Not many METS generation tools out there, though • Often used for exchange of data between repositories, and for ingest into and export out of a repository
  • 44. METS example • This example shows an “audio preservation package” ▫ Collection-level descriptive metadata in MARCXML ▫ AES Core Audio technical metadata for analog source and various digitized versions ▫ Audio decision lists ▫ AES Process History ▫ Audio and ADL files ▫ Structural information ▪ Relationships between different versions ▪ Milestones on the audio timeline
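
Since METS documents are expected to be generated programmatically, here is a deliberately small sketch of the wrapper's skeleton: a `fileSec` listing files and a `structMap` whose `div` elements point at them through `fptr`. The `build_mets` helper, the file names, and the `example.org` URLs are hypothetical; the descriptive and administrative sections of a real package are left out.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(name: str) -> str:
    """Qualify a local element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"


def build_mets(label, files):
    """files: list of (file_id, use, url, div_label) tuples."""
    mets = ET.Element(m("mets"), {"LABEL": label})
    file_sec = ET.SubElement(mets, m("fileSec"))
    struct_map = ET.SubElement(mets, m("structMap"), {"TYPE": "logical"})
    top_div = ET.SubElement(struct_map, m("div"), {"LABEL": label})
    groups = {}
    for file_id, use, url, div_label in files:
        if use not in groups:  # one fileGrp per kind of use
            groups[use] = ET.SubElement(file_sec, m("fileGrp"), {"USE": use})
        file_el = ET.SubElement(groups[use], m("file"), {"ID": file_id})
        ET.SubElement(file_el, m("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url})
        # The structural map refers to the file by ID rather than by location.
        div = ET.SubElement(top_div, m("div"), {"LABEL": div_label})
        ET.SubElement(div, m("fptr"), {"FILEID": file_id})
    return mets


package = build_mets("Audio preservation package", [
    ("FILE1", "preservation", "http://example.org/audio/side1.wav", "Side 1, preservation master"),
    ("FILE2", "access", "http://example.org/audio/side1.mp3", "Side 1, access copy"),
])
print(ET.tostring(package, encoding="unicode"))
```

Keeping file locations in one section and structure in another is what lets the same structural map survive when files move between repositories.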
  • 45. SMPTE Material eXchange Format (MXF) • Actually a family of standards • Wrapper for metadata and media files (“essence”) • Industry-driven format designed for interoperability between devices • Low-level feature information • Generated by media editing software • Example shows part of a header and references to essence files
  • 46. Synchronized Multimedia Integration Language (SMIL) • From the W3C, the body behind HTML and XML • For multimedia presentations • Embedded media, transitions, timing • Most media players support SMIL • Note examples showing images in sequence and in parallel
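
The difference between "in sequence" and "in parallel" shows up in timing: children of `<seq>` play one after another, while children of `<par>` play together. The sketch below computes playback time for a simplified, namespace-free SMIL fragment. It is an assumption-laden illustration: it only understands `dur` values written in whole or decimal seconds such as `5s`, the media file names are made up, and real players support many more timing attributes.

```python
import xml.etree.ElementTree as ET


def duration(element) -> float:
    """Playback time in seconds, using only explicit 'dur' values like '5s'."""
    if element.tag in ("seq", "body"):
        # Sequential container: total is the sum of the children.
        return sum(duration(child) for child in element)
    if element.tag == "par":
        # Parallel container: ends when the longest child ends.
        return max((duration(child) for child in element), default=0.0)
    # Media object (img, audio, video, ...) with an explicit duration.
    return float(element.get("dur", "0s").rstrip("s"))


smil_text = """
<smil>
  <body>
    <seq>
      <img src="slide1.jpg" dur="5s"/>
      <img src="slide2.jpg" dur="5s"/>
    </seq>
    <par>
      <img src="cover.jpg" dur="8s"/>
      <audio src="track.mp3" dur="12s"/>
    </par>
  </body>
</smil>
"""

body = ET.fromstring(smil_text).find("body")
seq_seconds = duration(body.find("seq"))
par_seconds = duration(body.find("par"))
total_seconds = duration(body)
print(seq_seconds, par_seconds, total_seconds)
```

Here the two slides take 10 seconds in sequence, the image-plus-audio pair takes 12 seconds in parallel, and the whole body runs for 22 seconds.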
  • 47. AES31-3 Audio Decision List • Used by editing software to record edits made to audio files • Text-based format that looks like XML in places • Documents how files are stitched together to create the output • Uses a common “destination timeline” for all files • Non-standard extension for “markers” in WaveLab • Note in/out fade, “cuelist”
  • 49. Content, not “metadata” • For encoding musical notation itself - the full content • Tend to include “header” with some descriptive metadata • Currently, two primary choices ▫ MusicXML ▪ Focus on industry, notation software ▫ Music Encoding Initiative (MEI) ▪ Inspired by the Text Encoding Initiative (TEI)
  • 51. Scenario 1: Audio/video course reserves • Discovery ▫ MARC/AACR2 records in OPAC ▫ Course reserves module with descriptive data extracted from MARC records ▫ Link from discovery system launches media player • Delivery ▫ Locally-managed media streaming server ▫ (Optional) SMIL for navigation
  • 52. Scenario 2: Digital music library • High-end, specialized, online environment for music in a variety of formats • Work-based metadata model such as Variations2 optimized for music discovery • Descriptive metadata records persistently link to media files in tools that facilitate use of the content • METS-based structural metadata for navigation within and between media files • Various forms of technical and administrative metadata for long-term preservation of media files
  • 53. Scenario 3: Broadcast archive • Focus on management of media; discovery only for staff and not for end-users • PB Core as base metadata • High-end media editing software generates AES, MXF, other industry standard technical metadata • METS wrapper for connecting PB Core data to structural and technical metadata for ingest into preservation repository
  • 54. Scenario 4: Online special collections • Discovery ▫ MODS for item-level description of a variety of formats (letters, photographs, oral histories) • Delivery ▫ METS for structural data for multi-page objects ▫ Online page-turning interface ▫ PDF download • Commonly used software such as CONTENTdm does much of this in its own quirky way – we need to keep pushing for system adherence to standards!
  • 55. Thank you! • jenlrile@indiana.edu • These presentation slides: http://www.dlib.indiana.edu/~jenlrile/presentations/olac2008/olac.ppt • Workshop handout: http://www.dlib.indiana.edu/~jenlrile/presentations/olac2008/handout.pdf