An explanation of the Visual Resources Association Embedded Metadata working group project to develop a custom info panel for cultural heritage metadata.
4. Why?
Well documented, incorporated into most photo tools
Great guidelines, user support; frequently and consistently updated
Widespread cross-disciplinary use
Currently supported in Media Bin database
5. Why just Core?
IPTC Extension, VRA, and PLUS panels not supported in “out of box” Media Bin
Exploring additional EM possibilities will require coding
15. Work | Work | Image
sculpture | glass negative | TIFF
ca. 1930 | 1934 | 2012
16. Collection Work Image
agent agent agent
culturalContext culturalContext culturalContext
date date date
description description description
inscription inscription inscription
location location location
material material material
measurements measurements measurements
relation relation relation
rights rights rights
source source source
stateEdition stateEdition stateEdition
stylePeriod stylePeriod stylePeriod
subject subject subject
technique technique technique
textref textref textref
title title title
worktype worktype worktype
17. Extension
Artwork or Object in the Image
Creator
Title
Date Created
Source
Source Inventory Number
Copyright Notice
18. Extension
Artwork or Object in the Image
Date Created
single calendar date
no BCE
“built 1298 – 1310, restored 1872”
19. Location of a Work
Sculpture Painting
Wikimedia Commons: Jim Champion Metropolitan Museum of Art
IPTC Ext. Location Shown VRA Work Location <site>
+ details
20. Location of a Work
Sculpture Painting
Wikimedia Commons: Jim Champion Metropolitan Museum of Art
IPTC Ext. Location Shown VRA Work Subject
+ details
IPTC Ext. AO Source VRA Work Location <repository>
no details
23. Qualifying IPTC Ext. with VRA
<
Extension
Artwork or Object in the Image
Core 4.0
Work
</
XMP allows qualifying one namespace with another
24. Qualifying IPTC Ext. with VRA
<
Extension
Artwork or Object in the Image
gone
</
Some info panels delete qualifiers
25. Using Both IPTC Ext. and VRA
< Extension
Artwork or Object in the Image
</
< Core 4.0
Work
</
Using several namespaces is common in XMP
26. Connecting VRA to IPTC Ext.
<Iptc4xmpExt:ArtworkOrObject> <vra:work>
<Bag> <Seq>
[1] Work 1 [1] Work 1
[2] Work 2 [2] Work 2
[3] Work 3 [3] Work 3
</Bag> </Seq>
</Iptc4xmpExt:ArtworkOrObject> </vra:work>
unordered list (bag) order can change
27. Connecting VRA to IPTC Ext.
<Iptc4xmpExt:ArtworkOrObject> <vra:work>
<Bag> <Seq>
[1] Work 3 [1] Work 1
[2] Work 1 [2] Work 2
[3] Work 2 [3] Work 3
</Bag> </Seq>
</Iptc4xmpExt:ArtworkOrObject> </vra:work>
XMP does not allow linking via rdf:about
unordered list (bag) order can change
28. VRA Info Panel Compromise
IPTC Core
Dublin Core
Photoshop
XMP Rights
PLUS
VRA Core 4.0
All work properties
Display values (flat)
29. Work Image
agent agent
culturalContext date
date description
description location
inscription material
location (by type) measurements
material relation
measurements rights
relation source
rights subject
source textref
stateEdition title
stylePeriod worktype
subject
technique
textref
title
worktype
34. Concatenating
VRA Work:
Creator
Title
Description Date
Work Type
Location
Acc. Number
Rights
Creating interoperable metadata from specialized metadata
51. Name Authorities
Available now
ISNI
VIAF
Library of Congress
PLUS Registry
Available in the future (hopefully)
Getty ULAN
52. Subject Vocabularies
Available now
IPTC NewsCodes
Deutsche Nationalbibliothek
Kungliga Biblioteket
Bibliothèque nationale de France
Library of Congress
Iconclass
BIC
UDC
Available in the future (hopefully)
Getty AAT
55. VRA’s Goal
Avoid property overpopulation
Avoid conflicting property labels and definitions
Make mapping easy
Use linked data
Interoperability
Interoperable heritage properties
56. VRA’s Goal
Both:
Basic Description URIs
Most applications Link to structured data
No special effort required Always current
Simple identification More relationships
Simple searching
57. VRA’s Goal
Both:
Basic Description Structured Data
Most applications For aggregators
No special effort required For collection managers
Simple identification Full identification
Simple searching Advanced searching
58. VRA’s Goal
Both:
Basic Description Structured Data
Which Schema?
Core properties?
Description
Subject All properties?
AO Copyright Written to XMP?
URI linked to authority?
59. Possible Fields
Artwork/Object Image/Photo
Work ID (linked) View Description
Creator (linked) View Type (controlled)
Creator Role (linked) View Subject (linked)
Work Type (linked)
Measurements
Materials and Techniques
Date (free text)
Earliest Date
Latest Date
Current Location
Subject (linked)
60. Embedded Metadata working group
metadatadeluxe.pbworks.com
vraweb.org/projects/vracore4
61. Embedded Metadata working group
Marta Bustillo, National College of Art and Design (Ireland)
Jen Cwiok, American Museum of Natural History
Sheryl Frisch, Cal Poly, San Luis Obispo
Jesse Henderson, Colgate University
Josh Lynn, Minneapolis Institute of Arts
Heidi Raatz, Minneapolis Institute of Arts
Greg Reser, UCSD
Steve Tatum, Virginia Tech
Editor's notes
I have always been a do-it-yourselfer. Admittedly, I like to make things, but usually I make things out of necessity.
The User Experience: how does EM serve our VirageMB database users? Data-driven dynamic collections are created by defining search parameters on EM, which apply to image assets across the entire database. Ingested assets that meet these parameters (that is, assets that include the searched-for EM) are (BAM!) automatically added to the collection. This is used for recurring events: Family Day, Third Thursday, Art in Bloom, The Circle.
CalState DSpace custom info panel for Photoshop and Bridge. Sheryl Frisch came to me for help creating an info panel that could collect all the data they wanted in their shared image database - the Cal State Visual Collective.
VRA’s original approach to embedding complex metadata: when choosing fields, like Work Title or Image Copyright, the idea was to start with the schemas most widely used by photo applications and web services and then move down the list, using specialized schemas last. This places as much of the metadata as possible in properties that will be read by common tools and provides interoperability with a wide range of photo software. The first schemas used were IPTC Core and Extension, then PLUS, then any other namespace built into XMP (as specified in the XMP specs, part 2). Remaining properties were assigned to the VRA Core 4.0 namespace. This ensures that the most essential data about an artwork can be read when the user does not have access to the VRA or IPTC Adobe CS panels. Further, key fields are combined to create Tags and a photo Caption, the most widely supported fields in photo applications, web sites and operating systems.
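The "most widely supported schema first" rule can be sketched in a few lines of Python. This is only an illustration: the schema order follows the notes, but the property sets are placeholders invented for the example (not the real field lists of these standards), and `assign_schema` is a hypothetical helper.

```python
# Schemas in the priority order described in the notes: most widely
# supported first, the specialized VRA Core 4.0 namespace last.
SCHEMA_PRIORITY = ["IPTC Core", "IPTC Extension", "PLUS", "XMP built-in", "VRA Core 4.0"]

# Illustrative coverage only: these sets are placeholders for the example,
# not the actual property lists of the standards.
COVERAGE = {
    "IPTC Core": {"Image Title", "Image Description", "Image Copyright"},
    "IPTC Extension": {"Location Shown"},
    "PLUS": {"Image Licensor"},
    "XMP built-in": {"Rating"},
    "VRA Core 4.0": {"Work Title", "Measurements", "Materials", "Technique"},
}


def assign_schema(field):
    """Return the first schema in priority order that can hold the field."""
    for schema in SCHEMA_PRIORITY:
        if field in COVERAGE[schema]:
            return schema
    # Anything not covered elsewhere falls through to the VRA namespace.
    return "VRA Core 4.0"
```

With this ordering, a field is placed in the VRA namespace only when no more widely read schema can hold it.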
There’s VRA all the way at the bottom looking like the least favored of the children. It turns out however, that it has an important role to fill, stepping in when the other schemas can’t fulfill our needs.
We went boring, choosing the most widely used schema for embedded metadata, the one that most tools recognize. IPTC has been around for a long time and it is used in just about every photo application and social media site out there.
VRA, on the other hand, can take you more places but requires more expertise and specialized equipment. It’s not suitable for everyone.
The VRA Core 4 specifications are hosted by the Library of Congress.
A central principle of Core 4 is the complete separation of Collection, Work and Image records. Each of these records is repeatable and can be combined to cover complex objects and images.
What qualifies as a Work or Image depends on the nature of the item and the database it is used in. The original artwork is definitely a Work, but the original photograph of it can be too. A TIFF of that original photograph would be described in the Image record.
Core 4 Collection, Work and Image records all use the same set of elements. There are also sub-elements, types and global elements not shown here.
Our first choice for embedding cultural heritage metadata was going to be IPTC, specifically Extension because it is well structured and it includes fields that are useful for the VRA panel including fields exclusively for artworks. Our original intent was to use all of these fields and then use VRA for the remaining fields such as Measurements, Materials, Technique, Culture, Style/Period. This turned out to be harder than we thought.
For instance, “Date Created” is a single calendar date only and doesn’t allow for a range of dates or a complex free-text date such as “built 1298 – 1310, destroyed 1943”.
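To illustrate the limitation, here is a minimal Python sketch. It treats "single calendar date" as an ISO-style value such as 1934-05-01, which is an assumption made for the example. The `WorkDate` structure is hypothetical, loosely modeled on the free-text, earliest and latest date fields listed on slide 59.

```python
from dataclasses import dataclass
from datetime import date


def is_single_calendar_date(text):
    """True only if the text parses as one ISO calendar date."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


@dataclass
class WorkDate:
    display: str   # free text shown to people
    earliest: int  # earliest year, for searching and sorting
    latest: int    # latest year, for searching and sorting


# Example values; the earliest/latest years are an illustrative
# cataloguing choice, not taken from a real record.
work_date = WorkDate(
    display="built 1298 - 1310, destroyed 1943",
    earliest=1298,
    latest=1310,
)
```

The free-text display value fails the single-date check, while the earliest and latest years remain available for range searches.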
Users are likely to enter the location depicted in the painting into IPTC Ext. Location Shown. This would cause mapping difficulties for VRA, where the Work Location answers a different question: where is this work, if you want to go see it for yourself?
The definition of Image Source is different for IPTC and VRA.
Another thing we tried, and the method that would be the most reliable and computer friendly, would be to nest VRA within IPTC. This keeps all the artwork data together in one array and makes it possible to describe multiple artworks using multiple arrays, each one being a completely discrete packet. This method is supported by XMP. Unfortunately, most applications don’t recognize the nested VRA data and they delete it. This is much too unreliable at this point.
We tried using both IPTC and VRA for Work metadata but had difficulty connecting them. One problem is that the IPTC artwork data is a bag structure which has no fixed order.
This means the items can shift position in the array and cannot be reliably connected to the matching array in VRA.
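The pairing problem can be sketched in a few lines of Python. This is a toy model, not an XMP parser: plain lists stand in for the Bag and Seq arrays, and `pair_by_position` is a hypothetical helper that links items by index, the only link available when the two arrays share no identifier.

```python
def pair_by_position(iptc_artworks, vra_works):
    """Link IPTC artwork entries to VRA work entries by array index."""
    return list(zip(iptc_artworks, vra_works))


# Seq: an ordered array, so the order is preserved.
vra_works = ["Work 1", "Work 2", "Work 3"]

# Bag: an unordered array. This is the order as first saved.
iptc_as_written = ["Work 1", "Work 2", "Work 3"]

# A tool may rewrite a Bag in a different order (as on slide 27).
iptc_after_resave = ["Work 3", "Work 1", "Work 2"]

before = pair_by_position(iptc_as_written, vra_works)
after = pair_by_position(iptc_after_resave, vra_works)

# Pairs whose two halves no longer describe the same work.
mismatched = [(a, b) for a, b in after if a != b]
```

After the re-save every pair is mismatched, even though neither array lost any data.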
In the end we chose to use VRA Core for all artwork data but we were able to use mostly IPTC for the image data. This was a compromise because we wanted to avoid introducing another custom namespace into the XMP world. It would seem to benefit cultural heritage users to have a more standard (interoperable) set of properties. This would make ingesting metadata easier and allow advanced desktop searching (this means more tools would have to incorporate the additional cultural heritage properties).
The VRA custom info panel Work section.
The VRA custom info panel Image section.
The VRA custom info panel Admin and Summary sections.
So what about all that detailed artwork information we are embedding? It might not be seen without specialized applications like Photoshop, right? To make sure that the most important information is carried to all likely destinations, the VRA panel concatenates most of the artwork fields to the Title, Caption, and Tags.
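A minimal sketch of this kind of concatenation, in Python. The field names, their order, the "; " separator and the choice of which fields become tags are all assumptions made for the example; the notes do not spell out the panel's actual rules.

```python
def build_caption(work):
    """Join the populated work fields into one display string."""
    order = ["creator", "title", "date", "work_type",
             "location", "accession_number", "rights"]
    parts = [work[key] for key in order if work.get(key)]
    return "; ".join(parts)


def build_tags(work):
    """Use a few short fields as keywords."""
    return [work[key] for key in ("creator", "work_type") if work.get(key)]


# Example record; all values are invented for illustration.
work = {
    "creator": "Unknown sculptor",
    "title": "Standing Figure",
    "date": "ca. 1930",
    "work_type": "sculpture",
    "location": "",
    "rights": "Public domain",
}

caption = build_caption(work)
tags = build_tags(work)
```

Empty or missing fields are skipped, so the caption never contains dangling separators.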
The Cal State Visual Collective metadata template breaks down some elements like Creator to a very granular level. Naturally, the first thing I asked Sheryl was, “What are you trying to do?” After she told me to “Quit asking questions and just make it happen”, I looked at our current VRA info panel to see if I could modify it. The VRA panel was kept simple by design - a single display text field for every Core 4 element. Obviously for the CalState Creator section we had to add a lot of fields... 10 to be exact. We also added an auto-complete feature for the Creator Label so a uniformly formatted ULAN-type Label would be produced. We also allowed for multiple Creators. After we finished this and it was working I thought back on the many times people told me, “No one will ever want to enter granular data in Photoshop.”
We are looking into using linked open data to pull in reliable structured data. For example, the Cal State info panel uses Iconclass, which is far too complex to build into an info panel. Instead, we decided to use linked data. I contacted Iconclass to ask how we should represent the vocabulary in XMP and how to access the data. It turns out that Iconclass publishes its data on the web as open linked data in SKOS/RDF and JSON. I chose to use JSON because it is easy to work with and I found some example code for using it in Adobe ActionScript (the language of Adobe info panels). JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of JavaScript.
Using Iconclass in the CSU custom info panel:
1. The user goes to http://iconclass.org/ and finds a term they want to use.
2. They enter the Iconclass code in the CalState info panel.
3. The panel sends out a web call to an address that is created by concatenating the base URL with the Iconclass concept code.
4. The JSON text data is brought into the panel’s cache, where it is broken apart and re-used.
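Steps 3 and 4 might look roughly like the following Python sketch (the panel itself is written in ActionScript). The ".json" suffix, the JSON key names `txt` and `kw`, and both helper functions are assumptions for illustration, so check the Iconclass documentation for the current address and response format. No web call is made here; a hand-written sample stands in for the response.

```python
import json
from urllib.parse import quote

# Assumed base URL and ".json" suffix, not confirmed by these notes.
BASE_URL = "http://iconclass.org/"


def build_iconclass_url(code):
    """Concatenate the base URL with the URL-encoded concept code."""
    return BASE_URL + quote(code) + ".json"


def extract_label_and_keywords(json_text, lang="en"):
    """Break the JSON text apart into a heading and a keyword list."""
    data = json.loads(json_text)
    label = data.get("txt", {}).get(lang, "")
    keywords = data.get("kw", {}).get(lang, [])
    return label, keywords


# Hypothetical response shape with placeholder values.
sample = json.dumps({
    "n": "CODE",
    "txt": {"en": "example heading"},
    "kw": {"en": ["keyword one", "keyword two"]},
})

label, keywords = extract_label_and_keywords(sample)
```

The heading would go into the controlled subject field, while the keywords could be offered to the user as optional tags.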
Iconclass web data retrieval in action
Iconclass linked data is pulled in and embedded
This is the actual web data. It’s a webpage of plain text in JSON format which is easily parsed out with a little coding.
Bonus! Iconclass has keywords which can be imported to use as tags (less controlled but very useful).
The Cal State panel allows up to 4 Iconclass headings [video].
CalState panel Iconclass keyword function [video]. The user selects the words they want to use then adds them to the Keywords (Dublin Core Subject). By entering one code, the panel has added one controlled heading and eight keywords.
CalState panel Iconclass keyword function. The user selects the words they want to use then adds them to the Keywords (Dublin Core Subject). By entering one code, the panel has added one controlled heading and eight keywords.
This is all the data we are storing for a single Iconclass concept
CalState is looking into pulling data from the Getty Vocabularies. This would make it easy for faculty, students, and assistants to add a lot of controlled data.
It would be nice to fill in a ULAN ID number and have all the other name information brought in from the web. It would be even better to have a built-in search box so you can find and select a name right in the info panel.
Imagine if a user could simply enter a CONA ID and have all the data (or whatever they choose) embedded in the image.
The PLUS Registry is a very promising service for managing Work identifiers and metadata.
There are several linked data subject vocabularies available now that might be of interest to the cultural heritage community. The German National Library's subject headings, the Swedish union catalogue Libris, and the French National Library's subject headings are available as SKOS linked data. They are all mapped to the Library of Congress and are all included in the aggregated authority VIAF (Virtual International Authority File). VIAF only has names at the moment, but it could become a hub for linked data in the future.
One project that is looking at embedded image metadata is Linked Heritage in London. They are working on bringing commercial image suppliers to the Europeana image database. They have developed data submission standards and tools for contributors. Currently they are working on mapping IPTC to LIDO. Of course you want to ask “Why would they do that?” The reason they are interested in embedded metadata is that a lot of commercial image producers have large libraries of images with IPTC metadata. To make this data work in Europeana it will be mapped to LIDO. The VRA Data Standards Committee is also working on mapping Core 4 to LIDO, so obviously there is some overlap here. It seems to me that there is an opportunity to collaborate and develop a shared system/schema/vocabulary.
We want to store both kinds of metadata, so that all the metadata is easily discovered and searched, and granular data can be extracted to a database or searched with advanced tools for more accurate results.