1. RLG Partnership Meeting
June 1-3, 2009
The Crosswalk Web Service
Jean Godby
Research Scientist
OCLC Research
2. The Crosswalk Web Service at OCLC
• Enables OCLC to translate from one metadata format to another.
• A “metadata format” is a triple that consists of a metadata schema, a structural encoding, and a character encoding.
• Supported standards are bibliographic, but the software can handle other types of data.
• Can be called from any product or service that processes metadata.
• A version with a slightly different interface resides on the OCLC Enterprise Bus.
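The "triple" definition of a metadata format can be pictured as a small value type. The following is a minimal sketch, not the service's actual data model: the class name `MetadataFormat`, its field names, and the character-encoding values chosen for the two examples are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataFormat:
    """Hypothetical model of a metadata format as a triple."""
    schema: str     # metadata schema, e.g. "MARC 21"
    structure: str  # structural encoding, e.g. "ISO 2709" or "XML"
    charset: str    # character encoding (values below are illustrative)


# Two formats that share a schema but differ in structure and charset.
MARC21_2709 = MetadataFormat("MARC 21", "ISO 2709", "MARC-8")
MARC_XML = MetadataFormat("MARC 21", "XML", "UTF-8")


def same_schema(a: MetadataFormat, b: MetadataFormat) -> bool:
    """True when only a structural or character conversion is needed."""
    return a.schema == b.schema
```

Under this reading, moving between MARC 21-2709 and MARC XML changes the encoding parts of the triple while the schema stays the same.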
3. The translations
[Diagram: inputs are translated into OCLC’s Common Data Format, shown at the center, and from there to outputs.]
Formats named in the diagram: MARC 21-2709, OCLC MARC, MARC XML, MODS, DC XML, OAI-DC XML, DC-Qualified, ONIX Books, ONIX Serials, OCLC CDF.
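Routing every translation through a common format means each supported format needs only a converter into the hub and a converter out of it, rather than one converter per input/output pair. The sketch below illustrates that idea under loud assumptions: the registries `TO_CDF` and `FROM_CDF`, the dictionary-shaped records, and the key names are all hypothetical and do not reflect how OCLC's Common Data Format is actually represented.

```python
from typing import Callable, Dict

Record = Dict[str, str]

# Hypothetical converters INTO the hub format (one per input format).
TO_CDF: Dict[str, Callable[[Record], Record]] = {
    "MARC XML": lambda r: {"title": r["245a"]},
    "DC XML": lambda r: {"title": r["dc:title"]},
}

# Hypothetical converters OUT OF the hub format (one per output format).
FROM_CDF: Dict[str, Callable[[Record], Record]] = {
    "ONIX Books": lambda c: {"Title": c["title"]},
    "DC XML": lambda c: {"dc:title": c["title"]},
}


def translate(record: Record, source: str, target: str) -> Record:
    """Translate by going source -> hub -> target."""
    hub_record = TO_CDF[source](record)
    return FROM_CDF[target](hub_record)
```

With two inputs and two outputs registered, four translations are available from four converters; adding a new output format adds one converter and makes it reachable from every input.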
4. Data flow for a single translation
[Diagram: one field traced through a MARC-to-DC Terms translation.]
Stages, in order: MARC input (ISO 2709), Convert to input structure, MARC XML, Translate to DC Terms, Convert to output structure, DC Terms output.
Example data shown along the way:
• MARC input, ISO 2709: 522 $a northwest or
• MARC XML: <field tag='522'><subfield code='a'>northwest</subfield></field>
• MARC record with header: <record><header><schema name='marc21' namespace='uri:marc:21'/>…</header><field name='522'><datafield name='a'><value>northwest</value></datafield></field>…</record>
• DC Terms record with header: <record><header><schema name='DC-Terms' namespace='uri:DC-Terms'/></header><field name='spatial'><value>northwest</value></field></record>
• DC Terms output: <?xml version="1.0" encoding="UTF-8"?><qualifieddc xmlns:dcterms='purl.org/dc/terms'><dctermsset><dcterms:spatial>northwest</dcterms:spatial></dctermsset></qualifieddc>
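The three named steps (convert to input structure, translate, convert to output structure) can be sketched as three small functions. This is an illustrative sketch only, using Python's standard `xml.etree.ElementTree`; the function names, the tuple-based intermediate form, and the simplified `<record>/<field>/<subfield>` input are assumptions, and only the 522 $a to `dcterms:spatial` mapping is taken from the slide.

```python
import xml.etree.ElementTree as ET

DCTERMS_NS = "http://purl.org/dc/terms/"

# The single mapping shown on the slide: MARC 522 $a -> dcterms:spatial.
FIELD_MAP = {("522", "a"): "spatial"}


def to_input_structure(marc_xml: str):
    """Step 1: flatten simplified MARC XML into (tag, code, value) tuples."""
    root = ET.fromstring(marc_xml)
    return [
        (field.get("tag"), sub.get("code"), sub.text or "")
        for field in root.iter("field")
        for sub in field.iter("subfield")
    ]


def translate(fields):
    """Step 2: rename mapped source fields to DC Terms element names."""
    return [
        (FIELD_MAP[(tag, code)], value)
        for tag, code, value in fields
        if (tag, code) in FIELD_MAP
    ]


def to_output_structure(terms) -> str:
    """Step 3: serialize the translated fields as qualified DC XML."""
    ET.register_namespace("dcterms", DCTERMS_NS)
    root = ET.Element("qualifieddc")
    for name, value in terms:
        ET.SubElement(root, f"{{{DCTERMS_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


marc = "<record><field tag='522'><subfield code='a'>northwest</subfield></field></record>"
output = to_output_structure(translate(to_input_structure(marc)))
```

Keeping the schema translation (step 2) separate from the structural conversions (steps 1 and 3) means the same mapping can be reused whether the MARC input arrives as ISO 2709 or as XML.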
5. The MARC-Dublin Core relationship
[Diagram: the relationships among MARC, OCLC Local, DC Terms, DC Simple, and other local variants.]
6. Translations and conversions, expanded
[Diagram: translations and conversions, expanded. Structural encodings shown: XML, ISO 2709, Text. Schemas shown: ONIX-Books, Dublin Core, MARC, MARC 21. Variations shown: new versions, local variants, local extensions.]
7.
8. From CONTENTdm to WorldCat
[Diagram: the flow from CONTENTdm to WorldCat. Labels, as recoverable:]
• Customized CONTENTdm web sites
• CONTENTdm Collection Administrator
• Build collections with CONTENTdm Acquisition station(s)
• Master file and derivatives linked in metadata
• CONTENTdm copies digital master files to archival volumes
• Digital Archive: store digital master files in Digital Archive
• OAI Harvester
• CONTENTdm Metadata Harvesting program
• OCLC Staff, data analysis
• Custom CONTENTdm-WorldCat maps; DC-MARC
• WorldCat
9. Next Generation Cataloging
[Diagram: NextGen process flow. Vendor and publisher records arrive as ONIX and are translated to CDF. The data is enriched with WorldCat. WorldCat enriched records are translated to ONIX for vendors and publishers. Other labels: ESweep, To CDF.]
10. A graphical user interface
[Diagram: components of the graphical user interface. Labels: Inputs, Outputs, Standard translation, Implied translation, Editing interface, Search interface, Application profile, Map database, Version upgrade.]
Example maps shown:
<map>
Source: MARC 245 $a
Target: ONIX Title
</map>
<map>
Source: MARC 650 $a
Target: ONIX Subject
</map>
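The source/target maps and the "map database" with a search interface suggest a simple lookup structure. The following is a minimal sketch under stated assumptions: the `Map` class, the in-memory `MAP_DATABASE` list, and the prefix-based `search` function are hypothetical; only the two source/target pairs come from the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Map:
    """Hypothetical record for one element-level map."""
    source: str
    target: str


# The two example maps shown on the slide.
MAP_DATABASE: List[Map] = [
    Map(source="MARC 245 $a", target="ONIX Title"),
    Map(source="MARC 650 $a", target="ONIX Subject"),
]


def search(source_prefix: str) -> List[Map]:
    """Return every map whose source starts with the given prefix."""
    return [m for m in MAP_DATABASE if m.source.startswith(source_prefix)]
```

A search for "MARC 245" would return the title map alone, while a search for "MARC" would return both.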
11. For more information
• Research reports
  • Encoding Application Profiles in a Computational Model of the Crosswalk
  • Toward element-level interoperability in bibliographic metadata
  • A Repository of Metadata Crosswalks
  • Two Paths to Interoperable Metadata
• The public demo
  • OCLC Crosswalk Web Service Demo
  • OCLC Information and Services for Publishers