The International Federation of Library Associations and Institutions (IFLA) is responsible for the development and maintenance of International Standard Bibliographic Description (ISBD), UNIMARC, and the "Functional Requirements" family for bibliographic records (FRBR), authority data (FRAD), and subject authority data (FRSAD). ISBD underpins the MARC family of formats used by libraries world-wide for many millions of catalog records, while FRBR is a relatively new model optimized for users and the digital environment. These metadata models, schemas, and content rules are now being expressed in the Resource Description Framework language for use in the Semantic Web.
This webinar provides a general update on the work being undertaken. It describes the development of an Application Profile for ISBD to specify the sequence, repeatability, and mandatory status of its elements. It discusses issues involved in deriving linked data from legacy catalogue records based on monolithic and multi-part schemas following ISBD and FRBR, such as the duplication which arises from copy cataloging and FRBRization. The webinar provides practical examples of deriving high-quality linked data from the vast numbers of records created by libraries, and demonstrates how a shift of focus from records to linked-data triples can provide more efficient and effective user-centered resource discovery services.
3. IFLA standards
RDF representations of standards for “universal” bibliographic control are being developed
“FR” (Functional Requirements) family of models
  For Bibliographic Records (FRBR)
  For Authority Data (FRAD)
  For Subject Authority Data (FRSAD)
International Standard Bibliographic Description (ISBD)
  Record structure and content
UNIMARC
  Encoding for ISBD records (Bibliographic) and FRAD (Authorities)
4. Representation in RDF
Entities => RDF classes
  E.g. FRBR “Person”
Attributes, tags, (sub)fields, relationships => RDF properties
  E.g. ISBD “title proper”
  E.g. UNIMARC “200 $a” (title proper)
  E.g. FRBR “title of the manifestation”
Controlled term values => SKOS vocabularies
  E.g. ISBD Area 0 (content and media type)
5. FR family
Each model has its own namespace
  To reflect historical development
  Re-using earlier RDF elements
Consolidated model under development
  Being informed by analysis of RDF representation
FRBR RDF published
  FRBRer (entity-relationship) ontology
    Namespace elements plus OWL
  FRBRoo (object-oriented)
    Extension of CIDOC Conceptual Reference Model
FRAD and FRSAD imminent
  tba
6. ISBD
Element set and vocabularies for content and media types
  Namespace now published
DC Application Profile in development
  Models the ISBD record
  What properties (fields)
  Mandatory? Repeatable?
  Aggregated statements
  Sub-elements and punctuation
7. ISBD AP snippet
<!-- Area 0 is mandatory and non-repeatable -->
<StatementTemplate ID="hasContentFormAndMediaTypeArea" minOccurs="1" maxOccurs="1" type="nonliteral">
  <Property>http://iflastandards.info/ns/isbd/elements/P1158</Property>
  <!-- Area 0 is an aggregated statement with SES -->
  <NonLiteralConstraint descriptionTemplateRef="DThasContentFormAndMediaTypeArea">
    <ValueStringConstraint>
      <SyntaxEncodingScheme>http://iflastandards.info/ns/isbd/elements/C2003</SyntaxEncodingScheme>
    </ValueStringConstraint>
  </NonLiteralConstraint>
</StatementTemplate>
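The occurrence limits in a statement template can be checked mechanically. The following is a minimal sketch, not part of the ISBD profile itself: it reads a trimmed copy of the template with Python's standard XML parser and counts how often the property appears in a list of triples. The sample record, the blank-node label `_:area0`, and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the statement template (constraint details omitted).
AP_SNIPPET = """
<StatementTemplate ID="hasContentFormAndMediaTypeArea"
                   minOccurs="1" maxOccurs="1" type="nonliteral">
  <Property>http://iflastandards.info/ns/isbd/elements/P1158</Property>
</StatementTemplate>
"""

def load_template(xml_text):
    """Read the property URI and occurrence limits from one StatementTemplate."""
    node = ET.fromstring(xml_text)
    return {
        "id": node.attrib["ID"],
        "property": node.findtext("Property").strip(),
        # Assumption: both limits are plain integers, as in the snippet.
        "min": int(node.attrib["minOccurs"]),
        "max": int(node.attrib["maxOccurs"]),
    }

def conforms(triples, template):
    """True if the property occurs within the template's min/max limits."""
    count = sum(1 for _, prop, _ in triples if prop == template["property"])
    return template["min"] <= count <= template["max"]

template = load_template(AP_SNIPPET)
AREA_0 = "http://iflastandards.info/ns/isbd/elements/P1158"
# Hypothetical description with exactly one Area 0 statement.
record = [("http://MyLibraryX.com/54321", AREA_0, "_:area0")]

print(conforms(record, template))      # True: present once
print(conforms([], template))          # False: mandatory statement missing
print(conforms(record * 2, template))  # False: statement repeated
```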
8. UNIMARC
Proposal for RDF representation made at IFLA 2011
  http://conference.ifla.org/sites/default/files/files/papers/ifla77/187-dunsire-en.pdf
Outcome of discussions with Permanent UNIMARC Committee
  tba
9. Other library standards in RDF (1)
RDA: resource description and access
  Content standard based on FR models
  Refines the FR properties
  Many more controlled vocabularies than AACR
MODS/MADS (Metadata Object/Authority Description Schema)
  Metadata structure based on MARC21
  RDF representation just beginning ...
10. Other library standards in RDF (2)
BIBO: Bibliographic Ontology
  Classes and properties for citations and bibliographic references
DCMI Metadata Terms (Dublin Core)
  High-level common-denominator classes and properties for memory institution metadata
Lots of controlled vocabularies
  LCSH, DDC summaries, RDA vocabularies, etc.
16. From record to triples (in 9 stages)
Very large numbers of records
  Catalogue records, finding aids, etc.
  300 million; 1 billion?
High quality metadata
  In comparison with other communities
Each record may generate many triples
  30 “raw” triples (no inferences) per MARC record?
Very, very large numbers of triples
  Billions? Trillions?
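The scale follows from simple multiplication. As a rough sketch using only the tentative figures on this slide (300 million to 1 billion records, about 30 raw triples each), with the function name chosen for illustration:

```python
def estimate_raw_triples(records, triples_per_record=30):
    """Raw triple count before any inference, using the slide's rough figure."""
    return records * triples_per_record

for records in (300_000_000, 1_000_000_000):
    print(f"{records:,} records -> {estimate_raw_triples(records):,} raw triples")
# 300,000,000 records -> 9,000,000,000 raw triples
# 1,000,000,000 records -> 30,000,000,000 raw triples
```

Even the lower estimate lands in the billions before any inferred triples are added.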
17. 1. Take a record
Field/attribute   Value
Record ID         54321
Title             Museum archives: an introduction
Author            Wythe, Deborah
Date              2004
LCSH              Museum archives
Media/GMD         Electronic
Content form      Text
18. 2. Disaggregate to single statements
Record   Attribute            Value
54321    (has) title          Museum archives: an introduction
54321    (has) author         Wythe, Deborah
54321    (has) date           2004
54321    (has) LCSH           Museum archives
54321    (has) media type     Electronic
54321    (has) content form   Text
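Disaggregation can be sketched in a few lines. The example below treats the record as a dictionary and emits one (record, attribute, value) statement per field. It assumes a single value per field, which real catalogue records often violate, and the function name is illustrative.

```python
# The sample record from the slides, as a flat field/value mapping.
record = {
    "Record ID": "54321",
    "Title": "Museum archives: an introduction",
    "Author": "Wythe, Deborah",
    "Date": "2004",
    "LCSH": "Museum archives",
    "Media/GMD": "Electronic",
    "Content form": "Text",
}

def disaggregate(record, id_field="Record ID"):
    """Turn one record into single statements: (record id, attribute, value)."""
    record_id = record[id_field]
    return [
        (record_id, attribute, value)
        for attribute, value in record.items()
        if attribute != id_field  # the ID identifies the record, it is not a statement
    ]

statements = disaggregate(record)
for statement in statements:
    print(statement)
```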
19. 3. Create URI for record
Must be unique, so 54321 no good on its own
http URIs are a good thing (W3C)
So add record ID to a unique http domain
  E.g. http://MyLibraryX.com (unique to the library) + 54321
  http://MyLibraryX.com/54321
  (or http://MyLibraryX.com#54321)
This is not a URL!
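Minting the URI is plain string construction. A minimal sketch, assuming the library's domain is used as the base and the local record ID needs at most percent-encoding; the function name and the `use_fragment` switch are illustrative:

```python
from urllib.parse import quote

def mint_uri(base, record_id, use_fragment=False):
    """Combine a library-specific base with a local record ID."""
    separator = "#" if use_fragment else "/"
    # Percent-encode the ID so that unusual local identifiers stay URI-safe.
    return base.rstrip("/#") + separator + quote(str(record_id), safe="")

print(mint_uri("http://MyLibraryX.com", "54321"))
# http://MyLibraryX.com/54321
print(mint_uri("http://MyLibraryX.com", "54321", use_fragment=True))
# http://MyLibraryX.com#54321
```

The result only has to be unique and stable; nothing requires a web page to exist at that address.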
20. 4. Replace record ID with URI
URI         Attribute            Value
mlx:54321   (has) title          Museum archives: an introduction
mlx:54321   (has) author         Wythe, Deborah
mlx:54321   (has) date           2004
mlx:54321   (has) LCSH           Museum archives
mlx:54321   (has) media type     Electronic
mlx:54321   (has) content form   Text
“mlx” = qname (xmlns) = shorthand for “http://MyLibraryX.com/”
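The qname shorthand is a prefix lookup in both directions. A small sketch, with the helper names chosen for illustration:

```python
PREFIXES = {"mlx": "http://MyLibraryX.com/"}

def expand(qname, prefixes=PREFIXES):
    """mlx:54321 -> http://MyLibraryX.com/54321 (expects a qname, not a full URI)."""
    prefix, _, local = qname.partition(":")
    return prefixes[prefix] + local

def compact(uri, prefixes=PREFIXES):
    """Shorten a full URI to a qname when a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no known namespace: leave the URI as it is

print(expand("mlx:54321"))                     # http://MyLibraryX.com/54321
print(compact("http://MyLibraryX.com/54321"))  # mlx:54321
```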
21. 5. Find URIs for attributes
Attributes are modelled as RDF properties (predicates) in “element set” namespaces
  E.g. Dublin Core terms (dct); ISBD (isbd); FRBR (frbrer); RDA (rdaxxx); Bibliographic Ontology (bibo); etc.
Choose a namespace, find property with same (or closest) “meaning” (e.g. definition) as attribute
  Nearest property minimises loss of information
  Get URI for property
If no suitable property, choose another namespace
  Properties do not have to come from single namespace
  Match and mix!
22. 5 (cont). Find URI for title
http://purl.org/dc/terms/title (dct:title)
http://iflastandards.info/ns/isbd/elements/P1014 (isbd:P1014) hasTitleProper
http://RDVocab.info/Elements/titleProper (rdaGR1:titleProper)
23. 5 (cont). Find URI for author
dct:creator
rdarole:author
(isbd does not cover “headings”)
24. 5 (cont). Find URI for date
dct:date
isbd:P1018 hasDateOfPublicationProductionDistribution
rdaGr1:dateOfPublication
25. 5 (cont). Find URI for LCSH
LCSH is a subject vocabulary
  Controlled terms
So attribute is really “subject”
  And the term itself is the value
dct:subject
26. 5 (cont). Find URI for media type
Assuming record uses new ISBD Area 0 ...
isbd:P1003 hasMediaType
27. 5 (cont). Find URI for content form
Assuming record uses new ISBD Area 0 ...
isbd:P1001 hasContentForm
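The "match and mix" choice across namespaces can be sketched as a lookup. The candidate lists below are the properties named on the stage 5 slides. The preference order is a hypothetical local policy standing in for the human judgement of which definition is closest, which the code does not attempt.

```python
# Candidate properties per attribute, as listed on the stage 5 slides.
CANDIDATES = {
    "title": ["dct:title", "isbd:P1014", "rdaGR1:titleProper"],
    "author": ["dct:creator", "rdarole:author"],
    "date": ["dct:date", "isbd:P1018", "rdaGr1:dateOfPublication"],
    "LCSH": ["dct:subject"],
    "media type": ["isbd:P1003"],
    "content form": ["isbd:P1001"],
}

def choose_property(attribute, preferred_prefixes):
    """Return the first candidate from the most preferred namespace, else None."""
    for prefix in preferred_prefixes:
        for qname in CANDIDATES.get(attribute, []):
            if qname.split(":", 1)[0] == prefix:
                return qname
    return None  # no suitable property: widen the list of namespaces

# Hypothetical policy: prefer ISBD, then RDA roles, then Dublin Core.
PREFERENCE = ["isbd", "rdarole", "dct"]
chosen = {attribute: choose_property(attribute, PREFERENCE) for attribute in CANDIDATES}
for attribute, qname in chosen.items():
    print(f"{attribute:13} -> {qname}")
```

With this order the author falls through to rdarole:author because ISBD has no matching property, and LCSH falls through to dct:subject.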
28. 6. Replace attributes with URIs
URI         URI              Value
mlx:54321   isbd:P1014       Museum archives: an introduction
mlx:54321   rdarole:author   Wythe, Deborah
mlx:54321   isbd:P1018       2004
mlx:54321   dct:subject      Museum archives
mlx:54321   isbd:P1003       Electronic
mlx:54321   isbd:P1001       Text
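Stage 6 is a substitution over the statements. A minimal sketch, assuming the attribute-to-property choices shown in the table; statements with no chosen property are set aside rather than dropped, and the names are illustrative:

```python
PROPERTY_FOR = {
    "title": "isbd:P1014",
    "author": "rdarole:author",
    "date": "isbd:P1018",
    "LCSH": "dct:subject",
    "media type": "isbd:P1003",
    "content form": "isbd:P1001",
}

STATEMENTS = [
    ("mlx:54321", "title", "Museum archives: an introduction"),
    ("mlx:54321", "author", "Wythe, Deborah"),
    ("mlx:54321", "date", "2004"),
    ("mlx:54321", "LCSH", "Museum archives"),
    ("mlx:54321", "media type", "Electronic"),
    ("mlx:54321", "content form", "Text"),
]

def replace_attributes(statements, property_for):
    """Swap attribute labels for property qnames; keep unmapped ones for review."""
    mapped, unmapped = [], []
    for subject, attribute, value in statements:
        if attribute in property_for:
            mapped.append((subject, property_for[attribute], value))
        else:
            unmapped.append((subject, attribute, value))
    return mapped, unmapped

mapped, unmapped = replace_attributes(STATEMENTS, PROPERTY_FOR)
for triple in mapped:
    print(triple)
```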
29. 7. Find URIs for values
If object of a triple is a URI, it can link to the subject of another triple with the same URI
  Linked data!
Values from controlled vocabularies may have URIs
  Possible vocabularies: author, subject, ISBD Area 0
  NOT: title, date
For author: Virtual International Authority File (VIAF)
For LCSH: Library of Congress Authorities & Vocabularies
For ISBD Area 0: Open Metadata Registry
30. 7 (cont). Find URI for author
Author: Wythe, Deborah
VIAF: http://www.viaf.org/
viaf:31899419/#Wythe,+Deborah
31. 7 (cont). Find URI for subject (LCSH)
LCSH: Museum archives
LoC: http://id.loc.gov/authorities/
lcsh:/sh85088707#concept
32. 7 (cont). Find URIs for ISBD Area 0
Media type: Electronic
  ISBD media type isbdmt:T1002
Content form: Text
  ISBD Content form isbdcf:T1009
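Stage 7 can be sketched the same way: values drawn from controlled vocabularies are swapped for identifiers, while title and date stay as literals. The identifiers below are copied as they appear on the slides. The lookup table, the function names, and the one-line rendering are assumptions for illustration, and the rendering is for display only, not valid Turtle.

```python
# (property, literal value) -> identifier, copied from the stage 7 slides.
VALUE_URIS = {
    ("rdarole:author", "Wythe, Deborah"): "viaf:31899419/#Wythe,+Deborah",
    ("dct:subject", "Museum archives"): "lcsh:/sh85088707#concept",
    ("isbd:P1003", "Electronic"): "isbdmt:T1002",
    ("isbd:P1001", "Text"): "isbdcf:T1009",
}

TRIPLES = [
    ("mlx:54321", "isbd:P1014", "Museum archives: an introduction"),
    ("mlx:54321", "rdarole:author", "Wythe, Deborah"),
    ("mlx:54321", "isbd:P1018", "2004"),
    ("mlx:54321", "dct:subject", "Museum archives"),
    ("mlx:54321", "isbd:P1003", "Electronic"),
    ("mlx:54321", "isbd:P1001", "Text"),
]

def replace_values(triples, value_uris):
    """Return (subject, property, object, is_uri) with controlled values linked."""
    result = []
    for subject, prop, value in triples:
        if (prop, value) in value_uris:
            result.append((subject, prop, value_uris[(prop, value)], True))
        else:
            result.append((subject, prop, value, False))  # stays a literal
    return result

def render(triple):
    """Display helper: literals are quoted, identifiers are not."""
    subject, prop, obj, is_uri = triple
    return f"{subject} {prop} {obj if is_uri else chr(34) + obj + chr(34)} ."

linked = replace_values(TRIPLES, VALUE_URIS)
for triple in linked:
    print(render(triple))
```

Four of the six objects become identifiers that other triples can share; the title and date remain strings.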
37. Duplication and legacy records
Many copies of legacy records
  Copied and amended for local use
  Danger of minting multiple URIs for the same resource
National bibliographic agencies have significant role to play
  As memory/cultural institutions
  The linked-data memory/culture of a nation
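One way to picture the duplication risk is to group locally minted URIs by a normalised match key. This is a crude sketch under stated assumptions, not a method from the webinar: the second library, its URIs, and the title/author/date key are all hypothetical, and real deduplication needs far more careful matching.

```python
from collections import defaultdict

def normalise(text):
    """Lowercase and drop everything except letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())

def match_key(record):
    return (normalise(record["title"]), normalise(record["author"]), record["date"])

def group_duplicates(records_by_uri):
    """Return lists of URIs whose records share the same match key."""
    groups = defaultdict(list)
    for uri, record in records_by_uri.items():
        groups[match_key(record)].append(uri)
    return [sorted(uris) for uris in groups.values() if len(uris) > 1]

# Hypothetical: library Y copied the record and amended the punctuation.
records_by_uri = {
    "http://MyLibraryX.com/54321": {
        "title": "Museum archives: an introduction",
        "author": "Wythe, Deborah", "date": "2004"},
    "http://MyLibraryY.example/rec/987": {
        "title": "Museum archives : an introduction",
        "author": "Wythe, Deborah", "date": "2004"},
    "http://MyLibraryY.example/rec/988": {
        "title": "Cataloguing has a future",
        "author": "Lee, T. B.", "date": "2011"},
}

print(group_duplicates(records_by_uri))
# [['http://MyLibraryX.com/54321', 'http://MyLibraryY.example/rec/987']]
```

Each group is a candidate for sharing one URI instead of several.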
38. FRBRization
FRBR splits record into four functional parts
  User-centred functions
Subject of a FRBR triple is one of the parts, not the resource as a whole
  But subject of ISBD triple is the resource as a whole
Class collisions can be avoided by using unbounded (no domain or range) versions of properties
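The change of subject can be illustrated with a small sketch. It assigns each statement to one of the four FRBR parts (Work, Expression, Manifestation, Item) and gives each part its own URI. The allocation table follows the grouping in the catalogue-record history slides of this deck. The record URI and the per-part URI pattern are hypothetical.

```python
# Which FRBR part each attribute describes (illustrative allocation).
PART_FOR = {
    "author": "work",
    "subject": "work",
    "content type": "expression",
    "title": "manifestation",
    "carrier type": "manifestation",
    "provenance": "item",
}

def frbrize(resource_uri, statements):
    """Re-home each statement on the URI of the FRBR part it describes."""
    triples = []
    for attribute, value in statements:
        part_uri = f"{resource_uri}/{PART_FOR[attribute]}"  # hypothetical URI pattern
        triples.append((part_uri, attribute, value))
    return triples

statements = [
    ("author", "Lee, T. B."),
    ("title", "Cataloguing has a future"),
    ("content type", "Spoken word"),
    ("carrier type", "Audio disc"),
    ("subject", "Metadata"),
    ("provenance", "Donated by the author"),
]

frbr_triples = frbrize("http://MyLibraryX.com/1", statements)
for triple in frbr_triples:
    print(triple)
print(len({subject for subject, _, _ in frbr_triples}), "distinct subjects")  # 4 distinct subjects
```

In the ISBD view all six statements would share the single resource URI as subject; here they are spread over four.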
39. A short history of the evolution of the library catalogue record
40. In the beginning ...
Lee, T. B.
Cataloguing has a future. - Audio disc (Spoken word). - Donated by the author.
1. Metadata
... the catalogue card
41. From flat-file record ...
Bibliographic description
  Author: Lee, T. B.
  Title: Cataloguing has a future
  Content type: Spoken word
  Carrier type: Audio disc
  Subject: Metadata
  Provenance: Donated by the author
Name authority
  Name:
  Biography: ...
Subject authority
  Term:
  Definition: ...
... to relational record
42. From flat-file description ...
Bibliographic description
  Author:
  Title: Cataloguing has a future
  Content type: Spoken word
  Carrier type: Audio disc
  Subject:
  Provenance: Donated by the author
Name authority
  Name: Lee, T. B.
  Biography: ...
Subject authority
  Term: Metadata
  Definition: ...
Work
  Author:
  Subject:
Expression
  Content type: Spoken word
Manifestation
Item
... to FRBR record
43. From FRBR record ...
Work
  Author:
  Subject:
Expression
  Content type: Spoken word
Manifestation
  Title: Cataloguing has a future
  Carrier type: Audio disc
Item
  Donor:
  Provenance: Donated by the author
Name authority
  Name: Lee, T. B.
Subject authority
  Term: Metadata
RDA content type
  Term:
RDA carrier type
  Term:
Amazon/Publisher
  Title:
... to extinction!
44. Where is the record?
Implicit, not explicit
Everywhere and nowhere
A semantic Web will allow machines to create the record just-in-time
  We will not have to maintain records just-in-case
The user will have control over the presentation
  I want to see an archive or library or museum or Amazon or Google or Flickr or ? display
And by avoiding duplication, we can all get on with describing new stuff ...
45. The hyperdimensional (Tardis) card
W3C; Library; Audio shop; Spoken word archive; Lee Museum
Lee, T. B.
Cataloguing has a future. - Audio disc (Spoken word). - Donated by the author.
1. Metadata
“TARDIS four port USB hub, for office-bound Time Lords: Open a time vortex on your desk” – Pocket-lint
46. Metadata focus
Shift of focus of metadata creation, maintenance, storage, preservation (by professionals, amateurs, machines)
  From Record
  To Statement(s) = triple(s)
But metadata display ...
  ... aggregates triples (from multiple sources) to create records on the fly
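Building a record on the fly amounts to gathering every triple about one subject from whatever sources are at hand. A minimal sketch, in which the two sources and their contents are hypothetical sample data reusing identifiers from the earlier slides:

```python
from collections import defaultdict

def build_record(subject, *sources):
    """Aggregate all triples about one subject into a property -> values record."""
    record = defaultdict(list)
    for source in sources:
        for s, prop, obj in source:
            if s == subject and obj not in record[prop]:
                record[prop].append(obj)  # skip statements already seen
    return dict(record)

# Hypothetical sources publishing triples about the same resource.
library_triples = [
    ("mlx:54321", "isbd:P1014", "Museum archives: an introduction"),
    ("mlx:54321", "dct:subject", "lcsh:/sh85088707#concept"),
]
other_triples = [
    ("mlx:54321", "isbd:P1018", "2004"),
    ("mlx:54321", "dct:subject", "lcsh:/sh85088707#concept"),  # duplicate statement
    ("mlx:99999", "isbd:P1018", "1999"),                       # different resource
]

record = build_record("mlx:54321", library_triples, other_triples)
for prop, values in record.items():
    print(prop, values)
```

The record exists only for as long as the display needs it; what is stored and maintained are the triples.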