This talk was provided by Tim Augur of Innovative during the 11th Annual NISO-BISG Forum, Ensuring the Integrated Information Experience, held on June 23, 2017 during ALA in Chicago.
2. Program
Your Partner for Library Success innovative CONFIDENTIAL
§ 2:15pm – 3:00pm Ingesting Metadata By Existing & Emerging Systems
§ Once aggregated, metadata must be integrated appropriately into a wide variety of inventory systems, whether the inventory be of available editions, a collection of scholarly output or roster of human expertise. Producers, vendors and librarians discuss the current challenges and potential hurdles to be overcome when integrating metadata from a variety of sources into an even broader variety of systems.
3. For better or worse
§ Love/hate relationship ingesting MARC metadata from a variety of catalogs
⎻ Union Catalog building
⎻ Bibliographic utility
§ Our goal was to build algorithms to automate the matching of the same manifestation of the same work from 1000s of libraries.
§ For all of the detailed definition of cataloging rules and MARC standards…
⎻ There is a lot of variation in practice
⎻ (But, you knew that…but what you may not know is just how expensive it is to get it right…)
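The matching goal on this slide can be illustrated with a normalized match key. This is only a minimal sketch, not the algorithm Innovative built: the field names (`title`, `publisher`, `year`, `format`, `library`) and the choice of key fields are assumptions made for illustration, and a production matcher would need far more than this to handle the variation in practice the slide describes.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def match_key(record: dict) -> tuple:
    """Build a comparison key from a few descriptive fields.

    The field names are hypothetical; real records would first be
    mapped from MARC fields into a structure like this.
    """
    return (
        normalize(record.get("title", "")),
        normalize(record.get("publisher", "")),
        record.get("year", ""),
        record.get("format", ""),
    )


def group_by_manifestation(records: list) -> dict:
    """Group contributing libraries under a shared match key."""
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec)].append(rec["library"])
    return dict(groups)


records = [
    {"library": "A", "title": "Moby-Dick; or, The whale /",
     "publisher": "Harper & Brothers,", "year": "1851", "format": "print"},
    {"library": "B", "title": "Moby Dick, or, the Whale",
     "publisher": "Harper & Brothers", "year": "1851", "format": "print"},
    {"library": "C", "title": "Moby Dick, or, the Whale",
     "publisher": "Harper & Brothers", "year": "1851", "format": "ebook"},
]
groups = group_by_manifestation(records)
```

Libraries A and B fall into one group despite punctuation differences, while the ebook from library C stays separate because format is part of the key.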
4. Development considerations
§ In development, the 80/20 rule applies...kinda…
⎻ Identification of patterns and creation of algorithms for that 80% is easy (for matching)
⎻ Then, you have the 20% and that takes a very long time and many experts to make any progress…
⎻ Then you move on to quality and, again, the 80/20 rule applies and then…
⎻ The 20% takes very large magnitudes of effort to identify algorithms that result in small gains
⎻ WHY?
5. There are always challenges
§ GIGO principle
§ Everyone has a local copy of a single bibliographic work
⎻ That is the way it is now. Legacy persists
⎻ Moving forward, with Linked Data, what will happen in practice?
§ Don’t get me started on the vendors… Innovative included…
⎻ Supporting GIGO, say what?!
⎻ ocm stripping is a good example
⎻ Doing “just enough” (and not doing enough)
§ Derived records from SkyRiver…we could have done better to prevent proliferation of dups
⎻ Mash ups and not persisting local metadata
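The "ocm stripping" example can be made concrete with a small sketch. It assumes the slide refers to removing the `ocm` prefix from OCLC control numbers, and it relies on the commonly documented prefix convention (`ocm` for numbers of up to 8 digits, zero-padded; `ocn` for 9 digits; `on` for 10 or more). The function names are hypothetical. The idea shown is to compare on the bare digits while still storing the value as received, so that nothing is lost at ingest.

```python
import re

# Optional "(OCoLC)" marker, optional ocm/ocn/on prefix, leading zeros, digits.
_OCLC_PATTERN = re.compile(
    r"^\s*(?:\(OCoLC\))?\s*(?:ocm|ocn|on)?\s*0*(\d+)\s*$",
    re.IGNORECASE,
)


def oclc_digits(raw: str):
    """Return the bare OCLC number as a digit string, or None if unrecognized."""
    match = _OCLC_PATTERN.match(raw)
    return match.group(1) if match else None


def oclc_prefixed(digits: str) -> str:
    """Rebuild the prefixed form from bare digits (assumed convention)."""
    if len(digits) <= 8:
        return "ocm" + digits.zfill(8)
    if len(digits) == 9:
        return "ocn" + digits
    return "on" + digits


def same_oclc_number(a: str, b: str) -> bool:
    """Compare two control numbers regardless of prefix or zero padding."""
    left, right = oclc_digits(a), oclc_digits(b)
    return left is not None and left == right
```

With this approach, `same_oclc_number("ocm00012345", "(OCoLC)12345")` treats the stripped and unstripped forms as the same identifier, which is one way to avoid creating duplicates when sources disagree on formatting.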
6. Rather safe than sorry, regardless of the use
§ If building a merged catalog,
⎻ Try to not make a franken-record. Keep different manifestations apart.
⎻ Try to be discerning at the database level and let other software layers connect up and express the relationships
⎻ Apply quality controls everywhere (ingest and export)
⎻ Pre-process as much as possible
⎻ Mapping mapping mapping
⎻ Report regularly and identify irregularities and share your insights…
⎻ Accommodate different sources; design for different formats
⎻ Use more than encoding level to determine quality
⎻ Use fuzzy logic (it’s needed)
⎻ Make it configurable for all sorts of reasons…formats and string replacements are a few examples…
⎻ Be strict and lenient for different cases
⎻ Give users options
⎻ Don’t worry about making “mistakes”; they can be fixed with later iterations.
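Several of the recommendations above (fuzzy logic, configurable string replacements, strict and lenient modes) can be combined in one small sketch. This is an illustration under assumptions, not the approach described in the talk: the replacement table, the threshold values, and the function names are all hypothetical, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever similarity measure a real system would use.

```python
from difflib import SequenceMatcher

# Hypothetical configuration: both the replacements and the thresholds
# are placeholders meant to be tuned per source and per format.
DEFAULT_CONFIG = {
    "replacements": {"&": "and", "vol.": "volume"},
    "thresholds": {"strict": 0.95, "lenient": 0.85},
}


def apply_replacements(text: str, replacements: dict) -> str:
    """Lowercase, apply configured string replacements, collapse whitespace."""
    result = text.lower()
    for old, new in replacements.items():
        result = result.replace(old, new)
    return " ".join(result.split())


def similarity(a: str, b: str, config: dict = DEFAULT_CONFIG) -> float:
    """Return a similarity ratio between 0.0 and 1.0 after pre-processing."""
    left = apply_replacements(a, config["replacements"])
    right = apply_replacements(b, config["replacements"])
    return SequenceMatcher(None, left, right).ratio()


def is_match(a: str, b: str, mode: str = "strict",
             config: dict = DEFAULT_CONFIG) -> bool:
    """Decide a match using the threshold configured for the given mode."""
    return similarity(a, b, config) >= config["thresholds"][mode]
```

Keeping the replacements and thresholds in configuration rather than in code means a mistaken setting can be corrected in a later iteration without changing the matcher itself. For instance, "The whale" and "The whales" match in lenient mode but not in strict mode with the placeholder thresholds above.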
7. Don’t trade in one problem for another
§ Vendors: offer solutions
⎻ While not monetarily advantageous, we took time to design a “format family” algorithm for our mobile (our solution ISN’T perfect)
⎻ But we can combine different solutions to remedy…
⎻ And we can envision together a better future…
⎻ What’s our goal?
⎻ Who do we serve?
⎻ What does our work do to aid those we serve?
⎻ Are we questioning our motives? Are we reflecting on our own actions?
8. A new dance step?
§ Can we work more closely with publishers/content providers to solve our problems?
⎻ ONIX & MARC: can they dance together?
⎻ Grouping families of works
⎻ Subject mappings
⎻ Publishers and imprints
⎻ Pre-ISBN publications
⎻ Adoption of new standards
⎻ How does this all fit together in a BIBFRAME world?
§ How do we transition?
⎻ Is there an evolutionary path? A unified API, perhaps?
⎻ How efficient can it be? Another case of the 20%?
§ Evolving standards to meet our collective business needs is a great approach