From Strings to Things: Cataloging & Linked Data
1. RBMS Preconference “Futures”, San Diego, 21 June 2012
Michael Panzer
Assistant Editor, DDC
OCLC
@MichaelPanzer
The world’s libraries. Connected.
2. Topics for today
• Cataloging from a linked data perspective
• Describing vs. identifying
• Mapping vs. linking
• Interlude: Technical basis for linked data
• Library Linked Data: 2 Examples
• Dewey.info
• WorldCat.org + schema.org
• Why should librarians care?
3. Define “metadata”
“Metadata is data associated with objects which relieves their potential users of having to have full advance knowledge of their existence or characteristics.”
Dempsey/Heery (1997)
Implications for descriptive cataloging:
- produces surrogates
- transforms textual information
7. Cataloging as string transformation
100 1# $a Peretz, Isaac Leib,
$d 1851 or 2-1915.
240 10 $a Short stories. $l English.
$k Selections
245 10 $a Stories and pictures / $c by Isaac
Loeb Perez ; translated from the
Yiddish by Helena Frank.
260 ## $a Philadelphia : $b Jewish Publications
Society of America, 1906.
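To make "cataloging as string transformation" concrete, here is a minimal sketch that takes one of the display lines above apart again. The function names are hypothetical, and the reading of "1851 or 2" as "1851 or 1852" is an assumption about the heading convention, not something the slides state.

```python
import re

def parse_marc_field(line: str) -> dict:
    """Split a MARC display line such as '100 1# $a ... $d ...' into parts."""
    tag, indicators, rest = line.split(" ", 2)
    # Each subfield starts with '$' followed by a one-character code.
    subfields = [(code, value.strip())
                 for code, value in re.findall(r"\$(\w)\s*([^$]*)", rest)]
    return {"tag": tag, "indicators": indicators, "subfields": subfields}

def parse_dates(d: str):
    """Read '1851 or 2-1915' as candidate birth years plus a death year (assumed convention)."""
    m = re.fullmatch(r"(\d{4})(?: or (\d))?-(\d{4})\.?", d)
    if m is None:
        raise ValueError(f"unrecognized date string: {d!r}")
    birth = [int(m.group(1))]
    if m.group(2):
        # 'or 2' replaces the last digit of the first year.
        birth.append(int(m.group(1)[:3] + m.group(2)))
    return birth, int(m.group(3))

field = parse_marc_field("100 1# $a Peretz, Isaac Leib, $d 1851 or 2-1915.")
print(field)
print(parse_dates(dict(field["subfields"])["d"]))
```

Even this small case shows that part of the meaning sits in punctuation and position, so any consumer of the record has to know the string conventions in advance.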
14. Strings bundled as records
100 1# $a Peretz, Isaac Leib, $d 1851 or 2-1915.
240 10 $a Short stories. $l English. $k Selections
245 10 $a Stories and pictures / $c by Isaac Loeb Perez ;
translated from the Yiddish by Helena Frank.
260 ## $a Philadelphia : $b Jewish Publications Society
of America, 1906.
Cataloging as string transformation still
operates (cum grano salis) within the confines
of a card catalog
• Focus on record as “end product”
• Record as mark-up of text
• But: transformations can be extremely
sophisticated!
15. Strings as keys
Personal name:
100 1# $a Peretz, Isaac Leib, $d 1851 or 2-1915
Geographic heading:
151 ## $a Grand Canyon (Ariz. : City)
17. Strings as keys
Identification by description
• Heading construction based on elimination of ambiguities
• Parts are added to increase specificity of description to avoid conflicts
with other headings
• Use of abbreviations, punctuation, special delimiters
• Form of heading carries meaning (dates, etc.)
• Headings relate to other headings
• Keys act as access points for records, not described entities
≈ cutting a key based on a vague
description of a lock that may not even exist
18. Clustering of records by co-occurrence of complex text strings
Bib record
001 $a 3157881
100 1# $a Peretz, Isaac Leib, $d 1851 or 2-1915
...
Authority record
001 $a oca00046566#
100 1# $a Peretz, Isaac Leib, $d 1851 or 2-1915
400 1# $a Perez, Isaac Loeb, $d 1851 or 2-1915
400 1# $a פרץ י. ל., $d 1851 or 2-1915
400 1# $a פּרץ. י. ל.
...
Bib record
001 $a 21331058
100 1# $6 880-01 $a Peretz, Isaac Leib, $d 1851 or 2-1915
880 1# $6 100-01/(2/r $a פרץ, י. ל.
...
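The clustering described on this slide can be sketched as grouping records by their exact heading string. The two record IDs and the headings come from the slide; the record "hypothetical-1" and the function name are invented for illustration.

```python
from collections import defaultdict

# Bib record IDs and headings as shown on the slide; "hypothetical-1" is invented.
bib_records = [
    {"id": "3157881", "heading": "Peretz, Isaac Leib, 1851 or 2-1915"},
    {"id": "21331058", "heading": "Peretz, Isaac Leib, 1851 or 2-1915"},
    {"id": "hypothetical-1", "heading": "Perez, Isaac Loeb, 1851 or 2-1915"},
]

# Authority record oca00046566#: preferred heading (100) and one variant (400).
authority = {
    "preferred": "Peretz, Isaac Leib, 1851 or 2-1915",
    "variants": {"Perez, Isaac Loeb, 1851 or 2-1915"},
}

def cluster(records, authority=None):
    """Group record IDs by heading string, optionally mapping variants to the preferred form."""
    clusters = defaultdict(list)
    for rec in records:
        key = rec["heading"]
        if authority and key in authority["variants"]:
            key = authority["preferred"]
        clusters[key].append(rec["id"])
    return dict(clusters)

exact = cluster(bib_records)                  # plain string co-occurrence
controlled = cluster(bib_records, authority)  # variants folded in via the authority record
print(len(exact), len(controlled))
```

The cluster holds only as long as every record carries character-for-character the same string, which is why the key acts as an access point for records rather than for the person.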
27. Links as keys
Identification by denotation
“Real-world object” (non-information resource): http://viaf.org/viaf/27070050
• owl:sameAs http://dbpedia.org/resource/Isaac_Leib_Peretz
• rdf:type rdaEnt:Person
• rdf:type foaf:Person
Web document (information resource): http://viaf.org/viaf/27070050/
• connected to the real-world object by foaf:focus
29. Links as keys
Identification by denotation
• Identifier acquired from defining data source
• Actionable/verifiable by dereferencing through web
protocols
• Opaque
• Link directly to things
• Allow targeted knowledge acquisition across domain
boundaries
≈ Key is inseparably connected to an entity (person, place,
object, concept, …) and unlocks further knowledge
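As a rough illustration of identification by denotation, the statements from the preceding diagram can be written as subject/predicate/object triples. This is a sketch in plain Python tuples, not an RDF library; the helper name is hypothetical, and the direction of the foaf:focus statement (from the web document to the person) is an assumption, since the flattened diagram does not show arrow directions.

```python
PERSON = "http://viaf.org/viaf/27070050"     # the real-world object
DOCUMENT = "http://viaf.org/viaf/27070050/"  # the web document describing it

triples = {
    (PERSON, "owl:sameAs", "http://dbpedia.org/resource/Isaac_Leib_Peretz"),
    (PERSON, "rdf:type", "rdaEnt:Person"),
    (PERSON, "rdf:type", "foaf:Person"),
    (DOCUMENT, "foaf:focus", PERSON),  # direction assumed, see lead-in
}

def objects(subject: str, predicate: str) -> list:
    """All objects stated for a given subject and predicate, sorted for stable output."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)

print(objects(PERSON, "rdf:type"))
print(objects(DOCUMENT, "foaf:focus"))
```

The identifier itself is opaque: nothing about the person can be read off the URI string, and everything known about him is stated as separate links.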
32. Exploring the knowledge graph
http://viaf.org/viaf/27070050
• foaf:name “Peretz, Isaac Leib, 1851 or 2-1915”
• rdf:type foaf:Person
• dct:created http://www.worldcat.org/oclc/21331058
• owl:sameAs http://dbpedia.org/resource/Isaac_Leib_Peretz
http://dbpedia.org/resource/Isaac_Leib_Peretz
• dbpedia-owl:dateOfBirth “1851-05-18”
• dbpedia-owl:dateOfDeath “1915-04-03”
• foaf:depiction http://upload.wikimedia.org/wikipedia/commons/2/2d/I_L_Peretz_postcard.jpg
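Exploring the graph amounts to following links from one node to the next and collecting what each source states. The sketch below keeps the slide's statements in an in-memory dictionary instead of fetching them from the web; which node each property hangs from is read off the flattened diagram and is therefore an assumption, as is the helper name.

```python
VIAF = "http://viaf.org/viaf/27070050"
DBPEDIA = "http://dbpedia.org/resource/Isaac_Leib_Peretz"

# node -> list of (predicate, object); attachment of properties to nodes is assumed.
graph = {
    VIAF: [
        ("foaf:name", "Peretz, Isaac Leib, 1851 or 2-1915"),
        ("rdf:type", "foaf:Person"),
        ("dct:created", "http://www.worldcat.org/oclc/21331058"),
        ("owl:sameAs", DBPEDIA),
    ],
    DBPEDIA: [
        ("dbpedia-owl:dateOfBirth", "1851-05-18"),
        ("dbpedia-owl:dateOfDeath", "1915-04-03"),
        ("foaf:depiction",
         "http://upload.wikimedia.org/wikipedia/commons/2/2d/I_L_Peretz_postcard.jpg"),
    ],
}

def describe(uri, seen=None):
    """Collect statements about uri, following owl:sameAs links to other sources."""
    seen = set() if seen is None else seen
    if uri in seen:  # guard against sameAs cycles
        return {}
    seen.add(uri)
    facts = {}
    for predicate, obj in graph.get(uri, []):
        if predicate == "owl:sameAs":
            for key, value in describe(obj, seen).items():
                facts.setdefault(key, value)
        else:
            facts.setdefault(predicate, obj)
    return facts

facts = describe(VIAF)
print(facts["foaf:name"], facts["dbpedia-owl:dateOfBirth"])
```

Starting from the library identifier, the exact birth date that the heading string could only give as "1851 or 2" becomes reachable through a single owl:sameAs hop.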
38. Metadata as textual data (aka “the well-wrought urn”)
• “Record” as paradigm
• Emphasis on creation of “literal” descriptions
• Structured, but flat
• Schema-dependent
• Assembled from strings
• Collection of complex values with meaning also carried by syntax and relative position
39. Metadata as linked data (aka “the deconstructed urn”)
• Graph as paradigm
• Emphasis on creation of properties between entities
• Structured, but open-ended
• Schema-free
• Assembled from building blocks
• Collection of knowledge statements
41. I’m confused; this is linked data?
Technical Interlude
46. Linked data principles: Rules, stars, and bags of chips
Four rules (Tim Berners-Lee):
1. Use URIs for things (identification)
2. Use HTTP URIs so people can look up those names (dereferenceability)
3. When someone looks up a URI, provide useful information, using the standards
4. Include links to other URIs, so that they can discover more things (relationships)
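Rules 2 and 3 can be illustrated with an ordinary HTTP request that asks for a machine-readable description. The sketch below only builds the request with Python's standard library and does not send it; the media type is an assumption, and the slides do not say which formats any particular service returns.

```python
from urllib.request import Request

def rdf_request(uri: str, media_type: str = "application/rdf+xml") -> Request:
    """Build a GET request that asks for an RDF description of the named thing."""
    return Request(uri, headers={"Accept": media_type})

req = rdf_request("http://viaf.org/viaf/27070050")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# Sending it would be a single extra call, left out here to stay offline:
#   from urllib.request import urlopen
#   with urlopen(req) as response:
#       data = response.read()
```

The point of the rule is that the identifier and the lookup mechanism are the same thing: no separate registry or resolver is needed beyond the web itself.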
52. Let’s explore the first rule for a second
• As seen, there is an important distinction between identifiers for things and for descriptions of things
• URLs for things that have no web representation
• Reusing web infrastructure to resolve URLs for non-information resources
• Web of data overlays web of things
• Big idea of early semantic web; linked data approach more prudent
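One common way to reuse web infrastructure for non-information resources is to answer a request for the thing's URI with a redirect (HTTP 303 See Other) to a document that describes it. The sketch below simulates that pattern with an in-memory lookup table; the function names and the description text are hypothetical, and the slides do not state that any specific service responds exactly this way.

```python
# Thing URI -> URI of the document that describes it (pairing taken from the VIAF example).
THINGS = {"http://viaf.org/viaf/27070050": "http://viaf.org/viaf/27070050/"}
# Document URI -> content (placeholder text, invented for illustration).
DOCUMENTS = {"http://viaf.org/viaf/27070050/": "description of Isaac Leib Peretz"}

def lookup(uri: str):
    """Return (status, payload) the way a linked data server might."""
    if uri in THINGS:
        return 303, THINGS[uri]     # a person cannot be sent over HTTP: See Other
    if uri in DOCUMENTS:
        return 200, DOCUMENTS[uri]  # a document can
    return 404, None

def dereference(uri: str, max_hops: int = 5):
    """Follow 303 redirects until a document (or an error) is reached."""
    for _ in range(max_hops):
        status, payload = lookup(uri)
        if status == 303:
            uri = payload
            continue
        return status, uri, payload
    raise RuntimeError("too many redirects")

print(dereference("http://viaf.org/viaf/27070050"))
```

The client ends up with two distinct identifiers, one for the person and one for the description, which is exactly the distinction the first bullet insists on.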
53. Examples for publishing library linked data
DDC 23 as linked data
60. Literals to Links
http://dewey.info/class/576.8/
• Broader terms: 576
• Narrower terms: 576.8…, 591.38, …
• Related terms: 231.7652
• Relative Index terms: Homoplasy
• Mapped terms: FAST, GND
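Moving from literals to links means that each notation around 576.8 becomes a URI of its own rather than a string inside a record. The sketch below builds such links from the URI pattern visible on the slide. The SKOS property names are an assumption about the modeling, the assignment of 591.38 and 231.7652 to "narrower" and "related" is read from the flattened slide layout, and the function names are hypothetical.

```python
BASE = "http://dewey.info/class/"

def class_uri(notation: str) -> str:
    """URI for a class, following the pattern http://dewey.info/class/576.8/."""
    return f"{BASE}{notation}/"

def concept_links(notation, broader=(), narrower=(), related=()):
    """Express hierarchy and see-also relations as (subject, predicate, object) links."""
    subject = class_uri(notation)
    groups = (
        ("skos:broader", broader),
        ("skos:narrower", narrower),
        ("skos:related", related),
    )
    return [(subject, predicate, class_uri(n))
            for predicate, notations in groups
            for n in notations]

links = concept_links("576.8", broader=["576"], narrower=["591.38"], related=["231.7652"])
for link in links:
    print(link)
```

Each object in these links can be dereferenced in turn, so the classification can be walked like any other part of the graph.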
64. Dewey Linked Data Timeline
• 2009: dewey.info launched with DDC Summaries in 11 languages
• 2010: Abridged Edition 14 in 3 languages
• 2011: 12th language added to DDC Summaries in dewey.info
• 2012: DDC 23 in dewey.info
• New data format and distribution environment introduced for Dewey data, then extended to dewey.info
• British Library, FAO, DNB, DDC in LOD Cloud
65. Dewey Distribution Environment
[Diagram: a central distribution server connects the OCLC ESS, translation software instances (German EN/DE, Italian EN/IT, French EN/FR, Arabic EN/AR, Norwegian EN/NO, Swedish EN/SV), WebDewey instances (OCLC EN, German DE, French FR, Italian IT, Norwegian NO, Swedish mixed EN/SV, ...), and Linked DDC Web Services.]
67. Dewey Linked Data Timeline: Next steps
• 2009: dewey.info launched with DDC Summaries in 11 languages
• 2010: Abridged Edition 14 in 3 languages
• 2011: 12th language added to DDC Summaries in dewey.info
• 2012: DDC 23 in dewey.info
• 2013: DDC 23 facets, FAST, MSC, Table 2 + GeoNames
• New data format and distribution environment introduced for Dewey data, then extended to dewey.info
• British Library, FAO, DNB, DDC in LOD Cloud
74. Use Case: Links to GeoNames
NYT articles
• dewey.info: T2—797788 Tacoma, http://dewey.info/class/2--797788/
• GeoNames: http://sws.geonames.org/5812944/
• NYT: Tacoma (Wash), http://data.nytimes.com/N51488338864578420851
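The use case works because two independent datasets point at the same GeoNames URI. The sketch below treats the links as plain (source, target) pairs and finds resources that share a target; the actual link properties and their directions are not given on the slide, so that representation is an assumption, and the function name is hypothetical.

```python
DEWEY = "http://dewey.info/class/2--797788/"           # T2--797788 Tacoma
GEONAMES = "http://sws.geonames.org/5812944/"
NYT = "http://data.nytimes.com/N51488338864578420851"  # Tacoma (Wash)

# (source, target) pairs; the real link predicates are not shown on the slide.
links = [
    (DEWEY, GEONAMES),
    (NYT, GEONAMES),
]

def linked_via_shared_target(start: str, links) -> list:
    """Resources that link to at least one of the same targets as start."""
    targets = {target for source, target in links if source == start}
    return sorted(source for source, target in links
                  if target in targets and source != start)

print(linked_via_shared_target(DEWEY, links))
```

Neither dataset needs to know about the other: a Dewey class reaches NYT articles about Tacoma only because both sides agreed on one identifier for the place.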
75. Two Views of T2—6626 Niger
81. Closing the circle, traversing the graph
Why should librarians care?
84. Better living through LOD: role of libraries and librarians
• Librarians are uniquely qualified,
they just have to get more involved
• From producing metadata records
(back?) to extracting and engineering
knowledge
• Do we really want to leave the field
to the engineers?
• Borges vs. schema.org
87. Borges’ taxonomy vs. schema.org: but which one is which?
Animals
a) belonging to the Emperor
b) embalmed
c) trained
d) piglets
e) sirens
f) fabulous
g) stray dogs
h) included in this classification
i) trembling like crazy
j) innumerables
k) drawn with a very fine camelhair brush
l) et cetera
m) just broke the vase
n) from a distance look like flies
• Intangible
• Enumeration
• JobPosting
• Language
• Offer
• Quantity
• Rating
• Structured Value
• Organization
• EducationalOrganization
• CollegeOrUniversity
• ElementarySchool
• HighSchool
• MiddleSchool
• Preschool
• School
89. Better living through LOD: role of libraries and librarians
• Librarians are uniquely qualified,
they just have to get more involved
• From producing metadata records
(back?) to extracting and engineering
knowledge
• Do we really want to leave the field
to the engineers?
• Borges vs. schema.org
• Librarians can continue to do what they have always done best ... We have just been handed sharper tools!