43. 43
• Relationships
defined for each
media type
• Managed
separately from
the article content
• The full set of
metadata was
available to all
articles
45. 45
• Authors just
selected the
primary category
• Related metadata
pulled in
automatically
• Updates appeared
on all articles
*Metadata categories and
relationships were managed
by a dedicated data librarian
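The arrangement on these two slides (authors pick one primary category, related metadata lives in a separately managed registry) can be sketched roughly as below. This is only an illustration: `CATEGORY_REGISTRY`, `Article`, and `resolve_metadata` are hypothetical names, and the category values are made up.

```python
from dataclasses import dataclass

# Hypothetical registry, maintained by the data librarian and stored
# apart from the article content. The entries are invented examples.
CATEGORY_REGISTRY = {
    "birds": {"related": ["ornithology", "migration"]},
}


@dataclass
class Article:
    title: str
    primary_category: str  # the only metadata the author selects


def resolve_metadata(article, registry):
    """Look up related metadata at read time instead of copying it into the article."""
    entry = registry.get(article.primary_category, {})
    return {
        "primary": article.primary_category,
        "related": list(entry.get("related", [])),
    }
```

Because nothing is copied into the article itself, a change to the registry shows up on every article that uses that category the next time it is resolved.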
49. 49
• Wal-Mart sold gallon jars of
Vlasic pickles for $2.97.
• A popular item – priced so low
it nearly put Vlasic out of
business.
• By achieving their goals, they
put themselves in a position
they might not survive.
See: http://www.fastcompany.com/47593/wal-mart-you-dont-know
60. 60
• An informal post on August 4th
• Notification sent out September 30th
• Shut down October 31st
61. 61
“What happened to my web page on my husband, Bob Champine,
that took me many years to put together on his career and which
meant a lot to me and to the aviation community. I noticed with 9.0
I lost the left margin and the picture of him exiting the X-1. I need to
restore it to the internet as it is history. Please tell me what to do. I
will be glad to retype it, I just don’t want it lost to the world. I need
help. Gloria Champine”
63. 63
“Archive Team is a loose collective of rogue archivists,
programmers, writers and loudmouths dedicated to
saving our digital heritage. Since 2009 this variant
force of nature has caught wind of shutdowns,
shutoffs, mergers, and plain old deletions - and done
our best to save the history before it's lost forever.”
72. 72
• In 6 months Archive Team saved 900 GB
• Estimated 4-5 TB total
• Other people saved additional pages,
but probably ¼ is gone forever
• For many people, Geocities was their
first web presence
76. 76
Those screenshots were automatically generated from
Geocities sites rescued by Archive Team in 2009
See more at One Terabyte of Kilobyte Age Photo Op:
http://oneterabyteofkilobyteage.tumblr.com/
77. 77
Due to lack of metadata:
• The rescued data was less useful
• Really bulky files
• Case-sensitive filenames difficult to access and read
• Not in a web-ready format (WARC)
• The process was less efficient and more error prone
• Poor tracking of completed activity
• Lots of duplication of data
• Took way too long (6 months vs. 3 days)
• Could have gotten all the data in a month (estimated)
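One way tracking metadata could address the "poor tracking" and "duplication" points above is a simple manifest recorded during the rescue. The sketch below is an assumption about what that might look like, not a description of Archive Team's tooling; `record_download` and `already_done` are hypothetical helpers and the URLs in any usage would be placeholders.

```python
import hashlib


def record_download(manifest, url, content):
    """Store a content digest per URL so repeats and duplicates are visible."""
    digest = hashlib.sha256(content).hexdigest()
    # Find an earlier URL whose bytes were identical, if there is one.
    duplicate_of = next(
        (u for u, entry in manifest.items() if entry["sha256"] == digest),
        None,
    )
    manifest[url] = {
        "sha256": digest,
        "bytes": len(content),
        "duplicate_of": duplicate_of,
    }
    return manifest[url]


def already_done(manifest, url):
    """True if this URL was fetched before, so a volunteer can skip it."""
    return url in manifest
```

With a shared manifest like this, a volunteer can check `already_done` before fetching, and identical pages saved under different URLs are flagged rather than stored twice.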
94. 94
• Too small a pool of volunteers, and
their drive didn’t last long
• Tools didn’t provide immediate
feedback/satisfaction. They had
to email their inputs and wait.
Photo by psyberartist
95. 95
• 10 most common words + 10
most common 2-word phrases
• Applied to 200,000 items
• Much more scalable
• Heavily machine assisted: a
person can validate data and
create collections
Photo by James St. John
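The "10 most common words + 10 most common 2-word phrases" approach can be sketched as follows. This is a minimal illustration, not the tool that was actually used: the stopword list, the tokenizing rule, and the name `suggest_topics` are all assumptions.

```python
import re
from collections import Counter

# Assumed stopword list; the source does not say which words were ignored.
STOPWORDS = frozenset(
    {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for", "with", "it"}
)


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in STOPWORDS]


def suggest_topics(text, n_words=10, n_phrases=10):
    """Return the most common words followed by the most common 2-word phrases."""
    words = tokenize(text)
    word_counts = Counter(words)
    # Phrases are built after stopword removal, so "switch to the box"
    # would count as the pair "switch box".
    phrase_counts = Counter(" ".join(pair) for pair in zip(words, words[1:]))
    top_words = [w for w, _ in word_counts.most_common(n_words)]
    top_phrases = [p for p, _ in phrase_counts.most_common(n_phrases)]
    return top_words + top_phrases
```

Run over the text of each item, a function like this yields candidate topics that a person can then validate, which fits the "heavily machine assisted" point above.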
98. 98
Topics:
switch, atari, antenna, game, cable, terminals, console, television, video, program,
power supply, console unit, video computer, game program, computer system, atari game, power switch, switch box, atari video, screw terminals
99. 99
Having the stuff is vital, the
most important thing. But
it’s also vital to have a
system by which these
things are described.
“If a person can’t get the
information they need, then
we’re failing.”
Photo by Rachel Lovinger
101. 101
• Jason had become a
metadata advocate
But I realized that…
• Content strategists who care
about the long game should
think like historians,
archivists and futurists, too.
103. 103
• Dutch leader in academic research and education on
biodiversity and taxonomy.
• Has a collection of 37 million natural history objects.
104. 104
Describe, understand and explore biodiversity for human
wellbeing and the future of our planet.
They do this with:
• Accessible collections
• Contributions to global
scientific research
• Awe of natural history
• Openly shared knowledge
105. 105
• From 2010 to June 2015
• 250 staff members & 450 volunteers
• Digitizing 7 million objects in detail
• Adding metadata for the other 30 million objects
106. 106
• Information is
more easily
discovered,
studied, and used.
• Scientists
worldwide can
access it directly
online, without
assistance.
• Some of this data
has never been
available in digital
form before.
107. 107
• Scientific name
• Where it was found
• When it was found
• Who found it
“Objects [in the collection] have no scientific value
without this information.” - Suzanne de Jong-Kole
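The four fields on this slide can be treated as a minimal record with a completeness check. The sketch below is illustrative only: `SpecimenLabel` and `missing_fields` are hypothetical names and do not describe the museum's actual data model.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SpecimenLabel:
    scientific_name: Optional[str] = None
    locality: Optional[str] = None      # where it was found
    collected_on: Optional[str] = None  # when it was found, kept as text because label formats vary
    collector: Optional[str] = None     # who found it


def missing_fields(label):
    """List the fields that are absent or blank on a transcribed label."""
    return [
        f.name
        for f in fields(label)
        if not (getattr(label, f.name) or "").strip()
    ]
```

A transcription with an empty result from `missing_fields` carries all four pieces of information; anything else can be routed back for another look.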
111. 111
• Vele Handen = Many Hands
• People helped transcribe
handwritten labels
• In 9 months, people transcribed
200,000 labels, of which about half
were usable.
112. 112
The person who collected the specimen wrote the metadata on the label.
This could be a professional researcher, or a non-professional enthusiast.