Presentation by Digitisation Project Manager Matthew Brack on things to think about when doing digitisation projects, for our fourth Digitisation Open Day.
2. Wellcome Digital Library Programme
10 ‘laws’ of digitisation
These are personal views based on experience of doing digitisation at Wellcome Library…
3. #1: Know your purpose
(“Thou shalt observe real users
and keep them holy”)
Obvious but important: who are you doing digitisation for? Knowing the answer
to that question will affect every subsequent decision you make in your project.
5. #2: Know Project Management
If you’ve never run a digitisation project before and only do one thing to prepare:
study project management. Your project is more likely to succeed with an
understanding of project management than a technical understanding of digitisation.
8. #3: There is no ‘best practice’
We can tell you how we created our digital library, but it’s unlikely that anyone is
going to be able to leave today and, even with a blank check, put that into practice
to solve their particular problems – there are too many variables.
10. #4: There are no simple projects
(especially at the beginning)
11. Digitisation Open Day
Project problems post-mortem:
• Machinery issues
• Retrieval across 30 collections, 4 floors, 2 buildings, 2 states of access
• Copyright clearance in parallel
• 12% of selection not found
• Display issues
12. #5: Imaging is the quickest step
The imaging step is dwarfed by preceding preparation and subsequent digital asset
management processes, yet it’s the most visible aspect of any digitisation project.
13. [Flowchart: the imaging step within a preparation workflow. Area labels: 215B Stacks, 1.22 Storage, Conservation, 1.21 Diadeis. Numbered steps 1a–11 include a condition check, cataloguing scope check, generate shelf list, duplicate check, catalogue, sort by size, check out, digitise, update shelf list, return to shelf and conservation assessment.]
14. #6: Metadata is really important
(see Dave Thompson’s presentation)
Lacking good metadata is an existential threat to your project – without it your
digital content will simply disappear and never be seen by users.
15. Metadata
• Digital objects ‘don’t exist’ without metadata – no search, no
discovery
• Metadata first, then digitisation – otherwise you don’t know what
you have, where it is, or any way of controlling it…
• On average 50% of project time is spent on metadata and
cataloguing
• Must be shaped by user need and what an organisation is
capable of delivering
• Tension between low-volume digitisation with more metadata
for a richer user experience or larger-scale digitisation with
lighter metadata attached
• Standards-based framework helpful for consistency, accuracy
and efficiency in metadata input (e.g. Dublin Core, MARC21)
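As a rough illustration of the standards-based point above, here is a minimal sketch of a Dublin Core description built in Python. The element names (title, date, identifier, source) and the namespace URI come from the Dublin Core element set; the `record` wrapper element, the helper name `dublin_core_record` and every value shown are invented for illustration and are not taken from any Wellcome system.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a flat <record> element with one dc:* child per (name, value) pair.

    The <record> wrapper is an assumption made for this sketch.
    """
    record = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# All values below are hypothetical.
record = dublin_core_record([
    ("title", "Example treatise on anatomy"),
    ("date", "1650"),
    ("identifier", "example-0001"),          # ID of the digital object
    ("source", "Shelfmark EXAMPLE/123"),     # link back to the physical item
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The `source` element is used here to carry the link from the digital object back to the physical original, which is one possible way of keeping that thread intact.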
17. #7: It’s lots of small tasks
(repeated over and over…)
18. Tracking and retrieval
1. Generate unique ID
2. Create ‘scan list’
3. Create ‘review file’
4. Make unavailable to users
5. Create barcodes
6. Retrieve items
7. Insert barcodes
8. Deliver items for imaging
9. Update tracking list
[Re-work]
a. Return
b. Remove barcodes
c. Update tracking list
d. Make available to users
e. Pray for no more re-work
f. Repeat for next batch
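The tracking steps above can be sketched as a small tracking list in Python. This is only an illustration under stated assumptions: the status names, the `B001` batch prefix, the `EXAMPLE/…` shelfmarks and the functions `make_scan_list`, `advance` and `rework` are all hypothetical, and in practice the same bookkeeping might live in a spreadsheet.

```python
from dataclasses import dataclass, field

# Hypothetical statuses, loosely following the steps on the slide.
STEPS = ["listed", "unavailable", "barcoded", "retrieved",
         "imaged", "returned", "available"]


@dataclass
class Item:
    shelfmark: str
    uid: str
    status: str = "listed"
    history: list = field(default_factory=list)


def make_scan_list(shelfmarks, batch="B001"):
    """Assign a unique ID to each item and return the scan list."""
    return [Item(shelfmark=s, uid=f"{batch}-{n:04d}")
            for n, s in enumerate(shelfmarks, start=1)]


def advance(item, new_status):
    """Move an item to the next step, refusing to skip steps."""
    position = STEPS.index(item.status)
    if position + 1 >= len(STEPS) or STEPS[position + 1] != new_status:
        raise ValueError(
            f"{item.uid}: cannot go from {item.status!r} to {new_status!r}")
    item.history.append(item.status)
    item.status = new_status


def rework(item):
    """Send an imaged item back to be imaged again."""
    if item.status != "imaged":
        raise ValueError(f"{item.uid}: only imaged items can be re-worked")
    item.history.append(item.status)
    item.status = "retrieved"


scan_list = make_scan_list(["EXAMPLE/1", "EXAMPLE/2"])
first = scan_list[0]
for status in ["unavailable", "barcoded", "retrieved", "imaged"]:
    advance(first, status)
rework(first)              # imaging failed review, so image it again
advance(first, "imaged")
print(first.uid, first.status, first.history)
```

Recording the history on each item keeps the tracking list honest about re-work, which is where repeated small tasks tend to go missing.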
19. #8: Digi can damage your stuff
(but not as much as you’d think)
20. Conservation
• Most damage to collections comes from handling
• Digitisation handles collections intensively in
new ways
• Survey to develop image capture approach and identify
out of scope material
• Survey detail depends on collection
• Training for photographers and digital preparators
• Actual preparation of materials (staples, openings)
• Digitisation is not preservation
21. #9: Digitisation is not preservation
This should not be a guiding principle of your project:
Generally your original physical material is going to last much longer than your
digital manifestation – no competition.
You’ve just created a second collection of material that you need to ‘preserve’
and manage.
Preservation doesn’t mean much in a digital context – it’s actually a
contradiction from traditional usage, which succeeds by restricting access –
what we are interested in is sustainable access.
23. Copyright and sensitivity
• UK copyright law is lagging behind the needs of
today’s economy
• UK copyright is held by the creator and not the owner of
a work, making a rights risk assessment essential for
most projects
• Rights clearance of works on an item-by-item
basis is unworkable in the context of mass
digitisation
• Small organisations without legal support are
unlikely to take the risk of digitising orphan works, or
anything else that carries potential copyright risk
24. ProQuest EEB Project Overview
Project Scope:
14,000 books
5.5 million images
Incunabula to 1700
Printed outside UK
Access in UK and
HINARI – 15 years
3600 books now online: http://eeb.chadwyck.com
25. Phase 2 projects
• Reading Room / Project X
• Forensics and Sex temporary exhibitions
• Western Manuscripts 1000-1650
26. Useful resources
THORNTON, E. (2013) Digitisation Doctor Workshop. 15th April 2013.
Available from: http://blog.wellcomelibrary.org/2013/05/resources-fromdigitisation-doctor-workshop-now-available
HENSHAW, C. and KILEY, R. (2013) The Wellcome Library, Digital.
Ariadne. July 2013. Available from:
http://www.ariadne.ac.uk/issue71/henshaw-kiley
JISC, Project Management for Digitisation, JISC Digital Media. Available
from: http://www.jiscdigitalmedia.ac.uk/guide/project-management-fora-digitisation-project
BRACK, M. (2012) Bridging the Gap: Library digital collections, innovation
and the user. Thesis submitted in partial fulfilment of the requirements
of King’s College London for the Degree of Masters in Digital Asset
Management. Available from: http://nsla.org.au/publication/bridginggap-library-digital-collections-innovation-and-user
After three of these sessions, time to add some flavour… Some of us here have been ‘hacking’ our jobs recently, trying to summarise our experiences in pithy statements… 10’s a nice number. Some of them may come across as imperatives, but really these are just personal observations…
Following on from the 10 UX commandments… Who are you doing this project for? Knowing the answer to that question will affect every other decision you make in a project…
Example: strictly speaking, selection and delivery are not my responsibility as digi project manager, but these are key stakeholders for every digi project. Who is using this stuff – how is it delivered – does the person selecting the stuff understand how it will be used?
If you’ve never run a digi project before and only do one thing to prepare, this should be it… Your digi project is more likely to succeed with an understanding of project management than, say, a technical understanding of digitisation… Please don’t read any books on digitisation, read about project management instead…
It’s very important that you strive for an appreciation of both the digital and the physical, and of how they interact, in order to execute a good digi project.
Importance of project management…
So I said please don’t read books on digitisation… that’s also because it’s going to be hard to find information directly applicable to your projects there… There are lowest common denominators, which you’ll find from your own experience and from talking to someone who’s done this stuff… We can tell you how we created our digital library, but it’s unlikely that anyone is going to be able to leave today and, even with a blank check, put that into practice to solve their particular problems – there are too many variables: institutional culture, resources, content type, audience.
Every digitisation project is different. There’s not really a ‘best practice’, certainly not for everyone who is doing digitisation (large and small institutions etc.) – it depends on context. What we have is a shared goal (sustainable access); the hard part is getting there…
An innocent-looking project (!) – ‘only’ 2,000 books, and a robot to do them on – what could go wrong?
• Machinery issues – smokin’
• Retrieval – ‘theme’ issue – 30 collections, 4 floors, 2 buildings, 2 states of access (closed and open shelves)
• Return (general classification!) – couldn’t find 12% of them
• First use of Goobi WTS
• Copyright clearance – not feasible for mass digi (us and BL)
• Display – confusion over covers presented together at image 1 and 2
Please don’t tell anyone in the photography studio when you go up, but it is… First of all, it’s dwarfed by the preceding and succeeding steps in the workflow. Second, it’s actually something that you can apply basic parameters to (image specs, file format, camera set-up) – once established you’re all set… Just don’t use scanners and you’ll be fine…
Digitisation is mostly preparation… Here is the imaging step in the workflow… You’ll later see this entire workflow itself dwarfed by Dave’s system architecture…
Lacking good metadata is an existential threat to your project…
[All the different places where you use metadata] … You need good cataloguing to do digi – you shouldn’t start without it… Otherwise you don’t know what you have, where it is, or any way of controlling it… In particular you need administrative metadata that connects back to the physical object you’re digitising… Which goes back to bridging the gap between physical and digital… With metadata you have to string that thread through from beginning to end… Don’t forget that physical link to the digital…
It’s very time-consuming… could use lots of examples from workflow: photography, metadata addition… Often, especially if you’re in a library, you’re using legacy systems that really prefer to think digital delivery doesn’t exist… So you have to retrofit this clanger one piece at a time…
Excel spreadsheets and barcodes are your best friends…
This should not be a guiding principle of your digi project. In some cases, like film stock or highly deteriorated items, this could be true… But generally your original physical material is going to last much longer than your digital manifestation – no competition. You’ve just created a second collection of material that you need to ‘preserve’ and manage… Preservation doesn’t mean much in a digital context – it’s actually a contradiction from traditional usage (restricting access) – we are interested in access… You might say, fine, we’ll restrict access to preserve our originals. Two potential implications: first, don’t create a self-imposed obsolescence for your physical building (there might be someone upstairs who wonders why they’re keeping London real estate for stuff that’s only available online – some people still think that ‘going digital’ equates to reducing costs). Second, what would your users think? Time and again it seems that the physical originals are consulted more frequently after digitisation.
The workflow: Digitisation involves a lot of stakeholders… It also slices through the traditional library organisation…