1) The document discusses roles and responsibilities in ensuring permanent access to scholarly works.
2) It notes that while access to works has improved online, continuity of access is challenged as content can disappear from the web.
3) The document reports on measured progress in archiving journal content through organizations like CLOCKSS and Portico, but notes that only 19% of identified online journals are currently being preserved.
Ensuring Continuity of Access to the Scholarly Record
1. Roles and Responsibilities in guaranteeing
Permanent Access to the Scholarly Record
APE2014: Berlin, 29th January
Ensuring the Scholarly Record
is Kept Safe
Measured Progress with Serials
Peter Burnhill
EDINA, University of Edinburgh, UK
http://creativecommons.org/licenses/by/3.0/
2. The Internet & the Web have changed (many) things
Author
(article)
Reader
(article)
Publisher
‘Open Access’
article serial
issue
Licensed
Online
Access
Institutional
arrangement
Library
(serial)
learned
society
peer Licence
exchange
peer exchange
by
free2web access
Informal: ‘invisible college’ and the ‘gift economy’
Formal: £ Economy
‘Open Access’
3. Our Shared Task …
… is to ensure ease and continuity of access to
the scholarly & cultural record
• Good News ☺ Improved Ease of Access:
• What was once available locally is now online
& accessed remotely, anytime/anywhere
4. Our Shared Task …
… is to ensure ease and continuity of access to
the scholarly & cultural record
• Good News ☺ Improved Ease of Access:
• What was once available locally is now online
& accessed remotely, anytime/anywhere
• Bad News! ☹ Challenge to Continuity of Access:
• Academic libraries are no longer the custodians
of the scholarly record
– What’s on the Web one day is changed or has
disappeared the next
– We need to invest in good & reliable digital shelving!
5. … some larger problems lurk
1. Data, big & small – lots of numbers
2. Audio-visual materials – lots of pictures & sounds
3. Web-references (URIs) made in scholarly record
Hiberlink.org, a joint UoE/LANL/EDINA (Mellon funded)
project to investigate ‘Reference Rot’
– when what was cited ceases to say the same thing
“or has ceased to be”
E-journals should be the easier part of the problem:
… but is the e-journals problem being solved?
… and what might help with the larger problems?
6. The Internet & the Web have changed (some) things
Author
(article)
Reader
(article)
learned
society
peer
exchange
Licensed
Online
Access
“The Library [Committee], which is made up of librarians and academics, ..
want reassurance about long-term preservation before confirming a
University policy of going e-only.” Big UK Library
peer exchange
by
free2web access
Libraries speak of ‘e-collections’ but in practice they have only ‘e-connections’
Informal: ‘invisible college’ and the ‘gift economy’
7. "We must all be archivists now"
1. Who has the archival responsibility?
– Who does ‘forever’?
2. Do academic libraries still wish to be the
custodians of the scholarly record?
3. How can libraries/publishers ensure continuity
of access?
4. Where is their digital shelving?
– What is on those shelves?
– What is missing?
– How do we know?
8. A ‘global challenge’: trans-national action
Researchers (and therefore libraries) in any one country
are dependent upon content written and published
in countries other than their own
%age of the 113,000 ISSN issued for e-serials:
• US.LoC 20%
• UK.BL 10%
• Netherlands & Germany: c. 4.5% each
• Brazil 4%
‘hidden’ e-journals: low % ISSN
9. Seriality: Identify point of issue for scholarly content
Non-trivial fraction of
the Scholarly Record
& important ‘model’
10. Many Reports over past 10 Years …
They highlighted risks in digital media & formats:
• ‘digital decay’: format obsolescence & bit rot
and warned against single points of failure:
• natural disasters (earthquake, fire and flood)
• human folly (criminal and political action): hacking
+ risks associated with commercial events in the publisher/
supply chain
… as early archiving initiatives emerged
• eDepot at Koninklijke Bibliotheek
– international significance (Elsevier & Kluwer) as well as national
role (for The Netherlands)
• the LOCKSS project at Stanford University
– from which came CLOCKSS [as library/publisher ‘dark archive’]
• the Electronic-Archiving Initiative at JSTOR
– from which came Portico [as service provider]
11. Measured Progress with Serials
An impressive number of archiving agencies
① web-scale not-for-profit archiving agencies
e.g. CLOCKSS Archive & Portico
② national libraries (with legal deposit in mind)
e.g. e-Depot (Netherlands); British Library;
DnB; & National Science Library of China etc
③ research libraries: consortia & specialist centres
e.g. Global LOCKSS Network, HathiTrust,
Scholars Portal, Archaeology Data Service
12. Having many archiving organisations is a Good Thing ☺
“Digital information is best preserved by replicating it at
multiple archives run by autonomous organizations”
B. Cooper and H. Garcia-Molina (2002)
13. Measured Progress with Serials
A continuing flow of Reports that update …
• highlighting risks in digital media & formats
• warning against single points of failure
• DPC Report lists 6 Use Cases to contrast
'Continuing Access' & 'Long-term Preservation’
1. Library cancels a JOURNAL subscription
2. Library exits a Big DEAL
3. Back issues of journal become unavailable from publisher
4. Journal becomes 'orphan' as publisher goes out of business
5. Journal 'unavailable' as operation of publisher hits disaster
6. Library decides to remove / dispose of print journals
3 to 5 are the Archivist’s ‘preservation’ Use Cases
14. We now have a global Registry of e-journal archiving
… to discover who is looking after what
Enter title
or ISSN
to search across metadata
reported by leading
archiving organisations
*news*
Library of Congress has now joined the Keepers Registry
[& have high hopes for some others …]
15. … and discover details of its ‘archival status’
Example search: ‘Origins of Life’
This e-journal is being archived by 5 archiving agencies …
… but coverage of volumes is partial & patchy
16. What does the Registry tell us about progress?
Progress, but still not ‘job done’
The Keepers Registry <thekeepers.org> reports:
A. 21,557 e-serial titles are being ‘Preserved’
i.e. ingested by organisations with archival intent
– (Many ‘missing volumes and issues’)
B. 113,092 ISSN assigned to ‘online serials’ in the ISSN Register
➢ Progress with a key indicator: ratio of A/B = 19%
– was 17% at close of 2011 (16,558 / 97,563)
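The key indicator above is simple arithmetic, and a minimal sketch makes it easy to recompute as the counts change. The figures are the ones quoted on this slide; the function name `preservation_ratio` is only illustrative and is not part of any Keepers Registry tooling.

```python
def preservation_ratio(preserved_titles: int, online_issns: int) -> float:
    """Return A/B: titles ingested by Keepers divided by ISSN assigned to online serials."""
    if online_issns <= 0:
        raise ValueError("the number of assigned ISSN must be positive")
    return preserved_titles / online_issns


# Counts as quoted on the slide.
current = preservation_ratio(21_557, 113_092)
close_of_2011 = preservation_ratio(16_558, 97_563)

print(f"current:       {current:.0%}")
print(f"close of 2011: {close_of_2011:.0%}")
print(f"change:        {(current - close_of_2011) * 100:.1f} percentage points")
```

Note that both the numerator and the denominator grew between the two measurements, so the indicator only rises when ingest outpaces the assignment of new ISSN.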
17. Good News & Main Challenge?
Good news?
• Most of the big publishers engage with archiving initiatives
• Keepers Registry often shows titles held by 3+ ‘Keepers’
– typically CLOCKSS, e-Depot and Portico.
Main challenge?
• The long tail of smaller publishers
– regardless of business model.
• It is not about Open Access per se
– DOAJ for content of 10,000 e-journals from 4,000 publishers
• Lots of other (important/priority?) e-journals
• Role of national libraries or library consortia?
18. Do we need to agree a ‘priority list’ of titles?
1. Should we only be interested in the c.30,000 ‘peer-reviewed’
scholarly journals?
2. Do we look only at what individual libraries list?
– In 2012 we checked ‘archival status’ for 3 large university libraries:
c.75% ‘at risk’; c.11% held by 3 or more
• Two key indicators: %age (& number) of titles that are ‘at risk of loss’
%age (& number) of titles that are ‘preserved by 3 or more Keepers’.
3. Should we ask the audience?
• The researchers and students who read online serials
19. Looking from the user’s point of view …
… with usage logs for the UK OpenURL Router
• 10.4m full text requests in 2012; ISSN-L to de-duplicate ISSN
• 53,311 online titles requested by researchers & students from 108/160+
Analysis using the Keepers Registry:
• Only 15% (7,862) are being kept by 3+ Keepers
• Over two thirds (68%) held by none
➢ 36,326 titles ‘at risk’ of loss ☹
• Check robustness with UK logs for 2011 & 2013; Request logs for other countries (WorldCat)
➢ So preservation really is still a problem!
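The two key indicators (titles ‘at risk of loss’ and titles ‘preserved by 3 or more Keepers’) can be sketched as a small calculation over a list of requested titles. This is only an illustration under stated assumptions: the input is a mapping from a title key (for example an ISSN-L) to the number of Keepers reporting it, ‘at risk’ is read as ‘held by none’ as on this slide, and the names `Indicators` and `key_indicators` and the toy data are made up for the example.

```python
from typing import Mapping, NamedTuple


class Indicators(NamedTuple):
    total: int
    at_risk: int       # titles held by no Keeper
    well_kept: int     # titles held by `threshold` or more Keepers
    at_risk_share: float
    well_kept_share: float


def key_indicators(keeper_counts: Mapping[str, int], threshold: int = 3) -> Indicators:
    """Compute the two indicators from {title key: number of Keepers holding it}."""
    total = len(keeper_counts)
    if total == 0:
        raise ValueError("no titles supplied")
    at_risk = sum(1 for n in keeper_counts.values() if n == 0)
    well_kept = sum(1 for n in keeper_counts.values() if n >= threshold)
    return Indicators(total, at_risk, well_kept, at_risk / total, well_kept / total)


# Toy data (invented): five requested titles and their Keeper counts.
sample = {"title-a": 0, "title-b": 3, "title-c": 1, "title-d": 0, "title-e": 5}
result = key_indicators(sample)
print(f"at risk: {result.at_risk} of {result.total} ({result.at_risk_share:.0%})")
print(f"3+ Keepers: {result.well_kept} of {result.total} ({result.well_kept_share:.0%})")

# Cross-check of the shares quoted on the slide for the 53,311 requested titles.
print(f"{7_862 / 53_311:.0%} kept by 3+ Keepers, {36_326 / 53_311:.0%} held by none")
```

Keying the mapping on ISSN-L rather than on individual ISSN keeps print and online variants of the same serial from being counted twice, which is the de-duplication step mentioned above.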
20. Choice of future with 2020 Vision
• Best Case scenario for IFLA 2020 (APE2020)
– Libraries (& Publishers) have acted to reduce that
alarming 80% figure to near to zero ☺
– They have ensured that all the e-journal content used by
their researchers in 2013 has been preserved and can
be successfully used in 2020, and assuredly beyond. ☺
BREAKING NEWS: US President announces 2014 to be Year of Action
• Worst Case scenario for IFLA 2020 (APE2020)
– Libraries (& Publishers) have failed to act ☹
– Important literature has been lost ☹
– Citizens & scholars complain of neglect!
21. Ask a librarian in 2020: 3 possible answers
1. "Yes, we have it (we've checked recently, both in the
catalogue and in actuality), and you can access it now"
2. "No, but we know somebody (whom we trust) that does,
– so we can point you to (or arrange access to) it now/soon-ish"
3. "Sorry, we don't know …
- perhaps nobody has it
- it may be lost forever, altho' perhaps somebody somewhere ..."
- That was true for the print world
- Unfortunately, unless we do something now, the 3rd answer
could become the common one for a lot of e-journal content
22. Sidebar note on monitoring their progress …
The Keepers Registry: Actionable Evidence
1. To assist publishers ‘do the right thing’
– A showcase for the real heroes: the archiving organisations
– provide libraries, publishers & archiving organisations with lists of
titles that seem to be at risk of loss
2. To keep a close focus on volumes & issues
– Need to make sure all issued content is being kept safe
3. To assist collaboration for Keepers: ‘a safe places network’:
– many met at iPres 2013 in Lisbon this September
4. To assist the ISSN Network assign more ISSN
– If it is worth preserving, it really should have an identifier
5. To recruit more archiving organisations as Keepers
– The Registry is not an audit / certification authority but there are
eligibility checks for integrity of ‘archival intent’
Breaking News:
New release (end of Q1 2014) Members Area:
– Upload a list of ISSNs & get back archival status of Titles
– Access to API, to report archival status on 3rd Party websites
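Since the Registry is searched and queried by ISSN, a list of ISSNs is worth sanity-checking before it is uploaded. The sketch below applies the standard ISSN check-digit rule (weights 8 down to 2 over the first seven digits, modulo 11, with ‘X’ standing for 10). The function names are illustrative only; nothing here is part of the Keepers Registry API, whose interface is not described in these slides.

```python
def issn_check_digit(first_seven: str) -> str:
    """Return the check character for the first seven digits of an ISSN."""
    if len(first_seven) != 7 or not (first_seven.isascii() and first_seven.isdigit()):
        raise ValueError("expected exactly seven ASCII digits")
    # Weights run 8, 7, ..., 2 across the seven digits.
    total = sum(int(d) * w for d, w in zip(first_seven, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_issn(issn: str) -> bool:
    """True if `issn` (with or without the hyphen) has a correct check digit."""
    compact = issn.strip().replace("-", "").upper()
    if len(compact) != 8:
        return False
    body, check = compact[:7], compact[7]
    if not (body.isascii() and body.isdigit()):
        return False
    return issn_check_digit(body) == check


# Split a candidate list into well-formed and malformed entries before upload.
candidates = ["0378-5955", "2434-561X", "0378-5954", "not-an-issn"]
valid = [c for c in candidates if is_valid_issn(c)]
invalid = [c for c in candidates if not is_valid_issn(c)]
print("valid:", valid)
print("invalid:", invalid)
```

A check like this only catches typing and transcription errors; it says nothing about whether an ISSN has actually been assigned or whether the title is being archived.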
24. How to know who is looking after what & how?
(and uncover what is still at risk)
SERVICES: user requirements
E-J Preservation Registry Service
E-Journal Preservation Registry
Piloting an E-journal Preservation Registry Service
(b) Data dependency
(a) ISSN Register
ISSN Register at heart of the Data Model; ISSN-L as kernel field
METADATA on extant e-journals
METADATA on preservation action
Digital Preservation Agencies, e.g. CLOCKSS, Portico; BL, KB; UK LOCKSS Alliance etc.
(Taken from Figure 1 in reference paper in Serials, March 2009)
25. Sidebar note on National Libraries
Should we wait upon Legal Deposit?
– 94% of libraries have some form of legal deposit for print.
• Only 44% of national libraries had legislation in 2011 for e-books or
e-journals; expected to rise to 58% by June 2012.
(from presentation, CENL 2011 Survey by Lynne Brindley
to CDNL Annual Meeting Puerto Rico, 15/8/11)
• Only 27% [expected to rise to 37% by June 2012] actually ingesting via legal
deposit
➢ Total national libraries collecting = those 14 via legal deposit
+ 9 by other means (Netherlands, UK/BL, Switzerland voluntary deposit)
➢ Only KB e-Depot, BL, NSLC (+ LoC) in The Keepers Registry
➢ Only when the other 19 join will we all know about their activity
➢ Key point is not about call for ‘legal deposit’ but that on its
own it is taking too much time