The document discusses the UK's Non-Print Legal Deposit (eLD), which allows libraries to collect digital publications such as ebooks, websites, and journals in order to preserve the UK's published digital output. It covers what eLD is, why it is important, and how content is collected and accessed. The Bodleian Libraries have collected over 935,000 digital articles and 16,300 ebooks through eLD. The UK Legal Deposit Web Archive preserves millions of UK websites and supports research. Staff were encouraged to provide input on promoting and curating the archive.
1. Ever wondered what UK Non-Print
Legal Deposit is all about?
Presentation to the 2015 Staff Conference
by Jackie Raw, Alison Felstead and Svenja Kunze
2. Introduction
• What is Non-print or Electronic Legal Deposit (eLD) and what
does it cover
• Why do we need it
• How can we access it
• The UK Legal Deposit Web Archive
• Q&As
3. What is non-print (or e-Legal) Deposit?
The revised Legal Deposit Libraries Act of 2003 recognised that much of the
nation’s published output in digital form was being lost.
The Legal Deposit Libraries (Non-print) Regulations 2013 address this.
In addition to print, we are now able to collect
any digital publication
such as an e-book or journal article
works published in offline media
such as CD-ROMs and microfilm
works published online that are issued from a UK domain
the UK Legal Deposit Web Archive.
4. eLD: Benefits for readers
Include:
access to an archive of millions of UK websites, preserved in the LD UKWA
access to publications only accessible in e format
access to all e-legal deposit content collected by all of the LDLs
access to e-journals to which we don’t currently subscribe
full text searching
immediate access in all Bodleian Libraries’ reading rooms
access to material from publishers who do not currently
deposit in print with us via access to BL content
content arrives more quickly:
immediately after the 7 day embargo period
benefits for visually impaired readers
5. eLD: Benefits for
the Legal Deposit Libraries
Across the LDLs:
joint policies and support
more collaboration, mutual support,
joint collecting principles and policies, shared content
Here:
for Subject Librarians
assessing eLD use allows us to gauge interest
and make decisions regarding e-purchase
for space saving
in processing areas, the BSF, Gladstone Link and on reading room open shelves.
This leads to questions such as how will we use this extra space…
for conservation
there will be fewer orders and transfers of printed material between libraries and the
BSF.
6. How is electronic material collected?
• The British Library and the National Libraries of Wales and Scotland
collect the material on behalf of all 6 Legal Deposit Libraries
• University libraries of Oxford and Cambridge
and Trinity College Dublin can access it
• Content is accessible here via SOLO
or via the UK LD Web Archive
7. Legal Deposit Libraries Implementation Group
And the new chair is…
Who is working on this?
Metadata
Reader Services
Web Archiving
Security
eLD
Collection Development and Acquisitions
Technical Operations
8. …the Bodleian eLD Group:
Chaired by Michael Williams
Members from across the Bodleian
Vanessa Corrick Readers Services - Alison Felstead Resource Description
Jo Gardner Social Sciences - Beth Gibbs Radcliffe Science Library
Svenja Kunze Archives - Andy Mackinnon BDLSS
Michael Popham BDLSS - Jackie Raw Legal Deposit Operations
Jane Rawson Humanities
With a remit to
• Consult
• Make recommendations
• Receive information & feedback
• Communicate with staff & readers
And locally…
Reporting to the Collection Management Strategy Group (CMSG)
9. On April 6th 2015 eLD was 2 years old.
What have we achieved for the collection?
As of 16.07.15:
Number of digital articles in SOLO: 935,000
Number of ebooks in SOLO: 16,300
The BL estimates: We have access to
47% more content
than we had
when publishers deposited only print!
12. Types of metadata for eLD
• Title-level records (= bibliographic records)
– Ejournals
– Ebooks
• Article-level records
– Articles in ejournals
– Chapters in books
• Issue-level records
– To reconstitute journal issues or books
13. Ebooks metadata
• Harvested from the BL’s OAI-PMH Gateway
• Converted from MARCXML to MARC21
• Loaded to Aleph and published to SOLO
• Vary in fullness from brief acquisitions
records to full-level records with authority
control and subject headings
• Upgraded with eCIP data or manually by BL
staff
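The conversion step in the list above can be sketched as follows. This is a minimal illustration, assuming a single MARCXML record in the standard MARC21 "slim" namespace and UTF-8 output; the sample record, its identifier and its title are hypothetical, and the source does not say which tool performs the conversion in practice. An established library such as pymarc would normally be a safer choice than hand-written code.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}
FIELD_END, RECORD_END, SUBFIELD = b"\x1e", b"\x1d", b"\x1f"


def marcxml_to_marc21(record_xml: str) -> bytes:
    """Serialise one MARCXML <record> as a MARC21 (ISO 2709) byte string."""
    record = ET.fromstring(record_xml)
    leader = record.findtext("marc:leader", namespaces=MARC_NS)
    directory, body = b"", b""
    for field in record:
        kind = field.tag.rsplit("}", 1)[-1]  # strip the XML namespace
        if kind == "controlfield":
            data = (field.text or "").encode("utf-8")
        elif kind == "datafield":
            data = (field.get("ind1", " ") + field.get("ind2", " ")).encode("utf-8")
            for sub in field:
                data += SUBFIELD + sub.get("code").encode("utf-8")
                data += (sub.text or "").encode("utf-8")
        else:
            continue  # the leader is handled separately
        data += FIELD_END
        # Directory entry: tag (3), field length (4), start offset (5).
        directory += f"{field.get('tag')}{len(data):04d}{len(body):05d}".encode("ascii")
        body += data
    base = 24 + len(directory) + 1   # leader + directory + terminator
    total = base + len(body) + 1     # plus record terminator
    new_leader = f"{total:05d}{leader[5:12]}{base:05d}{leader[17:24]}".encode("ascii")
    return new_leader + directory + FIELD_END + body + RECORD_END


# Hypothetical sample record, for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">example0001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example ebook title</subfield>
  </datafield>
</record>"""

marc_bytes = marcxml_to_marc21(SAMPLE)
```

The function recomputes the record length and base address in the leader, because the values carried in a MARCXML leader are often placeholders.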
15. What do eLD monos look like?
• The majority have been supplied in EPUB
format
• EPUB ebooks are viewed using the Calibre
ebook reader
• A small number have been supplied in EPUB
and PDF formats, and a handful in PDF only
• PDF ebooks are viewed using the Sumatra PDF
viewer – like eLD articles
33. Web Archives, and what to do with them…
Exploring the potential:
Researching the Web
How .uk subdomains link to each other
(and how this changes over time)
[Figure: link visualisations for 1996 and 2008]
Source: http://www.webarchive.org.uk/ukwa/visualisation/ukwa.ds.2/linkage
Using Web archives as sources for
historical research
http://buddah.projects.history.ac.uk/
34. Scope and structure
of the Legal Deposit UK Web Archive
‘Snapshot’ representations of websites as captured by crawlers
≠ Live Web
Domain Crawl
annual broad sweep of the UK web
largely automated
• 2nd Domain Crawl: June – December 2014: 191 days (2013: 70 days)
• 57.3 TB WARCs (2013: 30.8 TB) + 3.2 TB screenshots
Curated Crawls
websites actively selected - additional description and QA
Special Collections
• 5-6 per year
• themes/events based
for example
UK General Elections 2015
Rapid Response Crawls
• responding to current events if and when arising
for example
Death of Nelson Mandela
Key sites
Crawled more frequently
~ 40 news sites
for example
The Guardian, BuzzFeed UK,
Belfast Telegraph, Wales Online
~ 270 sites of high impact
for example
UK Parliament, British Museum,
Oxfam GB, Church of Scotland,
English Heritage
35. A bit more about curated crawls
Special Collections
NHS Reforms 2013
Winter Olympics 2014
European Parliament Elections 2014
Centenary of outbreak of the First World War 2014
Scottish Independence referendum 2014
Commonwealth Games Glasgow 2014
Rapid Response Crawls
UK response to Typhoon Haiyan (11.2013-01.2014)
Death of Nelson Mandela (12.2013)
UK response to Ebola in West Africa (11.2014- )
UK response to Nepal Earthquake (04.2015)
36. Special Collections 2015
UK General Elections
Magna Carta 800th anniversary
Forth Rail Bridge 125th anniversary
End of Second World War 70th anniversary
Rugby World Cup
Easter Rising centenary 2016 (started)
First World War centenary (continued)
2016 ?
37. …Suggestions for websites or
a Special Collection topic?
…Ideas and suggestions for promoting the LDUKWA
and its use in teaching and research?
…Interested in co-curating Special Collections
or rapid response crawls?
Get in touch
svenja.kunze@bodleian.ox.ac.uk
38. Where can I find out more?
Public webpages
http://www.bodleian.ox.ac.uk/our-work/legal-deposit/electronic-legal-deposit-non-print-publications
39. And here - the staff intranet, blog and tutorial
http://www.bodleian.ox.ac.uk/staff/services/card/ldo/eld-for-staff
40. Q&A?
Contact us
Jackie Raw Head of Legal Deposit Operations jackie.raw@bodleian.ox.ac.uk
Alison Felstead Head of Resource Description alison.felstead@bodleian.ox.ac.uk
Svenja Kunze Project Archivist svenja.kunze@bodleian.ox.ac.uk
Editor's notes
The aim of this presentation is to provide you with information, to answer some of your questions and show you where you can go to find out more information.
I am Jackie Raw and I am Head of Legal Deposit Operations. I will be introducing the topic and closing this session today.
Alison Felstead is Head of Resource Description and she will show you how you can read this material and tell you about how it can be used
And Svenja Kunze is a Project Archivist working on the UK Web Archive.
We have a lot of content and little time, so moving swiftly on.
The Bodleian Libraries are well known for the size of their print collection, much of which is due to their status as a Legal Deposit Library entitled to a free copy of every publication produced in the UK and Ireland. However, by the end of the 20th century it became clear that much of the nation’s published output in digital form was being lost.
This led to the revised LDL Act 2003 and the Regulations in 2013 that followed it.
Now we are able to…
What does this mean for readers?
Benefits to readers
I don’t propose to read out the list. This presentation will be available after the event and you can read the detail later, but just a few points stand out:
Access to the UK websites
Access to material in digital formats
Full text searching and immediate access
And for the LDLs…
There is also a Collection Development and Acquisitions Group, a Metadata Group, a Reader Services Group, Official Papers Group and Technical and Security Group. These work across the LDLs.
And locally…
With a remit to…
Consult
Receive information & feedback
Make recommendations
Communicate
With staff and readers
There is a continuous process of ingest. Here are some statistics up to…
Of what we have collected so far.
Now, to show you what is available and how you can access this I’ll pass you to Alison.
So how do you find and access eLD content – articles and monographs? It is metadata that provides the critical link between the eLD content held in the British Library’s Digital Library System, and the readers that want to find and access this content via SOLO.
Metadata is, of course, just another name for catalogue records.
We are receiving different types of metadata from the British Library, in different formats depending on what they are describing. The green ticks indicate the types of records we have received to date.
Not all types of record have been received yet, as the BL is still devising the workflows for ingesting the eLD content and its associated metadata.
No chapter-level metadata has been supplied by the BL, as no chapter-level content has been received – the BL is maintaining a watching brief on the publication of ebooks at the chapter level.
No issue-level records have been supplied yet, and the BL are currently working on the ingest of issue-level content (e.g. Business Monitor International reports).
(We have devised a local solution for retrieving all of the articles in a single issue of an eLD journal, in the SOLO Articles and More tab, and I will demonstrate this in a few minutes.)
Last year, the BL launched the OAI-PMH Gateway (Open Archives Initiative Protocol for Metadata Harvesting), which enables us to harvest ebook metadata when we choose, to our own timetable. The records are harvested in MARCXML format, which can be easily converted to MARC for loading into Aleph and publishing to SOLO. (The metadata for purchased ebooks is loaded into Aleph and published to SOLO in the same way.)
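The harvesting pattern described here can be sketched as follows. This is only an illustration of the OAI-PMH ListRecords verb and its resumptionToken paging, not the Bodleian's actual harvester: the base URL `https://example.org/oai`, the `marc21` metadata prefix, the record identifiers and the sample response are all assumptions made for the example, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, from_date=None,
                     resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # selective harvest by datestamp
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [el.text for el in
                   root.iterfind(".//oai:header/oai:identifier", OAI_NS)]
    token = root.findtext(".//oai:resumptionToken", default="",
                          namespaces=OAI_NS)
    # An empty or missing token means the list is complete.
    return identifiers, ((token or "").strip() or None)


# Hypothetical response page, for illustration only.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example:1</identifier></header></record>
    <record><header><identifier>oai:example:2</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://example.org/oai",
                             metadata_prefix="marc21", from_date="2015-07-01")
ids, next_token = parse_page(SAMPLE_PAGE)
next_url = list_records_url("https://example.org/oai",
                            resumption_token=next_token)
```

Because the harvester chooses the `from` date and decides when to request the next page, this protocol fits the "to our own timetable" model described above.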
You should be aware that the records vary in fullness. Some are brief acquisitions records used by the BL to ingest the ebooks into their Digital Library System. Some are created from the metadata supplied by the publishers and do not conform to RDA (or AACR2). The quality is similar to what you would find on Amazon. Others are created by the BL if no publisher-supplied metadata is available.
eCIP records are being created by BDS (Bibliographic Data Services) as part of the CIP Programme, and if the brief acquisitions record finds a match with an eCIP record in the BL system, the brief record is upgraded. eCIP records are created to the standard of CIP records for print books with full authority control and subject headings.
In addition, the BL Digital Processing Team is manually upgrading some of the acquisitions records – but this will not be sustainable in the longer term, as the number of eLD books increases, and is intended as an interim measure whilst batch-upgrading workflows are explored and implemented.
You can search for records for eLD monographs in the SOLO Oxford Collections tab, just like other ebooks, limiting your search to Online Resources.
You can identify the records for eLD monographs by the warning message “Online access is restricted: available via Bodleian Libraries reading room PCs only”.
As with all eLD materials, the eLD monographs can only be accessed on Library PCs (including staff and reader PCs).
You can see that there are two records for the same eLD monograph. Before I show you what these resources look like, I will explain why.
EPUB is an industry standard for ebooks. PDF you will be familiar with already.
eLD monographs supplied by the publishers in both EPUB and PDF formats are represented in SOLO by separate records for the formats. This explains the apparent duplication. (The format is only in a part of the MARC record that does not display in the SOLO brief display.)
The interface used for EPUB-formatted eLD books is Calibre which may be familiar to some staff.
Calibre is a free and open source e-book reading application. It has navigational tools, supports use of Table of Contents, bookmarks, customizing screen by size and font and background colour, and searching.
The text flows to suit the device being used to view the ebook. This means the print layout is not preserved. If I click the Table of Contents button on the left side, the TOC displays and causes the text on the right to be reformatted for a narrower page width.
This reflowable text feature of the EPUB format introduces questions about how to cite from ebooks – but these questions are not specific to eLD monographs.
The interface used for PDF eLD books is Sumatra, a PDF reader. It has fewer features than Calibre, but searching within the book is supported.
PDF ebooks look much the same as eLD articles. The print layout is preserved, including pagination.
~~~~~~~~~~~~~~~~
I have worked with Nathalie Schulz to produce the “eVBD” – a current awareness service to keep subject librarians (and anyone else who is interested) up-to-date with the latest ebooks available to readers under Electronic Legal Deposit (eLD).
Like the well-established Virtual Book Display, the eVBD is available both on SOLO and as an Excel report. The identification of the publishers included in these reports as “academic” was initially based on two Google polls that I distributed to the subject-reps list before eLD monographs went live in February 2015. As new publishers start to deposit electronically, their books are flagged as “academic” by default, but subject librarians are encouraged to advise us if they should be flagged as non-academic in future eVBDs.
The Excel reports contain hyperlinks to the records in SOLO, from where you can link to the eLD content on staff and reader PCs in the Bodleian Libraries. The Imprint (i.e. publisher) column has a filter so that you can restrict your view to publishers of interest.
Now I want to talk about eLD journals. These ejournals may either be “born eLD”, i.e. we have never received them under print legal deposit, or they may have switched from print to electronic deposit.
In the case of print legal deposit journals that have switched to eLD, Jackie’s staff in Legal Deposit Operations are “closing off” the print subscriptions and their records in Aleph, and this process includes adding a note to the subscription for the print journal to advise readers that the title is now available electronically.
Staff in Legal Deposit Operations have “closed off” the print subscriptions for X,XXX journal titles that have switched to electronic, and have another XXX to work on.
On the Details & Links tab, readers will find a link to “Search for articles from this journal in the Articles & More tab.”
Note that there is no direct access from the title-level record for the journal to the eLD articles that are available.
When readers click on the link a new window opens for the Articles & More tab, with a list of all of the articles for the journal that have been received under eLD.
Each article-level record bears the usual health warning.
Readers may refine the results using the left-hand facets, or may sort the articles to suit their requirements.
Once they have found an article that they wish to consult, they can click on the View Online link to launch the Ericom system, which delivers the e-article from the British Library’s DLS to the library PC.
The Ericom system connects to the BL’s DLS.
The reader must accept the Terms and Conditions of access to the eLD content by clicking the Accept button.
Once the Accept button is clicked the PDF article is displayed in the Sumatra PDF viewer (as with PDF eLD monographs). The reader can then read the article by scrolling through the pages, or by using the limited navigation tools available (find, go to page, etc.).
You might just be able to make out a print button in the top left corner of the PDF viewer. This is a good place to talk about how readers can obtain printouts from eLD content (but note that digital copies cannot be made).
Until recently, if a reader wanted a printout of an eLD article (or book chapter) they had two options:
Ask a member of staff in a library that offered mediated printing to print out the article for them, on a staff machine.
Order a printout of the article using the Print and Deliver service which can be accessed from the link on the Details and Links tab of the article-level record in SOLO. [I will point this out on a later slide.]
Happily, BDLSS staff have recently succeeded in implementing a self-service printing option, integrated with the PCAS system.
[Check what happens when you click Print.]
The other printing options will be retained for the time being, and the Print and Deliver option in particular provides a service for readers who cannot come to a Bodleian Libraries reading room to consult eLD material, or would simply prefer to order a printout remotely (from anywhere that SOLO can be accessed).
Now I want to return to finding eLD content in SOLO. We have seen how you can find eLD monographs and LD journals that have switched from print to e. Now I want to show you what born-eLD journals look like in SOLO – how the records in SOLO have been used to lead the reader to the content they seek.
For born-eLD journals, title-level records are being received in MARC format from the BL. They are being loaded into Aleph, and published to SOLO like the records for eLD monographs.
These records appear in the Oxford Collections tab alongside records for the print LD journals (where applicable) and subscription ejournals. We are only loading the title-level records for which article-level metadata has been received, for the very good reason that it is the article-level records that contain the “View online” link to the article, as demonstrated previously.
Here is an example of the eLD run of Seed science research appearing below the record for the subscription ejournal. The health warning associated with the record for the eLD version is designed to discourage readers from using the eLD set with limited access, and to encourage readers to use the subscription set, which has no such restrictions.
The “View online” link and the “Search for articles from this journal in the Articles & More tab” both serve the same purpose in this context … they take the reader to the Articles & More tab where all the articles accessible under eLD for this title are collocated.
Click on the title of the selected article to view more details and links.
Let’s look more closely at the links that are offered.
“Print & Deliver Request” allows readers with PCAS accounts or Bodleian cards, wherever they are, to order a printout of the current article.
“Direct link to Electronic Legal Deposit item” takes the reader to the PDF article using the Ericom access solution, as demonstrated earlier.
“About Electronic Legal Deposit” takes the reader to the Oxford LibGuide on eLD.
And last, but not least, the “Search for other articles in the same issue” enables readers to reconstitute the issue by retrieving the records for all the articles in it.
By clicking the link, all of the articles in Volume 24 Issue 4 of Seed science research are retrieved, although not necessarily in the “right” order. The “Sorted by” option can be used to sort the articles to a limited extent, and the Author sort is probably the most useful in this context. (Note that the Title sort does not ignore initial articles in titles, so it is of limited usefulness.)
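To illustrate the Title sort limitation, here is a small sketch comparing a plain sort with one that skips initial articles. It assumes each record carries a count of non-filing characters, as the second indicator of MARC field 245 does for title-level records; whether the article-level records in SOLO carry such a value is not stated in the source, and the titles below are hypothetical.

```python
def filing_key(title: str, nonfiling: int = 0) -> str:
    """Sort key that skips the first `nonfiling` characters of a title."""
    return title[nonfiling:].casefold()


# (title, number of non-filing characters) pairs; titles are invented examples.
articles = [
    ("The role of seed dormancy", 4),
    ("Seed ageing", 0),
    ("A review of germination", 2),
]

# Plain sort: initial articles "A" and "The" decide the order.
plain_order = [title for title, _ in
               sorted(articles, key=lambda a: a[0].casefold())]

# Filing sort: initial articles are skipped.
filing_order = [title for title, n in
                sorted(articles, key=lambda a: filing_key(a[0], a[1]))]
```

With the plain sort, "The role of seed dormancy" files under T; with the filing key it files under R, next to "A review of germination".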
That’s the end of my presentation – a fairly swift canter through the means of finding and accessing eLD content via SOLO.
I will now hand over to Svenja, who is going to talk about one of the most exciting aspects of Electronic Legal Deposit – the UK LD Web Archive.
2.5M non .uk new hosts identified in 2014 crawl
Many thanks Svenja.
That was rather a gallop through eLD. You can find out more at…
Point out the blog and the eLD tutorial
Instructions on how to access the blog
Many thanks for your interest. Now, I think lunch is calling…