Europeana Newspapers: Surveying Newspaper Digitisation in European Libraries, then Aggregating Them
1. Europeana Newspapers
Alastair Dunning
Programme Manager, The European Library
@alastairdunning, alastair.dunning AT kb.nl
LIBER Conference, June 2013, Munich
Surveying Newspaper Digitisation in European
Libraries, Then Aggregating Them !
This presentation is at http://www.slideshare.net/alastairdunning
2. On November 3, 1948,
the early edition of the
Chicago Tribune
proclaimed Thomas
Dewey as winner of the
US presidential
campaign
http://www.chicagotribune.com/news/politics/chi-histdewey_defeats_an20080104104816,0,547284.photo
3. In actual fact, the
campaign was won by
Harry Truman, who
became the 33rd
President of the United
States
http://en.wikipedia.org/wiki/File:Deweytruman12.jpg
4. Later editions of the
Chicago Tribune
corrected this mistake
with headline
"DEMOCRATS MAKE
SWEEP OF STATE
OFFICES"
However, I cannot find
these online !
http://en.wikipedia.org/wiki/File:Deweytruman12.jpg
5. As we shall see, presenting
comprehensive digital archives,
where everything is digitised, is
difficult... yet this is what users
often demand !
6. "This lack of collocation and collection
presents efficiency challenges and deepens
scholars’ concerns about
comprehensiveness. The anxiety over
“missing something” was quite common
across interviews."
Ithaka S+R, Supporting the Changing
Research Practices of Historians,
http://www.sr.ithaka.org/research-publications/supporting-changingresearch-practices-historians
7. "When lined up against the non-digital
object upon which it is based, the digital
object can only ever appear impoverished."
Jim Mussell, Historian at
University of Birmingham
http://jimmussell.com/2013/05/23/the-proximal-pastdigital-archives-and-the-here-and-now/
8. Genealogists - those studying family
history
"Genealogists represent the majority of
users in many archives. And yet, the
traditional archival information system
does not meet their needs."
Wendy M. Duff, Catherine A. Johnson, Where Is the
List with All the Names? Information-Seeking Behavior
of Genealogists, American Archivist, Volume 66(1),
2003, http://archivists.metapress.com/content/L375UJ047224737N
9. Despite this, European
libraries have made great
strides in digitising their
newspapers
(These results taken from first
Europeana Newspapers
survey, 2012. 47 libraries
responded.)
http://www.europeana-newspapers.eu/wp-content/uploads/2012/04/D4.1-Europeananewspapers-survey-report.pdf
12. 11 libraries have digitised more than 3m pages
1. National Library of Czech Republic
2. Koninklijke Bibliotheek van België
3. National Library of Spain
4. National Library of Norway
5. National and University Library of Iceland
6. BCU Lausanne
7. Hamburg State and University Library
8. Bibliothèque nationale de France
9. British Library
10. Koninklijke Bibliotheek
11. Austrian National Library
13. But only 12 (26%) of the libraries had digitised more than 10% of their collection
(either in terms of titles or page numbers)
14. National Library of Luxembourg
4.000.000
pages in collection
620.000
pages digitised
15. National Library of Finland
620.000
pages digitised
2.010.246
pages in collection
16. Hamburg State and University Library
c. 2.000.000 pages digitised
c. 12.000.000 pages in collection
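The three examples on slides 14 to 16 can be put on a common footing by working out what share of each collection has been digitised. The following is a minimal sketch that uses only the page counts quoted on those slides. The Hamburg figures are approximate ("c."), and the names `collections`, `digitised_share` and `shares` are illustrative, not taken from the survey.

```python
# Page counts as quoted on slides 14-16: (pages digitised, pages in collection).
# The Hamburg figures are approximate ("c.") in the source.
collections = {
    "National Library of Luxembourg": (620_000, 4_000_000),
    "National Library of Finland": (620_000, 2_010_246),
    "Hamburg State and University Library": (2_000_000, 12_000_000),
}


def digitised_share(digitised: int, total: int) -> float:
    """Return the digitised pages as a percentage of the whole collection."""
    return 100 * digitised / total


shares = {name: digitised_share(d, t) for name, (d, t) in collections.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}% digitised")
```

On these figures, all three libraries sit above the 10% threshold mentioned on slide 13, with roughly a sixth to a third of their pages digitised.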
18. Access to digitised newspapers is nearly always free of charge. At least 40 libraries (85%) offered free access to their digitised newspapers.
One library had pay per view, whilst another three offered subscription services for users (i.e. paid access per day or per month).
Only four libraries licensed their newspaper contents to other groups (e.g. schools, universities).
19. Access to twentieth-century content remains problematic.
27 out of 47 libraries (57%) have a cut-off date beyond which they will not publish digitised newspapers on the web. Most frequently, this is based on a 70-year sliding scale.
11 out of 47 libraries (23%) had an agreement with a rights organisation so that in-copyright digitised newspapers could be published, but this was often restricted to individual titles.
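A sliding cut-off of this kind can be sketched in a few lines. This is only an illustration under stated assumptions: the 70-year window is the most frequent value reported in the survey, but the comparison by publication year, the inclusive boundary, and the function names `cutoff_year` and `may_publish_online` are hypothetical. Individual libraries may apply different rules.

```python
from datetime import date

# Most frequent window reported in the survey; other libraries may differ.
SLIDING_WINDOW_YEARS = 70


def cutoff_year(today: date, window: int = SLIDING_WINDOW_YEARS) -> int:
    """Latest publication year that falls outside the sliding window."""
    return today.year - window


def may_publish_online(issue_year: int, today: date) -> bool:
    """Assumption: an issue is publishable once its year is at or before the cut-off."""
    return issue_year <= cutoff_year(today)
```

Under these assumptions, a library checking in June 2013 would have a cut-off year of 1943, so an issue from 1940 could go online while one from 1948 could not. Because the window slides, one further year of content becomes publishable every year.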
20. There is still much to be done to exploit the richness of digitised newspaper content
64% (37 from 47) of libraries made use of OCR
But only 17 of these libraries (36%) exposed the resulting full text to the viewer
36% had undertaken zoning and segmentation and only six libraries (13%) had included features such as facetted browsing or extracting entities such as place or name
21. --> Motivation for Europeana
Newspapers
Other WPs will explain the process of
improving digitised archives, but I
want to return to one earlier
quote
22. "... the lack of comprehensive search
tools for primary sources ..."
Locating primary sources presents a
crucial challenge for researchers.
--> TEL aggregator as part of
Europeana Newspapers project
23. Timetable: Early version with
limited content added to The
European Library website in
September 20
More content being added in 2013
and 2014
24. http://theeuropeanlibrary.org will
deliver a search interface to help
locate
18m pages digitised
at European libraries
Users will also be able to search
over titles of newspapers. Title
metadata will also be forwarded to
Europeana
25. Some Issues:
Copyright means that some
images cannot be shared at all,
only metadata (e.g. names and
dates of newspapers)
26. Some Issues:
OCR and zoning quality will affect
search results significantly. E.g.
text with higher-quality OCR will be
returned more often in search
results
28. Some Issues:
Different libraries are willing to
share different amounts of
content
Some libraries are happy for full
content to be shared; for others it
is just snippets of images
29. Last Thoughts and What Next ?:
The European Library will sustain access
beyond project funding; but adding more
content will require membership of TEL
How can we allow for transcription?
What do non-academic users want?
How do we create full-text APIs ?
30. Oh, the results here
were all based on the
first edition of the
project survey.
If your library wants to
contribute to later
editions, see links by
July 2013
http://www.europeana-newspapers.eu/tell-us-about-your-newspaperdigitisation-project/
http://www.surveymonkey.com/s/BQ28579