The European Library provides access to research materials from the collections of Europe’s national and research libraries, representing members from 46 countries. This paper presents the current status, on-going work, and future plans of the resource dissemination services provided by The European Library, covering resources such as national bibliographies, digital collections, full text collections, its access portal and API, open linked data publication, and integration in digital humanities infrastructures. In the coming years, The European Library will work to provide the means and tools for digital humanities researchers to easily use research materials from libraries in their research activities.
Facilitating Access and Reuse of Research Materials: the Case of The European Library
1. Facilitating Access and Reuse of
Research Materials:
the Case of The European Library
Nuno Freire
The European Library
17th International Conference on Electronic Publishing
June 2013
2. Presentation outline
• Introduction of The European Library
• Resources aggregated by The European Library
• Bibliographic resources
• Full-text contents
• Resource dissemination and reuse services
• APIs and Linked Open Data
• Intellectual property rights infrastructures
• Research infrastructures
4. The European Library
The European Library provides access to research
materials from the collections of Europe’s national
and research libraries
• Its most visible service
is the portal
www.theeuropeanlibrary.org
6. The European Library
Provision of services based on exploiting
the centralization of pan-European
bibliographic data and digital content
• A portal and an API
• Library domain aggregator for Europeana
• Promoting the re-use of these digital
resources in many contexts
7. The European Library and the
Europeana Network
[Diagram: data providers (libraries, museums, archives and audio-visual archives), aggregators (domain, national, etc.) and service providers]
9. Resources aggregated by The
European Library
Bibliographic data:
• National bibliographies
• Comprehensive databases of all
publications in a country
• Traditional library catalogues
• Research collections from national and
research libraries (photographs, manuscripts,
historical pamphlets)
• May refer to digital and non-digital
materials
10. Resources aggregated by The
European Library
The European Library hosts a centralized index of
textual resources:
• It currently contains over 24 million pages of full-text
content, originating from 14 national libraries
• These textual resources were created mostly from
OCR
• The quality of the text varies, depending on the quality
of the original material, and the use of special fonts in
older materials.
• A heterogeneous resource in terms of types of
materials, languages and publication periods
11. Textual resources:
Country of origin | Material type | Pages | Temporal coverage | Languages
Austria | Newspapers, governmental material | 534.000 | 1862 – 1925 | German
Czech Republic | Books, newspapers | 2.579.511 | 1800 – 1989 | Czech, German
Estonia | Newspapers, journals | 713.933 | 1821 – 1940 | Estonian
France | Books, periodicals | 8.242.908 | 1650 – 1930 | French (some others)
Hungary | Periodicals, newspapers, journals, books, monographs, pamphlets | 237.914 | 1590 – 1992 | Hungarian, Latin, English, German
Iceland | Newspapers, journals | 5.727.149 | 1773 – 2002 | Icelandic, Faroese, Greenlandic
Latvia | Newspapers, books | 195.075 | 1900 – 1952 | German, Latvian
Lithuania | Newspapers | 125.477 | 1904 – 1940 | Lithuanian
Norway | Books, journals | 1.600.000 | By authors dead for more than 70 years | Norwegian (others)
Poland | Newspapers, books | 436.198 | Before 1939 | Polish, German, Czech, Ukrainian, Belarusian, Yiddish
Slovakia | Newspapers | 185.000 | Before 1918 | Slovak, Hungarian, German
Slovenia | Newspapers, books, journals | 328.502 | 1500 – 1945 | Slovenian
Spain | Newspapers, books | 3.033.525 | 17th – 19th Century | Spanish
Sweden | Newspapers, books, journals, printed ephemera | 253.653 | Until the 20th century | Swedish
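As a quick consistency check, the page counts in the table can be summed and compared with the figure of over 24 million pages from 14 national libraries quoted on the previous slide. The sketch below is only illustrative: the names PAGES_BY_COUNTRY and parse_pages are not part of any European Library tool, and the figures are copied from the table, where a period is used as the thousands separator.

```python
# Page counts exactly as printed in the table (period = thousands separator).
PAGES_BY_COUNTRY = {
    "Austria": "534.000",
    "Czech Republic": "2.579.511",
    "Estonia": "713.933",
    "France": "8.242.908",
    "Hungary": "237.914",
    "Iceland": "5.727.149",
    "Latvia": "195.075",
    "Lithuania": "125.477",
    "Norway": "1.600.000",
    "Poland": "436.198",
    "Slovakia": "185.000",
    "Slovenia": "328.502",
    "Spain": "3.033.525",
    "Sweden": "253.653",
}


def parse_pages(text: str) -> int:
    """Convert a count such as '2.579.511' into an integer."""
    return int(text.replace(".", ""))


total_pages = sum(parse_pages(value) for value in PAGES_BY_COUNTRY.values())

# Prints: 14 libraries, 24,192,845 pages
print(f"{len(PAGES_BY_COUNTRY)} libraries, {total_pages:,} pages")
```

The total of 24,192,845 pages agrees with the "over 24 million pages" stated for the centralized index.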
13. Textual resources:
• These textual resources will be expanded during 2013,
thanks to the Europeana Newspapers project
• http://www.europeana-newspapers.eu/
• A group of 17 European institutions will provide more
than 18 million newspaper pages for The European
Library and Europeana.
• Availability of material varies:
• Some are orphan works
• Some are public domain.
• Public domain works will be accessible for download
and reuse
15. Data dissemination channels
by The European Library
Commitment to provide ease of access to data:
• Search APIs
• Linked Open Data
• To be publicly available during 2013
Expected benefits
• Higher Profile – raising data providers’ profile and
driving web traffic to data providers’ websites.
• Establish Authority - to become an authority for library
data
• Positioning The European Library as a data hub
16. ARROW – Accessible Registries of
Rights Information and Orphan
Works towards Europeana
ARROW is a tool to facilitate rights
information management in any digitisation
project involving text and image based works
The ARROW infrastructure makes it possible to
determine, for a work:
• The authors, publishers and other right-holders
• Whether it is orphan
• Whether it is in or out of copyright
• Whether it is still commercially available
17. ARROW - Motivation
To support mass digitisation projects with automated
ways to clear the rights of the books to be digitised.
To identify and clear the rights associated with a book,
a complex process needs to be undertaken:
• Determine the work(s) contained within the book
• Identify all the other expressions of the same work(s)
• Identify the publisher(s) and contributor(s) involved
• Determine the dates of publication at work level
• Determine whether the work(s), and not the book itself, are
still in commerce
• If necessary, obtain any licenses from the rights holders or
collective rights organizations
19. The ARROW
Workflow
ONIX for Rights Information Services (ONIX-RS) used
for data exchange between ARROW participants
20. The role of
The European
Library
Allow the identification of the bibliographic record
describing the manifestation whose rights are to be
cleared
21. The role of
The European
Library
Identify all other manifestations that potentially share
intellectual work with a manifestation
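The slides do not describe how candidate manifestations are found. One common approach, sketched below purely as an assumption, is to group bibliographic records by a normalised author and title key. The record fields, the identifiers, and the functions normalise, work_key and candidate_manifestations are all hypothetical and do not represent ARROW's or The European Library's actual implementation.

```python
import re
import unicodedata


def normalise(text: str) -> str:
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def work_key(record: dict) -> tuple:
    """Hypothetical matching key: normalised author plus normalised title."""
    return (normalise(record["author"]), normalise(record["title"]))


def candidate_manifestations(target: dict, catalogue: list) -> list:
    """Return other records whose key equals that of the target record."""
    key = work_key(target)
    return [
        record
        for record in catalogue
        if record["id"] != target["id"] and work_key(record) == key
    ]


# Invented records, for illustration only.
catalogue = [
    {"id": "rec-1", "author": "Exemplo, José", "title": "Um Título Ilustrativo"},
    {"id": "rec-2", "author": "EXEMPLO, JOSE", "title": "Um titulo ilustrativo."},
    {"id": "rec-3", "author": "Exemplo, José", "title": "Outra Obra"},
]

matches = candidate_manifestations(catalogue[0], catalogue)
print([record["id"] for record in matches])  # ['rec-2']
```

An exact key match of this kind only yields candidates: the slide speaks of manifestations that "potentially" share a work, so a real workflow would need further checks.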
22. The role of
The European
Library
Match work contributors against VIAF to gather more
information for the ARROW process
(Name forms, birth and death dates, nationality)
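To illustrate the idea of matching a contributor name against an authority file, the sketch below compares names against a small in-memory list of records. It does not call VIAF and does not reflect VIAF's data format or API: the AuthorityRecord class, its fields, the identifier and the matching rule are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AuthorityRecord:
    """Hypothetical, simplified authority record."""
    identifier: str
    name_forms: Tuple[str, ...]
    birth_year: Optional[int]
    death_year: Optional[int]
    nationality: Optional[str]


def name_key(name: str) -> str:
    """Order-insensitive key, so 'Exemplo, Maria' equals 'Maria Exemplo'."""
    tokens = name.lower().replace(",", " ").replace(".", " ").split()
    return " ".join(sorted(tokens))


def match_contributor(name: str, authority_file: List[AuthorityRecord]) -> List[AuthorityRecord]:
    """Return every authority record with a name form matching the given name."""
    key = name_key(name)
    return [
        record
        for record in authority_file
        if any(name_key(form) == key for form in record.name_forms)
    ]


# Invented authority data, for illustration only.
authority_file = [
    AuthorityRecord("auth-001", ("Maria Exemplo", "Exemplo, M."), 1850, 1920, "PT"),
    AuthorityRecord("auth-002", ("Joana Amostra",), 1901, None, None),
]

for record in match_contributor("Exemplo, Maria", authority_file):
    print(record.identifier, record.birth_year, record.death_year, record.nationality)
```

The birth and death dates returned by such a match are the kind of information that can feed the rights-clearance steps listed earlier, for example when deciding whether a work may be out of copyright.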
23. Projects Towards Enabling the Use of
Research Materials from Libraries
CENDARI
Collaborative European Digital Archive
Infrastructure
• Research Infrastructure for the study of
Medieval Manuscripts and World War I
• http://www.cendari.eu/
Europeana Cloud
• Started in February 2013
24. Project Europeana Cloud
This project builds on the Europeana
infrastructure to make cultural heritage
materials available for research
It will set up a research infrastructure
providing discovery services and tools:
• A cloud infrastructure for data and contents
• The licensing framework for reuse of content
• A new research platform: Europeana Research
25. Project Europeana Cloud
A new research platform: Europeana Research
A research platform will be created by extending the existing
portal of The European Library
The project will analyse how academic users work with data
and how they perceive the value of the content in Europeana
• This analysis will be the basis of the content strategy of Europeana Research
• It will provide an understanding of the scholarly workflows to be supported
To be carried out jointly with:
• DARIAH - Network of arts and humanities researchers
• CESSDA - Council of European Social Science Data Archives
26. Project Europeana Cloud
A new research platform: Europeana Research
The project will also address tools for scholars to interact
with the content from Europeana Research.
The areas to be approached are:
• Accessing and Analysing Data
permitting scholars to download, manipulate and analyse data sets.
• Annotation
allowing researchers to annotate documents and to share
annotations
• Transcription
allowing users to transcribe and interpret documents
• Discovery and Access
ensuring that research material is discoverable, possibly with
integration in other research infrastructures in the field of Digital
Humanities.
To provide the means and tools for digital humanities researchers to exploit cultural heritage materials, a new research infrastructure, Europeana Research, will be created by extending the existing portal of The European Library. Early stages of the project will analyse how academic users locate data and how they perceive the value of the content within Europeana. This analysis will be the basis of the content strategy of the Europeana Research platform, and will also provide an understanding of the scholarly workflows that Europeana Research will support. It will be carried out jointly with the DARIAH network of arts and humanities researchers, and with the Council of European Social Science Data Archives (CESSDA). Academics working in these domains will be the most fertile exploiters of Europeana cultural content, and therefore have key roles to play in shaping the future of Europeana Research.
The analysis will also identify tools that let researchers manipulate and exploit cultural heritage materials in innovative ways. The project will therefore develop a suite of tools that allows scholars to interact with the content that they require from Europeana Research. The areas to be approached are:
• Accessing and Analysing Big Data - permitting scholars to download, manipulate and analyse large data sets.
• Annotation - allowing researchers to annotate documents and to share these annotations.
• Transcription - allowing users to transcribe and interpret documents.
• Discovery and Access - ensuring that services are tailored so that research material is discoverable by the scholarly community, possibly with integration in other research infrastructures in the field of Digital Humanities.