Enriching Cultural Heritage Data with DBpedia
1. Enriching Cultural Heritage Data with DBpedia
Antoine Isaac | DBpedia Community Meeting 2016
Netherlands, Public Domain
1660 - 1625, Rijksmuseum
Anonymous
Arrival of a Portuguese ship
2. Europeana?
CC BY-SA
Europeana Collections homepage
Europeana | CC BY-SA
3. Europeana?
Europeana aggregation infrastructure
Europeana | CC BY-SA
4. Europeana has many data challenges
We aggregate very heterogeneous metadata
• More than 48M objects
• 3,500 galleries, libraries, archives and museums
• 50 languages
• From all EU countries
• Level of quality varies greatly
5. Linked Open Data
Europeana Linked Open Data video on Vimeo
Europeana | CC BY-SA
6. Europeana Linked Data Strategy
Our efforts and lines of work
• The Europeana Data Model (EDM) offers a way to represent richer (linked) data
• We apply an enrichment strategy to link source data to reference data, including DBpedia
Will be discussed in Parallel Session 2:
• We encourage data providers to contribute links between objects and (their own) vocabularies
• We encourage alignment activities between domain vocabularies
7. The Europeana Data Model
Clavecin, Bartolomeo Cristofori
Cité de la Musique, MIMO - Musical Instruments Museums Online | CC BY-NC-SA
Europeana Data Model example
Europeana | CC BY-SA
8. Create a “semantic layer” on top of cultural heritage objects
Include multilingual “value vocabularies” (e.g. thesauri represented in SKOS) from Europeana’s providers or from third-party data sources
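To make the idea of a multilingual value vocabulary more concrete, here is a minimal Python sketch. It is not Europeana's implementation: the `Concept` class, the example.org URI and the labels are hypothetical, and the structure only loosely follows the SKOS idea of one preferred label per language attached to a concept URI.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Concept:
    """A SKOS-style concept: one URI, one preferred label per language."""
    uri: str
    pref_labels: Dict[str, str] = field(default_factory=dict)

    def label(self, lang: str, fallback: str = "en") -> Optional[str]:
        # Prefer the requested language, then the fallback language, else None.
        return self.pref_labels.get(lang) or self.pref_labels.get(fallback)


# Hypothetical concept; the URI and labels are illustrative only.
harpsichord = Concept(
    uri="http://example.org/vocab/harpsichord",
    pref_labels={"en": "harpsichord", "fr": "clavecin", "de": "Cembalo"},
)

# The object record points at the concept URI instead of holding a bare string,
# so every language label of the concept becomes available for that object.
record = {"dc:title": "Clavecin", "dc:type": harpsichord.uri}

print(harpsichord.label("fr"))
print(harpsichord.label("nl"))  # no Dutch label, falls back to English
```

The point of the sketch is the indirection: once the record references a concept rather than a literal, labels in other languages come for free.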
9. Semantic enrichment, a solution for better quality data?
Automatic and manual enrichment are more and more commonly used in digital libraries to:
• normalise data
• “standardize data” by linking it to authority resources
• improve multilingual coverage in datasets
• contextualise resources
10. The main components of semantic enrichment
Source
• source objects whose metadata is being enriched
Target
• set of resources used to enrich the source metadata
• targets can be of different types, from simple uncontrolled strings to resources published as LOD
Rules
• specify how the enrichment between the source and target should be executed
11. Automatic enrichment process in Europeana
Analysis
• selection of metadata fields in descriptions
• selection of potential rules to match
Linking
• matching the values of the metadata fields to values of the contextual resources
• adding contextual links
Augmentation of search index
• selection of values from the contextual resource
• values go into the search index
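The three stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the example.org URIs and the exact, case-insensitive matching rule are hypothetical and do not describe Europeana's actual enrichment code.

```python
# Hypothetical contextual resources: URI -> labels by language.
CONTEXT = {
    "http://example.org/place/paris": {"en": "Paris", "fr": "Paris", "de": "Paris"},
    "http://example.org/place/vienna": {"en": "Vienna", "de": "Wien", "fr": "Vienne"},
}

# Fields selected for enrichment (assumed here to be the place-related fields).
FIELDS_TO_ENRICH = ["dcterms:spatial", "dc:coverage"]


def analyse(record):
    """Analysis: pick out the values of the fields chosen for enrichment."""
    return [(f, v) for f in FIELDS_TO_ENRICH for v in record.get(f, [])]


def link(candidates):
    """Linking: match field values against labels of contextual resources.

    The rule used here is an exact, case-insensitive label match (an assumption).
    """
    links = []
    for field_name, value in candidates:
        needle = value.strip().lower()
        for uri, labels in CONTEXT.items():
            if needle in {label.lower() for label in labels.values()}:
                links.append((field_name, value, uri))
    return links


def augment(links):
    """Augmentation: collect the labels of linked resources for the search index."""
    terms = set()
    for _, _, uri in links:
        terms.update(CONTEXT[uri].values())
    return sorted(terms)


record = {"dc:title": ["Ansicht von Wien"], "dcterms:spatial": ["Wien"]}
links = link(analyse(record))
index_terms = augment(links)
print(links)
print(index_terms)
```

In this sketch a record that only says "Wien" becomes findable under "Vienna" and "Vienne" as well, which is the multilingual benefit the slides describe.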
13. Vocabularies we currently enrich metadata with
Entity class | Target vocabulary | Size    | Metadata fields subject to enrichment
Places       | GeoNames          | 140,097 | dcterms:spatial, dc:coverage
Concepts     | DBpedia           | 5,284   | dc:subject, dc:type
             | GEMET             | 280     |
Agents       | DBpedia           | 161,209 | dc:creator, dc:contributor
Time         | Semium Time       | 2,566   | dc:coverage, dcterms:temporal, dc:date, edm:year
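The table above can also be read as a configuration that tells an enrichment tool which vocabularies to try for a given metadata field. The Python sketch below is a hypothetical encoding of it, not Europeana's configuration format. It assumes that GEMET belongs to the Concepts row and shares its fields, which is how the flattened table reads.

```python
# Hypothetical encoding of the table: entity class -> vocabularies and fields.
# Assumption: GEMET is grouped with Concepts and targets the same fields.
ENRICHMENT_TARGETS = {
    "Places": {
        "vocabularies": ["GeoNames"],
        "fields": ["dcterms:spatial", "dc:coverage"],
    },
    "Concepts": {
        "vocabularies": ["DBpedia", "GEMET"],
        "fields": ["dc:subject", "dc:type"],
    },
    "Agents": {
        "vocabularies": ["DBpedia"],
        "fields": ["dc:creator", "dc:contributor"],
    },
    "Time": {
        "vocabularies": ["Semium Time"],
        "fields": ["dc:coverage", "dcterms:temporal", "dc:date", "edm:year"],
    },
}


def vocabularies_for_field(field_name):
    """Return (entity class, vocabulary) pairs that may enrich the given field."""
    return [
        (entity_class, vocabulary)
        for entity_class, spec in ENRICHMENT_TARGETS.items()
        if field_name in spec["fields"]
        for vocabulary in spec["vocabularies"]
    ]


print(vocabularies_for_field("dc:coverage"))
print(vocabularies_for_field("dc:subject"))
```

Note that dc:coverage appears in two rows, so a value in that field may be a place or a time period; a tool has to try both vocabularies or disambiguate first.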
14. Why DBpedia?
Building an ecosystem of networked references
• It offers labels in about 124 languages across its language editions, 48 of which match the languages that Europeana supports
• It gives fairly complete and accurate descriptive metadata about entities
• It works well as a “pivot” vocabulary, providing further links to other vocabularies such as Wikidata and Freebase
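A short Python sketch of what using DBpedia as a pivot could look like: once an object is linked to a DBpedia resource, its equivalence links lead on to other datasets. The `SAME_AS` table, the `pivot` function and all identifiers in it are placeholders for illustration; a real tool would read such links from the DBpedia data rather than from a hard-coded dictionary.

```python
# Placeholder equivalence links (in DBpedia these are expressed with owl:sameAs).
# The resource name and the target identifiers below are made up.
SAME_AS = {
    "http://dbpedia.org/resource/Example_Person": [
        "http://www.wikidata.org/entity/Q_EXAMPLE",
        "http://rdf.freebase.com/ns/m.example",
    ],
}

WIKIDATA = "http://www.wikidata.org/entity/"
FREEBASE = "http://rdf.freebase.com/ns/"


def pivot(dbpedia_uri, target_prefix):
    """Follow equivalence links from a DBpedia URI into another vocabulary."""
    return [
        uri
        for uri in SAME_AS.get(dbpedia_uri, [])
        if uri.startswith(target_prefix)
    ]


agent = "http://dbpedia.org/resource/Example_Person"
print(pivot(agent, WIKIDATA))
print(pivot(agent, FREEBASE))
```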
15. Not everything is perfect
France, Public Domain
1921, National Library of France
Agence de presse Meurisse
Colombes : championnats de France d’Athlétisme : rivière, le speaker
16. Challenges of multilingual automatic enrichment
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Poisonous India or the Importance of a Semantic and Multilingual Enrichment Strategy
Marlies Olensky, Juliane Stiller, Evelyn Dröge, MTSR 2012
http://link.springer.com/chapter/10.1007%2F978-3-642-35233-1_25
17. Comparative evaluation of enrichments
We ran a quantitative evaluation on a sample set enriched by 7 different tools (settings)
http://pro.europeana.eu/taskforce/evaluation-and-enrichments
18. Example of Recommendations that will be explored
Define your enrichment goals
• Develop better criteria for evaluating enrichment
Choose the right service
• enrichment tool more aware of the semantics of the model
Monitor your enrichment process and re-assess
• target dataset could be richer: new terms, new languages, more granular
Enrichment using a better reference for contextual entities?
You will hear about this in the next session ☺
19. With slides from Valentine Charles, Juliane Stiller, Hugo Manguinhas and Stefan Gradmann
Editor's notes
Europeana works with data experts around the world to ensure that the Europeana Data Model describes our cultural heritage material in the best possible way, and in a way that means it can link in with other systems.
Europeana Data Model
Information about digital cultural heritage comes in a variety of formats. Europeana has developed the Europeana Data Model to ensure that collections from any organisation are treated and displayed in the same ways in Europeana’s systems and services. Data partners often hold their data in a local or standard metadata format in their own systems. That data needs to be mapped, transformed and exported from those systems to EDM for use in Europeana’s systems. EDM has now become an industry standard for cultural heritage data.
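The mapping step described above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the local field names, `FIELD_MAP` and `to_edm` are hypothetical, and the output is a plain dictionary that borrows a few EDM and Dublin Core property names rather than a complete, valid EDM record.

```python
# Hypothetical mapping from a provider's local field names to properties
# used in EDM descriptions of the cultural heritage object.
FIELD_MAP = {
    "title": "dc:title",
    "maker": "dc:creator",
    "place": "dcterms:spatial",
}


def to_edm(local_record, cho_uri):
    """Map a local record to a simplified EDM-like description.

    Returns the mapped description and the local fields that had no mapping,
    so that nothing is dropped silently.
    """
    cho = {"@id": cho_uri, "@type": "edm:ProvidedCHO"}
    unmapped = {}
    for key, value in local_record.items():
        target = FIELD_MAP.get(key)
        if target is None:
            unmapped[key] = value
        else:
            cho.setdefault(target, []).append(value)
    return cho, unmapped


# Illustrative local record; the inventory number field has no mapping here.
local = {"title": "Clavecin", "maker": "Bartolomeo Cristofori", "inv_no": "E.0000"}
cho, unmapped = to_edm(local, "http://example.org/item/1")
print(cho)
print(unmapped)
```

Returning the unmapped fields separately is a design choice of this sketch: it makes gaps in the mapping visible to the data partner instead of hiding them.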
Since its original release, EDM has permeated the entire portfolio of Europeana products: we ingest, store, enrich and exchange data following a richer, more semantic (and more multilingual!) approach. This work continues, for example, as Europeana prepares to handle more data enrichment, including user annotations. EDM has also been extended to meet the data needs of specific domain aggregators, like Europeana Sounds, and to address the requirements of new data services and enrichment in Europeana's main platform. EDM is now used by Europeana and several other cultural aggregators, such as DPLA and DDB. In true linked data fashion, EDM "profiles" can be developed without Europeana having to update the core model. Exploiting the data expressed with these profiles across different systems still requires work. But it is no longer impossible to realise the vision where the design of data models is decentralised and tailored to specific applications, while the data created and exchanged with them together still forms a vast, semantically interoperable knowledge environment.
See more at: http://pro.europeana.eu/blogpost/the-europeana-data-model-a-living-model-5-years-on