ArCo: the Knowledge Graph
of Italian Cultural Heritage
Valentina Presutti
University of Bologna
STLab, ISTC, National Research Council, Italy
ArCo team also includes Valentina Carriero (ISTC-CNR), Andrea Giovanni Nuzzolese (ISTC-CNR) and Aldo Gangemi (UniBo)
https://w3id.org/arco
These slides can be reused as they are under the CC BY 4.0 license.
Please cite the author and link to the original.
ArCo’s ambitious goal is to build the knowledge graph of Italian Cultural Heritage
Valentina Anita Carriero, Aldo Gangemi, Maria Letizia Mancinelli, Ludovica Marinucci, Andrea Giovanni Nuzzolese, Valentina
Presutti and Chiara Veninata: ArCo: the Italian Cultural Heritage Knowledge Graph. In Proceedings of ISWC 2019 (To appear)
Preprint at: https://arxiv.org/abs/1905.02840
ArCo KG in numbers
ONTOLOGY NETWORK
• 7 modules
• 5058 axioms
• 1049 predicates
DATA
• 169.151.644 triples
• 28.838 owl:sameAs linking to 20.479 distinct entities in other datasets
How to use ArCo
https://w3id.org/arco CC BY-SA 4.0 license
USER GUIDES for supporting users in understanding the content of each release, with Graffoo
diagrams and narrative explanations of every ontology module
https://essepuntato.it/graffoo/
ONTOLOGIES, including their
source code and a human-
readable HTML documentation
created with LODE
https://essepuntato.it/lode/
A SPARQL endpoint storing ArCo KG, which is also
downloadable as a compressed dump
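A minimal sketch of how a query could be sent to such an endpoint over the standard SPARQL 1.1 Protocol. The slide does not give the endpoint address, so `ENDPOINT` is a placeholder to be replaced, and the helper name `build_request_url` is an assumption made for illustration. The query uses only `owl:sameAs`, so it should mirror the link counts reported above without relying on any ArCo-specific term.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder (assumption): replace with the endpoint address published for ArCo.
ENDPOINT = "https://example.org/sparql"

# Counts owl:sameAs statements and the distinct entities they point to.
SAMEAS_QUERY = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(*) AS ?links) (COUNT(DISTINCT ?target) AS ?targets)
WHERE { ?entity owl:sameAs ?target }
"""


def build_request_url(endpoint: str, query: str) -> str:
    """Return a SPARQL Protocol GET URL carrying the query as a parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, SAMEAS_QUERY)
print(url)
```

The resulting URL can be fetched with any HTTP client; sending the header `Accept: application/sparql-results+json` asks for JSON results.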
Examples of Competency
Questions (CQs) that ArCo KG can
answer, with their corresponding
SPARQL queries.
This helps users to have a quick
understanding of what is in ArCo
ontologies and data, and how to
use it.
e.g.:
ArCo on GitHub
https://github.com/ICCD-MiBACT/ArCo
RDFizer
ontologies
Why ArCo?
• Regulations (open data)
• Fostering reuse by third parties
• Improving PA organisational data management
• Modeling Cultural Heritage knowledge vs metadata
• Cataloguing
• Providing data to scholars and researchers
• Connecting to other relevant knowledge bases
The general catalogue of Italian Cultural Heritage
…among ArCo’s main data and conceptual sources
the official institutional database of
Italian CH, maintained and published
by ICCD (Institute of the General
Catalogue and Documentation)
about 800.000
(out of
2.735.343)
publicly available
catalogue records
General
Catalogue
SIGEC
web
General Catalogue
of Italian Cultural
Heritage
the collaborative platform to which
formally authorised institutions can
submit their catalogue records, which
undergo a validation phase
30 types of
cultural
properties
ICCD
Cataloguing
standards
~15M catalogue
record numbers
released
Collecting and validating catalogue records
ICCD catalogue standards: documentation
XML input data
Challenge
From strings to domain entities
Different versions of catalogue standards
Building ArCo knowledge graph
with ontology design patterns
Ontology Design Patterns
An ontology design pattern
is a reusable successful solution
to a recurrent modeling problem
Aldo Gangemi, Valentina Presutti: Ontology Design Patterns. Handbook on
Ontology Design Patterns
Ontology patterns derive from
foundational theories
Agile methodology for
ontology design
Pattern representation
language within ontologies
A language for ontology engineers
Trajectory
AgentRole
RecurrentEvent
PartOf
Sequence
Observation
TimeIndexedParticipation TimeInterval
ODP portal http://www.ontologydesignpatterns.org/
Ontology Design Pattern
ODPs from DUL + DnS Ultra Lite
DOLCE+D&S and its main ontology design patterns: Valentina Presutti and Aldo Gangemi. Ontology Engineering with ontology design patterns. Pages 81-103. IOS Press (2016)
Experimenting with ODPs: usability with vs. without
45 participants
Eva Blomqvist, Aldo Gangemi, Valentina Presutti: Experiments on pattern-based ontology design. K-
Experimenting with ODPs and XD
Usability: without vs. with
[Figure: two bar charts showing the share of participant answers on a scale from "Strongly disagree" to "Strongly agree", plus "Not applicable".
Panel 1: "The XD methodology helped me to organize my work while modelling."
Panel 2: "I already organized my work in a way similar to XD in the previous exercises..."]
                       ODP    ODP + XD
Terminology coverage   79%    83%
Task coverage          69%    81%
Disjoint axioms        37%    52%
35 participants
Eva Blomqvist, Valentina Presutti, Enrico Daga, Aldo Gangemi: Experimenting with eXtreme Design. EKAW 2010: 120-
Experiments on ontology learning: with vs.
without
• ODP-based ontology learning improves results
• Ontologies are better in terms of cohesion, consistency,
functional quality, etc.
• Experiments with OntoCase applied to Text2Onto ontology
learning
Eva Blomqvist: OntoCase-Automatic Ontology Enrichment Based on Ontology Design Patterns. International Semantic Web
Paulheim, H. and Gangemi, A. Serving DBpedia with DOLCE – More than Just Adding a Cherry on Top. Proceedings of ISWC2015, the Thirteenth International Semantic Web
Conference, LNCS, Springer, 2015
eXtreme Design (XD)
Building ArCo knowledge graph with ontology design patterns
Eva Blomqvist, Karl Hammar, Valentina Presutti: Engineering Ontologies with Patterns - The eXtreme Design Methodology.
Ontology Engineering with Ontology Design Patterns. Pages 23-50. IOS Press (2016)
Eva Blomqvist, Valentina Presutti, Enrico Daga, Aldo Gangemi: Experimenting with eXtreme Design. EKAW 2010: 120-134
eXtreme Design applied to ArCo
xd4arco: requirements and feedback loop
User stories
Continuous feedback
New emerging requirements and errors
arco-project@googlegroups.com
Methods and tools to
collect requirements
from users with
heterogeneous expertise
Look for the story of an artwork, which was
confiscated from organised crime to appreciate
the value of social return through this type of
confiscation
• What requests should this application respond to?
Visualise the chronology of an artwork that was
confiscated from organised crime
User stories
Goal
High level
requirements
To enable identifying cataloguing activity of
diverse organisations in specific location
areas
• What requests should this application respond to?
How many catalogue records describing cultural
properties in a certain region have been produced? How
many of them have been filled by Heritage Protection
Agencies? How many by other organisations
(universities, regions, etc.)?
User stories
Goal
High level
requirements
• Data about residential estates
• Data about cultural heritage
• Data about touristic services
• Archeological data
• Archival data
• CH data
• Touristic services and touristic-cultural itineraries
• Accessibility
• Consultation of CH data
From stories to competency questions and constraints
• What are the geographical
coordinates of cultural property X?
• What cultural events involved cultural
property X?
• What is the conservation status of
cultural property X? And what
interventions have been proposed for
it?
• When was cultural property X
realised? What is its history? And
why?
• Who are the attributed authors of
cultural property X?
• …
• A cultural property can be associated
with different types of locations, each
possibly having a temporal validity
• Tangible and intangible cultural
properties are disjoint.
• Tangible cultural properties can be
either movable or immovable, not
both.
• …
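The two disjointness constraints above can be sketched as a small check over type assertions. This is only an illustration on toy data: the entity identifiers, the class names and the `violations` helper are assumptions, not ArCo terms, and the check covers disjointness only (it does not require a tangible property to be one of movable or immovable).

```python
# Toy type assertions; identifiers and class names are illustrative only.
TYPES = {
    "property-1": {"TangibleCulturalProperty", "MovableCulturalProperty"},
    "property-2": {"IntangibleCulturalProperty"},
    "property-3": {"TangibleCulturalProperty", "IntangibleCulturalProperty"},
    "property-4": {
        "TangibleCulturalProperty",
        "MovableCulturalProperty",
        "ImmovableCulturalProperty",
    },
}

# Pairs of classes that must not share an instance.
DISJOINT_PAIRS = [
    ("TangibleCulturalProperty", "IntangibleCulturalProperty"),
    ("MovableCulturalProperty", "ImmovableCulturalProperty"),
]


def violations(types, disjoint_pairs):
    """Return (entity, class_a, class_b) for each entity typed with both classes of a pair."""
    found = []
    for entity, classes in sorted(types.items()):
        for class_a, class_b in disjoint_pairs:
            if class_a in classes and class_b in classes:
                found.append((entity, class_a, class_b))
    return found


report = violations(TYPES, DISJOINT_PAIRS)
for entity, class_a, class_b in report:
    print(f"{entity}: typed as both {class_a} and {class_b}")
```

In an OWL setting the same constraints would be stated as disjointness axioms and checked by a reasoner; the loop above only makes the intended behaviour explicit.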
testing
team
Lessons learned
• Depending on the domain under analysis you may need requirements
from beyond domain experts
• Social aspects
• Terminology
• Administrative or even political constraints
• Diverse means to collect requirements
• Different tools for different elicitators
• Bias towards cataloguing standards
xd4arco: ontology design
Module design
Ontology Design Patterns
Shortcut binary relations along with N-ary relations
Multiple languages (ita + eng)
Detailed documentation
(comments, usage examples, diagrams)
Design principles
Direct and indirect reuse
Direct Reuse
• Delegating the conceptualisation of
predicates and axioms to external ontologies
• e.g. dul:Event as type of individuals in
my ontology
• When?
• you want, or have, to comply with an
ontology
• Effects
• Changes in reused external ontology
impact the semantics of your ontology
• Less design effort
Indirect reuse
• Defining predicates and axioms in your
ontology and aligning them to external
ontologies
• e.g. myont:CulturalEvent
rdfs:subClassOf dolce:Event
• When?
• you want to be interoperable but avoid
dependency on external resources
• Effects
• If external changes affect the
semantics of your ontology, you can
accept them or remove the alignments
• More design effort
Valentina Presutti, Giorgia Lodi, Andrea Giovanni Nuzzolese, Aldo Gangemi, Silvio Peroni, Luigi
Asprino: The Role of Ontology Design Patterns in Linked Data Projects. ER 2016: 113-121
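The difference between the two reuse styles can be sketched with triples written as plain tuples. The class names come from the examples on the slide; the individual `ex:exhibition-1` and the helper functions are assumptions added for illustration.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
EXTERNAL = ("dul:", "dolce:")  # prefixes treated as external ontologies

# Direct reuse: individuals are typed with the external class itself.
direct = {
    ("ex:exhibition-1", RDF_TYPE, "dul:Event"),
}

# Indirect reuse: a local class is defined and then aligned to the external one.
indirect = {
    ("myont:CulturalEvent", SUBCLASS_OF, "dolce:Event"),
    ("ex:exhibition-1", RDF_TYPE, "myont:CulturalEvent"),
}


def external_terms(triples, prefixes):
    """Collect every term in the triples that belongs to an external namespace."""
    return {term for triple in triples for term in triple if term.startswith(prefixes)}


def drop_alignments(triples, prefixes):
    """Remove subclass alignments that point at an external namespace."""
    return {
        (s, p, o)
        for (s, p, o) in triples
        if not (p == SUBCLASS_OF and o.startswith(prefixes))
    }


detached = drop_alignments(indirect, EXTERNAL)
print(external_terms(direct, EXTERNAL))    # the data itself depends on dul:Event
print(external_terms(detached, EXTERNAL))  # no external dependency is left
```

With indirect reuse the alignment is a single removable axiom, while with direct reuse the external term is spread across the data.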
ArCo: direct reuse
• OntoPiA: ontology network for Italian PA data
https://w3id.org/italia/onto/FULL
https://w3id.org/arco/ontology/ArCo ontology network
Modularisation is
driven by the type of
data available in the
catalogue
Module names are
based on terminology
used by domain
experts
OPLa: annotating ODPs
Pascal Hitzler, Aldo Gangemi, Krzysztof Janowicz, Adila Alfa Krisnadhi, Valentina Presutti:
Towards a Simple but Useful Ontology Design Pattern Representation Language. WOP@ISWC 2017
Modeling the catalogue
Modelling issues
• Our main source is a catalogue, which is about cultural properties
• A catalogue record describes a cultural property and includes
information about its owner or administrator, as well as other
administrative roles
• A change in the cultural property or in the information available
about it may cause a new version of its associated catalogue record
• ArCo wants to model both the catalogue and the entities it is about
ArCo: Catalogue Records
Open challenges and research questions
• Investigating the dynamics between catalogue record changes, the
cataloguing process, and the evolution of the cultural property over
time
• Knowledge graphs such as ArCo may be empirically studied to this aim
• Persistence of physical objects (e.g. cultural properties) vs fluency of
information objects
• cidoc:Spacetime_Volume subsumes (union?) cidoc:Presence,
cidoc:Physical_Thing, and cidoc:Period
• cidoc:Place equivalent to dul:space-region (an abstract place then?)
• cidoc:Actor is subsumed by cidoc:Persistent_Item (not a fluent?)
Modeling the CH domain: some
examples
The Current Taxonomy of Cultural
Properties
9 categories of Cultural Properties
generalising over 30 more specific types:
e.g. musical, natural, numismatic,
scientific and technological
2 main orthogonal distinctions:
immovable vs. movable
tangible vs intangible
Location of a cultural property
Modelling issues
• A cultural property may be associated with different types of locations
• A cultural property’s location has a temporal validity
CIDOC: Location
EDM: Place and current location
ArCo: time-indexed typed location
ArCo: cadastral identity (location type)
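The idea of a time-indexed typed location can be sketched as a record that ties a place to a location type and a validity interval. The class name, field names, location types and sample data below are hypothetical and do not reproduce ArCo's vocabulary.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TimeIndexedTypedLocation:
    place: str
    location_type: str
    start: Optional[date] = None  # None means the start is unknown or open
    end: Optional[date] = None    # None means the location is still valid

    def holds_on(self, day: date) -> bool:
        """True if the location is valid on the given day."""
        after_start = self.start is None or self.start <= day
        before_end = self.end is None or day <= self.end
        return after_start and before_end


def locations_on(locations, day, location_type=None):
    """Locations valid on a day, optionally restricted to one location type."""
    return [
        loc
        for loc in locations
        if loc.holds_on(day)
        and (location_type is None or loc.location_type == location_type)
    ]


history = [
    TimeIndexedTypedLocation("Workshop C", "production location", date(1598, 1, 1), date(1599, 12, 31)),
    TimeIndexedTypedLocation("Church A", "previous location", date(1600, 1, 1), date(1899, 12, 31)),
    TimeIndexedTypedLocation("Museum B", "current location", date(1900, 1, 1)),
]

current = locations_on(history, date(2019, 1, 1))
print([loc.place for loc in current])
```

Keeping the type and the interval on the location record, rather than on the cultural property, lets one property carry several locations of different kinds without overwriting earlier ones.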
(Cultural) events
Modelling issues
• Cultural properties can be involved in exhibitions, or other types of
cultural events
• Cultural events always involve one or more cultural properties
• There are events that belong to series having a more-or-less regular
frequency
ArCo: Event and Recurrent Event Series
http://ontologydesignpatterns.org/wiki/Submissions:RecurrentEventSeries
CIDOC: event
EDM: Event
Lessons learned
To favor and facilitate reuse:
• Favoring local constraints
• e.g. constraints using general predicates
• Annotating patterns is tedious but precious
• Call for tools and incentives!
• Identifying potentially relevant ontologies: how to make it easier?
• F.A.I.R. + tools
xd4arco: data production
• Findable:
• ArCo has permanent URIs (w3id) to identify its entities
• ArCo knowledge graph has its DOI (10.5281/zenodo.2630447)
• ArCo is indexed on Linked Open Vocabularies
• Accessible:
• Use of open standard protocols and query language (HTTP(S) and SPARQL)
• Interoperable:
• Through RDF/OWL, ontology patterns and ontology reuse
• Reusable:
• We release ArCo under CC BY-SA 4.0 license
F.A.I.R.
RDFizer: converting ICCD-XML data to ArCo-RDF
Data production
Open source on Github
1. Identifying a possible key in the XML
source
2. Removing possible URI-illegal characters
and converting the string to lower case
3. Sorting in alphabetical order
4. Computing the MD5 checksum
ID generation
<AUTN>Friscia Albert</AUTN>
friscia-albert
albert-friscia
dcd4ca7b54dd3d7dac083dd4c54a9eef
https://w3id.org/arco/resource/Agent/dcd4ca7b54dd3d7dac083dd4c54a9eef
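The four steps above can be sketched as follows. The slide does not spell out the exact normalisation rules, so the tokenisation, the hyphen separator and the function names are assumptions; whether the digest matches the one shown on the slide depends on those details.

```python
import hashlib
import re

BASE = "https://w3id.org/arco/resource/"


def normalise_key(raw: str) -> str:
    """Lower-case the key, drop non-word characters, and sort the tokens."""
    # Assumption: any run of non-word characters separates two tokens.
    tokens = [t for t in re.split(r"[\W_]+", raw.lower()) if t]
    return "-".join(sorted(tokens))


def mint_uri(raw: str, entity_type: str) -> str:
    """Build a resource URI from the MD5 checksum of the normalised key."""
    digest = hashlib.md5(normalise_key(raw).encode("utf-8")).hexdigest()
    return f"{BASE}{entity_type}/{digest}"


print(normalise_key("Friscia Albert"))
print(mint_uri("Friscia Albert", "Agent"))
```

Sorting the tokens makes the identifier independent of word order, so "Friscia Albert" and "Albert Friscia" resolve to the same URI.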
From strings to entities
Cleaning and enriching ArCo knowledge graph
• Deduplication: two different entities are generated for the same one
• “Andrea d'Agnolo” and “Andrea d'Agnolo detto del Sarto”
• Entity linking applied on ArCo against itself
• Disambiguation: same entity generated for two or more different
ones
• Identification of entity fingerprints: e.g. active period of an author and types
of artworks she’s associated with
• Entity linking
• 28.838 owl:sameAs linking to 20.479 distinct entities in other datasets
• Mainly authors (8.884) and locations (9.862)
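As a rough illustration of the deduplication problem, the sketch below flags two names as candidate duplicates when the tokens of one are contained in the other. This is a naive heuristic chosen for the example, not the entity-linking method used for ArCo, and the function names are hypothetical.

```python
import re
from itertools import combinations


def name_tokens(name: str) -> frozenset:
    """Lower-cased word tokens of a name."""
    return frozenset(t for t in re.split(r"[\W_]+", name.lower()) if t)


def duplicate_candidates(names):
    """Pairs of names where one token set is contained in the other."""
    pairs = []
    for first, second in combinations(names, 2):
        a, b = name_tokens(first), name_tokens(second)
        if a and b and (a <= b or b <= a):
            pairs.append((first, second))
    return pairs


names = [
    "Andrea d'Agnolo",
    "Andrea d'Agnolo detto del Sarto",
    "Friscia Albert",
]
candidates = duplicate_candidates(names)
print(candidates)
```

Token containment alone would also pair unrelated people who share a short name, which is where fingerprints such as an author's active period become necessary.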
xd4arco: testing
Testing methodology
CQ verification
Error Provocation
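CQ verification can be pictured as a unit test: a competency question is turned into a query, run against sample data, and compared with the expected answer. The sketch below uses toy triples and plain Python in place of SPARQL; the predicate `ex:hasAttributedAuthor` and the function names are hypothetical stand-ins, not ArCo terms.

```python
# Toy data for the CQ "Who are the attributed authors of cultural property X?"
TRIPLES = {
    ("ex:painting-1", "ex:hasAttributedAuthor", "ex:author-a"),
    ("ex:painting-1", "ex:hasAttributedAuthor", "ex:author-b"),
    ("ex:painting-2", "ex:hasAttributedAuthor", "ex:author-a"),
}


def attributed_authors(triples, cultural_property):
    """Answer the CQ for one cultural property, in sorted order."""
    return sorted(
        o
        for s, p, o in triples
        if s == cultural_property and p == "ex:hasAttributedAuthor"
    )


def verify_cq(answer, expected):
    """A CQ test passes when the query result matches the expected result."""
    return answer == sorted(expected)


passed = verify_cq(
    attributed_authors(TRIPLES, "ex:painting-1"),
    ["ex:author-b", "ex:author-a"],
)
print("CQ verified" if passed else "CQ failed")
```

Error provocation works in the opposite direction: data that should violate a constraint is injected on purpose, and the test passes only if the violation is detected.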
Lessons learned and open challenges
• Testing is useful for early error discovery and for detecting new
modeling issues
• Huge manual effort
• TESTaLOD: Prototype for automatic regression tests
• Although the testing methodology supports a systematic approach,
some aspects can be easily overlooked:
• error provocation and inference verification are mainly driven by designers
http://testalod.herokuapp.com/
Conclusion
• Towards a knowledge graph of Italian Cultural Heritage
• Evolving content, enabling diverse usage: from business applications
to science discovery
• Call for tools for facilitating reuse and testing
• Ontology patterns annotation, ontology discovery, automatic testing, etc.
• Open questions and material for reflections
• how to make different modeling approaches co-exist and be compatible
• what is the best way to handle evolution
• how to capture requirements from diverse “types of experts”
Thanks!

Animal Communication- Auditory and Visual.pptx
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 

ArCo: the Knowledge Graph of Italian Cultural Heritage

  • 1. ArCo: the Knowledge Graph of Italian Cultural Heritage Valentina Presutti University of Bologna STLab, ISTC, National Research Council, Italy ArCo team also includes Valentina Carriero (ISTC-CNR), Andrea Giovanni Nuzzolese (ISTC-CNR) and Aldo Gangemi (UniBo) https://w3id.org/arco These slides can be reused as they are according to the CC BY 4.0 license. Please cite the author and link to the original.
  • 3. ArCo’s ambitious goal is to build the knowledge graph of Italian Cultural Heritage Valentina Anita Carriero, Aldo Gangemi, Maria Letizia Mancinelli, Ludovica Marinucci, Andrea Giovanni Nuzzolese, Valentina Presutti and Chiara Veninata: ArCo: the Italian Cultural Heritage Knowledge Graph. In Proceedings of ISWC 2019 (To appear) Preprint at: https://arxiv.org/abs/1905.02840
  • 4. ArCo KG in numbers ONTOLOGY NETWORK • 7 modules • 5058 axioms • 1049 predicates DATA • 169.151.644 triples • 28.838 owl:sameAs linking to 20.479 distinct entities in other datasets
  • 5. How to use ArCo https://w3id.org/arco CC BY-SA 4.0 license
  • 6. USER GUIDES for supporting users in understanding the content of each release, with Graffoo diagrams and narrative explanations of every ontology module https://essepuntato.it/graffoo/
  • 7. ONTOLOGIES, including their source code and a human- readable HTML documentation created with LODE https://essepuntato.it/lode/
  • 8. A SPARQL endpoint storing ArCo KG, which is also downloadable as a compressed dump
  • 9. Examples of Competency Questions (CQs) that ArCo KG can answer, with their corresponding SPARQL queries. This helps users to have a quick understanding of what is in ArCo ontologies and data, and how to use it. e.g.:
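As an illustration of how a competency question from slide 9 might be paired with a SPARQL query, here is a minimal sketch. The endpoint URL is a placeholder, and the `arco:` prefix and class name are assumptions made for the example, not taken from the slides; the query is only encoded as an HTTP GET request following the SPARQL 1.1 Protocol and is not sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumption: placeholder endpoint; substitute the real ArCo SPARQL endpoint.
ENDPOINT = "https://example.org/sparql"

# CQ: "How many cultural properties does the knowledge graph describe?"
# Assumption: the arco: namespace and class name below are illustrative.
QUERY = """
PREFIX arco: <https://w3id.org/arco/ontology/arco/>
SELECT (COUNT(DISTINCT ?cp) AS ?total)
WHERE { ?cp a arco:CulturalProperty . }
"""


def build_request_url(endpoint: str, query: str) -> str:
    """Encode a SPARQL query as an HTTP GET URL (the 'query' parameter)."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
```

The resulting URL can be fetched with any HTTP client; asking for `application/sparql-results+json` in the `Accept` header is the usual way to obtain JSON results.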
  • 12. Why ArCo? • Regulations (open data) • Fostering reuse by third party • Improving PA organisational data management • Modeling Cultural Heritage knowledge vs metadata • Cataloguing • Providing data to scholars and researchers • Connecting to other relevant knowledge bases
  • 13. The general catalogue of Italian Cultural Heritage …among ArCo’s main data and conceptual sources
  • 14. General Catalogue of Italian Cultural Heritage: the official institutional database of Italian CH, maintained and published by ICCD (Institute of the General Catalogue and Documentation); about 800.000 (out of 2.735.343) catalogue records are publicly available. SIGECweb: the collaborative platform to which formally authorised institutions can submit their catalogue records, which undergo a validation phase. ICCD cataloguing standards: 30 types of cultural properties. ~15M catalogue record numbers released.
  • 15. Collecting and validating catalogue records
  • 16. ICCD catalogue standards: documentation
  • 18. Challenge From strings to domain entities
  • 19. Different versions of catalogue standards
  • 20. Building ArCo knowledge graph with ontology design patterns
  • 21. Ontology Design Patterns An ontology design pattern is a reusable successful solution to a recurrent modeling problem Aldo Gangemi, Valentina Presutti: Ontology Design Patterns. Handbook on
  • 22. Ontology Design Patterns Ontology patterns derive from foundational theories Agile methodology for ontology design Pattern representation language within ontologies
  • 23. A language for ontology engineers Trajectory AgentRole RecurrentEvent PartOf Sequence Observation TimeIndexedParticipation TimeInterval
  • 26. ODPs from DUL + DnS Ultra Lite DOLCE+D&S and its main ontology design patterns: Valentina Presutti and Aldo Gangemi. Ontology Engineering with ontology design patterns. Pages 81-103. IOS Press (2016)
  • 27. Experimenting with ODPs usability: with vs. without. 45 participants. Eva Blomqvist, Aldo Gangemi, Valentina Presutti: Experiments on pattern-based ontology design. K-
  • 28. Experimenting with ODPs and XD. Usability: without vs. with. [Bar charts: share of answers, from "Strongly disagree" to "Strongly agree" plus "Not applicable", to the statements "The XD methodology helped me to organize my work while modelling." and "I already organized my work in a way similar to XD in the previous exercises..."] Results, ODP vs. ODP + XD: terminology coverage 79% vs. 83%; task coverage 69% vs. 81%; disjoint axioms 37% vs. 52%. 35 participants. Eva Blomqvist, Valentina Presutti, Enrico Daga, Aldo Gangemi: Experimenting with eXtreme Design. EKAW 2010: 120-
  • 29. Experiments on ontology learning: with vs. without • ODP-based ontology learning improves results • Ontologies are better in terms of cohesion, consistency, functional quality, etc. • Experiments with OntoCase applied to Text2Onto ontology learning Eva Blomqvist: OntoCase-Automatic Ontology Enrichment Based on Ontology Design Patterns. International Semantic Web
  • 30. Paulheim, H. and Gangemi, A. Serving DBpedia with DOLCE – More than Just Adding a Cherry on Top. Proceedings of ISWC2015, the Thirteenth International Semantic Web Conference, LNCS, Springer, 2015
  • 31. Paulheim, H. and Gangemi, A. Serving DBpedia with DOLCE – More than Just Adding a Cherry on Top. Proceedings of ISWC2015, the Thirteenth International Semantic Web Conference, LNCS, Springer, 2015
  • 32. eXtreme Design (XD) Building ArCo knowledge graph with ontology design patterns Eva Blomqvist, Karl Hammar, Valentina Presutti:Engineering Ontologies with Patterns - The eXtreme Design Methodology. Ontology Engineering with Ontology Design Patterns. Pages 23-50. IOS Press (2016) Eva Blomqvist, Valentina Presutti, Enrico Daga, Aldo Gangemi: Experimenting with eXtreme Design. EKAW 2010: 120-134
  • 34. xd4arco: requirements and feedback loop
  • 35. User stories Continuous feedback New emerging requirements and errors arco-project@googlegroups.com Methods and tools to collect requirements from users with heterogenous expertise
  • 36. User stories, goal, high-level requirements: Look for the story of an artwork that was confiscated from organised crime, to appreciate the value of social return through this type of confiscation. → What requests should this application reply to? Visualise the chronology of an artwork that was confiscated from organised crime.
  • 37. User stories, goal, high-level requirements: To enable identifying the cataloguing activity of diverse organisations in specific location areas. → What requests should this application reply to? How many catalogue records describing cultural properties in a certain region have been produced? How many of them have been filled in by Heritage Protection Agencies? How many by other organisations (universities, regions, etc.)?
  • 38. • Data about residential estates • Data about cultural heritage • Data about touristic services • Archeological data • Archival data • CH data • Touristic services and touristic-cultural itineraries • Accessibility • Consultation of CH data
  • 39. From stories to competency questions and constraints (testing team). Competency questions: • What are the geographical coordinates of cultural property X? • What cultural events involved cultural property X? • What is the conservation status of cultural property X? And what interventions have been proposed for it? • When was cultural property X realised? And what is its history? And why? • Who are the attributed authors of cultural property X? • … Constraints: • A cultural property can be associated with different types of locations, each possibly having a temporal validity. • Tangible and intangible cultural properties are disjoint. • Tangible cultural properties can be either movable or immovable, not both. • …
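The two disjointness constraints on slide 39 lend themselves to a simple automated check. The sketch below is only an illustration over toy data: the class names and entity identifiers are hypothetical, and a real test would run against the knowledge graph (for instance with a reasoner or a SPARQL query) rather than a Python dictionary.

```python
# Hypothetical class names standing in for the ontology's disjoint classes.
DISJOINT_PAIRS = [
    ("TangibleCulturalProperty", "IntangibleCulturalProperty"),
    ("MovableCulturalProperty", "ImmovableCulturalProperty"),
]


def disjointness_violations(types_by_entity):
    """Return (entity, class_a, class_b) for each entity typed with two disjoint classes."""
    violations = []
    for entity, types in sorted(types_by_entity.items()):
        for class_a, class_b in DISJOINT_PAIRS:
            if class_a in types and class_b in types:
                violations.append((entity, class_a, class_b))
    return violations


# Toy data: "church-7" is wrongly typed as both movable and immovable.
sample = {
    "painting-1": {"TangibleCulturalProperty", "MovableCulturalProperty"},
    "church-7": {
        "TangibleCulturalProperty",
        "MovableCulturalProperty",
        "ImmovableCulturalProperty",
    },
}
```

Calling `disjointness_violations(sample)` reports only the `church-7` entry, which is the kind of error a testing team would want surfaced early.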
  • 40. Lessons learned • Depending on the domain under analysis you may need requirements from beyond domain experts • Social aspects • Terminology • Administrative or even political constraints • Diverse means to collect requirements • Different tools for different elicitators • Bias towards cataloguing standards
  • 43. Ontology Design Patterns Shortcut binary relations along with N-ary relations Multiple languages (ita + eng) Detailed documentation (comments, usage examples, diagrams) Design principles Direct and indirect reuse
  • 44. Direct and Indirect reuse Direct Reuse • Delegating the conceptualisation of predicates and axioms to external ontologies • e.g. dul:Event as type of individuals in my ontology • When? • you want, or have, to comply with an ontology • Effects • Changes in the reused external ontology impact the semantics of your ontology • Less design effort Indirect reuse • Defining predicates and axioms in your ontology and aligning them to external ontologies • e.g. myont:CulturalEvent rdfs:subClassOf dolce:Event • When? • you want to be interoperable but avoid dependency on external resources • Effects • If external changes impact the semantics of your ontology, you may accept them or remove the alignments • More design effort. Valentina Presutti, Giorgia Lodi, Andrea Giovanni Nuzzolese, Aldo Gangemi, Silvio Peroni, Luigi Asprino: The Role of Ontology Design Patterns in Linked Data Projects. ER 2016: 113-121
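The difference between the two reuse styles on slide 44 can be made concrete with a handful of triples. This is a toy sketch: triples are plain Python tuples with prefixed names, `ex:exhibition-1` is a hypothetical individual, and `dul:Event` is used as the external class in both cases for simplicity (the slide's indirect example writes `dolce:Event`).

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Direct reuse: the individual is typed with the external class itself.
direct = {("ex:exhibition-1", RDF_TYPE, "dul:Event")}

# Indirect reuse: a local class is defined, then aligned to the external one.
indirect = {
    ("ex:exhibition-1", RDF_TYPE, "myont:CulturalEvent"),
    ("myont:CulturalEvent", SUBCLASS_OF, "dul:Event"),
}


def drop_alignments(triples, external_prefix):
    """Remove subclass alignments that point at an external namespace."""
    return {
        (s, p, o)
        for (s, p, o) in triples
        if not (p == SUBCLASS_OF and o.startswith(external_prefix))
    }


def depends_on(triples, external_prefix):
    """True if any triple still mentions the external namespace."""
    return any(term.startswith(external_prefix) for t in triples for term in t)
```

With indirect reuse, `drop_alignments(indirect, "dul:")` leaves the data typed with the local class and free of the external dependency, whereas the directly reused data cannot be decoupled without retyping every individual.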
  • 45. ArCo: direct reuse • OntoPiA: ontology network for Italian PA data https://w3id.org/italia/onto/FULL
  • 46. https://w3id.org/arco/ontology/ArCo ontology network Modularisation is driven by the type of data available in the catalogue Module names are based on terminology used by domain experts
  • 47. OPLa: annotating ODPs Pascal Hitzler, Aldo Gangemi, Krzysztof Janowicz, Adila Alfa Krisnadhi, Valentina Presutti: Towards a Simple but Useful Ontology Design Pattern Representation Language. WOP@ISWC 2017
  • 49. Modelling issues • Our main source is a catalogue, which is about cultural properties • A catalogue record describes a cultural property and includes information about its owner or administrator, as well as other administrative roles • A change in the cultural property or in the information available about it may cause a new version of its associated catalogue record • ArCo wants to model both the catalogue and the entities it is about
  • 51. Open challenges and research questions • Investigating the dynamics between catalogue record changes, the cataloguing process, and the evolution of the cultural property over time • Knowledge graphs such as ArCo may be empirically studied to this aim • Persistence of physical objects (e.g. cultural properties) vs fluency of information objects • cidoc:Spacetime_Volume subsumes (union?) cidoc:Presence, cidoc:Physical_Thing, and cidoc:Period • cidoc:Place equivalent to dul:space-region (an abstract place then?) • cidoc:Actor is subsumed by cidoc:Persistent_Item (not a fluent?)
  • 52. Modeling the CH domain: some examples
  • 53. The Current Taxonomy of Cultural Properties
  • 54. 9 categories of Cultural Properties generalising over 30 more specific types: e.g. musical, natural, numismatic, scientific and technological 2 main orthogonal distinctions: immovable vs. movable tangible vs intangible
  • 56. Location of a cultural property
  • 57. Modelling issues • A cultural property may be associated with different types of locations • A cultural property’s location has a temporal validity
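The two modelling issues on slide 57 (several location types, each with a temporal validity) suggest a time-indexed, typed location. The following is a minimal data-structure sketch, not ArCo's actual model: the class name, the location type labels, and the sample places and dates are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TimeIndexedLocation:
    place: str
    location_type: str           # hypothetical labels, e.g. "current", "finding"
    start: date
    end: Optional[date] = None   # None means the location is still valid

    def valid_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


def locations_on(locations, day, location_type=None):
    """Locations valid on a given day, optionally filtered by location type."""
    return [
        loc
        for loc in locations
        if loc.valid_on(day)
        and (location_type is None or loc.location_type == location_type)
    ]


# Invented history of one cultural property.
history = [
    TimeIndexedLocation("Museum A", "current", date(1950, 1, 1), date(1999, 12, 31)),
    TimeIndexedLocation("Museum B", "current", date(2000, 1, 1)),
    TimeIndexedLocation("Excavation site C", "finding", date(1931, 5, 1), date(1931, 5, 1)),
]
```

Keeping the type and the validity interval on the location itself, rather than on the property, is what allows the same property to carry several locations at once without contradiction.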
  • 59. EDM: Place and current location
  • 61. ArCo: cadastral identity (location type)
  • 63. Modelling issues • Cultural properties can be involved in exhibitions, or other types of cultural events • Cultural events always involve one or more cultural properties • There are events that belong to series having a more-or-less regular frequency
  • 64. ArCo: Event and Recurrent Event Series http://ontologydesignpatterns.org/wiki/Submissions:RecurrentEventSeries
  • 68. Lessons learned To favor and facilitate reuse: • Favoring local constraints • e.g. constraints using general predicates • Annotating patterns is tedious but precious • Call for tools and incentives! • Identifying potentially relevant ontologies: how to make it easier? • F.A.I.R. + tools
  • 70. • Findable: • ArCo has permanent URIs (w3id) to identify its entities • ArCo knowledge graph has its DOI (10.5281/zenodo.2630447) • ArCo is indexed on Linked Open Vocabularies • Accessible: • Use of open standard protocols and query language (HTTP(S) and SPARQL) • Interoperable: • Through RDF/OWL, ontology patterns and ontology reuse • Reusable: • We release ArCo under CC BY-SA 4.0 license F.A.I.R.
  • 71. RDFizer: converting ICCD-XML data to ArCo-RDF. Data production. Open source on GitHub
  • 72. From strings to entities. ID generation: 1. Identifying a possible key in the XML source 2. Removing possible URI-illegal characters and converting the string to lower case 3. Sorting in alphabetical order 4. Computing the MD5 checksum. Example: <AUTN>Friscia Albert</AUTN> → friscia-albert → albert-friscia → dcd4ca7b54dd3d7dac083dd4c54a9eef → https://w3id.org/arco/resource/Agent/dcd4ca7b54dd3d7dac083dd4c54a9eef
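The four ID generation steps on slide 72 can be sketched in a few lines. This is an approximation under stated assumptions, not the RDFizer's code: the rule for "URI-illegal characters" (here, any run of characters outside a-z and 0-9 becomes a hyphen, which also drops accented letters) and the UTF-8 encoding fed to MD5 are guesses, so the digest produced here need not match the one shown on the slide.

```python
import hashlib
import re

BASE = "https://w3id.org/arco/resource/"


def normalise_key(raw: str) -> str:
    """Steps 2-3: lower-case, replace illegal runs with '-', sort tokens alphabetically."""
    cleaned = re.sub(r"[^a-z0-9]+", "-", raw.lower()).strip("-")
    return "-".join(sorted(cleaned.split("-")))


def mint_uri(entity_type: str, raw_key: str) -> str:
    """Step 4: hash the normalised key with MD5 and append it to the base URI."""
    digest = hashlib.md5(normalise_key(raw_key).encode("utf-8")).hexdigest()
    return f"{BASE}{entity_type}/{digest}"


# "Friscia Albert" and "Albert Friscia" normalise to the same key,
# so both strings are mapped to the same entity URI.
uri = mint_uri("Agent", "Friscia Albert")
```

Sorting the tokens before hashing is what makes the identifier insensitive to name order, which is why differently ordered spellings of the same author collapse onto one entity.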
  • 73. Cleaning and enriching ArCo knowledge graph • Deduplication: two different entities are generated for the same one • “Andrea d'Agnolo” and “Andrea d'Agnolo detto del Sarto” • Entity linking applied on ArCo against itself • Disambiguation: same entity generated for two or more different ones • Identification of entity fingerprints: e.g. active period of an author and types of artworks she’s associated with • Entity linking • 28.838 owl:sameAs linking to 20.479 distinct entities in other datasets • Mainly authors (8.884) and locations (9.862)
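To make the deduplication case on slide 73 concrete, here is a deliberately naive heuristic that flags two labels as candidate duplicates when the tokens of one are contained in the other. It is an assumption made for illustration, not the entity linking method ArCo uses, and any candidate it produces would still need the fingerprint-based disambiguation described on the slide. "Andrea Mantegna" is added only as a contrasting label.

```python
import re
from itertools import combinations


def tokens(label):
    """Lower-cased alphabetic tokens of a label."""
    return frozenset(re.findall(r"[a-z]+", label.lower()))


def candidate_duplicates(labels):
    """Pairs of labels where one token set is contained in the other."""
    pairs = []
    for a, b in combinations(labels, 2):
        ta, tb = tokens(a), tokens(b)
        if ta and tb and (ta <= tb or tb <= ta):
            pairs.append((a, b))
    return pairs


labels = [
    "Andrea d'Agnolo",
    "Andrea d'Agnolo detto del Sarto",
    "Andrea Mantegna",
]
```

On these labels, `candidate_duplicates(labels)` pairs the two "Andrea d'Agnolo" variants and leaves "Andrea Mantegna" alone; a shared first name is not enough to trigger a match.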
  • 78. Lessons learned and open challenges • Testing is useful for early error discovery and for detecting new modeling issues • Huge manual effort • TESTaLOD: Prototype for automatic regression tests • Although the testing methodology supports a systematic approach, some aspects can be easily overlooked: • error provocation and inference verification are mainly pulled by designers http://testalod.herokuapp.com/
  • 79. Conclusion • Towards a knowledge graph of Italian Cultural Heritage • Evolving content, enabling diverse usage: from business applications to science discovery • Call for tools for facilitating reuse and testing • Ontology patterns annotation, ontology discovery, automatic testing, etc. • Open questions and material for reflections • how to make different modeling approaches co-exist and be compatible • what is the best way to handle evolution • how to capture requirements from diverse “types of experts”

Editor's notes

  1. I want to share with you the experience of a project that my group has developed in collaboration with the Italian Ministry of Cultural Heritage. First I’ll tell you what we have so far and then I will tell you how we got there. This will give me the opportunity to share some lessons learned, as well as problems we have faced along the way. For some of them we identified a possible solution; some others are still open.
  2. The Ministry of Cultural Heritage and Activities (MiBAC), together with regions and local agencies, cooperatively catalogues the Italian CH they own. ICCD (the Institute of the Central Catalogue and Documentation) of MiBAC coordinates this activity by maintaining the “General Catalogue of Italian Cultural Heritage”. SIGECweb: the collaborative platform to which CH administrators submit their catalogue records.
  3. Only authorised CH administrators (public/private institutions/organisations) can submit records. Once a catalogue record is submitted, it goes through a validation process: format and compliance with ICCD standards, and scientific assessment. Let’s see what a catalogue record looks like:
  4. This is what the ICCD catalogue standards documentation looks like.
  5. The content of a catalogue record. Good side: The model is very rich: from metadata to restoration processes, measurements, location, associated theories (e.g. attribution), physical descriptions, etc. Compared to other data sources, SIGEC is based on ICCD standards, which are richly documented, and its conceptualisation is based on scientific competence. The values of some elements are based on controlled vocabularies. Although most elements’ values are text, many of them converge to uniform descriptions, which can help knowledge extraction. Bad side: Data are mainly textual descriptions. The content types mainly define their format in terms of XML datatypes (e.g. text) rather than their types. Labels of XML elements are in Italian → ArCo provides both English and Italian labels. The text values are in Italian: at the moment ArCo does not provide an English translation, but this is an ongoing process. Many records have only a few mandatory fields filled in.
  6. During the project, and by inspecting the catalogue records, we realised that in some cases they followed different schemas. In fact, the standards have evolved through different versions, which are often backward incompatible. The evolution of versions has limited documentation in terms of mapping. ICCD experts had to provide this additional information to allow us to develop the automatic conversion of catalogue records into a LOD knowledge graph.
  7. An ODP includes a vocabulary, an axiomatization, a set of requirements expressed in terms of competency questions that it addresses, examples of use, an optional implementation and possibly a source, e.g. a theory it is based on. It can be very specific but usually it addresses concepts that are general enough to be relevant for diverse domains.
  8. User stories are reformulated as Competency Questions and used for ODP selection by the design team, as well as in the testing phase by the testing team. The customer and the testing teams can contribute continuous and updated feedback, which allows the design team to detect new emerging requirements and errors early and schedule them for the next releases.
  9. ArCo is the root of the network: it imports all other modules and defines the main taxonomy of cultural property types. General-purpose concepts, reused by all ontologies. Catalogue: catalogue records linked to the CP they describe. Denotative: characteristics of a CP that are measurable according to a reference system: measurements (e.g. length), constituting materials (e.g. clay), employed techniques (e.g. melting), conservation status (e.g. good, decent, bad). Context: information about the CP that is not measurable but influences the knowledge of a CP or its ontological status: authors, collectors, copyright holders; relations to other objects such as inventories, bibliography, protective measures, collections; activities such as surveys, conservation interventions; involvement in situations, e.g. commission, coin issuance, estimate, legal proceedings. Location: spatial and geometrical information.
  10. A catalogue record is an entity that describes a cultural property. As it denotes a real-world object, it can be defined as an information object (a piece of information independent from how it’s realised). A catalogue record is a fluent entity: it changes as the description of the cultural property changes. A catalogue record update can be caused by an ontological change of the cultural property (e.g. its conservation status) or by an epistemological change (new discoveries). Every change corresponds to a new information object: a version of the catalogue record. The catalogue record, however, persists in describing the same real-world object independently from the different versions. A catalogue record is a persistent information object related to each of its versions, which are in turn information objects reflecting every change in the content.
  11. Why do we model cultural properties mostly focusing on their persistent nature, while we give importance to the temporal evolution of catalogue records, hence modeling them as fluents? More in general: why do we model certain objects as fluents and others by focusing on their persistent attributes? Maybe for catalogue records their changes characterise them more than their persistent attributes, while for cultural properties what interests us more are the aspects that make them recognisable as the same over time.
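The distinction in note 10 between a persistent catalogue record and its versions can be sketched as a small data structure. This is an illustrative reading of the note, not ArCo's ontology: the class and field names, the change labels, and the sample content are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class CatalogueRecordVersion:
    """One information object: the record's content after a given change."""
    number: int
    content: dict
    change_kind: str  # "initial", "ontological" or "epistemological"


@dataclass
class CatalogueRecord:
    """The persistent record: always describes the same cultural property."""
    describes: str  # identifier of the cultural property
    versions: List[CatalogueRecordVersion] = field(default_factory=list)

    def update(self, content: dict, change_kind: str) -> CatalogueRecordVersion:
        # Every change yields a new version; earlier versions are never modified.
        version = CatalogueRecordVersion(len(self.versions) + 1, dict(content), change_kind)
        self.versions.append(version)
        return version

    @property
    def current(self) -> CatalogueRecordVersion:
        return self.versions[-1]


record = CatalogueRecord("painting-1")
record.update({"conservation_status": "good"}, "initial")
record.update({"conservation_status": "bad"}, "ontological")
```

The record keeps pointing at `painting-1` throughout, while its history of versions captures what changed and whether the change was in the object or in what is known about it.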
  12. CIDOC models the location of a cultural property as dependent on moving events. It distinguishes, by means of relations between a CP and a place, three types of locations: current, current or former, permanent. It also expresses a concept of an object being a section of a place.
  13. Recurrent events: Frequency: a fuzzy concept. Organisation and processes linked to the series. Other events that are part of each event belonging to the series (see the Biennale and its events). Satellite events of recurrent events (e.g. a workshop at a conference).
  14. Event is conceptualised as a change of state in some system, a different concept than our cultural event. An Activity is an intentional action carried out by an actor that may result in a change of state, an event. This corresponds to ArCo’s Activity, which for example subsumes Interventions such as Restoration, etc.