Mainz Expert Workshop on Controlled Vocabularies 10/10/2013
1. Experts Workshop on Controlled Vocabularies
Mainz 10-11/10/2013
Giovanni Colavizza
Topic Introduction
Controlled vocabularies and humanities, a problematic relationship.
The functional categorization of historical place types and the problems it raises.
Giovanni Colavizza
Leibniz Institute of European History
Colavizza@ieg-mainz.de
The scenario
Controlled vocabulary: a selected list of terms that refer to concepts and are used for
categorization. The criteria for selecting concepts are usually domain-specific.
Focus for this talk: vocabularies of concepts, not proper names.
The term-concept relation is often left unspecified: it relies on the (intended?) use of
natural language, which is context- and interpretation-specific.
But there goes language independence!
Source: Dalia Varanka, A topographic feature taxonomy for a US national topographic mapping ontology, 2009.
The problem
Quantitative and computer-based methods scale up our responsibilities together
with our means.
The data and metadata loop (diagram with four stages): Retrieve, Extend, Share, Reuse.
Stricter requirements: classification systems must be shared, at least to some extent.
The shared part must be formally specified (machine-readable), and the term-concept bond has to become explicit.
New design requirements
• Allow for comparison beyond a single project (data integration)
• Interoperability and portability
• Scalability
• More accurate retrieval
• Automatic classification
• Named entity recognition
• Reasoning...
One possible solution: add a stricter knowledge model on top of
controlled vocabularies, and express it via ontologies: simplified specifications of
(shared!) conceptualizations.
Already possible! ISO 25964 (data model), SKOS (web format)
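To make the idea of an explicit term-concept bond concrete, here is a minimal sketch, not a reference implementation, of one concept serialized as SKOS in Turtle syntax. The `Concept` class, the `to_turtle` function, the `http://example.org/placetypes/` namespace and the example labels are all assumptions made for illustration; only the `skos:` properties themselves come from the SKOS vocabulary. Labels are assumed to contain no quotes or backslashes, since no escaping is done.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n\n"


@dataclass
class Concept:
    """A language-independent concept with natural-language labels attached."""
    concept_id: str                       # hypothetical local identifier
    pref_labels: Dict[str, str]           # language code -> preferred term
    alt_labels: Dict[str, List[str]] = field(default_factory=dict)
    broader: Optional[str] = None         # identifier of the parent concept


def to_turtle(concept: Concept, base: str = "http://example.org/placetypes/") -> str:
    """Serialize one concept as a SKOS description in Turtle."""
    lines = [f"<{base}{concept.concept_id}> a skos:Concept"]
    for lang, label in sorted(concept.pref_labels.items()):
        lines.append(f'    skos:prefLabel "{label}"@{lang}')
    for lang, labels in sorted(concept.alt_labels.items()):
        for label in labels:
            lines.append(f'    skos:altLabel "{label}"@{lang}')
    if concept.broader:
        lines.append(f"    skos:broader <{base}{concept.broader}>")
    # Predicates of the same subject are separated by ';' and closed by '.'
    return SKOS_PREFIX + " ;\n".join(lines) + " .\n"


# Illustrative usage: one concept, two terms in two languages.
example = Concept("market", {"en": "market", "de": "Markt"}, broader="place-of-trade")
print(to_turtle(example))
```

The point of the sketch is that the identifier, not any single term, carries the concept; the terms are labels attached per language.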
IEG proposal - concept
• Keep both natural language vocabularies AND formalized ontologies
• An integrated approach:
1. Develop back-end ontologies, well formalized and documented*
2. Build vocabularies as needed, in natural language, associating tags with
formally defined concepts (avoids late integration)
But!
There is no 1-1 mapping between vocabularies and ontologies. Focus on what’s shared*.
Pareto principle: 80% of the effects (the tags we need) come from 20% of the causes (the concepts).
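The absence of a 1-1 mapping can be sketched as a many-to-many relation between front-end tags and back-end concepts, where some tags stay unmapped. Everything in this example (concept identifiers, tags, and the function names `shared_tags`, `tags_for`, `coverage`) is hypothetical and only illustrates the shape of the data.

```python
# Hypothetical back-end concepts (identifiers invented for illustration).
CONCEPTS = {"trade", "worship", "administration"}

# Front-end tags from a project vocabulary, each linked to zero or more concepts.
TAG_TO_CONCEPTS = {
    "market": {"trade"},
    "cathedral": {"worship"},
    "church": {"worship"},                      # several tags, one concept
    "town hall": {"administration", "trade"},   # one tag, several concepts
    "haunted house": set(),                     # project-specific, left unmapped
}


def shared_tags(mapping):
    """Tags that are bound to at least one formally defined concept."""
    return sorted(tag for tag, concepts in mapping.items() if concepts)


def tags_for(concept, mapping):
    """All tags, across the vocabulary, that point to a given concept."""
    return sorted(tag for tag, concepts in mapping.items() if concept in concepts)


def coverage(mapping):
    """Fraction of the vocabulary's tags covered by the shared concepts."""
    return len(shared_tags(mapping)) / len(mapping)


print(tags_for("worship", TAG_TO_CONCEPTS))
print(coverage(TAG_TO_CONCEPTS))
```

Unmapped tags remain usable inside a project; only the mapped ones take part in cross-project comparison.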
IEG proposal - implementation
Implementation is key:
1. Upper ontologies (integration among domains)
2. Domain ontologies (e.g. functions)
3. Labeling system
4. Controlled vocabularies
> Linked data enabled, user-friendly (minimal learning curve and overhead),
single entry point to standards: bridges tags and concepts.
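The four layers listed above could be wired together roughly as follows. This is a sketch under strong assumptions: the dictionaries `DOMAIN`, `LABELS` and `VOCABULARY`, the category `Activity`, and the function `resolve` are invented for illustration and do not describe an existing IEG system.

```python
# Layers 1 and 2: each domain concept points to an upper-level category.
DOMAIN = {"trade": "Activity", "worship": "Activity"}

# Layer 3: the labeling system binds (language, term) pairs to domain concepts.
LABELS = {
    ("en", "market"): "trade",
    ("de", "Markt"): "trade",
    ("en", "church"): "worship",
}

# Layer 4: a project vocabulary is just a selection of labels.
VOCABULARY = {"example-project": [("en", "market"), ("en", "church")]}


def resolve(lang, term):
    """Follow a tag through the labeling system up to the upper ontology.

    Returns (term, domain concept, upper category), or None when the
    labeling system does not know the term.
    """
    concept = LABELS.get((lang, term))
    if concept is None:
        return None
    return (term, concept, DOMAIN[concept])


for lang, term in VOCABULARY["example-project"]:
    print(resolve(lang, term))
```

In this reading the labeling system is the only layer a front-end user touches; the ontologies behind it can evolve without changing the tags.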
Large-scale, collaborative and community-driven framework (items 1, 2, 3 and, in
part, 4): few experts for the back end, many users for the front end, everything open.
Could we think about a Consortium for controlled vocabularies (like TEI)?
Historical place types
Quite problematic:
Same names mean different things in space, time, culture
Generic tags for specific meanings: ambiguity
Layers of interpretations: agents, socio-political context, historians
From nouns to verbs:
Most vocabularies of place types/features are already loosely classified by
function (economic activity, leisure facility, place of culture, etc.)
There are fewer verbs than nouns (WordNet synsets: ~82k nouns, ~14k verbs)
Verbs bring us closer to concrete events (and to linked data triples)
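The move from nouns to verbs maps naturally onto subject-predicate-object triples: a place is described by what happens there rather than by what it is called. The sketch below uses plain Python tuples; the identifiers, the `hosts` predicate and the data are made up for illustration and are not taken from any dataset.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Invented example data: two places described by the activities they host.
TRIPLES = [
    Triple("place:square-1", "hosts", "activity:trading"),
    Triple("place:square-1", "hosts", "activity:assembling"),
    Triple("place:church-1", "hosts", "activity:worshipping"),
    Triple("place:church-1", "hosts", "activity:assembling"),
]


def places_with_function(activity: str, triples: List[Triple]) -> List[str]:
    """Places that host a given activity, whatever noun labels them."""
    return sorted({t.subject for t in triples
                   if t.predicate == "hosts" and t.obj == activity})


def functions_of(place: str, triples: List[Triple]) -> List[str]:
    """Functional profile of a place: all activities it hosts."""
    return sorted(t.obj for t in triples
                  if t.subject == place and t.predicate == "hosts")


print(places_with_function("activity:assembling", TRIPLES))
print(functions_of("place:square-1", TRIPLES))
```

Two places with different names can then be compared by their functional profiles, which is the kind of comparison a noun-based tag hides.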
Functional categorization - I
Source: Filippo De Vivo, Patrizi, informatori, barbieri. Politica e comunicazione a Venezia nella prima età moderna. Milan: Feltrinelli, 2012. In English: id., Information and communication in Venice: Rethinking Early Modern Politics. Oxford: Oxford University Press, 2007.
Open questions
Is all this useful and feasible? (let’s try it)
Where to start (historical place types)
What to model (functions)
Design requirements
Explore technical solutions
How to integrate existing vocabularies
> Sketch guidelines
Partners, anyone? :)
Thanks!