Digital repertoires of poetry metrics: towards a Linked Open Data ecosystem
1. Digital repertoires of poetry metrics:
towards a Linked Open Data ecosystem
Mariana Curado Malta (mariana@iscap.ipp.pt; mariana.malta@linhd.uned.es), Polytechnic of Oporto | LINHD – UNED
Elena González-Blanco (egonzalezblanco@flog.uned.es), LINHD – UNED
Clara Martínez (cimartinez@flog.uned.es), LINHD – UNED
Gimena del Rio (gdelrio.riande@gmail.com), CONICET
Metadata and Semantics Research Conference
Göttingen 2016
2. Outline
• The Context
• The Problem
• The Approach
• Where are we now?
• Future work
• Conclusions
4. "A digital humanities center is an entity where new media and technologies are used
for humanities-based research, teaching, and intellectual engagement and
experimentation. The goals of the center are to further humanities scholarship, create
new forms of knowledge, and explore technology’s impact on humanities-based
disciplines".
Diane M. Zorich, A Survey of Digital Humanities Centers in the United States, 2008
Context – Where?
5. • LINHD: a bridge between different fields of
knowledge
• LINHD has:
Philologists
Software Developers
Natural Language Processing Experts
Ontologists & LOD technologists
Context – Where
8. The problem
• At least 21 repertoires on Poetry metrics & other
information (in the Web of Documents)
• This community wants to share all the data
among repertoires
• … to enhance its research
• … and more, e.g. researchers would love to
be able to play with the data!
9. (Sub-)problem I
• First issue: standardize poetic features
Different languages
Different cultures/traditions
Philologists take care of this issue!
10. Philological barriers are caused by different ways
of conceptualization
Alexandrines
4x(7pp+7p)
(Classic Latin)
12A12A12A12A
(Romance)
Goliardic
(Sub-)problem I
12. (Sub-)problem II
Second issue: repertoires locked in their silos of
information:
Different paradigms: Local and Web of Docs
Different technologies: XML, Excel, Access,
MySQL, SQL, Perl Objects (so far)
Different data models
13. The problem
• How to overcome these differences?
• LOD technology
• Development of a Metadata Application Profile
for the European Poetry community
14. The Approach
• Method for the development of Metadata
Application Profiles (Me4MAP)
• Me4MAP establishes a well-defined process for
the development of a MAP:
defines activities
when those activities should take place
how they interconnect
and their resulting deliverables
16. Where are we now?
• S1: Defining the Functional Requirements
Analysing the Websites’ functionalities: Use Case technique
17. • navigate through the Webpages of a repertoire
• analyse the functionalities these pages have
Step 1 – Functionalities
identification
18. Step 2 - Use Case model
development
Main flow:
1. Enumeration of interactions between user and system in the main scenario
2.
3.
Alternative flow (1):
1.1 Description of the interaction in an alternative scenario. Only if a specific step is very different compared to the main flow
Number: The number assigned to the Use Case
Use Case Name: The name of the Use Case
Actor: The main actor of the Use Case, i.e. the Actor who starts the Use Case
Description: Short description of the Use Case function
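The Use Case template fields listed here (number, name, actor, description, main flow, alternative flow) could be captured as a simple record. The following is only a minimal sketch: the class name `UseCase`, its field names, and the example values are assumptions made for illustration and do not come from the project.

```python
from dataclasses import dataclass, field


@dataclass
class UseCase:
    """One filled-in Use Case template (hypothetical structure)."""
    number: int
    name: str
    actor: str          # the Actor who starts the Use Case
    description: str
    main_flow: list[str] = field(default_factory=list)
    # Keyed by the step label of the alternative scenario, e.g. "1.1".
    alternative_flows: dict[str, str] = field(default_factory=dict)


# Invented example values, for illustration only.
search = UseCase(
    number=1,
    name="Search poems",
    actor="Researcher",
    description="Find poems that match a metrical pattern",
    main_flow=[
        "User enters a metrical pattern",
        "System lists the matching poems",
    ],
    alternative_flows={"1.1": "User browses by author instead of searching"},
)
```

Keeping alternative flows keyed by step label mirrors the template's rule that an alternative is recorded only where a step differs strongly from the main flow.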
19. Step 3 - Data elements
of the Use Cases
Window number/name: Number or name of the window that is being described.
Data elements:
Label: The label of the data element in the window
Cardinality: 1 – M or 1
Searchable: Yes/No
Link: Yes/No
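As a rough illustration, the data-element description of a window (label, cardinality, searchable, link) could be recorded as follows. This is a sketch under assumptions: the names `DataElement` and `Window`, the cardinality encoding `"1"` / `"1-M"`, and the sample window are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataElement:
    label: str         # label of the data element in the window
    cardinality: str   # "1" or "1-M" (assumed encoding)
    searchable: bool
    link: bool

    def __post_init__(self):
        if self.cardinality not in ("1", "1-M"):
            raise ValueError(f"unknown cardinality: {self.cardinality}")


@dataclass
class Window:
    name: str
    elements: list[DataElement]

    def searchable_labels(self) -> list[str]:
        """Labels of the elements a user can search on."""
        return [e.label for e in self.elements if e.searchable]


# Invented example window, for illustration only.
poem_window = Window(
    name="Poem details",
    elements=[
        DataElement("Title", "1", searchable=True, link=False),
        DataElement("Author", "1-M", searchable=True, link=True),
        DataElement("Notes", "1", searchable=False, link=False),
    ],
)
```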
21. Where are we now?
• S2: Defining the Domain Model
Analysing the Logical Models of the databases (when possible)
22. Where are we now?
Relational Database Conceptual Model
Source: ReMetCa
URL: http://www.remetca.uned.es/
23. Where are we now?
Relational Database Conceptual Model
DOCUMENTATION
Source: ReMetCa
URL: http://www.remetca.uned.es/
24. Where are we now?
XML Schema Model Conceptual Model
Source: Digital Edition of the index of Middle English Verse
URL: http://dimev.net
25. Where are we now?
XML Schema Model Conceptual Model
DOCUMENTATION
Source: Digital Edition of the index of Middle English Verse
URL: http://dimev.net
26. Where are we now?
Perl Script structure Conceptual Model
Source: Versologie
URL: http://metro.ucl.cas.cz/kveta
27. Where are we now?
Perl Script structure Conceptual Model
DOCUMENTATION
Source: Versologie
URL: http://metro.ucl.cas.cz/kveta
28. Where are we now?
• The reverse engineering process eliminates all the details
that have to do with the implementation/representation
(normalization)
• We have followed this method:
• Delete ID keys
• Separate different concepts that are represented in
the same table
• Delete tables that enumerate terms; they become
properties that can be repeated
• When a model has conceptual problems, fix them
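Two of the reverse-engineering rules above (dropping ID keys, and turning tables that enumerate terms into repeatable properties) are mechanical enough to sketch in code. The example below is a toy illustration, not the project's tooling: the `Table` and `Concept` classes, the rule that a column named `id` or ending in `_id` is a key, and the sample schema are all assumptions. Separating concepts and fixing conceptual problems need human judgment and are not covered.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    columns: list[str]
    enumerates_terms: bool = False  # flagged by hand for lookup tables


@dataclass
class Concept:
    name: str
    properties: list[str] = field(default_factory=list)
    repeatable_properties: list[str] = field(default_factory=list)


def is_id_key(column: str) -> bool:
    # Assumed naming convention for primary and foreign keys.
    return column == "id" or column.endswith("_id")


def to_conceptual_model(tables, referenced_by):
    """referenced_by maps a term table name to the table that uses it."""
    concepts = {}
    for table in tables:
        if table.enumerates_terms:
            continue  # deleted as a table; handled as a property below
        kept = [c for c in table.columns if not is_id_key(c)]
        concepts[table.name] = Concept(table.name, kept)
    for term_table, owner in referenced_by.items():
        concepts[owner].repeatable_properties.append(term_table)
    return concepts


# Invented schema, for illustration only.
tables = [
    Table("poem", ["id", "title", "genre_id"]),
    Table("genre", ["id", "term"], enumerates_terms=True),
]
model = to_conceptual_model(tables, {"genre": "poem"})
```

In this toy schema the `genre` lookup table disappears and `poem` gains a repeatable `genre` property, which is the shape of result the rules describe.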
29. Where are we now?
• During the process of reverse engineering
we standardize, i.e.
Call the same concepts by the same
name (working together with the philologists)
Try to give tables or properties the same
names as classes or terms already
identified in A3: Environmental
Scan
30. Where are we now?
• A3: Environmental Scan
• A report reviewing the RDF
vocabularies that may serve the needs of
the Domain Model
31. Where are we now?
• If the person responsible for a repertoire did not
provide a database definition, analyse the
functionalities of the Website
• Study the controlled vocabularies and
standardize them (work of a philologist)
32. Future Work
• At the end of the S1 & S2 activities:
Functional requirements & Domain
Model defined
• Domain Model validation in two meetings
(Jan/Feb 2016) with:
1. The people responsible for the repertoires (circa
20)
2. Semantic modeling experts (3)
33. Next step?
• S2.1: Vocabulary Alignment
To match the terms of the RDF
vocabularies identified in the
Environmental Scan (A3) with the
needs of the Domain Model.
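To make the alignment step concrete, here is a minimal sketch of matching Domain Model terms against terms from existing RDF vocabularies and listing what remains unmatched. The domain terms (`title`, `author`, `rhymeScheme`) are hypothetical and not taken from the actual Domain Model; the two Dublin Core Terms URIs are real, but choosing them here is only an assumption for the example.

```python
# Hypothetical Domain Model terms, for illustration only.
domain_terms = ["title", "author", "rhymeScheme"]

# Candidate matches from an existing RDF vocabulary (Dublin Core Terms).
candidates = {
    "title": "http://purl.org/dc/terms/title",
    "author": "http://purl.org/dc/terms/creator",
}


def align(terms, candidates):
    """Split terms into those with a vocabulary match and those without."""
    matched = {t: candidates[t] for t in terms if t in candidates}
    unmatched = [t for t in terms if t not in candidates]
    return matched, unmatched


matched, unmatched = align(domain_terms, candidates)
```

Terms left in `unmatched` are the ones for which no existing vocabulary term was found, so they would need further searching or a new term definition.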
34. Conclusions
• POSTDATA aims to put poetry metrics data in
LOD
• There are at least 21 repertoires on the Web of
Documents
• To achieve that we need to: 1) standardize the
way poetry metrics is defined, 2) create a
Metadata Application Profile (MAP) for the
European Poetry community
• We are following Me4MAP to define this MAP