Florentina Armaselu – DHLab, Centre virtuel de la connaissance sur l’Europe (CVCE),
Luxembourg
florentina.armaselu@cvce.eu
www.cvce.eu
From a Small-Scale Digital Edition to a TEI
Publication Framework in Modern
European History
Text Encoding Initiative (TEI) Conference and Members’
Meeting. Connect, Animate, Innovate. 28 to 31 October
2015. Université Lumière Lyon 2
Summary
1. The WEU-DIPLO pilot project
2. Transviewer, towards a TEI publication framework
3. Discussion
4. References
Part I
The WEU-DIPLO pilot project
Overview of the WEU-DIPLO project
1. Goal: XML-TEI encoding, corpus analysis and Web publication of institutional documents
of the W.E.U. (Western European Union):
• Topics: armament production, standardization and control in the period from 1954 to 1982;
• Source: Archives nationales de Luxembourg, W.E.U. collection.
2. Initial format:
• digitized versions (JPG) of typewritten materials (one file per page).
3. Size (*proc. = processed):
Category        Documents                Pages
                Total  EN  FR FR proc.*  Total   EN   FR FR proc.*
Note               89  43  46        37    395  191  204       155
Minutes            30  15  15        15    256  138  118       118
Memorandum          3   1   2         2     16    7    9         9
Study               2   0   2         1     12    0   12         8
Discourse           1   0   1         0      4    0    4         0
Draft protocol      2   1   1         0      4    2    2         0
Total             127  60  67        55    687  338  349       290
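The figures in the size table can be cross-checked with a few lines of Python. The sketch below re-enters the rows by hand; the tuple layout and the function names are illustrative assumptions and not part of the project. It checks that EN + FR equals each row total, that the processed French counts never exceed the French counts, and that the columns add up to the Total row.

```python
# Rows copied from the table. Tuple layout (an assumption of this sketch):
# (docs, docs_en, docs_fr, docs_fr_proc, pages, pages_en, pages_fr, pages_fr_proc)
ROWS = {
    "Note": (89, 43, 46, 37, 395, 191, 204, 155),
    "Minutes": (30, 15, 15, 15, 256, 138, 118, 118),
    "Memorandum": (3, 1, 2, 2, 16, 7, 9, 9),
    "Study": (2, 0, 2, 1, 12, 0, 12, 8),
    "Discourse": (1, 0, 1, 0, 4, 0, 4, 0),
    "Draft protocol": (2, 1, 1, 0, 4, 2, 2, 0),
}
TOTAL = (127, 60, 67, 55, 687, 338, 349, 290)


def column_sums(rows):
    """Sum every column across all categories."""
    return tuple(sum(column) for column in zip(*rows.values()))


def rows_are_consistent(rows):
    """EN + FR must equal the row total; processed FR cannot exceed FR."""
    return all(
        r[0] == r[1] + r[2]
        and r[4] == r[5] + r[6]
        and r[3] <= r[2]
        and r[7] <= r[6]
        for r in rows.values()
    )


print("columns match Total row:", column_sums(ROWS) == TOTAL)
print("rows internally consistent:", rows_are_consistent(ROWS))
print(f"share of French pages processed: {TOTAL[7] / TOTAL[6]:.1%}")
```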
Overview of the WEU-DIPLO project: workflow
Overview of the WEU-DIPLO project: page structure. ©WEU-UEO
Header
Content
Footer
Microsoft Word Styling – WEU-DIPLO
Headers, footers
Headings, line breaks,
paragraphs
Conversion and enrichment (XSLT, manual, NER)
OxGarage (DOCX to TEI P5)
oXygen XML Editor
• XSLT transformation (metadata, structure);
• manual enrichment (semantics – discourse
of country/institutional representatives)
GATE (Named Entity Recognition)
• training phase (Gazetteer List Collector)
• annotation phase (names of persons,
organisations, places, functions, events,
products; dates)
oXygen XML Editor
• XSLT (GATE XML to TEI P5 transformation)
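The last step of the pipeline above, turning named-entity annotations into TEI markup, can be illustrated with a minimal sketch. It is written in Python rather than XSLT and does not reproduce GATE's XML output or the project's stylesheet: the annotation type names, the `(start, end, type)` span format, the helper names and the sample sentence are all assumptions made for illustration. Only the target elements (`persName`, `orgName`, `placeName`, `date`, `name`) are standard TEI P5.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical mapping from annotation types to TEI P5 elements.
TEI_TAGS = {
    "Person": "persName",
    "Organization": "orgName",
    "Location": "placeName",
    "Date": "date",
}


def find_span(text, phrase, kind):
    """Locate the first occurrence of `phrase` and return a (start, end, type) span."""
    start = text.index(phrase)
    return (start, start + len(phrase), kind)


def annotate(text, spans):
    """Wrap non-overlapping (start, end, type) spans of `text` in TEI tags."""
    parts = []
    pos = 0
    for start, end, kind in sorted(spans):
        if start < pos:
            raise ValueError("overlapping annotations are not supported")
        parts.append(escape(text[pos:start]))
        inner = escape(text[start:end])
        tag = TEI_TAGS.get(kind)
        if tag is None:
            # Types without a dedicated element fall back to <name type="...">.
            parts.append(f"<name type={quoteattr(kind.lower())}>{inner}</name>")
        else:
            parts.append(f"<{tag}>{inner}</{tag}>")
        pos = end
    parts.append(escape(text[pos:]))
    return "<p>" + "".join(parts) + "</p>"


# Invented sample sentence, not taken from the corpus.
sample = "The Council met in Paris on 7 May 1955."
spans = [
    find_span(sample, "Paris", "Location"),
    find_span(sample, "Council", "Organization"),
    find_span(sample, "7 May 1955", "Date"),
]
print(annotate(sample, spans))
```

Keeping the annotations as offsets until the final step means the transcription text itself is never edited by the recogniser, which makes it easier to rerun the annotation phase after the gazetteer lists change.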
XML-TEI Encoding: WEU-DIPLO - metadata; layout (header). ©WEU-UEO
@@hAuthor
@@hArchNum
@@hStampConfid
@@hDocRef
@@hOrigDate
@@hOrigLang
@@hVersion
XML-TEI Encoding: WEU-DIPLO – Structure (headings, paragraphs, line breaks); semantics (named
entities, discourse). ©WEU-UEO
@@Heading2@@Paragraph
@@LineBreak@@Names
@@Discourse
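How styled blocks could be turned into TEI structure can be sketched as follows. This is a Python stand-in for the XSLT step, under stated assumptions: the input is taken to be a list of `(style, text)` pairs, the style names `Heading2` and `Paragraph` are borrowed from the markers above, and their mapping to `head` and `p`, like the function name `build_div`, is hypothetical. Line breaks inside a block become TEI `lb` elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from style names to TEI elements.
STYLE_TO_TEI = {
    "Heading2": "head",
    "Paragraph": "p",
}


def build_div(blocks):
    """Build a TEI <div> from (style, text) pairs; newlines become <lb/>."""
    div = ET.Element("div")
    for style, text in blocks:
        if style not in STYLE_TO_TEI:
            raise KeyError(f"no TEI mapping for style {style!r}")
        element = ET.SubElement(div, STYLE_TO_TEI[style])
        lines = text.split("\n")
        element.text = lines[0]
        for line in lines[1:]:
            # Each <lb/> marks the start of a new line; the line's text follows it.
            line_break = ET.SubElement(element, "lb")
            line_break.tail = line
    return div


# Invented sample content, not taken from the corpus.
div = build_div([
    ("Heading2", "I. Introduction"),
    ("Paragraph", "first line\nsecond line"),
])
print(ET.tostring(div, encoding="unicode"))
```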
XML-TEI Encoding: WEU-DIPLO – transcription features (Pierazzo, 2011)
Part II
Transviewer, towards a TEI
publication framework
Context: The CVCE’s ePublications
• Treaties; official declarations and meeting reports; letters; notes; press articles; images, video and
audio archives related to European integration history
Transviewer overview
1. Transviewer concept:
• XML-TEI transformation/visualisation on the fly, in the browser;
• flexible framework for the publication of XML-TEI documents in European
integration history.
2. Technologies:
• XML, HTML, XSLT, CSS and JavaScript.
3. Tested platforms:
• EVT (Edition Visualization Technology): http://sourceforge.net/projects/evt-project/
• KILN: http://kiln.readthedocs.org/en/latest/#
• TEIBoilerplate: http://dcl.ils.indiana.edu/teibp/
• Versioning Machine: http://v-machine.org/
• XTF (eXtensible Text Framework): http://xtf.cdlib.org/about/
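The idea of transforming XML-TEI to a display format on the fly can be illustrated with a small sketch. Transviewer itself relies on XSLT and JavaScript in the browser; the Python below only mimics the principle with the standard library and is not the project's code. The element-to-HTML mapping, the CSS class names, the function names and the sample document (including the personal name) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Illustrative mapping: TEI local name -> (HTML tag, CSS class or None).
HTML_FOR = {
    "body": ("div", "transcription"),
    "div": ("div", None),
    "head": ("h2", None),
    "p": ("p", None),
    "lb": ("br", None),
    "persName": ("span", "persName"),
    "placeName": ("span", "placeName"),
}


def to_html(node):
    """Recursively copy a TEI element into an HTML element tree."""
    local = node.tag.split("}", 1)[-1]  # strip the namespace part of the tag
    tag, css = HTML_FOR.get(local, ("span", local))  # unknown elements -> <span>
    out = ET.Element(tag)
    if css:
        out.set("class", css)
    out.text = node.text
    out.tail = node.tail
    for child in node:
        out.append(to_html(child))
    return out


def render(tei_xml):
    """Return the HTML rendering of the <body> of a TEI document string."""
    root = ET.fromstring(tei_xml)
    body = root.find(f".//{TEI_NS}body")
    html_root = to_html(body)
    html_root.tail = None  # drop whitespace that followed </body> in the source
    return ET.tostring(html_root, encoding="unicode", method="html")


# Invented sample document, not taken from the corpus.
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><div>'
    "<head>Note</head>"
    "<p>Statement by <persName>M. Dupont</persName>,<lb/>Paris.</p>"
    "</div></body></text></TEI>"
)
html = render(SAMPLE)
print(html)
```

Because the rendering is derived from the TEI source each time it is requested, a correction made in the XML shows up in the display without any intermediate export step.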
Transviewer prototype
Implementation (adaptation and in-house development):
• side-by-side view of digital facsimile and transcription (EVT model);
• third-party libraries:
o BookReader: tool designed to provide online access to scanned books;
o Saxon-CE: support for XSLT 2.0 transformation in the browser;
• in-house development (configuration, frame and button layout/actions, transcription rendering, calls to third-party
libraries).
Transviewer experiments – digital facsimile/transcription side-by-side view. ©WEU-UEO
Transviewer experiments – digital facsimile/transcription side-by-side view. Werner – handwritten notes
Transviewer experiments (simulation) – video/audio and transcription synchronisation. Werner – interviews
Transviewer features – panel layouts
Transviewer features – transcription format
Transviewer features – panel interlinking
Part III
Discussion
“By teaching an edition how to swim, I mean endowing an edition not only with a
store of factual knowledge concerning the work presented, but also with the
capability of dealing gracefully with the mutability of the electronic medium, by
exploiting the possibilities for reader-controlled changes to the edition’s
presentation and by adapting successfully to rapid changes in the hardware and
software environment.” (Sperberg-McQueen, 2009)
1. Open questions for the Transviewer prototype:
• Is it flexible enough to support different types of documents in
European integration history and different user requirements?
• Does a modular architecture allow gradual development and
customisation according to the needs of the projects?
• How should manual intervention be balanced against automatic processing (XSLT, NER)?
• XML transformation on the fly: no intermediary formats or steps are
needed, and changes to the XML are already part of the publication.
3. Issues:
• BookReader – use of an older version of jQuery library;
• non-uniform support of Saxon-CE for XSLT 2.0 transformation in the
browsers;
• need for batch conversion to XML-TEI (potential adaptation of
OxGarage for batch processing).
4. Ongoing/future work for further development:
• evaluation (technology – technical experts; usability tests – experts
in European integration studies);
• development of new modules (multi-panels, audio/video
transcription, etc.) and tests with more project samples;
• integration into the existing CVCE’s Website architecture:
o Back End;
o Front End.
Thank you!
Scaling in a publication framework would imply not only
teaching your editions “how to swim” but also how to swim
together.
References
• BookReader: https://openlibrary.org/dev/docs/bookreader
• EVT (Edition Visualization Technology): http://sourceforge.net/projects/evt-project/
• GATE: https://gate.ac.uk/
• KILN: http://kiln.readthedocs.org/en/latest/#
• OxGarage: http://www.tei-c.org/oxgarage/
• Pierazzo, Elena (2011). “A rationale of digital documentary editions”. In LLC. The Journal of Digital
Scholarship in the Humanities, Vol. 26, No. 4, December 2011, pp. 463-477.
• http://www.scholarlyediting.org/2014/essays/essay.pierazzo.html
• Saxon-CE: http://www.saxonica.com/ce/user-doc/1.1/index.html
• Sperberg-McQueen, C.M. (2009). “How to teach your edition how to swim”. In LLC. The Journal of Digital
Scholarship in the Humanities, Vol. 24, No. 1, April 2009. Oxford Journals.
• TEI (Text Encoding Initiative): http://www.tei-c.org
• TEIBoilerplate: http://dcl.ils.indiana.edu/teibp/
• Versioning Machine: http://v-machine.org/
• XTF (eXtensible Text Framework): http://xtf.cdlib.org/about/

More Related Content

Similar to TEI Conference - CVCE

12_N.Smolenski, M.Kostic, A.Sofronijevic
12_N.Smolenski, M.Kostic, A.Sofronijevic12_N.Smolenski, M.Kostic, A.Sofronijevic
12_N.Smolenski, M.Kostic, A.SofronijevicNikola Smolenski
 
Crowd wales, Building a crowdsourcing platform for Wales by Paul McCann - Eur...
Crowd wales, Building a crowdsourcing platform for Wales by Paul McCann - Eur...Crowd wales, Building a crowdsourcing platform for Wales by Paul McCann - Eur...
Crowd wales, Building a crowdsourcing platform for Wales by Paul McCann - Eur...Europeana
 
F/LOSS in Norwegian libraries
F/LOSS in Norwegian librariesF/LOSS in Norwegian libraries
F/LOSS in Norwegian librariesLibriotech
 
Semtech web-protege-tutorial
Semtech web-protege-tutorialSemtech web-protege-tutorial
Semtech web-protege-tutorialmatthewhorridge
 
Enabling accessible multimedia for Moodle: iMoot 2010
Enabling accessible multimedia for Moodle: iMoot 2010Enabling accessible multimedia for Moodle: iMoot 2010
Enabling accessible multimedia for Moodle: iMoot 2010Nick Freear
 
Getty Presentation of IMA/AIC OSCI tool
Getty Presentation of IMA/AIC OSCI toolGetty Presentation of IMA/AIC OSCI tool
Getty Presentation of IMA/AIC OSCI toolRobert J. Stein
 
Presentation of the AIC-IMA publishing tool for OSCI
Presentation of the AIC-IMA publishing tool for OSCIPresentation of the AIC-IMA publishing tool for OSCI
Presentation of the AIC-IMA publishing tool for OSCIRobert J. Stein
 
Science Demonstrator Session: Social and Earth Sciences
Science Demonstrator Session: Social and Earth SciencesScience Demonstrator Session: Social and Earth Sciences
Science Demonstrator Session: Social and Earth SciencesEOSCpilot .eu
 
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation Framework
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation FrameworkBL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation Framework
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation FrameworkIMPACT Centre of Competence
 
Open Access Week 2017: Life Sciences and Open Sciences - worfkflows and tools
Open Access Week 2017: Life Sciences and Open Sciences - worfkflows and toolsOpen Access Week 2017: Life Sciences and Open Sciences - worfkflows and tools
Open Access Week 2017: Life Sciences and Open Sciences - worfkflows and toolsOpenAIRE
 
An overview of The European Library. Olaf Janssen presenting during DRH 2005,...
An overview of The European Library. Olaf Janssen presenting during DRH 2005,...An overview of The European Library. Olaf Janssen presenting during DRH 2005,...
An overview of The European Library. Olaf Janssen presenting during DRH 2005,...Olaf Janssen
 
Reducing Infrastructure and Service Fragmentation
Reducing Infrastructure and Service Fragmentation Reducing Infrastructure and Service Fragmentation
Reducing Infrastructure and Service Fragmentation EOSCpilot .eu
 
WP3 Further specification of Functionality and Interoperability - Gradmann / ...
WP3 Further specification of Functionality and Interoperability - Gradmann / ...WP3 Further specification of Functionality and Interoperability - Gradmann / ...
WP3 Further specification of Functionality and Interoperability - Gradmann / ...Europeana
 
#T3UXW14 : workspace Team Work
#T3UXW14 : workspace Team Work#T3UXW14 : workspace Team Work
#T3UXW14 : workspace Team WorkPaul Blondiaux
 
XML London 2013 - Architecture of xproc.xq an XProc processor
XML London 2013 - Architecture of xproc.xq an XProc processorXML London 2013 - Architecture of xproc.xq an XProc processor
XML London 2013 - Architecture of xproc.xq an XProc processorjimfuller2009
 
Europeana Cloud Aggregator Forum 2014
Europeana Cloud Aggregator Forum 2014Europeana Cloud Aggregator Forum 2014
Europeana Cloud Aggregator Forum 2014Europeana
 
ECLAP White paper, social network for Cultural Heritage on Peforming arts
ECLAP White paper, social network for Cultural Heritage on Peforming artsECLAP White paper, social network for Cultural Heritage on Peforming arts
ECLAP White paper, social network for Cultural Heritage on Peforming artsPaolo Nesi
 
Emtacl12, mlibraries12 conferences, 2012
Emtacl12, mlibraries12 conferences, 2012Emtacl12, mlibraries12 conferences, 2012
Emtacl12, mlibraries12 conferences, 2012Kerryn Amery
 

Similar to TEI Conference - CVCE (20)

12_N.Smolenski, M.Kostic, A.Sofronijevic
12_N.Smolenski, M.Kostic, A.Sofronijevic12_N.Smolenski, M.Kostic, A.Sofronijevic
12_N.Smolenski, M.Kostic, A.Sofronijevic
 
Crowd wales, Building a crowdsourcing platform for Wales by Paul McCann - Eur...
Crowd wales, Building a crowdsourcing platform for Wales by Paul McCann - Eur...Crowd wales, Building a crowdsourcing platform for Wales by Paul McCann - Eur...
Crowd wales, Building a crowdsourcing platform for Wales by Paul McCann - Eur...
 
F/LOSS in Norwegian libraries
F/LOSS in Norwegian librariesF/LOSS in Norwegian libraries
F/LOSS in Norwegian libraries
 
Semtech web-protege-tutorial
Semtech web-protege-tutorialSemtech web-protege-tutorial
Semtech web-protege-tutorial
 
Enabling accessible multimedia for Moodle: iMoot 2010
Enabling accessible multimedia for Moodle: iMoot 2010Enabling accessible multimedia for Moodle: iMoot 2010
Enabling accessible multimedia for Moodle: iMoot 2010
 
Getty Presentation of IMA/AIC OSCI tool
Getty Presentation of IMA/AIC OSCI toolGetty Presentation of IMA/AIC OSCI tool
Getty Presentation of IMA/AIC OSCI tool
 
Presentation of the AIC-IMA publishing tool for OSCI
Presentation of the AIC-IMA publishing tool for OSCIPresentation of the AIC-IMA publishing tool for OSCI
Presentation of the AIC-IMA publishing tool for OSCI
 
Squeak
SqueakSqueak
Squeak
 
Science Demonstrator Session: Social and Earth Sciences
Science Demonstrator Session: Social and Earth SciencesScience Demonstrator Session: Social and Earth Sciences
Science Demonstrator Session: Social and Earth Sciences
 
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation Framework
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation FrameworkBL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation Framework
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation Framework
 
Open Access Week 2017: Life Sciences and Open Sciences - worfkflows and tools
Open Access Week 2017: Life Sciences and Open Sciences - worfkflows and toolsOpen Access Week 2017: Life Sciences and Open Sciences - worfkflows and tools
Open Access Week 2017: Life Sciences and Open Sciences - worfkflows and tools
 
An overview of The European Library. Olaf Janssen presenting during DRH 2005,...
An overview of The European Library. Olaf Janssen presenting during DRH 2005,...An overview of The European Library. Olaf Janssen presenting during DRH 2005,...
An overview of The European Library. Olaf Janssen presenting during DRH 2005,...
 
Reducing Infrastructure and Service Fragmentation
Reducing Infrastructure and Service Fragmentation Reducing Infrastructure and Service Fragmentation
Reducing Infrastructure and Service Fragmentation
 
Bne impact iif
Bne impact iifBne impact iif
Bne impact iif
 
WP3 Further specification of Functionality and Interoperability - Gradmann / ...
WP3 Further specification of Functionality and Interoperability - Gradmann / ...WP3 Further specification of Functionality and Interoperability - Gradmann / ...
WP3 Further specification of Functionality and Interoperability - Gradmann / ...
 
#T3UXW14 : workspace Team Work
#T3UXW14 : workspace Team Work#T3UXW14 : workspace Team Work
#T3UXW14 : workspace Team Work
 
XML London 2013 - Architecture of xproc.xq an XProc processor
XML London 2013 - Architecture of xproc.xq an XProc processorXML London 2013 - Architecture of xproc.xq an XProc processor
XML London 2013 - Architecture of xproc.xq an XProc processor
 
Europeana Cloud Aggregator Forum 2014
Europeana Cloud Aggregator Forum 2014Europeana Cloud Aggregator Forum 2014
Europeana Cloud Aggregator Forum 2014
 
ECLAP White paper, social network for Cultural Heritage on Peforming arts
ECLAP White paper, social network for Cultural Heritage on Peforming artsECLAP White paper, social network for Cultural Heritage on Peforming arts
ECLAP White paper, social network for Cultural Heritage on Peforming arts
 
Emtacl12, mlibraries12 conferences, 2012
Emtacl12, mlibraries12 conferences, 2012Emtacl12, mlibraries12 conferences, 2012
Emtacl12, mlibraries12 conferences, 2012
 

More from dhlab

MyPublications: Enabling personal authoring and narrative making
MyPublications: Enabling personal authoring and narrative makingMyPublications: Enabling personal authoring and narrative making
MyPublications: Enabling personal authoring and narrative makingdhlab
 
Text Encoding and Enrichment for Linguistic Analysis: Archives on the policy ...
Text Encoding and Enrichment for Linguistic Analysis: Archives on the policy ...Text Encoding and Enrichment for Linguistic Analysis: Archives on the policy ...
Text Encoding and Enrichment for Linguistic Analysis: Archives on the policy ...dhlab
 
Humanist machine interaction for the digital humanities
Humanist machine interaction for the digital humanitiesHumanist machine interaction for the digital humanities
Humanist machine interaction for the digital humanitiesdhlab
 
History of Europe demo at IEEE MMSP 2013
History of Europe demo at IEEE MMSP 2013History of Europe demo at IEEE MMSP 2013
History of Europe demo at IEEE MMSP 2013dhlab
 
CUbRIK Summer School RHodes histoGraph
CUbRIK Summer School RHodes histoGraphCUbRIK Summer School RHodes histoGraph
CUbRIK Summer School RHodes histoGraphdhlab
 
HistoGraph presentation Insa de Lyon
HistoGraph presentation Insa de LyonHistoGraph presentation Insa de Lyon
HistoGraph presentation Insa de Lyondhlab
 
DH2013: Stuart Dunn - An emerging field(?): defining the fundamentals of huma...
DH2013: Stuart Dunn - An emerging field(?): defining the fundamentals of huma...DH2013: Stuart Dunn - An emerging field(?): defining the fundamentals of huma...
DH2013: Stuart Dunn - An emerging field(?): defining the fundamentals of huma...dhlab
 
DH2013: Roei Amit – Engage the exhibitions audience with the use of photograp...
DH2013: Roei Amit – Engage the exhibitions audience with the use of photograp...DH2013: Roei Amit – Engage the exhibitions audience with the use of photograp...
DH2013: Roei Amit – Engage the exhibitions audience with the use of photograp...dhlab
 
DH2013: Ad Pollé – Europeana 1914-18 & Europeana 1989
DH2013: Ad Pollé – Europeana 1914-18 & Europeana 1989DH2013: Ad Pollé – Europeana 1914-18 & Europeana 1989
DH2013: Ad Pollé – Europeana 1914-18 & Europeana 1989dhlab
 
DH2013: Christine Sauter – Results of the task force
DH2013: Christine Sauter – Results of the task forceDH2013: Christine Sauter – Results of the task force
DH2013: Christine Sauter – Results of the task forcedhlab
 
DH2013: Julia Fallon – Legal aspects of UGC
DH2013: Julia Fallon – Legal aspects of UGCDH2013: Julia Fallon – Legal aspects of UGC
DH2013: Julia Fallon – Legal aspects of UGCdhlab
 
DH2013: Marion Dupeyrat – Interacting with audiences: overview of participato...
DH2013: Marion Dupeyrat – Interacting with audiences: overview of participato...DH2013: Marion Dupeyrat – Interacting with audiences: overview of participato...
DH2013: Marion Dupeyrat – Interacting with audiences: overview of participato...dhlab
 
DH2013: Lars Wieneke – Workshop introduction
DH2013: Lars Wieneke – Workshop introduction DH2013: Lars Wieneke – Workshop introduction
DH2013: Lars Wieneke – Workshop introduction dhlab
 

More from dhlab (13)

MyPublications: Enabling personal authoring and narrative making
MyPublications: Enabling personal authoring and narrative makingMyPublications: Enabling personal authoring and narrative making
MyPublications: Enabling personal authoring and narrative making
 
Text Encoding and Enrichment for Linguistic Analysis: Archives on the policy ...
Text Encoding and Enrichment for Linguistic Analysis: Archives on the policy ...Text Encoding and Enrichment for Linguistic Analysis: Archives on the policy ...
Text Encoding and Enrichment for Linguistic Analysis: Archives on the policy ...
 
Humanist machine interaction for the digital humanities
Humanist machine interaction for the digital humanitiesHumanist machine interaction for the digital humanities
Humanist machine interaction for the digital humanities
 
History of Europe demo at IEEE MMSP 2013
History of Europe demo at IEEE MMSP 2013History of Europe demo at IEEE MMSP 2013
History of Europe demo at IEEE MMSP 2013
 
CUbRIK Summer School RHodes histoGraph
CUbRIK Summer School RHodes histoGraphCUbRIK Summer School RHodes histoGraph
CUbRIK Summer School RHodes histoGraph
 
HistoGraph presentation Insa de Lyon
HistoGraph presentation Insa de LyonHistoGraph presentation Insa de Lyon
HistoGraph presentation Insa de Lyon
 
DH2013: Stuart Dunn - An emerging field(?): defining the fundamentals of huma...
DH2013: Stuart Dunn - An emerging field(?): defining the fundamentals of huma...DH2013: Stuart Dunn - An emerging field(?): defining the fundamentals of huma...
DH2013: Stuart Dunn - An emerging field(?): defining the fundamentals of huma...
 
DH2013: Roei Amit – Engage the exhibitions audience with the use of photograp...
DH2013: Roei Amit – Engage the exhibitions audience with the use of photograp...DH2013: Roei Amit – Engage the exhibitions audience with the use of photograp...
DH2013: Roei Amit – Engage the exhibitions audience with the use of photograp...
 
DH2013: Ad Pollé – Europeana 1914-18 & Europeana 1989
DH2013: Ad Pollé – Europeana 1914-18 & Europeana 1989DH2013: Ad Pollé – Europeana 1914-18 & Europeana 1989
DH2013: Ad Pollé – Europeana 1914-18 & Europeana 1989
 
DH2013: Christine Sauter – Results of the task force
DH2013: Christine Sauter – Results of the task forceDH2013: Christine Sauter – Results of the task force
DH2013: Christine Sauter – Results of the task force
 
DH2013: Julia Fallon – Legal aspects of UGC
DH2013: Julia Fallon – Legal aspects of UGCDH2013: Julia Fallon – Legal aspects of UGC
DH2013: Julia Fallon – Legal aspects of UGC
 
DH2013: Marion Dupeyrat – Interacting with audiences: overview of participato...
DH2013: Marion Dupeyrat – Interacting with audiences: overview of participato...DH2013: Marion Dupeyrat – Interacting with audiences: overview of participato...
DH2013: Marion Dupeyrat – Interacting with audiences: overview of participato...
 
DH2013: Lars Wieneke – Workshop introduction
DH2013: Lars Wieneke – Workshop introduction DH2013: Lars Wieneke – Workshop introduction
DH2013: Lars Wieneke – Workshop introduction
 

Recently uploaded

SBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSebastiano Panichella
 
The Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism PresentationThe Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism PresentationNathan Young
 
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxEngaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxAsifArshad8
 
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...Henrik Hanke
 
Chizaram's Women Tech Makers Deck. .pptx
Chizaram's Women Tech Makers Deck.  .pptxChizaram's Women Tech Makers Deck.  .pptx
Chizaram's Women Tech Makers Deck. .pptxogubuikealex
 
Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸mathanramanathan2005
 
Event 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxEvent 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxaryanv1753
 
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC  - NANOTECHNOLOGYPHYSICS PROJECT BY MSC  - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC - NANOTECHNOLOGYpruthirajnayak525
 
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.KathleenAnnCordero2
 
Genshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxGenshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxJohnree4
 
miladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptxmiladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptxCarrieButtitta
 
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power
 
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRRINDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRRsarwankumar4524
 
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRachelAnnTenibroAmaz
 
Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Escort Service
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...漢銘 謝
 
Early Modern Spain. All about this period
Early Modern Spain. All about this periodEarly Modern Spain. All about this period
Early Modern Spain. All about this periodSaraIsabelJimenez
 
Quality by design.. ppt for RA (1ST SEM
Quality by design.. ppt for  RA (1ST SEMQuality by design.. ppt for  RA (1ST SEM
Quality by design.. ppt for RA (1ST SEMCharmi13
 
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comSaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comsaastr
 
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSimulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSebastiano Panichella
 

Recently uploaded (20)

SBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation Track
 
The Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism PresentationThe Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism Presentation
 
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxEngaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
 
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
 
Chizaram's Women Tech Makers Deck. .pptx
Chizaram's Women Tech Makers Deck.  .pptxChizaram's Women Tech Makers Deck.  .pptx
Chizaram's Women Tech Makers Deck. .pptx
 
Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸
 
Event 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxEvent 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptx
 
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC  - NANOTECHNOLOGYPHYSICS PROJECT BY MSC  - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
 
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
 
Genshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxGenshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptx
 
miladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptxmiladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptx
 
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
 
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRRINDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
 
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
 
Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
 
Early Modern Spain. All about this period
Early Modern Spain. All about this periodEarly Modern Spain. All about this period
Early Modern Spain. All about this period
 
Quality by design.. ppt for RA (1ST SEM
Quality by design.. ppt for  RA (1ST SEMQuality by design.. ppt for  RA (1ST SEM
Quality by design.. ppt for RA (1ST SEM
 
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comSaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
 
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSimulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
 

TEI Conference - CVCE

  • 1. Florentina Armaselu – DHLab, Centre virtuel de la connaissance sur l’Europe (CVCE), Luxembourg florentina.armaselu@cvce.eu 1 www.cvce.eu From a Small-Scale Digital Edition to a TEI Publication Framework in Modern European History Text Encoding Initiative (TEI) Conference and Members’ Meeting. Connect, Animate, Innovate. 28 to 31 Octobre 2015. Université Lumière Lyon 2
  • 2. 1. The WEU-DIPLO pilot project 2. Transviewer, towards a TEI publication framework 3. Discussion 4. References Summary 2
  • 3. Part I The WEU-DIPLO pilot project 3
  • 4. 1. Goal: XML-TEI encoding, corpus analysis and Web publication of institutional documents of the W.E.U. (Western European Union): • Topics: armament production, standardization, control in the period from 1954 to 1982; • Source: Archives nationales de Luxembourg, W.E.U collection. 2. Initial format: • digitized versions (JPG) of typewritten materials (one file per page). 3. Size: *proc. = processed Overview of the WEU-DIPLO project Part I. WEU-DIPLO pilot 4 Category Number of documents Number of documents per language Number of pages Number of pages per language EN FR FR proc.* EN FR FR proc.* Note 89 43 46 37 395 191 204 155 Minutes 30 15 15 15 256 138 118 118 Memorandum 3 1 2 2 16 7 9 9 Study 2 0 2 1 12 0 12 8 Discourse 1 0 1 0 4 0 4 0 Draft protocol 2 1 1 0 4 2 2 0 Total 127 60 67 55 687 338 349 290
  • 5. Overview of the WEU-DIPLO project: workflow
  • 6. Overview of the WEU-DIPLO project: page structure (header, content, footer). ©WEU-UEO
  • 7. Microsoft Word styling (WEU-DIPLO): headers, footers; headings, line breaks, paragraphs
  • 8. Conversion and enrichment (XSLT, manual, NER)
      • OxGarage (DOCX to TEI P5)
      • oXygen XML Editor:
         o XSLT transformation (metadata, structure);
         o manual enrichment (semantics: discourse of country/institutional representatives)
      • GATE (Named Entity Recognition):
         o training phase (Gazetteer List Collector)
         o annotation phase (names of persons, organisations, places, functions, events, products; dates)
      • oXygen XML Editor:
         o XSLT (GATE XML to TEI P5 transformation)
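The last step above, turning GATE annotations into TEI P5 markup, is done with XSLT in the project. As a rough illustration of the idea only, the sketch below renames inline annotation elements to TEI elements using Python's standard library. The GATE element names, the mapping table, and the sample sentence are all assumptions made for the example; the slides do not show the actual annotation schema or stylesheet, and the TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: GATE annotation element -> (TEI element, attributes).
# The annotation names on the left are assumed, not taken from the project.
GATE_TO_TEI = {
    "Person": ("persName", {}),
    "Organization": ("orgName", {}),
    "Location": ("placeName", {}),
    "Function": ("roleName", {}),
    "Event": ("name", {"type": "event"}),
    "Product": ("name", {"type": "product"}),
    "Date": ("date", {}),
}


def gate_to_tei(gate_xml: str) -> str:
    """Rename known annotation elements to their TEI counterparts."""
    root = ET.fromstring(gate_xml)
    for elem in root.iter():
        if elem.tag in GATE_TO_TEI:
            tei_tag, attrs = GATE_TO_TEI[elem.tag]
            elem.tag = tei_tag
            elem.attrib.clear()        # drop annotation-specific attributes
            elem.attrib.update(attrs)  # add the TEI attributes, if any
    return ET.tostring(root, encoding="unicode")


# Invented sample paragraph, for illustration only.
sample = (
    '<p>Note by <Person gateId="12">M. Dupont</Person>, '
    '<Organization>Western European Union</Organization>, '
    '<Date>7 May 1955</Date>.</p>'
)
print(gate_to_tei(sample))
```

A mapping table like this keeps the conversion declarative, which is close in spirit to a set of XSLT template rules.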
  • 9. XML-TEI encoding (WEU-DIPLO): metadata; layout (header). Markers: @@hAuthor, @@hArchNum, @@hStampConfid, @@hDocRef, @@hOrigDate, @@hOrigLang, @@hVersion. ©WEU-UEO
  • 10. XML-TEI encoding (WEU-DIPLO): structure (headings, paragraphs, line breaks); semantics (named entities, discourse). Markers: @@Heading2, @@Paragraph, @@LineBreak, @@Names, @@Discourse. ©WEU-UEO
  • 11. XML-TEI encoding (WEU-DIPLO): transcription features (Pierazzo, 2011)
  • 12. Part II: Transviewer, towards a TEI publication framework
  • 13. Context: the CVCE’s ePublications. Treaties; official declarations and meeting reports; letters; notes; press articles; images, video and audio archives related to European integration history
  • 14. Transviewer overview
      1. Transviewer concept:
         • XML-TEI transformation/visualisation on the fly, in the browser;
         • flexible framework for the publication of XML-TEI documents in European integration history.
      2. Technologies:
         • XML, HTML, XSLT, CSS and JavaScript
      3. Tested platforms:
         • EVT (Edition Visualization Technology): http://sourceforge.net/projects/evt-project/
         • KILN: http://kiln.readthedocs.org/en/latest/#
         • TEIBoilerplate: http://dcl.ils.indiana.edu/teibp/
         • Versioning Machine: http://v-machine.org/
         • XTF (eXtensible Text Framework): http://xtf.cdlib.org/about/
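The Transviewer concept is to render XML-TEI directly at display time rather than through intermediary files. The prototype does this with XSLT in the browser; the sketch below swaps in Python's standard library purely to illustrate rule-based TEI-to-HTML rendering. The rule table, the CSS class names, and the sample fragment are assumptions made for the example, not the project's stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical rendering rules: TEI local name -> (HTML tag, CSS class or None).
RULES = {
    "body": ("div", "transcription"),
    "head": ("h2", None),
    "p": ("p", None),
    "lb": ("br", None),
    "persName": ("span", "persName"),
}


def to_html(tei_node):
    """Recursively convert a TEI element into an HTML element."""
    local = tei_node.tag.split("}")[-1]  # strip the namespace, if any
    # Elements without a rule fall back to a <span> classed by their name.
    html_tag, css_class = RULES.get(local, ("span", local))
    out = ET.Element(html_tag)
    if css_class:
        out.set("class", css_class)
    out.text = tei_node.text
    for child in tei_node:
        html_child = to_html(child)
        html_child.tail = child.tail  # keep the text that follows the child
        out.append(html_child)
    return out


def render(tei_xml: str) -> str:
    """Parse a TEI fragment and serialise its HTML rendering."""
    html_root = to_html(ET.fromstring(tei_xml))
    return ET.tostring(html_root, encoding="unicode", method="html")


# Invented TEI fragment, for illustration only.
SAMPLE = (
    '<body xmlns="http://www.tei-c.org/ns/1.0"><head>Note</head>'
    '<p>Statement by <persName>M. Dupont</persName><lb/>'
    'on standardisation.</p></body>'
)
html = render(SAMPLE)
print(html)
```

Because the HTML is derived from the TEI source each time it is requested, a correction to the XML shows up in the published view without a separate export step, which is the point made on the discussion slide.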
  • 15. Transviewer prototype. Implementation (adaptation and in-house development):
      • side-by-side view of digital facsimile and transcription (EVT model);
      • third-party libraries:
         o BookReader: tool designed to provide online access to scanned books;
         o Saxon-CE: support for XSLT 2.0 transformation in the browser;
      • in-house development (configuration, layout and actions of frames and buttons, transcription rendering, calls to third-party libraries).
  • 16. Transviewer experiments: digital facsimile/transcription side-by-side view. ©WEU-UEO
  • 17. Transviewer experiments: digital facsimile/transcription side-by-side view. Werner, handwritten notes
  • 18. Transviewer experiments (simulation): video/audio and transcription synchronisation. Werner, interviews
  • 19. Transviewer features: panel layouts
  • 20. Transviewer features: transcription format
  • 21. Transviewer features: panel interlinking
  • 23. Discussion
      “By teaching an edition how to swim, I mean endowing an edition not only with a store of factual knowledge concerning the work presented, but also with the capability of dealing gracefully with the mutability of the electronic medium, by exploiting the possibilities for reader-controlled changes to the edition’s presentation and by adapting successfully to rapid changes in the hardware and software environment.” (Sperberg-McQueen, 2009)
      1. Transviewer prototype questions:
         • flexible enough to support different types of documents in European integration history and different user requirements;
         • modular architecture to allow gradual development and customisation according to the needs of the projects;
         • balance between manual interventions and automatic processing (XSLT, NER);
         • XML transformation on the fly (no need for intermediary formats/steps, changes to the XML already part of the publication).
  • 24. Discussion
      3. Issues:
         • BookReader: use of an older version of the jQuery library;
         • non-uniform browser support for XSLT 2.0 transformation with Saxon-CE;
         • need for batch conversion to XML-TEI (potential adaptation of OxGarage for batch processing).
      4. Ongoing/future work for further development:
         • evaluation (technology: technical experts; usability tests: experts in European integration studies);
         • development of new modules (multi-panels, audio/video transcription, etc.) and tests with more project samples;
         • integration into the existing CVCE Website architecture:
            o Back End;
            o Front End.
  • 25. Discussion: Scaling in a publication framework would imply not only teaching your editions “how to swim” but also how to swim together. Thank you!
  • 26. References
      • BookReader: https://openlibrary.org/dev/docs/bookreader
      • EVT (Edition Visualization Technology): http://sourceforge.net/projects/evt-project/
      • GATE: https://gate.ac.uk/
      • KILN: http://kiln.readthedocs.org/en/latest/#
      • OxGarage: http://www.tei-c.org/oxgarage/
      • Pierazzo, Elena. 2011. “A rationale of digital documentary editions”. In LLC. The Journal of Digital Scholarship in the Humanities, Vol. 26, No. 4, December 2011, pp. 463-477.
      • http://www.scholarlyediting.org/2014/essays/essay.pierazzo.html
      • TEIBoilerplate: http://dcl.ils.indiana.edu/teibp/
      • TEI (Text Encoding Initiative): http://www.tei-c.org
      • Versioning Machine: http://v-machine.org/
      • Saxon-CE: http://www.saxonica.com/ce/user-doc/1.1/index.html
      • Sperberg-McQueen, C.M. 2009. “How to teach your edition how to swim”. In LLC. The Journal of Digital Scholarship in the Humanities, Vol. 24, No. 1, April 2009. Oxford Journals.
      • XTF (eXtensible Text Framework): http://xtf.cdlib.org/about/