
eScriptorium: An Open Source Platform for Historical Document Analysis

By Daniel Stoekl Ben Ezra (Director of Studies, EPHE-PSL, UMR 8546 AOrOc).

Rendez-vous IIIF360, an online event on IIIF standards and technologies organized by the IIIF360 consortium (Biblissima, Campus Condorcet, Huma-Num) on 24 March 2021: https://projet.biblissima.fr/fr/evenements/rendez-vous-iiif360-2021

  1. eScriptorium: An Open Source Platform for Historical Document Analysis. Daniel Stökl Ben Ezra, Peter Stokes, Marc Bui, Ben Kiessling, Robin Tissot
  2. eScriptorium • Blog: http://escripta.hypotheses.org • UI Code: https://gitlab.inria.fr/scripta/escriptorium • AI Code: https://github.com/mittagessen/kraken • Funded by: PSL IRIS Scripta, H2020 Resilience, MENESR, DIM STCN Ile de France, EquipEx Biblissima+, [indirectly: Mellon, MCC]
  3. eScriptorium Universe: Scripta PSL: eScriptorium LectauRep INRIA ANF openITI North-Eastern U Maryland ERC Vietnamica EPHE U-Bib Heidelberg ? National Library of Israel ? ENC Sorbonne Université DIM STCN Observatoire de Paris IRHT H2020 Resilience • manuscriptologIA High Performance Computing Cluster at mesoPSL Biblissima+ TGIR Huma-Num
  4. Current • Import: IIIF, PDF, image files (jpg, png, …), ALTO, PageXML, trained segmentation or transcription models (user-definable architectures) • Ergonomic UI for manual segmentation, transcription and (soon) annotation. 4 panels (facsimile, segmentation, transcription, text annotation)
  5. ↓ Metadata imported via IIIF
  6. Current • Import: IIIF, PDF, image files (jpg, png, …), ALTO, PageXML, trained segmentation or transcription models (user-definable architectures) • Ergonomic UI for manual segmentation, transcription and (soon) annotation. 4 panels (facsimile, segmentation, transcription, text annotation)
  7. Ergonomic transcription, e.g. of vertical or oblique lines
  8. BL ms Add. 27296. Transcription font size automatically adapted to the manuscript line
  9. Current • Import: IIIF, PDF, image files (jpg, png, …), ALTO, PageXML, trained segmentation or transcription models (user-definable architectures) • Ergonomic UI for manual segmentation, transcription and (soon) annotation. 4 panels (facsimile, segmentation, transcription, text annotation) • Automatic segmentation (lines, semantic lines and regions, also overlapping) based on user-defined ontologies. • Automatic transcription according to the principles set by the user. • Export: ALTO 4 (!), PageXML, txt, image files (jpg, png, …), trained segmentation or transcription models • Powerful and growing API
  10. Segmentation and Transcription Demonstration ↑ User-definable segmentation ontology
  11. Locate illuminations through layout segmentation
  12. Automatic segmentation result of a manuscript-specific model
  13. Ergonomic correction
  14. Jbaiter Mirador textoverlay plugin
  15. eScriptorium (near) FUTURE: Scripta PSL: eScriptorium LectauRep INRIA ANF openITI North-Eastern U Maryland ERC Vietnamica EPHE U-Bib Heidelberg ? National Library of Israel ? ENC Sorbonne Université DIM STCN Observatoire de Paris IRHT H2020 Resilience • Search • Trainable reading order • Prototype for text annotation (NE, ecdotic) with TEI export • Prototype for image annotation (e.g. Digipal / Archetype) • manuscriptologIA High Performance Computing Cluster at mesoPSL • Customizable virtual keyboard • Vertical interface for Chinese • Automatic text alignment • Additional simplified interface • Improved project management • Crowdsourcing interface Biblissima+ TGIR Huma-Num
  16. Transcription created automatically without specific transcription. BnF syr 341
  17. Judeo-Arabic + Hebrew, Ox. Bodl. Pococke 295, Maimonides, Mishnah Commentary
  18. Greek papyri (with WÜ, HD, B)
  19. Greek papyri (with WÜ, HD, B)
  20. eScriptorium used for Dead Sea Scroll glyph alignment. Automatic letter-level alignment. Images of Dead Sea Scrolls by Shay Halevy, courtesy Israel Antiquities Authority
  21. p. 3558:
  22. Please stay tuned for upcoming workshops. Contact: daniel.stoekl@ephe.psl.eu, peter.stokes@ephe.psl.eu • https://escripta.hypotheses.org • Many thanks to: Bibliothèque nationale de France; National Library of Israel (Ktiv!); Bayerische Staatsbibliothek München; Biblioteca Apostolica Vaticana; Bodleian Library, Oxford; Cambridge University Library; Israel Antiquities Authority, Jerusalem; Staatsbibliothek Berlin, Preußischer Kulturbesitz • Intro tutorial: https://lectaurep.hypotheses.org/documentation/prendre-en-main-escriptorium
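
The slides list ALTO 4 and plain text among the export formats. As a rough illustration of what such an export contains, here is a minimal sketch that reads an ALTO file with the Python standard library and turns it into line-by-line text. It does not use eScriptorium or kraken code. The sample document, the function names `local_name` and `alto_lines`, and the IDs are assumptions made up for this example; real exports carry more detail (coordinates, regions, tags).

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written ALTO 4 fragment used only for illustration.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout>
    <Page ID="p1" WIDTH="1000" HEIGHT="1500">
      <PrintSpace>
        <TextBlock ID="b1">
          <TextLine ID="l1" BASELINE="100 200 900 210">
            <String CONTENT="first"/>
            <SP/>
            <String CONTENT="line"/>
          </TextLine>
          <TextLine ID="l2" BASELINE="100 300 900 310">
            <String CONTENT="second line"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text):
    """Return (line ID, text) pairs in document order.

    Tags are matched by local name, so the sketch does not depend on the
    exact ALTO schema version declared in the file.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Each String element carries one token or one whole line in CONTENT.
        words = [
            child.get("CONTENT", "")
            for child in element
            if local_name(child.tag) == "String"
        ]
        lines.append((element.get("ID"), " ".join(words)))
    return lines


if __name__ == "__main__":
    # A plain-text rendering is then just the line texts joined by newlines.
    print("\n".join(text for _, text in alto_lines(SAMPLE_ALTO)))
```

Matching on local names rather than a fixed namespace is a deliberate simplification, since files from different tools may declare different ALTO versions.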

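
Slide 5 notes that metadata is imported via IIIF. The sketch below shows, under stated assumptions, what reading that metadata from a IIIF Presentation 2.x manifest can look like with the Python standard library. It is not the eScriptorium import code. The manifest fragment, the `example.org` URL, and the helper names `as_text`, `manifest_metadata` and `canvas_images` are hypothetical.

```python
import json

# Hypothetical, hand-written IIIF Presentation 2.x manifest fragment.
SAMPLE_MANIFEST = json.loads("""
{
  "@context": "http://iiif.io/api/presentation/2/context.json",
  "@type": "sc:Manifest",
  "label": "Example manuscript",
  "metadata": [
    {"label": "Shelfmark", "value": "MS Example 1"},
    {"label": [{"@value": "Language", "@language": "en"}],
     "value": [{"@value": "Hebrew", "@language": "en"}]}
  ],
  "sequences": [{"canvases": [
    {"label": "f. 1r",
     "images": [{"resource":
       {"@id": "https://example.org/iiif/f1r/full/full/0/default.jpg"}}]}
  ]}]
}
""")


def as_text(value):
    """Collapse a plain string, a language-tagged object, or a list of either."""
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        return str(value.get("@value", ""))
    if isinstance(value, list):
        return "; ".join(as_text(item) for item in value)
    return str(value)


def manifest_metadata(manifest):
    """Map each metadata label to its value as plain strings."""
    return {
        as_text(entry.get("label")): as_text(entry.get("value"))
        for entry in manifest.get("metadata", [])
    }


def canvas_images(manifest):
    """List (canvas label, image URL) pairs in sequence order."""
    pairs = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                pairs.append((as_text(canvas.get("label")), image["resource"]["@id"]))
    return pairs


if __name__ == "__main__":
    print(manifest_metadata(SAMPLE_MANIFEST))
    print(canvas_images(SAMPLE_MANIFEST))
```

The `as_text` helper exists because IIIF 2.x allows labels and values to be either bare strings or language-tagged objects, and an importer has to accept both.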