Distributed Repositories and Crowd-Sourcing Transcription
Open Repositories 2014, Helsinki, Finland, 11th of June 2014
Distributed Repositories of Medieval Calendars
and Crowd-Sourcing of Transcription
Rob Sanderson
azaroth42@gmail.com
azaroth@stanford.edu
t: @azaroth42
Stanford University
Ben Albritton, Stanford University
Doug Emery, University of Pennsylvania
Will Noel, University of Pennsylvania
Dot Porter, University of Pennsylvania
http://iiif.io/
This research was primarily funded by the Andrew W. Mellon Foundation
Image Repositories
•  Increase in digitization
•  Particularly precious, fragile, beautiful objects
•  Medieval Manuscripts
•  Digitized images online
•  Increasingly Open
•  At high resolution
•  Easy to capture an image
•  Very hard to capture the text
http://gallica.bnf.fr/ark:/12148/btv1b8449691v/
Calendars
•  Ubiquitous in liturgical books
•  e.g. Books of Hours
•  Structured and often tabular: Date, Day, Saint / Event
•  Content varies slightly
•  Variation details give us information about the provenance of the object
•  Much easier to transcribe
•  Good pilot project!
http://www.e-codices.unifr.ch/en/bge/lat0033
Collaborative Crowd Sourcing?
•  Meeting at U. Penn including content providers and scholars
•  Plan:
•  Collect transcriptions together
•  Analyze similarities between manuscripts for patterns of provenance
•  Manuscripts and images distributed: need a community to collect sufficient data
http://brbl-dl.library.yale.edu/vufind/Record/3446275
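One simple way to picture the "analyze similarities" step, assuming each calendar has been transcribed into date and feast entries, is to compare manuscripts by how many entries they share. The Jaccard measure, the `CalendarEntry` type, and the sample entries below are illustrative assumptions, not the project's stated method or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CalendarEntry:
    """One transcribed calendar row: a date and the saint or event listed."""
    month: int
    day: int
    feast: str


def jaccard(a: set, b: set) -> float:
    """Share of entries two calendars have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical transcriptions, for illustration only.
ms_a = {
    CalendarEntry(1, 1, "Circumcision"),
    CalendarEntry(1, 3, "Genevieve"),
    CalendarEntry(1, 6, "Epiphany"),
}
ms_b = {
    CalendarEntry(1, 1, "Circumcision"),
    CalendarEntry(1, 6, "Epiphany"),
    CalendarEntry(1, 13, "Hilary"),
}

print(jaccard(ms_a, ms_b))
```

Entries that appear in one calendar but not the other (here the third entry of each set) are the kind of variation that may hint at provenance.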
Micro Repository Rant: TEI
•  Most transcribing done in TEI
•  Terrible for this use case:
•  Single XML file
•  Single author
•  Single location
•  Hard to link to images
•  Tries to describe too much
•  Impossible to use once created
•  Creating TEI is good for:
•  The academic exercise of creating TEI
http://www.thedigitalwalters.org/Data/WaltersManuscripts/html/W41/
Requirements
•  Distributed image content
•  Consistent, rich API
•  Selection of regions
•  Base, not displayed size
•  Alignment of text with region
•  Distributed creation
•  Distributed curation
•  Multiple texts per region
•  Styling of the text
•  Some semantics
http://oculus-dev.lib.harvard.edu/manifests/view/drs:5981093
1. Images: BNF next to Yale
Open Technology: IIIF Image API
Base URL: {scheme}://{host}{/prefix}/{identifier}
Image Resource:
{base}/{region}/{size}/{rotation}/{quality}.{format}
http://iiif.io/api/image/1.1/
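As a rough sketch, the two URL templates above can be filled in with a pair of small helpers. The function names, the example.org host, and the identifier are hypothetical; the default quality `native` follows the 1.1 version of the Image API linked on this slide (later versions use a different default name).

```python
from urllib.parse import quote


def iiif_base(scheme: str, host: str, prefix: str, identifier: str) -> str:
    """Fill in {scheme}://{host}{/prefix}/{identifier}."""
    cleaned = prefix.strip("/")
    prefix_part = f"/{cleaned}" if cleaned else ""
    # Identifiers containing "/" must be percent-encoded in the URL.
    return f"{scheme}://{host}{prefix_part}/{quote(identifier, safe='')}"


def iiif_image_url(base: str, region: str = "full", size: str = "full",
                   rotation: str = "0", quality: str = "native",
                   fmt: str = "jpg") -> str:
    """Fill in {base}/{region}/{size}/{rotation}/{quality}.{format}."""
    return f"{base.rstrip('/')}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier, for illustration only.
base = iiif_base("http", "example.org", "iiif", "ms-1/f1")
print(iiif_image_url(base, region="100,200,300,400", size="pct:50"))
```

Requesting a region as `x,y,w,h` in full-image pixels is what makes it possible to cut a single calendar line out of a page hosted at another institution.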
(Part of the) IIIF Community
•  ARTstor
•  Bibliothèque Nationale de France
•  Bodleian Libraries, Oxford University
•  British Library
•  C2MRF
•  Cambridge University
•  Cornell University
•  DPLA
•  Europeana
•  e-codices
•  Harvard University
•  Johns Hopkins University
•  National Library of Denmark
•  National Library of Poland
•  National Library of New Zealand
•  National Library of Norway
•  National Library of Wales
•  Princeton University
•  Stanford University
•  Wellcome Trust
•  UK National Archives
•  Yale University
2. Crowdsourced Box Drawing
Open Technologies
•  Mirador
•  IIIF Community developed viewer
•  Stanford, Harvard, Yale, [LANL]
•  Zooming via OpenSeadragon
•  Princeton, and OSD committers
•  Jcrop
•  jQuery plugin for drawing little boxes
•  MongoDB
•  Store information via REST interface
•  W3C Media Fragment image segments
•  Trivially converted to IIIF Image API requests
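A minimal sketch of the "trivial" conversion, assuming the stored segment is a URI ending in a W3C Media Fragment of the form `#xywh=x,y,w,h` in full-size image pixels, and that the base URL of the matching IIIF image service is known. The function name and the example.org URLs are hypothetical; percentage fragments are not handled here.

```python
import re

# Matches "#xywh=10,20,300,40" and the explicit "#xywh=pixel:10,20,300,40".
XYWH = re.compile(r"#xywh=(?:pixel:)?(\d+),(\d+),(\d+),(\d+)$")


def fragment_to_iiif(target: str, image_base: str) -> str:
    """Turn a media-fragment target into an IIIF Image API region request."""
    match = XYWH.search(target)
    if match is None:
        raise ValueError(f"no pixel xywh fragment in {target!r}")
    x, y, w, h = match.groups()
    # Full size, no rotation, "native" quality as in Image API 1.1.
    return f"{image_base.rstrip('/')}/{x},{y},{w},{h}/full/0/native.jpg"


print(fragment_to_iiif(
    "http://example.org/canvas/f1#xywh=10,20,300,40",
    "http://example.org/iiif/img1",
))
```

The fragment's four numbers map directly onto the `{region}` slot of the image URL, which is why the drawn boxes can be stored once and rendered by any compliant image server.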
Open Technology
•  Line/Column inspiration from TPEN (IIIF compliant)
•  Transcription tool developed at St. Louis
•  http://t-pen.org/TPEN/
•  Line detection flaky, no internal columns
Boring (but Open) Metadata
•  Metadata collection to drive the analysis
•  Stored along with the segments
•  Defaults are normally correct
•  Custom extension, not intended for general purpose use
•  Convenient to do inline
•  Other metadata could be added
•  Could be done in a different workflow
Metadata
Open Technology: IIIF Presentation API
Text/Image Linking is a subset of a larger challenge:
•  Non-Text / Image Linking
•  Dynamic Images
•  No Image to link to
•  Multiple Images
•  Parts of Images
•  Parts of larger texts
•  Distributed images, texts and links
Need an indirection layer:
•  Solution: align text and image with an abstract Canvas
http://iiif.io/api/presentation/1.0/
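The Canvas indirection can be sketched as two annotations that both point at the same abstract canvas: one carrying an image, one carrying transcribed text for a region. The key names below follow the general Open Annotation pattern used by the IIIF Presentation API and should be read as an approximation; the function names, URIs, and sample text are hypothetical.

```python
import json


def image_annotation(canvas_uri: str, image_uri: str) -> dict:
    """Associate a whole image with the canvas."""
    return {
        "@type": "oa:Annotation",
        "motivation": "sc:painting",
        "resource": {"@id": image_uri, "@type": "dctypes:Image"},
        "on": canvas_uri,
    }


def text_annotation(canvas_uri: str, x: int, y: int, w: int, h: int,
                    text: str) -> dict:
    """Associate a transcription with a rectangular region of the canvas."""
    return {
        "@type": "oa:Annotation",
        "motivation": "sc:painting",
        "resource": {
            "@type": "cnt:ContentAsText",
            "format": "text/plain",
            "chars": text,
        },
        "on": f"{canvas_uri}#xywh={x},{y},{w},{h}",
    }


canvas = "http://example.org/canvas/f1"  # hypothetical canvas URI
img = image_annotation(canvas, "http://example.org/iiif/img1/full/full/0/native.jpg")
txt = text_annotation(canvas, 10, 20, 300, 40, "Circumcisio domini")
print(json.dumps(txt, indent=2))
```

Because neither annotation refers to the other, the image and the text can live in different repositories and still line up on the canvas.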
Linked Data People...
If you do not want
to know the score,
look away now!
Linked Data People...
{ "it's" : "just JSON" }
Web Developers...
If you do not want
to know the score,
look away now!
Web Developers...
<_:it's>
<_:all>
<_:Linked_Data>;
Micro Repository Rant 2: RDF Serialization
“RDF/XML was the Semantic Web’s 3 Mile Island incident”
-- Manu Sporny, http://manu.sporny.org/2012/nuclear-rdf/
Or … RDF – Not in my back yard!
•  Serializing a graph is, admittedly, hard
•  RDF/XML is terrible, and too many others
•  Web currently uses JSON as convenient transfer syntax
•  JSON-LD allows transfer of RDF in syntax that does not require full RDF stack, just a JSON implementation
•  … as available in every web browser
•  Rob's Conclusion: Require JSON-LD
•  http://json-ld.org/
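To illustrate the point that only a JSON implementation is needed, the sketch below reads a JSON-LD document with Python's standard `json` module and nothing else. The canvas document, its URI, and its values are made up for the example.

```python
import json

# A hypothetical JSON-LD canvas description, received as plain text.
doc = """
{
  "@context": "http://iiif.io/api/presentation/2/context.json",
  "@id": "http://example.org/canvas/f1",
  "@type": "sc:Canvas",
  "label": "f. 1r",
  "height": 1000,
  "width": 750
}
"""

# No RDF library involved: the document is an ordinary JSON object.
canvas = json.loads(doc)
print(canvas["label"], canvas["height"], canvas["width"])
```

A consumer that does want RDF can still interpret the same bytes as a graph by applying the referenced `@context`.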
JSON-LD Context Magic
{ // Canvas resource
"@context":"http://iiif.io/api/presentation/2/context.json",

@context provides mapping for JSON keys into RDF.

"sc":"http://www.shared-canvas.org/ns/",
"oa":"http://www.w3.org/ns/oa#",
"service":{
"@type":"@id",
"@id":"sioc_svcs:has_service"},
"height":{
"@type":"xsd:integer",
"@id":"exif:height"},
"sequences":{
"@type":"@id",
"@id":"sc:hasSequences",
"@container":"@list"}
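The mapping can be pictured with a deliberately simplified key expander. This is not a JSON-LD processor: it only resolves a term to its `@id` and expands one prefix, and it ignores `@type` and `@container`. The `exif` namespace URI is an assumption added so that `exif:height` can be expanded; it does not appear on the slide.

```python
# A fragment of the context shown above, as a Python dict.
CONTEXT = {
    "sc": "http://www.shared-canvas.org/ns/",
    "oa": "http://www.w3.org/ns/oa#",
    "exif": "http://www.w3.org/2003/12/exif/ns#",  # assumed namespace URI
    "height": {"@type": "xsd:integer", "@id": "exif:height"},
    "sequences": {"@type": "@id", "@id": "sc:hasSequences",
                  "@container": "@list"},
}


def expand_key(key: str, context: dict) -> str:
    """Map a short JSON key to the full property URI it stands for."""
    term = context.get(key, key)
    iri = term["@id"] if isinstance(term, dict) else term
    prefix, sep, local = iri.partition(":")
    namespace = context.get(prefix)
    if sep and isinstance(namespace, str):
        return namespace + local
    return iri  # unknown terms are returned unchanged


print(expand_key("height", CONTEXT))
print(expand_key("sequences", CONTEXT))
```

Web developers keep writing `height` and `sequences`; the context quietly supplies the RDF property behind each key.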
Open Technologies: REST
•  Experimental IIIF REST specification
•  http://iiif.io/api/annex/rest/
•  For both Presentation and Image
•  Trivial Python/WSGI handler
•  Processes @context and generates identities
•  Stores in MongoDB (but API is agnostic)
•  Follows IIIF Presentation and Open Annotation
•  http://www.w3.org/community/openannotation/
•  Returns the correct JSON-LD
•  Doesn't fully handle image upload yet
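A much-reduced sketch of what such a WSGI handler might look like, under several assumptions: an in-memory dict stands in for MongoDB, identities are generated from the request path plus a UUID, and `@context` processing is skipped entirely. It is not the handler described on this slide, and the paths are hypothetical.

```python
import io
import json
import uuid

STORE = {}  # in-memory stand-in for the MongoDB collection


def app(environ, start_response):
    """POST stores a JSON-LD document and mints an @id; GET returns it."""
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO", "")
    if method == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        doc = json.loads(environ["wsgi.input"].read(length))
        # Generate an identity for the new resource (hypothetical scheme).
        doc["@id"] = f"{path.rstrip('/')}/{uuid.uuid4()}"
        STORE[doc["@id"]] = doc
        status, body = "201 Created", doc
    elif method == "GET" and path in STORE:
        status, body = "200 OK", STORE[path]
    else:
        status, body = "404 Not Found", {"error": "not found"}
    payload = json.dumps(body).encode("utf-8")
    headers = [("Content-Type", "application/ld+json"),
               ("Content-Length", str(len(payload)))]
    if status.startswith("201"):
        headers.append(("Location", body["@id"]))
    start_response(status, headers)
    return [payload]


def call(method, path, data=b""):
    """Exercise the handler directly, without starting a server."""
    environ = {"REQUEST_METHOD": method, "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(data)),
               "wsgi.input": io.BytesIO(data)}
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    body = b"".join(app(environ, start_response))
    return seen["status"], json.loads(body)


status, created = call("POST", "/annotations",
                       json.dumps({"@type": "oa:Annotation"}).encode("utf-8"))
print(status, created["@id"])
```

Keeping storage behind a plain dict-like interface mirrors the slide's point that the API itself is agnostic about the database underneath.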
The Future is Now
•  IIIF Image API 2.0
•  Request for Comment period open!
•  http://iiif.io/api/image/2.0/
•  IIIF Presentation API 2.0
•  Ditto!
•  http://iiif.io/api/presentation/2.0/
Please give us feedback: iiif-discuss@googlegroups.com
•  Ongoing work with U.Penn to make a more robust system
Thank You
Rob Sanderson
azaroth42@gmail.com
azaroth@stanford.edu
t: @azaroth42
Stanford University
http://iiif.io/
iiif-discuss@googlegroups.com

Más contenido relacionado

Similar a Open Repositories 2014: Crowdsourced Transcription via IIIF

Hiberactive: Pro-Active Archiving of Web References from Scholarly Articles
Hiberactive: Pro-Active Archiving of  Web References from Scholarly Articles Hiberactive: Pro-Active Archiving of  Web References from Scholarly Articles
Hiberactive: Pro-Active Archiving of Web References from Scholarly Articles Martin Klein
 
2014 Census of Open Access Repositories in Germany, Austria and Switzerland
2014 Census of Open Access Repositories in Germany, Austria and Switzerland2014 Census of Open Access Repositories in Germany, Austria and Switzerland
2014 Census of Open Access Repositories in Germany, Austria and SwitzerlandPaul Vierkant
 
Library collections and the emerging scholarly record
Library collections and the emerging scholarly recordLibrary collections and the emerging scholarly record
Library collections and the emerging scholarly recordlisld
 
Re-appropriating Wikipedia
Re-appropriating WikipediaRe-appropriating Wikipedia
Re-appropriating WikipediaShih-Chieh Li
 
Nevalainen & Syvalahti - Knotworking as a means to strengthen information ski...
Nevalainen & Syvalahti - Knotworking as a means to strengthen information ski...Nevalainen & Syvalahti - Knotworking as a means to strengthen information ski...
Nevalainen & Syvalahti - Knotworking as a means to strengthen information ski...IL Group (CILIP Information Literacy Group)
 
Islandora Webinar: Highlighting CUHK Chinese Digital Collections
Islandora Webinar:  Highlighting CUHK Chinese Digital CollectionsIslandora Webinar:  Highlighting CUHK Chinese Digital Collections
Islandora Webinar: Highlighting CUHK Chinese Digital CollectionsErin Tripp
 
Europeana Libraries: the value of a library domain aggregator
Europeana Libraries: the value of a library domain aggregatorEuropeana Libraries: the value of a library domain aggregator
Europeana Libraries: the value of a library domain aggregatorLIBER Europe
 
Re-Reading the British Memorial Project #de2012
Re-Reading the British Memorial Project #de2012Re-Reading the British Memorial Project #de2012
Re-Reading the British Memorial Project #de2012Nicole Beale
 
Europeana Libraries: bringing content to the researcher
Europeana Libraries: bringing content to the researcherEuropeana Libraries: bringing content to the researcher
Europeana Libraries: bringing content to the researcherLIBER Europe
 
Collection Directions - Research collections in the network environment
Collection Directions - Research collections in the network environmentCollection Directions - Research collections in the network environment
Collection Directions - Research collections in the network environmentConstance Malpas
 
Advocating Open Access: Before, during and after HEFCE
Advocating Open Access: Before, during and after HEFCEAdvocating Open Access: Before, during and after HEFCE
Advocating Open Access: Before, during and after HEFCENick Sheppard
 
Blowing Up the Book: Design Challenges for Online Cultural Heritage Information
Blowing Up the Book: Design Challenges for Online Cultural Heritage InformationBlowing Up the Book: Design Challenges for Online Cultural Heritage Information
Blowing Up the Book: Design Challenges for Online Cultural Heritage InformationDesign for Context
 
PATHS: Using Pathways for Navigating and Personalised Access to Cultural Heri...
PATHS: Using Pathways for Navigating and Personalised Access to Cultural Heri...PATHS: Using Pathways for Navigating and Personalised Access to Cultural Heri...
PATHS: Using Pathways for Navigating and Personalised Access to Cultural Heri...pathsproject
 
Institutional repositories
Institutional repositoriesInstitutional repositories
Institutional repositoriesTor Loney
 
The PhD Abstracts Collections in FLAX: Academic English with the Open Access ...
The PhD Abstracts Collections in FLAX: Academic English with the Open Access ...The PhD Abstracts Collections in FLAX: Academic English with the Open Access ...
The PhD Abstracts Collections in FLAX: Academic English with the Open Access ...Alannah Fitzgerald
 
Doing DH in Theological Libraries
Doing DH in Theological LibrariesDoing DH in Theological Libraries
Doing DH in Theological LibrariesClifford Anderson
 

Similar a Open Repositories 2014: Crowdsourced Transcription via IIIF (20)

Hiberactive: Pro-Active Archiving of Web References from Scholarly Articles
Hiberactive: Pro-Active Archiving of  Web References from Scholarly Articles Hiberactive: Pro-Active Archiving of  Web References from Scholarly Articles
Hiberactive: Pro-Active Archiving of Web References from Scholarly Articles
 
2014 Census of Open Access Repositories in Germany, Austria and Switzerland
2014 Census of Open Access Repositories in Germany, Austria and Switzerland2014 Census of Open Access Repositories in Germany, Austria and Switzerland
2014 Census of Open Access Repositories in Germany, Austria and Switzerland
 
Library collections and the emerging scholarly record
Library collections and the emerging scholarly recordLibrary collections and the emerging scholarly record
Library collections and the emerging scholarly record
 
Building the Abnormal Hieratic Global Portal
Building the Abnormal Hieratic Global PortalBuilding the Abnormal Hieratic Global Portal
Building the Abnormal Hieratic Global Portal
 
Re-appropriating Wikipedia
Re-appropriating WikipediaRe-appropriating Wikipedia
Re-appropriating Wikipedia
 
Nevalainen & Syvalahti - Knotworking as a means to strengthen information ski...
Nevalainen & Syvalahti - Knotworking as a means to strengthen information ski...Nevalainen & Syvalahti - Knotworking as a means to strengthen information ski...
Nevalainen & Syvalahti - Knotworking as a means to strengthen information ski...
 
Islandora Webinar: Highlighting CUHK Chinese Digital Collections
Islandora Webinar:  Highlighting CUHK Chinese Digital CollectionsIslandora Webinar:  Highlighting CUHK Chinese Digital Collections
Islandora Webinar: Highlighting CUHK Chinese Digital Collections
 
Europeana Libraries: the value of a library domain aggregator
Europeana Libraries: the value of a library domain aggregatorEuropeana Libraries: the value of a library domain aggregator
Europeana Libraries: the value of a library domain aggregator
 
Presentation of DanteSources
Presentation of DanteSourcesPresentation of DanteSources
Presentation of DanteSources
 
Re-Reading the British Memorial Project #de2012
Re-Reading the British Memorial Project #de2012Re-Reading the British Memorial Project #de2012
Re-Reading the British Memorial Project #de2012
 
Europeana Libraries: bringing content to the researcher
Europeana Libraries: bringing content to the researcherEuropeana Libraries: bringing content to the researcher
Europeana Libraries: bringing content to the researcher
 
Collection Directions - Research collections in the network environment
Collection Directions - Research collections in the network environmentCollection Directions - Research collections in the network environment
Collection Directions - Research collections in the network environment
 
Advocating Open Access: Before, during and after HEFCE
Advocating Open Access: Before, during and after HEFCEAdvocating Open Access: Before, during and after HEFCE
Advocating Open Access: Before, during and after HEFCE
 
Blowing Up the Book: Design Challenges for Online Cultural Heritage Information
Blowing Up the Book: Design Challenges for Online Cultural Heritage InformationBlowing Up the Book: Design Challenges for Online Cultural Heritage Information
Blowing Up the Book: Design Challenges for Online Cultural Heritage Information
 
PATHS: Using Pathways for Navigating and Personalised Access to Cultural Heri...
PATHS: Using Pathways for Navigating and Personalised Access to Cultural Heri...PATHS: Using Pathways for Navigating and Personalised Access to Cultural Heri...
PATHS: Using Pathways for Navigating and Personalised Access to Cultural Heri...
 
New file
New fileNew file
New file
 
New file
New fileNew file
New file
 
Institutional repositories
Institutional repositoriesInstitutional repositories
Institutional repositories
 
The PhD Abstracts Collections in FLAX: Academic English with the Open Access ...
The PhD Abstracts Collections in FLAX: Academic English with the Open Access ...The PhD Abstracts Collections in FLAX: Academic English with the Open Access ...
The PhD Abstracts Collections in FLAX: Academic English with the Open Access ...
 
Doing DH in Theological Libraries
Doing DH in Theological LibrariesDoing DH in Theological Libraries
Doing DH in Theological Libraries
 

Más de Robert Sanderson

LUX - Cross Collections Cultural Heritage at Yale
LUX - Cross Collections Cultural Heritage at YaleLUX - Cross Collections Cultural Heritage at Yale
LUX - Cross Collections Cultural Heritage at YaleRobert Sanderson
 
Zoom as a Paradigm for Linked Open Usable Data
Zoom as a Paradigm for Linked Open Usable DataZoom as a Paradigm for Linked Open Usable Data
Zoom as a Paradigm for Linked Open Usable DataRobert Sanderson
 
Provenance and Uncertainty in Linked Art
Provenance and Uncertainty in Linked ArtProvenance and Uncertainty in Linked Art
Provenance and Uncertainty in Linked ArtRobert Sanderson
 
Data is our Product: Thoughts on LOD Sustainability
Data is our Product: Thoughts on LOD SustainabilityData is our Product: Thoughts on LOD Sustainability
Data is our Product: Thoughts on LOD SustainabilityRobert Sanderson
 
A Perspective on Wikidata: Ecosystems, Trust, and Usability
A Perspective on Wikidata: Ecosystems, Trust, and UsabilityA Perspective on Wikidata: Ecosystems, Trust, and Usability
A Perspective on Wikidata: Ecosystems, Trust, and UsabilityRobert Sanderson
 
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable Data
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable DataLinked Art: Sustainable Cultural Knowledge through Linked Open Usable Data
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable DataRobert Sanderson
 
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open Data
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open DataIllusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open Data
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open DataRobert Sanderson
 
Structural Metadata in RDF (IS575)
Structural Metadata in RDF (IS575)Structural Metadata in RDF (IS575)
Structural Metadata in RDF (IS575)Robert Sanderson
 
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data Ecosystem
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data EcosystemSanderson CNI 2020 Keynote - Cultural Heritage Research Data Ecosystem
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data EcosystemRobert Sanderson
 
Tiers of Abstraction and Audience in Cultural Heritage Data Modeling
Tiers of Abstraction and Audience in Cultural Heritage Data ModelingTiers of Abstraction and Audience in Cultural Heritage Data Modeling
Tiers of Abstraction and Audience in Cultural Heritage Data ModelingRobert Sanderson
 
The Importance of being LOUD
The Importance of being LOUDThe Importance of being LOUD
The Importance of being LOUDRobert Sanderson
 
Introduction to Linked Art Model
Introduction to Linked Art ModelIntroduction to Linked Art Model
Introduction to Linked Art ModelRobert Sanderson
 
Standards and Communities: Connected People, Consistent Data, Usable Applicat...
Standards and Communities: Connected People, Consistent Data, Usable Applicat...Standards and Communities: Connected People, Consistent Data, Usable Applicat...
Standards and Communities: Connected People, Consistent Data, Usable Applicat...Robert Sanderson
 
Strong Opinions, Weakly Held
Strong Opinions, Weakly HeldStrong Opinions, Weakly Held
Strong Opinions, Weakly HeldRobert Sanderson
 
IIIF Discovery Walkthrough
IIIF Discovery WalkthroughIIIF Discovery Walkthrough
IIIF Discovery WalkthroughRobert Sanderson
 
Linked Art: An Art Museum Profile for CIDOC-CRM
Linked Art: An Art Museum Profile for CIDOC-CRMLinked Art: An Art Museum Profile for CIDOC-CRM
Linked Art: An Art Museum Profile for CIDOC-CRMRobert Sanderson
 
Euromed2018 Keynote: Usability over Completeness, Community over Committee
Euromed2018 Keynote: Usability over Completeness, Community over CommitteeEuromed2018 Keynote: Usability over Completeness, Community over Committee
Euromed2018 Keynote: Usability over Completeness, Community over CommitteeRobert Sanderson
 
Linked Art - Our Linked Open Usable Data Model
Linked Art - Our Linked Open Usable Data ModelLinked Art - Our Linked Open Usable Data Model
Linked Art - Our Linked Open Usable Data ModelRobert Sanderson
 
EuropeanaTech Keynote: Shout it out LOUD
EuropeanaTech Keynote: Shout it out LOUDEuropeanaTech Keynote: Shout it out LOUD
EuropeanaTech Keynote: Shout it out LOUDRobert Sanderson
 

Más de Robert Sanderson (20)

Understanding Linked Art
Understanding Linked ArtUnderstanding Linked Art
Understanding Linked Art
 
LUX - Cross Collections Cultural Heritage at Yale
LUX - Cross Collections Cultural Heritage at YaleLUX - Cross Collections Cultural Heritage at Yale
LUX - Cross Collections Cultural Heritage at Yale
 
Zoom as a Paradigm for Linked Open Usable Data
Zoom as a Paradigm for Linked Open Usable DataZoom as a Paradigm for Linked Open Usable Data
Zoom as a Paradigm for Linked Open Usable Data
 
Provenance and Uncertainty in Linked Art
Provenance and Uncertainty in Linked ArtProvenance and Uncertainty in Linked Art
Provenance and Uncertainty in Linked Art
 
Data is our Product: Thoughts on LOD Sustainability
Data is our Product: Thoughts on LOD SustainabilityData is our Product: Thoughts on LOD Sustainability
Data is our Product: Thoughts on LOD Sustainability
 
A Perspective on Wikidata: Ecosystems, Trust, and Usability
A Perspective on Wikidata: Ecosystems, Trust, and UsabilityA Perspective on Wikidata: Ecosystems, Trust, and Usability
A Perspective on Wikidata: Ecosystems, Trust, and Usability
 
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable Data
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable DataLinked Art: Sustainable Cultural Knowledge through Linked Open Usable Data
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable Data
 
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open Data
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open DataIllusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open Data
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open Data
 
Structural Metadata in RDF (IS575)
Structural Metadata in RDF (IS575)Structural Metadata in RDF (IS575)
Structural Metadata in RDF (IS575)
 
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data Ecosystem
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data EcosystemSanderson CNI 2020 Keynote - Cultural Heritage Research Data Ecosystem
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data Ecosystem
 
Tiers of Abstraction and Audience in Cultural Heritage Data Modeling
Tiers of Abstraction and Audience in Cultural Heritage Data ModelingTiers of Abstraction and Audience in Cultural Heritage Data Modeling
Tiers of Abstraction and Audience in Cultural Heritage Data Modeling
 
The Importance of being LOUD
The Importance of being LOUDThe Importance of being LOUD
The Importance of being LOUD
 
Introduction to Linked Art Model
Introduction to Linked Art ModelIntroduction to Linked Art Model
Introduction to Linked Art Model
 
Standards and Communities: Connected People, Consistent Data, Usable Applicat...
Standards and Communities: Connected People, Consistent Data, Usable Applicat...Standards and Communities: Connected People, Consistent Data, Usable Applicat...
Standards and Communities: Connected People, Consistent Data, Usable Applicat...
 
Strong Opinions, Weakly Held
Strong Opinions, Weakly HeldStrong Opinions, Weakly Held
Strong Opinions, Weakly Held
 
IIIF Discovery Walkthrough
IIIF Discovery WalkthroughIIIF Discovery Walkthrough
IIIF Discovery Walkthrough
 
Linked Art: An Art Museum Profile for CIDOC-CRM
Linked Art: An Art Museum Profile for CIDOC-CRMLinked Art: An Art Museum Profile for CIDOC-CRM
Linked Art: An Art Museum Profile for CIDOC-CRM
 
Euromed2018 Keynote: Usability over Completeness, Community over Committee
Euromed2018 Keynote: Usability over Completeness, Community over CommitteeEuromed2018 Keynote: Usability over Completeness, Community over Committee
Euromed2018 Keynote: Usability over Completeness, Community over Committee
 
Linked Art - Our Linked Open Usable Data Model
Linked Art - Our Linked Open Usable Data ModelLinked Art - Our Linked Open Usable Data Model
Linked Art - Our Linked Open Usable Data Model
 
EuropeanaTech Keynote: Shout it out LOUD
EuropeanaTech Keynote: Shout it out LOUDEuropeanaTech Keynote: Shout it out LOUD
EuropeanaTech Keynote: Shout it out LOUD
 

Open Repositories 2014: Crowdsourced Transcription via IIIF

  • 1. Distributed Repositories and Crowd-Sourcing Transcription. Open Repositories 2014, Helsinki, Finland, 11th of June 2014. Distributed Repositories of Medieval Calendars and Crowd-Sourcing of Transcription. Rob Sanderson (azaroth42@gmail.com, azaroth@stanford.edu, t: @azaroth42), Stanford University; Ben Albritton, Stanford University; Doug Emery, University of Pennsylvania; Will Noel, University of Pennsylvania; Dot Porter, University of Pennsylvania. http://iiif.io/ This research was primarily funded by the Andrew W. Mellon Foundation.
  • 2. Image Repositories • Increase in digitization • Particularly precious, fragile, beautiful objects • Medieval Manuscripts • Digitized images online • Increasingly Open • At high resolution • Easy to capture an image • Very hard to capture the text http://gallica.bnf.fr/ark:/12148/btv1b8449691v/
  • 3. Calendars • Ubiquitous in liturgical books • e.g. Books of Hours • Structured and often tabular: Date, Day, Saint / Event • Content varies slightly • Variation details give us information about the provenance of the object • Much easier to transcribe • Good pilot project! http://www.e-codices.unifr.ch/en/bge/lat0033
  • 4. Collaborative Crowd Sourcing? • Meeting at U. Penn including content providers and scholars • Plan: • Collect transcriptions together • Analyze similarities between manuscripts for patterns of provenance • Manuscripts and images distributed: need a community to collect sufficient data http://brbl-dl.library.yale.edu/vufind/Record/3446275
  • 5-6. Micro Repository Rant: TEI • Most transcribing done in TEI • Terrible for this use case: • Single XML file • Single author • Single location • Hard to link to images • Tries to describe too much • Impossible to use once created • Creating TEI is good for: • The academic exercise of creating TEI http://www.thedigitalwalters.org/Data/WaltersManuscripts/html/W41/
  • 7. Requirements • Distributed image content • Consistent, rich API • Selection of regions • Base, not displayed size • Alignment of text with region • Distributed creation • Distributed curation • Multiple texts per region • Styling of the text • Some semantics http://oculus-dev.lib.harvard.edu/manifests/view/drs:5981093
  • 8. 1. Images: BNF next to Yale
  • 9. Open Technology: IIIF Image API • Base URL: {scheme}://{host}{/prefix}/{identifier} • Image Resource: {base}/{region}/{size}/{rotation}/{quality}.{format} http://iiif.io/api/image/1.1/
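The two URL templates on slide 9 can be filled in mechanically. The following is a minimal sketch, not code from the talk: the function names, the `example.org` host, the `iiif` prefix and the `ms/f1r` identifier are all hypothetical, and the defaults (`full`, `0`, `native`, `jpg`) are assumed from the Image API 1.1 document linked on the slide.

```python
from urllib.parse import quote


def iiif_base_url(scheme, host, identifier, prefix=""):
    """Fill the slide's Base URL template: {scheme}://{host}{/prefix}/{identifier}."""
    # Special characters in the identifier (such as "/") are percent-encoded
    # so that they cannot be confused with the path separators of the template.
    encoded = quote(identifier, safe="")
    prefix_part = "/" + prefix.strip("/") if prefix.strip("/") else ""
    return f"{scheme}://{host}{prefix_part}/{encoded}"


def iiif_image_url(base, region="full", size="full", rotation=0,
                   quality="native", fmt="jpg"):
    """Fill the Image Resource template: {base}/{region}/{size}/{rotation}/{quality}.{format}."""
    return f"{base}/{region}/{size}/{rotation}/{quality}.{fmt}"


if __name__ == "__main__":
    # Hypothetical server and identifier, for illustration only.
    base = iiif_base_url("http", "example.org", "ms/f1r", prefix="iiif")
    # Request the region x=100, y=200, w=300, h=400 at half size.
    print(iiif_image_url(base, region="100,200,300,400", size="pct:50"))
```

Because the region is expressed against the base image rather than the displayed size, the same request works regardless of the zoom level the viewer happens to show.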
  • 10. (Part of the) IIIF Community • ARTstor • Bibliothèque Nationale de France • Bodleian Libraries, Oxford University • British Library • C2MRF • Cambridge University • Cornell University • DPLA • Europeana • e-codices • Harvard University • Johns Hopkins University • National Library of Denmark • National Library of Poland • National Library of New Zealand • National Library of Norway • National Library of Wales • Princeton University • Stanford University • Wellcome Trust • UK National Archives • Yale University
  • 11-19. 2. Crowdsourced Box Drawing
  • 20. Open Technologies • Mirador • IIIF Community developed viewer • Stanford, Harvard, Yale, [LANL] • Zooming via OpenSeadragon • Princeton, and OSD committers • Jcrop • jQuery plugin for drawing little boxes • MongoDB • Store information via REST interface • W3C Media Fragment image segments • Trivially converted to IIIF Image API requests
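The last point on slide 20, that a W3C Media Fragment segment converts trivially to an IIIF Image API request, can be sketched as follows. This is an illustration under stated assumptions rather than the project's code: the function name and URLs are hypothetical, and it assumes the fragment's coordinates are already in the pixel space of the base image (that is, the canvas and the image have the same dimensions).

```python
import re

# xywh=x,y,w,h with an optional "pixel:" or "percent:" unit, as in W3C Media Fragments.
_XYWH = re.compile(r"xywh=(?:(pixel|percent):)?(\d+),(\d+),(\d+),(\d+)")


def fragment_to_iiif(target, image_base, size="full", rotation=0,
                     quality="native", fmt="jpg"):
    """Convert '<uri>#xywh=x,y,w,h' into an IIIF Image API request for that region."""
    _, _, fragment = target.partition("#")
    match = _XYWH.fullmatch(fragment)
    if match is None:
        raise ValueError(f"no xywh media fragment in {target!r}")
    unit, x, y, w, h = match.groups()
    region = f"{x},{y},{w},{h}"
    if unit == "percent":
        # The Image API spells percentage regions with a "pct:" prefix.
        region = "pct:" + region
    return f"{image_base.rstrip('/')}/{region}/{size}/{rotation}/{quality}.{fmt}"


if __name__ == "__main__":
    # Hypothetical canvas and image service URLs.
    print(fragment_to_iiif("http://example.org/canvas/f1r#xywh=10,20,300,40",
                           "http://example.org/iiif/f1r"))
```

If the canvas and image sizes differ, the coordinates would first need to be scaled, which this sketch deliberately leaves out.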
  • 21-26. Open Technologies
  • 27. Open Technology • Line/Column inspiration from TPEN (IIIF compliant) • Transcription tool developed at St. Louis • http://t-pen.org/TPEN/ • Line detection flaky, no internal columns
  • 28-29. Open Technologies • Inspiration from TPEN (IIIF compliant) • Transcription tool developed at St. Louis • http://t-pen.org/TPEN/ • Line detection flaky, no internal columns
  • 30. Boring (but Open) Metadata • Metadata collection to drive the analysis • Stored along with the segments • Defaults are normally correct • Custom extension, not intended for general purpose use • Convenient to do inline • Other metadata could be added • Could be done in a different workflow
  • 31-35. Metadata
  • 36. ...
  • 37. Metadata
  • 38. Open Technology: IIIF Presentation API • Text/Image Linking is a subset of a larger challenge: • Non-Text / Image Linking • Dynamic Images • No Image to link to • Multiple Images • Parts of Images • Parts of larger texts • Distributed images, texts and links • Need an indirection layer: • Solution: align text and image with an abstract Canvas http://iiif.io/api/presentation/1.0/
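The Canvas indirection on slide 38 means that neither the image nor the transcription points at the other; both are annotations that target the same abstract Canvas. The sketch below builds two such annotations as plain Python dictionaries. The property names are loosely modelled on Open Annotation and the IIIF Presentation context and should be checked against the specification before use; the function names, URIs and sample text are hypothetical.

```python
import json

# Context URL as shown on the deck's JSON-LD slide.
CONTEXT = "http://iiif.io/api/presentation/2/context.json"


def image_annotation(canvas_uri, image_uri):
    """An image associated with the whole Canvas."""
    return {
        "@context": CONTEXT,
        "@type": "oa:Annotation",
        "motivation": "sc:painting",
        "resource": {"@id": image_uri, "@type": "dctypes:Image"},
        "on": canvas_uri,
    }


def text_annotation(canvas_uri, xywh, text):
    """A transcription associated with one rectangular region of the Canvas."""
    x, y, w, h = xywh
    return {
        "@context": CONTEXT,
        "@type": "oa:Annotation",
        "motivation": "sc:painting",
        "resource": {"@type": "cnt:ContentAsText", "format": "text/plain", "chars": text},
        # The region is a media fragment on the Canvas, not on any particular image.
        "on": f"{canvas_uri}#xywh={x},{y},{w},{h}",
    }


if __name__ == "__main__":
    canvas = "http://example.org/canvas/f1r"  # hypothetical Canvas URI
    annotations = [
        image_annotation(canvas, "http://example.org/iiif/f1r/full/full/0/native.jpg"),
        text_annotation(canvas, (10, 20, 300, 40), "KL Januarius"),
        # A second reading of the same region can live alongside the first.
        text_annotation(canvas, (10, 20, 300, 40), "Kalendae Ianuarii"),
    ]
    print(json.dumps(annotations, indent=2))
```

Because each annotation only names the Canvas, the image, the transcriptions and the links between them can be created and hosted by different institutions, which is the distributed requirement listed earlier in the deck.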
  • 39-41. Open Technology: IIIF Presentation API http://iiif.io/api/presentation/1.0/
  • 42. Linked Data People... If you do not want to know the score, look away now!
  • 43. Linked Data People... { "it's" : "just JSON" }
  • 44. Web Developers... If you do not want to know the score, look away now!
  • 45. Web Developers... <_:it's> <_:all> <_:Linked_Data>;
  • 46. Micro Repository Rant 2: RDF Serialization • “RDF/XML was the Semantic Web’s 3 Mile Island incident” -- Manu Sporny, http://manu.sporny.org/2012/nuclear-rdf/ • Or … RDF – Not in my back yard! • Serializing a graph is, admittedly, hard • RDF/XML is terrible, and too many others • Web currently uses JSON as convenient transfer syntax • JSON-LD allows transfer of RDF in syntax that does not require full RDF stack, just a JSON implementation • … as available in every web browser • Rob's Conclusion: Require JSON-LD • http://json-ld.org/
  • 47. JSON-LD Context Magic
        { // Canvas resource
          "@context":"http://iiif.io/api/presentation/2/context.json",
        @context provides mapping for JSON keys into RDF.
          "sc":"http://www.shared-canvas.org/ns/",
          "oa":"http://www.w3.org/ns/oa#",
          "service":{"@type":"@id", "@id":"sioc_svcs:has_service"},
          "height":{"@type":"xsd:integer", "@id":"exif:height"},
          "sequences":{"@type":"@id", "@id":"sc:hasSequences", "@container":"@list"}
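To make the mapping on slide 47 concrete, the toy below expands plain JSON keys into full IRIs using a small context dictionary. It is a deliberately simplified sketch and not a JSON-LD processor: it only handles top-level keys and prefix expansion, and it ignores `@type` coercion and `@container`. The `sc`, `oa`, `height` and `sequences` entries come from the slide; the `exif` namespace IRI and the function names are assumptions added for illustration.

```python
CONTEXT = {
    "sc": "http://www.shared-canvas.org/ns/",
    "oa": "http://www.w3.org/ns/oa#",
    # Assumed namespace for the "exif:" prefix; it is not spelled out on the slide.
    "exif": "http://www.w3.org/2003/12/exif/ns#",
    "height": {"@type": "xsd:integer", "@id": "exif:height"},
    "sequences": {"@type": "@id", "@id": "sc:hasSequences", "@container": "@list"},
}


def expand_term(term, context):
    """Map a JSON key to the IRI its context entry stands for."""
    definition = context.get(term, term)
    iri = definition["@id"] if isinstance(definition, dict) else definition
    prefix, sep, local = iri.partition(":")
    namespace = context.get(prefix)
    if sep and isinstance(namespace, str):
        # Compact IRI such as "exif:height": swap the prefix for its namespace.
        return namespace + local
    return iri  # already absolute, or unknown to this context


def expand_keys(document, context):
    """Expand the top-level keys of a JSON object (nested values are left alone)."""
    return {expand_term(key, context): value for key, value in document.items()}


if __name__ == "__main__":
    canvas = {"height": 1000, "sequences": ["http://example.org/seq/1"]}
    for iri, value in expand_keys(canvas, CONTEXT).items():
        print(iri, "->", value)
```

The point of the slide survives even in this toy: a web developer reads and writes `height`, while a Linked Data consumer sees the same document as statements about an `exif` property.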
  • 48. Open Technologies: REST • Experimental IIIF REST specification • http://iiif.io/api/annex/rest/ • For both Presentation and Image • Trivial Python/WSGI handler • Processes @context and generates identities • Stores in MongoDB (but API is agnostic) • Follows IIIF Presentation and Open Annotation • http://www.w3.org/community/openannotation/ • Returns the correct JSON-LD • Doesn't fully handle image upload yet
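Slide 48 describes a trivial Python/WSGI handler that generates identities and stores the posted JSON-LD. The deck does not show that handler, so the following is only a guess at its general shape: an in-memory dictionary stands in for MongoDB, `BASE` is a hypothetical deployment URL, `@context` processing is omitted, and the `call` helper exists purely to exercise the handler without a running server.

```python
import io
import json
import uuid

BASE = "http://localhost:8000"  # hypothetical public base URL of the service
STORE = {}                      # in-memory stand-in for the MongoDB collection


def _respond(start_response, status, document, extra_headers=()):
    body = json.dumps(document).encode("utf-8")
    headers = [("Content-Type", "application/ld+json"),
               ("Content-Length", str(len(body)))]
    start_response(status, headers + list(extra_headers))
    return [body]


def application(environ, start_response):
    """POST stores a JSON document under a newly minted @id; GET returns it."""
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO", "/")
    if method == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        try:
            document = json.loads(environ["wsgi.input"].read(length).decode("utf-8"))
        except ValueError:
            return _respond(start_response, "400 Bad Request", {"error": "invalid JSON"})
        if not isinstance(document, dict):
            return _respond(start_response, "400 Bad Request", {"error": "expected an object"})
        # Generate the identity: the collection path plus a random identifier.
        new_path = path.rstrip("/") + "/" + uuid.uuid4().hex
        document["@id"] = BASE + new_path
        STORE[new_path] = document
        return _respond(start_response, "201 Created", document,
                        [("Location", document["@id"])])
    if method == "GET" and path in STORE:
        return _respond(start_response, "200 OK", STORE[path])
    return _respond(start_response, "404 Not Found", {"error": "not found"})


def call(method, path, body=b""):
    """Invoke the handler directly with a minimal hand-built WSGI environ."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"REQUEST_METHOD": method, "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(body)), "wsgi.input": io.BytesIO(body)}
    chunks = application(environ, start_response)
    return captured["status"], json.loads(b"".join(chunks))


if __name__ == "__main__":
    status, created = call("POST", "/annotations", b'{"@type": "oa:Annotation"}')
    print(status, created["@id"])
```

Keeping the storage behind a dictionary-like interface mirrors the slide's remark that the API is agnostic about the backing store.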
  • 49. The Future is Now • IIIF Image API 2.0 • Request for Comment period open! • http://iiif.io/api/image/2.0/ • IIIF Presentation API 2.0 • Ditto! • http://iiif.io/api/presentation/2.0/ • Please give us feedback: iiif-discuss@googlegroups.com • Ongoing work with U.Penn to make a more robust system
  • 50. Thank You • Rob Sanderson • azaroth42@gmail.com • azaroth@stanford.edu • t: @azaroth42 • Stanford University • http://iiif.io/ • iiif-discuss@googlegroups.com