CROWDSOURCING IN THE DIGITALKOOT PROJECT
Majlis Bremer-Laamanen
IMPACT, 24TH OF OCTOBER, 2011

Microtask.com:
Digitalkoot: Making Old Archives Accessible Using Crowdsourcing, by Otto Chrons and Sami Sundell
Discussions Managing Director Harri Holopainen
harri@microtask.com
The Centre for Preservation and Digitisation: statistics

• Established in 1990
• Digitisation started in 1998
• Over 50 employees
• Yearly average (past three years):
    • Microfilm production: 1.3 million exposures
• Digitisation: 1.3 million pages
• Audio digitisation and cataloguing: music, 1,300 unique cassettes and the sleeves
• Conservation: 10,000-15,000 units
ENRICHING CONTENT
                (http://digi.nationallibrary.fi, http://www.doria.fi/handle/10024/4194)

• Newspapers -> 2 million pages, the Historical Newspaper Library
• Journals -> 2.7 million pages, free to 1910, in all legal deposit libraries to 1944
• Books -> travel, novels, dissertations (17th century), Save the Book
• Ephemera -> industrial price lists
• Sound -> national sound archive, C-cassettes
• Interest groups: the creators, users, contributors of the material
Context for mass digitisation and crowdsourcing

[Process diagram: Client / Accessibility. Centre for Preservation and Digitisation: Transferring physical objects -> Preparation for digitisation -> Digitisation -> Post-processing -> Temporary storage for digitised objects -> Physical objects retrieval]


  Mass digitisation activities in the most cost-effective manner:
  Newspapers, books, journals, ephemera, audio:
  •     Logistics for physical items
  •     Process for digital objects: network services and long-term preservation
  •     Metadata (METS, ALTO): captured throughout the process
        •   Metadata development: User experience and crowdsourcing
  •     Customizing of the tracking systems (CCS, Item Tracking, Scan Client)
  •     Operational environment: scaling architecture and implementation
DIGITALKOOT
DIGI = TO DIGITISE
TALKOOT = PEOPLE GATHERING TO WORK TOGETHER
VOLUNTARILY (WITHOUT PAYMENT)

FIRST EXPERIENCE 2011:
DIGITALKOOT: correction of OCR by gamification, turning useful
activities into games: "THE MOLE HUNT" by Microtask.com.
   – People can spend hours on games
   – Activities can be rewarded with scores, achievements and social benefits


From February 8th to September 15th, 2011: about 80,000
visitors, 4,000 hours of effective game time, and more than 5 million
tasks.
CHALLENGES

Meaningful tasks without breaking the flow of the game

Real-time feedback – many simultaneous players doing
the same task

Build a bridge to save the moles from falling down =>
   – Correct typing adds a block to the bridge
   – Incorrect typing is punished by an explosion
DIGITALKOOT: Mole Hunt
Right or wrong?
DIGITALKOOT: Mole Bridge
A bridge has been built…
To the next level?
Changing sceneries
When a mole falls
Incorrect answer exploding
GAMIFICATION CHALLENGES
Balancing game play elements with task completion speed and
accuracy

Keep the motivation of people and enlarge the audience

Introduction of meaningful tasks into the game without breaking
game play mechanisms

Instant feedback on players' actions (simultaneous players)
• Pressure to adapt to varying feedback situations/latencies
POSITIVE EFFECT OF VERIFICATION

"The wisdom of the crowds"
   • includes answers from possible spammers

Game start: verification tasks only

Accurate work shown => verification lowered in phases, never zero

Verification tasks are created automatically:
    • A randomly selected task is sent to several players: all have to
       agree on the result => verification task
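
The rule above (a randomly selected task goes to several players, and it becomes a verification task only if all of them agree) can be sketched roughly as follows. This is a minimal illustration, not Microtask's implementation: the function name `promote_to_verification` and the `min_players=3` threshold are assumptions, since the slide only says "several players".

```python
def promote_to_verification(answers, min_players=3):
    """Return the agreed text if every player typed the same thing, else None.

    `min_players` is a hypothetical threshold; the slide only says "several".
    """
    if len(answers) < min_players:
        return None  # not enough independent answers yet
    # Ignore surrounding whitespace so " talkoot" and "talkoot" count as equal.
    distinct = {answer.strip() for answer in answers}
    if len(distinct) == 1:
        return distinct.pop()  # unanimous: the answer is now "known"
    return None  # any disagreement keeps the task out of the verification pool


if __name__ == "__main__":
    print(promote_to_verification(["talkoot", "talkoot", "talkoot"]))
    print(promote_to_verification(["talkoot", "talkoot", "ta1koot"]))
```

Requiring unanimity rather than a majority keeps a single careless or spamming player from creating a wrong "known" answer.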
VERIFICATION OF THE OCR

Players and their pace cannot be synchronized.

Verification tasks in the task stream:
• The share fed to players varies according to the number of active players
• The system knows the answer: the game play is improved by fast
feedback
• Downside: no new information produced
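
One way to picture how verification tasks are mixed into a player's task stream is the sketch below. It models only the per-player phasing described earlier (verification only at game start, lowered in phases, never zero). The phase thresholds and shares in `PHASES` are invented for illustration, and the dependence on the number of active players is not modelled because the slides do not say how it works.

```python
import random

# Hypothetical phases: (correct verification answers so far, share of
# verification tasks in the stream). The last share stays above zero.
PHASES = [(0, 1.0), (20, 0.5), (100, 0.2), (500, 0.05)]


def verification_share(correct_verifications):
    """Return the share of verification tasks for a player's current phase."""
    share = PHASES[0][1]
    for threshold, phase_share in PHASES:
        if correct_verifications >= threshold:
            share = phase_share
    return share


def next_task_kind(correct_verifications, rng):
    """Pick 'verification' (known answer, instant feedback) or 'new' (real work)."""
    if rng.random() < verification_share(correct_verifications):
        return "verification"
    return "new"


if __name__ == "__main__":
    rng = random.Random(2011)
    print([next_task_kind(150, rng) for _ in range(10)])
```

The trade-off matches the downside noted above: every verification task gives fast feedback but produces no new information, so the share is reduced only once a player has shown accurate work.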
USERS: February 8th to March 31st, 2011

31,816 visitors, 4,768 players, 2,740 hours of game time, 2.5 million
tasks.

1% via Internet, 99% via Facebook

Half of the users were men.

Game time: from seconds to over 100 hours (altogether).
Median time: 9 minutes.
Women: over 13 minutes and 54% of the tasks
The hardest-working top 4 were all men
ACCURACY

OCR system 0.8 confident about accuracy => human correction in 30%

Random selection of 2 articles:
• 1,467 words: Digitalkoot result only 14 mistakes, against 228 in the OCR
• 516 words: Digitalkoot result 1 mistake, against 118 in the OCR
• >> well over 99% possible by gamification

Spammer play:
  • One player (1.5 hours, 5,692 tasks) was detected by the verification
  system, and only 4 tasks were accepted
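
The figures above can be turned into word-level accuracy with a short calculation. The sketch assumes that each reported mistake corresponds to one wrong word and that "228 OCR" and "118 OCR" are the mistake counts in the raw OCR output; the helper name `word_accuracy` is illustrative.

```python
def word_accuracy(words, mistakes):
    """Share of words that are correct, assuming one mistake = one wrong word."""
    return 1 - mistakes / words


# (words, mistakes in raw OCR, mistakes after Digitalkoot), from the two articles
articles = [(1467, 228, 14), (516, 118, 1)]

for words, ocr_mistakes, game_mistakes in articles:
    print(
        f"{words} words: OCR {word_accuracy(words, ocr_mistakes):.1%}, "
        f"Digitalkoot {word_accuracy(words, game_mistakes):.1%}"
    )

# Both articles together: 15 mistakes in 1,983 words.
total_words = sum(a[0] for a in articles)
total_game_mistakes = sum(a[2] for a in articles)
combined = word_accuracy(total_words, total_game_mistakes)
print(f"Combined Digitalkoot accuracy: {combined:.2%}")

# Spammer play: 4 accepted tasks out of 5,692 submitted.
spam_acceptance = 4 / 5692
print(f"Spammer tasks accepted: {spam_acceptance:.2%}")
```

Under these assumptions the raw OCR is roughly 84% and 77% accurate on the two articles, while the corrected text is above 99% on both.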
Enriching Digitisation Production Processes, METS Profiles: a new development platform

[Diagram: from physical collections to comprehensive digital collections]
• Source material (physical collections): Newspapers, Serials, Books, Parchments, Notes, Maps, Audio
• Process stages: Cataloguing, Scanning, Post-processing
• Descriptive metadata: MARC21/MODS (two bibliographic records)
• Administrative/technical metadata: MIX/PREMIS
• Structural metadata: METS, ALTO
• Level of mark-up in the digital resource: Articles, Illustrations, Poems
• Comprehensive digital collections: standards and OAI-PMH compliant METS SIP packages (METS export)
• Packages include: JPEG2000, OCR TXT as ALTO XML, PDF, JPEG(150), METSXML, MARCXML
IN THE MEDIA

- Until March 31st, over 30 articles all around the world: New York
Times…

- Television appearances ongoing

- Helsingin Sanomat: HS talkoot, using the National Library's
digitised newspaper material (Historical Newspaper Library) >
advertising Digitalkoot, e.g. September 15th

- Influenced user interest
        => stabilisation to 300 individual users per week
NEXT
1) Marking of articles and/or
   images
2) Indexing articles and/or
   images
KUVATALKOOT
Goal: sophisticated user experience

Collections discovery and reuse of digital content by
researchers and people at large:

   Researchers will get better
   systematic coverage of
   images and articles in
   published printed material.

[Image: Luonnon-kirja ala-alkeiskouluin tarpeeksi / Z. Topelius, 1868]

IMPACT Final Conference - Majlis Bremer Laamanen

  • 1. CROWDSOURCING IN THE DIGITALKOOT PROJECT Majlis Bremer-Laamanen IMPACT 24TH OF OCTOBER, 2011 Microtask.com: Digitalkoot: Making Old Archives Accessible Using Crowdsourcing by Otto Chrons and Sami Sundell, Discussions Managing Director Harri Holopainen harri@microtask.com
  • 2. The Centre for Preservation and Digitisation: statistics • Established in 1990 • Digitisation started in 1998 • Over 50 employees • Yearly average (past three years): • Microfilm production: 1.3 million exposures • Digitisation: 1.3 million pages • Audio digitisation and cataloguing: music, 1,300 unique cassettes and the sleeves • Conservation: 10,000-15,000 units
  • 3. ENRICHING CONTENT (http://digi.nationallibrary.fi, http://www.doria.fi/handle/10024/4194) • Newspapers -> 2 million pages, the Historical Newspaper Library • Journals -> 2.7 million pages, free to 1910, in all legal deposit libraries to 1944 • Books -> travel, novels, 17th-century dissertations, Save the Book • Ephemera -> industrial price lists • Sound -> national sound archive, C-cassettes • Interest groups: the creators, users, contributors of the material
  • 4. Context for mass digitisation and crowdsourcing [Diagram: Client; Accessibility; Centre for Preservation and Digitisation workflow: Transferring Physical Objects, Preparation for Digitisation, Digitisation, Post-processing, Temporary storage for digitised objects, Physical objects Retrieval] Mass digitisation activities in the most cost-effective manner: newspapers, books, journals, ephemera, audio: • Logistics for physical items • Process for digital objects: network services and long-term preservation • Metadata METS/ALTO: capturing through the process • Metadata development: user experience and crowdsourcing • Customizing of the tracking systems (CCS, Item Tracking, Scan Client) • Operational environment: scaling architecture and implementation
  • 5. DIGITALKOOT DIGI = TO DIGITISE, TALKOOT = PEOPLE GATHERING TO WORK TOGETHER VOLUNTARILY (WITHOUT PAYMENT) FIRST EXPERIENCE 2011: DIGITALKOOT: correction of OCR by gamification, turning useful activities into games. "THE MOLE HUNT" by Microtask.com. • People can spend hours on games • Turning useful activities into games • Activities can be rewarded with scores, achievements and social benefits. From February 8th to September 15th, 2011: about 80,000 visitors, 4,000 hours of effective game time. More than 5 million tasks.
  • 6. CHALLENGES • Meaningful tasks without breaking the flow of the game • Real-time feedback: many simultaneous players doing the same task • Build a bridge to save the moles from falling down => • Correct typing adds a block to the bridge • Incorrect typing is punished by an explosion
  • 10. A bridge has been built…
  • 11. To the next level?
  • 13. When a mole falls
  • 15. GAMIFICATION CHALLENGES • Balancing game play elements with task completion speed and accuracy • Keeping people motivated and enlarging the audience • Introduction of meaningful tasks into the game without breaking game play mechanisms • Instant feedback on players' actions (simultaneous players) • Pressure to adapt to varying feedback situations/latencies
  • 16. POSITIVE EFFECT OF VERIFICATION "The wisdom of the crowds" • Includes answers from possible spammers • Game start: verification tasks only • Accurate work shown => verification lowered in phases, never to zero • Verification tasks are created automatically: a randomly selected task is sent to several players; if all agree on the result => it becomes a verification task
  • 17. VERIFICATION OF THE OCR Players and their pace cannot be synchronized. Verification tasks are added to the task stream: • The amount fed to players varies according to the number of active players • The system knows the answer: game play is improved by fast feedback • Downside: no new information is produced
  • 18. USERS: February 8th to March 31st, 2011: 31,816 visitors, 4,768 players, 2,740 hours of game time, 2.5 million tasks. 1% via Internet, 99% via Facebook. Half of the users were men. Game time per player: from seconds to over 100 hours (altogether). Median time: 9 minutes. Women: more than 13 minutes and 54% of the tasks. The hardest-working top 4 were all men.
  • 19. ACCURACY OCR system 0.8 confidence about accuracy => human correction in 30%. Random selection of 2 articles: • 1,467 words: Digitalkoot result only 14 mistakes / 228 in the OCR • 516 words: Digitalkoot result 1 mistake / 118 in the OCR • >> well over 99% possible by gamification. Spammer play: • One player (1.5 hours, 5,692 tasks) was detected by the verification system and only 4 tasks were accepted
  • 20. Enriching Digitisation Production Processes, METS Profiles: a new development platform [Diagram, from physical collections to digital collections: SOURCE MATERIAL (newspapers, serials, books, parchments, notes, maps, audio) -> CATALOGUING (bibliographic records; descriptive metadata MARC21/MODS) -> SCANNING (administrative/technical metadata MIX/PREMIS) -> POST PROCESSING (structural metadata, METS/ALTO compliant) -> METS EXPORT (METS SIP packages; standards & OAI-PMH) -> DIGITAL RESOURCE with a comprehensive level of mark-up (articles, illustrations, poems). Packages include: JPEG2000, OCR TXT as ALTO XML, PDF, JPEG(150), METSXML, MARCXML]
  • 21. IN THE MEDIA • Until March 31st, over 30 articles all around the world: New York Times… • Television appearances ongoing • Helsingin Sanomat: HS talkoot using the National Library's digitised newspaper material • Historical Newspaper Library > advertising Digitalkoot, e.g. September 15th • Influenced user interest => stabilisation to 300 individual users per week
  • 22. NEXT 1) Marking of articles and/or images 2) Indexing articles and/or images
  • 23. KUVATALKOOT Goal: sophisticated user experience. Collections discovery and reuse of digital content by researchers and people at large: researchers will get better systematic coverage of images and articles in published printed material. [Image: Luonnon-kirja ala-alkeiskouluin tarpeeksi / Z. Topelius, 1868]
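
The verification scheme on slides 16 and 17 and the accuracy figures on slide 19 can be sketched roughly as follows. This is an illustration only, not Microtask's implementation: the phase table, its thresholds and shares, and all function names are hypothetical. The slides state only that players start with verification tasks alone, that the share is lowered in phases but never to zero, and that a task becomes a verification task when several players all agree on it. The accuracy calculation assumes that "14 mistakes / 228 OCR" compares the mistakes left after Digitalkoot with the mistakes in the raw OCR.

```python
from typing import Optional, Sequence

# Hypothetical phase table: (correct verification answers so far, share of
# verification tasks in the player's task stream). The slides give no numbers;
# they only say the share starts at 100% and never drops to zero.
PHASES = [(0, 1.0), (10, 0.5), (50, 0.25), (200, 0.1)]


def verification_share(correct_verifications: int) -> float:
    """Share of verification tasks for a player with this track record."""
    share = PHASES[0][1]
    for threshold, phase_share in PHASES:
        if correct_verifications >= threshold:
            share = phase_share
    return share


def promote_to_verification(answers: Sequence[str]) -> Optional[str]:
    """Return the agreed answer if several players all typed the same text.

    A task answered identically by every player can be reused later as a
    verification task with a known answer; otherwise return None.
    """
    if len(answers) < 2:
        return None
    first = answers[0]
    return first if all(answer == first for answer in answers) else None


def word_accuracy(mistakes: int, words: int) -> float:
    """Fraction of words that are correct."""
    return 1 - mistakes / words


if __name__ == "__main__":
    print(verification_share(0))    # new player: verification tasks only
    print(verification_share(500))  # trusted player: lowest share, above zero
    print(promote_to_verification(["talkoot", "talkoot", "talkoot"]))
    print(promote_to_verification(["talkoot", "talkoat", "talkoot"]))
    # (words, OCR mistakes, Digitalkoot mistakes) for the two sampled articles
    for words, ocr, crowd in [(1467, 228, 14), (516, 118, 1)]:
        print(f"{words} words: OCR {word_accuracy(ocr, words):.1%}, "
              f"Digitalkoot {word_accuracy(crowd, words):.1%}")
```

Under that reading of slide 19, both sampled articles move from roughly 77-85% word accuracy in the raw OCR to about 99% or better after crowd correction.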