Handwritten Text Recognition:
Key concepts
PD Dr. Roger Labahn
Computational Intelligence Technology Lab
Mathematical Optimization Group
Institute for Mathematics
University of Rostock
co:op Convention | READ Kickoff 19.01.2016
Handwritten Text Recognition: Key concepts
Introduction
Concepts – Problems – Tasks
Recognition & Training
Interpretation – Decoding
Epilog
co:op Convention | READ Kickoff HTR Key Concepts | Introduction Roger Labahn | C I T lab
Framework – Workflow
...
• Application: keyword search, transcription, ...
OUT: textual information (words, positions, ...) with alternatives & confidences
⇑
• HTR-Engine
⇑
IN: writing images (lines, words, table cells, form fields, ...)
• Layout Analysis: ..., text blocks
...
Alternative recognition strategies
Topological methods
• learn & read graphical substructures of writings
• arcs, lines, curves, holes, ...
HMM based methods
• Hidden Markov Models
• learn & read states while traversing the writing
RNN based methods
• Recurrent Neural Networks
Recognition Engine (C I T lab & MoU partner PLANET)
Decoding ⇒ textual output
• textual interpretation of recognition results
• matching external requirements / knowledge (dictionaries, language model, ...)
⇑
Recognition ⇒ recognition matrix
• recognition information from image information
• processing the standardized writing image
⇑
Writing preprocessing ⇒ standardized writing
• corrections & normalizations
• e.g.: baseline, slant, height, ...
Introduction
Concepts – Problems – Tasks
Segmentation
Context
Language
HTR
Recognition & Training
Interpretation – Decoding
Epilog
Segmentation ? NONE !
(classical) OCR = Optical Character Recognition
• reading single characters ⇒ sub-images per character ! ?
• B a d ␣ D o ??? a n
Segmentation-free reading
• processing the entire writing image: word ... line ...
• scanning information ⇒ data sequence (signal) / character sequence
• BB.ad␣.D
Image context is essential !
Single segment without context
• u ?? OR n ??
• virtually not (sufficiently) readable
Character sequence without context
• ???
• virtually not (sufficiently) explainable
Language context is essential !
Free reading – no restrictions on possible reading results
• BB.ad␣DDolo.auu
• application: figures & general numbers, ...
Comparison against dictionary or keyword
• task:
  • Read a German city name from a given list !
  • Find the name Bad Doberan !
• Bad Doberan
• goal: optimal / possible correspondence
  writing / reading result ⇐⇒ dictionary entry / keyword
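
A minimal sketch of how a free reading such as BB.ad␣DDolo.auu could be turned into a plain character string. It assumes, which the slide does not state, that "." marks a no-character channel and that a run of identical labels belongs to one written character, as in CTC-style best-path decoding. The name `collapse_reading` is hypothetical.

```python
from itertools import groupby

BLANK = "."  # assumption: '.' is the "no character" channel


def collapse_reading(raw: str, blank: str = BLANK) -> str:
    """Merge runs of identical labels, then drop the blank label."""
    merged = (label for label, _run in groupby(raw))
    return "".join(label for label in merged if label != blank)


if __name__ == "__main__":
    # Free reading from the slide; prints Bad␣Doloau
    print(collapse_reading("BB.ad␣DDolo.auu"))
```

Under these assumptions the result is close to, but not the same as, Bad Doberan, which is where the comparison against a dictionary or keyword comes in.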
OCR ? HTR !
new paradigm – new concepts ⇒ new term !
• HTR = Handwritten Text Recognition
• ATR = Automatic Text Recognition
• ... ???
Introduction
Concepts – Problems – Tasks
Recognition & Training
Feature extraction
Writing processing
Neural Network
Parameter training
Interpretation – Decoding
Epilog
From pixel values to features
original grey image
Filtering
Collect & remember context !
Writing processing
• scanning in different directions ⇒ data sequences (signals)
Information memory
• neural networks with complex neurons (cells)
• recurrent connections ⇒ memory
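
A minimal sketch of why a recurrent connection acts as memory. It uses a single tanh cell with scalar weights, which is far simpler than the complex cells the slides refer to; the function names and weight values are illustrative assumptions, not the engine's actual design.

```python
import math


def run_recurrent_cell(signal, w_in=1.0, w_rec=0.5, bias=0.0):
    """Feed a 1-D signal through one tanh cell with a recurrent connection."""
    state = 0.0  # the remembered context
    outputs = []
    for x in signal:
        # The new state depends on the current input AND the previous state.
        state = math.tanh(w_in * x + w_rec * state + bias)
        outputs.append(state)
    return outputs


def scan_both_directions(signal, **weights):
    """Scan left-to-right and right-to-left; align both outputs by position."""
    forward = run_recurrent_cell(signal, **weights)
    backward = run_recurrent_cell(signal[::-1], **weights)[::-1]
    return forward, backward


if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0]
    print(run_recurrent_cell(impulse))             # impulse still echoes later
    print(run_recurrent_cell(impulse, w_rec=0.0))  # no recurrence, no memory
```

With `w_rec=0.0` the output at a position depends only on the input at that position; with a non-zero recurrent weight, earlier inputs still influence later outputs, and the backward scan supplies context from the other side.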
Complex cells – memory by recurrent connections
[Figure: complex cell with recurrent connections]
Hierarchical Neural Networks
From feature input to network output
(Figure from GRAVES, SCHMIDHUBER: Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks)
Parameter training: Machine Learning
Theory
• objective: optimally adapt parameters in cells & along network connections
• idea: train the network with learning data samples
• optimization: minimize error (network output vs. sample target) over training data
Practice: impression of large application cases
• 10^4 network cells
• 10^6 trainable parameters
• 10^4 learning data samples (writing images)
• 150 training epochs, each processing every sample once
• 4 weeks of training from scratch
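
A minimal sketch of the training idea: adapt parameters so that the error between output and sample target shrinks, one epoch being one pass over every sample. A two-parameter linear model stands in for the network here; the model, the learning rate and the toy samples are assumptions for illustration only.

```python
def mean_squared_error(samples, w, b):
    """Average squared difference between model output and sample target."""
    return sum((w * x + b - target) ** 2 for x, target in samples) / len(samples)


def train(samples, epochs=150, learning_rate=0.05):
    """Stochastic gradient descent; each epoch processes every sample once."""
    w, b = 0.0, 0.0  # trainable parameters
    for _ in range(epochs):
        for x, target in samples:
            error = (w * x + b) - target  # model output vs. sample target
            # Step against the gradient of the squared error for this sample.
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b


if __name__ == "__main__":
    toy_samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # target = 2x + 1
    w, b = train(toy_samples)
    print(w, b, mean_squared_error(toy_samples, w, b))
```

The real setting differs mainly in scale: millions of parameters instead of two, and writing images with their correct text instead of number pairs.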
Learning data ...
• ... labeled training samples = ground truth
  HTR: writing images with correct text
• ... the more the better ... BUT:
  start with a realistic (reasonable) number, improve while working
• ... represent all project data ... BUT:
  start with HTR (networks) from similar collections & corpora
• ... contribute to general HTR engine improvement:
  put into a network repository for specific application cases
Introduction
Concepts – Problems – Tasks
Recognition & Training
Interpretation – Decoding
Network output
Decoding
Epilog
Channel probabilities
Pre-conditions
• (abstract) alphabet of (abstract) characters
• text composed of exactly these characters
• alphabet characters ⇐⇒ network output neurons (channels)
• example: digits, uppercase letters, lowercase letters, special characters ␣ -
• much more general: any symbol unit learnable from training data
• current (large) application case: up to 150 character channels
• independent of (natural) language – reading/writing direction – understanding
Network output
probability of (character) channel at writing (image) position
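
A minimal sketch of what "probability of channel at position" could look like in code. The slides do not say how the network's raw outputs are normalized; a softmax per writing position is assumed here, and the channel list is taken from the confidence matrix slide. The function names and the input numbers are hypothetical.

```python
import math

# Channels as listed on the confidence matrix slide.
CHANNELS = [".", "B", "D", "a", "d", "l", "o", "u", "␣"]


def softmax(scores):
    """Turn raw channel activations into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def channel_probabilities(activations):
    """One probability row per writing position (assumed normalization)."""
    return [softmax(position) for position in activations]


def most_likely_channel(probabilities):
    """Name of the channel with the highest probability at one position."""
    best = max(range(len(CHANNELS)), key=lambda i: probabilities[i])
    return CHANNELS[best]


if __name__ == "__main__":
    # Made-up activations for two positions, one value per channel.
    raw = [[0.1, 4.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
           [3.0, 0.5, 0.2, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0]]
    for row in channel_probabilities(raw):
        print(most_likely_channel(row), [round(p, 3) for p in row])
```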
Confidence Matrix – recognition / perception matrix
. B D a d l o u ␣
Expression matching
• restrict to
  permissible words ⇒ dictionary
  keyword(s), construct(s) ⇒ regular expression
• consider
  character confidences ⇒ probability measure
  or their negative logarithms ⇒ distance measure
Algorithmic method
• compare confidence matrix against any permissible expression
• use extremely fast algorithm: Dynamic Programming
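
A minimal sketch of comparing a confidence matrix against one permissible word by dynamic programming. The slides name the technique but not the exact algorithm, so this is an assumed CTC-style alignment: every writing position emits either a character of the word or a no-character label ".", and the distance is the sum of negative log confidences along the best alignment. `match_distance` and the toy matrix are hypothetical.

```python
import math

BLANK = "."  # assumption: '.' is the "no character" channel


def match_distance(matrix, word, blank=BLANK):
    """Smallest sum of -log(confidence) over all alignments of word.

    matrix: non-empty list of dicts, one per writing position,
            mapping channel -> probability.
    """
    def cost(column, label):
        return -math.log(max(column.get(label, 0.0), 1e-12))

    # Interleave the word with optional blanks: . w1 . w2 . ... wn .
    labels = [blank]
    for ch in word:
        labels += [ch, blank]

    best = [math.inf] * len(labels)
    best[0] = cost(matrix[0], labels[0])
    if len(labels) > 1:
        best[1] = cost(matrix[0], labels[1])

    for column in matrix[1:]:
        new = [math.inf] * len(labels)
        for s, label in enumerate(labels):
            prev = best[s]                      # stay on the same label
            if s >= 1:
                prev = min(prev, best[s - 1])   # advance by one label
            if s >= 2 and label != blank and label != labels[s - 2]:
                prev = min(prev, best[s - 2])   # skip the optional blank
            if prev < math.inf:
                new[s] = prev + cost(column, label)
        best = new

    # A valid alignment ends on the last character or the trailing blank.
    return min(best[-2:])


if __name__ == "__main__":
    toy = [{"a": 0.8, "b": 0.1, ".": 0.1},
           {"a": 0.2, "b": 0.2, ".": 0.6},
           {"a": 0.2, "b": 0.7, ".": 0.1}]
    for candidate in ("ab", "ba"):
        print(candidate, round(match_distance(toy, candidate), 3))
```

The work per word grows with (number of positions) × (word length), which is one reason a dynamic programming approach can be fast even against a large dictionary.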
Decoding
Objective – Result
• permissible expression(s) with best matching to recognition output
• best matching ⇐⇒ maximal probability ⇐⇒ minimum distance
• best alternatives ranked by measure (probability / distance)
Practice: impression of actual application cases
• only decoding on pre-processed lines
• searching 1 keyword in 10,500 lines (433 pages): 2 - 3 sec. average
• reading 1 page against 11,650 word dictionary: 8 - 9 sec. average
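
A minimal sketch of the equivalence between maximal probability and minimum distance, and of ranking alternatives by that measure. The candidate names and probability values are invented for illustration; `rank_alternatives` is a hypothetical helper, not part of any engine described here.

```python
import math


def rank_alternatives(probabilities, top_n=3):
    """Return (expression, distance) pairs, best match first.

    distance = -log(probability), so the largest probability
    always yields the smallest distance.
    """
    scored = [(expr, -math.log(p)) for expr, p in probabilities.items() if p > 0.0]
    scored.sort(key=lambda item: item[1])
    return scored[:top_n]


if __name__ == "__main__":
    # Invented matching probabilities for three dictionary entries.
    candidates = {"Bad Doberan": 0.62, "Bad Dürkheim": 0.05, "Bad Honnef": 0.01}
    for expression, distance in rank_alternatives(candidates):
        print(f"{expression}: distance {distance:.2f}")
```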
Dynamic Programming
Introduction
Concepts – Problems – Tasks
Recognition & Training
Interpretation – Decoding
Epilog
Results
from C I T lab’s contribution to ICDAR’s HTRtS-2015 contest
WER = 0%, CER = 0%
  reference:  who has with or without right the temporary possession of it : and
  hypothesis: who has with or without right the temporary possession of it : and
WER = 17%, CER = 4%
  reference:  operation of this act is spent upon Titius only ,
  hypothesis: operation of this act isspeut upon Titius only ,
WER = 67%, CER = 52%
  reference:  of the said first issue : the amount of such second consequently gap/ to the
  hypothesis: of the and put feet the without of such ; said uitrquunity be the
WER = 80%, CER = 17%
  reference:  for a simple personal Injury the Offender ’ s punish=
  hypothesis: For on simple personal injury the offenders punish .
2. Examples of test line images of increasing difficulty. The reference transcript and the CITlab system hypothesis are displayed (in this order) below each image. The corresponding WER and CER figures are also shown on the right of each image.
[Clipped text from the source figure:] [...] the lines with crossed-out word can be transcribed as the [...]ine shows. Finally, we can see that sometimes, if the line [...] a large WER but a low CER, the transcript can be more [...]ul than if the WER is lower and the CER higher (see third [...]
(Figure from SÁNCHEZ, TOSELLI, ROMERO, VIDAL: ICDAR2015 Competition HTRtS: Handwritten Text Recognition on the tranScriptorium Dataset)
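
A minimal sketch of how word error rate (WER) and character error rate (CER) are commonly computed: the edit distance between reference and hypothesis, divided by the reference length, over words or characters respectively. The contest's exact tokenization and normalization are not given in these slides, so values from this sketch need not match the figures above; the function names are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (strings or lists)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_item != hyp_item)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edits divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Character-level edits divided by the number of reference characters."""
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    ref = "operation of this act is spent upon Titius only ,"
    hyp = "operation of this act isspeut upon Titius only ,"
    print(f"WER = {word_error_rate(ref, hyp):.0%}")
    print(f"CER = {character_error_rate(ref, hyp):.0%}")
```

A single merged or misread word counts fully against the WER while costing only a few characters in the CER, which is how a line can show a large WER next to a low CER.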
Thanks ...
C I T lab Group – URO MoU Partner
PLANET intelligent systems GmbH
EU Funding
Recognition & Enrichment of Archival Documents
... for your attention!
ICARUS-Meeting #17 | Transparency - Accessibility – Dialogue. How a creative ...
 
ICARUS-Meeting #17 | Transparency - Accessibility – Dialogue. How a creative ...
ICARUS-Meeting #17 | Transparency - Accessibility – Dialogue. How a creative ...ICARUS-Meeting #17 | Transparency - Accessibility – Dialogue. How a creative ...
ICARUS-Meeting #17 | Transparency - Accessibility – Dialogue. How a creative ...
 
ICARUS-Meeting #17 | Transparency - Accessibility – Dialogue. How a creative ...
ICARUS-Meeting #17 | Transparency - Accessibility – Dialogue. How a creative ...ICARUS-Meeting #17 | Transparency - Accessibility – Dialogue. How a creative ...
ICARUS-Meeting #17 | Transparency - Accessibility – Dialogue. How a creative ...
 

co:op-READ-Convention Marburg - Roger Labahn

  • 1. Handwritten Text Recognition: Key concepts PD Dr. Roger Labahn Computational Intelligence Technology Lab Mathematical Optimization Group Institute for Mathematics University of Rostock co:op Convention | READ Kickoff 19.01.2016
  • 2. Handwritten Text Recognition: Key concepts Introduction Concepts – Problems – Tasks Recognition & Training Interpretation – Decoding Epilog co:op Convention | READ Kickoff HTR Key Concepts | Introduction Roger Labahn | C I T lab
  • 3–4. Framework – Workflow ... • Application: keyword search, transcription, ... OUT: textual information (words, positions, ...) with alternatives & confidences ⇑ • HTR-Engine ⇑ IN: writing images (lines, words, table cells, form fields, ...) • Layout Analysis: ..., text blocks ...
  • 5–6. Alternative recognition strategies. Topological methods • learn & read graphical substructures of writings • arcs, lines, curves, holes, ... HMM-based methods • Hidden Markov Models • learn & read states while traversing the writing RNN-based methods • Recurrent Neural Networks
  • 7. Recognition Engine (C I T lab & MoU partner PLANET). Decoding, output: textual output • textual interpretation of recognition results • matching external requirements / knowledge (dictionaries, language model, ...) ⇑ Recognition, output: recognition matrix • recognition information from image information • processing standardized writing image ⇑ Writing preprocessing, output: standardized writing • corrections & normalizations • e.g.: baseline, slant, height, ...
  • 8. Introduction • Concepts – Problems – Tasks: Segmentation, Context, Language, HTR • Recognition & Training • Interpretation – Decoding • Epilog
  • 9–18. Segmentation? NONE! (classical) OCR = Optical Character Recognition • reading single characters: sub-images per character!? • B a d ␣ D o ??? a n Segmentation-free reading • processing the entire writing image: word ... line ... • scanning information: data sequence (signal) / character sequence • reading built up step by step across these slides: B, BB, BB., BB.a, BB.ad, BB.ad␣, BB.ad␣., BB.ad␣.D
  • 19–20. Image context is essential! Single segment without context • u ?? OR n ?? • virtually not (sufficiently) readable Character sequence without context • ??? • virtually not (sufficiently) explainable
  • 21–23. Language context is essential! Free reading: no restrictions on possible reading results • BB.ad␣DDolo.auu • application: figures & general numbers, ... Comparison against dictionary or keyword • task: read a German city name from a given list! / find the name Bad Doberan! • Bad Doberan • goal: optimal / possible correspondence between writing / reading result and dictionary entry / keyword
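The step from the free reading on slides 21–23 to a dictionary entry can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the engine's method: `.` is taken to mean "no character at this position", adjacent repeats are taken to belong to one character, the city list is hypothetical, and the standard-library `difflib` similarity stands in for the confidence-based matching described later in the talk.

```python
import difflib

NAC = "."  # assumed "not a character" symbol


def collapse(raw, nac=NAC):
    """Merge runs of identical symbols, then drop the NaC symbol."""
    out = []
    prev = None
    for ch in raw:
        if ch != prev and ch != nac:
            out.append(ch)
        prev = ch
    return "".join(out)


def best_entry(raw, dictionary):
    """Return the dictionary entry most similar to the collapsed reading."""
    reading = collapse(raw).replace("␣", " ")
    # cutoff=0.0 keeps every candidate, so the best-scoring one is always returned
    return difflib.get_close_matches(reading, dictionary, n=1, cutoff=0.0)[0]


cities = ["Bad Doberan", "Bad Dürkheim", "Baden-Baden", "Rostock"]  # hypothetical list
print(collapse("BB.ad␣DDolo.auu"))
print(best_entry("BB.ad␣DDolo.auu", cities))
```

The collapsed reading alone is not a valid city name; only the comparison against the list yields a usable result, which is the point of the slide.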
  • 24–26. OCR? HTR! New paradigm – new concepts, new term! • HTR: Handwritten Text Recognition • ATR: Automatic Text Recognition • ... ???
  • 27. Introduction • Concepts – Problems – Tasks • Recognition & Training: Feature extraction, Writing processing, Neural Network, Parameter training • Interpretation – Decoding • Epilog
  • 28–33. From pixel values to features: original grey image, filtering
  • 34–35. Collect & remember context! Writing processing • scanning in different directions: data sequences (signals) Information memory • neural networks with complex neurons (cells) • recurrent connections ⇒ memory
  • 36. Complex cells
  • 37. Complex cells – memory by recurrent connections
  • 38. Hierarchical Neural Networks
  • 39–40. From feature input to network output (Figure from GRAVES, SCHMIDHUBER: Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks)
  • 41. Parameter training: Machine Learning. Theory • objective: optimally adapt parameters in cells along network connections • idea: train the network with learning data samples • optimization: minimize error (network output vs. sample target) over training data Practice: impression of large application cases • 10^4 network cells • 10^6 trainable parameters • 10^4 learning data samples (writing images) • 150 training epochs, each processing every sample once • 4 weeks of training from scratch
  • 42. Learning data • labeled training samples (ground truth); for HTR: writing images with correct text • the more the better, BUT: start with a realistic (reasonable) number and improve while working • represent all project data, BUT: start with HTR (networks) from similar collections / corpora • contribute to general HTR engine improvement: put into network repository for specific application cases
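The training idea on slide 41 (minimize the error between output and target, with each epoch processing every sample once) can be sketched on a toy problem. This is only an illustration: a single trainable parameter `w` stands in for the network, and the samples, learning rate and squared-error measure are assumptions chosen for the example, not values from the talk.

```python
def train(samples, epochs=150, lr=0.05):
    """Stochastic gradient descent on the squared error of y ≈ w * x."""
    w = 0.0  # the single trainable parameter of this toy model
    for _ in range(epochs):
        for x, target in samples:  # one epoch: every sample is processed once
            error = w * x - target
            w -= lr * 2.0 * error * x  # gradient of error**2 with respect to w
    return w


def total_error(w, samples):
    """Summed squared error over all training samples."""
    return sum((w * x - t) ** 2 for x, t in samples)


samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # hypothetical (input, target) pairs
w = train(samples)
print(w, total_error(w, samples))
```

A real network repeats the same loop over millions of parameters and thousands of writing images, which is where the weeks of training time come from.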
  • 43. Introduction • Concepts – Problems – Tasks • Recognition & Training • Interpretation – Decoding: Network output, Decoding • Epilog
  • 44. Channel probabilities. Pre-conditions • (abstract) alphabet of (abstract) characters • text composed of exactly these characters • alphabet characters ⇔ network output neurons (channels) • example: digits, uppercase letters, lowercase letters, special characters ␣ - • much more general: any symbol unit learnable from training data • current (large) application case: up to 150 character channels • independent of (natural) language, reading/writing direction, understanding Network output: probability of (character) channel at writing (image) position
  • 45. Confidence Matrix – recognition / perception matrix (figure; channels shown: . B D a d l o u ␣)
  • 46–47. Expression matching • restrict to permissible words: dictionary, keyword(s), construct(s) (regular expression) • consider character confidences (probability measure) or their negative logarithms (distance measure) Algorithmic method • compare confidence matrix against any permissible expression • use extremely fast algorithm: Dynamic Programming
  • 48–49. Decoding. Objective – Result • permissible expression(s) with best matching to recognition output • best matching ⇔ maximal probability ⇔ minimum distance • best alternatives ranked by measure (probability / distance) Practice: impression of actual application cases • only decoding on pre-processed lines • searching 1 keyword in 10,500 lines (433 pages): 2–3 s on average • reading 1 page against an 11,650-word dictionary: 8–9 s on average
  • 50–51. Dynamic Programming
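Matching a word against a confidence matrix by dynamic programming can be sketched as follows. The slides only state that negative log confidences act as a distance and that dynamic programming is used; the alignment rules below (a character may span several positions, and the `.` channel may sit before, between and after characters) are an assumption borrowed from common practice for recurrent-network output. `match_cost` and `toy_matrix` are hypothetical helper names.

```python
import math

NAC = "."  # assumed "not a character" channel


def match_cost(conf, alphabet, word):
    """Minimum summed -log confidence over all alignments of word to conf.

    conf is a list of rows (one per writing position) with one strictly
    positive probability per channel, ordered like alphabet.
    """
    col = {ch: i for i, ch in enumerate(alphabet)}
    ext = [NAC]  # NaC may sit before, between and after the characters
    for ch in word:
        ext += [ch, NAC]
    prev = [math.inf] * len(ext)
    for s in range(min(2, len(ext))):  # start in the leading NaC or first character
        prev[s] = -math.log(conf[0][col[ext[s]]])
    for row in conf[1:]:
        cur = [math.inf] * len(ext)
        for s, ch in enumerate(ext):
            best = prev[s]  # stay in the same state
            if s >= 1:
                best = min(best, prev[s - 1])  # advance by one state
            if s >= 2 and ch != NAC and ch != ext[s - 2]:
                best = min(best, prev[s - 2])  # skip the NaC in between
            cur[s] = best - math.log(row[col[ch]])
        prev = cur
    return min(prev[-2:])  # end in the last character or the trailing NaC


def toy_matrix(path, alphabet, hi=0.92):
    """Matrix giving `hi` to each symbol of `path`; the rest is shared equally."""
    lo = (1.0 - hi) / (len(alphabet) - 1)
    return [[hi if ch == p else lo for ch in alphabet] for p in path]


alphabet = ".BDadlou "
conf = toy_matrix("BB.ad", alphabet)
ranked = sorted(["Dad", "Bald", "Bad"], key=lambda w: match_cost(conf, alphabet, w))
print(ranked[0])
```

Sorting the permissible words by this cost gives the ranked alternatives mentioned on slides 48–49; the cost of one word grows only with the number of positions times the word length, which is why dictionary decoding stays fast.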
  • 52. Introduction • Concepts – Problems – Tasks • Recognition & Training • Interpretation – Decoding • Epilog
  • 53. Results from C I T lab’s contribution to ICDAR’s HTRtS-2015 contest. Each example gives the reference transcript, then the CITlab system hypothesis. (1) WER = 0%, CER = 0%: "who has with or without right the temporary possession of it : and" / "who has with or without right the temporary possession of it : and" (2) WER = 17%, CER = 4%: "operation of this act is spent upon Titius only ," / "operation of this act isspeut upon Titius only ," (3) WER = 67%, CER = 52%: "of the said first issue : the amount of such second consequently gap/ to the" / "of the and put feet the without of such ; said uitrquunity be the" (4) WER = 80%, CER = 17%: "for a simple personal Injury the Offender ’ s punish=" / "For on simple personal injury the offenders punish" Figure caption: Examples of test line images of increasing difficulty. The reference transcript and the CITlab system hypothesis are displayed (in this order) below each image. The corresponding WER and CER figures are also shown on the right of each image. (Figure from SÁNCHEZ, TOSELLI, ROMERO, VIDAL: ICDAR2015 Competition HTRtS: Handwritten Text Recognition on the tranScriptorium Dataset)
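WER and CER are edit distances normalized by the reference length, counted over words and characters respectively. A minimal sketch follows, assuming plain whitespace tokenization and no text normalization; the contest's exact tokenization is not given on the slide, so the values computed this way need not reproduce the figures shown there.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: character edits per reference character."""
    return edit_distance(ref, hyp) / len(ref)


def wer(ref, hyp):
    """Word error rate: word edits per reference word (whitespace tokens)."""
    ref_tokens = ref.split()
    return edit_distance(ref_tokens, hyp.split()) / len(ref_tokens)


ref = "operation of this act is spent upon Titius only ,"
hyp = "operation of this act isspeut upon Titius only ,"
print(f"CER = {cer(ref, hyp):.1%}, WER = {wer(ref, hyp):.1%}")
```

The example shows why the two measures diverge: a missing space and one wrong letter are two character edits in a long line, but they damage two of its words.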
  • 54. Thanks ... • C I T lab Group – URO • MoU Partner PLANET intelligent systems GmbH • EU Funding: Recognition and Enrichment of Archival Documents ... for your attention!