Unlocking the Handwritten Content in Document Images. Venu Govindaraju
Handwritten Documents Relevance Scanner Storage OCR Noisy Text Newton Kinematics Notes Query Forms Letters Notes
Outline
Challenge of Handwriting
Input Output 20187 + 2246 Handwriting Recognition
Postal Context (138 mil records)
LDR results by lexicon size (Lex):
Lex    Top 1   Top 2
10     96.5    98.7
100    89.2    94.1
1000   75.3    86.3
Paradigms Lexicon Driven OCR LDR Lexicon Free  OCR LFR Context Ranked Lexicon Segmentation Recognition Post-processing
Lexicon Free (LFR)
Character hypotheses on the graph edges, with confidences in brackets: i[.8], l[.8], u[.5], v[.2], w[.6], m[.3], w[.7], i[.7], u[.3], m[.2], m[.1], r[.4], d[.8], o[.5]
Find the best path in the graph from segment 1 to 8.
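The best-path search on the lexicon-free slide can be sketched as a small dynamic program over cut points. This is a minimal illustration, not the speaker's implementation: the edge list below is hypothetical (the slide's graph layout did not survive extraction), and scoring a path by the product of its character confidences is an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical segmentation graph: edges[(start, end)] lists (character, confidence)
# hypotheses for the image piece between cut points start and end (start < end).
Edges = Dict[Tuple[int, int], List[Tuple[str, float]]]


def best_path(edges: Edges, first: int, last: int) -> Tuple[str, float]:
    """Return the highest-scoring character string from cut point first to last.

    A path's score is the product of its character confidences
    (an assumption; the slide does not state the scoring rule).
    """
    # best[node] = (score, text) of the best path found from `first` to node.
    best: Dict[int, Tuple[float, str]] = {first: (1.0, "")}
    # Edges only go forward, so visiting nodes in increasing order is enough.
    for node in range(first + 1, last + 1):
        for (start, end), hypotheses in edges.items():
            if end != node or start not in best:
                continue
            prev_score, prev_text = best[start]
            for char, conf in hypotheses:
                candidate = (prev_score * conf, prev_text + char)
                if node not in best or candidate[0] > best[node][0]:
                    best[node] = candidate
    if last not in best:
        raise ValueError("no path reaches the last cut point")
    score, text = best[last]
    return text, score


# Illustrative edges only; the confidences are made up for this example.
example_edges: Edges = {
    (1, 4): [("w", 0.7)],
    (1, 3): [("u", 0.5), ("v", 0.2)],
    (3, 4): [("i", 0.7)],
    (4, 5): [("o", 0.5)],
    (5, 6): [("r", 0.4)],
    (6, 8): [("d", 0.8)],
}
example_text, example_score = best_path(example_edges, 1, 8)
```

With these made-up edges the single edge `w` over cut points 1 to 4 beats the two-edge reading `u` + `i`, so the search spells "word". A product score tends to favour paths with fewer edges; a real system would likely normalize for path length.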
Lexicon Driven (LDR)
Find the best way of accounting for the characters ‘w’, ‘o’, ‘r’, ‘d’ by consuming all segments 1 to 8.
Distance between the first character ‘w’ of lexicon entry ‘word’ and the image between:
- segments 1 and 4 is 5.0
- segments 1 and 3 is 7.2
- segments 1 and 2 is 7.6
Graph edge labels (character[distance]) over nodes 1 to 9: w[7.6] w[7.2] r[3.8] w[5.0] w[8.6] o[7.6] r[6.3] d[4.9] w[5.0] o[6.6] o[6.0] o[7.2] o[10.6] d[6.5] d[4.4] r[7.5] r[6.4] o[7.8] r[8.6] o[8.7] r[7.4] r[7.6] o[8.3] o[7.7] r[5.8] o[6.1]
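The lexicon-driven matching described on this slide can be sketched as a dynamic program that assigns consecutive runs of segments to the characters of one lexicon entry and minimizes the total distance. This is a sketch under stated assumptions: the `dist` callback is a hypothetical recognizer interface, the three distances for ‘w’ come from the slide (read here as inclusive segment ranges, which is an interpretation), and every other table entry is invented for illustration.

```python
import math
from typing import Callable, Dict, Tuple


def match_word(word: str, n_segments: int, max_span: int,
               dist: Callable[[str, int, int], float]) -> float:
    """Minimum total distance of matching `word` against segments 1..n_segments.

    cost[i][j] is the best distance for the first i characters consuming the
    first j segments. dist(ch, a, b) is the distance between character ch and
    the image covering segments a..b inclusive (hypothetical interface).
    """
    cost = [[math.inf] * (n_segments + 1) for _ in range(len(word) + 1)]
    cost[0][0] = 0.0
    for i, ch in enumerate(word, start=1):
        for j in range(1, n_segments + 1):
            # The current character may consume 1..max_span trailing segments.
            for span in range(1, min(max_span, j) + 1):
                prev = cost[i - 1][j - span]
                if prev < math.inf:
                    cost[i][j] = min(cost[i][j], prev + dist(ch, j - span + 1, j))
    return cost[len(word)][n_segments]


# The 'w' entries follow the slide; all other values are made up.
example_table: Dict[Tuple[str, int, int], float] = {
    ("w", 1, 4): 5.0, ("w", 1, 3): 7.2, ("w", 1, 2): 7.6,
    ("o", 5, 6): 6.1, ("o", 4, 6): 7.6,
    ("r", 7, 7): 3.8,
    ("d", 8, 8): 4.4,
}


def example_dist(ch: str, a: int, b: int) -> float:
    # Unknown (character, span) pairs are treated as impossible matches.
    return example_table.get((ch, a, b), math.inf)


example_cost = match_word("word", 8, 4, example_dist)
```

Running the same routine for every lexicon entry and sorting by cost would give the ranked list that a lexicon-driven recognizer returns. An entry that cannot consume all segments gets an infinite cost.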
Grapheme Models (LFR) Writer Specific Modeling Holistic Features grapheme pos orientation angle Down cusp 3.0 -90° Up loop Down arc
Interactive Models (LDR) Words: ABLE, TRIP, TRAP. Letters: A, T, N. Features. 1-way activation [McClelland and Rumelhart 1981]; 2-way interaction
Interactive Models (LDR) Phrase Level  T-crossings, loops, ascenders, descenders, length West Central Street West Main  Street Sunset Avenue West Central Street East Central Street Sunset Avenue West Central Street West Central Avenue Sunset Avenue Lexicon 1   Lexicon 2 Lexicon 3 Interactive Model features image 2-way interaction
Interactive Models Character Recognition Gradient (4) and Moment (5) Features 0 1 0 1 1 1 0 0 1 [Park and Govindaraju, IEEE CVPR 2000]
Active Recognition
Results: 10-class digit recognition, 25656 training and 12242 test (Postal + NIST)
               Active Model   Neural Net   KNN
Top 1 (%)      95.7           96.4         95.7
Temp           612            976          3,777
Msec           1.45           11.5         384
Training hrs   1              24           1
Lex size       LDR %    GM %
10             96.86    96.56
100            91.36    89.12
1000           79.58    75.38
  (Top 50)     98.00    98.40
20000          62.43    58.14
  (Top 100)    93.59    93.39
Fusion   Identification Task Verification Task LDR LFR
Fusion of Recognizers Type III LDR 5.6 7.4 … LFR .52 .81 … Identification task: Amherst Buffalo … Verification task: 5.6 .52 Amherst Question:  if we find optimal  and  , is it necessarily  ?  Accept Reject
Traditional Fusion Rules
Likelihood Ratio Verification Tasks
Minimum risk criteria: optimal decision boundaries coincide with the contours of the likelihood ratio function.
Metaclassification with NN, SVM, etc. is also possible [Prabhakar, Jain 02] [Nandakumar, Jain, Dass 08]
[Figure: Genuine and Impostor score distributions; axes: Recognizer score 1, Recognizer score 2]
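The formula on this slide did not survive extraction. For reference, the standard likelihood ratio rule for a two-recognizer verification task can be written as follows. The notation is assumed for illustration and is not taken from the slide.

```latex
% Assumed notation: s_1, s_2 are the two recognizer scores,
% p_gen and p_imp are the genuine and impostor score densities,
% and \theta is a threshold set by the misclassification costs.
f_{lr}(s_1, s_2) = \frac{p_{gen}(s_1, s_2)}{p_{imp}(s_1, s_2)},
\qquad
\text{accept if } f_{lr}(s_1, s_2) > \theta, \text{ otherwise reject.}
```

Each contour of f_lr is a set of score pairs with the same ratio, which is why the decision boundary for any cost setting lies on one of those contours.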
Optimal Combination Functions
Identification Task Results (top choice correct rate):
LFR is correct      54.8%
LDR is correct      77.2%
Both are correct    48.9%
Either is correct   83.0%
Likelihood Ratio    69.8%
Weighted Sum        81.6%
Verification Task Results: ROC
Independence of Scores: in a single trial
       Amherst   Buffalo   …
LDR    5.6       7.4       …
LFR    .52       .81       …
Lexicon1 Lexicon  i Lexicon N Independence of Scores In a single trial Recognizer 1 Recognizer  M Tulyakov & Govindaraju, TIFS 2009 Independent? Dependent Dependent
Optimal Combination? Correlated scores, dependent on input signal
                 Count   Rate
Set size         6147
LFR              3366    54.8%
LDR              4744    77.2%
Both correct     3005    48.9%
Either correct   5105    83.0%
LR               4293    69.8%
Weighted sum     5015    81.6%
       2nd choice   3rd choice   4th choice   Mean
LFR    .4359        .4755        .4771        .1145
LDR    .7885        .7825        .7673        .5685
Optimal Trainable Combination Function  Minimizing misclassification cost: Classify as  rather than Assume that scores assigned to different classes are independent : Tulyakov & Govindaraju IJPRAI 2009
Combination Methods: Identification Tasks. No! Traditional training mixes the genuine and impostor scores from different trials. [Figure: Genuine and Impostor score distributions; axes: Recognizer score 1, Recognizer score 2]
Combination Methods: Identification Tasks. Model training MUST process scores from one identification trial as a single training sample. [Figure: Genuine and Impostor score distributions; axes: Recognizer score 1, Recognizer score 2]
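One way to read the requirement above is that the training objective should be evaluated per trial, not per pooled score pair. The sketch below trains the single weight of a weighted-sum combiner by counting the trials in which the genuine class comes out on top. It is an illustration only: the `Trial` layout, the grid search, and the assumption that both scores are normalized so that higher is better are choices made here, not details from the talk.

```python
from typing import List, Sequence, Tuple

# One identification trial: the (score 1, score 2) pair for every class in the
# lexicon, plus the index of the genuine class. Hypothetical data layout.
Trial = Tuple[Sequence[Tuple[float, float]], int]


def top_choice_correct(trials: List[Trial], w: float) -> int:
    """Count trials whose genuine class gets the highest combined score."""
    correct = 0
    for scores, genuine in trials:
        combined = [w * s1 + (1.0 - w) * s2 for s1, s2 in scores]
        winner = max(range(len(combined)), key=combined.__getitem__)
        if winner == genuine:
            correct += 1
    return correct


def train_weight(trials: List[Trial], steps: int = 100) -> float:
    """Grid-search the weight that maximizes the top-choice correct count."""
    candidates = [k / steps for k in range(steps + 1)]
    return max(candidates, key=lambda w: top_choice_correct(trials, w))


# Made-up trials: each recognizer alone gets one of the first two wrong.
example_trials: List[Trial] = [
    ([(0.9, 0.2), (0.5, 0.6)], 0),
    ([(0.4, 0.9), (0.6, 0.3)], 0),
    ([(0.3, 0.1), (0.8, 0.7)], 1),
]
example_weight = train_weight(example_trials)
```

Because every trial is scored as a whole, the impostor scores that compete with a genuine score always come from the same input image, which is the dependence the preceding slides point out.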
Iterative Methods
Best Impostor Function
            Likelihood Ratio   Weighted Sum   Best Impostor Likelihood Ratio   Logistic Sum   Neural Network
LFR & LDR   69.84              81.58          80.07                            81.43          81.67
li & C      97.24              97.23          97.01                            97.34          97.39
li & G      95.90              95.47          95.99                            96.17          96.29
Outline
Search for Handwritten Documents
             Good Quality     Historical      Medical
Lexicon      10K     1K       10K     1K      4K
Top 1 (%)    57      67       12      28      20
Top 3 (%)    69      72       22      44      27
Top 10 (%)   74      75       32      72      42
Search Engine Handwritten Forms
Search Engine for Medical Forms
Topic Categorization  Lexicon Reduction Lex Free Large Lexicon > 5K Handwritten Medical Documents ICR Features ~33% word Recognition rate (10 points gain) Topic  Categorization Select Reduced Lexicon ~2.5K Lex Driven
ICR Features Index
Topic Features: DIGESTIVE-SYSTEM
FQ   CHSN   PHRASE
30   0.72   PAIN INCIDENT
5    0.31   PAIN TRANSPORTED
42   0.54   PAIN CHEST
52   0.81   STOMACH PAIN
9    0.25   HOME PAIN
6    0.43   VOMITING ILLNESS
Topic Categorization (Chu-Carroll et al., 1999)
Results
C: complete lexicon; R: reduced lexicon; A: category given; S: features synthetic; T: truth present
             CLT to RLT   CL to RL   CLT to ALT   CLT to SLT
HR           7.48%        7.42%      17.58%       7.42%
Error Rate   10.78%       10.88%     24.53%       10.21%
Outline
Urgent Issue of our Times Threat: ‘If it’s not in Google, it doesn’t exist!’ Baird 2003
What is possible today?
Document Enhancement [Shi, Setlur, and Govindaraju 2008]
Transcript-Mapping 1787 Thomas Jefferson letter and its transcript  Image Transcript + +
What is not possible today?
 
Crosslingual Retrieval Multilingual Document Corpus Retrieved Documents  English Hindi Sanskrit Translations of “strength”
SEARCH Handwritten Documents Image – Based  Use Image Based Features OCR - Based Use OCR Recognition Results Query rendered
Image Based Methods (Rath 07 IJDAR)  Poor performance in multiple writer scenarios
SEARCH Handwritten Documents Image-Based: Use Image Based Features. OCR-Based: Use OCR recognition results
Indexing Retrieval Handwriting  Recognition
Vector IR Model (TF-IDF) [Baeza-Yates99]
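The bullet text of this slide was lost, so the following is a generic sketch of the vector-space model with TF-IDF weights and cosine ranking, not the exact formulation cited from [Baeza-Yates99]. The weighting variant (relative term frequency times log of inverse document frequency) and the toy documents are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Build one sparse TF-IDF vector per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus for illustration only.
example_docs = [["chest", "pain"], ["stomach", "pain"], ["vomiting", "illness"]]
example_vectors = tfidf_vectors(example_docs)
example_query = {"chest": 1.0}
example_ranking = sorted(range(len(example_docs)),
                         key=lambda i: cosine(example_query, example_vectors[i]),
                         reverse=True)
```

In this toy corpus "pain" occurs in two of three documents and so gets a lower weight than "chest", which occurs in only one.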
Modifications to VM
Estimation: word images 0.02 0.01 0.2 0.01 0.01 … Doc d_j [Rath 04, Howe 05]
Estimating Term Frequency
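One plausible reading of the estimation slides is that a term's frequency in a handwritten document is replaced by its expected count: the sum, over the word images of the document, of the recognizer's probability that the image shows that term. The sketch below implements that reading. It is an assumption about the method, and the posterior values are invented.

```python
from typing import Dict, List


def expected_term_frequency(word_posteriors: List[Dict[str, float]]) -> Dict[str, float]:
    """Sum per-image term probabilities into expected term counts for one document.

    word_posteriors holds one dict per word image, mapping candidate terms to
    the recognizer's probability for that image (hypothetical input format).
    """
    expected: Dict[str, float] = {}
    for posterior in word_posteriors:
        for term, prob in posterior.items():
            expected[term] = expected.get(term, 0.0) + prob
    return expected


# Made-up recognizer output for a three-word document.
example_posteriors = [
    {"pain": 0.6, "rain": 0.3},
    {"chest": 0.8, "chess": 0.1},
    {"pain": 0.2, "paid": 0.5},
]
example_tf = expected_term_frequency(example_posteriors)
```

The expected counts can be used in place of integer term counts in a TF-IDF weighting, so a term that is never the top recognition choice can still contribute to retrieval.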
Estimating Segmentation d > D 3 hypotheses
Word Recognition


Editor's notes

  1. ½ min Good Afternoon: I am Venu Govindaraju, Professor at the University at Buffalo. The title of my talk today is “Paradigms in Handwriting Recognition”. This will be in the context of “English” language and the Roman alphabet. The idea is to see if some of the techniques that have proved successful in English are also applicable to Arabic or Chinese. This will be an overview style presentation: describing paradigms, applications, and accuracy figures.
  2. In the postal application, we are able to operate with an average lexicon size of 30. When we do not have collateral information, how does one reduce the lexicon size?
  3. 1 min The problem of handwriting recognition has typically been defined as follows. The inputs are a bitmap image of the word to be recognized and a lexicon of possible choices. The lexicon usually captures the context of the application at hand. When the application does not provide a lexicon, it defaults to the entire English dictionary, or at least the words in common usage; in such cases the lexicon can run to tens of thousands of words. The output is a ranked list of the lexical choices, often with a confidence score attached to each. In this talk, we make two assumptions. The first is that we are dealing with single words or short phrases of a few words. There is a considerable body of work on the recognition of entire sentences: an early paper on the topic was published by Kim, Govindaraju and Srihari in IJDAR 1997, and since then several papers have been published, most notably from Prof. Suen’s group at Concordia and Prof. Bunke’s group in Switzerland. The second assumption is that we are dealing with offline handwriting recognition.
  4. We are looking at the narrative text in the medical forms, and we are using medical dictionaries. The techniques scale to other applications as well. We want to develop a search engine for such medical forms, where a health official could search the forms by querying with medical terms. We demonstrated the method of keyword spotting at the demo session yesterday. We will now describe an alternate method that attempts full transcription, which is expected to be errorful, and see whether search engines are still viable. The handwriting is sloppy, written in ambulances and other emergency scenarios. Abbreviations are freely used. The documents are carbon copies, and binarization itself is a challenge; we presented this work at DAS 06. Lexicon-free recognition can pick up only a few characters in each word with reasonable confidence. For lexicon-driven recognition, the lexicons will be larger than 5K, for which the accuracy is in the 20s. What should we do?
  5. One problem with cohesive phrases alone is that during the recognition phase we do not know the words. Therefore, we extract terms from these cohesive phrases and use them to model the category with which each phrase is associated. This is the basis for the hypothesis. For example [read slide]
  6. The pseudo-category vector is then attached to the matrix of category column vectors.
  7. Some more detail concerning the impact of ruled line removal on word recognition: we extracted all the test word images from lined pages and measured the top choice recognition performance. Here are the numbers. Total word images in the test set: 848, from a total of 274 pages. Of these, 460 word images come from 146 pages with ruled lines. The ratio of words and pages with ruled lines in the 34 PAW data set: 460/848 = 54.25% (words), 146/274 = 53.28% (pages). Top 1 recognition performance on words from lined pages: earlier 318/460 = 69.13%, now 349/460 = 75.87%. Ruled line removal therefore improves top 1 word recognition by 6.74 percentage points when evaluated on words from lined pages. The overall top 1 improvement is 4.13 percentage points when evaluated on the full test set, including word images from both lined and non-lined pages (the figure we had reported earlier). Also, the PAW recognizer is a straightforward implementation using a k-nearest neighbor classifier. The features used are the CUBS Gradient, Structure and Concavity features. The classifier is a very simple implementation that can be improved; its purpose was to test the effectiveness of our features.
  8. Digital libraries like the George Washington Papers collection at the Library of Congress consist of approximately 152,000 handwritten document images and associated transcripts. The Newton Project aims to make all of Newton's writings available online. Aligning the transcription with the handwritten text in these libraries would make it possible to automatically generate an immense database of word images, which in turn can be used as truth data by word recognizers to create transcriptions for the remaining scanned documents. The tedious process of manually dragging a box around each word in an image and keying in the annotations could thus be avoided. In forensic document evaluation, capturing characteristics specific to a writer is of paramount importance both in writer identification and in writer verification. Thus, if a mapping algorithm correctly maps word images to lexicon words during preprocessing, the accuracy of writer recognition would improve remarkably. For existing scanned images, the alignment makes it possible to build interfaces where the transcript text can be browsed alongside the manuscript.
  9. Existing keyword spotting approaches can be classified into two categories: (a) image based and (b) OCR based. (a) In image-feature-based indexing approaches, after preprocessing of the document images and word segmentation, feature vectors are extracted from word images and stored in a database. When a user provides a query word, the similarity between the query and each word image in the database is computed, and word images are returned in decreasing order of similarity. (b) In OCR-based approaches, the indices are built from OCR scores such as posterior probabilities, or feature vector observation likelihoods (probability densities) converted from the distances returned by the word recognizer.
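
The image-based retrieval step described in the notes above (feature vectors stored per word image, a similarity computed against the query, results returned in decreasing order of similarity) might be sketched roughly as follows. This is only an illustration under stated assumptions: the talk does not say which features or which similarity measure are used, so cosine similarity, the three-dimensional vectors, the image identifiers, and the function names `cosine_similarity` and `rank_word_images` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: the talk does not name a similarity measure; cosine
    similarity is used here purely for illustration.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_word_images(query_vector, index):
    """Return (image_id, similarity) pairs in decreasing order of similarity."""
    scored = [
        (image_id, cosine_similarity(query_vector, vector))
        for image_id, vector in index.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical index: one feature vector per segmented word image.
index = {
    "page1_word3": [0.9, 0.1, 0.0],
    "page2_word7": [0.1, 0.8, 0.3],
    "page4_word1": [0.7, 0.3, 0.1],
}

# Hypothetical feature vector extracted from the query word.
query = [1.0, 0.2, 0.0]

ranking = rank_word_images(query, index)
for image_id, score in ranking:
    print(f"{image_id}: {score:.3f}")
```

An OCR-based index would keep the same ranking step but replace the similarity with recognizer-derived scores, such as the posterior probabilities mentioned in the notes.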