This document discusses optical character recognition (OCR) and font recognition techniques. It presents the results of several experiments comparing different OCR and font recognition algorithms on various datasets containing English, Farsi, Arabic, and Ottoman fonts and styles. The proposed dual tree complex wavelet transform (DT-CWT) approach achieved higher accuracy than state-of-the-art methods on most datasets, was faster, and was more robust to noise. Mean and standard deviation of wavelet coefficients were used as features with an SVM classifier.
2. OFR as a means: Optical Character Recognition (OCR)
• As of August 2010, there are 129,864,880 books in the world¹.
• Only 20 million of them have been digitized.
• Digitization ≠ Scanning
– Image vs. context
– Additional processing
• Optical Character Recognition
¹ http://booksearch.blogspot.com/2010/08/books-of-world-stand-up-and-be-counted.html
3. OFR as a means: Optical Character Recognition (OCR)
• Inter-typeface variability
– Vast number of typefaces (>50000)
• OCR is like finding a needle in a haystack
• Knowing the font significantly reduces the size of the haystack
4. OFR as an end: Dead Sea Scrolls
• Digitized by Google
• Currently 5 scrolls are available
• Classification of new scripts
5. OFR as an end: Identifont
• Font search service
• Fonts are expensive! ($25-$1000)
• Finding cheaper alternatives: Museo (free) vs. Adelle ($599)
6. How to Recognize Fonts?
Local
• Information from individual letters
• Higher resolution (decision per word/letter)
• Needs OCR as preprocessing
Global
• Information from blocks of words
• Faster
• Lower resolution (decision per block)
18. Results: English Font Recognition
• Dataset
– Printscreen, Small natural noise, Artificial noise, Large natural noise
– 1 paragraph per font/emphasis pair
– 8 fonts:
• Arial, Bookman, Century Gothic, Comic Sans, Courier, Computer Modern, Impact, Times New Roman
19. Results: English Font Recognition
• Competition
Algorithm | Preprocessing? | Subsampling | Feature | Classifier
Proposed | Otsu’s method | Variable | Mean, std of CWT | SVM (one against one)
Aviles-Cruz | Text line detection, normalization, texture formation | 100 random 64x64 subsamples | Skewness & kurtosis | EM trained Bayes classifier
Ramanathan | Normalization, Otsu’s method | 3x3 grid | Mean, std, max of Gabor responses | SVM (one against all)
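Otsu's method, listed above as a preprocessing step, picks the gray level that best separates an image into two classes by maximizing the between-class variance. The following is a minimal pure-Python sketch of that idea, not the presenters' implementation; the function names `otsu_threshold` and `binarize`, the flat pixel list, and the choice to label pixels above the threshold as 1 are assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form the low class, the rest the high class.
    `pixels` is a flat list of integers in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the low class
    sum0 = 0    # sum of gray levels in the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # low class still empty
        w1 = total - w0
        if w1 == 0:
            break               # high class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Map pixels above the threshold to 1 and the rest to 0 (assumed polarity)."""
    return [1 if p > t else 0 for p in pixels]


# Toy image: half dark pixels, half bright pixels.
toy = [10] * 50 + [200] * 50
t = otsu_threshold(toy)
mask = binarize(toy, t)
```

For the toy input any threshold between the two gray levels separates the classes equally well; the sketch returns the first such level it encounters.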
20. Results: English Font Recognition
Low Natural Noise (recognition rate, %)
Font | Proposed | Aviles-Cruz | Ramanathan
A | 96.88 | 81.75 | 100
B | 100 | 87 | 100
CG | 98.45 | 69.75 | 97.22
CS | 100 | 75.5 | 100
C | 100 | 96.25 | 100
I | 100 | 99 | 100
M | 100 | 97 | 100
T | 100 | 91 | 100
Mean | 99.41625 | 87.15625 | 99.6525
[Figure: per-font recognition rates of Proposed, Aviles-Cruz, and Ramanathan; axes A, B, CG, CS, C, I, M, T, Mean; scale 65 to 100]
21. Results: English Font Recognition
Low Natural Noise + Artificial Noise (recognition rate, %)
Font | Proposed | Aviles-Cruz | Ramanathan
A | 95.31 | 78.25 | 97.22
B | 100 | 83 | 100
CG | 98.44 | 67.5 | 97.22
CS | 100 | 73 | 100
C | 100 | 91.5 | 97.22
I | 98.44 | 98.5 | 100
M | 100 | 91.25 | 100
T | 98.44 | 79.25 | 97.22
Mean | 98.82875 | 82.78125 | 98.61
[Figure: per-font recognition rates of Proposed, Aviles-Cruz, and Ramanathan; axes A, B, CG, CS, C, I, M, T, Mean; scale 65 to 100]
22. Results: English Font Recognition
High Natural Noise (recognition rate, %)
Font | Proposed | Aviles-Cruz | Ramanathan
A | 98.44 | - | 91.67
B | 98.44 | - | 88.89
CG | 92.19 | - | 94.44
CS | 100 | - | 97.22
C | 100 | - | 94.44
I | 100 | - | 94.44
M | 98.44 | - | 88.88
T | 98.44 | - | 100
Mean | 98.24375 | - | 93.7475
[Figure: per-font recognition rates of Proposed, Aviles-Cruz, and Ramanathan; axes A, B, CG, CS, C, I, M, T, Mean; scale 80 to 100]
23. Results: English Font Recognition
Recognition Means (%)
Dataset | Proposed | Aviles-Cruz | Ramanathan
Printscreen | 100 | 100 | 100
Low Natural Noise | 99.41625 | 87.15625 | 99.6525
Low Natural Noise + Artificial Noise | 98.82875 | 82.78125 | 98.61
High Natural Noise | 98.24375 | - | 93.7475
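The means on this slide are plain arithmetic averages of the eight per-font rates reported on the three preceding results slides. A short check for the proposed method is sketched below; the dictionary layout and variable names are illustrative assumptions, while the numbers are copied from those slides in the order A, B, CG, CS, C, I, M, T.

```python
from statistics import mean

# Per-font recognition rates (%) of the proposed method, copied from the
# per-dataset result slides (order: A, B, CG, CS, C, I, M, T).
proposed_rates = {
    "Low Natural Noise": [96.88, 100, 98.45, 100, 100, 100, 100, 100],
    "Low Natural Noise + Artificial Noise": [95.31, 100, 98.44, 100, 100, 98.44, 100, 98.44],
    "High Natural Noise": [98.44, 98.44, 92.19, 100, 100, 100, 98.44, 98.44],
}

# Average the eight per-font rates for each dataset.
proposed_means = {name: mean(rates) for name, rates in proposed_rates.items()}

for name, value in proposed_means.items():
    print(f"{name}: {value:.5f}")
```

The averages agree with the reported means of 99.41625, 98.82875, and 98.24375.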
24. Results: Farsi Font Recognition
• Dataset
– Small natural noise
– 1 paragraph per font/emphasis pair
– 10 fonts:
• Homa, Lotus, Mitra, Nazanin, Tahoma, Times New Roman, Titr, Traffic, Yaghut, and Zar
[Figure: sample texts. a: Lotus italic; b: Homa bold italic; c: Times New Roman bold]
25. Results: Farsi Font Recognition
• Competition
Algorithm | Preprocessing? | Subsampling | Feature | Classifier
Proposed | Otsu’s method | Variable | Mean, std of CWT | SVM (one against one)
Khosravi and Kabir | Text line detection, normalization, texture formation | 4x4 grid | Mean, std of Sobel-Roberts | AdaBoost
Senobari and Khosravi | Yes, but not explained | 128x128 size subsamples | PCA of Sobel, Roberts, Symlet wavelets | MLP classifier
26. Results: Farsi Font Recognition
Low Natural Noise (recognition rate, %)
Font | Proposed | Khosravi | Senobari
L | 92.2 | 92.2 | 90.7
M | 95.3 | 93.4 | 93.7
N | 90.6 | 85.2 | 92
TR | 98.4 | 97.6 | 95.9
Y | 96.9 | 97.6 | 98.5
Z | 92.2 | 87.4 | 90.9
H | 100 | 99.2 | 99.8
TI | 100 | 95.2 | 97
T | 100 | 96.6 | 98.3
TN | 98.4 | 97.2 | 98.8
Mean | 96.41 | 94.16 | 95.56
[Figure: per-font recognition rates of Proposed, Khosravi, and Senobari; axes L, M, N, TR, Y, Z, H, TI, T, TN, Mean; scale 75 to 100]
28. Results: Arabic Font Recognition
• Competition
Algorithm | Preprocessing? | Subsampling | Feature | Classifier
Proposed | Otsu’s method | Variable | Mean, std of CWT | SVM (one against all)
Ben Moussa | No | No | Fractal based | NN
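The classifier columns in the comparison slides distinguish "one against one" from "one against all" multiclass SVMs. The sketch below illustrates only the two decision rules; the binary classifiers are hypothetical nearest-center stand-ins rather than trained SVMs, and the class names, the 1-D feature, and the `centers` values are assumptions chosen for the example.

```python
from collections import Counter
from itertools import combinations


def predict_one_vs_one(x, classes, pairwise):
    """Each pairwise classifier votes for one of its two classes; most votes wins."""
    votes = Counter(pairwise[(a, b)](x) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]


def predict_one_vs_all(x, classes, scorers):
    """Each scorer rates 'its class versus the rest'; the highest score wins."""
    return max(classes, key=lambda c: scorers[c](x))


# Hypothetical toy setup: one scalar feature and one center per class.
classes = ["Arial", "Courier", "Impact"]
centers = {"Arial": 0.0, "Courier": 5.0, "Impact": 10.0}

# Stand-in binary classifiers: pick whichever of the two centers is closer.
pairwise = {
    (a, b): (lambda x, a=a, b=b: a if abs(x - centers[a]) <= abs(x - centers[b]) else b)
    for a, b in combinations(classes, 2)
}
# Stand-in per-class scorers: negative distance to the class center.
scorers = {c: (lambda x, c=c: -abs(x - centers[c])) for c in classes}

ovo_label = predict_one_vs_one(4.2, classes, pairwise)
ova_label = predict_one_vs_all(4.2, classes, scorers)
```

With K classes, one against one needs K(K-1)/2 binary classifiers (28 for 8 classes), while one against all needs K.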
29. Results: Arabic Font Recognition
ALPH-REGIM Database (recognition rate, %)
Font | Proposed | Ben Moussa
AH | 99.633 | 94
AN | 98.1595 | 94
AT | 99.734 | 92
B | 99.5968 | 100
BU | 98.2955 | 100
D | 99.8592 | 100
H | 90.4424 | 100
K | 90.4037 | 88
KO | 99.3478 | 98
N | 98.2418 | 98
Mean | 97.3714 | 96.4
[Figure: per-font recognition rates of Proposed and Ben Moussa; axes AH, AN, AT, B, BU, D, H, K, KO, N, Mean; scale 82 to 100]
33. Conclusion
• New feature for font recognition:
– Mean and std of 3-level CWT coefficients
– Higher accuracy than the state of the art on most English, Farsi, and Arabic datasets
– Faster than the state of the art
– Robust to noise
– Performs well on Ottoman texts
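The feature named in the conclusion, the mean and standard deviation of wavelet coefficients, can be sketched as follows. This is an illustration under stated assumptions, not the presenters' code: it does not compute the transform itself, it takes already-computed complex subbands as input, and both the use of coefficient magnitudes and the population standard deviation are assumptions. The helper name `subband_features` is hypothetical.

```python
from statistics import fmean, pstdev


def subband_features(subbands):
    """Return [mean, std, mean, std, ...] of coefficient magnitudes per subband.

    `subbands` is a list of subbands, each a flat list of complex
    coefficients produced by some wavelet transform (not computed here).
    """
    features = []
    for coeffs in subbands:
        magnitudes = [abs(c) for c in coeffs]   # assumed: magnitudes, not real parts
        features.append(fmean(magnitudes))
        features.append(pstdev(magnitudes))     # assumed: population std
    return features


# Two tiny made-up subbands, only to show the shape of the output.
example = [[3 + 4j, 0j], [1 + 0j, 1j, -1 + 0j]]
vector = subband_features(example)
```

A 2-D dual-tree complex wavelet transform produces six oriented subbands per level, so if all subbands of a 3-level decomposition were used, this layout would give 18 subbands and a 36-dimensional feature vector per image block.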
34. References
[1] Abuhaiba, I., 2004. Arabic font recognition using decision trees built from common words. Journal of Computing and Information Technology 13 (3), 211–224.
[2] Amin, A., 1998. Off-line arabic character recognition: the state of the art. Pattern recognition 31 (5), 517–530.
[3] Aviles-Cruz, C., Rangel-Kuoppa, R., Reyes-Ayala, M., Andrade-Gonzalez, A., Escarela-Perez, R., 2005. High-order statistical texture analysis-font recognition applied. Pattern Recognition Letters 26 (2), 135–145.
[4] Ben Moussa, S., Zahour, A., Benabdelhafid, A., Alimi, A., 2008. Fractal-based system for arabic/latin, printed/handwritten script identification. In: Pattern Recognition, 2008. ICPR 2008. 19th International Conference on. IEEE, pp. 1–4.
[5] Borji, A., Hamidi, M., 2007. Support vector machine for persian font recognition. International Journal of Intelligent Systems and Technologies, 184–187.
[6] Boser, B., Guyon, I., Vapnik, V., 1992. A training algorithm for optimal margin classifiers. In: Proceedings of the fifth annual workshop on Computational learning theory. ACM, pp. 144–152.
[7] Cai, S., Li, K., Selesnick, I., ???? Matlab implementation of wavelet transforms. Tech. rep., Polytechnic University.
[8] Chang, C., Lin, C., 2011. Libsvm: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST) 2 (3), 27.
[9] Chaudhuri, B., Garain, U., 1998. Automatic detection of italic, bold and all-capital words in document images. In: Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on. Vol. 1. IEEE, pp. 610–612.
[10] Cortes, C., Vapnik, V., Sep. 1995. Support-vector networks. Mach. Learn. 20 (3), 273–297.
[11] Duan, K., Keerthi, S., 2005. Which is the best multiclass svm method? an empirical study. Multiple Classifier Systems, 732–760.
[12] Hsu, C., Chang, C., Lin, C., et al., 2003. A practical guide to support vector classification.
[13] Jung, M., Shin, Y., Srihari, S., 1999. Multifont classification using typographical attributes. In: Document Analysis and Recognition, 1999. ICDAR’99. Proceedings of the Fifth International Conference on. IEEE, pp. 353–356.
[14] Khosravi, H., Kabir, E., 2010. Farsi font recognition based on Sobel-Roberts features. Pattern Recognition Letters 31 (1), 75–82.
[15] Kingsbury, N., 1997. Image processing with complex wavelets. Phil. Trans. Royal Society London A 357, 2543–2560.
[16] Kingsbury, N., 1998. The dual-tree complex wavelet transform: a new efficient tool for image restoration and enhancement. In: Proc. EUSIPCO. Vol. 98. pp. 319–322.
[17] Kingsbury, N., 2000. A dual-tree complex wavelet transform with improved orthogonality and symmetry properties. In: Image Processing, 2000. Proceedings. 2000 International Conference on. Vol. 2. IEEE, pp. 375–378.
[18] Ma, H., Doermann, D., 2003. Gabor filter based multi-class classifier for scanned document images. In: 7th International Conference on Document Analysis and Recognition (ICDAR). pp. 968–972.
[19] Otsu, N., 1979. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man and Cybernetics 9 (1), 62–66.
[20] Petkov, N., Wieling, M., 2008. Gabor filter for image processing and computer vision. Tech. rep., University of Groningen.
[21] Ramanathan, R., Soman, K., Thaneshwaran, L., Viknesh, V., Arunkumar, T., Yuvaraj, P., Oct. 2009. A novel technique for english font recognition using support vector machines. In: Advances in Recent Technologies in Communication and Computing, 2009. ARTCom ’09. International Conference on. pp. 766–769.
[22] Rashedi, E., Nezamabadi-pour, H., Saryzadi, S., 2007. Farsi font recognition using correlation coefficients (in farsi). In: 4th Conf. on Machine Vision and Image Processing, Ferdosi Mashhad.
[23] Selesnick, I., Baraniuk, R., Kingsbury, N., 2005. The dual-tree complex wavelet transform. Signal Processing Magazine, IEEE 22 (6), 123–151.
[24] Villegas-Cortez, J., Aviles-Cruz, C., 2005. Font recognition by invariant moments of global textures. In: Proceedings of international workshop VLBV05 (very low bit-rate video-coding 2005). pp. 15–16.
[25] Zhu, Y., Tan, T., Wang, Y., Oct. 2001. Font recognition based on global texture analysis. IEEE Trans. Pattern Anal. Mach. Intell. 23 (10), 1192–1200.
[26] Zramdini, A., Ingold, R., 1998. Optical font recognition using typographical features. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 877–882.