Classification of Fonts and Calligraphy Styles based on Complex Wavelet Transform

Alican Bozkurt, Pınar Duygulu Şahin, A. Enis Çetin
Bilkent University
GRC 2013

OFR as a means: Optical Character Recognition (OCR)
• As of August 2010, there are 129,864,880 books in the world¹.
• Only 20 million of them have been digitized.
• Digitization ≠ Scanning
– Image vs. context
– Additional processing
• Optical Character Recognition

¹ http://booksearch.blogspot.com/2010/08/books-of-world-stand-up-and-be-counted.html

OFR as a means: Optical Character Recognition (OCR)
• Inter-typeface variability
– Vast number of typefaces (>50000)
• OCR is like finding a needle in a haystack
• Knowing the font significantly reduces the size of the haystack

OFR as an end: Dead Sea Scrolls
• Digitized by Google
• Currently 5 scrolls are available
• Classification of new scripts

OFR as an end: Identifont
• Font search service
• Fonts are expensive! ($25-$1000)
• Finding cheaper alternatives:
– Museo (free) vs. Adelle ($599)

How to Recognize Fonts?

Local
• Information from individual letters
• Higher resolution (decision per word/letter)
• Needs OCR as preprocessing

Global
• Information from blocks of words
• Faster
• Lower resolution (decision per block)

Dual Tree Complex Wavelet Transform (DT-CWT)
• Why CWT?
– Directional selectivity
– Shift invariance
[Figures: DWT vs. CWT comparisons for directional selectivity and shift invariance; the real DWT panel is labeled 90, 45(?), 0 (deg)]
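
A small numerical illustration of why shift invariance matters. In a real, critically sampled DWT, moving an edge by one sample can change the subband energy drastically. The sketch below uses one level of the Haar DWT as a stand-in for the real DWT. The Haar filter and the toy signal are assumptions chosen for illustration and do not come from the slides, and the code does not implement the DT-CWT itself.

```python
import numpy as np

def haar_detail(signal):
    """Detail (high-pass) coefficients of one level of the real Haar DWT."""
    signal = np.asarray(signal, dtype=float)
    return (signal[0::2] - signal[1::2]) / np.sqrt(2.0)

# A step edge, and the same edge moved right by one sample.
edge = np.zeros(16)
edge[8:] = 1.0
edge_shifted = np.zeros(16)
edge_shifted[9:] = 1.0

# Energy in the detail band before and after the one-sample shift.
energy = np.sum(haar_detail(edge) ** 2)
energy_shifted = np.sum(haar_detail(edge_shifted) ** 2)
print(energy, energy_shifted)
```

The detail energy is zero for the first signal and about 0.5 for the shifted one, although the two signals differ only by a one-sample shift. Features built on such coefficients would change with the position of the text inside a block, which is the problem the approximately shift-invariant DT-CWT is meant to reduce.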

Demonstration
• Train images
– Printscreens
– No noise
– White background
– ~1900x750 px image size
– 168x480 px sample size
– One paragraph per font
• Test image
– Random image for “typewriter”
– Real noise
– Colored background
– 1169x1142 px image size
– 96x96 px sample size

Demonstration
• Smaller subsample size
– Different height/width ratio
• Noise
• Different background
• Not exact font
• 96% success rate (125/130)
– Blue: Courier New Regular
– Red: Bookman Regular

Demonstration

[Figures: train image for “Courier New regular”; test image; train image for “Bookman regular”]

Feature extraction
• Step 0: Input image

Feature extraction
• Step 1: Convert the image to binary using Otsu’s method
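
Otsu's method picks the gray level that maximizes the between-class variance of the histogram. The following is a minimal NumPy sketch of that idea, not the authors' code. The function names `otsu_threshold` and `binarize`, the 8-bit input, and the choice to mark pixels brighter than the threshold as `True` are all assumptions made for illustration.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the 8-bit threshold that maximizes between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 0                          # skip thresholds with an empty class
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))

def binarize(gray):
    """Boolean mask of pixels brighter than the Otsu threshold (assumed polarity)."""
    return gray > otsu_threshold(gray)

# Toy image: dark left half (50), bright right half (200).
gray = np.full((10, 10), 50, dtype=np.uint8)
gray[:, 5:] = 200
threshold = otsu_threshold(gray)
mask = binarize(gray)
```

For the toy image the threshold falls between the two gray levels, so the mask separates the two halves. On real scans the polarity (text as `True` or `False`) would have to match whatever the later steps expect.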

Feature extraction
• Step 2: Divide the image into subsamples
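
A sketch of the subsampling step, assuming the subsamples are non-overlapping blocks cut on a regular grid and that incomplete blocks at the right and bottom edges are discarded. The slides do not say whether blocks overlap or how borders are handled, so both choices, and the helper name `split_into_subsamples`, are assumptions.

```python
import numpy as np

def split_into_subsamples(image, block_h, block_w):
    """Cut a 2-D image into non-overlapping block_h x block_w blocks.

    Rows and columns that do not fill a whole block are dropped.
    Returns an array of shape (n_blocks, block_h, block_w), ordered row by row.
    """
    rows = image.shape[0] // block_h
    cols = image.shape[1] // block_w
    trimmed = image[:rows * block_h, :cols * block_w]
    blocks = trimmed.reshape(rows, block_h, cols, block_w).swapaxes(1, 2)
    return blocks.reshape(-1, block_h, block_w)

# Toy 200x300 image cut into 96x96 blocks (the test-image sample size in the demonstration).
image = np.arange(200 * 300).reshape(200, 300)
blocks = split_into_subsamples(image, 96, 96)
print(blocks.shape)
```

With a 200x300 input this yields a 2x3 grid of blocks, and each block would then be transformed and classified independently.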

Feature extraction
• Step 3: 3-level DT-CWT for each subsample
[Figure: a subsample and its DT-CWT subbands at levels 1, 2, and 3, shown for angles 15, 45, and 75]

Feature extraction
• Step 4: Mean and std
Level 1  μ1: 0.082091 0.084891 0.060045 0.080689 0.085836 0.060873
         σ1: 0.14791 0.15201 0.11201 0.14617 0.15402 0.11424
Level 2  μ2: 0.22597 0.24064 0.11976 0.23731 0.24072 0.12753
         σ2: 0.36203 0.35692 0.17401 0.37765 0.34842 0.19024
Level 3  μ3: 0.49943 0.54883 0.35954 0.55623 0.56736 0.30949
         σ3: 0.6949 0.65361 0.46078 0.72141 0.68851 0.39779
• Step 5: Concatenate
Φ = [μ1, μ2, μ3, σ1, σ2, σ3] (1x36 feature vector)
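
The last two steps can be sketched as follows, assuming each of the three levels yields six complex subbands (one per orientation) stored as an array of shape (height, width, 6), and that the statistics are taken over coefficient magnitudes. The array layout, the use of magnitudes, and the name `feature_vector` are assumptions. The random arrays below only stand in for real DT-CWT output.

```python
import numpy as np

def feature_vector(subbands):
    """Build Φ = [μ1, μ2, μ3, σ1, σ2, σ3] from per-level subband arrays.

    subbands: list of 3 complex arrays, each of shape (height, width, 6).
    Returns a 1-D array of length 36 (3 levels x 6 orientations x 2 statistics).
    """
    magnitudes = [np.abs(band) for band in subbands]
    means = [m.mean(axis=(0, 1)) for m in magnitudes]   # μ1, μ2, μ3
    stds = [m.std(axis=(0, 1)) for m in magnitudes]     # σ1, σ2, σ3
    return np.concatenate(means + stds)

# Stand-in data: random complex "subbands" whose size halves at each level.
rng = np.random.default_rng(0)
bands = [rng.standard_normal((s, s, 6)) + 1j * rng.standard_normal((s, s, 6))
         for s in (48, 24, 12)]
phi = feature_vector(bands)
print(phi.shape)
```

Three levels times six orientations times two statistics gives the 36 entries stated on the slide, with all means first and all standard deviations after them.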

Results: English Font Recognition
• Dataset
– Printscreen, small natural noise, artificial noise, large natural noise
– 1 paragraph per font/emphasis pair
– 8 fonts: Arial, Bookman, Century Gothic, Comic Sans, Courier, Computer Modern, Impact, Times New Roman

Results: English Font Recognition
• Competition
– Proposed: Preprocessing: Otsu’s method; Subsampling: variable; Feature: mean, std of CWT; Classifier: SVM (one against one)
– Aviles-Cruz: Preprocessing: text line detection, normalization, texture formation; Subsampling: 100 random 64x64 subsamples; Feature: skewness & kurtosis; Classifier: EM trained Bayes classifier
– Ramanathan: Preprocessing: normalization, Otsu’s method; Subsampling: 3x3 grid; Feature: mean, std, max of Gabor responses; Classifier: SVM (one against all)

Low Natural Noise
Font   Proposed   Aviles-Cruz   Ramanathan
A      96.88      81.75         100
B      100        87            100
CG     98.45      69.75         97.22
CS     100        75.5          100
C      100        96.25         100
I      100        99            100
M      100        97            100
T      100        91            100
Mean   99.41625   87.15625      99.6525
[Figure: radar chart of per-font accuracy for Proposed, Aviles-Cruz, and Ramanathan; axes A, B, CG, CS, C, I, M, T, and Mean]
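
The Mean row appears to be the unweighted average of the eight per-font accuracies. A quick check, with values copied from the Low Natural Noise results and variable names chosen here for illustration, reproduces it:

```python
import numpy as np

# Per-font accuracies (%) in the order A, B, CG, CS, C, I, M, T.
accuracies = {
    "Proposed":    [96.88, 100, 98.45, 100, 100, 100, 100, 100],
    "Aviles-Cruz": [81.75, 87, 69.75, 75.5, 96.25, 99, 97, 91],
    "Ramanathan":  [100, 100, 97.22, 100, 100, 100, 100, 100],
}

# Unweighted mean over the eight fonts for each method.
means = {name: float(np.mean(values)) for name, values in accuracies.items()}
for name, value in means.items():
    print(f"{name}: {value:.5f}")
```

The computed averages agree with the reported means (99.41625, 87.15625, and 99.6525), so every font carries equal weight regardless of how many subsamples it contributed.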

Low Natural Noise + Artificial Noise
Font   Proposed   Aviles-Cruz   Ramanathan
A      95.31      78.25         97.22
B      100        83            100
CG     98.44      67.5          97.22
CS     100        73            100
C      100        91.5          97.22
I      98.44      98.5          100
M      100        91.25         100
T      98.44      79.25         97.22
Mean   98.82875   82.78125      98.61
[Figure: radar chart of per-font accuracy; axes A, B, CG, CS, C, I, M, T, and Mean]

High Natural Noise
Font   Proposed   Aviles-Cruz   Ramanathan
A      98.44      -             91.67
B      98.44      -             88.89
CG     92.19      -             94.44
CS     100        -             97.22
C      100        -             94.44
I      100        -             94.44
M      98.44      -             88.88
T      98.44      -             100
Mean   98.24375   -             93.7475
[Figure: radar chart of per-font accuracy for Proposed, Aviles-Cruz, and Ramanathan; axes A, B, CG, CS, C, I, M, T, and Mean]


Recognition Means
Dataset                                Proposed   Aviles-Cruz   Ramanathan
Printscreen                            100        100           100
Low Natural Noise                      99.41625   87.15625      99.6525
Low Natural Noise + artificial noise   98.82875   82.78125      98.61
High Natural Noise                     98.24375   -             93.7475

Results: Farsi Font Recognition
• Dataset
– Small natural noise
– 1 paragraph per font/emphasis pair
– 10 fonts: Homa, Lotus, Mitra, Nazanin, Tahoma, Times New Roman, Titr, Traffic, Yaghut, and Zar

[Figure: sample texts. a: Lotus italic, b: Homa bold italic, c: Times New Roman bold]

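• Competition
– Proposed: Feature: mean, std of CWT; Classifier: SVM (one against one)
– Khosravi and Kabir: Preprocessing: text line detection, normalization, texture formation; Subsampling: 4x4 grid; Feature: mean, std of Sobel-Roberts; Classifier: AdaBoost
– Senobari and Khosravi: Preprocessing: yes, but not explained; Subsampling: 128x128 size subsamples; Feature: PCA of Sobel, Roberts, Symlet wavelets; Classifier: MLP classifier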

Low Natural Noise
Font   Proposed   Khosravi   Senobari
L      92.2       92.2       90.7
M      95.3       93.4       93.7
N      90.6       85.2       92
TR     98.4       97.6       95.9
Y      96.9       97.6       98.5
Z      92.2       87.4       90.9
H      100        99.2       99.8
TI     100        95.2       97
T      100        96.6       98.3
TN     98.4       97.2       98.8
Mean   96.41      94.16      95.56
[Figure: radar chart of per-font accuracy for Proposed, Khosravi, and Senobari; axes L, M, N, TR, Y, Z, H, TI, T, TN, and Mean]

Results: Arabic Font Recognition
• Dataset
– ALPH-REGIM database
– 749 samples of varying size and length
– 10 fonts: Ahsa, Andalus, Arabic_transparant, Badr, Buryidah, Dammam, Hada, Kharj, Koufi, Naskh

[Figure: sample texts. a: Ahsa, b: Badr, c: Naskh, d: Dammam]

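• Competition
– Proposed: Feature: mean, std of CWT; Classifier: SVM (one against all)
– Ben Moussa: Preprocessing: no; Subsampling: no; Feature: fractal based; Classifier: NN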

ALPH-REGIM Database
Font   Proposed   Ben Moussa
AH     99.633     94
AN     98.1595    94
AT     99.734     92
B      99.5968    100
BU     98.2955    100
D      99.8592    100
H      90.4424    100
K      90.4037    88
KO     99.3478    98
N      98.2418    98
Mean   97.3714    96.4
[Figure: radar chart of per-font accuracy for Proposed and Ben Moussa; axes AH, AN, AT, B, BU, D, H, K, KO, N, and Mean]

Results: Ottoman Style Recognition
• Dataset
– Ottoman Archives
– 6 pages per style
– Different backgrounds
– 5 styles: Divani, Nesih, Matbu, Talik, Rika

[Figure: sample pages. a: Divani, b: Matbu, c: Nesih, d: Rika, e: Talik]

Results: Ottoman Font Recognition

Conclusion
• New feature for font recognition:
– Mean and std of 3-level CWT
– Higher accuracy than the state of the art on English, Farsi, and Arabic fonts
– Faster than the state of the art
– Robust to noise
– Performs well on Ottoman texts

References
[1] Abuhaiba, I., 2004. Arabic font recognition using decision trees built from common words. Journal of Computing and Information Technology 13 (3), 211–224.
[2] Amin, A., 1998. Off-line Arabic character recognition: the state of the art. Pattern Recognition 31 (5), 517–530.
[3] Aviles-Cruz, C., Rangel-Kuoppa, R., Reyes-Ayala, M., Andrade-Gonzalez, A., Escarela-Perez, R., 2005. High-order statistical texture analysis-font recognition applied. Pattern Recognition Letters 26 (2), 135–145.
[4] Ben Moussa, S., Zahour, A., Benabdelhafid, A., Alimi, A., 2008. Fractal-based system for Arabic/Latin, printed/handwritten script identification. In: Pattern Recognition, 2008. ICPR 2008. 19th International Conference on. IEEE, pp. 1–4.
[5] Borji, A., Hamidi, M., 2007. Support vector machine for Persian font recognition. International Journal of Intelligent Systems and Technologies, 184–187.
[6] Boser, B., Guyon, I., Vapnik, V., 1992. A training algorithm for optimal margin classifiers. In: Proceedings of the fifth annual workshop on Computational learning theory. ACM, pp. 144–152.
[7] Cai, S., Li, K., Selesnick, I., ???? Matlab implementation of wavelet transforms. Tech. rep., Polytechnic University.
[8] Chang, C., Lin, C., 2011. Libsvm: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST) 2 (3), 27.
[9] Chaudhuri, B., Garain, U., 1998. Automatic detection of italic, bold and all-capital words in document images. In: Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on. Vol. 1. IEEE, pp. 610–612.
[10] Cortes, C., Vapnik, V., Sep. 1995. Support-vector networks. Mach. Learn. 20 (3), 273–297.
[11] Duan, K., Keerthi, S., 2005. Which is the best multiclass SVM method? An empirical study. Multiple Classifier Systems, 732–760.
[12] Hsu, C., Chang, C., Lin, C., et al., 2003. A practical guide to support vector classification.
[13] Jung, M., Shin, Y., Srihari, S., 1999. Multifont classification using typographical attributes. In: Document Analysis and Recognition, 1999. ICDAR’99. Proceedings of the Fifth International Conference on. IEEE, pp. 353–356.
[14] Khosravi, H., Kabir, E., 2010. Farsi font recognition based on Sobel-Roberts features. Pattern Recognition Letters 31 (1), 75–82.
[15] Kingsbury, N., 1997. Image processing with complex wavelets. Phil. Trans. Royal Society London A 357, 2543–2560.
[16] Kingsbury, N., 1998. The dual-tree complex wavelet transform: a new efficient tool for image restoration and enhancement. In: Proc. EUSIPCO. Vol. 98. pp. 319–322.
[17] Kingsbury, N., 2000. A dual-tree complex wavelet transform with improved orthogonality and symmetry properties. In: Image Processing, 2000. Proceedings. 2000 International Conference on. Vol. 2. IEEE, pp. 375–378.
[18] Ma, H., Doermann, D., 2003. Gabor filter based multi-class classifier for scanned document images. In: 7th International Conference on Document Analysis and Recognition (ICDAR). pp. 968–972.
[19] Otsu, N., 1979. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man and Cybernetics 9 (1), 62–66.
[20] Petkov, N., Wieling, M., 2008. Gabor filter for image processing and computer vision. Tech. rep., University of Groningen.
[21] Ramanathan, R., Soman, K., Thaneshwaran, L., Viknesh, V., Arunkumar, T., Yuvaraj, P., Oct. 2009. A novel technique for English font recognition using support vector machines. In: Advances in Recent Technologies in Communication and Computing, 2009. ARTCom ’09. International Conference on. pp. 766–769.
[22] Rashedi, E., Nezamabadi-pour, H., Saryzadi, S., 2007. Farsi font recognition using correlation coefficients (in Farsi). In: 4th Conf. on Machine Vision and Image Processing, Ferdosi Mashhad.
[23] Selesnick, I., Baraniuk, R., Kingsbury, N., 2005. The dual-tree complex wavelet transform. Signal Processing Magazine, IEEE 22 (6), 123–151.
[24] Villegas-Cortez, J., Aviles-Cruz, C., 2005. Font recognition by invariant moments of global textures. In: Proceedings of international workshop VLBV05 (very low bit-rate video-coding 2005). pp. 15–16.
[25] Zhu, Y., Tan, T., Wang, Y., Oct. 2001. Font recognition based on global texture analysis. IEEE Trans. Pattern Anal. Mach. Intell. 23 (10), 1192–1200.
[26] Zramdini, A., Ingold, R., 1998. Optical font recognition using typographical features. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 877–882.
