1. Text detection in video images using adaptive edge detection and stroke width verification
Haojin Yang, Bernhard Quehl, Harald Sack
April 11 – 13, IWSSIP2012, Vienna (Austria)
Hasso Plattner Institute | H-J. Yang, B.Quehl, H. Sack
2. Agenda
(1) Introduction / Motivation
(2) Related works
(3) Text detection in video frames
(4) Evaluation and experimental results
(5) Conclusion
Jörg Waitelonis, HPI | THESEUS | Innovationszentrum | 28.-29.11.2012
3. Project Mediaglobe
• Semantic Search Engine for Media Archives
• Enable exploratory and semantic search in
Audiovisual Media Archives
http://www.projekt-mediaglobe.de/
5. Automated Audiovisual Analysis
Diagram of the analysis components: concept classification (e.g. Studio, Indoor, News Show), logo detection, overlay text detection, face detection, scene analysis, structural analysis, audio mining, automated speech recognition, speaker identification.
6. Common OCR vs. Video OCR
Common OCR:
• optimized for scans
• high resolution
• usually white on black
• homogeneous background
Video OCR:
• low resolution
• heterogeneous background
• (motion) blurring
• perspective distortion
• uneven illumination
• shading, rotation
• large amounts of data (images)
7. Related Works
Most proposed text detection methods make use of texture features, edges, colors,
and text-specific features such as stroke width.
Chen et al. [4], Text detection and recognition in images and video frames:
• edge-based approaches achieve a high recall rate
• but may also produce many false alarms
Epshtein et al. [1] proposed the SWT (Stroke Width Transform) for text detection in natural
scene images:
• Robust at distinguishing text from text-like non-text objects
Shortcoming of the original SWT approach:
• Computing the SWT is quite costly for images with complex content.
8. Text Detector
Workflow of the edge-based text detector:
(a) Original image
(b) Vertical edge map
(c) Vertical dilation map
(d) Binary map of (c)
(e) Binary map after connected component analysis
(f) After projection-profiling refinement
(g) Detection result
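The workflow steps (b) to (f) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the forward-difference edge operator, the dilation radius, the fixed binarization threshold, and the `min_fill` ratio are all assumptions chosen for readability, and the adaptive part of the method is not modeled.

```python
from collections import deque

def vertical_edge_map(img):
    """Absolute horizontal forward difference; responds to vertical edges."""
    w = len(img[0])
    return [[abs(row[min(x + 1, w - 1)] - row[x]) for x in range(w)]
            for row in img]

def dilate_vertical(edges, radius=1):
    """Grey-level dilation along the vertical axis only (assumed radius)."""
    h, w = len(edges), len(edges[0])
    return [[max(edges[yy][x]
                 for yy in range(max(0, y - radius), min(h, y + radius + 1)))
             for x in range(w)] for y in range(h)]

def binarize(img, threshold):
    return [[1 if v >= threshold else 0 for v in row] for row in img]

def connected_components(binary):
    """Bounding boxes (x0, y0, x1, y1) of 4-connected foreground regions."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < w and 0 <= ny < h
                    if inside and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

def refine_by_projection(binary, box, min_fill=0.2):
    """Keep only rows whose horizontal projection is dense enough."""
    x0, y0, x1, y1 = box
    width = x1 - x0 + 1
    rows = [y for y in range(y0, y1 + 1)
            if sum(binary[y][x0:x1 + 1]) / width >= min_fill]
    return (x0, rows[0], x1, rows[-1]) if rows else None

def detect_text_boxes(img, threshold=100):
    """img is a 2D list of grey values; returns refined candidate boxes."""
    binary = binarize(dilate_vertical(vertical_edge_map(img)), threshold)
    refined = (refine_by_projection(binary, b)
               for b in connected_components(binary))
    return [b for b in refined if b is not None]
```

Calling `detect_text_boxes(img)` on a grey-value image given as a list of rows returns one bounding box per surviving candidate region. In practice an image library would replace these loops.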
9. Text Verification – Workflow
10. SWT Based Text Verification
Stroke Width Transformation:
(a) Boundary detection.
(b) From each boundary pixel p, send a ray along the gradient direction until it reaches
another boundary pixel q.
(c) Calculate the potential stroke width value between p and q.
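Steps (b) and (c) can be illustrated with a small ray-casting sketch. It assumes the boundary map and the gradient direction at p are already available from step (a); the function names, the `max_steps` limit, and the use of the Euclidean distance as the stroke width are illustrative assumptions rather than details taken from the slides.

```python
import math

def cast_ray(boundary, p, direction, max_steps=50):
    """Walk from boundary pixel p along `direction` until another
    boundary pixel q is reached; return q, or None if no pixel is hit."""
    h, w = len(boundary), len(boundary[0])
    px, py = p
    dx, dy = direction
    norm = math.hypot(dx, dy)
    if norm == 0:
        return None  # no usable gradient direction at p
    dx, dy = dx / norm, dy / norm
    for step in range(1, max_steps + 1):
        qx = int(round(px + dx * step))
        qy = int(round(py + dy * step))
        if not (0 <= qx < w and 0 <= qy < h):
            return None  # ray left the image
        if (qx, qy) != (px, py) and boundary[qy][qx]:
            return (qx, qy)
    return None

def stroke_width(p, q):
    """Potential stroke width: Euclidean distance between p and q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

# Usage: a vertical stroke whose two boundaries lie at x = 2 and x = 6.
boundary = [[x in (2, 6) for x in range(10)] for _ in range(5)]
q = cast_ray(boundary, (2, 2), (1.0, 0.0))
width = stroke_width((2, 2), q) if q is not None else None
```

A fuller version would write the width to every pixel along the ray and keep the minimum per pixel when rays cross; that bookkeeping is omitted here.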
11. SWT Based Text Verification
Example: output image of the stroke width transform for the character "w".
12. SWT Based Text Verification
SWT verification constraints:
A text candidate component is discarded if:
• its stroke width variance lies outside the (MinVar, MaxVar) threshold range
• its mean stroke width lies outside the (MinStroke, MaxStroke) threshold range
The remaining components are then grouped:
• Character components are generated by merging candidate
components with similar stroke width values.
• Character chains are then created by merging character components with
a similar color and a small distance.
• The final verified text line must have more than 2 character chains.
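The threshold test and the chain-building step can be sketched as below. The sketch assumes that a component is kept only when both its stroke width variance and its mean stroke width fall inside their threshold ranges. All numeric values, the dictionary layout of a component, and the greedy left-to-right grouping are hypothetical; the slides name the thresholds but give no values or grouping details.

```python
from statistics import mean, pvariance

# Hypothetical threshold values, chosen only for illustration.
MIN_VAR, MAX_VAR = 0.0, 4.0
MIN_STROKE, MAX_STROKE = 1.0, 20.0

def passes_swt_constraints(stroke_widths):
    """True if variance and mean of the stroke widths are both in range."""
    if not stroke_widths:
        return False
    var = pvariance(stroke_widths)
    avg = mean(stroke_widths)
    return MIN_VAR <= var <= MAX_VAR and MIN_STROKE <= avg <= MAX_STROKE

def build_chains(components, max_gap=10, max_color_diff=30):
    """Greedy grouping of components (dicts with x0, x1, color) into
    chains by horizontal gap and grey-level colour difference."""
    chains = []
    for comp in sorted(components, key=lambda c: c["x0"]):
        last = chains[-1][-1] if chains else None
        if (last is not None
                and comp["x0"] - last["x1"] <= max_gap
                and abs(comp["color"] - last["color"]) <= max_color_diff):
            chains[-1].append(comp)
        else:
            chains.append([comp])
    return chains

# Usage: three nearby components of similar colour and one distant one.
components = [
    {"x0": 0, "x1": 8, "color": 200},
    {"x0": 12, "x1": 20, "color": 205},
    {"x0": 24, "x1": 30, "color": 198},
    {"x0": 90, "x1": 99, "color": 40},
]
chains = build_chains(components)
```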
13. SWT Based Text Verification
Example: edge detection → projection profiling → SWT text verification on the profiling candidates
14. Evaluation and Experimental Results
Experiment setup:
Test set:
• Mediaglobe test set (31 images)
• German TV news test set (72 images)
• Microsoft common test set (45 images)
15. Evaluation Results
• Evaluation on the Microsoft common test set
Method                    Recall  Precision  F1 measure
Zhao et al. [10]          0.94    0.98       0.96
Thillou et al. [11]       0.91    0.94       0.92
Lienhard et al. [12]      0.91    0.94       0.92
Shivakumara et al. [4]    0.92    0.90       0.91
Gllavata et al. [13]      0.90    0.87       0.88
Our method                0.93    0.94       0.93
• Evaluation on the other test sets
Test set                  Recall  Precision  F1 measure
TV News                   0.86    0.81       0.83
Mediaglobe                0.75    0.81       0.77
• Example images: http://yovisto.com/labs/VideoOCR/visualResult/
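For reference, the F1 measure is the harmonic mean of recall and precision. The helper below is a generic sketch of that formula, not the evaluation code used for the results; recomputing from the rounded table values may differ in the last digit from a score computed on unrounded inputs.

```python
def f1_measure(recall, precision):
    """Harmonic mean of recall and precision (0.0 if both are zero)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)

# Usage with the recall and precision reported for the proposed method
# on the Microsoft common test set.
ours = round(f1_measure(0.93, 0.94), 2)
```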
16. Conclusion
We have presented a localization-verification
scheme for text detection in video images.
• It uses a fast edge-based text detector and an adaptive refinement
to reduce false alarms
• The proposed method is competitive with
other existing methods
• It detects text in different writing systems (English, Japanese, Arabic)
17. References
[1] B. Epshtein, E. Ofek, and Y. Wexler, “Detecting Text in Natural Scenes with Stroke Width Transform,” in
Proc. of Computer Vision and Pattern Recognition, 2010, pp. 2963–2970.
[2] Y. Zhong, H-J. Zhang, and A. Jain, “Automatic caption localization in compressed video,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, pp. 385–392, 2000.
[3] X. Qian, G. Liu, H. Wang, and R. Su, “Text detection, localization and tracking in compressed
video,” in Proc. of Signal Processing: Image Communication, 2007, pp. 752–768.
[4] J. Canny, “A computational approach to edge detection,” IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 8, no. 6, pp. 679–698, 1986. DOI 10.1109/TPAMI.1986.4767851.
URL http://dx.doi.org/10.1109/TPAMI.1986.4767851
[5] http://yovisto.com/labs/VideoOCR/
[6] http://www.cs.cityu.edu.hk/~liuwy/PE_VTDetect/
18. Text detection in video images using adaptive edge detection and stroke width verification
Thank you for your attention!
Bernhard Quehl
Hasso-Plattner-Institut Potsdam
Prof.-Dr.-Helmert Str. 2-4
14482 Potsdam
phone: +49 (0)331-5509-548
email: bernhard.quehl@hpi.uni-potsdam.de
web: http://www.hpi.uni-potsdam.de/