The presentation of the paper entitled "Unsupervised ensemble of experts (EoE) framework for automatic binarization of document images", to be presented at ICDAR 2013, Washington, DC, USA (August 25th-28th, 2013), on August 27th, 2013.
Unsupervised ensemble of experts (EoE) framework for automatic binarization of document images
1. Reza FARRAHI MOGHADDAM, Fereydoun FARRAHI MOGHADDAM and Mohamed CHERIET
Synchromedia Laboratory, ETS, Montreal (QC), Canada H3C 1K3
imriss@ieee.org, rfarrahi@synchromedia.ca,
ffarrahi@synchromedia.ca, mohamed.cheriet@etsmtl.ca
ICDAR 2013, Washington, DC, USA, August 25th-28th, 2013
2. Outline
Why Ensemble of Experts (EoE) framework?
EoE vs. Ensemble of Classifiers (EoC)
The big picture
Notations
Endorsements and the Endorsement Graph
The selection process
Calculation of the EoE result and its variations
Use cases
Conclusions and future prospects
Any questions?
3. Why Ensemble of Experts (EoE) framework?
In recent years, a large number of binarization methods have been developed, but almost all show varying performance, limited generalization, and uneven robustness across different benchmarks.
There is, and will be, no single winning approach in the short (or even the long) term, because of the complexity of the study subjects (document and manuscript images) and also because of new processing goals.
In this work, to leverage all these methods of varying performance and interrelations, the ensemble of experts (EoE) framework is introduced to efficiently combine their outputs into an output of higher performance.
The EoE framework can also be applied to other decision making problems:
Medical image segmentation
Parliament setting
Opinion fraud detection
However, caution should be taken when working with smart experts, such as humans: with prior access to the rules of an EoE-based framework, they could collectively adjust their behavior to win the ensemble’s result.
4. Ensemble of Experts vs. Ensemble of Classifiers
EoE:
An ensemble.
It works on a “set” of problems, not just one problem.
Every member is “free” to devise its own approach to modeling and concluding its opinion on each problem.
It could be seen as an enabler toward featureless approaches.
Performance evaluation is not easy or straightforward.
EoC:
An ensemble.
It (usually) works on one problem at a time.
Every member works on the “regularized” representations of the problem, i.e., the feature vectors.
Performance comparison is more accurate and trustworthy because of the regularization approach used.
5. Basics of the EoE framework
The proposed EoE framework offers a new process for selecting experts from an ensemble by introducing three concepts: confidentness, endorsement, and schools of experts.
The EoE framework combines the outputs of an ensemble of related and unrelated experts, using consolidation and selection concepts, toward a less-biased opinion.
Endorsement graph: defined based on the relations among the confidentness of the experts on their own opinions across the ensemble.
Two generic selection principles:
Consolidation of saturated opinions
Selection of schools of experts
For binarization methods, which lack the confidentness values, a confidentness map is defined.
After building the endorsement graph of the ensemble for an input document image based on the confidentness of the
experts, the saturated opinions are consolidated, and then the schools of experts are identified by thresholding the
consolidated endorsement graph.
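The thresholding step above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `schools_of_experts`, the `threshold` and `min_size` parameters, and the rule that two experts are linked only when each endorses the other at or above the threshold are all hypothetical choices. The matrix convention assumed here is that `endorsement[b, a]` holds the endorsement of expert a by expert b. Schools are then taken to be the connected components of the thresholded graph.

```python
import numpy as np

def schools_of_experts(endorsement, threshold=0.5, min_size=2):
    """Group experts into schools by thresholding an endorsement matrix.

    endorsement[b, a] is assumed to be the endorsement of expert a by expert b.
    Returns a list of schools, each a sorted list of expert indices.
    """
    e = np.asarray(endorsement, dtype=float)
    n = e.shape[0]
    # Assumed linking rule: keep an edge only if both directions pass the threshold.
    linked = (e >= threshold) & (e.T >= threshold)
    np.fill_diagonal(linked, False)

    seen, schools = set(), []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, members = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            members.append(node)
            for nxt in np.flatnonzero(linked[node]):
                nxt = int(nxt)
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if len(members) >= min_size:
            schools.append(sorted(members))
    return schools
```

With `min_size=2`, experts that no one endorses strongly are left out of every school, which matches the idea of selecting only school members for the final result.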
The framework was successfully applied to the H-DIBCO’12 dataset. However, it is not limited to handwritten documents.
A variation of the framework (called EwEoE) is also introduced, in which no selection is made and the outputs of all experts are combined using endorsement-dependent weights.
Many aspects of the proposed framework could be improved.
6. EoE Framework: The Big Picture
The EoE framework is based on three concepts: Confidentness, Endorsement, and Schools of Experts.
0. Assemble the Ensemble of Experts
1. Acquire the Set of Problems
2. Get the Opinions of experts on problems
3. Calculate the Confidentness of each expert on each problem
4. Calculate the Endorsement Graph among experts
5.1 Consolidate highly-similar experts (Reduce Bias)
5.2 Calculate the Schools of Experts (clusters of experts)
6. Calculate the EoE result by considering only members of the schools
7. Go back to step 1 to process a new set of problems
7. Notations and application of the EoE Framework to document binarization
Currently, binarization methods do not provide any estimate of their confidentness on individual pixels.
No. | EoE framework notation | Equivalent in document binarization
1 | An expert | A binarization method
2 | An Ensemble of Experts | A set of binarization methods (can be the same method with different parameters)
3 | A problem | Binarization of a pixel
4 | A set of problems | Binarization of an image as a set of pixels
5 | Opinion of an expert on a problem | Binarization value of a method on a pixel
6 | Confidentness of an expert on its opinion | <<To Be Defined>>
7 | Endorsement (of expert A by expert B) | Endorsement (of method A by method B)
8 | Endorsement graph | Endorsement graph
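Since the confidentness of a binarization method is left to be defined here, the following is a purely hypothetical stand-in, shown only to make the notion concrete. It assumes that a method is more confident on a pixel when the surrounding pixels in its own binary output carry the same label. The function name `confidentness_map` and the `radius` parameter are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def confidentness_map(binary, radius=1):
    """Hypothetical confidentness: local agreement with a pixel's own label.

    binary: 2-D array of 0/1 labels produced by one binarization method.
    Returns a map in [0, 1]; 1 means the whole window shares the pixel's label.
    """
    b = np.asarray(binary, dtype=float)
    h, w = b.shape
    size = 2 * radius + 1
    # Replicate border pixels so every window has the same number of samples.
    padded = np.pad(b, radius, mode="edge")

    # Sum the window by accumulating shifted copies of the padded image.
    local_sum = np.zeros_like(b)
    for dy in range(size):
        for dx in range(size):
            local_sum += padded[dy:dy + h, dx:dx + w]
    ones_fraction = local_sum / size ** 2

    # Fraction of window pixels (the pixel itself included) that agree with it.
    return np.where(b == 1, ones_fraction, 1.0 - ones_fraction)
```

Under this assumed definition, isolated pixels (likely noise) receive low confidentness, while pixels deep inside uniform regions receive values close to 1.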
8. Endorsement Graph Weights
The relation among the confidentness maps on all pixels is used to define the weight of the corresponding edge on the endorsement graph.
[Equation not recovered. Its annotated parts: the confidentness of a on pixel i, masked by that of b; the endorsement of a by b.]
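One way such edge weights could be computed from confidentness maps is sketched below. The exact formula used in the paper is not recoverable from this slide, so the definition here is an assumption: the confidentness of a is multiplied (masked) by that of b on every pixel, summed, and normalized by the total confidentness of b. The function name `endorsement_matrix` and the convention `E[b, a]` for the endorsement of a by b are likewise illustrative.

```python
import numpy as np

def endorsement_matrix(conf_maps):
    """Assumed endorsement weights from per-expert confidentness maps.

    conf_maps: array of shape (n_experts, H, W) with values in [0, 1].
    Returns E where E[b, a] = sum_i C_a(i) * C_b(i) / sum_i C_b(i).
    """
    c = np.asarray(conf_maps, dtype=float)
    c = c.reshape(c.shape[0], -1)          # one row of pixels per expert
    overlap = c @ c.T                      # overlap[b, a] = sum_i C_b(i) * C_a(i)
    mass = c.sum(axis=1)                   # mass[b] = sum_i C_b(i)

    e = np.zeros_like(overlap)
    nonzero = mass > 0                     # experts with no confidentness endorse no one
    e[nonzero] = overlap[nonzero] / mass[nonzero][:, None]
    np.fill_diagonal(e, 0.0)               # ignore self-endorsement
    return e
```

Note that this assumed weight is not symmetric: an expert that is confident on few pixels can fully endorse a broader expert while receiving only partial endorsement in return.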
9. EoE and EwEoE means
The selection process
EoE-adjusted mean output
EwEoE-adjusted mean output
“Regular” mean output
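The three outputs named on this slide can be contrasted with a small sketch. The formulas on the slide were not recovered, so the code below rests on assumptions: the "regular" mean averages all experts equally, the EoE-adjusted mean averages only experts that belong to some school, and the EwEoE-adjusted mean weights every expert by the total endorsement it receives. The function names and the convention that `endorsement[b, a]` is the endorsement of a by b are hypothetical.

```python
import numpy as np

def regular_mean(opinions):
    """Plain average of all experts' binary outputs, shape (n_experts, H, W)."""
    return np.asarray(opinions, dtype=float).mean(axis=0)

def eoe_mean(opinions, schools):
    """Average over experts that belong to at least one school (assumed rule)."""
    o = np.asarray(opinions, dtype=float)
    selected = sorted({k for school in schools for k in school})
    if not selected:
        return o.mean(axis=0)              # no school found: fall back to all experts
    return o[selected].mean(axis=0)

def eweoe_mean(opinions, endorsement):
    """Weighted average using total endorsement received as weight (assumed rule)."""
    o = np.asarray(opinions, dtype=float)
    e = np.asarray(endorsement, dtype=float)
    weights = e.sum(axis=0)                # column a: endorsement received by expert a
    if weights.sum() == 0:
        weights = np.ones(o.shape[0])      # no endorsement at all: equal weights
    weights = weights / weights.sum()
    return np.tensordot(weights, o, axes=1)
```

Each function returns a soft map in [0, 1]; a final binary image could then be obtained by thresholding it, for example at 0.5.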
10. An example of a highly-biased ensemble
84 experts using the Gb Sauvola method[1]
[Figure panels, matrix view: 1. The Endorsement Matrix; 2. Consolidated Endorsement Matrix; 3. The selected experts]
[Figure panels, graph view: 1. The Endorsement Graph; 2. Consolidated Endorsement Graph; 3. The selected experts (Graph)]
[1] Farrahi Moghaddam, Reza, and Mohamed Cheriet. "A multi-scale framework for adaptive binarization of degraded document
images." Pattern Recognition 43, no. 6 (2010): 2186-2198. DOI: 10.1016/j.patcog.2009.12.024
11. EoE Framework Performance (1): H-DIBCO’12 Ensemble on H-DIBCO’12 dataset
[Figure panels: Original Endorsement Graph of H-DIBCO’12 for H12; Final Schools of Experts for H12; The performance]
12. EoE Framework Performance (2): Gb Sauvola ensemble (84 experts) on H-DIBCO’12 dataset
[Figure panels: Original Endorsement Graph of H-DIBCO’12 for H12; Final Schools of Experts for H12; EoE output for H05; “Regular” output for H05; The performance]
13. EoE Framework Performance (3): Laplacian-energy [2] ensemble on H-DIBCO’12 dataset
[Figure panels: The performance; H-DIBCO’12:H05; H-DIBCO’12:H09; H-DIBCO’12:H14]
[2] Howe, Nicholas R. "Document binarization with automatic parameter tuning." International Journal on Document Analysis and
Recognition (IJDAR) (2012): 1-12. DOI: 10.1007/s10032-012-0192-x
14. Conclusions: The EoE framework
Summary
The ensemble of experts (EoE) framework is introduced to efficiently combine the opinions of expert methods on a set of problems.
It is based on:
Confidentness
Endorsement
Schools of experts
The EoE framework combines the outputs of an ensemble of related and unrelated experts, using consolidation and selection concepts, toward reducing the bias of opinions.
The endorsement graph is defined based on the confidentness of the experts.
Two generic principles of the EoE framework:
Consolidation of saturated opinions
Selection of schools of experts
It has been applied to the H-DIBCO’12 database using various ensembles of experts: H-DIBCO’12 participants, Gb Sauvola, and Laplacian-energy.
Future Prospects
Generalization to other decision-making problems:
Medical image segmentation
Parliament setting
Opinion fraud detection
Improving the selection processes:
Especially the consolidation step
Adding another level of selection by selecting one school out of all the EoE schools
Improving the endorsement definition
Standardization of the confidentness value as the secondary output of an expert (a binarization method) in addition to its opinion value (binary output).