New Challenges in Learning Classifier Systems: Mining Rarities and Evolving Fuzzy Rules
1. New Challenges in Learning Classifier Systems: Mining Rarities and Evolving Fuzzy Rules
Student: Albert Orriols-Puig
Supervisor: Ester Bernadó-Mansilla
Grup de Recerca en Sistemes Intel·ligents
Enginyeria i Arquitectura La Salle
Universitat Ramon Llull
2. Background
GRSI conducts research on machine learning and data mining
Especially focused on data classification
Research aims at
Improving learning methods
Applying learning methods to real-world applications
Application of LCS to classification problems is one of the main research lines
LCS are appealing because they mine streams of examples
Many applications make the data available in streams
Important challenges need to be addressed to deal with complex applications
Slide 2
Grup de Recerca en Sistemes Intel·ligents New Challenges in LCS
3. Background
General schema of LCSs
Introduced by Holland
[Diagram: general schema of an LCS]
Environment: provides the sensorial state and feedback, and receives the action
Classifier system: Classifier 1, Classifier 2, …, Classifier n
Any representation: production rules, genetic programs, perceptrons, SVMs
Learning (online rule evaluator): apportionment of credit algorithms
XCS: Q-Learning (Sutton & Barto, 1998); uses the Widrow-Hoff delta rule
Rule evolution (evolutionary algorithm): typically, a GA (Holland, 75; Goldberg, 89) applied to the population
4. When this Work Started
In 2004, when Michigan-style LCSs were reaching maturity
First successful implementations (Wilson, 95; Wilson, 98)
Many derivations: YCS, UCS, XCSF, and others
Applications in important domains
Data mining (Bernadó et al., 02; Wilson, 02a; Bacardit & Butz, 04)
Function approximation (Wilson, 02b)
Reinforcement learning (Lanzi, 02)
Theoretical analyses for design (Butz et al., 02, 03, 04b)
But still, there are important challenges to face
5. Two Key Challenges in ML and LCSs
1st challenge: Learning from domains that contain rare classes
Data classification: Extract interesting, useful, and hidden patterns
The most interesting knowledge resides in rare classes
Example: fraud detection in credit card transactions
Can learners model rare classes accurately? Maybe not!
[Diagram: Dataset → Learner → Knowledge model; the learner minimizes learning error and maximizes generalization]
What about online learning?
More challenging: Model rare classes on the fly
Aim: Analyze and improve LCS for mining domains with rarities
6. Two Key Challenges in ML and LCSs
2nd challenge: Building more understandable models and bringing reasoning mechanisms closer to human ones
In some domains, interpretability is more important than accuracy
LCSs most often use interval-based rules in domains described by
continuous variables
Variables are “semantic-free”
Analyses of the inference mechanisms are scarce
Fuzzy logic provides a robust framework for
knowledge representation and
reasoning under uncertainty
Some fuzzy LCS approaches already exist
But no online fuzzy LCS for supervised learning has been designed
Aim: Incorporate fuzzy logic into LCS for supervised learning
7. Goal of this Work
General goal: Address the two challenges with
The extended classifier system (XCS) (Wilson, 95, 98)
By far, the most influential Michigan-style LCS
The supervised classifier system (UCS) (Bernadó-Mansilla, 03)
Inherits XCS’s architecture and specializes it for data classification
Two challenges with two LCSs that lead to four objectives
Challenge: XCS and UCS
Objective 1. Revise and update UCS and compare it with XCS
Challenge: LCS and rare classes
Objective 2. Analyze and improve LCS for mining rarities
Objective 3. Apply LCSs for extracting models from real-world classification problems with rarities
Challenge: Fuzzy logic in LCS
Objective 4. Design and implement an LCS with fuzzy logic reasoning for supervised learning
8. Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets For Supervised Learning
7. Conclusions and Further Work
9. Description of XCS
In training mode for single step tasks (Wilson, 95)
Designed for reinforcement learning:
Error: Error of the predicted payoff
Fitness: Computed as a function of the error
[Diagram: XCS training cycle]
Each classifier in the population [P] stores: C, A, P, ε, F, num, as, ts, exp
1. The environment provides a problem instance
2. Match set generation: the classifiers of [P] that match the instance form the match set [M]
3. An action is selected randomly (random action)
4. The classifiers of [M] that advocate the selected action form the action set [A]
5. The environment returns a reward; the classifier parameters are updated (Widrow-Hoff rule), with fitness sharing
6. Genetic algorithm: selection, reproduction, and mutation (competition in the niche), plus deletion
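The parameter update named in the diagram can be sketched as follows. This is a minimal illustration assuming the usual XCS formulation; the function names, the default values of `beta`, `eps0`, `alpha`, and `nu`, and the order of the two updates are assumptions made for the example, not details given in the slides.

```python
def widrow_hoff_update(prediction, error, reward, beta=0.2):
    """One Widrow-Hoff (delta rule) step for the payoff prediction and its error."""
    # Assumption: the error estimate is updated before the prediction.
    new_error = error + beta * (abs(reward - prediction) - error)
    new_prediction = prediction + beta * (reward - prediction)
    return new_prediction, new_error


def accuracy(error, eps0=10.0, alpha=0.1, nu=5.0):
    """Accuracy as a decreasing function of the error; fitness is derived from it."""
    if error < eps0:
        return 1.0  # classifiers below the error threshold count as fully accurate
    return alpha * (error / eps0) ** (-nu)


# Example: a fresh classifier receives the maximum payoff once.
p, e = widrow_hoff_update(prediction=0.0, error=0.0, reward=1000.0)
print(p, e, accuracy(e))
```

Both estimates move a fraction `beta` of the way toward their targets, so a classifier needs several updates before its error reflects its true behavior.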
10. Description of UCS
In training mode (Bernadó-Mansilla & Garrell, 03)
[Diagram: UCS training cycle]
Each classifier in the population [P] stores: C, A, acc, F, num, cs, ts, exp
1. The environment provides a stream of examples: problem instance + output class
2. Match set generation: the classifiers of [P] that match the instance form the match set [M]
3. Correct set generation: the classifiers of [M] that predict the output class form the correct set [C]
4. The classifier parameters are updated as the average of the parameter values; no fitness sharing
5. Genetic algorithm: selection, reproduction, and mutation (competition in the niche), plus deletion
Key differences with respect to XCS
Accuracy computation as average of correct predictions
Exploration of the “correct class” instead of all classes
No fitness sharing
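The first difference, accuracy as the average of correct predictions, can be sketched as below. This is an illustrative sketch only: the class name `UCSClassifier`, its fields, and the choice of fitness as accuracy raised to a power `nu` (the form commonly described for UCS without sharing) are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class UCSClassifier:
    predicted_class: int
    experience: int = 0  # exp: number of match sets the classifier has joined
    correct: int = 0     # number of those in which it predicted the right class

    def update(self, true_class):
        """Record one matched example and whether the prediction was correct."""
        self.experience += 1
        if self.predicted_class == true_class:
            self.correct += 1

    @property
    def accuracy(self):
        """Proportion of correct predictions (a pure average, no learning rate)."""
        return self.correct / self.experience if self.experience else 0.0

    def fitness(self, nu=10):
        """Assumed unshared fitness: accuracy raised to the power nu."""
        return self.accuracy ** nu


cl = UCSClassifier(predicted_class=1)
for true_class in [1, 1, 0, 1]:
    cl.update(true_class)
print(cl.accuracy, cl.fitness(nu=2))
```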
11. Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
12. Fitness Sharing in UCS
Sharing or not sharing, a key difference between XCS and UCS
Goal
Design a fitness sharing scheme
Empirically test whether fitness sharing is beneficial to UCS
Empirically compare XCS with UCS
Incorporate a fitness sharing scheme into UCS
Take inspiration from XCS
Compute the classifier accuracy, then the relative accuracy
And finally, fitness is shared in [M]
[Equations not recovered; their annotated terms are classifier numerosity and learning rate]
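Since the equations did not survive extraction, the following is only a sketch of one XCS-inspired sharing scheme that fits the steps listed above: accuracy, relative accuracy weighted by numerosity, then a learning-rate update. The threshold `acc0`, the constants `alpha`, `nu`, and `beta`, and the dictionary layout are assumptions for illustration.

```python
def share_fitness(match_set, in_correct_set, acc0=0.99, alpha=0.1, nu=10, beta=0.2):
    """Update the fitness of every classifier in [M] with a shared, relative accuracy.

    match_set: list of dicts with keys 'acc', 'num', and 'fitness'.
    in_correct_set: list of booleans, True if the classifier also belongs to [C].
    """
    # Step 1: raw accuracy score k; classifiers outside [C] get no credit.
    scores = []
    for cl, correct in zip(match_set, in_correct_set):
        if not correct:
            k = 0.0
        elif cl["acc"] > acc0:
            k = 1.0
        else:
            k = alpha * (cl["acc"] / acc0) ** nu
        scores.append(k)

    # Step 2: relative accuracy, weighted by numerosity and normalized over [M].
    total = sum(k * cl["num"] for k, cl in zip(scores, match_set))

    # Step 3: move fitness toward the relative accuracy at learning rate beta.
    for k, cl in zip(scores, match_set):
        relative = (k * cl["num"] / total) if total > 0 else 0.0
        cl["fitness"] += beta * (relative - cl["fitness"])


rules = [
    {"acc": 1.0, "num": 1, "fitness": 0.0},
    {"acc": 1.0, "num": 3, "fitness": 0.0},
    {"acc": 0.4, "num": 2, "fitness": 0.5},
]
share_fitness(rules, in_correct_set=[True, True, False])
print([round(r["fitness"], 3) for r in rules])
```

Because the relative accuracies sum to one over the niche, accurate classifiers compete for a fixed amount of fitness instead of each earning full credit.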
13. Methodology of Analysis
Analysis divided into two comparisons
1. Compare UCS without fitness sharing (UCSns) and with fitness sharing (UCSs)
2. Compare UCSs with XCS
Comparison on four boundedly-difficult problems, which permit varying the complexity along four dimensions: number of classes, size of the building block, class imbalance, and proportion of noise
The parity problem (par)
The decoder problem (dec)
The position problem (pos)
The 20-bit multiplexer with alternating noise (mux-an)
14. Does Fitness Sharing Benefit UCS?
Fitness sharing provides the following benefits:
Higher pressure toward deletion of over-general classifiers
Higher selective pressure toward the fittest classifiers in [C]
Better results in the four problems: par, dec, pos, and mux-an
[Figure: UCSns vs. UCSs in the decoder problem]
15. Comparison of UCS with XCS
Advantages of UCS due to
The exploration regime
XCS explores all the classes while UCS explores only the “correct” class
The accuracy guidance
XCS may provide misleading guidance toward the fittest classifiers, a problem identified as the fitness dilemma (Butz et al., 2003)
UCS solves this problem by computing accuracy as the proportion of correct predictions
[Figure: UCSs vs. XCS in the decoder problem]
16. Summary of the Comparison
The empirical study has shown that
UCS benefits from a fitness sharing scheme.
Therefore, we use UCSs in the remainder of this work
Key differences between XCS and UCS reviewed and
experimentally analyzed
Explore regime
Accuracy guidance
Population size
XCS is a more general architecture and can solve
reinforcement learning problems
17. Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
18. Motivation
So, do rare classes pose a challenge to XCS?
Test on the unbalanced 11-bit multiplexer
$IR = \frac{\text{number of examples of the majority class}}{\text{number of examples of the minority class}}$
[Figure: %[O] with XCS]
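The imbalance ratio defined above is straightforward to compute from a list of class labels. The helper name `imbalance_ratio` is hypothetical, and the sketch assumes at least two classes are present.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio between the sizes of the most and the least frequent class."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("at least two classes are needed to define the ratio")
    return max(counts.values()) / min(counts.values())


# Example: 64 majority examples for every 2 minority examples.
print(imbalance_ratio([0] * 64 + [1] * 2))
```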
19. Design Decomposition
Aim
Analyze the challenges that rare classes pose to XCS
Improve XCS in problems with rare classes
The design decomposition approach (Goldberg, 02) proposes to
Decompose the problem into critical elements
Derive “little” models or facetwise models for each element, assuming that
the others behave in an ideal manner
Integrate all the models (patchquilt integration)
20. Focusing the Problem
How should XCS partition the problem solution?
[Figure labels: nourished niche; small disjunct or starved niche; again, more small disjuncts; over-general classifier]
21. Critical Elements of LCS
Five critical elements to detect small niches were identified:
1. Estimate the classifier parameters correctly
2. Analyze whether representatives of starved niches can be provided in initialization
3. Ensure the generation and growth of representatives of starved niches
4. Adjust the GA application rate
5. Ensure that representatives of starved niches will take over their niches
Derivations studied according to the imbalance ratio (IR)
22. 1 Estimate Classifier Parameters
Derive the maximum imbalance ratio
The error of over-general classifiers is:
However, empirical results did not agree with the theory
Error of the most over-general classifier tracked over time
[Figure: theoretical value vs. empirical error for ir = 100]
Deviation between theoretical and empirical error
Over-general classifiers may be considered accurate
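The error equation itself is missing from this transcript. As a hedged reconstruction, the sketch below derives the expected error of an over-general classifier under stated assumptions: the classifier matches every instance, receives a payoff `r_max` when correct and 0 otherwise, and is correct with probability ir/(1+ir). The threshold `eps0` and both function names are illustrative.

```python
import math


def over_general_error(ir, r_max=1000.0):
    """Expected payoff-prediction error of a classifier that matches everything.

    With p = ir / (1 + ir) the probability of being correct, the prediction
    converges to r_max * p, and the mean absolute deviation from it is
    2 * r_max * p * (1 - p) = 2 * r_max * ir / (1 + ir) ** 2.
    """
    return 2.0 * r_max * ir / (1.0 + ir) ** 2


def max_imbalance_ratio(eps0=1.0, r_max=1000.0):
    """Largest ir for which the over-general error still reaches eps0.

    Solves 2 * r_max * ir / (1 + ir) ** 2 = eps0 for the larger root.
    """
    b = 2.0 * r_max - 2.0 * eps0
    return (b + math.sqrt(b * b - 4.0 * eps0 * eps0)) / (2.0 * eps0)


print(over_general_error(100))   # error shrinks as the imbalance grows
print(max_imbalance_ratio())     # beyond this ratio the error falls below eps0
```

Under these assumptions the error decreases as ir increases, which is why a sufficiently unbalanced problem can make an over-general classifier look accurate.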
23. 1 Estimate Classifier Parameters
We proposed two alternatives to obtain better estimates
1. Tune the learning rate of the Widrow-Hoff rule according to ir
2. Apply gradient descent methods (Butz et al., 2005)
[Figures: theoretical value vs. empirical error for ir = 100, one panel per alternative]
24. 2 Provide Representatives in Initial.
Can covering provide schemas of classifiers of starved niches?
Probability of activating covering in the first minority class instance [equation not recovered]
Annotated terms: specificity of [P], imbalance ratio, length of the classifier
For large values of ir, covering will not provide schemas of the minority class
We continue the analysis assuming a covering failure
25. 3 Ensure Growth of Representatives
How to size the population to ensure that representatives of starved niches will be supplied?
Assumptions:
Crossover is not considered. Only mutation (probability of mutation μ).
Random deletion
A GA is applied to [A] every time [A] is activated
Derived models [equations not recovered]:
The time to create a representative of a starved niche
Time to receive a genetic event
Mixing all together: Population size bound to ensure reproductive opportunity (annotated terms: number of classes, imbalance ratio)
26. 3 Ensure Growth of Representatives
Theory matches empirical results (parity problem)
Imbalanced parity problem with building block length from 1 to 4
Unbalanced by removing instances of one of the classes
The theory also matches when the assumptions of the model are not met
[Figure panels: all assumptions satisfied; Widrow-Hoff rule]
27. 4 Adjust GA Application Rate
Assumption in the previous model
A GA is applied to [A] every time [A] is activated
What is the effect of varying the GA application rate?
To guarantee that all niches receive approximately the same number of genetic events: [equation not recovered]
If satisfied, all niches receive the same number of genetic opportunities
Hence, time of deletion increases linearly with ir and population size remains constant
28. 5 Ensure Take Over of Represent.
The previous facets set the conditions to ensure that
1. Representatives of starved niches are created
2. Representatives of starved niches receive a genetic event
But still, to ensure full convergence we need that
Representatives of starved niches take over their niche
Ensure that these representatives will not be extinguished
Study takeover time of representatives, which depends on
Initial stock of classifiers in the niche
Type of selection
Proportionate selection (Wilson, 95)
Tournament selection (Butz et al., 2005c)
29. 5 Ensure Take Over of Represent.
Takeover time for proportionate selection
[Equation not recovered; its annotated terms are listed below]
Population size
Number of niches
Ratio of the accuracy of the over-general classifier to the accuracy of the best representative
Initial proportion of classifiers
Final proportion of classifiers
Condition for niche extinction [equation not recovered]
[Figure: maximum acceptable error predicted by the niche extinction model]
30. 5 Ensure Take Over of Represent.
Takeover time for tournament selection
[Equation not recovered; its annotated terms are listed below]
Population size
Tournament size
Initial proportion of classifiers
Final proportion of classifiers
Condition for niche extinction [equation not recovered]
Key differences with respect to proportionate selection:
Independent of the fitness of the best and the over-general classifier
Highly dependent on the tournament size
[Figure labels: number of representatives predicted by the niche extinction model; number of classifiers in the niche]
31. Patchquilt Integration
Will XCS learn rare classes? Lessons learned from the models
1. Parameters need to be correctly estimated
Widrow-Hoff rule with auto-adjusted β
Gradient descent methods
2. Representatives need to be created and evolved
Covering may fail if ir is large
The challenge can be met by
Sizing the population according to the imbalance ratio
Setting θGA according to the imbalance ratio
3. Niche extinction models set the conditions under which XCS will fail
Indicate how parameters should be tuned to satisfy the model
Takeover time models to predict the time to convergence
32. Why Is this Analysis Important?
The lessons enable us to solve problems that previously eluded solution
Unbalanced 11-bit multiplexer problem
[Figure: %[O] with XCS, before and after the analysis]
Before, we could solve up to ir=32
Now we can solve up to ir=1024 and more
33. Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
34. Reviewing the Critical Elements
1 Estimate the classifier parameters correctly
Pure averages! We get the exact value
2 Analyze whether representatives of starved niches can be provided in initialization
Covering applied if the correct set is empty
Without mutation, covering will always be applied to the first minority class instances
Suppose the worst case: no provision
We derive maximum bounds
35. Reviewing the Critical Elements
3 Ensure the generation and growth of representatives of starved niches
[Figure panels: default configuration; all assumptions satisfied; axis: imbalance ratio]
4 Adjust the GA application rate
XCS’s model is still valid
5 Ensure that representatives of starved niches will take over their niches
XCS’s takeover time models are still valid
36. Patchquilt Integration
The lessons enable us to solve problems that previously eluded solution
Results following the guidelines provided by the lessons
[Figure: %[O] with UCS]
37. Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
38. Motivation
From boundedly-difficult problems to real-world problems (RWP)
RWP contain continuous attributes → interval-based rules
IF x1 in [l1, u1] and x2 in [l2, u2] and … and xn in [ln, un] THEN class_i
Key difference: Problem characteristics not known
Gap between theory and application to RWP
How can we apply the recommendations extracted from the analysis?
Aim
1. Start bridging the gap between theory and practice
2. Confirm that both LCS are valuable for mining domains with rarities
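The interval-based rule form shown on this slide can be sketched as a simple matching test. The representation of a rule as a list of (lower, upper) pairs and the function name `matches` are assumptions for illustration.

```python
def matches(intervals, example):
    """True if every attribute value lies inside the rule's interval for it.

    intervals: list of (lower, upper) pairs, one per attribute.
    example: list of attribute values of the same length.
    """
    if len(intervals) != len(example):
        raise ValueError("rule and example must have the same number of attributes")
    return all(lower <= x <= upper for (lower, upper), x in zip(intervals, example))


# Rule: IF x1 in [0.0, 0.5] and x2 in [0.2, 0.9] THEN class 1
rule_condition = [(0.0, 0.5), (0.2, 0.9)]
print(matches(rule_condition, [0.3, 0.4]))  # inside both intervals
print(matches(rule_condition, [0.7, 0.4]))  # x1 falls outside its interval
```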
39. What is Different in RWP
Imbalance ratio vs. niche imbalance ratio?
In boundedly-difficult problems, IR equaled the niche imbalance ratio
In RWP, this assumption may not hold
Same imbalance ratio, different niche imbalance ratio
Niche imbalance ratio (NIR) in RWP depends on:
IR
Geometrical distribution of the examples
Knowledge representation
40. Self-Adaptation to Unknown Domains
Heuristic to estimate the niche imbalance ratio
Take the strongest over-general classifier
Assume NIR is the imbalance ratio of the over-general classifier
Tune parameters according to NIR and the recommendations
extracted from the facetwise analysis
Empirical test on the 11-bit multiplexer problem
[Figures: %[O] with XCS; %[B] with UCS]
41. LCS in RWP
Comparison methodology
Comparison with:
C4.5 (Quinlan, 95)
SMO (Platt, 98)
IBk (Aha et al., 91)
Configured to maximize performance
Selection of 25 imbalanced real-world problems with different characteristics
10-fold cross validation
Performance measure: TP rate · TN rate
Statistical tests:
Friedman’s test (Friedman, 37, 40)
Nemenyi test (Nemenyi, 63)
Wilcoxon signed-ranks test (Wilcoxon, 45)

Id.     Data set          #Ins.  #At.  ir
bald1   balance disc. 1   625    4     11.76
bald2   balance disc. 2   625    4     1.17
bald3   balance disc. 3   625    4     1.17
bpa     bupa              345    6     1.38
glsd1   glass disc. 1     214    9     22.75
glsd2   glass disc. 2     214    9     15.47
glsd3   glass disc. 3     214    9     11.59
glsd4   glass disc. 4     214    9     6.38
glsd5   glass disc. 5     214    9     2.06
glsd6   glass disc. 6     214    9     1.82
h-s     heart-disease     270    13    1.25
pim     pima-indian       768    8     1.87
tao     tao-grid          1888   2     1.00
thyd1   thyroid disc. 1   215    5     6.17
thyd2   thyroid disc. 2   215    5     5.14
thyd3   thyroid disc. 3   215    5     2.31
wavd1   waveform disc. 1  5000   40    2.02
wavd2   waveform disc. 2  5000   40    1.96
wavd3   waveform disc. 3  5000   40    2.02
wbcd    Wis. B. cancer    699    9     1.90
wdbc    Wis. diag.        569    30    1.68
wined1  wine disc. 1      178    13    2.71
wined2  wine disc. 2      178    13    2.02
wined3  wine disc. 3      178    13    1.51
wpbc    Wis. prognostic   198    33    3.21
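The performance measure used in this comparison, the product of the true positive rate and the true negative rate, can be computed as in the sketch below. The function name and the convention that the minority class is passed as `positive` are assumptions for the example.

```python
def tp_rate_times_tn_rate(y_true, y_pred, positive=1):
    """Product of the true positive rate and the true negative rate."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # A class with no examples contributes a rate of 0 instead of dividing by zero.
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0
    tn_rate = tn / (tn + fp) if (tn + fp) else 0.0
    return tp_rate * tn_rate


y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
print(tp_rate_times_tn_rate(y_true, y_pred))
```

Unlike plain accuracy, this product drops to zero when a learner ignores the rare class entirely, which is why it suits imbalanced problems.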
42. Summary of the Results
TP rate · TN rate
XCS and UCS perform the best on average for the tested problems
However, no significant differences according to Friedman’s test
Pairwise analysis enables the extraction of further observations
XCS and UCS fail to create accurate models in problems such as bald2, bald3, and tao, which have low imbalance ratio
They have difficulty learning from domains with curved boundaries
Other complexities in addition to class imbalance
43. Discussion
When a ML practitioner has a new problem
Which learner should she or he apply?
The empirical analysis indicated that
She or he should bet on LCSs
But no guarantees of being the best performer on a particular problem
What is missing?
Evaluate problem complexity
Link problem complexity with domain of competence of LCS
How?
Complexity metrics are a good starting point (Ho & Basu, 02) to bridge the gap between theory and practice
44. Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
45. Motivation
Competent data classification techniques should be able to
Evolve accurate models
in some legible structure
LCS are very appealing since they evolve highly accurate models online
However:
Tend to evolve a large number of semantic-free interval-based rules
Use reasoning mechanisms that can be unintuitive
(Bernadó et al., 02)
46. Design of Fuzzy-UCS
Linguistic fuzzy representation
Disjunction of linguistic fuzzy terms
Rule: IF x1 is A1 and x2 is A2 … and xn is An THEN class1
Example: IF x1 is small and x2 is medium or large THEN class1
In our experiments, all variables shared the same semantics, which were defined by triangular membership functions
[Figure: triangular membership functions for small, medium, and large]
Classifier parameters were changed to let them deal with fuzzy matching
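Fuzzy matching for a rule of this form can be sketched as follows. This is an illustration under assumptions not stated on the slide: inputs are normalized to [0, 1], the three triangles peak at 0, 0.5, and 1, the disjunction within a variable uses the bounded sum, and the conjunction across variables uses the product.

```python
# Assumed triangular fuzzy sets (left foot, peak, right foot) on inputs in [0, 1].
TERMS = {
    "small": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "large": (0.5, 1.0, 1.5),
}


def membership(x, term):
    """Degree to which x belongs to a triangular linguistic term."""
    left, peak, right = TERMS[term]
    rising = (x - left) / (peak - left)
    falling = (right - x) / (right - peak)
    return max(0.0, min(rising, falling))


def matching_degree(rule, example):
    """Matching degree of an example with a rule.

    rule: one list of linguistic terms per variable (a disjunction of terms).
    Disjunction: bounded sum. Conjunction: product. Both are assumed operators.
    """
    degree = 1.0
    for terms, x in zip(rule, example):
        degree *= min(1.0, sum(membership(x, t) for t in terms))
    return degree


# IF x1 is small and x2 is medium or large THEN class1
rule = [["small"], ["medium", "large"]]
print(matching_degree(rule, [0.25, 0.75]))
```

An example now matches a rule to a degree between 0 and 1 instead of matching or not, which is why the classifier parameters have to be adapted.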
47. Design of Fuzzy-UCS
Three procedures designed to infer the class of test examples, which result in a tradeoff between interpretability and accuracy
Weighted average (wavg): Based on average voting. All rules considered.
Action winner (awin): Best rule decides the class. Only best matching rules considered.
Most numerous and fittest rules (nfit): Based on average voting. Only most numerous rules considered.
The size of the rule set decreases from wavg (+) to nfit (-)
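The difference between voting and winner-takes-all inference can be sketched as below. The weighting of each vote by matching degree times fitness is an assumption for illustration, as are the function names and the tuple layout; nfit would apply the same voting to a reduced rule set and is omitted.

```python
def infer_wavg(rules):
    """Weighted average: every rule votes for its class; the largest total wins.

    rules: list of (predicted_class, matching_degree, fitness) tuples.
    """
    votes = {}
    for predicted_class, degree, fitness in rules:
        votes[predicted_class] = votes.get(predicted_class, 0.0) + degree * fitness
    return max(votes, key=votes.get)


def infer_awin(rules):
    """Action winner: the single rule with the largest weight decides the class."""
    best = max(rules, key=lambda rule: rule[1] * rule[2])
    return best[0]


rules = [
    ("A", 0.9, 0.8),  # one strong rule for class A
    ("B", 0.5, 0.6),  # two weaker rules for class B
    ("B", 0.7, 0.7),
]
print(infer_wavg(rules), infer_awin(rules))
```

In this example the two procedures disagree: the combined votes favor B, while the single best rule favors A, which illustrates how the choice of inference affects both the prediction and the rules that must be kept.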
48. Methodology of Analysis
Comparison methodology
Two comparisons
Fuzzy learners
Non-fuzzy learners
Selection of 20 real-world problems
10-fold cross validation
Metrics
Test accuracy
Number of rules of the models
Statistical tests:
Friedman’s test (Friedman, 37, 40)
Nemenyi test (Nemenyi, 63)
Bonferroni-Dunn test (Dunn, 61)
Wilcoxon signed-ranks test (Wilcoxon, 45)

Id    Data set             #Ins  #At  #Cl.  %Min  %Maj  %MI
ann   Annealing            898   38   5     0.9   76.2  0.0
aut   Automobile           205   25   6     1.5   32.7  22.4
bal   Balance              625   4    3     7.8   46.1  0.0
bpa   Bupa                 345   6    2     42.0  58.0  0.0
cmc   Contrac. choice      1473  9    3     22.6  42.7  0.0
col   Horse colic          368   22   2     37.0  63.0  98.1
gls   Glass                214   9    6     4.2   35.5  0.0
h-c   Heart-c              303   13   2     45.5  54.5  2.3
h-s   Heart-s              270   13   2     44.4  56.6  0.0
irs   Iris                 150   4    3     33.3  33.3  0.0
pim   Pima                 768   8    2     34.9  65.1  0.0
son   Sonar                208   60   2     46.7  53.3  0.0
tao   Tao                  1888  2    2     50.0  50.0  0.0
thy   Thyroid              215   5    3     14.0  60.0  0.0
veh   Vehicle              846   18   4     23.5  25.8  0.0
wbcd  Wisc. breast-cancer  699   9    2     34.5  65.5  2.3
wdbc  Wisc. Diagnosis      569   30   2     37.3  62.7  0.0
wne   Wine                 178   13   3     27.0  39.9  0.0
wpbc  Wisc. Prognostic     198   33   2     23.7  76.3  2.0
zoo   Zoo                  101   17   7     4.0   40.6  0.0
49. Comparison with the Fuzzy Learners
Accuracy
1. Fuzzy GP (GP) (Sánchez et al., 01)
2. Fuzzy GAP (GAP) (Sánchez & Couso, 00)
3. Fuzzy SAP (SAP) (Sánchez et al., 01)
4. Fuzzy AdaBoost (AB) (del Jesus et al., 04)
5. Fuzzy LogitBoost (LB) (Otero & Sánchez, 06)
6. Fuzzy MaxLogitBoost (MLB) (Otero & Sánchez, 07)
All methods run using KEEL (Alcalá-Fdez et al., 08)
[Figure: methods placed along an interpretability axis from - to +. Labels: Fuzzy-UCS wavg (1000’s of rules), Fuzzy AdaBoost, Fuzzy LogitBoost; Fuzzy-UCS awin (< 100 rules); Fuzzy-UCS nfit (> 10 rules), Fuzzy GAP, Fuzzy SAP, Fuzzy GP, Fuzzy MLB]
50. Comparison with Non-Fuzzy Learners
Accuracy
1. C4.5 (Quinlan, 95)
2. IBk (Aha et al., 91)
3. Naïve Bayes (NB) (John & Langley, 95)
4. Part (Frank & Witten, 98)
5. SMO (Platt, 98)
6. GAssist (Bacardit, 04)
7. UCS (Bernadó & Garrell, 03)
[Figure: methods placed along an interpretability axis from - to +. Labels: Fuzzy-UCS wavg, UCS, SMO, IBk; Fuzzy-UCS awin, C4.5, Part; Fuzzy-UCS nfit, GAssist, Naïve Bayes]
51. Mining Large Volumes of Data
The last experiment
Fuzzy-UCS applied to extract models from the 1999 KDD Cup intrusion detection data set
494,022 examples with 41 features
52. Outline
1. Description of XCS and UCS
2. Revisiting UCS: Fitness Sharing and Comparison with XCS
3. Facetwise Analysis of XCS for Imbalanced Domains
4. Carrying over the Facetwise Analysis into UCS
5. XCS and UCS in Imbalanced Real-World Classification Problems
6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
7. Conclusions and Further Work
53. Conclusions and Further Work
This work contributed to
Increasing the comprehension of how LCS work
Improving them to deal with problems that contain rare classes
Providing new implementations of LCS
Two challenges and four objectives addressed in the context
of LCS
1. Revise and update UCS and compare it to XCS
New fitness sharing designed
Fitness sharing provides benefits to UCS
Key differences between UCS and XCS empirically studied
Further work: Complement the analysis with theory
54. Conclusions and Further Work
2 & 3. Study LCS in domains with rare classes
Start with a systematic analysis validated with boundedly-difficult problems
Finish with its application to real-world problems with rare classes
[Diagram: complex systems (lots of interacting components) → facetwise analysis (small models) → LCSs can learn from imbalanced domains]
[Diagram: application of LCSs to a new real-world problem: problem characterization, heuristic to estimate the niche imbalance ratio, domain of competence of LCSs; future research line: complexity metrics, resampling techniques]
Further work
Design measures to characterize real-world classification problems
Measure the difficulty of the problems
Link problem difficulty with domain of competence
Include the study of re-sampling techniques, etc.
First steps taken in (Bernadó et al., 06; Orriols et al., 08a)
55. Conclusions and Further Work
4. Design and implement an LCS with fuzzy logic reasoning for supervised learning
Analysis to mix
Accurate online evaluation system of LCSs
Human-like representation and reasoning mechanisms of fuzzy logic
Robust discovery capabilities of GAs
Each of the three ideas was not novel in itself, but the combination of them to create a supervised learning technique was.
Fuzzy-UCS
Evolved highly accurate models of moderate size
Was able to extract classification models from large volumes of data
Is prepared to deal with domains with uncertainty and vagueness
Further work
Adapt LCSs to extract association rules online
Many real-world applications generate data streams
LCS are appealing since they mine data streams
However, in most cases, unlabeled data
Aim: design an LCS that is able to extract association rules online
First steps taken in (Orriols et al., 2008f)
56. Lessons Learned on the Way
1. The importance of design decomposition
We need to improve LCS for mining rarities
  1. Mix existing, powerful techniques that solve problems that you intuitively identify
     The thesis started in this way (Orriols-Puig, 05a, 05b)
     Lesson: despite moderate success, poor understanding
  2. Build complete models of your system
  3. Design decomposition and facetwise analysis (Goldberg, 02)
     Key for success
     Not only for GAs or LCSs
2. The relevance of cross-breeding ideas
New complex real-world problems require the best practices of different fields
LCSs are friendly frameworks for cross-breeding ideas
57. Publications
This work has resulted in 35 publications:
7 journal papers (4 accepted/published and 3 currently submitted)
5 papers in LNCS/LNAI volumes
6 book chapters
15 international conference papers
2 national conference papers
Selected publications
Albert Orriols-Puig, Ester Bernadó-Mansilla, David E. Goldberg, Kumara Sastry, and Pier Luca Lanzi. Facetwise Analysis of XCS for Problems with Class Imbalances. IEEE Transactions on Evolutionary Computation, 2008, submitted
Albert Orriols-Puig, Jorge Casillas, and Ester Bernadó-Mansilla. Fuzzy-UCS: A Michigan-style Fuzzy-Learning Classifier System for Supervised Learning. IEEE Transactions on Evolutionary Computation, 2008, doi=10.1109/TEVC.2008.925144
Albert Orriols-Puig and Ester Bernadó-Mansilla. Evolutionary Rule-Based Systems for Imbalanced Datasets. Soft Computing Journal. Special Issue on Evolutionary and Metaheuristic-based Data Mining, 2008, doi=10.1007/s00500-008-0319-7
Albert Orriols-Puig and Ester Bernadó-Mansilla. Revisiting UCS: Description, Fitness Sharing, and Comparison with XCS. In Advances at the frontier of LCS, LNCS series, volume 4998, pages 96–116, Springer, 2008
Albert Orriols-Puig, David E. Goldberg, Kumara Sastry, and Ester Bernadó-Mansilla. Modeling XCS in Class Imbalances: Population Size and Parameter Settings. In GECCO’07, pages 1838-1845, ACM Press, 2007
Albert Orriols-Puig, Kumara Sastry, Pier Luca Lanzi, David E. Goldberg, and Ester Bernadó-Mansilla. Modeling Selection Pressure in XCS for Proportionate and Tournament Selection. In GECCO’07, pages 1846-1853, ACM Press, 2007
Albert Orriols-Puig and Ester Bernadó-Mansilla. Bounding XCS’s Parameters for Unbalanced Datasets. Best paper nomination. In GECCO’06, pages 1561-1568. ACM Press, 2006
58. Acknowledgments
Enginyeria i Arquitectura La Salle
Prof. Ester Bernadó-Mansilla
My first “second home”: the IlliGAL
Prof. David E. Goldberg for accepting my visits and for all his valuable lessons
All labbies, and especially Kumara Sastry, Xavier Llorà, and Tian Li Yu
My second “second home”: the SCI2S group
Prof. Francisco Herrera for accepting my visits and for his time and advice
All labbies, and especially Jorge Casillas
My examining committee
Prof. David E. Goldberg, Prof. Francisco Herrera, Prof. Martin V. Butz, Prof. Xavier Llorà, and Prof. Xavier Vilasís
All the people I have worked with
Ester Bernadó-Mansilla, Jorge Casillas, David E. Goldberg, Pier Luca Lanzi, Francisco J. Martínez-López, Sergio Morales-Ortigosa, Núria Macià, Joaquim Rios-Boutin, Kumara Sastry, Francesc Teixidó-Navarro
The research was supported by
Departament d’universitats, recerca i societat de la informació (DURSI)
Under a FI scholarship with reference 2005FI-00252
Under two BE travel grants with references 2006BE-00299 and 2007BE2-00124
Generalitat de Catalunya, under grants 2002SGR-00155 and 2005SGR-00302
Ministerio de educación y ciencia, under projects KEEL and KEEL2 with references TIC2002-04036-C05-03 and TIN2005-08386-C05-04
59. New Challenges in Learning Classifier Systems: Mining Rarities and Evolving Fuzzy Rules
Student: Albert Orriols-Puig
Supervisor: Ester Bernadó-Mansilla
Grup de Recerca en Sistemes Intel·ligents
Enginyeria i Arquitectura La Salle
Universitat Ramon Llull