New Challenges in Learning Classifier Systems: Mining Rarities and Evolving Fuzzy Rules


Student: Albert Orriols-Puig

Supervisor: Ester Bernadó-Mansilla

Grup de Recerca en Sistemes Intel·ligents
Enginyeria i Arquitectura La Salle
Universitat Ramon Llull

Background

GRSI conducts research on machine learning and data mining
Especially focused on data classification
Research aims at
Improving learning methods
Applying learning methods to real-world applications

Application of LCS to classification problems is one of the main research lines
LCSs are appealing because they mine streams of examples
Many applications make the data available in streams

Important challenges need to be addressed to deal with complex applications

Slide 2

Background
General schema of LCSs
Introduced by Holland

Environment
Sends the sensorial state and the feedback; receives the action

Learning Classifier System
Classifier 1, Classifier 2, …, Classifier n
Any representation: production rules, genetic programs, perceptrons, SVMs

Learning: online rule evaluator
Apportionment of credit algorithms
XCS: Q-Learning (Sutton & Barto, 1998)
Uses Widrow-Hoff delta rule

Rule evolution: Evolutionary Algorithm
Typically, a GA (Holland, 75; Goldberg, 89) applied to the population
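The Widrow-Hoff delta rule named in the schema can be sketched as follows. This is a minimal illustration, assuming a scalar estimate and a learning rate called `beta`; the function names and the value 0.2 are illustrative choices, not taken from the slides.

```python
def widrow_hoff_update(estimate, target, beta=0.2):
    """Return the estimate moved a fraction `beta` of the way toward `target`."""
    return estimate + beta * (target - estimate)


def track(targets, initial=0.0, beta=0.2):
    """Apply the delta rule once per observed target and return the final estimate."""
    estimate = initial
    for target in targets:
        estimate = widrow_hoff_update(estimate, target, beta)
    return estimate
```

With a constant target the estimate converges geometrically; with a noisy target it behaves like a recency-weighted average, which is why the choice of `beta` matters later in the talk.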

Slide 3

When this Work Started

In 2004, when Michigan-style LCSs were reaching maturity
First successful implementations (Wilson, 95; Wilson, 98)
Many derivations: YCS, UCS, XCSF, and others
Applications in important domains
Data mining (Bernadó et al., 02; Wilson, 02a; Bacardit & Butz, 04)
Function approximation (Wilson, 02b)
Reinforcement Learning (Lanzi, 02)
Theoretical analyses for design (Butz et al., 02, 03, 04b)

But still, there are important challenges to face

Slide 4

Two Key Challenges in ML and LCSs
1st challenge: Learning from domains that contain rare classes
Data classification: Extract interesting, useful, and hidden patterns
The most interesting knowledge resides in rare classes
Example: fraud detection in credit card transactions
Can learners model rare classes accurately? Maybe not!

Dataset → Learner → Knowledge Model
Learner: Minimize learning error + maximize generalization

What about online learning?
More challenging: Model rare classes on the fly

Aim: Analyze and improve LCS for mining domains with rarities
Slide 5

Two Key Challenges in ML and LCSs
2nd challenge: Building more understandable models and
bringing reasoning mechanisms closer to human ones
In some domains, interpretability is more important than accuracy
LCSs most often use interval-based rules in domains described by
continuous variables
Variables are “semantic-free”
Analyses of the inference mechanisms are scarce
Fuzzy logic provides a robust framework for
knowledge representation and
reasoning under uncertainty

Some fuzzy LCS approaches already exist
But no online fuzzy LCS for supervised learning has been designed

Aim: Incorporate fuzzy logic into LCS for supervised learning

Slide 6

Goal of this Work
General Goal: Address the two challenges with
The extended classifier system (XCS) (Wilson, 95, 98)
By far, the most influential Michigan-style LCS

The supervised classifier system (UCS) (Bernadó-Mansilla, 03)
Inherits XCS’s architecture and specializes it for data classification

Two challenges with two LCSs that lead to four objectives

Challenges and objectives
XCS and UCS
1. Revise and update UCS and compare it with XCS
LCS and rare classes
2. Analyze and improve LCS for mining rarities
3. Apply LCSs for extracting models from real-world classification problems with rarities
Fuzzy logic in LCS
4. Design and implement an LCS with fuzzy logic reasoning for supervised learning

Slide 7

Outline

1. Description of XCS and UCS

2. Revisiting UCS: Fitness Sharing and Comparison with XCS

3. Facetwise Analysis of XCS for Imbalanced Domains

4. Carrying over the Facetwise Analysis into UCS

5. XCS and UCS in Imbalanced Real-World Classification Problems

6. Fuzzy-UCS: Evolving Fuzzy Rule Sets For Supervised Learning

7. Conclusions and Further Work

Slide 8

Description of XCS
In training mode for single step tasks (Wilson, 95)
Designed for reinforcement learning

Classifier parameters: C A P ε F num as ts exp
Error: Error of the predicted payoff
Fitness: Computed as a function of the error

Learning cycle
ENVIRONMENT: provides the problem instance
Match set generation: Population [P] → Match Set [M]
Select action randomly (random action): Match Set [M] → Action Set [A]
ENVIRONMENT: receives the selected action and returns the REWARD
Update of the classifier parameters in [A] (Widrow-Hoff rule), with fitness sharing
Genetic Algorithm: selection, reproduction, and mutation; competition in the niche
Deletion

Slide 9

Description of UCS
In training mode (Bernadó-Mansilla & Garrell, 03)

Classifier parameters: C A acc F num cs ts exp

Learning cycle
ENVIRONMENT: provides a stream of examples (problem instance + output class)
Match set generation: Population [P] → Match Set [M]
Correct set generation: Match Set [M] → Correct Set [C]
Update of the classifier parameters: average of the parameter values, no fitness sharing
Genetic Algorithm: selection, reproduction, and mutation; competition in the niche
Deletion

Key differences with respect to XCS
Accuracy computation as average of correct predictions
Exploration of the “correct class” instead of all classes
No fitness sharing
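The match set, correct set, and accuracy-as-average described above can be sketched in a few lines. This is a simplified illustration, not the UCS implementation: it assumes ternary conditions over {0, 1, #}, and the class and field names (`Classifier`, `experience`, `correct`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Classifier:
    condition: str       # ternary string over {'0', '1', '#'}; '#' matches either bit
    action: str          # class advocated by the rule
    experience: int = 0  # number of times the rule has been in a match set
    correct: int = 0     # number of times it was also in the correct set

    @property
    def accuracy(self) -> float:
        """Proportion of matched examples whose class the rule predicted correctly."""
        return self.correct / self.experience if self.experience else 0.0


def matches(cl: Classifier, instance: str) -> bool:
    """True when every non-'#' position of the condition equals the instance bit."""
    return all(c == '#' or c == x for c, x in zip(cl.condition, instance))


def train_step(population, instance, label):
    """Build [M] and [C] for one labeled example and update the accuracy counters."""
    match_set = [cl for cl in population if matches(cl, instance)]
    correct_set = [cl for cl in match_set if cl.action == label]
    for cl in match_set:
        cl.experience += 1
        if cl.action == label:
            cl.correct += 1
    return match_set, correct_set
```

Because accuracy is a plain ratio of counters, it needs no learning rate, which is the point the later UCS slides summarize as "pure averages".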

Slide 10

Outline

2. Revisiting UCS: Fitness Sharing and Comparison with XCS

Slide 11

Fitness Sharing in UCS
Sharing or not sharing, a key difference between XCS and UCS
Goal
Design a fitness sharing scheme
Empirically test whether fitness sharing is beneficial to UCS
Empirically compare XCS with UCS
Incorporate a fitness sharing scheme into UCS
Take inspiration from XCS
[Equations not recovered; annotated terms: classifier accuracy, classifier numerosity, relative accuracy, learning rate]
And finally, fitness is shared in [M]
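The slide's equations were not recovered, so the following is only a sketch of how an XCS-style sharing scheme could combine the four annotated terms. The accuracy threshold `acc0`, the scaling `alpha`, the exponent `nu`, and the dictionary layout are assumptions patterned on XCS, not confirmed by the slides.

```python
def shared_fitness_update(match_set, label, beta=0.2, acc0=0.99, alpha=0.1, nu=10):
    """Update the fitness of every classifier in [M], sharing it within the set.

    match_set: list of dicts with keys 'action', 'acc', 'num', 'fitness'.
    All parameter values are illustrative assumptions.
    """
    def raw_accuracy(cl):
        # Classifiers outside the correct set receive no share.
        if cl['action'] != label:
            return 0.0
        # Accurate classifiers get full credit; others are sharply discounted.
        if cl['acc'] > acc0:
            return 1.0
        return alpha * (cl['acc'] / acc0) ** nu

    weighted = [raw_accuracy(cl) * cl['num'] for cl in match_set]
    total = sum(weighted)
    for cl, w in zip(match_set, weighted):
        relative = w / total if total > 0 else 0.0
        # Delta-rule update of fitness toward the relative accuracy.
        cl['fitness'] += beta * (relative - cl['fitness'])
```

Under this form the fitness available in a match set is split among its accurate classifiers in proportion to numerosity, so over-general classifiers that share a niche with accurate ones lose fitness.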

Slide 12

Methodology of Analysis

Analysis divided into two comparisons
1. Compare UCS without fitness sharing (UCSns) and with fitness sharing (UCSs)
2. Compare UCSs with XCS

Comparison on four boundedly-difficult problems that permit
moving the complexity along: number of classes, size of the
building block, class imbalance, and proportion of noise.
The parity problem (par)
The decoder problem (dec)
The position problem (pos)
The 20-bit multiplexer with alternating noise (mux-an)

Slide 13

Does Fitness Sharing Benefit UCS?
Fitness sharing provides the following benefits:
Higher pressure toward deletion of over-general classifiers
Higher selective pressure toward the fittest classifiers in [C]
Better results in the four problems: par, dec, pos, and mux-an
[Figure: UCSns vs UCSs in Decoder]

Slide 14

Comparison of UCS with XCS
Advantages of UCS due to
The exploration regime
XCS explores all the classes while UCS explores only the “correct” class

The accuracy guidance
XCS may provide misleading guidance toward the fittest classifiers, identified as the
fitness dilemma (Butz et al., 2003)
UCS solves this problem by computing accuracy as the proportion of correct predictions

[Figure: UCSs vs XCS in Decoder]

Slide 15

Summary of the Comparison
The empirical study has shown that
UCS benefits from a fitness sharing scheme.
Therefore, we use UCSs in the remainder of this work

Key differences between XCS and UCS reviewed and
experimentally analyzed
Explore regime
Accuracy guidance
Population size

XCS is a more general architecture and can solve
reinforcement learning problems

Slide 16

Outline

3. Facetwise Analysis of XCS for Imbalanced Domains


Slide 17

Motivation
So, do rare classes pose a challenge to XCS?
Test on the unbalanced 11-bit multiplexer

IR = (number of examples of the majority class) / (number of examples of the minority class)

[Figure: %[O] with XCS]
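To make the test problem and the IR definition concrete, here is a small sketch. It assumes the standard 11-bit multiplexer (3 address bits selecting one of 8 data bits) and assumes the imbalance is produced by keeping each minority-class instance with probability 1/ir; the slides do not state the sampling procedure, and all function names are illustrative.

```python
import random


def multiplexer(bits, k=3):
    """Class of a (k + 2**k)-bit multiplexer instance given as a sequence of 0/1 ints."""
    address = int(''.join(map(str, bits[:k])), 2)
    return bits[k + address]


def sample_unbalanced(n, ir, minority=1, k=3, rng=None):
    """Draw n instances, keeping each minority-class instance with probability 1/ir."""
    rng = rng or random.Random(0)
    length = k + 2 ** k
    data = []
    while len(data) < n:
        x = [rng.randint(0, 1) for _ in range(length)]
        y = multiplexer(x, k)
        if y == minority and rng.random() >= 1.0 / ir:
            continue  # discard this minority-class instance
        data.append((x, y))
    return data


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    if len(counts) < 2:
        raise ValueError("need examples of at least two classes")
    return max(counts.values()) / min(counts.values())
```

For a sample drawn this way, `imbalance_ratio` of the labels should fluctuate around the requested `ir`, since the two classes are equally likely before filtering.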

Slide 18

Design Decomposition

Aim
Analyze the challenges that rare classes pose to XCS
Improve XCS in problems with rare classes

Design decomposition approach (Goldberg, 02) proposes to
Decompose the problem into critical elements
Derive “little” models or facetwise models for each element, assuming that
the others behave in an ideal manner
Integrate all the models (patchquilt integration)

Slide 19

Focusing the Problem
How should XCS partition the problem solution?
[Figure: problem space partitioned into niches; labels: nourished niche, small disjunct or starved niche, more small disjuncts, over-general classifier]

Slide 20

Critical Elements of LCS
Five critical elements to detect small niches were identified
1. Estimate the classifier parameters correctly
2. Analyze whether representatives of starved niches can be provided in initialization
3. Ensure the generation and growth of representatives of starved niches
4. Adjust the GA application rate
5. Ensure that representatives of starved niches will take over their niches

Derivations studied according to the imbalance ratio (IR)

Slide 21

1 Estimate Classifier Parameters

Derive the maximum imbalance ratio

The error of over-general classifiers is: [equation not recovered]

However, empirical results did not agree with the theory
Error of the most over-general classifier over time tracked
[Figure: theoretical value vs. tracked error for ir = 100, showing a deviation between theoretical and empirical error]

Over-general classifiers may be considered accurate
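The slide's equation was not recovered. As a hedged illustration of the argument, the sketch below computes the expected error of an over-general classifier under stated assumptions: it always advocates the majority class, it matches both classes at ratio ir, and the reward is `r_max` for a correct prediction and 0 otherwise. It also tracks the error with the Widrow-Hoff rule to show how a running estimate can differ from the expectation. Names, parameter values, and the update order are assumptions.

```python
import random


def overgeneral_prediction(ir, r_max=1000.0):
    """Expected payoff of a rule that is right with probability ir / (1 + ir)."""
    return r_max * ir / (1.0 + ir)


def overgeneral_error(ir, r_max=1000.0):
    """Expected absolute deviation of the reward from the expected payoff."""
    p_majority = ir / (1.0 + ir)
    prediction = overgeneral_prediction(ir, r_max)
    return p_majority * (r_max - prediction) + (1.0 - p_majority) * prediction


def widrow_hoff_error(ir, beta=0.2, steps=10000, r_max=1000.0, seed=0):
    """Error estimate after tracking a random reward stream with the delta rule."""
    rng = random.Random(seed)
    prediction, error = 0.0, 0.0
    for _ in range(steps):
        reward = r_max if rng.random() < ir / (1.0 + ir) else 0.0
        prediction += beta * (reward - prediction)
        error += beta * (abs(reward - prediction) - error)
    return error
```

Under these assumptions the expected error simplifies to 2 · r_max · ir / (1 + ir)², which shrinks as ir grows. The delta-rule estimate weights recent rewards heavily, so after a long run of majority-class examples it can fall well below that expectation, consistent with the slide's remark that over-general classifiers may be considered accurate.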

Slide 22

1 Estimate Classifier Parameters
We proposed two alternatives to obtain better estimates
1. Tune the learning rate of the Widrow-Hoff rule according to ir
2. Apply gradient descent methods (Butz et al., 2005)
[Figures: tracked error vs. theoretical value for ir = 100 under each alternative]

Slide 23

2 Provide Representatives in Initial.

Can covering provide schemas of classifiers of starved niches?
Probability of activating covering in the first minority class instance
[Equation not recovered; annotated terms: specificity of [P], imbalance ratio, length of the classifier]

For large values of ir, covering will not
provide schemas of the minority class
We continue the analysis assuming a covering failure

Slide 24

3 Ensure Growth of Representatives

How to size the population to ensure that representatives of
starved niches will be supplied?
Assumptions:
Crossover is not considered. Only mutation (probability of mutation μ).
The time to create a representative of a starved niche is [equation not recovered]

Random deletion

A GA is applied to [A] every time [A] is activated
Time to receive a genetic event [equation not recovered]

Mixing all together: Population size bound to ensure reproductive opportunity
[Equation not recovered; annotated terms: number of classes, imbalance ratio]

Slide 25

3 Ensure Growth of Representatives

Theory matches empirical results (parity problem)
Imbalanced parity problem with building block length from 1 to 4
Unbalanced by removing instances of one of the classes
Theory also matches when the assumptions of the model are not met
[Figure panels: Widrow-Hoff rule; all assumptions satisfied]

Slide 26

4 Adjust GA Application Rate

Assumption in the previous model
A GA is applied to [A] every time [A] is activated
What is the effect of varying the GA application rate?
To guarantee that all niches receive approximately the same number of genetic
events: [equation not recovered]

If satisfied, all niches receive the same number of genetic opportunities
Hence, time of deletion increases linearly with ir and
population size remains constant

Slide 27

5 Ensure Take Over of Represent.

The previous facets set the conditions to ensure that
1. Representatives of starved niches are created
2. Representatives of starved niches receive a genetic event

But still, to ensure full convergence we need that
Representatives of starved niches take over their niche
Ensure that these representatives will not be extinguished

Study takeover time of representatives, which depends on
Initial stock of classifiers in the niche
Type of selection
Proportionate selection (Wilson, 95)
Tournament selection (Butz et al., 2005c)

Slide 28


Takeover time for proportionate selection
[Equation not recovered; annotated terms: population size, number of niches, ratio of the accuracy of the over-general classifier to the accuracy of the best representative, final proportion of classifiers, initial proportion of classifiers]

Condition for niche extinction
[Equation not recovered]
[Figure: maximum acceptable error predicted by the niche extinction model]

Slide 29


Takeover time for tournament selection
[Equation not recovered; annotated terms: population size, tournament size, final proportion of classifiers, initial proportion of classifiers]

Condition for niche extinction
[Equation not recovered; annotated term: number of classifiers in the niche]
Key differences with respect to proportionate selection:
Independent of the fitness of the best and the over-general classifier
Highly dependent on the tournament size
[Figure: number of representatives predicted by the niche extinction model]
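The two key differences listed above can be illustrated with textbook forms of the two selection schemes. This sketch does not reproduce the takeover-time models of the slides, nor XCS's exact selection procedures; it assumes a tournament of `s` draws with replacement among `n` classifiers with a single best one, and the function names are illustrative.

```python
def proportionate_probability(fitnesses, index):
    """Probability that roulette-wheel selection picks the classifier at `index`."""
    return fitnesses[index] / sum(fitnesses)


def tournament_probability_best(n, s):
    """Probability that the single best of n classifiers wins a tournament.

    The best classifier wins whenever it is drawn at least once among the
    s draws (with replacement), regardless of how large its fitness is.
    """
    return 1.0 - (1.0 - 1.0 / n) ** s
```

Scaling the over-general classifier's fitness changes `proportionate_probability` but leaves `tournament_probability_best` untouched, while the latter moves quickly with `s`.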

Slide 30

Patchquilt Integration
Will XCS learn rare classes? Lessons learned from the models

1. Parameters need to be correctly estimated
Widrow-Hoff rule with auto-adjusted β
Gradient descent methods

2. Representatives need to be created and evolved
Covering may fail if ir is large
The challenge can be met by
Sizing the population according to the imbalance ratio
Setting θGA according to the imbalance ratio

3. Niche extinction models set the conditions under which XCS will fail
Indicate how parameters should be tuned to satisfy the model
Takeover time models to predict the time to convergence

Slide 31

Why Is this Analysis Important?
The lessons enable us to solve problems that previously
eluded solution
Unbalanced 11-bit multiplexer problem
[Figure: %[O] with XCS, before and after the analysis]

Before, we could solve up to ir=32
Now we can solve up to ir=1024 and more
Slide 32

Outline

4. Carrying over the Facetwise Analysis into UCS


Slide 33

Reviewing the Critical Elements
1 Estimate the classifier parameters correctly
Pure averages! We get the exact value

2 Analyze whether representatives of starved niches can be provided in
initialization
Covering applied if the correct set is empty
If no mutation, covering will always be applied to the first minority
class instances

Suppose the worst case: no provision
We derive maximum bounds

Slide 34

Reviewing the Critical Elements

3 Ensure the generation and growth of representatives of starved niches
[Figure panels: default configuration; all assumptions satisfied; axis: imbalance ratio]

4 Adjust the GA application rate
XCS’s model is still valid

5 Ensure that representatives of starved niches will take over their niches
XCS’s takeover time models are still valid

Slide 35

Patchquilt Integration
The lessons enable us to solve problems that previously eluded solution
Results following the guidelines provided by the lessons

[Figure: %[O] with UCS]

Slide 36

Outline

5. XCS and UCS in Imbalanced Real-World Classification Problems


Slide 37

Motivation
From boundedly-difficult problems to real-world problems (RWP)
RWP contain continuous attributes → Interval-based rules

IF x1 in [l1, u1] and x2 in [l2, u2] and … and xn in [ln, un] THEN class i

Key difference: Problem characteristics not known
Gap between theory and application to RWP
How can we apply the recommendations extracted from the analysis?

Aim
1. Start bridging the gap between theory and practice
2. Confirm that both LCS are valuable for mining domains with rarities
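The interval-based rule format shown on this slide can be sketched as a small data structure. The class and field names are illustrative, and closed intervals are assumed.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class IntervalRule:
    intervals: List[Tuple[float, float]]  # one (lower, upper) pair per attribute
    label: str                            # class advocated by the rule

    def matches(self, x: Sequence[float]) -> bool:
        """True when every attribute value lies inside its interval."""
        return all(lo <= v <= hi for (lo, hi), v in zip(self.intervals, x))
```

For instance, a rule with intervals `[(0.0, 0.5), (0.2, 1.0)]` matches the example `[0.3, 0.9]` but not `[0.6, 0.9]`.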

Slide 38

What is Different in RWP
Imbalance ratio vs. niche imbalance ratio?
In boundedly-difficult problems, IR equaled the niche imbalance ratio
In RWP, this assumption may not hold

[Figure: same imbalance ratio, different niche imbalance ratio]

Niche imbalance ratio (NIR) in RWP depends on:
IR
Geometrical distribution of the examples
Knowledge representation

Slide 39

Self-Adaptation to Unknown Domains

Heuristic to estimate the niche imbalance ratio
Take the strongest over-general classifier
Assume NIR is the imbalance ratio of the over-general classifier
Tune parameters according to NIR and the recommendations
extracted from the facetwise analysis
Empirical test on the 11-bit multiplexer problem

[Figures: %[O] with XCS; %[B] with UCS]
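The heuristic above might be sketched as follows. The slides do not define "strongest" or how a classifier's imbalance ratio is measured, so this sketch makes two loud assumptions: each classifier keeps a count of the training instances of each class it has matched, and "strongest" means highest numerosity. All names are hypothetical.

```python
def classifier_imbalance(matched_per_class):
    """Ratio between the most and least frequent classes a rule has matched.

    Returns None when the rule matched fewer than two classes, in which case
    it is not treated as over-general here.
    """
    counts = [c for c in matched_per_class.values() if c > 0]
    if len(counts) < 2:
        return None
    return max(counts) / min(counts)


def estimate_nir(classifiers):
    """Estimate the niche imbalance ratio from the strongest over-general rule.

    classifiers: list of dicts with keys 'numerosity' and 'matched_per_class'.
    Falls back to 1.0 (no imbalance) when no over-general rule exists; this
    fallback is also an assumption.
    """
    over_general = [
        cl for cl in classifiers
        if classifier_imbalance(cl['matched_per_class']) is not None
    ]
    if not over_general:
        return 1.0
    strongest = max(over_general, key=lambda cl: cl['numerosity'])
    return classifier_imbalance(strongest['matched_per_class'])
```

The returned value would then stand in for ir when sizing the population or setting θGA according to the facetwise recommendations.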

Slide 40

LCS in RWP
Comparison methodology
Comparison with:
C4.5 (Quinlan, 95)
SMO (Platt, 98)
IBk (Aha et al., 91)
Configured to maximize performance
Selection of 25 imbalanced real-world problems with different characteristics
10-fold cross validation
Performance measure: TP rate · TN rate
Statistical tests:
Friedman’s test (Friedman, 37, 40)
Nemenyi test (Nemenyi, 63)
Wilcoxon signed-ranks test (Wilcoxon, 45)

Id.     Data set          #Ins.  #At.  ir
bald1   balance disc. 1   625    4     11.76
bald2   balance disc. 2   625    4     1.17
bald3   balance disc. 3   625    4     1.17
bpa     bupa              345    6     1.38
glsd1   glass disc. 1     214    9     22.75
glsd2   glass disc. 2     214    9     15.47
glsd3   glass disc. 3     214    9     11.59
glsd4   glass disc. 4     214    9     6.38
glsd5   glass disc. 5     214    9     2.06
glsd6   glass disc. 6     214    9     1.82
h-s     heart-disease     270    13    1.25
pim     pima-indian       768    8     1.87
tao     tao-grid          1888   2     1.00
thyd1   thyroid disc. 1   215    5     6.17
thyd2   thyroid disc. 2   215    5     5.14
thyd3   thyroid disc. 3   215    5     2.31
wavd1   waveform disc. 1  5000   40    2.02
wavd2   waveform disc. 2  5000   40    1.96
wavd3   waveform disc. 3  5000   40    2.02
wbcd    Wis. B. cancer    699    9     1.90
wdbc    Wis. diag.        569    30    1.68
wined1  wine disc. 1      178    13    2.71
wined2  wine disc. 2      178    13    2.02
wined3  wine disc. 3      178    13    1.51
wpbc    Wis. prognostic   198    33    3.21
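The performance measure used in this comparison, the product of the true-positive rate and the true-negative rate, can be computed as in the sketch below. It assumes a two-class problem with one class designated as positive; the function name and the convention of returning 0 for an empty class are illustrative choices.

```python
def tp_tn_product(y_true, y_pred, positive):
    """Product of TP rate (recall on `positive`) and TN rate (recall on the rest)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tp_rate = tp / (tp + fn) if tp + fn else 0.0
    tn_rate = tn / (tn + fp) if tn + fp else 0.0
    return tp_rate * tn_rate
```

Unlike plain accuracy, this product is 0 for a model that always predicts the majority class, which is why it suits imbalanced problems.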

Slide 41

Summary of the Results
TP rate · TN rate

XCS and UCS perform the best on average for the tested problems

However, no significant differences according to Friedman’s test

Pairwise analysis enables the extraction of further observations

XCS and UCS fail to create accurate models in problems such as bald2, bald3, and tao,
which have low imbalance ratio

They have difficulty learning from domains with curved boundaries

Other complexities in addition to class imbalance

Slide 42

Discussion
When an ML practitioner has a new problem
Which learner should she or he apply?

The empirical analysis indicated that
She or he should bet on LCSs
But no guarantees of being the best performer on a particular problem

What is missing?
Evaluate problem complexity
Link problem complexity with domain of competence of LCS

How?
Complexity metrics are a good starting point (Ho & Basu, 02) to bridge
the gap between theory and practice

Slide 43

Outline

6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning


Slide 44

Motivation
Competent data classification techniques should be able to
Evolve accurate models
in some legible structure
LCS are very appealing since they evolve highly accurate models online
However:
Tend to evolve a large number of semantic-free interval-based rules
Use reasoning mechanisms that can be unintuitive

(Bernadó et al., 02)

Slide 45

Design of Fuzzy-UCS
Linguistic fuzzy representation
Disjunction of linguistic fuzzy terms

Rule: IF x1 is A1 and x2 is A2 … and xn is An THEN class1

Example: IF x1 is small and x2 is medium or large THEN class1

In our experiments, all variables shared the same semantics, which were
defined by triangular membership functions
[Figure: triangular membership functions for small, medium, large]

Classifier parameters were changed to let them deal with fuzzy matching
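Fuzzy matching of a rule like the example on this slide can be sketched as follows. The slides do not give the partition or the operators, so this sketch assumes attributes normalized to [0, 1], three uniformly spaced triangular labels, the bounded sum for the disjunction of labels, and the product for the conjunction of variables. All names are illustrative.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed uniform partition of [0, 1] into three linguistic labels.
LABELS = {
    'small': (-0.5, 0.0, 0.5),
    'medium': (0.0, 0.5, 1.0),
    'large': (0.5, 1.0, 1.5),
}


def variable_membership(x, labels):
    """Disjunction of linguistic labels for one variable (bounded sum)."""
    return min(1.0, sum(triangular(x, *LABELS[name]) for name in labels))


def matching_degree(rule, example):
    """Conjunction over variables (product) of the per-variable memberships.

    rule: list with one list of label names per variable.
    """
    degree = 1.0
    for labels, x in zip(rule, example):
        degree *= variable_membership(x, labels)
    return degree
```

With these assumptions, the slide's example rule is `[['small'], ['medium', 'large']]`; for the example `[0.25, 0.75]`, x1 is small to degree 0.5 and x2 is medium or large to degree 1.0, so the rule matches to degree 0.5.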

Slide 46

Design of Fuzzy-UCS
Three procedures designed to infer the class of test examples,
which result in a tradeoff between interpretability and accuracy

Weighted average (wavg): Based on average voting. All rules considered.
Action winner (awin): Best rule decides the class. Only best matching rules considered.
Most numerous and fittest rules (nfit): Based on average voting. Only most numerous rules considered.

Size of the rule set, from largest (+) to smallest (-): wavg, awin, nfit
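The difference between average voting and best-rule inference can be sketched as below. It assumes each rule's vote is its fitness multiplied by its matching degree with the test example; that weighting is an assumption, and the nfit reduction step is not sketched because the slides do not specify it. Function names are illustrative.

```python
from collections import defaultdict


def infer_wavg(scored_rules):
    """Weighted average voting: every rule adds fitness * matching degree to its class.

    scored_rules: list of (class_label, fitness, matching_degree) tuples.
    """
    votes = defaultdict(float)
    for label, fitness, degree in scored_rules:
        votes[label] += fitness * degree
    return max(votes, key=votes.get)


def infer_awin(scored_rules):
    """Action winner: the single rule with the largest fitness * degree decides."""
    label, _, _ = max(scored_rules, key=lambda r: r[1] * r[2])
    return label
```

The two procedures can disagree: with rules `('c1', 0.9, 0.4)`, `('c2', 0.5, 0.5)`, and `('c2', 0.4, 0.5)`, the combined vote favors `c2` while the single strongest rule favors `c1`.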

Slide 47

Methodology of Analysis
Comparison methodology
Two comparisons
Fuzzy learners
Non-fuzzy learners
Selection of 20 real-world problems
10-fold cross validation
Metrics
Test accuracy
Number of rules of the models
Statistical tests:
Friedman’s test (Friedman, 37, 40)
Nemenyi test (Nemenyi, 63)
Bonferroni-Dunn test (Dunn, 61)
Wilcoxon signed-ranks test (Wilcoxon, 45)

Id    Data set             #Ins  #At  #Cl.  %Min  %Maj  %MI
ann   Annealing            898   38   5     0.9   76.2  0.0
aut   Automobile           205   25   6     1.5   32.7  22.4
bal   Balance              625   4    3     7.8   46.1  0.0
bpa   Bupa                 345   6    2     42.0  58.0  0.0
cmc   Contrac. choice      1473  9    3     22.6  42.7  0.0
col   Horse colic          368   22   2     37.0  63.0  98.1
gls   Glass                214   9    6     4.2   35.5  0.0
h-c   Heart-c              303   13   2     45.5  54.5  2.3
h-s   Heart-s              270   13   2     44.4  56.6  0.0
irs   Iris                 150   4    3     33.3  33.3  0.0
pim   Pima                 768   8    2     34.9  65.1  0.0
son   Sonar                208   60   2     46.7  53.3  0.0
tao   Tao                  1888  2    2     50.0  50.0  0.0
thy   Thyroid              215   5    3     14.0  60.0  0.0
veh   Vehicle              846   18   4     23.5  25.8  0.0
wbcd  Wisc. breast-cancer  699   9    2     34.5  65.5  2.3
wdbc  Wisc. Diagnosis      569   30   2     37.3  62.7  0.0
wne   Wine                 178   13   3     27.0  39.9  0.0
wpbc  Wisc. Prognostic     198   33   2     23.7  76.3  2.0
zoo   Zoo                  101   17   7     4.0   40.6  0.0

Slide 48

Comparison with the Fuzzy Learners
1. Fuzzy GP (GP) (Sánchez et al., 01)
2. Fuzzy GAP (GAP) (Sánchez & Couso, 00)
3. Fuzzy SAP (SAP) (Sánchez et al., 01)
4. Fuzzy AdaBoost (AB) (del Jesus et al., 04)
5. Fuzzy LogitBoost (LB) (Otero & Sánchez, 06)
6. Fuzzy MaxLogitBoost (MLB) (Otero & Sánchez, 07)

All methods run using KEEL (Alcalá-Fdez et al., 08)

[Figure: accuracy vs. interpretability, from less (-) to more (+) interpretable]
Fuzzy-UCS wavg (1000’s of rules), Fuzzy-UCS awin (< 100 rules), Fuzzy-UCS nfit (> 10 rules)
Fuzzy AdaBoost, Fuzzy LogitBoost
Fuzzy GAP, Fuzzy SAP, Fuzzy GP, Fuzzy MLB

Slide 49

Comparison with Non-Fuzzy Learners
1. C4.5 (Quinlan, 95)
2. IBk (Aha et al., 91)
3. Naïve Bayes (NB) (John & Langley, 95)
4. Part (Frank & Witten, 98)
5. SMO (Platt, 98)
6. GAssist (Bacardit, 04)
7. UCS (Bernadó & Garrell, 03)

[Figure: accuracy vs. interpretability, from less (-) to more (+) interpretable]
Fuzzy-UCS wavg, Fuzzy-UCS awin, Fuzzy-UCS nfit
UCS, SMO, C4.5, GAssist, IBk, Part, Naïve Bayes

Slide 50

Mining Large Volumes of Data
The last experiment
Fuzzy-UCS applied to extract models from the 1999 KDD Cup intrusion
detection data set
494,022 examples with 41 features

Slide 51

Outline

7. Conclusions and Further Work


Slide 52

Conclusions and Further Work
This work contributed to
Increasing the comprehension of how LCS work
Improving them to deal with problems that contain rare classes
Providing new implementations of LCS
Two challenges and four objectives addressed in the context
of LCS

1. Revise and update UCS and compare it to XCS
New fitness sharing designed
Fitness sharing provides benefits to UCS
Key differences between UCS and XCS empirically studied
Further work: Complement the analysis with theory

Slide 53

2 & 3. Study LCS in domains with rare classes
Start with a systematic analysis validated with boundedly-difficult problems
Finish with its application to real-world problems with rare classes
[Diagram labels: problem; complex systems, lots of interacting components; facetwise analysis, small models; LCSs can learn from imbalanced domains]

Further work
Design measures to characterize real-world classification problems
Measure the difficulty of the problems
Link problem difficulty with domain of competence
Include problem difficulty in the study of re-sampling techniques, etc.
First steps taken in (Bernadó et al., 06; Orriols et al., 08a)

[Diagram: application of LCSs to a new real-world problem → problem characterization (heuristic to estimate the niche imbalance ratio, complexity metrics) → domain of competence of LCSs; future research line: resampling techniques]

Slide 54

4. Design and implement an LCS with fuzzy logic reasoning for
supervised learning
Analysis to mix
Accurate online evaluation system of LCSs
Human-like representation and reasoning mechanisms of fuzzy logic
Robust discovery capabilities of GAs
Each of the three ideas was not novel in itself, but the combination of
them to create a supervised learning technique was.
Fuzzy-UCS
Evolved highly accurate models of moderate size
Was able to extract classification models from large volumes of data
Is prepared to deal with domains with uncertainty and vagueness

Further work
Adapt LCSs to extract association rules online
Many real-world applications generate data streams
LCS are appealing since they mine data streams
However, in most cases, unlabeled data
Aim: design an LCS that is able to extract association rules online
First steps taken in (Orriols et al., 2008f)

Slide 55

Lessons Learned on the Way
1. The importance of design decomposition
We need to improve LCS for mining rarities
  1. Mix existing, powerful techniques that solve problems that you intuitively identify
  The thesis started in this way (Orriols-Puig, 05a, 05b)
  Lesson: despite moderate success, poor understanding
  2. Build complete models of your system
  3. Design decomposition and facetwise analysis (Goldberg, 02)
  Key for success
  Not only for GAs or LCSs

2. The relevance of crossbreeding ideas
New complex real-world problems require the best practices of different
fields
LCSs are friendly frameworks for crossbreeding ideas

Slide 56

Publications
This work has resulted in 35 publications:
7 journal papers (4 accepted/published and 3 currently submitted)
5 papers in LNCS/LNAI volumes
6 book chapters
15 international conference papers
2 national conference papers

Selected publications
Albert Orriols-Puig, Ester Bernadó-Mansilla, David E. Goldberg, Kumara Sastry, and Pier Luca Lanzi. Facetwise Analysis of
XCS for Problems with Class Imbalances. IEEE Transactions on Evolutionary Computation, 2008, submitted
Albert Orriols-Puig, Jorge Casillas and Ester Bernadó-Mansilla. Fuzzy-UCS: A Michigan-style Fuzzy-Learning Classifier
System for Supervised Learning. IEEE Transactions on Evolutionary Computation, 2008, doi=10.1109/TEVC.2008.925144
Albert Orriols-Puig, Ester Bernadó-Mansilla. Evolutionary Rule-Based Systems for Imbalanced Datasets. Soft Computing
Journal. Special Issue on Evolutionary and Metaheuristic-based Data Mining, 2008, doi=10.1007/s00500-008-0319-7
Albert Orriols-Puig and Ester Bernadó-Mansilla. Revisiting UCS: Description, Fitness Sharing, and Comparison with XCS. In
Advances at the frontier of LCS, LNCS series, volume 4998, pages 96–116, Springer, 2008
Albert Orriols-Puig, David E. Goldberg, Kumara Sastry, and Ester Bernadó-Mansilla. Modeling XCS in Class Imbalances:
Population Size and Parameter Settings. In GECCO’07, pages 1838-1845, ACM Press, 2007
Albert Orriols-Puig, Kumara Sastry, Pier Luca Lanzi, David E. Goldberg, and Ester Bernadó-Mansilla. Modeling Selection
Pressure in XCS for Proportionate and Tournament Selection. In GECCO’07, pages 1846-1853, ACM Press, 2007
Albert Orriols-Puig and Ester Bernadó-Mansilla. Bounding XCS’s Parameters for Unbalanced Datasets. Best paper
nomination. In GECCO’06, pages 1561-1568. ACM Press, 2006

Slide 57

Acknowledgments
Enginyeria i Arquitectura La Salle
Prof. Ester Bernadó-Mansilla
My first “second home”: the IlliGAL
Prof. David E. Goldberg for accepting my visits and for all his valuable lessons
All labbies, and especially Kumara Sastry, Xavier Llorà, and Tian Li Yu

My second “second home”: the SCI2S group
Prof. Francisco Herrera for accepting my visits and for his time and advice
All labbies, and especially Jorge Casillas
My examining committee
Prof. David E. Goldberg, Prof. Francisco Herrera, Prof. Martin V. Butz, Prof. Xavier Llorà, and Prof. Xavier Vilasís
All the people I have worked with
Ester Bernadó-Mansilla, Jorge Casillas, David E. Goldberg, Pier Luca Lanzi, Francisco J. Martínez-López, Sergio
Morales-Ortigosa, Núria Macià, Joaquim Rios-Boutin, Kumara Sastry, Francesc Teixidó-Navarro
The research was supported by
Departament d’universitats, recerca i societat de la informació (DURSI)
Under a FI scholarship with reference 2005FI-00252
Under two BE travel grants with references 2006BE-00299 and 2007BE2-00124
Generalitat de Catalunya, under grants 2002SGR-00155 and 2005SGR-00302
Ministerio de educación y ciencia under projects KEEL and KEEL2 with references (TIC2002-04036-C05-03 and
TIN2005-08386-C05-04)

Slide 58
