Automatically Defined Functions for Learning Classifier Systems

Evolutionary Computation Research Group
Code Fragments for Learning Classifier Systems
Scaling with LCS

Muhammad Iqbal
Victoria University of Wellington
Iqbal@ecs.vuw.ac.nz

Will N. Browne
Will.Browne@vuw.ac.nz

Mengjie Zhang
Mengjie.Zhang@ecs.vuw.ac.nz

Outline
•  Initial investigations into the scaling of LCS

•  Three year research question: can LCS scale to
complex problems from learning simpler related
problems?

•  Immediate question: can Automatically Defined
Functions (ADFs) be useful to LCS?


Outline
•  Genetic Programming
•  Automatically Defined Functions
•  Learning Classifier Systems (LCS)
•  Code Fragmented LCS
•  Automatically Defined Functions for LCS
•  Results and Discussion
•  Conclusions
•  Future Work


Genetic Programming (GP)
•  Evolutionary algorithm-based methodology
•  To discover a computer program that maps
some input to some output
•  Tree based representation
•  Example: Output = F(X) = ?
X      Output
0      1
1      3
2      7
3      13
4      21
...    ...
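The table can serve as a tiny set of fitness cases for a candidate program. The sketch below is an illustration only: it encodes one hypothetical candidate, F(X) = X*X + X + 1, as a nested-tuple tree and counts how many of the five listed rows it reproduces. The tuple encoding and the names `CASES`, `FUNCTIONS` and `evaluate` are assumptions made for this example, not part of the slides.

```python
import operator

# Fitness cases taken from the table above: (X, Output).
CASES = [(0, 1), (1, 3), (2, 7), (3, 13), (4, 21)]

# Function set for this toy example (an assumption; a GP system picks its own).
FUNCTIONS = {"+": operator.add, "*": operator.mul}


def evaluate(tree, x):
    """Recursively evaluate a tree: 'X', an integer constant, or (op, left, right)."""
    if tree == "X":
        return x
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))


# Hypothetical candidate tree for X*X + X + 1.
candidate = ("+", ("+", ("*", "X", "X"), "X"), 1)

# Fitness here is simply the number of rows the candidate reproduces.
hits = sum(evaluate(candidate, x) == y for x, y in CASES)
print(hits, "of", len(CASES), "cases matched")
```

A GP run would search over many such trees and keep those with the most hits.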

Boolean Multiplexer

a = number of address bits
d = 2^a data bits
n = a + d input bits
Num test cases = 2^n
20-mux: 2^20, about 1 million test cases
37-mux: 2^37, about 137 billion test cases
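A minimal sketch of the multiplexer function and of the test-case counts quoted above. The bit layout (address bits first, most significant first, followed by data bits D0 onward) and the function names are assumptions made for the example.

```python
def multiplexer(bits, a):
    """Return the data bit selected by the first `a` address bits.

    Assumed layout: address bits first (most significant first),
    followed by the d = 2**a data bits D0 .. D(d-1).
    """
    d = 2 ** a
    if len(bits) != a + d:
        raise ValueError("expected n = a + 2**a bits")
    address = int(bits[:a], 2)
    return int(bits[a + address])


def num_test_cases(a):
    """Number of distinct inputs of an n-bit multiplexer, n = a + 2**a."""
    return 2 ** (a + 2 ** a)


for a in (2, 3, 4, 5):
    n = a + 2 ** a
    print(f"{n}-mux: {num_test_cases(a)} test cases")
```

With a = 4 and a = 5 this gives the 20-mux and 37-mux counts.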


6-bits Multiplexer

(# = don't care)

Input                  Output
A0 A1 D3 D2 D1 D0
0  0  #  #  #  0       0
0  0  #  #  #  1       1
0  1  #  #  0  #       0
0  1  #  #  1  #       1
1  0  #  0  #  #       0
1  0  #  1  #  #       1
1  1  0  #  #  #       0
1  1  1  #  #  #       1

AND, OR, NAND, NOR
Symbols: AND: &    OR: |    NAND: d    NOR: r

X Y    X&Y    X|Y    XdY    XrY
0 0     0      0      1      1
0 1     0      1      1      0
1 0     0      1      1      0
1 1     1      1      0      0

Automatically Defined Functions (ADFs)

•  Genetic programming trees often have repeated
patterns.
•  Repeated subtrees can be treated as subroutines.
•  ADFs provide a methodology to automatically select and
implement modularity in GP.
•  This modularity can:
•  Reduce the size of GP tree
•  Reduce training time


Comparison of GP Methods

Population without ADFs = 262144
Population with ADFs = 48640

Scalability of GP and XCS


Learning Classifier System


XCS with Code Fragmented Conditions

Classifier Population
Condition                                          Action
code25  codeN-1  code3   code15  codeN  code30     0
code15  code19   code15  code5   code1  codeN      1
...     ...      ...     ...     ...    ...        ...

Population of Code Fragments
0      D2D5&D0D3|d
1      D4
2      D5~D1r
3      D2~
4      D0D1&
...    ...
N-1    ...
N      D1D1~| ... #

A Code Fragmented Classifier


Condition Matching
(Evaluating a Code Fragment)
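[Figure: a code fragment tree evaluated bottom-up on the input 110101 (D0 D1 D2 D3 D4 D5 = 1 1 0 1 0 1). The leaves take the values 1 1 0 1, the two internal nodes evaluate to 1 and 0, and the root outputs 1.]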
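Evaluating a code fragment can be sketched as a postfix (stack) evaluation. The example below is an assumption-laden illustration, not the authors' implementation: the tokenizer, the `eval_fragment` name and the convention that D0 is the first bit of the input string are choices made here. It uses the operator symbols of the slides (&, |, d for NAND, r for NOR, ~ for NOT).

```python
import re

# Tokens: terminals D0, D1, ... and the operators & (AND), | (OR),
# d (NAND), r (NOR), ~ (NOT).
TOKEN = re.compile(r"D\d+|[&|dr~]")

BINARY = {
    "&": lambda x, y: x & y,
    "|": lambda x, y: x | y,
    "d": lambda x, y: 1 - (x & y),
    "r": lambda x, y: 1 - (x | y),
}


def eval_fragment(fragment, message):
    """Evaluate a postfix code fragment on a bit string; D0 is the first bit."""
    stack = []
    for tok in TOKEN.findall(fragment):
        if tok.startswith("D"):
            stack.append(int(message[int(tok[1:])]))
        elif tok == "~":
            stack.append(1 - stack.pop())
        else:
            y = stack.pop()
            x = stack.pop()
            stack.append(BINARY[tok](x, y))
    if len(stack) != 1:
        raise ValueError("malformed fragment: " + fragment)
    return stack[0]


# Fragment 0 from the population of code fragments, on the input 110101.
print(eval_fragment("D2D5&D0D3|d", "110101"))
# The fragment D1D1~| (D1 OR NOT D1) outputs 1 for every input.
print(eval_fragment("D1D1~|", "110101"))
```

A fragment that outputs 1 for every input, such as D1D1~|, behaves like a don't care symbol.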

Don’t care Code Fragment


Code Fragmented Conditions

[Figure: performance (0.50 to 1.00) against number of instances (0 to 160000) on the multiplexer problem. Curves: 6-bits, 11-bits and 20-bits using ternary conditions; 6-bits, 11-bits and 20-bits using code fragmented conditions.]

Top 10 Highly Ranked ADFs


Code Fragments using Address Bits


XCS with Code Fragmented Actions
Classifier Population
Condition       Action
0 1 # 0 0 #     code25
1 0 0 # 1 1     code19
...             ...

Population of Code Fragments
0      D2D5&D0D3|d
1      D4
2      D5~D1r
3      D2~
4      D0D1&
...    ...
N-1    ...

Compare Code Fragmented Actions with
Environment Action
1.  Using the environmental condition.

2.  Using the associated condition from the classifier rule
itself. (# either 0 or 1)
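The two options can be sketched as follows. This is an illustration under stated assumptions: the rule (condition 000###, action D0~D4D0dd) is taken from the rule sample listed later in the slides, the action fragment is translated by hand into a Python function, and the function names are hypothetical.

```python
import random


def nand(x, y):
    return 1 - (x & y)


def action_fragment(bits):
    """Hand translation of the postfix fragment D0~D4D0dd."""
    d0, d4 = bits[0], bits[4]
    return nand(1 - d0, nand(d4, d0))


def action_from_message(message):
    """Option 1: evaluate the action fragment on the environmental condition."""
    return action_fragment([int(c) for c in message])


def action_from_rule(condition, rng):
    """Option 2: evaluate on the rule's own condition, each # drawn as 0 or 1."""
    sampled = [rng.randint(0, 1) if c == "#" else int(c) for c in condition]
    return action_fragment(sampled)


rng = random.Random(0)
print(action_from_message("000111"))
print(action_from_rule("000###", rng))
```

For this rule the sampled # values do not change the result, because the fragment's output is fixed once D0 = 0.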


Code Fragmented Actions - Message
[Figure: performance (0.50 to 1.00) against number of instances (0 to 120000) on the multiplexer problem. Legend entries recovered: 6-bits using binary actions; 20-bits using binary actions; 6-bits using code fragmented actions; 11-bits using code fragmented actions.]

Code Fragmented Actions - 2
[Figure: performance (0.50 to 1.00) against number of instances (0 to 120000) on the multiplexer problem.]

Code Fragmented Actions – Rule Sample (Random 0 or 1 for #)
[Figure: performance (0.50 to 1.00) against number of instances (0 to 120000) on the multiplexer problem.]

Code Fragmented Actions - 4
[Figure: performance (0.50 to 1.00) against number of instances (0 to 70000) on the multiplexer problem.]

37-bits Multiplexer


Specialization of Address Bits

Sr. No.    Condition                Action
           A1 A0 D0 D1 D2 D3
1          1  1  #  #  #  1         1
2          0  1  #  1  #  #         1
3          #  1  #  1  #  1         1
4          #  #  1  1  1  1         1

Standard XCS
Rank   Condition      Action   Prediction   Experience
1      0 1 # 1 # #    1        1000         23814
2      0 # 0 0 # #    0        1000         18694
3      # 1 # 0 # 0    0        1000         18287
4      0 1 # 0 # #    0        1000         18184
5      0 # 1 1 # #    1        1000         13090
6      0 # 0 0 # #    1        0            12790
7      0 1 # 0 # #    1        0            12541
8      # 1 # 0 # 0    1        0            12322
9      1 0 # # 1 #    1        1000         8654
10     1 0 # # 0 #    1        0            7799
11     1 # # # 0 0    1        0            7050
12     1 1 # # # 0    1        0            6888
13     1 # # # 1 1    1        1000         6624
14     # 0 1 # 1 #    1        1000         5361
15     # 1 # 1 # 1    1        1000         5213
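A quick way to read the table is to check each rule against the 6-bit multiplexer. The sketch below assumes the usual layout (two address bits, most significant first, then data bits D0 to D3) and a 1000/0 payoff for correct/incorrect actions; the names `matches`, `multiplexer6` and `accuracy` are made up for the example, and only a handful of the listed rules are included.

```python
from itertools import product


def matches(condition, message):
    """Ternary matching: # matches either bit, 0 and 1 must match exactly."""
    return all(c == "#" or c == m for c, m in zip(condition, message))


def multiplexer6(message):
    """6-bit multiplexer: two address bits select one of four data bits."""
    address = int(message[:2], 2)
    return int(message[2 + address])


def accuracy(condition, action):
    """Fraction of matching inputs for which the rule's action is correct."""
    inputs = ["".join(bits) for bits in product("01", repeat=6)]
    matched = [m for m in inputs if matches(condition, m)]
    correct = sum(multiplexer6(m) == action for m in matched)
    return correct / len(matched)


# (condition, action, prediction) for a few rows of the table above.
RULES = [
    ("01#1##", 1, 1000),
    ("0#00##", 0, 1000),
    ("0#00##", 1, 0),
    ("1###00", 1, 0),
    ("#1#1#1", 1, 1000),
]

for condition, action, prediction in RULES:
    print(condition, action, prediction, accuracy(condition, action))
```

Under these assumptions, rules with prediction 1000 are always correct on the inputs they match and rules with prediction 0 are always incorrect.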

Code Fragmented Actions – Rule Sample
Rank   Condition      Action          Prediction   Experience
1      0 0 0 # # #    D0~D4D0dd       1000         4431
2      1 1 # # # 0    D0D1rD1D1|d     0            4251
3      1 0 # # 1 #    D5D0r~          1000         3159
4      0 1 # 0 # #    D0D0r~          1000         2822
5      0 0 1 # # #    D2D2|D1D1d&     1000         2638
6      1 0 # # 0 #    D2D0dD1D2dr     1000         2550
7      0 1 # 0 # #    D0D4dD2D0&d     0            2525
8      0 0 0 # # #    D1D1dD2D2&|     0            2189
9      1 1 # # # 1    D3D1|D1~&       0            2143
10     0 1 # 1 # #    D4D0|D1D1rd     1000         2052
11     1 1 # # # 0    D0D2rD5D2r&     1000         2068
12     1 0 # # 0 #    D0D3rD4~&       1000         1926
13     1 1 # # # 1    D1D3|D5D3d|     1000         1820
14     0 1 # 1 # #    D0D2|D4D1||     1000         1758
15     1 1 # # # 0    D5D0dD5&        1000         1594

Evolved code fragmented conditions
[Diagram: XCS with code fragmented conditions. Labels: classifier population (Condition, Action); population of code fragments; match set; action set; action; environment; reward; rule discovery; GA - GP evolution (excluding codeN). The classifier population and the population of code fragments are the same as on the slide "XCS with Code Fragmented Conditions".]

6-bits MUX using Evolvable Code Fragmented Conditions
[Figure: performance (0.50 to 1.00) against number of instances (0 to 40000) on the 6-bits multiplexer, comparing ternary conditions with evolved code fragmented conditions.]

Evolved code fragmented conditions
(Code Fragment Sample)
Sr. No.   Code Fragment    Fitness
1         D2D3rD3D5&&      1000.00
2         D0D2|D2D3dr      1000.00
3         D3D4dD3D3&r      1000.00
4         D5D5rD2D5&&      1000.00
5         D4D1|D1D5dr      1000.00
6         D1D1D1rr         1000.00
7         D1D5dD1r         1000.00
8         D0D2rD3D2&&      1000.00
9         D3D5rD0D3&&      1000.00
10        D4D0dD1D4|r      1000.00
11        D2D1|D1~r        1000.00
12        D4D1rD1&         1000.00
13        D4D4D0r&         1000.00
14        D4D3rD1D4&&      1000.00
15        D0D0dD2D0|r      1000.00

A Code Fragment always Outputting ‘0’
D2 D3 D5    A=D2D3r    B=D3D5&    AB&
0  0  0     1          0          0
0  0  1     1          0          0
0  1  0     0          0          0
0  1  1     0          1          0
1  0  0     0          0          0
1  0  1     0          0          0
1  1  0     0          0          0
1  1  1     0          1          0
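The truth table can be reproduced by brute force. In this small sketch the fragment D2D3rD3D5&& is translated by hand into a Python function (the function names are illustrative) and evaluated on all eight combinations of D2, D3 and D5.

```python
from itertools import product


def nor(x, y):
    return 1 - (x | y)


def fragment(d2, d3, d5):
    a = nor(d2, d3)  # A = D2D3r
    b = d3 & d5      # B = D3D5&
    return a & b     # AB&


# Collect the distinct outputs over every combination of D2, D3, D5.
outputs = {fragment(d2, d3, d5) for d2, d3, d5 in product((0, 1), repeat=3)}
print(outputs)
```

A needs D3 = 0 while B needs D3 = 1, so the two can never both be 1.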


XCS using ADFs
Condition Action
D2D0F25 D1D5F0 D5D3F3 D4D2F15 D0D0FN D0D4F30 0
D5D1F15 D3D3F19 D4D5F15 D0D0FN D5D5F1 D0D0FN 1
... ... ... ... ... ... ...
... ... ... ... ... ... ...

Population of ADFs
0 ARG2ARG1F3ARG1ARG1F2F4
1 ARG1
2 ARG1~ARG2F9
3 ARG2~
4 ARG1ARG2&
... ...
... ...
N-1 ...
N ARG1ARG1~|

Condition Matching (Evaluating an ADF)
Condition                                     Action
D3D1F6 D1D5F0 D5D3F3 D4D2F5 D1D0F1 D0D4F4     0

Input 101110: D0 D1 D2 D3 D4 D5 = 1 0 1 1 1 0

F0: ARG1ARG2F8ARG1ARG1F1F4
F1: ARG2ARG2rARG1ARG1|F7
F2: ARG1
F3: ARG2ARG1F9ARG2ARG1F8F6
F4: ARG2ARG2d~
F5: ARG1ARG2F7~
F6: ARG2ARG1F8ARG1~d
F7: ARG1ARG1&ARG1ARG1||
F8: ARG1~ARG1ARG1rr
F9: ARG2ARG1d~

D3D1F6
= 1 0 F6
= 1 0 ARG2ARG1F8ARG1~d
= 0 1 F8 1~ d
= 0 1 F8 0 d
= 0 1 ARG1~ARG1ARG1rr
= 0~ 0 0 r r
= 1 1 r
= 0
Not Matched

Simplified ADFs used for Multiplexer
Problems

F0: ARG1ARG2F8ARG1ARG1F1F4 → ARG1~
F1: ARG2ARG2rARG1ARG1|F7 → ARG2~
F2: ARG1 → ARG1
F3: ARG2ARG1F9ARG2ARG1F8F6 → ARG2ARG1&ARG2F6
F4: ARG2ARG2d~ → ARG2
F5: ARG1ARG2F7~ → ARG1~
F6: ARG2ARG1F8ARG1~d → ARG2ARG1~d
F7: ARG1ARG1&ARG1ARG1|| → ARG1
F8: ARG1~ARG1ARG1rr → ARG1
F9: ARG2ARG1d~ → ARG2ARG1&
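The simplifications can be checked mechanically. The sketch below is one possible evaluator, not the authors' code: it assumes that a call written `x y Fk` binds ARG1 = x and ARG2 = y, evaluates nested ADF calls recursively, and compares each original ADF with its simplified form on all four argument combinations. The names `ADFS`, `SIMPLIFIED` and `eval_body` are illustrative.

```python
import re
from itertools import product

TOKEN = re.compile(r"ARG\d+|F\d+|[&|dr~]")

BINARY = {
    "&": lambda x, y: x & y,
    "|": lambda x, y: x | y,
    "d": lambda x, y: 1 - (x & y),  # NAND
    "r": lambda x, y: 1 - (x | y),  # NOR
}

ADFS = {
    0: "ARG1ARG2F8ARG1ARG1F1F4", 1: "ARG2ARG2rARG1ARG1|F7",
    2: "ARG1", 3: "ARG2ARG1F9ARG2ARG1F8F6",
    4: "ARG2ARG2d~", 5: "ARG1ARG2F7~",
    6: "ARG2ARG1F8ARG1~d", 7: "ARG1ARG1&ARG1ARG1||",
    8: "ARG1~ARG1ARG1rr", 9: "ARG2ARG1d~",
}

SIMPLIFIED = {
    0: "ARG1~", 1: "ARG2~", 2: "ARG1", 3: "ARG2ARG1&ARG2F6", 4: "ARG2",
    5: "ARG1~", 6: "ARG2ARG1~d", 7: "ARG1", 8: "ARG1", 9: "ARG2ARG1&",
}


def eval_body(body, arg1, arg2):
    """Evaluate a postfix ADF body; Fk pops two values and calls ADF k."""
    args = {"ARG1": arg1, "ARG2": arg2}
    stack = []
    for tok in TOKEN.findall(body):
        if tok in args:
            stack.append(args[tok])
        elif tok[0] == "F":
            y = stack.pop()
            x = stack.pop()
            stack.append(eval_body(ADFS[int(tok[1:])], x, y))
        elif tok == "~":
            stack.append(1 - stack.pop())
        else:
            y = stack.pop()
            x = stack.pop()
            stack.append(BINARY[tok](x, y))
    return stack[0]


results = {}
for k in ADFS:
    results[k] = all(
        eval_body(ADFS[k], a, b) == eval_body(SIMPLIFIED[k], a, b)
        for a, b in product((0, 1), repeat=2)
    )
    print(f"F{k}:", "equivalent" if results[k] else "different")
```

Because every ADF here takes two Boolean arguments, four evaluations per ADF are enough for an exhaustive check.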


Comparison of XCS using ADFs
[Figure: performance (0.50 to 1.00) against number of instances (0 to 50000) on the multiplexer problem. Legend entries recovered: 6-bits using standard XCS; 20-bits using standard XCS; 6-bits using XCS with ADFs; 11-bits using XCS with ADFs.]
Number of ADFs used = 10


37-bits Multiplexer
[Figure: performance (0.4 to 1) against number of instances (0 to 500000), comparing XCS using 20 ADFs with standard XCS.]
Results are from a single run.

Comparison using Multilevel ADFs
[Figure: performance (0.40 to 1.00) against number of instances (0 to 500000) on the 37-bits multiplexer, comparing XCS using multilevel ADFs with standard XCS.]
Number of classifiers used = 8000, 20 runs average

Conclusions
•  Code Fragments capture important information.
•  Automatically Defined Functions reduce training
time in GP.
•  Automatically Defined Functions reduce the
number of iterations needed during training in
LCS.
•  Automatically Defined Functions produce
compact GP trees.
•  The many-genotypes-to-one-phenotype issue in
feature rich encodings (code fragments and
ADFs) disrupts subsumption deletion.

Future Work
•  Simplification into ADFs in LCS
•  Remove non-responsive classifiers.
•  MAM technique for ADFs’ fitness.
•  Seed identified fit ADFs from a simple
problem to a more complex problem in the
same domain.
•  Multiple populations of ADFs from
different problem domains for a general
problem solving system.

Thank You

Questions?


Supplementary Slides


GP Resource Demands
•  GP is notoriously resource consuming
•  CPU cycles
•  Memory
•  Standard GP system, 1µs per node
•  Binary trees, depth 17: 131 ms per tree
•  Fitness cases: 1,000 Population size: 1,000
•  Generations: 1,000 Number of runs: 100
»  Runtime: 10 Gs ≈ 317 years

•  Standard GP system, 1ns per node
»  Runtime: 116 days

•  Limits to what we can approach with GP
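The runtime figures can be reproduced with a few lines of arithmetic. A binary tree of depth 17 has about 2^17 = 131072 nodes, hence roughly 131 ms per tree at 1 µs per node. The sketch below assumes that the quoted 10 Gs comes from rounding this to 0.1 s per tree; that rounding, and the function name, are assumptions made for the example.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def total_runtime_seconds(seconds_per_tree, fitness_cases=1000,
                          population=1000, generations=1000, runs=100):
    """Total time if every tree is evaluated on every fitness case."""
    return seconds_per_tree * fitness_cases * population * generations * runs


nodes = 2 ** 17                    # about 131072 nodes per tree
print(nodes * 1e-6, "s per tree at 1 microsecond per node")

# Rounded to 0.1 s per tree (assumption), 1 microsecond per node.
slow = total_runtime_seconds(0.1)
# 1 ns per node is 1000 times faster.
fast = total_runtime_seconds(0.1 / 1000)

print(slow, "s, about", round(slow / SECONDS_PER_YEAR), "years")
print(fast, "s, about", round(fast / SECONDS_PER_DAY), "days")
```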

[Banzhaf and Harding – GECCO 2009]

Sources of Speed-up

•  Fast machines

•  Vector Processors

•  Parallel Machines (MIMD/SIMD)

•  Clusters

•  Loose Networks

•  Multi-core

•  Graphics Processing Units (GPU)


Why is the GPU faster than the CPU?

The GPU Devotes More Transistors to Data Processing.

[CUDA C Programming Guide Version 3.2]

GPU Programming APIs
(Application Programming Interface)
•  There are a number of toolkits available for
programming GPUs.
•  CUDA
•  MS Accelerator
•  RapidMind
•  Shader programming
•  So far, researchers in GP have not converged on
one platform


CUDA Programming
Massive number (>10000) of light-weight threads.


CUDA Memory Model
CUDA exposes all the different types of memory on the GPU:

[CUDA C Programming Guide Version 3.2 ]

A Many Threaded CUDA
Interpreter for Genetic
Programming
•  Solved 20-bits Multiplexer
•  2^20 = 1048576 fitness cases
•  Has never been solved by tree GP before
•  Previously estimated time: more than 4 years
•  GPU has consistently done it in less than an hour
•  Solved 37-bits Multiplexer
•  2^37 = 137438953472 fitness cases
•  Has never been attempted before using GP
•  GPU solves it in under a day

[W.B.Langdon, EuroGP-2010]

Genetic Programming Parameters
for Solving 20 and 37 Multiplexers
Terminals:            20 or 37 Boolean inputs, D0 – D19 or D0 – D36 respectively
Functions:            AND, OR, NAND, NOR
Fitness:              Pseudo random sample of 2048 of 1048576 or 8192 of 137438953472 fitness cases.
Tournament:           4 members run on same random sample. New samples for each tournament and each generation.
Population:           262144
Initial Population:   Ramped half-and-half 4:5 (20-Mux) or 5:7 (37-Mux)
Parameters:           50% subtree crossover, 5% subtree and 45% point mutation. Max depth 15, max size 511 (20-Mux) or 1023 (37-Mux)
Termination:          5000 generations

Solutions are found in generations 423 (20-Mux) and 2866 (37-Mux).
[W.B.Langdon, EuroGP-2010]

XCS using Standalone ADFs
Condition Action
D5D2D0F25 D2D1D5F0 D1D5D3F3 D0D4D2F15 D0D0D0FN D5D0D4F30 0
D4D5D1F15 D1D3D3F19 D3D4D5F15 D0D0D0FN D2D5D5F1 D0D0D0FN 1
... ... ... ... ... ... ...
... ... ... ... ... ... ...

Population of ADFs
0 ARG2ARG1&ARG3ARG3|d
1 ARG1
2 ARG1~ARG3r
3 ARG2~
4 ARG3ARG2&
... ...
... ...
N-1 ...
N ARG1ARG1~|

20 ADFs used for Multiplexer Problems

F0: ARG1ARG3|ARG2~r F1: ARG2
F2: ARG2ARG2rARG1ARG2r& F3: ARG2ARG2&ARG3ARG2&d
F4: ARG3 F5: ARG1~ARG3ARG1||
F6: ARG2ARG2dARG1~| F7: ARG1ARG2&ARG3&
F8: ARG1 F9: ARG2ARG2|ARG3ARG2|d
F10: ARG3ARG1dARG2ARG1d& F11: ARG1ARG3|ARG1ARG3d&
F12: ARG1~ARG3ARG1rd F13: ARG3ARG2|ARG2ARG2|d
F14: ARG2ARG2dARG3ARG3r& F15: ARG2~ARG3ARG3||
F16: ARG3ARG3|ARG3| F17: ARG3ARG3|~
F18: ARG3ARG1&ARG1ARG2|d F19: ARG2ARG1rARG2~r


Condition                                                     Action
D0D3D1F18 D2D1D5F0 D1D5D3F13 D0D4D2F15 D5D1D0F1 D0D0D4F4      0

Input 010110: D0 D1 D2 D3 D4 D5 = 0 1 0 1 1 0

F18: ARG3ARG1&ARG1ARG2|d

D0D3D1F18
= 0 1 1 F18
= 0 1 1 ARG3ARG1&ARG1ARG2|d
= 1 0 & 0 1 | d
= 0 1 d
= 1
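The trace above works by substitution: the argument values replace ARG1, ARG2 and ARG3 in the ADF body, and the resulting postfix string of bits is evaluated. A minimal sketch of that two-step process follows; the helper names `bind` and `eval_postfix` are made up for the example.

```python
import re

BINARY = {
    "&": lambda x, y: x & y,
    "|": lambda x, y: x | y,
    "d": lambda x, y: 1 - (x & y),  # NAND
    "r": lambda x, y: 1 - (x | y),  # NOR
}


def bind(adf_body, args):
    """Replace ARG1, ARG2, ... in the body with the supplied bit values."""
    return re.sub(r"ARG(\d+)",
                  lambda m: str(args[int(m.group(1)) - 1]),
                  adf_body)


def eval_postfix(expr):
    """Evaluate a postfix string made of the bits 0/1 and the operators."""
    stack = []
    for tok in re.findall(r"[01]|[&|dr~]", expr):
        if tok in "01":
            stack.append(int(tok))
        elif tok == "~":
            stack.append(1 - stack.pop())
        else:
            y = stack.pop()
            x = stack.pop()
            stack.append(BINARY[tok](x, y))
    return stack[0]


message = "010110"
# D0D3D1F18: the arguments are the values of D0, D3 and D1.
args = [int(message[i]) for i in (0, 3, 1)]
bound = bind("ARG3ARG1&ARG1ARG2|d", args)
print(bound)
print(eval_postfix(bound))
```

Substitution is sufficient here because standalone ADFs do not call other ADFs.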



37-bits Multiplexer
[Figure: performance (0.4 to 1) against number of instances (0 to 500000), comparing XCS using standalone ADFs with standard XCS.]
Number of classifiers used = 8000. Results are from a single run.

XCS using Multilevel ADFs
•  Code fragments do not explore the search space
as efficiently as ADFs can.
•  Evaluating an ADF tree takes a lot of time
because of nested calls to other ADFs.
•  ADFs that cannot call other ADFs lie between
the above two techniques, both in how well they
explore the search space and in the time they take.
•  So one more option was tried: Multilevel ADFs


•  Three level ADFs
•  20 ADFs at each level
•  Each ADF takes two arguments
•  ADFs at level 1 can call any ADF from level 2
and level 3 but cannot call any ADF from level 1
•  ADFs at level 2 can call any ADF from level 3
but cannot call any ADF from level 1 or level 2
•  Level 3 ADFs are not allowed to call any other
ADF
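The calling rules amount to one check: an ADF may only call ADFs from a strictly deeper level. The sketch below assumes that ADF ids 1 to 20, 21 to 40 and 41 to 60 belong to levels 1, 2 and 3 (consistent with the sample ids shown in the slides, but still an assumption); the function names are illustrative.

```python
import re

ADFS_PER_LEVEL = 20


def level_of(adf_id):
    """Level of an ADF, assuming ids 1-20, 21-40, 41-60 map to levels 1-3."""
    return (adf_id - 1) // ADFS_PER_LEVEL + 1


def calls(body):
    """Ids of all ADFs called (as Fk tokens) inside a postfix ADF body."""
    return [int(k) for k in re.findall(r"F(\d+)", body)]


def is_valid(adf_id, body):
    """An ADF may only call ADFs from a strictly deeper level."""
    return all(level_of(callee) > level_of(adf_id) for callee in calls(body))


# Sample ADFs from the slides, plus one deliberately invalid body.
print(is_valid(19, "ARG1ARG2&ARG2ARG1F39F34"))
print(is_valid(39, "ARG1ARG2F58ARG1ARG2F53|"))
print(is_valid(39, "ARG1ARG2F20"))
```

Since calls only ever go to deeper levels, the call graph has no cycles and nesting depth is bounded by the number of levels.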

Condition Action
... ... ... ... ... ... ...
... ... ... ... ... ... ...
Population of ADFs
Level 1

... ...
19 ARG1ARG2&ARG2ARG1F39F34
20 ARG2ARG1F40ARG1F44
Level 2

... ...
39 ARG1ARG2F58ARG1ARG2F53|
40 ARG2ARG1d~
Level 3

... ...
59 ARG2ARG1ARG2&d
60 # ARG1ARG1~|

[Figure: performance (0.50 to 1.00) against number of instances (0 to 80000) on the multiplexer problem. Curves: 6-bits, 11-bits and 20-bits using XCS with multilevel ADFs.]
