Learning the membership function contexts for mining fuzzy association rules by using genetic algorithms
Jesús Alcalá-Fdez, Rafael Alcalá, María José Gacto, Francisco Herrera

  Fuzzy Sets and Systems (2008), article in press

         Presenter: Chia-Ming Wang
Before we go
• T. Hong, C. Chen, Y. Wu, Y. Lee, Using divide-and-conquer GA strategy in fuzzy data
  mining, in: IEEE Symp. on Fuzzy Systems, Budapest, Hungary, 2004, pp. 116–121.
• T. Hong, C. Kuo, S. Chi, Trade-off between time complexity and number of rules for
  fuzzy mining from quantitative data, International Journal of Uncertainty, Fuzziness and
  Knowledge-Based Systems 9 (5) (2001) 587–604.
      Thanks to Prof. Hong, who provided me with the second paper today.
Problem Description

GA, 2-tuples model, Quantitative Association Rule
A Transaction Database

    TID            items
     1          Bread, Milk
     2    Bread, Diaper, Beer, Eggs
     3    Milk, Diaper, Beer, Coke
     4    Bread, Milk, Diaper, Beer
     5    Bread, Milk, Diaper, Coke
Association Rule Mining

TID            items
 1          Bread, Milk
 2    Bread, Diaper, Beer, Eggs
 3    Milk, Diaper, Beer, Coke
 4    Bread, Milk, Diaper, Beer
 5    Bread, Milk, Diaper, Coke

Examples:
  {Diaper}→{Beer}
  {Milk, Bread}→{Eggs, Coke}
  {Beer, Bread}→{Milk}

X→Y, X∩Y=∅

Implication means co-occurrence, not causality!
Terminology

TID            items
 1          Bread, Milk
 2    Bread, Diaper, Beer, Eggs
 3    Milk, Diaper, Beer, Coke
 4    Bread, Milk, Diaper, Beer
 5    Bread, Milk, Diaper, Coke

Example: {Milk, Diaper}→{Beer}

support
\[ s = \frac{\sigma(\{\text{Milk, Diaper, Beer}\})}{|T|} = \frac{2}{5} = 0.4 \]

confidence
\[ c = \frac{\sigma(\{\text{Milk, Diaper, Beer}\})}{\sigma(\{\text{Milk, Diaper}\})} = \frac{2}{3} \approx 0.67 \]

Itemset, minsup, minconf
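
The support and confidence figures above can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `sigma`, `support` and `confidence` are chosen here for illustration, and the data is the five-transaction example database from the slide.

```python
# Transactions from the slide's example database (TID 1 to 5).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def sigma(itemset, db):
    """Support count: number of transactions that contain every item of itemset."""
    return sum(1 for t in db if itemset <= t)

def support(antecedent, consequent, db):
    """Fraction of all transactions that contain antecedent and consequent together."""
    return sigma(antecedent | consequent, db) / len(db)

def confidence(antecedent, consequent, db):
    """Among transactions containing the antecedent, the fraction also containing the consequent."""
    return sigma(antecedent | consequent, db) / sigma(antecedent, db)

x, y = {"Milk", "Diaper"}, {"Beer"}
s = support(x, y, transactions)      # 2 / 5
c = confidence(x, y, transactions)   # 2 / 3
print(f"support={s:.2f} confidence={c:.2f}")
```

A rule is usually kept only when `s` reaches minsup and `c` reaches minconf.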
Real-world
     Transaction Database
TID                   (item, quantity)
 1                  (Bread, 3), (Milk, 1)
 2      (Bread, 1), (Diaper, 2), (Beer, 3), (Eggs, 12)
 3       (Milk,2), (Diaper, 4), (Beer, 5), (Coke, 2)
 4      (Bread, 3), (Milk, 1), (Diaper, 2), (Beer, 12)
 5      (Bread, 2), (Milk, 4), (Diaper, 5), (Coke, 3)
Real-world
     Transaction Database
Quantitative Association Rule Mining
Quantitative Association Rule, 2-tuples model
Linguistic terms

[Figure: membership functions Low, Middle, High over the domain age, and Low, Middle, High over the domain weight]

  if age is Middle then weight is High
The 2-tuples linguistic representation

if age is Middle then weight is High

if age is (Middle, 0.3) then weight is (High, -0.1)

\[ (s_i, \alpha_i), \quad s_i \in S, \quad \alpha_i \in [-0.5, 0.5) \]

F. Herrera, L. Martínez, A 2-tuple fuzzy linguistic representation model for computing with words, IEEE Trans. Fuzzy Systems 8 (6) (2000) 746–752.
[Figure: linguistic terms s0, s1, s2, s3, s4 placed at the domain values 0, 1, 2, 3, 4, with a displacement scale marked -1, -0.5, 0.5, 1. Each term can be displaced laterally by α between -0.5 and 0.5. Example: the 2-tuple (s2, -0.3), with α = -0.3, corresponds to the domain value 1.7.]
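
The correspondence between the domain value 1.7 and the 2-tuple (s2, -0.3) can be sketched in Python as follows. The function names `to_two_tuple` and `from_two_tuple` are hypothetical, and the code assumes the input already lies on the label-index scale 0 to 4 used in the figure.

```python
import math

def to_two_tuple(beta):
    """Map a value on the label-index scale to (label index i, displacement alpha).

    Rounding half up (instead of Python's round, which rounds half to even)
    keeps alpha inside the half-open interval [-0.5, 0.5).
    """
    i = math.floor(beta + 0.5)
    return i, beta - i

def from_two_tuple(i, alpha):
    """Inverse mapping: the 2-tuple (s_i, alpha) back to a value on the scale."""
    return i + alpha

i, alpha = to_two_tuple(1.7)
print(f"(s{i}, {alpha:.1f})")   # label s2 displaced by about -0.3
```

The displacement is what the genetic algorithm later tunes: the label stays the same, only its lateral position changes.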
Interpretation


if age is (Middle, 0.3) then weight is (High, -0.1)


 if age is (higher than Middle)
 then weight is (a bit smaller than High)
GA, 2-tuples model
Traditional GA

[Figure: GA cycle. Population (chromosomes) → parents → Evaluation (fitness) → Reproduction (selection) → Mating pool → Mates (recombination) → Genetic operators (crossover, mutation) → offspring → Population]
GA Used in this paper

• CHC genetic model
• MFs codification and initial gene pool
• Chromosome evaluation
• Crossover operator
Scheme of CHC model

1. Initialize population and THRESHOLD
2. Crossover of N parents
     Incest prevention: 1/2 * Hamming distance > L
     L = (#Genes * BITSGENE) / 4
3. Evaluation of the new individuals
4. Selection of the best N individuals among parents and offspring
5. If NO new individual, decrement THRESHOLD
6. THRESHOLD < 0?
     no:  go to step 2
     yes: restart the population and THRESHOLD, then go to step 2

L. Eshelman, The CHC adaptive search algorithm: How to have safe search when engaging in nontraditional genetic recombination, Foundations of Genetic Algorithms, Vol. 1, Morgan Kaufmann, Los Altos, CA, 1991, pp. 265–283.
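
The incest-prevention test in the CHC scheme can be illustrated with a small Python sketch. It assumes chromosomes are bit strings of equal length; the function names are hypothetical and the sketch covers only the mating check, not the full CHC loop.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def initial_threshold(n_genes, bits_per_gene):
    """L = (#Genes * BITSGENE) / 4, as given on the slide."""
    return (n_genes * bits_per_gene) / 4

def may_mate(parent_a, parent_b, threshold):
    """Incest prevention: cross the parents only if half their Hamming distance exceeds the threshold."""
    return hamming(parent_a, parent_b) / 2 > threshold

threshold = initial_threshold(n_genes=2, bits_per_gene=4)   # 2.0
print(may_mate("00000000", "11111111", threshold))   # distant parents
print(may_mate("00000000", "00000011", threshold))   # near-identical parents
```

As the threshold is decremented over generations without new individuals, ever more similar parents are allowed to mate, until the threshold drops below zero and the population is restarted.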
MFs Codification

[Figure: membership functions L1, M1, H1 over age and L2, M2, H2 over weight, shown before and after lateral displacement]

Chromosome with no displacement:
  L1    M1    H1    L2    M2    H2
  0     0     0     0     0     0

Example chromosome with lateral displacements:
  L1    M1    H1    L2    M2    H2
  0.2   0.4   0     -0.2  -0.3  -0.5
Initial Gene Pool
chromosome:    (c11,...,c1m, c21,...,c2m, ..., cn1,...,cnm)
               each block (ck1,...,ckm): 1 item with m MFs

 • initial MFs obtained from expert knowledge
 • individuals generated at random in [-0.5, 0.5)
Implementation:
  Gray Code
Decimal   Binary   Gray Code
  0        000        000
  1        001        001
  2        010        011
  3        011        010
  4        100        110
  5        101        111
  6        110        101
  7        111        100
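
The table can be generated with the standard binary-reflected Gray code conversion. In the sketch below, `decode_gene` is an assumption added for illustration: the slides do not say how a Gray-coded gene is mapped to a displacement, so a plain linear mapping onto [-0.5, 0.5) is used.

```python
def binary_to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Inverse conversion: XOR together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def decode_gene(gray_bits):
    """Assumed linear mapping of a Gray-coded bit string onto [-0.5, 0.5)."""
    value = gray_to_binary(int(gray_bits, 2))
    return -0.5 + value / (2 ** len(gray_bits))

# Reproduce the Gray Code column of the table for decimals 0 to 7.
table = [format(binary_to_gray(d), "03b") for d in range(8)]
print(table)
```

Adjacent integers differ in exactly one Gray-coded bit, so a single bit flip can move a gene to a neighbouring value.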
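Equation Mania

\[ \mathit{fitness}(C_q) = \frac{\sum_{x \in L_1} \mathit{fuzzy\_support}(x)}{\mathit{suitability}(C_q)} \]

\[ \mathit{suitability}(C_q) = \sum_{k=1}^{n} \left[ \mathit{overlap\_factor}(C_{qk}) + \mathit{coverage\_factor}(C_{qk}) \right] \]

\[ \mathit{overlap\_factor}(C_{qk}) = \sum_{i=1}^{m} \sum_{j=i+1}^{m} \left[ \max\left( \frac{\mathit{overlap}(R_i, R_j)}{\min(\mathit{spanR}_{R_i}, \mathit{spanL}_{R_i})}, 1 \right) - 1 \right] \]

\[ \mathit{coverage\_factor}(C_{qk}) = \frac{1}{\dfrac{\mathit{range}(R_1, \ldots, R_m)}{\max(I_k)}} \]

With coverage factor equal to 1 for every item, suitability becomes:

\[ \mathit{suitability}(C_q) = \sum_{k=1}^{n} \left[ \mathit{overlap\_factor}(C_{qk}) + 1 \right] \]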
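\[ \mathit{overlap\_factor}(C_{qk}) = \sum_{i=1}^{m} \sum_{j=i+1}^{m} \left[ \max\left( \frac{\mathit{overlap}(R_i, R_j)}{\min(\mathit{spanR}_{R_i}, \mathit{spanL}_{R_i})}, 1 \right) - 1 \right] \]

C_{qk}: q-th chromosome, k-th item

[Figure: two adjacent MFs Ri and Rj with the lengths overlap, SpanR and SpanL marked, shown for two cases; the second case is labelled "penalty".]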
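\[ \mathit{coverage\_factor}(C_{qk}) = \frac{1}{\dfrac{\mathit{range}(R_1, \ldots, R_m)}{\max(I_k)}} \]

C_{qk}: q-th chromosome, k-th item

[Figure: MFs R1, R2, R3 for the item Milk over a domain with ticks 0, 5, 10, shown for two partitions, with range marked on the second. Here coverage factor(Cqk) = 1.]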
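Fuzzy Support (count)

DB: T transactions, n items, m MFs per item.

v_j^{(i)}: quantity of the j-th item (e.g. bread) in the i-th transaction. It is transformed into the fuzzy set

\[ f_j^{(i)} = \frac{f_{j1}^{(i)}}{R_{j1}} + \cdots + \frac{f_{jm}^{(i)}}{R_{jm}} \]

where f_{jk}^{(i)} is the membership degree of v_j^{(i)} in R_{jk}, the k-th MF of the j-th item.

\[ \mathit{count}_{jk} = \sum_{i=1}^{T} f_{jk}^{(i)} \qquad \text{(e.g. bread.low.count)} \]

\[ L_1 = \{ R_{jk} \mid \mathit{count}_{jk} \ge \alpha,\ 1 \le j \le n \text{ and } 1 \le k \le m \} \]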
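
A small Python sketch of the count computation for one item follows. The triangular MFs and the threshold value are hypothetical choices made only for illustration; the bread quantities come from the quantitative transaction database shown earlier, with 0 standing for a transaction that does not contain bread.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular MF with feet a, c and peak b."""
    if x == b:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0

# Hypothetical MFs for bread (not taken from the slides).
mfs = {"low": (0, 2, 4), "middle": (2, 4, 6), "high": (4, 6, 8)}

# Bread quantity in transactions 1 to 5 (0 = bread absent).
bread = [3, 1, 0, 3, 2]

def fuzzy_counts(values, mfs):
    """count_jk: sum of membership degrees over all transactions, per MF."""
    return {label: sum(triangular(v, *abc) for v in values)
            for label, abc in mfs.items()}

counts = fuzzy_counts(bread, mfs)                        # e.g. counts["low"] is bread.low.count
supports = {label: cnt / len(bread) for label, cnt in counts.items()}
alpha = 1.5                                              # hypothetical minimum count
large_1 = [label for label, cnt in counts.items() if cnt >= alpha]
print(counts, supports, large_1)
```

Only the fuzzy regions whose count reaches the threshold enter L1 and contribute to the fitness numerator.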
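Fuzzy Support

\[ \mathit{fitness}(C_q) = \frac{\sum_{x \in L_1} \mathit{fuzzy\_support}(x)}{\mathit{suitability}(C_q)} \]

  numerator: sum over the large 1-itemsets L1, with fuzzy support = count / T (T = # transactions)

\[ \mathit{suitability}(C_q) = \sum_{k=1}^{n} \left[ \mathit{overlap\_factor}(C_{qk}) + 1 \right] \]

  sum over the n items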
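
As a worked check of the fitness formula, the sketch below plugs in the totals reported for the uniform fuzzy partition in the results table (Sup 0.2, three linguistic terms: Fsup 9.24, Suit 10.00, Fit 0.92). It assumes ten items with zero overlap factor, which is what a suitability of 10.00 suggests; the function names are hypothetical.

```python
def suitability(overlap_factors):
    """Sum of (overlap_factor + 1) over the n items."""
    return sum(of + 1.0 for of in overlap_factors)

def fitness(fuzzy_supports_l1, overlap_factors):
    """Total fuzzy support of the large 1-itemsets divided by suitability."""
    return sum(fuzzy_supports_l1) / suitability(overlap_factors)

# Assumption: 10 items whose MFs do not overlap excessively (overlap factor 0 each).
overlaps = [0.0] * 10
suit = suitability(overlaps)

# The slides report only the total fuzzy support, so it is passed as one summed value.
fit = fitness([9.24], overlaps)
print(f"suitability={suit:.2f} fitness={fit:.3f}")
```

The result rounds to the 0.92 shown in the table, which is consistent with the formula as reconstructed here.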
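PCBLX Crossover

\[ X = (x_1 \cdots x_n), \quad Y = (y_1 \cdots y_n), \quad x_i, y_i \in [a_i, b_i] \subset \mathbb{R},\ i = 1 \cdots n \]

\[ O_1 = (o_{11} \cdots o_{1n}), \quad o_{1i} \in [l_i^1, u_i^1], \quad l_i^1 = \max\{a_i, x_i - I_i \cdot \alpha\}, \quad u_i^1 = \min\{b_i, x_i + I_i \cdot \alpha\} \]

\[ O_2 = (o_{21} \cdots o_{2n}), \quad o_{2i} \in [l_i^2, u_i^2], \quad l_i^2 = \max\{a_i, y_i - I_i \cdot \alpha\}, \quad u_i^2 = \min\{b_i, y_i + I_i \cdot \alpha\} \]

\[ I_i = |x_i - y_i| \]

[Figure: axis with a_i, x_i, y_i, b_i comparing the sampling intervals of PCBLX and BLX]

F. Herrera, M. Lozano, A.M. Sánchez, A taxonomy for the crossover operator for real-coded genetic algorithms: An experimental study. Int. J. Intell. Syst. 18 (2003) 309-338.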
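
The interval definitions above translate into a short Python sketch. It assumes each offspring gene is drawn uniformly from its interval, and the value of `alpha` and the example parents are hypothetical; the bounds [-0.5, 0.5] match the displacement range used for the genes in this talk.

```python
import random

def pcblx(x, y, bounds, alpha=0.5, rng=random):
    """Parent-centric BLX sketch: one offspring is sampled around each parent.

    For gene i, with I_i = |x_i - y_i|, offspring 1 is drawn from
    [max(a_i, x_i - I_i*alpha), min(b_i, x_i + I_i*alpha)] and offspring 2
    from the same kind of interval centred on y_i.
    """
    o1, o2 = [], []
    for xi, yi, (ai, bi) in zip(x, y, bounds):
        spread = abs(xi - yi) * alpha
        o1.append(rng.uniform(max(ai, xi - spread), min(bi, xi + spread)))
        o2.append(rng.uniform(max(ai, yi - spread), min(bi, yi + spread)))
    return o1, o2

rng = random.Random(0)                 # fixed seed so the run is repeatable
x = [0.2, -0.4]                        # hypothetical parent displacements
y = [-0.1, 0.45]
bounds = [(-0.5, 0.5)] * len(x)
o1, o2 = pcblx(x, y, bounds, alpha=1.0, rng=rng)
print(o1, o2)
```

Because the intervals are centred on the parents themselves, offspring stay close to one parent while the interval width still scales with how far apart the parents are.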
Conceptual Flowchart

[Flowchart with two stages.
 Learning Membership Function: Predefined MFs and the Transaction Database feed the Learning Process, which exchanges MFs with the Evaluation Module (Fitness).
 Mining Fuzzy Association Rules: the Definitive MFs and the Transaction Database feed Fuzzy mining, which outputs Fuzzy Association Rules.]
Procedures
Stage 1
1. initialization
2. evaluate the initial chromosomes
     1. for all items in transaction, transfer the
        quantitative values to fuzzy sets
     2. calculate count, fuzzy support
     3. calculate fitness
3. set threshold L
4. generate the next population
5. CHC procedure
6. if the number of runs has not been reached, go to step 4
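Step 2 of Stage 1 can be sketched as follows. This is only a minimal illustration, assuming triangular membership functions and assuming that the fuzzy support of a fuzzy item is its count (summed membership) divided by the number of transactions. The function names, the membership-function parameters and the toy quantities are hypothetical and not taken from the paper.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b."""
    if x == b:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0


def fuzzy_support(values, mf):
    """Count = summed membership over all transactions; support = count / N."""
    count = sum(triangular(v, *mf) for v in values)
    return count / len(values)


# Hypothetical quantities of one item in five transactions.
quantities = [1, 2, 3, 4, 5]
# Hypothetical membership functions (a, b, c) for three linguistic terms.
mfs = {"Low": (-2, 1, 4), "Middle": (0, 3, 6), "High": (2, 5, 8)}

supports = {term: fuzzy_support(quantities, mf) for term, mf in mfs.items()}
```

A fuzzy item would then count as a large 1-itemset when its support reaches the chosen minimum support.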
Stage 2
   Mine fuzzy association rules using the method of Hong et al. (2001)
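For Stage 2, the confidence of a fuzzy rule can be computed by analogy with the crisp case. The sketch below assumes the minimum operator as the fuzzy intersection (a common choice, not confirmed by these slides) and uses hypothetical membership degrees; `fuzzy_confidence` is an illustrative name, not a function from the paper.

```python
def fuzzy_confidence(antecedent, consequent):
    """Confidence of A -> B from per-transaction membership degrees.

    antecedent[i] and consequent[i] are the degrees to which transaction i
    belongs to the fuzzy itemsets A and B; min() models their intersection.
    """
    joint = sum(min(a, b) for a, b in zip(antecedent, consequent))
    total = sum(antecedent)
    return joint / total if total > 0 else 0.0


# Hypothetical degrees for the rule "if age is Middle then weight is High".
age_is_middle = [0.2, 0.8, 1.0, 0.5, 0.0]
weight_is_high = [0.1, 0.9, 0.6, 0.5, 0.3]

conf = fuzzy_confidence(age_is_middle, weight_is_high)
```

A rule would be reported when this value reaches the minimum confidence threshold.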
Experiments
Parameters
   Proposed                 Hong’s

• 50 individuals       • 0.01 mutation rate
• 10,000 evaluations   • 0.35 d factor
• 30 bits per gene
• 0.6 crossover rate
• 0.8 fuzzy rule
  confidence
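The "30 bits per gene" setting is used by the incest prevention of CHC, which only mates two parents when half of their Hamming distance exceeds a threshold L, initialised to (#Genes * BITSGENE) / 4. The sketch below assumes that each real-valued gene in [-0.5, 0.5) is quantised to 30 bits and Gray-coded. The quantisation scheme, the helper names and the chromosome length of 30 genes are illustrative assumptions.

```python
BITS_PER_GENE = 30


def to_gray(alpha, bits=BITS_PER_GENE):
    """Quantise alpha in [-0.5, 0.5) to `bits` bits and return its Gray code."""
    level = int((alpha + 0.5) * (2 ** bits))
    level = max(0, min(level, 2 ** bits - 1))
    return level ^ (level >> 1)


def hamming(parent_a, parent_b):
    """Total Hamming distance between two chromosomes (lists of genes)."""
    return sum(bin(to_gray(x) ^ to_gray(y)).count("1")
               for x, y in zip(parent_a, parent_b))


def may_mate(parent_a, parent_b, threshold):
    """CHC incest prevention: cross only sufficiently different parents."""
    return hamming(parent_a, parent_b) / 2 > threshold


# Hypothetical chromosome length, e.g. 10 attributes with 3 terms each.
n_genes = 30
initial_threshold = n_genes * BITS_PER_GENE / 4
```

With Gray coding, two adjacent quantisation levels differ in exactly one bit, so small changes in a displacement give small Hamming distances.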
Data Set
FAM95 (Bureau of the Census)
• 63,756 instances
• 23 attributes
• 10 attributes used

This data set was obtained from the Statistics Data Sets Archive website http://www.stat.ucla.edu/data/fpp.
Results obtained in the genetic process
(Sup = minimum support, Fit = fitness, Fsup = fuzzy support, Suit = suitability, #1I = number of large 1-itemsets)

Sup |    Proposed approach    | Hong et al.'s approach  | Uniform fuzzy partition
    |  Fit   Fsup   Suit  #1I |  Fit   Fsup   Suit  #1I |  Fit   Fsup   Suit  #1I
With three linguistic terms
0.2 | 0.99  11.68  11.85   20 | 0.68  10.83  15.83   19 | 0.92   9.24  10.00   16
0.5 | 0.94  11.68  12.39   17 | 0.53  10.28  19.45   15 | 0.76   7.55  10.00   10
0.7 | 0.66   6.98  10.63    9 | 0.37   6.55  17.94    8 | 0.57   5.71  10.00    7
0.9 | 0.28   2.80  10.00    3 | 0.00   0.00  14.75    0 | 0.00   0.00  10.00    0
With five linguistic terms
0.2 | 0.95  10.46  10.99   22 | 0.53  10.22  19.27   22 | 0.94   9.43  10.00   21
0.5 | 0.77   9.92  12.92   15 | 0.38   7.95  20.63   12 | 0.46   4.57  10.00    7
0.7 | 0.61   7.69  12.57   10 | 0.20   3.96  19.54    5 | 0.24   2.36  10.00    3
0.9 | 0.10   0.92  10.00    1 | 0.06   0.90  15.01    1 | 0.00   0.00  10.00    0
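The tabulated values are consistent with Fit ≈ Fsup / Suit. The check below is only a sanity sketch over the four three-term rows of the proposed approach, copied from the table; the tolerance of 0.01 is an assumption that allows for rounding in the reported figures.

```python
# (minimum support, Fit, Fsup, Suit) for the proposed approach, three terms.
rows = [
    (0.2, 0.99, 11.68, 11.85),
    (0.5, 0.94, 11.68, 12.39),
    (0.7, 0.66, 6.98, 10.63),
    (0.9, 0.28, 2.80, 10.00),
]


def fitness(fsup, suit):
    """Fitness as fuzzy support divided by suitability."""
    return fsup / suit


# Largest gap between the reported fitness and the recomputed ratio.
max_error = max(abs(fit - fitness(fsup, suit)) for _, fit, fsup, suit in rows)
```

Under this reading, a suitability above 10.00 (the value of the uniform partition) lowers the fitness that a given fuzzy support can achieve.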
Results obtained in the genetic process
Hong et al.'s approach with the 2-tuples

Support   Fitness   Fsup     Suit    #1Itemset
With three linguistic terms
  0.2      0.97     10.90    11.18      20
  0.5      0.89     11.36    12.64      18
  0.7      0.59      6.20    10.33       7
  0.9      0.26      2.79    10.52       3
With five linguistic terms
  0.2      0.93     10.18    10.93      22
  0.5      0.64      7.39    11.80      11
  0.7      0.41     0.476    11.60       6
  0.9      0.08      0.91    10.92       1
Fitness vs Function Evaluation
[Plot: average fitness value (0 to 1) against the number of evaluations (0 to 10,000), for the proposed approach and Hong et al.'s approach]
Frequent 1-itemsets vs minsup
[Plot: number of large 1-itemsets (0 to 20) against minimum support (0.10 to 0.90), for the proposed approach, Hong et al.'s approach and the uniform fuzzy partition]
MFs w/o lateral displacement
[Figure: membership functions of attributes X1 to X10 over the labels l1, l2, l3; the 2-tuples shown in each panel are listed below]

Attr.   l1'           l2'           l3'
X1      (l1, 0.4)     (l2, 0.4)     (l3, 0.5)
X2      (l1, 0.0)     (l2, -0.2)    (l3, 0.0)
X3      (l1, -0.1)    (l2, -0.2)    (l3, 0.2)
X4      (l1, 0.0)     (l2, 0.0)     (l3, 0.4)
X5      (l1, 0.1)     (l2, -0.2)    (l3, 0.1)
X6      (l1, 0.1)     (l2, -0.5)    (l3, 0.1)
X7      (l1, -0.1)    (l2, -0.1)    (l3, 0.4)
X8      (l1, 0.0)     (l2, -0.2)    (l3, -0.2)
X9      (l1, 0.0)     (l2, -0.3)    (l3, 0.1)
X10     (l1, 0.0)     (l2, -0.2)    (l3, 0.2)
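A 2-tuple (l_i, α) can be read as the label l_i moved sideways by α label-widths. The sketch below shows one way to turn such a tuple into a displaced triangular membership function, assuming a uniform initial partition with peaks evenly spaced over the attribute domain. The domain [0, 10] and the helper name `displaced_mf` are hypothetical; only the displacements for X1 come from the slide.

```python
def displaced_mf(index, alpha, n_terms, lo, hi):
    """Triangular MF (a, b, c) of label `index` shifted by alpha label-widths."""
    step = (hi - lo) / (n_terms - 1)  # distance between adjacent peaks
    peak = lo + (index + alpha) * step
    return (peak - step, peak, peak + step)


# Displacements reported for attribute X1: (l1, 0.4), (l2, 0.4), (l3, 0.5).
alphas_x1 = [0.4, 0.4, 0.5]
# Hypothetical domain [0, 10] for X1 with three linguistic terms.
mfs_x1 = [displaced_mf(i, a, 3, 0.0, 10.0) for i, a in enumerate(alphas_x1)]
peaks_x1 = [b for _, b, _ in mfs_x1]
```

The shape and width of each triangle are unchanged; only its position moves, which is why a single real parameter per label is enough.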
Hong’s MFs
[Figure: membership functions obtained by Hong et al.'s approach for attributes X1 to X10; each panel shows the initial labels l1, l2, l3 and the learned labels l1', l2', l3']
#rules vs minsup
[Plot, minconf = 0.8: number of rules (0 to 160,000) against minimum support (0.10 to 0.90), for the proposed approach, Hong et al.'s approach and the uniform fuzzy partition]
#rules vs minconf
[Plot, minsup = 0.2: number of rules (0 to 90,000) against minimum confidence (0.10 to 0.90), for the proposed approach, Hong et al.'s approach and the uniform fuzzy partition]
#rules vs minsup vs minconf
[Plot: number of rules (0 to 200,000) against minimum support (0.10 to 0.90), one curve per confidence level: Conf = 0.5, 0.6, 0.7, 0.8, 0.9]
#rules vs minconf vs minsup
[Plot: number of rules (0 to 200,000) against minimum confidence (0.10 to 0.90), one curve per support level: Minsup = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
Time vs #Transaction
[Plot: runtime in minutes (0 to 30) against the share of transactions used (10% to 100%), for the proposed approach and Hong et al.'s approach]
Time vs #Attribute
[Plot: runtime in minutes (0 to 30) against the number of attributes (2 to 10), for the proposed approach and Hong et al.'s approach]
Time vs #Linguistic terms
[Plot: runtime in minutes (20 to 70) against the number of linguistic terms (3 to 7), for the proposed approach and Hong et al.'s approach]
Example of Rules
Classic fuzzy association rule:
   If number of children is Low and
   hours head worked last week is Low
   then head’s personal income is Low
   (Factor of confidence 0.87)

Rule with 2-tuples representation:
   If number of children is (Low, -0.16) and
   hours head worked last week is (Low, -0.06)
   then head’s personal income is (Low, 0.1)
   (Factor of confidence 0.99)
Authors’ conclusion

2-tuples linguistic representation works!!
Discussions
T. Hong, C. Chen, Y. Wu, Y. Lee, Using divide-and-conquer GA strategy in fuzzy data mining, in: IEEE Symp. on Fuzzy Systems,
Budapest, Hungary, 2004, pp. 116–121.
Pitfalls
• domain knowledge & Symmetric assumption
• flowchart
• Hong’s method
• inadequate fitness function
• gray code and crossover
• fuzzy association?
• dataset
• replication?
• scalability
Pitfalls: Hong’s method, suitability of chromosome C_q
    suitability(C_q) = \sum_{k=1}^{n} [ overlap_factor(C_{qk}) + coverage_factor(C_{qk}) ]
References
• L. Eshelman, The CHC adaptive search algorithm: How to have safe search when
  engaging in nontraditional genetic recombination, in: G. Rawlins (Ed.), Foundations of
  Genetic Algorithms, Vol. 1, Morgan Kaufmann, Los Altos, CA, 1991, pp. 265–283.
• F. Herrera, L. Martínez, A 2-tuple fuzzy linguistic representation model for computing
  with words, IEEE Trans. Fuzzy Systems 8 (6) (2000) 746–752.
• F. Herrera, M. Lozano, A.M. Sánchez, A taxonomy for the crossover operator for real-
  coded genetic algorithms: An experimental study, Int. J. Intell. Syst. 18 (2003) 309–338.
• T. Hong, C. Chen, Y. Wu, Y. Lee, Using divide-and-conquer GA strategy in fuzzy data
  mining, in: IEEE Symp. on Fuzzy Systems, Budapest, Hungary, 2004, pp. 116–121.
• T. Hong, C. Chen, Y. Wu, Y. Lee, "Genetic-Fuzzy Data Mining with Divide-and-Conquer
  Strategy", IEEE Transactions on Evolutionary Computation 12 (2) 252–265.
• T. Hong, C. Kuo, S. Chi, Trade-off between time complexity and number of rules for
  fuzzy mining from quantitative data, International Journal of Uncertainty, Fuzziness and
  Knowledge-Based Systems 9 (5) (2001) 587–604.
• H. Ishibuchi, T. Nakashima, T. Yamamoto, Fuzzy association rules for handling continuous
  attributes, in: IEEE Internat. Symp. on Industrial Electronics Proceedings, Pusan, Korea,
  2001, pp. 118–121.
• P.-N. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, Addison Wesley, May
  2005.
Thank you!
Questions?

Learning The Membership Function Contexts For Mining Fuzzy Association Rules By Using Genetic Algorithms

  • 1. Learning the membership function contexts for mining fuzzy association rules by using genetic algorithms Jesús Alcalá-Fdez, Rafael Alcalá María José Gacto, Francisco Herrera Fuzzy Sets and Systems (2008), article in press Presenter: Chia-Ming Wang
  • 2. Before we go Thanks to Prof. Hong who provide me the second paper today.
  • 3. Before we go • T. Hong, C. Chen,Y. Wu,Y. Lee, Using divide- and-conquer GA strategy in fuzzy data mining, in: IEEE Symp. on Fuzzy Systems, Budapest, Hungary, 2004, pp. 116–121. • T. Hong, C. Kuo, S. Chi,Trade-off between time complexity and number of rules for fuzzy mining from quantitative data, Journal of Uncertain Fuzziness Knowledge-Based Systems 9 (5) (2001) 587–604. Thanks to Prof. Hong who provide me the second paper today.
  • 4. Problem Description 2-tuples Quantitative GA model Association Rule
  • 5. A Transaction Database TID items 1 Bread, Milk 2 Bread, Diaper, Beer, Eggs 3 Milk, Diaper, Beer, Coke 4 Bread, Milk, Diaper, Beer 5 Bread, Milk, Diaper, Coke
  • 6. Association Rule Mining Examples: TID items {Diaper}→{Beer} 1 Bread, Milk {Milk, Bread}→{Eggs, coke} 2 Bread, Diaper, Beer, Eggs {Beer, Bread}→{Milk} 3 Milk, Diaper, Beer, Coke X→Y, X∩Y=∅ 4 Bread, Milk, Diaper, Beer 5 Bread, Milk, Diaper, Coke
  • 7. Association Rule Mining Examples: TID items {Diaper}→{Beer} 1 Bread, Milk {Milk, Bread}→{Eggs, coke} 2 Bread, Diaper, Beer, Eggs {Beer, Bread}→{Milk} 3 Milk, Diaper, Beer, Coke X→Y, X∩Y=∅ 4 Bread, Milk, Diaper, Beer Implication means co-occurrence, 5 Bread, Milk, Diaper, Coke not causality!
  • 8. Terminology Examples: {Milk, Diaper}→{Beer} TID items 1 Bread, Milk 2 Bread, Diaper, Beer, Eggs 3 Milk, Diaper, Beer, Coke 4 Bread, Milk, Diaper, Beer 5 Bread, Milk, Diaper, Coke
  • 9. Terminology Examples: {Milk, Diaper}→{Beer} TID items support 1 Bread, Milk σ{Milk, Diaper, Beer} 2 s= = = 0.4 2 Bread, Diaper, Beer, Eggs |T| 5 3 Milk, Diaper, Beer, Coke 4 Bread, Milk, Diaper, Beer 5 Bread, Milk, Diaper, Coke
  • 10. Terminology Examples: {Milk, Diaper}→{Beer} TID items support 1 Bread, Milk σ{Milk, Diaper, Beer} 2 s= = = 0.4 2 Bread, Diaper, Beer, Eggs |T| 5 3 Milk, Diaper, Beer, Coke confident 4 Bread, Milk, Diaper, Beer c= σ{Milk, Diaper, Beer} 2 = = 0.67 σ{Milk, Diaper} 3 5 Bread, Milk, Diaper, Coke
  • 11. Terminology Examples: {Milk, Diaper}→{Beer} TID items support 1 Bread, Milk σ{Milk, Diaper, Beer} 2 s= = = 0.4 2 Bread, Diaper, Beer, Eggs |T| 5 3 Milk, Diaper, Beer, Coke confident 4 Bread, Milk, Diaper, Beer c= σ{Milk, Diaper, Beer} 2 = = 0.67 σ{Milk, Diaper} 3 5 Bread, Milk, Diaper, Coke Itemset, minsup, minconf
  • 12. Real-world Transaction Database TID (item, quantity) 1 (Bread, 3), (Milk, 1) 2 (Bread, 1), (Diaper, 2), (Beer, 3), (Eggs, 12) 3 (Milk,2), (Diaper, 4), (Beer, 5), (Coke, 2) 4 (Bread, 3), (Milk, 1), (Diaper, 2), (Beer, 12) 5 (Bread, 2), (Milk, 4), (Diaper, 5), (Coke, 3)
  • 13. Real-world Transaction Database TID (item, quantity) 1 (Bread, 3), (Milk, 1) 2 Quantitative 3), (Eggs, 12) (Bread, 1), (Diaper, 2), (Beer, Association Rule 3 (Milk,2), (Diaper, 4), (Beer, 5), (Coke, 2) Mining 4 (Bread, 3), (Milk, 1), (Diaper, 2), (Beer, 12) 5 (Bread, 2), (Milk, 4), (Diaper, 5), (Coke, 3)
  • 15. 2-tuples Quantitative model Association Rule
  • 16. Linguistic terms Low Middle High Low Middle High age weight if age is Middle then weight is High
  • 17. The 2-tuples linguistic representation if age is Middle then weight is High F. Herrera, L. Martínez, A 2-tuple fuzzy linguistic representation model for computing with words, IEEE Trans. Fuzzy Systems 8 (6) (2000) 746–752.
  • 18. The 2-tuples linguistic representation if age is Middle then weight is High if age is (Middle, 0.3) then weight is (High, -0.1) (si , αi ), si ∈ S, αi ∈ [−0.5, 0.5) F. Herrera, L. Martínez, A 2-tuple fuzzy linguistic representation model for computing with words, IEEE Trans. Fuzzy Systems 8 (6) (2000) 746–752.
  • 19. -1 -0.5 0.5 1 s0 s1 s2 s3 s4 domain 0 1 2 3 4 (s2, -0.3)
  • 20. -1 -0.5 0.5 1 s0 s1 s2 s3 s4 -0.3 domain 1.7 0 1 2 3 4 (s2, -0.3)
  • 21. -1 -0.5 0.5 1 s0 s1 s2 s3 s4 -0.3 domain 1.7 0 1 2 3 4 (s2, -0.3) -0.5 0.5 -0.5 0.5 -0.5 0.5 -0.5 0.5 -0.5 0.5 s0 s1 s2 s3 s4 0 1 2 3 4
  • 22. -1 -0.5 0.5 1 s0 s1 s2 s3 s4 -0.3 domain 1.7 0 1 2 3 4 (s2, -0.3) -0.5 0.5 -0.5 0.5 -0.5 0.5 -0.5 0.5 -0.5 0.5 α=-0.3 s0 s1 s2 s3 s4 (s2, -0.3) 0 1 2 3 4
  • 23. Interpretation if age is (Middle, 0.3) then weight is (High, -0.1)
  • 24. Interpretation if age is (Middle, 0.3) then weight is (High, -0.1) if age is (higher than Middle) then weight is (a bit smaller than High)
  • 26. 2-tuples GA model
  • 28. Traditional GA Population (chromosomes)
  • 29. Traditional GA Population (chromosomes) parents Evaluation (fitness)
  • 30. Traditional GA Population (chromosomes) parents Evaluation (fitness) Reproduction Mating pool (selection)
  • 31. Traditional GA Population (chromosomes) parents ‣ crossover Genetic Evaluation ‣ mutation operators (fitness) Mates Reproduction Mating pool (recombination) (selection)
  • 32. Traditional GA Population (chromosomes) offsprings parents ‣ crossover Genetic Evaluation ‣ mutation operators (fitness) Mates Reproduction Mating pool (recombination) (selection)
  • 33. GA Used in this paper • CHC genetic model • MFs codification and initial gene pool • Chromosome evaluation • Crossover operator
  • 34. GA Used in this paper • CHC genetic model • MFs codification and initial gene pool • Chromosome evaluation • Crossover operator
  • 35–44. Scheme of CHC model
      1. Initialize population and THRESHOLD.
      2. Crossover of N parents, with incest prevention: a pair is crossed only if 1/2 * Hamming distance > L, where L = (#Genes * BITSGENE)/4.
      3. Evaluation of the new individuals.
      4. Selection of the best N individuals between parents and offspring.
      5. If NO new individual, decrement THRESHOLD.
      6. THRESHOLD < 0? If yes, restart the population and THRESHOLD; if no, continue with step 2.
    L. Eshelman, The CHC adaptive search algorithm: How to have safe search when engaging in nontraditional genetic recombination, Foundations of Genetic Algorithms, Vol. 1, Morgan Kaufmann, Los Altos, CA, 1991, pp. 265–283.
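The CHC loop in this scheme can be sketched on plain bit strings as follows. This is only an illustration under stated assumptions, not the authors' implementation: the crossover is half-uniform crossover (HUX) rather than the paper's operator, the restart keeps the best individual and redraws the rest at random, the threshold is reset to its initial value, and the toy fitness simply counts ones. All function names are hypothetical.

```python
import random

def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def hux(a, b, rng):
    """Half-uniform crossover: exchange half of the differing bits."""
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    rng.shuffle(diff)
    c1, c2 = a[:], b[:]
    for i in diff[: len(diff) // 2]:
        c1[i], c2[i] = b[i], a[i]
    return c1, c2

def chc(fitness, n_bits, pop_size=20, max_evals=2000, seed=0):
    rng = random.Random(seed)

    def new_individual():
        return [rng.randint(0, 1) for _ in range(n_bits)]

    pop = [new_individual() for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    evals = pop_size
    threshold = n_bits / 4                       # L = (#Genes * BITSGENE) / 4

    while evals < max_evals:
        # Pair parents at random; cross only pairs that are different enough.
        order = list(range(pop_size))
        rng.shuffle(order)
        children = []
        for i, j in zip(order[::2], order[1::2]):
            if hamming(pop[i], pop[j]) / 2 > threshold:    # incest prevention
                children.extend(hux(pop[i], pop[j], rng))
        child_scores = [fitness(c) for c in children]
        evals += len(children)

        # Elitist selection: best pop_size among parents and offspring.
        pool = [(s, False, ind) for s, ind in zip(scores, pop)]
        pool += [(s, True, ind) for s, ind in zip(child_scores, children)]
        pool.sort(key=lambda entry: entry[0], reverse=True)  # stable: parents win ties
        survivors = pool[:pop_size]
        accepted_child = any(is_child for _, is_child, _ in survivors)
        scores = [s for s, _, _ in survivors]
        pop = [ind for _, _, ind in survivors]

        if not accepted_child:
            threshold -= 1                       # no new individual entered
        if threshold < 0:
            # Restart (assumption): keep the best, redraw the others at random.
            pop = [pop[0]] + [new_individual() for _ in range(pop_size - 1)]
            scores = [scores[0]] + [fitness(ind) for ind in pop[1:]]
            evals += pop_size - 1
            threshold = n_bits / 4

    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

best, score = chc(fitness=sum, n_bits=20)        # toy problem: maximise the number of ones
print(score, best)
```

Because selection is elitist and the restart keeps the best individual, the best score never decreases between generations.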
  • 46–47. MFs Codification [Figure: item "age" with membership functions L1, M1, H1 and item "weight" with membership functions L2, M2, H2. The chromosome holds one lateral displacement per MF, in the order L1 M1 H1 L2 M2 H2. Initial MFs: 0 0 0 0 0 0. Example of displaced MFs: 0.2 0.4 0 -0.2 -0.3 -0.5.]
  • 48. Initial Gene Pool chromosome: (c11,...,c1m, c21,...,c2m, ..., cn1,...,cnm), where each group ci1,...,cim encodes 1 item with m MFs • initial MFs obtained from expert knowledge • individuals generated at random in [-0.5, 0.5)
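How such a chromosome might be generated and decoded can be sketched as follows. The sketch assumes triangular MFs on a uniform partition and assumes that a gene α shifts its MF by α times the distance between neighbouring peaks; the age and weight domains and all function names are hypothetical.

```python
import random

def uniform_partition(lo, hi, m):
    """m triangular MFs (a, b, c) with peaks evenly spaced on [lo, hi]."""
    step = (hi - lo) / (m - 1)
    return [(lo + (k - 1) * step, lo + k * step, lo + (k + 1) * step) for k in range(m)]

def random_chromosome(n_items, m, rng):
    """One displacement per MF, drawn uniformly from [-0.5, 0.5)."""
    return [rng.random() - 0.5 for _ in range(n_items * m)]

def decode(chromosome, partitions):
    """Shift every triangle laterally by gene * (distance between peaks)."""
    m = len(partitions[0])
    decoded = []
    for j, mfs in enumerate(partitions):
        step = mfs[1][1] - mfs[0][1]
        genes = chromosome[j * m:(j + 1) * m]
        decoded.append([(a + g * step, b + g * step, c + g * step)
                        for (a, b, c), g in zip(mfs, genes)])
    return decoded

# Hypothetical domains: age in [0, 80], weight in [40, 100], three MFs each.
partitions = [uniform_partition(0, 80, 3), uniform_partition(40, 100, 3)]
example = [0.2, 0.4, 0.0, -0.2, -0.3, -0.5]      # the displaced example from the slide
decoded = decode(example, partitions)
print(decoded[0][0])                              # displaced "Low" MF of age
initial = random_chromosome(n_items=2, m=3, rng=random.Random(0))
```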
  • 49–50. Implementation: Gray Code
      Decimal   Binary   Gray Code
      0         000      000
      1         001      001
      2         010      011
      3         011      010
      4         100      110
      5         101      111
      6         110      101
      7         111      100
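The table can be reproduced with the standard binary-reflected Gray code conversion, sketched here with hypothetical function names. Consecutive integers differ in exactly one Gray-coded bit, which is the usual reason for choosing this encoding in a GA.

```python
def binary_to_gray(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def gray_to_binary(g: int) -> int:
    """Inverse conversion: XOR together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

for value in range(8):
    print(value, format(value, "03b"), format(binary_to_gray(value), "03b"))
```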
  • 52. Equation Mania
      $fitness(C_q) = \dfrac{\sum_{x \in L_1} fuzzy\_support(x)}{suitability(C_q)}$
      $suitability(C_q) = \sum_{k=1}^{n} \left[ overlap\_factor(C_{qk}) + coverage\_factor(C_{qk}) \right]$
      $overlap\_factor(C_{qk}) = \sum_{i=1}^{m} \sum_{j=i+1}^{m} \left[ \max\left( \dfrac{overlap(R_i, R_j)}{\min(spanR_{R_i}, spanL_{R_i})}, 1 \right) - 1 \right]$
      $coverage\_factor(C_{qk}) = \dfrac{1}{range(R_1, \ldots, R_m) / \max(I_k)}$
      $suitability(C_q) = \sum_{k=1}^{n} \left[ overlap\_factor(C_{qk}) + 1 \right]$
  • 53–60. Overlap factor of the kth item in the qth chromosome: $overlap\_factor(C_{qk}) = \sum_{i=1}^{m} \sum_{j=i+1}^{m} \left[ \max\left( \dfrac{overlap(R_i, R_j)}{\min(spanR_{R_i}, spanL_{R_i})}, 1 \right) - 1 \right]$ [Figure: two pairs of neighbouring MFs Ri and Rj, each annotated with overlap, SpanR and SpanL; one of the two cases is marked "penalty".]
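A small numeric sketch of the overlap factor follows. It assumes triangular MFs (a, b, c) sorted by peak, takes overlap(Ri, Rj) as the length by which Ri's right foot passes Rj's left foot, and, following the figure labels, uses the right span of Ri and the left span of the MF to its right (the printed formula shows the index i on both spans). The function name and the example triangles are hypothetical.

```python
def overlap_factor(mfs):
    """Penalty for MFs that overlap more than the smaller facing span.

    mfs: non-degenerate triangles (a, b, c) sorted by peak b.
    """
    total = 0.0
    for i in range(len(mfs)):
        for j in range(i + 1, len(mfs)):
            _, b_i, c_i = mfs[i]
            a_j, b_j, _ = mfs[j]
            overlap = max(0.0, c_i - a_j)        # overlap length (assumption)
            span = min(c_i - b_i, b_j - a_j)     # right span of Ri, left span of Rj
            total += max(overlap / span, 1.0) - 1.0
    return total

uniform = [(-5, 0, 5), (0, 5, 10), (5, 10, 15)]
shifted = [(-5, 0, 5), (-2, 3, 8), (5, 10, 15)]  # middle MF moved 2 units left
print(overlap_factor(uniform), overlap_factor(shifted))
```

With the uniform partition every ratio is at most 1, so no penalty accrues; moving the middle MF to the left makes it overlap the first one by more than a span and produces a positive penalty.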
  • 61–66. Coverage factor of the kth item in the qth chromosome: $coverage\_factor(C_{qk}) = \dfrac{1}{range(R_1, \ldots, R_m) / \max(I_k)}$ [Figure: MFs R1, R2, R3 of item Milk over the domain 0 to 10, shown in two panels with the covered range marked.] Here coverage_factor(Cqk) = 1.
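The coverage factor can be illustrated with a few lines of code. The sketch assumes that range(R1, ..., Rm) is the part of [0, max(Ik)] lying between the leftmost and rightmost feet of the triangular MFs; the function name and the example triangles are hypothetical.

```python
def coverage_factor(mfs, max_quantity):
    """1 / (covered range / max quantity); equals 1 when the whole domain is covered."""
    covered_lo = max(0.0, min(a for a, _, _ in mfs))
    covered_hi = min(max_quantity, max(c for _, _, c in mfs))
    return 1.0 / ((covered_hi - covered_lo) / max_quantity)

full = [(-5, 0, 5), (0, 5, 10), (5, 10, 15)]     # covers 0 to 10, like the Milk example
half = [(0, 0, 2.5), (0, 2.5, 5), (2.5, 5, 5)]   # covers only 0 to 5
print(coverage_factor(full, 10), coverage_factor(half, 10))
```

MFs that leave part of the domain uncovered get a factor above 1, which increases suitability and therefore lowers fitness.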
  • 68–74. Fuzzy Support (count)
      DB with T transactions and n items; $v_j^{(i)}$ is the value of the jth item (e.g. bread) in the ith transaction.
      Fuzzy set of $v_j^{(i)}$ over the m MFs of the item, where $f_{jk}^{(i)}$ is the membership degree in $R_{jk}$: $f_j^{(i)} = \dfrac{f_{j1}^{(i)}}{R_{j1}} + \cdots + \dfrac{f_{jm}^{(i)}}{R_{jm}}$
      Count of each MF (e.g. bread.low.count): $count_{jk} = \sum_{i=1}^{T} f_{jk}^{(i)}$
      $L_1 = \{ R_{jk} \mid count_{jk} \geq \alpha,\ 1 \leq j \leq n \text{ and } 1 \leq k \leq m \}$
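The counting step can be sketched as follows, assuming triangular MFs and transactions stored as dictionaries from item name to quantity. The function names, the bread quantities and the MF parameters are all hypothetical.

```python
def tri(v, a, b, c):
    """Membership degree of v in the triangle with feet a, c and peak b."""
    if v == b:
        return 1.0
    if a < v < b:
        return (v - a) / (b - a)
    if b < v < c:
        return (c - v) / (c - b)
    return 0.0

def fuzzy_counts(transactions, mfs_per_item):
    """count_jk: sum over transactions of the membership degree in MF k of item j."""
    counts = {}
    for item, mfs in mfs_per_item.items():
        for k, (a, b, c) in enumerate(mfs):
            counts[(item, k)] = sum(tri(t[item], a, b, c)
                                    for t in transactions if item in t)
    return counts

def large_1_itemsets(counts, min_count):
    """L1: every (item, MF) pair whose count reaches the threshold alpha."""
    return {key for key, count in counts.items() if count >= min_count}

transactions = [{"bread": 2}, {"bread": 5}, {"bread": 8}]
mfs_per_item = {"bread": [(0, 0, 5), (0, 5, 10), (5, 10, 10)]}   # low, middle, high
counts = fuzzy_counts(transactions, mfs_per_item)
print(counts, large_1_itemsets(counts, min_count=1.0))
```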
  • 75–79. Fuzzy Support
      $fitness(C_q) = \dfrac{\sum_{x \in L_1} fuzzy\_support(x)}{suitability(C_q)}$, where the sum runs over L1 and fuzzy support = count / T (T = # transactions).
      $suitability(C_q) = \sum_{k=1}^{n} \left[ overlap\_factor(C_{qk}) + 1 \right]$, where n is the number of items.
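Putting the pieces together, the fitness of one chromosome can be sketched as below. The function takes the fuzzy counts and the per-item overlap factors as plain inputs, and it expresses the L1 threshold as a minimum support (count / T) rather than a raw count; the function name and the numbers are hypothetical.

```python
def fitness(counts, n_transactions, min_support, overlap_factors):
    """Sum of fuzzy supports of the large 1-itemsets divided by suitability."""
    supports = [count / n_transactions for count in counts.values()]
    total_support = sum(s for s in supports if s >= min_support)   # only members of L1
    suitability = sum(factor + 1.0 for factor in overlap_factors)  # coverage factor taken as 1
    return total_support / suitability

counts = {("bread", 0): 0.6, ("bread", 1): 1.8, ("bread", 2): 0.6}
value = fitness(counts, n_transactions=3, min_support=0.5, overlap_factors=[0.4])
print(round(value, 4))
```

A chromosome scores well when many fuzzy regions are frequent and its MFs do not overlap excessively.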
  • 81–82. PCBLX Crossover
      Parents: $X = (x_1 \cdots x_n)$, $Y = (y_1 \cdots y_n)$, with $x_i, y_i \in [a_i, b_i] \subset \mathbb{R}$, $i = 1 \cdots n$, and $I_i = |x_i - y_i|$.
      Offspring $O_1 = (o_{11} \cdots o_{1n})$, $o_{1i} \in [l_i^1, u_i^1]$: $l_i^1 = \max\{a_i, x_i - I_i \cdot \alpha\}$, $u_i^1 = \min\{b_i, x_i + I_i \cdot \alpha\}$
      Offspring $O_2 = (o_{21} \cdots o_{2n})$, $o_{2i} \in [l_i^2, u_i^2]$: $l_i^2 = \max\{a_i, y_i - I_i \cdot \alpha\}$, $u_i^2 = \min\{b_i, y_i + I_i \cdot \alpha\}$
      [Figure: the interval [ai, bi] with xi and yi marked, comparing the sampling intervals of PCBLX and BLX.]
    F. Herrera, M. Lozano, A.M. Sánchez, A taxonomy for the crossover operator for real-coded genetic algorithms: An experimental study. Int. J. Intell. Syst. 18 (2003) 309-338.
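The interval definitions above translate into a short routine. In this sketch each offspring gene is drawn uniformly from its interval, which is how parent-centric BLX is usually described; the function name, the value of alpha and the example parents are hypothetical.

```python
import random

def pcblx(x, y, bounds, alpha, rng):
    """Parent-centric BLX: sample each offspring gene around one parent's gene."""
    o1, o2 = [], []
    for xi, yi, (ai, bi) in zip(x, y, bounds):
        spread = abs(xi - yi) * alpha                  # I_i * alpha
        o1.append(rng.uniform(max(ai, xi - spread), min(bi, xi + spread)))
        o2.append(rng.uniform(max(ai, yi - spread), min(bi, yi + spread)))
    return o1, o2

x = [0.2, 0.4, 0.0]
y = [-0.2, -0.3, -0.5]
bounds = [(-0.5, 0.5)] * 3                             # gene domain from the codification slide
o1, o2 = pcblx(x, y, bounds, alpha=1.0, rng=random.Random(1))
print(o1, o2)
```

Each child stays close to one parent, and the spread shrinks as the parents converge, so the operator moves from exploration to exploitation on its own.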
  • 84–89. Conceptual Flowchart [Figure: two stages. Learning Membership Function: the predefined MFs and the transaction database feed the learning process, which exchanges MFs with the evaluation module (fitness) and outputs the definitive MFs. Mining Fuzzy Association Rules: the definitive MFs and the transaction database feed fuzzy mining, which outputs the fuzzy association rules.]
  • 90. Procedures
      Stage 1
        1. Initialization.
        2. Evaluate the initial chromosomes:
           1. for all items in each transaction, transfer the quantitative values to fuzzy sets;
           2. calculate the count and the fuzzy support;
           3. calculate the fitness.
        3. Set threshold L.
        4. Generate the next population.
        5. CHC procedure.
        6. If the required number of runs has not been reached, go to step 4.
      Stage 2
        Mine fuzzy association rules with the method of Hong et al. (2001).
  • 92. Parameters
      Proposed: 50 individuals • 10,000 evaluations • 30 bits per gene
      Hong’s: 0.01 mutation rate • 0.35 d factor • 0.6 crossover rate
      Fuzzy rule confidence: 0.8
  • 93. Data Set: FAM95 (Bureau of the Census), 63,756 instances, 23 attributes (10 attributes used). This data set was obtained from the Statistics Data Sets Archive website http://www.stat.ucla.edu/data/fpp.
  • 94–96. Results obtained in the genetic process (Sup = support, Fit = fitness, Fsup = fuzzy support, Suit = suitability, #1I = number of 1-itemsets)
               Proposed approach          Hong et al.'s approach     Uniform fuzzy partition
      Sup    Fit   Fsup   Suit  #1I     Fit   Fsup   Suit  #1I     Fit   Fsup   Suit  #1I
      With three linguistic terms
      0.2   0.99  11.68  11.85   20    0.68  10.83  15.83   19    0.92   9.24  10.00   16
      0.5   0.94  11.68  12.39   17    0.53  10.28  19.45   15    0.76   7.55  10.00   10
      0.7   0.66   6.98  10.63    9    0.37   6.55  17.94    8    0.57   5.71  10.00    7
      0.9   0.28   2.80  10.00    3    0.00   0.00  14.75    0    0.00   0.00  10.00    0
      With five linguistic terms
      0.2   0.95  10.46  10.99   22    0.53  10.22  19.27   22    0.94   9.43  10.00   21
      0.5   0.77   9.92  12.92   15    0.38   7.95  20.63   12    0.46   4.57  10.00    7
      0.7   0.61   7.69  12.57   10    0.20   3.96  19.54    5    0.24   2.36  10.00    3
      0.9   0.10   0.92  10.00    1    0.06   0.90  15.01    1    0.00   0.00  10.00    0
  • 97. Results obtained in the genetic process: Hong et al.’s approach with the 2-tuples
      Support  Fitness   Fsup    Suit   #1Itemset
      With three linguistic terms
      0.2      0.97     10.90   11.18   20
      0.5      0.89     11.36   12.64   18
      0.7      0.59      6.20   10.33    7
      0.9      0.26      2.79   10.52    3
      With five linguistic terms
      0.2      0.93     10.18   10.93   22
      0.5      0.64      7.39   11.80   11
      0.7      0.41      0.476  11.60    6
      0.9      0.08      0.91   10.92    1
  • 98. Fitness vs Function Evaluation [Figure: average fitness value (0 to 1) against number of evaluations (0 to 10,000). Series: the proposed approach, Hong et al.'s approach.]
  • 99. Frequent 1-itemsets vs minsup [Figure: number of large 1-itemsets (0 to 20) against minimum support (0.10 to 0.90). Series: the proposed approach, Hong et al.'s approach, uniform fuzzy partition.]
  • 100. MFs w/o lateral displacement [Figure: initial MFs l1, l2, l3 and displaced MFs l1', l2', l3' for attributes X1 to X10.] Displacement α of each learned 2-tuple l' = (l, α):
             l1'    l2'    l3'
      X1     0.4    0.4    0.5
      X2     0.0   -0.2    0.0
      X3    -0.1   -0.2    0.2
      X4     0.0    0.0    0.4
      X5     0.1   -0.2    0.1
      X6     0.1   -0.5    0.1
      X7    -0.1   -0.1    0.4
      X8     0.0   -0.2   -0.2
      X9     0.0   -0.3    0.1
      X10    0.0   -0.2    0.2
  • 101. Hong’s MFs [Figure: initial MFs l1, l2, l3 and learned MFs l1', l2', l3' for attributes X1 to X10.]
  • 102. #rules vs minsup, minconf = 0.8 [Figure: number of rules (0 to 160,000) against minimum support (0.10 to 0.90). Series: proposed approach, Hong et al.'s approach, uniform fuzzy partition.]
  • 103. #rules vs minconf, minsup = 0.2 [Figure: number of rules (0 to 90,000) against minimum confidence (0.10 to 0.90). Series: proposed approach, Hong et al.'s approach, uniform fuzzy partition.]
  • 104. #rules vs minsup vs minconf [Figure: number of rules (0 to 200,000) against minimum support (0.10 to 0.90). Series: Conf = 0.5, 0.6, 0.7, 0.8, 0.9.]
  • 105. #rules vs minconf vs minsup [Figure: number of rules (0 to 200,000) against minimum confidence (0.10 to 0.90). Series: Minsup = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6.]
  • 106. Time vs #Transaction [Figure: runtime in minutes (0 to 30) against number of transactions (10% to 100%). Series: proposed approach, Hong et al.'s approach.]
  • 107. Time vs #Attribute [Figure: runtime in minutes (0 to 30) against number of attributes (2 to 10). Series: proposed approach, Hong et al.'s approach.]
  • 108. Time vs #Linguistic terms [Figure: runtime in minutes (20 to 70) against number of linguistic terms (3 to 7). Series: proposed approach, Hong et al.'s approach.]
  • 109. Example of Rules
      Classic Fuzzy Association Rule: If number of children is Low and hours head worked last week is Low then head’s personal income is Low (Factor of confidence 0.87)
      Rule with 2-Tuples Representation: If number of children is (Low, -0.16) and hours head worked last week is (Low, -0.06) then head’s personal income is (Low, 0.1) (Factor of confidence 0.99)
  • 111. Authors’ conclusion: 2-tuples linguistic representation works!!
  • 113. T. Hong, C. Chen, Y. Wu, Y. Lee, Using divide-and-conquer GA strategy in fuzzy data mining, IEEE Symp. on Fuzzy Systems, Budapest, Hungary, 2004, pp. 116–121.
  • 117–119. Pitfalls • domain knowledge & symmetric assumption • flowchart • Hong’s method: $suitability(C_q) = \sum_{k=1}^{n} \left[ overlap\_factor(C_{qk}) + coverage\_factor(C_{qk}) \right]$ • inadequate fitness function • Gray code and crossover • fuzzy association? • dataset • replication? • scalability
  • 120. Reference
      • L. Eshelman, The CHC adaptive search algorithm: How to have safe search when engaging in nontraditional genetic recombination, in: G. Rawlin (Ed.), Foundations of Genetic Algorithms, Vol. 1, Morgan Kaufmann, Los Altos, CA, 1991, pp. 265–283.
      • F. Herrera, L. Martínez, A 2-tuple fuzzy linguistic representation model for computing with words, IEEE Trans. Fuzzy Systems 8 (6) (2000) 746–752.
      • F. Herrera, M. Lozano, A.M. Sánchez, A taxonomy for the crossover operator for real-coded genetic algorithms: An experimental study, Int. J. Intell. Syst. 18 (2003) 309–338.
      • T. Hong, C. Chen, Y. Wu, Y. Lee, Using divide-and-conquer GA strategy in fuzzy data mining, in: IEEE Symp. on Fuzzy Systems, Budapest, Hungary, 2004, pp. 116–121.
      • T. Hong, C. Chen, Y. Wu, Y. Lee, "Genetic-Fuzzy Data Mining with Divide-and-Conquer Strategy", IEEE Transactions on Evolutionary Computation 12 (2) 252–265.
      • T. Hong, C. Kuo, S. Chi, Trade-off between time complexity and number of rules for fuzzy mining from quantitative data, Journal of Uncertain Fuzziness Knowledge-Based Systems 9 (5) (2001) 587–604.
      • H. Ishibuchi, T. Nakashima, T. Yamamoto, Fuzzy association rules for handling continuous attributes, in: IEEE Internat. Symp. on Industrial Electronics Proceedings, Pusan, Korea, 2001, pp. 118–121.
      • P.-N. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, Addison Wesley, May 2005.