Complexity bounds in parallel evolution
    A. Auger, H. Fournier,
    N. Hansen, P. Rolet,
    F. Teytaud, O. Teytaud

             Paris, 2010


Tao, Inria Saclay Ile-De-France,
LRI (Université Paris Sud, France),
UMR CNRS 8623, I&A team, Digiteo,
Pascal Network of Excellence.
Outline



   Introduction
   Complexity bounds
   Branching Factor
   Automatic Parallelization
   Real-world algorithms
   Log() corrections



Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud   parallel evolution   2
Outline




   Introduction
    - What is optimization ?
    - What are comparison-based optimization
              algorithms ?
    - Why are we interested in comparison-based
              optimization ?
    - Why do we consider parallel machines ?


Introduction: what is optimization ?


  Consider
                                f: X --> R

  We look for x* such that
                 ∀x, f(x*) ≤ f(x)

  f is randomly drawn; f(x) = f(x,w),
                       with w a random variable.

Introduction: what is optimization ?


  Quality of “Opt” quantified as follows
  (to be minimized), with w a random variable.

Introduction: what is optimization ?


  Consider
               f: X --> R
  We look for x* such that
                  ∀x, f(x*) ≤ f(x)
  ==> Quasi-Newton, random search,
      Newton, Simplex, Interior points...

Comparison-based optimization


  An algorithm “Opt” is comparison-based if
  its iterates depend on f only through the
  results of comparisons f(xi) ≤ f(xj).

The main rules for step-size adaptation


       While ( I have time )
       {
            Generate points (x1,...,xλ) distributed as N(x,σ)
            Evaluate the fitness at x1,...,xλ
            Update x, update σ
       }

Main trouble: choosing σ

Cumulative step-size adaptation

Mutative self-adaptation

Estimation of Multivariate Normal Algorithm
Example 1: Estimation of Multivariate Normal Algorithm


        While ( I have time )
        {
                 Generate points (x1,...,xλ) distributed as N(x,σ)
                 Evaluate the fitness at x1,...,xλ
                 x = mean of the μ best points
                 σ = standard deviation of the μ best points
        }

  With λ=6, μ=3: I have a Gaussian... I generate 6 points,
  I select the three best, I update the Gaussian.
  Obviously 6-parallel.
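The EMNA loop above can be sketched in a few lines of NumPy. The sphere fitness f(x)=||x||², λ=6, μ=3 and the iteration count are illustrative choices, not part of the original slide.

```python
import numpy as np

def emna_sphere(dim=3, lam=6, mu=3, iters=200, seed=0):
    """Minimal EMNA sketch on the sphere f(x) = ||x||^2 (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=dim)                  # current mean
    sigma = np.ones(dim)                      # per-coordinate step-sizes
    f = lambda p: np.sum(p**2, axis=-1)
    f0 = f(x)
    for _ in range(iters):
        pts = x + sigma * rng.normal(size=(lam, dim))  # lambda offspring
        best = pts[np.argsort(f(pts))[:mu]]            # keep the mu best
        x = best.mean(axis=0)                          # update the mean
        sigma = best.std(axis=0)                       # update the spread
    return f0, f(x)

f0, fT = emna_sphere()
```

All λ fitness evaluations inside one iteration are independent, which is why the slide calls this loop "obviously 6-parallel".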
Example 2: Mutative self-adaptation


       μ = λ / 4
       While ( I have time )
       {
            Generate step-sizes (σ1,...,σλ) as σ x exp(- k.N)
            Generate points (x1,...,xλ), xi distributed as N(x,σi)
            Select the μ best points
            Update x (= mean), update σ (= log. mean)
       }
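A NumPy sketch of this self-adaptive loop, again on the sphere; the learning rate k = 1/sqrt(2·dim) is a common choice assumed here, not taken from the slide.

```python
import numpy as np

def sa_es(dim=3, lam=8, iters=300, seed=1):
    """Sketch of mutative self-adaptation on the sphere (illustrative)."""
    rng = np.random.default_rng(seed)
    mu = lam // 4
    k = 1.0 / np.sqrt(2 * dim)            # learning rate (assumed constant)
    x, sigma = rng.normal(size=dim), 1.0
    f = lambda p: np.sum(p**2, axis=-1)
    f0 = f(x)
    for _ in range(iters):
        sigmas = sigma * np.exp(k * rng.normal(size=lam))      # mutate step-sizes
        pts = x + sigmas[:, None] * rng.normal(size=(lam, dim))
        idx = np.argsort(f(pts))[:mu]                          # mu best points
        x = pts[idx].mean(axis=0)                              # x = mean
        sigma = np.exp(np.log(sigmas[idx]).mean())             # sigma = log. mean
    return f0, f(x)

f0, fT = sa_es()
```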
Plenty of comparison-based algorithms




  EMNA and other EDA

  Self-adaptive algorithms

  Cumulative step-size adaptation

  Pattern Search Methods ...
Families of comparison-based algorithms



Main parameter = λ = number of
       evaluations per iteration = parallelism

Full-Ranking vs Selection-Based (param μ)
   FR: we know the ranking of the μ best
   SB: we just know which are the μ best

Elitist or not
    Elitist: comparison with all visited points
    Non-elitist: only within current offspring
EMNA ? Self-adaptation ?

Main parameter = λ = number of
        evaluations per iteration = parallelism

Full-Ranking vs Selection-Based
   FR: we know the ranking of all visited points
   SB: we just know which are the μ best

Elitist or not
    Elitist: comparison with all visited points
    Non-elitist: only within current offspring

==> yet, they work quite well
Comparison-based algorithms are robust


 Consider
               f: X --> R
 We look for x* such that
                  ∀x, f(x*) ≤ f(x)
 ==> what if we see g o f (g increasing) ?
 ==> x* is the same, but xn might change
 ==> comparison-based methods are then
      optimal

Robustness of comparison-based algorithms: formal
statement


    The behaviour of a comparison-based
     algorithm does not depend on g
    ==> a comparison-based algorithm is
     optimal for the worst case over
     compositions g o f

     (I don't give a proof here, but I promise it's true)

Introduction: I like λ large


   ● Grid5000 = 5 000 cores (increasing)
   ● Submitting jobs ==> grouping runs
      ==> λ much bigger than number of cores.
   ● Next generations of computers: tens,
      hundreds, thousands of cores.
   ● Evolutionary algorithms are population-
     based but they have a bad speed-up.

Introduction: concluding :-)


   ● Optimization = finding minima
   ● Many algorithms are comparison-based
      ==> good idea for robustness
   ● Parallel case interesting

   ==> now we can have fun with bounds

Outline


   Introduction                                      On a given domain D,
                                                     on a space F of objective
   Complexity bounds                                  functions such that
                                                           {x*(f); f∈F} = D
   Branching Factor
   Automatic Parallelization
   Real-world algorithms
   Log(λ) corrections

Complexity bounds (N = dimension)


  Cost = nb of fitness evaluations for reaching precision ε
             with probability at least ½, for all f

      N(ε) = covering number of the search space

      Exp ( - Convergence ratio ) = Convergence rate

      Convergence ratio ~ 1 / computational cost
      ==> more convenient for speed-ups
Complexity bounds: basic technique
  We want to know how many iterations we need for reaching precision ε
    in an evolutionary algorithm.

  Key observation: (most) evolutionary algorithms are comparison-based

  Let's consider (for simplicity) a deterministic selection-based non-elitist
   algorithm

  First idea: how many different branches do we have in a run ?
     We select μ points among λ
     Therefore, at most K = λ! / ( μ! ( λ - μ )! ) different branches

  Second idea: how many different answers should we be able to give ?
     Use packing numbers: at least N(ε) different possible answers

  Conclusion: the number n of iterations should verify
                    K^n ≥ N( ε )

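The bound K^n ≥ N(ε) can be made concrete with a few lines of Python. On [0,1]^dim the packing number scales roughly like (1/ε)^dim; that estimate, and the values λ=6, μ=3, are illustrative assumptions.

```python
import math

def min_iterations(lam, mu, dim, eps):
    """Smallest n with K^n >= N(eps), K = C(lam, mu), N(eps) ~ (1/eps)^dim."""
    K = math.comb(lam, mu)             # branches per iteration (selection-based)
    log_N = dim * math.log(1.0 / eps)  # log of the packing-number estimate
    return math.ceil(log_N / math.log(K))

print(min_iterations(lam=6, mu=3, dim=3, eps=1e-3))  # -> 7
```

So with K = 20 branches per iteration, at least 7 iterations are needed to distinguish 10^9 candidate answers.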
Complexity bounds on the convergence ratio




      FR: full ranking (selected points are ranked)
      SB: selection-based (selected points are not ranked)
Complexity bounds on the convergence ratio

    Linear in λ ?
Linear speed-up ?                                   My bound is
                                                        tight,
                                                   I've proved it!



Bounds:
On a given domain D
On a space F of objective
  functions such that
      {x*(f);f∈F}=D
==> very strange F possible!
==> much easier than
    F={||x-x*|| ; x*∈ D }



Linear speed-up ?                                   My bound is
                                                        tight,
                                                   I've proved it!

                                  Ok, tight bound.
                                  But what happens
                                  with a better model ?

Complexity bounds on the convergence ratio

  - Comparison-based optimization
                          (or opt. with limited-precision numbers)
  - We have developed bounds based on:

Branching factor: finitely many possible
pieces of information on the problem per time step
(→ communication complexity)

Packing number (lower bound on the number of
possible outcomes)

    Adding assumptions ==> better bounds ?
Complexity bounds: improved technique
 We want to know how many iterations we need for reaching precision ε
   in an evolutionary algorithm.

 Key observation: (most) evolutionary algorithms are comparison-based

 Let's consider (for simplicity) a deterministic selection-based non-elitist
  algorithm

 First idea: how many different branches do we have in a run ?
    We select μ points among λ
    Therefore, at most K = λ! / ( μ! ( λ - μ )! ) different branches

 Second idea: how many different answers should we be able to give ?
    Use packing numbers: at least N(ε) different possible answers

 Conclusion: the number n of iterations should verify
                   K^n ≥ N( ε )

                                      Many of these K^n
                                      branches are very unlikely !
Complexity bounds: improved technique

                        Many of these K^n branches are very unlikely !
                                                                  We'll use...
                                                               … VC-dimension !
(these slides on “shattering + VC-dim” are
                    extracted from Xue Mei's talk
                            at ENEE698A)


Definition of shattering:
  A set S of points is shattered by a set H of
  sets if for every dichotomy of S there is a
  consistent hypothesis in H
Example: Shattering

         Is this set of points shattered by the set H of circles ?
Yes!
Is this set of points shattered by circles?
How About This One?
VC-dimension

  VC-dimension( set of sets ) =
          maximum cardinality of a shattered set
  VC-dimension( set of functions ) =
                  VC-dimension( sublevel sets )
  Known (as a function of the dimension)
                    for many sets of functions
  In particular, quadratic for ellipsoids,
     linear for homotheties of a fixed ellipsoid,
                        linear for circles...

VC-dimension: the link with optimization ?


  Sauer's lemma:
   the number of subsets of n points consistent
    with a set of VC-dim V is at most n^V
  So what ?
   the number of possible selections is at most
                     K ≤ λ^V
                              ==> instead of K = λ! / ( μ! ( λ - μ )! )

                                       (V at least 3, otherwise a few details change...)

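The gain from Sauer's lemma is easy to quantify numerically; the values λ=100, μ=50, V=3 below are illustrative (V=3 would correspond to a low-VC-dimension family of level sets, e.g. circles in the plane).

```python
import math

lam, mu, V = 100, 50, 3      # illustrative values
K_comb = math.comb(lam, mu)  # all mu-subsets: lambda! / (mu! (lambda-mu)!)
K_vc = lam ** V              # Sauer's lemma bound: at most lambda^V selections
print(K_vc < K_comb)         # the VC bound is vastly smaller here
```

Here K_comb is about 10^29 while K_vc is only 10^6: the branch count per iteration drops from exponential in λ to polynomial when the fitness class has small VC-dimension.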
Complexity bounds on the convergence ratio


                                                   Should not be
                                                   linear in λ !

                                                                    Something
                                                                     remains!


      FR: full ranking (selected points are ranked)
      SB: selection-based (selected points are not ranked)
Sphere: fitness increases with distance to optimum
                                                    1 comparison = 1 hyperplane
Branching factor K (more in Gelly06; Fournier08)

Rewrite your evolutionary algorithm as follows:




g has values in a finite set of cardinality K:
 - e.g. subsets of {1,2,...,λ} of size μ  ( K = λ! / ( μ! ( λ - μ )! ) )
 - or ordered subsets  ( K = λ! / ( λ - μ )! )
 - ...

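The two branching factors on this slide correspond directly to stdlib combinatorics functions; λ=6, μ=3 are the running example from the EMNA slide.

```python
import math

lam, mu = 6, 3
K_subsets = math.comb(lam, mu)   # unordered mu-subsets: lambda! / (mu! (lambda-mu)!)
K_ordered = math.perm(lam, mu)   # ordered mu-subsets:   lambda! / (lambda-mu)!
print(K_subsets, K_ordered)      # -> 20 120
```

Selection-based algorithms see only which μ points survived (K = 20 branches); full-ranking algorithms also see their order (K = 120 branches).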
Outline

                                                   Upper bounds for the
   Introduction                                     dependency in λ
   Complexity bounds
   Branching Factor
   Automatic Parallelization
   Real-world algorithms
   Log(λ) corrections

Automatic parallelization




Speculative parallelization with branching factor 3




                   Consider the sequential algorithm
                   (iterations 1, 2, 3).

  Parallel version for D=2:
  Population = union of all possible populations
  for 2 consecutive iterations.

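The cost of this speculative construction is easy to count: one λ-population per node of the depth-D branch tree. This is a counting sketch of the idea, not the authors' implementation.

```python
def speculative_evals(lam, K, D):
    """Fitness evaluations needed to run D sequential iterations in one
    parallel step: one lambda-population per node of the K-ary branch tree."""
    nodes = sum(K ** d for d in range(D))   # 1 + K + ... + K^(D-1)
    return lam * nodes

print(speculative_evals(lam=6, K=3, D=2))  # -> 24
```

With branching factor 3 and D=2, the parallel version evaluates the single iteration-1 population plus the 3 possible iteration-2 populations, i.e. 4·λ = 24 points at once.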
Outline


   Introduction                                    Tighter lower bounds for
   Complexity bounds                                 specific algorithms ?
   Branching Factor
   Automatic Parallelization
   Real-world algorithms
   Log(λ) corrections

Real world algorithms


  Define σ*, the updated step-size.

  Necessary condition for a log(λ) speed-up:
   - E log( σ* / σ ) ~ log(λ)

   But for many algorithms,
   - E log( σ* / σ ) = O(1) ==> constant speed-up

One-fifth rule: - E log( σ* / σ ) = O(1)

  p = proportion of mutated points better than x

                    While ( I have time )
                    {
                         Generate points (x1,...,xλ) distributed as N(x,σ)
                         Evaluate the fitness at x1,...,xλ
                         Update x = mean
                         Update σ by the 1/5th rule
                    }

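A sketch of a 1/5th-rule run on the sphere. The damping constant c=0.85 is a common textbook choice, and moving x to the best offspring (rather than a mean) is a simplification assumed here.

```python
import numpy as np

def one_fifth_run(dim=3, lam=10, iters=200, c=0.85, seed=2):
    """Sketch: adapt sigma with the 1/5th success rule on f(x) = ||x||^2."""
    rng = np.random.default_rng(seed)
    x, sigma = rng.normal(size=dim), 1.0
    f = lambda p: np.sum(p**2, axis=-1)
    f0 = f(x)
    for _ in range(iters):
        pts = x + sigma * rng.normal(size=(lam, dim))
        fits = f(pts)
        p_succ = np.mean(fits < f(x))          # proportion better than x
        x = pts[np.argmin(fits)]               # move to the best offspring
        sigma = sigma / c if p_succ > 0.2 else sigma * c
    return f0, f(x)

f0, fT = one_fifth_run()
```

Note that the per-iteration change of sigma is bounded by the constants c and 1/c regardless of λ, which is exactly the "- E log(σ*/σ) = O(1)" behaviour the slide criticizes.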
One-fifth rule: - E log( σ* / σ ) = O(1)

  p = proportion of mutated points better than x
Consider e.g.



 Or consider e.g.


                    In both cases σ* / σ is lower-bounded
                    independently of λ
                    ==> parameters should
                        strongly depend on λ !
Self-adaptation, cumulative step-size adaptation




    In many cases, the same result:
    with parameters depending on the
    dimension only (and not on λ),
    the speed-up is limited by a constant!

Outline



   Introduction
   Complexity bounds
   Branching Factor
   Automatic Parallelization
   Real-world algorithms
   Log() corrections



Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud   parallel evolution   73
The starting point of this work

   ● We have shown tight bounds.
   ● Usual algorithms don't reach the bounds
                  for λ large.

   ● Trouble: the algorithms we propose are
   boring (too complicated); people prefer usual
   algorithms.

   ● A simple patch for these algorithms?
Log() corrections


   ●   In the discrete case (XPs): automatic
             parallelization surprisingly efficient.

   ●   Simple trick in the continuous case:
         - E log( *) should be linear in log()

       (this provides corrections which
          work for SA, EMNA and CSA)
Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud   parallel evolution   75
Example 1: Estimation of Multivariate Normal Algorithm


        While ( I have time )
        {
                 Generate points (x1,...,xλ) distributed as N(x,σ)
                 Evaluate the fitness at x1,...,xλ
                 x = mean of the μ best points
                 σ = standard deviation of the μ best points
                 σ /= log( λ / 7 )^(1 / d)
        }
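The corrected EMNA divides σ each iteration by log(λ/7)^(1/d), a factor that grows with λ, so larger populations shrink the step-size faster. A quick check of the factor (d=3 as in the slides' experiments):

```python
import math

def emna_correction(lam, d):
    """Per-iteration divisor applied to sigma in the corrected EMNA."""
    return math.log(lam / 7) ** (1.0 / d)

print(emna_correction(100, 3))   # about 1.39
```

Note the divisor only exceeds 1 once λ > 7·e, i.e. the correction is really aimed at the large-λ regime.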
Ex 2: Log(λ) correction for mutative self-adapt.




        μ = λ / 4  ==>  μ = min( λ/4, d )
       While ( I have time )
       {
            Generate step-sizes (σ1,...,σλ) as σ x exp(- k.N)
            Generate points (x1,...,xλ), xi distributed as N(x,σi)
            Select the μ best points
            Update x (= mean), update σ (= log. mean)
       }
Log() corrections (SA, dim 3)


   ●   In the discrete case (XPs): automatic
             parallelization surprisingly efficient.

   ●   Simple trick in the continuous case
          - E log( *) should be linear in log()

       (this provides corrections which
          work for SA and CSA)
Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud   parallel evolution   78
Log() corrections


   ●   In the discrete case (XPs): automatic
             parallelization surprisingly efficient.

   ●   Simple trick in the continuous case
          - E log( *) should be linear in log()

       (this provides corrections which
          work for SA and CSA)
Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud   parallel evolution   79
Conclusion

 The case of large population size is not well
 handled by usual algorithms.
 We proposed
      (I) theoretical bounds
      (II) an automatic parallelization
               matching the bound, and
               which works well in the discrete case.
      (III) a necessary condition for the
              continuous case, which provides
               useful hints.


Main limitation (of the application to the design of algorithms)

  All this is about a logarithmic speed-up.

  The computational
  power grows like this ==>

                      <== and the result grows like that.

  ==> much better speed-up for noisy
  optimization.

Further work 1



 Apply VC-bounds for considering only
 “reasonable” branches in the automatic
 parallelization.

 Theoretically easy, but provides extremely
 complicated algorithms.

Further work 2


 We have:
 - proofs for complicated algorithms
 - efficient (unproved) hints for usual
 algorithms

 Proofs for the versions with the “trick” ?
 NB: the discrete case is moral: the best
     algorithm is the proved one :-)

Further work 3




 What if the optimum is not a point but a
 subset with topological dimension
 N' < N ?




Further work 4



 Parallel bandits ?
 Experimentally, parallel UCT >> sequential UCT,
  with a speed-up depending on the number of arms.

 Theory ? Perhaps not very hard, but not
  done yet.



Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud   parallel evolution   85
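To make the parallel-bandit question concrete, here is a minimal, hypothetical batch-UCB sketch (not an algorithm from the slides, and unproved): at each round, λ pulls are allocated to the arms with the highest UCB1 scores, so how much the batching helps naturally depends on the number of arms.

```python
import math
import random

def parallel_ucb(means, lam, rounds, seed=0):
    """Toy batch ('parallel') UCB1 for Bernoulli arms: each round, lam
    pulls are spread over the arms with the highest UCB1 scores, with
    any surplus pulls going to the top-scored arm.  Returns the index
    of the empirically best arm (most pulled)."""
    rng = random.Random(seed)
    k = len(means)
    counts, sums, t = [0] * k, [0.0] * k, 0

    def score(i):
        if counts[i] == 0:
            return float("inf")        # force at least one pull per arm
        return sums[i] / counts[i] + math.sqrt(2.0 * math.log(max(2, t)) / counts[i])

    for _ in range(rounds):
        ranked = sorted(range(k), key=score, reverse=True)
        chosen = ranked[: min(lam, k)]
        extra = max(0, lam - k)         # surplus pulls for the top arm
        for i in chosen:
            for _ in range(1 + (extra if i == chosen[0] else 0)):
                reward = 1.0 if rng.random() < means[i] else 0.0
                counts[i] += 1
                sums[i] += reward
                t += 1
    return max(range(k), key=lambda i: counts[i])
```

When λ is much larger than the number of arms, most of the batch is spent re-pulling the same few arms, which is one intuition for why the observed speed-up of parallel UCT depends on the number of arms.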


Complexity bounds for comparison-based optimization and parallel optimization

  • 1. Complexity bounds in parallel  evolution A. Auger, H. Fournier, N. Hansen, P. Rolet, F. Teytaud, O. Teytaud Paris, 2010 Tao, Inria Saclay Ile-De-France, LRI (Université Paris Sud, France), UMR CNRS 8623, I&A team, Digiteo, Pascal Network of Excellence.
  • 2. Outline Introduction Complexity bounds Branching Factor Automatic Parallelization Real-world algorithms Log() corrections Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 2
  • 3. Outline Introduction - What is optimization ? - What are comparison-based optimization algorithms ? - Why we are interested in cp-based opt ? - Why we consider parallel machines ? Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 3
  • 4. Introduction: what is optimization ? Consider f: X --> R We look for x* such that x,f(x*) ≤ f(x) w random variable f is randomly drawn; f(x) = f(x,w). Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 4
  • 5. Introduction: what is optimization ? Quality of “Opt” quantified as follows: (to be minimized) w random variable Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 5
  • 6. Introduction: what is optimization ? Consider f: X --> R We look for x* such that x,f(x*) ≤ f(x) ==> Quasi-Newton, random search, Newton, Simplex, Interior points... Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 6
  • 7. Comparison-based optimization is comparison-based if Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 7
  • 8. The main rules for step-size adaptation While ( I have time ) { Generate points (x1,...,x) distributed as N(x,) Evaluate the fitness at x1,...,x Update x, update  } Main trouble: choosing  Cumulative step-size adaptation Mutative self-adaptation Estimation of Multivariate Normal Algorithm
  • 9. Example 1: Estimation of Multivariate Normal Algorithm While ( I have time ) { Generate points (x1,...,x) distributed as N(x,) Evaluate the fitness at x1,...,x X= mean  best points = standard deviation of  best points } I have a Gaussian...
  • 10. Example 1: Estimation of Multivariate Normal Algorithm While ( I have time ) { Generate points (x1,...,x) distributed as N(x,) Evaluate the fitness at x1,...,x X= mean  best points = standard deviation of  best points } I generate 6 points
  • 11. Example 1: Estimation of Multivariate Normal Algorithm While ( I have time ) { Generate points (x1,...,x) distributed as N(x,) Evaluate the fitness at x1,...,x X= mean  best points = standard deviation of  best points } I select the three best
  • 12. Example 1: Estimation of Multivariate Normal Algorithm While ( I have time ) { Generate points (x1,...,x) distributed as N(x,) Evaluate the fitness at x1,...,x X= mean  best points = standard deviation of  best points } I update the Gaussian
  • 13. Example 1: Estimation of Multivariate Normal Algorithm While ( I have time ) { Generate points (x1,...,x) distributed as N(x,) Evaluate the fitness at x1,...,x X= mean  best points = standard deviation of  best points } Obviously 6-parallel
  • 14. Example 2: Mutative self-adaptation  = / 4 While ( I have time ) { Generate points (1,...,) as  x exp(- k.N) Generate points (x1,...,x) distributed as N(x,i) Select the  best points Update x (=mean), update (=log. mean) }
  • 15. Plenty of comparison-based algorithms EMNA and other EDA Self-adaptive algorithms Cumulative step-size adaptation Pattern Search Methods ...
  • 16. Families of comparison-based algorithms Main parameter =  = number of evaluations per iteration = parallelism Full-Ranking vs Selection-Based (param ) FR: we know the ranking of the  best SB: we just know which are the  best Elitist or not Elitist: comparison with all visited points Non-elitist: only within current offspring
  • 17. EMNA ? Self-adaptation ? Main parameter =  = number of evaluations per iteration = parallelism Full-Ranking vs Selection-Based FR: we know the ranking of all visited points SB: we just know which are the  best Elitist or not Elitist: comparison with all visited points Non-elitist: only within current offspring ==> yet, they work quite well
  • 18. Comparison-based algorithms are robust Consider f: X --> R We look for x* such that x,f(x*) ≤ f(x) ==> what if we see g o f (g increasing) ? ==> x* is the same, but xn might change ==> then, comparison-based methods are optimal Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 18
  • 19. Robustness of comparison-based algorithms: formal statement this does not depend on g for a comparison-based algorithm a comparison-based algorithm is optimal for (I don't give a proof here, but I promise it's true) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 19
  • 20. Introduction: I like  large ● Grid5000 = 5 000 cores (increasing) ● Submitting jobs ==> grouping runs ==>  much bigger than number of cores. ● Next generations of computers: tenths, hundreds, thousands of cores. ● Evolutionary algorithms are population based but they have a bad speed-up. Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 20
  • 21. Introduction: I like  large ● Grid5000 = 5 000 cores (increasing) ● Submitting jobs ==> grouping runs ==>  much bigger than number of cores. ● Next generations of computers: tenths, hundreds, thousands of cores. ● Evolutionary algorithms are population based but they have a bad speed-up. Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 21
  • 22. Introduction: I like  large ● Grid5000 = 5 000 cores (increasing) ● Submitting jobs ==> grouping runs ==>  much bigger than number of cores. ● Next generations of computers: tenths, hundreds, thousands of cores. ● Evolutionary algorithms are population based but they have a bad speed-up. Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 22
  • 23. Introduction: I like  large ● Grid5000 = 5 000 cores (increasing) ● Submitting jobs ==> grouping runs ==>  much bigger than number of cores. ● Next generations of computers: tenths, hundreds, thousands of cores. ● Evolutionary algorithms are population based but they have a bad speed-up. Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 23
  • 24. Introduction: concluding :-) ● Optimization = finding minima ● Many algorithms are comparison-based ● ==> good idea for robustness ● Parallel case interesting ● ==> now we can have fun with bounds Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 24
  • 25. Outline Introduction On a given domain D On a space F of objective Complexity bounds functions such that {x*(f);f∈F}=D Branching Factor Automatic Parallelization Real-world algorithms Log() corrections Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 25
  • 26. Complexity bounds (N = dimension) = nb of fitness evaluations for precision  with probability at least ½ for all f N() = cov. number of the search space Exp ( - Convergence ratio ) = Convergence rate Convergence ratio ~ 1 / computational cost ==> more convenient for speed-ups Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 26
  • 27. Complexity bounds ½ = nb of fitness evaluations for precision  with probability at least ½ for all f N() = cov. number of the search space Exp ( - Convergence ratio ) = Convergence rate Convergence ratio ~ 1 / computational cost ==> more convenient for speed-ups Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 27
  • 28. Complexity bounds: basic technique We want to know how many iterations we need for reaching precision  in an evolutionary algorithm. Key observation: (most) evolutionary algorithms are comparison-based Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm First idea: how many different branches we have in a run ? We select  points among  Therefore, at most K = ! / ( ! (  -  )!) different branches Second idea: how many different answers should we able to give ? Use packing numbers: at least N() different possible answers Conclusion: the number n of iterations should verify Kn ≥ N (  ) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 28
  • 29. Complexity bounds: basic technique We want to know how many iterations we need for reaching precision  in an evolutionary algorithm. Key observation: (most) evolutionary algorithms are comparison-based Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm First idea: how many different branches we have in a run ? We select  points among  Therefore, at most K = ! / ( ! (  -  )!) different branches Second idea: how many different answers should we able to give ? Use packing numbers: at least N() different possible answers Conclusion: the number n of iterations should verify Kn ≥ N (  ) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 29
  • 30. Complexity bounds: basic technique We want to know how many iterations we need for reaching precision  in an evolutionary algorithm. Key observation: (most) evolutionary algorithms are comparison-based Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm First idea: how many different branches we have in a run ? We select  points among  Therefore, at most K = ! / ( ! (  -  )!) different branches Second idea: how many different answers should we able to give ? Use packing numbers: at least N() different possible answers Conclusion: the number n of iterations should verify Kn ≥ N (  ) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 30
  • 31. Complexity bounds: basic technique We want to know how many iterations we need for reaching precision  in an evolutionary algorithm. Key observation: (most) evolutionary algorithms are comparison-based Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm First idea: how many different branches we have in a run ? We select  points among  Therefore, at most K = ! / ( ! (  -  )!) different branches Second idea: how many different answers should we able to give ? Use packing numbers: at least N() different possible answers Conclusion: the number n of iterations should verify Kn ≥ N (  ) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 31
  • 32. Complexity bounds: -balls We want to know how many iterations we need for reaching precision  in an evolutionary algorithm. Key observation: (most) evolutionary algorithms are comparison-based Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm First idea: how many different branches we have in a run ? We select  points among  Therefore, at most K = ! / ( ! (  -  )!) different branches Second idea: how many different answers should we able to give ? Use packing numbers: at least N() different possible answers Conclusion: the number n of iterations should verify Kn ≥ N (  ) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 32
  • 33. Complexity bounds: -balls We want to know how many iterations we need for reaching precision  in an evolutionary algorithm. Key observation: (most) evolutionary algorithms are comparison-based Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm First idea: how many different branches we have in a run ? We select  points among  Therefore, at most K = ! / ( ! (  -  )!) different branches Second idea: how many different answers should we able to give ? Use packing numbers: at least N() different possible answers Conclusion: the number n of iterations should verify Kn ≥ N (  ) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 33
  • 34. Complexity bounds: -balls We want to know how many iterations we need for reaching precision  in an evolutionary algorithm. Key observation: (most) evolutionary algorithms are comparison-based Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm First idea: how many different branches we have in a run ? We select  points among  Therefore, at most K = ! / ( ! (  -  )!) different branches Second idea: how many different answers should we able to give ? Use packing numbers: at least N() different possible answers Conclusion: the number n of iterations should verify Kn ≥ N (  ) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 34
  • 35. Complexity bounds: basic technique We want to know how many iterations we need for reaching precision  in an evolutionary algorithm. Key observation: (most) evolutionary algorithms are comparison-based Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm First idea: how many different branches we have in a run ? We select  points among  Therefore, at most K = ! / ( ! (  -  )!) different branches Second idea: how many different answers should we able to give ? Use packing numbers: at least N() different possible answers Conclusion: the number n of iterations should verify Kn ≥ N (  ) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 35
  • 36. Complexity bounds on the convergence ratio FR: full ranking (selected points are ranked) SB: selection-based (selected points are not ranked) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 36
  • 37. Complexity bounds on the convergence ratio Linear in  ? FR: full ranking (selected points are ranked) SB: selection-based (selected points are not ranked) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 37
  • 38. Linear speed-up ? My bound is tight, I've proved it! Bounds: On a given domain D On a space F of objective functions such that {x*(f);f∈F}=D ==> very strange F possible! ==> much easier than F={||x-x*|| ; x*∈ D } Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 38
  • 39. Linear speed-up ? My bound is Ok, tight bound. tight, But what I've proved it! happens with a better model ? Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 39
  • 40. Complexity bounds on the convergence ratio - Comparison-based optimization (or opt. with limited precision numbers) - We have developped bounds based on: Branching factor: finitely many possible informations on the problem per time step (→ communication. compl) Packing number (lower bound on number of possible outcomes) Adding assumptions ==> better bounds ? Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 40
  • 41. Complexity bounds: improved technique We want to know how many iterations we need for reaching precision  in an evolutionary algorithm. Key observation: (most) evolutionary algorithms are comparison-based Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm First idea: how many different branches we have in a run ? We select  points among  Therefore, at most K = ! / ( ! (  -  )!) different branches Second idea: how many different answers should we able to give ? Use packing numbers: at least N() different possible answers Conclusion: the number nMany of these K verify of iterations should Kn ≥ Nbranches are ( ) very unlikely ! Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 41
  • 42. Complexity bounds: improved technique We want to know how many iterations we need for reaching precision  in an evolutionary algorithm. Key observation: (most) evolutionary algorithms are comparison-based Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm First idea: how many different branches we have in a run ? We select  points among  Therefore, at most K = ! / ( ! (  -  )!) different branches Second idea: how many different answers should we able to give ? Use packing numbers: at least N() different possible answers Conclusion: the number n of iterations should verify Many of these K n K ≥ N( ) branches are We'll use... … VC-dimension ! very unlikely ! Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 42
  • 43. (these slides “shattering + VC-dim” extracted from Xue Mei's talk at ENEE698A) Definition of shattering: A set S of points is shattered by a set H of sets if for every dichotomy of S there is a consistent hypothesis in H
  • 44. Example: Shattering Is this set of points shattered by the set H o
  • 45. Yes! + - + + + + + + - + + - + - - - - - - + + - - -
  • 46. Is this set of points shattered by circles?
  • 48. VC-dimension VC-dimension( set of sets ) = maximum cardinal of a shattered set VC-dimension (set of functions ) = VC-dimension ( level sets) Known (as a function of the dimension) for many sets of functions In particular, quadratic for ellipsoids, linear for homotheties of a fixed ellipsoid linear for circles... Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 48
  • 49. VC-dimension VC-dimension( set of sets ) = maximum cardinal of a shattered set VC-dimension (set of functions ) = VC-dimension ( level sets) Known (as a function of the dimension) for many sets of functions In particular, quadratic for ellipsoids, linear for homotheties of a fixed ellipsoid linear for circles... Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 49
  • 50. VC-dimension VC-dimension( set of sets ) = maximum cardinal of a shattered set VC-dimension (set of functions ) = VC-dimension ( level sets) Known (as a function of the dimension) for many sets of functions In particular, quadratic for ellipsoids, linear for homotheties of a fixed ellipsoid linear for circles... Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 50
  • 51. VC-dimension VC-dimension( set of sets ) = maximum cardinal of a shattered set VC-dimension (set of functions ) = VC-dimension ( sublevel sets) Known (as a function of the dimension) for many sets of functions In particular, quadratic for ellipsoids, linear for homotheties of a fixed ellipsoid linear for circles... Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 51
  • 52. VC-dimension: the link with optimization ? Sauer's lemma: number of subsets of V points consistent V with a set of VC-dim V at most  So what ? number of possible selections at most V K≤ ==> instead of K = ! / ( ! (  -  )!) (V at least 3, otherwise a few details change...) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 52
  • 53. Complexity bounds on the convergence ratio FR: full ranking (selected points are ranked) SB: selection-based (selected points are not ranked) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 53
  • 54. Complexity bounds on the convergence ratio Should not be linear in  ! FR: full ranking (selected points are ranked) SB: selection-based (selected points are not ranked) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 54
  • 55. Complexity bounds on the convergence ratio Something remains! FR: full ranking (selected points are ranked) SB: selection-based (selected points are not ranked) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 55
  • 56. Sphere: fitness increases with distance to optimum 1 comparison = 1 hyperplane Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 56
  • 57. Sphere: fitness increases with distance to optimum 1 comparison = 1 hyperplane Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 57
  • 58. Sphere: fitness increases with distance to optimum 1 comparison = 1 hyperplane Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 58
  • 59. Sphere: fitness increases with distance to optimum 1 comparison = 1 hyperplane Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 59
  • 60. Outline Introduction Complexity bounds Branching Factor Automatic Parallelization Real-world algorithms Log() corrections Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 60
  • 61. Branching factor K (more in Gelly06; Fournier08) Rewrite your evolutionary algorithm as follows: g has values in a finite set of cardinal K: - e.g. subsets of {1,2,...,} of size  (K=! / (!(-)!) ) - or ordered subsets (K=! / (-)! ). - ... Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 61
  • 62. Outline Upper bounds for the Introduction dependency in  Complexity bounds Branching Factor Automatic Parallelization Real-world algorithms Log() corrections Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 62
  • 63. Automatic parallelization Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 63
  • 64. Speculative parallelization with branching factor 3 Consider the sequential algorithm. (iteration 1) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 64
  • 65. Speculative parallelization with branching factor 3 Consider the sequential algorithm. (iteration 2) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 65
  • 66. Speculative parallelization with branching factor 3 Consider the sequential algorithm. (iteration 3) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 66
  • 67. Speculative parallelization with branching factor 3. Parallel version for D=2: population = union of all populations over the 2 iterations.
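The idea of slides 64–67 can be sketched as follows: speculatively unroll D iterations of the sequential algorithm, one branch per possible sequence of comparison outcomes, and evaluate the union of all requested points in parallel. This is only an illustration; `step` is an assumed helper (not from the slides) mapping (state, outcome) to (new state, points to evaluate).

```python
from itertools import product

def speculative_population(state, step, K=3, D=2):
    """Union of all populations the sequential algorithm could request
    over its next D iterations (K**D speculative branches)."""
    points = []
    for outcomes in product(range(K), repeat=D):
        s = state
        for o in outcomes:
            s, pts = step(s, o)   # one sequential iteration on this branch
            points.extend(pts)
    return points

# Toy step: each branch evaluates one point per iteration.
pop = speculative_population(0, lambda s, o: (s + 1, [(s, o)]), K=3, D=2)
print(len(pop))  # 3**2 branches * 2 iterations = 18 points
```

Once the comparisons of the real run are known, only one of the K**D branches is kept, so D iterations cost one parallel evaluation round.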
  • 68. Outline (tighter lower bounds for specific algorithms?): Introduction, Complexity bounds, Branching Factor, Automatic Parallelization, Real-world algorithms, Log() corrections.
  • 69. Real-world algorithms. Define the normalized step-size σ*. Necessary condition for a log(λ) speed-up: −E log(σ*) ~ log(λ). But for many algorithms, −E log(σ*) = O(1) ==> constant speed-up.
  • 70. One-fifth rule: −E log(σ*) = O(1). p = proportion of mutated points better than x. While (I have time) { Generate λ points (x1,...,xλ) distributed as N(x,σ); Evaluate the fitness at x1,...,xλ; Update x = mean; Update σ by the 1/5th rule }.
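As an illustration of the loop above, a minimal runnable sketch with 1/5th-rule step-size adaptation on the sphere; the constants (2.0, 0.84), λ = 10, and the elitist acceptance of the best mutant are sketch choices, not taken from the slides:

```python
import numpy as np

def one_fifth_es(f, x, sigma, lam=10, iters=100, rng=None):
    """Evolution strategy with 1/5th-rule step-size adaptation (a sketch)."""
    rng = np.random.default_rng(0) if rng is None else rng
    fx = f(x)
    for _ in range(iters):
        pts = x + sigma * rng.standard_normal((lam, len(x)))
        vals = np.array([f(p) for p in pts])
        success = np.mean(vals < fx)        # proportion of mutants better than x
        if vals.min() < fx:                 # elitist acceptance (sketch choice)
            x, fx = pts[vals.argmin()], vals.min()
        sigma *= 2.0 if success > 0.2 else 0.84  # 1/5th rule
    return x, fx

x, fx = one_fifth_es(lambda z: float(z @ z), np.ones(3), 1.0)
```

The point of slides 70–71 is precisely that such a σ-update, with constants independent of λ, cannot yield more than a constant speed-up as λ grows.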
  • 71. One-fifth rule: −E log(σ*) = O(1). p = proportion of mutated points better than x. Consider e.g. ..., or consider e.g. ...: in both cases σ* is lower-bounded independently of λ ==> parameters should strongly depend on λ!
  • 72. Self-adaptation, cumulative step-size adaptation: in many cases, the same result. With parameters depending on the dimension only (and not on λ), the speed-up is limited by a constant!
  • 73. Outline: Introduction, Complexity bounds, Branching Factor, Automatic Parallelization, Real-world algorithms, Log() corrections.
  • 74. The starting point of this work: we have shown tight bounds, but usual algorithms don't reach the bounds for λ large. Trouble: the algorithms we propose are boring (too complicated), and people prefer the usual algorithms. A simple patch for these algorithms?
  • 75. Log() corrections ● In the discrete case (XPs): automatic parallelization surprisingly efficient. ● Simple trick in the continuous case: - E log( *) should be linear in log() (this provides corrections which work for SA, EMNA and CSA) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 75
  • 76. Example 1: Estimation of Multivariate Normal Algorithm. While (I have time) { Generate λ points (x1,...,xλ) distributed as N(x,σ); Evaluate the fitness at x1,...,xλ; x = mean of the μ best points; σ = standard deviation of the μ best points; σ /= log(λ/7)^(1/d) } (I select the three best)
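A runnable sketch of this EMNA variant with the log(λ/7)^(1/d) correction, assuming a single pooled standard deviation (a simplification; the slide may use per-coordinate deviations). λ = 40 and μ = 10 are illustrative:

```python
import numpy as np

def emna_log_correction(f, x, sigma, lam=40, mu=10, iters=50, rng=None):
    """EMNA loop with the log(lambda/7)^(1/d) step-size correction (a sketch)."""
    rng = np.random.default_rng(0) if rng is None else rng
    d = len(x)
    for _ in range(iters):
        pts = x + sigma * rng.standard_normal((lam, d))
        vals = np.array([f(p) for p in pts])
        best = pts[np.argsort(vals)[:mu]]    # the mu best points
        x = best.mean(axis=0)
        sigma = best.std()                   # pooled std over all coordinates
        sigma /= np.log(lam / 7) ** (1 / d)  # the log(lambda) correction
    return x

x_final = emna_log_correction(lambda z: float(z @ z), np.ones(3), 1.0)
```

Without the division by log(λ/7)^(1/d), the selected points' standard deviation shrinks too slowly as λ grows, which is exactly the constant-speed-up pathology of the previous slides.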
  • 77. Ex 2: Log(lambda) correction for mutative self-adaptation. μ = λ/4 ==> μ = min(λ/4, d). While (I have time) { Generate λ step-sizes (σ1,...,σλ) as σ × exp(k·N(0,1)); Generate λ points (x1,...,xλ) with xi distributed as N(x,σi); Select the μ best points; Update x (= mean), update σ (= log. mean) }
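A sketch of this mutative self-adaptation loop with the μ = min(λ/4, d) correction; k = 0.3 and the other constants are illustrative, not from the slide:

```python
import numpy as np

def sa_log_correction(f, x, sigma, lam=40, iters=60, k=0.3, rng=None):
    """Mutative self-adaptation with the mu = min(lambda/4, d) correction."""
    rng = np.random.default_rng(0) if rng is None else rng
    d = len(x)
    mu = max(1, min(lam // 4, d))            # the log(lambda) correction
    for _ in range(iters):
        sigmas = sigma * np.exp(k * rng.standard_normal(lam))   # mutate sigma
        pts = x + sigmas[:, None] * rng.standard_normal((lam, d))
        vals = np.array([f(p) for p in pts])
        idx = np.argsort(vals)[:mu]          # the mu best points
        x = pts[idx].mean(axis=0)            # x = mean
        sigma = np.exp(np.log(sigmas[idx]).mean())  # sigma = log-mean
    return x

x_sa = sa_log_correction(lambda z: float(z @ z), np.ones(3), 1.0)
```

Capping μ at the dimension d keeps the selection pressure growing with λ instead of saturating, which is what restores a log(λ) speed-up.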
  • 78. Log() corrections (SA, dim 3) ● In the discrete case (XPs): automatic parallelization surprisingly efficient. ● Simple trick in the continuous case - E log( *) should be linear in log() (this provides corrections which work for SA and CSA) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 78
  • 79. Log() corrections ● In the discrete case (XPs): automatic parallelization surprisingly efficient. ● Simple trick in the continuous case - E log( *) should be linear in log() (this provides corrections which work for SA and CSA) Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 79
  • 80. Conclusion. The case of large population sizes is not well handled by usual algorithms. We proposed (I) theoretical bounds, (II) an automatic parallelization matching the bounds, which works well in the discrete case, and (III) a necessary condition for the continuous case, which provides useful hints.
  • 81. Main limitation (of the application to the design of algorithms): all of this concerns a logarithmic speed-up. The computational power grows enormously, while the resulting gain grows only logarithmically. ==> much better speed-up for noisy optimization.
  • 82. Further work 1. Apply VC-bounds to consider only "reasonable" branches in the automatic parallelization. Theoretically easy, but it yields extremely complicated algorithms.
  • 83. Further work 2. We have: proofs for complicated algorithms; efficient (unproved) hints for usual algorithms. Proofs for the versions with the "trick"? NB: the discrete case is morally satisfying: the best algorithm is the proved one :-)
  • 84. Further work 3. What if the optimum is not a point but a subset with topological dimension N' < N?
  • 85. Further work 4. Parallel bandits? Experimentally, parallel UCT >> sequential UCT, with a speed-up depending on the number of arms. Theory? Perhaps not very hard, but not done yet.
