Matrix Chain Scheduling Algorithm

                              yen3


                       March 4, 2009




yen3 ()           Matrix Chain Scheduling Algorithm   March 4, 2009   1 / 30
Outline




1   Introduction
       About the paper
       The Problem
       Notation
       Problem Description

2   Processor Allocation for Matrix Products
      Processor Allocation
      DPA Algorithm

3   Matrix Chain Scheduling Algorithm
     Matrix Chain Scheduling Algorithm
     Two-Pass Matrix Chain Scheduling Algorithm

4   Conclusion




About the paper



   Processor Allocation and Task Scheduling of Matrix Chain Products
   on Parallel Systems (Parallelizing Matrix Chain Products)
   Heejo Lee, Jong Kim, Sungje Hong, Sunggu Lee
   Dept. of Computer Science and Engineering, Pohang University of
   Science and Technology, Korea
   Technical Report CS-HPC-97-003
   April 2003 (December 1997)








MCOP: the Matrix Chain Ordering Problem
MCSP: the Matrix Chain Scheduling Problem
The paper introduces a processor scheduling algorithm for MCSP
that attempts to minimize the evaluation time of a chain of matrix
products on a parallel computer, even at the expense of a slight
increase in the total number of operations.






A lot of symbols... Orz

    $P$: the number of processors in a parallel system.
    $M$: a matrix chain product with $n$ matrices, i.e.,
    $M = M_1 \times M_2 \times \cdots \times M_n$.
    $M_i$: an $m_i \times m_{i+1}$ matrix ($m_i \ge 1$, $1 \le i \le n$).
    $L$: a product sequence tree for the matrix chain $M$.
    $L_{i,j}$: the sequence subtree of $L$ for $(M_i \times \cdots \times M_j)$.






A lot of symbols (cont'd)

    $C$: the minimum amount of computation for evaluating $M$.
    $\Delta C$: the increase in computation caused by modifying the current
    sequence tree.
    $p_{i,j}$: the number of processors assigned for evaluating $(M_i \times \cdots \times M_j)$.
    $(m_i, m_j, m_k)$: a single matrix product multiplying an $m_i \times m_j$ matrix
    by an $m_j \times m_k$ matrix.
    $\Phi(m_i, m_j, m_k, p)$: the execution time of the single matrix product
    $(m_i, m_j, m_k)$ when $p$ processors are allocated.
    $D(x)$: the set of divisors of $x$, i.e., $D(x) = \{d \mid d \text{ divides } x\}$.






A lot of symbols - 2003 new!

   $LD(x, y)$: the largest divisor in $D(x)$ that is not larger than $y$.
   $SD(x, y)$: if $x > y$, then $SD(x, y)$ is the smallest divisor in $D(x)$ that
   is larger than $y$. Otherwise, $SD(x, y) = x$.
   $m$: the largest dimension among all of the matrices, i.e.,
   $m = \max_{1 \le i \le n+1} m_i$.
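The divisor notation maps almost directly onto code. The following Python sketch is only an illustration of the definitions of D(x), LD(x, y) and SD(x, y); the function names `divisors`, `ld` and `sd` are hypothetical and do not come from the paper.

```python
from math import isqrt


def divisors(x: int) -> list[int]:
    """Return D(x): all positive divisors of x in increasing order."""
    small = [d for d in range(1, isqrt(x) + 1) if x % d == 0]
    large = [x // d for d in reversed(small) if d * d != x]
    return small + large


def ld(x: int, y: float) -> int:
    """LD(x, y): largest divisor of x that is not larger than y (assumes y >= 1)."""
    return max(d for d in divisors(x) if d <= y)


def sd(x: int, y: float) -> int:
    """SD(x, y): smallest divisor of x larger than y if x > y, otherwise x."""
    if x <= y:
        return x
    return min(d for d in divisors(x) if d > y)


print(divisors(12))           # [1, 2, 3, 4, 6, 12]
print(ld(16, 10), sd(16, 10))  # 8 16
```

The square-root scan keeps `divisors` at O(sqrt(x)) work, which is enough for the matrix dimensions discussed here.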






The Problem


       The problem is to find the schedule with the minimum evaluation
       time of $M = M_1 \times M_2 \times \cdots \times M_n$ on a $P$-processor parallel system.
       The analysis rests on a parallel matrix multiplication algorithm: when
       an $m_i \times m_j$ matrix A is multiplied by an $m_j \times m_k$ matrix B with $p$ processors,
       the execution time $\Phi(m_i, m_j, m_k, p)$ can be approximated as follows.

       $$\Phi(m_i, m_j, m_k, p) \approx \begin{cases} \dfrac{m_i m_j m_k}{p} & \text{if } 1 \le p \le m_i m_k \\[1ex] \dfrac{m_i m_j m_k}{p} + \log\dfrac{p}{m_i m_k} & \text{if } m_i m_k < p \le m_i m_j m_k \end{cases}$$
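A minimal sketch of this cost model in Python. It assumes that the second case adds a logarithmic term (the cost of combining partial sums) to the per-processor work, and that the logarithm is base 2; the slides state neither explicitly. The name `phi` is an illustrative choice.

```python
from math import log2


def phi(mi: int, mj: int, mk: int, p: int) -> float:
    """Approximate time of an (mi x mj) by (mj x mk) product on p processors."""
    work = mi * mj * mk
    if not 1 <= p <= work:
        raise ValueError("p must satisfy 1 <= p <= mi * mj * mk")
    if p <= mi * mk:
        # At most one processor per output element: the work is split evenly.
        return work / p
    # More processors than output elements: add the assumed log2 combining term.
    return work / p + log2(p / (mi * mk))


print(phi(2, 3, 8, 8))   # 6.0
print(phi(4, 4, 3, 12))  # 4.0
```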


The Problem - cont’d


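         A matrix chain product $(M_i \times \cdots \times M_j)$ can be split into two parts at some index $k$.
         The two parts are evaluated either one after the other on all $p_{i,j}$ processors
         or concurrently on separate groups of processors, so the evaluation time is

           $$T_{i,j} \approx \min_{i \le k < j} \Big( T_{i,k}(p_{i,j}) + T_{k+1,j}(p_{i,j}) + \Phi(m_i, m_{k+1}, m_{j+1}, p_{i,j}),\; \max\big(T_{i,k}(p_{i,k}), T_{k+1,j}(p_{k+1,j})\big) + \Phi(m_i, m_{k+1}, m_{j+1}, p_{i,j}) \Big)$$

         where $p_{i,j}$ is the number of processors assigned for evaluating $(M_i \times \cdots \times M_j)$.

         The paper therefore defines the MCSP as follows: find the product sequence for evaluating
         a chain of matrix products, and the processor schedule
         for that sequence, such that the evaluation time on a
         parallel system is minimized.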


MCSP Complexity



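   The paper assumes that each matrix multiplication
   $(m_i \times m_j) \times (m_j \times m_k)$ takes $\log(m_j)$ time on $m_i m_j m_k$
   processors.
   Under that assumption the problem can be written as

   $$T_{i,j} = \begin{cases} \min_{i \le k < j} \big( \max(T_{i,k}, T_{k+1,j}) + \log(m_{k+1}) \big) & 1 \le i < j \le n \\ 0 & i = j,\ 1 \le i \le n \end{cases}$$

   However, MCSP is NP-hard.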
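The recurrence can be evaluated directly with memoization. The sketch below covers only this idealized model, in which every product gets as many processors as it can use; it is not an algorithm for the general MCSP. Base-2 logarithms and the name `min_parallel_time` are assumptions made for illustration.

```python
from functools import lru_cache
from math import log2


def min_parallel_time(dims: list[int]) -> float:
    """Evaluate T_{1,n} for a chain whose i-th matrix is dims[i-1] x dims[i]."""
    n = len(dims) - 1
    m = [0] + list(dims)  # shift to 1-based indexing: M_i is m[i] x m[i+1]

    @lru_cache(maxsize=None)
    def t(i: int, j: int) -> float:
        if i == j:
            return 0.0  # a single matrix needs no multiplication
        # Both halves run concurrently, then one product costs log(m_{k+1}).
        return min(max(t(i, k), t(k + 1, j)) + log2(m[k + 1]) for k in range(i, j))

    return t(1, n)


print(min_parallel_time([4, 8, 2, 16, 4]))  # 5.0
```

In the usage line, splitting the four matrices in the middle wins because the two halves overlap in time and the final product has an inner dimension of only 2.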




the Concurrent Computations of Multiple Matrix Products

    Proportional allocation: the algorithm allocates processors in
    proportion to the amount of computation of each task. It tries to
    minimize the completion time of all tasks by balancing the execution time
    of each task.
For example
    Suppose we have two matrix multiplications, A = (2, 3, 8) and B = (4, 4, 3), and 20 processors.
    If we allocate 10 processors to each matrix multiplication, we get
        $\Phi_A(2, 3, 8, 10) = \Phi(2, 3, 8, 8) = 6$ time units
        $\Phi_B(4, 4, 3, 10) = \Phi(4, 4, 3, 6) = 8$ time units
        $\max(\Phi_A, \Phi_B) = 8$ time units
    (A can use only 8 of its 10 processors, the largest divisor of 2 × 8 = 16 not exceeding 10;
    likewise B can use only 6, the largest divisor of 4 × 3 = 12 not exceeding 10.)
    If we allocate 8 processors to A and 12 processors to B, we get
        $\Phi_A(2, 3, 8, 8) = \Phi(2, 3, 8, 8) = 6$ time units
        $\Phi_B(4, 4, 3, 12) = \Phi(4, 4, 3, 12) = 4$ time units
        $\max(\Phi_A, \Phi_B) = 6$ time units
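The numbers in this example can be checked with a few lines of Python. The sketch assumes, as the arithmetic of the example suggests, that a product (m_i, m_j, m_k) given p processors effectively uses the largest divisor of m_i m_k that does not exceed p. The names `usable`, `exec_time`, `completion` and `best_split` are hypothetical.

```python
def usable(mi: int, mk: int, p: int) -> int:
    """Largest divisor of mi * mk that does not exceed p (assumed usable count)."""
    cells = mi * mk
    return max(d for d in range(1, cells + 1) if cells % d == 0 and d <= p)


def exec_time(product: tuple[int, int, int], p: int) -> float:
    """Time m_i * m_j * m_k / d, where d is the usable processor count."""
    mi, mj, mk = product
    return mi * mj * mk / usable(mi, mk, p)


def completion(a, b, pa: int, pb: int) -> float:
    """Both products run concurrently, so the slower one decides."""
    return max(exec_time(a, pa), exec_time(b, pb))


def best_split(a, b, total: int):
    """Try every split pa + pb = total and return (time, pa, pb) of the fastest."""
    return min((completion(a, b, pa, total - pa), pa, total - pa)
               for pa in range(1, total))


A, B = (2, 3, 8), (4, 4, 3)
print(completion(A, B, 10, 10))  # 8.0
print(completion(A, B, 8, 12))   # 6.0
print(best_split(A, B, 20))      # (6.0, 8, 12)
```

The exhaustive search is only there to confirm that the 8/12 split is the best one for this instance; it is not how the paper's algorithm proceeds.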



DPA for Two Matrix Products - O(P) time

Discrete Processor Allocation for Two Matrix Products (DPA)
    Input: two matrix products $X = (m_x, m_{x+1}, m_{x+2})$ and
    $Y = (m_y, m_{y+1}, m_{y+2})$, and a set of $P$ processors.
    Output: the numbers of processors allocated to the matrix products
    $X$ and $Y$, denoted $P_x$ and $P_y$, which satisfy $1 \le P_x, P_y \le P$ and
    $P_x + P_y \le P$.




  1   Set $P_{prop} = \dfrac{m_x m_{x+1} m_{x+2}}{m_x m_{x+1} m_{x+2} + m_y m_{y+1} m_{y+2}} P$.
  2   Find the largest $d_{x,i}$ in $D(m_x, m_{x+2})$ which satisfies $d_{x,i} \le P_{prop}$.
  3   Find the largest $d_{y,j}$ in $D(m_y, m_{y+2})$ which satisfies $d_{y,j} \le P - P_{prop}$.
  4   If $\Phi(m_x, m_{x+1}, m_{x+2}, d_{x,i}) < \Phi(m_y, m_{y+1}, m_{y+2}, d_{y,j})$, then
      $P_x = d_{x,i}$, $P_y = P - d_{x,i}$. Otherwise $P_x = P - d_{y,j}$, $P_y = d_{y,j}$.
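The four steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: D(m_x, m_{x+2}) is read as the divisors of the product m_x · m_{x+2}, steps 2 and 3 pick the largest qualifying divisor, the cost model adds a base-2 log term when p exceeds m_i m_k, and P_prop and P − P_prop are both at least 1. The names `divisors`, `ld`, `phi` and `dpa` are hypothetical.

```python
from math import isqrt, log2


def divisors(x: int) -> list[int]:
    """All positive divisors of x in increasing order."""
    small = [d for d in range(1, isqrt(x) + 1) if x % d == 0]
    return small + [x // d for d in reversed(small) if d * d != x]


def ld(x: int, y: float) -> int:
    """Largest divisor of x that is not larger than y (assumes y >= 1)."""
    return max(d for d in divisors(x) if d <= y)


def phi(mi: int, mj: int, mk: int, p: int) -> float:
    """Assumed cost model: even split, plus a log2 term when p > mi * mk."""
    work = mi * mj * mk
    if p <= mi * mk:
        return work / p
    return work / p + log2(p / (mi * mk))


def dpa(x: tuple[int, int, int], y: tuple[int, int, int], P: int) -> tuple[int, int]:
    """Return (P_x, P_y) for two independent products on P processors."""
    cx, cy = x[0] * x[1] * x[2], y[0] * y[1] * y[2]
    p_prop = P * cx / (cx + cy)        # step 1: proportional share of X
    dx = ld(x[0] * x[2], p_prop)       # step 2: round X's share down to a divisor
    dy = ld(y[0] * y[2], P - p_prop)   # step 3: same for Y
    if phi(*x, dx) < phi(*y, dy):      # step 4: the slower product gets the rest
        return dx, P - dx
    return P - dy, dy


print(dpa((2, 3, 8), (4, 4, 3), 20))  # (8, 12)
```

On the instance from the proportional-allocation example, the sketch moves the two processors that X cannot use over to Y, which is the 8/12 split shown there.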






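DPA description in 2003

 Here $X = (m_x, m_{x+1}, m_{x+2})$, $Y = (m_y, m_{y+1}, m_{y+2})$, and $\Phi(X, p)$ abbreviates $\Phi(m_x, m_{x+1}, m_{x+2}, p)$.
 1   $P_{prop} = P \times \dfrac{m_x m_{x+1} m_{x+2}}{m_x m_{x+1} m_{x+2} + m_y m_{y+1} m_{y+2}}$
 2   $d_i = LD(m_x m_{x+2}, P_{prop})$
 3   $d_{i+1} = SD(m_x m_{x+2}, P_{prop})$
 4   $d_j = LD(m_y m_{y+2}, P - P_{prop})$
 5   $d_{j+1} = SD(m_y m_{y+2}, P - P_{prop})$
 6   if $\Phi(X, d_i) \ge \Phi(Y, d_j)$ then
       1    if $\max(\Phi(X, d_{i+1}), \Phi(Y, LD(m_y m_{y+2}, P - d_{i+1}))) < \Phi(X, d_i)$ then
               $P_x = d_{i+1}$, $P_y = LD(m_y m_{y+2}, P - d_{i+1})$
       2    else
               $P_x = d_i$, $P_y = LD(m_y m_{y+2}, P - d_i)$
     else
       3    if $\max(\Phi(X, LD(m_x m_{x+2}, P - d_{j+1})), \Phi(Y, d_{j+1})) < \Phi(Y, d_j)$ then
               $P_x = LD(m_x m_{x+2}, P - d_{j+1})$, $P_y = d_{j+1}$
       4    else
               $P_x = LD(m_x m_{x+2}, P - d_j)$, $P_y = d_j$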



DPA for Two Matrix Products - cont’d



   The naïve algorithm's time complexity is O(P), where P is the number of
   processors (the dominant cost is steps 2 and 3).
   The completion time of two matrix products when processors are
   allocated by the DPA algorithm is shorter than or equal to that under
   proportional allocation.
   DPA guarantees the minimum completion time for two matrix
   products.
   DPA can easily be extended to k independent matrix products.






DPA for k Matrix Products



    DPA-k guarantees the minimum completion time of k independent
    matrix products.

Discrete Processor Allocation for k Matrix Products (DPA-k)
    Input: k matrix products $X_i = (m_{i,1}, m_{i,2}, m_{i,3})$ for $1 \le i \le k$,
    given on $P$ processors ($k \le P$).
    Output: the number of allocated processors $P_i$ for each matrix product $X_i$,
    which satisfies $\sum_{i=1}^{k} P_i \le P$.




DPA for k Matrix Products - cont’d


Discrete Processor Allocation for k Matrix Products (DPA-k)
  1   For i = 1 to k do
        1    $P_{prop,i} = \dfrac{m_{i,1} m_{i,2} m_{i,3}}{\sum_{j=1}^{k} m_{j,1} m_{j,2} m_{j,3}} P$
        2    Find the maximum $d_{i,j}$ in $D(m_{i,1}, m_{i,3})$ which satisfies $d_{i,j} \le P_{prop,i}$
        3    Let $P_i = d_{i,j}$
  2   While $P - \sum_{i=1}^{k} P_i > 0$ do
        1    Find the product $X_i$ with the maximum $\Phi(m_{i,1}, m_{i,2}, m_{i,3}, P_i)$
        2    If $P_i < m_{i,1} m_{i,3}$ then
                 1    Find the minimum $d_{i,j}$ in $D(m_{i,1}, m_{i,3})$ which satisfies $d_{i,j} > P_i$
                 2    If $P - \sum_{i=1}^{k} P_i - (d_{i,j} - P_i) > 0$ then
                         $P_i = d_{i,j}$. Otherwise, stop the algorithm.
             else stop the algorithm.
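The loop above can be sketched in Python as follows. Several points are assumptions rather than facts from the paper: D(m_{i,1}, m_{i,3}) is read as the divisors of m_{i,1} · m_{i,3}, every proportional share is at least 1, and the flag `allow_exact_fit` is a hypothetical addition. With the strict test as written on the slide (`> 0`), an upgrade that would use up exactly the last idle processors is rejected; the flag lets that upgrade through so the two behaviours can be compared. The names `divisors` and `dpa_k` are illustrative.

```python
from math import isqrt


def divisors(x: int) -> list[int]:
    """All positive divisors of x in increasing order."""
    small = [d for d in range(1, isqrt(x) + 1) if x % d == 0]
    return small + [x // d for d in reversed(small) if d * d != x]


def dpa_k(products: list[tuple[int, int, int]], P: int,
          allow_exact_fit: bool = True) -> list[int]:
    """Return the processor count P_i for each product (m_i1, m_i2, m_i3)."""
    work = [a * b * c for a, b, c in products]
    total = sum(work)
    # Step 1: round each proportional share down to a divisor of m_i1 * m_i3.
    alloc = [max(d for d in divisors(a * c) if d <= P * w / total)
             for (a, _, c), w in zip(products, work)]
    # Step 2: while processors are idle, try to speed up the slowest product.
    while P - sum(alloc) > 0:
        # alloc[t] never exceeds m_t1 * m_t3, so the time is simply work / alloc.
        i = max(range(len(products)), key=lambda t: work[t] / alloc[t])
        a, _, c = products[i]
        if alloc[i] >= a * c:
            break  # the slowest product cannot use more processors
        nxt = min(d for d in divisors(a * c) if d > alloc[i])
        left = P - sum(alloc) - (nxt - alloc[i])
        if left > 0 or (allow_exact_fit and left == 0):
            alloc[i] = nxt
        else:
            break
    return alloc


jobs = [(2, 3, 8), (4, 4, 3)]
print(dpa_k(jobs, 20))                         # [8, 12]
print(dpa_k(jobs, 20, allow_exact_fit=False))  # [8, 6]
```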




Matrix Chain Scheduling Algorithm - Introduction

Matrix Chain Scheduling Algorithm
The algorithm has three stages.
  1    Find the optimal product sequence by MCOP.
  2    Top-Down Processor Assignment
       Processors are partitioned and assigned to each subtree to balance the
       evaluation times of the two matrix product subchains.
  3    Bottom-Up Concurrent Execution
       This step executes products independently, starting from the leaves, and tries to
       modify the product sequence to enhance concurrency so as to reduce
       the evaluation time of $M$.

The time complexity of the algorithm is $O(n^2 + nP)$.

First Step - Find Optimal Product Sequence by MCOP


       The optimal product sequence can be found in $O(n \log n)$ time
       using a sequential algorithm.
       Many parallel algorithms have been studied that run in polylog time
       on a $P$-processor system.

Notation
       $W[i][j]$: the minimum number of operations for evaluating $L_{i,j}$.
       $S[i][j]$: the matrix index at which the matrix chain
       $(M_i \times \cdots \times M_j)$ is partitioned.
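To make the tables W and S concrete, here is a sketch that fills them with the textbook O(n^3) dynamic program. This is not the O(n log n) sequential algorithm mentioned above, only the simplest way to obtain the same tables; the function name `mcop` and the 1-based layout are choices made for this illustration.

```python
def mcop(dims: list[int]) -> tuple[list[list[int]], list[list[int]]]:
    """Return (W, S) for a chain whose i-th matrix is dims[i-1] x dims[i].

    W[i][j] is the minimum operation count for M_i ... M_j and S[i][j] the
    index k at which that subchain is split into (M_i..M_k)(M_k+1..M_j).
    Both tables are 1-based; row and column 0 are unused padding.
    """
    n = len(dims) - 1
    m = [0] + list(dims)  # M_i is m[i] x m[i+1]
    W = [[0] * (n + 1) for _ in range(n + 1)]
    S = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):            # subchain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            # Cheapest split; ties go to the smallest k.
            W[i][j], S[i][j] = min(
                (W[i][k] + W[k + 1][j] + m[i] * m[k + 1] * m[j + 1], k)
                for k in range(i, j)
            )
    return W, S


W, S = mcop([4, 6, 2, 6, 8])
print(W[1][4], S[1][4])  # 208 2
```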






Second Step - Top-Down Processor Assignment

       When $p_{i,j}$ processors are assigned to $L_{i,j}$:
             $p_{i,j} \times \dfrac{W[i][S[i][j]]}{W[i][S[i][j]] + W[S[i][j]+1][j]}$ processors are assigned to the subtree
             $L_{i,S[i][j]}$.
             $p_{i,j} \times \dfrac{W[S[i][j]+1][j]}{W[i][S[i][j]] + W[S[i][j]+1][j]}$ processors are assigned to the subtree
             $L_{S[i][j]+1,j}$.
       The assignment is applied recursively down the tree.


Second Step - For Example
   For example, consider a chain of 8 matrices with dimensions
   {3, 8, 9, 5, 3, 3, 3, 4} on a 64-processor parallel system.






Third Step - Bottom-Up Concurrent Execution




    When processors are idle during the execution of $L_{i,j}$, this step tries to
    modify the product sequence to use these idle processors.
There are two conditions.
    $y > x + 1$ and the node associated with $M_{y+1}$ has a left child node
    associated with $M_y$ in the sequence tree.
    $y < x - 1$ and the node associated with $M_y$ has a right child node
    associated with $M_{y+1}$ in the sequence tree.






Sequence modification








       If $y > z$ then
            $\Delta C = m_{y+1} m_{y+2} (m_y - m_z) + m_z m_y (m_{y+2} - m_{y+1})$
       If $y < z$ then
            $\Delta C = m_{y+1} m_{y+2} (m_y - m_{z+1}) + m_{z+1} m_y (m_{y+2} - m_{y+1})$
       where $\Delta C$ is the increase in computation caused by modifying the current sequence tree.

Lemma
If a leaf product $(M_x, M_{x+1})$ has a candidate product $(M_y M_{y+1})$ and the
DPA algorithm allocates $p_x$ and $p_y$ processors to the two matrix
products, respectively, then evaluation using the modified sequence
reduces the evaluation time when

$$\Delta C < \min\big(\Phi(m_x, m_{x+1}, m_{x+2}, m_x m_{x+2}) \times (p_x + p_y - m_x m_{x+2}),\; m_y m_{y+1} m_{y+2}\big)$$
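The two ΔC expressions can be sanity-checked against a direct operation count. The sketch below assumes a particular reading of z that is inferred from the formula itself: for y > z the neighbouring operand A = M_z ··· M_{y-1} has size m_z × m_y and the change regroups (A M_y) M_{y+1} into A (M_y M_{y+1}); for y < z the operand B = M_{y+2} ··· M_z has size m_{y+2} × m_{z+1} and M_y (M_{y+1} B) becomes (M_y M_{y+1}) B. The names `delta_c` and `cost` are hypothetical.

```python
def delta_c(m: list[int], y: int, z: int) -> int:
    """Increase in operations for the two cases of the formula (1-based m)."""
    if y > z:
        return (m[y + 1] * m[y + 2] * (m[y] - m[z])
                + m[z] * m[y] * (m[y + 2] - m[y + 1]))
    return (m[y + 1] * m[y + 2] * (m[y] - m[z + 1])
            + m[z + 1] * m[y] * (m[y + 2] - m[y + 1]))


def cost(a: int, b: int, c: int) -> int:
    """Scalar multiplications for an (a x b) by (b x c) product."""
    return a * b * c


m = [0, 3, 8, 9, 5]  # m[1..4]; index 0 is padding

# Case y > z with z = 1, y = 2: A = M_1 is 3 x 8, M_2 is 8 x 9, M_3 is 9 x 5.
before = cost(3, 8, 9) + cost(3, 9, 5)  # (A M_2) M_3
after = cost(8, 9, 5) + cost(3, 8, 5)   # A (M_2 M_3)
print(delta_c(m, 2, 1), after - before)  # 129 129
```

A negative ΔC, as in the mirrored case `delta_c(m, 1, 3)`, means the regrouped sequence needs fewer operations than the original one.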


Candidate Products






Stage-1 MCOP


Two-Pass Matrix Chain Scheduling Algorithm - Stage 1
Stage-1 MCOP
  1   Find the optimal product sequence for the MCOP by using a parallel
      algorithm.
  2   Generate the sequence tree $L$.

      The dynamic programming algorithm - $O(n^2)$ to $O(n^3)$
      The sequential algorithm - $O(n \log n)$
      The parallel algorithm - $O(\log^3 n)$






Stage-2 Top-Down Processor Assignment

Two-Pass Matrix Chain Scheduling Algorithm - Stage 2
Stage-2 Top-Down Processor Assignment
  1    Initialize $i = 1$, $j = n$, $p_{i,j} = P$.
  2    If $i$ is not $S[i][j]$, then allocate $p_{i,j} \times \dfrac{W[i][S[i][j]]}{W[i][S[i][j]] + W[S[i][j]+1][j]}$
       processors to $L_{i,S[i][j]}$.
  3    If $j$ is not $S[i][j] + 1$, then allocate $p_{i,j} \times \dfrac{W[S[i][j]+1][j]}{W[i][S[i][j]] + W[S[i][j]+1][j]}$
       processors to $L_{S[i][j]+1,j}$.
  4    If $i$ is $j - 1$ or $j$, then finish this stage; otherwise, call this algorithm
       recursively, once with $i = i$, $j = S[i][j]$ and once with $i = S[i][j] + 1$, $j = j$.
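The recursion of this stage can be sketched as follows. The sketch takes W and S as dictionaries keyed by (i, j), treats a missing W entry as 0 (a single matrix needs no operations), and keeps the shares as exact fractions because the steps give no rounding rule. These representation choices, and the name `assign_processors`, are assumptions for illustration only.

```python
from fractions import Fraction


def assign_processors(W: dict, S: dict, n: int, P: int) -> dict:
    """Return p[(i, j)], the processors given to each subtree L_{i,j}."""
    p = {(1, n): Fraction(P)}

    def descend(i: int, j: int) -> None:
        if j - i <= 1:
            return  # a single matrix or a leaf product: nothing left to split
        s = S[i, j]
        left = W.get((i, s), 0)
        right = W.get((s + 1, j), 0)
        if i != s:       # the left part is a real product, not a single matrix
            p[i, s] = p[i, j] * Fraction(left, left + right)
        if j != s + 1:   # same test for the right part
            p[s + 1, j] = p[i, j] * Fraction(right, left + right)
        descend(i, s)
        descend(s + 1, j)

    descend(1, n)
    return p


# Chain with dimensions 4, 6, 2, 6, 8: the optimal split is (M1 M2)(M3 M4),
# with W[1][2] = 48 and W[3][4] = 96 operations.
W = {(1, 2): 48, (3, 4): 96}
S = {(1, 4): 2}
shares = assign_processors(W, S, n=4, P=60)
print({k: int(v) for k, v in shares.items()})  # {(1, 4): 60, (1, 2): 20, (3, 4): 40}
```

The right subtree does twice the work of the left one, so it receives twice as many processors.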





Stage-3 Bottom-Up Concurrent Execution


Two-Pass Matrix Chain Scheduling Algorithm - Stage 3
Stage-3 Bottom-Up Concurrent Execution
1. Let $(M_k, M_{k+1})$ be a leaf product and $p_{k,k+1}$ be the number of
   processors allocated to the leaf product. If $p_{k,k+1} < m_k m_{k+2}$, then go
   to 5.
2. Find a candidate product by tracing the ancestors of the leaf product using
   postorder traversal. If there is no such candidate product, go to 5.
3. Let the product $(M_l M_{l+1})$ be a candidate product found by tracing the
   ancestors of the leaf product $(M_k M_{k+1})$. Check whether the candidate
   product satisfies Lemma 2. If not, go to 2.




4. Modify the sequence tree such that the candidate product $(M_l M_{l+1})$
   can be executed concurrently with $(M_k, M_{k+1})$. Reallocate the $p_{k,k+1}$
   processors using the DPA algorithm and go to 1 for each leaf product
   of the two split subtrees.

5. Schedule the leaf product on $\min(p_{i,j}, m_k m_{k+2})$ processors. Set the
   parent of the leaf product as a new leaf product.






Conclusion




   We discussed the matrix chain product problem, the DPA algorithm, and the
   matrix chain scheduling algorithm.
   The paper emphasizes modifying the optimal matrix chain product tree
   for parallel systems.
   I will try to shorten my slides XD.




      yen3 ()            Matrix Chain Scheduling Algorithm   March 4, 2009   30 / 30

Más contenido relacionado

La actualidad más candente

Euclid algorithm and congruence matrix
Euclid algorithm and congruence matrixEuclid algorithm and congruence matrix
Euclid algorithm and congruence matrixJanani S
 
Job sequencing with deadlines(with example)
Job sequencing with deadlines(with example)Job sequencing with deadlines(with example)
Job sequencing with deadlines(with example)Vrinda Sheela
 
Eucledian algorithm for gcd of integers and polynomials
Eucledian algorithm for gcd of integers and polynomialsEucledian algorithm for gcd of integers and polynomials
Eucledian algorithm for gcd of integers and polynomialsSWAMY J S
 
Information and network security 35 the chinese remainder theorem
Information and network security 35 the chinese remainder theoremInformation and network security 35 the chinese remainder theorem
Information and network security 35 the chinese remainder theoremVaibhav Khanna
 
lecture 15
lecture 15lecture 15
lecture 15sajinsc
 
An Efficient Elliptic Curve Cryptography Arithmetic Using Nikhilam Multiplica...
An Efficient Elliptic Curve Cryptography Arithmetic Using Nikhilam Multiplica...An Efficient Elliptic Curve Cryptography Arithmetic Using Nikhilam Multiplica...
An Efficient Elliptic Curve Cryptography Arithmetic Using Nikhilam Multiplica...theijes
 
Ee693 questionshomework
Ee693 questionshomeworkEe693 questionshomework
Ee693 questionshomeworkGopi Saiteja
 
parameterized complexity for graph Motif
parameterized complexity for graph Motifparameterized complexity for graph Motif
parameterized complexity for graph MotifAMR koura
 
Analysis of Electro-Mechanical System
Analysis of Electro-Mechanical SystemAnalysis of Electro-Mechanical System
Analysis of Electro-Mechanical SystemCOMSATS Abbottabad
 
Calculating Mine Probability in Minesweeper
Calculating Mine Probability in MinesweeperCalculating Mine Probability in Minesweeper
Calculating Mine Probability in MinesweeperLukeVideckis
 
Mathematical Modelling of Electrical/Mechanical modellinng in MATLAB
Mathematical Modelling of Electrical/Mechanical modellinng in MATLABMathematical Modelling of Electrical/Mechanical modellinng in MATLAB
Mathematical Modelling of Electrical/Mechanical modellinng in MATLABCOMSATS Abbottabad
 
Smu bca sem 1 winter 2014 assignments
Smu bca sem 1 winter 2014 assignmentsSmu bca sem 1 winter 2014 assignments
Smu bca sem 1 winter 2014 assignmentssmumbahelp
 
Ee693 sept2014quizgt2
Ee693 sept2014quizgt2Ee693 sept2014quizgt2
Ee693 sept2014quizgt2Gopi Saiteja
 
Admission for b.tech
Admission for b.techAdmission for b.tech
Admission for b.techEdhole.com
 
K-means Clustering || Data Mining
K-means Clustering || Data MiningK-means Clustering || Data Mining
K-means Clustering || Data MiningIffat Firozy
 

La actualidad más candente (20)

Math basic
Math basicMath basic
Math basic
 
Euclid algorithm and congruence matrix
Euclid algorithm and congruence matrixEuclid algorithm and congruence matrix
Euclid algorithm and congruence matrix
 
Job sequencing with deadlines(with example)
Job sequencing with deadlines(with example)Job sequencing with deadlines(with example)
Job sequencing with deadlines(with example)
 
Eucledian algorithm for gcd of integers and polynomials
Eucledian algorithm for gcd of integers and polynomialsEucledian algorithm for gcd of integers and polynomials
Eucledian algorithm for gcd of integers and polynomials
 
Information and network security 35 the chinese remainder theorem
Information and network security 35 the chinese remainder theoremInformation and network security 35 the chinese remainder theorem
Information and network security 35 the chinese remainder theorem
 
Modeling quadratic fxns
Modeling quadratic fxnsModeling quadratic fxns
Modeling quadratic fxns
 
AML
AMLAML
AML
 
ikh323-05
ikh323-05ikh323-05
ikh323-05
 
Lecture#9
Lecture#9Lecture#9
Lecture#9
 
lecture 15
lecture 15lecture 15
lecture 15
 
An Efficient Elliptic Curve Cryptography Arithmetic Using Nikhilam Multiplica...
An Efficient Elliptic Curve Cryptography Arithmetic Using Nikhilam Multiplica...An Efficient Elliptic Curve Cryptography Arithmetic Using Nikhilam Multiplica...
An Efficient Elliptic Curve Cryptography Arithmetic Using Nikhilam Multiplica...
 
Ee693 questionshomework
Ee693 questionshomeworkEe693 questionshomework
Ee693 questionshomework
 
parameterized complexity for graph Motif
parameterized complexity for graph Motifparameterized complexity for graph Motif
parameterized complexity for graph Motif
 
Analysis of Electro-Mechanical System
Analysis of Electro-Mechanical SystemAnalysis of Electro-Mechanical System
Analysis of Electro-Mechanical System
 
Calculating Mine Probability in Minesweeper
Calculating Mine Probability in MinesweeperCalculating Mine Probability in Minesweeper
Calculating Mine Probability in Minesweeper
 
Mathematical Modelling of Electrical/Mechanical modellinng in MATLAB
Mathematical Modelling of Electrical/Mechanical modellinng in MATLABMathematical Modelling of Electrical/Mechanical modellinng in MATLAB
Mathematical Modelling of Electrical/Mechanical modellinng in MATLAB
 
Smu bca sem 1 winter 2014 assignments
Smu bca sem 1 winter 2014 assignmentsSmu bca sem 1 winter 2014 assignments
Smu bca sem 1 winter 2014 assignments
 
Ee693 sept2014quizgt2
Ee693 sept2014quizgt2Ee693 sept2014quizgt2
Ee693 sept2014quizgt2
 
Admission for b.tech
Admission for b.techAdmission for b.tech
Admission for b.tech
 
K-means Clustering || Data Mining
K-means Clustering || Data MiningK-means Clustering || Data Mining
K-means Clustering || Data Mining
 

Similar a Matrix Chain Scheduling Algorithm

Introduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptxIntroduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptxPJS KUMAR
 
Design and Implementation of Parallel and Randomized Approximation Algorithms
Design and Implementation of Parallel and Randomized Approximation AlgorithmsDesign and Implementation of Parallel and Randomized Approximation Algorithms
Design and Implementation of Parallel and Randomized Approximation AlgorithmsAjay Bidyarthy
 
19. algorithms and-complexity
19. algorithms and-complexity19. algorithms and-complexity
19. algorithms and-complexityshowkat27
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Comparisons
ComparisonsComparisons
ComparisonsZifirio
 
Ijarcet vol-2-issue-3-904-915
Ijarcet vol-2-issue-3-904-915Ijarcet vol-2-issue-3-904-915
Ijarcet vol-2-issue-3-904-915Editor IJARCET
 

Similar a Matrix Chain Scheduling Algorithm (20)

Introduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptxIntroduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptx
 
Design and Implementation of Parallel and Randomized Approximation Algorithms
Design and Implementation of Parallel and Randomized Approximation AlgorithmsDesign and Implementation of Parallel and Randomized Approximation Algorithms
Design and Implementation of Parallel and Randomized Approximation Algorithms
 
19. algorithms and-complexity
19. algorithms and-complexity19. algorithms and-complexity
19. algorithms and-complexity
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Comparisons
ComparisonsComparisons
Comparisons
 
Ijarcet vol-2-issue-3-904-915
Ijarcet vol-2-issue-3-904-915Ijarcet vol-2-issue-3-904-915
Ijarcet vol-2-issue-3-904-915
 
Smart Multitask Bregman Clustering
Smart Multitask Bregman ClusteringSmart Multitask Bregman Clustering
Smart Multitask Bregman Clustering
 
ECE 565 FInal Project
ECE 565 FInal ProjectECE 565 FInal Project
ECE 565 FInal Project
 
Clustering-beamer.pdf
Clustering-beamer.pdfClustering-beamer.pdf
Clustering-beamer.pdf
 

Matrix Chain Scheduling Algorithm

  • 1. Matrix Chain Scheduling Algorithm yen3 March 4, 2009 yen3 () Matrix Chain Scheduling Algorithm March 4, 2009 1 / 30
  • 2. Outline: (1) Introduction: About the paper, The Problem, Notation, Problem Description. (2) Processor Allocation for Matrix Products: Processor Allocation, DPA Algorithm. (3) Matrix Chain Scheduling Algorithm: Matrix Chain Scheduling Algorithm, Two-Pass Matrix Chain Scheduling Algorithm. (4) Conclusion.
  • 3. About the paper: Processor Allocation and Task Scheduling of Matrix Chain Products on Parallel Systems (Parallelizing matrix chain products). Heejo Lee, Jong Kim, Sungje Hong, Sunggu Lee. Dept. of Computer Science and Engineering, Pohang University of Science and Technology, Korea. Technical Report CS-HPC-97-003, April 2003 (December 1997).
  • 4. The Problem. MCOP: the Matrix Chain Ordering Problem. MCSP: the Matrix Chain Scheduling Problem. The paper introduces a processor scheduling algorithm for MCSP that attempts to minimize the evaluation time of a chain of matrix products on a parallel computer, even at the expense of a slight increase in the total number of operations.
  • 5. Notation: a lot of symbols... Orz. $P$: the number of processors in a parallel system. $M$: a matrix chain product with $n$ matrices, i.e., $M = M_1 \times M_2 \times \cdots \times M_n$. $M_i$: an $m_i \times m_{i+1}$ matrix ($m_i \ge 1$, $1 \le i \le n$). $L$: a product sequence tree for the matrix chain $M$. $L_{i,j}$: the sequence subtree of $L$ for $(M_i \times \cdots \times M_j)$.
  • 6. Notation: a lot of symbols (cont'd). $C$: the minimum amount of computation for evaluating $M$. $\Delta C$: the increase in computation caused by modifying the current sequence tree. $p_{i,j}$: the number of processors assigned for evaluating $(M_i \times \cdots \times M_j)$. $(m_i, m_j, m_k)$: a single matrix product multiplying an $m_i \times m_j$ matrix by an $m_j \times m_k$ matrix. $\Phi(m_i, m_j, m_k, p)$: the execution time of the single matrix product $(m_i, m_j, m_k)$ when $p$ processors are allocated. $D(x)$: the set of divisors of $x$, i.e., $D(x) = \{d \mid d \text{ divides } x\}$.
  • 7. Notation: a lot of symbols, new in 2003! $LD(x, y)$: the largest divisor in $D(x)$ that is not larger than $y$. $SD(x, y)$: if $x > y$, the smallest divisor in $D(x)$ that is larger than $y$; otherwise, $SD(x, y) = x$. $m$: the largest dimension among all of the matrices, i.e., $m = \max_{1 \le i \le n+1} m_i$.
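The divisor notation above can be made concrete with a minimal Python sketch. The function names `divisors`, `ld`, and `sd` are illustrative choices, not names from the paper, and the brute-force divisor scan is only meant for small inputs.

```python
def divisors(x: int) -> list[int]:
    """Return D(x), the sorted list of positive divisors of x."""
    return [d for d in range(1, x + 1) if x % d == 0]


def ld(x: int, y: float) -> int:
    """LD(x, y): the largest divisor of x that is not larger than y.

    Assumes y >= 1, so that the divisor 1 always qualifies.
    """
    return max(d for d in divisors(x) if d <= y)


def sd(x: int, y: float) -> int:
    """SD(x, y): the smallest divisor of x larger than y if x > y, else x."""
    if x <= y:
        return x
    # x itself is a divisor larger than y, so the minimum always exists.
    return min(d for d in divisors(x) if d > y)
```

For instance, with x = 16 and y = 10 the two functions bracket y by the divisors 8 and 16.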
  • 8. The Problem: find the optimal schedule, i.e. the one with minimum evaluation time, of $M = M_1 \times M_2 \times \cdots \times M_n$ on a $P$-processor parallel system. The parallel matrix multiplication algorithm assumes that, when multiplying $A$ by $B$ with $p$ processors, the execution time $\Phi(m_i, m_j, m_k, p)$ can be approximated as follows:
    $$\Phi(m_i, m_j, m_k, p) \approx \begin{cases} \dfrac{m_i m_j m_k}{p} & \text{if } 1 \le p \le m_i m_k \\[2ex] \dfrac{m_i m_j m_k}{p} + \log\!\left(\dfrac{p}{m_i m_k}\right) & \text{if } m_i m_k < p \le m_i m_j m_k \end{cases}$$
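As a rough illustration of the execution-time approximation on this slide, here is a small Python sketch. It reads the second case as $m_i m_j m_k / p$ plus a $\log(p / (m_i m_k))$ term and assumes a base-2 logarithm; both are assumptions, since the slide does not state the base and the extracted formula is hard to read. The name `phi` is illustrative.

```python
import math


def phi(mi: int, mj: int, mk: int, p: int) -> float:
    """Approximate time of the product (mi, mj, mk) on p processors."""
    work = mi * mj * mk
    if not 1 <= p <= work:
        raise ValueError("p must satisfy 1 <= p <= mi*mj*mk")
    if p <= mi * mk:
        # At most one processor per output element: perfect speed-up.
        return work / p
    # More processors than output elements: assumed extra log term
    # (base 2 is an assumption of this sketch).
    return work / p + math.log2(p / (mi * mk))
```

With $p = m_i m_j m_k$ the value reduces to $1 + \log m_j$, which is consistent with the $\log(m_j)$ cost used in the complexity discussion of the deck.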
  • 9. The Problem (cont'd). A matrix chain product can be divided into two parts, which are evaluated either one after the other on all $p_{i,j}$ processors or concurrently on $p_{i,k}$ and $p_{k+1,j}$ processors. The time is
    $$T_{i,j} \approx \min_{i \le k < j} \Big( T_{i,k}(p_{i,j}) + T_{k+1,j}(p_{i,j}) + \Phi(m_i, m_{k+1}, m_{j+1}, p_{i,j}),\ \max\big(T_{i,k}(p_{i,k}),\, T_{k+1,j}(p_{k+1,j})\big) + \Phi(m_i, m_{k+1}, m_{j+1}, p_{i,j}) \Big)$$
    So the paper defines MCSP as follows: find the product sequence for evaluating a chain of matrix products, and the processor schedule for that sequence, such that the evaluation time is minimized on a parallel system.
  • 10. MCSP Complexity. The paper assumes that each matrix multiplication $(m_i \times m_j) \times (m_j \times m_k)$ has $\log(m_j)$ time complexity when $m_i m_j m_k$ processors are available. The problem can then be presented as
    $$T_{i,j} = \begin{cases} \min_{i \le k < j} \big\{ \max(T_{i,k}, T_{k+1,j}) + \log(m_{k+1}) \big\} & 1 \le i < j \le n \\ 0 & i = j,\ 1 \le i \le n \end{cases}$$
    but MCSP is NP-hard.
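The recurrence on this slide can be evaluated directly by memoized recursion. The sketch below does only that: it evaluates the idealized recurrence in which every product has as many processors as it wants, and it does not solve the general MCSP with a bounded number of processors. The function name and the base-2 logarithm are assumptions of this sketch.

```python
import math
from functools import lru_cache


def min_parallel_time(m: list[int]) -> float:
    """Evaluate T(1, n) for dimensions m = [m_1, ..., m_{n+1}]."""
    n = len(m) - 1

    @lru_cache(maxsize=None)
    def t(i: int, j: int) -> float:
        if i == j:
            return 0.0
        # m_{k+1} in the 1-based notation of the slides is m[k] here.
        return min(
            max(t(i, k), t(k + 1, j)) + math.log2(m[k])
            for k in range(i, j)
        )

    return t(1, n)
```

For four matrices whose inner dimensions are all 4, the balanced split wins: both halves finish in 2 time units and the final product adds 2 more.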
  • 12. Concurrent computation of multiple matrix products. Proportional allocation: the algorithm allocates processors in proportion to the computation amount of each task, trying to minimize the completion time of all tasks by balancing the execution time of each task. For example, suppose we have two matrix multiplications, A = (2, 3, 8) and B = (4, 4, 3), and 20 processors. Only a divisor of $m_i m_k$ processors can actually be put to use, so 10 processors behave like 8 for A and like 6 for B. If we allocate 10 processors to each multiplication, we get $D_A(2, 3, 8, 10) = D(2, 3, 8, 8) = 6$ unit time and $D_B(4, 4, 3, 10) = D(4, 4, 3, 6) = 8$ unit time, so $D(D_A, D_B) = 8$ unit time. If we allocate 8 processors to A and 12 processors to B, we get $D_A(2, 3, 8, 8) = D(2, 3, 8, 8) = 6$ unit time and $D_B(4, 4, 3, 12) = D(4, 4, 3, 12) = 4$ unit time, so $D(D_A, D_B) = 6$ unit time.
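The arithmetic of this example can be checked with a short Python sketch. It assumes, as the numbers on the slide suggest, that a product $(m_i, m_j, m_k)$ given $p$ processors effectively uses the largest divisor of $m_i m_k$ not exceeding $p$; the helper names are hypothetical, and the sketch ignores the case of more than $m_i m_k$ processors per product.

```python
def usable(mi: int, mk: int, p: int) -> int:
    """Largest divisor of mi*mk that does not exceed p (p >= 1)."""
    cells = mi * mk
    return max(d for d in range(1, cells + 1) if cells % d == 0 and d <= p)


def discrete_time(product: tuple[int, int, int], p: int) -> float:
    """Execution time when only a divisor number of processors is used."""
    mi, mj, mk = product
    return mi * mj * mk / usable(mi, mk, p)


def best_split(a, b, total: int):
    """Try every split of `total` processors; return (time, p_a, p_b)."""
    return min(
        (max(discrete_time(a, pa), discrete_time(b, total - pa)), pa, total - pa)
        for pa in range(1, total)
    )
```

Calling `best_split((2, 3, 8), (4, 4, 3), 20)` searches all 19 splits exhaustively, which is fine for a toy example but is not how the DPA algorithm finds its answer.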
  • 13. DPA for two matrix products, $O(P)$ time. Discrete Processor Allocation for Two Matrix Products (DPA). Input: two matrix products $X = (m_x, m_{x+1}, m_{x+2})$ and $Y = (m_y, m_{y+1}, m_{y+2})$ and a set of $P$ processors. Output: the numbers of processors allocated to the matrix products $X$ and $Y$, denoted $P_x$ and $P_y$, which satisfy $1 \le P_x, P_y \le P$ and $P_x + P_y \le P$.
  • 17. DPA for two matrix products, $O(P)$ time. Discrete Processor Allocation for Two Matrix Products (DPA): (1) Set $P_{prop} = \dfrac{m_x m_{x+1} m_{x+2}}{m_x m_{x+1} m_{x+2} + m_y m_{y+1} m_{y+2}} P$. (2) Find $d_{x,i}$ in $D(m_x m_{x+2})$ which satisfies $d_{x,i} \le P_{prop}$. (3) Find $d_{y,j}$ in $D(m_y m_{y+2})$ which satisfies $d_{y,j} \le P - P_{prop}$. (4) If $\Phi(m_x, m_{x+1}, m_{x+2}, d_{x,i}) < \Phi(m_y, m_{y+1}, m_{y+2}, d_{y,j})$, then $P_x = d_{x,i}$, $P_y = P - d_{x,i}$. Otherwise $P_x = P - d_{y,j}$, $P_y = d_{y,j}$.
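The four DPA steps can be sketched in Python as follows. This is a reading of the slide rather than the paper's reference implementation: it takes the largest qualifying divisor in steps 2 and 3, clamps $P_{prop}$ to the range $[1, P-1]$ so that both divisor searches succeed (an assumption the slide does not state), and uses only the first case of $\Phi$, which is valid because the chosen divisors never exceed $m_i m_k$. All names are illustrative.

```python
def largest_divisor_at_most(x: int, y: float) -> int:
    """Largest divisor of x that does not exceed y (requires y >= 1)."""
    return max(d for d in range(1, x + 1) if x % d == 0 and d <= y)


def phi(product: tuple[int, int, int], p: int) -> float:
    """Execution time for p <= mi*mk processors (first case of Phi only)."""
    mi, mj, mk = product
    return mi * mj * mk / p


def dpa(x: tuple[int, int, int], y: tuple[int, int, int], total: int):
    """Return (P_x, P_y) for two products on `total` >= 2 processors."""
    work_x = x[0] * x[1] * x[2]
    work_y = y[0] * y[1] * y[2]
    # Step 1: proportional share, clamped so each side gets at least 1
    # (the clamping is an assumption of this sketch).
    p_prop = min(max(total * work_x / (work_x + work_y), 1), total - 1)
    # Steps 2 and 3: snap each share down to a usable divisor.
    d_x = largest_divisor_at_most(x[0] * x[2], p_prop)
    d_y = largest_divisor_at_most(y[0] * y[2], total - p_prop)
    # Step 4: the faster product keeps its share, the slower gets the rest.
    if phi(x, d_x) < phi(y, d_y):
        return d_x, total - d_x
    return total - d_y, d_y
```

On the two products (2, 3, 8) and (4, 4, 3) with 20 processors, this gives 8 processors to the first product and 12 to the second, the same split as the worked example in the deck.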
  • 18. DPA description in 2003, with $X = (m_x, m_{x+1}, m_{x+2})$ and $Y = (m_y, m_{y+1}, m_{y+2})$: (1) $P_{prop} = P \times \dfrac{m_x m_{x+1} m_{x+2}}{m_x m_{x+1} m_{x+2} + m_y m_{y+1} m_{y+2}}$. (2) $d_i = LD(m_x m_{x+2}, P_{prop})$. (3) $d_{i+1} = SD(m_x m_{x+2}, P_{prop})$. (4) $d_j = LD(m_y m_{y+2}, P - P_{prop})$. (5) $d_{j+1} = SD(m_y m_{y+2}, P - P_{prop})$. (6) If $\Phi(X, d_i) \ge \Phi(Y, d_j)$ then: if $\max(\Phi(X, d_{i+1}), \Phi(Y, LD(m_y m_{y+2}, P - d_{i+1}))) < \Phi(X, d_i)$ then $P_x = d_{i+1}$, $P_y = LD(m_y m_{y+2}, P - d_{i+1})$; else $P_x = d_i$, $P_y = LD(m_y m_{y+2}, P - d_i)$. Otherwise: if $\max(\Phi(X, LD(m_x m_{x+2}, P - d_{j+1})), \Phi(Y, d_{j+1})) < \Phi(Y, d_j)$ then $P_x = LD(m_x m_{x+2}, P - d_{j+1})$, $P_y = d_{j+1}$; else $P_x = LD(m_x m_{x+2}, P - d_j)$, $P_y = d_j$.
  • 19. DPA for two matrix products (cont'd). The naïve algorithm's time complexity is $O(P)$, where $P$ is the number of processors (the major cost is in steps 2 and 3). The completion time of two matrix products when processors are allocated by the DPA algorithm is shorter than or equal to that under proportional allocation. DPA guarantees the minimum completion time for two matrix products. DPA can easily be extended to $k$ independent matrix products.
  • 20. DPA for $k$ matrix products. DPA-k guarantees the minimum completion time of $k$ independent matrix products. Discrete Processor Allocation for k Matrix Products (DPA-k). Input: $k$ matrix products $X_i = (m_{i,1}, m_{i,2}, m_{i,3})$ for $1 \le i \le k$, given on $P$ processors ($k \le P$). Output: the number of allocated processors $P_i$ for each matrix product $X_i$, which satisfies $\sum_{i=1}^{k} P_i \le P$.
  • 22. DPA for $k$ matrix products (cont'd). Discrete Processor Allocation for k Matrix Products (DPA-k): (1) For $i = 1$ to $k$ do: (1.1) $P_{prop,i} = \dfrac{m_{i,1} m_{i,2} m_{i,3}}{\sum_{j=1}^{k} m_{j,1} m_{j,2} m_{j,3}} P$; (1.2) find the maximum $d_{i,j}$ in $D(m_{i,1} m_{i,3})$ which satisfies $d_{i,j} \le P_{prop,i}$; (1.3) let $P_i = d_{i,j}$. (2) While $P - \sum_{i=1}^{k} P_i > 0$ do: (2.1) find the product $X_i$ with the maximum $\Phi(m_{i,1}, m_{i,2}, m_{i,3}, P_i)$; (2.2) if $P_i < m_{i,1} m_{i,3}$ then find the minimum $d_{i,j}$ in $D(m_{i,1} m_{i,3})$ which satisfies $d_{i,j} > P_i$, and if $P - \sum_{i=1}^{k} P_i - (d_{i,j} - P_i) > 0$ then set $P_i = d_{i,j}$, otherwise stop the algorithm; else stop the algorithm.
  • 26. Matrix Chain Scheduling Algorithm: Introduction. The algorithm has three stages. (1) Find the optimal product sequence by MCOP. (2) Top-Down Processor Assignment: processors are partitioned and assigned to each subtree to balance the evaluation time of both matrix product chains. (3) Bottom-Up Concurrent Execution: this step executes products independently, starting from the leaves, and tries to modify the product sequence to enhance concurrency so as to reduce the evaluation time of $M$. The algorithm's time complexity is $O(n^2 + nP)$.
  • 28. First step: find the optimal product sequence by MCOP. The optimal product sequence can be found in $O(n \log n)$ time using a sequential algorithm. Many parallel algorithms have been studied that run in polylog time on a $P$-processor system. Notation: $W[i][j]$ is the minimum number of operations for evaluating $L_{i,j}$; $S[i][j]$ is the matrix index for partitioning the matrix chain $(M_i \times \cdots \times M_j)$.
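To make the $W$ and $S$ tables concrete, here is a sketch that fills them with the textbook $O(n^3)$ dynamic program. This is a deliberately simpler substitute for the $O(n \log n)$ sequential algorithm mentioned on the slide, not that algorithm. The function name is illustrative, and indices are 1-based to match the slides (row and column 0 are unused).

```python
def matrix_chain_order(m: list[int]):
    """Return (W, S) for dimensions m = [m_1, ..., m_{n+1}].

    W[i][j] is the minimum number of scalar multiplications for
    M_i x ... x M_j; S[i][j] is the index of the last matrix in the
    left part of the best split.
    """
    n = len(m) - 1
    W = [[0] * (n + 1) for _ in range(n + 1)]
    S = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):           # chain length
        for i in range(1, n - length + 2):   # chain start
            j = i + length - 1               # chain end
            # The final product (m_i, m_{k+1}, m_{j+1}) is
            # m[i-1] * m[k] * m[j] with the 0-based list m.
            W[i][j], S[i][j] = min(
                (W[i][k] + W[k + 1][j] + m[i - 1] * m[k] * m[j], k)
                for k in range(i, j)
            )
    return W, S
```

For three matrices of sizes 10×30, 30×5, and 5×60, the tables record that multiplying the first two matrices first is cheapest.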
  • 29. Second step: Top-Down Processor Assignment. Suppose $p_{i,j}$ processors were assigned to $L_{i,j}$. Then $p_{i,j} \times \dfrac{W[i][S[i][j]]}{W[i][S[i][j]] + W[S[i][j]+1][j]}$ processors are assigned to the subtree $L_{i,S[i][j]}$, and $p_{i,j} \times \dfrac{W[S[i][j]+1][j]}{W[i][S[i][j]] + W[S[i][j]+1][j]}$ processors are assigned to the subtree $L_{S[i][j]+1,j}$. The assignment proceeds recursively down the tree.
  • 30. Second step, example: given a chain of 8 matrices with dimensions {3, 8, 9, 5, 3, 3, 3, 4} on a 64-processor parallel system.
  • 31. Third step: Bottom-Up Concurrent Execution. When there are idle processors in the execution of $L_{i,j}$, this step tries to modify the product sequence to use them. There are two conditions: (a) $y > x + 1$ and the node associated with $M_{y+1}$ has a left child node associated with $M_y$ in the sequence tree; (b) $y < x - 1$ and the node associated with $M_y$ has a right child node associated with $M_{y+1}$ in the sequence tree.
  • 32. Sequence modification
  • 33. If $y > z$ then $\Delta C = m_{y+1} m_{y+2} (m_y - m_z) + m_z m_y (m_{y+2} - m_{y+1})$. If $y < z$ then $\Delta C = m_{y+1} m_{y+2} (m_y - m_{z+1}) + m_{z+1} m_y (m_{y+2} - m_{y+1})$. Lemma: if a leaf product $(M_x, M_{x+1})$ has a candidate product $(M_y, M_{y+1})$ and the DPA algorithm allocates $p_x$ and $p_y$ processors to the two matrix products, respectively, then evaluation using the modified sequence reduces the evaluation time when
    $$\Delta C < \min\big(\Phi(m_x, m_{x+1}, m_{x+2}, m_x m_{x+2}) \times (p_x + p_y - m_x m_{x+2}),\ m_y m_{y+1} m_{y+2}\big)$$
  • 34. Candidate Products
  • 35. Two-Pass Matrix Chain Scheduling Algorithm, Stage 1: MCOP. (1) Find the optimal product sequence for the MCOP by using a parallel algorithm. (2) Generate the sequence tree $L$. Complexities: the dynamic programming algorithm, $O(n^2)$ to $O(n^3)$; the sequential algorithm, $O(n \log n)$; the parallel algorithm, $O(\log^3 n)$.
  • 36. Two-Pass Matrix Chain Scheduling Algorithm, Stage 2: Top-Down Processor Assignment. (1) Initialize $i = 1$, $j = n$, $p_{i,j} = P$. (2) If $i$ is not $S[i][j]$, then allocate $p_{i,j} \times \dfrac{W[i][S[i][j]]}{W[i][S[i][j]] + W[S[i][j]+1][j]}$ processors to $L_{i,S[i][j]}$. (3) If $j$ is not $S[i][j] + 1$, then allocate $p_{i,j} \times \dfrac{W[S[i][j]+1][j]}{W[i][S[i][j]] + W[S[i][j]+1][j]}$ processors to $L_{S[i][j]+1,j}$. (4) If $j$ is $i + 1$ or $i$, then finish this stage; otherwise, call this algorithm recursively, once with $(i, S[i][j])$ and once with $(S[i][j] + 1, j)$.
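The top-down assignment can be sketched as a short recursive function over precomputed $W$ and $S$ tables (1-based, as in the slides). Two choices here are assumptions rather than details from the deck: the recursion stops when the subchain has at most two matrices, and processor counts are kept as fractional values because the slide does not say how they are rounded. The function name is hypothetical.

```python
def assign_processors(W, S, n: int, total: int) -> dict:
    """Return {(i, j): processors} for every multi-matrix subtree L_{i,j}."""
    p = {(1, n): float(total)}

    def recurse(i: int, j: int) -> None:
        if j - i <= 1:
            # One matrix or a single leaf product: nothing to split.
            return
        s = S[i][j]
        left, right = W[i][s], W[s + 1][j]
        if i != s:       # left part is itself a product chain
            p[(i, s)] = p[(i, j)] * left / (left + right)
        if j != s + 1:   # right part is itself a product chain
            p[(s + 1, j)] = p[(i, j)] * right / (left + right)
        recurse(i, s)
        recurse(s + 1, j)

    recurse(1, n)
    return p
```

A subtree whose two halves cost the same number of operations splits its processors evenly, while a lone matrix on one side leaves all processors with the other side.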
  • 37. Two-Pass Matrix Chain Scheduling Algorithm, Stage 3: Bottom-Up Concurrent Execution. (1) Let $(M_k, M_{k+1})$ be a leaf product and $p_{k,k+1}$ be the number of processors allocated to the leaf product. If $p_{k,k+1} < m_k m_{k+2}$, then go to 5. (2) Find a candidate product by tracing ancestors of the leaf product using postorder traversal. If there is no such candidate product, go to 5. (3) Let the product $(M_l, M_{l+1})$ be a candidate product found by tracing ancestors of the leaf product $(M_k, M_{k+1})$. Check whether the candidate product satisfies Lemma 2. If not, go to 2.
  • 38. Two-Pass Matrix Chain Scheduling Algorithm, Stage 3: Bottom-Up Concurrent Execution (cont'd). (4) Modify the sequence tree such that the candidate product $(M_l, M_{l+1})$ can be executed concurrently with $(M_k, M_{k+1})$. Reallocate the $p_{k,k+1}$ processors using the DPA algorithm and go to 1 for each leaf product of the two split subtrees. (5) Schedule the leaf product on $\min(p_{i,j}, m_k m_{k+2})$ processors. Set the parent of the leaf product as a new leaf product.
  • 39. Conclusion. We discussed the matrix chain products problem, the DPA algorithm, and the matrix chain scheduling algorithm. The paper emphasizes modifying the optimal matrix chain product tree for execution on a parallel system. I will try to reduce my slides XD.