In the past decade, high-performance cluster computing platforms have been widely used to solve challenging and rigorous engineering problems in industry and scientific applications. Due to extremely high energy costs, reducing energy consumption has become a major
concern in designing economical and environmentally friendly cluster computing
infrastructures for many high-performance applications. The primary focus of this talk is to illustrate how to improve the energy efficiency of clusters and storage systems without significantly degrading performance. In this talk, we will first describe a general architecture
for building energy-efficient cluster computing platforms. Then, we will outline several energy-efficient scheduling algorithms designed for high-performance clusters and large-scale storage systems. Experimental results using both synthetic and real-world applications
show that energy dissipation in clusters can be reduced with only a marginal degradation of system performance.
Energy-efficient resource management for high-performance clusters
1. Energy Efficient Scheduling for High-Performance Clusters. Ziliang Zong, Texas State University; Adam Manzanares, Los Alamos National Lab; Xiao Qin, Auburn University
2. Where is Auburn University? Ph.D. '04, U. of Nebraska-Lincoln; 2004-07, New Mexico Tech; 2007-now, Auburn University
7. Investigators. Ziliang Zong, Ph.D., Assistant Professor, Texas State University; Adam Manzanares, Ph.D. Candidate, Los Alamos National Lab; Xiao Qin, Ph.D., Associate Professor, Auburn University. 2011/6/22
18. Motivational Example: an example of duplication. Gantt charts (not recoverable from the extracted text) compare three schedules of tasks T1-T4: Linear Schedule, time 39 s; No Duplication Schedule (NDS), time 32 s; Task Duplication Schedule (TDS), time 29 s.
19. Motivational Example (cont.): the same three schedules annotated with per-task (time, energy) pairs, assuming CPU_Energy = 6 W and Network_Energy = 1 W. Linear Schedule: time 39 s, energy 234 J. No Duplication Schedule (MCP): time 32 s, energy 242 J. Task Duplication Schedule (TDS): time 29 s, energy 284 J.
20. Motivational Example (cont.): the energy cost of duplicating T1 is 48 J on the CPU side and -6 J on the network side, 42 J in total. The performance benefit of duplicating T1 is 6 s, so the energy-performance tradeoff ratio is 42/6 = 7. With a threshold of 10, EAD does not duplicate T1 (42 J > 10 J) while PEBD does (ratio 7 <= 10). EAD: time 32 s, energy 242 J. PEBD: time 29 s, energy 284 J.
21. Basic Steps of Energy-Aware Scheduling Algorithm Implementation. Step 1: DAG Generation. Task description: task set {T1, T2, ..., T9, T10}; T1 is the entry task and T10 is the exit task; T2, T3, and T4 cannot start until T1 finishes; T5 and T6 cannot start until T2 finishes; T7 cannot start until both T3 and T4 finish; T8 cannot start until both T5 and T6 finish; T9 cannot start until both T6 and T7 finish; T10 cannot start until both T8 and T9 finish.
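As a minimal sketch (not the authors' implementation), the task description above can be encoded as a predecessor map, from which the DAG's precedence constraints follow directly:

```python
# Sketch of Step 1: the example task set as a predecessor map.
# A task may start only after all of its predecessors have finished.
predecessors = {
    "T1": [],             # entry task
    "T2": ["T1"],
    "T3": ["T1"],
    "T4": ["T1"],
    "T5": ["T2"],
    "T6": ["T2"],
    "T7": ["T3", "T4"],
    "T8": ["T5", "T6"],
    "T9": ["T6", "T7"],
    "T10": ["T8", "T9"],  # exit task
}

def ready_tasks(finished):
    """Tasks not yet finished whose predecessors have all finished."""
    return [t for t, preds in predecessors.items()
            if t not in finished and all(p in finished for p in preds)]

print(ready_tasks({"T1"}))  # ['T2', 'T3', 'T4']
```

For instance, once T1 finishes, exactly T2, T3, and T4 become ready, matching the description.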
22. Basic Steps of Energy-Aware Scheduling Algorithm Implementation. Step 2: Parameters Calculation. The parameters are: total execution time from the current task to the exit task (the level); earliest start time; earliest completion time; latest allowable start time; latest allowable completion time; favorite predecessor.
25. The EAD and PEBD Algorithms (flowchart). Both algorithms: generate the DAG of the given task sets; find all critical paths in the DAG; generate the scheduling queue by level in ascending order; select the unscheduled task with the lowest level as the starting task; and, for each task on the same critical path as the starting task, check whether it is already scheduled and whether duplicating it would save time. Tasks on the same critical path are allocated to the same processor, and the walk stops once the entry task is met. The two algorithms differ only in the duplication test: EAD duplicates a task if the energy increase is at most the threshold, while PEBD computes the ratio energy increase / time decrease and duplicates if the ratio is at most the threshold.
29. Impact of CPU Power Dissipation. Impact of CPU types: 19.4% vs. 3.7% energy savings. Figures: energy consumption for different processors (Gaussian, CCR = 0.4) and for different processors (FFT, CCR = 0.4).
30. Impact of Interconnect Power Dissipation. Impact of interconnection types: 16.7% and 13.3% (Myrinet) vs. 5% and 3.1% (Infiniband). Figures: energy consumption (Robot Control, Myrinet) and energy consumption (Robot Control, Infiniband).
31. Parallelism Degrees. Impact of application parallelism: 17% and 15.8% (Robot Control) vs. 6.9% and 5.4% (Sparse Matrix). Figures: energy consumption of Sparse Matrix (Myrinet) and of Robot Control (Myrinet).
High-performance computing platforms have been widely deployed for intensive data processing and data storage. Their impact can be found in almost every domain: financial services, scientific computing, bioinformatics, computational chemistry, and weather forecasting.
This slide shows a typical high-performance computing platform, built by Google in Oregon. There is no doubt that these platforms have significantly changed our lives, and we all benefit from the great services they provide. However, these giant machines consume a huge amount of energy.
This figure comes from the Environmental Protection Agency's report submitted to Congress last year. According to the report, the total power usage of servers and data centers in the United States was 61.4 billion kWh in 2006, more than double the energy used for the same purpose in 2000. Looking at the trend from 2000 to 2006, the energy consumed by servers and data centers rapidly increased from 28.2 billion kWh all the way up to 61.4 billion kWh.
Even worse, the EPA predicts that the power usage of servers and data centers will double again within 5 years if historical trends continue. Even if we follow the current efficiency trends, the power usage will exceed 100 billion kWh in 2011. This is a huge amount of energy.
However, most previous research focused primarily on the performance, security, and reliability of high-performance computing platforms; the energy consumption issue was largely ignored. The energy problem has now become so serious that I believe it is time for us to highlight energy-efficiency research for high-performance computing platforms.
In our architecture, we have four layers: the application layer, middleware layer, resource layer, and network layer. In each layer, we can incorporate energy-aware techniques. For example, in the application layer, we can reduce unnecessary hardware accesses when writing code. In the middleware layer, we can schedule parallel tasks in more energy-efficient ways. In the resource and network layers, we can perform energy-aware resource management.
This slide shows some typical hardware in the resource and network layers, such as CPUs, main boards, storage disks, network adapters, switches, and routers.
One thing I would like to emphasize here is that energy-oriented research should not sacrifice other important characteristics like performance, reliability, or security. Although there will be some tradeoff once we introduce energy-aware techniques, we do not want to see significant degradation in these other characteristics. In other words, we would like our research to remain compatible with existing techniques. My research mainly focuses on the tradeoff between performance and energy.
Before we talk about the algorithms, let's look at cluster systems first. In a cluster, we have a master node and slave nodes. The master node is responsible for scheduling tasks and allocating them to slave nodes for parallel execution. All slave nodes are connected by a high-speed interconnect, and they communicate with each other through message passing.
The parallel tasks running on clusters are represented using a Directed Acyclic Graph, or DAG for short. Usually, a DAG has one entry task and one or multiple exit tasks. The DAG shows the task number and the execution time of each task. It also shows the dependences and communication times among tasks. Explain a little bit…
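As a hedged sketch of this representation (the numbers here are illustrative, not the talk's benchmark data), a DAG can be stored as a node-weight map for execution times plus an edge-weight map for communication times:

```python
# Illustrative weighted DAG: node weight = execution time (s),
# edge weight = communication time (s) between dependent tasks.
# All numbers are made up for this sketch.
exec_time = {"T1": 8, "T2": 15, "T3": 10, "T4": 6}
comm_time = {  # (parent, child) -> message transfer time
    ("T1", "T2"): 6,
    ("T1", "T3"): 2,
    ("T2", "T4"): 4,
    ("T3", "T4"): 2,
}

# The successor list of each task follows from the edge set:
children = {}
for (u, v) in comm_time:
    children.setdefault(u, []).append(v)

print(children["T1"])  # ['T2', 'T3']
```

Here T1 is the entry task (no incoming edges) and T4 is the exit task (no outgoing edges); the scheduler reads both maps when computing schedule lengths.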
Weakness 1: existing approaches do not consider energy conservation in memory. Weakness 2: energy cannot be conserved even when the network interconnects are idle. In order to improve performance, we use a duplication strategy. This slide shows why duplication can improve performance. Here we have 4 tasks represented by the DAG on the left side. If we use linear scheduling, all four tasks will be allocated to 1 CPU and the execution time will be 39 s. However, we noticed that we can schedule task 2 on the 2nd CPU so that we do not need to wait for the completion of task 3. In that way, the total time is shortened to 32 s. We also noticed that 6 s are wasted on the 2nd CPU because task 2 has to wait for the message from task 1. If we duplicate task 1 on the 2nd CPU, we can further shorten the schedule length to 29 s. Obviously, duplication can improve performance.
However, if we calculate the energy, we will find that duplication may consume more energy. For example, if we set the power consumption of the CPU and network to 6 W and 1 W, the total energy consumption with duplication will be 42 J more than NDS and 50 J more than the linear schedule. That is mainly because task 1 is executed twice. Here I would like to mention that I will use NDS (MCP) to represent the no-duplication schedule and TDS to represent the task-duplication schedule. You will see them a lot in the simulation results.
So we have to consider the tradeoff between performance and energy consumption. We propose two algorithms for this tradeoff. One is called energy-aware duplication, or EAD for short. The other is called performance-energy balanced duplication, or PEBD for short. In EAD, we only calculate the energy cost of duplicating a task. For example, if we duplicate T1, we pay a 48 J energy cost on the CPU side because we have to execute T1 twice. At the same time, we save 6 J of energy on the network side because we do not need to send a message from T1 to T2. So the total cost will be 42 J. In PEBD, we also calculate the performance benefit. If we duplicate T1, we can shorten the schedule length by 6 s at most. So the ratio between energy and performance will be 7. If we set the duplication threshold to 10, EAD will not duplicate while PEBD will.
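The two duplication tests can be sketched as follows, using the numbers from the example (the function names and signatures are mine, not the authors'):

```python
# Hedged sketch of the EAD and PEBD duplication tests. In the example,
# duplicating T1 costs 48 J of extra CPU energy, saves 6 J of network
# energy, and shortens the schedule by 6 s.

def ead_duplicate(cpu_cost_j, net_saving_j, threshold_j):
    """EAD: duplicate only if the net energy increase is within the threshold."""
    return (cpu_cost_j - net_saving_j) <= threshold_j

def pebd_duplicate(cpu_cost_j, net_saving_j, time_saved_s, threshold):
    """PEBD: duplicate if the energy increase per second saved is within the threshold."""
    ratio = (cpu_cost_j - net_saving_j) / time_saved_s
    return ratio <= threshold

print(ead_duplicate(48, 6, 10))      # False: 42 J > 10 J, EAD says no
print(pebd_duplicate(48, 6, 6, 10))  # True: 42/6 = 7 <= 10, PEBD says yes
```

This reproduces the decision from the motivational example: with a threshold of 10, EAD refuses the duplication while PEBD accepts it.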
Now let's look at how to implement the algorithms using a concrete example. In Step 1, we generate the DAG based on the task description, which should be provided by users.
Next, we are going to calculate the important parameters based on equations 14-19 in Chapter 4. The level means…
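A minimal sketch of the level computation, taking "level" as the total execution time from a task to the exit task along its longest path (the execution times and edges below are made up for illustration):

```python
from functools import lru_cache

# Sketch of the "level" parameter: the longest execution-time path
# from a task down to the exit task. Times and edges are illustrative.
exec_time = {"T1": 8, "T2": 15, "T3": 10, "T4": 6}
children = {"T1": ["T2", "T3"], "T2": ["T4"], "T3": ["T4"], "T4": []}

@lru_cache(maxsize=None)
def level(task):
    succ = children[task]
    if not succ:                  # exit task: level is its own time
        return exec_time[task]
    return exec_time[task] + max(level(c) for c in succ)

print(level("T1"))  # 8 + max(15 + 6, 10 + 6) = 29
```

Sorting tasks by this value in ascending order yields the scheduling queue mentioned in the algorithm steps.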
Once we have these parameters, we can obtain the original task list by sorting the tasks by level in ascending order. We start from the first unscheduled task in the list, which is task 10, and follow the favorite predecessors back to the entry task. All tasks on this path form a critical path; here the first critical path is 10->9->7->3->1. These tasks are then marked as scheduled. In the next iteration, the algorithm picks the next unscheduled task as the start task and forms the second critical path, then the third one and the fourth one. The algorithm does not terminate until all tasks have been scheduled.
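The critical-path walk can be sketched as below; the favorite-predecessor map here is only the fragment needed to reproduce the example's first path (10 -> 9 -> 7 -> 3 -> 1), not a full schedule:

```python
# Sketch of critical-path formation: from a start task, follow the
# favorite predecessor back to the entry task. Only the entries for
# the example's first critical path are shown; None marks the entry task.
favorite_pred = {10: 9, 9: 7, 7: 3, 3: 1, 1: None}

def critical_path(start):
    path, t = [], start
    while t is not None:
        path.append(t)
        t = favorite_pred[t]
    return path

print(critical_path(10))  # [10, 9, 7, 3, 1]
```

After this path is formed, its tasks are marked as scheduled, and the next iteration starts from the next unscheduled task in the level-sorted list.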
The algorithms also have to make the duplication decision. Explain…
This diagram summarizes the steps we just talked about, so I will skip it.
Now we are going to discuss the simulation results. We implemented our own simulator in C under Linux. The CPU power-consumption parameters come from xbitlabs. We simulate 4 different CPUs; 3 of them are AMD and one is Intel.
This slide shows the structure of two small task sets. The left one is the Fast Fourier Transform and the right one is Gaussian Elimination.
The slide shows the DAG structure of two real-world applications. The left one is Robot Control and the right one is Sparse Matrix Solver.
This slide shows the impact of CPU types. Recall that I simulate 4 different CPUs, represented in 4 different colors. We found that the CPU in blue can save more energy compared with the other 3 CPUs. For example, we can save 19.4% energy using the blue CPU, while we can save only 3.7% for the purple CPU. The reason behind this is that these 4 CPUs have different gaps between CPU_busy and CPU_idle power. This table summarizes the difference: the gap for the blue CPU is 89 W, but the gap for the purple CPU is only 18 W. So our observation is…
This slide shows the impact of interconnects. The left one shows the simulation results for Myrinet and the right one for Infiniband. We can save 16.7% and 13.3% energy when CCR is 0.1 and 0.5 respectively using Myrinet. However, the numbers drop to 5% and 3.1% for Infiniband. The only difference between these two simulation sets is the network power-consumption rate: Myrinet is 33.6 W and Infiniband is 65 W. So our observation is that…
We also observed the impact of application parallelism. The left figure shows the experimental results for Robot Control and the right one shows the results for Sparse Matrix Solver. We noticed that we can save 17% and 15.8% energy for Robot Control but only 6.9% and 5.4% for Sparse when CCR is the same. That is because the parallelism of Robot Control is lower than that of Sparse. So our observation is…
This slide shows our observation on the impact of CCR. Read...
This group of simulation results shows the impact on performance. The left one is for Gaussian and the right one is for Sparse. This table summarizes the overall performance degradation: EAD and PEBD degrade performance by 5.7% and 2.2% compared with TDS for Gaussian. For Sparse, the numbers are 2.92% and 2.02%. Our observation is…
For example, we designed a mapping matrix to represent the execution time of tasks on different processors. As you can see, for the same task T1, the execution times are 6.7, 3.9, and 2.0 respectively. If a task cannot be executed on a processor, we put an infinity sign.
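A sketch of such a mapping matrix is shown below. T1's row uses the times quoted above; the second row and the helper function are hypothetical additions for illustration:

```python
import math

# Sketch of the task-to-processor mapping matrix: each row is a task,
# each column a processor, and each entry an execution time in seconds.
# math.inf marks a processor that cannot execute the task.
# T1's times come from the talk; T2's row is made up.
exec_matrix = {
    "T1": [6.7, 3.9, 2.0],
    "T2": [5.0, math.inf, 3.1],  # T2 cannot run on the second processor
}

def best_processor(task):
    """Index of the processor with the lowest execution time for the task."""
    times = exec_matrix[task]
    return min(range(len(times)), key=lambda p: times[p])

print(best_processor("T1"))  # 2: the third processor runs T1 fastest
```

Because infeasible entries are infinite, they can never be selected as the minimum, which keeps the scheduling logic uniform in heterogeneous environments.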
We compared our HEADUS algorithm with 4 other algorithms and found that HEADUS obtains the best overall energy savings in all 4 different environments.
We also observed that HEADUS can save more energy under environments 2 and 4.