Energy Efficient Scheduling for High-Performance Clusters. Ziliang Zong, Texas State University; Adam Manzanares, Los Alamos National Lab; Xiao Qin, Auburn University
Where is Auburn University? Ph.D. ’04, U. of Nebraska-Lincoln; 04-07, New Mexico Tech; 07-now, Auburn University
Storage Systems Research Group at New Mexico Tech (2004-2007) 2011/6/22 3
Storage Systems Research Group at Auburn (2008)
Storage Systems Research Group at Auburn (2009)
Storage Systems Research Group at Auburn (2011)
Investigators: Ziliang Zong, Ph.D., Assistant Professor, Texas State University; Adam Manzanares, Ph.D. Candidate, Los Alamos National Lab; Xiao Qin, Ph.D., Associate Professor, Auburn University
Introduction – Applications
Introduction – Data Centers
Motivation – Electricity Usage (source: EPA Report to Congress on Server and Data Center Energy Efficiency, 2007)
Motivation – Energy Projections (source: EPA Report to Congress on Server and Data Center Energy Efficiency, 2007)
Motivation – Design Issues
Architecture – Multiple Layers
Energy Efficient Devices
Multiple Design Goals
Energy-Aware Scheduling for Clusters
Parallel Applications
Motivational Example: an example of duplication. [Figure: four-task graph (T1 to T4) and Gantt charts of three schedules.] Linear Schedule: time 39s. No Duplication Schedule (NDS): time 32s. Task Duplication Schedule (TDS): time 29s.
Motivational Example (cont.): an example of duplication. [Figure: task graph with (time, energy) labels. Nodes 1 to 4: (8,48), (15,90), (10,60), (6,36). Edges: (6,6), (5,5), (4,4), (2,2). Gantt charts of three schedules.] CPU_Energy = 6W, Network_Energy = 1W. Linear Schedule: time 39s, energy 234J. No Duplication Schedule (MCP): time 32s, energy 242J. Task Duplication Schedule (TDS): time 29s, energy 284J.
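The three energy totals on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, and the choice of which messages cross the network in each schedule (T1->T3 taking 6s, T3->T4 taking 2s) is an assumption inferred from the start and finish times in the Gantt charts, not something the slide states in words.

```python
# Power figures from the slide: 6 W while a CPU computes, 1 W while a link transfers.
CPU_POWER_W = 6
NET_POWER_W = 1

# Execution times in seconds, read from the (time, energy) node labels.
EXEC_TIME_S = {"T1": 8, "T2": 15, "T3": 10, "T4": 6}


def schedule_energy(executed_tasks, transfer_times_s):
    """Energy in joules: CPU power x busy time plus network power x transfer time."""
    cpu = sum(CPU_POWER_W * EXEC_TIME_S[task] for task in executed_tasks)
    net = sum(NET_POWER_W * seconds for seconds in transfer_times_s)
    return cpu + net


# Linear schedule: each task runs once on a single processor, so no messages are sent.
linear = schedule_energy(["T1", "T2", "T3", "T4"], [])

# No-duplication schedule (assumed): T3 runs on a second processor, so the
# T1->T3 (6 s) and T3->T4 (2 s) messages cross the network.
no_dup = schedule_energy(["T1", "T2", "T3", "T4"], [6, 2])

# Task-duplication schedule (assumed): T1 runs on both processors, so only the
# T3->T4 (2 s) message crosses the network.
task_dup = schedule_energy(["T1", "T1", "T2", "T3", "T4"], [2])

print(linear, no_dup, task_dup)  # 234 242 284
```

Under these assumptions the totals match the slide: duplication shortens the schedule but pays for a second run of T1.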
Motivational Example (cont.) [Figure: the same task graph with the no-duplication and task-duplication Gantt charts.] The energy cost of duplicating T1: CPU side 48J, network side -6J, total 42J. The performance benefit of duplicating T1: 6s. Energy-performance tradeoff: 42/6 = 7. EAD: time 32s, energy 242J. PEBD: time 29s, energy 284J. If Threshold = 10, duplicate T1? EAD: No. PEBD: Yes.
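The two decisions on this slide can be reproduced as follows. This is a minimal sketch: the function names are hypothetical, and the comparison rules (EAD compares the energy increase with the threshold, PEBD compares the energy-per-second ratio with it) are taken from the flowchart labels "more_energy<=Threshold?" and "Ratio<=Threshold?" rather than from the authors' source code.

```python
CPU_POWER_W = 6   # CPU power from the slides
NET_POWER_W = 1   # network power from the slides


def ead_duplicates(energy_increase_j, threshold):
    """EAD rule (as read from the flowchart): duplicate if the extra energy is small enough."""
    return energy_increase_j <= threshold


def pebd_duplicates(energy_increase_j, time_decrease_s, threshold):
    """PEBD rule (as read from the flowchart): duplicate if energy paid per second saved is small enough."""
    if time_decrease_s <= 0:
        return False  # duplication that saves no time is never worthwhile
    return energy_increase_j / time_decrease_s <= threshold


# Duplicating T1 (8 s of CPU time) removes the 6 s T1->T3 transfer.
cpu_cost_j = CPU_POWER_W * 8          # 48 J spent running T1 a second time
net_saving_j = NET_POWER_W * 6        # 6 J saved by not sending the message
energy_increase_j = cpu_cost_j - net_saving_j   # 42 J
time_decrease_s = 6                   # performance benefit quoted on the slide

threshold = 10
print(ead_duplicates(energy_increase_j, threshold))                    # False
print(pebd_duplicates(energy_increase_j, time_decrease_s, threshold))  # True
```

With the same threshold, EAD rejects the duplicate because 42 exceeds 10, while PEBD accepts it because 42/6 = 7 does not.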
Basic Steps of Energy-Aware Scheduling. Algorithm Implementation, Step 1: DAG Generation. Task description: task set {T1, T2, …, T9, T10}. T1 is the entry task; T10 is the exit task. T2, T3, and T4 cannot start until T1 has finished; T5 and T6 cannot start until T2 has finished; T7 cannot start until both T3 and T4 have finished; T8 cannot start until both T5 and T6 have finished; T9 cannot start until both T6 and T7 have finished; T10 cannot start until both T8 and T9 have finished.
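The task description above maps directly onto a predecessor table. The sketch below encodes it and checks that it forms a valid DAG with a single entry and a single exit; the variable names are illustrative, and the standard-library `graphlib` module (Python 3.9+) stands in for whatever DAG generator the authors used.

```python
from graphlib import TopologicalSorter

# Each task maps to the tasks that must finish before it can start,
# transcribed from the task description on the slide.
PREDECESSORS = {
    "T1": [],
    "T2": ["T1"],
    "T3": ["T1"],
    "T4": ["T1"],
    "T5": ["T2"],
    "T6": ["T2"],
    "T7": ["T3", "T4"],
    "T8": ["T5", "T6"],
    "T9": ["T6", "T7"],
    "T10": ["T8", "T9"],
}

# Entry tasks have no predecessors; exit tasks are nobody's predecessor.
entry_tasks = [t for t, preds in PREDECESSORS.items() if not preds]
has_successor = {p for preds in PREDECESSORS.values() for p in preds}
exit_tasks = [t for t in PREDECESSORS if t not in has_successor]

# static_order() raises graphlib.CycleError if the constraints are cyclic.
order = list(TopologicalSorter(PREDECESSORS).static_order())

print(entry_tasks, exit_tasks)
print(order)
```

Any order returned this way respects every "cannot start until" constraint, which is the property the later scheduling steps rely on.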
Basic Steps of Energy-Aware Scheduling. Algorithm Implementation, Step 2: Parameters Calculation. Parameters: total execution time from the current task to the exit task; earliest start time; earliest completion time; latest allowable start time; latest allowable completion time; favorite predecessor.
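Some of these parameters can be illustrated on the four-task motivational example. The sketch below rests on several assumptions that the slides do not spell out: the formulas follow a common formulation of task-duplication scheduling (a task's favorite predecessor is the one whose message would arrive last, and is assumed to run on the same processor), "level" is taken to mean the total execution time from a task to the exit task, and the T1->T2 and T2->T4 communication times (5s and 4s) are a guess at how the edge labels map onto the graph. The latest allowable start and completion times are left out.

```python
# Execution times (s) from the motivational example.
EXEC = {"T1": 8, "T2": 15, "T3": 10, "T4": 6}
# Communication times (s); the T1->T2 and T2->T4 values are assumed.
COMM = {("T1", "T2"): 5, ("T1", "T3"): 6, ("T2", "T4"): 4, ("T3", "T4"): 2}
ORDER = ["T1", "T2", "T3", "T4"]  # a topological order of the graph

preds = {t: [u for (u, v) in COMM if v == t] for t in ORDER}
succs = {t: [v for (u, v) in COMM if u == t] for t in ORDER}

# Level: execution time along the longest path from the task to the exit task.
level = {}
for t in reversed(ORDER):
    level[t] = EXEC[t] + max((level[s] for s in succs[t]), default=0)

# Earliest start (est), earliest completion (ect), favorite predecessor (fpred).
est, ect, fpred = {}, {}, {}
for t in ORDER:
    if not preds[t]:
        est[t], fpred[t] = 0, None
    else:
        # Time at which each predecessor's result would arrive over the network.
        arrival = {p: ect[p] + COMM[(p, t)] for p in preds[t]}
        fpred[t] = max(arrival, key=arrival.get)
        # The favorite predecessor is co-located, so its message costs nothing.
        others = [a for p, a in arrival.items() if p != fpred[t]]
        est[t] = max([ect[fpred[t]]] + others)
    ect[t] = est[t] + EXEC[t]

queue = sorted(ORDER, key=level.get)  # ascending level, as in the scheduling step
print(level, est, ect, fpred, queue)
```

Under these assumptions the exit task completes at 29s, which agrees with the task-duplication schedule length in the motivational example.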
Basic Steps of Energy-Aware Scheduling. Algorithm Implementation, Step 3: Scheduling. Original task list: {10, 9, 8, 5, 6, 2, 7, 4, 3, 1}.
Basic Steps of Energy-Aware Scheduling. Algorithm Implementation, Step 4: Duplication Decision. Original task list: {10, 9, 8, 5, 6, 2, 7, 4, 3, 1}. Decision 1: duplicate T1? Decision 2: duplicate T2? Duplicate T1? Decision 3: duplicate T1?
The EAD and PEBD Algorithms (flowchart). Common steps: generate the DAG of the given task set; find all the critical paths in the DAG; generate the scheduling queue based on the level (ascending); select the not-yet-scheduled task with the lowest level as the starting task; for each task on the same critical path as the starting task, check whether it is already scheduled. If it is not, allocate it to the same processor as the other tasks on that critical path. If it is, check whether duplicating it would save time. EAD branch: calculate the energy increase; if more_energy <= Threshold, duplicate the task and select the next task on the same critical path. PEBD branch: calculate the energy increase and the time decrease; Ratio = energy increase / time decrease; if Ratio <= Threshold, duplicate the task and select the next task on the same critical path. The traversal continues until the entry task is met.
Energy Dissipation in Processors (source: http://www.xbitlabs.com)
Parallel Scientific Applications: Fast Fourier Transform, Gaussian Elimination
Large-Scale Parallel Applications: Robot Control, Sparse Matrix Solver (source: http://www.kasahara.elec.waseda.ac.jp/schedule/)
Impact of CPU Power Dissipation. Impact of CPU types. Panels: energy consumption for different processors (Gaussian, CCR=0.4); energy consumption for different processors (FFT, CCR=0.4). Chart annotations: 19.4%, 3.7%.
Impact of Interconnect Power Dissipation. Impact of interconnection types. Panels: energy consumption (Robot Control, Myrinet); energy consumption (Robot Control, Infiniband). Chart annotations: 5%, 3.1%, 16.7%, 13.3%.
Parallelism Degrees. Impact of application parallelism. Panels: energy consumption of Sparse Matrix (Myrinet); energy consumption of Robot Control (Myrinet). Chart annotations: 6.9%, 5.4%, 17%, 15.8%.
Communication-Computation Ratio. Impact of CCR: energy consumption under different CCRs. CCR: Communication-Computation Ratio.
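For a concrete sense of scale, the CCR of the four-task motivational example can be computed as below. The slide does not give a formula, so this sketch assumes the commonly used definition (average communication time divided by average computation time); the mapping of communication times to edges is likewise an assumption.

```python
from statistics import mean

# Execution times (s) of the four tasks in the motivational example.
EXEC_TIME_S = {"T1": 8, "T2": 15, "T3": 10, "T4": 6}
# Communication times (s) on the four edges (edge assignment assumed).
COMM_TIME_S = {("T1", "T2"): 5, ("T1", "T3"): 6, ("T2", "T4"): 4, ("T3", "T4"): 2}

# Assumed definition: average communication cost over average computation cost.
ccr = mean(COMM_TIME_S.values()) / mean(EXEC_TIME_S.values())
print(round(ccr, 3))
```

A CCR well below 1 describes a computation-intensive application; values above 1 describe a communication-intensive one.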
Performance. Impact on schedule length. Panels: schedule length of Gaussian Elimination; schedule length of Sparse Matrix Solver.
Heterogeneous Clusters – Motivational Example
Motivational Example (cont.): energy calculation for tentative schedule (C1, C2, C3, C4)
Experimental Settings: simulation environments
Communication-Computation Ratio: CCR sensitivity for Gaussian Elimination
Heterogeneity: computational node heterogeneity experiments
Conclusions
Energy-Efficient Scheduling for Clusters
Energy-Efficient Scheduling for Heterogeneous Systems
How to measure energy consumption? Kill-A-Watt
Source Code Availability: www.mcs.sdsmt.edu/~zzong/software/scheduling.html

Energy efficient resource management for high-performance clusters

  • 1. Energy Efficient Scheduling for High-Performance Clusters ZiliangZong, Texas State University Adam Manzanares, Los Alamos National Lab Xiao Qin, Auburn University
  • 2. Where is Auburn University? Ph.D.’04, U. of Nebraska-Lincoln 04-07, New Mexico Tech 07-now, Auburn University
  • 3. Storage Systems Research Group at New Mexico Tech (2004-2007) 2011/6/22 3
  • 4. Storage Systems Research Group at Auburn (2008)
  • 5. Storage Systems Research Group at Auburn (2009)
  • 6. Storage Systems Research Group at Auburn (2011)
  • 7. Investigators: Ziliang Zong, Ph.D., Assistant Professor, Texas State University; Adam Manzanares, Ph.D. Candidate, Los Alamos National Lab; Xiao Qin, Ph.D., Associate Professor, Auburn University
  • 8. Introduction - Applications
  • 9. Introduction – Data Centers
  • 10. Motivation – Electricity Usage (EPA Report to Congress on Server and Data Center Energy Efficiency, 2007)
  • 11. Motivation – Energy Projections (EPA Report to Congress on Server and Data Center Energy Efficiency, 2007)
  • 12. Motivation – Design Issues
  • 13. Architecture – Multiple Layers
  • 14. Energy Efficient Devices
  • 15. Multiple Design Goals
  • 16. Energy-Aware Scheduling for Clusters
  • 18. Motivational Example: an example of duplication. [Figure: DAG of four tasks (T1 to T4) and Gantt charts of three schedules.] Linear Schedule, Time: 39s; No Duplication Schedule (NDS), Time: 32s; Task Duplication Schedule (TDS), Time: 29s
  • 19. Motivational Example (cont.) [Figure: the same DAG annotated with (time, energy) pairs.] CPU_Energy=6W, Network_Energy=1W. Linear Schedule, Time: 39s, Energy: 234J; No Duplication Schedule (MCP), Time: 32s, Energy: 242J; Task Duplication Schedule (TDS), Time: 29s, Energy: 284J
  • 20. Motivational Example (cont.) The energy cost of duplicating T1: CPU side: 48J; Network side: -6J; Total: 42J. The performance benefit of duplicating T1: 6s. Energy-performance tradeoff: 42/6 = 7. EAD, Time: 32s, Energy: 242J; PEBD, Time: 29s, Energy: 284J. If Threshold = 10, duplicate T1? EAD: NO; PEBD: Yes
  • 21. Basic Steps of Energy-Aware Scheduling. Algorithm Implementation, Step 1: DAG Generation. Task Description: Task Set {T1, T2, …, T9, T10}; T1 is the entry task; T10 is the exit task; T2, T3 and T4 cannot start until T1 has finished; T5 and T6 cannot start until T2 has finished; T7 cannot start until both T3 and T4 have finished; T8 cannot start until both T5 and T6 have finished; T9 cannot start until both T6 and T7 have finished; T10 cannot start until both T8 and T9 have finished
  • 22. Basic Steps of Energy-Aware Scheduling. Algorithm Implementation, Step 2: Parameters Calculation. Total Execution Time from current task to the exit task; Earliest Start Time; Earliest Completion Time; Latest Allowable Start Time; Latest Allowable Completion Time; Favorite Predecessor
  • 23. Basic Steps of Energy-Aware Scheduling. Algorithm Implementation, Step 3: Scheduling. Original Task List: {10, 9, 8, 5, 6, 2, 7, 4, 3, 1}
  • 24. Basic Steps of Energy-Aware Scheduling. Algorithm Implementation, Step 4: Duplication Decision. Original Task List: {10, 9, 8, 5, 6, 2, 7, 4, 3, 1}. Decision 1: Duplicate T1? Decision 2: Duplicate T2? Duplicate T1? Decision 3: Duplicate T1?
  • 25. The EAD and PEBD Algorithms. [Flowchart; box labels only.] Shared steps: Generate the DAG of given task sets; Find all the critical paths in DAG; Generate scheduling queue based on the level (ascending); Select the task (has not been scheduled yet) with the lowest level as starting task; For each task which is in the same critical path with starting task, check if it is already scheduled; Allocate it to the same processor with the tasks in the same critical path; Save time if duplicate this task?; Meet entry task? EAD branch: Calculate energy increase; more_energy<=Threshold?; Duplicate this task and select the next task in the same critical path. PEBD branch: Calculate energy increase and time decrease; Ratio = energy increase / time decrease; Ratio<=Threshold?; Duplicate this task and select the next task in the same critical path
  • 26. Energy Dissipation in Processors (http://www.xbitlabs.com)
  • 27. Parallel Scientific Applications: Fast Fourier Transform; Gaussian Elimination
  • 28. Large-Scale Parallel Applications: Robot Control; Sparse Matrix Solver (http://www.kasahara.elec.waseda.ac.jp/schedule/)
  • 29. Impact of CPU Power Dissipation. Impact of CPU Types: 19.4%, 3.7%. Energy consumption for different processors (Gaussian, CCR=0.4); Energy consumption for different processors (FFT, CCR=0.4)
  • 30. Impact of Interconnect Power Dissipation. Impact of Interconnection Types: 5%, 3.1%, 16.7%, 13.3%. Energy consumption (Robot Control, Myrinet); Energy consumption (Robot Control, Infiniband)
  • 31. Parallelism Degrees. Impact of Application Parallelism: 6.9%, 5.4%, 17%, 15.8%. Energy consumption of Sparse Matrix (Myrinet); Energy consumption of Robot Control (Myrinet)
  • 32. Communication-Computation Ratio. Impact of CCR: Energy consumption under different CCRs. CCR: Communication-Computation Ratio
  • 33. Performance. Impact to Schedule Length: Schedule length of Gaussian Elimination; Schedule length of Sparse Matrix Solver
  • 34. Heterogeneous Clusters - Motivational Example
  • 35. Motivational Example (cont.) Energy calculation for tentative schedule: C1, C2, C3, C4
  • 36. Experimental Settings: Simulation Environments
  • 37. Communication-Computation Ratio: CCR sensitivity for Gaussian Elimination
  • 38. Heterogeneity: Computational nodes heterogeneity experiments
  • 41. Energy-Efficient Scheduling for Heterogeneous Systems
  • 42. How to measure energy consumption? Kill-A-Watt
  • 43. Source Code Availability: www.mcs.sdsmt.edu/~zzong/software/scheduling.html
  • 44. Download the presentation slides: http://www.slideshare.net/xqin74 (Google: slideshare Xiao Qin)
  • 47. Download Slides at slideshare: http://www.slideshare.net/xqin74
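
The energy figures on slides 19 and 20 can be reproduced with a short sketch. It is only an illustration under stated assumptions: energy is charged for busy CPU time (6 W) and for messages that cross processors (1 W), idle power is ignored, and the task and message times are read back from the flattened figure. The names `schedule_energy`, `ead_duplicates` and `pebd_duplicates` are hypothetical and do not come from the authors' released code.

```python
# Power rates stated on slide 19.
CPU_POWER_W = 6
NETWORK_POWER_W = 1

# Execution times (s) reconstructed from the Gantt charts: T1 runs 0-8,
# T3 runs 8-23, T2 runs 14-24 on the second CPU, T4 runs 26-32.
EXEC_TIME_S = {"T1": 8, "T2": 10, "T3": 15, "T4": 6}

# Message times (s). T1->T2 (6) and T2->T4 (2) follow from the schedules;
# the assignment of 5 and 4 to the other two edges is an assumption and
# does not affect the totals below.
COMM_TIME_S = {("T1", "T2"): 6, ("T1", "T3"): 5,
               ("T2", "T4"): 2, ("T3", "T4"): 4}


def schedule_energy(executed_tasks, remote_edges):
    """Energy (J) of a schedule: busy CPU time plus cross-processor messages."""
    cpu = sum(EXEC_TIME_S[t] for t in executed_tasks) * CPU_POWER_W
    net = sum(COMM_TIME_S[e] for e in remote_edges) * NETWORK_POWER_W
    return cpu + net


def ead_duplicates(more_energy_j, threshold):
    """EAD: duplicate only if the extra energy stays within the threshold."""
    return more_energy_j <= threshold


def pebd_duplicates(more_energy_j, time_saved_s, threshold):
    """PEBD: duplicate if energy paid per second saved is within the threshold."""
    if time_saved_s <= 0:
        return False  # duplication that saves no time is never worthwhile
    return more_energy_j / time_saved_s <= threshold


tasks = ["T1", "T2", "T3", "T4"]
linear = schedule_energy(tasks, [])                            # one CPU
nds = schedule_energy(tasks, [("T1", "T2"), ("T2", "T4")])     # T2 on CPU 2
tds = schedule_energy(tasks + ["T1"], [("T2", "T4")])          # T1 run twice

more_energy = tds - nds   # 48 J extra CPU work minus 6 J of saved messaging
time_saved = 6            # upper bound used on slide 20

print(linear, nds, tds, more_energy)
print("EAD duplicates T1:", ead_duplicates(more_energy, 10))
print("PEBD duplicates T1:", pebd_duplicates(more_energy, time_saved, 10))
```

With a threshold of 10, the sketch gives the same outcome as slide 20: EAD declines to duplicate T1 because 42 exceeds the threshold, while PEBD accepts because the ratio 42/6 is 7.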

Editor's Notes

  1. See also: defense_Ziliang.ppt
  2. High-performance computing platforms have been widely deployed for intensive data processing and data storage. Their impact can be found in almost every domain: financial services, scientific computing, bioinformatics, computational chemistry, and weather forecasting.
  8. This slide shows a typical high-performance computing platform, which was built by Google in Oregon. There is no doubt that such platforms have significantly changed our lives, and we all benefit from the services they provide. However, these giant machines consume a huge amount of energy.
  9. This figure comes from the report the Environmental Protection Agency submitted to Congress last year. According to the report, the total power usage of servers and data centers in the United States was 61.4 billion kWh in 2006. This is more than double the energy used for the same purpose in 2000. Looking at the trend from 2000 to 2006, the energy consumed by servers and data centers rapidly increased from 28.2 billion kWh all the way up to 61.4 billion kWh.
  10. Even worse, the EPA predicts that the power usage of servers and data centers will double again within 5 years if the historical trends continue. Even if we follow the current efficiency trends, the power usage will exceed 100 billion kWh in 2011. This is a huge amount of energy.
  11. However, most previous research has focused primarily on the performance, security and reliability of high-performance computing platforms. The energy consumption issue was ignored. Now the energy problem has become so serious that I believe it is time for us to highlight energy efficiency research for high-performance computing platforms.
  12. In our architecture, we have four layers: application layer, middleware layer, resource layer, network layer. In each layer, we can incorporate energy-aware techniques. For example, in the application layer, we can reduce the unnecessary access to hardware when writing the code. In the middleware layer, we can schedule parallel tasks in more energy-efficient ways. In the resource and network layers, we can do energy-aware resource management.
  13. This slide shows some typical hardware in the resource and network layers like CPU, main board, storage disk, network adapter, switch and router.
  14. One thing I would like to emphasize here is that energy-oriented research should not sacrifice other important characteristics like performance, reliability or security. Although there must be some tradeoff once we introduce energy-aware techniques, we do not want to see significant degradation in other characteristics. In other words, we would like to make our research compatible with existing techniques. In my research, I mainly focus on the tradeoff between performance and energy.
  15. Before we talk about the algorithms, let’s look at cluster systems first. In a cluster, we have the master node and slave nodes. The master node is responsible for scheduling tasks and allocating them to slave nodes for parallel execution. All slave nodes are connected by high-speed interconnects and they communicate with each other through message passing.
  16. The parallel tasks running on clusters are represented using a Directed Acyclic Graph, or DAG for short. Usually, a DAG has one entry task and one or multiple exit tasks. The DAG shows the task number and the execution time of each task. It also shows the dependences and communication times among tasks. Explain a little bit…
  17. Weakness 1: Energy conservation in memory is not considered. Weakness 2: Energy cannot be conserved even when network interconnects are idle. In order to improve performance, we use a duplication strategy. This slide shows why duplication can improve performance. Here we have 4 tasks represented by the DAG on the left side. If we use linear scheduling, all four tasks will be allocated to 1 CPU and the execution time will be 39s. However, we noticed that we can schedule task 2 on the 2nd CPU so that we do not need to wait for the completion of task 3. In that way, the total time is shortened to 32s. We also noticed that 6s are wasted on the 2nd CPU because task 2 has to wait for the message from task 1. If we duplicate task 1 on the 2nd CPU, we can further shorten the schedule length to 29s. Clearly, duplication can improve performance.
  18. However, if we calculate the energy, we find that duplication may consume more energy. For example, if we set the power consumption of the CPU and the network to 6 W and 1 W, the total energy consumption of the duplication schedule will be 42J more than NDS and 50J more than the linear schedule. That is mainly because task 1 is executed twice. Here I would like to mention that I will use NDS (MCP) to represent the no duplication schedule and TDS to represent the task duplication schedule. You will see a lot of them in the simulation results.
  19. So we have to consider the tradeoff between performance and energy consumption. We propose two algorithms that consider this tradeoff. One is called energy-aware duplication, or EAD for short. The other is called performance-energy balanced duplication, or PEBD for short. In EAD, we only calculate the energy cost of duplicating a task. For example, if we duplicate T1, we pay a 48J energy cost on the CPU side because we have to execute T1 twice. At the same time, we save 6J of energy on the network side because we do not need to send the message from T1 to T2. So the total cost will be 42J. In PEBD, we also calculate the performance benefit. If we duplicate T1, we can shorten the schedule length by at most 6s. So the ratio between energy and performance will be 7. If we set the duplication threshold to 10, EAD will not duplicate while PEBD will duplicate.
  20. Now let’s look at how to implement the algorithms using a concrete example. In Step 1, we generate the DAG based on the task description, which should be provided by users.
  21. Next, we are going to calculate the important parameters based on equations 14-19 shown in Chapter 4. The level means…
  22. Once we have these parameters, we can obtain the original task list by sorting the levels in ascending order. We start from the first unscheduled task in the list, which is 10, and follow the favorite predecessors to the entry task. All tasks on this path form a critical path. Here the first critical path is 10->9->7->3->1. Then, these tasks are marked as scheduled. In the next iteration, the algorithm picks the next unscheduled task as the start task and forms the second critical path. Then come the third one and the fourth one. The algorithm does not terminate until all tasks have been scheduled.
  23. The algorithms also have to make the duplication decision. Explain…
  24. This diagram summarizes the steps we just talked about. I will just skip it.
  25. Now we are going to discuss the simulation results. We implemented our own simulator in C under Linux. The CPU power consumption parameters come from xbitlabs. We simulate 4 different CPUs; 3 of them are AMD and one is Intel.
  26. This slide shows the structure of two small task sets. The left one is Fast Fourier Transform and the right one is Gaussian Elimination.
  27. The slide shows the DAG structure of two real-world applications. The left one is Robot Control and the right one is Sparse Matrix Solver.
  28. This slide shows the impact of CPU types. Recall that I simulate 4 different CPUs, which are represented in 4 different colors. We found that the CPU shown in blue saves more energy than the other 3 CPUs. For example, we can save 19.4% energy using the blue CPU while we can only save 3.7% with the purple CPU. The reason is that these 4 CPUs have different gaps between CPU_busy and CPU_idle. This table summarizes the difference. The gap for the blue CPU is 89 W but the gap for the purple CPU is only 18 W. So our observation is…
  29. This slide shows the impact of interconnects. The left one shows the simulation results for Myrinet and the right one shows the simulation results for Infiniband. We can save 16.7% and 13.3% energy when CCR is 0.1 and 0.5 respectively using Myrinet. However, the numbers drop to 5% and 3.1% for Infiniband. The only difference between these two simulation sets is the network power consumption rate. Myrinet is 33.6 W and Infiniband is 65 W. So our observation is that…
  30. We also observe the impact of application parallelism. The left figure shows the experimental results for Robot Control and the right one shows the results for Sparse. We noticed that we can save 17% and 15.8% energy for Robot Control but only 6.9% and 5.4% for Sparse when CCR is the same. That is because Robot Control has less parallelism than Sparse. So our observation is…
  31. This slide shows our observation to the impact of CCR. Read...
  32. This group of simulation results shows the impact on performance. The left one is for Gaussian and the right one is for Sparse. This table shows that the overall performance degradation of EAD and PEBD is 5.7% and 2.2% compared with TDS for Gaussian. For Sparse, the numbers are 2.92% and 2.02%. Our observation is …
  33. For example, we designed a mapping matrix to represent the execution time of tasks on different processors. As you can see, for the same task T1, the execution times are 6.7, 3.9, and 2.0 respectively. If a task cannot be executed on a processor, we put an infinity sign.
  34. We compared our HEADUS algorithm with 4 other algorithms and found that HEADUS obtains the best overall energy savings in all 4 environments.
  35. We also observed that HEADUS can save more energy under environments 2 and 4.
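
The critical-path grouping described in note 22 can be sketched as follows. This is a minimal illustration, not the authors' simulator (which the notes say is written in C). The dependencies come from slide 21 and the task list from slide 23. The favorite predecessors of tasks 10, 9, 7 and 3 follow from the critical path stated in note 22; the remaining entries of `FAVORITE` are assumptions made only so the example runs, and the function name `critical_paths` is hypothetical. Duplication decisions are left out: a path simply stops when it reaches a task that is already scheduled.

```python
# Dependencies from slide 21: task -> tasks that must finish first.
PREDECESSORS = {
    1: [], 2: [1], 3: [1], 4: [1], 5: [2], 6: [2],
    7: [3, 4], 8: [5, 6], 9: [6, 7], 10: [8, 9],
}

# Favorite predecessor of each task. 10->9->7->3->1 is given in note 22;
# the entries for 8, 5, 6, 2 and 4 are ASSUMED for illustration.
FAVORITE = {10: 9, 9: 7, 7: 3, 3: 1, 8: 5, 5: 2, 6: 2, 2: 1, 4: 1}

# Task list sorted by level in ascending order (slide 23).
TASK_LIST = [10, 9, 8, 5, 6, 2, 7, 4, 3, 1]


def critical_paths(task_list, favorite):
    """Group tasks into paths by following favorite predecessors.

    Each iteration starts at the first unscheduled task in the list and
    walks toward the entry task, stopping early at a scheduled task.
    """
    scheduled = set()
    paths = []
    for start in task_list:
        if start in scheduled:
            continue
        path = []
        task = start
        while task is not None and task not in scheduled:
            path.append(task)
            scheduled.add(task)
            task = favorite.get(task)  # None once the entry task is passed
        paths.append(path)
    return paths


paths = critical_paths(TASK_LIST, FAVORITE)
for number, path in enumerate(paths, start=1):
    print(f"critical path {number}: " + "->".join(str(t) for t in path))
```

Under these assumptions the first path is 10->9->7->3->1, as in note 22, and the remaining tasks fall into three further paths, which agrees with the four iterations the note mentions. A different choice of favorite predecessors for tasks 8, 5, 6, 2 or 4 would change the later paths.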