SlideShare una empresa de Scribd logo
1 de 31
Descargar para leer sin conexión
Introduction Implementation Concluding Remarks Appendix
Dynamic Fractional Resource Scheduling
Practical Issues and Future Directions
Mark Stillwell Fr´ed´eric Vivien Henri Casanova
INRIA, Universit´e de Lyon, LIP
International Research Workshop on Advanced High
Performance Computing Systems
Cetraro, Italy
June 27–29, 2011
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Outline
Introduction
Implementation
Concluding Remarks
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Outline
Introduction
Implementation
Concluding Remarks
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Dynamic Fractional Resource Scheduling
HPC Scheduling Problem:
homogeneous nodes with high-speed interconnect
network file system
release date and number of tasks known at submission
computation time known only after completion
goal: minimize maximum stretch
approach:
based on virtual machine technology
fractional resource allocations
preemption / migration
on-line scheduling problem → sequence of off-line resource
allocation problems
computable, on-line performance metric related to max
stretch
heuristics for task placement/resource allocation
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Task Placement and Resource Allocation Problem
nodes comprise a number of resources (CPU cores,
memory, bandwith, etc...)
tasks have rigid requirements and fluid needs
job tasks have the same requirements and needs
rigid requirements specify minimum resource allocations
fluid needs specify minimum allocations without
performance degradation
yield of a job is a value between 0 and 1
correlated with performance
jobs allocated product of yield and need in each resource
Goal: find an allocation that maximizes the minimum yield
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Task Placement Heuristics
Greedy Task Placement – incremental, puts each task on
the node with the lowest computational load on which it
can fit without violating memory constraints
MCB Task Placement – global, iteratively applies
multi-capacity (vector) bin-packing heuristics during a
binary search for the maximized minimum yield
achieves higher minimum yield values than Greedy
can potentially cause lots of migration
what if the system is oversubscribed?
use a priority function to decide which jobs to run
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Priority Function?
Virtual Time: subjective time experienced by a job
first idea: 1
VIRTUAL TIME
informed by ideas about fairness
lead to good results
theoretically prone to starvation
second idea: FLOW TIME
VIRTUAL TIME
addresses starvation problem
lead to poor performance
Third Idea: FLOW TIME
(VIRTUAL TIME)2
combines idea #1 and idea #2
addresses starvation
performs about the same as first priority function
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
When to apply Heuristics
We consider a number of different options:
Job Submission – heuristics can use greedy or bin packing
approaches
Job Completion – as above, can help with throughput
when there are lots of short running jobs
Periodically – some heuristics periodically apply vector
packing to improve overall job placement
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Methodology
experiments conducted using discrete event simulator
mix of synthetic and real trace data
considered only memory requirements and CPU needs
preemption/migration of jobs causes 300 second penalty to
runtime
periodic approaches use a 600 second (10 minute) period
absolute bound on max stretch computed for each instance
performance comparison based on degradation from max
stretch bound
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Simulation Algorithms
FCFS
EASY
Greedy*
GreedyPM*
/per
GreedyPM*/per
MCB*/per
GreedyPM*/per/mvt
MCB*/per/mvt
/stretch-per
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Max Stretch Degradation vs. Load, 5 minute penalty
1
10
100
1000
10000
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
MaxstretchdegradationFromBound
Load
FCFS
EASY
Greedy*
GreedyPM*
/per
GreedyPM*/per
MCB*/per
/stretch-per
MCB*/per/mvt
GreedyPM*/per/mvt
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Bandwidth vs. Period
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
0.55
2000 4000 6000 8000 10000 12000
AverageBandwidth(GB/s)
Period (seconds)
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Max Stretch Degradation vs. Period
5
6
7
8
9
10
11
12
13
14
2000 4000 6000 8000 10000 12000
AverageMaxstretchDegradation
Period (seconds)
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Outline
Introduction
Implementation
Concluding Remarks
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Development Platform
Grid5000 (http://www.grid5000.fr)
Kadeploy images
Genepi Nodes: Quad Core Xeon 2.5GHz w/ 8GB ram
Open-Source Software
Xen 4.0
Debian Squeeze / Linux 2.6.32
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Goals
“small” experiment topics:
VM overhead
resource requirement/needs estimation
time sharing performance impact
preemption/migration costs
bandwidth consumption
performance impact
migration planning strategies
“large” experiment topic: DFRS in practice
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Goals
“small” experiment topics:
VM overhead
resource requirement/needs estimation
time sharing performance impact
preemption/migration costs
bandwidth consumption
performance impact
migration planning strategies
“large” experiment topic: DFRS in practice
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Goals
“small” experiment topics:
VM overhead
resource requirement/needs estimation
time sharing performance impact
preemption/migration costs
bandwidth consumption
performance impact
migration planning strategies
“large” experiment topic: DFRS in practice
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Goals
“small” experiment topics:
VM overhead
resource requirement/needs estimation
time sharing performance impact
preemption/migration costs
bandwidth consumption
performance impact
migration planning strategies
“large” experiment topic: DFRS in practice
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Goals
“small” experiment topics:
VM overhead
resource requirement/needs estimation
time sharing performance impact
preemption/migration costs
bandwidth consumption
performance impact
migration planning strategies
“large” experiment topic: DFRS in practice
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Estimating Resource Needs
Techniques:
repeated runs for benchmarking / model building
online model building
introspection / configuration variation
Models:
constant values
time-varying values
random variables / probability distributions
Questions:
how accurate can we be?
how accurate do we need to be?
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Benchmark Task VM Memory Requirements
bench/tasks 1 2 3 4
CG 477M 251M – 147M
EP <64M <64M <64M <64M
FT 1930M 977M – 499M
IS 637M 329M – 176M
LU 218M 125M – 80M
MG 485M 259M – 147M
SP 369M – – 123M
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Time Sharing Performance Impact
Process thrashing!!!
Gang Scheduling
Flexible Co-scheduling
Uncoordinated might not be so bad...
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
VM Consolidation
2
4
6
8
10
12
14
1 2 3 4 5 6 7 8
slowdown
consolidation factor (maximum tasks/core)
y = x
CG
EP
FT
IS
LU
MG
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Parallel Task Preemption/Migration
preemption of parallel distributed jobs is theoretically
difficult
in the general case this may require coordinated shutdown
on the target platform everything happens quickly enough
that it isn’t an issue
migration can be done by preempt/restore or “live”
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Preemption/Migration Timing Results
0
10
20
30
40
50
60
70
80
90
100
400 600 800 1000 1200 1400 1600 1800 2000
Time(seconds)
Memory Allocation (MB)
Preempt
Restore
Preempt+Restore
Migrate
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Migration Planning
using only live migration is NP-hard
could fail (circular dependencies)
heuristic: try to move the highest priority job
if it can’t move, try next highest
if none can move, preempt the lowest priority job scheduled
for migration and try again
restore any preempted jobs at the end
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Planned Implementation Validation Experiment
generate traces using an accepted model
annotate them with CPU/memory values
”fill in” the traces using benchmarks...
NAS parallel benchmarks (HPC)
TCPP benchmarks (Service Hosting)
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Future Work
complete development of practical implementation
extend the model
nonlocal resources
heterogeneous nodes
varying communications delays
new or arbitrary optimization targets
fault tolerance
energy or other costs
formal definitions of fairness
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
Summary
DFRS performs well in simulation
orders of magnitude better than batch
close to a theoretical bound
prototype in development
preliminary results are promising
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice
Introduction Implementation Concluding Remarks Appendix
References I
Stillwell, M., Schanzenbach, D., Vivien, F., and Casanova,
H. (2010).
Resource allocation algorithms for virtualized service
hosting platforms.
Journal of Parallel and Distributed Computing,
70(9):962–974.
M. Stillwell, F. Vivien, H. Casanova INRIA
DFRS in Practice

Más contenido relacionado

Similar a Dynamic Fractional Resource Scheduling Practical Issues and Future Directions -- 2011, Cetraro

f2afc247105b552c727a4214e5101b76_9_3product_level (1).pptx
f2afc247105b552c727a4214e5101b76_9_3product_level (1).pptxf2afc247105b552c727a4214e5101b76_9_3product_level (1).pptx
f2afc247105b552c727a4214e5101b76_9_3product_level (1).pptx
HENRYSANTIAGOCOLLASO
 
Tablet tools and micro tasks
Tablet tools and micro tasksTablet tools and micro tasks
Tablet tools and micro tasks
mhilde
 
SFScon 2020 - Andrea Avancini Michele Santuari - An internal investigation ob...
SFScon 2020 - Andrea Avancini Michele Santuari - An internal investigation ob...SFScon 2020 - Andrea Avancini Michele Santuari - An internal investigation ob...
SFScon 2020 - Andrea Avancini Michele Santuari - An internal investigation ob...
South Tyrol Free Software Conference
 

Similar a Dynamic Fractional Resource Scheduling Practical Issues and Future Directions -- 2011, Cetraro (20)

A Connectionist Approach to Dynamic Resource Management for Virtualised Netwo...
A Connectionist Approach to Dynamic Resource Management for Virtualised Netwo...A Connectionist Approach to Dynamic Resource Management for Virtualised Netwo...
A Connectionist Approach to Dynamic Resource Management for Virtualised Netwo...
 
Visual prompt tuning
Visual prompt tuningVisual prompt tuning
Visual prompt tuning
 
CapellaDays2022 | Thales | Stairway to heaven: Climbing the very first steps
CapellaDays2022 | Thales | Stairway to heaven: Climbing the very first stepsCapellaDays2022 | Thales | Stairway to heaven: Climbing the very first steps
CapellaDays2022 | Thales | Stairway to heaven: Climbing the very first steps
 
ESR8 Liangyou Li - EXPERT Summer School - Malaga 2015
ESR8 Liangyou Li - EXPERT Summer School - Malaga 2015ESR8 Liangyou Li - EXPERT Summer School - Malaga 2015
ESR8 Liangyou Li - EXPERT Summer School - Malaga 2015
 
Continuous Software Engineering - A tutorial
Continuous Software Engineering - A tutorialContinuous Software Engineering - A tutorial
Continuous Software Engineering - A tutorial
 
Workshop BI/DWH AGILE TESTING SNS Bank English
Workshop BI/DWH AGILE TESTING SNS Bank EnglishWorkshop BI/DWH AGILE TESTING SNS Bank English
Workshop BI/DWH AGILE TESTING SNS Bank English
 
Atw segments 1 & 2 final 6 6-09
Atw segments 1 & 2 final 6 6-09Atw segments 1 & 2 final 6 6-09
Atw segments 1 & 2 final 6 6-09
 
Introduction of Kanban metrics
Introduction of Kanban metricsIntroduction of Kanban metrics
Introduction of Kanban metrics
 
Sattose 2020 presentation
Sattose 2020 presentationSattose 2020 presentation
Sattose 2020 presentation
 
Continuous Integration Clinic
Continuous Integration ClinicContinuous Integration Clinic
Continuous Integration Clinic
 
Keeping Your DevOps Transformation From Crushing Your Ops Capacity
Keeping Your DevOps Transformation From Crushing Your Ops Capacity Keeping Your DevOps Transformation From Crushing Your Ops Capacity
Keeping Your DevOps Transformation From Crushing Your Ops Capacity
 
A PeopleSoft Roadmap
A PeopleSoft RoadmapA PeopleSoft Roadmap
A PeopleSoft Roadmap
 
f2afc247105b552c727a4214e5101b76_9_3product_level (1).pptx
f2afc247105b552c727a4214e5101b76_9_3product_level (1).pptxf2afc247105b552c727a4214e5101b76_9_3product_level (1).pptx
f2afc247105b552c727a4214e5101b76_9_3product_level (1).pptx
 
Tablet tools and micro tasks
Tablet tools and micro tasksTablet tools and micro tasks
Tablet tools and micro tasks
 
NFV testing landscape
NFV testing landscapeNFV testing landscape
NFV testing landscape
 
Estimating the Cost for Executing Business Processes in the Cloud
Estimating the Cost for Executing Business Processes in the CloudEstimating the Cost for Executing Business Processes in the Cloud
Estimating the Cost for Executing Business Processes in the Cloud
 
SFScon 2020 - Andrea Avancini Michele Santuari - An internal investigation ob...
SFScon 2020 - Andrea Avancini Michele Santuari - An internal investigation ob...SFScon 2020 - Andrea Avancini Michele Santuari - An internal investigation ob...
SFScon 2020 - Andrea Avancini Michele Santuari - An internal investigation ob...
 
NFV Interoperability Evaluation Results
NFV Interoperability Evaluation ResultsNFV Interoperability Evaluation Results
NFV Interoperability Evaluation Results
 
Serena Yeung, PHD, Stanford, at MLconf Seattle 2017
Serena Yeung, PHD, Stanford, at MLconf Seattle 2017 Serena Yeung, PHD, Stanford, at MLconf Seattle 2017
Serena Yeung, PHD, Stanford, at MLconf Seattle 2017
 
DynoChem_webinar_Novartis_Flav_Susanne_11Jun2014
DynoChem_webinar_Novartis_Flav_Susanne_11Jun2014DynoChem_webinar_Novartis_Flav_Susanne_11Jun2014
DynoChem_webinar_Novartis_Flav_Susanne_11Jun2014
 

Último

Último (20)

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Dynamic Fractional Resource Scheduling Practical Issues and Future Directions -- 2011, Cetraro

  • 1. Introduction Implementation Concluding Remarks Appendix Dynamic Fractional Resource Scheduling Practical Issues and Future Directions Mark Stillwell Fr´ed´eric Vivien Henri Casanova INRIA, Universit´e de Lyon, LIP International Research Workshop on Advanced High Performance Computing Systems Cetraro, Italy June 27–29, 2011 M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 2. Introduction Implementation Concluding Remarks Appendix Outline Introduction Implementation Concluding Remarks M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 3. Introduction Implementation Concluding Remarks Appendix Outline Introduction Implementation Concluding Remarks M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 4. Introduction Implementation Concluding Remarks Appendix Dynamic Fractional Resource Scheduling HPC Scheduling Problem: homogeneous nodes with high-speed interconnect network file system release date and number of tasks known at submission computation time known only after completion goal: minimize maximum stretch approach: based on virtual machine technology fractional resource allocations preemption / migration on-line scheduling problem → sequence of off-line resource allocation problems computable, on-line performance metric related to max stretch heuristics for task placement/resource allocation M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 5. Introduction Implementation Concluding Remarks Appendix Task Placement and Resource Allocation Problem nodes comprise a number of resources (CPU cores, memory, bandwith, etc...) tasks have rigid requirements and fluid needs job tasks have the same requirements and needs rigid requirements specify minimum resource allocations fluid needs specify minimum allocations without performance degradation yield of a job is a value between 0 and 1 correlated with performance jobs allocated product of yield and need in each resource Goal: find an allocation that maximizes the minimum yield M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 6. Introduction Implementation Concluding Remarks Appendix Task Placement Heuristics Greedy Task Placement – incremental, puts each task on the node with the lowest computational load on which it can fit without violating memory constraints MCB Task Placement – global, iteratively applies multi-capacity (vector) bin-packing heuristics during a binary search for the maximized minimum yield achieves higher minimum yield values than Greedy can potentially cause lots of migration what if the system is oversubscribed? use a priority function to decide which jobs to run M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 7. Introduction Implementation Concluding Remarks Appendix Priority Function? Virtual Time: subjective time experienced by a job first idea: 1 VIRTUAL TIME informed by ideas about fairness lead to good results theoretically prone to starvation second idea: FLOW TIME VIRTUAL TIME addresses starvation problem lead to poor performance Third Idea: FLOW TIME (VIRTUAL TIME)2 combines idea #1 and idea #2 addresses starvation performs about the same as first priority function M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 8. Introduction Implementation Concluding Remarks Appendix When to apply Heuristics We consider a number of different options: Job Submission – heuristics can use greedy or bin packing approaches Job Completion – as above, can help with throughput when there are lots of short running jobs Periodically – some heuristics periodically apply vector packing to improve overall job placement M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 9. Introduction Implementation Concluding Remarks Appendix Methodology experiments conducted using discrete event simulator mix of synthetic and real trace data considered only memory requirements and CPU needs preemption/migration of jobs causes 300 second penalty to runtime periodic approaches use a 600 second (10 minute) period absolute bound on max stretch computed for each instance performance comparison based on degradation from max stretch bound M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 10. Introduction Implementation Concluding Remarks Appendix Simulation Algorithms FCFS EASY Greedy* GreedyPM* /per GreedyPM*/per MCB*/per GreedyPM*/per/mvt MCB*/per/mvt /stretch-per M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 11. Introduction Implementation Concluding Remarks Appendix Max Stretch Degradation vs. Load, 5 minute penalty 1 10 100 1000 10000 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 MaxstretchdegradationFromBound Load FCFS EASY Greedy* GreedyPM* /per GreedyPM*/per MCB*/per /stretch-per MCB*/per/mvt GreedyPM*/per/mvt M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 12. Introduction Implementation Concluding Remarks Appendix Bandwidth vs. Period 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 2000 4000 6000 8000 10000 12000 AverageBandwidth(GB/s) Period (seconds) M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 13. Introduction Implementation Concluding Remarks Appendix Max Stretch Degradation vs. Period 5 6 7 8 9 10 11 12 13 14 2000 4000 6000 8000 10000 12000 AverageMaxstretchDegradation Period (seconds) M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 14. Introduction Implementation Concluding Remarks Appendix Outline Introduction Implementation Concluding Remarks M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 15. Introduction Implementation Concluding Remarks Appendix Development Platform Grid5000 (http://www.grid5000.fr) Kadeploy images Genepi Nodes: Quad Core Xeon 2.5GHz w/ 8GB ram Open-Source Software Xen 4.0 Debian Squeeze / Linux 2.6.32 M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 16. Introduction Implementation Concluding Remarks Appendix Goals “small” experiment topics: VM overhead resource requirement/needs estimation time sharing performance impact preemption/migration costs bandwidth consumption performance impact migration planning strategies “large” experiment topic: DFRS in practice M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 17. Introduction Implementation Concluding Remarks Appendix Goals “small” experiment topics: VM overhead resource requirement/needs estimation time sharing performance impact preemption/migration costs bandwidth consumption performance impact migration planning strategies “large” experiment topic: DFRS in practice M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 18. Introduction Implementation Concluding Remarks Appendix Goals “small” experiment topics: VM overhead resource requirement/needs estimation time sharing performance impact preemption/migration costs bandwidth consumption performance impact migration planning strategies “large” experiment topic: DFRS in practice M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 19. Introduction Implementation Concluding Remarks Appendix Goals “small” experiment topics: VM overhead resource requirement/needs estimation time sharing performance impact preemption/migration costs bandwidth consumption performance impact migration planning strategies “large” experiment topic: DFRS in practice M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 20. Introduction Implementation Concluding Remarks Appendix Goals “small” experiment topics: VM overhead resource requirement/needs estimation time sharing performance impact preemption/migration costs bandwidth consumption performance impact migration planning strategies “large” experiment topic: DFRS in practice M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 21. Introduction Implementation Concluding Remarks Appendix Estimating Resource Needs Techniques: repeated runs for benchmarking / model building online model building introspection / configuration variation Models: constant values time-varying values random variables / probability distributions Questions: how accurate can we be? how accurate do we need to be? M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 22. Introduction Implementation Concluding Remarks Appendix Benchmark Task VM Memory Requirements bench/tasks 1 2 3 4 CG 477M 251M – 147M EP <64M <64M <64M <64M FT 1930M 977M – 499M IS 637M 329M – 176M LU 218M 125M – 80M MG 485M 259M – 147M SP 369M – – 123M M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 23. Introduction Implementation Concluding Remarks Appendix Time Sharing Performance Impact Process thrashing!!! Gang Scheduling Flexible Co-scheduling Uncoordinated might not be so bad... M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 24. Introduction Implementation Concluding Remarks Appendix VM Consolidation 2 4 6 8 10 12 14 1 2 3 4 5 6 7 8 slowdown consolidation factor (maximum tasks/core) y = x CG EP FT IS LU MG M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 25. Introduction Implementation Concluding Remarks Appendix Parallel Task Preemption/Migration preemption of parallel distributed jobs is theoretically difficult in the general case this may require coordinated shutdown on the target platform everything happens quickly enough that it isn’t an issue migration can be done by preempt/restore or “live” M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 26. Introduction Implementation Concluding Remarks Appendix Preemption/Migration Timing Results 0 10 20 30 40 50 60 70 80 90 100 400 600 800 1000 1200 1400 1600 1800 2000 Time(seconds) Memory Allocation (MB) Preempt Restore Preempt+Restore Migrate M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 27. Introduction Implementation Concluding Remarks Appendix Migration Planning using only live migration is NP-hard could fail (circular dependencies) heuristic: try to move the highest priority job if it can’t move, try next highest if none can move, preempt the lowest priority job scheduled for migration and try again restore any preempted jobs at the end M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 28. Introduction Implementation Concluding Remarks Appendix Planned Implementation Validation Experiment generate traces using an accepted model annotate them with CPU/memory values ”fill in” the traces using benchmarks... NAS parallel benchmarks (HPC) TCPP benchmarks (Service Hosting) M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 29. Introduction Implementation Concluding Remarks Appendix Future Work complete development of practical implementation extend the model nonlocal resources heterogeneous nodes varying communications delays new or arbitrary optimization targets fault tolerance energy or other costs formal definitions of fairness M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 30. Introduction Implementation Concluding Remarks Appendix Summary DFRS performs well in simulation orders of magnitude better than batch close to a theoretical bound prototype in development preliminary results are promising M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice
  • 31. Introduction Implementation Concluding Remarks Appendix References I Stillwell, M., Schanzenbach, D., Vivien, F., and Casanova, H. (2010). Resource allocation algorithms for virtualized service hosting platforms. Journal of Parallel and Distributed Computing, 70(9):962–974. M. Stillwell, F. Vivien, H. Casanova INRIA DFRS in Practice