Scaling and scheduling to maximize application performance within budget constraints
1. Ming Mao, Marty Humphrey
CS Department, UVa
Scaling and Scheduling to Maximize
Application Performance within Budget
Constraints in Cloud Workflows
IPDPS 2013
(May 21st 2013)
2.
Dynamic scalability and cost saving are two of the most important factors when
considering cloud adoption
Two major benefits
- dynamic scalability and cost
A survey from 39 major technology companies [1]
Cloud benefits
On-demand self-services
Broad network access
Resource pooling
Rapid elasticity
Measured services
Cheaper maintenance
……
Why do you move into the cloud?
3.
Dynamic scalability – the ability to acquire and release resources dynamically in response to demand
Dynamic scalability challenge → it relies on users to specify the size of the resource pool
Over-provisioning → costs more than necessary and offsets the cloud's advantages
Under-provisioning → hurts application performance, fails to meet service level agreements, and loses application customers
Cloud dynamic scalability
over-provisioning under-provisioning
4.
Problem - What resources should be acquired or released in the cloud,
and how should the computing activities be mapped to the cloud
resources, so that application performance is maximized
within the budget constraints?
In this paper, we discuss the limited-budget case
The unlimited-budget case was discussed in our SC 11 paper
Solution - This paper argues that an automatic resource
provisioning and allocation mechanism, i.e., an auto-scaling
solution, is the key to successful cloud adoption. Essentially, an
auto-scaling solution needs to answer the following two questions:
Capacity determination (or resource provisioning)
what types of resources, how much and for how long
Job scheduling (or resource allocation)
map computing activities onto the cloud resources
Problem statement
5.
An application consists of service components. A workflow goes through different
service components and therefore consists of multiple connected tasks
Workload is a stream of workflow jobs not known in advance
Task precedence constraints need to be preserved
Jobs have individual priorities
Service oriented architecture (SOA) & workflow jobs
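A minimal sketch of what "task precedence constraints need to be preserved" means for one workflow job. The task names and the dependency map are hypothetical and only illustrate a DAG of connected tasks; the talk does not prescribe this representation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow job: each task maps to the set of tasks it depends on.
workflow = {
    "parse": set(),
    "resize": {"parse"},
    "index": {"parse"},
    "publish": {"resize", "index"},
}


def runnable_order(dag):
    """Return one task order that respects every precedence constraint."""
    return list(TopologicalSorter(dag).static_order())


order = runnable_order(workflow)
print(order)
```

Any scheduling plan has to start a task only after all of its predecessors in such an order have finished.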
6.
Minimize job turnaround time within budget constraints
Problem formulation
Problem terminology
Cloud application
$app = \{S_i\}$
Job class
$J = \{DAG(S_i),\ priority_J \mid S_i \in app\}$
Cloud VM
$VM_v = \{[J_{S_i}]_v,\ c_v,\ lag_v\}$
Workload
$W_t = \{job_J^{S_i}\}$
Scaling plan
$Scaling_t = \{VM_v \rightarrow N_v\}$
Scheduling plan
$Schedule_t = \{j_J^{S_i} \rightarrow VM_v\}$
Goal
$\min\left(\sum_{job} job_{turnaround} \times priority_{job} \,/\, \sum_{job} priority_{job}\right)$
subject to
$Cost(app) \le B$ (budget, dollars/hour)
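The goal above can be read as a priority-weighted average of job turnaround times, checked against an hourly budget. The following is a rough sketch of that reading; the function names and the sample numbers are assumptions made for illustration, not values from the talk.

```python
def weighted_turnaround(jobs):
    """Priority-weighted average turnaround: sum(t_j * p_j) / sum(p_j).

    jobs is a list of (turnaround_hours, priority) pairs.
    """
    total_priority = sum(priority for _, priority in jobs)
    if total_priority == 0:
        raise ValueError("at least one job needs a positive priority")
    return sum(t * priority for t, priority in jobs) / total_priority


def within_budget(hourly_cost, budget):
    """Constraint from the formulation: Cost(app) <= B, in dollars/hour."""
    return hourly_cost <= budget


# Hypothetical jobs: a high-priority job that finished in 2 hours
# and a low-priority job that finished in 4 hours.
jobs = [(2.0, 3), (4.0, 1)]
score = weighted_turnaround(jobs)   # (2*3 + 4*1) / (3 + 1)
print(score, within_budget(hourly_cost=9.5, budget=10.0))
```

Because high-priority jobs carry more weight, delaying them hurts the objective more than delaying low-priority jobs by the same amount.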
Target - The service provider has a limited budget and
aims to maximize the application performance.
Solution idea – a monitor-control loop that
makes scaling and scheduling decisions based
on updated workload and VM information
7.
Scheduling-first
Idea – allocate application budget to individual jobs based on priorities
and schedule tasks within job budget
Step 1 – Distribute budget: $B_j = B \times p_j / \sum_j p_j$
Step 2 – Schedule tasks
for each job, schedule as many tasks as possible on their fastest machines
Step 3 – Consolidate budget
return job budget to the application
the application uses the remaining budget collected from individual jobs to schedule
high priority tasks
Step 4 – Acquire instance
acquire instances and execute tasks based on the determined schedule plans
Solution: scheduling-first
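Steps 1 to 3 of scheduling-first can be sketched as follows. This is a simplified illustration, not the authors' implementation: each task is reduced to a single hypothetical cost on its fastest machine, tasks that do not fit are simply left pending, and all job names and numbers are made up.

```python
def distribute_budget(budget, priorities):
    """Step 1: B_j = B * p_j / sum(p_j)."""
    total = sum(priorities.values())
    return {job: budget * p / total for job, p in priorities.items()}


def schedule_job(job_budget, tasks):
    """Step 2: place tasks on their fastest machine while the job budget lasts.

    tasks is a list of (task_name, cost_on_fastest_machine).
    Returns the placed task names and the unspent job budget.
    """
    placed, remaining = [], job_budget
    for name, cost in tasks:
        if cost <= remaining:
            placed.append(name)
            remaining -= cost
    return placed, remaining


def scheduling_first(budget, priorities, job_tasks):
    shares = distribute_budget(budget, priorities)
    plan, pool, pending = {}, 0.0, []
    for job, tasks in job_tasks.items():
        placed, leftover = schedule_job(shares[job], tasks)
        plan[job] = placed
        pool += leftover  # Step 3: unspent job budget returns to the application
        pending += [(priorities[job], job, t) for t in tasks if t[0] not in placed]
    # Step 3 (continued): spend the pooled budget on the highest-priority tasks first.
    for _, job, (name, cost) in sorted(pending, key=lambda item: -item[0]):
        if cost <= pool:
            plan[job].append(name)
            pool -= cost
    return plan, pool


priorities = {"a": 3, "b": 1}
job_tasks = {
    "a": [("a1", 4.0), ("a2", 3.0)],
    "b": [("b1", 2.0), ("b2", 1.0)],
}
plan, leftover = scheduling_first(10.0, priorities, job_tasks)
print(plan, leftover)
```

In this example job b cannot afford task b2 from its own share, but the pooled leftovers of both jobs are enough to place it during consolidation.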
9.
Solution: scaling-first
Scaling-first
Idea – determine the computing capacity by looking at the overall
workload and schedule tasks based on priority
Step 1 – determine the VMs
assume tasks run on their fastest machines and calculate the cost Cfast for the next
hour
acquire VMs proportionally based on Budget/Cfast
Step 2 – consolidate budget
use the remaining budget to purchase new machines
Step 3 – schedule tasks
schedule tasks based on task priority
10.
Solution: scaling-first
Scaling-first
Step 1 – determine the VMs
Step 2 – consolidate budget
Step 3 – schedule tasks
Step 1: assume tasks run on their fastest machines, calculate Cfast, and acquire VMs proportionally based on B/Cfast
Step 2: the remaining $0.5 can be used to purchase 1 L machine
Step 3: tasks are scheduled based on their priorities
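Steps 1 and 2 of scaling-first can be sketched like this. The VM type names, prices (in cents per hour, to keep the arithmetic exact), and demand figures are hypothetical, and the rule for spending the leftover budget (most expensive affordable type that is still under its demand) is an assumption; the slides only state that the remaining budget buys additional machines.

```python
def scaling_first(budget, demand, price):
    """Return (VM counts per type, unspent budget) for the next hour.

    demand[v] is the number of type-v VMs needed if every task ran on its
    fastest machine; price[v] is the hourly price of type v.
    """
    # Step 1: cost of the ideal plan, then scale it down by Budget / Cfast.
    cfast = sum(demand[v] * price[v] for v in demand)
    if cfast == 0:
        return {v: 0 for v in demand}, budget
    plan = {v: min(n, n * budget // cfast) for v, n in demand.items()}
    remaining = budget - sum(plan[v] * price[v] for v in plan)

    # Step 2: spend what is left on extra machines (assumed greedy rule).
    bought = True
    while bought:
        bought = False
        for v in sorted(price, key=price.get, reverse=True):
            if plan[v] < demand[v] and price[v] <= remaining:
                plan[v] += 1
                remaining -= price[v]
                bought = True
                break
    return plan, remaining


price = {"S": 10, "L": 40, "XL": 80}   # cents per hour, hypothetical
demand = {"S": 4, "L": 3, "XL": 2}     # Cfast = 320 cents
plan, leftover = scaling_first(200, demand, price)
print(plan, leftover)
```

Step 3 then schedules tasks onto the acquired VMs in priority order, which this sketch leaves out.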
11.
Instance consolidation
Schedule tasks on different VM types to save partial instance hour cost
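A small sketch of why consolidation saves money, assuming instances are billed per started hour. It uses the Standard and High-CPU prices from the evaluation setup (in cents); the task lengths are hypothetical, and the sketch further assumes the moved task takes no longer on the other VM type.

```python
import math


def billed_cost(busy_minutes, hourly_price):
    """Hourly billing: every started instance hour is charged in full."""
    return math.ceil(busy_minutes / 60) * hourly_price


STANDARD, HIGH_CPU = 8, 66  # cents per hour

# Two short tasks, each on its own instance: both instances are mostly idle.
separate = billed_cost(20, STANDARD) + billed_cost(30, HIGH_CPU)

# Move the 20-minute task into the idle part of the High-CPU instance hour.
consolidated = billed_cost(20 + 30, HIGH_CPU)

print(separate, consolidated, separate - consolidated)
```

The saving is the Standard instance hour that no longer has to be paid for.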
Budget allocation schemes
Evenly distributed – e.g. daily x/365, hourly x/8760
Based on workload – e.g. high on busy times, low on non-busy times
Workload prediction – $/hour → $/job
Other considerations
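The first two budget allocation schemes can be illustrated with a few lines of arithmetic. The function names, the yearly budget, and the expected-load figures are hypothetical.

```python
def even_daily_budget(yearly_budget):
    """Evenly distributed: x / 365 per day."""
    return yearly_budget / 365


def even_hourly_budget(yearly_budget):
    """Evenly distributed: x / 8760 per hour."""
    return yearly_budget / 8760


def workload_based_budget(total_budget, expected_load):
    """Split a budget across periods in proportion to the expected load,
    so busy periods get more and quiet periods get less."""
    total = sum(expected_load)
    return [total_budget * load / total for load in expected_load]


hourly = even_hourly_budget(87600.0)
by_load = workload_based_budget(100.0, [1, 3, 1])
print(hourly, by_load)
```

With a load profile of 1:3:1, the busy middle period receives three times the budget of each quiet period.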
12. Workload patterns
Application models
Time: 72 hours
Task execution: randomly generated
VM lag: 5 min
Evaluation – experiment setup
Baseline: Standard
VM Type       Price
Micro         $0.02/hour
Standard      $0.08/hour
High-CPU      $0.66/hour
High-Memory   $0.45/hour
Extra-Large   $1.30/hour
13.
Evaluation – job turnaround time
above – weighted average job turnaround time for the hybrid application and cycle workload pattern
Scheduling-first and scaling-first can save 9.8%–45.2% cost compared to the standard machine choice.
Scaling-first works better under small budget ranges, while scheduling-first works better under large budget ranges.
14.
Evaluation – sensitivity to inaccurate parameters
left – scheduling-first’s sensitivity to inaccurate parameters (Hybrid application + Cycle workload pattern)
right – scaling-first’s sensitivity to inaccurate parameters (Hybrid application + Cycle workload pattern)
When the estimation error is within ±20%, the job turnaround time changes by between -10.2% and 16.7%.
When the task estimation error reaches ±60%, the performance of both algorithms degrades significantly (more than ±25% difference).
15.
Evaluation – instance consolidation
left – job turnaround time / resource utilization of scheduling-first’s instance consolidation (Hybrid application + Cycle workload pattern)
right – job turnaround time / resource utilization of scaling-first’s instance consolidation (Hybrid application + Cycle workload pattern)
When the budget is low or high, the improvement is small. When the budget is in between, the improvement is more significant (e.g. the utilization rate improves by 2.2% to 19.9% when the budget is between $15/hour and $25/hour).
Scaling-first benefits more from the instance consolidation process than scheduling-first does.
16.
Conclusions
Choose appropriate VM types based on the workload.
Scheduling-first and scaling-first are trade-offs between the task execution time and
waiting time.
As long as the VM performance can be correctly ranked, the proposed mechanisms have
good tolerance to inaccurate parameters.
Instance consolidation is an efficient strategy to save partial instance hours and improve
resource utilization.
Future work
Other billing models – reserved instances, spot instances, $/min
Maximize application performance within budget constraints for data-intensive
applications
Hybrid and federated cloud environments
Develop evaluation benchmarks and simulation platforms
Conclusion and future work