Energy proportionality is key to reducing the Total Cost of Ownership (TCO) of Warehouse Scale Computer (WSC) systems, yet it is difficult to achieve in practice. Typical WSC hardware is not energy proportional. Furthermore, critical services (e.g. billing) require all servers to remain up regardless of the current traffic intensity. These two issues make existing power management techniques ineffective at reducing energy use at WSC scale. We present the Hybrid Performance-aware Power-capping Orchestrator (HyPPO), a distributed Observe Decide Act (ODA) control loop for optimizing the energy proportionality of distributed containerized infrastructures. This first version of HyPPO uses Kubernetes resource metrics (e.g. milli-CPU consumption) to dynamically adjust node power consumption while respecting the Service Level Agreement (SLA) defined by the containerized application owners.
1. HyPPO
Hybrid Performance-aware Power-capping Orchestration
Rolando Brondolin, Marco Arnaboldi, Sara Notargiacomo,
Tommaso Sardelli, Marco D. Santambrogio
{rolando.brondolin, marco.arnaboldi, sara.notargiacomo, marco.santambrogio}@polimi.it
tommaso.sardelli@mail.polimi.it
Sysdig, May 24th 2018
5. In a galaxy… not so far away
[Figure: Online retailer data intensive services scenario (OLDI). Computational resources (0 to 100) per period (Jan to Dec); annotations: Allocated resources, ENERGY WASTE.]
6. In a galaxy… not so far away
[Figure: Online retailer data intensive services scenario (OLDI). Computational resources (0 to 100) per period (Jan to Dec); annotation: Allocated resources.]
Energy proportionality is the key to reducing Total Cost of Ownership (TCO) in the datacenter
Server power consumption accounts for 20% of TCO [1]
50-90% under-utilization with OLDI and batch workloads together [2]
[1] Y. Cui, C. Ingalz, T. Gao, and A. Heydari, “Total cost of ownership model for data center technology evaluation,” in Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm), 2017 16th IEEE Intersociety Conference on. IEEE, 2017, pp. 936–942.
[2] L. A. Barroso, J. Clidaras, and U. Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines. Morgan & Claypool Publishers, 2013.
11. HyPPO in a nutshell
Hybrid Performance-aware Power-capping Orchestration
Automation through control techniques
Performance monitoring in order to reach a given SLO (Service Level Objective)
Energy proportionality achieved through DVFS (Dynamic Voltage and Frequency Scaling) techniques
HW approach for power capping and SW approach for performance-power correlation
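19. ODA + Kubernetes = Distributed ODA
[Figure: Kubernetes cluster with a Master and two Nodes, each Node running Pods, with API endpoints between them. HyPPO components: a monitoring agent and an actuator agent on each Node, the HyPPO Backend, and the Controller. Loop phases: Observe, Decide, Act.]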
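The Observe, Decide, Act phases named on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from HyPPO: the `StepController` class, its fixed-step policy, the `margin` band, and the `observe` and `act` callables, which stand in for the monitoring agent and the actuator agent. The slides do not describe the actual control law, so a deliberately simple one is used here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepController:
    """Decide phase: a simple fixed-step policy (illustrative assumption)."""
    slo: float            # target performance, higher is better
    min_cap_w: float      # lowest power cap allowed, in watts
    max_cap_w: float      # highest power cap allowed, in watts
    step_w: float = 5.0   # how far the cap moves per iteration
    margin: float = 0.10  # tolerated surplus over the SLO before saving power

    def decide(self, performance: float, current_cap_w: float) -> float:
        if performance < self.slo:
            # SLO at risk: give power back to the node.
            new_cap = current_cap_w + self.step_w
        elif performance > self.slo * (1.0 + self.margin):
            # Comfortable surplus: lower the cap to save power.
            new_cap = current_cap_w - self.step_w
        else:
            # Inside the tolerance band: hold the current cap.
            new_cap = current_cap_w
        # Never leave the allowed range of caps.
        return max(self.min_cap_w, min(self.max_cap_w, new_cap))


def oda_loop(observe: Callable[[float], float],
             controller: StepController,
             act: Callable[[float], None],
             initial_cap_w: float,
             iterations: int) -> List[float]:
    """Run Observe -> Decide -> Act and return the caps that were applied."""
    cap = initial_cap_w
    history = []
    for _ in range(iterations):
        performance = observe(cap)                 # Observe (monitoring agent)
        cap = controller.decide(performance, cap)  # Decide (controller)
        act(cap)                                   # Act (actuator agent)
        history.append(cap)
    return history
```

In a real deployment `observe` would read metrics collected on each node and `act` would apply the cap through a hardware power-capping interface; here both are plain callables so the loop can be exercised against a toy model of the node.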
36. Preliminary Results
Testbed
Kubernetes cluster composed of 2 homogeneous nodes
Node specs: Dell PowerEdge R720xd equipped with 2x Intel Xeon E5-2680 (Ivy Bridge) with 10 cores each (20 HT), clocked at 2.80 GHz, and with 380 GB of RAM
On-premise orchestration
Benchmark
Phoronix Test Suite version 1.7
37. Preliminary Results
[Figure: Apache CPU Opportunity Gap. CPU% (0 to 400) vs. Execution Time [s] (0 to 160); series: apache-cpu, CPU Request.]
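As a rough illustration of the quantities in this plot, the sketch below converts Kubernetes CPU quantity strings to millicores and computes, sample by sample, the difference between the CPU request and the measured usage. Reading the "opportunity gap" as that difference is an assumption about the plot, and the helper names `cpu_to_millicores` and `opportunity_gap` are hypothetical.

```python
from typing import List


def cpu_to_millicores(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity string to millicores.

    Handles plain cores ("2", "0.5") and the "m" (milli), "u" (micro)
    and "n" (nano) suffixes.
    """
    divisors = {"m": 1.0, "u": 1e3, "n": 1e6}
    if quantity and quantity[-1] in divisors:
        return float(quantity[:-1]) / divisors[quantity[-1]]
    # No suffix: the value is expressed in whole cores.
    return float(quantity) * 1000.0


def opportunity_gap(usage: List[str], request: str) -> List[float]:
    """Per-sample gap (request minus usage) in millicores.

    Positive values mean CPU was requested but not used; negative
    values mean usage exceeded the request.
    """
    request_m = cpu_to_millicores(request)
    return [request_m - cpu_to_millicores(sample) for sample in usage]
```

A CPU% axis like the one in the plot maps onto the same unit, since 100% of one core corresponds to 1000 millicores.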
41. Preliminary Results
[Figure: Apache CPU usage. CPU% (0 to 600) vs. Execution Time [s] (0 to 160); series: apache-cpu, apache-cpu-ctrl, CPU Request.]
[Figure: Apache Power consumed. Power [mW] (0 to 60000) vs. Execution Time [s] (0 to 80); series: apache-pw, apache-pw-ctrl.]
Preliminary tests conducted on different workloads showed power savings ranging from 5% to 45%.
To add: SLA violations
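One plausible way to turn two power traces such as apache-pw and apache-pw-ctrl into a saving percentage is to integrate each trace over time and compare the resulting energies. The sketch below does this with the trapezoidal rule; the slides do not state how the 5% to 45% figures were computed, so the method and the function names `energy_joules` and `saving_percent` are assumptions.

```python
from typing import Sequence


def energy_joules(times_s: Sequence[float], power_mw: Sequence[float]) -> float:
    """Integrate power samples (mW) over time (s) with the trapezoidal rule."""
    if len(times_s) != len(power_mw):
        raise ValueError("times_s and power_mw must have the same length")
    energy_mj = 0.0  # accumulated energy in millijoules
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy_mj += 0.5 * (power_mw[i] + power_mw[i - 1]) * dt
    return energy_mj / 1000.0  # millijoules to joules


def saving_percent(baseline_j: float, controlled_j: float) -> float:
    """Relative energy saving of the controlled run over the baseline, in %."""
    if baseline_j <= 0:
        raise ValueError("baseline energy must be positive")
    return 100.0 * (baseline_j - controlled_j) / baseline_j
```

Comparing energy rather than instantaneous power matters when the controlled run takes longer than the baseline, because a lower power draw over a longer run can still consume more energy.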
43. Thank you for your attention!
Rolando Brondolin
2nd Year PhD Student
Marco Arnaboldi
1st Year PhD Student
Sara Notargiacomo
Technology transfer mgr
Marco D. Santambrogio
Advisor
Tommaso Sardelli
M.Sc. student