Enabling Power-Awareness
For Multi-Tenant Systems
Candidate: Matteo FERRONI
Advisor: Marco D. Santambrogio
Tutor: Donatella Sciuto
Ph.D. Cycle: XXIX
Ph.D. in Information Technology: Final Dissertation
Politecnico di Milano — February, 17th 2017
POWER CONSUMPTION
Credits:	https://citizentv.co.ke/blogs/electricity-supply-is-fundamental-human-right-102340/
The battery of your smartphone does not last a day.
Credits:	http://www.mobileworld.it/2016/01/07/smartphone-ricarica-camminata-62171/
A data center needs to deal with power grid limits.
Credits:	https://resources.workable.com/systems-engineer-job-description
5
Context definition
Common features
(1) hardware heterogeneity
(2) software multi-tenancy
(3) input variability
Key facts:
• Energy budgets and power caps constrain the performance of the system
• The actual power consumption is affected by a plethora of different actors
(0) A bird's eye view
Problem definition and proposed approach
6
• Problems definition
A. How much power is a system going to consume, given certain working conditions?
B. How to control a system to consume less power, still satisfying its requirements?
• Assumption: the system will behave as it did in the past
• High-level approach:
1. Observe the behavior of the system during its real working conditions
2. Build accurate models to describe and predict it
3. Use them to refine decisions and meet goals efficiently
Idea: learn from experience
(0) A bird's eye view
7
Pragmatic methodology
Data-driven power-awareness through a holistic approach
• We start from raw data (power measurements, load traces, system stats, etc.)
• We are not interested in the physical components of the system: it is a black box
• We help users and systems to learn and predict their power needs
This should be done automatically throughout the whole lifetime of the system
(0) A bird's eye view
Outline
1. A first case study: power models for Android devices
2. Generalization: Model and Analysis of Resource Consumption (MARC)
3. Virtual guests monitoring: towards power-awareness for Xen
4. Modeling power consumption in multi-tenant virtualized systems
5. Maximizing performance under a power cap: a hybrid approach
6. Moving forward: containerization, challenges and opportunities
7. Conclusion and future work
8
MODEL / CONTROL
9
• We need to observe and model the phenomenon
The need for a model
[Figure: energy budget (%) over time; a power model of the energy behavior predicts the Time-To-Live (s) from the current instant]
9
(1) A first case study: power models for Android devices
10
Model as-a-Service
• Requirements:
• No monitoring and modeling overheads on the system itself
• Adapt to different systems/users, as well as to changes over time
• Proposed solution: Model-as-a-Service
a. Send raw traces to a remote server
b. Compute power models
c. Send back predictions and model parameters
(1) A first case study: power models for Android devices
Power-constrained system
11
Pragmatic approach
• Modeling approach: “divide et impera”
We observed a piecewise-linear behavior and attributed it to domain-specific features
[Diagram: Working regime A, Working regime B, Working regime C; arrows = actions on controllable variables; exogenous (uncontrollable) input]
(1) A first case study: power models for Android devices
12
Prediction performance w.r.t. SoA approaches
• Baseline
• Android L and Battery Historian (early 2015)
• Makes use of power models to estimate TTLs
• Performance reported for different models
• SM - one model for the user behavior for the whole day
• HM - one model for the user behavior for every hour of the day
• DM - subset of HM, merging similar hours of the day
• I% - Improvements w.r.t. Android L (AL)
(1) A first case study: power models for Android devices
average error values are reported ± standard deviations
MODEL
Outline
1. A first case study: power models for Android devices
2. Generalization: Model and Analysis of Resource Consumption (MARC)
3. Virtual guests monitoring: towards power-awareness for Xen
4. Modeling power consumption in multi-tenant virtualized systems
5. Maximizing performance under a power cap: a hybrid approach
6. Moving forward: containerization, challenges and opportunities
7. Conclusion and future work
13
CONTROL
14
A general methodology: the MARC approach
PHASE 1: Data Conditioning -> PHASES 2A / 2B / 2C: Signal Models, Markov Models, ARX Models -> PHASE 3: Integration
[Figures: (i) traced battery level and battery discharge over time; (ii) traced power and energy consumption with piecewise linear approximations and sudden slope changes; (iii) traced power and energy consumption with one linear approximation per working regime (IDLE, I/O, MEM, CPU)]
(2) Generalization: Model and Analysis of Resource Consumption (MARC)
• MARC (Model and Analysis of Resource Consumption) is a REST platform
that is able to build resource consumption models in an “as-a-service” fashion
15
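A minimal sketch of the as-a-service round trip, assuming a hypothetical MARC deployment URL and illustrative /traces and /models endpoints (the platform's actual REST API is not shown in these slides):

import requests  # third-party HTTP client

MARC_URL = "http://marc.example.org/api"   # hypothetical deployment URL

def upload_trace(samples):
    """(a) Send raw power/load/system-stat samples to the remote modeling service."""
    resp = requests.post(MARC_URL + "/traces", json={"samples": samples})
    resp.raise_for_status()
    return resp.json()["trace_id"]

def fetch_model(trace_id):
    """(b)+(c) Ask the service to compute a model and return its parameters."""
    resp = requests.get(MARC_URL + "/models/%s" % trace_id)
    resp.raise_for_status()
    return resp.json()          # e.g., regime thresholds and per-regime coefficients

if __name__ == "__main__":
    trace_id = upload_trace([{"t": 0.0, "power_w": 31.2, "inst_ret": 1.1e9}])
    print(fetch_model(trace_id))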
A model for each configuration
Autoregressive Models with Exogenous Variables (ARX)
[Figure: traced power and energy consumption over time, with one linear approximation per working regime (IDLE, I/O, MEM, CPU)]
FOR EACH WORKING REGIME: a model is computed to characterize the process
(2) Generalization: Model and Analysis of Resource Consumption (MARC)
16
Predicting configuration switches
Hidden Markov Models (HMM)
BY OBSERVING PERIODICITY: a predictive configuration-switching model is computed
[Figure: traced power and energy consumption over time, with linear approximations and sudden slope changes marking configuration switches]
(2) Generalization: Model and Analysis of Resource Consumption (MARC)
Tackling the residual non-linearity
17
Signal Models and Time Series Analysis
WITHIN EACH WORKING REGIME: the residual non-linearity is addressed by exploiting time-series analysis
[Figure: traced battery level and battery discharge over time]
(2) Generalization: Model and Analysis of Resource Consumption (MARC)
MODEL / CONTROL
Outline
1. A first case study: power models for Android devices
2. Generalization: Model and Analysis of Resource Consumption (MARC)
3. Virtual guests monitoring: towards power-awareness for Xen
4. Modeling power consumption in multi-tenant virtualized systems
5. Maximizing performance under a power cap: a hybrid approach
6. Moving forward: containerization, challenges and opportunities
7. Conclusion and future work
18
19
Use case: Power consumption models for Xen domains
[Diagram: the Xen architecture. The Xen hypervisor (config, scheduler, MMU, timers, interrupts) runs on the hardware (CPU, memory, I/O); Dom0 (kernel, drivers, PV backends, toolstack) and the paravirtualized guest domains (Dom1, Dom2, ..., DomU) run on top of it]
THE XEN HYPERVISOR: a Type 1 hypervisor currently employed in many production environments
• Question: “how much is a virtual tenant consuming?”
(3) Virtual guests monitoring: towards power-awareness for Xen
20
Use case: Power consumption models for Xen domains
[Same Xen architecture diagram as above]
ASSUMPTION
“The power consumption of a system
depends on what the hardware is doing”
• Proposed solution: model the virtual tenants' power consumption by exploiting hardware event traces, collected and attributed to each of them
(3) Virtual guests monitoring: towards power-awareness for Xen
Tracing the Domains’ behavior
21
[Diagram: at every context switch on each core (Core 0 ... Core N), the Xen kernel traces hardware events per core and energy per socket; the XeMPowerDaemon in Dom0 aggregates them per domain and exposes them through XeMPowerCLI]
XEMPOWER
Collect and account hardware events
to virtual tenants in two steps:
1. In the Xen scheduler (kernel-level)
• At every context switch, trace the
interesting hardware events
• e.g., INST_RET,
UNHALTED_CLOCK_CYCLES,
LLC_REF, LLC_MISS
2. In Domain 0 (privileged tenant)
• Periodically acquire the event traces and aggregate them on a per-domain basis (see the sketch below)
(3) Virtual guests monitoring: towards power-awareness for Xen
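A minimal Python sketch of the Dom0-side aggregation step, assuming the kernel-level instrumentation emits one record per context switch with the outgoing domain ID and the PMC deltas; the record format below is an illustrative assumption, not XeMPower's actual trace layout:

from collections import defaultdict

EVENTS = ("INST_RET", "UNHALTED_CLOCK_CYCLES", "LLC_REF", "LLC_MISS")

def aggregate(records):
    """Accumulate per-context-switch PMC deltas into per-domain totals."""
    totals = defaultdict(lambda: dict.fromkeys(EVENTS, 0))
    for rec in records:                       # one record per context switch
        dom = totals[rec["dom_id"]]
        for ev in EVENTS:
            dom[ev] += rec.get(ev, 0)
    return dict(totals)

# Example: two context-switch records attributed to Dom1 and Dom2
records = [
    {"dom_id": 1, "core": 0, "INST_RET": 1200000, "UNHALTED_CLOCK_CYCLES": 2500000,
     "LLC_REF": 40000, "LLC_MISS": 5000},
    {"dom_id": 2, "core": 1, "INST_RET": 800000, "UNHALTED_CLOCK_CYCLES": 1900000,
     "LLC_REF": 30000, "LLC_MISS": 9000},
]
print(aggregate(records))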
Outline
1. A first case study: power models for Android devices
2. Generalization: Model and Analysis of Resource Consumption (MARC)
3. Virtual guests monitoring: towards power-awareness for Xen
4. Modeling power consumption in multi-tenant virtualized systems
5. Maximizing performance under a power cap: a hybrid approach
6. Moving forward: containerization, challenges and opportunities
7. Conclusion and future work
22
MODELCONTROL
23
Power models: State-of-Art approaches
Workload classes:
(a) idle
(b) weak I/O intensive
(c) memory intensive
(d) CPU intensive
(e) strong I/O intensive
Use a single power model, built on different hardware events:
A. INST_RET, UNHALTED_CLOCK_CYCLES, LLC_REF, LLC_MISS
B. INST_RET, UNHALTED_CLOCK_CYCLES, LLC_REF
C. UNHALTED_CLOCK_CYCLES, LLC_REF 

Configuration | Model A (RMSE, rel. error) | Model B (RMSE, rel. error) | Model C (RMSE, rel. error)
(a)           | ± 17.63 W, 35.56%          | ± 16.44 W, 32%             | ± 17.68 W, 35%
(b)           | ± 4.7 W, 9.4%              | ± 5.86 W, 11.7%            | ± 7.17 W, 14%
(c)           | ± 19.11 W, 38%             | ± 34.54 W, 70%             | ± 18.7 W, 37%
(d)           | ± 0.44 W, 0.08%            | ± 0.6 W, 1.2%              | ± 0.42 W, 0.08%
(e)           | ± 2.98 W, 5.9%             | ± 38.57 W, 77%             | ± 3.29 W, 6.5%
average       | ± 8.97 W, 17.79%           | ± 19.20 W, 38.38%          | ± 9.45 W, 18.52%
Table 6.9: The modelling errors (Root MSE and mean relative error) obtained with state-of-the-art approaches
• The model with the best average error is the worst on a single configuration
• No model is consistently better than the others across all the configurations
(4) Modeling power consumption in multi-tenant virtualized systems
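For reference, a minimal sketch of how such a single linear power model (e.g., model A) can be fitted from hardware-event counts and measured socket power with ordinary least squares; the sample values are made up for illustration:

import numpy as np

# Columns: INST_RET, UNHALTED_CLOCK_CYCLES, LLC_REF, LLC_MISS (per sampling window)
X = np.array([
    [1.2e9, 2.5e9, 4.0e7, 5.0e6],
    [3.4e9, 3.1e9, 9.0e7, 2.1e7],
    [0.4e9, 1.1e9, 1.0e7, 1.0e6],
    [2.2e9, 2.9e9, 6.5e7, 1.2e7],
])
y = np.array([38.0, 62.5, 27.0, 51.0])          # measured socket power [W]

# Add an intercept term (idle power) and solve the least-squares problem
A = np.hstack([X, np.ones((X.shape[0], 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

predicted = A @ coef
rmse = np.sqrt(np.mean((predicted - y) ** 2))
print("coefficients:", coef, "RMSE: %.2f W" % rmse)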
Power modeling flow
24(4) Modeling power consumption in multi-tenant virtualized systems
[Diagram: the power modeling flow, from raw traces to model exploitation]
25
• Goals of the experiments:
A. assess the precision of the modeling methodology
B. explore model portability on different hardware platforms
C. evaluate colocation of different tenants
• Benchmarks
– Apache Spark (SVM and PageRank)
– Redis (Memory-intensive)
– MySQL and Cassandra (IO-intensive)
– FFmpeg (CPU-intensive)
Experimental evaluation
(4) Modeling power consumption in multi-tenant virtualized systems
• Experimental setup
– A. WRK: Intel Core i7 @ 3.40GHz, 8GB DDR3 RAM
– B. SRV1: Intel Xeon @ 2.80GHz, 16GB DDR3 RAM
– C. SRV2: two Intel Xeon @ 2.3GHz, 128GB DDR4 RAM
26(4) Modeling power consumption in multi-tenant virtualized systems
• RMSE around 1W on average,
under 2W in almost all the cases;
• only three results present a
worse behavior (still under 5W)
• Relative error around 2% on
average, under 4% in almost all
the cases
• only three results present a
worse behavior (still under 10%)
Results generally outperform the
works in literature [1,2,3], even in
the worst cases
[1] Anton Beloglazov, Rajkumar Buyya, Young Choon Lee, Albert Zomaya, et al. A taxonomy and survey of energy-efficient data centers and cloud computing systems. Advances in Computers, 82(2):47–111, 2011
[2] W Lloyd Bircher and Lizy K John. Complete system power estimation: A trickle-down approach based on performance events. In Performance Analysis of Systems & Software, 2007. ISPASS
2007. IEEE International Symposium on, pages 158–168. IEEE, 2007
[3] Hailong Yang, Qi Zhao, Zhongzhi Luan, and Depei Qian. imeter: An integrated vm power model based on performance profiling. Future Generation Computer Systems, 36:267–286, 2014.
Outline
1. A first case study: power models for Android devices
2. Generalization: Model and Analysis of Resource Consumption (MARC)
3. Virtual guests monitoring: towards power-awareness for Xen
4. Modeling power consumption in multi-tenant virtualized systems
5. Maximizing performance under a power cap: a hybrid approach
6. Moving forward: containerization, challenges and opportunities
7. Conclusion and future work
27
MODEL / CONTROL
Problem definition
28(5) Maximizing performance under a power cap: a hybrid approach
• Two points of view:
A. minimize power consumption given a minimum performance requirement
B. maximize performance given a limit on the maximum power consumption
• Requirements:
– work in a virtualized environment
– avoid instrumentation of the guest workloads
• Steps towards the goal:
1. identify a performance metric for all the hosted tenants
2. define a resource allocation policy to deal with the requirements
3. extend the hypervisor to provide the right knobs
(5) Maximizing performance under a power cap: a hybrid approach
Power capping approaches
29
SOFTWARE APPROACH: ✓ efficiency, ✖ timeliness
HARDWARE APPROACH: ✖ efficiency, ✓ timeliness
[Diagram: power-capping techniques placed between the two extremes: model-based monitoring [3], thread migration [2], resource management, CPU quota, DVFS [4], RAPL [1]]
[1] H. David, E. Gorbatov, U. R. Hanebutte, R. Khanna, and C. Le. RAPL: Memory power estimation and capping. In International Symposium on Low Power Electronics and Design (ISLPED), 2010.
[2] R. Cochran, C. Hankendi, A. K. Coskun, and S. Reda. Pack & Cap: adaptive DVFS and thread packing under power caps. In International Symposium on Microarchitecture (MICRO), 2011.
[3] M. Ferroni, A. Cazzola, D. Matteo, A. A. Nacci, D. Sciuto, and M. D. Santambrogio. MPower: gain back your Android battery life! In Proceedings of the 2013 ACM Conference on Pervasive and Ubiquitous Computing Adjunct Publication, pages 171–174. ACM, 2013.
[4] T. Horvath, T. Abdelzaher, K. Skadron, and X. Liu. Dynamic voltage scaling in multitier web servers with end-to-end delay control. IEEE Transactions on Computers, 2007.
30
[5] H. Zhang and H. Hoffmann. Maximizing performance under a power cap: A comparison of hardware, software, and hybrid techniques. In International Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS), 2016.
HYBRID APPROACH [5]
✓ efficiency
✓ timeliness
(5) Maximizing performance under a power cap: a hybrid approach
Power capping approaches
[Same diagram as above: software vs. hardware power-capping approaches]
31(5) Maximizing performance under a power cap: a hybrid approach
System design
• The workloads run in paravirtualized domains
32(5) Maximizing performance under a power cap: a hybrid approach
System design
• XeMPUPiL spans all the layers
33(5) Maximizing performance under a power cap: a hybrid approach
System design
• The Instructions Retired (IR) metric is gathered and accounted to each domain, thanks to XeMPower
• The aggregation is done over a time window of 1 second
34(5) Maximizing performance under a power cap: a hybrid approach
System design
• Observation of both hardware events (i.e., IR) and power consumption (whole CPU socket)
35(5) Maximizing performance under a power cap: a hybrid approach
System design
36
– given a workload with M virtual resources
and an assignment of N physical resources,
to each pCPUi we assign:
(5) Maximizing performance under a power cap: a hybrid approach
System design
• Hybrid actuation:
– enforce power cap via RAPL
– define a CPU pool for the workload and pin workload’s vCPUs over pCPUs
37(5) Maximizing performance under a power cap: a hybrid approach
System design
38
• Hybrid actuation:
– enforce power cap via RAPL
– define a CPU pool for the workload and pin workload’s vCPUs over pCPUs
(5) Maximizing performance under a power cap: a hybrid approach
System design
39
• Hybrid actuation:
– enforce power cap via RAPL
– define a CPU pool for the workload and pin workload’s vCPUs over pCPUs
(5) Maximizing performance under a power cap: a hybrid approach
System design
40
• Goals of the experiments:
A. how do different workloads perform under a power cap?
B. can we achieve higher efficiency w.r.t. RAPL power cap?
• Benchmarks
– Embarrassingly Parallel (EP)
– IOzone
– cachebench
– Block Tri-diagonal solver (BT)
• Three power caps explored: 40W, 30W and 20W
• Results are normalized w.r.t. the performance obtained with no power caps
(5) Maximizing performance under a power cap: a hybrid approach
Experimental evaluation
• Experimental setup
– 2.8-GHz quad-core Intel Xeon
– 32GB of RAM
– Xen hypervisor version 4.4
41
[Chart: normalized performance of EP, cachebench, IOzone and BT under no cap (NO RAPL) and RAPL caps of 40W, 30W and 20W]
• Preliminary evaluation: how do they perform under a power cap?
(5) Maximizing performance under a power cap: a hybrid approach
42
[Same chart as above]
• Preliminary evaluation: how do they perform under a power cap?
• For CPU-bound benchmarks (i.e., EP and BT), the differences are significant
(5) Maximizing performance under a power cap: a hybrid approach
43
[Same chart as above]
• Preliminary evaluation: how do they perform under a power cap?
• With IO- and/or memory-bound workloads, the performance degradation across different power caps is less significant
(5) Maximizing performance under a power cap: a hybrid approach
44
[Charts: normalized performance of EP, cachebench, IOzone and BT with XeMPUPiL (PUPiL) vs. RAPL, at 40W, 30W and 20W power caps]
• Performance of the
workloads with
XeMPUPiL, for different
power caps:
– higher performance
than RAPL, in general
– not always true on a
pure CPU-bound
benchmark (i.e., EP)
(5) Maximizing performance under a power cap: a hybrid approach
45
[Same charts and observations as the previous slide]
(5) Maximizing performance under a power cap: a hybrid approach
46
[Same charts and observations as the previous slide]
(5) Maximizing performance under a power cap: a hybrid approach
Outline
1. A first case study: power models for Android devices
2. Generalization: Model and Analysis of Resource Consumption (MARC)
3. Virtual guests monitoring: towards power-awareness for Xen
4. Modeling power consumption in multi-tenant virtualized systems
5. Maximizing performance under a power cap: a hybrid approach
6. Moving forward: containerization, challenges and opportunities
7. Conclusion and future work
47
MODEL / CONTROL
Containerization: opportunities and challenges
48(6) Moving forward: containerization, challenges and opportunities
A different road to multi-tenancy
• Group the application and all its dependencies in a single container
• The host operating system sees a container as a group of processes
Proposed solution
• A power-aware orchestrator for Docker containers
Manage resources to meet the power consumption goal
• A policy-based system
Guarantee performance of the containers while staying under the power cap
DockerCap: system design
49
[Diagram: DockerCap architecture. An Observe Component (power samples from RAPL, container stats from Docker) feeds an Observe Queue; a Decide Component (Policy 1/2/3) feeds an Act Queue; an Act Component applies the decisions through cgroups]
(6) Moving forward: containerization, challenges and opportunities
DockerCap: system design
50
Power samples from
Intel RAPL
Resource allocation of the
containers from Docker and
cgroups
[DockerCap architecture diagram, as above]
(6) Moving forward: containerization, challenges and opportunities
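A minimal sketch of the observe step, assuming the RAPL package energy counter is exposed at /sys/class/powercap/intel-rapl:0/energy_uj and that Docker uses cgroups v1 with the cgroupfs driver; both paths are assumptions that depend on the host configuration, not DockerCap's actual implementation:

import time

RAPL_ENERGY = "/sys/class/powercap/intel-rapl:0/energy_uj"   # package energy counter (microjoules), assumed path

def read_energy_uj():
    with open(RAPL_ENERGY) as f:
        return int(f.read())

def sample_power(interval_s=1.0):
    """Average package power over one observation window, in Watts."""
    e0, t0 = read_energy_uj(), time.time()
    time.sleep(interval_s)
    e1, t1 = read_energy_uj(), time.time()
    return (e1 - e0) / 1e6 / (t1 - t0)   # note: ignores counter wrap-around

def read_cpu_quota(container_id):
    """Current CPU quota of a container (cgroups v1, cgroupfs driver, assumed layout)."""
    path = "/sys/fs/cgroup/cpu/docker/%s/cpu.cfs_quota_us" % container_id
    with open(path) as f:
        return int(f.read())

print("package power: %.1f W" % sample_power())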
DockerCap: system design
51
[DockerCap architecture diagram, as above; the Decide Component is split into Resource Partitioning and Resource Control]
(6) Moving forward: containerization, challenges and opportunities
DockerCap: system design
52
[DockerCap architecture diagram, as above]
Actuation through the
cgroup hierarchy for
each container
(6) Moving forward: containerization, challenges and opportunities
DockerCap: system design
53
[DockerCap architecture diagram, as above]
(6) Moving forward: containerization, challenges and opportunities
54
• Goals of the experiments:
A. is the software-level power cap stable and precise?
B. are we able to meet the performance requirements of the containers?
• Benchmarks
– fluidanimate (fluid simulation)
– x264 (video encoding)
– dedup (compression)
• Three power caps explored: 40W, 30W and 20W
• All the benchmark containers run simultaneously on the same node
• Baseline: Intel RAPL power capping solution
Experimental evaluation
• Experimental setup
– 2.8-GHz quad-core Intel Xeon
– 32GB of RAM
– Docker 1.11.2
(6) Moving forward: containerization, challenges and opportunities
55
[Charts for power caps of 40W, 30W and 20W, running dedup, fluidanimate and x264 simultaneously]
(6) Moving forward: containerization, challenges and opportunities
• Comparison between performance-agnostic approaches: Fair partitioning policy vs. RAPL
• Performance metric: Time To Completion (lower is better)
• Comparable performance, better results on lower power caps
[Charts: Time To Completion under power caps of 40W, 30W and 20W]
56 (6) Moving forward: containerization, challenges and opportunities
[Charts: Time To Completion of dedup, fluidanimate and x264 for all policies, under power caps of 40W, 30W and 20W]
• Comparing fair and performance-aware approaches
• Performance metric: Time To Completion (lower is better)
57 (6) Moving forward: containerization, challenges and opportunities
[Charts: Time To Completion of dedup, fluidanimate and x264 for all policies, under power caps of 40W, 30W and 20W]
58
• Comparing fair and performance-aware approaches
• Performance metric: Time To Completion (lower is better)
• fluidanimate is set to High Priority with an SLO of 400s
(6) Moving forward: containerization, challenges and opportunities
Conclusion
1. A first case study: power models for Android devices
Better performance w.r.t. Android L predictions
2. Generalization: Model and Analysis of Resource Consumption (MARC)
Modeling pipeline has been generalized and provided “as-a-service”
3. Virtual guests monitoring: towards power-awareness for Xen
HW events are traced with negligible overhead on the system
4. Modeling power consumption in multi-tenant virtualized systems
Better performance w.r.t. SoA approaches
5. Maximizing performance under a power cap: a hybrid approach
Better performance w.r.t. standard RAPL power cap
6. Moving forward: containerization, challenges and opportunities
Promising results towards a performance-aware and power-aware orchestration
59
MODEL / CONTROL
• We want to validate the modeling methodology on different resources
• Time-to-Completion of Hadoop jobs
60
Future Work
• We want to exploit these models to:
• detect anomalies in a distributed
microservice infrastructure
• perform better resource allocation and
consolidation
61
thanks. questions?
- XEN MODELS -
DETAILS ON WORKING REGIMES
63(4) Modeling power consumption in multi-tenant virtualized systems
Working Regime identification
64
A single model is not enough: we explored the MARC approach
Question: What is a working regime in this case study?
Identified a posteriori by looking at the different slopes on the trace graph
[Chart: traced power and energy consumption over time (0s-4000s)]
(4) Modeling power consumption in multi-tenant virtualized systems
KERNEL DENSITY ESTIMATION (KDE): by observing the local minima of the reconstructed distribution of power consumption, we identify the points where a Working Regime change happens (see the sketch below)
Working Regime identification - How many are they?
65
[Charts: traced power and energy consumption over time; reconstructed probability density of power consumption]
LINEAR RANGES: 0: [0W, 42W), 1: [42W, 57W), 2: [57W, +∞)
(4) Modeling power consumption in multi-tenant virtualized systems
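A minimal sketch of this regime-identification step: a Gaussian KDE of the power samples is built and the local minima of the estimated density are taken as regime boundaries (assumes scipy and numpy; the power trace is synthetic):

import numpy as np
from scipy.stats import gaussian_kde

# Synthetic power trace with three regimes around ~35W, ~50W and ~65W
rng = np.random.default_rng(0)
power = np.concatenate([rng.normal(35, 2, 500),
                        rng.normal(50, 2, 300),
                        rng.normal(65, 2, 200)])

kde = gaussian_kde(power)
grid = np.linspace(power.min(), power.max(), 512)
density = kde(grid)

# Local minima of the density = boundaries between working regimes
is_min = (density[1:-1] < density[:-2]) & (density[1:-1] < density[2:])
boundaries = grid[1:-1][is_min]
print("regime boundaries (W):", np.round(boundaries, 1))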
From hardware events to Working Regimes (1)
66
[Chart: ReliefF weights of the 32 monitored features]
RELIEFF + KDE
1. ReliefF is used to identify which features best induce the Working Regime classification identified before
(4) Modeling power consumption in multi-tenant virtualized systems
2. For each Working Regime, the distribution of the values of that feature is reconstructed using KDE
3. The distributions are compared to obtain discriminant values (see the sketch below)
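A minimal sketch of steps 2 and 3, assuming the feature samples are already labelled with their Working Regime: the per-class densities are estimated with a Gaussian KDE and the points where adjacent densities cross are taken as discriminant values (synthetic data, illustrative only):

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
# Feature values (e.g., a PMC) observed in two working regimes
class0 = rng.normal(1.0e9, 1.5e8, 400)
class1 = rng.normal(2.5e9, 2.0e8, 400)

grid = np.linspace(0, 4e9, 2000)
d0, d1 = gaussian_kde(class0)(grid), gaussian_kde(class1)(grid)

# Discriminant values: where the two class densities cross
crossings = grid[np.where(np.diff(np.sign(d0 - d1)) != 0)[0]]
print("discriminant feature values:", crossings)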
From hardware events to Working Regimes (2)
67
[Charts: ReliefF feature weights; per-class KDE of the selected feature (PMC values) for classes 0, 1 and 2]
RELIEFF + KDE
(4) Modeling power consumption in multi-tenant virtualized systems
RESULT:
A Working Regime classifier that is able to determine in which
Working Regime the system is, starting from the sampled features
From hardware events to Working Regimes (3)
68
[Charts: ReliefF feature weights; per-class KDE of the selected feature (PMC values) for classes 0, 1 and 2]
Classification by INST_RET:
Class 0: INST_RET ∈ [0, 1.235e9]
Class 1: INST_RET ∈ (1.235e9, 3.61e9) or [3.61e9, 5.58e9)
Class 2: INST_RET ∈ (1.235e9, 3.61e9) or [5.58e9, +∞)
RELIEFF + KDE
(4) Modeling power consumption in multi-tenant virtualized systems
From hardware events to Working Regimes (4)
69
RELIEFF + KDE
[Charts: ReliefF feature weights; per-class KDE of the selected feature (PMC values) for classes 0, 1 and 2]
Classification by INST_RET and L1_HIT:
Class 0: INST_RET ∈ [0, 1.235e9]
Class 1: INST_RET ∈ (1.235e9, 3.61e9) with L1_HIT ∈ (2.36362e8, 5.672e8), or INST_RET ∈ [3.61e9, 5.58e9)
Class 2: INST_RET ∈ (1.235e9, 3.61e9) with L1_HIT ∈ [0, 2.36362e8] or [5.672e8, +∞), or INST_RET ∈ [5.58e9, +∞)
• In case of uncertainty, repeat from ReliefF, eliminating the already-selected features and all the data outside the uncertain zone
(4) Modeling power consumption in multi-tenant virtualized systems
70(4) Modeling power consumption in multi-tenant virtualized systems
71(4) Modeling power consumption in multi-tenant virtualized systems
72(4) Modeling power consumption in multi-tenant virtualized systems
- DOCKERCAP -
IMPLEMENTATION DETAILS
74
Decide phase
[DockerCap architecture diagram, as above; the Decide Component is split into Resource Partitioning and Resource Control]
75
Control theory
76
Resource control
100%
CPU quota cap
available resource
Using a feedback control loop, we find the allocation of resources that ensures the power cap (see the sketch below)
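A minimal sketch of this idea, using a simple proportional controller that shrinks or grows the total CPU quota according to the error between the measured power and the cap; the gain and bounds are illustrative assumptions, not DockerCap's actual controller:

def control_step(power_cap_w, measured_power_w, current_quota, k_p=0.02,
                 min_quota=0.1, max_quota=1.0):
    """One iteration of the Observe-Decide-Act loop.

    current_quota is the total CPU quota as a fraction of the machine (0..1).
    Returns the new quota to be partitioned across the containers.
    """
    error = power_cap_w - measured_power_w        # >0: headroom, <0: over the cap
    new_quota = current_quota + k_p * error       # proportional correction
    return max(min_quota, min(max_quota, new_quota))

# Example: 30W cap, currently measuring 34W with 80% of the CPU allocated
print(control_step(30.0, 34.0, 0.8))   # quota shrinks below 0.8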
77
Decide phase
[DockerCap architecture diagram, as above; the Decide Component is split into Resource Partitioning and Resource Control]
78
Resource partitioning
Containers: C1 C2 C3 C4
?
We explore three different partitioning policies:
• Fair resource partitioning
• Priority-aware resource partitioning
• Throughput-aware resource partitioning
100%
CPU quota cap
available resource
• The quota Q is evenly partitioned across all the containers
• No control over the throughput of a single container
79
1. Fair resource partitioning
100%
CPU quota cap
Containers: C1 C2 C3 C4
Q/4 Q/4 Q/4 Q/4
80
2. Priority-aware partitioning
100%
CPU quota cap
Containers:
• The quota Q is partitioned following the priority of each container
• The quota of each container is computed through a weighted mean, where every priority has its own associated weight
High priority:
Low priority:
C1
HIGH LOW
C2 C3 C4
LOW LOW
81
3. Throughput-aware resource partitioning
C3 C4Best effort:
+ SLO1
+ SLO2
SLO1 SLO2
100%
CPU quota cap
Containers:
High priority:
Low priority:
C1
C2
BE BE
• The quota Q is partitioned following the priority of each container and its Service Level Objective (SLO)
• The SLO is here defined as the Time-To-Completion (TTC) of the task (a sketch of the three policies follows)
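A minimal sketch of the three partitioning policies, assuming the total quota Q has already been fixed by the resource-control loop; the priority weights and the SLO-driven reservation are illustrative assumptions, not DockerCap's exact formulas:

def fair(q, containers):
    """Fair partitioning: Q evenly split across all containers."""
    share = q / len(containers)
    return {c: share for c in containers}

def priority_aware(q, priorities, weights={"HIGH": 3.0, "LOW": 1.0}):
    """Priority-aware partitioning: weighted split according to each priority."""
    total = sum(weights[p] for p in priorities.values())
    return {c: q * weights[p] / total for c, p in priorities.items()}

def throughput_aware(q, hp_container, hp_share, best_effort):
    """Throughput-aware partitioning: reserve enough quota for the high-priority
    container to meet its SLO (TTC), split the rest among best-effort containers."""
    reserved = min(hp_share, q)
    leftover = (q - reserved) / max(len(best_effort), 1)
    alloc = {c: leftover for c in best_effort}
    alloc[hp_container] = reserved
    return alloc

q = 0.6  # 60% of the machine's CPU available under the current power cap
print(fair(q, ["C1", "C2", "C3", "C4"]))
print(priority_aware(q, {"C1": "HIGH", "C2": "LOW", "C3": "LOW", "C4": "LOW"}))
print(throughput_aware(q, hp_container="C1", hp_share=0.3, best_effort=["C2", "C3", "C4"]))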
82
Experimental setup
All the benchmark containers run simultaneously on the same node
HW: Intel Xeon E5-1410, 32GB RAM; OS: Ubuntu 14.04, Linux 3.19.0-42; Container engine: Docker 1.11.2; Runtime: Python 2.7.6
BENCHMARK CONTAINERS (PARSEC):
• fluidanimate: fluid dynamics simulation (generic CPU-bound)
• x264: video stream encoding (e.g., video surveillance)
• dedup: compression (cloud-fog communication)
83
Goals of the experiments
The comparison is done with the state-of-the-art power capping solution RAPL by Intel [1]
• PERFORMANCE OF THE CONTAINERS: allocate resources to meet containers' requirements
• PRECISION OF THE POWER CAPPING: manage the machine's power consumption
84
Precision of the power capping
• Comparable results in terms
of average power
consumption under the
power cap
• As expected, RAPL provides
a more stable power capping
•Fair
•Priority-aware
•Throughput-aware
•RAPL
85
Performances: Fair Partitioning vs RAPL
• Comparison between the
performance-agnostic approaches
• Performance metric:
Time To Completion
(lower is better)
[Charts: Time To Completion of dedup, fluidanimate and x264 under power caps of 40W, 30W and 20W]
86
Performances: all policies
•dedup
•fluidanimate
•x264
• Comparison with the
performance-aware
approaches
• fluidanimate is set to
High Priority with a SLO
of 400s
• Performance metric:
Time To Completion
(lower is better)
87
Conclusion and future work
✓We presented DockerCap, a power-aware orchestrator
that manages containers’ resources
✓We showed how DockerCap is able to limit the power
consumption of the machine
✓We discussed three distinct partitioning policies and
compared their impact on containers’ SLO
FUTURE DIRECTIONS
• Exploit both HW and SW power capping
• Improve the precision of the power capping with
more refined modeling techniques [2]
• Compute the right allocation of resources online by
observing the performance of the containers
[2] Andrea Corna and Andrea Damiani. A scalable framework for resource consumption modelling: the MARC approach.
Master’s thesis. Politecnico di Milano, 2016. 

- XEN MODELS -
PRELIMINARY RESULTS
Experimental settings and benchmarks
89
“XARC1” (Dell OptiPlex 990):
- Processor: Intel Core i7-2600 @ 3.40GHz
- Memory: 4 banks of Synchronous 2GB DIMM DDR3 RAM @ 1.33GHz
- Storage: Seagate 250GB 7200rpm 8MB cache SATA 3.5” HDD
- Network: Intel 82579LM Gigabit Network Connection
“SANDY” (Dell PowerEdge T320):
- Processor: Intel Xeon CPU E5-1410 @ 2.80GHz
- Memory: 2 banks of Synchronous 16GB DIMM DDR3 RAM @ 1.60GHz
- Storage: Western Digital 250GB 7200rpm 16MB cache SATA 3.5” HDD
- Network: Broadcom NetXtreme BCM5720 Gigabit Ethernet PCIe
[1] YANG, Hailong, et al. iMeter: An integrated VM power model based on
performance profiling. Future Generation Computer Systems, 2014, 36:
267-286.
TRAIN SET: Micro Benchmarks [1]
• NAS Parallel Benchmarks: CPU/memory features
• Cachebench: cache hierarchy
• IOzone: disk I/O operations
TEST SET: Realistic Benchmarks
• Redis Server: non-relational DBMS queries
• MySQL Server: relational DBMS queries
• FFMPEG: audio/video transcoding and compression
Power models: the MARC approach (1)
90
TRAIN AND TEST ON THE SAME PHYSICAL MACHINE
            RMSE     Relative error   Coverage
SANDY
  Redis     ±0.58W   1.10%            100.00%
  MySQL     ±1.94W   3.80%            100.00%
  FFMPEG    ±0.51W   1.00%            100.00%
XARC1
  Redis     ±2.07W   4.14%            100.00%
  MySQL     ±9.27W   18.5%            100.00%
  FFMPEG    ±1.32W   2.64%            99.90%
LOWER BOUND IN THE STATE OF THE ART: 5% of relative error [1]
[1] YANG, Hailong, et al. iMeter: An integrated VM power model based on
performance profiling. Future Generation Computer Systems, 2014, 36:
267-286.
Power models: the MARC approach (2)
91
TRAIN ON XARC1, TEST ON SANDY
            RMSE     Relative error   Coverage
Baseline (train and test on the same physical machine, SANDY)
  Redis     ±0.58W   1.10%            100.00%
  MySQL     ±1.94W   3.80%            100.00%
  FFMPEG    ±0.51W   1.00%            100.00%
Trained on XARC1, tested on SANDY
  Redis     ±0.61W   1.23%            99.70%
  MySQL     ±1.97W   3.86%            100.00%
  FFMPEG    ±0.63W   1.26%            100.00%
- MPOWER -
EXTRA SLIDES
93
The concept of Working Regime
• Domain-specific feature: hardware modules currently used
• We defined the concept of working regime:
“Given the controllable hardware modules on a device,
a working regime is a combination of their internal state”
[Diagram: Working regime A, Working regime B, Working regime C]
(1) A first case study: power models for Android devices
94
MISO Model for every configuration
• We tackle the problem of power model estimation in a fixed configuration with a
linear Multiple Input Single Output (MISO) model
[Model: battery prediction = f(previous battery levels, exogenous input values; model parameters)]
(1) A first case study: power models for Android devices
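A minimal sketch of how such a linear MISO model can be identified within one configuration, regressing the next battery level on the previous battery levels and the exogenous inputs with least squares; the lag structure and the synthetic data are illustrative assumptions:

import numpy as np

def fit_miso(battery, exogenous, n_lags=2):
    """battery: 1-D array of battery levels; exogenous: (T, m) array of inputs.
    Returns the least-squares parameters of
      b[t] = a1*b[t-1] + ... + an*b[t-n] + c^T * u[t] + bias
    """
    T = len(battery)
    rows, targets = [], []
    for t in range(n_lags, T):
        past = battery[t - n_lags:t][::-1]          # b[t-1], ..., b[t-n]
        rows.append(np.concatenate([past, exogenous[t], [1.0]]))
        targets.append(battery[t])
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta

# Tiny synthetic example: slowly discharging battery with one exogenous input
battery = np.linspace(100, 90, 50) + np.random.default_rng(2).normal(0, 0.05, 50)
exo = np.random.default_rng(3).uniform(0, 1, (50, 1))      # e.g., screen brightness
print(fit_miso(battery, exo))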
95
Actions on controllable variables
• They are determined by the user’s behavior
• We model the evolution of the smartphone’s
configuration as a Markov Decision Process
• A state for every configuration
• Transitions’ weights represent the
probability to go from a configuration to
another
Configuration	A
Configuration	B
Configuration	C
(1) A first case study: power models for Android devices
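A minimal sketch of how the transition probabilities can be estimated from an observed sequence of configurations, by counting transitions and normalizing each row (the configuration trace below is synthetic):

from collections import Counter

def transition_matrix(trace):
    """Estimate P(next configuration | current configuration) from a trace."""
    counts = Counter(zip(trace[:-1], trace[1:]))
    states = sorted(set(trace))
    matrix = {}
    for s in states:
        row_total = sum(counts[(s, t)] for t in states)
        matrix[s] = {t: (counts[(s, t)] / row_total if row_total else 0.0)
                     for t in states}
    return matrix

trace = ["A", "A", "B", "A", "C", "C", "A", "B", "B", "A"]
for src, row in transition_matrix(trace).items():
    print(src, row)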
- XEMPOWER -
DETAILS AND RESULTS
Proposed Approach
• At each context
switch, start counting
the hardware events
of interest
• The configured PMC
registers store the
counts associated
with the domain that
is about to run
97
[XeMPower per-core event tracing diagram, as before]
Proposed Approach
• At the next context
switch, read and
store PMC values,
accounted to the
domain that was
running
• Counters are then
cleared
98
[XeMPower per-core event tracing diagram, as before]
Proposed Approach
• Steps A and B are
performed at every
context switch on
every system’s CPU
(i.e., physical core or
hardware thread).
• The reason is that
each domain may
have multiple virtual
CPUs (VCPUs).
99
[XeMPower per-core event tracing diagram, as before]
Proposed Approach
• Finally, the PMC values are aggregated by domain and reported or used for further estimations
• Expose the collected
data to a higher level
– how?
100
[XeMPower per-core event tracing diagram, as before]
Proposed Approach
xentrace
• a lightweight trace
capturing facility
present in Xen
• we tag every trace
record with the ID of
the scheduled
domain and its
current VCPU
• a timestamp is kept
to later reconstruct
the trace flow
101
[XeMPower per-core event tracing diagram, as before: hardware events per core, energy per socket]
Use Case: Power Consumption Attribution
Use case
• Enable real-time
attribution of CPU
power consumption
to each guest
• Socket-level energy
measurements are
also read (via Intel
RAPL interface) at
each context switch
102
[XeMPower per-core event tracing diagram, as before]
Use Case: Power Consumption Attribution
Power models from PMC traces
• High correlation between hardware events
and power consumption [28]
• Non-halted cycle is the best metric to
correlate power consumption (linear
correlation coefficient above 0.95)
• Such correlation suggests that the higher
the rate of non-halted cycles for a domain
is, the more CPU power the domain
consumes
103
[XeMPower per-core event tracing diagram, as before]
Use Case: Power Consumption Attribution
Power models from PMC traces
• High correlation between hardware events
and power consumption [28]
• Non-halted cycle is the best metric to
correlate power consumption (linear
correlation coefficient above 0.95)
• Such correlation suggests that the higher
the rate of non-halted cycles for a domain
is, the more CPU power the domain
consumes
Idea
• Split system-level power consumption and
account it to virtual guests
104
[XeMPower per-core event tracing diagram, as before]
Use Case: Power Consumption Attribution
Proposed approach to account power to domains
1. For each tumbling window, the XeMPower
daemon calculates the total number of
non-halted cycles (one of the PMC traced)
2. We estimate the percentage of non-halted
cycles for each domain over the total
number of non-halted cycles; this
represents the contribution of each domain
to the whole CPU power consumption
3. Finally, we split the socket power
consumption proportionally to the
estimated contributions of each domain
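A minimal sketch of the proportional split over one tumbling window, given the per-domain non-halted cycle counts and the socket power measured in the same window (the values are illustrative):

def attribute_power(socket_power_w, nonhalted_cycles_by_domain):
    """Split socket power proportionally to each domain's non-halted cycles."""
    total = sum(nonhalted_cycles_by_domain.values())
    if total == 0:
        return {dom: 0.0 for dom in nonhalted_cycles_by_domain}
    return {dom: socket_power_w * cycles / total
            for dom, cycles in nonhalted_cycles_by_domain.items()}

window = {"Dom0": 0.4e9, "Dom1": 2.1e9, "Dom2": 1.5e9}
print(attribute_power(42.0, window))     # e.g., Dom1 gets ~22W of the 42W socket power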
105
[XeMPower per-core event tracing diagram, as before]
Experimental evaluation 106
• Back to the XeMPower requirements:
1. provide precise attribution of hardware events to virtual
tenants
2. agnostic to the mapping between virtual and physical
resources, hosted applications and scheduling policies
3. add negligible overhead
• Goals of the experimental evaluation:
– show how XeMPower monitoring components incur
very low overhead under different configurations
and workload conditions
Experimental evaluation 107
• Overhead metric:
– the difference in the system’s power consumption
while using XeMPower versus an off-the-shelf Xen 4.6
installation
• Experimental setup:
– 2.8 GHz quad-core Intel Xeon E5-1410 processor (4
hardware threads)
– a Watts up? PRO meter to monitor the entire
machine’s power consumption
– Each guest repeatedly runs a multi-threaded
compute-bound microbenchmark on three VCPUs
and uses a stripped-down Linux 3.14 as the guest OS
Experimental evaluation 108
• Three system configurations:
1. the baseline configuration uses off-the-shelf Xen 4.4
2. the patched configuration introduces the kernel-level
instrumentation without the XeMPower daemon
3. the monitoring configuration is the patched with the XeMPower
daemon running and reporting statistics
• Four running scenarios:
– an idle scenario in which the system only runs Dom0
– 3 running-n scenarios, where n = {1, 2, 3} indicates the number of
guest domains in addition to Dom0
• The idea is to stress the system with an increasing number of
CPU-intensive tenant applications
• This increases the amount of data traced and collected by
XeMPower
• Mean power consumption (μ), in Watts, scenarios idle and running-
{1,2,3}, and configurations baseline (b), patched (p), and monitoring
(m)
• Mean power values are reported with their 95% confidence interval
Experimental Results 109
• At a glance, we can see that the measurements are quite close to each other
pinned-VCPU
unpinned-VCPU
Experimental Results 110
• We estimate an upper bound ϵ for the maximum overhead using a hypothesis test (see the sketch after this slide):
• A rejection of the null hypothesis means that there is strong statistical
evidence that the power consumption overhead is lower than ϵ
• We compute ϵ for the considered test cases and scenarios, ensuring
average values of power consumption (μ) with confidence: α = 5%
• We want to compare the overhead with the one measured for XenMon, a
performance monitoring tool for Xen
• unlike XeMPower, XenMon does not collect PMC reads
• it is still a reference design in the context of runtime monitoring for
the Xen ecosystem
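A minimal sketch of the kind of one-sided test that can back such a bound, assuming two sets of power samples (baseline and monitoring) and scipy >= 1.6 for the alternative argument; the null hypothesis is that the overhead is at least epsilon, so rejecting it supports "overhead < epsilon" (sample values are synthetic, not the thesis' measurements):

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
baseline   = rng.normal(74.0, 0.5, 60)          # power samples without XeMPower [W]
monitoring = rng.normal(74.3, 0.5, 60)          # power samples with XeMPower [W]

def overhead_upper_bound(baseline, monitoring, alpha=0.05,
                         eps_grid=np.arange(0.1, 3.0, 0.05)):
    """Smallest eps for which H0: mean(monitoring) - mean(baseline) >= eps is rejected."""
    for eps in eps_grid:
        # Welch's t-test on shifted samples, one-sided (alternative: difference < 0)
        _, p = stats.ttest_ind(monitoring - eps, baseline, equal_var=False,
                               alternative="less")
        if p < alpha:
            return eps
    return None

print("estimated upper bound eps:", overhead_upper_bound(baseline, monitoring), "W")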
Experimental Results 111
• Estimated upper bound ϵ for the power consumption overhead, in Watts
• Parenthetical values are the overheads w.r.t. mean power consumption
• XeMPower introduces an overhead not greater than 1.18W (1.58%),
observed for the [unpinned-VCPU, running-3, patched] case
• In all the other cases, the overhead is less than 1W (and less than 1%)
• This result is satisfactory compared to an overhead of 1-2% observed for
XenMon, the reference implementation for XeMPower
- XEMPUPIL -
DETAILS
Related work: PUPiL [5] 113
[5] H. Zhang and H. Hoffmann. Maximizing performance under a power cap: A comparison of hardware, software, and hybrid techniques. In International Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS), 2016.
• PUPiL, a framework that aims to achieve both timeliness and efficiency in power capping
• Proposed approach:
– both hardware (i.e., the Intel RAPL interface [10]) and software (i.e.,
resource partitioning and allocation) techniques
– exploits a canonical ODA control loop, one of the main building blocks of
self-aware computing
• Limitations
– the applications running on the system need to be instrumented with the Heartbeat framework, to provide a uniform throughput metric
– applications running bare-metal on Linux
• These conditions might not hold in the context of a multi-tenant
virtualized environment
The Xen Hypervisor 114
Slides from: http://www.slideshare.net/xen_com_mgr/xpds16-porting-xen-on-arm-to-a-new-soc-julien-grall-arm
1. Performance metric identification
• Hardware event counters as low level metrics of
performance
• We exploit the Intel Performance Monitoring Unit (PMU)
to monitor the number of Instruction Retired (IR)
accounted to each domain in a certain time window
– an insight on how many instructions were completely executed (i.e., that successfully reached the end of the pipeline)
– it represents a reasonable indicator of performance, as the manufacturer itself suggests [6]
115
[6] Clockticks per instructions retired (cpi). https://software.intel.com/en-us/node/544403. Accessed: 2016-06-01.
2. Decision phase and virtualization
• Evaluation criterion: the average IR rate over a certain time
window
– the time window allows the workload to adapt to the actual
configuration
– the comparison of IR rates of different configurations highlights
which one makes the workload perform better
• Resource allocation granularity: core-level
– each domain owns a set of virtual CPUs (vCPUs)
– the machine exposes a set of physical CPUs (pCPUs)
– each vCPU can be mapped on a pCPU for a certain amount of time, while multiple vCPUs can be mapped on the same pCPU
• We wanted our allocation to cover the whole set of pCPUs, if
possible
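A minimal sketch of such a decide step, comparing the average IR rate of the current window with the previous one and growing or shrinking the workload's cpupool accordingly; the xl cpupool-cpu-add / cpupool-cpu-remove invocations and the decision rule are my own assumptions, not XeMPUPiL's actual code:

import subprocess

def set_pool_size(pool, current_pcpus, target_pcpus, all_pcpus):
    """Grow or shrink a Xen cpupool to target_pcpus physical CPUs."""
    if target_pcpus > len(current_pcpus):
        for cpu in all_pcpus:
            if cpu not in current_pcpus and len(current_pcpus) < target_pcpus:
                subprocess.check_call(["xl", "cpupool-cpu-add", pool, str(cpu)])
                current_pcpus.append(cpu)
    else:
        while len(current_pcpus) > target_pcpus:
            cpu = current_pcpus.pop()
            subprocess.check_call(["xl", "cpupool-cpu-remove", pool, str(cpu)])
    return current_pcpus

def decide(prev_ir_rate, curr_ir_rate, n_pcpus, max_pcpus):
    """Keep exploring larger pools while the IR rate improves, back off otherwise."""
    if curr_ir_rate > prev_ir_rate and n_pcpus < max_pcpus:
        return n_pcpus + 1
    if curr_ir_rate < prev_ir_rate and n_pcpus > 1:
        return n_pcpus - 1
    return n_pcpus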
116
3. Extending the hypervisor - RAPL
• Working with the Intel RAPL interface:
– harshly cutting the frequency and the voltage of the whole CPU socket
• On a bare-metal operating system:
– reading and writing data into the right Model Specific Register (MSR)
• MSR_RAPL_POWER_UNIT: read processor-specific time, energy and power
units, used to scale each value read or written
• MSR_PKG_RAPL_POWER_LIMIT: write to set a limit on the power
consumption of the whole socket
• In a virtualized environment:
– the Xen hypervisor does not natively support the RAPL interface
– we developed custom hypercalls, with kernel callback functions and
memory buffers
– we developed a CLI tool that checks the input parameters and instantiates and invokes the Xen command interface to launch the hypercalls
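A minimal bare-metal sketch of that MSR access path through the Linux msr driver (/dev/cpu/*/msr, requires the msr module and root privileges); the register addresses 0x606 and 0x610 are the customary Intel values for MSR_RAPL_POWER_UNIT and MSR_PKG_RAPL_POWER_LIMIT and are stated here as an assumption, not taken from the thesis:

import os, struct

MSR_RAPL_POWER_UNIT = 0x606      # scaling units for power/energy/time (assumed address)
MSR_PKG_RAPL_POWER_LIMIT = 0x610 # package power limit register (assumed address)

def read_msr(cpu, reg):
    fd = os.open("/dev/cpu/%d/msr" % cpu, os.O_RDONLY)
    try:
        os.lseek(fd, reg, os.SEEK_SET)
        return struct.unpack("<Q", os.read(fd, 8))[0]
    finally:
        os.close(fd)

def write_msr(cpu, reg, value):
    fd = os.open("/dev/cpu/%d/msr" % cpu, os.O_WRONLY)
    try:
        os.lseek(fd, reg, os.SEEK_SET)
        os.write(fd, struct.pack("<Q", value))
    finally:
        os.close(fd)

units = read_msr(0, MSR_RAPL_POWER_UNIT)
power_unit_w = 1.0 / (1 << (units & 0xF))     # lowest 4 bits encode the power unit
print("RAPL power unit: %.4f W" % power_unit_w)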
117
3. Extending the hypervisor - Resources
• cpupool tool:
– allows clustering the physical CPUs into different pools
– the pool scheduler schedules the domain's vCPUs only on the pCPUs that are part of that cluster
– as a new resource allocation is chosen by the decide phase, we increase or decrease the number of pCPUs in the pool
– the domain's vCPUs are pinned to these, to increase workload stability
• xenpm is NOT used:
– it can set a maximum and minimum frequency for each pCPU
– but it may interfere with the actuation performed by RAPL
118
- MARC -
MODELING APPROACHES
120
Motivation - Modeling approaches (1)
PHYSICAL MODELS. Pros: deep insights, accurate. Cons: invasive instrumentation, ignore/underrate degradation.
DATA-DRIVEN MODELS. Pros: adaptive, ever-improving, generalizable, at-a-glance view. Cons: accuracy depends on acquisition procedures.
121
Motivation - Modeling approaches (2)
OFF-LINE MODELING. Pros: controllable environment, ad-hoc instrumentation. Cons: relies on reasonable simulations, does not evolve with the target, requires ex-novo modeling for new targets.
ON-LINE MODELING. Pros: intrinsic ability to evolve with the target, tackles new targets, does not require in-lab phases. Cons: noisy real-world environment.
122
Motivation - Modeling approaches (3)
On-demand data-driven modeling = GENERAL AS-A-SERVICE MODELING FRAMEWORK (data-driven models + on-line modeling)
- MARC -
PREPROCESSING
MARC METHODOLOGY
Our KD&DM procedure
15
Preprocessing
Data
Manipulation
Feature
Selection
MARC METHODOLOGY
Our KD&DM procedure
16
Preprocessing
Data
Manipulation
STANDARD DATA CLEANING OPERATIONS
Scope: single sample
1. Coherence Correction
2. Residual Incoherence Elimination
3. Out-of-Bound Elimination
4. Granularity Reduction
Feature
Selection
MARC METHODOLOGY
Our KD&DM procedure
17
Preprocessing
Data
Manipulation
Scope: full dataset
•Standardization
•Quantization Correction
•…
Feature
Selection
MARC METHODOLOGY
Our KD&DM procedure
18
Preprocessing
Data
Manipulation
Feature
Selection
Scope: feature-wise
•Manual feature fusion and exclusion
• Automatic Configuration Feature Elicitation and Synthesis
- MARC -
A SCALABLE PLATFORM
4. MARC PLATFORM
Scalability
14
SCALE-IN: INTRA-MODULE PARALLELISM
[Diagram: a Load Balancer dispatches requests to multiple Communication Actors, each wrapping the module-specific functional logic]
Technologies: Scala - Akka
4. MARC PLATFORM
Scalability
15
SCALE-OUT: MODULE DISTRIBUTION
[Diagram: the same structure, with each module packaged and deployed as a Docker container]
Technologies: Scala - Akka - Docker
4. MARC PLATFORM
Scalability
16
BACKWARD ACTIVATION
Technologies: Scala - Akka - Docker - Scalatra
[Animation: the WEBAPP requests the output of a phase (e.g., "PHASE2A, please!"); the request activates the required upstream phases backwards through the pipeline (PHASE1, PHASE2A, PHASE2B, PHASE2C, PHASE3); phases whose results are already available answer immediately ("Already computed!")]
4. MARC PLATFORM
Scalability
17
WHITEBOARD APPROACH
Technologies: Scala - Akka - Docker - Scalatra - Redis
[Diagram: the phases (PHASE1, PHASE2A, PHASE2B, PHASE2C, PHASE3) communicate over a backbone, with an internal whiteboard and an external whiteboard]

 
Model Based Design of Hybrid and Electric Powertrains
Model Based Design of Hybrid and Electric PowertrainsModel Based Design of Hybrid and Electric Powertrains
Model Based Design of Hybrid and Electric PowertrainsSandeep Sovani, Ph.D.
 
Modeling and Simulation of an electrical micro-grid using MATLAB Simulink Sum...
Modeling and Simulation of an electrical micro-grid using MATLAB Simulink Sum...Modeling and Simulation of an electrical micro-grid using MATLAB Simulink Sum...
Modeling and Simulation of an electrical micro-grid using MATLAB Simulink Sum...Aodhgan Gleeson
 
Simulation of dcdc converter
Simulation of dcdc converterSimulation of dcdc converter
Simulation of dcdc converterRajesh Pindoriya
 

Similar to [February 2017 - Ph.D. Final Dissertation] Enabling Power-awareness For Multi-tenant Systems (20)

Recent and Planned Improvements to the System Advisor Model
Recent and Planned Improvements to the System Advisor ModelRecent and Planned Improvements to the System Advisor Model
Recent and Planned Improvements to the System Advisor Model
 
53 aron p_dobos_recent_and_planned_improvements_to_the_system_advisor_model_sam
53 aron p_dobos_recent_and_planned_improvements_to_the_system_advisor_model_sam53 aron p_dobos_recent_and_planned_improvements_to_the_system_advisor_model_sam
53 aron p_dobos_recent_and_planned_improvements_to_the_system_advisor_model_sam
 
SE4SG 2013 : MODAM: A MODular Agent-Based Modelling Framework
SE4SG 2013 : MODAM: A MODular Agent-Based Modelling Framework SE4SG 2013 : MODAM: A MODular Agent-Based Modelling Framework
SE4SG 2013 : MODAM: A MODular Agent-Based Modelling Framework
 
Agent based Load Management for Microgrid
Agent based Load Management for MicrogridAgent based Load Management for Microgrid
Agent based Load Management for Microgrid
 
3 2 dobos - whats new in sam - pv modeling workshop may 2016
3 2 dobos - whats new in sam - pv modeling workshop may 20163 2 dobos - whats new in sam - pv modeling workshop may 2016
3 2 dobos - whats new in sam - pv modeling workshop may 2016
 
How to leverage Quantum Computing and Generative AI for Clean Energy Transiti...
How to leverage Quantum Computing and Generative AI for Clean Energy Transiti...How to leverage Quantum Computing and Generative AI for Clean Energy Transiti...
How to leverage Quantum Computing and Generative AI for Clean Energy Transiti...
 
Yizhe_Liu_Resume_11092016
Yizhe_Liu_Resume_11092016Yizhe_Liu_Resume_11092016
Yizhe_Liu_Resume_11092016
 
Final year project ideas for electrical engineering eepowerschool.com
Final year project ideas for electrical engineering   eepowerschool.comFinal year project ideas for electrical engineering   eepowerschool.com
Final year project ideas for electrical engineering eepowerschool.com
 
OPAL-RT Modern power systems
OPAL-RT Modern power systems OPAL-RT Modern power systems
OPAL-RT Modern power systems
 
OPAL-RT | Setup and Performance of a Combined Hardware-in-loop and Software-i...
OPAL-RT | Setup and Performance of a Combined Hardware-in-loop and Software-i...OPAL-RT | Setup and Performance of a Combined Hardware-in-loop and Software-i...
OPAL-RT | Setup and Performance of a Combined Hardware-in-loop and Software-i...
 
Run-time power management in cloud and containerized environments
Run-time power management in cloud and containerized environmentsRun-time power management in cloud and containerized environments
Run-time power management in cloud and containerized environments
 
DeployingAnAdvancedDistribution.pdf
DeployingAnAdvancedDistribution.pdfDeployingAnAdvancedDistribution.pdf
DeployingAnAdvancedDistribution.pdf
 
Keynote Speech - Low Power Seminar, Jain College, October 5th 2012
Keynote Speech - Low Power Seminar, Jain College, October 5th 2012Keynote Speech - Low Power Seminar, Jain College, October 5th 2012
Keynote Speech - Low Power Seminar, Jain College, October 5th 2012
 
13 2017.03.30 freeman 7th pvpmc iec 61853 presentation
13 2017.03.30 freeman 7th pvpmc iec 61853 presentation13 2017.03.30 freeman 7th pvpmc iec 61853 presentation
13 2017.03.30 freeman 7th pvpmc iec 61853 presentation
 
SunSpec SPG & Distributech Briefing
SunSpec SPG & Distributech BriefingSunSpec SPG & Distributech Briefing
SunSpec SPG & Distributech Briefing
 
RT15 Berkeley | Power HIL Simulator (SimP) A prototype to develop a high band...
RT15 Berkeley | Power HIL Simulator (SimP) A prototype to develop a high band...RT15 Berkeley | Power HIL Simulator (SimP) A prototype to develop a high band...
RT15 Berkeley | Power HIL Simulator (SimP) A prototype to develop a high band...
 
Model Based Design of Hybrid and Electric Powertrains
Model Based Design of Hybrid and Electric PowertrainsModel Based Design of Hybrid and Electric Powertrains
Model Based Design of Hybrid and Electric Powertrains
 
HYPPO - NECSTTechTalk 23/04/2020
HYPPO - NECSTTechTalk 23/04/2020HYPPO - NECSTTechTalk 23/04/2020
HYPPO - NECSTTechTalk 23/04/2020
 
Modeling and Simulation of an electrical micro-grid using MATLAB Simulink Sum...
Modeling and Simulation of an electrical micro-grid using MATLAB Simulink Sum...Modeling and Simulation of an electrical micro-grid using MATLAB Simulink Sum...
Modeling and Simulation of an electrical micro-grid using MATLAB Simulink Sum...
 
Simulation of dcdc converter
Simulation of dcdc converterSimulation of dcdc converter
Simulation of dcdc converter
 

More from Matteo Ferroni

Fight data gravity with event-driven architectures
Fight data gravity with event-driven architecturesFight data gravity with event-driven architectures
Fight data gravity with event-driven architecturesMatteo Ferroni
 
[Droidcon Italy 2017] Client and server, 3 meters above the cloud
[Droidcon Italy 2017] Client and server, 3 meters above the cloud[Droidcon Italy 2017] Client and server, 3 meters above the cloud
[Droidcon Italy 2017] Client and server, 3 meters above the cloudMatteo Ferroni
 
[EWiLi2016] Towards a performance-aware power capping orchestrator for the Xe...
[EWiLi2016] Towards a performance-aware power capping orchestrator for the Xe...[EWiLi2016] Towards a performance-aware power capping orchestrator for the Xe...
[EWiLi2016] Towards a performance-aware power capping orchestrator for the Xe...Matteo Ferroni
 
[EUC2016] DockerCap: a software-level power capping orchestrator for Docker c...
[EUC2016] DockerCap: a software-level power capping orchestrator for Docker c...[EUC2016] DockerCap: a software-level power capping orchestrator for Docker c...
[EUC2016] DockerCap: a software-level power capping orchestrator for Docker c...Matteo Ferroni
 
[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...
[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...
[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...Matteo Ferroni
 
[EWiLi2016] Enabling power-awareness for the Xen Hypervisor
[EWiLi2016] Enabling power-awareness for the Xen Hypervisor[EWiLi2016] Enabling power-awareness for the Xen Hypervisor
[EWiLi2016] Enabling power-awareness for the Xen HypervisorMatteo Ferroni
 
[EUC2014] cODA: An Open-Source Framework to Easily Design Context-Aware Andro...
[EUC2014] cODA: An Open-Source Framework to Easily Design Context-Aware Andro...[EUC2014] cODA: An Open-Source Framework to Easily Design Context-Aware Andro...
[EUC2014] cODA: An Open-Source Framework to Easily Design Context-Aware Andro...Matteo Ferroni
 

More from Matteo Ferroni (7)

Fight data gravity with event-driven architectures
Fight data gravity with event-driven architecturesFight data gravity with event-driven architectures
Fight data gravity with event-driven architectures
 
[Droidcon Italy 2017] Client and server, 3 meters above the cloud
[Droidcon Italy 2017] Client and server, 3 meters above the cloud[Droidcon Italy 2017] Client and server, 3 meters above the cloud
[Droidcon Italy 2017] Client and server, 3 meters above the cloud
 
[EWiLi2016] Towards a performance-aware power capping orchestrator for the Xe...
[EWiLi2016] Towards a performance-aware power capping orchestrator for the Xe...[EWiLi2016] Towards a performance-aware power capping orchestrator for the Xe...
[EWiLi2016] Towards a performance-aware power capping orchestrator for the Xe...
 
[EUC2016] DockerCap: a software-level power capping orchestrator for Docker c...
[EUC2016] DockerCap: a software-level power capping orchestrator for Docker c...[EUC2016] DockerCap: a software-level power capping orchestrator for Docker c...
[EUC2016] DockerCap: a software-level power capping orchestrator for Docker c...
 
[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...
[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...
[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...
 
[EWiLi2016] Enabling power-awareness for the Xen Hypervisor
[EWiLi2016] Enabling power-awareness for the Xen Hypervisor[EWiLi2016] Enabling power-awareness for the Xen Hypervisor
[EWiLi2016] Enabling power-awareness for the Xen Hypervisor
 
[EUC2014] cODA: An Open-Source Framework to Easily Design Context-Aware Andro...
[EUC2014] cODA: An Open-Source Framework to Easily Design Context-Aware Andro...[EUC2014] cODA: An Open-Source Framework to Easily Design Context-Aware Andro...
[EUC2014] cODA: An Open-Source Framework to Easily Design Context-Aware Andro...
 

Recently uploaded

VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 

Recently uploaded (20)

VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 

[February 2017 - Ph.D. Final Dissertation] Enabling Power-awareness For Multi-tenant Systems

  • 1. Enabling Power-Awareness For Multi-Tenant Systems Candidate: Matteo FERRONI Advisor: Marco D. Santambrogio Tutor: Donatella Sciuto Ph.D. Cycle: XXIX Ph.D. in Information Technology: Final Dissertation Politecnico di Milano — February, 17th 2017
  • 3. The battery of your smartphone does not last a day. Credits: http://www.mobileworld.it/2016/01/07/smartphone-ricarica-camminata-62171/
  • 4. A data center needs to deal with power grid limits. Credits: https://resources.workable.com/systems-engineer-job-description
  • 5. 5 Context definition Common features (1) hardware heterogeneity (2) software multi-tenancy (3) input variability Key facts: • Energy budgets and power caps constrain the performance of the system • The actual power consumption is affected by a plethora of different actors (0) A bird's eye view
  • 6. Problem definition and proposed approach 6 • Problems definition A. How much power is a system going to consume, given certain working conditions? B. How to control a system to consume less power, still satisfying its requirements? • Assumption: the system will behave as it did in the past • High-level approach: 1. Observe the behavior of the system during its real working conditions 2. Build accurate models to describe and predict it 3. Use them to refine decisions and meet goals efficiently Idea: learn from experience (0) A bird's eye view
  • 7. 7 Pragmatic methodology Data-driven power-awareness through a holistic approach We start from raw data (power measurements, load traces, system stats, etc.) We are not interested in the physical components of the system: it is a black box We help users and systems to learn and predict their power needs This should be done in automation throughout the whole lifetime of the system (0) A bird's eye view
  • 8. Outline 1. A first case study: power models for Android devices 2. Generalization: Model and Analysis of Resource Consumption (MARC) 3. Virtual guests monitoring: towards power-awareness for Xen 4. Modeling power consumption in multi-tenant virtualized systems 5. Maximizing performance under a power cap: a hybrid approach 6. Moving forward: containerization, challenges and opportunities 7. Conclusion and future work 8 CONTROLMODEL
  • 9. 9 The need for a model • We need to observe and model the phenomenon [figure: energy budget (%) over time, showing the power model, the observed energy behavior, and the predicted Time-To-Live (s)] (1) A first case study: power models for Android devices
  • 10. 10 Model as-a-Service • Requirements: • No monitoring and modeling overheads on the system itself • adapt to different systems/users, as well as to changes over time • Proposed solution: Model-as-a-Service a. send raw traces to a remote server b. compute power models c. send back predictions and models parameters a b c (1) A first case study: power models for Android devices Power constrained system
  • 11. 11 Pragmatic approach • Modeling approach: “divide et impera” We experienced a piecewise linear behavior and tried to attribute this to domain-specific features Working regime A Working regime B Working regime C = actions on controllable variables Exogenous input (uncontrollable) (1) A first case study: power models for Android devices
  • 12. 12 Prediction performance w.r.t. SoA approaches • Baseline • Android L and Battery Historian (early 2015) • Makes use of power models to estimate TTLs • Performance reported for different models • SM - one model for the user behavior for the whole day • HM - one model for the user behavior for every hour of the day • DM - subset of HM, merging similar hours of the day • I% - Improvements w.r.t. Android L (AL) (1) A first case study: power models for Android devices average error values are reported ± standard deviations
  • 13. MODEL Outline 1. A first case study: power models for Android devices 2. Generalization: Model and Analysis of Resource Consumption (MARC) 3. Virtual guests monitoring: towards power-awareness for Xen 4. Modeling power consumption in multi-tenant virtualized systems 5. Maximizing performance under a power cap: a hybrid approach 6. Moving forward: containerization, challenges and opportunities 7. Conclusion and future work 13 CONTROL
  • 14. 14 A general methodology: the MARC approach • MARC (Model and Analysis of Resource Consumption) is a REST platform that is able to build resource consumption models in an “as-a-service” fashion • PHASE 1: Data Conditioning • PHASE 2A: ARX Models • PHASE 2B: Markov Models • PHASE 2C: Signal Models • PHASE 3: Integration [figures: traced battery level and battery discharge over time; traced power and energy consumption with piecewise linear approximations over the IDLE, I/O, MEM and CPU regimes] (2) Generalization: Model and Analysis of Resource Consumption (MARC)
  • 15. 15 A model for each configuration • PHASE 2A: Autoregressive Models with Exogenous Variables (ARX) • FOR EACH WORKING REGIME a model is computed to characterize the process [figure: traced power and energy consumption with per-regime linear approximations (IDLE, I/O, MEM, CPU)] (2) Generalization: Model and Analysis of Resource Consumption (MARC)
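The per-regime ARX step can be sketched as follows; this is a minimal illustration, not MARC's actual code, and it assumes one power trace y and a single exogenous hardware-event trace u, fitted by ordinary least squares:

```python
import numpy as np

def fit_arx(y, u, na=2, nb=2):
    """Least-squares fit of an ARX model within one working regime:
       y(k) = sum_i a_i*y(k-i) + sum_j b_j*u(k-j) + e(k)
       y: power samples, u: exogenous input (e.g., a hardware event rate),
       na/nb: autoregressive / exogenous orders."""
    n = max(na, nb)
    rows, targets = [], []
    for k in range(n, len(y)):
        past_y = [y[k - i] for i in range(1, na + 1)]
        past_u = [u[k - j] for j in range(1, nb + 1)]
        rows.append(past_y + past_u)
        targets.append(y[k])
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta[:na], theta[na:]          # a_i and b_j coefficients

def predict_one_step(y_hist, u_hist, a, b):
    """One-step-ahead power prediction from the most recent samples."""
    return sum(a[i] * y_hist[-(i + 1)] for i in range(len(a))) + \
           sum(b[j] * u_hist[-(j + 1)] for j in range(len(b)))
```

One such (a, b) pair would be estimated per working regime and selected at run time according to the regime the system is currently in.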
  • 16. 16 Predicting configuration switches • PHASE 2B: Hidden Markov Models • BY OBSERVING PERIODICITY a predictive configuration switching model is computed [figure: traced power and energy consumption with linear approximations and a sudden slope change marking a regime switch] (2) Generalization: Model and Analysis of Resource Consumption (MARC)
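As a simplified sketch of the switching model (the dissertation uses Hidden Markov Models; here, assuming the working-regime labels are already available, a plain first-order transition matrix is estimated directly):

```python
import numpy as np

def estimate_transition_matrix(regimes, n_states):
    """Estimate P[i, j] = Pr(next regime = j | current regime = i)
       from an observed sequence of regime labels (0 .. n_states-1)."""
    counts = np.zeros((n_states, n_states))
    for cur, nxt in zip(regimes[:-1], regimes[1:]):
        counts[cur, nxt] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1          # avoid division by zero for unseen states
    return counts / row_sums

def most_likely_next(P, current):
    return int(np.argmax(P[current]))

# Example: regime labels observed once per sampling interval
P = estimate_transition_matrix([0, 0, 1, 2, 1, 2, 1, 0], n_states=3)
print(most_likely_next(P, current=1))   # -> 2
```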
  • 17. 17 Tackling the residual non-linearity • PHASE 2C: Signal Models and Time Series Analysis • WITHIN EACH WORKING REGIME the residual non-linearity is addressed by exploiting time series analyses [figure: traced battery level and battery discharge over time] (2) Generalization: Model and Analysis of Resource Consumption (MARC)
  • 18. MODELCONTROL Outline 1. A first case study: power models for Android devices 2. Generalization: Model and Analysis of Resource Consumption (MARC) 3. Virtual guests monitoring: towards power-awareness for Xen 4. Modeling power consumption in multi-tenant virtualized systems 5. Maximizing performance under a power cap: a hybrid approach 6. Moving forward: containerization, challenges and opportunities 7. Conclusion and future work 18
  • 19. 19 Use case: Power consumption models for Xen domains • THE XEN HYPERVISOR: a Type 1 hypervisor currently employed in many production environments [figure: Xen architecture, with Dom0 (kernel, drivers, toolstack), the paravirtualized guest domains Dom1..DomU, and the hypervisor components (scheduler, MMU, timers, interrupts, config) on top of the hardware (CPU, memory, I/O)] • Question: “how much is a virtual tenant consuming?” (3) Virtual guests monitoring: towards power-awareness for Xen
  • 20. 20 Use case: Power consumption models for Xen domains • ASSUMPTION: “The power consumption of a system depends on what the hardware is doing” • Proposed solution: model the power consumption of virtual tenants by exploiting hardware event traces, collected and attributed to each one of them (3) Virtual guests monitoring: towards power-awareness for Xen
  • 21. 21 Tracing the Domains’ behavior [figure: XeMPower architecture, with per-core timelines of domain context switches in the Xen kernel and the XeMPowerDaemon/XeMPowerCLI in Dom0 collecting hardware events per core and energy per socket] • XEMPOWER collects and accounts hardware events to virtual tenants in two steps: 1. In the Xen scheduler (kernel-level) • At every context switch, trace the hardware events of interest • e.g., INST_RET, UNHALTED_CLOCK_CYCLES, LLC_REF, LLC_MISS 2. In Domain 0 (privileged tenant) • Periodically acquire the event traces and aggregate them on a per-domain basis (3) Virtual guests monitoring: towards power-awareness for Xen
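A minimal sketch of the Dom0-side aggregation step, assuming the kernel-level tracer emits one (domain_id, event_name, count) record per context switch; the record layout is illustrative and not XeMPower's actual trace format:

```python
from collections import defaultdict

def aggregate_by_domain(records):
    """records: iterable of (domain_id, event_name, count) tuples produced
       at every context switch on every physical core.
       Returns {domain_id: {event_name: total_count}} for the sampling window."""
    totals = defaultdict(lambda: defaultdict(int))
    for dom, event, count in records:
        totals[dom][event] += count
    return {dom: dict(events) for dom, events in totals.items()}

# Example window: records from two cores and two domains
window = [
    (1, "INST_RET", 1200000), (1, "LLC_MISS", 3400),
    (2, "INST_RET",  800000), (2, "LLC_MISS", 9100),
    (1, "INST_RET",  950000),
]
print(aggregate_by_domain(window))
```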
  • 22. Outline 1. A first case study: power models for Android devices 2. Generalization: Model and Analysis of Resource Consumption (MARC) 3. Virtual guests monitoring: towards power-awareness for Xen 4. Modeling power consumption in multi-tenant virtualized systems 5. Maximizing performance under a power cap: a hybrid approach 6. Moving forward: containerization, challenges and opportunities 7. Conclusion and future work 22 MODELCONTROL
  • 23. 23 Power models: State-of-the-Art approaches • Workload classes: (a) idle (b) weak I/O intensive (c) memory intensive (d) CPU intensive (e) strong I/O intensive • Use a single power model, built on different hardware events: A. INST_RET, UNHALTED_CLOCK_CYCLES, LLC_REF, LLC_MISS B. INST_RET, UNHALTED_CLOCK_CYCLES, LLC_REF C. UNHALTED_CLOCK_CYCLES, LLC_REF
  Table 6.9 — modelling errors (RMSE / mean relative error) obtained with the state-of-the-art models: (a) Model A ±17.63 W / 35.56%, Model B ±16.44 W / 32%, Model C ±17.68 W / 35%; (b) A ±4.7 W / 9.4%, B ±5.86 W / 11.7%, C ±7.17 W / 14%; (c) A ±19.11 W / 38%, B ±34.54 W / 70%, C ±18.7 W / 37%; (d) A ±0.44 W / 0.08%, B ±0.6 W / 1.2%, C ±0.42 W / 0.08%; (e) A ±2.98 W / 5.9%, B ±38.57 W / 77%, C ±3.29 W / 6.5%; average A ±8.97 W / 17.79%, B ±19.20 W / 38.38%, C ±9.45 W / 18.52% • The best average model is the worst on a single configuration • No model is consistently better than the others across all the configurations (4) Modeling power consumption in multi-tenant virtualized systems
  • 24. Power modeling flow 24(4) Modeling power consumption in multi-tenant virtualized systems Models exploitation
  • 25. 25 • Goals of the experiments: A. assess the precision of the modeling methodology B. explore model portability on different hardware platforms C. evaluate colocation of different tenants • Benchmarks – Apache Spark (SVM and PageRank) – Redis (Memory-intensive) – MySQL and Cassandra (IO-intensive) – FFmpeg (CPU-intensive) Experimental evaluation (4) Modeling power consumption in multi-tenant virtualized systems • Experimental setup – A. WRK: Intel Core i7 @ 3.40GHz 8GB DDR3 RAM – B. SRV1: Intel Xeon @ 2.80GHz 16GB DDR3 RAM – C. SRV2: two Intel Xeon @ 2.3GHz 128GB RAM DDR4
  • 26. 26(4) Modeling power consumption in multi-tenant virtualized systems • RMSE around 1W on average, under 2W in almost all the cases; • only three results present a worse behavior (still under 5W) • Relative error around 2% on average, under 4% in almost all the cases • only three results present a worse behavior (still under 10%) Results generally outperform the works in literature [1,2,3], even in the worst cases [1] Anton Beloglazov, Rajkumar Buyya, Young Choon Lee, Albert Zomaya, et al. A taxonomy and survey of energy-efficient data centers and cloud computing systems. Advances in Computers, 82(2):47–111, 2011 [2] W. Lloyd Bircher and Lizy K. John. Complete system power estimation: A trickle-down approach based on performance events. In Performance Analysis of Systems & Software, 2007. ISPASS 2007. IEEE International Symposium on, pages 158–168. IEEE, 2007 [3] Hailong Yang, Qi Zhao, Zhongzhi Luan, and Depei Qian. iMeter: An integrated VM power model based on performance profiling. Future Generation Computer Systems, 36:267–286, 2014.
  • 27. Outline 1. A first case study: power models for Android devices 2. Generalization: Model and Analysis of Resource Consumption (MARC) 3. Virtual guests monitoring: towards power-awareness for Xen 4. Modeling power consumption in multi-tenant virtualized systems 5. Maximizing performance under a power cap: a hybrid approach 6. Moving forward: containerization, challenges and opportunities 7. Conclusion and future work 27 MODELCONTROL
  • 28. Problem definition 28(5) Maximizing performance under a power cap: a hybrid approach • Two points of view: A. minimize power consumption given a minimum performance requirement B. maximize performance given a limit on the maximum power consumption • Requirements: – work in a virtualized environment – avoid instrumentation of the guest workloads • Steps towards the goal: 1. identify a performance metric for all the hosted tenants 2. define a resource allocation policy to deal with the requirements 3. extend the hypervisor to provide the right knobs
  • 29. 29 (5) Maximizing performance under a power cap: a hybrid approach Power capping approaches • SOFTWARE APPROACH: ✓ efficiency ✖ timeliness • HARDWARE APPROACH: ✖ efficiency ✓ timeliness [figure: taxonomy of power capping techniques — model-based monitoring [3], thread migration [2], resource management, CPU quota, DVFS [4], RAPL [1]] [1] H. David, E. Gorbatov, U. R. Hanebutte, R. Khanna, and C. Le. RAPL: Memory power estimation and capping. In International Symposium on Low Power Electronics and Design (ISLPED), 2010. [2] R. Cochran, C. Hankendi, A. K. Coskun, and S. Reda. Pack & Cap: adaptive DVFS and thread packing under power caps. In International Symposium on Microarchitecture (MICRO), 2011. [3] M. Ferroni, A. Cazzola, D. Matteo, A. A. Nacci, D. Sciuto, and M. D. Santambrogio. MPower: gain back your Android battery life! In Proceedings of the 2013 ACM Conference on Pervasive and Ubiquitous Computing Adjunct Publication, pages 171–174. ACM, 2013. [4] T. Horvath, T. Abdelzaher, K. Skadron, and X. Liu. Dynamic voltage scaling in multitier web servers with end-to-end delay control. IEEE Transactions on Computers, 2007.
  • 30. 30 (5) Maximizing performance under a power cap: a hybrid approach Power capping approaches • HYBRID APPROACH [5]: ✓ efficiency ✓ timeliness, combining the software and hardware techniques of the previous slide [5] H. Zhang and H. Hoffmann. Maximizing performance under a power cap: A comparison of hardware, software, and hybrid techniques. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2016.
  • 31. 31(5) Maximizing performance under a power cap: a hybrid approach System design
  • 32. • The workloads run in paravirtualized domains 32(5) Maximizing performance under a power cap: a hybrid approach System design
  • 33. • XeMPUPiL spans over all the layers 33(5) Maximizing performance under a power cap: a hybrid approach System design
  • 34. • Instruction Retired (IR) metric gathered and accounted to each domain, thanks to XeMPower • The aggregation is done over a time window of 1 second 34(5) Maximizing performance under a power cap: a hybrid approach System design
  • 35. • Observation of both hardware events (i.e., IR) and power consumption (whole CPU socket) 35(5) Maximizing performance under a power cap: a hybrid approach System design
  • 36. 36 – given a workload with M virtual resources and an assignment of N physical resources, to each pCPUi we assign: (5) Maximizing performance under a power cap: a hybrid approach System design
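The exact per-pCPU assignment expression is shown only as a figure on the slide; as an assumption, a plausible even (round-robin) distribution of the M vCPUs over the N pCPUs would look like this:

```python
def assign_vcpus(num_vcpus, num_pcpus):
    """Distribute M vCPUs over N pCPUs as evenly as possible:
       the first (M mod N) pCPUs receive one extra vCPU.
       Returns {pcpu_index: [vcpu_indices]}."""
    mapping = {p: [] for p in range(num_pcpus)}
    for v in range(num_vcpus):
        mapping[v % num_pcpus].append(v)
    return mapping

print(assign_vcpus(num_vcpus=6, num_pcpus=4))
# {0: [0, 4], 1: [1, 5], 2: [2], 3: [3]}
```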
  • 37. • Hybrid actuation: – enforce power cap via RAPL – define a CPU pool for the workload and pin workload’s vCPUs over pCPUs 37(5) Maximizing performance under a power cap: a hybrid approach System design
  • 38. 38 • Hybrid actuation: – enforce power cap via RAPL – define a CPU pool for the workload and pin workload’s vCPUs over pCPUs (5) Maximizing performance under a power cap: a hybrid approach System design
  • 39. 39 • Hybrid actuation: – enforce power cap via RAPL – define a CPU pool for the workload and pin workload’s vCPUs over pCPUs (5) Maximizing performance under a power cap: a hybrid approach System design
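A sketch of the hybrid actuation step, assuming the package power limit is written through the Linux powercap/RAPL sysfs interface and vCPUs are pinned with the `xl vcpu-pin` tool; the sysfs path and the exact mechanism used by XeMPUPiL are assumptions (in a Xen Dom0 the RAPL limit may be programmed differently):

```python
import subprocess

RAPL_LIMIT = "/sys/class/powercap/intel-rapl:0/constraint_0_power_limit_uw"  # assumed path

def set_power_cap(watts):
    """Write the package power limit (in microwatts) via the powercap sysfs."""
    with open(RAPL_LIMIT, "w") as f:
        f.write(str(int(watts * 1000000)))

def pin_domain(domain, vcpu_to_pcpu):
    """Pin each vCPU of a Xen domain to its assigned pCPU using `xl vcpu-pin`."""
    for vcpu, pcpu in vcpu_to_pcpu.items():
        subprocess.check_call(["xl", "vcpu-pin", domain, str(vcpu), str(pcpu)])

set_power_cap(30)                     # 30 W cap enforced in hardware
pin_domain("guest1", {0: 2, 1: 3})    # vCPU0 -> pCPU2, vCPU1 -> pCPU3
```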
  • 40. 40 • Goals of the experiments: A. how do different workloads perform under a power cap? B. can we achieve higher efficiency w.r.t. RAPL power cap? • Benchmarks – Embarrassingly Parallel (EP) – IOzone – cachebench – Block Tri-diagonal solver (BT) • Three power caps explored: 40W, 30W and 20W • Results are normalized w.r.t. the performance obtained with no power caps (5) Maximizing performance under a power cap: a hybrid approach Experimental evaluation • Experimental setup – 2.8-GHz quad-core Intel Xeon – 32GB of RAM – Xen hypervisor version 4.4
  • 41. 41 • Preliminary evaluation: how do they perform under a power cap? [figure: normalized performance of EP, cachebench, IOzone and BT with no cap and under RAPL 40W/30W/20W] (5) Maximizing performance under a power cap: a hybrid approach
  • 42. 42 • Preliminary evaluation: how do they perform under a power cap? • For CPU-bound benchmarks (i.e., EP and BT), the differences are significant [same figure] (5) Maximizing performance under a power cap: a hybrid approach
  • 43. 43 • Preliminary evaluation: how do they perform under a power cap? • With IO- and/or memory-bound workloads, the performance degradation between different power caps is less significant [same figure] (5) Maximizing performance under a power cap: a hybrid approach
  • 44. 44 • Performance of the workloads with XeMPUPiL, for different power caps: – higher performance than RAPL, in general – not always true on a pure CPU-bound benchmark (i.e., EP) [figure: normalized performance of EP, cachebench, IOzone and BT under XeMPUPiL vs. RAPL at 40W, 30W and 20W] (5) Maximizing performance under a power cap: a hybrid approach
  • 45. 45 • Performance of the workloads with XeMPUPiL, for different power caps: – higher performance than RAPL, in general – not always true on a pure CPU-bound benchmark (i.e., EP) [same figure] (5) Maximizing performance under a power cap: a hybrid approach
  • 46. 46 • Performance of the workloads with XeMPUPiL, for different power caps: – higher performance than RAPL, in general – not always true on a pure CPU-bound benchmark (i.e., EP) [same figure] (5) Maximizing performance under a power cap: a hybrid approach
  • 47. Outline 1. A first case study: power models for Android devices 2. Generalization: Model and Analysis of Resource Consumption (MARC) 3. Virtual guests monitoring: towards power-awareness for Xen 4. Modeling power consumption in multi-tenant virtualized systems 5. Maximizing performance under a power cap: a hybrid approach 6. Moving forward: containerization, challenges and opportunities 7. Conclusion and future work 47 MODELCONTROL
  • 48. Containerization: opportunities and challenges 48(6) Moving forward: containerization, challenges and opportunities A different road to multi-tenancy • Group the application and all its dependencies in a single container • The host operating system sees a container as a group of processes Proposed solution • A power-aware orchestrator for Docker containers Manage resources to meet the power consumption goal • A policy-based system Guarantee performance of the containers while staying under the power cap
  • 49. 49 DockerCap: system design [figure: Observe-Decide-Act loop — Observe Component (RAPL, Docker containers, cgroups), Decide Component (Policy1/Policy2/Policy3), Act Component (cgroups), connected by observe/act queues] (6) Moving forward: containerization, challenges and opportunities
  • 50. 50 DockerCap: system design • Observe: power samples from Intel RAPL, resource allocation of the containers from Docker and cgroups [same ODA figure] (6) Moving forward: containerization, challenges and opportunities
  • 51. 51 DockerCap: system design • Decide: Resource Control and Resource Partitioning [same ODA figure] (6) Moving forward: containerization, challenges and opportunities
  • 52. 52 DockerCap: system design • Act: actuation through the cgroup hierarchy for each container [same ODA figure] (6) Moving forward: containerization, challenges and opportunities
  • 53. 53 DockerCap: system design [same ODA figure] (6) Moving forward: containerization, challenges and opportunities
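A minimal sketch of the Observe and Act phases, assuming a RAPL energy counter exposed via sysfs and cgroup v1 CPU-quota files; the paths are illustrative assumptions, not necessarily the ones DockerCap uses:

```python
import time

ENERGY = "/sys/class/powercap/intel-rapl:0/energy_uj"   # assumed RAPL counter path

def read_energy_uj():
    with open(ENERGY) as f:
        return int(f.read())

def observe_power(interval=1.0):
    """Observe phase: average package power (W) over one sampling interval
       (counter wraparound is ignored for brevity)."""
    e0, t0 = read_energy_uj(), time.time()
    time.sleep(interval)
    e1, t1 = read_energy_uj(), time.time()
    return (e1 - e0) / 1e6 / (t1 - t0)

def act_set_cpu_quota(container_id, quota_us, period_us=100000):
    """Act phase: cap a container's CPU time via its cgroup (cgroup v1 layout assumed)."""
    base = "/sys/fs/cgroup/cpu/docker/" + container_id
    with open(base + "/cpu.cfs_period_us", "w") as f:
        f.write(str(period_us))
    with open(base + "/cpu.cfs_quota_us", "w") as f:
        f.write(str(quota_us))
```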
  • 54. 54 • Goals of the experiments: A. is the software-level power cap stable and precise? B. are we able to meet the performance requirements of the containers? • Benchmarks – fluidanimate (fluid simulation) – x264 (video encoding) – dedup (compression) • Three power caps explored: 40W, 30W and 20W • All the benchmark containers run simultaneously on the same node • Baseline: Intel RAPL power capping solution Experimental evaluation • Experimental setup – 2.8-GHz quad-core Intel Xeon – 32GB of RAM – Docker 1.11.2 (6) Moving forward: containerization, challenges and opportunities
  • 55. 55 Power cap: 30W Power cap: 20W Power cap: 40W •dedup •fluidanimate •x264 (6) Moving forward: containerization, challenges and opportunities • Comparison between performance-agnostic approaches: Fair partitioning policy vs. RAPL • Performance metric: Time To Completion (lower is better) • Comparable performance, better results on lower power caps
  • 56. Power cap: 30W Power cap: 20W Power cap: 40W 56(6) Moving forward: containerization, challenges and opportunities •dedup •fluidanimate •x264 All policies • Comparing fair and performance-aware approaches • Performance metric: Time To Completion (lower is better)
  • 57. Power cap: 30W Power cap: 20W Power cap: 40W 57(6) Moving forward: containerization, challenges and opportunities •dedup •fluidanimate •x264 All policies • Comparing fair and performance-aware approaches • Performance metric: Time To Completion (lower is better)
  • 58. Power cap: 30W Power cap: 20W Power cap: 40W 58 • Comparing fair and performance-aware approaches • Performance metric: Time To Completion (lower is better) • fluidanimate is set to High Priority with a SLO of 400s (6) Moving forward: containerization, challenges and opportunities •dedup •fluidanimate •x264 All policies
  • 59. Conclusion 1. A first case study: power models for Android devices Better performance w.r.t. Android L predictions 2. Generalization: Model and Analysis of Resource Consumption (MARC) Modeling pipeline has been generalized and provided “as-a-service” 3. Virtual guests monitoring: towards power-awareness for Xen HW events are traced with negligible overhead on the system 4. Modeling power consumption in multi-tenant virtualized systems Better performance w.r.t. SoA approaches 5. Maximizing performance under a power cap: a hybrid approach Better performance w.r.t. standard RAPL power cap 6. Moving forward: containerization, challenges and opportunities Promising results towards a performance-aware and power-aware orchestration 59 MODEL CONTROL
  • 60. • We want to validate the modeling methodology on different resources • Time-to-Completion of Hadoop jobs 60 Future Work • We want to exploit these model to: • detect anomalies in a distributed microservice infrastructure • perform better resource allocation and consolidation
  • 62. - XEN MODELS - DETAILS ON WORKING REGIMES
  • 63. 63(4) Modeling power consumption in multi-tenant virtualized systems
  • 64. 64 Working Regime identification • A single model is not enough: we explored the MARC approach • Question: what is a working regime in this case study? • Identified a posteriori by looking at the different slopes on the trace graph [figure: traced power and energy consumption over time] (4) Modeling power consumption in multi-tenant virtualized systems
  • 65. 65 Working Regime identification - How many are they? • KERNEL DENSITY ESTIMATION (KDE): by observing the local minima of the reconstructed distribution of power consumption we identify the points where a Working Regime change happens • LINEAR RANGES — 0: [0W, 42W) 1: [42W, 57W) 2: [57W, +∞) [figure: traced power and energy over time, and the estimated probability density of the power consumption] (4) Modeling power consumption in multi-tenant virtualized systems
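A sketch of this boundary-detection step, assuming SciPy's Gaussian KDE and local-minima search; the tooling choice is an assumption, not necessarily MARC's implementation:

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.signal import argrelmin

def regime_boundaries(power_samples, grid_points=512):
    """Return the power values at the local minima of the estimated density,
       i.e., the candidate Working Regime boundaries."""
    kde = gaussian_kde(power_samples)
    grid = np.linspace(min(power_samples), max(power_samples), grid_points)
    density = kde(grid)
    minima_idx = argrelmin(density)[0]
    return grid[minima_idx]

# On a trace like the one on the slide, this would return values close to 42 W and 57 W.
```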
  • 66. 66 From hardware events to Working Regimes (1) • RELIEFF + KDE • 1. ReliefF is used to identify which features best induce the Working Regime classification identified before [figure: ReliefF weights for the 32 candidate features] (4) Modeling power consumption in multi-tenant virtualized systems
  • 67. 67 From hardware events to Working Regimes (2) • RELIEFF + KDE • 2. For each Working Regime, the distribution of the values of that feature is reconstructed using KDE • 3. The distributions are compared to obtain discriminant values [figure: per-class (CLASS 0/1/2) probability densities of the selected PMC values] (4) Modeling power consumption in multi-tenant virtualized systems
  • 68. 68 From hardware events to Working Regimes (3) • RELIEFF + KDE • RESULT: a Working Regime classifier that is able to determine in which Working Regime the system is, starting from the sampled features • Discriminant ranges for INST_RET — class 0: [0, 1.235e9]; class 1: (1.235e9, 3.61e9) or [3.61e9, 5.58e9); class 2: (1.235e9, 3.61e9) or [5.58e9, +∞) (the (1.235e9, 3.61e9) range is shared between classes 1 and 2 and needs a further feature) [figure: ReliefF weights and per-class densities of the PMC values] (4) Modeling power consumption in multi-tenant virtualized systems
  • 69. 69 From hardware events to Working Regimes (4) • RELIEFF + KDE • In case of uncertainty, repeat from ReliefF: • eliminating the already selected features • eliminating all the data that are not part of the uncertain zone • Discriminant ranges — class 0: INST_RET [0, 1.235e9]; class 1: INST_RET (1.235e9, 3.61e9) with L1_HIT (2.36362e8, 5.672e8), or INST_RET [3.61e9, 5.58e9); class 2: INST_RET (1.235e9, 3.61e9) with L1_HIT [0, 2.36362e8] or [5.672e8, +∞), or INST_RET [5.58e9, +∞) [figure: ReliefF weights and per-class densities of the PMC values] (4) Modeling power consumption in multi-tenant virtualized systems
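A sketch of the resulting classifier, using the INST_RET and L1_HIT ranges reported on the slides as illustrative thresholds:

```python
def classify_regime(inst_ret, l1_hit):
    """Working Regime classifier derived from the per-class KDE analysis.
       Threshold values are the ones reported on the slides (illustrative)."""
    if inst_ret <= 1.235e9:
        return 0
    if inst_ret >= 5.58e9:
        return 2
    if inst_ret >= 3.61e9:                       # [3.61e9, 5.58e9)
        return 1
    # Uncertain INST_RET zone (1.235e9, 3.61e9): disambiguate with L1_HIT
    return 1 if 2.36362e8 < l1_hit < 5.672e8 else 2

print(classify_regime(inst_ret=2.0e9, l1_hit=3.0e8))   # -> 1
```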
  • 70. 70(4) Modeling power consumption in multi-tenant virtualized systems
  • 71. 71(4) Modeling power consumption in multi-tenant virtualized systems
  • 72. 72(4) Modeling power consumption in multi-tenant virtualized systems
  • 74. 74 Decide phase: Resource Control and Resource Partitioning [same ODA figure]
  • 76. 76 Resource control • With the feedback control loop logic, we find the allocation of resources (the fraction of the 100% CPU quota cap made available) that ensures the power cap
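A sketch of the resource-control feedback logic, assuming a simple proportional correction of the total CPU quota toward the power cap; the gain and bounds are illustrative and the actual DockerCap controller may differ:

```python
def update_total_quota(quota, measured_power, power_cap,
                       gain=2000, min_quota=10000, max_quota=400000):
    """Adjust the total CPU quota (microseconds per period) so that the
       measured power converges to the cap: proportional correction."""
    error = power_cap - measured_power          # positive -> headroom, negative -> over the cap
    new_quota = quota + gain * error
    return int(max(min_quota, min(max_quota, new_quota)))

# Example: 5 W over the cap -> the total quota is reduced
print(update_total_quota(quota=200000, measured_power=35.0, power_cap=30.0))  # -> 190000
```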
  • 77. 77 Decide phase: Resource Control and Resource Partitioning [same ODA figure]
  • 78. 78 Resource partitioning Containers: C1 C2 C3 C4 ? We explore three different partitioning policies: • Fair resource partitioning • Priority-aware resource partitioning • Throughput-aware resource partitioning 100% CPU quota cap available resource
  • 79. • The quota Q is evenly partitioned across all the containers • No control over the throughput of a single container 79 1. Fair resource partitioning 100% CPU quota cap Containers: C1 C2 C3 C4 Q/4 Q/4 Q/4 Q/4
  • 80. 80 2. Priority-aware partitioning 100% CPU quota cap Containers: • The quota Q is partitioned following the priority of each container • The quota of the single container is estimated through a weighted mean, where every priority has its own associated weight High priority: Low priority: C1 HIGH LOW C2 C3 C4 LOW LOW
  • 81. 81 Throughput-aware resource partitioning [figure: containers C1 and C2 with SLO1 and SLO2 (high/low priority) and best-effort containers C3 and C4 sharing the 100% CPU quota cap] • The quota Q is partitioned following the priority of each container and its Service Level Objectives (SLO) • SLO is here defined as the Time-To-Completion (TTC) of the task
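A sketch of the fair and priority-aware partitioning policies described above; the priority weights are illustrative assumptions, and the throughput-aware policy (which also rebalances toward containers missing their SLO) is omitted:

```python
def fair_partition(total_quota, containers):
    """Fair policy: the quota is split evenly across containers."""
    share = total_quota // len(containers)
    return {c: share for c in containers}

def priority_partition(total_quota, priorities, weights={"HIGH": 3, "LOW": 1}):
    """Priority-aware policy: each container gets a share proportional
       to the weight of its priority class (weights are illustrative)."""
    total_weight = sum(weights[p] for p in priorities.values())
    return {c: total_quota * weights[p] // total_weight
            for c, p in priorities.items()}

print(fair_partition(200000, ["C1", "C2", "C3", "C4"]))
print(priority_partition(200000, {"C1": "HIGH", "C2": "LOW", "C3": "LOW", "C4": "LOW"}))
```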
  • 82. 82 Experimental setup • All the benchmark containers run simultaneously on the same node • HW: Intel Xeon E5-1410, 32GB RAM • OS: Ubuntu 14.04, Linux 3.19.0-42 • Container engine: Docker 1.11.2 • Runtime: Python 2.7.6 • Benchmark containers (PARSEC): fluidanimate (fluid dynamics simulation, generic CPU-bound), x264 (video streaming encoding, e.g., video surveillance), dedup (compression, cloud-fog communication)
  • 83. 83 Goals of the experiments The comparison is done with the state of the art power capping solution RAPL by Intel[1] PERFORMANCES OF THE CONTAINERS PRECISION OF THE POWER CAPPING allocate resource to meet containers’ requirements manage machine power consumption
  • 84. 84 Precision of the power capping • Comparable results in terms of average power consumption under the power cap • As expected, RAPL provides a more stable power capping •Fair •Priority-aware •Throughput-aware •RAPL
  • 85. 85 Performances: Fair Partitioning vs RAPL • Comparison between the performance-agnostic approaches • Performance metric: Time To Completion (lower is better) Power cap: 30W Power cap: 20W Power cap: 40W •dedup •fluidanimate •x264
  • 86. Power cap: 30W Power cap: 20W Power cap: 40W 86 Performances: all policies •dedup •fluidanimate •x264 • Comparison with the performance-aware approaches • fluidanimate is set to High Priority with a SLO of 400s • Performance metric: Time To Completion (lower is better)
  • 87. 87 Conclusion and future work ✓We presented DockerCap, a power-aware orchestrator that manages containers’ resources ✓We showed how DockerCap is able to limit the power consumption of the machine ✓We discussed three distinct partitioning policies and compared their impact on containers’ SLO FUTURE DIRECTIONS • Exploit both HW and SW power capping • Improve the precision of the power capping with more refined modeling techniques [2] • Compute the right allocation of resources online by observing the performance of the containers [2] Andrea Corna and Andrea Damiani. A scalable framework for resource consumption modelling: the MARC approach. Master’s thesis. Politecnico di Milano, 2016. 

  • 88. - XEN MODELS - PRELIMINARY RESULTS
  • 89. 89 Experimental settings and benchmarks • “XARC1” (Dell OptiPlex 990): Intel Core i7-2600 @ 3.40GHz, 4 banks of synchronous 2GB DIMM DDR3 RAM @ 1.33GHz, Seagate 250GB 7200rpm 8MB cache SATA 3.5" HDD, Intel 82579LM Gigabit Network Connection • “SANDY” (Dell PowerEdge T320): Intel Xeon E5-1410 @ 2.80GHz, 2 banks of synchronous 16GB DIMM DDR3 RAM @ 1.60GHz, Western Digital 250GB 7200rpm 16MB cache SATA 3.5" HDD, Broadcom NetXtreme BCM5720 Gigabit Ethernet PCIe • TRAIN SET, micro benchmarks [1]: NASA Parallel Benchmarks (CPU/memory features), Cachebench (cache hierarchy), IOzone (disk I/O operations) • TEST SET, realistic benchmarks: Redis server (non-relational DBMS interrogations), MySQL server (relational DBMS queries), FFMPEG (audio/video transcoding and compression) [1] Yang, Hailong, et al. iMeter: An integrated VM power model based on performance profiling. Future Generation Computer Systems, 2014, 36: 267-286.
  • 90. 90 Power models: the MARC approach (1) • TRAIN AND TEST ON THE SAME PHYSICAL MACHINE • SANDY — Redis: RMSE ±0.58W, relative error 1.10%, coverage 100.00%; MySQL: ±1.94W, 3.80%, 100.00%; FFMPEG: ±0.51W, 1.00%, 100.00% • XARC1 — Redis: ±2.07W, 4.14%, 100.00%; MySQL: ±9.27W, 18.5%, 100.00%; FFMPEG: ±1.32W, 2.64%, 99.90% • LOWER BOUND IN THE STATE OF THE ART: 5% of relative error [1] [1] Yang, Hailong, et al. iMeter: An integrated VM power model based on performance profiling. Future Generation Computer Systems, 2014, 36: 267-286.
  • 91. Power models: the MARC approach (2) 91 TRAIN ON XARC1, TEST ON SANDY. Reference (SANDY, train and test on the same physical machine): Redis RMSE ±0.58W, relative error 1.10%, coverage 100.00%; MySQL ±1.94W, 3.80%, 100.00%; FFMPEG ±0.51W, 1.00%, 100.00%. XARC1: Redis ±0.61W, 1.23%, 99.70%; MySQL ±1.97W, 3.86%, 100.00%; FFMPEG ±0.63W, 1.26%, 100.00%.
  • 93. 93 The concept of Working Regime • Domain-specific feature: hardware modules currently used • We defined the concept of working regime: “Given the controllable hardware modules on a device, a working regime is a combination of their internal state” Working regime A Working regime B Working regime C (1) A first case study: power models for Android devices
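To make the definition concrete, here is a minimal sketch of how the candidate working regimes can be enumerated; the module names and their internal states below are illustrative assumptions, not the actual set used in the thesis.

```python
# Minimal sketch: enumerate working regimes as combinations of module states.
# The modules and states listed here are purely illustrative assumptions.
from itertools import product

module_states = {
    "screen":   ["off", "on"],
    "wifi":     ["off", "on"],
    "cellular": ["idle", "2g", "3g", "lte"],
    "gps":      ["off", "on"],
}

# A working regime is one combination of the modules' internal states.
working_regimes = [dict(zip(module_states, combo))
                   for combo in product(*module_states.values())]
print(len(working_regimes))  # 2 * 2 * 4 * 2 = 32 candidate regimes
```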
  • 94. 94 MISO Model for every configuration • We tackle the problem of power model estimation in a fixed configuration with a linear Multiple Input Single Output (MISO) model [Equation annotations: battery prediction, previous battery levels, exogenous input values, model parameters] (1) A first case study: power models for Android devices
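A hedged sketch of what such a per-regime linear MISO model can look like, in ARX form; the exact regressors and lag orders are not shown on the slide, so treat them as assumptions:

```latex
\hat{b}(k) \;=\; \underbrace{\sum_{i=1}^{n_a} a_i\, b(k-i)}_{\text{previous battery levels}}
\;+\; \underbrace{\sum_{j=1}^{m} c_j\, u_j(k)}_{\text{exogenous input values}}
```

where $\hat{b}(k)$ is the predicted battery level, $u_j$ are the exogenous inputs, and $a_i$, $c_j$ are the model parameters identified for each working regime (e.g., via least squares).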
  • 95. 95 Actions on controllable variables • They are determined by the user’s behavior • We model the evolution of the smartphone’s configuration as a Markov Decision Process • A state for every configuration • Transition weights represent the probability of moving from one configuration to another Configuration A Configuration B Configuration C (1) A first case study: power models for Android devices
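A minimal sketch of how the transition weights can be estimated from an observed configuration trace; the trace format and the simple frequency-based estimate below are assumptions for illustration:

```python
# Minimal sketch: estimate transition probabilities between configurations
# from an observed sequence of configurations (illustrative trace).
from collections import Counter, defaultdict

trace = ["A", "A", "B", "A", "C", "C", "B", "A"]

counts = defaultdict(Counter)
for src, dst in zip(trace, trace[1:]):
    counts[src][dst] += 1

transition_prob = {
    src: {dst: n / sum(row.values()) for dst, n in row.items()}
    for src, row in counts.items()
}
# e.g. transition_prob["A"] -> {"A": 1/3, "B": 1/3, "C": 1/3}
```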
  • 96. - XEMPOWER - DETAILS AND RESULTS
  • 97. Proposed Approach • At each context switch, start counting the hardware events of interest • The configured PMC registers store the counts associated with the domain that is about to run 97 [Diagram: per-core timelines (Core 0 … Core N) of the domains scheduled by the Xen kernel]
  • 98. Proposed Approach • At the next context switch, read and store PMC values, accounted to the domain that was running • Counters are then cleared 98 [Diagram: PMC values read at each context switch on each core]
  • 99. Proposed Approach • Steps A and B are performed at every context switch, on every CPU of the system (i.e., physical core or hardware thread) • The reason is that each domain may have multiple virtual CPUs (vCPUs) 99 [Diagram: the same sampling repeated on every core at every context switch]
  • 100. Proposed Approach • Finally, the PMC values are aggregated by domain and reported, or used for further estimations • Expose the collected data to a higher level – how? 100 [Diagram: the XeMPower daemon in Dom0 collects the per-core traces]
  • 101. Proposed Approach xentrace • a lightweight trace-capturing facility present in Xen • we tag every trace record with the ID of the scheduled domain and its current VCPU • a timestamp is kept to later reconstruct the trace flow 101 [Diagram: the XeMPower daemon in Dom0 receives hardware events per core and energy readings per socket]
  • 102. Use Case: Power Consumption Attribution Use case • Enable real-time attribution of CPU power consumption to each guest • Socket-level energy measurements are also read (via the Intel RAPL interface) at each context switch 102 [Diagram: XeMPowerCLI consumes the data exposed by the XeMPower daemon in Dom0]
  • 103. Use Case: Power Consumption Attribution Power models from PMC traces • High correlation between hardware events and power consumption [28] • Non-halted cycles are the metric that best correlates with power consumption (linear correlation coefficient above 0.95) • This correlation suggests that the higher a domain’s rate of non-halted cycles, the more CPU power that domain consumes 103 [Diagram: as in the previous slide]
  • 104. Use Case: Power Consumption Attribution Power models from PMC traces (continued from the previous slide) Idea • Split the system-level power consumption and account it to the virtual guests 104 [Diagram: as in the previous slide]
  • 105. Use Case: Power Consumption Attribution Proposed accounting approach 1. For each tumbling window, the XeMPower daemon calculates the total number of non-halted cycles (one of the PMCs traced) 2. We estimate the percentage of non-halted cycles of each domain over the total; this represents the contribution of each domain to the whole CPU power consumption 3. Finally, we split the socket power consumption proportionally to the estimated contribution of each domain (see the sketch below) 105 [Diagram: as in the previous slide]
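A minimal sketch of steps 2 and 3, assuming the per-domain non-halted cycle counts and the RAPL socket power for one tumbling window are already available; the names and numbers are illustrative:

```python
# Minimal sketch: split the measured socket power of one tumbling window
# across domains, proportionally to their non-halted cycles.
def attribute_power(socket_power_w, cycles_per_domain):
    total = sum(cycles_per_domain.values())
    if total == 0:
        return {dom: 0.0 for dom in cycles_per_domain}
    return {dom: socket_power_w * cycles / total
            for dom, cycles in cycles_per_domain.items()}

# One window: Dom0 plus two guests, socket power read via RAPL (illustrative values).
print(attribute_power(42.0, {"Dom0": 1.2e9, "guest-1": 3.6e9, "guest-2": 2.4e9}))
# -> {'Dom0': 7.0, 'guest-1': 21.0, 'guest-2': 14.0}
```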
  • 106. Experimental evaluation 106 • Back to the XeMPower requirements: 1. provide precise attribution of hardware events to virtual tenants 2. agnostic to the mapping between virtual and physical resources, hosted applications and scheduling policies 3. add negligible overhead • Goals of the experimental evaluation: – show how XeMPower monitoring components incur very low overhead under different configurations and workload conditions
  • 107. Experimental evaluation 107 • Overhead metric: – the difference in the system’s power consumption while using XeMPower versus an off-the-shelf Xen 4.6 installation • Experimental setup: – 2.8 GHz quad-core Intel Xeon E5-1410 processor (4 hardware threads) – a Watts up? PRO meter to monitor the entire machine’s power consumption – Each guest repeatedly runs a multi-threaded compute-bound microbenchmark on three VCPUs and uses a stripped-down Linux 3.14 as the guest OS
  • 108. Experimental evaluation 108 • Three system configurations: 1. the baseline configuration uses off-the-shelf Xen 4.4 2. the patched configuration introduces the kernel-level instrumentation without the XeMPower daemon 3. the monitoring configuration is the patched configuration with the XeMPower daemon running and reporting statistics • Four running scenarios: – an idle scenario in which the system only runs Dom0 – 3 running-n scenarios, where n = {1, 2, 3} indicates the number of guest domains in addition to Dom0 • The idea is to stress the system with an increasing number of CPU-intensive tenant applications • This increases the amount of data traced and collected by XeMPower
  • 109. Experimental Results 109 • Mean power consumption (μ), in Watts, for scenarios idle and running-{1,2,3}, and configurations baseline (b), patched (p), and monitoring (m) • Mean power values are reported with their 95% confidence interval • At a glance, the measurements are very close [Tables: pinned-VCPU and unpinned-VCPU cases]
  • 110. Experimental Results 110 • We estimate an upper bound ϵ on the maximum overhead using a hypothesis test (see below) • A rejection of the null hypothesis means that there is strong statistical evidence that the power consumption overhead is lower than ϵ • We compute ϵ for the considered test cases and scenarios, on the mean power consumption values (μ), at significance level α = 5% • We compare the overhead with the one measured for XenMon, a performance monitoring tool for Xen • unlike XeMPower, XenMon does not collect PMC reads • it is still a reference design in the context of runtime monitoring for the Xen ecosystem
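Written out, the test sketched on the slide reads roughly as follows; the exact test statistic is not shown here, so this is only how we read the slide. For a given scenario, let $\mu_{\text{test}}$ and $\mu_{\text{baseline}}$ be the mean power consumption of the instrumented and baseline configurations; then

```latex
H_0:\ \mu_{\text{test}} - \mu_{\text{baseline}} \ge \epsilon
\qquad \text{vs.} \qquad
H_1:\ \mu_{\text{test}} - \mu_{\text{baseline}} < \epsilon
```

and rejecting $H_0$ at significance level $\alpha = 5\%$ gives statistical evidence that the overhead is below $\epsilon$; the reported $\epsilon$ is presumably the smallest bound for which the rejection holds.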
  • 111. Experimental Results 111 • Estimated upper bound ϵ for the power consumption overhead, in Watts • Parenthetical values are the overheads w.r.t. mean power consumption • XeMPower introduces an overhead not greater than 1.18W (1.58%), observed for the [unpinned-VCPU, running-3, patched] case • In all the other cases, the overhead is less than 1W (and less than 1%) • This result is satisfactory compared to the 1-2% overhead observed for XenMon, the reference design against which XeMPower is compared
  • 113. Related work: PUPiL [5] 113 [5] H. Zhang and H. Hoffmann. Maximizing performance under a power cap: A comparison of hardware, software, and hybrid techniques. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2016. • PUPiL, a hybrid power capping framework that aims to combine the timeliness of hardware techniques with the efficiency of software ones • Proposed approach: – combines hardware (i.e., the Intel RAPL interface [10]) and software (i.e., resource partitioning and allocation) techniques – exploits a canonical ODA control loop, one of the main building blocks of self-aware computing • Limitations: – the applications running on the system need to be instrumented with the Heartbeat framework, to provide a uniform throughput metric – applications must run bare-metal on Linux • These conditions might not hold in a multi-tenant virtualized environment
  • 114. The Xen Hypervisor 114 Slides from: http://www.slideshare.net/xen_com_mgr/xpds16-porting-xen-on-arm-to-a-new-soc-julien-grall-arm
  • 115. 1. Performance metric identification • Hardware event counters as low-level metrics of performance • We exploit the Intel Performance Monitoring Unit (PMU) to monitor the number of Instructions Retired (IR) accounted to each domain in a certain time window – an insight into how many instructions were completely executed (i.e., successfully reached the end of the pipeline) – it represents a reasonable indicator of performance, as the manufacturer itself suggests [6] 115 [6] Clockticks per instructions retired (cpi). https://software.intel.com/en-us/node/544403. Accessed: 2016-06-01.
  • 116. 2. Decision phase and virtualization • Evaluation criterion: the average IR rate over a certain time window (see the sketch below) – the time window allows the workload to adapt to the current configuration – comparing the IR rates of different configurations highlights which one makes the workload perform better • Resource allocation granularity: core-level – each domain owns a set of virtual CPUs (vCPUs) – the machine exposes a set of physical CPUs (pCPUs) – each vCPU can be mapped on a pCPU for a certain amount of time, and multiple vCPUs can be mapped on the same pCPU • We wanted our allocation to cover the whole set of pCPUs, if possible 116
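A minimal sketch of the decide step described above; the function names and the sampling data layout are assumptions, while the real implementation works on the XeMPower traces:

```python
# Minimal sketch: keep the core allocation whose average Instructions-Retired
# rate over the observation window is the highest.
def average_ir_rate(samples):
    """samples: list of (instructions_retired, elapsed_seconds) tuples for one window."""
    ir = sum(s[0] for s in samples)
    secs = sum(s[1] for s in samples)
    return ir / secs if secs > 0 else 0.0

def decide(samples_by_allocation):
    """samples_by_allocation: {n_pcpus: samples observed while running on n_pcpus}."""
    rates = {n: average_ir_rate(s) for n, s in samples_by_allocation.items()}
    return max(rates, key=rates.get)  # allocation that made the workload perform best

# e.g. decide({2: [(3.1e9, 1.0), (3.0e9, 1.0)], 4: [(5.8e9, 1.0), (6.1e9, 1.0)]}) -> 4
```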
  • 117. 3. Extending the hypervisor - RAPL • Working with the Intel RAPL interface: – it enforces the cap by harshly cutting the frequency and the voltage of the whole CPU socket • On a bare-metal operating system, it is a matter of reading and writing the right Model Specific Registers (MSRs): • MSR_RAPL_POWER_UNIT: read the processor-specific time, energy and power units, used to scale each value read or written • MSR_PKG_POWER_LIMIT: written to set a limit on the power consumption of the whole socket (see the sketch below) • In a virtualized environment: – the Xen hypervisor does not natively support the RAPL interface – we developed custom hypercalls, with kernel callback functions and memory buffers – we developed a CLI tool that validates the input parameters and instantiates and invokes the Xen command interface to launch the hypercalls 117
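A minimal bare-metal sketch of the MSR interaction described above, assuming Linux with the msr driver loaded (`modprobe msr`) and root privileges; the bit layout follows the Intel SDM but should be treated as an assumption to verify, and a production version would read-modify-write the register to preserve the time-window and second-limit fields:

```python
# Minimal sketch: read the RAPL power units and set a package power cap
# through /dev/cpu/<n>/msr (Linux msr driver).
import os
import struct

MSR_RAPL_POWER_UNIT = 0x606
MSR_PKG_POWER_LIMIT = 0x610

def rdmsr(cpu, reg):
    with open(f"/dev/cpu/{cpu}/msr", "rb") as f:
        return struct.unpack("<Q", os.pread(f.fileno(), 8, reg))[0]

def wrmsr(cpu, reg, value):
    with open(f"/dev/cpu/{cpu}/msr", "wb") as f:
        os.pwrite(f.fileno(), struct.pack("<Q", value), reg)

units = rdmsr(0, MSR_RAPL_POWER_UNIT)
power_unit_w = 0.5 ** (units & 0xF)  # watts represented by one limit LSB

def set_pkg_power_cap(watts, cpu=0):
    # Power limit #1 in bits 14:0 (in power units), enable bit 15, clamp bit 16.
    # NOTE: this sketch leaves the time window and power limit #2 at zero.
    limit = int(watts / power_unit_w) & 0x7FFF
    wrmsr(cpu, MSR_PKG_POWER_LIMIT, limit | (1 << 15) | (1 << 16))

set_pkg_power_cap(30.0)  # e.g., cap the socket at 30 W
```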
  • 118. 3. Extending the hypervisor - Resources • cpupool tool: – allows clustering the physical CPUs into different pools – the pool scheduler will schedule a domain’s vCPUs only on the pCPUs that are part of that pool – as a new resource allocation is chosen by the decide phase, we increase or decrease the number of pCPUs in the pool – and pin the domain’s vCPUs to them, to increase workload stability (see the sketch below) • xenpm is NOT used: – it would set a maximum and minimum frequency for each pCPU – and may interfere with the actuation performed through RAPL 118
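A minimal sketch of the actuation step through the xl toolstack; the pool and domain names are illustrative, error handling is omitted, and a robust version should first diff the request against the pool's current membership (e.g., via `xl cpupool-list -c`) before adding or removing pCPUs:

```python
# Minimal sketch: grow a cpupool to the chosen pCPU set and pin the domain's
# vCPUs to it, as the actuation step of the control loop.
import subprocess

def xl(*args):
    subprocess.check_call(("xl",) + args)

def apply_allocation(pool, domain, pcpus):
    """Give `domain` the physical CPUs listed in `pcpus` (list of CPU ids)."""
    for cpu in pcpus:                              # grow the pool
        xl("cpupool-cpu-add", pool, str(cpu))
    cpulist = ",".join(str(c) for c in pcpus)
    xl("vcpu-pin", domain, "all", cpulist)         # pin vCPUs for stability

apply_allocation("Pool-caps", "guest-1", [1, 2, 3])  # illustrative names
```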
  • 119. - MARC - MODELING APPROACHES
  • 120. 120 Motivation - Modeling approaches (1) PHYSICAL MODELS: Pros: deep insights; accurate. Cons: invasive instrumentation; ignore/underrate degradation. DATA-DRIVEN MODELS: Pros: adaptive; ever-improving; generalizable; at-a-glance view. Cons: accuracy depends on acquisition procedures.
  • 121. 121 Motivation - Modeling approaches (2) OFF-LINE MODELING: Pros: controllable environment; ad-hoc instrumentation. Cons: relies on reasonable simulations; does not evolve with the target; requires ex-novo modeling for new targets. ON-LINE MODELING: Pros: intrinsic ability to evolve with the target; tackles new targets; does not require in-lab phases. Cons: noisy real-world environment.
  • 122. 122 Motivation - Modeling approaches (3) On-demand, data-driven modeling = a GENERAL, AS-A-SERVICE MODELING FRAMEWORK
  • 124. MARC METHODOLOGY Our KD&DM procedure 15: Preprocessing, Data Manipulation, Feature Selection [Pipeline diagram: phases 1, 2A, 2B, 2C, 3]
  • 125. MARC METHODOLOGY Our KD&DM procedure 16: Preprocessing: STANDARD DATA CLEANING OPERATIONS (scope: single sample) 1. Coherence Correction 2. Residual Incoherence Elimination 3. Out-of-Bound Elimination 4. Granularity Reduction
  • 126. MARC METHODOLOGY Our KD&DM procedure 17: Data Manipulation (scope: full dataset): • Standardization • Quantization Correction • …
  • 127. MARC METHODOLOGY Our KD&DM procedure 18: Feature Selection (scope: feature-wise): • Manual feature fusion and exclusion • Automatic Configuration Feature Elicitation and Synthesis
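A minimal sketch of a few of these steps on a tabular trace; the column names, bounds and aggregation window are illustrative assumptions, and the actual MARC operators are richer than this:

```python
# Minimal sketch: out-of-bound elimination, granularity reduction and
# standardization on a raw monitoring trace loaded into a DataFrame.
import pandas as pd

def preprocess(df, bounds, window=10):
    # Out-of-bound elimination: drop samples outside physically plausible ranges.
    for col, (lo, hi) in bounds.items():
        df = df[(df[col] >= lo) & (df[col] <= hi)]
    df = df.reset_index(drop=True)
    # Granularity reduction: aggregate groups of `window` raw samples.
    df = df.groupby(df.index // window).mean()
    # Standardization (full-dataset scope): zero mean, unit variance per feature.
    return (df - df.mean()) / df.std()

# e.g. preprocess(raw_trace, bounds={"power_w": (0, 300), "cpu_util": (0, 100)})
```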
  • 128. - MARC - A SCALABLE PLATFORM
  • 129. 4. MARC PLATFORM Scalability 14: SCALE-IN, INTRA-MODULE PARALLELISM. Technologies: Scala, Akka. [Diagram: a Load Balancer dispatches requests to replicated Communication Actors, each wrapping the module-specific functional logic]
  • 130. 4. MARC PLATFORM Scalability 15: SCALE-OUT, MODULE DISTRIBUTION. Technologies: Scala, Akka, Docker. [Diagram: the same Load Balancer / Communication Actor / module-specific functional logic structure, with each module packaged as a Docker container]
  • 131.-139. 4. MARC PLATFORM Scalability 16: BACKWARD ACTIVATION. Technologies: Scala, Akka, Docker, Scalatra. [Animation across slides 131-139: the WEBAPP asks for PHASE2A; the request propagates backward through the pipeline phases (PHASE1, PHASE2A, PHASE2B, PHASE2C, PHASE3), the needed upstream results are computed (⏳) and completed (✅), and the result is returned (“Thank you”); a later request for PHASE2B finds part of its upstream work “Already computed!” and completes immediately]
  • 140. 4. MARC PLATFORM Scalability 17: WHITEBOARD APPROACH. Technologies: Scala, Akka, Docker, Scalatra, Redis. [Diagram: phases PHASE1, PHASE2A, PHASE2B, PHASE2C, PHASE3 connected by a backbone, sharing an internal whiteboard and exposing an external whiteboard]
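A minimal sketch of backward activation over a shared whiteboard; the dependency graph, the phase logic and the in-memory dict standing in for the Redis-backed whiteboard are all illustrative assumptions:

```python
# Minimal sketch: requesting a phase activates its upstream phases backward,
# and already-computed results are served from the shared whiteboard.
WHITEBOARD = {}

DEPENDENCIES = {                     # assumed dependency graph
    "PHASE1": [],
    "PHASE2A": ["PHASE1"],
    "PHASE2B": ["PHASE1"],
    "PHASE2C": ["PHASE1"],
    "PHASE3": ["PHASE2A", "PHASE2B", "PHASE2C"],
}

def run_phase(name, inputs):
    # Stand-in for the module-specific functional logic of each phase.
    return f"{name}({', '.join(inputs) if inputs else 'raw data'})"

def activate(phase):
    if phase in WHITEBOARD:                                  # "Already computed!"
        return WHITEBOARD[phase]
    inputs = [activate(dep) for dep in DEPENDENCIES[phase]]  # backward activation
    WHITEBOARD[phase] = run_phase(phase, inputs)
    return WHITEBOARD[phase]

print(activate("PHASE2A"))   # triggers PHASE1, then PHASE2A
print(activate("PHASE2B"))   # PHASE1 is served from the whiteboard
```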