Enabling power-awareness
for the Xen Hypervisor
Matteo Ferroni, Juan A. Colmenares, Steven Hofmeyr,
John D. Kubiatowicz, Marco D. Santambrogio
EWiLi’16, 10/06/2016, Pittsburgh PA, USA.
Co-located with the Embedded Systems Week.
Outline
• Introduction on virtualization
• Problem definition and proposed solution
• Requirements and related work
• Proposed approach and system design
• Use case: power consumption attribution
• Experimental evaluation and results
• Conclusion and future work
Introduction
• Embedded ecosystems
– from microcontrollers to multi-core processors that are
cheaper, smaller, and less power-hungry
• Two main advantages:
– multiple embedded applications on the same System-on-Chip (SoC)
– application concurrency and parallelism to obtain better
performance
Virtualization
• Hardware-assisted and software virtualization enter the
context of embedded systems
– Features:
• applications do not need to be changed
• physical resources shared between applications
• strong security and isolation guarantees
• Heterogeneity supported and fostered:
– software heterogeneity, e.g., memory-bound, I/O-bound, or
CPU-bound workloads
– hardware heterogeneity, e.g., different SoCs and platforms
The Xen Hypervisor
Slides from: http://www.slideshare.net/xen_com_mgr/xpds16-porting-xen-on-arm-to-a-new-soc-julien-grall-arm
Problem Definition
• Problem definition:
– optimization and consolidation are not easy tasks
– power consumption is a major concern
• A strong requirement:
– online monitoring system to accurately observe
applications’ and system’s behavior
Proposed Solution
• Proposed solution:
– XeMPower, a lightweight hardware and resource
monitoring solution for the Xen hypervisor
• Basic idea:
– monitor hardware events and precisely attribute them to
each virtual guest
• Use case:
– enable real-time attribution of CPU power consumption
to each guest
Requirements
• XeMPower requirements:
1. provide precise attribution of hardware events to virtual
tenants;
2. be agnostic to the mapping between virtual and physical
resources, hosted applications, and scheduling policies;
3. add negligible overhead.
• Proposed approach:
– kernel-level instrumentation to gather Performance
Monitoring Counter (PMC) values
– Dom0-level aggregation and processing
Performance Monitoring Counter (PMC)
• Example of the counters of interest:
[Table of example counters not preserved in this transcript]
Related Work: PMC monitoring
• Every monitoring tool is affected by a tradeoff:
– accuracy vs. overhead
• Different approaches to hardware events monitoring:
– Code instrumentation (e.g., Valgrind and IgProf)
• Inject extra code into the applications at compile time and/or runtime,
allowing complex analysis
• High overhead, not suitable for runtime analysis in production
– Performance counter tools (e.g., Perf, OProfile, PAPI
libraries)
• Sample system events at different granularities (e.g., thread level,
process level, set of processors, or the entire system)
• Use kernel modules to access different categories of events:
hardware events, software events (context switches or minor faults),
and tracepoint events (disk I/O and TCP events).
Related Work: PMC monitoring in Xen
• Using the Xen hypervisor:
– Xenoprof
• a system-wide statistical profiling toolkit based on OProfile
• allows the domain itself to collect its own hardware event
counters (active mode)
• passive-mode profiling (i.e., the domain treated as a “black
box”) is limited and not agnostic to hosted applications
– Perfctr-Xen
• supports performance counter virtualization and reprograms
the Performance Monitoring Unit (PMU) configuration
registers (e.g., event selectors) at every context switch
• good for workload profiling inside a domain, but provides no
centralized runtime monitoring
Proposed Approach
• At each context switch, start counting the hardware events of
interest
• The configured PMC registers store the counts associated with the
domain that is about to run
[Figure: per-core timeline (Core 0 … Core N) in the Xen kernel, with step A marked where counting starts]
Proposed Approach
• At the next context switch, read and store PMC values, accounted to
the domain that was running
• Counters are then cleared
[Figure: same timeline, with step B marked at the next context switch on each core]
Proposed Approach
• Steps A and B are performed at every context switch on every CPU in
the system (i.e., physical core or hardware thread).
• The reason is that each domain may have multiple virtual CPUs
(VCPUs).
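The two steps above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the XeMPower kernel patch: `FakePMU`, `CoreAccounting`, and the record layout are hypothetical names, and the "counters" are plain Python integers rather than hardware registers.

```python
from collections import defaultdict


class FakePMU:
    """Hypothetical stand-in for one core's performance counters."""

    def __init__(self):
        self.counters = defaultdict(int)  # event name -> count since last clear

    def tick(self, event, n):
        # Simulates hardware incrementing a counter while a VCPU runs.
        self.counters[event] += n

    def read(self):
        return dict(self.counters)

    def clear(self):
        self.counters.clear()


class CoreAccounting:
    """Per-core bookkeeping executed at every context switch."""

    def __init__(self, pmu):
        self.pmu = pmu
        self.running = None   # (domain, vcpu) currently scheduled on this core
        self.records = []     # one record per completed scheduling interval

    def context_switch(self, next_domain, next_vcpu, timestamp):
        # Step B: read and store the counts, accounted to the domain
        # that was running until now.
        if self.running is not None:
            domain, vcpu = self.running
            self.records.append(
                {"ts": timestamp, "domain": domain, "vcpu": vcpu,
                 "pmc": self.pmu.read()}
            )
        # Counters are then cleared.
        self.pmu.clear()
        # Step A: counting restarts on behalf of the domain about to run.
        self.running = (next_domain, next_vcpu)
```

One `CoreAccounting` instance per physical core or hardware thread mirrors the point that a domain with several VCPUs can be running on several CPUs at once.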
Proposed Approach
• Finally, the PMC values are aggregated by domain and then reported
or used for other estimations
• Expose the collected data to a higher level
– how?
[Figure: same timeline; the values stored at step B are collected by the XeMPower daemon in Dom0]
Proposed Approach
xentrace
• a lightweight trace-capturing facility present in Xen
• we tag every trace record with the ID of the scheduled domain and
its current VCPU
• a timestamp is kept to later reconstruct the trace flow
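To make the tagging concrete, here is a minimal sketch of how tagged records could be ordered by timestamp and aggregated per domain on the Dom0 side. The `TraceRecord` layout and the `aggregate_by_domain` helper are assumptions made for illustration; they are not the xentrace record format or the XeMPower daemon's code. The fixed-size (tumbling) window is borrowed from the accounting step described later in the deck.

```python
from collections import defaultdict
from typing import NamedTuple


class TraceRecord(NamedTuple):
    """Hypothetical shape of one tagged trace record."""
    timestamp: int  # when the record was emitted
    domain: int     # ID of the domain that had been scheduled
    vcpu: int       # VCPU of that domain
    event: str      # hardware event name
    count: int      # counter value read at the context switch


def aggregate_by_domain(records, window):
    """Sum counts per (window index, domain, event).

    Records may arrive interleaved from several cores, so they are first
    ordered by timestamp to reconstruct the trace flow.
    """
    totals = defaultdict(int)
    for rec in sorted(records, key=lambda r: r.timestamp):
        key = (rec.timestamp // window, rec.domain, rec.event)
        totals[key] += rec.count
    return dict(totals)
```

Counts from different VCPUs of the same domain fall into the same bucket, which is what makes the result independent of how VCPUs are mapped to physical CPUs.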
[Figure: same timeline; trace records carry hardware events per core and energy per socket from the Xen kernel to the XeMPower daemon in Dom0]
Use Case: Power Consumption Attribution
Use case
• Enable real-time attribution of CPU power consumption to each guest
• Socket-level energy measurements are also read (via the Intel RAPL
interface) at each context switch
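The slides do not show how the RAPL values are turned into power, so the following is a hedged sketch of the usual arithmetic rather than XeMPower's code. It assumes the raw values come from the package energy-status counter (a 32-bit counter that wraps around) and that the energy unit is encoded in bits 12:8 of the RAPL power-unit register; the function names are hypothetical, and reading the registers themselves is left out.

```python
def energy_unit_joules(power_unit_msr: int) -> float:
    """Decode the energy unit: 1 / 2**ESU joules, ESU in bits 12:8."""
    esu = (power_unit_msr >> 8) & 0x1F
    return 1.0 / (1 << esu)


def energy_delta_joules(prev_raw: int, curr_raw: int, unit_j: float) -> float:
    """Energy consumed between two reads of the 32-bit energy counter.

    The subtraction is taken modulo 2**32, which handles a single
    wraparound between the two reads (an assumption of this sketch).
    """
    delta = (curr_raw - prev_raw) & 0xFFFFFFFF
    return delta * unit_j


def mean_power_watts(prev_raw: int, curr_raw: int, unit_j: float,
                     elapsed_s: float) -> float:
    """Average socket power over the interval between the two reads."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return energy_delta_joules(prev_raw, curr_raw, unit_j) / elapsed_s
```

Because the energy reading is per socket, it cannot be assigned to a domain directly; that is what the attribution step in the following slides addresses.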
[Figure: same timeline and data flow, with the XeMPower CLI shown together with the XeMPower daemon in Dom0]
Use Case: Power Consumption Attribution
Power models from PMC traces
• High correlation between hardware events and power consumption [28]
• The number of non-halted cycles is the metric that correlates best
with power consumption (linear correlation coefficient above 0.95)
• This correlation suggests that the higher a domain’s rate of
non-halted cycles, the more CPU power the domain consumes
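For readers who want to check this kind of relationship on their own traces, the linear (Pearson) correlation coefficient can be computed as below. This is a generic sketch, not the analysis behind [28]; `pearson_r` is a hypothetical helper, and the inputs would be paired samples of non-halted cycle rates and measured power.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

A value close to 1 indicates that power grows almost linearly with the event rate, which is the property the attribution approach relies on.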
Idea
• Split system-level power consumption and attribute it to virtual
guests
Use Case: Power Consumption Attribution
Proposed accounting approach
1. For each tumbling window, the XeMPower daemon calculates the total
number of non-halted cycles (one of the PMCs traced)
2. We estimate each domain’s percentage of non-halted cycles over the
total number of non-halted cycles; this represents the contribution
of each domain to the whole CPU power consumption
3. Finally, we split the socket power consumption proportionally to the
estimated contributions of each domain
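The three steps above amount to a proportional split, sketched below for a single window. `attribute_power` and its argument names are hypothetical, and the handling of a window with zero non-halted cycles (nothing is attributed) is an assumption of this sketch that the slides do not specify.

```python
def attribute_power(cycles_by_domain, socket_power_w):
    """Split socket power across domains by share of non-halted cycles.

    cycles_by_domain: mapping of domain ID -> non-halted cycles counted
                      for that domain in one tumbling window.
    socket_power_w:   socket power measured over the same window, in watts.
    """
    # Step 1: total number of non-halted cycles in the window.
    total = sum(cycles_by_domain.values())
    if total == 0:
        # Assumption: with no activity recorded, attribute nothing.
        return {dom: 0.0 for dom in cycles_by_domain}
    # Steps 2 and 3: each domain's share of the cycles, applied to the
    # measured socket power.
    return {
        dom: socket_power_w * cycles / total
        for dom, cycles in cycles_by_domain.items()
    }
```

For example, a window in which Dom0 accounts for a quarter of the non-halted cycles would be charged a quarter of the socket power measured over that window.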
Experimental evaluation
• Back to the XeMPower requirements:
1. provide precise attribution of hardware events to virtual
tenants
2. be agnostic to the mapping between virtual and physical
resources, hosted applications, and scheduling policies
3. add negligible overhead
• Goals of the experimental evaluation:
– show that the XeMPower monitoring components incur
very low overhead under different configurations
and workload conditions
Experimental evaluation
• Overhead metric:
– the difference in the system’s power consumption
while using XeMPower versus an off-the-shelf Xen 4.6
installation
• Experimental setup:
– 2.8 GHz quad-core Intel Xeon E5-1410 processor (4
hardware threads)
– a Watts up? PRO meter to monitor the entire
machine’s power consumption
– Each guest repeatedly runs a multi-threaded
compute-bound microbenchmark on three VCPUs
and uses a stripped-down Linux 3.14 as the guest OS
Experimental evaluation
• Three system configurations:
1. the baseline configuration uses off-the-shelf Xen 4.4
2. the patched configuration introduces the kernel-level
instrumentation without the XeMPower daemon
3. the monitoring configuration is the patched configuration with the
XeMPower daemon running and reporting statistics
• Four running scenarios:
– an idle scenario in which the system only runs Dom0
– three running-n scenarios, where n ∈ {1, 2, 3} indicates the number of
guest domains in addition to Dom0
• The idea is to stress the system with an increasing number of
CPU-intensive tenant applications
• This increases the amount of data traced and collected by
XeMPower
Experimental Results
• Mean power consumption (μ), in Watts, for scenarios idle and running-
{1,2,3}, and configurations baseline (b), patched (p), and monitoring
(m)
• Mean power values are reported with their 95% confidence interval
• At a glance, the measurements are very close to one another
[Tables for the pinned-VCPU and unpinned-VCPU cases are not preserved in this transcript]
Experimental Results
• We estimate an upper bound ϵ for the maximum overhead using a
hypothesis test:
• A rejection of the null hypothesis means that there is strong statistical
evidence that the power consumption overhead is lower than ϵ
• We compute ϵ for the considered test cases and scenarios from the
mean power consumption values (μ), at a significance level of α = 5%
• We compare this overhead with the one measured for XenMon, a
performance monitoring tool for Xen
• unlike XeMPower, XenMon does not collect PMC reads
• it is still a reference design in the context of runtime monitoring for
the Xen ecosystem
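The formula of the hypothesis test is not preserved in this transcript. One formulation consistent with the bullets above (rejecting the null hypothesis gives evidence that the overhead is lower than ϵ) would be the following one-sided test, where μ_b is the mean power of the baseline configuration and μ_x that of the patched (p) or monitoring (m) configuration. The exact statistic used by the authors is not shown here, so this should be read as an assumption:

```latex
H_0 : \mu_x - \mu_b \ge \epsilon
\qquad \text{versus} \qquad
H_1 : \mu_x - \mu_b < \epsilon ,
\qquad x \in \{p, m\}
```

Under this reading, the reported ϵ is the smallest bound for which H_0 can still be rejected at the chosen significance level α.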
Experimental Results
• Estimated upper bound ϵ for the power consumption overhead, in Watts
• Parenthetical values are the overheads relative to the mean power consumption
• XeMPower introduces an overhead no greater than 1.18 W (1.58%),
observed for the [unpinned-VCPU, running-3, patched] case
• In all the other cases, the overhead is less than 1 W (and less than 1%)
• This result is satisfactory compared to the 1-2% overhead observed for
XenMon, the reference tool against which XeMPower is compared
Conclusion and Future Work
• We presented XeMPower, a lightweight monitoring solution for Xen that
precisely attributes hardware events to virtual guests
• We described its use in online attribution of CPU power consumption
to individual domains
• Our results show that XeMPower can provide continuous statistics with
very low overhead compared to an off-the-shelf Xen installation
• Future work
– explore the complementary use of offline characterization of both
hardware and guest workloads, towards power consumption predictions
Questions?

Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
Unit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfUnit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdf
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 

[EWiLi2016] Enabling power-awareness for the Xen Hypervisor

  • 1. Enabling power-awareness for the Xen Hypervisor Matteo Ferroni, Juan A. Colmenares, Steven Hofmeyr, John D. Kubiatowicz, Marco D. Santambrogio EWiLi’16, 10/06/2016, Pittsburgh PA, USA. Co-located with the Embedded Systems Week.
  • 2. Outline • Introduction on virtualization • Problem definition and proposed solution • Requirements and related work • Proposed approach and system design • Use case: power consumption attribution • Experimental evaluation and results • Conclusion and future work
  • 3. Introduction • Embedded ecosystems – from microcontrollers to multi-core processors: cheaper, smaller, and less power-hungry • Two main advantages: – multiple embedded applications on the same System-on-Chip (SoC) – application concurrency and parallelism to obtain better performance
  • 4. Virtualization • Hardware-assisted and software virtualization enter the context of embedded systems – Features: • applications do not need to be changed • physical resources shared between applications • strong security and isolation guarantees • Heterogeneity supported and fostered: – software heterogeneity, i.e., memory-bound, I/O-bound, or CPU-bound workloads – hardware heterogeneity, i.e., different SoCs and platforms
  • 5-7. The Xen Hypervisor • Slides from: http://www.slideshare.net/xen_com_mgr/xpds16-porting-xen-on-arm-to-a-new-soc-julien-grall-arm
  • 8. Problem Definition • Problem definition: – optimization and consolidation are not easy tasks – power consumption is a major concern • A strong requirement: – an online monitoring system to accurately observe the applications’ and the system’s behavior
  • 9. Proposed Solution • Proposed solution: – XeMPower, a lightweight hardware and resource monitoring solution for the Xen hypervisor • Basic idea: – monitor hardware events and precisely attribute them to each virtual guest • Use case: – enable real-time attribution of CPU power consumption to each guest
  • 10. Requirements • XeMPower requirements: 1. provide precise attribution of hardware events to virtual tenants; 2. agnostic to the mapping between virtual and physical resources, hosted applications and scheduling policies; 3. add negligible overhead • Proposed approach: – kernel-level instrumentation to gather Performance Monitoring Counter (PMC) values – Dom0-level aggregation and processing
  • 11. Performance Monitoring Counter (PMC) • Example of the counters of interest: [table not captured in the transcript]
  • 12. Related Work: PMC monitoring • Every monitoring tool is affected by a tradeoff: – accuracy vs. overhead • Different approaches to hardware event monitoring: – Code instrumentation (e.g., Valgrind and IgProf) • Injects extra code into the applications at compile time and/or runtime, allowing complex analysis • High overhead, not suitable for runtime analysis in production – Performance counter tools (e.g., Perf, OProfile, PAPI libraries) • Sample the system’s events at different granularities (e.g., thread level, process level, set of processors, or the entire system) • Use kernel modules to access different categories of events: hardware events, software events (context switches or minor faults), and tracepoint events (disk I/O and TCP events)
  • 13. Related Work: PMC monitoring in Xen • Using the Xen hypervisor: – Xenoprof • a system-wide statistical profiling toolkit based on OProfile • allows the domain itself to collect its own hardware event counters (active mode) • passive mode profiling (i.e., domain treated as a “black box”) is limited and not agnostic to hosted applications – Perfctr-Xen • supports performance counter virtualization and reprograms the Performance Monitoring Unit (PMU) configuration registers (e.g., event selectors) at every context switch • good for workload profiling inside a domain, but offers no centralized runtime monitoring
  • 14. Proposed Approach • At each context switch, start counting the hardware events of interest • The configured PMC registers store the counts associated with the domain that is about to run [Figure: timeline of Core 0 to Core N in the Xen kernel]
  • 15. Proposed Approach • At the next context switch, read and store PMC values, accounted to the domain that was running • Counters are then cleared [Figure: timeline of Core 0 to Core N in the Xen kernel, with steps A and B around each context switch]
  • 16. Proposed Approach • Steps A and B are performed at every context switch on every CPU of the system (i.e., physical core or hardware thread) • The reason is that each domain may have multiple virtual CPUs (VCPUs) [Figure: timeline of Core 0 to Core N in the Xen kernel, with steps A and B around each context switch]
  • 17. Proposed Approach • Finally, the PMC values are aggregated by domain and then reported or used for other estimations • Expose the collected data to a higher level – how? [Figure: per-core timeline in the Xen kernel feeding the XeMPower daemon in Dom0]
  • 18. Proposed Approach • xentrace – a lightweight trace capturing facility present in Xen – we tag every trace record with the ID of the scheduled domain and its current VCPU – a timestamp is kept to later reconstruct the trace flow [Figure: per-core timeline in the Xen kernel feeding the XeMPower daemon in Dom0; hardware events per core, energy per socket]
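The flow described in slides 14 to 18 can be sketched as a small simulation. This is only an illustrative model, not XeMPower's code: `Core`, `TraceRecord`, `context_switch`, and `aggregate_by_domain` are hypothetical names, a single integer stands in for the PMC registers, and a Python list stands in for the xentrace buffer.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceRecord:
    """Hypothetical trace record: tagged with domain ID, VCPU, and timestamp."""
    timestamp: int
    domain_id: int
    vcpu_id: int
    core_id: int
    count: int


@dataclass
class Core:
    """Hypothetical model of one CPU with a single counter."""
    core_id: int
    pmc: int = 0
    domain_id: Optional[int] = None
    vcpu_id: Optional[int] = None


def context_switch(core, next_domain, next_vcpu, now, trace):
    # Read and store the count, accounted to the domain that was running.
    if core.domain_id is not None:
        trace.append(
            TraceRecord(now, core.domain_id, core.vcpu_id, core.core_id, core.pmc)
        )
    # Clear the counter, then start counting for the domain about to run.
    core.pmc = 0
    core.domain_id, core.vcpu_id = next_domain, next_vcpu


def aggregate_by_domain(trace):
    # Dom0-side step: replay records in timestamp order and sum per domain.
    totals = defaultdict(int)
    for rec in sorted(trace, key=lambda r: r.timestamp):
        totals[rec.domain_id] += rec.count
    return dict(totals)


# Invented example: domain 1 has two VCPUs running on two cores.
trace = []
core0, core1 = Core(0), Core(1)
context_switch(core0, next_domain=1, next_vcpu=0, now=0, trace=trace)
context_switch(core1, next_domain=1, next_vcpu=1, now=0, trace=trace)
core0.pmc += 100  # events counted while domain 1 runs on core 0
core1.pmc += 50   # events counted while domain 1 runs on core 1
context_switch(core0, next_domain=2, next_vcpu=0, now=10, trace=trace)
core0.pmc += 30   # events counted while domain 2 runs on core 0
context_switch(core0, next_domain=1, next_vcpu=0, now=20, trace=trace)
context_switch(core1, next_domain=3, next_vcpu=0, now=20, trace=trace)
totals = aggregate_by_domain(trace)
```

Because every record carries the domain ID, the per-domain totals do not depend on which core a VCPU happened to run on, which is the point of requirement 2.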
  • 19. Use Case: Power Consumption Attribution • Use case – Enable real-time attribution of CPU power consumption to each guest – Socket-level energy measurements are also read (via the Intel RAPL interface) at each context switch [Figure: per-core timeline in the Xen kernel feeding the XeMPower daemon and the XeMPower CLI in Dom0; hardware events per core, energy per socket]
  • 20-21. Use Case: Power Consumption Attribution • Power models from PMC traces – High correlation between hardware events and power consumption [28] – The non-halted cycle count is the metric that correlates best with power consumption (linear correlation coefficient above 0.95) – Such correlation suggests that the higher the rate of non-halted cycles for a domain, the more CPU power the domain consumes • Idea – Split system-level power consumption and account it to virtual guests [Figure: per-core timeline in the Xen kernel feeding the XeMPower daemon and the XeMPower CLI in Dom0; hardware events per core, energy per socket]
  • 22. Use Case: Power Consumption Attribution • Proposed accounting approach: 1. For each tumbling window, the XeMPower daemon calculates the total number of non-halted cycles (one of the traced PMCs) 2. We estimate the percentage of non-halted cycles for each domain over the total number of non-halted cycles; this represents the contribution of each domain to the whole CPU power consumption 3. Finally, we split the socket power consumption proportionally to the estimated contributions of each domain [Figure: per-core timeline in the Xen kernel feeding the XeMPower daemon and the XeMPower CLI in Dom0; hardware events per core, energy per socket]
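The three accounting steps above amount to a proportional split, which can be sketched as follows. The function names and the input numbers are invented for illustration; the sketch also assumes that the socket energy is available as two readings in joules taken at the window boundaries, and it ignores counter wraparound.

```python
def window_power_watts(energy_start_j, energy_end_j, window_s):
    """Average socket power over one window from two energy readings (joules)."""
    return (energy_end_j - energy_start_j) / window_s


def attribute_power(cycles_by_domain, socket_power_w):
    """Split socket power proportionally to each domain's non-halted cycles."""
    # Step 1: total non-halted cycles in the window.
    total = sum(cycles_by_domain.values())
    if total == 0:
        # Assumption: with no activity recorded, attribute nothing.
        return {dom: 0.0 for dom in cycles_by_domain}
    # Steps 2 and 3: per-domain share of cycles times the socket power.
    return {
        dom: socket_power_w * cycles / total
        for dom, cycles in cycles_by_domain.items()
    }


# Invented numbers: a 1-second window in which the socket used 50 J.
power = window_power_watts(1000.0, 1050.0, 1.0)
shares = attribute_power({0: 100, 1: 300, 2: 600}, power)
```

With these inputs the domain holding 60% of the non-halted cycles is charged 60% of the socket power, and the per-domain values sum back to the socket total.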
  • 23. Experimental evaluation • Back to the XeMPower requirements: 1. provide precise attribution of hardware events to virtual tenants 2. agnostic to the mapping between virtual and physical resources, hosted applications and scheduling policies 3. add negligible overhead • Goals of the experimental evaluation: – show that the XeMPower monitoring components incur very low overhead under different configurations and workload conditions
  • 24. Experimental evaluation • Overhead metric: – the difference in the system’s power consumption while using XeMPower versus an off-the-shelf Xen 4.6 installation • Experimental setup: – 2.8 GHz quad-core Intel Xeon E5-1410 processor (4 hardware threads) – a Watts up? PRO meter to monitor the entire machine’s power consumption – each guest repeatedly runs a multi-threaded compute-bound microbenchmark on three VCPUs and uses a stripped-down Linux 3.14 as the guest OS
  • 25. Experimental evaluation • Three system configurations: 1. the baseline configuration uses off-the-shelf Xen 4.4 2. the patched configuration introduces the kernel-level instrumentation without the XeMPower daemon 3. the monitoring configuration is the patched configuration with the XeMPower daemon running and reporting statistics • Four running scenarios: – an idle scenario in which the system only runs Dom0 – three running-n scenarios, where n ∈ {1, 2, 3} indicates the number of guest domains in addition to Dom0 • The idea is to stress the system with an increasing number of CPU-intensive tenant applications • This increases the amount of data traced and collected by XeMPower
  • 26. Experimental Results • Mean power consumption (μ), in Watts, for scenarios idle and running-{1,2,3} and configurations baseline (b), patched (p), and monitoring (m) • Mean power values are reported with their 95% confidence interval • At a glance, the measurements are very close to each other [Tables: pinned-VCPU and unpinned-VCPU results, not captured in the transcript]
  • 27. Experimental Results • We estimate an upper bound ϵ for the maximum overhead using a hypothesis test [formula not captured in the transcript] • A rejection of the null hypothesis means that there is strong statistical evidence that the power consumption overhead is lower than ϵ • We compute ϵ for the considered test cases and scenarios from the average values of power consumption (μ), at significance level α = 5% • We want to compare the overhead with the one measured for XenMon, a performance monitoring tool for Xen – unlike XeMPower, XenMon does not collect PMC reads – it is still a reference design in the context of runtime monitoring for the Xen ecosystem
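The slide's own test formula did not survive in the transcript, so the sketch below is not the authors' procedure. It shows one standard way to obtain such a bound under stated assumptions: a one-sided test of the null hypothesis "overhead ≥ ϵ" on the difference of two sample means, using a normal approximation. The smallest ϵ for which that null hypothesis is rejected at level α is the upper limit of a one-sided confidence interval on the difference. The function name and the sample values are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def overhead_upper_bound(baseline_w, monitored_w, alpha=0.05):
    """Smallest eps such that H0: mu_monitored - mu_baseline >= eps is rejected.

    Assumes independent samples and a normal approximation (z-test);
    a t-based test would be the usual choice for small samples.
    """
    diff = mean(monitored_w) - mean(baseline_w)
    std_err = sqrt(
        variance(monitored_w) / len(monitored_w)
        + variance(baseline_w) / len(baseline_w)
    )
    z = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    return diff + z * std_err


# Invented power samples in Watts, not measurements from the slides.
baseline = [70.0, 70.2, 69.8, 70.1, 69.9]
monitored = [70.5, 70.7, 70.3, 70.6, 70.4]
eps = overhead_upper_bound(baseline, monitored)
```

Here the observed mean difference is 0.5 W, and the bound comes out somewhat above it because the bound also accounts for sampling uncertainty.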
  • 28. Experimental Results • Estimated upper bound ϵ for the power consumption overhead, in Watts • Parenthetical values are the overheads w.r.t. mean power consumption • XeMPower introduces an overhead not greater than 1.18 W (1.58%), observed for the [unpinned-VCPU, running-3, patched] case • In all the other cases, the overhead is less than 1 W (and less than 1%) • This result is satisfactory compared to an overhead of 1-2% observed for XenMon, the reference implementation for XeMPower
  • 29. Conclusion and Future Work • We presented XeMPower, a lightweight monitoring solution for Xen that precisely attributes hardware events to virtual guests • We described its use in online attribution of CPU power consumption to individual domains • Our results show that XeMPower can provide continuous statistics with very low overhead compared to an off-the-shelf Xen installation • Future work – explore the complementary use of offline characterization of both hardware and guest workloads, towards power consumption predictions