Evaluation of three platforms (VM, container, unikernel) using a subset of metrics important to three sets of enterprise stakeholders: developers/DevOps, the CIO, and customers.
2. Agenda
Introduction - bios
Unikernel Background
Developer/DevOps care about
Metric Set 1: Application lifecycle overhead
CIO cares about
Metric Set 2: Application datacenter footprint
Customer cares about
Metric Set 3: Application performance
3. Unikernel Background
Working Definition: a single-process environment for running code
Unikernel | Unmodified Legacy App Support                                          | Multi-threaded App Support
OSv       | Partial yes (glibc subset; no fork/exec)                               | Yes* (pthread subset)
MirageOS  | No* (until non-OCaml language bindings are available; no fork/execve)  | Green threads (event loop) only
Rumprun   | Yes* (no fork/execve/sigaction/mmap)                                   | Yes (pthread)
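The "single process, no fork/exec" constraint can be made concrete with a small, hypothetical Python sketch (not part of the original deck): a threaded worker model maps onto the pthread subsets reported above for OSv and Rumprun, while a fork-per-worker pattern would have to be restructured before it could run on any of these unikernels.

    import threading

    def handle(request_id):
        # Placeholder for per-request work.
        print(f"handled request {request_id}")

    # Unikernel-friendly: one process, several worker threads
    # (matches the pthread subset reported for OSv and Rumprun).
    workers = [threading.Thread(target=handle, args=(i,)) for i in range(4)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()

    # Not unikernel-friendly: a fork-per-worker pattern. fork()/exec()
    # are unavailable per the table above, so code like the following
    # would need to be rewritten around the threaded model instead:
    #
    #   pid = os.fork()
    #   if pid == 0:
    #       handle(0)
    #       os._exit(0)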
4. Developer/DevOps care about
Enterprise Application Lifecycle management
Developer: Time to build app from source code, preferably unmodified
DevOps: Time to configure runtime parameters (e.g., TCP port, log file location)
DevOps: Time to deploy application
DevOps: Qualitative ease of managing and debugging a long-running (weeks/months) application
6-9. Metric Set 1: Application Lifecycle
Platform  | Convert Code to Image (Hours) | Start Time (Seconds) | Stop Time (Seconds) | Debuggability
VM        | 8 (1, 2, 3)                   | 66.557               | 7.478               |
Container | 0                             | 1.113                | 0.685               |
Unikernel | 40 (1, 2)                     | 0.483                | 0.019               |
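As a rough illustration of how the start-time column could be measured, here is a hypothetical Python sketch (not from the deck): it launches a guest with a placeholder command and times how long it takes until the application's TCP port accepts connections. The launch command, host, and port are assumptions, not the actual experimental setup.

    import socket
    import subprocess
    import time

    # Placeholder launch command: substitute the actual VM, container,
    # or unikernel invocation measured in the experiments.
    LAUNCH_CMD = ["docker", "run", "-d", "-p", "8080:80", "nginx"]
    HOST, PORT = "127.0.0.1", 8080

    def wait_for_port(host, port, timeout=120.0):
        """Block until (host, port) accepts a TCP connection or timeout expires."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            try:
                with socket.create_connection((host, port), timeout=1.0):
                    return
            except OSError:
                time.sleep(0.01)
        raise TimeoutError(f"{host}:{port} never became reachable")

    t0 = time.monotonic()
    subprocess.run(LAUNCH_CMD, check=True)  # kick off the guest
    wait_for_port(HOST, PORT)               # wait until the app answers
    print(f"start time: {time.monotonic() - t0:.3f} s")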
10. CIO cares about
Consolidation of applications on finite hardware resources
Multi-tenant security isolation amongst applications on a compute node
Multi-tenant Resource Management
Manageability, Accounting, Auditability
Infrastructure Power consumption
11-14. Metric Set 2: Data center footprint
Platform  | Image Size (MB) | Runtime Memory Overhead (MB)                               | Security (Tenant Isolation) | Resource Knobs
VM        | 143             | 619 (VmSize from /proc/{vboxpid}/status minus configured)   | Strong                      | Strong (Reservation, Limits)
Container | 182.8           | 274.4 (containerd-shim VmSize from /proc/{pid}/status)      | Weak                        | Moderate (Limits)
Unikernel | 7.8             | 1222 (VmSize from /proc/{qemupid}/status minus configured)  | Strong                      | Moderate (knobs available, not used yet)
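The runtime-memory-overhead column is derived from the VmSize field of the host process backing each guest (VirtualBox, QEMU, or containerd-shim), minus the memory configured for the guest. A minimal Python sketch of that measurement follows; the process ID and configured size are placeholders.

    def vmsize_mb(pid):
        """Return the VmSize entry (in MB) from /proc/<pid>/status."""
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("VmSize:"):
                    return int(line.split()[1]) / 1024.0  # value is reported in kB
        raise ValueError(f"VmSize not found for pid {pid}")

    # Placeholders: the pid of the host process backing the guest
    # (VirtualBox or QEMU for the VM/unikernel rows, containerd-shim for
    # the container row) and the guest's configured memory size. For the
    # container row the table uses the shim's VmSize directly, i.e.
    # configured_mb = 0.
    backing_pid = 12345
    configured_mb = 1024

    print(f"runtime memory overhead: {vmsize_mb(backing_pid) - configured_mb:.1f} MB")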
18. Metric Set 3: Throughput Explanation
nginx-osv > nginx-linux > nginx-docker > nginx-vm
Baseline: 1 thread/client
Nginx-linux (bare metal): ~600 requests/sec
Nginx-vm slightly lower: expected, because each client request must traverse two I/O stacks - the hypervisor's and the guest OS's
Nginx-docker close to bare metal: expected, since the only thing separating the container from the workload generator is a network bridge
Nginx-osv slightly better than bare metal: client requests still go through the unikernel's I/O stack, but OSv's I/O stack was designed to be lightweight/lower overhead - influenced by a design based on Van Jacobson's net channels
10 threads
Results improve by slightly more than 10x (mostly because of reductions in average latency - next graph), but the ordering remains the same
19. Metric Set 3: Response Time Explanation
nginx-osv > nginx-linux > nginx-docker > nginx-vm
Overall response times are between 1 ms and 2 ms
Single-thread case ~1.5 ms; 10-thread case < 1.5 ms
The reduction in response time moving from 1 to 10 threads is mostly a result of caching and multiplexing: with multiple threads, more work gets done per unit time. While thread A is processing the results of a response, thread B, which was waiting, can quickly be given a cached copy of the static file being served.
20. Summary
Developer/DevOps care about
Metric Set 1: Application lifecycle overhead
CIO cares about
Metric Set 2: Application datacenter footprint
Customer cares about
Metric Set 3: Application performance
Owner: Rean
Note: Refer to image size and overhead for cost estimates.
Worker connections = #clients simultaneously served
Worker processes * worker connections = anticipated upper limit on reqs/sec
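A tiny worked example of that sizing rule, with placeholder values rather than the settings used in these experiments:

    # Hypothetical nginx directive values (placeholders only).
    worker_processes = 4
    worker_connections = 1024

    # Anticipated upper limit per the rule above.
    print(worker_processes * worker_connections)  # 4096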
Rain workload toolkit version: git hash b0b29438
Workload configuration files:
https://github.com/rean/rain-workload-toolkit/blob/master/config/rain.config.nginx.json (determines workload duration, warm up and warm down)
https://github.com/rean/rain-workload-toolkit/blob/master/config/profiles.config.nginx.json (controls the IP address and port, number of threads, workload generator to use)
Experiment description
* simple HTTP GET workload, run for 5 minutes (10 sec warmup before, 10 sec rampdown afterwards) x 5 repeats
* Load generator and nginx instance run on the same machine so there’s no network jitter. We’re mainly capturing I/O stack overheads/differences
* Results reported = average over 5 repeats, error bars are 95% confidence intervals
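For reference, the reported averages and 95% confidence intervals could be computed along the following lines; the sample values below are placeholders, not measured data.

    import statistics

    # Placeholder throughput samples from 5 repeats (not real measurements).
    samples = [601.2, 598.7, 603.5, 597.9, 600.4]

    n = len(samples)
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / n ** 0.5   # standard error of the mean

    # 95% confidence interval half-width, using the t critical value
    # for n - 1 = 4 degrees of freedom (t ≈ 2.776).
    half_width = 2.776 * sem
    print(f"{mean:.1f} ± {half_width:.1f} requests/sec")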
Summarize:
Three perspectives on what might be important (CIO, developer, customer), and the measurements for each.