2. Container
A container is a mechanism to easily build, deploy and run applications everywhere.
[Figure: a container image bundles scripts, libraries, binaries and config; it is built, shared and deployed via push/pull, and run by a container runtime (e.g. docker, podman) on any platform (e.g. Cloud, Laptop, IoT)]
1. Background 2. Proposal 3. Design & Implementation 4. Evaluation 5. Summary
3. Container Image Pull is Slow
Image pull accounts for most of a container’s boot (cf. docker run) time.
※ Image pull: download and decompress the image
[Figure: boot latency of major containers, time axis from 0 s to 20 s]
4. Research Goal 1: Reducing Boot Latency
Our research goal is to reduce containers’ boot latency.
[Figure: boot latency of major containers, time axis from 0 s to 20 s]
5. Compatibility with the Ecosystem is Also Important
Many container images have already been built according to the image specification.
Currently over 4.5 million images are stored in DockerHub.
Many tools supporting containers are also based on the specification.
[Figure: the container ecosystem consists of many tools & images]
6. Existing Methods Have Compatibility Problems
Minimize Image
• Building apps on a small base image (e.g. Alpine Linux)
× incompatible in terms of packages & libraries
• Removing files manually/automatically
× degrades functionality
Lazy Image Pull
• NFS/P2P-based lazy image pull / parallel image pull
× only applicable in clusters
• Lazy image pull with an original image format
× hard to coexist with existing software & ecosystem
• Lazy layer pull with library call hooks
× only apps using libc (shared library) are supported
7. Research Goal
1. Reducing container’s boot time
2. Achieving high compatibility
8. Base Idea for Our Proposal
1. Reducing container’s boot time
↑ Lazy Image Fetching
2. Achieving high applicability
↑ Conforming to container’s de facto standard
(layer & overlay filesystem)
9. Background 1: Lazy Image Fetching
Only about 6.4% of the data is needed for container execution [1].
[1] Slacker: Fast Distribution with Lazy Docker Containers. FAST’16
Normal pull: all data is pulled in advance.
Lazy fetching: only the data necessary for boot should be fetched in advance; unused data can be fetched lazily at run time.
[Figure: container process and container image under normal pull and under lazy fetching]
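The lazy fetching idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the mechanism of [1] or of this work: `LazyImage`, `fetch`, `boot_files` and the `REMOTE` dictionary are hypothetical names, and a plain dictionary stands in for the remote registry.

```python
from typing import Callable, Dict, Iterable


class LazyImage:
    """Fetches boot-critical files eagerly and everything else on first access."""

    def __init__(self, fetch: Callable[[str], bytes], boot_files: Iterable[str]):
        self._fetch = fetch                  # hypothetical remote fetch function
        self._cache: Dict[str, bytes] = {}
        self.fetch_count = 0
        for path in boot_files:              # prefetch only what boot needs
            self._load(path)

    def _load(self, path: str) -> bytes:
        self.fetch_count += 1
        data = self._fetch(path)
        self._cache[path] = data
        return data

    def read(self, path: str) -> bytes:
        # Serve from the local cache, or fetch lazily at run time.
        if path in self._cache:
            return self._cache[path]
        return self._load(path)


# Toy "registry": a dict standing in for remote storage (assumption).
REMOTE = {"/bin/bash": b"bash", "/bin/ls": b"ls", "/usr/share/doc/README": b"docs"}
image = LazyImage(REMOTE.__getitem__, boot_files=["/bin/bash"])
```

In this sketch, files that are never read are never transferred, which is where the saving comes from when only a small fraction of the image is used.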
10. Background 2: Layers & Overlay Filesystem
• A container image consists of multiple stacked layers
• Each layer is a directory and is distributed as a tar.gz file
• The overlay filesystem manages the layers and redirects accesses to each layer
[Figure: file accesses pass through the overlay filesystem to the layers of the container image]
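As a toy model of how an overlay filesystem redirects an access to the right layer, the sketch below resolves a path against a stack of layers and returns the copy from the topmost layer that has it. It is a simplification under stated assumptions: each layer is a dictionary rather than a directory, deletions are ignored, and `Layer` and `overlay_lookup` are illustrative names.

```python
from typing import Dict, List, Optional

Layer = Dict[str, bytes]  # path -> file contents (toy model of one layer directory)


def overlay_lookup(layers: List[Layer], path: str) -> Optional[bytes]:
    """Return the file from the topmost layer that contains it.

    `layers` is ordered bottom to top, in the order layers are stacked.
    """
    for layer in reversed(layers):
        if path in layer:
            return layer[path]
    return None


base = {"/etc/os-release": b"ubuntu", "/bin/sh": b"sh-v1"}
update = {"/bin/sh": b"sh-v2"}   # an upper layer overriding one file
merged_view = [base, update]
```

Here a lookup of `/bin/sh` is answered by the upper layer, while `/etc/os-release` falls through to the base layer.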
11. Concepts: Overlay Filesystem for Lazy Layer Pull
1. Create a boot layer on top of the original image
• consisting of the data necessary for boot
2. Pull other layers lazily
• Only the boot layer is pulled on boot
• Our overlay filesystem manages the lazy pull
[Figure: file accesses go through the overlay filesystem, which pulls the layers of the container image lazily]
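The concept can be illustrated with a small Python model: a boot layer that is available locally sits on top of lower layers that are pulled only when a lookup misses the boot layer. This is a sketch under assumptions, not the actual implementation; `LazyOverlay`, `pullers` and `pull_count` are hypothetical names, and each "pull" is simulated by a function returning a dictionary.

```python
from typing import Callable, Dict, List, Optional

Layer = Dict[str, bytes]


class LazyOverlay:
    """Toy overlay: a local boot layer on top of lower layers pulled on demand."""

    def __init__(self, boot_layer: Layer, pullers: List[Callable[[], Layer]]):
        self.boot_layer = boot_layer
        self._pullers = pullers                      # topmost lower layer first
        self._pulled: List[Optional[Layer]] = [None] * len(pullers)
        self.pull_count = 0

    def _lower(self, index: int) -> Layer:
        layer = self._pulled[index]
        if layer is None:                            # pull each layer at most once
            layer = self._pullers[index]()
            self._pulled[index] = layer
            self.pull_count += 1
        return layer

    def read(self, path: str) -> Optional[bytes]:
        if path in self.boot_layer:                  # boot files need no pull
            return self.boot_layer[path]
        for index in range(len(self._pullers)):
            layer = self._lower(index)
            if path in layer:
                return layer[path]
        return None


lower_layers = [lambda: {"/bin/ls": b"ls"}, lambda: {"/etc/passwd": b"root"}]
fs = LazyOverlay({"/bin/bash": b"bash"}, lower_layers)
```

Note that a lookup for a path that exists nowhere walks through every lower layer in this model, so even a failed lookup can trigger pulls.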
12. There are Some Design Choices & Problems
How to know which files are needed for the boot layer
Static analysis vs. dynamic analysis
Unexpected layer pull
14. Design Choice 1: How to know which file is needed?
Static Image Analysis
analyze the executable
consider domain knowledge
× less general
Dynamic Analysis
run the target container
trace file accesses dynamically
↑ our approach
[Figure: overlay filesystem pulling layers lazily]
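A rough sketch of the dynamic analysis step: given a trace of file accesses recorded while the target container runs, collect the paths that were actually used and the paths that were probed but missing. The record format `Access`, the function `split_trace` and the sample `TRACE` are assumptions made for illustration; the slides do not specify how the trace is captured or stored.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Access(NamedTuple):
    """One traced file access (hypothetical record format)."""
    op: str      # e.g. "exec", "open", "stat"
    path: str
    ok: bool     # True if the path existed and the access succeeded


def split_trace(trace: Iterable[Access]) -> Tuple[List[str], List[str]]:
    """Return (paths that were used, paths that were probed but missing)."""
    used: Dict[str, None] = {}
    missing: Dict[str, None] = {}
    for access in trace:
        # Dicts keep insertion order, so each list is in first-access order.
        (used if access.ok else missing).setdefault(access.path, None)
    return list(used), list(missing)


TRACE = [
    Access("exec", "/sbin/bash", False),
    Access("exec", "/bin/bash", True),
    Access("open", "/etc/passwd", True),
    Access("open", "/bin/bash", True),
]
used, missing = split_trace(TRACE)
```

The list of used paths is a candidate set for the boot layer; the missing paths matter as well, because failed lookups are also file accesses.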
16. Problem: Unexpected Layer Pull
Some file accesses need additional information and pull all layers.
1. Access to a non-existing path
• e.g. exec of a command, new file creation, etc…
2. Reading directory entries
• requires the child files
Copying the accessed files is not enough!
[Figure: overlay filesystem pulling layers lazily]
17. Case Study: Access to non-existing path
Situation: exec bash with PATH=/sbin:/bin:…
①/sbin/bash ②/bin/bash
Need to pull all layers to know the absence of /sbin/bash
※ Creating new files causes the same problem
[Figure: overlay filesystem]
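The probe order in this case study can be reproduced with a short Python sketch. It assumes a shortened, hypothetical two-entry PATH of `/sbin:/bin` (the original PATH is elided), and `exec_candidates` and `resolve` are illustrative helpers rather than part of any real runtime.

```python
from typing import Callable, List, Optional, Tuple


def exec_candidates(command: str, path_env: str) -> List[str]:
    """Paths probed, in order, when resolving `command` through PATH.

    Empty PATH entries are skipped in this sketch.
    """
    return [f"{d.rstrip('/')}/{command}" for d in path_env.split(":") if d]


def resolve(
    command: str, path_env: str, exists: Callable[[str], bool]
) -> Tuple[Optional[str], List[str]]:
    """Return (first existing candidate, candidates that missed before it)."""
    misses: List[str] = []
    for candidate in exec_candidates(command, path_env):
        if exists(candidate):
            return candidate, misses
        misses.append(candidate)
    return None, misses


image_files = {"/bin/bash", "/bin/ls"}   # toy image contents (assumption)
found, misses = resolve("bash", "/sbin:/bin", image_files.__contains__)
```

The miss on `/sbin/bash` happens before the hit on `/bin/bash`, and answering that miss requires knowing that no layer contains the file.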
18. Case Study: Reading Directory Entry
Just copying the directory doesn’t work.
Need to pull all layers to know each file
[Figure: overlay filesystem with /dir, /dir/fileA and /dir/fileB]
19. Case Study: Reading Directory Entry
Copying the directory and its children doesn’t work, either.
Need to pull all layers to know the absence of other files.
[Figure: overlay filesystem with /dir, /dir/fileA and /dir/fileB]
20. Solution 1: Whiteout files and opaque directories
Use existing functionality defined in the image specification, preserving compatibility
1. Access to a non-existing path
• Place whiteout files to signal the absence of the file
2. Reading directory entries
• Copy all children into the boot layer
• Set the directory as opaque to stop reading lower layers
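The effect of both markers can be sketched with a toy lookup that counts how many layers it has to consult. The naming follows the OCI image layer convention (a whiteout is an empty file named `.wh.<name>`, an opaque directory contains `.wh..wh..opq`); everything else, including `lookup` and the dictionary-based layers, is an assumption for illustration, and whiteouts placed on ancestor directories are not handled.

```python
import posixpath
from typing import Dict, Iterator, List, Optional, Tuple

Layer = Dict[str, bytes]
WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def _ancestors(path: str) -> Iterator[str]:
    """Yield the parent directories of `path`, nearest first."""
    directory = posixpath.dirname(path)
    while True:
        yield directory
        if directory in ("/", ""):
            return
        directory = posixpath.dirname(directory)


def lookup(layers_top_down: List[Layer], path: str) -> Tuple[Optional[bytes], int]:
    """Return (contents or None, number of layers consulted)."""
    parent, name = posixpath.split(path)
    whiteout = posixpath.join(parent, WHITEOUT_PREFIX + name)
    for depth, layer in enumerate(layers_top_down, start=1):
        if path in layer:
            return layer[path], depth
        if whiteout in layer:
            return None, depth       # whiteout: the file is known to be absent
        if any(posixpath.join(d, OPAQUE_MARKER) in layer for d in _ancestors(path)):
            return None, depth       # opaque directory: stop searching lower layers
    return None, len(layers_top_down)


boot = {
    "/bin/bash": b"bash",
    "/sbin/.wh.bash": b"",           # whiteout for /sbin/bash
    "/dir/fileA": b"A",
    "/dir/fileB": b"B",
    "/dir/.wh..wh..opq": b"",        # /dir is opaque
}
lower = {"/bin/ls": b"ls", "/dir/fileA": b"A", "/dir/fileB": b"B"}
layers = [boot, lower]
```

In this model, lookups for `/sbin/bash` and for any unknown file under `/dir` are answered by the boot layer alone, so the lower layer does not need to be pulled for them.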
21. Case Study: Access to non-existing path
Situation: exec bash with PATH=/sbin:/bin:…
①/sbin/bash ②/bin/bash
Place a whiteout file for /sbin/bash (an empty file at /sbin/.wh.bash) to signal the absence of the file.
The pull of lower layers can be postponed.
※ Whiteout files are originally for removing files
[Figure: overlay filesystem pulling layers lazily]
22. Case Study: Reading Directory Entry
Specify the directory as opaque to stop searching lower layers.
The pull of lower layers can be postponed.
※ Opaque directories are originally for removing/creating directories
[Figure: overlay filesystem pulling layers lazily; the search stops at the opaque directory]
23. Problem (cont’d): Unpredictable File Access
Whiteout files & opaque directories are not enough for the following cases.
• Case 1: creating files at random paths
• e.g. python, golang, texlive
• Case 2: interactive filesystem movement
• e.g. cd, ls, find commands @ubuntu
• e.g. creating a new notebook @jupyter
24. Solution 2: Including All Metadata into Boot Layer
Situation: Executing bash with PATH=/sbin:/bin:…
• Copy the contents and metadata of accessed files (bash) into the boot layer.
• Copy only the metadata of non-accessed files into a metadata tree under /.wh..wh.llpfs. These files cannot be seen when mounted.
• From the metadata tree we can know the absence of /sbin/bash.
1. Non-existing paths can be decided.
2. The results of readdir operations can be generated.
3. Interactive applications can work thanks to this extension (e.g. bash, jupyter)
[Figure: file tree of the original image (/sbin, /bin, /etc, /var) next to the boot layer and its metadata tree]
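A minimal model of this idea: the boot layer keeps contents only for accessed files but metadata for every path, so existence checks and directory listings can be answered without pulling lower layers. `BootLayer`, `Meta` and the sample tree are hypothetical; how the metadata tree is actually encoded under `/.wh..wh.llpfs` is not shown in the slides and is not modeled here.

```python
import posixpath
from typing import Dict, List, NamedTuple


class Meta(NamedTuple):
    is_dir: bool
    size: int


class BootLayer:
    """Contents for accessed files, metadata for every path in the image."""

    def __init__(self, contents: Dict[str, bytes], metadata: Dict[str, Meta]):
        self.contents = contents
        self.metadata = metadata

    def exists(self, path: str) -> bool:
        # Absence is decided locally: the metadata tree lists every path.
        return path in self.metadata

    def readdir(self, directory: str) -> List[str]:
        # Directory listings are generated from metadata alone.
        return sorted(
            posixpath.basename(p)
            for p in self.metadata
            if p != directory and posixpath.dirname(p) == directory
        )

    def needs_pull(self, path: str) -> bool:
        # Only reading the contents of a non-accessed file needs a lower layer.
        meta = self.metadata.get(path)
        return meta is not None and not meta.is_dir and path not in self.contents


layer = BootLayer(
    contents={"/bin/bash": b"bash"},
    metadata={
        "/": Meta(True, 0), "/bin": Meta(True, 0), "/sbin": Meta(True, 0),
        "/bin/bash": Meta(False, 4), "/bin/ls": Meta(False, 2),
    },
)
```

With this structure, the pull of a lower layer is deferred until the contents of a non-accessed file such as `/bin/ls` are actually read.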
25. Methods for Avoiding Unexpected Layer Pull
We implemented two layer generation methods to solve this problem.
1. Using whiteout files & opaque directories
• exploits existing functionality defined in the image specification
2. Including all metadata
• creates a dedicated metadata file tree under a specific path
34. Evaluation
Application Boot Time Evaluation
Performance Evaluation
Environment
                          Machine 1 (Application Boot Time)   Machine 2 (Performance)
CPU                       Intel Core i7-9700                  Intel Xeon E5-2603 v4
CPU Clock                 3.00GHz                             1.70GHz
#Core (Physical/Logical)  8/8                                 6/6
RAM                       32GB                                16GB
Disk                      Samsung SSD 970 EVO Plus 500GB      Samsung SSD 950 EVO Plus 256GB
OS                        Ubuntu 18.04 (Linux Kernel 5.3.0)   Ubuntu 20.04 (Linux Kernel 5.4.0)
35. Application Boot Time Evaluation Settings
Examine the boot time of various popular applications, 5 times per setting
Applications
• ubuntu (bash)
• nginx
• redis
• buildpack
• ※ Spring Boot application built with buildpack
• golang
• python
• texlive/texlive
• jupyter/datascience-notebook
Setup: Machine 1 (8 cores / 32GB RAM) hosts two VMs (4 cores / 8GB RAM and 2 cores / 8GB RAM) for the app container and the image registry, connected by a virtual network limited to 5Mbps, 10Mbps, 50Mbps, 100Mbps, 500Mbps, 1000Mbps, or no bandwidth limit.
36. Application Boot Case Study: jupyter notebook
• Definition of boot time:
the time up to the first valid HTTP response
• Application (image) source
• most popular official container
• jupyter/datascience-notebook
• obtained from DockerHub
• Boot time is improved at every bandwidth
37. Layer Generation Methods Comparison
Boot time is slightly faster without metadata files, when that method is applicable
38. Application Boot Time Evaluation Summary
The boot times are improved in all settings
Maximum improvement ratio: 39x
Average improvement:
• 1Gbps: 4.9x
• 100Mbps: 10x
• 10Mbps: 15x
39. Performance Evaluation Settings
Sysbench for a MySQL container on overlay filesystems, to measure filesystem overhead.
• Sysbench oltp_read_write workload
• 10s × 5 each
• on overlay filesystems
• overlayfs (kernel module)
• fuse-overlayfs (FUSE)
• our proposal
[Figure: on Machine 2 (6 cores, 16GB RAM), sysbench issues requests to a MySQL container running on the overlay filesystem]
41. Future Work
Support image build
Support multiple/gradual boot layers
Support various filesystems/runtimes
42. Summary
Our research goal is reducing container boot latency
with high compatibility
We proposed a new lazy layer pull approach using an overlay filesystem
Our proposal improved average boot time by 10x at 100Mbps
with little performance degradation