Sick of getting paged at 2am and wondering "where did all my disk space go?" This has actually happened to me, and you can learn from my mistakes! New Docker users often start with a stock image in order to get up and running quickly, but that isn't always the right answer. Creating efficient images is important but often overlooked. Beyond saving resources, using minimal images also delivers important security benefits: include only what you need, and not a whole runtime that might have security vulnerabilities. In this session, I'll talk about how to create effective images and lessons I've learned from running containers in production at a number of startups. I'll also cover topics like "how do layers work?" and some things you should think about when creating your images, such as choosing or creating the right base image, the importance of caching, using RUN statements conservatively, and cleaning up as you go. I'll also address best practices, both at a high level (multi-stage builds) and language-specific, for example tips and tricks for creating containers for Node.js vs. Go. To illustrate these points, we'll cover:
How layers work
Choosing a base image vs. creating your own
The basics of building minimal images and the importance of caching
High-level best practices for Linux containers (in general, plus some language-specific examples)
High-level best practices for Windows container images
New and improved: multi-stage builds
Good vs. not so good Dockerfile examples
Docker Image Scanning, and other friends.
What's up next? Looking to the future for more optimization.
3. What are container layers?
Thin read-write layer
Read-only container layers
4. Why do I care how many layers I have?
More layers mean a larger image. The larger the image, the longer it takes to build, and to push to and pull from a registry.
Smaller images mean faster builds and deploys. They also mean a smaller attack surface.
5. OK, so how can I reduce my layers?
Sharing is caring
• Use shared base images wherever possible
• Limit the data written to the container layer
• Chain RUN statements
• Prevent cache misses at build for as long as possible
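As a rough sketch of what chaining RUN statements can look like: the ubuntu base and the curl package below are placeholders for illustration, not taken from a real project. Installing and cleaning up in a single RUN produces one layer, and the apt package lists never land in the image.

```dockerfile
# Placeholder base image and package, for illustration only.
FROM ubuntu

# One RUN instruction produces one layer. Cleaning up inside the same
# instruction keeps the apt package lists out of that layer entirely.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```

Splitting this into three RUN instructions would create three layers, and the cleanup in the last one would not shrink the two before it.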
10. First step: choose the right base
Ubuntu latest:
ubuntu latest 113a43faa138 42 hours ago 81.2MB
Alpine latest:
alpine latest 3fd9065eaf02 4 months ago 4.15MB
11. Slightly better: choose a different distro
REPOSITORY  TAG     IMAGE ID      CREATED       SIZE
python      latest  a5b7afcfdcc8  21 hours ago  912MB
ubuntu      latest  113a43faa138  43 hours ago  81.2MB
debian      latest  8626492fecd3  5 weeks ago   101MB
alpine      latest  3fd9065eaf02  4 months ago  4.15MB
golang      latest  3f30f1fc3c43  38 hours ago  794MB
ruby        latest  857bc7ff918f  17 hours ago  869MB
12. When do I want a full base OS?
Security!
Compliance!
Ease of development!
More features (package managers!)
13. I have a Dockerfile, now what?
Once you have your Dockerfile, you can build it! That’s how we’ll
a) run our containerized application, and b) see how big it is once
we make our changes.
$ docker build -t hi-dockercon .
Build command: docker build
--tag flag: -t
Name assigned with tag: hi-dockercon
PATH to build: .
14. F is for flags
There are a few flags that affect your final image size when you’re
building your image.
--cache-from (another image to cache from)
--compress (compress build context* with gzip)
--no-cache (ignore the cache, more on that in a sec)
--squash (squash new layers into a single layer)**
* don’t know what build context is? I didn’t either. We’ll talk about it.
**this is still experimental
15. What’s build context?
When you call docker build, the PATH you pass (here, your current directory) is the build context, and it gets sent to the Docker daemon. You can see how much context you’re sending by looking for a message like this:
Sending build context to Docker daemon 187.8MB
The more context you send, the slower your build, and anything you ADD or COPY from it ends up in your image. Avoid including unnecessary files and directories.
17. But first, some FAQs
Q: Is that really the official Wu-Tang font?
A: Yes. Of course. Insist on the highest standards.
Q: What’s your favorite Wu-Tang song?
A: “Protect Ya Neck”
Q: Have you been sued for copyright infringement yet for your
last slide?
A: No, but don’t tell on me.
18. Back to cache!
Starting from the parent image, Docker looks at each following instruction to see whether it matches a cached layer.
Only ADD and COPY look at checksums of the files they copy to decide on a match.
For every other instruction, only the string of the command is compared, not the contents of any files it touches.
Once the cache is broken, every subsequent layer is built again.
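To make those cache rules concrete, here is a small sketch. The file names (settings.conf, src/) and the package are hypothetical. The idea is to order instructions from least to most frequently changing, so that a cache miss happens as late as possible.

```dockerfile
FROM alpine

# Matched on the instruction string only: as long as this line is unchanged,
# the cached layer is reused, even if newer packages exist upstream.
# (The --no-cache flag here is apk's own index cache, not the Docker build cache.)
RUN apk add --no-cache curl

# COPY is matched on file checksums: editing settings.conf (hypothetical file)
# breaks the cache here, and every instruction below is rebuilt.
COPY settings.conf /etc/app/settings.conf

# Source code changes most often, so it goes last.
COPY src/ /opt/app/
```

With this ordering, a source-only change rebuilds just the final layer.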
26. Multi-stage builds vs multiple Dockerfiles
Keeping image sizes down is challenging (or I wouldn’t be here!). Two common patterns:
Multi-stage builds (new-ish!): separate stages in one (1) Dockerfile. Copy artifacts from one stage to the other. This results in a smaller final image.
Separate Dockerfiles: build, then copy artifacts over. This results in a smaller image, but it’s a little more complicated. End result is often something like Dockerfile.build and Dockerfile.final
27. Multi-stage builds
# Stage 1: build the application
FROM ubuntu AS build-env
RUN apt-get update && apt-get install -y make
ADD . /src
RUN cd /src && make
And for the second stage, copy from the first:
# Stage 2: ship only the built artifact
FROM busybox
COPY --from=build-env /src/build/app /usr/local/bin/app
EXPOSE 80
ENTRYPOINT ["/usr/local/bin/app"]
28. I’m still skeptical!
Still not sure about multi-stage builds? I really enjoy this blog post from Alex Ellis, which also provides the examples for multi-stage builds in the (excellent) Docker documentation.
30. Port over existing VM workloads
Convert an existing Windows image:
ConvertTo-Dockerfile -ImagePath c:\docker\myimage.wim
Convert from VHD:
ConvertTo-Dockerfile -ImagePath c:\vms\test.vhd -Artifact IIS -ArtifactParam windows-container -OutputPath c:\windows-container
cd c:\windows-container
docker build -t windows-container .
docker run -d -p 80:80 windows-container
31. Careful what you build
“Hey! Listen!”
Watch what you build:
c: c: / /windows c:/windows
Building any of those PATHs will make your image very large!
32. Avoid installing with MSI
MSI installations are not space efficient. This is not the same as Linux distros, where you can add, use, and remove the installation files!
$ Windows/Installer/<package>.msi
Windows saves these files for uninstalls :(
33. Here's what's really cool though…
Build and run everything the same, regardless of container OS, host OS, or tools. Just docker build and docker run.
35. A few things to think about
Use the right tool: not every language needs to be built the same way.
Where possible, use two images: one to build an artifact, and a minimal base image to run it.
Official language images can be big: it's more space efficient to use a more minimal image, but there are tradeoffs.
36. Golang
Compile a static Linux binary (scratch has no libc to link against), then COPY it:
$ CGO_ENABLED=0 GOOS=linux go build -o dockercon .
$ docker build -t dockercon .
Dockerfile:
FROM scratch
COPY ./dockercon /dockercon
ENTRYPOINT ["/dockercon"]
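If you would rather not compile on your own machine, the same result can be sketched as a multi-stage build. This is an illustration under assumptions, not the Dockerfile from the talk: the golang:1.22 tag and the stage name are my choices, and it assumes a go.mod sits at the root of the build context.

```dockerfile
# Stage 1: compile inside the (large) official Go image.
# The tag is an assumption; pin whichever version your project uses.
FROM golang:1.22 AS build
WORKDIR /src
# Assumes go.mod and the main package live at the root of the build context.
COPY . .
# CGO_ENABLED=0 produces a static binary, which scratch needs (it has no libc).
RUN CGO_ENABLED=0 GOOS=linux go build -o /dockercon .

# Stage 2: the final image contains the binary and nothing else.
FROM scratch
COPY --from=build /dockercon /dockercon
ENTRYPOINT ["/dockercon"]
```

Only the second stage ends up in the tagged image, so the toolchain never ships.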
37. Hang on, what’s SCRATCH?
Special, empty image.
Use this to build your own base images.
Or, use to build minimal images that run a binary and nothing else:
FROM scratch
COPY hello /
CMD ["/hello"]
Want more on scratch? Start here.
38. Ruby
Ruby images can be extra huge. A new base + a little extra work pays off.
FROM alpine:3.2
LABEL maintainer="abbyfull@amazon.com"
# Install and clean up in one RUN so the apk cache never lands in a layer
RUN apk update && apk upgrade && apk add \
    curl \
    bash \
    ruby \
    ruby-dev \
    ruby-bundler \
    && rm -rf /var/cache/apk/*
39. Node.js
If you love yourself, .dockerignore npm-debug.log. Seriously.
But most importantly, cache node_modules:
COPY package.json .
RUN npm install --production
COPY . .
This way, npm install only reruns when package.json changes.
40. Java
Multi-stage builds are your friend:
FROM maven:3.5-jdk-8 as BUILD
COPY --from=BUILD
Like Golang, this lets you build an artifact in one stage and simply run it in the second stage, resulting in more minimal final images.
More on multi-stage builds up next.
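The two instructions on the slide are only the skeleton. One way the full file might look is sketched below. Everything beyond the maven:3.5-jdk-8 build stage is an assumption on my part: the eclipse-temurin:8-jre runtime image, the /app paths, and the app.jar artifact name (which would need a matching finalName in pom.xml).

```dockerfile
# Stage 1: build with Maven and a full JDK.
FROM maven:3.5-jdk-8 AS build
WORKDIR /app
# Copy pom.xml first so the dependency download is cached
# until pom.xml itself changes.
COPY pom.xml .
RUN mvn -B dependency:go-offline
COPY src ./src
RUN mvn -B package

# Stage 2: a JRE-only runtime image (assumed choice, not from the talk).
FROM eclipse-temurin:8-jre
# app.jar is a hypothetical artifact name.
COPY --from=build /app/target/app.jar /app.jar
ENTRYPOINT ["java", "-jar", "/app.jar"]
```

Maven, the JDK, and the local repository stay behind in the first stage.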
43. Let’s look at the original
$ docker build -t client .
$ docker images | grep client
client latest 772483ec773e time 568MB
44. Swap out the base image for a quick win
Change from ubuntu:14.04 to debian:stretch-slim
$ docker build -f Dockerfile-slim -t client-slim .
$ docker images | grep client
client latest 772483ec773e time 568MB
client-slim latest a72f5121de6e time 388MB
45. Or, change to a slimmer base image
Swap to python:2-alpine
$ docker build -f Dockerfile-alpine -t client-alpine .
$ docker images | grep client
client latest 772483ec773e time 568MB
client-slim latest a72f5121de6e time 388MB
client-alpine latest 02bbb5c5a079 time 300MB
46. Remove pip cache and extra layers
$ docker build -f Dockerfile-alpine-2 -t client-alpine-2 .
$ docker images | grep client
client latest 772483ec773e time 568MB
client-slim latest a72f5121de6e time 388MB
client-alpine latest 02bbb5c5a079 time 300MB
client-alpine-2 latest cd3e206844b6 time 275MB
47. Combine RUN statements
$ docker build -f Dockerfile-alpine-3 -t client-alpine-3 .
$ docker images | grep client
client latest 772483ec773e time 568MB
client-slim latest a72f5121de6e time 388MB
client-alpine latest 02bbb5c5a079 time 300MB
client-alpine-2 latest cd3e206844b6 time 275MB
client-alpine-3 latest 83fffeb85684 time 162MB
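The slides show the resulting sizes but not the Dockerfiles themselves, so the following is only a guess at the general shape of the last two steps. The requirements.txt file, the client.py entry point, and the gcc/musl-dev build dependencies are hypothetical.

```dockerfile
FROM python:2-alpine
WORKDIR /app

# Copy the dependency list on its own so this layer is cached
# until requirements.txt (hypothetical file) changes.
COPY requirements.txt .

# One combined RUN: install build tools, install Python packages without
# keeping the pip cache, then remove the build tools in the same layer.
RUN apk add --no-cache --virtual .build-deps gcc musl-dev \
    && pip install --no-cache-dir -r requirements.txt \
    && apk del .build-deps

COPY . .
# client.py is a hypothetical entry point.
CMD ["python", "client.py"]
```

Because the build tools are added and deleted inside one instruction, they never occupy space in any layer of the final image.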
50. Minimal images only matter so much
Docker image prune:
$ docker image prune -a
Or go even further with Docker system prune:
$ docker system prune -a
51. Safety first!
Lean on tools to check your images for vulnerabilities.
A couple of options, both paid and open source:
• Aqua MicroScanner (community edition)
• Aqua continuous image assurance
• Docker Security Scan with Trusted Registry
• Clair from CoreOS
52. The importance of garbage collection
Clean up after your containers! Beyond image and system prune:
• Make sure your orchestration platform (like ECS or K8s) is garbage
collecting:
• ECS
• Kubernetes
• 3rd party tools like spotify-gc
53. So what did we learn?
One takeaway: less layers is more.
Share layers where possible
Choose or build your base wisely
Not all languages should build the same
Keep it simple, avoid extras
Tools are here to help