SlideShare una empresa de Scribd logo
1 de 54
Descargar para leer sin conexión
AKIHIRO SUDA
NTT Corporation
Hardening Docker
daemon with
Rootless mode
About me
● Software Engineer at NTT
● Maintainer of Moby, containerd, and BuildKit
● Docker Tokyo Community Leader
Rootless Docker
● Run Docker as a non-root user on the host
● Protect the host from potential Docker vulns
and misconfiguration
Non-rootroot
Demo
Don’t confuse with..
$ sudo docker
Image: https://xkcd.com/149/
Don’t confuse with..
$ sudo docker
$ usermod -aG docker penguin
Rootless Docker
$ ls -l /var/run/docker.sock
srw-rw---- 1 root docker 0 May 1 12:00 /var/run/docker.sock
$ sudo usermod -aG docker penguin
Non-root username: “penguin”
Rootless Docker
$ ls -l /var/run/docker.sock
srw-rw---- 1 root docker 0 May 1 12:00 /var/run/docker.sock
$ sudo usermod -aG docker penguin
Non-root username: “penguin”
Image: https://twitter.com/llegaspacheco/status/1111783777372639232
Rootless Docker
$ ls -l /var/run/docker.sock
srw-rw---- 1 root docker 0 May 1 12:00 /var/run/docker.sock
$ sudo usermod -aG docker penguin
Non-root username: “penguin”
Image: https://twitter.com/llegaspacheco/status/1111783777372639232
Don’t confuse with..
$ sudo docker
$ usermod -aG docker penguin
$ docker run --user 42
All of them run the daemon as the root!
Don’t confuse with..
$ sudo docker
$ usermod -aG docker penguin
$ docker run --user 42
$ dockerd --userns-remap
Rootless Docker
● Rootless Docker refers to running the Docker daemon
(and containers of course) as a non-root user
● Even if it got compromised, the attacker wouldn’t be able
to gain the root on the host
(unless you have sudo configured with NOPASSWD)
Some caveats apply..
● No OverlayFS (except on Ubuntu)
● Limited network performance by default
● TCP/UDP port numbers below 1024 can’t be listened on
● No cgroup
○ docker run: --memory and --cpu-* flags are
ignored
○ docker top: does not work
You can install it under your $HOME
right now!
● sudo is not required
● But /etc/subuid and /etc/subgid need to be
configured to contain your username
○ configured by default on recent distros
curl -fsSL https://get.docker.com/rootless | sh
You can install it under your $HOME
right now!
● The installer shows helpful error if /etc/sub[ug]id is
unconfigured
○ Thanks to Tõnis Tiigi and Tibor Vass!
● Feel free to ask me after this session if it doesn’t work
curl -fsSL https://get.docker.com/rootless | sh
Katacoda scenario available!
https://www.katacoda.com/courses/docker/rootless
Motivation
Harden containers
● Docker has a lot of features for hardening containers, so
root-in-container is still contained by default
○ namespaces, capabilities
○ seccomp, AppArmor, SELinux...
● But there is no such thing as vulnerability-free software;
root-in-container could break out with an exploit
○ CVE-2019-5736 runc breakout (Feb 11, 2019)
Harden containers
● And people often make misconfiguration!
● “We found 3,822 Docker hosts with the remote API
exposed publicly.”
-- Vitaly Simonovich and Ori Nakar (March 4, 2019)
https://www.imperva.com/blog/hundreds-of-vulnerable-docker-hosts-exploite
d-by-cryptocurrency-miners/
Harden containers
● Rootless mode per se doesn’t fix vulns and
misconfigurations - but it can mitigate attacks
● Attacker won’t be able to:
○ access files owned by other users
○ modify firmware and kernel (→ undetectable malware)
○ ARP spoofing
Caution: not panacea!
● If Docker had a vuln, attackers still might be able to:
○ Mine cryptocurrencies
○ Springboard-attack to other hosts
● Not effective for potential vulns on
kernel / VM / HW side
High-performance Computing (HPC)
● HPC users are typically disallowed to gain the root on the
host
● Good news: GPU (and perhaps FPGA devices) are
known to work with Rootless mode
Docker-in-Docker
● There are a lot of valid use cases to allow a Docker
container to call Docker API
○ FaaS
○ CI
○ Build images
○ ...
Docker-in-Docker
$ docker run -v /var/run/docker.sock:/var/run/docker.sock
$ docker run --privileged docker:dind
● Two types of Docker-in-Docker, both had been unsafe
without Rootless
How it works
Pretend to be the root
● User namespaces allow non-root users to pretend to be
the root
● Root-in-UserNS can have fake UID 0 and also create
other namespaces (MountNS, NetNS..)
Pretend to be the root
● But Root-in-UserNS cannot gain the real root
○ Inaccessible files still remain inaccessible
○ Kernel modules cannot be loaded
○ System cannot be rebooted
Pretend to be the root
$ id -u
1001
$ ls -ln
-rw-rw---- 1 1001 1001 42 May 1 12:00 foo
Pretend to be the root
$ docker run -v $(pwd):/mnt -it alpine
/ # id -u
0
/ # ls -ln /mnt
-rw-rw---- 1 0 0 42 May 1 12:00 foo
Still owned by 1001 on the host
Still running as 1001 on the host
Pretend to be the root
$ docker run -v /:/host -it alpine
/ # ls -ln /host/dev/sda
brw-rw---- 1 65534 65534 8, 0 May 1 12:00 /host/dev/sda
/ # cat /host/dev/sda
cat: can’t open ‘/host/dev/sda’: Permission denied
Still owned by root(0) on the host
Sub-users (and sub-groups)
● Put users in your user account so you can be a user
while you are a user
● Sub-users are used as non-root users in a container
○ USER in Dockerfile
○ docker run --user
Sub-users (and sub-groups)
● If /etc/subuid contains “1001:100000:65536”
● Having 65536 sub-users should be enough for most
containers
0 1001 100000 165535 232
0 1 65536
Host
UserNS
primary user sub-users start sub-users len
● A container has a mutable copy of the image
● Copying file takes time and wastes disk space
● Rootful Docker uses OverlayFS to reduce extra copy
Snapshotting
Image
container
container
container
docker run
Snapshotting
● OverlayFS is currently unavailable for Rootless mode
(unless you have Ubuntu’s kernel patch)
● On ext4, files are just copied instead; Slow and wasteful
● But on XFS “reflink” is used to deduplicate files
○ copy_file_range(2)
○ Slow but not wasteful
Networking
● Non-root user can create NetNS but cannot create a
vEth pair across the host and a NetNS
● VPNKit is used instead of vEth pair
○ User-mode network stack based on MirageOS TCP/IP
○ Also used by Docker for Mac/Win
Practical Tips
systemd service
● The unit file is in your home:
~/.config/systemd/user/docker.service
● To enable user services on system startup:
$ sudo loginctl enable-linger penguin
$ systemctl --user start docker
$ systemctl --user stop docker
Enable OverlayFS
● The vanilla kernel disallows mounting OverlayFS in user
namespaces
● But if you install Ubuntu kernel, you can get support for
OverlayFS
https://lists.ubuntu.com/archives/kernel-team/2014-February/038091.html
Enable XFS reflink
● If OverlayFS is not available, use XFS to deduplicate files
○ efficient for dedupe but slow
○ otherwise (i.e. ext4) all files are duplicated per layer
● ~/.config/docker/daemon.json:
● Make sure to format with `mkfs.xfs -m reflink=1`,
{“storage-driver”: “vfs”,
“data-root”:”/mnt/xfs/foo”}
Change network stack: slirp4netns
● The default network stack (VPNKit) is slow
● Install slirp4netns (v0.3.0+) to get better throughput
○ iperf3 benchmark (container to host):
514Mbps → 9.21 Gbps
○ still slow compared to native vEth 52.1 Gbps
Benchmark: https://fosdem.org/2019/schedule/event/containers_k8s_rootless/
Change network stack: slirp4netns
● https://github.com/rootless-containers/slirp4netns
● ./configure && make && make install
● RPM/DEB is also available for most distros (but
sometimes outdated)
● If slirp4netns is installed on $PATH, Docker automatically
picks up
Change network stack: lxc-user-nic
● Or install lxc-user-nic to get native performance
○ SETUID binary (executed as the root)
■ potentially result in root privilege escalation
if lxc-user-nic had vuln
$ sudo apt-get install liblxc-common
Change network stack: lxc-user-nic
● /etc/lxc/lxc-usernet needs to be configured:
● $DOCKERD_ROOTLESS_ROOTLESSKIT_NET needs to be
set to lxc-user-nic
# USERNAME TYPE BRIDGE COUNT
penguin veth lxcbr0 1
Count of dockerd and LXC containers
(Not count of Docker containers)
Exposing TCP/UDP ports below 1024
● Exposing port numbers below 1024 requires
CAP_NET_BIND_SERVICE
$ sudo setcap cap_net_bind_service=ep 
~/bin/rootlesskit
$ docker run -p 80:80 ...
Future work
Docker 19.09? 20.03?
FUSE-OverlayFS
● FUSE-OverlayFS can emulate OverlayFS without root
privileges on any distro (requires Kernel 4.18)
● Faster than XFS dedupe but slightly slower than real
OverlayFS
● containerd will be able to support FUSE-OverlayFS
● Docker will be able to use containerd snapshotter
https://github.com/moby/moby/pull/38738
OverlayFS
● There has been also discussion to push Ubuntu’s patch
to the real OverlayFS upstream
● Likely to take more time?
cgroup2
cgroup2 is needed for safely supporting rootless cgroup
Docker
containerd
runc
systemd
Linux Kernel
Already support cgroup2
TODO
Work in progress
cgroup2
● runc doesn’t support cgroup2 yet, but “crun” already
supports cgroup2 https://github.com/giuseppe/crun
● OCI (Open Containers Initiative) is working on bringing
proper cgroup2 support to OCI Runtime Spec and runc
https://github.com/opencontainers/runtime-spec/issues/1002
LDAP
● Configuring /etc/subuid and /etc/subgid might be
painful on LDAP environments
● NSS module is under discussion for LDAP environments
https://github.com/shadow-maint/shadow/issues/154
○ No need to configure /etc/subuid and /etc/subgid
LDAP
● Another way: emulate sub-users using a single user
● runROOTLESS: An OCI Runtime Implementation with
sub-users emulation https://github.com/rootless-containers/runrootless
○ Uses Ptrace and Xattr for emulating syscalls
○ 2-15 times performance overhead
https://github.com/rootless-containers/runrootless/issues/14
LDAP
● seccomp could be used for accelerating ptrace, but we
are still facing implementation issues
● We are also looking into possibility of using
“Seccomp Trap To Userspace” (introduced in Kernel 5.0)
○ Modern replacement for ptrace
Join us at Open Source Summit !
● Thursday, May 2, 12:30 PM - 02:30 PM
● Room 2020
● Three BuildKit talks
including this →
Questions?
get.docker.com/rootless

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

eStargzイメージとlazy pullingによる高速なコンテナ起動
eStargzイメージとlazy pullingによる高速なコンテナ起動eStargzイメージとlazy pullingによる高速なコンテナ起動
eStargzイメージとlazy pullingによる高速なコンテナ起動
 
Istioサービスメッシュ入門
Istioサービスメッシュ入門Istioサービスメッシュ入門
Istioサービスメッシュ入門
 
続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2
続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2
続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2
 
Kuberneteの運用を支えるGitOps
Kuberneteの運用を支えるGitOpsKuberneteの運用を支えるGitOps
Kuberneteの運用を支えるGitOps
 
Dockerからcontainerdへの移行
Dockerからcontainerdへの移行Dockerからcontainerdへの移行
Dockerからcontainerdへの移行
 
DockerとKubernetesをかけめぐる
DockerとKubernetesをかけめぐるDockerとKubernetesをかけめぐる
DockerとKubernetesをかけめぐる
 
DockerとPodmanの比較
DockerとPodmanの比較DockerとPodmanの比較
DockerとPodmanの比較
 
本当は恐ろしい分散システムの話
本当は恐ろしい分散システムの話本当は恐ろしい分散システムの話
本当は恐ろしい分散システムの話
 
コンテナの作り方「Dockerは裏方で何をしているのか?」
コンテナの作り方「Dockerは裏方で何をしているのか?」コンテナの作り方「Dockerは裏方で何をしているのか?」
コンテナの作り方「Dockerは裏方で何をしているのか?」
 
[KubeCon NA 2020] containerd: Rootless Containers 2020
[KubeCon NA 2020] containerd: Rootless Containers 2020[KubeCon NA 2020] containerd: Rootless Containers 2020
[KubeCon NA 2020] containerd: Rootless Containers 2020
 
containerdの概要と最近の機能
containerdの概要と最近の機能containerdの概要と最近の機能
containerdの概要と最近の機能
 
Docker入門-基礎編 いまから始めるDocker管理【2nd Edition】
Docker入門-基礎編 いまから始めるDocker管理【2nd Edition】Docker入門-基礎編 いまから始めるDocker管理【2nd Edition】
Docker入門-基礎編 いまから始めるDocker管理【2nd Edition】
 
Prometheus at Preferred Networks
Prometheus at Preferred NetworksPrometheus at Preferred Networks
Prometheus at Preferred Networks
 
【解説】IKE(IIJ Kubernetes Engine):= Vanilla Kubernetes + 何?
【解説】IKE(IIJ Kubernetes Engine):= Vanilla Kubernetes + 何?【解説】IKE(IIJ Kubernetes Engine):= Vanilla Kubernetes + 何?
【解説】IKE(IIJ Kubernetes Engine):= Vanilla Kubernetes + 何?
 
root権限無しでKubernetesを動かす
root権限無しでKubernetesを動かす root権限無しでKubernetesを動かす
root権限無しでKubernetesを動かす
 
分散トレーシング技術について(Open tracingやjaeger)
分散トレーシング技術について(Open tracingやjaeger)分散トレーシング技術について(Open tracingやjaeger)
分散トレーシング技術について(Open tracingやjaeger)
 
Dockerからcontainerdへの移行
Dockerからcontainerdへの移行Dockerからcontainerdへの移行
Dockerからcontainerdへの移行
 
急速に進化を続けるCNIプラグイン Antrea
急速に進化を続けるCNIプラグイン Antrea 急速に進化を続けるCNIプラグイン Antrea
急速に進化を続けるCNIプラグイン Antrea
 
Windowsコンテナ入門
Windowsコンテナ入門Windowsコンテナ入門
Windowsコンテナ入門
 
Knative Eventing 入門(Kubernetes Novice Tokyo #11 発表資料)
Knative Eventing 入門(Kubernetes Novice Tokyo #11 発表資料)Knative Eventing 入門(Kubernetes Novice Tokyo #11 発表資料)
Knative Eventing 入門(Kubernetes Novice Tokyo #11 発表資料)
 

Similar a DCSF19 Hardening Docker daemon with Rootless mode

Introduction to Docker and Containers
Introduction to Docker and ContainersIntroduction to Docker and Containers
Introduction to Docker and Containers
Docker, Inc.
 
A Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and ContainersA Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and Containers
Docker, Inc.
 
Docker Introduction + what is new in 0.9
Docker Introduction + what is new in 0.9 Docker Introduction + what is new in 0.9
Docker Introduction + what is new in 0.9
Jérôme Petazzoni
 

Similar a DCSF19 Hardening Docker daemon with Rootless mode (20)

[DockerCon 2020] Hardening Docker daemon with Rootless Mode
[DockerCon 2020] Hardening Docker daemon with Rootless Mode[DockerCon 2020] Hardening Docker daemon with Rootless Mode
[DockerCon 2020] Hardening Docker daemon with Rootless Mode
 
Puppet Camp Chicago 2014: Docker and Puppet: 1+1=3 (Intermediate)
Puppet Camp Chicago 2014: Docker and Puppet: 1+1=3 (Intermediate)Puppet Camp Chicago 2014: Docker and Puppet: 1+1=3 (Intermediate)
Puppet Camp Chicago 2014: Docker and Puppet: 1+1=3 (Intermediate)
 
Rootless Containers & Unresolved issues
Rootless Containers & Unresolved issuesRootless Containers & Unresolved issues
Rootless Containers & Unresolved issues
 
Introduction to Docker and Containers
Introduction to Docker and ContainersIntroduction to Docker and Containers
Introduction to Docker and Containers
 
Docker - A Ruby Introduction
Docker - A Ruby IntroductionDocker - A Ruby Introduction
Docker - A Ruby Introduction
 
Docker 0.11 at MaxCDN meetup in Los Angeles
Docker 0.11 at MaxCDN meetup in Los AngelesDocker 0.11 at MaxCDN meetup in Los Angeles
Docker 0.11 at MaxCDN meetup in Los Angeles
 
codemotion-docker-2014
codemotion-docker-2014codemotion-docker-2014
codemotion-docker-2014
 
Real World Experience of Running Docker in Development and Production
Real World Experience of Running Docker in Development and ProductionReal World Experience of Running Docker in Development and Production
Real World Experience of Running Docker in Development and Production
 
Docker linuxday 2015
Docker linuxday 2015Docker linuxday 2015
Docker linuxday 2015
 
Introduction to Docker at SF Peninsula Software Development Meetup @Guidewire
Introduction to Docker at SF Peninsula Software Development Meetup @GuidewireIntroduction to Docker at SF Peninsula Software Development Meetup @Guidewire
Introduction to Docker at SF Peninsula Software Development Meetup @Guidewire
 
Running .NET on Docker
Running .NET on DockerRunning .NET on Docker
Running .NET on Docker
 
A Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and ContainersA Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and Containers
 
Docker.io
Docker.ioDocker.io
Docker.io
 
Workshop : 45 minutes pour comprendre Docker avec Jérôme Petazzoni
Workshop : 45 minutes pour comprendre Docker avec Jérôme PetazzoniWorkshop : 45 minutes pour comprendre Docker avec Jérôme Petazzoni
Workshop : 45 minutes pour comprendre Docker avec Jérôme Petazzoni
 
Introduction to Docker, December 2014 "Tour de France" Edition
Introduction to Docker, December 2014 "Tour de France" EditionIntroduction to Docker, December 2014 "Tour de France" Edition
Introduction to Docker, December 2014 "Tour de France" Edition
 
JDO 2019: Tips and Tricks from Docker Captain - Łukasz Lach
JDO 2019: Tips and Tricks from Docker Captain - Łukasz LachJDO 2019: Tips and Tricks from Docker Captain - Łukasz Lach
JDO 2019: Tips and Tricks from Docker Captain - Łukasz Lach
 
Docker Introduction, and what's new in 0.9 — Docker Palo Alto at RelateIQ
Docker Introduction, and what's new in 0.9 — Docker Palo Alto at RelateIQDocker Introduction, and what's new in 0.9 — Docker Palo Alto at RelateIQ
Docker Introduction, and what's new in 0.9 — Docker Palo Alto at RelateIQ
 
Docker Introduction + what is new in 0.9
Docker Introduction + what is new in 0.9 Docker Introduction + what is new in 0.9
Docker Introduction + what is new in 0.9
 
Why everyone is excited about Docker (and you should too...) - Carlo Bonamic...
Why everyone is excited about Docker (and you should too...) -  Carlo Bonamic...Why everyone is excited about Docker (and you should too...) -  Carlo Bonamic...
Why everyone is excited about Docker (and you should too...) - Carlo Bonamic...
 
Rootless Containers
Rootless ContainersRootless Containers
Rootless Containers
 

Más de Docker, Inc.

Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWS
Docker, Inc.
 
Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWS
Docker, Inc.
 

Más de Docker, Inc. (20)

Containerize Your Game Server for the Best Multiplayer Experience
Containerize Your Game Server for the Best Multiplayer Experience Containerize Your Game Server for the Best Multiplayer Experience
Containerize Your Game Server for the Best Multiplayer Experience
 
How to Improve Your Image Builds Using Advance Docker Build
How to Improve Your Image Builds Using Advance Docker BuildHow to Improve Your Image Builds Using Advance Docker Build
How to Improve Your Image Builds Using Advance Docker Build
 
Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWS
 
Securing Your Containerized Applications with NGINX
Securing Your Containerized Applications with NGINXSecuring Your Containerized Applications with NGINX
Securing Your Containerized Applications with NGINX
 
How To Build and Run Node Apps with Docker and Compose
How To Build and Run Node Apps with Docker and ComposeHow To Build and Run Node Apps with Docker and Compose
How To Build and Run Node Apps with Docker and Compose
 
Hands-on Helm
Hands-on Helm Hands-on Helm
Hands-on Helm
 
Distributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at SalesforceDistributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at Salesforce
 
The First 10M Pulls: Building The Official Curl Image for Docker Hub
The First 10M Pulls: Building The Official Curl Image for Docker HubThe First 10M Pulls: Building The Official Curl Image for Docker Hub
The First 10M Pulls: Building The Official Curl Image for Docker Hub
 
Monitoring in a Microservices World
Monitoring in a Microservices WorldMonitoring in a Microservices World
Monitoring in a Microservices World
 
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
 
Predicting Space Weather with Docker
Predicting Space Weather with DockerPredicting Space Weather with Docker
Predicting Space Weather with Docker
 
Become a Docker Power User With Microsoft Visual Studio Code
Become a Docker Power User With Microsoft Visual Studio CodeBecome a Docker Power User With Microsoft Visual Studio Code
Become a Docker Power User With Microsoft Visual Studio Code
 
How to Use Mirroring and Caching to Optimize your Container Registry
How to Use Mirroring and Caching to Optimize your Container RegistryHow to Use Mirroring and Caching to Optimize your Container Registry
How to Use Mirroring and Caching to Optimize your Container Registry
 
Monolithic to Microservices + Docker = SDLC on Steroids!
Monolithic to Microservices + Docker = SDLC on Steroids!Monolithic to Microservices + Docker = SDLC on Steroids!
Monolithic to Microservices + Docker = SDLC on Steroids!
 
Kubernetes at Datadog Scale
Kubernetes at Datadog ScaleKubernetes at Datadog Scale
Kubernetes at Datadog Scale
 
Labels, Labels, Labels
Labels, Labels, Labels Labels, Labels, Labels
Labels, Labels, Labels
 
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment ModelUsing Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
 
Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWS
 
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
 
Developing with Docker for the Arm Architecture
Developing with Docker for the Arm ArchitectureDeveloping with Docker for the Arm Architecture
Developing with Docker for the Arm Architecture
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Último (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

DCSF19 Hardening Docker daemon with Rootless mode

  • 1. AKIHIRO SUDA NTT Corporation Hardening Docker daemon with Rootless mode
  • 2. About me ● Software Engineer at NTT ● Maintainer of Moby, containerd, and BuildKit ● Docker Tokyo Community Leader
  • 3. Rootless Docker ● Run Docker as a non-root user on the host ● Protect the host from potential Docker vulns and misconfiguration Non-rootroot
  • 5. Don’t confuse with.. $ sudo docker Image: https://xkcd.com/149/
  • 6. Don’t confuse with.. $ sudo docker $ usermod -aG docker penguin
  • 7. Rootless Docker $ ls -l /var/run/docker.sock srw-rw---- 1 root docker 0 May 1 12:00 /var/run/docker.sock $ sudo usermod -aG docker penguin Non-root username: “penguin”
  • 8. Rootless Docker $ ls -l /var/run/docker.sock srw-rw---- 1 root docker 0 May 1 12:00 /var/run/docker.sock $ sudo usermod -aG docker penguin Non-root username: “penguin” Image: https://twitter.com/llegaspacheco/status/1111783777372639232
  • 9. Rootless Docker $ ls -l /var/run/docker.sock srw-rw---- 1 root docker 0 May 1 12:00 /var/run/docker.sock $ sudo usermod -aG docker penguin Non-root username: “penguin” Image: https://twitter.com/llegaspacheco/status/1111783777372639232
  • 10. Don’t confuse with.. $ sudo docker $ usermod -aG docker penguin $ docker run --user 42
  • 11. All of them run the daemon as the root! Don’t confuse with.. $ sudo docker $ usermod -aG docker penguin $ docker run --user 42 $ dockerd --userns-remap
  • 12. Rootless Docker ● Rootless Docker refers to running the Docker daemon (and containers of course) as a non-root user ● Even if it got compromised, the attacker wouldn’t be able to gain the root on the host (unless you have sudo configured with NOPASSWD)
  • 13. Some caveats apply.. ● No OverlayFS (except on Ubuntu) ● Limited network performance by default ● TCP/UDP port numbers below 1024 can’t be listened on ● No cgroup ○ docker run: --memory and --cpu-* flags are ignored ○ docker top: does not work
  • 14. You can install it under your $HOME right now! ● sudo is not required ● But /etc/subuid and /etc/subgid need to be configured to contain your username ○ configured by default on recent distros curl -fsSL https://get.docker.com/rootless | sh
  • 15. You can install it under your $HOME right now! ● The installer shows helpful error if /etc/sub[ug]id is unconfigured ○ Thanks to Tõnis Tiigi and Tibor Vass! ● Feel free to ask me after this session if it doesn’t work curl -fsSL https://get.docker.com/rootless | sh
  • 18. Harden containers ● Docker has a lot of features for hardening containers, so root-in-container is still contained by default ○ namespaces, capabilities ○ seccomp, AppArmor, SELinux... ● But there is no such thing as vulnerability-free software; root-in-container could break out with an exploit ○ CVE-2019-5736 runc breakout (Feb 11, 2019)
  • 19. Harden containers ● And people often make misconfiguration! ● “We found 3,822 Docker hosts with the remote API exposed publicly.” -- Vitaly Simonovich and Ori Nakar (March 4, 2019) https://www.imperva.com/blog/hundreds-of-vulnerable-docker-hosts-exploite d-by-cryptocurrency-miners/
  • 20. Harden containers ● Rootless mode per se doesn’t fix vulns and misconfigurations - but it can mitigate attacks ● Attacker won’t be able to: ○ access files owned by other users ○ modify firmware and kernel (→ undetectable malware) ○ ARP spoofing
  • 21. Caution: not panacea! ● If Docker had a vuln, attackers still might be able to: ○ Mine cryptocurrencies ○ Springboard-attack to other hosts ● Not effective for potential vulns on kernel / VM / HW side
  • 22. High-performance Computing (HPC) ● HPC users are typically disallowed to gain the root on the host ● Good news: GPU (and perhaps FPGA devices) are known to work with Rootless mode
  • 23. Docker-in-Docker ● There are a lot of valid use cases to allow a Docker container to call Docker API ○ FaaS ○ CI ○ Build images ○ ...
  • 24. Docker-in-Docker $ docker run -v /var/run/docker.sock:/var/run/docker.sock $ docker run --privileged docker:dind ● Two types of Docker-in-Docker, both had been unsafe without Rootless
  • 26. Pretend to be the root ● User namespaces allow non-root users to pretend to be the root ● Root-in-UserNS can have fake UID 0 and also create other namespaces (MountNS, NetNS..)
  • 27. Pretend to be the root ● But Root-in-UserNS cannot gain the real root ○ Inaccessible files still remain inaccessible ○ Kernel modules cannot be loaded ○ System cannot be rebooted
  • 28. Pretend to be the root $ id -u 1001 $ ls -ln -rw-rw---- 1 1001 1001 42 May 1 12:00 foo
  • 29. Pretend to be the root $ docker run -v $(pwd):/mnt -it alpine / # id -u 0 / # ls -ln /mnt -rw-rw---- 1 0 0 42 May 1 12:00 foo Still owned by 1001 on the host Still running as 1001 on the host
  • 30. Pretend to be the root $ docker run -v /:/host -it alpine / # ls -ln /host/dev/sda brw-rw---- 1 65534 65534 8, 0 May 1 12:00 /host/dev/sda / # cat /host/dev/sda cat: can’t open ‘/host/dev/sda’: Permission denied Still owned by root(0) on the host
  • 31. Sub-users (and sub-groups) ● Put users in your user account so you can be a user while you are a user ● Sub-users are used as non-root users in a container ○ USER in Dockerfile ○ docker run --user
  • 32. Sub-users (and sub-groups) ● If /etc/subuid contains “1001:100000:65536” ● Having 65536 sub-users should be enough for most containers 0 1001 100000 165535 232 0 1 65536 Host UserNS primary user sub-users start sub-users len
  • 33. ● A container has a mutable copy of the image ● Copying file takes time and wastes disk space ● Rootful Docker uses OverlayFS to reduce extra copy Snapshotting Image container container container docker run
  • 34. Snapshotting ● OverlayFS is currently unavailable for Rootless mode (unless you have Ubuntu’s kernel patch) ● On ext4, files are just copied instead; Slow and wasteful ● But on XFS “reflink” is used to deduplicate files ○ copy_file_range(2) ○ Slow but not wasteful
  • 35. Networking ● Non-root user can create NetNS but cannot create a vEth pair across the host and a NetNS ● VPNKit is used instead of vEth pair ○ User-mode network stack based on MirageOS TCP/IP ○ Also used by Docker for Mac/Win
  • 37. systemd service ● The unit file is in your home: ~/.config/systemd/user/docker.service ● To enable user services on system startup: $ sudo loginctl enable-linger penguin $ systemctl --user start docker $ systemctl --user stop docker
  • 38. Enable OverlayFS ● The vanilla kernel disallows mounting OverlayFS in user namespaces ● But if you install Ubuntu kernel, you can get support for OverlayFS https://lists.ubuntu.com/archives/kernel-team/2014-February/038091.html
  • 39. Enable XFS reflink ● If OverlayFS is not available, use XFS to deduplicate files ○ efficient for dedupe but slow ○ otherwise (i.e. ext4) all files are duplicated per layer ● ~/.config/docker/daemon.json: ● Make sure to format with `mkfs.xfs -m reflink=1`, {“storage-driver”: “vfs”, “data-root”:”/mnt/xfs/foo”}
  • 40. Change network stack: slirp4netns ● The default network stack (VPNKit) is slow ● Install slirp4netns (v0.3.0+) to get better throughput ○ iperf3 benchmark (container to host): 514Mbps → 9.21 Gbps ○ still slow compared to native vEth 52.1 Gbps Benchmark: https://fosdem.org/2019/schedule/event/containers_k8s_rootless/
  • 41. Change network stack: slirp4netns ● https://github.com/rootless-containers/slirp4netns ● ./configure && make && make install ● RPM/DEB is also available for most distros (but sometimes outdated) ● If slirp4netns is installed on $PATH, Docker automatically picks up
  • 42. Change network stack: lxc-user-nic ● Or install lxc-user-nic to get native performance ○ SETUID binary (executed as the root) ■ potentially result in root privilege escalation if lxc-user-nic had vuln $ sudo apt-get install liblxc-common
  • 43. Change network stack: lxc-user-nic ● /etc/lxc/lxc-usernet needs to be configured: ● $DOCKERD_ROOTLESS_ROOTLESSKIT_NET needs to be set to lxc-user-nic # USERNAME TYPE BRIDGE COUNT penguin veth lxcbr0 1 Count of dockerd and LXC containers (Not count of Docker containers)
  • 44. Exposing TCP/UDP ports below 1024 ● Exposing port numbers below 1024 requires CAP_NET_BIND_SERVICE $ sudo setcap cap_net_bind_service=ep ~/bin/rootlesskit $ docker run -p 80:80 ...
  • 46. FUSE-OverlayFS ● FUSE-OverlayFS can emulate OverlayFS without root privileges on any distro (requires Kernel 4.18) ● Faster than XFS dedupe but slightly slower than real OverlayFS ● containerd will be able to support FUSE-OverlayFS ● Docker will be able to use containerd snapshotter https://github.com/moby/moby/pull/38738
  • 47. OverlayFS ● There has been also discussion to push Ubuntu’s patch to the real OverlayFS upstream ● Likely to take more time?
  • 48. cgroup2 cgroup2 is needed for safely supporting rootless cgroup Docker containerd runc systemd Linux Kernel Already support cgroup2 TODO Work in progress
  • 49. cgroup2 ● runc doesn’t support cgroup2 yet, but “crun” already supports cgroup2 https://github.com/giuseppe/crun ● OCI (Open Containers Initiative) is working on bringing proper cgroup2 support to OCI Runtime Spec and runc https://github.com/opencontainers/runtime-spec/issues/1002
  • 50. LDAP ● Configuring /etc/subuid and /etc/subgid might be painful on LDAP environments ● NSS module is under discussion for LDAP environments https://github.com/shadow-maint/shadow/issues/154 ○ No need to configure /etc/subuid and /etc/subgid
  • 51. LDAP ● Another way: emulate sub-users using a single user ● runROOTLESS: An OCI Runtime Implementation with sub-users emulation https://github.com/rootless-containers/runrootless ○ Uses Ptrace and Xattr for emulating syscalls ○ 2-15 times performance overhead https://github.com/rootless-containers/runrootless/issues/14
  • 52. LDAP ● seccomp could be used for accelerating ptrace, but we are still facing implementation issues ● We are also looking into possibility of using “Seccomp Trap To Userspace” (introduced in Kernel 5.0) ○ Modern replacement for ptrace
  • 53. Join us at Open Source Summit ! ● Thursday, May 2, 12:30 PM - 02:30 PM ● Room 2020 ● Three BuildKit talks including this →