2. #ContainerDayFRParis Container Day 2017
Jessie Frazelle
Software Engineer
Googler.
I have contributed to many open source projects,
including Docker, Go, Kubernetes, runc, and the
Linux kernel.
Focus on runtime security for containers.
Security in a Containerized World
4. #ContainerDayFRParis Container Day 2017
A brief history of containers and Paris...
Paris is the true home of containers.
8. #ContainerDayFRParis Container Day 2017
Security Model: Stages
Single tenant, multi-identities
● Multi-user, but no hard enforcement. Still have to treat as one trust boundary
Cooperative soft multi-tenancy
● Multi-user, fine-grained authorization. Possibly not fully hardened
● Often good enough for multi-tenancy inside a single company (can fire bad actors)
Hard multi-tenancy
● Multi-tenant security boundaries more strongly enforced
● Better resource isolation
● E.g. comfortable running code from multiple third parties on the same cluster
Kubernetes 1.6+ Long term goal
9. #ContainerDayFRParis Container Day 2017
Project/Cluster as Boundary
● Authorization permissions granted at project/cluster level
● All nodes have same authenticated identity
● All pods have same authorization permissions
● All pods have full network access
[Diagram: my-project/my-cluster with nodes node-1, node-2, node-3 running pods team1-fe, team1-db, team2-fe, team2-db]
10. #ContainerDayFRParis Container Day 2017
Namespace/Pod as a Boundary
● Authorization permissions granted at namespace/resource level
● Nodes have individual identities w/ per-node permissions
● Pods have identity with fine-grained permissions
● Pod network access can be limited to what’s necessary
[Diagram: my-cluster with nodes node-1, node-2, node-3; pods team1-fe and team1-db in team1-namespace, team2-fe and team2-db in team2-namespace]
11. #ContainerDayFRParis Container Day 2017
Namespace/Pod as a Boundary
Per-namespace/per-resource permissions
Move from: Everything has root-in-cluster.
To: Users and system components have least privilege.
Examples:
● Alice can list Eng services, but not HR
● Bob can create Pods in Test namespace, not Prod
● Scheduler can read Pods but not Secrets
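The Bob example above can be sketched as an RBAC Role plus RoleBinding. This is a minimal sketch, not a complete policy: the namespace name `test`, the object names, and the user name `bob` are illustrative assumptions, and the `rbac.authorization.k8s.io/v1` version shown comes from releases after this talk (clusters around 1.6 served `v1beta1`).

```yaml
# Role: allows creating Pods, scoped to the "test" namespace only.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: test          # assumed namespace name
  name: pod-creator        # hypothetical role name
rules:
- apiGroups: [""]          # "" is the core API group, where Pods live
  resources: ["pods"]
  verbs: ["create"]
---
# RoleBinding: grants the Role above to user "bob" in the same namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: test
  name: bob-creates-pods   # hypothetical binding name
subjects:
- kind: User
  name: bob                # assumed user name
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-creator
  apiGroup: rbac.authorization.k8s.io
```

With nothing equivalent bound in a prod namespace, the same create request there would be denied, assuming no other binding grants it.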
12. #ContainerDayFRParis Container Day 2017
Namespace/Pod as a Boundary
Per-node identity & permissions
Move from: All nodes have vast permissions.
To: Each node has least privilege.
Examples:
● Node can only get info about Pods scheduled on it
● Compromised node doesn’t allow additional
escalation through Kubernetes API
13. #ContainerDayFRParis Container Day 2017
Namespace/Pod as a Boundary
Pod identity & permissions
Move from: Running workloads have full system
access by default.
To: Workloads must be granted permissions.
Example:
● Running workload can list other objects in its
namespace, but not outside of it
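One way to express this example is a dedicated service account whose only grant is listing Pods in its own namespace. A sketch under assumptions: the names `lister`, `list-pods`, and `lister-list-pods` are hypothetical, `team1-namespace` is borrowed from the earlier diagram, and the `v1` RBAC version is from releases after this talk.

```yaml
# Identity the workload runs as.
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: team1-namespace
  name: lister
---
# Namespaced Role: read-only access to Pods in team1-namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team1-namespace
  name: list-pods
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
---
# Bind the Role to the service account; no cluster-wide grant is made.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: team1-namespace
  name: lister-list-pods
subjects:
- kind: ServiceAccount
  name: lister
  namespace: team1-namespace
roleRef:
  kind: Role
  name: list-pods
  apiGroup: rbac.authorization.k8s.io
---
# The Pod opts in to that identity via serviceAccountName.
apiVersion: v1
kind: Pod
metadata:
  namespace: team1-namespace
  name: hello-nginx
spec:
  serviceAccountName: lister
  containers:
  - name: nginx
    image: nginx
```

Because a Role is namespaced, the grant does not extend to other namespaces.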
14. #ContainerDayFRParis Container Day 2017
Namespace/Pod as a Boundary
Network policy
Move from: All workloads receive traffic from
anywhere on the network.
To: Network connectivity is controllable by policy.
Example:
● Frontend layer can only communicate with
application layer, not other frontends.
17. #ContainerDayFRParis Container Day 2017
What is a sandbox?
Provides a net reduction in attack surface.
19. #ContainerDayFRParis Container Day 2017
What is a sandbox?
Compare code execution from inside and outside the
sandbox…
Are more or fewer things possible now?
23. #ContainerDayFRParis Container Day 2017
Chrome Sandbox
Each tab gets its own pid namespace.
What’s a pid namespace?
Used to isolate the process ID number space.
24. #ContainerDayFRParis Container Day 2017
Chrome Sandbox
Uses unprivileged user namespaces and network namespaces.
27. #ContainerDayFRParis Container Day 2017
What is Seccomp?
SECure COMPuting with filters.
Allows developers to write BPF programs that
determine whether a given system call will be
allowed or not.
29. #ContainerDayFRParis Container Day 2017
What is Seccomp?
What is BPF?
Berkeley Packet Filter
In-kernel bytecode machine that is used for tracing,
virtual networks, seccomp… and more.
31. #ContainerDayFRParis Container Day 2017
What is a container?
Namespaces: control what a process can see.
● PID
● Mount
● Network
● UTS
● IPC
● User
● Cgroup
32. #ContainerDayFRParis Container Day 2017
What is a container?
Cgroups: control what a process can use.
● Memory
● CPU
● Blkio
● Cpuacct
● Cpuset
● Devices
● Net_prio
● Freezer
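In Kubernetes, the memory and CPU controllers listed above are what sit behind per-container resource settings. A minimal sketch, assuming the same `hello-nginx` Pod used elsewhere in these slides; the request and limit values are illustrative, not recommendations.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-nginx
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:            # what the scheduler reserves for the container
        cpu: 250m
        memory: 64Mi
      limits:              # ceilings the runtime enforces through cgroups
        cpu: 500m
        memory: 128Mi
```

A container that exceeds its memory limit is a candidate for being killed, while CPU beyond the limit is throttled.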
33. #ContainerDayFRParis Container Day 2017
AppArmor
A Linux Security Module that can control and audit various
process actions, such as file access (read, write, execute, etc.)
and system functions (mount, network TCP, etc.).
34. #ContainerDayFRParis Container Day 2017
AppArmor in Docker
Sane defaults
- Preventing writing to /proc/{num}, /proc/sys, /sys
- Preventing mount
35. #ContainerDayFRParis Container Day 2017
AppArmor in Kubernetes
Getting towards sane defaults
- Preventing writing to /proc/{num}, /proc/sys, /sys
- Preventing mount
apiVersion: v1
kind: Pod
metadata:
  name: hello-nginx
  annotations:
    container.apparmor.security.beta.kubernetes.io/nginx: runtime/default
spec:
  containers:
  - name: nginx
    image: nginx
    command: ["nginx", "-g", "daemon off;"]
36. #ContainerDayFRParis Container Day 2017
Seccomp
Syscall filters allow an application to define what
syscalls it allows or denies.
38. #ContainerDayFRParis Container Day 2017
Seccomp in Docker
Docker's default seccomp profile is a whitelist
which specifies the calls that are allowed.
It blocks a bunch of bad stuff… not limited to the
following...
39. #ContainerDayFRParis Container Day 2017
Seccomp in Docker
add_key, keyctl, request_key
Prevent containers from using the kernel keyring,
which is not namespaced.
40. #ContainerDayFRParis Container Day 2017
Seccomp in Docker
clone, unshare
Deny cloning/unsharing new namespaces.
Also gated by CAP_SYS_ADMIN for CLONE_* flags,
except CLONE_NEWUSER, which has a history of vulns.
41. #ContainerDayFRParis Container Day 2017
Seccomp in Kubernetes
Pushing towards sane defaults
- Whitelist
- Prevent cloning new unprivileged user namespaces (has a high rate of past CVEs)
- Prevent ~150 other syscalls which are uncommon or dangerous
apiVersion: v1
kind: Pod
metadata:
  name: hello-nginx
  annotations:
    container.seccomp.security.alpha.kubernetes.io/nginx: docker/default
spec:
  containers:
  - name: nginx
    image: nginx
    command: ["nginx", "-g", "daemon off;"]
42. #ContainerDayFRParis Container Day 2017
SELinux
Provides a mechanism for supporting access control
security policies, including mandatory access
controls.
Controls over file systems, directories, files, and open
file descriptors.
Controls individual labels and controls for kernel
objects and services.
43. #ContainerDayFRParis Container Day 2017
SELinux in Docker
Labeling systems like SELinux require that proper
labels are placed on volume content mounted into
a container.
Without a label, the security system might prevent
the processes running inside the container from
using the content.
Allows relabeling file objects on Docker volumes.
45. #ContainerDayFRParis Container Day 2017
SELinux in Kubernetes
Apply SELinux labels to volumes.
apiVersion: v1
kind: Pod
metadata:
  name: hello-nginx
spec:
  containers:
  - name: nginx
    image: nginx
    command: ["nginx", "-g", "daemon off;"]
    securityContext:
      capabilities:
        drop:
        - NET_RAW
      readOnlyRootFilesystem: true
    volumeMounts:
    ...
  securityContext:
    seLinuxOptions:
      level: "s0:c123,c456"
    runAsNonRoot: true
  volumes: ...
46. #ContainerDayFRParis Container Day 2017
No New Privileges
A Linux flag (no_new_privs) that’s carried over `fork`,
`clone`, and `execve` to prevent new privileges from
being added to a process.
47. #ContainerDayFRParis Container Day 2017
No New Privileges in Docker
Not applied by default unless set by the Docker
daemon.
48. #ContainerDayFRParis Container Day 2017
No New Privileges in Kubernetes
On by default for all containers
without breaking setuid binaries.
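For reference, Kubernetes releases after this talk (1.8 onward, to the best of my knowledge) expose the flag per container as `securityContext.allowPrivilegeEscalation`. A minimal sketch, assuming the same `hello-nginx` Pod used elsewhere in these slides:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-nginx
spec:
  containers:
  - name: nginx
    image: nginx
    securityContext:
      # false asks the runtime to set no_new_privs on the container process
      allowPrivilegeEscalation: false
```

With the flag set, setuid binaries inside the container still run but cannot gain privileges beyond those of the calling process.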
51. #ContainerDayFRParis Container Day 2017
Security Context in Kubernetes
RunAsNonRoot: Set containers to only run as a non-root user
ReadOnlyRootFilesystem: Set container filesystem as read-only
Capabilities: Set containers to run with specific capabilities
52. #ContainerDayFRParis Container Day 2017
Security Context in Kubernetes: RunAsNonRoot
Set containers to only run as a non-root user
apiVersion: v1
kind: Pod
metadata:
  name: hello-nginx
spec:
  containers:
  - name: nginx
    image: nginx
    command: ["nginx", "-g", "daemon off;"]
    securityContext:
      capabilities:
        drop:
        - NET_RAW
      readOnlyRootFilesystem: true
    volumeMounts:
    ...
  securityContext:
    seLinuxOptions:
      level: "s0:c123,c456"
    runAsNonRoot: true
  volumes: ...
53. #ContainerDayFRParis Container Day 2017
ReadOnlyRootFilesystem
Set container filesystem as read-only
54. #ContainerDayFRParis Container Day 2017
Capabilities
Set containers to run with specific
Capabilities
55. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes
A Pod Security Policy is a cluster-level resource
that controls the actions that a pod can perform
and what it has the ability to access.
The PodSecurityPolicy objects define a set of
conditions that a pod must run with in order to be
accepted into the system.
56. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: Privileged
Allows or denies running of privileged containers.
Default deny.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
57. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: defaultAddCapabilities
Default set of capabilities that will be added to a container.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  defaultAddCapabilities:
  - AUDIT_WRITE
  - KILL
  - NET_BIND_SERVICE
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
58. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: requiredDropCapabilities
Capabilities that will be dropped from a container.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  requiredDropCapabilities:
  - NET_RAW
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
59. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: allowedCapabilities
Capabilities a container can request to be added.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  allowedCapabilities:
  - NET_ADMIN
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
60. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: volumes
Controls the usage of volume types, defines which ones are allowed.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - 'hostPath'
  - 'gcePersistentDisk'
61. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: hostPorts
Controls the use of host ports, defines an allowed range.
Default empty.
List of HostPortRange, defined by min (inclusive) and max (inclusive),
which define the allowed host ports.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
62. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: hostPID
Allows or denies the use of host’s PID namespace.
Default deny.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  hostPID: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
63. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: hostIPC
Allows or denies the use of host’s IPC namespace.
Default deny.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  hostPID: false
  hostIPC: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
64. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: seLinux
MustRunAs: Requires seLinuxOptions to be configured if not using
pre-allocated values. Uses seLinuxOptions as the default.
Validates against seLinuxOptions.
RunAsAny: No default provided. Allows any seLinuxOptions to be specified.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
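The policy on this slide uses RunAsAny. For contrast, here is a sketch of the MustRunAs variant described above; the policy name `selinux-pinned` is hypothetical and the level is borrowed from the Pod example earlier in the deck.

```yaml
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: selinux-pinned       # hypothetical name
spec:
  privileged: false
  seLinux:
    rule: MustRunAs
    seLinuxOptions:          # used as the default and validated against
      level: "s0:c123,c456"
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - '*'
```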
65. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: runAsUser
MustRunAs: Requires a range to be configured. Uses the first value of the
range as the default. Validates against the range.
MustRunAsNonRoot: Requires that the pod be submitted with a non-zero
runAsUser or have the USER directive defined in the image.
RunAsAny: No default provided. Allows any runAsUser to be specified.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAsNonRoot
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
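The slide shows MustRunAsNonRoot. The range-based MustRunAs rule might look like the following sketch; the policy name `uid-range` and the 1000 to 2000 range are illustrative assumptions.

```yaml
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: uid-range            # hypothetical name
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAs
    ranges:                  # first value (1000) becomes the default UID
    - min: 1000
      max: 2000
  fsGroup:
    rule: RunAsAny
  volumes:
  - '*'
```

Under this sketch, a pod that requests a UID outside the range would be rejected at admission.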
66. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: supplementalGroups
MustRunAs: Requires at least one range to be specified. Uses the
minimum value of the first range as the default. Validates against all ranges.
RunAsAny: No default provided. Allows any supplementalGroups to be specified.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
67. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: fsGroup
MustRunAs: Requires at least one range to be specified. Uses the
minimum value of the first range as the default. Validates against the first
ID in the first range.
RunAsAny: No default provided. Allows any fsGroup ID to be specified.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
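Both group rules accept ranges in the MustRunAs form. A sketch with illustrative values: the policy name `gid-range` and the group ID ranges are assumptions, not taken from the talk.

```yaml
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: gid-range            # hypothetical name
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: MustRunAs
    ranges:                  # validated against all ranges listed here
    - min: 5000
      max: 5999
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: MustRunAs
    ranges:                  # minimum of the first range (6000) is the default
    - min: 6000
      max: 6000
  volumes:
  - '*'
```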
68. #ContainerDayFRParis Container Day 2017
Pod Security Policy in Kubernetes: readOnlyRootFilesystem
Requiring the use of a read-only root file system.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  readOnlyRootFilesystem: true
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  hostPorts:
  - min: 8000
    max: 8080
  volumes:
  - '*'
70. #ContainerDayFRParis Container Day 2017
Network Policies
Behavior without a network policy: everything can talk to everything.
[Diagram: my-cluster with nodes node-1, node-2, node-3; pods grouped into team1-namespace and team2-namespace]
71. #ContainerDayFRParis Container Day 2017
Network Policies
Network isolation with DefaultDeny
Explicitly define communication between pods as a whitelist
[Diagram: my-cluster with nodes node-1, node-2, node-3; pods grouped into team1-namespace and team2-namespace]
73. #ContainerDayFRParis Container Day 2017
Network Policy in Kubernetes
Setting DefaultDeny for a namespace.
kind: Namespace
apiVersion: v1
metadata:
  name: my-namespace
  annotations:
    net.beta.kubernetes.io/network-policy: |
      {
        "ingress": {
          "isolation": "DefaultDeny"
        }
      }
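The annotation above was the beta mechanism at the time of this talk. In later releases, which serve NetworkPolicy from `networking.k8s.io/v1`, a similar default-deny effect is usually written as a policy that selects every pod and allows no ingress. A sketch under that assumption; the policy name `default-deny` is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny         # hypothetical name
  namespace: my-namespace
spec:
  podSelector: {}            # empty selector matches every pod in the namespace
  policyTypes:
  - Ingress                  # no ingress rules follow, so all inbound traffic is denied
```

Either form only takes effect when the cluster's network plugin enforces network policy.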
74. #ContainerDayFRParis Container Day 2017
Network Policy in Kubernetes
Explicitly define communication between pods.
apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: my-network-policy
  namespace: my-namespace
spec:
  podSelector:
    matchLabels:
      role: db
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          project: myproject
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 6379