4. ● Tasks run in Docker containers
● Daemons run on bare-metal RHEL 7
5. Security
● Seccomp profile
● Linux capabilities
● Run with --no-new-privileges
● Read-only containers and limited mounts
● Enter container as user UID:GID
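As a rough illustration of how the hardening options above map onto a container launch, the sketch below assembles the corresponding `docker run` arguments. It is not the command YARN actually generates: the function name, the seccomp profile path, and the capability list are hypothetical, and the sketch uses the `docker run --security-opt no-new-privileges` form of the no-new-privileges setting.

```python
def docker_security_args(uid, gid, seccomp_profile, capabilities):
    """Build the security-related arguments for a `docker run` command.

    All inputs are illustrative; a real deployment supplies its own
    profile path and capability allow-list.
    """
    args = [
        "--security-opt", "no-new-privileges",          # block privilege escalation
        "--security-opt", f"seccomp={seccomp_profile}",  # restrict system calls
        "--cap-drop", "ALL",                             # start with no capabilities
        "--read-only",                                   # read-only root filesystem
        "--user", f"{uid}:{gid}",                        # enter as the job's UID:GID
    ]
    # Add back only the capabilities that are explicitly allowed.
    for cap in capabilities:
        args += ["--cap-add", cap]
    return args


# Hypothetical values, for illustration only.
args = docker_security_args(1000, 1000, "/etc/docker/seccomp.json", ["CHOWN", "SETUID"])
print(" ".join(["docker", "run"] + args))
```

Dropping all capabilities first and adding back a short list keeps the allow-list explicit, which is usually easier to audit than removing capabilities one by one.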
6. Bind-mounts
| Bind-mount | RO/RW | Reason |
|---|---|---|
| Hadoop release | RO | Hadoop jars and configuration files |
| Usercache/Filecache directories | RO | Distributed cache |
| NSCD socket | RW | Use host cache for user lookups |
| HDFS short-circuit socket | RW | HDFS short-circuit reads/writes |
| Container log directories | RW | Write logs where the NodeManager can see them for aggregation |
| Application local directories | RW | Application-specific temporary space, /tmp, /var/tmp |
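The table above can be read as a list of `(path, mode)` pairs that become `-v` arguments to `docker run`. The sketch below shows that translation under stated assumptions: every path is a hypothetical placeholder (the slides do not give real paths), and each directory is mounted at the same location inside the container.

```python
# Hypothetical host paths standing in for the rows of the bind-mount table.
MOUNTS = [
    ("/opt/hadoop", "ro"),                   # Hadoop release
    ("/grid/yarn/usercache", "ro"),          # distributed cache
    ("/var/run/nscd", "rw"),                 # NSCD socket
    ("/grid/yarn/logs/container_01", "rw"),  # container log directory
]


def mount_args(mounts):
    """Translate (path, mode) pairs into `docker run -v` arguments."""
    args = []
    for path, mode in mounts:
        if mode not in ("ro", "rw"):
            raise ValueError(f"unknown mount mode for {path}: {mode}")
        # Mount the host path at the same path inside the container.
        args += ["-v", f"{path}:{path}:{mode}"]
    return args


mount_flags = mount_args(MOUNTS)
print(" ".join(mount_flags))
```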
8. Requirements
● No cluster downtime
● No changes to current jobs
● No more than 5% performance degradation
9. Node-Specific Runtime
● Docker or Linux runtime chosen per node
- Based on RHEL version
- See YARN-6456
● During migration, jobs will run as a mix of bare-metal processes and Docker containers
● User-transparent
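The per-node choice described above could look something like the sketch below, which picks a runtime from the contents of a node's `/etc/redhat-release` file. The mapping is an assumption, not something the slides state: nodes on RHEL 7 or later get the Docker runtime, and everything else falls back to the plain Linux runtime. The function name is hypothetical; `docker` and `default` are the runtime type names YARN uses.

```python
import re


def choose_runtime(release_text):
    """Pick a container runtime from the text of /etc/redhat-release.

    Assumption for illustration: RHEL 7+ nodes run tasks in Docker,
    older or unrecognized nodes keep running bare-metal processes.
    """
    match = re.search(r"release (\d+)", release_text)
    if match is None:
        return "default"  # unknown OS: stay on the Linux runtime
    major = int(match.group(1))
    return "docker" if major >= 7 else "default"


print(choose_runtime("Red Hat Enterprise Linux Server release 7.6 (Maipo)"))
print(choose_runtime("Red Hat Enterprise Linux Server release 6.10 (Santiago)"))
```

Because the decision is made on the node rather than in the job, users submit jobs unchanged while the cluster is migrated node by node.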
10. Image Management
● Preload images on nodes
- Avoid thundering herd
- Avoid task timeouts
● Force jobs to use a Docker image
● Allow users to pick a different image from a small set of allowed images
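The image policy above (a forced default plus a small allow-list) can be sketched as a simple lookup. The image names and the function are hypothetical; in YARN the requested image is normally passed through the `YARN_CONTAINER_RUNTIME_DOCKER_IMAGE` environment variable, and how the default is enforced is not described in the slides.

```python
# Hypothetical image names, for illustration only.
DEFAULT_IMAGE = "registry.example.com/hadoop/rhel6:current"
ALLOWED_IMAGES = {
    DEFAULT_IMAGE,
    "registry.example.com/hadoop/rhel7:current",
}


def resolve_image(requested, allowed=ALLOWED_IMAGES, default=DEFAULT_IMAGE):
    """Return the image a job should run in.

    Jobs that request nothing get the default image; jobs that request
    an image outside the allow-list are rejected.
    """
    if not requested:
        return default
    if requested in allowed:
        return requested
    raise ValueError(f"image not allowed: {requested}")


print(resolve_image(None))
print(resolve_image("registry.example.com/hadoop/rhel7:current"))
```

Keeping the allow-list small also bounds how many images have to be preloaded on every node.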
12. Challenges
● System call overhead due to seccomp
● Docker losing track of containers in high-memory situations
● System PID reuse issue causing Docker to restart
● Debugging tasks inside of containers
● Tasks can’t talk to each other through /tmp anymore
15. Docker in Apache Hadoop
Phase 1: YARN-3611 - Support Docker Containers in LinuxContainerExecutor
- 92 subtasks, all resolved
Phase 2: YARN-8274 - YARN Container Phase 2
- 26 subtasks, 7 resolved
16. Acknowledgements
Many thanks to those who contributed to this work, including the following:
Shane Kumpf, Eric Yang, Miklos Szegedi, Billie Rinaldi, Chandni Singh, Craig Condit, Jason Lowe, Jim Brennan, Nathan Roberts, Dheeraj Kapur