SlideShare una empresa de Scribd logo
1 de 16
Descargar para leer sin conexión
Moving the Oath
Grid to Docker
1
Eric Badger, Oath
● RHEL 7 adoption
● Security
● Isolation
2
Motivations
*Note: Running arbitrary, user-supplied Docker images on YARN
was not a motivation
Architecture
3
● Tasks run in Docker containers
● Daemons run on bare-metal RHEL 7
4
● Seccomp profile
● Linux capabilities
● Run with --no-new-privileges
● Read-only containers and limited mounts
● Enter container as user UID:GID
5
Security
6
Bind-mounts
Bind-mount RO/RW Reason
Hadoop release RO Hadoop jars and confs
Usercache/Filecache directories RO Distributed cache
NSCD socket RW Use host cache for user lookups
HDFS short-circuit socket RW HDFS short-circuit reads/writes
Container log directories RW Write logs where nodemanager can see for
aggregation
Application local directories RW Application-specific temporary space, /tmp, /var/tmp
Grid Migration
7
● No cluster downtime
● No changes to current jobs
● No more than 5% performance degradation
8
Requirements
● Docker or Linux runtime chosen per node
- Based on RHEL version
- See YARN-6456
● During migration jobs will run as mix of bare-metal
processes and Docker containers
● User-transparent
9
Node Specific Runtime
● Preload images on nodes
- Avoid thundering herd
- Avoid task timeouts
● Force jobs to use a Docker image
● Allow users to pick a different image from a small set of
allowed images
10
Image Management
Challenges
11
● System call overhead due to seccomp
● Docker losing track of containers in high-memory situations
● System PID reuse issue causing Docker to restart
● Debugging tasks inside of containers
● Tasks can’t talk to each other through /tmp anymore
12
Challenges
Future
Improvements
13
● User-specific layers
- Harder than it seems
● Podman
● Kata containers
14
Future Improvements
Phase 1: YARN-3611 - Support Docker Containers in
LinuxContainerExecutor
- 92 subtasks, all resolved
Phase 2: YARN-8274 - YARN Container Phase 2
- 26 subtasks, 7 resolved
15
Docker in Apache Hadoop
Many thanks to those who contributed to this work, including the
following:
Shane Kumpf, Eric Yang, Miklos Szegedi, Billie Rinaldi, Chandni
Singh, Craig Condit, Jason Lowe, Jim Brennan, Nathan Roberts,
Dheeraj Kapur
16
Acknowledgements

Más contenido relacionado

La actualidad más candente

OpenVZ, Virtuozzo and Docker
OpenVZ, Virtuozzo and DockerOpenVZ, Virtuozzo and Docker
OpenVZ, Virtuozzo and DockerKirill Kolyshkin
 
Docker architecture (version modified)
Docker architecture (version modified)Docker architecture (version modified)
Docker architecture (version modified)Amir Arsalan
 
Revolutionizing WSO2 PaaS with Kubernetes & App Factory
Revolutionizing WSO2 PaaS with Kubernetes & App FactoryRevolutionizing WSO2 PaaS with Kubernetes & App Factory
Revolutionizing WSO2 PaaS with Kubernetes & App FactoryImesh Gunaratne
 
Docker Architecture (v1.3)
Docker Architecture (v1.3)Docker Architecture (v1.3)
Docker Architecture (v1.3)rajdeep
 
Docker Global Hack Day #3
Docker Global Hack Day #3 Docker Global Hack Day #3
Docker Global Hack Day #3 Docker, Inc.
 
Containers orchestrators: Docker vs. Kubernetes
Containers orchestrators: Docker vs. KubernetesContainers orchestrators: Docker vs. Kubernetes
Containers orchestrators: Docker vs. KubernetesDmitry Lazarenko
 
LXD Container Hypervisor
LXD Container HypervisorLXD Container Hypervisor
LXD Container HypervisorDanial Behzadi
 
Linux containers – next gen virtualization for cloud (atl summit) ar4 3 - copy
Linux containers – next gen virtualization for cloud (atl summit) ar4 3 - copyLinux containers – next gen virtualization for cloud (atl summit) ar4 3 - copy
Linux containers – next gen virtualization for cloud (atl summit) ar4 3 - copyBoden Russell
 
Docker Swarm & Machine
Docker Swarm & MachineDocker Swarm & Machine
Docker Swarm & MachineEueung Mulyana
 
Orchestrating Docker containers at scale
Orchestrating Docker containers at scaleOrchestrating Docker containers at scale
Orchestrating Docker containers at scaleMaciej Lasyk
 
LXC – NextGen Virtualization for Cloud benefit realization (cloudexpo)
LXC – NextGen Virtualization for Cloud benefit realization (cloudexpo)LXC – NextGen Virtualization for Cloud benefit realization (cloudexpo)
LXC – NextGen Virtualization for Cloud benefit realization (cloudexpo)Boden Russell
 
Linux High Availability Overview - openSUSE.Asia Summit 2015
Linux High Availability Overview - openSUSE.Asia Summit 2015 Linux High Availability Overview - openSUSE.Asia Summit 2015
Linux High Availability Overview - openSUSE.Asia Summit 2015 Roger Zhou 周志强
 
Performance comparison between Linux Containers and Virtual Machines
Performance comparison between Linux Containers and Virtual MachinesPerformance comparison between Linux Containers and Virtual Machines
Performance comparison between Linux Containers and Virtual MachinesSoheila Dehghanzadeh
 
Shifter: Containers in HPC Environments
Shifter: Containers in HPC EnvironmentsShifter: Containers in HPC Environments
Shifter: Containers in HPC Environmentsinside-BigData.com
 
Lxd the proper way of runing containers
Lxd   the proper way of runing containersLxd   the proper way of runing containers
Lxd the proper way of runing containersMarian Marinov
 
Docker 1.11 @ Docker SF Meetup
Docker 1.11 @ Docker SF MeetupDocker 1.11 @ Docker SF Meetup
Docker 1.11 @ Docker SF MeetupDocker, Inc.
 

La actualidad más candente (20)

OpenVZ, Virtuozzo and Docker
OpenVZ, Virtuozzo and DockerOpenVZ, Virtuozzo and Docker
OpenVZ, Virtuozzo and Docker
 
Docker architecture (version modified)
Docker architecture (version modified)Docker architecture (version modified)
Docker architecture (version modified)
 
Revolutionizing WSO2 PaaS with Kubernetes & App Factory
Revolutionizing WSO2 PaaS with Kubernetes & App FactoryRevolutionizing WSO2 PaaS with Kubernetes & App Factory
Revolutionizing WSO2 PaaS with Kubernetes & App Factory
 
Docker Architecture (v1.3)
Docker Architecture (v1.3)Docker Architecture (v1.3)
Docker Architecture (v1.3)
 
Docker quick start
Docker quick startDocker quick start
Docker quick start
 
Docker Global Hack Day #3
Docker Global Hack Day #3 Docker Global Hack Day #3
Docker Global Hack Day #3
 
Containers orchestrators: Docker vs. Kubernetes
Containers orchestrators: Docker vs. KubernetesContainers orchestrators: Docker vs. Kubernetes
Containers orchestrators: Docker vs. Kubernetes
 
Docker internals
Docker internalsDocker internals
Docker internals
 
LXD Container Hypervisor
LXD Container HypervisorLXD Container Hypervisor
LXD Container Hypervisor
 
Linux containers – next gen virtualization for cloud (atl summit) ar4 3 - copy
Linux containers – next gen virtualization for cloud (atl summit) ar4 3 - copyLinux containers – next gen virtualization for cloud (atl summit) ar4 3 - copy
Linux containers – next gen virtualization for cloud (atl summit) ar4 3 - copy
 
Docker Swarm & Machine
Docker Swarm & MachineDocker Swarm & Machine
Docker Swarm & Machine
 
Orchestrating Docker containers at scale
Orchestrating Docker containers at scaleOrchestrating Docker containers at scale
Orchestrating Docker containers at scale
 
LXC – NextGen Virtualization for Cloud benefit realization (cloudexpo)
LXC – NextGen Virtualization for Cloud benefit realization (cloudexpo)LXC – NextGen Virtualization for Cloud benefit realization (cloudexpo)
LXC – NextGen Virtualization for Cloud benefit realization (cloudexpo)
 
Linux High Availability Overview - openSUSE.Asia Summit 2015
Linux High Availability Overview - openSUSE.Asia Summit 2015 Linux High Availability Overview - openSUSE.Asia Summit 2015
Linux High Availability Overview - openSUSE.Asia Summit 2015
 
Performance comparison between Linux Containers and Virtual Machines
Performance comparison between Linux Containers and Virtual MachinesPerformance comparison between Linux Containers and Virtual Machines
Performance comparison between Linux Containers and Virtual Machines
 
Shifter: Containers in HPC Environments
Shifter: Containers in HPC EnvironmentsShifter: Containers in HPC Environments
Shifter: Containers in HPC Environments
 
Docker Architecture
Docker ArchitectureDocker Architecture
Docker Architecture
 
Containers in the Cloud
Containers in the CloudContainers in the Cloud
Containers in the Cloud
 
Lxd the proper way of runing containers
Lxd   the proper way of runing containersLxd   the proper way of runing containers
Lxd the proper way of runing containers
 
Docker 1.11 @ Docker SF Meetup
Docker 1.11 @ Docker SF MeetupDocker 1.11 @ Docker SF Meetup
Docker 1.11 @ Docker SF Meetup
 

Similar a Moving the Oath Grid to Docker, Eric Badger, Oath

The internals and the latest trends of container runtimes
The internals and the latest trends of container runtimesThe internals and the latest trends of container runtimes
The internals and the latest trends of container runtimesAkihiro Suda
 
Docker primer and tips
Docker primer and tipsDocker primer and tips
Docker primer and tipsSamuel Chow
 
Super powered Drupal development with docker
Super powered Drupal development with dockerSuper powered Drupal development with docker
Super powered Drupal development with dockerMaciej Lukianski
 
Introducing docker
Introducing dockerIntroducing docker
Introducing dockerBill Wang
 
Containers: from development to production at DevNation 2015
Containers: from development to production at DevNation 2015Containers: from development to production at DevNation 2015
Containers: from development to production at DevNation 2015Jérôme Petazzoni
 
Docker up and Running For Web Developers
Docker up and Running For Web DevelopersDocker up and Running For Web Developers
Docker up and Running For Web DevelopersBADR
 
Docker Up and Running for Web Developers
Docker Up and Running for Web DevelopersDocker Up and Running for Web Developers
Docker Up and Running for Web DevelopersAmr Fawzy
 
Best Practices for Developing & Deploying Java Applications with Docker
Best Practices for Developing & Deploying Java Applications with DockerBest Practices for Developing & Deploying Java Applications with Docker
Best Practices for Developing & Deploying Java Applications with DockerEric Smalling
 
Using Docker to boost your development experience with Drupal
Using Docker to boost your development experience with DrupalUsing Docker to boost your development experience with Drupal
Using Docker to boost your development experience with Drupaldockerizedrupal
 
Powercoders · Docker · Fall 2021.pptx
Powercoders · Docker · Fall 2021.pptxPowercoders · Docker · Fall 2021.pptx
Powercoders · Docker · Fall 2021.pptxIgnacioTamayo2
 
A Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and ContainersA Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and ContainersDocker, Inc.
 
State of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDataState of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDatainside-BigData.com
 
Docker from A to Z, including Swarm and OCCS
Docker from A to Z, including Swarm and OCCSDocker from A to Z, including Swarm and OCCS
Docker from A to Z, including Swarm and OCCSFrank Munz
 
Java and Container - Make it Awesome !
Java and Container - Make it Awesome !Java and Container - Make it Awesome !
Java and Container - Make it Awesome !Dinakar Guniguntala
 
OSDC 2013 | Distributed Storage with GlusterFS by Dr. Udo Seidel
OSDC 2013 | Distributed Storage with GlusterFS by Dr. Udo SeidelOSDC 2013 | Distributed Storage with GlusterFS by Dr. Udo Seidel
OSDC 2013 | Distributed Storage with GlusterFS by Dr. Udo SeidelNETWAYS
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introductionkanedafromparis
 

Similar a Moving the Oath Grid to Docker, Eric Badger, Oath (20)

The internals and the latest trends of container runtimes
The internals and the latest trends of container runtimesThe internals and the latest trends of container runtimes
The internals and the latest trends of container runtimes
 
Docker+java
Docker+javaDocker+java
Docker+java
 
DockerCC.pdf
DockerCC.pdfDockerCC.pdf
DockerCC.pdf
 
Docker primer and tips
Docker primer and tipsDocker primer and tips
Docker primer and tips
 
Super powered Drupal development with docker
Super powered Drupal development with dockerSuper powered Drupal development with docker
Super powered Drupal development with docker
 
Introducing docker
Introducing dockerIntroducing docker
Introducing docker
 
Containers: from development to production at DevNation 2015
Containers: from development to production at DevNation 2015Containers: from development to production at DevNation 2015
Containers: from development to production at DevNation 2015
 
Docker up and Running For Web Developers
Docker up and Running For Web DevelopersDocker up and Running For Web Developers
Docker up and Running For Web Developers
 
Docker Up and Running for Web Developers
Docker Up and Running for Web DevelopersDocker Up and Running for Web Developers
Docker Up and Running for Web Developers
 
Best Practices for Developing & Deploying Java Applications with Docker
Best Practices for Developing & Deploying Java Applications with DockerBest Practices for Developing & Deploying Java Applications with Docker
Best Practices for Developing & Deploying Java Applications with Docker
 
Using Docker to boost your development experience with Drupal
Using Docker to boost your development experience with DrupalUsing Docker to boost your development experience with Drupal
Using Docker to boost your development experience with Drupal
 
Powercoders · Docker · Fall 2021.pptx
Powercoders · Docker · Fall 2021.pptxPowercoders · Docker · Fall 2021.pptx
Powercoders · Docker · Fall 2021.pptx
 
JOSA TechTalk: Introduction to docker
JOSA TechTalk: Introduction to dockerJOSA TechTalk: Introduction to docker
JOSA TechTalk: Introduction to docker
 
A Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and ContainersA Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and Containers
 
Janus & docker: friends or foe
Janus & docker: friends or foe Janus & docker: friends or foe
Janus & docker: friends or foe
 
State of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDataState of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigData
 
Docker from A to Z, including Swarm and OCCS
Docker from A to Z, including Swarm and OCCSDocker from A to Z, including Swarm and OCCS
Docker from A to Z, including Swarm and OCCS
 
Java and Container - Make it Awesome !
Java and Container - Make it Awesome !Java and Container - Make it Awesome !
Java and Container - Make it Awesome !
 
OSDC 2013 | Distributed Storage with GlusterFS by Dr. Udo Seidel
OSDC 2013 | Distributed Storage with GlusterFS by Dr. Udo SeidelOSDC 2013 | Distributed Storage with GlusterFS by Dr. Udo Seidel
OSDC 2013 | Distributed Storage with GlusterFS by Dr. Udo Seidel
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introduction
 

Más de Yahoo Developer Network

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaYahoo Developer Network
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Yahoo Developer Network
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanYahoo Developer Network
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Yahoo Developer Network
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathYahoo Developer Network
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuYahoo Developer Network
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolYahoo Developer Network
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Yahoo Developer Network
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Yahoo Developer Network
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathYahoo Developer Network
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Yahoo Developer Network
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsYahoo Developer Network
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Yahoo Developer Network
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondYahoo Developer Network
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Yahoo Developer Network
 
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...Yahoo Developer Network
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexYahoo Developer Network
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsYahoo Developer Network
 
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...Yahoo Developer Network
 

Más de Yahoo Developer Network (20)

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
 
CICD at Oath using Screwdriver
CICD at Oath using ScrewdriverCICD at Oath using Screwdriver
CICD at Oath using Screwdriver
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI Applications
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step Beyond
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
 
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
 
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
 

Último

SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 

Último (20)

SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 

Moving the Oath Grid to Docker, Eric Badger, Oath

  • 1. Moving the Oath Grid to Docker 1 Eric Badger, Oath
  • 2. ● RHEL 7 adoption ● Security ● Isolation 2 Motivations *Note: Running arbitrary, user-supplied Docker images on YARN was not a motivation
  • 4. ● Tasks run in Docker containers ● Daemons run on bare-metal RHEL 7 4
  • 5. ● Seccomp profile ● Linux capabilities ● Run with --no-new-privileges ● Read-only containers and limited mounts ● Enter container as user UID:GID 5 Security
  • 6. 6 Bind-mounts Bind-mount RO/RW Reason Hadoop release RO Hadoop jars and confs Usercache/Filecache directories RO Distributed cache NSCD socket RW Use host cache for user lookups HDFS short-circuit socket RW HDFS short-circuit reads/writes Container log directories RW Write logs where nodemanager can see for aggregation Application local directories RW Application-specific temporary space, /tmp, /var/tmp
  • 8. ● No cluster downtime ● No changes to current jobs ● No more than 5% performance degradation 8 Requirements
  • 9. ● Docker or Linux runtime chosen per node - Based on RHEL version - See YARN-6456 ● During migration jobs will run as mix of bare-metal processes and Docker containers ● User-transparent 9 Node Specific Runtime
  • 10. ● Preload images on nodes - Avoid thundering herd - Avoid task timeouts ● Force jobs to use a Docker image ● Allow users to pick a different image from a small set of allowed images 10 Image Management
  • 12. ● System call overhead due to seccomp ● Docker losing track of containers in high-memory situations ● System PID reuse issue causing Docker to restart ● Debugging tasks inside of containers ● Tasks can’t talk to each other through /tmp anymore 12 Challenges
  • 14. ● User-specific layers - Harder than it seems ● Podman ● Kata containers 14 Future Improvements
  • 15. Phase 1: YARN-3611 - Support Docker Containers in LinuxContainerExecutor - 92 subtasks, all resolved Phase 2: YARN-8274 - YARN Container Phase 2 - 26 subtasks, 7 resolved 15 Docker in Apache Hadoop
  • 16. Many thanks to those who contributed to this work, including the following: Shane Kumpf, Eric Yang, Miklos Szegedi, Billie Rinaldi, Chandni Singh, Craig Condit, Jason Lowe, Jim Brennan, Nathan Roberts, Dheeraj Kapur 16 Acknowledgements