Running Containers at Scale at Netflix
@aspyker @corindwyer
The Titus Team
● Develop
● Operate
● Support
Netflix’s Container Management Platform
Titus
Scheduling
● Service & batch jobs
● Resource management
Container Execution
● Docker/AWS Integration
● Netflix Infra Support
Service
Job and Fleet Management
Resource Management & Optimization
Container Execution
Integration
Batch
● 1000+ Applications
● Netflix API, Node.js backend UI scripts
● Machine Learning (GPUs) for personalization
● Encoding and Content use cases
● Netflix Studio use cases
● CDN tracking and planning
● Massively parallel CI system
● Data Pipeline routing and SPaaS
● Big Data platform use cases
Growing set of container use cases
Batch: Q4 15
Basic Service: 1Q 16
Production Service: 4Q 16
Customer Facing Service: 2Q 17
shadow
High Level Titus Architecture
Cassandra
Titus Control Plane
● API
● Scheduling
● Job Lifecycle Control
EC2 Autoscaling
Fenzo
container
container
container
docker
Titus Agents
Mesos agent
Docker
Docker Registry
container
container
User Containers
AWS Virtual Machines
Mesos
Titus System Services
Batch/Workflow Systems
Service
CI/CD
Q1 2018 Container Usage
Common
● Jobs launched: 176K jobs / day
● Different applications: 1K+ different images
● Regional isolated Titus stacks: 7
Services
● Single app cluster size: 5K (real), 12K containers (benchmark)
● Agents managed: 16K VMs
Batch
● Containers launched: 430K / day
● Agents autoscaled: 350K VMs / month
Leveraging existing Netflix and AWS Infrastructure
Single consistent cloud environment between VMs and containers
VM VM
EC2
AWS AutoScaler
VMs
Service App
Cloud Platform
(metrics, IPC, health)
VPC
VM VM
Atlas
Titus Job Control
Containers
Service App
Cloud Platform
(metrics, IPC, health)
Eureka Edda
VM VM
Containers
Batch App
Cloud Platform
(metrics, IPC, health)
Most Native AWS Container Platform
IP per container
● VPC IP, ENI and security group
● Optimized to share ENIs
● ENI pre-attaching, opportunistic batching of IPs (bursty deploys)
IAM Roles and Metadata Endpoint per container
● Container view of 169.254.169.254
Cryptographic identity per container
● Using Amazon instance identity document
Service job container autoscaling
● Using native AWS CloudWatch and Auto Scaling policies and engine
Application Load Balancing (ALB)
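The per-container view of 169.254.169.254 can be pictured as a small router that answers each metadata request with the identity of the calling container rather than that of the host. This is a minimal sketch, not the Titus implementation: it assumes the caller is identified by source IP, and `ContainerIdentity` and `MetadataRouter` are hypothetical names. The two request paths are standard EC2 instance metadata paths.

```python
from dataclasses import dataclass
from typing import Dict, Optional

ROLE_PATH = "/latest/meta-data/iam/security-credentials/"
IP_PATH = "/latest/meta-data/local-ipv4"


@dataclass(frozen=True)
class ContainerIdentity:
    """Hypothetical record of what one container should see as 'itself'."""
    task_id: str
    iam_role: str
    ip: str


class MetadataRouter:
    """Answers metadata requests per container (assumption: keyed by source IP)."""

    def __init__(self) -> None:
        self._by_ip: Dict[str, ContainerIdentity] = {}

    def register(self, identity: ContainerIdentity) -> None:
        self._by_ip[identity.ip] = identity

    def unregister(self, ip: str) -> None:
        self._by_ip.pop(ip, None)

    def handle(self, source_ip: str, path: str) -> Optional[str]:
        identity = self._by_ip.get(source_ip)
        if identity is None:
            return None  # unknown caller: never fall back to the host identity
        if path == ROLE_PATH:
            return identity.iam_role  # the role name this container may assume
        if path == IP_PATH:
            return identity.ip
        return None
```

Returning nothing for unknown callers is the key design point in this sketch: a container should never be able to read the host's role by accident.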
Advanced Scheduling and
Control Plane Technologies
Scheduling / Placement
Considering the realities of …
● Docker, Linux, Image Pulling, etc.
● Complex resources (ENIs)
● Amazon rate limiting
● Filtering (constraints) and ranking (fitness)
● Different profiles for service | batch, critical | operational, etc.
Reliability
Provisioning
Time
Cost
Trade
offs
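Filtering (constraints) followed by ranking (fitness) can be sketched as a two-step placement function. This is an illustrative sketch only, not Fenzo's API: the `Agent` and `Task` fields, the 0.7/0.3 weights, and the image-cache bonus are all assumptions chosen to show the shape of the idea.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Agent:
    name: str
    free_cpus: int
    free_memory_mb: int
    free_enis: int
    cached_images: Set[str] = field(default_factory=set)


@dataclass
class Task:
    image: str
    cpus: int
    memory_mb: int
    needs_eni: bool = True


def fits(task: Task, agent: Agent) -> bool:
    """Hard constraint: the agent must have enough of every resource."""
    return (
        agent.free_cpus >= task.cpus
        and agent.free_memory_mb >= task.memory_mb
        and (not task.needs_eni or agent.free_enis > 0)
    )


def fitness(task: Task, agent: Agent) -> float:
    """Soft preference in [0, 1]: favor tight CPU packing and cached images."""
    packing = task.cpus / agent.free_cpus if agent.free_cpus else 0.0
    image_bonus = 1.0 if task.image in agent.cached_images else 0.0
    return 0.7 * packing + 0.3 * image_bonus  # weights are arbitrary


def place(task: Task, agents: List[Agent]) -> Optional[Agent]:
    """Filter out agents that cannot run the task, then pick the fittest."""
    candidates = [a for a in agents if fits(task, a)]
    if not candidates:
        return None
    return max(candidates, key=lambda a: fitness(task, a))
```

Different profiles (service vs. batch, critical vs. operational) would then amount to swapping in a different `fitness` function while keeping the same hard constraints.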
Capacity Management
User configures “capacity groups” based on workload type
Critical (RIs)
● Preallocated instances in order to achieve low provisioning time
● Buffer to support temporary extra capacity needs for deployments
Flex (On-Demand)
● Autoscaled instances based on demand
Opportunistic (Spot) - Coming
● Utilize extra instances with the ability to preempt or evict the workload
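The difference between the Critical and Flex tiers can be sketched as two sizing rules. This is a sketch under stated assumptions, not the Titus capacity model: `CapacityGroup`, its fields, and `desired_instances` are hypothetical names. Critical is assumed to hold a fixed preallocated size plus a deployment buffer, and Flex is assumed to follow demand within bounds.

```python
import math
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    CRITICAL = "critical"  # preallocated (reserved instances)
    FLEX = "flex"          # autoscaled (on-demand)


@dataclass
class CapacityGroup:
    name: str
    tier: Tier
    guaranteed_instances: int  # baseline size
    buffer_instances: int = 0  # deployment headroom (critical tier)
    max_instances: int = 0     # autoscaling ceiling (flex tier)


def desired_instances(group: CapacityGroup, demand_cpus: int,
                      cpus_per_instance: int) -> int:
    """Return how many instances the group should hold for the given demand."""
    if group.tier is Tier.CRITICAL:
        # Always provisioned, so tasks never wait for instances to boot.
        return group.guaranteed_instances + group.buffer_instances
    needed = math.ceil(demand_cpus / cpus_per_instance)
    return min(max(needed, group.guaranteed_instances), group.max_instances)
```

The trade-off is visible in the two branches: the critical tier pays for idle capacity to get low provisioning time, while the flex tier pays only for demand but has to wait for autoscaling.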
Centralized Agent Management
Agent Management
Other subsystems
Health checks
Cluster lifecycle
Other signals
Unified component for tracking agent information.
Powers other systems such as task migration,
canaries, and agent remediation
Cluster B
Agent state =
schedulable
For example: Task Migration
Cluster A
Agent state =
non-schedulable,
drain tasks
Agent Management
Task Migration
Cluster state
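The task migration example (Cluster A drained, Cluster B schedulable) can be sketched as a state change followed by a move of each task. This is a simplified illustration, not the Titus or Spinnaker logic: the `Agent` record, `drain_cluster`, and the least-loaded target choice are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Agent:
    name: str
    cluster: str
    schedulable: bool = True
    tasks: List[str] = field(default_factory=list)


def drain_cluster(agents: List[Agent], old_cluster: str) -> Dict[str, str]:
    """Mark old_cluster agents non-schedulable and move their tasks elsewhere.

    Returns a mapping of task name to the agent it was moved to.
    """
    for agent in agents:
        if agent.cluster == old_cluster:
            agent.schedulable = False  # no new placements land here

    targets = [a for a in agents if a.schedulable]
    if not targets:
        raise RuntimeError("no schedulable agents to migrate to")

    moves: Dict[str, str] = {}
    for agent in agents:
        if agent.cluster != old_cluster:
            continue
        while agent.tasks:
            task = agent.tasks.pop()
            # Assumption: pick the least-loaded schedulable agent.
            dest = min(targets, key=lambda t: len(t.tasks))
            dest.tasks.append(task)
            moves[task] = dest.name
    return moves
```

A real migration would start the replacement before stopping the original and would pace the moves; the sketch only shows how a single agent-state flag can drive both scheduling and draining.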
Integrations
Titus
External Resources
Operational Visibility
Interfaces
Load Balancers
Autoscaling
Spinnaker (Task Migration)
Telemetry - Atlas
Event Storage - Elasticsearch
REST / gRPC
Streaming updates
Advanced Container
Runtime Technologies
Multi-tenant networking is hard
Decided early on we wanted full IP stacks per container
But what about?
● Security group support
● IAM role support
● Network bandwidth isolation
● Leverage VPC
Virtual Machine Host
Titus Networking
sg=A,B
IP 2
sg=B,C
IP 3
Metadata
service
IPVlan, BPF, IFBs to route app traffic
Container 1 Container 2
sg=A,B
IP 4
Container 3
eth0
sg=Titus
control plane
eth1
sg=A,B
eth2
sg=B,C
eth-md eth-md
Titus executor
eth0 eth0 eth0
IP 2
IP 4
IP 3
IP 1
Metadata
service
eth-md
Metadata
service
169.254.169.254
Next challenge: Speed limits of EC2 Networking
Largest EC2 challenge: speed of networking reconfiguration
Changes in how we work with EC2 APIs
● Work with Amazon to redefine networking related API rate limits, buckets
● Pre-attach all networking interfaces
● IPs are asked for in bulk opportunistically
Also, coordination with scheduler
● Prefer instances with containers already in the same security group
For large scale failovers
● Before … hours, after ... minutes
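Asking for IPs in bulk can be sketched as a local pool that refills with one batched call instead of one call per container. This is a minimal sketch under assumptions: `IpPool` is a hypothetical name, and `allocate_batch` is a stand-in callable for whatever rate-limited cloud API call actually assigns the addresses.

```python
from collections import deque
from typing import Callable, Deque, List


class IpPool:
    """Hands out IPs from a local pool, refilling in bulk when it runs dry."""

    def __init__(self, allocate_batch: Callable[[int], List[str]],
                 batch_size: int = 8) -> None:
        self._allocate_batch = allocate_batch  # hypothetical cloud API stand-in
        self._batch_size = batch_size
        self._free: Deque[str] = deque()
        self.api_calls = 0  # how many rate-limited calls were made

    def acquire(self) -> str:
        if not self._free:
            # One rate-limited call fetches several IPs for upcoming launches.
            self._free.extend(self._allocate_batch(self._batch_size))
            self.api_calls += 1
        if not self._free:
            raise RuntimeError("allocator returned no addresses")
        return self._free.popleft()

    def release(self, ip: str) -> None:
        # Keep the address locally so the next container can reuse it.
        self._free.append(ip)
```

With a batch size of 8, a burst of 100 container launches on one agent costs about 13 API calls rather than 100, which is the effect that matters when the API is rate limited.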
● Detection - health checks
○ Linux subsystems (systemd, filesystems)
○ Docker aspects (runtime health, registry pulls)
○ Titus processes (networking, GPU, security drivers)
○ Mesos aspects (agent, executor)
● Remediation
○ Local reconciliation
○ Docker image cleanup
Overcoming failures on each agent
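The detection and remediation loop on each agent can be sketched as named health checks paired with optional fixes. This is an illustration only: `run_checks`, `remediate`, and the check names are hypothetical, and real checks would probe systemd, Docker, Mesos and the Titus drivers rather than in-memory flags.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[], bool]
Fix = Callable[[], None]


def run_checks(checks: Dict[str, Check]) -> List[str]:
    """Return the names of failing checks; a check that raises counts as failed."""
    failed = []
    for name, check in checks.items():
        try:
            healthy = check()
        except Exception:
            healthy = False
        if not healthy:
            failed.append(name)
    return failed


def remediate(failed: List[str], checks: Dict[str, Check],
              fixes: Dict[str, Fix]) -> Tuple[List[str], List[str]]:
    """Apply the local fix for each failure, then re-check.

    Returns (fixed, unresolved); unresolved failures would be escalated,
    for example by marking the agent non-schedulable.
    """
    fixed, unresolved = [], []
    for name in failed:
        fix = fixes.get(name)
        if fix is not None:
            fix()
        if fix is not None and not run_checks({name: checks[name]}):
            fixed.append(name)
        else:
            unresolved.append(name)
    return fixed, unresolved
```

For example, a "disk" check could be paired with a Docker image cleanup fix, while a failing "mesos" check with no local fix would end up in the unresolved list.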
Process Model Evolution
Single process containers
● Worked for some time, until we needed system services
System services
● Telemetry, IAM support, log uploading
● Added as host installed daemons; isolation & multi-tenancy concerns
● Currently injecting system services into containers
Composing system services into containers
● Considered pods; lifecycle and usage complexities limited value
● Considering future of both systemd and docker image composability
Resource Isolation
● CPU
○ Started with bursting; was interfering with predictability
○ Resource tiers to avoid interference problems
● Memory
○ Hard limit, OOM kills entire container
● Network
○ Bandwidth throttling
● Disk space
● GPUs
Security Isolation
● Deployed user namespaces
○ Challenging due to shared systems without UID shifting
● Needed ad hoc debugging
○ Titus-ssh for user level access to their container
○ Still required power user access for kernel functions
○ Working to automate through tools like Vector (NetflixOSS)
● Seccomp overhead and complexity are prohibitive
○ Working towards automated policies and BPF driven implementations
Open Sourcing
Currently in private open source collaboration with those who want ...
● The NetflixOSS container solution (Spinnaker + Titus + Netflix RPC)
● A unified batch and service Mesos scheduler
● More robust & native AWS container platform
Hope to fully open source in early Q2
● If you want access now, let us know
● Looking for collaborators, feedback
Q&A

Container World 2018
