SlideShare una empresa de Scribd logo
1 de 24
Descargar para leer sin conexión
EFFICIENT
KUBERNETES SCALING
USING KARPENTER_
Marko Bevc
ABOUT
ME_ ●
Head of Consultancy at The Scale Factory (B2B SaaS consultancy,
AWS Advanced consulting partner and K8s service provider)
●
Ops background, wearing different hats, engaged with many
different technologies
●
Open source contributor, maintainer and supporter
●
HashiCorp Ambassador, OpenUK Ambassador
●
Certifications and competencies: AWS, CKA, RHEL, HCTA
●
Fan of automation/simplifying things, hiking and travelling
@_MarkoB
https://www.linkedin.com/in/marko-bevc/
| @marko@hachyderm.io
KUBERNETES
SCALING_
• None out of the box – manual 👩‍💻👨‍💻
• Kubernetes resources:
–Pods – the smallest execution unit
–Nodes – compute/instances to run Pods on
–Other: storage, network, etc.
@_MarkoB
HPA
CONCEPT_
• Horizontal Pod Autoscaler
• Adding more instances(e.g. Pods)
• Doesn’t apply to non-scalable objects (e.g. DaemonSet)
• Target observed metrics (i.e. average CPU or memory
utilization)
• Scaling out
VPA
CONCEPT_
• Vertical Pod Autoscaler
• Adjusting size/power (e.g. resources/limits)
• “Right-sizing” your workloads to actual usage
• Most commonly used on a Deployment objects
• Scaling up
PODS
SCALING_
• Other approaches:
– HPA | VPA* (HorizontalPodAutoscaler | VerticalPodAutoscaler)
– GCP: MultidimPodAutoscaler
– KEDA (K8s Event Driven Autoscaling)
– Knative (K8s based serverless platform)
CLUSTER
AUTOSCALER_
• Industry ‘de-facto’ auto-scaling standard
• Cost efficiency – automatically adjusts cluster: scale up/down
• Leaning on existing Cloud building blocks
• Challenges: Node Group limitations (AZ, instance type, labels),
complex to use, tightly bound to the scheduler, global controller
CLUSTER
AUTOSCALER
SCALE-UP_
●
Reconciliation and filtering
●
Scale up (in-memory simulation, <10sec)
●
Expanders: random, most/least pods, price, priority
●
Scale down (<10min)
New Nodes
Pending Pods
10 sec
NODE
SCHEDULING_
@_MarkoB
Kubernetes
Control Plane
unscheduled
@_MarkoB
NODE
SCHEDULING_
Kubernetes
Control Plane
unscheduled
@_MarkoB
NODE CA
SCHEDULING_
Kubernetes
Control Plane
size, arch, GPU, etc.
@_MarkoB
NODE KARPENTER
SCHEDULING_
Kubernetes
Control Plane
KARPENTER
ARCHITECTURE_
@_MarkoB
https://karpenter.sh
KEY
CONCEPTS_
• Straightforward setup:
– Provision AWS IAM Roles for Service Accounts (IRSA)
– Install controllers (leader elect HA)
– Apply Provisioner CRD (configuration) – one or more!
– Deploy workloads
• Capacity life-cycle loop: watch evaluate provision remove
→ → →
• Well-known labels as Provisioner constraints:
– kubernetes.io/arch = amd64
– kubernetes.io/os = linux
– node.kubernetes.io/instance-type = m5.large
– topology.kubernetes.io/zone = eu-west-1
– karpenter.sh/capacity-type = on-demand | spot
●
Multi-dimension scaling (up/down and in/out)!
@_MarkoB
SCALING
UP_
• Provisioning and scaling
• Adding more just-in-time capacity to meet demand
• Early binding to nodes
• Scheduling constraints: resource.requests, nodeAffinity, nodeSelector,
PodDisruptionBudget, topologySpreadConstraints, inter-pod (anti-)affinity
• Removing scheduler tight coupling
@_MarkoB
New Node
Pending Pods
<10 sec
SCALING
IN_
@_MarkoB
<10 sec
Obsolete Node
Pending Pods
• Terminate obsolete capacity reducing costs
→
• Removing underutilised or empty nodes
• Node TTLs (emptiness & expiration)
• Consolidation
• Interruption
• Drift
CAPACITY
CONSOLIDATION_
●
Consolidation, a.k.a off-line bin packing
●
Rebalancing Node workloads based on utilisation (CPU, memory)
●
Mechanisms for cluster consolidation:
– Delete (on-demand | spot)
– Replace (on-demand)
●
Optimises for cost, minimising disruption obeying:
– Scheduling constraints (PDBs, AZ affinity, topology spread constraints)
– Termination grace period and expiration TTL
– Instance unhealthy events and spot events (termination)
●
Using least disruption when multiple Nodes that could be consolidated:
– Nodes running fewer pods
– Nodes that will expire soon
– Nodes with lower priority Pods
@_MarkoB
OTHER
OPTIONS_
●
Custom User Data and AMI (i.e. Bottlerocket)
●
Kubelet configuration (containerRuntime, systemReserved)
●
Taints (or startupTaints)
●
Control Pod Density
– Network limitations
●
Number of ENIs
●
Number of IP addresses that can be assigned to ENI
– Static Pod Density (podsPerCore)
– Dynamic Pod Density (maxPods)
– Limit Pod Density: topology spread, restrict instance types
@_MarkoB
TIME FOR
A DEMO!_
@_MarkoB
CONCLUSIONS_
& TAKEAWAYS
●
Capacity planning is hard! 🧪
●
Key advantages: 🔥
– Flexible, lowers complexity & portable
– Fast: provisioning latency <1min down to 15sec (group-less)
→
– Efficient: multi-dimension scaling, consolidation (delete or replace)
– Adaptive: right-sizing, interruption events
– Compliance (TTL)📖
●
To keep in mind: 🧑‍🏫
– Currently supported provider is AWS (adoption in the future?*)
– Not supporting Spot Rebalance Recommendations
– Careful with non-interruptable workloads, edge case of 1 replica
– https://github.com/aws/karpenter/issues ➡️ ⚒️
@_MarkoB
●
Resources:
– https://github.com/mbevc1/public-speaking/
– https://github.com/aws/karpenter/
– https://kubernetes.io/docs/reference/labels-annotations-taints/
– https://github.com/kubernetes/autoscaler
– https://docs.aws.amazon.com/eks/latest/userguide/cluster-autoscaler.html
– https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/
scalability_tests.md
– https://blog.kloia.com/karpenter-cluster-autoscaler-76d7f7ec0d0e
– https://blog.scaleway.com/understanding-kubernetes-autoscaling/
– https://aws.amazon.com/blogs/aws/introducing-karpenter-an-open-source-high-performance-
kubernetes-cluster-autoscaler/
FURTHER
READING_
@_MarkoB
KEEP IN
TOUCH_
https://www.scalefactory.com/
@_MarkoB
@mbevc1
@mbevc1
https://www.linkedin.com/in/marko-bevc/
https://www.scalefactory.com/
Web:
Twitter:
GitHub:
GitLab:
LinkedIn:

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Kubernetes Introduction
Kubernetes IntroductionKubernetes Introduction
Kubernetes Introduction
 
Kubernetes Architecture - beyond a black box - Part 1
Kubernetes Architecture - beyond a black box - Part 1Kubernetes Architecture - beyond a black box - Part 1
Kubernetes Architecture - beyond a black box - Part 1
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
Autoscaling Kubernetes
Autoscaling KubernetesAutoscaling Kubernetes
Autoscaling Kubernetes
 
01. Kubernetes-PPT.pptx
01. Kubernetes-PPT.pptx01. Kubernetes-PPT.pptx
01. Kubernetes-PPT.pptx
 
Docker and Kubernetes 101 workshop
Docker and Kubernetes 101 workshopDocker and Kubernetes 101 workshop
Docker and Kubernetes 101 workshop
 
Scaling containers with keda
Scaling containers  with kedaScaling containers  with keda
Scaling containers with keda
 
Hands-On Introduction to Kubernetes at LISA17
Hands-On Introduction to Kubernetes at LISA17Hands-On Introduction to Kubernetes at LISA17
Hands-On Introduction to Kubernetes at LISA17
 
Kubernetes Networking
Kubernetes NetworkingKubernetes Networking
Kubernetes Networking
 
Kubernetes a comprehensive overview
Kubernetes   a comprehensive overviewKubernetes   a comprehensive overview
Kubernetes a comprehensive overview
 
Kubernetes 101 for Beginners
Kubernetes 101 for BeginnersKubernetes 101 for Beginners
Kubernetes 101 for Beginners
 
Kubernetes Introduction
Kubernetes IntroductionKubernetes Introduction
Kubernetes Introduction
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
Introduction to Kubernetes Workshop
Introduction to Kubernetes WorkshopIntroduction to Kubernetes Workshop
Introduction to Kubernetes Workshop
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetes
 
IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021
IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021
IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021
 
GitOps with Amazon EKS Anywhere by Dan Budris
GitOps with Amazon EKS Anywhere by Dan BudrisGitOps with Amazon EKS Anywhere by Dan Budris
GitOps with Amazon EKS Anywhere by Dan Budris
 
Amazon EKS를 통한 빠르고 편리한 컨테이너 플랫폼 활용 – 이일구 AWS 솔루션즈 아키텍트:: AWS Cloud Week - Ind...
Amazon EKS를 통한 빠르고 편리한 컨테이너 플랫폼 활용 – 이일구 AWS 솔루션즈 아키텍트:: AWS Cloud Week - Ind...Amazon EKS를 통한 빠르고 편리한 컨테이너 플랫폼 활용 – 이일구 AWS 솔루션즈 아키텍트:: AWS Cloud Week - Ind...
Amazon EKS를 통한 빠르고 편리한 컨테이너 플랫폼 활용 – 이일구 AWS 솔루션즈 아키텍트:: AWS Cloud Week - Ind...
 
Kubernetes Architecture - beyond a black box - Part 2
Kubernetes Architecture - beyond a black box - Part 2Kubernetes Architecture - beyond a black box - Part 2
Kubernetes Architecture - beyond a black box - Part 2
 
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
 

Similar a Efficient Kubernetes scaling using Karpenter

Similar a Efficient Kubernetes scaling using Karpenter (20)

Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015
 
Seamless scaling of Kubernetes nodes
Seamless scaling of Kubernetes nodesSeamless scaling of Kubernetes nodes
Seamless scaling of Kubernetes nodes
 
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
 
Lc3 beijing-june262018-sahdev zala-guangya
Lc3 beijing-june262018-sahdev zala-guangyaLc3 beijing-june262018-sahdev zala-guangya
Lc3 beijing-june262018-sahdev zala-guangya
 
H-Hypermap Heatmap Analytics at Scale
H-Hypermap Heatmap Analytics at ScaleH-Hypermap Heatmap Analytics at Scale
H-Hypermap Heatmap Analytics at Scale
 
Cluster schedulers
Cluster schedulersCluster schedulers
Cluster schedulers
 
Hadoop and Spark
Hadoop and SparkHadoop and Spark
Hadoop and Spark
 
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
 
Running Kafka on Kubernetes, across three clouds at Adobe
Running Kafka on Kubernetes, across three clouds at AdobeRunning Kafka on Kubernetes, across three clouds at Adobe
Running Kafka on Kubernetes, across three clouds at Adobe
 
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
 
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
 
Facebook Presto presentation
Facebook Presto presentationFacebook Presto presentation
Facebook Presto presentation
 
HPC in the Cloud
HPC in the CloudHPC in the Cloud
HPC in the Cloud
 
MySQL in the Hosted Cloud
MySQL in the Hosted CloudMySQL in the Hosted Cloud
MySQL in the Hosted Cloud
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introduction
 
Ceph for Big Science - Dan van der Ster
Ceph for Big Science - Dan van der SterCeph for Big Science - Dan van der Ster
Ceph for Big Science - Dan van der Ster
 
Google Kubernetes Engine Deep Dive Meetup
Google Kubernetes Engine Deep Dive MeetupGoogle Kubernetes Engine Deep Dive Meetup
Google Kubernetes Engine Deep Dive Meetup
 
Kudu demo
Kudu demoKudu demo
Kudu demo
 
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
 

Más de Marko Bevc

Más de Marko Bevc (8)

Using HCP Waypoint
Using HCP WaypointUsing HCP Waypoint
Using HCP Waypoint
 
How secure are your Terraform sensitive values?
How secure are your Terraform sensitive values?How secure are your Terraform sensitive values?
How secure are your Terraform sensitive values?
 
Who is afraid of privileged containers ?
Who is afraid of privileged containers ?Who is afraid of privileged containers ?
Who is afraid of privileged containers ?
 
Terraform 0.13: Rise of the modules
Terraform 0.13: Rise of the modulesTerraform 0.13: Rise of the modules
Terraform 0.13: Rise of the modules
 
Who is afraid of privileged containers ?
Who is afraid of privileged containers ?Who is afraid of privileged containers ?
Who is afraid of privileged containers ?
 
Terraform 0.13: Rise of the modules
Terraform 0.13: Rise of the modulesTerraform 0.13: Rise of the modules
Terraform 0.13: Rise of the modules
 
Who is afraid of privileged containers ?
Who is afraid of privileged containers ?Who is afraid of privileged containers ?
Who is afraid of privileged containers ?
 
Commodified IaC using Terraform Cloud
Commodified IaC using Terraform CloudCommodified IaC using Terraform Cloud
Commodified IaC using Terraform Cloud
 

Último

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Último (20)

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data PlatformLess Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate Guide
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Simplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxSimplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptx
 
Top 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development CompaniesTop 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development Companies
 
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
 
UiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overviewUiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overview
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

Efficient Kubernetes scaling using Karpenter

  • 1.
  • 3.
  • 4. ABOUT ME_ ● Head of Consultancy at The Scale Factory (B2B SaaS consultancy, AWS Advanced consulting partner and K8s service provider) ● Ops background, wearing different hats, engaged with many different technologies ● Open source contributor, maintainer and supporter ● HashiCorp Ambassador, OpenUK Ambassador ● Certifications and competencies: AWS, CKA, RHEL, HCTA ● Fan of automation/simplifying things, hiking and travelling @_MarkoB https://www.linkedin.com/in/marko-bevc/ | @marko@hachyderm.io
  • 5. KUBERNETES SCALING_ • None out of the box – manual 👩‍💻👨‍💻 • Kubernetes resources: –Pods – the smallest execution unit –Nodes – compute/instances to run Pods on –Other: storage, network, etc. @_MarkoB
  • 6. HPA CONCEPT_ • Horizontal Pod Autoscaler • Adding more instances(e.g. Pods) • Doesn’t apply to non-scalable objects (e.g. DaemonSet) • Target observed metrics (i.e. average CPU or memory utilization) • Scaling out
  • 7. VPA CONCEPT_ • Vertical Pod Autoscaler • Adjusting size/power (e.g. resources/limits) • “Right-sizing” your workloads to actual usage • Most commonly used on a Deployment objects • Scaling up
  • 8. PODS SCALING_ • Other approaches: – HPA | VPA* (HorizontalPodAutoscaler | VerticalPodAutoscaler) – GCP: MultidimPodAutoscaler – KEDA (K8s Event Driven Autoscaling) – Knative (K8s based serverless platform)
  • 9. CLUSTER AUTOSCALER_ • Industry ‘de-facto’ auto-scaling standard • Cost efficiency – automatically adjusts cluster: scale up/down • Leaning on existing Cloud building blocks • Challenges: Node Group limitations (AZ, instance type, labels), complex to use, tightly bound to the scheduler, global controller
  • 10. CLUSTER AUTOSCALER SCALE-UP_ ● Reconciliation and filtering ● Scale up (in-memory simulation, <10sec) ● Expanders: random, most/least pods, price, priority ● Scale down (<10min) New Nodes Pending Pods 10 sec
  • 14. size, arch, GPU, etc. @_MarkoB NODE KARPENTER SCHEDULING_ Kubernetes Control Plane
  • 16. KEY CONCEPTS_ • Straightforward setup: – Provision AWS IAM Roles for Service Accounts (IRSA) – Install controllers (leader elect HA) – Apply Provisioner CRD (configuration) – one or more! – Deploy workloads • Capacity life-cycle loop: watch evaluate provision remove → → → • Well-known labels as Provisioner constraints: – kubernetes.io/arch = amd64 – kubernetes.io/os = linux – node.kubernetes.io/instance-type = m5.large – topology.kubernetes.io/zone = eu-west-1 – karpenter.sh/capacity-type = on-demand | spot ● Multi-dimension scaling (up/down and in/out)! @_MarkoB
  • 17. SCALING UP_ • Provisioning and scaling • Adding more just-in-time capacity to meet demand • Early binding to nodes • Scheduling constraints: resource.requests, nodeAffinity, nodeSelector, PodDisruptionBudget, topologySpreadConstraints, inter-pod (anti-)affinity • Removing scheduler tight coupling @_MarkoB New Node Pending Pods <10 sec
  • 18. SCALING IN_ @_MarkoB <10 sec Obsolete Node Pending Pods • Terminate obsolete capacity reducing costs → • Removing underutilised or empty nodes • Node TTLs (emptiness & expiration) • Consolidation • Interruption • Drift
  • 19. CAPACITY CONSOLIDATION_ ● Consolidation, a.k.a off-line bin packing ● Rebalancing Node workloads based on utilisation (CPU, memory) ● Mechanisms for cluster consolidation: – Delete (on-demand | spot) – Replace (on-demand) ● Optimises for cost, minimising disruption obeying: – Scheduling constraints (PDBs, AZ affinity, topology spread constraints) – Termination grace period and expiration TTL – Instance unhealthy events and spot events (termination) ● Using least disruption when multiple Nodes that could be consolidated: – Nodes running fewer pods – Nodes that will expire soon – Nodes with lower priority Pods @_MarkoB
  • 20. OTHER OPTIONS_ ● Custom User Data and AMI (i.e. Bottlerocket) ● Kubelet configuration (containerRuntime, systemReserved) ● Taints (or startupTaints) ● Control Pod Density – Network limitations ● Number of ENIs ● Number of IP addresses that can be assigned to ENI – Static Pod Density (podsPerCore) – Dynamic Pod Density (maxPods) – Limit Pod Density: topology spread, restrict instance types @_MarkoB
  • 22. CONCLUSIONS_ & TAKEAWAYS ● Capacity planning is hard! 🧪 ● Key advantages: 🔥 – Flexible, lowers complexity & portable – Fast: provisioning latency <1min down to 15sec (group-less) → – Efficient: multi-dimension scaling, consolidation (delete or replace) – Adaptive: right-sizing, interruption events – Compliance (TTL)📖 ● To keep in mind: 🧑‍🏫 – Currently supported provider is AWS (adoption in the future?*) – Not supporting Spot Rebalance Recommendations – Careful with non-interruptable workloads, edge case of 1 replica – https://github.com/aws/karpenter/issues ➡️ ⚒️ @_MarkoB
  • 23. ● Resources: – https://github.com/mbevc1/public-speaking/ – https://github.com/aws/karpenter/ – https://kubernetes.io/docs/reference/labels-annotations-taints/ – https://github.com/kubernetes/autoscaler – https://docs.aws.amazon.com/eks/latest/userguide/cluster-autoscaler.html – https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/ scalability_tests.md – https://blog.kloia.com/karpenter-cluster-autoscaler-76d7f7ec0d0e – https://blog.scaleway.com/understanding-kubernetes-autoscaling/ – https://aws.amazon.com/blogs/aws/introducing-karpenter-an-open-source-high-performance- kubernetes-cluster-autoscaler/ FURTHER READING_ @_MarkoB