Navigating Disaster Recovery in Kubernetes and CNCF Crossplane

Carlos Santana (@csantanapr)
Sr. EKS Specialist SA, AWS
CNCF Ambassador
Platform Engineering
SRE Engineering
© 2023, Amazon Web Services, Inc. or its affiliates. @csantanapr
A model for thinking about resiliency
Resiliency
• Disaster Recovery: one-time events
• High Availability: average over time
Disaster recovery (DR)
• About business continuity
• Larger-scale, less frequent events:
  • Natural disasters
  • Technical failures
  • Human actions
• Measures a one-time event:
  • Recovery Time
  • Recovery Point
Recovery Objectives
• Recovery Point (RPO): data loss, the window before the disaster
  • How much data can you afford to recreate or lose?
• Recovery Time (RTO): downtime, the window after the disaster
  • How quickly must you recover?
  • What is the cost of downtime?
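The two objectives can be checked against a concrete incident timeline. The sketch below is only an illustration: the `Incident` class, its field names, and the sample timestamps are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical timeline of a single disaster event."""
    last_recovery_point: datetime  # last backup or replicated state before the disaster
    disaster: datetime             # moment the disaster struck
    service_restored: datetime     # moment the service was usable again


def data_loss_window(incident: Incident) -> timedelta:
    # Data written after the last recovery point is lost (compare with the RPO).
    return incident.disaster - incident.last_recovery_point


def downtime(incident: Incident) -> timedelta:
    # Time the service was unavailable (compare with the RTO).
    return incident.service_restored - incident.disaster


def meets_objectives(incident: Incident, rpo: timedelta, rto: timedelta) -> bool:
    return data_loss_window(incident) <= rpo and downtime(incident) <= rto


# Made-up sample values, for illustration only.
example = Incident(
    last_recovery_point=datetime(2023, 1, 1, 2, 0),
    disaster=datetime(2023, 1, 1, 2, 45),
    service_restored=datetime(2023, 1, 1, 4, 15),
)
print(data_loss_window(example))  # 0:45:00
print(downtime(example))          # 1:30:00
print(meets_objectives(example, rpo=timedelta(hours=1), rto=timedelta(hours=2)))  # True
```

Tightening the RTO to one hour in this example would make the check fail, since the downtime was 90 minutes.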
Strategies for disaster recovery
Backup & Restore (active/passive)
• RPO / RTO: Hours
• Data backed up
• No services deployed
• Cost: $
Pilot Light (active/passive)
• RPO / RTO: 10s of minutes
• Data live
• Services idle
• Cost: $$
Warm standby (active/passive)
• RPO / RTO: Minutes
• Data live
• Services run at reduced capacity
• Cost: $$$
Multi-site active/active
• RPO / RTO: Near real-time
• Data live
• Live services
• Cost: $$$$
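One way to read the strategy comparison is as a lookup: pick the cheapest strategy whose recovery window still fits the target. The sketch below assumes numeric values for "Hours", "10s of minutes", "Minutes", and "Near real-time"; those numbers, like the `Strategy` type and `cheapest_strategy` function, are illustrative placeholders and not figures from the slides.

```python
from typing import List, NamedTuple, Optional


class Strategy(NamedTuple):
    name: str
    cost_tier: int          # number of "$" signs on the slide
    assumed_minutes: float  # ASSUMPTION: rough numeric stand-in for the slide's RPO / RTO label


STRATEGIES: List[Strategy] = [
    Strategy("Backup & Restore", 1, 240.0),         # "Hours" (assumed 4 h)
    Strategy("Pilot Light", 2, 30.0),               # "10s of minutes" (assumed 30 min)
    Strategy("Warm standby", 3, 5.0),               # "Minutes" (assumed 5 min)
    Strategy("Multi-site active/active", 4, 0.1),   # "Near real-time" (assumed 6 s)
]


def cheapest_strategy(target_minutes: float) -> Optional[Strategy]:
    """Return the lowest-cost strategy whose assumed recovery window fits the target."""
    candidates = [s for s in STRATEGIES if s.assumed_minutes <= target_minutes]
    if not candidates:
        return None  # no listed strategy is fast enough
    return min(candidates, key=lambda s: s.cost_tier)


print(cheapest_strategy(60).name)   # Pilot Light
print(cheapest_strategy(480).name)  # Backup & Restore
```

In practice RPO and RTO are set separately per workload, so a real decision would compare both values rather than a single number.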
Crossplane Disaster Recovery
• Crossplane Upgrades and Rollbacks
  • New API versions added to CRDs (e.g. 11.0 -> 1.10.2)
  • Issue #3859
  • Provider upgrades and rollbacks
    – CRD ownership
• Configuration Package
  • Provider auto upgrade
• Velero
  • --features=EnableAPIGroupVersions
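The Velero flag above tells Velero to back up every served version of each API group rather than only the preferred one, which matters when CRDs gain new API versions between Crossplane releases. A minimal sketch of assembling the install command follows; the bucket name, Region, and plugin image tag are hypothetical placeholders, and `--no-secret` assumes credentials come from an IAM role rather than a secret file.

```python
import shlex
from typing import List


def velero_install_args(bucket: str, region: str, plugin_image: str) -> List[str]:
    """Build a `velero install` command line with API group version backup enabled."""
    return [
        "velero", "install",
        "--provider", "aws",
        "--plugins", plugin_image,
        "--bucket", bucket,
        "--backup-location-config", f"region={region}",
        "--no-secret",  # ASSUMPTION: credentials are provided by an IAM role
        "--features=EnableAPIGroupVersions",
    ]


# Hypothetical values; substitute your own bucket, Region, and plugin tag.
args = velero_install_args(
    bucket="my-velero-backups",
    region="us-west-2",
    plugin_image="velero/velero-plugin-for-aws:v1.7.0",
)
print(" ".join(shlex.quote(a) for a in args))
```

The function only builds the argument list; running it, for example with `subprocess.run(args, check=True)`, requires the Velero CLI and cluster access.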
managementPolicy (ObserveOnly)
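With an observe-only policy, Crossplane reads the state of an existing external resource without creating, updating, or deleting it, which is useful when a restored control plane needs to reattach to cloud resources that already exist. The sketch below builds such a manifest as a Python dict. It assumes the Upbound AWS provider's `rds.aws.upbound.io/v1beta1` `Instance` kind and the alpha `spec.managementPolicy` field; the resource names are hypothetical, and newer Crossplane releases use a `managementPolicies` list instead, so check the version you run.

```python
import json
from typing import Any, Dict


def observe_only_rds_instance(name: str, external_name: str, region: str) -> Dict[str, Any]:
    """Build a managed resource manifest that only observes an existing RDS instance."""
    return {
        "apiVersion": "rds.aws.upbound.io/v1beta1",  # ASSUMPTION: Upbound AWS provider
        "kind": "Instance",
        "metadata": {
            "name": name,
            "annotations": {
                # Identifies the existing external resource to observe.
                "crossplane.io/external-name": external_name,
            },
        },
        "spec": {
            "managementPolicy": "ObserveOnly",  # alpha field; requires provider support
            "forProvider": {"region": region},
        },
    }


# Hypothetical names for illustration.
manifest = observe_only_rds_instance(
    name="restored-db",
    external_name="prod-db-instance",
    region="us-west-2",
)
print(json.dumps(manifest, indent=2))  # JSON is valid YAML, so this can be piped to kubectl apply -f -
```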
Disaster Recovery
Scenario 2: Backup Database
[Diagram: AWS Cloud with two Regions, east-1 and west-2. Each Region has EKS running Crossplane (ETCD) and Amazon RDS. Other labels: Claim, mutation webhooks, ArgoCD. Backup-EKS: backup non-global resources to S3. Backup-RDS: backup to S3. Both are restored in west-2.]
Summary
• Everything fails all the time
• Shortest path to recovery
• Different failure domains
• Crossplane rollbacks
• Use automatic replication (e.g. S3) for a faster RTO
• Lower cost by recovering from a database backup (higher RTO)
Resources
https://github.com/awslabs/crossplane-on-eks
https://crossplane.io
https://go.aws/3K4ue0W
https://velero.io
Recovery When Using Crossplane for Infrastructure Provisioning on AWS
EKS Blueprints
https://argoproj.github.io/cd