© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Reliability of the Cloud:
How AWS Achieves High Availability
Rodney Lester
Reliability Lead
AWS Well Architected
ARC317
Shaun Ray
Manager
AWS Evangelism
Agenda
Well-Architected Reliability Pillar
Once upon a time … (stories)
Availability design goals
Breakout repeats
Tuesday, November 27
ARC317-R [REPEAT] Reliability of the Cloud: How AWS
Achieves High Availability
3:15 p.m. – 4:15 p.m. | Aria East, Level 1, Joshua 4
Thursday, November 29
ARC317-R [REPEAT 1] Reliability of the Cloud: How AWS
Achieves High Availability
11:30 a.m. – 12:30 p.m. | Mirage, Antigua A
Related breakouts
Wednesday, November 28
ARC335-R1 Failing Successfully in the Cloud: AWS Approach to
Resilient Design
12:15 p.m. – 1:15 p.m. | Aria East, Level 2, Mariposa 8
Thursday, November 29
ARC335-R2 Failing Successfully in the Cloud: AWS Approach to
Resilient Design
4:00 p.m. – 5:00 p.m. | MGM, Level 3, South Concourse 302
Wednesday, November 28
ARC408 Under the Hood of Route 53
11:30 a.m. – 12:30 p.m. | Venetian, Level 4, Lando 4305
AWS Well-Architected Reliability Pillar
• Completely refreshed December 2017
• Additional changes approximately every three months
• The plan is to make it more dynamic in the future; a new version will be released soon
• Significant changes
• Calculating availability
• Application design primer
• Examples, at different design goals
• Appendix contains design goals of 37 AWS services
• More are added in each revision, and this will continue
• These concepts are used to develop services
https://aws.amazon.com/well-architected/
AWS uses the information in this white paper
How does this relate to how AWS builds services?
• This document was written in consultation with AWS principal
engineers
• The techniques described are well proven
• All of the techniques described have articles or books written about
them
Ops meetings
• David Lubell and Kevin Miller conducted a chalk talk in 2017 on how
we run our ops meeting
• We review critical services every week in a two-hour meeting
• Charlie Bell (SVP, AWS Operations) leads the meeting
• Senior leaders of the services
• Representation from every AWS service
• Service metrics reviews
• 130+ services * 10 min/service = 22-hr meeting?
• How do we ensure all services are ready every week?
Service review
• Now open source
• http://bit.ly/aws-wheel
The things that happen once in a million happen all the
time in AWS
• Some commonly observed problems:
• Our back end service was having no problems, now it’s overloaded
• An occasional huge spike in traffic that quickly disappears causes problems
• Average response time to requests is slowly creeping up, but the p99 is growing exponentially
• Observe a rise in failed requests: “The service/region is failing”
• Experienced a failure; on recovery, we’re receiving duplicate requests that are all errors
• Cannot adapt fast enough to the huge changes in demand up or down
• Dependency on a less reliable system
• No problems until a system that was dependent on us went down, then we went down
• Couldn’t get capacity quick enough when a location went down
Common causes of such problems (cont.)
• Our back end service was having no problems, now it’s overloaded
• Someone deployed a service that uses our service and the requests are much more than
planned/expected
• Someone in marketing is running a campaign and didn’t tell us; our service is not alone
• A bug exists that causes repeated requests to our service, either a new deployment, or a
latent bug
• We see an occasional huge spike in traffic that quickly disappears
• Some kind of edge case exists where things go normally, then under a condition, some kind
of rebuilding of a data model happens
• Someone in marketing is running a campaign and didn’t tell us; our service is not alone
• A bug exists that causes repeated requests to our service, either a new deployment, or a
latent bug
Common causes of such problems (cont.)
• Average response time to requests is slowly creeping up, but the p99 is
growing exponentially
• This can be an indicator of impending problems
• There is a use case that executes a different path, either on your service, or a dependency
• Observe a rise in failed requests: “The service/region is failing”
• There may be an event (known internally as a Large Scale Event) occurring
• Maybe a transient problem
• Can often be better to wait it out rather than fail over
• Experienced a failure; on recovery, we’re receiving duplicate requests
that are all errors
• Even if you are not distributed, it is possible that the invoking service has no idea you were
successful in processing some requests
• Idempotency tokens can be used
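The idempotency-token idea can be sketched as follows. This is a minimal in-memory illustration, not how any AWS service implements it: the `IdempotentHandler` class, the `create_order` operation, and the token value are all hypothetical, and a real service would likely persist tokens in durable storage and expire them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class IdempotentHandler:
    """Hypothetical handler that remembers results by client-supplied token."""

    operation: Callable[[dict], dict]
    _results: Dict[str, dict] = field(default_factory=dict)
    executions: int = 0

    def handle(self, token: str, payload: dict) -> dict:
        # A retried request carries the same token, so return the stored
        # result instead of performing the side effect a second time.
        if token in self._results:
            return self._results[token]
        self.executions += 1
        result = self.operation(payload)
        self._results[token] = result
        return result


def create_order(payload: dict) -> dict:
    # Stand-in for a side-effecting operation (assumed for illustration).
    return {"status": "created", "item": payload["item"]}


handler = IdempotentHandler(create_order)
first = handler.handle("token-123", {"item": "book"})
# The caller never saw the first response and retries with the same token.
retry = handler.handle("token-123", {"item": "book"})
```

The retry receives the original result and the operation runs only once, so a caller that is unsure whether its request succeeded can safely resend it.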
Common causes of such problems (cont.)
• Cannot adapt fast enough to the huge changes in demand up or down
• Need good communication paths with business drivers of traffic
• You can have the system constantly performing tasks that are replaced by requests from
consumers of your service
• Dependency on a less reliable system
• Can turn this into a soft dependency if you can find an acceptable replacement state
• This usually needs to be negotiated with the product owners
• No problems until a system that was dependent on us went down, then
we went down
• Commonly known as a cascading failure
• Not always a failure (see previous examples of spiky traffic)
• Example of “bi-modal behavior”
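The effect of turning a hard dependency into a soft one can be illustrated with simple availability arithmetic. The sketch below assumes failures are independent and that a soft dependency's failure is fully masked by the replacement state; the availability figures are made-up examples, not published design goals.

```python
from math import prod


def serial_availability(availabilities):
    """Availability of a request path that needs every listed component."""
    return prod(availabilities)


def downtime_minutes_per_year(availability):
    """Expected unavailable minutes in a 365-day year."""
    return (1 - availability) * 365 * 24 * 60


service = 0.9999     # our own service (assumed figure)
dependency = 0.999   # a less reliable dependency (assumed figure)

# Hard dependency: every request fails whenever the dependency fails.
hard = serial_availability([service, dependency])

# Soft dependency: the dependency drops out of the calculation because
# we serve an acceptable replacement state when it is down.
soft = serial_availability([service])
```

With these example numbers the hard-dependency path is less available than either component alone, which is why negotiating an acceptable replacement state with product owners can be worth the effort.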
Common causes of such problems (cont.)
• Couldn’t get capacity quick enough when a location went down
• Pilot light or running at high utilization can cause a brown out when failure occurs
• Need to be able to take a loss of a location and service the traffic immediately
Service Design Goals
• Not SLAs
• We manage to these goals in the weekly ops meeting
• Currently documented for 37 services
• Adding more as I work with services to establish them
• Control Plane versus Data Plane
• Control plane mutates resources (bi-modal!) and data plane is the “day job”
• Control plane is often more “dangerous” and therefore less available (not always!)
Thank you!
Rodney Lester
rodneyle@amazon.com
Shaun Ray
shaunray@amazon.com
Software/implementation has an impact on
availability
• Throttling
• Protect your service by refusing requests when out of capacity
• Exponential back off for retries
• This is an art and a science; built into the AWS SDKs
• Fail fast
• Users will retry on failure, so this can allow your system to recover faster
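Retrying with exponential backoff might look like the sketch below. It is not the AWS SDK implementation, which the slide notes is built in; the function names, the default `base` and `cap` values, and the choice of "full jitter" (a random delay between zero and the exponential ceiling) are assumptions made for illustration.

```python
import random
import time


def backoff_delay(attempt, base=0.1, cap=5.0, rng=random.random):
    """Random delay in [0, min(cap, base * 2**attempt)) seconds."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(operation, attempts=5, base=0.1, cap=5.0, sleep=time.sleep):
    """Call operation, retrying failures with capped, jittered backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Out of attempts: fail fast and surface the error to the caller.
            if attempt == attempts - 1:
                raise
            sleep(backoff_delay(attempt, base, cap))
```

The jitter spreads retries from many clients over time, so a recovering service is less likely to be hit by synchronized waves of retries.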
More advanced implementation patterns
• Idempotency
• You have a choice: “at most once” semantics, or “at least once.” Choose the latter.
• Constant work
• If you have a system that is always performing work, and you replace that work with user
requests, you have a system that is much more predictable
• Colm MacCarthaigh has a tweet thread on this:
https://twitter.com/colmmacc/status/1039228121327648768
• Circuit breaker
• Can be used to remove hard dependencies in your availability calculation
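A circuit breaker can be sketched roughly as follows. The class, its thresholds, and the fallback callable are hypothetical; the sketch assumes the caller can supply an acceptable fallback result, which is what allows the dependency to be treated as soft rather than hard.

```python
import time


class CircuitBreaker:
    """Stops calling a failing dependency and serves a fallback instead."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: do not touch the dependency at all.
                return fallback()
            # Half-open: allow one trial call; one more failure re-opens.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

While the circuit is open the dependency receives no traffic from this caller, which gives it room to recover and keeps the caller's latency predictable.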
Bi-modal behavior and static stability
• Cascading failures are often from “bi-modal” behavior
• I’ve seen this often: an anomaly causes a huge change in system behavior
• Static stability
• On loss of capacity, you want to be able to handle your current load with no need to acquire
resources
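The capacity implication of static stability can be worked through with a little arithmetic. The sketch below assumes load is spread evenly across locations and that capacity is measured in interchangeable instances; the function names and figures are illustrative only.

```python
import math


def instances_per_location(peak_instances, locations, tolerated_losses=1):
    """Instances to run in each location so survivors carry the full peak."""
    surviving = locations - tolerated_losses
    if surviving < 1:
        raise ValueError("at least one location must survive")
    return math.ceil(peak_instances / surviving)


def utilization_ceiling(locations, tolerated_losses=1):
    """Highest steady-state utilization that still absorbs the loss."""
    return (locations - tolerated_losses) / locations


# Example: peak load needs 12 instances, spread over 3 locations.
per_location = instances_per_location(12, 3)
total = per_location * 3
ceiling = utilization_ceiling(3)
```

In this example each location runs enough instances that any two of them can carry the whole peak, so losing one location requires no new capacity. The price is running at roughly two-thirds utilization or less in normal operation.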
It’s a danger to stay on old versions of operating
systems, frameworks, or third-party software
• More than just operating systems
• Operating systems
• Frameworks like Spring, Angular, and more
• Other third-party software like libraries
• Ensure you keep up to date
• Can be more than an availability concern: Equifax ran an old version of Struts that exposed its
customer data
• This is part of the corporate-wide topics communicated in the Ops
meetings

More Related Content

What's hot

(CMP201) All You Need To Know About Auto Scaling
(CMP201) All You Need To Know About Auto Scaling(CMP201) All You Need To Know About Auto Scaling
(CMP201) All You Need To Know About Auto ScalingAmazon Web Services
 
AWS 네트워크 보안을 위한 계층별 보안 구성 모범 사례 – 조이정, AWS 솔루션즈 아키텍트:: AWS 온라인 이벤트 – 클라우드 보안 특집
AWS 네트워크 보안을 위한 계층별 보안 구성 모범 사례 – 조이정, AWS 솔루션즈 아키텍트:: AWS 온라인 이벤트 – 클라우드 보안 특집AWS 네트워크 보안을 위한 계층별 보안 구성 모범 사례 – 조이정, AWS 솔루션즈 아키텍트:: AWS 온라인 이벤트 – 클라우드 보안 특집
AWS 네트워크 보안을 위한 계층별 보안 구성 모범 사례 – 조이정, AWS 솔루션즈 아키텍트:: AWS 온라인 이벤트 – 클라우드 보안 특집Amazon Web Services Korea
 
How AWS Minimizes the Blast Radius of Failures (ARC338) - AWS re:Invent 2018
How AWS Minimizes the Blast Radius of Failures (ARC338) - AWS re:Invent 2018How AWS Minimizes the Blast Radius of Failures (ARC338) - AWS re:Invent 2018
How AWS Minimizes the Blast Radius of Failures (ARC338) - AWS re:Invent 2018Amazon Web Services
 
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018Amazon Web Services Korea
 
AWS 기반의 마이크로 서비스 아키텍쳐 구현 방안 :: 김필중 :: AWS Summit Seoul 20
AWS 기반의 마이크로 서비스 아키텍쳐 구현 방안 :: 김필중 :: AWS Summit Seoul 20AWS 기반의 마이크로 서비스 아키텍쳐 구현 방안 :: 김필중 :: AWS Summit Seoul 20
AWS 기반의 마이크로 서비스 아키텍쳐 구현 방안 :: 김필중 :: AWS Summit Seoul 20Amazon Web Services Korea
 
Deploy and Govern at Scale with AWS Control Tower
Deploy and Govern at Scale with AWS Control TowerDeploy and Govern at Scale with AWS Control Tower
Deploy and Govern at Scale with AWS Control TowerAmazon Web Services
 
NET304_Deep Dive into the New Network Load Balancer
NET304_Deep Dive into the New Network Load BalancerNET304_Deep Dive into the New Network Load Balancer
NET304_Deep Dive into the New Network Load BalancerAmazon Web Services
 
[Games on AWS 2019] AWS 사용자를 위한 만랩 달성 트랙 | AWS에서 분산 서비스 거부 공격(DDoS)을 고민하지 않는 ...
[Games on AWS 2019] AWS 사용자를 위한 만랩 달성 트랙 | AWS에서 분산 서비스 거부 공격(DDoS)을 고민하지 않는 ...[Games on AWS 2019] AWS 사용자를 위한 만랩 달성 트랙 | AWS에서 분산 서비스 거부 공격(DDoS)을 고민하지 않는 ...
[Games on AWS 2019] AWS 사용자를 위한 만랩 달성 트랙 | AWS에서 분산 서비스 거부 공격(DDoS)을 고민하지 않는 ...Amazon Web Services Korea
 
Deep dive ECS & Fargate Deep Dive
Deep dive ECS & Fargate Deep DiveDeep dive ECS & Fargate Deep Dive
Deep dive ECS & Fargate Deep DiveAmazon Web Services
 
Application & Account Monitoring in AWS
Application & Account Monitoring in AWSApplication & Account Monitoring in AWS
Application & Account Monitoring in AWSBhuvaneswari Subramani
 
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Amazon Web Services
 
더욱 진화하는 AWS 네트워크 보안 - 신은수 AWS 시큐리티 스페셜리스트 솔루션즈 아키텍트 :: AWS Summit Seoul 2021
더욱 진화하는 AWS 네트워크 보안 - 신은수 AWS 시큐리티 스페셜리스트 솔루션즈 아키텍트 :: AWS Summit Seoul 2021더욱 진화하는 AWS 네트워크 보안 - 신은수 AWS 시큐리티 스페셜리스트 솔루션즈 아키텍트 :: AWS Summit Seoul 2021
더욱 진화하는 AWS 네트워크 보안 - 신은수 AWS 시큐리티 스페셜리스트 솔루션즈 아키텍트 :: AWS Summit Seoul 2021Amazon Web Services Korea
 
AWS セキュリティとコンプライアンス
AWS セキュリティとコンプライアンスAWS セキュリティとコンプライアンス
AWS セキュリティとコンプライアンスAmazon Web Services Japan
 
AWS 기반 대규모 트래픽 견디기 - 장준엽 (구로디지털 모임) :: AWS Community Day 2017
AWS 기반 대규모 트래픽 견디기 - 장준엽 (구로디지털 모임) :: AWS Community Day 2017AWS 기반 대규모 트래픽 견디기 - 장준엽 (구로디지털 모임) :: AWS Community Day 2017
AWS 기반 대규모 트래픽 견디기 - 장준엽 (구로디지털 모임) :: AWS Community Day 2017AWSKRUG - AWS한국사용자모임
 
Elastic Load Balancing Deep Dive - AWS Online Tech Talk
Elastic  Load Balancing Deep Dive - AWS Online Tech TalkElastic  Load Balancing Deep Dive - AWS Online Tech Talk
Elastic Load Balancing Deep Dive - AWS Online Tech TalkAmazon Web Services
 
AWS Summit Seoul 2023 | SOCAR는 어떻게 2만대의 차량을 운영할까?: IoT Data의 수집부터 분석까지
AWS Summit Seoul 2023 | SOCAR는 어떻게 2만대의 차량을 운영할까?: IoT Data의 수집부터 분석까지AWS Summit Seoul 2023 | SOCAR는 어떻게 2만대의 차량을 운영할까?: IoT Data의 수집부터 분석까지
AWS Summit Seoul 2023 | SOCAR는 어떻게 2만대의 차량을 운영할까?: IoT Data의 수집부터 분석까지Amazon Web Services Korea
 
Deep Dive on Amazon EC2 Systems Manager
Deep Dive on Amazon EC2 Systems ManagerDeep Dive on Amazon EC2 Systems Manager
Deep Dive on Amazon EC2 Systems ManagerAmazon Web Services
 

What's hot (20)

(CMP201) All You Need To Know About Auto Scaling
(CMP201) All You Need To Know About Auto Scaling(CMP201) All You Need To Know About Auto Scaling
(CMP201) All You Need To Know About Auto Scaling
 
Introduction to Amazon EKS
Introduction to Amazon EKSIntroduction to Amazon EKS
Introduction to Amazon EKS
 
AWS Lambda and Serverless Cloud
AWS Lambda and Serverless CloudAWS Lambda and Serverless Cloud
AWS Lambda and Serverless Cloud
 
AWS 네트워크 보안을 위한 계층별 보안 구성 모범 사례 – 조이정, AWS 솔루션즈 아키텍트:: AWS 온라인 이벤트 – 클라우드 보안 특집
AWS 네트워크 보안을 위한 계층별 보안 구성 모범 사례 – 조이정, AWS 솔루션즈 아키텍트:: AWS 온라인 이벤트 – 클라우드 보안 특집AWS 네트워크 보안을 위한 계층별 보안 구성 모범 사례 – 조이정, AWS 솔루션즈 아키텍트:: AWS 온라인 이벤트 – 클라우드 보안 특집
AWS 네트워크 보안을 위한 계층별 보안 구성 모범 사례 – 조이정, AWS 솔루션즈 아키텍트:: AWS 온라인 이벤트 – 클라우드 보안 특집
 
How AWS Minimizes the Blast Radius of Failures (ARC338) - AWS re:Invent 2018
How AWS Minimizes the Blast Radius of Failures (ARC338) - AWS re:Invent 2018How AWS Minimizes the Blast Radius of Failures (ARC338) - AWS re:Invent 2018
How AWS Minimizes the Blast Radius of Failures (ARC338) - AWS re:Invent 2018
 
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
 
AWS 기반의 마이크로 서비스 아키텍쳐 구현 방안 :: 김필중 :: AWS Summit Seoul 20
AWS 기반의 마이크로 서비스 아키텍쳐 구현 방안 :: 김필중 :: AWS Summit Seoul 20AWS 기반의 마이크로 서비스 아키텍쳐 구현 방안 :: 김필중 :: AWS Summit Seoul 20
AWS 기반의 마이크로 서비스 아키텍쳐 구현 방안 :: 김필중 :: AWS Summit Seoul 20
 
Well-Architected Bootcamp
Well-Architected BootcampWell-Architected Bootcamp
Well-Architected Bootcamp
 
Deploy and Govern at Scale with AWS Control Tower
Deploy and Govern at Scale with AWS Control TowerDeploy and Govern at Scale with AWS Control Tower
Deploy and Govern at Scale with AWS Control Tower
 
NET304_Deep Dive into the New Network Load Balancer
NET304_Deep Dive into the New Network Load BalancerNET304_Deep Dive into the New Network Load Balancer
NET304_Deep Dive into the New Network Load Balancer
 
[Games on AWS 2019] AWS 사용자를 위한 만랩 달성 트랙 | AWS에서 분산 서비스 거부 공격(DDoS)을 고민하지 않는 ...
[Games on AWS 2019] AWS 사용자를 위한 만랩 달성 트랙 | AWS에서 분산 서비스 거부 공격(DDoS)을 고민하지 않는 ...[Games on AWS 2019] AWS 사용자를 위한 만랩 달성 트랙 | AWS에서 분산 서비스 거부 공격(DDoS)을 고민하지 않는 ...
[Games on AWS 2019] AWS 사용자를 위한 만랩 달성 트랙 | AWS에서 분산 서비스 거부 공격(DDoS)을 고민하지 않는 ...
 
Deep dive ECS & Fargate Deep Dive
Deep dive ECS & Fargate Deep DiveDeep dive ECS & Fargate Deep Dive
Deep dive ECS & Fargate Deep Dive
 
Application & Account Monitoring in AWS
Application & Account Monitoring in AWSApplication & Account Monitoring in AWS
Application & Account Monitoring in AWS
 
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
 
더욱 진화하는 AWS 네트워크 보안 - 신은수 AWS 시큐리티 스페셜리스트 솔루션즈 아키텍트 :: AWS Summit Seoul 2021
더욱 진화하는 AWS 네트워크 보안 - 신은수 AWS 시큐리티 스페셜리스트 솔루션즈 아키텍트 :: AWS Summit Seoul 2021더욱 진화하는 AWS 네트워크 보안 - 신은수 AWS 시큐리티 스페셜리스트 솔루션즈 아키텍트 :: AWS Summit Seoul 2021
더욱 진화하는 AWS 네트워크 보안 - 신은수 AWS 시큐리티 스페셜리스트 솔루션즈 아키텍트 :: AWS Summit Seoul 2021
 
AWS セキュリティとコンプライアンス
AWS セキュリティとコンプライアンスAWS セキュリティとコンプライアンス
AWS セキュリティとコンプライアンス
 
AWS 기반 대규모 트래픽 견디기 - 장준엽 (구로디지털 모임) :: AWS Community Day 2017
AWS 기반 대규모 트래픽 견디기 - 장준엽 (구로디지털 모임) :: AWS Community Day 2017AWS 기반 대규모 트래픽 견디기 - 장준엽 (구로디지털 모임) :: AWS Community Day 2017
AWS 기반 대규모 트래픽 견디기 - 장준엽 (구로디지털 모임) :: AWS Community Day 2017
 
Elastic Load Balancing Deep Dive - AWS Online Tech Talk
Elastic  Load Balancing Deep Dive - AWS Online Tech TalkElastic  Load Balancing Deep Dive - AWS Online Tech Talk
Elastic Load Balancing Deep Dive - AWS Online Tech Talk
 
AWS Summit Seoul 2023 | SOCAR는 어떻게 2만대의 차량을 운영할까?: IoT Data의 수집부터 분석까지
AWS Summit Seoul 2023 | SOCAR는 어떻게 2만대의 차량을 운영할까?: IoT Data의 수집부터 분석까지AWS Summit Seoul 2023 | SOCAR는 어떻게 2만대의 차량을 운영할까?: IoT Data의 수집부터 분석까지
AWS Summit Seoul 2023 | SOCAR는 어떻게 2만대의 차량을 운영할까?: IoT Data의 수집부터 분석까지
 
Deep Dive on Amazon EC2 Systems Manager
Deep Dive on Amazon EC2 Systems ManagerDeep Dive on Amazon EC2 Systems Manager
Deep Dive on Amazon EC2 Systems Manager
 

Similar to Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AWS re:Invent 2018

Rodney Lester: Well-Architected - Reliability Instructor Led Lab.pdf
Rodney Lester: Well-Architected - Reliability Instructor Led Lab.pdfRodney Lester: Well-Architected - Reliability Instructor Led Lab.pdf
Rodney Lester: Well-Architected - Reliability Instructor Led Lab.pdfAmazon Web Services
 
2019 03-13-implementing microservices by ddd
2019 03-13-implementing microservices by ddd2019 03-13-implementing microservices by ddd
2019 03-13-implementing microservices by dddKim Kao
 
Implementing Microservices by DDD
Implementing Microservices by DDDImplementing Microservices by DDD
Implementing Microservices by DDDAmazon Web Services
 
Introduction to Serverless on AWS - Builders Day Jerusalem
Introduction to Serverless on AWS - Builders Day JerusalemIntroduction to Serverless on AWS - Builders Day Jerusalem
Introduction to Serverless on AWS - Builders Day JerusalemAmazon Web Services
 
Data Design and Modeling for Microservices I AWS Dev Day 2018
Data Design and Modeling for Microservices I AWS Dev Day 2018Data Design and Modeling for Microservices I AWS Dev Day 2018
Data Design and Modeling for Microservices I AWS Dev Day 2018AWS Germany
 
From Monolith to Modern Apps: Best Practices (SRV322-R2) - AWS re:Invent 2018
From Monolith to Modern Apps: Best Practices (SRV322-R2) - AWS re:Invent 2018From Monolith to Modern Apps: Best Practices (SRV322-R2) - AWS re:Invent 2018
From Monolith to Modern Apps: Best Practices (SRV322-R2) - AWS re:Invent 2018Amazon Web Services
 
Getting Started with Serverless Architectures with Microservices_AWSPSSummit_...
Getting Started with Serverless Architectures with Microservices_AWSPSSummit_...Getting Started with Serverless Architectures with Microservices_AWSPSSummit_...
Getting Started with Serverless Architectures with Microservices_AWSPSSummit_...Amazon Web Services
 
Advanced Deployment Best Practices with AWS CodeDeploy (DEV404-R2) - AWS re:I...
Advanced Deployment Best Practices with AWS CodeDeploy (DEV404-R2) - AWS re:I...Advanced Deployment Best Practices with AWS CodeDeploy (DEV404-R2) - AWS re:I...
Advanced Deployment Best Practices with AWS CodeDeploy (DEV404-R2) - AWS re:I...Amazon Web Services
 
From Monolithic to Modern Apps: Best Practices
From Monolithic to Modern Apps: Best PracticesFrom Monolithic to Modern Apps: Best Practices
From Monolithic to Modern Apps: Best PracticesTom Laszewski
 
Microservices: Data & Design - Miguel Cervantes
Microservices: Data & Design - Miguel CervantesMicroservices: Data & Design - Miguel Cervantes
Microservices: Data & Design - Miguel CervantesAmazon Web Services
 
Remove Undifferentiated Heavy Lifting from CI/CD Toolsets with Corteva Agrisc...
Remove Undifferentiated Heavy Lifting from CI/CD Toolsets with Corteva Agrisc...Remove Undifferentiated Heavy Lifting from CI/CD Toolsets with Corteva Agrisc...
Remove Undifferentiated Heavy Lifting from CI/CD Toolsets with Corteva Agrisc...Amazon Web Services
 
Microservices & Data Design: Database Week SF
Microservices & Data Design: Database Week SFMicroservices & Data Design: Database Week SF
Microservices & Data Design: Database Week SFAmazon Web Services
 
Microservices and Data Design
Microservices and Data DesignMicroservices and Data Design
Microservices and Data DesignAWS Germany
 
Microservices & Data Design: Database Week San Francisco
Microservices & Data Design: Database Week San FranciscoMicroservices & Data Design: Database Week San Francisco
Microservices & Data Design: Database Week San FranciscoAmazon Web Services
 
Coordinating Microservices with AWS Step Functions.pdf
Coordinating Microservices with AWS Step Functions.pdfCoordinating Microservices with AWS Step Functions.pdf
Coordinating Microservices with AWS Step Functions.pdfAmazon Web Services
 
Hybrid Cloud Customer Use Cases on AWS
Hybrid Cloud Customer Use Cases on AWSHybrid Cloud Customer Use Cases on AWS
Hybrid Cloud Customer Use Cases on AWSTom Laszewski
 
Resiliency Testing: Verify That Your System Is as Reliable as You Think (ARC4...
Resiliency Testing: Verify That Your System Is as Reliable as You Think (ARC4...Resiliency Testing: Verify That Your System Is as Reliable as You Think (ARC4...
Resiliency Testing: Verify That Your System Is as Reliable as You Think (ARC4...Amazon Web Services
 

Similar to Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AWS re:Invent 2018 (20)

Rodney Lester: Well-Architected - Reliability Instructor Led Lab.pdf
Rodney Lester: Well-Architected - Reliability Instructor Led Lab.pdfRodney Lester: Well-Architected - Reliability Instructor Led Lab.pdf
Rodney Lester: Well-Architected - Reliability Instructor Led Lab.pdf
 
2019 03-13-implementing microservices by ddd
2019 03-13-implementing microservices by ddd2019 03-13-implementing microservices by ddd
2019 03-13-implementing microservices by ddd
 
Implementing Microservices by DDD
Implementing Microservices by DDDImplementing Microservices by DDD
Implementing Microservices by DDD
 
Introduction to Serverless on AWS - Builders Day Jerusalem
Introduction to Serverless on AWS - Builders Day JerusalemIntroduction to Serverless on AWS - Builders Day Jerusalem
Introduction to Serverless on AWS - Builders Day Jerusalem
 
Data Design and Modeling for Microservices I AWS Dev Day 2018
Data Design and Modeling for Microservices I AWS Dev Day 2018Data Design and Modeling for Microservices I AWS Dev Day 2018
Data Design and Modeling for Microservices I AWS Dev Day 2018
 
From Monolith to Modern Apps: Best Practices (SRV322-R2) - AWS re:Invent 2018
From Monolith to Modern Apps: Best Practices (SRV322-R2) - AWS re:Invent 2018From Monolith to Modern Apps: Best Practices (SRV322-R2) - AWS re:Invent 2018
From Monolith to Modern Apps: Best Practices (SRV322-R2) - AWS re:Invent 2018
 
Getting Started with Serverless Architectures with Microservices_AWSPSSummit_...
Getting Started with Serverless Architectures with Microservices_AWSPSSummit_...Getting Started with Serverless Architectures with Microservices_AWSPSSummit_...
Getting Started with Serverless Architectures with Microservices_AWSPSSummit_...
 
Advanced Deployment Best Practices with AWS CodeDeploy (DEV404-R2) - AWS re:I...
Advanced Deployment Best Practices with AWS CodeDeploy (DEV404-R2) - AWS re:I...Advanced Deployment Best Practices with AWS CodeDeploy (DEV404-R2) - AWS re:I...
Advanced Deployment Best Practices with AWS CodeDeploy (DEV404-R2) - AWS re:I...
 
Microservices & Data Design
Microservices & Data DesignMicroservices & Data Design
Microservices & Data Design
 
From Monolithic to Modern Apps: Best Practices
From Monolithic to Modern Apps: Best PracticesFrom Monolithic to Modern Apps: Best Practices
From Monolithic to Modern Apps: Best Practices
 
Microservices: Data & Design - Miguel Cervantes
Microservices: Data & Design - Miguel CervantesMicroservices: Data & Design - Miguel Cervantes
Microservices: Data & Design - Miguel Cervantes
 
Remove Undifferentiated Heavy Lifting from CI/CD Toolsets with Corteva Agrisc...
Remove Undifferentiated Heavy Lifting from CI/CD Toolsets with Corteva Agrisc...Remove Undifferentiated Heavy Lifting from CI/CD Toolsets with Corteva Agrisc...
Remove Undifferentiated Heavy Lifting from CI/CD Toolsets with Corteva Agrisc...
 
Microservices & Data Design: Database Week SF
Microservices & Data Design: Database Week SFMicroservices & Data Design: Database Week SF
Microservices & Data Design: Database Week SF
 
Microservices and Data Design
Microservices and Data DesignMicroservices and Data Design
Microservices and Data Design
 
Microservices & Data Design: Database Week San Francisco
Microservices & Data Design: Database Week San FranciscoMicroservices & Data Design: Database Week San Francisco
Microservices & Data Design: Database Week San Francisco
 
Coordinating Microservices with AWS Step Functions.pdf
Coordinating Microservices with AWS Step Functions.pdfCoordinating Microservices with AWS Step Functions.pdf
Coordinating Microservices with AWS Step Functions.pdf
 
Breaking Down the 'Monowhat'
Breaking Down the 'Monowhat'Breaking Down the 'Monowhat'
Breaking Down the 'Monowhat'
 
AWS Well-Architected Workshop
AWS Well-Architected WorkshopAWS Well-Architected Workshop
AWS Well-Architected Workshop
 
Hybrid Cloud Customer Use Cases on AWS
Hybrid Cloud Customer Use Cases on AWSHybrid Cloud Customer Use Cases on AWS
Hybrid Cloud Customer Use Cases on AWS
 
Resiliency Testing: Verify That Your System Is as Reliable as You Think (ARC4...
Resiliency Testing: Verify That Your System Is as Reliable as You Think (ARC4...Resiliency Testing: Verify That Your System Is as Reliable as You Think (ARC4...
Resiliency Testing: Verify That Your System Is as Reliable as You Think (ARC4...
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AWS re:Invent 2018

  • 2. Reliability of the Cloud: How AWS Achieves High Availability (ARC317)
      • Rodney Lester, Reliability Lead, AWS Well-Architected
      • Shaun Ray, Manager, AWS Evangelism
  • 3. Agenda
      • Well-Architected Reliability Pillar
      • Once upon a time … (stories)
      • Availability design goals
  • 4. Breakout repeats
      • Tuesday, November 27: ARC317-R [REPEAT] Reliability of the Cloud: How AWS Achieves High Availability, 3:15 p.m. – 4:15 p.m. | Aria East, Level 1, Joshua 4
      • Thursday, November 29: ARC317-R [REPEAT 1] Reliability of the Cloud: How AWS Achieves High Availability, 11:30 a.m. – 12:30 p.m. | Mirage, Antigua A
  • 5. Related breakouts
      • Wednesday, November 28: ARC335-R1 Failing Successfully in the Cloud: AWS Approach to Resilient Design, 12:15 p.m. – 1:15 p.m. | Aria East, Level 2, Mariposa 8
      • Thursday, November 29: ARC335-R2 Failing Successfully in the Cloud: AWS Approach to Resilient Design, 4:00 p.m. – 5:00 p.m. | MGM, Level 3, South Concourse 302
      • Wednesday, November 28: ARC408 Under the Hood of Route 53, 11:30 a.m. – 12:30 p.m. | Venetian, Level 4, Lando 4305
  • 7. AWS Well-Architected Reliability Pillar
      • Completely refreshed December 2017
      • Additional changes approximately every three months
      • The plan is to make it more dynamic in the future; a new version will be released soon
      • Significant changes
          • Calculating availability
          • Application design primer
          • Examples, at different design goals
      • Appendix contains design goals of 37 AWS services
          • More are added in each revision, and this will continue
      • These concepts are used to develop services
      • https://aws.amazon.com/well-architected/
  • 8. AWS uses the information in this white paper
  • 9. How does this relate to how AWS builds services?
      • This document was written in consultation with AWS principal engineers
      • The techniques described are quite proven
      • All of the techniques described have articles or books written about them
  • 10. Ops meetings
      • David Lubell and Kevin Miller conducted a chalk talk in 2017 on how we run our ops meeting
      • Review critical services every week in a two-hour meeting
          • Charlie Bell (SVP, AWS Operations) leads the meeting
          • Senior leaders of the services
          • Representation from every AWS service
      • Service metrics reviews
          • 130+ services * 10 min/service = 22-hr meeting?
          • How do we ensure all services are ready every week?
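The arithmetic behind the "22-hr meeting" question can be checked quickly. This is a minimal sketch that uses only the figures on the slide (130 services, 10 minutes each, a two-hour meeting); the variable names are illustrative.

```python
services = 130             # "130+ services" from the slide
minutes_per_service = 10   # review time per service from the slide
meeting_minutes = 2 * 60   # the weekly two-hour meeting

# Time needed to review every service every week.
total_hours = services * minutes_per_service / 60

# Upper bound on full reviews that fit into one meeting.
reviewable = meeting_minutes // minutes_per_service

print(f"Reviewing everything: {total_hours:.1f} hours")
print(f"Reviews that fit in one meeting: {reviewable}")
```

Under these figures, at most about a dozen services can be reviewed in a given week, which is why the slide asks how every service can still be kept ready.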
  • 11. Service review
      • Now open source
      • http://bit.ly/aws-wheel
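The slide does not say how the tool picks a service for review. One plausible approach, sketched here purely as an assumption, is a weighted random pick that makes recently chosen services less likely, so that every team has to arrive prepared. `Wheel`, `spin`, and the weighting rule are hypothetical names and choices, not the implementation behind the link.

```python
import random


class Wheel:
    """Hypothetical weighted picker: recently chosen services become less likely."""

    def __init__(self, services, seed=None):
        self.weights = {name: 1.0 for name in services}
        self.rng = random.Random(seed)

    def spin(self):
        names = list(self.weights)
        chosen = self.rng.choices(
            names, weights=[self.weights[n] for n in names]
        )[0]
        # Every service gains weight, then the chosen one drops to a low value.
        for name in names:
            self.weights[name] += 1.0
        self.weights[chosen] = 0.1
        return chosen


wheel = Wheel(["service-a", "service-b", "service-c"], seed=7)
picks = [wheel.spin() for _ in range(5)]
print(picks)
```

Because no weight ever reaches zero, a service that was reviewed last week can still be picked again, so no team can safely skip preparation.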
  • 13. The things that happen once in a million happen all the time in AWS
      • Some commonly observed problems:
          • Our back end service was having no problems, now it’s overloaded
          • An occasional huge spike in traffic that quickly disappears causes problems
          • Average response time to requests is slowly creeping up, but the p99 is exponential
          • Observe a rise in failed requests: “The service/region is failing”
          • Experienced a failure; on recovery, we’re receiving duplicate requests that are all errors
          • Cannot adapt fast enough to the huge changes in demand up or down
          • Dependency on a less reliable system
          • No problems until a system that was dependent on us went down, then we went down
          • Couldn’t get capacity quickly enough when a location went down
  • 14. Common causes of such problems (cont.)
      • Our back end service was having no problems, now it’s overloaded
          • Someone deployed a service that uses our service and the requests are much more than planned/expected
          • Someone in marketing is running a campaign and didn’t tell us; our service is not alone
          • A bug exists that causes repeated requests to our service, either a new deployment, or a latent bug
      • We see an occasional huge spike in traffic that quickly disappears
          • Some kind of edge case exists where things go normally, then under a condition, some kind of rebuilding of a data model happens
          • Someone in marketing is running a campaign and didn’t tell us; our service is not alone
          • A bug exists that causes repeated requests to our service, either a new deployment, or a latent bug
  • 15. Common causes of such problems (cont.)
      • Average response time to requests is slowly creeping up, but the p99 is exponential
          • This can be an indicator of impending problems
          • There is a use case that executes a different path, either on your service, or a dependency
      • Observe a rise in failed requests: “The service/region is failing”
          • There may be an event (known internally as a Large Scale Event) occurring
          • Maybe a transient problem
          • Can often be better to wait it out rather than fail over
      • Experienced a failure; on recovery, we’re receiving duplicate requests that are all errors
          • Even if you are not distributed, it is possible that the invoking service has no idea you were successful in processing some requests
          • Idempotency tokens can be used
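The last point, idempotency tokens, can be illustrated with a small sketch. It assumes the caller generates a token once per logical request and reuses it on every retry; `PaymentService` and `charge` are hypothetical names for illustration, and a real service would keep the token table in durable storage rather than in memory.

```python
import uuid


class PaymentService:
    """Hypothetical service that remembers the result for each idempotency token."""

    def __init__(self):
        self._results = {}  # token -> result of the first successful attempt
        self.charges = 0    # how many times the side effect really ran

    def charge(self, token, amount):
        if token in self._results:
            # Duplicate delivery: return the stored result, do not charge again.
            return self._results[token]
        self.charges += 1
        result = {"charge_id": self.charges, "amount": amount}
        self._results[token] = result
        return result


service = PaymentService()
token = str(uuid.uuid4())          # generated once by the caller, reused on retry
first = service.charge(token, 25)
retry = service.charge(token, 25)  # the caller never saw the first response
print(first == retry, service.charges)
```

The retry returns the original result instead of an error, and the side effect runs only once.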
  • 16. Common causes of such problems (cont.)
      • Cannot adapt fast enough to the huge changes in demand up or down
          • Need good communication paths with business drivers of traffic
          • You can have the system constantly performing tasks that are replaced by requests from consumers of your service
      • Dependency on a less reliable system
          • Can turn this into a soft dependency if you can find an acceptable replacement state
          • This usually needs to be negotiated with the product owners
      • No problems until a system that was dependent on us went down, then we went down
          • Commonly known as a cascading failure
          • Not always a failure (see previous examples of spiky traffic)
          • Example of “bi-modal behavior”
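The idea of a system that is "constantly performing tasks that are replaced by requests" can be sketched as a worker that does a fixed amount of work per cycle and pads with filler when real demand is low. This is an illustrative assumption about how such a design might look; `BATCH_SIZE`, `run_cycle`, and the filler task are hypothetical.

```python
from collections import deque

BATCH_SIZE = 4  # hypothetical fixed amount of work per cycle


def run_cycle(queue):
    """Process exactly BATCH_SIZE items, padding with filler work when idle."""
    batch = []
    for _ in range(BATCH_SIZE):
        if queue:
            batch.append(queue.popleft())  # a real request replaces filler work
        else:
            batch.append("filler")         # synthetic task keeps the load constant
    return batch


requests = deque(["r1", "r2", "r3", "r4", "r5", "r6"])
cycles = [run_cycle(requests) for _ in range(3)]
for batch in cycles:
    print(batch)
```

Every cycle costs the same whether the queue is full or empty, so a swing in demand changes what the work is rather than how much of it the system performs.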
  • 17. Common causes of such problems (cont.)
      • Couldn’t get capacity quickly enough when a location went down
          • Pilot light or running at high utilization can cause a brownout when failure occurs
          • Need to be able to take a loss of a location and service the traffic immediately
  • 19. Service Design Goals
      • Not SLAs
      • Managed to in the weekly ops meeting
      • Currently documented for 37 services
          • Adding more as I work with services to establish them
      • Control Plane versus Data Plane
          • Control plane mutates resources (bi-modal!) and data plane is the “day job”
          • Control plane is often more “dangerous” and therefore less available (not always!)
  • 21. Thank you!
      • Rodney Lester, rodneyle@amazon.com
      • Shaun Ray, shaunray@amazon.com
  • 24. Software/implementation has an impact on availability
      • Throttling
          • Protect your service by refusing requests when out of capacity
      • Exponential back off for retries
          • This is an art and a science; built into the AWS SDKs
      • Fail fast
          • Users will retry on failure, so this can allow your system to recover faster
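The three patterns on this slide fit together: the server throttles and rejects quickly, and the client retries with growing, randomized delays. The sketch below is one way to express that, not the algorithm used by the AWS SDKs. The token-bucket throttle, the "full jitter" variant of backoff, and all names and default values are assumptions chosen for illustration.

```python
import random
import time


class TokenBucket:
    """Throttle: admit a request only if a token is available."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # fail fast: reject immediately instead of queueing


def backoff_delays(attempts, base=0.1, cap=5.0, rng=random):
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**n)]."""
    return [rng.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]


def call_with_retries(operation, attempts=5, sleep=time.sleep):
    """Call operation until it returns True or the attempts run out."""
    for delay in backoff_delays(attempts):
        if operation():
            return True
        sleep(delay)  # wait before the next attempt
    return False


# A frozen clock means no refill, so the third request is throttled.
bucket = TokenBucket(rate_per_sec=1, capacity=2, clock=lambda: 0.0)
results = [bucket.allow() for _ in range(3)]
print(results)
```

The jitter spreads retries out in time, so clients that failed together do not all come back at the same instant and overload the recovering service again.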
  • 25. More advanced implementation patterns
      • Idempotency
          • You have a choice: “at most once” semantics, or “at least once.” Choose the latter.
      • Constant work
          • If you have a system that is always performing work, and you replace that work with user requests, you have a system that is much more predictable
          • Colm MacCarthaigh has a tweet thread on this: https://twitter.com/colmmacc/status/1039228121327648768
      • Circuit breaker
          • Can be used to remove hard dependencies in your availability calculation
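A circuit breaker turns a hard dependency into a soft one by returning a fallback while the dependency is failing. The sketch below is a minimal illustration under assumed thresholds; `CircuitBreaker`, its parameters, and the "cached default" fallback are hypothetical, and the fallback value must be something the product owners accept.

```python
import time


class CircuitBreaker:
    """Minimal breaker: open after repeated failures, retry after a cool-down."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, operation, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()      # open: skip the dependency entirely
            self.opened_at = None      # half-open: allow one trial call
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0              # success closes the breaker
        return result


calls = []  # records each real call made to the dependency


def flaky_dependency():
    calls.append(1)
    raise ConnectionError("dependency unavailable")


breaker = CircuitBreaker(failure_threshold=2, reset_after=60.0)
answers = [breaker.call(flaky_dependency, lambda: "cached default") for _ in range(5)]
print(answers, len(calls))
```

After two failures the breaker opens, and the remaining three requests are answered from the fallback without touching the failing dependency, which also gives that dependency room to recover.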
  • 26. Bi-modal behavior and static stability
      • Cascading failures are often from “bi-modal” behavior
          • I’ve seen this often: an anomaly causes a huge change in the system
      • Static stability
          • On loss of capacity, you want to be able to handle your current load with no need to acquire resources
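Static stability has a direct capacity consequence: each location must already run enough instances for the surviving locations to carry the full load. A minimal sketch of that sizing follows. The function name, the load figures, and the assumption of evenly balanced traffic across locations are illustrative, not taken from the talk.

```python
import math


def instances_per_location(peak_load, per_instance_capacity, locations,
                           tolerated_losses=1):
    """Instances to run in each location so the survivors can carry peak load."""
    surviving = locations - tolerated_losses
    if surviving < 1:
        raise ValueError("need at least one surviving location")
    needed_total = math.ceil(peak_load / per_instance_capacity)
    return math.ceil(needed_total / surviving)


# Hypothetical workload: 9000 requests/s, 500 requests/s per instance, 3 locations.
per_location = instances_per_location(
    peak_load=9000, per_instance_capacity=500, locations=3
)
total = per_location * 3
print(per_location, total)
```

In this example 18 instances would carry the load, but 27 are provisioned (9 per location), so losing any one location leaves 18 running and nothing has to be acquired during the failure.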
  • 27. It’s a danger to stay on old versions of operating systems, frameworks, or third-party software
      • More than just operating systems
          • Operating systems
          • Frameworks like Spring, Angular, and more
          • Other third-party software like libraries
      • Ensure you keep up to date
          • Can be more than an availability concern: Equifax had an old version of Struts that exposed their customer data
          • This is part of the corporate-wide topics communicated in the Ops meetings