SlideShare una empresa de Scribd logo
1 de 113
Descargar para leer sin conexión
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 11/30/2016
Design Patterns for High Availability
Lessons learned building Amazon CloudFront
CTD303
Alex Smith
Head of M&E Architecture, APAC
Amazon Web Services
Harvo Jones
Sr. Software Development Engineer
Amazon CloudFront
What to Expect from the Session
• Learn about the design patterns for high availability of
Amazon CloudFront
• Learn how you can implement the patterns in your own
services or applications built on top of AWS
What is a Content Delivery
Network (CDN)?
Amazon CloudFront locations worldwide
North America South America EMEA APAC
POPs
Cities
Countries
Continents
AWS Region CloudFront Edge Location
How does Amazon CloudFront work?
AS-X
AS-Y
AS
16509
Network 1
Network 2
CloudFront POP
52.84.19.0/24
52.84.19.0/24
How does Amazon CloudFront work?
AS-X
AS-ZAS-Y Network 3
52.84.19.0/24
52.84.19.0/24
AS
16509
Network 1
Network 2
CloudFront POP
52.84.19.0/24
52.84.19.0/24
Sequence of events
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Sequence of events
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Sequence of events
Origin
3 HTTP
CF DNSResolver
1 DNS
CF POP
2 HTTP
How to monitor availability
How CloudFront monitors availability
1. Analysis of server-side metrics
How CloudFront monitors availability
1. Analysis of server-side metrics
2. Canaries
How CloudFront monitors availability
1. Analysis of server-side metrics
2. Canaries
3. Third-party global HTTP tests
How CloudFront monitors availability
1. Analysis of server-side metrics
2. Canaries
3. Third party global HTTP tests
Availability interruption
Segfault in CloudFront DNS server
27: int dns_get_domain_length(const char *domain) { /* <== domain is NULL */
28: const char *pos;
29: unsigned char ch;
30:
31: pos = domain;
32: while ((ch = *pos++) != 0) /* <== SEGFAULT */
33: pos += (unsigned int) ch;
34:
35: return pos – domain;
36: }
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Risks to availability
Software Bugs
Surges in Volume Data Corruption
Time Bombs Dependent Services
Rapid Iteration on Capabilities
• Access Logs
• Lower Pricing
Tiers
• Management
Console
• Private Content
• RTMP Streaming
• 1-Hour TTLs
• Custom Origins
• Default Root
Object
• HTTPS Support
• Invalidations
• Price Drop, SLA
• Private
Streaming
• RTMP Access
Logs
• Singapore
• New York City
• Jacksonville
• File Size Increase
• IAM Integration
• Live Streaming
• Price Drop
• Paris
• Stockholm
• Sao Paulo
• San Jose
• South Bend
• Los Angeles
(2nd)
• New York City
(2nd)
• Geo-Blocking
• Live Streaming
Update
• Multiple Cache
Behaviors
• Multiple Origins
• QueryString
Forwarding
• Zero TTL Support
• Osaka
• Milan
• Virginia (2nd)
• Singapore (2nd)
• Frankfurt (2nd)
• London (2nd)
• Dallas (2nd)
• Cookie Based
Caching
• Custom SSL
Support
• Enhanced Logs
• HTTP 1.1
• Lower Inter-
Region Pricing
• Price Classes
• Private Content
Console Support
• Zone Apex
Support
• Chennai &
Mumbai
• Seoul
• Hayward
• Madrid
• Sydney
• Paris (2nd)
• Amsterdam (2nd)
• Tokyo (2nd)
• Hong Kong (2nd)
• New York City
(3rd)
• Virginia (3rd)
• Smooth
Streaming
• SNI Custom SSL
• HTTP to HTTPS
Redirect
• Usage Charts
• Free Tier
• EDNS Client
Subnet
• CloudTrail
• Device Detection,
Geo Targeting,
Host Header
Forwarding
• Advanced SSL
Features
• Wildcard Cookies
• Monitoring &
Alarming
• Content Charts
• Price Reduction
• Zero Rate AWS
Origin Traffic
• Directory Path as
Origin
• Viewer Charts
• Rio de Janeiro
• Taipei
• Melbourne
2008 2009 2010 2011 2012 20142013
• CloudFront
Launched with
14 POPs
• Signed Cookie
Support
• Smart TV Device
Detection
• Devices Report
• Usage Metrics
CSV Export
• Wildcard
Invalidations
• Default TTLs
• Max TTLs
• PCI DSS
Compliance
• AWS WAF
Integration
• Auto Gzip
Compression
• Modify Request
Headers
• Chicago
• Seoul (2nd)
• Enforce HTTPS
Connections
• TLSv1.1 and
TLSv1.2 to Origin
• AWS Certificate
Manager
Integration
• Cost Allocation
Tagging
• Query String
Whitelisting
• HTTP/2
• Toronto
• Montreal
• New Delhi
• Atlanta (2nd)
• Mumbai (2nd)
• Seoul (3rd)
• Frankfurt (2nd)
• Japan (4th)
• Hong Kong (3rd)
• London (4th)
• Berlin
• Minneapolis
• IPv6
• ACM Cert Import
• Regional Edge
Caches
• […]
2015 2016
Design patterns for availability
Design patterns for availability
Maximizing
availability with
Food Tasting
Flash Crowds
without scaling for
the peak
Defense in Depth
Strategies
Time Bomb Jitter
Protection
Design pattern 1: FoodTaster
Maximizing
availability with
Food Tasting
Flash Crowds
without scaling for
the peak
Defense in Depth
Strategies
Time Bomb Jitter
Protection
Risks to availability
Software Bugs
Surges in Volume Data Corruption
Time Bombs Dependent Services
FoodTaster
FoodTaster
Praegustator – DNS FoodTaster
DNS
name
server
Before Config
Daemons
Praegustator – DNS FoodTaster
DNS
name
server
DNS
name
server
Before
After
Orchestrator
1 2 3
DNS
name
server
50K
queries
Config
Daemons
Config
Daemons
$ ls
dns/ foodtaster/ landing/
$ ls foodtaster/
customer.data@ pop.data@
resolvers.data@ routing.data
$ ls dns/
customer.data pop.data
resolvers.data routing.data
Praegustator – DNS FoodTaster
Food Tasting in AWS
• Straightforward approach
• Deployments are
inexpensive, redeployments
more too
• Approaches like
CodeDeploy make life even
easier
Food Tasting in AWS
• Example - GeoIP Database
• Automatically pushed out to
every host
• Likely already have checks
• More to consider than just
invalid user configuration
Food Tasting in AWS
• Simple is better (Works for
Amazon CloudFront!)
• Assume we already use a
deployment system (e.g.,
AWS CodeDeploy)
• Complete in-situ tests
before returning complete
hooks:
BeforeInstall:
- location: Scripts/UnzipResourceBundle.sh
- location: Scripts/UnzipDataBundle.sh
AfterInstall:
- location: Scripts/RunResourceTests.sh
timeout: 180
ApplicationStart:
- location: Scripts/RunFunctionalTests.sh
timeout: 3600
ValidateService:
- location: Scripts/MonitorService.sh
timeout: 3600
runas: codedeployuse
CodeDeploy AppSec example
hooks:
BeforeInstall:
- location: Scripts/UnzipResourceBundle.sh
- location: Scripts/UnzipDataBundle.sh
AfterInstall:
- location: Scripts/RunResourceTests.sh
timeout: 180
ApplicationStart:
- location: Scripts/RunFunctionalTests.sh
timeout: 3600
ValidateService:
- location: Scripts/MonitorService.sh
timeout: 3600
runas: codedeployuse
CodeDeploy AppSec example
Food Tasting in AWS
• Acts as a quality gate
• Roll-back can be automatic
• Verification never affects
user facing traffic
Design pattern 2: Flash Crowds
Maximizing
availability with
Food Tasting
Flash Crowds
without scaling for
the peak
Defense in Depth
Strategies
Time Bomb Jitter
Protection
Risks to availability
Software Bugs
Surges in Volume Data Corruption
Time Bombs Dependent Services
Load dispersion
HTTP Flash Crowds
1x -
15x -
Static & Dynamic Strategies
dest
addr
next hopsPacket-flow config
Feedback loop
Load Dispersal
1x -
2x -
Flash Crowds within AWS
Auto Scaling works really well
But.
Approx. 15 req/s
Approx. 130k req/s
45
Seconds
Approx. 15 req/s
Approx. 130k req/s
45
Seconds
Boo!
Problem: Flash Crowds
Auto Scaling works really well
But.
Sometimes, 60s is too long.
AWS solutions
• Plan Ahead
Planning ahead
Typical examples:
• TV programs
• Live events (sports)
• Game releases
• Established traffic patterns
Planning ahead
Typical examples:
• TV programs
• Live events (sports)
• Game releases
• Established traffic patterns
AWS Solutions:
Infrastructure Event
Management
Scheduled Auto Scaling
Group
Auto Scaling Integration
Schedules and the Big Red Button
• Integration of program
scheduling and Auto
Scaling
• Program scheduling
includes expected online
audience parameters
• Big red button
Schedules and the Big Red Button
• Integration of program
scheduling and Auto
Scaling
• Program scheduling
includes expected online
audience parameters
• Big red button
AWS solutions
• Plan Ahead
• Cache Things
Cache things
Cache things
I can’t!
My website is…
Cache things
I can’t!
My website is…
DYNAMIC!
Cache things
• What’s really dynamic?
• Newspapers
• Voting sites
• Forum sites
• Updates don’t need to be
more than once a second
Cache things (even for a second)
• 10000s req/s
• Assuming every Amazon
CloudFront POP
• 10000s req/s -> ~ 10s req/s
AWS solutions
• Plan Ahead
• Cache Things
• Serve only what you have to
Serve only what you have to
Serve only what you have to
Serve only what you have to
Serve only what you have to
• AJAX Includes
Serve only what you have to
• AJAX Includes
• CloudFront Cache Keys
Design pattern 3: Defense in Depth
Maximizing
availability with
Food Tasting
Flash Crowds
without scaling for
the peak
Defense in Depth
Strategies
Time Bomb Jitter
Protection
Risks to availability
Data Corruption
Software Bugs
Surges in Volume
Time Bombs Dependent Services
Cache
Multi-impl
Sharding
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Potential causes of application failure
Origin
CF POP
CF DNSResolver
3 HTTP
2 HTTP
1 DNS
Ideas to avoid crashing
• Comprehensive test coverage
• Simplify systems
Failures are going to happen.
How can we survive them when they do?
Ideas to survive crashing
• Reduce blast radius
• Reject input that previously made you crash
Ideas to survive crashing
• Reduce blast radius
• shard customers to separate processes
• Reject input that previously made you crash
Ideas to survive crashing
• Reduce blast radius
• shard customers to separate processes
• recover quickly
• Reject input that previously made you crash
Ideas to survive crashing
• Reduce blast radius
• shard customers to separate processes
• recover quickly
• multiple implementations
• Reject input that previously made you crash
DNS Multi-Impl
Should this be a fallback system?
Place it in front of every DNS name server?
DNS Multi-Impl
Should this be a fallback system?
• We want to know it always works
Place it in front of every DNS name server?
DNS Multi-Impl
Should this be a fallback system?
• We want to know it always works
Place it in front of every DNS name server?
• Would trade point of failure one for another
DNS Stripes
DNS
Impl 2
Go
DNS
Impl 1
C
Stripe 1
(34 POPs)
Stripe 2
(34 POPs)
DNS Stripes
Stripe 1
(34 POPs)
Stripe 2
(34 POPs)
Math.abs() on the lowest 32-bit signed integer, -2^31, yields a negative number.
Leading to invalid DNS configuration.
CloudFront case study
Two's complement: flip bits and add 1.
Math.abs(Integer.MIN_VALUE):
-2^31 = 10000000 00000000 00000000 00000000
bits flipped = 01111111 11111111 11111111 11111111
+1 = 10000000 00000000 00000000 00000000 <= -2^31!
CloudFront case study
Protections in place:
1. Config process crashing in a loop
2. DNS name server defensively ignores an invalid index
3. FoodTaster crashes protect King DNS name server
4. DNS stripes reduce common failure paths
AWS solution
We can use a Load Balancer
with Layer-7 awareness
So we can have multiple
backend types across a fleet
AWS solution
Enabled through microservices
Reimplementing the entire
stack multiple times is a
difficult cost / efficiency
question
Specific high-risk services are
easier
AWS solution
Example: 2FA Platform
API (HTTP) front-end to a
back-end authentication
platform
Reasonably few SLOC
If it fails – no access to
platform
AWS solution
Example: 2FA Platform
API (HTTP) front-end to a
back-end authentication
platform
Reasonably few SLOC
If it fails – no access to
platform
Auto Scaling group Auto Scaling group
Elastic Load
Balancing
Python Java
Users
Design Pattern 4: Time Bomb Jitter Protection
Maximizing
availability with
Food Tasting
Flash Crowds
without scaling for
the peak
Defense in Depth
Strategies
Time Bomb Jitter
Protection
Risks to availability
Software Bugs
Surges in Volume Data Corruption
Time Bombs Dependent Services
Jitter
Problem: homogeneous platforms
The problem
Server
74656
Server
1701
Server
74205
The problem
Configuration Updates
The problem
Binary Patches
The problem
Pictures of Cats
The problem
Pictures of Cats
Traditionally..
• Instance level monitoring
system with alerts
• Human response
• Monitor both percentage fill,
and the fill rate
High availability matters
High availability matters
Recycling old instances is
unnecessary cost
Automated cleanups may be
too slow
Human operators almost
certainly too slow
The solution
The solution
AWS solution – within an Auto Scaling group
Amazon Linux AMI has
/var/run (and /var/tmp) on root
Ubuntu AMI uses tmpfs (10%
of RAM)
Jittering root volume size is
tricky within an ASG
AWS solution – within Auto Scaling
Consciously segregate your
files to a specified directory –
on a separate volume – with
jittering
Code will follow (check
SlideShare!)
Adds <10s to system startup
What else?
Time
• SSL Certificates
• Domain name registrations
• Deployment Schedules
• ???
Wrap up:
Lessons learned
Risks to availability
Software Bugs
Surges in Volume Data Corruption
Time Bombs Dependent Services
Risks & Common Patterns
Software Bugs
Dependent Services
Multi-impl
Sharding
FoodTaster
Cache
Redundancies
Surges in Volume
Load dispersion
Data Corruption
FoodTaster
Time Bombs
Jitter
Lessons learned
Maximizing
availability with
Food Tasting
Flash Crowds
without scaling for
the peak
Defense in Depth
Strategies
Time Bomb Jitter
Protection
Don’t rely on
validation in your
main application
Auto Scaling
Integration;
Caching; Selective
Serving
Homogeneity can
hurt
Implementation
striping can save
you
Lessons learned
Maximizing
availability with
Food Tasting
Flash Crowds
without scaling for
the peak
Defense in Depth
Strategies
Time Bomb Jitter
Protection
Don’t rely on
validation in your
main application
Auto Scaling
Integration;
Caching;
Selective Serving
Homogeneity can
hurt
Implementation
striping can save
you
Lessons learned
Maximizing
availability with
Food Tasting
Flash Crowds
without scaling for
the peak
Defense in Depth
Strategies
Time Bomb Jitter
Protection
Don’t rely on
validation in your
main application
Auto Scaling
Integration;
Caching; Selective
Serving
Homogeneity can
hurt
Implementation
striping can save
you
Lessons learned
Maximizing
availability with
Food Tasting
Flash Crowds
without scaling for
the peak
Defense in Depth
Strategies
Time Bomb Jitter
Protection
Don’t rely on
validation in your
main application
Auto Scaling
Integration;
Caching; Selective
Serving
Homogeneity can
hurt
Implementation
striping can save
you
Thank you!
Remember to complete
your evaluations!
Related Sessions
ARC309
Moving Mission Critical Apps from One Region to Multi-
Region active/active
ARC204
From Resilience to Ubiquity - #NetflixEverywhere
Global Architecture

Más contenido relacionado

La actualidad más candente

AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...
AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...
AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...Amazon Web Services
 
Getting started with Amazon ElastiCache
Getting started with Amazon ElastiCacheGetting started with Amazon ElastiCache
Getting started with Amazon ElastiCacheAmazon Web Services
 
Deep Dive on Elastic File System - February 2017 AWS Online Tech Talks
Deep Dive on Elastic File System - February 2017 AWS Online Tech TalksDeep Dive on Elastic File System - February 2017 AWS Online Tech Talks
Deep Dive on Elastic File System - February 2017 AWS Online Tech TalksAmazon Web Services
 
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...Amazon Web Services
 
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...Amazon Web Services
 
Deep Dive on Elastic Load Balancing
Deep Dive on Elastic Load BalancingDeep Dive on Elastic Load Balancing
Deep Dive on Elastic Load BalancingAmazon Web Services
 
Ceate a Scalable Cloud Architecture
Ceate a Scalable Cloud ArchitectureCeate a Scalable Cloud Architecture
Ceate a Scalable Cloud ArchitectureAmazon Web Services
 
AWS re:Invent 2016: Taking DevOps to the AWS Edge (CTD302)
AWS re:Invent 2016: Taking DevOps to the AWS Edge (CTD302)AWS re:Invent 2016: Taking DevOps to the AWS Edge (CTD302)
AWS re:Invent 2016: Taking DevOps to the AWS Edge (CTD302)Amazon Web Services
 
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...Amazon Web Services
 
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)Amazon Web Services
 
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)Amazon Web Services
 
The Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel AvivThe Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel AvivAmazon Web Services
 
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017 Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017 Amazon Web Services
 
AWS Webinar 201: Designing scalable, available & resilient cloud applications
AWS Webinar 201: Designing scalable, available & resilient cloud applicationsAWS Webinar 201: Designing scalable, available & resilient cloud applications
AWS Webinar 201: Designing scalable, available & resilient cloud applicationsAmazon Web Services
 
Strategic Uses for Cost Efficient Long-Term Cloud Storage
Strategic Uses for Cost Efficient Long-Term Cloud StorageStrategic Uses for Cost Efficient Long-Term Cloud Storage
Strategic Uses for Cost Efficient Long-Term Cloud StorageAmazon Web Services
 
SRV407 Deep Dive on Amazon Aurora
SRV407 Deep Dive on Amazon AuroraSRV407 Deep Dive on Amazon Aurora
SRV407 Deep Dive on Amazon AuroraAmazon Web Services
 
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)Amazon Web Services
 
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...Amazon Web Services
 
Simple, Scalable and Highly Durable NAS in the Cloud – Amazon EFS
Simple, Scalable and Highly Durable NAS in the Cloud – Amazon EFSSimple, Scalable and Highly Durable NAS in the Cloud – Amazon EFS
Simple, Scalable and Highly Durable NAS in the Cloud – Amazon EFSAmazon Web Services
 

La actualidad más candente (20)

AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...
AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...
AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R...
 
Getting started with Amazon ElastiCache
Getting started with Amazon ElastiCacheGetting started with Amazon ElastiCache
Getting started with Amazon ElastiCache
 
Deep Dive on Elastic File System - February 2017 AWS Online Tech Talks
Deep Dive on Elastic File System - February 2017 AWS Online Tech TalksDeep Dive on Elastic File System - February 2017 AWS Online Tech Talks
Deep Dive on Elastic File System - February 2017 AWS Online Tech Talks
 
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...
AWS re:Invent 2016: Optimizing Network Performance for Amazon EC2 Instances (...
 
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
 
Deep Dive on Elastic Load Balancing
Deep Dive on Elastic Load BalancingDeep Dive on Elastic Load Balancing
Deep Dive on Elastic Load Balancing
 
Ceate a Scalable Cloud Architecture
Ceate a Scalable Cloud ArchitectureCeate a Scalable Cloud Architecture
Ceate a Scalable Cloud Architecture
 
AWS re:Invent 2016: Taking DevOps to the AWS Edge (CTD302)
AWS re:Invent 2016: Taking DevOps to the AWS Edge (CTD302)AWS re:Invent 2016: Taking DevOps to the AWS Edge (CTD302)
AWS re:Invent 2016: Taking DevOps to the AWS Edge (CTD302)
 
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...
 
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
 
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)
AWS re:Invent 2016: Workshop: Migrating Microsoft Applications to AWS (ENT216)
 
The Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel AvivThe Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel Aviv
 
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017 Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
 
AWS Webinar 201: Designing scalable, available & resilient cloud applications
AWS Webinar 201: Designing scalable, available & resilient cloud applicationsAWS Webinar 201: Designing scalable, available & resilient cloud applications
AWS Webinar 201: Designing scalable, available & resilient cloud applications
 
Amazon Redshift Deep Dive
Amazon Redshift Deep Dive Amazon Redshift Deep Dive
Amazon Redshift Deep Dive
 
Strategic Uses for Cost Efficient Long-Term Cloud Storage
Strategic Uses for Cost Efficient Long-Term Cloud StorageStrategic Uses for Cost Efficient Long-Term Cloud Storage
Strategic Uses for Cost Efficient Long-Term Cloud Storage
 
SRV407 Deep Dive on Amazon Aurora
SRV407 Deep Dive on Amazon AuroraSRV407 Deep Dive on Amazon Aurora
SRV407 Deep Dive on Amazon Aurora
 
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
 
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
 
Simple, Scalable and Highly Durable NAS in the Cloud – Amazon EFS
Simple, Scalable and Highly Durable NAS in the Cloud – Amazon EFSSimple, Scalable and Highly Durable NAS in the Cloud – Amazon EFS
Simple, Scalable and Highly Durable NAS in the Cloud – Amazon EFS
 

Similar a AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazon CloudFront that You Can Replicate on AWS (CTD303)

Scaling to millions of users with Amazon CloudFront - April 2017 AWS Online T...
Scaling to millions of users with Amazon CloudFront - April 2017 AWS Online T...Scaling to millions of users with Amazon CloudFront - April 2017 AWS Online T...
Scaling to millions of users with Amazon CloudFront - April 2017 AWS Online T...Amazon Web Services
 
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...Adrian Cockcroft
 
AWS re:Invent 2016: Accelerating the Transition to Broadcast and OTT Infrastr...
AWS re:Invent 2016: Accelerating the Transition to Broadcast and OTT Infrastr...AWS re:Invent 2016: Accelerating the Transition to Broadcast and OTT Infrastr...
AWS re:Invent 2016: Accelerating the Transition to Broadcast and OTT Infrastr...Amazon Web Services
 
AWS re:Invent 2016: Discovery Channel's Broadcast Workflows and Channel Origi...
AWS re:Invent 2016: Discovery Channel's Broadcast Workflows and Channel Origi...AWS re:Invent 2016: Discovery Channel's Broadcast Workflows and Channel Origi...
AWS re:Invent 2016: Discovery Channel's Broadcast Workflows and Channel Origi...Amazon Web Services
 
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...Amazon Web Services
 
Journey Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million UsersJourney Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million UsersAdrian Hornsby
 
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013Amazon Web Services
 
Getting Started with Real-time Analytics
Getting Started with Real-time AnalyticsGetting Started with Real-time Analytics
Getting Started with Real-time AnalyticsAmazon Web Services
 
AWS Summit London 2014 | Dynamic Content Acceleration (300)
AWS Summit London 2014 | Dynamic Content Acceleration (300)AWS Summit London 2014 | Dynamic Content Acceleration (300)
AWS Summit London 2014 | Dynamic Content Acceleration (300)Amazon Web Services
 
Streaming data for real time analysis
Streaming data for real time analysisStreaming data for real time analysis
Streaming data for real time analysisAmazon Web Services
 
WIN401_Migrating Microsoft Applications to AWS
WIN401_Migrating Microsoft Applications to AWSWIN401_Migrating Microsoft Applications to AWS
WIN401_Migrating Microsoft Applications to AWSAmazon Web Services
 
AWS 201 - A Walk through the AWS Cloud: Introduction to Amazon CloudFront
AWS 201 - A Walk through the AWS Cloud: Introduction to Amazon CloudFrontAWS 201 - A Walk through the AWS Cloud: Introduction to Amazon CloudFront
AWS 201 - A Walk through the AWS Cloud: Introduction to Amazon CloudFrontAmazon Web Services
 
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)Amazon Web Services
 
Scale, baby, scale!
Scale, baby, scale!Scale, baby, scale!
Scale, baby, scale!Julien SIMON
 
Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301Amazon Web Services
 
AWS Summit 2013 | Auckland - Understanding AWS Storage Options
AWS Summit 2013 | Auckland - Understanding AWS Storage OptionsAWS Summit 2013 | Auckland - Understanding AWS Storage Options
AWS Summit 2013 | Auckland - Understanding AWS Storage OptionsAmazon Web Services
 

Similar a AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazon CloudFront that You Can Replicate on AWS (CTD303) (20)

Scaling to millions of users with Amazon CloudFront - April 2017 AWS Online T...
Scaling to millions of users with Amazon CloudFront - April 2017 AWS Online T...Scaling to millions of users with Amazon CloudFront - April 2017 AWS Online T...
Scaling to millions of users with Amazon CloudFront - April 2017 AWS Online T...
 
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
 
AWS re:Invent 2016: Accelerating the Transition to Broadcast and OTT Infrastr...
AWS re:Invent 2016: Accelerating the Transition to Broadcast and OTT Infrastr...AWS re:Invent 2016: Accelerating the Transition to Broadcast and OTT Infrastr...
AWS re:Invent 2016: Accelerating the Transition to Broadcast and OTT Infrastr...
 
Amazon CloudFront Complete with Blazeclan's Media Solution Stack
Amazon CloudFront Complete with Blazeclan's Media Solution StackAmazon CloudFront Complete with Blazeclan's Media Solution Stack
Amazon CloudFront Complete with Blazeclan's Media Solution Stack
 
AWS re:Invent 2016: Discovery Channel's Broadcast Workflows and Channel Origi...
AWS re:Invent 2016: Discovery Channel's Broadcast Workflows and Channel Origi...AWS re:Invent 2016: Discovery Channel's Broadcast Workflows and Channel Origi...
AWS re:Invent 2016: Discovery Channel's Broadcast Workflows and Channel Origi...
 
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
 
Journey Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million UsersJourney Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million Users
 
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013
 
Getting Started with Real-time Analytics
Getting Started with Real-time AnalyticsGetting Started with Real-time Analytics
Getting Started with Real-time Analytics
 
AWS Summit London 2014 | Dynamic Content Acceleration (300)
AWS Summit London 2014 | Dynamic Content Acceleration (300)AWS Summit London 2014 | Dynamic Content Acceleration (300)
AWS Summit London 2014 | Dynamic Content Acceleration (300)
 
Streaming data for real time analysis
Streaming data for real time analysisStreaming data for real time analysis
Streaming data for real time analysis
 
WIN401_Migrating Microsoft Applications to AWS
WIN401_Migrating Microsoft Applications to AWSWIN401_Migrating Microsoft Applications to AWS
WIN401_Migrating Microsoft Applications to AWS
 
AWS 201 - A Walk through the AWS Cloud: Introduction to Amazon CloudFront
AWS 201 - A Walk through the AWS Cloud: Introduction to Amazon CloudFrontAWS 201 - A Walk through the AWS Cloud: Introduction to Amazon CloudFront
AWS 201 - A Walk through the AWS Cloud: Introduction to Amazon CloudFront
 
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
 
Scale, baby, scale!
Scale, baby, scale!Scale, baby, scale!
Scale, baby, scale!
 
Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301
 
Best of re:Invent
Best of re:InventBest of re:Invent
Best of re:Invent
 
Cloud for Media - A Complete Solution Stack for Faster Cloud Adoption
Cloud for Media - A Complete Solution Stack for Faster Cloud AdoptionCloud for Media - A Complete Solution Stack for Faster Cloud Adoption
Cloud for Media - A Complete Solution Stack for Faster Cloud Adoption
 
AWS Summit 2013 | Auckland - Understanding AWS Storage Options
AWS Summit 2013 | Auckland - Understanding AWS Storage OptionsAWS Summit 2013 | Auckland - Understanding AWS Storage Options
AWS Summit 2013 | Auckland - Understanding AWS Storage Options
 
The Best of re:invent 2016
The Best of re:invent 2016The Best of re:invent 2016
The Best of re:invent 2016
 

Más de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Más de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Último

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 

Último (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazon CloudFront that You Can Replicate on AWS (CTD303)

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 11/30/2016 Design Patterns for High Availability Lessons learned building Amazon CloudFront CTD303 Alex Smith Head of M&E Architecture, APAC Amazon Web Services Harvo Jones Sr. Software Development Engineer Amazon CloudFront
  • 2. What to Expect from the Session • Learn about the design patterns for high availability of Amazon CloudFront • Learn how you can implement the patterns in your own services or applications built on top of AWS
  • 3. What is a Content Delivery Network (CDN)?
  • 4. Amazon CloudFront locations worldwide North America South America EMEA APAC POPs Cities Countries Continents AWS Region CloudFront Edge Location
  • 5. How does Amazon CloudFront work? AS-X AS-Y AS 16509 Network 1 Network 2 CloudFront POP 52.84.19.0/24 52.84.19.0/24
  • 6. How does Amazon CloudFront work? AS-X AS-ZAS-Y Network 3 52.84.19.0/24 52.84.19.0/24 AS 16509 Network 1 Network 2 CloudFront POP 52.84.19.0/24 52.84.19.0/24
  • 7. Sequence of events Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 8. Sequence of events Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 9. Sequence of events Origin 3 HTTP CF DNSResolver 1 DNS CF POP 2 HTTP
  • 10. How to monitor availability
  • 11. How CloudFront monitors availability 1. Analysis of server-side metrics
  • 12. How CloudFront monitors availability 1. Analysis of server-side metrics 2. Canaries
  • 13. How CloudFront monitors availability 1. Analysis of server-side metrics 2. Canaries 3. Third-party global HTTP tests
  • 14. How CloudFront monitors availability 1. Analysis of server-side metrics 2. Canaries 3. Third party global HTTP tests
  • 16. Segfault in CloudFront DNS server 27: int dns_get_domain_length(const char *domain) { /* <== domain is NULL */ 28: const char *pos; 29: unsigned char ch; 30: 31: pos = domain; 32: while ((ch = *pos++) != 0) /* <== SEGFAULT */ 33: pos += (unsigned int) ch; 34: 35: return pos – domain; 36: }
  • 17. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 18. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 19. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 20. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 21. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 22. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 23. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 24. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 25. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 26. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 27. Risks to availability Software Bugs Surges in Volume Data Corruption Time Bombs Dependent Services
  • 28. Rapid Iteration on Capabilities • Access Logs • Lower Pricing Tiers • Management Console • Private Content • RTMP Streaming • 1-Hour TTLs • Custom Origins • Default Root Object • HTTPS Support • Invalidations • Price Drop, SLA • Private Streaming • RTMP Access Logs • Singapore • New York City • Jacksonville • File Size Increase • IAM Integration • Live Streaming • Price Drop • Paris • Stockholm • Sao Paulo • San Jose • South Bend • Los Angeles (2nd) • New York City (2nd) • Geo-Blocking • Live Streaming Update • Multiple Cache Behaviors • Multiple Origins • QueryString Forwarding • Zero TTL Support • Osaka • Milan • Virginia (2nd) • Singapore (2nd) • Frankfurt (2nd) • London (2nd) • Dallas (2nd) • Cookie Based Caching • Custom SSL Support • Enhanced Logs • HTTP 1.1 • Lower Inter- Region Pricing • Price Classes • Private Content Console Support • Zone Apex Support • Chennai & Mumbai • Seoul • Hayward • Madrid • Sydney • Paris (2nd) • Amsterdam (2nd) • Tokyo (2nd) • Hong Kong (2nd) • New York City (3rd) • Virginia (3rd) • Smooth Streaming • SNI Custom SSL • HTTP to HTTPS Redirect • Usage Charts • Free Tier • EDNS Client Subnet • CloudTrail • Device Detection, Geo Targeting, Host Header Forwarding • Advanced SSL Features • Wildcard Cookies • Monitoring & Alarming • Content Charts • Price Reduction • Zero Rate AWS Origin Traffic • Directory Path as Origin • Viewer Charts • Rio de Janeiro • Taipei • Melbourne 2008 2009 2010 2011 2012 20142013 • CloudFront Launched with 14 POPs • Signed Cookie Support • Smart TV Device Detection • Devices Report • Usage Metrics CSV Export • Wildcard Invalidations • Default TTLs • Max TTLs • PCI DSS Compliance • AWS WAF Integration • Auto Gzip Compression • Modify Request Headers • Chicago • Seoul (2nd) • Enforce HTTPS Connections • TLSv1.1 and TLSv1.2 to Origin • AWS Certificate Manager Integration • Cost Allocation Tagging • Query String Whitelisting • HTTP/2 • Toronto • Montreal • New Delhi • Atlanta (2nd) • Mumbai (2nd) • Seoul (3rd) • Frankfurt (2nd) • Japan (4th) • Hong Kong (3rd) • London (4th) • Berlin • Minneapolis • IPv6 • ACM Cert Import • Regional Edge Caches • […] 2015 2016
  • 29. Design patterns for availability
  • 30. Design patterns for availability Maximizing availability with Food Tasting Flash Crowds without scaling for the peak Defense in Depth Strategies Time Bomb Jitter Protection
  • 31. Design pattern 1: FoodTaster Maximizing availability with Food Tasting Flash Crowds without scaling for the peak Defense in Depth Strategies Time Bomb Jitter Protection
  • 32. Risks to availability Software Bugs Surges in Volume Data Corruption Time Bombs Dependent Services FoodTaster FoodTaster
  • 33. Praegustator – DNS FoodTaster DNS name server Before Config Daemons
  • 34. Praegustator – DNS FoodTaster DNS name server DNS name server Before After Orchestrator 1 2 3 DNS name server 50K queries Config Daemons Config Daemons
  • 35. $ ls dns/ foodtaster/ landing/ $ ls foodtaster/ customer.data@ pop.data@ resolvers.data@ routing.data $ ls dns/ customer.data pop.data resolvers.data routing.data Praegustator – DNS FoodTaster
  • 36. Food Tasting in AWS • Straightforward approach • Deployments are inexpensive, redeployments more too • Approaches like CodeDeploy make life even easier
  • 37. Food Tasting in AWS • Example - GeoIP Database • Automatically pushed out to every host • Likely already have checks • More to consider than just invalid user configuration
  • 38. Food Tasting in AWS • Simple is better (Works for Amazon CloudFront!) • Assume we already use a deployment system (e.g., AWS CodeDeploy) • Complete in-situ tests before returning complete
  • 39. hooks: BeforeInstall: - location: Scripts/UnzipResourceBundle.sh - location: Scripts/UnzipDataBundle.sh AfterInstall: - location: Scripts/RunResourceTests.sh timeout: 180 ApplicationStart: - location: Scripts/RunFunctionalTests.sh timeout: 3600 ValidateService: - location: Scripts/MonitorService.sh timeout: 3600 runas: codedeployuse CodeDeploy AppSec example
  • 40. hooks: BeforeInstall: - location: Scripts/UnzipResourceBundle.sh - location: Scripts/UnzipDataBundle.sh AfterInstall: - location: Scripts/RunResourceTests.sh timeout: 180 ApplicationStart: - location: Scripts/RunFunctionalTests.sh timeout: 3600 ValidateService: - location: Scripts/MonitorService.sh timeout: 3600 runas: codedeployuse CodeDeploy AppSec example
  • 41. Food Tasting in AWS • Acts as a quality gate • Roll-back can be automatic • Verification never affects user facing traffic
  • 42. Design pattern 2: Flash Crowds Maximizing availability with Food Tasting Flash Crowds without scaling for the peak Defense in Depth Strategies Time Bomb Jitter Protection
  • 43. Risks to availability Software Bugs Surges in Volume Data Corruption Time Bombs Dependent Services Load dispersion
  • 45. Static & Dynamic Strategies dest addr next hopsPacket-flow config Feedback loop
  • 47. Flash Crowds within AWS Auto Scaling works really well But.
  • 48. Approx. 15 req/s Approx. 130k req/s 45 Seconds
  • 49. Approx. 15 req/s Approx. 130k req/s 45 Seconds Boo!
  • 50. Problem: Flash Crowds Auto Scaling works really well But. Sometimes, 60s is too long.
  • 52. Planning ahead Typical examples: • TV programs • Live events (sports) • Game releases • Established traffic patterns
  • 53. Planning ahead Typical examples: • TV programs • Live events (sports) • Game releases • Established traffic patterns AWS Solutions: Infrastructure Event Management Scheduled Auto Scaling Group Auto Scaling Integration
  • 54. Schedules and the Big Red Button • Integration of program scheduling and Auto Scaling • Program scheduling includes expected online audience parameters • Big red button
  • 55. Schedules and the Big Red Button • Integration of program scheduling and Auto Scaling • Program scheduling includes expected online audience parameters • Big red button
  • 56. AWS solutions • Plan Ahead • Cache Things
  • 58. Cache things I can’t! My website is…
  • 59. Cache things I can’t! My website is… DYNAMIC!
  • 60. Cache things • What’s really dynamic? • Newspapers • Voting sites • Forum sites • Updates don’t need to be more than once a second
  • 61. Cache things (even for a second) • 10000s req/s • Assuming every Amazon CloudFront POP • 10000s req/s -> ~ 10s req/s
  • 62. AWS solutions • Plan Ahead • Cache Things • Serve only what you have to
  • 63. Serve only what you have to
  • 64. Serve only what you have to
  • 65. Serve only what you have to
  • 66. Serve only what you have to • AJAX Includes
  • 67. Serve only what you have to • AJAX Includes • CloudFront Cache Keys
  • 68. Design pattern 3: Defense in Depth Maximizing availability with Food Tasting Flash Crowds without scaling for the peak Defense in Depth Strategies Time Bomb Jitter Protection
  • 69. Risks to availability Data Corruption Software Bugs Surges in Volume Time Bombs Dependent Services Cache Multi-impl Sharding
  • 70. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 71. Potential causes of application failure Origin CF POP CF DNSResolver 3 HTTP 2 HTTP 1 DNS
  • 72. Ideas to avoid crashing • Comprehensive test coverage • Simplify systems Failures are going to happen. How can we survive them when they do?
  • 73. Ideas to survive crashing • Reduce blast radius • Reject input that previously made you crash
  • 74. Ideas to survive crashing • Reduce blast radius • shard customers to separate processes • Reject input that previously made you crash
  • 75. Ideas to survive crashing • Reduce blast radius • shard customers to separate processes • recover quickly • Reject input that previously made you crash
  • 76. Ideas to survive crashing • Reduce blast radius • shard customers to separate processes • recover quickly • multiple implementations • Reject input that previously made you crash
  • 77. DNS Multi-Impl Should this be a fallback system? Place it in front of every DNS name server?
  • 78. DNS Multi-Impl Should this be a fallback system? • We want to know it always works Place it in front of every DNS name server?
  • 79. DNS Multi-Impl Should this be a fallback system? • We want to know it always works Place it in front of every DNS name server? • Would trade point of failure one for another
  • 80. DNS Stripes DNS Impl 2 Go DNS Impl 1 C Stripe 1 (34 POPs) Stripe 2 (34 POPs)
  • 81. DNS Stripes Stripe 1 (34 POPs) Stripe 2 (34 POPs)
  • 82. Math.abs() on the lowest 32-bit signed integer, -2^31, yields a negative number. Leading to invalid DNS configuration. CloudFront case study Two's complement: flip bits and add 1. Math.abs(Integer.MIN_VALUE): -2^31 = 10000000 00000000 00000000 00000000 bits flipped = 01111111 11111111 11111111 11111111 +1 = 10000000 00000000 00000000 00000000 <= -2^31!
  • 83. CloudFront case study Protections in place: 1. Config process crashing in a loop 2. DNS name server defensively ignores an invalid index 3. FoodTaster crashes protect King DNS name server 4. DNS stripes reduce common failure paths
  • 84. AWS solution We can use a Load Balancer with Layer-7 awareness So we can have multiple backend types across a fleet
  • 85. AWS solution Enabled through microservices Reimplementing the entire stack multiple times is a difficult cost / efficiency question Specific high-risk services are easier
  • 86. AWS solution Example: 2FA Platform API (HTTP) front-end to a back-end authentication platform Reasonably few SLOC If it fails – no access to platform
  • 87. AWS solution Example: 2FA Platform API (HTTP) front-end to a back-end authentication platform Reasonably few SLOC If it fails – no access to platform Auto Scaling group Auto Scaling group Elastic Load Balancing Python Java Users
  • 88. Design Pattern 4: Time Bomb Jitter Protection Maximizing availability with Food Tasting Flash Crowds without scaling for the peak Defense in Depth Strategies Time Bomb Jitter Protection
  • 89. Risks to availability Software Bugs Surges in Volume Data Corruption Time Bombs Dependent Services Jitter
  • 96. Traditionally.. • Instance level monitoring system with alerts • Human response • Monitor both percentage fill, and the fill rate
  • 98. High availability matters Recycling old instances is unnecessary cost Automated cleanups may be too slow Human operators almost certainly too slow
  • 101. AWS solution – within an Auto Scaling group Amazon Linux AMI has /var/run (and /var/tmp) on root Ubuntu AMI uses tmpfs (10% of RAM) Jittering root volume size is tricky within an ASG
  • 102. AWS solution – within Auto Scaling Consciously segregate your files to a specified directory – on a separate volume – with jittering Code will follow (check SlideShare!) Adds <10s to system startup
  • 103. What else? Time • SSL Certificates • Domain name registrations • Deployment Schedules • ???
  • 105. Risks to availability Software Bugs Surges in Volume Data Corruption Time Bombs Dependent Services
  • 106. Risks & Common Patterns Software Bugs Dependent Services Multi-impl Sharding FoodTaster Cache Redundancies Surges in Volume Load dispersion Data Corruption FoodTaster Time Bombs Jitter
  • 107. Lessons learned Maximizing availability with Food Tasting Flash Crowds without scaling for the peak Defense in Depth Strategies Time Bomb Jitter Protection Don’t rely on validation in your main application Auto Scaling Integration; Caching; Selective Serving Homogeneity can hurt Implementation striping can save you
  • 108. Lessons learned Maximizing availability with Food Tasting Flash Crowds without scaling for the peak Defense in Depth Strategies Time Bomb Jitter Protection Don’t rely on validation in your main application Auto Scaling Integration; Caching; Selective Serving Homogeneity can hurt Implementation striping can save you
  • 109. Lessons learned Maximizing availability with Food Tasting Flash Crowds without scaling for the peak Defense in Depth Strategies Time Bomb Jitter Protection Don’t rely on validation in your main application Auto Scaling Integration; Caching; Selective Serving Homogeneity can hurt Implementation striping can save you
  • 110. Lessons learned Maximizing availability with Food Tasting Flash Crowds without scaling for the peak Defense in Depth Strategies Time Bomb Jitter Protection Don’t rely on validation in your main application Auto Scaling Integration; Caching; Selective Serving Homogeneity can hurt Implementation striping can save you
  • 112. Remember to complete your evaluations!
  • 113. Related Sessions ARC309 Moving Mission Critical Apps from One Region to Multi- Region active/active ARC204 From Resilience to Ubiquity - #NetflixEverywhere Global Architecture