Going Big With Containers
Customer Case Studies of Large-Scale Deployments
Matt Callanan, Engineering Manager, Expedia
mcallanan@expedia.com
linkedin.com/in/matthewcallanan
@mcallana
ENT209
November 29, 2017
Going Big With Containers
Large-Scale Deployments with Amazon ECS
Matt Callanan
Engineering Manager / Tech Lead
“Cloud Acceleration Team”
Expedia
Brisbane, Australia
• mcallanan@expedia.com
• linkedin.com/in/matthewcallanan
• @mcallana
Platform Building Blocks
Application Creation
• Microservice Generation
Deployment Automation
• Deployment Pipeline / Blue-Green Deploys
• Auto-Scaling
• Security
• Logging
• Traffic Management / Service Discovery
Cluster Management
• ECS Cluster Creation / Immutable Servers / Auto-Scaling
• Zero-Downtime Upgrades
• Monitoring
• Right-Sizing
AWS
• ECS, EC2, VPC, IAM, Amazon CloudWatch, AWS CloudFormation, Auto Scaling, Amazon Route 53, ELB, AWS Lambda, SNS, Support
What is the Cost of Creating a Microservice?
Source Code Repository
Working Code Base
Basic Test Suite
Immutable Servers
Infrastructure As Code
Centralized Logging
Centralized Monitoring
Chat Notifications
Load Balancer
DNS Networking
Blue-Green Deploys
Continuous Delivery Pipeline
Phew!
Opportunity Cost: Time Value of Information
Time Value of Information
• “A piece of information is worth more now than it is tomorrow”
• If every commit is a hypothesis, how much is verifying that commit worth now as opposed to later?
“Primer”
Microservice Generation Platform
“Primer” – Microservice Generator
“Primer” – Technology Choice
• The user enters the details of their app into the Primer web app to create a new app
• Primer creates a Dockerfile and a repo in the private Docker registry
Primer Application Creation
• Within 10 minutes:
• Application code repository created
• Continuous Delivery pipeline created
• Docker repository created
• Application built as a Docker image
• Application deployed to a prod-like environment
~20 Primer Applications Created per Day
Hackathons!
How Long Does Feedback Take In a Monolith?
Monolith with 10x release cycle
Microservice
Why is Fast Feedback Important?
• Most Likely to Fail
o 68% Industry Failure Rate
• 10x cycle time = 1/10th success rate
o Monolith: 0.32/1 Feature
o Microservices: 3.2/10 Features
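The arithmetic behind the two figures above can be checked in a few lines. This is a minimal sketch: it assumes the 68% failure rate applies independently to every feature and that a microservice team ships ten features in the time a monolith ships one. The function name is illustrative and does not come from the talk.

```python
def expected_successes(features_shipped: int, failure_rate: float = 0.68) -> float:
    """Expected number of successful features, assuming independent failures."""
    return features_shipped * (1.0 - failure_rate)


# One feature per monolith release cycle, ten in the same period with microservices.
monolith = expected_successes(1)
microservices = expected_successes(10)

print(f"Monolith: {monolith:.2f} successful of 1 feature")
print(f"Microservices: {microservices:.1f} successful of 10 features")
```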
How Many Experiments Could You Run?
[Figure: experiments run in the same period, 1 versus 10]
Does the Experiment Belong in Your Monolith?
• Increasing Technical Debt
Platform Building Blocks: Application Creation
BENEFITS:
• Cost: Reduced cost of experimentation
• Speed: Fast feedback
Motivation For Containers
• Why Containers?
• VMs/Containers/Functions
• Why Container Clusters?
• Why Amazon ECS?
Region 1, Region 2, Region 3, Region 4, Region 5
Expedia ECS Cluster Statistics
2,600 ECS Services (1,100 Applications)
13,000 Containers
860 EC2 Instances (13 ECS Clusters)
Region 1, Region 2, Region 3, Region 4, Region 5
Expedia ECS Cluster Topology
Production Cluster Test Environment Cluster
480 Services
230 Instances
Production Cluster Visualization
230 Instances 480 Services 3,200 Containers
c3vis Open Source: https://github.com/ExpediaDotCom/c3vis
Typical Deployment Pipeline
• Source Code (Git): Commit
• Build: Compile, build artifacts (jar, zip, etc.), build Docker image (based on Primer template base image)
• Base Docker image: DropWizard, Springboot, Scalatra, Sinatra, ExpressJS, Go, etc.
• Docker Registry: Application Docker image
• App Config: Env-specific configuration, metadata
• Environments: Test Deployment, Integration, Stress, Production Region 1, Production Region 2, Production Region 3, …
• Steps in each environment: Deploy, Smoke Tests, Release
Blue-Green Deploys
• Split releases into “deploy” and “release” steps
• Allows for testing between deploy and release
1. Deploy a “Canary”
2. Release live upgrade using ECS implicit blue-green replacement
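The split between "deploy" and "release" can be sketched as a small orchestration function. This is only an illustration of the control flow described above, not Expedia's pipeline code: the four callables are hypothetical hooks that a real pipeline would back with its own tooling.

```python
from typing import Callable


def deploy_then_release(
    deploy_canary: Callable[[], str],
    smoke_test: Callable[[str], bool],
    release: Callable[[], None],
    remove_canary: Callable[[str], None],
) -> bool:
    """Deploy a canary, test it, and release to live only if the tests pass.

    Returns True when the release step ran, False when testing stopped it.
    """
    canary = deploy_canary()  # "deploy" step: new version runs beside the live service
    try:
        if not smoke_test(canary):
            return False  # the live service is never touched
        release()  # "release" step: live service is upgraded
        return True
    finally:
        remove_canary(canary)  # the canary is cleaned up in both outcomes
```

Because testing sits between the two steps, a failed smoke test costs only the canary and never reaches live traffic.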
Blue-Green Deploys with Canary
1. Live Traffic → Amazon Route 53 CNAME → Load Balancer → Amazon ECS: Live Service - v1
2. Canary - v2 is deployed behind its own Amazon Route 53 CNAME and Load Balancer for testing; Live Service - v1 keeps serving live traffic
3. Live Service is upgraded, v1 → v2, while the Canary - v2 remains
4. Live Service - v2 serves live traffic
5. The Canary and its Load Balancer are removed; Live Service - v2 serves live traffic
Application Auto-Scaling
Auto Scaling
Role-based Security with Identity and Access Management (IAM)
Amazon ECS: ECS Task with an IAM Task Role
Logging
ECS Task:
• Application Container: executable, console
• SplunkForwarder Container: splunk binary
Splunk Server
Traffic Management and Service Discovery
Application Stack - Single Region
Amazon Route 53 CNAME → Classic Load Balancer → Amazon ECS Service
Multi-Region Traffic Management
Internet → Traffic Rules (Geo, Fixed) → App A in Region 1, Region 2, … Region N
Each region: Amazon Route 53 CNAME → Classic Load Balancer → Amazon ECS Service
Intra-Region Service Discovery
Region 1: Public Apps and Private Apps (App A, App B, App C), with Internet traffic reaching the public apps
Each app: Amazon Route 53 CNAME → Classic Load Balancer → Amazon ECS Service
Platform: Deployment Automation
BENEFITS:
• Speed: Many manual steps reduced to the click of a button
• Safety: Repeatable, reliable
Platform: Cluster Management
ECS Cluster Creation
CloudFormation Stack, Auto Scaling Group, EC2 Instances, Amazon ECS Cluster
Auto-Scaling Cluster Instances
EC2 Instances
Auto Scaling Group
Scale Up:
• 70% CPU - Add 1 instance
• 60% Memory - Add 1 instance
Scale Down:
• 10% CPU - Remove 1 instance
• 20% Memory - Remove 1 instance
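The four rules above can be written as one decision function. This is a minimal sketch under stated assumptions: the thresholds are treated as inclusive, either metric alone can trigger a change, and scale-up wins when both directions match. None of those details are given on the slide, and in a real Auto Scaling group each rule would be a separate policy.

```python
def scaling_decision(cpu_pct: float, memory_pct: float) -> int:
    """Return the change in instance count: +1, -1, or 0.

    Thresholds are the ones listed on the slide. Treating them as
    inclusive and giving scale-up priority are assumptions.
    """
    if cpu_pct >= 70 or memory_pct >= 60:
        return 1  # Scale Up: add 1 instance
    if cpu_pct <= 10 or memory_pct <= 20:
        return -1  # Scale Down: remove 1 instance
    return 0  # inside the band: leave the cluster as it is


print(scaling_decision(cpu_pct=75, memory_pct=30))
print(scaling_decision(cpu_pct=5, memory_pct=40))
```

Because either metric alone can trigger a scale-down here, a cluster that is high on one dimension and low on the other can flip between adding and removing instances.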
Immutable Servers
Golden AMI:
• Amazon-provided Base AMI: ecs-optimized AMI (docker, ecs-agent)
• Standard Chef cookbook: Expedia standard image
• Custom setup baked into AMI: Docker Config, Daemon containers
Cluster Instance:
• Golden AMI (ecs-optimized AMI, Expedia standard image, Docker Config, Daemon containers)
• Custom bootstrap:
o ECS Cluster Config
o Start ECS Agent, Docker
o Cron: Restart ECS agent
o Cron: Custom Metrics
Zero-Downtime Cluster Updates
• “PRISM”
• Project Replaced In Sixty Minutes
“PRISM” Goals
• Safety: Zero downtime for applications as their workloads get relocated onto new instances
• Speed: Complete as fast as possible
• Rollbackable: Quickly retreat back to known-good state if anything goes wrong
• Idempotent: Resumable if anything goes wrong
• Avoid “thundering herd” scenario:
o Drain in batches to prevent burden on Docker registry and network
o Avoid having tasks relocated to instances about to be drained
“PRISM” Phases
Phase 1: Expand
Phase 2: Relocate Tasks
Phase 3: Clean Up
Zero-Downtime Cluster Updates
Starting point: Amazon ECS Cluster backed by one CloudFormation Stack (Auto Scaling group of EC2 Instances)
Phase 1: Expand Cluster
• A second CloudFormation Stack (Auto Scaling group of EC2 Instances) joins the same Amazon ECS Cluster
Phase 2: Relocate Tasks
• Instances in the old stack: Draining…
Phase 3: Remove Old Stack
• Instances in the old stack: Drained
• The old CloudFormation Stack is removed; the cluster runs on the new stack only
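The "drain in batches" goal of Phase 2 can be sketched as a loop over fixed-size batches of old instances. This is an illustration only, not the PRISM implementation: `drain` and `wait_until_drained` are hypothetical hooks, and the batch size of 5 is an arbitrary placeholder. In practice `drain` might set the instances to the ECS DRAINING state, but the slides do not say how PRISM does it.

```python
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def relocate_tasks(
    old_instances: Sequence[str],
    drain: Callable[[List[str]], None],
    wait_until_drained: Callable[[List[str]], None],
    batch_size: int = 5,
) -> int:
    """Drain old instances one batch at a time and return the batch count.

    Waiting for each batch before starting the next limits how many tasks
    restart at once, which keeps load off the Docker registry and network.
    """
    completed = 0
    for batch in batches(old_instances, batch_size):
        drain(batch)
        wait_until_drained(batch)
        completed += 1
    return completed
```

Running the loop again over whichever old instances remain gives a simple way to resume after a failure.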
Monitoring Clusters
Monitoring: Dashboard
Things to Monitor
• ECS Instances - Memory, CPU, Disk
• ECS Clusters - Memory, CPU Reservation
• Auto-Scaling Groups - Current vs Maximum
• Build & Deployment (Jenkins) Servers & Nodes - Memory, CPU, Disk
• Logging Servers - Memory, CPU, Disk
• Docker Registry - Memory, CPU, Disk
Monitoring: Flow of Metrics
• EC2 Instances (Auto Scaling group) → Amazon CloudWatch: ECS agent metrics, extended CloudWatch metrics, cron job custom metrics
• Jenkins job pulls metrics periodically
• Grafana pulls from CloudWatch
• Grafana pulls from Graphite
Monitoring: Chat Integration
Right-Sizing Instances
Aim: Balance the CPU and memory reservations of applications against the ratio of CPU to memory available on the instance
• c4.4xlarge: 30 GiB RAM, 16 CPU cores
• r4.2xlarge: 61 GiB RAM, 8 CPU cores
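One way to reason about the aim above is to compare memory-per-core ratios. The sketch below uses the two instance sizes from the slide; the helper names and the sample reservation figures are hypothetical and only illustrate the comparison.

```python
# (CPU cores, memory in GiB) for the two instance types on the slide.
INSTANCE_TYPES = {
    "c4.4xlarge": (16, 30.0),
    "r4.2xlarge": (8, 61.0),
}


def gib_per_core(cores: float, memory_gib: float) -> float:
    """Memory available (or reserved) per CPU core."""
    return memory_gib / cores


def closest_instance_type(reserved_cores: float, reserved_gib: float) -> str:
    """Pick the instance type whose GiB-per-core ratio is nearest the workload's."""
    target = gib_per_core(reserved_cores, reserved_gib)
    return min(
        INSTANCE_TYPES,
        key=lambda name: abs(gib_per_core(*INSTANCE_TYPES[name]) - target),
    )


for name, (cores, memory) in INSTANCE_TYPES.items():
    print(f"{name}: {gib_per_core(cores, memory):.3f} GiB per core")

# Hypothetical workload reserving 4 cores and 8 GiB in total.
print(closest_instance_type(reserved_cores=4, reserved_gib=8))
```

A workload whose reservations are CPU-heavy relative to memory fits the c4 ratio better, while a memory-heavy one fits the r4 ratio.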
Largest Production Cluster – CPU Reservation
230 Instances 480 Services 3,200 Containers
12% CPU Utilization
64% CPU Reservation
c3vis Open Source: https://github.com/ExpediaDotCom/c3vis
Largest Production Cluster – Memory Reservation
230 Instances 480 Services 3,200 Containers
13% Memory Utilization
29% Memory Reservation
c3vis Open Source: https://github.com/ExpediaDotCom/c3vis
Platform Building Blocks: Cluster Management
BENEFITS:
• Speed: Pre-built ECS clusters mean no EC2 instance startup time at deploy and auto-scaling time
• Speed: Docker only pulls the layers it needs for images with a common hierarchy
• Safety: Immutable Servers give confidence that there is no configuration drift on production infrastructure
• Scale: Clusters automatically scale horizontally to match workload
Engineering Team Skillset
Application Creation
• Microservice Generation
Deployment Automation
• Deployment Pipeline / Blue-Green Deploys
• Auto-Scaling
• Security
• Logging
• Traffic Management / Service Discovery
Cluster Management
• ECS Cluster Creation / Immutable Servers / Auto-Scaling
• Zero-Downtime Upgrades
• Monitoring
• Right-Sizing
AWS
• ECS, EC2, VPC, IAM, CloudWatch, CloudFormation, AutoScaling, Route 53, ELB, Lambda, SNS, Support
[Team chart: Dev/Ops engineers, Project Manager / TPM, Engineering Manager]
•Liaise with Amazon ECS team
•Upgrading ECS Clusters
•Assisting Development Teams
•Monitoring AWS Resource Limits
•Cost Optimization
•Monitoring Infrastructure
•Migrations
Team Responsibilities
Expedia Developers
• Create Primer applications
• Invoke builds
• Configure pipelines
• Configure applications
Expedia Cloud Team
• Maintain ECS cluster infrastructure
• Maintain build servers
• Support builds and AWS usage
• Deployment automation
AWS ECS Team
• ECS Support
• Manage Scheduling of Tasks
• Recommendations
• Feedback
Lessons Learnt
Lesson: Monitoring is Your Friend!
Lesson: True Blue-Green Deploys
Current Blue-Green Deploys
1. Live Traffic → Amazon Route 53 CNAME → Load Balancer → Amazon ECS: Live Service - v1
2. Canary - v2 is deployed behind its own Amazon Route 53 CNAME and Load Balancer for testing; Live Service - v1 keeps serving live traffic
3. Live Service is upgraded in place, v1 → v2
• Can’t roll back without re-releasing
• Can’t test new Tasks independently
4. Live Service - v2 serves live traffic
5. The Canary and its Load Balancer are removed
Lesson: True Blue-Green Deploys
ECS simulates blue-green deploys for each service behind load-balancer
• Benefit: Don’t need to warm up the load-balancer for each release
• Downside: Need to recreate load-balancer if modifying active listener, which involves downtime
• Downside: Can’t send some traffic to old tasks and some to new tasks for load testing
Some aspects of ELBs are immutable:
• ELB Scheme (e.g. “internet-facing”)
Some aspects of ECS-ELB integration are immutable:
• Once ECS service created, can’t assign different load-balancer
• ELB Listeners associated with containers can’t be removed
Recreating ELB with different configuration necessitates recreating ECS service
What is True Blue-Green?
Desired Blue-Green Deploys
1. Live Traffic → Amazon Route 53 CNAME → Load Balancer → Amazon ECS: Live Service - v1
2. Live Service - v2 is deployed behind a second Load Balancer for testing
3. Traffic is shifted from v1 to v2:
• Bleed traffic at 10% intervals using weighted CNAMEs
• Load testing with live traffic
• Allows: Rollback to known good (v1)
• Allows: New ELB settings
• Requires: Warming up ELB
4. Live Service - v2 takes 100% of traffic
5. ROLLBACK: traffic can return to Live Service - v1
6. Once v2 holds 100% of traffic, Live Service - v1 and its Load Balancer are removed
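The "bleed traffic at 10% intervals" idea can be sketched as a weight schedule with a health check after every step. This is a hedged illustration, not the talk's tooling: `set_weights` and `healthy` are hypothetical hooks that a real pipeline would back with its DNS and monitoring calls.

```python
from typing import Callable, Iterator, Tuple


def weight_schedule(step_pct: int = 10) -> Iterator[Tuple[int, int]]:
    """Yield (v1_weight, v2_weight) pairs from 100/0 to 0/100."""
    if step_pct <= 0 or 100 % step_pct != 0:
        raise ValueError("step_pct must be a positive divisor of 100")
    for v2 in range(0, 101, step_pct):
        yield 100 - v2, v2


def shift_traffic(
    set_weights: Callable[[int, int], None],
    healthy: Callable[[], bool],
    step_pct: int = 10,
) -> bool:
    """Move traffic to v2 step by step; roll back to v1 on the first failure."""
    for v1, v2 in weight_schedule(step_pct):
        set_weights(v1, v2)
        if not healthy():
            set_weights(100, 0)  # rollback to known good (v1)
            return False
    return True
```

Because v1 stays running behind its own load balancer until the end, the rollback is a single weight change rather than a re-release.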
Lesson: Know Your Limits
Lesson: Know Your Resource Limits
• Ask nicely :)
• Start ECS agent with exponential backoff
Lesson: Beware of Rate Limits
API Rate Limits
• The more ELBs and ECS services you have, the more ECS ↔ ELB traffic your account will have
• DescribeInstanceHealth API call
• Workaround: Shard your Cloud presence into smaller accounts
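Exponential backoff, mentioned above for starting the ECS agent, is also a common way to cope with API rate limits. The sketch below is a generic retry helper, not code from the talk: the parameter defaults are arbitrary, the random jitter is an added assumption, and a real caller should catch only its client's throttling error rather than every exception.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `operation`, doubling the delay ceiling after each failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:  # assumption: narrow this to the throttling error in real code
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter spreads retries out so callers do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the helper can be exercised without real waiting.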
Lesson: Avoid Auto-Scale Thrashing
Problem
1. ASG scales up due to high Memory Reservation
2. 5 minutes later ASG scales down due to low CPU Reservation
3. Repeat from #1
Solution #1: Fix scaling dimensions
• Scale down only when both are low
Solution #2: Fix ratios
• Match service resource ratios to instance type resource ratio
For now: Set scale-down policies low
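Solution #1 amounts to replacing an "either metric is low" test with a "both metrics are low" test. A minimal sketch, assuming the 10% CPU and 20% memory scale-down thresholds shown earlier in the deck and treating them as inclusive; the function name is illustrative.

```python
def scale_down_allowed(
    cpu_reservation_pct: float,
    memory_reservation_pct: float,
    cpu_low: float = 10.0,
    memory_low: float = 20.0,
) -> bool:
    """Allow scale-down only when CPU and memory reservation are both low."""
    return cpu_reservation_pct <= cpu_low and memory_reservation_pct <= memory_low


# Low CPU reservation but high memory reservation: the thrashing case.
print(scale_down_allowed(cpu_reservation_pct=5, memory_reservation_pct=65))
# Both dimensions low: removing an instance is safe.
print(scale_down_allowed(cpu_reservation_pct=5, memory_reservation_pct=15))
```

With this check, the cluster in the problem scenario stays at its scaled-up size instead of removing the instance that high memory reservation had just added.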
Future Plans
• Cost Allocation
• Service Discovery
• ECR Adoption
Benefits of Microservice Platform on ECS
Application Creation - Microservice Generation
• Cost: Reduced cost of experimentation
• Speed: Fast feedback
Deployment Automation
• Speed: Many manual steps reduced to the click of a button
• Safety: Repeatable, reliable
Cluster Management
• Speed: Pre-built ECS clusters mean no EC2 instance startup time at deploy and auto-scaling time
• Speed: Docker only pulls the layers it needs for images with a common hierarchy
• Safety: Immutable Servers give confidence that there is no configuration drift on production infrastructure
• Scale: Clusters automatically scale horizontally to match workload
Did We Succeed?
• Mission: Speed up developers’ lives
Time Savings Per Deploy
• Primer: Dedicated EC2 instances with Chef-built AMIs:
• 30 minutes per deploy
• Primer 2.0: Docker on ECS:
• 3 minutes per deploy
• Receive feedback 27 minutes faster
Average 524 Deploys per Day to Test Environment
Support team = 8 people
EC2 Deploy with AMI = 30 mins
ECS Deploy with Container = 3 mins
27 min saving per deploy × 524 deploys = 29.5 dev days saved every day
30 Developer Days Saved Every Business Day
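The saving quoted above can be reproduced with a short calculation. The sketch assumes an 8-hour developer day, which the slide does not state but which matches the 29.5 figure; the function name is illustrative.

```python
def dev_days_saved(
    deploys_per_day: int,
    minutes_saved_per_deploy: float,
    hours_per_dev_day: float = 8.0,  # assumption: one developer day is 8 hours
) -> float:
    """Developer days saved per calendar day across all deploys."""
    hours_saved = deploys_per_day * minutes_saved_per_deploy / 60.0
    return hours_saved / hours_per_dev_day


# 30-minute EC2/AMI deploys versus 3-minute ECS container deploys.
saved = dev_days_saved(deploys_per_day=524, minutes_saved_per_deploy=30 - 3)
print(f"{saved:.1f} developer days saved per day")
```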
Opportunity Cost Savings
ECS EC2
• Pre-built fleet of clusters
• Quickly and safely run software as Docker images
• Reduced opportunity cost by 30x
Thanks! Questions?
Matt Callanan
mcallanan@expedia.com
linkedin.com/in/matthewcallanan
@mcallana
Image Attribution
Image
“Pipelines descending to Inveruglas Power Station” (http://www.geograph.org.uk/photo/2214366) is licensed under CC BY SA 2.0 (http://creativecommons.org/licenses/by-sa/2.0/) /
Desaturated and cropped from original
“The Future” (https://flic.kr/p/26YCn1) by Kristian Bjornard is licensed under CC BY SA 2.0 (https://creativecommons.org/licenses/by-sa/2.0/)
“CTA Loop Junction” (https://commons.wikimedia.org/wiki/File:CTA_loop_junction.jpg) by Daniel Schwen is licensed under CC BY SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/)
“Logging operations at Millmoor Rig” (http://bit.ly/1Nb20LS) by Walter Baxter is licensed under CC BY SA 2.0 (https://creativecommons.org/licenses/by-sa/2.0/)
“Traffic Monitoring” (https://commons.wikimedia.org/wiki/File:Traffic_Monitoring.JPG) by Suryasuharman is licensed under CC BY SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/)
“DNS logo” (https://commons.wikimedia.org/wiki/File:DNS_logo.jpg) by I laramide I is licensed under CC BY SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/)
“Matrix-code-computer-pc-data” (https://pixabay.com/en/matrix-code-computer-pc-data-356024/) by Comfreak is licensed under CC ZERO
(https://creativecommons.org/publicdomain/zero/1.0/)
“Sample-color-blue-green” (https://pixabay.com/en/sample-color-blue-green-rubber-815141/ ) by LyraBelacqua-Sally is licensed under CC ZERO
(https://creativecommons.org/publicdomain/zero/1.0/)
“Fashion-wristwatch-time” (https://www.pexels.com/photo/fashion-wristwatch-time-watch-1252/) by SplitShire.com is licensed under CC ZERO
(https://creativecommons.org/publicdomain/zero/1.0/)
“Chat” (https://openclipart.org/detail/129049/chat) by Merlin2525 is licensed under unlimited-commercial-use (https://openclipart.org/unlimited-commercial-use-clipart)
“scales” (https://openclipart.org/detail/24101/scales) by scott_kirkwood is licensed under unlimited-commercial-use (https://openclipart.org/unlimited-commercial-use-clipart)
“Compiz GIT Repository” (https://flic.kr/p/Ssras) by -= Treviño =- is licensed under BY NC SA 2.0 (https://creativecommons.org/licenses/by-nc-sa/2.0)
“logs” (https://flic.kr/p/9F8tjX) by Rick Payette is licensed under CC BY NC ND 2.0 (https://creativecommons.org/licenses/by-nc-nd/2.0)
Docker logo used according to https://www.docker.com/brand-guidelines
Shipping Container Clip Art: https://pixabay.com/en/container-shipping-trucking-307872/ by Clker-Free-Vector-Images is licensed under CC ZERO
Computer Code: https://pixabay.com/en/binary-1-0-computer-code-zero-1066983/ by HypnoArt is licensed under CC ZERO

Más contenido relacionado

La actualidad más candente

AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...
AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...
AWS re:Invent 2016: The AWS Hero’s Journey to Achieving Autonomous, Self-Heal...Amazon Web Services
 
CI/CD on AWS Deploy Everything All the Time
CI/CD on AWS Deploy Everything All the TimeCI/CD on AWS Deploy Everything All the Time
CI/CD on AWS Deploy Everything All the TimeAmazon Web Services
 
Managing Your Application Lifecycle on AWS: Continuous Integration and Deploy...
Managing Your Application Lifecycle on AWS: Continuous Integration and Deploy...Managing Your Application Lifecycle on AWS: Continuous Integration and Deploy...
Managing Your Application Lifecycle on AWS: Continuous Integration and Deploy...Amazon Web Services
 
Advanced Continuous Delivery on AWS
Advanced Continuous Delivery on AWSAdvanced Continuous Delivery on AWS
Advanced Continuous Delivery on AWSAmazon Web Services
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Sourceaspyker
 
Building a CI/CD Pipeline for Containers - DevDay Los Angeles 2017
Building a CI/CD Pipeline for Containers - DevDay Los Angeles 2017Building a CI/CD Pipeline for Containers - DevDay Los Angeles 2017
Building a CI/CD Pipeline for Containers - DevDay Los Angeles 2017Amazon Web Services
 
Ten^H^H^H Many Cloud App Design Patterns
Ten^H^H^H Many Cloud App Design PatternsTen^H^H^H Many Cloud App Design Patterns
Ten^H^H^H Many Cloud App Design PatternsShlomo Swidler
 
Managing Your Application Lifecycle on AWS: Continuous Integration and Deploy...
Managing Your Application Lifecycle on AWS: Continuous Integration and Deploy...Managing Your Application Lifecycle on AWS: Continuous Integration and Deploy...
Managing Your Application Lifecycle on AWS: Continuous Integration and Deploy...Amazon Web Services
 
AWS Innovate: Smaller IS Better – Exploiting Microservices on AWS, Craig Dickson
AWS Innovate: Smaller IS Better – Exploiting Microservices on AWS, Craig DicksonAWS Innovate: Smaller IS Better – Exploiting Microservices on AWS, Craig Dickson
AWS Innovate: Smaller IS Better – Exploiting Microservices on AWS, Craig DicksonAmazon Web Services Korea
 
Adopting Java for the Serverless world at IT Tage
Adopting Java for the Serverless world at IT TageAdopting Java for the Serverless world at IT Tage
Adopting Java for the Serverless world at IT TageVadym Kazulkin
 
Automated Governance of Your AWS Resources
Automated Governance of Your AWS ResourcesAutomated Governance of Your AWS Resources
Automated Governance of Your AWS ResourcesAmazon Web Services
 
(WEB301) Operational Web Log Analysis | AWS re:Invent 2014
(WEB301) Operational Web Log Analysis | AWS re:Invent 2014(WEB301) Operational Web Log Analysis | AWS re:Invent 2014
(WEB301) Operational Web Log Analysis | AWS re:Invent 2014Amazon Web Services
 
How did we get here and where are we going
How did we get here and where are we goingHow did we get here and where are we going
How did we get here and where are we goingYan Cui
 
Leveraging elastic web scale computing with AWS
 Leveraging elastic web scale computing with AWS Leveraging elastic web scale computing with AWS
Leveraging elastic web scale computing with AWSShiva Narayanaswamy
 
AWS re:Invent 2016: Scaling Your Web Applications with AWS Elastic Beanstalk ...
AWS re:Invent 2016: Scaling Your Web Applications with AWS Elastic Beanstalk ...AWS re:Invent 2016: Scaling Your Web Applications with AWS Elastic Beanstalk ...
AWS re:Invent 2016: Scaling Your Web Applications with AWS Elastic Beanstalk ...Amazon Web Services
 
Deploy, scale and manage your application with AWS Elastic Beanstal
Deploy, scale and manage your application with AWS Elastic BeanstalDeploy, scale and manage your application with AWS Elastic Beanstal
Deploy, scale and manage your application with AWS Elastic BeanstalAmazon Web Services
 
Building a social network in under 4 weeks with Serverless and GraphQL
Building a social network in under 4 weeks with Serverless and GraphQLBuilding a social network in under 4 weeks with Serverless and GraphQL
Building a social network in under 4 weeks with Serverless and GraphQLYan Cui
 
Microservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New YorkMicroservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New YorkAdrian Cockcroft
 

Going Big with Containers: Customer Case Studies of Large-Scale Deployments - ENT209 - re:Invent 2017

  • 1. Going Big With Containers: Customer Case Studies of Large-Scale Deployments. Matt Callanan, Engineering Manager, Expedia. mcallanan@expedia.com, linkedin.com/in/matthewcallanan, @mcallana. ENT209. November 29, 2017
  • 2. Going Big With Containers Large-Scale Deployments with Amazon ECS
  • 3. Matt Callanan Engineering Manager / Tech Lead “Cloud Acceleration Team” Expedia Brisbane, Australia • mcallanan@expedia.com • linkedin.com/in/matthewcallanan • @mcallana
  • 4.
  • 5. Platform Building Blocks • Application Creation: Microservice Generation • Deployment Automation: Deployment Pipeline / Blue-Green Deploys, Auto-Scaling, Security, Logging, Traffic Management / Service Discovery • Cluster Management: ECS Cluster Creation / Immutable Servers / Auto-Scaling, Zero-Downtime Upgrades, Monitoring, Right-Sizing • AWS: ECS, EC2, VPC, IAM, Amazon CloudWatch, AWS CloudFormation, Auto Scaling, Amazon Route 53, ELB, AWS Lambda, SNS, Support
  • 11. What is the Cost of Creating a Microservice?
  • 24. Phew!
  • 25. Opportunity Cost: Time Value of Information
  • 26. Time Value of Information • “A piece of information is worth more now than it is tomorrow” • If every commit is a hypothesis, how much is verifying that commit worth now as opposed to later?
  • 30. User enters the details of their app into Primer Web App for new app creation • Creates Dockerfile and repo in private Docker registry
  • 31. Primer Application Creation • Within 10 minutes: • Application code repository created • Continuous Delivery pipeline created • Docker repository created • Application built as a Docker image • Application deployed to a prod-like environment
  • 32. ~20 Primer Applications Created per Day Hackathons!
  • 33. How Long Does Feedback Take In a Monolith? Monolith with 10x release cycle vs. Microservice
  • 34. Why is Fast Feedback Important? • Most Likely to Fail: 68% Industry Failure Rate • 10x cycle time = 1/10th success rate: Monolith 0.32/1 Feature, Microservices 3.2/10 Features
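The arithmetic on this slide can be sketched as below. This is a minimal illustration, assuming the 68% failure rate applies independently to every feature and that microservices ship ten features in the time a monolith ships one. The function name and parameters are hypothetical, not from the talk.

```python
def expected_successes(features_shipped: int, failure_rate: float = 0.68) -> float:
    """Expected number of successful features, treating each as an independent trial."""
    return features_shipped * (1.0 - failure_rate)


# One feature per long monolith release cycle vs. ten in the same period.
monolith = expected_successes(1)
microservices = expected_successes(10)

print(f"Monolith: {monolith:.2f} of 1 feature")
print(f"Microservices: {microservices:.2f} of 10 features")
```

The per-feature odds are the same in both cases; the shorter cycle only buys more attempts in the same period.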
  • 35. How Many Experiments Could You Run? [Figure: 1 experiment compared with 10 experiments]
  • 36. Does the Experiment Belong in Your Monolith? • Increasing Technical Debt
  • 37. Platform Building Blocks BENEFITS: • Cost: Reduced cost of experimentation • Speed: Fast feedback
  • 39. Motivation For Containers • Why Containers? • VMs/Containers/Functions • Why Container Clusters? • Why Amazon ECS?
  • 40. Region 1 Region 2 Region 3 Region 4 Region 5 Expedia ECS Cluster Statistics 2,600 ECS Services (1,100 Applications) 13,000 Containers 860 EC2 Instances (13 ECS Clusters)
  • 41. Region 1 Region 2 Region 3 Region 4 Region 5 Expedia ECS Cluster Topology Production Cluster Test Environment Cluster 480 Services 230 Instances
  • 42. Production Cluster Visualization 230 Instances 480 Services 3,200 Containers c3vis Open Source: https://github.com/ExpediaDotCom/c3vis
  • 43. Typical Deployment Pipeline: Git (Source Code Commit) → Build (Compile; Build artifacts (jar, zip, etc.); Build Docker Image based on Primer template base image) → Docker Registry (Application Docker image) → per-environment Deploy, Smoke Tests, Release. Environments shown: Test, Integration, Stress, Production Region 1, Production Region 2, Production Region 3, … Inputs: Base Docker image (DropWizard, Springboot, Scalatra, Sinatra, ExpressJS, Go, etc.); App Config (Env-specific configuration, Metadata)
  • 44. Blue-Green Deploys • Split releases into “deploy” and “release” steps • Allows for testing between deploy and release 1. Deploy a “Canary” 2. Release live upgrade using ECS implicit blue-green replacement
  • 45. Blue-Green Deploys with Canary Amazon Route 53 CNAME Load Balancer Amazon ECS Live Service - v1 Live Traffic
  • 46. Blue-Green Deploys with Canary Amazon Route 53 CNAME Load Balancer Amazon ECS Amazon Route 53 CNAME Load Balancer Canary - v2 Live Service - v1 Live Traffic Testing
  • 47. Blue-Green Deploys with Canary Amazon Route 53 CNAME Load Balancer Amazon ECS Amazon Route 53 CNAME Load Balancer Canary - v2 Live Service - v1 Live Traffic
  • 48. Blue-Green Deploys with Canary Amazon Route 53 CNAME Load Balancer Amazon ECS Amazon Route 53 CNAME Load Balancer Canary - v2 Live Service - v1 → v2 Live Traffic
  • 49. Blue-Green Deploys with Canary Amazon Route 53 CNAME Load Balancer Amazon ECS Amazon Route 53 CNAME Load Balancer Canary - v2 Live Service - v2 Live Traffic
  • 50. Blue-Green Deploys with Canary Amazon Route 53 CNAME Load Balancer Amazon ECS Live Service - v2 Live Traffic
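The sequence in slides 45 to 50 can be modelled as a small state change, shown here as a sketch rather than Expedia's implementation. The `Deployment` class, its method names, and the decision to discard a canary that fails its smoke test are assumptions made for illustration; no AWS API is called.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Deployment:
    """Hypothetical stand-in for one app: a live ECS service plus an optional canary."""
    live_version: str
    canary_version: Optional[str] = None
    events: List[str] = field(default_factory=list)

    def deploy_canary(self, version: str) -> None:
        # "Deploy" step: the new version runs beside the live service,
        # reachable only through its own CNAME and load balancer.
        self.canary_version = version
        self.events.append(f"canary {version} deployed")

    def remove_canary(self) -> None:
        self.canary_version = None
        self.events.append("canary removed")

    def release(self) -> None:
        # "Release" step: the live service is upgraded to the tested version,
        # then the canary is removed.
        if self.canary_version is None:
            raise RuntimeError("nothing to release: no canary deployed")
        self.events.append(f"live {self.live_version} -> {self.canary_version}")
        self.live_version = self.canary_version
        self.remove_canary()


def blue_green(app: Deployment, version: str, smoke_test: Callable[[str], bool]) -> bool:
    """Deploy a canary, test it, and release only if the test passes."""
    app.deploy_canary(version)
    if not smoke_test(version):
        app.remove_canary()  # assumption: a failed canary is simply discarded
        return False
    app.release()
    return True


app = Deployment(live_version="v1")
released = blue_green(app, "v2", smoke_test=lambda version: True)
print(released, app.live_version, app.events)
```

Keeping deploy and release as separate calls is what leaves room for testing between them, which is the point of the split described on slide 44.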
  • 52. Role-based Security with Identity and Access Management (IAM) Amazon ECS IAM Task Role Identity and Access Management (IAM) ECS Task
  • 53. Logging ECS Task Application Container SplunkForwarder Container Executable console splunk binary Splunk Server
  • 54. Traffic Management and Service Discovery
  • 55. Application Stack - Single Region Amazon Route 53 CNAME Classic Load Balancer Amazon ECS Service
  • 56. Multi-Region Traffic Management App A Amazon Route 53 CNAME Classic Load Balancer Amazon ECS Service App A Amazon Route 53 CNAME Classic Load Balancer Amazon ECS Service App A Internet Traffic Rules Geo, Fixed Region 1 Region 2 Region N
  • 57. Intra-Region Service Discovery App A Amazon Route 53 CNAME Classic Load Balancer Amazon ECS Service App C Amazon Route 53 CNAME Classic Load Balancer Amazon ECS Service Internet App B Amazon Route 53 CNAME Classic Load Balancer Amazon ECS Service Region 1 Public Apps Private Apps
  • 58. Platform: Deployment Automation BENEFITS: • Speed: Many manual steps reduced to the click of a button • Safety: Repeatable, reliable
  • 59. Platform: Cluster Management
  • 60. ECS Cluster Creation Cloud Formation Stack EC2 Instances Auto Scaling Group Amazon ECS Cluster
  • 61. Auto-Scaling Cluster Instances EC2 Instances Auto Scaling Group Scale Up: • 70% CPU - Add 1 instance • 60% Memory - Add 1 instance Scale Down: • 10% CPU - Remove 1 instance • 20% Memory - Remove 1 instance
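The thresholds on this slide can be expressed as a small rule table. The sketch below makes several assumptions the slide does not state: the percentages are cluster-wide averages, the thresholds are inclusive, and each rule is evaluated independently (as separate alarms would be). The names `SCALE_UP`, `SCALE_DOWN`, and `triggered_rules` are hypothetical.

```python
from typing import List

# Thresholds taken from the slide, in percent.
SCALE_UP = {"cpu": 70.0, "memory": 60.0}
SCALE_DOWN = {"cpu": 10.0, "memory": 20.0}


def triggered_rules(cpu_pct: float, memory_pct: float) -> List[str]:
    """Return the scaling rules that fire for the given cluster metrics."""
    metrics = {"cpu": cpu_pct, "memory": memory_pct}
    rules = []
    for name, value in metrics.items():
        if value >= SCALE_UP[name]:
            rules.append(f"{name}: add 1 instance")
        elif value <= SCALE_DOWN[name]:
            rules.append(f"{name}: remove 1 instance")
    return rules


print(triggered_rules(75.0, 30.0))
print(triggered_rules(50.0, 40.0))
print(triggered_rules(5.0, 15.0))
```

Because the rules are independent here, one metric can ask to add an instance while the other asks to remove one; how such a conflict is resolved is not covered by the slide.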
  • 62. Immutable Servers Amazon-provided Base AMI Standard Chef cookbook Custom setup baked into AMI ecs-optimized AMI Expedia standard image Docker Config Daemon containers Golden AMI docker ecs-agent
  • 63. Immutable Servers ecs-optimized AMI Expedia standard image Docker Config Daemon containers Golden AMI docker ecs-agent ecs-optimized AMI Expedia standard image Docker Config Daemon containers Cluster Instance Custom bootstrap: • ECS Cluster Config • Start ECS Agent, Docker • Cron: Restart ECS agent • Cron: Custom Metrics docker ecs-agent
  • 64. Zero-Downtime Cluster Updates • “PRISM” • Project Replaced In Sixty Minutes
  • 65. “PRISM” Goals • Safety: Zero-downtime for applications as their workloads get relocated onto new instances • Speed: Complete as fast as possible • Rollbackable: Quickly retreat back to known-good state if anything goes wrong • Idempotent: Resumable if anything goes wrong • Avoid “thundering herd” scenario: Drain in batches to prevent burden on Docker registry and network; avoid having tasks relocated to instances about to be drained
  • 66. “PRISM” Phases Phase 1: Expand Phase 2: Relocate Tasks Phase 3: Clean Up
  • 67. Zero-Downtime Cluster Updates: starting state: one CloudFormation stack (Auto Scaling group of EC2 instances) backs the Amazon ECS cluster
  • 68. Zero-Downtime Cluster Updates, Phase 1: Expand Cluster: a second CloudFormation stack (Auto Scaling group of EC2 instances) joins the same Amazon ECS cluster
  • 69. Zero-Downtime Cluster Updates, Phase 2: Relocate Tasks: instances in the old stack are draining
  • 70. Zero-Downtime Cluster Updates, Phase 3: Remove Old Stack: instances in the old stack are drained
  • 71. Zero-Downtime Cluster Updates, Phase 3: Remove Old Stack: only the new CloudFormation stack (Auto Scaling group of EC2 instances) remains in the Amazon ECS cluster
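
The batching and resumability goals from slide 65 can be sketched in a few lines. This is a minimal illustration, not Expedia's PRISM implementation: the names `drain_batches` and `relocate`, the batch size, and the `drain` callback are all hypothetical. In a real cluster the callback might put a batch of container instances into the ECS `DRAINING` state.

```python
from typing import Callable, Iterable, Iterator, List, Sequence


def drain_batches(instance_ids: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield instance IDs in fixed-size batches (the last batch may be smaller)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(instance_ids), batch_size):
        yield list(instance_ids[start:start + batch_size])


def relocate(
    old_instances: Sequence[str],
    drain: Callable[[List[str]], None],
    batch_size: int = 2,
    already_drained: Iterable[str] = (),
) -> List[str]:
    """Drain old-stack instances batch by batch.

    Instances listed in `already_drained` are skipped, so rerunning after a
    failure resumes where the previous run stopped.
    """
    done = set(already_drained)
    pending = [i for i in old_instances if i not in done]
    drained: List[str] = []
    for batch in drain_batches(pending, batch_size):
        drain(batch)  # hypothetical callback; one small batch at a time
        drained.extend(batch)
    return drained
```

Draining in small batches limits how many tasks restart at once, which is what keeps load on the Docker registry and network bounded.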
  • 74. Things to Monitor • ECS Instances - Memory, CPU, Disk • ECS Clusters - Memory, CPU Reservation • Auto-Scaling Groups - Current vs Maximum • Build & Deployment (Jenkins) Servers & Nodes - Memory, CPU, Disk • Logging Servers - Memory, CPU, Disk • Docker Registry - Memory, CPU, Disk
  • 75. Monitoring: Flow of Metrics • EC2 instances (Auto Scaling group) → Amazon CloudWatch: ECS agent metrics, extended CloudWatch metrics, cron job custom metrics • Jenkins job pulls metrics periodically • Grafana pulls from CloudWatch • Grafana pulls from Graphite
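
As a rough sketch of the "cron job custom metrics" step, the script below computes disk usage and builds one metric datum in the shape CloudWatch expects. The metric name `DiskUsedPercent` and the instance ID are assumptions made for illustration; the slides do not say which custom metrics Expedia publishes or how. The resulting dictionary could be sent with boto3's CloudWatch `put_metric_data(Namespace=..., MetricData=[...])` call, which is left out here so the example runs without AWS credentials.

```python
import shutil
from typing import Any, Dict


def disk_used_percent(path: str = "/") -> float:
    """Return the percentage of disk space used on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def build_metric(instance_id: str, percent: float) -> Dict[str, Any]:
    """Build one metric datum; the metric name is a hypothetical choice."""
    return {
        "MetricName": "DiskUsedPercent",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Unit": "Percent",
        "Value": round(percent, 2),
    }


if __name__ == "__main__":
    # A cron entry would run this script periodically on each cluster instance.
    print(build_metric("i-0123456789abcdef0", disk_used_percent()))
```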
  • 77. Right-Sizing Instances Aim: balance applications' CPU and memory reservations to match the CPU-to-memory ratio available on the instance type • c4.4xlarge: 30 GiB RAM, 16 CPU cores • r4.2xlarge: 61 GiB RAM, 8 CPU cores
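
The ratio comparison on this slide can be worked through directly. The sketch below uses the two instance sizes quoted on the slide; the function names and the idea of picking the type whose memory-per-core ratio is closest to the workload's are illustrative assumptions, not a description of Expedia's tooling.

```python
from typing import Dict, Tuple

# (memory in GiB, CPU cores), as quoted on the slide.
INSTANCE_TYPES: Dict[str, Tuple[float, int]] = {
    "c4.4xlarge": (30.0, 16),
    "r4.2xlarge": (61.0, 8),
}


def gib_per_core(memory_gib: float, cpu_cores: float) -> float:
    """Memory available (or reserved) per CPU core."""
    return memory_gib / cpu_cores


def closest_instance_type(reserved_memory_gib: float, reserved_cpu_cores: float) -> str:
    """Pick the instance type whose GiB-per-core ratio is nearest the workload's."""
    target = gib_per_core(reserved_memory_gib, reserved_cpu_cores)
    return min(
        INSTANCE_TYPES,
        key=lambda name: abs(gib_per_core(*INSTANCE_TYPES[name]) - target),
    )
```

With these figures, c4.4xlarge offers 1.875 GiB per core and r4.2xlarge offers 7.625 GiB per core, so services that reserve much more memory than CPU fit the r4 shape better.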
  • 78. Largest Production Cluster: CPU Reservation • 230 instances • 480 services • 3,200 containers • 12% CPU utilization • 64% CPU reservation • Visualized with c3vis (open source): https://github.com/ExpediaDotCom/c3vis
  • 79. Largest Production Cluster: Memory Reservation • 230 instances • 480 services • 3,200 containers • 13% memory utilization • 29% memory reservation • Visualized with c3vis (open source): https://github.com/ExpediaDotCom/c3vis
  • 81. Platform Building Blocks BENEFITS: • Speed: pre-built ECS clusters mean no EC2 instance startup time at deploy and auto-scaling time • Speed: Docker pulls only the layers it needs for images with a common hierarchy • Safety: immutable servers give confidence that there is no configuration drift on production infrastructure • Scale: clusters automatically scale horizontally to match workload
  • 82. Engineering Team Skillset •Microservice Generation Application Creation •Deployment Pipeline / Blue-Green Deploys •Auto-Scaling •Security •Logging •Traffic Management / Service Discovery Deployment Automation •ECS Cluster Creation / Immutable Servers / Auto-Scaling •Zero-Downtime Upgrades •Monitoring •Right-Sizing Cluster Management •ECS, EC2, VPC, IAM, CloudWatch, CloudFormation, AutoScaling, Route 53, ELB, Lambda, SNS, Support AWS
  • 85. Dev/Ops (×6), TPM, Manager • Liaise with Amazon ECS team • Upgrading ECS Clusters • Assisting Development Teams • Monitoring AWS Resource Limits • Cost Optimization • Monitoring Infrastructure • Migrations
  • 86. Team Responsibilities Expedia: Expedia Developers: • Create Primer applications • Invoke builds • Configure pipelines • Configure applications Expedia Cloud Team: • Maintain ECS cluster infrastructure • Maintain build servers • Support builds and AWS usage • Deployment automation Amazon: AWS ECS Team: • ECS Support • Manage Scheduling of Tasks • Recommendations • Feedback
  • 91. Current Blue-Green Deploys: Live Traffic → Amazon Route 53 CNAME → Load Balancer → Live Service - v1 (Amazon ECS)
  • 92. Current Blue-Green Deploys: Live Traffic → Amazon Route 53 CNAME → Load Balancer → Live Service - v1; Testing → Amazon Route 53 CNAME → Load Balancer → Canary - v2
  • 94. Current Blue-Green Deploys: Live Service updated v1 → v2 behind the live load balancer, with Canary - v2 still present • Can’t roll back without re-releasing • Can’t test new Tasks independently
  • 95. Current Blue-Green Deploys: Canary - v2 and Live Service - v2 both running
  • 96. Current Blue-Green Deploys: Live Traffic → Amazon Route 53 CNAME → Load Balancer → Live Service - v2 (Amazon ECS)
  • 97. Lesson: True Blue-Green Deploys ECS simulates blue-green deploys for each service behind load-balancer • Benefit: Don’t need to warm up the load-balancer for each release • Downside: Need to recreate load-balancer if modifying active listener - involves downtime • Downside: Can’t send some traffic to old tasks and some to new tasks for load testing Some aspects of ELBs are immutable: • ELB Scheme (e.g. “internet-facing”) Some aspects of ECS-ELB integration are immutable: • Once ECS service created, can’t assign different load-balancer • ELB Listeners associated with containers can’t be removed Recreating ELB with different configuration necessitates recreating ECS service
  • 98. What is True Blue-Green?
  • 99. Desired Blue-Green Deploys: Live Traffic → Amazon Route 53 CNAME → Load Balancer → Live Service - v1 (Amazon ECS)
  • 100. Desired Blue-Green Deploys: a second Load Balancer fronts Live Service - v2, which receives Testing traffic while Live Service - v1 serves Live Traffic
  • 101. Desired Blue-Green Deploys: Live Service - v1 and Live Service - v2 each sit behind their own Load Balancer under the Amazon Route 53 CNAME
  • 102. Desired Blue-Green Deploys • Bleed traffic at 10% intervals using weighted CNAMEs • Load testing with live traffic • Allows: rollback to known good (v1) • Allows: new ELB settings • Requires: warming up ELB
  • 103. Desired Blue-Green Deploys: 100% of Live Traffic on Live Service - v2
  • 104. Desired Blue-Green Deploys: ROLLBACK: Live Traffic returns to Live Service - v1
  • 105. Desired Blue-Green Deploys: 100% of Live Traffic on Live Service - v2
  • 106. Desired Blue-Green Deploys: Live Traffic → Amazon Route 53 CNAME → Load Balancer → Live Service - v2 (Amazon ECS); Live Service - v1 and its Load Balancer are gone
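
The "bleed traffic at 10% intervals" idea from slide 102 amounts to stepping a pair of weights. A minimal sketch follows, assuming two weighted DNS records (one per load balancer) whose weights sum to 100 so that each weight reads as a percentage; the function name `bleed_schedule` is hypothetical and no DNS API is called.

```python
from typing import List, Tuple


def bleed_schedule(step_percent: int = 10) -> List[Tuple[int, int]]:
    """Return (v1_weight, v2_weight) pairs that shift traffic from v1 to v2.

    Each pair sums to 100, so a weight can be read as a share of live traffic.
    """
    if not 0 < step_percent <= 100 or 100 % step_percent != 0:
        raise ValueError("step_percent must divide 100 evenly")
    return [(100 - v2, v2) for v2 in range(step_percent, 101, step_percent)]


# Rolling back at any point means restoring the known-good weights.
ROLLBACK_WEIGHTS: Tuple[int, int] = (100, 0)
```

Because v1 and its load balancer stay in place until the final step, restoring `ROLLBACK_WEIGHTS` returns all traffic to the known-good version without a re-release.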
  • 109. Lesson: Know Your Resource Limits • Ask nicely :) • Start ECS agent with exponential backoff
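
Exponential backoff, as mentioned on this slide, can be sketched as below. The base delay, growth factor, cap, and attempt count are illustrative assumptions, and `start` is a hypothetical callable standing in for whatever actually starts the ECS agent; the slides do not give Expedia's values.

```python
import time
from typing import Callable, Iterable, List


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0, attempts: int = 6) -> List[float]:
    """Delays that grow geometrically and are capped at `cap` seconds."""
    return [min(cap, base * factor ** n) for n in range(attempts)]


def start_with_backoff(start: Callable[[], bool], delays: Iterable[float],
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `start` until it succeeds, sleeping for each delay after a failure.

    Returns True on the first success, or False once the delays run out.
    """
    for delay in delays:
        if start():
            return True
        sleep(delay)
    return False
```

Adding random jitter to each delay would further spread out retries when many instances boot at the same moment.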
  • 110. Lesson: Beware of Rate Limits • API rate limits: the more ELBs and ECS services you have, the more ECS → ELB traffic your account will have • DescribeInstanceHealth API call • Workaround: shard your cloud presence into smaller accounts
  • 112. Lesson: Avoid Auto-Scale Thrashing Problem: 1. ASG scales up due to high memory reservation 2. 5 minutes later, ASG scales down due to low CPU reservation 3. Repeat from #1 Solution #1: Fix scaling dimensions • Scale down only when both are low Solution #2: Fix ratios • Match service resource ratios to the instance type's resource ratio For now: set scale-down policies low
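
The thrashing problem and Solution #1 can be shown with a small decision sketch. The thresholds reuse the figures from slide 61 and are assumed here to apply to reservation percentages; the function names are hypothetical, and real Auto Scaling policies are configured in AWS rather than written as Python.

```python
def should_scale_up(cpu: float, mem: float,
                    cpu_high: float = 70.0, mem_high: float = 60.0) -> bool:
    """Scale up when either dimension is high."""
    return cpu >= cpu_high or mem >= mem_high


def should_scale_down_independent(cpu: float, mem: float,
                                  cpu_low: float = 10.0, mem_low: float = 20.0) -> bool:
    """Problem case: each dimension triggers scale-down on its own."""
    return cpu <= cpu_low or mem <= mem_low


def should_scale_down_combined(cpu: float, mem: float,
                               cpu_low: float = 10.0, mem_low: float = 20.0) -> bool:
    """Solution #1: scale down only when both dimensions are low."""
    return cpu <= cpu_low and mem <= mem_low
```

With 8% CPU and 65% memory reserved, the independent rules ask for a scale-up and a scale-down at the same time, which is the thrashing loop; the combined rule asks only for the scale-up.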
  • 114. Future Plans • Cost Allocation • Service Discovery • ECR Adoption
  • 115. Benefits of Microservice Platform on ECS • Application Creation (Microservice Generation): Cost: reduced cost of experimentation; Speed: fast feedback • Deployment Automation: Speed: many manual steps reduced to the click of a button; Safety: repeatable, reliable • Cluster Management: Speed: pre-built ECS clusters mean no EC2 instance startup time at deploy and auto-scaling time; Speed: Docker pulls only the layers it needs for images with a common hierarchy; Safety: immutable servers give confidence that there is no configuration drift on production infrastructure; Scale: clusters automatically scale horizontally to match workload
  • 119. Did We Succeed? • Mission: Speed up developers' lives
  • 120. Time Savings Per Deploy • Primer: Dedicated EC2 instances with Chef-built AMIs: • 30 minutes per deploy • Primer 2.0: Docker on ECS: • 3 minutes per deploy • Receive feedback 27 minutes faster
  • 121. Average 524 Deploys per Day to Test Environment
  • 122. Support team = 8 people • EC2 deploy with AMI = 30 mins • ECS deploy with container = 3 mins • 27 min saving per deploy × 524 builds = 29.5 dev days saved every day • 30 Developer Days Saved Every Business Day
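
The 29.5-day figure can be checked with a few lines of arithmetic. The 8-hour developer day is an assumption: the slide does not state it, but it is the value that reproduces the quoted result.

```python
def developer_days_saved(deploys_per_day: int, old_minutes: float,
                         new_minutes: float, hours_per_day: float = 8.0) -> float:
    """Convert per-deploy minutes saved into developer days saved per day."""
    saved_minutes = deploys_per_day * (old_minutes - new_minutes)
    return saved_minutes / 60.0 / hours_per_day


# 524 deploys x 27 minutes = 14,148 minutes = 235.8 hours, about 29.5 eight-hour days.
print(round(developer_days_saved(524, 30, 3), 1))
```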
  • 123. Opportunity Cost Savings ECS EC2 • Pre-built fleet of clusters • Quickly and safely run software as Docker images • Reduced opportunity cost by 30x
  • 124. Thanks! Questions? Matt Callanan • mcallanan@expedia.com • linkedin.com/in/matthewcallanan • @mcallana