At Capital One, we are using Docker and container technologies to advance microservices adoption, increase efficiencies of cloud resources, and decouple the application layer from the underlying infrastructure. Capital One is a federated organization with a “you build it, you own it” culture that provides autonomy and speed for delivery teams. Each federated team runs and operates its own container management stack. To help the federated teams accelerate their adoption of cloud and container-based apps, we created self-service automation tools for creating and operating the container management stack.
In this session, we explore our push-button automation tool, which creates and manages Amazon ECS clusters, sets up an Application Load Balancer for dynamic and context-based routing, and provides a user interface via a Jenkins job or an AWS Lambda function. Our tooling also includes home-grown dynamic service discovery and routing for applications requiring two-way mutual SSL authentication. We talk through how Capital One regularly updates AMIs with the latest patches and software versions using an automated solution that leverages AWS Lambda to rehydrate the Amazon ECS compute cluster with the latest AMI without causing any downtime. We also discuss how we created a sophisticated canary deployment automation using AWS Lambda and application services, where users can specify how to migrate to a new version of containerized apps and manage the deployment.
AWS empowers enterprise Docker deployment with Amazon ECS and an ecosystem of cloud services and serverless architectures, making containerization in mission-critical environments easier than ever.
4. We use Docker and ECS-based container technologies to advance microservices adoption and increase efficiencies of cloud resources:
• Microservices architecture: We embraced microservices architecture for our cloud applications, and this is driving Docker container technology adoption
• Federated operating model: Ours is a federated organization with a You Build You Own operating model providing autonomy and speed for delivery teams
• Self-service automation tools: We developed self-service container management automation tools based on ECS to accelerate federated teams' application delivery
5. Amazon ECS is the most adopted container management solution at Capital One
• ECS and Docker implementations at Capital One include Credit Card servicing, Auto Loan Servicing, and Enterprise Open Source office applications
• We run microservices, event-driven applications, batch applications, real-time APIs, and web applications using ECS and Docker solutions
• ECS is adopted in multiple lines of business for both internal and customer applications
• ECS simplified the containerization journey at Capital One
• We leverage ECS’s integration with CloudWatch, IAM, and other native services for seamless integration with operations
• With ECS and our automation tooling, Docker apps can be deployed with a production-hardened container stack in minutes
6. Container stacks are integrated with Enterprise DevOps tools
providing an end-to-end automation solution for containerized
microservices.
[Figure: elements shown: developers, SCM, build, code, binary repo, Docker image repo, cluster scheduler, cluster manager, service discovery, software LB, ELB, API gateway, clients]
Container management solution components
Capability | Solution
SCM | GitHub Enterprise
Build | Jenkins
Repos | Nexus, Docker registry
Compute cluster, cluster manager, container scheduler | EC2 instances, ECS
Dynamic service discovery | Consul, Target Group
Load balancer | Nginx, Application Load Balancer
Load balancer | Elastic Load Balancing
API gateway | API Gateway
8. Our container stack evolved along with ECS and ELB advancements.
• Stack 1 (simple stack, mutual SSL): ECS + Classic Load Balancer
• Stack 2 (high-density packing, mutual SSL): ECS + Classic Load Balancer + Consul + Registrator + Nginx
• Stack 3 (simple stack, high-density packing): ECS + Application Load Balancer
11. ECS with a Classic Load Balancer is a simpler solution for running containers; however, fixed host port mapping limits each ECS instance to one task per service.
[Figure: stack 1. ECS cluster in an Auto Scaling group across Availability Zones A and B, one ECS instance per zone, running tasks of services SV1 and SV2; Service 1 and Service 2 each have their own ELB]
• The ELB’s fixed listener port allows only one task per service on each ECS instance
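The port constraint can be illustrated with a small placement simulation. This is a minimal sketch, not the ECS scheduler: the `Instance` class, `place_tasks` function, and the starting dynamic port number are hypothetical. It assumes that a host port of 0 stands for a dynamically assigned host port, as in ECS bridge-mode port mappings.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical model of one ECS container instance."""
    name: str
    used_ports: set = field(default_factory=set)
    next_dynamic_port: int = 32768  # illustrative start of a dynamic range


def place_tasks(instances, task_count, host_port):
    """Try to place task_count tasks; return (instance name, host port) pairs.

    host_port > 0 models a fixed host port mapping.
    host_port == 0 models a dynamically assigned host port.
    """
    placed = []
    for _ in range(task_count):
        # Prefer the least-loaded instance so tasks spread out.
        for inst in sorted(instances, key=lambda i: len(i.used_ports)):
            if host_port == 0:
                port = inst.next_dynamic_port
                inst.next_dynamic_port += 1
            elif host_port in inst.used_ports:
                continue  # fixed port already taken on this instance
            else:
                port = host_port
            inst.used_ports.add(port)
            placed.append((inst.name, port))
            break
    return placed


fixed = place_tasks([Instance("a"), Instance("b")], 4, host_port=8080)
dynamic = place_tasks([Instance("a"), Instance("b")], 4, host_port=0)
print(len(fixed), len(dynamic))
```

With two instances and four desired tasks, the fixed mapping places only one task per instance, while the dynamic mapping places all four.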
12. The ECS, Classic Load Balancer, Consul, Nginx, and Registrator solution provides dynamic service discovery and load balancing; however, it involves managing several components.
[Figure: stack 2. ECS cluster in an Auto Scaling group across Availability Zones A and B behind a services ELB; each ECS instance runs tasks of services SV1 and SV2 plus Nginx, a Consul agent, Registrator, and Consul templates. A separate Consul cluster of three instances in an Auto Scaling group spans Availability Zones A, B, and C behind a Consul ELB]
• The ELB fixed listener port is mapped to Nginx running in each ECS instance, so an ELB can serve multiple services
• The Nginx config routes service requests to the appropriate service containers/tasks
• Registrator, Consul, and Consul templates dynamically discover containers/tasks and configure Nginx
• A Consul cluster provides dynamic service discovery
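The idea of regenerating the Nginx config from discovered tasks can be sketched in a few lines of Python. This is only an illustration of the pattern, not Consul template syntax and not our tooling: `render_nginx_config` and the sample addresses are hypothetical, and mutual SSL settings are left out.

```python
def render_nginx_config(services):
    """Render upstream and location blocks from discovered endpoints.

    `services` maps a service name to a list of (host, port) pairs, the kind
    of data a service registry would return for running tasks.
    """
    # Skip services with no running tasks; an empty upstream is invalid.
    live = {name: eps for name, eps in services.items() if eps}
    lines = []
    for name in sorted(live):
        lines.append(f"upstream {name} {{")
        for host, port in live[name]:
            lines.append(f"    server {host}:{port};")
        lines.append("}")
    lines.append("server {")
    lines.append("    listen 80;")
    for name in sorted(live):
        lines.append(f"    location /{name}/ {{")
        lines.append(f"        proxy_pass http://{name};")
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


discovered = {
    "sv1": [("10.0.0.1", 32768), ("10.0.0.2", 32770)],
    "sv2": [("10.0.0.1", 32769)],
}
print(render_nginx_config(discovered))
```

Each time a task starts or stops, the registry contents change and the config would be rendered again and reloaded, which is the part that adds operational overhead to this stack.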
13. The ECS and Application Load Balancer solution provides a simpler, more efficient stack with dynamic service discovery and load balancing capabilities.
[Figure: stack 3. ECS cluster in an Auto Scaling group across Availability Zones A and B behind an Application Load Balancer (the services ELB); each ECS instance runs multiple tasks of services SV1 and SV2]
• For each service, a target group is created with a routing rule; service containers with dynamic ports are added to the target group
• The Application Load Balancer routes requests to service containers based on routing rules and target groups
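The rule and target group relationship can be sketched as follows. This is a simplified model for illustration, not the load balancer's implementation: the `TargetGroup` class and `route` function are hypothetical, `fnmatch` only approximates wildcard path patterns, and round-robin selection is assumed for simplicity.

```python
from fnmatch import fnmatchcase


class TargetGroup:
    """Hypothetical target group holding (host, dynamic port) targets."""

    def __init__(self, name):
        self.name = name
        self.targets = []
        self._next = 0

    def register(self, host, port):
        self.targets.append((host, port))

    def pick(self):
        """Return the next target in round-robin order, or None if empty."""
        if not self.targets:
            return None
        target = self.targets[self._next % len(self.targets)]
        self._next += 1
        return target


def route(rules, default_group, path):
    """Return a target for `path` using the first matching rule.

    `rules` is a list of (path pattern, TargetGroup) in priority order.
    """
    for pattern, group in rules:
        if fnmatchcase(path, pattern):
            return group.pick()
    return default_group.pick()


sv1 = TargetGroup("sv1")
sv1.register("10.0.0.1", 32768)  # two tasks of one service on
sv1.register("10.0.0.1", 32769)  # the same instance, different ports
rules = [("/sv1/*", sv1)]
print(route(rules, TargetGroup("default"), "/sv1/accounts"))
```

Because targets are registered as host and port pairs, several tasks of the same service can share one instance, which is what enables high-density packing in this stack.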
15. Infrastructure automation lets developers focus on application
development and spend less time on infrastructure coding:
• Lambda functions, Jenkins jobs for container stack creation, termination
• Blue/green and canary deployment automation tools
• AMI update automation
• Container health checks, alerts, and actions
• Integration with enterprise logging solution
• Monitoring solution with CloudWatch
• JVM stats monitoring with CloudWatch
• Automatic scaling of ECS containers
• Automatic scaling of ECS Instances
• Test apparatus self-service tool for performance testing
17. Automation tooling provides a consistent and repeatable way for
users to create container stacks without writing a single line of
infrastructure code
• Users provide parameters for container stack creation, such as subnets and security groups
• The S3 put event for the parameters triggers Lambda
• Lambda executes Terraform with the parameters for infrastructure creation
• The container stack is created in the virtual private cloud (VPC)
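The trigger flow above can be sketched in Python. This is a minimal illustration under stated assumptions, not our Lambda code: the function names are hypothetical, downloading the parameters object and actually running Terraform are omitted, and the command assumes a Terraform version that accepts `-auto-approve` and `-var`. The event fields follow the standard S3 event notification structure.

```python
from urllib.parse import unquote_plus


def parameters_location(event):
    """Extract (bucket, key) of the uploaded parameters file from an S3 put event."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded
    return bucket, key


def terraform_command(params):
    """Build a `terraform apply` argument list passing each parameter as -var."""
    command = ["terraform", "apply", "-auto-approve"]
    for name in sorted(params):
        command += ["-var", f"{name}={params[name]}"]
    return command


sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my_s3_bucket"},
                "object": {"key": "stacks/my-app-cluster.params"}}}
    ]
}
print(parameters_location(sample_event))
print(terraform_command({"asg_min": "3", "asg_max": "9"}))
```

Keeping the event parsing and command construction as pure functions makes them easy to test without AWS access; a real handler would fetch the object, parse it, and run the command in a working directory containing the Terraform configuration.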
18. Users provide information like subnets, security groups, metrics,
alarms, and alerts as parameters for a container stack creation tool.
instance_type="m3.medium"
server_subnets="subnet-ab12,subnet-ab12"
ecs_sg="sg-sg1234"
asg_min="3"
asg_max="9"
asg_desired="6"
sns_topic="my-alerts"
scalein_adjustment="-1"
scaleout_adjustment="1"
scalein_cooldown="300"
scaleout_cooldown="300"
scaleout_alarm_cpu_interval_secs="900"
scaleout_cpu_percent="80"
scalein_cpu_percent="40"
custom_script_location="my_s3_bucket"
custom_script_name="custom-script"
X509_cert_location="my_s3_bucket"
X509_cert_files="cert1.cer,cert2.cer"
ecs_cluster_name="my-app-cluster"
iam_role="my_app_Iam_role"
docker_registry="my-docker-registry"
proxy_server="my-co.proxycom"
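A parameters file in this shape is straightforward to parse and sanity-check before any infrastructure is created. The following is a minimal sketch, assuming one name="value" pair per line; `parse_parameters` and `validate_scaling` are hypothetical helpers, and the two checks shown are illustrative rather than a list of what the tool validates.

```python
def parse_parameters(text):
    """Parse lines of the form name="value" into a dict of strings."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        # Strip straight quotes and tolerate typographic quotes as well.
        params[name.strip()] = value.strip().strip('"“”')
    return params


def validate_scaling(params):
    """Return a list of problems found in the Auto Scaling parameters."""
    problems = []
    low, desired, high = (
        int(params[key]) for key in ("asg_min", "asg_desired", "asg_max")
    )
    if not low <= desired <= high:
        problems.append("asg_desired must lie between asg_min and asg_max")
    if int(params["scalein_cpu_percent"]) >= int(params["scaleout_cpu_percent"]):
        problems.append("scalein_cpu_percent must be below scaleout_cpu_percent")
    return problems


sample = '''
asg_min="3"
asg_max="9"
asg_desired="6"
scaleout_cpu_percent="80"
scalein_cpu_percent="40"
'''
print(validate_scaling(parse_parameters(sample)))
```

For the sample values from the slide, both checks pass and the list of problems is empty.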
19. Three Lambda functions make up the core of the stack creation; these
microfunctions decouple the compute cluster, Application Load
Balancer, and ECS service so they can have their own lifecycles.
• One Lambda function creates the ECS cluster and EC2 instances (compute cluster: rehydrated independently)
• A second Lambda function creates the load balancer with a default target group (Application Load Balancer: end users get the same endpoints)
• A third Lambda function creates the target group, ECS service, and rule (ECS service: target group and service deployments)
[Figure: ECS cluster in an Auto Scaling group across Availability Zones A and B, with ECS instances running tasks of service1 and service2]
21. Users perform regular AMI updates without outages using the
automation tooling's Lambda functions
[Figure: old-AMI container stack. ECS cluster in an Auto Scaling group across Availability Zones A and B; two ECS instances on the old AMI run the tasks of service1 and service2]
22. AMI update: Lambda function creates new ECS cluster and EC2
instances with the new AMI.
[Figure: step 1. Lambda creates updated EC2 instances: a new Auto Scaling group of ECS instances on the new AMI across Availability Zones A and B, alongside the old-AMI instances that still run the tasks of service1 and service2]
23. AMI update: Lambda function replicates ECS services from old
AMI cluster to new AMI-based ECS cluster
[Figure: step 2. Lambda replicates ECS services to the new-AMI instances; tasks of service1 and service2 run on both the old-AMI and new-AMI instances]
24. AMI update: Lambda function drains and deletes ECS services from
the old ECS cluster and terminates the old AMI EC2 instances
[Figure: step 3. Lambda deletes the ECS services and old instances; the old-AMI instances are marked X while the new-AMI instances keep running the tasks of service1 and service2]
25. AMI update: This completes the AMI update for the stack without
causing any outages
[Figure: new-AMI container stack. ECS cluster in an Auto Scaling group across Availability Zones A and B; two ECS instances on the new AMI run the tasks of service1 and service2]
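The three AMI update steps can be walked through with an in-memory model that tracks how many tasks of each service are running after every step. This is a simulation for illustration only: the `rehydrate` function and the dict layout are hypothetical, and no AWS calls are made.

```python
def rehydrate(old_cluster, new_ami):
    """Return (new_cluster, history) for an AMI update.

    A cluster is a dict: {"ami": str, "instances": int, "services": {name: tasks}}.
    `history` records the running task count per service after each step.
    """
    history = []

    def running(*clusters):
        totals = {}
        for cluster in clusters:
            for name, tasks in cluster["services"].items():
                totals[name] = totals.get(name, 0) + tasks
        return totals

    # Step 1: create a new cluster with the same instance count on the new AMI.
    new_cluster = {"ami": new_ami, "instances": old_cluster["instances"], "services": {}}
    history.append(running(old_cluster, new_cluster))

    # Step 2: replicate every service onto the new cluster.
    new_cluster["services"] = dict(old_cluster["services"])
    history.append(running(old_cluster, new_cluster))

    # Step 3: drain and delete the old services, then terminate old instances.
    old_cluster = {"ami": old_cluster["ami"], "instances": 0, "services": {}}
    history.append(running(old_cluster, new_cluster))

    return new_cluster, history


old = {"ami": "ami-old", "instances": 2, "services": {"service1": 3, "service2": 3}}
new, history = rehydrate(old, "ami-new")
print(new["ami"], history)
```

In this model the running task count for each service never drops below its original value, which is the property that lets the update proceed without an outage; the cost is temporarily running double capacity during step 2.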
27. Blue/green deployment reduces downtime and risk by running two
environments called Blue and Green and toggling between them
Image Courtesy: http://martinfowler.com/bliki/BlueGreenDeployment.html
28. Users perform Blue/Green deployments and rollbacks using the
automation tooling's Lambda functions
[Figure: blue services. ECS cluster in an Auto Scaling group across Availability Zones A and B; ECS instances run service1 Task1 and Task2]
29. Blue/Green: Lambda function creates a beta ELB and a green service;
users can test the green service with the beta ELB
[Figure: Lambda creates the beta load balancer and green service; the ECS instances across Availability Zones A and B run both the blue and the green tasks of service1]
30. Blue/Green: Lambda function adds green service to the original
ELB; traffic flows to green service
[Figure: Lambda adds the green service to the original load balancer; the ECS instances across Availability Zones A and B run both the blue and the green tasks of service1]
31. Blue/Green: Lambda function deletes blue services and the Beta
ELB; traffic flows to green service
[Figure: Lambda deletes the beta load balancer and blue services; the ECS instances across Availability Zones A and B run only the green tasks of service1]
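The blue/green sequence can be modeled as three state changes on the set of services registered behind each load balancer. This is a minimal sketch for illustration: the `blue_green` function, the "beta" key, and the service names are hypothetical, and no AWS resources are involved.

```python
def blue_green(load_balancers, prod_lb, blue, green):
    """Walk through the three blue/green steps and return the state after each.

    `load_balancers` maps a load balancer name to the set of services
    registered behind it.
    """
    state = {name: set(services) for name, services in load_balancers.items()}
    snapshots = []

    def snapshot():
        snapshots.append({name: sorted(services) for name, services in state.items()})

    # Step 1: create a beta load balancer and the green service behind it.
    state["beta"] = {green}
    snapshot()

    # Step 2: add the green service to the original load balancer.
    state[prod_lb].add(green)
    snapshot()

    # Step 3: delete the blue service and the beta load balancer.
    state[prod_lb].discard(blue)
    del state["beta"]
    snapshot()

    return snapshots


for step in blue_green({"prod": {"service1-blue"}}, "prod", "service1-blue", "service1-green"):
    print(step)
```

Until step 3 runs, the blue service is still registered, so a rollback in this model would amount to removing the green service from the original load balancer instead.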
33. Canary deployment is a way of releasing a new version of an
application by mixing new and old versions and gradually
increasing the percentage of the new version
[Figure: three stages behind a load balancer: V1 V1 V1 V1 V2, then V1 V1 V2 V2 V2, then V2 V2 V2 V2 V2]
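The gradual shift can be expressed as a plan of old and new task counts per stage. This is a minimal sketch: `canary_plan` is a hypothetical helper, and the 20/60/100 percent stages are an assumption chosen to match the five-task illustration above, not values from our tooling.

```python
import math


def canary_plan(total_tasks, percentages):
    """Return (old_count, new_count) per stage for the given new-version percentages."""
    plan = []
    for percent in percentages:
        if not 0 <= percent <= 100:
            raise ValueError(f"percentage out of range: {percent}")
        # Round up so that any non-zero percentage yields at least one new task.
        new = min(total_tasks, math.ceil(total_tasks * percent / 100))
        plan.append((total_tasks - new, new))
    return plan


print(canary_plan(5, [20, 60, 100]))  # [(4, 1), (2, 3), (0, 5)]
```

Each tuple corresponds to one stage of the figure: four V1 and one V2, then two V1 and three V2, then all five on V2.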
37. Lessons learned and looking forward
• Amazon ECS has significantly reduced the operational effort of running our container stacks
• With ECS and our automation tooling, Docker apps can be deployed
with a production-hardened container stack in minutes
We would like to see the following ECS features, which would
accelerate our enterprise adoption:
• Container-level security groups
• Container placement constraints
• Balancing placements with scale-in, scale-out actions