CONTINUOUS DEPLOYMENT
                          The Dirty Details


Mike Brittain
ENGINEERING DIRECTOR   @mikebrittain
                       mike@etsy.com
“OK, sounds cool. But I have some questions...”
CD 100- & 200-levels
- CI environment for automated tests
- Committing to trunk
- Branching in code
- Config flags (a.k.a. feature flags)
- DevOps mentality
- Metrics and alerting
- Automated deploy scripts

CD 300 level
- Deploys vs. releases
- Decoupled systems, schema changes
- How we work: Arch. & Process
- Integration and Operations

credit: photobookgirl (flickr)
www.etsy.com
GROSS MERCHANDISE SALES




http://www.etsy.com/blog/news/2013/notes-from-chad-2012-year-in-review/
DECEMBER 2012
1.5 Billion page views
$117 Million of goods sold
6 Million items sold

Items by anjaysdesigns, betwixxt, OneStarLeatherGoods, mediumcontrol, TheDesignPallet
http://www.etsy.com/blog/news/2013/etsy-statistics-december-2012-weather-report/
175+ Committers, everyone deploys




credit: martin_heigan (flickr)
DEPLOYMENTS PER DAY




       Very end of 2009
                          Today
“Continuous delivery is a pattern language in growing use in software
development to improve the process of software delivery. Techniques such as
automated testing, continuous integration, and continuous deployment allow
software to be developed to a high standard and easily packaged and deployed
to test environments, resulting in the ability to rapidly, reliably and
repeatedly push out enhancements and bug fixes to customers at low risk and
with minimal manual overhead.”

                                                                 ~wikipedia
credit: Stewart, redgen (flickr)
Architecture Stack
                                  Linux, Apache, MySQL, PHP
                                  Memcache, Gearman, PostgreSQL, Solr,
                                  Java, Traffic Server, Hadoop, HBase
                                  Git, Jenkins




credit: Saire Elizabeth (flickr)
Then                      Now
2009                      2010-today
Just before we
started using CD

Then                      Now
6-14 hours                15 mins
“Deployment Army”         1 person
Special event and         Part of everyday
highly orchestrated       workflow

Then                      Now
Blocked for 6-14 hours.   Blocked for 15 minutes.
6+ hours to redeploy      15 minutes to redeploy

Then                      Now
Release branch,           Mainline,
database schemas,         minimal linking
data transforms,          and building,
packaging,                rsync,
rolling restarts,         site up
cache purging,
scheduled downtime
1st day
Put your face on Etsy.com.
2nd day
                         Complete tax, insurance, and
                              benefits forms.




credit: ktpupp (flickr)
WARNING
GROSS MERCHANDISE SALES
Continuous Deployment
      Small, frequent changes.
Constantly integrating into production.
        30+ deploys per day.
“Wow... 30 deploys a day.
How do you build features so quickly?”
Software Deploy ≠ Product Launch
Deploys frequently gated by config flags
             (“dark” releases)
$cfg['new_search'] = array('enabled' => 'off');
$cfg['sign_in']    = array('enabled' => 'on');
$cfg['checkout']   = array('enabled' => 'on');
$cfg['homepage']   = array('enabled' => 'on');
$cfg['new_search'] = array('enabled' => 'off');

// Meanwhile...

if ($cfg['new_search']['enabled'] == 'on') {
  # New and fancy search
  $results = do_solr();
} else {
  # old and boring search
  $results = do_grep();
}
$cfg['new_search'] = array('enabled' => 'on');

// or...

$cfg['new_search'] = array('enabled' => 'staff');

// or...

$cfg['new_search'] = array('enabled' => '1%');

// or...

$cfg['new_search'] = array('enabled' => 'users',
                           'user_list' => 'mike,john,kellan');
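The 'enabled' values above ('on', 'off', 'staff', a percentage, or a user list) could be evaluated per request roughly as follows. This is a minimal sketch, not Etsy's actual implementation: the function name feature_enabled, the shape of the $user array, and the crc32-based bucketing are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: decides whether a flag is on for a given user.
// Assumed $user shape: array('id' => int, 'name' => string, 'is_staff' => bool).
function feature_enabled(array $cfg, $flag, array $user)
{
    if (!isset($cfg[$flag]['enabled'])) {
        return false; // unknown flags are treated as off
    }
    $enabled = $cfg[$flag]['enabled'];

    if ($enabled === 'on') {
        return true;
    }
    if ($enabled === 'off') {
        return false;
    }
    if ($enabled === 'staff') {
        return !empty($user['is_staff']);
    }
    if ($enabled === 'users') {
        $list = isset($cfg[$flag]['user_list'])
            ? explode(',', $cfg[$flag]['user_list'])
            : array();
        return in_array($user['name'], array_map('trim', $list), true);
    }
    // Percentage ramp such as '1%': bucket users deterministically so the
    // same user keeps the same experience from one request to the next.
    if (preg_match('/^(\d{1,3})%$/', $enabled, $m)) {
        $bucket = abs(crc32($flag . ':' . $user['id'])) % 100; // 0..99
        return $bucket < (int) $m[1];
    }
    return false; // unrecognised value: fail closed
}

$cfg = array();
$cfg['new_search'] = array('enabled'   => 'users',
                           'user_list' => 'mike,john,kellan');

$mike    = array('id' => 1, 'name' => 'mike', 'is_staff' => true);
$results = feature_enabled($cfg, 'new_search', $mike) ? 'solr' : 'grep';
```

Hashing the flag name together with the user id means each flag picks a different slice of users, so the same people are not always first in line for every ramp-up.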
Validate in production, hidden from public.
What’s in a deploy?
Small incremental changes to the application
New classes, methods, controllers
Graphics, stylesheets, templates
Copy/content changes

Turning flags on, off, or % ramp up
Low MTTR (response times)
Latent bugs and security holes
Traffic management, load shedding
Adding and removing infrastructure

Tweaking config flags or releasing patches.
Config flags          Operator          Metrics

http://www.flickr.com/photos/flyforfun/2694158656/
Favorites
 “on”
Favorites
 “off”
Many deploys eventually lead to a product launch.
“How do you continuously deploy
  database schema changes?”
Code deploys ~15-20 minutes
Schema changes THURSDAYS!
Our web application is largely monolithic.
     Etsy.com, Support & Back-office tools,
     Developer API, Gearman (async work)


                                PHP, Apache, Memcache
External “services” are not deployed with
          the main application.
e.g. Databases, Search, Photo storage, Payments

MYSQL (schema changes)
SOLR, JVM (rolling restarts)
PROXY CACHE, FILERS, AMAZON S3 (specialized infra.)
PCI (controlled access)
For every config flag, there are two states
  we can support — present and future.
                            ... or past and present.
“Non-Breaking Expansions”
 Expose new version in a service interface;
support multiple versions in the consumer.
Example: Changing a Database Schema
    Merging “users” and “users_prefs”
     RULE OF THUMB:
     Prefer ADDs over ALTERs (non-breaking expansion)
0. Add new version to schema
1. Write to both versions
2. Backfill historical data
3. Read from new version
4. Cut-off writes to old version
0. Add new version to schema
Schema change to add prefs columns to “users” table.

“write_prefs_to_user_prefs_table” => “on”
“write_prefs_to_users_table” => “off”
“read_prefs_from_users_table” => “off”
1. Write to both versions
Write code for writing prefs to the “users” table.

“write_prefs_to_user_prefs_table” => “on”
“write_prefs_to_users_table” => “on”
“read_prefs_from_users_table” => “off”
2. Backfill historical data
Offline process to sync existing data from “user_prefs”
to new columns in “users”
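A backfill of this kind might look like the sketch below. The arrays are in-memory stand-ins for the two tables, and the column names (language, currency) and the function name backfill_prefs are hypothetical; a real job would run in batches against the database.

```php
<?php
// Hypothetical stand-ins for the two tables, keyed by user id.
$user_prefs = array(
    1 => array('language' => 'en', 'currency' => 'USD'),
    2 => array('language' => 'fr', 'currency' => 'EUR'),
);
$users = array(
    1 => array('name' => 'mike', 'language' => 'en', 'currency' => 'USD'), // already dual-written
    2 => array('name' => 'john', 'language' => null, 'currency' => null),  // predates dual writes
);

// Copy prefs into the new columns only where they are still empty, so rows
// already populated by live dual writes are not overwritten with older data.
function backfill_prefs(array $users, array $user_prefs)
{
    $copied = 0;
    foreach ($user_prefs as $user_id => $prefs) {
        if (!isset($users[$user_id])) {
            continue; // prefs row without a user; leave for manual review
        }
        foreach ($prefs as $column => $value) {
            if (!isset($users[$user_id][$column])) { // null or missing
                $users[$user_id][$column] = $value;
                $copied++;
            }
        }
    }
    return array($users, $copied);
}

list($users, $copied) = backfill_prefs($users, $user_prefs);
```

Because it only fills empty columns, the job can be re-run safely if it is interrupted.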
3. Read from new version
Data validation tests. Ensure consistency both internally
and in production.

“write_prefs_to_user_prefs_table” => “on”
“write_prefs_to_users_table” => “on”
“read_prefs_from_users_table” => “staff” ⇾ “1%” ⇾ “5%” ⇾ “on”

(“on” == “100%”)
4. Cut-off writes to old version
After running on the new table for a significant amount
of time, we can cut off writes to the old table.

“write_prefs_to_user_prefs_table” => “off”
“write_prefs_to_users_table” => “on”
“read_prefs_from_users_table” => “on”
“Branch by Abstraction”

Controller                    Controller

            Users Model (Abstraction)

old schema: “users” (old), “user_prefs”
new schema: “users”

http://paulhammant.com/blog/branch_by_abstraction.html
http://continuousdelivery.com/2011/05/make-large-scale-changes-incrementally-with-branch-by-abstraction/
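One way to picture the abstraction layer is the sketch below. The interface and class names (PrefsStore, UserPrefsTableStore, UsersTableStore) are invented for illustration, and arrays stand in for database tables; the point is only that the controller talks to one interface while the schema underneath changes.

```php
<?php
// Assumed interface: controllers depend on this, never on a table layout.
interface PrefsStore
{
    public function load($user_id);
}

// Old schema: prefs live in a separate "user_prefs" table (array stand-in).
class UserPrefsTableStore implements PrefsStore
{
    private $rows;
    public function __construct(array $rows) { $this->rows = $rows; }
    public function load($user_id)
    {
        return isset($this->rows[$user_id]) ? $this->rows[$user_id] : array();
    }
}

// New schema: prefs are columns on the "users" row.
class UsersTableStore implements PrefsStore
{
    private $rows;
    public function __construct(array $rows) { $this->rows = $rows; }
    public function load($user_id)
    {
        if (!isset($this->rows[$user_id])) {
            return array();
        }
        $row = $this->rows[$user_id];
        unset($row['name']); // in this sketch, every column except name is a pref
        return $row;
    }
}

// The "controller" sees only the abstraction, so the backing schema can be
// swapped without touching it.
function render_language(PrefsStore $store, $user_id)
{
    $prefs = $store->load($user_id);
    return isset($prefs['language']) ? $prefs['language'] : 'en';
}

$old = new UserPrefsTableStore(array(7 => array('language' => 'fr')));
$new = new UsersTableStore(array(7 => array('name' => 'kellan', 'language' => 'fr')));
```

Either store can be handed to render_language; which one is used in production would be decided by a config flag.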
“The Migration 4-Step”
1. Write to both versions
2. Backfill historical data
3. Read from new version
4. Cut-off writes to old version
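The four steps can be sketched in code as below. The flag names come from the slides; everything else is assumed: flags are simplified to plain 'on'/'off' strings (no 'staff' or percentage ramp), the function names save_prefs and load_prefs are hypothetical, and arrays stand in for the two tables.

```php
<?php
// Writes go to whichever versions are switched on.
function save_prefs(array $cfg, array &$db, $user_id, array $prefs)
{
    if ($cfg['write_prefs_to_user_prefs_table'] === 'on') {
        $db['user_prefs'][$user_id] = $prefs;                    // old version
    }
    if ($cfg['write_prefs_to_users_table'] === 'on') {
        $existing = isset($db['users'][$user_id]) ? $db['users'][$user_id] : array();
        $db['users'][$user_id] = array_merge($existing, $prefs); // new version
    }
}

// Reads come from exactly one version, chosen by the read flag.
function load_prefs(array $cfg, array $db, $user_id)
{
    $table = $cfg['read_prefs_from_users_table'] === 'on' ? 'users' : 'user_prefs';
    return isset($db[$table][$user_id]) ? $db[$table][$user_id] : array();
}

// Step 1: write to both versions, still read from the old one.
$cfg = array(
    'write_prefs_to_user_prefs_table' => 'on',
    'write_prefs_to_users_table'      => 'on',
    'read_prefs_from_users_table'     => 'off',
);
$db = array('user_prefs' => array(), 'users' => array());
save_prefs($cfg, $db, 42, array('language' => 'de'));

// Step 2 (backfill) would run offline here.

// Step 3: flip reads to the new version.
$cfg['read_prefs_from_users_table'] = 'on';
$after_flip = load_prefs($cfg, $db, 42);

// Step 4: cut off writes to the old version.
$cfg['write_prefs_to_user_prefs_table'] = 'off';
save_prefs($cfg, $db, 42, array('language' => 'it'));
```

Each step is only a flag change, so backing out of step 3 or step 4 is another flag change rather than a code rollback.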
“When do you clean up all of those config flags?”
We might remove config flags for the old version when...
It is no longer valid for the business.
It is no longer stable, maintained, or trusted.
It has poor performance characteristics.
The code is a mess, or difficult to read.
We can afford to spend time on it.
Promote “dev flags” to “feature flags”
// Feature flag
$cfg['mobilized_pages'] = array('enabled' => 'on');

// Dev flags
$cfg['mobile_templates_seller_tools']   = array('enabled' => 'on');
$cfg['mobile_templates_account_tools']  = array('enabled' => 'on');
$cfg['mobile_templates_member_profile'] = array('enabled' => 'on');
$cfg['mobile_templates_search']         = array('enabled' => 'off');
$cfg['mobile_templates_activity_feed']  = array('enabled' => 'off');

...

if ($cfg['mobilized_pages']['enabled'] == 'on' && $cfg['mobile_templates_search']['enabled'] == 'on') {
     // ...
     // ...
}
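The nested check above could be wrapped in a small helper so a dev flag only takes effect while its parent feature flag is on. The helper name flags_on is hypothetical, and it handles only plain 'on'/'off' values.

```php
<?php
// Hypothetical helper: true only if every named flag is 'on'.
function flags_on(array $cfg, array $names)
{
    foreach ($names as $name) {
        if (!isset($cfg[$name]['enabled']) || $cfg[$name]['enabled'] !== 'on') {
            return false;
        }
    }
    return true;
}

$cfg = array();
$cfg['mobilized_pages']               = array('enabled' => 'on');
$cfg['mobile_templates_seller_tools'] = array('enabled' => 'on');
$cfg['mobile_templates_search']       = array('enabled' => 'off');

$seller_tools_mobile = flags_on($cfg, array('mobilized_pages', 'mobile_templates_seller_tools'));
$search_mobile       = flags_on($cfg, array('mobilized_pages', 'mobile_templates_search'));
```

Once every dev flag has been on for a while, the dev flags can be deleted and only the feature flag kept.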
// Feature flags
$cfg['search']        = array('enabled' => 'on');
$cfg['developer_api'] = array('enabled' => 'on');
$cfg['seller_tools']  = array('enabled' => 'on');

$cfg['the_entire_web_site']                     = array('enabled' => 'on');
$cfg['the_entire_web_site_no_really_i_mean_it'] = array('enabled' => 'on');
Architecture and Process
Deploying is cheap.
Releasing is cheap.
Some philosophies on product development...
Gathering data should be cheap, too.
        staff, opt-in prototypes, 1%
Treat first iterations as experiments.
Get into code as quickly as possible.
“Where a new system concept or new technology is used, one has to build a
system to throw away, for even the best planning is not so omniscient as to
get it right the first time. Hence plan to throw one away; you will, anyhow.”

                                             ~ Fred Brooks, The Mythical Man-Month
Architecture largely doesn’t matter.
Kill things that don’t work.
Your assumptions will be wrong
    once you’ve scaled 10x.
“We don’t optimize for being right. We optimize
  for quickly detecting when we’re wrong.”
                                ~Kellan Elliott-McCrea, CTO
Become really good at changing
     your architecture.
Invest time in architecture by the
       2nd or 3rd iteration.
Integration and Operations
WARNING

          REMEMBER THIS?
Continuous Deployment
      Small, frequent changes.
Constantly integrating into production.
         30 deploys per day.
Code review before commit
Automated tests before deploy
Why Integrate with Production?
Dev ≠ Prod
Verify frequently and in small batches.
Integrating with production is a test in itself.
    We do this frequently and in small batches.
More database servers in prod.
Bigger database hardware in prod.
More web servers.
Various replication schemes.
Different versions of server and OS software.
Schema changes applied at different times.
Physical hardware in prod.
More data in prod.
Legacy data (7 years of odd user states).
More traffic in prod.
Wait, I mean MUCH more traffic in prod.
Fewer elves.
Faster disks (SSDs) in prod.
Using a MySQL database in dev for an application that will be running
on Oracle in production: Priceless
Verify frequently and in small batches.
Dev ⇾ QA ⇾ Staging ⇾ Prod
Dev ⇾ Pre-Prod ⇾ Prod
Test and integrate where you’ll see value.
Config flags (again)
off, on, staff, opt-in prototypes, user list, 0-100%

                    “canary pools”
Automated tests after deploy
Real-time metrics and dashboards
Network & Servers, Application, Business
SERVER METRICS
Apache requests/sec, Busy processes,
CPU utilization, Script exec time (med. & 95th)
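Tracking both the median and the 95th percentile of script execution time matters because one slow outlier moves the 95th percentile but barely touches the median. The sketch below uses the nearest-rank method; the function name percentile and the sample values are made up for illustration.

```php
<?php
// Nearest-rank percentile over a list of samples (e.g. exec times in ms).
function percentile(array $samples, $p)
{
    if (count($samples) === 0) {
        return null;
    }
    sort($samples);
    $rank = (int) ceil(($p / 100) * count($samples)); // 1-based nearest rank
    return $samples[max($rank, 1) - 1];
}

// Made-up samples: nine ordinary requests and one very slow one.
$exec_times_ms = array(120, 95, 101, 87, 110, 2300, 99, 105, 93, 98);

$median = percentile($exec_times_ms, 50); // 99
$p95    = percentile($exec_times_ms, 95); // 2300
```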
APPLICATION METRICS
Logins, Registrations, Checkouts,
Listings created, Forum posts


Time and event correlated.
Real humans reporting trouble!
“Theoretical” vs. “Practical” Engineering
Config flags          Operator          Metrics

http://www.flickr.com/photos/flyforfun/2694158656/
Managing risk during Holiday Shopping season
  Thanksgiving, “Black Friday,” “Cyber Monday” ➔ Christmas
                           (~30 days)



                    Code Freeze?
DEPLOYMENTS PER DAY




                      “Code Slush”
Tighten your feedback cycles
Integrate with production and validate early in cycle.
Use tools that allow you to detect issues early.
Optimize for quick response times.

Applied to both feature development and operability.
Thank you
                                                           ... and questions?


These slides will be available later today at http://mikebrittain.com/talks



    Mike Brittain
     ENGINEERING DIRECTOR       @mikebrittain
                                mike@etsy.com

More Related Content

What's hot

Introduction to Azure Functions
Introduction to Azure FunctionsIntroduction to Azure Functions
Introduction to Azure FunctionsCallon Campbell
 
CI CD Pipeline Using Jenkins | Continuous Integration and Deployment | DevOps...
CI CD Pipeline Using Jenkins | Continuous Integration and Deployment | DevOps...CI CD Pipeline Using Jenkins | Continuous Integration and Deployment | DevOps...
CI CD Pipeline Using Jenkins | Continuous Integration and Deployment | DevOps...Edureka!
 
Continuous Delivery
Continuous DeliveryContinuous Delivery
Continuous DeliveryMike McGarr
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingAraf Karsh Hamid
 
DevOps 101 - an Introduction to DevOps
DevOps 101  - an Introduction to DevOpsDevOps 101  - an Introduction to DevOps
DevOps 101 - an Introduction to DevOpsRed Gate Software
 
Design patterns for microservice architecture
Design patterns for microservice architectureDesign patterns for microservice architecture
Design patterns for microservice architectureThe Software House
 
Introduction to DevOps slides.pdf
Introduction to DevOps slides.pdfIntroduction to DevOps slides.pdf
Introduction to DevOps slides.pdfBoreVishnusai
 
DevOpsDays Taipei 2019 - Mastering IaC the DevOps Way
DevOpsDays Taipei 2019 - Mastering IaC the DevOps WayDevOpsDays Taipei 2019 - Mastering IaC the DevOps Way
DevOpsDays Taipei 2019 - Mastering IaC the DevOps Waysmalltown
 
Dev ops != Dev+Ops
Dev ops != Dev+OpsDev ops != Dev+Ops
Dev ops != Dev+OpsShalu Ahuja
 
Introduction to Azure DevOps
Introduction to Azure DevOpsIntroduction to Azure DevOps
Introduction to Azure DevOpsLorenzo Barbieri
 
Application modernization patterns with apache kafka, debezium, and kubernete...
Application modernization patterns with apache kafka, debezium, and kubernete...Application modernization patterns with apache kafka, debezium, and kubernete...
Application modernization patterns with apache kafka, debezium, and kubernete...Bilgin Ibryam
 
Terraform introduction
Terraform introductionTerraform introduction
Terraform introductionJason Vance
 

What's hot (20)

Introduction to Azure Functions
Introduction to Azure FunctionsIntroduction to Azure Functions
Introduction to Azure Functions
 
CI CD Pipeline Using Jenkins | Continuous Integration and Deployment | DevOps...
CI CD Pipeline Using Jenkins | Continuous Integration and Deployment | DevOps...CI CD Pipeline Using Jenkins | Continuous Integration and Deployment | DevOps...
CI CD Pipeline Using Jenkins | Continuous Integration and Deployment | DevOps...
 
Azure DevOps
Azure DevOpsAzure DevOps
Azure DevOps
 
Continuous Delivery
Continuous DeliveryContinuous Delivery
Continuous Delivery
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb Sharding
 
DevOps 101 - an Introduction to DevOps
DevOps 101  - an Introduction to DevOpsDevOps 101  - an Introduction to DevOps
DevOps 101 - an Introduction to DevOps
 
Azure dev ops
Azure dev opsAzure dev ops
Azure dev ops
 
Design patterns for microservice architecture
Design patterns for microservice architectureDesign patterns for microservice architecture
Design patterns for microservice architecture
 
Intro to Azure DevOps
Intro to Azure DevOpsIntro to Azure DevOps
Intro to Azure DevOps
 
Azure DevOps - Azure Guatemala Meetup
Azure DevOps - Azure Guatemala MeetupAzure DevOps - Azure Guatemala Meetup
Azure DevOps - Azure Guatemala Meetup
 
infrastructure as code
infrastructure as codeinfrastructure as code
infrastructure as code
 
Introduction to DevOps slides.pdf
Introduction to DevOps slides.pdfIntroduction to DevOps slides.pdf
Introduction to DevOps slides.pdf
 
DevOpsDays Taipei 2019 - Mastering IaC the DevOps Way
DevOpsDays Taipei 2019 - Mastering IaC the DevOps WayDevOpsDays Taipei 2019 - Mastering IaC the DevOps Way
DevOpsDays Taipei 2019 - Mastering IaC the DevOps Way
 
Azure DevOps
Azure DevOpsAzure DevOps
Azure DevOps
 
Dev ops != Dev+Ops
Dev ops != Dev+OpsDev ops != Dev+Ops
Dev ops != Dev+Ops
 
Introduction to Azure DevOps
Introduction to Azure DevOpsIntroduction to Azure DevOps
Introduction to Azure DevOps
 
Application modernization patterns with apache kafka, debezium, and kubernete...
Application modernization patterns with apache kafka, debezium, and kubernete...Application modernization patterns with apache kafka, debezium, and kubernete...
Application modernization patterns with apache kafka, debezium, and kubernete...
 
Sonarqube
SonarqubeSonarqube
Sonarqube
 
Terraform introduction
Terraform introductionTerraform introduction
Terraform introduction
 
Introduction to docker
Introduction to dockerIntroduction to docker
Introduction to docker
 

Viewers also liked

Continuous Deployment at Etsy: A Tale of Two Approaches
Continuous Deployment at Etsy: A Tale of Two ApproachesContinuous Deployment at Etsy: A Tale of Two Approaches
Continuous Deployment at Etsy: A Tale of Two ApproachesRoss Snyder
 
Principles and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyPrinciples and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyMike Brittain
 
Continuous delivery for databases
Continuous delivery for databasesContinuous delivery for databases
Continuous delivery for databasesDevOpsGroup
 
Managing (Schema) Migrations in Cassandra
Managing (Schema) Migrations in CassandraManaging (Schema) Migrations in Cassandra
Managing (Schema) Migrations in CassandraDataStax Academy
 
From Building a Marketplace to Building Teams
From Building a Marketplace to Building TeamsFrom Building a Marketplace to Building Teams
From Building a Marketplace to Building TeamsMike Brittain
 
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at FlickrJohn Allspaw
 
How to Get to Second Base with Your CDN
How to Get to Second Base with Your CDNHow to Get to Second Base with Your CDN
How to Get to Second Base with Your CDNMike Brittain
 
Take My Logs. Please!
Take My Logs. Please!Take My Logs. Please!
Take My Logs. Please!Mike Brittain
 
Continuous Deployment with Cassandra
Continuous Deployment with CassandraContinuous Deployment with Cassandra
Continuous Deployment with CassandraDataStax Academy
 
Advanced Topics in Continuous Deployment
Advanced Topics in Continuous DeploymentAdvanced Topics in Continuous Deployment
Advanced Topics in Continuous DeploymentMike Brittain
 
Fungus on White Bread
Fungus on White BreadFungus on White Bread
Fungus on White BreadGaurav Lochan
 
Scaling Up Continuous Deployment
Scaling Up Continuous DeploymentScaling Up Continuous Deployment
Scaling Up Continuous DeploymentTimothy Fitz
 
Continuous Delivery in the AWS Cloud
Continuous Delivery in the AWS CloudContinuous Delivery in the AWS Cloud
Continuous Delivery in the AWS CloudNigel Fernandes
 
Metrics-Driven Engineering at Etsy
Metrics-Driven Engineering at EtsyMetrics-Driven Engineering at Etsy
Metrics-Driven Engineering at EtsyMike Brittain
 
Metrics-Driven Engineering
Metrics-Driven EngineeringMetrics-Driven Engineering
Metrics-Driven EngineeringMike Brittain
 
Web Performance Culture and Tools at Etsy
Web Performance Culture and Tools at EtsyWeb Performance Culture and Tools at Etsy
Web Performance Culture and Tools at EtsyMike Brittain
 
Analysis of TLS in SMTP World
Analysis of TLS in SMTP WorldAnalysis of TLS in SMTP World
Analysis of TLS in SMTP WorldBinu Ramakrishnan
 
The Hard Problems of Continuous Deployment
The Hard Problems of Continuous DeploymentThe Hard Problems of Continuous Deployment
The Hard Problems of Continuous DeploymentTimothy Fitz
 
AppSec++ Take the best of Agile, DevOps and CI/CD into your AppSec Program
AppSec++ Take the best of Agile, DevOps and CI/CD into your AppSec ProgramAppSec++ Take the best of Agile, DevOps and CI/CD into your AppSec Program
AppSec++ Take the best of Agile, DevOps and CI/CD into your AppSec ProgramMatt Tesauro
 

Viewers also liked (20)

Database compatibility
Database compatibilityDatabase compatibility
Database compatibility
 
Continuous Deployment at Etsy: A Tale of Two Approaches
Continuous Deployment at Etsy: A Tale of Two ApproachesContinuous Deployment at Etsy: A Tale of Two Approaches
Continuous Deployment at Etsy: A Tale of Two Approaches
 
Principles and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyPrinciples and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at Etsy
 
Continuous delivery for databases
Continuous delivery for databasesContinuous delivery for databases
Continuous delivery for databases
 
Managing (Schema) Migrations in Cassandra
Managing (Schema) Migrations in CassandraManaging (Schema) Migrations in Cassandra
Managing (Schema) Migrations in Cassandra
 
From Building a Marketplace to Building Teams
From Building a Marketplace to Building TeamsFrom Building a Marketplace to Building Teams
From Building a Marketplace to Building Teams
 
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
 
How to Get to Second Base with Your CDN
How to Get to Second Base with Your CDNHow to Get to Second Base with Your CDN
How to Get to Second Base with Your CDN
 
Take My Logs. Please!
Take My Logs. Please!Take My Logs. Please!
Take My Logs. Please!
 
Continuous Deployment with Cassandra
Continuous Deployment with CassandraContinuous Deployment with Cassandra
Continuous Deployment with Cassandra
 
Advanced Topics in Continuous Deployment
Advanced Topics in Continuous DeploymentAdvanced Topics in Continuous Deployment
Advanced Topics in Continuous Deployment
 
Fungus on White Bread
Fungus on White BreadFungus on White Bread
Fungus on White Bread
 
Scaling Up Continuous Deployment
Scaling Up Continuous DeploymentScaling Up Continuous Deployment
Scaling Up Continuous Deployment
 
Continuous Delivery in the AWS Cloud
Continuous Delivery in the AWS CloudContinuous Delivery in the AWS Cloud
Continuous Delivery in the AWS Cloud
 
Metrics-Driven Engineering at Etsy
Metrics-Driven Engineering at EtsyMetrics-Driven Engineering at Etsy
Metrics-Driven Engineering at Etsy
 
Metrics-Driven Engineering
Metrics-Driven EngineeringMetrics-Driven Engineering
Metrics-Driven Engineering
 
Web Performance Culture and Tools at Etsy
Web Performance Culture and Tools at EtsyWeb Performance Culture and Tools at Etsy
Web Performance Culture and Tools at Etsy
 
Analysis of TLS in SMTP World
Analysis of TLS in SMTP WorldAnalysis of TLS in SMTP World
Analysis of TLS in SMTP World
 
The Hard Problems of Continuous Deployment
The Hard Problems of Continuous DeploymentThe Hard Problems of Continuous Deployment
The Hard Problems of Continuous Deployment
 
AppSec++ Take the best of Agile, DevOps and CI/CD into your AppSec Program
AppSec++ Take the best of Agile, DevOps and CI/CD into your AppSec ProgramAppSec++ Take the best of Agile, DevOps and CI/CD into your AppSec Program
AppSec++ Take the best of Agile, DevOps and CI/CD into your AppSec Program
 

Similar to Continuous Deployment: The Dirty Details

Continuous Delivery: The Dirty Details
Continuous Delivery: The Dirty DetailsContinuous Delivery: The Dirty Details
Continuous Delivery: The Dirty DetailsMike Brittain
 
(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...
(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...
(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...Amazon Web Services
 
ServerTemplate Deep Dive
ServerTemplate Deep DiveServerTemplate Deep Dive
ServerTemplate Deep DiveRightScale
 
Working Software Over Comprehensive Documentation
Working Software Over Comprehensive DocumentationWorking Software Over Comprehensive Documentation
Working Software Over Comprehensive DocumentationAndrii Dzynia
 
Engineering Velocity @indeed eng presented on Sept 24 2014 at Beyond Agile
Engineering Velocity @indeed eng presented on Sept 24 2014 at Beyond AgileEngineering Velocity @indeed eng presented on Sept 24 2014 at Beyond Agile
Engineering Velocity @indeed eng presented on Sept 24 2014 at Beyond AgileKenAtIndeed
 
[RHFSeoul2017]6 Steps to Transform Enterprise Applications
[RHFSeoul2017]6 Steps to Transform Enterprise Applications[RHFSeoul2017]6 Steps to Transform Enterprise Applications
[RHFSeoul2017]6 Steps to Transform Enterprise ApplicationsDaniel Oh
 
Cloud Best Practices
Cloud Best PracticesCloud Best Practices
Cloud Best PracticesEric Bottard
 
Deploy and Destroy: Testing Environments - Michael Arenzon - DevOpsDays Tel A...
Deploy and Destroy: Testing Environments - Michael Arenzon - DevOpsDays Tel A...Deploy and Destroy: Testing Environments - Michael Arenzon - DevOpsDays Tel A...
Deploy and Destroy: Testing Environments - Michael Arenzon - DevOpsDays Tel A...DevOpsDays Tel Aviv
 
Web Apps and more
Web Apps and moreWeb Apps and more
Web Apps and moreYan Shi
 
Web app and more
Web app and moreWeb app and more
Web app and morefaming su
 
AD113 Speed Up Your Applications w/ Nginx and PageSpeed
AD113  Speed Up Your Applications w/ Nginx and PageSpeedAD113  Speed Up Your Applications w/ Nginx and PageSpeed
AD113 Speed Up Your Applications w/ Nginx and PageSpeededm00se
 
The Ember.js Framework - Everything You Need To Know
The Ember.js Framework - Everything You Need To KnowThe Ember.js Framework - Everything You Need To Know
The Ember.js Framework - Everything You Need To KnowAll Things Open
 
Just In Time Scalability Agile Methods To Support Massive Growth Presentation
Just In Time Scalability  Agile Methods To Support Massive Growth PresentationJust In Time Scalability  Agile Methods To Support Massive Growth Presentation
Just In Time Scalability Agile Methods To Support Massive Growth PresentationEric Ries
 
Just In Time Scalability Agile Methods To Support Massive Growth Presentation
Just In Time Scalability  Agile Methods To Support Massive Growth PresentationJust In Time Scalability  Agile Methods To Support Massive Growth Presentation
Just In Time Scalability Agile Methods To Support Massive Growth PresentationTimothy Fitz
 
How to measure everything - a million metrics per second with minimal develop...
How to measure everything - a million metrics per second with minimal develop...How to measure everything - a million metrics per second with minimal develop...
How to measure everything - a million metrics per second with minimal develop...Jos Boumans
 
Real-World Pulsar Architectural Patterns
Real-World Pulsar Architectural PatternsReal-World Pulsar Architectural Patterns
Real-World Pulsar Architectural PatternsDevin Bost
 
Everything is Awesome - Cutting the Corners off the Web
Everything is Awesome - Cutting the Corners off the WebEverything is Awesome - Cutting the Corners off the Web
Everything is Awesome - Cutting the Corners off the WebJames Rakich
 
App engine devfest_mexico_10
App engine devfest_mexico_10App engine devfest_mexico_10
App engine devfest_mexico_10Chris Schalk
 
(WEB301) Operational Web Log Analysis | AWS re:Invent 2014
(WEB301) Operational Web Log Analysis | AWS re:Invent 2014(WEB301) Operational Web Log Analysis | AWS re:Invent 2014
(WEB301) Operational Web Log Analysis | AWS re:Invent 2014Amazon Web Services
 

Continuous Deployment: The Dirty Details

  • 1. CONTINUOUS DEPLOYMENT The Dirty Details Mike Brittain ENGINEERING DIRECTOR @mikebrittain mike@etsy.com
  • 2. CONTINUOUS DEPLOYMENT The Dirty Details “OK, sounds cool. But I have some questions...”
  • 3. CD 100- & 200-levels - CI environment for automated tests - Committing to trunk - Branching in code - Config flags (a.k.a. feature flags) - DevOps mentality - Metrics and alerting - Automated deploy scripts credit: photobookgirl (flickr)
  • 4. CD 100- & 200-levels - CI environment for automated tests - Committing to trunk - Branching in code - Config flags (a.k.a. feature flags) - DevOps mentality - Metrics and alerting - Automated deploy scripts. CD 300 level - Deploys vs. releases - Decoupled systems, schema changes - How we work: Arch. & Process - Integration and Operations credit: photobookgirl (flickr)
  • 5.
  • 6. www.etsy.com
  • 8. DECEMBER 2012 1.5 Billion page views $117 Million of goods sold 6 Million items sold Items by anjaysdesigns, betwixxt, OneStarLeatherGoods, mediumcontrol, TheDesignPallet http://www.etsy.com/blog/news/2013/etsy-statistics-december-2012-weather-report/
  • 9. 175+ Committers, everyone deploys credit: martin_heigan (flickr)
  • 10. DEPLOYMENTS PER DAY Very end of 2009 Today
  • 11. Continuous delivery is a pattern language in growing use in software development to improve the process of software delivery. Techniques such as automated testing, continuous integration, and continuous deployment allow software to be developed to a high standard and easily packaged and deployed to test environments, resulting in the ability to rapidly, reliably and repeatedly push out enhancements and bug fixes to customers at low risk and with minimal manual overhead. ~wikipedia credit: Stewart, redgen (flickr)
  • 12. Architecture Stack Linux, Apache, MySQL, PHP Memcache, Gearman, Postgresql, Solr, Java, Traffic Server, Hadoop, HBase Git, Jenkins credit: Saire Elizabeth (flickr)
  • 13.
  • 14. Then: 2009 (just before we started using CD). Now: 2010-today.
  • 15. Then: 6-14 hours; “Deployment Army”; special event and highly orchestrated. Now: 15 mins; 1 person; part of everyday workflow.
  • 16. Then: Blocked for 6-14 hours. 6+ hours to redeploy. Now: Blocked for 15 minutes. 15 minutes to redeploy.
  • 17. Then: Release branch, database schemas, data transforms, packaging, rolling restarts, cache purging, scheduled downtime. Now: Mainline, minimal linking and building, rsync, site up.
  • 18. 1st day Put your face on Etsy.com.
  • 19. 2nd day Complete tax, insurance, and benefits forms. credit: ktpupp (flickr)
  • 20.
  • 23. Continuous Deployment Small, frequent changes. Constantly integrating into production. 30+ deploys per day.
  • 24. “Wow... 30 deploys a day. How do you build features so quickly?”
  • 25. Software Deploy ≠ Product Launch
  • 26. Deploys frequently gated by config flags (“dark” releases)
  • 27. $cfg['new_search'] = array('enabled' => 'off');
        $cfg['sign_in'] = array('enabled' => 'on');
        $cfg['checkout'] = array('enabled' => 'on');
        $cfg['homepage'] = array('enabled' => 'on');
  • 29. $cfg['new_search'] = array('enabled' => 'off');
        // Meanwhile...
        # old and boring search
        $results = do_grep();
  • 30. $cfg['new_search'] = array('enabled' => 'off');
        // Meanwhile...
        if ($cfg['new_search']['enabled'] == 'on') {
            # New and fancy search
            $results = do_solr();
        } else {
            # old and boring search
            $results = do_grep();
        }
  • 31. $cfg['new_search'] = array('enabled' => 'on');
        // or...
        $cfg['new_search'] = array('enabled' => 'staff');
        // or...
        $cfg['new_search'] = array('enabled' => '1%');
        // or...
        $cfg['new_search'] = array('enabled' => 'users', 'user_list' => 'mike,john,kellan');
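
The deck shows the flag values but not the code that interprets them. The following is a minimal sketch of how such values could be evaluated, not Etsy's implementation: the function name `flag_enabled`, the shape of the `$user` array, and the crc32-based percentage bucketing are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: decide whether a config flag is enabled for one user.
// The 'enabled' values mirror the slide; everything else is an assumption.
function flag_enabled(array $cfg, string $name, array $user): bool
{
    $flag = $cfg[$name] ?? array('enabled' => 'off');
    $mode = $flag['enabled'];

    if ($mode === 'on') {
        return true;
    }
    if ($mode === 'off') {
        return false;
    }
    if ($mode === 'staff') {
        return !empty($user['is_staff']);
    }
    if ($mode === 'users') {
        $allowed = array_map('trim', explode(',', $flag['user_list'] ?? ''));
        return in_array($user['login'], $allowed, true);
    }
    if (preg_match('/^(\d+(?:\.\d+)?)%$/', $mode, $m)) {
        // Hash flag name + user id so each user keeps a stable bucket 0..9999.
        // Assumes a 64-bit PHP build, where crc32() is never negative.
        $bucket = crc32($name . ':' . $user['id']) % 10000;
        return $bucket < (float) $m[1] * 100;
    }
    return false; // unknown mode: fail closed
}

$cfg = array();
$cfg['new_search'] = array('enabled' => 'users', 'user_list' => 'mike,john,kellan');

$mike = array('id' => 1, 'login' => 'mike', 'is_staff' => true);
$anon = array('id' => 2, 'login' => 'someone', 'is_staff' => false);

var_dump(flag_enabled($cfg, 'new_search', $mike)); // bool(true)
var_dump(flag_enabled($cfg, 'new_search', $anon)); // bool(false)
```

Hashing the flag name together with the user id means a given user does not flip in and out of a feature between requests, and different flags ramp over different slices of users.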
  • 32. Validate in production, hidden from public.
  • 33. What’s in a deploy? Small incremental changes to the application: new classes, methods, controllers; graphics, stylesheets, templates; copy/content changes; turning flags on, off, or % ramp up.
  • 34. Low MTTR (response times): latent bugs and security holes; traffic management, load shedding; adding and removing infrastructure. Tweaking config flags or releasing patches.
  • 36. Config flags Operator Metrics http://www.flickr.com/photos/flyforfun/2694158656/
  • 39. Many deploys eventually lead to a product launch.
  • 40. “How do you continuously deploy database schema changes?”
  • 41. Code deploys ~15-20 minutes Schema changes
  • 42. Code deploys ~15-20 minutes Schema changes THURSDAYS!
  • 43. Our web application is largely monolithic. Etsy.com, Support & Back-office tools, Developer API, Gearman (async work)
  • 44. Our web application is largely monolithic. Etsy.com, Support & Back-office tools, Developer API, Gearman (async work) PHP, Apache, Memcache
  • 45. External “services” are not deployed with the main application. e.g. Databases, Search, Photo storage, Payments
  • 46. External “services” are not deployed with the main application. e.g. Databases, Search, Photo storage, Payments. MYSQL (schema changes); PCI (controlled access); PROXY CACHE, FILERS, AMAZON S3 (specialized infra.); SOLR, JVM (rolling restarts)
  • 47. For every config flag, there are two states we can support — present and future.
  • 48. For every config flag, there are two states we can support — present and future. ... or past and present.
  • 49. “Non-Breaking Expansions” Expose new version in a service interface; support multiple versions in the consumer.
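
As a rough illustration of a non-breaking expansion, the sketch below shows a consumer that accepts both an old and a new response shape. The payload fields (`version`, `name`, `first_name`, `last_name`) and the function `display_name` are hypothetical and do not come from the talk.

```php
<?php
// Hypothetical payloads: version 1 returns a flat 'name' field; version 2
// adds 'first_name' and 'last_name' without removing anything v1 consumers read.
function display_name(array $response): string
{
    $version = $response['version'] ?? 1;
    if ($version >= 2 && isset($response['first_name'], $response['last_name'])) {
        return $response['first_name'] . ' ' . $response['last_name'];
    }
    return $response['name']; // valid for v1, and for v2 because the field was kept
}

$v1 = array('version' => 1, 'name' => 'Mike B.');
$v2 = array(
    'version' => 2,
    'name' => 'Mike B.',
    'first_name' => 'Mike',
    'last_name' => 'Brittain',
);

echo display_name($v1), "\n"; // Mike B.
echo display_name($v2), "\n"; // Mike Brittain
```

Because the service only adds fields, the consumer and the service can be deployed in either order.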
  • 50. Example: Changing a Database Schema Merging “users” and “user_prefs”
  • 51. C RULE OF THUMB: Prefer ADDs over ALTERs (non-breaking expansion)
  • 52. 1. Write to both versions 2. Backfill historical data 3. Read from new version 4. Cut-off writes to old version
  • 53. 0. Add new version to schema 1. Write to both versions 2. Backfill historical data 3. Read from new version 4. Cut-off writes to old version
  • 54. 0. Add new version to schema Schema change to add prefs columns to “users” table. “write_prefs_to_user_prefs_table” => “on” “write_prefs_to_users_table” => “off” “read_prefs_from_users_table” => “off”
  • 55. 1. Write to both versions Write code for writing prefs to the “users” table. “write_prefs_to_user_prefs_table” => “on” “write_prefs_to_users_table” => “on” “read_prefs_from_users_table” => “off”
  • 56. 2. Backfill historical data Offline process to sync existing data from “user_prefs” to new columns in “users”
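
A backfill of this kind might look roughly like the sketch below. It uses in-memory arrays keyed by user id as stand-ins for the two tables; a real job would presumably page through MySQL rows in batches. The function name, column names, and batch size are assumptions for illustration.

```php
<?php
// Copy prefs from the old "user_prefs" rows into the new columns on "users".
// Both tables are modeled as arrays keyed by user id (an assumption).
function backfill_prefs(array $userPrefs, array &$users, int $batchSize = 2): int
{
    $copied = 0;
    foreach (array_chunk($userPrefs, $batchSize, true) as $batch) {
        foreach ($batch as $userId => $prefs) {
            if (!isset($users[$userId])) {
                continue; // orphaned prefs row: skip rather than invent a user
            }
            foreach ($prefs as $column => $value) {
                // Only fill columns that are still NULL, so the backfill never
                // overwrites fresher values already written by the dual writes.
                if (!isset($users[$userId][$column])) {
                    $users[$userId][$column] = $value;
                    $copied++;
                }
            }
        }
    }
    return $copied;
}

$users = array(
    1 => array('login' => 'mike', 'language' => null, 'currency' => null),
    2 => array('login' => 'john', 'language' => 'fr', 'currency' => null),
);
$userPrefs = array(
    1 => array('language' => 'en', 'currency' => 'USD'),
    2 => array('language' => 'en', 'currency' => 'EUR'),
    3 => array('language' => 'de', 'currency' => 'EUR'),
);

echo backfill_prefs($userPrefs, $users), "\n"; // 3
echo $users[2]['language'], "\n";              // fr
```

Skipping columns that already hold a value makes the job safe to re-run and safe to run while step 1's dual writes are live.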
  • 57. 3. Read from new version Data validation tests. Ensure consistency both internally and in production. “write_prefs_to_user_prefs_table” => “on” “write_prefs_to_users_table” => “on” “read_prefs_from_users_table” => “staff”
  • 58. 3. Read from new version Data validation tests. Ensure consistency both internally and in production. “write_prefs_to_user_prefs_table” => “on” “write_prefs_to_users_table” => “on” “read_prefs_from_users_table” => “1%”
  • 59. 3. Read from new version Data validation tests. Ensure consistency both internally and in production. “write_prefs_to_user_prefs_table” => “on” “write_prefs_to_users_table” => “on” “read_prefs_from_users_table” => “5%”
  • 60. 3. Read from new version Data validation tests. Ensure consistency both internally and in production. “write_prefs_to_user_prefs_table” => “on” “write_prefs_to_users_table” => “on” “read_prefs_from_users_table” => “on” (“on” == “100%”)
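
The slides do not show what the data validation tests look like. One simple form, sketched here under assumed data shapes, is to read the same prefs from both versions and list the users whose values disagree. `find_mismatches` and the column names are hypothetical.

```php
<?php
// Compare prefs in the old table with the new columns on "users" and return
// the ids of users whose values differ. Array shapes are assumptions.
function find_mismatches(array $userPrefs, array $users, array $columns): array
{
    $mismatched = array();
    foreach ($userPrefs as $userId => $old) {
        foreach ($columns as $column) {
            $oldValue = $old[$column] ?? null;
            $newValue = $users[$userId][$column] ?? null;
            if ($oldValue !== $newValue) {
                $mismatched[] = $userId;
                break; // one differing column is enough to flag the user
            }
        }
    }
    return $mismatched;
}

$userPrefs = array(
    1 => array('language' => 'en'),
    2 => array('language' => 'fr'),
);
$users = array(
    1 => array('login' => 'mike', 'language' => 'en'),
    2 => array('login' => 'john', 'language' => null),
);

echo implode(',', find_mismatches($userPrefs, $users, array('language'))), "\n"; // 2
```

An empty result at each ramp stage (staff, 1%, 5%) is one possible signal for moving the read flag to the next stage.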
  • 61. 4. Cut-off writes to old version After running on the new table for a significant amount of time, we can cut off writes to the old table. “write_prefs_to_user_prefs_table” => “off” “write_prefs_to_users_table” => “on” “read_prefs_from_users_table” => “on”
  • 62. “Branch by Abstraction” Diagram: Controller, Controller → Users Model (Abstraction) → old schema: “users” (old), “user_prefs”; new schema: “users”. http://paulhammant.com/blog/branch_by_abstraction.html http://continuousdelivery.com/2011/05/make-large-scale-changes-incrementally-with-branch-by-abstraction/
  • 63. “The Migration 4-Step” 1. Write to both versions 2. Backfill historical data 3. Read from new version 4. Cut-off writes to old version
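
The four steps can be sketched as a small model class that sits between controllers and storage, with the three flags from the slides deciding where prefs are written and read. This is an illustrative sketch only: the class, its methods, and the in-memory "tables" are assumptions, and it handles just 'on' and 'off' rather than the staff and percentage ramp shown in the slides.

```php
<?php
class UsersModel
{
    public $usersTable = array();     // new schema: prefs stored on "users"
    public $userPrefsTable = array(); // old schema: separate "user_prefs"
    private $cfg;

    public function __construct(array $cfg)
    {
        $this->cfg = $cfg;
    }

    public function setFlag(string $flag, string $value): void
    {
        $this->cfg[$flag] = $value;
    }

    private function on(string $flag): bool
    {
        // Simplification: anything other than 'on' counts as off.
        return ($this->cfg[$flag] ?? 'off') === 'on';
    }

    public function savePrefs(int $userId, array $prefs): void
    {
        if ($this->on('write_prefs_to_user_prefs_table')) {
            $this->userPrefsTable[$userId] = $prefs;
        }
        if ($this->on('write_prefs_to_users_table')) {
            $this->usersTable[$userId] = $prefs;
        }
    }

    public function loadPrefs(int $userId): ?array
    {
        $source = $this->on('read_prefs_from_users_table')
            ? $this->usersTable
            : $this->userPrefsTable;
        return $source[$userId] ?? null;
    }
}

// Step 1: write to both versions, still read from the old one.
$model = new UsersModel(array(
    'write_prefs_to_user_prefs_table' => 'on',
    'write_prefs_to_users_table' => 'on',
    'read_prefs_from_users_table' => 'off',
));
$model->savePrefs(1, array('language' => 'en'));

// Step 3: read from the new version.
$model->setFlag('read_prefs_from_users_table', 'on');
echo $model->loadPrefs(1)['language'], "\n"; // en

// Step 4: cut off writes to the old version.
$model->setFlag('write_prefs_to_user_prefs_table', 'off');
$model->savePrefs(1, array('language' => 'fr'));
echo $model->loadPrefs(1)['language'], "\n";        // fr
echo $model->userPrefsTable[1]['language'], "\n";   // en
```

Each step is a flag change rather than a code deploy, so any step can be reverted by flipping the flag back.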
  • 64. “When do you clean up all of those config flags?”
  • 65. We might remove config flags for the old version when... it is no longer valid for the business; it is no longer stable, maintained, or trusted; it has poor performance characteristics; the code is a mess, or difficult to read; we can afford to spend time on it.
  • 66. Promote “dev flags” to “feature flags”
  • 67. // Feature flag
        $cfg['mobilized_pages'] = array('enabled' => 'on');
        // Dev flags
        $cfg['mobile_templates_seller_tools'] = array('enabled' => 'on');
        $cfg['mobile_templates_account_tools'] = array('enabled' => 'on');
        $cfg['mobile_templates_member_profile'] = array('enabled' => 'on');
        $cfg['mobile_templates_search'] = array('enabled' => 'off');
        $cfg['mobile_templates_activity_feed'] = array('enabled' => 'off');
        ...
        if ($cfg['mobilized_pages']['enabled'] == 'on' && $cfg['mobile_templates_search']['enabled'] == 'on') {
            // ...
            // ...
        }
  • 68. // Feature flags
        $cfg['search'] = array('enabled' => 'on');
        $cfg['developer_api'] = array('enabled' => 'on');
        $cfg['seller_tools'] = array('enabled' => 'on');
        $cfg['the_entire_web_site'] = array('enabled' => 'on');
  • 69.
  • 70. // Feature flags
        $cfg['search'] = array('enabled' => 'on');
        $cfg['developer_api'] = array('enabled' => 'on');
        $cfg['seller_tools'] = array('enabled' => 'on');
        $cfg['the_entire_web_site'] = array('enabled' => 'on');
        $cfg['the_entire_web_site_no_really_i_mean_it'] = array('enabled' => 'on');
  • 74. Some philosophies on product development...
  • 75. Gathering data should be cheap, too. staff, opt-in prototypes, 1%
  • 76. Treat first iterations as experiments.
  • 77. Get into code as quickly as possible.
  • 78. “Where a new system concept or new technology is used, one has to build a system to throw away, for even the best planning is not so omniscient as to get it right the first time. Hence plan to throw one away; you will, anyhow.” ~ Fred Brooks, The Mythical Man-Month
  • 80. Kill things that don’t work.
  • 81. Your assumptions will be wrong once you’ve scaled 10x.
  • 82. “We don’t optimize for being right. We optimize for quickly detecting when we’re wrong.” ~Kellan Elliott-McCrea, CTO
  • 83. Become really good at changing your architecture.
  • 84. Invest time in architecture by the 2nd or 3rd iteration.
  • 86. WARNING REMEMBER THIS?
  • 87. Continuous Deployment Small, frequent changes. Constantly integrating into production. 30 deploys per day.
  • 90. Why Integrate with Production?
  • 92. Verify frequently and in small batches.
  • 93. Integrating with production is a test in itself. We do this frequently and in small batches.
  • 94. More database servers in prod. Bigger database hardware in prod. More web servers. Various replication schemes. Different versions of server and OS software. Schema changes applied at different times. Physical hardware in prod. More data in prod. Legacy data (7 years of odd user states). More traffic in prod. Wait, I mean MUCH more traffic in prod. Fewer elves. Faster disks (SSDs) in prod.
  • 95. Using a MySQL database in dev for an application that will be running on Oracle in production: Priceless
  • 96. Verify frequently and in small batches.
  • 97. Dev ⇾ QA ⇾ Staging ⇾ Prod
  • 98. Dev ⇾ QA ⇾ Staging ⇾ Prod
  • 99. Dev ⇾ Pre-Prod ⇾ Prod
  • 100. Test and integrate where you’ll see value.
  • 101. Config flags (again) off, on, staff, opt-in prototypes, user list, 0-100%
  • 102. Config flags (again) off, on, staff, opt-in prototypes, user list, 0-100% “canary pools”
  • 104. Real-time metrics and dashboards Network & Servers, Application, Business
  • 105.
  • 106. SERVER METRICS: Apache requests/sec, busy processes, CPU utilization, script exec time (med. & 95th). APPLICATION METRICS: logins, registrations, checkouts, listings created, forum posts. Time and event correlated.
  • 107.
  • 110. Config flags Operator Metrics http://www.flickr.com/photos/flyforfun/2694158656/
  • 111. Managing risk during Holiday Shopping season Thanksgiving, “Black Friday,” “Cyber Monday” ➔ Christmas (~30 days) Code Freeze?
  • 112. DEPLOYMENTS PER DAY “Code Slush”
  • 113. Tighten your feedback cycles Integrate with production and validate early in cycle. Use tools that allow you to detect issues early. Optimize for quick response times. Applied to both feature development and operability.
  • 114. Thank you ... and questions? These slides will be available later today at http://mikebrittain.com/talks Mike Brittain ENGINEERING DIRECTOR @mikebrittain mike@etsy.com