Canary Analyze All the Things
Roy Rapoport 
@royrapoport 
June 12, 2014 
Significant contributions by Chris Sanden, @chris_sanden 
1
Watch the video with slide synchronization on InfoQ.com!
http://www.infoq.com/presentations/canary-analysis-deployment-pattern
Presented at QCon New York 
www.qconnewyork.com 
Oh, the Places We’ll Go! 
• Introductions 
• Proposed Use Case and Definition 
• Continuous Improvement / MVP Model 
• Issues, Solutions 
• Cloud Considerations 
• The Road at Netflix 
2
A Word About Me … 
• About 20 years in technology
• Systems engineering, networking, software development, QA, release management
• Time at Netflix: 1809 days (4y:11m:14d)
• At Netflix:
  • Systems Engineering, Service Delivery in IT/Ops
  • Troubleshooter and Builder of Python Things[tm] in Product Engineering
  • Current role: Insight Engineering in Product Engineering
    • Real-Time Operational Insight
3
A Word About Netflix… 
Just the Stats 
•16 years 
•2000+ employees 
•48 million users 
•5x10^9 hours/quarter 
4
A Word About Netflix… 
Freedom and Responsibility Culture 
• Optimize speed of innovation
  • Constrain availability
  • Cost will be what cost will be
• Hire smart (experienced) people
  • Get out of their way
• Anti-process bias
5
A Word About Netflix… 
Technology and Operations 
•Service Oriented Architecture 
•Decentralized Operations. You 
•Build 
•Test 
•Deploy 
•Set up alerting and monitoring 
•Wake up at 2AM 
6
Oh, the Places We’ll Go! 
• Introductions 
• Proposed Use Case and Definition 
• Continuous Improvement / MVP Model 
• Issues, Solutions 
• Cloud Considerations 
• The Road at Netflix 
7
Why Canary Analysis? 
8
So You’ve Just Done a Release 
> curl http://WhatDoesTheFooSay.prod.netflix.net/api/v1/cat 
{"response": "meow"}
9
So You’ve Just Done a Release 
> curl http://WhatDoesTheFooSay.prod.netflix.net/api/v1/dog 
{"response": "woof"}
10
So You’ve Just Done a Release 
> curl http://WhatDoesTheFooSay.prod.netflix.net/api/v1/fox 
{"response": "wa-pa-pa-pa-pa-pa-pow"}
The correct answer to “what does the fox say?” is left as an exercise for the reader
11
You Need Better Testing! 
Well, yeah 
12
You Need Better Testing! 
“I’m going to push to production, though I’m pretty sure it’s going to kill the system”
- Said no one, ever*
* Hopefully
13
Detour 
Rate of Change vs Availability 
[Chart: Availability (nines), 0 to 6, against Rate of Change, 1 to 1000; labels: Operations, Engineering]
14
You Need Better Testing! Deployments!
Canary Analysis 
• A deployment process where 
• a new change (in behavior, code, or both) 
• is rolled out into production gradually, 
• with checkpoints along the way to examine the new (canary) systems 
• (optionally versus the old (baseline) systems) 
• and make go/no-go decisions. 
15
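The definition above can be sketched as a loop over rollout stages with a go/no-go checkpoint at each one. This is only an illustration, not the process used at Netflix: the function names and the health check are hypothetical, and the stage sizes (1, 10, 1000 servers) simply echo the "One Possible Process" diagram.

```python
from typing import Callable, Sequence, Tuple


def staged_rollout(stages: Sequence[int],
                   canary_is_healthy: Callable[[int], bool]) -> Tuple[str, int]:
    """Walk through rollout stages, checking the canary at each checkpoint.

    stages: number of servers running the new version at each step.
    canary_is_healthy: hypothetical check that examines the canary servers
        (optionally against the baseline) once `n` servers run the new version.
    Returns ("go", final_size) if every checkpoint passes, or
    ("no-go", size_at_failure) at the first failed checkpoint.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    for size in stages:
        # The actual deployment of `size` servers is out of scope here;
        # only the checkpoint decision is modelled.
        if not canary_is_healthy(size):
            return ("no-go", size)
    return ("go", stages[-1])


# Every checkpoint passes, so the rollout reaches the full fleet.
print(staged_rollout([1, 10, 1000], lambda n: True))
# The check fails once 10 servers run the new version, so the rollout stops there.
print(staged_rollout([1, 10, 1000], lambda n: n < 10))
```

The point of the structure is that a bad change is caught while it affects 1 or 10 servers rather than 1000.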
Canary Analysis Is Not 
• A replacement for any sort of software testing
• A/B Testing
• Releasing 100% to production and hoping for the best
16
One Possible Process
[Diagram: Version Control System -> (commit) -> Build & Deployment System -> (build, deploy) -> 1 server @ 1.0.2, 10 servers @ 1.0.2, 1000 servers @ 1.0.2; Automated Canary Analysis: go; Customers served by 1000 servers @ 1.0.1]
17
One Possible Process
[Diagram: Version Control System; Build & Deployment System; Automated Canary Analysis: go; Customers; 1000 servers @ 1.0.1 and 1000 servers @ 1.0.2]
18
One Possible Process
[Diagram: Version Control System; Build & Deployment System; Automated Canary Analysis: no go; Customers; 1000 servers @ 1.0.1 and 1000 servers @ 1.0.2]
19
Oh, the Places We’ll Go! 
• Introductions 
• Proposed Use Case and Definition 
• Continuous Improvement / MVP Model 
• Issues, Solutions 
• Cloud Considerations 
• The Road at Netflix 
20
Are We There Yet? 
• We’re not 
• You’re probably not either 
21
Minimally … 
• Observability 
• Partial traffic routing 
• Decision-making 
22
Better Yet … 
• Focus on the Goal 
• Current Baseline Matters 
• Observability segregation 
26% fewer errors in canary 
23
Hold On a Minute! 
26% fewer errors in canary 
Mission 
Accomplished 
24
Hold On a Minute! 
26% fewer errors in canary 
Mission 
Accomplished 
30% fewer requests handled in canary 
25
Hold On a Minute! 
26
Hold On a Minute! 
• Absolute numbers are relatively unimportant
• Relative numbers matter
  • Error rate
  • RPS per CPU cycle
27
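A small worked example shows why the relative number matters. The baseline counts below are hypothetical, chosen only to make the arithmetic easy; the canary counts apply the "26% fewer errors" and "30% fewer requests handled" figures from the preceding slides.

```python
def error_rate(errors: int, requests: int) -> float:
    """Errors per request: a relative number, comparable across groups of different size."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return errors / requests


# Hypothetical baseline counts (not from the talk).
baseline_errors, baseline_requests = 100, 1000

# Canary: 26% fewer errors (100 -> 74) but 30% fewer requests (1000 -> 700).
canary_errors, canary_requests = 74, 700

baseline_rate = error_rate(baseline_errors, baseline_requests)
canary_rate = error_rate(canary_errors, canary_requests)

print(f"baseline error rate: {baseline_rate:.4f}")  # 0.1000
print(f"canary error rate:   {canary_rate:.4f}")    # 0.1057
print("canary is worse" if canary_rate > baseline_rate else "canary is no worse")
```

With these numbers the canary produced fewer errors in absolute terms, yet a larger share of its requests failed.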
So You’ve Got Your Graphs
Requests Rate Comparison
          Type       RAM     Cores  Cost
Baseline  m3.medium  3.75GB  3      $.11/hr
Canary    m1.small   1.7GB   1      $.06/hr
28
So You’ve Got Your Graphs 
29
Automating … 
• Decision 
• Execution 
30
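One way to picture the two automation steps is to keep them separate: a pure function that turns a score into a decision, and a dispatcher that executes whatever action is registered for that decision. The score scale, the threshold of 95, and the action names are all assumptions made for illustration; the talk does not specify them.

```python
from typing import Callable, Dict


def decide(score: float, go_threshold: float = 95.0) -> str:
    """Turn a canary score (assumed 0..100) into a go/no-go decision."""
    return "go" if score >= go_threshold else "no-go"


def execute(decision: str, actions: Dict[str, Callable[[], str]]) -> str:
    """Run the action registered for the decision and return its result."""
    return actions[decision]()


# Hypothetical actions; a real deployment system would be called here instead.
actions = {
    "go": lambda: "continue rollout",
    "no-go": lambda: "roll back canary",
}

print(execute(decide(97.0), actions))  # continue rollout
print(execute(decide(60.0), actions))  # roll back canary
```

Keeping the decision separate from the execution lets the decision be automated first while a person still carries out the action.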
A Quick Recap 
• Observe 
• Segregate metrics 
• Partial deploy 
• Compare to Baseline 
• Absolutes are never right 
• Automate decision 
• Automate execution 
31
Oh, the Places We’ll Go! 
• Introductions 
• Proposed Use Case and Definition 
• Continuous Improvement / MVP Model 
• Issues, Solutions 
• Cloud Considerations 
• The Road at Netflix 
32
To Save You Some Time … 
Not all metrics are created equal
• Focus on System and Application Metrics
• Weight by category (system, latency, etc)
33
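Weighting by category might look like the weighted average below. The category names beyond "system" and "latency", the weights, and the idea of a 0..100 score per category are assumptions for the sake of the sketch, not the scheme described in the talk.

```python
from typing import Dict


def weighted_score(category_scores: Dict[str, float],
                   weights: Dict[str, float]) -> float:
    """Combine per-category scores (each assumed 0..100) into one weighted average."""
    total_weight = sum(weights[c] for c in category_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(category_scores[c] * weights[c] for c in category_scores)
    return weighted_sum / total_weight


# Hypothetical weights and scores: here errors count three times as much as
# system metrics, and latency twice as much.
weights = {"system": 1.0, "latency": 2.0, "errors": 3.0}
scores = {"system": 100.0, "latency": 50.0, "errors": 100.0}

# (100*1 + 50*2 + 100*3) / (1 + 2 + 3) = 500 / 6
print(round(weighted_score(scores, weights), 2))  # 83.33
```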
To Save You Some Time … 
Outliers are out, lying
• Use a group of servers
• Balance fidelity with customer impact
34
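To illustrate why a group of servers helps with outliers, the sketch below summarises a metric across each group with the median, which one misbehaving server cannot move far, and contrasts it with the mean. The latency values are made up, and the median is only one possible choice of robust summary.

```python
from statistics import mean, median

# Hypothetical per-server latencies in milliseconds; one canary server is an outlier.
canary_latency_ms = [101, 99, 103, 100, 950]
baseline_latency_ms = [100, 98, 102, 101, 99]

# The mean is dominated by the single outlier ...
print(mean(canary_latency_ms))      # 270.6
# ... while the median still reflects the typical canary server.
print(median(canary_latency_ms))    # 101
print(median(baseline_latency_ms))  # 100
```

With a single canary server there is no way to tell whether that server is the outlier; with a group, the comparison above becomes possible.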
To Save You Some Time … 
Exercise without warmup can result in injury
• Both traffic and startup time are factors
Repeat canary analysis frequently
35
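A simple way to respect warm-up is to discard samples taken before a server has been up for some minimum time. The sample layout and the 300-second window below are assumptions for illustration; a fuller version would also wait until the server has handled enough traffic.

```python
from typing import List, Tuple

# (seconds since the server started, metric value); a hypothetical sample layout.
Sample = Tuple[float, float]


def drop_warmup(samples: List[Sample], warmup_seconds: float) -> List[Sample]:
    """Keep only the samples taken once the warm-up window has elapsed."""
    return [(t, v) for (t, v) in samples if t >= warmup_seconds]


# Made-up latencies: high right after startup, settling down later.
latency = [(10, 900.0), (60, 400.0), (300, 110.0), (600, 105.0)]

print(drop_warmup(latency, warmup_seconds=300))  # [(300, 110.0), (600, 105.0)]
```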
To Save You Some Time … 
vive la différence!
• Hot-OK, Cold-OK
• Let Application Owners Choose
36
To Save You Some Time … 
Signal is better than no1$#[NO CARRIER]
• Ignore weak signals
37
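Ignoring weak signals could be as simple as skipping any metric that has too few data points to compare. The metric names and the minimum of 30 samples are hypothetical; the talk does not say how signal strength was judged.

```python
from typing import Dict, List


def strong_metrics(metrics: Dict[str, List[float]],
                   min_samples: int = 30) -> Dict[str, List[float]]:
    """Keep only the metrics with at least `min_samples` data points."""
    return {name: values for name, values in metrics.items()
            if len(values) >= min_samples}


# Made-up data: one metric with plenty of samples, one with almost none.
metrics = {
    "requests_per_second": [120.0] * 60,
    "rare_error_count": [0.0, 1.0, 0.0],
}

print(sorted(strong_metrics(metrics)))  # ['requests_per_second']
```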
Oh, the Places We’ll Go! 
• Introductions 
• Proposed Use Case and Definition 
• Continuous Improvement / MVP Model 
• Issues, Solutions 
• Cloud Considerations 
• The Road at Netflix 
38
Good News 
• Software-Defined Everything 
• Incremental Pricing 
39
Bad News 
• Capacity Management 
• Unpredictable Inconsistency 
40
Oh, the Places We’ll Go! 
• Introductions 
• Proposed Use Case and Definition 
• Continuous Improvement / MVP Model 
• Issues, Solutions 
• Cloud Considerations 
• The Road at Netflix 
41
Numbers 
• 752 services in production 
• In-house telemetry platform 
• A few metrics 
42
Been there. 
Done that. 
Manually. Artisanally 
• Started in the Data Center 
• Manual, dashboard-driven 
43
Been there. 
Done that. 
Manually. 
[Dashboard panels: Errors, Requests, CPU]
44
Been there. 
Done that. 
Manually. 
45
Been there. 
Done that. 
Manually. 
46
Been there. 
Done that. 
Manually. 
47
Been there. 
Done that. 
Manually. 
• Context vs Precision 
• No … 
• Repeatability 
• Trending 
• Manual effort is manual 
48
So Now What? 
• Automate Analysis 
• Took Some Effort 
• Approach and analytics 
• Presentation matters 
49
Automated Canary Analysis 
50
Automated Canary Analysis 
51
Automated Canary Analysis 
52
Automated Canary Analysis 
53
Automated Canary Analysis 
54
For Our Next Trick … 
• Configuration GUI 
• Deployment System Integration 
• ACA All The Things 
• OpenConnect firmware updates 
• Client software changes 
• Configuration changes in production 
55
Summary 
• Canary Analysis makes your changes 
• Safer 
• Faster 
• Easier 
• Most people can start doing it 
• Everyone can do it better 
56
http://bit.ly/qcon-netflix? 57 
Questions, Attributions, Feedback 
• https://www.flickr.com/photos/cseeman 
• https://www.flickr.com/photos/ransomtech 
• https://www.flickr.com/photos/dougbrown47 
• https://www.flickr.com/photos/andresthor/ 
• https://www.flickr.com/photos/dougbrown47 
• https://www.flickr.com/photos/pkdesigns 
@royrapoport 
rsr@netflix.com

Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 

Último

Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 

Último (20)

Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 

Canary Analyze All The Things: How We Learned to Keep Calm and Release Often

  • 1. Canary Analyze All the Things. Roy Rapoport, @royrapoport. June 12, 2014. Significant contributions by Chris Sanden, @chris_sanden.
  • 2. Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/canary-analysis-deployment-pattern InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month
  • 3. Presented at QCon New York www.qconnewyork.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  • 4. Oh, the Places We’ll Go! • Introductions • Proposed Use Case and Definition • Continuous Improvement / MVP Model • Issues, Solutions • Cloud Considerations • The Road at Netflix
  • 5. A Word About Me … • About 20 years in technology • Systems engineering, networking, software development, QA, release management • Time at Netflix: 1809 days (4y:11m:14d) • At Netflix: • Systems Engineering, Service Delivery in IT/Ops • Troubleshooter and Builder of Python Things[tm] in Product Engineering • Current role: Insight Engineering in Product Engineering • Real-Time Operational Insight
  • 6. A Word About Netflix… Just the Stats • 16 years • 2000+ employees • 48 million users • 5x10^9 hours/quarter
  • 7. A Word About Netflix… Freedom and Responsibility Culture • Optimize speed of innovation; constrain availability; cost will be what cost will be • Hire smart (experienced) people; get out of their way • Anti-process bias
  • 8. A Word About Netflix… Technology and Operations • Service Oriented Architecture • Decentralized Operations. You • Build • Test • Deploy • Set up alerting and monitoring • Wake up at 2AM
  • 11. So You’ve Just Done a Release > curl http://WhatDoesTheFooSay.prod.netflix.net/api/v1/cat {"response": "meow"}
  • 12. So You’ve Just Done a Release > curl http://WhatDoesTheFooSay.prod.netflix.net/api/v1/dog {"response": "woof"}
  • 13. So You’ve Just Done a Release > curl http://WhatDoesTheFooSay.prod.netflix.net/api/v1/fox {"response": "wa-pa-pa-pa-pa-pa-pow"} The correct answer to “what does the fox say?” is left as an exercise for the reader
  • 14. You Need Better Testing! Well, yeah
  • 15. You Need Better Testing! “I’m going to push to production, though I’m pretty sure it’s going to kill the system” - Said no one, ever* (* Hopefully)
  • 16. Detour: Rate of Change vs Availability [Chart: Availability (nines), 0 to 6, against Rate of Change, 1 to 1000; labels: Operations, Engineering]
  • 17. You Need Better Testing! Deployments! Canary Analysis • A deployment process where a new change (in behavior, code, or both) is rolled out into production gradually, with checkpoints along the way to examine the new (canary) systems (optionally versus the old (baseline) systems) and make go/no-go decisions.
  • 18. Canary Analysis Is Not • A replacement for any sort of software testing • A/B Testing • Releasing 100% to production and hoping for the best
  • 19. One Possible Process [Diagram labels: Version Control System; commit; Build & Deployment System; build; deploy; Automated Canary Analysis; go; Customers; 1000 servers @ 1.0.1; 1 server @ 1.0.2; 10 servers @ 1.0.2; 1000 servers @ 1.0.2]
  • 20. One Possible Process [Diagram labels: Version Control System; Build & Deployment System; Automated Canary Analysis; go; Customers; 1000 servers @ 1.0.1; 1000 servers @ 1.0.2]
  • 21. One Possible Process [Diagram labels: Version Control System; Build & Deployment System; Automated Canary Analysis; go; no; Customers; 1000 servers @ 1.0.1; 1000 servers @ 1.0.2]
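The gradual rollout with go/no-go checkpoints described on the slides above can be sketched roughly as follows. This is a minimal illustration, not the process Netflix runs: the function name `staged_rollout`, the stage sizes, and the `deploy`, `analyze`, and `rollback` callables are all hypothetical stand-ins for a real deployment system and canary analysis.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    stages: Sequence[int],
    deploy: Callable[[int], None],
    analyze: Callable[[int], bool],
    rollback: Callable[[], None],
) -> bool:
    """Roll a change out stage by stage, with a checkpoint after each stage.

    Returns True if every checkpoint says "go", False on the first "no-go".
    """
    for size in stages:
        deploy(size)            # grow the canary fleet to `size` servers
        if not analyze(size):   # checkpoint: examine the canary systems
            rollback()          # no-go: take the new version back out
            return False
    return True


# Stand-in callables that only record what happened (hypothetical).
log: List[str] = []
ok = staged_rollout(
    stages=[1, 10, 1000],
    deploy=lambda n: log.append(f"deploy {n}"),
    analyze=lambda n: n < 1000,  # pretend the final checkpoint fails
    rollback=lambda: log.append("rollback"),
)
```

Passing the decision in as a callable keeps the rollout loop independent of how the canary is judged, so a manual dashboard check and an automated analysis can share the same loop.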
  • 23. Are We There Yet? • We’re not • You’re probably not either
  • 24. Minimally … • Observability • Partial traffic routing • Decision-making
  • 25. Better Yet … • Focus on the Goal • Current Baseline Matters • Observability segregation. 26% fewer errors in canary
  • 26. Hold On a Minute! 26% fewer errors in canary. Mission Accomplished
  • 27. Hold On a Minute! 26% fewer errors in canary. Mission Accomplished. 30% fewer requests handled in canary
  • 28. Hold On a Minute!
  • 29. Hold On a Minute! • Absolute numbers are relatively unimportant • Relative numbers matter • Error rate • RPS per CPU cycle
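To see why the relative numbers matter, the two figures from the slides (26% fewer errors, 30% fewer requests handled) can be combined into an error rate. The helper name `relative_error_rate` is made up for this sketch; the arithmetic is just errors divided by requests, expressed relative to the baseline.

```python
def relative_error_rate(error_change: float, request_change: float) -> float:
    """Return the canary's error rate as a multiple of the baseline's.

    Both arguments are fractional changes in absolute counts relative to
    the baseline, for example -0.26 for "26% fewer".
    """
    return (1.0 + error_change) / (1.0 + request_change)


# 26% fewer errors, but 30% fewer requests handled.
ratio = relative_error_rate(error_change=-0.26, request_change=-0.30)
print(f"canary error rate is {ratio:.3f}x baseline")
# prints "canary error rate is 1.057x baseline"
```

So despite the lower absolute error count, the canary's errors per request are about 5.7% higher than the baseline's.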
  • 30. So You’ve Got Your Graphs [Graph: Requests Rate Comparison] Baseline: type m3.medium, 3.75GB RAM, 3 cores, $.11/hr. Canary: type m1.small, 1.7GB RAM, 1 core, $.06/hr.
  • 31. So You’ve Got Your Graphs
  • 32. Automating … • Decision • Execution
  • 33. A Quick Recap • Observe • Segregate metrics • Partial deploy • Compare to Baseline • Absolutes are never right • Automate decision • Automate execution
  • 35. To Save You Some Time … Not all metrics are created equal. Focus on System and Application Metrics. Weight by category (system, latency, etc.)
  • 36. To Save You Some Time … Outliers are out, lying. Use a group of servers. Balance fidelity with customer impact
  • 37. To Save You Some Time … Exercise without warmup can result in injury. Repeat canary analysis frequently. Both traffic and startup time are factors
  • 38. To Save You Some Time … Vive la différence! Hot-OK, Cold-OK. Let Application Owners Choose
  • 39. To Save You Some Time … Signal is better than no1$#[NO CARRIER] Ignore weak signals
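Several of the tips above (weight metrics by category, use a group of servers so outliers matter less, ignore weak signals) can be combined into one score. The sketch below is an assumption-laden illustration, not the scoring Netflix uses: the weights, the thresholds, the metric names, and the convention that lower values are better are all invented for the example.

```python
from statistics import median

# Hypothetical category weights and thresholds; tune these per application.
WEIGHTS = {"system": 1.0, "latency": 2.0, "errors": 3.0}
MIN_SIGNAL = 1.0   # skip metrics whose baseline median is below this
TOLERANCE = 0.10   # canary may be up to 10% worse than baseline


def score(baseline, canary, categories):
    """Return the weighted share (0..1) of metrics where the canary is OK.

    baseline and canary map metric name -> per-server values, where lower
    is better. categories maps metric name -> a key of WEIGHTS.
    """
    passed = total = 0.0
    for name, base_values in baseline.items():
        base = median(base_values)    # median damps outlier servers
        can = median(canary[name])
        if base < MIN_SIGNAL:         # weak signal: ignore this metric
            continue
        weight = WEIGHTS[categories[name]]
        total += weight
        if can <= base * (1.0 + TOLERANCE):
            passed += weight
    if total == 0:
        raise ValueError("no metric had enough signal to compare")
    return passed / total


baseline = {"cpu_pct": [40, 42, 95], "p99_ms": [120, 118, 125], "rare_errors": [0, 0, 1]}
canary = {"cpu_pct": [41, 43, 44], "p99_ms": [150, 155, 149], "rare_errors": [0, 2, 0]}
categories = {"cpu_pct": "system", "p99_ms": "latency", "rare_errors": "errors"}
result = score(baseline, canary, categories)
```

Here `cpu_pct` passes (weight 1), `p99_ms` fails (weight 2), and `rare_errors` is skipped as a weak signal, so `result` is 1/3. What score counts as a "go" is a separate choice that the slides leave to application owners.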
  • 41. Good News • Software-Defined Everything • Incremental Pricing
  • 42. Bad News • Capacity Management • Unpredictable Inconsistency
  • 44. Numbers • 752 services in production • In-house telemetry platform • A few metrics
  • 45. Been there. Done that. Manually. Artisanally • Started in the Data Center • Manual, dashboard-driven
  • 46. Been there. Done that. Manually. [Dashboard graphs: Errors, Requests, CPU]
  • 47. Been there. Done that. Manually.
  • 48. Been there. Done that. Manually.
  • 49. Been there. Done that. Manually.
  • 50. Been there. Done that. Manually. • Context vs Precision • No … • Repeatability • Trending • Manual effort is manual
  • 51. So Now What? • Automate Analysis • Took Some Effort • Approach and analytics • Presentation matters
  • 57. For Our Next Trick … • Configuration GUI • Deployment System Integration • ACA All The Things • OpenConnect firmware updates • Client software changes • Configuration changes in production
  • 58. Summary • Canary Analysis makes your changes • Safer • Faster • Easier • Most people can start doing it • Everyone can do it better
  • 59. Questions, Attributions, Feedback. http://bit.ly/qcon-netflix? • https://www.flickr.com/photos/cseeman • https://www.flickr.com/photos/ransomtech • https://www.flickr.com/photos/dougbrown47 • https://www.flickr.com/photos/andresthor/ • https://www.flickr.com/photos/pkdesigns @royrapoport rsr@netflix.com