WAYS TO MINIMISE PERFORMANCE RISKS
IN CONTINUOUS DELIVERY
Adriaan Thomas
4 June 2013
INTRODUCTION
OBJECTIVE
Put working software into production as quickly as possible, whilst minimising risk of
load-related problems:
• Bad response times
• Lack of capacity
• Availability too low
• Excessive system resource use
Within the context of websites.
TRADITIONAL APPROACH
Load testing through simulation
http://www.flickr.com/photos/danramarch/4423023837
DECIDE WHAT TO TEST
• Focus on busiest instant
• Model most-hit functionality
• Extrapolate to expected load
• Look at production traffic
• Or attempt educated guess
DECIDE ON SCOPE
Component test
Chain test
Full environment test
• Test coverage
• Level of certainty
• Number of systems
• Amount of work
SET UP TEST DATA
• Usually starts as a copy from production
• Or an educated guess at what people will enter
• Render anonymous
• Make tests deterministic
• Synchronise between all systems
http://www.flickr.com/photos/22168167@N00/3889737939/
DECIDE ON STRATEGY
One or more of:
• Scalability test
• Stress test
• Endurance test
• Regression test
• Resilience test
http://www.flickr.com/photos/timjoyfamily/5935279962/
DECIDE ON TEST DURATION
(which is tricky)
http://www.flickr.com/photos/wwarby/3297205226
PROVIDE HARDWARE
http://www.flickr.com/photos/s_w_ellis/2681151694/
Copy of production?
Only one copy?
Virtualisation?
Sharing between teams?
INTEGRATE INTO PIPELINE
Unit test: very fast
Functional integration test: fast
Load test: takes longer
INTEGRATE INTO PIPELINE
Unit test
Functional integration test
Load test
Very fast Takes longer
PERMANENT LOAD TESTING
Daytime: constant load, teams inspect impact of changes
Nighttime: endurance test
Weekends: refresh test data
http://www.flickr.com/photos/renaissancechambara/5106171956/
RESPONSE TIME
DNS lookup (www.xebia.com)
Time to first byte + loading HTML
Time to render
Time to document complete
Browser CPU use
Bandwidth
# connections to a single host
http://www.webpagetest.org/result/130522_FG_10SC/1/details/
SSL handshake
Parse times
Blocking client code
IMPACT OF THE BROWSER
www.browserscope.org
CLEAR REQUIREMENTS
Response time
Fail: 10 Now: 3.5 Goal: 1
Intention: Users get a response quickly so that
they are happy and spend more money.
Stakeholder: Marketing dept.
Scale: 95th percentile of “document complete”
response times, in seconds, measured over one
minute.
Metric: Page load times as reported by our
RUM tool.
Inspired by Tom Gilb, Competitive Engineering
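The scale on this slide can be checked mechanically. A minimal sketch, assuming the "document complete" times have already been grouped into one-minute windows and using the nearest-rank percentile definition (one of several common definitions). The function names and the sample data are made up for illustration; the thresholds are the Fail and Goal values from the slide.

```python
import math

FAIL_SECONDS = 10.0  # "Fail" level from the slide
GOAL_SECONDS = 1.0   # "Goal" level from the slide


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def judge_minute(response_times):
    """Classify one minute of 'document complete' times (in seconds)."""
    p95 = percentile(response_times, 95)
    if p95 >= FAIL_SECONDS:
        return p95, "fail"
    if p95 <= GOAL_SECONDS:
        return p95, "goal met"
    return p95, "acceptable, goal not met"


# Hypothetical sample: 20 page loads recorded in one minute.
sample = [0.8] * 18 + [3.5, 12.0]
print(judge_minute(sample))
```

Note how the single 12-second outlier does not trip the Fail level: the 95th percentile ignores the slowest 5% of requests by design.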
WebPageTest: first view + repeat view (median of 3)
95th percentile response times from access logs
ADJUST REQUIREMENTS DUE TO LACK OF
REAL BROWSERS
Playground to test changes
No impact on real users
Less pressure
More work
Guesswork and extrapolation
Can take a significant amount of time
More hardware
THINGS WILL BREAK...
... in spite of your best efforts
http://www.flickr.com/photos/jmarty/1239950166/
SO INSTEAD WE SHOULD FOCUS ON
FAST RECOVERY
http://www.flickr.com/photos/19107136@N02/8386567228/
“MTTR is more important than
MTBF*”
John Allspaw
* for most types of F
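One way to see why recovery time matters as much as failure frequency is the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The sketch below uses made-up figures (they are not from the talk) to show that halving MTTR buys the same availability as doubling MTBF.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: the fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical baseline: one failure every 100 hours, 1 hour to recover.
baseline = availability(100, 1)
doubled_mtbf = availability(200, 1)   # fail half as often
halved_mttr = availability(100, 0.5)  # recover twice as fast

print(f"baseline     {baseline:.4%}")
print(f"double MTBF  {doubled_mtbf:.4%}")
print(f"halve MTTR   {halved_mttr:.4%}")
```

Only the ratio of MTTR to MTBF enters the formula, so the two improvements are equivalent on paper; which one is cheaper to achieve is a separate question.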
[Chart: 99th percentile response time (s) against test duration]
MTBF LEADS TO FUD
Time →
TTD → find cause (RCA) → write & test fix → build → deploy → validate
compile
deploy & test
Monitoring
Alerts
• Skills
• Organisation
• Culture
• Maintainability
• Simple architecture
• Fast workstations
• Good tooling
• Able to quickly test locally
• Automation
• Fast build server
• Efficient tests
Monitoring
• Automation
• Flexible architecture
TTR
DEMING FEEDBACK LOOPS
Plan
Do
Study
Act
OODA LOOPS
Observe
Orient
Decide
Act
AVOID TEST-ONLY MEASUREMENTS
SIMPLE ARCHITECTURE
THE ONLY THING THAT MATTERS IS
WHAT HAPPENS IN PRODUCTION
Everything else is an assumption.
DEPLOYING CHANGES
http://www.flickr.com/photos/39463459@N08/5083733600
BLUE-GREEN DEPLOYMENTS
[Diagram: Amazon Route 53 in front of two Elastic Load Balancers, each with its own group of instances; one side runs version n, the other version n+1]
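The switch itself can be reduced to a pointer flip guarded by a health check. A minimal in-memory sketch, not tied to Route 53 or any AWS API; the `BlueGreenRouter` class, the environment names and the `healthy` callback are all hypothetical.

```python
class BlueGreenRouter:
    """Keeps two environments and sends all traffic to exactly one of them."""

    def __init__(self, blue_version, green_version, live="blue"):
        self.versions = {"blue": blue_version, "green": green_version}
        self.live = live

    @property
    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def deploy_to_idle(self, version):
        """Install the new version on the environment that receives no traffic."""
        self.versions[self.idle] = version

    def switch(self, healthy):
        """Flip traffic to the idle environment only if it passes the health check."""
        candidate = self.idle
        if not healthy(candidate, self.versions[candidate]):
            return False
        self.live = candidate
        return True

    def rollback(self):
        """Rolling back is just flipping back: the old version is still running."""
        self.live = self.idle


router = BlueGreenRouter(blue_version="n", green_version="n-1")
router.deploy_to_idle("n+1")
switched = router.switch(healthy=lambda env, version: True)  # stand-in health check
print(router.live, router.versions[router.live], switched)
```

Because the previous version keeps running on the other side, recovery from a bad release is a second flip rather than a redeploy.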
DARK LAUNCHING
Web page DB
DARK LAUNCHING
Web page DB Weather SP
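Dark launching the weather service could look roughly like this: the new call runs on real traffic and its timing is recorded, but its result never reaches the page and its failures never reach the user. A sketch under assumptions: `fetch_from_db`, `fetch_weather` and the `timings` list are hypothetical stand-ins.

```python
import time


def render_page(fetch_from_db, fetch_weather, timings, dark_launch=True):
    """Render the page from the DB; exercise the weather service without showing it."""
    page = f"<h1>{fetch_from_db()}</h1>"
    if dark_launch:
        started = time.perf_counter()
        try:
            fetch_weather()          # result deliberately discarded
            outcome = "ok"
        except Exception:            # a dark feature must never break the page
            outcome = "error"
        timings.append((outcome, time.perf_counter() - started))
    return page


def broken_weather():
    raise TimeoutError("weather service did not answer")


timings = []
print(render_page(lambda: "Welcome", lambda: "sunny", timings))
print(render_page(lambda: "Welcome", broken_weather, timings))
print([outcome for outcome, _ in timings])
```

This synchronous version still adds the weather call's latency to each request; a real setup would more likely fire the call asynchronously or from the browser.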
FEATURE TOGGLES
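A toggle separates deploying code from releasing it, so a feature that causes load problems can be switched off without a new deployment. A minimal sketch; the `FeatureToggles` class and the `new_search` toggle are invented for illustration, and real systems usually keep the flags in configuration or a database rather than in memory.

```python
class FeatureToggles:
    """Runtime switches that decouple deploying code from releasing a feature."""

    def __init__(self, initial=None):
        self._flags = dict(initial or {})

    def is_enabled(self, name):
        # Unknown toggles default to off, so new code paths stay dark until enabled.
        return self._flags.get(name, False)

    def set(self, name, enabled):
        self._flags[name] = bool(enabled)


toggles = FeatureToggles({"new_search": False})


def search(query):
    if toggles.is_enabled("new_search"):
        return f"new engine results for {query!r}"
    return f"old engine results for {query!r}"


print(search("shoes"))
toggles.set("new_search", True)   # release without a deployment
print(search("shoes"))
toggles.set("new_search", False)  # instant rollback if load becomes a problem
```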
CANARY RELEASING
0% 100%
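Moving from 0% to 100% needs a stable way to decide which users see the new version. One common approach, sketched here as an assumption rather than as the method used in the talk, is to hash a user ID into a bucket from 0 to 99 and compare it with the current rollout percentage. The `in_canary` function and the user IDs are hypothetical.

```python
import hashlib


def in_canary(user_id, percentage):
    """Deterministically place a user in the canary group for a given rollout percentage."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket 0..99
    return bucket < percentage


users = [f"user-{i}" for i in range(10_000)]
for pct in (0, 5, 50, 100):
    share = sum(in_canary(u, pct) for u in users) / len(users)
    print(f"rollout {pct:3d}% -> {share:.1%} of users on the new version")
```

Because the bucket depends only on the user ID, a user who is in the canary group at 5% stays in it as the percentage grows.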
PRODUCTION-IMMUNE SYSTEMS
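An immune system in this sense watches production metrics during a rollout and reverts automatically when they degrade. A rough sketch of the decision step only; the metric names, the thresholds and the `decide` function are illustrative assumptions, not taken from the talk.

```python
def decide(baseline, canary, max_error_increase=0.01, max_latency_ratio=1.2):
    """Compare the new version's metrics with the current version and pick the next step."""
    if canary["error_rate"] > baseline["error_rate"] + max_error_increase:
        return "rollback"
    if canary["p95_seconds"] > baseline["p95_seconds"] * max_latency_ratio:
        return "rollback"
    return "continue"


baseline = {"error_rate": 0.002, "p95_seconds": 1.0}
print(decide(baseline, {"error_rate": 0.003, "p95_seconds": 1.1}))
print(decide(baseline, {"error_rate": 0.050, "p95_seconds": 1.0}))
print(decide(baseline, {"error_rate": 0.002, "p95_seconds": 2.5}))
```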
CONTROLLED LOAD TESTING
[Diagram: Amazon Route 53 and an Elastic Load Balancer in front of several instances, backed by an RDS DB instance and an RDS DB instance read replica]
MONITORING
http://www.flickr.com/photos/smieyetracking/5609671098/
MONITORING
Technical metrics
• CPU use
• Memory use
• TPS
• Response times
• etc
Process metrics
• # bugs
• MTTR, MTTD
• Time from idea to live on site
• etc
Business metrics
• Revenue
• # unique visitors
• etc
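Process metrics such as MTTD and MTTR fall out of a simple incident log. A minimal sketch, assuming each incident records when the problem started, when it was detected and when it was resolved, and measuring MTTR from the start of the problem (some teams measure from detection instead). The incident data is invented.

```python
from datetime import datetime, timedelta


def mean_delta(pairs):
    """Average of (end - start) over a list of (start, end) datetime pairs."""
    total = sum((end - start for start, end in pairs), timedelta())
    return total / len(pairs)


# Hypothetical incident log: (started, detected, resolved)
incidents = [
    (datetime(2013, 5, 1, 10, 0), datetime(2013, 5, 1, 10, 5), datetime(2013, 5, 1, 10, 45)),
    (datetime(2013, 5, 9, 14, 0), datetime(2013, 5, 9, 14, 15), datetime(2013, 5, 9, 15, 15)),
]

mttd = mean_delta([(started, detected) for started, detected, _ in incidents])
mttr = mean_delta([(started, resolved) for started, _, resolved in incidents])
print("MTTD:", mttd)
print("MTTR:", mttr)
```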
http://www.flickr.com/photos/smieyetracking/5609671098/
MEASURE IMPACT OF CHANGES
tail -f access_log | alstat.pl -i10 -n10 -stt
 Hits  Hits%   TPS  AvgTmTk  TTmTk%  AvgRSize  RSize%
2013-06-04 19:37:40 (08)
   14   0.1%   1.4    1.652    5.7%      2691    0.2%  POST 200 /login.do
   14   0.1%   1.4    0.918    3.2%      3739    0.3%  GET 200 /home.do
   14   0.1%   1.4    0.879    3.1%      3185    0.2%  POST 200 /order.do
    7   0.1%   0.7    0.807    1.4%      1974    0.1%  POST 200 /account.do
    4   0.0%   0.4    0.735    0.7%      3228    0.1%  GET 200 /products.do
    5   0.0%   0.5    0.697    0.9%       969    0.0%  POST 200 /settings.do
    9   0.1%   0.9    0.687    1.5%      1827    0.1%  POST 200 /changeorder.do
   27   0.2%   2.7    0.649    4.3%      2997    0.4%  POST 200 /newpasswd.do
   15   0.1%   1.5    0.580    2.2%      2488    0.2%  GET 200 /offer.do
   95   0.9%   9.5    0.520   12.2%      4801    2.3%  GET 200 /search.do
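The same kind of per-URL summary can be sketched in a few lines. This is not how alstat.pl works internally; it is an independent illustration that assumes each log record has already been parsed into (method, status, path, seconds taken, response bytes). The sample records are invented, and the rows are sorted by average time taken, as the rows in the output above appear to be.

```python
from collections import defaultdict


def summarise(records, interval_seconds):
    """Group parsed access-log records by request and compute hits, TPS and averages."""
    groups = defaultdict(list)
    for method, status, path, seconds, size in records:
        groups[(method, status, path)].append((seconds, size))

    rows = []
    for key, samples in groups.items():
        hits = len(samples)
        rows.append({
            "request": key,
            "hits": hits,
            "tps": hits / interval_seconds,
            "avg_time": sum(s for s, _ in samples) / hits,
            "avg_size": sum(b for _, b in samples) / hits,
        })
    # Slowest requests first
    rows.sort(key=lambda row: row["avg_time"], reverse=True)
    return rows


# Invented records for a 10-second interval.
records = [
    ("POST", 200, "/login.do", 1.5, 2600),
    ("POST", 200, "/login.do", 1.7, 2800),
    ("GET", 200, "/search.do", 0.5, 4800),
    ("GET", 200, "/search.do", 0.6, 4700),
    ("GET", 200, "/search.do", 0.4, 4900),
]
for row in summarise(records, interval_seconds=10):
    method, status, path = row["request"]
    print(f'{row["hits"]:5d} {row["tps"]:5.1f} {row["avg_time"]:7.3f} '
          f'{row["avg_size"]:8.0f} {method} {status} {path}')
```

Running such a summary before and after a deployment gives a quick view of which requests a change made slower.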
MEASURE LATENCY
Avg. response times front end vs backend
Number of calls
SMALL DEPLOYMENTS
http://www.flickr.com/photos/rbulmahn/4925464931/
GO/NO-GO MEETINGS
• What are the biggest fears?
• How can we measure this?
• What can be done if it does happen?
RETROSPECTIVES
How can we prevent a failure from
happening again?
How can we detect it earlier?
Was there only one root cause?
http://www.flickr.com/photos/katerha/8380451137
INTRODUCE OUTAGES
Chaos monkey
Game day exercises
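The core of a chaos-monkey style exercise is small: pick a running instance at random and take it away. The sketch below only selects a victim and terminates nothing. It is not Netflix's Chaos Monkey; the `pick_victim` function, the `protected` set and the instance names are hypothetical.

```python
import random


def pick_victim(instances, protected=(), rng=random):
    """Choose one instance to terminate, skipping any that are explicitly protected."""
    candidates = [i for i in instances if i not in protected]
    if not candidates:
        return None
    return rng.choice(candidates)


fleet = ["web-1", "web-2", "web-3", "db-1"]
victim = pick_victim(fleet, protected={"db-1"}, rng=random.Random(42))
print("would terminate:", victim)
```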
http://www.flickr.com/photos/frostnova/440551442/
CULTURE
• Dev and Ops work together on providing information.
• Assumptions are dangerous; try to eliminate as many as possible.
• Small changes are easier to fix than large ones.
• Deploy during office hours so everyone is available in case problems happen.
• All information, including business metrics, should be accessible to everyone.
CLAMS
Culture
Lean
Automation
Measurement
Sharing
SIMPLE, FLEXIBLE ARCHITECTURE
• If the site goes down often, its architecture is probably at fault
• Avoid fragile systems
• Resilience is key
• Scalable (redundancy is not waste)
• Prefer many small systems to a few large ones
• State is a “hot brick”
CHANGES FOR THE BUSINESS
• Accept pushing smaller changes.
• Continuous delivery vs continuous deployment.
• Share data.
CONCLUSION
Work on your ability to respond to failure. Trying to prevent failure can slow you down
and make you focus on the wrong things.
Keep assumptions clearly separated from facts. Make your decisions based on evidence.
Measure everything, including the impact of changes to the business.
Look for your own compromise: try permanent load testing first and learn from that.
QUESTIONS?
athomas@xebia.com
@a32an
www.xebia.com
blog.xebia.com
(we’re hiring)
