SlideShare una empresa de Scribd logo
1 de 37
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
http://www.infoq.com/presentations
/tradeshift-automatic-deployment
Presented at QCon London
www.qconlondon.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
The mean and lean pipeline
From code to production as fast and safe as
possible
QCon London 2014
Who, what, and why?
Joakim Recht
Senior Code Monkey at Tradeshift – the
platform for all your business interactions
“- Please don't do that”
The perfect build pipeline
Prevents bad code from entering
production systems
Gives fast feedback
Ensures consistency
Quality becomes absolute
Is fully automated
Creates nice screens and buttons to
click
You get to talk about it on conferences
Our build pipeline
Gives feedback
Prevents most bad code from entering
production systems
Ensures some consistency
Increases quality
Is pretty automated
There are screens and buttons to click
You get to talk about it on conferences
Build pipeline stats (mid 2012+)
● Servers in Jenkins CI env: 28-52
● Backend builds: 18649
● Frontend builds: 10822
● PRs on Bob: 652
● Integration test subset runs: 61586
● Github PRs: 4781
● Number of integration test specs: 248
● Number of war files deployed during prod
deploy: 20
The realities of a startup
● Not enough time
● Not enough people
● Not enough money
● Too many (crazy) ideas
● Too many requirements
2010: The beginning
The simple life
Java backend, Drupal frontend, REST API.
All code in Subversion. No branches.
Everything hosted on Amazon EC2.
Hudson CI to run tests. Green build policy.
Tests written in jUnit, BDD style.
Deployment to test server: manual ssh,
download war, restart Jetty.
Deployment to prod: semi-automated using
custom scripts and AMI building.
Getting a feature into prod
Switching from SVN to Git
● Just working off trunk is easy, but
release management is hard
● SVN branches (and merging) sucks
● Switched to Git in mid 2010
– Offline support, team/personal branches,
cherry-picking, history rewriting
– No formal process yet, other than the
introduction of a production branch. Manual
merging all around. Based on social contract.
Selenium FTW. Or not.
● Manual testing is a pain
● There were no testers employed
● Regressions, esp. in UI, happened all the
time
● Tried out Selenium UI tests
– Manual maintenance
– Not part of normal development process
– Tests too fragile
● Selenium did not help.
UI tests, take 2
Geb and Spock for UI tests
● Geb: Groovy framework on top of
Webdriver (Selenium 2)
● Spock: Groovy-based BDD framework
● Writing tests become part of the
development process
● Tests executed on the only physical server
in order to show the runs on a big screen
● UI tests must be green
Introducing pipeline
visualization
● By mid 2011 the number of teams
had grown
● As had the number of components
● Keeping track of build status on all
the branches was getting
increasingly hard
Starting a new office in SF
“For the n'th time since their arrival
they are fixing broken stuff in master
that was not committed by
themselves. For the team that has no
contact with the owner of the merged
code for 7 hours or more, that could
mean half a day of troubleshooting,
missed sprint targets, convulsions and
so on.” - Gert Sylvest, CTO
The Pull Request
● No manual pushes to master
● Anything going into master must be tested as a
complete configuration
● All components must be green, all integration
tests must run
● Ensures proper consistency
● Implemented by Jenkins jobs
– Not Gerrit. Seemed too complicated.
● Physical build machine for Geb abandoned and
replaced with Jenkins swarm on EC2
Developers, developers,
developers
● Approaching 40 developers by 2013
● Keeping consistency and quality getting hard
● Knowledge sharing equally hard
● No written procedures or guides
– Also, nobody wanted to write any or maintain them
Solution: Code reviews using Github PRs
– All code must be reviewed by at least one other dev
– DB migrations to be reviews by select group
More automation
Removing ops as bottleneck
● In 2011 all environments (except local dev)
were changed to use Puppet
● Release procedure a matter of starting a job
on Jenkins (only available to Ops)
● 2013: Any team can deploy any configuration
to any sandbox env by the click of a button
● Thread dumps for running env can be
generated from Jenkins
● Sandbox availability times can be controlled
Recent issues
Regular annoyances
● Anybody can still push to master – Github cannot
prevent this
– Add comment to PR automatically if opened against master
● Build times
– At the extremes: 40 minutes for backend, 45 minutes for integration
tests
– Cut down to 15 minutes each by optimizing tests and scaling out –
see http://blog.tradeshift.com/just-add-servers/
– Backend tests to be shortened more by splitting into other
components
● Randomly failing tests
– Esp. integration tests
– Zero tolerance initiated
Key learnings
● Too many teams working concurrently
on new features
● Too many regressions and bugs
introduced into production
● Too many components to deploy
● Too many offices and timezones
● Pipeline throughput not sufficient
● SLAs being violated due to downtime
Pivotal events
Automation is king
But also quite expensive
Don't be religious
But make sure to have an
extensive test suite
That cannot be too slow
And with understandable tests
What we probably should have
done earlier
● Use Git instead of Subversion
● Perform code reviews for all changes
● Using Geb/Spock would have been nice,
but not stable enough at the time
● All of the above have had a significantly
higher impact than anticipated
● Full automation – also for dev envs
● Explicit code styles – just not painful enough
without
● Naming conventions
● Framework versioning or policies
● Agreement on unit vs system vs integration
vs mock vs UI tests
● Consistent use of Findbugs/Checkstyle/similar
● Testers
What we still don't have
Thank you
Questions?
jre@tradeshift.com
@joakimrecht
https://plus.google.com/+JoakimRecht
http://tradeshift.com/blog/
Watch the video with slide synchronization on
InfoQ.com!
http://www.infoq.com/presentations/tradeshift-
automatic-deployment

Más contenido relacionado

Más de C4Media

Más de C4Media (20)

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live Video
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy Mobile
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java Applications
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 

The Lean Pipeline

  • 1.
  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations /tradeshift-automatic-deployment
  • 3. Presented at QCon London www.qconlondon.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  • 4. The mean and lean pipeline From code to production as fast and safe as possible QCon London 2014
  • 5. Who, what, and why? Joakim Recht Senior Code Monkey at Tradeshift – the platform for all your business interactions “- Please don't do that”
  • 6. The perfect build pipeline Prevents bad code from entering production systems Gives fast feedback Ensures consistency Quality becomes absolute Is fully automated Creates nice screens and buttons to click You get to talk about it on conferences
  • 7. Our build pipeline Gives feedback Prevents most bad code from entering production systems Ensures some consistency Increases quality Is pretty automated There are screens and buttons to click You get to talk about it on conferences
  • 8. Build pipeline stats (mid 2012+) ● Servers in Jenkins CI env: 28-52 ● Backend builds: 18649 ● Frontend builds: 10822 ● PRs on Bob: 652 ● Integration test subset runs: 61586 ● Github PRs: 4781 ● Number of integration test specs: 248 ● Number of war files deployed during prod deploy: 20
  • 9. The realities of a startup ● Not enough time ● Not enough people ● Not enough money ● Too many (crazy) ideas ● Too many requirements
  • 11. The simple life Java backend, Drupal frontend, REST API. All code in Subversion. No branches. Everything hosted on Amazon EC2. Hudson CI to run tests. Green build policy. Tests written in jUnit, BDD style. Deployment to test server: manual ssh, download war, restart Jetty. Deployment to prod: semi-automated using custom scripts and AMI building.
  • 12. Getting a feature into prod
  • 13. Switching from SVN to Git ● Just working off trunk is easy, but release management is hard ● SVN branches (and merging) sucks ● Switched to Git in mid 2010 – Offline support, team/personal branches, cherry-picking, history rewriting – No formal process yet, other than the introduction of a production branch. Manual merging all around. Based on social contract.
  • 14.
  • 15. Selenium FTW. Or not. ● Manual testing is a pain ● There were no testers employed ● Regressions, esp. in UI, happened all the time ● Tried out Selenium UI tests – Manual maintenance – Not part of normal development process – Tests too fragile ● Selenium did not help.
  • 17. Geb and Spock for UI tests ● Geb: Groovy framework on top of Webdriver (Selenium 2) ● Spock: Groovy-based BDD framework ● Writing tests become part of the development process ● Tests executed on the only physical server in order to show the runs on a big screen ● UI tests must be green
  • 18. Introducing pipeline visualization ● By mid 2011 the number of teams had grown ● As had the number of components ● Keeping track of build status on all the branches was getting increasingly hard
  • 19.
  • 20. Starting a new office in SF
  • 21. “For the n'th time since their arrival they are fixing broken stuff in master that was not committed by themselves. For the team that has no contact with the owner of the merged code for 7 hours or more, that could mean half a day of troubleshooting, missed sprint targets, convulsions and so on.” - Gert Sylvest, CTO
  • 22. The Pull Request ● No manual pushes to master ● Anything going into master must be tested as a complete configuration ● All components must be green, all integration tests must run ● Ensures proper consistency ● Implemented by Jenkins jobs – Not Gerrit. Seemed too complicated. ● Physical build machine for Geb abandoned and replaced with Jenkins swarm on EC2
  • 23.
  • 24.
  • 25. Developers, developers, developers ● Approaching 40 developers by 2013 ● Keeping consistency and quality getting hard ● Knowledge sharing equally hard ● No written procedures or guides – Also, nobody wanted to write any or maintain them Solution: Code reviews using Github PRs – All code must be reviewed by at least one other dev – DB migrations to be reviews by select group
  • 27. Removing ops as bottleneck ● In 2011 all environments (except local dev) were changed to use Puppet ● Release procedure a matter of starting a job on Jenkins (only available to Ops) ● 2013: Any team can deploy any configuration to any sandbox env by the click of a button ● Thread dumps for running env can be generated from Jenkins ● Sandbox availability times can be controlled
  • 29. Regular annoyances ● Anybody can still push to master – Github cannot prevent this – Add comment to PR automatically if opened against master ● Build times – At the extremes: 40 minutes for backend, 45 minutes for integration tests – Cut down to 15 minutes each by optimizing tests and scaling out – see http://blog.tradeshift.com/just-add-servers/ – Backend tests to be shortened more by splitting into other components ● Randomly failing tests – Esp. integration tests – Zero tolerance initiated
  • 31. ● Too many teams working concurrently on new features ● Too many regressions and bugs introduced into production ● Too many components to deploy ● Too many offices and timezones ● Pipeline throughput not sufficient ● SLAs being violated due to downtime Pivotal events
  • 32. Automation is king But also quite expensive Don't be religious But make sure to have an extensive test suite That cannot be too slow And with understandable tests
  • 33. What we probably should have done earlier ● Use Git instead of Subversion ● Perform code reviews for all changes ● Using Geb/Spock would have been nice, but not stable enough at the time ● All of the above have had a significantly higher impact than anticipated
  • 34. ● Full automation – also for dev envs ● Explicit code styles – just not painful enough without ● Naming conventions ● Framework versioning or policies ● Agreement on unit vs system vs integration vs mock vs UI tests ● Consistent use of Findbugs/Checkstyle/similar ● Testers What we still don't have
  • 36.
  • 37. Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/tradeshift- automatic-deployment