Scaling with Continuous Deployment

               Web 2.0 Expo
     New York, NY, September 29, 2010

             Brett G. Durrett (@bdurrett)
 Vice President Engineering & Operations, IMVU, Inc.




                                                       0
An online community where members use 3D avatars
           to meet new people, chat, create
            and have fun with their friends
Who is my audience?



       Mix of engineering / product?

        How many from a startup?

How many believe iterating on your product
 is critical to the success of your business?



                                                4
How quickly can your business iterate?




                                         5
Can I interest you in some
Continuous Deployment?




                             6
In a Nutshell




What is Continuous Deployment?

• Engineer commits code
• 20 minutes later it is live in production
• Repeat for about 50 commits per day


                                                 7
Does This Really Work?


“Maybe this is just viable for a single
  developer … your site will be down. A lot.”

“It seems like the author either has no
   customers or very understanding
   customers”

         Responses to February 2009 blog posting about Continuous Deployment at IMVU
                                           (at the time IMVU had a $12 million run rate)



                                                                                           8
Benefits




• Regressions easy to find, correct

• Releases have zero overhead

• Rapid iteration using real customer metrics


                                                9
Finding and Fixing Problems


• Each release has few changes, 1-3 commits

• Production issues correlate with check-in timestamp

• No overhead to producing a new release to correct issue

Identifying cause takes minutes
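
A minimal sketch of the timestamp correlation described on this slide, assuming a deploy log is available as a sorted list of (time, commit id) pairs. The function name `suspect_commits`, the `window` parameter, and the sample revision ids are hypothetical and do not come from the talk.

```python
from bisect import bisect_right
from datetime import datetime

def suspect_commits(deploys, issue_start, window=3):
    """Return up to `window` commit ids deployed most recently before issue_start.

    deploys: list of (deployed_at, commit_id) tuples sorted by deployed_at.
    """
    times = [t for t, _ in deploys]
    end = bisect_right(times, issue_start)   # deploys at or before the issue start
    start = max(0, end - window)
    # Most recent first: the latest release is the most likely culprit.
    return [commit for _, commit in reversed(deploys[start:end])]

# Hypothetical deploy log with one release every 20 minutes.
deploys = [
    (datetime(2010, 9, 29, 10, 0), "r101"),
    (datetime(2010, 9, 29, 10, 20), "r102"),
    (datetime(2010, 9, 29, 10, 40), "r103"),
    (datetime(2010, 9, 29, 11, 0), "r104"),
]
print(suspect_commits(deploys, datetime(2010, 9, 29, 10, 45), window=2))
# prints ['r103', 'r102']
```

Because each release carries only a few commits, the candidate list stays short enough to inspect by hand.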
CD at IMVU: Simple Overview

1. Local tests pass, engineer commits code
2. Lots and lots of tests run
3. All tests pass?
   – No: Revert commit (Blocks)
   – Yes: Code deployed to % of servers
4. Metrics good?
   – No: Rollback (Blocks)
   – Yes: Code deployed to all servers
5. Metrics still good?
   – No: Rollback (Blocks)
   – Yes: Win!

                                                                           11
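
The flow on this slide could be sketched roughly as follows. This is an illustration under stated assumptions, not IMVU's actual tooling: every callable passed in (`run_tests`, `deploy`, `metrics_good`, `revert`, `rollback`) is a hypothetical hook, and the 10% canary size is an assumed value because the slide only says "% of servers".

```python
def continuous_deploy(commit, run_tests, deploy, metrics_good, revert, rollback,
                      canary_percent=10):
    """Walk one commit through the flow and return the outcome as a string."""
    if not run_tests(commit):
        revert(commit)              # a failing commit is reverted
        return "reverted"
    deploy(commit, canary_percent)  # deploy to a fraction of servers first
    if not metrics_good():
        rollback(commit)
        return "rolled back (partial)"
    deploy(commit, 100)             # then deploy to all servers
    if not metrics_good():
        rollback(commit)
        return "rolled back (full)"
    return "win"

# Usage with stub hooks that record what happened.
log = []
outcome = continuous_deploy(
    "r105",
    run_tests=lambda c: True,
    deploy=lambda c, pct: log.append(("deploy", c, pct)),
    metrics_good=lambda: True,
    revert=lambda c: log.append(("revert", c)),
    rollback=lambda c: log.append(("rollback", c)),
)
print(outcome, log)
```

The sketch checks metrics twice, once after the partial deploy and once after the full deploy, which mirrors the two "Metrics good?" decisions in the diagram.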
CD at IMVU: Detailed Overview




                            12
Getting Started – Extreme Basics




1. Continuous integration system
2. Production monitoring and alerting
   – System performance
   – Business metrics
   – Trending is nice too 
3. Simple deploy / roll-back system


                                             13
Commit to Making Forward Progress




• Require coverage for all new code

• Add coverage for bugs / regressions

• Understand and fix root cause of failures


                                              14
Expect Some Hurdles


• Production outages
• New overhead
   – Tests
   – Build systems
• Production outages
• Frustration
• Production outages

   (but well worth it)
Dealing with SQL


Problems
• Difficult to roll-back schema
• Alter statements lock / impact customers

Solutions
• New schema has formal review process
• No alter on large tables, create new table
   – Copy on read
   – Complete migration with background job
                                               16
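
A minimal sketch of the "create new table, copy on read, finish with a background job" idea, using Python's built-in `sqlite3` so it runs anywhere. The table names `profile_v1` and `profile_v2`, the added `bio` column, and both function names are assumptions made for illustration. The talk does not show IMVU's schema or code, and a real migration would also need to route writes to the new table.

```python
import sqlite3

def read_profile(conn, user_id):
    """Read from the new table; on a miss, copy the row over from the old table."""
    row = conn.execute(
        "SELECT id, name, bio FROM profile_v2 WHERE id = ?", (user_id,)
    ).fetchone()
    if row is not None:
        return row
    old = conn.execute(
        "SELECT id, name FROM profile_v1 WHERE id = ?", (user_id,)
    ).fetchone()
    if old is None:
        return None
    conn.execute(
        "INSERT OR IGNORE INTO profile_v2 (id, name, bio) VALUES (?, ?, '')", old
    )
    conn.commit()
    return (old[0], old[1], "")

def migrate_batch(conn, batch_size=1000):
    """Background job step: copy one batch of unmigrated rows, return rows copied."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO profile_v2 (id, name, bio) "
        "SELECT id, name, '' FROM profile_v1 "
        "WHERE id NOT IN (SELECT id FROM profile_v2) LIMIT ?",
        (batch_size,),
    )
    conn.commit()
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profile_v1 (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE profile_v2 (id INTEGER PRIMARY KEY, name TEXT, bio TEXT)")
conn.executemany("INSERT INTO profile_v1 VALUES (?, ?)",
                 [(1, "ann"), (2, "bob"), (3, "cy")])

print(read_profile(conn, 2))              # row is copied on first read
while migrate_batch(conn, batch_size=2):  # background job drains the rest
    pass
```

Small batches keep each statement short, so no single step holds a long lock on a large table.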
Big Features



• Developed on trunk, not branch
   – “hidden” from customers by A/B experiment
   – 100% control, add QA to experiment


• Deployed daily during development

• Slow roll-out by increasing experiment %
   – Experiment closed = fully launched
                                                 17
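
One way to hide a feature behind an experiment and roll it out by raising a percentage is a stable hash bucket per user. This is a sketch under assumptions: the function `in_experiment`, the `qa_users` set, and the SHA-256 bucketing scheme are illustrative and are not described in the talk.

```python
import hashlib

def in_experiment(user_id, experiment, percent, qa_users=frozenset()):
    """Return True if this user should see the feature at the given rollout percent."""
    if user_id in qa_users:          # QA accounts are always in the experiment
        return True
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100   # stable bucket 0..99 per (experiment, user)
    return bucket < percent

users = range(1000)
qa = frozenset({7})

# During development: 0% of customers see the feature, QA still does.
print(in_experiment(7, "new_chat", 0, qa), in_experiment(8, "new_chat", 0, qa))

# Slow roll-out: raising the percent only ever adds users.
at_10 = {u for u in users if in_experiment(u, "new_chat", 10)}
at_50 = {u for u in users if in_experiment(u, "new_chat", 50)}
print(at_10 <= at_50)
```

Because the bucket depends only on the experiment name and user id, a user who has the feature at 10% keeps it at 50%, and setting the percent to 100 corresponds to a full launch.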
Test Speed


Slow tests are a burden to scaling
• Can’t run all tests in sandbox
• Faster to debug on build cluster

If possible…
• Keep tests fast
• Keep tests specific


                                              18
The cost of failing tests



As the team grows…

• More likely to have test failures
• More people blocked as a result

        Intermittent failures very bad
          Eliminate the root cause

                                                19
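
A rough back-of-the-envelope sketch of why intermittent failures hurt more as commit volume grows. It assumes each test run fails spuriously and independently with some small probability. The rates below are hypothetical, and 50 is the commits-per-day figure quoted earlier in the deck.

```python
def p_any_failure(per_run_failure_rate, runs):
    """Probability that at least one of `runs` independent runs fails."""
    return 1 - (1 - per_run_failure_rate) ** runs

# Hypothetical intermittent-failure rates, at about 50 commits per day.
for rate in (0.001, 0.01, 0.05):
    print(rate, round(p_any_failure(rate, 50), 3))
```

Even a 1% spurious failure rate per run gives roughly a 40% chance of at least one false red per day at this volume, which is why the slide advises eliminating the root cause.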
Other Issues


• Won’t catch issues that fail slowly
   – SELECT * FROM growing_table WHERE 1


• Some critical areas cause hard lock-ups
   – MySQL
   – Memcached


• Lack of test coverage of older code
   – Not an issue if you start with test coverage
                                                    20
Does Continuous Deployment Scale?

• Technical staff ~50 people

• 10 million monthly unique visitors

• Peak ~130K concurrent IM client logins

• It’s a real business!
   – $40 million run rate
   – Profitable and doubled revenue in 2009
                                              21
Newer Scaling Challenges




Biggest challenges come with growth of the
          engineering organization




                                             22
SLA for Build Systems




Build systems are a critical service
        Run them that way




                                        24
Build and Push Times




                   25
Overall Availability




                   26
http://www.flickr.com/photos/onebigchickenman/4869442019/
Build Throughput


• Initial implementation: sequential builds
   – Scaled okay to ~20 engineers
   – Like trains running every 20 minutes
   – One “red” blocks all following builds


• Solution: build isolation
   – Enable testing single build without deploy
   – “Red” build pulled, allow other builds to pass


                                                      28
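
The difference between the sequential "train" and build isolation can be illustrated with a toy model. It assumes each build's pass or fail result is known and independent of the other builds, which is a simplification. The function names and the sample queue are hypothetical.

```python
def sequential(builds):
    """One train: everything at or after the first red build is blocked."""
    shipped = []
    for name, passes in builds:
        if not passes:
            break
        shipped.append(name)
    return shipped

def isolated(builds):
    """Each build is tested on its own; red builds are pulled, the rest ship."""
    return [name for name, passes in builds if passes]

queue = [("a", True), ("b", False), ("c", True), ("d", True)]
print(sequential(queue))  # prints ['a']
print(isolated(queue))    # prints ['a', 'c', 'd']
```

In the sequential model a single red build stalls every engineer queued behind it, which matches the slide's observation that the approach stopped scaling at around 20 engineers.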
Web Build Software



•   Custom test-file runner with JS GUI
•   PHP SimpleTest
•   Python's built-in unittest
•   Selenium Core with in-house API wrapper
•   YUITest for browser JS unit tests
•   Erlang Eunit
•   Buildbot

                                              29
Current Systems


• > 15,000 tests

• 86 web build servers
   – 62 Linux

   – 24 Windows

• ~ 10 minutes on build servers

• Deploy to cluster of ~700 servers
                                            30
Conclusion



• Continuous Deployment is possible!

• Starting earlier is easier - baby steps

• The value of being able to iterate
  outweighs the challenges


                                                31
Questions?




             32
Thank You!


Brett G. Durrett
bdurrett@imvu.com
Twitter: @bdurrett

IMVU recognized as:
• Inc. 500: http://bit.ly/dv52wK
• Red Herring 100: http://bit.ly/bbz5Ex
• Best Place to Work: http://bit.ly/aAVdp8

(and we're hiring)
http://www.imvu.com/jobs
