"Technologies that are going to affect our lives in the next decade are being tested and developed in the video game sphere."
In January 2016 Activision approved a pilot project to build a containerised continuous delivery pipeline using Docker. The project spanned multiple DevOps teams and would culminate in the launch of a production title, "Skylanders Imaginators", in October 2016.
The mission statement: “Our mission is to deliver an amazing build, test and deploy pipeline that aims to be so reliable, effective and easy to use that our product and title departments will end up writing high value gaming services all day long without giving a second thought to how they may reliably deliver these in record time.”
This talk will discuss the cultural and technical challenges faced throughout the pilot. Spoiler alert: not everyone was happy with the decision to use Docker. The talk will cover those concerns and how we handled them, and why it is important, especially in the games industry, to evaluate and integrate technologies like Docker in order to remain relevant.
For the first time in Demonware history, developers were responsible for the launch and support of a title. We are also the first studio under Activision to run Docker in production.
3. 1. Who are Demonware?
2. What is Skypilot?
3. Skypilot Principles
4. Workflows
5. Takeaways
Agenda
4. Who are Demonware?
We provide online services and
infrastructure for some of the world’s
most popular video game franchises.
5.
6. A brief history of Demonware time
[Chart: company size, concurrent users, size of our monolith, number of services, operational overhead and gastric ulcer size, plotted for 2009 (MW2), 2013 (Ghosts) and 2015 (BO3)]
7. What was Skypilot?
Goal: deliver and run game services for a
production title, Skylanders Imaginators.
How: through a containerised continuous delivery
pipeline.
Length: 9 months.
8. Deliver an amazing build, test and deploy pipeline that aims
to be so reliable, effective and easy to use that our product
and title departments will end up writing high value gaming
services without giving a second thought to how they may
reliably deliver these in record time.
Unlocking engineering agility allows us to deliver “amazing
value in record time”.
Mission Statement
9. Building, configuring, testing, deploying [and running] services
• requires specialist knowledge
• is not safe (incomplete and/or unreliable automation)
• is not efficient
• is not consistent across environments
• does not allow us to work in small batch sizes
The problem
10. “Evolution of our
existing tools was
doomed to fail. What
we needed was a
revolution”
- Morgan Brickley, Dev Lead
11. This wasn’t a greenfield project. We had some legacy issues and
tech debt that we needed to address:
• Services had always run on bare metal
• Monolithic codebase
• Internal tooling didn’t support container deployment
• Software choices made a decade ago no longer made sense
• Processes were inconsistent across teams
• Processes were immature
Some other challenges
12. We wanted to work in small batch sizes in order to
• reduce iteration time
• increase quality
• fulfil our mandate: unlock engineering agility
We also wanted to
• codify our deployments
• become lean
• lower the barrier to entry for service deployment
The solution
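To make "codify our deployments" concrete, here is a minimal sketch of a deployment described as data and rendered into a `docker run` command. The deck does not show Skypilot's actual tooling, so the `ServiceDeployment` schema, the `lobby` service, the registry URL and the tag are all hypothetical; only the `docker run` flags (`--detach`, `--name`, `--env`, `--publish`) are real.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceDeployment:
    """Declarative description of one containerised service (hypothetical schema)."""
    name: str
    image: str
    tag: str
    env: Dict[str, str] = field(default_factory=dict)
    ports: Dict[int, int] = field(default_factory=dict)  # host port -> container port


def docker_run_args(dep: ServiceDeployment) -> List[str]:
    """Render the deployment as an argument list for `docker run`."""
    args = ["docker", "run", "--detach", "--name", dep.name]
    # Sort keys so the same description always renders the same command.
    for key in sorted(dep.env):
        args += ["--env", f"{key}={dep.env[key]}"]
    for host_port in sorted(dep.ports):
        args += ["--publish", f"{host_port}:{dep.ports[host_port]}"]
    args.append(f"{dep.image}:{dep.tag}")
    return args


# Hypothetical service: name, registry and tag are made up for illustration.
lobby = ServiceDeployment(
    name="lobby",
    image="registry.example.com/lobby",
    tag="1.4.2",
    env={"ENVIRONMENT": "staging"},
    ports={8080: 8080},
)
print(" ".join(docker_run_args(lobby)))
```

Because the description lives in version control rather than in someone's head, a change to a deployment can be reviewed and shipped as a small batch, which is the point of the slide above.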
13. From hero to zero
1 - Initial
2 - Repeatable
3 - Defined
4 - Managed
5 - Optimizing
14. “What’s the smallest number of
steps, with the smallest number of
people and the smallest amount
of ceremony required to get new
code running on your servers?”
- Erik Kastner
18. Databases
● MySQL in containers
● Data stored on GP2 EBS volumes
● Simple master/slave topology (no master
promotion)
● Slaves attempt sync on startup
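As an illustration of "slaves attempt sync on startup", the sketch below builds the SQL a replica container could issue when it boots in order to attach itself to a fixed master. The deck does not say how Skypilot did this, so treat every detail as an assumption: the function name, the host and credentials are hypothetical, and GTID-based replication (`MASTER_AUTO_POSITION=1`) is assumed. The statements themselves are standard MySQL 5.6/5.7 syntax.

```python
from typing import List


def _quote(value: str) -> str:
    """Quote a value as a SQL string literal, rejecting characters we do not escape."""
    if "'" in value or "\\" in value:
        raise ValueError(f"unsupported character in {value!r}")
    return f"'{value}'"


def replica_startup_statements(master_host: str, user: str, password: str) -> List[str]:
    """Return the SQL a replica could run on startup to attach to the master.

    Assumes GTID-based replication, so no binlog file or position is needed.
    The master is fixed: this matches a topology with no master promotion.
    """
    return [
        # Replication threads must be stopped before CHANGE MASTER TO.
        "STOP SLAVE",
        "CHANGE MASTER TO "
        f"MASTER_HOST={_quote(master_host)}, "
        f"MASTER_USER={_quote(user)}, "
        f"MASTER_PASSWORD={_quote(password)}, "
        "MASTER_AUTO_POSITION=1",
        "START SLAVE",
    ]


# Hypothetical host and credentials, for illustration only.
for statement in replica_startup_statements("db-master", "repl", "secret"):
    print(statement + ";")
```

With no master promotion, the replica never has to discover who the master is, which keeps the startup logic this small.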
23. So how did it work out?
Skylanders was successfully launched on
October 13th, 2016 without issue.
24. Efficiencies
Time to build a shared cluster: days → 15-20 minutes
Time to deploy a title environment: hours → 10-15 minutes
Time to recover from a full outage: hours → 8-15 minutes
Time to recover from a database failure: hours → 40-60 seconds
25. What next?
● Provide a better UX for deployments
● Reduce new deployment time by
parallelizing the CD test stage
● Prepare for larger scale deployments
● Educate
26. “I was able to update a
service from scratch within
45 mins. This included setup
time, learning git and 30
mins of user error
debugging”
- Lisa Reilly, Project Manager
27. “With zero Ops
experience, I was able
to create a Production
cluster, on my own, in
minutes”
- Anar Rahimli, Skypilot customer