This talk will describe the evolution of how we've used Puppet at Demonware, a subsidiary of Activision Blizzard, to run the infrastructure of some of the world's biggest games, supporting millions of concurrent users for titles such as Call of Duty.
Ruaidhri Power of DemonWare at PuppetCamp Dublin '12. http://www.puppetlabs.com
2. Overview
History of Demonware and our growth
What do we do?
Early Puppet approaches
Current state
New improvements
The future
Questions
3. Foundation
Original founders (~2003)
− Seán Blanchfield
PhD student in the Distributed Systems Group (DSG) in the CS dept at Trinity College Dublin (TCD)
DSG previously spun out Iona (CORBA)
TCD CS dept spun out Havok (games physics)
Seán was studying Grid P2P topologies
− Dylan Collins
Business graduate, also TCD
Both were hooked on Counter-Strike and Quake
4. Startup
Started hosting lobby servers in 2005
By 2007, lots of customers: Activision, Ubisoft, Codemasters, THQ
Acquired by Activision in May
Some big games
− Splinter Cell Double Agent
− Saints Row
− Worms Open Warfare
− Colin McRae DiRT
− Enemy Territory Quake Wars
5. Startup
But no monster blockbuster
20,000 concurrent users was a big title
Still a tiny company
11 devs, 3 ops, 3 managers
Acquired by Activision (now Activision Blizzard)
6. Products
Bitdemon
− Cross-platform
− Game-friendly SDK for P2P communications (no server-side components)
− Minimal memory allocation, non-blocking, etc.
− Designed to be called in a game loop
− Had higher-level libraries to support client-server and peer-to-peer games
− Origin of the “bd” prefix
7. What do we do?
The full online infrastructure for all Activision games
– Lobby services:
• Matchmaking, Leaderboards, Stats storage, Messaging, Friends/Teams, Anti-cheat
• Via Xbox Live Server Platform (XLSP) → Windows boxes
– Webservice access to our services
• Elite, elite.callofduty.com
• Mobile
8. Games
Call of Duty
− Call of Duty 4: Modern Warfare (2007)
− Call of Duty: World at War (2008)
− Call of Duty: Modern Warfare 2 (2009)
− Call of Duty: Black Ops (2010)
− Call of Duty: Modern Warfare 3 (2011)
− →Call of Duty: Black Ops 2 (2012)
9. Games
Guitar Hero
Spyro
Blur
DJ Hero
James Bond – GoldenEye and Quantum of Solace
Transformers: War for Cybertron
Singularity
90+ games in total
10. Demonware in numbers
Our services are used by 280+ million gamers
We support 2.4 million+ concurrent online gamers
Demonware software has shipped in 90+ games
We serve 300,000 requests per second at peak
We have an average query response time of < 0.01 seconds
We collect 500,000+ metrics every minute
Our services respond to 100 billion+ API calls per month
11. In the beginning (~2007)
Tech with Ubuntu DVD
− lots of notes on wiki
− compiling from source; “ask Seán/Tilman”
Standard image with basics done
− hard drive removed and imaged
− frozen at point in time
− hard to update
12. Fun times
New accounts by hand everywhere
Network setup over the network
− and shorewall fun
Changing /etc/hosts made sudo unhappy
Reboot and cross fingers
Mail remote hands in shame
13. Cobbler
Provisioning server
− Written in Python
− delivers network installs via PXE
− integrated DHCP server
− also supports Windows and virtualized hardware such as KVM and VMware servers
Install Puppet
14. 2009 Architecture
Fledgling Puppet deployment
ENC (external node classifier) script connecting to a MySQL inventory database with IPs and a list of Puppet classes
With great power comes great responsibility
− UPDATE without WHERE clause
− “I'm such a dummy, I can't spell --i-am-a-dummy
properly” — anon.
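As a rough illustration of the ENC approach on this slide, here is a minimal Python sketch. It is not the original script: the `nodes` table, its columns, the hostnames, and the class names are all invented, and the standard-library `sqlite3` module stands in for the MySQL inventory database. The only part taken from Puppet is the general ENC contract, where a script is given a node name and prints YAML with a `classes` key.

```python
import sqlite3


def build_inventory():
    """Create a throwaway in-memory inventory (stand-in for the MySQL database)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE nodes (hostname TEXT, ip TEXT, puppet_class TEXT)")
    db.executemany(
        "INSERT INTO nodes VALUES (?, ?, ?)",
        [
            ("lobby01.example.com", "10.0.0.11", "base"),
            ("lobby01.example.com", "10.0.0.11", "lobby"),
            ("db01.example.com", "10.0.0.21", "base"),
        ],
    )
    return db


def classify(db, hostname):
    """Return ENC-style YAML text for one node, or None if the node is unknown."""
    rows = db.execute(
        "SELECT ip, puppet_class FROM nodes WHERE hostname = ? ORDER BY puppet_class",
        (hostname,),
    ).fetchall()
    if not rows:
        return None
    lines = ["---", "classes:"]
    lines += ["  - %s" % puppet_class for _, puppet_class in rows]
    lines += ["parameters:", '  ip: "%s"' % rows[0][0]]
    return "\n".join(lines) + "\n"


# Example: classify one (hypothetical) host and print the YAML Puppet would read
print(classify(build_inventory(), "lobby01.example.com"))
```

Note that the lookup uses a parameterised query with an explicit `WHERE` clause, which is the kind of guard the "UPDATE without WHERE clause" bullet is a reminder of.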
15. 2009 Problems
Puppet class proliferation
No service- or host-based conditionals in code
Passwords in code!
Used Puppet to copy over a shell script ☹
− MySQL users created via shell script
− Change the MySQL root password: no more Puppet changes
Machine inventory in spreadsheet
noop
16. noop
Tells the transactional layer not to make any changes
− logs them instead
All production machines ran in noop mode
Machines in setup did not
Trade-off between automation and not making changes accidentally
To clear pending changes: run the Puppet client from the command line, or just make the changes by hand and the log messages go away
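The noop behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea (report the change instead of making it), not how Puppet implements it; the `ensure_line` helper and the sample host entries are hypothetical.

```python
import logging

log = logging.getLogger("noop-sketch")


def ensure_line(lines, wanted, noop=True):
    """Ensure `wanted` is present in `lines`; in noop mode only report the change."""
    if wanted in lines:
        return lines, None  # already in the desired state: nothing to do or log
    message = ("would add line %r" if noop else "added line %r") % wanted
    log.warning(message)
    if noop:
        return lines, message  # state is left untouched, the change is only logged
    return lines + [wanted], message


# A noop run logs the pending change but leaves the state as it was
hosts = ["127.0.0.1 localhost"]
unchanged, pending = ensure_line(hosts, "10.0.0.11 lobby01", noop=True)

# A real run (or a manual fix) applies it, after which a noop run logs nothing
applied, _ = ensure_line(hosts, "10.0.0.11 lobby01", noop=False)
_, nothing_pending = ensure_line(applied, "10.0.0.11 lobby01", noop=True)
```

The last call shows why the log messages "go away" once the change has been made by either route: the resource is already in its desired state.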
17. noop
noop saved us downtime
− Turned what would have been complete downtime on Guitar Hero 5 (GH5) and Modern Warfare 2 (MW2) into a problem with wsproxy and contingency only
− Political necessity at the time
In the process of removing it now that Puppet has proven itself
18. Puppet gains traction
Servers per sysadmin
More in-house expertise
Base system install for dev
Full production install
Server rebuild
– Faster than debugging subtly broken system
19. 2010 rewrite
Move from Ubuntu to CentOS
Much improved from previous version
− Custom types
− Proper dependencies
− Password lookup function
20. Load balancing
Standard Webrick
Apache and Mongrel
Now moved to Passenger
21. Custom types
MySQL users
− users
− passwords
− grants
MySQL databases
Generic MySQL module for use with multiple services
sysctls
22. Custom functions
Password lookup
− $auth_database_password = password("mysql_auth_database", $service)
− Passwords configured locally per Puppetmaster, outside version control
− Allows sharing of modules without sharing secrets
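The lookup logic behind a function like `password()` might look something like the following Python sketch. A real Puppet custom function would be written in Ruby; this only illustrates the idea of resolving a secret by name and service from a file that lives on the Puppetmaster and never enters version control. The JSON file format, the `store_path` argument, and the sample secret are assumptions made for the example.

```python
import json
import os
import tempfile


def password(name, service, store_path):
    """Look up a secret by name and service in a local file kept outside version control."""
    with open(store_path) as handle:
        store = json.load(handle)
    try:
        return store[service][name]
    except KeyError:
        raise KeyError(
            "no password %r configured for service %r" % (name, service)
        ) from None


# Demo: write a throwaway store, as a Puppetmaster-local file might look (format assumed)
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"lobby": {"mysql_auth_database": "s3cret"}}, tmp)
demo_path = tmp.name

auth_database_password = password("mysql_auth_database", "lobby", demo_path)
os.unlink(demo_path)  # clean up the demo file
```

Because modules only ever contain the lookup call, the same module can be shared between teams or sites while each Puppetmaster supplies its own secrets file.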
23. Devzone integration
Internal Django app
Game developer interface to Demonware
Internal service configuration interface
− double sign off of changes
Inventory database
− servers
− Interfaces – IPs, netmasks, default routes
− clusters / subclusters
− Puppet modules!
24. ENC script
Python script which connects to our custom inventory database (Django app)
Makes Devzone API call for classes, network, subcluster, etc.
Simple conditionals to add extra configuration to output
Disadvantages
− Need SSH and root access to update
− Brittle, with no way to prevent simultaneous updates
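The "simple conditionals" step can be pictured as a small function that post-processes whatever the inventory API returns. The sketch below is hypothetical throughout: the record layout, the `mysql::backup` class, and the rule about production subclusters are invented to show the shape of the logic, not taken from the actual script or the Devzone API.

```python
def add_conditionals(node):
    """Apply simple conditionals to a node record and return the ENC data to emit."""
    classes = list(node["classes"])
    parameters = {"subcluster": node["subcluster"], "network": node["network"]}

    # Hypothetical rule: database hosts also get a backup class
    if "mysql" in classes:
        classes.append("mysql::backup")

    # Hypothetical rule: anything in a production subcluster runs in noop mode
    if node["subcluster"].startswith("prod"):
        parameters["noop"] = True

    return {"classes": classes, "parameters": parameters}


# Stand-in for the record an inventory API call might return for one host
record = {
    "classes": ["base", "mysql"],
    "subcluster": "prod-lobby",
    "network": "10.0.0.0/24",
}
enc_data = add_conditionals(record)
```

Rules like these live inside the script itself, which is why every change needed SSH and root access on the Puppetmaster.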
26. bdPuppetConfig
Python XMLRPC server (bdPuppetConfigd)
Simple client (bdpupc)
bdconfig as a standard for configuring Demonware services
Devzone integration
− View how your service is configured
− Make updates self-service
− Traceability
− NOC
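For readers unfamiliar with XML-RPC, the server and client pairing can be sketched with Python's standard library alone. This is not bdPuppetConfigd or bdpupc: the `get_config` method, the in-memory `CONFIG` data, and the hostnames are assumptions, and the server binds to a free local port purely for the demonstration.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical in-memory configuration, keyed by service name
CONFIG = {"lobby": {"db_host": "db01.example.com", "db_port": 3306}}


def get_config(service):
    """Return the configuration for one service (empty dict if unknown)."""
    return CONFIG.get(service, {})


def start_server():
    """Start an XML-RPC server on a free local port and return it."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(get_config, "get_config")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Server side: start serving in a background thread
server = start_server()
port = server.server_address[1]

# Client side: ask the server how a service is configured
client = ServerProxy("http://127.0.0.1:%d/" % port)
lobby_config = client.get_config("lobby")

server.shutdown()
server.server_close()
```

Putting configuration behind a network API like this, rather than in a script edited over SSH, is what makes self-service updates and traceability possible.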
27. Puppet modules
schema.yaml in the root of each module defines available variables
bdconfig variable types
− host
− ip
− hostport
− string
− boolean
− etc.
Versioned per puppet branch
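A typed schema like this lends itself to automatic validation before a change is accepted. The Python sketch below shows one way that could work for the variable types listed above. The schema is represented as a plain dict standing in for a parsed schema.yaml, whose real layout is not shown in the slides; the variable names and validation rules are assumptions.

```python
import ipaddress
import re

LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
HOSTNAME_RE = re.compile(r"^%s(?:\.%s)*$" % (LABEL, LABEL))


def is_host(value):
    return isinstance(value, str) and bool(HOSTNAME_RE.match(value))


def is_ip(value):
    if not isinstance(value, str):
        return False
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def is_hostport(value):
    host, sep, port = str(value).rpartition(":")
    if not sep or not re.fullmatch(r"[0-9]{1,5}", port):
        return False
    return is_host(host) and 0 < int(port) < 65536


VALIDATORS = {
    "host": is_host,
    "ip": is_ip,
    "hostport": is_hostport,
    "string": lambda value: isinstance(value, str),
    "boolean": lambda value: isinstance(value, bool),
}


def validate(schema, values):
    """Return a list of error strings; an empty list means the values fit the schema."""
    errors = []
    for name, type_name in schema.items():
        if name not in values:
            errors.append("%s: missing" % name)
        elif not VALIDATORS[type_name](values[name]):
            errors.append("%s: not a valid %s" % (name, type_name))
    return errors


# Hypothetical schema and a set of values that violates it in several ways
schema = {"db_host": "host", "db_ip": "ip", "cache": "hostport", "debug": "boolean"}
problems = validate(
    schema, {"db_host": "bad host!", "db_ip": "999.1.1.1", "cache": "cache01:99999"}
)
```

Here `problems` lists one message per offending variable, which is the kind of feedback a self-service configuration interface could show before a change is signed off.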
28. Gerrit
git code review tool
http://code.google.com/p/gerrit/
Clone from standard git repository (we use gitolite and cgit)
Push to Gerrit and have change reviewed and confirmed
Post-commit hooks distribute to the relevant datacentres (per git branch)
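The per-branch distribution step could be driven by a mapping from branch name to datacentre remotes. The following Python sketch only builds the `git push` commands a hook might run; the branch names, remote names, and the `BRANCH_REMOTES` mapping are hypothetical, and how the hook is wired into Gerrit is left out.

```python
# Hypothetical mapping from Puppet git branch to the datacentre remotes that track it
BRANCH_REMOTES = {
    "production": ["dc-east", "dc-west"],
    "staging": ["dc-staging"],
}


def push_commands(ref):
    """Return the git commands a hook could run for an updated ref."""
    prefix = "refs/heads/"
    if not ref.startswith(prefix):
        return []  # tags and other refs are ignored
    branch = ref[len(prefix):]
    return [
        ["git", "push", remote, "%s:%s" % (ref, ref)]
        for remote in BRANCH_REMOTES.get(branch, [])
    ]


# Example: a merged change on the production branch fans out to two datacentres
commands = push_commands("refs/heads/production")
```

Each command is returned as an argument list so that it could be passed to `subprocess.run` without going through a shell.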
29. Future for Demonware
CoD n+1
− Elite
Bungie
Mobile
CoD online (China)
Next-gen consoles