4. Facts
• Company founded in 2001
• 350+ employees worldwide
• 180M+ unique visitors per month
• 45 portals in 19 languages
• Casual games
• Social games
• Real-time multiplayer games
• Mobile games
• 35+ MySQL clusters
• 60k queries per second (3.5 billion qpd)
5. Geographic Reach
180 Million Monthly Active Users (*)
(*) Source: Google Analytics, August 2012
• Over 45 localized portals in 19 languages
• Multi-platform: web, mobile, tablet
• Focus on casual and social games
• 180M MAU (30M YoY growth)
• Over 50M registered users
6. Brands
Girls, Teens and Family
spielen.com
juegos.com
gamesgames.com
games.co.uk
8. Retaining globally distributed HA
• What exactly does it mean?
9. What is high availability?
Wikipedia:
High availability is a system design approach and associated service implementation that ensures a prearranged level of operational performance will be met during a contractual measurement period.
Oracle:
• Availability of resources in a computer system
10. How do we reach HA with MySQL?
• Master with (many) slave(s)
(Diagram: one Master, three Slaves)
11. How do we reach HA with MySQL?
• Master with (many) slave(s)
• Multi Master
(Diagram: two Masters, two Slaves)
12. How do we reach HA with MySQL?
• Master with (many) slave(s)
• Multi Master
• Clustering
(Diagram: two mysqld nodes, five ndbd nodes, one mgmt node)
13. How do we reach HA with MySQL?
• Master with (many) slave(s)
• Multi Master
• Clustering
• Geographical redundancy
(Diagram: Master in local DC, Slave in local DC, Slave in Asia, Slave in US)
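A minimal sketch of how an application might route queries in a geographically redundant master/slave setup: writes go to the single master, reads go to a slave in the caller's region. The host names, the region keys and the `pick_host` helper are hypothetical and not taken from the talk.

```python
import random

# Hypothetical host names for illustration only.
MASTER = "master-local"
SLAVES = {
    "local": ["slave-local"],
    "asia": ["slave-asia"],
    "us": ["slave-us"],
}


def pick_host(is_write, region):
    """Return the host a query should be sent to."""
    if is_write:
        # All writes go to the one master.
        return MASTER
    # Reads prefer a slave in the caller's region and fall back to the
    # slaves in the local DC when the region is unknown.
    candidates = SLAVES.get(region) or SLAVES["local"]
    return random.choice(candidates)
```

Reads served from a remote slave can lag behind the master, so this kind of routing only suits queries that tolerate slightly stale data.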
14. What if we keep growing?
• Scale up
  • Vertical
  • Faster CPU/Memory/disks
  • Expensive
  • Costs multiply at the same rate as the number of nodes
• Scale out
  • Horizontal
  • More (small) machines
  • Inexpensive
  • Partitioning/federating (sharding)
15. Scale out
• Functional
  • Shard your database functionally
• Reads
  • Add more slaves (keep them coming!)
• Writes
  • More disks
  • Horizontal partitioning
  • Federated partitions
16. Horizontal partitioning
• Breaking up tables into small parts on the same host
• Partitioned on a column
• Infinite growth (as long as you add disk space)
• Less used data to slower (cheaper) disks
• No stored procedures, functions, etc.
• Uneven usage of partitions (hash partitioning may help)
• Once written, data remains on the partition
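To illustrate why hash partitioning may help with uneven usage, here is a small sketch that assigns rows to partitions in two ways. The column (`user_id`), the range boundaries and the partition count are made up for the example. The hash variant uses a plain modulus, which is how MySQL's `PARTITION BY HASH` picks a partition for an integer expression.

```python
import bisect
from collections import Counter

# Hypothetical upper bounds, in the spirit of "VALUES LESS THAN (...)".
RANGE_BOUNDARIES = [2500, 5000, 7500]
NUM_HASH_PARTITIONS = 4


def range_partition(user_id):
    """Partition index for range partitioning on user_id."""
    return bisect.bisect_right(RANGE_BOUNDARIES, user_id)


def hash_partition(user_id):
    """Partition index for hash (modulus) partitioning on user_id."""
    return user_id % NUM_HASH_PARTITIONS


# The newest users are usually the most active ones.
recent_users = range(9000, 10000)
range_load = Counter(range_partition(u) for u in recent_users)
hash_load = Counter(hash_partition(u) for u in recent_users)
```

With range partitioning all recent users land in the last partition, while the modulus spreads the same users evenly over all four.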
17. Federated partitions (sharding)
• Breaking up your table into parts on multiple hosts
• Partitioned on a column
• Infinite growth (as long as you add hosts)
• Less used data on slower hosts
• Not supported in (standard) MySQL
  • Partitioning on application level (or proxy)
  • Alternatively: NDB
• Uneven usage of partitions
• Once written, data (mostly) remains on the partition
• Parallel queries to retrieve data from all shards
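Application-level partitioning and parallel queries over all shards can be sketched as follows. The shards are in-memory dictionaries standing in for separate database hosts, and the names `shard_for`, `put`, `query_shard` and `scatter_gather` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Three in-memory dicts stand in for three database hosts (assumption).
SHARDS = [dict() for _ in range(3)]


def shard_for(user_id):
    """The application decides which shard owns a row."""
    return SHARDS[user_id % len(SHARDS)]


def put(user_id, score):
    shard_for(user_id)[user_id] = score


def query_shard(shard, min_score):
    """Runs on one shard only."""
    return [(uid, s) for uid, s in shard.items() if s >= min_score]


def scatter_gather(min_score):
    """Query every shard in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        parts = list(pool.map(lambda sh: query_shard(sh, min_score), SHARDS))
    return sorted(row for part in parts for row in part)
```

A lookup by `user_id` touches a single shard. Any query that is not keyed on the partition column has to fan out to all shards and wait for every one of them.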
18. Amdahl's law
• Parallel execution of sequential jobs
• Limited by the weakest link
• As fast as the slowest node
• Fix: nonsequential (asynchronous) execution
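For reference, the usual statement of Amdahl's law is speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of nodes. The sketch below computes it and also shows the "as fast as the slowest node" effect for a query that waits on all shards. The function names and latency figures are illustrative only.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Theoretical speedup when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


def scatter_gather_latency(node_latencies_ms):
    """A request that waits for all nodes finishes with the slowest one."""
    return max(node_latencies_ms)


# With 90% parallel work, 10 nodes give roughly a 5.3x speedup, not 10x.
speedup = amdahl_speedup(0.9, 10)
# One slow node (250 ms) dominates two fast ones.
latency = scatter_gather_latency([12, 15, 250])
```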
22. What was our wishlist?
• Dependent on one storage platform
  • No more platform-specific query language
• Differentiate writes
  • Optimistic (asynchronous)
  • Pessimistic (synchronous)
• Shard data better
  • Partition on user and function
  • Cluster information by users, not by function
• Global expansion
  • Partition on geographic location
• Solve uneven usage of data storage
  • Move data from shard to shard
• Anything may/could/will fail eventually
  • Not designed for the “happy” flow
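One possible reading of "differentiate writes" is sketched below: a pessimistic write returns only after the data is stored, while an optimistic write is queued and acknowledged immediately. The `WriteRouter` class, its method names and the return values are hypothetical. The talk does not specify how the two modes are implemented.

```python
import queue
import threading


class WriteRouter:
    """Toy storage front end with synchronous and asynchronous writes."""

    def __init__(self):
        self.storage = {}               # stands in for persistent storage
        self._pending = queue.Queue()   # optimistic writes wait here
        threading.Thread(target=self._drain, daemon=True).start()

    def _drain(self):
        # Background worker that applies queued writes one by one.
        while True:
            key, value = self._pending.get()
            self.storage[key] = value
            self._pending.task_done()

    def write_pessimistic(self, key, value):
        """Synchronous: the caller waits until the value is stored."""
        self.storage[key] = value
        return "stored"

    def write_optimistic(self, key, value):
        """Asynchronous: the caller gets an answer before the write lands."""
        self._pending.put((key, value))
        return "accepted"

    def flush(self):
        """Block until every queued write has been applied."""
        self._pending.join()
```

The optimistic path is faster for the caller but the write can still fail after it was acknowledged, so it fits data where an occasional loss is acceptable.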
25. New architecture overview
Server API
Application Model
Storage platform
Client-side API
Presentation layer
Physical storage
26. Our building blocks
• Everything written in Erlang
• Piqi as protocol
  • binary
  • JSON
  • XML
• SSP utilizes local caching (memcache)
• Flexible (persistent) storage layer
  • MySQL (various flavors)
  • Membase/Couchbase
  • Could be any other storage product
• MQs (DWH updates)
28. Why Erlang?
• Functional language
• High availability: designed for telecom solutions
• Excels at concurrency, distribution, fault tolerance
• Do more with less!
• Other companies using Erlang:
29. How do we shard?
• What is the bucket model?
• Each record has one unique owner attribute (GID)
• GID (Global IDentifier) identifying different types
• Bucket(s) per functionality
• Bucket is structured data
• Attributes contain data of records
• Attributes do not have to correspond to schema
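The bucket model can be pictured with a small data-structure sketch: one bucket per functionality, every record keyed by its owner GID, and free-form attributes per record. Representing a GID as a (kind, number) pair and the names `GID`, `Bucket`, `put` and `get` are assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


class GID(NamedTuple):
    """Global IDentifier: the owner of a record (layout is hypothetical)."""
    kind: str      # what the GID identifies, e.g. "user"
    number: int


@dataclass
class Bucket:
    """One bucket per functionality; records are keyed by owner GID."""
    name: str
    records: dict = field(default_factory=dict)

    def put(self, gid, **attributes):
        # Attributes are free-form and need not match a fixed schema.
        self.records.setdefault(gid, {}).update(attributes)

    def get(self, gid):
        return self.records.get(gid, {})
```

Because every record has exactly one owner, all of a user's records in every bucket can be located, and later moved, by GID alone.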
36. Global distribution
• Nearest datacenter (DC) to the end user
• Satellite DC
  • Processing and caching
  • Does not own/store data
• Storage DC
  • Processing, caching and persistent storage
• Store all data of the same user in the same DC
• Partition on user globally
  • Global IDentifier per user
37. The lookup server
• Contains GIDs and their master DC
• A GID's master DC is predefined
• Migrated GIDs get updated
38. How does this work?
• Globally sharded on GID
• (local) GID Lookup
(Diagram: GID lookup, Shard 1, Shard 2, Persistent storage)
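A rough sketch of the lookup step: a GID is first resolved to its master DC, then to a shard inside that DC. The DC names, the number of shards per DC and the modulus rules are assumptions. The talk only states that the master DC is predefined and that migrated GIDs get updated in the lookup.

```python
# Hypothetical storage DCs and shard count.
STORAGE_DCS = ["eu", "us", "asia"]
SHARDS_PER_DC = 2

# Overrides for GIDs that were migrated away from their predefined DC.
migrated = {}


def predefined_dc(gid):
    """Default master DC, derived from the GID itself (assumed rule)."""
    return STORAGE_DCS[gid % len(STORAGE_DCS)]


def master_dc(gid):
    """Migrated GIDs win over the predefined mapping."""
    return migrated.get(gid, predefined_dc(gid))


def locate(gid):
    """Return (master DC, shard number inside that DC) for a GID."""
    return master_dc(gid), gid % SHARDS_PER_DC
```

For example, `locate(1234)` resolves to the predefined DC until an entry is added to `migrated`, after which all traffic for that GID goes to the new DC.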
40. Why do we need data migration?
• Spread data evenly over shards
  • Migration of buckets between shards
• GID migration between DCs
  • Creating a new storage DC needs data migration
  • Users will automatically be migrated after visiting another DC many times
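The automatic migration rule could look roughly like the sketch below. The threshold of three visits, the `LookupServer` class and the choice to treat the first visited DC as the master DC are assumptions. The talk only says that users are migrated after visiting another DC many times.

```python
from collections import defaultdict

MIGRATION_THRESHOLD = 3  # hypothetical value for "many times"


class LookupServer:
    def __init__(self):
        self.master_dc = {}                    # GID -> master DC
        self.remote_visits = defaultdict(int)  # (GID, DC) -> visit count

    def record_visit(self, gid, visited_dc):
        """Register a visit and return the (possibly new) master DC."""
        home = self.master_dc.setdefault(gid, visited_dc)
        if visited_dc == home:
            return home
        self.remote_visits[(gid, visited_dc)] += 1
        if self.remote_visits[(gid, visited_dc)] >= MIGRATION_THRESHOLD:
            # Point the GID at its new master DC and reset its counters.
            self.master_dc[gid] = visited_dc
            for key in [k for k in self.remote_visits if k[0] == gid]:
                del self.remote_visits[key]
        return self.master_dc[gid]
```

In a real system the data itself would have to be copied to the new DC before the lookup entry is switched. That step is left out here.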
41. Seamless schema upgrades
• Versioning on bucket definitions
• GIDs are assigned to a bucket version
• Data in old bucket versions remains (read only)
• New data only gets written to the new bucket version
• Updates migrate data to the new bucket version
• Migrations can be triggered
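42. Seamless schema upgrades
Start: all records are in Demobucket v1, Demobucket v2 is empty.

Demobucket v1
GID   name
1234  Roy
1235  Moss
1236  Jen
1237  Douglas
1238  Denholm
1239  Richmond

Demobucket v2
GID   name      gender
(no records yet)

New record 1241 is written to v2:

Demobucket v2
GID   name      gender
1241  Patricia  f

Record 1235 moves from v1 to v2:

Demobucket v1
GID   name
1234  Roy
1236  Jen
1237  Douglas
1238  Denholm
1239  Richmond

Demobucket v2
GID   name      gender
1241  Patricia  f
1235  Moss      m

Record 1236 moves from v1 to v2:

Demobucket v1
GID   name
1234  Roy
1237  Douglas
1238  Denholm
1239  Richmond

Demobucket v2
GID   name      gender
1241  Patricia  f
1235  Moss      m
1236  Jen       f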
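The Demobucket example can be replayed with a short sketch: reads look in the newest version first, new records are written to the new version, and an update moves an existing record from the old version to the new one. The `VersionedBucket` class and its methods are hypothetical and only mirror the behaviour shown on the slides.

```python
class VersionedBucket:
    """Toy bucket with an old (v1) and a current (v2) definition."""

    def __init__(self, v1_rows):
        self.versions = {1: dict(v1_rows), 2: {}}
        self.current = 2

    def read(self, gid):
        """Return (version, row), checking the newest version first."""
        for version in sorted(self.versions, reverse=True):
            if gid in self.versions[version]:
                return version, self.versions[version][gid]
        return None

    def write(self, gid, **attributes):
        """Writes always land in the current version."""
        row = self.versions[self.current].setdefault(gid, {})
        for version, rows in self.versions.items():
            if version != self.current and gid in rows:
                # Carry the old data over, then drop it from the old version.
                for key, value in rows.pop(gid).items():
                    row.setdefault(key, value)
        row.update(attributes)


bucket = VersionedBucket({
    1234: {"name": "Roy"}, 1235: {"name": "Moss"}, 1236: {"name": "Jen"},
    1237: {"name": "Douglas"}, 1238: {"name": "Denholm"},
    1239: {"name": "Richmond"},
})
bucket.write(1241, name="Patricia", gender="f")  # new record goes to v2
bucket.write(1235, gender="m")                   # update migrates Moss
bucket.write(1236, gender="f")                   # update migrates Jen
```

Records that are never updated stay in v1 and remain readable, which is why no bulk migration or downtime is needed for the schema change.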
43. Multi Master writes
• Every cluster (two masters) will contain two shards
  • Data written interleaved
  • HA for both shards
• No warmup needed
  • Both masters active and “warmed up”
• Slaves added (other DC) for HA and backup
(Diagram: SSP, Shard 1, Shard 2)
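One way to read the two-master, two-shard layout is that each master is the preferred write target for one shard and takes over the other shard when its peer fails. The sketch below follows that reading. The master names, the odd/even shard rule and the `write_target` helper are assumptions, not details from the talk.

```python
# Hypothetical names: each master is the preferred writer for one shard.
MASTERS = ["master_a", "master_b"]


def shard_of(gid):
    """Shard 1 or shard 2, chosen by an assumed odd/even rule."""
    return 1 + gid % 2


def write_target(gid, alive):
    """Pick the master that should receive a write for this GID."""
    preferred = MASTERS[shard_of(gid) - 1]
    if preferred in alive:
        return preferred
    # Failover: the surviving master already holds both shards and is
    # already serving traffic, so no warmup is needed.
    survivors = [m for m in MASTERS if m in alive]
    if not survivors:
        raise RuntimeError("no master available for this cluster")
    return survivors[0]
```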
44. Where do we stand now?
• SPAPI is in place
• SSP is (mostly) running in shadow mode
• GID buckets running in production
• Activity feed system first to production
• Satellite DC in early 2013!
47. Thank you!
• Presentation can be found at: http://spil.com/perconalondon2012
• If you wish to contact me: art@spilgames.com
• Don’t forget to rate my talk!