5. Why difficult?
• Launch global…
• Single DB? Cross-region copy data?
• How to scale?
• Latency…
• Availability…
• Agility…
• Cost effective…
6. Average Network Latency
http://bit.ly/superdata-latency, See http://bit.ly/verizon-latency
• N. America: 41.7 ms
• Europe to Asia: 137.9 ms
• Asia Pacific: 97.9 ms
• Trans-Pacific: 103.8 ms
• Trans-Atlantic: 79.6 ms
• Latin America: 133.2 ms
• Europe: 11.6 ms
• Japan: 16.8 ms
11. What should we do?
• Game servers near players
• Why?
1. Low latency
2. Play with friends
3. When traveling…
go sightseeing~ !
https://gigaom.com/wp-content/uploads/sites/1/2012/10/jason-server-hug.jpeg?quality=80&strip=all
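One way a client could find the nearest game server pod is to probe each region and pick the lowest median round-trip time. A minimal sketch: the region names match the pod locations used later in this deck, but the sample values and the `pick_region` helper are illustrative assumptions, not measurements or code from the deck.

```python
from statistics import median


def pick_region(rtt_samples_ms):
    """Return the region whose median round-trip time (ms) is lowest."""
    if not rtt_samples_ms:
        raise ValueError("no latency samples")
    return min(rtt_samples_ms, key=lambda region: median(rtt_samples_ms[region]))


# Hypothetical probe results for a player located in Japan.
samples = {
    "tokyo": [17.0, 16.5, 18.2],
    "oregon": [104.0, 103.1, 110.4],
    "frankfurt": [240.3, 238.8, 251.0],
    "virginia": [160.2, 158.9, 171.5],
}
best = pick_region(samples)
```

Using the median rather than a single probe makes the choice less sensitive to one slow packet.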
12. What should we do?
• Everything else as HTTP APIs
• Why?
1. Login
2. Friends list
3. Leaderboard
4. Inventory
5. IAP
6. etc
13. What should we do?
• Data replication is not a good idea
• Why?
1. Fragile architecture
2. High Latency
3. Replicate all data in regions…?
4. Dependency on other region
14. What should we do?
• Local caches are a good idea.
• Why?
1. Caching hot data
2. No data replication
3. Low latency
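The local-cache idea can be sketched as a cache-aside store with a time-to-live: hot data is served locally, and a miss falls through to the source of truth (for example, an HTTP API in another region). The `LocalCache` class, its TTL default, and the loader function are assumptions for illustration only.

```python
import time


class LocalCache:
    """Cache-aside store with a per-entry time-to-live (TTL)."""

    def __init__(self, loader, ttl_seconds=30.0, clock=time.monotonic):
        self._loader = loader      # called on a miss, e.g. a remote API call
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # fresh hit: no remote round trip
        value = self._loader(key)  # miss or expired: reload from the source
        self._entries[key] = (now + self._ttl, value)
        return value


# Usage with a hypothetical loader that records how often it is called.
calls = []


def load_profile(player_id):
    calls.append(player_id)
    return {"id": player_id, "level": 7}


fake_now = [0.0]
cache = LocalCache(load_profile, ttl_seconds=30.0, clock=lambda: fake_now[0])
cache.get("p1")
cache.get("p1")        # served from the cache
fake_now[0] = 31.0
cache.get("p1")        # entry expired, loader is called again
```

Nothing is replicated between regions here; each region only holds what its own players recently asked for.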
19. Loosely Coupled Pods (Game Server Pods)
[Diagram: pods in Tokyo, Oregon, Frankfurt, and Virginia; links labeled HTTP and TCP / UDP]
20. Game API Pods
[Diagram: VPC subnets across Availability Zones A and B; Auto Scaling group of WEB instances; JOBS instance; Cognito, SNS Mobile Push, SES; links to other pods]
21. Game API Pods
[Diagram: three copies of the Game API pod, each with VPC subnets across Availability Zones A and B, an Auto Scaling group of WEB instances, a JOBS instance, Cognito, SNS Mobile Push, and SES]
22. Game Flow
• Login via HTTP API
• Download Game Assets
• Matchmaking to Game Server
[Diagram: EC2 instances in a Region]
23. Game Flow
• Login via HTTP API
• Download Game Assets
• Matchmaking to Game Server
• Connect to Server
• Hack Apart Your Friends
• Game Over
[Diagram: EC2 instances in a Region]
24. Game Flow
• Login via HTTP API
• Download Game Assets
• Matchmaking to Game Server
• Connect to Server
• Hack Apart Your Friends
• Game Over
• Write via HTTP API
[Diagram: EC2 instances in a Region]
25. Game Server Pods
[Diagram: GAME instances in VPC public and private subnets across Availability Zones A and B]
26. RabbitMQ + Elastic Load Balancing
[Diagram: rabbitmq-node1 (10.1.0.13) and rabbitmq-node2 (10.2.0.16) in VPC private subnets across Availability Zones A and B; TCP 5672 for clients; TCP 4369 & 25672 between nodes]
27. RabbitMQ + Elastic Load Balancing
• Clustering Tutorial
https://www.rabbitmq.com/clustering.html
• Set Queue HA policy to "all"
https://www.rabbitmq.com/ha.html
• Create Internal load balancer
– Listen Port: TCP 5672
• VPC Security Group
– For load balancer: TCP 5672
– Between EC2 nodes: TCP 4369 & 25672
• Configure clients to send heartbeats
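Heartbeats matter behind a load balancer because idle connections can be closed once the balancer's idle timeout expires. A minimal sketch of building an AMQP connection URI that requests a heartbeat interval; many AMQP client libraries accept a `heartbeat` query parameter, but check your client's documentation. The host name, credentials, and the 30-second interval are placeholders, and `amqp_url` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode


def amqp_url(host, user, password, vhost="/", port=5672, heartbeat_seconds=30):
    """Build an AMQP URI pointing at the internal load balancer."""
    query = urlencode({"heartbeat": heartbeat_seconds})
    return "amqp://{}:{}@{}:{}/{}?{}".format(
        quote(user, safe=""),
        quote(password, safe=""),
        host,
        port,                    # 5672 is the listener port from the slide
        quote(vhost, safe=""),   # the default vhost "/" must be percent-encoded
        query,
    )


# Placeholder values; substitute the DNS name of your internal load balancer.
url = amqp_url("rabbit.internal.example", "game", "s3cret")
```

Pick an interval comfortably shorter than the load balancer's idle timeout so the connection never looks idle.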
28. SQS for critical data (purchasing, voucher)
[Diagram: rabbitmq-node1 (10.1.0.13) and rabbitmq-node2 (10.2.0.16) in VPC private subnets across Availability Zones A and B; TCP 5672 for clients; TCP 4369 & 25672 between nodes]
29. Redis Pub / Sub
[Diagram: GAME instances in an Auto Scaling group; VPC public and private subnets across Availability Zones A and B]
30. CloudFormation + Chef
[Diagram: GAME instances in an Auto Scaling group; VPC public and private subnets in Availability Zone A]
31. CloudFormation
• Use CloudFormation to create a template for your complete regional environment
• Templates are in JSON format, which can be put into source control repositories
[Diagram: Region with VPC public and private subnets in Availability Zones A and B; GAME instances; Redis nodes]
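A regional template can be generated and serialized to JSON before it is committed to source control. The sketch below emits only a VPC and its subnets; the logical names, CIDR ranges, and the `regional_template` helper are assumptions for illustration, and a real pod template would also declare instances, security groups, and so on.

```python
import json


def regional_template(vpc_cidr, subnets):
    """Build a minimal CloudFormation template: one VPC plus its subnets.

    `subnets` maps a logical resource name to (cidr, availability_zone).
    """
    resources = {
        "GameVpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": vpc_cidr},
        }
    }
    for name, (cidr, zone) in subnets.items():
        resources[name] = {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                "VpcId": {"Ref": "GameVpc"},   # reference the VPC declared above
                "CidrBlock": cidr,
                "AvailabilityZone": zone,
            },
        }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch of one regional game pod network",
        "Resources": resources,
    }


# Illustrative layout for an Oregon (us-west-2) pod.
template = regional_template(
    "10.1.0.0/16",
    {
        "PublicSubnetA": ("10.1.0.0/24", "us-west-2a"),
        "PrivateSubnetA": ("10.1.1.0/24", "us-west-2a"),
        "PublicSubnetB": ("10.1.2.0/24", "us-west-2b"),
        "PrivateSubnetB": ("10.1.3.0/24", "us-west-2b"),
    },
)
template_json = json.dumps(template, indent=2, sort_keys=True)
```

Sorting the keys keeps diffs stable when the generated file is reviewed in source control.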
33. Server Registration
• Cleanup loop for unhealthy servers
• Launch new game pods via the CloudFormation template
• Health flag
[Diagram: Region with VPC subnets across Availability Zones A and B; Auto Scaling group of WEB instances; JOBS instance; game pods in Oregon and Tokyo]
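The cleanup loop can be sketched as a registry that remembers when each game server last checked in and drops the ones that have gone quiet. The `ServerRegistry` class, the 60-second staleness window, and the field names are assumptions; the deck does not show how the health flag is stored.

```python
import time


class ServerRegistry:
    """Tracks registered game servers and removes ones that stop reporting."""

    def __init__(self, stale_after_seconds=60.0, clock=time.monotonic):
        self._stale_after = stale_after_seconds
        self._clock = clock
        self._servers = {}  # public_ip -> status dict

    def register(self, public_ip, players, game_modes):
        """Record (or refresh) a server's status; acts as its health check-in."""
        self._servers[public_ip] = {
            "players": players,
            "game_modes": list(game_modes),
            "last_seen": self._clock(),
        }

    def cleanup(self):
        """Remove servers that missed the check-in window; return their IPs."""
        cutoff = self._clock() - self._stale_after
        stale = [ip for ip, s in self._servers.items() if s["last_seen"] < cutoff]
        for ip in stale:
            del self._servers[ip]
        return stale

    def healthy_servers(self):
        return sorted(self._servers)


# Usage with a fake clock so the example is deterministic.
fake_now = [0.0]
registry = ServerRegistry(stale_after_seconds=60.0, clock=lambda: fake_now[0])
registry.register("203.0.113.10", players=12, game_modes=["deathmatch"])
registry.register("203.0.113.11", players=3, game_modes=["coop"])
fake_now[0] = 50.0
registry.register("203.0.113.11", players=4, game_modes=["coop"])  # checks in again
fake_now[0] = 90.0
removed = registry.cleanup()  # the first server has been silent for 90 s
```

A JOBS instance would call `cleanup()` on a schedule; the removed IPs are the candidates for replacement with new pods.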
34. Server Registration & Scaling
• HTTPS POST /api/servers/register
• Include an HMAC (RFC 2104) and custom salt for the payload
• Send Server Status
– Public IP
– # Players
– Game Modes
• Matchmaking Service
– Maintains server list
– Removes servers
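Signing the registration payload could look like the sketch below. It assumes HMAC-SHA256, a shared secret known to both the game server and the API, and a salt that is simply prepended to the serialized body; the deck does not specify the hash function or how the salt is combined, so treat those choices and the helper names as illustrative.

```python
import hashlib
import hmac
import json


def sign_payload(payload, secret, salt):
    """Serialize the payload deterministically and compute its HMAC."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hmac.new(secret, salt + body, hashlib.sha256).hexdigest()
    return body, digest


def verify_payload(body, digest, secret, salt):
    """Recompute the HMAC on the API side and compare in constant time."""
    expected = hmac.new(secret, salt + body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, digest)


# Placeholder secret and salt; the status fields follow the slide.
SECRET = b"shared-secret-from-config"
SALT = b"coolgame-v1"
status = {"PublicIp": "203.0.113.10", "Players": 12, "GameModes": ["deathmatch"]}

body, signature = sign_payload(status, SECRET, SALT)
accepted = verify_payload(body, signature, SECRET, SALT)
tampered = verify_payload(body.replace(b"12", b"99"), signature, SECRET, SALT)
```

The server would send `body` in the POST to /api/servers/register with `signature` in a header; the API rejects the request when verification fails.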
35. Example of Web API Call
API Request:
POST https://api.coolgame.com/Client/Matchmake
Content-Type: application/json
X-Authentication: 7BC920BC255F7E60-0-0-5F4
{
  "BuildVersion": "5.01",
  "Region": "USCentral",
  "GameMode": "0",
  "LobbyId": "Lobby 32",
  "EnableQueue": false
}
Response from Web Services:
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
{
  "LobbyID": "Lobby 32",
  "ServerHostname": "192.168.0.1",
  "ServerPort": 7777,
  "Ticket": "e98yf289f248902f4904f0924f9pj",
  "Status": "Waiting",
  "Queue": [
    "User1",
    "User2"
  ]
}
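On the client side, the call above could be wrapped roughly as follows. The request is built but not sent, and the response parser only reads the fields shown in the example; `build_matchmake_request` and `parse_matchmake_response` are hypothetical helpers, and the meaning of `Status` values other than "Waiting" is not covered by the deck.

```python
import json
import urllib.request

API_URL = "https://api.coolgame.com/Client/Matchmake"  # URL from the example


def build_matchmake_request(auth_token, region, game_mode, lobby_id):
    """Build (but do not send) the matchmaking POST request."""
    body = {
        "BuildVersion": "5.01",
        "Region": region,
        "GameMode": game_mode,
        "LobbyId": lobby_id,
        "EnableQueue": False,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Authentication": auth_token,
        },
        method="POST",
    )


def parse_matchmake_response(text):
    """Extract what the client needs to open the game server connection."""
    data = json.loads(text)
    return {
        "address": (data["ServerHostname"], int(data["ServerPort"])),
        "ticket": data["Ticket"],
        "status": data["Status"],
        "queue": list(data.get("Queue", [])),
    }


# The response body from the example, as a single JSON string.
sample = (
    '{"LobbyID": "Lobby 32", "ServerHostname": "192.168.0.1", '
    '"ServerPort": 7777, "Ticket": "e98yf289f248902f4904f0924f9pj", '
    '"Status": "Waiting", "Queue": ["User1", "User2"]}'
)
request = build_matchmake_request("7BC920BC255F7E60-0-0-5F4", "USCentral", "0", "Lobby 32")
match = parse_matchmake_response(sample)
```

The client then connects to `match["address"]` over TCP / UDP and presents `match["ticket"]` to the game server.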
36. Scaling Game Servers
• Monitor Server Capacity
• Launch new ones via Amazon EC2 API or CloudFormation
• Choose Server
• Drain Players
• Terminate
AWS Auto Scaling cannot be used unless game state is centralized
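Because each game server holds live match state, scale-in has to be a deliberate choose, drain, terminate sequence rather than an arbitrary instance kill. A minimal sketch of that sequence; the `GameServer` fields, the "fewest players" selection rule, and the helper names are assumptions, and the actual termination call (EC2 API or CloudFormation) is left out.

```python
from dataclasses import dataclass


@dataclass
class GameServer:
    ip: str
    players: int
    draining: bool = False


def choose_server_to_drain(servers):
    """Mark the least-loaded active server as draining and return it."""
    active = [s for s in servers if not s.draining]
    if len(active) <= 1:
        return None  # always keep one server accepting players
    victim = min(active, key=lambda s: s.players)
    victim.draining = True
    return victim


def matchable(servers):
    """Servers the matchmaking service may still hand out to new players."""
    return [s for s in servers if not s.draining]


def ready_to_terminate(servers):
    """Draining servers whose last player has left."""
    return [s for s in servers if s.draining and s.players == 0]


fleet = [GameServer("10.1.0.21", 14), GameServer("10.1.0.22", 2), GameServer("10.1.0.23", 9)]
victim = choose_server_to_drain(fleet)   # picks 10.1.0.22
before = ready_to_terminate(fleet)       # still has 2 players
victim.players = 0                       # the current match ends
after = ready_to_terminate(fleet)
```

Matchmaking stops sending players to a draining server immediately, so termination never interrupts a running match.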
56. Core Game Backend
• Choose Region
• Elastic Load Balancer
• Two Availability Zones
• EC2 for App
• RDS Database
– Multi-AZ
• S3 for Game Data
– Assets
– UGC
– Analytics
[Diagram: Region with ELB and S3]
57. Scale It Way Out
• Auto Scaling Group
– Capacity on Demand
– Respond to Users
• ElastiCache
– Memcached
– Redis
• CloudFront CDN
– DLC, Assets
– Game Saves
– UGC
[Diagram: Region with CloudFront CDN, ELB, EC2 instances, and S3]
61. Writing Is Painful
• Games are Write Heavy
• Caching of Limited Use
• Key-Value
• Binary Structures
• Database = Bottleneck
[Diagram: Region with CloudFront CDN, ELB, EC2 instances in Availability Zones A and B, and S3]
63. Sharding is tough to operate
Source: http://keithburgun.net/wp-content/uploads/2013/03/1565.1276479-difficulty_doom_super.png-610x0.png
64. DynamoDB
• NoSQL Data Store
• Fully-Managed
• Highly Available
• PUT/GET Keys
• Secondary Indexes
• Provisioned Throughput
• Auto Scaling
[Diagram: CloudFront CDN, ELB, EC2 instances in Availability Zones A and B (Elastic Beanstalk Container), and S3]
65. Leaderboard in DynamoDB
• Hash key = Primary key
• Range key = Sub key
• Other attributes are unstructured, unindexed
• So… How to sort based on Top Score?
66. Leaderboard with Secondary Indexes
• Create a secondary index!
• Set hash key to Game Level
• Set range key to Top Score
• Can now query by Level, sorted by Top Score
• Handles any (sane) gaming use case
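The index described above can be sketched as request parameters in the shape the AWS SDK for Python (boto3) DynamoDB client accepts for `create_table` and `query`. Nothing is sent to AWS here; the table name, index name, attribute names, the choice of UserId / GameLevel as the table key, and the throughput numbers are all assumptions. A small in-memory function mimics what the index query returns.

```python
# Table keyed by player, with a secondary index keyed by level and score.
TABLE = {
    "TableName": "Leaderboard",
    "AttributeDefinitions": [
        {"AttributeName": "UserId", "AttributeType": "S"},
        {"AttributeName": "GameLevel", "AttributeType": "S"},
        {"AttributeName": "TopScore", "AttributeType": "N"},
    ],
    "KeySchema": [
        {"AttributeName": "UserId", "KeyType": "HASH"},
        {"AttributeName": "GameLevel", "KeyType": "RANGE"},
    ],
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "LevelTopScore",
            "KeySchema": [
                {"AttributeName": "GameLevel", "KeyType": "HASH"},
                {"AttributeName": "TopScore", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
            "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
        }
    ],
    "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
}


def top_scores_query(level, limit=10):
    """Parameters for querying the index: one level, highest scores first."""
    return {
        "TableName": "Leaderboard",
        "IndexName": "LevelTopScore",
        "KeyConditionExpression": "GameLevel = :level",
        "ExpressionAttributeValues": {":level": {"S": level}},
        "ScanIndexForward": False,  # descending order on the range key
        "Limit": limit,
    }


def emulate_index_query(items, level, limit=10):
    """In-memory stand-in for the index query, for illustration only."""
    matches = [item for item in items if item["GameLevel"] == level]
    return sorted(matches, key=lambda item: item["TopScore"], reverse=True)[:limit]


items = [
    {"UserId": "alice", "GameLevel": "castle", "TopScore": 900},
    {"UserId": "bob", "GameLevel": "castle", "TopScore": 1500},
    {"UserId": "carol", "GameLevel": "swamp", "TopScore": 2000},
    {"UserId": "dave", "GameLevel": "castle", "TopScore": 1200},
]
top_two = emulate_index_query(items, "castle", limit=2)
```

Because the index's range key is the score, the sort happens inside the database and the client only asks for the first few rows.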
67. Summary
• Decouple Game Servers from APIs
• Create local mini pods
• Create Matchmaking APIs to point user to game pod
• Replicate asynchronously to avoid high latency
• Use managed services
– DynamoDB
– RDS
– S3/CloudFront
– ElastiCache