3. Background
Microblogging site
● user messages (blog)
● cockpit/wall
Classic architecture
● database
● web server(s)
● loadbalancer(s)
4. Background
Web servers, load balancers
● one server
● ...
● 1000 servers
● not a problem
Database
● one database
● two databases (master -> slave)
● two databases (master <-> master)
● n databases (slave(s)<-master<->master->slave(s))
a lot of replication ;)
5. Background
Replication
● increase read performance (raid1)
● increase data safety (raid1)
● does not increase system's capacity (GBs)
6. Background
Scalability
● stateless elements scale well
● stateful elements
○ quite easy to scale
■ if we want more reads (cache, replication)
○ hard to scale
■ if we want more writes
■ if we want more capacity
9. Theory
Scaling
● Scale Back
○ delete or archive unused data
● Scale Up (vertical)
○ more power, more disks
● Scale Out (horizontal)
○ add machines
■ functional partitioning
■ replication
■ sharding
10. Theory
Sharding
● split one big database into many smaller databases
○ spread rows
○ spread them across many servers
● shared-nothing partitioning
● not replication
11. Theory
Sharding key
● shard by a key
● all data with that key will be on the same shard
● e.g. shard by user: all information connected to a user is on
one shard (user info, messages, friends list)
user 1 -> shard 1
user 2 -> shard 2
user 3 -> shard 1
user 4 -> shard 2
● choosing the right key is very important!
12. Theory
Sharding function
● maps keys to shards
● where to find the data
● where to store the data
shard number = sf(key)
13. Theory
Sharding function
● Dynamic
○ Mapping in a database table
● Fixed
○ Modulo
shard number = id % shards_count
○ Hash + Modulo
shard number = md5(email) % shards_count
○ Consistent hashing
http://en.wikipedia.org/wiki/Consistent_hashing
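The dynamic and fixed variants above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function names, the SHARDS_COUNT value and the sample lookup table are assumptions made for the example.

```python
import hashlib

SHARDS_COUNT = 2  # assumed number of shards for this example


def shard_by_modulo(numeric_id: int, shards_count: int = SHARDS_COUNT) -> int:
    """Fixed: shard number = id % shards_count."""
    return numeric_id % shards_count


def shard_by_hash(key: str, shards_count: int = SHARDS_COUNT) -> int:
    """Fixed: shard number = md5(key) % shards_count, for non-numeric keys."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shards_count


# Dynamic: the mapping is stored explicitly. A dict stands in for the
# database table here, and the entries are made up.
SHARD_MAP = {"john@example.com": 0, "bob@example.com": 1}


def shard_by_lookup(key: str, mapping: dict = SHARD_MAP) -> int:
    """Dynamic: look the shard number up instead of computing it."""
    return mapping[key]
```

The fixed functions need no extra storage but tie every key to shards_count; the lookup table costs an extra read but lets a single key be moved on its own.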
14. Theory
Advantages
● Linear write/read performance scalability (raid0)
● Capacity increase (raid0)
● Smaller databases are easier to manage
○ alter
○ backup/restore
○ truncate ;)
● Smaller databases are faster
○ as they may fit into memory
● Cost effective
○ 80core, 20 HD, 80GB RAM vs
○ 10 x (8core, 2HD, 8GB RAM)
15. Theory
Challenges
● Globally unique IDs
○ unique across all shards
■ auto_increment_increment, auto_increment_offset
■ global IDs table
○ not unique across shards
■ IDs in dbs - not unique
■ shard_number - unique
■ global unique ID = shard_number + db ID
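Both ID strategies can be sketched in Python. The first function only mimics what the auto_increment_increment / auto_increment_offset settings produce; the second packs shard_number and the local database ID into one integer. The 48-bit width and all function names are assumptions chosen for the illustration.

```python
import itertools

LOCAL_ID_BITS = 48  # assumed width reserved for the per-shard ID


def auto_increment_ids(offset: int, increment: int):
    """Yield offset, offset + increment, ... like a shard configured with
    auto_increment_offset=offset and auto_increment_increment=increment."""
    return itertools.count(offset, increment)


def make_global_id(shard_number: int, local_id: int) -> int:
    """Combine a shard number and a shard-local ID into one unique integer."""
    if not 0 <= local_id < (1 << LOCAL_ID_BITS):
        raise ValueError("local_id does not fit into LOCAL_ID_BITS")
    return (shard_number << LOCAL_ID_BITS) | local_id


def split_global_id(global_id: int) -> tuple:
    """Recover (shard_number, local_id) from a combined ID."""
    return global_id >> LOCAL_ID_BITS, global_id & ((1 << LOCAL_ID_BITS) - 1)
```

With two shards, offsets 1 and 2 and an increment of 2 give 1, 3, 5, ... on one shard and 2, 4, 6, ... on the other. With the combined ID, the shard can be read straight from the ID, so no sharding function call is needed to locate the row.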
16. Challenges
Re-sharding
3 shards (id % 3): 1,4,7 | 2,5,8 | 3,6,9
5 shards (id % 5): 1,6 | 2,7 | 3,8 | 4,9 | 5
● consistent hashing
or
● more shards than machines/nodes
(e.g. 100 shards on 10 machines)
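A small Python sketch shows why plain modulo makes re-sharding painful and what consistent hashing changes. The ring below is a minimal version written for this illustration (md5 as the hash, 100 points per node, hypothetical node names); it is not taken from any particular library.

```python
import bisect
import hashlib


def moved_by_modulo(keys, old_count: int, new_count: int) -> list:
    """Keys whose shard number changes when the shard count changes."""
    return [k for k in keys if k % old_count != k % new_count]


def _point(value: str) -> int:
    """Position of a value on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring with several points per node."""

    def __init__(self, nodes, points_per_node: int = 100):
        self._ring = sorted(
            (_point(f"{node}#{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key) -> str:
        # First ring point clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self._points, _point(str(key))) % len(self._ring)
        return self._ring[index][1]
```

For the IDs 1..9 in the diagram, moved_by_modulo(range(1, 10), 3, 5) returns [3, 4, 5, 6, 7, 8, 9]: seven of nine rows change shards. With the ring, adding a node only moves keys onto the new node; every other key stays where it was.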
17. Challenges
Cross-shard
● queries
○ sent to many shards
○ collect result from one
○ avoidable (better sharding key, more sharding keys)
● joins
○ send query to many shards
○ join results in an application
○ sometimes unavoidable
18. Challenges
Network
● more machines, more smaller streams
● full-mesh between webservers and shards
● persistent connections (pconnect) vs. per-request connections (connect)
Complexity
● usually sharding is done in application logic
20. Practice
Microblogging site
● see a user's messages
● see stream/wall
Classic architecture
● database
● web server(s)
● loadbalancer(s)
21. Practice
Data
User
id  login
1   John
2   Bob
3   Andy
4   Claire
5   Megan
Message
id  owner  message
1   2      M1
2   1      M2
3   2      M3
4   3      M4
5   2      M5
Follow
who  whose
1    2
3    4
3    2
1    3
5    2
2    1
1    5
4    3
4    1
John's messages?
John's follows?
22. Practice
User
● no need for sharding
Message
● sharded by user (owner field)
● shard_number = owner % 2
Follow
● sharded by user (who field)
● shard_number = who % 2
2 shards, 3 machines
● User machine: User
● shard0: Message, Follow
● shard1: Message, Follow
23. Practice
User (not sharded)
id  login
1   John
2   Bob
3   Andy
4   Claire
5   Megan
shard0
Message
id  owner  message
1   2      M1
3   2      M3
5   2      M5
Follow
who  whose
2    1
4    3
4    1
shard1
Message
id  owner  message
2   1      M2
4   3      M4
Follow
who  whose
1    2
3    4
3    2
1    3
5    2
1    5
mapping?
24. Practice
Bob's blog
● Bob's messages
○ find Bob's id in User table (id = 2)
○ find Bob's shard (2%2 = 0, shard0)
○ fetch Messages (shard0) where owner = 2
● People Bob follows
○ find Bob's id in User table (id = 2)
○ find Bob's shard (2%2 = 0, shard0)
○ fetch whose id from Follow table (shard0)
○ fetch people info from User table
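The lookup steps above can be traced with a small in-memory model in Python. Lists and dicts stand in for the tables, the rows are the ones from the example data, and the function names are made up for this sketch.

```python
# User table: not sharded.
USERS = {1: "John", 2: "Bob", 3: "Andy", 4: "Claire", 5: "Megan"}

# Message rows are (id, owner, message); Follow rows are (who, whose).
SHARDS = [
    {  # shard0: owner % 2 == 0, who % 2 == 0
        "message": [(1, 2, "M1"), (3, 2, "M3"), (5, 2, "M5")],
        "follow": [(2, 1), (4, 3), (4, 1)],
    },
    {  # shard1: owner % 2 == 1, who % 2 == 1
        "message": [(2, 1, "M2"), (4, 3, "M4")],
        "follow": [(1, 2), (3, 4), (3, 2), (1, 3), (5, 2), (1, 5)],
    },
]


def user_id(login: str) -> int:
    """Step 1: find the user's id in the User table."""
    return next(uid for uid, name in USERS.items() if name == login)


def shard_for(uid: int) -> dict:
    """Step 2: find the user's shard (id % number of shards)."""
    return SHARDS[uid % len(SHARDS)]


def messages_of(login: str) -> list:
    """Step 3: fetch messages from that single shard."""
    uid = user_id(login)
    return [text for _id, owner, text in shard_for(uid)["message"] if owner == uid]


def followed_by(login: str) -> list:
    """Fetch whose ids from one shard, then resolve them in the User table."""
    uid = user_id(login)
    whose_ids = [whose for who, whose in shard_for(uid)["follow"] if who == uid]
    return [USERS[w] for w in whose_ids]
```

Here messages_of("Bob") returns ["M1", "M3", "M5"] and followed_by("Bob") returns ["John"]; both questions touch only shard0 plus the unsharded User table.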
25. Practice
User (not sharded)
id  login
1   John
2   Bob
3   Andy
4   Claire
5   Megan
shard0
Message
id  owner  message
1   2      M1
3   2      M3
5   2      M5
Follow
who  whose
2    1
4    3
4    1
shard1
Message
id  owner  message
2   1      M2
4   3      M4
Follow
who  whose
1    2
3    4
3    2
1    3
5    2
1    5
26. Practice
Who follows Andy?
● find Andy's id in User table (id=3)
● find Andy's shard (3%2 = 1, shard1)
● hmmm
27. Practice
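User (not sharded)
id  login
1   John
2   Bob
3   Andy
4   Claire
5   Megan
shard0
Message
id  owner  message
1   2      M1
3   2      M3
5   2      M5
Follow
who  whose
2    1
4    3
4    1
shard1
Message
id  owner  message
2   1      M2
4   3      M4
Follow
who  whose
1    2
3    4
3    2
1    3
5    2
1    5
Cross-shard query!
Andy's followers (whose = 3) are on both shards: (4, 3) on shard0 and (1, 3) on shard1.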
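Because Follow is sharded by who, the question "who follows Andy?" has to be sent to every shard and the partial results merged in the application. A minimal Python sketch of that scatter-gather step, using the same example rows (the structure and names are assumptions for illustration):

```python
USERS = {1: "John", 2: "Bob", 3: "Andy", 4: "Claire", 5: "Megan"}

# Follow rows (who, whose), sharded by who % 2.
FOLLOW_SHARDS = [
    [(2, 1), (4, 3), (4, 1)],                          # shard0
    [(1, 2), (3, 4), (3, 2), (1, 3), (5, 2), (1, 5)],  # shard1
]


def followers_of(login: str) -> list:
    """Ask every shard for rows with whose == user id, then merge."""
    uid = next(u for u, name in USERS.items() if name == login)
    follower_ids = []
    for shard in FOLLOW_SHARDS:  # one query per shard
        follower_ids.extend(who for who, whose in shard if whose == uid)
    return sorted(USERS[w] for w in follower_ids)
```

followers_of("Andy") returns ["Claire", "John"]: one follower comes from shard0 and one from shard1, so neither shard alone can answer the question.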
28. Practice
User (not sharded)
id  login
1   John
2   Bob
3   Andy
4   Claire
5   Megan
shard0
Message
id  owner  message
1   2      M1
3   2      M3
5   2      M5
Follow
who  whose
2    1
4    3
4    1
shard1
Message
id  owner  message
2   1      M2
4   3      M4
Follow
who  whose
1    2
3    4
3    2
1    3
5    2
1    5
Ideas?
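One possible answer, following the earlier hint about "more sharding keys", is to keep a second copy of the Follow table sharded by whose. The Python sketch below is an assumption about how that could look, not something the slides spell out; the names follow_by_who and follow_by_whose are hypothetical.

```python
SHARDS_COUNT = 2

# All Follow rows (who, whose) from the example data.
FOLLOWS = [(1, 2), (3, 4), (3, 2), (1, 3), (5, 2), (2, 1), (1, 5), (4, 3), (4, 1)]


def build_shards(rows, key_index: int) -> list:
    """Distribute rows over shards using the column at key_index as the key."""
    shards = [[] for _ in range(SHARDS_COUNT)]
    for row in rows:
        shards[row[key_index] % SHARDS_COUNT].append(row)
    return shards


follow_by_who = build_shards(FOLLOWS, 0)    # answers "whom does X follow?"
follow_by_whose = build_shards(FOLLOWS, 1)  # answers "who follows X?"


def followed_by(uid: int) -> list:
    """Single-shard lookup in the copy sharded by who."""
    shard = follow_by_who[uid % SHARDS_COUNT]
    return [whose for who, whose in shard if who == uid]


def followers_of(uid: int) -> list:
    """Single-shard lookup in the copy sharded by whose."""
    shard = follow_by_whose[uid % SHARDS_COUNT]
    return [who for who, whose in shard if whose == uid]
```

followers_of(3) now returns [1, 4] (John and Claire) from one shard. The price is that every follow is written twice, possibly to two different shards, so the two copies have to be kept in sync.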
30. Summary
To shard or not to shard
● many reads, few writes? - don't
● many writes and no capacity problems? - don't (use SSD)
● capacity problems? - yes
● many writes and capacity problems? - yes
● scale-up is affordable? - don't shard
As you see... it depends!
31. Summary
If you have to shard
● always use sharding + replication = raid10
○ sharding alone reduces availability (like raid0)
● more shards than you need
○ e.g. 4 machines, 100 shards
○ or dynamic allocation
● think of network capacity (full-mesh)
○ load sharding (google it ;))
● sharding key - important!
○ cross-shard queries
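The "more shards than you need" advice can be sketched in Python as a fixed number of logical shards plus a small placement table. The machine names (db1..db5) and the choice of which shards to move are assumptions made up for this example.

```python
LOGICAL_SHARDS = 100  # fixed once; never changes when machines are added


def logical_shard(key: int) -> int:
    """The sharding function depends only on the logical shard count."""
    return key % LOGICAL_SHARDS


def initial_placement(machines: list) -> dict:
    """Spread logical shards over the machines round-robin."""
    return {s: machines[s % len(machines)] for s in range(LOGICAL_SHARDS)}


def move_shards(placement: dict, shards, target: str) -> dict:
    """Return a new placement with the given logical shards on target."""
    updated = dict(placement)
    for shard in shards:
        updated[shard] = target
    return updated


placement = initial_placement(["db1", "db2", "db3", "db4"])
# Later a fifth machine arrives: hand it every fifth logical shard.
rebalanced = move_shards(placement, range(0, LOGICAL_SHARDS, 5), "db5")
```

Growing from 4 to 5 machines copies 20 whole logical shards and updates the placement table; the key-to-shard mapping is untouched, so no row has to be re-hashed.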