Data sharding

Michał Gruchała
michal@gruchala.info
WebClusters 2011

TODO

● Background
● Theory
● Practice
● Summary

Background

Microblogging site
● user messages (blog)
● cockpit/wall

Classic architecture
● database
● web server(s)
● loadbalancer(s)

Background

Web servers, load balancers
● one server
● ...
● 1000 servers
● not a problem

Database
● one database
● two databases (master -> slave)
● two databases (master <-> master)
● n databases (slave(s)<-master<->master->slave(s))

a lot of replication ;)

Background

Replication
● increase read performance (raid1)
● increase data safety (raid1)
● does not increase system's capacity (GBs)

Background

Scalability

● stateless elements scale well

● stateful elements
○ quite easy to scale
■ if we want more reads (cache, replication)
○ hard to scale
■ if we want more writes
■ if we want more capacity

Background

Sharding ;)

ABCD -> AB | CD
EFGH -> EF | GH
IJKL -> IJ | KL

Theory

Scaling
● Scale Back
○ delete or archive unused data
● Scale Up (vertical)
○ more power, more disks
● Scale Out (horizontal)
○ add machines
■ functional partitioning
■ replication
■ sharding

Theory

Sharding
● split one big database into many smaller databases
○ spread rows
○ spread them across many servers
● shared-nothing partitioning
● not replication

Theory

Sharding key

● shard by a key
● all data with that key will be on the same shard
● e.g. shard by user: all information connected to a user is on
one shard (user info, messages, friends list)

user 1 -> shard 1
user 2 -> shard 2
user 3 -> shard 1
user 4 -> shard 2

● choosing the right key is very important!

Theory

Sharding function

● maps keys to shards
● where to find the data
● where to store the data

shard number = sf(key)

Theory

Sharding function

● Dynamic
○ Mapping in a database table

● Fixed
○ Modulo
shard number = id % shards_count
○ Hash + Modulo
shard number = md5(email) % shards_count
○ Consistent hashing
http://en.wikipedia.org/wiki/Consistent_hashing
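
The dynamic and fixed variants above could be sketched as follows. This is only an illustration: SHARDS_COUNT, the function names, and the contents of shard_map are assumptions, and a plain dict stands in for the mapping table that would live in a database.

```python
import hashlib

SHARDS_COUNT = 4  # assumed number of shards


def shard_by_modulo(user_id: int) -> int:
    """Fixed function: shard number = id % shards_count."""
    return user_id % SHARDS_COUNT


def shard_by_hash(email: str) -> int:
    """Fixed function: shard number = md5(email) % shards_count."""
    digest = hashlib.md5(email.encode("utf-8")).hexdigest()
    return int(digest, 16) % SHARDS_COUNT


# Dynamic function: an explicit key -> shard mapping.
# The dict is a stand-in for a mapping table in a database.
shard_map = {1: 0, 2: 3, 3: 0}


def shard_by_lookup(user_id: int) -> int:
    """Dynamic function: look the shard up instead of computing it."""
    return shard_map[user_id]


print(shard_by_modulo(10))               # 10 % 4 = 2
print(shard_by_hash("bob@example.com"))  # some value in 0..3
print(shard_by_lookup(2))                # 3, as stored in shard_map
```

The fixed functions need no extra lookup but tie every key to shards_count; the dynamic mapping costs one lookup per request but lets a single key be moved by updating one entry.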

Theory

Advantages

● Linear write/read performance scalability (raid0)
● Capacity increase (raid0)
● Smaller databases are easier to manage
○ alter
○ backup/restore
○ truncate ;)
● Smaller databases are faster
○ they may fit into memory
● Cost effective
○ 1 x (80 cores, 20 HDs, 80 GB RAM) vs.
○ 10 x (8 cores, 2 HDs, 8 GB RAM)

Theory

Challenges

● Globally unique IDs
○ unique across all shards
■ auto_increment_increment, auto_increment_offset
■ global IDs table
○ not unique across shards
■ IDs in dbs - not unique
■ shard_number - unique
■ global unique ID = shard_number + db ID
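
Both approaches can be illustrated with a small sketch. The first part only simulates the ID sequences that offset/increment settings produce; the second shows one possible way to combine shard_number and the per-database ID into a single integer. The 48-bit width and the function names are assumptions, not something the slides prescribe.

```python
from itertools import count, islice


def offset_ids(offset: int, increment: int):
    """Simulate IDs generated with a given auto-increment offset and step."""
    return count(start=offset, step=increment)


# Two shards: step 2, offsets 1 and 2 -> the sequences never collide.
shard_a_ids = list(islice(offset_ids(1, 2), 5))  # [1, 3, 5, 7, 9]
shard_b_ids = list(islice(offset_ids(2, 2), 5))  # [2, 4, 6, 8, 10]

LOCAL_ID_BITS = 48  # assumed number of bits reserved for the per-shard ID


def make_global_id(shard_number: int, local_id: int) -> int:
    """Pack shard_number into the high bits and the local ID into the low bits."""
    if not 0 <= local_id < (1 << LOCAL_ID_BITS):
        raise ValueError("local_id does not fit into LOCAL_ID_BITS")
    return (shard_number << LOCAL_ID_BITS) | local_id


def split_global_id(global_id: int):
    """Recover (shard_number, local_id) from a packed global ID."""
    return global_id >> LOCAL_ID_BITS, global_id & ((1 << LOCAL_ID_BITS) - 1)


gid = make_global_id(3, 42)
print(split_global_id(gid))  # (3, 42)
```

A packed ID has the side benefit that the shard can be read from the ID itself, without consulting the sharding function.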

Challenges

Re-sharding

3 shards: 1,4,7 | 2,5,8 | 3,6,9

5 shards: 1,6 | 2,7 | 3,8 | 4,9 | 5

● consistent hashing
or
● more shards than machines/nodes
(e.g. 100 shards on 10 machines)
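
To see why re-sharding hurts with a plain modulo function, and what consistent hashing changes, here is a minimal sketch. The HashRing class, the shard names, and the choice of 100 ring points per shard are illustrative assumptions, not a production implementation.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring with several points per node."""

    def __init__(self, nodes, points_per_node=100):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key) -> str:
        # First ring point clockwise from the key's hash (wrapping around).
        idx = bisect.bisect(self._points, _hash(str(key))) % len(self._ring)
        return self._ring[idx][1]


keys = range(1, 10001)

# Modulo: going from 3 to 5 shards relocates every key with k % 3 != k % 5.
moved_modulo = sum(1 for k in keys if k % 3 != k % 5)  # 8000 of 10000 keys

# Consistent hashing: only keys claimed by the two new shards relocate.
old_ring = HashRing([f"shard{i}" for i in range(3)])
new_ring = HashRing([f"shard{i}" for i in range(5)])
moved_ring = sum(1 for k in keys if old_ring.node_for(k) != new_ring.node_for(k))

print(moved_modulo, moved_ring)
```

With the ring, a key either stays where it was or moves to one of the newly added shards, so existing shards never exchange data among themselves.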

Challenges

Cross-shard

● queries
○ sent to many shards
○ collect result from one
○ avoidable (better sharding key, more sharding keys)
● joins
○ send query to many shards
○ join results in an application
○ sometimes unavoidable
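
A cross-shard query with an application-side join might look like the sketch below. The in-memory lists stand in for real shards, and query_shard and latest_messages are hypothetical helper names; a real system would send a SQL query to each shard instead.

```python
import heapq
from itertools import islice

# Hypothetical stand-ins: two message shards and an unsharded user table.
shards = [
    [(1, 2, "M1"), (3, 2, "M3"), (5, 2, "M5")],  # shard0: (id, owner, message)
    [(2, 1, "M2"), (4, 3, "M4")],                # shard1
]
users = {1: "John", 2: "Bob", 3: "Andy"}


def query_shard(shard, limit):
    """Per-shard query: newest messages first, at most `limit` rows."""
    return sorted(shard, key=lambda row: row[0], reverse=True)[:limit]


def latest_messages(limit):
    # 1. Send the same query to every shard.
    partials = [query_shard(shard, limit) for shard in shards]
    # 2. Merge the already-sorted partial results in the application.
    merged = heapq.merge(*partials, key=lambda row: row[0], reverse=True)
    top = islice(merged, limit)
    # 3. Join with the user table in the application.
    return [(msg_id, users[owner], text) for msg_id, owner, text in top]


print(latest_messages(3))
# [(5, 'Bob', 'M5'), (4, 'Andy', 'M4'), (3, 'Bob', 'M3')]
```

Each shard has to return up to `limit` rows because the application cannot know in advance which shard holds the newest ones.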

Challenges

Network

● more machines, more smaller streams
● full-mesh between webservers and shards
● pconnect vs. connect

Complexity

● usually sharding is done in application logic

Practice

Microblogging site
● see a user's messages
● see stream/wall

Classic architecture
● database
● web server(s)
● loadbalancer(s)

Practice

Data

User
id  login
1   John
2   Bob
3   Andy
4   Claire
5   Megan

Message
id  owner  message
1   2      M1
2   1      M2
3   2      M3
4   3      M4
5   2      M5

Follow
who  whose
1    2
3    4
3    2
1    3
5    2
2    1
1    5
4    3
4    1

John's messages?
John's follows?

Practice

User
● no need for sharding

Message
● sharded by user (owner field)
● shard_number = owner % 2

Follow
● sharded by user (who field)
● shard_number = who % 2

2 shards, 3 machines
● User (not sharded)
● shard0: Message, Follow
● shard1: Message, Follow
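
Applying these two rules to the sample rows can be sketched as follows. The split helper is a hypothetical name, and the rows are the ones shown in the sharded tables of the slides.

```python
SHARDS_COUNT = 2

# Sample rows: Message is (id, owner, message), Follow is (who, whose).
messages = [(1, 2, "M1"), (2, 1, "M2"), (3, 2, "M3"), (4, 3, "M4"), (5, 2, "M5")]
follows = [(1, 2), (3, 2), (1, 3), (5, 2), (2, 1), (1, 5), (4, 3), (4, 1)]


def split(rows, key_index):
    """Distribute rows over shards by rows[key_index] % SHARDS_COUNT."""
    shards = {n: [] for n in range(SHARDS_COUNT)}
    for row in rows:
        shards[row[key_index] % SHARDS_COUNT].append(row)
    return shards


message_shards = split(messages, key_index=1)  # shard by owner
follow_shards = split(follows, key_index=0)    # shard by who

print(message_shards[0])  # [(1, 2, 'M1'), (3, 2, 'M3'), (5, 2, 'M5')]
print(follow_shards[0])   # [(2, 1), (4, 3), (4, 1)]
```

Because both tables are sharded by the same user ID, a user's messages and the list of people that user follows always end up on the same shard.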

Practice

User (not sharded)
id  login
1   John
2   Bob
3   Andy
4   Claire
5   Megan

shard0

Message
id  owner  message
1   2      M1
3   2      M3
5   2      M5

Follow
who  whose
2    1
4    3
4    1

shard1

Message
id  owner  message
2   1      M2
4   3      M4

Follow
who  whose
1    2
3    2
1    3
5    2
1    5

mapping?

Practice

Bob's blog

● Bob's messages
○ find Bob's id in User table (id = 2)
○ find Bob's shard (2%2 = 0, shard0)
○ fetch Messages (shard0) where owner = 2

● People Bob follows
○ find Bob's id in User table (id = 2)
○ find Bob's shard (2%2 = 0, shard0)
○ fetch whose id from Follow table (shard0)
○ fetch people info from User table
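
The two lookups above can be traced in a small sketch. Dicts and lists stand in for the User table and the two shards, and the helper names are illustrative assumptions.

```python
# Unsharded User table and the two shards from the slides, as plain Python data.
users = {1: "John", 2: "Bob", 3: "Andy", 4: "Claire", 5: "Megan"}
shards = {
    0: {"message": [(1, 2, "M1"), (3, 2, "M3"), (5, 2, "M5")],
        "follow": [(2, 1), (4, 3), (4, 1)]},
    1: {"message": [(2, 1, "M2"), (4, 3, "M4")],
        "follow": [(1, 2), (3, 2), (1, 3), (5, 2), (1, 5)]},
}


def user_id(login):
    """Find the user's id in the User table."""
    return next(uid for uid, name in users.items() if name == login)


def shard_for(uid):
    """Find the user's shard: id % 2."""
    return shards[uid % len(shards)]


def messages_of(login):
    uid = user_id(login)
    return [text for _, owner, text in shard_for(uid)["message"] if owner == uid]


def followed_by(login):
    uid = user_id(login)
    whose_ids = [whose for who, whose in shard_for(uid)["follow"] if who == uid]
    return [users[w] for w in whose_ids]  # people info comes from the User table


print(messages_of("Bob"))  # ['M1', 'M3', 'M5']
print(followed_by("Bob"))  # ['John']
```

Both questions touch exactly one shard plus the unsharded User table, which is what sharding by user was meant to achieve.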

Practice

Who follows Andy ?

● find Andy's id in User table (id=3)
● find Andy's shard (3%2 = 1, shard1)
● hmmm

Practice

Cross-shard query!

Follow rows with whose = 3 (Andy) live on both shards:

shard0: who 4, whose 3
shard1: who 1, whose 3
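
Answering "Who follows Andy?" with the Follow table sharded by who means asking every shard. A minimal scatter-gather sketch, assuming in-memory stand-ins for the shards and hypothetical helper names:

```python
from concurrent.futures import ThreadPoolExecutor

users = {1: "John", 2: "Bob", 3: "Andy", 4: "Claire", 5: "Megan"}

# Follow rows (who, whose), sharded by who % 2 as in the slides.
follow_shards = {
    0: [(2, 1), (4, 3), (4, 1)],
    1: [(1, 2), (3, 2), (1, 3), (5, 2), (1, 5)],
}


def followers_on_shard(shard_number, whose_id):
    """The per-shard part of the query: who follows `whose_id` here?"""
    return [who for who, whose in follow_shards[shard_number] if whose == whose_id]


def followers_of(whose_id):
    # Scatter: ask every shard, since followers are stored on their own shards.
    with ThreadPoolExecutor() as pool:
        partials = list(
            pool.map(lambda n: followers_on_shard(n, whose_id), sorted(follow_shards))
        )
    # Gather: combine partial results and resolve logins from the User table.
    return sorted(users[who] for part in partials for who in part)


print(followers_of(3))  # ['Claire', 'John']
```

The cost of this query grows with the number of shards, not with the number of followers, which is why the choice of sharding key matters so much.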

Practice

Ideas?

Summary

To shard or not to shard

● many reads, few writes? - don't
● many writes and no capacity problems? - don't (use SSD)
● capacity problems? - yes
● many writes and capacity problems? - yes
● scale-up is affordable? - don't shard

As you see... it depends!

Summary

If you have to shard

● always use sharding + replication = raid10
○ sharding alone reduces availability (like raid0)
● more shards than you need
○ e.g. 4 machines, 100 shards
○ or dynamic allocation
● think of network capacity (full-mesh)
○ load sharding (google it ;))
● sharding key - important!
○ cross-shard queries
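
The "more shards than you need" advice could be sketched like this: the sharding function always maps to 100 logical shards, and a separate placement table says which machine hosts each shard. The machine names and the add_machine helper are assumptions made for illustration.

```python
from collections import Counter

SHARDS_COUNT = 100  # fixed; the sharding function never has to change


def shard_for(user_id):
    return user_id % SHARDS_COUNT


# Placement of logical shards on 4 machines (hypothetical names), 25 each.
machines = ["db1", "db2", "db3", "db4"]
placement = {shard: machines[shard % len(machines)] for shard in range(SHARDS_COUNT)}


def machine_for(user_id):
    return placement[shard_for(user_id)]


def add_machine(placement, new_machine):
    """Move whole shards to new_machine until it holds a fair share."""
    target = SHARDS_COUNT // (len(set(placement.values())) + 1)
    load = Counter(placement.values())
    moved = []
    for shard in sorted(placement):
        if len(moved) == target:
            break
        donor = placement[shard]
        if load[donor] > target:
            placement[shard] = new_machine
            load[donor] -= 1
            moved.append(shard)
    return moved


moved = add_machine(placement, "db5")
print(len(moved))                   # 20 shards moved
print(Counter(placement.values()))  # every machine now hosts 20 shards
```

Growing the cluster then means copying whole shards and updating the placement table; no key changes its shard number.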
