3. Common Setup
• A common setup is 3 shards with 3 servers per shard: 3 masters, 6 slaves
• Can add sharding later to an existing replica set with no downtime
• Can have sharded and non-sharded collections
4. Range Based
MIN  MAX  LOCATION
A    F    shard1
F    M    shard1
M    R    shard2
R    Z    shard3
• Collection is broken into chunks by shard key range
• Chunks default to 64 MB or 100,000 objects
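The range table above can be read as a lookup: given a shard key value, find the chunk whose range contains it. A minimal sketch in plain JavaScript, assuming ranges are min-inclusive and max-exclusive; `chunks` and `findShard` are hypothetical names for illustration, not MongoDB APIs.

```javascript
// Hypothetical chunk table mirroring the slide: [min, max) -> shard.
const chunks = [
  { min: "A", max: "F", shard: "shard1" },
  { min: "F", max: "M", shard: "shard1" },
  { min: "M", max: "R", shard: "shard2" },
  { min: "R", max: "Z", shard: "shard3" },
];

// Return the shard owning `key`, or null if no chunk covers it.
// Assumption: each range includes its MIN and excludes its MAX.
function findShard(table, key) {
  const chunk = table.find((c) => key >= c.min && key < c.max);
  return chunk ? chunk.shard : null;
}

console.log(findShard(chunks, "Harry")); // shard1
console.log(findShard(chunks, "M"));     // shard2
```

Because the ranges do not overlap, at most one chunk matches any key.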
5. Config Servers
• 3 of them
• Changes are made with two-phase commit
• If any are down, metadata goes read-only
• System is online as long as 1 of the 3 is up
6. mongos
• Sharding Router
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on the app server, so no extra network traffic
• Caches metadata from config servers
8. Queries
• By shard key: routed
• Sorted by shard key: routed in order
• By non shard key: scatter gather
• Sorted by non shard key: distributed merge sort
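The last case, a sort on a non shard key, can be sketched as follows: every shard sorts its own matches, and the router merges the sorted partial results. This is an illustration in plain JavaScript under assumed data; `shards`, `queryShard`, and `mergeSorted` are hypothetical names, not mongos internals.

```javascript
// Hypothetical per-shard documents; "age" is not the shard key.
const shards = {
  shard1: [{ name: "Al", age: 41 }, { name: "Eve", age: 29 }],
  shard2: [{ name: "Mo", age: 35 }],
  shard3: [{ name: "Sue", age: 23 }, { name: "Tom", age: 52 }],
};

// Each shard sorts its own results locally (the "scatter" half).
function queryShard(docs, field) {
  return [...docs].sort((a, b) => a[field] - b[field]);
}

// Merge already-sorted lists by repeatedly taking the smallest head element.
function mergeSorted(lists, field) {
  const cursors = lists.map(() => 0);
  const out = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < lists.length; i++) {
      if (cursors[i] >= lists[i].length) continue; // this list is exhausted
      if (best === -1 ||
          lists[i][cursors[i]][field] < lists[best][cursors[best]][field]) {
        best = i;
      }
    }
    if (best === -1) return out; // every list is exhausted
    out.push(lists[best][cursors[best]++]);
  }
}

const partials = Object.values(shards).map((docs) => queryShard(docs, "age"));
const merged = mergeSorted(partials, "age");
console.log(merged.map((d) => d.name).join(",")); // Sue,Eve,Mo,Al,Tom
```

The router never re-sorts the full result set; it only compares the head of each shard's stream.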
9. Splitting
• Take a chunk and split it in 2
• Splits on the median value
• Splits only change metadata; no data moves
10. Splitting
T1
MIN  MAX  LOCATION
A    Z    shard1
T2
MIN  MAX  LOCATION
A    G    shard1
G    Z    shard1
T3
MIN  MAX  LOCATION
A    D    shard1
D    G    shard1
G    S    shard1
S    Z    shard1
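The T1 to T2 step can be sketched as a pure metadata operation: pick the median shard key value in the chunk and replace one table row with two. A minimal illustration in plain JavaScript; `splitChunk` and the sample keys are hypothetical, and the real split logic also has to handle cases such as a median equal to the chunk's MIN.

```javascript
// Split the chunk at `index` in two at the median of the shard key values it holds.
// Only the metadata table changes; both halves stay on the same shard.
function splitChunk(table, index, keysInChunk) {
  const sorted = [...keysInChunk].sort();
  const median = sorted[Math.floor(sorted.length / 2)];
  const { min, max, shard } = table[index];
  const next = [...table];
  next.splice(index, 1,
    { min, max: median, shard },
    { min: median, max, shard });
  return next;
}

// Hypothetical key sample for the single chunk at T1.
const t1 = [{ min: "A", max: "Z", shard: "shard1" }];
const t2 = splitChunk(t1, 0, ["B", "D", "G", "K", "S"]);
console.log(t2.map((c) => `${c.min}-${c.max}@${c.shard}`).join(" ")); // A-G@shard1 G-Z@shard1
```

The function returns a new table and leaves `t1` untouched, which mirrors the point that a split rewrites metadata only.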
11. Balancing
• Moves chunks from one shard to another
• Done online while system is running
• Balancing runs in the background
12. Migrating
T3
MIN  MAX  LOCATION
A    D    shard1
D    G    shard1
G    S    shard1
S    Z    shard1
T4
MIN  MAX  LOCATION
A    D    shard1
D    G    shard1
G    S    shard1
S    Z    shard2
T5
MIN  MAX  LOCATION
A    D    shard1
D    G    shard1
G    S    shard2
S    Z    shard2
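The T3 to T5 sequence can be reproduced with a simple rule: in each round, move one chunk from the shard with the most chunks to the shard with the fewest. This is a sketch in plain JavaScript under that assumed policy; `balanceOnce` and `countByShard` are hypothetical names, and the real balancer's thresholds and chunk selection are not described in the slides.

```javascript
// Count chunks per shard, including shards that currently hold none.
function countByShard(table, shardNames) {
  const counts = Object.fromEntries(shardNames.map((s) => [s, 0]));
  for (const c of table) counts[c.shard] += 1;
  return counts;
}

// One balancing round: move a single chunk from the fullest shard to the emptiest.
// Returns the input table unchanged if the counts differ by less than 2.
function balanceOnce(table, shardNames) {
  const counts = countByShard(table, shardNames);
  const byCount = [...shardNames].sort((a, b) => counts[a] - counts[b]);
  const to = byCount[0];
  const from = byCount[byCount.length - 1];
  if (counts[from] - counts[to] < 2) return table;
  // Assumption: move the last chunk on the donor shard (an arbitrary choice).
  const idx = table.map((c) => c.shard).lastIndexOf(from);
  return table.map((c, i) => (i === idx ? { ...c, shard: to } : c));
}

const names = ["shard1", "shard2"];
const t3 = [
  { min: "A", max: "D", shard: "shard1" },
  { min: "D", max: "G", shard: "shard1" },
  { min: "G", max: "S", shard: "shard1" },
  { min: "S", max: "Z", shard: "shard1" },
];
const t4 = balanceOnce(t3, names);
const t5 = balanceOnce(t4, names);
console.log(t5.map((c) => c.shard).join(",")); // shard1,shard1,shard2,shard2
```

A further round on `t5` changes nothing, since both shards hold two chunks.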
13. Setting it Up
• Start servers
• Add shards:
db.runCommand( { addshard : "10.1.1.5" } )
• Turn on partitioning:
db.runCommand( { enablesharding : "test" } )
• Shard a collection:
db.runCommand( { shardcollection : "test.data" , key : { num : 1 } } )
14. Live Migration
• Copy initial data set
• Copy anything modified since start
• Commit start
• Copy delta
• Finish commit
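The five steps above can be walked through with in-memory maps standing in for the donor and recipient shards. This is only a sequential sketch in plain JavaScript: `migrateChunk`, `applyWrites`, and `copyKeys` are hypothetical names, writes are modeled as two fixed batches, deletes are ignored, and it is assumed that writes to the chunk pause once the commit starts.

```javascript
// Apply a batch of [key, value] writes to the source and return the keys touched.
function applyWrites(source, writes) {
  const dirty = new Set();
  for (const [key, value] of writes) {
    source.set(key, value);
    dirty.add(key);
  }
  return dirty;
}

// Copy the current value of each listed key from source to target.
function copyKeys(source, target, keys) {
  for (const key of keys) target.set(key, source.get(key));
}

function migrateChunk(source, writesDuringCopy, writesDuringCatchUp) {
  // 1. Copy initial data set (a snapshot of the source).
  const target = new Map(source);
  // 2. Copy anything modified since the start of the initial copy.
  copyKeys(source, target, applyWrites(source, writesDuringCopy));
  // More writes land on the source while step 2 is running.
  const delta = applyWrites(source, writesDuringCatchUp);
  // 3. Commit start: assumed to pause writes to this chunk from here on.
  // 4. Copy delta: the last keys changed before the pause.
  copyKeys(source, target, delta);
  // 5. Finish commit: the target now holds everything the source does.
  return target;
}

const source = new Map([["S1", 1], ["T2", 2]]);
const target = migrateChunk(source, [["S1", 10]], [["U3", 3]]);
console.log(target.get("S1"), target.get("T2"), target.get("U3")); // 10 2 3
```

The point of the two catch-up passes is that the expensive bulk copy happens while the chunk stays writable, so only the small final delta is copied inside the commit.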
15. Download MongoDB
http://www.mongodb.org
and let us know what you think
@eliothorowitz @mongodb
10gen is hiring!
http://www.10gen.com/jobs