Sharding in MongoDB scales a database horizontally across multiple servers. Data is split into chunks, and those chunks are distributed across shards. The mongos router directs read and write operations to the appropriate shards. Documents are partitioned by a shard key so that related data resides on the same shard. Queries that include the shard key are routed to a single shard, while queries that do not are broadcast to all shards. Chunk splits and migrations rebalance data as it changes over time.
26. Shard
Really is just a mongod (or replica set)
Where your data lives
27. Config Server
mongod started with the --configsvr option
Must have 3 (or 1 in development)
Data is committed using two-phase commit
28. mongos
Acts as a shard router / proxy
One or as many as you want
Lightweight: can run on app servers
Caches metadata from config servers
70. Routed Request
[Diagram: mongos above three shards, with numbered steps]
1. Query arrives at mongos
2. mongos routes query to a single shard
3. Shard returns results of query
4. Results returned to client
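The routed request steps on this slide can be sketched as a small simulation. This is a minimal sketch and not how mongos is actually implemented: the chunk table, the shard names, and the `routeByShardKey` and `findByEmail` helpers are all hypothetical, and it assumes a collection sharded on `email` where each chunk owns a half-open key range. Real chunk ranges are bounded by MinKey and MaxKey rather than the string sentinels used here.

```javascript
// Hypothetical chunk metadata, standing in for what mongos caches from the
// config servers. Each chunk owns the half-open shard key range [min, max).
const chunks = [
  { min: "", max: "h", shard: "shard0" },
  { min: "h", max: "q", shard: "shard1" },
  { min: "q", max: "\uffff", shard: "shard2" },
];

// Hypothetical documents stored on each shard.
const shards = {
  shard0: [{ email: "bob@10gen.com", state: "NY" }],
  shard1: [{ email: "kate@10gen.com", state: "CA" }],
  shard2: [{ email: "zed@10gen.com", state: "NY" }],
};

// Step 2: pick the single shard whose chunk range contains the key.
function routeByShardKey(email) {
  const chunk = chunks.find((c) => c.min <= email && email < c.max);
  return chunk.shard;
}

// Steps 1 to 4: the query arrives, is routed, runs on one shard, and returns.
function findByEmail(email) {
  const shard = routeByShardKey(email); // step 2
  const results = shards[shard].filter((d) => d.email === email); // step 3
  return { shard, results }; // step 4
}

console.log(findByEmail("bob@10gen.com"));
```

Only one shard is consulted per lookup, which is why this path stays cheap as shards are added.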
82. Distributed Merge Sort Request
[Diagram: mongos above three shards, with numbered steps]
1. Query arrives at mongos
2. mongos broadcasts query to all shards
3. Each shard locally sorts results
4. Results returned to mongos
5. mongos merges sorted results
6. Combined results returned to client
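The distributed merge sort can be illustrated with plain arrays standing in for shards. This is a sketch under stated assumptions: the sample documents and the `queryAllShards` and `mergeSorted` helpers are hypothetical, and the merge shown is an ordinary k-way merge, not necessarily the exact algorithm mongos uses.

```javascript
// Hypothetical documents held by three shards, unsorted on `state`.
const shardData = [
  [{ name: "a", state: "NY" }, { name: "b", state: "CA" }],
  [{ name: "c", state: "TX" }, { name: "d", state: "AL" }],
  [{ name: "e", state: "NY" }, { name: "f", state: "FL" }],
];

// Ascending comparison on `state`, like sort({state: 1}).
const byState = (x, y) => (x.state < y.state ? -1 : x.state > y.state ? 1 : 0);

// Steps 2 to 4: every shard gets the query, sorts locally, returns its results.
function queryAllShards(data, compare) {
  return data.map((docs) => [...docs].sort(compare));
}

// Step 5: merge already-sorted lists by repeatedly taking the smallest head.
function mergeSorted(lists, compare) {
  const positions = lists.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < lists.length; i++) {
      if (positions[i] >= lists[i].length) continue; // this list is exhausted
      if (best === -1 ||
          compare(lists[i][positions[i]], lists[best][positions[best]]) < 0) {
        best = i;
      }
    }
    if (best === -1) return merged; // every list is exhausted
    merged.push(lists[best][positions[best]++]);
  }
}

const sortedPerShard = queryAllShards(shardData, byState);
const combined = mergeSorted(sortedPerShard, byState); // step 6
console.log(combined.map((d) => d.state).join(" ")); // AL CA FL NY NY TX
```

Because each shard has already sorted its own results, the router only compares the head of each list, so the merge cost grows with the number of shards rather than requiring a full re-sort.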
83. Queries
Query type                 Handling                  Example
By shard key               Routed                    db.users.find({email: "bob@10gen.com"})
Sorted by shard key        Routed in order           db.users.find().sort({email: -1})
Find by non shard key      Scatter gather            db.users.find({state: "NY"})
Sorted by non shard key    Distributed merge sort    db.users.find().sort({state: 1})
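The four cases on this slide can be expressed as a small decision function. This is an illustrative simplification, not mongos's query planner: `classifyQuery` is a hypothetical helper, it assumes a single-field shard key, and it only looks at the first sort field.

```javascript
// Hypothetical helper: classify how a query on a sharded collection is
// handled, following the four cases on the slide.
//   shardKey: name of the field the collection is sharded on
//   filter:   query document, e.g. { email: "bob@10gen.com" }
//   sort:     sort document, e.g. { email: -1 }, or {} for no sort
function classifyQuery(shardKey, filter, sort = {}) {
  const filtersOnKey = Object.keys(filter).includes(shardKey);
  const sortFields = Object.keys(sort);
  if (filtersOnKey) return "routed";
  if (sortFields.length === 0) return "scatter gather";
  return sortFields[0] === shardKey
    ? "routed in order"
    : "distributed merge sort";
}

// The users collection from the slide, assumed to be sharded on `email`.
console.log(classifyQuery("email", { email: "bob@10gen.com" })); // routed
console.log(classifyQuery("email", {}, { email: -1 }));          // routed in order
console.log(classifyQuery("email", { state: "NY" }));            // scatter gather
console.log(classifyQuery("email", {}, { state: 1 }));           // distributed merge sort
```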
89. Writes should be distributed
{
node: "ny153.example.com",
application: "apache",
time: "2011-01-02T21:21:56Z",
level: "ERROR",
msg: "something is broken"
}
Bad: { time: 1 }
Better: { node: 1, application: 1, time: 1 }
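A key on `time` alone is a poor choice because every new log entry has a later timestamp than the ones before it, so all inserts fall into the chunk with the highest range. The sketch below counts where new writes land under each candidate key. Everything in it is an assumption for illustration: the chunk lower bounds, the node names other than `ny153`, and the modeling of a compound key as fields joined into one string.

```javascript
// Two candidate shard keys. A compound key is modeled by joining its fields.
const keyByTime = (d) => d.time;
const keyByNodeAppTime = (d) => [d.node, d.application, d.time].join("|");

// Hypothetical chunk lower bounds; chunk i is assumed to live on shard i.
const timeBounds = ["", "2010-07-01", "2011-01-01"];
const nodeBounds = ["", "ny100", "ny200"];

// Index of the last chunk whose lower bound is <= key.
function shardFor(bounds, key) {
  let idx = 0;
  for (let i = 0; i < bounds.length; i++) {
    if (bounds[i] <= key) idx = i;
  }
  return idx;
}

// New log entries: several nodes logging at about the same moment.
const incoming = ["ny050", "ny153", "ny250", "ny120", "ny010", "ny299"].map(
  (n, i) => ({
    node: n + ".example.com",
    application: "apache",
    time: `2011-01-02T21:21:5${i}Z`,
    level: "ERROR",
    msg: "something is broken",
  })
);

// Count how many of the incoming writes each shard receives.
function writesPerShard(bounds, keyOf) {
  const counts = bounds.map(() => 0);
  for (const doc of incoming) counts[shardFor(bounds, keyOf(doc))]++;
  return counts;
}

console.log(writesPerShard(timeBounds, keyByTime));        // [ 0, 0, 6 ]
console.log(writesPerShard(nodeBounds, keyByNodeAppTime)); // [ 2, 2, 2 ]
```

With `node` as the leading field, concurrent writes from different nodes spread across chunks, while `time` still keeps each node's entries ordered within its range.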
92. Queries should be routed to one shard
{
node: "ny153.example.com",
application: "apache",
time: "2011-01-02T21:21:56Z",
level: "ERROR",
msg: "something is broken"
}
Bad: { msg: 1, node: 1 }
Better: { node: 1, time: 1 }
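One way to compare candidate keys is to count how many of the application's typical queries constrain the leading field of the shard key, since those are the ones that can be targeted instead of broadcast. The sketch below does that for a made-up workload. The workload, `isTargeted`, and `targetedCount` are hypothetical, and the rule is deliberately simplified to "the query filters on the first shard key field".

```javascript
// Hypothetical rule of thumb: a query can be targeted when it filters on the
// first field of the shard key.
function isTargeted(shardKeyFields, queryFields) {
  return queryFields.includes(shardKeyFields[0]);
}

// Assumed workload for a log collection: each entry lists the fields a query
// filters on. Most queries look up a node, often with a time range.
const workload = [
  ["node", "time"],
  ["node"],
  ["node", "level"],
  ["level"], // no shard key field at all: broadcast under either key
];

// Number of workload queries that can be sent to a targeted set of shards.
function targetedCount(shardKeyFields) {
  return workload.filter((q) => isTargeted(shardKeyFields, q)).length;
}

console.log(targetedCount(["msg", "node"]));  // 0
console.log(targetedCount(["node", "time"])); // 3
```

Under this assumed workload no query filters on `msg`, so a key that leads with `msg` turns every query into a scatter gather, while a key that leads with `node` targets most of them.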
95. Chunks should be able to split
{
node: "ny153.example.com",
application: "apache",
time: "2011-01-02T21:21:56Z",
level: "ERROR",
msg: "something is broken"
}
Bad: { node: 1 }
Better: { node: 1, time: 1 }
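A chunk can only be divided between two distinct shard key values, so a key on `node` alone leaves all of one busy node's entries in a single chunk that cannot split. The sketch below shows the difference on a chunk holding entries from one node. The `findSplitPoint` helper, the extra timestamps, and the string-joined compound key are assumptions for illustration, not MongoDB's actual split logic.

```javascript
// Hypothetical split-point finder: returns a key value near the middle of the
// chunk's distinct keys, or null when every document has the same key.
function findSplitPoint(docs, keyOf) {
  const distinct = [...new Set(docs.map(keyOf))].sort();
  if (distinct.length < 2) return null; // nothing to split between
  return distinct[Math.floor(distinct.length / 2)];
}

// A chunk that has filled up with log entries from a single node.
const chunk = ["21:21:56", "21:22:10", "21:23:41", "21:25:03"].map((t) => ({
  node: "ny153.example.com",
  application: "apache",
  time: `2011-01-02T${t}Z`,
  level: "ERROR",
  msg: "something is broken",
}));

// Bad: { node: 1 } gives every document the same key.
console.log(findSplitPoint(chunk, (d) => d.node)); // null

// Better: { node: 1, time: 1 } gives each document its own key.
console.log(findSplitPoint(chunk, (d) => d.node + "|" + d.time));
// ny153.example.com|2011-01-02T21:23:41Z
```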