The document discusses MongoDB replication and sharding. Replication uses replica sets for high availability and disaster recovery. Sharding partitions data across multiple servers (shards) to improve scalability. The key points covered include:
- Replication maintains copies of data on multiple servers for redundancy and high availability. It uses replica sets and elections for failover.
- Sharding partitions data by a shard key across multiple mongod instances (shards) to scale reads and writes. It requires config servers to store metadata and mongos instances as query routers.
- Write concerns allow controlling acknowledgments and replication of write operations. Tag-aware sharding allows controlling data distribution across shards.
7. Why Replication?
• How many have faced node failures?
• How many have been woken up from sleep to do a failover?
• How many have experienced issues due to network latency?
• Different uses for data
– Normal processing
– Simple analytics
8. Why Replication?
• Replication is designed for
– High Availability (HA)
– Disaster Recovery (DR)
• Not designed for scaling reads
– You can, but there are drawbacks: eventual consistency, etc.
– Use sharding for scaling!
21. Write Concerns
• Network acknowledgement (w = 0)
• Wait for return info/error (w = 1)
• Wait for journal sync (j = 1)
• Wait for replication (w >= 2)
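For example, a write concern can be attached to a single operation from the shell; a minimal sketch (the orders collection and values are illustrative):

  // Wait for replication to 2 members and a journal sync, or error after 5 s
  db.orders.insert(
    { item: "abc", qty: 1 },
    { writeConcern: { w: 2, j: true, wtimeout: 5000 } }
  )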
22. Tagging
• Control where data is written to, and read from
• Each member can have one or more tags
– tags: {dc: "ny"}
– tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"}
• Replica set defines rules for write concerns
• Rules can change without changing app code
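A sketch of how such a rule can be declared via getLastErrorModes in the replica set configuration (hosts, tags, and the someDCs mode name are illustrative; someDCs reappears in the notes at the end):

  cfg = rs.conf()
  cfg.members[0].tags = { dc: "ny" }
  cfg.members[1].tags = { dc: "sf" }
  cfg.members[2].tags = { dc: "ny" }
  // "someDCs" is satisfied once a write is acknowledged in 2 distinct dc values
  cfg.settings = cfg.settings || {}
  cfg.settings.getLastErrorModes = { someDCs: { dc: 2 } }
  rs.reconfig(cfg)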
24. Read Preference Modes
• 5 modes
– primary (only) - Default
– primaryPreferred
– secondary
– secondaryPreferred
– nearest
When more than one node is eligible, the closest node is used for reads (all modes but primary)
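In the shell, a mode can be set per query with cursor.readPref(); a minimal sketch (collection and filter are illustrative):

  // Prefer secondaries, falling back to the primary if none is reachable
  db.orders.find({ status: "open" }).readPref("secondaryPreferred")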
25. Tagged Read Preference
• Custom read preferences
• Control where you read from by (node) tags
– E.g. { "disk": "ssd", "use": "reporting" }
• Use in conjunction with standard read preferences
– Except primary
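Tag sets are passed as the second argument to readPref(); a sketch reusing the tags above (collection name is illustrative):

  // Try SSD reporting nodes first; the empty document matches any member as a fallback
  db.orders.find().readPref("secondary", [ { disk: "ssd", use: "reporting" }, {} ])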
31. Partitioning
• User defines shard key
• Shard key defines range of data
• Key space is like points on a line
• Range is a segment of that line (chunk), smaller than 64 MB
• Chunks are migrated from one shard to another to maintain a balanced state
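Chunk ranges and their current placement can be inspected from a mongos; a sketch (the namespace is illustrative):

  sh.status()   // prints each database, its chunk ranges, and the owning shards
  use config
  db.chunks.find({ ns: "<dbname>.people" }).pretty()   // raw chunk metadata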
32. Shard Key
• Shard key is immutable
• Shard key values are immutable
• Shard key must be indexed
• Shard key limited to 512 bytes in size
• Shard key used to route queries
– Choose a field commonly used in queries
• Only shard key can be unique across shards
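Because only the shard key can be unique across shards, the unique constraint is requested when sharding the collection; a sketch with an illustrative namespace and field:

  // The third argument requests a unique index on the shard key
  sh.shardCollection("<dbname>.users", { email: 1 }, true)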
33. Shard Key Considerations
• Cardinality
• Write Distribution
• Query Isolation
• Reliability
• Index Locality
34. Data Distribution
• Initially 1 chunk
• Default max chunk size: 64 MB
• MongoDB automatically splits & migrates chunks when the max size is reached
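The 64 MB default can be changed cluster-wide through the config database; a sketch, run from a mongos (the value is in MB):

  use config
  db.settings.save({ _id: "chunksize", value: 32 })   // chunks now split once they exceed 32 MB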
35. Routing and Balancing
• Queries routed to specific shards
• MongoDB balances the cluster
• MongoDB migrates data to new nodes
38–40. Partitioning
[Diagram, shown across three animation frames: the shard key space runs from -∞ to +∞; a single chunk { x : 1 } … { x : 99 } splits into { x : 1 } … { x : 55 } and { x : 56 } … { x : 110 }, which are then distributed across shard 2 and shard 3.]
41. MongoDB Auto-Sharding
• Minimal effort required
– Same interface as single mongod
• Two steps
– Enable Sharding for a database
– Shard collection within database
49. Starting the Configuration Server
mongod --configsvr
Starts a configuration server on the default port (27019)
50. Start the mongos Router
mongos --configdb <hostname>:27019
For 3 configuration servers:
mongos --configdb <host1>:<port1>,<host2>:<port2>,<host3>:<port3>
This is always how to start a new mongos, even if the cluster is already running
51. Start the Shard Database
mongod --shardsvr
Starts a mongod with the default shard port (27018)
Shard is not yet connected to the rest of the cluster
Shard may have already been running in production
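Connecting the shard to the cluster is done from a mongos; a minimal sketch (hostname is illustrative):

  sh.addShard("<hostname>:27018")   // standalone shard
  // For a replica set shard: sh.addShard("rsName/host1:27018,host2:27018")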
53. Verify that the shard was added
db.runCommand({ listShards: 1 })
{ "shards" :
  [ { "_id" : "shard0000", "host" : "<hostname>:27018" } ],
  "ok" : 1
}
54. Enabling Sharding
• Enable sharding on a database
sh.enableSharding("<dbname>")
• Shard a collection with the given key
sh.shardCollection("<dbname>.people", { "country": 1 })
• Use a compound shard key to prevent duplicates
sh.shardCollection("<dbname>.cars", { "year": 1, "uniqueid": 1 })
55. Tag Aware Sharding
• Tag aware sharding allows you to control the distribution of your data
• Tag a range of shard keys
– sh.addTagRange(<collection>,<min>,<max>,<tag>)
• Tag a shard
– sh.addShardTag(<shard>,<tag>)
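A sketch with illustrative shard names, namespace, and range:

  sh.addShardTag("shard0000", "NYC")
  sh.addShardTag("shard0001", "SFO")
  // Keys with country in ["A", "M") stay on shards tagged NYC
  sh.addTagRange("<dbname>.people", { country: "A" }, { country: "M" }, "NYC")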
Basic explanation: 2 or more nodes form the set (quorum).
Initialize -> election. Primary elected; data replication runs from the primary to the secondaries. Heartbeat every 2 seconds, timeout 10 seconds.
Primary down/network failure: automatic election of a new primary if a majority exists. Failover usually takes a couple of seconds. Depending on your application code and configuration, this can be seamless/transparent.
New primary elected: replication established from the new primary.
Down node comes up: rejoins the set, recovers, and then runs as a secondary.
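A minimal initialization from the shell, with illustrative host names:

  rs.initiate({
    _id: "rs0",
    members: [
      { _id: 0, host: "node1.example.net:27017" },
      { _id: 1, host: "node2.example.net:27017" },
      { _id: 2, host: "node3.example.net:27017" }
    ]
  })
  // Members then heartbeat every 2 seconds and elect a primary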
Note that replication doesn't always need to pull from the primary. It will pull from a secondary if that is faster (lower ping time).
Primary: data member. Secondary: hot standby. Arbiters: voting members.
Priority: floating point number between 0 and 1000. The highest-priority member that is up to date wins (up to date == within 10 seconds of the primary). If a higher-priority member catches up, it will force an election and win.
Slave delay: lags behind the master by a configurable time delay, and is automatically hidden from clients. Protects against operator errors (fat fingering) and an application corrupting data.
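Both settings are member fields in the replica set configuration; a sketch with illustrative member indices and values:

  cfg = rs.conf()
  cfg.members[1].priority = 2        // favored in elections while up to date
  cfg.members[2].priority = 0        // can never become primary...
  cfg.members[2].slaveDelay = 3600   // ...and applies operations one hour late
  cfg.members[2].hidden = true       // hidden from clients, as noted above
  rs.reconfig(cfg)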
Consistency: write preferences and read preferences.
Using 'someDCs' so that in the event of an outage, at least a majority of the DCs would receive the change. This favors availability over durability.
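From the application side this is just a named write concern, matching the getLastErrorModes sketch earlier (collection illustrative):

  db.orders.insert({ item: "abc" }, { writeConcern: { w: "someDCs" } })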
Indexes should be contained in the working set.
From mainframes to Oracle RAC servers, people solved problems by adding more resources to a single machine.
Large-scale operation can be combined with high performance on commodity hardware through horizontal scaling. Build: a document-oriented database maps perfectly to object-oriented languages. Scale: MongoDB presents a clear path to scalability that isn't ops intensive, and provides the same interface for a sharded cluster as for a single instance.
_id could be unique across shards if used as the shard key. We can only guarantee uniqueness of (any) attributes if those keys are used as shard keys with the unique attribute set to true.
Cardinality: can your data be broken down enough? Query isolation: query targeting to a specific shard. Reliability: shard outages. A good shard key can: optimize routing, minimize (unnecessary) traffic, and allow the best scaling.
Don't use this setup in production! Only one config server (no fault tolerance), the shard not in a replica set (low availability), and only one mongos and one shard (no performance improvement). Useful for development or demonstrating configuration mechanics.
MongoDB 2.2 and later only need <host> and <port> for one member of the replica set
This can be skipped for the intro talk, but might be good to include if you're doing the combined sharding talk. Totally optional; you don't really have enough time to do this topic justice, but it might be worth a mention.
The mongos does not have to load the whole set into memory since each shard sorts locally. The mongos can just getMore from the shards as needed and incrementally return the results to the client.