14. assign a shard name per cluster, per role
treat them like ordinary replica sets
Thursday, June 20, 13
15. Arbiters
• Mongod processes that do nothing but vote
• Highly reliable
• To provision an arbiter, use the LWRP
• Easy to run multiple arbiters on a single host
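The LWRP above is a Chef resource; done by hand, an arbiter is just a small mongod plus `rs.addArb()`. A minimal sketch — host names, ports, and paths are examples:

```shell
# Start a tiny mongod that stores no data and only votes:
mongod --replSet rs0 --port 27021 --dbpath /data/arb1 \
       --fork --logpath /data/arb1/mongod.log
# From the primary, add it to the set as an arbiter:
mongo --eval 'rs.addArb("arbiter-host:27021")'
```

Because arbiters hold no data, several can safely share one small host.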
22. Provisioning tips
• Memory is your primary scaling constraint
• Your working set must fit into memory
• in 2.4, estimate with:
• Page faults? Your working set may not fit
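A sketch of both checks against a local mongod (2.4 added the working set estimator to serverStatus; the page-fault counter exists in 2.2 as well):

```shell
# 2.4+ working set estimator (reports pages touched and overSeconds):
mongo --quiet --eval 'printjson(db.serverStatus({workingSet: 1}).workingSet)'
# Page-fault counter; a steadily climbing value under load suggests the
# working set no longer fits in RAM:
mongo --quiet --eval 'db.serverStatus().extra_info.page_faults'
```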
23. Disk options
• If you’re on Amazon:
• EBS
• Dedicated SSD
• Provisioned IOPS
• Ephemeral
• If not:
• use SSDs!
25. SSD
(hi1.4xlarge)
• 8 cores
• 60 gigs RAM
• 2 1-TB SSD drives
• 120k random reads/sec
• 85k random writes/sec
• expensive! $2300/mo on demand
26. PIOPS
• Up to 2000 IOPS/volume
• Up to 1024 GB/volume
• Variability of < 0.1%
• Costs double regular EBS
• Supports snapshots
• RAID together multiple volumes for more storage/performance
27. Estimating PIOPS
• estimate how many IOPS to provision with the “tps” column of sar -d 1
• multiply that by 2-3x depending on your spikiness
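The estimate above can be sketched with a little awk. The sample data here is synthetic (real `sar -d 1` output has more columns); the idea is to take the peak observed tps and apply the 2-3x headroom multiplier:

```shell
# Synthetic (device, tps) samples standing in for `sar -d 1` output:
cat > /tmp/sar_sample.txt <<'EOF'
dev202-1 640.00
dev202-1 810.00
dev202-1 725.00
EOF
# Peak tps times 2.5 as a middle-of-the-road spikiness multiplier:
awk '{ if ($2 > max) max = $2 } END { printf "%d\n", max * 2.5 }' /tmp/sar_sample.txt
# → 2025
```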
28. Ephemeral Storage
• Cheap
• Fast
• No network latency
• No snapshot capability
• Data is lost forever if you stop or resize the instance
29. Filesystem and limits
• Raise file descriptor limits
• Raise connection limits
• Mount with noatime and nodiratime
• Consider putting the journal on a separate volume
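A sketch of those settings; file paths, device names, and limit values are examples to adapt, not prescriptions:

```shell
# /etc/security/limits.d/mongod.conf — raise file descriptor limits:
#   mongod  soft  nofile  64000
#   mongod  hard  nofile  64000
# /etc/fstab — mount the data volume without atime updates:
#   /dev/xvdf  /data  ext4  defaults,noatime,nodiratime  0 2
# Journal on a separate volume, via a symlink inside the dbpath:
#   ln -s /journal-volume/journal /data/db/journal
```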
30. Blockdev
• Your default blockdev readahead is probably wrong
• Too large? You will underuse memory
• Too small? You will hit the disk too much
• Experiment.
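Inspecting and tuning readahead looks roughly like this; the device name is an example, and the right value depends on your access pattern, so measure before and after:

```shell
# Current readahead, in 512-byte sectors:
blockdev --getra /dev/xvdf
# Try a small value for random-access workloads (32 sectors = 16 KB),
# then benchmark:
blockdev --setra 32 /dev/xvdf
```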
31. Snapshot best practices
• Set priority = 0
• Set hidden = 1
• Consider setting votes = 0
• Lock mongo or stop mongod before snapshot
• Consider running continuous compaction on snapshot node
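A sketch of configuring a dedicated snapshot member and quiescing it before the snapshot. The member index (2) is an assumption — pick the member you actually snapshot:

```shell
mongo --eval '
  var cfg = rs.conf();
  cfg.members[2].priority = 0;   // never becomes primary
  cfg.members[2].hidden = true;  // invisible to clients
  cfg.members[2].votes = 0;      // optional: does not vote
  rs.reconfig(cfg);
'
# Flush and block writes for the duration of the snapshot:
mongo --eval 'db.fsyncLock()'
# ... take the EBS/LVM snapshot here ...
mongo --eval 'db.fsyncUnlock()'
```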
32. Restoring from snapshot
• EBS snapshot will lazily-load blocks from S3
• run “dd” on each of the data files to pull blocks down
• Always warm up a secondary before promoting
• warm up both indexes and data
• http://blog.parse.com/2013/03/07/techniques-for-warming-up-mongodb/
• in mongodb 2.2 and above you can use the touch command:
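Both warm-up steps, sketched; the dbpath, database, and collection names are examples:

```shell
# Force the lazily-loaded EBS snapshot blocks down from S3 by reading
# every data file end to end:
for f in /data/db/mydb.*; do
  dd if="$f" of=/dev/null bs=1M
done
# 2.2+: pull a collection's data and indexes into RAM with touch:
mongo mydb --eval \
  'printjson(db.runCommand({touch: "mycoll", data: true, index: true}))'
```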
33. Fragmentation
• Your RAM gets fragmented too!
• Leads to underuse of memory
• Deletes are not the only source of fragmentation
• Repair, compact, or resync regularly
34. 3 ways to fix fragmentation:
• Re-sync a secondary from scratch
• hard on your primary; rs.syncFrom() a secondary
• Repair a secondary
• can cause small discrepancies in your data
• Run continuous compaction on your snapshot node
• won’t reset padding factors
• not appropriate if you do lots of deletes
38. Finding bad queries
• db.currentOp()
• mongodb.log
• profiling collection
39. db.currentOp()
• Check the queue size
• Any indexes building?
• Sort by num_seconds
• Sort by num_yields, locktype
• Consider adding comments to your queries
• Run explain() on queries that are long-running
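A sketch of the triage loop: surface long-running queries from `db.currentOp()` (field names as in 2.2+; the 10-second threshold is an example):

```shell
mongo --eval '
  db.currentOp().inprog.forEach(function (op) {
    // Only user queries, running longer than 10 seconds:
    if (op.op === "query" && op.secs_running > 10) printjson(op);
  });
'
```

Tagging queries with `$comment` in your application makes the `query` field here traceable back to the code path that issued it.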
40. mongodb.log
• Configure output with --slowms
• Look for high execution time, nscanned, ntoreturn
• See which queries are holding long locks
• Match connection ids to IPs
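Slow-query lines in the 2.x log end in the execution time, so they can be filtered with awk. The sample lines below are synthetic (real log lines carry more fields); the 100ms threshold is an example:

```shell
cat > /tmp/mongodb_sample.log <<'EOF'
Thu Jun 20 12:00:01 [conn41] query app.users query: { email: 1 } nscanned:120000 nreturned:1 250ms
Thu Jun 20 12:00:02 [conn42] query app.users query: { _id: 1 } nscanned:1 nreturned:1 0ms
EOF
# Keep lines whose trailing "NNNms" is at least 100ms:
awk '{ ms = $NF; sub(/ms$/, "", ms); if (ms + 0 >= 100) print }' /tmp/mongodb_sample.log
```

The `[connNN]` tag in each line matches the connection-accepted lines earlier in the log, which is how you map a slow query back to a client IP.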
41. system.profile collection
• Enable profiling with db.setProfilingLevel()
• Does not persist through restarts
• Like mongodb.log, but queryable
• Writes to this collection incur some cost
• Use db.system.profile.find() to get slow queries for a certain collection, time range, execution time, etc
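Sketch of enabling the profiler and querying it; the database, collection, threshold, and time window are all examples:

```shell
# Level 1 records only operations slower than the threshold (100ms here):
mongo mydb --eval 'db.setProfilingLevel(1, 100)'
# Slow ops against one collection in the last hour, newest first:
mongo mydb --eval '
  db.system.profile.find({
    ns: "mydb.mycoll",
    millis: { $gt: 100 },
    ts: { $gt: new Date(Date.now() - 3600 * 1000) }
  }).sort({ ts: -1 }).forEach(printjson);
'
```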
42. ... when queries pile up ...
• Know what your tipping point looks like
• Don’t switch your primary or restart
• Do kill queries before the tipping point
• Write your kill script before you need it
• Don’t kill internal mongo operations, only queries.
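A kill script along those lines might look like this sketch — the 15-second threshold is an example, and the guards (queries only, skip the local database) are the important part:

```shell
mongo --eval '
  db.currentOp().inprog.forEach(function (op) {
    if (op.op !== "query") return;                  // user queries only
    if (!op.secs_running || op.secs_running < 15) return;
    if (op.ns && op.ns.indexOf("local.") === 0) return; // never oplog tailers
    print("killing opid " + op.opid + " on " + op.ns);
    db.killOp(op.opid);
  });
'
```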
43. can’t elect a master?
• Never run with an even number of votes (max 7)
• You need > 50% of votes to elect a primary
• Set your priority levels explicitly if you need warmup
• Consider delegating voting to arbiters
• Set snapshot nodes to be nonvoting if possible.
• Check your mongo log. Is something vetoing? Do they have an inconsistent view of the cluster state?
44. secondaries crashing?
• Some rare mongo bugs will cause all secondaries to crash unrecoverably
• Never kill oplog tailers or other internal database operations; this can also trash secondaries
• Arbiters are more stable than secondaries; consider using them to form a quorum with your primary
45. replication stops?
• Other rare bugs will stop replication or cause secondaries to exit without a corrupt op
• The correct way to fix this is to re-snapshot off the primary and rebuild your secondaries.
• However, you can sometimes *dangerously* repair a secondary:
1. stop mongo
2. bring it back up in standalone mode
3. repair the offending collection
4. restart mongo again as part of the replica set
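The four steps, sketched as commands. Paths, ports, and names are examples, and this is the *dangerous* path — re-snapshotting from the primary is the safe fix:

```shell
# 1. stop mongo
service mongod stop
# 2. bring it back up in standalone mode (no --replSet):
mongod --dbpath /data/db --port 27018 --fork --logpath /data/db/repair.log
# 3. repair (repairDatabase rewrites every collection in the database):
mongo --port 27018 mydb --eval 'printjson(db.repairDatabase())'
mongod --shutdown --dbpath /data/db
# 4. restart mongo again as part of the replica set:
service mongod start
```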
46. • Everything is getting vaguely slower?
• check your padding factor, try compaction
• You rs.remove() a node and get weird driver errors?
• always shut down mongod after removing from replica set
• Huge background flush spike?
• probably an EBS or disk problem
• You run out of connection limits?
• possibly a driver bug
• hard-coded to 80% of soft ulimit until 20k is reached.
47. • It looks like all I/O stops for a while?
• check your mongodb.log for large newExtent warnings
• also make sure you aren’t reaching PIOPS limits
• You get weird driver errors after adding/removing/re-electing?
• some drivers have problems with this, you may have to restart