eHarmony moved one family of business-critical back-end applications to MongoDB several months ago. In this presentation, I discuss some of the important lessons we learned along the way about how to provision, scale, manage, and troubleshoot MongoDB.
5. Application: Find Potential Matches
As fast as possible:
1. Find people who meet each other’s preferences
2. Discard combos that violate Compatibility Models
[Diagram label: 1. Bidirectional User-Defined Criteria]
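The two steps on this slide can be sketched roughly as follows. This is only an illustration, not eHarmony's implementation: the `User` fields, the age-range criteria, and the `violates_model` callback are all hypothetical stand-ins for real user attributes and compatibility models.

```python
from dataclasses import dataclass


@dataclass
class User:
    # All fields are hypothetical stand-ins for real user attributes.
    name: str
    age: int
    min_age: int  # youngest partner this user will accept
    max_age: int  # oldest partner this user will accept


def accepts(chooser, candidate):
    """True if candidate satisfies chooser's user-defined criteria."""
    return chooser.min_age <= candidate.age <= chooser.max_age


def find_potential_matches(user, candidates, violates_model):
    """Step 1: keep candidates whose criteria match in both directions.
    Step 2: discard pairs that the (hypothetical) compatibility model rejects."""
    mutual = [c for c in candidates if accepts(user, c) and accepts(c, user)]
    return [c for c in mutual if not violates_model(user, c)]


alice = User("alice", 30, 28, 35)
bob = User("bob", 32, 25, 31)  # mutual match with alice
dan = User("dan", 29, 31, 40)  # alice accepts dan, but dan wants 31 or older
eve = User("eve", 40, 20, 50)  # eve accepts alice, but alice wants 28 to 35

matches = find_potential_matches(alice, [bob, dan, eve], lambda a, b: False)
print([m.name for m in matches])
```

In the real application the first step would be a database query rather than an in-memory filter; the sketch only shows why the criteria have to be checked in both directions.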
6. Application: Find Potential Matches
• User attributes in MongoDB
– Replicated
– Sharded
• Data access pattern:
– Read-heavy
– Complex queries
• Java application
[Diagram label: 1. Bidirectional User-Defined Criteria]
7. Application: Find Potential Matches
• In full production > 6 mos
– Following several mos limited production
– Following several mos intensive dev+testing
• No production outages
• MongoDB no longer the thing we worry about most
8. Lesson: Provision for Success
• Fit all data & indexes in memory
– MongoDB storage implemented using mem-mapped files
– Beware under-provisioned VMs
• Minimize field names to keep data as small as possible
– “Schema-less records” == “schema repeated millions of times”
– Morphia Java library can help with mapping
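A rough way to see why field names matter: BSON stores every field name as a NUL-terminated string inside every document, so the names are paid for once per record. The sketch below only does that arithmetic; the field names and the document count are made up for illustration.

```python
def field_name_bytes(field_names):
    """Bytes one BSON document spends on field names alone:
    the UTF-8 bytes of each name plus one NUL terminator per name."""
    return sum(len(name.encode("utf-8")) + 1 for name in field_names)


# Hypothetical attribute names, long and abbreviated.
long_names = ["relationshipStatus", "minimumPreferredAge", "maximumPreferredAge"]
short_names = ["rs", "mna", "mxa"]

documents = 10_000_000  # assumed collection size
saved_per_doc = field_name_bytes(long_names) - field_name_bytes(short_names)
saved_total = saved_per_doc * documents

print(saved_per_doc, "bytes saved per document")
print(saved_total / 1_000_000, "MB saved across the collection")
```

A mapping layer such as Morphia can keep the long, readable names in application code while the short names are what gets stored.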
9. Lesson: Provision for Success
• Scale write ops & data volume by adding shards
• Scale read ops by adding secondaries
[Diagram: two "Shard / RS" groups, each with a Primary and Secondaries; "…" marks further shards and secondaries]
10. Lesson: Be Ready to Tinker
• Many processes:
– mongod on each node, primary or secondary
– 2 MMS agents
– Plus, if sharding:
• mongos for each app instance
• 3 config servers
• …Each configured separately & differently
– Configuration file
– Manual commands to set up
• Less likely to have DBA support
– …and relational Best Practices may not transfer
• Use Puppet, Chef, or similar
– Helps with config files, command-line arguments
– Insufficient for adding secondaries, configuring indexes, etc.
• If scripting, use real client driver, not mongo shell
– Doesn’t handle output or errors consistently
– Can’t wait in JavaScript
• Train your DB/Ops team(s)
– And expect to do more yourself
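One thing a driver-based script can do that is awkward in the mongo shell is wait for a condition and fail cleanly on timeout. The helper below is a minimal sketch of that idea; `wait_until` and its parameters are hypothetical names, and the check function is left to the caller. With PyMongo, for example, the check might inspect the result of the `replSetGetStatus` admin command, but that part is an assumption and is not shown running here.

```python
import time


def wait_until(check, timeout_s=60.0, interval_s=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or timeout_s elapses.

    Returns True on success and False on timeout, so the calling
    script can report the failure instead of silently continuing.
    sleep and clock are injectable to make the helper easy to test.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Stand-in for a real check such as "has the new secondary caught up?"
attempts = {"count": 0}


def secondary_is_ready():
    attempts["count"] += 1
    return attempts["count"] >= 3  # pretend it becomes ready on the 3rd poll


ok = wait_until(secondary_is_ready, timeout_s=5, interval_s=0.01)
print("ready" if ok else "timed out", "after", attempts["count"], "polls")
```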
11. Lesson: Shadow Mode Is Your Friend
Test with real production data, conditions, and queries
Measure everything (MMS is a good start, but insufficient)
[Diagram: "Real Events & Requests" feeding the "Real Application" and the “Shadow” Application; an X appears in the diagram]
Kill mongod instances to verify resiliency
Primary school enrollment, Armenia: http://data.worldbank.org/country/armenia
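Shadow mode can be sketched as a wrapper that sends every real request to both code paths, returns only the real answer, and records what the shadow did. This is an illustrative sketch only: `handle_with_shadow`, the handlers, and the metrics format are hypothetical, and a production version would normally run the shadow call asynchronously so it cannot slow down real traffic.

```python
import time


def handle_with_shadow(request, real_handler, shadow_handler, metrics):
    """Serve the request from the real path; exercise the shadow path too.

    The shadow result is only compared and measured, never returned,
    and a shadow failure is recorded rather than raised.
    """
    start = time.perf_counter()
    result = real_handler(request)
    metrics.append({"path": "real",
                    "seconds": time.perf_counter() - start,
                    "problem": None})

    start = time.perf_counter()
    try:
        shadow_result = shadow_handler(request)
        problem = None if shadow_result == result else "mismatch"
    except Exception as exc:  # a shadow failure must never reach the user
        problem = repr(exc)
    metrics.append({"path": "shadow",
                    "seconds": time.perf_counter() - start,
                    "problem": problem})
    return result


def real(request):
    return request * 2


def shadow(request):
    # Simulates the shadow cluster losing a mongod mid-request.
    raise RuntimeError("mongod killed")


metrics = []
answer = handle_with_shadow(21, real, shadow, metrics)
print(answer, [m["problem"] for m in metrics])
```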
12. Lesson: Be Ready to Restore Your Data
• Schemas will change
• Shard key(s) will change
– More on this later…
• You’ll experience MongoDB bugs
• Maintain 2nd copy in another format
– Backing source of truth?
– Backup in standard format?
– Second cluster with different version of MongoDB?
• Increment DB name with each reload
• Automate reload process, and use it
Image credit: http://tutorialphotoshopcs-putradom.blogspot.com/2012/11/create-dramatic-meteor-and-burning-city.html
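"Increment DB name with each reload" could look something like the sketch below, which picks the next versioned name from the names that already exist. The `matching_vN` naming scheme and the `next_db_name` helper are assumptions made for illustration; in a PyMongo script the existing names might come from `MongoClient.list_database_names()`.

```python
import re


def next_db_name(existing_names, base="matching"):
    """Return base_v<N+1>, where N is the highest version already present.

    Names that do not follow the base_v<number> pattern are ignored.
    """
    pattern = re.compile(rf"^{re.escape(base)}_v(\d+)$")
    versions = []
    for name in existing_names:
        found = pattern.match(name)
        if found:
            versions.append(int(found.group(1)))
    return f"{base}_v{max(versions, default=0) + 1}"


# Hypothetical list of databases currently in the cluster.
existing = ["admin", "config", "matching_v7", "matching_v8"]
print(next_db_name(existing))
```

Loading into a fresh name each time means the previous copy stays untouched until the application is switched over, which makes a bad reload easy to back out of.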
13. Lesson: Pick a Good Shard Key
1. Distribute Data Volume Evenly
– This is what auto-balancing does for you.
2. Multiply Query Performance
– Isolate queries to 1 shard to multiply read capacity by # of shards.
3. Distribute Workload Evenly
– Conflicts with above!
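The tension between goals 2 and 3 can be seen in a toy routing model. A query that includes the shard key can be sent to one shard, while a query without it has to go to every shard. The sketch below is not how mongos is implemented; the `zone` shard key, the split points, and the shard names are all hypothetical.

```python
from bisect import bisect_right
from collections import Counter

SHARDS = ["shard1", "shard2", "shard3", "shard4"]
# Hypothetical range boundaries on a numeric shard key "zone":
# shard1 holds zone < 25, shard2 holds 25..49, shard3 holds 50..74, shard4 the rest.
SPLIT_POINTS = [25, 50, 75]


def shards_for_query(query):
    """Shards that must do work for this query."""
    if "zone" in query:
        # Shard key present: the query is isolated to a single shard.
        return [SHARDS[bisect_right(SPLIT_POINTS, query["zone"])]]
    # Shard key absent: every shard has to be asked.
    return list(SHARDS)


def load_per_shard(queries):
    """Count how many queries each shard has to serve."""
    load = Counter({shard: 0 for shard in SHARDS})
    for query in queries:
        load.update(shards_for_query(query))
    return dict(load)


even = load_per_shard([{"zone": z} for z in (10, 30, 60, 90)] * 2)
hot = load_per_shard([{"zone": 10}] * 8)       # isolated, but all on one shard
scattered = load_per_shard([{"age": 30}] * 8)  # no shard key in the query

print(even)
print(hot)
print(scattered)
```

Isolated queries multiply capacity only when the queried key values are themselves spread out; if most traffic asks for the same range, one shard does all the work.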
14. Lesson: Pick a Good Shard Key
[Diagram: mongos, Shard 1, Shard 2]
Jessica Rabbit: http://disney.wikia.com/wiki/Jessica_Rabbit
Steve Urkel: http://celebratingtvandfilmgeeks.wordpress.com/2010/04/25/steve-urkel-the-
15. Lesson: Pick a Good Shard Key
DO These Things
• Use fields appearing in every query
• Choose combo that finely partitions data
• Measure relative load across shards
– Consider adding secondaries to loaded shard(s) ONLY
BEWARE These Things
• Include serial numbers (or similar)
• Hash fields when reads might be a problem
• Mutable fields in shard key (remove and add)
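"Measure relative load across shards" can start as simply as comparing each shard's operation count with the average. The sketch below assumes the per-shard counts have already been collected somehow (for instance from monitoring); the numbers, the 1.5x threshold, and the function names are made up for illustration.

```python
def relative_load(ops_by_shard):
    """Each shard's operation count divided by the mean across shards."""
    mean = sum(ops_by_shard.values()) / len(ops_by_shard)
    if mean == 0:
        return {shard: 0.0 for shard in ops_by_shard}
    return {shard: ops / mean for shard, ops in ops_by_shard.items()}


def hot_shards(ops_by_shard, threshold=1.5):
    """Shards carrying more than threshold times the mean load.

    These are the candidates for extra secondaries."""
    ratios = relative_load(ops_by_shard)
    return sorted(shard for shard, ratio in ratios.items() if ratio > threshold)


# Hypothetical read counts per shard over the same time window.
reads = {"shard1": 900, "shard2": 100, "shard3": 200}
print(relative_load(reads))
print(hot_shards(reads))
```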
16. Summary
1. Provision for Success
2. Be Ready to Tinker
3. Shadow Mode Is Your Friend
4. Be Ready to Restore Your Data
5. Pick a Good Shard Key
Specifically, we’ll be talking about 5 lessons. It should take about 30 minutes.
At some point, you’ll realize the data in your cluster isn’t what you need, or isn’t organized how you need it. You’ll need to reconstruct it. In the first two cases, you could dump and reload a single cluster. But what about production changes in the meantime?
The idea is for the breakdown of data across shards to reflect the same natural divisions of data you’re likely to query against.