The Importance of Indexes in MongoDB
1. The Importance of Indexes in MongoDB
How we increased the loading speed of Profile Gliphs and Insights
James Toyer, Lead Software Engineer at Glipho
2. What is Glipho?
Social network for text-based content
Aims to better engage writers and readers
Original content only
Not an aggregator
Automatically share to Facebook, LinkedIn and Twitter
3. Insights Page
Load up the gliphs a writer has
Iterate through, and sum, actions for each gliph
Load and sum actions for the writer's profile
This can be over 100 calls to the database
We know it’s inefficient but it does the job for now
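One possible way to cut the number of round trips, offered as an assumption rather than as what Glipho actually did, is to let the database group and count the actions for all of a writer's gliphs in a single aggregation. The field names come from the Insights document structure in this deck; the function name and the sample ids are hypothetical.

```python
def build_action_totals_pipeline(entity_ids):
    """Build a MongoDB aggregation pipeline that counts actions per entity.

    The result is a plain list of stages; with pymongo it could be passed
    to collection.aggregate(pipeline).
    """
    return [
        # Keep only the actions recorded for the requested gliphs/profile.
        {"$match": {"EntityId": {"$in": list(entity_ids)}}},
        # One output document per (entity, action) pair, with its count.
        {"$group": {
            "_id": {"entity": "$EntityId", "action": "$Action"},
            "count": {"$sum": 1},
        }},
    ]


# Hypothetical ids, for illustration only.
pipeline = build_action_totals_pipeline(["gliph-1", "gliph-2", "profile-1"])
```

A query like this still needs a suitable index on the matched field to be fast.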
4. Insights Document Structure
Timestamp – when the action took place
EntityId – identifier of the original entity
ActionType – the type of the entity (probably should be entity type)
Action – the actual action that took place
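As a rough picture of one such document, here is a sketch written as a Python dict. Only the four field names come from the slide; every value below is a made-up placeholder.

```python
from datetime import datetime, timezone

# Hypothetical example values; only the field names are from the slide.
insight_action = {
    "Timestamp": datetime(2013, 1, 1, 12, 0, tzinfo=timezone.utc),  # when the action took place
    "EntityId": "example-gliph-id",  # identifier of the original entity
    "ActionType": "Gliph",           # type of the entity (assumed value)
    "Action": "View",                # the action that took place (assumed value)
}
```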
6. Troubleshooting
CPU spiking? NO
Memory high? NO
Disk IO high? NO
Are there any actual regular hits happening? NO
Do you know anything? NO
Crack out the code performance tools…
7. Pre-Index performance
• 3 passes on each filter page
• Average time for each page to load = 3.9 seconds
• “ListAll” method calls the database
• “ListAll” is iterated over for each gliph in the database and the profile (in this case ~10 times)
• Average time in “ListAll” 256ms
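A back-of-the-envelope check of these figures, assuming the "ListAll" calls run one after another and treating "~10 times" as exactly 10:

```python
list_all_calls = 10      # "~10 times" from the slide
avg_list_all_ms = 256    # average time in "ListAll"
page_load_ms = 3900      # average page load time (3.9 seconds)

time_in_list_all_ms = list_all_calls * avg_list_all_ms
share_of_page_load = time_in_list_all_ms / page_load_ms

print(f"{time_in_list_all_ms} ms in ListAll, about {share_of_page_load:.0%} of the page load")
# prints "2560 ms in ListAll, about 66% of the page load"
```

On that estimate, roughly two thirds of each page load is spent waiting on the database.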
8. More Troubleshooting
Is the code doing obviously stupid things? NO
Has Linq screwed you over again? NO
Do you trust the driver? PROBABLY
Check the database
~ 400,000 documents (now ~690,000)
No indexes
10. Index analysis
                           Without action field   With action field
Query structure            (shown on slide)       (shown on slide)
Query time before index    334ms                  409ms
Index                      (shown on slide)       (shown on slide)
Query time after index     <1ms                   <1ms
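The query structures and index definitions were shown on the slide and are not reproduced here, so the following is only a sketch under assumptions: it guesses that the queries filter on EntityId, with or without Action, and builds matching index key specifications. The query shapes, sample values, and the helper name are hypothetical; `create_index` is the pymongo collection method, which returns the index name.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING

# Hypothetical query shapes; the real ones were on the slide.
query_without_action = {"EntityId": "example-gliph-id"}
query_with_action = {"EntityId": "example-gliph-id", "Action": "View"}

# Index key specifications in the (field, direction) form pymongo accepts.
index_without_action = [("EntityId", ASCENDING)]
index_with_action = [("EntityId", ASCENDING), ("Action", ASCENDING)]


def ensure_indexes(collection):
    """Create both indexes on a pymongo-style collection and return their names."""
    return [
        collection.create_index(index_without_action),
        collection.create_index(index_with_action),
    ]
```

A compound index whose first field is EntityId can also serve queries on EntityId alone, so one of these two may be enough. Measuring both, as the deck recommends, settles it.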
12. Gliph listings for Writers
Problems:
Slow loading
Sometimes erroring out
Reasons:
Indexes were no longer accurate
Code had changed
Solution:
New indexes
Remove old indexes
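A minimal sketch of how "new indexes, remove old indexes" could be planned: compare what the collection has with what the current queries need. The `existing` argument follows the shape returned by pymongo's `Collection.index_information()`; the function name and the sample index names are hypothetical.

```python
def plan_index_changes(existing, desired):
    """Compare existing indexes with the ones the current queries need.

    existing: mapping of index name -> {"key": [(field, direction), ...]},
              the shape returned by pymongo's Collection.index_information().
    desired:  list of key specifications, each a list of (field, direction).
    Returns (names_to_drop, keys_to_create).
    """
    desired_keys = [list(map(tuple, keys)) for keys in desired]
    existing_keys = {
        name: list(map(tuple, info["key"])) for name, info in existing.items()
    }
    names_to_drop = sorted(
        name for name, keys in existing_keys.items()
        if name != "_id_" and keys not in desired_keys  # never drop the _id index
    )
    keys_to_create = [
        keys for keys in desired_keys if keys not in existing_keys.values()
    ]
    return names_to_drop, keys_to_create


# Hypothetical state: one stale index left over from an older query.
existing = {
    "_id_": {"key": [("_id", 1)]},
    "OldField_1": {"key": [("OldField", 1)]},
}
desired = [[("EntityId", 1), ("Action", 1)]]
to_drop, to_create = plan_index_changes(existing, desired)
```

With pymongo, the names in `to_drop` could then be passed to `drop_index` and the specifications in `to_create` to `create_index`, ideally after testing on a copy of the data.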
13. What did I learn?
Know exactly what queries are being run
Don’t do a “best guess” on an index. Test them out
Don’t “forget” to add indexes
Ensure your indexes evolve as your queries do
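One way to test an index rather than guess is to inspect the query plan. The sketch below assumes the `queryPlanner` form of explain output, where an index scan appears as an IXSCAN stage and a full collection scan as COLLSCAN; the exact shape varies between server versions, and the sample plan is hand-written rather than taken from a real server.

```python
def plan_stages(plan):
    """Collect stage names from an explain() winning plan, outermost first."""
    stages = [plan.get("stage")]
    children = list(plan.get("inputStages") or [])
    if "inputStage" in plan:
        children = [plan["inputStage"]] + children
    for child in children:
        stages.extend(plan_stages(child))
    return stages


def uses_index(explain_output):
    """True if the winning plan scans an index and never the whole collection."""
    stages = plan_stages(explain_output["queryPlanner"]["winningPlan"])
    return "IXSCAN" in stages and "COLLSCAN" not in stages


# Hand-written sample in the assumed shape; not output from a real server.
sample = {"queryPlanner": {"winningPlan": {
    "stage": "FETCH",
    "inputStage": {"stage": "IXSCAN", "indexName": "EntityId_1_Action_1"},
}}}
```

Running a check like this against each real query, after every code change that touches a query, is one way to keep indexes in step with the queries.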
14. Any Questions?
james@glipho.com
glipho.com/james
@jamestoyer
Editor's notes
Who are you? What are you talking about? Mention how it got recognised. This is a case study… kinda
“Think of it like twitter for blogs”. You can bring your existing content with you for no cost
Writers are vain and lazy. Time filters. Up to 100 gliphs
Anonymous. 4 important fields for this
Insights page appeared to be taking an age to load. Could be temporary blip. Something that is just being a bit slow. Then a bunch of timeout errors from the page effectively not completing the map-reduce job. Coincidentally the gliph listing page for a writer started loading really slowly
Use New Relic
Not original figures – ran yesterday. These are averages over (3 x 3 = 9 passes)
obviously is not a healthy combination. My PC = Solid State. Production on AWS, even with 8 drives in RAID 10 (as recommended by the MongoDB documentation). MASSIVE FAIL!!!!
Can’t just add indexes… don’t know what queries are. We use Linq. Not as smart as you hope. Use “GetMongoQuery”
These are through the shell
As promised, listings for writers. This was AJAX so less pronounced. Used “GetMongoQuery” again
I know good developers who guess. Forget reasons: - prototype to production - do them later - forget from restore