The Importance of
Indexes in MongoDB
How we increased the loading speed of Profile Gliphs and Insights


James Toyer, Lead Software Engineer at Glipho
What is Glipho?

•   Social network for text-based content
•   Aims to better engage writers and readers
•   Original content only
•   Not an aggregator
•   Automatically share to Facebook, LinkedIn and Twitter
Insights Page

•   Load up the gliphs a writer has
•   Iterate through, and sum, actions for each gliph
•   Load and sum actions for the writer's profile
•   This can be over 100 calls to the database
    •   We know it’s inefficient but it does the job for now
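
The access pattern on this slide can be sketched as follows. This is a minimal illustration, not the Glipho code: `FakeActionStore`, `list_all`, `insights` and the identifiers are hypothetical stand-ins, and the store is an in-memory list rather than MongoDB. It only shows why one lookup per gliph plus one for the profile adds up to more than 100 database calls.

```python
from collections import Counter


class FakeActionStore:
    """In-memory stand-in for the actions collection (hypothetical).

    Counts how many times it is queried, to mimic database round trips.
    """

    def __init__(self, actions):
        self.actions = actions  # list of (entity_id, action) tuples
        self.calls = 0

    def list_all(self, entity_id):
        # One call here represents one query sent to the database.
        self.calls += 1
        return [action for eid, action in self.actions if eid == entity_id]


def insights(store, gliph_ids, profile_id):
    """Sum actions per gliph, then for the profile: one query per entity."""
    totals = {}
    for entity_id in [*gliph_ids, profile_id]:
        totals[entity_id] = Counter(store.list_all(entity_id))
    return totals


# Assumed workload: a writer with 100 gliphs and one profile.
gliph_ids = [f"gliph-{i}" for i in range(100)]
store = FakeActionStore([(g, "view") for g in gliph_ids] + [("profile-1", "view")])
totals = insights(store, gliph_ids, "profile-1")
round_trips = store.calls  # 100 gliphs + 1 profile
```

With no index behind each of those calls, every one of them pays the full cost of searching the collection, which is why the per-call time matters so much later in the talk.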
Insights Document Structure




•   Timestamp – when the action took place
•   EntityId – identifier of the original entity
•   ActionType – the type of the entity (probably should be entity type)
•   Action – the actual action that took place
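
As a rough picture of one such document, the four fields could be modelled like this. The field names come from the slide; the types, the `InsightAction` class name and the example values are assumptions made purely for illustration.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class InsightAction:
    """One recorded action (field types are assumed, not from the talk)."""

    Timestamp: datetime  # when the action took place
    EntityId: str        # identifier of the original entity
    ActionType: str      # the type of the entity
    Action: str          # the actual action that took place


# Placeholder values, not real Glipho data.
example = InsightAction(
    Timestamp=datetime(2013, 1, 1, 12, 0, tzinfo=timezone.utc),
    EntityId="gliph-123",
    ActionType="Gliph",
    Action="View",
)
document = asdict(example)  # plain dict, the shape a document store would hold
```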
Helpful Error Page
…it’s all gone wrong!
Troubleshooting


•   CPU spiking? NO
•   Memory high? NO
•   Disk IO high? NO
•   Are there any actual regular hits happening? NO
•   Do you know anything? NO


Crack out the code performance tools…
Pre-Index performance

•   3 passes on each filter page
•   Average time for each page to load = 3.9 seconds
•   “ListAll” method calls the database
•   “ListAll” is iterated over for each gliph in the database and the profile (in this case ~10 times)
•   Average time in “ListAll” 256ms
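
A quick back-of-the-envelope check of these figures, assuming the ~10 “ListAll” calls run one after another and that the averages apply to the same page load:

```python
calls_per_page = 10      # "~10 times" from the slide
avg_list_all_ms = 256    # average time in "ListAll"
page_load_ms = 3900      # average page load of 3.9 seconds

# Total time spent inside "ListAll" if the calls are sequential (assumption).
time_in_list_all_ms = calls_per_page * avg_list_all_ms

# Fraction of the page load that this accounts for.
share_of_page_load = time_in_list_all_ms / page_load_ms
```

Under those assumptions roughly 2.5 of the 3.9 seconds are spent waiting on the database, which points at the queries rather than the rendering code.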
More Troubleshooting


•   Is the code doing obviously stupid things? NO
•   Has Linq screwed you over again? NO
•   Do you trust the driver? PROBABLY
•   Check the database
    •   ~ 400,000 documents (now ~690,000)
    •   No indexes
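
Why ~400,000 documents with no indexes hurts can be shown with a toy model. This is a simplified sketch in plain Python, not MongoDB: the “index” here is a dictionary, whereas a real MongoDB index is a B-tree, and the data distribution (4,000 entities with 100 actions each) is an assumption. The point is only the number of documents that must be examined per query.

```python
from collections import defaultdict


def build_docs(n_docs, n_entities):
    """Generate placeholder action documents spread evenly over entities."""
    ids = [f"e{i}" for i in range(n_entities)]
    return [{"EntityId": ids[i % n_entities], "Action": "View"} for i in range(n_docs)]


def scan(docs, entity_id):
    """No index: every document has to be examined."""
    examined = 0
    matches = []
    for doc in docs:
        examined += 1
        if doc["EntityId"] == entity_id:
            matches.append(doc)
    return matches, examined


def build_index(docs, field):
    """Map each field value to the positions of the documents holding it."""
    index = defaultdict(list)
    for pos, doc in enumerate(docs):
        index[doc[field]].append(pos)
    return index


def indexed_lookup(docs, index, value):
    """With an index: only the matching documents are touched."""
    positions = index.get(value, [])
    return [docs[p] for p in positions], len(positions)


docs = build_docs(400_000, 4_000)
by_entity = build_index(docs, "EntityId")
scan_matches, scan_examined = scan(docs, "e42")
idx_matches, idx_examined = indexed_lookup(docs, by_entity, "e42")
```

Both lookups return the same documents, but the scan examines all 400,000 while the indexed lookup examines only the 100 that match.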
Know your query

•   GetMongoQuery code   Output
Index analysis

                                Without action field    With action field
•   Query structure             [image not preserved]   [image not preserved]
•   Query time before index     334ms                   409ms
•   Index                       [image not preserved]   [image not preserved]
•   Query time after index      <1ms                    <1ms
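
The query and index definitions on this slide were images and are not reproduced here. As a hedged illustration of the idea only, the sketch below derives a compound index key order from a query filter using the common guideline of equality fields first and range fields last. The filters, the `suggest_index_keys` helper and the assumption that the queries filter on a Timestamp range are all hypothetical; only the field names come from the talk.

```python
def suggest_index_keys(query_filter):
    """Order fields equality-first, range-last.

    Simplification: any dict-valued condition (e.g. {"$gte": ...}) is
    treated as a range; everything else is treated as an equality match.
    """
    equality = [f for f, cond in query_filter.items() if not isinstance(cond, dict)]
    ranges = [f for f, cond in query_filter.items() if isinstance(cond, dict)]
    return [(field, 1) for field in equality + ranges]  # 1 = ascending


# Hypothetical query shapes, one without and one with the Action field.
without_action = {
    "EntityId": "gliph-123",
    "ActionType": "Gliph",
    "Timestamp": {"$gte": "2013-01-01"},  # placeholder value
}
with_action = {**without_action, "Action": "View"}

keys_without = suggest_index_keys(without_action)
keys_with = suggest_index_keys(with_action)
```

The resulting key lists are candidates only; as the talk stresses, each candidate index should be tested against the real query before it is trusted.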
Post-Index performance

                                        Pre-index    Post-index
•   Page load time                      3.9s         72ms
•   “ListAll” method execution time     256ms        <0.2ms

•   Page load time decreased by 98%
•   “ListAll” method execution time decreased by 99%
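
The two percentages can be reproduced from the figures on the slide, taking 0.2 ms as the post-index “ListAll” time:

```python
def percent_decrease(before, after):
    """Relative decrease from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


page_load = percent_decrease(3900, 72)   # 3.9 s -> 72 ms, both in ms
list_all = percent_decrease(256, 0.2)    # 256 ms -> 0.2 ms
```

Rounded down, these give the 98% and 99% quoted on the slide.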
Gliph listings for Writers

•   Problems:
    •   Slow loading
    •   Sometimes erroring out
•   Reasons:
    •   Indexes were no longer accurate
    •   Code had changed
•   Solution:
    •   New indexes
    •   Remove old indexes
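
One way to catch indexes drifting away from the code is to audit them against the queries actually being run. The sketch below is a simplified, hypothetical check: it treats a query as supported when its fields form a leading prefix of an index, and reports queries with no supporting index and indexes that no query uses. The index names, field names and helper functions are invented for illustration; the database's own explain output remains the authoritative answer.

```python
def index_supports(index_fields, query_fields):
    """True if the query's fields are exactly a leading prefix of the index."""
    n = len(query_fields)
    return 0 < n <= len(index_fields) and set(index_fields[:n]) == set(query_fields)


def audit(indexes, queries):
    """Return (queries with no supporting index, names of unused indexes)."""
    unsupported = [
        q for q in queries
        if not any(index_supports(fields, q) for fields in indexes.values())
    ]
    unused = [
        name for name, fields in indexes.items()
        if not any(index_supports(fields, q) for q in queries)
    ]
    return unsupported, unused


# Hypothetical state after a code change: one index is stale, one query is new.
indexes = {
    "old_by_author": ["AuthorId", "Created"],
    "by_entity": ["EntityId", "ActionType"],
}
queries = [{"EntityId", "ActionType"}, {"WriterId", "Published"}]

unsupported, unused = audit(indexes, queries)
```

Here the new `WriterId`/`Published` query has no index behind it, and `old_by_author` no longer serves any query, which mirrors the “new indexes, remove old indexes” fix on the slide.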
What did I learn?



•   Know exactly what queries are being run
•   Don’t make a “best guess” at an index. Test it out
•   Don’t “forget” to add indexes
•   Ensure your indexes evolve as your queries do
Any Questions?

       james@glipho.com
       glipho.com/james
            @jamestoyer

Editor's notes

  1. Who are you? What are you talking about? Mention how it got recognised. This is a case study…kinda
  2. “Think of it like Twitter for blogs”. You can bring your existing content with you for no cost.
  3. Writers are vain and lazy. Time filters. Up to 100 gliphs.
  4. Anonymous. 4 important fields for this.
  5. Insights page appeared to be taking an age to load. Could be a temporary blip. Something that is just being a bit slow. Then a bunch of timeout errors from the page effectively not completing the map-reduce job. Coincidentally the gliph listing page for a writer started loading really slowly.
  6. Use New Relic
  7. Not original figures – ran yesterday. These are averages over (3 x 3 = 9 passes).
  8. obviously is not a healthy combination. My PC = solid state. Production on AWS, even with 8 drives in RAID 10 (as recommended by the MongoDB documentation). MASSIVE FAIL!!!!
  9. Can’t just add indexes…don’t know what the queries are. We use Linq. Not as smart as you hope. Use “GetMongoQuery”.
  10. These are through the shell
  11. As promised, listings for writers. This was AJAX so less pronounced. Used “GetMongoQuery” again.
  12. I know good developers who guess. Forget reasons: - prototype to production - do them later - forget from restore