Indexing documents
1. Indexing Documents in MongoDB
Alberto Lerner, Software Engineer, 10Gen, alerner@10gen.com
2. Indexing Basics
MongoDB can use separate tree structures to index a collection.
When processing a search criteria, MongoDB tries to avoid going through the whole collection by taking advantage of existing indices.
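To make the difference between scanning a collection and using an index concrete, here is a minimal sketch in plain JavaScript. It is a toy model, not MongoDB's implementation: the "collection" is an array, the "index" is a sorted array of key/position pairs searched with binary search, and the names `buildIndex`, `scanFind`, and `indexFind` are made up for this illustration.

```javascript
// Toy collection: 1000 documents with distinct sku values in shuffled order.
const collection = Array.from({ length: 1000 }, (_, i) => ({
  _id: i,
  sku: (i * 37) % 1000,
}));

// Toy index: one [key, position] entry per document, sorted by key.
function buildIndex(docs, field) {
  return docs.map((doc, pos) => [doc[field], pos]).sort((a, b) => a[0] - b[0]);
}

// No index: every document has to be looked at.
function scanFind(docs, field, value) {
  let steps = 0;
  const hits = [];
  for (const doc of docs) {
    steps++;
    if (doc[field] === value) hits.push(doc);
  }
  return { hits, steps };
}

// With an index: binary search for the first matching key, then fetch.
function indexFind(docs, index, value) {
  let lo = 0;
  let hi = index.length;
  let steps = 0;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    steps++;
    if (index[mid][0] < value) lo = mid + 1;
    else hi = mid;
  }
  const hits = [];
  while (lo < index.length && index[lo][0] === value) {
    hits.push(docs[index[lo][1]]);
    lo++;
  }
  return { hits, steps };
}

const skuIndex = buildIndex(collection, "sku");
console.log(scanFind(collection, "sku", 421).steps);            // one step per document
console.log(indexFind(collection, skuIndex, 421).steps);        // roughly log2 of the index size
```

Both functions return the same document; only the amount of work differs, which is the whole point of providing an index.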
3. Team Work
MongoDB's job: use an index, if possible.
[Figure: a SearchCriteria is answered either using an index or by scanning the collection]
4. Your Job
Provide indices for important queries.
Important queries? Those that are used very frequently, especially when a low response time is required.
5. Creating an Index
You have an automatic one over _id. Others can be created with 'ensureIndex'.
    // index over attribute 'name'
    db.<collection>.ensureIndex({name:1})
    // compound keys, ascending/descending
    db.<collection>.ensureIndex({name:1, date:-1})
    // unique keys
    db.<collection>.ensureIndex({sku:1}, {unique:true})
    // building in background
    db.<collection>.ensureIndex( …, {background:true})
6. Simple Search Criteria
The search criteria is the index key or a prefix thereof.
    db.<collection>.find({sku:1234})   // index over sku
    db.<collection>.find({sku:1234})   // index over sku, <xxx>
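The "index key or a prefix thereof" rule can be written down as a small check. This is only a sketch of the simplified rule stated on the slide, for equality criteria; `usesIndexPrefix` is a hypothetical helper, and the real query planner considers more cases than this.

```javascript
// Hypothetical helper: do the queried fields line up with a leading prefix
// of the compound index key? Field order inside the query does not matter,
// but the fields must cover the index key from the left without gaps.
function usesIndexPrefix(indexFields, queryFields) {
  const wanted = new Set(queryFields);
  if (wanted.size === 0 || wanted.size > indexFields.length) return false;
  // The first `wanted.size` index fields must be exactly the queried fields.
  return indexFields.slice(0, wanted.size).every((f) => wanted.has(f));
}

console.log(usesIndexPrefix(["sku", "color"], ["sku"]));          // true: prefix
console.log(usesIndexPrefix(["sku", "color"], ["color", "sku"])); // true: whole key
console.log(usesIndexPrefix(["sku", "color"], ["color"]));        // false: skips sku
```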
7. More Exact Matching
    // index over sku
    ….find({sku: {$in:[1234,5678]}})
    // index over 'product.sku'
    ….find({"product.sku":1234})
    // a tricky query, would need an index on 'product' instead
    ….find({product: {sku:1234}})
    { _id:1, product: {sku:1234} }   // matches
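The "tricky query" is tricky because a criteria on 'product' compares the whole embedded document, while a criteria on 'product.sku' compares just one nested field. The sketch below imitates the two comparisons in plain JavaScript. `matchesPath` and `matchesSubdocument` are illustrative names, array values along the path are not handled, and `JSON.stringify` stands in for a proper field-by-field comparison.

```javascript
// Dotted-path match: only the named nested field has to be equal.
function matchesPath(doc, path, value) {
  let current = doc;
  for (const part of path.split(".")) {
    if (current === null || typeof current !== "object") return false;
    current = current[part];
  }
  return current === value;
}

// Whole-sub-document match: the embedded document must be equal as a whole
// (same fields, same order, same values).
function matchesSubdocument(doc, field, expected) {
  return JSON.stringify(doc[field]) === JSON.stringify(expected);
}

const plain = { _id: 1, product: { sku: 1234 } };
const richer = { _id: 2, product: { sku: 1234, color: "blue" } };

console.log(matchesPath(plain, "product.sku", 1234));               // true
console.log(matchesPath(richer, "product.sku", 1234));              // true
console.log(matchesSubdocument(plain, "product", { sku: 1234 }));   // true
console.log(matchesSubdocument(richer, "product", { sku: 1234 }));  // false: extra field
```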
8. Range Criteria
A search criteria may return several results.
    db.<collection>.findOne({sku: {$gt:1234}})
    db.<collection>.find({sku: {$gt:5678, $lt:5699}})
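A range criteria maps naturally onto a sorted index: locate both ends of the range, then read everything in between. The following is a sketch over a sorted array of keys, assuming numeric sku values; `lowerBound`, `upperBound`, and `rangeScan` are names chosen for this example and do not correspond to MongoDB internals.

```javascript
// Toy index over sku: keys only, kept in ascending order.
const sortedSkus = [1200, 1234, 5678, 5680, 5690, 5699, 7000];

// First position whose key is >= value.
function lowerBound(keys, value) {
  let lo = 0;
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] < value) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// First position whose key is > value.
function upperBound(keys, value) {
  let lo = 0;
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] <= value) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Equivalent of {sku: {$gt: gt, $lt: lt}}: both bounds are exclusive.
function rangeScan(keys, gt, lt) {
  return keys.slice(upperBound(keys, gt), lowerBound(keys, lt));
}

console.log(rangeScan(sortedSkus, 5678, 5699)); // [ 5680, 5690 ]
```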
10. Other Operations
    // index over sku
    ….update({sku:1234}, {$inc:{sold:1}})
    ….remove({sku:1234})
11. Index Covering
Sometimes, all the needed information is in the index itself.
    ….count({color:"blue"})          // index over color
    ….find({sku:1234}, {color:1})    // index over sku, color
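The idea behind covering can be sketched by counting how often the documents themselves have to be fetched. This toy model keeps a compound index over (sku, color) as an array of entries and uses a linear filter instead of a tree lookup; `fetchDocument`, `findColorBySku`, and `findDescriptionBySku` are hypothetical names. In a real deployment, note that `_id` is returned by default, so it usually has to be excluded from the projection for a query to be answered from a (sku, color) index alone.

```javascript
const docs = [
  { _id: 1, sku: 1234, color: "blue", description: "long text" },
  { _id: 2, sku: 5678, color: "red", description: "long text" },
];

// Toy compound index over (sku, color): each entry carries both keys plus
// the position of the document it came from.
const skuColorIndex = docs
  .map((doc, pos) => ({ sku: doc.sku, color: doc.color, pos }))
  .sort((a, b) => a.sku - b.sku);

let documentFetches = 0;
function fetchDocument(pos) {
  documentFetches++;
  return docs[pos];
}

// Covered: the projection only needs fields that live in the index entry.
function findColorBySku(sku) {
  return skuColorIndex
    .filter((entry) => entry.sku === sku)
    .map((entry) => ({ color: entry.color }));
}

// Not covered: description is not in the index, so the document is fetched.
function findDescriptionBySku(sku) {
  return skuColorIndex
    .filter((entry) => entry.sku === sku)
    .map((entry) => ({ description: fetchDocument(entry.pos).description }));
}

console.log(findColorBySku(1234), documentFetches);       // no fetches so far
console.log(findDescriptionBySku(1234), documentFetches); // one fetch
```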
12. Missing Fields
All documents have an entry in an index. A missing field is indexed as null.
    // matches all documents without sku
    // if the index over sku is unique, there can be only one such document
    ….find({sku:null})
    // will be using a sku index, but not there yet
    ….find({sku:{$exists:true}})
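The consequence for unique indexes follows directly from "a missing field is indexed as null": two documents without the field would both want the null key. The sketch below models that with a `Map`; `indexKey` and `buildUniqueIndex` are illustrative names, and the error message is invented for this example.

```javascript
// A document without the field gets the key null.
function indexKey(doc, field) {
  return field in doc ? doc[field] : null;
}

// Toy unique index: a Map from key to _id that rejects repeated keys.
function buildUniqueIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const key = indexKey(doc, field);
    if (index.has(key)) {
      throw new Error(`duplicate key: ${key}`);
    }
    index.set(key, doc._id);
  }
  return index;
}

// One document without sku is fine: it owns the null entry.
const ok = buildUniqueIndex([{ _id: 1, sku: 1234 }, { _id: 2 }], "sku");
console.log(ok.get(null)); // 2

// A second document without sku collides on null.
try {
  buildUniqueIndex([{ _id: 1 }, { _id: 2 }], "sku");
} catch (err) {
  console.log(err.message); // duplicate key: null
}
```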
13. Array Matching
A field that contains an array will have one entry in the index per element in the array.
{ _id:"abcd", x:[2,10] } will appear in all the following queries using an index over x:
    ….find({x:2})
    ….find({x:10})
    ….find({x:[2,10]})
    ….find({x:{$gt:5}})   // because of 10
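"One entry per element" can be sketched as follows. This toy version only models the element-wise entries; matching the whole array `[2,10]` is not modeled, and `indexEntries`, `matchesEquals`, and `matchesGreaterThan` are names invented for the illustration.

```javascript
// One index entry per array element; a scalar value yields a single entry.
function indexEntries(doc, field) {
  const value = doc[field];
  const keys = Array.isArray(value) ? value : [value];
  return keys.map((key) => ({ key, id: doc._id }));
}

const entries = indexEntries({ _id: "abcd", x: [2, 10] }, "x");

// The document matches if any one of its entries satisfies the criteria.
function matchesEquals(indexed, value) {
  return indexed.some((entry) => entry.key === value);
}

function matchesGreaterThan(indexed, bound) {
  return indexed.some((entry) => entry.key > bound);
}

console.log(entries.length);                 // 2
console.log(matchesEquals(entries, 2));      // true
console.log(matchesEquals(entries, 10));     // true
console.log(matchesGreaterThan(entries, 5)); // true, because of 10
```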
14. Indexes and Ordering
Sort elimination is also accomplished through using indexes.
    ….find({sku:{$gt:56678}}).sort({sku:1})
    ….find().sort({sku:-1})   // can traverse backwards
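Sort elimination works because the index already holds the keys in order, so results can be produced by walking it forwards or backwards instead of sorting afterwards. A minimal sketch over a sorted array, with `ascendingFrom` and `descendingAll` as made-up names and a linear walk in place of a tree traversal:

```javascript
// Toy index over sku: keys are kept in ascending order at all times.
const skuKeys = [1001, 2040, 56678, 60000, 71234];

// Like find({sku: {$gt: bound}}).sort({sku: 1}): walk forward and keep the
// keys above the bound. No sort call is needed; the order comes for free.
function ascendingFrom(keys, bound) {
  const out = [];
  for (let i = 0; i < keys.length; i++) {
    if (keys[i] > bound) out.push(keys[i]);
  }
  return out;
}

// Like find().sort({sku: -1}): walk the same index backwards.
function descendingAll(keys) {
  const out = [];
  for (let i = keys.length - 1; i >= 0; i--) {
    out.push(keys[i]);
  }
  return out;
}

console.log(ascendingFrom(skuKeys, 56678)); // [ 60000, 71234 ]
console.log(descendingAll(skuKeys));        // [ 71234, 60000, 56678, 2040, 1001 ]
```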
15. Is It Using the Index?
The explain() tool allows you to see whether an index is being chosen.
    db.<collection>.find({sku:{$gt:5}}).explain()
    { "cursor" : "BtreeCursor sku_1", … }
16. Hinting
Sometimes we may force or avoid the use of an index. Usually, it should not be necessary to intervene.
    // forces use of index over sku
    ….find({sku:1, …}).hint({sku:1})
    // prevents any index from being used
    ….find({sku:1, …}).hint({$natural:1})
17. When Indexes Don't Help
    // negation
    ….find({sku:{$ne:9876}})
    // index helps to filter string sku's, though
    ….find({sku:/88/})   // generic regex
    // $where may contain very expressive searches
    // we don't even try
    ….find({$where:"this.sku==1234"})
18. Many Indices?
Evaluating a search criteria currently uses just one index, even if more than one would be possible.
The choice is based on previous executions; if an index "worked well" for a query before, it is likely to work well again.
Exception: $or can use more than one index.
19. So When to Index?
There is a trade-off between search criteria efficiency and the cost of inserting, updating, and deleting keys.
Also, there is a (quite high) limit on the number of indexes per collection (that we keep bumping up).
20. Indexes Resources
Indexes are memory mapped as well. Be mindful of the number of indexes and the choice of keys.
    // 'indexSizes' in the output lists the individual indexes in the collection
    db.<collection>.stats()
    // all indexes in the collection
    db.<collection>.totalIndexSize()
21. Take Away
The picture to keep in mind:
[Figure: SearchCriteria diagram]