SlideShare una empresa de Scribd logo
1 de 18
I have a good shard key now what? 
David Murphy , Mongo Master 
Lead DBA, ObjectRocket 
@dmurphy_data @objectrocket
Background 
• 16 yrs in databases, development, & system engineering 
• Lead DBA @ ObjectRocket 
• Mongo Master with a focus on sharding, chunks, and 
scaling mongo beyond normal means.
Quick Sharding Introduction 
Why you might need to shard: 
• Scaling Reads 
• Fast Queries 
• Scaling Writes 
• Balancing nodes 
What are things should you consider? 
• Shard Key is Immutable 
• Targeted vs Scatter-Gather 
• Sharding Overhead 
• Some commands no longer work 
http://bit.ly/1oXYDfm - Kenny Gorman sharding talk 
http://bit.ly/ZTtDI1 - Several other sharding talks
What does a shared cluster look like?
The balancer is great… 
… but you can also perform some proactive checks to predict issues. 
Understanding the reasons why not all chunks are equal 
Counting Chunks per shard without sh.status() 
Finding out how many chunkMove’s or splits occur at a time 
Getting the size of all chunks for a collection
Ways to Count Chunks 
Many functions for this: 
• sh.status() 
• db.collection.stats() 
Simple query for counting chunks: 
• db.chunks.count({ns:"database.collection"}); 
Why I prefer using Aggregation: 
• Faster as you only look at one type of data 
• Provides chunk count per shard 
• Can programmatically use the output
Tracking Chunks and Moves 
The config.changelog tracks chunk changes. 
The types of changes to chunks are 
• split 
• moveChunk 
• moveChunk.start 
• moveChunk.from 
• moveChunk.to 
• moveChunk.commit 
Chunk Collection Examples 
http://bit.ly/112tXEx 
ChangeLog Examples 
http://bit.ly/1oZlyqN
Sizing Chunks 
• dataSize command - Very expensive 
Scans all documents to be 100% accurate 
• find().count() * avgObjSize on chunk ranges 
Rough size estimate based on stats 
• ChunkHunter.py (On GitHub http://bit.ly/1yWQf9T ) 
New community script from ObjectRocket to: 
a) copy chunks meta data to new collection 
b) scan each chunk there with dataSize or count 
c) output summary
Example ChunkHunter Output
Example ChunkHunter Output 
Total Chunks in Cluster
Example ChunkHunter Output 
Total Chunks Populated
Example ChunkHunter Output 
Total Chunks processed
Example ChunkHunter Output 
>250k Documents per chunk
Example ChunkHunter Output 
>64M per Chunk
What did we add to the doc?
Some example queries might be 
• Give me all Jumbo:true chunks 
• Order Jumbo chunks by documents so I can split them 
• Order Jumbo chunks by size so I can split them 
• Aggregate Jumbo chunks to count them per shard 
• Give me all Jumbo on Shard “X” 
Find all these and more at: 
http://bit.ly/1oYVGeJ
The Future… 
Will be releasing tools to simplify: 
• Manual Splits 
• Moving of Chunks 
These are not replacements for the balancer but ways to 
help it stay on track if things deviate. As opposed to waiting 
for an incident and fixing it in a panic.
Contact 
@dmurphy_data 
@objectrocket 
david@objectrocket.com 
https://www.objectrocket.com 
WE ARE HIRING! (DBA,DEVOPS, and more) 
https://www.objectrocket.com/careers 
Resources: 
ChunkHunter.py http://bit.ly/1yWQf9T 
Chunk Hunter Example Scripts bit.ly/1oYVGeJ 
Chunk Collection Queries http://bit.ly/112tXEx 
Changelog Queries http://bit.ly/1oZlyqN 
Presentations: 
Kenny Gorman Sharding - bit.ly/1oXYDfm 
Other Sharding Links - bit.ly/ZTtDI1

Más contenido relacionado

La actualidad más candente

Info 2402 assignment 2_ crawler
Info 2402 assignment 2_ crawlerInfo 2402 assignment 2_ crawler
Info 2402 assignment 2_ crawler
Shahriar Rafee
 
mongoDB Performance
mongoDB PerformancemongoDB Performance
mongoDB Performance
Moshe Kaplan
 

La actualidad más candente (14)

Mongodb beijingconf yottaa_3.3
Mongodb beijingconf yottaa_3.3Mongodb beijingconf yottaa_3.3
Mongodb beijingconf yottaa_3.3
 
Drinking from the firehose: Logging at scale with ELK
Drinking from the firehose: Logging at scale with ELKDrinking from the firehose: Logging at scale with ELK
Drinking from the firehose: Logging at scale with ELK
 
Geospatial data
Geospatial dataGeospatial data
Geospatial data
 
Info 2402 assignment 2_ crawler
Info 2402 assignment 2_ crawlerInfo 2402 assignment 2_ crawler
Info 2402 assignment 2_ crawler
 
MongoSF - Spatial MongoDB in OpenShift - script file
MongoSF - Spatial MongoDB in OpenShift - script fileMongoSF - Spatial MongoDB in OpenShift - script file
MongoSF - Spatial MongoDB in OpenShift - script file
 
Database novelty detection
Database novelty detectionDatabase novelty detection
Database novelty detection
 
London MongoDB User Group April 2011
London MongoDB User Group April 2011London MongoDB User Group April 2011
London MongoDB User Group April 2011
 
Elastic meetup june16
Elastic meetup june16Elastic meetup june16
Elastic meetup june16
 
mongoDB Performance
mongoDB PerformancemongoDB Performance
mongoDB Performance
 
Elasticsearch - Zero to Hero
Elasticsearch - Zero to HeroElasticsearch - Zero to Hero
Elasticsearch - Zero to Hero
 
Alluxio
AlluxioAlluxio
Alluxio
 
RasterFrames + STAC
RasterFrames + STACRasterFrames + STAC
RasterFrames + STAC
 
InfluxDb and Grafana fighting with data
InfluxDb and Grafana fighting with dataInfluxDb and Grafana fighting with data
InfluxDb and Grafana fighting with data
 
MongoDB for Analytics
MongoDB for AnalyticsMongoDB for Analytics
MongoDB for Analytics
 

Destacado

China micro grid technology progress and prospects forecast report, 2013-2018
China micro grid technology progress and prospects forecast report, 2013-2018China micro grid technology progress and prospects forecast report, 2013-2018
China micro grid technology progress and prospects forecast report, 2013-2018
Qianzhan Intelligence
 
China excavator manufacturing industry production & marketing demand and inve...
China excavator manufacturing industry production & marketing demand and inve...China excavator manufacturing industry production & marketing demand and inve...
China excavator manufacturing industry production & marketing demand and inve...
Qianzhan Intelligence
 
China methanol industry market research and investment forecast report
China methanol industry market research and investment forecast reportChina methanol industry market research and investment forecast report
China methanol industry market research and investment forecast report
Qianzhan Intelligence
 
China midwestern cement industry production and marketing demand and investme...
China midwestern cement industry production and marketing demand and investme...China midwestern cement industry production and marketing demand and investme...
China midwestern cement industry production and marketing demand and investme...
Qianzhan Intelligence
 
China auto parts and components manufacturing industry in depth market resear...
China auto parts and components manufacturing industry in depth market resear...China auto parts and components manufacturing industry in depth market resear...
China auto parts and components manufacturing industry in depth market resear...
Qianzhan Intelligence
 
China fluorine chemical industry indepth research and investment strategic pl...
China fluorine chemical industry indepth research and investment strategic pl...China fluorine chemical industry indepth research and investment strategic pl...
China fluorine chemical industry indepth research and investment strategic pl...
Qianzhan Intelligence
 

Destacado (14)

China micro grid technology progress and prospects forecast report, 2013-2018
China micro grid technology progress and prospects forecast report, 2013-2018China micro grid technology progress and prospects forecast report, 2013-2018
China micro grid technology progress and prospects forecast report, 2013-2018
 
China dredging engineering industry development prospect and investment strat...
China dredging engineering industry development prospect and investment strat...China dredging engineering industry development prospect and investment strat...
China dredging engineering industry development prospect and investment strat...
 
ProTravelPlus - Direct sponsor Tim Sebert
ProTravelPlus -  Direct sponsor Tim SebertProTravelPlus -  Direct sponsor Tim Sebert
ProTravelPlus - Direct sponsor Tim Sebert
 
China excavator manufacturing industry production & marketing demand and inve...
China excavator manufacturing industry production & marketing demand and inve...China excavator manufacturing industry production & marketing demand and inve...
China excavator manufacturing industry production & marketing demand and inve...
 
China methanol industry market research and investment forecast report
China methanol industry market research and investment forecast reportChina methanol industry market research and investment forecast report
China methanol industry market research and investment forecast report
 
China midwestern cement industry production and marketing demand and investme...
China midwestern cement industry production and marketing demand and investme...China midwestern cement industry production and marketing demand and investme...
China midwestern cement industry production and marketing demand and investme...
 
China high end equipment manufacturing park development pattern and investmen...
China high end equipment manufacturing park development pattern and investmen...China high end equipment manufacturing park development pattern and investmen...
China high end equipment manufacturing park development pattern and investmen...
 
Quizzatrix
QuizzatrixQuizzatrix
Quizzatrix
 
China auto parts and components manufacturing industry in depth market resear...
China auto parts and components manufacturing industry in depth market resear...China auto parts and components manufacturing industry in depth market resear...
China auto parts and components manufacturing industry in depth market resear...
 
Prototyping With WordPress: No Coding Required
Prototyping With WordPress: No Coding RequiredPrototyping With WordPress: No Coding Required
Prototyping With WordPress: No Coding Required
 
Epoch final
Epoch finalEpoch final
Epoch final
 
China coal industry development trend and investment strategic decision repor...
China coal industry development trend and investment strategic decision repor...China coal industry development trend and investment strategic decision repor...
China coal industry development trend and investment strategic decision repor...
 
China fluorine chemical industry indepth research and investment strategic pl...
China fluorine chemical industry indepth research and investment strategic pl...China fluorine chemical industry indepth research and investment strategic pl...
China fluorine chemical industry indepth research and investment strategic pl...
 
China construction quality testing industry market forecast and competition s...
China construction quality testing industry market forecast and competition s...China construction quality testing industry market forecast and competition s...
China construction quality testing industry market forecast and competition s...
 

Similar a I have a good shard key now what - Advanced Sharding

Lightning Talk: MongoDB Sharding
Lightning Talk: MongoDB ShardingLightning Talk: MongoDB Sharding
Lightning Talk: MongoDB Sharding
MongoDB
 
2011 mongo sf-scaling
2011 mongo sf-scaling2011 mongo sf-scaling
2011 mongo sf-scaling
MongoDB
 

Similar a I have a good shard key now what - Advanced Sharding (20)

Lightning Talk: MongoDB Sharding
Lightning Talk: MongoDB ShardingLightning Talk: MongoDB Sharding
Lightning Talk: MongoDB Sharding
 
Probabilistic Data Structures (Edmonton Data Science Meetup, March 2018)
Probabilistic Data Structures (Edmonton Data Science Meetup, March 2018)Probabilistic Data Structures (Edmonton Data Science Meetup, March 2018)
Probabilistic Data Structures (Edmonton Data Science Meetup, March 2018)
 
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
 
MongoDB Pros and Cons
MongoDB Pros and ConsMongoDB Pros and Cons
MongoDB Pros and Cons
 
Big Data vs Data Warehousing
Big Data vs Data WarehousingBig Data vs Data Warehousing
Big Data vs Data Warehousing
 
Lightning Talk: What You Need to Know Before You Shard in 20 Minutes
Lightning Talk: What You Need to Know Before You Shard in 20 MinutesLightning Talk: What You Need to Know Before You Shard in 20 Minutes
Lightning Talk: What You Need to Know Before You Shard in 20 Minutes
 
Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)
 
5 Pitfalls to Avoid with MongoDB
5 Pitfalls to Avoid with MongoDB5 Pitfalls to Avoid with MongoDB
5 Pitfalls to Avoid with MongoDB
 
Managing your black friday logs Voxxed Luxembourg
Managing your black friday logs Voxxed LuxembourgManaging your black friday logs Voxxed Luxembourg
Managing your black friday logs Voxxed Luxembourg
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Sharding why,what,when, how
Sharding   why,what,when, howSharding   why,what,when, how
Sharding why,what,when, how
 
2011 mongo sf-scaling
2011 mongo sf-scaling2011 mongo sf-scaling
2011 mongo sf-scaling
 
MongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and QueueingMongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and Queueing
 
Silicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in productionSilicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in production
 
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
 
Python VS GO
Python VS GOPython VS GO
Python VS GO
 
MongoDB Basics Unileon
MongoDB Basics UnileonMongoDB Basics Unileon
MongoDB Basics Unileon
 
Approximate "Now" is Better Than Accurate "Later"
Approximate "Now" is Better Than Accurate "Later"Approximate "Now" is Better Than Accurate "Later"
Approximate "Now" is Better Than Accurate "Later"
 
DEFCON 23 - Jason Haddix - how do i shot web
DEFCON 23 - Jason Haddix - how do i shot webDEFCON 23 - Jason Haddix - how do i shot web
DEFCON 23 - Jason Haddix - how do i shot web
 
Design for Scale / Surge 2010
Design for Scale / Surge 2010Design for Scale / Surge 2010
Design for Scale / Surge 2010
 

Último

Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
chadhar227
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
gajnagarg
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
SayantanBiswas37
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
gajnagarg
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
gajnagarg
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
gajnagarg
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
nirzagarg
 

Último (20)

Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 

I have a good shard key now what - Advanced Sharding

  • 1. I have a good shard key now what? David Murphy , Mongo Master Lead DBA, ObjectRocket @dmurphy_data @objectrocket
  • 2. Background • 16 yrs in databases, development, & system engineering • Lead DBA @ ObjectRocket • Mongo Master with a focus on sharding, chunks, and scaling mongo beyond normal means.
  • 3. Quick Sharding Introduction Why you might need to shard: • Scaling Reads • Fast Queries • Scaling Writes • Balancing nodes What are things should you consider? • Shard Key is Immutable • Targeted vs Scatter-Gather • Sharding Overhead • Some commands no longer work http://bit.ly/1oXYDfm - Kenny Gorman sharding talk http://bit.ly/ZTtDI1 - Several other sharding talks
  • 4. What does a shared cluster look like?
  • 5. The balancer is great… … but you can also perform some proactive checks to predict issues. Understanding the reasons why not all chunks are equal Counting Chunks per shard without sh.status() Finding out how many chunkMove’s or splits occur at a time Getting the size of all chunks for a collection
  • 6. Ways to Count Chunks Many functions for this: • sh.status() • db.collection.stats() Simple query for counting chunks: • db.chunks.count({ns:"database.collection"}); Why I prefer using Aggregation: • Faster as you only look at one type of data • Provides chunk count per shard • Can programmatically use the output
  • 7. Tracking Chunks and Moves The config.changelog tracks chunk changes. The types of changes to chunks are • split • moveChunk • moveChunk.start • moveChunk.from • moveChunk.to • moveChunk.commit Chunk Collection Examples http://bit.ly/112tXEx ChangeLog Examples http://bit.ly/1oZlyqN
  • 8. Sizing Chunks • dataSize command - Very expensive Scans all documents to be 100% accurate • find().count() * avgObjSize on chunk ranges Rough size estimate based on stats • ChunkHunter.py (On GitHub http://bit.ly/1yWQf9T ) New community script from ObjectRocket to: a) copy chunks meta data to new collection b) scan each chunk there with dataSize or count c) output summary
  • 10. Example ChunkHunter Output Total Chunks in Cluster
  • 11. Example ChunkHunter Output Total Chunks Populated
  • 12. Example ChunkHunter Output Total Chunks processed
  • 13. Example ChunkHunter Output >250k Documents per chunk
  • 14. Example ChunkHunter Output >64M per Chunk
  • 15. What did we add to the doc?
  • 16. Some example queries might be • Give me all Jumbo:true chunks • Order Jumbo chunks by documents so I can split them • Order Jumbo chunks by size so I can split them • Aggregate Jumbo chunks to count them per shard • Give me all Jumbo on Shard “X” Find all these and more at: http://bit.ly/1oYVGeJ
  • 17. The Future… Will be releasing tools to simplify: • Manual Splits • Moving of Chunks These are not replacements for the balancer but ways to help it stay on track if things deviate. As opposed to waiting for an incident and fixing it in a panic.
  • 18. Contact @dmurphy_data @objectrocket david@objectrocket.com https://www.objectrocket.com WE ARE HIRING! (DBA,DEVOPS, and more) https://www.objectrocket.com/careers Resources: ChunkHunter.py http://bit.ly/1yWQf9T Chunk Hunter Example Scripts bit.ly/1oYVGeJ Chunk Collection Queries http://bit.ly/112tXEx Changelog Queries http://bit.ly/1oZlyqN Presentations: Kenny Gorman Sharding - bit.ly/1oXYDfm Other Sharding Links - bit.ly/ZTtDI1

Notas del editor

  1. collections - Contains collection name, if its dropped, and any shard key straight line query for similar results: db.chunks.aggregate({$match:{ns:"<<full.namespace>>"}},{$group:{_id:"$shard",count:{$sum:1}}})
  2. collections - Contains collection name, if its dropped, and any shard key
  3. Examples of aggregation commands?
  4. Caution on using dataSize?