SlideShare una empresa de Scribd logo
1 de 21
Sharding Internals
     Eliot Horowitz
     @eliothorowitz
        MongoSV
    December 3, 2010
MongoDB Sharding

• Scale horizontally for data size, index size,
  write and consistent read scaling
• Distribute databases, collections or a
  objects in a collection
• Auto-balancing, migrations, management
  happen with no down time
• Choose how you partition data
• Can convert from single master to sharded
  system with no downtime
• Same features as non-sharding single
  master
• Fully consistent
Range Based
        MIN        MAX      LOCATION
         A          F        shard1
         F          M        shard1
         M          R        shard2
         R          Z        shard3




• collection is broken into chunks by range
• chunks default to 64mb or 100,000 objects
Architecture
                     Shards
            mongod   mongod     mongod
                                               ...
 Config      mongod   mongod     mongod
 Servers

mongod

mongod

mongod               mongos    mongos    ...


                      client
Shards


• Can be master, master/slave or replica sets
• Replica sets gives sharding + full auto-
  failover
• Regular mongod processes
Config Servers

• 3 of them
• changes are made with 2 phase commit
• if any are down, meta data goes read only
• system is online as long as 1/3 is up
mongos

• Sharding Router
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on appserver so no extra network
  traffic
• Cache meta data from config servers
Writes

• Inserts : require shard key, routed
• Removes: routed and/or scattered
• Updates: routed or scattered
Queries

• By shard key: routed
• sorted by shard key: routed in order
• by non shard key: scatter gather
• sorted by non shard key: distributed merge
  sort
Splitting

• Take a chunk and split it in 2
• Splits on the median value
• Splits only change meta data, no data
  change
Splitting
T1
     MIN      MAX   LOCATION
      A        Z      shard1


T2
     MIN      MAX   LOCATION
      A        G       shard1
      G        Z       shard1


T3
     MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S      shard1
      S        Z      shard1
Balancing
• Moves chunks from one shard to another
• Done online while system is running
• Balancing runs in the background
• Any mongos can initiate
• All data transfer happens directly between
  shards
• Once initiated, mongos can go without
  issue
Migrating
T3   MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S      shard1
      S        Z      shard1

T4   MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S      shard1
      S        Z     shard2
T5
     MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S     shard2
      S        Z      shard2
Live Migrations
• Migrate initiated
• Initial data copied
• Delta copied until steady state + secondary
  ack
• Start commit
• Copy final delta, wait for secondary ack
• Finish commit
DEMO
Download MongoDB
      http://www.mongodb.org



   and
let
us
know
what
you
think
    @eliothorowitz



@mongodb


       10gen is hiring!
http://www.10gen.com/jobs
User profiles

• Partition by user_id
• Secondary indexes on location, dates, etc...
• Reads/writes know which shard to hit
User Activity Stream

• Shard by user_id
• Loading a user’s stream hits a single shard
• Writes are distributed across all shards
• Can index on activity for deleting
Photos


• Can shard by photo_id for best read/write
  distribution
• Secondary index on tags, date
Logging

Possible Shard Keys


• date
• machine, date
• logger name

Más contenido relacionado

Similar a 2010 mongo sv-shardinginternals

Scaling MongoDB (Mongo Austin)
Scaling MongoDB (Mongo Austin)Scaling MongoDB (Mongo Austin)
Scaling MongoDB (Mongo Austin)MongoDB
 
Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)MongoSF
 
Mongodb sharding
Mongodb shardingMongodb sharding
Mongodb shardingxiangrong
 
Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)MongoDB
 
2011 mongo sf-scaling
2011 mongo sf-scaling2011 mongo sf-scaling
2011 mongo sf-scalingMongoDB
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDBMongoDB
 
Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Randall Hunt
 
2011 mongo FR - scaling with mongodb
2011 mongo FR - scaling with mongodb2011 mongo FR - scaling with mongodb
2011 mongo FR - scaling with mongodbantoinegirbal
 
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)Ontico
 
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...leifwalsh
 
MongoDB Basic Concepts
MongoDB Basic ConceptsMongoDB Basic Concepts
MongoDB Basic ConceptsMongoDB
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to ShardingMongoDB
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB ShardingRob Walters
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceSasidhar Gogulapati
 
HPTS talk on micro-sharding with Katta
HPTS talk on micro-sharding with KattaHPTS talk on micro-sharding with Katta
HPTS talk on micro-sharding with KattaTed Dunning
 
HPTS talk on micro sharding with Katta
HPTS talk on micro sharding with Katta HPTS talk on micro sharding with Katta
HPTS talk on micro sharding with Katta MapR Technologies
 
Sharding
ShardingSharding
ShardingMongoDB
 
Advanced Administration, Monitoring and Backup
Advanced Administration, Monitoring and BackupAdvanced Administration, Monitoring and Backup
Advanced Administration, Monitoring and BackupMongoDB
 
2010 mongo berlin-shardinginternals (1)
2010 mongo berlin-shardinginternals (1)2010 mongo berlin-shardinginternals (1)
2010 mongo berlin-shardinginternals (1)MongoDB
 

Similar a 2010 mongo sv-shardinginternals (20)

Scaling MongoDB (Mongo Austin)
Scaling MongoDB (Mongo Austin)Scaling MongoDB (Mongo Austin)
Scaling MongoDB (Mongo Austin)
 
Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)
 
Mongodb sharding
Mongodb shardingMongodb sharding
Mongodb sharding
 
Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)Sharding with MongoDB (Eliot Horowitz)
Sharding with MongoDB (Eliot Horowitz)
 
2011 mongo sf-scaling
2011 mongo sf-scaling2011 mongo sf-scaling
2011 mongo sf-scaling
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDB
 
Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013
 
2011 mongo FR - scaling with mongodb
2011 mongo FR - scaling with mongodb2011 mongo FR - scaling with mongodb
2011 mongo FR - scaling with mongodb
 
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)
 
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
 
MongoDB Basic Concepts
MongoDB Basic ConceptsMongoDB Basic Concepts
MongoDB Basic Concepts
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB Sharding
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & Performance
 
HPTS talk on micro-sharding with Katta
HPTS talk on micro-sharding with KattaHPTS talk on micro-sharding with Katta
HPTS talk on micro-sharding with Katta
 
HPTS talk on micro sharding with Katta
HPTS talk on micro sharding with Katta HPTS talk on micro sharding with Katta
HPTS talk on micro sharding with Katta
 
Sharding
ShardingSharding
Sharding
 
Advanced Administration, Monitoring and Backup
Advanced Administration, Monitoring and BackupAdvanced Administration, Monitoring and Backup
Advanced Administration, Monitoring and Backup
 
2010 mongo berlin-shardinginternals (1)
2010 mongo berlin-shardinginternals (1)2010 mongo berlin-shardinginternals (1)
2010 mongo berlin-shardinginternals (1)
 

Más de MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Más de MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

2010 mongo sv-shardinginternals

  • 1. Sharding Internals Eliot Horowitz @eliothorowitz MongoSV December 3, 2010
  • 2. MongoDB Sharding • Scale horizontally for data size, index size, write and consistent read scaling • Distribute databases, collections or a objects in a collection • Auto-balancing, migrations, management happen with no down time
  • 3. • Choose how you partition data • Can convert from single master to sharded system with no downtime • Same features as non-sharding single master • Fully consistent
  • 4. Range Based MIN MAX LOCATION A F shard1 F M shard1 M R shard2 R Z shard3 • collection is broken into chunks by range • chunks default to 64mb or 100,000 objects
  • 5. Architecture Shards mongod mongod mongod ... Config mongod mongod mongod Servers mongod mongod mongod mongos mongos ... client
  • 6. Shards • Can be master, master/slave or replica sets • Replica sets gives sharding + full auto- failover • Regular mongod processes
  • 7. Config Servers • 3 of them • changes are made with 2 phase commit • if any are down, meta data goes read only • system is online as long as 1/3 is up
  • 8. mongos • Sharding Router • Acts just like a mongod to clients • Can have 1 or as many as you want • Can run on appserver so no extra network traffic • Cache meta data from config servers
  • 9. Writes • Inserts : require shard key, routed • Removes: routed and/or scattered • Updates: routed or scattered
  • 10. Queries • By shard key: routed • sorted by shard key: routed in order • by non shard key: scatter gather • sorted by non shard key: distributed merge sort
  • 11. Splitting • Take a chunk and split it in 2 • Splits on the median value • Splits only change meta data, no data change
  • 12. Splitting T1 MIN MAX LOCATION A Z shard1 T2 MIN MAX LOCATION A G shard1 G Z shard1 T3 MIN MAX LOCATION A D shard1 D G shard1 G S shard1 S Z shard1
  • 13. Balancing • Moves chunks from one shard to another • Done online while system is running • Balancing runs in the background • Any mongos can initiate • All data transfer happens directly between shards • Once initiated, mongos can go without issue
  • 14. Migrating T3 MIN MAX LOCATION A D shard1 D G shard1 G S shard1 S Z shard1 T4 MIN MAX LOCATION A D shard1 D G shard1 G S shard1 S Z shard2 T5 MIN MAX LOCATION A D shard1 D G shard1 G S shard2 S Z shard2
  • 15. Live Migrations • Migrate initiated • Initial data copied • Delta copied until steady state + secondary ack • Start commit • Copy final delta, wait for secondary ack • Finish commit
  • 16. DEMO
  • 17. Download MongoDB http://www.mongodb.org and
let
us
know
what
you
think @eliothorowitz



@mongodb 10gen is hiring! http://www.10gen.com/jobs
  • 18. User profiles • Partition by user_id • Secondary indexes on location, dates, etc... • Reads/writes know which shard to hit
  • 19. User Activity Stream • Shard by user_id • Loading a user’s stream hits a single shard • Writes are distributed across all shards • Can index on activity for deleting
  • 20. Photos • Can shard by photo_id for best read/write distribution • Secondary index on tags, date
  • 21. Logging Possible Shard Keys • date • machine, date • logger name

Notas del editor

  1. \n
  2. \n
  3. for inconsistent read scaling\n\n
  4. \n
  5. \n
  6. \n
  7. \n
  8. \n
  9. \n
  10. \n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. \n
  18. \n
  19. \n
  20. \n
  21. \n
  22. don’t shard by date\n\n
  23. \n
  24. \n