SlideShare una empresa de Scribd logo
1 de 29
Scaling with MongoDB
       Eliot Horowitz
       @eliothorowitz
        MongoAustin
      February 15, 2011
Scaling

• Storage needs only go up
• Operations/sec only go up
• Complexity only goes up
Horizontal Scaling

• Vertical scaling is limited
• Hard to scale vertically in the cloud
• Can scale wider than higher
Read Scaling

• One master at any time
• Programmer determines if read hits master
  or a slave
• Pro: easy to setup, can scale reads very well
• Con: reads are inconsistent on a slave
• Writes don’t scale
One Master, Many Slaves


• Custom Master/Slave setup
• Have as many slaves as you want
• Can put them local to application servers
• Good for 90+% read heavy applications
  (Wikipedia)
Replica Sets
• High Availability Cluster
• One master at any time, up to 6 slaves
• A slave automatically promoted to master if
  failure
• Drivers support auto routing of reads to
  slaves if programmer allows
• Good for applications that need high write
  availability but mostly reads (Commenting
  System)
Sharding

• Many masters, even more slaves
• Can scale in two dimensions
• Add Shards for write and data size scaling
• Add slaves for inconsistent read scaling and
  redundancy
Sharding Basics
• Data is split up into chunks
• Shard: Replica sets that hold a portion of
  the data
• Config Servers: Store meta data about
  system
• Mongos: Routers, direct direct and merge
  requests
Architecture
                             Shards

           mongod      mongod             mongod
                                                               ...
           mongod      mongod             mongod

           mongod      mongod             mongod
 Config
 Servers

mongod
                       mongos           mongos           ...
mongod

mongod
                    client    client   client   client
Common Setup

• A common setup is 3 shards with 3 servers
  per shard: 3 masters, 6 slaves
• Can add sharding later to an existing replica
  set with no down time
• Can have sharded and non-sharded
  collections
Range Based
       MIN         MAX        LOCATION
        A           F          shard1
        F           M          shard1
       M            R          shard2
        R           Z          shard3




• collection is broken into chunks by range
• chunks default to 64mb or 100,000 objects
Config Servers

• 3 of them
• changes are made with 2 phase commit
• if any are down, meta data goes read only
• system is online as long as 1/3 is up
mongos

• Sharding Router
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on appserver so no extra network
  traffic
• Cache meta data from config servers
Writes

• Inserts : require shard key, routed
• Removes: routed and/or scattered
• Updates: routed or scattered
Queries

• By shard key: routed
• sorted by shard key: routed in order
• by non shard key: scatter gather
• sorted by non shard key: distributed merge
  sort
Splitting

• Take a chunk and split it in 2
• Splits on the median value
• Splits only change meta data, no data
  change
Splitting
T1
     MIN      MAX   LOCATION
      A        Z      shard1


T2
     MIN      MAX   LOCATION
      A        G       shard1
      G        Z       shard1


T3
     MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S      shard1
      S        Z      shard1
Balancing

• Moves chunks from one shard to another
• Done online while system is running
• Balancing runs in the background
Migrating
T3   MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S      shard1
      S        Z      shard1

T4   MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S      shard1
      S        Z     shard2
T5
     MIN      MAX   LOCATION
      A        D      shard1
      D        G      shard1
      G        S     shard2
      S        Z      shard2
Choosing a Shard Key

• Shard key determines how data is
  partitioned
• Hard to change
• Most important performance decision
Use Case: User Profiles
  { email : “eliot@10gen.com” ,
      addresses : [ { state : “NY” } ]
  }
• Shard by email
• Lookup by email hits 1 node
• Index on { “addresses.state” : 1 }
Use Case: Activity
          Stream
  { user_id : XXX, event_id : YYY , data : ZZZ }
• Shard by user_id
• Looking up an activity stream hits 1 node
• Writing even is distributed
• Index on { “event_id” : 1 } for deletes
Use Case: Photos
  { photo_id : ???? , data : <binary> }
  What’s the right key?
• auto increment
• MD5( data )
• now() + MD5(data)
• month() + MD5(data)
Use Case: Logging
    { machine : “app.foo.com” , app : “apache” ,
     when : “2010-12-02:11:33:14” , data : XXX }
    Possible Shard keys
•   { machine : 1 }
•   { when : 1 }
•   { machine : 1 , app : 1 }
•   { app : 1 }
Roadmap
Past Releases
• First release - February 2009
• v1.0 - August 2009
• v1.2 - December 2009 - Map/Reduce, lots
  of small things
• v1.4 - March 2010 - Concurrency/Geo
• V1.6 - August 2010 - Sharding/Replica Sets
1.8

• Single Server Durability
• Covered Indexes
• Enhancements to Sharding/Replica Sets
Short List

• Better Aggregation
• Full Text Search
• TTL timeout collections
• Concurrency
• Compaction
Download MongoDB
      http://www.mongodb.org



   and
let
us
know
what
you
think
    @eliothorowitz



@mongodb


       10gen is hiring!
http://www.10gen.com/jobs

Más contenido relacionado

La actualidad más candente

Sharding
ShardingSharding
ShardingMongoDB
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to ShardingMongoDB
 
Webinar: Sharding
Webinar: ShardingWebinar: Sharding
Webinar: ShardingMongoDB
 
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...Lucidworks
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloudVarun Thacker
 
GIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big DataGIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big DataShalin Shekhar Mangar
 
Scaling Through Partitioning and Shard Splitting in Solr 4
Scaling Through Partitioning and Shard Splitting in Solr 4Scaling Through Partitioning and Shard Splitting in Solr 4
Scaling Through Partitioning and Shard Splitting in Solr 4thelabdude
 
Scaling Social Games
Scaling Social GamesScaling Social Games
Scaling Social GamesPaolo Negri
 
Sphinx - High performance full-text search for MySQL
Sphinx - High performance full-text search for MySQLSphinx - High performance full-text search for MySQL
Sphinx - High performance full-text search for MySQLNguyen Van Vuong
 
Using Sphinx for Search in PHP
Using Sphinx for Search in PHPUsing Sphinx for Search in PHP
Using Sphinx for Search in PHPMike Lively
 
Search at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, TwitterSearch at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, TwitterLucidworks
 
Mongrel2, a short introduction
Mongrel2, a short introductionMongrel2, a short introduction
Mongrel2, a short introductionPaolo Negri
 
Call me maybe: Jepsen and flaky networks
Call me maybe: Jepsen and flaky networksCall me maybe: Jepsen and flaky networks
Call me maybe: Jepsen and flaky networksShalin Shekhar Mangar
 
MongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB
 
Real time fulltext search with sphinx
Real time fulltext search with sphinxReal time fulltext search with sphinx
Real time fulltext search with sphinxAdrian Nuta
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandraAxel Liljencrantz
 
2010 mongo berlin-shardinginternals (1)
2010 mongo berlin-shardinginternals (1)2010 mongo berlin-shardinginternals (1)
2010 mongo berlin-shardinginternals (1)MongoDB
 
Keynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleKeynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleHBaseCon
 
Real time indexes in Sphinx, Yaroslav Vorozhko
Real time indexes in Sphinx, Yaroslav VorozhkoReal time indexes in Sphinx, Yaroslav Vorozhko
Real time indexes in Sphinx, Yaroslav VorozhkoFuenteovejuna
 

La actualidad más candente (20)

Sharding
ShardingSharding
Sharding
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
Webinar: Sharding
Webinar: ShardingWebinar: Sharding
Webinar: Sharding
 
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloud
 
Scaling search with SolrCloud
Scaling search with SolrCloudScaling search with SolrCloud
Scaling search with SolrCloud
 
GIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big DataGIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big Data
 
Scaling Through Partitioning and Shard Splitting in Solr 4
Scaling Through Partitioning and Shard Splitting in Solr 4Scaling Through Partitioning and Shard Splitting in Solr 4
Scaling Through Partitioning and Shard Splitting in Solr 4
 
Scaling Social Games
Scaling Social GamesScaling Social Games
Scaling Social Games
 
Sphinx - High performance full-text search for MySQL
Sphinx - High performance full-text search for MySQLSphinx - High performance full-text search for MySQL
Sphinx - High performance full-text search for MySQL
 
Using Sphinx for Search in PHP
Using Sphinx for Search in PHPUsing Sphinx for Search in PHP
Using Sphinx for Search in PHP
 
Search at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, TwitterSearch at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, Twitter
 
Mongrel2, a short introduction
Mongrel2, a short introductionMongrel2, a short introduction
Mongrel2, a short introduction
 
Call me maybe: Jepsen and flaky networks
Call me maybe: Jepsen and flaky networksCall me maybe: Jepsen and flaky networks
Call me maybe: Jepsen and flaky networks
 
MongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB Deployment Checklist
MongoDB Deployment Checklist
 
Real time fulltext search with sphinx
Real time fulltext search with sphinxReal time fulltext search with sphinx
Real time fulltext search with sphinx
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandra
 
2010 mongo berlin-shardinginternals (1)
2010 mongo berlin-shardinginternals (1)2010 mongo berlin-shardinginternals (1)
2010 mongo berlin-shardinginternals (1)
 
Keynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleKeynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! Scale
 
Real time indexes in Sphinx, Yaroslav Vorozhko
Real time indexes in Sphinx, Yaroslav VorozhkoReal time indexes in Sphinx, Yaroslav Vorozhko
Real time indexes in Sphinx, Yaroslav Vorozhko
 

Similar a Scaling MongoDB (Mongo Austin)

2011 mongo sf-sharding
2011 mongo sf-sharding2011 mongo sf-sharding
2011 mongo sf-shardingMongoDB
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDBMongoDB
 
2011 mongo sf-scaling
2011 mongo sf-scaling2011 mongo sf-scaling
2011 mongo sf-scalingMongoDB
 
Mongodb sharding
Mongodb shardingMongodb sharding
Mongodb shardingxiangrong
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB ShardingRob Walters
 
Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Randall Hunt
 
2011 mongo FR - scaling with mongodb
2011 mongo FR - scaling with mongodb2011 mongo FR - scaling with mongodb
2011 mongo FR - scaling with mongodbantoinegirbal
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to ShardingMongoDB
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceSasidhar Gogulapati
 
Sharding
ShardingSharding
ShardingMongoDB
 
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)Bob Pusateri
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDBMongoDB
 
2014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-22014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-2MongoDB
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to CassandraJon Haddad
 
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)Ontico
 
Cassandra tech talk
Cassandra tech talkCassandra tech talk
Cassandra tech talkSatish Mehta
 
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural LessonsCassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural LessonsDataStax
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDBMongoDB
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment StrategyMongoDB
 

Similar a Scaling MongoDB (Mongo Austin) (20)

2011 mongo sf-sharding
2011 mongo sf-sharding2011 mongo sf-sharding
2011 mongo sf-sharding
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDB
 
2011 mongo sf-scaling
2011 mongo sf-scaling2011 mongo sf-scaling
2011 mongo sf-scaling
 
Mongodb sharding
Mongodb shardingMongodb sharding
Mongodb sharding
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB Sharding
 
Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013
 
2011 mongo FR - scaling with mongodb
2011 mongo FR - scaling with mongodb2011 mongo FR - scaling with mongodb
2011 mongo FR - scaling with mongodb
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & Performance
 
Sharding
ShardingSharding
Sharding
 
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
2014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-22014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-2
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to Cassandra
 
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)
Новая архитектура шардинга MongoDB, Leif Walsh (Tokutek)
 
Cassandra tech talk
Cassandra tech talkCassandra tech talk
Cassandra tech talk
 
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural LessonsCassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
 

Más de MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Más de MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 

Último (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 

Scaling MongoDB (Mongo Austin)

  • 1. Scaling with MongoDB Eliot Horowitz @eliothorowitz MongoAustin February 15, 2011
  • 2. Scaling • Storage needs only go up • Operations/sec only go up • Complexity only goes up
  • 3. Horizontal Scaling • Vertical scaling is limited • Hard to scale vertically in the cloud • Can scale wider than higher
  • 4. Read Scaling • One master at any time • Programmer determines if read hits master or a slave • Pro: easy to setup, can scale reads very well • Con: reads are inconsistent on a slave • Writes don’t scale
  • 5. One Master, Many Slaves • Custom Master/Slave setup • Have as many slaves as you want • Can put them local to application servers • Good for 90+% read heavy applications (Wikipedia)
  • 6. Replica Sets • High Availability Cluster • One master at any time, up to 6 slaves • A slave automatically promoted to master if failure • Drivers support auto routing of reads to slaves if programmer allows • Good for applications that need high write availability but mostly reads (Commenting System)
  • 7. Sharding • Many masters, even more slaves • Can scale in two dimensions • Add Shards for write and data size scaling • Add slaves for inconsistent read scaling and redundancy
  • 8. Sharding Basics • Data is split up into chunks • Shard: Replica sets that hold a portion of the data • Config Servers: Store meta data about system • Mongos: Routers, direct direct and merge requests
  • 9. Architecture Shards mongod mongod mongod ... mongod mongod mongod mongod mongod mongod Config Servers mongod mongos mongos ... mongod mongod client client client client
  • 10. Common Setup • A common setup is 3 shards with 3 servers per shard: 3 masters, 6 slaves • Can add sharding later to an existing replica set with no down time • Can have sharded and non-sharded collections
  • 11. Range Based MIN MAX LOCATION A F shard1 F M shard1 M R shard2 R Z shard3 • collection is broken into chunks by range • chunks default to 64mb or 100,000 objects
  • 12. Config Servers • 3 of them • changes are made with 2 phase commit • if any are down, meta data goes read only • system is online as long as 1/3 is up
  • 13. mongos • Sharding Router • Acts just like a mongod to clients • Can have 1 or as many as you want • Can run on appserver so no extra network traffic • Cache meta data from config servers
  • 14. Writes • Inserts : require shard key, routed • Removes: routed and/or scattered • Updates: routed or scattered
  • 15. Queries • By shard key: routed • sorted by shard key: routed in order • by non shard key: scatter gather • sorted by non shard key: distributed merge sort
  • 16. Splitting • Take a chunk and split it in 2 • Splits on the median value • Splits only change meta data, no data change
  • 17. Splitting T1 MIN MAX LOCATION A Z shard1 T2 MIN MAX LOCATION A G shard1 G Z shard1 T3 MIN MAX LOCATION A D shard1 D G shard1 G S shard1 S Z shard1
  • 18. Balancing • Moves chunks from one shard to another • Done online while system is running • Balancing runs in the background
  • 19. Migrating T3 MIN MAX LOCATION A D shard1 D G shard1 G S shard1 S Z shard1 T4 MIN MAX LOCATION A D shard1 D G shard1 G S shard1 S Z shard2 T5 MIN MAX LOCATION A D shard1 D G shard1 G S shard2 S Z shard2
  • 20. Choosing a Shard Key • Shard key determines how data is partitioned • Hard to change • Most important performance decision
  • 21. Use Case: User Profiles { email : “eliot@10gen.com” , addresses : [ { state : “NY” } ] } • Shard by email • Lookup by email hits 1 node • Index on { “addresses.state” : 1 }
  • 22. Use Case: Activity Stream { user_id : XXX, event_id : YYY , data : ZZZ } • Shard by user_id • Looking up an activity stream hits 1 node • Writing even is distributed • Index on { “event_id” : 1 } for deletes
  • 23. Use Case: Photos { photo_id : ???? , data : <binary> } What’s the right key? • auto increment • MD5( data ) • now() + MD5(data) • month() + MD5(data)
  • 24. Use Case: Logging { machine : “app.foo.com” , app : “apache” , when : “2010-12-02:11:33:14” , data : XXX } Possible Shard keys • { machine : 1 } • { when : 1 } • { machine : 1 , app : 1 } • { app : 1 }
  • 26. Past Releases • First release - February 2009 • v1.0 - August 2009 • v1.2 - December 2009 - Map/Reduce, lots of small things • v1.4 - March 2010 - Concurrency/Geo • V1.6 - August 2010 - Sharding/Replica Sets
  • 27. 1.8 • Single Server Durability • Covered Indexes • Enhancements to Sharding/Replica Sets
  • 28. Short List • Better Aggregation • Full Text Search • TTL timeout collections • Concurrency • Compaction
  • 29. Download MongoDB http://www.mongodb.org and
let
us
know
what
you
think @eliothorowitz



@mongodb 10gen is hiring! http://www.10gen.com/jobs

Notas del editor

  1. \n
  2. \n
  3. What is scaling?\nWell - hopefully for everyone here.\n\n
  4. ec2 goes up to 64gb, maybe mention 256gb box here??? ($30-40k)\nmaybe can but 256gb box, but i spin up 10 ec2 64gb boxes in 10 minutes\n
  5. \n
  6. \n
  7. \n
  8. \n
  9. \n
  10. Don&amp;#x2019;t pre-emptively shard - easy to add later\n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. \n
  18. \n
  19. \n
  20. \n
  21. \n
  22. \n
  23. \n
  24. \n
  25. \n
  26. \n
  27. \n
  28. \n
  29. \n
  30. \n
  31. \n
  32. \n
  33. \n
  34. \n
  35. \n