MongoDB Best Practices in AWS
           Chris Harris
   Email : charris@10gen.com
      Twitter : cj_harris5
Terminology

RDBMS           MongoDB
Table           Collection
Row(s)          JSON Document
Index           Index
Join            Embedding & Linking
Partition       Shard
Partition Key   Shard Key
Here is a “simple” SQL Model
mysql> select * from book;
+----+----------------------------------------------------------+
| id | title                                                    |
+----+----------------------------------------------------------+
|  1 | The Demon-Haunted World: Science as a Candle in the Dark |
|  2 | Cosmos                                                   |
|  3 | Programming in Scala                                     |
+----+----------------------------------------------------------+
3 rows in set (0.00 sec)

mysql> select * from bookauthor;
+---------+-----------+
| book_id | author_id |
+---------+-----------+
|       1 |         1 |
|       2 |         1 |
|       3 |         2 |
|       3 |         3 |
|       3 |         4 |
+---------+-----------+
5 rows in set (0.00 sec)

mysql> select * from author;
+----+-----------+------------+-------------+-------------+---------------+
| id | last_name | first_name | middle_name | nationality | year_of_birth |
+----+-----------+------------+-------------+-------------+---------------+
|  1 | Sagan     | Carl       | Edward      | NULL        |          1934 |
|  2 | Odersky   | Martin     | NULL        | DE          |          1958 |
|  3 | Spoon     | Lex        | NULL        | NULL        |          NULL |
|  4 | Venners   | Bill       | NULL        | NULL        |          NULL |
+----+-----------+------------+-------------+-------------+---------------+
4 rows in set (0.00 sec)
The Same Data in MongoDB

{
    "_id" : ObjectId("4dfa6baa9c65dae09a4bbda5"),
    "title" : "Programming in Scala",
    "author" : [
        {
           "first_name" : "Martin",
           "last_name" : "Odersky",
           "nationality" : "DE",
           "year_of_birth" : 1958
        },
        {
           "first_name" : "Lex",
           "last_name" : "Spoon"
        },
        {
           "first_name" : "Bill",
           "last_name" : "Venners"
        }
    ]
}
Cursors

          $gt, $lt, $gte, $lte, $ne, $all, $in, $nin, $or,
        $not, $mod, $size, $exists, $type, $elemMatch


> var c = db.test.find({x: 20}).skip(20).limit(10)
> c.next()
> c.next()
...

                                query
                      first N results + cursor id


                        getMore w/ cursor id
                    next N results + cursor id or 0
                                  ...
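The query / getMore exchange in the diagram can be sketched in plain JavaScript. This is a toy in-memory model, not the MongoDB wire protocol or a driver API: `ToyServer`, `query`, `getMore`, `fetchAll`, and the batch size of 10 are hypothetical names and values chosen for illustration.

```javascript
// Toy model of the query / getMore exchange (not the real wire protocol).
// All names here (ToyServer, query, getMore, fetchAll) are illustrative assumptions.
class ToyServer {
  constructor(docs, batchSize) {
    this.docs = docs;
    this.batchSize = batchSize;
    this.cursors = new Map(); // cursor id -> matched documents and read position
    this.nextId = 1;
  }

  // Initial query: returns the first N results plus a cursor id (0 when exhausted).
  query(filter) {
    const matches = this.docs.filter(filter);
    const id = this.nextId++;
    this.cursors.set(id, { matches, pos: 0 });
    return this.getMore(id);
  }

  // Follow-up request: returns the next N results plus the cursor id, or 0 when done.
  getMore(cursorId) {
    const state = this.cursors.get(cursorId);
    if (!state) throw new Error("unknown cursor id " + cursorId);
    const batch = state.matches.slice(state.pos, state.pos + this.batchSize);
    state.pos += batch.length;
    const exhausted = state.pos >= state.matches.length;
    if (exhausted) this.cursors.delete(cursorId);
    return { batch, cursorId: exhausted ? 0 : cursorId };
  }
}

// Client side: keep calling getMore until the server answers with cursor id 0.
function fetchAll(server, filter) {
  const results = [];
  let roundTrips = 1;
  let reply = server.query(filter);
  results.push(...reply.batch);
  while (reply.cursorId !== 0) {
    reply = server.getMore(reply.cursorId);
    results.push(...reply.batch);
    roundTrips++;
  }
  return { results, roundTrips };
}

const docs = Array.from({ length: 25 }, (_, i) => ({ x: 20, n: i }));
const server = new ToyServer(docs, 10);
const outcome = fetchAll(server, (d) => d.x === 20);
console.log(outcome.results.length, outcome.roundTrips);
```

With 25 matching documents and batches of 10, the client needs the initial query plus two getMore calls before the server reports cursor id 0.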
Creating Indexes
An index on _id is automatic.
For more use ensureIndex:


    db.blogs.ensureIndex({author: 1})

    1 = ascending
    -1 = descending
Compound Indexes


db.blogs.save({
  author: "James",
  ts: new Date()
  ...
});

db.blogs.ensureIndex({author: 1, ts: -1})
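One way to picture the compound index above is as a list of keys sorted by author ascending and, within each author, by ts descending. The sketch below only models that ordering on in-memory objects; it says nothing about how MongoDB stores the index, and `compareKeys` and the sample posts are hypothetical.

```javascript
// Orders entries the way the key pattern {author: 1, ts: -1} describes:
// author ascending first, then ts descending within each author.
// compareKeys and the sample posts are illustrative assumptions.
function compareKeys(a, b) {
  if (a.author !== b.author) return a.author < b.author ? -1 : 1;
  return b.ts - a.ts; // newer timestamps sort first
}

const posts = [
  { author: "James", ts: new Date("2012-01-01T00:00:00Z") },
  { author: "Alice", ts: new Date("2012-03-01T00:00:00Z") },
  { author: "James", ts: new Date("2012-02-01T00:00:00Z") },
];

const ordered = posts.slice().sort(compareKeys);
console.log(ordered.map((p) => p.author + " " + p.ts.toISOString().slice(0, 10)));
```

In this ordering, all posts by one author sit next to each other with the newest first, which is why a "latest posts by author" query suits this key pattern.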
Indexing Embedded Documents
db.blogs.save({
  title: "My First blog",
  stats : { views: 0,
          followers: 0 }
});

db.blogs.ensureIndex({"stats.followers": -1})

db.blogs.find({"stats.followers": {$gt: 500}})
MongoDB on AWS
Four things to think about

1. Machine Sizing: Disk and Memory

2. Load Testing and Monitoring

3. Backup and restore

4. Ops Playbook
[Diagram, built up over several slides: Collection 1 and Index 1 are mapped into Virtual Address Space 1; the mapped size is your virtual memory size. The part of that mapping held in physical RAM is your resident memory size; the rest lives on disk. A second mapping, Virtual Address Space 2, is also shown. Closing comparison: 100 ns versus 10,000,000 ns (the images these figures labelled are missing; presumably RAM access versus disk access).]
Sizing RAM and Disk
• Working set
• Document size
• Memory versus disk
• Data lifecycle patterns
    • Long tail
    • Pure random
    • Bulk removes
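The memory-versus-disk point can be made concrete with a little arithmetic. The sketch below takes the 100 ns and 10,000,000 ns figures from the earlier slide and reads them as RAM and disk access times, which is an interpretation since the slide images are missing. `expectedAccessNs` is a hypothetical helper, and treating every access as costing exactly one of the two figures is a simplification.

```javascript
// Expected time per access when a fraction of accesses is served from RAM.
// RAM_NS and DISK_NS are the slide's 100 ns and 10,000,000 ns figures;
// treating them as fixed per-access costs is a simplifying assumption.
const RAM_NS = 100;
const DISK_NS = 10000000;

function expectedAccessNs(ramHitRatio) {
  if (ramHitRatio < 0 || ramHitRatio > 1) {
    throw new RangeError("ratio must be in [0, 1]");
  }
  return ramHitRatio * RAM_NS + (1 - ramHitRatio) * DISK_NS;
}

for (const ratio of [1, 0.99, 0.9]) {
  console.log(ratio, Math.round(expectedAccessNs(ratio)), "ns");
}
```

Under these assumptions, sending even 1% of accesses to disk raises the average from 100 ns to roughly 100,000 ns, which is why the working set should fit in RAM.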
Figuring out the working set
> db.wombats.stats()
{
     "ns" : "test.wombats",
     "count" : 1338330,
     "size" : 46915928,                   <- Size of data
     "avgObjSize" : 35.05557523181876,    <- Average document size
     "storageSize" : 86092032,            <- Size on disk (and in memory!)
     "numExtents" : 12,
     "nindexes" : 2,
     "lastExtentSize" : 20872960,
     "paddingFactor" : 1,
     "flags" : 0,
     "totalIndexSize" : 99860480,         <- Size of all indexes
     "indexSizes" : {                     <- Size of each index
          "_id_" : 55877632,
          "name_1" : 43982848
     },
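The figures in the stats() output can be turned into a rough RAM check. The sketch below adds storageSize and totalIndexSize from the slide to get an upper bound on what this one collection could want resident. `maxResidentBytes`, `fitsInRam`, and the 1 GiB RAM figure are hypothetical, and the real working set is often only part of this total.

```javascript
// Figures copied from the db.wombats.stats() output on the slide.
const stats = {
  storageSize: 86092032,
  totalIndexSize: 99860480,
};

// Upper bound on memory this collection could want resident:
// its on-disk storage plus all of its indexes. The real working set
// is often smaller; these helpers are illustrative assumptions.
function maxResidentBytes(s) {
  return s.storageSize + s.totalIndexSize;
}

function fitsInRam(s, ramBytes) {
  return maxResidentBytes(s) <= ramBytes;
}

const MiB = 1024 * 1024;
console.log((maxResidentBytes(stats) / MiB).toFixed(1) + " MiB");
console.log(fitsInRam(stats, 1024 * MiB)); // hypothetical 1 GiB of RAM
```

Note that the indexes here are larger than the data itself, so they cannot be left out of a sizing estimate.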
Disk configurations
Single Disk
                   ~200 seeks / second

RAID 0
                   ~200 seeks / second   ~200 seeks / second   ~200 seeks / second

RAID 10
                   ~400 seeks / second   ~400 seeks / second   ~400 seeks / second
Basic Tips
 • Favor instances with more memory over instances with more CPU cores
 • Use 64-bit instances
 • Use the XFS or ext4 file system
 • Use EBS in RAID: RAID 0 or 10 for the data volume, RAID 1 for configdb
Basic Installation Steps

1. Create your EC2 instance
2. Attach EBS storage
3. Make an ext4 file system
      $sudo mkfs -t ext4 /dev/[connection to volume]
4. Make a data directory
      $sudo mkdir -p /data/db
5. Mount the volume
      $sudo mount /dev/sdf /data/db
6. Install MongoDB
      $curl http://[mongodb download site] > m.tgz
      $tar xzf m.tgz
7. Start MongoDB
      $./[extracted directory]/bin/mongod
Types of outage

• Planned
  • Hardware upgrade
  • O/S or file-system tuning
  • Relocation of data to new file-system / storage
  • Software upgrade

• Unplanned
  • Hardware failure
  • Data center failure
  • Region outage
  • Human error
  • Application corruption
How MongoDB Replication works
[Diagram: Member 1, Member 2, Member 3]
•Set is made up of 2 or more nodes
How MongoDB Replication works
[Diagram: Member 1; Member 2 (PRIMARY); Member 3]
•Election establishes the PRIMARY
•Data replication from PRIMARY to SECONDARY
How MongoDB Replication works
[Diagram: Member 2 is DOWN; Members 1 and 3 negotiate a new master]
•PRIMARY may fail
•Automatic election of new PRIMARY if a majority exists
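The majority condition can be written down as a one-line check. This is a minimal sketch that assumes every member has exactly one vote; `hasMajority` is a hypothetical helper, not a MongoDB API.

```javascript
// A new PRIMARY can be elected only if the members that can still see
// each other form a strict majority of the whole set.
// hasMajority is an illustrative helper; it assumes one vote per member.
function hasMajority(totalMembers, reachableMembers) {
  return reachableMembers > Math.floor(totalMembers / 2);
}

console.log(hasMajority(3, 2)); // three-member set, one member down
console.log(hasMajority(2, 1)); // two-member set, one member down
```

The second call shows why a two-node set is fragile: losing either node leaves the survivor without a majority.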
How MongoDB Replication works
[Diagram: Member 1; Member 2 (DOWN); Member 3 (PRIMARY)]
•New PRIMARY elected
•Replica set re-established
How MongoDB Replication works
[Diagram: Member 1; Member 2 (RECOVERING); Member 3 (PRIMARY)]
•Automatic recovery
How MongoDB Replication works
[Diagram: Member 1; Member 2; Member 3 (PRIMARY)]
•Replica set re-established
Replica Set 0
     •Two nodes?
     •A network failure can split the nodes, which leaves the whole system read-only
Replica Set 1
     •Single datacenter
     •Single switch & power
     •Points of failure:
      •Power
      •Network
      •Datacenter
      •Two node failure
     •Automatic recovery of single node crash
Replica Set 3
[Diagram: one member in each of AZ:1, AZ:2, AZ:3]
•Single datacenter
•Multiple power/network zones
•Points of failure:
 •Datacenter
 •Two node failure
•Automatic recovery of single node crash
Replica Set 4
•Multi datacenter
•DR node for safety
•Can’t do a durable multi-data-center write safely, since there is only 1 node in the distant DC
Replica Set 5
      •Three data centers
      •Can survive full data center loss
      •Can do w= { dc : 2 } to guarantee write in 2 data centers
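The rule behind w = { dc : 2 } can be sketched as a check over member tags: a write counts as safe once the acknowledging members span at least two distinct values of the dc tag. In MongoDB, rules of this kind are normally expressed with replica set member tags and a named mode under settings.getLastErrorModes; the code below is only a plain-JavaScript model of the counting logic, and the host names, tag values, and `satisfies` helper are hypothetical.

```javascript
// Hypothetical member list: each member is tagged with its data center.
const members = [
  { host: "a.example.net", tags: { dc: "east" } },
  { host: "b.example.net", tags: { dc: "east" } },
  { host: "c.example.net", tags: { dc: "west" } },
  { host: "d.example.net", tags: { dc: "central" } },
];

// Returns true when the acknowledging hosts cover at least `required`
// distinct values of each tag in the rule, e.g. { dc: 2 } means
// acknowledgements from two different data centers.
function satisfies(rule, ackedHosts) {
  return Object.entries(rule).every(([tag, required]) => {
    const values = new Set(
      members
        .filter((m) => ackedHosts.includes(m.host) && tag in m.tags)
        .map((m) => m.tags[tag])
    );
    return values.size >= required;
  });
}

console.log(satisfies({ dc: 2 }, ["a.example.net", "b.example.net"]));
console.log(satisfies({ dc: 2 }, ["a.example.net", "c.example.net"]));
```

Two acknowledgements from the same data center do not satisfy the rule, while one from each of two data centers does.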
Scaling
http://community.qlikview.com/cfs-filesystemfile.ashx/__key/CommunityServer.Blogs.Components.WeblogFiles/theqlikviewblog/Cutting-Grass-with-Scissors-_2D00_-2.jpg
http://www.bitquill.net/blog/wp-content/uploads/2008/07/pack_of_harvesters.jpg
Sharding Across AZs
• Each shard is made up of a replica set
• Each replica set is distributed across availability zones for HA and data protection
[Diagram: shard members spread across AZ:1, AZ:2, AZ:3]
Balancing
[Diagram, built up over several slides: mongos with the balancer, three config servers, and Shards 1 to 4]
• Chunks! Initially each shard holds 12 chunks: Shard 1 has chunks 1-12, Shard 2 has 13-24, Shard 3 has 25-36, Shard 4 has 37-48
• Imbalance: Shard 1 holds chunks 1-12, while Shard 2 holds only 21-24, Shard 3 only 33-36, and Shard 4 only 45-48
• Move chunk 1 to Shard 2
• Chunk 1 now lives on Shard 2
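The balancing slides can be replayed with a small simulation. The sketch below starts from the imbalanced layout shown on the slides and moves one chunk at a time from the fullest shard to the emptiest one. The policy, the threshold of 2, and the `balance` function are illustrative assumptions, not MongoDB's actual balancing algorithm.

```javascript
// Toy balancer: repeatedly move one chunk from the fullest shard to the
// emptiest shard until their chunk counts differ by less than `threshold`.
// The function name, threshold, and policy are illustrative assumptions,
// not MongoDB's actual balancing algorithm.
function balance(shards, threshold) {
  if (threshold < 2) throw new RangeError("threshold must be at least 2");
  const moves = [];
  for (;;) {
    const names = Object.keys(shards);
    const fullest = names.reduce((a, b) => (shards[b].length > shards[a].length ? b : a));
    const emptiest = names.reduce((a, b) => (shards[b].length < shards[a].length ? b : a));
    if (shards[fullest].length - shards[emptiest].length < threshold) break;
    const chunk = shards[fullest].shift(); // take the lowest-numbered chunk
    shards[emptiest].push(chunk);
    moves.push({ chunk, from: fullest, to: emptiest });
  }
  return moves;
}

// Starting layout from the "Imbalance" slide.
const shards = {
  "Shard 1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
  "Shard 2": [21, 22, 23, 24],
  "Shard 3": [33, 34, 35, 36],
  "Shard 4": [45, 46, 47, 48],
};

const moves = balance(shards, 2);
console.log(moves[0]);
console.log(Object.values(shards).map((chunks) => chunks.length));
```

The first recorded move is chunk 1 from Shard 1 to Shard 2, matching the slide; the loop then keeps going until every shard holds the same number of chunks.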
Backup
Replica Set 3, with one node designated as the “Backup” node
           1. Lock the “Backup” node:
              db.fsyncLock()

           2. Check that it is locked
              db.currentOp()

           3. Take an EBS snapshot or a mongodump
               ec2-create-snapshot -d mybackup vol-nn

           4. Unlock
               db.fsyncUnlock()
Monitoring
Monitoring Tools
mongostat
MMS! - http://mms.10gen.com
munin, cacti, nagios - http://www.mongodb.org/display/DOCS/Monitoring+and+Diagnostics
download at mongodb.org

         We’re Hiring !
               Chris Harris
       Email : charris@10gen.com
          Twitter : cj_harris5

conferences, appearances, and meetups
     http://www.10gen.com/events

Más contenido relacionado

La actualidad más candente

Migrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMigrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMongoDB
 
Webinar: Deploying MongoDB to Production in Data Centers and the Cloud
Webinar: Deploying MongoDB to Production in Data Centers and the CloudWebinar: Deploying MongoDB to Production in Data Centers and the Cloud
Webinar: Deploying MongoDB to Production in Data Centers and the CloudMongoDB
 
Scaling MongoDB
Scaling MongoDBScaling MongoDB
Scaling MongoDBMongoDB
 
Running MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWSRunning MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWSMongoDB
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localyticsandrew311
 
MongoDB Pros and Cons
MongoDB Pros and ConsMongoDB Pros and Cons
MongoDB Pros and Consjohnrjenson
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsBenjamin Darfler
 
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops TeamEvolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops TeamMydbops
 
Challenges with MongoDB
Challenges with MongoDBChallenges with MongoDB
Challenges with MongoDBStone Gao
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisArnab Mitra
 
OSCON 2012 MongoDB Tutorial
OSCON 2012 MongoDB TutorialOSCON 2012 MongoDB Tutorial
OSCON 2012 MongoDB TutorialSteven Francia
 
Back to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingBack to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingMongoDB
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to ShardingMongoDB
 
MyRocks introduction and production deployment
MyRocks introduction and production deploymentMyRocks introduction and production deployment
MyRocks introduction and production deploymentYoshinori Matsunobu
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About ShardingMongoDB
 
Efficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajoEfficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajoHyunsik Choi
 
Benefits of using MongoDB: Reduce Complexity & Adapt to Changes
Benefits of using MongoDB: Reduce Complexity & Adapt to ChangesBenefits of using MongoDB: Reduce Complexity & Adapt to Changes
Benefits of using MongoDB: Reduce Complexity & Adapt to ChangesAlex Nguyen
 
MongoDB performance tuning and load testing, NOSQL Now! 2013 Conference prese...
MongoDB performance tuning and load testing, NOSQL Now! 2013 Conference prese...MongoDB performance tuning and load testing, NOSQL Now! 2013 Conference prese...
MongoDB performance tuning and load testing, NOSQL Now! 2013 Conference prese...ronwarshawsky
 

La actualidad más candente (20)

Migrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMigrating to MongoDB: Best Practices
Migrating to MongoDB: Best Practices
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
Webinar: Deploying MongoDB to Production in Data Centers and the Cloud
Webinar: Deploying MongoDB to Production in Data Centers and the CloudWebinar: Deploying MongoDB to Production in Data Centers and the Cloud
Webinar: Deploying MongoDB to Production in Data Centers and the Cloud
 
Scaling MongoDB
Scaling MongoDBScaling MongoDB
Scaling MongoDB
 
Running MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWSRunning MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWS
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localytics
 
MongoDB Pros and Cons
MongoDB Pros and ConsMongoDB Pros and Cons
MongoDB Pros and Cons
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localytics
 
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops TeamEvolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
 
Challenges with MongoDB
Challenges with MongoDBChallenges with MongoDB
Challenges with MongoDB
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
OSCON 2012 MongoDB Tutorial
OSCON 2012 MongoDB TutorialOSCON 2012 MongoDB Tutorial
OSCON 2012 MongoDB Tutorial
 
Back to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingBack to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to Sharding
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
MyRocks introduction and production deployment
MyRocks introduction and production deploymentMyRocks introduction and production deployment
MyRocks introduction and production deployment
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About Sharding
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
Efficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajoEfficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajo
 
Benefits of using MongoDB: Reduce Complexity & Adapt to Changes
Benefits of using MongoDB: Reduce Complexity & Adapt to ChangesBenefits of using MongoDB: Reduce Complexity & Adapt to Changes
Benefits of using MongoDB: Reduce Complexity & Adapt to Changes
 
MongoDB performance tuning and load testing, NOSQL Now! 2013 Conference prese...
MongoDB performance tuning and load testing, NOSQL Now! 2013 Conference prese...MongoDB performance tuning and load testing, NOSQL Now! 2013 Conference prese...
MongoDB performance tuning and load testing, NOSQL Now! 2013 Conference prese...
 

Destacado

Development to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB ClustersDevelopment to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB ClustersSeveralnines
 
Basic Sharding in MongoDB presented by Shaun Verch
Basic Sharding in MongoDB presented by Shaun VerchBasic Sharding in MongoDB presented by Shaun Verch
Basic Sharding in MongoDB presented by Shaun VerchMongoDB
 
Introduction to Ext JS 4
Introduction to Ext JS 4Introduction to Ext JS 4
Introduction to Ext JS 4Stefan Gehrig
 
Tomislav Capan - Muzika Hr (IT Showoff)
Tomislav Capan - Muzika Hr (IT Showoff)Tomislav Capan - Muzika Hr (IT Showoff)
Tomislav Capan - Muzika Hr (IT Showoff)IT Showoff
 
Logminingsurvey
LogminingsurveyLogminingsurvey
Logminingsurveydrewz lin
 
Adobe Digital Publishing Suite by dualpixel
Adobe Digital Publishing Suite by dualpixelAdobe Digital Publishing Suite by dualpixel
Adobe Digital Publishing Suite by dualpixeldualpixel
 
GreenWater Stakeholders Package
GreenWater Stakeholders PackageGreenWater Stakeholders Package
GreenWater Stakeholders PackageJeff Lemon
 
Pesquisa AvançAda Na Internet 2009
Pesquisa AvançAda Na Internet 2009Pesquisa AvançAda Na Internet 2009
Pesquisa AvançAda Na Internet 2009Luis Vidigal
 
Métricas em mídias sociais (versão 2010)
Métricas em mídias sociais (versão 2010)Métricas em mídias sociais (versão 2010)
Métricas em mídias sociais (versão 2010)Edney Souza
 
Medication Reconciliation in Electronic Health Information Exchange
Medication Reconciliation in Electronic Health Information ExchangeMedication Reconciliation in Electronic Health Information Exchange
Medication Reconciliation in Electronic Health Information ExchangeTomasz Adamusiak
 

Destacado (20)

Development to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB ClustersDevelopment to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB Clusters
 
Basic Sharding in MongoDB presented by Shaun Verch
Basic Sharding in MongoDB presented by Shaun VerchBasic Sharding in MongoDB presented by Shaun Verch
Basic Sharding in MongoDB presented by Shaun Verch
 
Introduction to Ext JS 4
Introduction to Ext JS 4Introduction to Ext JS 4
Introduction to Ext JS 4
 
Tomislav Capan - Muzika Hr (IT Showoff)
Tomislav Capan - Muzika Hr (IT Showoff)Tomislav Capan - Muzika Hr (IT Showoff)
Tomislav Capan - Muzika Hr (IT Showoff)
 
Logminingsurvey
LogminingsurveyLogminingsurvey
Logminingsurvey
 
Infolitigpart1
Infolitigpart1Infolitigpart1
Infolitigpart1
 
Adobe Digital Publishing Suite by dualpixel
Adobe Digital Publishing Suite by dualpixelAdobe Digital Publishing Suite by dualpixel
Adobe Digital Publishing Suite by dualpixel
 
GreenWater Stakeholders Package
GreenWater Stakeholders PackageGreenWater Stakeholders Package
GreenWater Stakeholders Package
 
document
documentdocument
document
 
Metodos en php
Metodos en phpMetodos en php
Metodos en php
 
Asp net (versione 1 e 2)
Asp net (versione 1 e 2)Asp net (versione 1 e 2)
Asp net (versione 1 e 2)
 
Catalog
CatalogCatalog
Catalog
 
Funciones A1t2
Funciones A1t2Funciones A1t2
Funciones A1t2
 
Pesquisa AvançAda Na Internet 2009
Pesquisa AvançAda Na Internet 2009Pesquisa AvançAda Na Internet 2009
Pesquisa AvançAda Na Internet 2009
 
Training for Foster Parents
Training for Foster ParentsTraining for Foster Parents
Training for Foster Parents
 
Recherche
RechercheRecherche
Recherche
 
Métricas em mídias sociais (versão 2010)
Métricas em mídias sociais (versão 2010)Métricas em mídias sociais (versão 2010)
Métricas em mídias sociais (versão 2010)
 
JSF2 and JSP
JSF2 and JSPJSF2 and JSP
JSF2 and JSP
 
00 a linguagem html
00 a linguagem html00 a linguagem html
00 a linguagem html
 
Medication Reconciliation in Electronic Health Information Exchange
Medication Reconciliation in Electronic Health Information ExchangeMedication Reconciliation in Electronic Health Information Exchange
Medication Reconciliation in Electronic Health Information Exchange
 

Similar a MongoDB Best Practices in AWS

MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsServer Density
 
Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101MongoDB
 
Backup, Restore, and Disaster Recovery
Backup, Restore, and Disaster RecoveryBackup, Restore, and Disaster Recovery
Backup, Restore, and Disaster RecoveryMongoDB
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...MongoDB
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment StrategyMongoDB
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment StrategiesMongoDB
 
MongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB
 
2013 london advanced-replication
2013 london advanced-replication2013 london advanced-replication
2013 london advanced-replicationMarc Schwering
 
Webinar: Understanding Storage for Performance and Data Safety
Webinar: Understanding Storage for Performance and Data SafetyWebinar: Understanding Storage for Performance and Data Safety
Webinar: Understanding Storage for Performance and Data SafetyMongoDB
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)MongoDB
 
Why databases cry at night
Why databases cry at nightWhy databases cry at night
Why databases cry at nightMichael Yarichuk
 
High Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudHigh Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudMongoDB
 
Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB MongoDB
 
Python mongo db-training-europython-2011
Python mongo db-training-europython-2011Python mongo db-training-europython-2011
Python mongo db-training-europython-2011Andreas Jung
 

Similar a MongoDB Best Practices in AWS (20)

MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
 
Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101
 
Backup, Restore, and Disaster Recovery
Backup, Restore, and Disaster RecoveryBackup, Restore, and Disaster Recovery
Backup, Restore, and Disaster Recovery
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: Sharding
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
 
Mongo db roma replication and sharding
Mongo db roma replication and shardingMongo db roma replication and sharding
Mongo db roma replication and sharding
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment Strategies
 
MongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: Sharding
 
2013 london advanced-replication
2013 london advanced-replication2013 london advanced-replication
2013 london advanced-replication
 
Webinar: Understanding Storage for Performance and Data Safety
Webinar: Understanding Storage for Performance and Data SafetyWebinar: Understanding Storage for Performance and Data Safety
Webinar: Understanding Storage for Performance and Data Safety
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
 
Introduction to Mongodb
Introduction to MongodbIntroduction to Mongodb
Introduction to Mongodb
 
Why databases cry at night
Why databases cry at nightWhy databases cry at night
Why databases cry at night
 
High Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudHigh Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal Cloud
 
Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB
 
Mongo db japan
Mongo db japanMongo db japan
Mongo db japan
 
Deployment
DeploymentDeployment
Deployment
 
Python mongo db-training-europython-2011
Python mongo db-training-europython-2011Python mongo db-training-europython-2011
Python mongo db-training-europython-2011
 
MongoDB
MongoDBMongoDB
MongoDB
 

MongoDB Best Practices in AWS

  • 1. MongoDB Best Practices in AWS Chris Harris Email : charris@10gen.com Twitter : cj_harris5
  • 2. Terminology RDBMS MongoDB Table Collection Row(s) JSON Document Index Index Join Embedding & Linking Partition Shard Partition Key Shard Key
  • 3. Here is a “simple” SQL Model
    mysql> select * from book;
    +----+----------------------------------------------------------+
    | id | title                                                    |
    +----+----------------------------------------------------------+
    |  1 | The Demon-Haunted World: Science as a Candle in the Dark |
    |  2 | Cosmos                                                   |
    |  3 | Programming in Scala                                     |
    +----+----------------------------------------------------------+
    3 rows in set (0.00 sec)

    mysql> select * from bookauthor;
    +---------+-----------+
    | book_id | author_id |
    +---------+-----------+
    |       1 |         1 |
    |       2 |         1 |
    |       3 |         2 |
    |       3 |         3 |
    |       3 |         4 |
    +---------+-----------+
    5 rows in set (0.00 sec)

    mysql> select * from author;
    +----+-----------+------------+-------------+-------------+---------------+
    | id | last_name | first_name | middle_name | nationality | year_of_birth |
    +----+-----------+------------+-------------+-------------+---------------+
    |  1 | Sagan     | Carl       | Edward      | NULL        |          1934 |
    |  2 | Odersky   | Martin     | NULL        | DE          |          1958 |
    |  3 | Spoon     | Lex        | NULL        | NULL        |          NULL |
    |  4 | Venners   | Bill       | NULL        | NULL        |          NULL |
    +----+-----------+------------+-------------+-------------+---------------+
    4 rows in set (0.00 sec)
  • 4. The Same Data in MongoDB
    {
      "_id" : ObjectId("4dfa6baa9c65dae09a4bbda5"),
      "title" : "Programming in Scala",
      "author" : [
        { "first_name" : "Martin", "last_name" : "Odersky", "nationality" : "DE", "year_of_birth" : 1958 },
        { "first_name" : "Lex", "last_name" : "Spoon" },
        { "first_name" : "Bill", "last_name" : "Venners" }
      ]
    }
  • 5. Cursors
    $gt, $lt, $gte, $lte, $ne, $all, $in, $nin, $or, $not, $mod, $size, $exists, $type, $elemMatch
    > var c = db.test.find({x: 20}).skip(20).limit(10)
    > c.next()
    > c.next()
    ...
    query -> first N results + cursor id
    getMore w/ cursor id -> next N results + cursor id or 0
    ...
  • 6. Creating Indexes An index on _id is automatic. For more use ensureIndex: db.blogs.ensureIndex({author: 1}) 1 = ascending -1 = descending
  • 7. Compound Indexes db.blogs.save({ author: "James", ts: new Date() ... }); db.blogs.ensureIndex({author: 1, ts: -1})
  • 8. Indexing Embedded Documents db.blogs.save({ title: "My First blog", stats : { views: 0, followers: 0 } }); db.blogs.ensureIndex({"stats.followers": -1}) db.blogs.find({"stats.followers": {$gt: 500}})
  • 10. Four things to think about 1. Machine Sizing: Disk and Memory 2. Load Testing and Monitoring 3. Backup and restore 4. Ops Play Book
  • 12. Collection 1 Virtual Address Space 1 Index 1
  • 13. Collection 1 Virtual Address Space 1 Index 1 This is your virtual memory size (mapped)
  • 14. Collection 1 Virtual Address Space 1 Physical RAM Index 1
  • 15. Collection 1 Virtual Address Space 1 Physical RAM Index 1 This is your resident memory size
  • 16. Collection 1 Virtual Disk Address Space 1 Physical RAM Index 1
  • 17. Collection 1 Virtual Disk Address Space 1 Physical RAM Index 1 Virtual Address Space 2
  • 18. Collection 1 Virtual Disk Address Space 1 Physical RAM Index 1 100 ns = 10,000,000 ns =
  • 19. Sizing RAM and Disk • Working set • Document Size • Memory versus disk • Data lifecycle patterns • Long tail • pure random • bulk removes
  • 20. Figuring out Working Set
    > db.wombats.stats()
    {
      "ns" : "test.wombats",
      "count" : 1338330,
      "size" : 46915928,                   // size of data
      "avgObjSize" : 35.05557523181876,    // average document size
      "storageSize" : 86092032,            // size on disk (and in memory!)
      "numExtents" : 12,
      "nindexes" : 2,
      "lastExtentSize" : 20872960,
      "paddingFactor" : 1,
      "flags" : 0,
      "totalIndexSize" : 99860480,         // size of all indexes
      "indexSizes" : {                     // size of each index
        "_id_" : 55877632,
        "name_1" : 43982848
      },
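The figures on slide 20 can be turned into a rough sizing estimate. The sketch below assumes that data size plus total index size is an upper bound on the working set (the real working set is usually the smaller, frequently touched part) and that per-document cost stays constant as the collection grows. The helper names `upperBoundBytes` and `projectBytes` are hypothetical, not mongo shell functions, and the 10,000,000 document target is illustrative.

```javascript
// Fields copied from the db.wombats.stats() output on the slide.
const stats = {
  count: 1338330,
  size: 46915928,
  totalIndexSize: 99860480,
  indexSizes: { _id_: 55877632, name_1: 43982848 },
};

// Hypothetical helper: all data plus all indexes, in bytes.
// This is a ceiling on the working set, not a measurement of it.
function upperBoundBytes(s) {
  return s.size + s.totalIndexSize;
}

// Hypothetical helper: extrapolate the same ceiling to a target document
// count, assuming data and index bytes per document stay constant.
function projectBytes(s, targetCount) {
  const bytesPerDocument = upperBoundBytes(s) / s.count;
  return Math.round(bytesPerDocument * targetCount);
}

const mb = (bytes) => (bytes / (1024 * 1024)).toFixed(1) + " MB";
console.log("today, data + indexes:", mb(upperBoundBytes(stats)));
console.log("at 10,000,000 docs:   ", mb(projectBytes(stats, 10000000)));
```

Comparing the projected figure with the RAM of a candidate instance type gives a first-pass answer to whether the collection and its indexes can stay resident.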
  • 21. Disk configurations Single Disk ~200 seeks / second
  • 22. Disk configurations Single Disk ~200 seeks / second RAID 0 ~200 seeks / second ~200 seeks / second ~200 seeks / second
  • 23. Disk configurations Single Disk ~200 seeks / second RAID 0 ~200 seeks / second ~200 seeks / second ~200 seeks / second RAID 10 ~400 seeks / second ~400 seeks / second ~400 seeks / second
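The arithmetic behind slides 21 to 23 can be sketched as follows. This is an idealized model, assuming every disk delivers the ~200 seeks / second quoted on the slides, that load spreads evenly across stripes, and that in RAID 10 either mirror can serve a read while every write goes to both mirrors. `estimateSeeks` is a hypothetical helper; measured EBS performance will differ.

```javascript
const SEEKS_PER_DISK = 200; // figure quoted on the slides

// Hypothetical helper: idealized seeks / second for a disk layout.
// `stripes` is the number of stripes (ignored for a single disk).
function estimateSeeks(layout, stripes) {
  switch (layout) {
    case "single":
      return { read: SEEKS_PER_DISK, write: SEEKS_PER_DISK };
    case "raid0":
      // Each stripe is one disk; reads and writes spread across stripes.
      return { read: stripes * SEEKS_PER_DISK, write: stripes * SEEKS_PER_DISK };
    case "raid10":
      // Each stripe is a mirrored pair: two disks can serve reads,
      // but every write must land on both of them.
      return { read: stripes * 2 * SEEKS_PER_DISK, write: stripes * SEEKS_PER_DISK };
    default:
      throw new Error("unknown layout: " + layout);
  }
}

console.log("single disk:", estimateSeeks("single", 1));
console.log("RAID 0 x3:  ", estimateSeeks("raid0", 3));
console.log("RAID 10 x3: ", estimateSeeks("raid10", 3));
```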
  • 24. Basic Tips • Prefer instances with more memory over instances with more CPU cores • Use 64-bit instances • Use an XFS or ext4 file system • Use EBS in RAID: RAID 0 or 10 for the data volume, RAID 1 for configdb
  • 25. Basic Installation Steps
    1. Create your EC2 instance
    2. Attach EBS storage
    3. Make an ext4 file system: $ sudo mkfs -t ext4 /dev/[connection to volume]
    4. Make a data directory: $ sudo mkdir -p /data/db
    5. Mount the volume: $ sudo mount /dev/sdf /data/db
    6. Install MongoDB: $ curl http://[mongodb download site] > m.tgz, then $ tar xzf m.tgz
    7. Start MongoDB: $ ./mongod (the server binary, in the bin directory of the extracted archive)
  • 26. Types of outage • Planned • Hardware upgrade • O/S or file-system tuning • Relocation of data to new file-system / storage • Software upgrade • Unplanned • Hardware failure • Data center failure • Region outage • Human error • Application corruption
  • 27. How MongoDB Replication works Member 1 Member 3 Member 2 •Set is made up of 2 or more nodes
  • 28. How MongoDB Replication works Member 1 Member 3 Member 2 PRIMARY •Election establishes the PRIMARY •Data replication from PRIMARY to SECONDARY
  • 29. How MongoDB Replication works negotiate new master Member 1 Member 3 Member 2 DOWN •PRIMARY may fail •Automatic election of new PRIMARY if majority exists
  • 30. How MongoDB Replication works Member 1 Member 3 PRIMARY Member 2 DOWN •New PRIMARY elected •Replication Set re-established
  • 31. How MongoDB Replication works Member 1 Member 3 PRIMARY Member 2 RECOVERING •Automatic recovery
  • 32. How MongoDB Replication works Member 1 Member 3 PRIMARY Member 2 •Replication Set re-established
  • 33. Replica Set 0 •Two Node? •Network failure can cause the nodes to split, which will result in the whole system going read-only
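The "majority exists" rule from the replication slides, and the reason a two-node set goes read-only, can be sketched in a few lines. This is a simplified model that assumes every member has exactly one vote and ignores priorities and arbiters; `majorityNeeded` and `canElectPrimary` are hypothetical helper names.

```javascript
// Hypothetical helper: votes needed for a strict majority of the set.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Hypothetical helper: can the members that still see each other
// elect (or keep) a PRIMARY?
function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majorityNeeded(votingMembers);
}

// Two-node set, one node lost or partitioned away:
// 1 of 2 votes is not a majority, so no PRIMARY and no writes.
console.log("2 nodes, 1 reachable:", canElectPrimary(2, 1));

// Three-node set, one node down: 2 of 3 votes is a majority.
console.log("3 nodes, 2 reachable:", canElectPrimary(3, 2));
```

Under this model a two-node set tolerates no failures at all, which is why the later slides start from three members.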
  • 34. Replica Set 1 •Single datacenter •Single switch & power •Points of failure: •Power •Network •Datacenter •Two node failure •Automatic recovery of single node crash
  • 35. Replica Set 3 •Single datacenter AZ:1 •Multiple power/network zones AZ:3 AZ:2 •Points of failure: •Datacenter •Two node failure •Automatic recovery of single node crash
  • 36.
  • 37. Replica Set 4 •Multi datacenter •DR node for safety •Can’t do multi data center durable write safely since only 1 node in distant DC
  • 38. Replica Set 5 •Three data centers •Can survive full data center loss •Can do w= { dc : 2 } to guarantee write in 2 data centers
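A rule such as { dc : 2 } is typically declared as a named mode under `settings.getLastErrorModes` in the replica set configuration, with each member carrying a `dc` tag; the rule is met once members covering two distinct `dc` values have acknowledged the write. The sketch below models that check outside the database. The set name, host names, tag values and the mode name `multiDC` are illustrative assumptions, and `satisfiesMode` is a hypothetical helper rather than a MongoDB function.

```javascript
// Hypothetical replica set configuration spanning three data centers.
const replicaSetConfig = {
  _id: "rs5",
  members: [
    { _id: 0, host: "node-a.example:27017", tags: { dc: "dc1" } },
    { _id: 1, host: "node-b.example:27017", tags: { dc: "dc1" } },
    { _id: 2, host: "node-c.example:27017", tags: { dc: "dc2" } },
    { _id: 3, host: "node-d.example:27017", tags: { dc: "dc2" } },
    { _id: 4, host: "node-e.example:27017", tags: { dc: "dc3" } },
  ],
  settings: { getLastErrorModes: { multiDC: { dc: 2 } } },
};

// Hypothetical helper: true if the acknowledging members cover the
// required number of distinct values for every tag named in the mode.
function satisfiesMode(config, modeName, ackedMemberIds) {
  const mode = config.settings.getLastErrorModes[modeName];
  const acked = config.members.filter((m) => ackedMemberIds.includes(m._id));
  return Object.entries(mode).every(([tag, required]) => {
    const distinctValues = new Set(
      acked.filter((m) => m.tags && tag in m.tags).map((m) => m.tags[tag])
    );
    return distinctValues.size >= required;
  });
}

// Both acknowledgements come from dc1: only one data center has the write.
console.log(satisfiesMode(replicaSetConfig, "multiDC", [0, 1]));
// Acknowledgements from dc1 and dc2: two data centers have the write.
console.log(satisfiesMode(replicaSetConfig, "multiDC", [0, 2]));
```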
  • 42. Sharding Across AZs • Each Shard is made up of a Replica Set • Each Replica Set is distributed across availability zones for HA and data protection AZ:1 AZ:3 AZ:2
  • 43. Balancing mongos config balancer config Chunks! config 1 2 3 4 13 14 15 16 25 26 27 28 37 38 39 40 5 6 7 8 17 18 19 20 29 30 31 32 41 42 43 44 9 10 11 12 21 22 23 24 33 34 35 36 45 46 47 48 Shard 1 Shard 2 Shard 3 Shard 4
  • 44. Balancing mongos config balancer config Imbalance Imbalance config 1 2 3 4 5 6 7 8 9 10 11 12 21 22 23 24 33 34 35 36 45 46 47 48 Shard 1 Shard 2 Shard 3 Shard 4
  • 45. Balancing mongos config balancer config Move chunk 1 to config Shard 2 1 2 3 4 5 6 7 8 9 10 11 12 21 22 23 24 33 34 35 36 45 46 47 48 Shard 1 Shard 2 Shard 3 Shard 4
  • 46. Balancing mongos config balancer config config 1 2 3 4 5 6 7 8 9 10 11 12 21 22 23 24 33 34 35 36 45 46 47 48 Shard 1 Shard 2 Shard 3 Shard 4
  • 47. Balancing mongos config balancer config config 2 3 4 5 6 7 8 1 9 10 11 12 21 22 23 24 33 34 35 36 45 46 47 48 Shard 1 Shard 2 Shard 3 Shard 4
  • 48. Balancing mongos config balancer config Chunk 1 now lives on Shard 2 config 2 3 4 5 6 7 8 1 9 10 11 12 21 22 23 24 33 34 35 36 45 46 47 48 Shard 1 Shard 2 Shard 3 Shard 4
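The balancing slides can be approximated by a small simulation: while the gap between the fullest and emptiest shard is at least some threshold, move one chunk from the former to the latter. This is a sketch of the idea only. The starting chunk counts and the threshold of 2 are assumptions, `balance` is a hypothetical function, and the real balancer's thresholds and chunk selection depend on the MongoDB version.

```javascript
// Hypothetical helper: repeatedly move one chunk from the most loaded
// shard to the least loaded one until the gap is below `threshold`.
function balance(chunkCounts, threshold) {
  if (threshold < 2) {
    // A gap of 1 can never be closed by moving a whole chunk.
    throw new Error("threshold must be at least 2");
  }
  const counts = chunkCounts.slice(); // do not mutate the caller's array
  const moves = [];
  for (;;) {
    let max = 0;
    let min = 0;
    for (let i = 1; i < counts.length; i++) {
      if (counts[i] > counts[max]) max = i;
      if (counts[i] < counts[min]) min = i;
    }
    if (counts[max] - counts[min] < threshold) break;
    counts[max] -= 1;
    counts[min] += 1;
    moves.push({ fromShard: max + 1, toShard: min + 1 });
  }
  return { counts, moves };
}

// Assumed starting point: 48 chunks, half of them on Shard 1.
const result = balance([24, 8, 8, 8], 2);
console.log("chunks per shard after balancing:", result.counts);
console.log("chunk moves performed:", result.moves.length);
```

Each move corresponds to the "Move chunk 1 to Shard 2" step on the slides: the chunk is copied, the config servers are updated, and the chunk then lives on the target shard.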
  • 50. Replica Set 3 1. Lock the “Backup” node: db.fsyncLock() 2. Check that it is locked: db.currentOp() 3. Take an EBS snapshot or mongodump: ec2-create-snapshot -d mybackup vol-nn 4. Unlock: db.fsyncUnlock()
  • 52. Monitoring Tools mongostat - MMS! - http://mms.10gen.com munin, cacti, nagios - http://www.mongodb.org/display/DOCS/Monitoring+and+Diagnostics
  • 53. download at mongodb.org We’re Hiring ! Chris Harris Email : charris@10gen.com Twitter : cj_harris5 conferences, appearances, and meetups http://www.10gen.com/events

Editor's notes

  10. The things I’m going to talk about are completely inter-related and intertwined. There will be talks that go into much greater detail on these topics. Armed with the information you gather and confident in the skills your team has practiced, you should be able to spot long-term problems well before it’s too late and handle the emergencies that are sure to arise.
  11. Add a second process just to illustrate what happens when you have more than one process contending for RAM.
  19. Since we’re talking about data stores, specifically MongoDB, before you do anything else at all, you need to understand your data. How big is your data set in total? How big is your working set, that is, the size of the data and indexes that need to fit in RAM? Reads vs. writes? (example and use case) Long tail or random access? (example) Armed with this knowledge, you can accommodate massive growth spurts without excessive over-provisioning. Random access: take a user database. Long tail: a Twitter feed. You need to be ready for 1MM users, how do I size my ...? Use collection.stats to extrapolate.
  21. Using standard enterprise spinning disks you can get about 200 seeks / second. So you want to be thinking about how you can increase your seeks / second.
  22. Here, if you can imagine that you’re not pulling all your data from a single partition, you can actually increase your throughput by spreading the load across multiple stripes, in this case gaining potentially three times the speed.
  23. We typically recommend running RAID 10 in production, which adds a mirror volume for each stripe. We’ve found that this configuration works out well for most use cases. You get the benefit of increased redundancy and parallelization, despite the cost of writing each update to two volumes.