SlideShare una empresa de Scribd logo
1 de 58
How to Achieve Scale with 
MongoDB 
Jake Angerman 
Sr. Solutions Architect, MongoDB
Today’s Webinar Agenda 
Schema Design 
Indexes 
Monitoring your Workload 
Achieve Scale 
Optimization Tips 
Scale Vertically 
Horizontal 
Scaling 
1 
2 
3
Optimization Tips to 
Scale Your App
Premature Optimization 
• There is no doubt that the grail of efficiency leads to abuse. 
Programmers waste enormous amounts of time thinking about, 
or worrying about, the speed of noncritical parts of their 
programs, and these attempts at efficiency actually have a strong 
negative impact when debugging and maintenance are 
considered. We should forget about small efficiencies, say about 
97% of the time: premature optimization is the root of all evil. 
Yet we should not pass up our opportunities in that critical 3%. 
- Donald Knuth, 1974
Premature Optimization 
• "There is no doubt that the grail of efficiency leads to abuse. 
Programmers waste enormous amounts of time thinking about, 
or worrying about, the speed of noncritical parts of their 
programs, and these attempts at efficiency actually have a strong 
negative impact when debugging and maintenance are 
considered. We should forget about small efficiencies, say about 
97% of the time: premature optimization is the root of all evil. 
Yet we should not pass up our opportunities in that critical 3%." 
- Donald Knuth, 1974
Premature Optimization 
• "There is no doubt that the grail of efficiency leads to abuse. 
Programmers waste enormous amounts of time thinking about, 
or worrying about, the speed of noncritical parts of their 
programs, and these attempts at efficiency actually have a strong 
negative impact when debugging and maintenance are 
considered. We should forget about small efficiencies, say 
about 97% of the time: premature optimization is the root of 
all evil. Yet we should not pass up our opportunities in that 
critical 3%." 
- Donald Knuth, 1974
Schema Design 
• Document Model 
• Dynamic Schema 
• Collections 
{ "customer_id" : 123, 
"first_name" : ”John", 
"last_name" : "Smith", 
"address" : { 
"street": "123 Main Street", 
"city": "Houston", 
"state": "TX", 
"zip_code": "77027" 
} 
policies: [ { 
policy_number : 13, 
description: “short term”, 
deductible: 500 
}, 
{ policy_number : 14, 
description: “dental”, 
visits: […] 
} ] 
}
The Importance of Schema Design 
• MongoDB schemas are built oppositely than relational 
schemas! 
• Relational Schema: 
– normalize data 
– write complex queries to join the data 
– let the query planner figure out how to make queries efficient 
• MongoDB Schema: 
– denormalize the data 
– create a (potentially complex) schema with prior knowledge of your 
actual (not just predicted) query patterns 
– write simple queries
Real World Example: Optimizing Schema for 
Scale 
Product catalog schema for retailer selling in 20 countries 
{ 
_id: 375, 
en_US: { name: …, description: …, <etc…> }, 
en_GB: { name: …, description: …, <etc…> }, 
fr_FR: { name: …, description: …, <etc…> }, 
fr_CA: { name: …, description: …, <etc…> }, 
de_DE: …, 
de_CH: …, 
<… and so on for other locales …> 
}
What's good about this schema? 
• Each document contains all the data about the 
product across all possible locales. 
• It is the most efficient way to retrieve all translations of 
a product in a single query (English, French, German, 
etc).
But that's not how the data was accessed 
db.catalog.find( { _id: 375 }, { en_US: true } ); 
db.catalog.find( { _id: 375 }, { fr_FR: true } ); 
db.catalog.find( { _id: 375 }, { de_DE: true } ); 
… and so forth for other locales 
The data model did not fit the access pattern.
Why is this inefficient? 
Data in RED are 
being used. Data in 
BLUE take up 
memory but are not in 
demand. 
{ 
_id: 375, 
en_US: { name: …, description: …, <etc…> }, 
en_GB: { name: …, description: …, <etc…> }, 
fr_FR: { name: …, description: …, <etc…> }, 
fr_CA: { name: …, description: …, <etc…> }, 
de_DE: …, 
de_CH: …, 
<… and so on for other locales …> 
} 
{ 
_id: 42, 
en_US: { name: …, description: …, <etc…> }, 
en_GB: { name: …, description: …, <etc…> }, 
fr_FR: { name: …, description: …, <etc…> }, 
fr_CA: { name: …, description: …, <etc…> }, 
de_DE: …, 
de_CH: …, 
<… and so on for other locales …> 
}
Consequences of the schema 
• Each document contained 20x more data than the 
common use case requires 
• Disk IO was too high for the relatively modest query 
load on the dataset 
• MongoDB lets you request a subset of a document's 
contents via projection… 
• … but the entire document must be loaded into RAM 
to service the request
Consequences of the schema redesign 
• Queries induced minimal memory overhead 
• 20x as many distinct products fit in RAM at once 
• Disk IO utilization reduced 
• Application latency reduced 
{ 
_id: "375-en_GB", 
name: …, 
description: …, 
<… the rest of the document …> 
}
Schema Design Patterns 
• Pattern: pre-computing interesting quantities, ideally with each 
write operation 
• Pattern: putting unrelated items in different collections to take 
advantage of indexing 
• Anti-pattern: appending to arrays ad infinitum 
• Anti-pattern: importing relational schemas directly into 
MongoDB
Schema Design Tips 
• Avoid inherently slow operations 
– Updates of unindexed arrays of several thousand elements 
– Updates of indexed arrays of several hundred elements 
– Document moves 
• Arrays are great, but know how to use them
Schema Design resources 
• Blog series, "6 rules of thumb" 
– Part 1: http://goo.gl/TFJ3dr 
– Part 2: http://goo.gl/qTdGhP 
– Part 3: http://goo.gl/JFO1pI
Indexing 
• Indexes are tree-structured sets of references to your 
documents 
• Indexes are the single biggest tunable performance factor in 
the database 
• Indexing and schema design go hand in hand
Indexing Mistakes 
• Failing to build necessary indexes 
• Building unnecessary indexes 
• Running ad-hoc queries in production
Indexing Fixes 
• Failing to build necessary indexes 
– Run .explain(), examine slow query log, mtools, system.profile 
collection 
• Building unnecessary indexes 
– Talk to your application developers about usage 
• Running ad-hoc queries in production 
– Use a staging environment, use secondaries
mongod log files 
Sun Jun 29 06:35:37.646 [conn2] query 
test.docs query: { parent.company: 
"22794", parent.employeeId: "83881" } 
ntoreturn:1 ntoskip:0 nscanned:806381 
keyUpdates:0 numYields: 5 
locks(micros) r:2145254 nreturned:0 
reslen:20 1156ms
mongod log files 
date and time thread operation 
Sun Jun 29 06:35:37.646 [conn2] query 
test.docs query: { parent.company: 
"22794", parent.employeeId: "83881" } 
ntoreturn:1 ntoskip:0 nscanned:806381 
keyUpdates:0 numYields: 5 
locks(micros) r:2145254 nreturned:0 
reslen:20 1156ms 
n… 
counters 
lock 
times 
duration 
number 
of yields
You need a tool when doing log file analysis
mtools 
• http://github.com/rueckstiess/mtools 
• log file analysis for poorly performing queries 
– Show me queries that took more than 1000 ms from 6 am to 6 pm: 
– mlogfilter mongodb.log --from 06:00 --to 18:00 --slow 
1000 > mongodb-filtered.log
Graphing with mtools 
% mplotqueries --type histogram --group namespace --bucketSize 3600
Real World Example: Indexing for Scale 
Sun Jun 29 06:35:37.646 [conn2] query 
test.docs query: { parent.company: 
"22794", parent.employeeId: "83881" } 
ntoreturn:1 ntoskip:0 nscanned:806381 
keyUpdates:0 numYields: 5 
locks(micros) r:2145254 nreturned:0 
reslen:20 1156ms
Document schema 
{ 
_id: ObjectId("53b9ab7e939f1e229b4f574c"), 
firstName: "Alice", 
lastName: "Smith", 
parent: { 
company: 22794, 
employeeId: 83881 
} 
}
But there's an index!?! 
db.system.indexes.find().toArray() 
[{ 
"v" : 1, 
"key" : { 
"company" : 1, 
"employeeId" : 1 
}, 
"ns" : "test.docs", 
"name" : "company_1_employeeId_1" 
}]
But there's an index!?! 
db.system.indexes.find().toArray() 
[{ 
"v" : 1, 
"key" : { 
"company" : 1, 
"employeeId" : 1 
}, 
"ns" : "test.docs", 
"name" : "company_1_employeeId_1" 
}] 
This isn't 
the index 
you're 
looking for.
Did you see the problem? 
{ 
_id: ObjectId("53b9ab7e939f1e229b4f574c"), 
firstName: "Alice", 
lastName: "Smith", 
parent: { 
company: 22794, 
employeeId: 83881 
} 
}
The index was created incorrectly 
db.system.indexes.find().toArray() 
[{ 
"v" : 1, 
"key" : { 
"parent.company" : 1, 
"parent.employeeId" : 1 
}, 
"ns" : "test.docs", 
"name" : 
"parent.company_1_parent.employeeId_1" 
}] 
Subdocument 
needed
Indexing Strategies 
• Create indexes that support your queries! 
• Create highly selective indexes 
• Eliminate duplicate indexes with a compound index, if possible 
– db.collection.ensureIndex({A:1, B:1, C:1}) 
– allows queries using leftmost prefix 
• Order compound index fields thusly: equality, sort, then range 
– see http://emptysqua.re/blog/optimizing-mongodb-compound-indexes/ 
• Create indexes that support covered queries 
• Prevent collection scans in pre-production environments 
– mongod --notablescan 
– db.getSiblingDB("admin").runCommand( { setParameter: 1, notablescan: 1 } )
Monitoring Your Workload 
• Log files, iostat, mtools, mongotop are for debugging 
• MongoDB Management Service (MMS) can do metrics 
collection and reporting
What can MMS do?
Database Metrics
Hardware statistics (CPU, disk)
MMS Monitoring Setup
Cloud Version of MMS 
1. Go to http://mms.mongodb.com 
2. Create an account 
3. Install one agent in your datacenter 
4. Add hosts from the web interface 
5. Enjoy!
Today’s Webinar Agenda 
Hardware Considerations 
Achieve Scale 
1 Optimization Tips 
Scale Vertically 
Horizontal 
Scaling 
2 
3
Vertical Scaling 
Factors: 
– RAM 
– Disk 
– CPU 
– Network 
Replica Set 
Primary 
Secondary 
Secondary 
Replica Set 
Primary 
Secondary 
Secondary 
Horizontal Scaling
Working Set Exceeds Physical 
Memory
RAM - Measure your working set and index 
sizes 
• db.serverStatus({workingSet:1}).workingSet 
{ "computationTimeMicros": 2751, 
"note": "thisIsAnEstimate", 
"overSeconds": 1084, 
"pagesInMemory": 2041 
} 
• db.stats().indexSize 
2032880640 
• In this example, 
(2041 * 4096) + 2032880640 = 2041240576 bytes 
= 1.9 GB 
• Note: this is a subset of the virtual memory used by mongod
Real World Example: Vertical Scaling 
• System that tracked status information for entities in the 
business 
• State changes happen in batches; sometimes 10% of entities 
get updated, sometimes 100% get updated
Initial Architecture 
Sharded cluster with 4 shards using spinning disks 
Application / mongos 
mongod
Adding shards to scale horizontally 
• Application was a success! Business entities grew by a factor of 
5 
• Cluster capacity multiplied by 5, but so did the TCO 
Application / mongos 
mongod 
…16 more shards…
More success means more shards 
• 10x growth means … 200 shards 
• Horizontal scaling with sharding is linear scaling, but an order 
of magnitude was needed 
• Bulk updates of random documents approaches speed of 
disks
Final architecture 
• Scaling the random IOPS with SSDs was a vertical scaling 
approach 
Application / mongos 
mongod SSD
Before you add hardware… 
• Make sure you are solving the right scaling problem 
• Remedy schema and index problems first 
– schema and index problems can look like hardware problems 
• Tune the Operating System 
– ulimits, swap, NUMA, NOOP scheduler with hypervisors 
• Tune the IO subsystem 
– ext4 or XFS vs SAN, RAID10, readahead, noatime 
• See MongoDB "production notes" page 
• Heed logfile startup warnings
Today’s Webinar Agenda 
Achieve Scale 
1 Optimization Tips 
2 Scale Vertically 
The Horizontal Basics of Sharding 
Scaling 
3
The basics of 
Horizontal Scaling
The basics of 
Horizontal Scaling 
(aka Sharding)
The Basics of Sharding
Rule of Thumb 
To make good decisions about 
MongoDB implementations, you 
must understand MongoDB and your 
applications and the workload your 
applications generate and your 
business requirements.
Summary 
• Don't throw hardware at the problem until you examine all 
other possibilities (schema, indexes, OS, IO subsystem) 
• Know what is considered "normal" performance by monitoring 
• Horizontal scaling in MongoDB is implemented with sharding, 
but you must understand schema design and indexing before 
you shard 
Sharding a sub-optimally designed 
database will not make it performant
Today’s Webinar Agenda 
Achieve Scale 
1 Optimization Tips 
The Horizontal Basics of Sharding 
Scaling 
3 
Schema Design 
Indexes 
Monitoring your Workload 
2 Scale Vertically
Limited Time: Get Expert Advice for Free 
If you’re thinking about 
scaling, why reinvent the 
wheel? 
Our experts can collaborate 
with you to provide detailed 
guidance. 
Sign Up For a Free One Hour 
Consult: 
http://bit.ly/1rkXcfN
Questions? 
Stay tuned after the webinar and take our survey 
for your chance to win MongoDB schwag.
Thank You 
Jake Angerman 
Sr. Solutions Architect, MongoDB

Más contenido relacionado

La actualidad más candente

Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Edureka!
 
On-boarding with JanusGraph Performance
On-boarding with JanusGraph PerformanceOn-boarding with JanusGraph Performance
On-boarding with JanusGraph PerformanceChin Huang
 
The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDBvaluebound
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudDatabricks
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekVenkata Naga Ravi
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearchpmanvi
 
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...DataWorks Summit/Hadoop Summit
 
Common MongoDB Use Cases
Common MongoDB Use CasesCommon MongoDB Use Cases
Common MongoDB Use CasesDATAVERSITY
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchRuslan Zavacky
 
Redis overview for Software Architecture Forum
Redis overview for Software Architecture ForumRedis overview for Software Architecture Forum
Redis overview for Software Architecture ForumChristopher Spring
 
Elasticsearch for beginners
Elasticsearch for beginnersElasticsearch for beginners
Elasticsearch for beginnersNeil Baker
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Alexey Lesovsky
 
Elastic search overview
Elastic search overviewElastic search overview
Elastic search overviewABC Talks
 
PostgreSQL Table Partitioning / Sharding
PostgreSQL Table Partitioning / ShardingPostgreSQL Table Partitioning / Sharding
PostgreSQL Table Partitioning / ShardingAmir Reza Hashemi
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesDatabricks
 
Webinar: MongoDB Schema Design and Performance Implications
Webinar: MongoDB Schema Design and Performance ImplicationsWebinar: MongoDB Schema Design and Performance Implications
Webinar: MongoDB Schema Design and Performance ImplicationsMongoDB
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangDatabricks
 

La actualidad más candente (20)

Vector database
Vector databaseVector database
Vector database
 
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
 
On-boarding with JanusGraph Performance
On-boarding with JanusGraph PerformanceOn-boarding with JanusGraph Performance
On-boarding with JanusGraph Performance
 
The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDB
 
MongoDB
MongoDBMongoDB
MongoDB
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the Cloud
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeek
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearch
 
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
 
Common MongoDB Use Cases
Common MongoDB Use CasesCommon MongoDB Use Cases
Common MongoDB Use Cases
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Redis overview for Software Architecture Forum
Redis overview for Software Architecture ForumRedis overview for Software Architecture Forum
Redis overview for Software Architecture Forum
 
Elasticsearch for beginners
Elasticsearch for beginnersElasticsearch for beginners
Elasticsearch for beginners
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
 
Elastic search overview
Elastic search overviewElastic search overview
Elastic search overview
 
PostgreSQL Table Partitioning / Sharding
PostgreSQL Table Partitioning / ShardingPostgreSQL Table Partitioning / Sharding
PostgreSQL Table Partitioning / Sharding
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Webinar: MongoDB Schema Design and Performance Implications
Webinar: MongoDB Schema Design and Performance ImplicationsWebinar: MongoDB Schema Design and Performance Implications
Webinar: MongoDB Schema Design and Performance Implications
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
 

Destacado

Scaling MongoDB
Scaling MongoDBScaling MongoDB
Scaling MongoDBMongoDB
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDBMongoDB
 
MongoDb scalability and high availability with Replica-Set
MongoDb scalability and high availability with Replica-SetMongoDb scalability and high availability with Replica-Set
MongoDb scalability and high availability with Replica-SetVivek Parihar
 
The Right (and Wrong) Use Cases for MongoDB
The Right (and Wrong) Use Cases for MongoDBThe Right (and Wrong) Use Cases for MongoDB
The Right (and Wrong) Use Cases for MongoDBMongoDB
 
Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014P. Taylor Goetz
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationnathanmarz
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopDataWorks Summit
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignMichael Noll
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureP. Taylor Goetz
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDBRick Copeland
 
Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performanceDaum DNA
 

Destacado (13)

Scaling MongoDB
Scaling MongoDBScaling MongoDB
Scaling MongoDB
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDB
 
MongoDb scalability and high availability with Replica-Set
MongoDb scalability and high availability with Replica-SetMongoDb scalability and high availability with Replica-Set
MongoDb scalability and high availability with Replica-Set
 
The Right (and Wrong) Use Cases for MongoDB
The Right (and Wrong) Use Cases for MongoDBThe Right (and Wrong) Use Cases for MongoDB
The Right (and Wrong) Use Cases for MongoDB
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
 
Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computation
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
 
Yahoo compares Storm and Spark
Yahoo compares Storm and SparkYahoo compares Storm and Spark
Yahoo compares Storm and Spark
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - Verisign
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm Architecture
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDB
 
Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performance
 

Similar a How to Achieve Scale with MongoDB

Webinar: Performance Tuning + Optimization
Webinar: Performance Tuning + OptimizationWebinar: Performance Tuning + Optimization
Webinar: Performance Tuning + OptimizationMongoDB
 
MongoDB at Scale
MongoDB at ScaleMongoDB at Scale
MongoDB at ScaleMongoDB
 
Boosting the Performance of your Rails Apps
Boosting the Performance of your Rails AppsBoosting the Performance of your Rails Apps
Boosting the Performance of your Rails AppsMatt Kuklinski
 
Buildingsocialanalyticstoolwithmongodb
BuildingsocialanalyticstoolwithmongodbBuildingsocialanalyticstoolwithmongodb
BuildingsocialanalyticstoolwithmongodbMongoDB APAC
 
7 Database Mistakes YOU Are Making -- Linuxfest Northwest 2019
7 Database Mistakes YOU Are Making -- Linuxfest Northwest 20197 Database Mistakes YOU Are Making -- Linuxfest Northwest 2019
7 Database Mistakes YOU Are Making -- Linuxfest Northwest 2019Dave Stokes
 
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...Amazon Web Services
 
Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011bostonrb
 
The Fine Art of Schema Design in MongoDB: Dos and Don'ts
The Fine Art of Schema Design in MongoDB: Dos and Don'tsThe Fine Art of Schema Design in MongoDB: Dos and Don'ts
The Fine Art of Schema Design in MongoDB: Dos and Don'tsMatias Cascallares
 
Tales from the Field
Tales from the FieldTales from the Field
Tales from the FieldMongoDB
 
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_WilkinsMongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkinskiwilkins
 
MongoDB: How We Did It – Reanimating Identity at AOL
MongoDB: How We Did It – Reanimating Identity at AOLMongoDB: How We Did It – Reanimating Identity at AOL
MongoDB: How We Did It – Reanimating Identity at AOLMongoDB
 
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...Citus Data
 
MongoDB Sharding Webinar 2014
MongoDB Sharding Webinar 2014MongoDB Sharding Webinar 2014
MongoDB Sharding Webinar 2014Dylan Tong
 
Silicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in productionSilicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in productionDaniel Coupal
 
Using Compass to Diagnose Performance Problems
Using Compass to Diagnose Performance Problems Using Compass to Diagnose Performance Problems
Using Compass to Diagnose Performance Problems MongoDB
 
Using Compass to Diagnose Performance Problems in Your Cluster
Using Compass to Diagnose Performance Problems in Your ClusterUsing Compass to Diagnose Performance Problems in Your Cluster
Using Compass to Diagnose Performance Problems in Your ClusterMongoDB
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Databricks
 
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelSilicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelDaniel Coupal
 
MongoDB - An Agile NoSQL Database
MongoDB - An Agile NoSQL DatabaseMongoDB - An Agile NoSQL Database
MongoDB - An Agile NoSQL DatabaseGaurav Awasthi
 
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNAFirst Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNATomas Cervenka
 

Similar a How to Achieve Scale with MongoDB (20)

Webinar: Performance Tuning + Optimization
Webinar: Performance Tuning + OptimizationWebinar: Performance Tuning + Optimization
Webinar: Performance Tuning + Optimization
 
MongoDB at Scale
MongoDB at ScaleMongoDB at Scale
MongoDB at Scale
 
Boosting the Performance of your Rails Apps
Boosting the Performance of your Rails AppsBoosting the Performance of your Rails Apps
Boosting the Performance of your Rails Apps
 
Buildingsocialanalyticstoolwithmongodb
BuildingsocialanalyticstoolwithmongodbBuildingsocialanalyticstoolwithmongodb
Buildingsocialanalyticstoolwithmongodb
 
7 Database Mistakes YOU Are Making -- Linuxfest Northwest 2019
7 Database Mistakes YOU Are Making -- Linuxfest Northwest 20197 Database Mistakes YOU Are Making -- Linuxfest Northwest 2019
7 Database Mistakes YOU Are Making -- Linuxfest Northwest 2019
 
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...
 
Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011
 
The Fine Art of Schema Design in MongoDB: Dos and Don'ts
The Fine Art of Schema Design in MongoDB: Dos and Don'tsThe Fine Art of Schema Design in MongoDB: Dos and Don'ts
The Fine Art of Schema Design in MongoDB: Dos and Don'ts
 
Tales from the Field
Tales from the FieldTales from the Field
Tales from the Field
 
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_WilkinsMongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
 
MongoDB: How We Did It – Reanimating Identity at AOL
MongoDB: How We Did It – Reanimating Identity at AOLMongoDB: How We Did It – Reanimating Identity at AOL
MongoDB: How We Did It – Reanimating Identity at AOL
 
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...
Data Modeling, Normalization, and De-Normalization | PostgresOpen 2019 | Dimi...
 
MongoDB Sharding Webinar 2014
MongoDB Sharding Webinar 2014MongoDB Sharding Webinar 2014
MongoDB Sharding Webinar 2014
 
Silicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in productionSilicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in production
 
Using Compass to Diagnose Performance Problems
Using Compass to Diagnose Performance Problems Using Compass to Diagnose Performance Problems
Using Compass to Diagnose Performance Problems
 
Using Compass to Diagnose Performance Problems in Your Cluster
Using Compass to Diagnose Performance Problems in Your ClusterUsing Compass to Diagnose Performance Problems in Your Cluster
Using Compass to Diagnose Performance Problems in Your Cluster
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
 
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelSilicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
 
MongoDB - An Agile NoSQL Database
MongoDB - An Agile NoSQL DatabaseMongoDB - An Agile NoSQL Database
MongoDB - An Agile NoSQL Database
 
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNAFirst Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
 

Más de MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Más de MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Último

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 

Último (20)

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 

How to Achieve Scale with MongoDB

  • 1. How to Achieve Scale with MongoDB Jake Angerman Sr. Solutions Architect, MongoDB
  • 2. Today’s Webinar Agenda Schema Design Indexes Monitoring your Workload Achieve Scale Optimization Tips Scale Vertically Horizontal Scaling 1 2 3
  • 3. Optimization Tips to Scale Your App
  • 4. Premature Optimization • There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. - Donald Knuth, 1974
  • 5. Premature Optimization • "There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." - Donald Knuth, 1974
  • 6. Premature Optimization • "There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." - Donald Knuth, 1974
  • 7. Schema Design • Document Model • Dynamic Schema • Collections { "customer_id" : 123, "first_name" : ”John", "last_name" : "Smith", "address" : { "street": "123 Main Street", "city": "Houston", "state": "TX", "zip_code": "77027" } policies: [ { policy_number : 13, description: “short term”, deductible: 500 }, { policy_number : 14, description: “dental”, visits: […] } ] }
  • 8. The Importance of Schema Design • MongoDB schemas are built oppositely than relational schemas! • Relational Schema: – normalize data – write complex queries to join the data – let the query planner figure out how to make queries efficient • MongoDB Schema: – denormalize the data – create a (potentially complex) schema with prior knowledge of your actual (not just predicted) query patterns – write simple queries
  • 9. Real World Example: Optimizing Schema for Scale Product catalog schema for retailer selling in 20 countries { _id: 375, en_US: { name: …, description: …, <etc…> }, en_GB: { name: …, description: …, <etc…> }, fr_FR: { name: …, description: …, <etc…> }, fr_CA: { name: …, description: …, <etc…> }, de_DE: …, de_CH: …, <… and so on for other locales …> }
  • 10. What's good about this schema? • Each document contains all the data about the product across all possible locales. • It is the most efficient way to retrieve all translations of a product in a single query (English, French, German, etc).
  • 11. But that's not how the data was accessed db.catalog.find( { _id: 375 }, { en_US: true } ); db.catalog.find( { _id: 375 }, { fr_FR: true } ); db.catalog.find( { _id: 375 }, { de_DE: true } ); … and so forth for other locales The data model did not fit the access pattern.
  • 12. Why is this inefficient? Data in RED are being used. Data in BLUE take up memory but are not in demand. { _id: 375, en_US: { name: …, description: …, <etc…> }, en_GB: { name: …, description: …, <etc…> }, fr_FR: { name: …, description: …, <etc…> }, fr_CA: { name: …, description: …, <etc…> }, de_DE: …, de_CH: …, <… and so on for other locales …> } { _id: 42, en_US: { name: …, description: …, <etc…> }, en_GB: { name: …, description: …, <etc…> }, fr_FR: { name: …, description: …, <etc…> }, fr_CA: { name: …, description: …, <etc…> }, de_DE: …, de_CH: …, <… and so on for other locales …> }
  • 13. Consequences of the schema • Each document contained 20x more data than the common use case requires • Disk IO was too high for the relatively modest query load on the dataset • MongoDB lets you request a subset of a document's contents via projection… • … but the entire document must be loaded into RAM to service the request
  • 14. Consequences of the schema redesign • Queries induced minimal memory overhead • 20x as many distinct products fit in RAM at once • Disk IO utilization reduced • Application latency reduced { _id: "375-en_GB", name: …, description: …, <… the rest of the document …> }
  • 15. Schema Design Patterns • Pattern: pre-computing interesting quantities, ideally with each write operation • Pattern: putting unrelated items in different collections to take advantage of indexing • Anti-pattern: appending to arrays ad infinitum • Anti-pattern: importing relational schemas directly into MongoDB
  • 16. Schema Design Tips • Avoid inherently slow operations – Updates of unindexed arrays of several thousand elements – Updates of indexed arrays of several hundred elements – Document moves • Arrays are great, but know how to use them
  • 17. Schema Design resources • Blog series, "6 rules of thumb" – Part 1: http://goo.gl/TFJ3dr – Part 2: http://goo.gl/qTdGhP – Part 3: http://goo.gl/JFO1pI
  • 18. Indexing • Indexes are tree-structured sets of references to your documents • Indexes are the single biggest tunable performance factor in the database • Indexing and schema design go hand in hand
  • 19. Indexing Mistakes • Failing to build necessary indexes • Building unnecessary indexes • Running ad-hoc queries in production
  • 20. Indexing Fixes • Failing to build necessary indexes – Run .explain(), examine slow query log, mtools, system.profile collection • Building unnecessary indexes – Talk to your application developers about usage • Running ad-hoc queries in production – Use a staging environment, use secondaries
  • 21. mongod log files Sun Jun 29 06:35:37.646 [conn2] query test.docs query: { parent.company: "22794", parent.employeeId: "83881" } ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 locks(micros) r:2145254 nreturned:0 reslen:20 1156ms
  • 22. mongod log files date and time thread operation Sun Jun 29 06:35:37.646 [conn2] query test.docs query: { parent.company: "22794", parent.employeeId: "83881" } ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 locks(micros) r:2145254 nreturned:0 reslen:20 1156ms n… counters lock times duration number of yields
  • 23. You need a tool when doing log file analysis
  • 24. mtools • http://github.com/rueckstiess/mtools • log file analysis for poorly performing queries – Show me queries that took more than 1000 ms from 6 am to 6 pm: – mlogfilter mongodb.log --from 06:00 --to 18:00 --slow 1000 > mongodb-filtered.log
  • 25. Graphing with mtools % mplotqueries --type histogram --group namespace --bucketSize 3600
  • 26. Real World Example: Indexing for Scale Sun Jun 29 06:35:37.646 [conn2] query test.docs query: { parent.company: "22794", parent.employeeId: "83881" } ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 locks(micros) r:2145254 nreturned:0 reslen:20 1156ms
  • 27. Document schema { _id: ObjectId("53b9ab7e939f1e229b4f574c"), firstName: "Alice", lastName: "Smith", parent: { company: 22794, employeeId: 83881 } }
  • 28. But there's an index!?! db.system.indexes.find().toArray() [{ "v" : 1, "key" : { "company" : 1, "employeeId" : 1 }, "ns" : "test.docs", "name" : "company_1_employeeId_1" }]
  • 29. But there's an index!?! db.system.indexes.find().toArray() [{ "v" : 1, "key" : { "company" : 1, "employeeId" : 1 }, "ns" : "test.docs", "name" : "company_1_employeeId_1" }] This isn't the index you're looking for.
  • 30. Did you see the problem? { _id: ObjectId("53b9ab7e939f1e229b4f574c"), firstName: "Alice", lastName: "Smith", parent: { company: 22794, employeeId: 83881 } }
  • 31. The index was created incorrectly db.system.indexes.find().toArray() [{ "v" : 1, "key" : { "parent.company" : 1, "parent.employeeId" : 1 }, "ns" : "test.docs", "name" : "parent.company_1_parent.employeeId_1" }] Subdocument needed
  • 32. Indexing Strategies • Create indexes that support your queries! • Create highly selective indexes • Eliminate duplicate indexes with a compound index, if possible – db.collection.ensureIndex({A:1, B:1, C:1}) – allows queries using leftmost prefix • Order compound index fields thusly: equality, sort, then range – see http://emptysqua.re/blog/optimizing-mongodb-compound-indexes/ • Create indexes that support covered queries • Prevent collection scans in pre-production environments – mongod --notablescan – db.getSiblingDB("admin").runCommand( { setParameter: 1, notablescan: 1 } )
  • 33. Monitoring Your Workload • Log files, iostat, mtools, mongotop are for debugging • MongoDB Management Service (MMS) can do metrics collection and reporting
  • 38. Cloud Version of MMS 1. Go to http://mms.mongodb.com 2. Create an account 3. Install one agent in your datacenter 4. Add hosts from the web interface 5. Enjoy!
  • 39. Today’s Webinar Agenda Hardware Considerations Achieve Scale 1 Optimization Tips Scale Vertically Horizontal Scaling 2 3
  • 40. Vertical Scaling Factors: – RAM – Disk – CPU – Network Replica Set Primary Secondary Secondary Replica Set Primary Secondary Secondary Horizontal Scaling
  • 41. Working Set Exceeds Physical Memory
  • 42. RAM - Measure your working set and index sizes • db.serverStatus({workingSet:1}).workingSet { "computationTimeMicros": 2751, "note": "thisIsAnEstimate", "overSeconds": 1084, "pagesInMemory": 2041 } • db.stats().indexSize 2032880640 • In this example, (2041 * 4096) + 2032880640 = 2041240576 bytes = 1.9 GB • Note: this is a subset of the virtual memory used by mongod
  • 43. Real World Example: Vertical Scaling • System that tracked status information for entities in the business • State changes happen in batches; sometimes 10% of entities get updated, sometimes 100% get updated
  • 44. Initial Architecture Sharded cluster with 4 shards using spinning disks Application / mongos mongod
  • 45. Adding shards to scale horizontally • Application was a success! Business entities grew by a factor of 5 • Cluster capacity multiplied by 5, but so did the TCO Application / mongos mongod …16 more shards…
  • 46. More success means more shards • 10x growth means … 200 shards • Horizontal scaling with sharding is linear scaling, but an order of magnitude was needed • Bulk updates of random documents approaches speed of disks
  • 47. Final architecture • Scaling the random IOPS with SSDs was a vertical scaling approach Application / mongos mongod SSD
  • 48. Before you add hardware… • Make sure you are solving the right scaling problem • Remedy schema and index problems first – schema and index problems can look like hardware problems • Tune the Operating System – ulimits, swap, NUMA, NOOP scheduler with hypervisors • Tune the IO subsystem – ext4 or XFS vs SAN, RAID10, readahead, noatime • See MongoDB "production notes" page • Heed logfile startup warnings
  • 49. Today’s Webinar Agenda Achieve Scale 1 Optimization Tips 2 Scale Vertically The Horizontal Basics of Sharding Scaling 3
  • 50. The basics of Horizontal Scaling
  • 51. The basics of Horizontal Scaling (aka Sharding)
  • 52. The Basics of Sharding
  • 53. Rule of Thumb To make good decisions about MongoDB implementations, you must understand MongoDB and your applications and the workload your applications generate and your business requirements.
  • 54. Summary • Don't throw hardware at the problem until you examine all other possibilities (schema, indexes, OS, IO subsystem) • Know what is considered "normal" performance by monitoring • Horizontal scaling in MongoDB is implemented with sharding, but you must understand schema design and indexing before you shard Sharding a sub-optimally designed database will not make it performant
  • 55. Today’s Webinar Agenda Achieve Scale 1 Optimization Tips The Horizontal Basics of Sharding Scaling 3 Schema Design Indexes Monitoring your Workload 2 Scale Vertically
  • 56. Limited Time: Get Expert Advice for Free If you’re thinking about scaling, why reinvent the wheel? Our experts can collaborate with you to provide detailed guidance. Sign Up For a Free One Hour Consult: http://bit.ly/1rkXcfN
  • 57. Questions? Stay tuned after the webinar and take our survey for your chance to win MongoDB schwag.
  • 58. Thank You Jake Angerman Sr. Solutions Architect, MongoDB

Notas del editor

  1. trap: concern about correctness overrides optimization at scale
  2. importing a relational schema directly into MongoDB is an anti-pattern!
  3. different parts of the world are awake and shopping at a given time
  4. Anti-pattern: embedding highly volatile data in an array
  5. these may look like performance tips instead of schema design tips sub-optimal query might be $unwind followed by $match instead of projection
  6. 100ms threshold by default
  7. shard key aside
  8. Indexes should be contained in working set.
  9. In this case I had a 50GB database but only ~2GB were needed in RAM
  10. this applies to both vertical and horizontal scaling
  11. The order presented is the order you should analyze
  12. www.mongodb.com/lp/contact/scaling-101