Senior Director of Performance Engineering, MongoDB
Alvin Richards
#MongoDBWorld
Mythbusting: Understanding How We Measure the Performance of MongoDB
Before we start…
• We are going to look a lot at
– C++ kernel code
– Java benchmarks
– JavaScript tests
• And lots of charts
• And it's going to be awesome!
Measuring "Performance"
https://www.youtube.com/watch?v=7wm-pZp_mi0
Benchmarking
• Some common traps
• Performance measurement & diagnosis
• What's next
Part One
Some Common Traps
The Milk Train Doesn't Stop Here Anymore
Tennessee Williams
"We all live in a house on fire, no fire department
to call; no way out, just the upstairs window to
look out of while the fire burns the house down
with us trapped, locked in it."
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i < documentsPerInsert; i++) {
id++;
BasicDBObject doc = new BasicDBObject();
doc.put("_id",id);
doc.put("k",rand.nextInt(numMaxInserts)+1);
String cVal = "…";
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i]=doc;
}
coll.insert(aDocs);
numInserts += documentsPerInsert;
globalInserts.addAndGet(documentsPerInsert);
}
long endTime = System.currentTimeMillis();
#1 Time taken to Insert x Documents
So that looks ok, right?
What else are you measuring?
• Object creation and GC management?
• Thread contention on nextInt()?
• Time to synthesize data?
• Thread contention on addAndGet()?
• Clock resolution?
// Pre Create the Object outside the Loop
BasicDBObject[] aDocs = new BasicDBObject[documentsPerInsert];
for (int i=0; i < documentsPerInsert; i++) {
BasicDBObject doc = new BasicDBObject();
String cVal = "…";
doc.put("c",cVal);
String padVal = "…";
doc.put("pad",padVal);
aDocs[i] = doc;
}
Solution: Pre-create the objects
Pre-create non-varying data outside the timing loop
Alternative
• Pre-create the data in a file; load from file
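The pre-creation idea above can be sketched without a database: build every payload first, then start the clock. This is a minimal sketch under stated assumptions, not the benchmark's actual code; `PreCreateSketch`, `buildBatch`, `timeRounds`, and the `sink` consumer standing in for `coll.insert(...)` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class PreCreateSketch {
    // Setup phase: allocate the documents and fill in the non-varying fields once.
    static List<Map<String, Object>> buildBatch(int documentsPerInsert, String cVal, String padVal) {
        List<Map<String, Object>> batch = new ArrayList<>(documentsPerInsert);
        for (int i = 0; i < documentsPerInsert; i++) {
            Map<String, Object> doc = new HashMap<>();
            doc.put("c", cVal);
            doc.put("pad", padVal);
            batch.add(doc);
        }
        return batch;
    }

    // Timed phase: only the varying field is updated before each hand-off to the sink.
    // Returns the elapsed nanoseconds of the timed phase alone.
    static long timeRounds(List<Map<String, Object>> batch, int numRounds,
                           Consumer<List<Map<String, Object>>> sink) {
        long id = 0;
        long start = System.nanoTime();
        for (int round = 0; round < numRounds; round++) {
            for (Map<String, Object> doc : batch) {
                doc.put("_id", ++id);
            }
            sink.accept(batch); // hypothetical stand-in for coll.insert(aDocs)
        }
        return System.nanoTime() - start;
    }
}
```

Calling `buildBatch` before `timeRounds` keeps document construction out of the measured interval, so the number reported is closer to the cost of the operation under test.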
// Use ThreadLocalRandom, or one java.util.Random instance per thread
ThreadLocalRandom rand = ThreadLocalRandom.current(); // java.util.concurrent
for (long roundNum = 0; roundNum < numRounds; roundNum++) {
    for (int i = 0; i < documentsPerInsert; i++) {
        id++;
        BasicDBObject doc = aDocs[i];
        doc.put("_id",id);
        doc.put("k", rand.nextInt(numMaxInserts)+1);
    }
    coll.insert(aDocs);
    numInserts += documentsPerInsert;
}
// Update the shared counter once, outside the loop
globalInserts.addAndGet(numInserts);
Solution: Remove contention
• Remove contention on nextInt() by making it thread-local
• Remove contention on addAndGet()
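Both contention fixes can be shown in a driver-free sketch: each worker draws random values from its own generator and publishes its count to the shared counter once. The class and method names (`ContentionSketch`, `runWorker`, `runThreads`) are hypothetical, and the `checksum` only stands in for the `doc.put("k", ...)` call.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

final class ContentionSketch {
    static final AtomicLong globalInserts = new AtomicLong();

    // Body of one worker thread: random values come from the calling thread's own
    // generator, and the shared counter is touched exactly once at the end.
    static long runWorker(int numRounds, int documentsPerInsert, int numMaxInserts) {
        ThreadLocalRandom rand = ThreadLocalRandom.current();
        long numInserts = 0;
        long checksum = 0;
        for (int round = 0; round < numRounds; round++) {
            for (int i = 0; i < documentsPerInsert; i++) {
                checksum += rand.nextInt(numMaxInserts) + 1; // stands in for doc.put("k", ...)
            }
            numInserts += documentsPerInsert;
        }
        globalInserts.addAndGet(numInserts); // one shared update per thread
        return checksum;
    }

    // Starts the workers, waits for them, and returns the shared counter's value.
    static long runThreads(int threads, int numRounds, int documentsPerInsert, int numMaxInserts) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> runWorker(numRounds, documentsPerInsert, numMaxInserts));
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return globalInserts.get();
    }
}
```

With this shape the per-round work touches no shared state, so the timing loop no longer measures threads competing for a shared generator or a shared counter.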
// Before: millisecond wall clock
long startTime = System.currentTimeMillis();
…
long endTime = System.currentTimeMillis();
// After: nanosecond timer
long startTime = System.nanoTime();
…
long elapsedNanos = System.nanoTime() - startTime;
Solution: Timer resolution
• System.nanoTime(): "resolution is at least as good as that of currentTimeMillis()"
• System.currentTimeMillis(): "granularity of the value depends on the underlying operating system and may be larger"
Source
• http://docs.oracle.com/javase/7/docs/api/java/lang/System.html
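A small helper makes the timer choice concrete. This is a sketch only; `ElapsedTimeSketch` and its methods are hypothetical names, and `observedMillisStep` is a rough probe whose result depends on the operating system and on thread scheduling.

```java
final class ElapsedTimeSketch {
    // Runs the task once and returns the elapsed time in nanoseconds.
    static long elapsedNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    // Converts a nanosecond interval to fractional milliseconds for reporting.
    static double toMillis(long nanos) {
        return nanos / 1_000_000.0;
    }

    // Spins until System.currentTimeMillis() changes and returns the size of that
    // step in milliseconds: a rough look at wall-clock granularity on this machine.
    static long observedMillisStep() {
        long first = System.currentTimeMillis();
        long next;
        do {
            next = System.currentTimeMillis();
        } while (next == first);
        return next - first;
    }
}
```

If `observedMillisStep()` reports a step much larger than the operation being timed, millisecond timestamps cannot resolve that operation and `elapsedNanos` is the safer choice.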
General Principle #1
Know what you are measuring
BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("_id",i); coll.insert(doc);
}
BasicDBObject predicate = new BasicDBObject();
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate);
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
#2 Response time to return all results
So that looks ok, right?
What else are you measuring?
• Unrestricted predicate?
• Each doc is 4080 bytes on disk with powerOf2Sizes
Measuring
• Time to parse & execute query
• Time to retrieve all documents
But also
• Cost of shipping ~4MB of data through the network stack
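The ~4MB figure is consistent with simple arithmetic on the slide's own numbers, assuming it was obtained by multiplying the 1000 inserted documents by the 4080-byte record size. `TransferSizeSketch` and `totalBytes` are hypothetical names used only for this illustration.

```java
final class TransferSizeSketch {
    // Bytes involved when every document is returned in full.
    static long totalBytes(long documents, long bytesPerDocument) {
        return documents * bytesPerDocument;
    }

    public static void main(String[] args) {
        long bytes = totalBytes(1000, 4080); // both figures come from the slide
        System.out.println(bytes + " bytes, roughly " + (bytes / 1_000_000) + " MB");
    }
}
```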
BasicDBObject predicate = new BasicDBObject();
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20));
BasicDBObject projection = new BasicDBObject();
projection.put("_id", 1);
long startTime = System.currentTimeMillis();
DBCursor cur = coll.find(predicate, projection );
DBObject foundObj;
while (cur.hasNext()) {
foundObj = cur.next();
}
long endTime = System.currentTimeMillis();
Solution: Limit the projection
• Return a fixed range
• Only project _id
• Only 46k transferred through the network stack
General Principle #2
Measure only what you need to measure
Part Two
Performance measurement & diagnosis
The Physical Principles of the Quantum Theory (1930)
Werner Heisenberg
"Every experiment destroys some of the
knowledge of the system which was obtained by
previous experiments."
Broad categories
• Micro Benchmarks
• Workloads
Micro benchmarks: mongo-perf
mongo-perf: goals
• Measure
– commands
• Configure
– Single mongod, ReplSet size (1 -> n), Sharding
– Single vs. Multiple DB
– O/S
• Characterize
– Throughput by thread count
• Compare
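"Throughput by thread count" can be sketched as a sweep: run the same operation on 1, 2, 4, ... threads and record operations per second for each. This is not mongo-perf's implementation; `ThroughputSketch`, `opsPerSecond`, and `sweep` are hypothetical names, and `op` stands in for whatever command is being measured.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

final class ThroughputSketch {
    // Runs opsPerThread calls of op on each of `threads` threads and returns ops/sec.
    static double opsPerSecond(int threads, int opsPerThread, Runnable op) {
        CountDownLatch startGate = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    startGate.await(); // release all workers together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < opsPerThread; i++) {
                    op.run();
                }
            });
            workers[t].start();
        }
        long start = System.nanoTime();
        startGate.countDown();
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        long elapsed = Math.max(1, System.nanoTime() - start);
        return (double) threads * opsPerThread * 1_000_000_000.0 / elapsed;
    }

    // One throughput figure per thread count, in the order given.
    static Map<Integer, Double> sweep(int[] threadCounts, int opsPerThread, Runnable op) {
        Map<Integer, Double> results = new LinkedHashMap<>();
        for (int threads : threadCounts) {
            results.put(threads, opsPerSecond(threads, opsPerThread, op));
        }
        return results;
    }
}
```

Thread creation happens before the clock starts and the latch releases all workers at once, so the measured interval covers the operations and the final joins rather than thread startup.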
What do you get?
[Chart, annotated "Better": measured improvement between rc0 and rc2]
tests.push( { name: "Commands.CountsIntIDRange",
pre: function( collection ) {
collection.drop();
for ( var i = 0; i < 1000; i++ ) {
collection.insert( { _id : i } );
}
collection.getDB().getLastError();
},
ops: [
{ op: "command",
ns : "testdb",
command : { count : "mycollection",
query : { _id : { "$gt" : 10, "$lt" : 100 } } } }
] } );
Benchmark source code
Code Change
Workloads
• "public" workloads
– YCSB
– Sysbench
• "real world" simulations
– Inbox fan in/out
– Message Stores
– Content Management
Example: Bulk Load Performance
[Chart, annotated "Better": 16m documents; 55% degradation, 2.6.0-rc1 vs 2.4.10]
Ouch… where's the tree in the woods?
• 2.4.10 -> 2.6.0
– 4495 git commits
git-bisect
• Bisect between good/bad hashes
• git-bisect nominates a new githash
– Build against githash
– Re-run test
– Confirm if this githash is good/bad
• Rinse and repeat
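The bisect loop above is a binary search over the commit history. The sketch below models it in plain Java, assuming commits are indexed oldest to newest and that every commit after the first bad one is also bad; `BisectSketch`, `Result`, and the `isGood` predicate (standing in for "build, re-run the test, judge the result") are hypothetical.

```java
import java.util.function.IntPredicate;

final class BisectSketch {
    // Outcome of a bisection: index of the first bad commit and how many builds were tested.
    static final class Result {
        final int firstBad;
        final int steps;

        Result(int firstBad, int steps) {
            this.firstBad = firstBad;
            this.steps = steps;
        }
    }

    // Index `good` is known good and index `bad` is known bad on entry.
    static Result bisect(int good, int bad, IntPredicate isGood) {
        int steps = 0;
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2; // the commit nominated for testing
            steps++;
            if (isGood.test(mid)) {
                good = mid;
            } else {
                bad = mid;
            }
        }
        return new Result(bad, steps);
    }
}
```

Because each test halves the remaining range, a span of 4495 commits needs at most 13 build-and-test cycles under these assumptions, which is what makes the approach practical.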
Code Change - Bad Githash
Code Change - Fix
Bulk Load Performance - Fix
[Chart, annotated "Better": 11% improvement, 2.6.1 vs 2.4.10]
The problem with measurement
• Observability
– What can you observe on the system?
• Effect
– What effects does the observation cause?
mtools
• MongoDB log file analysis
– Filter logs for operations, events
– Response time, lock durations
– Plot
• https://github.com/rueckstiess/mtools
Response Times > 100ms
[Chart: Bulk Insert 2.6.0-rc0; axes: Ops/Sec, Time]
Response Times > 100ms
[Chart: Bulk Insert 2.6.0-rc0 vs. 2.6.0-rc2; annotation: floor raised]
Code Change – Yielding Policy
Code Change
Response Times
[Chart: Bulk Insert 2.6.0 vs 2.6.1]
Ceiling similar, lower floor, resulting in a 40% improvement in throughput
Secondary effects of Yield policy change
• Write lock time reduced
• Order of magnitude reduction of write lock duration
> db.serverStatus()
Yes – will cause a read lock to be acquired
> db.serverStatus({recordStats:0})
No – lock is not acquired
> mongostat
Yes - until SERVER-14008 resolved, uses db.serverStatus()
Unexpected side effects of measurement?
CPU sampling
• Get an impression of
– Call Graphs
– CPU time spent on node and called nodes
> sudo apt-get install google-perftools
> sudo apt-get install libunwind7-dev
> scons --use-cpu-profiler mongod
Setup & building with google-profiler
> mongod --dbpath <…>
Note: Do not use --fork
> mongo
> use admin
> db.runCommand({_cpuProfilerStart: {profileFilename: 'foo.prof'}})
Execute some commands that you want to profile
> db.runCommand({_cpuProfilerStop: 1})
Start the profiling
Sample start vs. end of workload
Sample start vs. end of workload
Code change
Public Benchmarks – Not all forks are the same…
• YCSB
– https://github.com/achille/YCSB
• sysbench-mongodb
– https://github.com/mdcallag/sysbench-mongodb
Part Three
And next?
Beavis & Butthead
"The future sucks. Change it."
"I'm way cool Beavis, but I cannot change the
future."
What we are working on
• mongo-perf
– UI refactor
– Adding more micro benchmarks
• Workloads
– Adding external benchmarks
– Creating benchmarks for common use cases
• Inbox fan in/out
• Analytical dashboards
• Stream / Feeds
• Customers, Partners & Community
Here's how you can help change the future!
• Got a great workload? Great benchmark?
• Want to donate it?
• alvin@mongodb.com
Don't be that benchmark…
#1 Know what you are measuring
#2 Measure only what you need to measure
alvin@mongodb.com / @jonnyeight
Senior Director of Performance Engineering, MongoDB
Alvin Richards
#MongoDBWorld
Thank You

MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Recently uploaded (20)

Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 

Mythbusting: Understanding How We Measure the Performance of MongoDB

  • 1. Senior Director of Performance Engineering, MongoDB Alvin Richards #MongoDBWorld Mythbusting: Understanding How We Measure the Performance of MongoDB
  • 2. Before we start… • We are going to look a lot at – C++ kernel code – Java benchmarks – JavaScript tests • And lots of charts • And it's going to be awesome!
  • 4. Benchmarking • Some common traps • Performance measurement & diagnosis • What's next
  • 6. The Milk Train Doesn't Stop Here Anymore Tennessee Williams "We all live in a house on fire, no fire department to call; no way out, just the upstairs window to look out of while the fire burns the house down with us trapped, locked in it."
  • 7. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…"; doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
  • 12. (Code as on slide 7.) So that looks ok, right?
  • 13-17. (Code as on slide 7.) What else are you measuring? Object creation and GC management? Thread contention on nextInt()? Time to synthesize data? Thread contention on addAndGet()? Clock resolution?
  • 18. // Pre-create the objects outside the loop BasicDBObject[] aDocs = new BasicDBObject[documentsPerInsert]; for (int i=0; i < documentsPerInsert; i++) { BasicDBObject doc = new BasicDBObject(); String cVal = "…"; doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i] = doc; } Solution: Pre-create the objects. Pre-create non-varying data outside the timing loop. Alternative • Pre-create the data in a file; load from file
  • 19-20. // Use a ThreadLocalRandom generator or an instance of java.util.Random per thread ThreadLocalRandom rand = ThreadLocalRandom.current(); for (long roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; doc = aDocs[i]; doc.put("_id",id); doc.put("k", rand.nextInt(numMaxInserts)+1); } coll.insert(aDocs); numInserts += documentsPerInsert; } // Maintain the shared count outside the loop globalInserts.addAndGet(numInserts); Solution: Remove contention. Remove contention on nextInt() by making the generator thread-local. Remove contention on addAndGet() by updating the shared counter once, after the loop.
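The two contention fixes can be tried without a database. The following is a minimal sketch, not the benchmark from the talk: the class name `LocalCounterSketch`, its methods, and the no-op that stands in for `coll.insert(aDocs)` are all assumptions made for illustration. Each worker draws from `ThreadLocalRandom` and counts locally, then publishes to the shared `AtomicLong` once.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

public class LocalCounterSketch {
    // Shared counter, updated once per worker rather than once per batch.
    static final AtomicLong globalInserts = new AtomicLong();

    // Hypothetical worker: the database insert is omitted so the sketch
    // runs without a MongoDB server.
    static long runWorker(int numRounds, int documentsPerInsert, int numMaxInserts) {
        ThreadLocalRandom rand = ThreadLocalRandom.current(); // per-thread generator
        long numInserts = 0;
        long checksum = 0;
        for (int roundNum = 0; roundNum < numRounds; roundNum++) {
            for (int i = 0; i < documentsPerInsert; i++) {
                // Stands in for doc.put("k", rand.nextInt(numMaxInserts) + 1).
                checksum += rand.nextInt(numMaxInserts) + 1;
            }
            numInserts += documentsPerInsert; // thread-local count, no contention
        }
        globalInserts.addAndGet(numInserts); // single publish per worker
        return checksum;
    }

    // Runs the given number of workers and returns the shared total.
    static long runAll(int threads, int numRounds, int documentsPerInsert) {
        globalInserts.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> runWorker(numRounds, documentsPerInsert, 1000));
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return globalInserts.get();
    }

    public static void main(String[] args) {
        System.out.println("total inserts counted: " + runAll(4, 100, 10));
    }
}
```

With this shape the shared counter sees one update per thread, so its cost no longer scales with the number of batches being timed.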
  • 21. Solution: Timer resolution. Before: long startTime = System.currentTimeMillis(); … long endTime = System.currentTimeMillis(); After: long startTime = System.nanoTime(); … long elapsedNanos = System.nanoTime() - startTime; On nanoTime(): "resolution is at least as good as that of currentTimeMillis()". On currentTimeMillis(): "granularity of the value depends on the underlying operating system and may be larger". Source • http://docs.oracle.com/javase/7/docs/api/java/lang/System.html
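The granularity caveat can be observed directly. This is a minimal sketch under stated assumptions: `TimerGranularitySketch` and its method names are hypothetical, and the observed step depends on the operating system and JVM, so no particular value should be expected.

```java
public class TimerGranularitySketch {
    // Spins until System.currentTimeMillis() changes and returns the size of
    // that change in milliseconds: one observed tick of the wall clock.
    static long observedMillisStep() {
        long start = System.currentTimeMillis();
        long next;
        do {
            next = System.currentTimeMillis();
        } while (next == start);
        return next - start;
    }

    // Times a task with System.nanoTime(), which is intended for measuring
    // elapsed time rather than for reading the wall clock.
    static long elapsedNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        System.out.println("currentTimeMillis step (ms): " + observedMillisStep());
        System.out.println("empty task (ns): " + elapsedNanos(() -> { }));
    }
}
```

If the observed step is comparable to the duration of the operation being timed, the millisecond clock cannot resolve it, which is the argument for `System.nanoTime()` on the slide.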
  • 22. General Principle #1 Know what you are measuring
  • 23. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
  • 27. (Code as on slide 23.) So that looks ok, right?
  • 28-30. (Code as on slide 23.) What else are you measuring? Each doc is 4080 bytes on disk with powerOf2Sizes. Unrestricted predicate? Measuring • Time to parse & execute query • Time to retrieve all documents But also • Cost of shipping ~4MB data through network stack
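The "~4MB" figure follows from the slide's own numbers: 1000 documents at 4080 bytes each. A minimal sketch of that arithmetic, where `PayloadEstimateSketch` and `totalBytes` are hypothetical names and the per-document size is simply the value quoted on the slide:

```java
public class PayloadEstimateSketch {
    // Total bytes for a result set, given a document count and a
    // per-document size in bytes.
    static long totalBytes(long documents, long bytesPerDocument) {
        return documents * bytesPerDocument;
    }

    public static void main(String[] args) {
        long bytes = totalBytes(1000, 4080); // figures quoted on the slide
        System.out.println(bytes + " bytes, roughly " + (bytes / 1_000_000) + " MB");
    }
}
```

An unrestricted predicate therefore times the network transfer of every document in the collection along with the query itself.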
  • 31-33. BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Solution: Limit the projection. Return fixed range. Only project _id. Only 46k transferred through network stack
  • 34. General Principle #2 Measure only what you need to measure
  • 36. The Physical Principles of the Quantum Theory (1930) Werner Heisenberg "Every experiment destroys some of the knowledge of the system which was obtained by previous experiments."
  • 37. Broad categories • Micro Benchmarks • Workloads
  • 39. mongo-perf: goals • Measure – commands • Configure – Single mongod, ReplSet size (1 -> n), Sharding – Single vs. Multiple DB – O/S • Characterize – Throughput by thread count • Compare
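"Throughput by thread count" can be illustrated with a generic harness. This is only a sketch of the idea and not mongo-perf itself: `ThroughputByThreadsSketch`, `opsPerSecond`, and `sweep` are hypothetical names, and the operation under test is any `Runnable` supplied by the caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class ThroughputByThreadsSketch {
    // Runs op repeatedly on the given number of threads for roughly
    // durationMillis and returns completed operations per second.
    static double opsPerSecond(int threads, long durationMillis, Runnable op) {
        AtomicLong total = new AtomicLong();
        long start = System.nanoTime();
        long deadline = start + durationMillis * 1_000_000L;
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                long local = 0; // per-thread count, published once at the end
                do {
                    op.run();
                    local++;
                } while (deadline - System.nanoTime() > 0);
                total.addAndGet(local);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return total.get() / seconds;
    }

    // Measures throughput once for each thread count, in the order given.
    static Map<Integer, Double> sweep(int[] threadCounts, long durationMillis, Runnable op) {
        Map<Integer, Double> results = new LinkedHashMap<>();
        for (int threads : threadCounts) {
            results.put(threads, opsPerSecond(threads, durationMillis, op));
        }
        return results;
    }

    public static void main(String[] args) {
        // The no-op stands in for a database command.
        System.out.println(sweep(new int[] {1, 2, 4}, 100, () -> { }));
    }
}
```

Note that the loop also pays for a `System.nanoTime()` call per operation, so with a very cheap operation the harness measures itself; that is General Principle #1 again.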
  • 40. What do you get? Better
  • 41. What do you get? Measured improvement between rc0 and rc2 Better
  • 42. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
  • 47. Workloads • "public" workloads – YCSB – Sysbench • "real world" simulations – Inbox fan in/out – Message Stores – Content Management
  • 48. Example: Bulk Load Performance 16m Documents Better 55% degradation 2.6.0-rc1 vs 2.4.10
  • 49. Ouch… where's the tree in the woods? • 2.4.10 -> 2.6.0 – 4495 git commits
  • 50. git-bisect • Bisect between good/bad hashes • git-bisect nominates a new githash – Build against githash – Re-run test – Confirm if this githash is good/bad • Rinse and repeat
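The bisect loop is a binary search over the commit range. A minimal sketch of that logic, assuming commits are numbered in order and a hypothetical `isBad` predicate stands in for "build against the githash and re-run the test"; it does not call git:

```java
import java.util.function.IntPredicate;

public class BisectSketch {
    // good: index of a commit known to be good; bad: index of a commit known
    // to be bad, with good < bad. Returns {firstBadIndex, stepsTaken}.
    static int[] firstBad(int good, int bad, IntPredicate isBad) {
        int steps = 0;
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2; // commit nominated for testing
            if (isBad.test(mid)) {
                bad = mid;  // regression is at mid or earlier
            } else {
                good = mid; // regression is after mid
            }
            steps++;
        }
        return new int[] {bad, steps};
    }

    public static void main(String[] args) {
        // Hypothetical: the regression is introduced at commit 3000 of 4495.
        int[] result = firstBad(0, 4495, commit -> commit >= 3000);
        System.out.println("first bad commit index: " + result[0]
                + ", builds needed: " + result[1]);
    }
}
```

Because each step halves the range, 4495 commits need at most 13 build-and-test cycles, which is what makes the "rinse and repeat" practical.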
  • 51. Code Change - Bad Githash
  • 53. Bulk Load Performance - Fix Better 11% improvement 2.6.1 vs 2.4.10
  • 54. The problem with measurement • Observability – What can you observe on the system? • Effect – What effects does the observation cause?
  • 56. mtools • MongoDB log file analysis – Filter logs for operations, events – Response time, lock durations – Plot • https://github.com/rueckstiess/mtools
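The kind of filtering mtools performs can be approximated in a few lines. This sketch is not mtools and does not reproduce its options: `SlowOpFilterSketch` is a hypothetical class, and it assumes each operation line ends with its duration in the form `<number>ms`, as mongod log lines of that era commonly do. The sample lines in `main` are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SlowOpFilterSketch {
    // Assumed format: the line ends with the operation duration, e.g. "123ms".
    private static final Pattern DURATION = Pattern.compile("(\\d+)ms\\s*$");

    // Returns the lines whose trailing duration exceeds thresholdMillis.
    static List<String> slowerThan(List<String> lines, long thresholdMillis) {
        List<String> slow = new ArrayList<>();
        for (String line : lines) {
            Matcher m = DURATION.matcher(line);
            if (m.find() && Long.parseLong(m.group(1)) > thresholdMillis) {
                slow.add(line);
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        // Hypothetical log lines, not taken from a real server.
        List<String> lines = Arrays.asList(
                "[conn1] insert test.coll ninserted:1000 250ms",
                "[conn2] query test.coll nreturned:1 3ms",
                "[initandlisten] waiting for connections");
        slowerThan(lines, 100).forEach(System.out::println);
    }
}
```

For real analysis the mtools scripts linked on the slide are the better choice, since they handle the log format variations this sketch ignores.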
  • 57. Response Times > 100ms Bulk Insert 2.6.0-rc0 Ops/Sec Time
  • 58. Response Times > 100ms Bulk Insert 2.6.0-rc0 vs. 2.6.0-rc2 Floor raised
  • 59. Code Change – Yielding Policy
  • 61. Response Times Bulk Insert 2.6.0 vs 2.6.1 Ceiling similar, lower floor resulting in 40% improvement in throughput
  • 62. Secondary effects of Yield policy change Write lock time reduced Order of magnitude reduction of write lock duration
  • 63. > db.serverStatus() Yes – will cause a read lock to be acquired > db.serverStatus({recordStats:0}) No – lock is not acquired > mongostat Yes - until SERVER-14008 resolved, uses db.serverStatus() Unexpected side effects of measurement?
  • 64. CPU sampling • Get an impression of – Call Graphs – CPU time spent on node and called nodes
  • 65. > sudo apt-get install google-perftools > sudo apt-get install libunwind7-dev > scons --use-cpu-profiler mongod Setup & building with google-profiler
  • 66. > mongod --dbpath <…> Note: Do not use --fork > mongo > use admin > db.runCommand({_cpuProfilerStart: {profileFilename: 'foo.prof'}}) Execute some commands that you want to profile > db.runCommand({_cpuProfilerStop: 1}) Start the profiling
  • 67. Sample start vs. end of workload
  • 68. Sample start vs. end of workload
  • 70. Public Benchmarks – Not all forks are the same… • YCSB – https://github.com/achille/YCSB • sysbench-mongodb – https://github.com/mdcallag/sysbench-mongodb
  • 72. Beavis & Butthead "The future sucks. Change it." "I'm way cool Beavis, but I cannot change the future."
  • 73. What we are working on • mongo-perf – UI refactor – Adding more micro benchmarks • Workloads – Adding external benchmarks – Creating benchmarks for common use cases • Inbox fan in/out • Analytical dashboards • Stream / Feeds • Customers, Partners & Community
  • 74. Here's how you can help change the future! • Got a great workload? Great benchmark? • Want to donate it? • alvin@mongodb.com
  • 75. Don't be that benchmark… #1 Know what you are measuring #2 Measure only what you need to measure
  • 76. alvin@mongodb.com / @jonnyeight Senior Director of Performance Engineering, MongoDB Alvin Richards #MongoDBWorld Thank You

Editor's Notes

  1. Per Java7 documentation http://docs.oracle.com/javase/7/docs/api/java/util/Random.html "Instances of java.util.Random are threadsafe. However, the concurrent use of the same java.util.Random instance across threads may encounter contention and consequent poor performance. Consider instead using ThreadLocalRandom in multithreaded designs."
  2. Per Java7 documentation http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/package-summary.html "The specifications of these methods enable implementations to employ efficient machine-level atomic instructions that are available on contemporary processors. However on some platforms, support may entail some form of internal locking."
  3. Per Java7 documentation http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#currentTimeMillis() "Returns the current time in milliseconds. Note that while the unit of time of the return value is a millisecond, the granularity of the value depends on the underlying operating system and may be larger. For example, many operating systems measure time in units of tens of milliseconds."
  4. Per Java7 documentation http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadLocalRandom.html "A random number generator isolated to the current thread…Use of ThreadLocalRandom is particularly appropriate when multiple tasks (for example, each a ForkJoinTask) use random numbers in parallel in thread pools."
  5. Per Java7 documentation http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime() "This method provides nanosecond precision, but not necessarily nanosecond resolution (that is, how frequently the value changes) - no guarantees are made except that the resolution is at least as good as that of currentTimeMillis()."
  6. Githash https://github.com/mongodb/mongo/commit/d1dc7cf2b213d77103658ccd2ea4816b33a27f6a#diff-7ba76fe024c203ca35087f3b93395acc
  7. Githash https://github.com/mongodb/mongo/commit/00f7aeaa25f98de5e66f0759d5b102951a247526#diff-fa99d4a7f4e8efac0787f30c60814eaf
  8. Githash https://github.com/mongodb/mongo/commit/68d42de9a958688acbf659dfb651fb699e9d7394#diff-fa99d4a7f4e8efac0787f30c60814eaf
  9. Githash https://github.com/mongodb/mongo/commit/00f7aeaa25f98de5e66f0759d5b102951a247526#diff-fa99d4a7f4e8efac0787f30c60814eaf
  10. Githash https://github.com/mongodb/mongo/commit/68d42de9a958688acbf659dfb651fb699e9d7394#diff-fa99d4a7f4e8efac0787f30c60814eaf
  11. Githash https://github.com/mongodb/mongo/commit/8d43b5cb9949c16452cb8d949c89d94cab9c8bad#diff-264fb70c85a638c671570970f3752bf3