The Rough Guide to MongoDB
Simeon Simeonov
@simeons
Founding. Funding.
Growing. Startups.
Why MongoDB?
I am @simeons
recruit amazing people
solve hard problems
ship
make users happy
repeat
Why MongoDB?
Again,
please
SQL is slow
(for our business)
SQL is slow
(for our developer workflow)
SQL is slow
(for our analytics system)
So what’s Swoop?
Display Advertising
Makes the Web Suck
User-focused optimization
Tens of millions of users
1000+% better than average
200+% better than Google
Swoop Fixes That
Mobile SDKs: iOS & Android
Web SDK: RequireJS & jQuery
Components: AngularJS
NLP, etc.: Python
Targeting: High-Perf Java
Analytics: Ruby 2.0
Internal Apps: Ruby 2.0 / Rails 3
Pub Portal: Ruby 2.0 / Rails 3
Ad Portal: Ruby 2.0 / Rails 4
MongoDB: the Good
Fast
Flexible
JavaScript
MongoDB: the Bad
Not Quite Enterprise-Grade
Not Quite Enterprise-Grade
Not Cheap to Run Well
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
RAM + locks == $$$
Five Steps to Happiness
Sharding
Native Relationships
Atomic Update Buffering
Content-Addressed Storage
Shell Tricks
// Google AdWords object model
Account
Campaign
AdGroup // this joins ads & keywords
Ad
Keyword
// For example
AdGroup has an Account
AdGroup has a Campaign
AdGroup has many Ads
AdGroup has many Keywords
Slam dunk
for SQL
// Let’s play a bit
Account
Campaign
AdGroup
Ad
Keyword
// Let’s play some more
Account
Campaign
AdGroup
Ad
Keyword
// There is just one bit left
Account
Campaign
AdGroup
1 Ad
0 Keyword
// build a hierarchical ID
accountID campaignID adGroupID ((0 keywordID) | (1 adID))
// a binary ID!
10100100001100000000101001100110101010010100
< accountID >< campaignID >< …
// Encode it in base 16, 32 or 64
{"_id" : "a4300a66a94d20f1", … }
// Example
The 5th ad of the 3rd ad group of the 7th campaign of the 255th account
could have the _id 0x00ff000700031005
The _id for the 10th keyword of the same ad group
would be 0x00ff00070003000a
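A minimal sketch of how such an _id could be packed, not necessarily how Swoop does it. The field layout (three 16-bit fields, then a 4-bit type marker and a 12-bit leaf ID) is an assumption read off the two example _ids above, and `packId`, `toHex`, `AD` and `KEYWORD` are hypothetical names. BigInt is used because a 64-bit value does not fit exactly in a JavaScript Number.

```javascript
// Assumed layout, inferred from the example _ids (not confirmed by the deck):
//   [ accountID:16 ][ campaignID:16 ][ adGroupID:16 ][ type:4 ][ leafID:12 ]
// type is 1 for an ad and 0 for a keyword.
const AD = 1n;
const KEYWORD = 0n;

// Hypothetical helper: pack the fields into one 64-bit BigInt.
function packId(accountId, campaignId, adGroupId, type, leafId) {
  const check = (value, bits, name) => {
    if (value < 0n || value >= (1n << bits)) {
      throw new RangeError(name + " does not fit in " + bits + " bits");
    }
  };
  check(accountId, 16n, "accountId");
  check(campaignId, 16n, "campaignId");
  check(adGroupId, 16n, "adGroupId");
  check(type, 4n, "type");
  check(leafId, 12n, "leafId");
  return (accountId << 48n) | (campaignId << 32n) | (adGroupId << 16n) |
         (type << 12n) | leafId;
}

// Fixed-width, lowercase base-16 encoding, usable as a string _id.
function toHex(id) {
  return id.toString(16).padStart(16, "0");
}

const adId = packId(255n, 7n, 3n, AD, 5n);
const keywordId = packId(255n, 7n, 3n, KEYWORD, 10n);
console.log(toHex(adId));      // 00ff000700031005
console.log(toHex(keywordId)); // 00ff00070003000a
```

The range checks matter: a field that overflows its width would silently corrupt the ancestor IDs stored in the higher bits.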
// Neat: the ad’s and keyword’s _ids contain the
// IDs of all of their ancestors in the hierarchy.
keywordId = 0x00ff00070003000a
adGroupId = keywordId & 0xffffffffffff0000
campaignId = keywordId & 0xffffffff00000000
accountId = keywordId & 0xffff000000000000
// has-a relationship is a simple lookup
account = db.accounts.findOne({_id: accountId})
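The masks above read best as pseudocode: JavaScript's `&` operator works on 32-bit integers, and a plain Number cannot hold a 64-bit ID exactly. A hedged sketch of the same masking with BigInt literals follows; the `hex` helper is a hypothetical name, and it assumes the _ids are stored as fixed-width hex strings.

```javascript
// BigInt literals (trailing n) keep all 64 bits intact.
const keywordId = 0x00ff00070003000an;

// Zero out everything below each level to get the ancestor's ID.
const adGroupId = keywordId & 0xffffffffffff0000n;
const campaignId = keywordId & 0xffffffff00000000n;
const accountId = keywordId & 0xffff000000000000n;

// Hypothetical helper: fixed-width hex string for display or for use as _id.
const hex = (id) => "0x" + id.toString(16).padStart(16, "0");

console.log(hex(adGroupId));  // 0x00ff000700030000
console.log(hex(campaignId)); // 0x00ff000700000000
console.log(hex(accountId));  // 0x00ff000000000000
```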
// Neater: has-many relationships are just
// range queries on the _id index.
adGroupId = keywordId & 0xffffffffffff0000
startOfAds = adGroupId + 0x1000
endOfAds = adGroupId + 0x1fff
adsForKeyword = db.ads.find({
_id: {$gte: startOfAds, $lte: endOfAds}
})
// Technically, that was a join via the ad group.
// Who said Mongo can’t do joins???
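A sketch of building that range filter, assuming _ids are stored as fixed-width lowercase hex strings (for which string order matches numeric order). `adsFilterFor` and `hex` are hypothetical names, and the 0x1000 to 0x1fff window for ads is taken from the slide above.

```javascript
// Hypothetical helper: fixed-width hex so that string comparison
// orders IDs the same way numeric comparison would.
const hex = (id) => id.toString(16).padStart(16, "0");

// Build a range filter matching every ad in the same ad group as the given ID.
function adsFilterFor(anyIdInGroup) {
  const adGroupId = anyIdInGroup & 0xffffffffffff0000n;
  return {
    _id: { $gte: hex(adGroupId + 0x1000n), $lte: hex(adGroupId + 0x1fffn) },
  };
}

const filter = adsFilterFor(0x00ff00070003000an);
console.log(filter._id.$gte); // 00ff000700031000
console.log(filter._id.$lte); // 00ff000700031fff

// The 5th ad of the same ad group falls inside the range.
const adId = "00ff000700031005";
console.log(filter._id.$gte <= adId && adId <= filter._id.$lte); // true
```

The resulting object could be passed to `db.ads.find(...)`; because it only constrains _id, the default _id index can serve the query.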
> db.reports.findOne()
{
"_id" : …,
"period" : "hour",
"shard" : 0, // 16 MB doc limit protection
"topic" : "ce-1",
"ts" : ISODate("2012-06-12T05:00:00Z"),
"variations" : {
"2" : { // variationID (dimension set)
"hint" : {
"present" : 311, // hint.present is a metric
"clicks" : 1
}
},
"4" : {
"hint" : {
"present" : 331
}
}
}
}
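The deck lists "Atomic Update Buffering" but the transcript shows only the resulting report document, so the following is a guess at the idea rather than the author's implementation: accumulate metric increments in memory, then flush them as a single `$inc` update using dotted field paths. `IncBuffer` and its methods are hypothetical names.

```javascript
// Hypothetical in-memory buffer for metric increments.
class IncBuffer {
  constructor() {
    this.counts = new Map(); // dotted field path -> pending increment
  }

  // Record an increment for one metric of one variation.
  add(variationId, metric, amount = 1) {
    const path = `variations.${variationId}.${metric}`;
    this.counts.set(path, (this.counts.get(path) || 0) + amount);
  }

  // Return one update document covering everything buffered so far,
  // or null when there is nothing to write.
  flush() {
    if (this.counts.size === 0) return null;
    const update = { $inc: Object.fromEntries(this.counts) };
    this.counts.clear();
    return update;
  }
}

const buffer = new IncBuffer();
buffer.add(2, "hint.present");
buffer.add(2, "hint.present");
buffer.add(2, "hint.clicks");
buffer.add(4, "hint.present", 3);
const update = buffer.flush();
console.log(JSON.stringify(update));
```

The flushed document could be applied with an upsert against the report for the current period, so many events cost one write instead of one write each.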
Content-Addressed Storage
Lazy join abstraction
Very space efficient
Extremely (pre-)cacheable
Join only happens during reporting
// Step 1: take a set of dimensions worth tracking
data = {
"domain_id" : "SW-28077508-16444",
"hint" : "Find an organic alternative",
"theme" : "red"
}
// Step 2: compute a digital signature, e.g., MD5
sig = "000069569F4835D16E69DF704187AC2F"
// Step 3: if new sig, increment a counter
counter = 264034
// Step 4: create a document in the content-addressed
// store collection for these dimensions
> db.cas.findOne()
{
"_id" : "000069569F4835D16E69DF704187AC2F", // MD5 hash
"data" : { // data that was digested to the hash above
"domain_id" : "SW-28077508-16444",
"hint" : "Find an organic alternative",
"theme" : "red"
},
"meta_data" : {
"id" : 264034 // variationID
},
"created_at" : ISODate("2013-02-04T12:05:34.752Z")
}
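The four steps can be sketched in Node.js as follows. This is an illustration under stated assumptions, not the deck's code: the canonical form (sorted keys, JSON) is a guess, so the hash will not necessarily match the signature shown above, and the `Map` stands in for the `db.cas` collection. `signature` and `intern` are hypothetical names.

```javascript
const crypto = require("crypto");

// Step 2: hash a canonical form of the dimensions. Sorting the keys makes
// the signature independent of property order (flat objects only).
function signature(data) {
  const canonical = JSON.stringify(
    Object.keys(data).sort().map((key) => [key, data[key]])
  );
  return crypto.createHash("md5").update(canonical, "utf8")
    .digest("hex").toUpperCase();
}

const cas = new Map(); // stands in for db.cas, keyed by _id
let counter = 0;       // source of variationIDs

// Steps 3 and 4: create a document only for a signature not seen before.
function intern(data) {
  const sig = signature(data);
  let doc = cas.get(sig);
  if (!doc) {
    counter += 1;
    doc = { _id: sig, data: data, meta_data: { id: counter }, created_at: new Date() };
    cas.set(sig, doc);
  }
  return doc.meta_data.id;
}

const a = intern({ domain_id: "SW-28077508-16444", hint: "Find an organic alternative", theme: "red" });
const b = intern({ theme: "red", hint: "Find an organic alternative", domain_id: "SW-28077508-16444" });
const c = intern({ domain_id: "SW-28077508-16444", hint: "Find an organic alternative", theme: "blue" });
console.log(a, b, c); // 1 1 2
```

Against a real database the check-then-insert in `intern` would need to be atomic, for example by relying on the unique _id index to reject a concurrent duplicate.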
// Elsewhere, in the reports collection…
"variations" : {
"264034" : {
// metrics here
},
…
// lazy join: the variationID key matches meta_data.id in db.cas
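A rough sketch of resolving that join at reporting time, using in-memory stand-ins for both collections. `joinReport` is a hypothetical name and the metric values are illustrative sample data.

```javascript
// Stand-in for db.cas, indexed by meta_data.id (the variationID).
const casById = new Map([
  [264034, {
    domain_id: "SW-28077508-16444",
    hint: "Find an organic alternative",
    theme: "red",
  }],
]);

// Stand-in for one reports document; the metric values are made up.
const report = {
  period: "hour",
  variations: { "264034": { hint: { present: 311, clicks: 1 } } },
};

// Attach the dimension set to each variation's metrics. Unknown
// variationIDs yield null dimensions instead of throwing.
function joinReport(report, casById) {
  return Object.entries(report.variations).map(([variationId, metrics]) => ({
    variationId: Number(variationId),
    dimensions: casById.get(Number(variationId)) || null,
    metrics: metrics,
  }));
}

const rows = joinReport(report, casById);
console.log(rows[0].dimensions.theme, rows[0].metrics.hint.present); // red 311
```

Because a dimension set never changes once it has a signature, a lookup table like `casById` can be cached aggressively.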
// Use underscore.js in the shell
// See http://underscorejs.org/
function underscore() {
load("/mongo_hacks/underscore.js");
}
// Loads underscore.js on the MongoDB server.
// Note: db.eval() was deprecated in MongoDB 3.0 and removed in 4.2.
function server_underscore(force) {
force = force || false;
if (force || typeof(underscoreLoaded) === 'undefined') {
db.eval(cat("/mongo_hacks/underscore.js"));
underscoreLoaded = true;
}
}
// Callstack printing on exception -- wraps a function
function dbg(f) {
try {
f();
} catch (e) {
print("\n**** Exception: " + e.toString());
print("\n");
print(e.stack);
print("\n");
if (arguments.length > 1) {
printjson(arguments);
print("\n");
}
throw e;
}
}
function minutesAgo(minutes, d) {
d = d || new Date();
return new Date(d.valueOf() - minutes * 60 * 1000);
}
function hoursAgo(hours, d) {
d = d || new Date();
return minutesAgo(60 * hours, d);
}
function daysAgo(days, d) {
d = d || new Date();
return hoursAgo(24 * days, d);
}
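One possible use of these helpers is building time-window filters. The sketch below redefines a single helper so it runs on its own, and passes an explicit reference date so the result is reproducible; the `period` and `ts` field names come from the reports document shown earlier, and the query shape is an assumption.

```javascript
// Same idea as the helper above, collapsed into one function for a
// self-contained example.
function hoursAgo(hours, d) {
  d = d || new Date();
  return new Date(d.valueOf() - hours * 60 * 60 * 1000);
}

// Fixed reference time instead of "now", so the output is deterministic.
const now = new Date("2012-06-12T05:00:00Z");

// Filter for hourly reports from the 24 hours before the reference time.
const filter = {
  period: "hour",
  ts: { $gte: hoursAgo(24, now), $lte: now },
};

console.log(filter.ts.$gte.toISOString()); // 2012-06-11T05:00:00.000Z
```

In the shell the filter could be passed to `db.reports.find(...)`; omitting the second argument makes the window relative to the current time.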
// Don’t write in the shell.
// Use your fav editor, save & type t() in mongo
function t() {
load("/mongo_hacks/bag_of_tricks.js");
}
@simeons
sim@swoop.com
