This presentation was given at the LDS Tech SORT Conference 2011 in Salt Lake City. The slides are quite comprehensive, covering many MongoDB topics. Rather than a traditional presentation, it was run as more of a Q&A session. Topics covered include an introduction to MongoDB, use cases, schema design, high availability (replication), and horizontal scaling (sharding).
22. Signs something
needed
• doubleclick - 400,000 ads/second
• people writing their own stores
• caching is de rigueur
• complex ORM frameworks
• computer architecture trends
• cloud computing
23. Requirements
• need a good degree of functionality
to handle a large set of use cases
• sometimes need strong
consistency / atomicity
• secondary indexes
• ad hoc queries
24. Trim unneeded
features
• leave out a few things so we can
scale
• no choice but to leave out
relational
• distributed transactions are hard
to scale
25. Needed a scalable
data model
• some options:
• key/value
• columnar / tabular
• document oriented (JSON inspired)
• opportunity to innovate -> agility
26. MongoDB philosophy
• No longer one-size-fits-all, but not 12 tools either.
• Non-relational (no joins) makes scaling horizontally
practical
• Document data models are good
• Keep functionality when we can (key/value stores are
great, but we need more)
• Database technology should run anywhere, being
available both for running on your own servers or VMs,
and also as a cloud pay-for-what-you-use service.
• Ideally open source...
27. MongoDB
• JSON Documents
• Querying/Indexing/Updating similar
to relational databases
• Traditional Consistency
• Auto-Sharding
28. Under the hood
• Written in C++
• Available on most platforms
• Data serialized to BSON
• Extensive use of memory-mapped
files
33. Photo Meta-
Problem:
• Business needed more flexibility than Oracle could deliver
Solution:
• Used MongoDB instead of Oracle
Results:
• Developed application in one sprint cycle
• 500% cost reduction compared to Oracle
• 900% performance improvement compared to Oracle
34. Customer Analytics
Problem:
• Deal with massive data volume across all customer sites
Solution:
• Used MongoDB to replace Google Analytics / Omniture
options
Results:
• Less than one week to build prototype and prove business
case
• Rapid deployment of new features
35. Online
Problem:
• MySQL could not scale to handle their 5B+ documents
Solution:
• Switched from MySQL to MongoDB
Results:
• Massive simplification of code base
• Eliminated need for external caching system
• 20x performance improvement over MySQL
36. E-commerce
Problem:
• Multi-vertical E-commerce impossible to model (efficiently)
in RDBMS
Solution:
• Switched from MySQL to MongoDB
Results:
• Massive simplification of code base
• Rapidly build, halving time to market (and cost)
• Eliminated need for external caching system
• 50x+ improvement over MySQL
49. Secondary Indexes
Create index on any Field in Document
// 1 means ascending, -1 means descending
> db.posts.ensureIndex({author: 1})
> db.posts.find({author: 'roger'})
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
author : "roger",
... }
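To illustrate why the index on author helps the find() above, here is a minimal sketch in plain JavaScript. It is not MongoDB code: the `posts` array, `buildIndex`, and `findByAuthor` are hypothetical names used only for illustration. The index maps each field value to the matching documents, so a lookup does not have to scan the whole collection. A real index is an ordered structure that also supports sorting and range queries; the Map here shows only the lookup idea.

```javascript
// Toy collection standing in for db.posts (hypothetical data).
const posts = [
  { _id: 1, author: "roger", title: "first" },
  { _id: 2, author: "alice", title: "second" },
  { _id: 3, author: "roger", title: "third" },
];

// Build a secondary index: field value -> list of documents.
function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  return index;
}

// Look up by indexed value without scanning the whole collection.
function findByAuthor(index, author) {
  return index.get(author) || [];
}

const authorIndex = buildIndex(posts, "author");
console.log(findByAuthor(authorIndex, "roger").map((d) => d._id)); // [ 1, 3 ]
```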
58. MongoDB
Replication
•MongoDB replication like MySQL
replication (kinda)
•Asynchronous master/slave
•Variations
•Master / slave
•Replica Sets
59. Replica Set features
• A cluster of N servers
• Any (one) node can be primary
• Consensus election of primary
• Automatic failover
• Automatic recovery
• All writes to primary
• Reads can be to primary (default) or a
secondary
61. How MongoDB
Replication works
[Diagram: replica set of Member 1, Member 2 and Member 3; Member 2 is PRIMARY]
Election establishes the PRIMARY
Data replication from PRIMARY to SECONDARY
62. How MongoDB
Replication works
[Diagram: Member 1, Member 2 and Member 3; Member 2 is DOWN and the surviving members negotiate a new master]
PRIMARY may fail
Automatic election of new PRIMARY if majority
exists
63. How MongoDB
Replication works
[Diagram: Member 1, Member 2 and Member 3; Member 2 is DOWN and one of the surviving members is PRIMARY]
New PRIMARY elected
Replica set re-established
67. Replica Set Options
• {arbiterOnly: true}
• Can vote in an election
• Does not hold any data
• {hidden: true}
• Not reported in isMaster()
• Will not be sent slaveOk() reads
• {priority: n}
• {tags: }
68. Using Replicas for
Reads
• slaveOk()
  - driver will send read requests to Secondaries
  - driver will always send writes to Primary
• Java examples
  - DB.slaveOk()
  - Collection.slaveOk()
  - find(q).addOption(Bytes.QUERYOPTION_SLAVEOK);
69. Safe Writes
• db.runCommand({getLastError: 1, w : 1})
• - ensure write is synchronous
• - command returns after primary has written to memory
• w=n or w='majority'
• n is the number of nodes data must be replicated to
• driver will always send writes to Primary
• w='myTag' [MongoDB 2.0]
• Each member is "tagged" e.g. "US_EAST", "EMEA",
"US_WEST"
• Ensure that the write is executed in each tagged "region"
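The w values above can be read as "how many members must have the write before the command returns". The following is a rough sketch of that check in plain JavaScript, not driver code. `writeSatisfied` is a hypothetical helper, and it assumes the majority is computed over all members of the set, which is a simplification.

```javascript
// Decide whether a write has been acknowledged by enough members.
// w is a number or the string "majority"; acked is how many members
// (primary included) have the write; setSize is the number of members.
function writeSatisfied(w, acked, setSize) {
  const needed = w === "majority" ? Math.floor(setSize / 2) + 1 : w;
  return acked >= needed;
}

console.log(writeSatisfied(1, 1, 3));          // true
console.log(writeSatisfied("majority", 1, 3)); // false
console.log(writeSatisfied("majority", 2, 3)); // true
```

A larger w trades write latency for a stronger guarantee that the data survives a failover.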
70. Safe Writes
• fsync:true
• Ensures changed disk blocks are
flushed to disk
• j:true
• Ensures changes are flushed to the
journal
71. When are elections
triggered?
• When a given member sees that the
Primary is not reachable
• The member is not an Arbiter
• Has a priority greater than other
eligible members
72. Typical Deployments
Use? | Set size | Data Protection | High Availability | Notes
X | One | No | No | Must use --journal to protect against crashes
  | Two | Yes | No | On loss of one member, surviving member is read only
  | Three | Yes | Yes - 1 failure | On loss of one member, surviving two members can elect a new primary
X | Four | Yes | Yes - 1 failure* | * On loss of two members, surviving two members are read only
  | Five | Yes | Yes - 2 failures | On loss of two members, surviving three members can elect a new primary
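The high-availability column follows from the majority rule: a primary can only be elected while more than half of the members are reachable. A small sketch of that arithmetic in plain JavaScript (the function names are hypothetical, and every member is assumed to have one vote):

```javascript
// Votes needed to elect a primary in a set of n voting members.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// Number of members that can fail while a primary can still be elected.
function toleratedFailures(n) {
  return n - majority(n);
}

for (const n of [1, 2, 3, 4, 5]) {
  console.log(`size ${n}: majority ${majority(n)}, tolerates ${toleratedFailures(n)} failure(s)`);
}
```

This shows why a four-member set tolerates no more failures than a three-member set, and why odd set sizes are the usual choice.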
73. Replication features
• Reads from Primary are always
consistent
• Reads from Secondaries are eventually
consistent
• Automatic failover if a Primary fails
• Automatic recovery when a node joins
the set
• Control of where writes occur
89. Sharding Features
• Shard data with no downtime
• Automatic balancing as data is written
• Commands routed (switched) to correct node
• Inserts - must have the Shard Key
• Updates - must have the Shard Key
• Queries
• With Shard Key - routed to nodes
• Without Shard Key - scatter gather
• Indexed Queries
• With Shard Key - routed in order
• Without Shard Key - distributed sort merge
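The routing rule above (targeted when the shard key is present, scatter-gather otherwise) can be sketched in plain JavaScript. This is an illustration only, not mongos code: the `chunks` table, the shard names, and `targetShards` are hypothetical, and only equality matches on a numeric shard key are handled.

```javascript
// Hypothetical chunk table: each chunk covers [min, max) of the shard key.
const chunks = [
  { min: -Infinity, max: 100, shard: "shard0" },
  { min: 100, max: 200, shard: "shard1" },
  { min: 200, max: Infinity, shard: "shard2" },
];

// Return the shards a query must visit.
// query is a plain object; shardKey names the shard key field.
function targetShards(query, shardKey) {
  if (shardKey in query) {
    const v = query[shardKey];
    const chunk = chunks.find((c) => v >= c.min && v < c.max);
    return [chunk.shard]; // routed to the one shard owning that range
  }
  // No shard key in the query: scatter-gather across every shard.
  return [...new Set(chunks.map((c) => c.shard))];
}

console.log(targetShards({ userId: 150 }, "userId"));   // [ 'shard1' ]
console.log(targetShards({ name: "roger" }, "userId")); // [ 'shard0', 'shard1', 'shard2' ]
```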
92. Config Servers
• 3 of them
• changes are made with 2 phase
commit
• if any are down, meta data
goes read only
• system is online as long as 1/3
is up
94. Shards
• Can be master, master/slave or
replica sets
• Replica sets give sharding + full
auto-failover
• Regular mongod processes
96. Mongos
• Sharding Router
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on appserver so no extra
network traffic
99. Priorities
• Prior to 2.0.0
• {priority:0} // Never can be elected Primary
• {priority:1} // Can be elected Primary
• New in 2.0.0
• Priority, floating point number between 0 and 1000
• During an election
• Most up to date
• Highest priority
• Allows weighting of members during failover
100. Priorities - example
• Assuming all members are up to date
• Members A or B will be chosen first
• Highest priority
• Members C or D will be chosen next if
• A and B are unavailable
• A and B are not up to date
• Member E is never chosen
• priority:0 means it cannot be elected
[Diagram: members A (p:2), B (p:2), C (p:1), D (p:1), E (p:0)]
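The selection rule in this example can be sketched as a small function in plain JavaScript. It is a simplified model rather than the actual election protocol: `preferredCandidates` and the `eligible` predicate are hypothetical, and "eligible" stands in for "reachable and up to date".

```javascript
// Members from the slide: name and priority (p).
const members = [
  { name: "A", priority: 2 },
  { name: "B", priority: 2 },
  { name: "C", priority: 1 },
  { name: "D", priority: 1 },
  { name: "E", priority: 0 },
];

// Return the names of the members that would be preferred as primary.
// `eligible` is a predicate marking members that are reachable and up to date.
function preferredCandidates(all, eligible) {
  const pool = all.filter((m) => m.priority > 0 && eligible(m));
  if (pool.length === 0) return [];
  const top = Math.max(...pool.map((m) => m.priority));
  return pool.filter((m) => m.priority === top).map((m) => m.name);
}

console.log(preferredCandidates(members, () => true)); // [ 'A', 'B' ]
console.log(preferredCandidates(members, (m) => !["A", "B"].includes(m.name))); // [ 'C', 'D' ]
console.log(preferredCandidates(members, (m) => m.name === "E")); // []
```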
101. Tagging
• New in 2.0.0
• Control over where data is written to
• Each member can have one or more tags e.g.
• tags: {dc: "ny"}
• tags: {dc: "ny",
ip: "192.168",
rack: "row3rk7"}
• Replica set defines rules for where data resides
• Rules can change without changing application code
103. Use Cases - Multi
Data Center
• write to three data centers
• allDCs : {"dc" : 3}
• > db.runCommand({getLastError : 1, w : "allDCs"})
• write to two data centers and three availability zones
• allDCsPlus : {"dc" : 2, "az": 3}
• > db.runCommand({getLastError : 1, w : "allDCsPlus"})
[Diagram: five tagged members]
US-EAST-1: tag : {dc: "JFK", az: "r1"}
US-EAST-2: tag : {dc: "JFK", az: "r2"}
US-WEST-1: tag : {dc: "SFO", az: "r3"}
US-WEST-2: tag : {dc: "SFO", az: "r4"}
LONDON-1: tag : {dc: "LHR", az: "r5"}
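One way to read rules such as allDCs and allDCsPlus is that a write is acknowledged once the members holding it cover the required number of distinct values for each tag. The sketch below models that reading in plain JavaScript. It is an assumption about the semantics rather than server code, and `ruleSatisfied` is a hypothetical helper.

```javascript
// Tagged members from the slide.
const tagged = {
  "US-EAST-1": { dc: "JFK", az: "r1" },
  "US-EAST-2": { dc: "JFK", az: "r2" },
  "US-WEST-1": { dc: "SFO", az: "r3" },
  "US-WEST-2": { dc: "SFO", az: "r4" },
  "LONDON-1": { dc: "LHR", az: "r5" },
};

// A rule such as {dc: 2, az: 3} is met when the acknowledging members
// cover at least that many distinct values of each tag.
function ruleSatisfied(rule, ackedNames) {
  return Object.entries(rule).every(([tag, count]) => {
    const values = new Set(ackedNames.map((n) => tagged[n][tag]));
    return values.size >= count;
  });
}

const allDCs = { dc: 3 };
const allDCsPlus = { dc: 2, az: 3 };

console.log(ruleSatisfied(allDCs, ["US-EAST-1", "US-EAST-2", "US-WEST-1"]));     // false
console.log(ruleSatisfied(allDCs, ["US-EAST-1", "US-WEST-1", "LONDON-1"]));      // true
console.log(ruleSatisfied(allDCsPlus, ["US-EAST-1", "US-EAST-2", "US-WEST-1"])); // true
```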
104. Use Cases - Data Protection
& High Availability
• A and B will take priority during a failover
• C or D will become primary if A and B become unavailable
• E cannot be primary
• D and E cannot be read from with a slaveOk()
• D can be used for backups, feeding a Solr index, etc.
• E provides a safeguard against operational or application error
[Diagram: five-member replica set]
A: priority: 2
B: priority: 2
C: priority: 1
D: priority: 1, hidden: true
E: priority: 0, hidden: true, slaveDelay: 3600
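The member options on this slide can be checked against the bullet points with a short sketch in plain JavaScript. The objects below only mirror the values shown on the slide. They are a model for illustration, not a replica set configuration to paste into a server.

```javascript
// Member options as shown on the slide (slaveDelay is in seconds).
const setMembers = [
  { name: "A", priority: 2 },
  { name: "B", priority: 2 },
  { name: "C", priority: 1 },
  { name: "D", priority: 1, hidden: true },
  { name: "E", priority: 0, hidden: true, slaveDelay: 3600 },
];

// Members that may become primary: any with priority above 0.
const electable = setMembers.filter((m) => m.priority > 0).map((m) => m.name);

// Members that slaveOk() reads can reach: those not marked hidden.
const readable = setMembers.filter((m) => !m.hidden).map((m) => m.name);

console.log(electable); // [ 'A', 'B', 'C', 'D' ]
console.log(readable);  // [ 'A', 'B', 'C' ]
```

With a one-hour delay, E lags behind the primary, which is what makes it usable as a safeguard against an accidental drop or a bad application write.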
114. http://spf13.com
http://github.com/spf13
@spf13
Questions?
download at mongodb.org
PS: We’re hiring!! Contact us at
jobs@10gen.com
Editor's Notes
Remember that in 1995 there were around 10,000 websites. Mosaic, Lynx, Mozilla (pre-Netscape) and IE 2.0 were the only web browsers. Apache (Dec ’95), Java (’96), PHP (June ’95), and .NET didn’t exist yet. Linux just barely did (1.0 in ’94).
By reducing the transactional semantics the db provides, one can still solve an interesting set of problems where performance is very important, and horizontal scaling then becomes easier.
sharding isn’t new
write: add new paragraph. read: read through book.
don't go into indexes yet