These are the slides I presented at NoSQL Night in Boston on Nov 4, 2014. The slides were adapted from a presentation given by Steve Francia in 2011. The original slide deck can be found here:
http://spf13.com/presentation/mongodb-sort-conference-2011
3. • 12 years writing code
• 11 years using Oracle
• 9 months using Mongo
• BYU Alumnus
• Principal Engineer @ Cengage
• Currently doing MEAN stack dev
5. 1. Don’t want/need a rigid schema
2. Need horizontally scalable performance for high loads
3. Make sure you won’t need real-time reporting that aggregates a lot of disparate data
7. Photo Meta-Data
Problem:
• Business needed more flexibility than Oracle could deliver
Solution:
• Used MongoDB instead of Oracle
Results:
• Developed application in one sprint cycle
• 500% cost reduction compared to Oracle
• 900% performance improvement compared to Oracle
• http://www.mongodb.com/customers/shutterfly
Slide Courtesy of Steve Francia - http://spf13.com/presentation/mongodb-sort-conference-2011
8. Online Dictionary
Problem:
• MySQL could not scale to handle their 5B+ documents
Solution:
• Switched from MySQL to MongoDB
Results:
• Massive simplification of code base
• Eliminated need for external caching system
• 20x performance improvement over MySQL
• http://www.mongodb.com/customers/reverb-technologies
9. E-commerce
Problem:
• Multi-vertical E-commerce impossible to model (efficiently) in RDBMS
Solution:
• Switched from MySQL to MongoDB
Results:
• Massive simplification of code base
• Rapidly build, halving time to market (and cost)
• Eliminated need for external caching system
• 50x+ improvement over MySQL
10. Mongo’s Philosophy
• Mongo tries to provide a good degree of functionality to handle a large set of use cases
  • sometimes need strong consistency / atomicity
  • secondary indexes
  • ad hoc queries
11. Had to leave out a few things in order to scale
• No Joins
  • no choice here. Can’t have joins if we want to scale horizontally
• No ACID Transactions
  • distributed transactions are hard to scale
  • Mongo does not support multi-document transactions
  • Only document-level atomic operations provided
12. MongoDB
• JSON Documents
• Querying/Indexing/Updating similar to
relational databases
• Configurable Consistency
• Auto-Sharding
13. Database Landscape
14. MongoDB is:
Horizontally Scalable
Document Oriented
{ author: "steve",
  date: new Date(),
  text: "About MongoDB...",
  tags: ["tech", "database"] }
Application
High Performance
15. “MongoDB has the best features of key/value stores, document databases and relational databases in one.”
- John Nunemaker
17. Normalized Relational Data
18. Document databases make
normalized data look like this
19. Terminology
RDBMS Mongo
Table, View ➜ Collection
Row ➜ JSON Document
Index ➜ Index
Join ➜ Embedded Document
Partition ➜ Shard
Partition Key ➜ Shard Key
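To make the "Join ➜ Embedded Document" row concrete, here is a minimal sketch in plain JavaScript. It assumes two hypothetical tables, posts and comments, and folds the comment rows into the post they belong to. The names and fields are invented for illustration and are not from the original deck.

```javascript
// Hypothetical normalized "rows", as they might come out of two relational tables.
const posts = [
  { id: 1, author: "steve", text: "About MongoDB..." },
  { id: 2, author: "roger", text: "Sharding notes" },
];
const comments = [
  { postId: 1, who: "jane", body: "Nice overview" },
  { postId: 1, who: "bob", body: "Thanks" },
  { postId: 2, who: "jane", body: "Helpful" },
];

// Fold each post's comments into the post itself, producing one document per post.
function embedComments(posts, comments) {
  return posts.map((post) => ({
    ...post,
    comments: comments
      .filter((c) => c.postId === post.id)
      .map(({ who, body }) => ({ who, body })), // the foreign key is no longer needed
  }));
}

const documents = embedComments(posts, comments);
console.log(JSON.stringify(documents[0], null, 2));
```

Reading a post together with its comments then becomes a single document fetch instead of a join.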
23. Secondary Indexes
• Create index on any field in document
// 1 means ascending, -1 means descending
> db.posts.ensureIndex({author: 1})
> db.posts.find({author: 'roger'})
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "roger",
  ... }
SQL equivalent
CREATE INDEX ON posts(author)
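As a rough illustration of why the index helps, the sketch below compares a full scan with a lookup through a precomputed map. It is plain JavaScript over a hypothetical in-memory array, not MongoDB code. MongoDB's real indexes are ordered B-tree structures, so treat the Map here as a simplified stand-in.

```javascript
// Hypothetical in-memory stand-in for the posts collection.
const postsCollection = [
  { _id: 1, author: "roger", title: "A" },
  { _id: 2, author: "steve", title: "B" },
  { _id: 3, author: "roger", title: "C" },
];

// Build a lookup from field value to the positions of matching documents.
function buildIndex(collection, field) {
  const index = new Map();
  collection.forEach((doc, position) => {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(position);
  });
  return index;
}

// Without an index: look at every document.
function findByScan(collection, field, value) {
  return collection.filter((doc) => doc[field] === value);
}

// With an index: jump straight to the matching positions.
function findByIndex(collection, index, value) {
  return (index.get(value) || []).map((position) => collection[position]);
}

const authorIndex = buildIndex(postsCollection, "author");
console.log(findByIndex(postsCollection, authorIndex, "roger").map((d) => d._id));
```

Both functions return the same documents. The difference is how many documents have to be examined to find them.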
27. Our Use Case for Mongo
1. We needed to prototype some app ideas for a class test in the market. We didn’t want a hardened schema. We just wanted to get stuff out quickly to try it out.
2. We made sure that real-time analytic reporting wasn’t needed.
3. We were using Node.js on the backend, so Mongo was a natural fit.
28. What we gained by using Mongo
• Faster turnaround in development
• The flexibility to figure out our schema
design as we went and change our minds
often if needed
• A database that we could scale
horizontally if needed in the future
29. What we gave up by using Mongo
• No multi-document transactions. This means we could not guarantee consistency in some cases.
• Can’t write queries that use more than one collection. The aggregation framework only works on one collection at a time. Joining data has to be done programmatically and doesn’t scale.
• Nesting isn’t always possible, and there are no foreign key constraints to enforce consistency.
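What "joining programmatically" tends to look like can be sketched as follows. The two arrays are hypothetical stand-ins for collections that would each be fetched with their own query. The last order shows what missing foreign key constraints allow: a reference to a user that does not exist.

```javascript
// Hypothetical stand-ins for two collections, each fetched by a separate query.
const users = [
  { _id: "u1", name: "Ann" },
  { _id: "u2", name: "Raj" },
];
const orders = [
  { _id: "o1", userId: "u1", total: 30 },
  { _id: "o2", userId: "u2", total: 12 },
  { _id: "o3", userId: "u1", total: 8 },
  { _id: "o4", userId: "u9", total: 5 }, // dangling reference: nothing prevents it
];

// Join in application code: index one side by key, then look up per order.
function ordersWithUserNames(orders, users) {
  const byId = new Map(users.map((u) => [u._id, u]));
  return orders.map((o) => {
    const user = byId.get(o.userId);
    return { ...o, userName: user ? user.name : null };
  });
}

console.log(ordersWithUserNames(orders, users));
```

Both sides have to be pulled into the application before they can be combined, which is the part that stops scaling as the collections grow.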
31. Limitations
• Max BSON document size is 16MB
– Mongo provides GridFS to get around this
• No more than 100 levels of nesting
• No more than 12 members in a replica set
http://docs.mongodb.org/manual/reference/limits/
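A quick pre-flight check against the first two limits could look like the sketch below. It is only an approximation: it measures the JSON text size, which is not the same as the BSON size the server enforces, and its nesting count may not match the server's exactly. The function names are hypothetical.

```javascript
const MAX_DOC_BYTES = 16 * 1024 * 1024; // 16MB limit from the slide
const MAX_NESTING = 100;                // nesting limit from the slide

// Depth of nested objects/arrays; a flat document has depth 1.
function nestingDepth(value) {
  if (value === null || typeof value !== "object") return 0;
  let deepest = 0;
  for (const child of Object.values(value)) {
    deepest = Math.max(deepest, nestingDepth(child));
  }
  return 1 + deepest;
}

// Rough size estimate: UTF-8 bytes of the JSON form (NOT the exact BSON size).
function approxBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Return a list of limit violations; an empty list means the document looks fine.
function checkLimits(doc) {
  const problems = [];
  if (approxBytes(doc) > MAX_DOC_BYTES) problems.push("document too large");
  if (nestingDepth(doc) > MAX_NESTING) problems.push("nested too deeply");
  return problems;
}

console.log(checkLimits({ author: "steve", tags: ["tech", "database"] }));
```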
33. MongoDB Sharding
• Shard data with no downtime
• Automatic balancing as data is written
• Range based or hash based sharding
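The difference between the two strategies can be sketched in a few lines of JavaScript. The shard names, the ranges and the hash function are all made up for illustration. In particular, the toy hash is not the one MongoDB uses, and the modulo step is a simplification, since MongoDB splits the hashed values into ranges as well.

```javascript
// Range-based: each shard owns a contiguous range of shard-key values.
// Hypothetical ranges over usernames, each covering [min, max).
const ranges = [
  { shard: "shardA", min: "a", max: "h" },
  { shard: "shardB", min: "h", max: "p" },
  { shard: "shardC", min: "p", max: "{" }, // "{" sorts just after "z"
];

function rangeShard(key) {
  const found = ranges.find((r) => key >= r.min && key < r.max);
  return found ? found.shard : null;
}

// Hash-based: hash the key, then map the hash onto a shard.
// Toy string hash for illustration only; not the hash MongoDB uses.
function toyHash(key) {
  let h = 0;
  for (const ch of key) h = (h * 31 + ch.codePointAt(0)) >>> 0;
  return h;
}

const shardNames = ["shardA", "shardB", "shardC"];
function hashShard(key) {
  return shardNames[toyHash(key) % shardNames.length];
}

for (const name of ["alan", "alice", "zoe"]) {
  console.log(name, rangeShard(name), hashShard(name));
}
```

Range sharding keeps neighbouring keys together, which suits range queries. Hashing gives up that locality in exchange for spreading keys more evenly.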
34. Accessing a sharded collection
• Inserts - must have the Shard Key
• Updates - must have the Shard Key
• Queries
  • With Shard Key - routed to nodes
  • Without Shard Key - scatter gather
• Indexed Queries
  • With Shard Key - routed in order
  • Without Shard Key - distributed sort merge
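The routing rules above can be sketched with a toy router. Everything here is hypothetical: three in-memory shards, a shard key of author, and a simple lookup table where a real router would consult chunk metadata. It shows a routed query, a scatter-gather query, and the merge of per-shard sorted results.

```javascript
// Hypothetical shards, each holding posts already sorted by date.
const shards = {
  shardA: [{ author: "alice", date: 1 }, { author: "alice", date: 4 }],
  shardB: [{ author: "bob", date: 2 }, { author: "bob", date: 5 }],
  shardC: [{ author: "zoe", date: 3 }],
};
const SHARD_KEY = "author";
// Hypothetical lookup table; a real router consults chunk metadata instead.
const ownerOf = { alice: "shardA", bob: "shardB", zoe: "shardC" };

// Decide which shards a query has to visit.
function targetShards(query) {
  if (SHARD_KEY in query) {
    const owner = ownerOf[query[SHARD_KEY]];
    return owner ? [owner] : [];      // routed to a single shard
  }
  return Object.keys(shards);         // scatter-gather across all shards
}

// Merge per-shard results that are each already sorted by `field`.
function sortMerge(lists, field) {
  const cursors = lists.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    lists.forEach((list, i) => {
      if (cursors[i] >= list.length) return; // this shard is exhausted
      if (best === -1 || list[cursors[i]][field] < lists[best][cursors[best]][field]) best = i;
    });
    if (best === -1) return merged;
    merged.push(lists[best][cursors[best]++]);
  }
}

function find(query, sortField) {
  const perShard = targetShards(query).map((name) =>
    shards[name].filter((doc) =>
      Object.entries(query).every(([k, v]) => doc[k] === v)));
  return sortMerge(perShard, sortField);
}

console.log(find({ author: "bob" }, "date")); // touches one shard
console.log(find({}, "date"));                // touches every shard, then merges
```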
36. MongoDB Replication
• MongoDB replication is like MySQL replication (kinda)
• Asynchronous master/slave
• Variations
  • Master / slave
  • Replica Sets
37. Replication features
• Reads from Primary are always consistent
• Reads from Secondaries are eventually
consistent
• Automatic failover if a Primary fails
• Automatic recovery when a node joins the set
• Control of where writes occur
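The first two bullets can be seen in a toy model of asynchronous replication. This is a sketch under simplifying assumptions: one primary that appends every write to an ordered log, and one secondary that replays that log at its own pace. The class and method names are hypothetical and do not correspond to any MongoDB API.

```javascript
// Toy model: the primary appends every write to an ordered log,
// and each secondary replays that log at its own pace.
class Primary {
  constructor() { this.data = {}; this.oplog = []; }
  write(key, value) {
    this.data[key] = value;
    this.oplog.push({ key, value });
  }
}

class Secondary {
  constructor(primary) { this.primary = primary; this.data = {}; this.applied = 0; }
  // Replay up to `maxOps` log entries that have not been applied yet.
  catchUp(maxOps = Infinity) {
    const log = this.primary.oplog;
    while (this.applied < log.length && maxOps-- > 0) {
      const { key, value } = log[this.applied++];
      this.data[key] = value;
    }
  }
}

const primary = new Primary();
const secondary = new Secondary(primary);
primary.write("views", 1);
primary.write("views", 2);

console.log(primary.data.views);   // 2: the primary sees its own writes
console.log(secondary.data.views); // undefined: nothing replicated yet
secondary.catchUp(1);
console.log(secondary.data.views); // 1: stale but moving forward
secondary.catchUp();
console.log(secondary.data.views); // 2: caught up
```

A read from the secondary between the two catchUp calls returns an older value, which is what "eventually consistent" means in practice.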
39. How MongoDB Replication works
Members: Member 1, Member 2 (PRIMARY), Member 3
Election establishes the PRIMARY
Data replication from PRIMARY to SECONDARY
40. How MongoDB Replication works
PRIMARY may fail
Automatic election of new PRIMARY if majority exists
Members: Member 1, Member 2 (DOWN), Member 3
Diagram label: negotiate new master
41. How MongoDB Replication works
Members: Member 1, Member 2 (DOWN), Member 3 (PRIMARY)
New PRIMARY elected
Replication Set re-established
42. How MongoDB Replication works
Members: Member 1, Member 3 (PRIMARY), Member 2 (RECOVERING)
Automatic recovery
43. How MongoDB Replication works
Members: Member 1, Member 3 (PRIMARY), Member 2
Replication Set re-established
44. Typical Deployments
Use? | Set size | Data Protection | High Availability | Notes
X | One | No | No | Must use --journal to protect against crashes
  | Two | Yes | No | On loss of one member, surviving member is read only
  | Three | Yes | Yes - 1 failure | On loss of one member, surviving two members can elect a new primary
X | Four | Yes | Yes - 1 failure* | * On loss of two members, surviving two members are read only
  | Five | Yes | Yes - 2 failures | On loss of two members, surviving three members can elect a new primary
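The High Availability column follows from simple majority arithmetic. The sketch below assumes every member has one vote and that electing a primary needs a strict majority of the configured set. The function names are hypothetical.

```javascript
// Votes needed to elect a primary: a strict majority of the configured members.
function majority(setSize) {
  return Math.floor(setSize / 2) + 1;
}

// How many members can be lost while the survivors can still elect a primary.
function toleratedFailures(setSize) {
  return setSize - majority(setSize);
}

for (const size of [1, 2, 3, 4, 5]) {
  console.log(size, majority(size), toleratedFailures(size));
}
```

Four members tolerate no more failures than three, and two no more than one, so the even sizes add cost without adding availability.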
45. Replica Set features
• A cluster of up to 12 servers
• Any (one) node can be primary
• Consensus election of primary
• Automatic failover
• Automatic recovery
• All writes to primary
• Reads can be to primary (default) or a
secondary
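The last two bullets can be sketched as a small routing function. The mode names below are four of MongoDB's read preference modes, but the function, the member list and the selection logic are a simplified, hypothetical illustration and not driver code.

```javascript
// Hypothetical snapshot of a three-member replica set.
const members = [
  { name: "member1", state: "SECONDARY", up: true },
  { name: "member2", state: "PRIMARY", up: true },
  { name: "member3", state: "SECONDARY", up: true },
];

// Writes have exactly one valid target: the current primary.
function pickWriteTarget(members) {
  return members.find((m) => m.up && m.state === "PRIMARY") || null;
}

// Reads default to the primary but may be sent to a secondary.
function pickReadTarget(members, mode = "primary") {
  const up = members.filter((m) => m.up);
  const primary = up.find((m) => m.state === "PRIMARY") || null;
  const secondary = up.find((m) => m.state === "SECONDARY") || null;
  switch (mode) {
    case "primary": return primary;
    case "secondary": return secondary;
    case "primaryPreferred": return primary || secondary;
    case "secondaryPreferred": return secondary || primary;
    default: throw new Error(`unsupported mode: ${mode}`);
  }
}

console.log(pickWriteTarget(members).name);             // member2
console.log(pickReadTarget(members).name);              // member2
console.log(pickReadTarget(members, "secondary").name); // member1
```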