MongoDB: An Introduction
Chris Westin, Software Engineer, 10gen
© Copyright 2010 10gen Inc.

Outline
  • The Whys of Non-Relational Databases
  • Vocabulary of the Non-Relational World
  • MongoDB

The Whys of Non-Relational Databases
  • Why did non-relational databases arise?
  • Problems with relational databases in the web world

Problem - Schema Evolution
  • Applications are evolving all the time
      • Applications need new fields
      • Applications need new indexes
      • Data is growing – sometimes very fast
  • Users need to be able to alter their schemas without making their data unavailable
      • The web world expects 24x7 service
  • RDBMSs can have a hard time doing this

Problem – Write Rates
  • Replication is a solution for high read loads
  • Sooner or later, writing becomes a bottleneck
  • Sharding – partitioning a logical database across multiple database instances
      • Joins and aggregation become a problem
      • Distributed transactions are too slow for the web
  • Manual management of shards
      • Choosing shard partitions
      • Rebalancing shards

Vocabulary of the Non-Relational World
  • An introduction to terminology you’re going to be seeing a lot

Data Models
  • A non-relational database’s data model determines the kinds of items it can contain and how they can be retrieved
  • What can the system store, and what does it know about what it contains?
      • The relational data model is about storing records made up of named, scalar-valued fields, as specified by a schema, or type definition
  • What kind of queries can you do?
      • SQL is a manifestation of the kinds of queries that fall out of relational algebra

Non-Relational Data Models
  • Key-value stores
  • Document stores
  • Column-oriented databases
  • Graph databases

Key-Value Stores
  • A mapping from a key to a value
  • The store doesn’t know anything about the key or value
  • The store doesn’t know anything about the insides of the value
  • Operations: set, get, or delete a key-value pair

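The three operations on this slide can be sketched in a few lines of JavaScript. This is a minimal in-memory illustration, not any real product's API: the class name `KeyValueStore` and the key `user:1` are hypothetical. The point it shows is that the value is opaque to the store, so the client has to serialize and parse it.

```javascript
// Hypothetical in-memory key-value store (illustration only).
// The store never inspects keys or values; it only maps one to the other.
class KeyValueStore {
  constructor() {
    this.data = new Map();
  }
  set(key, value) {
    this.data.set(key, value);
  }
  get(key) {
    return this.data.get(key); // undefined when the key is absent
  }
  delete(key) {
    return this.data.delete(key); // true if a pair was removed
  }
}

const store = new KeyValueStore();

// The client serializes the record; to the store it is just a string.
store.set('user:1', JSON.stringify({ LastName: 'Flintstone', FirstName: 'Fred' }));

// Only the client knows how to interpret the value it gets back.
const fetched = JSON.parse(store.get('user:1'));
console.log(fetched.LastName);
```

Because the store cannot look inside the value, a query such as "all users named Flintstone" is impossible without reading every pair, which is the gap document stores fill.
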
Document Stores
  • The store is a container for documents
  • Documents are made up of named fields
  • Fields may or may not have type definitions
      • e.g. XSDs for XML stores, vs. schema-less JSON stores
  • Can create “secondary indexes”
      • These provide the ability to query on any document field(s)
  • Operations:
      • Insert and delete documents
      • Update fields within documents

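A secondary index can be pictured as a map from a field's value to the identifiers of the documents holding that value. The sketch below assumes a tiny in-memory collection and a hypothetical helper `buildIndex`; real document stores keep such indexes in on-disk tree structures, so this only illustrates the idea.

```javascript
// Made-up sample documents; note the third has no FirstName restriction
// and documents need not share the same set of fields.
const documents = [
  { _id: 1, LastName: 'Flintstone', FirstName: 'Fred' },
  { _id: 2, LastName: 'Rubble', FirstName: 'Barney' },
  { _id: 3, LastName: 'Flintstone', FirstName: 'Wilma' },
];

// Hypothetical helper: map each value of `field` to the _ids that carry it.
function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    if (!(field in doc)) continue; // schema-less: the field may be missing
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc._id);
  }
  return index;
}

const byLastName = buildIndex(documents, 'LastName');

// A lookup by LastName now avoids scanning every document.
const flintstoneIds = byLastName.get('Flintstone');
console.log(flintstoneIds);
```
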
Column-Oriented Stores
  • Like a relational store, but flipped around: all data for a column is kept together
  • An index provides a means to get a column value for a record
  • Operations:
      • Get, insert, and delete records; update fields
      • Streaming column data in and out of Hadoop

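The "flipped around" layout can be shown by converting a few records into per-column arrays. The sample data and the helpers `toColumns` and `getValue` are hypothetical; the record's position in each array plays the role of the index mentioned above.

```javascript
// Made-up sample records in the familiar row-oriented form.
const rows = [
  { id: 1, name: 'Fred', age: 40 },
  { id: 2, name: 'Wilma', age: 38 },
  { id: 3, name: 'Barney', age: 39 },
];

// Hypothetical helper: keep all values of one column together in one array.
function toColumns(records) {
  const columns = {};
  records.forEach((record, position) => {
    for (const [field, value] of Object.entries(record)) {
      if (!(field in columns)) {
        columns[field] = new Array(records.length).fill(null);
      }
      columns[field][position] = value;
    }
  });
  return columns;
}

// The record position acts as the index into every column.
function getValue(columns, position, field) {
  return columns[field][position];
}

const columns = toColumns(rows);

// Aggregating one column touches only that column's data.
const totalAge = columns.age.reduce((sum, age) => sum + age, 0);
console.log(totalAge, getValue(columns, 1, 'name'));
```
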
Graph Databases
  • Stores vertex-to-vertex edges
  • Operations:
      • Getting and setting edges
      • Sometimes possible to annotate vertices or edges
  • Query languages support finding paths between vertices, subject to various constraints

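Path finding over stored edges can be sketched with a breadth-first search. The `Graph` class, its method names, and the vertex names are all hypothetical; real graph query languages add constraints on edge types and annotations that this sketch leaves out.

```javascript
// Hypothetical directed graph that stores only vertex-to-vertex edges.
class Graph {
  constructor() {
    this.adjacency = new Map();
  }
  addEdge(from, to) {
    if (!this.adjacency.has(from)) this.adjacency.set(from, new Set());
    this.adjacency.get(from).add(to);
  }
  // Breadth-first search: returns a shortest path as an array, or null.
  findPath(start, goal) {
    const previous = new Map([[start, null]]);
    const queue = [start];
    while (queue.length > 0) {
      const vertex = queue.shift();
      if (vertex === goal) {
        const path = [];
        for (let v = goal; v !== null; v = previous.get(v)) path.unshift(v);
        return path;
      }
      for (const next of this.adjacency.get(vertex) || []) {
        if (!previous.has(next)) {
          previous.set(next, vertex);
          queue.push(next);
        }
      }
    }
    return null;
  }
}

const graph = new Graph();
graph.addEdge('fred', 'wilma');
graph.addEdge('wilma', 'pebbles');
graph.addEdge('fred', 'barney');
graph.addEdge('barney', 'bambam');
graph.addEdge('bambam', 'pebbles');

const path = graph.findPath('fred', 'pebbles');
console.log(path);
```
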
Consistency Models
  • Relational databases support transactions
      • Can only see committed changes
      • Commit/abort span multiple changes
      • Read-only transaction flavors: read committed, repeatable read, etc.
  • Classic assumption: “I’m querying the one-and-only database”
  • Scaling reads and writes introduce different problems

Replication - The 1st Breakdown of Consistency
Limitations of a Single Master
  • Replication can provide arbitrary read scalability
      • Subject to coping with read-consistency issues
  • Sooner or later, writing becomes a bottleneck
      • Physical limitations (seek time)
      • Throughput of a single I/O subsystem

Sharding
  • Partition the primary key space via hashing
  • Set up a duplicate system for each shard
      • The write-rate limitation now applies to each shard
  • Joins or aggregation across shards are problematic
  • Can the data be re-sharded on a live system?
  • Can shards be re-balanced on a live system?

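Hash partitioning, and the reason re-sharding is hard, can be sketched as follows. The hash is an FNV-1a-style function over UTF-16 code units, chosen here only because it is short; the function names, the key format, and the shard counts are assumptions for illustration.

```javascript
// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
function fnv1a(text) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force an unsigned result
}

// Hypothetical router: the primary key's hash picks the shard.
function shardFor(key, shardCount) {
  return fnv1a(key) % shardCount;
}

// Made-up keys to show what happens when a fifth shard is added.
const keys = Array.from({ length: 1000 }, (_, i) => 'user:' + i);
const moved = keys.filter((key) => shardFor(key, 4) !== shardFor(key, 5)).length;

console.log(moved + ' of ' + keys.length + ' keys change shard');
```

With plain modulo placement, changing the shard count relocates a large share of the keys, which is why the slide asks whether re-sharding can happen on a live system.
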
Multi-Site Operation
  • Failure of a single-master system’s master
      • A new master can be chosen
      • But what if there’s a network partition?
      • Can the application continue in read-only mode?

Dynamo
  • Now a generic term for multi-master systems
  • Writes can occur to any node
  • The same record can be updated on different nodes by different clients
  • All writes are replicated everywhere

Dynamo – the 2nd Breakdown of Consistency
  • Collisions can occur
  • Who wins?
  • A collision resolution strategy is required
      • Vector clocks: http://en.wikipedia.org/wiki/Vector_clock
  • Application access must be aware of this

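How vector clocks detect a collision can be sketched with a comparison function. Each clock is assumed to be a plain object mapping a node name to a counter, with missing entries read as zero; `compareClocks` and the node names are hypothetical.

```javascript
// Compare two vector clocks (objects mapping node name -> counter).
// Returns 'before', 'after', 'equal', or 'concurrent'.
function compareClocks(a, b) {
  let aAhead = false;
  let bAhead = false;
  const nodes = new Set([...Object.keys(a), ...Object.keys(b)]);
  for (const node of nodes) {
    const countA = a[node] || 0; // a node absent from a clock counts as 0
    const countB = b[node] || 0;
    if (countA > countB) aAhead = true;
    if (countB > countA) bAhead = true;
  }
  if (aAhead && bAhead) return 'concurrent'; // neither descends from the other
  if (aAhead) return 'after';
  if (bAhead) return 'before';
  return 'equal';
}

// One version descends from the other: no conflict, the newer one wins.
const ordered = compareClocks({ n1: 2, n2: 1 }, { n1: 1, n2: 1 });

// Two clients updated the same record on different nodes: a collision.
const collision = compareClocks({ n1: 2, n2: 1 }, { n1: 1, n2: 2 });

console.log(ordered, collision);
```

A 'concurrent' result is the case the slide warns about: the clocks only detect the collision, and the application still has to decide how to merge or pick a winner.
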
The Commercial Landscape
Key Client Implementation Concerns
  • Monotonic reads
      • Can my reads go back in time?
  • Read-your-own-writes
      • If I issue a query immediately after an insert or update, will I see my changes?
  • Uninterrupted writes
      • Am I always guaranteed the ability to write?
  • Conflict resolution
      • Do I need to have a conflict resolution strategy?

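One way a client could enforce the first two concerns is to remember the newest version it has written or read and to refuse answers from replicas that lag behind it. This is only a sketch under assumptions: the `Session` class and the numeric `version` field on a replica are hypothetical, not part of any particular database's client.

```javascript
// Hypothetical client session that refuses to "go back in time".
class Session {
  constructor() {
    this.lastSeenVersion = 0;
  }
  // Call after a successful write so later reads must reflect it.
  recordWrite(version) {
    this.lastSeenVersion = Math.max(this.lastSeenVersion, version);
  }
  // `replica` is assumed to expose the version of the data it holds.
  read(replica) {
    if (replica.version < this.lastSeenVersion) {
      return { ok: false, reason: 'stale replica' };
    }
    this.lastSeenVersion = replica.version;
    return { ok: true, value: replica.value };
  }
}

const session = new Session();
session.recordWrite(5);

// A lagging replica would hide the client's own write, so it is rejected.
const staleRead = session.read({ version: 4, value: 'old' });

// An up-to-date replica satisfies read-your-own-writes.
const freshRead = session.read({ version: 5, value: 'new' });

console.log(staleRead.ok, freshRead.value);
```
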
Using a Single-Master System
  • What does the intermediate agent or system do for…
      • Monotonic reads?
      • Read-your-own-writes?
      • Uninterrupted writes?
      • Conflict resolution?

Using a Multi-Master System
  • What does the intermediate agent or system do for…
      • Monotonic reads?
      • Read-your-own-writes?
      • Uninterrupted writes?
      • Conflict resolution?

MongoDB
  • Where MongoDB fits in the non-relational world
  • MongoDB’s architecture and features
  • Some real-world users

MongoDB is a Document Store
  • MongoDB stores JSON objects as BSON
      { LastName: 'Flintstone', FirstName: 'Fred', … }
  • Secondary indexes
      db.collection.ensureIndex({LastName : 1, FirstName : 1});
  • Simple QBE-like query syntax
      db.collection.find({LastName : 'Flintstone'});
      db.collection.find({LastName : { $gte : 'Flintstone' }});

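The query-by-example idea can be illustrated without a server: a query is itself a document, and a stored document matches when every queried field satisfies its condition. The `matches` function below is a simplified stand-in written for this note, not MongoDB's implementation, and it handles only equality and `$gte`.

```javascript
// Simplified illustration of query-by-example matching.
// Supports exact equality and the $gte operator only.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const value = doc[field];
    if (condition !== null && typeof condition === 'object') {
      // Operator form, e.g. { $gte: 'Flintstone' }
      return Object.entries(condition).every(([operator, operand]) => {
        if (operator === '$gte') return value >= operand;
        throw new Error('unsupported operator: ' + operator);
      });
    }
    return value === condition; // plain value means equality
  });
}

// Made-up sample collection.
const people = [
  { LastName: 'Flintstone', FirstName: 'Fred' },
  { LastName: 'Rubble', FirstName: 'Barney' },
  { LastName: 'Bedrock', FirstName: 'Betty' },
];

const exact = people.filter((p) => matches(p, { LastName: 'Flintstone' }));
const atLeast = people.filter((p) => matches(p, { LastName: { $gte: 'Flintstone' } }));

console.log(exact.length, atLeast.map((p) => p.LastName));
```
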
MongoDB – Advanced Queries
  • Geo-spatial queries
      • Create a geo index
      • Find points near a given point, sorted by radial distance
          • Can be planar or spherical
      • Find points within a certain radial distance, within a bounding box, or a polygon
  • Built-in Map-Reduce
      • The caller provides map and reduce functions written in JavaScript

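The shape of the caller-supplied map and reduce functions can be shown with a small in-memory simulation. The driver function `mapReduce` below is hypothetical and runs in plain JavaScript; in the mongo shell `emit` is provided globally rather than passed as an argument, so treat this as a sketch of the programming model only.

```javascript
// Hypothetical in-memory driver for the map/reduce programming model.
// `map` is called with each document as `this` and receives `emit`;
// `reduce` folds all values emitted for one key into a single value.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map.call(doc, emit);

  const results = {};
  for (const [key, values] of groups) {
    results[key] = reduce(key, values);
  }
  return results;
}

// Made-up sample collection.
const family = [
  { LastName: 'Flintstone', FirstName: 'Fred' },
  { LastName: 'Flintstone', FirstName: 'Wilma' },
  { LastName: 'Rubble', FirstName: 'Barney' },
];

// Count documents per last name.
const counts = mapReduce(
  family,
  function (emit) { emit(this.LastName, 1); },
  (key, values) => values.reduce((sum, n) => sum + n, 0)
);

console.log(counts);
```
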
MongoDB is a Single-Master System
  • A database is served by members of a “replica set”
  • The system elects a primary (master)
  • Failure of the master is detected, and a new master is elected
  • Application writes get an error if there is no quorum to elect a new master
      • Reads continue to be fulfilled

MongoDB Replica Set
MongoDB Supports Sharding
  • A collection can be sharded
  • Each shard is served by its own replica set
  • New shards (each a replica set) can be added at any time
  • Shard key ranges are automatically balanced

MongoDB – Sharded Deployment
MongoDB Storage Management
  • Data is kept in memory-mapped files
      • Servers should have a lot of memory
      • Files are allocated as needed
  • Documents in a collection are kept on a list using a geographical addressing scheme
  • Indexes (B*-trees) point to documents using geographical addresses

MongoDB Server Management
  • Replica set members are aware of each other
  • A majority of votes is required to elect a new primary
  • Members can be assigned priorities to affect the election
      • e.g., an “invisible” replica can be created with zero priority for backup purposes

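The two rules on this slide, a majority requirement and priorities, can be sketched as a toy election. The function `electPrimary` and the member fields `name`, `priority`, and `reachable` are hypothetical, and the sketch assumes one vote per member; the real election protocol weighs more than this, so read it as an illustration of the rules only.

```javascript
// Toy election: needs a strict majority of members reachable, then picks
// the reachable member with the highest non-zero priority.
function electPrimary(members) {
  const reachable = members.filter((m) => m.reachable);
  if (reachable.length * 2 <= members.length) return null; // no quorum
  const candidates = reachable.filter((m) => m.priority > 0);
  if (candidates.length === 0) return null; // nobody is eligible
  return candidates.reduce((best, m) => (m.priority > best.priority ? m : best)).name;
}

// Three-member set whose old primary 'a' has failed.
// 'c' has priority 0: it still votes but can never become primary.
const afterFailure = electPrimary([
  { name: 'a', priority: 2, reachable: false },
  { name: 'b', priority: 1, reachable: true },
  { name: 'c', priority: 0, reachable: true },
]);

// Only one of three members is left: no majority, so no primary.
const noQuorum = electPrimary([
  { name: 'a', priority: 2, reachable: false },
  { name: 'b', priority: 1, reachable: true },
  { name: 'c', priority: 0, reachable: false },
]);

console.log(afterFailure, noQuorum);
```

The second case corresponds to writes failing for lack of a quorum while the surviving member can still serve reads.
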
MongoDB Access
  • Drivers are available in many languages
  • 10gen supported: C, C# (.Net), C++, Erlang, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, Scala
  • Community supported: Clojure, ColdFusion, F#, Go, Groovy, Lua, R
  • http://www.mongodb.org/display/DOCS/Overview+-+Writing+Drivers+and+Tools

MongoDB Availability
  • Source: https://github.com/mongodb/mongo
  • Server
      • License: AGPL
      • http://www.mongodb.org/downloads
  • Drivers
      • License: Apache
      • http://www.mongodb.org/display/DOCS/Drivers

MongoDB – Hosted Services
  • http://www.mongodb.org/display/DOCS/Hosting+Center
  • MongoHQ, Mongo Machine, MongoLab
  • RESTful access to collections

MongoDB Support
  • Paid support
      • http://www.10gen.com/client-portal
      • 10gen Hosted Monitoring
      • Consulting, training
  • Free support
      • http://groups.google.com/group/mongodb-user
      • http://stackoverflow.com/questions/tagged/mongodb

MongoDB Users
  • http://www.10gen.com/customers
  • http://www.10gen.com/presentations
      • craigslist: http://www.10gen.com/presentation/mongosf2011/craigslist
      • bit.ly: http://blip.tv/mongodb/bit-ly-user-history-auto-sharded-3723147
      • shutterfly: http://www.10gen.com/presentation/mongosv2010/shutterfly

Mini-demo/tutorial
  • http://try.mongodb.org/

MongoDB: An Introduction - june-2011

  • 1. MongoDB: An Introduction Chris Westin Software Engineer, 10gen © Copyright 2010 10gen Inc.
  • 2. Outline The Whys of Non-Relational Databases Vocabulary of the Non-Relational World MongoDB
  • 3. Why did non-relational databases arise? Problems with relational databases in the web world The Whys of Non-Relational Databases
  • 4. Problem - Schema Evolution Applications are evolving all the time Applications need new fields Applications need new indexes Data is growing – sometimes very fast Users need to be able to alter their schemas without making their data unavailable The web world expects 24x7 service RDBMSs can have a hard time doing this
  • 5. Problem – Write Rates Replication is a solution for high read loads Sooner or later, writing becomes a bottleneck Sharding – partitioning a logical database across multiple database instances Joins and aggregation become a problem Distributed transactions are too slow for the web Manual management of shards Choosing shard partitions Rebalancing shards
  • 6. An introduction to terminology you’re going to be seeing a lot Vocabulary of the Non-Relational World
  • 7. Data Models A non-relational database’s data model determines the kinds of items it can contain and how they can be retrieved What can the system store, and what does it know about what it contains? The relational data model is about storing records made up of named, scalar-valued fields, as specified by a schema, or type definition What kind of queries can you do? SQL is a manifestation of the kinds of queries that fall out of relational algebra
  • 8. Non-Relational Data Models Key-value stores Document stores Column-oriented databases Graph databases
  • 9. Key-Value Stores A mapping from a key to a value The store doesn’t know anything about the key or value The store doesn’t know anything about the insides of the value Operations Set, get, or delete a key-value pair
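A minimal in-memory sketch of the key-value model on this slide, assuming plain JavaScript. The `KeyValueStore` class is hypothetical and is not any real product's API. Because the store treats the value as opaque, the only operations it can offer are set, get, and delete by key:

```javascript
// Hypothetical in-memory key-value store; values are opaque to the store.
class KeyValueStore {
  constructor() {
    this.data = new Map();
  }
  // Store an opaque value (here a serialized string) under a key.
  set(key, value) {
    this.data.set(key, value);
  }
  // Return the value, or undefined if the key is absent.
  get(key) {
    return this.data.get(key);
  }
  // Remove the pair; returns true if the key existed.
  delete(key) {
    return this.data.delete(key);
  }
}

const kv = new KeyValueStore();
// The application serializes the value; the store never looks inside it.
kv.set("user:1", JSON.stringify({ LastName: "Flintstone" }));
const user = JSON.parse(kv.get("user:1"));
```

Any query on the contents of the value, such as "all users named Flintstone", has to be done by the application, since the store cannot see inside the serialized string.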
  • 10. Document Stores The store is a container for documents Documents are made up of named fields Fields may or may not have type definitions e.g. XSDs for XML stores, vs. schema-less JSON stores Can create “secondary indexes” These provide the ability to query on any document field(s) Operations: Insert and delete documents Update fields within documents
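The secondary index idea above can be sketched in plain JavaScript. The `Collection` class and its method names are hypothetical, chosen only for illustration, and the index here supports equality lookups on a single field only:

```javascript
// Hypothetical in-memory document collection with secondary indexes.
class Collection {
  constructor() {
    this.docs = [];
    this.indexes = new Map(); // field name -> Map(value -> array of docs)
  }
  // Build an index over a field so lookups avoid a full scan.
  createIndex(field) {
    const index = new Map();
    for (const doc of this.docs) this.addToIndex(index, field, doc);
    this.indexes.set(field, index);
  }
  addToIndex(index, field, doc) {
    if (!(field in doc)) return; // schema-less: the field may be missing
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  // Insert a document and keep every existing index up to date.
  insert(doc) {
    this.docs.push(doc);
    for (const [field, index] of this.indexes) this.addToIndex(index, field, doc);
  }
  // Equality lookup: use the index if one exists, else scan every document.
  findBy(field, value) {
    const index = this.indexes.get(field);
    if (index) return index.get(value) || [];
    return this.docs.filter((d) => d[field] === value);
  }
}

const people = new Collection();
people.insert({ LastName: "Flintstone", FirstName: "Fred" });
people.insert({ LastName: "Rubble", FirstName: "Barney", Pet: "Hoppy" });
people.createIndex("LastName");
people.insert({ LastName: "Flintstone", FirstName: "Wilma" });
const flintstones = people.findBy("LastName", "Flintstone");
```

Unlike the key-value case, the store knows the field names inside each document, which is what makes indexing on `LastName` possible.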
  • 11. Column-Oriented Stores Like a relational store, but flipped around: all data for a column is kept together An index provides a means to get a column value for a record Operations: Get, insert, delete records; updating fields Streaming column data in and out of Hadoop
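As a rough illustration of "flipped around", the sketch below converts a few records from a row layout to a column layout in plain JavaScript. The sample data and the `getRecord` helper are made up for this example:

```javascript
// Row-oriented layout: each record keeps all of its fields together.
const rows = [
  { id: 1, city: "Bedrock", age: 35 },
  { id: 2, city: "Bedrock", age: 33 },
  { id: 3, city: "Granitetown", age: 40 },
];

// Column-oriented layout: all values for one column are kept together.
// Position i in every array belongs to the same record.
const columns = { id: [], city: [], age: [] };
for (const row of rows) {
  for (const name of Object.keys(columns)) columns[name].push(row[name]);
}

// Scanning one column touches only that column's array.
const totalAge = columns.age.reduce((sum, age) => sum + age, 0);

// Rebuilding a full record means reading position i from every column.
function getRecord(i) {
  const record = {};
  for (const name of Object.keys(columns)) record[name] = columns[name][i];
  return record;
}
```

The trade-off is visible in the two helpers: aggregating a single column is cheap, while fetching a whole record has to visit every column.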
  • 12. Graph Databases Stores vertex-to-vertex edges Operations: Getting and setting edges Sometimes possible to annotate vertices or edges Query languages support finding paths between vertices, subject to various constraints
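A small sketch of the edge operations and path queries mentioned above, assuming a directed graph held in memory. The `Graph` class is hypothetical, and breadth-first search stands in for whatever path-finding a real graph query language provides:

```javascript
// Hypothetical in-memory graph store: directed edges in an adjacency map.
class Graph {
  constructor() {
    this.edges = new Map(); // vertex -> Set of neighbour vertices
  }
  setEdge(from, to) {
    if (!this.edges.has(from)) this.edges.set(from, new Set());
    this.edges.get(from).add(to);
  }
  getEdges(from) {
    return [...(this.edges.get(from) || [])];
  }
  // Breadth-first search: returns a shortest path as an array, or null.
  findPath(start, goal) {
    const previous = new Map([[start, null]]);
    const queue = [start];
    while (queue.length > 0) {
      const vertex = queue.shift();
      if (vertex === goal) {
        // Walk the predecessor links back to the start.
        const path = [];
        for (let v = goal; v !== null; v = previous.get(v)) path.unshift(v);
        return path;
      }
      for (const next of this.getEdges(vertex)) {
        if (!previous.has(next)) {
          previous.set(next, vertex);
          queue.push(next);
        }
      }
    }
    return null;
  }
}

const g = new Graph();
g.setEdge("a", "b");
g.setEdge("b", "c");
g.setEdge("a", "d");
const route = g.findPath("a", "c");
```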
  • 13. Consistency Models Relational databases support transactions Can only see committed changes Commit/abort span multiple changes Read-only transaction flavors Read committed, repeatable read, etc Classic assumption: “I’m querying the one-and-only database” Scaling reads and writes introduce different problems
  • 14. Replication - The 1st Breakdown of Consistency
  • 15. Limitations of a Single Master Replication can provide arbitrary read scalability Subject to coping with read-consistency issues Sooner or later, writing becomes a bottleneck Physical limitations (seek time) Throughput of a single I/O subsystem
  • 16. Sharding Partition the primary key space via hashing Set up a duplicate system for each shard The write-rate limitation now applies to each shard Joins or aggregation across shards are problematic Can the data be re-sharded on a live system? Can shards be re-balanced on a live system?
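Hash-based partitioning of the primary key space can be sketched as follows. The hash function (an FNV-1a style 32-bit hash), the `shardFor`, `put`, and `get` helpers, and the choice of three shards are all assumptions made for illustration:

```javascript
// Simple deterministic string hash (FNV-1a style, 32-bit); illustration only.
function hashKey(key) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the result unsigned
  }
  return hash;
}

// Route a primary key to one of `shardCount` shards.
function shardFor(key, shardCount) {
  return hashKey(key) % shardCount;
}

// Each shard is an independent store; writes for a key always hit one shard.
const shards = [new Map(), new Map(), new Map()];
function put(key, value) {
  shards[shardFor(key, shards.length)].set(key, value);
}
function get(key) {
  return shards[shardFor(key, shards.length)].get(key);
}

put("user:1", { LastName: "Flintstone" });
put("user:2", { LastName: "Rubble" });

// An aggregate (here, a count) has to visit every shard.
const total = shards.reduce((n, shard) => n + shard.size, 0);
```

Changing `shardCount` changes the result of `shardFor` for most keys, which is one reason the slide asks whether data can be re-sharded on a live system.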
  • 17. Multi-Site Operation Failure of a single-master system’s master A new master can be chosen But what if there’s a network partition? Can the application continue in read-only mode?
  • 18. Dynamo Now a generic term for multi-master systems Writes can occur to any node The same record can be updated on different nodes by different clients All writes are replicated everywhere
  • 19. Dynamo – the 2nd breakdown of consistency Collisions can occur Who wins? A collision resolution strategy is required Vector clocks http://en.wikipedia.org/wiki/Vector_clock Application access must be aware of this
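A minimal sketch of how vector clocks detect a collision, written in plain JavaScript. The function names and the object-based clock representation are assumptions made for this example, not the API of any particular system:

```javascript
// A vector clock maps node id -> count of updates seen from that node.
// Record a local update on `nodeId`; returns a new clock.
function increment(clock, nodeId) {
  return { ...clock, [nodeId]: (clock[nodeId] || 0) + 1 };
}

// Compare two clocks: "before", "after", "equal", or "concurrent".
function compare(a, b) {
  let aAhead = false;
  let bAhead = false;
  for (const node of new Set([...Object.keys(a), ...Object.keys(b)])) {
    const x = a[node] || 0;
    const y = b[node] || 0;
    if (x > y) aAhead = true;
    if (y > x) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent"; // a collision: needs resolution
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

// After a conflict is resolved, the merged clock is the per-node maximum.
function merge(a, b) {
  const merged = { ...a };
  for (const [node, count] of Object.entries(b)) {
    merged[node] = Math.max(merged[node] || 0, count);
  }
  return merged;
}

const base = increment({}, "n1");          // { n1: 1 }
const onNode1 = increment(base, "n1");     // { n1: 2 }
const onNode2 = increment(base, "n2");     // { n1: 1, n2: 1 }
const verdict = compare(onNode1, onNode2); // "concurrent"
```

The clocks only detect that two writes collided. Deciding which value wins is still up to the resolution strategy, which is why application access must be aware of it.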
  • 21. Key Client Implementation Concerns Monotonic reads Can my reads go back in time? Read-your-own-writes If I issue a query immediately after an insert or update, will I see my changes? Uninterrupted writes Am I always guaranteed the ability to write? Conflict Resolution Do I need to have a conflict resolution strategy?
  • 22. Using a Single-Master System What does the intermediate agent or system do for… Monotonic reads? Read-your-own-writes? Uninterrupted writes? Conflict Resolution?
  • 23. Using a Multi-Master System What does the intermediate agent or system do for… Monotonic reads? Read-your-own-writes? Uninterrupted writes? Conflict Resolution?
  • 24. Where MongoDB fits in the non-relational world MongoDB’s architecture and features Some real-world users MongoDB
  • 25. MongoDB is a Document Store MongoDB stores JSON objects as BSON { LastName: 'Flintstone', FirstName: 'Fred', …} Secondary Indexes db.collection.ensureIndex({LastName : 1, FirstName : 1}); Simple QBE-like query syntax db.collection.find({LastName : 'Flintstone'}); db.collection.find({LastName : { $gte : 'Flintstone' }});
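To show what "query by example" means without a running server, here is a hypothetical in-memory matcher in plain JavaScript. It is not MongoDB's implementation; the `matches` and `find` helpers are made up for this sketch and handle only plain equality and the `$gte` operator from the slide:

```javascript
// Hypothetical matcher: does `doc` satisfy the example document `query`?
// Supports only plain equality and the $gte operator.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const value = doc[field];
    if (condition !== null && typeof condition === "object") {
      // An object condition holds operators such as { $gte: ... }.
      return Object.entries(condition).every(([op, operand]) => {
        if (op === "$gte") return value !== undefined && value >= operand;
        throw new Error("unsupported operator: " + op);
      });
    }
    return value === condition;
  });
}

function find(docs, query) {
  return docs.filter((doc) => matches(doc, query));
}

const docs = [
  { LastName: "Flintstone", FirstName: "Fred" },
  { LastName: "Rubble", FirstName: "Barney" },
  { LastName: "Bedrock", FirstName: "Pebbles" },
];
const exact = find(docs, { LastName: "Flintstone" });
const atLeast = find(docs, { LastName: { $gte: "Flintstone" } });
```

The query is itself a document with the same shape as the data, which is what gives the syntax its QBE-like feel.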
  • 26. MongoDB – Advanced Queries Geo-spatial queries Create a geo index Find points near a given point, sorted by radial distance Can be planar or spherical Find points within a certain radial distance, within a bounding box, or a polygon Built-in Map-Reduce The caller provides map and reduce functions written in JavaScript
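The map-reduce contract, where the caller supplies both functions, can be sketched with a hypothetical in-memory driver. This is not MongoDB's implementation: in this sketch `emit` is passed to `map` as a parameter to keep the example self-contained, and the `orders` data is invented:

```javascript
// Hypothetical in-memory map-reduce driver.
function mapReduce(docs, map, reduce) {
  const groups = new Map(); // key -> array of emitted values
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Map phase: each document may emit any number of (key, value) pairs.
  for (const doc of docs) map(doc, emit);
  // Reduce phase: combine all values emitted under the same key.
  const results = {};
  for (const [key, values] of groups) {
    // With a single value there is nothing to combine.
    results[key] = values.length === 1 ? values[0] : reduce(key, values);
  }
  return results;
}

const orders = [
  { customer: "fred", total: 10 },
  { customer: "barney", total: 5 },
  { customer: "fred", total: 7 },
];
const totals = mapReduce(
  orders,
  (doc, emit) => emit(doc.customer, doc.total),
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);
```

Because `reduce` may be skipped for keys with a single value, its output should have the same shape as the values that `map` emits.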
  • 27. MongoDB is a Single-Master System A database is served by members of a “replica set” The system elects a primary (master) Failure of the master is detected, and a new master is elected Application writes get an error if there is no quorum to elect a new master Reads continue to be fulfilled
  • 29. MongoDB Supports Sharding A collection can be sharded Each shard is served by its own replica set New shards (each a replica set) can be added at any time Shard key ranges are automatically balanced
  • 30. MongoDB – Sharded Deployment
  • 31. MongoDB Storage Management Data is kept in memory-mapped files Servers should have a lot of memory Files are allocated as needed Documents in a collection are kept on a list using a geographical addressing scheme Indexes (B*-trees) point to documents using geographical addresses
  • 32. MongoDB Server Management Replica set members are aware of each other A majority of votes is required to elect a new primary Members can be assigned priorities to affect the election e.g., an “invisible” replica can be created with zero priority for backup purposes
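The majority and priority rules above can be modelled with a short sketch. This is a simplified, hypothetical model and not MongoDB's actual election protocol: it assumes every reachable member votes for the highest-priority reachable candidate, and the member names are invented:

```javascript
// A candidate needs votes from a strict majority of ALL members,
// whether or not they are currently reachable.
function hasMajority(votes, memberCount) {
  return votes > memberCount / 2;
}

// Pick a primary among reachable members: the highest priority wins.
// Priority 0 members still vote but can never become primary.
function electPrimary(members) {
  const reachable = members.filter((m) => m.reachable);
  if (!hasMajority(reachable.length, members.length)) return null; // no quorum
  const candidates = reachable.filter((m) => m.priority > 0);
  if (candidates.length === 0) return null;
  return candidates.reduce((best, m) => (m.priority > best.priority ? m : best)).name;
}

// The old primary "a" has failed; "backup" is a zero-priority replica.
const members = [
  { name: "a", priority: 2, reachable: false },
  { name: "b", priority: 1, reachable: true },
  { name: "backup", priority: 0, reachable: true },
];
const primary = electPrimary(members);
```

If "b" also becomes unreachable, only one of three members remains, `electPrimary` returns `null`, and the set has no primary. That is the case in which application writes get an error.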
  • 33. MongoDB Access Drivers are available in many languages 10gen supported C, C# (.Net), C++, Erlang, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, Scala Community supported Clojure, ColdFusion, F#, Go, Groovy, Lua, R http://www.mongodb.org/display/DOCS/Overview+-+Writing+Drivers+and+Tools
  • 34. MongoDB Availability Source https://github.com/mongodb/mongo Server License: AGPL http://www.mongodb.org/downloads Drivers License: Apache http://www.mongodb.org/display/DOCS/Drivers
  • 35. MongoDB – Hosted Services http://www.mongodb.org/display/DOCS/Hosting+Center MongoHQ, Mongo Machine, MongoLab RESTful access to collections
  • 36. MongoDB Support Paid Support http://www.10gen.com/client-portal 10gen Hosted Monitoring Consulting, training Free Support http://groups.google.com/group/mongodb-user http://stackoverflow.com/questions/tagged/mongodb
  • 37. MongoDB Users http://www.10gen.com/customers http://www.10gen.com/presentations craigslist: http://www.10gen.com/presentation/mongosf2011/craigslist bit.ly: http://blip.tv/mongodb/bit-ly-user-history-auto-sharded-3723147 shutterfly: http://www.10gen.com/presentation/mongosv2010/shutterfly