This document provides an overview of MongoDB, including what NoSQL databases are, MongoDB features like querying, indexing, replication, load balancing and aggregation. It discusses how MongoDB stores data in documents and collections, can be used for file storage, and is used by many large companies. The document also covers installing and running MongoDB on a local system.
2. What is NoSQL?
NoSQL features
NoSQL database types
What is MongoDB?
Who uses MongoDB?
Installation and Running
Documents and Collections
MongoDB features
Querying
Indexing
Replication
Load balancing
File storage
Aggregation
3. What is NoSQL?
Definition: “Next generation databases mostly
addressing some of the points: being non-relational,
distributed, open source and horizontally
scalable… schema-free, easy replication support,
simple API, eventually consistent, huge amount
of data…”
- nosql-database.org
4. What is NoSQL?
NoSQL database types
• Document databases
• Graph stores
• Key-value stores
• Wide-column stores
- nosql-database.org
5. What is MongoDB?
Definition: “MongoDB (from "humongous") is an
open-source document database that provides high
performance, high availability, and automatic
scaling.”
- mongodb.org
Key features
High performance
High availability
Automatic scaling
7. Installation
Download MongoDB from mongodb.org
Extract to local disk C:
Rename the extracted folder to “mongodb”
Running MongoDB
Create a folder to store the data files, C:\data\db
To start MongoDB from the Command Prompt:
C:\mongodb\bin\mongod.exe
Open another Command Prompt and execute:
C:\mongodb\bin\mongo.exe
8. Documents and Collections
A document is the basic unit of data. Documents
are stored on disk in BSON (binary JSON)
serialization format.
{
  name: "klevis",                 // field: value
  age: 21,                        // field: value
  status: "A",                    // field: value
  groups: [ "news", "sports" ]    // field: value
}
9. Documents and Collections
A collection is a group of documents (equivalent to
a table in an RDBMS). A collection exists within a
single database.
12. Querying
The find() method returns a cursor to the results.
To display all the results:
var c = db.testData.find()
while ( c.hasNext() ) printjson( c.next() )
To limit the number of results:
db.testData.find().limit(3)
To print a specific result by its index:
printjson( c[1] )
To search for specific values of a field:
db.testData.find({x:3})
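To make the behaviour of find() and limit() concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB is implemented: the array testData, and the helper functions matches, find and limit, are hypothetical stand-ins that only imitate equality queries such as {x:3} and lazy, cursor-like iteration.

```javascript
// Hypothetical in-memory stand-in for the testData collection.
const testData = [
  { _id: 1, x: 1 },
  { _id: 2, x: 3 },
  { _id: 3, x: 5 },
  { _id: 4, x: 3 },
];

// True when every field in the query document equals the same field in doc.
function matches(doc, query) {
  return Object.keys(query).every((field) => doc[field] === query[field]);
}

// Simplified find(): yields matching documents one at a time, like a cursor.
function* find(collection, query = {}) {
  for (const doc of collection) {
    if (matches(doc, query)) yield doc;
  }
}

// Simplified limit() for positive n: stops after n results.
function* limit(cursor, n) {
  let count = 0;
  for (const doc of cursor) {
    if (count >= n) return;
    yield doc;
    count += 1;
  }
}

const threes = [...find(testData, { x: 3 })];      // like find({x:3})
const firstThree = [...limit(find(testData), 3)];  // like find().limit(3)
console.log(threes);
console.log(firstThree);
```

Because both helpers are generators, no document is examined until the caller asks for the next result, which loosely mirrors how a cursor is consumed with hasNext() and next().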
14. Indexing
Indexes provide high-performance read operations
for frequently used queries. Indexes are special data
structures that store a small portion of the collection’s
data set in an easy-to-traverse form.
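The idea of "a small portion of the data in an easy-to-traverse form" can be sketched as follows. This is only an illustration under simplifying assumptions: MongoDB's real indexes are B-tree structures, whereas this sketch swaps in a sorted array with binary search. The docs array and the functions buildIndex, lowerBound and findByIndex are hypothetical names.

```javascript
// Hypothetical collection; array positions stand in for on-disk locations.
const docs = [
  { name: "a", x: 7 },
  { name: "b", x: 3 },
  { name: "c", x: 9 },
  { name: "d", x: 3 },
  { name: "e", x: 1 },
];

// Build an index on one numeric field: sorted [value, position] pairs.
function buildIndex(collection, field) {
  return collection
    .map((doc, position) => [doc[field], position])
    .sort((a, b) => a[0] - b[0] || a[1] - b[1]);
}

// Binary search for the first index entry whose value is >= target.
function lowerBound(index, target) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid][0] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Fetch all documents whose indexed field equals target,
// touching only the matching index entries instead of scanning every document.
function findByIndex(collection, index, target) {
  const result = [];
  for (let i = lowerBound(index, target);
       i < index.length && index[i][0] === target;
       i += 1) {
    result.push(collection[index[i][1]]);
  }
  return result;
}

const xIndex = buildIndex(docs, "x");
const hits = findByIndex(docs, xIndex, 3);
console.log(hits.map((d) => d.name));
```

The index holds only the value of x and a position for each document, so a lookup takes a logarithmic number of comparisons rather than a full scan of the collection.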
15. Replication
MongoDB provides high availability and increased
throughput with replica sets. A replica set consists of
two or more copies of the data. Each member of the
set acts as either the primary or a secondary replica
at any given time.
16. Load balancing
MongoDB scales horizontally using sharding. The
user chooses a shard key, which determines how the
data in a collection will be distributed. The data is
split into ranges and distributed across multiple
shards.
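Range-based distribution by shard key can be sketched in a few lines. This is a toy model, not MongoDB's actual balancer: the ranges table, the shard names, the userId key and the functions shardFor and distribute are all assumptions made for illustration.

```javascript
// Hypothetical ranges over a numeric shard key; each range covers [min, max).
const ranges = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardC" },
];

// Pick the shard whose range contains the document's shard key value.
function shardFor(doc, shardKey) {
  const value = doc[shardKey];
  const range = ranges.find((r) => value >= r.min && value < r.max);
  if (!range) throw new Error("no range covers " + shardKey + "=" + value);
  return range.shard;
}

// Group documents by the shard they would be routed to.
function distribute(docs, shardKey) {
  const placement = {};
  for (const doc of docs) {
    const shard = shardFor(doc, shardKey);
    if (!placement[shard]) placement[shard] = [];
    placement[shard].push(doc);
  }
  return placement;
}

const users = [{ userId: 5 }, { userId: 150 }, { userId: 100 }, { userId: 999 }];
const placement = distribute(users, "userId");
console.log(placement);
```

The choice of shard key matters because it alone decides which range, and therefore which shard, receives each document.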
MongoDB can run over multiple servers, balancing
the load and/or duplicating data to keep the system
up and running in case of hardware failure. Automatic
configuration is easy to deploy, and new machines
can be added to a running database.
17. File storage
MongoDB can be used as a file system, taking
advantage of its load balancing and data replication
features to store files across multiple machines.
This function, called GridFS, is included with the MongoDB drivers
and is readily available to development
languages. MongoDB exposes functions for manipulating files
and their content to developers. In a multi-machine MongoDB system, files can be distributed
and copied multiple times between machines
transparently, thus effectively creating a load-balanced and fault-tolerant system.
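GridFS stores a file as a sequence of chunk documents rather than as one large document, which is what allows a file to be spread across machines. The sketch below imitates that idea in plain Node.js without using any MongoDB driver. The tiny CHUNK_SIZE, the file id "file-1" and the functions toChunks and fromChunks are assumptions for illustration only; real drivers use a much larger chunk size and their own APIs.

```javascript
// Assumed chunk size, kept tiny so the example is easy to follow.
const CHUNK_SIZE = 4;

// Split a file's bytes into chunk documents, numbered by their order n.
function toChunks(fileId, bytes, chunkSize = CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < bytes.length; offset += chunkSize, n += 1) {
    chunks.push({
      files_id: fileId,
      n: n,
      data: bytes.subarray(offset, offset + chunkSize),
    });
  }
  return chunks;
}

// Reassemble the file by sorting chunks on n and concatenating their data.
function fromChunks(chunks) {
  const ordered = [...chunks].sort((a, b) => a.n - b.n);
  return Buffer.concat(ordered.map((c) => c.data));
}

const original = Buffer.from("hello gridfs");
const chunks = toChunks("file-1", original);
const restored = fromChunks(chunks.slice().reverse()); // order of arrival does not matter
console.log(chunks.length, restored.toString());
```

Because each chunk is an ordinary document, it can be replicated and distributed like any other data in the database.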
18. Aggregation
MapReduce can be used for batch processing of
data and aggregation operations. The aggregation
framework enables users to obtain the kind of results
for which the SQL GROUP BY clause is used.
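What a GROUP BY style aggregation computes can be shown with a small in-memory sketch. It does not use the aggregation framework itself; the people array (including the names other than "klevis") and the function groupSum are hypothetical, and the result is shaped to resemble a grouped output with an _id per group.

```javascript
// Hypothetical documents, modelled on the example document shown earlier.
const people = [
  { name: "klevis", age: 21, status: "A" },
  { name: "ana", age: 30, status: "B" },
  { name: "ben", age: 25, status: "A" },
];

// Group documents by one field and sum another, comparable to
// SELECT status, SUM(age) FROM people GROUP BY status.
function groupSum(docs, groupField, sumField) {
  const totals = new Map();
  for (const doc of docs) {
    const key = doc[groupField];
    totals.set(key, (totals.get(key) || 0) + doc[sumField]);
  }
  // One output document per group, keyed by _id.
  return [...totals].map(([key, total]) => ({ _id: key, total: total }));
}

const byStatus = groupSum(people, "status", "age");
console.log(byStatus);
```

In the mongo shell, and assuming a collection named people, a comparable result would come from db.people.aggregate([{ $group: { _id: "$status", total: { $sum: "$age" } } }]).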