2. MongoDB on Windows Azure
MongoDB on Windows Azure brings the provide customers the tools to build limit-
power of the leading NoSQL database lessly scalable applications in the cloud.
to Microsoft’s flexible, open, and
scalable cloud. This paper begins with an overview of
MongoDB. Next, we describe the two
MongoDB is an open source, document- primary deployment options available on
oriented database designed with scalability Microsoft’s cloud platform, Windows Azure
and developer agility in mind. Windows Virtual Machines and Windows Azure Cloud
Azure is the cloud services operating sys- Services. Finally, to help those evaluating
tem that provides the development, service deploying MongoDB on Windows Azure,
hosting, and service management envi- we outline the pros and cons of the two
ronment for the Azure Services Platform. deployment options available.
Together, MongoDB and Windows Azure
2
3. Figure 1: Sample JSON Document
{
"_id": ObjectId("504e4dd43796b3da50183991"),
"text": "Study Implicates Immune System in Parkinson’s Disease Pathogenesis http://bit.ly/duhe4P",
"source": "<a href="http://twitterfeed.com" rel="nofollow">twitterfeed</a>,
"coordinates": null,
"truncated": false,
"entities": {
"urls": [{
"indices": [
67,
87],
"url": "http://bit.ly/duhe4P",
"expanded_url": null
}],
"hashtags": []
},
"retweeted": false,
"place": null,
"user": {
"friends_count": 780,
"created_at": "Fri Jan 08 17:40:11 +0000 2010",
"description": "Latest medical news, articles, and features from Medscape Pathology.",
"time_zone": "Eastern Time (US & Canada)",
"url": "http://www.medscape.com/pathology",
"screen_name": "MedscapePath",
"utc_offset": -18000
},
"favorited": false,
"in_reply_to_user_id": null,
"id": NumberLong("22819397000")
}
About MongoDB
MongoDB is an open source, document- across documents and to adapt schemas as
oriented database. MongoDB bridges the their applications evolve.
gap between key-value stores – which are
fast and scalable – and relational databas- Unlike relational databases, MongoDB does
es – which have rich functionality. Instead not use SQL syntax. Rather, MongoDB has
of storing data in tables and rows as one a query language based on JSON. It also
would with a relational database, MongoDB has drivers for most modern languages,
stores a binary form of JSON (BSON or such as C#, Java, Ruby, Python, and many
‘binary JSON’ documents). An example of a others. MongoDB’s flexible data model and
document is shown in Figure 1. support for modern programming languages
simplify development and administration
The document serves as the fundamental significantly.
unit within MongoDB (like a row in an
RDBMS); one can add fields (like a column
in an RDBMS), as well as nested fields and
embedded documents. Rather than impos-
ing a flat, rigid schema across an entire
table, the schema is implicit in what fields
are used in the documents. Thus, MongoDB
allows developers to have variable schemas
3
4. Figure 2: Replica Sets with MongoDB
MongoDB Architecture
MongoDB’s core capabilities deliver reli-
ability, high availability, high performance,
Application and scalability.
Replication through replica sets provides for
Read Write high availability and data safety. A replica
set is comprised of one primary node and
some number of secondary nodes (de-
Primary termined by the user). Figure 2 shows an
example replica set, with one primary and
two secondaries (a common deployment
Asynchronous
Secondary model). By default, the primary node takes
Replication
all reads and writes from the application;
the secondaries replicate asynchronously in
the background. If the primary node goes
Secondary
down for any reason, one of the secondaries
is automatically promoted to primary status
Automatic and begins to take all reads and writes.
Leader Election Replica sets help protect applications from
hardware and data center-related down-
time. Moreover, they make it easy for DBAs
to conduct operational tasks, including
software upgrades and hardware changes.
Figure 3: Sharding with MongoDB
Sharding enables users to scale horizon-
tally as their data volumes grow and/or as
Shard A Shard B Shard C Shard N demands on their data stores grow. A shard
0...30 31...60 61...90 n...n+30 is a subset of the database, kind of like a
partition of the data. In Figure 3, Shard A
... contains documents 1-30; Shard B contains
documents 31-60; and so on. One can
choose any key on which to shard the col-
Horizontally Scalable lection (e.g., user name), and MongoDB will
automatically shard the data store based on
this key. One can scale a database infi-
nitely using sharding by adding new nodes
to a cluster. When a new node is added,
MongoDB recognizes it and redistributes
the data across the cluster. Because shard-
ing distributes both the actual data and
therefore the load (i.e., traffic), it enables
horizontal scalability as well as high per-
formance.
4
5. Figure 4: MongoDB Architecture
Application
mongos
Replica Set A Replica Set B Replica Set C Replica Set N
0...30 31...60 61...90 n...n+30
Primary Primary Primary Primary
Secondary Secondary Secondary Secondary
Secondary Secondary Secondary
... Secondary
An overview of the MongoDB architecture is shown in Figure EC2, Azure VMs give users access to elastic, on-demand
4. In a multi-shard environment, the application communi- virtual servers. Users can install Windows or Linux on a VM
cates with mongos, an intermediary router that directs reads and configure it based on their own preferences or their
and writes to the appropriate shard. Each shard is a replica apps’ specific needs. Users manage the VMs themselves,
set, providing scalability, availability, and performance to including scaling, installing security patches, and ongoing
performance monitoring and management. Azure VMs give
developers.
users a relatively significant degree of control over their
MongoDB was built for the cloud. Cloud services like Windows environments, but by the same token require users to take
on the VM management. Note: This service is currently in
Azure are therefore a natural fit for MongoDB. By coupling
preview (beta).
MongoDB’s easy-to-scale architecture and Azure’s elastic
cloud capacity, users can quickly and easily build, scale, and »» Windows Azure Cloud Services. Windows Azure Cloud
manage their applications. Services (Worker Roles and Web Roles) is Microsoft’s
Platform–as–a–Service (PaaS) offering. Similar to Heroku,
Worker Roles provide users with prebuilt, preconfigured
About Windows Azure Services instances of compute power. In contrast with Azure VMs,
users do not have to configure or manage Azure Worker
for MongoDB Roles. Windows Azure handles the deployment details –
Windows Azure is Microsoft’s suite of cloud services, providing from provisioning and load balancing to health monitoring
developers on-demand compute and storage to create, host for continuous availability. This can be helpful to some
and manage scalable and available web applications through users who prefer not to manage their applications at the
Microsoft data centers. When deploying MongoDB to Windows infrastructure level, though it restricts the level of control
Azure, users can choose from two deployment options: users have over their environments.
»» Windows Azure Virtual Machines. Windows Azure
Virtual Machines (VMs) is Microsoft’s Infrastructure-as-a-
Service (IaaS) offering. Similar to Amazon Web Services
5
6. Understanding the Deployment Options
Given that MongoDB can be deployed on either Windows Azure Virtual Machines may not always be the right fit for the
Azure Virtual Machines (IaaS) or Windows Azure Cloud Services following reasons:
(PaaS), it is important for users to consider the different capa-
bilities and implementation details of each service to deter- Increased Operational Effort. The increased control that Azure
mine which deployment model makes the most sense for their Virtual Machines provide comes with increase effort, as well.
applications. Users must define and implement their own security measures,
apply patches, and locate instances for fault tolerance. This
consideration may be important for developers that lack expe-
Azure Virtual Machines rience managing their own infrastructure or for companies that
BASIC SETUP don’t have the operational bandwidth to devote to managing
After being granted access to the preview functionality for this component of the stack.
Azure Virtual Machines, users can launch an instance and
install and configure MongoDB on it manually. Alternatively, Beta. The Azure Virtual Machines service is still in
users can use the recently released Windows Azure installer for Preview (beta).
MongoDB to set up a MongoDB replica set quickly and easily on
Windows Azure VMs. Azure Cloud Services
The installer is built on top of Windows PowerShell and the BASIC SETUP
Windows Azure command line tool. It contains a number of Users can also deploy MongoDB on Azure Cloud Services. To
deployment scripts. The tool is designed to help users get do so, download the MongoDB Azure Worker Role package,
single or multi-node MongoDB configurations up and running which is a preconfigured Worker Role with MongoDB. When
quickly. There are only two steps to installing and configur- deployed, each replica set member runs as a separate Worker
ing a MongoDB replica set on Azure VMs. Note: the installer Role instance; MongoDB data files are stored in Azure Cloud
is designed to run on a user’s local machine (i.e., not directly Drives. For detailed instructions, visit the MongoDB wiki (wiki.
on an Azure VM), and then to deploy output to Windows Azure mongodb.org).
VMs. To start, download the publish settings file. Next, run the
installer from the command prompt. PROS AND CONS OF AZURE CLOUD SERVICES
The pros and cons of running MongoDB on Azure Cloud
With Azure Virtual Machines, users can create their own VMs or Services are generally consistent with those of using PaaS in
they can create a VM instance from one of several pre-installed general, though there are some Azure-specific considerations.
operating system configurations. Both Windows and Linux are Overall, Windows Azure Cloud Services decreases the opera-
supported on Azure Virtual Machines. To deploy MongoDB on tional burden on users but affords them less control from an
Linux, visit the MongoDB wiki (wiki.mongodb.org) for step-by- infrastructure configuration standpoint. The advantages of
step instructions. using Azure Cloud Services are as follows:
PROS AND CONS OF AZURE VIRTUAL MACHINES »» Lower Operational Effort. Microsoft manages OS updates
and security, decreasing the operational burden on the
The pros and cons of deploying MongoDB on Azure Virtual Ma-
users.
chines are generally consistent with the considerations around
using IaaS more broadly. Overall, Azure Virtual Machines allow »» Built-in Fault Tolerance.
When deploying multiple
users to fine-tune their deployments but by the same token MongoDB worker role instances, Windows Azure
require increased operational effort. automatically deploys the instances across multiple fault
and update domains to guarantee better uptime.
The advantages of using MongoDB on Azure Virtual Machines
are as follows: »» Secure by Default. Microsoft takes measures to ensure that
worker and web roles are secure. Endpoints on instances
»» Increased Control. Users have more control over their can be enabled for instance-to-instance communication
infrastructural configuration relative to Azure Cloud without making them public. Thus, one can configure
Services. For instance, they can install and configure MongoDB to be secure by enabling it only for other roles in
services on the OS, define policies, etc. This consideration the same deployment.
may be important for enterprises that have regimented
policies and processes for IT security and compliance.
»» OS Choice. Users can use Windows or Linux.
6
7. Table 1: Pros and Cons Summary - Windows Azure Virtual Machines and
Windows Azure Cloud Services
By the same token, there are some aspects
of Azure Cloud Services that may be consid-
ered drawbacks:
PROS CONS
»» Windows Only. Worker Roles can only be
deployed with Windows; Linux is not an
IaaS – - Increased control - Increased operational
effort option.
Windows - OS choice
Azure Virtual
Machines
»» Fixed OS Configuration. Users cannot
configure the OS, and must therefore
develop applications that run on the
pre-defined machine configurations
Initial - Lower operational effort - Windows only
available.
Administrative - Built-in fault tolerance - Fixed OS
Effort configuration
- Secure by default Table 1 summarizes the pros and cons of
using Windows Azure Worker Roles and
Windows Azure Virtual Machines.
Summary
MongoDB was built for ease of use, scal-
ability, availability, and performance, and
it’s quickly becoming an attractive alterna-
tive to relational databases. Windows Azure
provides a flexible cloud platform for host-
ing MongoDB, with two deployment models
to choose from. Developers and enterprises
looking at deploying MongoDB on Windows
Azure should consider the pros and cons
discussed here when evaluating which
option is most appropriate for them. We
hope that this paper helps customers better
understand these solutions, how they work,
and how to assess them.
To learn more about MongoDB and how
to deploy it in the cloud, or to speak
to a sales representative, please email
info@10gen.com.
7
8. New York 578 Broadway, New York, NY 10012 • London 5-25 Scrutton St., London EC2A 4HJ
info@10gen.com • US (866) 237-8815 • INTL +1 (650) 440-4474