2. Notes to the presenter
• Themes for this presentation:
– Balance between cost and redundancy
– The many scenarios that replication solves, and why
– The secret sauce for avoiding downtime & data loss
• If time is short, skip the 'Behind the curtain' section
• If time is very short, also skip the 'Operational Considerations' section
3. Why Replication?
• How many have faced node failures?
• How many have been woken up in the middle of the night to perform a failover?
• How many have experienced issues due to network
latency?
• Different uses for data
– Normal processing
– Simple analytics
26. Tagging
• Control where data is written to and where it is read from
• Each member can have one or more tags
– tags: {dc: "ny"}
– tags: { dc: "ny",
          subnet: "192.168",
          rack: "row3rk7" }
• Replica set config defines rules for write concerns (see the sketch below)
• Rules can change without changing app code
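A minimal mongo-shell sketch of what this can look like; the member indexes, tag values, and the rule name "multiDC" are assumptions for illustration:

    cfg = rs.conf()
    // Tag each member with its data center (hypothetical values)
    cfg.members[0].tags = { dc: "ny" }
    cfg.members[1].tags = { dc: "ny" }
    cfg.members[2].tags = { dc: "sf" }
    // Custom write-concern rule: w: "multiDC" requires the write to reach
    // members carrying 2 distinct values of the "dc" tag
    cfg.settings = cfg.settings || {}
    cfg.settings.getLastErrorModes = { multiDC: { dc: 2 } }
    rs.reconfig(cfg)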
29. Read Preference Modes
• 5 modes
– primary (only) - Default
– primaryPreferred
– secondary
– secondaryPreferred
– nearest
• When more than one node is eligible, the closest node (by network ping time) is used for reads (all modes except primary)
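For instance, a read preference can be set on the connection from the mongo shell; the mode and collection below are just examples:

    // Prefer secondaries; fall back to the primary if none is available
    db.getMongo().setReadPref("secondaryPreferred")
    db.orders.find({ status: "open" })   // routed per the preference above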
30. Tagged Read Preference
• Custom read preferences
• Control where you read from by (node) tags
– E.g. { "disk": "ssd", "use": "reporting" }
• Use in conjunction with standard read
preferences
– Except primary mode (tags cannot be combined with primary)
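A possible shell sketch combining a mode with a tag set; the tag values and collection name are assumptions:

    // Read only from secondaries tagged as SSD-backed reporting nodes
    db.getMongo().setReadPref("secondary", [ { disk: "ssd", use: "reporting" } ])
    db.reports.find()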
32. Maintenance and Upgrade
• No downtime
• Rolling upgrade/maintenance (see the sketch below)
– Start with the secondaries
– Upgrade the primary last (step it down first)
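One way the rolling procedure can look from the mongo shell; the binary upgrade itself happens outside the shell, and the timeout value is an assumption:

    // For each secondary in turn:
    //   shut down mongod cleanly, upgrade it, restart it, then
    //   wait until rs.status() reports it as SECONDARY again
    rs.status()

    // When only the primary is left, make it yield so it can be
    // upgraded as a secondary; it refuses re-election for 60 seconds
    rs.stepDown(60)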
33. Replica Set – 1 Data Center
• Single datacenter
[Diagram: one data center containing Members 1, 2, and 3]
• Single switch & power
• Points of failure:
– Power
– Network
– Data center
– Two-node failure (the surviving node cannot form a majority)
• Automatic recovery from a single-node crash
34. Replica Set – 2 Data Centers
[Diagram: Datacenter 1 with Members 1 and 2; Datacenter 2 with Member 3]
• Multi data center: DR node for safety
• Can't do a durable multi-data-center write safely, since there is only one node in the remote DC
35. Replica Set – 3 Data Centers
[Diagram: Datacenter 1 with Members 1 and 2; Datacenter 2 with Members 3 and 4; Datacenter 3 with Member 5]
• Three data centers
• Can survive the loss of a full data center
• Can do w = { dc: 2 } (with tags) to guarantee a write reaches 2 data centers; see the usage sketch below
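Roughly what that looks like in the shell, assuming a getLastErrorModes rule named "multiDC" like the sketch earlier; the collection name is an assumption:

    db.orders.insert({ _id: 1, status: "open" })
    // Block until members in 2 distinct data centers acknowledge the write
    db.runCommand({ getLastError: 1, w: "multiDC", wtimeout: 5000 })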
37. Implementation details
• Heartbeat every 2 seconds
– Times out in 10 seconds
• Local DB (not replicated)
– system.replset
– oplog.rs
• Capped collection
• An idempotent version of each operation is stored
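These collections can be inspected directly from the shell; the query below just shows the most recent oplog entry:

    use local
    db.system.replset.findOne()   // current replica set configuration
    // Newest entry first ($natural = insertion order in the capped collection)
    db.oplog.rs.find().sort({ $natural: -1 }).limit(1).pretty()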
40. Recent improvements
• Read preference support with sharding
– Drivers too
• Improved replication over WAN/high-latency
networks
• rs.syncFrom command
• buildIndexes setting
• replIndexPrefetch setting
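As an example, rs.syncFrom points a member at a specific sync source; the hostname here is hypothetical:

    // Run on the member whose replication source you want to override
    rs.syncFrom("ny2.example.net:27017")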
41. Just Use It
• Use replica sets
• Easy to set up
– Try it on a single machine (see the sketch below)
• Check the docs page for replica set tutorials
– http://docs.mongodb.org/manual/replication/#tutorials
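A minimal single-machine sketch; the set name "demo", ports, and dbpaths are all assumptions:

    // Start three mongod processes outside the shell first, e.g.:
    //   mongod --replSet demo --port 27017 --dbpath /data/rs1
    //   mongod --replSet demo --port 27018 --dbpath /data/rs2
    //   mongod --replSet demo --port 27019 --dbpath /data/rs3
    // Then, connected to one of them:
    rs.initiate({
      _id: "demo",
      members: [
        { _id: 0, host: "localhost:27017" },
        { _id: 1, host: "localhost:27018" },
        { _id: 2, host: "localhost:27019" }
      ]
    })
    rs.status()   // watch the set elect a primary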