Replica Set Tutorial
This tutorial walks you through the basic configuration of a replica set. To
keep the example easy to try, it runs several mongod processes on a single
machine (in production you would use several machines). If you plan to deploy
replica sets in production, be sure to read the replica set documentation.
Replica sets are available in MongoDB v1.6+.
A replica set is a group of n mongod nodes (members) that work
together. The goal is that each member of the set has a complete copy (replica)
of the data from the other nodes.
Setting up a replica set is a two-step
process that requires starting each mongod process and then formally initiating
the set. Here, we'll be configuring a set of three nodes, which is standard.
Once the mongod processes are started, we will
issue a command to initialize the set. After a few seconds, one node will be
elected master, and you can begin writing to and querying the set.
First, create a separate data directory
for each of the nodes in the set. In a real environment with multiple servers
we could use the default /data/db directory if we wanted to, but on a single
machine we will have to set up non-defaults:
$ mkdir -p /data/r0
$ mkdir -p /data/r1
$ mkdir -p /data/r2
Next, start each mongod process with the --replSet parameter. The parameter
takes a logical name for the replica set. Let's call our replica set
"foo". We'll launch our first node like so:
$ mongod --replSet foo --port 27017 --dbpath /data/r0
Let's now start the second and third
nodes:
$ mongod --replSet foo --port 27018 --dbpath /data/r1
$ mongod --replSet foo --port 27019 --dbpath /data/r2
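The three commands differ only in port and data directory, so they can be generated rather than typed. The following is a minimal sketch in plain JavaScript (runnable in Node.js); the helper name mongodCommands is hypothetical, and it uses only the flags shown above.

```javascript
// Hypothetical helper (not part of MongoDB): builds one mongod command line
// per node, using only the flags shown in this tutorial.
function mongodCommands(setName, nodeCount, basePort) {
  const commands = [];
  for (let i = 0; i < nodeCount; i++) {
    // Node i listens on basePort + i and stores its data in /data/r<i>.
    commands.push(
      "mongod --replSet " + setName +
      " --port " + (basePort + i) +
      " --dbpath /data/r" + i
    );
  }
  return commands;
}

// Print the commands for the three-node set "foo" used in this tutorial.
mongodCommands("foo", 3, 27017).forEach((cmd) => console.log(cmd));
```

Each printed line would be run in its own terminal, since mongod stays in the foreground here.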
You should now have three nodes running.
At this point, each node should be printing the following warning:
Mon Aug 2 11:30:19 [startReplSets] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
We can't use the replica set until we've initiated it,
which we'll do next.
We can initiate the replica set by
connecting to one of the members and running the replSetInitiate command (that
is, rs.initiate() in the mongo shell).
This command takes a configuration object that specifies the name of the set
and each of the members.
The replSetInitiate command may be sent
to any member of an uninitiated set. However, only the member performing the
initiation may have any existing data. This data becomes the initial data for
the set. The other members will begin synchronizing and receiving that data (if
present; starting empty is fine too). This is called the "initial
sync". Secondaries will not be online for reads (in state 2,
"SECONDARY") until their initial sync completes.
Note: the replication oplog (in
the local database) is allocated at initiation time. The oplog can be quite
large, thus initiation may take some time.
$ mongo localhost:27017
MongoDB shell version: 1.5.7
connecting to: localhost:27017/test
> rs.help(); // if you are curious run this (optional)
>
> config = {_id: 'foo', members: [
      {_id: 0, host: 'localhost:27017'},
      {_id: 1, host: 'localhost:27018'},
      {_id: 2, host: 'localhost:27019'}]
  }
> rs.initiate(config);
{
    "info" : "Config now saved locally. Should come online in about a minute.",
    "ok" : 1
}
We specify the config object and then
pass it to rs.initiate(). Then, if everything is in order, we get a response saying that the
replica set will be online in a minute. During this time, one of the nodes will
be elected master.
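The config object follows a regular pattern: the set name as _id, plus one entry per member with a numeric _id and a host string. As an illustration only, a small helper can build that document from a list of hosts; the name buildReplSetConfig is hypothetical and not part of MongoDB.

```javascript
// Hypothetical helper: builds a config document shaped like the one passed
// to rs.initiate() in this tutorial.
function buildReplSetConfig(setName, hosts) {
  if (!setName || hosts.length === 0) {
    throw new Error("a set name and at least one host are required");
  }
  return {
    _id: setName,
    // Each member gets a numeric _id (its position in the list) and its host.
    members: hosts.map((host, index) => ({ _id: index, host: host })),
  };
}

const config = buildReplSetConfig("foo", [
  "localhost:27017",
  "localhost:27018",
  "localhost:27019",
]);
console.log(JSON.stringify(config, null, 2));
```

The resulting object has the same shape as the config typed by hand above, so in the shell it could be passed to rs.initiate() in the same way.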
To check the status of the set, run rs.status():
> rs.status()
{
    "set" : "foo",
    "date" : "Mon Aug 02 2010 11:39:08 GMT-0400 (EDT)",
    "myState" : 1,
    "members" : [
        {
            "name" : "arete.local:27017",
            "self" : true,
        },
        {
            "name" : "localhost:27019",
            "health" : 1,
            "uptime" : 101,
            "lastHeartbeat" : "Mon Aug 02 2010 11:39:07 GMT-0400",
        },
        {
            "name" : "localhost:27018",
            "health" : 1,
            "uptime" : 107,
            "lastHeartbeat" : "Mon Aug 02 2010 11:39:07 GMT-0400",
        }
    ],
    "ok" : 1
}
You'll see that the other members of the
set are up. You may also notice that the myState value is 1, indicating that we're
connected to the member which is currently primary; a value of 2 indicates a
secondary.
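To make reading the status document concrete, here is a rough sketch that summarizes a document of the shape printed by rs.status(). The helper summarizeStatus is hypothetical. It maps only the two state codes described in this tutorial, and it assumes that a health value other than 1 means the member is not up.

```javascript
// State codes described in this tutorial; other codes exist but are not covered here.
const STATE_NAMES = { 1: "PRIMARY", 2: "SECONDARY" };

// Hypothetical helper: summarizes a document shaped like the rs.status() output.
function summarizeStatus(status) {
  const role = STATE_NAMES[status.myState] || "OTHER (" + status.myState + ")";
  // The member we are connected to is marked "self" and has no "health" field,
  // so only the remaining members are checked.
  // Assumption: any health value other than 1 means the member is not up.
  const down = status.members
    .filter((m) => !m.self && m.health !== 1)
    .map((m) => m.name);
  return { set: status.set, role: role, down: down };
}

// Sample document with the same shape as the output shown in this tutorial.
const sample = {
  set: "foo",
  myState: 1,
  members: [
    { name: "arete.local:27017", self: true },
    { name: "localhost:27019", health: 1, uptime: 101 },
    { name: "localhost:27018", health: 1, uptime: 107 },
  ],
  ok: 1,
};
console.log(summarizeStatus(sample));
```

In the mongo shell, the same function could be given the result of rs.status() directly instead of the sample object.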
You can also check the set's status in
the HTTP Admin UI.
Go ahead and write something to the
master node:
db.messages.insert({name: "ReplSet Tutorial"});
If you look at the logs on the secondary
nodes, you'll see the write replicated.
The purpose of a replica set is to
provide automated failover. This means that, if the primary node goes down, a
secondary node can take over. When this occurs the set members which are up
perform an election to
select a new primary. To see how this works in practice, go ahead and kill the
master node with Control-C (^C) (or if running with --journal, kill -9 would be ok too):
^CMon Aug 2 11:50:16 got kill or ctrl c or hup signal 2 (Interrupt), will terminate after current cmd ends
Mon Aug 2 11:50:16 [interruptThread] now exiting
Mon Aug 2 11:50:16 dbexit:
If you look at the logs on the
secondaries, you'll see a series of messages indicating failover. On our first
secondary, we see this:
Mon Aug 2 11:50:16 [ReplSetHealthPollTask] replSet info localhost:27017 is now down (or slow to respond)
Mon Aug 2 11:50:17 [conn1] replSet info voting yea for 2
Mon Aug 2 11:50:17 [rs Manager] replSet not trying to elect self as responded yea to someone else recently
Mon Aug 2 11:50:27 [rs_sync] replSet SECONDARY
And on the second, this:
Mon Aug 2 11:50:17 [ReplSetHealthPollTask] replSet info localhost:27017 is now down (or slow to respond)
Mon Aug 2 11:50:17 [rs Manager] replSet info electSelf 2
Mon Aug 2 11:50:17 [rs Manager] replSet PRIMARY
Mon Aug 2 11:50:27 [initandlisten] connection accepted from 127.0.0.1:61263 #5
Both nodes notice that the master has
gone down and, as a result, a new primary node is elected. In this case, the
node at port 27019 is promoted. If we bring the failed node on 27017 back
online, it will come back up as a secondary.
Changing the replica set configuration
There are times when you'll want to
change the replica set configuration. Suppose, for instance, that you want to
make a member have priority zero, indicating the member should never be
primary. To do this, you need to pass a new configuration object to the
database's replSetReconfig command. The shell rs.reconfig() helper makes this easier.
One note: the reconfig command must be
sent to the current primary of the set. This implies that you need a majority
of the set up to perform a reconfiguration.
> // we should be primary. can be checked with rs.status() or with:
> rs.isMaster();
> var c = rs.conf();
{_id: 'foo', members: [
    {_id: 0, host: 'localhost:27017'},
    {_id: 1, host: 'localhost:27018'},
    {_id: 2, host: 'localhost:27019'}]
}
> c.members[2].priority = 0;
> c
{_id: 'foo', members: [
    {_id: 0, host: 'localhost:27017'},
    {_id: 1, host: 'localhost:27018'},
    {_id: 2, host: 'localhost:27019', priority: 0}]
}
> rs.reconfig(c);
> //done. to see new config,and new status:
> rs.conf()
> rs.status()
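The session above edits the config object in place. As an alternative sketch, the change can be made on a copy, which leaves the current configuration available for comparison before anything is sent to the server. The helper withPriority is hypothetical and works on plain objects only.

```javascript
// Hypothetical helper: returns a copy of a config document in which the
// member with the given host has the given priority. The input is not modified.
function withPriority(config, host, priority) {
  if (!config.members.some((m) => m.host === host)) {
    throw new Error("no member with host " + host);
  }
  const members = config.members.map((m) =>
    m.host === host ? Object.assign({}, m, { priority: priority }) : m
  );
  return Object.assign({}, config, { members: members });
}

const current = {
  _id: "foo",
  members: [
    { _id: 0, host: "localhost:27017" },
    { _id: 1, host: "localhost:27018" },
    { _id: 2, host: "localhost:27019" },
  ],
};
const updated = withPriority(current, "localhost:27019", 0);
console.log(JSON.stringify(updated.members[2]));
```

In the shell, current would come from rs.conf() and updated would be passed to rs.reconfig(), as in the session above.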
Suppose you want to run replica sets
with just two database servers (that is, with a replication factor of two).
This is possible, but because replica sets elect a primary by majority vote, a
majority here would be 2 out of 2: if either server fails, the remaining one
cannot win an election. In this situation one therefore normally also runs an
arbiter on a separate server. An arbiter is a set member that holds no data but
votes in elections; here it acts as the tie breaker. Arbiters are very
lightweight and can be run anywhere, for example on an app server or a micro
VM. With an arbiter in place, the replica set will behave appropriately,
recovering automatically during both network partitions and node failures
(e.g., machine crashes).
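The majority arithmetic behind this advice can be checked with a few lines of JavaScript. This is a simplified sketch that assumes every member has exactly one vote; the function names are illustrative only.

```javascript
// Smallest number of votes that is more than half of the voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Can the remaining members still form a majority after `failures` are lost?
function canElectPrimary(votingMembers, failures) {
  return votingMembers - failures >= majority(votingMembers);
}

// Two data servers only: a single failure leaves 1 vote, but 2 are needed.
console.log(majority(2), canElectPrimary(2, 1)); // 2 false

// Two data servers plus an arbiter: a single failure leaves 2 of 3 votes.
console.log(majority(3), canElectPrimary(3, 1)); // 2 true
```

Adding the arbiter raises the vote count to three without adding a third copy of the data, which is why the set can then survive the loss of any single member.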
You start up an arbiter just as you
would a standard replica set node, as a mongod process with the --replSet option. However, when initiating,
you need to include the arbiterOnly option in the config document.
With an arbiter, the configuration
presented above would look like this instead:
config = {_id: 'foo', members: [
    {_id: 0, host: 'localhost:27017'},
    {_id: 1, host: 'localhost:27018'},
    {_id: 2, host: 'localhost:27019', arbiterOnly: true}]
}
Most of the MongoDB drivers are replica
set aware. When connecting, the driver takes a list of seed hosts from the
replica set and can then discover which host is primary and which are secondary
(the driver uses the isMaster command internally for this).
With this complete set of potential
master nodes, the driver can automatically find the new master if the current
master fails. See your driver's documentation for specific details.
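The discovery step can be sketched roughly as follows. This is not how any particular driver is implemented. The askIsMaster function is a hypothetical stand-in that returns canned replies instead of contacting a server, and the reply objects only mimic a few fields (ismaster, secondary, primary, hosts) of an isMaster response.

```javascript
// Hypothetical stand-in for sending the isMaster command to a host.
// null means the host could not be reached (here: the failed node on 27017).
const allHosts = ["localhost:27017", "localhost:27018", "localhost:27019"];
const cannedReplies = {
  "localhost:27017": null,
  "localhost:27018": { ismaster: false, secondary: true, primary: "localhost:27019", hosts: allHosts },
  "localhost:27019": { ismaster: true, secondary: false, primary: "localhost:27019", hosts: allHosts },
};
function askIsMaster(host) {
  return cannedReplies[host] || null;
}

// Walks the seed list, learning about further members from each reply,
// until some host reports that it is the primary.
function findPrimary(seeds, ask) {
  const toVisit = seeds.slice();
  const seen = new Set();
  while (toVisit.length > 0) {
    const host = toVisit.shift();
    if (seen.has(host)) continue;
    seen.add(host);
    const reply = ask(host);
    if (reply === null) continue; // unreachable, try the next host
    if (reply.ismaster) return host;
    // Not the primary: queue the members it knows about.
    for (const other of reply.hosts || []) toVisit.push(other);
  }
  return null; // no primary could be found
}

// The seed list does not include the current primary, yet it is still found.
console.log(findPrimary(["localhost:27017", "localhost:27018"], askIsMaster));
// prints "localhost:27019"
```

A real driver would also handle timeouts, reconnects, and changes in set membership; see your driver's documentation for what it actually does.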
If you happen to be using the Ruby
driver, you may want to check out Replica Sets in Ruby.
See also: http://mongodb.onconfluence.com/display/DOCSKR/Home (Korean manual)
http://groups.google.com/group/mongodb-kr (Korean MongoDB user group)