Friday, January 8, 2016

Add Node To MongoDB Replica-Set With Minimum Sync Time


Normally, when one adds a node to a MongoDB replica set, MongoDB wipes out all existing databases on the newly added node and syncs the data from an existing member (usually the primary).
This is the easiest way to add a node to a replica set, but if the data volume is high, the initial sync can take a long time to complete.
To avoid that, there is another way to achieve the same result in much less time.

Following are the steps one can take to add a node to a MongoDB replica set in minimum time.

Initial Setup -
All of my mongo processes are running on one node on different ports.

RepB - Primary (port 29002)
RepA - Secondary (port 29001)

RepC - New Secondary (port 29003) <- This is the new node which will be added to the replica set.

Now one needs to copy the data files from one of the existing secondary nodes.
To make sure the copy is consistent, lock the secondary node so that no writes are applied while the files are being copied.
Also make sure the oplog is large enough to hold all the updates from the primary during the copy without wrapping around. If the oldest entries are overwritten before the new node starts, it will be too stale to catch up and will need a full initial sync.
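
To get a rough idea of whether the oplog is large enough, one can compare the oplog window with the expected copy time. The following is only a sketch: hasEnoughOplogWindow is a hypothetical helper name (not a MongoDB API), the safety factor of 2 is an arbitrary assumption, and the copy time is your own estimate. It relies on the timeDiff field (seconds between the oldest and newest oplog entries) of the object returned by db.getReplicationInfo().

```javascript
// Hypothetical helper, not part of MongoDB.
// info: object returned by db.getReplicationInfo()
// copySeconds: your own estimate of how long the file copy will take
// safetyFactor: assumed margin; defaults to 2 if not given
function hasEnoughOplogWindow(info, copySeconds, safetyFactor) {
    var factor = safetyFactor || 2;
    // timeDiff is the time span (in seconds) currently covered by the oplog
    return info.timeDiff >= copySeconds * factor;
}

// Usage in the mongo shell, connected to the primary,
// assuming the copy takes about 10 minutes:
//   hasEnoughOplogWindow(db.getReplicationInfo(), 600)
```

If this returns false, it is probably safer to increase the oplog size or copy during a quieter period.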

[root@]# mongo  --port 29001
MongoDB shell version: 3.0.7
connecting to: 127.0.0.1:29001/test
Server has startup warnings:
TestABC:SECONDARY>

TestABC:SECONDARY> db.fsyncLock()
{
        "info" : "now locked against writes, use db.fsyncUnlock() to unlock",
        "seeAlso" : "http://dochub.mongodb.org/core/fsynccommand",
        "ok" : 1
}
TestABC:SECONDARY> exit
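
Before leaving the shell and starting the copy, it can be worth confirming that the lock is actually held. A minimal sketch, assuming the fsyncLock field that db.currentOp() reports while the server is locked; isFsyncLocked is a hypothetical helper name.

```javascript
// Hypothetical helper, not part of MongoDB.
// currentOp: object returned by db.currentOp()
// The fsyncLock field is only present (and true) while the lock is held.
function isFsyncLocked(currentOp) {
    return currentOp.fsyncLock === true;
}

// Usage in the mongo shell on the locked secondary:
//   isFsyncLocked(db.currentOp())
```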

Copy the files from existing secondary node to new secondary node.

[root@lpdosput00251 RepA]# scp -r * ../RepC/
[root@lpdosput00251 RepA]# cd ../RepC/
[root@lpdosput00251 RepC]# ll
total 163852
drwxr-xr-x 2 root root     4096 Jan  8 14:28 journal
-rwxr-xr-x 1 root root 67108864 Jan  8 14:28 local.0
-rwxr-xr-x 1 root root 16777216 Jan  8 14:28 local.ns
-rwxr-xr-x 1 root root        5 Jan  8 14:28 mongod.lock
-rwxr-xr-x 1 root root       69 Jan  8 14:28 storage.bson
-rwxr-xr-x 1 root root 67108864 Jan  8 14:28 test.0
-rwxr-xr-x 1 root root 16777216 Jan  8 14:28 test.ns
[root@lpdosput00251 RepC]# rm mongod.lock

Once the copy is complete, unlock the secondary node as follows.

TestABC:SECONDARY> db.fsyncUnlock()
{ "ok" : 1, "info" : "unlock completed" }

Now start the mongod process as a member of the replica set. 

[root@lpdosput00251 RepC]# mongod --replSet TestABC --dbpath /amex/mongodb/RepC --port 29003 --oplogSize 50 --logpath log.C --logappend --fork

Now one has to add this node to the replica set configuration.

Connect to the primary node.

[root@lpdosput00251 ~]# mongo  --port 29002
MongoDB shell version: 3.0.7
connecting to: 127.0.0.1:29002/test

TestABC:PRIMARY> rs.add("10.20.176.194:29003")
{ "ok" : 1 }
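
To double check that the new host really made it into the configuration, one can look it up in the output of rs.conf(). This is just a sketch; isInConfig is a hypothetical helper name, and it only compares the host strings exactly as they were passed to rs.add().

```javascript
// Hypothetical helper, not part of MongoDB.
// conf: object returned by rs.conf()
// host: "hostname:port" string as passed to rs.add()
function isInConfig(conf, host) {
    return conf.members.some(function (member) {
        return member.host === host;
    });
}

// Usage in the mongo shell on the primary:
//   isInConfig(rs.conf(), "10.20.176.194:29003")
```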

The log file of the new node will show messages like the following.

2016-01-08T14:30:32.734-0700 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "TestABC", version: 6, members: [ { _id: 0, host: "10.20.176.194:29001", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "10.20.176.194:29002", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "10.20.176.194:29003", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatTimeoutSecs: 10, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
2016-01-08T14:30:32.734-0700 I REPL     [ReplicationExecutor] This node is 10.20.176.194:29003 in the config
2016-01-08T14:30:32.734-0700 I REPL     [ReplicationExecutor] transition to STARTUP2
2016-01-08T14:30:32.735-0700 I REPL     [ReplicationExecutor] Member 10.20.176.194:29002 is now in state PRIMARY
2016-01-08T14:30:32.735-0700 I REPL     [ReplicationExecutor] Member 10.20.176.194:29001 is now in state SECONDARY
2016-01-08T14:30:34.280-0700 I NETWORK  [initandlisten] connection accepted from 10.20.176.194:39327 #4 (3 connections now open)
2016-01-08T14:30:34.281-0700 I NETWORK  [conn4] end connection 10.20.176.194:39327 (2 connections now open)
2016-01-08T14:30:35.450-0700 I REPL     [ReplicationExecutor] syncing from: 10.20.176.194:29002
2016-01-08T14:30:35.450-0700 I REPL     [SyncSourceFeedback] replset setting syncSourceFeedback to 10.20.176.194:29002
2016-01-08T14:30:37.431-0700 I REPL     [ReplicationExecutor] transition to RECOVERING
2016-01-08T14:30:37.432-0700 I REPL     [ReplicationExecutor] transition to SECONDARY

Once you notice that the newly added node has finished recovering and transitioned to SECONDARY, check the replica set status.

TestABC:SECONDARY> rs.status().members[0].name
10.20.176.194:29001
TestABC:SECONDARY> rs.status().members[0].stateStr
SECONDARY
TestABC:SECONDARY> rs.status().members[1].name
10.20.176.194:29002
TestABC:SECONDARY> rs.status().members[1].stateStr
PRIMARY
TestABC:SECONDARY> rs.status().members[2].name
10.20.176.194:29003
TestABC:SECONDARY> rs.status().members[2].stateStr
SECONDARY
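
Instead of querying each member one by one, the same check can be done in one go. A small sketch, assuming only the name and stateStr fields of rs.status().members that are shown above; memberStates and allMembersReady are hypothetical helper names.

```javascript
// Hypothetical helpers, not part of MongoDB.
// status: object returned by rs.status()

// Returns one "host:port STATE" string per member.
function memberStates(status) {
    return status.members.map(function (member) {
        return member.name + " " + member.stateStr;
    });
}

// True only if every member is either PRIMARY or SECONDARY.
function allMembersReady(status) {
    return status.members.every(function (member) {
        return member.stateStr === "PRIMARY" || member.stateStr === "SECONDARY";
    });
}

// Usage in the mongo shell:
//   memberStates(rs.status()).forEach(function (line) { print(line); })
//   allMembersReady(rs.status())
```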

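To check whether the data is populated correctly, connect to the new node and query it. Reads on a secondary fail with "not master" until rs.slaveOk() is run.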

[root@lpdosput00251 RepC]# mongo  --port 29003
MongoDB shell version: 3.0.7
connecting to: 127.0.0.1:29003/test

TestABC:SECONDARY> show dbs
2016-01-08T15:01:31.697-0700 E QUERY    Error: listDatabases failed:{ "note" : "from execCommand", "ok" : 0, "errmsg" : "not master" }
    at Error (<anonymous>)
    at Mongo.getDBs (src/mongo/shell/mongo.js:47:15)
    at shellHelper.show (src/mongo/shell/utils.js:630:33)
    at shellHelper (src/mongo/shell/utils.js:524:36)
    at (shellhelp2):1:1 at src/mongo/shell/mongo.js:47
TestABC:SECONDARY> rs.slaveOk()
TestABC:SECONDARY> show dbs
local  0.078GB
test   0.078GB
TestABC:SECONDARY> use test
switched to db test

TestABC:SECONDARY> show collections
foo
system.indexes
TestABC:SECONDARY> db.foo.find()
{ "_id" : ObjectId("5660264dfb21b4cc04a0c734"), "str" : "Sunil" }
{ "_id" : ObjectId("5660265bfb21b4cc04a0c735"), "str" : "Hardik", "x" : 3 }
{ "_id" : ObjectId("56602667fb21b4cc04a0c736"), "str" : "Hardik", "x" : 1 }
{ "_id" : ObjectId("56602674fb21b4cc04a0c737"), "str" : "sunny", "x" : 3 }
{ "_id" : ObjectId("5660267ffb21b4cc04a0c738"), "str" : "sunilkumar", "x" : 7 }
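
For more than a handful of documents, eyeballing find() output does not scale, so one option is to compare per-collection document counts between the primary and the new node. This is only a rough sketch: collectionCounts and countMismatches are hypothetical helper names, and on a system that is receiving writes the counts can differ briefly because of replication lag.

```javascript
// Hypothetical helpers, not part of MongoDB.

// database: a mongo shell database object, e.g. db.getSiblingDB("test")
// Returns an object mapping collection name -> document count.
function collectionCounts(database) {
    var counts = {};
    database.getCollectionNames().forEach(function (name) {
        counts[name] = database.getCollection(name).count();
    });
    return counts;
}

// Returns the sorted names of collections whose counts differ,
// including collections that exist on only one side.
function countMismatches(primaryCounts, secondaryCounts) {
    var names = {};
    Object.keys(primaryCounts).forEach(function (n) { names[n] = true; });
    Object.keys(secondaryCounts).forEach(function (n) { names[n] = true; });
    return Object.keys(names).filter(function (n) {
        return primaryCounts[n] !== secondaryCounts[n];
    }).sort();
}

// Usage in the mongo shell (ports as used in this post):
//   var p = new Mongo("127.0.0.1:29002").getDB("test");
//   var conn = new Mongo("127.0.0.1:29003"); conn.setSlaveOk();
//   var s = conn.getDB("test");
//   countMismatches(collectionCounts(p), collectionCounts(s))
```

An empty array means the counts match on both nodes.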

Your node is now successfully added to the replica set and is in sync with the primary.
