Monday, December 1, 2014

MongoDB Upgrade From 2.4.x to 2.6.x


Recently, in our environment, we undertook an exercise  to test the new version of MongoDB and look n feel of the upgrade process. 

Our env has following components - 

2 MongoS processes.
3 config servers.
2 shard servers. 

All of the above was using version 2.4.8, which we intend to upgrade to latest version of 2.6.5. 

To begin the upgrade procedure, connect a 2.6 mongo shell to your MongoDB 2.4 mongos or mongod and run the db.upgradeCheckAllDBs() to check your data set for compatibility. This is a preliminary automated check. Assess and resolve all issues identified by db.upgradeCheckAllDBs().

mongos> use admin
switched to db admin
mongos> db.upgradeCheckAllDBs()

Checking database sharding

Checking collection sharding.col1
37055 documents processed
76072 documents processed
......
.....
.....
Checking database admin

Checking collection admin.system.indexes

Checking collection admin.system.users
Everything is ready for the upgrade!
true



-- Now one has to  Stop/Disable the Balancer, so that there are no cluster related operations.

mongos> sh.isBalancerRunning()
true
mongos> sh.stopBalancer()
Waiting for active hosts...
Waiting for the balancer lock...
Waiting again for active hosts after balancer is off...
mongos> sh.isBalancerRunning()
false


There is specific order of the upgrade when you wish to upgrade the sharded cluster.
1. Mongos upgrade
2. config server upgrade
3. shard server upgrade. 

1. Upgrade mongos.

-- unzip the binaries of 2.6 mongos and use it to start new mongos of 2.6 version

-- stop mongos 
root      2988  1941  0 16:24 pts/1    00:00:29 /usr/bin/mongos --configdb localhost:19001,localhost:19002,localhost:19003 --logpath /db/mongos1.log --port 30002

root@INPUNPCLX:~# kill 2988

root@INPUNPCLX:~/mongodb-linux-x86_64-2.6.5/bin# /usr/bin/mongos --version
MongoS version 2.4.8 starting: pid=4471 port=27017 64-bit host=INPUNPCLX (--help for usage)

root@INPUNPCLX:~/mongodb-linux-x86_64-2.6.5/bin# mv /usr/bin/mongos /usr/bin/mongos.bak

root@INPUNPCLX:~/mongodb-linux-x86_64-2.6.5/bin# scp ./mongos /usr/bin/mongos
root@INPUNPCLX:~/mongodb-linux-x86_64-2.6.5/bin# which mongos
/usr/bin/mongos
root@INPUNPCLX:~/mongodb-linux-x86_64-2.6.5/bin# /usr/bin/mongos --version
MongoS version 2.6.5 starting: pid=4491 port=27017 64-bit host=INPUNPCLX (--help for usage)


Start a single 2.6 mongos instance with the configDB pointing to the cluster’s config servers and with the --upgrade option.

To run a mongos with the --upgrade option, you can upgrade an existing mongos instance to 2.6, or if you need to avoid reconfiguring a production mongos instance, you can use a new 2.6 mongos that can reach all the config servers.

To upgrade the meta data, run:

mongos --configdb <configDB string>  --logpath  xxxx --upgrade --port nnnnn

-- to start a new mongos 
 /usr/bin/mongos --configdb "localhost:19001,localhost:19002,localhost:19003" --upgrade  --logpath /db/mongos2.log --port 30002 &

7ac7218077eb4215", server: "INPUNPCLX", clientAddr: "N/A", time: new Date(1417182378230), what: "finished upgrade of config database", ns: "config.version", details: { from: 4, to: 5 } }
2014-11-28T19:16:18.248+0530 [mongosMain] upgrade of config server to v5 successful
2014-11-28T19:16:18.385+0530 [mongosMain] distributed lock 'configUpgrade/INPUNPCLX:30002:1417182377:1804289383' unlocked.
2014-11-28T19:16:18.385+0530 [mongosMain] Config database is at version v5


The mongos will exit upon completion of the --upgrade process.

Upgrade the remaining mongos instances to v2.6.
Upgrade and restart without the --upgrade option the other mongos instances in the sharded cluster. After upgrading all the mongos, see Complete Sharded Cluster Upgrade for information on upgrading the other cluster components

root@INPUNPCLX:~# ps -ef | grep mongos
root      2925     1  1 16:23 pts/1    00:02:38 /usr/bin/mongos --configdb localhost:19001,localhost:19002,localhost:19003 --logpath /db/mongos1.log --port 30001
root      4596  4102  0 19:17 pts/5    00:00:00 grep --color=auto mongos
root@INPUNPCLX:~# kill 2925
root@INPUNPCLX:~# /usr/bin/mongos --configdb "localhost:19001,localhost:19002,localhost:19003" --logpath /db/mongos1.log --port 30001 &
[1] 4598
root@INPUNPCLX:~# 2014-11-28T19:17:56.370+0530 log file "/db/mongos1.log" 

-- start the another mongos 

root@INPUNPCLX:~# /usr/bin/mongos --configdb "localhost:19001,localhost:19002,localhost:19003" --logpath /db/mongos2.log --port 30002 &
[2] 4645
root@INPUNPCLX:~# 2014-11-28T19:18:19.123+0530 log file "/db/mongos2.log" exists; moved to "/db/mongos2.log.2014-11-28T13-48-19".

root@INPUNPCLX:~# ps -ef | grep mongos
root      4598  4102  0 19:17 pts/5    00:00:00 /usr/bin/mongos --configdb localhost:19001,localhost:19002,localhost:19003 --logpath /db/mongos1.log --port 30001
root      4645  4102  1 19:18 pts/5    00:00:00 /usr/bin/mongos --configdb localhost:19001,localhost:19002,localhost:19003 --logpath /db/mongos2.log --port 30002

-- Do a test of connection, with new mongos 
root@INPUNPCLX:~# mongo --port 30001 admin -u hardik -p xxxx
MongoDB shell version: 2.6.5
connecting to: 127.0.0.1:30001/admin

2. Upgrade config db 

Upgrade all 3 mongod config server instances, leaving the first system in the mongos --configdb argument to upgrade last.


--For example, on Linux, run 2.4 mongod with the --shutdown option as follows:
mongod --dbpath /var/mongod/data --shutdown

-- Start the new mongod instance. Ensure you start the 2.6 mongod with the same dbPath:
mongod --dbpath /var/mongod/data

root      2889     1  0 16:23 ?        00:01:39 /usr/bin/mongod --configsvr --config /etc/config3.conf
root      2893     1  0 16:23 ?        00:01:34 /usr/bin/mongod --configsvr --config /etc/config2.conf
root      4722  4102  0 19:20 pts/5    00:00:00 grep --color=auto mongod
root@INPUNPCLX:~# kill 2889

root@INPUNPCLX:~# /usr/bin/mongod --version
db version v2.4.8

root@INPUNPCLX:~/mongodb-linux-x86_64-2.6.5/bin# ./mongod --version
db version v2.6.5
root@INPUNPCLX:~/mongodb-linux-x86_64-2.6.5/bin# scp ./mongod /usr/bin/mongod

root@INPUNPCLX:~/# mongod --version
db version v2.6.5
root@INPUNPCLX:/db/mongo01/data# /usr/bin/mongod --configsvr --config /etc/config3.conf &

root@INPUNPCLX:/db/mongo01/data# ps -ef | grep mongod

root      5377     1  0 19:36 ?        00:00:00 /usr/bin/mongod --configsvr --config /etc/config3.conf


root@INPUNPCLX:/db/mongo01/data# mongo --port 19003
MongoDB shell version: 2.6.5
connecting to: 127.0.0.1:19003/test
configsvr> show dbs
admin   0.063GB
config  0.063GB
local   0.031GB
configsvr> use config
switched to db config

configsvr> db.version()
2.6.5
-- above displays the config server version

WHEN YOU CONNECT THE OTHER CONFIG SERVER YOU WILL SEE THE OLD VERSION

root@INPUNPCLX:/db/mongo01/data# mongo --port 19002
MongoDB shell version: 2.6.5
connecting to: 127.0.0.1:19002/test
configsvr> use config
switched to db config
configsvr> db.version()
2.4.8

-- shutdown config2 server
root@INPUNPCLX:/db/mongo01/data# mongod --dbpath /db/config2/data --shutdown
killing process with pid: 2893

root@INPUNPCLX:/db/mongo01/data# /usr/bin/mongod --configsvr --config /etc/config2.conf &
[3] 5522
root@INPUNPCLX:/db/mongo01/data# about to fork child process, waiting until server is ready for connections.
forked process: 5524
child process started successfully, parent exiting

[3]+  Done                    /usr/bin/mongod --configsvr --config /etc/config2.conf
root@INPUNPCLX:/db/mongo01/data# ps -ef | grep mongod

root      5524     1  1 19:40 ?        00:00:00 /usr/bin/mongod --configsvr --config /etc/config2.conf

root@INPUNPCLX:/db/mongo01/data# mongo --port 19002
MongoDB shell version: 2.6.5
connecting to: 127.0.0.1:19002/test
configsvr> use config
switched to db config
configsvr> db.version()
2.6.5

-- Shut down and start up the last config server 
root@INPUNPCLX:/db/mongo01/data# mongod --dbpath /db/config1/data --shutdown
killing process with pid: 2884
root@INPUNPCLX:~# /usr/bin/mongod --configsvr --config /etc/config1.conf &
[3] 6172
root@INPUNPCLX:~# about to fork child process, waiting until server is ready for connections.
forked process: 6174
child process started successfully, parent exiting

[3]+  Done                    mongod --configsvr --config /etc/config1.conf


root@INPUNPCLX:~# ps -ef | grep mongod
root      2852     1  3 16:23 ?        00:07:28 /usr/bin/mongod --shardsvr --config /etc/mongo02.conf
root      2856     1  1 16:23 ?        00:03:17 /usr/bin/mongod --shardsvr --config /etc/mongo01.conf
root      5377     1  0 19:36 ?        00:00:01 /usr/bin/mongod --configsvr --config /etc/config3.conf
root      5524     1  0 19:40 ?        00:00:01 /usr/bin/mongod --configsvr --config /etc/config2.conf
root      6174     1  0 19:49 ?        00:00:00 /usr/bin/mongod --configsvr --config /etc/config1.conf

root@INPUNPCLX:~# mongo --port 19001
MongoDB shell version: 2.6.5
connecting to: 127.0.0.1:19001/test
configsvr> use config
switched to db config
configsvr> db.version()
2.6.5

3. Upgrade shard servers

Same way upgrade the shard servers same way you do configdb..
WARNING
Do not upgrade mongod instances until after you have upgraded all mongos instances.

-- check the shard server version
root@INPUNPCLX:~# mongo --port 20001
MongoDB shell version: 2.6.5
connecting to: 127.0.0.1:20001/test
> use config
switched to db config
> db.version()
2.4.8

-- kill the shard server 
root      2856     1  1 16:23 ?        00:03:18 /usr/bin/mongod --shardsvr --config /etc/mongo01.conf

root@INPUNPCLX:~# kill 2856

root@INPUNPCLX:~# /usr/bin/mongod --shardsvr --config /etc/mongo01.conf  &
child process started successfully, parent exiting

root      6269     1  0 19:55 ?        00:00:00 /usr/bin/mongod --shardsvr --config /etc/mongo01.conf

root@INPUNPCLX:~# mongo --port 20001
MongoDB shell version: 2.6.5
connecting to: 127.0.0.1:20001/test
Server has startup warnings:
2014-11-28T19:55:03.847+0530 ** WARNING: --rest is specified without --httpinterface,
2014-11-28T19:55:03.847+0530 **          enabling http interface
> use config
switched to db config
> db.version()
2.6.5

root@INPUNPCLX:~# ps -ef | grep mongod
root      2852     1  3 16:23 ?        00:07:29 /usr/bin/mongod --shardsvr --config /etc/mongo02.conf

root@INPUNPCLX:~# kill 2852

root@INPUNPCLX:~#  /usr/bin/mongod --shardsvr --config /etc/mongo02.conf &
[3] 6331
root@INPUNPCLX:~# 2014-11-28T19:57:58.410+0530 ** WARNING: --rest is specified without --httpinterface,
2014-11-28T19:57:58.410+0530 **          enabling http interface
about to fork child process, waiting until server is ready for connections.
forked process: 6333
child process started successfully, parent exiting


root@INPUNPCLX:~# mongo --port 20002
MongoDB shell version: 2.6.5
connecting to: 127.0.0.1:20002/test
Server has startup warnings:
2014-11-28T19:57:58.410+0530 ** WARNING: --rest is specified without --httpinterface,
2014-11-28T19:57:58.410+0530 **          enabling http interface
> use config
switched to db config
> db.version()
2.6.5

AS you can see now that even the shard servers are also upgraded. So one can use this method to upgrade the sharded clusters. Here I did not installed the Mongo RPM but used the tar'red version and copied the executables to /usr/bin myself. This process was adopted to keep control with DBA rather than to system to make trouble shooting easy. 

The only remaining bit that was not covered in post is the upgrade of Auth Schema. Which I will cover in separate post. 

Happy Learning!!

Wednesday, October 8, 2014

Partial DB Recovery To ReClaim User Objects


There are cases where users drops the objects and later they realize, what they have done. In case, you dont have flashback enabled or recycle bin turned on, you are in for some messy work. In such cases, one has to restore the whole database and then export and import back the desired objects. This works fine when you are dealing with small DBs however, if the DB is running in TBs, this brings whole new set of problems due to its sheer size. 

Though, there is a way where you can do the partial restore of DB without restoring the whole database which helps you to reduce the time, efforts and resources. 
One can follow the bellow mentioned process to achieve it should it needed. 

Use Case - There a package called as TEST in sysman schema, which got dropped. We need to recover this using partial recovery. The object residing in SYSAUX table space. Hence we will ignore the TEMP & USERS table spaces. 

On SOURCE DATABASE -

SQL> select a.FILE#, b.NAME from v$datafile a , v$tablespace b where a.TS#=b.TS#;
     FILE# NAME
---------- ------------------------------
         1 SYSTEM
         2 SYSAUX
         3 UNDOTBS1
         4 TEMP
         5 USERS
One has to start the Auxiliary instance on target node, with pfile copied from DB to be restored. In this case, I will rename the db instance as AUX.


[oracle@urac1 BKP_TST]$ . oraenv
ORACLE_SID = [AUX] ?

ORACLE_HOME = [/home/oracle] ?
[oracle@urac1 BKP_TST]$ sqlplus  / as sysdba
Connected to an idle instance.

SQL> startup nomount
ORACLE instance started.

Total System Global Area 1603411968 bytes
Fixed Size                  2228784 bytes
Variable Size             402656720 bytes
Database Buffers         1191182336 bytes
Redo Buffers                7344128 bytes


## NOW ONE NEEDS TO RESTORE CONTROLFILE FROM BACKUP PIECE. FOLLOWING BACKUP PIECE CONTAINS MY CONTRILFILE.


restore controlfile from '/gold_backups/BKP_TST/09_03pi266o_1_1.bkp'
RMAN> restore controlfile from '/gold_backups/BKP_TST/09_03pi266o_1_1.bkp' ;

Starting restore at 09-SEP-14
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=35 device type=DISK

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
output file name=+DATA/aux/control01.ctl
Finished restore at 09-SEP-14



RMAN> sql "alter database mount clone database";

sql statement: alter database mount clone database
released channel: ORA_DISK_1


## NOW WE NEED TO CHANGE THE NAME OF THE DATAFILES BEFORE RESTORE. 

## THIS IS USEFUL IF WE ARE USING THE SAME DISK GROUP AS SOURCE DATABASE.
## IN THIS TEST CASE, I AM CREATING AUXILIARY DATABASE ON SAME HOST WITH DIFF UNIQUE NAME.
## DB_UNIQUE_NAME=AUX

RUN
{

set newname for datafile 1 to '+DATA/AUX/system01.dbf';
set newname for datafile 2 to '+DATA/AUX/sysaux01.dbf';
set newname for datafile 3 to '+DATA/AUX/undotbs01.dbf';
restore tablespace system,sysaux, undotbs1;
switch datafile all;

sql "alter database datafile 1,2,3 online";


-- recover database skip forever tablespace TEMP,USERS;

--SINCE WE ARE RESTORING BACKUP ON DIFFERENT HOST, THE REDO LOG FILE DOES NOT EXISTS.
-- sql "alter database rename file ''/fs01/oradata/primary/REDO01.LOG'' to ''/fs01/oradata/tspitr/REDO01.LOG''";
-- sql "alter database rename file ''/fs01/oradata/primary/REDO02.LOG'' to ''/fs01/oradata/tspitr/REDO02.LOG''";
-- sql "alter database rename file ''/fs01/oradata/primary/REDO03.LOG'' to ''/fs01/oradata/tspitr/REDO03.LOG''";
}


RMAN> RUN
{

set newname for datafile 1 to '+DATA/AUX/system01.dbf';
set newname for datafile 2 to '+DATA/AUX/sysaux01.dbf';
set newname for datafile 3 to '+DATA/AUX/undotbs01.dbf';
restore tablespace system,sysaux, undotbs1;
switch datafile all;
sql "alter database datafile 1,2,3 online";
}

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

Starting restore at 09-SEP-14
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=35 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to +DATA/AUX/system01.dbf
channel ORA_DISK_1: restoring datafile 00002 to +DATA/AUX/sysaux01.dbf
channel ORA_DISK_1: restoring datafile 00003 to +DATA/AUX/undotbs01.dbf
channel ORA_DISK_1: reading from backup piece /gold_backups/BKP_TST/09_02pi2669_1_1.bkp
channel ORA_DISK_1: piece handle=/gold_backups/BKP_TST/09_02pi2669_1_1.bkp tag=FULL_BACKUP
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
Finished restore at 09-SEP-14

datafile 1 switched to datafile copy
input datafile copy RECID=4 STAMP=857808883 file name=+DATA/aux/system01.dbf
datafile 2 switched to datafile copy
input datafile copy RECID=5 STAMP=857808883 file name=+DATA/aux/sysaux01.dbf
datafile 3 switched to datafile copy
input datafile copy RECID=6 STAMP=857808883 file name=+DATA/aux/undotbs01.dbf

sql statement: alter database datafile 1,2,3 online

## RECOVER THE DATABASE TILL SPECIFIC TIME IF YOU KNOW IT OR JUST DO THE RECOVERY

RMAN> run
{
-- set until time "to_date( '09-09-2014 09:18', 'DD-MM-RRRR HH24:MI')";
recover database skip forever tablespace TEMP,USERS;
}


Starting recover at 09-SEP-14
using channel ORA_DISK_1

Executing: alter database datafile 4 offline drop
starting media recovery

RMAN-08187: WARNING: media recovery until SCN 1001013 complete
Finished recover at 09-SEP-14



## TRYING TO RECOVER DATABASE USING CANCEL BASED RECOVERY

SQL> recover database using backup controlfile until cancel;
ORA-00279: change 1002979 generated at 09/09/2014 09:19:51 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/product/11.2.0.3/db_1/dbs/arch1_9_857810115.dbf
ORA-00280: change 1002979 for thread 1 is in sequence #9

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.

SQL>  alter database open RESETLOGS;
 alter database open RESETLOGS
*
ERROR at line 1:
ORA-00395: online logs for the clone database must be renamed
ORA-00312: online log 3 thread 1: '+DATA/testdb/redo03.log'


## NOW SINCE ITS COMPLAINING ABOUT THE NAME OF THE LOGFILE, ONE HAS TO RENAME IT. RENAME ALL THE THREE LOGFILES.
## THIS HAPPENED DUE TO THE FACT THAT I AM RESTORING THE DATABASE ON SAME DISK GROUP. ONE HAS TO BE CAREFUL HERE
## EVEN IF YOU RENAME THE LOGFILES, IT WONT HURT AS YOU ARE JUST UPDATING ENTRIES IN CONTROLFILE.
## REDO LOGS WILL BE RECREATED WHEN YOU OPEN THE DB IN RESETLOGS MODE

alter database rename file '+DATA/testdb/redo03.log' to '+DATA/AUX/redo03.log'

SQL> alter database rename file '+DATA/testdb/redo03.log' to '+DATA/AUX/redo03.log'
Database altered.

SQL> alter database rename file '+DATA/testdb/redo01.log' to '+DATA/AUX/redo01.log;
Database altered.

SQL> alter database rename file '+DATA/testdb/redo02.log' to '+DATA/AUX/redo02.log';
Database altered.

SQL> alter database open RESETLOGS;
Database altered.

SQL> conn sysman/sysman
Connected.
SQL> select text from dba_source where  name='TEST';

TEXT
--------------------------------------------------------------------------------
PACKAGE test AS
  TYPE TimeRec IS RECORD (
    minutes SMALLINT,
    hours   SMALLINT);
  TYPE TransRec IS RECORD (
    category VARCHAR2(10),
    account  INT,
    amount   REAL,
    time_of  TimeRec);
  minimum_balance     CONSTANT REAL := 10.00;
  number_processed    INT;
  insufficient_funds  EXCEPTION;
END trans_data;


So as you can see, you can now see a package of interest which we were trying to restore. Hope this helps.


Tuesday, September 9, 2014