Saturday, April 5, 2014

Minimal Downtime Patching Using Cloned RDBMS Home


Jan 2014 patch is out since some time and we need to now apply it as there is no major issue reported with it. The target is 8 node cluster with around 30+ DBs running on it. These are the critical business applications which cannot be taken offline as they are running some critical global HR and Financial apps. We really struggled to get every application owner to agree to same downtime (DBA's nigthmare!!). Also due to approvals issue we have to cancel the patching twice. This was b'coz all are sharing same infrastructure. 

There are two fundamental problems that we have.
1. Amount of downtime. 
2. Get every one to agree to same downtime window, as they all have different business priorities. 

Considering this, we considered the following approach, which solves our both the issues.
Idea is very simple, where we clone the existing RDBMS Home and Patch it and then switch ORACLE HOME of the database to run out of the new patched home. 
So following is the outline of the process.


1. Clone oracle home.
2. Apply the patches on the cloned home.
3. Switch DB to run from new home and run sql script.

We tried this on three node RAC cluster running 11.2.0.3 with latest OPatch.
So first we need to clone the RDBMS home on all the nodes. 
One needs to do this on all oracle rac nodes, in this case on 3 nodes, since this a 3 node RAC.

[oracle@Node1a 11.2.0]$  cp -rP dbhome_1/ dbhome_2/
cp: cannot open `dbhome_1/bin/nmo' for reading: Permission denied
cp: cannot open `dbhome_1/bin/nmb' for reading: Permission denied
cp: cannot open `dbhome_1/bin/nmhs' for reading: Permission denied
[oracle@Node1a 11.2.0]$ pwd
/u01/app/oracle/product/11.2.0

-- Unset all the ENV variables to make sure you dont point to ORIGINAL ENV

[oracle@Node1a 11.2.0]$ unset ORACLE_HOME
[oracle@Node1a 11.2.0]$ unset PATH
[oracle@Node1a 11.2.0]$ unset ORACLE_SID

[oracle@Node1a 11.2.0]$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_2
[oracle@Node1a 11.2.0]$ cd $ORACLE_HOME
[oracle@Node1a ~]$ cd $ORACLE_HOME
[oracle@Node1a dbhome_2]$ cd clone/bin/

[oracle@Node1a bin]$ ./clone.pl ORACLE_HOME=$ORACLE_HOME ORACLE_HOME_NAME=OraDB_home2 ORACLE_BASE=/u01/app/oraInventory

./runInstaller -clone -waitForCompletion  "ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_2" "ORACLE_HOME_NAME=OraDB_home2" 

"ORACLE_BASE=/u01/app/oraInventory" -silent -noConfig -nowait
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 3820 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2014-03-31_08-01-10AM. Please wait ...Oracle Universal Installer, Version 

11.2.0.3.0 Production
Copyright (C) 1999, 2011, Oracle. All rights reserved.

You can find the log of this install session at:
 /u01/app/oraInventory/logs/cloneActions2014-03-31_08-01-10AM.log
.................................................................................................... 100% Done.

Installation in progress (Monday, March 31, 2014 7:01:19 AM CDT)
...............................................................................                                                 79% Done.
Install successful

Linking in progress (Monday, March 31, 2014 7:01:26 AM CDT)
Link successful

Setup in progress (Monday, March 31, 2014 7:02:12 AM CDT)
Setup successful

End of install phases.(Monday, March 31, 2014 7:02:35 AM CDT)
WARNING:
The following configuration scripts need to be executed as the "root" user.
/u01/app/oracle/product/11.2.0/dbhome_2/root.sh
To execute the configuration scripts:
    1. Open a terminal window
    2. Log in as "root"
    3. Run the scripts

The cloning of OraDB_home2 was successful.

Run Root script - 
[root@Node1a ~]# cd /u01/app/oracle/product/11.2.0/dbhome_2/
[root@Node1a dbhome_2]# ./root.sh
Check /u01/app/oracle/product/11.2.0/dbhome_2/install/root_Node1c.intra.searshc.com_2014-03-31_08-04-35.log for the output of root script

This clone script needs be executed on all nodes of the cluster

UpdateNodeList - 
Command - 
$ORACLE_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={Node1a,Node1b,Node1c}"

This script needs be executed on all nodes of the cluster

[oracle@Node1a bin]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/dbhome_2
[oracle@Node1a bin]$
[oracle@Node1a bin]$ ./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={Node1a,Node1b,Node1c}"
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB.   Actual 3819 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
'UpdateNodeList' was successful.

[oracle@Node1b bin]$ ./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={Node1a,Node1b,Node1c}"
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 3820 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
'UpdateNodeList' was successful.

[oracle@Node1c bin]$  ./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={Node1a,Node1b,Node1c}"
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 3820 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
'UpdateNodeList' was successful.

NOTE - THIS IS ONLY NEEDED IF YOU ARE PATCHING ON EXADATA. IF YOU DO THIS ON NON EXADATA SYSTEMS, THE START OF THE DB POST HOME SWITCH WILL FAIL.
----------------------------------------------------------------------------------------------------
As the software owner, relink the oracle binary to use RDS. The cloning step causes the oracle binary to be relinked with UDP instead of RDS:
[oracle@Node1a lib]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/dbhome_2
[oracle@Node1a lib]$ pwd
/u01/app/oracle/product/11.2.0/dbhome_2/rdbms/lib

make -f ins_rdbms.mk ipc_rds ioracle

[oracle@Node1a lib]$ make -f ins_rdbms.mk ipc_udp ioracle
rm -f /u01/app/oracle/product/11.2.0/dbhome_2/lib/libskgxp11.so
cp /u01/app/oracle/product/11.2.0/dbhome_2/lib//libskgxpr.so /u01/app/oracle/product/11.2.0/dbhome_2/lib/libskgxp11.so
chmod 755 /u01/app/oracle/product/11.2.0/dbhome_2/bin

 - Linking Oracle
rm -f /u01/app/oracle/product/11.2.0/dbhome_2/rdbms/lib/oracle
gcc  -o /u01/app/oracle/product/11.2.0/dbhome_2/rdbms/lib/oracle -m64 -L/u01/app/oracle/product/11.2.0/dbhome_2/rdbms/lib/ 

..............................
/u01/app/oracle/product/11.2.0/dbhome_2/lib/sysliblist` -ldl -lm   -L/u01/app/oracle/product/11.2.0/dbhome_2/lib
test ! -f /u01/app/oracle/product/11.2.0/dbhome_2/bin/oracle ||\
           mv -f /u01/app/oracle/product/11.2.0/dbhome_2/bin/oracle /u01/app/oracle/product/11.2.0/dbhome_2/bin/oracleO
mv /u01/app/oracle/product/11.2.0/dbhome_2/rdbms/lib/oracle /u01/app/oracle/product/11.2.0/dbhome_2/bin/oracle
chmod 6751 /u01/app/oracle/product/11.2.0/dbhome_2/bin/oracle


Post-clone verification:

Verify that all nodes have the same version of OPatch. If these versions do not match, you may need to update OPatch on the nodes running older versions so that all nodes have the same version.

Command - $ORACLE_HOME/OPatch/opatch version

[oracle@Node1a lib]$ $ORACLE_HOME/OPatch/opatch version
OPatch Version: 11.2.0.3.4
OPatch succeeded.

Command - $ORACLE_HOME/OPatch/opatch lsinventory -oh $ORACLE_HOME | grep node

[oracle@Node1a lib]$ $ORACLE_HOME/OPatch/opatch lsinventory -oh $ORACLE_HOME | grep node
Rac system comprising of multiple nodes
  Local node = Node1a
  Remote node = Node1b
  Remote node = Node1c



Now as you the new cloned home you need to apply the patch on this new home. In this case I am planning to apply Jan 2014 SPU. Download and copy the patch on target system.

PATCHING OUTPUT -

[oracle@Node1a 17478415]$ /u01/app/oracle/product/11.2.0/dbhome_2/OPatch/opatch napply -skip_subset -skip_duplicate
Oracle Interim Patch Installer version 11.2.0.3.4
Copyright (c) 2012, Oracle Corporation.  All rights reserved.

Oracle Home       : /u01/app/oracle/product/11.2.0/dbhome_2
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/11.2.0/dbhome_2/oraInst.loc
OPatch version    : 11.2.0.3.4
OUI version       : 11.2.0.3.0
Log file location : /u01/app/oracle/product/11.2.0/dbhome_2/cfgtoollogs/opatch/opatch2014-03-31_08-08-27AM_1.log

Verifying environment and performing prerequisite checks...

The following patches are identical and are skipped:
[ 13742433 13742434 13742435 13742436 13742438 14062795 14062797 14480675 14480676 15862016 15862017 15862018 15862019 15862020 15862021 15862022 

15862023 15862024 16314467 16794241 16794242 16794244  ]

Checking skip_duplicate
Checking skip_subset

Patching in rolling mode.

Remaining nodes to be patched:
'Node1b' 'Node1c'
What is the next node to be patched?
Node1b
You have selected 'Node1b' from 'Node1b' 'Node1c'

The node 'Node1b' will be patched next.
.......................
The node 'Node1c' will be patched next.
Please shutdown Oracle instances running out of this ORACLE_HOME on 'Node1c'.
(Oracle Home = '/u01/app/oracle/product/11.2.0/dbhome_2')

Is the node ready for patching? [y|n]
Y
User Responded with: Y
Updating nodes 'Node1c'
   Apply-related files are:

Please shutdown Oracle instances running out of this ORACLE_HOME on 'Node1b'.
(Oracle Home = '/u01/app/oracle/product/11.2.0/dbhome_2')

Is the node ready for patching? [y|n]
Y
User Responded with: Y
Updating nodes 'Node1b'
   Apply-related files are:
     FP = "/u01/app/oracle/product/11.2.0/dbhome_2/.patch_storage/NApply/2014-03-31_08-08-27AM/rac/copy_files.txt"

The node 'Node1c' has been patched.  You can restart Oracle instances on it.

Patches 17333197,17333198,17333199,17333202,17333203,17478415,17748830,17748831,17748832,17748833,17748834,17748835 successfully applied.
OPatch Session completed with warnings.
Log file location: /u01/app/oracle/product/11.2.0/dbhome_2/cfgtoollogs/opatch/opatch2014-03-31_08-08-27AM_1.log
OPatch completed with warnings.

As you can see now that we have patch in place on new home, we are ready to switch the DB to new home. I am currently doing this for only one database i.e. GPDB

Confirm the database is using the old ORACLE_HOME with the following command

[oracle@Node1b lib]$ srvctl config database -d gpdb -a
Database unique name: GPDB
Database name: GPDB
Oracle home: /u01/app/oracle/product/11.2.0/dbhome_1  <This is the old home>

Modify the database resource using the following command.

[oracle@Node1b ~]$ srvctl modify database -d GPDB -o /u01/app/oracle/product/11.2.0/dbhome_2
[oracle@Node1b ~]$ srvctl config database -d gpdb -a
Database unique name: GPDB
Database name: GPDB
Oracle home: /u01/app/oracle/product/11.2.0/dbhome_2   <This is the new home>

Edit the /etc/oratab files on all nodes to set new ORACLE_HOME for the database instance on each node

GPDB:/u01/app/oracle/product/11.2.0/dbhome_2:N

One instance at a time, stop the instance and start the instance. As the instance is started, it will pick up the new ORACLE_HOME defined in the 

database configuration and use that for restarting. 
[oracle@Node1b ~]$ srvctl stop database -d GPDB
[oracle@Node1b ~]$ srvctl start database -d GPDB

Now connect to DB and run following query - 

SQL> @?/rdbms/admin/catbundle.sql cpu apply; 

Once the query is done use the following query to check whether patch is applied or not.

SQL> select * from registry$history;
31-MAR-14 10.46.55.494052 AM                                                APPLY                          SERVER
11.2.0.3                                9
CPUJan2014
CPU