Monday, May 20, 2013


ORA-00020 on DB Instance / HUNDREDS OF ORAAGENT.BIN@HOSTNAME SESSSIONS IN 11.2.0.2 DATABASE


The issue happen on one of our production env running 8 node cluster. The DB having issue was running on two nodes of the cluster. One one node, say node A, it was running fine. However, on another node B, it was not allowing connection to DB. Later on we figured out the process parameter seems exhausted and we increased the "PROCESSES" parameter from 100 to 500. 

Time went by and again after few weeks we started seeing the same problem. Some thing is not right as ASM instance doesn't eat that many processes. Oracle Best Practices for the ASM instance suggests to keep 100 Processes for medium to high load DB. 

Checking went on, one surprising issue that I saw during the event is that there were lots of ORAAGENT processes spawned. Total I saw was around 700+.  Not sure why. So we decided to clear everything and start the instance afresh. We shut down the instance and on the startup surprise was waiting for us. ASM Instance was able to start and to our surprise the process from oraagent.bin were still in place. 

SQL> startup
ORA-03113: end-of-file on communication channel

Once it happens, other instance may fail to restart as no new connection can be made to the ORA-00020 instance

To double check my suspicion I checked the log files and trace files. And what I found is following. 
[grid@xxxx cdmp_20130405075132]$ cat /u01/app/grid/diag/asm/+asm/+ASM7/trace/+ASM7_diag_30046.trc | grep oraagent.bin@tryprorarac2g.intra.searshc.com | wc -l
734


It perfectly fits the figure of 700 odd connections that I saw. Eventually ORA-00020 will happen to DB instance as oraagent.bin keeps making new connections without closing them. Not sure why they were abundance, I checked MOS and I came across following two bug notes and they fits the bill. I also checked that we have not applied PSU3 , in which they say, they have fixed it.

Bug 11877079 : HUNDREDS OF ORAAGENT.BIN@HOSTNAME SESSSIONS IN 11.2.0.2 DATABASE
Bug 10299006 : AFTER 11.2.0.2 UPGRADE, ORAAGENT.BIN CONNECTS TO DATABASE WITH TOO MANY SESSIONS

So, we ultimately decided to reboot the box to kill all the ORAAGENT.BIN processes. Post refresh ASM came up clean and DB started successfully.

Hope this will help you to trouble shoot your ASM bug. 

On a side note, this is also fixed in 11.2.0.3 and we recently applied 11.2.0.3 Grid Upgrade to fix this. DBs are still running with 11.2.0.2 Version. 

No comments:

Post a Comment