Friday, September 21, 2012

Apps/EBS R12: How to Diagnose Start-up Problems for Apache

When the script is used to start Apache in Release 12, it may fail with an error like:
12/07/12-15:32:56 :: starting OPMN managed OHS instance
opmnctl: starting opmn managed processes...
   0 of 1 processes started.
--> Process (index=1,uid=1561342258,pid=14927)
   failed to start a managed process after the maximum retry limit
12/07/11-15:33:00 :: exiting with status 204


The referenced log file, HTTP_Server~1.log, does not clearly report a root cause for the failure, so further analysis is required to identify what prevents Apache from starting.
The script is essentially a wrapper that calls the native Apache start command. To identify what prevents Apache from starting, run that command directly. This requires a few additional steps.

1. Source the <SID>_<host>.env file located in the $INST_TOP/ora/10.1.3 directory:
# . $INST_TOP/ora/10.1.3/<SID>_<host>.env

This sets $ORACLE_HOME to the AS10G 10.1.3 home (instead of the AS10G 10.1.2 home), so that the relevant settings are picked up from the correct AS10G home.
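
As a quick sanity check after sourcing the file, the sketch below tests whether $ORACLE_HOME looks like the 10.1.3 home. It assumes that the path of that home contains 10.1.3 as a directory component, which is conventional but not guaranteed. The function name is_1013_home is made up for illustration.

```shell
# Hypothetical helper: succeeds when the given path looks like a 10.1.3 home.
# Assumption: the AS10G 10.1.3 home has "10.1.3" as one of its path components.
is_1013_home() {
    case "$1" in
        */10.1.3|*/10.1.3/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the current environment (run this after sourcing the env file).
if is_1013_home "${ORACLE_HOME:-}"; then
    echo "ORACLE_HOME points at a 10.1.3 home: $ORACLE_HOME"
else
    echo "ORACLE_HOME does not look like a 10.1.3 home: ${ORACLE_HOME:-<unset>}"
fi
```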

2. Run the following command:
#  $INST_TOP/ora/10.1.3/Apache/Apache/bin/apachectl configtest -f $INST_TOP/ora/10.1.3/Apache/Apache/conf/httpd.conf

This validates the httpd.conf configuration file used by Apache. If this step raises errors, httpd.conf is likely corrupted or misconfigured, which prevents Apache from starting. Resolve any problems reported (e.g. by running AutoConfig to regenerate the configuration) and retest. If the command responds with OK, proceed with the next step.
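
If the check is to be repeated after each fix, it can be wrapped in a small function. The following is a minimal sketch, not part of the original procedure: the name check_httpd_conf is hypothetical, and it assumes that apachectl returns exit status 0 when the syntax check passes and a non-zero status otherwise.

```shell
# Hypothetical wrapper around the configtest step.
# Assumption: apachectl exits with status 0 when the syntax check passes
# and with a non-zero status otherwise.
check_httpd_conf() {
    apachectl_bin=$1   # path to apachectl
    conf_file=$2       # path to httpd.conf
    if [ ! -r "$conf_file" ]; then
        echo "cannot read $conf_file" >&2
        return 2
    fi
    if "$apachectl_bin" configtest -f "$conf_file"; then
        echo "configtest passed"
    else
        status=$?      # exit status of the apachectl call above
        echo "configtest failed with status $status" >&2
        return "$status"
    fi
}

# Usage:
#   check_httpd_conf "$INST_TOP/ora/10.1.3/Apache/Apache/bin/apachectl" \
#                    "$INST_TOP/ora/10.1.3/Apache/Apache/conf/httpd.conf"
```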

3. Run the following command:
#  $INST_TOP/ora/10.1.3/Apache/Apache/bin/apachectl startssl -f $INST_TOP/ora/10.1.3/Apache/Apache/conf/httpd.conf

This starts the Apache server directly instead of through OPMN. This can expose errors that are not easily observed when Apache is started as an OPMN service, and so can help in finding out why Apache cannot be started.
After this command completes, a number of httpd processes should be visible when running:
# ps -ef | grep httpd
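
Note that the grep command itself can show up in that listing. A small sketch that counts only the real httpd lines is shown here. The helper name count_httpd is hypothetical, and it simply skips any line that mentions grep.

```shell
# Hypothetical helper: reads "ps -ef" output on stdin and prints the number
# of lines that mention httpd, ignoring any grep command lines.
count_httpd() {
    awk '/httpd/ && !/grep/ { n++ } END { print n + 0 }'
}

# Usage:
#   ps -ef | count_httpd
```

A result of 0 after the start command suggests that Apache exited again shortly after being launched.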

If this still does not show any obvious errors, the next step is to run the same command under strace/truss/tusc to see which OS calls are executed.
The example below uses the strace command available on Linux. Check the OS documentation for the exact parameters to use with the equivalent utility on your platform.

4. Run the following command:
# strace -o startapache.trc -ff -t  $INST_TOP/ora/10.1.3/Apache/Apache/bin/apachectl startssl -f $INST_TOP/ora/10.1.3/Apache/Apache/conf/httpd.conf &

This command saves the output in startapache.trc, and on Linux the -ff option causes each child process started to be logged in a separate file with the <PID> appended to the file name.
Review the trace files for reported errors. If useful, collect the same traces from a similar instance that does not have the problem so that the trace files can be compared. The OS calls logged in the trace files may expose problems in areas like:
Opening files required for Apache to run (missing, privileges)
Creating or updating (log/pid) files (privileges, size of log file hitting 2GB limit)
Memory issues
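
To narrow down a large set of trace files, the failed calls can be filtered out first. The sketch below assumes the usual strace output format, in which a failed call ends in "= -1" followed by an errno name, and it relies on the -H, -h and -o options of GNU grep. The function names failed_syscalls and errno_summary are hypothetical.

```shell
# Hypothetical helper: lists failed system calls in the given strace files,
# prefixed with the name of the file they were found in.
# Assumption: failed calls appear as "... = -1 ERRNO (description)".
failed_syscalls() {
    grep -H -E '= -1 E[A-Z0-9]+' "$@"
}

# Hypothetical helper: counts failures per errno name, most frequent first.
errno_summary() {
    grep -h -o -E '= -1 E[A-Z0-9]+' "$@" | awk '{ print $3 }' | sort | uniq -c | sort -rn
}

# Usage:
#   failed_syscalls startapache.trc*
#   errno_summary startapache.trc*
```

Many ENOENT results are harmless, for example while a library search path is probed, so results such as EACCES or ENOMEM, and failures on files under the Apache directories, are usually the more interesting ones.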

5. After the root cause has been identified and the issue resolved, so that the direct start works, run the following command to stop the Apache service:
#  $INST_TOP/ora/10.1.3/Apache/Apache/bin/apachectl stop -f $INST_TOP/ora/10.1.3/Apache/Apache/conf/httpd.conf

Then use the script to confirm that Apache now also starts in the recommended way:
# $INST_TOP/admin/scripts/ start