11.2.0.4 RAC ora.ons and ora.oc4j resources down

  • From: "Stephens, Chris" <Chris.Stephens@xxxxxxx>
  • To: "'oracle-l@xxxxxxxxxxxxx'" <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 1 Nov 2013 09:41:52 -0500

(second attempt to send...first didn't appear to get to the list)

Oracle Linux 6
11.2.0.4 RAC 2 node (development) and 3 node (production)

I'm very new to RAC so pardon my leaving out relevant details.

2 days ago I received several alerts from EM12c indicating a state change for 
the ora.ons and ora.oc4j resources in both the production and development 
environments.  Both resources in both environments ran into problems at 
basically the same time (11:00:04 - 11:01:02).

This hasn't affected the availability of either database.  On initial 
investigation, I noticed many log messages indicating problems with NTP, so we 
switched to CTSS.  I have no reason to believe that was related, but it is 
something we've changed since the problem occurred.  Because this happened in 
both environments at the same time, I'm thinking something external to Oracle, 
or at least to RAC, caused the problems, but I have no idea where to start 
looking.  I've been through all sorts of log files, including the ones 
mentioned below, and nothing jumps out at me as relevant.
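
One way to narrow the search might be to pull only the lines from each log 
that fall inside the 11:00:04 - 11:01:02 window and compare them across nodes 
and environments.  Below is a minimal Python sketch of that idea.  It assumes 
each log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp, which may not hold 
for every log; the function name and the sample lines are made up for 
illustration and are not taken from the real logs.

```python
import re
from datetime import time

# Assumed line prefix: "YYYY-MM-DD HH:MM:SS" (fractional seconds may follow).
# Adjust this pattern for logs that use a different timestamp format.
TS = re.compile(r"^\d{4}-\d{2}-\d{2} (\d{2}):(\d{2}):(\d{2})")


def lines_in_window(lines, start, end):
    """Return lines whose leading timestamp has a time of day in [start, end]."""
    hits = []
    for line in lines:
        m = TS.match(line)
        if m is None:
            continue  # no leading timestamp (continuation line or other format)
        t = time(int(m.group(1)), int(m.group(2)), int(m.group(3)))
        if start <= t <= end:
            hits.append(line.rstrip("\n"))
    return hits


# Made-up sample lines, for illustration only.
sample = [
    "2000-01-01 10:59:30.000: hypothetical message before the window",
    "2000-01-01 11:00:10.000: hypothetical message inside the window",
    "no timestamp on this line",
    "2000-01-01 11:02:00.000: hypothetical message after the window",
]
window = lines_in_window(sample, time(11, 0, 4), time(11, 1, 2))
print(window)
```

With a real file, the same function can be fed an open file object instead of 
the sample list.  Since it only compares time of day, it would need a date 
check as well for logs that span more than one day.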

Can anyone get me started down a productive troubleshooting path?  Or, even 
better, does anyone already know what my problem is?

"crsctl stat res -t" shows (dev):


ora.ons
               ONLINE  UNKNOWN      admoract1n1              CHECK TIMED OUT
               ONLINE  UNKNOWN      admoract1n2              CHECK TIMED OUT

ora.oc4j
      1        ONLINE  OFFLINE

"crsctl start resource -all" results in:

crsctl start resource -all
CRS-5702: Resource 'ora.DATA.dg' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.FRA.dg' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.LISTENER.lsnr' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.LISTENER_SCAN1.lsnr' is already running on 'admoract1n2'
CRS-5702: Resource 'ora.LISTENER_SCAN2.lsnr' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.LISTENER_SCAN3.lsnr' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.asm' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.LISTENER.lsnr' is already running on 'admoract1n1'
CRS-2501: Resource 'ora.gsd' is disabled
CRS-5702: Resource 'ora.admoract1n1.vip' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.asm' is already running on 'admoract1n2'
CRS-5702: Resource 'ora.LISTENER.lsnr' is already running on 'admoract1n2'
CRS-2501: Resource 'ora.gsd' is disabled
CRS-5702: Resource 'ora.admoract1n2.vip' is already running on 'admoract1n2'
CRS-5702: Resource 'ora.asm' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.cvu' is already running on 'admoract1n1'
CRS-2501: Resource 'ora.gsd' is disabled
CRS-5702: Resource 'ora.net1.network' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.oract1db.adm_dba.svc' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.oract1db.db' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.oract1db.grcl_712_dev.svc' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.oract1db.grcl_712_grmt1.svc' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.oract1db.grcl_712_test.svc' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.registry.acfs' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.scan1.vip' is already running on 'admoract1n2'
CRS-5702: Resource 'ora.scan2.vip' is already running on 'admoract1n1'
CRS-5702: Resource 'ora.scan3.vip' is already running on 'admoract1n1'
CRS-2679: Attempting to clean 'ora.ons' on 'admoract1n2'
CRS-2679: Attempting to clean 'ora.ons' on 'admoract1n1'
CRS-2672: Attempting to start 'ora.oc4j' on 'admoract1n2'
CRS-5014: Agent "/u01/app/11.2.0.4/grid/bin/oraagent.bin" timed out starting process "/u01/app/11.2.0.4/grid/opmn/bin/onsctli" for action "clean": details at "(:CLSN00009:)" in "/u01/app/11.2.0.4/grid/log/admoract1n2/agent/crsd/oraagent_grid/oraagent_grid.log"
CRS-5017: The resource action "ora.ons clean" encountered the following error:
(:CLSN00009:)Utils:execCmd aborted. For details refer to "(:CLSN00106:)" in "/u01/app/11.2.0.4/grid/log/admoract1n2/agent/crsd/oraagent_grid/oraagent_grid.log".
CRS-5014: Agent "/u01/app/11.2.0.4/grid/bin/oraagent.bin" timed out starting process "/u01/app/11.2.0.4/grid/opmn/bin/onsctli" for action "clean": details at "(:CLSN00009:)" in "/u01/app/11.2.0.4/grid/log/admoract1n1/agent/crsd/oraagent_grid/oraagent_grid.log"
CRS-5017: The resource action "ora.ons clean" encountered the following error:
(:CLSN00009:)Utils:execCmd aborted. For details refer to "(:CLSN00106:)" in "/u01/app/11.2.0.4/grid/log/admoract1n1/agent/crsd/oraagent_grid/oraagent_grid.log".
CRS-5014: Agent "/u01/app/11.2.0.4/grid/bin/oraagent.bin" timed out starting process "/u01/app/11.2.0.4/grid/opmn/bin/onsctli" for action "check": details at "(:CLSN00009:)" in "/u01/app/11.2.0.4/grid/log/admoract1n1/agent/crsd/oraagent_grid/oraagent_grid.log"
CRS-5017: The resource action "ora.ons check" encountered the following error:
(:CLSN00009:)Utils:execCmd aborted. For details refer to "(:CLSN00109:)" in "/u01/app/11.2.0.4/grid/log/admoract1n1/agent/crsd/oraagent_grid/oraagent_grid.log".
CRS-5014: Agent "/u01/app/11.2.0.4/grid/bin/oraagent.bin" timed out starting process "/u01/app/11.2.0.4/grid/opmn/bin/onsctli" for action "check": details at "(:CLSN00009:)" in "/u01/app/11.2.0.4/grid/log/admoract1n2/agent/crsd/oraagent_grid/oraagent_grid.log"
CRS-5017: The resource action "ora.ons check" encountered the following error:
(:CLSN00009:)Utils:execCmd aborted. For details refer to "(:CLSN00109:)" in "/u01/app/11.2.0.4/grid/log/admoract1n2/agent/crsd/oraagent_grid/oraagent_grid.log".
CRS-2680: Clean of 'ora.ons' on 'admoract1n1' failed
CRS-2503: Resource 'ora.ons' is in UNKNOWN state and must be stopped first
CRS-2680: Clean of 'ora.ons' on 'admoract1n2' failed
CRS-2503: Resource 'ora.ons' is in UNKNOWN state and must be stopped first
CRS-2674: Start of 'ora.oc4j' on 'admoract1n2' failed
CRS-2679: Attempting to clean 'ora.oc4j' on 'admoract1n2'
CRS-2681: Clean of 'ora.oc4j' on 'admoract1n2' succeeded
CRS-2563: Attempt to start resource 'ora.oc4j' on 'admoract1n2' has failed. Will re-retry on 'admoract1n1' now.
CRS-2672: Attempting to start 'ora.oc4j' on 'admoract1n1'
CRS-2674: Start of 'ora.oc4j' on 'admoract1n1' failed
CRS-2679: Attempting to clean 'ora.oc4j' on 'admoract1n1'
CRS-2681: Clean of 'ora.oc4j' on 'admoract1n1' succeeded
CRS-2632: There are no more servers to try to place resource 'ora.oc4j' on that would satisfy its placement policy
CRS-4000: Command Start failed, or completed with errors.
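
Most of the output above is CRS-5702 "already running" noise.  To see at a 
glance which resources reported anything else, the output could be grouped by 
resource name.  Below is a minimal Python sketch, assuming the output has been 
saved as text and that resource names appear in single quotes with an "ora." 
prefix, as they do above.  The sample lines are copied from the output above; 
the function and variable names are hypothetical.

```python
import re
from collections import defaultdict

CRS_LINE = re.compile(r"^(CRS-\d+): (.*)$")
RESOURCE = re.compile(r"'(ora\.[^']+)'")  # first single-quoted ora.* name
ALREADY_RUNNING = "CRS-5702"


def summarize(output):
    """Map each resource to the sorted CRS codes reported for it, skipping CRS-5702."""
    by_resource = defaultdict(set)
    for line in output.splitlines():
        m = CRS_LINE.match(line.strip())
        if m is None:
            continue  # wrapped continuation line or non-CRS text
        code, text = m.groups()
        if code == ALREADY_RUNNING:
            continue  # resource is fine, not interesting here
        r = RESOURCE.search(text)
        if r:
            by_resource[r.group(1)].add(code)
    return {res: sorted(codes) for res, codes in by_resource.items()}


sample = """\
CRS-5702: Resource 'ora.asm' is already running on 'admoract1n1'
CRS-2501: Resource 'ora.gsd' is disabled
CRS-2679: Attempting to clean 'ora.ons' on 'admoract1n1'
CRS-2680: Clean of 'ora.ons' on 'admoract1n1' failed
CRS-2503: Resource 'ora.ons' is in UNKNOWN state and must be stopped first
CRS-2674: Start of 'ora.oc4j' on 'admoract1n1' failed
"""
summary = summarize(sample)
for resource, codes in summary.items():
    print(resource, codes)
```

Messages that name the resource in double quotes, such as CRS-5014 and 
CRS-5017, are not matched by this pattern and would need their own rule.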


