Hi These are the errors, a bit tricky to read! Right now this is a showstopper for us to go with 11gR2 RAC. *ASM alert when pmon was killed:* Mon Feb 01 11:25:02 2010 Shutting down instance (abort) License high water mark = 6 USER (ospid: 279022): terminating the instance Mon Feb 01 11:25:03 2010 ORA-1092 : opitsk aborting process Mon Feb 01 11:25:03 2010 ORA-1092 : opitsk aborting process Instance terminated by USER, pid = 279022 Mon Feb 01 11:25:04 2010 Instance shutdown complete Mon Feb 01 11:25:05 2010 .................... Starting ORACLE instance (normal) .................... SQL> ALTER DISKGROUP ALL ENABLE VOLUME ALL /* asm agent */ SUCCESS: ALTER DISKGROUP ALL ENABLE VOLUME ALL /* asm agent */ NOTE: diskgroup resource ora.OCR.dg is online NOTE: diskgroup resource ora.DATA01.dg is online Mon Feb 01 11:25:24 2010 Starting background process ASMB *Database alert after restarted by crsd* Mon Feb 01 11:25:03 2010 NOTE: ASMB terminating Errors in file /u01/app/oracle/diag/rdbms/cpmdga/cpmdga1/trace/cpmdga1_asmb_332148.trc: ORA-15064: communication failure with ASM instance ORA-03113: end-of-file on communication channel Process ID: Session ID: 48 Serial number: 107 ASMB (ospid: 332148): terminating the instance due to error 15064 Mon Feb 01 11:25:03 2010 System state dump is made for local instance System State dumped to trace file /u01/app/oracle/diag/rdbms/cpmdga/cpmdga1/trace/cpmdga1_diag_413936.trc Mon Feb 01 11:25:03 2010 ORA-1092 : opitsk aborting process Mon Feb 01 11:25:03 2010 License high water mark = 15 Trace dumping is performing id=[cdmp_20100201112503] Instance terminated by ASMB, pid = 332148 USER (ospid: 368740): terminating the instance Instance terminated by USER, pid = 368740 ...................... Mon Feb 01 11:25:25 2010 PMON started with pid=2, OS id=360536 ...................... Mon Feb 01 11:25:32 2010 ALTER DATABASE MOUNT NOTE: Loaded library: System SUCCESS: diskgroup DATA01 was mounted *ERROR: failed to establish dependency between database racdb and diskgroup resource ora.DATA01.dg* Successful mount of redo thread 1, with mount id 1487734179 Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE) Lost write protection disabled Completed: ALTER DATABASE MOUNT *crsd log* 2010-02-01 11:25:02.472: [UiServer][13629] Done for ctx=11a880230 2010-02-01 11:25:02.683: [UiServer][13886] S(11a87ff10): set Properties ( grid,11a87c9b0) 2010-02-01 11:25:02.693: [UiServer][13629] processMessage called 2010-02-01 11:25:02.693: [UiServer][13629] Sending message to PE. ctx= 11a8802b0 2010-02-01 11:25:02.694: [ CRSPE][13115] Processing PE command id=1090. Description: [Stat Resource : 11a8821b0] 2010-02-01 11:25:02.696: [ CRSPE][13115] PE Command [ Stat Resource : 11a8821b0 ] has completed 2010-02-01 11:25:02.697: [ CRSPE][13115] UI Command [Stat Resource : 11a8821b0] is replying to sender. 2010-02-01 11:25:02.702: [UiServer][13629] Done for ctx=11a8802b0 2010-02-01 11:25:02.705: [ AGFW][11830] Agfw Proxy Server received the message: RESOURCE_STATUS[Proxy] ID 20481:195684 2010-02-01 11:25:02.705: [ AGFW][11830] Received state change for ora.asm racsrv1 1 [old state = ONLINE, new state = OFFLINE] 2010-02-01 11:25:02.705: [ AGFW][11830] Received state LABEL change for ora.asm racsrv1 1 [old label = Started, new label = Abnormal Termination] 2010-02-01 11:25:02.705: [ AGFW][11830] Agfw Proxy Server sending message to PE, Contents = [MIDTo:2|OpID:3|FromA:{Invalid|Node:0|Process:0|Type:0}|ToA:{Invalid|Node:-1|Process:-1|Type:-1}|MIDFrom:0|Type:4|Pri2|Id:120619] 2010-02-01 11:25:02.706: [ AGFW][11830] Agfw Proxy Server replying to the message: RESOURCE_STATUS[Proxy] ID 20481:195684 2010-02-01 11:25:02.706: [ CRSPE][13115] State change received from racsrv1 for ora.asm racsrv1 1 ................ 2010-02-01 11:25:02.708: [ CRSPE][13115] Resource Resource Instance ID[ora.asm racsrv1 1]. Values: STATE=OFFLINE TARGET=ONLINE LAST_SERVER=racsrv1 CURRENT_RCOUNT=0 LAST_RESTART=0 FAILURE_COUNT=0 FAILURE_HISTORY= STATE_DETAILS=Abnormal Termination INCARNATION=0 STATE_CHANGE_VERS=0 LAST_FAULT=0 DEGREE_ID=1 ID=ora.asm racsrv1 1 Lock Info: Write Locks:none ReadLocks:|STATE INITED||ONLINE STATERECOVERED| has failed! 2010-02-01 11:25:02.708: [ CRSRPT][13372] Publishing event: Cluster Resource State Change Event for ora.asm:racsrv1 : 11a881fd0 2010-02-01 11:25:02.708: [ CRSRPT][13372] Publish to eons buffered event : 11a881fd0 2010-02-01 11:25:02.709: [ CRSPE][13115] Processing unplanned state change for [ora.asm racsrv1 1] ............... 2010-02-01 11:25:02.720: [ CRSPE][13115] CRS-2672: Attempting to start 'ora.asm' on 'bd1c ................. 2010-02-01 11:25:02.797: [ AGFW][11830] Received state change for ora.OCR.dg racsrv1 1 [old state = ONLINE, new state = OFFLINE] 2010-02-01 11:25:02.798: [ AGFW][11830] Agfw Proxy Server sending message to PE, Contents = [MIDTo:2|OpID:3|FromA:{Invalid|Node:0|Process:0|Type:0}|ToA:{Invalid|Node:-1|Process:-1|Type:-1}|MIDFrom:0|Type:4|Pri2|Id:120653] 2010-02-01 11:25:02.798: [ AGFW][11830] Agfw Proxy Server replying to the message: RESOURCE_STATUS[Proxy] ID 20481:195696 2010-02-01 11:25:02.798: [ AGFW][11830] Received the reply to the message: RESOURCE_PROBE[ora.OCR.dg racsrv1 1] ID 4097:120626 from the agent /u01/grid/11.2.0/bin/oraagent_grid 2010-02-01 11:25:02.798: [ CRSPE][13115] State change received from racsrv1 for ora.OCR.dg racsrv1 1 2010-02-01 11:25:02.799: [ AGFW][11830] ora.OCR.dg racsrv1 1 received state from probe request. Old state = OFFLINE, New state = OFFLINE 2010-02-01 11:25:02.799: [ CRSPE][13115] Processing PE command id=1093. Description: [Resource State Change (ora.OCR.dg racsrv1 1) : 11a87e2d0] 2010-02-01 11:25:02.799: [ AGFW][11830] Agfw Proxy Server sending the last reply to PE for message:RESOURCE_PROBE[ora.OCR.dg racsrv1 1] ID 4097:120623 2010-02-01 11:25:02.799: [ CRSPE][13115] RI [ora.OCR.dg racsrv1 1] new external state [OFFLINE] old value: [ONLINE] on racsrv1 label = [] 2010-02-01 11:25:02.800: [ CRSPE][13115] Resource Resource Instance ID[ora.OCR.dg racsrv1 1]. Values: ................ 2010-02-01 11:25:03.177: [ CRSPE][13115] RI [ora.racdb.db 1 1] new external state [UNKNOWN] old value: [ONLINE] on racsrv1 label = [Abnormal Termination] override:OFFLINE 2010-02-01 11:25:03.177: [ CRSPE][13115] Set State Details to [Abnormal Termination] from [ Open] for [ora.racdb.db 1 1] .................. 2010-02-01 11:25:03.335: [ CRSPE][13115] Got agent-specific msg: ORA-01092: ORACLE instance terminated. Disconnection forced Process ID: 0 Session ID: 0 Serial number: 0 2010-02-01 11:25:03.449: [ AGFW][11830] Received the reply to the message: RESOURCE_START[ora.OCR.dg racsrv1 1] ID 4098:120660 from the agent /u01/grid/11.2.0/bin/oraagent_grid 2010-02-01 11:25:03.451: [ AGFW][11830] Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.OCR.dg racsrv1 1] ID 4098:120659 2010-02-01 11:25:03.452: [ AGFW][11830] Received the reply to the message: RESOURCE_START[ora.OCR.dg racsrv1 1] ID 4098:120660 from the agent /u01/grid/11.2.0/bin/oraagent_grid 2010-02-01 11:25:03.452: [ CRSPE][13115] Received reply to action [Start] message ID: 120659 2010-02-01 11:25:03.452: [ CRSPE][13115] Got agent-specific msg: ORA-01092: ORACLE instance terminated. Disconnection forced .................... 2010-02-01 11:25:19.019: [ CRSPE][13115] ICE has queued an operation. Details: Operation [START of [ora.racdb.db 1 1] on [racsrv1] : 11a8e1f50] cannot run cause it needs W lock for: WO for Placement Path RI:[ora.racdb.db 1 1] server [racsrv1] target states [ONLINE INTERMEDIATE ], locked by op [START of [ora.racdb.db 1 1] on [racsrv1] : 11a893b90]. Owner: CRS-2683: It is locked by 'SYSTEM' for command 'Unplanned Resource State Change : ora.racdb.db' .................... 2010-02-01 11:25:32.061: [UiServer][13629] Container [ Name: ORDER MESSAGE: TextMessage[CRS-2545: Cannot operate on 'ora.racdb.db with CARDINALITY_ID=1'. It is locked by 'SYSTEM' for command 'Unplanned Resource State Change : ora.racdb.db'] MSGTYPE: TextMessage[1] OBJID: TextMessage[ora.racdb.db 1 1] WAIT: TextMessage[0] ] 2010/2/24 Martin Bach <development@xxxxxxxxxxxxxxxxx> > Hi Alex, > > On 02/24/2010 08:09 AM, amonte wrote: > > Hi > > > [...] > > > > We have been doing extensive destructive tests to check for these issues. > > > > Would you mind sharing these with the rest of the list? I'd also be > interested to see alert.log contents more detail from the ohasd logs :) > > Thanks and regards, > > Martin > > -- > Martin Bach > OCM 10g > http://martincarstenbach.wordpress.com > http://www.linkedin.com/in/martincarstenbach >