Re: Oracle RAC and VIPs
- From: "Alessandro Vercelli" <alever@xxxxxxxxx>
- To: "dannorris" <dannorris@xxxxxxxxxxxxx>
- Date: Tue, 15 Jul 2008 11:29:45 +0200
The exact time of the crash is not clear; it happened on the morning of May 9th,
and it was a database crash, not a system crash. crsd.log reported many messages
like:
2008-05-09 12:32:33.833: [ CRSEVT][3695033264]0CAAMonitorHandler :: 0:Action
Script /u01/app/oracle/product/crs/bin/racgwrap(check) timed out for
ora.<failednode>.ons! (timeout=600)
Each message referred to a different resource.
Last week, I tried to restart the failed node (in the meantime, other people
made other attempts) and crsd.log reported, among other messages, the following:
2008-07-07 16:10:18.743: [ CRSRES][3781585840]0CRS-1028: Dependency analysis
failed because of:
'Resource in UNKNOWN state: ora.<failednode>.vip'
According to crs_stat -t, the ora.<failednode>.vip resource was allocated on the
partner node, not the failed one, and its state was UNKNOWN (as expected).
My opinion is that, at the time of the crash, the partner node attempted an
automatic failover, which failed. From the crsd.log of the partner node:
2008-05-09 11:55:55.278: [ CRSRES][3686595504]0Attempting to start
`ora.<failednode>.vip` on member `<partnernode>`
2008-05-09 11:56:58.305: [ CRSAPP][3686595504]0StartResource error for
ora.<failednode>.vip error code = -2
2008-05-09 11:57:05.429: [ CRSEVT][3697085360]0CAAMonitorHandler :: 0:Action
Script /u01/app/oracle/product/crs/bin/racgwrap(check) timed out for
ora.<failednode>.vip! (timeout=60)
and, finally:
2008-05-09 11:58:01.422: [ CRSRES][3686595504]0X_OP_StopResourceFailed : Stop
Resource failed
(File: rti.cpp, line: 1698
2008-05-09 11:58:01.422: [ CRSRES][3686595504][ALERT]0`ora.<failednode>.vip`
on member `<partnernode>` has experienced an unrecoverable failure.
2008-05-09 11:58:01.422: [ CRSRES][3686595504]0Human intervention required to
resume its availability.
2008-05-09 11:58:01.444: [ CRSRES][3686595504]0CRS-1028: Dependency analysis
failed because of:
'Resource in UNKNOWN state: ora.<failednode>.vip'
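To make excerpts like these easier to read, here is a minimal Python sketch that pulls out only the entries mentioning one resource. It assumes every entry starts with a "timestamp: [ COMPONENT][thread]" header, as in the lines above, and that the following lines without a timestamp are continuations of that entry. The names parse_entries and timeline are made up for illustration, and the "0" that follows the thread id is left in the message because I do not know what it means.

```python
import re
from typing import List, NamedTuple

# Assumed entry header, modelled on the excerpts in this post, e.g.
#   2008-05-09 11:55:55.278: [ CRSRES][3686595504]0Attempting to start
ENTRY_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*(\w+)\]\[(\d+)\](.*)$"
)


class Entry(NamedTuple):
    timestamp: str
    component: str
    thread: str
    message: str


def parse_entries(text: str) -> List[Entry]:
    """Split log text into entries, joining wrapped continuation lines."""
    entries: List[Entry] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_RE.match(line)
        if match:
            # A new timestamped entry starts here.
            entries.append(
                Entry(match.group(1), match.group(2),
                      match.group(3), match.group(4).strip())
            )
        elif entries:
            # No timestamp: treat the line as part of the previous entry.
            last = entries[-1]
            entries[-1] = last._replace(message=last.message + " " + line)
    return entries


def timeline(text: str, resource: str) -> List[Entry]:
    """Return, in file order, the entries whose message names the resource."""
    return [e for e in parse_entries(text) if resource in e.message]
```

Calling timeline() on the contents of a node's crsd.log with the VIP resource name should list the start attempt, the StartResource error and the timeouts in order. Entries that do not name the resource themselves (such as the "Human intervention required" line) are skipped, so the full log is still needed for context.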
Sorry for the *mess* of messages.....
Thanks,
Alessandro
>If you think it's related to the resource not starting because of some
>dependency, then I'd suggest looking at
>$CRS_HOME/log/<nodename>/crsd/crsd.log on each node (especially the
>crashed node) and see what's there around the time of startup.
>
>If the node won't boot, try booting it into single user mode and
>disabling clusterware from starting if you think clusterware is what's
>not allowing it to boot completely.
>
>Dan
>
>Alessandro Vercelli wrote:
>> O.S.: RHEL AS4
>> Hardware is HP BL45P, 4 x AMD Dual core, 8 GB RAM.
>> Oracle 10.2.0.1, RAC and Clusterware
>>
>> Anyway, the issue became "crabbed", since the last attempt to start the
>> failing node succeeded, so I've one more task now...:)).
>>
>> The failed attempts reported on the console that the listener nodeapp could
>> not start; looking into the network configuration, I noticed that the VIP
>> address for the failing listener was not allocated on that node but on its
>> partner. Which log files do you suggest checking for errors?
>>
>> Thanks,
>>
>> Alessandro
>>
>
--
http://www.freelists.org/webpage/oracle-l