Re: Doubt about timeout between nodes of cluster

  • From: "Waldirio Manhães Pinheiro" <waldirio@xxxxxxxxx>
  • To: "Riyaj Shamsudeen" <riyaj.shamsudeen@xxxxxxxxx>, oracle-l-freelists <oracle-l@xxxxxxxxxxxxx>
  • Date: Thu, 12 Jun 2008 14:42:12 -0300

Hello friend,

Thank you for the answer. Let's check.

2008/6/12, Riyaj Shamsudeen <riyaj.shamsudeen@xxxxxxxxx>:
> Hello Waldirio
>   >> the time for the first machine to detect that the second machine has
> powered off is very long (between 1 and 2 min),
>  How are you measuring this time? Are you checking alert log or are you
> using DB connections to check it?

I measured the time from when I sent the shutdown to the server until the
second VIP interface came up on the second node (the backup node).
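
A measurement like this can also be scripted instead of timed by hand. The following is a minimal Python sketch, not something taken from this thread: the function names are hypothetical, and it assumes that a refused TCP connection means some node currently owns the address, while a timeout means nobody does. Note that it starts counting at the first failed probe, not at the moment the shutdown command is sent.

```python
import socket
import time


def address_answers(host, port, timeout=1.0):
    """Return True if host:port accepts or actively refuses a TCP connection.

    Assumption: a refusal means some node owns the address (for example a
    relocated VIP with no listener on it); a timeout or any other network
    error means no node is answering for it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except ConnectionRefusedError:
        return True
    except OSError:
        return False


def measure_outage(probe, poll_interval=0.5,
                   clock=time.monotonic, sleep=time.sleep):
    """Wait until probe() fails, then until it succeeds again.

    Returns the number of seconds between the first failed probe and the
    first successful probe after it.
    """
    # Phase 1: keep polling while the address still answers.
    while probe():
        sleep(poll_interval)
    went_down = clock()
    # Phase 2: keep polling until the address answers again.
    while not probe():
        sleep(poll_interval)
    return clock() - went_down
```

It could be started on a client machine just before powering off the node, for example with `measure_outage(lambda: address_answers("cangua-vip", 1521))`, where `cangua-vip` is a made-up VIP hostname and 1521 is only the usual default listener port. The result is accurate to roughly the poll interval plus the probe timeout.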

>  Can you also send crsd.log?

OK, the address follows, because of the file size ...

When I powered off the first node, the crsd log on the second node (at the
link above) shows on line 1 the message "[ COMMCRS][1147169120]clsc_receive:
(0xc6d180) Error receiving, ns (12535, 12560), transport (505, 110, 0)", and
it keeps reporting "Connection not active" until line 2045.

PS: Now the VIP address of the first node no longer migrates to the second
node after a power off ... (I may need to reinstall the OS and Oracle
Clusterware, because I have changed the system a lot while testing.)

>  Further, refer to $CRS_HOME/bin/racgvip; there are a few parameters, such as
> check interval, restart attempts, etc., controlling the behavior of VIP failover
> too. Not sure they are applicable when the machine is rebooted, since the
> heartbeat will fail before the VIP check.

Yes, I checked this file too, but did not change it.

Now, looking at the crsd log file, I believe Oracle knows when another node
is down, but who is responsible for performing the failover (bringing up the
VIP aliases on the other machine)? (A script, a daemon, an angel :P )

Thank you, friends, for the help.

> Riyaj Shamsudeen
> The Pythian Group
> Personal blog:
> Waldirio Manhães Pinheiro wrote:
>>   Hello Friends
>>    I'd like to ask about Oracle RAC in a Linux environment. I installed two
>> machines with RedHat AS 4Up5 and Oracle <> with
>> ClusterWare. The installation finished successfully and the database works
>> fine.
>>    I checked the availability of my environment with the test below:
>>  Station cambeba UP
>> Station cangua UP
>>  # crs_stat -t
>>  Name           Type           Target    State     Host
>> ------------------------------------------------------------
>> ora....BA.lsnr application    ONLINE    ONLINE    cambeba
>> ora....eba.gsd application    ONLINE    ONLINE    cambeba
>> ora....eba.ons application    ONLINE    ONLINE    cambeba
>> application    ONLINE    ONLINE    cambeba
>> ora....UA.lsnr application    ONLINE    ONLINE    cangua
>> ora.cangua.gsd application    ONLINE    ONLINE    cangua
>> ora.cangua.ons application    ONLINE    ONLINE    cangua
>> application    ONLINE    ONLINE    cangua
>> ora.ora10gq.db application    ONLINE    ONLINE    cangua
>> ora....q1.inst application    ONLINE    ONLINE    cangua
>> ora....q2.inst application    ONLINE    ONLINE    cambeba
>>  At this point everything is OK, but when I force a power off on cangua or
>> cambeba (the names of my machines), the time for the first machine to detect
>> that the second machine has powered off is very long (between 1 and 2 min),
>> so a client that was working at the time will lose its query to a timeout.
>>  I changed the configuration in objects and
>>, but without success.
>>  Any idea how to fix this problem (decrease the check time between nodes
>> in the cluster)?
>>  PS: I searched the list archive, but found nothing about this problem.
>>  Thanks in advance.
>> --
>> ______________
>> Regards
>> Waldirio
>> msn: wmp@xxxxxxxxxxxxx <mailto:wmp@xxxxxxxxxxxxx>

msn: wmp@xxxxxxxxxxxxx
