Interconnect switch failure - was: RAC on OCFS2 acceptance testing

  • From: Boris Dali <boris_dali@xxxxxxxx>
  • To: kevinc@xxxxxxxxxxxxx, oracle-l@xxxxxxxxxxxxx
  • Date: Mon, 15 Jan 2007 09:20:16 -0500 (EST)

One of the surprising (at least to me) results of
these tests was that while a failure of the
interconnect NIC (not bonded) on either node of a
2-node RAC doesn't create much trouble, the switch
failure triggered an FSFO (fast-start failover) to
the target standby.

Just like with a single interconnect NIC failure (on
either node), the same voting node goes down with
similar messages...


ospid 13081: network interface with IP address
<ip_address> no longer running

Message from syslogd@<node_name>
<node_name> kernel: Kernel panic: ocfs2 is very sorry
to be fencing this system by panicing


... yet in contrast to a single NIC failure, RAC (or
is it OCFS2?) doesn't recover and fails over to the
standby instead of to the master node.

Is that to be expected for a 2-node RAC, or is it a
problem with our setup (e.g. the FSFO threshold)?

Thanks,
Boris Dali.


--- Boris Dali <boris_dali@xxxxxxxx> wrote:

> Thanks, Kevin.
> 
> The issue discussed in this SuSE thread is
> documented on Metalink in the note# 394408.1
...



