Jeff,

Are the pieces you are failing redundant in nature? For example, multiple
HBAs, switches, etc.? We had some issues in our failover testing that had
to do with Service Processor failover, and it turned out to be a Linux
kernel issue with the NMI watchdog processes (again, this was on Linux).
Without redundancy in the components you mentioned, I would expect CRS to
reboot the node. What are you using for OCR and Voting Disk?

--
Bradd Piontek
Twitter: http://www.twitter.com/piontekdd
Oracle Blog: http://piontekdd.blogspot.com
Linked In: http://www.linkedin.com/in/piontekdd
Last.fm: http://www.last.fm/user/piontekdd/

On Fri, May 30, 2008 at 10:21 AM, Jeffery Thomas <jeffthomas24@xxxxxxxxx> wrote: