RE: All RAC Nodes reboot without log-entry
- From: "Oliver Jost" <Oliver.Jost@xxxxxxxxxxxxxx>
- To: <oracle-l@xxxxxxxxxxxxx>
- Date: Thu, 17 Jan 2008 15:48:07 -0500
Just a quick check... I haven't read this entire thread, so for what it's
worth: I've seen this once before. On one node, CRS was trying to come up.
Because the interface registered with CRS did not have a match on the host
(a new interface had come up under a new name), it failed, and no logging was
evident. We replaced the new interface name in the hosts file and it came back.
Try an ocrdump; it may shed some light.
Thanks,
Oliver
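
A rough sketch of the check Oliver describes: compare the interface names registered with the clusterware against the names the host currently has. It assumes the registered list is pasted in from `oifcfg getif` (four whitespace-separated columns: name, subnet, scope, role) and that the host's interface names were collected by hand. The interface names, subnets, and helper function names below are made up for illustration.

```python
def parse_oifcfg_getif(text):
    """Return {interface_name: role} from `oifcfg getif`-style text.

    Assumes each line has four whitespace-separated fields:
    name, subnet, scope, role. Other lines are skipped.
    """
    registered = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            name, _subnet, _scope, role = fields[:4]
            registered[name] = role
    return registered


def missing_interfaces(registered, host_interfaces):
    """Names registered with the clusterware but absent on the host."""
    present = set(host_interfaces)
    return sorted(name for name in registered if name not in present)


# Hypothetical sample data, not taken from the affected cluster.
sample_getif = """\
e1000g0  10.1.1.0  global  public
e1000g1  192.168.10.0  global  cluster_interconnect
"""
host_now = ["lo0", "e1000g0", "e1000g2"]  # e.g. collected from ifconfig -a

registered = parse_oifcfg_getif(sample_getif)
print(missing_interfaces(registered, host_now))  # ['e1000g1']
```

Any name this prints is registered but no longer present on the host, which would match the renamed-interface situation described above.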
________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On
Behalf Of Dan Norris
Sent: Thursday, January 17, 2008 11:45 AM
To: bkaltofen@xxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: Re: All RAC Nodes reboot without log-entry
In absence of any solid theories, I'd start by shutting down one of the nodes
and seeing if you still experience crashes. Then I'd repeat the test with the
other node to see if somehow one node has a problem. If both nodes crash while
running solo, then I would say that we should put clusterware low on the
suspect list. That is, unless it has intermittent problems accessing its OCR
and/or voting disk(s). Then, maybe it'd have a problem.
All things equal, if this is a new environment, I would consider reinstalling
the whole thing to see if you can reproduce the problem. However, if the
problem doesn't come back immediately, you'll always fear it might. Then again,
maybe you just missed a step or a prereq or something that you may catch during
reinstallation.
I acknowledge that I'm violating the BAAG pledge, but I don't know that we can
establish any test cases at this point to improve our chances of finding the
right solution.
Dan
----- Original Message ----
From: "bkaltofen@xxxxxx" <bkaltofen@xxxxxx>
To: oracle-l@xxxxxxxxxxxxx
Sent: Thursday, January 17, 2008 5:53:37 AM
Subject: Re: All RAC Nodes reboot without log-entry
More Information and replies:
The RAC is a new setup. The nodes ran with only the OS installed for 4 weeks
without issue before that. I installed the RAC last week.
@Dan
There is no cron or something. Only me and 2 administrators have access to the
system.
dmesg only shows entries starting with the last boot, so there is no chance
to get information from the affected system that way.
I agree with you that rebooting both nodes is not common cluster behavior.
@Brian
Nothing visible from the OS indicates an interconnect issue. I have
no access to the network and power supply, as I'm remote. The local admins also
say there is no issue.
@Andrew
There are ASM and DB instances, but nothing in the logs. The logs show normal
behavior till they report the startup of the instances after the reboot.
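
Since the logs only show normal entries and then the post-reboot startup, one thing that can still be pulled out of them is the time window of each outage. A minimal sketch, assuming syslog-style lines that begin with "Mon DD HH:MM:SS" (the year is not in the line and has to be supplied); the sample lines and function names are invented for illustration:

```python
from datetime import datetime


def parse_syslog_time(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog-style line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def largest_gap(lines, year):
    """Return (last_entry_before, first_entry_after) for the longest silence.

    Returns None if there are fewer than two timestamped lines.
    """
    times = [parse_syslog_time(l, year) for l in lines if l.strip()]
    best = None
    for before, after in zip(times, times[1:]):
        if best is None or (after - before) > (best[1] - best[0]):
            best = (before, after)
    return best


# Hypothetical log lines, not taken from the affected nodes.
sample = [
    "Jan 10 09:50:01 node1 sample entry a",
    "Jan 10 09:52:17 node1 sample entry b",
    "Jan 10 10:01:45 node1 first entry after boot",
]

down_from, up_at = largest_gap(sample, 2008)
print(down_from, "->", up_at)
```

Running this against the same log on each node and comparing the "last entry before" times gives a rough idea of whether the nodes went down together or one after the other.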
regards, Björn
Dan Norris wrote:
In my past days when I was much more mischievous than I am now (or
maybe not :), I once set a cron job to reboot a system once a day. It was on an
SGI system sitting on a desk in a classroom at Silicon Graphics' training
facility, so I didn't do it to your system, but it might be worth changing all
the passwords just for good measure.
You might also check the "dmesg" output to see if there's something
being reported in there that didn't make it into the logfiles. I haven't ever
seen Clusterware reboot a node without logging a message about it *somewhere*
(one of the places you said you looked), so my theory is that this isn't caused
by Oracle software. Therefore, I'd investigate something in hardware or OS. Odd
that they both reboot--that's not a common Clusterware method to resolve
anything either.
I presume that Oracle Clusterware is the only cluster manager. If you
also have VCS or SunCluster, obviously you need to check their logs as well.
Dan
----- Original Message ----
From: "bkaltofen@xxxxxx" <bkaltofen@xxxxxx>
To: oracle-l@xxxxxxxxxxxxx
Sent: Wednesday, January 16, 2008 10:01:53 AM
Subject: All RAC Nodes reboot without log-entry
Hello,
I'm facing the problem that all my RAC nodes (2-node cluster, 10.2.0.3
on Solaris 10, Oracle CRS + ASM) reboot at the same time without any log
entries in /var/etc/messages, /var/log/syslog, or any of the log files under
$ORACLE_CRS_HOME/log/...
The cluster is a fresh installation. Any suggestions? Where can I look
further?
Kind regards,
Björn
--
http://www.freelists.org/webpage/oracle-l