Re: ocssd.bin does not start after 10.2.0.1 clusterware install

  • From: Kumar Madduri <ksmadduri@xxxxxxxxx>
  • To: oraclelist@xxxxxxxxxxxxx
  • Date: Tue, 8 Sep 2009 17:15:06 -0700

Hello Randy
We are not on Linux. We are trying this on a Solaris box. The thing is, if
we apply the 10.2.0.4 patchset, ocssd.bin comes up fine on the secondary
node but loops on the primary node. Oracle is trying to point the finger at
the Sun Cluster installation or the QFS file system that is used for the OCR
and voting disks. But if Sun Cluster is bad, or if it is an issue with QFS,
how does root.sh run fine on node 2, and how do ocrcheck and ocrdump work
fine from the primary node? Oracle is just going in circles without any
solution so far.
Another question in this regard: does Oracle Clusterware expect more
packages (SUNW*) on the primary node than on the secondary node? Does
root.sh run some additional steps on the primary node that it does not run
on the secondary? (It does not seem logical, but when I compared with a
working cluster, I noticed that the primary node had more SUNW* packages
than the secondary node.)
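One rough way to check the package difference is to save the output of
`pkginfo` on each node and diff the SUNW* entries. The Python sketch below
assumes the usual three-column `pkginfo` layout (category, package name,
description). The function names and the input file names are hypothetical,
and it only lists differences. It does not say which packages Oracle
Clusterware actually needs.

```python
import sys


def sunw_packages(pkginfo_text):
    """Return the set of SUNW* package names found in pkginfo output."""
    packages = set()
    for line in pkginfo_text.splitlines():
        fields = line.split()
        # Assumed line layout: <category> <package name> <description...>
        if len(fields) >= 2 and fields[1].startswith("SUNW"):
            packages.add(fields[1])
    return packages


def compare_nodes(primary_text, secondary_text):
    """Return (only_on_primary, only_on_secondary) as sorted lists."""
    primary = sunw_packages(primary_text)
    secondary = sunw_packages(secondary_text)
    return sorted(primary - secondary), sorted(secondary - primary)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare_pkgs.py node1.txt node2.txt
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        only_primary, only_secondary = compare_nodes(f1.read(), f2.read())
    print("Only on primary:  ", " ".join(only_primary))
    print("Only on secondary:", " ".join(only_secondary))
```

Running the same comparison against the working cluster would show whether
the extra packages on its primary node are the same ones missing here.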
Another thing that Oracle brings up is that Sun Cluster may be starting a
bit later than Oracle Clusterware, so Oracle Clusterware looks for the Sun
Cluster process and loops. But that does not make sense either, for a couple
of reasons: (1) if that is the case, why does ocssd.bin start on the second
node and root.sh complete successfully on node 2? (2) Even if what Oracle is
saying is true, this should only happen the first time Oracle Clusterware
tries to start. Subsequent tries should succeed (in fact the 2nd try, because
Sun Cluster should be up by then even if there was a time difference)
instead of looping round and round.
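To make point (2) concrete, here is a toy model of a start-up retry loop in
Python. It is only an illustration of the argument, not how init.cssd is
actually written. The function name, the retry interval, and the timings are
all made up.

```python
def attempts_until_start(dependency_ready_at, retry_interval, max_attempts):
    """Return the 1-based attempt on which start-up succeeds, or None.

    dependency_ready_at: seconds after the first attempt at which the
    dependency (here, Sun Cluster) is up. All values are illustrative.
    """
    for attempt in range(1, max_attempts + 1):
        now = (attempt - 1) * retry_interval  # time of this attempt
        if now >= dependency_ready_at:
            return attempt
    return None


# Dependency comes up 20 s late, retries every 30 s: the 2nd try succeeds.
print(attempts_until_start(20, 30, 10))
# Dependency never becomes visible: every try fails, i.e. an endless loop.
print(attempts_until_start(float("inf"), 30, 10))
```

In this model a pure timing difference costs at most a few retries. A loop
that never ends means the check itself never passes on the primary node.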
We can't take a pstack or ptree of any process because the only thing I see
is the css fatal process, and when we did a ptree it just shows the sleep
process.


Thank you
Kumar

On Tue, Sep 8, 2009 at 7:45 AM, Randy Johnson <oraclelist@xxxxxxxxxxxxx> wrote:

>  Can you post the output from the root.sh script? I believe there is a bug
> that has to be addressed before this script will succeed for 10.2.0.1.
> I believe doc: 466673.1 covers it.
>
>  ------------------------------
>  *From:* oracle-l-bounce@xxxxxxxxxxxxx [mailto:
> oracle-l-bounce@xxxxxxxxxxxxx] *On Behalf Of *Kumar Madduri
> *Sent:* Thursday, September 03, 2009 3:38 PM
> *To:* Oracle-L@xxxxxxxxxxxxxxx
> *Subject:* ocssd.bin does not start after 10.2.0.1 clusterware install
>
>   Hi
> We have a Sun Cluster on top of which we are trying to install the 10.2.0.1
> clusterware. The installation is fine, but when you run root.sh it fails to
> bring up ocssd.bin. I think it is looping in /etc/init.d/init.cssd, trying to
> validate the conditions under the SunOS case statement, and it keeps looping
> (start check process start, and it does not go to the next stage).
> Prior to doing this install, we cleaned up the localconfig as per note
> 239998.1.
>
> cluvfy does not report any issues before the start of the installation.
>
> We have a TAR open with Oracle, but unfortunately there is not much
> progress with them.
>
> Any ideas?
>
> Thank you
> Kumar
