Re: Oracle clusterware related question

  • From: Martin Berger <martin.a.berger@xxxxxxxxx>
  • To: Amir.Hameed@xxxxxxxxx
  • Date: Tue, 8 May 2012 20:02:47 +0200

Amir,

In Oracle Clusterware, no node can be evicted directly by the remote nodes.
The 'others' can only exclude a node and hope that it commits suicide.

The problem here is that on your hanging node the clusterware processes
are stuck in I/O to their logfiles. As your NFS mount does not disappear,
the filehandles are still open. It seems writing to logfiles is a
synchronous task, so while these processes hang in file I/O they cannot
perform higher-priority tasks such as killing the node.
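
To illustrate why a synchronous logfile write matters, here is a minimal
Python sketch. It is not how Clusterware is implemented: agent_loop,
fast_write and hanging_write are made-up names, and the 0.3 second sleep
only stands in for a write that could block indefinitely on a hard NFS mount.

```python
import time


def agent_loop(write_log, is_evicted):
    """Hypothetical single-threaded agent: log first, then check for eviction.

    Returns the seconds that passed before the agent reacted to eviction.
    """
    start = time.monotonic()
    while True:
        write_log("still alive")  # synchronous call: blocks the whole loop
        if is_evicted():          # the higher-priority check only runs afterwards
            return time.monotonic() - start


def fast_write(_msg):
    """Stands in for a log write to a healthy local disk (returns at once)."""


def hanging_write(_msg, hang_seconds=0.3):
    """Stands in for a write to an NFS share that stopped answering.

    Assumption: a real hard mount may block far longer than this sleep.
    """
    time.sleep(hang_seconds)


def already_evicted():
    """The other nodes have already excluded this node."""
    return True


local_delay = agent_loop(fast_write, already_evicted)
nfs_delay = agent_loop(hanging_write, already_evicted)
print(f"local log: reacted after {local_delay:.3f}s")
print(f"hung log:  reacted after {nfs_delay:.3f}s")
```

The eviction flag is set from the start in both runs, yet the agent with
the hanging log write only notices it once the write returns.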

You can try to mount your log directories 'soft'; maybe this solves
the hanging issue. But I don't know which side effects this might
cause!

I am not sure whether CRS shows the same behavior when a logfile write
hangs (as on NFS) as when a logfile write fails (as in "mountpoint
disappears because the SAN network is removed"). Mathias, do you
remember the details? But as they were back in 11.2.0.1, I should
probably run the test case again.

I second Mathias: grid logs (and also grid binaries) should be local!
Everything else, like RDBMS binaries and logs, can be on any remote system.
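
If you want to check where a directory actually lives, a small Python
sketch like the following could help. It parses text in the format of
Linux /proc/mounts and reports whether a path sits on NFS and whether that
mount is hard or soft. The helper names, the filer01 server and the /u01
paths are hypothetical examples, not taken from your system.

```python
# Sample in /proc/mounts format; server name and paths are invented.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
filer01:/export/grid /u01/app/grid nfs rw,relatime,vers=3,hard,proto=tcp 0 0
filer01:/export/oracle /u01/app/oracle nfs rw,relatime,vers=3,soft,proto=tcp 0 0
"""


def mount_for(path, mounts_text):
    """Return (mount_point, fs_type, options) for the mount containing path.

    The longest matching mount point wins; returns None if nothing matches.
    """
    best = None
    wanted = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if wanted.startswith(prefix):
            # ">=" lets a later entry for the same mount point take precedence
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type, options)
    return best


def describe(path, mounts_text):
    """Summarize in one line whether path is on NFS and how it is mounted."""
    found = mount_for(path, mounts_text)
    if found is None:
        return f"{path}: no mount found"
    mount_point, fs_type, options = found
    if fs_type.startswith("nfs"):
        # NFS mounts are hard unless 'soft' is given explicitly
        mode = "soft" if "soft" in options else "hard"
        return f"{path}: NFS ({fs_type}) via {mount_point}, {mode} mount"
    return f"{path}: not on NFS ({fs_type}) via {mount_point}"


for candidate in ("/u01/app/grid/log", "/u01/app/oracle/diag", "/var/log"):
    print(describe(candidate, SAMPLE_MOUNTS))
```

On a real Linux host you would pass the contents of /proc/mounts instead
of SAMPLE_MOUNTS, together with your own grid home and log paths.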

hth
 Martin

On Tue, May 8, 2012 at 6:11 PM, Hameed, Amir <Amir.Hameed@xxxxxxxxx> wrote:
> So, if voting disks are not updated by a certain node for any reason for
> an extended period of time, that node would not be evicted by the remote
> nodes from the cluster?
>
>
> From: Tim Gorman [mailto:tim@xxxxxxxxx]
> Sent: Tuesday, May 08, 2012 12:05 PM
> To: Mathias.Zarick@xxxxxxxxxxxx; Hameed, Amir
> Cc: oracle-l@xxxxxxxxxxxxx
> Subject: Re: Oracle clusterware related question
>
>
>
> Mathias hit the nail on the head.  Think about it this way:  NFS errors
> and disconnects typically do not kill running programs, but cause them
> to hang.  If the binaries for the clusterware are themselves on NFS,
> then clearly they are going to hang also.
--
//www.freelists.org/webpage/oracle-l

