Re: Using Diagwait on Oracle Clusterware

  • From: LS Cheng <exriscer@xxxxxxxxx>
  • To: Martin Berger <martin.a.berger@xxxxxxxxx>
  • Date: Tue, 24 Nov 2009 09:57:49 +0100

I don't know how to do it. Someone from Sun did it once for a customer but
didn't want to tell me how :-S



On Tue, Nov 24, 2009 at 9:52 AM, Martin Berger <martin.a.berger@xxxxxxxxx> wrote:

> Can you give me some hints on how to do this?
> (Even if my Solaris admins might not know, it's worth my knowing about it :))
>
> thank you
>  Martin
>
> On Tue, Nov 24, 2009 at 08:08, LS Cheng <exriscer@xxxxxxxxx> wrote:
> > One of the reasons I use diagwait is that it makes oprocd less sensitive :-)
> >
> > The other reasons are those the note states, but when there are evictions
> > on Solaris, for example, it is still quite hard to find the root cause
> > (because CRSD sends some eviction messages to the system console, and that
> > usually is not written to files unless configured to be, but many Solaris
> > admins do not know how to do it!)
> >
> >
> >
> > Thanks
> >
> > --
> > LSC
> >
> >
> > On Mon, Nov 23, 2009 at 6:08 PM, Vishal Gupta <vishal@xxxxxxxxxxxxxxx>
> > wrote:
> >>
> >> Hello List,
> >>
> >> What is the general consensus among RAC users regarding the use of
> >> diagwait on Oracle Clusterware?
> >>
> >> Metalink Note - 559365.1
> >>
> >>
> >> Symptoms
> >>
> >> Oracle Clusterware evicts the node from the cluster when:
> >>
> >> - Node is not pinging via the network heartbeat
> >> - Node is not pinging the Voting disk
> >> - Node is hung/busy and is unable to perform either of the earlier tasks
> >>
> >> In most cases when the node is evicted, there is information written to
> >> the logs to analyze the cause of the node eviction. However, in certain
> >> cases this may be missing. The steps documented in this note are to be
> >> used for those cases where there is not enough information, or no
> >> information, to diagnose the cause of the eviction.
> >>
> >> Changes
> >>
> >> None
> >>
> >> Cause
> >>
> >> When the node is evicted and the node is extremely busy in terms of CPU
> >> (or lack of it), it is possible that the OS did not get time to flush the
> >> logs/traces to the file system. It may be useful to set the diagwait
> >> attribute to delay the node reboot to give additional time to the OS to
> >> write the traces. This setting will provide more time for diagnostic data
> >> to be collected safely and will NOT increase the probability of
> >> corruption. After setting diagwait, the Clusterware will wait an
> >> additional 10 seconds (Diagwait - reboottime). Customers can unset
> >> diagwait by following the steps documented below after fixing their OS
> >> scheduling issues.
> >>
> >>
> >>
> >>
> >>
> >> Regards,
> >> Vishal Gupta
> >> http://www.vishalgupta.com
> >
>
>
>
> --
> Martin Berger           martin.a.berger@xxxxxxxxx
> Lederergasse 27/2/14           +43 660 660 83306
> 1080 Wien                                       http://berx.at/
>
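
The "additional 10 seconds (Diagwait - reboottime)" in the quoted note is plain subtraction. A minimal sketch in Python, assuming the commonly cited values of diagwait = 13 and reboottime = 3 (neither number appears in this thread, and the function name is hypothetical, not part of any Oracle tool):

```python
def extra_reboot_delay(diagwait: int, reboottime: int) -> int:
    """Return the extra seconds before reboot, computed as diagwait - reboottime.

    Both arguments are in seconds. The formula is the one quoted from the
    note; the values passed in below are assumptions for illustration.
    """
    if diagwait < reboottime:
        # A diagwait shorter than reboottime would give a negative delay,
        # which makes no sense for this illustration.
        raise ValueError("diagwait must be >= reboottime")
    return diagwait - reboottime


if __name__ == "__main__":
    # Assumed values: diagwait = 13, reboottime = 3.
    print(extra_reboot_delay(13, 3))
```

With those assumed values the result matches the 10 seconds the note mentions; check the actual settings on your own cluster rather than relying on these numbers.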
