Using Diagwait on Oracle Clusterware

  • From: "Vishal Gupta" <vishal@xxxxxxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 23 Nov 2009 17:08:06 -0000

Hello List,
 
What is the general consensus among RAC users regarding the use of diagwait on 
Oracle Clusterware?
 
Metalink Note - 559365.1
 

Symptoms


Oracle Clusterware evicts the node from the cluster when:

* The node is not pinging via the network heartbeat
* The node is not pinging the voting disk
* The node is hung or busy and is unable to perform either of the above tasks

In most cases when a node is evicted, information is written to the logs that 
can be used to analyze the cause of the eviction. In certain cases, however, 
this information may be missing. The steps documented in this note are to be 
used in those cases, where there is too little information, or none at all, to 
diagnose the cause of the eviction.


Changes


None


Cause


When the node is evicted while it is extremely busy in terms of CPU (or lack 
of it), it is possible that the OS did not get time to flush the logs/traces 
to the file system. It may be useful to set the diagwait attribute to delay 
the node reboot, giving the OS additional time to write the traces. This 
setting provides more time for diagnostic data to be collected safely and will 
NOT increase the probability of corruption. After diagwait is set, the 
Clusterware will wait an additional 10 seconds (diagwait - reboottime). 
Customers can unset diagwait by following the steps documented below after 
fixing their OS scheduling issues.
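
To make the "diagwait - reboottime" arithmetic concrete, here is a minimal 
sketch in Python. The values diagwait = 13 and reboottime = 3 are assumptions 
chosen to match the 10-second figure; the excerpt above does not state them, 
so check the actual settings on your own cluster. The function name is 
hypothetical and only illustrates the subtraction.

```python
def extra_wait_seconds(diagwait: int, reboottime: int) -> int:
    """Additional delay before reboot, per the note's formula: diagwait - reboottime."""
    if diagwait < reboottime:
        # The formula only describes an added delay when diagwait exceeds reboottime.
        raise ValueError("diagwait must be greater than or equal to reboottime")
    return diagwait - reboottime


# Assumed values (not given in the quoted excerpt), in seconds.
ASSUMED_DIAGWAIT = 13
ASSUMED_REBOOTTIME = 3

print(extra_wait_seconds(ASSUMED_DIAGWAIT, ASSUMED_REBOOTTIME))
```

With these assumed values the result is the 10 seconds of extra wait that the 
note mentions.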


 

 
 
Regards,
Vishal Gupta
http://www.vishalgupta.com   
