Re: RAC and tcp timeout

  • From: David Sharples <davidsharples@xxxxxxxxx>
  • To: Harish Kalra <hkalra27@xxxxxxxxxxx>
  • Date: Wed, 10 Aug 2005 11:57:00 +0100

Thanks, that looks like exactly my problem.  It's a bit hard to test now, as
server1 has recovered and everything is OK again.

I will schedule some tests that cut power to a few machines to
simulate a system crash, and then try changing the parameters.

On 8/10/05, Harish Kalra <hkalra27@xxxxxxxxxxx> wrote:
> David:
>  
> This can be an issue with the TCP timeout settings. Oracle cannot determine that a
> listener is unreachable until it gets a specific error back from the TCP
> layer. So, when a node is down, any TCP connection to it will appear to hang
> because the TCP layer keeps retrying until the timeout threshold is reached.
> At that point, an error code is returned, Oracle knows that the first
> listener is unreachable, and it proceeds to try connecting to the
> second host listed in the tnsnames.ora file.
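
To illustrate the failover order described above, here is a minimal Python sketch. It is not how Oracle Net works internally: it swaps in an explicit application-level timeout instead of waiting for the kernel's TCP retries to give up. The function name connect_first_reachable, the host names, and the port in the comment are assumptions made up for the example.

```python
import socket


def connect_first_reachable(addresses, timeout=5.0):
    """Try each (host, port) in order and return (socket, address) for the
    first one that accepts a connection.

    A dead node typically produces no reply at all, so without a timeout the
    connect call would block until the TCP layer stops retrying.
    """
    errors = []
    for host, port in addresses:
        try:
            sock = socket.create_connection((host, port), timeout=timeout)
            return sock, (host, port)
        except OSError as exc:
            # Covers refused connections, timeouts and name lookup failures.
            errors.append(((host, port), exc))
    raise ConnectionError(f"no listener reachable: {errors}")


# Hypothetical usage, with made-up host names and port:
#   sock, addr = connect_first_reachable([("server1", 1521), ("server2", 1521)])
```

The shorter the timeout, the sooner the second address is tried, which is the same trade-off the kernel parameters control for clients that set no timeout of their own.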
>  
> You can see the values of the TCP-related parameters on Linux under the
> /proc/sys/net/ipv4 directory. The following parameters may have a major impact:
>  
> tcp_keepalive_time - The default is 2 hours (7200 seconds). If an abnormal network
> failure happens between the two ends, it will take at least 2 hours for the
> connection to be marked as 'terminated'. This may have a major impact on
> performance, so it should be set to the lowest practical value, such as
> 10 minutes or even less. In some cases we have seen a performance gain by
> setting it to 2 minutes.
>  
> tcp_keepalive_intvl - This defines the interval between TCP keepalive packets
> and should be set to 60 seconds.
>  
> tcp_fin_timeout - Setting this to 60 seconds will move a <pending close> TCP
> socket to <fully-closed> after that time and allow earlier reconnection to
> another listener.
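
For checking the current values before and after the crash tests, a small Python sketch like the following could be used. It only reads the three files named above under /proc/sys/net/ipv4; the helper name read_tcp_params and its arguments are assumptions for the example, and a parameter that cannot be read (for instance on a non-Linux machine) is reported as None.

```python
from pathlib import Path

PROC_IPV4 = Path("/proc/sys/net/ipv4")
PARAMS = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_fin_timeout")


def read_tcp_params(names=PARAMS, base=PROC_IPV4):
    """Return {name: value in seconds}, or None where the file is unreadable."""
    values = {}
    for name in names:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # Missing file, no permission, or unexpected content.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_tcp_params().items():
        print(f"{name} = {value}")
```

Changing a value needs root, for example with sysctl -w net.ipv4.tcp_keepalive_time=600, and a change made that way does not survive a reboot unless it is also added to /etc/sysctl.conf.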
