Re: Long connect time when one node in RAC goes down

  • From: Yechiel Adar <adar666@xxxxxxxxxxxx>
  • Date: Thu, 04 Sep 2008 10:34:19 +0300

With the help of Oracle support we narrowed the problem down to name resolution.
We shut down node 2 and started a session with client tracing enabled.
The trace shows sqlnet deciding to use server2-vip.
After that it tries to resolve the name to a TCP/IP address.
That is where it gets stuck: the attempt to resolve server2-vip to a TCP/IP address hangs.

It seems that something in the network is not updated when the VIP is moved
to the other node. It takes about 6 (or 6*2) seconds until sqlnet gets an
error from the network; it then tries to connect with the second entry,
server-vip1, and this works.
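
To make the behaviour described above concrete, here is a minimal Python sketch of "try the first entry, fall through to the second on error". It is only an illustration of the pattern, not how sqlnet is implemented; the function name connect_first_available and the 1-second timeout are my own assumptions. Note that in Python the timeout given to socket.create_connection bounds the TCP connect attempt but not the name lookup that precedes it, so a stalled lookup can still cost the full delay.

```python
import socket


def connect_first_available(addresses, timeout=1.0, connect=socket.create_connection):
    """Try each (host, port) in order; return (address, connection) for the first success.

    Hypothetical helper for illustration only. Raises OSError listing every
    failure if none of the addresses can be reached.
    """
    failures = []
    for address in addresses:
        try:
            return address, connect(address, timeout=timeout)
        except OSError as exc:  # covers timeouts, refusals and lookup errors
            failures.append((address, exc))
    raise OSError(f"all addresses failed: {failures}")
```

With the host names from this thread it could be called as connect_first_available([("server2-vip", 1521), ("server-vip1", 1521)]), where port 1521 is an assumed listener port. The time spent before falling through to the second entry is whatever the first attempt takes to fail.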

Have you heard anything about this problem?

We are going to run a test using the IP addresses themselves instead of names
in the tnsnames, and also use a sniffer to find out what happens during these 6 seconds.
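
Alongside the sniffer, a small script run on the client could show whether the seconds are lost in the name lookup or in the TCP connect. The following is a rough sketch under my own assumptions: the helper names time_lookup and time_connect are made up for this example, and nothing here goes through sqlnet, so it only approximates what the Oracle client does.

```python
import socket
import time


def time_lookup(host, resolver=socket.gethostbyname):
    """Return (elapsed_seconds, result) for one name lookup.

    result is the resolved address, or the exception if the lookup failed.
    """
    start = time.perf_counter()
    try:
        result = resolver(host)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result = exc
    return time.perf_counter() - start, result


def time_connect(address, port, timeout=10.0):
    """Return (elapsed_seconds, error) for one TCP connect; error is None on success."""
    start = time.perf_counter()
    error = None
    try:
        with socket.create_connection((address, port), timeout=timeout):
            pass  # connection opened and closed again; only the timing matters
    except OSError as exc:
        error = exc
    return time.perf_counter() - start, error
```

Calling time_lookup("server2-vip") and then time_connect on the returned address (with the listener port, 1521 if the default is in use) while node 2 is down should show which of the two steps accounts for the delay. Repeating time_connect with the literal IP address gives the comparison for the planned test without names.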

Adar Yechiel
Rechovot, Israel



Yechiel Adar wrote:
We need some help.
RAC, Oracle 10.2.0.3 on Windows 2003 servers, 64-bit.

We did a failover test. We disconnected one server from the network by pulling the network cable. The system worked fine, but once in a while a connection takes 6 seconds instead of 20 ms. We understand that this happens because the VIP is moved to the second computer and there is nothing there to handle calls on that TCP address.

I would like to know how to shorten the time from 6 seconds to almost nothing.

--
//www.freelists.org/webpage/oracle-l

