RE: Standby hung following a network disconnect

  • From: "Chandra Pabba" <chandra_pabba@xxxxxxxxxxx>
  • To: <vrajagopal@xxxxxxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Sat, 03 Jul 2010 19:15:18 -0500

Vasu,

 

Which attributes (REOPEN, NET_TIMEOUT, MAX_FAILURE, etc.) do you currently
have set on the LOG_ARCHIVE_DEST_n that points to the standby?
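
For comparing those attributes across instances, a rough sketch like the one below may help. It splits a LOG_ARCHIVE_DEST_n value into its attributes. The destination string is purely hypothetical (the thread does not show the real setting), and parse_dest is an illustrative helper, not an Oracle API. It assumes attributes are separated by whitespace and contain no embedded spaces.

```python
# Hypothetical LOG_ARCHIVE_DEST_2 value; the real setting is not shown in the thread.
dest = "SERVICE=dgmydb LGWR ASYNC REOPEN=15 NET_TIMEOUT=30 MAX_FAILURE=10"


def parse_dest(value):
    """Split a LOG_ARCHIVE_DEST_n string into {ATTRIBUTE: value or True}."""
    attrs = {}
    for token in value.split():
        key, sep, val = token.partition("=")
        # Keyword-only attributes such as LGWR or ASYNC are stored as True.
        attrs[key.upper()] = val if sep else True
    return attrs


attrs = parse_dest(dest)
for name in ("REOPEN", "NET_TIMEOUT", "MAX_FAILURE"):
    print(name, "=", attrs.get(name, "(not set)"))
```

Running the same helper over the value from each primary instance makes it easy to spot an attribute that is missing on one node.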

 

Thanks
Chandra

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Vasu Rajagopal
Sent: Friday, July 02, 2010 10:15 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: Standby hung following a network disconnect

 

 

Hi,

 

I have a Data Guard issue. The primary is a RAC production database on
10.2.0.4, using LGWR ASYNC redo transport to a DR site that is also a RAC
configuration (physical standby).

One of the standby databases (in real-time apply mode) hangs after a
NETWORK DISCONNECT error, generating a few Gb/sec of read I/O on the standby
redo logs (SRLs), and it appears to stay stuck indefinitely.

This has occurred almost twice a week over the last 2 months.

 

As a temporary workaround, we cancelled the managed recovery process (MRP)
and put the standby into ARCH apply mode. That seems to be working, though we
would like to have the DR site running in real-time apply mode.

 

I have an update from Oracle saying that most issues relating to ORA-03135
have turned out to be router / switch / firewall / http protocol issues. They
ask that, if a Cisco router is used, we disable the fixup protocol for the
sqlnet port, etc.

However, I am not sure why the LNS/MRP processes are unable to recover and
return to normal operation after detecting the timeout.

 

Looking for inputs on ways to diagnose/resolve this.

Thanks,

Vasu

 

Here are excerpts from the log files showing the disconnect:

Primary DB (mydb) alert log
----------------------------------------
Errors in file /u001/app/oracle/admin/mydb/bdump/mydb1_lns1_22694.trc:
ORA-03135: connection lost contact
Fri Jun 25 16:00:42 2010
LGWR: I/O error 3135 archiving log 2 to
'(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=racsrvr11.mycomp.com)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=dgmydb_XPT.mycomp.com)(INSTANCE_NAME=dgmydb2)(SERVER=dedicated)))'
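
When chasing a router or firewall problem it helps to know exactly which host, port and service the transport is using. As a small sketch (not an Oracle tool), the snippet below pulls the leaf KEY=value pairs out of the connect descriptor from the alert log entry above. It assumes values never contain parentheses, which holds for this descriptor.

```python
import re

# Connect descriptor from the primary alert log, rejoined onto one line.
descriptor = (
    "(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)"
    "(HOST=racsrvr11.mycomp.com)(PORT=1521)))"
    "(CONNECT_DATA=(SERVICE_NAME=dgmydb_XPT.mycomp.com)"
    "(INSTANCE_NAME=dgmydb2)(SERVER=dedicated)))"
)

# Leaf pairs look like (KEY=value) with no nested parentheses in the value;
# container keys such as DESCRIPTION are skipped because "(" follows the "=".
fields = dict(re.findall(r"\(([A-Z_]+)=([^()]+)\)", descriptor))

print(fields["HOST"], fields["PORT"], fields["SERVICE_NAME"])
```

The HOST and PORT values are the ones to hand to the network team when asking which devices sit between the primary and the standby.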

Primary LNS Trace file --- mydb1_lns1_22694.trc
===========================
Sending online log thread 1 seq 14857 [logfile 2] to standby
Archiving to destination
(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=racsrvr11.mycomp.com)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=dgmydb_XPT.mycomp.com)(INSTANCE_NAME=dgmydb2)(SERVER=dedicated))) ASYNC blocks=20480
Log file opened [logno 2]
*** 2010-06-25 16:00:42.157
RFS network connection lost at host
'(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=racsrvr11.mycomp.com)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=dgmydb_XPT.mycomp.com)(INSTANCE_NAME=dgmydb2)(SERVER=dedicated)))'
Error 3135 writing standby archive log file at host
'(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=racsrvr11.mycomp.com)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=dgmydb_XPT.mycomp.com)(INSTANCE_NAME=dgmydb2)(SERVER=dedicated)))'
ORA-03135: connection lost contact
*** 2010-06-25 16:00:42.170 64208 kcrr.c

Standby alert log
-----------------------------------------------
Mem# 0: /netapp/oracle/dr/redologs11/dgmydb/group_32.1296.718556259
Fri Jun 25 16:00:42 2010
RFS[2]: Possible network disconnect with primary database
Fri Jun 25 18:14:17 2010
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[6]: Assigned to RFS process 10755
RFS[6]: Identified database type as 'physical standby'
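
To put a number on how long the standby sat without a new RFS connection, the two timestamps in the standby alert log excerpt can be subtracted. The sketch below does that. It assumes the excerpt is contiguous (no entries were trimmed between the two timestamps) and that the script runs under an English locale so the day and month abbreviations parse.

```python
from datetime import datetime

# Alert log timestamp layout, e.g. "Fri Jun 25 16:00:42 2010".
ALERT_TS = "%a %b %d %H:%M:%S %Y"

# Timestamps copied from the standby alert log excerpt.
disconnect = datetime.strptime("Fri Jun 25 16:00:42 2010", ALERT_TS)
reconnect = datetime.strptime("Fri Jun 25 18:14:17 2010", ALERT_TS)

# Time between the "Possible network disconnect" message and the next
# "Redo Shipping Client Connected" message.
gap = reconnect - disconnect
print("No RFS connection logged for", gap)
```

Under those assumptions the gap is a little over two hours, which can be compared against the REOPEN interval on the primary to see whether reconnection attempts were being made at all.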

 

 
