Re: RMAN duplicate is failing : database session for channel <channel_name> terminated unexpectedly

From: De DBA <dedba@xxxxxxxxxx>
To: ashoke.k.mandal@xxxxxxxxxxxxx
Date: Wed, 21 Mar 2012 10:57:21 +1000

Hi Ashoke,

The logs that you mention are the Oracle Database Extension logs. The media
manager logs that I meant are in <Netbackup_Root>/volmgr/debug. This article:
http://www.symantec.com/business/support/index?page=content&id=TECH31097 has a
list of log locations and process names that may be helpful.

If the lines you show under b) are the last ones in the file, though, there
does not seem to be a problem with the mount. These lines merely indicate that
the backup piece was restored. As you can see in article TECH53002, if the
media manager encounters an error, it is written below the "Error=7504" line.
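
As a rough sketch of that check (this is not a NetBackup tool; the function
name is made up for illustration), the following Python lists whatever is
logged after the last "Error=7504" line. The sample lines are copied from the
dbclient excerpt quoted below.

```python
def lines_after_marker(log_text, marker="Error=7504"):
    """Return the non-blank lines after the last line containing marker.

    Returns None when the marker never appears in the text.
    """
    lines = log_text.splitlines()
    last_hit = None
    for index, line in enumerate(lines):
        if marker in line:
            last_hit = index
    if last_hit is None:
        return None
    return [line for line in lines[last_hit + 1:] if line.strip()]


# Sample copied from the dbclient log excerpt in this thread.
SAMPLE = """\
09:59:40.758 [6456]<8>  int_ReadData: WRN - Failed to set client read timeout.
09:59:40.759 [6456]<2>  sbterror: INF - entering
09:59:40.759 [6456]<2>  sbterror: INF - Error=7504: Got end-of-file
"""

trailing = lines_after_marker(SAMPLE)
if trailing is None:
    print("marker not found")
elif trailing:
    print("\n".join(trailing))
else:
    print("nothing logged after the marker")
```

An empty result means the marker line is the last entry, which is the case
described above.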

The client read timeout that is mentioned is another property, unrelated to
the media mount timeout, which you can set (in V6.0) on the server or on the
client side. It defaults to 5 minutes, which the manual states is too short
for the database extension. The client uses its local value if it does not
receive a value from the server, as is the case in your situation: the log
shows that no client read timeout is set. It seems to me that the size of your
restore is the issue here: it may lead to (very) long waits between reads
while Oracle restores the piece it has just read.
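
To see which value a host would fall back on, something like the sketch below
could help. It assumes the timeout is stored as a CLIENT_READ_TIMEOUT entry
(in seconds) in a bp.conf-style file such as /usr/openv/netbackup/bp.conf;
that entry name and path come from typical UNIX installations, not from this
thread, so please check them against your version's documentation. The
function names are only for illustration.

```python
import re

# The 5-minute default mentioned above, in seconds.
DEFAULT_TIMEOUT_S = 300


def client_read_timeout(conf_text):
    """Return CLIENT_READ_TIMEOUT (seconds) from bp.conf-style text.

    Assumption: entries look like "KEY = value". Falls back to the
    default when the entry is absent.
    """
    for line in conf_text.splitlines():
        match = re.match(r"\s*CLIENT_READ_TIMEOUT\s*=\s*(\d+)\s*$", line)
        if match:
            return int(match.group(1))
    return DEFAULT_TIMEOUT_S


def timeout_covers(conf_text, longest_pause_s):
    """True if the configured timeout exceeds the longest pause between reads."""
    return client_read_timeout(conf_text) > longest_pause_s


# A file without the entry falls back to the default.
print(client_read_timeout("SERVER = phx00bs2\n"))
# A ten-minute pause between reads would exceed that default.
print(timeout_covers("SERVER = phx00bs2\n", 600))
```

Run it against the text of the file on both the server and the client to see
whether either side carries an explicit value.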

The Oracle error (ORA-07445 ... [SIGSEGV] [Address not mapped to object] ...)
seems to indicate that some object that used to be there (perhaps a TCP socket
or another process) no longer exists, e.g. because a process exited on a
timeout or a socket was closed. Other points to look at would include timeouts
on TCP connections (firewalls perhaps?) and OS errors on the database host
that may have caused the NB client to exit (message log, syslog, core dumps).
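
When comparing against a firewall or TCP idle timeout, it may help to know how
long the restore ran before it failed. A minimal sketch, using the two
timestamps from the restore log you quoted under a) and assuming both fall
within 24 hours of each other (the helper name is made up):

```python
from datetime import datetime, timedelta


def elapsed(start_hms, end_hms):
    """Return the time between two HH:MM:SS stamps as a timedelta.

    Assumption: the end stamp is less than 24 hours after the start,
    so a negative difference means the clock passed midnight.
    """
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end_hms, fmt) - datetime.strptime(start_hms, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return delta


# "Beginning restore" at 09:10:11, "the restore failed" at 09:50:35.
gap = elapsed("09:10:11", "09:50:35")
print(gap, int(gap.total_seconds()))  # 0:40:24 2424
```

If that figure happens to match an idle timeout configured somewhere between
the media server and the database host, that would be worth a closer look.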

Hope this helps
Tony

On 21/03/12 00:27, Mandal, Ashoke wrote:
> Hi Tony,
>
> Here is the info on NetBackup Media Manager's logs
> a)  The log under /usr/openv/netbackup/logs/user_ops/dbext/logs directory 
> shows the following:
> 09:10:11 (197975.001) INF - Beginning restore from server phx00bs2 to client 
> phx00apt1.
> 09:50:35 (197975.001) Status of restore from copy 1 of image created Mon Feb 
> 27 19:09:01 2012 = the restore failed to recover the requested
>
> b) The log under /usr/openv/netbackup/logs/dbclient directory shows the 
> following:
> 09:59:40.758 [6456]<4>  VxBSASetEnv: INF - entering SetEnv - 
> NBBSA_CLIENT_READ_TIMEOUT
> 09:59:40.758 [6456]<4>  VxBSAGetEnv: INF - entering GetEnv - 
> NBBSA_CLIENT_READ_TIMEOUT
> 09:59:40.758 [6456]<4>  VxBSAGetEnv: INF - returning - 10800
> 09:59:40.758 [6456]<4>  dbc_SetClientReadTimeout: INF - sending client read 
> timeout
> 09:59:40.758 [6456]<2>  xbsa_SetEnv: INF - leaving (0)
> 09:59:40.758 [6456]<8>  int_ReadData: WRN - Failed to set client read timeout.
> 09:59:40.759 [6456]<2>  sbterror: INF - entering
> 09:59:40.759 [6456]<2>  sbterror: INF - Error=7504: Got end-of-file
>
> d)  /usr/openv/netbackup/logs/bphdb directory didn't have any log.
>
> e)  When I googled "WRN - Failed to set client read timeout" I found
> Article TECH73065 and Article TECH53002 on the Symantec site, and these
> suggest that I verify the Media Mount Timeout. Our storage administrator
> verified that it was set to unlimited.
> <phx00bs2><root>bpconfig -U | grep -i mount
> Media Mount Timeout:          0 minutes (unlimited)
> Shared Media Mount Timeout:0 minutes (unlimited)
>
> Let me know if there are any other areas I should look at.
>
> Thanks,
> Ashoke
>
> -----Original Message-----
> From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On 
> Behalf Of De DBA
> Sent: Tuesday, March 20, 2012 5:47 AM
> To: oracle-l@xxxxxxxxxxxxx
> Subject: Re: RMAN duplicate is failing : database session for 
> channel<channel_name>  terminated unexpectedly
>
> Reposted due to overquoting..
>
> Did you check the NetBackup Media Manager's logs? Perhaps it is trying to 
> read from a tape that is not (no longer) mounted? Those logs should be on the 
> NetBackup Media server, not necessarily on the database host (depending on 
> your setup, of course).
>
> Cheers,
> Tony
>
>> On 20/03/12 14:02, Mandal, Ashoke wrote:
>>> I noticed that it generates the following error in alert log:
>>> Errors in file 
>>> /phx11dbt1/u01/app/oracle/admin/vtwdmas/udump/vtwdmas_ora_6462.trc:
>>> ORA-07445: exception encountered: core dump [VxBSAGetData()+716]
>>> [SIGSEGV] [Address not mapped to object] [0x000000DF8] [] []
>>>
>>> The trace file has the following message, but sbtio.log doesn't have any
>>> information: its size is 0.
>>> SKGFQ OSD: Error in function sbtread2 on line 1156 SKGFQ OSD: Look
>>> for SBT Trace messages in file
>>> /phx11dbt1/u01/app/oracle/admin/vtwdmas/udump/sbtio.log
>>> Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to
>>> object), addr: 0xdf8, PC: [0xffffffff7d736d94, VxBSAGetData()+716]
>>>
>>> Couldn't locate any note in Metalink related to this error. Any suggestions 
>>> will be appreciated.
>>>
>>> Thanks,
>>> Ashoke
>>>
>>>

