An Unrelated Process Fails When RMAN Fails

  • From: "Sam Bootsma" <sbootsma@xxxxxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 3 Jul 2007 15:27:12 -0400

Hi all,

Can anybody give me some insight into what RMAN is doing about ten
minutes before it fails with an error like the one below?  I ask because
an unrelated process (always the same one) frequently sends a timeout
message about ten minutes before RMAN fails.  When RMAN finishes
successfully, the unrelated process never times out.

 

The bottom of the RMAN log file has the following error messages:

RMAN-03009: failure of backup command on t2 channel at 07/01/2007 12:27:19
ORA-27191: sbtinfo2 returned error
Additional information: 2
ORA-19511: Error received from media manager layer, error text:
   do_info2: A connection to NW server 'tcclnw01' could not be established because 'Port mapper failure'.

 

The person in charge of our media management layer says it just means
the RMAN process is not able to find a port to connect to.   He will be
looking into the problem further.  However, this does not explain why we
frequently see a timeout error with this unrelated process.  
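
One way to gather evidence for the port explanation is to probe the NW
server from the database host at regular intervals during the backup
window. The sketch below is a minimal illustration in Python, not anything
taken from the RMAN or media manager tooling. It assumes the port mapper on
'tcclnw01' listens on the standard portmap/rpcbind port 111 over TCP; the
function name port_reachable is hypothetical.

```python
import socket

# Standard well-known port for the portmap/rpcbind service (assumed here).
PORTMAPPER_PORT = 111


def port_reachable(host, port=PORTMAPPER_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    A refused connection, a timeout, or a name-resolution failure all
    count as "not reachable" and return False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling port_reachable("tcclnw01") once a minute and recording the result
with a timestamp would show whether the port mapper stops answering before
RMAN fails. A successful connection only shows that the port accepts TCP
connections; it does not prove the media manager's services are registered.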

 

I suspect that RMAN may be utilizing all the CPU under these conditions,
but I don't know for sure.  This backup occurs on the weekend, and I am
not monitoring the system at that time.
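
Because nobody is watching the system during the weekend backup, the CPU
theory could be checked after the fact by logging the load average at a
fixed interval and comparing the timestamps with the RMAN log. The sketch
below is a minimal illustration in Python. It assumes os.getloadavg() is
available on the platform, which may not hold on every AIX build; the
function names and the log path are hypothetical.

```python
import os
import time


def format_sample(timestamp, load):
    """Render one log line: local timestamp plus 1/5/15-minute load averages."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(timestamp))
    return "%s load=%.2f %.2f %.2f" % ((stamp,) + tuple(load))


def log_load(path, interval_seconds=60, samples=10):
    """Append `samples` load-average readings to `path`, one per interval."""
    with open(path, "a") as log:
        for _ in range(samples):
            log.write(format_sample(time.time(), os.getloadavg()) + "\n")
            log.flush()  # keep the file current even if the system hangs later
            time.sleep(interval_seconds)
```

Started shortly before the backup, for example with
log_load("/tmp/load.log", 60, 600), this leaves a record that can be lined
up against the time of the RMAN-03009 failure and the earlier timeout.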

 

Additional Information:

- AIX 5.3
- Oracle 10.2.0.2
- RMAN backup to tape (which you already know because a media management
  layer is involved)
- We rarely (or never) see the timeout from the unrelated process when
  RMAN completes successfully.
- In the past, this unrelated process has also timed out when a runaway
  process caused the entire UNIX system to hang (until it finished).

 

Thanks a lot for any insight you can provide!

 

Sam Bootsma

Oracle Database Administrator

Information Technology Services
George Brown College

Phone: 416-415-5000 x4933
Fax: 416-415-4836
E-mail: sbootsma@xxxxxxxxxxxxxx

 
