Swapping causing RMAN controlfile snapshot to fail?

  • From: "Rich Jesse" <rjoralist3@xxxxxxxxxxxxxxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Wed, 4 Sep 2013 11:22:40 -0500 (CDT)

Hey all,

In 11.2.0.3 under AIX 5.3 TL12, I had a one-time RMAN error where it failed
to snapshot the controlfile after a scheduled archive log backup:

RMAN-03009: failure of Control File and SPFILE Autobackup command on
ORA_DISK_1 channel at 08/03/2013 02:10:10
ORA-01580: error creating control backup file
/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_xxxxx.f
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 22: Invalid argument
Additional information: 4

I opened an SR and the tech saw this in the alert.log:

WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [6.25%] pct of memory swapped out [11.01%].

So, the tech surmised that the RMAN failure was due to swapping ("paging" in
AIX land).  Huh?  That seems to be the opposite of the intent of paging,
which is to keep programs running during memory pressure.  Here's some more
AIX info:

minperm%=5
maxperm%=90
maxclient%=90
lru_file_repage=0

nmon reports the FileSystemCache usually around 12%, PageSpace at about 13%
used (1.7GB of 12.8GB).

A mksysb root VG backup had kicked off when this failure occurred.  That
apparently is the cause of the paging spike, since the spike happens every
time mksysb runs.  And it so happens that the controlfile snapshot is on the
root VG (on purpose!).  So my theory is that RMAN just happened to hit the
controlfile snapshot at the exact same time that mksysb had a hold of the old
one, although I can find no documentation to back up that behavior nor to
discount it.
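
One rough way to check whether the paging warnings really line up with the
mksysb runs would be to pull every such warning out of the alert.log and
compare against the backup schedule.  Below is a minimal Python sketch of the
extraction step only.  The function name find_swap_warnings is hypothetical,
and the pattern assumes the warning always has the same wording as the one
quoted above, which may not hold on other Oracle versions.

```python
import re

# Pattern modeled on the warning text quoted above.  Assumption: the wording
# and bracket layout are the same for every occurrence in the alert.log.
SWAP_RE = re.compile(
    r"pct of memory swapped in \[(?P<pct_in>[\d.]+)%\]\s+"
    r"pct of memory swapped out \[(?P<pct_out>[\d.]+)%\]"
)


def find_swap_warnings(alert_text):
    """Return a list of (swapped_in_pct, swapped_out_pct) tuples,
    one per heavy-swapping warning found in the given text."""
    return [
        (float(m.group("pct_in")), float(m.group("pct_out")))
        for m in SWAP_RE.finditer(alert_text)
    ]


# The two alert.log lines from the SR, used here as sample input.
sample = (
    "WARNING: Heavy swapping observed on system in last 5 mins.\n"
    "pct of memory swapped in [6.25%] pct of memory swapped out [11.01%].\n"
)

print(find_swap_warnings(sample))
```

In practice the text would come from reading the real alert.log, and the
timestamps preceding each warning would still need to be matched by hand (or
with more parsing) against the times mksysb ran.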

Thoughts?

TIA!
Rich

--
//www.freelists.org/webpage/oracle-l

