RE: Redo Log improvement - how to analyze and attack?

  • From: "Bobak, Mark" <Mark.Bobak@xxxxxxxxxxxx>
  • To: "Christopher.Taylor2@xxxxxxxxxxxx" <Christopher.Taylor2@xxxxxxxxxxxx>, "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
  • Date: Thu, 15 Nov 2012 14:32:45 -0500

Hi Chris,

If you're switching logs 300 times per hour, that's 5 log switches per minute, 
or about one log switch every 12 seconds.  I like to target one log switch 
every 15-20 minutes.  So, if we say we want a log switch every 15 minutes, and 
you're switching every 12 seconds, you should look at increasing your log file 
size by a factor of 75x (15x5).  Actually, if this is a 3 node cluster, and the 
300 log switches per hour is aggregate across all 3 nodes, then you probably 
are looking at a factor of 25x.

So, that's the first problem I think you should fix, to eliminate your log file 
switch waits.

Once you've done that, if you have lulls in redo generation, to avoid long 
periods without a log switch, you could consider setting archive_lag_target, to 
perhaps 1800 or so.

Hope that helps,

-Mark


-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On 
Behalf Of Christopher.Taylor2@xxxxxxxxxxxx
Sent: Wednesday, November 14, 2012 4:49 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: Redo Log improvement - how to analyze and attack?

I have this 3-Node RAC (10.2.0.4) that I've inherited (as I've mentioned) and 
after putting out a few issues I'm ready to move onto the redo logs (and 
perhaps I should have started here earlier).
First question:
Are high numbers of log file switches (during certain hours) enough validation 
for that an improvement is needed?

In my case, during heavy batch processing, this RAC cluster may experience 300 
log switches during an hour (I know, I know - sounds bad doesn't it)

(Caveat: The log file switches were generated from within Toad - I'm assuming 
they are valid)

My first step was to enable FAST_START_MTTR_TARGET and it is set for 3 minutes 
(180 seconds).

Here's some of my numbers for the time interval on Node1:

Redo Size:
1,451,121.63 per second

AWR redo/log stats (formatted) here:
https://gist.github.com/4075050

Second question:
V$INSTANCE_RECOVERY OPTIMAL_LOGFILE_SIZE = 5421 (~5 GB)

Is OPTIMAL_LOGFILE_SIZE calculated by the *EXISTING LOGFILE GROUPS*?  So if I 
had more groups, would this number go down?

Third Question:
I've seen various mentions of " archive_lag_target" - what do I need to think 
about regarding this parameter in relation to check points?


Any suggestions/thoughts are appreciated.

Regards,

Chris Taylor
Oracle DBA
Parallon IT&S
christopher.taylor2@xxxxxxxxxxxx<mailto:christopher.taylor2@xxxxxxxxxxxx>
www.parallon.net



--
//www.freelists.org/webpage/oracle-l




--
//www.freelists.org/webpage/oracle-l


Other related posts: