RE: Redo Log improvement - how to analyze and attack?

  • From: "Bobak, Mark" <Mark.Bobak@xxxxxxxxxxxx>
  • To: "Christopher.Taylor2@xxxxxxxxxxxx" <Christopher.Taylor2@xxxxxxxxxxxx>, "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
  • Date: Thu, 15 Nov 2012 15:06:23 -0500

I would look for 'log file switch completion' wait events among the top 5 timed 
events in an AWR report, and/or for significant waits of that type in specific 
sessions being traced for performance.

If there are no log file switch wait events, I wouldn't expect reducing the 
frequency of log file switches to have any impact.

But, with multiple switches per minute, I find it hard to believe that there's 
no performance impact on your system.

Finally, if you're having a hard time seeing the problem in AWR, I'd recommend 
taking some samples at the session level using Tanel's snapper.sql script, or 
enabling 10046 trace on the problem session and doing some profiling.  The data 
is there; you just need to find it. :-)
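
For the 10046 route, a minimal sketch using DBMS_MONITOR (the SID and serial# 
below are placeholders - pull the real values from V$SESSION for the problem 
session):

-- turn on extended SQL trace, including wait events, for one session
begin
  dbms_monitor.session_trace_enable(session_id => 123, serial_num => 4567,
                                    waits => TRUE, binds => FALSE);
end;
/

-- ...let the problem workload run, then switch tracing off again
begin
  dbms_monitor.session_trace_disable(session_id => 123, serial_num => 4567);
end;
/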

-Mark

-----Original Message-----
From: Christopher.Taylor2@xxxxxxxxxxxx 
[mailto:Christopher.Taylor2@xxxxxxxxxxxx] 
Sent: Thursday, November 15, 2012 3:00 PM
To: Bobak, Mark; oracle-l@xxxxxxxxxxxxx
Subject: RE: Redo Log improvement - how to analyze and attack?

Mark - how can a person tell what the impact numbers are for having too many 
log switches?  I know Oracle recommends 3-4 switches an hour, but how can you 
identify that a lot of log switches are impacting performance - what events in 
AWR would reflect a negative impact?

I did already bump my redo logs up to 5GB and set archive_lag_target, and only 
had 3-4 switches last night, but everything finished just like normal - no 
improvement.  That is what I thought was going to happen based on what I was 
seeing in the AWR, but I thought perhaps I just wasn't seeing the issue in the 
AWR.

Chris

-----Original Message-----
From: Bobak, Mark [mailto:Mark.Bobak@xxxxxxxxxxxx] 
Sent: Thursday, November 15, 2012 1:33 PM
To: Taylor Christopher - Nashville; oracle-l@xxxxxxxxxxxxx
Subject: RE: Redo Log improvement - how to analyze and attack?

Hi Chris,

If you're switching logs 300 times per hour, that's 5 log switches per minute, 
or about one log switch every 12 seconds.  I like to target one log switch 
every 15-20 minutes.  So, if we say we want a log switch every 15 minutes, and 
you're switching every 12 seconds, you should look at increasing your log file 
size by a factor of about 75x (15 minutes x 5 switches per minute).  Actually, 
if this is a 3-node cluster, and the 300 log switches per hour is an aggregate 
across all 3 nodes, then each node is only switching about every 36 seconds, 
and you're probably looking at a factor of about 25x.
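
A quick way to see what you're starting from - something like this against 
V$LOG shows the current group size and member count per thread (standard 
columns, but verify against your own environment):

select thread#, group#, bytes/1024/1024 as size_mb, members, status
  from v$log
 order by thread#, group#;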

So, that's the first problem I think you should fix, to eliminate your log file 
switch waits.

Once you've done that, if you have lulls in redo generation, then to avoid long 
periods without a log switch, you could consider setting archive_lag_target to 
perhaps 1800 seconds (30 minutes) or so.
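
Setting it is a one-liner, e.g. (this assumes an spfile and that you want the 
same value on every instance; the value is in seconds):

alter system set archive_lag_target = 1800 scope=both sid='*';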

Hope that helps,

-Mark


-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On 
Behalf Of Christopher.Taylor2@xxxxxxxxxxxx
Sent: Wednesday, November 14, 2012 4:49 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: Redo Log improvement - how to analyze and attack?

I have this 3-node RAC (10.2.0.4) that I've inherited (as I've mentioned), and 
after putting out a few issues I'm ready to move on to the redo logs (and 
perhaps I should have started here earlier).
First question:
Are high numbers of log file switches (during certain hours) enough validation 
that an improvement is needed?

In my case, during heavy batch processing, this RAC cluster may experience 300 
log switches in an hour (I know, I know - sounds bad, doesn't it?).

(Caveat: the log file switch counts were generated from within Toad - I'm 
assuming they are valid.)
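
A count straight out of V$LOG_HISTORY (it has a THREAD# column, so it can be 
split per node) would be an easy cross-check - treat this as a rough sketch:

select trunc(first_time, 'HH24') as hour, count(*) as switches
  from v$log_history
 where first_time > sysdate - 1
 group by trunc(first_time, 'HH24')
 order by 1;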

My first step was to set FAST_START_MTTR_TARGET, and it is set to 3 minutes 
(180 seconds).

Here are some of my numbers for the time interval on Node1:

Redo size:
1,451,121.63 bytes per second

AWR redo/log stats (formatted) here:
https://gist.github.com/4075050

Second question:
V$INSTANCE_RECOVERY OPTIMAL_LOGFILE_SIZE = 5421 (MB, so ~5 GB)

Is OPTIMAL_LOGFILE_SIZE calculated based on the *EXISTING LOGFILE GROUPS*?  So 
if I had more groups, would this number go down?
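
For reference, that figure can be pulled alongside the MTTR columns with 
something like this (OPTIMAL_LOGFILE_SIZE is reported in MB; GV$ gives one row 
per instance on RAC):

select inst_id, target_mttr, estimated_mttr, optimal_logfile_size
  from gv$instance_recovery;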

Third question:
I've seen various mentions of "archive_lag_target" - what do I need to think 
about regarding this parameter in relation to checkpoints?


Any suggestions/thoughts are appreciated.

Regards,

Chris Taylor
Oracle DBA
Parallon IT&S
christopher.taylor2@xxxxxxxxxxxx
www.parallon.net



--
//www.freelists.org/webpage/oracle-l