RE: LOG_BUFFER

From: Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx>
To: 'oracle-l-freelists' <oracle-l@xxxxxxxxxxxxx>
Date: Thu, 19 May 2016 21:19:54 +0000

I may have trouble modelling that on my laptop ;)
If there are any boundary conditions causing odd effects in this area I think
you're probably going to find most of them.

How many concurrent transactions are you seeing at peak times - and do these
transactions all do a very small amount of work (say a couple of rows inserted
or updated before commit) or are they doing work which is more batch like (100+
rows to a transaction).  If you snapshot v$latch_children across a few minutes
of stress are the redo allocation latch sleeps evenly spread, or is there a
suggestion of a pattern. If you snapshot v$session_event  across a few minutes
of stress are the waits for "latch: redo allocation" evenly spread or is there
a suggestion of a pattern.

At this point I'm thinking about NUMA problems - along the lines of some of
your redo allocation latches not being local to the CPUs that are trying to
acquire them, resulting in being beaten by the CPUs where the latches are
local. Wondering if this would show up as a non-uniform distribution pattern of
misses and waits.

The granule size and log buffer sizes are loosely linked - the SGA is split
into fixed memory granules of a uniform size, and at least one granule had to
be reserved for the log buffer(s) - and, if I recall correctly, the "fixed SGA"
(x$ksmfsv) shared the granule.  In many cases a single granule was more than
enough to hold all the log buffer space that the instance required, so (from
10g onwards) people often found that their log buffer was reported as much
bigger than the size they had seen in earlier versions because Oracle basically
expanded the log buffer to fill most of the granule.  In large cases, though,
the log buffer allocation could  take more than a single granule; and since
you've specified a very large log buffer you'll see multiple granules allocated.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: Ram Cheruvattath [ram.cheruvattath@xxxxxxxxx]
Sent: 19 May 2016 14:38
To: Jonathan Lewis; 'oracle-l-freelists'
Subject: Re: LOG_BUFFER

Hi Jonathan

Yes. This is an Oracle RAC 12c database with 4 instances. The database is
approximately 500TB. There are 288 physical cores on each server.
Multithreading is disabled. Each node has 5.5TB of memory. SGA is approximately
4.5TB. Whenever we experience high waits for redo allocation latch, there is no
indication of CPU shortage. I have verified from both AWR and OSW that the
overall CPU idle is around 30-35% when this happens, and that is very
consistent. The LG* processes are running in RT.

Yes. The default granule size would have been 512M. But is is currently 128M
due to explicitly setting _ksmg_granule_size. We plan on removing it from
spfile so it can default.

With the default no. of redo strands (18) I am seeing 10 log writer workers.
However, on our test system, after changing the _max_parallelism_max to 36, I
am seeing 18 workers. Of course, the redo allocation latches have also gone up
to 36.

Finally, I was under the impression that granule size does not apply to
log_buffer. Is that not the case?

Thanks
Ram

From: Jonathan Lewis<mailto:jonathan@xxxxxxxxxxxxxxxxxx>
Sent: Wednesday, May 18, 2016 3:48 PM
To: 'oracle-l-freelists'<mailto:oracle-l@xxxxxxxxxxxxx>
Subject: RE: LOG_BUFFER

Looks like you have a very large SGA (> 64GB), with 512MB granules and two of
them for the log buffer.
Do you really have 288 CPUs, or do you have a smaller number of cores that are
pretending to be multiple CPUs but not doing it very well ?

Waits for redo allocation latch are deceptive - they might be waits for the
public thread latch as private threads are dues to be copied into the public
threads, or they may be waits for the private thread latches because you have
too many processes try to get private redo all the time.

Log file sync > 200ms is suspect and I think it points to CPU starvation (which
is why I ask about the cores/CPUs difference). The log file sync timeout is 10
centiseconds by default (i.e. 100 ms) so processes should have timed out and
checked for log write completion. It's possible that if the write hasn't
completed then the session resumes the same wait rather than starting a new
wait - and I don't know whether or not it does, I suppose I could check it by
suspending the log writer - but if it starts a new wait then a 200 - 300 ms log
file sync wait means the process is going onto the run queue after no more than
100ms but not running for another 100 - 200 ms i.e. CPU starvation.   (In fact,
any time you see things like log file parallel write = quick, log file sync =
much slower, you can suspect CPU starvation - otherwise start checking Frits
Hoogland's blog for other reasons for log writer anomalies under pressure.)

The thinking on cores vs. CPUs: if you have (say) 64 cores claiming to be 288
CPUs you might actually find that you are reporting 17% CPU utilisation (thanks
to the arithmetic CPU count) while suffering CPU starvation (thanks to core
count).

And I've just realised you said gc flush time - which is largely about the
speed for log file syncs triggered by the RAC background processes - so you're
running RAC, which means (even in 12c, I think) no private strands (check
x$kcrfstrand to confirm) which means your latch waits are for the public redo
strands.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: Ram Cheruvattath [ram.cheruvattath@xxxxxxxxx]
Sent: 18 May 2016 18:35
To: Jonathan Lewis; 'oracle-l-freelists'
Subject: Re: LOG_BUFFER

Thanks for the detailed explanation. On this database, the total number of
strands is at 18 (ncpu/16) and each strand is about 56M. We see significant
waits for redo allocation latch when redo generation is at its peak. We are
looking at increasing the strands. We do see high averages for log file sync
(200-300 ms) along with similar numbers for gc cr/current block flush time
during peak redo generation. log file parallel writes are usually under 3-4ms,
and may go up to 20-25ms at times.

IMHO, since the database is in no archive mode anyway we should do as much non
logged operations as possible. But... [Smile emoticon]

Ram

From: Jonathan Lewis<mailto:jonathan@xxxxxxxxxxxxxxxxxx>
Sent: Wednesday, May 18, 2016 11:54 AM
To: 'oracle-l-freelists'<mailto:oracle-l@xxxxxxxxxxxxx>
Subject: RE: LOG_BUFFER

The documentation on the log_buffer parameter has hardly changed since 8i -
although its degree of accuracy has changed significantly.
On my 2 CPU 12c instance at present I see that log_buffer has defaulted to
1.3M, though checking x$kcrfstrand I see this is really 2 public strands of
6.5M each.

The 1MB trigger is correct, though it's not in the documentation - it's easy to
demonstrate, though - and the documentation also doesn't point out that the log
writer will start a new write immediately after it finishes processing a write
if it finds that there is still some log waiting to be written; nor does the
documentation point out that processes can wake themselves up and detect that
their bit of log buffer has been written BEFORE the log writer gets around to
posting them - which is why the old threat of "log buffer too big" is not as
important as it used to be.

The statement of the 1MB being "not useful" claim pre-dates machines with 64
CPUs and a couple of hundred (actually) concurrent sessions, so you can (and
ought to) ignore that. It's actually important in busy systems to have a large
log buffer otherwise lots of sessions can end up waiting on "log buffer space".
The log writer can write asynchronously in multiple chunks of up to 1MB each -
so a "single" write can take some time, during which there may be lots of
sessions still working to fill the log buffer, and that's what a buffer is for.
(12c allows multiple log writer slaves - lg00 to lg99 - to help with very busy
systems rather than depending on async I/O).

1GB sounds big - but if you're not seeing much time lost on "log file sync"
waits (and assuming that CPU overload is not contributing to those waits) and
the typical duration of the log file sync waits is similar to the duration of
the log file parallel writes then its size is not the cause of a problem.

(Standard caveat: I didn't write the specification or code for the Oracle
program - so I may be missing something that I might notice if I had your
system running in front of me.)

If you really wanted to get some idea of whether or not the buffer was much
bigger than it needed to be you could monitor x$kcrfstrand (extremely
frequently) to keep checking how much of the public buffer was in use at any
instant - but I've never felt it necessary to do this so I have no idea whether
trying to do this MIGHT introduce a performance problem.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on behalf
of Ram Cheruvattath [ram.cheruvattath@xxxxxxxxx]
Sent: 18 May 2016 16:18
To: Mark W. Farnham; Hemant-K.Chitale@xxxxxx; 'oracle-l-freelists'
Subject: Re: LOG_BUFFER

Yes. I have not come across anything that says a large log_buffer is an issue
(other than wasting memory if not being used properly). Even the Oracle
documentation says that, though, as per the documentation, a size of more than
1MB is generally not useful. On one of Tom's articles I have read that LGWR
write redo out when it get's to 1MB regardless of other factors. I have not see
that in the documentation (12c) though. I wonder if that is why more than 1M is
not useful, though I would think that the rest of the space can still be used
by DML in progress while LGWR is writing.
Configuring the Redo Log Buffer

Server processes making changes to data blocks in the buffer cache generate
redo data into the log buffer. The log writer process (LGWR) begins writing to
copy entries from the redo log buffer to the online redo log if any of the
following conditions are true:

  *   The redo log buffer becomes at least one-third full

  *   LGWR is posted by a server process performing a COMMIT or ROLLBACK

  *   A database writer process (DBWR) posts LGWR to do so

Sizing the Redo Log Buffer

The default size of the redo log buffer is calculated as follows:

MAX(0.5M, (128K * number of cpus))

On most systems, sizing the log buffer larger than 1 MB does not provide any
performance benefit. However, increasing the size of the redo log buffer does
not have any negative implications on performance; it merely uses more memory.
More importantly, a modestly-sized redo log buffer can significantly enhance
throughput on systems that perform many updates. Applications that insert,
modify, or delete large volumes of data may require a change to the default
size of the redo log buffer.

To determine if the size of the redo log buffer is too small, monitor the redo
log buffer statistics, as described in "Using Redo Log Buffer
Statistics<http://docs.oracle.com/database/121/TGDBA/tune_buffer_cache.htm#GUID-772D48A0-FE2B-4A05-8826-3ADA25F4D7C5>".
You can also check if the log buffer space wait event is a significant factor
in the wait time for the database instance. If it is not, then the log buffer
size is most likely adequately-sized.

http://docs.oracle.com/database/121/TGDBA/tune_buffer_cache.htm#TGDBA556

From: Mark W. Farnham<mailto:mwf@xxxxxxxx>
Sent: Wednesday, May 18, 2016 9:43 AM
To: Hemant-K.Chitale@xxxxxx<mailto:Hemant-K.Chitale@xxxxxx> ;
ram.cheruvattath@xxxxxxxxx<mailto:ram.cheruvattath@xxxxxxxxx> ;
'oracle-l-freelists'<mailto:oracle-l@xxxxxxxxxxxxx>
Subject: RE: LOG_BUFFER

If you never come close to filling the LOG_BUFFER before a flush event, then
the downside is occupying memory you never use.

I have *not observed* Oracle performance degrade due to having a large
LOG_BUFFER if there is no essential better use for the memory such as caching
large frequently referenced look up tables.

I usually safely oversize the LOG_BUFFER so that hitting the buffer one-third
full is not routinely the flushing event, but I’ve never needed as much as you
are currently using.

Once your LOG_BUFFER is big enough so you cannot write one-third of it before
the time triggered flush, I *believe* that constitutes a useful upper size
limit, even if your jobs are all single commit monoliths. (There may be other
useful upper size limits that are smaller, ergo the use of “a” not “the” in the
phrase “a useful upper size limit.”)

*Someone let me know if that is not still at least theoretically correct.*

So while you *might* be “wasting” some memory, I’m not aware of any other harm.
I don’t believe they will let you set it bigger than the program can handle.

mwf

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On ;
Behalf Of Chitale, Hemant K
Sent: Wednesday, May 18, 2016 2:02 AM
To: ram.cheruvattath@xxxxxxxxx; oracle-l-freelists
Subject: RE: LOG_BUFFER

What kind of storage do you write to @4TB/hour ?  How and when do you backup
and purge/delete the archivelogs.

I’ve seen log buffer exceeding 1GB.

Hemant K Chitale

From: oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Ram Cheruvattath
Sent: Wednesday, May 18, 2016 3:00 AM
To: oracle-l-freelists
Subject: LOG_BUFFER

Hi

We have a 12c database that has very high redo generation (4TB/hr). Our
log_buffer is set to 1GB.

Is there a downside to setting log_bugger this big?

Thanks
Ram

This email and any attachments are confidential and may also be privileged. If
you are not the intended recipient, please delete all copies and notify the
sender immediately. You may wish to refer to the incorporation details of
Standard Chartered PLC, Standard Chartered Bank and their subsidiaries at
https://www.sc.com/en/incorporation-details.html

Follow-Ups:
- Re: LOG_BUFFER
  - From: Thiago Maciel

References:
- LOG_BUFFER
  - From: Ram Cheruvattath
- RE: LOG_BUFFER
  - From: Chitale, Hemant K
- RE: LOG_BUFFER
  - From: Mark W. Farnham
- Re: LOG_BUFFER
  - From: Ram Cheruvattath
- RE: LOG_BUFFER
  - From: Jonathan Lewis
- Re: LOG_BUFFER
  - From: Ram Cheruvattath
- RE: LOG_BUFFER
  - From: Jonathan Lewis
- Re: LOG_BUFFER
  - From: Ram Cheruvattath

RE: LOG_BUFFER

Other related posts: