RE: High Buffer busy waits

From: "Mark W. Farnham" <mwf@xxxxxxxx>
To: <jlewisoracle@xxxxxxxxx>, "'Oracle L'" <oracle-l@xxxxxxxxxxxxx>
Date: Thu, 27 Jul 2023 07:14:06 -0400

A useful thing to do is review the total volume of redo log switched through
compared to the number of inserts completed in a given time. This volume times
the number of inserts you attempted divided by the number completed can give
you a reasonable minimum target for redo needed to not wait on it. Round up.

IF your load is peaked and you have some good prediction of when a peak load
will start, you can do a manual log switch and checkpoint shortly prior to the
beginning of the onslaught. The value of doing this diminishes as the number of
log groups rises, but at 10 log groups it gives you up to 10% extra space
before waiting on the incomplete and that checkpoint should complete relatively
quickly during a lull compared to peak.

IF your load is relatively constant this won’t do anything useful. Many large
online redo log groups really only consumes disk volume.

IF there are signs that the archive process is falling behind there are several
ways to improve archive throughput, starting with making sure the archiver is
not starved for cpu and that the archive destination has plenty of available
write throughput capacity disjoint from the resources used by the multiple
concurrent single row inserters. (That includes network and non-Oracle load.)

Also, just curious, are these single row inserts also single row commits?

Good luck.

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On ;
Behalf Of Jonathan Lewis
Sent: Thursday, July 27, 2023 4:06 AM
To: Oracle L
Subject: Re: High Buffer busy waits

As Tim has said, your primary problem is the "checkpoint incomplete" waits. To
avoid waits on the "log file switch (checkpoint incomplete)" you generally need
more redo log.  Tim has suggested making the existing log files larger, but if
you don't like the idea of very large log files you could just add more files
of the same size; so long as you don't have a silly (tiny) size it's the total
volume of the redo log that matters nost not the specific size chosen for the
files.

Since you refer to this as a "sudden" occurrence It is possible that the log
writer was temporarily delayed by an unusual slowdown in the database writer -
so its worth checking if there was some load on your I/O subsystem (possibly
from outside Oracle) that slowed down Oracle's I/O rates.

You'll notice that the buffer busy waits at the top of the list is labelled
"(1st level bmb)" - that about space management blocks, not about competition
on the data blocks, so changing the initrans setting wouldn't make a
difference. The much smaller percentage of buffer busy waits labelled as [data
block] would be the ones that might improve with an increase in initrans, but I
wouldn't mess about with the setting until after I'd checked the effect of
eliminating the checkpoint waits.

An idea that I've not experimented with for systems like yours that have a
large number of concurrent processes doing single row inserts is whether the
size of the extents makes any difference. The way in which Oracle associates
processes with data blocks does change with the size of the extent, and I have
wondered whether the algorithm goes a little further and allocates more level 1
bitmap blocks as the number of inserting processes grows.  You could examine
the block addresses of the bmbs that show up in bbws to see how many different
blocks are involved in a short period of time, this might give us some clues
about how the feature works in extreme cases.

Regards

Jonathan Lewis

On Wed, 26 Jul 2023 at 22:17, yudhi s <learnerdatabase99@xxxxxxxxx> wrote:

Hello Listers,

We suddenly see higher response time for our incoming transactions during
certain time periods. and then tried running tanels DASH_WAIT_CHAIN getting
below wait chains. It's pointing to high 'Buffer  busy waits' and the top sqls
it's pointing to is a single row INSERT query.

And as ist a single row insert , so the plan_hash_value is showing as '0' and
also the Avg number of execution of the Insert query remains same. So wondering
what can be the cause of the issue? Can it be associated with something related
to infrastructure as the final blocker showing as "DB writer"?

or

As the current_obj# in ASH is pointing to table only and it has default
settings i.e.  ini_trans - 1 and max_trans- 255 and we are using ASSM. Should
we consider increasing the value of ini_trans here?

https://github.com/tanelpoder/tpt-oracle/blob/master/ash/dash_wait_chains.sql

%This

SECONDS

AAS

DISTINCT_SIDS

WAIT_CHAIN

FIRST_SEEN

LAST_SEEN

38%

5710

15.9

19

-> USER1:(PROGMGR) buffer busy waits [1st level bmb]  -> USER1:(PROGMGR) log
file switch (checkpoint incomplete)  -> [idle blocker 1,2395,34375
(oracle@XXXXXXXXXXX (DBW0))]

7/25/2023 5:46

7/25/2023 5:46

20%

3050

8.5

166

-> USER1:(PROGMGR) log file switch (checkpoint incomplete)  -> [idle blocker
1,2395,34375 (oracle@XXXXXXXXXXX (DBW0))]

7/25/2023 5:46

7/25/2023 5:46

7%

990

2.8

47

-> USER1:(PROGCAP) ON CPU

7/25/2023 5:46

7/25/2023 5:51

5%

770

2.1

40

-> USER1:(PROGCAP) log file switch (checkpoint incomplete)  -> [idle blocker
1,2395,34375 (oracle@XXXXXXXXXXX (DBW0))]

7/25/2023 5:46

7/25/2023 5:46

5%

700

1.9

16

-> USER1:(PROGMGR) buffer busy waits [data block]  -> USER1:(PROGMGR) log file
switch (checkpoint incomplete)  -> [idle blocker 1,2395,34375
(oracle@XXXXXXXXXXX (DBW0))]

7/25/2023 5:46

7/25/2023 5:46

3%

390

1.1

5

-> USER1:(PROGCAP1) buffer busy waits [data block]  -> USER1:(PROGCAP1) log
file switch (checkpoint incomplete)  -> [idle blocker 1,2395,34375
(oracle@XXXXXXXXXXX (DBW0))]

7/25/2023 5:46

7/25/2023 5:46

2%

330

0.9

5

-> RPT_USER:(JDBC Thin Client) ON CPU

7/25/2023 5:46

7/25/2023 5:51

2%

290

0.8

6

-> ADMIN:(JDBC Thin Client) ON CPU

7/25/2023 5:46

7/25/2023 5:51

2%

270

0.8

22

-> USER1:(PROGMGR) ON CPU

7/25/2023 5:46

7/25/2023 5:51

References:
- High Buffer busy waits
  - From: yudhi s
- Re: High Buffer busy waits
  - From: Jonathan Lewis

RE: High Buffer busy waits

Other related posts: