Hi Jonathan,
Thanks for responding so quickly.
As expected, we have two public threads and no private threads (RAC). The
sizes of log_buffer, Redo Buffers, Fixed Size, and the public threads
documented below are identical on all three nodes.
Looking at v$instance, all nodes have been up for ~10 days (not identical
startup_time, but within a day) with status of OPEN.
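For completeness, here's roughly how I compared the nodes (a sketch against
the gv$ views; output trimmed):

select inst_id, instance_name, status, startup_time
from   gv$instance
order by inst_id;

select inst_id, name, value
from   gv$parameter
where  name in ('log_buffer', 'cpu_count')
order by inst_id, name;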
I can't find any clues in v$archived_log
primary:sys@ohcops1> l
1 select * from (
2 select dest_id, thread#, sequence#, creator, first_time from v$archived_log
3 order by first_time desc, thread# asc, dest_id asc
4 )
5* where rownum <=20
primary:sys@ohcops1> /
   DEST_ID    THREAD#  SEQUENCE# CREATOR FIRST_TIME
---------- ---------- ---------- ------- -------------------
         1          2    1056302 ARCH    08-03-2021 20:34:11
         2          2    1056302 LGWR    08-03-2021 20:34:11
         1          1    1126141 ARCH    08-03-2021 20:34:06
         2          1    1126141 LGWR    08-03-2021 20:34:06
         1          3    1164895 ARCH    08-03-2021 20:33:47
         2          3    1164895 LGWR    08-03-2021 20:33:47
         1          3    1164894 ARCH    08-03-2021 20:32:26
         2          3    1164894 LGWR    08-03-2021 20:32:26
         1          2    1056301 ARCH    08-03-2021 20:32:20
         2          2    1056301 LGWR    08-03-2021 20:32:20
         1          1    1126140 ARCH    08-03-2021 20:32:15
         2          1    1126140 LGWR    08-03-2021 20:32:15
         1          3    1164893 ARCH    08-03-2021 20:30:50
         2          3    1164893 LGWR    08-03-2021 20:30:50
         1          2    1056300 ARCH    08-03-2021 20:30:29
         2          2    1056300 LGWR    08-03-2021 20:30:29
         1          1    1126139 ARCH    08-03-2021 20:30:09
         2          1    1126139 LGWR    08-03-2021 20:30:09
         1          3    1164892 ARCH    08-03-2021 20:29:32
         2          3    1164892 LGWR    08-03-2021 20:29:32
20 rows selected.
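To make the switch timing easier to eyeball, I'm also computing the gap
between consecutive switches per thread with something like this (a sketch;
first_time is a DATE, so the difference is in days and * 86400 converts it
to seconds):

select thread#, sequence#, first_time,
       round((first_time - lag(first_time)
              over (partition by thread# order by first_time)) * 86400) gap_sec
from   v$archived_log
where  dest_id = 1
and    first_time > sysdate - 1
order by thread#, first_time;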
I'll keep looking.
Henry
On Sat, Mar 6, 2021 at 3:41 AM Jonathan Lewis <jlewisoracle@xxxxxxxxx>
wrote:
RAC doesn't use private redo - so with 32 CPUs we'd expect two public
threads with (granule - fixed - overheads)/2 as the log buffer size.
Is this happening on all three instances (redo threads)?
If you report the "first_time" from v$archived_log, can you see a pattern
in the timing of the switches?
Is there any clue in the "creator" from v$archived_log ?
One possible explanation for this (assuming you've discounted all the
usual suspects): Are all three instances active when this happens?
I haven't checked recently, but when an instance is down the active
instances may be "kicking" it (KK lock) on a regular basis to do a log file
switch, so that all instances have archived log files with SCNs that are
reasonably close to each other. Maybe there's a side effect (or timing
issue, or bug) related to this that means the kicking is happening too
frequently and you're not getting through much redo before it happens.
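If you want to catch it in the act, something along these lines might show
the kick (a sketch; it assumes the kick is visible as a 'KK' enqueue in
gv$lock, which I haven't verified on 11.2.0.4):

select inst_id, sid, type, id1, id2, lmode, request, ctime
from   gv$lock
where  type = 'KK';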
Regards
Jonathan Lewis
On Fri, 5 Mar 2021 at 20:39, Henry Poras <henry.poras@xxxxxxxxx> wrote:
I was just rereading this thread and ran into a slightly different case
that I am just starting to dig into. My environment is 11.2.0.4 RAC (3
nodes).
The rough breakdown:
Granule size: 256M (see the v$sgainfo check after this breakdown)
select * from v$sga;
NAME                      VALUE
-------------------- ----------
Fixed Size              2269072
Variable Size        3.9460E+10
Database Buffers     8.8584E+10
Redo Buffers          227807232
cpu_count: 32
log_buffer: 220798976
public threads: 2 (no private threads), ~105M each
primary:sys@ohcops1> l
1 select
2 indx,
3 total_bufs_kcrfa,
4 strand_size_kcrfa,
5 index_kcrf_pvt_strand,
6 space_kcrf_pvt_strand
7 from
8* x$kcrfstrand
primary:sys@ohcops1> /
      INDX TOTAL_BUFS_KCRFA STRAND_SIZE_KCRFA INDEX_KCRF_PVT_STRAND SPACE_KCRF_PVT_STRAND
---------- ---------------- ----------------- --------------------- ---------------------
         0           215624         110399488                     0                     0
         1           215624         110399488                     0                     0
redo logs: 256M
archive logs: 40-43M
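For reference, the granule size at the top of the breakdown can be
confirmed with a quick query (one way, at least):

select name, bytes
from   v$sgainfo
where  name = 'Granule Size';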
So if each public thread takes ~105M, I have ~46M left over. Filling one
thread and switching, as discussed in Jonathan's article, should give me
archive logs of ~105M. Instead, I am getting archive logs of roughly the
rump size left in the redo log (256 - 105 - 105 = 46M).
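Spelling the arithmetic out in bytes, using the strand sizes from
x$kcrfstrand above: 2 * 110,399,488 = 220,798,976, which matches log_buffer
exactly, and the 256M redo logs are 268,435,456 bytes, so the rump is
268,435,456 - 220,798,976 = 47,636,480, i.e. ~45M, which is in line with
the 40-43M archive logs.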
I checked the obvious (not a manual log switch, not archive_lag_target).
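For the record, the archive_lag_target check was along these lines:

select inst_id, name, value
from   gv$parameter
where  name = 'archive_lag_target';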
I'll chime in if I find something interesting. Also wondering if there are
any obvious things I'm missing.
Henry