Jonathan, Sayan,
Based on your suggestions I've made some tests with various LOB sizes, and
I'm sharing the results if someone else might find it useful or if you can
make a comment if you notice anything interesting. I also did some tests
with CACHE lobs, as Sayan sugested, but then focused on NOCACHE lobs only.
I'd like to point out, that when comparing session statistics from the
problematic database when the session was mostly on CPU, with those
gathered in the testcase below, they don't correlate, but that is probably
because in my testcase I couldn't reproduce a scenario where the session
was 90%+ time on CPU, like on the problematic database. The results from
the test below are thus only useful for observing the behaviour when using
different LOB sizes, and not to reproduce/model the issue on the
problematic database.
The commands used to setup the test were described in the previous mail,
the only difference is that the "t_rn" table now holds 10000 rows. The
tests consisted of inserting 10000 rows using "insert /*+append*/ into
t2[@loopback] .. select from t1[@loopback] .." . The CLOB column in the
source table is a multiple of 2048 bytes, so it's not aligned with LOB
chunk boundaries. The figures are for the whole insert of 10000 rows,
hopefully the data will be readable:
1) Both tables accessed locally:
-------------------------------------------------------------------------------------------------------------------------------------------
STATISTIC / CLOB size 2kb 4kb
6kb 8kb 16kb 32kb 64kb 128kb 265kb
-------------------------------------------------------------------------------------------------------------------------------------------
CR blocks created 51 31
31 33 32 51 35 80 248
securefile allocation chunks 10000 10069
10069 10160 10299 10488 10674 10925 11386
securefile allocation bytes 81920000 163840000 163840000
245760000 409600000 737280000 1392640000 2703360000 5406720000
segment dispenser allocations 86 164
164 242 383 550 721 960 1417
segment chunks allocation from disepnser 86 165
165 243 384 552 723 962 1419
undo change vector size 3371308 3639852 3666492
3811712 3964348 4126492 4282812 4551436 5018620
db block gets 15514 26196
26248 36735 57371 98138 178851 340150 691604
consistent gets 19279 29310
36091 34851 54226 54126 53693 54584 58764
----
ENQ: TX - Transaction 361 607
584 792 1153 1623 2108 3042 4288
----
value of v$transaction.used_urec 87 166
166 244 385 553 724 963 1420
2) Both tables accessed over a DB link:
-------------------------------------------------------------------------------------------------------------------------------------------
STATISTIC / CLOB size 2kb 4kb
6kb 8kb 16kb 32kb 64kb 128kb 265kb
-------------------------------------------------------------------------------------------------------------------------------------------
CR blocks created 9065 9190
9190 10060 11211 17316 20642 52130 130922
securefile allocation chunks 10000 10046
10046 10158 20213 30332 50500 90804 161379
securefile allocation bytes 81920000 163840000 163840000
245760000 409600000 737280000 1392640000 2703360000 4999733248
segment dispenser allocations 10000 10014
10014 10026 20022 30030 50045 90061 160130
segment chunks allocation from disepnser 10000 10046
10046 10158 20213 30332 50500 90805 161379
undo change vector size 5621952 6330056 6330008
6914916 9529804 12080408 15101092 25449228 45067204
db block gets 3257052 3361954 3347046
3616426 6361141 9807562 10431427 26261562 63726198
consistent gets 34175 33821
35405 34912 49270 76570 91563 285364 832355
SQL*Net roundtrips to/from dblink 10026 10021
10021 10021 10021 10021 10026 10026 10026
----
ENQ: TX - Transaction 19476 19923
19915 21381 41555 61668 102396 187723 333918
----
value of v$transaction.used_urec 10001 10047
10047 10159 20214 30333 50501 90805 161380
Some observations:
1) with local tables, the "securefile allocation chunks" statistic doesn't
increase much when the size of the LOB increases. It seems Oracle was able
to "*write using maximum sized contiguous allocations on disk in order to
reduce fragmentation*" as stated in the description of the (silently
ignored) CHUNK parameter here:
http://www.oracle.com/technetwork/database/database-technologies/performance/securefilesperformancepaper-130360.pdf
On the other hand, with remote tables, the statistic increases when the LOB
size crosses the 8kb boundary, and it seems Oracle extends LOBs more than
8kb at a time, getting metadata mapping from the dispenser as stated here
on page 6: http://www.vldb.org/pvldb/1/1454170.pdf
I might be wrong, but I infer that from "segment dispenser allocations"
correlating perfectly with "securefile allocation chunks", and the later
means: "*The number of "chunks" returned by the space layer in this
session. In this case a “chunk” is a contiguous series of disk blocks.*" So
every allocation from the dispenser is a "chunk", but the average size of
an allocation is greater than 8kb, as seen if dividing "securefile
allocation bytes" by "securefile allocation chunks".
2) the value of v$transaction.used_urec seems to correlate perfectly with
"segment chunks allocation from disepnser". I haven't checked what's in the
undo blocks like I did in the previous mail. Maybe it's all due to LOB
space management.
3) "securefile allocation bytes" seems quite large compared to the space
actually used for LOBs
4) it's also interesting that "SQL*Net roundtrips to/from dblink" doesn't
increase with larger LOB sizes. SDU size, SEND_BUF_SIZE and RECV_BUF_SIZE
were all default.
Regarding the problematic database, I made a quick flamegraph analysis on
it when wasn't mostly on CPU, and a large portion of the time was spent in
functions which seem to manage LOB space, e.g. kdliAllocChunks,
ktsl_fill_dispenser_main, ktsla_ins_chunk, so like it was mentioned, it
seems that the problem lies in the LOB space management, however the root
cause of the slowdown on the problematic database is (still) unknown. I
also can't explain why so many consistent read copies of blocks have to be
created as the INSERT progresses.
Regards,
Jure Bratina