Re: ASSM freespace management

From: Rich <richa03@xxxxxxxxx>
To: oracle-l@xxxxxxxxxxxxx
Date: Fri, 4 Nov 2011 09:48:21 -0700
Update on this issue:
It appears we cannot repro at will using the same Oracle PID (from
v$process.PID) nor using the same session id (from v$session.SID).

We now have a test server with this issue and we are using guaranteed
flashback to try to repro - the issue continues on our staging system
as before.

From analysis of a complete trace of a problem insert (73minutes,
2.5GB), it appears that Oracle is [physically] reading 18218 distinct
blocks from the LOBINDEX segment 1024 times each interspersed.

The number 1024 is not an accident:
We have event 44951 set to 1024; now, Oracle Note 6376915.8 has
changed to reflect that this event should not be set at 11.2 (rolls
eyes...).

Testing with event 44951 unset shows that ALL inserts into this table
now require 8-11 seconds still generating lots of IO on the LOB index
segment.
However, this time, the trace shows Oracle doing scattered reads as
well as sequential reads.
This appears to me to be an ASSM space search issue.

We have enabled events 10320 at level 3 and 10612 at level 1 as in
Oracle Note INSERT slow on ASSM [ID 1263496.1].
This trace shows scattered reads as well as sequential reads seemingly
at random increasing through the files mostly after:
Enter ktslagsp_main, nbytes: 1  ---------------
WAIT #47064793535120: nam='db file sequential read' ela= 4775 file#=16
block#=521 blocks=1 obj#=25913 tim=1320362345508849
CBK: Reclaim LOB heuristic ---
RECL: fmb_flag:0x44
RECL: reclaim_loops:0
WAIT #47064793535120: nam='db file sequential read' ela= 1224 file#=16
block#=512 blocks=1 obj#=25913 tim=1320362345514576
BFMB: Entry, instance #0 ...
WAIT #47064793535120: nam='Disk file operations I/O' ela= 8
FileOperation=2 fileno=34 filetype=2 obj#=25914 tim=1320362345520225
WAIT #47064793535120: nam='KSV master wait' ela= 255 p1=0 p2=0 p3=0
obj#=25914 tim=1320362345520574
WAIT #47064793535120: nam='KSV master wait' ela= 1375 p1=0 p2=0 p3=0
obj#=25914 tim=1320362345521972
WAIT #47064793535120: nam='ASM file metadata operation' ela= 53
msgop=33 locn=0 p3=0 obj#=25914 tim=1320362345521991
WAIT #47064793535120: nam='db file sequential read' ela= 7418 file#=34
block#=375188 blocks=1 obj#=25914 tim=1320362345529440
WAIT #47064793535120: nam='db file sequential read' ela= 3357 file#=34
block#=374971 blocks=1 obj#=25914 tim=1320362345533202
WAIT #47064793535120: nam='db file sequential read' ela= 3658 file#=34
block#=374869 blocks=1 obj#=25914 tim=1320362345537496
WAIT #47064793535120: nam='db file scattered read' ela= 18337 file#=34
block#=374870 blocks=81 obj#=25914 tim=1320362345557616
WAIT #47064793535120: nam='db file scattered read' ela= 2323 file#=34
block#=375189 blocks=103 obj#=25914 tim=1320362345562191
WAIT #47064793535120: nam='db file scattered read' ela= 1658 file#=34
block#=375114 blocks=74 obj#=25914 tim=1320362345565534
WAIT #47064793535120: nam='db file scattered read' ela= 8933 file#=34
block#=375001 blocks=113 obj#=25914 tim=1320362345576180
WAIT #47064793535120: nam='db file scattered read' ela= 9088 file#=34
block#=374784 blocks=85 obj#=25914 tim=1320362345587021
WAIT #47064793535120: nam='db file scattered read' ela= 383 file#=34
block#=374972 blocks=29 obj#=25914 tim=1320362345588496
WAIT #47064793535120: nam='db file scattered read' ela= 438 file#=34
block#=374951 blocks=20 obj#=25914 tim=1320362345589207
WAIT #47064793535120: nam='db file sequential read' ela= 5340 file#=16
block#=426805 blocks=1 obj#=25914 tim=1320362345595506
WAIT #47064793535120: nam='db file sequential read' ela= 6695 file#=16
block#=426534 blocks=1 obj#=25914 tim=1320362345602830
WAIT #47064793535120: nam='db file scattered read' ela= 10284 file#=16
block#=426535 blocks=83 obj#=25914 tim=1320362345614753
WAIT #47064793535120: nam='db file scattered read' ela= 1689 file#=16
block#=426739 blocks=66 obj#=25914 tim=1320362345617663
...

(obj# 25913 is the LOB index - note that these reads do not seem [to
me] to be reads around buffer cache entries)

At some point, Oracle decides to do only sequential reads at a very
high rate, then responds with:
BFMB: Exit, space reclaimed: 1 ...
RECL: used:4203511 free:409363 rec:1
Scan begin: option: ALL
WAIT #47064793535120: nam='db file sequential read' ela= 4217 file#=33
block#=338945 blocks=1 obj#=25913 tim=1320362368886283
ktspffc sdba: 0x8452c01 sc->snfree 0 ffree:156 nfree: 84 opt:2
Ready to search 11. L1: 0x8455e00
WAIT #47064793535120: nam='db file sequential read' ela= 7683 file#=33
block#=351744 blocks=1 obj#=25913 tim=1320362368894079
ktspfupdst: state change tsn 22, objd 26785, objn 25913, blksz 2048
ktspfupdst:     oldstate 0, newstate 1, delta bytes 2048
ktspfupdst: fdba: 0x08455e00 offset: 24 nblks: 1 newstate: 1
ktspfsrch returned :Success
Exit ktspgsp_main ------

Wow, reclaimed one block in the index segment in about 10 seconds!
Good job, Oracle...

It does some more reading from disk and finally reports:
STAT #47064793535120 id=1 cnt=0 pid=0 pos=1 obj=0 op='LOAD TABLE
CONVENTIONAL  (cr=33779 pr=39707 pw=0 time=23699482 us)'

Oracle Support finally raised bug 13341274, however, it is still in
status "16 - Bug Screening/Triage".
The first Support Analyst was very good at beginning to frame the
issue (to "IO on LOB index segment"), however, he disappeared.
The second Support Analyst wanted to tune the whole instance, did not
follow ODM and had to be redirected to this real issue a number of
times - he was basically no help...

We cannot use Rebuilding LOB freepools [ID 365156.1] (last Modified
29JUL2010) because this is either:
        Not applicable (Applies to Oracle Server - Enterprise Edition -
Version: 9.2.0.1 to 10.2.0.5 - Release: 9.2 to 10.2); we are at
11.2.0.2.
or
        Unacceptable - the LOB segments on our production instances are
currently sized very large and might induce an unacceptable
performance hit.
        The alternatives listed in this Note are unacceptable as these would
require our application to go down.

We are now working toward Bug 8286901: SPACE SEARCH INEFFICIENCY
DURING INDEX SPLIT

Any input from the list is appreciated :)

Rich

On Sat, Oct 29, 2011 at 12:19 AM, Grzegorz Goryszewski
<grzegorzof@xxxxxxxxxx> wrote:
> On 2011-10-28 23:41, Rich wrote:
>> Also, any tracing hints that might be useful (other than the standard
>> 10046, sql_trace, and process level like
>> http://files.e2sn.com/scripts/tpt_public_unixmac.tar.gz)?
>>
>>
>
> Hi,
>  You can use
> alter session set events '10320 trace name context forever, level 3';
> alter session set events '10612 trace name context forever, level 1';
>
> 10320 Enable data layer (kdtgrs) tracing of space management calls"
> 10612  "prints debug information for auto-space managed segments"
>
> But they can produce huge trace files so be prepared .
> Above not tested so be warned :).
> Regards
> GregG
>
>
>
>
>
> ----------------------------------------------------------------
> Dom pod miastem lub mieszkanie w centrum znajdziesz wlasnie tu!
> http://linkint.pl/f2a58
> --
> //www.freelists.org/webpage/oracle-l
>
>
>
--
//www.freelists.org/webpage/oracle-l
Follow-Ups:
- Re: ASSM freespace management
  - From: Sidney Chen
Re: ASSM freespace management

Other related posts: