RE: reduce table fragmentation by design

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: "'LS Cheng'" <exriscer@xxxxxxxxx>
  • Date: Fri, 28 Jan 2011 12:03:36 -0500

If you know the dynamic range of each numeric column you can add up the
possible change maximum in length (basically delta 1 byte for each order of
magnitude change in value is close enough, presuming you're not trying to
save bytes by initializing zeros to null instead of zero). Since I don't
know the variability of your data I cannot calculate an aggressive but safe
percent free for you. Since all your columns are either the unchanging key
values or numbers, it seems likely that a very aggressively small percent
free would be useful for either ASSM or freelists.

 

As for pctused with freelists management (for which they have created the
acronym MSSM meaning Manual Segment Space Management, though there is
nothing manual about it), I'm not sure you would benefit from putting blocks
back on the freelist sooner. 

 

The default is 40 in all releases I'm aware of, and that means blocks become
available to receive new rows (having previously been take off by inserts or
updates eroding free space to less than the specified percent free) when
updates or deletions bring the size utilization below 40 percent. 85 percent
would put blocks much more aggressively back on the free list. Since in your
original post the complaint was that index probes tended to bring back as
many block reads as index probes, this will only make that worse. Since you
purge by time, you might want the blocks to be completely empty before they
go back on the freelist so the rows also age out and are purged together.

 

That would mean a percent used setting so that the smallest possible row you
have being deleted drags you below it. Given your earlier metrics your
overall storage requirements for the table should float at about 52/45ths of
the possible minimum, but rows will be decently clustered by time. Whether
this produces a material benefit is something you would have to test from
time to time.

 

Whether it produces less scatter than ASSM is something you would also have
to test case by case unless you had Oracle's exact algorithm in hand and
knew enough about the skew in your daily average row insertion and command
of the mathematics to make a prediction. I suppose you could check it
retrospectively to see where rows are placed by time and figure out whether
there was likely an advantage to moving to freelists.

 

You've reported that maintenance windows are tough to find; should I
interpret that as you use the application's delete protocol that may be
row-by-row instead?

 

I would think old rows tend to be quiescent, and that by pre-copying the
bulk of the rows (perhaps days 45 old through 5 days old or whatever seems
to a preponderance of updates), then very briefly having an outage to copy
the last 4 days and scan for any changes in the 45 through 5 day window you
could keep the outage quite brief. Then it wouldn't matter much whether you
use ASSM or freelists.

 

Regards,

 

mwf

  _____  

From: LS Cheng [mailto:exriscer@xxxxxxxxx] 
Sent: Friday, January 28, 2011 2:54 AM
To: Mark W. Farnham
Cc: Niall Litchfield; tim@xxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: Re: reduce table fragmentation by design

 

Hi

We actully do the CTAS time to time, it reduces response time x8.

But it is become harder to ask for maintenance window in this application
that is why I am thinking to do what I have mentioned.

The critical queries are querying by timestamp so may be ordering by the
timestamp when insert should be enough. Your last point that non-direct
inserts will still cause fragmentation is true, I do have suggested a once
per month CTAS ordering by timestamp because shrink table does not solve
100% the honeycomb neither.

Lastly dont you think MSSM would be better suited in this case? If I set a
high pctused such as 85 pctfree 10 the block reutilization should be better
than ASSM. With ASSM the timestamp index clustering factor is always the
same as the number of rows if I recall ASSM tends to reduce block contention
by spreading DML in more blocks which tend to cause even more honeycomb
problem.

In the long run partitioning or partitioning view is the way to go.

Thanks

--
LSC

<snip>

 

Other related posts: