RE: Index Rebuilds (aka scan a smaller index)

From: "Mark W. Farnham" <mwf@xxxxxxxx>
To: "Oracle-L" <oracle-l@xxxxxxxxxxxxx>
Date: Tue, 1 Aug 2006 10:01:18 -0400
The bitmap would be a new index on just the type.

I believe index fragmentation has an operational order greater than n. If
that is true, then partitioning would be a superlinear improvement.

Your concerns were valid and I should have been clearer about why I thought
this might help.

Now, let's move on to what l_m_d is being used for.

Usually, l_m_d usage falls into two categories:

1) Documentation for research.
2) Operational trigger to cause additional processing.


Usually usage #1 doesn't cause too much of a problem because the need is
relatively infrequent compared to operational drivers.
In general the volatility of change can be reduced in this case by
quantizing the time keeping unit. (Date truncated to year, month, week,
day).
The coarser the quantization, the less the volatility (presuming the
quantization crosses the mean of the rate of change curve - it is possible
to not win, but you can't lose; for example if you already only change an
individual record on average once a day and you quantize from seconds to
minutes you do not reduce the volatility. If the average change is once a
second and you quantize to day you reduce volatility by a factor of 86400.

Usage #2, however often represents a huge opportunity. If the meaning of
l_m_d is "Hey, I changed! Post process me!" Then the index you actually want
is date_I_changed_since_full_post_processing, which gets set to NULL when
post processing is complete. You then pull from this index to get candidates
to work upon. Since it is highly volatile, it is perversely not a good
candidate for bitmap, but punching to NULL keeps it small. This is a
reasonably neat trick in Oracle, what with the all NULL index values being
dropped, so the total size stays pretty small (again this is subject to the
rates of flux and rate of processing to the point you punch it to NULL.) A
frequent objection is that you lose the datestamp upon which final
processing was complete, but that is a case for an additional column that is
used for documentation, not as an operational trigger. The architecture of
the database structures to maintain these queues is largely a matter of
taste. If I'm not building a replayable, batch contol totalled hard queue
structure partially outside Oracle, I have a marginal preference for a
separate table with just the primary key of the main table and the nullable
column to indicate that work is needed, and possibly a priority rank (though
the priority rank is a relational slippery slope to a whole array of columns
used to decide fine grained processing priority such as promise date,
penalty date, etc. If this is an issue, it is usually best to just have the
key and the date (or an even smaller flag), and leave the AI about who gets
processed first from the candidates thus identified to a sufficiently
complex structure designed for the purpose that you can consider for a small
subset of all rows.)

Additionally there are whole systems for handling operational processing
including at least Oracle triggers, job scheduler, workflow, the e-Business
suite concurrent manger, BPEL, and custom hard queues.

None of these suggestions for working around the problem actually deals with
the problem you are currently observing. I'm not sure there is a reasonable
solution to keeping a highly volatile index as small as possible. Roughly 11
to 64, about a factor of 6, seems a little high to me compared to Bayer tree
mathematics,  but Oracle has considerably "fancied up" from a bare bones
Bayer tree, and l_m_d has that old, hard to handle monotonically increasing
problem. Still, Oracle might be interested in your exact data with a way to
replay it from "good" (11K) to "bad" (64K) so they can see whether they are
missing an opportunity to do even better. Trying to stamp out every excuse
to rebuild as a solution is a useful quest.

There is also a very real possibility that you're using the l_m_d for
something entirely different that none of these suggestions are appropriate
for your case.

Regards,

mwf

-----Original Message-----
From: Stalin [mailto:stalinsk@xxxxxxxxx]
Sent: Monday, July 31, 2006 2:45 AM
To: mwf@xxxxxxxx
Subject: Re: Index Rebuilds (aka scan a smaller index)

Hi Mark,

On 7/30/06, Mark W. Farnham <mwf@xxxxxxxx> wrote:
> Others have already suggested the rownum=1 (or rownum<2) trick if you're
> really just checking for existence.

As you have already seen my earlier replies, they are used for both
processing and to check existence.

> Barring that, isn't this pretty much what bit map indexes were created
for?
> True, you'll have another index, but IF this query is important and
> frequently issued AND response time is important to the business, that is
> one way to make it a lot quicker.
>

Yep i agree that bitmap index would do the trick, but this the most
busiest table in terms of inserts/updates and bitmap index won't fit
in here.

> Alternatively, you could partition by type and at least get some pruning.
>

hmm i think the problem would still be there as every partition of
type would still have l_m_d  index fragmented no?

> If it gets into the level of importance of extreme solutions, you could
map
> your types to numbers putting 'live' low and index "type_id" by itself.
Then
> type_id <= 1 (ie. the lookup of 'live' in the table types table) should be
> pretty quick.
>

yep. separting type from the multi column index would solve this query
as the type is not modified that often as l_m_d. However that leaves
another question of how to slove issues with indexes like l_m_d that
are being updated more often. Is rebuild the only option?

> This is partly presuming that changes to type are less volatile than
changes
> to l_m_d, but even so it should be much smaller. The type being volatile
> would mitigate against the bitmap and partitioning solutions.
>
> Now if this isn't either frequent or very important to the business that
it
> is quick when the question is asked, then you've identified a case of CTD
> (complusive tuning disorder) versus a case for an extreme solution.
>
> Of course you could be exploring this for academic reasons to add to your
> kit bag of solutions. That's always my excuse when I realize I've
succumbed
> to CTD. Oh, wait, I don't have CTD, I'm doing research!
>

Nope, this is not CTD excercise. Looking for solution on a real
production problem.

> Regards,
>
> mwf
>
> -----Original Message-----
> From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx]On
> Behalf Of Stalin
> Sent: Friday, July 28, 2006 7:08 PM
> To: Oracle Discussion List
> Subject: Index Rebuilds
<snip, the list has seen the rest>


--
//www.freelists.org/webpage/oracle-l
RE: Index Rebuilds (aka scan a smaller index)

Other related posts: