Re: Index on status field?

From: Christo Kutrovsky <kutrovsky.oracle@xxxxxxxxx>
To: zoran_martic@xxxxxxxxx
Date: Fri, 8 Apr 2005 09:57:30 -0400
Imagine an ordered list, ordered by YOUR_INDEX_KEY, rowid.

Your index will have:

-- Block 1 --
0, AAAb... 1
0, AAAb... 2
0, AAAb... 3
0, AAAb... 101
0, AAAb... 350
...
-- Block 2 --
0, AAAb... 551
0, AAAb... 800
0, AAAb... 950
0, AAAb... 1001
0, AAAb... 1002
...

As long as you dont leave unprocessed behind, as the entries are
removed, blocks in the begining of the list will be emptied, and thus
reused later on. A block can be reused only if all entries from it are
removed, or a new entry's place is arround entries in that block
(based on your key and rowid)

The approach you have chosen is pretty much optimal. The one thing i
would add is to maybe compress the index, but i am not sure wehter you
will save much at all, given that you field is already number(1).

Does that answer your question ?

On Apr 8, 2005 4:27 AM, Martic Zoran <zoran_martic@xxxxxxxxx> wrote:
> Don,
> 
> Thanks a lot.
> 
> That is exactly my option.
> 
> But again curious about how Oracle hadnling the
> situtation in the index where your are making the old
> values NULL and inserting the same value 0 again and
> again.
> 
> My partitions are around 30M each.
> One index on that stupid modified date is 70% bigger
> then on the insert date suggesting that index may be
> bigger then needed when you are updating to some
> value. This is reasonable to suggest or to have,
> because Oracle cannot make the B*tree the same with
> different spread of values on the date field.
> 
> But now curious if I use only 0 and NULL (for
> processed) how the index going to behave?
> Just curious about what will be with the index blocks
> where I making values NULL (removing entries from the
> index). Which kind of discrepancy is going to happen.
> 
> So again, looking somebody to explain to me what is
> Oracle doing with the same values coming to the index
> while there are many blocks emptied with the UPDATE to
> NULL value?
> 
> Anybody?
> 
> Regards,
> Zoran
> 
> 
> --- Don Granaman <granaman@xxxxxxx> wrote:
> 
> > I have used this technique occasionally since Oracle
> > 6.  It works best when
> > the number of index entries is a very small
> > percentage of the total rows.
> > For example, we have a table now with about 300
> > million rows and a status
> > column which can have one of a few distinct non-null
> > values, each meaning
> > some particular state of "unprocessed" and a null
> > meaning "processed".  The
> > critical process using this is an analyzer that is
> > continually looking for
> > unprocessed rows to process.  After it does its
> > thing, it sets the column to
> > null/  Thus there are typically only a few hundred
> > or a few dozen records
> > with a non-null value.  It is an extremely efficient
> > way to sift through the
> > haystack to find the needles.
> >
> > The index space will be reused since new records are
> > always coming in with
> > identical indexed values as what previously existed.
> >  It does not grow
> > without bound - as would occur, for example, with an
> > index on something like
> > a datetime.
> >
> > I have also used this in tables where the data is
> > "processed" in batch
> > mode - perhaps once per month - where it also worked
> > well.  In this case the
> > number of "dead" index entries is large after the
> > batch processing
> > completes, but the space will be reused after a
> > time.
> >
> > The only drawbacks of which I have experienced are
> > these.
> > *This is a very special-purpose attribute of the
> > data.  You have to be
> > vigilante that nobody gets "creative" and starts
> > using it inappropriately.
> > For example, I once saw a developer (definitely a
> > duhveloper) take this
> > construct and pervert it by recreating the check
> > constraint so he could use
> > a new value - to flag records as "never mind" after
> > his (seriously flawed
> > and untested) code started inserting duplicate
> > records for everything.
> > Rather than fix the core problem, the code, he
> > decided that a "better
> > solution" was to continue inserting duplicate
> > records, then keep them from
> > being processed by having another job running to
> > find the duplicates and set
> > this column to the "never mind" status in all but
> > one in each set of
> > duplicates.  The index became useless and nothing
> > but overhead.  It was
> > discovered rather quickly though since the
> > processing essentially "stopped"
> > (became so inefficient as to always fall further
> > behind).  [Of course, this
> > same developer misdiagnosed the issue as a "locking
> > problem" ;-).  There was
> > no DBA.  I was called back in as a consultant to
> > "fix the locking problem".]
> > *The index is, of course, useless for finding
> > "processed" records.  If you
> > need multiple "processed" status codes and determine
> > that an index on the
> > status would be useful for finding records with
> > particular "processed"
> > values, another approach is preferred.
> >
> > Don Granaman (OraSaurus)
> 
> __________________________________
> Do you Yahoo!?
> Yahoo! Personals - Better first dates. More second dates.
> http://personals.yahoo.com
> 
> --
> //www.freelists.org/webpage/oracle-l
> 


-- 
Christo Kutrovsky
Database/System Administrator
The Pythian Group
--
//www.freelists.org/webpage/oracle-l
Follow-Ups:
- Re: Index on status field?
  - From: Martic Zoran
References:
- Re: Index on status field?
  - From: Don Granaman
- Re: Index on status field?
  - From: Martic Zoran
Re: Index on status field?

Other related posts: