Re: Why does this query need a table access by rowid?

  • From: Tim Gorman <tim@xxxxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 23 Jul 2004 12:23:46 -0600

Because the function-based index stores the result values of a DECODE
expression, not DOC_ID column values, as far as Oracle is concerned.

We humans can eye-ball the DECODE statement and deduce that the index
contains DOC_IDs, but apparently that deduction has not (yet?) been built
into the optimizer.

11 years ago, I had some fun reverse-engineering Oracle's SQL processing
engine using the packages YACC and LEX.  From what little I remember of that
experience, I know I would hate to try to program this seemingly simple
little deduction into a language.  Remember how complex DECODE can get
anyway, and then realize that it might be a function or a column from a view
not a table, etc, etc, etc...

Perhaps if you try rewriting the query so that the SELECT clause contains
"DECODE(DOC_HOLD_STATUS,'Y',DOC_ID,NULL)" instead of just DOC_ID???



on 7/23/04 11:55 AM, Bobak, Mark at Mark.Bobak@xxxxxxxxxxxxxxx wrote:

> Hi,
> 
> I've got a table, DOCUMENTS, with lots of columns (and lots of rows).
> DOC_ID is the PK, and DOC_HOLD_STATUS is another column in the table,
> which is NOT NULL and will always have 'Y' or 'N'.
> 
> Now, I've got a query where, given the DOC_ID, I want to determine if a
> particular document is on hold.  Note that DOCUMENTS contains on the
> order of 170M rows, of which approximately 200k rows are flagged as
> DOC_HOLD_STATUS=3D'Y', or, docs that are on hold.
> 
> So, given the large disparity in the number of docs on hold vs. not on
> hold, I created a function-based index defined as:
> Create DOC_ON_HOLD on
> DOCUMENTS(DECODE(DOC_HOLD_STATUS,'Y',DOC_ID,NULL));
> 
> Now, that gives me a convenient index of just the docs that are on hold.
> (And with only around 200k DOC_IDs, it's *much* smaller than if I just
> had an index on DOC_HOLD_STATUS, which makes it a lot more cache
> friendly.)
> 
> So, finally, I get to the problem.
> When I execute this query:
> SELECT /*+ index(doc doc_on_hold) */ DOC_ID=20
> FROM DOCUMENTS doc=20
> WHERE DECODE(DOC_HOLD_STATUS,'Y',DOC_ID,NULL) =3D :B1;
> 
> I get this execution plan:
> Execution Plan
> ----------------------------------------------------------
>  0      SELECT STATEMENT Optimizer=3DCHOOSE (Cost=3D1 Card=3D1430947
> Bytes=3D10016629)
>  1    0   TABLE ACCESS (BY INDEX ROWID) OF 'DOCUMENTS' (Cost=3D1
> Card=3D1430947 Bytes=3D10016629)
>  2    1     INDEX (RANGE SCAN) OF 'DOC_ON_HOLD' (NON-UNIQUE) (Cost=3D3
> Card=3D1430947)
> 
> So, the index is recognized, and that's great.  My question is, why is
> the TABLE ACCESS BY ROWID required?
> Since only the DOC_ID is in the select list, why the table access?
> 
> Thanks,
> 
> -Mark
> 
> PS  Granted, this is an efficient query, and the elimination of the
> table access amounts to one less consistent get.  However, this is a
> VERY heavily hit query, and reducing 4 or 5 consistent gets to 3 or 4
> could potentially be a big savings on a query that gets executed as much
> as this one does.

----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
put 'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

Other related posts: