RE: CBO - hash join vs nested loops

  • From: "Laimutis Nedzinskas" <lnd@xxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 5 Dec 2005 14:01:11 -0000

That´s a very interesting reading on its own. 
Interestingly 4276 vs 3088 consistent gets and 00:00:01.11 vs 00:00:00.56 
elapsed time is not a negligible difference (40% and 98%) but it is not of 
order of 10 too.

Some questions still remain: 

Dimitre's wrote:
"This is only one of approximately 20 statements that I found on one instance, 
provoking this CBO behavior."

- Should those queries be written such that CBO is forced to take what the 
developer thinks is an optimal plan? Interestingly, here the old good NL-loop 
turns out to be a better plan than hashing, scanning, etc. 

- Is the first thing to do with the Oracle instance that to tune OCI? Should 
high OIC values (the "correct 99.9) be tested immediately? (This is actually 
what I do - set all OLTP or general purpose instances to favour indexes.) 











-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On 
Behalf Of Jonathan Lewis
Sent: 5. desember 2005 11:27
To: oracle-l@xxxxxxxxxxxxx
Subject: Re: CBO - hash join vs nested loops



This is the reply I sent to Dimitre's last email
(and referencing some off-line details).

> So,
> if  I've understood properly:
>
> 1.With apparently correct row source estimate, the CBO chooses not 
> optimal access path.
>
    Correct - but the thing you are calling the 'optimal access path' is
    only the optimal access path for your particular circumstances -
    because you happen to have a particular set of data in a particular
    distribution - AND a particular caching pattern.    

> 2. Setting the OIC to 10, makes it choose the NL access, but not with 
> the "right" driving table.
>

These are the plans you sent me for hinting when OIC = 0 (i.e. the default). 
With your particularly caching effects, the first plan is the optimal plan - 
even though Oracle thinks it will have to do 3,500 physical reads to acquire 
the data (for 1,750 rows from the first table).

The second plan is the one taken when you set OIC = 10.

But the critical problem with these two plans and adjusting OIC is that the 
'correct' value for OIC should probably be 99+ for this particular query. 
(After all, run the nested loop a few times, and all the blocks of the second 
index will probably be cached, even if none were cached to start with).

Setting the OIC to 10 happens to make things worse because it drops the cost of 
the index in the first plan from 2 to 1 - leaving a total cost of about 1,750; 
but it drops the cost of the index in the second plan from 1 to zero - leaving 
the total cost at about 409. This is a danger of tweaking OIC and OICA - they 
can easily push Oracle from one extreme to another because they scale costs 
DOWN, and when you scale things down, the impact of rounding errors is 
exaggerated.

   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3507 Card=1 Bytes=46)
   1    0   SORT (GROUP BY)
   2    1     NESTED LOOPS (Cost=3507 Card=2076 Bytes=95496)
   3    2       INDEX (RANGE SCAN) OF 'SYS_IOT_TOP_14178' (UNIQUE) (Cost=11 
Card=1748 Bytes=33212)
   4    2       INDEX (RANGE SCAN) OF 'XIF02FILTRO_DATI_CATALOGO' (UNIQUE) 
(Cost=2 Card=1 Bytes=27)

   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=424724 Card=1 Bytes=46)
   1    0   SORT (GROUP BY)
   2    1     NESTED LOOPS (Cost=424724 Card=2076 Bytes=95496)
   3    2       INDEX (FAST FULL SCAN) OF 'XIF01FILTRO_DATI_CATALOGO' 
(UNIQUE) (Cost=409 Card=424315 Bytes=11456505)
   4    2       INDEX (UNIQUE SCAN) OF 'SYS_IOT_TOP_14178' (UNIQUE) (Cost=1 
Card=1 Bytes=19)


(As an aside, it would be interesting to know how many blocks there are in 
index 'XIF02FILTRO_DATI_CATALOGO' - if it's less than 3,500 the optimizer ought 
to have a sanity check that limits the cost to the number of the first query to 
(roughly) the actual number of leaf blocks).


> 3. With the hint USE_NL, without ORDERED, it chooses the right driving 
> table. I was trying to trace (event 10053) this execution, even if a 
> hint was present, I was hoping that the reason for choosing the right 
> driving table would appear in the trace file(some optimization has 
> been done, even if there was a hint).
>
> Any suggestion that can help me understand this behaviour and 
> eventually correct it, would be appreciated!
>

Since you are running 9i, you should consider enabled system statistics (CPU 
costing)
If you do that, the CPU cost of the (predicted) 400,000 logical I/Os in the 
bad
nested loop, combined with the increased cost of the index fast full scan 
(multiblock reads
are slower than single block reads) may be sufficient to get you the right 
path by default.


Regards

Jonathan Lewis

http://www.jlcomp.demon.co.uk/faq/ind_faq.html
The Co-operative Oracle Users' FAQ

http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html
Cost Based Oracle: Fundamentals

http://www.jlcomp.demon.co.uk/appearances.html
Public Appearances - schedule updated 29th Nov 2005

----- Original Message ----- 
From: "Laimutis Nedzinskas" <lnd@xxxxxxx>
To: <oracle-l@xxxxxxxxxxxxx>
Sent: Monday, December 05, 2005 10:31 AM
Subject: RE: CBO - hash join vs nested loops


From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On 
Behalf Of Radoulov, Dimitre

>>1.With apparently correct row source estimate, the CBO chooses not
optimal
access path.

>I've already read chapter 5 for the clustering factor, congratulations
for
the great work!

I am interested to here the conclusion: was Oracle CBO correct or not? I see 
the following options:

1. CBO was just wrong
2. Oracle was wrong because of inaccurate statistics
3. Oracle was correct.
4. One interesting option: currently Oracle is correct but in the future(when 
more data comes into the system) it can possibly fail. In theory this case 
should be handled by regular statistics gathering but in practice it depends on 
the answers to 1-3.





--



--


--
//www.freelists.org/webpage/oracle-l


Other related posts: