Re: CBO - hash join vs nested loops

  • From: "Jonathan Lewis" <jonathan@xxxxxxxxxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 5 Dec 2005 11:27:05 -0000


This is the reply I sent to Dimitre's last email (and referencing some off-line details).

So,
if  I've understood properly:

1.With apparently correct row source estimate, the CBO chooses not optimal
access path.

   Correct - but the thing you are calling the 'optimal access path' is
   only the optimal access path for your particular circumstances -
   because you happen to have a particular set of data in a particular
   distribution - AND a particular caching pattern.

2. Setting the OIC to 10, makes it choose the NL access, but not with the
"right" driving table.


These are the plans you sent me for hinting when OIC = 0 (i.e. the default). With your particularly caching effects, the first plan is the optimal plan - even though Oracle thinks it will have to do 3,500 physical reads to acquire the data (for 1,750 rows from the first table).

The second plan is the one taken when you set OIC = 10.

But the critical problem with these two plans and adjusting OIC is that
the 'correct' value for OIC should probably be 99+ for this particular
query. (After all, run the nested loop a few times, and all the blocks
of the second index will probably be cached, even if none were cached
to start with).

Setting the OIC to 10 happens to make things worse because it
drops the cost of the index in the first plan from 2 to 1 - leaving
a total cost of about 1,750; but it drops the cost of the index in the
second plan from 1 to zero - leaving the total cost at about 409.
This is a danger of tweaking OIC and OICA - they can easily
push Oracle from one extreme to another because they scale
costs DOWN, and when you scale things down, the impact of rounding
errors is exaggerated.

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3507 Card=1 Bytes=46)
1 0 SORT (GROUP BY)
2 1 NESTED LOOPS (Cost=3507 Card=2076 Bytes=95496)
3 2 INDEX (RANGE SCAN) OF 'SYS_IOT_TOP_14178' (UNIQUE) (Cost=11 Card=1748 Bytes=33212)
4 2 INDEX (RANGE SCAN) OF 'XIF02FILTRO_DATI_CATALOGO' (UNIQUE) (Cost=2 Card=1 Bytes=27)


0 SELECT STATEMENT Optimizer=CHOOSE (Cost=424724 Card=1 Bytes=46)
1 0 SORT (GROUP BY)
2 1 NESTED LOOPS (Cost=424724 Card=2076 Bytes=95496)
3 2 INDEX (FAST FULL SCAN) OF 'XIF01FILTRO_DATI_CATALOGO' (UNIQUE) (Cost=409 Card=424315 Bytes=11456505)
4 2 INDEX (UNIQUE SCAN) OF 'SYS_IOT_TOP_14178' (UNIQUE) (Cost=1 Card=1 Bytes=19)



(As an aside, it would be interesting to know how many blocks there are in index 'XIF02FILTRO_DATI_CATALOGO' - if it's less than 3,500 the optimizer ought to have a sanity check that limits the cost to the number of the first query to (roughly) the actual number of leaf blocks).


3. With the hint USE_NL, without ORDERED, it chooses the right driving
table. I was trying to trace (event 10053) this execution, even if a hint
was present, I was hoping that the reason for choosing the right driving
table would appear in the trace file(some optimization has been done, even
if there was a hint).

Any suggestion that can help me understand this behaviour and eventually
correct it, would be appreciated!


Since you are running 9i, you should consider enabled system statistics (CPU costing)
If you do that, the CPU cost of the (predicted) 400,000 logical I/Os in the bad
nested loop, combined with the increased cost of the index fast full scan (multiblock reads
are slower than single block reads) may be sufficient to get you the right path by default.



Regards

Jonathan Lewis

http://www.jlcomp.demon.co.uk/faq/ind_faq.html
The Co-operative Oracle Users' FAQ

http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html
Cost Based Oracle: Fundamentals

http://www.jlcomp.demon.co.uk/appearances.html
Public Appearances - schedule updated 29th Nov 2005

----- Original Message ----- From: "Laimutis Nedzinskas" <lnd@xxxxxxx>
To: <oracle-l@xxxxxxxxxxxxx>
Sent: Monday, December 05, 2005 10:31 AM
Subject: RE: CBO - hash join vs nested loops



From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Radoulov, Dimitre

1.With apparently correct row source estimate, the CBO chooses not
optimal
access path.

I've already read chapter 5 for the clustering factor, congratulations
for
the great work!

I am interested to here the conclusion: was Oracle CBO correct or not?
I see the following options:

1. CBO was just wrong
2. Oracle was wrong because of inaccurate statistics
3. Oracle was correct.
4. One interesting option: currently Oracle is correct but in the
future(when more data comes into the system) it can possibly fail. In
theory this case should be handled by regular statistics gathering but
in practice it depends on the answers to 1-3.





--
//www.freelists.org/webpage/oracle-l



--
//www.freelists.org/webpage/oracle-l


Other related posts: