Hi again

@Mladen: Yes, I have histograms, including on the hidden columns.

@Mauro: It does appear that the unpeeked bind variable is involved here. I did a trial in SQL*Plus with bind variables that I set before parsing/running the queries - that worked nicely whether or not my check constraint is in place. As I unfortunately cannot do anything about the fact that my production environment doesn't really get any benefit from bind variable peeking, I'll research this further and see if I hit something similar to Doc ID 4112254.8.

In normal operation this hasn't been a problem before, but then I use very, very few FBI's. If I can put together a reproducible testcase, I'll try it out on version 12 and see - if it's fixed there, I can wait until Easter when we (probably) will upgrade. Until then I'll just have to be careful how I use FBI's in an environment without bind variable peeking...

Thanks, all, for the input. Very helpful to get some pointers on where to look :-)

Regards

Kim Berg Hansen
http://dspsd.blogspot.com
kibeha@xxxxxxxxx
@kibeha


On Tue, Jan 20, 2015 at 3:20 PM, Kim Berg Hansen <kibeha@xxxxxxxxx> wrote:
> Hi, Mauro and Lothar
>
> Thanks for the fast feedback ;-)
>
> @Lothar:
>
> Adding "<expression> is not null" makes no difference at all, neither to
> the cardinality estimates nor to the cost. As I am doing "<expression>
> between something and something", the optimizer knows that this can only
> be true for non-null values, and the FBI contains all non-null values of
> <expression>. It seems the optimizer is smart enough here ;-)
>
> If I use "between :1 and :2" instead of "between upperalphanum(:1) and
> upperalphanum(:2)", the cardinality estimates and IO cost are identical;
> the total cost is a bit lower, reflecting a lower CPU cost.
>
> @Mauro:
>
> Unpeeked binds might be something to look into. Unfortunately this old,
> old ERP application environment parses SQL before populating bind
> variables, so I don't get much help from bind variable peeking. Although
> that doc describes an old bug that should be fixed in version 10.2 (and
> we're running 11.2.0.3.0), I can't rule out that it may be involved.
>
> I can make a 10053 trace, but asking you all to read through such a
> document might be relying on your generosity a bit too much ;-)
> Anyway, I can fix this query with various workarounds, no problem.
> I am mostly a bit worried that the optimizer didn't use the lowest cost,
> so I am looking for any "underlying reasons" that could influence other
> queries of mine that might be performing badly without my knowing it.
>
> I think I'll try the 10053 and at least see for myself if I can spot
> something ;-)
>
>
> Regards
>
> Kim Berg Hansen
>
> http://dspsd.blogspot.com
> kibeha@xxxxxxxxx
> @kibeha
>
>
> On Tue, Jan 20, 2015 at 2:49 PM, Lothar Flatz <l.flatz@xxxxxxxxxx> wrote:
>
>> Hi Kim,
>>
>> First of all, to answer your question definitively, I think we would
>> need a 10053 trace.
>> However, maybe we can get away with a somewhat simpler approach.
>> I would try adding "case dataset when 'DAT' then
>> upperalphanum(eksterntvarenr) end" is not NULL to your query.
>> You can well expect that the optimizer does not understand your function.
>> Therefore it does not know that you are deliberately skipping nulls.
>> It could be that the optimizer thinks the index access is not safe (in
>> terms of missing some rows) and that you are helping it with your FBI
>> index hint.
>> Actually the optimizer should never do an unsafe transformation.
>> Not even hinted. But maybe Jonathan Lewis would know about exceptions.
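To make the shapes being discussed concrete, a rough sketch of the FBI and of the predicate with the suggested IS NOT NULL appended. Only the CASE expression, the function name and the :bind1/:bind2 binds come from the thread; the index name, table aliases, second join column and Tab2 filter columns are placeholders, not the real schema.

    -- The FBI on Tab1 as described further down in the thread
    -- (upperalphanum must be declared DETERMINISTIC for the index to be allowed)
    create index tab1_fbi on tab1
      (case dataset when 'DAT' then upperalphanum(eksterntvarenr) end);

    select t1.*, t2.*
      from tab1 t1
      join tab2 t2
        on t2.dataset = t1.dataset        -- two-column join key; the second column is a guess
       and t2.key2    = t1.key2
     where case t1.dataset when 'DAT' then upperalphanum(t1.eksterntvarenr) end
             between upperalphanum(:bind1) and upperalphanum(:bind2)
       and case t1.dataset when 'DAT' then upperalphanum(t1.eksterntvarenr) end
             is not null                   -- the suggested extra predicate
       and t2.filter_col1 = :b3            -- the filter on two Tab2 columns (placeholders)
       and t2.filter_col2 = :b4;

As Kim reports above, the extra IS NOT NULL turned out to change neither the cardinality estimates nor the cost, since the BETWEEN already excludes NULL values of the expression.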
>>
>> One more suggestion: if you can, avoid upperalphanum(:bind1) and rather
>> send a pre-calculated bind variable. The optimizer's estimate should be
>> better.
>> Did you try cardinality feedback?
>>
>> regards
>>
>> Lothar
>>
>> (BTW: a salted banana is a throwaway of a table row; a banana is a
>> throwaway of an index entry) ;-)
>>
>> On 20.01.2015 at 14:20, Kim Berg Hansen wrote:
>>
>> Hi, fellows.
>>
>> I had a weird little case today, which I'll try to describe simply:
>>
>>
>> Two tables - Tab1 with 4 million rows, Tab2 with 3/4 million rows.
>> Tab1 has a function-based index on an expression: "case dataset when
>> 'DAT' then upperalphanum(eksterntvarenr) end" - upperalphanum is a
>> function returning the uppercase of its input stripped of any whitespace
>> and non-alphanumeric characters. The FBI contains about two hundred
>> thousand of the 4 million rows of Tab1; for the rest the expression is
>> NULL.
>>
>> The query is a simple join between the two tables on a two-column key.
>> There is a predicate on Tab1 on the FBI expression:
>> "case dataset when 'DAT' then upperalphanum(eksterntvarenr) end
>> BETWEEN upperalphanum(:bind1) and upperalphanum(:bind2)"
>> And a filter predicate on two columns of Tab2.
>> The access I want (and normally get) is an index range scan of the FBI
>> on Tab1 and nested loop/index access of Tab2.
>> (The whole purpose of the FBI is to have a small, fast index for this
>> and other similar queries.)
>>
>>
>> I have three versions of the query for testing:
>> Q1: Hinted to use FBI index access on Tab1.
>> Q2: Hinted to use an alternative normal index on Tab1 containing the
>> columns of the FBI expression, where the expression then will be
>> evaluated for all rows.
>> Q3: Unhinted (my normal query.)
>> Apart from the hints, the three queries are identical.
>>
>> Normally they get plans costed like this:
>> Q1 hinted to the FBI gets total Cost=26276.
>> Q2 hinted to the normal index gets total Cost=40473.
>> So the normal index has a higher cost than the FBI.
>> Q3 unhinted picks the lower cost access plan and uses the FBI with total
>> Cost=26676.
>>
>> Then I added a check constraint "check( dataset='DAT' )" on Tab2 on one
>> of the two key columns used for the join.
>> This changed the access plans for the queries - suddenly an (optimizer
>> generated) filter predicate dataset='DAT' appeared on Tab1, as the
>> optimizer now knows, via the check constraint on Tab2 and the join
>> between Tab1 and Tab2, that accessing any Tab1 rows with dataset NOT
>> equal to 'DAT' would be folly, because they would be certain to be
>> "thrown away" when we join to Tab2 on the dataset column. ("Salted
>> banana", as the NoCoug Journal recently called it ;-)
>>
>> When that filter predicate was added, my three test queries got new
>> costs, of course:
>> Q1 hinted to the FBI gets total Cost=24374.
>> Q2 hinted to the normal index gets total Cost=35493.
>> So even with the new filter predicate reducing estimated cardinality
>> (and cost) slightly, the normal index still has a higher cost than the
>> FBI.
>>
>> BUT... Q3 unhinted picks the HIGHER cost access plan and uses the normal
>> index with total Cost=35493 ??
>>
>>
>> I can understand that my check constraint has the side effect of adding
>> a filter predicate.
>> I have also tested dropping the constraint again and instead adding the
>> same filter predicate manually to the queries - it gives the same result
>> (so it is not specifically because there's a check constraint.)
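In concrete terms, the two variants just described - the check constraint on Tab2 versus the same filter written directly into the query - would look roughly like this; the constraint name and the t1 alias are invented for illustration:

    -- Variant 1: check constraint on Tab2. Through the join on the dataset
    -- column the optimizer can then derive the filter dataset='DAT' on Tab1.
    alter table tab2 add constraint tab2_dataset_chk check (dataset = 'DAT');

    -- Variant 2: no constraint; the equivalent filter added by hand to the
    -- query (t1 = alias for Tab1), next to the existing FBI predicate
    ... where t1.dataset = 'DAT'
          and case t1.dataset when 'DAT' then upperalphanum(t1.eksterntvarenr) end
                between upperalphanum(:bind1) and upperalphanum(:bind2)
    ...

Both variants reportedly produce the same plans, so the behaviour is tied to the presence of the extra filter predicate rather than to the constraint itself.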
>>
>> What I canNOT understand is that, with the extra filter predicate in
>> place, the optimizer picks the HIGHER costed of the two access plans?
>>
>>
>> So my question really is:
>>
>> Are there known circumstances where the optimizer does NOT choose the
>> lowest cost, even though the same query with a hint CAN produce a plan
>> with a lower cost?
>>
>> Or is this "buggy" behaviour? (My version is 11.2.0.3.0 EE.)
>>
>>
>> Thanks in advance for any hints I can research ;-)
>>
>>
>> Regards
>>
>>
>> Kim Berg Hansen
>>
>> http://dspsd.blogspot.com
>> kibeha@xxxxxxxxx
>> @kibeha
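For completeness, the SQL*Plus trial Kim mentions at the top of the thread - populating the bind variables before the statement is first parsed, so that bind peeking can actually see the values - could be reproduced along these lines. The bind values are made up, and the 10053 event shown is only the generic way to capture the CBO trace Lothar asked for; the real query and values are not shown in the thread.

    -- Define and populate the binds BEFORE the first (hard) parse of the
    -- query, so the optimizer can peek at them
    variable bind1 varchar2(100)
    variable bind2 varchar2(100)
    exec :bind1 := 'SOMEVALUE1'
    exec :bind2 := 'SOMEVALUE2'

    -- Optionally capture the optimizer trace while the query is parsed
    alter session set events '10053 trace name context forever, level 1';

    -- run the query (or EXPLAIN PLAN FOR it) here; a hard parse is needed
    -- for a new 10053 trace to be written

    alter session set events '10053 trace name context off';

With the binds populated up front, Kim reports the trial "worked nicely" whether or not the check constraint was in place, which points at the unpeeked binds as the trigger for the odd plan choice.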