RE: improve cardinality estimation

From: "Mark W. Farnham" <mwf@xxxxxxxx>
To: "'Laurentiu Oprea'" <laurentiu.oprea06@xxxxxxxxx>
Date: Tue, 27 Oct 2020 14:14:34 -0400

well, first try everything that JL wrote. I WILL NOT create an ordered list of
who knows how best to beat the CBO into sanity, but he would be on anyone’s top
10, INCLUDING the folks writing it.

also, it did seem strange that someone was asking so useful a question with
that so questionable code. You living with already written code is indeed a sad
tale and true. (Lemon Tree).

oh, and sorry I missed the column name change to
trunc(nvl(col_to_date,sysdate+1)),

which you clearly figured out.

Since this table is SO tiny, so *maybe* we can ignore maintenance costs:

I wonder what would happen if you made a function index on:

trunc(col_from_date), trunc(nvl(col_to_date,sysdate+1))

and you dropped/created just before the query at least each day. (You probably
have to instantiate sysdate+1 as a constant to make that an index, unless I’m
misremembering that dynamic values can’t be part of indexes.) I think that
problem right now is that it has no idea from static stats that you are going
to include all those nulls on col_to_date. Sigh, so if you have to change the
column function to create it, that probably won’t match up in the use the
matching function index reference. I have literally forgotten whether there is
a reasonable way to finesse this without changing the code. Good luck to JL,
et. all, I’ll be reading.

I don’t suppose you could actually update col_to_date to one day in the future
(from null) every day and re-calculate the stats. You would have to keep track
of the ones that are actually one day in the future when you did it yesterday
so you would not update them today. Presumably this is all some current versus
future or expired list. A daily update should give you good query stats, but
you would have to check whether it breaks anything in your suite. I’d go 50-50
on that one, but you don’t want my percentage guesses.

Then, of course it would depend on how often had to do that. Possibly it works
out doing it once a day just after midnight.

Good luck. I spend about 80 percent of my career getting crap code in packaged
products to run decently, 80 percent of my career re-writing crap code that
customers did control to run better, and the rest advising folks to not write
crappy code in the first place and running my business. I’m not sure what the
total percent comes out to, but it was enough to occasionally annoy my wife.

mwf

From: Laurentiu Oprea [mailto:laurentiu.oprea06@xxxxxxxxx] ;
Sent: Tuesday, October 27, 2020 1:28 PM
To: Mark W. Farnham
Cc: ORACLE-L
Subject: Re: improve cardinality estimation

Thanks for the answer.

Long story short at this point I need to live with this crappy code.

I found that this low cardinality determines the optimizer to choose for one
particular query a NLJ (duration 1 hour) instead of HJ (duration 1 minute).

Rewriting the where clause like:

where (sysdate between trunc(col_from_date) and trunc(col_to_date) ) or
col_to_date is null

gives perfect cardinality

The question will be: is there a smarter solution to give oracle enough
statistical information so that he can figure out a good cardinality with the
current where clause?

În mar., 27 oct. 2020 la 18:51, Mark W. Farnham <mwf@xxxxxxxx> a scris:

do you mean like:

where

col_from_date is null

or ( trunc(col_from_date) between sysdate and sysdate+1)

it *should* be able to get the number of nulls and a decent number in the range
estimate from the stats.

I would also look into your exact query return desires to see if you can lose
the trunc on col_from_date with an appropriate formulation of sysdate
boundaries…

For a given call, sysdate is a constant, and the optimizer tends to do better
with functions and calculations on the constant rather than the value in the
index or table.

in general I find it less confusing for human readers to write:

<column_value> is between x and y

rather than x is between <column_value> and some_function_on(<column_value>)
that might return y.

even

trunc(nvl(col_to_date,sysdate+1)) is between sysdate and sysdate+1

is more understandable, although that obscures that you want to include all
null values a bit. Essentially that is the difference between writing
“I want all the nulls and all the actual dates from now through the next day”
in place of “I’m going to pretend all the nulls are now plus a day, and I want
all the values from now through now plus a day.”

good luck.

I hope I read all that right and wrote all the right. I need more coffee and
I’m old, rusty, and cranky.

mwf

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On ;
Behalf Of Laurentiu Oprea
Sent: Tuesday, October 27, 2020 11:31 AM
To: ORACLE-L (oracle-l@xxxxxxxxxxxxx)
Subject: improve cardinality estimation

Hello,

I found a query with a very bad execution plan due to low cardinality
evaluation of below where clause

where

sysdate between trunc(col_from_date) and trunc(nvl(col_to_date,sysdate+1))

Is there any way I can improve the estimate (like extended stats, etc) ?

Thanks for your help.

References:
- improve cardinality estimation
  - From: Laurentiu Oprea
- RE: improve cardinality estimation
  - From: Mark W. Farnham
- Re: improve cardinality estimation
  - From: Laurentiu Oprea

RE: improve cardinality estimation

Other related posts: