Re: improve cardinality estimate

From: "mcpeakm@xxxxxxxxxxxxxxxxxxxxxxxxxxx" <mcpeakm@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
To: "ORACLE-L (oracle-l@xxxxxxxxxxxxx)" <oracle-l@xxxxxxxxxxxxx>, Jonathan Lewis <jlewisoracle@xxxxxxxxx>
Date: Thu, 18 Mar 2021 15:09:19 +0000 (UTC)

Jonathan,
Is creating a virtual column and then gathering a histogram on that
different/better than gathering expression statistics via something like:
BEGIN  DBMS_STATS.GATHER_TABLE_STATS(user, 'ORDERS', method_opt => 'FOR ALL
COLUMNS SIZE SKEWONLY FOR COLUMNS ( order_processing_date -
business_processing_date) SIZE SKEWONLY');END;
?

    On Thursday, March 18, 2021, 10:31:16 AM EDT, Jonathan Lewis
<jlewisoracle@xxxxxxxxx> wrote:

Are all these dates date-only, or is there a time component allowed ?Is this
one partition per day.
From what you've said, and assuming YES as the answer to the questions above,
creating a virtual column on (order_processing_date - business_processing_date)
looks like an obvious thing to do. And then create a (frequency) histogram on
the virtual column so that the optimizer can see there are (usually?)  only 3
distinct values and the one you're interested in is rare.

RegardsJonathan Lewis

On Thu, 18 Mar 2021 at 13:52, Laurentiu Oprea <laurentiu.oprea06@xxxxxxxxx>
wrote:

Regarding parallel I`m relaying my affirmation of the fact that I tested with
hint /*+ dynamic_sampling(8) */ (the DS level with parallel hint) and gives the
same estimate.
Yes there is a relation order_procesing_date between business_procesing_date -1
and business_procesing_date + 1 and there is no fooling like null values or
year 1900/9999
The pattern is constant , grouping by partition_date  for the last 7 days it
ranges from 0 to 130k (out of a total of 24M)  rows filtered by condition
order_procesing_date > = business_procesing_date .
Also the columns are not virtual columns (strange expressions)
În joi, 18 mar. 2021 la 15:14, Jonathan Lewis <jlewisoracle@xxxxxxxxx> a scris:

The change for parallel processing is one I'd have to think about - it may be
something to do with the way the predicates are rewritten for the parallel
query slaves - so I'll postpone thinking about that bit.

The questions to ask first of all are:a) what's the business logic.  is there
an approximate relationship between order_processing_date and
business_processing date (e.g. the former is usually about 3 days before the
latter).b) has the optimizer been fooled by a "silly null" effect - like "not
yet processed" = "31-Dec-9999"c) how much does the pattern change over time -
e.g. if the partition date is sysdate - 1 is this a partition where a lot of
rows have not yet had business_processing_date set, but if partition date =
sysdate - 7 then maybe lots of rows now have a business_processing_date set.
RegardsJonathan Lewis

On Thu, 18 Mar 2021 at 12:26, Laurentiu Oprea <laurentiu.oprea06@xxxxxxxxx>
wrote:

Hello,
I recently saw a query with a wrong execution plan and looks to be caused by an
overestimate of a particular where clause. Ex:
select     count(*)from     orderswhere     partition_date = sysdate-1 and
order_procesing_date > = business_procesing_date
the count itself is = 42Kthe count itself without second condition ~ 24Mthe
cardinality estimate ~ 24Mthe cardinality estimate if I add parallel hint: 1M
(this is an interesting one and looks to be related with the fact that parallel
queries get DS level 8)the cardinality estimate  if I rewrite the condition :
order_procesing_date - business_procesing_date  >= interval '0' minute :1.4M
I haven't tried yet but I suspect that if I create extended stats on expression
( order_procesing_date - business_procesing_date) can get me better results
The question is what is the logic behind beter cardinality estimates if I
rewrite the where clause?What will be the best approach for this query?
Thank you.

Follow-Ups:
- Re: improve cardinality estimate
  - From: Jonathan Lewis

References:
- improve cardinality estimate
  - From: Laurentiu Oprea
- Re: improve cardinality estimate
  - From: Jonathan Lewis
- Re: improve cardinality estimate
  - From: Laurentiu Oprea
- Re: improve cardinality estimate
  - From: Jonathan Lewis

Re: improve cardinality estimate

Other related posts: