Re: Incorrect cardinality estimate

I must have used reply instead of reply-all. Since Paul refers to my suggestion (which only he got), here it is


Date: Tue, 23 Jan 2007 11:23:24 -0700
To: paul.baumgartel@xxxxxxxxxxxxxxxxx
From: Wolfgang Breitling <breitliw@xxxxxxxxxxxxx>
Subject: Re: Incorrect cardinality estimate

Paul,

the problem is that 12/27/2006 is not a popular value so the cardinality estimate didn't change much. Height balanced histograms are only good for predicates on popular values (I know, an oversimplification, but not by much). For non-popular values the cardinality of an equality predicate is num_rows * density = 194447369 * 6.2921e-03 = 1223482.29.

Ergo, in order to get the cardinality estimate down you have to change either num_rows or density. I suggest you whack density to 2.5e-6 or even less. Because you did gather the histogram this won't affect any sql which go after popular values. It will affect sql where you have a join on the ods_process_date column.

At 09:41 AM 1/23/2007, you wrote:
Following up on my post from Friday regarding an optimizer plan to use an index join rather than a simple index lookup.

Wolfgang Breitling pointed out that the optimizer estimated that the query would return 1.3M rows. In fact the query returns 492 rows, so I took at look at data value distribution in the predicate column (query is select trans_id from ods_execution where ods_process_date='12/27/2006'). Distribution of ods_process_date is definitely skewed (here are the last few rows of the count of each value):

ODS_PROCESS_DATE        COUNT(*)
-------------------     -------
/16/2006 00:00:00     1544886
11/16/2006 08:53:52           1
11/17/2006 00:00:00     1226408
11/17/2006 15:50:12           1
11/17/2006 16:45:10           1
11/17/2006 16:46:00           1
11/17/2006 16:46:10           1
11/20/2006 00:00:00          12
12/27/2006 00:00:00         492

So, I have created a histogram on the ods_process_date column, via

exec dbms_stats.gather_table_stats('ODS','ODS_EXECUTION',-
> method_opt=>'FOR COLUMNS ODS_PROCESS_DATE SIZE 254',-
> stattab=>'ODS_STATS',statown=>'P_BAUMGA2',statid=>'NoHisto')

Now the puzzle: after flushing the shared pool (to force a re-parse), the optimizer's cardinality estimate, and thus its plan, is not changing. A 10053 trace shows that the optimizer is aware of the histogram:

SINGLE TABLE ACCESS PATH
Column: ODS_PROCES  Col#: 14     Table: ODS_EXECUTION   Alias: ODS_EXECUTION
    NDV: 772       NULLS: 0         DENS: 6.2921e-03
    HEIGHT BALANCED HISTOGRAM: #BKT: 254 #VAL: 143
TABLE: ODS_EXECUTION ORIG CDN: 194447369 ROUNDED CDN: 1223485 CMPTD CDN: 1223485


I am not sure what should be the next step--any suggestions welcome.  Thanks!

Regards

Wolfgang Breitling
Centrex Consulting Corporation
www.centrexcc.com


______________________________________________________________________
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email ______________________________________________________________________
--
http://www.freelists.org/webpage/oracle-l


Other related posts: