Quite correct. I was not aware that a height-balanced histogram was of little use for queries citing unpopular values. A hint would work; per Wolfgang's suggestion, setting the density to a much lower value also yielded an efficient plan. Thanks for everyone for their responses. Paul Baumgartel CREDIT SUISSE Information Technology DBA & Admin - NY, KIGA 1 11 Madison Avenue New York, NY 10010 USA Phone 212.538.1143 paul.baumgartel@xxxxxxxxxxxxxxxxx www.credit-suisse.com -----Original Message----- From: Alberto Dell'Era [mailto:alberto.dellera@xxxxxxxxx] Sent: Tuesday, January 23, 2007 2:39 PM To: paul.baumgartel@xxxxxxxxxxxxxxxxx Cc: oracle-l@xxxxxxxxxxxxx Subject: Re: Incorrect cardinality estimate Given the data distribution, it's unlikely (impossible) that the value "12/27/2006 00:00:00" shows up as *popular* value in the histogram, in which case the CBO will compute the cardinality as density * num_rows (check Jonathan's "Cost Based Oracle", page 172; I remember a paper by Wolfgang saying the same, check his site). Check the density of ODS_PROCESS_DATE, I guess that it will be .006292114 = 1223485 / 194447369 Probably you'll need a good old hint ... On 1/23/07, Baumgartel, Paul <paul.baumgartel@xxxxxxxxxxxxxxxxx> wrote: > Following up on my post from Friday regarding an optimizer plan to use an > index join rather than a simple index lookup. > > Wolfgang Breitling pointed out that the optimizer estimated that the query > would return 1.3M rows. In fact the query returns 492 rows, so I took at > look at data value distribution in the predicate column (query is select > trans_id from ods_execution where ods_process_date='12/27/2006'). > Distribution of ods_process_date is definitely skewed (here are the last few > rows of the count of each value): > > ODS_PROCESS_DATE COUNT(*) > ------------------- ------- > /16/2006 00:00:00 1544886 > 11/16/2006 08:53:52 1 > 11/17/2006 00:00:00 1226408 > 11/17/2006 15:50:12 1 > 11/17/2006 16:45:10 1 > 11/17/2006 16:46:00 1 > 11/17/2006 16:46:10 1 > 11/20/2006 00:00:00 12 > 12/27/2006 00:00:00 492 > > So, I have created a histogram on the ods_process_date column, via > > exec dbms_stats.gather_table_stats('ODS','ODS_EXECUTION',- > > method_opt=>'FOR COLUMNS ODS_PROCESS_DATE SIZE 254',- > > stattab=>'ODS_STATS',statown=>'P_BAUMGA2',statid=>'NoHisto') > > Now the puzzle: after flushing the shared pool (to force a re-parse), the > optimizer's cardinality estimate, and thus its plan, is not changing. A > 10053 trace shows that the optimizer is aware of the histogram: > > SINGLE TABLE ACCESS PATH > Column: ODS_PROCES Col#: 14 Table: ODS_EXECUTION Alias: ODS_EXECUTION > NDV: 772 NULLS: 0 DENS: 6.2921e-03 > HEIGHT BALANCED HISTOGRAM: #BKT: 254 #VAL: 143 > TABLE: ODS_EXECUTION ORIG CDN: 194447369 ROUNDED CDN: 1223485 CMPTD > CDN: 1223485 -- Alberto Dell'Era "Per aspera ad astra" ============================================================================== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ============================================================================== -- //www.freelists.org/webpage/oracle-l