Hi, Richard Foote did some really interesting things to prove that this feature is dangerous. http://richardfoote.files.wordpress.com/2008/01/dbms_stats-method_opt-auto-dangers-demo-version-3.txt -- Mit freundlichem Gruß Martin Klier Senior Oracle Database Administrator ------------------------------------------------------------------------------ Klug GmbH integrierte Systeme Lindenweg 13, D-92552 Teunz Tel.: +49 9671/9216-245 Fax.: +49 9671/9216-112 mailto: martin.klier@xxxxxxxxxx www.klug-is.de ------------------------------------------------------------------------------ Geschäftsführer: Johann Klug, Roman Sorgenfrei Sitz der Gesellschaft: Teunz, USt-ID-Nr. DE175481608, HRB Nr. 2037, Amtsgericht Amberg |------------> | Von: | |------------> >--------------------------------------------------------------------------------------------------------------------------------------------------| |neil kodner <nkodner@xxxxxxxxx> | >--------------------------------------------------------------------------------------------------------------------------------------------------| |------------> | An: | |------------> >--------------------------------------------------------------------------------------------------------------------------------------------------| |oracle-l List <oracle-l@xxxxxxxxxxxxx> | >--------------------------------------------------------------------------------------------------------------------------------------------------| |------------> | Datum: | |------------> >--------------------------------------------------------------------------------------------------------------------------------------------------| |20.11.2009 18:08 | >--------------------------------------------------------------------------------------------------------------------------------------------------| |------------> | Betreff: | |------------> >--------------------------------------------------------------------------------------------------------------------------------------------------| |is SIZE AUTO ever wrong? or am I wrong? | >--------------------------------------------------------------------------------------------------------------------------------------------------| |------------> | Gesendet | | von: | |------------> >--------------------------------------------------------------------------------------------------------------------------------------------------| |oracle-l-bounce@xxxxxxxxxxxxx | >--------------------------------------------------------------------------------------------------------------------------------------------------| Does Oracle ever make the wrong decision when using FOR COLUMN column_name SIZE AUTO? I ask because I have table letter_bin with 4778159 rows, claims with 487842 rows. letter_bin is indexed by both clm_id and prty_id Here are the distributions of the counts of the columns: table claims column clmt_prty_id count min max avg stddev ---------- ---------- ---------- ---------- ---------- 319156 1 29 1.52855657 .939775607 table letter_bin column prty_id count min max avg stddev ---------- ---------- ---------- ---------- ---------- 234326 1 1712371 20.3912157 3542.44185 table letter_bin column clm_id count min max avg stddev ---------- ---------- ---------- ---------- ---------- 470855 1 1424 10.147916 7.88601772 I just now realized that the numbers for letter_bin include NULLs. excluding NULLs, I get table letter_bin column clm_id count min max avg stddev ---------- ---------- ---------- ---------- ---------- 470859 1 1424 10.1464345 7.82825864 table letter_bin column prty_id count min max avg stddev ---------- ---------- ---------- ---------- ---------- 234326 1 42314 13.0835759 188.911964 letter_bin is typically queried by both clm_id and prty_id at the same time. although both columns are NULL able, when the table is queried in a certain fashion, both values are used, one as a literal and one as a result of a join. prty_id is supplied, clm_id is the result of querying by clmt_prty_id I analyzed letter bin using for columns prty_id size auto, and for columns clm_id size auto. both 100% samples, execute separately. When I compute statistics on column prty_id, I get 254 buckets.which I expect. When I compute statistics on clm_id, I get no histogram although I think I should be expecting one. To me, the concept of 'skewed' is somewhat nebulous but I think I ought to get a histogram here. I prefer letter_bin to by clm_id and not by prty_id. I say this because the average number of rows in letter_bin for a clm_id should be 10 with a lower SD an example query would be along the lines of select xyz from letter_bin a,claims b where a.prty_id = 123 and a.clm_id=b.clm_id and b.clmt_prty_id=456. The query typically uses access path of letter_bin-party_id and then join to claims. I would expect it to be the other way around. select count(*) "count",count(distinct &2)"distinct",min(cnt)"min", max (cnt)"max", avg(cnt)"avg", stddev(cnt)"stddev" from ( select /*+ PARALLEL (a 6) */ &2, count(*) cnt from &1 a group by &2 ); Now here's where the plot thickens: This query is executed thousands of times in a batch job. I think it would be more efficient for letter_bin to be accessed by clm_id, and its not. It's getting accessed by prty_id and HJ to claims. Am I being unreasonable, or are my stats no good? -- //www.freelists.org/webpage/oracle-l -- //www.freelists.org/webpage/oracle-l