dbms_stats tries first to get an idea of the volume of the tables, and so does a first sampling, and depending on the result tries to find a suitable sample size. Hence the multiple scans. but beware, the calculation queries/algos keep changing. rgds
Thanks. I assume you meant a 10046 Level 4 trace. I tried a level 12 trace (just because I always regret it when I don't), and it does expose some rather interesting pieces of code, just like Niall promised. I asked a Metalink engineer and haven't heard anything official, yet. Or even unofficial for that matter.
------------------------------ *From:* John Kanagaraj [mailto:john.kanagaraj@xxxxxxx] *Sent:* Monday, June 05, 2006 5:16 PM *To:* Schultz, Charles; oracle-l *Subject:* RE: What are the gory details behind DBMS_STATS.AUTO_SAMPLE_SIZE?
Charles,
I have found some interesting stuff when I turned on a level 4 trace on a DBMS_STATS collection session using AUTO_SAMPLE_SIZE. Unfortunately, I did not document what I saw, but I do remember that it performed multiple scans.... Not a Good Thing (tm).
Regards,
John Kanagaraj <>< DB Soft Inc Phone: 408-970-7002 (W)
http://www.klove.com - Positive, encouraging music 24x7 worldwide
** The opinions and facts contained in this message are entirely mine and do not reflect those of my employer or customers **
------------------------------ *From:* oracle-l-bounce@xxxxxxxxxxxxx [mailto: oracle-l-bounce@xxxxxxxxxxxxx] *On Behalf Of *Schultz, Charles *Sent:* Friday, June 02, 2006 10:52 AM *To:* oracle-l *Subject:* What are the gory details behind DBMS_STATS.AUTO_SAMPLE_SIZE?
I am curious how the algorithms behind DBMS_STATS.AUTO_SAMPLE_SIZE actually pick a sample size (10.2.0.1). We have witnessed a case where the sample size for a particular column was reduced 91.6%, causing the number of distinct values to be inaccurate, further causing a query to choose an FTS rather than an index. As far as I can tell, the skew has not changed that much, if at all, and the overall volume has increased by ~1%.
I am aware of workarounds (compute stats, lock stats, use an outline, etc), but like I said, I am really curious why the algorithm made such a drastic jump in the first place. I could not easily locate an appropriate white paper on metalinks, and I get too many hits on Google. Can anyone help satiate my burning desire to know? =)
*charles schultz* *oracle dba* *aits - adsd* *university of illinois*