Density statistic calculation in case of histogram

Hi,

Please tell me how density is calculated in case of histograms.

I've a table with very skewed data:

scott@ORADB10G> select col_skew, count(*) from tab_skew group by col_skew
order by 1;

  COL_SKEW   COUNT(*)
---------- ----------
         1         10
         2         10
         3         10
         4         10
         5         10
         6         10
         7         10
         8         10
         9         10
        10       9910

10 rows selected.

scott@ORADB10G> exec dbms_stats.delete_table_stats(user,'TAB_SKEW');

PL/SQL procedure successfully completed.

-- Creating width-based histograms, #buckets = #distinct values
scott@ORADB10G> exec dbms_stats.gather_table_stats(user, 'TAB_SKEW',
method_opt=>'FOR COLUMNS COL_SKEW size 10');

PL/SQL procedure successfully completed.

scott@ORADB10G> select * from dba_tab_col_statistics where
table_name='TAB_SKEW' and column_name='COL_SKEW';

OWNER  TABLE_NA COLUMN_NAME          NUM_DISTINCT LOW_VA HIGH_V    DENSITY
NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN HISTOGRAM
------ -------- -------------------- ------------ ------ ------ ----------
---------- ----------- --------- ----------- --- --- -----------
---------------
SCOTT  TAB_SKEW COL_SKEW                       10 C102   C10B
.00005          0          10 06-APR-09       10000 YES NO            3
FREQUENCY

Question: How density is calculated here? Also it's equal to Selectivity of
non-popular values.

scott@ORADB10G> exec dbms_stats.gather_table_stats(user, 'TAB_SKEW',
method_opt=>'FOR COLUMNS COL_SKEW size 5');

PL/SQL procedure successfully completed.

-- Creating Height-based histograms, #buckets < #distinct values
scott@ORADB10G> select * from dba_tab_col_statistics where
table_name='TAB_SKEW' and column_name='COL_SKEW';

OWNER  TABLE_NA COLUMN_NAME          NUM_DISTINCT LOW_VA HIGH_V    DENSITY
NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN HISTOGRAM
------ -------- -------------------- ------------ ------ ------ ----------
---------- ----------- --------- ----------- --- --- -----------
---------------
SCOTT  TAB_SKEW COL_SKEW                       10 C102   C10B
.98209          0           5 06-APR-09       10000 YES NO            3
HEIGHT BALANCED

Question: How density is calculated here and how selectivity will be
estimated for popolar as well as non-popular values?


Any points/advice will be appreciated.

Neeraj.

Other related posts: