Density statistic calculation in case of histogram
- From: Neeraj Bhatia <neeraj.dba@xxxxxxxxx>
- To: oracle-l@xxxxxxxxxxxxx
- Date: Mon, 6 Apr 2009 12:17:30 +0530
Hi,
Please tell me how density is calculated in case of histograms.
I've a table with very skewed data:
scott@ORADB10G> select col_skew, count(*) from tab_skew group by col_skew
order by 1;
COL_SKEW COUNT(*)
---------- ----------
1 10
2 10
3 10
4 10
5 10
6 10
7 10
8 10
9 10
10 9910
10 rows selected.
scott@ORADB10G> exec dbms_stats.delete_table_stats(user,'TAB_SKEW');
PL/SQL procedure successfully completed.
-- Creating width-based histograms, #buckets = #distinct values
scott@ORADB10G> exec dbms_stats.gather_table_stats(user, 'TAB_SKEW',
method_opt=>'FOR COLUMNS COL_SKEW size 10');
PL/SQL procedure successfully completed.
scott@ORADB10G> select * from dba_tab_col_statistics where
table_name='TAB_SKEW' and column_name='COL_SKEW';
OWNER TABLE_NA COLUMN_NAME NUM_DISTINCT LOW_VA HIGH_V DENSITY
NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN HISTOGRAM
------ -------- -------------------- ------------ ------ ------ ----------
---------- ----------- --------- ----------- --- --- -----------
---------------
SCOTT TAB_SKEW COL_SKEW 10 C102 C10B
.00005 0 10 06-APR-09 10000 YES NO 3
FREQUENCY
Question: How density is calculated here? Also it's equal to Selectivity of
non-popular values.
scott@ORADB10G> exec dbms_stats.gather_table_stats(user, 'TAB_SKEW',
method_opt=>'FOR COLUMNS COL_SKEW size 5');
PL/SQL procedure successfully completed.
-- Creating Height-based histograms, #buckets < #distinct values
scott@ORADB10G> select * from dba_tab_col_statistics where
table_name='TAB_SKEW' and column_name='COL_SKEW';
OWNER TABLE_NA COLUMN_NAME NUM_DISTINCT LOW_VA HIGH_V DENSITY
NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN HISTOGRAM
------ -------- -------------------- ------------ ------ ------ ----------
---------- ----------- --------- ----------- --- --- -----------
---------------
SCOTT TAB_SKEW COL_SKEW 10 C102 C10B
.98209 0 5 06-APR-09 10000 YES NO 3
HEIGHT BALANCED
Question: How density is calculated here and how selectivity will be
estimated for popolar as well as non-popular values?
Any points/advice will be appreciated.
Neeraj.
Other related posts: