RE: Creating Histograms

  • From: "Goulet, Dick" <DGoulet@xxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Wed, 21 Jul 2004 14:21:12 -0400

Wolfgang,

        Could you post the link to your wonderful paper/presentation of the CBO 
=
fallacies?  It's right to the point here.

Dick Goulet
Senior Oracle DBA
Oracle Certified 8i DBA

-----Original Message-----
From: Juan Carlos Reyes Pacheco [mailto:jreyes@xxxxxxxxxxxxxxxx]
Sent: Wednesday, July 21, 2004 2:08 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: Re: Creating Histograms


Hi Donald I was in the same process, for the advanced tuning course I'm
doing, t his is what I found, I hope be useful, but what I found, is =
that
CBO is smarter (or bugged) and in more than one test I couldn't get to =
use
the histogram.
Any comment about this will be appreciated.
=20
From my new (and more serious ) paper, not like the previous
1.A Histograms
Histograms could affect nevatively performance, you have to verify they =
have
a positive effect in your system.
To create histograms you execute the following command, the size =
parameter
specifies the number of buckets, depending of the amount of distinct =
values
you give distinct value to bucket.
EXEC DBMS_STATS.GATHER_TABLE_STATS ('scott','emp', METHOD_OPT =3D> 'FOR
COLUMNS SIZE 10 sal');
You can view histogram information with the following views =
DBA_HISTOGRAMS,
DBA_PART_HISTOGRAMS, DBA_SUBPART_HISTOGRAMS, DBA_TAB_COLSTATISTICS
1.A.i Bucket Size
Oracle divides the distinct value in bands, called buckets.
From Documentation
"If the number of frequently occurring distinct values in a column is
relatively small, then set the number of buckets to be greater than that
number. The default number of buckets for a histogram is 75, This =
valuesp
rovides an appropiate level of detail for most data distributions =
however,
because the number of buckets in the histograms, and the data =
distribution,
bot affect a histogram's usefulness, you might need to experiment with
different number of buckets to obtain optimal results."
For example
=20
We will create a test table, gather statistics and create two groups one
small of 3 records and other of 88713 records, we will run a test with =
and
without histograms.
create table test as=20
select * from dba_objects UNION ALL=20
select * from dba_objects UNION ALL=20
select * from dba_objects;
create index idxtest on test (OWNER);
SQL> UPDATE TEST SET OWNER =3D 'XXX' WHERE NOT OWNER =3D 'CACHITO';
SQL> COMMIT;
SQL> SELECT OWNER,COUNT(*) FROM TEST GROUP BY OWNER;
OWNER COUNT(*)
------------------------------ ----------
CACHITO 3
XXX 88713
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=3D>'ADM',TABNAME=3D>'TEST'
CASCADE=3D>TRUE);
=20
SQL> SELECT * FROM TEST WHERE OWNER =3D 'CACHITO';
Execution Plan
----------------------------------------------------------
0
SELECT STATEMENT Optimizer=3DCHOOSE (Cost=3D70 Card=3D44358 =
Bytes=3D4036578)
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=3D70 Card=3D44358 =
Bytes=3D4036578)
2 1
INDEX (RANGE SCAN) OF 'IDXTEST' (NON-UNIQUE) (Cost=3D93 Card=3D44358)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
1213 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
SQL> SELECT * FROM TEST WHERE OWNER =3D 'XXX';
88713 filas seleccionadas.
Execution Plan
----------------------------------------------------------
0
SELECT STATEMENT Optimizer=3DCHOOSE (Cost=3D70 Card=3D44358 =
Bytes=3D4036578)
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=3D70 Card=3D44358 =
Bytes=3D4036578)
2 1
INDEX (RANGE SCAN) OF 'IDXTEST' (NON-UNIQUE) (Cost=3D93 Card=3D44358)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
13132 consistent gets
0 physical reads
0 redo size
4304202 bytes sent via SQL*Net to client
65557 bytes received via SQL*Net from client
5916 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
88713 rows processed
SQL>
SQL> SELECT NUM_BUCKETS FROM DBA_TAB_COL_STATISTICS WHERE OWNER =3D =
'ADM' AND
TABLE_NAME =3D 'TEST' AND COLUMN_NAME =3D 'OWNER';
=20
NUM_BUCKETS
-----------
1
=20
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS ('ADM','TEST', METHOD_OPT =3D> =
'FOR
COLUMNS SIZE 75 OWNER');
Procedimiento PL/SQL terminado correctamente.
SQL> SELECT NUM_BUCKETS FROM DBA_TAB_COL_STATISTICS WHERE OWNER =3D =
'ADM' AND
TABLE_NAME =3D 'TEST' AND COLUMN_NAME =3D 'OWNER';
=20
As you can see there is only one bucket, because there are too few =
values in
the column.
NUM_BUCKETS
-----------
1
=20
Now we will try again
SQL> SELECT * FROM TEST WHERE OWNER =3D 'CACHITO';
Execution Plan
----------------------------------------------------------
0
SELECT STATEMENT Optimizer=3DCHOOSE (Cost=3D2 Card=3D3 Bytes=3D273)
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=3D2 Card=3D3 Bytes=3D273)
2 1
INDEX (RANGE SCAN) OF 'IDXTEST' (NON-UNIQUE) (Cost=3D1 Card=3D3)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
1213 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
=20
SQL> SELECT * FROM TEST WHERE OWNER =3D 'XXX';
Execution Plan
----------------------------------------------------------
0
SELECT STATEMENT Optimizer=3DCHOOSE (Cost=3D140 Card=3D88713 =
Bytes=3D8072883)
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=3D140 Card=3D88713 =
Bytes=3D8072883)
2 1
INDEX (RANGE SCAN) OF 'IDXTEST' (NON-UNIQUE) (Cost=3D186 Card=3D88713)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
13132 consistent gets
0 physical reads
0 redo size
4304202 bytes sent via SQL*Net to client
65557 bytes received via SQL*Net from client
5916 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
88713 rows processed
=20
Maybe this is a error in the CBO, so we will force a full scan in the =
table
=20
SQL> SELECT /*+ FULL(TEST) */ * FROM TEST WHERE OWNER =3D 'XXX';
88713 filas seleccionadas.
Execution Plan
----------------------------------------------------------
0
SELECT STATEMENT Optimizer=3DCHOOSE (Cost=3D188 Card=3D88713 =
Bytes=3D8072883)
1 0
TABLE ACCESS (FULL) OF 'TEST' (Cost=3D188 Card=3D88713 Bytes=3D8072883)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7060 consistent gets
0 physical reads
0 redo size
4304202 bytes sent via SQL*Net to client
65557 bytes received via SQL*Net from client
5916 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
88713 rows processed
=20
But it seems not to be, a full scan cost 40 more, so it seems this is =
not a
good place to use histograms.-
Meanwhile the CBO is becoming smarter, the more test one have to develop =
to
verify the things one do, are really improving the performance.
=20
=20
More about histograms you can read=20
http://www.dba-oracle.com/art_otn_cbo_p4.htm
----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
put 'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------
----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
put 'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

Other related posts: