RE: Poor performance with Histogram
- From: Wolfgang Breitling <breitliw@xxxxxxxxxxxxx>
- To: rgoulet@xxxxxxxxxx
- Date: Tue, 12 Dec 2006 10:02:34 -0700
At 08:17 AM 12/12/2006, Richard J. Goulet wrote:
Shiva,
One of the problems with histograms, especially those that are
not refreshed on a regular basis, is that they have a min and max
value. If your query specifies a value that is outside the bounds
of the histogram then the CBO decides that a full table scan is the
only way to handle the problem.
I am fairly certain that this is not correct:
10.1.0.5> drop table t1;
Table dropped.
10.1.0.5> create table t1 as select rownum id, rpad('*',100,'*')
filler from dual connect by level <= 200000;
Table created.
10.1.0.5> create unique index t1 on t1(id);
Index created.
10.1.0.5> alter table t1 modify(id not null);
Table altered.
10.1.0.5> alter table t1 add constraint t1_pk primary key (id) using index t1;
Table altered.
10.1.0.5> begin
2
dbms_stats.gather_table_stats(user,'t1',estimate_percent=>100,method_opt=>'for
all columns size 1',cascade=>true);
3
dbms_stats.gather_table_stats(user,'t1',estimate_percent=>100,method_opt=>'for
columns size 254 id');
4 end;
5 /
PL/SQL procedure successfully completed.
10.1.0.5> @colstats t1
table column NDV
density nulls lo hi av lg bkts H G U
last analyzed sample
------------------------ --------------------- ----------
------------ ------- ---------------- ---------------- ----- ----- -
- - ---------------- ----------
T1 ID 200,000
5.0000E-06 0 1 200000 5 254 H Y
N 2006-12-12 09:55 200,000
FILLER 1
1.0000E+00 0 **************** **************** 101 1 N Y N
2 rows selected.
10.1.0.5> insert into t1 values(-1,rpad('x',100,'x'));
1 row created.
10.1.0.5> explain plan set statement_id = '001' for
2 delete from t1 where id=-1;
Explained.
10.1.0.5> @xplain 001
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3092449301
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 1 | 5 | 2 (0)| 00:00:01 |
| 1 | DELETE | T1 | | | | |
|* 2 | INDEX UNIQUE SCAN| T1 | 1 | 5 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=(-1))
14 rows selected.
Obviously the PK index is being used to find the id outside the
bounds known by the statistics. Note that you need to force the
creation of the (silly) histogram because "size skewonly" or "size
auto" do not gather histograms for unique columns.
That is not to say that referencing values outside the known bounds
does not present a challenge to the CBO. It may fall back to the
default 5% selectivity for cases where it does not have enough
information, which in turn may cause a full scan to be preferred to
an index access.
Case in point that I've run into. An application (name to
remain anonymous) did a delete on a table where the primary key was
a negative 1 (-1). Now that's great to eliminate odd rows of data
that are not suppose to be there, but there was a trigger on the
table to prevent a negative number from ever entering. CBO decided
that the only way to resolve this statement was to fully scan the
table of 127 million rows. The Band-Aid was to disable the
trigger, force a -1 into the column, refresh the histogram, delete
the -1 row, and re-enable the trigger. Thereafter we NEVER
refreshed that table's histograms which made the app run great,
until the vendor could "fix" the code. Which they did, 6 months later.
Regards
Wolfgang Breitling
Centrex Consulting Corporation
www.centrexcc.com
______________________________________________________________________
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email
______________________________________________________________________
- References:
- RE: Poor performance with Histogram
- From: Richard J. Goulet
Other related posts:
- » Poor performance with Histogram
- » Re: Poor performance with Histogram
- » RE: Poor performance with Histogram
- » Re: Poor performance with Histogram
- » RE: Poor performance with Histogram
- » RE: Poor performance with Histogram
- » RE: Poor performance with Histogram
- » RE: Poor performance with Histogram
- » RE: Poor performance with Histogram
- » RE: Poor performance with Histogram
Shiva,One of the problems with histograms, especially those that are not refreshed on a regular basis, is that they have a min and max value. If your query specifies a value that is outside the bounds of the histogram then the CBO decides that a full table scan is the only way to handle the problem.
Case in point that I've run into. An application (name to remain anonymous) did a delete on a table where the primary key was a negative 1 (-1). Now that's great to eliminate odd rows of data that are not suppose to be there, but there was a trigger on the table to prevent a negative number from ever entering. CBO decided that the only way to resolve this statement was to fully scan the table of 127 million rows. The Band-Aid was to disable the trigger, force a -1 into the column, refresh the histogram, delete the -1 row, and re-enable the trigger. Thereafter we NEVER refreshed that table's histograms which made the app run great, until the vendor could "fix" the code. Which they did, 6 months later.
- RE: Poor performance with Histogram
- From: Richard J. Goulet