Re: full-scan vs index for "small" tables

From: Mladen Gogala <gogala@xxxxxxxxxxxxx>
To: cary.millsap@xxxxxxxxxx
Date: Wed, 28 Jun 2006 17:44:18 -0400

On 06/28/2006 12:00:17 PM, Cary Millsap wrote:
> > ...need stable sql plans.
> 
> But the whole point of the CBO is that execution plan stability is
> inferior to execution plan adaptation to changing circumstances. As
> Jonathan Lewis points out very well, all it takes is the insertion of a
> single row to make the True Best Plan change from one execution to
> another.

Cary, while True Best Plan(TM) can change from one execution to the
next, there are also some problems that are introduced by this:

1) One cannot develop on a smaller machine and expect to have the same 
   execution path on another machine. That makes explain plan statement
   practically useless. The same applies to performance testing. This, in
   turn, complicates development process immensely.
2) Gathering statistics introduces an element of randomness into production
   environment. Some data is loaded into a big table in a disorderly fashion,
   it pushes the clustering factor over the limit and the index is no longer
   used. Tuning is complex, requires a cooperation from development and is 
   generally regarded as an exercise in futility, as some future data change
   will send us all into an overdrive yet again.

> 
> If you truly want plan stability, then you want stored outlines, do you
> not?

Not necessarily. I advise my clients not to gather statistics too frequently.
Usually, I recommend bi-monthly period. I also recommend cheating and locking
statistics. I love setting clustering factor to a low value and then locking
the statistics for the index.

> 
> Certainly, there are two distinct categories where CBO messes up:
> 
>  I. Where it has been misinformed by the data it uses to make decisions.
> II. Where it makes poor decisions based upon truly representative data.
> 
> My experience is that most problems that people think are category II
> problems are really category I problems in disguise. The difference can
> be revealed by inspection of 10053 data.
> 
> I do recognize the existence of category II problems as well. It's just
> that I think they're considerably rarer than most people believe.

I do believe that RBO should have been left as a choice. It is very nice to 
have 
a smart optimizer, as long as I am smarter still and I am able to predict what 
will the optimizer do. In other words, I believe that something like this should
be built into the optimizer:

1) You will not harm the user or allow them to be harmed by inadequate
   performance because of your inactivity.
2) You will always follow orders of a DBA or developers, except when
   that contradicts the 1st law.
3) You will always protect the existence of statistics, except when that
   contradicts the 1st or the 2n law.

Unfortunately, I'm getting old and CBO is getting too complex, a real black
box, practically impossible to predict. Maybe it does sound like Matt Lauer's 
doomsday comedy on the Sci-Fi channel, but I do believe that the optimizer is 
out to get us. Run for the hills!
-- 
Mladen Gogala
http://www.mgogala.com

--
//www.freelists.org/webpage/oracle-l

References:
- RE: full-scan vs index for "small" tables
  - From: Cary Millsap

Re: full-scan vs index for "small" tables

Other related posts: