Re: Which plan is better - what COST really means?

  • From: Mladen Gogala <gogala@xxxxxxxxxxxxx>
  • To: breitliw@xxxxxxxxxxxxx
  • Date: Sat, 15 Jan 2005 06:00:17 +0000

On 01/14/2005 08:52:44 PM, Wolfgang Breitling wrote:
> At 03:43 PM 1/14/2005, Karen Morton wrote:
> >So, if a query has a cost of 2, the optimizer is estimating a 2
> >second/centisecond/microsecond response time when the statement is
> >executed.
> >
> >In what time measurement is the cost?  Seconds, centiseconds,
> >microseconds?
>
> As Jonathan already said, the cost figures which the CBO attaches to
> individual operations, and ultimately the plan, are the estimated
> number of block IO to do the operation / entire plan. Even with cpu
> costing in Oracle 9 and onward, the cpu cost is expressed in block IO
> equivalent. So in theory - again as Jonathan already pointed out - the
> execution time of a sql is the number of blocks that need to be read
> (i.e. the cost) times the time a single block read takes (i.e.
> sreadtim). However, there are many assumptions the CBO has to make
> when it attaches a cost ( = block IO count ) to an operation. Some of
> them I show in my paper "Fallacies of the cost based optimizer". Most
> of them relate to the fact that even the most up-to-date statistics do
> not, and can not without prohibitive overhead, reflect non-uniform
> data distribution or data dependencies. Plus, statistics are
> aggregates and counts; and as Cary tries to hammer into our heads:
> "You can not derive detail from aggregate" and "You can not determine
> how long something took [or how long it will take] by counting how
> often it occurred [or will occur]". On top of that, even if the cost,
> block IO count, was accurate, another assumption the optimizer makes
> is that each theoretically required block IO requires a physical IO.
> Now you can try and correct that with the optimizer_index_caching
> parameter, but then YOU make some assumption about the ratio (here is
> that four letter word) of physical IO to CBO cost. The crux of course
> is that that ratio as well is yet another case of non-uniform data
> distribution: it is different from sql to sql. And even "physical" IO
> (OS system calls to request a datablock) are not created all equal -
> some will take longer than others.
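(As an aside, both of the numbers Wolfgang mentions are easy to look at
yourself. A rough sketch on 9.2 or later, with a made-up table name -
sreadtim comes from system statistics gathered with
DBMS_STATS.GATHER_SYSTEM_STATS, and the COST column of the plan is the
block IO equivalent count:

  -- single block read time, in milliseconds
  SELECT pname, pval1
    FROM sys.aux_stats$
   WHERE sname = 'SYSSTATS_MAIN';

  -- the COST column of the plan is the block IO equivalent count;
  -- my_table is just a placeholder
  EXPLAIN PLAN FOR
    SELECT * FROM my_table WHERE id = 42;

  SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

Multiplying the cost by sreadtim gives the optimizer's idea of the
elapsed time, with all the caveats listed above.)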

Wolfgang, correct me if I am wrong, but the purpose of the optimizer is
not to learn all about the data; the purpose is to construct the best
execution path for the given SQL statement. There is a finite number of
possible access paths, usually including full table scans, index scans
(the last great novelty here was the skip scan) and various join
methods. The optimizer used to assign a cost to every permutation and
then pick the path it considered the cheapest. I suspect that the
version in 10g uses some kind of graph-theory-based algorithm, not
dissimilar to the one described in Dan Tow's great book, to construct
the optimal path and no longer relies solely on permutations.
With that in mind, constructing the best path should not be too costly,
even if the information about the data distribution and densities isn't
perfect. Furthermore, some assumptions are safe to make:
If the system you are controlling is a busy OLTP system with lots of
RAM, it is safe to assume that at least 75% of any given index will be
cached. Also, the density of an index is much greater than the density
of the table, so it is generally safe to make assumptions about the
optimizer_index parameters. Granted, Jonathan has proven that anomalies
are possible and even likely, but those parameters can help with the
vast majority of queries. I mention the OLTP situation because that is
precisely the area where the CBO doesn't fare very well. Most of
today's OLTP systems are designed with the expectation that indexes
will be used if they exist, which was the philosophy of the RBO.
Theoretically speaking, anomalies can be expected and there will be
queries with weird behavior, but setting optimizer_index_caching and
optimizer_index_cost_adj will, generally speaking, be beneficial for
the vast majority of people moving from the RBO to the CBO.
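For the record, "setting" them is nothing more exotic than something
like the following; the values are purely illustrative (roughly
matching the 75% caching assumption above) and should come from your
own system, not from a mailing list:

  -- assume ~75% of index blocks are already cached, and cost
  -- index access at 25% of the default estimate
  ALTER SYSTEM SET optimizer_index_caching = 75 SCOPE = BOTH;
  ALTER SYSTEM SET optimizer_index_cost_adj = 25 SCOPE = BOTH;

  -- or try it out in a single session first
  ALTER SESSION SET optimizer_index_caching = 75;
  ALTER SESSION SET optimizer_index_cost_adj = 25;

(SCOPE = BOTH assumes an spfile; with a pfile, set them there and
bounce, or stick to ALTER SESSION.)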
There are whole theories about the cost based optimizer and its
anomalies. My experience tells me that if the underlying application is
properly constructed, so that the rules of good design are observed,
then gathering statistics and setting those two parameters will come as
close to the proverbial silver bullet as can reasonably be expected.
All this talk of an enormously complicated optimizer just frightens
people, which is ludicrous, because there is much less tuning to do
today than a few years ago, when you were supposed to be mindful of the
order of tables in the FROM clause, or concatenate character variables
with an empty string or add zero to numeric ones in order to prevent
index usage. Furthermore, there is a tool we didn't have before. It's
called "event 10053", described in detail in your very well known
paper, and it helps us look under the hood of the optimizer if we
really need to.
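Turning it on is a one-liner; the statement has to be hard parsed for
anything to be written, and the trace file ends up in user_dump_dest
(my_table is again just a placeholder):

  ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';

  -- run the statement of interest; make it textually unique so it
  -- gets hard parsed rather than picked up from the shared pool
  SELECT /* 10053 demo */ * FROM my_table WHERE id = 42;

  ALTER SESSION SET EVENTS '10053 trace name context off';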
There is no need to theorize about the perfect distribution, what we
can learn from samples, when samples are representative of the whole,
and the like.
The units of the "cost" are like Monopoly money. They help the CBO pick
the best strategy, but other than that, the CBO cost has no meaning
whatsoever. To make a long story short, I think that the story around
the CBO is unnecessarily complicated, as it is quite a bit easier to
use than its predecessor, the RBO. A DBA doesn't have to know the
theory of Lebesgue measure, complete metric spaces and functional
analysis to be able to use the CBO. The two parameters mentioned above
have never given me the slightest reason for concern. And if I ever do
encounter one of Jonathan's anomalies, there are hints, and plenty of
them.
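Just to show what I mean, the two hints I reach for first look
something like this (table and index names are made up, of course):

  -- force the index the CBO refused to use
  SELECT /*+ INDEX(e emp_last_name_idx) */ *
    FROM employees e
   WHERE e.last_name = 'Breitling';

  -- or force the full scan when the index was the wrong choice
  SELECT /*+ FULL(e) */ *
    FROM employees e
   WHERE e.last_name LIKE 'B%';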


--
Mladen Gogala
Oracle DBA


--
//www.freelists.org/webpage/oracle-l
