RE: Which plan is better - what COST really means ?

  • From: Wolfgang Breitling <breitliw@xxxxxxxxxxxxx>
  • To: karen.morton@xxxxxxxxxx
  • Date: Fri, 14 Jan 2005 18:52:44 -0700

At 03:43 PM 1/14/2005, Karen Morton wrote:
>So, if a query has a cost of 2, the optimizer is estimating a 2
>second/centisecond/microsecond response time when the statement is =
>executed. =20
>In what time measurement is the cost?  Seconds, centiseconds, =
>microseconds? =20

As Jonathan already said, the cost figures which the CBO attaches to 
individual operations, and ultimately the plan, are the estimated number of 
block IO to do the operation / entire plan. Even with cpu costing in Oracle 
9 and onward, the cpu cost is expressed in block IO equivalent. So in 
theory - again as Jonathan already pointed out - the execution time of a 
sql is the number of blocks that need to be read (i.e. the cost) times the 
time a single block read takes (i.e. sreadtim). However, there are many 
assumptions the CBO has to make when it attaches a cost ( = block IO count 
) to an operation. Some of them I show in my paper "Fallacies of the cost 
based optimizer". Most of them relate to the fact that even the most 
up-to-date statistics do not, and can not without prohibitive overhead, 
reflect non-uniform data distribution or data dependencies. Plus, 
statistics are aggregates and counts; and as Cary tries to hammer into our 
heads: "You can not derive detail from aggregate" and "You can not 
determine how long something took [or how long it will take] by counting 
how often it occurred [or will occur]". On top of that, even if the cost, 
block IO count, was accurate, another assumption the optimizer makes is 
that each theoretically required block IO requires a physical IO. Now you 
can try and correct that with the optimizer_index_caching parameter, but 
then YOU make some assumption about the ratio (here is that four letter 
word) of physical IO to CBO cost. The crux of course is that that ratio as 
well is yet another case of non-uniform data distribution: it is different 
from sql to sql. And even "physical" IO (OS system calls to request a 
datablock) are not created all equal - some will take longer than others.

>One day someone, somewhere, will agree
>with me that the cost of a query IS and always
>has been the optimizer's estimate of the actual
>run time of a query - with the slightly idiosyncratic
>"assumed time for a single block read" as the unit
>of measure.


Wolfgang Breitling
Centrex Consulting Corporation 


Other related posts: