At 10:00 AM 1/15/2005, you wrote:
>Which type of IO, in your opinion, does the optimizer take as the base type
>for cost estimating? As far as I know, different types of operations (FS,
>IRS, IQS, IFS, FILTER) execute IO differently. I mean that CPU utilization
>differs for a single IO unit across the different operations.
>
>Another question: if cost is IO, how does the optimizer estimate
>sorting/hashing etc. in IO units?

The "cost" in the CBO's calculations is basically the count of the number of blocks that need to be examined in order to get the result.

For example, for a fully qualified primary-key access to a row, one block at each index level needs to be examined, from the root through the branch blocks to the leaf, and then the data block needs to be examined; ergo the cost is "LVLS + 1", the height of the index plus one.

For a full table scan, each block up to the HWM needs to be examined. The CBO is aware of multiblock reads, so the number of blocks is divided by db_file_multiblock_read_count to come up with a basic cost for the FTS. However, the CBO is also aware that not every multiblock read will actually read dfmrc blocks, so it applies a fudge factor to the basic FTS cost: table-scan cost (tsc) = blocks / dfmrc * fudge-factor.

Those were the easy cases: the CBO needed one block or all blocks of the table. If you have a predicate where an unknown number of rows qualify, the CBO must first estimate how many rows will qualify. That is the selectivity of the predicate, and this is where the first assumption comes in: that each of the num_distinct possible predicate values will yield the same num_rows/num_distinct (= FilterFactor = Selectivity) rows - unless you collected a frequency histogram, in which case the CBO has the exact number. Once it has the row estimate, the CBO needs to convert that into a block estimate, its cost "currency".
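The arithmetic so far (LVLS + 1 for a unique index access, blocks/dfmrc with a fudge factor for a FTS, num_rows/num_distinct for the uniform-distribution row estimate) can be sketched like this. This is a toy illustration only: the function names and the fudge-factor value of 0.6 are my own stand-ins, not Oracle's actual internal figures.

```python
import math

def unique_index_cost(index_levels):
    """Fully qualified primary-key access: one block per index level
    (root through branch to leaf) plus the data block: LVLS + 1."""
    return index_levels + 1

def fts_cost(blocks_below_hwm, dfmrc, fudge_factor=0.6):
    """Full table scan: all blocks up to the HWM, read dfmrc at a time,
    scaled down because not every multiblock read achieves dfmrc blocks.
    The 0.6 is an arbitrary illustrative value, not Oracle's."""
    return math.ceil(blocks_below_hwm / (dfmrc * fudge_factor))

def rows_per_value(num_rows, num_distinct):
    """Uniform-distribution assumption: absent a frequency histogram,
    every distinct value is assumed to return num_rows/num_distinct rows."""
    return num_rows / num_distinct

print(unique_index_cost(2))          # 3: two index levels + one data block
print(fts_cost(10000, dfmrc=16))     # 1042 with the assumed fudge factor
print(rows_per_value(100000, 250))   # 400.0 estimated rows per value
```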
If the base access method is a FTS, the cost is the FTS cost as explained above. Otherwise the CBO uses the clustering_factor of the index as a measure of how clustered vs. scattered the rows found via the index are. If they are clustered (a low clustering factor), the row estimate translates into a lower block estimate than if they are scattered. But remember that this is a statistic and therefore an aggregate: the clustering factor measures how many block transitions there are as you progress from one index leaf value to the next. If your predicates don't include all the trailing index columns, the rows you're seeking may be far more scattered than the clustering_factor for the full index indicates - and even worse if you are missing leading index columns (skip scan).

To answer Juri's question about the costs of sorting/hashing: as long as the CBO can assert from the size of the sort/hash that no spill to the temp tablespace is necessary (this is where sort/hash area size play a role), the sort and hash costs seem to be constant at 2 and 1 respectively (from what I see in 9.2). Beyond that, the number of blocks to be written/read (i.e. the cost "currency") gets estimated from the number of sort and merge runs, or hash probes, required, and is then tacked on to the cost.

I hope you appreciate by now how reality can differ from even the best attempts by the CBO at estimating the cost (i.e. single-block IOs) of a query. That is why it is virtually impossible to compare the costs of two plans, even for the same query, even though the cost is by design not a meaningless number: you have to know the details of how those costs were derived. Assume just for a moment you have a single-table SQL statement with a single range predicate which results in an index range scan with cost N. If I set optimizer_index_cost_adj=50, the cost will come out as N/2. Would you assume that just by doing that the query now runs twice as fast?
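To make the clustering-factor translation and the optimizer_index_cost_adj effect concrete, here is a toy sketch. The formula shape follows the classic I/O cost model (blevel + selectivity * leaf_blocks + selectivity * clustering_factor); the statistic values, rounding, and parameter names are my own assumptions and omit many refinements of the real CBO.

```python
import math

def range_scan_cost(blevel, leaf_blocks, clustering_factor,
                    num_rows, est_rows, oica=100):
    """Index range scan cost sketch: oica models optimizer_index_cost_adj
    as a straight percentage scaling of the I/O cost."""
    sel = est_rows / num_rows
    # Leaf-block visits scale with selectivity; table-block visits scale
    # with the clustering factor - a low (clustered) CF yields fewer blocks.
    io_cost = (blevel
               + math.ceil(sel * leaf_blocks)
               + math.ceil(sel * clustering_factor))
    return math.ceil(io_cost * oica / 100)

n = range_scan_cost(blevel=2, leaf_blocks=1000, clustering_factor=50000,
                    num_rows=100000, est_rows=1000)
n_adj = range_scan_cost(blevel=2, leaf_blocks=1000, clustering_factor=50000,
                        num_rows=100000, est_rows=1000, oica=50)
print(n, n_adj)  # 512 256: the reported cost halves; the runtime does not
```

Note that only the reported number changes: halving optimizer_index_cost_adj halves the cost figure without touching a single block the query actually reads.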
Regards

Wolfgang Breitling
Centrex Consulting Corporation
http://www.centrexcc.com
--
//www.freelists.org/webpage/oracle-l