Comparing Cardinality Estimates with Results

  • From: "Larry Elkins" <elkinsl@xxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 15 Nov 2013 16:34:43 -0600

Stephane Faroult, who is having trouble posting to the list right now, asked me 
to pose this question on his behalf:

  I am currently working on an application of hell (Entity/Attribute/Value all 
over the place, unpartitioned tables in the 200
million to 3 billion row range, blanket indexing, cluster factor usually very 
low, hard-coded queries with long lists of
identifiers, cursor_sharing set to force ... and the bulk of the work of course 
done in Java).
In a few cases the optimizer goes south, which is understandable and seems to 
be more or less corrected by more aggressive dynamic
sampling (the application isn't too demanding on Oracle CPU). However, I was 
curious to compare, globally, cardinality estimates
with actual results. I wrote the following query, which is a kind of first 
draft (warning: requires the performance pack, I query
the dba_hist tables), in which I try to collect the "last level" of cardinality 
estimate as reported in the plan (I know, rather
meaningless with aggregates, but that's the best I have come up with so far; 
improvements are welcome) for queries during a precise
interval. I have set three categories, "Spot on" when the order of magnitude is 
the same for the estimate and the actual average,
"Off" when there is one order of magnitude of difference (either way) and "Gone 
South" if it's more than that.
I was rather surprised to discover that queries were more or less equally 
distributed between the three categories. I would be
curious to know how the optimizer performs, cardinality-wise, in other 
environments and what could be considered "reasonably
stable".

with q as
    (select d.dbid,
            i.instance_number,
            s.snap_id,
            round(s.begin_interval_time, 'HH24') snap_hour
     from dba_hist_snapshot s
          cross join
           (select dbid
           from v$database) d
          cross join
           (select instance_number
            from v$instance) i
     where s.dbid = d.dbid
       and s.instance_number = i.instance_number
       and s.begin_interval_time >= to_timestamp('&start_YYYYMMDDHHMI', /*start 
YYYYMMMDDHH */
                                                 'YYYYMMDDHH24MI')
       and s.end_interval_time <= to_timestamp('&end_YYYYMMDDHHMI', /*end 
YYYYMMDDHH */
                                                 'YYYYMMDDHH24MI')) select * 
from (select case actual_cardinality
               when estimated_cardinality then 'Spot on'  
               else case 
                      when abs(round(log(10, greatest(actual_cardinality, 1)
                                             /greatest(estimated_cardinality, 
1)))) < 1 then 'Off'
                      else 'Gone south'
                    end
             end as cardinality_estimate,
             count(*) number_of_plans
      from (select s.dbid,
                   s.sql_id,
                   s.plan_hash_value,
                   round(log(10, greatest(avg(rows_processed_delta), 1))) 
actual_cardinality,
                   round(avg(rows_processed_delta)) as actual
            from q
                 inner join dba_hist_sqlstat s
                    on s.dbid = q.dbid
                   and s.instance_number = q.instance_number
                   and s.snap_id = q.snap_id
            group by s.dbid,
                     s.sql_id,
                     s.plan_hash_value) a
           inner join (select x.sql_id,
                              x.plan_hash_value,
                              round(log(10, greatest(sum(x.cardinality), 1))) 
as estimated_cardinality,
                              sum(x.cardinality) as estimate
                       from (select dbid,
                                    sql_id,
                                    plan_hash_value,
                                    cardinality,
                                    rank() over (partition by dbid,
                                                              plan_hash_value
                                                 order by id) rnk
                             from dba_hist_sql_plan
                             where cardinality is not null) x
                       where x.rnk = 1
                       group by x.sql_id,
                                x.plan_hash_value) b
              on b.sql_id = a.sql_id
             and b.plan_hash_value = a.plan_hash_value
     group by case actual_cardinality
                when estimated_cardinality then 'Spot on'
                else case 
                       when abs(round(log(10, greatest(actual_cardinality, 1)
                                               /greatest(estimated_cardinality, 
1)))) < 1 then 'Off'
                       else 'Gone south'
                     end
               end)
order by case cardinality_estimate
           when 'Spot on' then 1
           when 'Off' then 2
           else 3
         end
/

Larry G. Elkins
elkinsl@xxxxxxxxxxx
Cell: 214.695.8605



--
//www.freelists.org/webpage/oracle-l


Other related posts:

  • » Comparing Cardinality Estimates with Results - Larry Elkins