On 30/09/2008, William Robertson <
william@xxxxxxxxxxxxxxxxxxxx>
wrote:
> I'm not sure how using or not using histograms makes any
fundamental
> difference to your approach.
>
> One thing that comes to mind is the use of date predicates. If the
> optimizer knows the highest sale date you have (for example) is 30
Sept
> 2008, you are fine today and maybe some of next week, but in a few
> months' time any queries for "last week's sales" may get
unrealistically
> low cardinality estimates leading to inappropriate index/nested
loop
> access paths, Cartesian joins and so on.
>
> If you use partitioning and you have date columns as
> partition/subpartition keys you could have even more fun in store.
>
>
> -----Original message-----
> From: Sriram Kumar
> Date: 30/9/08 22:41
>> Hi Folks,
>>
>> We had migrated a OLTP/Batch hybrid 9i RBO application
to 10g
>> and after rounds of tuning the application now is working at
its best.
>> We had experimented with various sampling rates for
statistics for
>> various tables and now I feel we are in a optimal point of
performance
>> where CBO is picking up the right Indexes.
>>
>> 1) We do not use histograms
>> 2) From here on the application volumes are bound increase
proportionately
>> 3) No new tables would be added in production.
>> 4) As of now, We have locked the schema stats for the
application
>> schemas. The default scheduler job collects stats for all
other schemas
>>
>> Since the increase in volume of data is quite proportional to
the
>> existing data, what would be the compelling reason to
periodically
>> recollect statistics for the tables that already have good
execution
>> plans?
>>
>> e.g lets say that there is a table with million records and
with few
>> indexes. We have collected the stats for this table and
indexes with
>> million records and all the access paths to this table has been
>> validated that the CBO is picking up the correct optimal index
in
>> every case.
>>
>> Now the volume of the table increases to 2 million and from
>> application point of view, the access paths that were valid
for 1
>> million would be the valid for 2 million as well.
>>
>> Given this scenario, is it required to collect statistics for 2
>> million as well?. I opine that it would not be required but
some of my
>> colleagues feel that we should collect statistics periodically
but I
>> am not able to get a valid reason for periodically
recollecting the
>> stats?. Is this one of the myths?
>>
>> I would agree to periodically recollect stats if we are using
>> histograms but we are not using histograms. Any other reason
that
>> would need a periodic statistic gathering?