Re: Usage of the dbms_stats.gather_database_stats() procedure

  • From: Robyn <robyn.sands@xxxxxxxxx>
  • To: sorr@xxxxxxxxxxxx
  • Date: Mon, 13 Feb 2006 12:09:39 -0500

I had to do something similar with a large data warehouse and chose to use
a series of scripts to gather schema statistics.  I used the cascade option and
ran the scripts in subsets over several weeks to minimize the impact on the
users.  For a few of the larger schemas, I broke it down further to gather
table statistics and spread those out across several separate runs as well.
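
As a rough sketch, each of those scripts looked something like this (the
schema names, estimate percentage, and method_opt below are placeholders,
not the exact values I used):

begin
  for s in (select username
              from dba_users
             where username in ('SCHEMA_A', 'SCHEMA_B', 'SCHEMA_C')) loop
    dbms_stats.gather_schema_stats(
      ownname          => s.username,
      estimate_percent => 10,
      method_opt       => 'for all indexed columns size auto',
      cascade          => true);  -- index stats gathered in the same pass
  end loop;
end;
/

Each subset of schemas went into its own script so the runs could be
scheduled separately.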

For the percentages, I used compute on many of the tables, and selected
percentages for the other tables based on the number of rows and the average
change rate.  I turned on monitoring before I started so I could see what
was going on.  All of this was completed in the test environment before
moving to production; I tried different percentages and tracked the
results, and in some cases, using compute made a pretty noticeable difference
in job completion times.
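
Turning on monitoring was just an alter table per table before the first
gather; after that, dba_tab_modifications gives a rough idea of the change
rate.  Something along these lines for one of the big tables (the names
below are made up, and estimate_percent => null is what gives you compute):

alter table big_schema.big_fact_table monitoring;

begin
  dbms_stats.gather_table_stats(
    ownname          => 'BIG_SCHEMA',
    tabname          => 'BIG_FACT_TABLE',
    estimate_percent => null,   -- null = compute (100%)
    cascade          => true);
end;
/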

This was probably overkill for stats, but this shop had a bad experience
when someone else tried cost-based optimization for this warehouse, so I had
to be sure that when we switched over, the results would be good.  They
were, but I probably could have achieved the same results with a little
less work.

hth ... Robyn



On 2/13/06, Orr, Steve <sorr@xxxxxxxxxxxx> wrote:
>
>  I've just become aware of an instance that has 130 schemas with a total
> of 15,000 tables, 12,700 of which have no optimizer statistics. Even though
> things run relatively well without the stats, we are sometimes bitten by poor
> performance without warning and have to compute stats as a fix. I'm thinking
> I want to get stats on all tables by default. I'm looking for
> recommendations on updating stats on all the tables and indexes. I'm
> thinking I want to use a big shotgun approach as a first pass and then
> fine-tune after that.
>
> I've used gather_schema_stats, gather_table_stats, and gather_index_stats
> with good effect but have not been confronted with the need to gather so
> many stats en masse. The database supports 24X7 Internet services and there
> is less activity over the weekend but it's still significant. I've never
> used the gather_database_stats procedure and I'm thinking that using it in
> these circumstances could take a long time and significantly degrade end
> user response times.
>
> (For the most part the 130 schemas have the same tables with variations
> for the version of the hosted software they are running and the number of
> customizations they have made. Many tables are empty because they are for
> product features the customer has not purchased or implemented and the
> product installs all the tables by default.)
>
> To lighten the impact of gathering stats, I think I should use an estimate
> of about 10%.
>
> I'm thinking this would be too much:
>
> Gather_database_stats(10,method_opt=>'for all indexed columns size
> auto',cascade=>true);
>
> I'm thinking about creating a script that loops through all the schemas
> and does this:
>
> Gather_schema_stats(schema,10,options=>'gather_empty',cascade=>true);
>
> But I'm not too sure about using the cascade option. Maybe I need a
> further script to gather index stats separately.
>
> Any recommendations on the estimate percentage? Using cascade? Using
> gather_database_stats? Other?
>
> AtDhVaAnNkCsE,
>
> Steve Orr
>
> Bozeman, Montana
>



--
Robyn Anderson Sands
email: Robyn.Sands@xxxxxxxxxx
