I had to do something similar with a large data warehouse and chose to use a series of scripts to gather schema statistics. I used the cascade option and ran the scripts in subsets over several weeks to minimize the impact on the users. For a few of the larger schemas, I broke it down further to gather table statistics and spread those out into several separate runs as well. For the percentages, I used compute on many of the tables, and selected percentages for the other tables based on the number of rows and the average change rate. I turned on monitoring before I started so I could see what was going on.

All of this was completed in the test environment before moving to production. I tried different percentages and tracked the results, and in some cases, using compute made a pretty noticeable difference in job completion times.

This was probably overkill for stats, but this shop had a bad experience when someone else tried cost-based optimization for this warehouse, so I had to be sure that when we switched over, the results would be good. They were, but I probably could have had the same level of results with a little less work.

hth ... Robyn

On 2/13/06, Orr, Steve <sorr@xxxxxxxxxxxx> wrote:
>
> I've just become aware of an instance that has 130 schemas with a total
> of 15,000 tables, 12,700 of which have no optimizer statistics. Even
> though things run relatively well without the stats, we are sometimes
> bitten by poor performance without warning and have to compute stats as
> a fix. I'm thinking I want to get stats on all tables by default. I'm
> looking for recommendations on updating stats on all the tables and
> indexes. I'm thinking I want to use a big shotgun approach as a first
> pass, then fine-tune after that.
>
> I've used gather_schema_stats, gather_table_stats, and gather_index_stats
> with good effect but have not been confronted with the need to gather so
> many stats en masse. The database supports 24X7 Internet services, and
> while there is less activity over the weekend, it's still significant.
> I've never used the gather_database_stats procedure, and I'm thinking
> that using it in these circumstances could take a long time and
> significantly degrade end user response times.
>
> (For the most part the 130 schemas have the same tables, with variations
> for the version of the hosted software they are running and the number of
> customizations they have made. Many tables are empty because they are for
> product features the customer has not purchased or implemented, and the
> product installs all the tables by default.)
>
> To lighten the impact of gathering stats, I think I should use estimate
> at about 10%.
>
> I'm thinking this would be too much:
>
> DBMS_STATS.GATHER_DATABASE_STATS(10, method_opt => 'for all indexed
> columns size auto', cascade => TRUE);
>
> I'm thinking about creating a script that loops through all the schemas
> and does this:
>
> DBMS_STATS.GATHER_SCHEMA_STATS(schema, 10, options => 'GATHER EMPTY',
> cascade => TRUE);
>
> But I'm not too sure about using the cascade option. Maybe I need a
> further script to gather index stats separately.
>
> Any recommendations on the estimate percentage? Using cascade? Using
> gather_database_stats? Other?
>
> AtDhVaAnNkCsE,
>
> Steve Orr
> Bozeman, Montana

--
Robyn Anderson Sands
email: Robyn.Sands@xxxxxxxxxx
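
A minimal sketch of the batched approach Robyn describes: turn monitoring on first, then gather table statistics in small per-run batches. The schema name SCOTT, the batch size of 50, and the "no stats yet" filter are illustrative assumptions, not details from the original scripts.

-- Sketch only: enable monitoring, then gather table stats for one
-- small batch of a schema's tables. 'SCOTT' and the batch size of
-- 50 are placeholders.
BEGIN
  FOR t IN (SELECT table_name
              FROM dba_tables
             WHERE owner = 'SCOTT'
               AND monitoring = 'NO') LOOP
    EXECUTE IMMEDIATE 'ALTER TABLE SCOTT.' || t.table_name || ' MONITORING';
  END LOOP;

  FOR t IN (SELECT table_name
              FROM dba_tables
             WHERE owner = 'SCOTT'
               AND last_analyzed IS NULL   -- no stats gathered yet
               AND ROWNUM <= 50) LOOP      -- one batch per run
    DBMS_STATS.GATHER_TABLE_STATS(
      ownname          => 'SCOTT',
      tabname          => t.table_name,
      estimate_percent => NULL,            -- NULL means compute
      cascade          => TRUE);
  END LOOP;
END;
/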
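And a sketch of the per-schema loop Steve is considering, with GATHER EMPTY so only objects that currently have no statistics are touched. The exclusion list of schemas is a placeholder and would need to match the instance.

-- Sketch only: gather stats at 10% for every application schema,
-- limited to objects with no existing statistics.
BEGIN
  FOR s IN (SELECT username
              FROM dba_users
             WHERE username NOT IN ('SYS', 'SYSTEM')) LOOP
    DBMS_STATS.GATHER_SCHEMA_STATS(
      ownname          => s.username,
      estimate_percent => 10,
      options          => 'GATHER EMPTY',  -- skip objects that already have stats
      cascade          => TRUE);
  END LOOP;
END;
/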
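If cascade proves too heavy in a single pass, index statistics can be gathered in a separate run along these lines, again with a placeholder schema name.

-- Sketch only: gather index stats in their own pass so the table
-- and index runs can be scheduled at different times.
BEGIN
  FOR i IN (SELECT owner, index_name
              FROM dba_indexes
             WHERE owner = 'SCOTT'
               AND last_analyzed IS NULL) LOOP
    DBMS_STATS.GATHER_INDEX_STATS(
      ownname          => i.owner,
      indname          => i.index_name,
      estimate_percent => 10);
  END LOOP;
END;
/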