I just wanted to follow-up on this issue. Thanks to Wolfgang Breitling, I've got the issue resolved. I wrote a summary on my blog, http://ora.seiler.us/2007/08/dr-statslove-or-how-i-learned-to-stop.html Hopefully it makes sense. Try not to laugh too hard at my expense. Thanks again to Wolfgang! Don. On 6/19/07, Don Seiler <don@xxxxxxxxx> wrote: > I may have mentioned this before, but the archives only showed me a > tangential reference [1]. > > The setup: > * Oracle 10.2.0.2 on RHEL 3. > * Table FOO is range partitioned by date, total of 4 monthly > partitions (rolling window) > * There are two indexes on FOO, the PK has 10 fields (say A-J), and a > second index has 5 fields (A,C,D from PK and two others K, L). > * Statistics are gathered automatically via 10g default gather_stats_job > > The problem: > A partition of FOO is bulk loaded around 10 times a month. The > partition is created a few days before it will first be loaded into. > We have a recurring problem when, after the partition is loaded into > (2-3 million records per load), queries on FOO that have A, B, C, D in > the WHERE clause will start using the second index. The fact that > this second index doesn't use column B means it is walking over a HUGE > set of records, practically a complete partition scan. > > It seems like an out-of-whack statistics issue. After I gather stats > on the table (takes 2.5 hours) the queries then use the PK as desired. > To fight this, we had just this month started calling > dbms_stats.set_table_stats and set_index_stats to indicate a full > month's worth of stats before loading. Then for the rest of the day > we could fool Oracle, until the nightly stats job then set the real > stats. However this didn't work yesterday. I'm not sure why, but I > did notice that the dba_tab_statistics table showed a sample_size for > my user-set stats, when I don't believe there was one before. > > The solution: > I'm placing blame first on the PK. A 10-field primary key is absurd, > especially when 82% of the queries (logged via FGA) only go against > those first four fields. This starts a whole natural-vs-synthetic key > argument that only dictatorial control will resolve. Anyway, my > theory is that the CBO recognizes that, after the load, the data is > drastically different than the statistics. Now the touchy-feely part > of the theory is that the CBO then says all bets are off and looks at > the two indexes and sees that the second index is much, much smaller > and will be much easier to deal with. And that's why it chooses that > one. Obviously this isn't very technical and my developers aren't > buying into it. What traces can I do to spell out what the CBO is > thinking and why it is making certain choices? > > So, avoiding a change to the PK (as much as it kills me), the long > term proposed solution is to load data into a staging table, build > indexes and gather statistics and then use partition exchange to bring > all of that online in the "real" table. However, that would required > a sub-partitioning of the FOO table by load units, so we can't just do > it right off. > > My near term work-around is to build a non-unique index on those first > four columns. In my opinion we'd have just this one and not that PK, > but such things are not up to me now. > > I'm up for any theories or proposals at this point. My frustration > level with this design and the refusal to fault it, is at an all-time > high. > > [1] //www.freelists.org/archives/oracle-l/01-2007/msg00995.html > > -- > Don Seiler > oracle blog: http://ora.seiler.us > ultimate: http://www.mufc.us > -- Don Seiler oracle: http://ora.seiler.us ultimate: http://www.mufc.us -- //www.freelists.org/webpage/oracle-l