Stats and behavior weirdness

  • From: "Kevin Lidh" <kevin.lidh@xxxxxxxxx>
  • To: ORACLE-L <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 6 Mar 2006 10:12:19 -0700

We have a large CRM implementation in production (5 TB) which is 9.2.0.4 on
Tru64 5.1b.  We have a scaled down version in development (853 GB) on the
same OS and database version, where we do our performance tests.  We have
this one query that runs subsecond in production but ran for 1.5 hours
before I shot it in the development environment.  I profiled, 10053'd, and
everything and couldn't come up with a "for sure" reason why it was so bad.
The explain plan cost for production is 31 and only 21 in dev (I know, I
know, it's just a number).  But here's the weirdness, I decided to export
with no rows the schema from development and import it into a database on my
Linux x86-64 laptop.  I imported the statistics and changed the pfile to
match (as much as I could) the development box.  When I ran the explain plan
for the query, it matched production exactly.  With the exception of row
counts, it was identical.  The same indexes that weren't picked up in
development.

So what I'm wondering is if the following conditions exist:
1) parameters match between production and development (from 10053 traces)
2) no outlines are created for the query (looking at hash_value and
hash_value2 from outln.ol$)
3) statistics don't seem to be the issue (from my Linux box test)
4) there aren't any system statistics on either system (query sys.aux_stats$
)
5) the alter session commands are the same (run from a script in all three
environments)

Has anyone seen anything else that could influence the optimizer in such a
negative way?  This one really baffles me.  The CPU consumed on the
development box is dramatic.  I have access to all three systems if anyone
is interested in this challenge.

Other related posts: