Dynamic Sampling: some questions about the guts

  • From: "Charles Schultz" <sacrophyte@xxxxxxxxx>
  • To: oracle-l <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 30 Nov 2007 14:45:45 -0600

In one of our reporting environments (OSEE 10.2.0.2) we have
optimizer_dynamic_sampling cranked up to 5. We have a case where dynamic
sampling is pushing the CBO to choose a poor plan. Our situation highlights
Level 4, specifically "tables that have single-table predicates that
reference 2 or more columns". (from the
documentation<http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/stats.htm#sthref1491>
)

Looking at a 10046 trace, I see a OPT_DYN_SAMP query that counts a sample of
rows that match variations on the predicate. I have found that a 10046 can
always be good grounds for finding more questions than it answers. *grin*

To avoid the bandwidth, I will show just the relevant parts of the sampling
query:
SELECT /* OPT_DYN_SAMP */
 ...
FROM
 (SELECT /* ... */
   1 AS C1,
   CASE
    WHEN     "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1
         AND "RCRAPP1"."RCRAPP1_INFC_CODE"=:bind2
         AND "RCRAPP1"."RCRAPP1_CURR_REC_IND"=:bind3
         AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4
    THEN 1
    ELSE 0
   END AS C2,
   CASE
    WHEN     "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1
         AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4
    THEN 1
    ELSE 0
   END AS C3,
   CASE
    WHEN     "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1
         AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4
    THEN 1
    ELSE 0
   END AS C4
 FROM ...
 SAMPLE BLOCK (.037704 , 1) SEED (1) "RCRAPP1"
) SAMPLESUB
/

First question: Why the predicate variations? I am assuming that sometimes
C4 is different than C3, so I am going to ignore C4 for now.
2) How does the Optimizer use these sample counts to influence the costs and
subsequent access/join methods?

I believe what is happening in our case is that sampling is "just unlucky"
and it happens to come up with a inaccurate picture of how many rows match
(an order of magnitude too many). I can prove this by using this query
directly with different sample sizes.

I am hoping that by understanding dynamic sampling a little better, I can
understand how to resolve this issue. I realize that one can add the
dynamic_sampling query hint, and perhaps that is the "best" solution for
this. Aside from adjusting dynamic sampling (whether it be at the system
level or via a hint), is there any other way to address this situation?

TIA,

-- 
Charles Schultz

Other related posts: