Join cardinality and query tuning.

Hi all,

Please see the plan below. The estimated cardinality of the join between the
two tables is 3274K. This doesn't seem to tally with the figure I have
derived from the statistics in the database. Where am I going wrong?

-------------------------------------------------------------------
| Id  | Operation              |  Name            | Rows  | Cost  |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                  |     1 |   543 |
|   1 |  SORT AGGREGATE        |                  |     1 |       |
|   2 |   HASH JOIN            |                  |  3274K|   543 |
|   3 |    TABLE ACCESS FULL   | W_LOV_D          |    17 |    13 |
|   4 |    INDEX FAST FULL SCAN| W_ACTIVITY_F_N5  |  3452K|   525 |
-------------------------------------------------------------------

Let me try to elaborate on the problem:

The SQL statement is:

select count(*)
from
W_ACTIVITY_F T29238,
W_LOV_D T230600
where
T29238.X_BT_OUTCOME_AREA_WID = T230600.ROW_WID
and T230600.VAL in ('Save-No', 'Save-Yes')


I have used the following formula for calculating the join selectivity:

JS = (1/MAX(NDV(T1.c1), NDV(T2.c1)))
     * ((Card(T1) - Num_Nulls(T1.c1)) / Card(T1))
     * ((Card(T2) - Num_Nulls(T2.c1)) / Card(T2))

Join Cardinality = SelectedRows(T1) * SelectedRows(T2) * JS
where SelectedRows(T) = Card(T) * FilterFactor

(From Wolfgang Breitling's well-known paper on the 10053 event.)

The database statistics are:

siebel@VOLANL>SELECT table_name, num_rows FROM user_tables
  2  where table_name IN ('W_ACTIVITY_F', 'W_LOV_D');

TABLE_NAME                       NUM_ROWS
------------------------------ ----------
W_ACTIVITY_F                     16858500
W_LOV_D                             24000


siebel@VOLANL>SELECT table_name, column_name, num_distinct, num_nulls
  2   FROM user_tab_col_statistics
  3  WHERE  (table_name='W_ACTIVITY_F' AND column_name =
'X_BT_OUTCOME_AREA_WID')
  4  OR ( table_name='W_LOV_D' AND column_name = 'ROW_WID');

TABLE_NAME      COLUMN_NAME           NUM_DISTINCT  NUM_NULLS
--------------- --------------------- ------------ ----------
W_ACTIVITY_F    X_BT_OUTCOME_AREA_WID           12   13406400
W_LOV_D         ROW_WID                      22720          0


So the join selectivity based on this information would be:

JS =(1/MAX(12,22720))*((16858500-13406400)/16858500)* ((24000 - 0)/24000)

   = 9.0127E-06

And the join cardinality would be

JC = 9.0127E-06 * 17 * 3452K
   = 528.916608

where 17    = SelectedRows(W_LOV_D), as per the explain plan
      3452K = Card(W_ACTIVITY_F) - Num_Nulls(W_ACTIVITY_F.X_BT_OUTCOME_AREA_WID),
              which also matches the index fast full scan row count in the plan.
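To sanity-check the arithmetic, the same calculation can be run straight from SQL*Plus (this just plugs the statistics shown above back into the formula; nothing here comes from the optimizer itself):

```sql
-- Join selectivity per the formula, using the gathered statistics:
-- NDVs 12 and 22720, null counts 13406400 and 0,
-- table cardinalities 16858500 and 24000.
SELECT (1 / GREATEST(12, 22720))
       * ((16858500 - 13406400) / 16858500)
       * ((24000 - 0) / 24000)         AS join_selectivity,
       -- JC = JS * SelectedRows(W_LOV_D) * (Card - Num_Nulls) of W_ACTIVITY_F
       (1 / GREATEST(12, 22720))
       * ((16858500 - 13406400) / 16858500)
       * ((24000 - 0) / 24000)
       * 17 * 3452100                  AS join_cardinality
FROM dual;
```

This gives a join selectivity of about 9.0127E-06 and a join cardinality of about 529, nowhere near the 3274K in the plan.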

So Oracle should have estimated roughly 529 rows, but it has estimated
3274K. Why? What other factor is influencing the join cardinality estimate?


Now here's part 2:

I think the reason Oracle chooses a full table scan of W_LOV_D and an index
fast full scan of W_ACTIVITY_F is this incorrect join cardinality estimate.
The query actually returns only 67 rows! The best execution plan for this
query is surely an INDEX RANGE SCAN of W_LOV_D followed by a NESTED LOOPS
join into W_ACTIVITY_F, again via an INDEX RANGE SCAN. I cannot add any
hints because the front end generates the queries dynamically. How can I
manipulate the statistics so that Oracle chooses this execution plan?
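For what it's worth, the only mechanism I know of for this is
DBMS_STATS.SET_COLUMN_STATS, something along these lines (the distcnt and
density values below are purely illustrative; I have not worked out which
values would actually tip the optimizer into the nested loops plan):

```sql
-- Sketch only: manually overriding the join column statistics on the
-- fact table; the values below are illustrative, not recommendations.
BEGIN
  DBMS_STATS.SET_COLUMN_STATS(
    ownname => 'SIEBEL',
    tabname => 'W_ACTIVITY_F',
    colname => 'X_BT_OUTCOME_AREA_WID',
    distcnt => 22720,        -- e.g. align NDV with W_LOV_D.ROW_WID
    density => 1/22720,
    nullcnt => 13406400);    -- keep the real null count
END;
/
```

A subsequent statistics gather would overwrite this, so it would have to be
re-applied after each gather. Is this the right approach?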

Many thanks & regards,
Charu.



