RE: Join cardinality and query tuning.

  • From: "Odland, Brad" <Brad.Odland@xxxxxxxxxxxxxxxxx>
  • To: <joshic@xxxxxxxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 1 Nov 2004 10:07:24 -0600

It would help to actually have an index on W_LOV_D....and make sure that =
you reference all the keys in the index if there are more than one...you =
can't do a index range scan without a index... (:>P)

Nothing in your messege says that there is an index for the table =
W_LOV_D

The TABLE, W_ACTIVITIY_F has a INDEX named W_ACTIVITY_F_N5 according to =
your explain plan taht is being accessed. The TABLE is not being =
accessed. The Index is scanned for the rowid to fetch the data....



-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx]
Sent: Monday, November 01, 2004 9:22 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: Join cardinality and query tuning.



Hi all,

Please see the plan below. The estimated cardinality of the join between =
the
two tables is 3274K. This doesn't seem to tally with the figure I have
derived from the statistics in the database. Where am I going wrong?

-------------------------------------------------------------------
| Id  | Operation              |  Name            | Rows  | Cost  |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                  |     1 |   543 |
|   1 |  SORT AGGREGATE        |                  |     1 |       |
|   2 |   HASH JOIN            |                  |  3274K|   543 |
|   3 |    TABLE ACCESS FULL   | W_LOV_D          |    17 |    13 |
|   4 |    INDEX FAST FULL SCAN| W_ACTIVITY_F_N5  |  3452K|   525 |
-------------------------------------------------------------------

Let me try to elaborate on the problem:

The SQL statement is:

select count(*)
from
W_ACTIVITY_F T29238,
W_LOV_D T230600
where
T29238.X_BT_OUTCOME_AREA_WID =3D T230600.ROW_WID
and T230600.VAL in ('Save-No', 'Save-Yes')


I have used the following formula for calculating the join selectivity:

JS =3D (1/MAX(NDV(T1.c1), NDV(T2.c1)))
     * (Card(T1) - Num_Nulls(T1.c1)/Card(T1))
     * (Card(T2) - Num_Nulls(T2.c1)/Card(T2))

Join Cardinality =3D SelectedRows(T1)* SelectedRows(T2)* JS
where  SelectedRows(T) =3D Card(T) * FilterFactor

(From Mr. Breitling's famous paper on 10053 event).

The database statistics are:

siebel@VOLANL>SELECT table_name, num_rows FROM user_tables
  2  where table_name IN ('W_ACTIVITY_F', 'W_LOV_D');

TABLE_NAME                       NUM_ROWS
------------------------------ ----------
W_ACTIVITY_F                     16858500
W_LOV_D                             24000


siebel@VOLANL>SELECT table_name, column_name, num_distinct, num_nulls
  2   FROM user_tab_col_statistics
  3  WHERE  (table_name=3D'W_ACTIVITY_F' AND column_name =3D
'X_BT_OUTCOME_AREA_WID')
  4  OR ( table_name=3D'W_LOV_D' AND column_name =3D 'ROW_WID');

TABLE_NAME      COLUMN_NAME           NUM_DISTINCT  NUM_NULLS
--------------- --------------------- ------------ ----------
W_ACTIVITY_F    X_BT_OUTCOME_AREA_WID           12   13406400
W_LOV_D         ROW_WID                      22720          0


So the Join Selectivity base on this information would be:

JS =3D(1/MAX(12,22720))*((16858500-13406400)/16858500)* ((24000 - =
0)/24000)

   =3D 9.0127E-06

And the Join Cardinality will be

JC =3D 9.0127E-06 * 17 * 3274K
   =3D 528.916608

where 17 =3D SelectedRows(W_LOV_D) - as per the explain plan.
    3274K =3D Card(W_ACTIVITY_F) -
Num_Nulls(W_ACTIVITY_F.X_BT_OUTCOME_AREA_WID)
           - Again as per the explain plan

So Oracle should have estimated it to be 529. But it has estimated it =
3274K.
Why?? What other factor is influencing the join cardinality estimate??


Now here's part 2:

I think the reason Oracle decides to do an FTS and Index FFS on W_LOV_D =
and
W_ACTIVITY_F tables respectively, is because of the incorrect join
cardinality estimate. The query actually returns only 67 rows!! The best
execution plan for this query is obviously INDEXED RANGE SCAN of W_LOV_D
followed by NESTED LOOPS join into W_ACTIVITY_F with INDEXED RANGE SCAN =
as
the table access method. I cannot provide any hint as the front-end
dynamically generates the queries. How do I manipulate the statistics so
that Oracle will choose this execution plan?

Many thanks & regards,
Charu.



*********************************************************
Disclaimer:         =20

This message (including any attachments) contains=20
confidential information intended for a specific=20
individual and purpose, and is protected by law.=20
If you are not the intended recipient, you should=20
delete this message and are hereby notified that=20
any disclosure, copying, or distribution of this
message, or the taking of any action based on it,=20
is strictly prohibited.

*********************************************************
Visit us at http://www.mahindrabt.com

--
//www.freelists.org/webpage/oracle-l
--
//www.freelists.org/webpage/oracle-l

Other related posts: