RE: Statistics Problem on partitioned table

  • From: "Ric Van Dyke" <ric.van.dyke@xxxxxxxxxx>
  • To: "Mohamed Houri" <mohamed.houri@xxxxxxxxx>
  • Date: Tue, 1 Oct 2013 09:58:04 -0500

The optimizer is a pretty simple machine when you get down to it.  Here
is exactly why you are getting the estimated number of rows you are
getting:
 

(1/721,701)  = 0.000001  this is the density for the PER_IND column 1
over number of distinct keys. 

 

49,754,928 is the rows in the table.

 

With a simple equality predicate like this the optimizer does a simple
rows X density calculation:

 

49,754,928 X 0.000001  = 68.941193  rounded to 69.

 

Since the optimizer can't do anything about partition pruning at this
level since there is no reference to the partition key, that is what you
get. 

 

 

 

From: Mohamed Houri [mailto:mohamed.houri@xxxxxxxxx] 
Sent: Tuesday, October 01, 2013 10:47 AM
To: Ric Van Dyke
Cc: ORACLE-L
Subject: Re: Statistics Problem on partitioned table

 

Ric,

 

If you divide A-Rows/Starts you will find 550K/111 = 4954 rows. That's
close to what you have pointed out. How did you figure out this?

 

select leaf_blocks, distinct_keys, clustering_factor,
num_rows,partitioned

from all_indexes where index_name = 'XXX_PER_IND';

leaf_blocks distinct_keys clustering_factor num_rows partitioned

119369      721701            204870                 49754928  YES

 

Mohamed Houri

 

2013/10/1 Ric Van Dyke <ric.van.dyke@xxxxxxxxxx>

I suspect that the real issue for the optimizer is the fact that the
index is a non-prefixed local index.  I can only presume that the
estimated rows you see are a "pre partition" based number, it's still
wrong should be in the near 4000 not less than 100. (given the total
number for rows returned is 6 million) Since it's not prefixed the
optimizer has no idea which partitions it will go once it retrieves a
row from the index. What at the global level stats for the per_ind
column?  Especially the number for distinct values.


-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Mohamed Houri
Sent: Tuesday, October 01, 2013 10:07 AM
To: ORACLE-L
Subject: Statistics Problem on partitioned table

Dear list
I have a performance problem that I have narrowed to a statistics
problem where the CBO is not doing good estimations on a partitioned
table

The culprit select looks like



select

    a

   ,b

   ,c

   ,per_ind

from XXX_PER_YYY

where per_ind = 0;



XXX_PER_YYY is a table range partitioned by a date. There are 1493
partitions.

AND there is an index XXX_PER_IND on (per_ind) local (note that it is
local non prefixed)



The execution plan with Estimations and Actuals looks like


------------------------------------------------------------------------
----------

| Id | Operation                        | Name        | Starts | E-Rows
|
A-Rows |

------------------------------------------------------------------------
----------

| 0  | SELECT STATEMENT                 |             | 1      |
|550K    |

| 1  | PARTITION RANGE ALL              |             | 1      | 69
|550K    |

| 2  | TABLE ACCESS BY LOCAL INDEX ROWID| XXX_PER_YYY | 111    | 69
|550K    |

|* 3 |  INDEX RANGE SCAN                | XXX_PER_IND | 111    | 69
|550K    |

------------------------------------------------------------------------
----------

Predicate Information (identified by operation id):

---------------------------------------------------

    3 - access("PER_IND"=0)



And the execution plan showing the number of partition looks like

------------------------------------------------------------------------
-------------------

| Id  | Operation                          | Name        | Rows  | Bytes
|
Pstart| Pstop  |

------------------------------------------------------------------------
--------------------

|   0 | SELECT STATEMENT                   |             |    69 |  1173
|        |       |

|   1 |  PARTITION RANGE ALL               |             |    69 |  1173
|   1    |  1493 |

|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| XXX_PER_YYY |    69 |  1173
|   1    |  1493 |

|*  3 |    INDEX RANGE SCAN                | XXX_PER_IND |    69 |
|   1    |  1493 |

------------------------------------------------------------------------
--------------------

 Predicate Information (identified by operation id):

---------------------------------------------------

    3 - access("PER_IND"=0)



There is only one distinct value of per_ind (per_ind =0)

select per_ind, count(1) from XXX_PER_YYY group by per_ind;

*per_ind cout(1)*


0      6,118,184



And I have the following partitions num_rows distribution



758 partitions with num_rows = 0;

60  partitions with num_rows <= 5000

295 partitions with num_rows between 5000 and 10000

315 partitions with num_rows > 10,000 and num_rows <50,000

65 partitions with num_rows > 100,000;



Statistics are calculated at a global level



How to make the CBO having correct estimations and hence an optimal
execution plan?



Thanks in advance

--
Bien Respectueusement
Mohamed Houri
www.hourim.wordpress.com



--
//www.freelists.org/webpage/oracle-l







 

-- 
Bien Respectueusement
Mohamed Houri


--
//www.freelists.org/webpage/oracle-l


Other related posts: