RE: Statistics Problem on partitioned table

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: "'Mohamed Houri'" <mohamed.houri@xxxxxxxxx>
  • Date: Wed, 2 Oct 2013 13:13:02 -0400

This version survived the list formatting sufficiently that I can see that
it is an execution plan with estimates and actuals. That would be good
enough if it is a complete execution. I do not understand how it can be a
complete execution since the actuals should be 6.1 million rows (not 550K).
The count I suggested will not have to go to the table to get a result, so
it should be cheaper to test.
 

With no histogram then, you've got about 49 million rows, and a little over
700,000 distinct values. Since 4900/700 is 70, the estimate of 69 seems
pretty doggone good.

 

The confusing thing about your distinct cnt  query that generated 56 rows is
that it was distinct cnt, not distinct values.

 

The number of rows in the test table matches the distinct values, which
seems correct.

 

I'm curious what the largest few values other than the 6 million count for
the distinct value of zero was. That would tell us the maximum popularity
level apart from value zero, but that is just a curiosity.

 

It seems to  me that about one eighth of your rows have the value zero, the
rest of your distinct values have a pretty flat distribution (although
without seeing the high missing distinct cnt below the 6 million and change
we can't be sure.)  So that histogram probably would help the optimizer
report better estimates, but I doubt it would generate a more effective plan
in this case since there is no other reasonable filter or access method.

 

I suppose depending on hardware and row width is it possible that full scans
of partitions would be faster than the indirect fetch via the local index. I
suppose if you plugged a sufficiently bad cluster factor in for you index
you might be able to test that plan. Possibly the histogram would tip the
plan in favor of scanning.

 

I suppose you could generate a union all for only the partitions having some
zeros. In theory it could use the local index when that was effective and
scan when it is not, per partition. I have not tested that.

 

Is that sort of what you're trying to figure out, or were you just curious
at the difference between estimates and actuals? I think that is just based
on the flat distribution presumption of distinct values.

 

mwf

 

From: Mohamed Houri [mailto:mohamed.houri@xxxxxxxxx] 
Sent: Wednesday, October 02, 2013 8:19 AM
To: Mark W. Farnham
Cc: Jonathan Lewis; ORACLE-L
Subject: Re: Statistics Problem on partitioned table

 

Ok Mark I will request what you have asked.

 

I am sorry

 

Mohamed Houri

 

2013/10/2 Mohamed Houri <mohamed.houri@xxxxxxxxx>

Mark

 

I haven't posted the execution plan you've requested because I did it in my
first e-mail

 

select 

    a

   ,b

   ,c

   ,per_ind

from XXX_PER_YYY

where per_ind = 0;

 

----------------------------------------------------------------------------
------

| Id | Operation                        | Name        | Starts | E-Rows |
A-Rows |

----------------------------------------------------------------------------
------

| 0  | SELECT STATEMENT                 |             | 1      |
|550K    |

| 1  | PARTITION RANGE ALL              |             | 1      | 69
|550K    |

| 2  | TABLE ACCESS BY LOCAL INDEX ROWID| XXX_PER_YYY | 111    | 69
|550K    |

|* 3 | INDEX RANGE SCAN                 | XXX_PER_IND | 111    | 69
|550K    |

----------------------------------------------------------------------------
------

Predicate Information (identified by operation id):

---------------------------------------------------

 

   3 - access("PER_IND"=0)

 

It is not exactly what you've requested but was this not sufficient?

 

 

Best regards

 

Mohamed 

 

 

2013/10/2 Mark W. Farnham <mwf@xxxxxxxx>

 

I fail to understand why the --+ gather_plan_statistics execution of the
simple count I requested is not forthcoming.

Now it does leave out the table access for the non-index columns, but it
should tell us a great deal about what the optimizer thinks is the cheapest
way to get the relevant rowids from all the partitions and the plan should
show what the CBO has estimated along with the actuals.

Fragmentary revelation of the underlying facts (obscuring, for example, that
a view was involved) does not make it easier for folks to help you explain
what you're asking about.

Please try to make it easier.

mwf


-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Mohamed Houri

Sent: Wednesday, October 02, 2013 7:39 AM
To: Jonathan Lewis
Cc: ORACLE-L
Subject: Re: Statistics Problem on partitioned table

Jonathan,
I am sorry I may have not been clear but there are in fact 721,699 and that
is what I showed above

 *select per_ind, count(1) cnt from XXX_PER_YYY group by per_ind;*

721,699 rows .
<snip>







 

-- 
Bien Respectueusement
Mohamed Houri





 

-- 
Bien Respectueusement
Mohamed Houri



--
//www.freelists.org/webpage/oracle-l


Other related posts: