Histogram worthwhile?

  • From: Roberto Veiga <roberto_veiga@xxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Tue, 24 Feb 2009 06:01:10 -0800 (PST)

We have another performance problem to solve. The performance that was good, 
now is bad. Very bad with an unacceptable performance and nobody makes any 
changes to the system. Nor the analyst neither the dba. Why the performance is 
not ok anymore?

I think the answer is the data. The data is changing every day, every hour, 
every minute and that query that always was ok is taking a long time to finish. 
It is running slow and we have to have an explanation for this situation.

Oracle has a very good feature named histogram. And I think the histogram was 
created just for this. When the data change, the histogram can see that change 
and send information the the optimizer that uses that information to build a 
better access plan. So where is the problem?

First lets talk about bind variables. We always have heard that there are only 
benefits in using bind variables. This reduce the hard parses, relieve the 
memory consumption and let a clean code.

But when a column doesnt have values with a balanced distribution, bind 
variables could be a problem. Oracle has a phase before the access plan could 
be generated that is "bind variable peeking". The first value received by the 
query is the one that is going to be used by the optimizer to decide for the 
best access plan and this plan is going to be used for all future executions. 
But what if the second value can retrieve a different amount of data than the 
first one?

If my data has a skewed distribution and i am using bind variables, the 
optimizer is not going to rebuild the access plan every time the value changes 
even if I had collected histograms. For one specific value is better use  ,for 
example, the index idx1, for another value the best performance is achieved 
using the idx2 and for another value the best choice would be do a full table 
scan.

I dont know if this is a problem or if it is just a optimizer's behavior, but I 
am doing a research to find some alternatives to this problem. And I find some 
(thanks Tom Kyte):

.use bind variables but dont use histograms. so the plan is going to be the 
same for every single execution. one size fits all approach
.dont use bind variables but use histograms. so the optimizer will evaluate the 
plan every execution and discover the best plan
.use bind variables, use histogram but change the code. if the value is in this 
range, execute this query. if not, execute another query. i need  to know 
distribution data. using the width_bucket function could be a good way to 
discover how is the distribution


Anyone have another idea?

Thanks

Veiga




      Veja quais são os assuntos do momento no Yahoo! +Buscados
http://br.maisbuscados.yahoo.com
--
//www.freelists.org/webpage/oracle-l


Other related posts: