Re: AUTO_SAMPLE_SIZE is every row
- From: Mladen Gogala <gogala.mladen@xxxxxxxxx>
- To: oracle-l@xxxxxxxxxxxxx
- Date: Sat, 27 Jan 2018 13:40:01 -0500
Replies in-line
On 01/27/2018 10:12 AM, Neil Chandler wrote:
Mladen,
Sweeping generalisations are not helpful, and this is the 2nd time you
have made this one, fishing for a response. So let me respond.
I haven't made any generalization, much less "a sweeping one". My
statement is that for some tables, a full table scan is unacceptable. The
very form of the sentence, "for some tables", tells you that this is not
a generalization. And I am not fishing for an answer, I'm just stating
the obvious. I have a cold and I don't feel like having an argument
today, much less a Dilbertian one like this. And you can consider me a
narcissist.
Some of the time a full table scan may be problematically slow, but it
is simplistic to assume that sampling a fractional percentage will
result in good overall outcomes.
Who has said that? You are arguing against something I have never said.
There is nothing that will result in "good overall outcomes". Such a
solution is popularly known as a "silver bullet", and its existence is
widely disputed. Not even the Tuning Pack and tuning profiles will result
in "good overall outcomes". That is why consultants like me still have
their jobs.
The full scan provides other benefits, by minimising the additional work
that needs to be done by the stats-gathering job. It ensures increased
accuracy of the stats, improves single-sample adaptive stat collection
by understanding NULL distributions (and therefore sample escalations
are proactively anticipated), and allows perfect frequency and top
frequency histograms at marginal additional cost. It is needed for
Hybrid histograms to replace the largely ineffectual Height-balanced
histograms. It allows the use of approximate_ndv to lower the overall
impact of the stats gather. All of those benefits are lost by
implementing any other estimate_percent.
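For readers following along, the contrast Neil draws maps onto the estimate_percent parameter of DBMS_STATS.GATHER_TABLE_STATS. A minimal sketch (schema and table names here are hypothetical, chosen only for illustration):

```sql
-- Default: AUTO_SAMPLE_SIZE enables the approximate-NDV algorithm and
-- the one-pass frequency/top-frequency/hybrid histogram optimisations
-- discussed above.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP',      -- hypothetical schema
    tabname          => 'ORDERS',   -- hypothetical table
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    method_opt       => 'FOR ALL COLUMNS SIZE AUTO');
END;
/

-- Any fixed percentage forfeits those optimisations and falls back to
-- the older sampling code path:
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP',
    tabname          => 'ORDERS',
    estimate_percent => 10);
END;
/
```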
I have seen tables of more than a TB in size. Good luck with a full
table scan and the benefits it can produce.
Given estimate_percent=auto_sample_size is the Oracle default, and has
been through several major releases, there is a good chance that it
has rather a lot of benefits. As a default, it's not going away.
If your table is so large that the scan time is unacceptable, the
better solution is more likely to be to implement partitioning and use
incremental stats (or a locked-partition manual version), or maybe
even look to manually fabricate your stats if you know how to do that.
It depends upon your goals (and licensing).
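For completeness, the incremental-stats approach Neil mentions can be sketched like this (again, schema and table names are hypothetical; incremental stats require estimate_percent to be AUTO_SAMPLE_SIZE):

```sql
-- Incremental statistics: partition-level synopses are maintained and
-- combined into global stats, so only changed partitions are rescanned
-- on subsequent gathers.
BEGIN
  DBMS_STATS.SET_TABLE_PREFS('APP', 'ORDERS', 'INCREMENTAL', 'TRUE');
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP',
    tabname          => 'ORDERS',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    granularity      => 'AUTO');
END;
/
```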
And partitioning licenses are expensive. Oracle is the only major RDBMS
vendor which charges for the right to create partitioned tables. And
yes, I do know how to partition tables and manually fabricate the stats.
However, that is no longer necessary. There is dynamic sampling, which
can produce results of the same quality as DBMS_STATS. Furthermore,
I don't see the need for being rude and personal with "if you know how
to do that". Please keep the discussion civil, if you know how to do that.
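To make the dynamic-sampling alternative concrete, here is a sketch of both the statement-level hint and the session-level parameter (table name and predicate are hypothetical):

```sql
-- Statement-level: sample the table at parse time, level 1-10
-- (level 11 lets the optimizer decide when sampling is worthwhile).
SELECT /*+ DYNAMIC_SAMPLING(o 4) */ o.order_id
FROM   app.orders o            -- hypothetical table
WHERE  o.status = 'OPEN';

-- Session-wide alternative:
ALTER SESSION SET optimizer_dynamic_sampling = 4;
```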
If you are going to suggest alternatives, please ensure you list at
least some of the benefits and disbenefits associated with the
alternatives so people can make informed decisions about their
approach.
I am not going to suggest alternatives any more than I already have.
I don't really care about people making informed decisions. I am
not running for office. I am a consultant and I will help tune any SQL
that my customers require of me.
regards
Neil Chandler
Database Guy.
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217