RE: "Multi-Column" Histogram / Histogram for Concatenated Index

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <hkchital@xxxxxxxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Thu, 13 Dec 2007 12:01:17 -0500

Is this index required for relational integrity?

IF *NOT*, have you considered redefining the index without column C_5. That
might result in an overall improvement, depending on possible extra cost to
visit the table when the index would have been enough on some queries and
possibly less good plans when there is an equijoin or in-list for *NOT*
"881".

IF *REQUIRED*, second caveat above, consider putting C_5 last instead of
third. Paradoxically, putting it first might result in a much poorer cost
estimate for input to the optimizer.

I am not aware of way to raise the cost the optimizer puts on a skip scan.

For the 55% combination it seems to me if any C_2 results in 55% of the rows
(or presumably more, since even with the other columns pinned to particular
values you get 55%, then C_2 is pretty non-selective for that value. So
probably you need a histogram on that column as well. So ditto on moving C_2
lower in the food chain in the order of the columns in the index.

Am I missing something.

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Hemant K Chitale
Sent: Thursday, December 13, 2007 10:36 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: "Multi-Column" Histogram / Histogram for Concatenated Index



I have a table with an Index on 4 columns.  One particular column 
(the third in the index)
is highly skewed with one value accounting for 70% of the rows.
Queries with the Index columns as predicates are perfectly fine for all
values
other than that one value.
Even if I gather a histogram on the skewed column, queries for the "bad"
value
still use the index.  The bad value alone accounts for 70% of the rows.
In combination with a particular set of values for the other columns, the
query
fetches about 55% of the rows.  I do not want to use the index.

Say :
Table TAB_1
Columns  C_1, C_2, C_3, C_4, C_5, C_6, C7
Index on C_2,C_4,C_5,C6

Value "881" in column C_5 accounts for 70% of the rows.
A particular combination of values for C_2, C_4, C_5 ("881") and C_6
account for 55% of the rows.

I cannot change the SQL code.  It is "generated" by the application.
Even if I were to change the code,  I find that this SQL query runs
in a loop for different combinations  of these columns.  It is only
one combination that is my problem.  The combinations of
C_2, C_4, C_5, C_6   are driven from three other tables
Besides these columns C_1 and  C_7 are also linked to the other tables.
Some joins are equi-joins, some are ranges.
{Yes, this sounds like a fact table and dimension tables, doesn't it ?}

This is 9.2.0.4, 64-bit.
Statistics are gathered nightly.  I am allowed to modify the gather
statistics
job to collect specific histograms.

I find that gathering histogram on C_5 alone doesn't help.
{There are 6- distinct values, and I can specify upto 254 buckets}.
Queries still use the index

Questions :
1.  If I rebuild the index to use C_5 as the leading column would
Oracle be using the Histogram to convert queries for C_5='881'
into FullTableScans ?
2.  Should I gather histograms on all the columns of this index ?

This is not exactly like the "correlated" columns that is used as an
example for 11g MultiColumn Statistics ("Extended Statistics")
as there is no relationship between the column values.
What I am looking for is something like a Histogram on the Index Key
(rather than just "num_rows" and "distinct_keys", I want the density
for different combinations of key values -- a Histogram on the Index !)




Hemant K Chitale
http://web.singnet.com.sg/~hkchital
and
http://hemantscribbles.blogspot.com
and
http://hemantoracledba.blogspot.com

"There is more to life than increasing its speed."
Mohandas Gandhi Quotes 
:  http://www.brainyquote.com/quotes/authors/m/mohandas_gandhi.html

--
//www.freelists.org/webpage/oracle-l




--
//www.freelists.org/webpage/oracle-l


Other related posts: