Dan Tow's SQL formula

  • From: amonte <ax.mount@xxxxxxxxx>
  • To: oracle-l <oracle-l@xxxxxxxxxxxxx>
  • Date: Sun, 22 Feb 2009 20:15:33 +0100

Hi

A few months ago I was asking some SQL tuning question and someone sggested
me Dan Tow's SQL Tuning book release by Oreilly. So I bought the book a few
weeks ago and just started to reading it.

I am a bit puzzled about the formula in page 30, calculating the filter
selectivity for a query which searchs for last_name:

select ....
from customers
where last_name = 'SMITH'

And he suggested this formula to obtain the selectivity, sum of (n(i)/C') x
(n(i)/C) where n(i) is the count of rows wiyj ith nonnull last name (I
understand as count of all last_name) and C is count of all rows and C' is
count of all nonnull last name which turns out to be following SQL statement
(which I dont understand)


SELECT SUM(COUNT(Last_Name)*COUNT(Last_Name))  /
(SUM(COUNT(Last_Name))*SUM(COUNT(*)))
FROM Customers
GROUP BY Last_Name;

To start with the formula is not exactly as described, there are are 3 SUM()
instead of one.


To better my understanding I did a test with more familiar table which is
scott.emp table with following query:

select sum(count(mgr)*count(mgr)) / (sum(count(mgr))*sum(count(*)))
from emp
group by mgr

SUM(COUNT(MGR)*COUNT(MGR))/(SUM(COUNT(MGR))*SUM(COUNT(*)))
----------------------------------------------------------
                                                .225274725


So I got a selectivity of roughly 0.225. Does not sound right to me because
there are 6 managers and 13 employees (14 but one has no manager)

My question is really, does anyone understand this formula and actually use
it?

If so please throw some lights :-)))


Thank you all

Alex

Other related posts: