Re: artificial values vs null

  • From: Dan Tow <dantow@xxxxxxxxxxxxxx>
  • To: niall.litchfield@xxxxxxxxx
  • Date: Fri, 24 Sep 2004 11:10:25 -0500

For what it is worth, I have seen an alternative to Oracle's unindexed NULLs,
and it isn't pretty:

DB2 indexes NULLs just like any other value, so it is perfectly happy reaching a
table with

Column1 IS NULL

on a single-column index on Column1. In fact, it treats (last I checked, anyway)
NULL like any other value when it calculates cardinality, too, so if Column1 has
999,999 distinct non-null values, plus NULL, it will estimate (unless you ask
for a histogram) that

Column1 IS NULL

will point to one millionth of the table, and it will just love driving from
that index. Of course, whatever else NULL is or is not, it is certainly *NOT*
"just another value," and if it is ever used at all, it is normally *way* more
common than an average specific non-null value, so this turns out to be a major
cause of bad execution plans on DB2.

I'd say that if a database is to reach NULLs with indexes safely, it must
automatically keep stats at the very least on the frequency of the NULLs, as
opposed to estimating that frequency as equal to the frequncy of the average
non-null value.

Thanks,

Dan Tow
650-858-1557
www.singingsql.com


Quoting Niall Litchfield <niall.litchfield@xxxxxxxxx>:

> On Thu, 23 Sep 2004 21:21:40 +1000, Nuno Souto <nsouto@xxxxxxxxxxxxxx> wrote:
> > Whatever the optimiser may do with 31 dec 2099,
> > it will be a darn long shot better than a NULL value
> > that can't be indexed...
>
> you may regret making that statement.... see below.
>
> >
> > Still, the correct design is to add a "status" column
> > where the "current" can be indicated.  The high date
> > is a poor substitute.  But still better than having to do
> > range or full table scans to find the current row of a
> > particular category.
>
> For this purpose I entirely agree. Though in the case of current
> 'logins' you can bet your bottom dollar that a crashed session will
> still be shown as current when it isn't. At least then you have
> demonstrably incorrect data rather than assigning meaning to
> meaningless data :)
>
...
--
//www.freelists.org/webpage/oracle-l

Other related posts: