Re: artificial values vs null

  • From: "Nuno Souto" <nsouto@xxxxxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 24 Sep 2004 21:51:33 +1000

----- Original Message ----- 
From: "Niall Litchfield" <niall.litchfield@xxxxxxxxx>
>> Whatever the optimiser may do with 31 dec 2099,
>> it will be a darn long shot better than a NULL value
>> that can't be indexed...
>
> you may regret making that statement.... see below.

Not at all. My statement stands: it's not a throwaway comment.
The use of a specific value as opposed to a NULL will
cause the optimiser to pick an index if one exists and the
query is a simple one.

Of course: if there isn't one or you use an index-stop form of
the predicate, or you use a complex series of predicates where a
FTS or hash makes more sense, then it might not.

I did not say it was the perfect solution.  However, it works when
the perfect solution is just too expensive to implement.  For
whatever reasons.

What I fail to understand is what the char/number part of your
script proves.  That predicates involving mismatched data types
don't use an index?  Known since V4 and it has nothing to do
with using artificial dates.


> For this purpose I entirely agree. Though in the case of current
> 'logins' you can bet your bottom dollar that a crashed session will
> still be shown as current when it isn't.

Never disputed that.  If anything, I said specifically that this
is what happens when NULLS are assigned an arbitrary meaning.
But fnd_logins is a specific case.  With no bearing whatsoever
on the more general case of using NULL to signify the current
row in a time series.  And replacing that by an artificial value
- as a stopgap measure, I must emphasize.

> I have created a script to demonstrate what the 'optimizer' does (10g)
> with the use of various strategies for assigning an artificial value
> for what should be a NULL. It isn't pretty. You can see the full
> results at my site

Yeah.  As I said: the proper way is to use a separate attribute.
NULL as current will always cause a FTS if it is the only predicate.
Artificial dates might, with mis-matched data types. But then
again so does any other operation with conflicting data types,
in the absence of a FBI.

Which raises an interesting option: what if a FBI is created
with NVL2 to replace the NULL value with a high value instead?
Gotta look into this.

> The script is available at
> http://www.niall.litchfield.dial.pipex.com/scripts/tests/date_distribution.sql
> and you are welcome to play around with it on test systems yourself.

Thanks.  Certainly will.  Interested in seeing what happens
with 7, 8i and 9i.  In which I've always found the biggest problems
with this "NULL-as-current" design option.


>
> What I find is that *in the absence of histograms* then using an
> artificial value - especially if you do the other thing app designers
> do and use the wrong datatype

Oh no they don't!  Not *THIS* app designer.  The correct way of
implementing this artificial date is to create a stored function
that returns the correct type and value.  It is the function
return value that we compare against in the predicate:
guaranteed data type match. And guaranteed storage of the
correct high value, with a couple of triggers.


> do will most likely not be sensible at all. histograms do fix this,
> but at a non-zero cost.

Agreed.  I still think though that if one has the possibility of
complex joins, histograms are a must (10g possibly excepted?)
In this I must grant that DKB is correct, IME: use of histograms
is mandatory to get the optimiser to pick the correct mix
of indexes in a complex join.  At least in 7 and 8i. With 9i,
I've already found a few cases where it makes the correct
choices, histograms or no histograms. It appears dynamic sampling
works quite well.

Cheers
Nuno Souto
nsouto@xxxxxxxxxxxxxx 

--
//www.freelists.org/webpage/oracle-l

Other related posts: