Re: quick FK question

  • From: Robert Freeman <robertgfreeman@xxxxxxxxx>
  • To: cary.millsap@xxxxxxxxxxxx, mwf@xxxxxxxx
  • Date: Mon, 21 Dec 2009 09:01:33 -0800 (PST)

So... shift enter in Yahoo mail sends a mail, most inconvenient... The Kimball 
link is here: 

http://www.ralphkimball.com/html/designtipsPDF/DesignTips2003/KimballDT43DealingWith.pdf

RF


 Robert G. Freeman
Oracle ACE
Ask me about on-site Oracle Training! RMAN, DBA, Tuning, you name it!
Author:
Oracle Database 11g RMAN Backup and Recovery (Oracle Press) - ON ITS WAY SOON!
OCP: Oracle Database 11g Administrator Certified Professional Study Guide 
(Sybex)
Oracle Database 11g New Features (Oracle Press)
Oracle Database 10g New Features (Oracle Press)
Other various titles
Blog: http://robertgfreeman.blogspot.com
Check out my new blog series on installing Oracle Database 11gR2 on Windows 
using VMWare!




________________________________
From: Robert Freeman <robertgfreeman@xxxxxxxxx>
To: cary.millsap@xxxxxxxxxxxx; mwf@xxxxxxxx
Cc: rgravens@xxxxxxxxx; gints.plivna@xxxxxxxxx; JSweetser@xxxxxxxx; 
oracle-l@xxxxxxxxxxxxx
Sent: Mon, December 21, 2009 10:00:15 AM
Subject: Re: quick FK question


So.... Stepping in the waters and hope I can swim on this one....

The web link is interesting reading. I'll confess that as I read it I was 
answering phones, talking to people and generally not giving it 100% attention. 
:) That being said, one thing that seemed to come to mind is that the debate on 
this page is about the relational mechanics/reasons/debate about NULL or 
"default" values. It is an RDBMS agnostic debate, and as such one important 
aspect isn't addressed, the performance of whatever the "perfect" solution is. 
In my mind, in the end, if I can assure data integrity (whatever the method) 
then performance becomes the next important consideration. The question of data 
integrity is a combination of the application and the data model (I could add 
things like testing, change control, etc too)...

As this is a data warehouse, I would assume (I know... I know) that the data 
comes from one or more OLTP systems, and I would assume (sigh...) that the data 
integrity of these systems would not be in question. With a warehouse, one 
might well be bringing in data from disparate sources. Simply based on that 
rule, some of the basic relational rules may not be able to be enforced all the 
time. In fact with a data warehouse we denormalize with a vengeance. For 
example, we may have facts without associated dimensions in a warehouse for a 
time (because of data integration issues) .... We might well have defined but 
not enforced FK's in a warehouse (later versions of Oracle), or we might not 
have any FK's if we are running older versions of Oracle that don't support 
non-enforcement of FK's. Thus, we ignore relational rules in warehouses all the 
time, in part for performance and in part based on an assumption that the 
source of record system is consistent.

In my mind in this case the real question is one of performance, not trying to 
protect relational integrity. That being the case, then the next question is 
"Is there a problem with performance". If the answer is no, then I would not 
worry about it. If the answer is Yes, then your modeler will need to try to  
justify his assertion that the design works better. If he can not justify, back 
to the drawing board....

Kimbal has addressed the issue of NULL's in a data warehouse... one link is 
here: 




 Robert G. Freeman
Oracle ACE
Ask me about on-site Oracle Training! RMAN, DBA, Tuning, you name it!
Author:
Oracle Database 11g RMAN Backup and Recovery (Oracle Press) - ON ITS WAY SOON!
OCP: Oracle Database 11g Administrator Certified Professional Study Guide 
(Sybex)
Oracle Database 11g New Features (Oracle Press)
Oracle Database 10g New Features (Oracle Press)
Other various titles
Blog: http://robertgfreeman.blogspot.com
Check out my new blog series on installing Oracle Database 11gR2 on Windows 
using VMWare!




________________________________
From: Cary Millsap <cary.millsap@xxxxxxxxxxxx>
To: mwf@xxxxxxxx
Cc: rgravens@xxxxxxxxx; gints.plivna@xxxxxxxxx; JSweetser@xxxxxxxx; 
oracle-l@xxxxxxxxxxxxx
Sent: Mon, December 21, 2009 8:03:38 AM
Subject: Re: quick FK question

This isat the root of the Codd vs. Date debate summarized at 
http://www.dbdebunk.com/page/page/1706814.htm.

I am eager to see Chris Date present on this near our office next month. 
Enrollment is still open, and there's plenty of space available (wink): 
http://method-r.com/education/107-cj-date-course


Cary Millsap
Method R Corporation



On Mon, Dec 21, 2009 at 8:36 AM, Mark W. Farnham <mwf@xxxxxxxx> wrote:

>
>This is subject to testing and experimentation to see which way works better
>>with the current technology.
>
>>It does raise and interesting question: When faced with incomplete data in
>>columns that are not mere scalars but rather are either part of a key of the
>>current relation or to another relation, is it better to include the tuple
>>in your representation of the relation with some indication the tuple is
>>incomplete or is it better to relegate the incomplete tuple to a staging
>>location until its critical column values are known?
>
>>There is meaning added to the the database by having the constraint and a
>>value that declares "this is a provisional value because we don't know the
>>real value" as opposed to merely allowing the column to be null. When the
>>provisional value can be used to drive a process to identify quality
>>problems in the data without discarding the bits of the tuple that are known
>>some good can come out of it. In terms of total work that must be done in an
>>Oracle database to complete the data, this may be superior to having the
>>incomplete tuples in a separate table.
>
>>But you mentioned "warehouse," which to a certain extent implies the data is
>>in final form. Will the answers gotten from the warehouse be improved or
>>degraded from including the incomplete tuple? Is there a good general answer
>>to this question? (Am I mistaken or is this at root one of the bits Codd and
>>Date had a mild disagreement about?)
>
>>Ah, the quasi-religious war that can be waged over this one!
>
>>So, does your "data warehouse modeler" produce models that work well to
>>answer the questions put to it? Does this practice tend to make the models
>>more or less usable?
>
>>Regards,
>
>>mwf
>
>-----Original Message-----
>>From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
>On Behalf Of Rumpi Gravenstein
>>Sent: Monday, December 21, 2009 8:01 AM
>>To: gints.plivna@xxxxxxxxx
>>Cc: JSweetser@xxxxxxxx; oracle-l@xxxxxxxxxxxxx
>>Subject: Re: quick FK question
>
>><snip>
>>I've seen once from some duhvelopers - they created all FK columns NOT
>>NULL. Unfortunately of course there were cases with FK columns where
>>actual value could not be provided. So what did they do? An obvious
>>solution! ;) Added one row with id = -1 as a stub to all db tables.
>></snip>
>
>>Gints I wanted to follow-up on this statement as our warehouse data
>>modeler insists on doing this for all dimension joins claiming that
>>this is good warehouse design as it avoids having to worry about outer
>>joins.  How would you respond?
>
>>--
>>Rumpi Gravenstein
>
>--
>//www.freelists.org/webpage/oracle-l
>
>
>
>
>>--
>//www.freelists.org/webpage/oracle-l
>
>
>

Other related posts: