Re: Data load ideas

  • From: Tim Gorman <tim@xxxxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Sat, 01 May 2004 11:17:58 -0600

Todd,

I'm not arguing against the index-supports for PKs or FKs.  Indexes and
constraints are two different animals.  I'm merely arguing against the use
of RI constraints (over the top of the indexes) in a database that is not
the source-of-record.  Note that it is the constraints, not the indexes,
that cause problems cited in this thread and in Jonathan Lewis's post on
EXCHANGE PARTITION on dbazine.com.

The disadvantages of RI constraints in a DW are obvious -- the original post
in this thread is one example.  Any mechanism geared for transactions and
transactional processing (such as RI constraints or triggers) is bound to
cause difficulties in the bulk-load operations commonly to large DWs.

Following such commandments such as "thou shalt enforce RI" (i.e. the
technical corollary to "thou shalt enforce business rules") leads to a lot
of pain, unnecessarily.

If you step back and think about it, there are multiple ways to enforce
uniqueness without using a UNIQUE index, just as there are multiple ways to
ensure that dimension keys in a fact reference rows in the dimension.  Using
RI constraints to prevent "bad data" from entering the DW is only one of
many possible methods to accomplish that purpose, and arguably not the best
one.

The discussion always reminds me of an old vaudeville groaner:

    Patient (flailing one arm comically): "Doctor!  Doctor!  It hurts
                                           when I do this!"
    Doctor (usually Groucho Marx): "Then don't do that!"

Obviously, PKs and FKs are absolutely necessary in a database supporting an
application which is the source-of-record.  But, in a DW, the main
requirement is accuracy during extraction-transformation-loading (ETL), not
the enforcement of business rules.

So, all I'm saying is:  if RI constraint mechanisms are presenting serious
difficulties, then bear in mind what they are intended for and also clearly
understand what your primary intentions are.  Don't force a square peg into
a round hole.

I've added more specific responses to your questions (below)...

Hope this helps...

-Tim


on 5/1/04 6:01 AM, tboss@xxxxxxxxxxxxxxxxxx at tboss@xxxxxxxxxxxxxxxxxx
wrote:

> Why NOT have FKs in your data warehouse?  I realize its not the system
> of record, so to speak, but having RI constraints enabled after a data
> load is a nice way to confirm you didn't just load up any bad data.

[TG]: How about a validation report instead?  Much nicer, in my opinion...

> 
> What is the harm in having RI enabled?  Lets say I'm doing my data
> loads to the warehouse through transportable tablespaces, and then
> the alter table ... switch partition method described in teh Oracle
> DW guide.  AFter that's done, I rebuild indexes, re-enable RI constraints
> and I get a great quick check to confirm nothing unexpected just got loaded.

[TG]: If that "quick check" is the only reason for re-enabling the RI
constraints, why not do the job more efficiently using a validation report?
You can tune a validation report, take advantage of all of the wonderful
capabilities that Oracle provides.  Re-enabling a constraint has to done one
way and one way only...

> 
> Is there any overhead generated by leaving FK checks in place, once
> you have data loaded in?  Its just a rule-set check, right?  They'd
> never get "fired" on pure selects, and we're never updating or inserting
> transactionally into the DW.

[TG]:  Yup.

> 
> Also, If you don't have index-supports for  your natural FK-relationships
> between facts and dimensions, how would you ever resolve a common dimensional
> query?  We have bitmap indexes supporting every fact-dimension relationship
> (which are technically foreign keys).

[TG]: Response provided above.  Indexes and constraints are two different
topics...

Another valid purpose for PKs and FKs is to provide a form of "online
documentation" for query tools, to illustrate relationships between tables,
much the same way that Oracle's DIMENSION objects illustrate hierarchical
relationships.  In this case, I'd advise creating the PK and FK constraints,
but RELY (i.e. completely unenforced).  Again, its a matter of understanding
the purpose and matching the right feature to that purpose.

> 
> I'm not trying to be difficult; i'm just wondering if my DW implementation
> is wrong or needs serious thought.

[TG]: That's for you to decide.  If the RI constraints aren't inhibiting
anything, then that's OK.  In a large DW environment, bulk-loads and
exchange partition is an absolute necessity, and in such an environment, the
disadvantages of transaction-based RI constraints become unavoidably
apparent.  A DW that uses RI constraints in the manner you describe will not
scale well.

> 
> Todd
> 
>> 
>> Why are you using foreign keys in a data warehouse anyway?
>> 
>> The only technical reason for FKs in a DW is when one is trying to use
>> materialized view query-rewrite and one is utilizing QUERY_REWRITE_INTEGRITY
>> = ENFORCED to prevent MV "staleness".  Otherwise, there is no technical
>> reason for FKs, as there is no need for RI to be enforced in the DW, because
>> the DW is not the source-of-record for the data.  Validation reporting is
>> much more effective than enforced constraints.
>> 
>> Just my $0.02...
>> 
>> 
>> 
>> on 4/30/04 6:33 AM, jtesta@xxxxxxxxxx at jtesta@xxxxxxxxxx wrote:
>> 
>>> I did some testing since i'm going to implement something like that for
>>> the warehouse project i'm working on.
>>> 
>>> what i'm finding is when i create the table to hold the data(that will be
>>> exchanged from), if i add the FK to that table before the exchange is
>>> done(and everyone is enable validate right now), i have no problem doing
>>> the exchange.
>>> 
>>> so i must be missing something here in the thread, what is it?

----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
put 'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

Other related posts: