RE: null or value

  • From: "Steiner, Randy" <Randy.Steiner@xxxxxxxx>
  • To: <tim@xxxxxxxxx>
  • Date: Thu, 17 Sep 2009 09:08:14 -0400

Tim,

 

It is a semi-dw application.  I get feeds once a day with many
transaction.  Some of the transactions are done with test cards.  I may
not know which the test cards are until days or weeks after the
transaction is loaded.  So I will do a batch update going back several
weeks, to mark which transaction are done with the test cards.  Most of
the reports will be for only the non test transaction.  

 

Randy

 

________________________________

From: Tim Gorman [mailto:tim@xxxxxxxxx] 
Sent: Thursday, September 17, 2009 3:31 AM
To: Steiner, Randy
Cc: oracle-l@xxxxxxxxxxxxx
Subject: Re: null or value

 

Randy,

Everyone else is going to respond to this question focusing on the
issues of NULL-ness or NOT NULL-ness, etc...

I'd like to step back a bit and reflect on a different concern, one of
design:  Is this a data warehouse application, and the proposed
ACTIVE_FLAG column an indicator of which "version" of a record is
supposed to "active" when there are multiple versions, such as...

         ID           EFF_DT      ACTIVE_FLAG
=========== ================ ================
     123456      02-Aug 2009                Y
     123456      17-Jun 2007                N
     123456      23-Dec 2005                N
     123457      11-Mar 2009                Y
     123457      08-Oct 2006                N
     123458      11-Jan 2005                Y
     123459      01-Jul 2008                Y
     123459      12-May 2004                N

...(and so on)...


So, the row with the latest EFF_DT value for each distinct ID value has
ACTIVE_FLAG = 'Y', while all the other (older) rows with the same ID
value have ACTIVE_FLAG = 'N'?  And each time a newer row is added for a
particular ID value, then the existing row with ACTIVE_FLAG = 'Y' will
need to be updated?

If this is the case, then please know that this method of using an
ACTIVE_FLAG to identify the "current" or "active" row will become one of
the biggest ETL performance problems in your data warehouse, and even
worse it will prevent you from making use of the time-variant nature of
your data in order to set tablespaces containing older data to READ
ONLY.  Thus, as your data warehouses grows in volume, it will continue
to be necessary to backup the entire database on a regular basis, and
the entire database will continue to reside on expensive "tier 1"
storage, and your loins will become barren, and locusts will blanket the
earth, and...  sorry, got kind of carried away there...

But if this is not the case, then never mind...... :-)

Thanks!



Tim Gorman
consultant - Evergreen Database Technologies, Inc.
P.O. Box 630791, Highlands Ranch CO  80163-0791
website   = http://www.EvDBT.com/
email     = Tim@xxxxxxxxx
mobile    = +1-303-885-4526
fax       = +1-303-484-3608
Yahoo IM  = tim_evdbt



Steiner, Randy wrote: 

I have a table with 1 - 2 million records.  I need to add a column to
indicate if the records is active or not.  I would guess that only 1,000
of the records would not be active.  Should I make one of the values
null?  So I could put a Y or Null?  Or put Y or N?

Would a b-tree or bitmap index do any good?

 

Generally I would want to see all the records that are active

 

Thanks

Randy

 

Other related posts: