Re: normalization

  • From: <rjsearle@xxxxxxxxx>
  • To: Oracle-L Freelists <Oracle-L@xxxxxxxxxxxxx>
  • Date: Wed, 2 Nov 2005 09:33:30 +1000

Hi Chris,
 Is everyone clear on the objectives of this database? Normalisation is not
the holy grail, it is one path to the holy grail! The primary benefit of
normalisation (IMHO) is to avoid update anomalies. So that is the goal.
 Therefore when deciding to normalise or not, or in this case whether to
store derivable values or not, you must consider the risk of update
anomalies, resulting in a loss of data integrity.
 Now if your environment is an OLTP, then your data integrity risk is higher
. If you proceed with storing derivable values in an OLTP, I believe that
you must have additional mechanisms to enforce data integrity, such as
background processes that verify the accuracy of the derived columns,
triggers, MVs, function based indexes etc. SO it may still make sense to
store derivable values, but remember that there is always a price to pay,
it's just a question of when the price is paid.
 Conversely, if the environment is DW then the risks are far lower, possibly
zero if the app only inserts and never updates. In this case, I advocate
storing derivable data, because it is stable, therefore there are no update
anomalies ('cause there are no updates!)
 My guess is that this is an OLTP env with stringent performance criteria.
In this situation, my preferred option are (in order)
1. Ensure that the existing hardware is being efficiently used. Try to
design the database for performance BEFORE jumping on the denormalise
option. This is a lazy approach that has many hidden costs. So look at
function based indexes, MVs, physical storage design etc. IE. start at the
existing hardware and work up from there. there are many, many stages of
tuning before a change is required to the data model. This possibly the
cheapest option
2. reconsider the hardware platform. It is easier and cheaper to upgrade the
hardware to achieve performance than to have a team of people design, build,
test, implement and support additional software to enforce data integrity.
3. Once the first two options are fully exhausted, then assess carefully the
costs and benefits of denormalising and decide if there is any real value.
If denormalisation is truly the only option then, remember that you are
compromising the primary goal above as you are introducing the risk of
update anomalies! So you must have measures to first avoid such problems and
also detect and correct problems should they arise (and they will!). So
first up, use constraints on the column, triggers on inserts and updates
etc. They are some of the problem avoidance measures. Secondly, you must
detect the problems, so background processes may be necessary. These
measures will cost to produce, support and employ. THere is a performance
hit from these measures as well as more design, develop, test, implement and
support costs. After all of this, there is still a risk of innaccuracies.
Remember to factor in to your cost model the cost to the business of
inaccurate data because this data will be used to underpin commercial
decisions, so I' recommend avoiding introducing flaws into the decision
process. That process already has enough flaws without compounding it with
dodgy data!.
   Where do I leave the $0.02?
 Russell

 On 11/2/05, stephen booth <stephenbooth.uk@xxxxxxxxx> wrote:
>
> On 01/11/05, Powell, Mark D <mark.powell@xxxxxxx> wrote:
> >
> > Jared >> Calculated fields are for reporting. eg. data marts and/or data
> > warehouses. <<
> >
> > For a simple colA + ColB * Col7 = calc_col value the cost should not be
> that
> > high. But what if you have to extract 50,000,000 rows in one shot then
> the
> > calculation time cost adds up quickly. So it will depend somewhat on
> > business requirements. Storing calculated columns is something to avoid
> > doing if at all possible, but doing so will generally not cause DML
> activity
> > anomalies like failing to normalize normal columns can.
> >
>
> In contemporary versions of Oracle you could possibly use a
> Materialized View instead of calculated fields in the table itself.
> Obviously there's a whole bunch of other issues that come along with
> MVs but they're probably worth looking at as an option.
>
> Calculated fields are a breach of 3NF. But then, like most things in
> life, business applications don't always fit into nice mathematical
> models so every rule has to have an addendum something like "...unless
> there's a really good reason to do otherwise." I think it's important
> to be aware that you are breaching a rule and to spend some time
> thinking about the implications.
>
> e.g. You might need a before update trigger to make sure that when any
> field that is used in a calculation is changed the fileds thaqt are
> calculated from that are recalculated. Actually you might be best to
> put the calculation in a trigger and make it a before insert or update
> if you can.
>
> Stephen
>
> --
> It's better to ask a silly question than to make a silly assumption.
> --
> //www.freelists.org/webpage/oracle-l
>
>
>

Other related posts: