Re: refreshing DW environment from APPS (9i)
- From: "jaromir nemec" <jaromir@xxxxxxxxxxxx>
- To: <mwf@xxxxxxxx>, <cosmini@xxxxxxxxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
- Date: Thu, 18 Jan 2007 22:54:43 +0100
Hi Cos,
some thoughts to the topic
> The problem with this approach, because of the "OR"
> clause, you're forced into full table scans. (and here, I'm only giving as
> an example a 2 table join however there are FACTS, etc, that require many
> more joins so the "full table scan" stuff gets really ugly)
this a is typical extraction problem. The OLPT databases are not suitable to
perform large joins.
> so, basically, what we're tying to do is capture any changes for the year
> (month/day/whatever)
I my experience to get a year of data from the source system is sometimes
better to get all required tables separately (whole tables or maybe there is a
limit how long in the history can the application change the data; in this case
you may limit the extract) and join the data in the stage area of the DW (using
full scan and parallel hash join).
to extract a small number of data (e.g. daily changes) the application the
above brute force approach is not suitable. The application should provide some
change data capture (CDC) functionality, i.e. to store the whole required
information in separated (delta) tables. There is a CDC provided by Oracle:
synchronous base don triggers and asynchronous based on Oracle log. Some
application implement proprietary CDC defining there own DELTA tables.
If this is not an option, last resort is to define an index of the last_update
column and try to tune the join to get cascaded nested loop plan. Important to
note that this is valid only in case of extraction of *small* number of data.
The FULL SCAN you mentioned above is in my opinion not caused by the OR but due
to extraction of large parts of the table.
> One can probably circumvent this by using materialized views to determine
> the delta,
The materialized view LOG contains the delta. The materialized view replicates
the state of the table (or a select) in the DW database.
The MV can be partitioned, check the CREATE MATERIALIZED VIEW command.
Not in all case is it possible to have a) fast refresh and b) the required join
in the MV definition.
Similar to the discussion above my experience with MV is:
to get the data once a year make a full refresh of the MV
to get the data daily or so make two steps: replicate the source tables with
fast refresh and rebuild the new state (join) in DW
to get the data more fervently some kind of fast refresh must be found.
> and ((ooha.last_update_date > '1-jan-2005' and ooha.last_update_date <=
> 31-dec-2005')
last note - remember if a record can be updated after insert, you may get an
incomplete state of 2005 changes in your example.
If a record is updated 2005 *and* 2006 it will be missing the 2005 extract.
HTH
Jaromir D.B. Nemec
- Follow-Ups:
- Re: refreshing DW environment from APPS (9i)
- From: cosmin ioan
- References:
- RE: refreshing DW environment from APPS (9i)
- From: Mark W. Farnham
Other related posts:
- » re: refreshing DW environment from APPS (9i)
- » RE: refreshing DW environment from APPS (9i)
- » RE: refreshing DW environment from APPS (9i)
- » Re: refreshing DW environment from APPS (9i)
- » Re: refreshing DW environment from APPS (9i)
- Re: refreshing DW environment from APPS (9i)
- From: cosmin ioan
- RE: refreshing DW environment from APPS (9i)
- From: Mark W. Farnham