Re: refreshing DW environment from APPS (9i)

  • From: cosmin ioan <cosmini@xxxxxxxxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Thu, 18 Jan 2007 14:48:03 -0800 (PST)

thanks much Jaromir;  I will certainly tinker with the ideas.   Yes, basically 
we would do an initial load when yes, we will take a hit, but then, daily 
delta's should be pretty fast  ;-) 
  ahh.... learned something new about the MV's.... did not check into the 
syntax and was not aware they could be partitioned.... so if that's the case -- 
and will thoroughly investigate that -- then that solves a big issue we're 
having....
   
  Thanks much again for the thorough & informative approach. 
  Cos
   
  

jaromir nemec <jaromir@xxxxxxxxxxxx> wrote:
        
  @font-face {   font-family: Tahoma;  }  @page Section1 {size: 8.5in 11.0in; 
margin: 1.0in 1.25in 1.0in 1.25in; mso-header-margin: .5in; mso-footer-margin: 
.5in; mso-paper-source: 0; }  P.MsoNormal {   FONT-SIZE: 12pt; MARGIN: 0in 0in 
0pt; FONT-FAMILY: "Times New Roman"; mso-style-parent: ""; mso-pagination: 
widow-orphan; mso-fareast-font-family: "Times New Roman"  }  LI.MsoNormal {   
FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; FONT-FAMILY: "Times New Roman"; 
mso-style-parent: ""; mso-pagination: widow-orphan; mso-fareast-font-family: 
"Times New Roman"  }  DIV.MsoNormal {   FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; 
FONT-FAMILY: "Times New Roman"; mso-style-parent: ""; mso-pagination: 
widow-orphan; mso-fareast-font-family: "Times New Roman"  }  P.MsoAutoSig {   
FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; FONT-FAMILY: "Times New Roman"; 
mso-pagination: widow-orphan; mso-fareast-font-family: "Times New Roman"  }  
LI.MsoAutoSig {   FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; FONT-FAMILY: "Times New
 Roman"; mso-pagination: widow-orphan; mso-fareast-font-family: "Times New 
Roman"  }  DIV.MsoAutoSig {   FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; 
FONT-FAMILY: "Times New Roman"; mso-pagination: widow-orphan; 
mso-fareast-font-family: "Times New Roman"  }  SPAN.EmailStyle15 {   COLOR: 
navy; mso-style-type: personal-reply; mso-ansi-font-size: 10.0pt; 
mso-ascii-font-family: Arial; mso-hansi-font-family: Arial; 
mso-bidi-font-family: Arial  }  DIV.Section1 {   page: Section1  }      Hi Cos,
   
  some thoughts to the topic
> The  problem with this approach, because of the "OR"
> clause, you're forced into full table scans.  (and here, I'm only giving as
> an example a 2 table join however there are FACTS, etc, that require many
> more joins so the "full table scan" stuff gets really ugly)
   
  this a is typical extraction problem. The OLPT databases are not suitable to 
perform large joins. 
   
  > so, basically, what we're tying to do is capture any changes for the year
> (month/day/whatever)
   
  I my experience to get a year of data from the source system is sometimes 
better to get all required tables separately (whole tables or  maybe there is a 
limit how long in the history can the application change the data; in this case 
you may limit the extract) and join the data in the stage area of the DW (using 
full scan and parallel  hash join).
   
  to extract a small number of data (e.g. daily changes) the application the 
above brute force approach is not suitable. The application should provide some 
change data capture (CDC) functionality, i.e. to store the whole required 
information in separated (delta) tables. There is a CDC provided by Oracle: 
synchronous base don triggers and asynchronous based on Oracle log. Some 
application implement proprietary CDC defining there own DELTA tables.
   
  If this is not an option, last resort is to define an index of the  
last_update column and try to tune the join to get cascaded nested loop plan. 
Important to note that this is valid only in case of extraction of *small* 
number of data. The FULL SCAN you mentioned above is in my opinion not caused 
by the OR but due to extraction of large parts of the table.
   
  > One can probably circumvent this by using materialized views to determine
> the delta,
   
  The materialized view LOG contains the delta. The materialized view 
replicates the state of the table (or a select) in the DW database.
The MV can be partitioned, check the CREATE MATERIALIZED VIEW command.
Not in all case is it possible to have a) fast refresh and b) the required join 
in the MV definition.
   
  Similar to the discussion above my experience with MV is:
to get the data once a year make a full refresh of the MV
to get the data daily or so make two steps: replicate the source tables with 
fast refresh and rebuild the new state (join) in DW 
to get the data more fervently some kind of fast refresh must be found.
   
  
> and ((ooha.last_update_date > '1-jan-2005' and ooha.last_update_date <=
> 31-dec-2005')
last note - remember if a record can be updated after insert, you may get an 
incomplete state of 2005 changes in your example. 
If a record is updated 2005 *and* 2006 it will be missing the 2005 extract.
  
HTH 
   
  Jaromir D.B. Nemec
   

Other related posts: