Re: Data load question
- From: "jaromir nemec" <jaromir@xxxxxxxxxxxx>
- To: <dubey.sandeep@xxxxxxxxx>, "Mike Spalinger" <mike.spalinger@xxxxxxxxx>, <sfaroult@xxxxxxxxxxxx>
- Date: Mon, 10 Apr 2006 00:52:12 +0200
Hi Sandeep,
However, I want b.id and c.id as pk.
I'm not sure what you mean with PK in this context. What result you would
expect if the table a will be initially populated as follows?
insert into a values ('A1','B1');
insert into a values ('A1','B1');
If the problem is that the generated ID start from 1 (and violating so in
the next run the PK) you may use the sequence solution proposed by Stephane,
alternatively you may use a max subquery to get the last generated value,
see example below (the tricky part here is to get the max efficiently if the
tables b and c get large)
slightly modified Mike's solution
insert all
when table_num = 1 then
into b (id, col1)
values (a_seq, col1)
when table_num >= 1 then
into c (id, col2, b_id)
values (b_seq, col2, a_seq)
select col1,
col2,
row_number() over (partition by col1 order by col1) as table_num,
(select nvl(max(id),0) from b) +
dense_rank() over (order by col1) as a_seq,
(select nvl(max(id),0) from c) +
row_number() over (order by col1, col2) as b_seq
from a;
I'd prefer the sequence based solution if there are potentially more
parallel job performing the load. The MAX solution works well if the load is
performed serially one job at a time. The decision is more a matter of taste
depending on your preferences of dealing with redundant data.
If you plan to treat the col1 as a natural key and the id plays the
surrogate key a more complicated solution with a lookup in the table b (to
get the information whether the natural key is new and need a new surrogate
key) will be needed.
something like this.
insert all
when table_num = 1 and new_pk_b = 1 then
into b (id, col1)
values (a_seq, col1)
when table_num >= 1 then
into c (id, col2, b_id)
values (b_seq, col2, a_seq)
select a.col1,
a.col2,
row_number() over (partition by a.col1 order by a.col1) as table_num,
nvl(b.id,
(select nvl(max(id),0) from b) +
dense_rank() over (partition by case when b.col1 is not NULL then 0
else 1 end order by a.col1)
) as a_seq,
(select nvl(max(id),0) from c) +
row_number() over (order by a.col1, a.col2) as b_seq,
case when b.col1 is not NULL then 0 else 1 end new_pk_b
from a, b
where a.col1 = b.col1(+);
Depending on the question above you may need to deploy additionally some
de-dup processing.
Regards,
Jaromir D.B. Nemec
----- Original Message -----
From: "Sandeep Dubey" <dubey.sandeep@xxxxxxxxx>
To: "Mike Spalinger" <mike.spalinger@xxxxxxxxx>
Cc: <oracle-l@xxxxxxxxxxxxx>
Sent: Friday, April 07, 2006 10:50 PM
Subject: Re: Data load question
--
//www.freelists.org/webpage/oracle-l
Other related posts: