Re: de-dup process

  • From: "Gints Plivna" <gints.plivna@xxxxxxxxx>
  • To: ebadi01@xxxxxxxxx
  • Date: Wed, 13 Dec 2006 12:23:16 +0200

Instead of blindly loading all rows into the base table, you could
direct-path load into a staging table and then insert into the base
table only the distinct rows (if the base table is empty), or only
those distinct rows that aren't already in the base table (if it is
not empty), i.e. something like:
insert into base_table select * from temp_table minus select * from
base_table.
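A minimal sketch of that idea, assuming placeholder names (BASE_TABLE,
STAGE_TABLE, and DUP_KEY for the varchar2(30) column that identifies
duplicates are all hypothetical):

```sql
-- Step 1: direct-path load the daily file into STAGE_TABLE instead of
-- BASE_TABLE (e.g. sqlldr with direct=true), so no unique index on the
-- base table is needed during the load.

-- Step 2a: if duplicates are fully identical rows, MINUS removes both
-- the rows already in BASE_TABLE and the duplicates within STAGE_TABLE:
INSERT /*+ APPEND */ INTO base_table
SELECT * FROM stage_table
MINUS
SELECT * FROM base_table;

-- Step 2b: if rows can differ outside the key column, keep one arbitrary
-- row per key and skip keys already present in the base table:
INSERT /*+ APPEND */ INTO base_table
SELECT s.*
FROM   stage_table s
WHERE  s.rowid IN (SELECT MIN(rowid)
                   FROM   stage_table
                   GROUP  BY dup_key)
AND    NOT EXISTS (SELECT 1
                   FROM   base_table b
                   WHERE  b.dup_key = s.dup_key);
```

Note that MINUS deduplicates on the entire row, so variant 2b is the
one that matches "a single column identifies duplicates"; which variant
applies depends on whether the duplicate rows are byte-for-byte
identical.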

Eliminating the cause will probably be more effective than fighting
the consequences?

Gints Plivna
http://www.gplivna.eu


2006/12/13, A Ebadi <ebadi01@xxxxxxxxx>:
We have a huge table (> 160 million rows) which has about 20 million
duplicate rows that we need to delete.  What is the most efficient way to do
this as we will need to do this daily?
A single varchar2(30) column is used to identify duplicates.  We could
possibly have more than 2 duplicate rows per key.

We are doing direct path load so no unique key indexes can be put on the
table to take care of the duplicates.

Platform: Oracle 10G RAC (2 node) on Solaris 10.

Thanks!



--
//www.freelists.org/webpage/oracle-l
