Re: de-dup process

From: Mladen Gogala <mgogala@xxxxxxxxxxx>
To: tboss@xxxxxxxxxxxxxxxxxx
Date: Wed, 13 Dec 2006 09:05:04 -0500

On 12/12/2006 08:42:38 PM, tboss@xxxxxxxxxxxxxxxxxx wrote:
> >From asktom, the best way I've found is to use Tom's little code snippet 
> >below:
> 
> delete from table your_huge_table
> where rowid in
>   (select rid
>         from
>         (select rowid rid,
>                      row_number() over
>                         (partition by varchar_that_defines_duplicates
>                          order by rowid ) rn
>    from your_huge_table
>   )
> where rn <> 1
> )
> /



Good luck with that if cardinality is +60M rows. I would also add a condition 
like
WHERE ROWID in (SELECT ROW_ID from EXCEPTIONS) to your query. You only need to 
populate
exceptions table with duplicates and if those duplicates are a small percentage 
of the
total number of records, your task will be done two order of magnitude faster 
then without
the exceptions table.
-- 
Mladen Gogala
http://www.mladen-gogala.com

--
//www.freelists.org/webpage/oracle-l

Follow-Ups:
- Re: de-dup process
  - From: A Ebadi

References:
- de-dup process
  - From: A Ebadi
- Re: de-dup process
  - From: tboss

Re: de-dup process

Other related posts: