Andrew,
I think that might be a good idea for certain data patterns, index strategies
and versions of Oracle (I think you've got the (+) on the wrong side of the
join, by the way), but newer versions of the optimizer are capable of turning
an outer join into an anti join, and with the quick and dirty test I ran I got
the following execution paths (T1 is the big table, t2 is the small):
Original query: "not in (subquery)"
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | | | 126 (100)| |
| 1 | DELETE | T1 | | | | |
|* 2 | HASH JOIN RIGHT ANTI| | 47541 | 1114K| 126 (5)| 00:00:01 |
| 3 | TABLE ACCESS FULL | T2 | 6 | 54 | 2 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T1 | 63388 | 928K| 122 (4)| 00:00:01 |
------------------------------------------------------------------------------
Your query: "in (outer join subquery)"
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
|
-------------------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | | | 259 (100)|
|
| 1 | DELETE | T1 | | | |
|
|* 2 | HASH JOIN | | 154K| 3913K| 259 (9)|
00:00:02 |
| 3 | VIEW | VW_NSO_1 | 6 | 66 | 134 (11)|
00:00:01 |
| 4 | SORT UNIQUE | | 6 | 114 | 134 (11)|
00:00:01 |
|* 5 | HASH JOIN RIGHT ANTI| | 47541 | 882K| 126 (5)|
00:00:01 |
| 6 | TABLE ACCESS FULL | T2 | 6 | 54 | 2 (0)|
00:00:01 |
| 7 | TABLE ACCESS FULL | T1 | 63388 | 619K| 122 (4)|
00:00:01 |
| 8 | TABLE ACCESS FULL | T1 | 63388 | 928K| 122 (4)|
00:00:01 |
-------------------------------------------------------------------------------------
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: Andrew Kerber [andrew.kerber@xxxxxxxxx]
Sent: 23 August 2016 15:57
To: Jonathan Lewis
Cc: JDunn@xxxxxxxxx; Chris Taylor; oracle-l@xxxxxxxxxxxxx
Subject: Re: Deletion from large table
I have generally had good performance with syntax like this:
delete from big_table where id in (select big_table.id<http://big_table.id>
from small_table, big_table where
small_table.id<http://small_table.id>=big_table.id<http://big_table.id> (+) and
small_table.id<http://small_table.id> is null)
On Tue, Aug 23, 2016 at 9:38 AM, Jonathan Lewis
<jonathan@xxxxxxxxxxxxxxxxxx<mailto:jonathan@xxxxxxxxxxxxxxxxxx>> wrote:
Best access path does vary with circumstances.
If you're expecting lots of inserts while doing the deletes you may find that
as the delete progresses the rate slows down and the volume of undo applied for
read-consistency climbs.
If you see that as a problem it may be that finding an index that lets you walk
the big table in reverse order of data arrival may (slightly
counter-intuitively) improve performance.
Under any circumstances deleting by tablescan and deleting by index range scan
behave differently with respect to index maintenance (this note on big updates
also applies to big deletes:
http://jonathanlewis.wordpress.com/2006/11/22/tuning-updates/<https://jonathanlewis.wordpress.com/2006/11/22/tuning-updates/>
).
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>
[oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>] on behalf
of John Dunn [JDunn@xxxxxxxxx<mailto:JDunn@xxxxxxxxx>]
Sent: 23 August 2016 14:39
To: Chris Taylor
Cc: oracle-l@xxxxxxxxxxxxx<mailto:oracle-l@xxxxxxxxxxxxx>
Subject: RE: Deletion from large table
Unfortunately it’s a nightly thing….whilst updates are still going on….
John
From: Chris Taylor
[mailto:christopherdtaylor1994@xxxxxxxxx<mailto:christopherdtaylor1994@xxxxxxxxx>]
Sent: 23 August 2016 14:38
To: John Dunn
Cc: oracle-l@xxxxxxxxxxxxx<mailto:oracle-l@xxxxxxxxxxxxx>
Subject: Re: Deletion from large table
Is this a one time thing, or a regularly occurring thing? (A one time data
cleanup versus a nightly routine)
If it's a one time data cleanup (or rarely needed), I'd recommend saving off
the rows you want to keep into another table, truncate the big_table and reload
the rows from the temporary table you created to save the rows you wanted.
Delete is one of the (if not THE) single most expensive operation you can run
in a database (but I'm sure you're aware of that but wanted to mention it).
Chris
On Tue, Aug 23, 2016 at 5:17 AM, John Dunn
<JDunn@xxxxxxxxx<mailto:JDunn@xxxxxxxxx>> wrote:
I need to delete large numbers of rows from a large table based upon whether a
record exists in a small table.
I am currently using :
delete from big_table where not exists (select 1 from small_table
s where s.id<http://s.id> = b.id<http://b.id>)"
big_table may have up to 100,000 rows for the same id value.
small_table will only have one row per id value
Is there a better way to code this?
John
--
Andrew W. Kerber
'If at first you dont succeed, dont take up skydiving.'