RE: Big Update on Busy Table

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <fuzzy.graybeard@xxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Thu, 17 Jul 2014 05:37:21 -0400

Hans identified a key question in 2) below.

 

IF there is an easy way to identify the chunks remaining to be updated, and
especially IF there is an efficient way to group rows in the same database
block together, then doing the update in chunks of at least 1,000 rows
differs from slow-by-slow by three orders of magnitude. IF the driving
select is cheap, this should be sufficient.
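The chunked-commit pattern can be sketched generically. This is a minimal
illustration only, using Python's sqlite3 as a stand-in for the real
database; the table name `t` and its columns are made up for the example:

```python
import sqlite3

def chunked_update(conn, chunk=1000):
    """Update all flagged rows, committing every `chunk` rows instead of per row."""
    cur = conn.cursor()
    commits = 0
    while True:
        # Driving select: the next chunk of rows still needing the update.
        ids = [r[0] for r in cur.execute(
            "SELECT id FROM t WHERE needs_update = 1 LIMIT ?", (chunk,))]
        if not ids:
            break
        marks = ",".join("?" * len(ids))
        cur.execute("UPDATE t SET val = 'new', needs_update = 0 "
                    "WHERE id IN (%s)" % marks, ids)
        conn.commit()   # one commit per chunk, not per row
        commits += 1
    return commits

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT, needs_update INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, 'old', 1)", [(i,) for i in range(2500)])
conn.commit()
n = chunked_update(conn)  # 2500 flagged rows in chunks of 1000 -> 3 commits
```

The point is the shape of the loop: 2,500 rows cost 3 commits instead of
2,500, which is where the orders-of-magnitude difference comes from.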

 

IF there is a lot of work in the select that identifies the rows needing the
update, but you can rely on the OLTP portion of the workload not changing
this column in a way that means you should no longer be updating it in your
batch job, then creating an interim table containing the rowids to be
updated, indexed by a nullable key holding the block of each row to be
updated, is a pretty nifty way to do this. Since only you will be updating
this list of rows to be updated in the real table, only your first select
has to operate against the OLTP table.

 

What you deposit in the (at this point unindexed) "list of rows to be
updated" table is the rowid; two columns containing the block id extracted
from the rowid, one of them initially NULL; and, if it varies by row, the
new value for the column, however you functionally determine that. Let's
call this the ACTION table with columns RID, ALL_BLOCKS, BLOCK_ACTION,
[NEW_VALUE].

 

Then create single column indexes on ACTION.ALL_BLOCKS and
ACTION.BLOCK_ACTION. At this point it helps to know the minimum and maximum
rows per block to be updated and the average.
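The min/max/average rows per block can be pulled from the ACTION table
itself with one aggregate over a group-by. A sketch, again with sqlite3
standing in and a hypothetical fill of ten blocks holding 1 to 10 rows each:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE action (rid INTEGER, all_blocks INTEGER, block_action INTEGER)")
# Hypothetical fill: block b holds b+1 rows, for blocks 0..9.
rows = [(rid, blk, None) for blk in range(10) for rid in range(blk + 1)]
cur.executemany("INSERT INTO action VALUES (?, ?, ?)", rows)

# Min, max, and average rows-per-block across the blocks to be updated.
mn, mx, avg = cur.execute(
    "SELECT MIN(c), MAX(c), AVG(c) FROM "
    "(SELECT COUNT(*) AS c FROM action GROUP BY all_blocks)").fetchone()
print(mn, mx, avg)  # 1 10 5.5
```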

If you don't mind the commit sizes being a bit variable, just update the
stats and use the average. For a useful total number of rows, update
BLOCK_ACTION to the ALL_BLOCKS value and set ALL_BLOCKS to NULL where
ALL_BLOCKS is not null. Then run your actual update where rowid is in
(select RID from ACTION where BLOCK_ACTION is not null), update ACTION to
set BLOCK_ACTION to NULL where BLOCK_ACTION is not null, and COMMIT. Rinse
and repeat until every ALL_BLOCKS value is NULL. [IF you are updating to a
constant NEW_VALUE, just leave that column out of this whole thing and use
the constant.]
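The whole nibble loop above can be sketched end to end. This is an
illustration of the shape of the algorithm only, with sqlite3 standing in
for Oracle: an integer `rid` plays the role of ROWID, `blk` the block number
extracted from it, and the constant-NEW_VALUE simplification is used, so
all names here are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Base table standing in for the big OLTP table: 1000 rows, 10 per "block".
cur.execute("CREATE TABLE base (rid INTEGER PRIMARY KEY, blk INTEGER, col TEXT)")
cur.executemany("INSERT INTO base VALUES (?, ?, ?)",
                [(i, i // 10, "old") for i in range(1000)])

# ACTION table as described: RID, ALL_BLOCKS, BLOCK_ACTION (NEW_VALUE omitted
# because the update here is to a constant).
cur.execute("CREATE TABLE action (rid INTEGER, all_blocks INTEGER, block_action INTEGER)")
cur.execute("INSERT INTO action SELECT rid, blk, NULL FROM base WHERE col = 'old'")
cur.execute("CREATE INDEX ix_ab ON action(all_blocks)")
cur.execute("CREATE INDEX ix_ba ON action(block_action)")
conn.commit()

BLOCKS_PER_BATCH = 25  # tune from the rows-per-block stats for a useful commit size

while True:
    # Move a batch of block ids from ALL_BLOCKS to BLOCK_ACTION; the
    # ALL_BLOCKS index "disappears" as its values go NULL.
    blocks = [r[0] for r in cur.execute(
        "SELECT DISTINCT all_blocks FROM action "
        "WHERE all_blocks IS NOT NULL LIMIT ?", (BLOCKS_PER_BATCH,))]
    if not blocks:
        break  # every ALL_BLOCKS is NULL: done
    ph = ",".join("?" * len(blocks))
    cur.execute(f"UPDATE action SET block_action = all_blocks, all_blocks = NULL "
                f"WHERE all_blocks IN ({ph})", blocks)
    # Actual update of the base table, block-clustered via the marked rows.
    cur.execute("UPDATE base SET col = 'new' WHERE rid IN "
                "(SELECT rid FROM action WHERE block_action IS NOT NULL)")
    cur.execute("UPDATE action SET block_action = NULL "
                "WHERE block_action IS NOT NULL")
    conn.commit()  # rinse and repeat
```

Each pass touches a bounded number of blocks, commits, and shrinks the work
remaining, which is the disappearing-index nibbler in miniature.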

 

I've suspended disbelief that you cannot find a useful window to do this as
a monolith and/or that a monolith would just be too doggone big. (IF you can
find an off-hours window and the frequency at which rows need the update is
not horrible, the CTAS solution already mentioned is likely good. IF you are
only updating several million rows of a 10 billion row table, that is
probably not the case unless partition pruning comes into play.)

 

Block-oriented disappearing-index batch nibblers are one of the design
answers to avoiding slow-by-slow or excessively sized monolith operations.
Remember that the general-case limitation is that no one else can be allowed
to change the required new_value, or whether a row should be updated, until
all the selected rows are complete. They CAN change anything else about the
row. Often (but not always) when a batch operation needs to be done on a
column these criteria are met.

 

Slow-by-slow (TKYTE) has badness about it. Sooner or later you do need to do
chunks of set operations in reasonable sizes. What is reasonable changes
over time.

 

mwf

 

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Hans Forbrich
Sent: Wednesday, July 16, 2014 5:36 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: Re: Big Update on Busy Table

 

A couple of thoughts:

1) Is it really going to take too long to just get it over and done with on
the live table?  Perhaps an off-hours one-shot transaction?  I've
occasionally rethought my bulk operations and realized that the overall hit
might not be 'that bad', but your situation is, of course, unknown to me.

2) How are you planning on getting those 'few thousand rows at a time'?  And
how are you planning on identifying those which have been done and are not
to be repeated?

3) Is this something that can be handled through PL/SQL's bulk operations?
Perhaps with a "SELECT FOR UPDATE OF"?
(http://morganslibrary.com/reference/plsql/array_processing.html#apbc and
other areas of Morgan's Library)

/Hans


On 16/07/2014 3:07 PM, Charlotte Hammond wrote:

Hi All,

 

I need to run an update of one column of millions of rows in a busy OLTP
table.  To prevent the transaction blocking other sessions for too long I'm
planning to break it into updates of a few thousand rows at a time with a
commit in between.   This will prevent any one row being locked for more
than about a second.   However it is tending towards slow-by-slow
processing:  is there a better way of updating this table without creating
long lived TX locks?

 

Thanks!

Charlotte

 
