Re: Tuning Update query

From: Jonathan Lewis <jlewisoracle@xxxxxxxxx>
To: oracle-l <oracle-l@xxxxxxxxxxxxx>
Date: Mon, 3 May 2021 16:16:08 +0100

William,

I haven't got around to setting up an always free 21c online yet, but if
you have a little time it would interesting to know if the update at the
end of https://jonathanlewis.wordpress.com/2012/10/26/running-total/ that
joins a table to a analytic select from itself now works, or whether some
other restriction appears.

Regards
Jonathan Lewis

On Mon, 3 May 2021 at 15:11, William Robertson <william@xxxxxxxxxxxxxxxxxxxx>
wrote:

The ORA-01779 error means you would need a unique index or constraint on
product_tab (part_dt, tid) to be able to use this syntax. (Strangely, I did
not get the error in 21c. Maybe there is a new feature I am missing.)

You may be able to combine the updates into a smaller number of merges
even when different rules apply, using CASE expressions so that each column
is only set to a new value when the relevant conditions are met, otherwise
set it to itself.

*when* *matched* *then* *update*
    *set* tmp.c1 = *case when* x* = *y *then* p.col1 *else* tmp.c1 *end*
*       , *tmp.c2 = *case when* a* =* b *then* p.col2 *else* tmp.c2 *end*
*;*

William

On 3 May 2021, at 12:57, Lok P <loknath.73@xxxxxxxxx> wrote:

Thank You Jonathan and Lothar. I was trying to modify one of the sample
updates(as below) as per your suggested method.

UPDATE GTT_TAB TMP
  SET (c1, c2, c3, c4,c5.. c11) =
         ( (SELECT col1,col2,..col11
              FROM PRODUCT_TAB P
             WHERE    P.TID = TMP.TID
                   AND TMP.CIND = 'Y'
                   AND P.PART_DT = TMP.PART_DT
                   AND P.CODE = 'XX'))

Jonathan, let me know if my understanding is correct on the suggested
points
" a) See if you can minimise the number of update ste[ps by updating from
a JOIN of several source tables"

So if I get the above point correct then I was trying to modify the UPDATE
as below , but I am getting  ORA-01779 while running to see the plan. So
does it mean that the GTT has to have a unique key present in it to have
this method work?

UPDATE (SELECT P.COL1...P.col11,TMP.C1.. TMP.C11
              FROM PRODUCT_TAB P,GTT_TAB TMP
             WHERE    P.TID = TMP.TID
                   AND TMP.CIND = 'Y'
                   AND P.PART_DT = TMP.PART_DT
                   AND P.CODE = 'XX'
)      SET    C1=COL1,    ...     C11=COL11;

ERROR at line 10:

ORA-01779: cannot modify a column which maps to a non key-preserved table

Regarding below point ,
"b) Rewrite the code to step through a series of GTT doing
        insert /*+ append */ into next_gtt select from previous_gtt join
{source tables}"

Do you mean i have to replace the UPDATE with INSERT /*+APPEND*/...
queries but as it cant be done in the same GTT , so i have to create
multiple GTT's for each UPDATES, so as to replace them with INSERT APPEND
queries?

Lothar , as you suggested , i tried modifying the same above query as
below MERGE statement. Though i was not able to see the plan on production
because its a DML, so i tried running it on DEV and as the data volume was
not same so with some force cardinality hints, i do see a HASH JOIN
operation, but not seeing bloom filter right away. Maybe   , we will see it
in prod. And also i remember , there was a bug on 11.2.0.4 which was
restricting the cell smart scan in case of DML queries and work around for
that was to set _serial_direct_read=always. Not sure if we have to do that
here in case of MERGE statement to go for cell offloading. Also if it's
worth modifying each of these 4-5 different UPDATES to one MERGE statement
each?  I was thinking if combining them to one is possible and will give
benefit.

MERGE INTO GTT_TAB TMP
USING (SELECT col1,col2,..col11,CODE, PART_DT,TID
              FROM PRODUCT_TAB P WHERE P.CODE = 'XX')P
             On (    P.TID = TMP.TID AND P.PART_DT = TMP.PART_DT   )
WHEN MATCHED THEN UPDATE
SET TMP.C1=P.COL1,
       ...
       TMP.C11=P.col11
where TMP.CIND = 'Y'

Execution Plan
----------------------------------------------------------
Plan hash value: 2005806797

-------------------------------------------------------------------------------------------------------------------------------

| Id | Operation                    | Name                 | Rows | Bytes
|TempSpc| Cost (%CPU)| Time    | Pstart| Pstop |

-------------------------------------------------------------------------------------------------------------------------------
|  0 | MERGE STATEMENT              |                      |    1 | 1339 |
     |  256K (1)| 00:51:23 |      |      |
|  1 | MERGE                       | GTT_TAB              |      |      |
     |           |      |      |      |
|  2 |  VIEW                       |                      |      |      |
     |           |      |      |      |
|* 3 |   HASH JOIN                 |                      |    1 |  639 |
732M|  256K (1)| 00:51:23 |      |      |
|  4 |    PARTITION RANGE ALL      |                      | 8000K|  640M|
     | 5894  (3)| 00:01:11 |    1 |  331 |
|* 5 |     TABLE ACCESS STORAGE FULL| PRODUCT_TAB          | 8000K|  640M|
     | 5894  (3)| 00:01:11 |    1 |  331 |
|  6 |    TABLE ACCESS STORAGE FULL | GTT_TAB              | 8000K| 4234M|
     |  206  (0)| 00:00:03 |      |      |

-------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
  3 - access("TID"="TMP"."TID" AND "PART_DT"="TMP"."PART_DT")
  5 - storage("P"."CODE"='XX')
      filter("P"."CODE"='XX')

Regards
Pap

On Mon, May 3, 2021 at 2:25 PM Jonathan Lewis <jlewisoracle@xxxxxxxxx>
wrote:

If you're updating many columns of every row in a large (global
temporary) table using lots of separate statements that use the "set  ..
subquery" strategy you are introducing 4 problems.

a) Updating lots of columns means you will probably start doing lots of
row migration - and you can't set pctfree on GTTs to avoid this, it WILL be
10%

b) You will be generating a huge amount of undo

c) If you aren't including a "where exists" predicate on the source table
that matches the predicate on the set subquery you will be updating rows to
null that don't need to be updated - the monitor report echoed by Mladen
showed 17M rows updated, but only 8M executions of the  subquery - which
probably means 9M rows in the GTT updated to NULL.

In your specific case there's (probably) a logic flaw in the update
statement. You have a "where tmp.cind='Y'" in the subquery. This means
(effectively) set the values to NULL when tmp.cind ='Y', NOT 'don't update
the rows where tmp.cind='Y'.  If you want the update to be selective then
you need the predicate as part of the where clause on the update.

Finally -
If this is a table of roughly 17M rows (rather than the 8billion of the
estimate) then the fastest strategy would probably be to
a) See if you can minimise the number of update ste[ps by updating from a
JOIN of several source tables
b) Rewrite the code to step through a series of GTT doing
        insert /*+ append */ into next_gtt select from previous_gtt join
{source tables}

Using insert /*+ append */ with gtts will:
a) use bulk processing rather that the row by row processing for filter
subqueries
b) miimise BOTH undo and redo.

Regards
Jonathan Lewis

On Sun, 2 May 2021 at 19:53, Lok P <loknath.73@xxxxxxxxx> wrote:

Hello, We have a plsql procedure in which it  updates multiple columns
of the global temporary table after loading it in different steps. It's
updated for different filter criteria and each of these UPDATES are running
for hours even if they are simple ones. I am trying to see if it's
possible to do it in an efficient fashion by combining all different
updates into one UPDATE statement or any other possible method? One initial
thought was to do it as part of the INSERT itself rather than updating
it through multiple statements at the later stage , but then I see the
data load/INSERT is happening for multiple if/else conditions which may not
be straight enough.

Attached is the sql monitor for few of the long running UPDATES. And i
am seeing this is not really spending time for updating rows but rather in
finding the eligible rows for update i.e. the SELECT part of it is actually
taking time. So wanted to understand how these UPDATES can be done faster?

Its version 11.2.0.4 of Oracle Exadata-X5.

Regards

Lok

Follow-Ups:
- Re: Tuning Update query
  - From: William Robertson

References:
- Tuning Update query
  - From: Lok P
- Re: Tuning Update query
  - From: Jonathan Lewis
- Re: Tuning Update query
  - From: Lok P
- Re: Tuning Update query
  - From: William Robertson

Re: Tuning Update query

Other related posts: