Re: ASH reporting on SQL spending most of time on CPU

From: Jonathan Lewis <jlewisoracle@xxxxxxxxx>
To: ORACLE-L <oracle-l@xxxxxxxxxxxxx>
Date: Fri, 27 Aug 2021 12:41:26 +0100

Two basic points

a) If you use the same table more than once in a single query don't use the
same alias every time append a counter to the base alias e.g. msm, msm2,
ct, ct2. This makes it easier to read and reduces the risk of a logic error
getting into subqueries. It also makes the predicate information easier to
read.

b) There's a very clear indication in the SQL Monitor report that the
optimizer has picked a very bad plan because of a very bad cardinality
estimate, and as a starting point this means you probably have some stats
on the tables that do not represent the truth. I see that you have
dynamic_sampling at level 6 so maybe you've deliberately left some tables
with no stats, so you may not be allowed to do anything about gathering
suitable stats.

The clue about stats is that the hash join at operation 14 that joins to
the customer table has an estimate of 4114 that drops to an estimate of one
after the join to cust_terr, when the join to cust_terr drops the actual
row coutns from 494,000 to 61,000. The estimate of 1 is the reason why the
optimizer then thinks a nested loop anti-join is a good idea for the "NOT
IN" transformation when (perhaps) a HASH anti-join would be far better.

Another indicator is that the anti join has eliminated all 61,000 rows (and
checking furhter down the plan you've got through 6,300 customers out of
5M, so that plan has a long time to run.

SO: a quick and dirty test -
how long would it take to produce the result of the main query if you got
rid of the subquery but added ct.cust_id to the select list, and how many
rows.
how long would it take to produce the FULL set of data from the subquery if
you ran it standalone, again adding the c.cust_id to the list but
eliminating the correlation predicate (c.cust_id = ct.cust_id).
I think it should be possible to structure the query to run in a little
more than the time it takes to produce those two results.

One thing that puzzles me - why do you have a NOT IN subquery with a
correlation predicate rather than a NOT EXISTS ?

Regards
Jonathan Lewis

(P.S. as a really quick ahd dirty test, try adding the hints /*+
unnest(UNNEST_INNERJ_DISTINCT_VIEW NO_MERGE NO_PUSH_PRED) */ to the
subquery and see if that changes the plan. I think it might not have any
effect because of that correlation predicate)

On Wed, 25 Aug 2021 at 12:12, Goti <aryan.goti@xxxxxxxxx> wrote:

Thanks Laurentiu for responding. Please find the below link which contains
the sql monitor report. This was taken 5 minutes after it was started.The
SQL ran for more than 5 hours and hence we had to cancel that. I don't see
SQL in RTSM history too. I am not sure how far this will be useful.

https://gist.githubusercontent.com/aryangoti/eae0776d88d46361c2c60d0d6b083680/raw/ec1ac6a6359ca3f99c6f7216850dfa09103b6d78/gistfile1.txt

Thanks,
Goti

On Wed, Aug 25, 2021 at 4:07 PM Laurentiu Oprea <
laurentiu.oprea06@xxxxxxxxx> wrote:

Hello,

Can you attach a sql monitor report?

În mie., 25 aug. 2021 la 12:58, Goti <aryan.goti@xxxxxxxxx> a scris:

Hi All,

We have a SQL performance issue and as per ASH the SQL is spending most
of its time on CPU. Not sure which operation is taking more time.  Can
someone guide me how to troubleshoot what is causing the SQL to spend time
on the CPU?

SQL_ID             EVENT                                SESSION MODULE
                            PCT
------------------ ------------------------------------ -------
------------------------------------ -------
fyfnh1pnxjg2d                                           ON CPU  PL/SQL
Developer                     91.9%  <<<<<<<<<<<<

OBJ                         OTYPE                              OID ONAME
                               SQL_ID             CLASS              PCT
--------------------------- --------------------------- ----------
------------------------------------ ------------------ ------------------
----------
XIE1CUST_TERR               INDEX PARTITION                 106842
XIE1CUST_TERR                        fyfnh1pnxjg2d      data block
0%
XFK1PARTITION_TERR_POSTAL   INDEX PARTITION                 107430
XFK1PARTITION_TERR_POSTAL            fyfnh1pnxjg2d      data block
.2%
XFK1PARTITION_TERR_POSTAL   INDEX PARTITION                 107435
XFK1PARTITION_TERR_POSTAL            fyfnh1pnxjg2d
95.5%
XIE1CUST_TERR               INDEX PARTITION                 106843
XIE1CUST_TERR                        fyfnh1pnxjg2d      data block
1.6%
XIE1CUST_TERR               INDEX PARTITION                 106841
XIE1CUST_TERR                        fyfnh1pnxjg2d      data block
.1%
XIE4CUST_TERR               INDEX PARTITION                 106911
XIE4CUST_TERR                        fyfnh1pnxjg2d      free list
0%

                                fyfnh1pnxjg2d                         0%
XIE1CUST_TERR               INDEX PARTITION                 106839
XIE1CUST_TERR                        fyfnh1pnxjg2d      data block
2.3%
XIE1CUST_TERR               INDEX PARTITION                 106838
XIE1CUST_TERR                        fyfnh1pnxjg2d      data block
.3%

Execution Plan Details:

https://gist.githubusercontent.com/aryangoti/7f0bc85cbe6df372e488deecbfdf30ef/raw/40598552b0a5c47c6857c86b06eabbca2d5daf8a/gistfile1.txt

Thanks,
Goti

Follow-Ups:
- Re: ASH reporting on SQL spending most of time on CPU
  - From: Jonathan Lewis

References:
- ASH reporting on SQL spending most of time on CPU
  - From: Goti
- Re: ASH reporting on SQL spending most of time on CPU
  - From: Laurentiu Oprea
- Re: ASH reporting on SQL spending most of time on CPU
  - From: Goti

Re: ASH reporting on SQL spending most of time on CPU

Other related posts: