RE: hash join waits on cpu 100% time

  • From: Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx>
  • To: "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
  • Date: Sun, 4 Jan 2015 12:12:37 +0000


You might as well capture the full SQL Monitoring output for the completed 
query.

Then I'd check the definition of the view V_SA_tab_personal_ADD  to see what 
it's doing with the "with subquery".  

I think most of the work relates purely to the number and size of the 
tablescans of the resultant GTT - but the CPU used still looks a little high 
(even if it does mange to scan 2 billiong rows in the first 1,000 seconds) so I 
want to know how it reduces those rows down to one per scan - as indicated by 
the Rows (Actual) in lines 19 and 18.  I'd want the execution plan from 
v$sql_plan so that I could check the predicate section to see how any 
predicates were applied.

If I were to start running the query again I would use Tanel's code to capture 
a snapshot of latch activity over a few seconds to see if there were any 
significant latch gets over and above those related to the buffer cache.


Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle

________________________________________
From: GG [grzegorzof@xxxxxxxxxx]
Sent: 04 January 2015 10:48
To: Jonathan Lewis; oracle-l@xxxxxxxxxxxxx
Subject: Re: hash join waits on cpu 100% time

W dniu 2015-01-04 o 11:30, Jonathan Lewis pisze:
> Checking the description of the bug it seems to say that it's the creation of 
> the internal GTT that is the threat. In the context of the bug - it's Siebel 
> (which typically means large numbers of columns) and a very large number of 
> (possibly small ?) queries. This would make the recursive SQL a significant 
> fraction of the total work done, and for concurrent queries introduce a 
> degree of latch contention.
>
> In your case you are talking about the execution of a single big query; 
> moreover, if the problem were recursive SQL then the Tanel snapshot would 
> show a high value for STAT execute count. Finally, if you check the 
> monitoring execution plan you can see (lines 12/13) that the Load as Select 
> starts 8 seconds into the query and completes within one second - that's the 
> point at which the bug would apply.
>
> There is one oddity that I would like  an explanation for - how come line 18 
> started at +4 seconds, when line 12 didn't start until +8 seconds ?  The only 
> (non-buggy) thing I can think of is that there's a run-time dependency that 
> doesn't show up in the execution plan - even then that's only showing (at 
> worst) only a 4 second anomaly, not a very slow query.
Got Your point Jonathan . Query finished after 10h , I'm out of office
right now so cant provide any more details .
Just wondering ,how can we narrow down the issue, should I go for
detailed lvl 12 10046 trace or/and better 10053 ?
Seems like Oracle instrumentation lacks of granularity if Your query
'hangs on' CPU .
Maybe I'll try with few short stack dumps using oradebug when query is
running .

Regards
GG


--
//www.freelists.org/webpage/oracle-l


Other related posts: