RE: kkslce [KKSCHLPIN2] mutex contention on hard parse of PQ with binds

  • From: Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx>
  • To: oracle-l-freelists <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 15 Sep 2015 10:25:28 +0000



What's the environment from which you are calling the SQL ? Is it a basic
SQL*Plus session, or some type of front-end that might be causing the double
parse with binds anyway ? I forget which mid-tier, or driver it was but there
used to be common questions about seeing statement with bind variables being
parsed twice, once with no values for the bind variable, then a second time
because the values were only sent on a second call.

Can you find any consistency in the queries which show this behaviour (apart
from the partitioning) ? If you check the v$pq_tqstats view after running the
queries do you see a "ranger" operation as the first operation ? (This might be
visible as a S->P distribution into TQ10000 if all you can get at is execution
plans).

Have you tried adding hints to any of the queries that persistently show this
problem simply to check if the restriction on the optimization phase bypasses
the problem ?




Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: Stefan Knecht [knecht.stefan@xxxxxxxxx]
Sent: 15 September 2015 10:50
To: mohamed.houri@xxxxxxxxx
Cc: Jonathan Lewis; oracle-l-freelists
Subject: Re: kkslce [KKSCHLPIN2] mutex contention on hard parse of PQ with binds

So the key difference is that it's creating one more child, if we're using bind
variables. Whereas the query with literals only has child 0.

And that induces an overhead of anywhere between 1 and 3 seconds. Wow.

On Tue, Sep 15, 2015 at 4:48 PM, Stefan Knecht
<knecht.stefan@xxxxxxxxx<mailto:knecht.stefan@xxxxxxxxx>> wrote:
Here's another test, without binds:

SQL> select sql_id, child_number, executions, invalidations, parse_calls from
v$sql where sql_id='avgm0shx7hnmv';

SQL_ID CHILD_NUMBER EXECUTIONS INVALIDATIONS PARSE_CALLS
------------- ------------ ---------- ------------- -----------
avgm0shx7hnmv 0 1 0 9


We still see several parse calls. But zero "cursor: pin S wait on X" waits in
any of the tracefiles.



On Tue, Sep 15, 2015 at 4:38 PM, Stefan Knecht
<knecht.stefan@xxxxxxxxx<mailto:knecht.stefan@xxxxxxxxx>> wrote:
Mohamed

This is precisely happening during a hard parse only. There's nothing to share
- except between the PX slaves. The query is deliberately being hard parsed,
and a new cursor is created hence v$sql_shared_cursor, etc won't be helpful in
this case.

Stefan


On Tue, Sep 15, 2015 at 4:33 PM, Mohamed Houri
<mohamed.houri@xxxxxxxxx<mailto:mohamed.houri@xxxxxxxxx>> wrote:
"cursor: pin S wait on X" is a symptom of a session wanting a cursor on Shared
mode but another session is holding the same cursor in eXlusive mode.

Typically the cursor holding the eXclusive lock is hard parsing. It might also
be due to Extended Cursor Sharing kicking in and optimizing an execution plan
for each execution.This is why, as far as you are using bind variables, I would
have started by checking the reason for which your cursor is not shared (using
nonshared.sql script of Tanel Poder for example)

Best regards
Mohamed

2015-09-15 10:37 GMT+02:00 Jonathan Lewis
<jonathan@xxxxxxxxxxxxxxxxxx<mailto:jonathan@xxxxxxxxxxxxxxxxxx>>:


Are the 4 out of 8 isolated to one of the two slave sets, or are they scattered
randomly across the two slave sets. (I would suspect the former).

It's probably expected behaviour (at least in some circumstances) but an
extreme example of slow parse times. I have seen a couple of cases in 11g where
PX slaves actually managed to get different execution plans for the same query
at the same time because the slaves re-optimized the query - if this is
possible, then the slaves could be in a position where they are attempting to
re-optimize the query, but are still waiting for the QC to finish its parse
call.

Gathering related information:

From v$sql - with one call to execute a new version of the statement, how many
parse_calls does the parallel statement report.
If you enable 10053 for a serial and a parallel version, does the parallel
trace show much more work than the serial ? Does the 12c serial show much more
work than the 11g serial?
If you enable 10046 trace can you see the 4 PX slaves doing work that looks
like re-optimization after they get the child mutex

Speculative question
Do the queries involve partitioned tables ?





Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>
[oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>] on behalf
of Stefan Knecht [knecht.stefan@xxxxxxxxx<mailto:knecht.stefan@xxxxxxxxx>]
Sent: 15 September 2015 09:12
To: oracle-l-freelists
Subject: kkslce [KKSCHLPIN2] mutex contention on hard parse of PQ with binds

Hi all

Fairly odd event. After an upgrade to 12.1.0.2 we've been seeing very high
"cursor: pin S wait on X" waits. But here's the catch - they only happen under
very special (and seemingly odd) circumstances:

- Query runs in parallel
- Query uses bind variables
- Query is hard parsed

If we run the same query in serial, or with literal values, or we re-use a
cursor, no waits are seen.

The waits are huge (up to 0.3 seconds) and are seen on the PX slaves only. Say
we're using 8 slaves, 4 of them will show these waits. 4 won't. And only if
we're using binds. No binds, no waits.

Does anyone have an explanation for what could be happening?

I've already tried disabling some of the new fancy adaptive features, and most
bind related features (peeking, px bind peeking, bind value captures, etc).
Heck, it even happens with opt features set to 11.2.

On a very similar database, that's still on 11.2.0.3, we're also seeing the
same behavior. The key difference being that the waits are very short - less
than 0.01 seconds. But they're still there, and I don't see why they would be
in the first place. What causes Oracle to pin that mutex, if a bind variable is
used in the query?

A bug has been filed. So we'll see what comes out of that. It would seem odd
that we'd be the first ones to hit this, it's not exactly exotic to run a
parallel query with binds.

Stefan





--

Houri Mohamed

Oracle DBA-Developer-Performance & Tuning

Member of Oraworld-team<http://www.oraworld-team.com/>

Visit My - Blog<http://www.hourim.wordpress.com/>

Let's Connect - <http://fr.linkedin.com/pub/mohamed-houri/11/329/857/> Linkedin
Profile<http://fr.linkedin.com/pub/mohamed-houri/11/329/857/>

My Twitter<https://twitter.com/MohamedHouri> -
MohamedHouri<https://twitter.com/MohamedHouri>



Other related posts: