Re: Understanding "cursor: pin S wait on X"

From: Tanel Poder <tanel@xxxxxxxxxxxxxx>
To: thomas.kellerer@xxxxxxxxxx
Date: Tue, 9 Apr 2013 13:32:33 +0300
This wait happens when your session wants to examine / execute an existing
child cursor - but it's being pinned in X mode by someone else. The usual
reasons for this are:
1) The cursor got flushed out for some reason (or it's just the first time
anyone is running it) and the other session is currently hard
parsing/compiling/loading the contents of this cursor back into library
cache. The longer the parsing/loading takes, the longer you wait. If the
parsing session hits some (performance) issues, this pin may be held for a
long time.

or

2) The child cursor is already loaded into cache, but someone else is
currently flushing it out (shared pool shortage, fragmentation & loading of
a big object, SGA target / ASMM reducing the SGA size).

The thing with "auto-extending" connection pools is that there's a
chicken-and-egg situation with troubleshooting performance & utilization
spikes - what caused what? Sometimes there's just a mild performance hiccup
for whatever reason, which nobody would even noticed, but a trigger-happy
connection pool will greatly amplify it by starting up hundreds of new
connections during the problem time.

0.6 logons per second is not a problem if that's your actual number. But if
you are talking about 0.6 logons per second *on average in a 1-hour AWR
report, *then this could mean also (let's say) 40 new connections per
second during your spike time (of 60 seconds for example) and then no more
logins throughout the rest of the hour. 60 x 50 / 3600  = 0.66...

So if your spike lasted for X seconds, then your performance data should
also be from X seconds (or even more granular), otherwise these averages
over long time may hide the truth. You can parse the listener log with a
simple grep & awk command to get a more detailed overview (and somewhat
closer to the connection pool) about connection rates.

If you want to reproduce "cursor: pin S for X" waits, you can use one of my
seminar demo scripts (
http://blog.tanelpoder.com/files/scripts/aot/demo4.sql ). Just run the
entire script in one session (it will pin a child and end up hard parsing
for a very long time) and then run the select part of that script only in a
different session. But my demo very narrowly focuses on causing the wait
event problem, I don't think it will reproduce anything like you saw.

Another option would be to get 10s or hundreds of sessions running the same
SQL statement (and overload the CPUs) and then purging the cursor with
DBMS_SHARED_POOL.PURGE (not just flush shared pool as it will skip pinned
chunks, but purge will wait until it gets the pin).


-- 
*Tanel Poder*
Enkitec (The Exadata Experts)
Training <http://blog.tanelpoder.com/seminar/> |
Troubleshooting<http://blog.tanelpoder.com/>
 | Exadata<http://www.amazon.com/Expert-Oracle-Exadata-Apress/dp/1430233923>
 | Voicee App <http://voic.ee/>



On Tue, Apr 9, 2013 at 11:59 AM, Thomas Kellerer <thomas.kellerer@xxxxxxxxxx
> wrote:

> Hello all,
>
> we had a situation on a 2 node RAC system where for some reason the system
> more or less stopped working due to excessive "cursor: pin S wait on X"
> waits. (89% of the DB time on one node, 36% on the other node).
>
> This seems to be the result of the application servers opening an
> excessive high number of connections to the RAC (the reason for that is yet
> unknown).
>
> We are not done invstigating everything that happened on the application
> server yet, but it seems to be that all those sessions were running the
> same statements and for some reason they were all routed to the same RAC
> node.
>
> The top 5 Wait events from the overloaded node show up like this:
>
> Event                      Waits        Time(s)     Avg wait (ms)   % DB
> time   Wait Class
> cursor: pin S wait on X    4,411        429,949     97472           88.68
>       Concurrency
> library cache lock         246          30,608      124424          6.31
>      Concurrency
> db file sequential read    1,237,878    8,275       7               1.71
>      User I/O
> DB CPU                                  4,728                       0.98
> db file scattered read     1,014,046    2,738       3               0.56
>      User I/O
>
> (1 hour AWR report)
>
> In order to understand better the backround on when a "cursor: pin S wait
> on X" is requested, I tried to create a test environment to reproduce this.
> But up to now without luck. The "best" I can do simulating the high
> concurrency (on a test environment) will result in a high number of "latch:
> shared pool" and "latch: row cache objects".
>
> So my question is:
>
> Could somebody explain in more detail at which step(s) in the query
> processing a "cursor: pin S wait on X" is requested?
> By understanding the sequence that leads to such a wait event I hope to
> understand better what was going on on the Oracle side of the problem.
>
> Btw: is 0.6 logons per second considered a "logon storm"?
>
> Thanks in advance
>
> Kind regards
> Thomas
>
>
> --
> //www.freelists.org/webpage/oracle-l
>
>
>


--
//www.freelists.org/webpage/oracle-l
Follow-Ups:
- Re: Understanding "cursor: pin S wait on X"
  - From: Thomas Kellerer
References:
- Understanding "cursor: pin S wait on X"
  - From: Thomas Kellerer
Re: Understanding "cursor: pin S wait on X"

Other related posts: