Re: select for update wait issues

From: Sandra Becker <sbecker6925@xxxxxxxxx>
To: "Mark J. Bobak" <mark@xxxxxxxxx>
Date: Mon, 13 Apr 2015 14:22:54 -0600

Contrary to what I was told this morning, we did have the option to
switchback to the old hardware. We had a standby running there.
Management decided we should switch because the application had ground to a
halt and customer transactions were rapidly queueing, which of course led
to a lot of customer complaints. So we're back on the old hardware.

log file syncs were waiting as long as 15 minutes, the average was about 7
minutes. Some buffer busy waits were over 20 minutes. Now that we're back
on the old hardware, the queue has been reduced and we're processing in
real-time again. I'm leaning towards the idea that the problem lies in our
new T5/EMC configuration. Maybe an incorrect or missing parameter
setting. We have engaged Oracle and EMC to help troubleshoot all the
performance issues on the new hardware. This is only one of 4 databases
we're having issues with on the new hardware. We definitely cannot switch
them all back to the old hardware, which doesn't exist anymore in some
cases.

During the switchback, we moved the redo logs, controlfiles, and flash
recovery area to faster disk. We decided to standardize. A novel concept
here.

I will definitely use the snapper.sql script on the other databases still
on the new hardware.

Thanks.

On Mon, Apr 13, 2015 at 11:46 AM, Mark J. Bobak <mark@xxxxxxxxx> wrote:

Sandra,

To get a better picture of what specific wait events the SELECT FOR UPDATE
is waiting on, try running Tanel Poder's snapper.sql while the problem is
actively happening.
snapper.sql is available here:
http://blog.tanelpoder.com/files/scripts/snapper.sql

Cut-n-paste from there, save as 'snapper.sql', then run it like this:
SQL> @snapper ash 30 1 all

'ash' mean active session history, 30 means 30 second snapshot, 1 means
run just one snapshot, and 'all' means consider all sessions on the local
instance. If you're on RAC, you can do 'all@*' to do all instances.

Post results back. Also, the output will show waits by SQL_ID, it would
be worthwhile to confirm which, if any of the SQL_IDs in the output are the
SELECT FOR UPDATE. Try 'select sql_fulltext from v$sqlarea where
sql_id='sql_id_from_snapper';

If you provide that info, I'm sure we'll be able to offer some insight.

-Mark

On Mon, Apr 13, 2015 at 1:29 PM MARK BRINSMEAD <mark.brinsmead@xxxxxxxxx>
wrote:

Sandra,

SELECT ... FOR UPDATE WAIT will try to acquire locks on each row
selected, and will wait (indefinitely) to get those locks. (This is
distinct from SELECT without the FOR UPDATE clause, which locks nothing,
and from the FOR UPDATE NOWAIT clause which returns an error if the locks
cannot be obtained).

There are lots of things that could be happening here, so Mladen's
question is a good one -- what are the WAIT EVENTS you are seeing these
statements waiting on? Some of them are probably waiting on locks held by
other sessions (most likely other sessions running the same statement) --
it might be very interesting to see what the ones *not* waiting on locks
are doing.

In situations where you have lots of sessions running statements like
this, things will often work fine so long as the SELECT statements -- and
the rest of the transaction following them -- all run quickly. A session
grabs a lock, does some work, commits that work, and the lock is released
before there is a high probability of another session trying to lock the
same row(s).

If something has changed to make these transactions slower (that could
the the SELECT statement itself, or the statements that follow, up and
including to the COMMIT) some systems can move from running "just fine" to
running "almost not at all" very abruptly.

(Note: The foregoing statements are generalizations just loaded with
assumptions. How well they might fit your situation is yet to be seen.)

Please note that while you analyse this, it is important to pay
attention to the AMOUNT OF TIME WAITED, and not just the number of waits.
It will probably be helpful, too, to compare the amount of time waited on
relevant events now versus the amount of time waited on the same events
when the system was performing well.

Apologies if I am telling you things you already know. If these are
not things you already know, then hopefully this will at least help to get
you started.

On Mon, Apr 13, 2015 at 12:25 PM, Sandra Becker <sbecker6925@xxxxxxxxx>
wrote:

Oracle 11.2.0.2
Solaris 10 64-bit on Oracle T5
EMC storage (fast cache enabled)

Last Wednesday night we moved a production database to new hardware. We
do not have the option to move it back. On Friday we started experiencing
severe performance issues. ASH reports show the "SELECT...FOR UPDATE
WAIT..." running as the top SQL for events and row sources. This is NOT
new code. What can I look at, is there anything I can do, short of
killilng sessions, to get this under control? I'm opening a ticket with
Oracle, but this forum generally responds faster.

I'm not really familiar with how the SELECT FOR UPDATE WAIT works and
haven't found a good explanation yet. The select is on a partitioned
table. Does it lock only a row or does it lock the partition? At times I
can see a dozen or more of these statements running. Once they complete, I
then see a lot of waits on log file sync.

If anyone knows, how does having fast-cache enabled affect this?

Thanks for any suggestions/guidance.

--
Sandy
GHX

--
Sandy
GHX

Follow-Ups:
- Re: select for update wait issues
  - From: Freek D'Hooge

References:
- select for update wait issues
  - From: Sandra Becker
- Re: select for update wait issues
  - From: MARK BRINSMEAD
- Re: select for update wait issues
  - From: Mark J. Bobak

Re: select for update wait issues

Other related posts: