RE: help with a GES deadlock

Thanks. I'm curious though about something in that deadlock message below. It 
appears to be showing that some of the transactions are by SYS, which doesn't 
make sense. Note:
pid=41 serial=4770 audsid=156523088 user: 0/SYS

These commands are all run by the application connections, and I checked and 
don't see SYS connections from any of the app servers.

any ideas?

From: Tim Gorman [mailto:tim@xxxxxxxxx]
Sent: Monday, August 15, 2011 1:59 PM
To: Walker, Jed S; oracle-l@xxxxxxxxxxxxx
Subject: Re: help with a GES deadlock

Jed,

Don't get lost in the RAC wait events, they are all just mirrors to the non-RAC 
wait events.  Roughly, RAC wait events break down in two categories:  global 
enqueues and global cache.  The former (global enqueue waits) mirror all the 
non-RAC wait events starting with the phrase "enq:", the latter (global cache 
waits) mirror the datafile I/O waits (i.e. db file xxx read) and I/O contention 
waits (i.e. "buffer busy wait", "read on other session", etc).

My advice is to filter out the RAC waits and perform diagnosis based on the 
non-RAC waits.  Take a global view, of course -- look across all the instances, 
but try not to get distracted by the amplifications that the RAC waits present 
-- focus on the underlying non-RAC issues.  Fix the problem that way, make the 
non-RAC waits go away, and the RAC waits disappear commensurately.

Hope this helps...

-Tim

-----Original Message-----
From: Walker, Jed S [mailto:Jed_Walker@xxxxxxxxxxxxxxxxx]
Sent: Monday, August 15, 2011 09:33 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: help with a GES deadlock
Hi all,

Here?s my next new to RAC question. We are getting a Global Enqueue Services 
Deadlock quite often. Looking at the trace file it looks to me like it is a 
regular deadlock, so I?m wondering if something else is going on here, or if 
RAC just reports a regular deadlock as a GED deadlock. This is a five node 
10.2.0.4.0 RAC cluster.

Each schema should only be used on one instance of the database (each ?market? 
has its own schema and all connections for that market are sent to a single 
service running on a single instance, so there should not be any blocks being 
worked by multiple instances).

Would a standard deadlock show up as a GED deadlock in RAC?

If I read this correctly, sid 42 is doing a delete and then an update, but 41 
is doing an update and then a delete (presumably affecting the same rows).

Am I reading this correctly?


*** 2011-08-01 17:17:45.785
Setting 3-way CR grants to 1 global-lru off? 0
*** 2011-08-01 18:14:11.067
user session for deadlock lock 0x4560db490
  pid=41 serial=4770 audsid=156523088 user: 0/SYS
  O/S info: user: , term: , ospid: 1234, machine: 
flux-atlt-04.westchester.pa.bo.comcast.net
            program:
  Current SQL Statement:
  delete from STB_SETTING where STB_SETTING_ID=:1
user session for deadlock lock 0x4560db340
  pid=42 serial=16019 audsid=156522961 user: 70/RDVRGW_ATLT
  O/S info: user: , term: , ospid: 1234, machine: 
flux-atlt-01.westchester.pa.bo.comcast.net
            program:
  Current SQL Statement:

update FLUX_STB_MESSAGE_QUEUE set SETTING_TYPE_ID=:1, 
SETTING_TYPE_RECORD_ID=:2, MAC_ADDR=:3, MESSAGE_SEQUENCE=:4, REQUEST_TAG=:5, 
STB_SYNC_STATE_ID=:6, ACTION_ID=:7, MESSAGE_SENT_DATE=:8, REC_START_TIME=:9, 
REC_END_TIME=:10, MESSAGE_CREATED=:11, PRIORITY=:12, STB_SETTING_ID=:13 where 
FLUX_STB_MESSAGE_QUEUE_ID=:14
user session for deadlock lock 0x457130888
  pid=42 serial=16019 audsid=156522961 user: 70/RDVRGW_ATLT
  O/S info: user: , term: , ospid: 1234, machine: 
flux-atlt-01.westchester.pa.bo.comcast.net
            program:
  Current SQL Statement:

update FLUX_STB_MESSAGE_QUEUE set SETTING_TYPE_ID=:1, 
SETTING_TYPE_RECORD_ID=:2, MAC_ADDR=:3, MESSAGE_SEQUENCE=:4, REQUEST_TAG=:5, 
STB_SYNC_STATE_ID=:6, ACTION_ID=:7, MESSAGE_SENT_DATE=:8, REC_START_TIME=:9, 
REC_END_TIME=:10, MESSAGE_CREATED=:11, PRIORITY=:12, STB_SETTING_ID=:13 where 
FLUX_STB_MESSAGE_QUEUE_ID=:14
user session for deadlock lock 0x457130738
  pid=41 serial=4770 audsid=156523088 user: 0/SYS
  O/S info: user: , term: , ospid: 1234, machine: 
flux-atlt-04.westchester.pa.bo.comcast.net
            program:
  Current SQL Statement:
  delete from STB_SETTING where STB_SETTING_ID=:1
Global blockers dump start:---------------------------------
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x2fa0019][0xc6eb2],[TX]
----------resource 0x0x4329c7288----------------------
resname       : [0x2fa0019][0xc6eb2],[TX]
Local node    : 0
dir_node      : 0
master_node   : 0
hv idx        : 31
hv last r.inc : 10
current inc   : 10
hv status     : 0
hv master     : 1
open options  : dd
grant_bits    : KJUSERNL KJUSEREX
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
val_state     : KJUSERVS_NOVALUE
valblk        : 0x00000000000000000000000000000000 .
access_node   : 0
vbreq_state   : 0
state         : x0
resp          : 0x4329c7288
On Scan_q?    : N
Total accesses: 43
Imm.  accesses: 38
Granted_locks : 1
Cvting_locks  : 1
value_block:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
GRANTED_Q :
lp 0x4560db340 gl KJUSEREX rp 0x4329c7288 [0x2fa0019][0xc6eb2],[TX]
  master 0 gl owner 0x45b3b2330 possible pid 13832 xid 2B000-0001-00000BB0 bast 
0 rseq 3 mseq 0 history 0x14951495
  open opt KJUSERDEADLOCK
CONVERT_Q:
lp 0x4560db490 gl KJUSERNL rl KJUSEREX rp 0x4329c7288 [0x2fa0019][0xc6eb2],[TX]
  master 0 gl owner 0x45b41d3b0 possible pid 13830 xid 2B000-0001-00000BC8 bast 
0 rseq 3 mseq 0 history 0x1495149a
  convert opt KJUSERGETVALUE
??.



-          Jed Walker






Other related posts: