Sessions wait on "resmgr:cpu quantum" with little CPU load

  • From: Yong Huang <yong321@xxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Mon, 14 Jul 2008 12:57:44 -0700 (PDT)

Environment: 4-node RAC. A user reported an application slowdown that happened
last Friday. I pulled out my "perfmon" log (a log of top output correlated with
session wait events). For about 20 minutes, about 100 sessions on the 3rd node
were waiting on the "resmgr:cpu quantum" event, with p1 mostly being 1 or 2,
occasionally 3 or 4. The top output captured on node 3 is below (head and sed
are from my monitoring script):

top - 16:09:02 up 92 days, 10:03,  0 users,  load average: 0.26, 1.16, 1.65
Tasks: 340 total,   1 running, 339 sleeping,   0 stopped,   0 zombie
Cpu(s): 17.8% us,  6.6% sy,  0.0% ni, 71.1% id,  4.0% wa,  0.1% hi,  0.3% si
Mem:   8293488k total,  8158512k used,   134976k free,   298864k buffers
Swap:  8388576k total,   940624k used,  7447952k free,  4252272k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11434 oracle    -2   0 5241m 2.1g 2.1g S    1 26.9 122:53.97 ora_lms6_riscp13
  424 oracle    19   0 53216 1344 1136 S    0  0.0   0:00.00 head -15
  425 oracle    18   0 53408 1616 1392 S    0  0.0   0:00.00 sed s/  *$//
  451 root      18   0  4176 1424 1120 S    0  0.0   0:00.00 /bin/sleep 1
11410 oracle    -2   0 5241m 2.2g 2.2g S    0 27.6 309:45.19 ora_lms0_riscp13
11414 oracle    -2   0 5241m 2.1g 2.1g S    0 26.9 120:41.45 ora_lms1_riscp13
11418 oracle    -2   0 5241m 2.1g 2.1g S    0 26.9 119:07.49 ora_lms2_riscp13
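
A minimal sketch of how a snapshot like the one above might be captured and
its idle CPU figure extracted, for anyone who wants to run the same check
against their own logs. The function names are hypothetical, and the top
invocation is an assumption: only the head and sed stages are visible in the
captured output. The pattern expects the older "Cpu(s):" summary line shown
here; newer versions of top format that line differently.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative, not from the original script.

# Capture one batch-mode top snapshot. "top -b -n 1" is an assumption;
# "head -15" and the sed that strips trailing spaces appear in the log above.
top_snapshot() {
    top -b -n 1 | head -15 | sed 's/  *$//'
}

# Read a top snapshot on stdin and print the idle CPU percentage
# taken from the "Cpu(s):" summary line.
cpu_idle() {
    sed -n 's/^Cpu(s):.* \([0-9.]*\)% id.*/\1/p'
}
```

Running "top_snapshot | cpu_idle" each time the script samples the wait events
would put an idle figure next to the count of waiting sessions, making it easy
to spot intervals where sessions wait on the resource manager while the host
is mostly idle.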

The DB sessions were from different applications, and their SQL IDs varied. No
new datafile was added during that period (so it's not Bug 4602661).

Other nodes sometimes had high CPU usage, but it was always caused by
"/home/oracle/product/10.2.0/crs/bin/crs_stat.bin -t", not by any DB server
process (i.e. "oracleSID (LOCAL=...").

Nothing interesting is recorded in alert.log or /var/log/messages on any node.
Resource manager was set up on this database a long time ago. It doesn't seem
to limit what I want it to limit (the mysterious crs_stat.bin, probably from
emagent or some monitoring script), and it limits what it should not. Any
insight is appreciated.

Yong Huang

$ uname -a
Linux xxx03p 2.6.9-55.EL #1 SMP Fri Apr 20 16:30:19 EDT 2007 ia64 ia64 ia64
GNU/Linux

SQL> select * from v$version where rownum = 1;
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bi




      