10.2.0.3 RAC diag process - high cpu util

Hi Everybody,

 

Has anyone ever had problems with a spinning/runaway diag process (RDBMS
instance, not ASM) on 10.2.0.3 RAC / RH 4?

 

> ls -altr *diag*

...

-rw-rw----  1 oracle oinstall     311506 May 10  2008 prorac2_diag_27727.trc

-rw-------  1 oracle oinstall       1096 May 14  2008 prorac2_diag_13877.trc

-rw-------  1 oracle oinstall 4985163676 Dec  3 10:27 prorac2_diag_29143.trc

 

 > tail prorac2_diag_29143.trc

SKGXPSEGRCV: MESSAGE TRUNCATED user data 48 bytes payload 120 bytes

SKGXPSEGRCV: trucated message buffer data skgxpmsg meta data header 0x0x7fbfffd948 len 48 bytes

SKGXPSEGRCV: MESSAGE TRUNCATED user data 48 bytes payload 120 bytes

SKGXPSEGRCV: trucated message buffer data skgxpmsg meta data header 0x0x7fbfffd948 len 48 bytes

SKGXPSEGRCV: MESSAGE TRUNCATED user data 48 bytes payload 120 bytes

SKGXPSEGRCV: trucated message buffer data skgxpmsg meta data header 0x0x7fbfffd948 len 48 bytes

SKGXPSEGRCV: MESSAGE TRUNCATED user data 48 bytes payload 120 bytes

SKGXPSEGRCV: trucated message buffer data skgxpmsg meta data header 0x0x7fbfffd948 len 48 bytes

SKGXPSEGRCV: MESSAGE TRUNCATED user data 48 bytes payload 120 bytes

SKGXPSEGRCV: trucated message buffer data skgxpmsg meta data header 0x0x7fbfffd948 len 48 bytes
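
With a trace file of nearly 5 GB, a plain tail only shows the last few lines. A minimal sketch for checking whether the end of the file really consists of just these two messages is below. The function name summarize_trace and the default line count are hypothetical, and replacing hex addresses with ADDR is an assumption made so that messages differing only in the buffer address are counted together.

```shell
# Hypothetical helper: count the distinct messages in the last N lines of a
# trace file, so a multi-GB file does not have to be read in full.
# Usage: summarize_trace <trace_file> [line_count]
summarize_trace() {
    file=$1
    lines=${2:-100000}    # assumed default; adjust to taste
    tail -n "$lines" "$file" |
        grep -v '^[[:space:]]*$' |              # drop blank lines
        sed 's/0x[0-9a-fA-Fx]*/ADDR/g' |        # normalize hex addresses
        sort | uniq -c | sort -rn               # most frequent message first
}
```

For example, `summarize_trace prorac2_diag_29143.trc` prints one line per distinct message, with its count in the first column.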

 

And here's top output:

 

top - 10:30:14 up 202 days, 11:16,  4 users,  load average: 2.87, 2.77, 2.78

Tasks: 369 total,   2 running, 367 sleeping,   0 stopped,   0 zombie

Cpu(s): 24.3% us, 21.1% sy,  0.0% ni, 33.9% id, 19.2% wa,  0.0% hi,  1.4% si

Mem:  16408900k total, 16379576k used,    29324k free,   176712k buffers

Swap:  8388600k total,   677940k used,  7710660k free, 11023172k cached

 

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

29143 oracle    25   0 10.2g  91m  36m R 99.9  0.6  65151:03 ora_diag_prorac2

31085 oracle    15   0 10.1g 1.3g 1.3g S 38.5  8.2   0:30.15 oracleprorac2 (LOCAL=NO)

 6137 oracle    16   0 10.2g 818m 798m S 13.3  5.1   0:10.59 oracleprorac2 (LOCAL=NO)

 2783 oracle    15   0 10.1g 2.6g 2.6g S  4.6 16.9   0:35.02 oracleprorac2 (LOCAL=NO)

14250 oracle    15   0 10.1g 2.3g 2.3g S  3.0 14.9   0:27.06 oracleprorac2 (LOCAL=NO)

 

Killing the diag process just gets it restarted, and the new process runs
straight back up to the same high CPU utilization.

 

Searching MetaLink (ML) and elsewhere doesn't turn up anything.

 

Any ideas?

 

Thanks,

Juergen Stegmair

ENKITEC
