Re: Oracle 10g "Ghost" SID Eating up CPU

  • From: Martin Berger <martin.a.berger@xxxxxxxxx>
  • To: kjped1313@xxxxxxxxx
  • Date: Wed, 2 Jun 2010 20:19:00 +0200

Kellyn,

As you still have a sid and serial#, have you tried to 'alter system disconnect session' or 'alter system kill session' ?? maybe you should note dhe saddr and paddr
do you have a paddr to this sid?
Even if v$process does not shows anything, maybe x$ksupgp or x$ksupr or other structures which shows (prarallel) processes can give you some more details.

but if you can avoid bouncing the DB it would be worth investigating!


I'd also second Tim Gorman: if you identify all OS-process IDs which eats CPUs, can you identify all of these within your database(s)? Or are some processes remaining you cannot link to any purpose?

sorry, more questions than answers.

Martin


Am 01.06.2010 um 20:25 schrieb Kellyn Pedersen:

The last two weekends, due to some new code, my main datawarehouse/ OLTP, (yes, I know it's an oxy-moron and it's 10.2.0.4 on Linux, 64bit with a number of one off patches for parallel bugs...) has been overwhelmed by 32 CTAS concurrently running, all requesting 4 parallel on large table selects. Parallel was downgraded a number of times 75% or more during this step in their package. This is the second time I've come back in after the occurrance to find one parallel coordinator session running- on it's own, no other producers/consumers, no parent SID, just this one process eating up CPU-

SID SERIAL# STATUS OSUSER PROCESS MACHINE PROGRAM ROW_WAIT_OBJ# PDDL_STATUS PQ_STATUS EVENT
P1TEXT                              SECONDS_IN_WAIT
540 20564 ACTIVE sdev_user 31988 appmachine prodmachine (P039) 2815532 ENABLED ENABLED PX Deq: Execution sleeptime/senderid 141170

If you try to search for the OS Process, (31988), it doesn't exist, the SQL_ID is unknown but I can see it was sitting on the primary key for a particular table, (although different one than the last time this ghost was present last week!) What I believe happened is that the parallel query died, but the coorindator is still out there.

ERROR at line 1:
ORA-12805: parallel query server died unexpectedly
I found a couple of these errors, (12805) in trace files from the times that parallel was downgraded. The process doesn't exist on the app server, I don't have an OS PID to kill and I can't kill it at the Oracle session level, (ORA-00030: User session ID does not exist.) Last time we had a maintenance window and solved the problem quickly with a database cycle, but here I am again- HOW DO I GET RID of this thing!?!? It's starting to eat up CPU and won't die... :(

 SID PID Coordinator SPID  Group Set Degree Req Degree Wait Event
540 134 540 31988 PX Deq: Execution Msg

Anybody have any ideas? I actually have two P039 processes in my database right now! This cannot be good! :(

Kellyn Pedersen

Sr. Database Administrator

I-Behavior Inc.

http://www.linkedin.com/in/kellynpedersen

www.dbakevlar.blogspot.com


"Go away before I replace you with a very small and efficient shell script..."




Other related posts: