Weird database hanging

Running 10gR2 (10.2.0.2) on RHEL4 on a box with four 64-bit dual-core
CPUs.  Earlier today users complained en masse about not being able to
connect to the database.  I got to a terminal and found that even
logging in locally as sysdba was hanging.

The alert log had this:
Fri Sep 14 11:23:36 2007
kkjcre1p: unable to spawn jobq slave process
Fri Sep 14 11:23:36 2007
Errors in file /u00/app/oracle/admin/foo/bdump/foo_cjq0_11519.trc:
Fri Sep 14 11:25:46 2007
kkjcre1p: unable to spawn jobq slave process
Fri Sep 14 11:25:46 2007
Errors in file /u00/app/oracle/admin/foo/bdump/foo_cjq0_11519.trc:
Fri Sep 14 11:26:49 2007
PMON failed to acquire latch, see PMON dump
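
To put a number on how long the trouble lasted, here is a rough Python
sketch that pairs each alert log message with the timestamp line above
it and measures the gap between the first jobq spawn failure and the
PMON latch message.  The function and variable names are made up for
illustration, and it assumes every timestamp line looks exactly like
the ones in the excerpt above.

```python
from datetime import datetime

# Alert log excerpt copied from this post; a timestamp line precedes each message.
ALERT_LOG = """\
Fri Sep 14 11:23:36 2007
kkjcre1p: unable to spawn jobq slave process
Fri Sep 14 11:23:36 2007
Errors in file /u00/app/oracle/admin/foo/bdump/foo_cjq0_11519.trc:
Fri Sep 14 11:25:46 2007
kkjcre1p: unable to spawn jobq slave process
Fri Sep 14 11:25:46 2007
Errors in file /u00/app/oracle/admin/foo/bdump/foo_cjq0_11519.trc:
Fri Sep 14 11:26:49 2007
PMON failed to acquire latch, see PMON dump
"""

# Assumed timestamp layout, e.g. "Fri Sep 14 11:23:36 2007".
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_alert_log(text):
    """Return (timestamp, message) pairs; a message takes the latest timestamp seen."""
    events = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            current = datetime.strptime(line, STAMP_FORMAT)
        except ValueError:
            # Not a timestamp line, so treat it as a message.
            if current is not None:
                events.append((current, line))
    return events


events = parse_alert_log(ALERT_LOG)
first_spawn_failure = next(ts for ts, msg in events if "unable to spawn" in msg)
pmon_failure = next(ts for ts, msg in events if msg.startswith("PMON failed"))
window_seconds = int((pmon_failure - first_spawn_failure).total_seconds())
print(window_seconds)
```

For the excerpt above this gives a bit over three minutes between the
first spawn failure and the PMON message.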

I went off to search Google, but before I came back my login had gone
through and things seemed back to normal.  Checking the alert log
again, I saw a lot of these:

WARNING: inbound connection timed out (ORA-3136)

I'm assuming those are just symptoms of the original cause, and I've
already found suggestions for treating timeouts in sqlnet.ora and
listener.ora.
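
For reference, the settings those suggestions point to look roughly
like the fragment below.  The values are placeholders I have not
tested, and the listener name LISTENER is an assumption; the suffix on
the listener.ora parameter has to match the actual listener name.

```
# sqlnet.ora on the database server (seconds; placeholder value)
SQLNET.INBOUND_CONNECT_TIMEOUT = 120

# listener.ora (seconds; suffix is the listener name, assumed to be LISTENER)
INBOUND_CONNECT_TIMEOUT_LISTENER = 110
```

These only change how long a connection attempt may take before it is
dropped, so they would treat the ORA-3136 warnings and not whatever
caused the hang.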

The trace files mentioned have lines such as:

Waited for process J000 to initialize for 60 seconds

repeated for 70, 80, 90 seconds, etc.
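
To see how long each job slave was kept waiting, a small Python sketch
along these lines can pull the longest reported wait per process out
of a trace file.  The sample text is rebuilt from the description
above and is not a verbatim copy of the trace, and the function name
is made up for illustration.

```python
import re

# Matches lines like "Waited for process J000 to initialize for 60 seconds".
WAIT_PATTERN = re.compile(
    r"Waited for process (\w+) to initialize for (\d+) seconds"
)


def longest_waits(trace_text):
    """Map each process name to the longest wait (in seconds) reported for it."""
    longest = {}
    for process, seconds in WAIT_PATTERN.findall(trace_text):
        longest[process] = max(longest.get(process, 0), int(seconds))
    return longest


# Sample reconstructed from the post's description, not the real trace file.
SAMPLE_TRACE = """\
Waited for process J000 to initialize for 60 seconds
Waited for process J000 to initialize for 70 seconds
Waited for process J000 to initialize for 80 seconds
Waited for process J000 to initialize for 90 seconds
"""

print(longest_waits(SAMPLE_TRACE))
```

Against the real file, the text would come from reading the
foo_cjq0_11519.trc trace named in the alert log.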

So in summary: connections hung for everyone, and then the problem
seemed to clear itself up.  One post I found via Google suggested that
the PROCESSES init parameter might be too low.  Any other possibilities?

-- 
Don Seiler
oracle: http://ora.seiler.us
ultimate: http://www.mufc.us
--
http://www.freelists.org/webpage/oracle-l

