Kellyn, Rather than just bouncing the database instance, especially if there is no direct proof that application performance is impacted, then it is better to stick it out and research it into the ground. If there is no process in the OS, then how is it possible that CPU is being consumed? By what OS utility is that observation coming from? What is the chain of evidence indicating massive CPU consumption from outside the Oracle database instance? You can use "top" or a hand-rolled version (i.e. "ps -eaf | sort -n +3 | tail") next time this happens, to see if any process on the server is eating up CPU. The nice thing about the hand-rolled "ps" command is that it doesn't just show you the executable name (i.e. "oracle") as "top" does, but shows you the functional command-name (i.e. "ora_pmon_XXX", etc) that "ps -eaf" shows, which can be helpful. Whatever you find, you can backtrack the OS PID into the database using "select * from v$session where paddr in (select addr from v$process where spid = NNNN)" and see whether it is a "ghost" session or something more normal and expected. Also, from within Oracle, what are the values for STATUS, EVENT, P1, P2, P3 for this session? If the session seems to be waiting on something, then it is further unlikely that CPU is being consumed. Particularly if the wait-event seems to be related to an OS system call, such as I/O. If the Oracle instance is holding a session open when its underlying OS process has actually been killed, then certainly it is not chewing up CPU, but it may be holding locks, which could possibly be more devastating. You may want to query V$LOCK for that session's SID value to see what, if any, locks are being held. More importantly, you would want to look at the BLOCKS column in V$LOCK to see if the session is blocking other sessions (i.e. BLOCKS > 0) -- bear in mind that the column name (BLOCKS) is being used as a verb in the context of V$LOCK, not as a noun. If BLOCKS > 0, you may want to look at what those other sessions are by querying V$LOCK again, this time by the same TYPE and ID1 value as the blocking lock. ----- As far as why this is happening, there should be trace files left behind in the BACKGROUND_DUMP_DEST directory for the terminated Pnnn processes, and those trace files might give some general clue, in the first 50-100 lines of text. If I had to guess off the top of my head, purely speculating, I would guess that you're encountering some OS resource limitation. The reason for this guess is because the failures are occurring when the demand for OS resources are sky-high (i.e. 32 concurrent CTAS operations, each degree 4, etc). Please verify that allocated swap space is 75% of physical RAM, that process limits are configured correctly. There's more to check, and of course it varies by platform, so it can be fairly laborious when doing it the first time. ----- For a faster way to do this, especially on a regular basis, the RDA HCVE (i.e. Health Check Validation Engine) for your RDBMS version and platform is pretty helpful and authoritative. To download, go to support.oracle.com, search on keywords "rda hcve download", download the appropriate ".tar" file to your database server, unpack, and just run the command "./rda.sh -T hcve" and follow the shell-script prompts. Be aware that HCVE is intended as a "pre-installation" validation, and that some of the validations will fail expectedly in an already-installed environment (i.e. "LD_LIBRARY_PATH is set", etc). If you need help interpreting the report, feel free to post back. I get the feeling not many people know about HCVE. I like to run it periodically on existing environments because the OS configuration gets changed as often as the RDBMS configuration, and a server which was configured correctly when Oracle was installed may no longer be configured within recommended guidelines. Hope this helps... Tim Gorman consultant -> Evergreen Database Technologies, Inc. postal => P.O. Box 630791, Highlands Ranch CO 80163-0791 website => http://www.EvDBT.com/ email => Tim@xxxxxxxxx mobile => +1-303-885-4526 fax => +1-303-484-3608 Lost Data? => http://www.ora600.be/ for info about DUDE... David Fitzjarrell wrote: -- //www.freelists.org/webpage/oracle-l |