Re: Weird database hanging

  • From: "Alex Gorbachev" <ag@xxxxxxxxxxxx>
  • To: don@xxxxxxxxx
  • Date: Tue, 18 Sep 2007 19:50:37 -0400

Don,

Kudos to Jeremiah for directing you away from guesswork path (thanks from
BAAG) but Oracle Support normally does exactly that so don't wait for them -
rather try to guide them to proper analysis. Did they give you explanation
on how they wanted to avoid these hungs increasing OPEN_CURSORS and
SESSION_CACHED_CURSORS?

So far there seems to be a clear indication that new session creation is
part of the problem so maybe this can give us some hints.
Few things that come up to my mind to help with investigation:
- Can you pre-establish SQL*Plus connection as SYSDBA and have it handy?
During the hang run system state dump and haganlyze dump. This would provide
some food for thoughts. Oracle support might request it anyway.
- Do your application/bulk load process reconnect often? Maybe it's
visibility of hung whereas you simply can't connect? If that's connection
issue - audit enabled? AFTER LOGON trigger exist?
- You haven't mentioned it explicitly or I missed but I assume it's not RAC.
Is it?
- Do you think you can generate ASH report and figure out object id for
"enq: SQ - contention". This might give us the hot sequence. Same report can
provide some hints (P1,P2,P3) for other waits. If instance was really bad,
we might not have any ASH info if instance was really bad, though.

Since you mentioned large SGA, do you use huge pages? Without them, there
might be tremendous overhead of large SGA with many processes and especially
overhead when process starts and starts accessing shared memory.

Hope this helps a bit,
Alex


On 9/18/07, Don Seiler <don@xxxxxxxxx> wrote:
>
> I have opened an SR with Oracle, as it has hung 3 times today and
> actually crashed once.
>
> When the database hang, Ignite is showing "latch: shared pool" and
> "latch: library cache" waits.  Otherwise I don't see these at all.
>
> Oracle has had me up OPEN_CURSORS and SESSION_CACHED_CURSORS, but I
> did that last night (with instance restart) and, as I said, it has
> hung 3 times and crashed once since then.  Oracle's also telling me
> that this is largely due to application coding.  My problem with that
> is that the application code has been in place for a while.
>
> To recap: we migrated to the 64-bit machine on Sep 1.  Hanging has
> occurred since Sep 13, seemingly during bulk load activity.  Our SGA
> is quite a bit larger (16 GB) than on our 32 bit box (1.5 GB).
>
> db_cache_size                        big integer 12G
> shared_pool_size                     big integer 2G
>
> I haven't seen the "block change tracking buffer space" wait since
> yesterday morning, thankfully.
>
> Any tips would be appreciated.
>
> Thanks,
> Don.
>
> --
> Don Seiler
> oracle: http://ora.seiler.us
> ultimate: http://www.mufc.us
> --
> //www.freelists.org/webpage/oracle-l
>
>
>


-- 
Alex Gorbachev, Oracle DBA Brewer, The Pythian Group
http://www.pythian.com/blogs/author/alex http://www.oracloid.com
BAAG party - www.BattleAgainstAnyGuess.com

Other related posts: