Re: How to get notified for a frozen oracle database - NEED HELP

  • From: "Keerthi Kumar" <myneel@xxxxxxxxx>
  • To: TaftD@xxxxxxxxxxx
  • Date: Sat, 14 Apr 2007 11:47:53 -0500

Thanks for your script. As cron is being replaced by other products, we
cannot cron this job. But will use this as a job inside the EM and get

Raj, Srini, Eric, and Jared,

Thanks for your input.


On 4/13/07, Taft, David <TaftD@xxxxxxxxxxx> wrote:


This is pretty much what we do in our shop, but I found that select
against v$database would sometimes succeed even though users were calling
the help desk saying everything was hung.  I found doing a truncate against
an empty dummy table to be more effective.  Here is a snippet of code that
gives a pretty good idea of how I handle it.


David Taft

P.S. Our monitoring runs from cron on each server, which doesn't
satisfy Keerthi's original request (paging via EM/Grid Control), but you'll
notice at the bottom of the snippet, that I do make a call to Big Brother
monitoring.  I am guessing the same could be done with EM.


# Execute in background so that script doesn't hang if the database is
# SqlPlus function writes output to a temporary file.
         TRUNCATE TABLE dbmon.status;" &

# Give truncate a chance to complete before entering the loop (especially
for databases with heavy resource contention).
sleep 5

# Minute counter.

# Comparison loop counter.
count2=`expr $timeout + 1`
# NOTE: count2 must be one greater, otherwise the '$count -eq $timeout'
#       statement below will never execute.

# Initialize background job check to zero.

# Check results of above SQL*Plus session for database status.
while [ $count -lt $count2 ];
   # Check if background job is still running.
   stat=`kill -0 $! 2>/dev/null;echo $?`

   # If exit status of kill command is greater than zero, then job is
   if [ "$stat" != "0" ]; then

      case `cat $tmpfile` in
         *"Table truncated"* )
            # If truncate executed, then everything is OK.
         *"table or view does not exist"*)
            # Table never created or dropped?
            message="Unable to access the dbstatus table DBMON.STATUS."
         *"ORACLE not available"*)
            # Verify database didn't go down while the script was
            message="ORACLE not available."
         *"ORACLE initialization or shutdown in progress"*)
            # Verify database is not in a CLOSED state.
            message="ORACLE initialization or shutdown in progress.
                     Database not open."
            # Query is not still running and $count has not exceeded
            # so there must be a problem connecting to the database.
            message="Problem connecting to ORACLE or the DBMON schema."

   elif [ $count -eq $timeout ]; then
      # Database is hung if query still running after specified timeout,
      # kill the truncate statement running in the background.

      kill -9 $! >/dev/null 2>&1
      message="Database is hung."

      # Query is still running and the timeout has not been exceeded, so
      # increment counter and wait one minute before looping again.

      count=`expr $count + 1`
      sleep 60

# If $message is not null, then there is a problem.
if [ "$message" != "" ]; then

    PageDBA      "$message"
    MakeLogEntry "$message"
    WriteCSV     "$message"
    NotifyBigBrother "red"

  # Exit with status of 1.
  Cleanup 1


 *From:* Pal, Raj [mailto:Raj.Pal@xxxxxxxxxxxx]

Have your script log in to extract a known statement (ie. Select name from

Run this in the background (ie. &) and send it to a dummy/temporary file
(eg. /tmp/monitor.tmp).

Give your script a configurable number of seconds to find the file.

If it doesn't find it after 'x' seconds, then send an alert.

Just make sure you delete the potentially existing dummy file before
repopulating it.


*From:* oracle-l-bounce@xxxxxxxxxxxxx [mailto:
oracle-l-bounce@xxxxxxxxxxxxx] *On Behalf Of *Keerthi Kumar

Is there a way to get notified for a frozen oracle database to a pager via
EM 10g GRID. We have Database Up/Down event but could not find anything in
the list for this type of problem.

Other related posts: