David, Thanks for your script. As cron is being replaced by other products, we cannot cron this job. But will use this as a job inside the EM and get notified. Raj, Srini, Eric, and Jared, Thanks for your input. KK. On 4/13/07, Taft, David <TaftD@xxxxxxxxxxx> wrote:
Raj, This is pretty much what we do in our shop, but I found that select against v$database would sometimes succeed even though users were calling the help desk saying everything was hung. I found doing a truncate against an empty dummy table to be more effective. Here is a snippet of code that gives a pretty good idea of how I handle it. Cheers, David Taft P.S. Our monitoring runs from cron on each server, which doesn't satisfy Keerthi's original request (paging via EM/Grid Control), but you'll notice at the bottom of the snippet, that I do make a call to Big Brother monitoring. I am guessing the same could be done with EM. <SNIP> # Execute in background so that script doesn't hang if the database is hung. # SqlPlus function writes output to a temporary file. SqlPlus "SET TIMING ON TRUNCATE TABLE dbmon.status;" & # Give truncate a chance to complete before entering the loop (especially for databases with heavy resource contention). sleep 5 # Minute counter. count=0 # Comparison loop counter. count2=`expr $timeout + 1` # # NOTE: count2 must be one greater, otherwise the '$count -eq $timeout' # statement below will never execute. # Initialize background job check to zero. stat=0 # Check results of above SQL*Plus session for database status. while [ $count -lt $count2 ]; do # Check if background job is still running. stat=`kill -0 $! 2>/dev/null;echo $?` # If exit status of kill command is greater than zero, then job is finished. if [ "$stat" != "0" ]; then case `cat $tmpfile` in *"Table truncated"* ) # If truncate executed, then everything is OK. break ;; *"table or view does not exist"*) # Table never created or dropped? message="Unable to access the dbstatus table DBMON.STATUS." break ;; *"ORACLE not available"*) # Verify database didn't go down while the script was sleeping. message="ORACLE not available." break ;; *"ORACLE initialization or shutdown in progress"*) # Verify database is not in a CLOSED state. message="ORACLE initialization or shutdown in progress. Database not open." break ;; *) # Query is not still running and $count has not exceeded $timeout, # so there must be a problem connecting to the database. message="Problem connecting to ORACLE or the DBMON schema." break ;; esac elif [ $count -eq $timeout ]; then # Database is hung if query still running after specified timeout, so # kill the truncate statement running in the background. kill -9 $! >/dev/null 2>&1 message="Database is hung." break else # Query is still running and the timeout has not been exceeded, so # increment counter and wait one minute before looping again. count=`expr $count + 1` sleep 60 fi done # If $message is not null, then there is a problem. if [ "$message" != "" ]; then PageDBA "$message" MakeLogEntry "$message" WriteCSV "$message" NotifyBigBrother "red" # Exit with status of 1. Cleanup 1 <SNIP> *From:* Pal, Raj [mailto:Raj.Pal@xxxxxxxxxxxx] Have your script log in to extract a known statement (ie. Select name from v$database). Run this in the background (ie. &) and send it to a dummy/temporary file (eg. /tmp/monitor.tmp). Give your script a configurable number of seconds to find the file. If it doesn't find it after 'x' seconds, then send an alert. Just make sure you delete the potentially existing dummy file before repopulating it. ------------------------------ *From:* oracle-l-bounce@xxxxxxxxxxxxx [mailto: oracle-l-bounce@xxxxxxxxxxxxx] *On Behalf Of *Keerthi Kumar Is there a way to get notified for a frozen oracle database to a pager via EM 10g GRID. We have Database Up/Down event but could not find anything in the list for this type of problem.