RE: Checking if remote database is up

  • From: "Smith, Ron L." <rlsmith@xxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 26 Apr 2005 07:57:44 -0500

Below is a script we run every 15 minutes against both local and remote
databases to make sure we can connect.  The script will either page or
email based on the DBA id given when the script is run.  This is just
one of several similar scripts we use to monitor our databases.

Ron

SAMPLE CRON ENTRY:
/u001/home/oracle/dbastuff/scripts/monitoring/db.sh mydb mydba > /dev/null 2>&1
Or
/u001/home/oracle/dbastuff/scripts/monitoring/db.sh mydb mydba@mycomp > /dev/null 2>&1

>>> DB.SH script:

#! /bin/sh
#          DBA MONITORING SCRIPTS
# ******************************************************************
#
#     Author:         Ron Smith
#     Date:           12/18/00
#     Function:       Checks to make sure an instance is up and
#                     responding.
#
# ******************************************************************
#
#                     CHANGE HISTORY
#
#     DATE        WHO             Reason for Change
#
#     12/19/00    Ron Smith       New Prog
#
#
# ******************************************************************
#
#                     FUNCTION

#
#     This script calls db.sql.
#
#     The function of this script is to try to connect to a SID and
#     return the value of the name field in the V$database view.  If
#     this fails, the listener and the database should be checked.
#
#     (The following paragraph may not be true.  The check for the
#     error file may have been commented out)
#     If an error file already exists, the script exits without any
#     action.  The DBA should delete the error file when the problem
#     is resolved.  Another script should be scheduled to run daily
#     to delete the error file so the DBA is paged at least once a
#     day if the condition continues.
#
#     If the id of the DBA is a Zid, a page will be sent.  If the
#     id of the DBA is an email address (determined by looking for
#     an "@" ) , an EMAIL will be sent.
#
# ******************************************************************
#
#                     PREREQUISITES
#
#     The OPS$ORACLE user must exist in the instance.  This can be
#     created by running the opsuer.sql script in SQLPLUS while
#     logged on as SYSTEM.
#
#     The cdmonitoring script must exist in the home/oracle
#     directory.
#
# ******************************************************************
#
#                     RUN SYNTAX
#
#     db.sh (sid) (dba)
#
#
# ******************************************************************

# cd to the monitoring script directory
. $HOME/cdmonitoring.sh

ORACLE_SID=$1
export ORACLE_SID
DBA=$2
export DBA
ATCNT=`echo $DBA | grep @ |  wc -l`
export ATCNT
SERVER=`uname -a | cut -d " " -f2`
export SERVER

ORACLE_HOME=`grep "^$ORACLE_SID:" /etc/oratab | head -1 | cut -d: -f2`
export ORACLE_HOME
PATH=$ORACLE_HOME/bin:/usr/local/bin:$PATH:.
export PATH

# Delete the old list file if it exists

if [ -f db_$ORACLE_SID.lst ]
then rm db_$ORACLE_SID.lst
fi

# Check to see if an error file exists.  If it does get out.

#if [ -f db_$ORACLE_SID.err ]
#then echo "Error file db_$ORACLE_SID.err exists - will exit now"
#     exit
#fi

# If sending to EMAIL address, run sql with headings on

if [ "$ATCNT" -gt "0" ]
        then
        sqlplus / @db.sql on $ORACLE_SID
        else
        sqlplus / @db.sql off $ORACLE_SID
fi

# If the SID name does not appear in the lst file, the connect
# failed, so send a message

WC=`grep -i $ORACLE_SID db_$ORACLE_SID.lst | wc -l`

if [ "$WC" -lt "1" ]
        then echo "-DBA- Could not connect to $ORACLE_SID on server $SERVER " > db_$ORACLE_SID.err
        if [ "$ATCNT" -gt "0" ]
                then
                echo "email sent"
                cat db_$ORACLE_SID.lst >> db_$ORACLE_SID.err
                elm -s "-DBA- Could not connect to $ORACLE_SID on server $SERVER" $DBA < db_$ORACLE_SID.err
                else
                LC=`cat db_$ORACLE_SID.lst | sed -e 's/  */ /g' | wc -c`
                echo $LC
                if [ "$LC" -gt "160" ]
                        then echo "Too many errors to send. Check db_$ORACLE_SID.lst" >> db_$ORACLE_SID.err
                        else
                        cat db_$ORACLE_SID.lst >> db_$ORACLE_SID.err
                fi
                echo "page sent"
                pager $DBA "`cat db_$ORACLE_SID.err`"
        fi
fi

>>> DB.SQL:
set pause off
SET ECHO off
set verify off
set feedback off
set hea &1
define ORACLE_SID = &2
spool db_$ORACLE_SID.lst
select name from v$database
/
spool off;
exit
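
As a side note on the routing rule described in the script header (an id
containing "@" gets an email, anything else gets a page), here is a
minimal sketch of the same test written with a shell case pattern
instead of grep and wc.  The function name notify_channel is
hypothetical and is not part of db.sh; it only prints which channel
would be used.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original db.sh.
# Prints "email" when the DBA id contains an "@", otherwise "page".
notify_channel() {
    case "$1" in
        *@*) echo "email" ;;
        *)   echo "page" ;;
    esac
}

# Example calls, using the ids from the sample cron entries:
#   notify_channel mydba
#   notify_channel mydba@mycomp
```

The case pattern avoids starting two extra processes per run, which
hardly matters every 15 minutes but keeps the test in one place.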

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Mercadante, Thomas F
(LABOR)
Sent: Tuesday, April 26, 2005 6:41 AM
To: 'stephenbooth.uk@xxxxxxxxx'; Oracle-L (E-mail)
Subject: RE: Checking if remote database is up


Stephen,

We have a similar situation.  What we do is this:

Create an "ocella_available" table in your local database that indicates
whether the remote database is available.  Run a cron job every minute
that does the simple query you are talking about.  Depending on the
response, update the ocella_available table.  In your application, check
the ocella_available table to see if it is ok to try and retrieve a
record from the remote database.

This works for us.  The cron job will also send us email if the remote
database becomes unavailable so that we at least know about it.
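
A rough sketch of what such a cron job might look like, for anyone who
wants a starting point.  This is not our actual script.  The connect
strings REMOTE_CONN and LOCAL_CONN, the column names available and
checked_at, and the function names are all assumptions made up for the
example; it also assumes ocella_available holds exactly one row.

```shell
#!/bin/sh
# Sketch only: table layout, column names and connect strings are assumptions.

# Classify the output of the remote probe query.
# Prints N when the output is empty or contains an ORA-/SP2- error, else Y.
status_from_output() {
    case "$1" in
        "")            echo "N" ;;
        *ORA-*|*SP2-*) echo "N" ;;
        *)             echo "Y" ;;
    esac
}

# Run a trivial query against the remote database and print its output.
# REMOTE_CONN is a hypothetical user/password@alias connect string.
# -s suppresses the banner, -L makes sqlplus try the logon only once.
probe_remote() {
    sqlplus -s -L "$REMOTE_CONN" <<EOF 2>&1
set heading off feedback off pagesize 0
select 'UP' from dual;
exit
EOF
}

# Record the result in the local flag table (assumed single-row table).
record_status() {
    sqlplus -s -L "$LOCAL_CONN" <<EOF
whenever sqlerror exit failure
update ocella_available set available = '$1', checked_at = sysdate;
commit;
exit
EOF
}

# Cron entry point: probe, classify, record.
check_ocella() {
    status=$(status_from_output "$(probe_remote)")
    record_status "$status"
}
```

A wrapper that sources these functions and calls check_ocella would be
what cron runs every minute.  Sending the email when the status is N
could be added to check_ocella.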

Good Luck!

Tom

-----Original Message-----
From: stephen booth [mailto:stephenbooth.uk@xxxxxxxxx]
Sent: Tuesday, April 26, 2005 6:53 AM
To: Oracle-L (E-mail)
Subject: Checking if remote database is up

One of our new systems (Documentum from EMC) uses a database link to a
remote database (Ocella from Ocella)  for some processes.  Due to
organisation politics the people managing the Ocella database don't tell
the people managing Documentum when they're taking their system down
(the joys of working in the public sector).  Documentum can do most of
its functions when Ocella is down; it just can't do certain
transactions.  Unfortunately, it's currently not very good at dealing
with situations where the Ocella database is down.

We're looking at some way of checking if the Ocella database is up
before trying a transaction that needs it then reporting back to the
user if it's down.  What we're currently thinking of is putting an empty
table in the Ocella database then querying that from a PL/SQL function
over the link and trapping the error.  If we get data or 'No Rows
Returned' then we know that the database is up and the link is working.
If we get an ORA-03113 then we know that the database is down or the
link isn't working for some other reason (e.g. Network broken again).
The function returns either TRUE or FALSE depending on whether the
remote database is up or not.

Does anyone have experience of a similar situation?  Is there a more
elegant/reliable method?  Anything I've failed to consider that will
make this all blow up?

Thanks

Stephen

--
It's better to ask a silly question than to make a silly assumption.
--
//www.freelists.org/webpage/oracle-l

