RE: A Challenge

  • From: "Spears, Brian" <BSpears@xxxxxxxxxxxxxxxxx>
  • To: <stellr@xxxxxxxxxx>, "Post, Ethan" <Ethan.Post@xxxxxx>
  • Date: Mon, 14 Nov 2005 15:32:22 -0500

Yes, this is what I did 4 or so years ago: save the output of top (in a
5-second loop), grep it for threshold breaches, and page on a hit. Grep the
saved files for the history of a problem... Very basic. The Unix guys now
save all the stats for us and can produce reports for any type of
performance issue.

Brian  

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Ray Stell
Sent: Monday, November 14, 2005 2:58 PM
To: Post, Ethan
Cc: oracle-l
Subject: Re: A Challenge

I do this ad hoc reporting with top remotely/centrally, using ssh shared key
files (no password prompt on the command line).  I've never taken the next
step you suggest of automating the page; I go look when I'm looking for
trouble.  Seems like an easy tcl/perl/cron job to review and move the files
around is in order.

If you have SNMP-based enterprise monitoring in place, it could/should be
done there also.  We use Smarts/InCharge for this, which is this side of
wonderful.


> ssh stellr@nem top -b -d1 >> nem.la

Authorized use only.  All activity may be monitored and reported.


> cat nem.la
load averages:  0.23,  0.37,  0.48    14:42:28
87 processes:  86 sleeping, 1 on cpu

Memory: 4096M real, 2070M free, 1411M swap in use, 5821M swap free


   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
  3135 oracle    11  58    0    0K    0K sleep   60:45  7.72% oracle
 14942 oracle    14  58    0    0K    0K sleep  184.3H  1.18% oracle
 14950 oracle     1  58    0    0K    0K sleep   54.9H  0.57% oracle
  9168 oracle     1  58    0    0K    0K sleep   52:54  0.40% oracle
  8211 stellr     1  18    0 1776K 1144K cpu/0    0:00  0.24% top
  3141 oracle     1  58    0    0K    0K sleep   61:15  0.20% oracle
  8208 root       1  28    0 6816K 2936K sleep    0:00  0.17% sshd
  3137 oracle     1  58    0    0K    0K sleep   62:14  0.16% oracle
  7831 oracle     1  58    0    0K    0K sleep   55:40  0.13% oracle
  3139 oracle     1  58    0    0K    0K sleep   63:09  0.13% oracle
 14944 oracle    19  58    0    0K    0K sleep   21.9H  0.03% oracle
  1358 root      14  11    0 7192K 6416K sleep  147:47  0.02% picld
  3131 oracle    11  58    0    0K    0K sleep    9:15  0.02% oracle
  8210 stellr     1  28    0 6816K 2176K sleep    0:00  0.02% sshd
 14930 oracle    15  58    0    0K    0K sleep  399:58  0.02% oracle
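
The 1-minute load average can be pulled back out of a saved file like nem.la
with a small awk filter. This is only a sketch: it assumes the
"load averages:" header layout shown above (other top versions format that
line differently), and the function name extract_load is made up for
illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "HH:MM:SS load1" for every sample in a saved
# top log. Assumes header lines shaped like
#   load averages:  0.23,  0.37,  0.48    14:42:28
# as in the output above; other top versions lay this line out differently.
extract_load() {
    awk '/^load averages:/ {
        gsub(",", "", $3)   # strip the trailing comma from the 1-minute value
        print $NF, $3       # timestamp (last field), then 1-minute load
    }' "$1"
}
```

Something like `extract_load nem.la | awk '$2 > 8'` would then list only the
samples where the 1-minute load was above 8.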







On Mon, Nov 14, 2005 at 01:17:42PM -0600, Post, Ethan wrote:
> This is just for fun, I have a solution but wondering what others 
> would do. I will post mine later.
>  
> Task: 
> Monitor load average on a unix host.
>  
> Requirements:
> * Notify me via email when load exceeds 8 for more than 60 minutes.
> * After initial alert do not alert me again for at least 2 hours 
> unless the load falls below 8 before the 2 hours expires.
> * Log the load to a file for historical purposes.
> * Alert me when the load falls back below 8.
>  
> Challenge:
> Can you come up with a solution to this problem using existing 
> scripts/utilities (home grown or otherwise)? Ideally, the fewer lines/$$ 
> required to implement, the better. Just wondering what others might do.
>  
>  
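
One way the four requirements above might be met with nothing but sh, awk,
cron and mail is sketched here. It is not the solution Ethan refers to, and
several things are assumptions: the script name, file paths and address are
invented, `date +%s` and `mail -s` must exist on the host, the uptime output
must contain "load average:" or "load averages:", and a load of exactly 8 is
treated as recovered. It also reads the repeat rule as "re-alert every 2
hours while the load stays high; a drop resets everything".

```shell
#!/bin/sh
# loadwatch.sh (hypothetical name): run once a minute from cron, e.g.
#   * * * * * /usr/local/bin/loadwatch.sh run
LIMIT=8         # load threshold
SUSTAIN=3600    # seconds the load must stay high before the first alert
QUIET=7200      # minimum seconds between repeat alerts
LOG=/var/tmp/load.log        # assumed path for the history file
STATE=/var/tmp/load.state    # assumed path; holds "high_since last_alert"
MAILTO=dba@example.com       # placeholder address

# decide LOAD NOW HIGH_SINCE LAST_ALERT
# Prints "ACTION HIGH_SINCE LAST_ALERT" with ACTION one of NONE, HIGH, CLEAR.
# A value of 0 for HIGH_SINCE or LAST_ALERT means "not set".
decide() {
    load=$1 now=$2 since=$3 last=$4
    action=NONE
    # awk does the floating-point comparison; exit status 0 means "over limit"
    if awk -v l="$load" -v t="$LIMIT" 'BEGIN { exit !(l + 0 > t + 0) }'; then
        [ "$since" -eq 0 ] && since=$now
        if [ $((now - since)) -gt "$SUSTAIN" ] &&
           { [ "$last" -eq 0 ] || [ $((now - last)) -ge "$QUIET" ]; }; then
            action=HIGH
            last=$now
        fi
    else
        # Only announce recovery if an alert was actually sent.
        [ "$last" -ne 0 ] && action=CLEAR
        since=0
        last=0
    fi
    echo "$action $since $last"
}

if [ "${1:-}" = run ]; then
    now=$(date +%s)
    load=$(uptime |
        awk -F'load averages?: *' '{ split($2, a, /[, ]+/); print a[1] }')
    echo "$now $load" >> "$LOG"                 # history requirement
    since=0 last=0
    [ -r "$STATE" ] && read -r since last < "$STATE"
    set -- $(decide "$load" "$now" "${since:-0}" "${last:-0}")
    echo "$2 $3" > "$STATE"
    case $1 in
        HIGH)  echo "load $load, above $LIMIT for over an hour" |
                   mail -s "load HIGH on $(hostname)" "$MAILTO" ;;
        CLEAR) echo "load $load, no longer above $LIMIT" |
                   mail -s "load CLEAR on $(hostname)" "$MAILTO" ;;
    esac
fi
```

Keeping the decision in a function that takes the clock as an argument means
the timing rules can be checked by hand, e.g. `decide 9.5 4601 1000 0` for a
load that has been high for 3601 seconds with no alert sent yet.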

--
//www.freelists.org/webpage/oracle-l