Re: Cron management...

  • From: "Yong Huang" <dmarc-noreply@xxxxxxxxxxxxx> (Redacted sender "yong321@xxxxxxxxx" for DMARC)
  • To: cgrabowy@xxxxxxxxx
  • Date: Mon, 13 Apr 2015 08:01:44 -0700

Recently for a short time the crontab entry for a production backup was
commented out.
Just last week one of the DBAs had "accidently" deleted all the backup
scripts.
The scripts directory is NFS mounted so it impacted every server.
Managing crontabs, jobs and scripts across 30 servers just doesn't seem to be
working.

Hi Chris,

Our shop runs cron jobs on all servers. We evaluated job management software
and decided not to use it. It adds one more layer of complexity and, in fact,
turns many independent points of failure into a single one. This is also why
we stopped using Oracle EM to manage jobs. We'd rather have failures on
certain servers than on all of them.

Another reason we don't use any other software is that cron is far more mature
(with 40+ years of UNIX history) and stable. We used to have issues scheduling
jobs with EM. Since we migrated everything to cron, we have not had a single
failure due to cron itself.

As to pushing the same scripts to many servers, we assign the work to
individual DBAs. We're a relatively small shop. When I worked at a big shop
with hundreds of databases a few years ago, we had one script to do the
deployment.

One feature any job management software has but cron lacks, or can only
implement with difficulty, is conditional scheduling: immediately after one job
finishes on one server, optionally on certain conditions, another job on a
different server starts. Fortunately, we don't have such a need at this time.
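
To illustrate why this is awkward with plain cron, here is a minimal sketch of
one possible workaround: a wrapper, started from a single crontab entry, that
runs the first job and starts the second only on success. The function name
run_then, the host name and the script paths are all hypothetical, and the
remote step assumes passwordless ssh is already set up.

```shell
#!/bin/sh
# Hypothetical wrapper (not from the original post): run the first command and
# start the second one only if the first exits with status 0.
run_then() {
    first="$1"
    next="$2"
    if sh -c "$first"; then
        sh -c "$next"
    else
        status=$?
        echo "first job failed with status $status; next job not started" >&2
        return "$status"
    fi
}

# Intended use from one crontab entry (host and paths are placeholders):
#   run_then '/path/to/job_a.sh' 'ssh otherhost /path/to/job_b.sh'
```

This only covers a simple two-step chain. Anything beyond that (retries,
fan-out, waiting on several jobs) is where dedicated job management software
has the advantage.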

Not directly related to the topic: certain work, such as reading a log file,
is better not done by a cron job. When alert.log is big, reading it from head
to tail on every run wastes CPU and disk I/O. It should be implemented as an
always-running tail -f command piped to pattern-checking logic.
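
A minimal sketch of such a pipeline follows. The function name watch_alerts,
the default pattern "ORA-" and the log path are assumptions for illustration;
a real monitor would likely mail or page someone instead of printing.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): read log lines on stdin
# and report those containing a fixed string (default: "ORA-").
watch_alerts() {
    pattern="${1:-ORA-}"
    while IFS= read -r line; do
        case "$line" in
            # The pattern is quoted, so it is matched literally, not as a glob.
            *"$pattern"*) printf 'ALERT: %s\n' "$line" ;;
        esac
    done
}

# Intended use (runs until killed, so it is left commented out here).
# -n 0 starts at the current end of the file instead of replaying old lines:
#   tail -n 0 -f /path/to/alert.log | watch_alerts 'ORA-'
```

Because tail only hands over lines appended after it starts, the cost stays
proportional to new log output rather than to the size of the whole file.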

One common yet serious problem with cron, or any other type of job, is that
an admin temporarily comments out an entry and forgets to uncomment it. My
habit is to make a calendar entry for the next morning to remind myself.
Another problem is accidentally deleting some or all cron jobs by typing
crontab -r instead of crontab -e ("e" and "r" are close on the keyboard!). We
have a cron job to back up all cron jobs on all servers.
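
A minimal sketch of what such a backup job could look like for one user on one
host is shown here. It is not our actual script: the function name
save_crontab_backup, the file naming and the backup directory are assumptions.
It deliberately refuses to save an empty crontab, so a backup taken right
after an accidental deletion does not replace a good copy from the same day.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): save crontab text read
# from stdin into a dated file under the directory given as $1.
save_crontab_backup() {
    dir="$1"
    target="$dir/crontab.$(uname -n).$(id -un).$(date +%Y%m%d)"
    tmp="$target.tmp.$$"
    mkdir -p "$dir" || return 1
    cat > "$tmp" || return 1
    # An empty crontab usually means it was just deleted; keep the old backup.
    if [ ! -s "$tmp" ]; then
        rm -f "$tmp"
        echo "empty crontab, backup skipped" >&2
        return 1
    fi
    mv "$tmp" "$target"
}

# Intended use from cron (directory is a placeholder):
#   crontab -l | save_crontab_backup /path/to/backups
```

Covering all servers would mean running this on each host, or looping over
hosts with ssh from one place, and keeping the copies somewhere other than the
shared scripts directory.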

Cron jobs are also very easy to troubleshoot. Unlike at many shops, we mandate
this syntax in crontab:

<time> <script> > /tmp/jobname.out 2>&1

I believe throwing away stdout and stderr is a big mistake. It makes
troubleshooting very difficult. Letting them go to mail (the default) is not
good practice either, because the output is harder to read there and can grow
the mailbox unintentionally.
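
One way to check that the mandated syntax is actually followed is to scan a
crontab for entries that break it. The sketch below is only an illustration
under simple assumptions: the function name find_unlogged_jobs is made up, and
it treats any line without ">" or with "/dev/null" as a violation, which is a
rough text match rather than a real crontab parser.

```shell
#!/bin/sh
# Hypothetical check (not from the original post): read crontab text on stdin
# and print job entries that have no output redirection or that discard
# their output to /dev/null.
find_unlogged_jobs() {
    awk '
        /^[ \t]*#/ { next }                             # skip comments
        /^[ \t]*$/ { next }                             # skip blank lines
        /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ { next } # skip VAR=value lines
        !/>/ || /\/dev\/null/ { print }                 # no log file kept
    '
}

# Intended use:
#   crontab -l | find_unlogged_jobs
```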

Yong Huang
P.S. more tips at http://yong321.freeshell.org/computer/CronJobs.html
--
//www.freelists.org/webpage/oracle-l

