Re: Cron management...

  • From: MARK BRINSMEAD <mark.brinsmead@xxxxxxxxx>
  • To: Seth Miller <sethmiller.sm@xxxxxxxxx>
  • Date: Sun, 12 Apr 2015 22:11:41 -0400

I am in agreement with Seth on both points.

The sysadmins here are simply being cautious -- as well they should be. I,
too, would be concerned about a network service that runs as "root" and can
-- by design -- run any command as any user at any time, based on
instructions received from a remote server, and I would also want to be
convinced of its safety before deploying it. These are the sorts of things
of which international headline stories of massive security breaches are
made. Which is not to say, of course, that there is anything *wrong* with
the products you have mentioned -- just that the sysadmins in question are
simply doing their jobs in asking for assurance that there isn't.

As for crontabs...

Managing crontabs across 30 servers can be a little unwieldy, but it is
certainly possible. Here are a few things I have seen in the past:

- Monitoring jobs that report -- and require acknowledgement -- when a
crontab has been modified.
- Monitoring jobs that report when backups have failed to run as
scheduled. (Note: this is NOT the same as reporting "when a backup has
run and failed".)
- Source code control for administrative scripts (which can include your
crontab, by the way). Something as simple as RCS will do. If you are
paranoid, "check out" each of your scripts from the repository once per
day. Let people make unauthorized changes -- who cares? The next
check-out will simply overwrite them.
- Keep a log of changes to the crontab. Something as simple as piping
"crontab -l" to a file, and then checking that into an RCS archive will
keep a very concise record of what was changed and when, although not
necessarily by whom.
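
The last idea can be sketched in a few lines of shell. This is only an
illustrative sketch, not from the original post: the snapshot directory and
file naming are assumptions, and the RCS step is left as a comment.

```shell
#!/bin/sh
# Sketch: snapshot the current crontab into a dated file so there is a
# concise record of what it looked like and when. LOGDIR is an
# illustrative assumption, not a path anyone on the list suggested.
LOGDIR=${LOGDIR:-/tmp/crontab-history}
mkdir -p "$LOGDIR"
snap="$LOGDIR/crontab.$(date +%Y%m%d)"
# Dump the crontab; treat "no crontab" (or no crontab command) as empty.
crontab -l > "$snap" 2>/dev/null || : > "$snap"
# With RCS installed, "ci -l $snap" would check each snapshot into an
# RCS archive, and "rcsdiff" between revisions shows what changed.
echo "snapshot written to $snap"
```

Run it from cron itself, once a day; diffing today's snapshot against
yesterday's flags any modification, although (as noted above) not who made
it.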

A centralized solution might be right for you. Maybe even for most
people. But I would not say it is mandatory. By the way, what do you
propose to do when somebody "accidentally deletes" all the
scripts/configuration for the centralized job-scheduling facility? You'll
probably need a plan for that.


As for the NetBackup maintenance in the middle of the day...

... well, let's hope that doesn't happen too often. Of course, for a
backup utility "middle of the day" is about as off-peak as you are going to
get, so this is a reality that you're probably going to have to accept.

Usually, the only backups I run during "the middle of the day" would be
archivelog backups. On some systems I might run those every few hours, or
even in extreme cases, once per hour. But I also try to structure and size
my archivelog destinations such that I can go for up to 48 hours without
running an archivelog backup (that is, 48 hours without deleting
archivelogs) without worrying that the database is going to freeze. Nobody
is going to tolerate me allowing the database to stop working for 48 hours,
but I have seen backup infrastructures go offline for that long -- without
anybody losing their jobs.
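
The 48-hour sizing rule above is simple arithmetic. Here is a sketch with
illustrative numbers -- the daily redo rate is an assumption you would
replace with a figure measured from your own system (e.g. from
v$archived_log):

```shell
#!/bin/sh
# Back-of-envelope sizing for an archivelog destination that must
# survive 48 hours without an archivelog backup (i.e. without deletes).
redo_gb_per_day=40      # illustrative; measure your own daily redo volume
hours_to_survive=48
headroom_pct=20         # safety margin for redo bursts
need=$(( redo_gb_per_day * hours_to_survive / 24 ))
need=$(( need + need * headroom_pct / 100 ))
echo "archivelog destination should be at least ${need} GB"
# -> prints 96 GB for these inputs
```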

In any case, if you size your storage properly and run your archivelog
backups with appropriate frequency, it should not be a "problem" if
maintenance in the backup infrastructure (very) occasionally causes an
archivelog backup to fail. The next one will run in an hour or two, the
archivelogs will be backed up and deleted, and it will be like nothing ever
happened. Assuming, of course, that your *(rman) backup scripts* are the
only jobs that ever delete (or manipulate) archivelogs. I'm kind of
religious about that last point -- having seen far too many failures
arising from doing it any other way.
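
That discipline -- rman as the only thing that ever deletes archivelogs --
can be sketched as a single wrapper script. The connect method here is an
illustrative assumption; many sites would use a wallet or a recovery
catalog instead:

```shell
#!/bin/sh
# Sketch of the periodic archivelog backup job: rman backs up the
# archivelogs and deletes only what it has successfully backed up, so
# no other job ever needs to touch the archivelog destination.
rman target / <<'EOF'
BACKUP ARCHIVELOG ALL DELETE INPUT;
EOF
```

If one run fails during backup-infrastructure maintenance, the next run
simply picks up the accumulated archivelogs, exactly as described above.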


---

I hope this helps!



On Sun, Apr 12, 2015 at 8:01 PM, Seth Miller <sethmiller.sm@xxxxxxxxx>
wrote:

Chris,

Now that Mladen is done belittling your Linux admin for simply being
cautious, the rest of us can offer you constructive advice.

If you like your shell scripts and are comfortable with cron, you might be
able to just enhance it enough to eliminate the single point of failure and
dramatically reduce your risks by centralizing your backups.

Modify your rman scripts to use an Oracle wallet to authenticate to the
databases remotely through an rman client. That way, you can take a backup
without having to be on the server and won't expose the password of a
privileged account. I would also suggest creating a separate sysdba account
just for the use of logging in to do the backups.
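
A rough sketch of that wallet setup follows. The wallet path, TNS alias,
and account name are illustrative assumptions, and mkstore prompts for the
passwords rather than taking them on the command line:

```shell
#!/bin/sh
# Create a client-side wallet on the central backup host and store a
# credential for the dedicated backup account (mkstore will prompt for
# the wallet and account passwords).
mkstore -wrl /u01/app/oracle/wallet -create
mkstore -wrl /u01/app/oracle/wallet -createCredential prod1 backup_dba

# sqlnet.ora on the central host must point at the wallet, e.g.:
#   WALLET_LOCATION = (SOURCE=(METHOD=FILE)(METHOD_DATA=
#                      (DIRECTORY=/u01/app/oracle/wallet)))
#   SQLNET.WALLET_OVERRIDE = TRUE

# After which rman can connect to each database remotely without an
# embedded password, using the stored credential for the TNS alias:
rman target /@prod1
```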

You can reduce the single point of failure by using Oracle Clusterware to
set up a failover resource that enables the crons to run on another node in
the cluster if a node were to fail. This is relatively simple although you
do need at least a basic Oracle Clusterware install to use it.
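
A minimal sketch of such a failover resource is below. The resource name,
script path, and node names are illustrative assumptions, and the action
script must implement the usual start/stop/check protocol:

```shell
#!/bin/sh
# Register a Clusterware resource that runs the cron/backup launcher on
# one node at a time and relocates it if that node fails.
crsctl add resource backup.launcher -type cluster_resource \
  -attr "ACTION_SCRIPT=/u01/app/scripts/backup_launcher.sh,PLACEMENT=favored,HOSTING_MEMBERS=node1 node2,CHECK_INTERVAL=60,RESTART_ATTEMPTS=2"
crsctl start resource backup.launcher
crsctl status resource backup.launcher
```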

Seth Miller



On Sun, Apr 12, 2015 at 4:38 PM, Mladen Gogala <
dmarc-noreply@xxxxxxxxxxxxx> wrote:

On 04/12/2015 02:47 PM, Chris Grabowy wrote:

Howdy.

We currently have about 30 Redhat Linux servers running Oracle 11.2

Recently for a short time the crontab entry for a production backup was
commented out.

Just last week one of the DBAs had "accidentally" deleted all the backup
scripts. The scripts directory is NFS mounted so it impacted every server.

The Netbackup folks like to do maintenance during the day. Any Oracle
backups that may have been running abort. These days we get notice from
the Netbackup folks but it's kinda tricky to check 30 servers and determine
if anything is running. Or kick off 30+ archive log backup scripts across
all the servers to clean up the archive log directories before the
Netbackup maintenance.

Managing crontabs, jobs and scripts across 30 servers just doesn't seem
to be working.

Our company uses a job scheduling app called Tidal. The manager of that
app demo'd the product to me and it seems like it can address many of our
headaches. In theory a single simple interface to manage all the jobs
scheduled across all the database servers.

However one of the issues identified by the Linux admin is that the
Tidal agent needs root access so he is reluctant to install the Tidal agent
anywhere but a couple of designated Tidal servers.

I am wondering if other sites have stopped using crontab? If so then
what did you replace it with?

Anyway, I am open to any thoughts, suggestions, etc.

Thanks,
Chris Grabowy


--
//www.freelists.org/webpage/oracle-l



Chris, I am not sure why you are using crontab with NetBackup. NetBackup
has its own scheduler and can schedule the scripts centrally, through the
NB GUI. All you need is the right script in /usr/openv/netbackup/bin and
all will be well.

As for Tidal, I have no experience whatsoever with the product, but I do
have experience with a competing product, Control-M from BMC.
Unfortunately, all 3rd party scheduling products, including NetBackup which
also has a centralized scheduler, must have a service which runs as user
"root". The reason for that is that they have to be able to switch user and
run something as user "oracle", without being prompted for password every
time they need to run a job.

These products are usually installed as a Linux service in the /etc/init.d
directory and are started during Linux start-up. Please inform your
system administrator that NetBackup requires root access as well and ask
him to remove it from all the systems for security reasons. Why stop there?
Oracle also requires root access in the installation phase, one must run
orainstRoot.sh and $ORACLE_HOME/root.sh as user root, so your system
administrator should remove that, too. God forbid you have ASM, that
requires root access, too. To further secure your systems, after removing
all 3rd party products, including Oracle, he or she should execute "service
network stop" as user "root" on every system.
That would completely secure your Linux systems and make it impossible for
anyone without physical access to the servers to use them. Of course, no
security is complete without physical security, so you should consider the
industry standard security measures like barbed wire, mine fields,
electrified fence, guard dogs, machine gun nests and Chuck Norris.

Long story short, you are dealing with an unreasonable system
administrator. It's not your problem, it's a problem for your boss.
Management decides what runs on the company systems, not the system
administrator. There is one piece of trivia which people frequently forget:
the only completely secure systems are systems that are not being used and
contain no useful information. Refusing to install an established 3rd party
product based on the assessment that it "needs root access" is ludicrous,
to put it mildly.

--
Mladen Gogala
Oracle DBA
http://mgogala.freehostia.com
