Protecting production from "us"

  • From: "Herring, David" <HerringD@xxxxxxx>
  • To: "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
  • Date: Thu, 3 Dec 2015 16:45:07 +0000

Folks,



The subject of locking down production, limiting access, etc. comes up
periodically on this list, so I apologize if this seems like a repeat. In
short, I'm looking for anyone willing to share, on this list or privately,
how they "protect" production from those who support it.



Here's the situation: like many others, we (the DBA team) are asked to support
hundreds of environments. In one case a DBA (let's call him Scapebob) had
multiple PuTTY sessions open to hosts supporting stage and production for the
same application. In the heat of the moment he typed a "srvctl stop
instance..." command in the wrong window: production instead of stage. Both
stage and production are 4-node RACs, and initially no one noticed, not even
the client. Scapebob immediately restarted the production instance and all was
fine for about an hour, but then some locking issues came up that caused an
outage. At that point upper management heard of the accidental instance
shutdown and our whole team came under fire.



The question I'm researching is how best to keep this kind of thing from
happening again.



* We already use LDAP/RH directory on a number of environments, but that
doesn't differentiate production from lower environments. All require
individual accounts, and we use "sudo -u oracle" to execute the more dangerous
commands.

* Should we look into some kind of additional control where commands like
"srvctl stop..." cannot be run under our own accounts using "sudo -u oracle"
but instead need a different account on production? For example, normally our
unfortunate DBA would use his "scapebob" Linux account, but perhaps to perform
a production shutdown he'd need to connect as "scapebob-rw", a new, special
account just for dangerous production activities.

* The problem in our situation was confusion over multiple windows. Do people
set a Linux TMOUT to something short, like 10 or 15 minutes, to hopefully avoid
accidentally leaving production PuTTY sessions open?

* Beyond changing the Linux prompt and text colors (we set $PS1 with escape
sequences and various key, env-specific values), do you do anything else to
protect production?
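
To make the TMOUT and "extra step on production" ideas above concrete, here is
a minimal sketch of a profile snippet, not something we run today. Everything
named in it is an assumption made for illustration: the marker file
/etc/env_tier, the variables ENV_TIER_FILE and SRVCTL_BIN, and the functions
env_tier, confirm_host and guarded_srvctl. It sets a 15-minute idle timeout on
production and makes the operator type the host name before a "stop" goes
through.

```shell
# Hypothetical profile snippet; all names below are illustrative assumptions.
ENV_TIER_FILE="${ENV_TIER_FILE:-/etc/env_tier}"  # assumed file holding "prod" or "stage"
SRVCTL_BIN="${SRVCTL_BIN:-srvctl}"               # real binary to call once confirmed

env_tier() {
    # Print the tier recorded in the marker file, or "unknown" if it is missing.
    if [ -r "$ENV_TIER_FILE" ]; then
        tr -d '[:space:]' < "$ENV_TIER_FILE"
    else
        echo "unknown"
    fi
}

# On production, log idle interactive sessions out after 15 minutes.
# bash honors TMOUT; readonly keeps it from being casually unset.
if [ "$(env_tier)" = "prod" ]; then
    TMOUT=900
    readonly TMOUT
    export TMOUT
fi

confirm_host() {
    # Succeed only if the operator types this host's node name exactly.
    expected_host="$(uname -n)"
    printf 'PRODUCTION host %s: type the host name to continue: ' "$expected_host" >&2
    read -r typed_host || typed_host=""
    [ "$typed_host" = "$expected_host" ]
}

guarded_srvctl() {
    # Ask for confirmation before "stop" on production; pass everything else through.
    if [ "$(env_tier)" = "prod" ] && [ "${1:-}" = "stop" ]; then
        if ! confirm_host; then
            echo "Host name mismatch; command not run." >&2
            return 1
        fi
    fi
    "$SRVCTL_BIN" "$@"
}
```

A wrapper like this is only a speed bump for interactive use (for example via
an alias from srvctl to guarded_srvctl). It does not stop anyone from calling
the real binary directly through "sudo -u oracle", so it would complement a
separate production account or sudo rules rather than replace them.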



Thanks in advance for anything shared.



Regards,



Dave
