Re: Protecting production from "us"

  • From: Alvaro Jose Fernandez <alvaro.fernandez@xxxxxxxxx>
  • To: "alfredo.abate@xxxxxxxxx" <alfredo.abate@xxxxxxxxx>, Jeremy Schneider <jeremy.schneider@xxxxxxxxxxxxxx>
  • Date: Fri, 4 Dec 2015 08:11:50 +0000

I remember a related situation from many years ago. I was typing in a shell
session on a big HP-UX system. Then a guy (a former external consultant, not
working in the same line of business as me, and quite a smart person btw)
entered the room, approached *my* keyboard and... to my dismay typed
"rm -rf *", while smiling and saying "hoho! I too know UNIX!". The whole thing
happened in a mere 5 or 6 seconds. I quickly typed "Control-C" trying to
abort... too late. At that moment my current working directory was... the
datafiles folder of the main app's database.



When the guy understood what he had done he panicked, of course. We recovered
the database from backups, and no one was fired. Fortunately, at the time the
database/application deployment was only in its initial stages, so no real
harm was done, but truly... accidents happen!



Alvaro



________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on behalf of
Alfredo Abate [alfredo.abate@xxxxxxxxx]
Sent: Thursday, December 3, 2015 21:49
To: Jeremy Schneider
CC: HerringD@xxxxxxx; Oracle-L
Subject: Re: Protecting production from "us"
I'm disappointed at management's response, where the backlash is that any
further mistakes on production will now result in immediate termination. I
don't see how any person (in any field) could work knowing that if they make
another mistake like this they will be terminated. Especially given the track
record that it sounds like you and the rest of your team have had (years
between incidents). If someone were making these types of mistakes frequently
then that would be another story altogether.

I suppose if the system at hand cost a company tons of money for each outage
(say a trading system or high volume eCommerce) then things might be a little
different (maybe this is the case here).

At the end of the day these machines, systems, etc. are all operated by the
almighty, error-prone humans.

Alfredo



On Thu, Dec 3, 2015 at 2:27 PM, Alfredo Abate
<alfredo.abate@xxxxxxxxx<mailto:alfredo.abate@xxxxxxxxx>> wrote:
I like Jeremy's server side control better for the terminal background colors.
I'll have to look into that one.

Thanks for that tip.

Alfredo



On Thu, Dec 3, 2015 at 1:05 PM, Jeremy Schneider
<jeremy.schneider@xxxxxxxxxxxxxx<mailto:jeremy.schneider@xxxxxxxxxxxxxx>> wrote:
On Thu, Dec 3, 2015 at 11:45 AM, Herring, David
<HerringD@xxxxxxx<mailto:HerringD@xxxxxxx>> wrote:

* Should we look into some kind of additional controls where
commands like "srvctl stop..." cannot be run under our own accounts using
"sudo -u oracle" but instead need a different account on production? For
example, normally our unfortunate DBA would use his "scapebob" Linux account
but perhaps to perform a production shutdown he'd need to connect as
"scapebob-rw", a new, special account just for dangerous production
activities.

I think that I'd be hesitant to introduce too much variation between
production and test environments when it comes to processes. It's a
major advantage if you can test your processes in the test tier, then
run those same processes verbatim (key-for-key) in production
afterwards.

* The problem in our situation was confusion over multiple
windows. Do people set a Linux TMOUT to something short like 10 or 15
minutes, to hopefully avoid accidentally leaving production PuTTY sessions
open?

I feel like a short timeout is likely to cause more frustration in the
trenches than it's worth, for anyone who spends any significant
amount of time troubleshooting production systems. Often you have
multiple windows open and switch between them... an aggressive timeout
really makes that much more difficult.
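
For anyone who still wants to try the TMOUT idea, here is a minimal sketch of
limiting it to production only, so test sessions are left alone. It assumes a
bash login shell and a site-specific variable DB_ENV that names the tier; both
DB_ENV and the helper tmout_for_env are hypothetical names made up for this
example, and the 900 seconds is just the 15 minutes mentioned above.

```shell
# Hypothetical helper: map a tier name to an idle timeout in seconds.
# The tier names and the values are assumptions for illustration only.
tmout_for_env() {
    case "$1" in
        prod) echo 900 ;;   # 15 minutes of idle time on production
        *)    echo 0 ;;     # 0 means no idle timeout on other tiers
    esac
}

# DB_ENV is an assumed, site-specific variable (not a standard one).
# If it is unset, fall back to "test", which leaves the timeout disabled.
TMOUT="$(tmout_for_env "${DB_ENV:-test}")"
export TMOUT
```

Dropped into a file sourced at login on each server, this would keep the same
script on every tier and let only the tier name differ.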

* Beyond changing the Linux prompt and text colors (we set $PS1 with
escape sequences and various key, env-specific values) do you do anything
else for protection of production?

Personally, I think background color is your best bet. The only difference
from Alfredo's suggestion would be that I'd prefer having it be
controlled server-side rather than relying on each engineer to set up
all their terminal connections correctly. Not to mention that you
could get the *wrong* bg color if it's client-side and somehow
somebody ssh's between tiers.
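
One way the server-side idea could look, as a rough sketch rather than the
setup actually used here: a snippet sourced at login on each server builds a
bash prompt with an ANSI background color chosen by tier. The variable DB_ENV
and the helper prompt_for_env are hypothetical names invented for this
example. Note that this colors only the prompt itself; changing the whole
window background depends on what the terminal emulator supports.

```shell
# Hypothetical helper: build a bash PS1 string whose background color
# depends on the tier name passed as $1.
prompt_for_env() {
    case "$1" in
        prod) _bg=41 ;;   # SGR 41: red background
        test) _bg=42 ;;   # SGR 42: green background
        *)    _bg=49 ;;   # SGR 49: terminal's default background
    esac
    # \[ and \] mark non-printing bytes so bash measures the prompt width
    # correctly; \e[0m resets colors before the user's typed command.
    printf '%s' "\[\e[${_bg}m\][$1] \u@\h:\w\[\e[0m\]\\$ "
}

# DB_ENV is an assumed, site-specific variable naming the tier.
PS1="$(prompt_for_env "${DB_ENV:-test}")"
```

Because the server sets the prompt, the color follows the session even when
somebody ssh's from one tier to another.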

--
http://about.me/jeremy_schneider
--
//www.freelists.org/webpage/oracle-l


