Re: OT - Getting fired for database oops

  • From: Charles Schultz <sacrophyte@xxxxxxxxx>
  • To: Mark.Bobak@xxxxxxxxxxxx
  • Date: Mon, 18 May 2009 09:09:00 -0500

alias rm 'rm --yes-i-know-i-specified-rf-and-yes-i-know-im-about-to-specify-the-root-directory-and-yes-i-know-im-logged-in-as-root-do-it-anyways'

"With great power comes great responsibility".

You just cannot have it both ways. Great power with no
responsibility happens only in Hollywood and in books and movies.
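
For what it's worth, the check Mark is wishing for, and Stephen's point
below about pre-written scripts for destructive jobs, can be approximated
with a couple of shell functions. This is only a rough sketch under
assumptions of mine: the function names, the host name qa-db01 and the
DATA_DIR variable are all made up, and the root check only catches a
literal / (it will not notice /. or a symlink to root).

```shell
#!/bin/sh
# Hypothetical guards for a destructive clear-down script.
# All names here (require_host, require_safe_target, qa-db01, DATA_DIR)
# are illustrative assumptions, not anything from the thread.

# Return 0 only when the current host is the expected one.
# The optional second argument overrides `hostname` (handy for testing).
require_host() {
    expected="$1"
    actual="${2:-$(hostname)}"
    if [ "$actual" != "$expected" ]; then
        echo "refusing: this is '$actual', expected '$expected'" >&2
        return 1
    fi
    return 0
}

# Return 0 only when the target is neither empty nor the root directory.
# Every trailing slash is stripped first, so "", "/" and "///" all
# collapse to the empty string and are rejected.
require_safe_target() {
    trimmed=$(printf '%s' "$1" | sed 's:/*$::')
    if [ -z "$trimmed" ]; then
        echo "refusing: target is empty or the root directory" >&2
        return 1
    fi
    return 0
}

# Intended use at the top of the QA clear-down script (left commented
# out so that running this sketch deletes nothing):
# require_host qa-db01 && require_safe_target "$DATA_DIR" && rm -rf -- "$DATA_DIR"
```

The idea is that the script lives only on the QA box and still checks
where it is running before it deletes anything, so a login to the wrong
server fails loudly instead of quietly.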

On Mon, May 18, 2009 at 08:53, Bobak, Mark <Mark.Bobak@xxxxxxxxxxxx> wrote:

>  Ouch!
>
>
>
> rm -rf / ... how many ways can that one bite you?  Honestly, sometimes I
> think rm needs an option like:
>
> rm -rf --yes-i-know-i-specified-rf-and-yes-i-know-im-about-to-specify-the-root-directory-and-yes-i-know-im-logged-in-as-root-do-it-anyways /
>
>
>
>
>
> Maybe then, this particular calamity’s frequency would be reduced...
>
>
>
> -Mark
>
>
>
> *From:* oracle-l-bounce@xxxxxxxxxxxxx [mailto:
> oracle-l-bounce@xxxxxxxxxxxxx] *On Behalf Of *Goulet, Richard
> *Sent:* Monday, May 18, 2009 9:43 AM
> *To:* stephenbooth.uk@xxxxxxxxx; oracle-l@xxxxxxxxxxxxx
> *Subject:* RE: OT - Getting fired for database oops
>
>
>
> Well, I'll give you all a good one to laugh about.  Regrettably it's only
> marginally about Oracle.  We had an HP tech in on a Sunday morning to
> install, configure, etc., Service Guard.  He and our resident Unix hack
> worked away at it all day, with a couple of hardware mess-ups along the
> way.  Now the HP tech had wisely placed a number of configuration files in a
> /temp directory.  At the end of this long day he went to delete the saved
> config files and very absent-mindedly typed
>
> "rm -fr /".
>
>
>
>     They spent most of the night restoring the system from tape, the
> hard way & called me at 3AM to start & check out the database.  The Service
> Guard install was deferred to another weekend.
>
>
>
> *Dick Goulet*
> Senior Oracle DBA
> PAREXEL International
>
>
>
>
>  ------------------------------
>
> *From:* oracle-l-bounce@xxxxxxxxxxxxx [mailto:
> oracle-l-bounce@xxxxxxxxxxxxx] *On Behalf Of *Stephen Booth
> *Sent:* Monday, May 18, 2009 8:08 AM
> *To:* oracle-l@xxxxxxxxxxxxx
> *Subject:* Re: OT - Getting fired for database oops
>
>
>
> On 05/18/2009, *John Hallas* <John.Hallas@xxxxxxxxxxxxxxxxxx> wrote:
>
> I do know of a DBA who deleted the test database ready for a refresh from
> production. The 578 datafiles took a long time to delete but slightly longer
> (36 hours)  to recover once he realised that he was logged onto production.
>
>
> Something very similar happened in one of my past jobs.  A consultant DBA
> at a customer site (employed by the customer through an agency) trashed the
> main production finance system at 17:00 one Friday, thinking he was dropping
> the QA one ready for a restore from the production backup over the weekend.
> I then had to spend the entire weekend restoring the production system and
> rolling it forward, plus restoring the QA system.  This was a 23:55 by 7
> system (i.e. 5 minutes of permitted downtime a day).  Fortunately weekends
> were slow and there was provision to cache transactions locally and then
> apply them as a batch.  Unfortunately the total transactions for a weekend
> amounted to about the average for 10 minutes on a Monday morning, so getting
> it fixed for Monday was vital.
>
>
>
>
> The company got a £1.8 million fine for the outage  - government supplier
> etc
>
>
> Fortunately we were able to get the system back by the early hours of
> Monday morning so losses were minimal (about £1million, pocket change for
> this organisation).
>
>
>
>
> He kept his job though
>
>
>
>  I suspect that the DBA who trashed the database would have been sacked,
> but from what we could tell from some forensic unpicking of events, phone
> logs, statements from people on site at the time and CCTV footage, he spent
> about 30 minutes trying to fix it, phoned his agency for 10 minutes, cleared
> his desk and left for destination unknown.  When contacted, his agency denied
> any knowledge of him.
>
> The key lessons we learned from this were:
>
> * Don't use the same passwords on production and QA (OS and Oracle).
> * For any regular destructive jobs (e.g. deleting datafiles to clear down
> QA ready for restore from prod) have a pre-written script that is only on
> the server it's needed on rather than running the steps manually.
> * When you've broken a mirror from a 3-way stack to back up from, consider
> not resilvering until the last possible moment (had this been the case here
> we could have restored by resilvering from the detached copy to the other
> two 'disks' and rolling forward on the logfiles, total downtime less than 3
> hours).
>
> We did try to get the customer to agree to us doing the trashing of the
> database as part of our restore process on the Saturday but they insisted on
> keeping control of the process and that it be done by their own staff.
>
> Stephen
>
> --
> It's better to ask a silly question than to make a silly assumption.
>
> http://stephensorablog.blogspot.com/ |
> http://www.linkedin.com/in/stephenboothuk | Skype: stephenbooth_uk
>
> Apparently I'm a "Eierlegende Woll-Milch-Sau", I think it was meant as a
> compliment.
>



-- 
Charles Schultz
