Re: Shutdown Abort

  • From: "Andrew Kerber" <andrew.kerber@xxxxxxxxx>
  • To: robertgfreeman@xxxxxxxxx
  • Date: Tue, 3 Jul 2007 11:31:49 -0500

I am not trying to start an argument here, but you are missing the point
entirely.  It would not require any sort of bug in the shutdown abort
'process' to cause problems with your database.  Shutdown abort does not
check file status, checkpoint status, transactions, or anything else.  It
just kills the instance processes.  Immediately.  If the database is shut
down, shutdown abort works the way it is expected to.  And frankly, the
simplest test case there is (as I mentioned before) is to put the db files in
backup mode, then shutdown abort.  The instance will not come back up without
intervention from the DBA.  That's what I call a problem with shutdown abort.
Just because you can fix the problem easily does not mean it's not a
problem.  Shutdown immediate will not allow you to shut down the instance
with files in backup mode.
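
For anyone who wants to try the backup-mode case on a scratch instance, a
rough SQL*Plus sketch follows.  It assumes a session connected AS SYSDBA, a
database in ARCHIVELOG mode, and a tablespace named USERS (the tablespace
name is only an example, and the exact error numbers can vary by version).

```sql
-- Sketch only: run on a test instance, connected AS SYSDBA in SQL*Plus.
-- Assumes ARCHIVELOG mode and a tablespace called USERS (example name).

-- Put one tablespace's datafiles into hot backup mode.
ALTER TABLESPACE users BEGIN BACKUP;

-- SHUTDOWN IMMEDIATE would be refused at this point (typically ORA-01149),
-- but SHUTDOWN ABORT checks nothing and just kills the instance.
SHUTDOWN ABORT

-- The instance starts and mounts, but the open typically fails with
-- ORA-01113 / ORA-01110 (file needs media recovery), leaving the
-- database mounted.
STARTUP

-- DBA intervention: take the files out of backup mode, then open.
ALTER DATABASE END BACKUP;
ALTER DATABASE OPEN;
```

The fix is two statements, but the database stays down until someone
issues them.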

Shutdown abort is sometimes required, but as I have said before, until
Oracle endorses it as the standard method to shut down an instance, it's a
good idea to avoid it.

On 7/3/07, Robert Freeman <robertgfreeman@xxxxxxxxx> wrote:

 Anything is possible given the right set of conditions. Clearly there is
an issue with fsck in AIX5L that can cause problems with temporary
tablespace temp files. This same problem would seem to exist when doing a
shutdown immediate too. So if one were to do a shutdown abort, and reboot
the system and have this failure occur, one might well jump to the
conclusion that it's a problem with shutdown abort, when in fact it is not.

So there may be other OS interactions that happen in very odd cases that
might cause it to appear that a shutdown abort is the issue when it is, in
reality, something else. Also there could be bugs present anywhere in the
configuration (OS, firmware, etc.) that could cause IO corruption given the
right conditions. Then there is the possibility of something running in the
background of the OS that caused the problem, who is to know? The bottom
line is that you are supposed to be able to pull the PLUG on the thing, and
expect that it will come up without help *every* *single* time (assuming
that pulling the plug didn't take out a physical disk for some reason).

I think you are spot on that this was not a controlled test, so anything
could have caused it. In my mind, if it's not reproducible, then there is
something about the test that was not controlled. The exact set of
conditions needs to be known and reproducible to call a test controlled.

The OP's problem might well have happened with a shutdown immediate, there
is no way of telling and so blaming shutdown abort is jumping to conclusions
that are not supported by any hard facts other than the fact that one of the
actions performed was a shutdown abort. How do we know it wasn't the startup
command that was at fault?? This test makes shutdown abort a suspect but not
a criminal. If you can replicate the problem consistently, then we have
something to work with, and I'd LOVE to see your results.

I guess my point of view is that you are as likely to find a bug with
shutdown immediate as you are with shutdown abort, so do we just not
shut down the database at all because there *might* be a bug? All other
things being equal, shutdown abort should not have negative impacts on your
database, and if it does, it's a bug. Besides, by the time I go production
on any given system, I've done a shutdown abort enough times on test and
development that it should be exercised pretty well. You can't spend your
life worrying about the bugs that *might* be there (there are enough real
ones to contend with!!), that is what backup strategies are for.

Finally the *body* of experience here seems to be that shutdown abort
works, and is perfectly safe. I've yet to see one case where anyone can
reliably replicate a shutdown abort bug in 9i or 10g that is exclusive to
shutdown abort. If anyone can provide a reproducible test case of shutdown
abort failure, please let me know.

RF



Robert G. Freeman
Oracle Consultant/DBA/Author
Principal Engineer/Team Manager
The Church of Jesus Christ of Latter-Day Saints
Father of Five, Husband of One,
Author of various geeky computer titles
from Osborne/McGraw Hill (Oracle Press)
Oracle Database 11g New Features Now Available for Pre-sales on Amazon.com!
Sig V1.1

-----Original Message-----
*From:* Joel.Patterson@xxxxxxxxxxx [mailto:Joel.Patterson@xxxxxxxxxxx]
*Sent:* Tuesday, July 03, 2007 9:47 AM
*To:* robertgfreeman@xxxxxxxxx; oracle-l@xxxxxxxxxxxxx
*Subject:* RE: Shutdown Abort

I agree. It was very unusual – but apparently possible. My original
note sided with immediate, but if that was unacceptable, I would use
abort. … If Randy claims it has to be a controlled test, then so be
it. I would think it quite hard to find exactly what was happening and
make it happen.



Joel Patterson
Database Administrator
joel.patterson@xxxxxxxxxxx
x72546
904  727-2546





--
Andrew W. Kerber

'If at first you don't succeed, don't take up skydiving.'
