RE: Shutdown Abort

  • From: "Hameed, Amir" <Amir.Hameed@xxxxxxxxx>
  • To: <andrew.kerber@xxxxxxxxx>, <robertgfreeman@xxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 3 Jul 2007 13:28:27 -0400

We have been using shutdown abort for our mission critical Oracle Apps
systems for over a decade now and have never had any issues. We use this
option instead of immediate because immediate takes a long time to shut
down the database, and if we used it we would not be able to stay within
our agreed LOS time.


________________________________

        From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Andrew Kerber
        Sent: Tuesday, July 03, 2007 12:32 PM
        To: robertgfreeman@xxxxxxxxx
        Cc: oracle-l@xxxxxxxxxxxxx
        Subject: Re: Shutdown Abort
        
        
        I am not trying to start an argument here, but you are missing
the point entirely.  It would not require any sort of bug in the
shutdown abort 'process' to cause problems with your database.  Shutdown
abort does not check file status, checkpoint status, transactions, or
anything else.  It just kills the instance processes.  Immediately.  If
the database is shut down, shutdown abort has worked the way it is
expected to.  And frankly, the simplest test case there is (as I
mentioned before) is to put the db files in backup mode, then shutdown
abort.  The instance will not come back up w/o intervention of the dba.
That's what I call a problem w/ shutdown abort.  Just because you can
fix the problem easily does not mean it's not a problem.  Shutdown
immediate will not allow you to shut down the instance with files in
backup mode.
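
        For anyone who wants to try the backup-mode test case described
above, here is a minimal SQL*Plus sketch. It is an illustration, not part
of the original post. It assumes a disposable 10g test instance in
ARCHIVELOG mode and a session connected AS SYSDBA. The exact error text
may differ by version.

```sql
-- Sketch only: run on a disposable test instance, connected AS SYSDBA.
-- Assumes the database is open and in ARCHIVELOG mode.

-- 1. Put every datafile in backup mode (10g syntax; on 9i use
--    ALTER TABLESPACE <name> BEGIN BACKUP for each tablespace).
ALTER DATABASE BEGIN BACKUP;

-- Confirm the files are in backup mode (STATUS = 'ACTIVE').
SELECT file#, status FROM v$backup;

-- 2. Kill the instance without a checkpoint.
SHUTDOWN ABORT

-- 3. Restart. The instance should start and mount, but the open is
--    expected to fail with ORA-01113 (file needs media recovery).
STARTUP

-- 4. The DBA intervention: take the files out of backup mode, then open.
ALTER DATABASE END BACKUP;
ALTER DATABASE OPEN;
```

        Repeating step 1 and then issuing SHUTDOWN IMMEDIATE instead of
SHUTDOWN ABORT should show the contrast: the shutdown is refused while
files are still in backup mode.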
        
        Shutdown abort is sometimes required, but as I have said before,
until Oracle endorses it as the standard method to shut down an instance,
it's a good idea to avoid it.
        
        
        On 7/3/07, Robert Freeman <robertgfreeman@xxxxxxxxx> wrote: 

                Anything is possible given the right set of conditions.
Clearly there is an issue with fsck in AIX5L that can cause problems
with temporary tablespace temp files. This same problem would seem to
exist when doing a shutdown immediate too. So if one were to do a
shutdown abort, and reboot the system and have this failure occur, one
might well jump to the conclusion that it's a problem with shutdown
abort, when in fact it is not.
                 
                So there may be other OS interactions that happen in
very odd cases that might cause it to appear that a shutdown abort is
the issue when it is, in reality, something else. Also there could be
bugs present anywhere in the configuration (OS, firmware, etc.) that
could cause IO corruption given the right conditions. Then there is the
possibility of something running in the background of the OS that caused
the problem; who is to know? The bottom line is that you are supposed to
be able to pull the PLUG on the thing, and expect that it will come up
without help *every* *single* time (assuming that pulling the plug
didn't take out a physical disk for some reason). 
                 
                I think you are spot on that this was not a controlled
test, so anything could have caused it. In my mind, if it's not
reproducible, then there is something about the test that was not
controlled. The exact set of conditions needs to be known and
reproducible to call a test controlled. 
                 
                The OP's problem might well have happened with a
shutdown immediate. There is no way of telling, and so blaming shutdown
abort is jumping to conclusions that are not supported by any hard facts
other than the fact that one of the actions performed was a shutdown
abort. How do we know it wasn't the startup command that was at fault?
This test makes shutdown abort a suspect but not a criminal. If you can
replicate the problem consistently, then we have something to work with,
and I'd LOVE to see your results.
                 
                I guess my point of view is that you are as likely to
find a bug with shutdown immediate as you are with shutdown abort, so do
we just not shut down the database at all because there *might* be a
bug? All other things being equal, shutdown abort should not have
negative impacts on your database, and if it does, it's a bug. Besides,
by the time I go production on any given system, I've done a shutdown
abort enough times on test and development that it should be exercised
pretty well. You can't spend your life worrying about the bugs that
*might* be there (there are enough real ones to contend with!!); that is
what backup strategies are for. 
                 
                Finally the *body* of experience here seems to be that
shutdown abort works, and is perfectly safe. I've yet to see one case
where anyone can reliably replicate a shutdown abort bug in 9i or 10g
that is exclusive to shutdown abort. If anyone can provide a
reproducible test case of shutdown abort failure, please let me know.
                 
                RF
                 
                 

                Robert G. Freeman
                Oracle Consultant/DBA/Author
                Principal Engineer/Team Manager
                The Church of Jesus Christ of Latter-Day Saints
                Father of Five, Husband of One,
                Author of various geeky computer titles
                from Osborne/McGraw Hill (Oracle Press)
                Oracle Database 11g New Features Now Available for
Pre-sales on Amazon.com!
                Sig V1.1 

                        -----Original Message-----
                        From: Joel.Patterson@xxxxxxxxxxx
[mailto:Joel.Patterson@xxxxxxxxxxx]
                        Sent: Tuesday, July 03, 2007 9:47 AM
                        To: robertgfreeman@xxxxxxxxx;
oracle-l@xxxxxxxxxxxxx
                        Subject: RE: Shutdown Abort
                        
                        

                        I agree. It was very unusual - but apparently
possible. My original note sided with immediate, but if that was
unacceptable, I would use abort. ... If Randy claims it has to be a
controlled test, then so be it. I would think it quite hard to find
exactly what was happening and make it happen.

                         

                        Joel Patterson 
                        Database Administrator 
                        joel.patterson@xxxxxxxxxxx 
                        x72546 
                        904  727-2546 

                         




        -- 
        Andrew W. Kerber
        
        'If at first you dont succeed, dont take up skydiving.' 
