RE: hanging shutdowns (addressing the requirement for a UNIX reboot)

To: <roger_xu@xxxxxxxxxxx>, "Oracle-L@Freelists" <oracle-l@xxxxxxxxxxxxx>
Date: Mon, 27 Feb 2006 17:18:35 -0800
All,

I am with Jeremiah on this: a shutdown abort DOES NOT harm a database
(at least it never did in the five years I used it on a set of active
databases a few years ago). The ONLY time a database had a problem after
a shutdown abort was during an 8i-to-9i upgrade. There was a bug a while
ago, related to the change of format in the redo log to support LSB,
which showed up when a shutdown abort was issued partway through the
upgrade, before it had completed. I don't remember the specifics, but it
manifested only during the upgrade.

As to the requirement to reboot the Solaris server: was this because the
database did not restart and complained of 'Unable to create Shared Mem
segment' (or a similar message)? I believe this could have happened
because you killed the background processes after a 'shutdown immediate'
"hang". Once you initiate a 'shutdown immediate' and control-C out of
it, you can no longer log in, since any new attach will complain that a
shutdown is in progress, and the only way out is to kill the background
processes. In that case the shared memory segment is never released, and
you get the error at database restart because, everything else being
equal, the SHM start address is calculated to be the same value as that
of the segment that still exists. You can get out of this quite easily,
as the following real-life example shows:

In this case I had three databases: the 1st and 2nd, which survived, and
a third whose background processes had to be killed. Use 'ipcs -am' to
list the shared memory segments, work out the SGA size of each surviving
database, and map the segments to databases using the LPIDs as shown
below. Then use 'ipcrm -m <ID>' to remove the *right* segments (in this
case the two with LPID 23175, i.e. 'ipcrm -m 147840' and 'ipcrm -m
276867'), which will then allow you to restart the database. (Take it
from me, I have done it many times before.) In addition, the NATTCH
column showing 0 attaches is another giveaway!

$ ipcs -am | head -2; ipcs -am | grep oracle
IPC status from <running system> as of Thu Dec  8 13:47:57 BST 2005
T         ID KEY        MODE           OWNER    GROUP  CREATOR CGROUP NATTCH      SEGSZ  CPID  LPID    ATIME    DTIME    CTIME
m     147840 0          --rw-r-----   oracle      dba   oracle    dba      0  655441920  8931 23175 13:47:22 13:47:22 11:42:07
m          2 0xdd27ed28 --rw-r-----   oracle      dba   oracle    dba     16  371458048  6548 22193 13:45:01 13:45:01 14:35:12
m     276867 0xfa9fd35c --rw-r-----   oracle      dba   oracle    dba      0  502874112  8931 23175 13:47:22 13:47:22 11:42:11
m     787590 0          --rw-r-----   oracle      dba   oracle    dba    139  655441920 11593 23223 13:47:46 13:47:47  6:06:10
m     716359 0xe315db0c --rw-r-----   oracle      dba   oracle    dba    139  502874112 11593 23223 13:47:46 13:47:47  6:06:15
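
For anyone who would rather not do the grouping by eye, here is a
minimal Python sketch of the same bookkeeping. It is only an
illustration: the rows are copied from the listing above with one record
per line, the function names are made up for this example, and it
assumes the 15-column layout that 'ipcs -am' printed on this particular
Solaris box. It only reports candidates and never calls ipcrm itself.

```python
from collections import defaultdict

# Rows copied from the 'ipcs -am' listing above, one record per line.
IPCS_ROWS = """\
m 147840 0 --rw-r----- oracle dba oracle dba 0 655441920 8931 23175 13:47:22 13:47:22 11:42:07
m 2 0xdd27ed28 --rw-r----- oracle dba oracle dba 16 371458048 6548 22193 13:45:01 13:45:01 14:35:12
m 276867 0xfa9fd35c --rw-r----- oracle dba oracle dba 0 502874112 8931 23175 13:47:22 13:47:22 11:42:11
m 787590 0 --rw-r----- oracle dba oracle dba 139 655441920 11593 23223 13:47:46 13:47:47 6:06:10
m 716359 0xe315db0c --rw-r----- oracle dba oracle dba 139 502874112 11593 23223 13:47:46 13:47:47 6:06:15
"""


def parse_segments(text):
    """Parse shared memory rows; column order is assumed to match the header above."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        # T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ATIME DTIME CTIME
        if len(fields) != 15 or fields[0] != "m":
            continue  # skip headers and anything that is not a shared memory row
        segments.append({
            "id": int(fields[1]),
            "key": fields[2],
            "nattch": int(fields[8]),
            "segsz": int(fields[9]),
            "cpid": int(fields[10]),
            "lpid": int(fields[11]),
        })
    return segments


def group_by_lpid(segments):
    """Group segments by the LPID column, as done by hand in the message."""
    groups = defaultdict(list)
    for seg in segments:
        groups[seg["lpid"]].append(seg)
    return dict(groups)


def orphan_ids(segments):
    """IDs of segments with no attaches; candidates only, to be checked by a human."""
    return sorted(seg["id"] for seg in segments if seg["nattch"] == 0)


segments = parse_segments(IPCS_ROWS)
for lpid, segs in sorted(group_by_lpid(segments).items()):
    ids = [seg["id"] for seg in segs]
    total = sum(seg["segsz"] for seg in segs)
    print("LPID", lpid, "segment IDs", ids, "total bytes", total)
print("NATTCH = 0, candidates for ipcrm -m:", orphan_ids(segments))
```

A segment showing 0 attaches is a hint rather than proof, so it is still
worth matching the sizes against 'show sga' on the surviving databases
before removing anything.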

1st surviving DB SQL> show sga

Total System Global Area 1157681312 bytes  <== LPID 23223, 139 attaches
Fixed Size                    73888 bytes
Variable Size             501182464 bytes
Database Buffers          655360000 bytes
Redo Buffers                1064960 bytes

1158316032 = 655441920 + 502874112 (LPID 23223 - 2 segments)

2nd surviving DB SQL> show sga

Total System Global Area  370548720 bytes  <== LPID 22193
Fixed Size                    69616 bytes
Variable Size             328454144 bytes
Database Buffers           40960000 bytes
Redo Buffers                1064960 bytes
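
The size matching can be written down as a small check as well. The
sketch below is only an illustration in Python using the figures quoted
in this message. The 1 MB tolerance and the function names are
assumptions made for the example; the message does not say why the
segment totals come out slightly larger than the 'show sga' totals, only
that they line up closely enough to identify each instance.

```python
# Figures copied from the ipcs and 'show sga' output in this message.
SEGMENT_SIZES_BY_LPID = {
    23223: [655441920, 502874112],  # 139 attaches (1st surviving DB)
    22193: [371458048],             # 16 attaches (2nd surviving DB)
    23175: [655441920, 502874112],  # 0 attaches
}

# 'Total System Global Area' reported by each surviving instance.
SGA_TOTAL_BY_LPID = {
    23223: 1157681312,
    22193: 370548720,
}

# Hypothetical tolerance: how far the segment total may exceed the SGA total.
SLACK_LIMIT = 1024 * 1024


def slack(lpid):
    """Bytes by which the segments for this LPID exceed the reported SGA total."""
    return sum(SEGMENT_SIZES_BY_LPID[lpid]) - SGA_TOTAL_BY_LPID[lpid]


def accounted_for(lpid):
    """True if a running instance's SGA total explains the segments for this LPID."""
    return lpid in SGA_TOTAL_BY_LPID and 0 <= slack(lpid) <= SLACK_LIMIT


def unclaimed_lpids():
    """LPIDs whose segments are not explained by any running instance."""
    return sorted(lpid for lpid in SEGMENT_SIZES_BY_LPID if not accounted_for(lpid))


for lpid in sorted(SGA_TOTAL_BY_LPID):
    print("LPID", lpid, "slack in bytes:", slack(lpid))
print("LPIDs not matched to a running instance:", unclaimed_lpids())
```

With these figures only LPID 23175 is left unmatched, which agrees with
the NATTCH column showing 0 attaches for its two segments.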
 
John Kanagaraj <><
DB Soft Inc
Phone: 408-970-7002 (W)
 
Co-Author: Oracle Database 10g Insider Solutions
http://www.amazon.com/exec/obidos/tg/detail/-/0672327910/
 
** The opinions and facts contained in this message are entirely mine
and do not reflect those of my employer or customers **




-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Roger Xu
Sent: Monday, February 27, 2006 3:24 PM
To: Oracle-L@Freelists
Subject: RE: hanging shutdowns

What should I do if "shutdown immediate" hangs?
Last time, I had to reboot the Solaris Server.

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx]On Behalf Of Edgar Chupit
Sent: Monday, February 27, 2006 2:12 PM
To: Oracle-L@Freelists
Subject: Re: hanging shutdowns


Dear Jeremiah,

First of all, I would like to mention that I don't like to shut down a
database without a practical reason (such as hardware or OS maintenance,
upgrades, etc.).

And still I would like to argue that, under normal circumstances,
startup force restrict + shutdown immediate (or shutdown abort, startup
force, shutdown immediate) will run almost as fast as, and is as
dangerous as, a single shutdown immediate.

After a shutdown abort, in order to perform a cold backup you still need
to start up the database and close it in a consistent state. Database
startup is not a very fast process in itself, because Oracle not only
needs to recover the database to a consistent state (rolling back
uncommitted transactions), but also has to allocate memory structures
and prepare itself for normal work. And to shut the database down in a
consistent state you still need to issue shutdown immediate.

One of the common reasons why shutdown immediate can take a long time is
that Oracle waits for the SNP processes to wake up (Note: 1018421.102),
but this can also happen when shutdown immediate is called the second
time (after startup force), so even with checkpointing and startup force
restrict the database can still hang in shutdown immediate.

Also, Note: 46001.1 suggests minimizing the use of shutdown abort on
Windows systems, because it can cause "allocation problems when Oracle
is next started." Note: 161234.1 describes a situation in which shutdown
abort can hang. Note: 222553.1 states that startup force can be safer
than shutdown abort. And there are plenty of other notes that describe
different problems that can occur during database shutdown.

And surely there are many bugs that can occur after shutdown abort (but
under normal circumstances shutdown abort is very safe).

Having said all this, I would like to return to the thread subject and
suggest that the original poster try to convince management to switch to
hot backups, and forget about shutting down the databases for backups
altogether.

On 2/27/06, Jeremiah Wilton <jeremiah@xxxxxxxxxxx> wrote:
> If you 'alter system checkpoint' before the 'shutdown abort' then it 
> should be a lot faster for the user with a hanging or prolonged 
> 'shutdown immediate'.

> Jeremiah Wilton
> ORA-600 Consulting
> Recoveries - Seminars - Hiring
> http://www.ora-600.net


--
Best regards,
  Edgar Chupit
  callto://edgar.chupit
--
//www.freelists.org/webpage/oracle-l


