RE: hanging shutdowns (addressing the requirement for a UNIX reboot)
- To: <roger_xu@xxxxxxxxxxx>, "Oracle-L@Freelists" <oracle-l@xxxxxxxxxxxxx>
- Date: Mon, 27 Feb 2006 17:18:35 -0800
All,
I am with Jeremiah on this: A shutdown abort DOES NOT harm a database
(at least it never did in the five years I used it on a set of active
databases a few years ago). The ONLY time a Db had a problem after a
shutdown abort was on an 8i database being upgraded to 9i. There was a
bug a while ago, related to the change of format in the redo log to
support LSB, which manifested itself when a shutdown abort was issued
partway through the upgrade, before it had completed. I don't remember
the specifics, but it manifested only during the upgrade.
As to the requirement to reboot the Solaris server, was this because the
database did not restart and complained of 'Unable to create Shared Mem
segment' (or a similar message)? I believe this could have been because
you killed the background processes after a 'shutdown immediate' "hang".
Once you initiate a 'shutdown immediate' and 'control-c' out of it, you
will never be able to log in again, since any new attach will complain
that a shutdown is in progress, and the only way out is to kill the
backend processes. In this case, the shared memory segment is never
released, and you get the error at database restart because (everything
else being equal) the SHM start address is calculated to be the same as
that of the existing, still-allocated segment. You can very easily
get out of this using the example in the following real life event:
In this case, I had three databases (the surviving 1st and 2nd Dbs and
the third, whose backend had to be killed). Use 'ipcs -am' to list the
shared memory segments, calculate the SGA size of the surviving
databases and map the segment IDs using the LPIDs as shown below. Then
use 'ipcrm -m <ID>' to remove the *right* segments (the ones with LPID
23175 in this case, i.e. IDs 147840 and 276867), which will then allow
you to restart the database. Note that 'ipcrm -m' takes the value from
the ID column; to remove by key, use 'ipcrm -M <Key>' instead.
(Take it from me, I have done it many times before). In addition, the
NATTCH column showing 0 attaches is another giveaway!
$ ipcs -am | head -2; ipcs -am | grep oracle
IPC status from <running system> as of Thu Dec 8 13:47:57 BST 2005
T     ID KEY        MODE        OWNER  GROUP CREATOR CGROUP NATTCH     SEGSZ  CPID  LPID ATIME    DTIME       CTIME
m 147840 0          --rw-r----- oracle dba   oracle  dba         0 655441920  8931 23175 13:47:22 13:47:22 11:42:07
m      2 0xdd27ed28 --rw-r----- oracle dba   oracle  dba        16 371458048  6548 22193 13:45:01 13:45:01 14:35:12
m 276867 0xfa9fd35c --rw-r----- oracle dba   oracle  dba         0 502874112  8931 23175 13:47:22 13:47:22 11:42:11
m 787590 0          --rw-r----- oracle dba   oracle  dba       139 655441920 11593 23223 13:47:46 13:47:47  6:06:10
m 716359 0xe315db0c --rw-r----- oracle dba   oracle  dba       139 502874112 11593 23223 13:47:46 13:47:47  6:06:15
1st surviving DB SQL> show sga
Total System Global Area 1157681312 bytes <== (LPID 23223, 139 attaches)
Fixed Size                    73888 bytes
Variable Size             501182464 bytes
Database Buffers          655360000 bytes
Redo Buffers                1064960 bytes
Segments: 655441920 + 502874112 = 1158316032 (LPID 23223, 2 segments),
just over the SGA total above
2nd surviving DB SQL> show sga
Total System Global Area  370548720 bytes <== (LPID 22193)
Fixed Size                    69616 bytes
Variable Size             328454144 bytes
Database Buffers           40960000 bytes
Redo Buffers                1064960 bytes
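The mapping above (group the segments by LPID, add up SEGSZ, compare
against 'show sga', and treat NATTCH of 0 as the giveaway) can be
sketched as a small shell helper. This is only a sketch under stated
assumptions: summarize_segments is a made-up name, and it assumes
Solaris-style 'ipcs -am' output with one segment per line and the
columns in the order of the header shown above. It only reports; it
never calls ipcrm for you.

```shell
#!/bin/sh
# Hypothetical helper (not a standard tool): summarise shared memory
# segments per LPID from Solaris-style `ipcs -am` output on stdin.
# Assumed column order, one segment per line:
#   T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ...
# On older Solaris releases you may need nawk or /usr/xpg4/bin/awk.
summarize_segments() {
    owner="${1:-oracle}"
    awk -v owner="$owner" '
        $1 == "m" && $5 == owner {
            lpid = $12
            if (lpid in total) {
                ids[lpid] = ids[lpid] "," $2
            } else {
                order[++n] = lpid          # remember first-seen order
                ids[lpid] = $2
            }
            total[lpid] += $10             # sum SEGSZ per LPID
            if ($9 + 0 > nattch[lpid] + 0) # keep the highest NATTCH seen
                nattch[lpid] = $9 + 0
        }
        END {
            for (i = 1; i <= n; i++) {
                l = order[i]
                flag = (nattch[l] + 0 == 0) ? " <== orphan candidate" : ""
                printf "LPID=%s IDs=%s total=%.0f NATTCH=%d%s\n", \
                       l, ids[l], total[l], nattch[l], flag
            }
        }'
}

# Assumed usage on the database host:
#   ipcs -am | summarize_segments oracle
```

Fed the listing above, this would flag LPID 23175 (IDs 147840 and
276867, 1158316032 bytes in total, NATTCH 0). Compare the totals
against 'show sga' on every surviving instance before removing
anything.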
John Kanagaraj <><
DB Soft Inc
Phone: 408-970-7002 (W)
Co-Author: Oracle Database 10g Insider Solutions
http://www.amazon.com/exec/obidos/tg/detail/-/0672327910/
** The opinions and facts contained in this message are entirely mine
and do not reflect those of my employer or customers **
-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Roger Xu
Sent: Monday, February 27, 2006 3:24 PM
To: Oracle-L@Freelists
Subject: RE: hanging shutdowns
What should I do if "shutdown immediate" hangs?
Last time, I had to reboot the Solaris Server.
-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx]On Behalf Of Edgar Chupit
Sent: Monday, February 27, 2006 2:12 PM
To: Oracle-L@Freelists
Subject: Re: hanging shutdowns
Dear Jeremiah,
First of all, I would like to mention that I don't like to shut down a
database without a practical reason (like hardware/OS
maintenance/upgrades/etc).
Still, I would argue that under normal circumstances startup force
restrict + shutdown immediate (or shutdown abort, startup force,
shutdown immediate) will run almost as fast as, and is as dangerous as,
a single shutdown immediate.
After a shutdown abort, in order to perform a cold backup you still need
to start the database up and close it in a consistent state. Database
startup is not a very fast process in itself, because Oracle not only
needs to recover the database to a consistent state (roll back
uncommitted transactions), but also has to allocate memory structures
and prepare itself for normal work. And to shut the database down in a
consistent state you still need to issue shutdown immediate.
One common reason why shutdown immediate can take a long time is that
Oracle waits for the SNP processes to wake up (Note: 1018421.102), but
this can also happen when shutdown immediate is called the second time
(after startup force), so even with checkpointing and startup force
restrict the database can still hang in shutdown immediate.
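For reference, the sequence being debated here (checkpoint first, then
startup force restrict, then shutdown immediate to leave the datafiles
consistent for a cold backup) could be scripted roughly as follows.
This is a sketch, not a recommendation: bounce_sql is a made-up helper
name, and piping into sqlplus with "/ as sysdba" assumes OS
authentication as the Oracle software owner.

```shell
#!/bin/sh
# Hypothetical helper: print the SQL*Plus commands for the sequence
# discussed in this thread. It only prints; review before running.
bounce_sql() {
    cat <<'EOF'
alter system checkpoint;
startup force restrict
shutdown immediate
exit
EOF
}

# Assumed usage on the database host, as the Oracle software owner:
#   bounce_sql | sqlplus -s "/ as sysdba"
```

Keeping the commands in a function that only prints them makes it easy
to review the sequence first. As noted above, the final shutdown
immediate can still hang, so this does not remove the need to watch it.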
Also, Note: 46001.1 suggests minimizing the use of shutdown abort on
Windows systems, because it can cause "allocation problems when Oracle
is next started.". Note: 161234.1 describes a situation in which
shutdown abort can hang. Note: 222553.1 states that startup force can be
safer than shutdown abort. And there are plenty of other notes that
describe different problems that can occur during database shutdown.
And surely there are many bugs that can occur after shutdown abort (but
under normal circumstances shutdown abort is very safe).
Having said all this, I would like to return to the thread subject and
suggest that the original poster try to convince management to switch
to hot backups, and forget about shutting down the databases for backup
at all.
On 2/27/06, Jeremiah Wilton <jeremiah@xxxxxxxxxxx> wrote:
> If you 'alter system checkpoint' before the 'shutdown abort' then it
> should be a lot faster for the user with a hanging or prolonged
> 'shutdown immediate'.
> Jeremiah Wilton
> ORA-600 Consulting
> Recoveries - Seminars - Hiring
> http://www.ora-600.net
--
Best regards,
Edgar Chupit
callto://edgar.chupit
--
http://www.freelists.org/webpage/oracle-l
- Follow-Ups:
- Re: hanging shutdowns (addressing the requirement for a UNIX reboot)
- From: LiShan Cheng