Re: Oracle crashes on Windows shutdown

On 12/21/05, Allen, Brandon <Brandon.Allen@xxxxxxxxxxx> wrote:
>
> Just curious if any of you have ever had a problem with Oracle coming back
> up after a "crash" forced by a normal Windows shutdown (i.e. no disk or
> other hardware problems - just a reboot)?
>
> I know there are various registry settings to determine if Oracle shuts
> down, how it shuts down (normal/immediate/abort), and how long Windows
> should wait.  And I know that Oracle Support's canned response on the
> subject is:
>
> "If the entry's are not set, stopping an service will do WORSE then an
> shutdown abort. Windows NT / 2000 will just "clean" the memory. This could
> be compared by a kill -9 on unix. This is likely to make any cold backup
> useless. "
>
> And the Oracle Admin doc says:
>
> "If ORA_SHUTDOWN or ORA_SID_SHUTDOWN is set to false, then shutting down
> OracleServiceSID will still shut down Oracle9i database. But it will be an
> abnormal shutdown, and Oracle Corporation does not recommend it."
>
> But, has anyone ever actually seen it cause a problem in real life?  I've
> seen Windows shutdowns force thousands of Oracle crashes over the years and
> not once have I seen it cause any trouble for Oracle on startup - other than
> a little crash recovery of course.
>
> Thanks,
> Brandon

Brandon,

Win32 OSes use NTFS for their file systems, and NTFS is a journaling
filesystem. Provided that the write is journaled, instance recovery
should be able to get things going again after a hard crash.

If you throw write-back caching at the RAID controller level into the
mix, things become very scary very quickly. OK, so there is a battery
backup on the RAID controller, such that it's rendered non-volatile,
you say? Funny thing about batteries is that they drain and
periodically require replacement or recharging. Some RAID controllers
(PERC comes to mind) don't allow you to manually recharge the battery
- it will recharge on its own. Fine - except if you combine events
... drained battery, hard power-off event, unwritten cached data in
NVRAM ... and now you have data that never reached disk but that
Oracle thinks has been written.
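
To make that failure mode concrete, here is a toy Python model of a
write-back cache. It is only a sketch: the class, the method names and
the block labels are hypothetical, and it models no real controller,
just "acknowledge first, persist later".

```python
class WriteBackCache:
    """Toy model (hypothetical): writes are acknowledged before they reach disk."""

    def __init__(self, battery_ok=True):
        self.battery_ok = battery_ok
        self.cache = {}   # acknowledged, but not yet on disk
        self.disk = {}    # blocks that really are on disk

    def write(self, block, data):
        self.cache[block] = data
        return True       # the caller now believes the write is durable

    def flush(self):
        self.disk.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        if self.battery_ok:
            # A healthy battery preserves the cache until it can be flushed.
            self.flush()
        else:
            # Drained battery: the acknowledged writes are simply gone.
            self.cache.clear()


def lost_blocks(battery_ok):
    """Return the acknowledged blocks that never made it to disk."""
    ctrl = WriteBackCache(battery_ok=battery_ok)
    acked = [b for b in ("redo-17", "data-42") if ctrl.write(b, b"payload")]
    ctrl.power_loss()
    return [b for b in acked if b not in ctrl.disk]
```

With a healthy battery nothing is lost; with a drained one, every
acknowledged block is missing from disk even though the writer was
told it had been written.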

I've also seen what were normally healthy PERC units start throwing
writefile errors, causing volumes to go offline. If the volume is
offline, so is the NTFS journal.

I don't see anything wrong with

"SQL> shutdown abort"
"C:\> net stop oracleservice%ORACLE_SID%"
"C:\sysinternals\sync"

followed by a power down.
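
The sequence above could be scripted along these lines. This is a
minimal Python sketch, not a tested procedure: the function names are
made up for illustration, it assumes sqlplus and net are on the PATH,
that ORACLE_SID is set in the environment, that "connect / as sysdba"
works through OS authentication, and that the Sysinternals sync
utility lives at the path shown above. It defaults to a dry run that
only prints the commands.

```python
import subprocess


def shutdown_steps(sid, sync_exe=r"C:\sysinternals\sync"):
    """Return the three steps as (argv, stdin_text) pairs (hypothetical helper)."""
    # Assumption: OS authentication lets us connect as sysdba without a password.
    sqlplus_input = "connect / as sysdba\nshutdown abort\nexit\n"
    return [
        (["sqlplus", "/nolog"], sqlplus_input),          # abort the instance
        (["net", "stop", "OracleService" + sid], None),  # stop the Windows service
        ([sync_exe], None),                              # flush file system buffers
    ]


def run_shutdown(sid, dry_run=True):
    """Run the steps in order; stop at the first one that fails."""
    for argv, stdin_text in shutdown_steps(sid):
        if dry_run:
            print(" ".join(argv))
            continue
        subprocess.run(argv, input=stdin_text, text=True, check=True)
```

Calling run_shutdown("ORCL") prints the three commands without running
anything; pass dry_run=False only once the assumptions above have been
checked on the box in question.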

Crash recovery only works if the controlfile and current online redo
log are healthy. If they aren't, you're looking at media recovery and
incomplete database recovery.

hth.

Paul
--
http://www.freelists.org/webpage/oracle-l

