RE: Help with database corruption issue

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <stmontgo@xxxxxxxxx>, <pjhoraclel@xxxxxxxxx>
  • Date: Sun, 5 Aug 2012 12:14:24 -0400

It is possible that the rebuilt header was still in deferred write. A cp
operation forces a complete actual write, even on ext4. So it *seems*
possible to me that whatever corruption was there was always a recoverable
error and that it was simply not on disk yet when Oracle tried to read it.
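
To make "deferred write" concrete, here is a minimal sketch in Python. The
helper name and the scratch file are made up for illustration and nothing here
is taken from the system in question. A write normally lands in the OS page
cache first, and it only becomes durable once the kernel writes it back or a
process asks for that explicitly with fsync.

```python
import os
import tempfile

def write_durably(path, data):
    """Write data, then ask the kernel to flush it to the storage device."""
    with open(path, "wb") as f:
        f.write(data)          # goes to Python's buffer first
        f.flush()              # hand the buffered bytes to the OS page cache
        os.fsync(f.fileno())   # ask the OS to write its cached pages to disk
    return os.path.getsize(path)

if __name__ == "__main__":
    # Hypothetical scratch file standing in for a data file.
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "demo.dbf")
        print(write_durably(p, b"\x00" * 8192))
```

Until the fsync call (or the kernel's own writeback) completes, the new bytes
may exist only in memory. Whether that is what happened here is a guess.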

I don't know how you would prove what happened without access to a time
machine to do something like an every instruction strace on all the
processes involved. It is *possible* that enough is logged somewhere.

Good luck,

mwf

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Steve Montgomerie
Sent: Saturday, August 04, 2012 11:24 PM
To: pjhoraclel@xxxxxxxxx
Cc: oracle-l
Subject: Re: Help with database corruption issue

Thanks List!

Dennis and Peter,

We could start 19 of 20 databases. When we tried to start database X, it
would lock up the mount point, would not open, and would hang all of the
other 19 databases.

The actual error points to software corruption, something like running fsck
against a mounted file system.
The SA swears he did not do that, and we believe him.

Regarding the error, it points to a system utility that detects a bad
block and then tries to fix it, which ends up with the header information
being zeroed out of some blocks.

The only thing that makes sense to me is that the cp command somehow
rebuilt the header information of the bad blocks. Is that possible?
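
For what it is worth, a rough way to look for zeroed block headers in a copy
of the file might be something like the Python sketch below. The 8192-byte
block size, the 20-byte header length, and the function name are all
assumptions for illustration. This is not Oracle's actual block format and it
is not an Oracle tool.

```python
def find_zeroed_headers(path, block_size=8192, header_len=20):
    """Return the indexes of blocks whose first header_len bytes are all zero.

    block_size and header_len are illustrative guesses, not Oracle's layout.
    """
    zeroed = []
    with open(path, "rb") as f:
        index = 0
        while True:
            block = f.read(block_size)
            if not block:          # end of file
                break
            header = block[:header_len]
            if header == b"\x00" * len(header):
                zeroed.append(index)
            index += 1
    return zeroed
```

A hit would only show where to look. It would not prove corruption by itself.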

On Fri, Aug 3, 2012 at 6:49 AM, Peter Hitchman <pjhoraclel@xxxxxxxxx> wrote:
> Hi,
> Well, for some reason the ext4 file system had errors, leading to lost
> data. That impacted the undo tablespace data file and Oracle could not
> recover. All I can think is that at some point in time the ext4 file
> system was not 100% OK, and then when you made the data file copy it
> had been fixed. What sort of disk layout do you have? Maybe the error
> was corrected by way of a disk mirror or some other RAID set-up
> protection?
>
> Regards
> Pete
> --
> //www.freelists.org/webpage/oracle-l