[haiku-development] Re: Q: recover partially corrupt bfs without reinitializing?

  • From: "Axel Dörfler" <axeld@xxxxxxxxxxxxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Thu, 09 Apr 2009 12:09:01

Marcus Jacob wrote:
> On 08.04.2009, at 17:53, "Axel Dörfler" <axeld@xxxxxxxxxxxxxxxx>
> wrote:
> >> checkfs seems to do more bad than good ...
> > I would doubt that.
> Multiple runs of checkfs produce additional errors in my case instead
> of stating that everything is fine, as I would expect from the second
> run.

The errors on your partition cannot be fixed by checkfs, so how should
it not report them on subsequent runs? :-)
If two files share the same block, you should check manually which of
the two files is corrupted, delete it, and then run checkfs again. Then
it will be able to solve at least this particular problem.
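
As a rough illustration of that manual check, here is a minimal sketch that finds blocks claimed by more than one file. It works on a made-up in-memory mapping, not on a real BFS volume: the function name, the file names, and the block numbers are all hypothetical, and this is not how checkfs itself is implemented.

```python
from collections import defaultdict


def find_shared_blocks(files):
    """Return {block: [file names]} for every block claimed by 2+ files.

    `files` is a hypothetical mapping: file name -> iterable of block numbers.
    """
    owners = defaultdict(list)
    for name, blocks in files.items():
        # set() so a file listing the same block twice is counted once
        for block in set(blocks):
            owners[block].append(name)
    return {
        block: sorted(names)
        for block, names in owners.items()
        if len(names) > 1
    }


# Made-up example data: a.cpp and b.h both claim block 102.
files = {
    "a.cpp": [100, 101, 102],
    "b.h": [102, 103],
    "c.txt": [200],
}
print(find_shared_blocks(files))  # {102: ['a.cpp', 'b.h']}
```

Once the pair of files is known, you still have to look at their contents to decide which one is the corrupted one.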

> >> Corruption occurred in a single directory, see ticket #3150.
> > So far, I haven't been able to reproduce it, unfortunately.
> Well ;) I don't know what's so special about my usage, but my file system
> gets corrupted frequently. My main system is trashed every other week.

While I wouldn't like this on my production system, I'd love to be able
to reproduce it this reliably. I've been running Haiku as my main OS for
quite some time, and I haven't seen any BFS problems so far.
But it's not a nice feeling knowing that you shouldn't trust it yet.

> Most errors appear in checked out source trees, but also in other
> places.
>
> Is there anything I can do the next time my file system gets corrupted
> to help locate the problem?
>
> Btw, once the error mentioned in the ticket appears, I also get
> frequent panics "vnode already exists".

I think the main problem is located in the block cache; at least that's
pretty much the only component that could be responsible for this kind
of error.
So far I'm aware of three kinds of problems:
1) files are in B+trees that shouldn't be there anymore
2) inodes are written over existing inodes
3) data ends up in files that shouldn't contain them

I think it's very likely that all three have one root cause, and that
it can then only be in the block cache: outdated block data finding its
way to the disk.
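
To make that suspicion concrete, here is a toy model of how a stale dirty block could overwrite newer data, as in problem 2) above. It is only a sketch under stated assumptions: ToyDisk, ToyCache, and simulate() are invented for illustration, they are not Haiku's block cache API, and this is not a confirmed description of the actual bug.

```python
class ToyDisk:
    """Hypothetical disk: block number -> contents."""

    def __init__(self):
        self.blocks = {}


class ToyCache:
    """Hypothetical write-back cache holding dirty blocks until flush()."""

    def __init__(self, disk):
        self.disk = disk
        self.dirty = {}

    def write(self, block, data):
        # Data stays in memory only; nothing reaches the disk yet.
        self.dirty[block] = data

    def flush(self):
        # Write every dirty block back, whether or not it is still valid.
        for block, data in self.dirty.items():
            self.disk.blocks[block] = data
        self.dirty.clear()


def simulate(discard_on_free):
    disk = ToyDisk()
    cache = ToyCache(disk)

    cache.write(7, "old inode A")  # dirty copy of block 7 sits in the cache

    # Block 7 is freed and reused. A correct cache would drop the stale
    # dirty copy at this point; the buggy variant keeps it.
    if discard_on_free:
        cache.dirty.pop(7, None)
    disk.blocks[7] = "new inode B"  # new owner's data is already on disk

    cache.flush()  # the stale copy, if still present, is written last
    return disk.blocks[7]


print(simulate(discard_on_free=False))  # old inode A
print(simulate(discard_on_free=True))   # new inode B
```

In the first run the outdated copy wins and the new inode is overwritten; in the second run the stale entry is discarded in time and the disk keeps the new data.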

I guess I should spend more time testing this component - I've already
written some tests, but they obviously don't cause any error (anymore).


> >> Where do I find this tool? I remember such a command from BeOS but
> >> haven't seen it in Haiku.
> > There is none. I would actually prefer to fix bugs instead of
> > delivering (and shipping!) workarounds like that :-)
> Granted. Just hate to reinstall my primary system once a week. And a
> separate data partition doesn't help as it's the data which gets
> corrupted.

Indeed. Having a non system file system is not really suited for a
primary system - I'll improve the tracing/debugging capabilities of the
block cache next week, maybe something useful shows up.

Bye,
   Axel.

