[haiku-gsoc] Re: [HCD]: Bfs bug #1

  • From: Ingo Weinhold <ingo_weinhold@xxxxxx>
  • To: haiku-gsoc@xxxxxxxxxxxxx
  • Date: Mon, 23 Jun 2008 15:40:17 +0200

On 2008-06-23 at 12:12:59 [+0200], Axel Dörfler <axeld@xxxxxxxxxxxxxxxx> 
wrote:
> "Salvatore Benedetto" <emitrax@xxxxxxxxx> wrote:
> > Anyway, I wanted to test with condition variables, but I'm not sure
> > how to proceed. Regardless of that, I think we should handle the
> > E_BUSY error differently, because 10 seconds, with our current
> > scheduler and no I/O scheduler, doesn't seem to be a good choice on
> > a slow system such as one running on VMware.
> 
> We might want to update the wait time if any more pages could be
> written back, but we definitely shouldn't wait forever for it to happen
> - it's not acceptable to kill the system upon an I/O error.

Maybe I don't get it, but when an I/O error occurs the respective driver 
should just report an error back (after some timeout supposedly), shouldn't 
it? When that happens the vnode will be marked unbusy again.

> If you look closely at vnode_low_memory_handler(), it will first
> try to write back the vnode's pages without making the vnode busy.
> Only later will it mark the vnode busy in order to delete it -- since
> it cannot guarantee that it will get the same vnodes here, it needs to
> try to write them back again. At this point, a new (and dirty) vnode is
> obviously chosen, since otherwise it wouldn't be possible to hang
> there.
> Condition variables obviously wouldn't help here, even though it might
> be worthwhile to use them instead of the busy flag. I don't really like
> that it would waste 4 more precious bytes per vnode, though.

I don't really think 4 bytes is that dramatic -- the structure is already 
over 60 bytes big and it's not like we have millions of vnodes hanging 
around -- but anyway, we could employ the same strategy as for pages or 
caches, i.e. continue to use a flag and use a published condition variable.
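
To illustrate the "flag plus published condition variable" idea, here is a
minimal user-space sketch in standard C++. It is not the kernel's
ConditionVariable API; Vnode, PublishedConditions and all method names are
hypothetical stand-ins. The object itself only carries the busy flag, and the
condition variable exists in a side table keyed by the object's address only
while the object is busy:

```cpp
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for a vnode: it carries only a flag and no
// per-object condition variable.
struct Vnode {
    bool busy = false;
};

// Side table of "published" condition variables, keyed by object address.
// An entry exists only while the corresponding object is marked busy.
class PublishedConditions {
public:
    // Mark the object busy and publish a condition variable for it.
    void MarkBusy(Vnode& vnode) {
        std::lock_guard<std::mutex> guard(fLock);
        vnode.busy = true;
        fPublished[&vnode] = std::make_shared<std::condition_variable>();
    }

    // Clear the flag, unpublish the condition variable, wake all waiters.
    void MarkUnbusy(Vnode& vnode) {
        std::shared_ptr<std::condition_variable> condition;
        {
            std::lock_guard<std::mutex> guard(fLock);
            vnode.busy = false;
            auto it = fPublished.find(&vnode);
            if (it != fPublished.end()) {
                condition = it->second;
                fPublished.erase(it);
            }
        }
        if (condition)
            condition->notify_all();
    }

    // Returns true once the object is not busy, false if the timeout
    // expired first (so the caller never waits forever).
    bool WaitUntilUnbusy(const Vnode& vnode,
                         std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(fLock);
        if (!vnode.busy)
            return true;
        auto it = fPublished.find(&vnode);
        if (it == fPublished.end())
            return false;  // busy, but nothing published to wait on
        // Keep our own reference so the condition variable outlives
        // its removal from the table.
        std::shared_ptr<std::condition_variable> condition = it->second;
        return condition->wait_for(lock, timeout,
                                   [&vnode] { return !vnode.busy; });
    }

private:
    std::mutex fLock;
    std::unordered_map<const Vnode*,
                       std::shared_ptr<std::condition_variable>> fPublished;
};
```

The per-object cost stays at the flag; the price is one table lookup whenever
someone actually has to wait. In this simplified version a waiter that wakes
up and finds the object busy again under a new owner simply runs into its
timeout.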

> In any case, it's probably a good idea to improve the low memory
> handler so that it will only pick clean vnodes.

+1
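
As a rough illustration of that selection rule, here is a standalone sketch
with hypothetical names (UnusedVnode, PickCleanVnodes); it is not the actual
vnode_low_memory_handler() code. Dirty vnodes are skipped, so freeing a victim
never has to wait on a write-back:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified view of an entry in the unused vnode list.
struct UnusedVnode {
    int id;
    bool hasDirtyPages;
};

// Pick up to maxCount vnodes to free, skipping every vnode that still
// has dirty pages. The input order (e.g. least recently used first)
// is preserved.
std::vector<int> PickCleanVnodes(const std::vector<UnusedVnode>& unused,
                                 std::size_t maxCount) {
    std::vector<int> victims;
    for (const UnusedVnode& vnode : unused) {
        if (victims.size() >= maxCount)
            break;
        if (vnode.hasDirtyPages)
            continue;  // leave it for the regular write-back path
        victims.push_back(vnode.id);
    }
    return victims;
}
```

If too few clean vnodes are available, this returns fewer than maxCount
entries instead of blocking on I/O.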

> I can't imagine this being an IDE problem. It's much more likely a
> deadlock problem of some allocator in a low memory situation that
> causes this.

According to the low memory handler stack trace attached to the ticket it 
is in free_vnode(), sync'ing the vnode and never returns. The thread isn't 
even waiting -- it is just releasing a spinlock in start_waiting(). It 
might be some sort of livelock, but definitely seems related to the 
ide/scsi/... modules.

CU, Ingo
