On 2008-06-23 at 12:12:59 [+0200], Axel Dörfler <axeld@xxxxxxxxxxxxxxxx> wrote:
> "Salvatore Benedetto" <emitrax@xxxxxxxxx> wrote:
> > Anyway, I wanted to test with condition variables, but I'm not sure
> > how to proceed. Despite that, I think we should handle the error
> > E_BUSY differently, because 10 seconds with our current scheduler,
> > and with no I/O scheduler, doesn't seem to be a good choice on a
> > slow system such as one running on VMware.
>
> We might want to update the wait time if any more pages could be
> written back, but we definitely shouldn't wait forever for it to happen
> - it's not acceptable to kill the system upon an I/O error.

Maybe I don't get it, but when an I/O error occurs the respective driver
should just report an error back (after some timeout supposedly),
shouldn't it? When that happens the vnode will be marked unbusy again.

> If you look closely at the vnode_low_memory_handler(), it will first
> try to write back the vnode's pages without making the vnode busy.
> Only later will it mark the vnode busy in order to delete it -- since
> it cannot guarantee that it will get the same vnodes here, it needs to
> try to write them back again. At this point, a new (and dirty) vnode is
> obviously chosen, since otherwise it wouldn't be possible to hang
> there.
> Condition variables obviously wouldn't help here, even though it might
> be worthwhile to use them instead of the busy flag. I don't really like
> that it would waste 4 more precious bytes per vnode, though.

I don't really think 4 bytes is that dramatic -- the structure is already
over 60 bytes big and it's not like we have millions of vnodes hanging
around -- but anyway, we could employ the same strategy as for pages or
caches, i.e. continue to use a flag and use a published condition
variable.

> In any case, it's probably a good idea to improve the low memory
> handler so that it will only pick clean vnodes.

+1

> I can't imagine this being an IDE problem. It's much more likely a
> deadlock problem of some allocator in a low memory situation that
> causes this.

According to the low memory handler stack trace attached to the ticket,
it is in free_vnode(), syncing the vnode, and never returns. The thread
isn't even waiting -- it is just releasing a spinlock in start_waiting().
It might be some sort of livelock, but it definitely seems related to the
ide/scsi/... modules.

CU, Ingo