[haiku-bugs] Re: [Haiku] #9839: VFS: race condition between get_vnode() and new_vnode()/publish_vnode()

  • From: "bonefish" <trac@xxxxxxxxxxxx>
  • Date: Sat, 07 Dec 2013 19:55:45 -0000

#9839: VFS: race condition between get_vnode() and new_vnode()/publish_vnode()
-----------------------------+----------------------------
   Reporter:  bonefish       |      Owner:  axeld
       Type:  bug            |     Status:  new
   Priority:  normal         |  Milestone:  R1
  Component:  System/Kernel  |    Version:  R1/Development
 Resolution:                 |   Keywords:  vfs
 Blocked By:                 |   Blocking:
Has a Patch:  0              |   Platform:  All
-----------------------------+----------------------------
Description changed by bonefish:

Old description:

> There is a race condition between VFS's `get_vnode()` (e.g. in response
> to `_user_open_dir_entry_ref()`) and `new_vnode()` or `publish_vnode()`:
> Since `get_vnode()` first calls `create_new_vnode_and_lock()` and
> afterwards the FS's `get_vnode()` hook, there's a time window where the
> vnode exists, but the respective node may not exist in the FS. If the
> node doesn't actually exist and concurrently another thread creates a new
> file or directory with that node ID, the `new_vnode()` or
> `publish_vnode()` it calls will encounter the existing vnode and fail
> (the former will `panic()`).
>
> Not sure what a good solution would be. `new_vnode()`/`publish_vnode()`
> could wait until the vnode becomes unbusy or disappears (a new marker
> flag might be needed to distinguish the case from others where the node
> is busy). That might be problematic, however, since the caller may hold a
> FS specific lock, so that we'd risk a deadlock with the concurrent
> `get_vnode()` hook which may also want to acquire that lock.
>
> An IMO cleaner alternative would be to change the interface and semantics
> of the `get_vnode()` hook. It would be required to create the vnode (via
> `publish_vnode()` or a new interface). We would have to push the burden
> of dealing with concurrent invocations to the hook, though.
>
> It is not understood yet whether the described issue is related to #5262.
> Here the panic reports the `vnode->node` to be NULL, while that isn't the
> case in #5262.

New description:

 There is a race condition between VFS's `get_vnode()` (e.g. in response to
 `_user_open_dir_entry_ref()`) and `new_vnode()` or `publish_vnode()`:
 Since `get_vnode()` first calls `create_new_vnode_and_lock()` and
 afterwards the FS's `get_vnode()` hook, there's a time window where the
 vnode exists, but the respective node may not exist in the FS. If the node
 doesn't actually exist and concurrently another thread creates a new file
 or directory with that node ID, the `new_vnode()` or `publish_vnode()` it
 calls will encounter the existing vnode and fail (the former will
 `panic()`).

 Not sure what a good solution would be. `new_vnode()`/`publish_vnode()`
 could wait until the vnode becomes unbusy or disappears (a new marker flag
 might be needed to distinguish the case from others where the node is
 busy). That might be problematic, however, since the caller may hold a FS
 specific lock, so that we'd risk a deadlock with the concurrent
 `get_vnode()` hook which may also want to acquire that lock.

 An IMO cleaner alternative would be to change the interface and semantics
 of the `get_vnode()` hook. It would be required to create the vnode (via
 `publish_vnode()` or a new interface). We would have to push the burden of
 dealing with concurrent invocations to the hook, though.

 This issue is somewhat related to #5262 in that there the problem is also
 that `create_new_vnode_and_lock()` reports a pre-existing vnode, although
 in that case the vnode is about to be deleted and already no longer known
 to the FS (cf. ticket:5262#comment:24).

--

--
Ticket URL: <http://dev.haiku-os.org/ticket/9839#comment:1>
Haiku <http://dev.haiku-os.org>
Haiku - the operating system.

Other related posts: