Re: Eliminating the need for qemu for file images on Xen unikernels

  • From: Anil Madhavapeddy <anil@xxxxxxxxxx>
  • To: Martin Lucina <martin@xxxxxxxxxx>
  • Date: Thu, 6 Aug 2015 15:47:32 +0100

On 6 Aug 2015, at 15:34, Martin Lucina <martin@xxxxxxxxxx> wrote:


On Wednesday, 05.08.2015 at 18:45, Dave Scott wrote:

On 5 Aug 2015, at 19:33, Anil Madhavapeddy <anil@xxxxxxxxxx> wrote:

Reducing the dependency on qemu is very desirable given the recent rash
of security issues from the codebase as well.

There's another alternative method which was used in the form of the
'blktap' driver. This redirects block traffic to a userspace daemon
that can then write to (e.g.) a VHD or VMDK file. Blktap also doesn't
require using up a loop device.

Blktap's been through a series of rewrites though, and never seems to
have been upstreamed even though it's the default storage mode used
in XenServer. CCing Dave Scott: do you know if using blktap is a viable
option these days?

I think George Dunlap (cc:d) has been adding support to the Xen build for
the XenServer-derived blktap:

http://lists.xen.org/archives/html/xen-devel/2015-04/msg01853.html

I’ve not had a chance to try it myself, but it ought to be pretty good. What
do you think, George?

Note: the vhd support in blktap/tapdisk is well tested, but it doesn’t
support vmdk IIRC (or at least if it does it won’t be anywhere near as well
tested). It supports raw format too, which is equivalent to using a loop
device + blkback.

I'm not familiar with the underlying protocols involved so the answers to
these might be obvious, but I have a couple of questions:

1) The blkback+qdisk setup is already talking to a userspace daemon (QEMU).
It follows that it should be easy to swap out QEMU for a single-purpose
daemon which does nothing except provide the same qdisk interface to a
bunch of raw files, right?

Yes. My understanding is that qemu-dm is increasingly stripped down
to just include the bits of qemu that are necessary for that mode of
operation, but I haven't looked at it for a few years to confirm this.


2) Is there any technical reason why the Linux blkback driver requires a
block (loop) device and cannot access a raw format image file directly? For
example, the in-kernel nfsd can access files, so I don't understand why
blkback can't.

Blkback just uses submit_bio() within the kernel to submit the block operation
to some other block device that actually controls the physical storage. The
loopback device is the simplest way to expose an image file as a block device,
so that blkback can reach it via the kernel IO interface just as it would a
physical volume (including LVM or other logical volumes).

Blktap does what you describe, but it pushes the file handling logic into
userspace so that the complex file format code doesn't need to run in
kernel space.
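
To make the raw-format case concrete, here is a rough sketch of the file
handling such a userspace backend would do: it maps sector-addressed requests
onto byte offsets in a plain image file. This is only an illustration, not
blktap's or qemu's actual code. The class name, method names and the 512-byte
sector size are all assumptions, and the ring/grant plumbing that carries
requests from the guest is left out entirely.

```python
import os

SECTOR_SIZE = 512  # assumed sector size; real backends negotiate this


class RawImageBackend:
    """Hypothetical userspace handler serving sector I/O from a raw image."""

    def __init__(self, path):
        # Open the image file directly; no loop device is involved.
        self.fd = os.open(path, os.O_RDWR)
        self.sectors = os.fstat(self.fd).st_size // SECTOR_SIZE

    def _check(self, sector, count):
        # Reject requests that fall outside the image.
        if sector < 0 or count <= 0 or sector + count > self.sectors:
            raise ValueError("request outside image bounds")

    def read_sectors(self, sector, count):
        self._check(sector, count)
        # A sector number is just a byte offset in a raw image.
        return os.pread(self.fd, count * SECTOR_SIZE, sector * SECTOR_SIZE)

    def write_sectors(self, sector, data):
        if len(data) % SECTOR_SIZE != 0:
            raise ValueError("data must be a whole number of sectors")
        self._check(sector, len(data) // SECTOR_SIZE)
        os.pwrite(self.fd, data, sector * SECTOR_SIZE)

    def close(self):
        os.close(self.fd)
```

A format like VHD would replace the simple offset arithmetic with a lookup
through the format's block allocation metadata, which is the complex part
that is better kept out of the kernel.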

-a
