Re: Eliminating the need for qemu for file images on Xen unikernels

  • From: Anil Madhavapeddy <anil@xxxxxxxxxx>
  • To: martin@xxxxxxxxxx
  • Date: Wed, 5 Aug 2015 19:33:42 +0100

Reducing the dependency on qemu is also very desirable given the recent rash
of security issues in that codebase.

There's another alternative: the 'blktap' driver. It redirects block
traffic to a userspace daemon that can then write to (e.g.) a VHD or
VMDK file. Blktap also doesn't require using up a loop device.

Blktap's been through a series of rewrites though, and never seems to
have been upstreamed even though it's the default storage mode used
in XenServer. CCing Dave Scott: do you know if using blktap is a viable
option these days?


On 22 Jul 2015, at 18:31, Martin Lucina <martin@xxxxxxxxxx> wrote:


I've managed to track down the Xen upstream change (will be in 4.6) which
changes the default backend type for raw files from 'qdisk' to 'phy'. In
practice this means that a qemu process is no longer required to serve the
qdisk protocol, which saves both memory and domU startup time.

I've tested this patch on my Debian jessie install by backporting it to the
Debian Xen 4.4.1 packages and it works fine. However, note that what it
actually does in practice (not obvious from the patch) is that it
internally uses a loop device to back the file. It's not clear to me why
the Linux blkback driver can't just use a file directly, but that's not
something I want to investigate right now.
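
For illustration, the decision the patched block script makes can be
sketched as a small shell function. This is only a rough sketch modelled on
the hunk in the diff below: the function name classify_target is made up
here and is not part of the Xen tree, and the real script goes on to do the
actual loop device setup in its 'file' branch.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Xen tree): classify a disk target
# by its actual file type, as the patched block script does with $p.
classify_target() {
    if [ -b "$1" ]; then
        echo phy        # block device: usable by blkback as-is
    elif [ -f "$1" ]; then
        echo file       # regular file: needs a loop device in front of it
    else
        echo unknown    # neither; this sketch does not guess
    fi
}

# Example: a freshly created temporary file is a regular file.
img=$(mktemp)
classify_target "$img"
rm -f "$img"
```

Run as-is, this prints "file" for the temporary image.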

The default maximum number of loop devices is (AFAIK) quite low, so you
will want to increase that if you want to use this patch, by setting the
max_loop parameter appropriately when the loop kernel module is loaded.
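
A minimal sketch of how that could be done, assuming the loop driver is
built as a module (as far as I know, if it is built into the kernel the
parameter has to go on the kernel command line instead). The value 64 and
the file name loop.conf are arbitrary examples, and loop_options_line is
just a helper name invented for this sketch.

```shell
#!/bin/sh
# Print the modprobe.d line that sets max_loop for the loop module.
# The helper name and the value passed in are illustrative only.
loop_options_line() {
    printf 'options loop max_loop=%d\n' "$1"
}

loop_options_line 64

# As root, one might then do something along these lines:
#   loop_options_line 64 > /etc/modprobe.d/loop.conf  # file name is an assumption
#   modprobe -r loop && modprobe loop                  # only works if no loop device is in use
#   cat /sys/module/loop/parameters/max_loop           # check the value the module picked up
```

The root-only steps are left as comments so that the snippet itself is safe
to run on any machine.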



commit a0a2dc45f1bdbabc661332e8530c480a77391d85
Author: Wei Liu <wei.liu2@xxxxxxxxxx>
Date: Thu Jan 8 13:46:53 2015 +0000

libxl, hotplug/Linux: default to phy backend for raw format file, take 2

This patch resurrects 11a63a166. The previous patch had a bug that
wrong "physical-device" was written to xenstore causing block script
execution fail. This patch fixes that problem.

Following configurations have been tested:

1. Raw file and PV
2. Raw file and HVM
3. Block device and PV
4. Block device and HVM

Creation / destruction / local migration all worked.

Original commit message from 11a63a166:

Modify libxl and hotplug script to allow raw format file to use phy

The block script now tests the path and determine the actual type of
file (block device or regular file) then use the actual type to
determine which branch to run.

With these changes, plus the current ordering of backend preference (phy >
qdisk > tap), we will use phy backend for raw format file by default.

Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>
Cc: Ian Campbell <ian.campbell@xxxxxxxxxx>
Cc: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>
Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
Acked-by: Ian Campbell <ian.campbell@xxxxxxxxxx>

diff --git a/tools/hotplug/Linux/block b/tools/hotplug/Linux/block
index da26e22..8d2ee9d 100644
--- a/tools/hotplug/Linux/block
+++ b/tools/hotplug/Linux/block
@@ -206,6 +206,13 @@ and so cannot be mounted ${m2}${when}."

t=$(xenstore_read_default "$XENBUS_PATH/type" 'MISSING')
+p=$(xenstore_read "$XENBUS_PATH/params")
+mode=$(xenstore_read "$XENBUS_PATH/mode")
+if [ -b "$p" ]; then
+ truetype="phy"
+elif [ -f "$p" ]; then
+ truetype="file"
+fi

case "$command" in
@@ -217,16 +224,11 @@ case "$command" in
exit 0

- if [ -n "$t" ]
- then
- p=$(xenstore_read "$XENBUS_PATH/params")
- mode=$(xenstore_read "$XENBUS_PATH/mode")
- fi
FRONTEND_ID=$(xenstore_read "$XENBUS_PATH/frontend-id")
FRONTEND_UUID=$(xenstore_read_default \
"/local/domain/$FRONTEND_ID/vm" 'unknown')

- case $t in
+ case $truetype in
dev=$(expand_dev $p)

@@ -319,7 +321,7 @@ mount it read-write in a guest domain."

- case $t in
+ case $truetype in
exit 0
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 372dd3b..11cf0e1 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2416,9 +2416,9 @@ static void device_disk_add(libxl__egc *egc, uint32_t
if (!disk->script &&
disk->backend_domid == LIBXL_TOOLSTACK_DOMID) {
int major, minor;
- libxl__device_physdisk_major_minor(dev, &major, &minor);
- flexarray_append_pair(back, "physical-device",
- libxl__sprintf(gc, "%x:%x", major, minor));
+ if (!libxl__device_physdisk_major_minor(dev, &major, &minor))
+ flexarray_append_pair(back, "physical-device",
+ libxl__sprintf(gc, "%x:%x", major, minor));

assert(device->backend_kind == LIBXL__DEVICE_KIND_VBD);
diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 4b51ded..0f50d04 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -332,6 +332,8 @@ int libxl__device_physdisk_major_minor(const char
*physpath, int *major, int *mi
struct stat buf;
if (stat(physpath, &buf) < 0)
return -1;
+ if (!S_ISBLK(buf.st_mode))
+ return -1;
*major = major(buf.st_rdev);
*minor = minor(buf.st_rdev);
return 0;
diff --git a/tools/libxl/libxl_linux.c b/tools/libxl/libxl_linux.c
index ea5d8c1..b51930c 100644
--- a/tools/libxl/libxl_linux.c
+++ b/tools/libxl/libxl_linux.c
@@ -19,11 +19,11 @@

int libxl__try_phy_backend(mode_t st_mode)
- if (!S_ISBLK(st_mode)) {
- return 0;
+ if (S_ISBLK(st_mode) || S_ISREG(st_mode)) {
+ return 1;

- return 1;
+ return 0;

#define EXT_SHIFT 28
