Re: rumpbake support for baked-in rootfs

  • From: Martin Lucina <martin@xxxxxxxxxx>
  • To: rumpkernel-users@xxxxxxxxxxxxx
  • Date: Tue, 16 Jun 2015 15:36:39 +0200

On Tuesday, 16.06.2015 at 10:35, Antti Kantee wrote:

I have no attachment to this particular *implementation*, in fact I'd be
the first to throw it out and replace with something like SquashFS.

I read between the lines that you don't really even want the tarball
method (correct me if I'm interpreting too much).

I don't particularly like the tarball method, but it was the least effort
for me to get working.

The tarball method has another advantage in being able to trivially support
compression (just add zlib, which you are going to have included in the
stack for most things anyhow).
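
As a rough illustration of the compression point, here is a minimal sketch using Python's standard tarfile module rather than the actual rumprun code (which would be C on top of zlib). The helper names pack and unpack are hypothetical, and the file contents are made up for the example.

```python
import io
import tarfile

def pack(files, compress):
    """Pack {path: bytes} into an in-memory tar archive, gzip-compressed if asked."""
    buf = io.BytesIO()
    mode = "w:gz" if compress else "w"
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        for path, data in sorted(files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def unpack(blob):
    """Read every regular file back out; "r:*" detects compression automatically."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:*") as tar:
        return {m.name: tar.extractfile(m).read()
                for m in tar.getmembers() if m.isfile()}
```

The reader side does not need to know whether the archive was compressed, which is roughly why adding compression to a tarball-based rootfs is cheap once zlib is already linked in.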

Wouldn't a zero-effort driver implementation have been easier than an
ad hoc tarball implementation, i.e. use MFS to mount an in-memory FFS
image? We'd need to bundle newfs and fs-utils, but we want to do that
*anyway* for other purposes. It seems like a double win over adding
something you don't really want.

I don't follow -- MFS is a separate filesystem from FFS. When we looked at
the options with Justin it was not clear whether MFS is just a worse kind
of TMPFS or something else.

As for bundling fs-utils and using an in-memory FFS, how suitable is FFS
for in-memory usage? Also, wouldn't bundling fs-utils require yet another
build of rumpsrc?

Therefore, if I want to run a minimal and small[*] service as a rumprun
unikernel in the cloud *today*, and that service does not need any
persistent data, why should I bother with an (external) block device at all?

Ok, good reason. Can you point me to a resource giving an overview?
I clearly need to read up on the subject.

I'm probably not the best person to ask; what I know is based on reading
the EC2 website, the CLI tools manuals and API, and looking at the
websites of the other providers offering compute nodes.

Perhaps Justin can suggest something?

You too are confusing a particular way of bundling the binary+fs
with the concept of it.

I absolutely think that we should be able to distribute a single
file image. The problem is that to launch it, you must somehow
distribute the rumprun parameters too. So I'm a bit confused as to
how including the data but not the configuration in the image gives
you what you actually want. That's the whole thing that bugs me. I
can't wrap my head around the different configuration spaces, which
*all* contribute to what happens at runtime:

1: binary code
2: block device data
3: rumprun launch parameters, including application command line
(4: now-proposed alternative way to supply data)

This part (distributing the rumprun parameters) seems easy:

Rumpconfig could process configuration information in this order:

1) platform-specific mechanism (cmdline, Xenstore)
2) a '/rumpconfig.json' file

1) would override 2). It would also eliminate the current similar hack used
by the "iso" target.
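
The precedence described above could be sketched as follows. This is only an illustration in Python, not rumpconfig code: the function name merge_config and the keys used in the usage note are hypothetical, and it assumes the override applies per top-level key rather than replacing the whole configuration.

```python
import json

def merge_config(platform_cfg, baked_json_text):
    """Combine the two sources; platform-specific values win per top-level key.

    platform_cfg:    dict parsed from the platform mechanism (cmdline, Xenstore)
    baked_json_text: contents of /rumpconfig.json, or None if the file is absent
    """
    baked = json.loads(baked_json_text) if baked_json_text else {}
    merged = dict(baked)          # start from the baked-in defaults (source 2)
    merged.update(platform_cfg)   # then let the platform override them (source 1)
    return merged
```

For example, a baked-in file containing {"hostname": "baked"} combined with a platform-supplied {"hostname": "from-xenstore"} yields the platform value, while an image launched with no platform configuration at all still gets the baked-in one.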

I get the feeling that we aren't thinking hard enough outside of the
context of the toolchain that we already have to obtain the solution
that we really want.

Ok, I'll think about it some more.

Doesn't spawning a qemu for networking take time? Or do you want to
run a kernel without any I/O capabilities?

Networking in a PV domain does not use qemu in any way.

Block devices require a qemu instance, *if and only if* a block device is
using an *image file* on the backend. This is due to the way Linux dom0
implements blkback for image files; it spawns a qemu instance in userspace
and blkback on dom0 talks a "qdisk" protocol to it.

I can make the implementation even more minimal and allow for including
only a single directory tree if it bothers you. The reason I implemented
multiple -R's was to provide a (granted, poor/minimal but working) ability
to accomplish:

rumpbake -R ...path/to/default/root -R data/ ...

i.e. give us the ability to provide a default /etc in the rumprun
repository.
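
To make the intent of multiple -R's concrete, here is a small Python sketch of layering several directory trees into one file list. It is not how rumpbake is implemented; overlay_roots is a hypothetical helper, and it assumes that a later -R overrides files from an earlier one at the same relative path.

```python
import os

def overlay_roots(roots):
    """Map each path inside the image to the source file that provides it.

    Roots are processed in order, so a file in a later root replaces a file
    with the same relative path from an earlier root (assumed semantics).
    """
    merged = {}
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                src = os.path.join(dirpath, name)
                rel = os.path.relpath(src, root)
                merged[rel] = src
    return merged
```

With a default root that ships etc/hosts and a data root that ships its own etc/hosts plus data/conf/app.conf, the result contains the data root's etc/hosts and everything else from both trees.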

So with a single -R parameter, is the user responsible for
figuring out where to get /etc from?

Either that, or they get it by default (and can override the default if
they don't want it). In either case, for the purposes of rumprun unikernels
I'm treating /etc as "legacy OS cruft that needs to be included to make
things work", not as "a legitimate place to put your configuration". My
packages follow the same convention: application configuration goes in
/data/conf, not /etc.

Is rumpbake really the best tool for handling the complexity? How
about a builtin file system tool which supports something like
"prepopulate", <user uses normal shell commands here>, "slurp"?
(throwing out an unchewed idea, not saying it's a good one)

No idea yet. I'll think about it.

