[modular-debian] Re: Replace dbus with per-service filesystems (was: Re: Humor With A Message: dbus)

  • From: Jude Nelson <judecn@xxxxxxxxx>
  • To: slitt@xxxxxxxxxxxxxxxxxxx
  • Date: Sun, 7 Dec 2014 23:50:28 -0500

Hi Steve,

On Sun, Dec 7, 2014 at 11:28 AM, Steve Litt <slitt@xxxxxxxxxxxxxxxxxxx>
wrote:

> On Sat, 6 Dec 2014 03:36:02 -0500
> Jude Nelson <judecn@xxxxxxxxx> wrote:
>
>
[snip]


> > If my understanding of dbus is correct (someone please correct me if
> > I'm wrong), the key features it offers to applications include the
> > following:
> > * signals and methods as first-class entities
> > * hierarchical organization:  processes aggregate related signals and
> > methods under common prefixes; two different processes A and B have
> > their own prefixes to avoid namespace collisions
> > * introspection:  process A can query the signals and methods exposed
> > by process B
> > * RPC-like IPC semantics:  process A can invoke methods in process B
> > (both synchronous and asynchronous operations are allowed)
> > * 1-to-N broadcast:  process A can send a signal to one or more
> > listener processes B1...BN.
> > * authentication:  processes A and B only interact once they prove to
> > one another that they each know a shared secret (which is shared
> > out-of-band by a mutually trusted party, like a session manager or an
> > ancestor process).
> > * activation and lifecycle management:  process A can learn when
> > process B exposes or withdraws an API
>
> I look at the preceding list, and for the most part, they look like bad
> habits to me. For the sake of debuggability, I prefer that if A and B
> need to communicate, they are built from the ground up to recognize
> each other's common API, not throw and grab info to/from some sort of
> generic "we accept everything" entity. I'm a big fan of well defined
> interfaces.
>

I think the reason for this is that developers tend to (wrongly, IMHO)
think of dbus as an IPC facility, whereas its function is more like that of
a dynamic linker--it lets processes bind to and call methods in other blobs
of code (processes in this case, instead of shared libraries).  But in
doing so, it introduces tight coupling between processes in the same way
that dynamic linking introduces tight coupling between a process and a
shared library.  As you say, it's a bad habit.


>
> But that's not the end of the story. Read on...
>
>
> >
> > But, if this is what dbus conventionally gets used for, then I think
> > the features offered by dbus could be addressed with a userspace
> > filesystem as well:
> > * signals and methods:  represented by files, or well-defined
> > directory structures
> > * hierarchical organization:  directory trees
> > * introspection:  opendir(), readdir(), stat(), getxattr(),
> > listxattr()
> > * RPC-like semantics:  open(), read(), write(), close(); can be used
> > to emulate method invocations if need be (i.e. write arguments to a
> > file; read result from a file).
> > * 1-to-N broadcast:  combination of select()/poll() and read() (also:
> > inotify(), kqueue, etc.)
> > * authentication:  process's PID, effective UID/GID, access(), path
> > resolution
> > * activation and lifecycle management:  mount(), umount(), mount
> > namespace (/etc/mtab, /proc/mounts)
>
> Filesystems are good. If you *simply must* have a communal
> communication path, at least a filesystem is examinable. And, if you
> have a function to write a snapshot out to disk, then forensic
> examination is possible. Also, I have a feeling that what you describe
> could be used for things that aren't as bug-attractive as the list of
> dbus "benefits" you first mentioned. I hope you have ways of making
> certain communications private to specific apps.
>
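
To make the "write arguments to a file; read result from a file" idea
from the quoted list concrete, here is a minimal client-side sketch.  It
assumes a service exposes each method as a directory holding two files
named "args" and "result"; those names, and the call_method() helper, are
hypothetical conventions for illustration and not part of fskit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Invoke a "method" exposed as a directory with two files:
 * "args" (written by the caller) and "result" (read by the caller).
 * Both file names are hypothetical conventions, not an fskit API.
 * Returns the number of result bytes read, or -1 on error. */
ssize_t call_method(const char *method_dir, const char *args,
                    char *result, size_t result_len)
{
    char path[PATH_MAX];
    size_t len;
    ssize_t n;
    int fd, r;

    assert(method_dir != NULL && args != NULL);
    assert(result != NULL && result_len > 0);
    len = strlen(args);

    /* Step 1: pass the arguments by writing them to <method_dir>/args. */
    r = snprintf(path, sizeof(path), "%s/args", method_dir);
    if (r < 0 || (size_t)r >= sizeof(path))
        return -1;
    fd = open(path, O_WRONLY | O_TRUNC);
    if (fd < 0)
        return -1;
    n = write(fd, args, len);
    /* A short write or a failed close is treated as a failed call. */
    if (close(fd) != 0 || n != (ssize_t)len)
        return -1;

    /* Step 2: fetch the return value by reading <method_dir>/result. */
    r = snprintf(path, sizeof(path), "%s/result", method_dir);
    if (r < 0 || (size_t)r >= sizeof(path))
        return -1;
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = read(fd, result, result_len - 1);
    close(fd);
    if (n < 0)
        return -1;
    result[n] = '\0';   /* NUL-terminate so the caller gets a C string */
    return n;
}
```

Because the caller only uses open(), read(), write() and close(), the
same "call" could also be made from a shell with echo and cat, which is
the point of using the filesystem as the interface.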

Hmmm, I hadn't thought about adding a snapshot feature to fskit.  Sounds
like an interesting and useful proposition for debugging--I'll open an
issue for it on GitHub.

Regarding security, fskit already does POSIX-style permission checks.  If
the application needs further security, I have a separate library called
libpstat [0] that allows a program to "stat" a running process.  When using
the FUSE back-end at least, the filesystem will be able to tell which
process is performing the I/O request.  In doing so, it can also use
libpstat to learn some critical information about the process given its PID
(such as the path to the executable it's running) to perform further access
controls.
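
As a rough illustration of learning the executable path from a PID, the
sketch below reads the /proc/<pid>/exe symlink directly.  This is a
Linux-specific assumption and is not libpstat's API; the function name
exe_path_of() is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolve the executable path of a running process by reading the
 * /proc/<pid>/exe symlink (Linux-specific; not libpstat's interface).
 * Returns 0 on success, -1 on failure or if the path may be truncated. */
int exe_path_of(pid_t pid, char *buf, size_t buflen)
{
    char link[64];
    ssize_t n;

    assert(buf != NULL && buflen > 0);

    snprintf(link, sizeof(link), "/proc/%ld/exe", (long)pid);
    n = readlink(link, buf, buflen - 1);
    /* If the buffer was filled completely the path may have been cut
     * short, so refuse rather than act on a partial path. */
    if (n < 0 || (size_t)n >= buflen - 1)
        return -1;
    buf[n] = '\0';   /* readlink() does not NUL-terminate */
    return 0;
}
```

Note that a lookup like this is inherently racy: the process can exit,
and its PID can be reused, between the I/O request and the check, so a
real access-control path would need to account for that.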


> >
> > I'm convinced that per-service filesystems encourage modularity
> > better than dbus to the point that I've gone ahead and created a
> > library, called fskit [1], that makes it easy to embed them into
> > services.  Unlike FUSE, fskit handles all the bookkeeping required to
> > maintain a POSIX-y directory hierarchy, and lets the service
> > programmer define WSGI-style routes to handle I/O operations over
> > sets of paths (a multi-threaded RAM filesystem can be had in about
> > 200 lines of C).  It's almost ready for a stable release--it's only
> > missing symlink and hard-link support at this point--but I've already
> > used it to create a special RAM filesystem, called runfs [2], that
> > automatically removes files once the process that created them dies
> > (i.e. no more stale PID files).  I use it every day to avoid
> > polluting my /tmp with files generated (but not cleaned up) by
> > software I work on at $DAYJOB.
>
> I'll help you document it.
>

Very kind of you to offer :)  The wiki [1] is currently editable by anyone,
but I don't have anything there yet.  Feel free to email me and we can work
together.


>
> If you make it small and tight and clean enough, it might be of value
> in the init system itself.
>

Heh, we'll see :)  I'd definitely like to get the number of unchecked
allocations down first, though (at this point, they're mostly limited to
adding elements to STL containers without a try/catch for bad_alloc
exceptions).


>
> >
> > Before I dive off the deep end into doing something big like
> > re-implementing udev, I'm looking for a sanity check.  Is what I'm
> > trying to do a desirable, or even a sensible, approach towards
> > encouraging a more modular debian?
>
> I have no idea. But I do know that using the filesystem for information
> is usually a good idea. I guess your main purpose in making it an
> in-memory filesystem is performance???
>

Actually, I think the gain from implementing /dev as a userspace filesystem
is in introducing the ability to do per-process access control.  Since FUSE
at least exposes the PID of the requesting process, it should be possible
to reply to stat() and readdir() with only the device nodes the *process*
is allowed to see, regardless of its effective UID or GID.
For example, a hypothetical implementation could whitelist /usr/bin/X as
the only program allowed to read and write /dev/dri/cardX and
/dev/input/eventXX.
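
A minimal sketch of such a whitelist check follows.  It assumes the
filesystem has already mapped the requesting PID (with libfuse, available
as fuse_get_context()->pid) to an executable path; the rule table, the
glob patterns and the dev_visible() name are all hypothetical, and the
policy shown is default-deny for every node.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* One whitelist entry: which executable may see which device nodes. */
struct dev_rule {
    const char *exe;       /* exact path of the allowed executable */
    const char *dev_glob;  /* shell-style pattern of device nodes */
};

/* Hypothetical policy mirroring the example in the text. */
static const struct dev_rule rules[] = {
    { "/usr/bin/X", "/dev/dri/card*" },
    { "/usr/bin/X", "/dev/input/event*" },
};

/* Return 1 if a process running `exe` may see device node `dev`,
 * 0 otherwise.  Anything not matched by a rule is hidden. */
int dev_visible(const char *exe, const char *dev)
{
    size_t i;

    assert(exe != NULL && dev != NULL);

    for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
        /* FNM_PATHNAME keeps '*' from matching across '/' so a rule
         * cannot accidentally cover deeper paths. */
        if (strcmp(exe, rules[i].exe) == 0 &&
            fnmatch(rules[i].dev_glob, dev, FNM_PATHNAME) == 0)
            return 1;
    }
    return 0;
}
```

A stat() or readdir() handler could call dev_visible() for each node and
answer with ENOENT (or skip the entry) when it returns 0, independent of
the caller's UID and GID.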


>
> > To be clear, it's not that I
> > dislike dbus; it's that the lack of generic interfaces between dbus
> > components makes it difficult for me to avoid tightly coupling
> > multiple dbus services together.
>
> Having a common wire conducting all interprocess communications is a
> very abusable thing, no matter how well you write it. Yours looks a
> little more debuggable than dbus. Especially if you include a log.
>

It includes at least two--its own log, and its back-end's log (i.e. you'd
be able to use FUSE's debugging facility as well as fskit's).


>
> > I think this can be avoided if
> > services stuck to generic conventions for interacting with the rest
> > of the system (such as the POSIX filesystem API), since then existing
> > tools would not need to be designed around the services'
> > otherwise-arbitrary methods and their arbitrary side-effects.  I
> > think making it easy for services to expose functionality through the
> > filesystem is a big step in this direction.
> >
> > Any thoughts or feedback welcome :)
>
> You're already using it, and it's solving some of your problems.
> Hopefully it has few dependencies and is easy to deploy. If this is a
> free software project, I'll help you document it.
>

It has minimal dependencies--just libc and libpthread, plus the back-end's
library (currently libfuse).  It's free software--LGPLv3 at the moment, but
I'm open to dual-licensing under ISC down the road if any of the *BSDs take
an interest.

Regards,
-Jude

[0] https://github.com/jcnelson/libpstat
[1] https://github.com/jcnelson/fskit/wiki
