[haiku-development] Re: Haiku package management system implementation (was: Haiku package manager)

From: David Flemström <david.flemstrom@xxxxxxxxx>
To: haiku-development@xxxxxxxxxxxxx
Date: Sun, 7 Feb 2010 18:09:35 +0100

2010/2/7 Jonas Sundström <jonas@xxxxxxxxxxx>

> Just to be clear (perhaps redundantly, my apologies upfront),
> the idea of the packagefs is not to have each package mounted
> separately, or even to force contents of any package into a
> folder of its own. Merely a single packagefs mount, for all
> currently installed packages.

Yes, I suspected this, but it wasn't completely clear to me exactly how it
would work (I had only seen the source code for packagefs briefly, mind
you). I never realized that packagefs would serve as a union file system,
so thanks for clearing this up. So, this essentially means that the plan is
to create a package system resembling those used by e.g. Puppy Linux and
Pandora Ångström Linux (sorry for bringing up Linux again, but there aren't
that many operating systems with package managers around to compare
with...). Fair enough; that's a valid direction to go in, too.

> In short, don't give up, pardon our oversensitivity, learn the system
> and the culture, find the implementation that hits the spot.

I must admit that I don't use Haiku as my primary system, mainly because it
lacks the kind of applications I need (isn't that ironic), so I've yet to
grow completely familiar with it and its philosophy. That being said, I'm
not really coming from a Linux background, either; most of what I've said
here has been coming from systems like mobile phone operating systems,
Embedded Linux (which has a completely different philosophy compared to the
standard Linux/GNU stitch-and-patch-together philosophy) and from
industrial-quality code package managers such as Maven, Gem, OSGi and Ivy.

On Sun, Feb 7, 2010 at 4:43 PM, Michael Lotz <mmlr@xxxxxxxx> wrote:
>
> It would be nice if you could try to reply to the specific parts by
> quoting instead of this semi top-posting, as it makes it hard to follow
> when not having the whole conversation in mind otherwise.
>
I've tried to do this until now, but I accidentally deleted the quoted parts
in my previous email because of a browser crash, apologies.

> What I wouldn't do though is
> making them an integral part of the process, but merely an optional
> feature. I cannot remember a single package off hand that would have
> used a script or required setting PATH or library paths when installing
> and I frankly don't find it a good idea to do so.
>
As long as Haiku remains a strictly desktop/home-user oriented operating
system, I could see that this would work, since normal users usually don't
have a need for "subsystems" like daemons, servers, services, VMs etc.
If you want it to evolve, however (I could see it being used as a render
farm component for 3D modeling, or as a smartphone/netbook OS, or as an
administration/monitoring system in industrial environments, etc), the need
to be able to keep your system up and running while updating a component of
it becomes more important, and features like this one become necessary. I
would suggest at least leaving the door open for them, so that the package
management system doesn't have to be changed (maybe even sparking a fork of
Haiku in the process) once the need arises. I don't think install scripts
are the way to go, however; guaranteed-to-be-reversible hooks are a lot more
sustainable.

> I understand what you're getting at here and I think this is a genuine
> point. If you depend on "libz.1.0" and would depend on the "zlib"
> package in the traditional package based dependency system it becomes
> cumbersome to install "zlib-debug" which would also provide "libz.1.0",
> because the depending packages don't necessarily understand the
> difference (or the lack thereof from their point of view) between "zlib"
> and "zlib-debug". I'm not sure if this isn't better solved as "package
> additions"/"subpackages", a separate type of package overlaying parts of
> a "parent" because of see below.
>
This is traditionally solved by adding "provided features" or "virtual
packages" to the system, so that if a package lists "java" as a dependency,
there are two packages that provide "java", namely "jre" and "jdk", and the
user will choose between these. This is not a good idea, however, because of
see below.

> Except that you're going to have more individual dependency "entries"
> when you need to describe each library/file. If you depend on "OpenSSL"
> that's a single package to look for, while if you're going to depend on
> "libcrypto.so", "libssl.so" and maybe the individual engine libraries
> (moderately good example, I know)...
>
Well, if I were a package manager designer, I would implement it to do this:

   - Package lists "lib/libcrypto.so" as a dependency.
   - Manager checks whether "lib/libcrypto.so" is in the current packagefs
   (a simple file existence check) or in an internal, currently empty stack (an
   O(n) operation). Let's assume it isn't found.
   - Manager searches the hashed repository index for the string
   "lib/libcrypto.so" as the key, and retrieves the package "OpenSSL" as the
   value. OpenSSL is added to the "install-dep" queue.
   - Manager retrieves the list of files that OpenSSL provides (which should
   be maybe 10 or so lib files and a few bin files) and pushes them to the
   internal stack.
   - Package lists "lib/libssl.so" as a dependency.
   - Manager checks the file system and internal stack, and finds the file
   in the internal stack (because it's inside the OpenSSL package), so it goes
   to the next package dependency.
   - etc...

*This* part of the depsolving operation is more expensive than in the
traditional model, but not by much; just a couple of extra strcmp's.
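
The steps above can be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: REPO_INDEX, PACKAGE_FILES, resolve and the listed package contents are made up for the example and are not part of packagefs or any existing tool. A set stands in for the "internal stack", so the lookup here is a hash lookup instead of the O(n) scan described above.

```python
# Hypothetical repository index: provided file -> package that provides it.
REPO_INDEX = {
    "lib/libcrypto.so": "openssl",
    "lib/libssl.so": "openssl",
    "bin/openssl": "openssl",
}

# Hypothetical reverse listing: package -> files it provides.
PACKAGE_FILES = {
    "openssl": ["lib/libcrypto.so", "lib/libssl.so", "bin/openssl"],
}


def resolve(dependencies, installed_files):
    """Return the packages to install so that every file dependency is met.

    installed_files stands in for the file existence check against packagefs.
    """
    pending = set()      # files provided by packages that are already queued
    install_queue = []   # the "install-dep" queue, in discovery order
    for path in dependencies:
        if path in installed_files or path in pending:
            continue  # already satisfied, go to the next dependency
        package = REPO_INDEX.get(path)
        if package is None:
            raise LookupError(f"no package provides {path}")
        install_queue.append(package)
        pending.update(PACKAGE_FILES[package])
    return install_queue


queue = resolve(["lib/libcrypto.so", "lib/libssl.so"], installed_files=set())
print(queue)
```

The second dependency is found among the files of the already queued package, so the package is queued only once. The sketch leaves out the dependencies of the queued packages themselves; a real resolver would have to process those as well.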

This solution is superior to the "virtual package" version, since someone
releasing a package doesn't have to know which virtual packages s/he
provides via the package; the files in the package are used as "provided
files", as-is.

E.g., let's say that a user is programming an e-mail application, and
creates a library called "libmail.so", intending to use it internally in the
mail app. Then, however, someone else creates a modified version of that
mail application, and can simply list "lib/libmail.so" as a dependency
of the modified version to share some code with the original application. The
original author now hears of this modified version, and wants to split off
"libmail.so" into a separate package, so that users don't have to install his
mail application just to use the modified version. This is done, and a new
package, "libmail", now provides the library. The modified email app
package will still work, since the dependency hasn't disappeared; it can
still be found, only in a different package.
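
A small sketch of that scenario in Python, assuming the repository can be modelled as a mapping from package name to provided files. The package names "mailapp" and "libmail", the file "bin/mailapp" and the helper providers are hypothetical.

```python
def providers(index, path):
    """Return the names of all packages in the index that provide path."""
    return sorted(name for name, files in index.items() if path in files)


# Before the split: the library ships inside the mail application package.
before = {
    "mailapp": {"bin/mailapp", "lib/libmail.so"},
}

# After the split: the library has moved into a package of its own.
after = {
    "mailapp": {"bin/mailapp"},
    "libmail": {"lib/libmail.so"},
}

# The modified application never names a package, only the file it needs.
modified_app_depends = ["lib/libmail.so"]

for path in modified_app_depends:
    print(path, providers(before, path), providers(after, path))
```

The dependency list of the modified application is identical in both cases; only the answer of the lookup changes.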

Please refer to the Google Go programming language "interface" technology to
learn more about why a system with this philosophy is to be preferred.

Note: I might have misunderstood the following part, so please clarify if
you feel misunderstood.

> What I'm not sure of though is how exactly you plan to describe the
> dependency of the main package content with their support files.
> Considering OpenSSL alone you have a ton of stuff ending up in share,
> include and ssl. Would the library "libssl.so" then still depend on the
> "OpenSSL" package as a whole?

You could easily split up OpenSSL in openssl-libs and openssl-runtime or
similar. The definition of a package is IMO "the smallest set of files
necessary to provide one feature". So, you could easily break it down
arbitrarily far, creating an individual, 10 kiB "libssl" package that would
depend on "lib/libcrypto.so" and only provide the file "lib/libssl.so", and
one "openssl" package that would depend on "lib/libssl.so". Or not,
depending on how practical it is.
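
As a rough illustration in Python, such a split could be written down as plain metadata and checked for consistency. The metadata layout, the "libcrypto" package, the file "bin/openssl" and the helper unmet_dependencies are assumptions made for the example only.

```python
# Hypothetical metadata for a finely split OpenSSL.
PACKAGES = {
    "libcrypto": {"depends": [], "provides": ["lib/libcrypto.so"]},
    "libssl": {"depends": ["lib/libcrypto.so"], "provides": ["lib/libssl.so"]},
    "openssl": {"depends": ["lib/libssl.so"], "provides": ["bin/openssl"]},
}


def unmet_dependencies(packages):
    """List (package, file) pairs whose file no package in the set provides."""
    provided = {f for meta in packages.values() for f in meta["provides"]}
    return [
        (name, dep)
        for name, meta in packages.items()
        for dep in meta["depends"]
        if dep not in provided
    ]


print(unmet_dependencies(PACKAGES))
```

However far the split goes, the check stays the same: every file listed under "depends" has to show up under "provides" somewhere.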

> I mean what if you end
> up with "libcrypto.so" from one package and "libssl.so" from another?
> How are you going to decide which "OpenSSL" you're going to install and
> which support files you're going to pick?
>
I don't see how this could ever happen, except for these two scenarios:

   - There are multiple SSL implementations, say openssl and ciscossl
   (doesn't really exist and is really implausible of course), that both
   provide "lib/libssl.so". The user would then have to choose between the two
   implementations via a selection dialog.
   - There are debug and release versions of openssl. The debug version
   would then have to be stored in a separate repository not active by default,
   so the package manager should only see the "correct" alternative by default.
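
Both scenarios can be sketched in a few lines of Python. The repository names "main" and "debug", the package "openssl-debug", and the helpers candidates and choose are hypothetical; the ask callback stands in for the selection dialog.

```python
# Hypothetical repositories: name -> {package: provided files}.
REPOSITORIES = {
    "main": {
        "openssl": {"lib/libssl.so", "lib/libcrypto.so"},
        "ciscossl": {"lib/libssl.so"},
    },
    "debug": {
        "openssl-debug": {"lib/libssl.so", "lib/libcrypto.so"},
    },
}


def candidates(path, active_repos):
    """Return every package in the active repositories that provides path."""
    return sorted(
        package
        for repo in active_repos
        for package, files in REPOSITORIES[repo].items()
        if path in files
    )


def choose(path, active_repos, ask):
    """Pick a provider; ask() is only called when the choice is ambiguous."""
    found = candidates(path, active_repos)
    if not found:
        raise LookupError(f"no package provides {path}")
    return found[0] if len(found) == 1 else ask(found)


# With only "main" active, the debug build is never offered.
print(candidates("lib/libssl.so", ["main"]))
print(candidates("lib/libssl.so", ["main", "debug"]))
```

With the default repositories the user is asked only when two real implementations compete; the debug variant enters the picture only after its repository has been activated.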

As I said, I think that we're talking on different wavelengths here, because
I think that I missed your point.

> Also maintaining/creating such packages
> sounds not so fun to me.
>
Why not? It's actually easier this way than with the alternative! You look
in your build file and see that your application links with e.g.
"libconf.so", so in the package's "depends" section, you simply add the line
"lib/libconf.so" (or you let the build script do it for you). Then, you just
copy your produced application (or let the build script do it) into the
"bin/" folder of the package and add "bin/myapp" to the "provides" section
of the package. That's it! I don't see how it could be any easier! :-)
This way, you don't have to look up which package provides that lib that
you're using, and whether you want to depend on the "dev" or the "stripped"
package or somesuch. You simply list the .so's you need and that's it!
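
The "let the build script do it" part could look roughly like this in Python. The function make_manifest and the dictionary layout are assumptions for illustration; no existing package format is implied.

```python
def make_manifest(linked_libraries, binaries):
    """Build the "depends" and "provides" sections from build information.

    linked_libraries: library file names the application links against.
    binaries: names of the executables the build produces.
    """
    return {
        "depends": sorted(f"lib/{name}" for name in set(linked_libraries)),
        "provides": sorted(f"bin/{name}" for name in set(binaries)),
    }


manifest = make_manifest(["libconf.so"], ["myapp"])
print(manifest)
```

The packager only passes on what the build already knows; no package names are looked up at any point.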

> Many of your points, like automatic "install/uninstall" are inherent
> with packagefs, where packages just show up in the packagefs when they
> are added and vanish when they are removed. No need to actually touch
> any files by running "hooks". That's all Rene tried to point out I
> think.
>
Yes, I realize that now. Thanks.


> It's not the first discussion of this kind, and you have to understand
> that it can be a bit tedious to explain our views over and over again
> to people who aren't necessarily familiar with Haiku concepts. I
> understand where you're coming from and I can appreciate the thought
> you put into combining various strongpoints of existing package
> managers and coming up with own ideas. So please try to understand us
> as well when we judge how far these concepts may apply to Haiku, as we
> are afterall pretty familiar with how our system works. The ideal thing
> would be if you could get yourself as familiar with Haiku as with the
> platform you're coming from, then you can easily see what concepts may
> or may not apply to Haiku.
>
I understand; I would probably feel the same if I were more used to the
Haiku philosophy. It's necessary to be careful, though, so as to not
alienate new members of the community!

Cheers,
David Flemström