[haiku-development] Re: Package buildmeister

  • From: Michael Lotz <mmlr@xxxxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Thu, 7 Jul 2016 17:26:27 +0200

On 21.06.2016 15:20, Alexander von Gluck IV wrote:

June 20 2016 6:24 PM, "Michael Lotz" <mmlr@xxxxxxxx> wrote:
We've been idle on finishing the package building infrastructure... which 
should be something exciting.
I'm trying to figure out why. This is one of the last *big* blockers for R1B1 
and beyond.

I can only speak for myself: The simple fact is that I've been working myself almost to death and therefore had absolutely no time for anything Haiku related (or any other hobbies for that matter).

Seriously? It uses SSH. The only dependency here is a TCP connection. TCP can 
be tunnelled and
forwarded at will with whatever tool fits. I personally use reverse SSH tunnels 
to hook up my
builders. But you can use stunnel or just plain netcat as the connection will 
be encrypted anyway.
I find SSL generally to be more of a hassle to set up than SSH, so I just use 
that.

Oh, so the plan for reverse connections is doing ssh tunnels.  If the link gets 
disconnected we
script out a restart... ok.  I guess that'll work... it feels very bubble-gum + 
duct tape though :-)

Sure, whatever. I find it funny how you always pick out the one thing you can rant about and ignore the other listed options. In the very same paragraph I describe that other means could be used. If using an SSH reverse tunnel is beneath you, then just use something else. In actual practice, over the last couple of months neither the SSH reverse nor the SSH forward connections (the latter have the same issue and would actually be relevant here) have been a problem.
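For concreteness, the reverse tunnel approach could look something like the sketch below. The host name, port numbers, and account are illustrative assumptions, not values from the actual infrastructure.

```shell
#!/bin/sh
# Sketch: keep a reverse SSH tunnel from a builder up, reconnecting
# automatically when the link drops. All hosts/ports are examples.
keep_tunnel() {
    while true; do
        # Forward port 2222 on the buildmaster back to this
        # builder's local SSH server on port 22.
        ssh -N -T \
            -o ServerAliveInterval=30 \
            -o ExitOnForwardFailure=yes \
            -R 2222:localhost:22 \
            "tunnel@$1"
        sleep 10    # connection lost: back off briefly, then retry
    done
}

# Invoked as, e.g.: keep_tunnel buildmaster.example.org
```

Tools like autossh implement the same keep-alive loop; which one is used is exactly the kind of interchangeable transport detail described above.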

Coming up with a job-control-based mechanism, so that the haikuporter build keeps running and completion can be awaited asynchronously, should be rather trivial.

Of course many other RPC mechanisms already exist that could be used instead. SSH just seemed straightforward at the time. Feel free to implement something superior as you see fit.

I'm really getting a strange feeling here. You make it sound like I'm hiding 
some dark secret. But
this really is plain basic networking. You need to make a TCP port available 
where an SSH server
can be reached. You can do that through a static IP and a port forward through 
your NAT on your
home router. You can also make that port available by forwarding it through a 
reverse SSH tunnel or
similar setup. This is not something I just invented; it has been in use for decades.

This wasn't the point.  It means our infrastructure is going to be highly 
dependent on upkeep.
Users managing SSH tunnels, managing reverse proxies through NATs, dynamic DNS
entries, haiku
buildslaves, haikuports trees.  We collectively can barely find the time to get 
this stuff
set up or handle SMART drive errors on baron... let alone get all this stuff 
configured then
maintain it.

Here you're just saying words to try and underscore a point. Just because I said there are multiple ways to do things doesn't mean you have to do it differently for each and every builder. One would standardize on one approach and then use that for each builder. You know, transport mechanism independence is a feature, not a bug.

There is a one-time setup per builder for the SSH tunnel script I included last time. It could also easily be packaged if copying a script around is too much of a hassle. The initial configuration will need to be done for any kind of RPC with any solution you come up with.

HaikuPorts is obviously also needed (in all of the setups as well). HaikuPorts 
needs to be in sync
on the buildmaster and the builder. The buildmaster ensures that automatically.

Oh, so the buildmaster does git pulls on the buildslave haikuports trees? I
bring this up because
the other solutions generally manage their own haikuports and haikuporter 
trees.  I just want to make
sure everyone knows this as this will all need to be common knowledge for 
anyone running a package builder.

Oh come on, obviously this is super trivial to automate if you want that. It is literally two lines of SSH commands to do the initial setup. This is not some closed-source or external project where it is not possible to change things. Just add it as an option to the create_builder script and be done with it.
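As a sketch of those "two lines", the initial clone on a builder could be done remotely like this. The builder account, host, and base directory are assumptions for illustration; the script only echoes the commands as a dry run.

```shell
#!/bin/sh
# Dry-run sketch of the one-time builder setup: two SSH commands that
# clone haikuporter and haikuports wherever the builder's owner wants
# them. Account, host, and base directory are illustrative values;
# remove the leading "echo" to actually execute the clones.
BUILDER=builder@builder.example.org
BASE=/build
echo ssh "$BUILDER" "git clone https://github.com/haikuports/haikuporter.git $BASE/haikuporter"
echo ssh "$BUILDER" "git clone https://github.com/haikuports/haikuports.git $BASE/haikuports"
```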

The person owning the builder also doesn't need to be the person doing that initial setup; whoever configures the builder would probably do it once the owner has made clear where it should live. That's how it worked with Jessica's and Urias's builders: they told me where things should be set up and I then configured them there. Since these machines are possibly used for other things besides being a builder, I wouldn't want to assume a default directory that is always in the same place.

The createbuilder.sh script asks you a couple of questions that you have to 
answer. All questions
that can have a sensible default have one. I don't exactly see how this is 
classified as difficult.
It even automates creation of all the necessary SSH keys and queries the remote 
host key for you.
The only directories that you have to specify is where on the builder 
HaikuPorter and the
HaikuPorts tree can be found. All the other directories have defaults that you 
can just accept and
it will work fine.

Right.  It asks a series of questions about where things will be located on build slaves. Once
again, I bring this up because all of the other tools manage a "workspace" and 
just need a base
directory.

As stated above, you only configure a base directory with create_builder as well. The other directories can be specified for special cases (e.g. a shared cache directory on a bigger volume might make sense depending on the builder), but you don't have to configure them if you don't need them to be different from their default. Doing the initial clone of the haikuporter and haikuports directories can then just be added as an optional step of the setup.

- Doesn't know about architectures of buildslaves (one entire environment for 
each arch)

I don't understand? All setups will need builders for the different 
architectures. Conveniently the
fully host independent chroot in this setup will allow you to run builds for 
different
versions/branches on the same builder (as long as that one is reasonably 
compatible) as no system
packages are used at all. So overall builder count should be reduced compared 
to the other setups.

If you mean there's one *entire environment for each arch*, then yes that is 
true. It consists of a
HaikuPorts checkout and a builder configuration.

Once again, so haikuporter build-master assumes all slaves can build the 
architecture specified, so
we'll need one complete environment for each architecture (x86_gcc2, x86, 
x86_64, 2 or 3 arm targets,
powerpc). I bring this up because the other solutions manage multiple builders 
of multiple
architectures.

I still don't understand. My buildmaster is hosted on Linux. It does not impose an architecture and there is no need for individual servers or buildtools for each architecture either. It needs a haikuports clone for each architecture, yes, as each clone is configured for one respective architecture. I guess every solution does have some architecture specific configuration and most probably an output repository for each architecture as well.

Builders are obviously architecture specific in any case. All solutions will require builders of the respective architecture.

So, we have a number of build slaves for an architecture. A git commit comes in 
and a new job
spins up.  We get another commit and another job can start instantly (and not use the 
"busy"
buildslave?)

Fair point, the buildrun of the former commit would indeed have to complete first and the idle builders would not be reused immediately.

I don't understand the remark about --do-bootstrap. None of the setups are 
meant for bootstrapping.
This is for continuous automated package building and publishing.

Ok, so is there a way to build all packages in one shot across multiple buildslaves? We'll likely need to do something like that pre-R1B1.

You run "buildmaster.sh everything" (or with a list of recipes you actually want to build) in each architecture specific clone.

This is a very special case that you trigger manually once the release branch has been made. You probably want to switch to a release branch of haikuports at that time as well.
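A full pre-release rebuild across the per-architecture clones could then be a short wrapper like the following sketch; the clone naming scheme and the architecture list are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: run "buildmaster.sh everything" in each architecture
# specific haikuports clone, as described above. The clone layout
# and the architecture list are illustrative assumptions.
build_everything() {
    for arch in x86_gcc2 x86 x86_64; do
        ( cd "haikuports.$arch" && buildmaster.sh everything ) || return 1
    done
}

# e.g.: build_everything
```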

You actually listed all of them, so "lots" might be a bit of an exaggeration. 
The package and
package_repo tools as well as any build_host libraries needed are a byproduct 
of building a
standard Haiku image. You can also just build the two tools individually if you 
don't want to wait
for a whole image to build.

Fair.  But once again, a comparison of the solutions. The other options don't 
require so much manual
script distribution.

I realize that you want to compare a couple of approaches here. In my opinion you are however focusing on some pretty superficial parts of these solutions. In the case of buildmaster mode most of these can, with a bit of work, be automated or resolved easily.

The more fundamental difference between these solutions, and the very reason for implementing buildmaster mode in the first place, is the duplication of recipe and dependency logic. These do seem trivial at first, but they really just aren't.

The whole point of putting buildmaster mode into haikuporter was to leverage the existing recipe parsing and dependency logic and automatically keeping up with any changes made to it. A solution that reimplements these fundamental parts will, in my opinion, always fail to match the actual workings of haikuporter in subtle or not so subtle ways and cause hard to debug situations. And even if they replicate the logic so perfectly that they indeed work exactly the same, then you still have multiple code bases that always need to be kept in sync. In my opinion this should be the cornerstone of any comparison.

While implementing buildmaster mode I've pulled out some of these inner workings so that they can be used by external tools. Getting a list of affected recipes for a list of changed files is one such thing. Parts of the dependency logic are accessible as well. I do not really care if the eventual solution uses buildmaster mode or if it is an external solution, but in my opinion it must use these exposed interfaces of haikuporter to make sure the tool managing the builds and the tool actually doing them have the same view of the world.
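For example, the list of changed files that feeds such an interface can come straight from git; the function name and the revision range below are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: collect the recipe files touched between two revisions in a
# haikuports clone; the resulting list would then be handed to
# haikuporter's exposed "affected recipes" interface. The revision
# range passed in is an example value.
changed_recipes() {
    git diff --name-only "$1" -- '*.recipe'
}

# e.g.: changed_recipes origin/master..HEAD
```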

The difference seems to be in the interpretation of whether
this is a good or a bad thing. I find shell scripts, at least reasonably
structured ones, pretty
obvious. Large do-it-all servers on the other hand can quickly wander in the 
"blackbox/magic"
direction.

That's where we differ. I do DevOps by day and "automate all the things" :-D  
I've used a lot of
build automation tools in production environments, so I might have some unfair 
cynicisms about
this stuff. Jenkins does connect out to buildslaves over ssh, however it also 
handles "setting
up and managing all the things on the slaves".  Windows build slaves in Jenkins 
generally run a
java agent as a system service and connect to the build master.

Then it should come naturally to you to automate the missing steps of the builder setup as outlined above. And how exactly is that Jenkins agent any different from a reverse SSH tunnel?

I also deploy + manage a cloud of servers + storage (6 PiB so far, yay!) in my 
day job and learned
manual steps will kill a huge amount of free time at a large scale. Sure, 1 
cron job, a git hook,
3-5 scripts might work... but then likely a variety of people will need to 
replicate that
configuration (in functional form) to ~6+ other environments (one for each 
architecture). The
chances of things being 'different' and non-functional are quite high.

I know that when you scale things, even a couple of seconds here and there can become a huge problem. On the other hand, I just don't see us setting up hundreds of builders, let alone buildmasters. A single buildmaster with a reasonable amount of storage for the repositories should be enough to host builders for all the needed architectures. So the one-time setup really just isn't something I personally would spend a lot of time optimizing for.

There will certainly be more builders than buildmasters, so the situation might be somewhat different there. But the dependencies on the builder are extremely narrow and also a one time setup that can be fully automated.

I do better understand where you're coming from now with this stuff.  Given a 
lot of work has gone
into the other two build systems as well, it wouldn't be fair to jump to yours 
without the same
level of analysis we (+I) gave the other two.

Yes, but please take things like duplication of logic, longer-term maintenance of the code bases, and the likelihood of things not working due to incompatible logic into account.

I want what everyone else wants:

  * A release as soon as possible
  * Recent + quality package builds
  * Haiku to not become stagnant.

I certainly would like to see a release soon as well. But I'm still of the opinion that using a solution that does not leverage the existing logic will, at least in the long term if not immediately, cost more time to get working properly and maintain.

Regards,
Michael
