[haiku-development] Re: Package buildmeister

From: Jessica Hamilton <jessica.l.hamilton@xxxxxxxxx>
To: "haiku-development@xxxxxxxxxxxxx" <haiku-development@xxxxxxxxxxxxx>
Date: Wed, 6 Jul 2016 15:14:42 +1200

On 21 June 2016 at 11:23, Michael Lotz <mmlr@xxxxxxxx> wrote:

Hi

This email is rather long and contains mostly technical background and
opinions about the stated cons of the buildmaster setup. Feel free to skip
it if you're not interested in the workings of my current setup or why I
find it sensible.

On 19.06.2016 06:13, Alexander von Gluck IV wrote:

I worked on getting the haikuporter build-master installed in a public VM
tonight and ran into quite a few concerns about it.

Seriously? I mean that is exactly what I've set up and has been running for
the last couple of months. I understood the goal was to put it on Haiku
infrastructure and make it official, not build it up in private again.

The number one issue I saw was the requirement that each Haiku build-slave
be accessible via IP + SSH.

Seriously? It uses SSH. The only dependency here is a TCP connection. TCP
can be tunnelled and forwarded at will with whatever tool fits. I personally
use reverse SSH tunnels to hook up my builders. But you can use stunnel or
just plain netcat as the connection will be encrypted anyway. I find SSL
generally to be more of a hassle to set up than SSH, so I just use that.

Given these machines generally run behind user NATs, and that the
build-master is "single-shot", I think the haikuporter build-master might be
*too* simplistic.

The SSH connections are closed when the buildrun is through, yes. That
doesn't mean that the SSH servers are suddenly going to disappear and need
to be set up again for the next buildrun. You make it sound like there's
some kind of manual work involved here, which there isn't.

Regarding reverse SSH tunnelling:

I can't find any documentation, arguments, or code on this... it seems this
should be the default behavior for the outlined reasons. (Since we need one
complete haikuporter build-master per architecture, how would this even
work?)

I'm really getting a strange feeling here. You make it sound like I'm hiding
some dark secret. But this really is plain basic networking. You need to
make a TCP port available where an SSH server can be reached. You can do
that through a static IP and a port forward through your NAT on your home
router. You can also make that port available by forwarding it through a
reverse SSH tunnel or similar setup. This is not something I just invented;
it has been in use for decades.

The general approach was to not reinvent stuff. HaikuPorter needs a way to
remotely execute commands on the builder. A remote shell like SSH fits that
bill, so I used that (through paramiko) to leverage the fact that we ship an
SSH server with every Haiku nightly ready to use.
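
To illustrate the remote-execution idea, here is a minimal sketch. It is not
HaikuPorter's code: HaikuPorter uses paramiko, while this sketch swaps in the
ssh command-line client through subprocess so that it needs only the standard
library. The function names are hypothetical, the dictionary layout follows
the builder configuration shown in [3], and it assumes the host key file can
be used as an OpenSSH known_hosts file.

```python
import shlex
import subprocess


def ssh_argv(builder, remote_command):
    """Build an ssh command line for one builder (hypothetical helper)."""
    ssh = builder["ssh"]
    return [
        "ssh",
        "-p", str(ssh["port"]),
        "-i", ssh["privateKeyFile"],
        # Assumption: the host key file is in known_hosts format.
        "-o", "UserKnownHostsFile=" + ssh["hostKeyFile"],
        # Never prompt for a password; fail instead.
        "-o", "BatchMode=yes",
        ssh["user"] + "@" + ssh["host"],
        # ssh hands a single string to the remote shell, so quote each word.
        shlex.join(remote_command),
    ]


def run_on_builder(builder, remote_command):
    """Run a command on the builder and return the CompletedProcess."""
    return subprocess.run(
        ssh_argv(builder, remote_command), capture_output=True, text=True
    )
```

With a builder reachable on localhost:8124, run_on_builder(builder,
["uname", "-a"]) would execute uname on the builder; whether the connection
goes through a port forward, a tunnel or a static IP makes no difference to
the caller.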

Of course you can build remote execution using your own protocol and wrap
that within SSL to secure it and implement yet another way to do the same
thing. I just didn't find it useful to build such a protocol directly into
HaikuPorter as that IMHO would just be bloat that would need to be
maintained.

If anyone wants to try and document it, let me know and I'll give you access
to a buildmaster *and* a remote buildslave.

To make this very obvious, I'm going to insert the full text of everything
I'm discussing here at the end of this email.

For the reverse SSH tunnel I'm using a script [1] with a corresponding
configuration [2]. This configuration forwards port 22 of my builder (where
the local SSH server listens) to port 8124 on the server where the
buildmaster is configured to connect to (see the full builder configuration
in [3]). It does that with a weak but fast cipher because it is going to
only provide a tunnel for another, more strongly encrypted SSH connection.
The builder configuration was created with the createbuilder.sh script,
which I've committed to the HaikuPorter repository under the buildmaster
directory and which automates the configuration. The only thing I had to do
to set this builder up was to name the HaikuPorter and HaikuPorts directory,
host, port and user for the SSH connection (using localhost:8124 for the
forwarded port) and finally add the automatically generated SSH public key
to the authorized_keys of the desired user on my builder.
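
One way to check from the server side that a forward like this is up, as a
rough sketch: connect to the forwarded port and look at the first line the
other end sends. An SSH server normally identifies itself with a line
starting with "SSH-". The function names are hypothetical, port 8124 is the
one from the configuration described above, and the sketch only inspects the
first line, so a server that sends other text before its identification
string would be reported as down.

```python
import socket


def looks_like_ssh_banner(line):
    """True if the line looks like an SSH identification string."""
    return line.startswith("SSH-")


def read_first_line(host, port, timeout=5.0, limit=255):
    """Connect and return the first line sent by the peer (may be empty)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        data = b""
        while len(data) < limit and not data.endswith(b"\n"):
            chunk = conn.recv(1)
            if not chunk:  # peer closed the connection
                break
            data += chunk
    return data.decode("ascii", errors="replace").strip()


def tunnel_is_up(host="localhost", port=8124):
    """Report whether something that talks SSH answers on host:port."""
    try:
        return looks_like_ssh_banner(read_first_line(host, port))
    except OSError:  # refused, unreachable or timed out
        return False
```

Calling tunnel_is_up() on the server before starting a buildrun would tell
you whether the builder behind localhost:8124 is currently connected.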

The ssh_tunnel.sh is symlinked in my ~/config/settings/boot/launch so that
it starts automatically on boot. For obvious reasons I am going to leave out
the private key used by that configuration. The tunnel user on the server is
set up with git-shell to prevent normal shell access. The authorized_keys
file [4] further limits the possible actions to pretty much just port
forwarding, which is all that is needed here.

Why didn't I document that as the official way to set up a builder? Because
I find this to be an implementation detail that is entirely up to the
operator of the server and builder. How the connection is established does
not matter to HaikuPorter and the choice of infrastructure to make it happen
is a matter of various factors including security and trust concerns,
available tools and personal preference. The described setup can be used for
pretty much any port forwarding need and is not in any way specific to
HaikuPorter (and wasn't written for this use case either, it stems from the
setup I've implemented at work to do most of our remote support).

Another person would maybe prefer not to create a user on the server at all
and configure stunnel to do SSL tunnels instead. This would work just as
well.

I think we're all assuming haikuporter build-master is a lot more magic than
it actually is. Some great work has been put into it, but I want to make sure
there is consensus that haikuporter build-master is the way to go.

Why are you assuming that everyone's assuming magic here? It's pretty far
away from magic or being a black box. It's a single source file within
HaikuPorter with ~800 lines of code [5]. How much magic can there be?

This is of course not by accident. Indeed the whole point was for it to do
as little as possible by leveraging existing tools (like the dependency
logic and recipe handling inside HaikuPorter itself, but also regarding
protocols for remote command execution (SSH) and file transfers (SFTP) as
well as serving out status (apache httpd in my setup)).

I generally find that if something seems like magic one just doesn't yet
fully understand how it works.

https://github.com/haikuports/haikuporter
haikuporter build-master mode  (mmlr)
   Pro
    - Python which has good community knowledge
    - Fully leverages haikuporter internal logic for dependencies
    - Builds repos
   Cons
    - SSH's out to slaves and requires user to open ssh port per slave
      (and static IP)

See above.

    - Requires haikuporter + haikuports on master and each slave
      (does haikuports have to be in sync?)

Yes, obviously HaikuPorter is needed because in this setup both the
buildmaster logic and the builder are implemented in it. The builder uses
HaikuPorter in all of these setups, so I don't see why it's listed as a con
for this setup.

HaikuPorts is obviously also needed (in all of the setups as well).
HaikuPorts needs to be in sync on the buildmaster and the builder. The
buildmaster ensures that automatically.

    - Difficult slave configuration + lots of directory settings per slave

The createbuilder.sh script asks you a couple of questions that you have to
answer. All questions that can have a sensible default have one. I don't
exactly see how this is classified as difficult. It even automates creation
of all the necessary SSH keys and queries the remote host key for you. The
only directories that you have to specify are where on the builder
HaikuPorter and the HaikuPorts tree can be found. All the other directories
have defaults that you can just accept and it will work fine.

    - Doesn't know about architectures of buildslaves
      (one entire environment for each arch)

I don't understand? All setups will need builders for the different
architectures. Conveniently the fully host independent chroot in this setup
will allow you to run builds for different versions/branches on the same
builder (as long as that one is reasonably compatible) as no system packages
are used at all. So overall builder count should be reduced compared to the
other setups.

If you mean there's one *entire environment for each arch*, then yes that is
true. It consists of a HaikuPorts checkout and a builder configuration.

    - *Basic* html report of each single-shot run.

I give up on this point. I've explained my reasoning for a JSON output
numerous times. Maybe just think of it as serving out the "database" that
the other approaches also have?

    - Single shot for one package (or a bunch? --do-bootstrap seems broken
      here) and deps

You can do a buildrun for a single package, many packages, the packages that
were affected by changes to recipes and referenced files, whatever you want.
It is just a buildrun, what you put inside is decided by how you start it.

I've taken great care to make sure that this can run as a git hook or by
comparing different git revisions by implementing the functionality to
derive a set of affected recipes from a set of changed files. This includes
things easily missed by a simpler approach, like a referenced license,
additional file or patches.
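
The idea of deriving affected recipes from changed files can be sketched as
follows. This is an illustration only, not the logic in HaikuPorter: the
function name and the shape of the input (a mapping from each recipe to the
files it references) are assumptions, and so are the example paths used when
trying it out.

```python
from pathlib import PurePosixPath


def affected_recipes(changed_files, referenced_files):
    """Return the recipes touched by a set of changed files.

    referenced_files maps a recipe path to the paths it references
    (patches, licenses, additional files). A recipe is affected when
    the recipe itself or any referenced file has changed.
    """
    changed = {PurePosixPath(path) for path in changed_files}
    affected = set()
    for recipe, references in referenced_files.items():
        watched = {PurePosixPath(recipe)}
        watched.update(PurePosixPath(ref) for ref in references)
        if watched & changed:
            affected.add(recipe)
    return sorted(affected)
```

Fed with the file list of a git diff between two revisions, a function like
this yields the recipes to put into the next buildrun, including those whose
only change is a patch or license file.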

The buildmaster/buildmaster.sh frontend script automates most of the common
tasks (including updating to a new revision and building everything affected
by the changed files). For reference I'm inserting the full text of my magic
updateloop script in [6]. That's all there is to it for continuous building
of changed/new recipes. The script is run in the HaikuPorts checkout on the
server and takes everything it needs from there. Setting it up for a
different branch means: just checking out that branch.

I don't understand the remark about --do-bootstrap. None of the setups are
meant for bootstrapping. This is for continuous automated package building
and publishing.

    - Lots of requirements on build-master system
      (package, package_repo, haiku repo for licenses)

You actually listed all of them, so "lots" might be a bit of an
exaggeration. The package and package_repo tools as well as any build_host
libraries needed are a byproduct of building a standard Haiku image. You can
also just build the two tools individually if you don't want to wait for a
whole image to build.

That the licenses aren't duplicated as part of HaikuPorts is a bug in our
setup IMHO. Relying on the presence of license files in the Haiku package
without explicitly declaring that or bringing the license with your recipe
is a bad practice IMO.

    - Poor documentation (I've written what's out there now)

I've tried to outline the concepts a couple of times in my emails. In my job
I am partly a sysadmin for various servers and set up a lot of machines and
services, so obviously the tools used here don't seem strange to me at all.
I understand that this doesn't necessarily apply to other people. However I
would expect some sort of sysadmin background from a person running official
Haiku servers and services as well.

I'm all about microservices, but my main concern is this whole thing sounds
like it is going to be held together via 30 cron jobs, 10 scripts in
/usr/local/bin, and a few old men to log in and manually fix stuff every
other day.

I wouldn't really say 30 is old and don't see why one would need to tend to
an automated system every other day, but the rest sounds about right (maybe
lower numbers, say 1 cron job or git hook and 3 or 4 scripts chained
together). The difference seems to be in the interpretation of whether this
is a good or a bad thing. I find shell scripts, at least reasonably
structured ones, pretty obvious. Large do-it-all servers on the other hand
can quickly wander in the "blackbox/magic" direction.

In my opinion this is still just a very modular and flexible setup that can
easily be hooked into, just as I personally would expect from such a system.

Regards,
Michael

--

[1] - ssh_tunnel.sh

#!/bin/bash

cd "$(dirname "$0")"

exec 1>> ssh_tunnel.log 2>&1

function log {
        echo "$(date +%Y%m%d_%H%M) $1"
}

log "Starting SSH tunnel loop from $0 in $(pwd)"

while true
do
        log "Starting SSH process"
        ssh -nNT -F ssh_tunnel.config ssh_tunnel
        log "SSH process quit with status $?"
        sleep 5
done

[2] - ssh_tunnel.config

Host ssh_tunnel
        HostName                hpkg.mlotz.ch
        User                    tunnel
        BatchMode               yes
        ConnectionAttempts      3
        ConnectTimeout          15
        ExitOnForwardFailure    yes
        IdentityFile            ssh_tunnel.key
        IdentitiesOnly          yes
        UserKnownHostsFile      ssh_tunnel.hostkey
        LogLevel                VERBOSE
        Protocol                2
        RemoteForward           8124 localhost:22
        ServerAliveInterval     15
        ServerAliveCountMax     3
        Cipher                  arcfour

[3] - mmlr_htpc_x86_gcc2.json
{
        "name": "mmlr_htpc_x86_gcc2",
        "ssh": {
                "host": "localhost",
                "port": "8124",
                "user": "mmlr",
                "privateKeyFile": "keydir/mmlr_htpc_x86_gcc2.key",
                "hostKeyFile": "keydir/mmlr_htpc_x86_gcc2.hostkey"
        },
        "portstree": {
                "path": "/Media/Source/builder/x86_gcc2",
                "packagesPath": "/Media/Source/builder/x86_gcc2/packages",
                "packagesCachePath": "/Media/Source/builder/x86_gcc2/packages/.cache"
        },
        "haikuporter": {
                "path": "/Media/Source/builder/haikuporter/haikuporter",
                "args": "-j2"
        }
}
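
A builder configuration like the one above can be sanity-checked before use.
The sketch below is hypothetical and not part of HaikuPorter: the list of
expected keys is simply copied from the example in [3], and whether
HaikuPorter itself treats every one of them as mandatory is an assumption.

```python
import json

# Keys taken from the example configuration; assumed, not authoritative.
EXPECTED_KEYS = {
    "ssh": ("host", "port", "user", "privateKeyFile", "hostKeyFile"),
    "portstree": ("path", "packagesPath", "packagesCachePath"),
    "haikuporter": ("path", "args"),
}


def missing_builder_keys(config):
    """List the expected keys that are absent, as 'section.key' strings."""
    missing = [] if "name" in config else ["name"]
    for section, keys in EXPECTED_KEYS.items():
        values = config.get(section, {})
        missing.extend(
            section + "." + key for key in keys if key not in values
        )
    return missing


def load_builder_config(text):
    """Parse a JSON builder configuration and reject incomplete ones."""
    config = json.loads(text)
    missing = missing_builder_keys(config)
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    return config
```

Running load_builder_config on the contents of a builder's JSON file either
returns the parsed dictionary or names the entries that still need filling
in.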

[4] - authorized_keys of the tunnel user on the server
no-pty,no-X11-forwarding,permitopen=":1",command="/bin/echo tunnelonly" ssh-rsa AAAA...publickey...== mmlr_htpc_x86_gcc2

[5] -
https://github.com/haikuports/haikuporter/blob/master/HaikuPorter/BuildMaster.py

[6] - updateloop.sh
#!/bin/sh
while true
do
        date -u
        ~/haikuporter/buildmaster/buildmaster.sh update \
                && ~/haikuporter/buildmaster/createrepo.sh
        sleep 180
done >> update.log 2>&1

So I'm trying to help set up an independent build master, but I'm
running into a couple of issues.

1. For buildmaster.sh, what should go in buildmaster/config? Is it
just HAIKUPORTER=/path/to/haikuporter?

2. How are command line arguments for haikuporter supposed to be
specified? In buildmaster.sh, it uses "$HAIKUPORTER", which breaks if
I add these in the buildmaster/config file. I've removed the double
quotes, and am making _some_ progress, but still not there yet. Or are
these supposed to go into the haikuporter configuration file?
Currently, the arguments I pass to haikuporter are --config=...
--licenses=... --command-mimeset=... --command-package=...
--system-packages-directory=...

3. In buildmaster mode, if the settings are supposed to be in the
haikuporter config file, where does it look for the config file?

4. I can't get it to figure out where the system packages are; I've
added --system-packages-directory to my HAIKUPORTER variable inside
the buildmaster/config file, but it seems to have no effect. Still
says 'requires "haiku" of package ... could not be resolved'.
Similarly for haiku_x86.
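
The quoting behaviour behind question 2 is general shell behaviour and can be
modelled in a few lines. This is a simplified sketch, not an analysis of
buildmaster.sh: the function names and the example paths are hypothetical,
the default IFS is assumed, and pathname expansion of the unquoted form is
ignored.

```python
def as_quoted_expansion(value):
    """Model "$VAR": the whole value is passed on as a single word."""
    return [value]


def as_unquoted_expansion(value):
    """Model $VAR: the value is split on whitespace.

    Quote characters inside the value are not interpreted again; they
    stay in the words as literal characters.
    """
    return value.split()


setting = "/path/to/haikuporter --config=/path/to/haikuports.conf"
# One word: the shell looks for a program with that entire name.
print(as_quoted_expansion(setting))
# Two words: the program path followed by one argument.
print(as_unquoted_expansion(setting))
# A value with embedded quotes and a space is still split at the space.
print(as_unquoted_expansion('haikuporter --licenses="/my dir/licenses"'))
```

This matches the observation that "$HAIKUPORTER" breaks once arguments are
added to the variable, and it shows why dropping the quotes only helps as
long as no argument contains whitespace.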
