[haiku-depot-web] Re: Dealing with Multiple Repositories and Conflicts

  • From: Stephan Aßmus <superstippi@xxxxxx>
  • To: haiku-depot-web@xxxxxxxxxxxxx
  • Date: Thu, 21 May 2015 17:00:57 +0200

Hi Andrew,

my first impression is that what you write makes complete sense. I think it
is actually crucial that the same package is shown to be contained in different
repositories. Just imagine that HaikuPorts is split into three different
repositories: development, testing and stable. Then you need to be able to
differentiate between the package versions which would be contained in each.
They all need to have their own ratings, for example.

Best regards,
-Stephan


On 21.05.2015 at 14:09, Andrew Lindesay <apl@xxxxxxxxxxxxxx> wrote:

Hello;

I am working through some planning for the changes necessary to handle
multiple repositories in HDS. This email is to give my thoughts on this and
an opportunity to discuss any problems that anybody can see.

Terms
~~~~~

First, to be clear: I distinguish between a package and a package version
like this:

Package = "apr"
Package Version = apr - 1.5.0-1 - x86_gcc2

A package has no version coordinates or arch and is identified by the name
alone. The package version is the package with coordinates and arch.
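
The distinction above can be sketched as two small types. This is only an
illustration of the terms and not the HDS data model; the class and field
names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    # Hypothetical type: a package is identified by its name alone.
    name: str


@dataclass(frozen=True)
class PackageVersion:
    # Hypothetical type: a package plus version coordinates and arch.
    package: Package
    version: str       # version coordinates, e.g. "1.5.0-1"
    architecture: str  # e.g. "x86_gcc2"

    def label(self) -> str:
        # Same layout as the "Package Version" example in the text.
        return f"{self.package.name} - {self.version} - {self.architecture}"


apr = Package("apr")
apr_version = PackageVersion(apr, "1.5.0-1", "x86_gcc2")
print(apr_version.label())
```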

Multiple Repositories - Not Really
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In a way, at the moment, HDS has "multiple repositories" configured, but
actually they are all just different architectures for "HaikuPorts". I am
planning to change the "repository" concept so that a repository is, for
example, "HaikuPorts" itself, which would then be associated with multiple
URLs to feed in from. Once this is done, it will be possible to add other
repositories, which is what I'm trying to achieve. I think this part of the
change is simple, makes sense and is fairly straightforward. No problems here!
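
As a rough sketch of that change, a repository could own a list of sources,
each with its own URL. The type names, the codes and the example.invalid URLs
below are all made up for illustration; they are not taken from HDS.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RepositorySource:
    # Hypothetical: one URL that HPKR data is fed in from.
    code: str
    url: str


@dataclass
class Repository:
    # Hypothetical: the logical repository, e.g. "HaikuPorts".
    code: str
    sources: list[RepositorySource] = field(default_factory=list)


# One logical repository, several feed URLs (placeholder URLs).
haikuports = Repository(
    "haikuports",
    [
        RepositorySource("haikuports_x86_gcc2",
                         "http://example.invalid/haikuports/x86_gcc2"),
        RepositorySource("haikuports_x86_64",
                         "http://example.invalid/haikuports/x86_64"),
    ],
)

for source in haikuports.sources:
    print(haikuports.code, "<-", source.url)
```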

Conflicting Data Between Repositories
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Where matters get complex is considering handling the inevitable situation
wherein the same package and/or same package version (determined by the
version coordinates + arch) appears in two or more different repositories.
This raises some challenges. At the moment this is _not_ allowed during
import from HPKR.

My initial reaction was to consider the Package Version as being independent
of anything found in a repository [1] and to then create an additional
structure to capture the material from the HPKR that is specific to that
Package Version as found on that specific repository URL.

After working through the implications, I decided that this approach is not
ideal. It attempts to handle the conflicts gracefully, but in doing so
creates complexity and has flaws. The main flaw would be that we cannot be
sure that different sets of maintainers across different repositories are
going to name their packages the same and we cannot be too sure that they
will version their packages the same either. So I would consider that any
attempt to unify (at least at the version-level) data between repositories is
probably going to lead to maintenance problems and anomalies.

I don't think that kind of unification makes sense.

A Better Approach
~~~~~~~~~~~~~~~~~

So my thinking now is that Package Version data and User Ratings (including
aggregates) are separated by repository -- not shared between repositories.
This keeps things logically and conceptually simple and I think it will work.

For example, if you were to search _across_ repositories and two repositories
happen to have the same Package then you will see this in the data because
two Package Versions will be returned and would be identified as belonging to
the two repositories. The system would not try to 'hide' this fact from you
by presenting this as one search result.
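
A minimal sketch of that search behaviour, assuming package versions are
stored per repository. The record type, the search function and the
"otherrepo" repository are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepositoryPackageVersion:
    # Hypothetical record: a package version as held in one repository.
    repository: str
    package: str
    version: str
    architecture: str


def search(catalogue, package_name):
    # Return every matching package version; results are not merged
    # across repositories, so duplicates stay visible to the caller.
    return [pv for pv in catalogue if pv.package == package_name]


# Made-up catalogue: "apr" is present in two repositories.
catalogue = [
    RepositoryPackageVersion("haikuports", "apr", "1.5.0-1", "x86_gcc2"),
    RepositoryPackageVersion("otherrepo", "apr", "1.5.0-1", "x86_gcc2"),
    RepositoryPackageVersion("haikuports", "examplepkg", "1.0-1", "x86_gcc2"),
]

results = search(catalogue, "apr")
for pv in results:
    print(pv.repository, pv.package, pv.version, pv.architecture)
```

Both "apr" entries come back, each labelled with its own repository.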

Packages themselves, as identified by their "name" in the HPKR data, would be
logically the *same* between repositories. Is it reasonable to assume that
repository maintainers are going to avoid Package name conflicts between
repositories? I guess if this were not the case, there would be problems at
the HaikuOS level as well. ^^ Assuming this, the following data would be
shared across repositories:

* screenshots
* icons
* prominence
* categorization
* authorization

A conflicting package version + arch within any single repository source
(URL) would be considered an error, but the same package version + arch would
be allowed in different repository sources and, by extension, in different
repositories.
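
The rule above could be checked at import time roughly as follows. This is a
sketch under assumptions: the function, the exception and the input shape (a
mapping from source URL to (name, version, arch) tuples) are invented for the
example and do not describe the actual HPKR import code.

```python
class DuplicatePackageVersionError(Exception):
    """Hypothetical error: same package version + arch twice in one source."""


def import_sources(entries_by_source):
    # entries_by_source maps a source URL to (name, version, arch) tuples.
    imported = {}
    for source_url, entries in entries_by_source.items():
        seen = set()  # reset per source: duplicates only matter within one
        for name, version, arch in entries:
            key = (name, version, arch)
            if key in seen:
                raise DuplicatePackageVersionError(
                    f"{name} - {version} - {arch} duplicated in {source_url}"
                )
            seen.add(key)
        imported[source_url] = seen
    return imported


# The same package version in two different sources is accepted.
ok = import_sources({
    "http://example.invalid/repo-a/x86_gcc2": [("apr", "1.5.0-1", "x86_gcc2")],
    "http://example.invalid/repo-b/x86_gcc2": [("apr", "1.5.0-1", "x86_gcc2")],
})
print(len(ok), "sources imported")
```

Feeding the same tuple twice under a single URL raises the error instead.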

cheers.

[1]
http://www.silvereye.co.nz/tmp/hds-img-datamodel-21may2015__DRAFT.pdf

--
Andrew Lindesay


