[mira_talk] Re: RFC: bundling or not bundling rRNA databases with MIRA

  • From: Francisco Pina Martins <f.pinamartins@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 18 Dec 2015 15:47:37 +0000

To add my 0.02€ as a maintainer of several packages in the Arch Linux AUR:
I am with Sven, and **in my opinion**, data should be split from binaries.
For what it's worth, my recommendation would be to add an optional switch to the build system that disables downloading the data (and eventually allows the user to choose where to store it).
Packagers don't have to use this switch and can simply make a split package (e.g. mira and mira-rRNA, which can be an optional dependency for mira), while users who compile from source will benefit from the build-time switch.
This is convenient for users, simple for packagers, relatively easy to implement, and saves a lot of bandwidth.
As for the binary distribution, I would argue that the ideal would be to have mira check whether the rRNA data exists (and whether it is the correct version, via, say, md5sum or equivalent) and, if not, offer to download the file (either automatically or by just providing the link). This would, of course, be more work for Bastien...
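
To make the idea concrete, here is a rough sketch of such a check in Python. This is only an illustration, not how MIRA does or should implement it: the function names, the file path, the expected checksum and the download URL are all hypothetical placeholders, and the real thing would presumably live in MIRA's own code.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rrna_data_status(path: Path, expected_md5: str) -> str:
    """Classify the data file as 'ok', 'missing' or 'mismatch'."""
    if not path.is_file():
        return "missing"
    if md5_of(path) != expected_md5.lower():
        return "mismatch"
    return "ok"


def download_hint(status: str, url: str) -> str:
    """Build the message shown to the user; empty when nothing is needed."""
    if status == "ok":
        return ""
    reason = "not found" if status == "missing" else "has an unexpected checksum"
    return f"rRNA data {reason}; it can be downloaded from {url}"
```

The sketch only covers the "just provide the link" variant; downloading automatically would replace the message with an actual fetch followed by the same checksum test.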

I hope this helps,

Francisco

On 17-12-2015 20:08, Sven Klages wrote:


2015-12-17 20:48 GMT+01:00 Peter Stockwell <peter.stockwell@xxxxxxxxxxx>:

At 10 MB, why not just include it and be done with it?



- because it's data, it is static, no software
- because in future there might be some other (bigger?) dataset, enabling MIRA to screen out XYZ and/or ABC ...

Just a matter of "design", not really a matter of disk or download capacity .. but I am a lone wolf here ;-)

best,
Sven
