[mira_talk] Re: RFC: bundling or not bundling rRNA databases with MIRA

  • From: Chris Hoefler <hoeflerb@xxxxxxxxx>
  • To: "mira_talk@xxxxxxxxxxxxx" <mira_talk@xxxxxxxxxxxxx>
  • Date: Thu, 17 Dec 2015 09:31:25 -0600

This sounds like a very nice feature! I'm curious to see how it works. ;)

For the bundling, I would suggest keeping it with the MIRA binaries, as I
suspect without the dataset mira/mirabait will cough up an error? Since it
is actually a part of MIRA in this case, and not a separate utility, it
seems appropriate to bundle. As Peter mentioned, Linux package maintainers
usually compile from source, so they can separate it out again if they wish.


On Thu, Dec 17, 2015 at 2:53 AM, Peter Cock <dmarc-noreply@xxxxxxxxxxxxx>
wrote:

Hi Bastien,

This is an interesting idea - and given it is only 10MB, I'd vote
with the other voices for bundling it with the binaries as one
download for the typical MIRA user.

I would expect the Debian Med team to want to split it out as
a separate sub-package; they already do this a bit, e.g.
https://packages.debian.org/source/sid/mira

Shall we forward this to their mailing list for comment/warning?

I presume the new rRNA database would be a self contained
file or directory? The Debian (and other Linux distro) packagers
may have some more specific guidance/preference here.

Regards,

Peter

On Thu, Dec 17, 2015 at 4:11 AM, Bastien Chevreux <bach@xxxxxxxxxxxx>
wrote:
Dear all,

I plan to release MIRA 4.9.6 soon, either shortly before Christmas or by
mid January. While the bump in version number is small, a lot has happened
behind the scenes.

One feature I have added is the ability of mira/mirabait to directly
fish for or fish out rRNA sequences, something extremely useful in
EST/RNASeq assemblies. There’s just a slight problem: the dataset for this
functionality is ~10 MB. Not several gigabytes like RFAM, Silva or other
rRNA databases, just 10 megabytes … and with that one should be able to
recognise rRNA reads for the vast majority of sequenced organisms on this
planet.

The question I currently have: do I bundle this together with the MIRA
binaries or not?

Pro:
- easy install for novices (and forgetful ppl)
- easy for package and system maintainers

Con:
- the size of the binary distributable package doubles from 10 MB to 20 MB

I’m strongly leaning towards bundling since, in today’s world, 10 MB or 20 MB
are more or less negligible sizes. However, I would like to have feedback
on this in case someone sees a larger inconvenience.

Bastien


--
You have received this mail because you are subscribed to the mira_talk
mailing list. For information on how to subscribe or unsubscribe, please
visit http://www.chevreux.org/mira_mailinglists.html
