[mira_talk] RFC: bundling or not bundling rRNA databases with MIRA

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 16 Dec 2015 23:11:21 -0500

Dear all,

I plan to release MIRA 4.9.6 soon, either shortly before Christmas or by
mid-January. While the bump in version number is small, a lot has happened
behind the scenes.

One feature I have added is the ability of mira/mirabait to directly fish for
or fish out rRNA sequences, something extremely useful in EST/RNASeq
assemblies. There’s just a slight problem: the dataset for this functionality
is ~10 MB. Not several gigabytes like RFAM, Silva or other rRNA databases, just
10 megabytes … and with that one should be able to recognise rRNA reads for the
vast majority of sequenced organisms on this planet.

The question I currently have: do I bundle this together with the MIRA binaries
or not?

Pro:
- easy install for novices (and forgetful people)
- easy for package and system maintainers

Con:
- the size of the binary distributable package doubles from 10 MB to 20 MB

I’m strongly leaning towards bundling, as in today’s world 10 MB or 20 MB are
more or less negligible sizes. However, I would like to have feedback on this
in case someone sees a larger inconvenience.

Bastien


--
You have received this mail because you are subscribed to the mira_talk mailing
list. For information on how to subscribe or unsubscribe, please visit
http://www.chevreux.org/mira_mailinglists.html
