[mira_announce] Call for testing: MIRA 3.2.1.7

Dear all,

In the last announcement regarding the development versions, I wrote this:

  These snapshots will not contain the latest bleeding edge new developments I 
  am currently working on / testing out, so they should be relatively stable
  as they most of the time went through a couple of projects on my home
  machine already.

Well, I changed my mind. In future (starting right now), some snapshots *may* 
contain code which needs more testing by a wider audience. However, these 
versions should be stable enough, as I am also using them myself on a
day-to-day basis.

Version 3.2.1.7 is such a release: I merged two development branches into the 
main branch as part of some wider restructuring of MIRA to get it to assemble 
projects in the range of 50 to 100 million reads.

For this version I'm interested in feedback on three things:
- what decrease in memory usage do you see between older versions and 3.2.1.7?
  E.g., in a project with 800k 454 FLX reads (45x coverage), it goes down from
  7.1 GiB to 5.7 GiB.
- are there significant changes in assembly metrics like the number of large
  contigs etc.? That is, are there joins MIRA suddenly misses?
- does MIRA stop or segfault where it did not in the past?
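
For the memory question, reporting both the absolute and the relative saving
makes results from different projects easier to compare. A minimal sketch in
Python, using the figures from the 454 FLX example above; the function name
memory_reduction is made up for illustration and is not part of MIRA:

```python
def memory_reduction(before_gib, after_gib):
    """Return (absolute saving in GiB, relative saving in percent)."""
    if before_gib <= 0:
        raise ValueError("before_gib must be positive")
    saved = before_gib - after_gib
    return saved, 100.0 * saved / before_gib


# Peak memory of the older version vs. 3.2.1.7, as quoted in the example above.
saved, percent = memory_reduction(7.1, 5.7)
print(f"saved {saved:.1f} GiB ({percent:.1f}%)")  # prints "saved 1.4 GiB (19.7%)"
```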


The main changes for 3.2.1.7: 
- new routines to choose and trim down SKIM hits. Saves a tremendous amount of
  disk space and memory for high coverage projects (coverages >50x),
  especially when coverages reach >100x.
- new SKIM chooser should also lead to fewer misassemblies in those rare cases
  where they still occur.
- fixed bug which made MIRA misestimate overlap scores in some cases.
- fixed bug which made MIRA misinterpret partial overlaps as complete
  overlaps.
- fixed memory allocation bug which led MIRA to perform a couple more
  allocations than necessary.
- fixed possible division by 0 (introduced in 3.2.1.6).
- temporary fix for error message "Asked for elements in start cache while not
  using startcache?" (introduced in 3.2.1.5).
- convert_project: -n now also works with read collections, i.e., it applies
  not only to contig names when loading contigs, but also to read names when
  loading data without contigs (as in all FASTA and FASTQ files, but also CAF
  and MAF files when no contigs are defined).

Other related posts:

  • » [mira_announce] Call for testing: MIRA 3.2.1.7 - Bastien Chevreux