[mira_talk] Re: MIRA disk space issue

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 15 Apr 2011 23:55:54 +0200

On Friday 15 April 2011 23:44:07 Robert Bruccoleri wrote:
>     What's new in 3.2.1.13?

The "CHANGES.txt" is your friend there :-)

Depending on which version he used (I'd guess 3.2.1), the following fixes will 
apply when it comes to disk/memory usage:

directly after 3.2.1:
- for mapping Solexa data, MIRA now reduces unnecessary mapping attempts when
  no more reads match a contig. This should save a bit of time, especially in
  projects with lots (>1000) of reference sequences.
(3.2.1.2)
- assembly of millions of EST/RNASeq now *way* faster. E.g.: 10m Solexa reads
  (100bp) down from several days to ~1 day. 
(3.2.1.4)
- improved internal handling of invalid overlaps. Effectively slashes memory
  needed there by a factor of 8 to 10. E.g.: in Solexa transcriptome 10m reads
  @ 100bp with quite some ploidy, uses just ~1 GiB instead of ~11.5 GiB
(3.2.1.7)
- fixed memory allocation bug which led MIRA to perform a few more
  allocations than necessary.
- new routines to choose and trim down SKIM hits. Saves a tremendous amount of
  disk and memory for high coverage projects (coverages >50x), especially when
  coverages reach >100x.
(3.2.1.9)
- SKIM now uses intermediate result purging, tremendously reducing size of
  result files in large assemblies.
- bugfix: SKIM sometimes saved wrong hits when there were effectively none
- bugfix: large increases of virtual memory needs between the passes have been
  cut down.
(3.2.1.11)
- bugfix for regression in 3.2.1.10: in mapping mode, SKIM made too many
  comparisons, in the order of a de-novo assembly
(3.2.1.13)
- more efficient disk I/O for hash statistics analysis
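
To see which of the entries above apply to a given installed version, the
version strings need to be compared numerically rather than as text (as plain
strings, "3.2.1.13" would sort before "3.2.1.9"). The following is only a
minimal sketch of that lookup: the function names and the FIXES table are
hypothetical and not part of MIRA, the summaries are condensed from the list
above, and since the first entry is only given as "directly after 3.2.1", the
key 3.2.1.0 is an assumed placeholder used purely for ordering.

```python
# Hypothetical helper; names and data layout are illustrative, not part of MIRA.

def parse_version(text):
    """Turn '3.2.1.13' into (3, 2, 1, 13) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

# Summaries condensed from the changelog excerpt above.
# "3.2.1.0" is NOT a real release number: it is a placeholder key for the
# entry listed as "directly after 3.2.1", so that it sorts right after 3.2.1.
FIXES = [
    ("3.2.1.0", "fewer unnecessary Solexa mapping attempts"),
    ("3.2.1.2", "faster assembly of millions of EST/RNASeq reads"),
    ("3.2.1.4", "less memory for invalid overlaps"),
    ("3.2.1.7", "memory allocation bugfix; SKIM hit trimming"),
    ("3.2.1.9", "SKIM intermediate result purging; virtual memory bugfix"),
    ("3.2.1.11", "mapping-mode SKIM regression fix"),
    ("3.2.1.13", "more efficient disk I/O for hash statistics"),
]

def fixes_gained(installed, target="3.2.1.13"):
    """Return the (version, summary) entries newer than `installed`,
    up to and including `target`."""
    lo, hi = parse_version(installed), parse_version(target)
    return [(v, s) for v, s in FIXES if lo < parse_version(v) <= hi]

if __name__ == "__main__":
    for version, summary in fixes_gained("3.2.1"):
        print(version, summary)
```

With this table, fixes_gained("3.2.1.9") would list only the 3.2.1.11 and
3.2.1.13 entries.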

B.
