[mira_talk] Re: ouch... that memory

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 18 Nov 2014 08:03:56 +0100

On 18 Nov 2014, at 2:47, Wei-Jen Chang <wchang@xxxxxxxxxxxx> wrote:
> Here your main point was that it was due to overly high coverages (200M 
> reads)? or large size of genome (150 Mb)? or Both?

Both. But read on.

> Sort of the same question I asked above... MIRA is not made to handle a 
> genome larger than?

See
http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html#sect_for_which_data_sets_to_use_mira_and_for_which_not

Though I might need to update that section a bit. I’ve used MIRA with 70 
million Illumina reads in RNA-Seq projects and know people who use it with 
>100 million reads in genome projects, but they’re very patient. They also 
get lucky, and probably have some of MIRA’s repeat filtering parameters 
tightened.

The main problem arises from the combination of repeats and number of reads, 
so it’s more or less impossible to predict a maximum genome size. Again, some 
users have reported getting “extremely good assemblies” with lower eukaryotes 
in the 100 Mb range, while I have a horrendous 40 Mb bug where MIRA fails (but 
so does just about every other short read assembler).

B.
