On 18 Nov 2014, at 2:47, Wei-Jen Chang <wchang@xxxxxxxxxxxx> wrote:
> Here your main point was that it was due to overly high coverages (200M
> reads)? or large size of genome (150 Mb)? or Both?

Both. But read on.

> Sort of the same question I asked above... MIRA is not made to handle a
> genome larger than?

See
http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html#sect_for_which_data_sets_to_use_mira_and_for_which_not

Though I might need to update that section a bit. I’ve used MIRA with 70M Illumina reads in RNASeq projects and know people who use it with >100M reads in genome projects, but they’re very patient. They also get lucky, and probably have some repeat filtering parameters tightened in MIRA.

The main problem arises from a combination of repeats and number of reads, so it’s more or less impossible to predict a maximum genome size. Again, some users reported getting “extremely good assemblies” with lower eukaryotes in the 100 Mb range, while I have a horrendous 40 Mb bug where MIRA fails (but so does about every other short read assembler).

B.