[mira_talk] Re: Mira says "killed" as last word after being almost done

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 16 May 2011 19:43:42 +0200

On May 16, 2011, at 19:14 , Adrian Pelin wrote:
> My hybrid,accurate,454,solexa,denovo assembly is on its 4th day and I was 
> wondering if that is normal. Here is a bit of info:
> 240,000 454 reads
> 2000000 solexa pair-end reads
> There is some contamination of the DNA though :( we have chunks of 
> bacterial and fungal DNA and we are only interested in the Fungal 
> mitochondrial DNA ~70-80kb.

Oh, old friends of mine: mitochondria and chloroplasts. They are inherently 
difficult, as most data sets I have seen up to now have wildly varying 
coverage and, to complicate things, most of the time contain DNA from slightly 
different mitochondria/chloroplasts. Assembly hell par excellence.

And then you have a small target (80 kb) that you sequence with tons and tons 
of reads. Ouch. Coverage >1000x? MIRA will have a hard time.
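
As a rough back-of-the-envelope check of that coverage guess, here is a small 
sketch. The read counts and the 80 kb target come from the mail above; the mean 
read lengths (400 bp for 454, 75 bp for Solexa) are purely hypothetical 
placeholders, so substitute the real values. The result is only an upper bound, 
since the contaminating bacterial and nuclear reads do not map to the 
mitochondrion.

```python
def coverage(n_reads, mean_read_len_bp, target_size_bp):
    """Nominal coverage = total sequenced bases / size of the target."""
    return n_reads * mean_read_len_bp / target_size_bp


TARGET_BP = 80_000  # ~80 kb mitochondrial genome (from the mail)

# Read counts are from the mail; read lengths are ASSUMED, not known.
cov_454 = coverage(240_000, 400, TARGET_BP)
cov_solexa = coverage(2_000_000, 75, TARGET_BP)

print(f"454:    {cov_454:.0f}x (upper bound)")
print(f"Solexa: {cov_solexa:.0f}x (upper bound)")
print(f"total:  {cov_454 + cov_solexa:.0f}x (upper bound)")
```

Even if only a fraction of the reads really belongs to the mitochondrion, these 
assumed numbers would land around or well above the 1000x mark.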

> We already have the genome assembled into 9 contigs, problem is that is still 
> a lot, we would like to reduce it further to 4-5 contigs if possible and get 
> different contigs from Mira algorithms. 
> Can anyone confirm that it is normal to have such a long wait time?

That depends on the definition of "normal". However, as a comparison, the run 
time for a small ~4.5 Mb bacterial genome with 800k 454 FLX reads and 3.5m 
Solexa reads is < 1 day (and that only because MIRA starts to build huge 
contigs of 1.5 Mb, which slows some things down tremendously ... I'm working on 
that).

MIRA *has* a hard time. That it takes so long is a sign that a lot of SNPs 
or repeat markers were found, and disentangling them is a time-consuming 
process. To assess where MIRA is: 
  grep "^Pass:" log_assembly.txt
(or whatever file you redirected the output to) and compare that to the number 
of passes with which MIRA is configured (see -AS:nop in the parameter section 
atop said file).
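
If you want that comparison scripted, here is a minimal sketch. It does the same 
thing as the grep above (collect the lines starting with "Pass:") and sets the 
count against a pass number you supply yourself after looking up -AS:nop in the 
log. It assumes one "Pass:" line per started pass and does not try to parse the 
rest of the line; the function names are made up for this example.

```python
import sys


def pass_lines(log_text):
    """Return all lines beginning with 'Pass:' (same as grep '^Pass:')."""
    return [line for line in log_text.splitlines() if line.startswith("Pass:")]


def progress(log_text, configured_passes):
    """Summarise how many passes were started versus how many are configured.

    ASSUMPTION: MIRA writes one 'Pass:' line each time a pass starts.
    configured_passes is the -AS:nop value, read off the log by hand.
    """
    started = len(pass_lines(log_text))
    return f"{started} of {configured_passes} passes started"


# Usage: python mira_progress.py log_assembly.txt <value of -AS:nop>
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], errors="replace") as handle:
        text = handle.read()
    lines = pass_lines(text)
    if lines:
        print("last marker:", lines[-1])
    print(progress(text, int(sys.argv[2])))
```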

The memory usage is less of a problem: MIRA just grabbed all it could / was 
allowed to, in order to load big tables and do less disk IO. No need to worry.

B.


