Oh I see, it did 3 passes out of 5. So another day or two. By the way,
would a solid-state drive speed up assemblies with MIRA and with other
assemblers in general?

Adrian

On Mon, May 16, 2011 at 1:43 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:

> On May 16, 2011, at 19:14, Adrian Pelin wrote:
>
> > My hybrid,accurate,454,solexa,denovo assembly is on its 4th day and I
> > was wondering if that is normal. Here is a bit of info:
> >
> > 240,000 454 reads
> > 2,000,000 Solexa paired-end reads
> >
> > There is some contamination of the DNA though :( We have chunks of
> > bacterial and fungal DNA, and we are only interested in the fungal
> > mitochondrial DNA, ~70-80 kb.
>
> Oh, old friends of mine: mitochondria and chloroplasts. They are
> inherently difficult, as most data sets I have seen up to now have a
> wildly varying coverage and, to complicate things, most of the time
> contain DNA from slightly different mitochondria/chloroplasts. Assembly
> hell par excellence.
>
> And then you have a small target (80 kb) that you sequence with tons and
> tons of reads. Ouch. Coverage >1000x? MIRA will have a hard time.
>
> > We already have the genome assembled into 9 contigs. The problem is
> > that this is still a lot; we would like to reduce it further to 4-5
> > contigs if possible and get different contigs from the MIRA algorithms.
> >
> > Can anyone confirm that it is normal to have such a long wait time?
>
> That depends on the definition of "normal". However, as a comparison, the
> run time for a small ~4.5 Mb bacterial genome with 800k 454 FLX reads and
> 3.5m Solexa reads is < 1 day (and that only because MIRA starts to build
> huge contigs of 1.5 Mb, which slows some things down tremendously ... I'm
> working on that).
>
> MIRA *has* a hard time. That it takes so long is a sign that a lot of
> SNPs or repeat markers were found, and disentangling them is a
> time-consuming process.
>
> To assess where MIRA is:
>
>   grep "^Pass:" log_assembly.txt
>
> (or whatever file you redirected the output to) and compare that to the
> number of passes with which MIRA is configured (see -AS:nop in the
> parameter section atop said file).
>
> The memory usage is less of a problem: MIRA just grabbed all it could (or
> was allowed to) in order to load big tables and do less disk I/O. No need
> to worry.
>
> B.
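
For anyone who wants to script the progress check quoted above, here is a minimal Python sketch of the same idea as the grep command. It only assumes what the thread states: that progress lines in the log start with "Pass:". The function names pass_lines and latest_pass_line are made up for illustration, and nothing here parses the text after "Pass:", because the exact layout of those lines is not shown in this thread. The configured number of passes still has to be read by hand from -AS:nop in the parameter section at the top of the log.

```python
def pass_lines(log_path):
    """Return all lines that start with 'Pass:', like grep '^Pass:' would.

    log_path is the file MIRA's output was redirected to
    (log_assembly.txt in the thread above).
    """
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if line.startswith("Pass:")]


def latest_pass_line(log_path):
    """Return the most recent 'Pass:' line, or None if there is none yet."""
    lines = pass_lines(log_path)
    return lines[-1] if lines else None
```

Calling print(latest_pass_line("log_assembly.txt")) shows the last pass marker MIRA wrote, which can then be compared with the -AS:nop value.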