On May 16, 2011, at 19:14, Adrian Pelin wrote:
> My hybrid,accurate,454,solexa,denovo assembly is on its 4th day and I was
> wondering if that is normal. Here is a bit of info:
> 240,000 454 reads
> 2,000,000 solexa paired-end reads
> There is some contamination of the DNA though :( we have chunks of
> bacterial and fungal DNA and we are only interested in the fungal
> mitochondrial DNA, ~70-80kb.

Oh, old friends of mine: mitochondria and chloroplasts. They are inherently difficult, as most data sets I have seen up to now have wildly varying coverage and, to complicate things, most of the time contain DNA from slightly different mitochondria/chloroplasts. Assembly hell par excellence.

And then you have a small target (80kb) that you sequence with tons and tons of reads. Ouch. Coverage >1000x? MIRA will have a hard time.

> We already have the genome assembled into 9 contigs, problem is that is still
> a lot, we would like to reduce it further to 4-5 contigs if possible and get
> different contigs from Mira algorithms.
> Can anyone confirm that it is normal to have such a long wait time?

That depends on the definition of "normal". However, as a comparison, the run time for a small ~4.5 Mb bacterial genome with 800k 454 FLX reads and 3.5m Solexa reads is < 1 day (and that only because MIRA starts to build huge contigs of 1.5 Mb, which slows down some things tremendously ... I'm working on that).

MIRA *has* a hard time. That it takes so long is a sign that a lot of SNPs and/or repeat markers were found, and disentangling them is a time-consuming process.

To assess where MIRA is:

  grep "^Pass:" log_assembly.txt

(or whatever file you redirected the output to) and compare that to the number of passes with which MIRA is configured (see -AS:nop in the parameter section atop said file).

The memory usage is less of a problem: MIRA just grabbed all it could / was allowed to, in order to load big tables and do less disk IO. No need to worry.

B.