Hi Bastien,

The "good" results were from V2.9.15. To make sure this effect was real, I have just assembled an old dataset with V2.9.43, using exactly the same input reads. This example uses pre-Titanium reads: a pool of two samples of relatively degraded cDNA, with an average read length of 120 bp.

Version    Number of reads    Total bases    Number of contigs
V2.9.15    169796             2865603        8540
V2.9.43    149758             6167756        24376

Looking at contigs >500 bp, V2.9.15 produced 1159 contigs and V2.9.43 produced 1298 contigs.
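
For anyone who wants to repeat the >500 bp count on their own output, here is a minimal sketch in Python. It assumes the contigs are available as a plain FASTA file; the function names and the example records are made up for illustration and are not part of Mira or cap3.

```python
from typing import Dict, Iterable


def contig_lengths(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Return {contig_name: sequence_length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for raw in fasta_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The contig name is the first word after '>'.
            parts = line[1:].split()
            name = parts[0] if parts else f"unnamed_{len(lengths)}"
            lengths[name] = 0
        elif name is not None:
            # Sequence may be wrapped over several lines.
            lengths[name] += len(line)
    return lengths


def count_longer_than(lengths: Dict[str, int], min_len: int = 500) -> int:
    """Count contigs strictly longer than min_len bases."""
    return sum(1 for n in lengths.values() if n > min_len)


# Tiny made-up example: one 800 bp contig and one 6 bp contig.
example = [">contig_a", "ACGT" * 200, ">contig_b some description", "ACGT", "AC"]
example_lengths = contig_lengths(example)
print(count_longer_than(example_lengths, 500))
```

On a real file the same functions can be fed an open file handle, for example `contig_lengths(open(path))`, where `path` points at whichever FASTA file holds the contigs.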
V2.9.15 was assembled with:

mira -project=cmb -AS:nop=7:rbl=3 -SK:pr=80 -AL:mrs=80 -FN:xtii=dummy_traceinfo.xml -GE:mxti=yes -454data -454:l454d=yes -CL:msvs=no:qc=no:bsqc=no:pvlc=no:mbc=no:emlc=no -DP:ure=no -OUT:otc=yes

and V2.9.43 was assembled with (hopefully comparable):

mira -job=denovo,est,draft,454 -project=$project -AS:nop=7:rbl=3 -SK:pr=80 -AL:mrs=80 -FN:xtii=dummy_traceinfo.xml -LR:mxti=no -LR:l454d=yes -CL:msvs=no:qc=no:bsqc=no:pvlc=no:mbc=no:emlc=no -DP:ure=no -SK:mnr=yes -OUT:otc=yes

Running cap3 on the 24376 Mira contigs from V2.9.43 produces 1167 cap3 contigs, built from 17545 of the Mira contigs. Running cap3 on the 8540 Mira contigs from V2.9.15 produces 630 cap3 contigs, built from 2277 of the Mira contigs.
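
To put the cap3 numbers side by side, the share of Mira contigs that cap3 was able to merge can be worked out directly from the counts above. This is only a small arithmetic sketch; the function and variable names are invented for illustration.

```python
def merged_fraction(mira_contigs: int, mira_contigs_used: int) -> float:
    """Fraction of Mira contigs that cap3 folded into its own contigs."""
    return mira_contigs_used / mira_contigs


# (Mira contigs, Mira contigs used by cap3, cap3 contigs), taken from this mail.
runs = {
    "V2.9.15": (8540, 2277, 630),
    "V2.9.43": (24376, 17545, 1167),
}

for version, (total, used, cap3_contigs) in runs.items():
    share = merged_fraction(total, used)
    print(f"{version}: {share:.1%} of Mira contigs merged into {cap3_contigs} cap3 contigs")
```

So roughly 27% of the V2.9.15 contigs could still be merged by cap3, against roughly 72% of the V2.9.43 contigs.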

The 454 Titanium reads for another project are of much better quality and have the length expected for the technology. The same effect can be seen there: Mira produces many contigs that cap3 can then assemble with default options. Judging by the .ace file from cap3, the cap3 assembly is reasonable, which leaves the impression that Mira needs only a single base difference to start a new contig.

Thanks for the quick response,
Richard

Bastien Chevreux wrote:

On Tuesday 02 June 2009 Richard Gregory wrote:
[...] The only clue comes from previous assemblies with earlier versions of Mira, which produced much less redundancy, i.e. there were ~8000 contigs, and now V2.9.43 produces ~18000. Mapping this onto a reference showed ~1500 contigs could be the same gene. Assembling the ~1500 contigs with cap3 produced ~3 contigs, one containing hundreds of contigs.

Hello Richard,

hmmm ... sounds funny, indeed. Could you tell me the last version of MIRA with which you get "good" results and which version gives you troubles?

I admit that I have been concentrating more on genome assemblies lately and perhaps a changed default parameter or a new algorithm behaves somewhat unexpectedly with cDNA.

Regards,
Bastien