[mira_talk] Re: assembly options for non-redundant contigs

From: Richard Gregory <R.Gregory@xxxxxxxxxxxxxxx>
To: "mira_talk@xxxxxxxxxxxxx" <mira_talk@xxxxxxxxxxxxx>
Date: Wed, 03 Jun 2009 02:00:05 +0100

Hi Bastien,

The "good" results were from V2.9.15. To make sure this effect was real, I've just tried assembling an old dataset using 2.9.43 and exactly the same input reads. This example uses pre-Titanium reads, a pool of two samples of relatively degraded cDNA with an average read length of 120 bp.


Version    Number of reads    Total bases    Number of contigs
V2.9.15    169796             2865603        8540
V2.9.43    149758             6167756        24376

Looking at contigs >500 bp, V2.9.15 produced 1159 contigs and V2.9.43 produced 1298 contigs.
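For anyone wanting to reproduce these counts on their own output, here is a minimal sketch of how the contig count, total bases, mean length and the number of contigs above a length cutoff could be computed from a FASTA file. The function names and the example file name are hypothetical and not part of Mira; the only assumption is that the contigs are available as plain FASTA.

```python
from typing import Dict, Iterable, Iterator, Tuple


def fasta_lengths(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (name, sequence_length) for each record in FASTA-formatted lines."""
    name = None
    length = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, length  # emit the previous record
            name = line[1:].split()[0] if len(line) > 1 else ""
            length = 0
        elif name is not None:
            length += len(line)  # sequence may span several lines
    if name is not None:
        yield name, length  # emit the last record


def summarise(lines: Iterable[str], min_len: int = 500) -> Dict[str, float]:
    """Summarise contig lengths; 'longer_than_min' counts contigs strictly > min_len."""
    lengths = [n for _, n in fasta_lengths(lines)]
    total = sum(lengths)
    return {
        "contigs": len(lengths),
        "total_bases": total,
        "mean_length": total / len(lengths) if lengths else 0.0,
        "longer_than_min": sum(1 for n in lengths if n > min_len),
    }


if __name__ == "__main__":
    # Tiny in-memory example; for a real run, pass an open file instead,
    # e.g. open("contigs.fasta") where the file name is hypothetical.
    demo = [">c1", "ACGT" * 100, "ACGT" * 50, ">c2", "ACGT"]
    print(summarise(demo))
```

Applied to the totals in the table above, the same arithmetic gives a mean contig length of roughly 336 bp for V2.9.15 (2865603 / 8540) and roughly 253 bp for V2.9.43 (6167756 / 24376).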


V2.9.15 was assembled with:

mira -project=cmb -AS:nop=7:rbl=3 -SK:pr=80 -AL:mrs=80 -FN:xtii=dummy_traceinfo.xml -GE:mxti=yes -454data -454:l454d=yes -CL:msvs=no:qc=no:bsqc=no:pvlc=no:mbc=no:emlc=no -DP:ure=no -OUT:otc=yes


and V2.9.43 was assembled with (hopefully comparable):

mira -job=denovo,est,draft,454 -project=$project -AS:nop=7:rbl=3 -SK:pr=80 -AL:mrs=80 -FN:xtii=dummy_traceinfo.xml -LR:mxti=no -LR:l454d=yes -CL:msvs=no:qc=no:bsqc=no:pvlc=no:mbc=no:emlc=no -DP:ure=no -SK:mnr=yes -OUT:otc=yes

Running cap3 on the 24376 Mira contigs from V2.9.43 produces 1167 cap3 contigs built from 17545 Mira contigs. Running cap3 on the 8540 Mira contigs from V2.9.15 produces 630 cap3 contigs built from 2277 Mira contigs.
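To put those cap3 numbers side by side, the short sketch below works out what fraction of the Mira contigs cap3 merged in each case, and how many sequences would remain afterwards. It assumes that every Mira contig not used by cap3 stays behind as a single unmerged sequence; the function name is hypothetical and the input figures are the ones quoted above.

```python
def merge_summary(mira_contigs: int, cap3_contigs: int, mira_contigs_merged: int) -> dict:
    """Summarise how much of a Mira assembly cap3 was able to merge.

    Assumes Mira contigs not used by cap3 remain as unmerged sequences.
    """
    unmerged = mira_contigs - mira_contigs_merged
    return {
        "fraction_merged": mira_contigs_merged / mira_contigs,
        "unmerged": unmerged,
        "sequences_after_cap3": cap3_contigs + unmerged,
    }


# (Mira contigs, cap3 contigs, Mira contigs used by cap3), taken from this mail
runs = {
    "V2.9.15": (8540, 630, 2277),
    "V2.9.43": (24376, 1167, 17545),
}

for version, counts in runs.items():
    s = merge_summary(*counts)
    print(f"{version}: {s['fraction_merged']:.1%} of Mira contigs merged, "
          f"{s['sequences_after_cap3']} sequences after cap3")
```

Under that assumption, cap3 merges about 27% of the V2.9.15 contigs but about 72% of the V2.9.43 contigs, and the two assemblies end up at 6893 and 7998 sequences respectively after cap3.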

The 454 Titanium reads for another project are of much better quality, and they are the expected length for the technology used. The same effect can be seen in these: many Mira contigs which cap3 can assemble with default options. Looking at the .ace file from cap3, the cap3 assembly is reasonable and leaves the impression that Mira only needs a single base difference to start a new contig.



Thanks for the quick response,

Richard

Bastien Chevreux wrote:

On Tuesday 02 June 2009 Richard Gregory wrote:
[...]
The only clue comes from previous assemblies with earlier versions of Mira,
which produced much less redundancy, i.e., there were ~8000 contigs, where V2.9.43
now produces ~18000. Mapping this onto a reference showed ~1500 contigs
could be the same gene. Assembling the ~1500 contigs with cap3
produced ~3 contigs, one containing hundreds of contigs.

Hello Richard,

hmmm ... sounds funny, indeed. Could you tell me the last version of MIRA with which you get "good" results and which version gives you trouble?

I admit that I have been concentrating more on genome assemblies lately, and perhaps a changed default parameter or a new algorithm behaves somewhat unexpectedly with cDNA.

Regards,
  Bastien


