Your stats look like it was not a subset.

J

Sent from my mobile device

On 2011-10-07, at 5:23 PM, Hui Sun <hsun@xxxxxxx> wrote:

> Hello,
>
> I am trying to assemble a genome with an estimated size of 3.8M. I
> have used allpath and generated an assembly. As a comparison, I'm
> trying to use MIRA assembler.
>
> I have 4 million 454 PE reads and 120 million Solexa reads. I have
> screened out adaptors by using SSAHA2.
>
> I then subset Solexa reads to 2.5 million and 454 reads to 400K, which
> is ~42x coverage for each of the platform. I then ran MIRA hybrid
> denovo assembly:
>
> mira --project=test --job=denovo,genome,normal,454,solexa
>
> The resulting contig stats seem to be really fragmented, see stats
> below. What can I do to improve scaffolding? My allpath assembly
> generated 15106 contigs, 11965 scaffolds, N50 contig size 1.9kb, which
> seems to be much better. Thanks for the help.
>
> All contigs:
> ============
> Length assessment:
> ------------------
> Number of contigs: 124259
> Total consensus: 46425879
> Largest contig: 4936
> N50 contig size: 402
> N90 contig size: 240
> N95 contig size: 206
>
> Coverage assessment:
> --------------------
> Max coverage (total): 298
> Max coverage per sequencing technology
> Sanger: 0
> 454: 304
> IonTor: 0
> PacBio: 0
> Solexa: 767
> Solid: 0
>
> --
> You have received this mail because you are subscribed to the mira_talk
> mailing list. For information on how to subscribe or unsubscribe, please
> visit http://www.chevreux.org/mira_mailinglists.html
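
For anyone wanting to sanity-check stats like the ones quoted in this thread, here is a minimal Python sketch. It compares the reported total consensus against the estimated genome size and shows how an N50/N90 figure is computed from a list of contig lengths. The function name `nxx`, the variable names, and the toy contig lengths are made up for illustration and are not part of MIRA. The only numbers taken from the message are the total consensus (46425879) and the 3.8M genome size estimate.

```python
def nxx(lengths, fraction=0.5):
    """Return the contig length L such that contigs of length >= L
    together cover at least `fraction` of the total assembled bases
    (fraction=0.5 gives N50, fraction=0.9 gives N90)."""
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("no contig lengths given")


# Figures quoted in the message above.
total_consensus = 46425879      # "Total consensus" from the MIRA stats
estimated_genome = 3_800_000    # "estimated size of 3.8M"

# How many times larger than the expected genome the assembly is.
inflation = total_consensus / estimated_genome
print(f"assembly is about {inflation:.1f}x the estimated genome size")

# Toy contig lengths (hypothetical, only to show the N50/N90 calculation).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print("toy N50:", nxx(toy_contigs, 0.5))
print("toy N90:", nxx(toy_contigs, 0.9))
```

A total consensus roughly twelve times the expected genome size, together with maximum coverages far above the intended ~42x, may be one reason to double-check which read files actually went into the assembly.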