[mira_talk] Denovo hybrid assembly of a 3.8M genome using 454 and Solexa

  • From: Hui Sun <hsun@xxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 7 Oct 2011 14:23:57 -0700

Hello,

I am trying to assemble a genome with an estimated size of 3.8 Mb.  I
have already generated an assembly with ALLPATHS.  As a comparison, I'm
now trying the MIRA assembler.

I have 4 million 454 paired-end (PE) reads and 120 million Solexa
reads. I screened out adaptors using SSAHA2.

I then subsampled the Solexa reads down to 2.5 million and the 454
reads down to 400K, which is ~42x coverage for each platform. I then
ran a MIRA hybrid de novo assembly:

  mira --project=test --job=denovo,genome,normal,454,solexa
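
As a side note on the ~42x figure: expected coverage is just (number of reads x mean read length) / genome size. The post does not state read lengths, so the sketch below only works backwards from the numbers given (3.8 Mb genome, 2.5 million Solexa reads, 400K 454 reads, 42x) to the mean read length each subset would need. The function names are made up for illustration.

```python
# Minimal sketch of the coverage arithmetic. Read lengths are NOT given in
# the post; the implied lengths below are derived from the stated 42x only.

def coverage(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Expected fold coverage = total sequenced bases / genome size."""
    return n_reads * mean_read_len / genome_size


def implied_read_length(n_reads: int, genome_size: int, target_cov: float) -> float:
    """Mean read length needed to reach target_cov with n_reads reads."""
    return target_cov * genome_size / n_reads


GENOME_SIZE = 3_800_000  # estimated genome size from the post (3.8 Mb)

solexa_len = implied_read_length(2_500_000, GENOME_SIZE, 42)
ls454_len = implied_read_length(400_000, GENOME_SIZE, 42)

print(f"Implied mean Solexa read length: {solexa_len:.1f} bp")
print(f"Implied mean 454 read length:    {ls454_len:.1f} bp")
```

If the real mean read lengths differ from these implied values, the actual coverage per platform differs from 42x by the same factor.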

The resulting assembly is really fragmented; see the stats below.
What can I do to improve scaffolding?  My ALLPATHS assembly produced
15106 contigs and 11965 scaffolds with an N50 contig size of 1.9 kb,
which seems to be much better.  Thanks for the help.

All contigs:
============
  Length assessment:
  ------------------
  Number of contigs:    124259
  Total consensus:      46425879
  Largest contig:       4936
  N50 contig size:      402
  N90 contig size:      240
  N95 contig size:      206

  Coverage assessment:
  --------------------
  Max coverage (total): 298
  Max coverage per sequencing technology
        Sanger: 0
        454:    304
        IonTor: 0
        PacBio: 0
        Solexa: 767
        Solid:  0
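
For anyone comparing these numbers against another assembler's output: N50 is the contig length at which the contigs of that length or longer hold at least half of the total assembled bases (N90 and N95 likewise for 90% and 95%). The sketch below uses that common definition; whether MIRA breaks ties in exactly the same way is an assumption, and the contig lengths in the example are made up. It also shows the ratio of total consensus to estimated genome size, using only the two figures from this post.

```python
# Minimal sketch of the usual Nx definition (N50, N90, ...).
# Assumption: this matches how the assembler reports it; toy data only.

def nx(contig_lengths: list[int], x: int) -> int:
    """Return the Nx contig size: walking contigs from longest to shortest,
    the length at which the running sum first reaches x% of the total."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length
    return min(contig_lengths)  # not reached for 0 <= x <= 100


toy_lengths = [100, 200, 300, 400, 500]  # made-up example lengths
print("N50:", nx(toy_lengths, 50))
print("N90:", nx(toy_lengths, 90))

# Sanity check using the figures reported in this post.
total_consensus = 46_425_879   # "Total consensus" from the stats above
estimated_genome = 3_800_000   # estimated genome size (3.8 Mb)
print(f"Consensus / genome size: {total_consensus / estimated_genome:.1f}x")
```

A total consensus far above the estimated genome size is worth checking alongside N50 when comparing two assemblies, since N50 alone says nothing about how much sequence was assembled.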

--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
