[mira_talk] Re: large number of contigs

  • From: Nick Heng <nicholas.heng@xxxxxxxxxxx>
  • To: "mira_talk@xxxxxxxxxxxxx" <mira_talk@xxxxxxxxxxxxx>
  • Date: Thu, 1 Nov 2012 03:33:19 +0000

It could be repeat sequences - if there are a lot of insertion sequences of 
different types (and variants), they might cluster together as single contigs. 
Poor-quality sequence reads can also be a reason. It looks like a bacterial 
genome (9.x Mb). I tried MIRA 3.4 (part of Bio-Linux 6) with data from an old 
454 run - the archaic GS-FLX (not Titanium) with 248 bp average read length - 
and got 600+ contigs (280+ with Newbler); it was a really crap run. I got much 
better results from MIRA with Titanium data (30-fold coverage, 40-50 contigs).
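
As a rough sanity check on the coverage figures in this thread, average fold 
coverage is just total sequenced bases divided by genome size. A minimal 
sketch follows; the function names are made up for illustration, and the 
example values (9 Mb genome, 248 bp reads) are assumptions taken loosely from 
the numbers above, not from the original poster's actual data.

```python
import math


def fold_coverage(n_reads, mean_read_len, genome_size):
    """Average fold coverage: total sequenced bases / genome size."""
    return n_reads * mean_read_len / genome_size


def reads_needed(target_coverage, mean_read_len, genome_size):
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size / mean_read_len)


if __name__ == "__main__":
    genome_size = 9_000_000   # assumed: "9.x Mb" rounded down to 9 Mb
    mean_read_len = 248       # assumed: GS-FLX average read length from above

    # Reads required for the suggested 40-fold versus the reported 17-fold
    for target in (17, 40):
        n = reads_needed(target, mean_read_len, genome_size)
        print(f"{target}-fold needs about {n} reads")
```

This ignores quality clipping and uneven coverage, so the real number of 
usable reads needed will be somewhat higher.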

================================
Nicholas (Nick) C.K. Heng, Ph.D.
Department of Oral Sciences
Faculty of Dentistry
University of Otago
P.O. Box 647
Dunedin 9054
NEW ZEALAND.
Ph: +643 4799254
Fx: +643 4797078
================================

________________________________________
From: mira_talk-bounce@xxxxxxxxxxxxx [mira_talk-bounce@xxxxxxxxxxxxx] on behalf 
of Bastien Chevreux [bach@xxxxxxxxxxxx]
Sent: Thursday, 1 November 2012 12:33 p.m.
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: large number of contigs

On Oct 31, 2012, at 20:43 , John Nash <john.he.nash@xxxxxxxxx> wrote:
> 17-fold coverage for 454 is way too low. Try 40-fold.

Agreed, that would be the main reason.

Though 703 contigs still seems high.

B.

--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html

