It could be repeat sequences: if there are a lot of insertion sequences of different types (and variants), they might cluster as single contigs. Poor-quality sequence reads can be another reason.

Looks like a bacterial genome (9.x Mb). I tried MIRA 3.4 (part of Bio-Linux 6) with data from an old 454 run on the archaic GS-FLX (not Titanium), with 248 bp average read length. I got 600+ contigs (280+ with Newbler), but it was a really crap run. I got much better results from MIRA with Titanium data (30-fold coverage, 40-50 contigs).

================================
Nicholas (Nick) C.K. Heng, Ph.D.
Department of Oral Sciences
Faculty of Dentistry
University of Otago
P.O. Box 647
Dunedin 9054
NEW ZEALAND.
Ph: +643 4799254
Fx: +643 4797078
================================

________________________________________
From: mira_talk-bounce@xxxxxxxxxxxxx [mira_talk-bounce@xxxxxxxxxxxxx] on behalf of Bastien Chevreux [bach@xxxxxxxxxxxx]
Sent: Thursday, 1 November 2012 12:33 p.m.
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: large number of contigs

On Oct 31, 2012, at 20:43, John Nash <john.he.nash@xxxxxxxxx> wrote:
> 17-fold coverage for 454 is way too low. Try 40-fold.

Agreed, that would be the main reason. Though 703 contigs still seems high.

B.

--
You have received this mail because you are subscribed to the mira_talk mailing list. For information on how to subscribe or unsubscribe, please visit http://www.chevreux.org/mira_mailinglists.html