[mira_talk] Re: N50 decrease while sequencing depth increase ?

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 4 Aug 2009 20:27:17 +0200

On Tuesday 04 August 2009 Brian Forde wrote:
> Not sure if this is relevant, but I believe that with 454 sequences the
> assembly can become saturated at a certain point, and that this can depend
> on the size of the genome. MWG did a presentation on it at FEMS this year.
> They showed that with a 2.7 Mb genome the assembly becomes saturated at a
> coverage of 21-22x; above this threshold the assembly quality decreases,
> i.e. you in fact get more contigs. Also, a representative of GATC said
> something similar at a seminar where I work recently.
> I am currently looking for papers on this and will pass them on if/when I
> find them.

I know the claims from MWG and GATC and they're mostly right. The problem is 
more one of computational complexity: the number of overlaps grows roughly with 
the square of the coverage, so for non-repetitive sequence at coverage 100 the 
number of overlaps to check is ~16 times higher than at coverage 25.
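
A back-of-the-envelope sketch of where that factor comes from. This is a
simplified model and not how MIRA actually counts overlaps: it assumes reads of
one fixed length placed uniformly on a repeat-free genome, and it counts any
two reads whose start positions are less than one read length apart as an
overlap. The function name and the read length of 250 are assumptions made for
illustration; the 2.7 Mb genome size is taken from the quoted mail.

```python
def expected_overlaps(genome_size, read_length, coverage):
    """Approximate number of overlapping read pairs in a uniform model.

    Illustrative assumptions: fixed read length, uniform random start
    positions, no repeats, any overlap of at least one base counts.
    """
    n_reads = coverage * genome_size / read_length
    # A read overlaps every read starting within read_length on either
    # side of it, which is about 2 * coverage neighbours.
    neighbours_per_read = 2 * coverage
    # Divide by 2 so that each pair is counted once.
    return n_reads * neighbours_per_read / 2


low = expected_overlaps(2_700_000, 250, 25)
high = expected_overlaps(2_700_000, 250, 100)
print(high / low)  # 4x the coverage gives 4^2 = 16x the overlaps
```

Both the number of reads and the number of neighbours per read grow linearly
with coverage, so their product grows with the square.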

For repeats, things then get *really* ugly. Even a small and easy bacterium 
with, say, 10 rRNA islands suddenly generates overlap numbers that are a major 
pain in both time and space requirements.
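
To get a rough feeling for the repeat effect, here is a hedged sketch using
the same kind of uniform model. It assumes the repeat copies are identical,
ignores reads that hang over the repeat boundaries, and uses a made-up repeat
length of 5000 and read length of 250; the function names are hypothetical
and none of this reflects MIRA internals.

```python
def unique_overlaps(total_length, read_length, coverage):
    """Overlapping read pairs for non-repetitive sequence (uniform model)."""
    n_reads = coverage * total_length / read_length
    return n_reads * (2 * coverage) / 2


def repeat_overlaps(copies, repeat_length, read_length, coverage):
    """Overlapping read pairs among reads drawn from identical repeat copies."""
    # Reads coming from all copies of the repeat (edge effects ignored).
    n_reads = copies * coverage * repeat_length / read_length
    # Reads from every copy look alike, so the apparent local coverage is
    # copies * coverage and each read sees about 2 * copies * coverage
    # neighbours.
    neighbours_per_read = 2 * copies * coverage
    return n_reads * neighbours_per_read / 2


copies, repeat_length, read_length, coverage = 10, 5000, 250, 100
in_repeats = repeat_overlaps(copies, repeat_length, read_length, coverage)
as_unique = unique_overlaps(copies * repeat_length, read_length, coverage)
print(in_repeats / as_unique)  # inflation factor equals the copy number
```

In this model the same amount of sequence produces as many times more overlaps
as there are copies, on top of the quadratic growth with coverage.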

On the other hand, what Johann describes looks different ... let's see.

