On Tuesday 04 August 2009 Brian Forde wrote:
> Not sure if this is relevant.
> I believe that with 454 sequences the assembly can become saturated at a
> certain point, and I believe that this can depend on the size of the genome.
> MWG did a presentation on it at FEMS this year. They showed that with a 2.7
> Mb genome the assembly becomes saturated at a coverage of 21-22x; above this
> threshold the assembly quality decreases, i.e. you in fact get more contigs.
> Also, a representative of GATC said something similar at a seminar where I
> work recently.
> I am currently looking for papers on this and will pass them on if/when I
> find them.

I know the claims from MWG and GATC and they're mostly right. The problem is
more due to computational complexity: for non-repetitive sequence and a
coverage of 100, the number of overlaps to check is ~16 times higher than with
coverage 25.

For repeats, things then get *really* ugly. Even a small and easy bacterium
with, say, 10 rRNA islands suddenly generates overlap numbers that are a major
pain in both time and space requirements.

On the other hand, what Johann describes looks different ... let's see.

Regards,
  Bastien
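
The ~16x figure for coverage 100 versus coverage 25 is what one would expect if
the number of overlaps grows roughly with the square of the coverage. Below is
a minimal sketch of that arithmetic, not a description of how MIRA counts
overlaps internally. It assumes a repeat-free genome, uniformly placed reads of
equal length, and that each read overlaps about 2 * coverage neighbours. The
function names and the 250 bp read length are hypothetical choices for
illustration; the 2.7 Mb genome size is taken from the quoted message.

```python
def expected_overlaps(genome_len, read_len, coverage):
    """Rough number of read-read overlaps in a repeat-free genome.

    Assumption: reads are uniformly placed and all have the same length.
    """
    n_reads = coverage * genome_len / read_len
    # A read overlaps every read that starts within one read length on
    # either side of its own start: about 2 * coverage neighbours.
    neighbours_per_read = 2 * coverage
    # Divide by 2 so that each overlapping pair is counted only once.
    return n_reads * neighbours_per_read / 2


def overlap_ratio(cov_high, cov_low, genome_len=2_700_000, read_len=250):
    """Factor by which the overlap count grows from cov_low to cov_high."""
    high = expected_overlaps(genome_len, read_len, cov_high)
    low = expected_overlaps(genome_len, read_len, cov_low)
    return high / low


print(overlap_ratio(100, 25))  # 16.0
```

Genome size and read length cancel out in the ratio, so under these assumptions
the factor depends only on (100 / 25) ** 2. Repeats break the model: reads from
different copies of a repeat also overlap each other, which pushes the real
numbers well above this estimate.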