[mira_talk] N50 decreases while sequencing depth increases?

  • From: "johann JOETS" <joets@xxxxxxxxxxxxxx>
  • To: <mira_talk@xxxxxxxxxxxxx>
  • Date: Mon, 3 Aug 2009 18:00:51 +0200

Dear Bastien,

To get a better idea of which sequencing strategy to choose for a piece of
plant genome, I ran some simulations.

First I created a 150 kb sequence with RANDNA (GC content 46%). A dot plot
analysis did not show any repeated fragments.

Next I simulated 454 reads with the MetaSim software. I simulated 5X, 10X,
15X, 25X, 50X and 100X datasets and then assembled each of them with MIRA.
Parameters were: -job=denovo,genome,normal,454 -notraceinfo -noclipping
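
For reference, the relation between target depth and number of simulated
reads can be sketched as below. This is only a minimal illustration: the
150 kb length comes from the description above, but MEAN_READ_LEN is an
assumed value (the read length used in the simulation is not stated), and
reads_needed is a hypothetical helper, not part of MetaSim or MIRA.

```python
def reads_needed(genome_len, depth, mean_read_len):
    """Smallest read count N such that N * mean_read_len / genome_len >= depth."""
    total_bases = depth * genome_len
    # Integer ceiling division, so the target depth is reached or exceeded.
    return (total_bases + mean_read_len - 1) // mean_read_len


GENOME_LEN = 150_000   # 150 kb simulated sequence (from the text above)
MEAN_READ_LEN = 250    # ASSUMPTION: mean 454 read length, not given in the post

for depth in (5, 10, 15, 20, 25, 50):
    n = reads_needed(GENOME_LEN, depth, MEAN_READ_LEN)
    print(f"{depth}X -> {n} reads")
```

With a different mean read length the read counts change proportionally,
so the numbers printed here are only indicative.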

The N50 values are as follows:
Depth   N50     Avg. cov.
5X      257     0
10X     5419    8.95
15X     111471  13.74
20X     108664  18.29
25X     27526   22.21
50X     446     40.08
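
For readers unfamiliar with the statistic, here is a minimal sketch of how
N50 is commonly computed from a list of contig lengths. It uses the usual
definition (the length of the contig at which the cumulative length of the
contigs, sorted from longest to shortest, first reaches half of the total
assembly length); MIRA's own reporting may differ in details. The function
name n50 and the example lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # First contig at which the cumulative length covers half the assembly.
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths, not taken from the assemblies above.
example = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example))
```

Because N50 depends only on the contig length distribution, an assembly
that breaks into many short contigs gets a low N50 even when the total
assembled length is unchanged.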

As you can see, the average coverage is roughly as expected. However, I
was surprised to see the N50 decrease for datasets deeper than 15X. The
same is true for the length of the largest contig.
Since I know where the reads should have been assembled, I can check the
assembly quality (roughly, I count breakages in contigs). According to these
tests, the quality of the assembly also drops.
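
One possible way to count such breakages is sketched below. This is an
illustration under stated assumptions, not necessarily the check used for
the results above: it assumes that the true start position of every
simulated read on the 150 kb sequence is known, and that the reads of a
contig are listed in the order in which they are laid out in that contig.
The function count_breaks, the max_gap threshold and the example positions
are all hypothetical.

```python
def count_breaks(true_starts, max_gap):
    """Count adjacent read pairs whose true origins lie more than max_gap apart.

    true_starts: true start positions (on the reference) of the reads of one
                 contig, in contig layout order.
    max_gap:     ASSUMPTION: largest distance between the true starts of two
                 neighbouring reads that is still considered consistent.
    """
    breaks = 0
    for left, right in zip(true_starts, true_starts[1:]):
        # abs() so that contigs assembled in reverse orientation are handled too.
        if abs(right - left) > max_gap:
            breaks += 1
    return breaks


# Hypothetical contig: one jump from about 220 to 90000 on the reference.
print(count_breaks([0, 100, 220, 90000, 90100], max_gap=500))
```

A suitable max_gap depends on the read length, since two reads that truly
overlap cannot start further apart than one read length.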

Maybe I did something wrong when using MIRA?
Hope you will have an idea about this.

Many thanks for your help.

Johann




