[mira_talk] Re: unrecognized option

  • From: Andrej Benjak <abenjak@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 01 Oct 2012 09:35:33 +0200

Hi Adrian,

You still didn't tell us the size of the genome (to me "very large" means Gb), but even assuming you have something no bigger than Arabidopsis (120 Mb) and that you are doing /de novo/ assembly, only 1/4 of a 454 plate is very limited.

For larger genomes, de novo assembly and ONLY 454 data, I think newbler is the way to go (in my humble experience). Because you have PEs, newbler will give you scaffolds, which are more important than contig size (unless you are interested in intergenic or repetitive regions). But if the coverage is very low the results won't be great, no matter which program you use.
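
To put a rough number on "very low" coverage, here is a minimal sketch. It assumes the roughly 270k reads mentioned further down in this thread, the 120 Mb Arabidopsis-sized genome used as an example above, and a mean read length of 400 bp. The read length is an assumption for illustration only; nobody in the thread states it.

```python
def expected_coverage(n_reads, mean_read_len_bp, genome_size_bp):
    """Average depth of coverage: total sequenced bases divided by genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp


n_reads = 270_000             # about 1/4 plate, as reported in the thread
mean_read_len_bp = 400        # ASSUMPTION: not stated anywhere in the thread
genome_size_bp = 120_000_000  # Arabidopsis-sized example from the message above

coverage = expected_coverage(n_reads, mean_read_len_bp, genome_size_bp)
print(f"{coverage:.2f}x")
```

Under these assumptions the depth comes out below 1x, which is why the choice of assembler matters less than the amount of data here.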

It's a good idea to use different assemblers and compare the results, though...

Andrej

On 29/09/12 23:53, Adrian Pelin wrote:
A 1/4 plate of 454 where we do it is about 270k reads. Since they are paired end, that number doubled when I extracted them from the .sff, so I'm at about half a million now.

What exactly is template size?

On 9/29/2012 5:49 PM, Bastien Chevreux wrote:
On Sep 29, 2012, at 23:09 , Adrian Pelin wrote:
We have 1/4 plate of 454 for a very large genome
Umm, could you define "very large" in genome length? Maybe also number of reads?

and for some odd reason unknown to me we did paired end 454 with an 8kb insert.
Well, not that odd. Standard approach atm would be shotgun + 3kb + 7-10kb, so having an 8kb library looks OK to me.

Here comes the big one: how do I tell mira it is paired end and that it is an 8kb insert? Any way?
segment_placement = SB
template_size = 6000 10000

Note that "6000 10000" is a wild guess of mine (should not be too far off though), your sequencing provider should be able to tell you what it normally is.
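
As a rough illustration of what the template size range means: the template is the DNA fragment that both reads of a pair come from, so in an assembly the distance between the outermost ends of the two reads should fall inside the given range. A minimal sketch follows; the function names and coordinates are hypothetical, and this is not a description of how MIRA performs the check internally.

```python
def template_span(read1_start, read1_end, read2_start, read2_end):
    """Distance between the outermost ends of two reads placed on the same contig."""
    positions = (read1_start, read1_end, read2_start, read2_end)
    return max(positions) - min(positions)


def fits_template(span, min_size=6000, max_size=10000):
    """True if the observed span lies within the expected template size range."""
    return min_size <= span <= max_size


# Made-up placements of the two reads of one pair on a contig.
span = template_span(1000, 1400, 8600, 9000)
print(span, fits_template(span))
```

A pair whose span falls well outside 6000 to 10000 would hint at a misassembly or a wrong size estimate for the library.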

B.


