Hello Greg,

I checked the repeats in the most closely related genome (honey bee). The biggest repeat was around 2000 bp.

Kind regards,
Filip

From: mira_talk-bounce@xxxxxxxxxxxxx [mailto:mira_talk-bounce@xxxxxxxxxxxxx] On Behalf Of Gregory Harhay
Sent: Thursday, 10 December 2009 18:44
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: 500Mb assembly

Filip:

How big are the repetitive regions in the insect genome? If they are larger than 3000 nt, your paired-end reads will only take you so far. Your contigs won't scaffold completely, no matter which assembler and scaffolder you use, because they will be chopped up by the unbridgeable repetitive regions, unless, of course, there is some other physical evidence joining the contigs.

Greg

On 12/10/09 9:39 AM, "Filip Van Nieuwerburgh" <Filip.VanNieuwerburgh@xxxxxxxx> wrote:

Dear,

I have been asked to de novo assemble an insect genome. Until now, I have only assembled bacterial genomes de novo (max 5 Mb). MIRA proved to be the best assembler (in my hands) for these assemblies.

What would be the best strategy to de novo assemble the 500 Mb insect genome, also taking into account that I cannot wait more than a month for MIRA to finish?

I have:
- 100E6 reads of 2x100 bp Illumina data
- 1 full Roche Titanium run: paired-end insert size = 3000 bp
- A 16-core 64-bit Linux machine with 128 GB of RAM

Thank you!

Kind regards,
Filip
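
For reference, the numbers in this thread can be checked with a quick back-of-the-envelope calculation. The Python sketch below is only an illustration: the function names are hypothetical, the bridging test is a deliberate simplification (a repeat counts as bridgeable if it is shorter than the nominal insert size, ignoring insert-size variation and the unique sequence needed on both flanks), and it is an assumption whether "100E6 reads of 2x100 bp" means 100E6 pairs or 100E6 individual reads, so both cases are computed.

```python
def fold_coverage(n_reads: int, read_len_bp: int, genome_size_bp: int) -> float:
    """Expected average depth: total sequenced bases / genome size."""
    return n_reads * read_len_bp / genome_size_bp


def insert_spans_repeat(insert_size_bp: int, repeat_len_bp: int) -> bool:
    """Simplified test: a read pair can only bridge a repeat that is
    shorter than the insert size (real data also needs unique flanks)."""
    return repeat_len_bp < insert_size_bp


GENOME_SIZE_BP = 500_000_000   # 500 Mb insect genome (from the thread)
READ_LEN_BP = 100              # 2x100 bp Illumina reads (from the thread)
INSERT_SIZE_BP = 3000          # Titanium paired-end insert size (from the thread)
LARGEST_REPEAT_BP = 2000       # largest honey bee repeat (from the thread)

# Assumption A: 100E6 read pairs, i.e. 200E6 individual reads.
coverage_if_pairs = fold_coverage(200_000_000, READ_LEN_BP, GENOME_SIZE_BP)

# Assumption B: 100E6 individual reads in total.
coverage_if_reads = fold_coverage(100_000_000, READ_LEN_BP, GENOME_SIZE_BP)

bridgeable = insert_spans_repeat(INSERT_SIZE_BP, LARGEST_REPEAT_BP)

print(f"Illumina coverage if 100E6 pairs: {coverage_if_pairs:.0f}x")
print(f"Illumina coverage if 100E6 reads: {coverage_if_reads:.0f}x")
print(f"Largest repeat bridgeable by 3000 bp inserts: {bridgeable}")
```

Under these assumptions the Illumina data gives roughly 20x to 40x coverage, and a 2000 bp repeat falls below the 3000 bp threshold Greg mentions, provided the target genome resembles the honey bee in this respect.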