[mira_talk] Assembly rearrangements in the face of repeats

From: Robert Bruccoleri <bruc@xxxxxxxxxxxxxxxxxxxxx>
To: mira_talk@xxxxxxxxxxxxx
Date: Sun, 07 Aug 2011 22:39:50 -0400

I have an interesting and difficult assembly that I'm attempting with Mira. I'm working with a bacterium that has a large number of Nonribosomal Peptide Synthetases (NRPS) and Polyketide Synthases (PKS), and many domain and gene duplications have occurred during the course of evolution. The bacterium has a GC content in excess of 70%.

I have one gene in this bacterium that has a large number of domains, some of which are exactly duplicated (>500 bp) within the gene. From the chemical structure of the compound made by this gene, I have a good idea of what the domain structure ought to be.

We have an extensive collection of data, both 454 and Illumina, for this bacterium. For Illumina, we have paired-end data of various lengths. I've been experimenting with different combinations of data to see if I can get a complete assembly of the gene of interest described above.

Just recently, I started a 'normal' Mira run using 3.4rc2, and I enabled intermediate FASTA output at every pass. On the second pass, Mira generated my gene with the expected pattern of domains. However, on succeeding passes, it eliminated some of the repetitive sequences, and at the end of the run, I had lost about 30% of the expected domains.
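
One way to pin down the pass at which the repeats collapse is to count exact copies of a known domain sequence in each intermediate FASTA file. The following is only a minimal Python sketch under stated assumptions: the function names (read_fasta, count_occurrences, domain_copies) are hypothetical helpers, not part of Mira, and it only finds exact matches on either strand, so it would miss copies that differ by even one base.

```python
import re

# Complement table for the unambiguous bases plus N; other IUPAC codes
# are left unchanged (an assumption made to keep the sketch short).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {record name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_occurrences(contig, query):
    """Count exact, possibly overlapping matches of query on both strands."""
    query = query.upper()
    total = 0
    # A set avoids double counting when the query is its own reverse complement.
    for q in {query, reverse_complement(query)}:
        total += len(re.findall("(?=" + re.escape(q) + ")", contig))
    return total


def domain_copies(fasta_text, domain_seq):
    """Map each contig name to the number of exact copies of domain_seq."""
    contigs = read_fasta(fasta_text)
    return {name: count_occurrences(seq, domain_seq)
            for name, seq in contigs.items()}
```

Reading each pass's FASTA file into a string and calling domain_copies on it with the same domain sequence would give a per-pass copy count, which should show where the count first drops below the number expected from the compound's structure.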

Has anyone else run into issues like these? How can I control the decision making with regard to repeats? Is there any way of having Mira report a graph of the possible assemblies (like Allpaths)? (BTW, I don't have data that is suitable for Allpaths.)


Thanks. --Bob

begin:vcard
fn:Robert Bruccoleri
n:Bruccoleri;Robert
org:Audacious Energy, LLC and Congenomics, LLC
adr:;;;;;;USA
email;internet:bruc@xxxxxxx
title:President
version:2.1
end:vcard

Follow-Ups:
- [mira_talk] Re: Assembly rearrangements in the face of repeats
  - From: Bastien Chevreux
