[mira_talk] Re: allow for more polymorphisms

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 7 Dec 2009 19:18:18 +0100

On Monday 07 December 2009 Oscar Franzén wrote:
> I'm trying to assemble a highly polymorphic genome using MIRA, but my
> problem is that the final genome created by MIRA is almost twice the
> haploid size of the genome (output size of all contigs is around 24Mbp
> but the haploid genome size should be around 12 Mbp). I think that the
> high polymorphism rate causes MIRA to duplicate all contigs. So my
> question is, is there an option/parameter in MIRA to increase the
> allowed polymorphism rate?
> 
> I'm using the latest version of MIRA and the data type is 454 and I'm
> running a 'normal' assembly with default parameters.

I suspect that this falls under the no-can-do category. Technically speaking, 
sequences from the different haplotypes that carry base differences look very 
much like a repeat to an assembler. In a perfect world, with even coverage 
everywhere, one could distinguish them by looking at the coverage, but ... I 
dare say that this would be a pretty bold thing to try with actual data (be it 
Sanger, 454 or Solexa).
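
To make the coverage idea concrete, here is a minimal sketch in Python. It is 
not something MIRA does or exposes; the function name, the tolerance and the 
numbers are hypothetical. The assumption is that two haplotypes stacked in one 
contig sit at roughly the expected coverage, while two collapsed repeat copies 
sit at roughly twice that.

```python
def classify_column(local_coverage, expected_coverage, tolerance=0.25):
    """Guess what a column with conflicting bases is, from coverage alone.

    Hypothetical helper: assumes perfectly even coverage, which real
    Sanger, 454 or Solexa data rarely delivers.
    """
    ratio = local_coverage / expected_coverage
    if abs(ratio - 1.0) <= tolerance:
        # Normal depth: both haplotypes of one locus stacked together.
        return "polymorphism"
    if ratio >= 2.0 - tolerance:
        # About double depth or more: several genomic copies collapsed.
        return "repeat"
    # Anything in between cannot be called from coverage alone.
    return "ambiguous"


print(classify_column(21, 20))
print(classify_column(43, 20))
print(classify_column(30, 20))
```

With uneven real-world coverage, most columns would end up in the "ambiguous" 
bin, which is the reason for the scepticism above.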

The master switch for the recognition of repeat marker bases is -CO:mr; 
setting it to no turns the recognition off. But then a whole new range of 
problems arises and you will need to tighten a few other parameters.

If you really want to try, these would be the parameters I'd change (untested, 
you might want to test on a smaller data set first): in Smith-Waterman 
alignments, crank up the minimum relative score (-AL:mrs) to 90 or 95% for all 
sequencing technologies used. In the contig parameters, switch off the marking 
of repeats (-CO:mr=no) but at the same time you absolutely must make the 
contig assembly more strict: decrease -CO:rodirs to 10 (or perhaps even some 
value between 5 and 10) for all sequencing technologies used.
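
The three settings above can be collected into one command line. The sketch 
below (Python, only builds and prints the argument list, it does not run MIRA) 
is an illustration under assumptions: the project name "mygenome" is 
hypothetical, the --project/--job form is assumed from the MIRA 3 command line, 
and whether -AL:mrs and -CO:rodirs have to be placed in a technology-specific 
section depends on the MIRA version, so check the manual before using it.

```python
def mira_polymorphism_args(project, min_rel_score=90, rodirs=10):
    """Build a MIRA argument list with the settings suggested in this mail.

    Only -AL:mrs, -CO:mr and -CO:rodirs come from the mail itself; the
    --project/--job part is an assumed MIRA 3 style invocation.
    """
    return [
        "mira",
        f"--project={project}",            # hypothetical project name
        "--job=denovo,genome,normal,454",  # assumed: 'normal' 454 assembly
        f"-AL:mrs={min_rel_score}",        # minimum relative score, 90 or 95
        "-CO:mr=no",                       # switch off marking of repeats
        f"-CO:rodirs={rodirs}",            # stricter contig building, 5 to 10
    ]


print(" ".join(mira_polymorphism_args("mygenome")))
```

Trying this first on a small subset of the reads, as suggested above, keeps the 
turnaround short while testing values such as mrs=95 or rodirs=5.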


I think you will still run into some trouble with repeats, but it's worth a 
try.

Regards,
  Bastien

