[mira_talk] MIRA and gap penalty

  • From: Laurent Abi-Rached <Laurent.Abi-Rached@xxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 31 Oct 2012 10:31:20 +0100

Hello,

I have a question about how gap penalties are set in MIRA. While running a batch of tests, I noticed that MIRA sometimes favors gaps over mismatches: for example, instead of generating alignment #1 below, with two mismatches and no gaps, MIRA creates alignment #2, with no mismatches but two gaps.

Alignment #1 (expected)
AGCCCCACGTGC CA TGGAGGGACATGGAA
AGCCCCACGTGC AG TGGAGGGACATGGAA

Alignment #2 (generated by MIRA)
AGCCCCACGTG CCA* TGGAGGGACATGGAA
AGCCCCACGTG *CAG TGGAGGGACATGGAA

Since these are coding sequences, alignment #1 should be favored, but so far I have not found settings that produce it. In particular, I made sure egp was set to yes and tried different values for egpl (2 and 10), but alignment #2 is still favored.

I am therefore wondering whether there is a way to increase the gap penalty so that MIRA stops favoring alignment #2 over alignment #1.
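To make the trade-off concrete, here is a minimal sketch that scores both alignments under a simple linear scheme. The weights (match +1, mismatch -1, gap -1) and the function names are assumptions chosen for illustration; this is not MIRA's actual scoring and says nothing about how egp or egpl are applied internally.

```python
# Illustrative only: a simple linear scoring scheme, NOT MIRA's real scoring.
GAP = "*"

def count_events(top, bottom):
    """Return (matches, mismatches, gaps) for two aligned rows of equal length."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    matches = mismatches = gaps = 0
    for a, b in zip(top, bottom):
        if a == GAP or b == GAP:
            gaps += 1          # a column with a gap in either row
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps

def score(top, bottom, match=1, mismatch=-1, gap=-1):
    """Linear score with hypothetical weights (assumed, not taken from MIRA)."""
    m, x, g = count_events(top, bottom)
    return m * match + x * mismatch + g * gap

def strip_spaces(row):
    # The spaces in the mail are only visual separators.
    return row.replace(" ", "")

aln1 = (strip_spaces("AGCCCCACGTGC CA TGGAGGGACATGGAA"),
        strip_spaces("AGCCCCACGTGC AG TGGAGGGACATGGAA"))
aln2 = (strip_spaces("AGCCCCACGTG CCA* TGGAGGGACATGGAA"),
        strip_spaces("AGCCCCACGTG *CAG TGGAGGGACATGGAA"))

for name, aln in (("#1", aln1), ("#2", aln2)):
    print(name, count_events(*aln), score(*aln), score(*aln, gap=-2))
```

Under these assumed weights, alignment #2 wins when gaps and mismatches cost the same, because trading two mismatches for two gaps also gains one extra matching column. With match = +1, alignment #1 only scores higher once the per-gap penalty exceeds the per-mismatch penalty by more than 0.5.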

In case this is useful, to conduct these tests I used the latest version of MIRA (dev3.9.5) and the following parameters: --job=mapping,genome,accurate,Solexa -CO:mr=0 -GE:not=6 -AS:nop=3:ard=0 -CL:pechsgp=1:pecbph=12:cbse=1 -SK:not=6:bph=12:pr=70 -MI:somrnl=0 SOLEXA_SETTINGS -CL:pec=0 -AL:mo=25:ms=25:mrs=70:shme=10:egp=1:egpl=10 -AS:mrl=20 -CO:msr=0:amgb=1:amgbemc=1:rodirs=10

Also, this mapping assembly of Solexa reads has a coverage of about 120X in the region where the problem occurs. As a side note, a small number of reads (7-8) are aligned 'properly' (with two mismatches), but all the other reads have either the first or the second gap.

Thanks,
Laurent


