[mira_talk] Re: Mapping to Reference

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 10 Jun 2010 18:40:39 +0200

On Thursday 10 June 2010 Saulo Alves wrote:
> I'm facing a problem where I'm trying to map all the "events" on the query
> sequence (the new genome) to the reference genome.

Hello Saulo,

I'm not sure I can follow you. What are "events"?

> The problem is, MIRA inserts several * on the reference prior to assembly
> and shifts frames (as reported in the AT field in the MAF file).

Well, looking at a case like this:

ref ....acgta*cgt....
s1  ....acgtaGcgt....
s2  ....acgtaGcgt....
s3  ....acgtaGcgt....
s4  ....acgtaGcgt....
s5  ....acgtaGcgt....

then why should MIRA not insert a gap? This makes the most sense and
accurately reflects how the new genome differs from the reference.

> After all those modifications I'm not able to map BACK base-by-base each
> problem.

You can find the gap positions in the reference sequence by looking at the
"AO" lines of the reference read. It's pretty easy actually: fill an array
(one entry per position of the gapped reference) with "-1", then apply all
"AO" lines from the reference read. Positions that still hold "-1" at the end
of this procedure are gaps. From there, creating a mapping to your original
sequence is a breeze.
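
The procedure above could be sketched roughly like this in Python. This is
only a minimal illustration, not code from MIRA itself. It assumes that each
"AO" line carries four integers (from1 to1 from2 to2), that the interval
[from1, to1] in the gapped read corresponds to [from2, to2] in the original
read, and that positions are 1-based, inclusive and ascending. Please check
these assumptions against the MAF documentation of your MIRA version. The
function names are made up for this example.

```python
def aligned_to_original(aligned_len, ao_lines):
    """Map each position of the gapped (aligned) read to the original read.

    Returns a list with one entry per aligned position (index 0 holds
    aligned position 1). Entries left at -1 are gaps.
    Assumption: AO intervals are 1-based, inclusive and ascending.
    """
    mapping = [-1] * aligned_len          # step 1: fill the array with -1
    for line in ao_lines:                 # step 2: apply every AO line
        fields = line.split()
        if not fields or fields[0] != "AO":
            continue                      # ignore anything that is not an AO line
        from1, to1, from2, to2 = (int(x) for x in fields[1:5])
        for offset in range(to1 - from1 + 1):
            mapping[from1 - 1 + offset] = from2 + offset
    return mapping


def gap_positions(mapping):
    """Return the 1-based aligned positions that are still -1 (the gaps)."""
    return [i + 1 for i, orig in enumerate(mapping) if orig == -1]
```

For the small example further up (reference "acgta*cgt", 9 aligned
positions), the hypothetical lines "AO 1 5 1 5" and "AO 7 9 6 8" would give
the mapping [1, 2, 3, 4, 5, -1, 6, 7, 8], i.e. a single gap at aligned
position 6.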

> I'm planning on creating a "multiple alignment" of the two sequences for
> high density annotation.

This I actually do not understand: what do you want to do?

Regards,
  Bastien
