[mira_talk] Re: Mapping to Reference

  • From: Saulo Alves <sauloal@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 11 Jun 2010 13:57:01 +0200

Hello Bastien,

Sorry if I didn't make myself clear.
By "events" I mean all the tags MIRA adds to the sequence: gaps,
SNPs, repeats.

I have tried the AO process but, after placing the gaps, there are
still discrepancies.
For one chromosome my FASTA reference has 1.462.416 bp. The "PADDED"
(with gaps) generated sequence has 1.462.514 bp and the "UNPADDED"
sequence has 1.462.431 bp.
There is this difference (+17) between the original and the unpadded
sequence which I can neither explain nor map back. If I want to know
where in the reference sequence a SNP was found, I cannot with a 17 bp
discrepancy and, for other chromosomes, this difference gets even
bigger.

1.462.416 bp fasta
1.462.514 bp padded
1.462.431 bp unpadded
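
For reference, the AO procedure from the quoted message below (fill an
array with -1, apply every AO line of the reference read, treat what is
still -1 as a gap) could be sketched roughly as follows. This is only a
minimal sketch, not MIRA code: the function names are hypothetical, and
it assumes each AO line has already been parsed into four 1-based
integers in the order (padded start, padded end, original start,
original end), with both intervals running forward. The toy data are
modelled on the "acgta*cgt" example, not taken from a real MAF file.

```python
def build_padded_to_original(padded_length, ao_intervals):
    """Map each padded position to its original position, or -1 for a gap.

    ao_intervals: iterable of (pad_start, pad_end, orig_start, orig_end),
    all 1-based and inclusive (an ASSUMPTION about the AO field order).
    The returned list is indexed by padded position minus one.
    """
    mapping = [-1] * padded_length
    for pad_start, pad_end, orig_start, orig_end in ao_intervals:
        # Both intervals must cover the same number of bases.
        if pad_end - pad_start != orig_end - orig_start:
            raise ValueError("AO interval lengths differ")
        for offset in range(pad_end - pad_start + 1):
            mapping[pad_start - 1 + offset] = orig_start + offset
    return mapping


def gap_positions(mapping):
    """Return the 1-based padded positions that are still -1 (gaps)."""
    return [i + 1 for i, orig in enumerate(mapping) if orig == -1]


# Toy reference "acgta*cgt": padded length 9, one gap at padded position 6.
toy = build_padded_to_original(9, [(1, 5, 1, 5), (7, 9, 6, 8)])
print(toy)                 # [1, 2, 3, 4, 5, -1, 6, 7, 8]
print(gap_positions(toy))  # [6]
```

With such a table, a SNP reported at padded position p would map to
toy[p - 1] in the original reference, provided the AO lines cover every
non-gap base.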

Regards,

----------------
s.



On Thu, Jun 10, 2010 at 6:40 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
> On Thursday 10 June 2010 Saulo Alves wrote:
>> I'm facing a problem where I'm trying to map all the "events" on the query
>> sequence (the new genome) to the reference genome.
>
> Hello Saulo,
>
> I'm not sure I can follow you. What are "events"?
>
>> The problem is, MIRA inserts several * into the reference prior to assembly
>> and shifts frames (as reported in the AT field in the MAF file).
>
> Well, looking at a case like this:
>
> ref ....acgta*cgt....
> s1  ....acgtaGcgt....
> s2  ....acgtaGcgt....
> s3  ....acgtaGcgt....
> s4  ....acgtaGcgt....
> s5  ....acgtaGcgt....
>
> then why should MIRA not insert a gap? This is what makes most sense and
> reflects accurately the change of the new genome against the reference.
>
>> After all those modifications I'm not able to map each problem BACK
>> base by base.
>
> You can parse the positions with gaps in the reference sequence by looking at
> "AO" lines in the reference reads. It's pretty easy actually: fill an array
> with "-1", then apply all "AO" lines from the reference read. Positions having
> "-1" at the end of this procedure are gaps. From there, creating a mapping to
> your original sequence is a breeze.
>
>> I'm planning on creating a "multiple alignment" of the two sequences for
>> high density annotation.
>
> This actually I do not understand: what do you want to do?
>
> Regards,
>  Bastien
>
> --
> You have received this mail because you are subscribed to the mira_talk 
> mailing list. For information on how to subscribe or unsubscribe, please 
> visit http://www.chevreux.org/mira_mailinglists.html
>
