[mira_talk] Re: Alignment errors in Solexa mapping

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 11 Dec 2009 19:49:48 +0100

On Friday 11 December 2009 Björn Nystedt wrote:
> I am still somewhat confused about exactly how the second error can occur.
>  The situation is that this is what you would expect 
>  ...CATTTTTTC...
>      ATTTTTTC...
> but this is what you get
>   ...CATTTTTTC...
>      A*TTTTTTC...
> I can see the point of relaxing gap penalties at ends of reads, but to
>  prefer a mismatch over a match seems strange to me. Note also that there
>  was no IUPAC in the reference in this case!

As long as what you show me are two reads and not the reference and a read, 
I'm fine. I don't like it, but I've learned that whatever clever trick one 
applies, there will always be a combination of mapping sequence and sequencing 
errors that will lead to situations like this.

Welcome to the wonderful world of mapping.

Imagine you have a backbone:

B   ...GCTTTTTTCG...

You now map the first read which has one sequencing error right at the 
beginning/end (an A instead of a C):

B     ...GCTTTTTTCG...
r1        ATTTTTTCG...

Now you map a second read, where the sequencing error is an insertion (an 
additional A) somewhere in the middle of the read. Remember, you map against 
the reference:

B     ...GC*TTTTTTCG...
r1        A*TTTTTTCG...
r2    ...GCATTTTTTCG...

Wham! You have the situation you described. From there on, things can only 
deteriorate with additional reads :-)
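
The three steps above can be replayed with a small Python sketch. This is a toy
model only: the (start column, aligned string) row layout and the helper names
insert_gap_column and render are assumptions made for illustration, not how
MIRA stores or builds its alignments.

```python
# Toy model of a mapping assembly: each row is (start_column, aligned_string).
# Illustration only; the data layout and helper names are hypothetical.

def insert_gap_column(rows, col):
    """Open a gap column at `col` in every row that is already placed."""
    out = {}
    for name, (start, seq) in rows.items():
        if col <= start:
            # Row begins at or after the new column: shift it right by one.
            out[name] = (start + 1, seq)
        elif col < start + len(seq):
            # Row spans the new column: pad it with a gap character.
            cut = col - start
            out[name] = (start, seq[:cut] + "*" + seq[cut:])
        else:
            # Row ends before the new column: nothing to do.
            out[name] = (start, seq)
    return out

def render(rows):
    """Return the rows as an aligned text block, one row per line."""
    return "\n".join(
        f"{name:<4}{' ' * start}{seq}" for name, (start, seq) in rows.items()
    )

# Step 1: the backbone.
rows = {"B": (0, "GCTTTTTTCG")}

# Step 2: r1 starts with A instead of C and is mapped as a mismatch.
rows["r1"] = (1, "ATTTTTTCG")

# Step 3: r2 carries an extra A after "GC". Mapping it against the backbone
# opens a gap column at position 2 in the backbone AND in r1.
rows = insert_gap_column(rows, 2)
rows["r2"] = (0, "GCATTTTTTCG")

print(render(rows))
```

In the printed block, the A of r1 sits under the C of r2 and the gap of r1 sits
under the inserted A of r2, although r1 and r2 were never compared with each
other. Each read was only aligned against the backbone.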

I have an idea how this kind of effect could be toned down a bit and the 
pieces for it are almost in place in MIRA, but I need to test it (and hope it 
really works).

Regards,
  Bastien

--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
