On Friday, 11 December 2009, Björn Nystedt wrote:
> I am still somewhat confused about exactly how the second error can occur.
> The situation is that this is what you would expect
> ...CATTTTTTC...
>     ATTTTTTC...
> but this is what you get
> ...CATTTTTTC...
>    A*TTTTTTC...
> I can see the point of relaxing gap penalties at ends of reads, but to
> prefer a mismatch over a match seems strange to me. Note also that there
> was no IUPAC in the reference in this case!

As long as what you show me are two reads and not the reference and a read, I'm fine. I don't like it, but I've learned that whatever clever trick one applies, there will always be a combination of mapping sequence and sequencing errors that leads to situations like this.

Welcome to the wonderful world of mapping. Imagine you have a backbone:

B   ...GCTTTTTTCG...

You now map the first read, which has one sequencing error right at the beginning/end (an A instead of a C):

B   ...GCTTTTTTCG...
r1      ATTTTTTCG...

Now you map a second read, where the sequencing error is an insertion (an additional A) somewhere in the middle of the read. Remember, you map against the reference, so the inserted A opens a gap column in the backbone and in every read already mapped across that position:

B   ...GC*TTTTTTCG...
r1      A*TTTTTTCG...
r2  ...GCATTTTTTCG...

Wham! You have the situation you described. From there on, things can only deteriorate with additional reads :-)

I have an idea how this kind of effect could be toned down a bit, and the pieces for it are almost in place in MIRA, but I need to test it (and hope it really works).

Regards,
  Bastien
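
P.S. For anyone who wants to replay the backbone / r1 / r2 scenario by hand, here is a small toy sketch in Python. It is not MIRA code and does not reflect how MIRA stores alignments. The names `pileup` and `matches` and the ("M"/"I", base) encoding are made up for illustration. It assumes each read has already been aligned to the backbone on its own, and it only shows how an insertion in one read forces a gap column into the backbone and into every read spanning that position.

```python
GAP = "*"

def matches(seq):
    """Hypothetical helper: every base sits on one backbone position."""
    return [("M", base) for base in seq]

def pileup(backbone, reads):
    """Stack reads under the backbone.

    reads maps name -> (start, ops). An op is ("M", base) for a base placed
    on the next backbone position (match or mismatch), or ("I", base) for a
    base inserted in front of the next backbone position.
    """
    placed, inserted = {}, {}
    for name, (start, ops) in reads.items():
        pos = start
        placed[name], inserted[name] = {}, {}
        for kind, base in ops:
            if kind == "M":
                placed[name][pos] = base
                pos += 1
            else:
                inserted[name].setdefault(pos, []).append(base)

    n = len(backbone)
    # Number of gap columns needed in front of each backbone position.
    width = [
        max((len(inserted[r].get(p, [])) for r in reads), default=0)
        for p in range(n)
    ]

    rows = {"B": "".join(GAP * width[p] + backbone[p] for p in range(n))}
    for name in reads:
        out = []
        for p in range(n):
            ins = "".join(inserted[name].get(p, []))
            # A read gets gap characters only if it spans the insertion point.
            spans = p in placed[name] and (p - 1) in placed[name]
            pad = GAP if (spans or ins) else " "
            out.append(ins + pad * (width[p] - len(ins)))
            out.append(placed[name].get(p, " "))
        rows[name] = "".join(out)
    return rows

backbone = "GCTTTTTTCG"
reads = {
    # r1: sequencing error at the read end, A placed on the backbone C
    "r1": (1, matches("ATTTTTTCG")),
    # r2: extra A inserted between the C and the T run
    "r2": (0, matches("GC") + [("I", "A")] + matches("TTTTTTCG")),
}
rows = pileup(backbone, reads)
for name, row in rows.items():
    print(f"{name:<3} {row}")
```

Looking only at the r1 and r2 rows of the result gives the A*TTTTTTC under CATTTTTTC picture from the original question, even though neither read was ever aligned against the other.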