[mira_talk] Re: Homopolymer errors and MIRA

  • From: Lionel Guy <guy.lionel@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 30 Sep 2009 17:19:49 +0200

Hi Bastien,

On 26 Sep 2009, at 17:20 , Bastien Chevreux wrote:

hrm, let me guess: Newbler 2.x?

Yes, Newbler 2.0.0.20

What I'd be interested in would be this: do you have some statistics which show the errors broken down by homopolymer length? I strongly suspect that longer homopolymers are more prone to the base calling error, so shifting calling weights according to homopolymer length is probably one possible solution.

It doesn't seem so. Actually, it seems that the absolute number of overcalling errors is constant over HP length, but that the number of undercalls goes down with HP length. The table below is for MIRA only: rows are homopolymer lengths, columns are the length difference with the reference, and each cell is the number of homopolymers:
                                        
                                                        
        length  -2      -1      0       1       2
All     4       2       80      62502   7       1
        5       2       93      22854   5       2
        6       2       156     7819    7       0
        7       2       110     2596    6       0
        8       1       21      621     7       0
        9       0       0       49      4       0
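
Since far fewer long homopolymers exist than short ones, it may help to look at the counts as fractions per HP length as well as absolute numbers. The following is only a rough sketch, not one of the scripts mentioned in this thread: the counts are copied from the table above, the function name error_rates is made up for illustration, and it assumes that a negative length difference means an undercall and a positive one an overcall.

```python
# Counts copied from the table above: HP length -> number of homopolymers
# with length difference -2, -1, 0, +1, +2 relative to the reference.
COUNTS = {
    4: (2, 80, 62502, 7, 1),
    5: (2, 93, 22854, 5, 2),
    6: (2, 156, 7819, 7, 0),
    7: (2, 110, 2596, 6, 0),
    8: (1, 21, 621, 7, 0),
    9: (0, 0, 49, 4, 0),
}
DIFFS = (-2, -1, 0, 1, 2)


def error_rates(counts):
    """Return {hp_length: (undercall_fraction, overcall_fraction)}.

    Assumption: diff < 0 is an undercall, diff > 0 is an overcall.
    Fractions are relative to all homopolymers of that length.
    """
    rates = {}
    for length, row in counts.items():
        total = sum(row)
        under = sum(n for d, n in zip(DIFFS, row) if d < 0)
        over = sum(n for d, n in zip(DIFFS, row) if d > 0)
        rates[length] = (under / total, over / total)
    return rates


if __name__ == "__main__":
    print("length\tundercalls\tovercalls")
    for length, (under, over) in sorted(error_rates(COUNTS).items()):
        print(f"{length}\t{under:.4%}\t{over:.4%}")
```

With a roughly constant absolute number of overcalls and a shrinking number of homopolymers per length, the overcall fraction necessarily rises with HP length, so the absolute and relative views can tell different stories.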


Ideally there would also be statistics which show how many gaps/bases were at each erroneous site, but that might be a bit too much to ask.

I attach a file to this email which contains, for the MIRA assembly of one of the genomes, each "incorrectly" called HP and every read as it appears in the assembly file. Hope that helps... If you want more, or even the full dataset (including correct calls), or the scripts, let me know...



Lionel

============================================
Lionel Guy
Thunmansgatan 25, SE-75421 Uppsala

phone: +46 (0)18 245596
mobile: +46 (0)73 9760618
email: guy.lionel@xxxxxxxxx
============================================
