[mira_talk] Re: How does Mira determine quality scores?

Dear Bastien,
thanks for the quick reply.

I think that 2/3 gaps could be a good threshold to start with.
At least we would get rid of those pesky situations where I have 30 gaps and 3 bases and still get a base.

I would love to try a version with this algorithm, so for once I can help instead of just using the software. I cannot guarantee how quickly I will be able to give you results, as I'm very busy, but I'll do my best.

Looking at it from another point of view, couldn't Mira give a quality score to gaps as well? It could either be a fixed number or something like the mean value of the two bases around the gap.
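
To illustrate the two options mentioned above, here is a minimal sketch in Python. It is not MIRA code: the function name `fill_gap_qualities`, the `*` gap symbol, and the handling of gaps at read ends or in runs (using the nearest real base on each side) are assumptions made only for this example.

```python
GAP = "*"  # assumed gap symbol for this sketch

def fill_gap_qualities(aligned, quals, fixed=None):
    """Assign a quality to every gap in an aligned read.

    aligned: the read as aligned, with GAP at gap positions
    quals:   one quality per position, None at gap positions
    fixed:   if given, every gap gets this value; otherwise a gap gets
             the mean quality of the nearest real base on each side
    """
    filled = list(quals)
    for i, base in enumerate(aligned):
        if base != GAP:
            continue
        if fixed is not None:
            filled[i] = fixed
            continue
        # Nearest non-gap quality to the left and to the right (assumption:
        # runs of gaps and read ends fall back to whatever flank exists).
        left = next((quals[j] for j in range(i - 1, -1, -1)
                     if aligned[j] != GAP), None)
        right = next((quals[j] for j in range(i + 1, len(aligned))
                      if aligned[j] != GAP), None)
        flank = [q for q in (left, right) if q is not None]
        filled[i] = sum(flank) / len(flank) if flank else None
    return filled

# Gap between a base of quality 20 and a base of quality 40.
print(fill_gap_qualities("AC*GT", [30, 20, None, 40, 35]))
print(fill_gap_qualities("AC*GT", [30, 20, None, 40, 35], fixed=10))
```

With gap qualities defined this way, gaps could in principle compete with bases on the same quality scale instead of being handled by a separate rule.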

Maybe this does not make sense. What do you think?

Davide

On Wednesday 29 July 2009 Davide Sassera wrote:
I would like to add something on this topic.
I found that, in long homopolymers, a few reads containing "1 more base"
often override many more reads with "1 less base" in the consensus.
Manual correction shows that the majority "1 less base" reads are
right, so I have to correct the consensus each time this happens.
Could the problem brought up by David Hesselbom be the reason for this
"bug"?

The current consensus algorithms look at base-specific qualities only, and often a few reads are enough to reach a quality high enough to be considered a valid base. In this case MIRA currently prefers the base over the gap, that is true.

Question is: how many times do you have a majority of gaps which is right ... and how often do you have a majority of gaps which is wrong? Would you have any numbers on that? I could fine-tune the algorithm a bit with that. I looked at the function and I think I could build in a simple majority vote (e.g. when >=2/3 of all bases are gaps, then take the gap regardless of the base qualities).
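
The majority-vote rule described above can be sketched roughly as follows. This is only an illustration in Python, not the actual MIRA function: the name `call_gap_by_majority`, the `*` gap symbol, and the choice to return `None` when the rule does not fire (so the quality-based consensus would decide) are all assumptions.

```python
from fractions import Fraction

GAP = "*"  # assumed gap symbol for this sketch

def call_gap_by_majority(column, gap_fraction=Fraction(2, 3)):
    """Return GAP if gaps make up at least gap_fraction of the column.

    column: the base (or GAP) that each read contributes at one
            consensus position
    Returns None when the rule does not apply, meaning the normal
    quality-based consensus call should be used instead.
    """
    if not column:
        return None
    gaps = sum(1 for base in column if base == GAP)
    # Exact rational comparison avoids float rounding at exactly 2/3.
    if Fraction(gaps, len(column)) >= gap_fraction:
        return GAP
    return None

# The situation from this thread: 30 gaps and 3 bases in one column.
column = [GAP] * 30 + ["A"] * 3
print(call_gap_by_majority(column))
```

In this sketch the base qualities are deliberately ignored once the gap fraction reaches the threshold, which matches the "regardless of the base qualities" wording; the threshold is a parameter so that other values can be tried.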

Would you want to try a version with that algorithm and report back whether you see improvements?

Regards,
  Bastien

--
Davide Sassera
Sezione di Patologia Generale e Parassitologia
Dipartimento di Patologia Animale, Igiene e Sanità Pubblica Veterinaria
Facoltà di Veterinaria
Università degli Studi di Milano
Via Celoria 10, 20133, Milano, ITALY
Tel: +39 0250318094
Fax: +39 0250318095
