Bastien,

I've run some tests on 454 assemblies from both Mira and Newbler and have concluded that the quality scores assigned to homopolymers differ greatly depending on the assembler, even for the same genome.

For example, for a homopolymer in a Newbler consensus sequence, the quality scores are nearly always the same in the neighboring bases and throughout the homopolymer itself, except for its last base, which has a very low score compared to the rest of the homopolymer. Presumably, this is because the length of the homopolymer is not certain (the reads do not agree), and only the last base is in doubt as to whether it should be there or not.

In Mira assemblies, however, all bases in a homopolymer have varying quality scores, none of which are very low. Typically, bases in (at least) long homopolymers have a lower average score than the bases surrounding the homopolymer, so the homopolymer shows up as a considerable "drop" in the quality scores.

To me, the Newbler quality scores in homopolymers seem to make more sense than the Mira ones, since what we're uncertain about is the number of bases in the homopolymer. Since it doesn't matter which base we remove within the homopolymer, the low quality score might as well be assigned to the last one. Mira seems to spread the quality score penalty over every base in the homopolymer, though I do not believe this is what's actually happening. :)

I'd like to know why the quality scores are determined so differently by Mira and Newbler, and also the details of how Mira does it. For example, does it take homopolymers into special consideration?

Thanks,

David Hesselbom
Research assistant
Molecular Evolution
EBC, Uppsala University
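
P.S. In case it helps to make the comparison concrete, here is a minimal Python sketch of how the two patterns described above could be measured on a consensus sequence and its per-base quality scores. The function names (homopolymer_runs, summarize_run), the minimum run length, and the flank width are all hypothetical choices for illustration, and the example values are made up rather than taken from either assembly. "Last base" is assumed to mean the final base of the run in consensus orientation.

```python
from itertools import groupby
from statistics import mean


def homopolymer_runs(seq, min_len=3):
    """Yield (start, end) half-open index pairs for runs of a single base
    that are at least min_len long. min_len is an arbitrary cutoff."""
    pos = 0
    for _, group in groupby(seq):
        length = len(list(group))
        if length >= min_len:
            yield pos, pos + length
        pos += length


def summarize_run(quals, start, end, flank=3):
    """Compare the scores inside one run with its last base and its flanks.

    quals is a list of per-base consensus quality scores, one per base.
    flank is the number of neighboring bases taken on each side (assumed).
    """
    run = quals[start:end]
    flanks = quals[max(0, start - flank):start] + quals[end:end + flank]
    return {
        "last_base": run[-1],
        "mean_rest": mean(run[:-1]) if len(run) > 1 else None,
        "mean_run": mean(run),
        "mean_flank": mean(flanks) if flanks else None,
    }


# Made-up example showing the Newbler-like pattern: flat scores, low last base.
seq = "ACGTAAAAAGCT"
quals = [40, 40, 40, 40, 40, 40, 40, 40, 10, 40, 40, 40]

for start, end in homopolymer_runs(seq):
    print(seq[start:end], summarize_run(quals, start, end))
```

A Newbler-like run would show last_base far below mean_rest, while a Mira-like run would show mean_run below mean_flank without any single very low score.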