[mira_talk] Re: CT tag acronyms in .ace files

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 28 Jul 2009 23:27:31 +0200

On Saturday 25 July 2009 Björn Nystedt wrote:
> one thing that would be very nice to tag would be potential frameshifts in
> 454 data due to homopolymer errors. This would be a tag at sites where
> about half of the 454 reads call (the same) base and the rest call a gap
> (the cutoffs here could be discussed). In principle, these sites could have
> turned up in the SNP tag 'SAOc', but I don't think it does?

Hello Björn,

The S?Oc tags are reserved for SNPs between strains. But I've added a new tag,
"DGPc" (Dubious Gap Position on Consensus), which is set whenever the number
of gaps is 40-60% of the number of bases of the next most frequent base (I
didn't have time tonight to make that range configurable).
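
As a rough illustration of the rule described above, here is a minimal Python
sketch. It is not MIRA's actual code: the function name, the representation of
an alignment column as a string, and the reading of the rule as "gap count
divided by the count of the most frequent non-gap base" are all assumptions
made for this example. The '*' gap character follows the pad convention of
.ace files.

```python
from collections import Counter

GAP = "*"  # pad/gap character as used in .ace files


def is_dubious_gap_position(column, low=0.40, high=0.60):
    """Return True if a column would get a DGPc-like tag.

    `column` is a string with one call per read at this position.
    Hypothetical helper: the 40-60% window comes from the mail, the
    rest is one possible interpretation of the rule.
    """
    counts = Counter(column.upper())
    gaps = counts.pop(GAP, 0)
    if gaps == 0 or not counts:
        # No gaps at all, or nothing but gaps: nothing to compare against.
        return False
    top_base_count = max(counts.values())
    ratio = gaps / top_base_count
    return low <= ratio <= high


if __name__ == "__main__":
    print(is_dubious_gap_position("AAAAAAAAAA*****"))  # 5 gaps vs. 10 bases
    print(is_dubious_gap_position("AAAAAAAAAA*"))      # 1 gap vs. 10 bases
```

Making `low` and `high` parameters mirrors the remark that the window is not
configurable yet; the defaults reproduce the fixed 40-60% range.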

Try version 3rc1a (which I've just uploaded) and tell me whether this suits 
your needs.

> In a hybrid assembly, this would also be sites where the basecalls from
> non-454 reads would be preferred in the consensus, even when 454 data
> dominate in coverage, so it could be used to automatically improve the
> consensus quality during the assembly, if this is not already taken into
> account somehow?

It is taken into account in hybrid assemblies already. Those sites are tagged
STMS or STMU depending on whether MIRA thinks it could correctly resolve the 
problem or not (although STMS and STMU are also set for differences not 
involving gaps).

Regards,
  Bastien


