[mira_talk] Position of gaps in homopolymers

  • From: David Hesselbom <dahe9761@xxxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 5 Aug 2009 16:19:50 +0200

Bastien,

Comparing Newbler and Mira assemblies, I noticed that homopolymer sequences
seem to be left-adjusted (gaps to the right) in the former and right-adjusted
(gaps to the left) in the latter, e.g.

Newbler:
AAAAA**
AAAAAAA
AAAAAA*

Mira:
**AAAAA
AAAAAAA
*AAAAAA

I would like to know whether this is consistently the case.

Is there any case where, in a Mira assembly, the gaps would be to the right
of the homopolymer sequence, or are they always to the left, as seems to be
the case? I'm working on a script that counts the number of homopolymers in
the consensus sequence and then checks the length of each of these
homopolymers in each read, and I need to know if I have to check both ends
of the homopolymer for gaps or if just one is enough. Which end to look at
for gaps would then depend on whether it's a Newbler or a Mira assembly.
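
A minimal sketch of the counting step described above. It assumes the reads are already available as padded rows of the same length as the consensus, with `*` as the gap character as in the examples; parsing the assembler output is left out. The function names and the `min_len` parameter are hypothetical. Because it counts bases over the whole column span of each consensus run, it gives the same lengths whichever side the gaps are on, which is one way to avoid checking a specific end.

```python
import re

GAP = "*"  # pad character, as in the alignment examples above


def consensus_homopolymers(consensus, min_len=2):
    """Return (start, end, base) column spans of homopolymer runs.

    Gaps inside a run do not break it. `end` is exclusive.
    The pattern only finds runs of two or more bases, so min_len >= 2.
    """
    runs = []
    # One base, then one or more repeats of the same base,
    # each optionally preceded by gap characters.
    for m in re.finditer(r"([ACGT])(?:\**\1)+", consensus.upper()):
        n_bases = len(m.group(0).replace(GAP, ""))
        if n_bases >= min_len:
            runs.append((m.start(), m.end(), m.group(1)))
    return runs


def read_run_lengths(consensus, reads, min_len=2):
    """Map each consensus run to the homopolymer length seen in each read.

    The length in a read is the number of occurrences of the run's base
    within the run's columns; gaps and mismatching bases are not counted.
    """
    lengths = {}
    for start, end, base in consensus_homopolymers(consensus, min_len):
        lengths[(start, end, base)] = [
            read[start:end].upper().count(base) for read in reads
        ]
    return lengths


# The two layouts from the examples above, with an assumed consensus.
consensus = "AAAAAAA"
newbler_rows = ["AAAAA**", "AAAAAAA", "AAAAAA*"]
mira_rows = ["**AAAAA", "AAAAAAA", "*AAAAAA"]

print(read_run_lengths(consensus, newbler_rows))
print(read_run_lengths(consensus, mira_rows))
```

Both calls report lengths 5, 7 and 6 for the three reads, so with this approach the adjustment side would only matter if the exact gap columns were needed.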

David Hesselbom
Research assistant
Molecular Evolution
EBC, Uppsala University
