[mira_talk] Re: 454 homopolymers

  • From: Leonor Palmeira <mlpalmeira@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 23 Mar 2011 14:05:06 +0100

On 22/03/11 20:43, Bastien Chevreux wrote:
On Tuesday 22 March 2011 11:55:07 Leonor Palmeira wrote:

 > [...]
 > I realize the difficulty of the implementation, but would there be a way
 > of integrating flowgrams in the 454 part of the MIRA assembler some time
 > in the future?

(just to be sure: which version of MIRA?)

I'm using MIRA V3.2.0 (production version).

A way there could be for sure, but this is currently not on my TODO list.

I admit I have been neglecting improvement of homopolymer regions in 454
sequences a bit lately. The reason being that nowadays, no project I
touch works with 454 data alone. I *always* have some Solexa data to
complement it and then all problems with homopolymers simply go *poooof*
and vanish. No troubles, no guesses, just perfect sequence.

Well, I guess if I had some Solexa data, my problems would just go *poooof* :-) But unfortunately I don't. On a related note, I have to say that we ABI-Sanger sequenced a particular repetitive region that was impossible to assemble from the 454 data alone, and it indeed went *poooof* in the hybrid MIRA assembly.

Note to self: do not use 454 alone on future assembly projects! :-)

However, in case you had some "nice" examples for validated erroneous
consensus call of MIRA, I'd be happy to take a look at the data to see
whether I could improve consensus routines.

I'm not sure the consensus calling in MIRA can be improved without using the flowgrams. The problem with the .fasta files extracted from the .sff files is that they contain reads in a discrete space (A present or not), whereas the flowgrams are in a continuous space (A is present with intensity 0.45). In Newbler, the consensus is called after calculating the per-base arithmetic mean of the flows, which makes it possible to take into account the ambiguous flow values that sit at the boundary between being called "present" or "absent".
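
To illustrate the difference, here is a rough Python sketch. It is only a toy under stated assumptions, not Newbler's or MIRA's actual code: the function names and flow values are made up, and the reads are assumed to be already aligned flow by flow. It compares rounding each read first and then voting (discrete space) with averaging the flow values first and then rounding (continuous space).

```python
from collections import Counter


def round_flow(value):
    """Round a flow intensity to the nearest homopolymer length (half rounds up)."""
    return int(value + 0.5)


def majority_vote_consensus(flowgrams):
    """Discrete approach: call each read on its own, then vote per flow."""
    consensus = []
    for column in zip(*flowgrams):
        calls = [round_flow(v) for v in column]
        consensus.append(Counter(calls).most_common(1)[0][0])
    return consensus


def mean_flow_consensus(flowgrams):
    """Continuous approach: average the flow values per flow, then round once."""
    consensus = []
    for column in zip(*flowgrams):
        consensus.append(round_flow(sum(column) / len(column)))
    return consensus


# Hypothetical flow values for three reads over the same four flows.
# The third flow sits right at the boundary between 2 and 3 bases.
reads = [
    [1.02, 0.05, 2.45, 0.98],
    [0.97, 0.10, 2.52, 1.04],
    [1.05, 0.02, 2.51, 0.95],
]

print(majority_vote_consensus(reads))  # [1, 0, 3, 1]
print(mean_flow_consensus(reads))      # [1, 0, 2, 1]
```

In this made-up case, two of the three reads are individually called as 3 bases, so the vote says 3, while the mean flow value (about 2.49) says 2. Which of the two is right depends on the real data; the point is only that the discrete reads have lost the information needed to tell.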

I will prepare some data for you with some "nice" regions where I have identified homopolymer problems, so you can have a look.

Leonor.


B.


--
Leonor Palmeira, PhD

Phone: +32 4 366 42 69
Email: mlpalmeira AT ulg DOT ac DOT be
http://sites.google.com/site/leonorpalmeira

Immunology-Vaccinology, Bat. B43b
Faculty of Veterinary Medicine
Boulevard de Colonster, 20
University of Liege, B-4000 Liege (Sart-Tilman)
Belgium

--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
