[mira_talk] Re: 454 homopolymers

  • From: Sandrine Moreira <sandrine.moreira@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 22 Mar 2011 15:29:02 -0400

Hi,

Has anyone tried using a denoising tool such as AmpliconNoise?

SaM

On 2011-03-22 at 2:42 PM, Andrzej N wrote:

> Same here. Basically all known genes have frame shifts compared to previously 
> sequenced genes in databases. 
> 
> Andrzej 
> 
> On Tue, Mar 22, 2011 at 9:53 AM, Cladonia2 <fermaral1981@xxxxxxxxx> wrote:
> Hi, I am in the same situation. I read this paper, where they compare
> de novo assemblers and report that merging the output of two different
> assembly programs (MIRA and Newbler 2.5) with CAP3, and then mapping
> onto the new contigs, is the best approach:
> 
> http://www.biomedcentral.com/1471-2164/11/571
> 
> What do you think about it?
> 
> 
> On Tue, 22-03-2011 at 12:32 +0100, Yvan Wenger wrote:
> > Hello everybody,
> >
> > I'm in a similar situation, although I'm working on cDNA with a large
> > eukaryotic transcriptome (without reference). With MIRA I get a very high
> > representation of the sequences I know of, but frequent 1-base
> > insertions/deletions when compared to the Newbler 2.5 output.
> >
> > In my case, I was considering taking the Newbler sequences whenever
> > available to correct the Mira sequences... did anybody try this by any
> > chance?
> >
> > Also, regarding the difference between chromatograms and fasta(+qual), I
> > was wondering whether there is any tool that can remove adapter/vector
> > sequences directly in the sff or xml file used by MIRA. The problem
> > here is that my sff file is correct, but some earlier adapters used for
> > normalisation are still in the sequences.
> >
> > Finally, in my experience, Newbler performs slightly better with the
> > sff files as input than with fasta+qual, but the difference is not
> > dramatic. I still see more "future frameshifts" after in-silico
> > translation of the MIRA sequences than of the Newbler sequences, even
> > when the input is the same for both.
> >
> > All the best,
> >
> > Yvan
> >
> >
> >
> > On Tue, Mar 22, 2011 at 11:55 AM, Leonor Palmeira <mlpalmeira@xxxxxxxxx> wrote:
> > > Dear All,
> > >
> > > I am assembling a small 110kb viral genome and comparing the results
> > > between MIRA and Newbler. The data I have is a 454 run, and some Sanger
> > > reads covering one of my repetitive regions that was very hard to
> > > assemble 'de novo'.
> > >
> > > I am quite happy with the MIRA hybrid assembly (with the
> > > -highlyrepetitive flag) which yields a very large contig covering almost
> > > my entire genome, including my repeats. However, compared to some
> > > previously sequenced Sanger reads and to another strain, there is a
> > > significant number of errors in homopolymers. This is particularly
> > > annoying in CDSs as it leads to a shift in the reading frame...
> > >
> > > The Newbler assembly, however, yields much smaller contigs but with fewer
> > > homopolymer length differences. I suspect this comes from the usage of the
> > > flowgram information in the alignment of the reads?
> > >
> > > The MIRA assembly is much better at disentangling repeats but these small
> > > errors are probably due to the usage of .fasta and .qual files instead of
> > > the flowgrams as used in Newbler. I find it very frustrating to be forced
> > > to use my Newbler contigs, as the MIRA assembly is much better on several
> > > points.
> > >
> > > I realize the difficulty of the implementation, but would there be a way
> > > of integrating flowgrams in the 454 part of the MIRA assembler some time
> > > in the future?
> > >
> > > Best,
> > > Leonor.
> > > --
> > > Leonor Palmeira, PhD
> > >
> > > Phone: +32 4 366 42 69
> > > Email: mlpalmeira AT ulg DOT ac DOT be
> > > http://sites.google.com/site/leonorpalmeira
> > >
> > > Immunology-Vaccinology, Bat. B43b
> > > Faculty of Veterinary Medicine
> > > Boulevard de Colonster, 20
> > > University of Liege, B-4000 Liege (Sart-Tilman)
> > > Belgium
> > >
> > > --
> > > You have received this mail because you are subscribed to the mira_talk
> > > mailing list. For information on how to subscribe or unsubscribe, please
> > > visit http://www.chevreux.org/mira_mailinglists.html
> > >
> >
> 
> 
> 
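
To illustrate why the homopolymer errors described in the thread above are so damaging in CDSs, here is a minimal sketch in Python. The sequences are invented toy data, and `translate` is a hypothetical helper written for this example; it is not part of MIRA, Newbler or AmpliconNoise. It translates a short reference and the same sequence with one base missing from a run of five A's, as a 454 undercall might produce.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # "X" for codons with non-ACGT bases
        for i in range(0, len(dna) - 2, 3)
    )


# Toy data (assumption): a CDS fragment containing a run of five A's ...
reference = "ATGGC" + "AAAAA" + "GCTTGGATCCGTAA"
# ... and the same fragment with the run called one base too short.
undercalled = "ATGGC" + "AAAA" + "GCTTGGATCCGTAA"

print(translate(reference))    # MAKSLDP*
print(translate(undercalled))  # MAKAWIR
```

Everything downstream of the shortened run is read in a different frame, and the stop codon is lost, which is the "shift in the reading frame" mentioned above.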

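One way to check whether two versions of the same region (for example a MIRA contig and a Newbler contig) differ only in homopolymer lengths is to compare their run-length encodings. The sketch below assumes the two sequences already cover exactly the same region on the same strand; the function names and the toy sequences are hypothetical, and this is not how either assembler works internally.

```python
from itertools import groupby


def runs(seq):
    """Run-length encode a sequence: 'AAAGG' -> [('A', 3), ('G', 2)]."""
    return [(base, len(list(group))) for base, group in groupby(seq.upper())]


def homopolymer_differences(seq_a, seq_b):
    """Compare two sequences run by run.

    Returns a list of (start_in_a, base, length_in_a, length_in_b) for every
    homopolymer run whose length differs, or None if the sequences differ in
    anything other than run lengths (e.g. a substitution).
    """
    runs_a, runs_b = runs(seq_a), runs(seq_b)
    if [base for base, _ in runs_a] != [base for base, _ in runs_b]:
        return None
    differences = []
    position = 0  # 0-based start of the current run in seq_a
    for (base, len_a), (_, len_b) in zip(runs_a, runs_b):
        if len_a != len_b:
            differences.append((position, base, len_a, len_b))
        position += len_a
    return differences


# Toy data (assumption): same region, one A missing from a run of five.
contig_a = "ATGGCAAAAAGCTTGGATCCGTAA"
contig_b = "ATGGCAAAAGCTTGGATCCGTAA"
print(homopolymer_differences(contig_a, contig_b))  # [(5, 'A', 5, 4)]
```

A result of `None` means the disagreement is not a pure homopolymer-length issue and would need a real alignment to interpret.
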
--
Sandrine Moreira Rousseau
PhD candidate in Bioinformatics

(514) 343-6111 ext. 2842

Centre Robert-Cedergren en Bio-informatique et Génomique
Université de Montréal
Montréal, Québec, Canada




