I think you have hit on the problem. My organism is a cross between two plant species, so it may have great allelic diversity. But does this explain the big difference between MIRA and Newbler 2.5? Can I rely on the MIRA results, or is it preferable to change parameters in this case?

Sorry for the insistence ... thanks

Diogo Santos

From: bach@xxxxxxxxxxxx
Subject: [mira_talk] Re: MIRA vs Newbler (454 est's)
Date: Mon, 25 Jul 2011 23:02:54 +0200
To: mira_talk@xxxxxxxxxxxxx

On Jul 25, 2011, at 17:19, Diogo Santos wrote:

> I have received data from a 454 EST project. They did the assembly with Newbler 2.5 and got 5869 contigs (length >100) with coverage 11x (I know it's low, but it's what I got :( ). I tried MIRA for some testing and reassembled the original data, but I get a strange result (19378 contigs with length >100 and coverage 4x). Can you tell me which parameters I should change?

The number of contigs in an EST assembly can vary widely depending on several factors.

First, make absolutely sure that the data you got in the SFF is preprocessed correctly by the sequencing provider. I've seen just lately a data set where a provider played around with the adaptors but did not tell the Roche post-processing pipeline about it. This led to 1/3 of the reads still having unclipped adaptors in the SFF ... and that is deadly. Also, be really sure that MIDs are clipped away. Again, this is something the provider normally should do, as they *are* responsible for delivering correct data as free as possible from sequencing artefacts.

Once you are sure your data is good, it's time to think about biological explanations: is your organism polyploid, with many differences between alleles? If yes, then many people are taken by surprise that MIRA doesn't assemble the alleles together. MIRA is NOT a clusterer! It is an assembler and as such, it will assemble the mRNA as it was in the cell.

Hope that helps,
Bastien
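
A minimal sketch of the adaptor check described above, assuming the clipped reads have already been exported from the SFF to FASTA by whatever tool you normally use. The adaptor string below is a made-up placeholder, not a real Roche adaptor or MID; substitute the sequences your provider actually used. The function names are hypothetical, and the match is a plain exact substring search, so it will miss adaptors that contain sequencing errors.

```python
def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def fraction_with_adaptor(reads, adaptor):
    """Return the fraction of reads that contain the adaptor verbatim."""
    reads = list(reads)
    if not reads:
        return 0.0
    adaptor = adaptor.upper()
    hits = sum(1 for _, seq in reads if adaptor in seq)
    return hits / len(reads)


# Placeholder adaptor, for illustration only (an assumption, not a real one).
PLACEHOLDER_ADAPTOR = "TTGACCA"

EXAMPLE_FASTA = """>r1
ACGTACGTTTGACCA
>r2
GGGTTTAAACCC
>r3
ttgaccaACGT
"""

if __name__ == "__main__":
    frac = fraction_with_adaptor(parse_fasta(EXAMPLE_FASTA), PLACEHOLDER_ADAPTOR)
    print(f"{frac:.0%} of reads still contain the adaptor")
```

If a noticeable share of reads still carries the adaptor after clipping, that points to a preprocessing problem rather than to assembler parameters.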