[mira_talk] Re: hybrid assembly of 454/Sanger ESTs

On Tuesday 17 March 2009 emilie tisserant wrote:
> I have performed an hybrid assembly of 454/Sanger ESTs.
> These sequences come from a fungal genome with a high AT content.
>
> Many contig consensus sequences are very similar (they differ only by
> one or two positions).
> So I don't understand why reads are not assembled into a single contig.

MIRA decided that these sequences / contigs do not belong together. Diagnosis
from afar is difficult, but you should take a look at the assembled data of the
contigs where you see this and decide whether it's a true difference (due to,
e.g., polyploidy or repetitive data) or whether some MIRA parameters need to
be adjusted / relaxed.

> Is this a normal situation ?

MIRA has been built and configured to catch true differences down to one base.
I.e., if you have 30 almost identical reads where the only difference is 1 base
(say, 20 in one group and 10 in the other), then MIRA will look at the data and
decide, based on quality values and a few other things, whether it can assemble
them into one contig with 30 sequences ... or whether it needs to make 2
contigs, one with 20 and one with 10 sequences.
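
As a toy illustration of that kind of decision (this is not MIRA's actual
algorithm, only a sketch of the idea), one can group the reads by the base they
show at the disputed position and count how many good-quality reads back each
base. The function names, the quality cutoff of 30 and the
min_reads_per_group parameter are assumptions made up for this example.

```python
from collections import defaultdict

def supported_bases(column, min_reads_per_group=2, min_quality=30):
    """Return the bases backed by enough good-quality reads.

    column: list of (base, quality) pairs, one per read, taken at a single
    position where the reads disagree. Both thresholds are hypothetical.
    """
    qualities_by_base = defaultdict(list)
    for base, quality in column:
        qualities_by_base[base].append(quality)
    return sorted(
        base
        for base, quals in qualities_by_base.items()
        if sum(q >= min_quality for q in quals) >= min_reads_per_group
    )

def needs_split(column, **thresholds):
    """Treat the position as a true difference if two or more bases are supported."""
    return len(supported_bases(column, **thresholds)) >= 2

# 30 almost identical reads: 20 show A and 10 show G at the disputed position.
column = [("A", 40)] * 20 + [("G", 40)] * 10
print(supported_bases(column))                       # both groups are supported
print(needs_split(column))                           # so this toy model would split
print(needs_split(column, min_reads_per_group=11))   # stricter group size: no split
```

Raising the required group size makes the toy model less sensitive, so the
smaller group no longer forces a second contig.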

> What are the Mira's mechanisms/options which allow this ?

The parameters for "rough" identities (given in percentage numbers) are -SK:pr 
and -AL:mrs. You probably do not want to touch these in this case.

The fine-grained differentiation is controlled by the contig options. Have a
look at all the options in the block of -CO:mr. E.g., to tune down the
sensitivity, increase -CO:mrpg. Note that most of these options can and should
be tuned differently for Sanger and 454 reads.
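
A minimal sketch of how per-technology values could be passed on the command
line, written as a small Python helper that only builds and prints the
argument list. The project name, the job string and the values 3 and 6 are
placeholders, not recommendations. The SANGER_SETTINGS / 454_SETTINGS section
markers and the spelling of the project and job switches are given from memory
and may differ between MIRA versions, so check them against the manual of the
version you run.

```python
def build_mira_args(project, sanger_mrpg, mrpg_454):
    """Build a MIRA argument list with a separate -CO:mrpg per technology.

    All values are illustrative; switch spelling is an assumption and may
    differ between MIRA versions.
    """
    return [
        "mira",
        f"-project={project}",                # hypothetical project name
        "-job=denovo,est,normal,sanger,454",  # assumed job string for a hybrid EST assembly
        "SANGER_SETTINGS", f"-CO:mrpg={sanger_mrpg}",
        "454_SETTINGS", f"-CO:mrpg={mrpg_454}",
    ]

args = build_mira_args("fungal_est", sanger_mrpg=3, mrpg_454=6)
print(" ".join(args))
```

Keeping the two values as separate arguments makes it easy to relax the
setting for one read type while leaving the other untouched.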

Regards,
  Bastien


