[mira_talk] Re: hybrid assembly of 454/Sanger ESTs
- From: Bastien Chevreux <bach@xxxxxxxxxxxx>
- To: mira_talk@xxxxxxxxxxxxx
- Date: Wed, 18 Mar 2009 19:01:18 +0100
On Tuesday 17 March 2009 emilie tisserant wrote:
> I have performed an hybrid assembly of 454/Sanger ESTs.
> These sequences come from a fungal genome with a high AT content.
>
> Many contig consensus sequences are very similar (they differ only by
> one or two positions).
> So I don't understand why reads are not assembled into a single contig.
MIRA decided that these sequences / contigs do not belong together. Diagnosis
from afar is difficult, but you should take a look at the assembled data of the
contigs where you see this and decide whether it's a true difference (due to,
e.g., polyploidy or repetitive data) or whether some MIRA parameters need to
be adjusted / relaxed.
> Is this a normal situation ?
MIRA has been built and configured to catch true differences down to one base.
I.e., if you have 30 almost identical reads where the only difference is 1 base
(say, 20 in one group and 10 in the other), then MIRA will look at the data and
decide, based on quality values and a few other things, whether it can assemble
them into one contig with 30 sequences ... or whether it needs to make 2
contigs, one with 20 and one with 10 sequences.
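As a rough illustration only: the toy Python sketch below is not MIRA's actual
code. The function name, the thresholds and the grouping rule are all made up
for this example. It looks at a single alignment column and only splits the
reads when at least two different bases are each backed by enough good-quality
reads.

```python
from collections import defaultdict

def group_reads_at_column(bases, quals, min_reads_per_group=2, min_qual=30):
    """Toy decision for one alignment column (hypothetical, not MIRA's algorithm).

    bases: the base each read shows at this column
    quals: the quality value of that base in each read
    Returns a list of read-index groups: one group means "keep one contig",
    several groups mean "treat this as a true difference and split".
    """
    # Collect, per base, the reads that support it with sufficient quality.
    supporters = defaultdict(list)
    for read_idx, (base, qual) in enumerate(zip(bases, quals)):
        if qual >= min_qual:
            supporters[base].append(read_idx)

    # A base counts as "solid" only if enough good reads agree on it.
    solid = {b: r for b, r in supporters.items() if len(r) >= min_reads_per_group}

    # Fewer than two solid bases: treat the disagreement as sequencing error
    # and keep every read together.
    if len(solid) < 2:
        return [list(range(len(bases)))]

    # Otherwise split; reads whose base is low quality or unsupported are
    # simply left out of the groups in this toy version.
    return sorted(solid.values(), key=len, reverse=True)


# 20 reads with A and 10 reads with G, all of good quality: two groups.
split = group_reads_at_column("A" * 20 + "G" * 10, [40] * 30)
print([len(g) for g in split])

# A single low-quality G among 29 good A reads: everything stays together.
merged = group_reads_at_column("A" * 29 + "G", [40] * 29 + [10])
print([len(g) for g in merged])
```

Raising min_reads_per_group in this sketch (e.g. to 11) makes the 20/10 case
collapse back into a single group, which is the general idea behind making the
assembler less sensitive.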
> What are the Mira's mechanisms/options which allow this ?
The parameters for "rough" identities (given in percentage numbers) are -SK:pr
and -AL:mrs. You probably do not want to touch these in this case.
The fine-grained differentiation is made in the parameters for the contig
options. Have a look at all the options in the block of -CO:mr. E.g., to tune
down the sensitivity, increase -CO:mrpg. Note that most of these options can
and should be tuned differently for Sanger and 454 reads.
Regards,
Bastien