[mira_talk] Re: joining contigs (WAS Not finding HAsh Frequency tag colours in Gap5(database generated from .CAF))

  • From: Rameez Mj <rameez03online@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 8 Oct 2015 08:35:10 +0530

Yes, sorry about the topic change; I will keep it in mind next time.
Thank you, Hoefler, for the time you spent explaining that. Now I understand I
shouldn't have made that contig break. I will do the follow-ups. Thanks
once again.

On 7 October 2015 at 20:28, Chris Hoefler <hoeflerb@xxxxxxxxx> wrote:

Please don't change topics in the middle of a thread. Just start a new
thread with your new question.

The general rule with respect to joining Mira contigs is, use caution.
Mira can introduce contig breaks at inappropriate places if, for example,
there are large coverage discrepancies. But most of the time Mira
introduces contig breaks when it is unable to resolve ambiguities from the
data it is given. In other words, if you have a contig break, it is
probably because of a lack of sufficient coverage in the break region, or
you have an unresolved repeat region, or you have too much garbage in your
reads, or you have systematic sequencing errors that are indistinguishable
from nearly identical repeats, or you have mixed sub-populations of reads
within your dataset, etc. There is some discussion in the Mira guide:

http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html#sect_res_joining_contigs

It does not appear from your screenshots that you have paired data, so I
would take the first advice from that segment of the manual seriously. You
can resolve a lot of issues with more confidence if you have paired data.

In case 4, one side of the overlap is clean and clear (case4.1), but the
other side has many mismatches (case4.2). What I did in case 4 was split
the contigs where the mismatches start.


Never do this. You are basically trying to force two pieces together that
the data does not support. Those mismatches are telling you that those
contigs do not belong together. Ditto with case 1.1.

In your other two cases, the problem appears to be a lack of coverage. In
one case you have a very short overlap (15 bases), and in the other you
have only one read with a sufficiently large overlap (59 bases). Since
these are not repetitive regions it might be safe to join, but you would be
able to make the call with more confidence if you had paired data. My
personal opinion is that you should do some PCR across the gaps to be sure
before joining.
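
As a rough illustration of the reasoning above, here is a minimal Python
sketch that compares the end of one contig consensus with the start of the
next and sorts the proposed join into one of three bins. It is not part of
Mira or gap5. The function names, the 40-base minimum overlap, and the
two-read minimum are all hypothetical values chosen for the example, and the
result is only a prompt for manual inspection.

```python
def overlap_mismatches(left, right, overlap_len):
    """Return positions (0-based, within the overlap) where the last
    overlap_len bases of `left` differ from the first overlap_len of `right`."""
    if overlap_len <= 0 or overlap_len > min(len(left), len(right)):
        raise ValueError("overlap_len must fit inside both sequences")
    tail = left[-overlap_len:].upper()
    head = right[:overlap_len].upper()
    return [i for i, (a, b) in enumerate(zip(tail, head)) if a != b]


def triage_join(left, right, overlap_len, spanning_reads,
                has_paired_support=False,
                min_overlap=40, min_spanning_reads=2):
    """Hypothetical triage: thresholds are illustrative assumptions only."""
    if overlap_mismatches(left, right, overlap_len):
        # Mismatches suggest the two ends may not belong together.
        return "do not force the join"
    weak = overlap_len < min_overlap or spanning_reads < min_spanning_reads
    if weak and not has_paired_support:
        # Clean but thinly supported: confirm by PCR or paired data first.
        return "verify before joining"
    return "candidate join"


if __name__ == "__main__":
    a = "ACGTACGTAAGGCCTT"
    b = "AAGGCCTTGGGA"
    print(triage_join(a, b, overlap_len=8, spanning_reads=1))
```

A clean overlap that is short or covered by a single read lands in the
"verify before joining" bin, which mirrors the two low-coverage cases
described above.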


On Tue, Oct 6, 2015 at 11:42 PM, Rameez Mj <rameez03online@xxxxxxxxx>
wrote:



On 7 October 2015 at 10:12, Rameez Mj <rameez03online@xxxxxxxxx> wrote:

Please add this screenshot to my previous query as well. Can I join these
contigs?

On 7 October 2015 at 10:10, Rameez Mj <rameez03online@xxxxxxxxx> wrote:

Thank you very much, Hoefler and Chevreux, for helping. Now I am getting
the colour-assisted hash tags.

On 7 October 2015 at 09:59, Rameez Mj <rameez03online@xxxxxxxxx> wrote:

OK, thank you Chevreux for the explanation. I will try this. I started
using gap5 to find internal joins. How do I decide whether to join two
contigs or not? I am attaching some screenshots; please give me your
suggestions about joining in those cases. In case 4, one side of the
overlap is clean and clear (case4.1), but the other side has many
mismatches (case4.2). What I did in case 4 was split the contigs where
the mismatches start. In some cases of mismatches, one consensus has a low
base confidence (11) compared to the other consensus (126); can I accept
the base from the second consensus and join the contigs? Please share your
much valued opinion and suggestions on this.

