[mira_talk] Re: Using Mira for hybrid assembly: how to prohibit gap insertion?

  • From: Alexander Tyakht <at@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 25 Aug 2011 12:16:50 +0400

Hello Sven,
Thank you for sharing your thoughts; I will review the assembly. Maybe
the contigs are low-quality at the ends - this is where the gaps
usually appear.
But so far I think those gaps are artefacts - they appear in the
places where a sequence from another - ALMOST repeating - part of
the genome is mapped.
One almost-repetitive region (#1) has a gap, the other (#2) doesn't.
When reads from #1 are mapped to #2, they create unwanted gaps in the
#2 consensus.
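
To check whether the gaps really sit near contig ends or in the
interior, something along these lines could be run on each consensus
string. This is only a minimal sketch: it assumes the consensus has
already been extracted from the .ace file as a plain string with pads
written as "*", and the function names and the 50 bp window are
made up for illustration, not anything Mira provides.

```python
def pad_positions(consensus, pad="*"):
    """Return (index, distance_to_nearest_end) for every pad character."""
    n = len(consensus)
    return [(i, min(i, n - 1 - i))
            for i, base in enumerate(consensus)
            if base == pad]


def split_pads_by_location(consensus, window=50, pad="*"):
    """Split pad indices into those within `window` bases of an end and the rest.

    `window` is an arbitrary illustrative threshold, not a Mira parameter.
    """
    near_end, interior = [], []
    for index, distance in pad_positions(consensus, pad):
        if distance < window:
            near_end.append(index)
        else:
            interior.append(index)
    return near_end, interior


if __name__ == "__main__":
    # Hypothetical consensus: one pad close to the start, one in the middle.
    example = "AC*GT" + "ACGT" * 40 + "*" + "ACGT" * 40
    near_end, interior = split_pads_by_location(example)
    print("pads near ends:", near_end)
    print("interior pads:", interior)
```

If most pads turn out to be interior, low-quality contig ends are
probably not the explanation.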

As for manual curation, the contigs were created from short reads
(50 bp), so they are long (1000-2000 bp) compared with those short
reads, but not compared with the genome!
So there are still a few hundred of them, which might be cumbersome
for manual work :)
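
Regarding the synthetic reads mentioned in my original message below
(1000 bp with 500 bp overlap), here is a minimal sketch of that kind
of tiling. It is not the exact script I used; the function name
tile_contig and the way the last window is handled are assumptions
made for illustration.

```python
def tile_contig(seq, read_len=1000, step=500):
    """Cut a contig into overlapping windows.

    Returns a list of (start, read) tuples. Consecutive reads overlap by
    read_len - step bases. A final window anchored at the contig end is
    added when the regular grid would leave the tail uncovered (this
    tail handling is an illustrative choice).
    """
    if read_len <= 0 or step <= 0:
        raise ValueError("read_len and step must be positive")
    if len(seq) <= read_len:
        # Contig shorter than one read: keep it as a single read.
        return [(0, seq)]
    starts = list(range(0, len(seq) - read_len + 1, step))
    if starts[-1] + read_len < len(seq):
        starts.append(len(seq) - read_len)
    return [(s, seq[s:s + read_len]) for s in starts]


if __name__ == "__main__":
    contig = "ACGT" * 575  # hypothetical 2300 bp contig
    for start, read in tile_contig(contig):
        print(start, len(read))
```

With contigs of only 1000-2000 bp this gives just a handful of
synthetic reads per contig.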



On Wed, Aug 24, 2011 at 11:02 PM, Sven Klages
<sir.svencelot@xxxxxxxxxxxxxx> wrote:
> If both sets are of "high confidence" which one is the "more correct" one?
> You are assembling the same dataset with two different assembly approaches
> and want to join these assemblies, which are expected to be different, in a
> third step without allowing MIRA to create gaps in order to put together the
> sequences? I am not sure if this is a very promising way ...
> If it is a bacterial genome with large contigs, why not try to manually
> join the contigs in e.g. gap4 or gap5?
> This gives you almost absolute control over how the sequences are going to
> be joined.
> Manually joining contigs and curating assemblies is sometimes still a very
> effective procedure ..
> just my 2p,
> Sven
>
> 2011/8/24 Alexander Tyakht <at@xxxxxxxxx>
>>
>> Hello,
>> I have 2 sets of high-confidence large contigs for the same bacterial
>> genome - each assembled using a different method.
>> The sets are different, so now I want to combine them using Mira.
>> I follow instructions as described in
>>
>> //www.freelists.org/post/mira_talk/Closing-gaps-reassemble-with-high-quality-Sanger-reads,1
>> (The only difference is that I synthesize 1000 bp reads with 500 bp overlap.)
>>
>> Now when I look at the resulting Mira contigs (.ace) file, I notice
>> gaps generated in them, because the genome is highly repetitive and a
>> few synthetic reads map to the wrong places.
>> That's not what I want, because these gaps are obviously false.
>>
>> The question is: how do I tell Mira to prohibit gaps completely during
>> assembly?
>>
>> Thanks!
>>
>> --
>> You have received this mail because you are subscribed to the mira_talk
>> mailing list. For information on how to subscribe or unsubscribe, please
>> visit http://www.chevreux.org/mira_mailinglists.html
>
>
