[mira_talk] Re: Too high coverage contig

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 27 May 2011 23:11:52 +0200

On Monday 23 May 2011 23:20:54 Adnan Niazi wrote:
> I have found a 5 kb contig in the de novo assembly with very high coverage
> at some points (1500x to 2400x). I know that this sequence should occur at
> several places in the genome with micro-heterogeneity, but it is assembled
> as only one contig, and I am now unable to cover the remaining regions,
> which are left as gaps because of this contig. Is there any way to grab the
> reads from the contig and reassemble them into several contigs with some
> accuracy?

Let me guess: you have either no paired-end data or the library size is <= 
3kb?

If MIRA had been able to place the copies of this heavy repeat unambiguously,
it would have done so (aside from possible bugs). The only way to enable
automatic placement now is to order a sequencing library with a size of, say,
7 kb (if you had no paired-end data until now; otherwise the size of your
largest library + 7 kb), have it sequenced, and then do a completely new
assembly. That should make your problem go away.
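
The sizing rule above is simple arithmetic; a tiny sketch in Python, just to make it explicit. The function name and parameters are made up for illustration, and the 7 kb margin is only the "say, 7 kb" figure from this mail, not a fixed rule.

```python
def suggested_library_size_kb(largest_existing_library_kb=None, margin_kb=7.0):
    """Return a rough target size (in kb) for the new paired-end library.

    largest_existing_library_kb: size of the largest library sequenced so far,
        or None if there has been no paired-end data until now.
    margin_kb: the "say, 7 kb" figure from the mail (an assumption, adjust it
        to the repeat you need to span).
    """
    if largest_existing_library_kb is None:
        # No paired-end data so far: the margin alone is the target size.
        return margin_kb
    # Otherwise go beyond the largest library already available.
    return largest_existing_library_kb + margin_kb
```

With no paired-end data this gives 7 kb; with an existing 3 kb library it gives 10 kb.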

As it is, your only option is to cheat: if you "know" that two contigs are
joined by one copy of that repeat, you simply join them into one contig by
placing an artificial read representing that copy in between.
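
A minimal sketch of how such an artificial read could be built, in plain Python. Everything here is an assumption for illustration: the function names, the 200-base flank length, and the premise that both contigs and the repeat copy are already in the same orientation. How the resulting read is then brought into the assembly is not covered.

```python
def make_bridge_read(left_contig, repeat_copy, right_contig, flank=200):
    """Build an artificial read: end of left contig + repeat copy + start of right contig.

    All three sequences are assumed to be on the same strand, in genome order.
    flank is the number of bases taken from each contig (hypothetical default).
    """
    if flank <= 0:
        raise ValueError("flank must be a positive number of bases")
    left_flank = left_contig[-flank:]    # last bases of the left contig
    right_flank = right_contig[:flank]   # first bases of the right contig
    return left_flank + repeat_copy + right_flank


def to_fasta(name, sequence, width=60):
    """Format one sequence as a FASTA record with fixed-width lines."""
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(chunks) + "\n"
```

Usage would be along the lines of `to_fasta("bridge_read", make_bridge_read(contig_a, repeat, contig_b))`, with the three sequences loaded from your own files. The flanks give the artificial read something to overlap with on both sides; whether 200 bases is enough depends on your data.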

B.
