[mira_talk] Re: Backbone assembly

  • From: John Nash <john.he.nash@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 15 Apr 2011 09:21:35 -0400

That depends on how many errors need to be fixed by PCR sequencing. If your 
coverage is sufficient, and your scaffold molecule is sufficiently closely 
related to the target, you may be in luck.

j

On 2011-04-15, at 9:18 AM, Andrei Tudor wrote:

> Thanks,
>  
> So there is no need to use a program like ABACAS to build a pseudomolecule?
> Also, after checking the assembly in gap5, and if there is no need for a 
> pseudomolecule, can I submit it for annotation?
>  
> Andrei
> 
> From: John Nash <john.he.nash@xxxxxxxxx>
> To: mira_talk@xxxxxxxxxxxxx
> Sent: Friday, April 15, 2011 9:13:02 AM
> Subject: Re: [mira_talk] Re: Backbone assembly
> 
> On 2011-04-15, at 8:59 AM, Andrei Tudor wrote:
> 
>> Hello,
>>  
>> I have just finished a backbone assembly. It is the first time I have done 
>> such an assembly, and I am wondering what to do with the resulting files.
>> I saw that instead of a multifasta file with contigs, MIRA made 1 whole 
>> chromosome from the reads. Does this mean that I do not have to create a 
>> pseudomolecule?
>> If not, what should I do next?
>> 
> 
> What I usually do next (using gap5) is:
> 
> 1. Use tg_index to convert the CAF file to a gap5 database
> 
> 2. "Top and tail" the assembly, i.e. make sure that you have true circularity 
> if your genome is a circular one.  Then using gap5, trim the start and end of 
> the genome to make sure the coordinates match up as a circular genome.
> 
> 3. Next I use the assembly view in gap5 (or use Tablet), to look for:
> a. Holes - where there is no coverage of the corresponding region in the 
> scaffold
> b. Areas of very low coverage - indicating possible misassembly. I use the 
> data and Mira's tags to look for flanking regions which can be closed by 
> fresh PCR
> c. Regions of extremely high coverage - indicating repeats. I usually PCR the 
> regions from HIGH coverage (usually in a repeat) to normal or low coverage 
> (indicating the flanking non-repeated sequence), to make sure that 
> scaffold-bias has not influenced the assembly.
> d. I pay special attention to what I call "cliffs" - regions of very high 
> coverage next to regions of very low coverage.
> 
> 4. I browse through Mira's tags using gap5 to look for areas that Mira wants 
> me to check. The manual has good coverage of how to do that.
> 
> 5. Then I scan the sequence to proofread it using gap5. I don't care about 
> pads, but I use gap5's "find next" search parameter (using the consensus 
> quality selection) to scan for and fix obvious miscalls - where there is 
> obviously NO pad but Mira has put a base there. There are not many of those.
> 
> Then I am done.
> 
> The ORF-calling software should find bases that could be indels causing 
> frameshifts - let that software remove that worry!
> 
> HTH
> John
> 
> 
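
A minimal sketch of the coverage checks in step 3 of the quoted message, for 
anyone who wants a rough first pass before opening gap5 or Tablet. It assumes 
you already have the per-base coverage of the assembly as a plain list of 
integers, with at least one covered position. The function names and the 
thresholds (low_frac, high_frac, cliff_ratio) are made up for illustration; 
they are not defined by MIRA or gap5.

```python
from statistics import median


def runs(flags):
    """Return half-open (start, end) intervals where flags is True."""
    out = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(flags)))
    return out


def scan_coverage(cov, low_frac=0.25, high_frac=2.0, cliff_ratio=4.0):
    """Flag holes, low coverage, high coverage and cliffs (0-based positions).

    All thresholds are hypothetical defaults, relative to the median depth.
    """
    # Typical depth, ignoring holes so they do not drag the median down.
    med = median(c for c in cov if c > 0)

    holes = runs([c == 0 for c in cov])
    low = runs([0 < c < low_frac * med for c in cov])
    high = runs([c > high_frac * med for c in cov])

    # A "cliff" here is a position where the depth jumps by cliff_ratio or
    # more relative to the previous covered position.
    cliffs = []
    for i in range(1, len(cov)):
        lo, hi = sorted((cov[i - 1], cov[i]))
        if lo > 0 and hi >= cliff_ratio * lo:
            cliffs.append(i)

    return {"holes": holes, "low": low, "high": high, "cliffs": cliffs}


# Toy coverage track: a hole, a thin spot and a probable repeat.
cov = [20] * 5 + [0] * 3 + [20] * 4 + [3] * 2 + [20] * 3 + [90] * 4 + [20] * 2
result = scan_coverage(cov)
for kind, hits in result.items():
    print(kind, hits)
```

On real data, neighbouring bases rarely differ this sharply, so comparing 
averages over short windows would probably be a better cliff test. Either 
way this only points at places to look; it does not replace checking the 
reads in the assembly view.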
