[mira_talk] Re: all my 16S in one contig

  • From: John Nash <john.he.nash@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 5 Mar 2012 13:03:04 -0500

On 2012-03-05, at 12:55 PM, Davide Sassera (davide.sassera) wrote:

>  Dear Bastien and Mira ppl,
> 
> I'm assembling a 5.6 Mb genome from Solexa reads (100 bp, paired), with 200x 
> coverage.
> 
> My problem is that all the copies of the ribosomal genes (16S, 23S, 5S) get 
> assembled together in one single contig.
> 
> Based on reference I think I should have 8 ribosomal operons, which agrees 
> with the 8-fold coverage of the "all the ribosomal sequences mashed together" 
> contig.
> 
> I have been thinking about possible solutions to this, but I then realized 
> other people must have had the same issue, so why lose my mind when I can 
> stand on the shoulders of giants?

Welcome… it's good to have new blood.

In my opinion, you cannot assemble a whole genome de novo from Illumina reads 
alone, no matter what the coverage.  The stretch between the boundary of a 
repeat and the flanking region of unique sequence does not carry enough 
sequence diversity to tell the repeat copies apart, even with standard paired 
reads, where I believe the fragment sizes are 250-500 bp (a ribosomal operon 
is around 5 kb, far longer than that). I would recommend either mapping the 
reads to a reference genome or getting 40-fold 454 coverage.

Speaking of coverage, I think 200x is overkill and would also lead to 
misassemblies - try 80x.
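
For what it's worth, the arithmetic behind going from 200x to 80x can be 
sketched as below. This is only a rough illustration in Python, not anything 
MIRA does for you: the function names are made up for this example, coverage 
is taken as (number of reads x read length) / genome size, and the figures 
plugged in (5.6 Mb genome, 100 bp reads) are the ones from the question.

```python
import random


def reads_for_coverage(genome_size_bp, read_length_bp, target_coverage):
    """Number of reads needed so that reads * read_length / genome = coverage."""
    return round(genome_size_bp * target_coverage / read_length_bp)


def subsample_fraction(current_coverage, target_coverage):
    """Fraction of the existing reads to keep to reach the target coverage."""
    return min(1.0, target_coverage / current_coverage)


def subsample_pairs(pairs, fraction, seed=0):
    """Keep each read pair with probability `fraction`.

    Pairs are kept or dropped as a unit so that mates stay together.
    `pairs` is any iterable of (read1, read2) items; hypothetical input,
    parsing the FASTQ files is left out of this sketch.
    """
    rng = random.Random(seed)
    return [pair for pair in pairs if rng.random() < fraction]


if __name__ == "__main__":
    # Figures from the question: 5.6 Mb genome, 100 bp reads, 200x -> 80x.
    print(reads_for_coverage(5_600_000, 100, 80))
    print(subsample_fraction(200, 80))
```

With paired data the thing to watch is that both mates of a pair are kept or 
dropped together, which is why the sketch samples pairs rather than single 
reads.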

HTH,
John


