[mira_talk] Re: Illumina/Solexa sequence coverage.

  • From: John Nash <john.he.nash@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 13 Sep 2011 17:27:29 -0400

On 2011-09-13, at 2:21 PM, Bastien Chevreux wrote:

> On Sep 13, 2011, at 18:53 , John Nash wrote:

<snip>


>> b. 30 Sanger reads - these are PCR reads to gap-close the 454 data, from the 
>> days when I only had 454 data. Converted to CAF format via 
>> phred/phrap/crossmatch to ace format, and then via tg_index to a gap5_db, 
>> then gap5 to CAF format.  (Tell me there's a faster way!!!).  
> 
> Ummm ... there is a faster way: 
> 
>   convert_project -f fasta -t CAF input.fasta output
> 
> (and it reads qualities from .fasta.qual files if present). In case your 
> Sanger did not have clippings (like, e.g., you trimmed the .ab1 beforehand), 
> then that's what I'd have used.
> 
>> In both experiments, these appear to have ended up in the debris (???)
> 
> Hmmm ... question would be: why?

I really do not know.  These are genuine PCR-generated reads that had been
assembled back in.  Can I add them back to an assembly (after first searching
to make sure that they are not hidden in a contig somewhere)?
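
For the "search first" step, here is a minimal sketch of how the check could be scripted. It assumes the CAF file lists the reads of each contig on "Assembled_from <read> ..." lines inside a "Sequence : <contig>" block; the function name and the file name in the usage comment are hypothetical, not part of MIRA.

```python
from collections import defaultdict


def find_reads_in_contigs(caf_lines, read_names):
    """Return {read_name: [contig names]} for the wanted reads that are
    mentioned on an Assembled_from line of some Sequence block."""
    wanted = set(read_names)
    hits = defaultdict(list)
    current = None  # name of the Sequence block currently being read
    for line in caf_lines:
        fields = line.split()
        if not fields:
            continue
        # Assumed block header layout: "Sequence : <name>"
        if fields[0] == "Sequence" and len(fields) >= 3 and fields[1] == ":":
            current = fields[2]
        # Assumed layout: "Assembled_from <read> <c1> <c2> <r1> <r2>"
        elif fields[0] == "Assembled_from" and len(fields) >= 2:
            if fields[1] in wanted:
                hits[fields[1]].append(current)
    return dict(hits)


# Usage (hypothetical file and read names):
#   with open("assembly_out.caf") as handle:
#       print(find_reads_in_contigs(handle, ["pcr_read_01", "pcr_read_02"]))
```

Any read name missing from the returned dictionary was not placed in a contig, so it would be a candidate for adding back.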


> 
>> PS. I have one question.  Can I use the resulting assembly to map against a 
>> related genome to use synteny to cover gaps.  I know how to do it with a 454 
>> assembly but not a hybrid one.
> 
> I didn't quite get that question ... what do you want to do?

I guess that I am asking what the command line parameters are for taking a CAF
file from a hybrid assembly and using it as input for a mapping assembly
against a reference genome.  Or should I dump the contigs as fastq files and
use those instead?

I didn't want to do a mapping assembly right from the start, as the genomes are
not *that* closely related. However, they are related enough that I'm curious.

John
