[mira_talk] Re: Removing a duplicate entry from a CAF file

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 13 Oct 2011 22:08:52 +0200

On Oct 13, 2011, at 21:46, John Nash wrote:
> After sending me SIX genomes in CASAVA 1.8 format, it appears that the 
> SEVENTH genome came in the OLD format (as evidenced by the "/2" and "/1" at 
> the ends of the lines of the headers).  Of course, I didn't check and just 
> popped the new sequence in the pipeline.

Uh ... I could not help but chuckle at that. I know I would have walked 
straight into the same trap.

>  My converter happily but incorrectly converted the headers - thus removing 
> the "/1" and "/2" at the ends. That resulted in the error that 
> convert_project threw

Were the "duplicates" already present in the input, or did they only show up 
after the MIRA run?
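
One way to answer that question is to count read names in the input FASTQ before it goes into the assembler. Below is a minimal sketch, assuming a well-formed FASTQ with exactly four lines per record and taking the read name as everything up to the first whitespace in the header. The function name duplicate_read_names is made up for illustration and is not part of MIRA or convert_project.

```python
from collections import Counter
from typing import Dict, Iterable


def duplicate_read_names(fastq_lines: Iterable[str]) -> Dict[str, int]:
    """Return {read name: count} for every name that occurs more than once.

    Assumes four lines per FASTQ record (header, sequence, '+', qualities).
    """
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 != 0:
            continue  # only the first line of each record is a header
        fields = line[1:].split()  # drop the leading '@', cut at whitespace
        if fields:
            counts[fields[0]] += 1
    return {name: n for name, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Two mates whose "/1" and "/2" suffixes were stripped collide on name.
    example = [
        "@read_a", "ACGT", "+", "IIII",
        "@read_a", "TTGA", "+", "IIII",
        "@read_b/1", "GGCC", "+", "IIII",
    ]
    print(duplicate_read_names(example))
```

If the stripped mates show up here, the duplicates were already in the input and not introduced later.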

>  when I was trimming the CAF file to decent sized contigs.  The sequence 
> assembly looks really weird!

You know the -s option of convert_project? Together with -x, -y and -z it can 
be quite useful at times.

On a slightly related note: I got my first CASAVA 1.8 data this week, 100 GB 
(packed). Fun. But this means that a name resolver for 1.8-style names / 
comments will be written very soon ;-)

Luckily I learned about the /1, /2 and /3 reads today ... hadn't known Illumina 
had changed their sequencing strategy.
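
For illustration only, here is a rough sketch of what a resolver for the two header styles could look like. It is not what MIRA does or will do. It assumes the CASAVA 1.8 layout "@name read:filtered:control:index", with the read number (1, 2 or 3) as the first field of the comment, and the older layout where the name itself ends in /1, /2 or /3. The function name resolve_read_name is hypothetical.

```python
import re

# CASAVA 1.8 comment field: <read>:<is filtered>:<control number>:<index>
_CASAVA18_COMMENT = re.compile(r"^([123]):([YN]):(\d+):(\S*)$")
# Old-style names already end in /1, /2 or /3
_OLD_SUFFIX = re.compile(r"/[123]$")


def resolve_read_name(header: str) -> str:
    """Return a read name that carries a /1, /2 or /3 suffix where possible."""
    header = header.strip()
    if header.startswith("@"):
        header = header[1:]
    name, _, comment = header.partition(" ")
    if _OLD_SUFFIX.search(name):
        return name  # old format: suffix is already part of the name
    match = _CASAVA18_COMMENT.match(comment.strip())
    if match:
        return f"{name}/{match.group(1)}"  # 1.8 format: take read number from comment
    return name  # unknown style: leave the name untouched


if __name__ == "__main__":
    print(resolve_read_name("@EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG"))
    print(resolve_read_name("@HWUSI-EAS100R:6:73:941:1973#0/2"))
```

The point of checking for an existing suffix first is that old-format headers pass through unchanged, which avoids the kind of accident described above where the /1 and /2 got removed.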

B.
