[mira_talk] Re: Removing a duplicate entry from a CAF file

  • From: John Nash <john.he.nash@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 13 Oct 2011 17:05:44 -0400

On 2011-10-13, at 4:08 PM, Bastien Chevreux wrote:

> On Oct 13, 2011, at 21:46 , John Nash wrote:
>> After sending me SIX genomes in casava 1.8 format, it appears that the 
>> SEVENTH genome came in the OLD format (as evidenced by the "/2" and "/1" at 
>> the ends of the lines of the headers).  Of course, I didn't check and just 
>> popped the new sequence in the pipeline.
> 
> Uh ... I could not help but chuckle at that. I know I would have walked 
> straight into the same trap.
> 
>> My converter happily but incorrectly converted the headers - thus removing 
>> the "/1" and "/2" at the ends. That resulted in the error that 
>> convert_project threw
> 
> Were "duplicates" already in the input or did that now happen after MIRA?

The duplicates were not in the unconverted sequence.  Running the converter on 
it gave both reads of every pair the same name.  MIRA didn't pick it up, but 
convert_project did (with -x and -y).
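
To illustrate the failure mode described above, here is a minimal Python sketch 
of a header converter that checks which naming style a FASTQ header uses before 
touching it, plus a check for duplicated names. It is not the converter used in 
this thread. The function names (to_old_style, duplicate_names) and the regular 
expressions are assumptions made for this example, based on the usual CASAVA 1.8 
layout ("@name read:filtered:control:index") and the older "@name/1" layout.

```python
import re
from collections import Counter

# Assumed CASAVA 1.8 layout: "@instrument:run:flowcell:lane:tile:x:y read:filtered:control:index"
CASAVA18 = re.compile(r"^@(\S+) ([12]):[YN]:\d+:\S*$")
# Assumed older Illumina layout: "@name/1" or "@name/2"
OLD_STYLE = re.compile(r"^@\S+/[12]$")


def to_old_style(header):
    """Return an old-style '@name/1' header (hypothetical helper).

    Headers that are already old-style are returned unchanged, so the
    '/1' and '/2' suffixes are never stripped by accident.
    """
    header = header.rstrip("\n")
    if OLD_STYLE.match(header):
        return header
    match = CASAVA18.match(header)
    if match:
        return "@{}/{}".format(match.group(1), match.group(2))
    # Refuse to guess: an unknown style is reported instead of rewritten.
    raise ValueError("unrecognised header: " + header)


def duplicate_names(headers):
    """Return the sorted list of header names that occur more than once."""
    counts = Counter(headers)
    return sorted(name for name, n in counts.items() if n > 1)
```

Running duplicate_names over the converted headers of both mate files before 
assembly would flag the problem early: if the "/1" and "/2" suffixes had been 
stripped, every pair would show up as a duplicate.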

> 
>> when I was trimming the CAF file to decent sized contigs.  The sequence 
>> assembly looks really weird!
> 
> You know the -s option of convert_project? Together with -x, -y and -z it can 
> be quite useful at times.

I use -x and -y extensively but have never figured out how to use -z or -s 
appropriately, i.e. I know what the switches do but not what numbers are useful 
to plug in.


> On a slightly related note: I got my first CASAVA 1.8 data this week. 100GB 
> (packed). Fun. But this means that a name resolver for 1.8 style names / 
> comments will be written very soon ;-)

NICE!!! 


On another note, is it possible or even desirable to remove all the "rep" 
contigs from an assembly?  Assume that I have already checked them in gap5 to 
make sure that they are crap.

John



--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
