[mira_talk] Re: cleaning reads

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 5 May 2011 23:12:19 +0200

On Thursday 05 May 2011 08:59:59 Jose Blanca wrote:
> We've developed a new piece of software called clean_reads, and a MIRA user
> has sent us a mail asking for the best way to go from sff_extract to MIRA
> passing through clean_reads. He was using the xml file created by
> sff_extract, but clean_reads does not handle that file at all. I told
> him that if he uses the clipping option in sff_extract (-c) he
> wouldn't need to use the xml file at all. Is this correct?

This is correct. It is also the case nowadays that the XML file is no longer
needed for clipping if the reads are 454 AND adhere to the 454 convention
that the clipped part is lowercase while the unclipped part is uppercase.

Relying on the case convention is preferred to using "-c" in sff_extract
because it still gives the user (or anyone else) the opportunity to make a 1:1
mapping of bases in the 454 read to the SFF signal of that read, as shown by
some finishing programs (gap etc.).
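
To illustrate the lowercase/uppercase convention described above, here is a
minimal Python sketch that recovers clip points from the case of a read. It is
not code from MIRA, sff_extract or clean_reads: the function name
clip_range_from_case and the example read are hypothetical, and the sketch
assumes a single uppercase (unclipped) block flanked by lowercase (clipped)
bases.

```python
def clip_range_from_case(seq):
    """Return the (start, end) half-open, 0-based range of the uppercase block.

    Assumption for this sketch: lowercase bases are clipped, uppercase bases
    are kept, and the kept bases form one contiguous block.
    """
    upper_positions = [i for i, base in enumerate(seq) if base.isupper()]
    if not upper_positions:
        # Nothing is uppercase, so the whole read counts as clipped.
        return (0, 0)
    return (upper_positions[0], upper_positions[-1] + 1)


# Hypothetical read: 4 clipped bases, 14 good bases, 4 clipped bases.
read = "tcagGATTACAGATTACAnnnn"
start, end = clip_range_from_case(read)
print(start, end)        # 4 18
print(read[start:end])   # GATTACAGATTACA
```

Because the full-length sequence is kept and only its case carries the clip
information, every base position still lines up with the original read, which
is what makes the 1:1 mapping back to the SFF signal possible.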

> Would MIRA
> need any additional functionality in clean_reads to work ok?
> The clean_reads page is:
> 
> http://bioinf.comav.upv.es/clean_reads/

I'll have a look at it as soon as possible, but that may not be for another
couple of weeks.

As said above: as long as the 454 lowercase/uppercase convention is respected,
I do not think anything additional is required.

And I saw duplicate work ... I implemented regex masking in MIRA for Solexa a 
couple of weeks ago, maybe I should've looked at your pipelines earlier. Why 
doesn't a day have 48 hours (without sleep)? *sigh* 

B.
