[mira_talk] Re: bug


On 5 May 2009, at 13:17, Laurent MANCHON wrote:

No, I don't use the sff_extract script. I have just received my FASTA file and the associated quality file from the 454 sequencing center, and the quality file is corrupted (not well formatted).

Well, blame them ;)

>FUJ9VHQ01DRU97 length = 242
37 37 37 37 35 35 36 36 37 39 39 39 39 37 37 37 37 37 37 37
[...]
19 19 20 33 33 33 32 32 24 25 25 35 37 34 34 32 35 33 33 33
26 25 20 20 0 24 24 24 27 30 27 27 27 27 19 19 19 23 32 32 2
7 27 27 27 19 19 20 25 25 27 33 32 31 27 27 27 27 27 27 27 2
9 26 25 20 16 17 16 16 17 11 11 11 11 12 12 13 18 18 16 16 2
2 22

You can see the bad format of the quality record: here there are 245 values, not 242!


That shouldn't be too hard to fix: the lines are badly cut, and some values are split between two lines. There should be an extra white space at the end of the well-formatted lines (lines 1 and 3 here) and none on the badly cut ones (the lines ending in a lone "2" here). For each record, concatenate the lines, including the trailing white space where there is one, and that should be fine. If the well-formatted lines have no trailing white space, just count the characters: if a line has 59 characters, add a white space, otherwise don't, and then concatenate...
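A rough Python sketch of that idea, in case it helps. It is only an illustration, not a tested tool: the function names are made up, and the wrap width of 60 characters is an assumption inferred from the excerpt above (a full line of 20 two-digit values is 59 characters, and the badly cut lines are 60). Instead of testing for exactly 59 characters, it adds the missing separator to any line shorter than the wrap width, which also covers lines containing one-digit values.

```python
WRAP_WIDTH = 60  # assumption: inferred from the excerpt, not from any format spec


def iter_records(lines):
    """Yield (header, body_lines) for each '>' record in a quality file."""
    header, body = None, []
    for line in lines:
        if line.startswith(">"):
            if header is not None:
                yield header, body
            header, body = line.rstrip("\r\n"), []
        elif line.strip():
            body.append(line)
    if header is not None:
        yield header, body


def repair_quality_lines(body_lines, width=WRAP_WIDTH):
    """Concatenate the lines of one record and return the quality values.

    A line shorter than `width` is assumed to end on a value boundary, so a
    separating space is added if it is missing. A line of exactly `width`
    characters is assumed to be hard-wrapped and is joined as-is, which glues
    a split value such as '2' + '7' back into '27'.
    """
    parts = []
    for line in body_lines:
        line = line.rstrip("\r\n")
        if len(line) < width and not line.endswith(" "):
            line += " "
        parts.append(line)
    return [int(token) for token in "".join(parts).split()]


# The first line and the last four lines of the record quoted above.
sample = [
    ">FUJ9VHQ01DRU97 length = 242\n",
    "37 37 37 37 35 35 36 36 37 39 39 39 39 37 37 37 37 37 37 37\n",
    "26 25 20 20 0 24 24 24 27 30 27 27 27 27 19 19 19 23 32 32 2\n",
    "7 27 27 27 19 19 20 25 25 27 33 32 31 27 27 27 27 27 27 27 2\n",
    "9 26 25 20 16 17 16 16 17 11 11 11 11 12 12 13 18 18 16 16 2\n",
    "2 22\n",
]

if __name__ == "__main__":
    for header, body in iter_records(sample):
        values = repair_quality_lines(body)
        print(header, "->", len(values), "values after repair")
```

After repairing a full record, it is worth comparing the number of values with the "length =" field in the header: any record where the two still disagree needs a closer look by hand.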

Lionel

--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
