> From: Ganga Jeena
> Thanks it worked with 0.2.9

0.2.8 and 0.2.9 are identical except that I re-activated a check for ssaha2 runability from Python. If 0.2.9 worked, 0.2.8 should have too.

> Here for only 80709 paired end reads sequences generated could have been
> maximum double 161418 but they are 7 times 754371 (Converted 441320 reads
> into 754371 sequences.)
> How is it possible?
> How exactly does sff_extract work?

It uses SSAHA2 to find linker sequences and splits the reads at these places. For details, please read the description comment of the function

  split_paired_end(data, sff_fh, seq_fh, qual_fh, xml_fh)

in sff_extract.

> Does it not only take sequences with linker and discard the others which
> either had no linker
> or had linker in far-end which when separated could not have the other pair
> end?

No, why throw away data which could still be useful?

> Is the .r for reverse and .f for forward strand of same sequence ends?

Yes.

> What does this .fn indicate?
> Why are nnn appended to end of the sequences?

See the function comments pointed to above.

> Why are most of the sequences in small letters?

Those are the clipped sequences in the SFF; see the Roche documentation.

B.
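
As a rough sanity check on the numbers quoted in this thread (441320 reads converted into 754371 sequences), the small Python sketch below estimates how many reads were split at a linker. It is not part of sff_extract. The helper name split_counts is hypothetical, and it assumes that every read yields either one sequence (not split) or two sequences (split into a pair), which ignores any special cases the real tool may handle.

```python
def split_counts(n_reads, n_sequences):
    """Estimate (split, unsplit) read counts.

    Assumption (not taken from sff_extract itself): each read produces
    exactly one sequence if it is not split, or two if it is split.
    Then: split + unsplit = n_reads and 2*split + unsplit = n_sequences.
    """
    split = n_sequences - n_reads
    unsplit = n_reads - split
    if split < 0 or unsplit < 0:
        raise ValueError("counts do not fit the one-or-two-sequences assumption")
    return split, unsplit


# Figures from the log line quoted in the thread.
reads, sequences = 441320, 754371
split, unsplit = split_counts(reads, sequences)
print(split, unsplit, round(sequences / reads, 2))  # 313051 128269 1.71
```

Under that assumption the output is consistent with an input of 441320 reads: about 1.71 sequences per read, which is below the upper bound of 2 per read (882640 sequences).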