[mira_talk] Splitting paired-end reads

Hi,

First, I really like mira and this mailing list! Happy to make my first post here ;)

I know this list is for mira and my problem is really about sff_extract, but since Bastien wrote the part of sff_extract that deals with paired-end reads, I thought I might give it a try here...

I'm trying to parse an SFF file and get correct paired-end read information. sff_extract runs fine, but it reports about 99% of the reads as partial linker matches (.pl reads) and maybe 1% as correct pairs, whereas I know I have more. I turned on some debugging prints in sff_extract and saw regular discrepancies between correctedseq and maskedseq, but I can't really make sense of them.
The linker sequence is:
TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG
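
A quick way to see in which orientation a read carries this linker is sketched below in Python. This is only an illustrative check with exact string matching, not how sff_extract or SSAHA2 search for linkers; the function names are made up for the example, and the two read fragments are short excerpts copied from the reads in the output at the end of this message.

```python
# Illustrative sketch only: exact-match orientation check for a linker.
# reverse_complement() and linker_orientation() are hypothetical helper
# names, not functions from sff_extract.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def linker_orientation(read, linker):
    """Return 'forward', 'reverse', or None for an exact linker match."""
    read = read.upper()
    linker = linker.upper()
    if linker in read:
        return "forward"
    if reverse_complement(linker) in read:
        return "reverse"
    return None


LINKER = "TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG"

# Excerpts around the linker region of two reads from the output
FRAGMENT_FSYNS = "GACCGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGATAAACG"
FRAGMENT_FRQEW = "GGATTTATCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACGAAAATAAAC"

print(reverse_complement(LINKER))
print("FLVI58L05FSYNS:", linker_orientation(FRAGMENT_FSYNS, LINKER))
print("FLVI58L05FRQEW:", linker_orientation(FRAGMENT_FRQEW, LINKER))
```

With these excerpts, the FLVI58L05FRQEW fragment contains the linker as given, while the FLVI58L05FSYNS fragment contains its reverse complement.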

Part of the output is pasted at the end of this message.

Any idea what could be wrong? Could it be related to the very recent update of SSAHA2? (I have 2.3)

Thanks for your help!

molev166:/seq/bpe3/readSorting/tmp$: ~/bin/sff_extract -o test -l ../../originalData/linker.fasta ../../originalData/sff/FLVI58L05.sff
Working on '../../originalData/sff/FLVI58L05.sff':
Creating temporary sequences from reads in '../../originalData/sff/FLVI58L05.sff' ... done.
Searching linker sequences with SSAHA2 (this may take a while) ...  ok.
Parsing SSAHA2 result file ...  done.
Converting '../../originalData/sff/FLVI58L05.sff' ... Partial hits in FLVI58L05FSYNS
tcagGGTTTGCACGCCTTCGACGCAACACCATCGTGAGTTTCCGATAGCTCGCATCGAAATTTCTCAAACAGTACTCTGTATTCATAGACAGAGAGACCGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGATAAACGTTTTCGATAAAAAGGACAACGTATGTTGTAGTAAAATTTTTTGAAGTTTCTAAAAActgagactgccaaggcacacaggggatagg
####GGTTTGCACGCCTTCGACGCAACACCATCGTGAGTTTCCGATAGCTCGCATCGAAATTTCTCAAACAGTACTCTGTATTCATAGACAGAGAGACCGTA##################################ACGATAAACGTTTTCGATAAAAAGGACAACGTATGTTGTAGTAAAATTTTTTGAAGTTTCTAAAAA##############################
########################AACACCATCGTGAGTTTCCGATAGCTCGCATCGAAATTTCTCAAACAGTACTCTGTAT##########################################################################AAAGGACAACGTATGTTGTAGTAAAA##################################################
Partial hits in FLVI58L05FRQEW
tcagTTGTCCATGAAGCTCTTGAAGAAAGTGGAAATAAAGCTCTATACTATATACCACAGTCGTCATAACAAGTATCAAAGCATAAATAGAAAAACTCAGAAGAGAAGGAAAAGGATTTATCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACGAAAATAAACTGGCCCCTCCATTTAAACAGGTTGActgagactgccaaggcacacaggggataggnn
####TTGTCCATGAAGCTCTTGAAGAAAGTGGAAATAAAGCTCTATACTATATACCACAGTCGTCATAACAAGTATCAAAGCATAAATAGAAAAACTCAGAAGAGAAGGAAAAGGATTTA##########################################AAAATAAACTGGCCCCTCCATTTAAACAGGTTGA################################
########################GAAAGTGGAAATAAAGCTCTATACTATATACCACAGTCGTCATAACAAGTATCAAAGCATAAATAGAAAAACTCAG################################################################################################################################
Partial hits in FLVI58L05FP2BE
tcagTTTCAAGCTCTTTTTCAACAGCGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGAATGGCTTGTGCAAGTTCATCTGTTGATAAAGTCCCCAAAATTGCCAAAGCAACTTTAGCCCCCACACCAGGCACATTTTGCAGCAAGCAAAACCACTCTTGTTCTGCTCTTGTAGCAAAACCAAAAAGGCGAATAGCGTCTTCACGAACATGTGTTTCAATAAAGAGACTTATACTCTCACCAAGAGCGGGTAAAGAAGGACGCAGCCGATTTGACACAAAAACCACATAACCCACACCGTGAACATTTACAAGGATATGATCATCAAAGATATGTTCAAGAGTCCCCTTTTAATTACCAATCActgagactgccaaggcacacaggggataggn
####TTTCAAGCTCTTTTTCAACAGCGTA##################################ACGAATGGCTTGTGCAAGTTCATCTGTTGATAAAGTCCCCAAAATTGCCAAAGCAACTTTAGCCCCCACACCAGGCACATTTTGCAGCAAGCAAAACCACTCTTGTTCTGCTCTTGTAGCAAAACCAAAAAGGCGAATAGCGTCTTCACGAACATGTGTTTCAATAAAGAGACTTATACTCTCACCAAGAGCGGGTAAAGAAGGACGCAGCCGATTTGACACAAAAACCACATAACCCACACCGTGAACATTTACAAGGATATGATCATCAAAGATATGTTCAAGAGTCCCCTTTTAATTACCAATCA###############################
###################################################################################CATCTGTTGATAAAGTCCCCAAAATTGCCAAAGCAACTTTAGCCCCCACACCAGGCACATTTTGCAGCAAGCAAAACCACTCTTGTTCTGCTCTTGTAGCAAAACCAAAAAGGCGAATAGCGTCTTCACGAACATGTGTTTCAATAAAGAGACTTATACTCTCACCAAGAGCGGGTAAAGAAGGACGCAGCCGATTTGACACAAAAACCACATAACCCACACCGTGAACATTTACAAGGATATGATCATCAAAGATATGTTCAAGAGT###################################################
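
To compare the two '#'-masked lines printed for each read, it can help to turn each one into a list of coordinates. The following is a minimal Python sketch, assuming that '#' marks a masked position as in the debug output above; the function names are hypothetical, this is not code from sff_extract, and the input is a toy string rather than one of the reads.

```python
import re

# Illustrative sketch only: hypothetical helpers, not sff_extract code.
# Assumption: '#' marks a masked base, any other character is unmasked.


def unmasked_segments(masked):
    """Return (start, end, sequence) for each run of unmasked characters.

    Coordinates are 0-based and the end is exclusive.
    """
    return [(m.start(), m.end(), m.group())
            for m in re.finditer(r"[^#]+", masked)]


def masked_runs(masked):
    """Return (start, end) for each run of '#' characters."""
    return [(m.start(), m.end()) for m in re.finditer(r"#+", masked)]


toy = "####ACGT###TT##"
print(unmasked_segments(toy))  # [(4, 8, 'ACGT'), (11, 13, 'TT')]
print(masked_runs(toy))        # [(0, 4), (8, 11), (13, 15)]
```

Running both functions on the two masked lines of one read gives the positions where each line keeps or hides sequence, which makes the differences between the lines easier to line up than reading the raw strings.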


============================================
Lionel Guy
Thunmansgatan 25, SE-75421 Uppsala

phone: +46 (0)18 245596
mobile: +46 (0)73 9760618
email: guy.lionel@xxxxxxxxx
============================================


