On Mon, Jan 3, 2011 at 6:57 PM, Jeremy Volkening <volkening@xxxxxxxxxxxxx> wrote:
>
> Hi,
>
> When I ran sff_extract on my latest data files in preparation for input
> into mira, the 5' cutoffs in the traceinfo xml file were off by one
> basepair (these are MID-tagged reads). I imagine this is set by the
> sequencing lab at the time of the run, but the strange thing is that the
> cutoffs in the fasta files themselves (shown as lowercase characters)
> are correct.
>
> It's easy enough to do a global replace in the xml file, but I thought I
> should mention it. I tried to look into the code to see what was going
> on, but I don't know much python. Is this a bug or an incorrect setting
> by the sequencing lab?
>
> Jeremy

Are you sure they are off? I don't recall offhand whether the NCBI
traceinfo XML file uses one-based or zero-based counting, but I'd check
that first.

Can you use the Roche off-instrument apps to check? I think sffinfo will
show you the read trim points. You could also output the full untrimmed
sequence as FASTA to check (this will use lower case for the bits which
are marked for trimming).

You could also use the Roche tools to make a test SFF file with just a
couple of problematic reads for testing sff_extract.

Peter
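
One rough way to run the zero-based versus one-based check described above is to derive the trim points from the lower-case masking in the untrimmed FASTA and compare them with the value reported in the XML. The following is only a sketch: the function names and the example read are made up for illustration, it is not taken from sff_extract, and it makes no claim about which convention the traceinfo XML actually uses.

```python
def kept_region(masked_seq):
    """Return the 1-based inclusive (first, last) positions of the
    upper-case (untrimmed) bases, or None if nothing is upper case."""
    upper = [i for i, base in enumerate(masked_seq) if base.isupper()]
    if not upper:
        return None
    # enumerate() is zero-based, so add 1 to get 1-based positions
    return upper[0] + 1, upper[-1] + 1


def classify_left_clip(masked_seq, reported_left):
    """Guess which counting convention a reported 5' clip value follows
    (hypothetical helper, for illustration only)."""
    region = kept_region(masked_seq)
    if region is None:
        return "no untrimmed bases"
    first_kept = region[0]
    if reported_left == first_kept:
        return "one-based position of first kept base"
    if reported_left == first_kept - 1:
        # Same number as the count of bases trimmed from the 5' end
        return "zero-based index of first kept base"
    return "matches neither convention"


if __name__ == "__main__":
    # Made-up read: 4 trimmed bases, 8 kept bases, 4 trimmed bases
    read = "tcagACGTACGTacgt"
    print(kept_region(read))            # (5, 12)
    print(classify_left_clip(read, 5))  # one-based position of first kept base
    print(classify_left_clip(read, 4))  # zero-based index of first kept base
```

If the XML value falls into the second case for every read, the apparent off-by-one may simply be a difference in counting convention rather than a wrong setting.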