On Mon, Feb 17, 2014 at 3:51 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
>> On February 17, 2014 at 4:05 PM Peter Cock <p.j.a.cock@xxxxxxxxxxxxxx>
>> wrote:
>> I think there is a problem in the MIRA 4.0 SAM output with FLAG bit
>> fields 0x10 and 0x20 (partner read's strand) sometimes being wrong,
>> giving bad FLAG value pairs like 83 and 131, or 99 and 179.
>
> IcanhazMAFplease? It's easier to track down problems on a live data set.
>

Sure - I'll email you a dropbox link (off list) shortly.

> Related: how did you find those problems? I should take up a similar toolset
> to spot potential problems earlier.

I've not tried it here, but Picard's ValidateSamFile would probably
work.

I hit this playing with SAM/BAM to SSPACE tabular (see thread
earlier this week) with some DIY code. There was something very
odd in the numbers for different pair orientations I was getting
from the SAM, thus I added an explicit FLAG sanity test:

https://github.com/peterjc/picobio/blob/master/sambam/sam_to_sspace_tab.py
https://github.com/peterjc/picobio/commit/2f94dd58ffe6a0dfff8c23dd2893e1bf4e839548

I then applied 'samtools fixmate' and promptly fell over an old
bug - it requires name sorted BAM files but doesn't complain if
you give it coordinate sorted BAM, it just corrupts your data.
I noticed my year old pull request to fix this was still open:

https://github.com/samtools/samtools/pull/20

Regards,

Peter
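
For illustration only, here is a minimal sketch of the kind of FLAG sanity test described above. It is not the code from the linked sam_to_sspace_tab.py; the function name mate_strand_consistent is hypothetical, and the check assumes both reads of the pair are mapped. The idea is that each read's 0x20 bit (mate on reverse strand) should agree with its partner's 0x10 bit (read on reverse strand).

```python
# Illustrative sketch; names are hypothetical, not from sam_to_sspace_tab.py.
# Bit values are those defined for the SAM FLAG field.
READ_REVERSE = 0x10  # this read is mapped to the reverse strand
MATE_REVERSE = 0x20  # this read's mate is mapped to the reverse strand


def mate_strand_consistent(flag_a, flag_b):
    """Return True if each read's 0x20 bit matches its partner's 0x10 bit.

    Assumes flag_a and flag_b belong to the two reads of one pair and
    that both reads are mapped.
    """
    a_reverse = bool(flag_a & READ_REVERSE)
    b_reverse = bool(flag_b & READ_REVERSE)
    a_says_mate_reverse = bool(flag_a & MATE_REVERSE)
    b_says_mate_reverse = bool(flag_b & MATE_REVERSE)
    return a_says_mate_reverse == b_reverse and b_says_mate_reverse == a_reverse


if __name__ == "__main__":
    # The last two pairs are the bad examples quoted in this thread.
    for pair in [(83, 163), (99, 147), (83, 131), (99, 179)]:
        status = "ok" if mate_strand_consistent(*pair) else "inconsistent"
        print(pair, status)
```

With this check, 83/163 and 99/147 pass, while the pairs reported above (83/131 and 99/179) are flagged as inconsistent.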