[mira_talk] Re: Strand FLAG error in SAM output from MIRA 4.0 miraconvert?

  • From: Peter Cock <p.j.a.cock@xxxxxxxxxxxxxx>
  • To: "mira_talk@xxxxxxxxxxxxx" <mira_talk@xxxxxxxxxxxxx>
  • Date: Mon, 17 Feb 2014 16:06:16 +0000

On Mon, Feb 17, 2014 at 3:51 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
>> On February 17, 2014 at 4:05 PM Peter Cock <p.j.a.cock@xxxxxxxxxxxxxx>
>> wrote:
>> I think there is a problem in the MIRA 4.0 SAM output with FLAG bit
>> fields 0x10 and 0x20 (the read's own strand and its partner's strand)
>> sometimes being wrong, giving bad FLAG value pairs like 83 and 131,
>> or 99 and 179.
>
> IcanhazMAFplease? It's easier to track down problems on a live data set.
>

Sure - I'll email you a dropbox link (off list) shortly.

> Related: how did you find those problems? I should take up a similar toolset
> to spot potential problems earlier.

I've not tried it here, but Picard's ValidateSamFile would probably work.

I hit this while converting SAM/BAM to SSPACE tabular format (see
thread earlier this week) with some DIY code. The counts I was getting
from the SAM for the different pair orientations looked very odd, so
I added an explicit FLAG sanity test:

https://github.com/peterjc/picobio/blob/master/sambam/sam_to_sspace_tab.py
https://github.com/peterjc/picobio/commit/2f94dd58ffe6a0dfff8c23dd2893e1bf4e839548
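
For illustration, the kind of check involved can be sketched as below.
This is not the code from the linked script, just a minimal Python
example assuming both reads of a pair are mapped; the function name
flags_consistent is made up for this sketch. Each read's own strand
bit (0x10) should match what its partner records in its mate strand
bit (0x20):

```python
# Minimal sketch, assuming both reads in the pair are mapped (the strand
# bits carry no meaning for unmapped reads). Names here are illustrative.
PAIRED = 0x1         # read is paired
REVERSE = 0x10       # this read maps to the reverse strand
MATE_REVERSE = 0x20  # partner read maps to the reverse strand


def flags_consistent(flag_a, flag_b):
    """Return True if two FLAG values from one pair agree on strand."""
    if not (flag_a & PAIRED and flag_b & PAIRED):
        return False
    a_rev = bool(flag_a & REVERSE)
    b_rev = bool(flag_b & REVERSE)
    # What read A says about itself must match what B says about its
    # mate, and vice versa.
    return (a_rev == bool(flag_b & MATE_REVERSE)
            and b_rev == bool(flag_a & MATE_REVERSE))


if __name__ == "__main__":
    for pair in [(83, 163), (99, 147), (83, 131), (99, 179)]:
        print(pair, flags_consistent(*pair))
```

The usual proper pairs 83/163 and 99/147 pass this test, while the
pairs 83/131 and 99/179 reported above fail it.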

I then applied 'samtools fixmate' and promptly fell over an old
bug: it requires name-sorted BAM files but doesn't complain if
you give it coordinate-sorted BAM, it just corrupts your data.
I noticed my year-old pull request to fix this was still open:
https://github.com/samtools/samtools/pull/20
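
Until that is fixed, a simple guard is to look at the declared sort
order before calling fixmate. A rough Python sketch follows, assuming
the header has been dumped as text lines (for example from 'samtools
view -H') and that the @HD line's SO tag is trustworthy, which it may
not be. The function name sam_sort_order is made up for this example:

```python
# Rough sketch: read the declared sort order from SAM header lines.
# Assumption: the SO tag, if present, reflects the real record order.
def sam_sort_order(header_lines):
    """Return the SO value from the @HD line, or 'unknown' if absent."""
    for line in header_lines:
        if line.startswith("@HD"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"


if __name__ == "__main__":
    header = ["@HD\tVN:1.4\tSO:coordinate\n", "@SQ\tSN:contig1\tLN:1000\n"]
    order = sam_sort_order(header)
    if order != "queryname":
        print("Not declared name sorted (SO:%s), sort by name first" % order)
```

A header with no @HD line or no SO tag is reported as 'unknown', so
the guard errs on the side of asking for a name sort.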

Regards,

Peter

