On Thu, Mar 19, 2015 at 6:30 PM, Peter Cock <p.j.a.cock@xxxxxxxxxxxxxx> wrote:
> On Thu, Mar 19, 2015 at 6:20 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
>> On 19 Mar 2015, at 18:45, Peter Cock <p.j.a.cock@xxxxxxxxxxxxxx> wrote:
>>> Bastien - is there a recommended way to see in the MAF v2 format
>>> if a read is part one or part two of a pair?
>>
>> Some doc update needed I think. Yes, the TS (Template Segment)
>> line: the first gets a “1”, the last a “255”, in between get 2 to 254.
>> For sequencing technologies with pairs, that makes it have just
>> “1” and “255”.
>>
>> B.
>
> Lovely - I looked at that but the 255 surprised me, so I didn't
> like to guess.
>
> Peter

Thanks Bastien,

That seems to be working now:
https://github.com/peterjc/maf2sam/commit/eb3d798daf3b1e2d445dfc44d7f5474a836808f5

Lenis,

Your example now works for me, reporting a sensible small fraction
of the reads to be orphaned:

$ python maf2fasta.py vasilis-test-S2-mt-lane6.maf vasilis-test-S2-mt-lane6.padded.fasta vasilis-test-S2-mt-lane6.unpadded.fasta
chrM_bb
Done

$ ./maf2sam.py vasilis-test-S2-mt-lane6.padded.fasta vasilis-test-S2-mt-lane6.maf > vasilis-test-S2-mt-lane6.padded.sam
[maf2sam] NOTE: Producing SAM using a gapped reference sequence.
[maf2sam] Identified as MIRA v3.9 or later (MAF v2)
[maf2sam] WARNING - Support for this is *still* EXPERIMENTAL!
[maf2sam] Identified 3 read groups
[maf2sam] Starting main pass though the MAF file
[maf2sam] Unpaired read chrM
[maf2sam] Almost done, 1047 orphaned paired reads remain
[maf2sam] Done

$ ./maf2sam.py vasilis-test-S2-mt-lane6.unpadded.fasta vasilis-test-S2-mt-lane6.maf > vasilis-test-S2-mt-lane6.unpadded.sam
[maf2sam] Identified as MIRA v3.9 or later (MAF v2)
[maf2sam] WARNING - Support for this is *still* EXPERIMENTAL!
[maf2sam] Identified 3 read groups
[maf2sam] Starting main pass though the MAF file
[maf2sam] Unpaired read chrM
[maf2sam] Almost done, 1047 orphaned paired reads remain
[maf2sam] Done

$ ./sam2bam.py vasilis-test-S2-mt-lane6.padded.sam vasilis-test-S2-mt-lane6.unpadded.sam
samtools view -b -S vasilis-test-S2-mt-lane6.padded.sam | samtools sort - vasilis-test-S2-mt-lane6.padded
[samopen] SAM header is present: 1 sequences.
[bam_header_read] EOF marker is absent. The input is probably truncated.
samtools index vasilis-test-S2-mt-lane6.padded.bam
samtools idxstats vasilis-test-S2-mt-lane6.padded.bam
chrM_bb 16747 20486 0
* 0 0 0
samtools view -b -S vasilis-test-S2-mt-lane6.unpadded.sam | samtools sort - vasilis-test-S2-mt-lane6.unpadded
[bam_header_read] EOF marker is absent. The input is probably truncated.
[samopen] SAM header is present: 1 sequences.
samtools index vasilis-test-S2-mt-lane6.unpadded.bam
samtools idxstats vasilis-test-S2-mt-lane6.unpadded.bam
chrM_bb 16616 20486 0
* 0 0 0

This was with samtools 0.1.19 and the "EOF marker is absent" message
here was a false alarm, see https://github.com/samtools/samtools/issues/18
(that bug has since been fixed in samtools).

Both the padded and unpadded files loaded fine in Tablet v1.14.11.07
tested on Mac OS X.

I'm not sure how I missed this back in November 2013 when I updated
maf2sam.py to handle the new MAF v2 format from MIRA 3.9+. I was
right to put the big "EXPERIMENTAL" warning in though ;)

Sorry about this, and thank you for sharing this test file with me.

Peter

--
You have received this mail because you are subscribed to the mira_talk mailing list.
For information on how to subscribe or unsubscribe, please visit http://www.chevreux.org/mira_mailinglists.html
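
Note for archive readers: the TS (Template Segment) rule Bastien describes
in the quoted text above (1 = first, 255 = last, 2 to 254 = in between) can
be sketched in Python roughly as below. This is only an illustration, not
the code from the maf2sam.py commit linked above. The function name
template_segment_role is hypothetical, and it assumes a TS line consists of
the tag "TS" followed by whitespace and a single integer. The 0x40 and 0x80
values are the "first segment" and "last segment" FLAG bits from the SAM
specification.

```python
def template_segment_role(maf_line):
    """Classify a MAF v2 TS line as 'first', 'last' or 'middle'.

    Hypothetical helper for illustration. Assumes the line looks like
    'TS<whitespace><integer>'. Returns None for any other line.
    """
    fields = maf_line.split()
    if len(fields) != 2 or fields[0] != "TS":
        return None
    value = int(fields[1])
    if value == 1:
        return "first"
    if value == 255:
        return "last"
    if 2 <= value <= 254:
        return "middle"
    raise ValueError("Unexpected TS value: %r" % value)


# SAM FLAG bits for the two ends of a template (SAM specification).
SAM_FLAG = {"first": 0x40, "last": 0x80}

# Example lines; the contents are made up for this sketch.
for line in ["TS\t1", "TS\t255", "RS\tACGT"]:
    print(repr(line), template_segment_role(line))
```

With paired-end data only "first" and "last" should occur, so a read whose
mate never turns up with the complementary role would be one of the
orphaned reads counted in the log output above.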