On Sat, Mar 30, 2013 at 5:59 PM, John Nash <john.he.nash@xxxxxxxxx> wrote:
> On 2013-03-30, at 1:51 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
>
> > Indeed, the SAM MIRA writes is totally valid, it's just the conversion
> > to BAM which fails. This was foreseeable and the main reason why I did not
> > write BAM directly. But maybe the sam2bam converter could leave out reads
> > which have more than 2^16 entries out by itself?

I can see your point about long CIGAR strings being valid SAM, yet
invalid BAM. However this might be viewed as an ambiguity of the spec,
and worth bringing up on the samtools dev list. It may be something
that can be fixed in CRAM (which may well largely replace BAM in the
next year or so).

In my pull request
https://github.com/samtools/samtools/pull/39
for samtools I just fixed the overflow bug by turning it into an
explicit error. By extending the struct it would be possible for
samtools to handle longer CIGAR strings from SAM (at the cost of
bloating memory needlessly for most users), and in theory that would
make things like 'samtools depad' from SAM to BAM work.

I'll probably need to forward that pull request to the samtools dev
mailing list after the Easter weekend - given the timing I doubt
anyone will look at this over the next couple of days.

> > On the other hand: wasn't there some kind of rule that
> > programs which read BAM should also be able to read
> > SAM?
>
> If samtools mpileup could read BAM, I would not be in this mess.
>
> J

Sadly no such rule exists - in fact most non-trivial uses need random
access and therefore you need BAM since BAI indexing was never
implemented for SAM (conceptually doing this is fairly trivial, just
do what BAM indexing does but using plain simple offsets into the
uncompressed SAM file - I suggested this a year or two back on the
samtools dev list, but no one was interested).

Regards,

Peter

--
You have received this mail because you are subscribed to the mira_talk mailing list.
For information on how to subscribe or unsubscribe, please visit http://www.chevreux.org/mira_mailinglists.html
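
P.S. As a footnote to the CIGAR discussion above, here is a rough Python sketch of the "leave out reads with too many CIGAR entries" idea, applied to SAM text before conversion. This is not what samtools or MIRA does. The names `split_sam_lines`, `cigar_op_count` and `MAX_BAM_CIGAR_OPS` are made up for illustration, and the limit of 65535 (2^16 - 1) assumes the 16-bit CIGAR operation count of the BAM record layout discussed in this thread.

```python
import re

# Assumption: the BAM record keeps the CIGAR operation count in an unsigned
# 16-bit field, so 65535 is the largest count that fits.
MAX_BAM_CIGAR_OPS = 0xFFFF

# One CIGAR operation is a length followed by a single operation letter.
CIGAR_OP = re.compile(r"\d+[MIDNSHP=X]")


def cigar_op_count(cigar):
    """Return the number of operations in a SAM CIGAR string ('*' means none)."""
    if cigar == "*":
        return 0
    return len(CIGAR_OP.findall(cigar))


def split_sam_lines(lines, limit=MAX_BAM_CIGAR_OPS):
    """Separate SAM lines into those that fit the limit and the names of those that do not.

    Header lines (starting with '@') are always kept.
    Returns (kept_lines, dropped_read_names).
    """
    kept, dropped = [], []
    for line in lines:
        if line.startswith("@"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 6 (index 5) of a SAM alignment line is the CIGAR string.
        if len(fields) >= 6 and cigar_op_count(fields[5]) > limit:
            dropped.append(fields[0])
        else:
            kept.append(line)
    return kept, dropped
```

One would feed it the lines of the SAM file and write `kept` back out before running the SAM to BAM conversion, logging `dropped` so the omitted reads are not lost silently.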