[mira_talk] Re: Parse error at line 84: unmatched CIGAR operation

  • From: Peter Cock <p.j.a.cock@xxxxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Sun, 31 Mar 2013 15:47:51 +0100

On Sat, Mar 30, 2013 at 5:59 PM, John Nash <john.he.nash@xxxxxxxxx> wrote:
> On 2013-03-30, at 1:51 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
>
> > Indeed, the SAM MIRA writes is totally valid, it's just the conversion
> > to BAM which fails. This was foreseeable and the main reason why I did not
> > write BAM directly. But maybe the sam2bam converter could by itself leave
> > out reads which have more than 2^16 CIGAR entries?

I can see your point about long CIGAR strings being valid SAM,
yet invalid BAM. However this might be viewed as an ambiguity
of the spec, and worth bringing up on the samtools dev list. It
may be something that can be fixed in CRAM (which may well
largely replace BAM in the next year or so).

In my pull request https://github.com/samtools/samtools/pull/39
for samtools I just fixed the overflow bug by turning it into an
explicit error. By extending the struct it would be possible for
samtools to handle longer CIGAR strings from SAM (at the cost
of bloating memory needlessly for most users), and in theory
that would make things like 'samtools depad' from SAM to BAM
work.
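
For illustration, here is a minimal Python sketch of the kind of check
involved. It assumes the BAM record stores the CIGAR operation count in
an unsigned 16-bit field, so at most 65535 operations per read. The
function names are made up for this example and are not part of
samtools:

```python
import re

# Assumption: BAM keeps the per-read CIGAR operation count in an
# unsigned 16-bit field, so 65535 is the largest count it can hold.
MAX_BAM_CIGAR_OPS = 0xFFFF

# One CIGAR operation: a length followed by an operation letter.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_cigar_ops(cigar):
    """Return the number of operations in a SAM CIGAR string."""
    if cigar == "*":  # SAM uses "*" when no CIGAR is available
        return 0
    ops = _CIGAR_OP.findall(cigar)
    # Reject strings that contain anything besides valid operations.
    if "".join(length + op for length, op in ops) != cigar:
        raise ValueError("malformed CIGAR string: %r" % cigar)
    return len(ops)


def fits_in_bam(cigar):
    """True if the operation count fits in a 16-bit counter."""
    return count_cigar_ops(cigar) <= MAX_BAM_CIGAR_OPS
```

A converter could call something like fits_in_bam() on column 6 of each
SAM record and either skip the read or stop with an explicit error,
rather than letting the count silently wrap around.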

I'll probably need to forward that pull request to the samtools
dev mailing list after the Easter weekend - given the timing
I doubt anyone will look at this over the next couple of days.

> > On the other hand: wasn't there some kind of rule that
> > programs which read BAM should also be able to read
> > SAM?
>
> If samtools mpileup could read SAM, I would not be in this mess.
>
> J

Sadly no such rule exists. In fact most non-trivial uses need
random access, and for that you need BAM, since BAI-style indexing
was never implemented for SAM. Conceptually this would be fairly
trivial: do what BAM indexing does, but with plain byte offsets
into the uncompressed SAM file. I suggested this a year or two
back on the samtools dev list, but no one was interested.
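
As a rough sketch of the offset idea (not the BAI binning scheme, just
a flat per-reference list of start positions and byte offsets), the
following Python shows how random access into a plain SAM file could
work. The function names and the in-memory index layout are assumptions
made for this example only:

```python
import bisect
from collections import defaultdict


def build_offset_index(sam_path):
    """Map reference name -> sorted list of (POS, byte offset)."""
    index = defaultdict(list)
    with open(sam_path, "rb") as handle:
        while True:
            offset = handle.tell()  # byte offset of the line about to be read
            line = handle.readline()
            if not line:
                break
            if line.startswith(b"@"):  # skip header lines
                continue
            fields = line.split(b"\t", 4)
            rname = fields[2].decode()
            if rname != "*":  # ignore unmapped reads
                index[rname].append((int(fields[3]), offset))
    for entries in index.values():
        entries.sort()  # no-op for a coordinate-sorted file
    return index


def fetch_starting_in(sam_path, index, rname, start, end):
    """Return SAM lines on rname whose POS lies in [start, end]."""
    entries = index.get(rname, [])
    # First entry with POS >= start (offsets are never negative).
    first = bisect.bisect_left(entries, (start, -1))
    lines = []
    with open(sam_path, "rb") as handle:
        for pos, offset in entries[first:]:
            if pos > end:
                break
            handle.seek(offset)  # jump straight to the record
            lines.append(handle.readline().decode().rstrip("\n"))
    return lines
```

Note this only finds reads that start inside the window. Reads that
start earlier but overlap it would need the end coordinate (derived from
the CIGAR) stored as well, which is the part BAM indexing handles with
its bins.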

Regards,

Peter

-- 
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
