[mira_talk] Re: Parse error at line 84: unmatched CIGAR operation

  • From: John Nash <john.he.nash@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 27 Mar 2013 16:59:02 -0400

On 2013-03-27, at 3:31 PM, Peter Cock wrote:

> On Wed, Mar 27, 2013 at 7:18 PM, John Nash <john.he.nash@xxxxxxxxx> wrote:
>> Good afternoon,
>> 
>> I need to do some samtools mpileups and I have been trying to generate BAM
>> files from mira's output without success.
>> 
>> The data are from mapping assemblies using various versions of the 3.9.x
>> series of mira.
>> 
>> A.  If I use convert_project to generate a sam file from a maf file, all
>> goes well, and I can view the assembly in Tablet. However, either doing:
>> 
>> $ samtools view -bS STRAIN.sam > STRAIN.bam
>> 
>> or, after generating an index from the STRAIN_unpadded.fasta file using
>> samtools faidx:
>> 
>> $ samtools view -bt STRAIN_unpadded.fasta.fai STRAIN.sam > STRAIN.bam
>> 
>> generates a Parse error at line 84: unmatched CIGAR operation
> 
> What is line 84 of the file?

Hi Peter,

It's line 86 (not 84) in the case that I am reporting below because I am using 
a different assembly to the one mentioned above but this one also gave the same 
problem. I had 3 crashing at once and I cannot remember which crashed at line 
84 and which at line 86, etc. - but it's the same story...

I'm not a sam file expert but the offending line looks like a HUGE string, 
probably representing the genome I reference mapped to (it's S. enteritidis).  
It starts with: 

NC_011294       0       NC_011294_bb    527     50      
43M1D23M2D10M2D76M1D26M1D10M1D27M1D8M1D27M1D4M1D8M1D11M1D33M2D2M1D10M1D17M1
D75M1D22M1D38M1D5M1D10M2D12M1D10M1D2M1D55M1D51M2D15M1D11M1D21M1D25M1D12M1D13M1D3M1D98M1D23M1D62M1D35M1D9M1D2M1D7M1D10M1D28M1D32M1D5
M1D8M1D114M1D27M1D61M1D8M1D24M1D6M1D11M1D7M1D42M1D30M1D37M1D14M1D41M1D10M1D11M1D27M1D20M1D38M1D58M1D66M1D307M1D4M1D17M1D8M1D12M1D18
1M1D36M1D5M1D4M1D24M1D7M1D50M1D23M1D7M1D7M1D25M1D4M1D6M1D10M1D24M1D2M1D15M1D5M1D45M1D16M1D21M1D7M1D10M1D18M1D16M1D40M1D75M1D15M1D20
M1D47M1D4M1D18M1D2M1D7M1D102M1D1M1D31M1D38M1D69M1D58M1D3M1D25M1D5M1D18M1D6M1D12M1D50M1D25M1D4M1D14M1D51M1D27M1D4M1D26M1D24M1D10M1D3
   and goes on for several megabases.

> Are you (accidentally) using a padded SAM with an unpadded FASTA?
> If so, first you need to run 'samtools depad' to get a more traditional
> SAM file using an unpadded reference. This should be in the latest
> samtools 0.1.19 release (and can do SAM to BAM at the same time).

I've usually used Mira's unpadded.fasta as a reference w/o issues but good 
advice should always be taken :)

$ samtools depad -S SE_10-101_mira454MappingSE.sam -T 
SE_10-101_mira454MappingSE_out.padded.fasta -so fred.sam
[samopen] SAM header is present: 37 sequences.
Parse error at line 86: unmatched CIGAR operation
Aborted (core dumped)

*sigh...  

Using the unpadded fasta file caused it to tell me to use the padded.fasta file 
:)

>> B.  Peter Cock's maf2sam.py (version 2.0) hangs when I try to convert the
>> maf file to a sam file.
> 
> I never updated it to cope with the MAF from MIRA v3.9.x, but
> if you fancy sharing the data I can have a look at make it at least
> give a clear error. There seemed little point as MIRA v3.9.x has
> native SAM output.

Fair enough, I was grasping at straws by then.  I don't want to give you extra 
work but if you would be kind enough to look at a 330 megabyte gzipped maf 
file, I can send you the link backchannel as soon as it uploads to Dropbox.

> 
>> D.  picard-tools SamFormatConverter crashes.
>> 
>> I have tried it with at least 3 different reference assemblies.

I tested this with a few more genomes, and it might be only occurring with 454 
reads mapped to a reference - I have to check this further. The hybrid 
454/illumina assembly that I just tried converted from maf to sam to bam fine.


> Did that help?

It's helping... I was using Mira because I had a bunch of Mira reference 
assemblies already done, and needed a reference assembly as a bam file so that 
I could run samtools mpileup.  I ended up getting it to work using the MOSAIK 
tools.

Thanks
John




--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html

Other related posts: