[mira_talk] Re: SSPACE TAB library information from MIRA caf or maf

  • From: Peter Cock <p.j.a.cock@xxxxxxxxxxxxxx>
  • To: Robert Willows <robert.willows@xxxxxxxxx>
  • Date: Fri, 14 Feb 2014 09:46:02 +0000

On Fri, Feb 14, 2014 at 1:51 AM, Robert Willows
<robert.willows@xxxxxxxxx> wrote:
> On 14 Feb 2014, at 12:33 pm, Robert Willows <robert.willows@xxxxxxxxx> wrote:
>
>> Hi Peter,
>>
>> I've attached a shell script for making the TAB files from the miraconvert
>> produced sam file which has been depadded (instructions for depadding
>> are given in the script).

Thanks for sharing this.

Note 'samtools depad' can produce SAM output too, although I
prefer to produce and store a BAM file and turn it into SAM on
demand for parsing by piping 'samtools view' to stdin of a script.

>> The script tries to make 5 libraries and tells you
>> which ones have paired reads in them.

Your shell scripting is better than mine - there are some steps that I
don't quite get, but I can follow most of the logic.

I had started writing this in Python yesterday (still an unfinished
work in progress), but suspect for the assemblies I'm working on
the Illumina pairs are too short to help much with scaffolding:

https://github.com/peterjc/picobio/blob/master/sambam/sam_to_sspace_tab.py

One of the trickier parts is getting the end position of the mapping
(you have to parse the CIGAR string, which means handling both
lines for a read pair together - thus it is easier to work with a name
sorted file). Your code seems to instead take the simplification of
using len(SEQ), which is usually a reasonable approximation.

Producing one tab file for each read group was on my TO DO list,
although I may give up scaffolding for this particular project...

>> Also, I just found a problem in the sam file produced in the version 4.0
>> of miraconvert when making the sam file from the maf. So you need
>> to use miraconvert from version 4.0rc4 (release candidate #4) to
>> make the sam file which makes the sam file correctly. I'll post a new
>> thread about this.
>
> Oops. My mistake 4.0 and 4.0rc4 sam files are both OK.
> Sorry.

A false alarm - that's good :)

Peter

-- 
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html

Other related posts: