[mira_talk] Re: Paired end info - SAM/BAM output from MIRA?

  • From: Peter <peter@xxxxxxxxxxxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 2 Jul 2010 21:24:27 +0100

On Fri, Jul 2, 2010 at 6:47 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
> On Freitag 02 Juli 2010 Peter wrote:
>> Funnily enough, when I asked the Tablet team if anyone else was
>> interested in it supporting MAF directly, they also said "Yet another
>> output format?". Certainly SAM/BAM seems to be the most widely
>> supported alignment/mapping format for big genome projects, but
>> I understand that MIRA is more focussed on smaller genomes
>> where the older file formats suffice.
>
> It's not that the "older" file formats suffice, it's that they provide the only
> possibility for MIRA to have the results supported at all.

I would have to agree that SAM/BAM are better suited for mapping onto
existing genomes than for de novo assembly.

>> You mentioned you would like to add tags - the SAM/BAM format
>> does allow for per-read tags, and I think there is some flexibility
>> in the header too. I guess it depends on what exactly you need -
>> I don't see an existing header convention for annotating a region
>> of a contig.
>
> SAM/BAM has no sequencing technology type field, it has no read base tags, no
> consensus tags, no nothing. MIRA needs to pass on information about all those
> things which make life during finishing so easy.

There is a SAM/BAM read group (RG) which might be suitable for
the technology type (and/or strain information). My impression is
that all these "optional" tags in the SAM/BAM read tags and headers
are only minimally defined in the file format specification. The paired
end stuff is pretty nicely done though (which was what I was initially
interested in here).
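
To make the two points above concrete, here is a minimal sketch in Python
(plain string handling, no SAM library). It assumes the standard SAM layout:
an @RG header line carrying platform (PL) and sample (SM) fields, an RG:Z:
tag on each read, and the mate information held in the FLAG, RNEXT, PNEXT
and TLEN columns. The read group ID, strain name, read name and contig name
are made up for illustration, and the function names are hypothetical; this
is not how MIRA or any other tool does it.

```python
# Illustrative sketch only: IDs, sample names and read names are invented.

# SAM FLAG bits that describe read pairing (values from the SAM specification).
FLAG_PAIRED = 0x1         # read is part of a pair
FLAG_PROPER_PAIR = 0x2    # both mates aligned as expected
FLAG_MATE_UNMAPPED = 0x8  # the mate is unmapped
FLAG_FIRST = 0x40         # first read of the pair
FLAG_SECOND = 0x80        # second read of the pair


def make_rg_header(rg_id, platform, sample):
    """Build an @RG header line; PL could hold the technology, SM the strain."""
    return "\t".join(["@RG", f"ID:{rg_id}", f"PL:{platform}", f"SM:{sample}"])


def pairing_info(sam_line):
    """Extract pairing fields and the read group from one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    # Optional tags start at column 12; RG:Z:<id> points back to an @RG line.
    read_group = next(
        (tag[5:] for tag in fields[11:] if tag.startswith("RG:Z:")), None
    )
    return {
        "name": fields[0],
        "paired": bool(flag & FLAG_PAIRED),
        "proper_pair": bool(flag & FLAG_PROPER_PAIR),
        "mate_unmapped": bool(flag & FLAG_MATE_UNMAPPED),
        "first_in_pair": bool(flag & FLAG_FIRST),
        "second_in_pair": bool(flag & FLAG_SECOND),
        "mate_ref": fields[6],          # RNEXT, "=" means same reference
        "mate_pos": int(fields[7]),     # PNEXT, 1-based
        "template_length": int(fields[8]),  # TLEN
        "read_group": read_group,
    }


# A made-up record: flag 99 = paired + proper pair + mate reverse + first in pair.
example = "read1\t99\tcontig1\t100\t60\t8M\t=\t300\t208\tACGTACGT\tIIIIIIII\tRG:Z:rg454"
header = make_rg_header("rg454", "LS454", "strainA")
info = pairing_info(example)
```

The read group only covers per-read metadata such as technology or strain;
nothing in this sketch addresses read base tags or consensus tags.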

Peter

