[mira_talk] Paired end info - SAM/BAM output from MIRA?

  • From: Peter <peter@xxxxxxxxxxxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 1 Jul 2010 10:59:21 +0100

Hi Bastien et al,

I'm starting to get more paired end data, and would like to be able to look
at this information in alignment viewers. What MIRA output file format
would you recommend for this?

Some file formats (like ACE) do not store the pair information explicitly -
so the viewer would have to resort to inferring pairs based on the read
names (messy!), and there is no way to get the expected separation.

I've been reading the MIRA Assembly Format (MAF) document,
http://mira-assembler.sourceforge.net/docs/mira_maf.html
and it looks like the TN and DI lines are used to record pairings,
plus the TF and TT lines give information about the separation.
That's great, but I don't know of any alignment viewing tools which
support MAF directly.
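
To make this concrete, here is a minimal Python sketch for pulling the pairing
lines out of a MAF file. It assumes, from my reading of the MAF document, that
each read block starts with an RD line and ends with an ER line, and that each
line is a tag followed by whitespace and a value. The function name and the
example records are made up for illustration, and I have not tested this
against real MIRA output.

```python
def read_pairing_info(lines):
    """Yield one dict per read holding its RD, TN, DI, TF and TT values (if present)."""
    wanted = {"TN", "DI", "TF", "TT"}
    current = None
    for line in lines:
        parts = line.rstrip("\n").split(None, 1)
        if not parts:
            continue  # skip blank lines
        tag = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        if tag == "RD":
            # Assumption: an RD line opens a new read block.
            current = {"RD": value}
        elif tag == "ER" and current is not None:
            # Assumption: an ER line closes the read block.
            yield current
            current = None
        elif current is not None and tag in wanted:
            current[tag] = value


# Hypothetical read blocks, reduced to the lines of interest.
example = [
    "RD\tread1.f",
    "TN\ttemplate1",
    "DI\tF",
    "TF\t500",
    "TT\t1500",
    "ER",
    "RD\tread1.r",
    "TN\ttemplate1",
    "DI\tR",
    "TF\t500",
    "TT\t1500",
    "ER",
]

records = list(read_pairing_info(example))
for record in records:
    print(record)
```

Grouping the resulting records by their TN value would then give the pairs,
without having to guess anything from the read names.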

I am familiar with the SAM/BAM format which does have explicit
support for paired reads - although in MAF terms it only stores the
template name (the TN line in MAF) and forward/reverse read
information (DI line in MAF, part of the flag in SAM/BAM), while
the original read name (RD) is lost. To me it would be great for
MIRA to offer SAM/BAM output, since these are widely supported
in alignment viewing software. Note that SAM/BAM files do not
store the contig sequences, just a reference to the contig names
which are usually held in an accompanying (unpadded) FASTA file.
http://samtools.sourceforge.net/SAM1.pdf
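
As a rough illustration of how the DI information might end up in the SAM
flag, here is a small Python sketch. The bit values are those defined in the
SAM specification. The mapping itself is my assumption: I treat DI F as "first
in pair" and DI R as "second in pair", and take the strand of the read in the
contig as a separate input. The function name is hypothetical.

```python
# Flag bits as defined in the SAM specification.
PAIRED = 0x1
READ_REVERSE = 0x10
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def sam_flag(di, on_reverse_strand):
    """Build a SAM flag from a MAF DI value ('F', 'R' or None) and the read's strand."""
    flag = 0
    if di in ("F", "R"):
        # Assumption: any read with a DI line is part of a pair.
        flag |= PAIRED
        # Assumption: DI F maps to first in pair, DI R to second in pair.
        flag |= FIRST_IN_PAIR if di == "F" else SECOND_IN_PAIR
    if on_reverse_strand:
        flag |= READ_REVERSE
    return flag


print(sam_flag("F", False))
print(sam_flag("R", True))
print(sam_flag(None, False))
```

A real converter would also need to set the mate-related bits (mate unmapped,
mate on the reverse strand), which requires looking up the partner read first.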

I think it might be possible to write a MAF to SAM conversion tool...
(which can then be turned into BAM easily).
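
One step such a converter would need is translating padded contig coordinates
into the unpadded coordinates of the accompanying FASTA file. A minimal Python
sketch of that step follows. It assumes the padded consensus marks gaps with
'*', which is my assumption about the pad character, and the function name and
toy sequence are made up for illustration.

```python
def unpadded_positions(padded_consensus, pad="*"):
    """Map each padded 0-based column to its unpadded 0-based position.

    Pad columns have no unpadded equivalent and map to None.
    """
    mapping = []
    count = 0
    for base in padded_consensus:
        if base == pad:
            mapping.append(None)
        else:
            mapping.append(count)
            count += 1
    return mapping


consensus = "AC*GT"  # toy padded consensus
mapping = unpadded_positions(consensus)
unpadded = consensus.replace("*", "")
print(mapping)
print(unpadded)
```

Pad columns inside a read would then have to be expressed in the CIGAR string
of the SAM record rather than in the reference sequence.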

Bastien - Is SAM (or even BAM) output something you are thinking
about supporting directly in MIRA?

Thanks,

Peter

