[mira_talk] Re: Paired end info - SAM/BAM output from MIRA?

  • From: Lionel Guy <guy.lionel@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 1 Jul 2010 12:35:55 +0200

Hi Peter,

I personally use consed to look at scaffolds/paired-end data between contigs (if this is what you want to do). Consed looks ugly and old, but it's quite powerful for viewing assemblies (contigs and the pairs between them, repeats) and for modifying them (join/tear/rearrange contigs, modify the consensus, etc). The price to pay is having to work with the crappy ace file format, which contains no quality values.

For all my projects, after the assembly with mira, I modify the ace file output by mira and create phdballs, which consed uses to store quality AND paired-end information (...). I have a pretty experimental script that can do that, but, as I said, it's experimental... if you want to test it, it is shipped with the mira 3rd party scripts (caf2aceMiraConsed.pl).

Good luck ;)

Lionel

On 1 Jul 2010, at 11:59, Peter wrote:

Hi Bastien et al,

I'm starting to get more paired end data, and would like to be able to look
at this information in alignment viewers. What MIRA output file format
would you recommend for this?

Some file formats (like ACE) do not store the pair information explicitly - so the viewer would have to resort to inferring pairs based on the read
names (messy!), and there is no way to get the expected separation.

I've been reading the MIRA Assembly Format (MAF) document,
http://mira-assembler.sourceforge.net/docs/mira_maf.html
and it looks like the TN and DI lines are used to record pairings,
plus the TF and TT lines give information about the separation.
That's great, but I don't know of any alignment viewing tools which
support MAF directly.
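
A minimal sketch of how the pairing fields mentioned above could be pulled out of a MAF file, in Python. It is not part of MIRA. The tag names RD, TN, DI, TF and TT come from the discussion above; the assumptions are that each line is a tag followed by whitespace and a value, and that a read record is closed by an ER line. The function name parse_maf_pairing and the dictionary keys are made up for illustration.

```python
def parse_maf_pairing(lines):
    """Collect the pairing-related fields of each read from MAF-style lines.

    Assumed layout (hypothetical, check against the MAF document):
    one 'TAG<whitespace>value' per line, a read starts at RD and ends at ER.
    """
    reads = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        parts = line.split(None, 1)
        tag = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if tag == "RD":
            # Start of a new read record; value is the read name.
            current = {"read": value}
        elif current is None:
            # Line outside of any read record: not relevant for pairing.
            continue
        elif tag == "TN":
            current["template"] = value        # template (pair) name
        elif tag == "DI":
            current["direction"] = value       # read direction within the template
        elif tag == "TF":
            current["size_min"] = int(value)   # lower bound of the template size
        elif tag == "TT":
            current["size_max"] = int(value)   # upper bound of the template size
        elif tag == "ER":
            # Assumed end-of-read marker.
            reads.append(current)
            current = None
    return reads
```

Reads sharing the same "template" value would then be treated as mates, with size_min and size_max giving the expected separation.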

I am familiar with the SAM/BAM format which does have explicit
support for paired reads - although in MAF terms it only stores the
template name (the TN line in MAF) and forward/reverse read
information (DI line in MAF, part of the flag in SAM/BAM), while
the original read name (RD) is lost. To me it would be great for
MIRA to offer SAM/BAM output, since these are widely supported
in alignment viewing software. Note that SAM/BAM files do not
store the contig sequences, just a reference to the contig names
which are usually held in an accompanying (unpadded) FASTA file.
http://samtools.sourceforge.net/SAM1.pdf
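
To illustrate the "part of the flag" remark above, here is a small Python sketch that decodes the pairing-related bits of a SAM FLAG value. The bit values are those defined in the SAM specification; the function name decode_sam_flag and the key names are made up for this example.

```python
# Pairing-related bits of the SAM FLAG field (values from the SAM specification).
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
}


def decode_sam_flag(flag):
    """Return a dict mapping each pairing-related property to True/False."""
    return {name: bool(flag & bit) for bit, name in SAM_FLAG_BITS.items()}


if __name__ == "__main__":
    # 99 and 147 are a common pair of flag values for two properly paired mates.
    for flag in (99, 147):
        set_bits = [name for name, on in decode_sam_flag(flag).items() if on]
        print(flag, set_bits)
```

A MAF to SAM converter would have to go the other way and build this value from the DI line plus the mate's placement.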

I think it might be possible to write a MAF to SAM conversion tool...
(which can then be turned into BAM easily).

Bastien - Is SAM (or even BAM) output something you are thinking
about supporting directly in MIRA?

Thanks,

Peter

--
You have received this mail because you are subscribed to the mira_talk mailing list. For information on how to subscribe or unsubscribe, please visit http://www.chevreux.org/mira_mailinglists.html



