[mira_talk] Re: beginner q on hybrid assembly

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 3 Sep 2009 21:27:39 +0200

On Thursday 03 September 2009 Sheri Simmons wrote:
> What is the best way to tell MIRA which data is
> which, and is it preferable to use phd files rather than coupled
> fasta/qual files? 

Hello Sheri,

You can have one input file per sequencing technology, i.e., you can have one 
phdball for the Sanger data and a FASTA/FASTA quality combo for the 454 data. 
Or a FASTA/FASTA quality pair for Sanger and another FASTA/FASTA quality pair 
for 454.

Note: I hope the PHD reading routine works; it has been literally ages since 
it was actively used. It might therefore be preferable to use FASTA/FASTA 
quality files as input.
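
Before handing a FASTA/FASTA quality pair to an assembler, it can help to 
check that the two files actually match. The following is only a minimal 
sketch in Python, not part of MIRA: the function names are made up for 
illustration, and it assumes the common convention that the read name is the 
first whitespace-delimited token after ">" and that quality values are 
whitespace-separated integers, one per base.

```python
def read_fasta_like(lines):
    """Parse FASTA-style records into a dict: read name -> list of body lines.

    Assumption: the read name is the first whitespace-delimited token
    after '>'; anything else on the header line is ignored.
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return records


def check_pair(fasta_lines, qual_lines):
    """Return a list of human-readable mismatches between the two inputs."""
    seqs = {n: "".join(b) for n, b in read_fasta_like(fasta_lines).items()}
    quals = {n: " ".join(b).split()
             for n, b in read_fasta_like(qual_lines).items()}
    problems = []
    for name, seq in seqs.items():
        if name not in quals:
            problems.append(f"{name}: no quality record")
        elif len(quals[name]) != len(seq):
            problems.append(
                f"{name}: {len(seq)} bases but {len(quals[name])} quality values"
            )
    for name in quals:
        if name not in seqs:
            problems.append(f"{name}: quality record without sequence")
    return problems
```

To use it on real files, pass open file handles, e.g. 
check_pair(open("reads.fasta"), open("reads.fasta.qual")), where both file 
names are placeholders. An empty list means every read has a quality record 
of matching length.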

> Is there a particular header format I should be
> using in the fasta file? (e.g. 454's gsAssembler has headers as

No, just the standard format. MIRA does not read more than the read name and 
the sequence. When using paired-end reads (templates), however, the read names 
should conform to certain conventions. More on that in the documentation for 
the read naming scheme parameter -LR:rns.

> And how do I indicate the presence of multiple input files on the command
> line?

As I wrote: only one input file/type per sequencing technology. Can you please 
read the section "A Sanger / 454 hybrid assembly walkthrough" of the help file 
dealing with 454 data? This should point you at how it's done (and if not, 
don't hesitate to ask :-)

Regards,
  Bastien


