[mira_talk] Re: Assembly of contigs

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: Panos Ioannidis <panos.ioannidis@xxxxxxxxx>
  • Date: Mon, 21 Feb 2011 22:15:33 +0100

On Monday 21 February 2011 15:10:03 you wrote:
> I have two fasta files containing the assembled contigs that came from 454
> and Solexa (from the genome). Since it's the contigs and not the reads we
> are talking about, can I concatenate the two files and run an assembly on
> this combined dataset?

In theory you could, but ... see below.

> And if so, what sequencing technology should I put
> in the "--job" parameter? Just 454? Does it really matter in this case? Is
> there some kind of parameter to run mira just on contigs (fasta files with
> no quality values and no particular sequencing technology)?
> 
> Also, some of my contigs are rather big (>300kb) and causes mira to crash.

May I gently point to the following sentence which MIRA helpfully prints out 
at the end of your process:

   Thank you for noticing that this is NOT a crash, but a
   controlled program stop.

Indeed, a crash would be a segmentation violation or any other kind of error 
giving the operating system a good reason to kill the process. You, however, 
tried to run MIRA outside its specifications, which "reads" of more than 
300kb certainly are.

MIRA recognised that and went on strike.

Not without telling you the reason, I suppose ("reads too long" or something 
similar should be somewhere in that error message, too).

> Is there a parameter that defines the maximum input sequence length?
> (command line: mira -project=2192_combined --job=denovo,genome,normal,454
> -notraceinfo 454_SETTINGS -LR:wqf=no -AS:epoq=no)

No. The maximum read length at the moment is somewhere between 15 and 20kb (I 
don't remember exactly). MIRA is not a genome aligner built to align 
genome-sized sequences, sorry.
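
To see up front which sequences would trip that limit, something like the 
following Python sketch could be used. It is only an illustration: the 
15000 default is an assumption taken from the lower end of the range 
mentioned above, not a value read from MIRA, and parse_fasta and 
find_overlong are hypothetical helper names.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first word of the header as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_overlong(lines: Iterable[str], limit: int = 15000) -> List[Tuple[str, int]]:
    """Return (name, length) for every sequence longer than `limit`.

    `limit` is an assumed threshold, not a constant taken from MIRA.
    """
    return [(name, len(seq)) for name, seq in parse_fasta(lines) if len(seq) > limit]


if __name__ == "__main__":
    import sys

    # Usage: python check_lengths.py contigs.fasta
    with open(sys.argv[1]) as handle:
        for name, length in find_overlong(handle):
            print(f"{name}\t{length}")
```

Any sequence this reports is a candidate for fragmenting before it goes 
anywhere near an assembler that expects reads.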

You have three options:
- use another program. I think I was told phrap aligns very long sequences. 
However, I would not recommend this as you are at the mercy of assembly 
errors. Therefore, you should either ...
- fragment both the 454 and Solexa sequences (fasta2frag.tcl will help you 
there) and assemble these de-novo, or ...
- ... simply assemble 454 and Solexa hybrid de-novo.
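
For the fragmenting option, the idea can be sketched in Python as below. 
This is not the fasta2frag.tcl implementation and does not reproduce its 
options; the function names, the fragment length of 500 and the step of 250 
are all assumptions chosen for illustration. It cuts a long sequence into 
overlapping, read-sized pieces and makes sure the tail is covered.

```python
from typing import List, Tuple


def fragment(seq: str, frag_len: int = 500, step: int = 250) -> List[Tuple[int, str]]:
    """Cut `seq` into overlapping windows, returned as (start, fragment) pairs.

    `frag_len` and `step` are illustrative defaults, not fasta2frag.tcl settings.
    """
    if frag_len <= 0 or step <= 0:
        raise ValueError("frag_len and step must be positive")
    if not seq:
        return []
    if len(seq) <= frag_len:
        # Short sequences are passed through as a single fragment.
        return [(0, seq)]
    last_start = len(seq) - frag_len
    starts = list(range(0, last_start + 1, step))
    # Add a final window so the end of the sequence is always covered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s, seq[s:s + frag_len]) for s in starts]


def fragments_as_fasta(name: str, seq: str, frag_len: int = 500, step: int = 250) -> List[str]:
    """Format the fragments of one sequence as FASTA lines.

    The "<name>_frag_<start>" naming scheme is a hypothetical convention.
    """
    lines = []
    for start, piece in fragment(seq, frag_len, step):
        lines.append(f">{name}_frag_{start}")
        lines.append(piece)
    return lines


if __name__ == "__main__":
    # Tiny demonstration with a made-up contig.
    for line in fragments_as_fasta("contig1", "ACGTACGTACG", frag_len=4, step=3):
        print(line)
```

A step smaller than the fragment length gives overlapping pieces, which is 
what lets an assembler join them up again. The resulting fragments carry no 
quality values, just like the contigs they came from.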

Hope that helps,
  B.

PS: Please send mails like these only to the MIRA talk list, no CC to me.

-- 
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
