[mira_talk] Re: big file in the log directory

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 23 Mar 2011 22:15:15 +0100

On Wednesday 23 March 2011 20:21:04 Stephanie Pearl wrote:
> I was talking some more with our computing staff and it turns out that my
> job was actually running on a node with just 16 GB of RAM. Since my job was
> using 19 GB, it was swapping memory to disk and therefore slowing the
> program down. So it appears that my problem is solved. Thanks for the
> assistance, though!

Well, one problem down. However, I recommend that you really switch to at 
least 3.2.1. I do not think you would regret it, quite the contrary.


> The command line that I entered for my assembly (which is still running)
> was:
> 
> mira -project=hybridmulti -job=denovo,est,accurate,sanger,454
> -noclipping=454 -notraceinfo -fasta -CO:asir=yes -GE:not=4
> SANGER_SETTINGS -LR:wqf=no -AS:bdq=30:epoq=no:mrl=50 -CL:qc=no:bsqc=no
> -AL:egp=yes:egpl=10 -ED:ace=no 454_SETTINGS -LR:lsd=yes -AS:mrl=50
> -AL:egp=no:mrs=94 -ED:ace=no -OUT:sssip=yes

Hmmmmm ... I don't like that command line. At all. At least not for the job 
you are trying to do.

You wrote you had 3 strains (2 in 454, one in Sanger). Just out of curiosity: 
may I ask how closely related these strains are? And now on to the real 
problematic areas: 

1) Why do you use "-CO:asir=yes" but not "-SB:lsd=yes"? You should be able to 
assign a strain to each read, right? By doing that and telling MIRA about it 
("-SB:lsd=yes"), you enable MIRA to find out all by itself what is a SNP and 
what is a real repeat. Then you do not need "-CO:asir=yes" anymore (it is 
counterproductive for most use cases).

2) Why are you switching off the automatic editor for 454? You rob MIRA of one 
of the most potent improvement tools it has, which is not good.

3) No qualities for the Sanger reads? Ouch ... why's that?

4) Maybe a problem: "454_SETTINGS -AL:egp=no" will cluster together repeats 
which have indels of >= 3 bases, e.g.:
     actgtgactgactgactgtgactgatgac
     actgtgactga******gtgactgatgac
Depending on what you want to do, you may or may not want this. For a de-novo 
assembly of closely related strains, I would not do that though.
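
To make point 4 a bit more concrete, here is a small Python sketch that reports 
gap runs of 3 or more bases in an aligned read, using the two sequences from 
the example above. This is only an illustration and not part of MIRA: the 
function name gap_runs, the min_len parameter, and the use of "*" as the gap 
character are assumptions made for this example.

```python
def gap_runs(aligned, gap="*", min_len=3):
    """Return (start, length) for every run of gap characters
    that is at least min_len long (hypothetical helper, not a MIRA API)."""
    runs = []
    start = None
    for i, ch in enumerate(aligned):
        if ch == gap:
            if start is None:
                start = i  # a new gap run begins here
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # handle a gap run that reaches the end of the sequence
    if start is not None and len(aligned) - start >= min_len:
        runs.append((start, len(aligned) - start))
    return runs


# The aligned pair from the example above
ref = "actgtgactgactgactgtgactgatgac"
read = "actgtgactga******gtgactgatgac"

for start, length in gap_runs(read):
    # show which bases of the other sequence fall into the gap
    print(start, length, ref[start:start + length])
```

For the pair above, this finds a single run of 6 gap characters starting at 
position 11 (0-based), opposite the bases "ctgact". That is the kind of 
difference that would still be clustered together with "-AL:egp=no".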

B.
