[mira_talk] Re: Make us check a part of results

  • From: Lionel Guy <guy.lionel@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 15 Jun 2009 12:21:38 +0200

Hi Kenta,


1)
  We will use the mira_out.txt output for further analysis. We would be
happy if MIRA wrote the result txt file contig by contig, or appended each
finished contig to a result txt file, so that we could check part of the
results without waiting for the whole assembly to finish.

Check the assembly_d_log folder: MIRA normally writes CAF files (assembly files) there after each pass. Please note, however, that contigs may (and probably will) still change before the last pass.
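
If it helps, here is a minimal Python sketch for finding the most recent intermediate file in that folder. It assumes the files end in ".caf" and that the file modification time reflects the order of the passes; the function name and the folder path in the usage line are only placeholders, not anything MIRA defines.

```python
from pathlib import Path


def caf_files_newest_first(folder):
    """Return the *.caf files in `folder`, most recently modified first.

    Assumption: intermediate assemblies are plain files with a .caf suffix.
    """
    paths = [p for p in Path(folder).glob("*.caf") if p.is_file()]
    return sorted(paths, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    # "assembly_d_log" is the folder named above; adjust it to your project.
    for path in caf_files_newest_first("assembly_d_log"):
        print(path)
```

The first file printed would be the latest snapshot, with the same caveat as above: its contigs are not final until the last pass is done.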

2)
  Can you estimate how long MIRA takes to finish assembling
500,000 454 reads (ca. 200 Mb) with the options "-fasta
-job=denovo,est,draft,454 -notraceinfo -SB:lsd=yes -SK:pr=95 -SK:mnr=yes
-OUT:ort=yes"?
  Our machine (64-bit Linux, 2 GHz CPU, 128 GB memory) has been running
for more than a week.

Ouch... that seems quite long. On a similar machine with only 8 GB of memory, a run with ~400k 454 reads (~130 Mb), using -job=denovo,genome,normal,454 on our moderate/low-repeat bacteria, takes approximately 12 hours. Is your genome highly repetitive? Which pass is it working on (try grep "Pass" on the assembly log)? You can determine the number of passes MIRA will perform by checking the argument parsing section of the assembly log, under -AS:nop.
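
The same check as the grep can be done with a few lines of Python if you want to script it. This is only a sketch: it assumes nothing about the log format beyond progress lines containing the word "Pass" and the pass-count setting appearing on a line that contains "nop". The function names are made up for illustration, and the log file path is whatever you pass on the command line.

```python
import sys


def matching_lines(lines, needle):
    """Return every line that contains `needle`, without trailing whitespace."""
    return [line.rstrip() for line in lines if needle in line]


def last_match(lines, needle):
    """Return the last line that contains `needle`, or None if there is none."""
    hits = matching_lines(lines, needle)
    return hits[-1] if hits else None


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # The log path is supplied by the user; no default name is assumed.
        with open(sys.argv[1], errors="replace") as handle:
            log_lines = handle.readlines()
        print("Latest pass line:", last_match(log_lines, "Pass"))
        for hit in matching_lines(log_lines, "nop"):
            print("Possible nop setting:", hit)
```

Comparing the latest "Pass" line with the -AS:nop value should give a rough idea of how far along the run is.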

3)
  Please tell us the difference between contig headers, e.g., mira_c#
and mira_lrc#.

That's a very good question, one I have wanted to ask Bastien for a long time... By the way, I think there are more contig tags, and I would be curious to know what they mean.

Hope that helps,

Lionel

