[mira_talk] Re: Mira says "killed" as last word after being almost done

  • From: Adrian Pelin <apelin20@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Sun, 15 May 2011 15:42:30 -0400

Interesting indeed.

But if kpmf is set at 20 (and the default is 15), why does this still
happen? The swap space was only 1 GB last time; I have now increased it to
49 GB, and the RAM itself is 48 GB. I am doing a hybrid assembly with high
coverages so it is to be expected that the program requires lots of memory.
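
For reference, a minimal sketch of how the total virtual memory (RAM plus swap, the sum Bob mentions in his reply quoted below) could be checked on a Linux host. It assumes /proc/meminfo exists and reports MemTotal and SwapTotal in kB; the function name total_virtual_gib is made up for illustration and is not part of MIRA.

```shell
#!/bin/sh
# Hypothetical helper (not part of MIRA): sum MemTotal and SwapTotal from a
# meminfo-style file and print the total in GiB, rounded down.
# Usage: total_virtual_gib [file]   (defaults to /proc/meminfo)
total_virtual_gib() {
    awk '/^(MemTotal|SwapTotal):/ { kb += $2 }
         END { printf "%d\n", kb / 1048576 }' "${1:-/proc/meminfo}"
}

# Print the total only when a Linux-style /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    echo "RAM + swap: $(total_virtual_gib) GiB"
fi
```

With 48 GB of RAM and 49 GB of swap this should report roughly 97 GiB, which is the ceiling the kernel enforces no matter what kpmf is set to.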

Thanks




On Sun, May 15, 2011 at 1:45 PM, Robert Bruccoleri <
bruc@xxxxxxxxxxxxxxxxxxxxx> wrote:

>  Dear Colleagues,
>     My experience with Mira is that "killed" means Mira was killed by the
> Linux kernel because it ran out of virtual memory (sum of RAM and swap
> space). The system log in /var/log/messages should have a message about it
> at the time of the last modification of the log file. 'nohup' will not stop
> this from happening.
>     If you want to test this hypothesis and are willing to wait  a really
> long time, add more storage to the swap space and see what happens.
> Alternatively, increase the Mira parameters for minimum percentage overlap
> for reads (like mrs) to higher values so the overlap graph is reduced in
> size. However, a genome with lots of repetitive sequence and high coverage
> will still require lots of RAM.
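
A minimal sketch of the log check described above, assuming a Linux host whose kernel writes its usual OOM-killer messages ("invoked oom-killer", "Out of memory", "Killed process") to /var/log/messages or to the kernel ring buffer read by dmesg. The helper name oom_events is hypothetical, and the exact message wording may differ between kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: print lines that look like kernel OOM-killer messages.
# Reads the files given as arguments, or standard input when none are given.
oom_events() {
    grep -Ei 'oom-killer|out of memory|killed process' "$@"
}

# Inspect the system log if it is readable; otherwise fall back to the
# kernel ring buffer via dmesg. Say so explicitly when nothing matches.
if [ -r /var/log/messages ]; then
    oom_events /var/log/messages || echo "no OOM-killer lines found"
else
    dmesg 2>/dev/null | oom_events || echo "no OOM-killer lines found"
fi
```

If a matching line names the mira process at about the time the log file stopped growing, that would support the out-of-memory explanation.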
>
>     Regards,
>     Bob
>
>
> Alex Copeland wrote:
>
> Hi,
> I encourage you to use 'nohup' (or screen)...
>
> 'nohup' can prevent a job from being terminated when the controlling
> terminal receives a signal to close. Using '&' to background the job
> still leaves you vulnerable to the shell getting a ctrl-D either from
> a keyboard accident or line mischief if it's running in a remote
> session. 'nohup' has saved me numerous times, but since I regularly
> forget to run jobs using it, and there's no way to go back and fix
> this, I've mostly switched to using screen which serves the same
> purpose and has other nice features.
>
> Best,
> Alex
>
> On Thu, May 12, 2011 at 9:58 AM, Adrian Pelin <apelin20@xxxxxxxxx> wrote:
>
>
>  I am not sure that nohup is much different from a & at the end of the cmd
> line. So yeah, the SKIM algo uses multiple threads, however the other
> algorithms (the manual doesn't mention which others there are) are not
> multi-threaded.
>
>
>
> On Thu, May 12, 2011 at 12:46 PM, John Nash <john.he.nash@xxxxxxxxx> wrote:
>
>
>  You can kill it with "kill -9 'process-id' ". Find the process id using
> ps, or "ps -ef | grep your_name" if the list is too long.
> From the manual:
> [number_of_threads(not)=1 ≤ integer ≤ 256]
>
> Default is 2. Master switch to set the number of threads used in different
> parts of mira.
>
> Note 1: currently only the SKIM algorithm uses multiple threads, other
> parts will follow.
>
> Note 2: Although the main data structures are shared between the threads,
> there's some additional memory needed for each thread.
>
> Note 3: when running the SKIM in parallel threads, MIRA can give different
> results when started with the same data and same arguments. While the effect
> could be averted for SKIM, the memory cost for doing so would be an
> additional 50% for one of the large tables, so this has not been implemented
> at the moment. Besides, at the latest when the Smith-Watermans run in
> parallel, this could not be easily avoided at all.
>
> I interpreted this to mean that "the computer will use multiple processors
> when the SKIM algorithm is used". I understand that using multiple
> processors may change some results but I usually assemble to a backbone
> genome.
> cheers,
> John
>
> On 2011-05-12, at 12:38 PM, Adrian Pelin wrote:
>
> And nohup means nothing can kill it? I also understand that using the
> multi-threaded setting not=8 makes mira use another algorithm.
>
>
>
> On Thu, May 12, 2011 at 12:22 PM, John Nash <john.he.nash@xxxxxxxxx> wrote:
>
>
>  When I run mira, I include:
> nohup mira --project=whatever --job=denovo,genome,accurate,454
> -GE:not=8:kpmf=15 >&log_assembly.txt &
> not=8 uses 8 processors in some of the assembly stages. kpmf=15
> (depending on which server I use, I change it from 10-20 according to
> experience) is the command to "keep percent memory free"
> Hth
> John
> On 2011-05-12, at 12:13 PM, Adrian Pelin wrote:
>
> No, this was done on VMware, me being the only root/user, no other person
> knows the passwd. No other jobs have been running.
>
> What do you mean by changing kpmf to 20?
>
> On Thu, May 12, 2011 at 11:58 AM, John Nash <john.he.nash@xxxxxxxxx> wrote:
>
>
>  I have had mira crash a few times on my 12 CPU 64-bit Dell server
> running SLES (yuk), 32 GB RAM.
> Looking at dmesg, it appears that each time it was a RAM resources
> issue.  It turned out that somebody else was running a huge job on the
> server at night, which caused problems.  Are you the only user?  Are there
> other automated jobs which are RAM intensive which could be causing the
> crash?  Have you tried changing kpmf to 20? Do you use all 8 processors?
> FWIW, I run mira as "nohup mira etc... &" after some uh-oh moments.
> John
>
>
> On 2011-05-12, at 11:28 AM, Adrian Pelin wrote:
>
> The & is a neat idea. However, commenting on:
>
>
>
>  - Maybe you killed it by error when connecting... does the creation
> time of your log correspond to the time you connected remotely?
>
>
>  Likely not, since the last modification to any of the listed files
> was 1 h before I connected. And to kill it you need to ctrl+c it, and then
> it does not say killed; I have killed it with ctrl+c many times and it never
> said Killed. It likely got killed by something, and the only culprit is the
> OS. I think it has to do with the OOM Killer, which kills processes when the
> system runs out of memory. I told Mira to leave 15% of memory free but who
> knows, maybe it went crazy on the CPU and that is why it got killed, or
> maybe it is the running time that you mentioned.
>
> This is a 32 GB server with 2 quad core cpus.
>
>
>
> On Thu, May 12, 2011 at 11:21 AM, Lionel Guy <guy.lionel@xxxxxxxxx> wrote:
>
>
>  On 12 May 2011, at 17:12 , Adrian Pelin wrote:
>
>
>
>  - I did not do it, since I was home and connected remotely only to find
> that it was dead
>
>
>  Maybe you killed it by error when connecting... does the creation
> time of your log correspond to the time you connected remotely?
>
>
>
>  - That means that only the OS could have killed it, because of exceeded
> resource usage or max run time; which of the two it is, I have no idea :(
>
>
>  I doubt it; that would be the case if you were running things on a
> cluster with a queuing system, not on a standard desktop box.
>
> I'd just run it again (try to run it in the background to avoid logging
> off problems)
>
> mira --fastq --project=gigaspora -proout=gigaspora_denovo
> --job=denovo,genome,accurate,454,solexa SOLEXA_SETTINGS
> -GE:tismin=50:tismax=350:tpbd=1 > log_hybdn.txt &
>
> (note the "&" at the far end of the command)
> Lionel
> --
> You have received this mail because you are subscribed to the mira_talk
> mailing list. For information on how to subscribe or unsubscribe, please
> visit http://www.chevreux.org/mira_mailinglists.html
>
>
>
