[mira_talk] Re: Mira says "killed" as last word after being almost done

  • From: Alex Copeland <accopeland@xxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 13 May 2011 11:28:21 -0700

Hi,
I encourage you to use 'nohup' (or screen)...

'nohup' makes a job ignore the hangup signal (SIGHUP), so it is not
terminated when the controlling terminal closes. Using '&' to background the job
still leaves you vulnerable to the shell getting a ctrl-D either from
a keyboard accident or line mischief if it's running in a remote
session. 'nohup' has saved me numerous times, but since I regularly
forget to run jobs using it, and there's no way to go back and fix
this, I've mostly switched to using screen which serves the same
purpose and has other nice features.
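
A minimal sketch of both approaches, assuming a POSIX shell. The
sh -c 'sleep 1; ...' command is only a stand-in for the real mira command
line, and the log file name nohup_demo.log is made up for the illustration:

```shell
# Stand-in for a long-running assembly; substitute the real mira command line.
# nohup makes the job ignore the hangup signal (SIGHUP) delivered when the
# terminal closes; '&' alone only puts the job in the background.
nohup sh -c 'sleep 1; echo "assembly finished"' > nohup_demo.log 2>&1 &
job_pid=$!
echo "started job $job_pid"

# In real use you would log out here; for the demo we simply wait for the job.
wait "$job_pid"
cat nohup_demo.log

# screen alternative (left as comments because it needs an interactive terminal):
#   screen -S assembly      # start a named session, then launch the job inside it
#   Ctrl-a d                # detach; the job keeps running
#   screen -r assembly      # reattach later, even from a new login
```

Redirecting both stdout and stderr to a file keeps nohup from writing to
nohup.out and leaves a log to inspect after reconnecting.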

Best,
Alex

On Thu, May 12, 2011 at 9:58 AM, Adrian Pelin <apelin20@xxxxxxxxx> wrote:
> I am not sure that nohup is much different from an & at the end of the command
> line. So yes, the SKIM algorithm uses multiple threads; however, the other algorithms
> (the manual doesn't mention which other ones there are) are not multithreaded.
>
>
>
> On Thu, May 12, 2011 at 12:46 PM, John Nash <john.he.nash@xxxxxxxxx> wrote:
>>
>> You can kill it with "kill -9 'process-id' ". Find the process id using
>> ps, or "ps -ef | grep your_name" if the list is too long.
>> From the manual:
>> [number_of_threads(not)=1 ≤ integer ≤ 256]
>>
>> Default is 2. Master switch to set the number of threads used in different
>> parts of mira.
>>
>> Note 1: currently only the SKIM algorithm uses multiple threads, other
>> parts will follow.
>>
>> Note 2: Although the main data structures are shared between the threads,
>> there's some additional memory needed for each thread.
>>
>> Note 3: when running the SKIM in parallel threads, MIRA can give different
>> results when started with the same data and same arguments. While the effect
>> could be averted for SKIM, the memory cost for doing so would be an
>> additional 50% for one of the large tables, so this has not been implemented
>> at the moment. Besides, at the latest when the Smith-Watermans run in
>> parallel, this could not be easily avoided at all.
>>
>> I interpreted this to mean that "the computer will use multiple processors
>> when the SKIM algorithm is used". I understand that using multiple
>> processors may change some results but I usually assemble to a backbone
>> genome.
>> cheers,
>> John
>>
>> On 2011-05-12, at 12:38 PM, Adrian Pelin wrote:
>>
>> And nohup means nothing can kill it? I also understand that running
>> multithreaded with not=8 makes mira use another algorithm.
>>
>>
>>
>> On Thu, May 12, 2011 at 12:22 PM, John Nash <john.he.nash@xxxxxxxxx>
>> wrote:
>>>
>>> When I run mira, I include:
>>> nohup mira --project=whatever --job=denovo,genome,accurate,454
>>> -GE:not=8:kpmf=15 >&log_assembly.txt &
>>> not=8 uses 8 processors in some of the assembly stages. kpmf=15
>>> (depending on which server I use, I change it from 10-20 according to
>>> experience) is the command to "keep percent memory free"
>>> Hth
>>> John
>>> On 2011-05-12, at 12:13 PM, Adrian Pelin wrote:
>>>
>>> No, this was done on VMware, me being the only root/user, no other person
>>> knows the passwd. No other jobs have been running.
>>>
>>> What do you mean by changing kpmf to 20?
>>>
>>> On Thu, May 12, 2011 at 11:58 AM, John Nash <john.he.nash@xxxxxxxxx>
>>> wrote:
>>>>
>>>> I have had mira crash a few times on my 12 CPU 64-bit Dell server
>>>> running SLES (yuk), 32 GB RAM.
>>>> Looking at dmesg, it appears that each time it was a RAM resources
>>>> issue.  It turned out that somebody else was running a huge job on the
>>>> server at night, which caused problems.  Are you the only user?  Are there
>>>> other automated jobs which are RAM intensive which could be causing the
>>>> crash?  Have you tried changing kpmf to 20? Do you use all 8 processors?
>>>> FWIW, I run mira as "nohup mira etc... &" after some uh-oh moments.
>>>> John
>>>>
>>>>
>>>> On 2011-05-12, at 11:28 AM, Adrian Pelin wrote:
>>>>
>>>> The & is a neat idea. However, commenting on:
>>>>
>>>> > - Maybe you killed it by error when connecting... does the creation
>>>> > time of your log correspond to the time you connected remotely?
>>>>
>>>> Likely not, since the last modification to any of the listed files
>>>> was 1 h before I connected. Also, to kill it you need to press Ctrl+C, and
>>>> that does not print "Killed": I have killed it with Ctrl+C many times and it
>>>> never said "Killed". It likely got killed by something, and the only culprit
>>>> is the OS. I think it has to do with the OOM killer, which kills processes
>>>> when the system runs out of memory. I told mira to leave 15% of memory free,
>>>> but who knows, maybe it went crazy on the CPU and that is why it got killed,
>>>> or maybe it is the running time that you mentioned.
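
The shell prints "Killed" when a process dies from SIGKILL, which is what the
OOM killer sends; Ctrl+C sends SIGINT instead. One way to check the OOM-killer
theory is to search the kernel log with dmesg. A minimal sketch, assuming a
Linux box; the function name oom_lines is made up for this example, and the
wording of kernel messages varies between kernel versions, so the pattern is
deliberately loose:

```shell
# Filter stdin for lines that typically accompany an OOM kill.
# The pattern is an assumption; kernel message wording differs between versions.
oom_lines() {
    grep -i -E 'out of memory|oom-killer|killed process' || true
}

# dmesg may require root on some systems; errors are silenced for the sketch.
dmesg 2>/dev/null | oom_lines || true
```

If this prints a line naming the mira process, memory exhaustion is the
likely cause; empty output does not rule it out, because the kernel log
buffer may have wrapped since the crash.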
>>>>
>>>> This is a 32 GB server with 2 quad core cpus.
>>>>
>>>>
>>>>
>>>> On Thu, May 12, 2011 at 11:21 AM, Lionel Guy <guy.lionel@xxxxxxxxx>
>>>> wrote:
>>>>>
>>>>> On 12 May 2011, at 17:12 , Adrian Pelin wrote:
>>>>>
>>>>> > - I did not do it, since I was home and connected remotely only to
>>>>> > find out it was dead
>>>>>
>>>>> Maybe you killed it by error when connecting... does the creation
>>>>> time of your log correspond to the time you connected remotely?
>>>>>
>>>>> > - That means that only the OS could have killed it, because of exceeded
>>>>> > resource usage or max run time; which one it is I have no idea :(
>>>>>
>>>>> I doubt it; that would be the case if you were running things on a
>>>>> cluster with a queuing system, not on a standard desktop box.
>>>>>
>>>>> I'd just run it again (try to run it in the background to avoid logging
>>>>> off problems)
>>>>>
>>>>> mira --fastq --project=gigaspora -proout=gigaspora_denovo
>>>>> --job=denovo,genome,accurate,454,solexa SOLEXA_SETTINGS
>>>>> -GE:tismin=50:tismax=350;tpbd=1 > log_hybdn.txt &
>>>>>
>>>>> (note the "&" at the far end of the command)
>>>>> Lionel
>>>>> --
>>>>> You have received this mail because you are subscribed to the mira_talk
>>>>> mailing list. For information on how to subscribe or unsubscribe, please
>>>>> visit http://www.chevreux.org/mira_mailinglists.html
>>>>
>>>>
>>>
>>>
>>
>>
>
>

