[contestms] Re: Queue not moving

  • From: Madhavan Mukund <madhavan@xxxxxxxxx>
  • To: contestms@xxxxxxxxxxxxx
  • Date: Sun, 22 Nov 2015 17:45:23 +0530

Will do.

On Sun, Nov 22, 2015 at 12:08:25PM +0000, Stefano Maggiolo
<s.maggiolo@xxxxxxxxx> wrote:

Can you send me a complete log of ES from start to hang? Please send it to my
personal email (s.maggiolo@xxxxxxxxx).

Is it true that the hang does not happen always at the same submission? 
How many submissions can you process for each run, more or less?
When it hangs, is ES or something else using 100% CPU or not?

Things you can try:
1. use only one worker, see if it still hangs
2. disable the periodic check for non-queued submissions, commenting this line


On 22 November 2015 at 11:46, Madhavan Mukund <madhavan@xxxxxxxxx> wrote:

1. This patch does not seem to help.

2. My worry about malformed files was mistaken.  It was the icon that
   links to SyntaxHighlighter when you look at the code through the
   Admin web interface.

--Madhavan

On Sun, Nov 22, 2015 at 07:52:37AM +0000, Stefano Maggiolo
<s.maggiolo@xxxxxxxxx> wrote:
> Can you also try to apply this patch?
> https://github.com/cms-dev/cms/commit/5a3ac6d2f124e4120834e70c3f7587f610e3eb47
>
> If something is malformed I think it is very likely that you would see a
> stack trace, rather than just a hanging...
>
> On 22 November 2015 at 07:28, Madhavan Mukund <madhavan@xxxxxxxxx> wrote:
>
>     I suspect there may be spurious characters in some of the submissions.
>     The contest was conducted on local systems in multiple locations. The
>     submissions were archived and sent in a spreadsheet! I'll check and let
>     you know.
>
>     --Madhavan
>
>
>     -----Original Message-----
>     From: Madhavan Mukund <madhavan@xxxxxxxxx>
>     To: contestms <contestms@xxxxxxxxxxxxx>
>     Sent: Sun, 22 Nov 2015 12:49
>     Subject: [contestms] Re: Queue not moving
>
>     1.2
>
>     --Madhavan
>
>
>     -----Original Message-----
>     From: Stefano Maggiolo <s.maggiolo@xxxxxxxxx>
>     To: contestms <contestms@xxxxxxxxxxxxx>
>     Sent: Sun, 22 Nov 2015 12:37
>     Subject: [contestms] Re: Queue not moving
>
>     Hi Madhavan,
>
>     which version are you using?
>
>     On 22 November 2015 at 03:24, Madhavan Mukund <madhavan@xxxxxxxxx> wrote:
>
>         I'm doing an offline evaluation of about 1700 submissions from our
>         first round national contest.  I find that the evaluation service
>         works for some time and then gets "stuck".  Eventually the workers get
>         disabled saying "no response".
>
>         The worker logs typically show that the last job has been completed.
>
>         For example:
>
>         2015/11/22 08:37:25 - INFO [Worker,0/compile submission 6045] Starting job.
>         2015/11/22 08:37:28 - INFO [Worker,0/compile submission 6045] Finished job.
>
>         2015/11/22 08:37:26 - INFO [Worker,1/compile submission 6053] Starting job.
>         2015/11/22 08:37:27 - INFO [Worker,1/compile submission 6053] Finished job.
>
>         The web admin interface continues to show Worker-0 compiling 6045 and
>         Worker-1 compiling 6053.
>
>         These are the last few lines from the Evaluation Service log.
>
>         2015/11/22 08:37:25 - INFO [EvaluationService,0] Executing operation `compile on 6045 against dataset 101'.
>         2015/11/22 08:37:25 - INFO [EvaluationService,0] Asking worker 0 to `compile on 6045 against dataset 101' (8:32:26.899355 after submission).
>         2015/11/22 08:37:25 - INFO [EvaluationService,0] Operation `compile on 6045 against dataset 101' concluded successfully
>         2015/11/22 08:37:25 - INFO [EvaluationService,0] Executing operation `compile on 6052 against dataset 101'.
>         2015/11/22 08:37:25 - INFO [EvaluationService,0] Asking worker 1 to `compile on 6052 against dataset 101' (8:31:29.407304 after submission).
>         2015/11/22 08:37:26 - INFO [EvaluationService,0] Operation `compile on 6052 against dataset 101' concluded successfully
>         2015/11/22 08:37:26 - INFO [EvaluationService,0] Executing operation `compile on 6053 against dataset 101'.
>         2015/11/22 08:37:26 - INFO [EvaluationService,0] Operation `compile on 6052 against dataset 101' for submission 6052 completed. Success: True.
>         2015/11/22 08:37:26 - INFO [EvaluationService,0] Asking worker 1 to `compile on 6053 against dataset 101' (8:31:21.614886 after submission).
>         2015/11/22 08:37:26 - INFO [EvaluationService,0] [action_finished] Couldn't find submission 6052(101) in the database. Creating it.
>         2015/11/22 08:37:26 - INFO [EvaluationService,0] Operation `compile on 6053 against dataset 101' concluded successfully
>         2015/11/22 08:37:26 - INFO [EvaluationService,0] Executing operation `compile on 6054 against dataset 101'.
>
>         Sometimes I can get the queue to move by killing and restarting the
>         evaluation service, but it is very nondeterministic.
>
>         Some more details:
>
>         1. There are two tasks in the contest.  There are about 1080
>            submissions in the queue for one question and 640 for the other.
>
>         2. Token mode is disabled for the contest and both tasks.
>
>         At present, I've managed to process a little over 200 of the 1720
>         submissions.
>
>         --Madhavan
>
>
>
>
>
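
To see where the queue stops, the Evaluation Service log quoted above can be
scanned for operations that were started but never handed to a worker, or
handed to a worker but never concluded. The following is only a minimal
sketch: the three message patterns are copied from the excerpt in this thread
and may differ in other CMS versions, the function name find_stalled is made
up for illustration, and it assumes one log record per line (not re-wrapped
by a mail client).

```python
import re

# Message patterns copied from the ES log excerpt in this thread; other CMS
# versions may word these messages differently (assumption).
EXECUTING = re.compile(r"Executing operation `([^']+)'")
ASKING = re.compile(r"Asking worker (\d+) to `([^']+)'")
CONCLUDED = re.compile(r"Operation `([^']+)' concluded")


def find_stalled(lines):
    """Return (unassigned, unfinished) for an iterable of ES log lines.

    unassigned: operations logged as "Executing" but never handed to a worker.
    unfinished: dict mapping operation -> worker number for operations handed
                to a worker but never logged as "concluded".
    """
    executing = []     # operations in the order ES started executing them
    asked = {}         # operation -> worker it was handed to
    concluded = set()  # operations ES reported as concluded

    for line in lines:
        match = EXECUTING.search(line)
        if match:
            executing.append(match.group(1))
            continue
        match = ASKING.search(line)
        if match:
            asked[match.group(2)] = int(match.group(1))
            continue
        match = CONCLUDED.search(line)
        if match:
            concluded.add(match.group(1))

    unassigned = [op for op in executing if op not in asked]
    unfinished = {op: w for op, w in asked.items() if op not in concluded}
    return unassigned, unfinished
```

Calling find_stalled on an open log file (for example
find_stalled(open(path)), with path pointing at your own ES log) returns the
two collections. On the twelve ES records quoted above it reports
`compile on 6054 against dataset 101' as started but never handed to a worker,
and nothing as unfinished, which matches the point where the log stops.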



