[argyllcms] Re: Spotread stalling sometimes under Windows

  • From: François Simond <color@xxxxxxxxxxx>
  • To: argyllcms@xxxxxxxxxxxxx
  • Date: Sun, 15 Nov 2015 19:47:46 +0100

On Fri, Nov 13, 2015 at 4:25 AM, Graeme Gill <graeme@xxxxxxxxxxxxx> wrote:

François Simond wrote:

It is like spotread.exe never received one of the commands.

Sounds like a race condition - the thread doing the stdin read
probably got killed just as the character arrived. I don't have any
clever ideas for solving this at the moment, since stdin has
to be poll-able and also readable in other ways in other places in
the code.
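
For illustration only, one way to avoid killing a reader mid-read is to keep a single reader thread alive for the whole run and poll a queue instead. This is a minimal Python sketch of that general idea, not Argyll's code (which is C), and it does not address the constraint that stdin must also be readable in other ways elsewhere. The name StreamPoller is hypothetical.

```python
import io
import queue
import threading


class StreamPoller:
    """Poll a text stream for single characters without ever killing the reader."""

    def __init__(self, stream):
        self._chars = queue.Queue()
        # The reader thread lives for the whole process, so it is never
        # interrupted between reading a character and handing it over.
        self._thread = threading.Thread(
            target=self._pump, args=(stream,), daemon=True
        )
        self._thread.start()

    def _pump(self, stream):
        while True:
            ch = stream.read(1)  # blocks until a character arrives or EOF
            if ch == "":
                break  # EOF: stop pumping
            self._chars.put(ch)

    def poll(self, timeout):
        """Return the next character, or None if none arrives within `timeout` seconds."""
        try:
            return self._chars.get(timeout=timeout)
        except queue.Empty:
            return None


# Small demonstration on an in-memory stream standing in for stdin.
demo = StreamPoller(io.StringIO("ab"))
first = demo.poll(timeout=1.0)
second = demo.poll(timeout=1.0)
nothing = demo.poll(timeout=0.05)  # stream exhausted, so the poll times out
```

A timed-out poll here only abandons the wait, not the read, so a character that arrives a moment later is still in the queue for the next poll.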

Yes it appears to be that.
Do you think it would be possible to add to the stdout or stderr debug
output an equivalent of "take_emis_measurement called" that would be
the same for all sensor drivers?

This way, a program controlling spotread non-interactively could ask
again in case the request was lost due to Windows limitations.
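
A minimal sketch of what such a retry could look like on the controlling side, in Python. It assumes spotread would print a uniform acknowledgement line, which it does not necessarily do today; ACK_MARKER simply reuses the string proposed above. The names request_reading, send_key and lines are hypothetical, and the output lines are assumed to be fed into a queue by a separate reader thread.

```python
import queue
import time

# Hypothetical acknowledgement text: the uniform debug line proposed above.
ACK_MARKER = "take_emis_measurement called"


def request_reading(send_key, lines, ack_timeout=2.0, max_attempts=3):
    """Trigger a measurement, resending the key if no acknowledgement is seen.

    send_key: callable that writes the trigger key to spotread's stdin.
    lines:    queue.Queue of output lines, filled by a reader thread.
    Returns the number of attempts used; raises TimeoutError if all fail.
    """
    for attempt in range(1, max_attempts + 1):
        send_key()
        deadline = time.monotonic() + ack_timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break  # no acknowledgement in time: resend
            try:
                line = lines.get(timeout=remaining)
            except queue.Empty:
                break  # no output at all in time: resend
            if ACK_MARKER in line:
                return attempt
    raise TimeoutError("no acknowledgement after %d attempts" % max_attempts)


# Demonstration with a fake spotread that loses the first keypress.
out = queue.Queue()
presses = []


def fake_send():
    presses.append(1)
    if len(presses) >= 2:
        out.put(ACK_MARKER)


attempts = request_reading(fake_send, out, ack_timeout=0.05)
```

The timeout needs to be generous: if the first key was received but the acknowledgement is merely late, resending would trigger a second measurement.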

However when maxing out the CPU with multiple threads or an external
program, it is producing the error usually before completing 35

Don't max out the CPU(s). You may well find that other things fail too -
for some instruments (e.g. the i1pro, or the i1d3 doing refresh rate
measurement, etc.) there are some real time aspects to making it go. If the
thread response time is unreasonably long, you may get timeouts where the
instrument gives up waiting for the USB read (i.e. accessing a color
instrument is not a batch process!).

I just confirmed what you describe here. Only under Windows however.
I couldn't reproduce the same error with the stress test on a more
powerful computer, but multiplying the CPU load by 3 did the trick
under Windows 7 just like on the less powerful computer.
It's impressive to see how solid and resilient spotread and the sensor
drivers are in comparison.

Meanwhile I ran the same stress test on another Linux machine with the
i1Pro, and instead of failing quickly it reached the point where it
requests a new calibration after almost a thousand readings:

Result is XYZ: 265.783643 270.932687 335.207395, D50 Lab: 145.713071 4.023201 -40.338008

Got reading: 962 elapsed: 1374ms
Spot read needs a calibration before continuing
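
For a program driving spotread, the two kinds of output seen in this log could be told apart roughly as follows. This is a Python sketch based only on the lines quoted here; other spotread modes or versions may print differently, and the names parse_line, RESULT_RE and CAL_NEEDED are hypothetical. It assumes the "Result is" line has been rejoined into a single string (it is wrapped in the mail above).

```python
import re

# One signed decimal number, e.g. 265.783643 or -40.338008.
_NUM = r"(-?\d+(?:\.\d+)?)"

# Matches the "Result is XYZ: ..., D50 Lab: ..." line as it appears in the log.
RESULT_RE = re.compile(
    r"Result is XYZ:\s*" + r"\s+".join([_NUM] * 3)
    + r",\s*D50 Lab:\s*" + r"\s+".join([_NUM] * 3)
)

# Text of the calibration request as it appears in the log.
CAL_NEEDED = "Spot read needs a calibration before continuing"


def parse_line(line):
    """Classify one line of output as a reading, a calibration request, or other."""
    if CAL_NEEDED in line:
        return ("calibrate", None)
    m = RESULT_RE.search(line)
    if m:
        values = [float(g) for g in m.groups()]
        return ("reading", {"XYZ": tuple(values[:3]), "Lab": tuple(values[3:])})
    return ("other", None)


sample = (
    "Result is XYZ: 265.783643 270.932687 335.207395, "
    "D50 Lab: 145.713071 4.023201 -40.338008"
)
kind, data = parse_line(sample)
```

A "calibrate" result would be the cue for the controlling program to stop sending measurement requests and recalibrate the instrument.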

Yep - the calibration has timed out. Time to re-cal.
(although it should probably have been done much earlier!).

I'll keep that in mind.
Colorimeters are definitely a better choice when many readings need
to be acquired :)

Argyll 1.4.0 was not affected by the issue described and, comparing the
code of the two versions, it seems there was a rewrite after that.
Maybe the older routines used to poll Windows stdin were more reliable?

Looks pretty much the same, except it didn't wait as long for the thread
to start, which would seem to be more likely to make it less reliable, since
there would be a smaller window to read a line, and a larger proportion of
the time the thread will be being killed.

I tried again with an old version of the tools I made for internal use,
which used spotread 1.4.0, and couldn't make it fail with an EyeOne
Display 2.
I can't test with the old i1 Display Pro I had back then (it was stolen).
I'm tempted to try hacking 1.4.0 to remove the checksum check and try
it again on Windows. Don't you think it's worth it?

Under Windows, spotread fails after 4 readings (log 5)
On Linux, it went on up to 875 readings before asking for the sensor

Spotread certainly doesn't fail with the i1pro Rev A after 4 readings
(or any small number) running on my MSWin machine. I would have
noticed a long time ago, since I often test things using spotread.
Does it only do this when running via your java program?
Was your machine busy or idle?

I updated the source code of the stress test on git to produce 3 times
higher CPU load, which was required to trigger the failure on my more
powerful desktop running Windows.
It makes it fail with the i1 Display Pro but I haven't tried the i1 Pro
yet. I'll do that now.

Please let me know if you manage to reproduce any of the issues described :)

