marcel nita wrote:
> On 2/14/07, Graeme Gill <graeme@xxxxxxxxxxxxx> wrote:
>> Looking at the PremiumGlossyPhoto results you posted, these
>> don't look unreasonable to me. Average and peak errors have halved,
>> so overall that's a pretty good result. I can understand that it's
>> not so good that the white has got worse, but I guess this is
>> the influence of trying to correct colors near white. You might
>> try adding several white test patches as a way of increasing
>> the weighting of the white error (the latest version of refine
>> will give the lightest patch an increased weight of 5 automatically.)
>>
>> Another thing you could try is to leave out test patches of colors
>> that get worse,

My feeling is that it should first be clarified whether they really get worse WITH STATISTICAL SIGNIFICANCE. I think there is a misunderstanding of how measurements are to be interpreted. We must always keep in mind that we are dealing here with RANDOM VARIABLES (see http://en.wikipedia.org/wiki/Random_variable), and samples of random variables cannot be compared deterministically. And noisy measurements are, after all, samples of random variables.

Let's illustrate this with an example. Assume a perfect, error-free profile prof_A, and another profile prof_B with a systematic error of 1 dE. Let's further assume that the printer's repeatability and the measurements together have a random error of 1 dE RMS (or 0.92 dE average; for simplicity of the simulation I'm assuming 3-variate uncorrelated Gaussian i.i.d. noise in CIELAB space). Under these assumptions we know that the error of prof_B is exactly 1 dE larger than the error of prof_A. However, if we MEASURE the printed result using prof_A and the result using prof_B, which errors will we observe?
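(As a side note, the relation between the 1 dE RMS and 0.92 dE average figures for this noise model is easy to check with a few lines of numpy. This sketch is purely my own illustration, not part of any Argyll tool: with i.i.d. Gaussian noise on all three CIELAB axes and a per-axis sigma of 1/sqrt(3), the RMS dE is 1 and the mean dE comes out near 0.92.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-axis sigma chosen so that E[dE^2] = 3*sigma^2 = 1, i.e. 1 dE RMS total.
sigma = 1.0 / np.sqrt(3.0)

# Many 3-variate noise samples (deviations in L*, a*, b*).
noise = rng.normal(0.0, sigma, size=(1_000_000, 3))

# dE76 of each sample is just the Euclidean norm of the deviation vector.
dE = np.linalg.norm(noise, axis=1)

print(f"RMS dE:  {np.sqrt(np.mean(dE**2)):.3f}")  # close to 1.000
print(f"mean dE: {np.mean(dE):.3f}")              # close to 0.921
```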
If we repeatedly make a print with prof_A and a print with prof_B, measure them, and compare each pair to the reference, we will find that measured_error(A) < measured_error(B) is observed in only 74% of all cases. In the other 26% of the cases we will even observe measured_error(A) > measured_error(B) !!! although prof_A's error is known to be 1 dE LOWER than prof_B's error.

What I want to say is: if we have just a single measurement for each patch with the old profile, and only a single measurement for each patch with the refined profile, and we observe new_error > previous_error for a particular patch, then this single observation pair is not statistically significant enough to conclude that the error of the refined profile is really larger than the error of the old profile for this patch (unless the observed difference is huge compared to the printer's repeatability).

We can indeed improve the significance of our measurements, but only at the cost of making more prints and more measurements. If we were, for example, to make 100 prints with the old profile and average the measurements, make 100 prints with the refined profile and average those measurements too, and then compare the errors of the averaged old and new measurements w.r.t. the reference, we could make a comparison with much better significance: averaging 100 samples reduces the observed random error by a factor of 10, but does not affect the systematic error of the profile. Applied to the above example, it would now be very unlikely to observe error(A) > error(B) for the averaged measurements at all. In 99% of all cases we would now observe error(B) - error(A) > 0.74 dE, and only in 1% of all cases a smaller difference. Compared to the non-averaged measurements this is a big improvement, since with a single measurement we can only achieve error(B) - error(A) > -1.1 at the 99% significance level !!!
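These figures can be reproduced with a small Monte Carlo simulation. The sketch below is my own, under exactly the assumptions stated above (error-free prof_A, a 1 dE systematic offset for prof_B, 1 dE RMS 3-variate Gaussian noise); it compares a single print per profile against 100 averaged prints per profile:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 1.0 / np.sqrt(3.0)          # per-axis noise, giving 1 dE RMS in total
offset = np.array([1.0, 0.0, 0.0])  # prof_B's systematic 1 dE error
trials = 200_000

def observed_errors(n_prints):
    """Observed dE vs. reference for each profile when n_prints prints are averaged."""
    s = sigma / np.sqrt(n_prints)   # averaging n prints shrinks the noise by sqrt(n)
    err_a = np.linalg.norm(rng.normal(0.0, s, (trials, 3)), axis=1)           # noise only
    err_b = np.linalg.norm(offset + rng.normal(0.0, s, (trials, 3)), axis=1)  # offset + noise
    return err_a, err_b

# One print per profile: prof_A measures better in only roughly 74% of trials.
err_a1, err_b1 = observed_errors(1)
p_single = np.mean(err_a1 < err_b1)

# 100 prints averaged per profile: the ordering is now essentially always correct,
# and the 1% quantile of the observed difference rises to roughly 0.74 dE.
err_a100, err_b100 = observed_errors(100)
p_avg = np.mean(err_a100 < err_b100)
q01 = np.quantile(err_b100 - err_a100, 0.01)

print(f"single print:  P(err_A < err_B) = {p_single:.3f}")
print(f"100-print avg: P(err_A < err_B) = {p_avg:.4f}")
print(f"100-print avg: 1% quantile of err_B - err_A = {q01:.2f} dE")
```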
So in order to get a feeling for the magnitude of the printer's repeatability error, and for how the observed error can be approximately decomposed into the systematic error of the profile and the random repeatability error of the printer (the latter cannot be eliminated by any profile), I would suggest printing N copies of the target using the SAME profile, preferably each copy with a different spatial randomization of the patches in order to account for spatial variations of the printer, and then evaluating the resulting N measurement sets statistically. The larger N, the better, of course. N=100 or more might be good, but I guess you won't have that much patience :-) But even for just N=2 it should be possible to estimate at least the magnitude of the printer's average error over all patches (of course N=2 is not at all sufficient for individual noise estimates for each patch).

So to get a first clue, you can start by printing, measuring and comparing just two copies of the target, using the same profile for both prints (but preferably with a different spatial randomization of the patches for the two prints, if possible). Then compare each of the two measurement sets with the reference, but above all compare them with each other. This comparison would give a zero error for each patch if the printer's repeatability and the measurements were perfect - but of course they aren't, so expect to observe a significant difference even between the two prints made with the SAME profile.

Btw, Graeme, it might also be interesting if verify had an option to print the error summary excluding out-of-gamut patches, since they are to be considered outliers which cannot be corrected anyway, and thus they obscure how well the refinement works for correctable in-gamut patches. In the -v output they could possibly be marked as out-of-gamut too. I just don't know whether this would be easy to implement?
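For the N=2 case, the arithmetic behind the evaluation is simple: if the per-print noise is i.i.d., the per-patch difference vector between two prints carries sqrt(2) times the per-print RMS noise, so the repeatability can be estimated as rms(inter-print dE)/sqrt(2). A sketch with simulated Lab measurements standing in for the two real measurement sets (my own illustration, not an Argyll tool):

```python
import numpy as np

rng = np.random.default_rng(2)
patches = 5000
true_rms = 1.0                   # assumed per-print RMS repeatability error (dE)
sigma = true_rms / np.sqrt(3.0)  # per-axis sigma of the 3-variate CIELAB noise

# Two simulated prints of the same target with the SAME profile:
# identical true patch values, independent noise per print.
true_lab = rng.uniform([0.0, -60.0, -60.0], [100.0, 60.0, 60.0], size=(patches, 3))
print1 = true_lab + rng.normal(0.0, sigma, size=(patches, 3))
print2 = true_lab + rng.normal(0.0, sigma, size=(patches, 3))

# Per-patch dE76 between the two prints.
inter_dE = np.linalg.norm(print1 - print2, axis=1)

# The difference of two independent noise vectors has sqrt(2) times the
# per-print RMS, so divide the RMS inter-print dE by sqrt(2).
est_rms = np.sqrt(np.mean(inter_dE**2)) / np.sqrt(2.0)
print(f"estimated per-print RMS repeatability: {est_rms:.2f} dE")
```

With real data, `print1` and `print2` would simply be the two measurement sets (matched patch by patch after undoing any spatial randomization).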
Regards,
Gerhard

>> or edit the test results for those patches, making
>> them the same as the target, thereby selectively "turning off"
>> further corrections for those points. Clumsy, but it might help.
>
> I will try this as soon as I get the chance. It seems like a good
> idea, because this is what I need: to stop samples drifting further
> away from the reference.
> Adding more white patches sounds better, but testing will tell.
> Thank you.
>
>> Having said all that, I've had a bit of a play with refine, and
>> think I've struck upon a slightly better scheme to deal with
>> out-of-gamut points, that allows efforts to correct them without
>> the correction "running away" and causing things to get worse. I
>> still notice some regressions for dark out-of-gamut points though.
>> The overall improvement is slight in my tests, but may be worthwhile
>> in improving behaviour for the critical near-white colors. If you're
>> running MSWindows, you can try out this version of refine here
>> <http://www.argyllcms.com/refine.zip>.
>
> Is this in the development tree? Because I am not using the
> executables directly, but a small program which contains icclink +
> refine + an interface (measurement statistics and the target
> displayed visually).
> If it is not, then I will have to wait until you put it there.
>
> Thank you again,
> Marcel.