Gerhard Fuernkranz wrote:
But for "-r" versus "-R", IMO the same applies.
It depends on whether the effect is local or global in behaviour. My assumption at the time was that it was dominated by local behaviour, and using a large number of purely random device-space test points was what I arrived at as an "unbiased observer". (I was seeing correlation not only between the spaces the test and verification points were distributed in, but also between their arrangements - i.e., grid to grid, etc.) I'm less sure now that that is the case.

Certainly, in general, the accuracy of a profile is highest where the test samples are most closely concentrated, so it makes sense that if the verification sampling distribution correlates with what's being verified, it will arrive at a favourable view of the accuracy. In reviewing some of my simulations: yes, there is a bias introduced by the type of distribution of the verification points.

Given that this is all aimed at improving the accuracy of the forward profile - a table indexed by device values and containing the PCS values - one line of argument is that a uniform distribution of verification samples in device space is entirely appropriate for characterizing the behaviour of such a table. The other extreme argument is that the distribution of verification values should represent the statistical "average" usage made of the profile. It's actually difficult to arrive at such a test set, simply because the usage made of colors and color images is so diverse, and it certainly isn't uniform when measured in PCS space!

It is very difficult to draw confident conclusions in this area of research, since there is a lot of "noise", and often conflicting results for particular test cases. The margins of difference are fairly fine, and very dependent on the characteristics of the device, and on the vagaries of where particular test chart and verification chart points land (particularly in determining maximum errors).
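To illustrate why the two sampling views differ, here is a toy sketch. It uses a hypothetical gamma-2.2 per-channel response as a stand-in for a real forward profile (an assumption for illustration only, not Argyll's actual model): points drawn uniformly in device space come out clustered toward the dark end once mapped, i.e. they are not uniform in the output space.

```python
import random

def uniform_device_points(n, channels=3, seed=0):
    """Uniform random sampling of the device space (the 'unbiased observer' view)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(channels)] for _ in range(n)]

def toy_device_to_pcs(rgb):
    """Hypothetical device model: a gamma-2.2 response per channel.
    Stands in for the real forward profile; illustration only."""
    return [c ** 2.2 for c in rgb]

pts = uniform_device_points(1000)
pcs = [toy_device_to_pcs(p) for p in pts]

# Uniform in device space: channel mean sits near 0.5 ...
mean_dev = sum(p[0] for p in pts) / len(pts)
# ... but after the nonlinearity the mean drops well below 0.5,
# so the same points are concentrated toward dark output values.
mean_pcs = sum(p[0] for p in pcs) / len(pcs)
```

So a verification set that is uniform in one space is automatically biased when viewed from the other, which is the correlation effect described above.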
Given a lot of time and compute power, some of these difficulties might be overcome, but not all of us have the necessary time to devote to such things :-)

In spite of all this, the default targen algorithm does pretty well, even when measured against a perceptually uniform distribution of verification points: in many cases it is the best distribution as often as it is the second best. When measured against a uniform device-space distribution of verification points, it does better, more often being the best distribution. This algorithm is consistently better than (say) the ECI2002 test chart for the same number of points - by 20-40% when measured by average or RMS CIE94 delta E - and seems to gain more of an advantage as larger numbers of test points are used.

In contrast, the current targen adaptive algorithm sometimes does well, but sometimes doesn't do quite so well. It sometimes does a bit better than the default algorithm when measured by worst-case error, which hints that an improved adaptive algorithm, one that manages to be closer to optimal, may beat the current default algorithm on both worst maximum and best average error.

Graeme Gill.
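For reference, the comparisons above are in terms of average, RMS, and worst-case delta E over a set of paired verification points. A minimal sketch of those summary statistics, using plain Euclidean (CIE76) delta E for brevity where the figures quoted above are CIE94:

```python
import math

def delta_e76(lab1, lab2):
    """Plain Euclidean (CIE76) delta E between two Lab values.
    The post quotes CIE94 figures; the summary statistics are
    computed the same way whichever delta E formula is used."""
    return math.dist(lab1, lab2)

def summarize(reference, measured):
    """Average, RMS, and worst-case error over paired verification points."""
    des = [delta_e76(r, m) for r, m in zip(reference, measured)]
    n = len(des)
    avg = sum(des) / n
    rms = math.sqrt(sum(d * d for d in des) / n)
    return avg, rms, max(des)

# Hypothetical measured vs. reference Lab values, illustration only.
ref = [(50.0, 0.0, 0.0), (70.0, 10.0, -5.0), (30.0, -20.0, 15.0)]
meas = [(51.0, 1.0, 0.0), (69.0, 12.0, -4.0), (33.0, -18.0, 16.0)]
avg, rms, worst = summarize(ref, meas)
```

Note that RMS weights large errors more heavily than the average does, and the worst case more heavily still, which is why a chart can win on average error yet lose on maximum error, as with the adaptive algorithm above.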