[argyllcms] Re: bin/average: averaging and possible outlier elimination for three or more .ti3 sets?

  • From: Graeme Gill <graeme@xxxxxxxxxxxxx>
  • To: argyllcms@xxxxxxxxxxxxx
  • Date: Mon, 31 Aug 2009 13:33:38 +1000

Craig Ringer wrote:
I have a fairly large (2310) sample chart set I've scanned in with my
i1Pro. The chart set was run off an offset litho press so I have quite a
few copies. I've scanned in a few copies and I'm seeing significant
differences between copies when I use the `verify' tool to compare
the .ti3 data sets. Eg:

$ verify CHART.sampleA.ti3 CHART.sampleB.ti3
Verify results:
  Total errors:     peak = 79.868037, avg = 0.971652
  Worst 10% errors: peak = 79.868037, avg = 5.476613
  Best  90% errors: peak = 1.193214, avg = 0.471101

The severity of a few of the errors would suggest possible misreads.
They're strip charts I'm reading with an i1Pro by hand (sigh) so
operator error (mine) isn't unlikely. It could also be quirks of the
printing process and/or newsprint media, since the charts are on
off-white partly recycled newsprint printed on a press that does
adaptive stochastic dithering.

I've recently come across this myself. My conclusion is that it's a problem
with bi-directional reading picking the wrong strip direction, because
that particular strip has too similar a pattern forwards and backwards.
This occurs more frequently with larger charts, because a larger number of
strips increases the odds.
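
As a rough illustration of why the strip count matters, here is a minimal
sketch. It assumes each strip has the same small, independent chance of
being read in the wrong direction; the per-strip probability used below is
a hypothetical figure, not a measured one.

```python
def p_any_misread(p_strip: float, n_strips: int) -> float:
    """Chance that at least one of n_strips is read in the wrong direction.

    Assumes every strip misreads independently with probability p_strip.
    """
    return 1.0 - (1.0 - p_strip) ** n_strips


if __name__ == "__main__":
    # 0.01 is a made-up per-strip probability, purely for illustration.
    for n in (10, 50, 100):
        print(n, round(p_any_misread(0.01, n), 3))
```

Under these assumptions the chance of at least one reversed strip grows
steadily with the number of strips, even when each strip is individually
unlikely to be misread.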

The short-term workaround is therefore to turn bi-directional reading off
(chartread -B), and only read the strips in the forward direction.

The longer-term solution is that in the next release I've improved printtarg
so that it optimizes not only the patch-to-patch contrast, but also the strip
reverse-direction contrast.

Here's an example error spike:

5: 83.070051 1.346918 2.895409 <=> 82.505987 1.360226 2.779120  de 0.576080
6: 82.110832 1.536077 2.992189 <=> 81.607727 1.437567 2.733480  de 0.574238
7: 40.886825 4.316704 18.048722 <=> 82.063784 1.539669 2.906388  de 43.960711  **** Huge error spike ****
8: 82.523678 1.538664 3.054387 <=> 81.639391 1.422285 2.777460  de 0.933914
9: 82.579391 1.491124 3.066842 <=> 82.226908 1.541316 3.019219  de 0.359209
10: 82.347015 1.442016 3.176074 <=> 80.000400 1.401537 2.851284  de 2.369330

Another:

182: 78.976711 -2.232244 58.087355 <=> 77.775705 -2.120755 56.639930  de 1.884114
183: 80.095716 -2.763189 55.135360 <=> 79.346721 -2.745549 54.804620  de 0.818959
184: 37.428620 -5.000581 0.024627 <=> 78.810882 -2.002915 60.115930  de 73.023573
185: 79.154031 -1.844991 61.359333 <=> 78.829979 -2.012077 60.145829  de 1.267091
186: 78.984785 -1.665758 61.827127 <=> 78.670603 -1.793216 61.269881  de 0.652287

.. etc

I've dealt with this myself (somewhat painfully) by doing a quick profile
on the data (i.e. colprof -ql -B), then using profcheck -v to locate
the worst patches and deleting them from the .ti3 file, rinsing and
repeating until the errors look reasonable.
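
When two readings of the same chart are available, spikes like the ones in
the listings above can also be picked out mechanically. The following is a
minimal sketch, not part of Argyll: it assumes each patch sits on a single
line in the "index: L a b <=> L a b  de value" layout shown above, and the
threshold of 10 is an arbitrary illustrative choice. For the sample lines
used here, the listed de values agree with the plain Euclidean distance
between the two Lab triples.

```python
import math
import re

# One per-patch line: "<index>: L a b <=> L a b  de <value>"
LINE_RE = re.compile(
    r"^\s*(\d+):\s+(\S+)\s+(\S+)\s+(\S+)\s+<=>\s+(\S+)\s+(\S+)\s+(\S+)\s+de\s+(\S+)"
)


def parse_patches(text):
    """Return a list of (index, lab1, lab2, de) tuples from listing text."""
    patches = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip headers, blank lines, wrapped fragments
        vals = [float(v) for v in m.groups()[1:]]
        patches.append(
            (int(m.group(1)), tuple(vals[0:3]), tuple(vals[3:6]), vals[6])
        )
    return patches


def delta_e76(lab1, lab2):
    """Euclidean distance between two Lab triples (CIE76 delta E)."""
    return math.dist(lab1, lab2)


def flag_outliers(patches, threshold=10.0):
    """Indices of patches whose listed de exceeds the (hypothetical) threshold."""
    return [idx for idx, _, _, de in patches if de > threshold]


# Lines copied from the first listing above.
SAMPLE = """\
5: 83.070051 1.346918 2.895409 <=> 82.505987 1.360226 2.779120  de 0.576080
6: 82.110832 1.536077 2.992189 <=> 81.607727 1.437567 2.733480  de 0.574238
7: 40.886825 4.316704 18.048722 <=> 82.063784 1.539669 2.906388  de 43.960711
8: 82.523678 1.538664 3.054387 <=> 81.639391 1.422285 2.777460  de 0.933914
"""

patches = parse_patches(SAMPLE)

if __name__ == "__main__":
    print(flag_outliers(patches))
```

The flagged indices are only candidates for re-reading or deletion; a
large de between two copies does not by itself say which copy is wrong.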

Also: I've run into an odd issue when averaging the data sets. The
output produced by `average' has more sets than the inputs do - both
inputs have 2130, but the output has 2364. I'm a bit puzzled about why,
given that both inputs were read using `chartread' from charts printed
using the same .ps file and had the same .ti1 and .ti2. Is that
expected behaviour?

Sounds like a bug, although I wasn't able to reproduce it with a simple test.
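
One way to narrow this down is to count the data rows in each input and in
the output directly. The sketch below is an assumption-laden helper, not an
Argyll tool: it assumes the .ti3 files follow the usual CGATS-style layout
with a NUMBER_OF_SETS keyword and a BEGIN_DATA / END_DATA block, and it only
looks at the first data table in the text. The sample content is made up.

```python
def count_sets(ti3_text):
    """Return (declared, actual) set counts for the first data table.

    declared: the value following NUMBER_OF_SETS (None if absent).
    actual:   the number of non-blank rows between BEGIN_DATA and END_DATA.
    """
    declared = None
    actual = 0
    in_data = False
    for raw in ti3_text.splitlines():
        line = raw.strip()
        if line.startswith("NUMBER_OF_SETS") and declared is None:
            declared = int(line.split()[1])
        elif line == "BEGIN_DATA":  # exact match, so BEGIN_DATA_FORMAT is ignored
            in_data = True
        elif line == "END_DATA":
            if in_data:
                break  # stop after the first table
        elif in_data and line:
            actual += 1
    return declared, actual


# Hypothetical, heavily trimmed file content for illustration only.
SAMPLE_TI3 = """\
CTI3

NUMBER_OF_FIELDS 4
BEGIN_DATA_FORMAT
SAMPLE_ID XYZ_X XYZ_Y XYZ_Z
END_DATA_FORMAT

NUMBER_OF_SETS 3
BEGIN_DATA
1 80.1 82.3 70.2
2 40.5 41.0 35.7
3 10.2 10.9 9.8
END_DATA
"""

if __name__ == "__main__":
    print(count_sets(SAMPLE_TI3))
```

Running this over both inputs and the averaged output would show whether
the extra sets are present as real rows or only in the declared count.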

cheers,
        Graeme Gill.
