I wonder if there's any tool to implement ICC-aware transforms on spooled print jobs for CUPS / Foomatic / Gutenprint. It seems one can embed an ICC tag into files that are to be printed, but AFAIK there's nothing (as a print filter / option) that actually does anything with it, such as converting from the source ICC profile to the printer colorspace.
Doing this with higher-level PDLs (i.e. PDF, PostScript) is non-trivial. You really need a RIP, and commercial products that do this sort of thing sell for thousands of dollars.
I notice that some of ArgyllCMS's calculations can be a bit CPU intensive. I wonder if the following compilation options could help performance:
http://gcc.gnu.org/gcc-4.2/changes.html > New Targets and Target Specific Improvements > IA-32/x86-64 >> * -mtune=native and -march=native will produce code optimized for the host architecture as detected using the cpuid instruction.
It may be worth a try, but I'd be surprised if it made much difference. The difference between debug and optimized builds isn't that great, for instance. The only significant approach would be to recode some of the core algorithms to run on multiple CPUs (yes, on the wish list, but not likely soon). The pixel conversion engine code (imdi) can be sped up by a factor of 2 if it's run on a 64 bit machine, but I would guess you are not talking about that aspect.
It might be a relatively easy way to parallelize some of the compute intensive tasks across multiple CPU cores without a lot of code development overhead, and without breaking the compilation on compilers / CPUs that don't support OpenMP or multi-cores.
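To make the OpenMP suggestion concrete, here is a minimal sketch in C. It is not ArgyllCMS code: convert_sample() and convert_all() are hypothetical names, and the arithmetic is just a stand-in for an expensive, independent per-sample calculation. Built with gcc -fopenmp the loop is shared out across cores; a compiler without OpenMP support ignores the unrecognized pragma and runs the same loop serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an expensive per-sample calculation.
 * Each call depends only on its own input, so calls can run in any order. */
static double convert_sample(double v)
{
    return v * v * (3.0 - 2.0 * v);
}

/* Convert n samples from in[] to out[].
 * With OpenMP enabled the iterations are divided between threads;
 * without it the pragma is ignored and the loop runs serially. */
void convert_all(const double *in, double *out, long n)
{
    long i;

    assert(in != NULL && out != NULL && n >= 0);

#pragma omp parallel for
    for (i = 0; i < n; i++)
        out[i] = convert_sample(in[i]);
}
```

This only works so directly because no iteration reads or writes anything another iteration touches; code with shared state needs more care.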
A lot of it is parallelizable, but it is not perfectly straightforward. The curve optimization code would need some careful thought to thread, and it tends to dominate the forward profile construction now (the rspl code is relatively fast on modern processors, although it could be threaded relatively easily I think). The inverse lookup code is a massively parallel problem, so there's lots of scope there, but it's complicated by the presence of the intermediate calculation cache. If the cache were to be retained, it would need locking etc., and could easily be a bottleneck. Graeme Gill.
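The cache concern can be illustrated with a small sketch, again in C with OpenMP. None of this is taken from the ArgyllCMS sources: lookup_cache, expensive_setup(), cached_setup() and invert_all() are hypothetical names, the cache is a toy direct-mapped table, and keys are assumed to be non-negative. The point is only that every thread has to pass through the same critical section to reach the shared cache, which is where the bottleneck would come from.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SIZE 256

/* Toy direct-mapped cache shared by all threads (hypothetical layout). */
typedef struct {
    int    valid[CACHE_SIZE];
    int    key[CACHE_SIZE];
    double value[CACHE_SIZE];
    long   hits;
} lookup_cache;

/* Hypothetical stand-in for the costly intermediate calculation. */
static double expensive_setup(int key)
{
    return (double)key / 255.0;
}

/* Return the cached value for key, computing it on a miss.
 * The critical section serializes every access to the shared cache. */
static double cached_setup(lookup_cache *c, int key)
{
    double v;
    int slot = key % CACHE_SIZE;   /* assumes key >= 0 */

#pragma omp critical(lookup_cache_lock)
    {
        if (c->valid[slot] && c->key[slot] == key) {
            c->hits++;
        } else {
            c->value[slot] = expensive_setup(key);
            c->key[slot] = key;
            c->valid[slot] = 1;
        }
        v = c->value[slot];
    }
    return v;
}

/* Each lookup is independent, so the loop parallelizes trivially,
 * but all threads contend for the one lock inside cached_setup(). */
void invert_all(lookup_cache *c, const int *keys, double *out, long n)
{
    long i;

    assert(c != NULL && keys != NULL && out != NULL && n >= 0);

#pragma omp parallel for
    for (i = 0; i < n; i++)
        out[i] = cached_setup(c, keys[i]);
}
```

In this sketch the miss path, the expensive part, also runs under the lock, so the threads mostly queue behind each other. Giving each thread its own cache, or dropping the cache, would avoid the lock at the cost of repeated work; which trade-off suits the real inverse lookup code is not something this example can settle.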