As we already know the slope addition variant accumulates the partial
errors to a much bigger total error. Using large points it could result
in errorneous pixels of distance even > 1.
Using the division variant would always result in correct positions but
divisions are slow and are thus highly deprecated.
The linear function variant most of the time yields the right solution,
otherwise the solution is off not more than 1. This is why I tried to
find error bounds so that the function *will always* be right. However,
it turned out that there is always a possibility the the computed
position is wrong. I'll shortly explain:
Important assumption: all coordinates are whole numbers!
The slope is computed as: a = (y2-y1)/(x2-x1) (for the 1. octant)
However it is presented as a' = a +/- e, where "e" is the rounding error
and <= LSB(a)/2 for round to nearest.
Every point is evaluated by y = x*a' + x1, 0 <= x <= x2-x1
This will always be right (that means round(x*a') = round(x*a)) except when
n+1/2 <= x*a' < n+1/2+(x2-x1)*e (for e >= 0), [*]
where n is some natural number.
From
n+1/2+(x2-x1)*e <= x*a' < n+1 ==> n+1/2 <= k*a < n+3/2
follows that e < 1/(2*((x2-x1)-x)). That means that "e" needs more
digits after the zero-comma than max(x1,x2) has before. The x86-FPU
works with 80-bit extended floats with a 64-bit mantisse internally, so
our coordinates are restricted to <= 2^31. Luckily this condition holds
for all signed 32-bit integers.
Written [*] in a different way results in
x*(y2-y1)-(x2-x1)/2 mod (x2-x1) < (x2-x1)^2*e
If we bound (x2-x1)^2*e < 1/2 then we can write this as
2*x*(y2-y1) mod 2*(x2-x1) = x2-x1
Unfortunately, this can happen with many combinations of numbers (it
doesn't happen with special numbers like x2-x1 being prime ;-/).
This approach failed.
However, frac(x*a'-1/2) < (x2-x1)*e is a rare condition. One could use
the linear function variant step for most numbers, and for those who
fall into this rare condition we could explicitely use the division
variant step. This way we would do only very few divisions and also most
of the time the branch would be predicted correctly.
Some pseudo-code example (for abs(x2-x1) >= abs(y2-y1)):
a' = (y2-y1)/(x2-x1); e = abs(x2-x1)/2^64; // 64 bits mantisse for 80-bit extended floats for (x = 0; x < x2-x1; x++) { y = x*a'; if (abs(y-round(y)) < e) y = x*(y2-y1)/(x2-x1); pixel(x,y+y1); }