Re: LuaJIT Matrix Algebra Function

  • From: Eliot Eikenberry <eliot.darkwolf@xxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Sun, 7 Sep 2014 21:34:52 -0400

I just got my oculus rift dk2 and have been working on setting everything
up to run in lua and moving a bunch of my old games into lua to use with it
and had a reason to do the same with with matrix (and vector,
quaternion,etc...).

On my machine, the code in Evan's gist runs in 0.058 on average.
Your hybrid code above runs at about 0.088.
The code at this gist runs at about 0.038 :
https://gist.github.com/Wolftousen/99a0c6835a1130b96965

Unfortunately, since i'm using opengl and double type matrices aren't
supported unless you are running version 4, i have to use floats instead of
doubles which slows it back down to about 0.059.  But regardless, awesome
we can get stuff so close to c speeds.

On Fri, Aug 8, 2014 at 7:14 AM, Beddell, Thomas Edmund <
ThomasEdmund.Beddell@xxxxxxxxx> wrote:

> Hi Evan,
>
> Your ffi array of doubles took 0.09 secs.
> I added your trick of making local copies of variables and that took Lua
> table version down to 0.17 secs.
> The hybrid approach with lua tables containing an array of doubles was
> exactly half way at 0.13 secs.
>
> C++ “equivalent” program 0.043 secs
> C++ “equivalent” program using DirectX functions (SSE2) 0.017 secs
>
> Needless to say my co-worker was “Impressed”!
>
> Thanks for your great input. I include the hybrid code.
>
> Best regards,
>
> Thomas Beddell
>
>
> local ffi = require("ffi")
>
> local mtype = ffi.typeof("double[16]")
>
> -- double 4x4, 1-based, column major
> matrix = {}
>
> -- Source for own metamethods
> matrix.__index = matrix
>
> setmetatable(matrix, matrix)
>
> -- Create a matrix object. Tested OK
> matrix.__call = function(self, ...)
>
>         -- Look in matrix for metamethods
>         return setmetatable({mtype(...)}, matrix)
>
> end
>
> function copy(self)
>
>         local mout = matrix()
>         local out = mout[1]
>         local m = self[1]
>
>         for i=0, 15 do
>
>                 out[i] = m[i]
>
>         end
>
>         return mout
>
> end
>
> -- Set matrix to identity matrix. Tested OK
> matrix.identity = function(self)
>
>         self = matrix()
>         local m = self[1]
>
>         for i=0, 15, 5 do
>
>                 m[i] = 1
>
>         end
>
>         return self
>
> end
>
> -- Inverse of matrix. Tested OK
> matrix.inverse = function(self)
>
>         local m = self[1]
>         local outm = matrix()
>         local out = outm[1]
>
>         out[0] = m[5] * m[10] * m[15] -
>                         m[5] * m[11] * m[14] -
>                         m[9] * m[6] * m[15] +
>                         m[9] * m[7] * m[14] +
>                         m[13] * m[6] * m[11] -
>                         m[13] * m[7] * m[10]
>
>         out[4] = -m[4] * m[10] * m[15] +
>                         m[4] * m[11] * m[14] +
>                         m[8] * m[6] * m[15] -
>                         m[8] * m[7] * m[14] -
>                         m[12] * m[6] * m[11] +
>                         m[12] * m[7] * m[10]
>
>         out[8] = m[4] * m[9] * m[15] -
>                         m[4] * m[11] * m[13] -
>                         m[8] * m[5] * m[15] +
>                         m[8] * m[7] * m[13] +
>                         m[12] * m[5] * m[11] -
>                         m[12] * m[7] * m[9]
>
>         out[12] = -m[4] * m[9] * m[14] +
>                         m[4] * m[10] * m[13] +
>                         m[8] * m[5] * m[14] -
>                         m[8] * m[6] * m[13] -
>                         m[12] * m[5] * m[10] +
>                         m[12] * m[6] * m[9]
>
>         out[1] = -m[1] * m[10] * m[15] +
>                         m[1] * m[11] * m[14] +
>                         m[9] * m[2] * m[15] -
>                         m[9] * m[3] * m[14] -
>                         m[13] * m[2] * m[11] +
>                         m[13] * m[3] * m[10]
>
>         out[5] = m[0] * m[10] * m[15] -
>                         m[0] * m[11] * m[14] -
>                         m[8] * m[2] * m[15] +
>                         m[8] * m[3] * m[14] +
>                         m[12] * m[2] * m[11] -
>                         m[12] * m[3] * m[10]
>
>         out[9] = -m[0] * m[9] * m[15] +
>                         m[0] * m[11] * m[13] +
>                         m[8] * m[1] * m[15] -
>                         m[8] * m[3] * m[13] -
>                         m[12] * m[1] * m[11] +
>                         m[12] * m[3] * m[9]
>
>         out[13] = m[0] * m[9] * m[14] -
>                         m[0] * m[10] * m[13] -
>                         m[8] * m[1] * m[14] +
>                         m[8] * m[2] * m[13] +
>                         m[12] * m[1] * m[10] -
>                         m[12] * m[2] * m[9]
>
>         out[2] = m[1] * m[6] * m[15] -
>                         m[1] * m[7] * m[14] -
>                         m[5] * m[2] * m[15] +
>                         m[5] * m[3] * m[14] +
>                         m[13] * m[2] * m[7] -
>                         m[13] * m[3] * m[6]
>
>         out[6] = -m[0] * m[6] * m[15] +
>                         m[0] * m[7] * m[14] +
>                         m[4] * m[2] * m[15] -
>                         m[4] * m[3] * m[14] -
>                         m[12] * m[2] * m[7] +
>                         m[12] * m[3] * m[6]
>
>         out[10] = m[0] * m[5] * m[15] -
>                         m[0] * m[7] * m[13] -
>                         m[4] * m[1] * m[15] +
>                         m[4] * m[3] * m[13] +
>                         m[12] * m[1] * m[7] -
>                         m[12] * m[3] * m[5]
>
>         out[14] = -m[0] * m[5] * m[14] +
>                         m[0] * m[6] * m[13] +
>                         m[4] * m[1] * m[14] -
>                         m[4] * m[2] * m[13] -
>                         m[12] * m[1] * m[6] +
>                         m[12] * m[2] * m[5]
>
>         out[3] = -m[1] * m[6] * m[11] +
>                         m[1] * m[7] * m[10] +
>                         m[5] * m[2] * m[11] -
>                         m[5] * m[3] * m[10] -
>                         m[9] * m[2] * m[7] +
>                         m[9] * m[3] * m[6]
>
>         out[7] = m[0] * m[6] * m[11] -
>                         m[0] * m[7] * m[10] -
>                         m[4] * m[2] * m[11] +
>                         m[4] * m[3] * m[10] +
>                         m[8] * m[2] * m[7] -
>                         m[8] * m[3] * m[6]
>
>         out[11] = -m[0] * m[5] * m[11] +
>                         m[0] * m[7] * m[9] +
>                         m[4] * m[1] * m[11] -
>                         m[4] * m[3] * m[9] -
>                         m[8] * m[1] * m[7] +
>                         m[8] * m[3] * m[5]
>
>         out[15] = m[0] * m[5] * m[10] -
>                         m[0] * m[6] * m[9] -
>                         m[4] * m[1] * m[10] +
>                         m[4] * m[2] * m[9] +
>                         m[8] * m[1] * m[6] -
>                         m[8] * m[2] * m[5]
>
>         local det = m[0] * out[0] + m[1] * out[4] + m[2] * out[8] + m[3] *
> out[12]
>
>         if det == 0 then return self:copy() end
>
>         det = 1.0 / det
>
>         for i = 1, 15 do
>
>                 out[i] = out[i] * det
>
>         end
>
>         return outm
>
> end
>
> -- Multiply matrix by a matrix. Tested OK
> matrix.__mul = function(self, m)
>
> local outm = matrix()
> local out = outm[1]
> local me, se = m[1], self[1]
>
> for i=0, 12, 4 do
>
>         for j=0, 3 do
>
>                 out[i+j] = me[j] * se[i] + me[j+4] * se[i+1] + me[j+8] *
> se[i+2] + me[j+12] * se[i+3]
>
>         end
>
> end
>
> return outm
>
> end
>
> -- Test
> local mOut
> local m = matrix():identity()
> local ITERATIONS = 300000
>
> local t = os.clock()
> for i=1, ITERATIONS do
>
>         mOut = m * m:inverse()
>
> end
>
> local time = os.clock() - t
> print("time", time)
>
>
>
>

Other related posts: