For reference:
local clock = os.clock;

-- 1: create and call a closure on every iteration
local n = 0;
local now = clock();
for i = 1, 2^24 do
  n = (function(i) return i + 2 end)(n);
end
now = clock() - now;
print("Time for 1: " .. now);

-- 2: plain addition
local n = 0;
local now = clock();
for i = 1, 2^24 do
  n = n + 2;
end
now = clock() - now;
print("Time for 2: " .. now);
This code prints:
Time for 1: 1.104
Time for 2: 0.023
I don't know whether to be impressed that the JIT-compiler compiles
functions so quickly, or whether to be disappointed that it can't compile
the loop containing 'fn' efficiently.
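For comparison, a third variant could hoist the closure out of the loop, so that the loop body contains only a call and no closure creation. This is a sketch that has not been timed here; the name add2 and the "Time for 3" label are made up for illustration:

```lua
local clock = os.clock

-- Created once, outside the loop, so the loop body does not create a closure.
local function add2(i) return i + 2 end

local n = 0
local start = clock()
for i = 1, 2^24 do
  n = add2(n)
end
local elapsed = clock() - start
print("Time for 3: " .. elapsed)
```

If the slowdown in variant 1 really comes from creating the closure and not from calling it, this version should land much closer to variant 2.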
BTW this was done on an AMD Athlon x4 760k clocked at 4.1 GHz.
Las
On 4 December 2016 at 22:24, Las <lasssafin@xxxxxxxxx> wrote:
That's a very good idea!
But I'd probably use _[index] for arguments, instead of 'a', 'b', etc.
(A bit related: I really hate the Lua designers' hatred of complex syntax.
Like why shouldn't I be able to do {...}[2] instead of resorting to a
'select'?)
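On the select remark: as far as I know, indexing a table constructor does parse once the constructor is wrapped in parentheses. A small sketch of both spellings (the helper names are made up for illustration):

```lua
-- Second vararg via select; the outer parentheses truncate the result to one value.
local function second_via_select(...)
  return (select(2, ...))
end

-- Second vararg via a parenthesized table constructor; this builds a temporary table.
local function second_via_table(...)
  return ({...})[2]
end

assert(second_via_select(10, 20, 30) == 20)
assert(second_via_table(10, 20, 30) == 20)
```

The table form is shorter to read but allocates on every call, which matters for the JIT question that follows.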
Then the question is if the JIT-compiler can compile it efficiently.
The code:
local memo = {} -- memoization table
function fn (expr)
  if memo[expr] == nil then
    local code = ("return function (...) _ = {...}; return %s end"):format(expr)
    memo[expr] = assert(loadstring(code))()
  end
  return memo[expr]
end
local n = 0;
for i = 1, 2^16 do
  -- 1
  n = fn("_[1] + 2")(n);
  -- 2
  n = n + 2;
end
When 2 is commented out, the generated IR is:
0036 ------ LOOP ------------
0037 > p32 UREFO test.lua:2 #0
0038 > p32 EQ 0037 0000
0039 > tab TNEW #3 #0
0040 p32 FLOAD 0039 tab.array
0041 p32 AREF 0040 +1
0042 num ASTORE 0041 0033
0043 tab HSTORE 0028 0039
0044 nil TBAR 0024
0045 + num ADD 0033 +2
0046 + int ADD 0034 +1
0047 > int LE 0046 +65536
0048 int PHI 0034 0046
0049 num PHI 0033 0045
When 1 is commented out, the generated IR is:
0006 ------ LOOP ------------
0007 + num ADD 0003 +2
0008 + int ADD 0004 +1
0009 > int LE 0008 +65536
0010 int PHI 0004 0008
0011 num PHI 0003 0007
Obviously version 1 isn't very fast: its trace allocates a table (TNEW) and
stores it (HSTORE) on every iteration.
The times were measured like this for both:
local n = 0;
local now = clock();
for i = 1, 2^24 do
  --n = fn("_[1] + 2")(n);
  n = n + 2;
end
now = clock() - now;
Time for 1: 1.081
Time for 2: 0.022
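The TNEW and HSTORE in the first trace appear to come from the `_ = {...}` in the generated function. One possible variant keeps the `_[index]` spelling in the expression but rewrites it to positional parameters before compiling, so no table is built per call. This is only a sketch: the `_1`..`_5` parameter names and the five-argument limit are assumptions, and it has not been timed or traced here.

```lua
local loadstring = loadstring or load -- LuaJIT/Lua 5.1 name, with a fallback for 5.2+

local memo = {} -- memoization table

local function fn(expr)
  local f = memo[expr]
  if f == nil then
    -- Rewrite _[1].._[5] into the parameter names _1.._5 (hypothetical convention).
    local body = expr:gsub("_%[(%d)%]", "_%1")
    local code = ("return function (_1,_2,_3,_4,_5) return %s end"):format(body)
    f = assert(loadstring(code))()
    memo[expr] = f
  end
  return f
end

local n = 0
for i = 1, 2^16 do
  n = fn("_[1] + 2")(n)
end
```

The rewrite is purely textual, so it would also touch a `_[1]` that appears inside a string literal in the expression; a real implementation would need to decide whether that matters.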
Your way was ingenious, but the JIT-compiler is too stupid sadly.
Las
On 4 December 2016 at 20:58, Luke Gorrie <luke@xxxxxxxx> wrote:
On 4 December 2016 at 18:18, Las <lasssafin@xxxxxxxxx> wrote:
Yeah, that's what I meant.
Oh :). Well, I needed the practice at explaining how a tracing JIT
operates anyway. Starts to sound comical saying "Just read Thomas
Schilling's PhD thesis and you will have some initial idea..." too often.
I know it can compile calls, but it can't compile FNEW and UCLO of
simple functions.
The reason I'm asking is because obviously such "closures" can be used
to simplify APIs.
I see. Yes, in a perfect world the JIT would compile the closure creation
(and perhaps even sink the allocation.) I suppose one compensation is that
the loop inside the foreach() function could still be compiled and it is
only the caller that will suffer from the NYI.
One workaround could be to invent a new formulation that has the runtime
behavior that you want even if not the traditional syntax.
How about if you would replace the original code:
foreach(t, function(i, n) return n * 2 end)
with an alternative:
foreach(t, fn[[b*2]])
that does JIT efficiently and does not create new closures. Could be
implemented as:
-- fn(expr): Create a closure that returns the value of <expr>.
-- expr is a Lua expression with up to five arguments (a, b, c, d, e).
--
-- Example: fn[[a*b]](21,2) => 42
memo = {} -- memoization table
function fn (expr)
  if memo[expr] == nil then
    local code = ("return function (a,b,c,d,e) return %s end"):format(expr)
    memo[expr] = assert(loadstring(code))()
  end
  return memo[expr]
end
-- Example:
local acc = 0
for i = 1, 100 do
  acc = acc + fn[[a*b]](21,2)
end
assert(acc == 42*100)
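To tie this back to the foreach(t, fn[[b*2]]) call above, here is a minimal sketch of how the pieces might fit together. The foreach shown is hypothetical (the real one is not part of this message); it is assumed to call f(i, t[i]) for each array element and store the result back into t.

```lua
local loadstring = loadstring or load -- LuaJIT/Lua 5.1 name, with a fallback for 5.2+

local memo = {} -- memoization table

local function fn(expr)
  if memo[expr] == nil then
    local code = ("return function (a,b,c,d,e) return %s end"):format(expr)
    memo[expr] = assert(loadstring(code))()
  end
  return memo[expr]
end

-- Hypothetical foreach: replaces each t[i] with f(i, t[i]).
local function foreach(t, f)
  for i = 1, #t do
    t[i] = f(i, t[i])
  end
  return t
end

-- 'a' receives the index and 'b' the value, matching function(i, n) above.
local t = foreach({1, 2, 3}, fn[[b*2]])
assert(t[1] == 2 and t[2] == 4 and t[3] == 6)
```

Because fn memoizes on the expression string, repeated calls with the same string return the same function object, so the caller's loop does not create a new closure each time.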