While testing the performance of bsearch (see below) on various advanced data
structures, I found a performance drop of 20% after 3 iterations (fully
deterministic) on simple tables if the extra comparison function is provided
‘locally’ (tested with LJ2.1-beta3 on MacOSX). The small test case below shows
the observable slow down with cmp provided as-is (first timings), and no slow
down with less_than provided instead (second timings). I don’t really see why
the code is slowing down with cmp, any ideas? Thanks.
Best,
Laurent.
./luajit-2.1.0-beta3 testcase.lua
iter= 20190843.856128 per sec
iter= 18981851.451827 per sec
iter= 15782280.286985 per sec
iter= 16326743.859512 per sec
iter= 16132987.442083 per sec
iter= 16044539.642046 per sec
…
./luajit-2.1.0-beta3 testcase.lua
iter= 20281466.187754 per sec
iter= 20253205.576113 per sec
iter= 18948617.035186 per sec
iter= 20042811.445247 per sec
iter= 19690620.963423 per sec
iter= 20973550.255772 per sec
…
local rshift = require'bit'.rshift
local function less_than (a, b)
return a < b
end
local function bsearch (tbl, val, cmp_)
local cmp = cmp_ or less_than
local low, cnt = 1, #tbl
while cnt > 0 do
local stp = rshift(cnt,1)
local mid = low+stp
if cmp(tbl[mid], val) then
low, cnt = mid+1, rshift(cnt-1,1)
else
cnt = stp
end
end
return low -- tbl[low] <= val
end
local function run ()
local tbl = { 5,10,10,10,20,20,20,30,40 }
local cmp = function (a,b) return a < b end
local t0 = os.clock()
for i=1,1e7 do
assert(bsearch(tbl, 41, cmp) == 10) -- use less_than for stable timings
end
local t1 = os.clock()
print('iter=',1e7/(t1-t0),'per sec')
end
-- slow down from 20 to 16 M iter/sec on 3rd iteration
for i=1,10 do run() end
Attachment:
smime.p7s
Description: S/MIME cryptographic signature