Features for the facilitation of sandbox construction

  • From: Philipp Kutin <philipp.kutin@xxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Fri, 3 Aug 2012 17:57:58 +0200

Hello,

Since I began to use LuaJIT to create a new scripting system for the
game port that I co-maintain, there has always been a recurrent
problem that concerns the user/internal code divide, and how to access
stuff from the latter while restricting it from the former. For
example, consider a stripped down "wall" struct as it appears in our
code:

typedef struct
{
    int32_t x, y;
    // walls hold only one point for us; this is the index of the second one
    int16_t point2;
    int16_t picnum;  // must be in [0 .. MAXTILES-1]
    int8_t shade;
} walltype;

The first four members have certain restrictions and must not be
modified arbitrarily, so we'd make them 'const' in our FFI C
declaration, thereby losing the ability to modify them from trusted,
internal code. Another option would be to leave the struct unrestricted
and write accessor functions for each member (the real struct has many
more), but then we'd have to hide the actual struct from user code, and
syntactically it would look very awkward.
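
A minimal sketch of the 'const' variant, assuming LuaJIT with the FFI
enabled. The type name user_walltype and the initializer values are
made up for illustration. It shows that a const member can still be set
by the initializer but rejects any later write, no matter who issues
it:

```lua
local ffi = require("ffi")

-- Hypothetical user-facing declaration: the restricted members are const.
ffi.cdef[[
typedef struct {
    const int32_t x, y;
    const int16_t point2;
    const int16_t picnum;
    int8_t shade;
} user_walltype;
]]

-- Initializers may still populate const members.
local w = ffi.new("user_walltype", 10, 20, 1, 5, 0)

-- Writing an unrestricted member works.
w.shade = 7

-- Writing a const member raises an error, for trusted code as well.
local ok, err = pcall(function() w.picnum = 99 end)
```

Here 'ok' ends up false and w.picnum keeps its initial value, which is
exactly the problem: internal code is locked out along with user code.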

As another example, consider making a safe "bit array" type, similar
in operation to the one at the BitOp page [1], but with the most
efficient space usage of 1 bit per bit + constant overhead. (The way I
see it, the example needs 4 bits per bit.)

--== bitarffi.lua excerpt ==--
local ffi = require("ffi")
local bit = require("bit")

local mt = {
    __index = {
        -- Is bit i set?
        isset = function(s, i)
            assert(i >= 0 and i<=s.maxbidx)
            return bit.band(s.ar[bit.rshift(i, 5)], bit.lshift(1, i)) ~= 0
        end,

        -- more functions like "set bit", "clear bit", etc.
    },
}
local bitar = ffi.metatype(
    "struct { const double maxbidx, maxidx; int32_t ar[?]; }", mt)

function new(maxbidx, initval)
    assert(type(maxbidx) == "number" and (maxbidx >= 0))
    assert(initval == 0 or initval == 1)

    local maxidx = math.floor(maxbidx/32)
    local size = maxidx+1

    -- REM: VLA/VLS types take the size as first arg!
    local s = bitar(size, maxbidx, maxidx)
    ffi.fill(s.ar, size*4, -initval)  -- initialize the bit array now

    return s
end
--== end ==--

Of course, this bitar type cannot be exposed to untrusted code as-is,
since the 'ar' member is there in plain sight for everyone to access
out-of-bounds. Various attempts at remedying this fail:

* We can't add an 'ar' field to mt.__index, since built-in struct
accesses take precedence.
* Encapsulating the bit array in a setmetatable'd table won't work
right with the custom ops.
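
The first point can be checked with a small sketch, assuming LuaJIT.
The names probe_t and 'extra' are hypothetical. An __index entry that
has the same name as a struct member is never consulted, while an entry
for an undefined field is:

```lua
local ffi = require("ffi")

local probe_t = ffi.metatype("struct { int32_t ar[4]; }", {
    __index = {
        ar = "shadow",           -- never consulted: 'ar' is a real member
        extra = "from __index",  -- consulted: the struct has no such member
    },
})

local p = probe_t()

-- The built-in member access wins, so this is still the raw array.
local ar_is_cdata = (type(p.ar) == "cdata")

-- Undefined fields fall through to the metatable.
local extra = p.extra
```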

The situation is somewhat similar in plain Lua, where you construct
types with special behavior by coding the functionality into metatable
fields. Those metamethods, however, are implemented using rawget/rawset
instead of direct access (which could lead to infinite recursion, for
example).
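
For reference, a minimal plain-Lua sketch of that pattern. The
"numbers only" table type is a made-up example. The metamethod has to
store through rawset, because a direct assignment inside __newindex
would invoke __newindex again:

```lua
-- Hypothetical example type: a table that only accepts numeric values
-- for keys that are not present yet.
local function new_numeric_table()
    local mt = {}
    function mt.__newindex(t, k, v)
        assert(type(v) == "number", "only numbers may be stored")
        -- 't[k] = v' here would re-enter __newindex and recurse forever.
        rawset(t, k, v)
    end
    return setmetatable({}, mt)
end

local nt = new_numeric_table()
nt.a = 1                                         -- accepted, stored raw
local ok = pcall(function() nt.b = "text" end)   -- rejected by the check
```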

[ Sorry if this is somewhat vague, I hope it's relatively clear what is meant. ]

What I propose, in the hope that it can be implemented without too much
performance overhead, is something very similar to the raw{get,set}
pair. First, the "restrict" keyword would be misused in C decls like
this:

ffi.cdef[[
typedef struct {
    int unrestricted[2];
    restrict int not_accessible[2];
} qwe_t;

qwe_t qwe;
]]

a = ffi.C.qwe.not_accessible  --> error!

That is, 'not_accessible' would be practically anonymous padding for
anyone accessing a qwe_t object directly. (BTW, is it possible to
achieve this with LuaJIT currently?) Now, the other side of the coin
would be 'raw' accessor functions in the ffi module, either
 * ffi.rawset(cdata, key, val) and val=ffi.rawget(cdata, key); or
 * cdata_u = ffi.access(cdata),
returning a totally unrestricted, i.e. fully read/writable reference
to the original cdata. On that note, is it deliberate that there's no
way to do "reinterpret-style" casts on aggregate-typed cdata, like in
the following pseudo-LuaJIT code?

o = ffi.new("struct { const int i; }", 1)
ou = ffi.castto(o, "struct { int i; }")
ou.i = 2
print(o.i)  --> 2

The choice for 'restrict' is because IMO that use would not conflict
with any real uses, since it can only be sensibly attached to a
pointer, and pointers would never be made user-accessible in
FFI-declared structs anyway.

So, is this an idea that could be realized without impeding the normal
flow of things too much? Anything in this direction would greatly
simplify writing code that needs to be executed in an untrusted
environment.

[1] http://bitop.luajit.org/api.html


Greetings,
Philipp


P.S. Allocation sinking is super awesome! I like how you can write a
routine for the intersection of two lines in a handful of source lines,

function intersect(a,v, b,w)
    local vxw = cross2(v,w)

    if (vxw ~= 0) then
        local btoa = tovec2(a)-tovec2(b)
        local cv = cross2(w, btoa)/vxw
        return tovec2(a)+cv*tovec2(v)
    end
end

and have a million of them executed in 37 ms (vs. 500 ms without
sinking, on x86).
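
The helpers cross2 and tovec2 are not shown above. The following is a
sketch of one way they might look, assuming LuaJIT and a two-double
vector struct with metamethods. These definitions are guesses made for
illustration, not the actual code, and the sketch makes no claim about
timings:

```lua
local ffi = require("ffi")

-- Assumed 2D vector type with the operators that intersect() needs.
local vec2
vec2 = ffi.metatype("struct { double x, y; }", {
    __add = function(a, b) return vec2(a.x + b.x, a.y + b.y) end,
    __sub = function(a, b) return vec2(a.x - b.x, a.y - b.y) end,
    -- Only scalar*vector is used here; the scalar is the left operand.
    __mul = function(s, v) return vec2(s * v.x, s * v.y) end,
})

-- 2D cross product (z component of the 3D one).
local function cross2(v, w) return v.x * w.y - v.y * w.x end

-- Convert anything with x and y fields into a vec2.
local function tovec2(t) return vec2(t.x, t.y) end

-- Same body as the function above: intersect the lines a + t*v and
-- b + s*w. Returns nil for parallel lines.
local function intersect(a, v, b, w)
    local vxw = cross2(v, w)

    if (vxw ~= 0) then
        local btoa = tovec2(a) - tovec2(b)
        local cv = cross2(w, btoa) / vxw
        return tovec2(a) + cv * tovec2(v)
    end
end

-- Line through (0,0) along (1,1) meets line through (0,2) along (1,-1).
local p = intersect({x = 0, y = 0}, {x = 1, y = 1},
                    {x = 0, y = 2}, {x = 1, y = -1})
-- Parallel directions give no intersection point.
local q = intersect({x = 0, y = 0}, {x = 1, y = 1},
                    {x = 0, y = 2}, {x = 2, y = 2})
```

The temporary vec2 objects created inside intersect() are the kind of
short-lived allocations that sinking is meant to eliminate.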
