Re: Flat initializer list for union of unnamed struct fields

  • From: Peter Colberg <peter@xxxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Mon, 8 Sep 2014 23:48:02 -0400

On Tue, Sep 09, 2014 at 01:46:25AM +0200, Mike Pall wrote:
> Initialization of nested aggregates is not compiled. This is far
> from trivial in the general case. You can send a patch, if you
> really want to dive into this (function crec_alloc).

Please see the attached patch against LuaJIT 2.1.

It is a first attempt, but the test case already compiles:

[TRACE   1 union.lua:19 loop]
[TRACE --- union.lua:12 -- NYI: return to lower frame at union.lua:24]
[TRACE   2 union.lua:23 loop]

Could you give hints on how to improve the patch?

> IMHO it makes sense to enforce a common notation. Otherwise people
> will have a hard time to understand each other's modules. A simple
> struct is much easier on the compiler, too.

The OpenCL C specification permits both (and more) notations for
accessing vector components in device code, so for consistency the
host code should support both notations, too. The choice of notation
depends on how a vector type is used: x,y,z suits physical vectors,
while s0,s1,s2,…,sA,…,sF suits any aggregate of up to 16 components.
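
As an illustration, here is a minimal C11 sketch of such an overlay on the
host side. It reuses the cl_double4 layout from the attached test case;
double4_add is a hypothetical helper written for this example and is not
part of OpenCL or LuaJIT. The static assertion reflects the layout that
common compilers produce.

```c
#include <assert.h>
#include <stddef.h>

/* Two anonymous structs (C11) overlaid in one union, so that
   x/y/z/w and s0..s3 name the same four doubles. */
typedef union {
  struct { double x, y, z, w; };
  struct { double s0, s1, s2, s3; };
} cl_double4;

/* Both notations refer to the same storage offset. */
_Static_assert(offsetof(cl_double4, y) == offsetof(cl_double4, s1),
               "y and s1 should alias");

/* Hypothetical helper: component-wise sum, written with x/y/z/w. */
static cl_double4 double4_add(cl_double4 a, cl_double4 b)
{
  cl_double4 r = { { a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w } };
  return r;
}
```

A value built through one notation can then be read through the other,
e.g. the result of double4_add can be inspected as r.s0 … r.s3.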

To my surprise, a plain struct and a union of nested structs
perform equally well with the attached patch. The code transforms
and averages the velocities of 10⁵ solvent particles to obtain a
flow field. The allocation sinking in LuaJIT 2.1 is impressive.

Thanks,
Peter
diff --git a/src/lj_crecord.c b/src/lj_crecord.c
index acd786f..85144c2 100644
--- a/src/lj_crecord.c
+++ b/src/lj_crecord.c
@@ -969,6 +969,10 @@ static void crec_alloc(jit_State *J, RecordFFData *rd, CTypeID id)
       MSize i = 1;
       while (fid) {
        CType *df = ctype_get(cts, fid);
+       if (ctype_isxattrib(df->info, CTA_SUBTYPE)) {
+         fid = ctype_rawchild(cts, df)->sib;
+         continue;
+       }
        fid = df->sib;
        if (ctype_isfield(df->info)) {
          CType *dc;
local ffi = require("ffi")
ffi.cdef[[
typedef union {
  struct { double x, y, z, w; };
  struct { double s0, s1, s2, s3; };
} cl_double4;
]]

local double4 = ffi.typeof("cl_double4")

ffi.metatype(double4, {
  __add = function(a, b)
    return double4(a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w)
  end,
})

local N = 100000
local v = ffi.new("cl_double4[?]", N)
for i = 0, N - 1 do
  v[i] = double4(4, 3, 2, 1)
end
local x = double4(1, 2, 3, 4)
for i = 0, N - 1 do
  x = x + (v[i] + v[i])
end
assert(x.x == 1 + N * 8)
assert(x.y == 2 + N * 6)
assert(x.z == 3 + N * 4)
assert(x.w == 4 + N * 2)
