I did a thing on swizzling in the Khronos APIs:
http://williamaadams.wordpress.com/2013/04/03/dynamically-swizzled-type-equivalent-goodness/

Not highly performant code (uses table lookups), but it might be interesting to look at.

-- William

===============================
- Shaping clay is easier than digging it out of the ground.

> Date: Mon, 8 Sep 2014 23:48:02 -0400
> From: peter@xxxxxxxxxxx
> To: luajit@xxxxxxxxxxxxx
> Subject: Re: Flat initializer list for union of unnamed struct fields
>
> On Tue, Sep 09, 2014 at 01:46:25AM +0200, Mike Pall wrote:
> > Initialization of nested aggregates is not compiled. This is far
> > from trivial in the general case. You can send a patch, if you
> > really want to dive into this (function crec_alloc).
>
> Please see the attached patch against LuaJIT 2.1.
>
> It is a first attempt, but the test case already compiles:
>
> [TRACE 1 union.lua:19 loop]
> [TRACE --- union.lua:12 -- NYI: return to lower frame at union.lua:24]
> [TRACE 2 union.lua:23 loop]
>
> Could you give hints on how to improve the patch?
>
> > IMHO it makes sense to enforce a common notation. Otherwise people
> > will have a hard time to understand each other's modules. A simple
> > struct is much easier on the compiler, too.
>
> The OpenCL C specification permits both (and more) notations for
> accessing vector components in device code, so for consistency the
> host code should support both notations, too. The choice of notation
> depends on how a vector type is used: x,y,z is suited for physical
> vectors, s0,s1,s2,…,sA,…,sF for any aggregates up to 16 components.
>
> To my surprise a plain struct versus a union with nested struct
> perform equally well with the attached patch. The code transforms
> and averages the velocities of 10⁵ solvent particles to obtain a
> flow field. The allocation sinking in LuaJIT 2.1 is impressive.
>
> Thanks,
> Peter
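
For anyone reading the thread without the attachment, here is a minimal C sketch of the kind of declaration under discussion: a union of unnamed structs, so that the x,y,z and s0,s1,s2 notations alias the same storage. It is not Peter's test case or the patch. The names vec3f, vec3f_make and vec3f_mean are hypothetical, and the layout is only an assumption about what such a host-side type might look like. It needs C11 for anonymous structs and _Static_assert.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side 3-vector (illustrative only, not from the patch
   or from the OpenCL headers). The two unnamed structs overlay each other,
   so v.x and v.s0 name the same float. */
typedef union vec3f {
    struct { float x, y, z; };     /* notation for physical vectors */
    struct { float s0, s1, s2; };  /* generic component notation */
    float s[3];                    /* indexed access */
} vec3f;

/* Check at compile time that both notations alias the same storage. */
_Static_assert(offsetof(vec3f, x) == offsetof(vec3f, s0), "x aliases s0");
_Static_assert(offsetof(vec3f, y) == offsetof(vec3f, s1), "y aliases s1");
_Static_assert(offsetof(vec3f, z) == offsetof(vec3f, s2), "z aliases s2");

/* Build a vector from three components. The inner braces initialize the
   first unnamed struct, i.e. the x,y,z members. */
static vec3f vec3f_make(float x, float y, float z)
{
    vec3f v = {{x, y, z}};
    return v;
}

/* Average n vectors: accumulate through x,y,z, then scale through
   s0,s1,s2 to show that both notations touch the same components. */
static vec3f vec3f_mean(const vec3f *v, size_t n)
{
    assert(n > 0);
    vec3f sum = {{0.0f, 0.0f, 0.0f}};
    for (size_t i = 0; i < n; i++) {
        sum.x += v[i].x;
        sum.y += v[i].y;
        sum.z += v[i].z;
    }
    sum.s0 /= (float)n;
    sum.s1 /= (float)n;
    sum.s2 /= (float)n;
    return sum;
}
```

On the LuaJIT side a declaration like this would go through ffi.cdef, and the question in the thread is whether constructing such a nested aggregate with initializers gets compiled by the JIT rather than falling back to the interpreter.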