Adam Strzelecki wrote:
> > The logical first step is to add SIMD builtins and hand-vectorize
> > the code. The SIMD builtins etc. are one of the future extensions
> > mentioned in the roadmap.
>
> Cool, thanks for the info. C++ autovec is for me a portable and
> forward-compatible alternative to non-portable intrinsics or vector
> classes lacking a decent standard. So I believe SIMD built-ins for
> Lua will be even better and more flexible than autovectorization.

The plan is to add a higher-level "ffi.vec" module, so you don't need
to use CPU-specific SIMD builtins in your code.

E.g. if you request a vector type consisting of four doubles (v4d),
you'll get the 256 bit AVX ops. That is, if your CPU has AVX.
Otherwise everything is split into two 128 bit SSE2 ops on two doubles
each (2*v2d). OTOH, if you were to try this on ARM, you'd get four
scalar ops on plain doubles (ARM only has SIMD ops for floats/ints).

So 'v3 = v2 + v1' may be turned into very different code, depending
on the capabilities of the CPU you're running this on. All of that
should happen transparently -- except for performance differences,
of course.

Ok, so you still have to choose a certain SIMD type for your code.
But (hopefully) you won't have to deal with AVX vs. SSE vs. VFP vs.
VMX etc.

--Mike