Re: Projects I may be willing to sponsor - FFI Array Sum versus native table results

  • From: Joe Ellsworth <joexdobs@xxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Wed, 10 Dec 2014 12:56:39 -0800

Good catch. The corrected code is at http://pastebin.com/jniGqHyp. The FFI code is still a little faster, but not by much. I added asserts at the end to double-check that the sums match; I should have done that from the start.

elap allocate ffi 0.015sec
elap fill ffi array 0sec
elap sum the ffi arrray 48.564sec
elap sum the ffi arrray in reverse 48.406sec
elap fill native table  0sec
elap sum the lua arrray 50.404sec
elap sum the lua arrray in reverse 50.748sec
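For readers without access to the pastebin, the fix amounts to driving both timing loops from one shared repeat count and asserting that the two totals agree. The following is a minimal sketch under that assumption, not the pastebin code: it needs LuaJIT (for the `ffi` module), and the names `N`, `REPS`, `sum_ffi`, `sum_table` and `time_it`, as well as the array size and repeat count, are made up for illustration.

```lua
-- Requires LuaJIT: the ffi module is not available in plain Lua.
local ffi = require("ffi")

local N    = 10000  -- elements per array (assumed size)
local REPS = 1000   -- shared repeat count, used by BOTH timing loops

-- FFI array of doubles (0-based) and native Lua table (1-based),
-- filled with the same values.
local carr = ffi.new("double[?]", N)
local tarr = {}
for i = 0, N - 1 do
  carr[i] = i + 1
  tarr[i + 1] = i + 1
end

local function sum_ffi(a, n)
  local s = 0
  for i = 0, n - 1 do s = s + a[i] end
  return s
end

local function sum_table(t, n)
  local s = 0
  for i = 1, n do s = s + t[i] end
  return s
end

-- Run f over the array REPS times, print the elapsed CPU time,
-- and return the accumulated total so the results can be compared.
local function time_it(label, f, a)
  local t0 = os.clock()
  local total = 0
  for _ = 1, REPS do total = total + f(a, N) end
  print(string.format("elap %s %.3fsec", label, os.clock() - t0))
  return total
end

local ffi_total = time_it("sum the ffi array", sum_ffi, carr)
local lua_total = time_it("sum the lua array", sum_table, tarr)

-- If the loops did different amounts of work, this fails immediately.
assert(ffi_total == lua_total, "sums differ: the loops did not do the same work")
```

Because both loops read `REPS` and `N` from the same place, a mismatch like the 9999 versus 900000 one cannot creep in, and the final assert would catch it if it did.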



On 12/10/2014 12:31 PM, Pedro Tabacof wrote:
Joe, did you check whether the two sums are the same in the end?

I may be mistaken, but on line 28 you loop 9999 times and on line 48 you loop 900000 times. This would explain the time difference.

Pedro.

On Wed, Dec 10, 2014 at 6:20 PM, Joe Ellsworth <joexdobs@xxxxxxxxx> wrote:

    Hi Stefano,

    I took the liberty of modifying your code to add enough work to
    make the timing meaningful: http://pastebin.com/24tkRwGA. If I am
    correct, then what you just showed is a 107X speed improvement for
    the same algorithm. That is pretty impressive.

    Ran on my laptop:

        elap allocate ffi 0sec
        elap fill ffi array 0sec
        elap sum the ffi arrray 0.484sec
        elap fill native table  0.015sec
        elap sum the lua arrray 51.887sec




    On 12/10/2014 11:50 AM, Stefano wrote:


    On 10 Dec 2014 18:39, "Szabó Antal" <szabo.antal.92@xxxxxxxxx> wrote:
    >
    > 2014-12-10 19:15 GMT+01:00 Joseph Ellsworth <joexdobs@xxxxxxxxx>:
    >>
    >> The sample / example would allocate the array, fill it with a set of doubles read from a file, run the sum, and return both the sum and the average value.
    >
    >
    > Here is sample code for what I suppose you want: http://pastebin.com/cj7jShfj
    >
    > The summation loop looks like this in x64 assembly (use -jdump with luajit to get it):
    >
    > ->LOOP:
    > 7ffc8350ff80  addsd xmm7, [rax+rdi*8+0x8]
    > 7ffc8350ff86  add edi, +0x01
    > 7ffc8350ff89  cmp edi, 0x270f
    > 7ffc8350ff8f  jle 0x7ffc8350ff80        ->LOOP
    >
    > As far as I can tell, this is the shortest and fastest you can get without using vector instructions, which LuaJIT doesn't currently support.
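The sample requested in the quoted message (allocate the array, fill it with doubles read from a file, return the sum and the average) could look roughly like the sketch below. This is not the pastebin code. It assumes LuaJIT, a file with one number per line, and hypothetical helper names `load_doubles` and `sum_and_average`; the demo writes its own temporary input file so that it is self-contained.

```lua
-- Requires LuaJIT: the ffi module is not available in plain Lua.
local ffi = require("ffi")

-- Read one number per line from `path` into a 0-based FFI array of doubles.
-- Lines that do not parse as numbers are skipped (an assumption of this sketch).
local function load_doubles(path)
  local values = {}
  for line in io.lines(path) do
    local v = tonumber(line)
    if v then values[#values + 1] = v end
  end
  local n = #values
  -- Allocate at least one element so an empty file still yields a valid array.
  local arr = ffi.new("double[?]", math.max(n, 1))
  for i = 1, n do arr[i - 1] = values[i] end
  return arr, n
end

-- Return the sum and the average of the first n elements of arr.
-- The average of zero elements is reported as 0.
local function sum_and_average(arr, n)
  local sum = 0
  for i = 0, n - 1 do sum = sum + arr[i] end
  local avg = 0
  if n > 0 then avg = sum / n end
  return sum, avg
end

-- Demo: write a small temporary input file, then load and summarise it.
local path = os.tmpname()
local f = assert(io.open(path, "w"))
f:write("1.5\n2.5\n5.0\n")
f:close()

local arr, n = load_doubles(path)
local sum, avg = sum_and_average(arr, n)
os.remove(path)
print(string.format("n=%d sum=%g avg=%g", n, sum, avg))
```

The element count is returned alongside the array because an FFI array does not carry its own length, so the summation loop needs `n` passed in explicitly.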

    For something more end-user-friendly (for this project at least)
    than bare FFI, the author might be interested in the algebra
    module of an open source project I'm working on:

    http://scilua.org/sci_alg.html

    Stefano

    >
    >
    > Antal Szabó





--
Pedro Tabacof
