Re: Projects I may be willing to sponsor - FFI Array Sum versus native table results

  • From: Joe Ellsworth <joexdobs@xxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Wed, 10 Dec 2014 14:48:03 -0800

That is super cool. I would never have come up with the casting trick from the examples currently on the site.


On 12/10/2014 2:12 PM, Aleksandar Kordic wrote:
This http://pastebin.com/EghuB2BV is a little faster.

                elap allocate ffi 0.003 sec
                elap fill ffi array 0.001 sec
                elap sum the ffi arrray 42.049 sec
                elap sum the ffi arrray in reverse 52.804 sec
                elap fill native table  0.0069999999999908 sec
                elap sum the lua arrray 53.733 sec
                elap sum the lua arrray in reverse 53.535 sec

On Wed, Dec 10, 2014 at 9:56 PM, Joe Ellsworth <joexdobs@xxxxxxxxx> wrote:

    Good catch. The corrected code is at http://pastebin.com/jniGqHyp
    and the FFI code is still a little faster, but not by much. I added
    asserts at the end to double-check that the sums match. I should
    have done that from the start.
    elap allocate ffi 0.015sec
    elap fill ffi array 0sec
    elap sum the ffi arrray 48.564sec
    elap sum the ffi arrray in reverse 48.406sec
    elap fill native table  0sec
    elap sum the lua arrray 50.404sec
    elap sum the lua arrray in reverse 50.748sec
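
For readers without the pastebin at hand, here is a minimal sketch of the kind of check described above: both loops share one bound and one pass count, and an assert confirms the two sums agree before the timings are compared. It is not the pastebin code. The names `repeated_sum` and `timed`, the array size, and the pass count are all assumptions made for illustration, and the `pcall` guard is there only because the `ffi` module exists under LuaJIT and not under stock Lua.

```lua
-- Hypothetical sketch; sizes and pass counts are illustrative,
-- not those of the pastebin code discussed in this thread.
local has_ffi, ffi = pcall(require, "ffi")  -- ffi ships with LuaJIT only

local n, passes = 10000, 100  -- one shared bound for both loops

-- Sum elements first..last of `a`, repeated `passes` times.
local function repeated_sum(a, first, last)
  local total = 0
  for _ = 1, passes do
    for i = first, last do
      total = total + a[i]
    end
  end
  return total
end

-- Time a call with os.clock (CPU seconds), print it, return the result.
local function timed(label, fn, ...)
  local start = os.clock()
  local result = fn(...)
  print(string.format("elap %s %.3f sec", label, os.clock() - start))
  return result
end

-- Native table: 1-based, holds 1..n.
local tbl = {}
for i = 1, n do tbl[i] = i end
local tbl_sum = timed("sum the lua array", repeated_sum, tbl, 1, n)

if has_ffi then
  -- FFI array: 0-based C doubles, holds the same values 1..n.
  local arr = ffi.new("double[?]", n)
  for i = 0, n - 1 do arr[i] = i + 1 end
  local ffi_sum = timed("sum the ffi array", repeated_sum, arr, 0, n - 1)
  -- If the loops did different amounts of work, this fails loudly.
  assert(ffi_sum == tbl_sum, "sums differ: the loops did not do the same work")
end
```

Because the bound and pass count are defined once and passed to both runs, a mismatch like 9999 versus 900000 iterations cannot creep in unnoticed.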




    On 12/10/2014 12:31 PM, Pedro Tabacof wrote:
    Joe, did you check whether the two sums are the same in the end?

    I may be mistaken, but on line 28 you loop 9999 times and on line
    48 you loop 900000 times. This would explain the time difference.

    Pedro.

    On Wed, Dec 10, 2014 at 6:20 PM, Joe Ellsworth
    <joexdobs@xxxxxxxxx> wrote:

        Hi Stefano,

        I took the liberty of modifying your code to add enough work
        to make the timing meaningful: http://pastebin.com/24tkRwGA If I
        am correct, what you just showed is a 107X speed improvement
        for the same algorithm. That is pretty impressive.

        Ran on my laptop:

            elap allocate ffi 0sec
            elap fill ffi array 0sec
            elap sum the ffi arrray 0.484sec
            elap fill native table  0.015sec
            elap sum the lua arrray 51.887sec




        On 12/10/2014 11:50 AM, Stefano wrote:


        On 10 Dec 2014 18:39, "Szabó Antal"
        <szabo.antal.92@xxxxxxxxx> wrote:
        >
        > 2014-12-10 19:15 GMT+01:00 Joseph Ellsworth <joexdobs@xxxxxxxxx>:
        >>
        >> The sample would allocate the array, fill it with a set of
        >> doubles read from a file, run the sum, and return both the
        >> sum and the average value.
        >
        >
        > Here is some sample code for what I suppose you want:
        > http://pastebin.com/cj7jShfj
        >
        > The summation loop looks like this in x64 assembly (use
        > -jdump with luajit to get it):
        >
        > ->LOOP:
        > 7ffc8350ff80  addsd xmm7, [rax+rdi*8+0x8]
        > 7ffc8350ff86  add edi, +0x01
        > 7ffc8350ff89  cmp edi, 0x270f
        > 7ffc8350ff8f  jle 0x7ffc8350ff80      ->LOOP
        >
        > As far as I can tell, this is the shortest and fastest you
        > can get without using vector instructions, which LuaJIT
        > doesn't currently support.
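
A summation loop of the kind that compiles down to a tight loop like the one dumped above might look roughly like the following. This is a minimal sketch, not the code behind the pastebin link: the function name `sum_and_average`, the fill values, and the array size are assumptions for illustration (0x270f in the dump is 9999 decimal). The data is generated in place rather than read from a file. The `ffi` module is only available under LuaJIT, so the sketch falls back to a plain table elsewhere.

```lua
-- Hypothetical sketch; not the code behind the pastebin links in this thread.
-- The ffi module ships with LuaJIT only, so fall back to a plain table elsewhere.
local has_ffi, ffi = pcall(require, "ffi")

local n = 10000  -- illustrative size

-- Allocate a zero-filled C array of doubles (or an empty table as fallback).
local values = has_ffi and ffi.new("double[?]", n) or {}

-- Fill with stand-in data; the original request read the doubles from a file.
for i = 0, n - 1 do  -- FFI arrays are 0-based, unlike Lua tables
  values[i] = i * 0.5
end

-- Return both the sum and the average of the first `count` elements.
local function sum_and_average(a, count)
  local total = 0
  for i = 0, count - 1 do
    total = total + a[i]
  end
  return total, total / count
end

local total, average = sum_and_average(values, n)
print(total, average)
```

Running the script under LuaJIT with -jdump should show the machine code generated for the inner loop, although the exact instructions will depend on the platform and LuaJIT version.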

        For something more end-user-friendly (for this project at
        least) than bare FFI, the author might be interested in the
        algebra module of an open source project I'm working on:

        http://scilua.org/sci_alg.html

        Stefano

        >
        >
        > Antal Szabó





-- Pedro Tabacof


