Re: Data-dependent slowdown in loop involving io.lines()

  • From: soumith <soumith@xxxxxxxxx>
  • To: luajit <luajit@xxxxxxxxxxxxx>
  • Date: Tue, 11 Nov 2014 19:49:17 -0500

Rewriting the standard io.open/read via ffi just because we hit a bad
hashing scheme is actually quite sad. The cases where the collisions occur
are very real-world, not at all impractical (file paths, come on!).

Either way, we have a solution and we use it: a slightly nicer hashing
scheme that doesn't break on such simple cases, at the cost of a very tiny
increase in computation cycles. Forking and monkey-patching works for us,
but people who don't know better would just give up.
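
For readers following the thread, the failure mode can be sketched in a few
lines of Python. This is a toy illustration, not LuaJIT's actual hash:
sparse_hash, full_hash, and the example paths are all hypothetical names made
up for this sketch. It assumes a hash that samples only a few evenly spaced
bytes of a string, and shows how file paths that differ only at unsampled
positions all collapse onto the same value.

```python
def sparse_hash(s: str, samples: int = 8) -> int:
    """Toy hash that reads at most `samples` evenly spaced bytes of `s`."""
    data = s.encode("utf-8")
    n = len(data)
    step = max(1, -(-n // samples))  # ceil(n / samples), at least 1
    h = n  # seed with the string length
    for i in range(0, n, step):
        h = (h * 31 + data[i]) & 0xFFFFFFFF
    return h


def full_hash(s: str) -> int:
    """Same mixing step, but every byte of `s` is read."""
    data = s.encode("utf-8")
    h = len(data)
    for b in data:
        h = (h * 31 + b) & 0xFFFFFFFF
    return h


# Hypothetical file paths that differ only in a numeric suffix.
paths = ["/data/images/train/img_%06d.jpg" % i for i in range(1000)]

# Count how many distinct hash values each scheme produces.
sparse_distinct = len({sparse_hash(p) for p in paths})
full_distinct = len({full_hash(p) for p in paths})
print(sparse_distinct, full_distinct)
```

With these particular paths the sampled positions never land on a digit that
changes, so the sparse hash puts every path in one bucket, while the
full hash keeps them apart at the cost of reading every byte.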

On Tue, Nov 11, 2014 at 5:41 PM, Coda Highland <chighland@xxxxxxxxx> wrote:

> On Tue, Nov 11, 2014 at 2:23 PM, soumith <soumith@xxxxxxxxx> wrote:
> > Considering this as a non-issue is quite irresponsible, especially when
> > the case is quite general and would be encountered by quite a few people.
>
> This isn't really calling it a "non-issue." What's actually being said
> is that it's essentially UNAVOIDABLE. There are tradeoffs to be made,
> and in any string-hashing system there are ALWAYS going to be some
> pathological worst-case behaviors that come up. And not hashing the
> string would cause worse behaviors in other places. Somewhere has to
> win, somewhere has to lose.
>
> If your behavior hits this path, there are ways to sidestep it. I gave you
> one.
>
> /s/ Adam
>
>
