On Mar 7, 2014, at 3:51 AM, Mike Pall <mike-1403@xxxxxxxxxx> wrote:

> Konstantin Olkhovskiy wrote:
>> The object comes to me as a void * and a size_t denoting its total length.
>> So according to your guideline, I need to generate a Lua function which will
>> iterate through each byte, skipping if necessary and doing some bitop magic,
>> and for each matched field, decode it (if required) and compare according
>> to the match_list[field_index].
>>
>> Does it still sound JIT-friendly?
>
> The need for linear byte-wise parsing implies lots of loops with
> low, unpredictable iteration counts and a maze of conditionals.
> This causes many branch prediction misses, which makes it
> CPU-unfriendly. That makes it JIT-unfriendly, too. If you want
> performance, consider using a data format that has been designed
> with that in mind.
>
> [
> On a tangent: IMHO most binary serialization formats like BSON,
> MessagePack or Protocol Buffers are misdesigns. They add lots of
> complications to save a couple of bytes. But that does not
> translate into a substantial improvement of the parsing speed
> compared to a heavily tuned parser for a much simpler text-based
> format.
>

Speaking of serialization, I think Cap’n Proto is a very good project
to look at:

http://kentonv.github.io/capnproto/

It does not waste CPU time to save a couple of bytes. I also wrote a Lua
plugin for it, if anyone is interested:

https://github.com/cloudflare/lua-capnproto

> Especially considering that the majority of today's applications
> in need of semi-structured messages eventually need to re-encode
> their data as JSON to send them to a browser, anyway.
>
> For compact (on-disk) storage, it's much better to add a lazy
> high-speed data compression layer on top. Because that allows
> compressing the format metadata _and_ the content.
>
> One of the few killer arguments for using a binary serialization
> format would be to allow in-place edits without having to
> decode/encode a complete message every time. But I guess very few
> binary serialization formats have that property.
> ]

Cap’n Proto does have that property.

>
> --Mike
>

Best regards,
- Jiale