agentzh wrote:
> From this graph, we can see that lua_yield appears in approximately
> 13% of the total user-land stack samples (that is, about 13% of the
> total run time), which is astonishing. The most time-consuming
> sub-call within lua_yield is _lj_err_throw (and _Unwind_RaiseException
> is the hottest within _lj_err_throw).

lua_yield() uses stack unwinding. On x64 that implies using the native
exception mechanism, which is rather slow.

> And I'm wondering if there's any room for optimizing lua_yield in
> LuaJIT 2.0? If we can make lua_yield fast here, then web apps atop
> ngx_lua can also run significantly faster :)

I've added a change to git HEAD to avoid the stack unwinding for
regular yields in lua_yield(). But please note that:

1. It's still faster to use coroutine.yield() (does NOT use lua_yield).
2. Creating new coroutines/tables etc. for every request is not helpful.
3. Most of the overhead is due to Lua/C API friction.
4. Almost none of your Lua code is JIT-compiled.

You're not using LuaJIT to its full potential if you're using that
many C functions from the classic Lua/C API. I suggest rethinking the
whole architecture.

--Mike
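
For illustration only, notes 1 and 2 in the message above might look roughly like this in plain Lua: the yield is issued from Lua code with coroutine.yield() rather than from a C function calling lua_yield(), and a single long-lived coroutine is reused instead of creating one per request. The names `handle` and `worker` are hypothetical and do not come from ngx_lua or LuaJIT; whether such reuse fits a given server design is an assumption of this sketch.

```lua
-- Hypothetical per-request handler; stands in for real application logic.
local function handle(req)
  return "handled " .. req
end

-- One long-lived coroutine serves many requests instead of one per request.
local worker = coroutine.wrap(function(req)
  while true do
    -- Yield the result from Lua code and wait for the next request.
    -- The value passed to the next worker(...) call becomes the new req.
    req = coroutine.yield(handle(req))
  end
end)

-- Feed several requests through the same coroutine.
for _, req in ipairs({ "a", "b", "c" }) do
  print(worker(req))
end
```

The same `worker` function can keep being called for later requests, so no new coroutine object is allocated on each one.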