Re: Profile at trace granularity?

  • From: Mike Pall <mike-1502@xxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Thu, 19 Feb 2015 16:36:36 +0100

Luke Gorrie wrote:
> Profiler feature idea: To divide profiler results into per-trace sections.

Actually, that's what I had (experimentally) before the current
profiler infrastructure was added. It turned out not to be that
useful, especially with more complex workloads.

First, neither the trace start nor the end is very indicative of
what code is actually part of the trace. You'd have to investigate
each trace individually. That's problematic, because while
profiling you don't want to dump traces. And doing it on a
separate run won't necessarily yield the same set of traces.

Also, to get a feeling for what's really happening, one has to group
correlated (side-) traces somehow. But finding a useful grouping
is non-trivial manually and (probably) hard to do algorithmically.
Ok, other than grouping by source code location -- but that's
pretty much what the current profiler does.

> The profiler can already divide up results into separate sections:
> 
> - Compiled vs C vs interpreted vs GC
> - User-defined zones
> 
> and I am wondering if it would be helpful to also divide this up by trace
> number.

The attached (quick and dirty) patch does this. Use it with -jp=v.
Let me know if this turns out to be useful in practice.

--Mike
--- a/src/jit/p.lua
+++ b/src/jit/p.lua
@@ -74,7 +74,7 @@ local function prof_cb(th, samples, vmmode)
   -- Collect keys for sample.
   if prof_states then
     if prof_states == "v" then
-      key_state = map_vmmode[vmmode] or vmmode
+      key_state = map_vmmode[vmmode] or "TRACE "..vmmode
     else
       key_state = zone:get() or "(none)"
     end
--- a/src/lib_jit.c
+++ b/src/lib_jit.c
@@ -555,7 +555,10 @@ static void jit_profile_callback(lua_State *L2, lua_State *L, int samples,
     setfuncV(L2, L2->top++, funcV(tv));
     setthreadV(L2, L2->top++, L);
     setintV(L2->top++, samples);
-    setstrV(L2, L2->top++, lj_str_new(L2, &vmst, 1));
+    if (vmstate >= 256)
+      setintV(L2->top++, vmstate-256);
+    else
+      setstrV(L2, L2->top++, lj_str_new(L2, &vmst, 1));
     status = lua_pcall(L2, 3, 0, 0);  /* callback(thread, samples, vmstate) */
     if (status) {
       if (G(L2)->panic) G(L2)->panic(L2);
--- a/src/lj_profile.c
+++ b/src/lj_profile.c
@@ -155,7 +155,7 @@ static void profile_trigger(ProfileState *ps)
   mask = g->hookmask;
   if (!(mask & (HOOK_PROFILE|HOOK_VMEVENT))) {  /* Set profile hook. */
     int st = g->vmstate;
-    ps->vmstate = st >= 0 ? 'N' :
+    ps->vmstate = st >= 0 ? 256+st :
                  st == ~LJ_VMST_INTERP ? 'I' :
                  st == ~LJ_VMST_C ? 'C' :
                  st == ~LJ_VMST_GC ? 'G' : 'J';
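
As a reading aid, the encoding the patch relies on can be sketched in isolation. This is only an illustration under stated assumptions, not code from LuaJIT: the VMST_* constants are stand-ins for LJ_VMST_* with assumed values, and encode_vmstate() and vmstate_traceno() are hypothetical helper names. The idea appears to be that a non-negative g->vmstate is a trace number, which gets shifted above the single-character state codes by adding 256 and is shifted back before being passed to the callback.

```c
/* Stand-ins for LuaJIT's LJ_VMST_* constants (assumed values for this sketch). */
enum { VMST_INTERP, VMST_C, VMST_GC };

/* Mirrors the expression in the patched profile_trigger():
** a non-negative state is a trace number and is offset by 256,
** negative states map to a one-character code. */
static int encode_vmstate(int st)
{
  return st >= 0 ? 256 + st :
         st == ~VMST_INTERP ? 'I' :
         st == ~VMST_C ? 'C' :
         st == ~VMST_GC ? 'G' : 'J';
}

/* Mirrors the check in the patched jit_profile_callback():
** returns the trace number, or -1 if vmstate is a character code. */
static int vmstate_traceno(int vmstate)
{
  return vmstate >= 256 ? vmstate - 256 : -1;
}
```

With this scheme a sample taken while trace 7 is running is stored as 263, the callback receives the integer 7, and the patched p.lua line turns that into the key "TRACE 7". Character codes stay below 256, so the two ranges cannot collide.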
