[gameprogrammer] Re: Scripting engines: Upvalues and functionrefs

  • From: Bob Pendleton <bob@xxxxxxxxxxxxx>
  • To: Gameprogrammer Mailing List <gameprogrammer@xxxxxxxxxxxxx>
  • Date: Mon, 12 Apr 2004 14:59:48 -0500

On Mon, 2004-04-12 at 12:40, David Olofson wrote:
> On Thursday 08 April 2004 18.54, Bob Pendleton wrote:
> > On Thu, 2004-04-08 at 06:08, David Olofson wrote:
> > > On Tuesday 06 April 2004 18.24, Bob Pendleton wrote:
> > > [...]
> > >
> > > > Yep. Having a GC makes life so much easier for both the
> > > > language developer and the programmer! But, using a GC based
> > > > language for a task with RT constraints is difficult.
> > >
> > > I'm leaning towards a hybrid refcount/GC solution eventually. It
> > > seems like a properly working refcounting system tends to be
> > > slower and more complicated than a proper GC, but I have this
> > > feeling that there are shortcuts to be made with a language and
> > > compiler designed with this stuff in mind.
> >
> > Yeah, that is my conclusion also. I've written systems with a GC
> > and with reference counting and reference counting is harder to get
> > right. I was looking at the same problem you are actually, a
> > language for scripting games.
> >
> > A reference counting system does need a GC to find cyclic
> > structures that a reference counting system leaves around. If you
> > provide a rich set of data types, like queues, sets, and so on, you
> > can get rid of a lot of the cases where a programmer is likely to
> > create cyclic structures. But, not all of them.
> 
> Meanwhile, I'm trying to keep the language pretty minimal, with just a 
> few powerful datatypes... There's a conflict there that's hard to 
> avoid.
> 

Yep. Of course, you can do what Perl does: use reference counting and
leave it to the programmer to break cycles. 
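
Here is a minimal C sketch of the problem (the node type and the
live-object counter are invented for illustration): two objects that
point at each other keep each other's counts above zero after the
program drops its handles, so a pure refcounter never frees them unless
the programmer breaks the link first.

```c
#include <stdlib.h>

/* Minimal refcounted node -- a sketch, not any real language's internals. */
typedef struct Node {
    int refs;
    struct Node *next;   /* strong reference */
} Node;

static int live_nodes = 0;  /* how many nodes are currently allocated */

static Node *node_new(void) {
    Node *n = calloc(1, sizeof *n);
    n->refs = 1;
    live_nodes++;
    return n;
}

static void node_release(Node *n) {
    if (n && --n->refs == 0) {
        node_release(n->next);
        free(n);
        live_nodes--;
    }
}

/* Build a 2-node cycle, drop our handles: both nodes leak. */
int leak_cycle(void) {
    live_nodes = 0;
    Node *a = node_new(), *b = node_new();
    a->next = b; b->refs++;
    b->next = a; a->refs++;
    node_release(a);
    node_release(b);
    return live_nodes;   /* 2: the cycle keeps itself alive */
}

/* Break the cycle by hand first (what Perl asks of you): no leak. */
int break_then_release(void) {
    live_nodes = 0;
    Node *a = node_new(), *b = node_new();
    a->next = b; b->refs++;
    b->next = a; a->refs++;
    node_release(b->next);   /* manually break the b -> a edge */
    b->next = NULL;
    node_release(a);
    node_release(b);
    return live_nodes;       /* 0: everything was freed */
}
```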

> 
> > One technique I played with was using an incremental GC. In an
> > interpreter you can run the GC while waiting for input. You can
> > also run it "between" instructions in a VM. You can run the VM for
> > N instructions and then run the incremental GC for X microseconds,
> > and so on.
> 
> Incremental GC seems to be the new fashion in multimedia related (ie 
> real time) scripting languages. One of the main reasons why I didn't 
> just go with Lua is that it relies on GC, but doesn't (yet) have an 
> incremental GC implementation.
> 

Yep, for all the same reasons you are running into.
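
The "run the VM for N instructions, then run the GC for a bounded step"
interleaving can be sketched like this in C. All the constants and stub
functions here are made up for illustration; a real incremental
collector would mark or sweep a small, bounded chunk of the heap in
gc_step().

```c
/* Sketch of interleaving a VM with an incremental GC: run the VM for a
 * fixed number of instructions, then let the GC do a bounded amount of
 * work, so neither one can stall the other for long. */

enum { INSTRUCTIONS_PER_SLICE = 256, GC_UNITS_PER_SLICE = 32 };

static int vm_pc = 0, gc_progress = 0, gc_cycles_done = 0;

/* Stub: execute one VM instruction. Returns 0 when the program ends. */
static int vm_step(void) {
    return ++vm_pc < 10000;
}

/* Stub: one bounded unit of GC work. Here a full collection cycle is
 * pretended to take 1000 units. */
static void gc_step(void) {
    if (++gc_progress >= 1000) {
        gc_progress = 0;
        gc_cycles_done++;
    }
}

int run(void) {
    int running = 1;
    while (running) {
        for (int i = 0; i < INSTRUCTIONS_PER_SLICE && running; i++)
            running = vm_step();
        for (int i = 0; i < GC_UNITS_PER_SLICE; i++)  /* bounded pause */
            gc_step();
    }
    return gc_cycles_done;
}
```

The worst-case pause per slice is bounded by GC_UNITS_PER_SLICE, which
is the whole point for RT use: you trade some throughput for a hard cap
on latency.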

> 
> > If you have a handle based memory allocator you can also merge free
> > blocks during idle time and "between" instructions, and reduce
> > memory fragmentation.
> >
> > > Either way, note that bounded execution time is more important
> > > than speed, so most of the conclusions drawn in traditional
> > > papers on this matter are more or less irrelevant. For RT work,
> > > half the average speed is better than occasional extreme or
> > > unbounded peaks. I can get a faster CPU if I have to, but not an
> > > infinitely fast CPU.
> >
> > Exactly the same conclusion I came to. Better to have predictable
> > behavior and lower performance than to have maximum performance and
> > unpredictable behavior.
> >
> > I have noticed that traditional papers on memory management do not
> > address the memory usage patterns in multimedia programs at all.
> 
> Maybe this is a catch 22 situation? People that are serious about RT 
> generally stay away from scripting languages, and soft RT and non RT 
> people just don't care enough to worry about the problem. So, there 
> are no scripting solutions that really work in RT environments, and 
> as a result, there are no users (ie potential contributors, if we 
> think in Free/Open Source terms) of RT scripting systems.

I don't think so. I think it is because the papers are written by
professors and graduate students who never seem to have any interest in
either embedded systems or real time. 

Then there is the point that RT systems tend to lag behind desktop and
server systems in overall memory and performance. And, RT/embedded
systems have reliability requirements that are many times higher than in
desktop/server systems. The result is that RT/embedded developers are
(rightly) conservative and therefore software techniques used in
RT/embedded systems tend to lag many years (decades :-) behind
techniques that are acceptable on desktop/server systems.

> 
> Right, I know there actually *are* RT scripting systems in various 
> more or less hidden places - but they all seem to be proprietary, 
> closely tied to some synth engine or similar, or both. That means 
> they're of little use to me, and I guess, of little use to most 
> people that just need a basic RT scripting system.
> 
> BTW, my code is LGPL, in case it turns out useful. Code to be released 
> as soon as I have anything remotely useful.
> 
> 
> [...]
> > > > You have to capture the reference into a named variable which
> > > > will be different for each reference. Not much chance of
> > > > confusion.
> > >
> > > No, not really... You just have to know what you're capturing
> > > (<funcref, context> or <funcref, reference to context>), since
> > > that's not obvious from the syntax.
> >
> > Ok, you lost me there. How would you ever have a context and not a
> > reference to a context?
> 
> Uhm... I wasn't thinking straight. I actually meant "reference to 
> context" vs "reference to private copy of context".
> 

That makes sense.

> 
> [...function call with illegal context...]
> > > > Design the language so that the default value of a reference is
> > > > something you can check to see if it is invalid and so that
> > > > once assigned a valid value the value can not become invalid.
> > >
> > > Not quite sure how to do that in this case... The CCALL (call via
> > > constant; ie "hardwired" call) instruction takes an operand that
> > > tells the VM how many levels to back up the stack to find the
> > > lexical parent function's register frame. The upvalue access
> > > instructions work in a similar fashion.
> >
> > There are some techniques for doing that that are much easier to
> > use. One is to just push pointers to outer stack frames on the
> > stack before pushing the function arguments on the stack. So
> > references to outer stack frames are just made indirect through the
> > appropriate value on the stack.
> 
> That's actually what I'm doing... Each call frame has a fixed field 
> that points to the "parent" stack frame, if any.
> 
> 
> > You don't need to climb the stack,
> > you just indirect through the appropriate frame pointer. This works
> > with a linked-list stack like yours, too.
> 
> Well, the way my implementation works, the return path forms one 
> linked list, and the "upvalue scope path" forms another. (Not always 
> the same, since child functions may call each other and stuff, 
> requiring "shortcuts" in the upvalue path.)

Each function context needs a pointer to the context of each enclosing
scope it can see, along with the values of its local variables. For
example:

function a(b, c)
{
        function d(e, f)
        {
                function g(h, i)
                {
                }
                function j(k, l)
                {
                }
        }
}

The context for function j() always needs pointers to the contexts of
the most recent invocations of a() and d(), while function d()'s context
only needs a pointer to the most recent invocation of a(). Of course,
d() can be recursive, so it may have many invocations on the stack. And
g() and j() may call each other and be recursive, so they also may have
many invocations on the stack. So, the stack frame for a call to j()
would contain:

j:
        -> last stack frame
        -> d 
        -> a
        value k
        value l



> 
> As to the "stack" itself, it's basically just heap memory that's 
> managed by the VM's CALL/RETURN logic. The return path is what makes 
> it a linked list. The very simplistic allocation logic is what makes 
> it a stack. (The latter will have to change to handle microthreads 
> and similar things, that effectively turn the stack into a tree.)
> 
> 
> [...]
> > > Actually, I'm thinking about plain LUTs, to completely avoid
> > > searching.
> >
> > To me a LUT is simply a look up table. The term doesn't imply a
> > lack of searching.
> 
> Good point... I was thinking in DSP and general low level terms, where 
> the term "LUT" normally means "table that is used to replace 
> expensive calculations with direct indexing operations."
> 
> 
> > > The idea is that any "typeless" field name should have a
> > > globally unique index, so you can look it up by index in any
> > > object that has a field by that name. Problem is that each object
> > > will need an object LUT that at least spans the indices of the
> > > names used, so it only works (efficiently) for small programs.
> >
> > Yeah, that is why local tables are a good solution. Even if you have
> > to search, the search is short. But, you can have local tables and
> > give local variables unique indexes in the local context.
> 
> Well, local *variables* won't be a problem, as mine work just like in 
> C; they're allocated on the stack and bound at compile time. Typed 
> references won't be a problem either, as they'll use compile time 
> binding as well.
> 
> All I need is to turn "typeless names" into something that allows 
> quick run-time binding to members of objects - or just plain index by 
> name (implemented using hash tables where appropriate), if that's the 
> only sensible way of doing it.
> 
> 
> > Or, if you like the global table, then you can bind local values in
> > the global table. On entry to a function you have a list of indexes
> > whose current value is pushed on the stack. Then you can use those
> > names for your local variables. On exit you just pop the old values
> > back into the global table. This is a well-known technique for doing
> > dynamic scoping efficiently.
> 
> Not quite sure I'm following... Calculate the local->global/parent 
> mapping/binding when the function is called, so that the actual code 
> will then deal with the right values?
> 
> Any keywords or references to look for? (I haven't been messing with 
> compilers all that much, and I have no formal education on the 
> subject, as you might have guessed.)

Dynamic scoping. 
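
Here is a rough C sketch of the shallow-binding flavor of it (the table
size, the two-name Binding struct, and all the function names are
arbitrary): on entry, save the old value of each name the function
rebinds and install the new one; on exit, restore them. Every lookup is
then a direct index into the one global table.

```c
/* Shallow binding: one global table holds the *current* value of every
 * name. Callees see the caller's bindings by plain indexing. */

#define N_NAMES 8
static int global_slot[N_NAMES];        /* current value of each name */

typedef struct {
    int index[2], saved[2], count;      /* names this call rebinds */
} Binding;

static void bind_enter(Binding *b, int idx0, int v0, int idx1, int v1) {
    b->count = 2;
    b->index[0] = idx0; b->saved[0] = global_slot[idx0]; global_slot[idx0] = v0;
    b->index[1] = idx1; b->saved[1] = global_slot[idx1]; global_slot[idx1] = v1;
}

static void bind_exit(Binding *b) {
    /* restore in reverse order, like popping a stack */
    for (int i = b->count - 1; i >= 0; i--)
        global_slot[b->index[i]] = b->saved[i];
}

/* A callee that sees the caller's dynamic bindings by direct indexing. */
static int callee(void) { return global_slot[0] + global_slot[1]; }

int demo_dynamic_scope(void) {
    global_slot[0] = 100; global_slot[1] = 200;
    Binding b;
    bind_enter(&b, 0, 1, 1, 2);      /* rebind names 0 and 1 locally */
    int inner = callee();            /* sees the new values: 1 + 2   */
    bind_exit(&b);                   /* old values are back          */
    return inner * 1000 + global_slot[0] + global_slot[1];
}
```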

> 
> 
> [...]
> > > > > The problem is that this gets ugly if/when objects way down
> > > > > the line have fields with names that are already in the table
> > > > > - and spread all over it... Making it a hash table seems like
> > > > > the obvious answer, but I'm open to suggestions. (Speed is
> > > > > important, but bounded look-up time is absolutely critical.)
> > > >
> > > > Have a table/per object instead of a global table. When you see
> > > > o.get just look up get in the table belonging to "o". If o
> > > > doesn't point to a class object then it doesn't have a table
> > > > and o.get is illegal.
> > >
> > > Yeah, but then you can't look up by index, right? (The compiler
> > > can't tell what index "get" will have in any particular object's
> > > list, except in the special case where the type is known at
> > > compile time.)
> >
> > You can't have your cake and eat it too.
> 
> Oh. Too bad... ;-)
> 
> 
> > If you have dynamic typing
> > then you have to pay for dynamic typing.
> 
> Yeah... That's why I'll add static typing as well. (Optional type 
> specifier when declaring variables, arguments and stuff.) I'm worried 
> about performance as well as compile time error checking.
> 
> Speaking of which;
> 
>       A) No type specifier ==> default type. (Say, integer.)
> 
>       B) No type specifier ==> object becomes dynamically typed.
> 
>       C) Type specifiers are not optional. Say "dynamic" if
>          that's what you want.
> 
> What's your vote?

I prefer to either have strong typing or dynamic typing. Never both.
But, if I have to have both of them, I prefer C. Always say what you
mean. Defaults breed bugs, and they make those bugs hard to find because
you can't see them. Understanding code with defaults means having to
keep track of a lot of details about the language all the time. That is
my main complaint about Perl. Too many defaults.

> 
> 
> > If you want static typing
> > you have to live with static typing. Mixing them means that
> > sometimes you can use indexes and sometime you have to search. So,
> > if you have LUTs sometime you can index into them (when the
> > compiler knows the type) and sometimes you have to search them
> > (when the compiler does not know the type).
> 
> Yes... Well, look-ups can be done in bounded time, so there's no major 
> issue here. I just want to make sure I'm not missing some obvious 
> shortcuts. (Though I guess most things can be fixed later anyway. The 
> language isn't set in stone, and I'm going to keep the implementation 
> clean and simple until the thing is stable enough for optimization.)
> 
> 
> > > > That keeps all the tables local to their objects, no global
> > > > interactions to worry about. Also, since the tables are small
> > > > you can get good performance with a sequential search of the
> > > > tables so that you don't have to use space for a hash table.
> > > > OTOH, hash tables can be pretty small so use a hash table per
> > > > class.
> > >
> > > Right... It's not *that* expensive, and there are no nasty
> > > surprizes, really; objects with many fields have a higher worst
> > > case look-up time, but it's still bounded for any given piece of
> > > code.
> > >
> > > But O(1) is always better, of course. ;-)
> >
> > Not necessarily. ;-) Sometimes the flexibility that comes from
> > using an O(N/2) algorithm is worth the cost. Especially when N is
> > usually small.
> 
> Yeah... And N is basically defined by the script coder anyway, so it's 
> no big deal. Accessing object members through dynamically typed 
> variables means the number of elements may affect performance. Keep 
> member counts low, or use statically typed references whenever you 
> can, if performance is critical.
> 
> An optional compiler warning when generating dynamically typed code 
> might be handy, just so you don't do it by accident...
> 
Good idea.
> 
> > Of course, a good hash table has performance that is
> > O(1) in most practical cases.
> 
> Yeah, I have some code lying around that guarantees O(1) nearly all 
> the time, and O(2) or something in some very rare cases. (You may 
> sometimes get two strings with the same hash code.)
> 
> However, I suspect that a plain search is faster than any hash tables 
> for small tables. Any ballpark figures for that? Or perhaps rather; 
> should I bother throwing in any hash table code at all, until it's 
> tuning time? :-)

Depends on the cost of computing the hash function and the cost of doing
an equality test. Assuming strings of bytes then if the hash function is
a typical shift xor type function then the cost of computing the hash is
about the same as testing for equality between two equal strings. So a
hash table that has to compute a hash function and then do an equality
test where there is a 98% chance of finding the right entry on the first
probe has a cost of 2+(a very small value). The average cost of a
sequential search of a table with 4 entries is 2 tests (on average you
have to look at half the entries in the table) where one is a short test
and one is a long test. So... the hash table starts to break even when
there are more than 4 entries in the table. OTOH, if you can sort the
table and use a binary search, the binary search also starts to look
good when there are more than 4 items in the table.

So, I would stick with the simplest tables you can find until you can
prove there is a need for hash tables.
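
For reference, the "simplest table" version is only a few lines of C
(a sketch; the Field struct and the names are invented). Worst-case
lookup time is trivially bounded by the table size, which is what
matters for the RT case:

```c
#include <string.h>

/* Per-object field table with plain sequential search. For the small
 * tables typical of script objects this is hard to beat: no hash to
 * compute, good locality, and a bounded worst case. */

typedef struct {
    const char *name;
    int slot;
} Field;

static int field_lookup(const Field *t, int n, const char *name) {
    for (int i = 0; i < n; i++)
        if (strcmp(t[i].name, name) == 0)
            return t[i].slot;
    return -1;   /* not found: o.name is illegal */
}

/* Look up "get" in a 3-field object's table. */
int demo_lookup(void) {
    static const Field obj[] = { {"x", 0}, {"y", 1}, {"get", 2} };
    return field_lookup(obj, 3, "get");
}
```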

> 
> 
> //David Olofson - Programmer, Composer, Open Source Advocate
> 
> .- Audiality -----------------------------------------------.
> |  Free/Open Source audio engine for games and multimedia.  |
> | MIDI, modular synthesis, real time effects, scripting,... |
> `-----------------------------------> http://audiality.org -'
>    --- http://olofson.net --- http://www.reologica.se ---
> 
-- 
+---------------------------------------+
+ Bob Pendleton: writer and programmer. +
+ email: Bob@xxxxxxxxxxxxx              +
+ web:   www.GameProgrammer.com         +
+---------------------------------------+

