Re: Implicit casting issues when binding to C++

  • From: Mike Pall <mike-1207@xxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Fri, 13 Jul 2012 19:41:00 +0200

Mike Pall wrote:
> I'm tempted to invest a couple hours into this and see how far
> I get. No promises -- I really don't know if this works out.

Well, it turned out to be a couple more hours than I had hoped. :-/

Here's a brief post mortem of my attempt at implementing C++-style
inheritance, but within the context of a plain C FFI:

* Parsing 'struct Derived : Base, Base2 { ... }' was the easiest
  part. Had to make some compromises with struct namespace vs.
  typedef namespace, which are unified in C++.

* Got into trouble with 'virtual', 'private' and so on: these are
  C++ keywords, but valid identifiers in C. Guessing from the
  context doesn't work that well. Since I really only needed
  'virtual', I settled on requiring that as the first token of a
  virtual member declaration.

* Pointer compatibility and rebasing went so-so. There's a reason
  C++ has more than one cast operation. Might lead to some
  surprises with ffi.cast(). I got it to invoke a virtual base
  method on a derived class and the pointer arithmetic was ok.

* Multiple inheritance ... that's where the problem starts. C++
  compilers try to be clever and reuse vtable pointers and slots
  from base classes. I think I've got the basics of this right,
  but it'll inevitably break into little pieces for anything
  bigger. And I haven't even looked at virtual inheritance ...

* Virtual methods didn't seem that hard, after I discovered that
  GCC only shares slots in the first vtable pointer with base
  classes, whereas MSVC isn't that picky. Even got this
  implemented in the JIT compiler. Was easier than I thought.
  Too easy -- I bet there are some exceptions to the rules ...

* Got into trouble again with non-virtual methods. Ignoring name
  mangling for now (do-it-yourself with asm("?get@Foo@@QAEHXZ")).
  But then ... with b:foo(), where do we search for the symbol
  name? There's no library namespace associated with an instance!
  And it doesn't make sense to attach one to a class, since the
  base class might come from a different library. Basically one
  has to search all loaded libraries. And then there are weak
  symbols and so on ... yuck.

* C++ compilers have basically complete freedom on how to lay out
  non-POD classes. But for the FFI I need to know the exact
  layout. Ok, so there's supposed to be a standard ABI for that,
  but the document doesn't match reality and even says so upfront.
  Experimenting with GCC and MSVC leads me to the opinion that the
  only thing in the universe that knows how to do that is the
  compiler itself. Every time I tried to fix my layout algorithm,
  I found another exception to the rules.

* And inheritance is just one tiny piece of the whole C++ puzzle.
  Guess how many classes use std::string and have a look at what
  that entails. Name mangling, overloading, user-defined operators,
  constructors, destructors, parametric polymorphism, inline
  functions, templates, C++11?!? Oh dear ...
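
To make the pointer rebasing and layout points above concrete,
here's a minimal C++ sketch. It is not the shelved FFI code. The
class names (Base, Base2, Derived) and both helper functions are
made up for illustration. It assumes a mainstream ABI (GCC or
MSVC) that places the first polymorphic base at offset 0, which
the C++ standard itself does not guarantee.

```cpp
#include <cassert>
#include <cstddef>

// Two unrelated polymorphic bases, each with its own vtable pointer.
struct Base  { int a = 1; virtual int id()  { return 1; } virtual ~Base()  {} };
struct Base2 { int b = 2; virtual int id2() { return 2; } virtual ~Base2() {} };

// Multiple inheritance: Derived contains a Base and a Base2 subobject.
struct Derived : Base, Base2 {
  int c = 3;
  int id()  override { return 10; }
  int id2() override { return 20; }
};

// Byte distance from the start of the Derived object to its Base2
// subobject. The compiler picks this value; an FFI would have to
// reproduce it exactly to convert Derived* to Base2*.
std::ptrdiff_t base2_offset(Derived *d) {
  Base2 *b2 = d;  // implicit conversion adjusts the address
  return reinterpret_cast<char *>(b2) - reinterpret_cast<char *>(d);
}

// Calls a virtual method through the rebased pointer and checks
// that static_cast undoes the adjustment.
int call_via_base2(Derived *d) {
  Base2 *b2 = d;
  assert(static_cast<Derived *>(b2) == d);
  return b2->id2();  // dispatches to Derived::id2
}
```

Calling base2_offset() on a Derived object should give a nonzero
offset under the assumption above, and the exact number depends on
the compiler and target. That's the part an FFI can't just guess.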

Ok, so I gave up (for now).

I'm sorry, but I guess everyone will have to live with their
workarounds for C++ support for a little longer. I shelved the
code -- it's very brittle and adds way too much complexity within
the current development timeframe for LuaJIT 2.0. I might give it
another go _after_ LuaJIT 2.1 has gotten off the ground, though.


tl;dr: C++ needs to die. I'm eager to help.


BTW: I'm still considering the enum change (always box them), but
     I'll have to think this through first.

--Mike
