[haiku-development] BeOS Apps Crashing on Haiku (Bug #889)

Howdy,

I've tracked bug http://dev.haiku-os.org/ticket/889 down to its cause, 
which ultimately is that our runtime loader resolves symbols differently 
from BeOS's.

A bit (quite a bit actually) of background: For each polymorphic class 
(i.e. one containing virtual methods or having a base class that does) gcc 
also generates a type_info (more precisely a gcc internal subclass) 
instance and a function returning that instance in order to provide the 
RTTI features for that class. A pointer to the function is also available 
at a known slot in the vtable, so that typeid() and dynamic_cast<>() can 
conveniently get it for a given object. The type_info instance is usually 
only accessed through the function and is lazily initialized by it (gcc 
2.95.3 does that in a thread-unsafe manner, BTW).
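For reference, these are the two user-visible RTTI operations that end up going through the type info function. It's only a minimal sketch: ScrollBarLike and HScrollBarLike are made-up stand-ins, not the real BScrollBar classes.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical polymorphic classes standing in for BScrollBar and a subclass.
struct ScrollBarLike {
    virtual ~ScrollBarLike() {}
};

struct HScrollBarLike : ScrollBarLike {
};

// dynamic_cast<>() consults the RTTI data reachable through the vtable.
bool IsHScrollBar(ScrollBarLike* object)
{
    assert(object != nullptr);
    return dynamic_cast<HScrollBarLike*>(object) != nullptr;
}

// typeid() on a polymorphic object yields the type_info of its dynamic type.
bool SameDynamicType(const ScrollBarLike& a, const ScrollBarLike& b)
{
    return typeid(a) == typeid(b);
}
```

Both helpers rely on the compiler-generated type_info being in a usable state, which is exactly what goes wrong in the bug.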

In case of a class that has a single polymorphic base class the internal 
gcc type_info subclass instantiated for representing the class is called 
__si_type_info. It contains a field pointing to the type_info of the base 
class. To initialize the instance, the type info function first invokes the 
type info function for the base class and then invokes the __si_type_info 
constructor accordingly. Unfortunately it does not pass the return value of 
the base class type info function to the constructor, but uses a pointer to 
the base class type_info directly. In case of position independent code the 
pointer is stored in the global offset table and is relocated by the 
runtime loader. The same goes for the pointer to the type info function.

This would all be no problem, if there were only a single type_info 
instance per class, but there can actually be -- and are, when compiled 
with a gcc predating Oliver's -- one per shared object (library, 
executable, add-on). Which wouldn't be a problem either, if the same 
instance were used consistently. Alas, that's not necessarily the case.

E.g. in case of Be's PackageBuilder: libtracker.so contains a 
BPrivate::BHScrollBar class, which is a subclass of BScrollBar. The type 
info function for BPrivate::BHScrollBar invokes the type info function for 
BScrollBar, which lives (only) in libbe.so and initializes the type_info 
instance of BScrollBar located in libbe.so. The BScrollBar type_info 
instance the BPrivate::BHScrollBar type info function subsequently uses, 
though, is the one the runtime loader resolved for it: the copy living in 
the PackageBuilder executable, which has not been initialized. The result 
is a crash as soon as RTTI features (e.g. a dynamic_cast<>()) are used on a 
BPrivate::BHScrollBar object.
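The mismatch can be modeled in plain C++, roughly like this. It's only a sketch: TypeInfoModel, the globals and the function names are all invented for illustration and are not what gcc 2.95.3 actually emits. The "GOT slot" is just a pointer that I've made point to the executable's copy, the way our runtime loader resolves it.

```cpp
#include <cassert>

// Hypothetical stand-in for gcc's internal type_info classes.
struct TypeInfoModel {
    const char* name;           // stays null until lazily initialized
    const TypeInfoModel* base;  // type_info of the single base class
};

// One BScrollBar type_info per shared object that contains one.
TypeInfoModel gScrollBarInfoInLibbe = {nullptr, nullptr};
TypeInfoModel gScrollBarInfoInExecutable = {nullptr, nullptr};
TypeInfoModel gHScrollBarInfoInLibtracker = {nullptr, nullptr};

// Models libtracker.so's GOT entry for the BScrollBar type_info,
// resolved to the copy in the executable.
const TypeInfoModel* gGotScrollBarInfo = &gScrollBarInfoInExecutable;

// BScrollBar type info function; lives only in libbe.so and
// initializes libbe.so's own instance.
const TypeInfoModel* ScrollBarTypeInfoFunction()
{
    if (gScrollBarInfoInLibbe.name == nullptr)
        gScrollBarInfoInLibbe.name = "BScrollBar";
    return &gScrollBarInfoInLibbe;
}

// BPrivate::BHScrollBar type info function as described above: it calls
// the base class function, but then uses the separately resolved pointer.
const TypeInfoModel* HScrollBarTypeInfoFunction()
{
    if (gHScrollBarInfoInLibtracker.name == nullptr) {
        (void)ScrollBarTypeInfoFunction();  // return value is dropped
        gHScrollBarInfoInLibtracker.base = gGotScrollBarInfo;
        gHScrollBarInfoInLibtracker.name = "BPrivate::BHScrollBar";
    }
    return &gHScrollBarInfoInLibtracker;
}

// True if the base type_info reachable from info has been initialized.
bool BaseIsInitialized(const TypeInfoModel* info)
{
    assert(info != nullptr);
    return info->base != nullptr && info->base->name != nullptr;
}
```

In this model BaseIsInitialized(HScrollBarTypeInfoFunction()) comes out false, although libbe.so's instance has been initialized just fine. Anything walking the base chain then works on garbage.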

The way our runtime loader works is, AFAIK, actually more correct than on 
BeOS. When resolving a symbol we traverse the shared object DAG 
topologically (starting with the executable, then the libraries it depends 
on, then their dependencies, and so on) until the symbol has been found. 
Be's runtime loader seems to use a similar order, but instead starts with 
the shared object that requests the symbol. Thus the PackageBuilder 
executable is skipped when the BScrollBar type_info reference in 
libtracker.so is resolved. The same would probably occur on BeOS, however, 
if libtracker.so contained a BScrollBar type_info but no corresponding 
type info function.
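To illustrate the difference between the two lookup orders, here is a small simulation. It's a sketch under assumptions: I approximate the traversal with a breadth-first search over the dependency graph, the symbol names are descriptive placeholders rather than real mangled names, and the BeOS behavior is only what I inferred above, not something I've verified.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a loaded shared object.
struct SharedObject {
    std::string name;
    std::set<std::string> symbols;   // symbols this object defines
    std::vector<std::size_t> deps;   // indices of the objects it depends on
};

typedef std::vector<SharedObject> ImageList;

// Breadth-first lookup beginning at images[start]. Returns the name of the
// first object defining the symbol, or an empty string if there is none.
std::string Resolve(const ImageList& images, std::size_t start,
    const std::string& symbol)
{
    assert(start < images.size());
    std::vector<bool> visited(images.size(), false);
    std::deque<std::size_t> queue;
    queue.push_back(start);
    visited[start] = true;

    while (!queue.empty()) {
        const SharedObject& object = images[queue.front()];
        queue.pop_front();
        if (object.symbols.count(symbol) != 0)
            return object.name;
        for (std::size_t dep : object.deps) {
            if (!visited[dep]) {
                visited[dep] = true;
                queue.push_back(dep);
            }
        }
    }
    return "";
}

// Index 0: PackageBuilder, 1: libtracker.so, 2: libbe.so.
ImageList MakePackageBuilderScenario()
{
    const std::string typeInfo = "BScrollBar type_info";
    const std::string typeInfoFunction = "BScrollBar type info function";

    ImageList images;
    images.push_back(SharedObject{"PackageBuilder", {typeInfo}, {1, 2}});
    images.push_back(SharedObject{"libtracker.so", {}, {2}});
    images.push_back(SharedObject{"libbe.so", {typeInfo, typeInfoFunction},
        {}});
    return images;
}
```

Starting at the executable (index 0, our strategy) the type_info resolves to PackageBuilder, while the type info function resolves to libbe.so. Starting at the requesting object (index 1, the presumed BeOS strategy) both resolve to libbe.so, so the two stay consistent.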

Besides ignoring the problem I see the following possible solutions:

1) Modify our runtime loader to use the same symbol resolution strategy as 
Be's. This should fix the problem for all executables that run on BeOS but 
suffer from it on Haiku. As mentioned before, I believe our current runtime 
loader behavior is more correct, though.

2) Modify the gcc type info functions to use the type_info returned by the 
base class type info function instead of resolving it separately. This 
makes sure the type_info used has been initialized. The change should fix 
e.g. the file panel problem several of the Be applications have, but -- 
since we only correct the type info functions in our code -- it might not 
fix all instances of the problem.
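What 2) would amount to, modeled in plain C++. Again only a sketch: TypeInfoModel and all the names are invented, and the real change would of course be in the code gcc generates, not in hand-written functions like these.

```cpp
#include <cassert>

// Hypothetical stand-in for gcc's internal type_info classes.
struct TypeInfoModel {
    const char* name;           // stays null until lazily initialized
    const TypeInfoModel* base;  // type_info of the single base class
};

TypeInfoModel gBaseInfoInLibrary = {nullptr, nullptr};
TypeInfoModel gBaseInfoInExecutable = {nullptr, nullptr};  // never initialized
TypeInfoModel gDerivedInfo = {nullptr, nullptr};

// Base class type info function; initializes the library's instance.
const TypeInfoModel* BaseTypeInfoFunction()
{
    if (gBaseInfoInLibrary.name == nullptr)
        gBaseInfoInLibrary.name = "Base";
    return &gBaseInfoInLibrary;
}

// Derived class type info function with the proposed change: it keeps the
// pointer returned by the base class function instead of using a
// separately resolved one, so the base is known to be initialized.
const TypeInfoModel* FixedDerivedTypeInfoFunction()
{
    if (gDerivedInfo.name == nullptr) {
        const TypeInfoModel* base = BaseTypeInfoFunction();
        assert(base != nullptr && base->name != nullptr);
        gDerivedInfo.base = base;
        gDerivedInfo.name = "Derived";
    }
    return &gDerivedInfo;
}
```

The uninitialized copy in the executable still exists in this model, but the derived type_info no longer refers to it, regardless of how the loader resolves the symbol.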

Opinions?

CU, Ingo
