[muscle] Re: Muscle on Solaris

  • From: "Jeremy Friesner" <jaf@xxxxxxxxxxxx>
  • To: muscle@xxxxxxxxxxxxx
  • Date: Sun, 02 Jul 2006 16:55:31 PDT (-0700)

Hi Lior,

A stack crawl of the crash would be helpful, especially if you can determine 
the exact line at which the program crashed.  (If the debugger is being 
unhelpful, I sometimes even resort to determining the crash location 'the hard 
way', by putting in lots of fprintf(stderr, "got to line %i\n", __LINE__) type 
statements and seeing which one gets printed last before the crash... 
MuscleSupport.h even defines an MCHECKPOINT macro for that purpose.)
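
For illustration, here is a minimal sketch of that kind of checkpoint tracing. 
It is not the actual MCHECKPOINT definition from MuscleSupport.h; the names 
formatCheckpoint() and MY_CHECKPOINT() are made up for this example. The one 
detail worth copying is the explicit flush, so the last message is not lost in 
a buffer when the process dies.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of MUSCLE): builds the text of one checkpoint.
inline std::string formatCheckpoint(const char * file, int line)
{
   assert(file != nullptr);
   return std::string("got to ") + file + ":" + std::to_string(line);
}

// Hypothetical macro (not MUSCLE's MCHECKPOINT): prints the current source
// location to stderr and flushes immediately, so the message still appears
// even if the program crashes on the very next statement.
#define MY_CHECKPOINT() \
   do { \
      std::fprintf(stderr, "%s\n", formatCheckpoint(__FILE__, __LINE__).c_str()); \
      std::fflush(stderr); \
   } while(0)
```

Sprinkle MY_CHECKPOINT(); through the suspect code path; the last line printed 
before the crash brackets the failing statement.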

That said, the behaviour you describe reminds me of two previous issues I've 
seen... whether your problem is related or not, I have no idea, but they might 
provide clues:

1) I stumbled across a bug in gcc 3.x that would cause new (nothrow) to return 
an invalid pointer (0x04, instead of 0x00) on memory failure when you used it 
to try to allocate an array.  I put in a hack-around, as shown on lines 289-303 
of support/MuscleSupport.h, but perhaps the hack-around doesn't work (or worse, 
is causing problems) under Solaris.
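
To make the idea concrete, here is a rough sketch of how one might guard 
against that kind of bogus return value. This is an assumption about the 
general approach, not a copy of the code in MuscleSupport.h; the function 
names and the 64-byte threshold are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helper: treats any pointer that lands in the first few bytes
// of the address space (e.g. 0x04) as a failed allocation.
// The threshold of 64 is an arbitrary assumption for this sketch.
template<typename T> inline T * sanitizeArrayPointer(T * p)
{
   const std::uintptr_t kSuspectLimit = 64;
   return (reinterpret_cast<std::uintptr_t>(p) < kSuspectLimit) ? nullptr : p;
}

// Hypothetical wrapper: allocates an array with new (nothrow) and passes the
// result through the check above before handing it to the caller.
template<typename T> inline T * newArrayNoThrow(std::size_t count)
{
   return sanitizeArrayPointer(new (std::nothrow) T[count]);
}

// Example use: callers only ever compare the result against a null pointer.
inline void exampleAllocation()
{
   int * values = newArrayNoThrow<int>(16);
   if (values == nullptr) return;  // allocation failed; nothing to free
   values[15] = 7;
   assert(values[15] == 7);
   delete [] values;
}
```

If a check like this misbehaves on a given platform, temporarily replacing it 
with a plain new (nothrow) call is a quick way to rule it in or out.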

2) On some CPUs (I believe SPARC is one of them), accessing a multibyte value 
(e.g. int32 or float) on a non-word-aligned memory address will cause the CPU 
to throw an exception.  At one point I went through muscle's flatten/unflatten 
code to handle this problem:  the muscleCopyIn() and muscleCopyOut() templated 
inline functions (also declared in support/MuscleSupport.h) are implemented to 
call memcpy() to access unaligned values if MUSCLE_CPU_REQUIRES_DATA_ALIGNMENT 
is #defined; otherwise they just do a normal copy using the assignment 
operator.  So it is possible either that MUSCLE_CPU_REQUIRES_DATA_ALIGNMENT is 
not being #defined on your system, and should be (see line 69 of 
MuscleSupport.h), or that some new code snuck into the codebase that accesses 
unaligned values without using muscleCopyIn()/muscleCopyOut(), and thus causes 
the crash (entirely possible, since I don't test the code on SPARC CPUs, so I 
might have done that without thinking about it, and wouldn't see any symptoms 
under PPC or Intel).
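
As a sketch of the memcpy() technique described in (2), something along these 
lines avoids the alignment trap. These are not the real muscleCopyIn() and 
muscleCopyOut() implementations; readUnaligned() and writeUnaligned() are 
hypothetical names used only to show the idea.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: reads a T from an address that may not be aligned for
// T. memcpy() copies byte by byte, so no misaligned load is ever issued.
template<typename T> inline T readUnaligned(const void * source)
{
   static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
   T value;
   std::memcpy(&value, source, sizeof(value));
   return value;
}

// Hypothetical helper: writes a T to an address that may not be aligned for T.
template<typename T> inline void writeUnaligned(void * dest, const T & value)
{
   static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
   std::memcpy(dest, &value, sizeof(value));
}

// Example use: store and reload an int32 at an odd offset inside a byte buffer.
inline void exampleRoundTrip()
{
   unsigned char buffer[16] = {0};
   writeUnaligned<std::int32_t>(buffer + 1, 123456);
   assert(readUnaligned<std::int32_t>(buffer + 1) == 123456);
}
```

By contrast, something like *((int32_t *)(buffer + 1)) is the pattern to grep 
for: it works on Intel but can trap on CPUs that require aligned accesses.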

Cheers,
Jeremy


