Jeremy Friesner wrote:
> Hi Lior,
>
> A stack crawl of the crash would be helpful, especially if you can determine
> the exact line at which the program crashed. (if the debugger is being
> unhelpful, I sometimes even resort to determining the crash location 'the
> hard way', by putting in lots of fprintf(stderr, "got to line %i\n",
> __LINE__) type of statements and seeing which one gets printed last before
> the crash... MuscleSupport.h even defines a MCHECKPOINT macro for that
> purpose).
>

Here's the backtrace. Note that because this only happens when there is a "-O" switch, the backtrace is a bit muddled.

(gdb) bt
#0  0x00069300 in muscle::PulseNode::PulseNode() (this=0xf840c)
    at ../util/PulseNode.cpp:8
#1  0x0006d8b0 in muscle::ServerComponent::ServerComponent() (this=0xf83fc)
    at ../reflector/ServerComponent.cpp:10
#2  0x00054df0 in muscle::AbstractReflectSession::AbstractReflectSession() (this=0xff1405a8)
    at ../reflector/AbstractReflectSession.cpp:17
#3  0x00056108 in muscle::DumbReflectSession::DumbReflectSession() (this=0xf83fc)
    at ../reflector/DumbReflectSession.cpp:18
#4  0x000568c4 in muscle::StorageReflectSession::StorageReflectSession() (this=0xf83fc)
    at ../reflector/StorageReflectSession.cpp:62
#5  0x00056604 in muscle::StorageReflectSessionFactory::CreateSession(muscle::String const&) (this=0xffbff6b8)
    at ../reflector/StorageReflectSession.cpp:31
#6  0x0006af64 in muscle::FilterSessionFactory::CreateSession(muscle::String const&) (this=0xffbff598, clientHostIP=@0xffbff228)
    at ../util/RefCount.h:173
#7  0x0006203c in muscle::ReflectServer::DoAccept(unsigned short, int, muscle::ReflectSessionFactory*) (this=0xffbff7a8, port=2960, acceptSocket=3, optFactory=0xffbff598)
    at ../util/RefCount.h:94
#8  0x0006191c in muscle::ReflectServer::ServerProcessLoop() (this=0xffbff7a8)
    at ../reflector/ReflectServer.cpp:614
#9  0x000648e4 in std::moneypunct<char, false>::do_curr_symbol() const ()
    at muscled.cpp:273
#10 0x00065550 in std::numeric_limits<unsigned long long>::min_exponent ()

Line 8 in the PulseNode.cpp file is the constructor, and the only thing that happens there is the member initializations.

> That said, the behaviour you describe reminds me of two previous issues I've
> seen... whether your problem is related or not, I have no idea, but they
> might provide clues:
>
> 1) I stumbled across a bug in gcc 3.x that would cause new (nothrow) to
> return an invalid pointer (0x04, instead of 0x00) on memory failure when you
> used it to try to allocate an array. I put in a hack-around, as shown on
> lines 289-303 of support/MuscleSupport.h, but perhaps the hack-around doesn't
> work (or worse, is causing problems) under Solaris.
>

I disabled this workaround and recompiled the server, and the problem still happens. This workaround is not causing the problem.

> 2) On some CPUs (I believe SPARC is one of them), accessing a multibyte value
> (e.g. int32 or float) on a non-word-aligned memory address will cause the CPU
> to throw an exception. At one point I went through muscle's
> flatten/unflatten code to handle this problem: the muscleCopyIn() and
> muscleCopyOut() templated inline functions (also declared in
> support/MuscleSupport.h) are implemented to call memcpy() to access unaligned
> values if MUSCLE_CPU_REQUIRES_DATA_ALIGNMENT is #defined; otherwise they just
> do a normal copy using the assignment operator.
> So it is possible either
> that MUSCLE_CPU_REQUIRES_DATA_ALIGNMENT is not being #defined on your system,
> and should be (see line 69 of MuscleSupport.h), or perhaps some new code
> snuck into the codebase that accesses unaligned values without using
> muscleCopyIn()/muscleCopyOut(), and thus causes the crash (entirely possible,
> since I don't test the code on SPARC CPUs, so I might have done that without
> thinking about it, and wouldn't see any symptoms under PPC or Intel)
>

SPARC will indeed crash on a non-word-aligned memory access, but I made sure that MUSCLE_CPU_REQUIRES_DATA_ALIGNMENT is #defined, and the issue still happens.

> Cheers,
> Jeremy
>

Regards,
Lior
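
P.S. For anyone following the thread, here is a minimal sketch of the memcpy()-based technique Jeremy describes for touching multibyte values at unaligned addresses. This is not the actual muscleCopyIn()/muscleCopyOut() code from MuscleSupport.h; the names ReadUnaligned, WriteUnaligned and RoundTripDemo are hypothetical, made up purely for illustration. The idea is only that the bytes get copied, so no load or store of the wider type is ever issued on the misaligned address.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (NOT MUSCLE's real API): read a T from an address that
// may not be aligned for T.  memcpy() copies byte-by-byte as far as the
// language is concerned, so no misaligned T-sized load is performed.
template <typename T>
inline T ReadUnaligned(const void * src)
{
   T ret;
   std::memcpy(&ret, src, sizeof(ret));
   return ret;
}

// Hypothetical helper: write a T to a possibly unaligned address.
template <typename T>
inline void WriteUnaligned(void * dst, const T & value)
{
   std::memcpy(dst, &value, sizeof(value));
}

// Small demonstration: store a 32-bit value at an odd offset inside a byte
// buffer and read it back.  Returns true if the value survives the round trip.
inline bool RoundTripDemo()
{
   unsigned char buf[16] = {0};
   const std::uint32_t original = 0x12345678u;

   // buf + 1 is deliberately not 4-byte aligned.
   WriteUnaligned<std::uint32_t>(buf + 1, original);
   const std::uint32_t readBack = ReadUnaligned<std::uint32_t>(buf + 1);

   // By contrast, *reinterpret_cast<const std::uint32_t *>(buf + 1) is the
   // kind of access that can fault on alignment-strict CPUs.
   return readBack == original;
}
```

Calling RoundTripDemo() from a test program should return true on both alignment-strict and alignment-tolerant CPUs. Any code path that casts a byte pointer to a wider type and dereferences it directly, instead of going through helpers like these, would be a candidate for the kind of crash discussed above.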