[haiku-development] Re: Disabling Strict Aliasing for GCC4 Builds

  • From: "Michael Lotz" <mmlr@xxxxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Sat, 19 Apr 2008 15:21:06 +0200

Hi Andreas
> I get the impression you've spent much time investigating this, and  
> that is highly appreciated.

It's mostly time spent waiting for the build to get finished ;-).

> Yesterday, I booted into newly patched r25032 and was able to play  
> around with it quite nicely for some time, only on shutdown the usual  
> Tracker crash.

Just to make sure: did you rebuild everything after updating? Since 
this is a jam-level change (adding the compiler option), it won't 
trigger a rebuild of the targets by default. You'd have to force one by 
deleting the objects folder or by using "jam -a haiku-image".

> Getting the impression that some things were slightly faster in the  
> gcc4 image (like opening a Deskbar menu) - could it be that,  
> independent of the vm_page_fault issue, gcc4's optimizations reveal  
> some SMP issues that don't surface on gcc2 due to timing? Would serial  
> debugging be able to get any useful hints from such a badly locked-up  
> system?

It might be revealing new SMP issues, but I think it is more likely 
that these issues come from memory layout changes and the like. It's 
quite possible, for example, that there are driver bugs that stay 
hidden with the memory layout GCC2 generates but cause crashes under 
GCC4. Then there is the question of whether certain optimizations are 
SMP safe. For example, I'm not sure how this strict aliasing stuff is 
supposed to work in a multi-threaded environment. If the optimizer 
keeps a value in a register and delays writing it back to memory, how 
does that affect a situation where the data is actually shared between 
two independent threads, for instance because it resides in a shared 
area? Thanks to the strict aliasing rules the optimizer may be able to 
tell that nothing in the same section of code can access this data. But 
how should it know that the data might be shared and accessed from 
outside the current code block (i.e. from some GUI thread or by a 
server)? So some of those optimizations could have an impact on our 
environment that is not immediately obvious.
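
As a rough illustration of the two concerns above, here is a small C 
sketch. It is not code from the Haiku tree, and every name in it 
(float_bits_copied, shared_slot, slot_publish, slot_try_read) is made 
up for the example. It assumes a C11 compiler with <stdatomic.h>, which 
is newer than the GCC4 discussed here, and a 32-bit IEEE 754 float. The 
first part shows a type pun that stays valid under strict aliasing, the 
second marks a flag as shared so the optimizer cannot simply keep it in 
a register.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide, as on x86. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/*
 * The pattern that strict aliasing breaks looks like this:
 *
 *     uint32_t bits = *(uint32_t *)&value;
 *
 * The float object is read through a uint32_t lvalue, so the optimizer
 * may assume both accesses touch unrelated memory. Copying the bytes is
 * the well-defined alternative.
 */
uint32_t float_bits_copied(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Hypothetical slot living in memory that another thread also maps. */
typedef struct {
    atomic_int ready;   /* atomic: must be re-read from memory each time */
    int payload;        /* plain data, ordered by the flag below */
} shared_slot;

/* Writer side: store the payload first, then raise the flag. */
void slot_publish(shared_slot *slot, int value)
{
    slot->payload = value;
    atomic_store_explicit(&slot->ready, 1, memory_order_release);
}

/* Reader side: returns 1 and fills *out once the flag has been raised. */
int slot_try_read(shared_slot *slot, int *out)
{
    if (!atomic_load_explicit(&slot->ready, memory_order_acquire))
        return 0;
    *out = slot->payload;
    return 1;
}
```

The compiler has no way to find out by itself that a plain variable is 
shared with another thread, so the code has to say so, here with 
atomics (locks have a similar effect). Without such a marker the value 
may be kept in a register whether or not strict aliasing is enabled.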

Anyway, I have uploaded my -O0 build as a 7zip archive to 
http://haiku.mlotz.ch/haiku.image.gcc4.no-optimization.7z 
for anyone that wants to try it (20MB expanding to a 150MB image). It 
has no optimizations enabled, so it pretty much rules out anything 
buggy in optimization or anything unexpected due to that. If stuff 
crashes in here, then either the compiler is really broken (highly 
unlikely) or it is a bug in Haiku that needs to be investigated/fixed 
(like the Tracker crash on shutdown). You could try copying over this 
complete installation to your partition and check the situation. If you 
get KDLs then note down where their stack crawls end up ("sc" in KDL). 
Most probably they will either be completely random or they will lead 
to the same few places in one or more of the drivers. Since I can use 
the GCC4 build with strict aliasing and value range propagation 
disabled perfectly fine in QEMU, I tend to think the problem is 
somewhere in buggy or at least non-compliant driver code (which is not 
used under QEMU in this case).

Regards
Michael
