On Thu, 26 Mar 2009 16:29 +0000, "Martin Guy" <martinwguy@xxxxxxxx> wrote:
> > Do any of the exception flags get set for the other Maverick Crunch
> > opcodes?
> Yes, I've seen the underflow and overflow and inexact bits go up.

Good to know.

> > Not supporting denorm is more of a problem.
>
> Not for me. I'm trying to make it do the best real time audio it can,
> so am dealing in samples in the range -1 to +1 (plus
> analysis/resynthesis of these).
> Which applications care whether the ground is at 2^-1022 or 2^-1074,
> or is it more a question of scientific apps thinking they know the
> smallest possible number, using it as an edge case and testing against
> it?

For real-time audio, it doesn't matter, because you're talking about
such low volumes. You can just add a DC offset, or force
subnormals to zero. Incidentally, if you're forcing subnormals to zero,
then you shouldn't honor signed zeros.

denormal = subnormal = gradual underflow
http://grouper.ieee.org/groups/754/faq.html
http://gcc.gnu.org/wiki/FloatingPointMath

Don't glibc and C99 require full IEEE compliance? I guess it depends on
what flags you're passing to gcc, and the defaults.

In any case, there appear to be a few properties/flags built into GCC
that we can use, e.g. HONOR_SIGNED_ZEROS - if this is disabled, then
more of the MaverickCrunch HW operations can be used, since -0 == +0
then. Not sure if any of them are actually faster than what you've
replaced them with.

HONOR_NONIEEE_DENORMS, DENORM_OPERANDS_ARE_ZERO, DENORM_RESULTS_ARE_ZERO,
ZERO_RESULTS_ARE_POSITIVE are used on Cell; maybe we should do something
similar.

In any case, I think the trig functions in libm (glibc) need it.

I think there are other FPUs that have a -mieee flag, e.g. Alpha. This
is in old gcc code, though.

http://www.tybor.com/ has a few C99 FPCE test suites that will probably
still fail.

> > Looks very evil. May I propose that we modify the softfloat routines?
> That's an interesting idea.
> A faster trick might be to use one of the
> very instructions that take denormal inputs as zero and emit code
> fragments to test the parameters using them. For example, when -mieee,
> for
>     cfaddd  mvd0, mvd1, mvd2
>
> with mvd3 as a scratch register, emit something like
>     cfcpyd  mvd3, mvd1         @ test first parameter
>     cfcmpd  mvd3, #0           @ zero or denorm?
>     beq     L55                @ yes. use softfloat version
>     cfcpyd  mvd3, mvd2         @ test second parameter
>     cfcmpd  mvd3, #0           @ zero or denorm?
>     beq     L55                @ yes. use softfloat version
>     cfaddd  mvd0, mvd1, mvd2   @ neither param is subnormal.
> L56
>
> and at L55 have a code fragment that stacks ARM regs r0-r3, moves the
> arguments into them, calls the softfloat routines, moves the result
> back to mvd0, restores r0-r3 and branches to L56. That way the code
> would go in a straight line most of the time without even having to do
> function calls. However implementing that's not on my horizon since I
> don't need denormalized values.

An even better idea. A little performance hit for all operations, but
it could be fine-tuned to do only a post-operation check, or both pre-
and post-operation checks, depending on the expected usage.

for:
    cfaddd    %0, %1, %2

do this - bigger hit for normals, since comparison is pre & post
(%0 is the destination, %1 and %2 are the operands, as in your example):
    cfcmpd    %1, #0       @ %1 zero/denorm?
    cfcmpdne  %2, #0       @ %2 zero/denorm? only run if %1 is norm
    beq       L55          @ one of the operands is zero/denorm
    cfaddd    %0, %1, %2
    cfcmpd    %0, #0       @ result zero/denorm - probably not possible
                           @ for cfaddd, maybe possible for other
                           @ operations?
    bne       L56          @ result is normal: done
L55:                       @ recalculate: do softfloat here
L56:

or this - bigger hit for subnormals or zero result, since comparison is
only post:
    cfaddd    %0, %1, %2
    cfcmpd    %0, #0       @ result zero/denorm?
    bne       L56          @ result is normal: done
                           @ recalculate: do softfloat here
L56:

Anyhow, there's probably no point in implementing all this until
everything else is working correctly.
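
For what it's worth, the pre & post variant can be sketched in plain C to
check the control flow on a host machine. Everything here is an
illustration under assumptions, not Crunch code: hw_add() is a
hypothetical stand-in for cfaddd that models an FPU reading subnormal
operands as zero and flushing subnormal results to zero (signs of zero
ignored), soft_add() is a hypothetical stand-in for the softfloat call
and simply uses the host's IEEE addition, and soft_calls is a counter
added only so the slow path can be observed.

```c
#include <float.h>
#include <math.h>

/* Assumed hardware model: subnormals are read and written as zero. */
static double flush(double x)
{
    return fpclassify(x) == FP_SUBNORMAL ? 0.0 : x;
}

/* Hypothetical stand-in for "cfaddd %0, %1, %2" under that model. */
static double hw_add(double a, double b)
{
    return flush(flush(a) + flush(b));
}

/* Hypothetical stand-in for the softfloat routine; the host is assumed
   to do IEEE double arithmetic with gradual underflow. */
static double soft_add(double a, double b)
{
    return a + b;
}

/* What "cfcmpd x, #0 / beq" would catch on such an FPU: zero or denorm. */
static int zero_or_denorm(double x)
{
    int c = fpclassify(x);
    return c == FP_ZERO || c == FP_SUBNORMAL;
}

/* Counts how often the slow path is taken (illustration only). */
static int soft_calls;

/* Pre & post check: use the fast path only when both operands and the
   hardware result are normal, otherwise recalculate in software. */
double add_prepost(double a, double b)
{
    if (!zero_or_denorm(a) && !zero_or_denorm(b)) {
        double r = hw_add(a, b);
        if (!zero_or_denorm(r))
            return r;              /* straight-line case, label L56 */
    }
    soft_calls++;                  /* label L55 */
    return soft_add(a, b);
}
```

With this model, add_prepost(1.0, 2.0) stays on the fast path, while
add_prepost(DBL_MIN / 4, DBL_MIN) falls back to soft_add() because the
first operand is subnormal.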