[linux-cirrus] Re: Yet another MaverickCrunch hardware bug?

  • From: "Hasjim Williams" <linux-cirrus@xxxxxxxxxxxxxxxxx>
  • To: linux-cirrus@xxxxxxxxxxxxx
  • Date: Thu, 26 Mar 2009 16:23:57 +1000

Hi Martin (and everyone else), 

I haven't chimed in for a while.  Looks like you made some more progress
on standalone gcc.  NB, I haven't tested your patches yet.

The binutils save/restore bug is simple enough to fix.  See attachment.  
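
For anyone skimming the attachment: for each contiguous block of saved
registers, the new s_arm_unwind_save_mv emits a two-byte "long form"
unwind opcode built from the first register number and the block length.
A minimal C sketch of that expression (the helper name is hypothetical
and not part of binutils; it only mirrors the arithmetic in the patch):

```c
/* Hypothetical helper, for illustration only: mirrors the expression
     op = 0xc600 | ((reg + 1) << 4) | ((hi_reg - reg) - 1)
   from the attached tc-arm.c patch, where reg + 1 is the first saved
   register and hi_reg is the last one in the block.  */
static unsigned
mv_unwind_long_form (unsigned first_reg, unsigned last_reg)
{
  /* High byte 0xc6, then first register in bits 7..4 and
     (number of registers - 1) in bits 3..0.  */
  return 0xc600u | (first_reg << 4) | (last_reg - first_reg);
}
```

So a block covering mvd4 through mvd15 would come out as
mv_unwind_long_form (4, 15), i.e. 0xc64b.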

IEEE 754 compliance can also be done at the glibc level (and/or the
kernel level), in a similar way to how the VFP code does it.  I think the
DENORM exception is supposed to be handled at the kernel level, since it is
usually done by microcode on most processors.  The other exceptions get
signalled up to glibc, which either handles them or passes them on to the
running process.

IEEE 754 exceptions on MaverickCrunch: Inexact, Underflow, Overflow, Denorm.
The Div by Zero exception doesn't need to be implemented, since there is no
hardware divide...

IIRC, bit 12 of DSPSC gets set if either of the operands for an
operation was a denorm.  If this happens, then you generate an exception,
grab the INST from DSPSC and recalculate it using ARM instructions,
i.e. softfloat.  Errata 12 roughly explains this.  I assume you haven't
recompiled glibc with MaverickCrunch enabled, so it is still using
softfloat and is hence IEEE 754 compliant (and slower), i.e. running sin()
is still the same speed.
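
To make the bit positions concrete, here is a rough C sketch of decoding
a DSPSC value, using the layout quoted in the fpu_control.h part of the
attachment (Denorm in bit 12, RM in bits 11..10, enables in bits 9..5,
flags in bits 4..0).  The macro and function names are my own for
illustration; they are not from the kernel or glibc, and this only
decodes a value you have already read, it does not touch the hardware:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as listed in the DSPSC comment of the attached
   fpu_control.h patch.  All names here are hypothetical.  */
#define DSPSC_DENORM      (UINT32_C(1) << 12)  /* an operand was a denorm */
#define DSPSC_RM_SHIFT    10                   /* rounding mode, bits 11..10 */
#define DSPSC_RM_MASK     (UINT32_C(3) << DSPSC_RM_SHIFT)
#define DSPSC_FLAG_MASK   UINT32_C(0x1f)       /* IX, UF, OF, RSVD, IO */
#define DSPSC_ENABLE_SHIFT 5                   /* IXE, UFE, OFE, RSVD, IOE */

/* True if the operation would need redoing in softfloat.  */
static bool
dspsc_needs_softfloat (uint32_t dspsc)
{
  return (dspsc & DSPSC_DENORM) != 0;
}

/* Rounding mode field as a small integer 0..3.  */
static unsigned
dspsc_rounding_mode (uint32_t dspsc)
{
  return (unsigned) ((dspsc & DSPSC_RM_MASK) >> DSPSC_RM_SHIFT);
}

/* Exception flags that are both raised and enabled, i.e. the ones
   that would have to be signalled up to glibc.  */
static unsigned
dspsc_raised_and_enabled (uint32_t dspsc)
{
  uint32_t flags = dspsc & DSPSC_FLAG_MASK;
  uint32_t enables = (dspsc >> DSPSC_ENABLE_SHIFT) & DSPSC_FLAG_MASK;
  return (unsigned) (flags & enables);
}
```

The enable shift of 5 is the same value as FE_EXCEPT_SHIFT in the fenv.h
part of the patch, so the flag bits line up with the FE_* constants there.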

You'll probably find that compiling it with the -mieee flag won't work,
as there isn't any completed support for MaverickCrunch (yet) in glibc.

I had a half-written patch for glibc, but I never completed it as I
think the kernel driver for crunch also needs to be modified to
correctly generate exceptions (so that glibc will handle them) for this
to happen.  The patch does fix the longjmp problem though.
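
As a sanity check on the buffer sizes in the attached glibc patch:
saving mvd4-mvd15 as 64-bit values takes 12 * 8 = 96 bytes, which matches
the "sub r0, r0, #96" in setjmp.S, and adding the ten core registers gives
the 34 words of the new __jmp_buf.  A small sketch of that arithmetic,
assuming 4-byte ints as on 32-bit ARM (the names are illustrative only):

```c
/* Illustrative constants only; the values are taken from the attached
   setjmp.h and setjmp.S changes, the names are made up.  */
enum
{
  ARM_INT_BYTES     = 4,   /* assumption: 32-bit ARM, sizeof (int) == 4 */
  CORE_REGS_SAVED   = 10,  /* v1-v6, sl, fp, sp, pc */
  CRUNCH_REGS_SAVED = 12,  /* mvd4 .. mvd15 */
  CRUNCH_REG_BYTES  = 8    /* each saved as a 64-bit value */
};

/* Bytes written by the cfstrd sequence before the core registers.  */
static int
crunch_save_bytes (void)
{
  return CRUNCH_REGS_SAVED * CRUNCH_REG_BYTES;
}

/* Number of int-sized slots needed in __jmp_buf.  */
static int
jmp_buf_words (void)
{
  return crunch_save_bytes () / ARM_INT_BYTES + CORE_REGS_SAVED;
}
```

The same sum for the FPA case (four 12-byte registers) gives
48 / 4 + 10 = 22, which is the original __jmp_buf size.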

e.g. arch/arm/vfp, do_vfp vs crunch_task_enable in
arch/arm/kernel/entry-armv.S

The current code works (without IEEE 754 exceptions), and is based on
the iWMMXt stuff (which doesn't generate exceptions).  It is probably simpler
to write the code to handle synchronous exceptions first, and then
extend it to async exceptions later.  Not sure whether stalling the ARM
or using hardware interrupts will be simpler.

Has anyone built a complete system with crunch enabled (e.g. in
OpenEmbedded), or is it still only in certain apps, without accelerated
libraries?

Hasjim

On Sat, 14 Mar 2009 14:39 +0000, "Martin Guy" <martinwguy@xxxxxxxx>
wrote:
> OK, here's another try, once again with "no known bugs", for C at least.
> 
> Changes to last time:
>  * Always enables revision D1-E2 workarounds, drops the
>    -mcirrus-fix-invalid-insns flag and removes purported D0 workarounds.
>  * Avoids the extra hardware bug found by the FFTW testsuite.
>  * Disables the 64-bit instructions. Re-enable them with -mcirrus-di flag
> 
> Description: martinwguy.co.uk/martin/crunch
> Download: simplemachines.it/tools
> 
>    M
--- /home/hwilliams/binutils-2.17.50.0.12-original/gas/config/tc-arm.c  2007-05-18 13:45:49.000000000 +1000
+++ binutils-2.17.50.0.12/gas/config/tc-arm.c   2007-05-18 13:44:48.000000000 +1000
@@ -451,6 +451,7 @@
   REG_TYPE_NDQ,
   REG_TYPE_NSDQ,
   REG_TYPE_VFC,
+  REG_TYPE_MV,
   REG_TYPE_MVF,
   REG_TYPE_MVD,
   REG_TYPE_MVFX,
@@ -1037,6 +1038,7 @@
   /* Alternative syntaxes are accepted for a few register classes.  */
   switch (type)
     {
+    case REG_TYPE_MV:
     case REG_TYPE_MVF:
     case REG_TYPE_MVD:
     case REG_TYPE_MVFX:
@@ -3573,6 +3575,140 @@
   ignore_rest_of_line ();
 }
 
+/* Parse a directive saving Maverick Crunch data registers.  */
+
+static void
+s_arm_unwind_save_mv (void)
+{
+  int reg;
+  int hi_reg;
+  int i;
+  unsigned mask = 0;
+  valueT op;
+
+  if (*input_line_pointer == '{')
+    input_line_pointer++;
+
+  do
+    {
+      reg = arm_reg_parse (&input_line_pointer, REG_TYPE_MV);
+
+      if (reg == FAIL)
+       {
+         as_bad (_(reg_expected_msgs[REG_TYPE_MV]));
+         goto error;
+       }
+
+      if (mask >> reg)
+       as_tsktsk (_("register list not in ascending order"));
+      mask |= 1 << reg;
+
+      if (*input_line_pointer == '-')
+       {
+         input_line_pointer++;
+         hi_reg = arm_reg_parse (&input_line_pointer, REG_TYPE_MV);
+         if (hi_reg == FAIL)
+           {
+             as_bad (_(reg_expected_msgs[REG_TYPE_MV]));
+             goto error;
+           }
+         else if (reg >= hi_reg)
+           {
+             as_bad (_("bad register range"));
+             goto error;
+           }
+         for (; reg < hi_reg; reg++)
+           mask |= 1 << reg;
+       }
+    }
+  while (skip_past_comma (&input_line_pointer) != FAIL);
+
+  if (*input_line_pointer == '}')
+    input_line_pointer++;
+
+  demand_empty_rest_of_line ();
+
+  /* Generate any deferred opcodes because we're going to be looking at
+     the list. */
+  flush_pending_unwind ();
+
+  for (i = 0; i < 16; i++)
+    {
+      if (mask & (1 << i))
+       unwind.frame_size += 8;
+    }
+
+  /* Attempt to combine with a previous opcode.  We do this because gcc
+     likes to output separate unwind directives for a single block of
+     registers.  */
+  if (unwind.opcode_count > 0)
+    {
+      i = unwind.opcodes[unwind.opcode_count - 1];
+      if ((i & 0xf8) == 0xc0)
+       {
+         i &= 7;
+         /* Only merge if the blocks are contiguous.  */
+         if (i < 6)
+           {
+             if ((mask & 0xfe00) == (1 << 9))
+               {
+                 mask |= ((1 << (i + 11)) - 1) & 0xfc00;
+                 unwind.opcode_count--;
+               }
+           }
+         else if (i == 6 && unwind.opcode_count >= 2)
+           {
+             i = unwind.opcodes[unwind.opcode_count - 2];
+             reg = i >> 4;
+             i &= 0xf;
+
+             op = 0xffff << (reg - 1);
+             if (reg > 0
+                 && ((mask & op) == (1u << (reg - 1))))
+               {
+                 op = (1 << (reg + i + 1)) - 1;
+                 op &= ~((1 << reg) - 1);
+                 mask |= op;
+                 unwind.opcode_count -= 2;
+               }
+           }
+       }
+    }
+
+  hi_reg = 15;
+  /* We want to generate opcodes in the order the registers have been
+     saved, ie. descending order.  */
+  for (reg = 15; reg >= -1; reg--)
+    {
+      /* Save registers in blocks.  */
+      if (reg < 0
+         || !(mask & (1 << reg)))
+       {
+         /* We found an unsaved reg.  Generate opcodes to save the
+            preceding block.  */
+         if (reg != hi_reg)
+           {
+             if (0) // (reg == 9)
+               {
+                 /* Short form.  */
+                 op = 0xc0 | (hi_reg - 10);
+                 add_unwind_opcode (op, 1);
+               }
+             else
+               {
+                 /* Long form.  */
+                 op = 0xc600 | ((reg + 1) << 4) | ((hi_reg - reg) - 1);
+                 add_unwind_opcode (op, 2);
+               }
+           }
+         hi_reg = reg - 1;
+       }
+    }
+
+  return;
+error:
+  ignore_rest_of_line ();
+}
 
 /* Parse an unwind_save directive.
    If the argument is non-zero, this is a .vsave directive.  */
@@ -3624,6 +3760,8 @@
     case REG_TYPE_MMXWR:  s_arm_unwind_save_mmxwr ();  return;
     case REG_TYPE_MMXWCG: s_arm_unwind_save_mmxwcg (); return;
 
+    case REG_TYPE_MV:     s_arm_unwind_save_mv (); return;    
+
     default:
       as_bad (_(".unwind_save does not support this kind of register"));
       ignore_rest_of_line ();
@@ -14256,8 +14394,8 @@
   REGDEF(FPSID,0,VFC), REGDEF(FPSCR,1,VFC), REGDEF(FPEXC,8,VFC),
 
   /* Maverick DSP coprocessor registers.  */
-  REGSET(mvf,MVF),  REGSET(mvd,MVD),  REGSET(mvfx,MVFX),  REGSET(mvdx,MVDX),
-  REGSET(MVF,MVF),  REGSET(MVD,MVD),  REGSET(MVFX,MVFX),  REGSET(MVDX,MVDX),
+  REGSET(mv,MV),  REGSET(mvf,MVF),  REGSET(mvd,MVD),  REGSET(mvfx,MVFX),  REGSET(mvdx,MVDX),
+  REGSET(MV,MV),  REGSET(MVF,MVF),  REGSET(MVD,MVD),  REGSET(MVFX,MVFX),  REGSET(MVDX,MVDX),
 
   REGNUM(mvax,0,MVAX), REGNUM(mvax,1,MVAX),
   REGNUM(mvax,2,MVAX), REGNUM(mvax,3,MVAX),
diff -urN /home/hwilliams/glibc-2.5/ports/sysdeps/arm/bits/endian.h glibc-2.5/ports/sysdeps/arm/bits/endian.h
--- /home/hwilliams/glibc-2.5/ports/sysdeps/arm/bits/endian.h   2005-06-13 20:11:47.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/bits/endian.h   2007-05-18 08:41:52.000000000 +1000
@@ -15,5 +15,9 @@
 #ifdef __VFP_FP__
 #define __FLOAT_WORD_ORDER __BYTE_ORDER
 #else
+#ifdef __MAVERICK__
+#define __FLOAT_WORD_ORDER __LITTLE_ENDIAN
+#else
 #define __FLOAT_WORD_ORDER __BIG_ENDIAN
 #endif
+#endif
diff -urN /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/bits/fenv.h glibc-2.5/ports/sysdeps/arm/fpu/bits/fenv.h
--- /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/bits/fenv.h 2001-07-06 14:55:48.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/fpu/bits/fenv.h 2007-05-18 08:44:33.000000000 +1000
@@ -20,6 +20,45 @@
 # error "Never use <bits/fenv.h> directly; include <fenv.h> instead."
 #endif
 
+#if defined(__MAVERICK__)
+
+/* Define bits representing exceptions in the FPU status word.  */
+enum
+  {
+    FE_INVALID = 1,
+#define FE_INVALID FE_INVALID
+    FE_OVERFLOW = 4,
+#define FE_OVERFLOW FE_OVERFLOW
+    FE_UNDERFLOW = 8,
+#define FE_UNDERFLOW FE_UNDERFLOW
+    FE_INEXACT = 16,
+#define FE_INEXACT FE_INEXACT
+  };
+
+/* Amount to shift by to convert an exception to a mask bit.  */
+#define FE_EXCEPT_SHIFT        5
+
+/* All supported exceptions.  */
+#define FE_ALL_EXCEPT  \
+       (FE_INVALID | FE_OVERFLOW | FE_UNDERFLOW | FE_INEXACT)
+
+/* IEEE rounding modes.  */
+enum
+  {
+    FE_TONEAREST = 0,
+#define FE_TONEAREST    FE_TONEAREST
+    FE_TOWARDZERO = 0x400,
+#define FE_TOWARDZERO   FE_TOWARDZERO
+    FE_DOWNWARD = 0x800,
+#define FE_DOWNWARD     FE_DOWNWARD
+    FE_UPWARD = 0xc00,
+#define FE_UPWARD       FE_UPWARD
+  };
+
+#define FE_ROUND_MASK (FE_UPWARD)
+
+#else /* FPA */
+
 /* Define bits representing exceptions in the FPU status word.  */
 enum
   {
@@ -31,6 +70,7 @@
 #define FE_OVERFLOW FE_OVERFLOW
     FE_UNDERFLOW = 8,
 #define FE_UNDERFLOW FE_UNDERFLOW
+
   };
 
 /* Amount to shift by to convert an exception to a mask bit.  */
@@ -44,6 +84,8 @@
    modes exist, but you have to encode them in the actual instruction.  */
 #define FE_TONEAREST   0
 
+#endif /* FPA */
+
 /* Type representing exception flags. */
 typedef unsigned long int fexcept_t;
 
diff -urN /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/bits/setjmp.h glibc-2.5/ports/sysdeps/arm/fpu/bits/setjmp.h
--- /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/bits/setjmp.h       2006-01-10 19:22:16.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/fpu/bits/setjmp.h       2007-05-18 08:45:22.000000000 +1000
@@ -28,7 +28,11 @@
 #ifndef _ASM
 /* Jump buffer contains v1-v6, sl, fp, sp and pc.  Other registers are not
    saved.  */
+#ifdef __MAVERICK__
+typedef int __jmp_buf[34];
+#else
 typedef int __jmp_buf[22];
 #endif
+#endif 
 
 #endif
diff -urN /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/fegetround.c glibc-2.5/ports/sysdeps/arm/fpu/fegetround.c
--- /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/fegetround.c        2001-07-06 14:55:48.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/fpu/fegetround.c        2007-05-18 08:47:52.000000000 +1000
@@ -18,9 +18,21 @@
    02111-1307 USA.  */
 
 #include <fenv.h>
+#include <fpu_control.h>
 
 int
 fegetround (void)
 {
+#if defined(__MAVERICK__)
+
+  unsigned long temp;
+
+  _FPU_GETCW (temp);
+  return temp & FE_ROUND_MASK;
+
+#else /* FPA */
+
   return FE_TONEAREST;         /* Easy. :-) */
+
+#endif
 }
diff -urN /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/fesetround.c glibc-2.5/ports/sysdeps/arm/fpu/fesetround.c
--- /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/fesetround.c        2005-10-11 01:29:32.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/fpu/fesetround.c        2007-05-18 08:48:32.000000000 +1000
@@ -20,10 +20,26 @@
 #include <fenv.h>
+#include <fpu_control.h>

 int
 fesetround (int round)
 {
+#if defined(__MAVERICK__)
+  unsigned long temp;
+
+  if (round & ~FE_ROUND_MASK)
+    return 1;
+
+  _FPU_GETCW (temp);
+  temp = (temp & ~FE_ROUND_MASK) | round;
+  _FPU_SETCW (temp);
+  return 0;
+
+#else /* FPA */
+
   /* We only support FE_TONEAREST, so there is no need for any work.  */
   return (round == FE_TONEAREST)?0:1;
+
+#endif
 }
 
 libm_hidden_def (fesetround)
diff -urN /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/fpu_control.h glibc-2.5/ports/sysdeps/arm/fpu/fpu_control.h
--- /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/fpu_control.h       2001-07-06 14:55:48.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/fpu/fpu_control.h       2007-05-18 08:50:28.000000000 +1000
@@ -20,6 +20,81 @@
 #ifndef _FPU_CONTROL_H
 #define _FPU_CONTROL_H
 
+#if defined(__MAVERICK__)
+
+/* DSPSC register: (from EP9312 User's Guide)
+ *
+ * bits 31..29 - DAID
+ * bits 28..26 - HVID
+ * bits 25..24 - RSVD
+ * bit  23     - ISAT
+ * bit  22     - UI
+ * bit  21     - INT
+ * bit  20     - AEXC
+ * bits 19..18 - SAT
+ * bits 17..16 - FCC
+ * bit  15     - V
+ * bit  14     - FWDEN
+ * bit  13     - Invalid
+ * bit 12      - Denorm
+ * bits 11..10 - RM
+ * bits 9..5   - IXE, UFE, OFE, RSVD, IOE
+ * bits 4..0   - IX, UF, OF, RSVD, IO
+ */
+
+/* masking of interrupts */
+#define _FPU_MASK_IM   (1 << 5)        /* invalid operation */
+#define _FPU_MASK_ZM   0               /* divide by zero */
+#define _FPU_MASK_OM   (1 << 7)        /* overflow */
+#define _FPU_MASK_UM   (1 << 8)        /* underflow */
+#define _FPU_MASK_PM   (1 << 9)        /* inexact */
+#define _FPU_MASK_DM   0               /* denormalized operation */
+
+#define _FPU_RESERVED  0xfffff000      /* These bits are reserved.  */
+
+#define _FPU_DEFAULT   0x00b00000      /* Default value.  */
+#define _FPU_IEEE          0x00b003a0  /* Default + exceptions enabled. */
+
+/* Type of the control word.  */
+typedef unsigned int fpu_control_t;
+
+/* Macros for accessing the hardware control word.  */
+#define _FPU_GETCW(cw) ({                      \
+       register int __t1, __t2;                \
+                                               \
+       __asm__ volatile (                      \
+       "cfmvr64l       %1, mvdx0\n\t"          \
+       "cfmvr64h       %2, mvdx0\n\t"          \
+       "cfmv32sc       mvdx0, dspsc\n\t"       \
+       "cfmvr64l       %0, mvdx0\n\t"          \
+       "cfmv64lr       mvdx0, %1\n\t"          \
+       "cfmv64hr       mvdx0, %2"              \
+       : "=r" (cw), "=r" (__t1), "=r" (__t2)   \
+       );                                      \
+})
+
+#define _FPU_SETCW(cw) ({                      \
+       register int __t0, __t1, __t2;          \
+                                               \
+       __asm__ volatile (                      \
+       "cfmvr64l       %1, mvdx0\n\t"          \
+       "cfmvr64h       %2, mvdx0\n\t"          \
+       "cfmv64lr       mvdx0, %0\n\t"          \
+       "cfmvsc32       dspsc, mvdx0\n\t"       \
+       "cfmv64lr       mvdx0, %1\n\t"          \
+       "cfmv64hr       mvdx0, %2"              \
+       : "=r" (__t0), "=r" (__t1), "=r" (__t2) \
+       : "0" (cw)                              \
+       );                                      \
+})
+
+/* Default control word set at startup.  */
+extern fpu_control_t __fpu_control;
+
+#else /* FPA */
+
+
+
 /* We have a slight terminology confusion here.  On the ARM, the register
  * we're interested in is actually the FPU status word - the FPU control
  * word is something different (which is implementation-defined and only
@@ -99,4 +174,6 @@
 /* Default control word set at startup.  */
 extern fpu_control_t __fpu_control;
 
+#endif /* FPA */
+
 #endif /* _FPU_CONTROL_H */
diff -urN /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/__longjmp.S glibc-2.5/ports/sysdeps/arm/fpu/__longjmp.S
--- /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/__longjmp.S 2001-07-06 14:55:48.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/fpu/__longjmp.S 2007-05-18 08:51:36.000000000 +1000
@@ -30,7 +30,33 @@
        movs    r0, r1          /* get the return value in place */
        moveq   r0, #1          /* can't let setjmp() return zero! */
 
+#ifdef __MAVERICK__
+       cfldrd  mvd4,  [ip], #8
+       nop
+       cfldrd  mvd5,  [ip], #8
+       nop
+       cfldrd  mvd6,  [ip], #8
+       nop
+       cfldrd  mvd7,  [ip], #8
+       nop
+       cfldrd  mvd8,  [ip], #8
+       nop
+       cfldrd  mvd9,  [ip], #8
+       nop
+       cfldrd  mvd10, [ip], #8
+       nop
+       cfldrd  mvd11, [ip], #8
+       nop
+       cfldrd  mvd12, [ip], #8
+       nop
+       cfldrd  mvd13, [ip], #8
+       nop
+       cfldrd  mvd14, [ip], #8
+       nop
+       cfldrd  mvd15, [ip], #8
+#else
        lfmfd   f4, 4, [ip] !   /* load the floating point regs */
+#endif
 
        LOADREGS(ia, ip, {v1-v6, sl, fp, sp, pc})
 END (__longjmp)
diff -urN /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/setjmp.S glibc-2.5/ports/sysdeps/arm/fpu/setjmp.S
--- /home/hwilliams/glibc-2.5/ports/sysdeps/arm/fpu/setjmp.S    2001-07-06 14:55:48.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/fpu/setjmp.S    2007-05-18 08:53:00.000000000 +1000
@@ -24,11 +24,41 @@
 
 ENTRY (__sigsetjmp)
        /* Save registers */
+#ifdef __MAVERICK__
+       cfstrd  mvd4,  [r0], #8
+       nop
+       cfstrd  mvd5,  [r0], #8
+       nop
+       cfstrd  mvd6,  [r0], #8
+       nop
+       cfstrd  mvd7,  [r0], #8
+       nop
+       cfstrd  mvd8,  [r0], #8
+       nop
+       cfstrd  mvd9,  [r0], #8
+       nop
+       cfstrd  mvd10, [r0], #8
+       nop
+       cfstrd  mvd11, [r0], #8
+       nop
+       cfstrd  mvd12, [r0], #8
+       nop
+       cfstrd  mvd13, [r0], #8
+       nop
+       cfstrd  mvd14, [r0], #8
+       nop
+       cfstrd  mvd15, [r0], #8
+#else
        sfmea   f4, 4, [r0]!
+#endif
        stmia   r0, {v1-v6, sl, fp, sp, lr}
 
        /* Restore pointer to jmp_buf */
+#ifdef __MAVERICK__
+       sub     r0, r0, #96
+#else
        sub     r0, r0, #48
+#endif
 
        /* Make a tail call to __sigjmp_save; it takes the same args.  */
        B       PLTJMP(C_SYMBOL_NAME(__sigjmp_save))
diff -urN /home/hwilliams/glibc-2.5/ports/sysdeps/arm/gccframe.h glibc-2.5/ports/sysdeps/arm/gccframe.h
--- /home/hwilliams/glibc-2.5/ports/sysdeps/arm/gccframe.h      2001-11-16 11:07:20.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/gccframe.h      2007-05-18 08:53:38.000000000 +1000
@@ -17,6 +17,10 @@
    Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
    02111-1307 USA.  */
 
+#ifdef __MAVERICK__
+#define FIRST_PSEUDO_REGISTER 43
+#else
 #define FIRST_PSEUDO_REGISTER 27
+#endif
 
 #include <sysdeps/generic/gccframe.h>
diff -urN /home/hwilliams/glibc-2.5/ports/sysdeps/arm/gmp-mparam.h glibc-2.5/ports/sysdeps/arm/gmp-mparam.h
--- /home/hwilliams/glibc-2.5/ports/sysdeps/arm/gmp-mparam.h    2005-06-13 20:11:47.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/gmp-mparam.h    2007-05-18 08:54:21.000000000 +1000
@@ -29,6 +29,9 @@
 #if defined(__ARMEB__)
 # define IEEE_DOUBLE_MIXED_ENDIAN 0
 # define IEEE_DOUBLE_BIG_ENDIAN 1
+#elif defined(__MAVERICK__)
+#define IEEE_DOUBLE_BIG_ENDIAN 0
+#define IEEE_DOUBLE_MIXED_ENDIAN 0
 #elif defined(__VFP_FP__)
 # define IEEE_DOUBLE_MIXED_ENDIAN 0
 # define IEEE_DOUBLE_BIG_ENDIAN 0
--- /home/hwilliams/glibc-2.5/ports/glibc-ports-2.5/sysdeps/arm/eabi/setjmp.S   2006-09-22 04:39:51.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/eabi/setjmp.S   2007-05-24 13:31:20.000000000 +1000
@@ -74,6 +74,25 @@
        stcl    p1, cr15, [r12], #8
 Lno_iwmmxt:
 
+       tst     a3, #HWCAP_ARM_CRUNCH
+       beq     Lno_crunch
+
+       /* Save the call-preserved crunch registers.  */
+       /* Following instructions are cfstrd cr10, [ip], #8 (etc.)  */
+       stcl    p5, cr4,  [r12], #8
+       stcl    p5, cr5,  [r12], #8
+       stcl    p5, cr6,  [r12], #8
+       stcl    p5, cr7,  [r12], #8
+       stcl    p5, cr8,  [r12], #8
+       stcl    p5, cr9,  [r12], #8
+       stcl    p5, cr10, [r12], #8
+       stcl    p5, cr11, [r12], #8
+       stcl    p5, cr12, [r12], #8
+       stcl    p5, cr13, [r12], #8
+       stcl    p5, cr14, [r12], #8
+       stcl    p5, cr15, [r12], #8
+Lno_crunch:
+
        /* Make a tail call to __sigjmp_save; it takes the same args.  */
        B       PLTJMP(C_SYMBOL_NAME(__sigjmp_save))
 
--- /home/hwilliams/glibc-2.5/ports/glibc-ports-2.5/sysdeps/arm/eabi/__longjmp.S        2006-09-22 04:39:51.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/eabi/__longjmp.S        2007-05-24 13:31:23.000000000 +1000
@@ -76,6 +76,25 @@
        ldcl    p1, cr15, [r12], #8
 Lno_iwmmxt:
 
+       tst     a2, #HWCAP_ARM_CRUNCH
+       beq     Lno_crunch
+
+       /* Restore the call-preserved crunch registers.  */
+       /* Following instructions are cfldrd cr10, [ip], #8 (etc.)  */
+       ldcl    p5, cr4,  [r12], #8
+       ldcl    p5, cr5,  [r12], #8
+       ldcl    p5, cr6,  [r12], #8
+       ldcl    p5, cr7,  [r12], #8
+       ldcl    p5, cr8,  [r12], #8
+       ldcl    p5, cr9,  [r12], #8
+       ldcl    p5, cr10, [r12], #8
+       ldcl    p5, cr11, [r12], #8
+       ldcl    p5, cr12, [r12], #8
+       ldcl    p5, cr13, [r12], #8
+       ldcl    p5, cr14, [r12], #8
+       ldcl    p5, cr15, [r12], #8
+Lno_crunch:
+
        DO_RET(lr)
 
 #ifdef IS_IN_rtld
--- glibc-2.5-original/ports/sysdeps/unix/sysv/linux/arm/sysdep.h       2006-09-22 04:39:51.000000000 +1000
+++ glibc-2.5/ports/sysdeps/unix/sysv/linux/arm/sysdep.h        2007-05-24 12:59:03.000000000 +1000
@@ -48,6 +48,7 @@
 #define HWCAP_ARM_EDSP         128
 #define HWCAP_ARM_JAVA         256
 #define HWCAP_ARM_IWMMXT       512
+#define HWCAP_ARM_CRUNCH    1024
 
 #ifdef __ASSEMBLER__
 
--- glibc-2.5/ports/sysdeps/unix/sysv/linux/arm/dl-procinfo.c-original  2007-07-02 13:20:36.000000000 +1000
+++ glibc-2.5/ports/sysdeps/unix/sysv/linux/arm/dl-procinfo.c   2007-07-02 13:23:19.000000000 +1000
@@ -47,12 +47,12 @@
 #if !defined PROCINFO_DECL && defined SHARED
   ._dl_arm_cap_flags
 #else
-PROCINFO_CLASS const char _dl_arm_cap_flags[10][10]
+PROCINFO_CLASS const char _dl_arm_cap_flags[11][10]
 #endif
 #ifndef PROCINFO_DECL
 = {
     "swp", "half", "thumb", "26bit", "fast-mult", "fpa", "vfp", "edsp",
-    "java", "iwmmxt",
+    "java", "iwmmxt", "crunch",
   }
 #endif
 #if !defined SHARED || defined PROCINFO_DECL
--- glibc-2.5/ports/sysdeps/unix/sysv/linux/arm/dl-procinfo.h-original  2007-07-02 13:25:23.000000000 +1000
+++ glibc-2.5/ports/sysdeps/unix/sysv/linux/arm/dl-procinfo.h   2007-07-02 13:25:38.000000000 +1000
@@ -24,7 +24,7 @@
 #include <ldsodefs.h>
 #include <sysdep.h>
 
-#define _DL_HWCAP_COUNT 10
+#define _DL_HWCAP_COUNT 11
 
 /* The kernel provides platform data but it is not interesting.  */
 #define _DL_HWCAP_PLATFORM     0
--- glibc-2.5/ports/sysdeps/arm/sysdep.h-original       2007-07-02 13:05:53.000000000 +1000
+++ glibc-2.5/ports/sysdeps/arm/sysdep.h        2007-07-02 13:06:26.000000000 +1000
@@ -51,6 +51,7 @@
 
 #endif
 
+#if 0 // ndef __MAVERICK__
 /* APCS-32 doesn't preserve the condition codes across function call. */
 #ifdef __APCS_32__
 #define LOADREGS(cond, base, reglist...)\
@@ -74,6 +75,7 @@
 #define DO_RET(_reg)           \
        movs pc, _reg
 #endif
+#endif
 
 /* Define an entry point visible from C.  */
 #define        ENTRY(name)                                                           \
