[raspi-internals] Re: Review: 80 bit vector instructions - scalar register update & accumulator

From: Herman Hermitage <hermanhermitage@xxxxxxxxxxx>
To: "raspi-internals@xxxxxxxxxxxxx" <raspi-internals@xxxxxxxxxxxxx>
Date: Sat, 13 Jul 2013 18:44:25 +1200

Corrections:

Accumulator
====

There is a signed 48 bit Accumulator associated with each lane.  It saturates 
to MAX 0x7fffffffffff and MIN 0x800000000000.
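
As a rough illustration of the saturation described above, here is a minimal C sketch. The names ACC48_MAX, ACC48_MIN and acc48_saturate are made up for this example; the only facts taken from the notes are the 48 bit signed range and the two limit values. Holding the accumulator in an int64_t is purely a convenience of the sketch.

```c
#include <stdint.h>

/* Limits of a signed 48-bit value, as quoted in the text above. */
#define ACC48_MAX ((int64_t)0x7fffffffffffLL)
#define ACC48_MIN (-(int64_t)0x800000000000LL)

/* Clamp a 64-bit intermediate into the signed 48-bit range.
 * (Hypothetical helper: the hardware's internal width is not known.) */
static int64_t acc48_saturate(int64_t v)
{
    if (v > ACC48_MAX) return ACC48_MAX;
    if (v < ACC48_MIN) return ACC48_MIN;
    return v;
}
```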

 f_i:
   0 1-----  ENA   Enable Normal Accumulation function.
   0 -1----  HIGH  Apply the accumulation to the top 32 bits of the Accumulator.
   0 --1---  SIGN  Accumulate values as if they were signed quantities.
   0 ---1--  CLRA  Clear accumulator before initial instruction execution
                   (doesn't apply to repeated instruction phases).
   0 ----1-  WBA   Write back new value to accumulator.
   0 -----1  SUB   Sub ALU result from Accumulator.
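
The bit patterns above could be written as C constants along these lines. This is only a sketch: the ACC_ prefix is invented here, and it assumes the leftmost column of the six-bit pattern is the most significant bit of f_i. Where f_i sits inside the full 80 bit instruction is not covered.

```c
/* Accumulator control bits of f_i, numbered within the 6-bit pattern only.
 * Assumption: leftmost pattern column = bit 5, rightmost = bit 0. */
enum acc_flags {
    ACC_ENA  = 1 << 5,  /* enable normal accumulation function      */
    ACC_HIGH = 1 << 4,  /* operate on the top 32 bits               */
    ACC_SIGN = 1 << 3,  /* treat values as signed                   */
    ACC_CLRA = 1 << 2,  /* clear accumulator on initial execution   */
    ACC_WBA  = 1 << 1,  /* write new value back to the accumulator  */
    ACC_SUB  = 1 << 0   /* subtract ALU result from the accumulator */
};
```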

Clearing accumulator:
====

   CLRA                              Clear accumulator on initially entering
                                     the instruction.

Accumulation/Decumulation
====

   UACC  (ENA|WBA)                   Accumulate with unsigned value.
   UDEC  (ENA|WBA|SUB)               Decumulate with unsigned value.
   SACC  (ENA|SIGN|WBA)              Accumulate with signed value.
   SDEC  (ENA|SIGN|WBA|SUB)          Decumulate with signed value.
   UACCH (ENA|HIGH|WBA)              Accumulate with unsigned value to high
                                     word of accumulator.
   UDECH (ENA|HIGH|WBA|SUB)          Decumulate with unsigned value to high
                                     word of accumulator.
   SACCH (ENA|HIGH|SIGN|WBA)         Accumulate with signed value to high
                                     word of accumulator.
   SDECH (ENA|HIGH|SIGN|WBA|SUB)     Decumulate with signed value to high
                                     word of accumulator.

The parts of the names are:
  U | S  -> Unsigned, Signed value
  ACC    -> Accumulate: i.e. ACC[i] += unsigned(D[i]) or ACC[i] += signed(D[i])
  DEC    -> Decumulate: i.e. ACC[i] -= unsigned(D[i]) or ACC[i] -= signed(D[i])
  H      -> Use high 32 bits of accumulator.
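
To make the accumulate/decumulate forms a bit more concrete, here is a hedged C sketch of one lane's update. Everything apart from the flag names and the 48 bit saturation limits is an assumption: the function and constant names are invented, D[i] is taken to be a raw 32 bit ALU result, the "high 32 bits" are taken to be bits 47..16 (so HIGH scales the value by 65536), and the bit numbering of the flags is local to this example. CLRA is left out, and with ENA clear the sketch just leaves the accumulator alone because the real behaviour is unknown.

```c
#include <stdint.h>

/* Flag bits; numbering is an assumption local to this sketch. */
enum { ACC_ENA = 1 << 5, ACC_HIGH = 1 << 4, ACC_SIGN = 1 << 3,
       ACC_CLRA = 1 << 2, ACC_WBA = 1 << 1, ACC_SUB = 1 << 0 };

/* A few of the mnemonics from the table, as flag combinations. */
enum { UACC  = ACC_ENA | ACC_WBA,
       SDEC  = ACC_ENA | ACC_SIGN | ACC_WBA | ACC_SUB,
       UACCH = ACC_ENA | ACC_HIGH | ACC_WBA };

#define ACC48_MAX ((int64_t)0x7fffffffffffLL)
#define ACC48_MIN (-(int64_t)0x800000000000LL)

/* Clamp to the signed 48-bit range. */
static int64_t sat48(int64_t v)
{
    return v > ACC48_MAX ? ACC48_MAX : (v < ACC48_MIN ? ACC48_MIN : v);
}

/* Interpret a raw 32-bit value as unsigned or signed (portable sign extension). */
static int64_t extend(uint32_t d, int is_signed)
{
    int64_t v = (int64_t)d;
    if (is_signed && (d & 0x80000000u)) v -= 4294967296LL;
    return v;
}

/* Return the new accumulator value for one lane.
 * Assumption: HIGH means the value is applied at bit 16 and up. */
static int64_t acc_update(int64_t acc, uint32_t d, unsigned f)
{
    if (!(f & ACC_ENA)) return acc;          /* unknown case: left untouched */
    int64_t v = extend(d, (f & ACC_SIGN) != 0);
    if (f & ACC_HIGH) v *= 65536;            /* multiply, not shift, so negatives stay defined */
    int64_t r = sat48((f & ACC_SUB) ? acc - v : acc + v);
    return (f & ACC_WBA) ? r : acc;          /* without WBA the accumulator is unchanged */
}
```

For instance, acc_update(10, 5, UACC) gives 15 and acc_update(10, 5, SDEC) gives 5 under these assumptions.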

Add/Sub Accumulator to result value (without accumulator update):
===
(The key difference being the ACC[i] values are _not_ updated by the operation).

   UADDA  (ENA)                   D'[i] = ACC[i]+unsigned(D[i]).
   USUBA  (ENA|SUB)               D'[i] = ACC[i]-unsigned(D[i]).
   SADDA  (ENA|SIGN)              D'[i] = ACC[i]+signed(D[i]).
   SSUBA  (ENA|SIGN|SUB)          D'[i] = ACC[i]-signed(D[i]).
   UADDAH (ENA|HIGH)              D'[i] = ACCH[i]+unsigned(D[i]).
   USUBAH (ENA|HIGH|SUB)          D'[i] = ACCH[i]-unsigned(D[i]).
   SADDAH (ENA|HIGH|SIGN)         D'[i] = ACCH[i]+signed(D[i]).
   SSUBAH (ENA|HIGH|SIGN|SUB)     D'[i] = ACCH[i]-signed(D[i]).

These names are speculative (I don't have a solid reference point).
The parts of the names are:
  U, S -> Unsigned or signed
  ADD -> Add accumulator to result
  SUB  -> Sub result from accumulator
  H -> Use high 32 bits of accumulator.
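
A hedged C sketch of the non-updating forms, for one lane. The names here are invented, D[i] is assumed to be a raw 32 bit ALU result, and ACCH[i] is assumed to mean bits 47..16 of the accumulator read as a signed value. How the result is narrowed or saturated to the destination element width is not known, so the sketch returns the full-width sum.

```c
#include <stdint.h>

/* Flag bits; numbering is an assumption local to this sketch. */
enum { ACC_ENA = 1 << 5, ACC_HIGH = 1 << 4, ACC_SIGN = 1 << 3, ACC_SUB = 1 << 0 };

/* Assumed meaning of ACCH[i]: the accumulator with its low 16 bits dropped
 * (floor division, so negative values behave like an arithmetic shift). */
static int64_t acc_high(int64_t acc)
{
    int64_t q = acc / 65536;
    if (acc % 65536 < 0) q -= 1;
    return q;
}

/* Interpret a raw 32-bit value as unsigned or signed (portable sign extension). */
static int64_t extend(uint32_t d, int is_signed)
{
    int64_t v = (int64_t)d;
    if (is_signed && (d & 0x80000000u)) v -= 4294967296LL;
    return v;
}

/* Compute D'[i]; the accumulator is passed by value and never modified. */
static int64_t acc_result(int64_t acc, uint32_t d, unsigned f)
{
    int64_t v = extend(d, (f & ACC_SIGN) != 0);
    int64_t a = (f & ACC_HIGH) ? acc_high(acc) : acc;
    return (f & ACC_SUB) ? a - v : a + v;
}
```

With these assumptions, acc_result(100, 5, ACC_ENA) corresponds to UADDA and yields 105, while the caller's accumulator stays at 100.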

I'll try and post the equivalent C code to make the computations clearer.

The behaviour when the ENA bit is clear is still unknown.  From inspecting the
blob, it may relate to the SETF/IFxx flags (Phire's idea).
I'm still poking around.

Cheers
Herman

Follow-Ups:
- [raspi-internals] Re: Review: 80 bit vector instructions - scalar register update & accumulator
  - From: Herman Hermitage
