[raspi-internals] Re: 2op vs 3op ALU operations

  • From: Volker Barthelmann <vb@xxxxxxxxxxxx>
  • To: raspi-internals@xxxxxxxxxxxxx
  • Date: Mon, 20 May 2013 16:45:48 +0200

On 20.05.2013 15:32, David Given wrote:
Firstly, there are a number of instructions that contain a pc-relative
offset. (b, lea, ld, etc.) I've been testing my assembler against the
online disassembler and I notice that I need to add different modifiers
to the offset to make them come out right.

For example:

- the offset for simple b and bl seems to be relative to the *beginning*
of the instruction. (rd = pc + o)

- the offset for comparing b and addcmpb seems to be relative to the
*end* of the instruction. (rd = pc + o + 4)

At the moment I think I am using the same offset calculation (relative to the beginning of the instruction) for both cases, and the code seems to work.
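
To make the two conventions under discussion concrete, here is a minimal sketch in C. The names pcrel_base, pcrel_target and pcrel_offset are hypothetical and not taken from any assembler or disassembler mentioned here. It assumes the offset is already a signed byte count; any scaling the real encoding may apply is left out.

```c
#include <stdint.h>

/* Hypothetical: which address a pc-relative offset is measured from. */
enum pcrel_base {
    PCREL_FROM_START, /* target = insn_addr + offset            */
    PCREL_FROM_END    /* target = insn_addr + insn_len + offset */
};

/* Address that the offset is added to under the chosen convention. */
static uint32_t pcrel_origin(uint32_t insn_addr, uint32_t insn_len,
                             enum pcrel_base base)
{
    return base == PCREL_FROM_END ? insn_addr + insn_len : insn_addr;
}

/* Disassembler direction: resolve an encoded offset to a target address. */
static uint32_t pcrel_target(uint32_t insn_addr, uint32_t insn_len,
                             int32_t offset, enum pcrel_base base)
{
    return pcrel_origin(insn_addr, insn_len, base) + (uint32_t)offset;
}

/* Assembler direction: offset to encode so that the target is reached. */
static int32_t pcrel_offset(uint32_t insn_addr, uint32_t insn_len,
                            uint32_t target, enum pcrel_base base)
{
    return (int32_t)(target - pcrel_origin(insn_addr, insn_len, base));
}
```

For a 4-byte instruction the two conventions differ by exactly 4 in the resolved target, which matches the "+ 4" observed above.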

I've looked at how the disassembler calculates the target address but
TBH I can't make much of the code. Does anyone have any clarifications?
(Incidentally, as a feature request: it would be really convenient if
the disassembler resolved the target address in ld rd, o (pc) instructions.)

As I am using those a lot, too, I second that.

Secondly, in instructions of the form ld rd, (ra + rb); is rb scaled
according to the size of rd? (It would be nice if it wasn't, as I want
to use that instruction a lot for PIC code.)

According to my tests, rb seems to be scaled.
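
As an illustration of what scaling would mean for the effective address, here is a small C sketch. The function indexed_address and its parameters are made up for this example, and the scaled flag exists only so both readings can be compared. It is not a description of the hardware.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Effective address of ld rd, (ra + rb) under the two readings:
 *   scaled:   ra + rb * access_size   (rb is an element index)
 *   unscaled: ra + rb                 (rb is a byte offset)
 */
static uint32_t indexed_address(uint32_t ra, uint32_t rb,
                                size_t access_size, int scaled)
{
    return scaled ? ra + rb * (uint32_t)access_size : ra + rb;
}
```

If rb is scaled, as the tests suggest, a byte offset held in rb would first have to be converted to an element index before it could be used with a wider load.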

Thirdly, in order to generate fully PIC code I think I have to offset
all memory accesses via gp (r24). (Because there's no fixup stage when
loading kernels, and we have no control over where in the VC4's memory
the kernel gets loaded, the code must run from anywhere.) This seems
rather painful. Is there a better way?

Currently I use PC-relative addressing for all memory accesses. This works for most C code, apart from initializers that need an absolute address, such as

int x,*p=&x;

For those cases, I relocate the addresses after loading the code.
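
A minimal sketch of such a load-time fixup, in C. The table layout (a list of byte offsets into the image, each naming a 32-bit word that holds an image-relative address) and the name apply_relocations are assumptions made for illustration. The post does not say how the relocation is actually implemented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical fixup pass: for every entry in reloc_offsets, add the
 * actual load address to the 32-bit word stored at that offset in the
 * image. memcpy is used so unaligned entries are handled safely.
 */
static void apply_relocations(uint8_t *image, uint32_t load_base,
                              const uint32_t *reloc_offsets,
                              size_t reloc_count)
{
    for (size_t i = 0; i < reloc_count; i++) {
        uint32_t word;
        memcpy(&word, image + reloc_offsets[i], sizeof word);
        word += load_base; /* image-relative -> absolute address */
        memcpy(image + reloc_offsets[i], &word, sizeof word);
    }
}
```

In the example above, the storage for p would get one table entry, so that after loading it holds the real address of x.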

Volker
