[hellogcc] Re: [Submission] Translation of the PLT example in the x86 ABI

  • From: Yao Qi <qiyaoltc@xxxxxxxxx>
  • To: hellogcc@xxxxxxxxxxxxx
  • Date: Sun, 13 May 2012 18:39:18 +0800

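I added some material to your article. I wrote it on a train, so I did not add
it in the form of comments. At step 5 I put in two questions of mine: the first
one is worth answering; the second goes somewhat beyond the scope and can be
skipped in this article.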

If nobody else has objections, go ahead and post it to the blog.



First, to make this dry spec easier to follow, let's work through it with a
concrete example.

The example is simple:
#include <stdio.h>

int
main (void)
{
  printf ("hellogcc\n");

  return 0;
}

Later we will see some assembly and some addresses. To make sense of those
addresses, here are the address ranges of the sections first:
(gdb) maintenance info sections 
Exec file:
    `/home/yao/SourceCode/plt.exe', file type elf32-i386.
    0x80481d4->0x8048224 at 0x000001d4: .dynsym ALLOC LOAD READONLY DATA HAS_CONTENTS
    0x8048224->0x804826e at 0x00000224: .dynstr ALLOC LOAD READONLY DATA HAS_CONTENTS
    0x8048298->0x80482a0 at 0x00000298: .rel.dyn ALLOC LOAD READONLY DATA HAS_CONTENTS
    0x80482a0->0x80482b8 at 0x000002a0: .rel.plt ALLOC LOAD READONLY DATA HAS_CONTENTS
    0x80482b8->0x80482e8 at 0x000002b8: .init ALLOC LOAD READONLY CODE HAS_CONTENTS
    0x80482e8->0x8048328 at 0x000002e8: .plt ALLOC LOAD READONLY CODE HAS_CONTENTS
    0x8048330->0x804849c at 0x00000330: .text ALLOC LOAD READONLY CODE HAS_CONTENTS
    0x804849c->0x80484b8 at 0x0000049c: .fini ALLOC LOAD READONLY CODE HAS_CONTENTS
    0x80484b8->0x80484c9 at 0x000004b8: .rodata ALLOC LOAD READONLY DATA HAS_CONTENTS
    0x80484cc->0x80484d0 at 0x000004cc: .eh_frame ALLOC LOAD READONLY DATA HAS_CONTENTS
    0x8049f0c->0x8049f14 at 0x00000f0c: .ctors ALLOC LOAD DATA HAS_CONTENTS
    0x8049f14->0x8049f1c at 0x00000f14: .dtors ALLOC LOAD DATA HAS_CONTENTS
    0x8049f1c->0x8049f20 at 0x00000f1c: .jcr ALLOC LOAD DATA HAS_CONTENTS
    0x8049f20->0x8049ff0 at 0x00000f20: .dynamic ALLOC LOAD DATA HAS_CONTENTS
    0x8049ff0->0x8049ff4 at 0x00000ff0: .got ALLOC LOAD DATA HAS_CONTENTS
    0x8049ff4->0x804a00c at 0x00000ff4: .got.plt ALLOC LOAD DATA HAS_CONTENTS
    0x804a00c->0x804a014 at 0x0000100c: .data ALLOC LOAD DATA HAS_CONTENTS
    0x804a014->0x804a01c at 0x00001014: .bss ALLOC

With that in hand, let's see what the System V ABI says.

The original text is quoted from the SYSTEM V APPLICATION BINARY INTERFACE.

Figure 5-7: Position-Independent Procedure Linkage Table

.PLT0: pushl  4(%ebx)
            jmp    *8(%ebx)
            nop; nop
            nop; nop
.PLT1: jmp    *name1@GOT(%ebx)
           pushl  $offset
           jmp    .PLT0@PC
.PLT2: jmp    *name2@GOT(%ebx)
           pushl $offset
           jmp    .PLT0@PC

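That figure comes from the spec. Now let's look at what the .plt section of our
actual program contains. (Our executable is not position-independent, so its
entries use absolute addresses rather than %ebx-relative ones.)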

(gdb) disassemble 0x80482e8,0x8048328
Dump of assembler code from 0x80482e8 to 0x8048328:
   0x080482e8:  pushl  0x8049ff8
   0x080482ee:  jmp    *0x8049ffc
   0x080482f4:  add    %al,(%eax)
   0x080482f6:  add    %al,(%eax)
   0x080482f8 <__gmon_start__@plt+0>:   jmp    *0x804a000
   0x080482fe <__gmon_start__@plt+6>:   push   $0x0
   0x08048303 <__gmon_start__@plt+11>:  jmp    0x80482e8
   0x08048308 <__libc_start_main@plt+0>:        jmp    *0x804a004
   0x0804830e <__libc_start_main@plt+6>:        push   $0x8
   0x08048313 <__libc_start_main@plt+11>:       jmp    0x80482e8
   0x08048318 <puts@plt+0>:     jmp    *0x804a008
   0x0804831e <puts@plt+6>:     push   $0x10
   0x08048323 <puts@plt+11>:    jmp    0x80482e8

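As we can see, the PLT entry for puts is PLT3; entries 0, 1 and 2 are already
taken. PLT0 is the special entry reserved for the dynamic linker and is covered
below. PLT1 and PLT2 are the entries for __gmon_start__ and __libc_start_main,
as the labels in the disassembly show; they are not discussed further in this
article. The number of reserved entries may differ between architectures.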
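
The layout above is regular enough to compute. Here is a small sketch of the
arithmetic, assuming 16-byte PLT entries and 8-byte relocation records as seen
in this particular binary; the constants are copied from the gdb session and
the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the gdb session above; names are hypothetical. */
#define PLT_BASE       0x80482e8u /* start of .plt, i.e. PLT0 */
#define PLT_ENTRY_SIZE 16u        /* size of one PLT entry in this binary */
#define REL_ENTRY_SIZE 8u         /* size of one .rel.plt record */

/* Address of PLT entry n (n == 0 is the special PLT0). */
uint32_t plt_entry_addr(unsigned n)
{
    return PLT_BASE + n * PLT_ENTRY_SIZE;
}

/* Relocation offset that PLT entry n pushes (valid for n >= 1 only). */
uint32_t plt_pushed_offset(unsigned n)
{
    assert(n >= 1); /* PLT0 pushes a GOT word, not a relocation offset */
    return (n - 1) * REL_ENTRY_SIZE;
}
```

For puts (entry 3) this gives address 0x8048318 and pushed offset 0x10, which
matches the disassembly.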

Following the steps below, the dynamic linker and the program ‘‘cooperate’’ to
resolve symbolic references through the procedure linkage table and the global
offset table.

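In short: the dynamic linker and the program work together, following the steps
below, to resolve symbolic references that go through the PLT and the GOT.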


1 . When first creating the memory image of the program, the dynamic linker
sets the second and the third entries in the global offset table to special
values. Steps below explain more about these values.

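In short: when the dynamic linker first builds the program's memory image, it
sets the second and third GOT entries to special values, which the later steps
explain.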

2 . If the procedure linkage table is position-independent, the address of the
global offset table must reside in %ebx. Each shared object file in the
process image has its own procedure linkage table, and control transfers to a
procedure linkage table entry only from within the same object file.
Consequently, the calling function is responsible for setting the global offset
table base register before calling the procedure linkage table entry.

In short: a position-independent PLT needs the GOT address in %ebx. Every
shared object in the process image has its own PLT, and control reaches a PLT
entry only from within the same object, so the caller must set up the GOT base
register before calling through the PLT.

3 . For illustration, assume the program calls name1, which transfers control to
the label .PLT1.

For example, suppose the program calls name1; control then transfers to the
label .PLT1.

4 . The first instruction jumps to the address in the global offset table
entry for name1. Initially, the global offset table holds the address of the
following pushl instruction, not the real address of name1.


   0x08048318 <puts@plt+0>:     jmp    *0x804a008 // -> jmp *(_GLOBAL_OFFSET_TABLE_+20)
   0x0804831e <puts@plt+6>:     push   $0x10      // push relocation offset.

We can see that 0x804a008 falls within the range of .got.plt,
    0x8049ff4->0x804a00c at 0x00000ff4: .got.plt ALLOC LOAD DATA HAS_CONTENTS

(gdb) x/4x 0x804a008
0x804a008 <_GLOBAL_OFFSET_TABLE_+20>:   0x0804831e      0x00000000      0x00000000      0x00000000

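In short: the first instruction jumps to the address stored in the GOT entry
for name1. Initially that entry holds the address of the pushl instruction that
follows (here 0x0804831e), not the real address of name1.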
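
To double-check which slot 0x804a008 is, here is a small sketch that maps an
address to its slot index in .got.plt. The section bounds come from the
"maintenance info sections" output above; the 4-byte slot size is the i386 word
size, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bounds of .got.plt in this binary, copied from the section listing. */
#define GOT_PLT_START 0x8049ff4u /* also _GLOBAL_OFFSET_TABLE_ here */
#define GOT_PLT_END   0x804a00cu
#define GOT_SLOT_SIZE 4u         /* one 32-bit word per slot */

/* Slot index of addr within .got.plt, or -1 if it is outside the
 * section or not aligned to a slot boundary. */
int got_plt_slot(uint32_t addr)
{
    if (addr < GOT_PLT_START || addr >= GOT_PLT_END)
        return -1;
    if ((addr - GOT_PLT_START) % GOT_SLOT_SIZE != 0)
        return -1;
    return (int)((addr - GOT_PLT_START) / GOT_SLOT_SIZE);
}
```

With these bounds, 0x804a008 is slot 5 (byte offset 20, matching
_GLOBAL_OFFSET_TABLE_+20), and the two addresses used by PLT0, 0x8049ff8 and
0x8049ffc, are slots 1 and 2.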

5 . Consequently, the program pushes a relocation offset (offset) on the stack.
The relocation offset is a 32-bit, non-negative byte offset into the relocation
table. The designated relocation entry will have type R_386_JMP_SLOT,
and its offset will specify the global offset table entry used in the previous
jmp instruction. The relocation entry also contains a symbol table index,
thus telling the dynamic linker what symbol is being referenced, name1 in
this case.

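So the program pushes a relocation offset (offset) onto the stack (see the insn
at 0x0804831e: push $0x10). The relocation offset is a 32-bit, non-negative byte
offset into the relocation table. The relocation entry it designates has type
R_386_JMP_SLOT, and that entry's offset field identifies the GOT entry used by
the preceding jmp instruction.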

Relocation section '.rel.plt' at offset 0x2a0 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0804a000  00000107 R_386_JUMP_SLOT   00000000   __gmon_start__
0804a004  00000207 R_386_JUMP_SLOT   00000000   __libc_start_main
0804a008  00000307 R_386_JUMP_SLOT   00000000   puts
Here we can see an R_386_JUMP_SLOT reloc whose address is 0x804a008, which is
exactly the .got.plt entry for puts.

The relocation entry also contains a symbol table index, which tells the
dynamic linker which symbol is being referenced: name1 in this example.
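
The symbol index and the relocation type are both packed into the Info column
shown by readelf. A minimal sketch of the decoding, using the same split as the
ELF32_R_SYM and ELF32_R_TYPE macros (upper 24 bits and lower 8 bits); the
function names here are my own, not from any header.

```c
#include <assert.h>
#include <stdint.h>

#define R_386_JMP_SLOT 7u /* relocation type number for PLT slots on i386 */

/* Symbol table index: the upper 24 bits of r_info. */
uint32_t rel_sym(uint32_t r_info)
{
    return r_info >> 8;
}

/* Relocation type: the lower 8 bits of r_info. */
uint32_t rel_type(uint32_t r_info)
{
    return r_info & 0xffu;
}

/* Sanity check against the readelf line for puts (Info = 00000307). */
void check_puts_reloc(void)
{
    assert(rel_sym(0x00000307u) == 3);
    assert(rel_type(0x00000307u) == R_386_JMP_SLOT);
}
```

So Info 00000307 means "symbol 3, type 7", which readelf prints as
R_386_JUMP_SLOT against puts.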


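The offset here (0x10) is still not entirely clear to me. Is it the offset of
the puts record within the .rel.plt section?
The .rel.plt section is 0x18 bytes long (0x80482b8 - 0x80482a0) and holds three
records of the same type, 8 bytes each, so the third one would be at offset 0x10.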

    0x80482a0->0x80482b8 at 0x000002a0: .rel.plt ALLOC LOAD READONLY DATA HAS_CONTENTS

This is my guess.
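
The arithmetic behind that guess can be written down as a sketch. The struct
mirrors the two-word layout of Elf32_Rel but is redeclared here (under a
made-up name) so the snippet does not depend on <elf.h>; the section bounds are
the ones gdb printed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same two-word layout as Elf32_Rel; redeclared for portability. */
typedef struct {
    uint32_t r_offset; /* address of the GOT slot to patch */
    uint32_t r_info;   /* (symbol index << 8) | relocation type */
} rel32;

/* Number of records in a .rel.plt section spanning [start, end). */
size_t rel_plt_count(uint32_t start, uint32_t end)
{
    assert(end >= start);
    return (end - start) / sizeof(rel32);
}

/* Byte offset of the record with the given zero-based index. */
size_t rel_plt_byte_offset(size_t index)
{
    return index * sizeof(rel32);
}
```

With start 0x80482a0 and end 0x80482b8 this yields three records, and the third
one (index 2, puts) sits at byte offset 0x10, the value pushed by puts@plt.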

Second question: how does the dynamic linker handle R_386_JUMP_SLOT, and what
exactly does it do?

6 . After pushing the relocation offset, the program then jumps to .PLT0, the
first entry in the procedure linkage table. The pushl instruction places the
value of the second global offset table entry (got_plus_4 or 4(%ebx)) on the
stack, thus giving the dynamic linker one word of identifying information.
The program then jumps to the address in the third global offset table entry
(got_plus_8 or 8(%ebx)), which transfers control to the dynamic linker.

After pushing the relocation offset, the program jumps to .PLT0, the first
entry of the PLT.

   0x08048323 <puts@plt+11>:    jmp    0x80482e8  // jump to start of .plt section.

.PLT0:
   0x080482e8:  pushl  0x8049ff8
   0x080482ee:  jmp    *0x8049ffc

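The pushl instruction pushes the second entry of the global offset table
(got_plus_4 or 4(%ebx)) onto the stack, giving the dynamic linker one word of
identifying information. Strictly speaking, it should be the 2nd entry of
.got.plt (0x8049ff8 is the start of .got.plt plus 4):
   0x8049ff4->0x804a00c at 0x00000ff4: .got.plt ALLOC LOAD DATA HAS_CONTENTS

The program then jumps to the address held in the third entry of the global
offset table (got_plus_8 or 8(%ebx)), which transfers control to the dynamic
linker. Again, that is .got.plt, isn't it? 0x8049ffc is its third entry.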

(gdb) x/x 0x8049ffc
0x8049ffc <_GLOBAL_OFFSET_TABLE_+8>:    0x00123270
(gdb) disassemble 0x00123270,0x00123280
Dump of assembler code from 0x123270 to 0x123280:
   0x00123270 <_dl_runtime_resolve+0>:  push   %eax
   0x00123271 <_dl_runtime_resolve+1>:  push   %ecx
   0x00123272 <_dl_runtime_resolve+2>:  push   %edx
   0x00123273 <_dl_runtime_resolve+3>:  mov    0x10(%esp),%edx
   0x00123277 <_dl_runtime_resolve+7>:  mov    0xc(%esp),%eax
   0x0012327b <_dl_runtime_resolve+11>: call   0x11d5a0 <_dl_fixup>


As we can see, `jmp *0x8049ffc' jumps to _dl_runtime_resolve, the dynamic
linker's entry point for resolving a PLT call.
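
The two mov instructions at the top of _dl_runtime_resolve read 0x10(%esp) and
0xc(%esp). To see what those slots hold, here is a toy model of the stack that
replays the pushes made along the way. It is only a sketch: the stack type and
helper names are invented, and the register values saved by the three pushes
are placeholders.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_WORDS = 16 };

/* A tiny downward-growing stack of 32-bit words. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    unsigned sp; /* index of the current top-of-stack word */
} toy_stack;

void toy_push(toy_stack *s, uint32_t v)
{
    assert(s->sp > 0);
    s->mem[--s->sp] = v;
}

/* Word at byte offset off from the toy %esp, like off(%esp). */
uint32_t toy_peek(const toy_stack *s, unsigned off)
{
    assert(off % 4 == 0 && s->sp + off / 4 < STACK_WORDS);
    return s->mem[s->sp + off / 4];
}

/* Replays the pushes up to the two mov instructions in
 * _dl_runtime_resolve and returns the resulting stack. */
toy_stack stack_before_dl_fixup(uint32_t ret_addr, uint32_t reloc_off,
                                uint32_t got1)
{
    toy_stack s;
    memset(s.mem, 0, sizeof s.mem);
    s.sp = STACK_WORDS;
    toy_push(&s, ret_addr);  /* call to puts@plt made in main */
    toy_push(&s, reloc_off); /* push $0x10 in puts@plt */
    toy_push(&s, got1);      /* pushl 0x8049ff8 in PLT0 */
    toy_push(&s, 0);         /* push %eax (placeholder value) */
    toy_push(&s, 0);         /* push %ecx (placeholder value) */
    toy_push(&s, 0);         /* push %edx (placeholder value) */
    return s;
}
```

In this model 0x10(%esp) is the relocation offset and 0xc(%esp) is the word
that PLT0 pushed from the second .got.plt entry, so %edx and %eax end up
carrying exactly the two values that the PLT code pushed before _dl_fixup is
called.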

7 . When the dynamic linker receives control, it unwinds the stack, looks at the
designated relocation entry, finds the symbol’s value, stores the ‘‘real’’
address for name1 in its global offset table entry, and transfers control to the
desired destination.

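In short: once the dynamic linker gets control, it unwinds the stack, looks at
the designated relocation entry, finds the symbol's value, stores the "real"
address of name1 in its GOT entry, and transfers control to the desired
destination.
The dynamic resolver will therefore modify the contents of address 0x804a008,
which currently holds: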

0x804a008 <_GLOBAL_OFFSET_TABLE_+20>:   0x0804831e

Let's watch the dynamic linker make this change by setting a hardware
watchpoint on 0x804a008:
(gdb) watch *0x804a008
Hardware watchpoint 2: *0x804a008
(gdb) c
Continuing.
Hardware watchpoint 2: *0x804a008

Old value = 134513438
New value = 1616016
_dl_fixup (l=<value optimized out>, reloc_arg=<value optimized out>) at dl-runtime.c:155
155     dl-runtime.c: No such file or directory.
        in dl-runtime.c

We can see that the contents of address 0x804a008 changed from 134513438 to
1616016:
(gdb) p/x 134513438
$2 = 0x804831e
(gdb) p/x 1616016
$3 = 0x18a890

Let's see what lives at this new address (1616016, i.e. 0x18a890):

(gdb) disassemble 0x18a890,0x18a8a0
Dump of assembler code from 0x18a890 to 0x18a8a0:
   0x0018a890 <_IO_puts+0>:     push   %ebp
   0x0018a891 <_IO_puts+1>:     mov    %esp,%ebp
   0x0018a893 <_IO_puts+3>:     sub    $0x20,%esp
   0x0018a896 <_IO_puts+6>:     mov    %ebx,-0xc(%ebp)
   0x0018a899 <_IO_puts+9>:     mov    0x8(%ebp),%eax
   0x0018a89c <_IO_puts+12>:    call   0x143a0f <__i686.get_pc_thunk.bx>

Yay! The contents of address 0x804a008 have been replaced with the actual
address of the function inside glibc.

(gdb) bt
#0  _dl_fixup (l=<value optimized out>, reloc_arg=<value optimized out>) at dl-runtime.c:155
#1  0x00123280 in _dl_runtime_resolve () at ../sysdeps/i386/dl-trampoline.S:37
#2  0x080483f9 in main () at plt.c:6

8 . Subsequent executions of the procedure linkage table entry will transfer
directly to name1, without calling the dynamic linker a second time. That
is, the jmp instruction at .PLT1 will transfer to name1, instead of ‘‘falling
through’’ to the pushl instruction.

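In short: later executions of the PLT entry transfer directly to name1 without
calling the dynamic linker again. The jmp at .PLT1 goes straight to name1
instead of falling through to the pushl instruction.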
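
Steps 4 to 8 boil down to "a slot that points at a stub until the first call
patches it". The following is a toy model of that idea in plain C with a
function pointer; it is not how glibc implements it, and every name in it is
invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fn_t)(const char *);

static int resolver_calls; /* how many times the toy resolver ran */
static fn_t got_slot;      /* stands in for the .got.plt entry of puts */

/* Stand-in for the real function in libc: returns the string length. */
static int real_puts(const char *s)
{
    int n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Stand-in for "push offset; jmp .PLT0" followed by the resolver. */
static int lazy_stub(const char *s)
{
    resolver_calls++;
    got_slot = real_puts; /* store the "real" address in the slot */
    return got_slot(s);   /* then continue to the destination */
}

/* Stand-in for the PLT entry: always jumps through the slot. */
int call_via_plt(const char *s)
{
    if (got_slot == NULL)
        got_slot = lazy_stub; /* initial slot value points back at the stub */
    return got_slot(s);
}

int resolver_call_count(void)
{
    return resolver_calls;
}
```

Calling call_via_plt() twice runs the stub only once; the second call goes
through the patched slot directly, which is the behaviour step 8 describes.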



On 05/09/2012 12:30 PM, Mingjie Xing wrote:
> The original text is quoted from the SYSTEM V APPLICATION BINARY INTERFACE.
> 
> Figure 5-7: Position-Independent Procedure Linkage Table
> 
> .PLT0: pushl  4(%ebx)
>             jmp    *8(%ebx)
>             nop; nop
>             nop; nop
> .PLT1: jmp    *name1@GOT(%ebx)
>            pushl  $offset
>            jmp    .PLT0@PC
> .PLT2: jmp    *name2@GOT(%ebx)
>            pushl $offset
>            jmp    .PLT0@PC
> ...
> 
> Following the steps below, the dynamic linker and the program ‘‘cooperate’’ to
> resolve symbolic references through the procedure linkage table and the global
> offset table.
> 
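> In short: the dynamic linker and the program work together, following the
> steps below, to resolve symbolic references that go through the PLT and the
> GOT.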
> 
> 
> 1 . When first creating the memory image of the program, the dynamic linker
> sets the second and the third entries in the global offset table to special
> values. Steps below explain more about these values.
> 
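> In short: when the dynamic linker first builds the program's memory image, it
> sets the second and third GOT entries to special values, which the later
> steps explain.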
> 
> 2 . If the procedure linkage table is position-independent, the address of the
> global offset table must reside in %ebx. Each shared object file in the
> process image has its own procedure linkage table, and control transfers to a
> procedure linkage table entry only from within the same object file.
> Consequently, the calling function is responsible for setting the global
> offset table base register before calling the procedure linkage table entry.
> 
> In short: a position-independent PLT needs the GOT address in %ebx. Every
> shared object in the process image has its own PLT, and control reaches a PLT
> entry only from within the same object, so the caller must set up the GOT
> base register before calling through the PLT.
> 
> 3 . For illustration, assume the program calls name1, which transfers control
> to the label .PLT1.
> 
> For example, suppose the program calls name1; control then transfers to the
> label .PLT1.
> 
> 4 . The first instruction jumps to the address in the global offset table
> entry for name1. Initially, the global offset table holds the address of the
> following pushl instruction, not the real address of name1.
> 
> In short: the first instruction jumps to the address stored in the GOT entry
> for name1. Initially that entry holds the address of the pushl instruction
> that follows, not the real address of name1.
> 
> 5 . Consequently, the program pushes a relocation offset (offset) on the
> stack. The relocation offset is a 32-bit, non-negative byte offset into the
> relocation table. The designated relocation entry will have type
> R_386_JMP_SLOT, and its offset will specify the global offset table entry
> used in the previous jmp instruction. The relocation entry also contains a
> symbol table index, thus telling the dynamic linker what symbol is being
> referenced, name1 in this case.
> 
> In short: the program pushes a relocation offset (offset) onto the stack. It
> is a 32-bit, non-negative byte offset into the relocation table. The entry it
> designates has type R_386_JMP_SLOT, and that entry's offset field identifies
> the GOT entry used by the preceding jmp instruction. The entry also contains
> a symbol table index, which tells the dynamic linker which symbol is being
> referenced: name1 in this example.
> 
> 6 . After pushing the relocation offset, the program then jumps to .PLT0, the
> first entry in the procedure linkage table. The pushl instruction places the
> value of the second global offset table entry (got_plus_4 or 4(%ebx)) on the
> stack, thus giving the dynamic linker one word of identifying information.
> The program then jumps to the address in the third global offset table entry
> (got_plus_8 or 8(%ebx)), which transfers control to the dynamic linker.
> 
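> In short: after pushing the relocation offset, the program jumps to .PLT0,
> the first entry of the PLT. The pushl instruction pushes the second GOT entry
> (got_plus_4 or 4(%ebx)) onto the stack, giving the dynamic linker one word of
> identifying information. The program then jumps to the address held in the
> third GOT entry (got_plus_8 or 8(%ebx)), which transfers control to the
> dynamic linker.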
> 
> 7 . When the dynamic linker receives control, it unwinds the stack, looks at
> the designated relocation entry, finds the symbol’s value, stores the ‘‘real’’
> address for name1 in its global offset table entry, and transfers control to
> the desired destination.
> 
> In short: once the dynamic linker gets control, it unwinds the stack, looks at
> the designated relocation entry, finds the symbol's value, stores the "real"
> address of name1 in its GOT entry, and transfers control to the desired
> destination.
> 
> 8 . Subsequent executions of the procedure linkage table entry will transfer
> directly to name1, without calling the dynamic linker a second time. That
> is, the jmp instruction at .PLT1 will transfer to name1, instead of ‘‘falling
> through’’ to the pushl instruction.
> 
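> In short: later executions of the PLT entry transfer directly to name1
> without calling the dynamic linker again. The jmp at .PLT1 goes straight to
> name1 instead of falling through to the pushl instruction.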

