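NJU ICS Note: Program Linking and Loading
Section 5: Program Linking, Loading, and Execution
This note combines material from the CMU CSAPP course and NJU ICS.
Because CSAPP is taught in English and English is quicker to type, most of the note is written in English.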
ELF
Executable and Linkable Format
For the ELF tables, see the PA 2 part.
The Essence of Linking

Linking merges the sections of the different modules into one large table.
Memory Image of an Executable File


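Symbols and Symbol Resolution
Each relocatable object module m has a symbol table that records the symbols defined and referenced in m. There are three kinds of linker symbols.
- Global symbols
  - Symbols defined by module m that can be referenced by other modules.
  - Non-static function names and non-static global variable names.
- External symbols
  - Global symbols defined by other modules and referenced by module m
  - E.g. the function name swap in main.c
- Local symbols
  - Function names and variable names declared static that are defined and referenced by module m. Because their lifetime is the entire run of the program, they are not allocated on the stack but in the static data area, i.e. in the .data or .bss section.
  - E.g. the static variable name bufp1 in swap.c
  - Variables in .data have initial values; variables in .bss generally default to 0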
The Symbol Table in an ELF Object File
See [[ICS-PA2 note#阅读符号表,对照字符表|How to read the symbol table]] for reference.
What Do Linkers Do
Step 1: Symbol resolution
- Programs define and reference symbols (global variables and functions)
    void swap() {...}   /* define symbol swap */
    swap();             /* reference symbol swap */
    int *xp = &x;       /* define symbol xp, reference x */
- Symbol definitions are stored in the object file (by the assembler) in the symbol table
  - The symbol table is an array of structs
  - Each entry includes the name, size, and location of the symbol.
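To make "an array of structs" concrete, here is a minimal sketch in C. The struct sym_entry layout and the sym_lookup helper are hypothetical simplifications for illustration only; a real ELF file uses Elf32_Sym or Elf64_Sym entries, whose names are offsets into a string table rather than pointers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol-table entry (not the real Elf32_Sym). */
struct sym_entry {
    const char   *name;   /* symbol name (real ELF stores a string-table offset) */
    unsigned long value;  /* location: offset of the symbol inside its section */
    unsigned long size;   /* size of the object or function in bytes */
    int           global; /* 1 = visible to other modules, 0 = local (static) */
};

/* Linear search of the table; returns NULL when the name is not defined here. */
const struct sym_entry *sym_lookup(const struct sym_entry *tab, size_t n,
                                   const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}
```

Symbol resolution can be thought of as running a lookup like this for every reference, across the tables of all input modules.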
Step 2: Relocation
- Merges separate code and data sections into single sections
- Relocates symbols from their relative locations in the .o files to their final absolute memory locations in the executable
- Updates all references to these symbols to reflect their new positions
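The merge step can be sketched as simple address arithmetic. The code below is an illustration under stated assumptions, not how any particular linker is implemented: layout_sections and align_up are hypothetical names, and only one kind of section is laid out.

```c
#include <stddef.h>

/* Round addr up to a multiple of align (align must be a power of two). */
static unsigned long align_up(unsigned long addr, unsigned long align)
{
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Place n same-named sections (one per module) back to back, starting at base.
 * starts[i] receives the final address of module i's section.
 * Returns the first address after the merged section.
 */
unsigned long layout_sections(unsigned long base, const unsigned long *sizes,
                              size_t n, unsigned long align,
                              unsigned long *starts)
{
    unsigned long addr = base;
    for (size_t i = 0; i < n; i++) {
        addr = align_up(addr, align);  /* each piece starts on an aligned address */
        starts[i] = addr;
        addr += sizes[i];
    }
    return addr;
}
```

With an assumed base of 0x8048380, section sizes of 0x12 and 0x20 bytes, and 4-byte alignment, the second module's code starts at 0x8048394. Once these start addresses are known, every symbol's final address is its section start plus its offset within that section.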
Three Kinds of Object Files (Modules)
- Relocatable object file (.o file)
  - Contains code and data in a form that can be combined with other relocatable object files to form an executable object file
  - Each .o file is produced from exactly one source (.c) file
- Executable object file (a.out file)
  - Contains code and data in a form that can be copied directly into memory and then executed
- Shared object file (.so file)
  - Special type of relocatable object file that can be loaded into memory and linked dynamically, at either load time or run time
  - Called Dynamic Link Libraries (DLLs) by Windows
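The three kinds can be told apart from the first bytes of the file. Below is a minimal sketch, assuming the standard ELF header layout (magic in bytes 0 to 3, data encoding in byte 5, 16-bit e_type at offset 16); the function name elf_file_type is hypothetical.

```c
#include <stddef.h>

/*
 * Return the e_type field of an ELF header, or -1 if the bytes do not look
 * like ELF. Known values: 1 = ET_REL (relocatable), 2 = ET_EXEC (executable),
 * 3 = ET_DYN (shared object).
 */
int elf_file_type(const unsigned char *hdr, size_t len)
{
    if (len < 18)
        return -1;
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;                          /* missing ELF magic number */
    if (hdr[5] == 1)                        /* ELFDATA2LSB: little-endian */
        return hdr[16] | (hdr[17] << 8);
    if (hdr[5] == 2)                        /* ELFDATA2MSB: big-endian */
        return (hdr[16] << 8) | hdr[17];
    return -1;                              /* unknown byte order */
}
```

One caveat: position-independent executables are also marked ET_DYN, so a value of 3 does not always mean a .so library.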
Executable and Linkable Format (ELF)
- Standard binary format for object files

- ELF header
  - Word size, byte ordering, file type, machine type, etc.
- Segment header table
  - Page size, virtual addresses of memory segments (sections), segment sizes
- .text section
  - Code
  - Functions are stored here too.
- .rodata section
  - Read-only data: jump tables (for switch), some constants
- .data section
  - Initialized global variables
- .bss section
  - Uninitialized global variables or static variables
  - The name dates back to IBM in the 1960s: "Block Started by Symbol"
  - Mnemonic: "Better Save Space" (BSS)
  - Uninitialized global variables do not need to occupy space in the .o file, which saves space.
  - The core idea of .bss: record only the requirement, do not store the contents.
    - On disk (in the executable file): the .bss section takes almost no space. The file's header information (the section header table) only records the total size of .bss. For example, if you define static int arr[1000000] = {0};, the file on disk does not grow by 4 MB; the header merely notes "I need 4 MB of zeros".
    - In memory: when the program is loaded, the loader allocates memory of the recorded size according to the header and zero-fills all of it.
- .symtab section
  - Symbol table
  - Procedure and static variable names
  - Section names and locations
- .rel.text section
  - Relocation info for the .text section
  - Addresses of instructions that will need to be modified in the executable
- .rel.data section
  - Relocation info for the .data section
  - Addresses of pointer data that will need to be modified in the merged executable
- .debug section
  - Info for symbolic debugging (gcc -g)
  - Provides information that relates line numbers in the source code to instructions in the machine code.
  - That is why gdb can work at the source level.
- Section header table
  - Offset and size of each section
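The difference between .data and .bss described in the list above can be observed with a small C file. This is a sketch under assumptions: the names zeros, table, and the three helper functions are hypothetical, and the section placement in the comments is what GCC typically does on ELF targets, not a guarantee of the C language.

```c
#include <stddef.h>

#define N 1000000

static int zeros[N];                 /* no initializer: typically placed in .bss */
static int table[4] = {1, 2, 3, 4};  /* nonzero initializer: typically in .data */

/* Bytes the loader must provide for zeros[], although the file stores none of them. */
size_t bss_bytes(void)
{
    return sizeof zeros;
}

/* Count nonzero elements; C guarantees static storage starts out zeroed. */
size_t count_nonzero(void)
{
    size_t count = 0;
    for (size_t i = 0; i < N; i++)
        if (zeros[i] != 0)
            count++;
    return count;
}

/* Sum of the initialized table, whose four values are stored in the file. */
int table_sum(void)
{
    int sum = 0;
    for (size_t i = 0; i < 4; i++)
        sum += table[i];
    return sum;
}
```

Compiling this with gcc -c and running the size tool on the resulting object file should report a large bss figure and a tiny data figure; the exact numbers depend on the toolchain.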
Linker Symbols
- Global symbols
  - Symbols defined by module m that can be referenced by other modules.
  - Non-static C functions and non-static global variables
- External symbols
  - Global symbols that are referenced by module m but defined by some other module
- Local symbols
  - Symbols that are defined and referenced exclusively by module m
  - E.g.: C functions and global variables defined with the static attribute
  - Local linker symbols are not local program variables
    - Local program variables live on the user stack
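A single module can contain all three kinds of symbol, plus variables that are not linker symbols at all. The following sketch uses hypothetical names (shared_counter, hidden, helper, module_sum); strlen stands in for an external symbol because it is defined in the C library rather than in this file.

```c
#include <string.h>   /* declares strlen: an external symbol, defined in libc */

int shared_counter = 3;      /* global symbol: non-static global variable */
static int hidden = 4;       /* local symbol: static, invisible to other modules */

/* Local symbol: a static function can only be called from this file. */
static int helper(int v)
{
    return v + hidden;
}

/* Global symbol: a non-static function that other modules may reference. */
int module_sum(const char *s)
{
    int on_stack = (int)strlen(s);  /* not a linker symbol: lives on the stack */
    return helper(on_stack) + shared_counter;
}
```

Running nm on the compiled object file would typically list shared_counter and module_sum as global, hidden and helper as local, and strlen as undefined, while on_stack does not appear.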
Local Symbols
- Local non-static C variables vs. local static C variables
  - Local non-static C variables: stored on the stack
  - Local static C variables: stored in either .bss or .data
// Compiler allocates space in .data for each definition of x
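A short example of what "each definition of x" means: two functions can each declare their own static x, and the compiler keeps them apart. The functions f and g and their initial values are assumptions chosen for illustration.

```c
/* Each function owns a separate static x with its own storage in .data. */
int f(void)
{
    static int x = 17;   /* initialized once, keeps its value between calls */
    return x++;
}

int g(void)
{
    static int x = 42;   /* a different object, despite the identical name */
    return x++;
}
```

Because both variables need entries in the symbol table, the compiler gives them distinct local symbol names. GCC typically emits something like x.0 and x.1, but the exact naming is compiler-specific.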
How Does the Linker Resolve Duplicate Symbol Definitions?
- Program symbols are either strong or weak
  - Strong: procedures and initialized globals
  - Weak: uninitialized globals
  - Newer terminology:
    - Common: uninitialized globals that have not been allocated yet
      - The linker decides later how to link this common symbol.
    - Weak: symbols marked with the GCC extension attribute __attribute__((weak))
      - A weak definition is replaced at link time by a stronger definition of the same name.

- Linker's Symbol Rules
  - Rule 1: Multiple strong symbols are not allowed.
    - Each item can be defined only once
    - Otherwise: linker error
  - Rule 2: Given a strong symbol and multiple weak symbols, choose the strong symbol
    - References to the weak symbol resolve to the strong symbol
  - Rule 3: If there are multiple weak symbols, pick an arbitrary one
    - Can override this with gcc -fno-common
      - Or -Werror
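Rule 2 can be tried out with the GCC weak attribute. The sketch below assumes GCC or Clang on an ELF platform; log_level and current_level are hypothetical names.

```c
/* Weak default: used only if no other module supplies a strong log_level(). */
__attribute__((weak)) int log_level(void)
{
    return 1;
}

/* Calls whichever definition of log_level the linker selected. */
int current_level(void)
{
    return log_level();
}
```

Linked on its own, this module uses the weak default. If another object file on the command line defines a plain int log_level(void), that strong definition wins without any link error, and current_level returns its value instead.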
Relocated .text section
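What is relocation:
- Purpose of relocation: building on symbol resolution, merge all the related object modules and determine the run-time address of every defined symbol in the virtual address space. Then, at each reference to a defined symbol, relocate the referenced address.
In short, temporary addresses are patched into valid absolute addresses.
- Relocation types
  - R_386_PC32: the reference uses PC-relative addressing
    - As in the example in the figure above
  - R_386_32: the reference uses absolute addressing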
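The two relocation types differ only in the arithmetic used to fill in the reference. The sketch below follows the formulas given in CSAPP; the function names are hypothetical, and the addresses in the usage note are made-up values for illustration.

```c
#include <stdint.h>

/*
 * R_386_PC32: store the distance from the reference to the symbol.
 * ref_addr is the run-time address of the 4-byte field being patched;
 * addend is the value the assembler left in that field (usually -4).
 */
uint32_t reloc_pc32(uint32_t sym_addr, int32_t addend, uint32_t ref_addr)
{
    return sym_addr + (uint32_t)addend - ref_addr;
}

/* R_386_32: store the symbol's absolute run-time address (plus addend). */
uint32_t reloc_abs32(uint32_t sym_addr, int32_t addend)
{
    return sym_addr + (uint32_t)addend;
}
```

Assume swap ends up at 0x8048394 and the call's 4-byte operand sits at 0x8048387 with an addend of -4. Then reloc_pc32 yields 9: the next instruction begins at 0x804838b, and 0x804838b + 9 is exactly 0x8048394. For an absolute reference to a variable assumed at 0x8049620 with addend 0, reloc_abs32 simply yields 0x8049620.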
Using Static Libraries

- How the linker resolves external references
  - Scan the .o files and .a files in command-line order
  - During the scan, keep a list of the current unresolved references
  - As each new .o or .a (static library) file, obj, is encountered, try to resolve each unresolved reference in the list against the symbols defined in obj.
  - If any entries remain in the unresolved list at the end of the scan, report an error
- Command-line order matters!
  - It determines the order in which files are scanned
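The scan described above can be simulated in a few lines to see why order matters. This is a toy model under stated assumptions: struct input and scan_inputs are hypothetical, each archive is treated as having a single member, and symbols are plain strings.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16
#define MAX_PER_FILE 4

struct input {
    int is_archive;                   /* 0: .o file, 1: single-member .a file */
    const char *defs[MAX_PER_FILE];   /* symbols defined, NULL-terminated */
    const char *refs[MAX_PER_FILE];   /* symbols referenced, NULL-terminated */
};

/* Index of name in set, or -1 if absent. */
static int find(const char *const *set, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(set[i], name) == 0)
            return (int)i;
    return -1;
}

/* Scan inputs left to right; return how many references stay unresolved. */
size_t scan_inputs(const struct input *in, size_t n)
{
    const char *defined[MAX_SYMS];
    const char *unresolved[MAX_SYMS];
    size_t nd = 0, nu = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i].is_archive) {
            /* An archive member is pulled in only if it resolves something. */
            int needed = 0;
            for (size_t k = 0; k < MAX_PER_FILE && in[i].defs[k]; k++)
                if (find(unresolved, nu, in[i].defs[k]) >= 0)
                    needed = 1;
            if (!needed)
                continue;
        }
        for (size_t k = 0; k < MAX_PER_FILE && in[i].defs[k]; k++) {
            int u = find(unresolved, nu, in[i].defs[k]);
            if (u >= 0)
                unresolved[u] = unresolved[--nu];  /* now resolved: drop it */
            if (nd < MAX_SYMS)
                defined[nd++] = in[i].defs[k];
        }
        for (size_t k = 0; k < MAX_PER_FILE && in[i].refs[k]; k++) {
            const char *r = in[i].refs[k];
            if (find(defined, nd, r) < 0 && find(unresolved, nu, r) < 0
                && nu < MAX_SYMS)
                unresolved[nu++] = r;
        }
    }
    return nu;
}
```

Scanning an object that references swap followed by an archive that defines it leaves nothing unresolved. With the two inputs swapped, the archive is skipped because nothing needs it yet, and swap is still unresolved at the end, which corresponds to a link error.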
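How to Do Risky Linking and Compiling?
To study linking and compiling, you may need to link risky code on purpose, such as modules that define the same uninitialized global more than once. gcc may stop you by reporting a link error (since GCC 10, -fno-common is the default). You can pass the following option to allow it.
gcc <your .c> -fcommon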
Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit RedKold的小站 as the source when reposting!