Linking
Contents
Static Linking
Object Files
Static Libraries
Loading
Dynamic Linking of Shared Libraries (DLLs: dynamic
link libraries)
Linker Puzzles
int x;
p1() {}            p1() {}

int x;             int x;
p1() {}            p2() {}

int x;             double x;
int y;             p2() {}
p1() {}
A Simplistic Program Translation Scheme
Translator
Problems:
• Efficiency: small change requires complete recompilation
• Modularity: hard to share common functions (e.g. printf)
Solution:
• Static linker (or linker)
A Better Scheme Using a Linker
m.c, a.c
-> Translators (each file separately compiled)
m.o, a.o: relocatable object files
-> Linker (ld)
p: executable object file (contains code and data for all functions defined in m.c and a.c)
Translating the Example Program
Compiler driver coordinates all steps in the translation
and linking process.
– Typically included with each compilation system (e.g., gcc)
– Invokes preprocessor (cpp), compiler (cc1), assembler (as),
and linker (ld).
– Passes command line arguments to appropriate phases
Example: create executable p from m.c and a.c:
What Does a Linker Do?
Merges object files
– Merges multiple relocatable (.o) object files into a single executable
object file that can be loaded and executed by the loader.
Resolves external references
– As part of the merging process, resolves external references.
• External reference: reference to a symbol defined in another object
file.
Relocates symbols
– Relocates symbols from their relative locations in the .o files to
new absolute positions in the executable.
– Updates all references to these symbols to reflect their new
positions.
• References can be in either code or data
– code: a(); /* reference to symbol a */
– data: int *xp=&x; /* reference to symbol x */
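The two kinds of reference can be seen in a few lines of C. This is only a sketch: the symbols a, x and xp follow the slide, while the helper name call_through_refs is hypothetical. Everything is defined in one file here; when a or x lives in another object file, these are the references the linker has to resolve and relocate.

```c
#include <assert.h>

int x = 15;        /* definition of data symbol x */
int *xp = &x;      /* data reference: the initializer needs the address of x */

int a(void)        /* definition of code symbol a */
{
    return *xp;    /* reads x through the address stored in xp */
}

/* Hypothetical helper: exercises one code reference and one data reference. */
int call_through_refs(void)
{
    assert(xp == &x);   /* xp holds the final address of x */
    return a();         /* code reference: call to symbol a */
}
```

Calling call_through_refs() returns the value of x, which shows that both the call target and the pointer initializer ended up pointing at the final locations of a and x.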
Why Linkers?
Modularity
– Program can be written as a collection of smaller source
files, rather than one monolithic mass.
– Can build libraries of common functions (more on this later)
• e.g., Math library, standard C library
Efficiency
– Time:
• Change one source file, compile, and then relink.
• No need to recompile other source files.
– Space:
• Libraries of common functions can be aggregated into a single
file...
• Yet executable files and running memory images contain only
code for the functions they actually use.
Executable and Linkable Format (ELF)
Standard binary format for object files
Derives from AT&T System V Unix
– Later adopted by BSD Unix variants and Linux
One unified format for
– Relocatable object files (.o),
– Executable object files
– Shared object files (.so)
Generic name: ELF binaries
Better support for shared libraries than old a.out formats.
ELF Object File Format
ELF header
– Magic number, type (.o, exec, .so), machine, byte ordering, etc.
Program header table
– Page size, virtual addresses of memory segments (sections), segment sizes.
.text section
– Code
.data section
– Initialized (static) data
.bss section
– Uninitialized (static) data
– “Block Started by Symbol”
– “Better Save Space”
– Has section header but occupies no space

File layout, starting at offset 0:
ELF header
Program header table (required for executables)
.text section
.data section
.bss section
.symtab
.rel.text
.rel.data
.debug
Section header table (required for relocatables)
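The magic number in the ELF header is the four bytes 0x7f 'E' 'L' 'F' at the very start of the file. A minimal sketch of checking for it follows; the function name has_elf_magic is hypothetical, and the caller is assumed to have already read the first bytes of the file into a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf starts with the 4-byte ELF magic number, else 0. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    assert(buf != NULL);
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}
```

The same test applies to .o files, executables, and .so files, since all three share the one ELF format.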
ELF Object File Format (cont)
.symtab section
– Symbol table
– Procedure and static variable names
– Section names and locations
.rel.text section
Example C Program
m.c
int e=7;

int main() {
  int r = a();
  exit(0);
}

a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}
Merging Relocatable Object Files into
an Executable Object File
Relocatable Object Files -> Executable Object File

m.c
int e=7;              /* def of local symbol e */

int main() {
  int r = a();        /* ref to external symbol a */
  exit(0);            /* ref to external symbol exit (defined in libc.so) */
}

a.c
extern int e;

int *ep=&e;           /* def of local symbol ep; ref to external symbol e */
int x=15;             /* defs of local symbols x and y */
int y;

int a() {             /* def of local symbol a */
  return *ep+x+y;     /* refs of local symbols ep, x, y */
}
m.o Relocation Info
m.c
int e=7;
Disassembly of section .text:
[listing for main not recovered; each relocation entry gives the .text-relative offset of the position to be modified]
Disassembly of section .data:
00000000 <e>:
0: 07 00 00 00
source: objdump
a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}

Disassembly of section .text:
00000000 <a>:
   0: 55                   pushl %ebp
   1: 8b 15 00 00 00 00    movl 0x0,%edx
        3: R_386_32 ep
   7: a1 00 00 00 00       movl 0x0,%eax
        8: R_386_32 x
   c: 89 e5                movl %esp,%ebp
   e: 03 02                addl (%edx),%eax
  10: 89 ec                movl %ebp,%esp
  12: 03 05 00 00 00 00    addl 0x0,%eax
        14: R_386_32 y
  18: 5d                   popl %ebp
  19: c3                   ret

The final addresses are not yet known, so the operands that use absolute addressing mode hold 0x0 placeholders. Each relocation entry gives the .text-relative offset of the position to be modified.

Disassembly of section .data:
00000000 <ep>:
   0: 00 00 00 00
        0: R_386_32 e
00000004 <x>:
   4: 0f 00 00 00

Here each relocation entry gives the .data-relative offset of the position to be modified.
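An R_386_32 entry asks the linker to store the symbol's final absolute address, plus whatever addend is already in the field, into the four bytes at the given section-relative offset. The sketch below illustrates that patch step on a byte buffer. It is not how ld is implemented: the names reloc32 and apply_abs32 are hypothetical, and the symbol address is assumed to have been chosen already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One relocation entry: where to patch, and the final address of the symbol. */
struct reloc32 {
    size_t   offset;       /* section-relative offset of the 4-byte field */
    uint32_t symbol_addr;  /* final absolute address chosen by the linker */
};

/* Apply an R_386_32-style relocation: field = symbol address + addend,
   where the addend is the value already stored in the field.
   Bytes are read and written in little-endian order, as on IA-32. */
void apply_abs32(unsigned char *section, size_t size, struct reloc32 r)
{
    assert(r.offset + 4 <= size);
    unsigned char *p = section + r.offset;
    uint32_t addend = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
                      (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    uint32_t value = r.symbol_addr + addend;
    p[0] = (unsigned char)(value & 0xff);
    p[1] = (unsigned char)((value >> 8) & 0xff);
    p[2] = (unsigned char)((value >> 16) & 0xff);
    p[3] = (unsigned char)((value >> 24) & 0xff);
}
```

For example, patching offset 3 of a's code with 0x0804a01c as the address of ep turns the placeholder bytes 00 00 00 00 into 1c a0 04 08, which is the operand of the first movl in the relocated executable.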
Executable After Relocation and
External Reference Resolution (.text)
08048530 <main>:
8048530: 55 pushl %ebp
8048531: 89 e5 movl %esp,%ebp
8048533: e8 08 00 00 00 call 8048540 <a>
8048538: 6a 00 pushl $0x0
804853a: e8 35 ff ff ff call 8048474 <_init+0x94>
804853f: 90 nop
08048540 <a>:
8048540: 55 pushl %ebp
8048541: 8b 15 1c a0 04 movl 0x804a01c,%edx
8048546: 08
8048547: a1 20 a0 04 08 movl 0x804a020,%eax
804854c: 89 e5 movl %esp,%ebp
804854e: 03 02 addl (%edx),%eax
8048550: 89 ec movl %ebp,%esp
8048552: 03 05 d0 a3 04 addl 0x804a3d0,%eax
8048557: 08
8048558: 5d popl %ebp
8048559: c3 ret
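The two call instructions in main use PC-relative addressing: the four bytes after the e8 opcode hold the distance from the next instruction to the target. The arithmetic can be sketched as below; the function name call_rel32 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Displacement stored in a 5-byte IA-32 "call rel32" (opcode e8):
   target minus the address of the instruction that follows the call. */
int32_t call_rel32(uint32_t call_addr, uint32_t target)
{
    assert(call_addr <= UINT32_MAX - 5);
    uint32_t next = call_addr + 5;      /* 1 opcode byte + 4 displacement bytes */
    return (int32_t)(target - next);    /* wraps modulo 2^32, then read as signed */
}
```

For the call at 0x8048533 to a at 0x8048540 this gives 0x8048540 - 0x8048538 = 8, stored as 08 00 00 00. For the call at 0x804853a to 0x8048474 it gives 0x8048474 - 0x804853f = -0xcb, stored as 35 ff ff ff.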
Executable After Relocation and
External Reference Resolution (.data)
m.c
int e=7;
Disassembly of section .data:
int a() {
return *ep+x+y;
}
Strong and Weak Symbols
Program symbols are either strong or weak
– strong: procedures and initialized globals
– weak: uninitialized globals
p1.c: int foo=5;   (strong)
p2.c: int foo;     (weak)
Linker’s Symbol Rules
Rule 1. A strong symbol can only appear once.
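A toy resolver makes the rule concrete. This is a sketch under stated assumptions, not a description of ld: the names def and resolve are hypothetical, and beyond Rule 1 the code assumes the behavior traditionally associated with Unix linkers, namely that a strong definition overrides weak ones and that, among several weak definitions, one is picked arbitrarily (here, the first).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum strength { WEAK, STRONG };

struct def {
    const char   *name;      /* symbol name */
    enum strength strength;  /* strong: procedures, initialized globals */
    const char   *file;      /* object file that supplies the definition */
};

/* Chooses the definition of `name` among n candidates.
   Returns the index of the chosen definition,
   -1 if the symbol is undefined, or
   -2 if Rule 1 is violated (more than one strong definition). */
int resolve(const struct def *defs, size_t n, const char *name)
{
    int chosen = -1;
    assert(defs != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(defs[i].name, name) != 0)
            continue;
        if (chosen < 0) {
            chosen = (int)i;                   /* first definition seen */
        } else if (defs[i].strength == STRONG) {
            if (defs[chosen].strength == STRONG)
                return -2;                     /* Rule 1: duplicate strong symbol */
            chosen = (int)i;                   /* assumed: strong overrides weak */
        }
        /* assumed: among several weak definitions, keep the first one */
    }
    return chosen;
}
```

With the definitions of foo from p1.c (strong) and p2.c (weak), resolve picks the strong one; two strong definitions of the same procedure produce the error code, mirroring a link-time error.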
Linker Puzzles
int x;
p1() {}            p1() {}            Link time error: two strong symbols (p1)

int x;             double x;          Writes to x in p2 might overwrite y!
int y;             p2() {}            Evil!
p1() {}
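Why a write to x in p2 can overwrite y: p2 believes x is an 8-byte double, but only 4 bytes were reserved for it. The sketch below imitates that with a struct standing in for the linker's layout. It assumes that x and y end up in adjacent 4-byte slots and that double is 8 bytes; the names globals and double_store_clobbers_y are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the linker's layout: x and y in adjacent 4-byte slots (assumption). */
struct globals {
    int x;
    int y;
};

/* Mimics p2 storing a double at the address of x.
   Returns 1 if y was changed by the store, 0 otherwise. */
int double_store_clobbers_y(double value)
{
    struct globals g = { 0, 5 };
    assert(sizeof(double) == sizeof g);   /* 8-byte double over two 4-byte ints */
    memcpy(&g, &value, sizeof value);     /* the 8-byte store spills past x into y */
    return g.y != 5;
}
```

The linker reports nothing in this situation, because both files agree on the name x and neither definition violates Rule 1.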
Static Libraries (archives)
p1.c, p2.c
-> Translator (one per source file)
p1.o, p2.o
libc.a: static library (archive) of relocatable object files concatenated into one file.
-> Linker (ld)
p: executable object file (only contains code and data for libc functions that are called from p1.c and p2.c)

Archiver (ar):
ar rs libc.a \
    atoi.o printf.o … random.o
Commonly Used Libraries
libc.a (the C standard library)
– 8 MB archive of 900 object files.
– I/O, memory allocation, signal handling, string handling, date and
time, random numbers, integer math
libm.a (the C math library)
– 1 MB archive of 226 object files.
– floating point math (sin, cos, tan, log, exp, sqrt, …)
Sections of the executable object file: .data section, .bss section, .symtab, .rel.text, .rel.data, .debug, section header table (required for relocatables)
Segments of the process image and their start addresses:
0x08048494 .text segment (r/o)
0x0804a010 .data segment (initialized r/w)
0x0804a3b0 .bss segment (uninitialized r/w)
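The segment addresses can be cross-checked against the relocated listing: main (0x8048530) should fall in .text, ep (0x804a01c) in .data, and y (0x804a3d0) in .bss. A small sketch of that check follows. It assumes each segment runs up to the start of the next one; the end of .bss is not given, so everything from its start upward is reported as .bss. The names segment and segment_of are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Segment start addresses taken from the process image of program p. */
#define TEXT_START 0x08048494u
#define DATA_START 0x0804a010u
#define BSS_START  0x0804a3b0u

enum segment { SEG_BELOW_TEXT, SEG_TEXT, SEG_DATA, SEG_BSS };

/* Maps an address to the segment it falls in, assuming each segment
   extends up to the start of the next one. */
enum segment segment_of(uint32_t addr)
{
    assert(TEXT_START < DATA_START && DATA_START < BSS_START);
    if (addr < TEXT_START) return SEG_BELOW_TEXT;
    if (addr < DATA_START) return SEG_TEXT;
    if (addr < BSS_START)  return SEG_DATA;
    return SEG_BSS;
}
```

The result for y matches the source: y is an uninitialized global, so it belongs in .bss, while the initialized ep and x sit in .data.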
Linux run-time memory image
Kernel virtual memory (above 0xc0000000)
– Different for each process: process-specific data structures (e.g., page tables, task and mm structs, kernel stack)
– Identical for each process: physical memory, kernel code and data
0xc0000000
User stack (top of stack in %esp)
Shared Libraries
Static libraries have the following disadvantages:
– Potential for duplicating lots of common code in the executable
files on a file system.
• e.g., every C program needs the standard C library
– Potential for duplicating lots of code in the virtual memory space of
many processes.
– Minor bug fixes of system libraries require each application to
explicitly relink
Solution:
– Shared libraries (dynamic link libraries, DLLs) whose members are
dynamically loaded into memory and linked into an application at
run-time.
• Dynamic linking can occur when executable is first loaded and run.
– Common case for Linux, handled automatically by ld-linux.so.
• Dynamic linking can also occur after program has begun.
– In Linux, this is done explicitly by user with dlopen().
– Basis for High-Performance Web Servers.
• Shared library routines can be shared by multiple processes.
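Dynamic linking after the program has begun can be sketched with dlopen() and dlsym(). This is a minimal example under assumptions: the library name "libm.so.6" is what glibc-based Linux systems typically use and may differ elsewhere, and the wrapper name cos_via_dlopen is hypothetical.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Loads a shared library at run time, looks up cos, and calls it.
   Returns 0 on success and stores cos(arg) in *result; returns -1 on failure.
   The library name "libm.so.6" is an assumption (glibc on Linux). */
int cos_via_dlopen(double arg, double *result)
{
    assert(result != NULL);
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL)
        return -1;                       /* dlerror() describes the failure */

    /* dlsym returns the run-time address of the symbol, or NULL. */
    double (*cos_fn)(double) = (double (*)(double))dlsym(handle, "cos");
    if (cos_fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *result = cos_fn(arg);               /* call through the resolved address */
    dlclose(handle);
    return 0;
}
```

On older glibc versions the program must be linked with -ldl for dlopen() to be available; newer versions provide it in libc itself.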
Dynamically Linked Shared Libraries
m.c, a.c
-> Translators (cc1, as)
m.o, a.o
-> Linker (ld)
-> Loader/Dynamic Linker (ld-linux.so)
p’: fully linked executable (in memory)
libc.so functions called by m.c and a.c are loaded, linked, and (potentially) shared among processes.
The Complete Picture
m.c, a.c
-> Translator (one per source file)
p, libc.so, libm.so
-> Loader/Dynamic Linker (ld-linux.so)
p’