
Part 6

Code generation

Code generation 303


Structure of a compiler

character stream
  ↓ Lexical analysis
token stream
  ↓ Syntax analysis
syntax tree
  ↓ Semantic analysis
syntax tree
  ↓ Intermediate code generation
intermediate representation
  ↓ Intermediate code optimization
intermediate representation
  ↓ Code generation
machine code
  ↓ Code optimization
machine code

Code generation 304


Final code generation

At this point, we have optimized intermediate code, from which we
would like to generate the final code
By final code, we typically mean the assembly language of the target
machine
Goals of this stage:
- Choose the appropriate machine instructions to translate each
  intermediate representation instruction
- Handle finite machine resources (registers, memory, etc.)
- Implement low-level details of the run-time environment
- Implement machine-specific code optimizations
This step is very machine-specific
In this course, we will only mention some typical and general
problems

Code generation 305


Short tour on machine code

RISC (Reduced Instruction Set Computer)
- E.g.: PowerPC, Sparc, MIPS (embedded systems), ARM...
- Many registers, 3-address instructions, relatively simple instruction
  sets
CISC (Complex Instruction Set Computer)
- E.g.: x86, x86-64, amd64...
- Few registers, 2-address instructions, complex instruction sets
Stack-based computer:
- E.g.: not really used anymore, but Java's virtual machine is
  stack-based
- No registers, zero-address instructions (operands on the stack)
Accumulator-based computer:
- E.g.: the first IBM computers were accumulator-based
- One special register (the accumulator), one-address instructions,
  other registers used in loops and address specification

Code generation 306


Outline

1. Introduction

2. Instruction selection

3. Register allocation

4. Memory management

Code generation 307


Instruction selection

One needs to map one or several instructions of the intermediate
representation into one or several instructions of the machine
language
Complexity of the task depends on:
- the level of the IR
- the nature of the instruction-set architecture
- the desired quality of the generated code
Examples of problems:
- Conditional jumps
- Constants
- Complex instructions

Code generation 308


Example: Conditional jumps

Conditional jumps in our intermediate language are of the form:

  IF id relop Atom THEN labelid ELSE labelid

Conditional jumps might be different on some machines:
- One-way branches instead of two-way branches:

    IF c THEN lt ELSE lf   ⇒   branch_if_c lt
                               jump lf

- A condition such as id relop Atom might not be allowed. Then,
  compute the condition and store it in a register
- There might exist special registers for conditions
- ...

Code generation 309


Example: Constants

IR allows arbitrary constants as operands to binary or unary
operators
This is not always the case in machine code
- MIPS allows only 16-bit constants in operands (even though integers
  are 32 bits)
- On the ARM, a constant can only be an 8-bit number positioned at
  any even bit boundary (within a 32-bit word)
If a constant is too big, the translation requires building the constant
into some register
If the constant is used within a loop, its computation should be
moved outside

Code generation 310
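
To make this concrete, here is a minimal sketch, in Python as a stand-in for
the code generator's emitter, of how a too-large constant can be built into a
register on MIPS with a lui/ori pair. The emit callback and register naming
are illustrative assumptions, not part of the course material.

    # Sketch: materializing a 32-bit constant on MIPS, whose immediate
    # operands are limited to 16 bits. "emit" is an assumed callback that
    # appends one assembly instruction (a string) to the output.
    def load_constant(emit, reg, value):
        value &= 0xFFFFFFFF                 # treat the constant as 32-bit
        high, low = value >> 16, value & 0xFFFF
        if high == 0:
            emit(f"ori {reg}, $zero, {low}")      # fits: one instruction
        else:
            emit(f"lui {reg}, {high}")            # reg = high << 16
            if low != 0:
                emit(f"ori {reg}, {reg}, {low}")  # reg = reg | low

If such a sequence sits inside a loop, the slide's remark applies: the whole
sequence is loop-invariant and should be hoisted outside the loop.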


Exploiting complex instructions
If we do not care about efficiency, instruction selection is
straightforward:
- Write a code skeleton for every IR instruction
- Example in MIPS assembly:

    t2 := t1 + 116   ⇒   addi r2, r1, 116

  (where r2 and r1 are the registers chosen for t2 and t1)
Most processors (even RISC-based) have complex instructions that
can translate several IR instructions at once
- Example in MIPS assembly:

    t2 := t1 + 116
    t3 := M[t2]      ⇒   lw r3, 116(r1)

  (where r3 and r1 are the registers chosen for t3 and t1 resp., and
  assuming that t2 will not be used later)
For efficiency reasons, one should exploit them

Code generation 311


Code generation principle

Determine for each variable whether it is dead after a particular use
(liveness analysis, see later)

  t2 := t1 + 116
  t3 := M[t2^last]

Associate an address (register, memory location...) to each variable
(register allocation, see later)
Define an instruction set description, i.e., a list of pairs of:
- pattern: a sequence of IR instructions

    t := rs + k
    rt := M[t^last]

- replacement: a sequence of machine-code instructions translating the
  pattern

    lw rt, k(rs)

Use pattern matching to do the translation

Code generation 312


Illustration

Pattern/replacement pairs for a subset of the MIPS instruction set:

  t := rs + k
  rt := M[t^last]                        lw rt, k(rs)

  rt := M[rs]                            lw rt, 0(rs)
  rt := M[k]                             lw rt, k(R0)

  t := rs + k
  M[t^last] := rt                        sw rt, k(rs)

  M[rs] := rt                            sw rt, 0(rs)
  M[k] := rt                             sw rt, k(R0)
  rd := rs + rt                          add rd, rs, rt
  rd := rt                               add rd, R0, rt
  rd := rs + k                           addi rd, rs, k
  rd := k                                addi rd, R0, k
  GOTO label                             j label

  IF rs = rt THEN labelt ELSE labelf     beq rs, rt, labelt
  LABEL labelf                           labelf:

  IF rs = rt THEN labelt ELSE labelf     bne rs, rt, labelf
  LABEL labelt                           labelt:

MIPS instructions:
  lw r, k(s):    r = M[s + k]
  sw r, k(s):    M[s + k] = r
  add r, s, t:   r = s + t
  addi r, s, k:  r = s + k
where k is a constant and R0 is a register containing the constant 0

(Mogensen)

Code generation 313
Illustration

  M[rs] := rt                            sw rt, 0(rs)
  M[k] := rt                             sw rt, k(R0)
  rd := rs + rt                          add rd, rs, rt
  rd := rt                               add rd, R0, rt
  rd := rs + k                           addi rd, rs, k
  rd := k                                addi rd, R0, k
  GOTO label                             j label

  IF rs = rt THEN labelt ELSE labelf     beq rs, rt, labelt
  LABEL labelf                           labelf:

  IF rs = rt THEN labelt ELSE labelf     bne rs, rt, labelf
  LABEL labelt                           labelt:

  IF rs = rt THEN labelt ELSE labelf     beq rs, rt, labelt
                                         j labelf

  IF rs < rt THEN labelt ELSE labelf     slt rd, rs, rt
  LABEL labelf                           bne rd, R0, labelt
                                         labelf:

  IF rs < rt THEN labelt ELSE labelf     slt rd, rs, rt
  LABEL labelt                           beq rd, R0, labelf
                                         labelt:

  IF rs < rt THEN labelt ELSE labelf     slt rd, rs, rt
                                         bne rd, R0, labelt
                                         j labelf

  LABEL label                            label:

MIPS instructions:
  beq r, s, lab:  branch to lab if r = s
  bne r, s, lab:  branch to lab if r ≠ s
  slt r, s, t:    r = (s < t)
  j lab:          unconditional jump to lab

(Mogensen) Figure 8.1: Pattern/replacement pairs for a subset of the MIPS
instruction set

Code generation 314


Pattern matching

A pattern should be defined for every single IR instruction
(otherwise it would not be possible to translate some IR code)
A "last" in a pattern can only be matched by a "last" in the IR code
But any variable in a pattern can match a "last" in the IR code
If patterns overlap, there are potentially several translations for the
same IR code
One wants to find the best possible translation (e.g., the shortest or
the fastest)
Two approaches:
- Greedy: order the pairs so that longer patterns are listed before
  shorter ones and, at each step, use the first pattern that matches a
  prefix of the IR code
- Optimal: associate a cost to each replacement and find the
  translation that minimizes the total translation cost, e.g. using
  dynamic programming

Code generation 315
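
The greedy approach can be sketched in a few lines of Python. This is a toy
model, assuming IR instructions are given as a list and each pattern is a pair
(sequence of IR templates, translation function); the match helper, which
binds pattern variables to operands (respecting the rule on "last"), is left
abstract and passed in.

    # Toy sketch of greedy instruction selection: longer patterns first,
    # always take the first pattern matching a prefix of the remaining IR.
    def select_instructions(ir_code, patterns, match):
        asm = []
        i = 0
        while i < len(ir_code):
            for pattern, translate in patterns:      # sorted longest-first
                window = ir_code[i:i + len(pattern)]
                bindings = match(pattern, window)    # None if no match
                if bindings is not None:
                    asm.extend(translate(bindings))  # emit machine code
                    i += len(pattern)                # consume matched prefix
                    break
            else:
                raise ValueError("incomplete pattern set: cannot translate")
        return asm

Since every single IR instruction has a pattern, the inner loop always finds a
match and the translation terminates.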


Illustration

Using the greedy approach:

  IR code                                     MIPS code

  a := a + b^last                             add a, a, b
  d := c + 8                           ⇒      sw a, 8(c)
  M[d^last] := a
  IF a = c THEN label1 ELSE label2            beq a, c, label1
  LABEL label2                                label2:

Code generation 316


Outline

1. Introduction

2. Instruction selection

3. Register allocation

4. Memory management

Code generation 317


Register allocation

In the IR, we assumed an unlimited number of registers (to ease IR
code generation)
This is obviously not the case on a physical machine (typically, from
5 to 10 general-purpose registers)
Registers can be accessed quickly and operations can be performed
on them directly
Using registers intelligently is therefore a critical step in any
compiler (it can make a difference of orders of magnitude in speed)
Register allocation is the process of assigning variables to registers
and managing data transfer in and out of the registers

Code generation 318


Challenges in register allocation

Registers are scarce
- There are often substantially more IR variables than registers
- Need to find a way to reuse registers whenever possible
Register management is sometimes complicated
- Each register is made of several smaller registers (x86)
- There are specific registers which need to be used for some
  instructions (x86)
- Some registers are reserved for the assembler or operating system
  (MIPS)
- Some registers must be reserved to handle function calls (all)
Here, we assume only some number of indivisible, general-purpose
registers (MIPS-style)

Code generation 319


A direct solution
Idea: store every value in main memory, loading values only when
they are needed.
To generate code that performs some computation:
- Generate load instructions to retrieve the values from main memory
  into registers
- Generate code to perform the computation on the registers
- Generate store instructions to store the result back into main memory
Example (with a, b, c, d stored resp. at fp-8, fp-12, fp-16, fp-20):

  a := b + c        lw t0, -12(fp)
  d := a       ⇒    lw t1, -16(fp)
  c := a + d        add t2, t0, t1
                    sw t2, -8(fp)
                    lw t0, -8(fp)
                    sw t0, -20(fp)
                    lw t0, -8(fp)
                    lw t1, -20(fp)
                    add t2, t0, t1
                    sw t2, -16(fp)
Code generation 320
A direct solution

Advantage: very simple, the translation is straightforward, and we
never run out of registers
Disadvantage: very inefficient, wastes space and time
A better allocator should:
- try to reduce memory loads/stores
- reduce total memory usage
Need to answer two questions:
- Which register do we put variables in?
- What do we do when we run out of registers?

Code generation 321


Liveness analysis
A variable is live at some point in the program if its value may be
read later before it is written. It is dead if there is no way its value
can be used in the future.
Two variables can share a register if there is no point in the program
where they are both live
Liveness analysis is the process of determining the live or dead
statuses of all variables throughout the (IR) program
Informally, for an instruction I and a variable t:
- If t is used in I, then t is live at the start of I
- If t is assigned a value in I (and does not appear in the RHS of I),
  then t is dead at the start of I
- If t is live at the end of I and I does not assign a value to t, then t is
  live at the start of I
- t is live at the end of I if it is live at the start of any of the
  immediately succeeding instructions

Code generation 322


Liveness analysis: control-flow graph

First step: construct the control-flow graph
For each instruction numbered i, one defines succ[i] as follows:
- If instruction j comes just after i and i is neither a GOTO nor an
  IF-THEN-ELSE instruction, then j is in succ[i]
- If i is of the form GOTO l, the instruction with label l is in succ[i]
- If i is IF p THEN lt ELSE lf, the instructions with labels lt and lf are
  both in succ[i]
The third rule loosely assumes that both outcomes of the
IF-THEN-ELSE are possible, meaning that some variables will be
claimed live while they are dead (not really a problem)

Code generation 323


Liveness analysis: control-flow graph

Example (computation of Fibonacci(n) in a):

  1:  a := 0
  2:  b := 1
  3:  z := 0
  4:  LABEL loop
  5:  IF n = z THEN end ELSE body
  6:  LABEL body
  7:  t := a + b
  8:  a := b
  9:  b := t
  10: n := n - 1
  11: z := 0
  12: GOTO loop
  13: LABEL end

   i | succ[i] | gen[i] | kill[i]
   1 | 2       |        | a
   2 | 3       |        | b
   3 | 4       |        | z
   4 | 5       |        |
   5 | 6, 13   | n, z   |
   6 | 7       |        |
   7 | 8       | a, b   | t
   8 | 9       | b      | a
   9 | 10      | t      | b
  10 | 11      | n      | n
  11 | 12      |        | z
  12 | 4       |        |
  13 |         |        |

(Mogensen) Figure 9.3: succ, gen and kill for the program

Code generation 324
Liveness analysis: gen and kill

For each IR instruction, we define two functions:
- gen[i]: set of variables that may be read by instruction i
- kill[i]: set of variables that may be assigned a value by instruction i

  Instruction i                    gen[i]    kill[i]
  LABEL l                          ∅         ∅
  x := y                           {y}       {x}
  x := k                           ∅         {x}
  x := unop y                      {y}       {x}
  x := unop k                      ∅         {x}
  x := y binop z                   {y, z}    {x}
  x := y binop k                   {y}       {x}
  x := M[y]                        {y}       {x}
  x := M[k]                        ∅         {x}
  M[x] := y                        {x, y}    ∅
  M[k] := y                        {y}       ∅
  GOTO l                           ∅         ∅
  IF x relop y THEN lt ELSE lf     {x, y}    ∅
  x := CALL f(args)                args      {x}
Code generation 325
Liveness analysis: in and out

For each program instruction i, we use two sets to hold liveness
information:
- in[i]: the variables that are live before instruction i
- out[i]: the variables that are live at the end of i
in and out are defined by these two equations:

  in[i]  = gen[i] ∪ (out[i] \ kill[i])
  out[i] = ⋃_{j ∈ succ[i]} in[j]

These equations can be solved by fixed-point iteration:
- Initialize in[i] and out[i] to empty sets
- Iterate over instructions (in reverse order, evaluating out first) until
  convergence (i.e., no change)
For the last instruction (succ[i] = ∅), out[i] is the set of variables that
are live at the end of the program (i.e., used subsequently)
Code generation 326
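
The two equations translate almost directly into code. Below is a sketch in
Python, assuming gen, kill and succ are given as dictionaries indexed by
instruction number (1 to n), as in the tables of the previous slides.

    # Fixed-point iteration for liveness, a direct transcription of the
    # equations above. live_at_exit plays the role of out[i] for the last
    # instruction (variables still used after the program/procedure).
    def liveness(n, succ, gen, kill, live_at_exit=frozenset()):
        in_ = {i: set() for i in range(1, n + 1)}   # start with empty sets
        out = {i: set() for i in range(1, n + 1)}
        changed = True
        while changed:                              # until convergence
            changed = False
            for i in range(n, 0, -1):               # reverse order, out first
                new_out = set(live_at_exit) if not succ[i] else \
                    set().union(*(in_[j] for j in succ[i]))
                new_in = gen[i] | (new_out - kill[i])
                if new_out != out[i] or new_in != in_[i]:
                    out[i], in_[i] = new_out, new_in
                    changed = True
        return in_, out

Run on the Fibonacci program of the previous slides (with live_at_exit =
{a}), this converges in three iterations, reproducing the table shown next.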
Illustration

  1:  a := 0
  2:  b := 1
  3:  z := 0
  4:  LABEL loop
  5:  IF n = z THEN end ELSE body
  6:  LABEL body
  7:  t := a + b
  8:  a := b
  9:  b := t
  10: n := n - 1
  11: z := 0
  12: GOTO loop
  13: LABEL end

   i | succ[i] | gen[i] | kill[i]
   1 | 2       |        | a
   2 | 3       |        | b
   3 | 4       |        | z
   4 | 5       |        |
   5 | 6, 13   | n, z   |
   6 | 7       |        |
   7 | 8       | a, b   | t
   8 | 9       | b      | a
   9 | 10      | t      | b
  10 | 11      | n      | n
  11 | 12      |        | z
  12 | 4       |        |
  13 |         |        |

(Mogensen) Figure 9.2: Example program for liveness analysis and register
allocation; Figure 9.3: succ, gen and kill for the program in figure 9.2
Code generation 327
Illustration

        Iteration 1             Iteration 2             Iteration 3
   i    out[i]      in[i]       out[i]      in[i]       out[i]      in[i]
   1    n,a         n           n,a         n           n,a         n
   2    n,a,b       n,a         n,a,b       n,a         n,a,b       n,a
   3    n,z,a,b     n,a,b       n,z,a,b     n,a,b       n,z,a,b     n,a,b
   4    n,z,a,b     n,z,a,b     n,z,a,b     n,z,a,b     n,z,a,b     n,z,a,b
   5    a,b,n       n,z,a,b     a,b,n       n,z,a,b     a,b,n       n,z,a,b
   6    a,b,n       a,b,n       a,b,n       a,b,n       a,b,n       a,b,n
   7    b,t,n       a,b,n       b,t,n       a,b,n       b,t,n       a,b,n
   8    t,n         b,t,n       t,n,a       b,t,n       t,n,a       b,t,n
   9    n           t,n         n,a,b       t,n,a       n,a,b       t,n,a
  10                n           n,a,b       n,a,b       n,a,b       n,a,b
  11                            n,z,a,b     n,a,b       n,z,a,b     n,a,b
  12                            n,z,a,b     n,z,a,b     n,z,a,b     n,z,a,b
  13    a           a           a           a           a           a

(all in and out sets are initially empty)

(Mogensen) Figure 9.4: Fixed-point iteration for liveness analysis

Code generation 328


Interference

A variable x interferes with another variable y if there is an
instruction i such that x ∈ kill[i], y ∈ out[i], and instruction i is not
x := y
Note:
- This is different from requiring x ∈ out[i] and y ∈ out[i]:
- if x is in kill[i] and not in out[i] (because x is never used after an
  assignment), then it should still interfere with y ∈ out[i] (to allow
  side-effects)
Interference graph: undirected graph where each node is a variable
and two variables are connected if they interfere

Code generation 329
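
Given out and kill, the interference graph of the definition above can be
built with a triple loop. A sketch with an assumed encoding: copy_source[i]
is y when instruction i is the copy x := y, and absent otherwise.

    # Build interference edges: x interferes with y whenever x ∈ kill[i],
    # y ∈ out[i], y ≠ x, and instruction i is not the copy x := y.
    def interference_edges(n, kill, out, copy_source):
        edges = set()
        for i in range(1, n + 1):
            for x in kill[i]:
                for y in out[i]:
                    if y != x and y != copy_source.get(i):
                        edges.add(frozenset((x, y)))   # undirected edge
        return edges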


Illustration

We can use definition 9.2 to generate interference for each assignment
statement in the program in figure 9.2:

  Instruction | Left-hand side | Interferes with
  1           | a              | n
  2           | b              | n, a
  3           | z              | n, a, b
  7           | t              | b, n
  8           | a              | t, n
  9           | b              | n, a
  10          | n              | a, b
  11          | z              | n, a, b

[(Mogensen) Figure 9.5: interference graph for the program in figure 9.2 —
nodes a, b, n, t, z with the edges listed above]
Code generation 330
Register allocation

Global register allocation: we assign to a variable the same register
throughout the program (or procedure)
How to do it? Assign a register number (among N) to each node of
the interference graph such that
- Two nodes that are connected have different register numbers
- The total number of different registers is no higher than the number of
  available registers
This is a problem of graph colouring (where colour number =
register number), which is known to be NP-complete
Several heuristics have been proposed

Code generation 331


Chaitin's algorithm

A heuristic linear algorithm for k-colouring a graph
Algorithm:
- Select a node with fewer than k edges
- Remove it from the graph
- Recursively colour the rest of the graph
- Add the node back in
- Assign it a valid colour
The last step is always possible since the removed node has fewer
than k neighbours in the graph
Implementation: nodes are pushed on a stack as soon as they are
selected

Code generation 332
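
A compact Python sketch of the algorithm (no spilling yet; see the following
slides). The graph is assumed to be a dict mapping each variable to its set
of neighbours.

    # Chaitin-style k-colouring with an explicit stack of removed nodes.
    def colour(graph, k):
        graph = {v: set(ns) for v, ns in graph.items()}   # work on a copy
        stack = []
        while graph:
            # select a node with fewer than k neighbours
            node = next((v for v, ns in graph.items() if len(ns) < k), None)
            if node is None:
                raise RuntimeError("no such node: spilling needed")
            stack.append((node, graph.pop(node)))         # push and remove
            for ns in graph.values():
                ns.discard(node)
        colours = {}
        while stack:
            node, neighbours = stack.pop()                # add back, reversed
            used = {colours[v] for v in neighbours}
            colours[node] = min(c for c in range(k) if c not in used)
        return colours

The final assignment always succeeds: when a node is popped, its saved
neighbours (fewer than k of them) have already been coloured, so at least one
of the k colours is still free.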


Illustration

[Figure (Keith Schwarz): Chaitin's algorithm on an example graph with nodes
a-g — nodes with fewer than 4 neighbours are pushed onto a stack one by one,
then popped and assigned one of the registers R0-R3]
Code generation 333
Chaitin's algorithm

What if we cannot find a node with fewer than k neighbours?
Choose and remove an arbitrary node, marking it as troublesome
When adding the node back in, it may still be possible to find a valid
colour
Otherwise, we will have to store it in memory.
- This is called spilling.

Code generation 334


Illustration

[Figure (Keith Schwarz): Chaitin's algorithm with k = 3 on the same graph —
at some step no node has fewer than 3 neighbours, so node d is marked as
troublesome and spilled; the remaining nodes are coloured with R0-R2]
Code generation 335
Spilling

A spilled variable is stored in memory
When we need a register for a spilled variable v, temporarily evict a
register to memory (since registers are supposed to be exhausted)
When done with that register, write its value to the storage spot for
v (if necessary) and load the old value back
Heuristics to choose the variable/node to spill:
- Pick one with close to N neighbours (increasing the chance of being
  able to colour it)
- Choose a node with many neighbours that have close to N neighbours
  (increasing the chance of less spilling afterwards)
- Choose a variable that's not costly to spill (by looking at the program)

Code generation 336


Register allocation

We only scratched the surface of register allocation
Many heuristics exist, as well as different approaches (not using
graph colouring)
GCC uses a variant of Chaitin's algorithm

Code generation 337


Outline

1. Introduction

2. Instruction selection

3. Register allocation

4. Memory management

Code generation 338


Memory organization

Here is a map depicting the address space of an executing program:

  Stack
  Heap
  Global/static data
  Code

Memory is generally divided into four main parts:
- Code: contains the code of the program
- Static data: contains static data allocated at compile-time
- Stack: used for function calls and local variables
- Heap: for the rest (e.g., data allocated at run-time)
Computers have registers that contain addresses that delimit these
different parts

Code generation 339
Static data

Contains data allocated at compile-time (size and address known at
compile-time; the allocated memory stays allocated throughout the
execution of the program)
The address of such data is then hardwired in the generated code
Used e.g. in C to allocate global variables
There are facilities in assemblers to allocate such space:
- Example: allocating an array A of 4000 bytes

    .data       # go to data area for allocation
    baseofA:    # label for array A
    .space 4000 # move current-address pointer up 4000 bytes
    .text       # go back to text area for code generation

  The base address of the array A is at the label baseofA.
Limitations:
- the size of the data must be known at compile-time
- the space is never freed, even if the data is only used a fraction of
  the time

Code generation 340


Stack

[(Mogensen) Figure 10.13: activation record with static link — from top to
bottom: next activation records; space for storing local variables for spill
and for storing live variables allocated to caller-saves registers across
function calls; space for storing callee-saves registers that are used in
the body; incoming parameters in excess of four; return address; FP → static
link (SL); previous activation records]

Mainly used to store activation records for function calls
But can also be used to allocate arrays and other data structures
(e.g., in C, to allocate local arrays)
Allocation is quick and easy
But the sizes of arrays need to be known at compile-time, and the
stack can only be used for local variables (space is freed when the
function returns)

[(Mogensen) Figure 10.14: activation records for f and g from figure 10.11]
Code generation 341
Heap

Used for dynamic memory allocation
The size of arrays or structures need not be known at compile-time
Array sizes can be increased dynamically
Two ways to manage data allocation/deallocation:
- Manual memory management
- Automatic memory management (or garbage collection)

Code generation 342


Manual memory management

The user is responsible for both data allocation and deallocation
- In C: malloc and free
- In object-oriented languages: object constructors and destructors
Advantages:
- Easier to implement than garbage collection
- The programmer can exercise precise control over memory usage
  (allows better performance)
Limitations:
- The programmer has to exercise precise control over memory usage
  (tedious)
- Easily leads to troublesome bugs: memory leaks, double frees,
  use-after-frees...

Code generation 343


A simple implementation

Space is allocated by the operating system and then managed by the
program (through library functions such as malloc and free in C)
A free list is maintained with all current free memory blocks
(initially, one big block)
Malloc:
- Search through the free list for a block of sufficient size
- If found, it is possibly split in two, with one part removed from the
  free list
- If not found, ask the operating system for a new chunk of memory
Free:
- Insert the block back into the free list
Allocation is linear in the size of the free list; deallocation is done in
constant time

[(Mogensen) Figure: (a) an initial free list with blocks of sizes 12, 28
and 20; (b) after allocating 12 bytes, the 28-byte block has been split into
an allocated block of 12 and a free block of 16]

Code generation 344
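
The free-list scheme can be mimicked in a few lines. A toy sketch, tracking
addresses and sizes only; a real implementation stores block headers inside
the heap itself, and the request to the operating system is stubbed out.

    # First-fit free-list allocator: malloc is linear in the free-list
    # length, free is constant time (no coalescing of adjacent blocks).
    class FreeList:
        def __init__(self, base, size):
            self.blocks = [(base, size)]          # initially one big block

        def malloc(self, size):
            for i, (addr, blk_size) in enumerate(self.blocks):
                if blk_size >= size:              # first block large enough
                    if blk_size > size:           # split it in two
                        self.blocks[i] = (addr + size, blk_size - size)
                    else:
                        del self.blocks[i]
                    return addr
            return None  # a real allocator would ask the OS for more memory

        def free(self, addr, size):
            self.blocks.append((addr, size))      # just put it back

For instance, carving a 12-byte allocation out of a 28-byte block leaves a
16-byte block on the free list, as in the figure above.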


A simple implementation

Block splitting leads to memory fragmentation
- The free list will eventually accumulate many small blocks
- Can be solved by joining consecutive freed blocks
- This makes free linear in the free list size
The complexity of malloc can be reduced
- Limit block sizes to powers of 2 and keep a free list for each size
- This makes malloc logarithmic in the heap size
Array resizing can be allowed by using indirection nodes
- When an array is resized, it is copied into a new (bigger) block
- The address stored in the indirection node is updated accordingly

Code generation 345


Garbage collection

Allocation is still done with malloc or object constructors, but
memory is automatically reclaimed
- Data/objects that won't be used again are called garbage
- Reclaiming garbage objects automatically is called garbage collection
Advantages:
- The programmer does not have to worry about freeing unused resources
Limitations:
- The programmer can't reclaim unused resources
- Difficult to implement, and adds a significant overhead

Code generation 346


Implementation 1: reference counting

Idea: if no pointer to a block exists, the block can safely be freed
Add an extra field in each memory block (of the free list) with a
count of the incoming pointers
- When creating an object, set its counter to 0
- When creating a reference to an object, increment its counter
- When removing a reference, decrement its counter
- If it reaches zero, remove all outgoing references from that object and
  reclaim the memory

Code generation 347
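
The counter discipline can be made explicit with a small sketch (hand-rolled
objects in Python; this ignores Python's own memory management and is only
meant to mirror the bullet points above):

    # Reference counting: every store into a field goes through assign(),
    # which keeps the counters up to date and reclaims dead objects.
    class Obj:
        def __init__(self):
            self.refcount = 0          # set to 0 on creation, as above
            self.fields = {}           # outgoing references

    def inc_ref(obj):
        if obj is not None:
            obj.refcount += 1

    def dec_ref(obj):
        if obj is None:
            return
        obj.refcount -= 1
        if obj.refcount == 0:          # dead: drop outgoing refs, reclaim
            for child in list(obj.fields.values()):
                dec_ref(child)
            obj.fields.clear()         # here the block would be freed

    def assign(obj, field, new):
        inc_ref(new)                   # increment first: new may equal old
        dec_ref(obj.fields.get(field))
        obj.fields[field] = new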


Reference counting: illustration

  class LinkedList {
      LinkedList next;
  }

  int main() {
      LinkedList head = new LinkedList;
      LinkedList mid = new LinkedList;
      LinkedList tail = new LinkedList;
      head.next = mid;
      mid.next = tail;
      mid = null;
      tail = null;
      head.next.next = null;
      head = null;
  }

[Figure: object diagram showing head, mid and tail with their reference
counts as the assignments are executed]

(Keith Schwarz)

Code generation 348


Reference counting

Straightforward to implement and can be combined with manual
memory management
Significant overhead when doing assignments, for incrementing the
counters
Imposes constraints on the language
- No pointers to the middle of an object, one should be able to
  distinguish pointers from integers...
Cannot handle circular data structures
- as their counters will never reach zero
- E.g., doubly-linked lists

Code generation 349


Implementation 2: tracing garbage collectors
Idea: find all reachable blocks from the knowledge of what is
immediately accessible (the root set) and free all other blocks
The root set is the set of memory locations that are known to be
reachable
- all variables in the program: registers, stack-allocated variables,
  global variables...
Objects reachable from the root set are reachable; objects that are
not are garbage

[Figure (Keith Schwarz): "Mark-and-Sweep in Action" — objects reachable from
the root set]
Code generation 350


Tracing garbage collection: mark-and-sweep
Mark-and-sweep garbage collection:
- Add a flag to each block
- Marking phase: go through the graph, e.g., depth-first, setting the
  flag for all reached blocks
- Sweeping phase: go through the list of blocks and free all unflagged
  ones
Implementation of the mark stage with a stack:
- Initialized to the root set
- Contains the reachable blocks that have not yet been visited
Tracing GC is typically called only when a malloc fails, to avoid
pauses in the program
Problem: the stack requires memory (and a malloc has just failed)
- The marking phase can be implemented without a stack (at the
  expense of computing time)
- Typically by adding descriptors within blocks

Code generation 351
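
Both phases fit in a short sketch. Assumed representation: the heap is a list
of blocks, each with a marked flag and a list of outgoing references, and
free is a callback that returns a block to the free list.

    # Mark-and-sweep. The explicit work list plays the role of the stack
    # discussed above, initialized to the root set.
    def mark_and_sweep(heap, roots, free):
        # Marking phase: depth-first traversal from the root set
        work = list(roots)
        while work:
            block = work.pop()
            if not block.marked:
                block.marked = True
                work.extend(block.references)   # visit reachable blocks
        # Sweeping phase: free every unflagged block, reset flags
        for block in heap:
            if block.marked:
                block.marked = False            # clear for the next collection
            else:
                free(block)                     # return block to the free list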


Implementation: tracing garbage collection

Advantages:
- More precise than reference counting
- Can handle circular references
- Run time can be made proportional to the number of reachable
  objects (typically much lower than the number of free blocks)
Disadvantages:
- Introduces long pauses
- Consumes lots of memory

Code generation 352


Garbage collection

Other garbage collection methods:
Two-space collection (stop-and-copying):
- An alternative to free lists
- Two allocation spaces of the same size are maintained
- Blocks are always allocated in one space until it is full
- Garbage collection then copies all live objects to the other space
  and swaps the roles of the two spaces
Generational collection:
- Maintain several spaces for different generations of objects, with
  these spaces of increasing sizes
- Optimized according to the "objects die young" principle
Concurrent and incremental collectors:
- Perform collection incrementally or concurrently during execution of
  the program
- Avoid long pauses but can reduce the total throughput

Code generation 353


Part 7

Conclusion

Conclusion 354
Structure of a compiler

character stream
  ↓ Lexical analysis
token stream
  ↓ Syntax analysis
syntax tree
  ↓ Semantic analysis
syntax tree
  ↓ Intermediate code generation
intermediate representation
  ↓ Intermediate code optimization
intermediate representation
  ↓ Code generation
machine code
  ↓ Code optimization
machine code

Conclusion 355
Summary
Part 1, Introduction:
- Overview and motivation...
Part 2, Lexical analysis:
- Regular expressions, finite automata, implementation, Flex...
Part 3, Syntax analysis:
- Context-free grammars, top-down (predictive) parsing, bottom-up
  parsing (SLR and operator precedence parsing)...
Part 4, Semantic analysis:
- Syntax-directed translation, abstract syntax trees, type and scope
  checking...
Part 5, Intermediate code generation and optimization:
- Intermediate representations, IR code generation, optimization...
Part 6, Code generation:
- Instruction selection, register allocation, liveness analysis, memory
  management...

Conclusion 356
More on compilers

Our treatment of each compiler stage was superficial
- See the reference books for more details (Transp. 4)
Some things we have not discussed at all:
- Specificities of object-oriented or functional programming languages
- Machine-dependent code optimization
- Parallelism
- ...
Related topics:
- Natural language processing
- Domain-specific languages
- ...

Conclusion 357
