
Code Generation I

The document discusses several key issues in code generation: the input and target formats, memory management, instruction selection (choosing suitable target instructions and addressing modes), register allocation and assignment to map variables to registers efficiently, and choosing an evaluation order that minimizes intermediate values and register usage. It provides examples that illustrate code generation techniques for a hypothetical machine with registers and several addressing modes.

Uploaded by

Akshay S

Module VI

Code Generation

Code generation
• The final phase in the compiler model is the code
generator.
• It takes as input an intermediate
representation of the source program and
produces as output an equivalent target
program.
• The code generation techniques presented below
can be used whether or not an optimizing phase
occurs before code generation.

Position of a Code Generator in the
Compiler Model

ISSUES IN THE DESIGN OF A CODE
GENERATOR (15 marks, UQ CUSAT
April 2017)
• The following issues arise during the
code generation phase:
• 1.   Input to code generator
• 2.   Target program
• 3.   Memory management
• 4.   Instruction selection
• 5.   Register allocation
• 6.   Evaluation order
1. Input to code generator:
• The input to the code generator consists
of the intermediate representation of
the source program produced by the front
end, together with information in the
symbol table that is used to determine the
run-time addresses of the data objects denoted by
the names in the intermediate
representation.
• Intermediate representation can be:
• a. Linear representation such as postfix
notation
• b. Three address representation such as
quadruples
• c. Virtual machine representation such
as stack machine code
• d. Graphical representations such as
syntax trees and DAGs.
2. Target program:
• The output of the code generator is the target
program. The output may be:
• a). Absolute machine language
• - It can be placed in a fixed memory location and can
be executed immediately.
• b). Relocatable machine language
• - It allows subprograms to be compiled separately.
• c). Assembly language
• - It makes code generation easier.

3. Memory management:
• Names in the source program are mapped to
addresses of data objects in run-time memory
by the front end and code generator.
• This mapping makes use of the symbol table; that is, a name
in a three-address statement refers to a
symbol-table entry for the name.
• Labels in three-address statements have to
be converted to addresses of instructions.

• The quadruple j: goto i generates a jump instruction as follows:
• * if i < j, a backward jump instruction with target
address equal to the location of the code for quadruple i
is generated.
• * if i > j, the jump is forward. We must store on
a list for quadruple i the location of the first
machine instruction generated for quadruple j.
When i is processed, the machine locations for all
instructions that jump forward to i are filled in.
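
The forward-jump bookkeeping described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `resolve_jumps`, the tuple encoding of quadruples, and the `sizes` list (the number of machine words emitted per quadruple) are hypothetical and do not come from the slides.

```python
def resolve_jumps(quads, sizes):
    """Return a list of (address, instruction) with jump targets resolved.

    quads: quads[j] is either ("goto", i) or ("op", text).
    sizes: sizes[j] is the number of machine words emitted for quadruple j
           (assumed known here so that addresses can be computed).
    """
    first_addr = {}   # quadruple index -> address of its first instruction
    pending = {}      # target quadruple i -> positions of jumps waiting for i
    code = []
    addr = 0
    for j, quad in enumerate(quads):
        first_addr[j] = addr
        # Quadruple j is being processed: fill in all forward jumps to it.
        for pos in pending.pop(j, []):
            code[pos] = (code[pos][0], f"JMP {addr}")
        if quad[0] == "goto":
            i = quad[1]
            if i <= j:
                # Backward jump: the target address is already known.
                code.append((addr, f"JMP {first_addr[i]}"))
            else:
                # Forward jump: emit a placeholder and remember its position.
                pending.setdefault(i, []).append(len(code))
                code.append((addr, "JMP ?"))
        else:
            code.append((addr, quad[1]))
        addr += sizes[j]
    return code


quads = [("op", "MOV a,R0"), ("goto", 3), ("op", "ADD b,R0"),
         ("op", "MOV R0,c"), ("goto", 0)]
resolved = resolve_jumps(quads, [2, 2, 2, 2, 2])
```

In this sketch the jump in quadruple 1 is first emitted as a placeholder and is patched to `JMP 6` once quadruple 3 is reached, while the jump in quadruple 4 is a backward jump and is emitted directly as `JMP 0`.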
4. Instruction selection:
• The instruction set of the target machine
should be complete and uniform.
• Instruction speeds and machine idioms
are important factors when the efficiency of
the target program is considered.
• The quality of the generated code is
determined by its speed and size.

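Example
• The statements
a:=b+c
d:=a+e
are translated statement by statement into:
MOV b,R0
ADD c,R0
MOV R0,a
MOV a,R0
ADD e,R0
MOV R0,d
• The fourth instruction, MOV a,R0, is redundant because R0 already holds the value of a.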

5. Register allocation
• Instructions involving register operands are
shorter and faster than those involving
operands in memory. The use of registers is
subdivided into two subproblems :
• 1. Register allocation - the set of variables that
will reside in registers at a point in the
program is selected.
• 2. Register assignment - the specific register
that a variable will reside in is picked.
• Certain machines require even-odd register
pairs for some operands and results.
• For example, consider the division
instruction of the form: Div x, y
where x is the even register of the even/odd
register pair that holds the dividend, and y is the divisor.
• After the division, the even register holds the remainder
• and the odd register holds the quotient.

6. Evaluation order
• The order in which the computations
are performed can affect the
efficiency of the target code.
• Some computation orders require
fewer registers to hold intermediate
results than others.

Target Program Code
• The back-end code generator of a
compiler may generate different forms of
code, depending on the requirements:
– Absolute machine code (executable code)
– Relocatable machine code (object files for
linker)
– Assembly language (facilitates debugging)
– Byte code forms for interpreters (e.g. JVM)
The Target Machine
• Implementing code generation requires
thorough understanding of the target machine
architecture and its instruction set
• Our (hypothetical) machine:
– Byte-addressable (word = 4 bytes)
– Has n general purpose registers R0,
R1, …, Rn-1
– Two-address instructions of the form
op source, destination
The Target Machine: Op-codes and
Address Modes
• Op-codes (op), for example
MOV (move content of source to destination)
ADD (add content of source to destination)
SUB (subtract content of source from dest.)
• Address modes
Mode               Form    Address                    Added Cost
Absolute           M       M                          1
Register           R       R                          0
Indexed            c(R)    c+contents(R)              1
Indirect register  *R      contents(R)                0
Indirect indexed   *c(R)   contents(c+contents(R))    1
Instruction Costs
• Define the cost of instruction
= 1 + cost(source-mode) + cost(destination-
mode)
• Eg: The instruction MOV R0,R1 copies the contents
of register R0 into register R1.
• This instruction has cost 1, since it occupies only one
word of memory.
• The instruction MOV R5, M , copies the contents of
register R5 into memory location M. This instruction
has cost 2, since the address of memory location M is
in the word following the instruction.
Examples

Instruction         Operation                                                            Cost
MOV R0,R1           Store contents(R0) into register R1                                  1
MOV R0,M            Store contents(R0) into memory location M                            2
MOV M,R0            Store contents(M) into register R0                                   2
MOV 4(R0),M         Store contents(4+contents(R0)) into M                                3
MOV *4(R0),M        Store contents(contents(4+contents(R0))) into M                      3
MOV #1,R0           Store 1 into R0                                                      2
ADD 4(R0),*12(R1)   Add contents(4+contents(R0)) to the value at location
                    contents(12+contents(R1))                                            3
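
The cost rule can be checked mechanically. The following Python sketch computes the cost of an instruction from the address-mode table; the function names `mode_cost` and `instruction_cost` are illustrative, and the added cost of 1 for a literal operand (`#c`) is an assumption inferred from the example `MOV #1,R0` having cost 2, since the literal mode is not listed in the address-mode table.

```python
import re


def mode_cost(operand):
    """Added cost of one operand, following the address-mode table."""
    operand = operand.strip()
    if operand.startswith("#"):
        return 1   # literal (assumed): the constant occupies the next word
    if re.fullmatch(r"\*?R\d+", operand):
        return 0   # register (R) or indirect register (*R)
    if re.fullmatch(r"\*?-?\d+\(R\d+\)", operand):
        return 1   # indexed c(R) or indirect indexed *c(R)
    return 1       # anything else is treated as an absolute address M


def instruction_cost(instruction):
    """Cost = 1 + cost(source mode) + cost(destination mode)."""
    _opcode, _, operands = instruction.strip().partition(" ")
    return 1 + sum(mode_cost(op) for op in operands.split(","))


for text in ["MOV R0,R1", "MOV R5,M", "MOV *4(R0),M", "ADD 4(R0),*12(R1)"]:
    print(text, instruction_cost(text))
```

The same function also handles one-operand instructions, so `INC a` comes out at cost 2 under these assumptions.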

Instruction Selection
• Instruction selection is important to obtain efficient
code
• Suppose we translate three-address code
x:=y+z
to: MOV y,R0
ADD z,R0
MOV R0,x
• Then a:=a+1 is translated to:
MOV a,R0
ADD #1,R0
MOV R0,a
Cost = 6
• Better: ADD #1,a (Cost = 3)
• Better: INC a (Cost = 2)
Instruction Selection: Utilizing
Addressing Modes
• Suppose we translate a:=b+c into
MOV b,R0
ADD c,R0
MOV R0,a
• Assuming addresses of a, b, and c are stored in R0,
R1, and R2
MOV *R1,*R0
ADD *R2,*R0
• Assuming R1 and R2 contain values of b and c
ADD R2,R1
MOV R1,a

Need for Global Machine-Specific
Code Optimizations
• Suppose we translate three-address code
x:=y+z
to: MOV y,R0
ADD z,R0
MOV R0,x
• Then, we translate
a:=b+c
d:=a+e
to: MOV b,R0
ADD c,R0
MOV R0,a
MOV a,R0 (redundant)
ADD e,R0
MOV R0,d
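
One simple way to remove this kind of redundancy is a peephole pass over the generated code. The Python sketch below is a minimal illustration, not the method prescribed by the slides: the function name `drop_redundant_moves` is hypothetical, and the pass assumes straight-line code in which no instruction is the target of a jump.

```python
def drop_redundant_moves(code):
    """Drop a MOV that merely reverses the MOV immediately before it.

    After 'MOV R0,a' both R0 and a hold the same value, so a following
    'MOV a,R0' has no effect. Assumes no instruction is a jump target.
    """
    result = []
    for instr in code:
        if result and result[-1].startswith("MOV ") and instr.startswith("MOV "):
            prev_src, prev_dst = (s.strip() for s in result[-1][4:].split(","))
            src, dst = (s.strip() for s in instr[4:].split(","))
            if (src, dst) == (prev_dst, prev_src):
                continue   # skip the redundant instruction
        result.append(instr)
    return result


naive = ["MOV b,R0", "ADD c,R0", "MOV R0,a",
         "MOV a,R0", "ADD e,R0", "MOV R0,d"]
improved = drop_redundant_moves(naive)
```

For the six-instruction sequence above, the pass removes only the fourth instruction and leaves the other five unchanged.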

Register Allocation and Assignment
• Efficient utilization of the limited set of registers is
important to generate good code
• Registers are assigned by
– Register allocation to select the set of variables that will
reside in registers at a point in the code
– Register assignment to pick the specific register that a
variable will reside in
• Finding an optimal register assignment in general is
NP-complete

Example
• First block:
t:=a+b
t:=t*c
t:=t/d
{ R1=t }
MOV a,R1
ADD b,R1
MUL c,R1
DIV d,R1
MOV R1,t
• Second block:
t:=a*b
t:=t+a
t:=t/d
{ R0=a, R1=t }
MOV a,R0
MOV R0,R1
MUL b,R1
ADD R0,R1
DIV d,R1
MOV R1,t
Choice of Evaluation Order
• When instructions are independent, their evaluation
order can be changed
• Expression: a+b-(c+d)*e
• Original order:
t1:=a+b
t2:=c+d
t3:=e*t2
t4:=t1-t3
MOV a,R0
ADD b,R0
MOV R0,t1
MOV c,R1
ADD d,R1
MOV e,R0
MUL R1,R0
MOV t1,R1
SUB R0,R1
MOV R1,t4
• After reordering:
t2:=c+d
t3:=e*t2
t1:=a+b
t4:=t1-t3
MOV c,R0
ADD d,R0
MOV e,R1
MUL R0,R1
MOV a,R0
ADD b,R0
SUB R1,R0
MOV R0,t4
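
To see that the two orders compute the same value, both code sequences for a+b-(c+d)*e can be executed on a tiny simulator of the hypothetical machine. The Python sketch below is an illustration only: the function `run`, the dictionary model of memory, and the sample input values are assumptions, and only the MOV, ADD, SUB and MUL instructions with register and absolute operands are supported.

```python
import re


def is_register(name):
    return re.fullmatch(r"R\d+", name) is not None


def run(code, memory):
    """Execute two-address code (op source,destination); return final memory."""
    mem, regs = dict(memory), {}
    ops = {"ADD": lambda d, s: d + s,
           "SUB": lambda d, s: d - s,   # subtract source from destination
           "MUL": lambda d, s: d * s}
    for instr in code:
        opcode, operands = instr.split()
        src, dst = operands.split(",")
        value = regs[src] if is_register(src) else mem[src]
        store = regs if is_register(dst) else mem
        store[dst] = value if opcode == "MOV" else ops[opcode](store[dst], value)
    return mem


ORIGINAL = ["MOV a,R0", "ADD b,R0", "MOV R0,t1", "MOV c,R1", "ADD d,R1",
            "MOV e,R0", "MUL R1,R0", "MOV t1,R1", "SUB R0,R1", "MOV R1,t4"]
REORDERED = ["MOV c,R0", "ADD d,R0", "MOV e,R1", "MUL R0,R1",
             "MOV a,R0", "ADD b,R0", "SUB R1,R0", "MOV R0,t4"]

values = {"a": 7, "b": 3, "c": 2, "d": 4, "e": 5}
print(run(ORIGINAL, values)["t4"], run(REORDERED, values)["t4"])
```

With these sample values both sequences store -20 in t4. The reordered sequence needs 8 instructions instead of 10 because t1 no longer has to be stored to memory and reloaded.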
A Simple Code Generator
• A code generator generates target code for a
sequence of three- address statements and
effectively uses registers to store operands of
the statements.
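• For example, consider the three-address
statement a := b+c. Depending on where b and c
currently reside, it can be translated into one of the
following code sequences:
• ADD Rj, Ri Cost = 1 (if Ri contains b and Rj contains c; the result is left in Ri)
(or)
• ADD c, Ri Cost = 2 (if Ri contains b and c is in memory)
(or)
• MOV c, Rj Cost = 3 (if Ri contains b and c is first loaded from memory into Rj)
• ADD Rj, Ri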
Register and Address Descriptors:
• A register descriptor is used to keep track
of what is currently in each register. The
register descriptors show that initially all the
registers are empty.
• An address descriptor stores the location
where the current value of the name can
be found at run time.
A code-generation algorithm:
• The algorithm takes as input a sequence of
three-address statements constituting a basic
block. For each three-address statement of
the form x : = y op z, perform the following
actions:
• 1.   Invoke a function getreg to determine the
location L where the result of the
computation y op z should be stored.

2. Consult the address descriptor for y to
determine y’, the current location of y.
• Prefer the register for y’ if the value of y is
currently both in memory and a register.
• If the value of y is not already in L,
generate the instruction MOV y’ , L to
place a copy of y in L.

3. Generate the instruction OP z’ , L
where z’ is a current location of z.
• Prefer a register to a memory location if
z is in both.
• Update the address descriptor of x to
indicate that x is in location L.
• If L is a register, update its descriptor to
indicate that it contains x, and
remove x from all other descriptors.
4. If the current values of y or z have
no next uses, are not live on exit
from the block, and are in registers,
alter the register descriptor to
indicate that, after execution of x : =
y op z , those registers will no longer
contain y or z
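
The four steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a definitive implementation: the names `generate`, `getreg`, `dead_after` and `best_location` are hypothetical, statements are encoded as (x, opcode, y, z) tuples with the target opcode given directly, `getreg` only reuses the register of y or picks an empty register (spilling is omitted), and values that are live on exit are stored back to memory at the end of the block.

```python
def generate(block, live_on_exit, num_regs=2):
    """Generate two-address code for a basic block of (x, op, y, z) tuples."""
    regs = {f"R{i}": None for i in range(num_regs)}  # register descriptor
    where = {}                                       # address descriptor
    code = []

    def locations(name):
        # A name not seen before is assumed to be in its own memory location.
        return where.setdefault(name, {name})

    def best_location(name):
        # Prefer a register when the value is both in memory and a register.
        in_regs = sorted(loc for loc in locations(name) if loc in regs)
        return in_regs[0] if in_regs else name

    def dead_after(name, index):
        for x, _, y, z in block[index + 1:]:
            if name in (y, z):
                return False      # has a next use
            if name == x:
                return True       # redefined before any use
        return name not in live_on_exit

    def getreg(x, y, index):
        loc = best_location(y)
        if loc in regs and (x == y or dead_after(y, index)):
            return loc            # y's register can be overwritten
        for reg, holder in regs.items():
            if holder is None:
                return reg        # otherwise take an empty register
        raise RuntimeError("out of registers (spilling omitted in this sketch)")

    for index, (x, op, y, z) in enumerate(block):
        L = getreg(x, y, index)                      # step 1
        y_loc = best_location(y)                     # step 2
        if y_loc != L:
            code.append(f"MOV {y_loc},{L}")
        code.append(f"{op} {best_location(z)},{L}")  # step 3
        for reg in regs:
            if regs[reg] == x:
                regs[reg] = None  # older copies of x are stale
        for locs in where.values():
            locs.discard(L)       # whatever L held before is gone
        regs[L] = x
        where[x] = {L}
        for name in (y, z):                          # step 4
            if name != x and dead_after(name, index):
                for loc in list(locations(name)):
                    if loc in regs and regs[loc] == name:
                        regs[loc] = None
                        where[name].discard(loc)

    # Store values that are live on exit and exist only in registers.
    for reg, name in regs.items():
        if name is not None and name in live_on_exit and name not in where[name]:
            code.append(f"MOV {reg},{name}")
            where[name].add(name)
    return code


block = [("t", "SUB", "a", "b"), ("u", "SUB", "a", "c"),
         ("v", "ADD", "t", "u"), ("d", "ADD", "v", "u")]
print("\n".join(generate(block, live_on_exit={"d"})))
```

For this block, which computes d := (a-b) + (a-c) + (a-c) with two registers, the sketch keeps t and u in registers and stores only d, which is live on exit, back to memory.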