CDU5

The document discusses the code generation phase of a compiler, detailing its input, output, and the various issues involved in its design, such as memory management and instruction selection. It explains the types of target programs produced, including absolute, relocatable, and assembly languages, and outlines the code generation algorithm that utilizes register and address descriptors. Additionally, it emphasizes the importance of efficient register allocation and assignment for optimizing code generation.

Uploaded by

vyshnavivadlamani

Code Generator

• The code generator is the final phase of a compiler.
• The front end translates the source program into intermediate code.
• The code optimizer is an optional phase that improves the intermediate code: it can reduce the number of statements or replace complex operations with simpler ones. Its output is also intermediate code.
• The input to the code generator is the intermediate representation, and its output is the target program.
• All the phases are connected to the symbol table.
Issues In The Design of Code Generator
1. Input to the Code Generator
2. Target Program
3. Memory Management
4. Instruction Selection
5. Register Allocation
6. Choice of Evaluation Order
Input to the Code Generator
• The input to the code generator is intermediate code.
• The intermediate code may be in one of these forms:
Linear representation: postfix notation
Three-address representation: quadruples, triples, indirect triples
Graphical representation: syntax tree or DAG
• In a syntax tree, interior nodes represent operators and leaves represent operands.
• In a DAG, a common subexpression is identified and represented by a single node instead of being constructed again.
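A minimal sketch of how quadruples can be built while reusing common subexpressions, in the way a DAG would. The helper name build_quads and the tuple encoding of the expression tree are assumptions made for this illustration, not part of the notes.

```python
def build_quads(tree):
    """Turn an expression tree into quadruples (op, arg1, arg2, result).

    A tree node is either a name (str) or a tuple (op, left, right).
    An identical (op, arg1, arg2) is emitted only once, as in a DAG.
    """
    quads = []   # emitted quadruples, in order
    seen = {}    # (op, arg1, arg2) -> temporary that already holds the value
    counter = 0

    def emit(op, a, b):
        nonlocal counter
        key = (op, a, b)
        if key in seen:          # common subexpression: reuse the temporary
            return seen[key]
        counter += 1
        temp = f"t{counter}"
        quads.append((op, a, b, temp))
        seen[key] = temp
        return temp

    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        return emit(op, walk(left), walk(right))

    return walk(tree), quads


# (a-b) + (a-c) + (a-c): the second (a-c) reuses t2.
tree = ("+", ("+", ("-", "a", "b"), ("-", "a", "c")), ("-", "a", "c"))
result, quads = build_quads(tree)
for quad in quads:
    print(quad)
```

The sketch only handles a single expression with no intervening assignments, so it never has to invalidate an entry in seen.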
Target Program
• The target program is the output of the code generator.
• The output may take one of 3 forms:
1. Absolute Machine Language
2. Relocatable Machine Language
3. Assembly Language
Absolute Machine Language
• The code is placed at a fixed location in memory and can be executed immediately.
• No addresses need to be changed at load time.
• Absolute machine language is practical only for small programs, because the required memory space must be available at that fixed location.
Relocatable Machine Language
• The program can be loaded anywhere in memory and executed.
• For example, if the program's addresses run from 0 to 10 but the free memory runs from 100 to 110, every reference in the range 0 to 10 must be converted to the range 100 to 110 by adding the base address 100.
• Because every reference is adjusted relative to the load address (here 100), this form is called relocatable machine language. It is what an object file contains.
• Subprograms are compiled separately. The resulting relocatable object files and the library files are linked together by the linker and then placed at the proper location in memory by the loader.
• Relocatable machine language therefore requires a linker and a loader.
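The relocation step in the example can be sketched as follows. The (opcode, address) instruction format and the function name relocate are assumptions for illustration only; a real loader works from relocation entries in the object file.

```python
def relocate(instructions, base):
    """Add the load address `base` to every address operand."""
    relocated = []
    for opcode, address in instructions:
        relocated.append((opcode, address + base))
    return relocated


# Program compiled for addresses 0..10, loaded at 100..110.
program = [("LD", 0), ("ADD", 4), ("ST", 10)]
loaded = relocate(program, 100)
print(loaded)
```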
Assembly Language
• The code generator produces mnemonic instructions, that is, assembly language. For example:
MOV a, R0
ADD b, R0
• Using assembly language as the target makes code generation easier: the previous 2 kinds of output require the binary encoding (0s and 1s) to be constructed, whereas this 3rd kind uses mnemonic instructions.
• An assembler is needed to convert the assembly language into machine language.
• Even though the assembler runs as a separate step, this is the easiest way to implement the code generation phase.
• It is also used for target machines with a small memory.
Memory Management
• Names in the source program must be mapped to addresses in run-time memory. This is done using symbol table information.
• As the declarations in the source program are processed, each name is entered into the symbol table along with its type and its address (offset).
• The stored offsets give the relative addresses of the names, which are then mapped to the corresponding addresses in run-time memory.
Instruction Selection
• Instruction selection plays an important role in the efficiency of the generated code.
• In an instruction such as MOV b, R0, MOV is the opcode and b and R0 are the source and destination operands.
• Translating three-address code statement by statement often produces poor code.
• Some redundant code is also generated.
• Consider another example, in which two consecutive statements are translated one at a time into 6 instructions. The 4th instruction is redundant: the 3rd instruction stores R0 into x, so R0 already holds x, and the 4th instruction moves the same x into R0 again.
• If x is not used later, the 3rd instruction is redundant as well, so there is no need to move the result from R0 to x.
• After z is added to R0, R0 holds y+z, so S can be added directly to that content.
• So 4 instructions can be used instead of 6.
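The saving from 6 instructions to 4 can be reproduced with a small sketch. The statement pair x := y + z followed by w := x + s is an assumed example (the name w in particular does not appear in the notes), and both cleanup passes are deliberately simple pattern matches rather than a general optimizer.

```python
def naive_codegen(statements):
    """Translate each 'x := y + z' on its own: load, add, store."""
    code = []
    for x, y, z in statements:
        code += [f"MOV {y}, R0", f"ADD {z}, R0", f"MOV R0, {x}"]
    return code


def drop_redundant_loads(code):
    """Remove 'MOV x, R0' when it directly follows 'MOV R0, x'."""
    result = []
    for instr in code:
        if result and result[-1].startswith("MOV R0, "):
            stored = result[-1][len("MOV R0, "):]
            if instr == f"MOV {stored}, R0":
                continue            # R0 already holds this value
        result.append(instr)
    return result


def drop_dead_stores(code, dead):
    """Remove stores to names that are known not to be used afterwards."""
    return [i for i in code if i not in {f"MOV R0, {n}" for n in dead}]


naive = naive_codegen([("x", "y", "z"), ("w", "x", "s")])
improved = drop_dead_stores(drop_redundant_loads(naive), dead={"x"})
print(len(naive), len(improved))
print(improved)
```

The redundant load has to be removed before the dead store: once the store to x is gone, the following load of x would no longer be recognizable as redundant.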

• For a statement that increments x, the constant 1 must be added to x.
• #1 denotes immediate addressing mode.
• Translated in the usual way this takes 3 instructions. If the target machine has an INC instruction, we can use INC x instead, which reduces the number of instructions.
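A minimal sketch of this choice, assuming the increment is the statement x := x + 1 and that the generic translation is the usual load, add, store sequence. The function name select_increment and the has_inc flag are hypothetical.

```python
def select_increment(x, has_inc):
    """Pick instructions for x := x + 1 on the simple target machine."""
    if has_inc:
        return [f"INC {x}"]                       # one instruction
    # Generic sequence: load, add the immediate constant #1, store back.
    return [f"MOV {x}, R0", "ADD #1, R0", f"MOV R0, {x}"]


print(select_increment("x", has_inc=False))
print(select_increment("x", has_inc=True))
```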

Object Code Forms

• This is also known as the target machine, or a simple target machine model.
• Some of the operations that the target machine supports are as follows:
1. Load operations: load a memory word into a register
LD R1,X // load the value in location X into register R1
2. Store operations: store the content of a register into memory
ST X,R1 // store the value in register R1 into location X
3. Computational operations: add, sub, mul and similar instructions. They have the form op dst, src1, src2, where op is the operation being performed, dst is the destination argument and the remaining arguments are sources. In instructions with fewer operands, dst can act as both a source and the destination.
Ex: ADD R1,R2,R3 means R1 = R2 + R3
Ex: INC X increments the content of X and stores the result in X
4. Unconditional jump: control is transferred to the given location without checking any condition
Ex: BR L
BR 1000 // control is transferred to L, which is 1000
5. Conditional jump: control is transferred to the given location only if a condition holds
Ex: BLTZ r,L // Branch if Less Than Zero; r is a register, L is a label (address)
If the content of r is less than 0, branch to the label.
If the content of r is -2 and L is 1000, then -2 < 0, so control goes to L.
If the condition is false, the next statement is executed.
• The target machine has a variety of addressing modes.
• An addressing mode determines the effective address, that is, the location that holds the operand.
Addressing Mode        Form     Address
Absolute               M        M
Register               R        R
Indexed                C(R)     C + Contents(R)
Indirect Register      *R       Contents(R)
Indirect Indexed       *C(R)    Contents(C + Contents(R))
Literal (Immediate)    #C       N/A
Absolute addressing mode: written M, where M is a memory location.
• An instruction format mainly contains 2 fields:
1. Opcode // specifies the operation being performed
2. Address of operand // memory location
• For example, if M is 1000 and location 1000 holds 10, the operand is 10.

Register addressing mode: the address field names a register instead of a memory location, and the operand is the value held in that register (for example, R1).

Indexed addressing mode: R is an index register and C is the address of a memory location. The content of R is added to C to give the effective address of the operand.

Indirect register addressing mode: the instruction names a register, but the register does not hold the operand. It holds an address, and that address is the effective address where the operand is found. The * means "the value at this address".

Indirect indexed addressing mode: similar to indexed mode. It combines the indexed and indirect register modes.
• The content of R is added to C to obtain an address. That address does not hold the operand itself. It holds the effective address, and the operand is found there.

Literal (immediate) addressing mode: the instruction contains the value itself instead of an address.
• N/A stands for "not applicable": the operand is part of the instruction, so no address is needed. The constant is written with a # prefix.
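The addressing modes can be compared side by side with a small operand-fetch sketch. Memory and registers are modelled as plain dictionaries, and the function name fetch_operand, the mode names and the sample values are all assumptions for illustration.

```python
def fetch_operand(mode, memory, registers, c=None, r=None):
    """Return the operand selected by one addressing mode.

    memory maps addresses to values; registers maps register names to values.
    """
    if mode == "absolute":            # M: operand is at address c
        return memory[c]
    if mode == "register":            # R: operand is in the register
        return registers[r]
    if mode == "indexed":             # C(R): address is c + contents(R)
        return memory[c + registers[r]]
    if mode == "indirect_register":   # *R: address is contents(R)
        return memory[registers[r]]
    if mode == "indirect_indexed":    # *C(R): address is contents(c + contents(R))
        return memory[memory[c + registers[r]]]
    if mode == "literal":             # #C: operand is the constant itself
        return c
    raise ValueError(f"unknown addressing mode: {mode}")


memory = {1000: 10, 1004: 20, 2000: 1004}
registers = {"R1": 4, "R2": 2000}

print(fetch_operand("absolute", memory, registers, c=1000))
print(fetch_operand("indexed", memory, registers, c=1000, r="R1"))
print(fetch_operand("indirect_indexed", memory, registers, c=1996, r="R1"))
```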

Code Generation Algorithm

• It generates target code for a sequence of instructions, that is, machine code for the optimized intermediate code.
• It uses a function getReg() to assign registers to variables.
• It uses 2 data structures:
1. Register descriptor
2. Address descriptor
• Register descriptor: keeps track of which variable is stored in each register. Initially all registers are empty.
• Address descriptor: keeps track of the location where each variable is stored. The location may be a register, a memory address, a stack location, etc.
The algorithm takes a sequence of three-address statements as input. For each three-address statement of the form x := y op z, it performs the following actions:
1. Invoke the function getreg to determine the location L where the result of the computation y op z should be stored.
  • Normally getreg returns an empty register: if any register is empty, it is returned as location L.
  • If no empty register is available, getreg checks for a suitable occupied register, that is, one occupied by a name that is no longer used (a dead variable).
  • If neither an empty register nor a suitable occupied register is available, getreg returns a memory location as L.
2. Consult the address descriptor for y to determine y', the current location of y. If the value of y is currently in both memory and a register, prefer the register for y'. If the value of y is not already in L, generate the instruction MOV y', L to place a copy of y in L.
  • The address descriptor gives the location of y.
  • If y is in both memory and a register, the register is used as y'. If y is not in any register, its memory location is used.
  • After consulting the address descriptor, check whether y is already in L. If it is not, generate MOV y', L.
  • For example, take x = y + z with R0 as the location L. y is not in a register, so its memory location is used. y is not already in R0, so the target instruction MOV y, R0 is produced.
3. Generate the instruction OP z', L, where z' is the current location of z. If z is in both memory and a register, prefer the register. Update the address descriptor of x to indicate that x is in location L. If L is a register, update the register descriptor to indicate that it contains the value of x.
  • op represents the operator.
  • Consult the address descriptor of z to determine z', the current location of z.
  • The register descriptor tells what each register currently holds.
  • In the previous example the operator is +. Assuming z is not in a register and resides in memory, the instruction is ADD z, R0.
  • In this example L is the register R0, so after the instruction x is in R0.
4. If y and z (the right-hand-side variables) have no next use and are not live on exit, update the descriptors to remove y and z, because they are no longer needed.
In our example, if x = y + z is the final statement in the basic block, x is treated as live on exit, so a store instruction must be performed at the end of the block.
Register and Address Descriptors:
• A register descriptor keeps track of what is currently in each register. Initially the register descriptor shows that all registers are empty.
• An address descriptor stores the location where the current value of a name can be found at run time.

Generating Code for Assignment Statements:

• Consider the example
• The assignment statement d:= (a-b) + (a-c) + (a-c) can be translated into
the following sequence of three address code:
t1:= a-b
t2:= a-c
t3:= t1 +t2
d:= t3+t2
Assume that these 4 statements form a basic block and that d is live on exit.
The code sequence for the example is generated as follows, using two registers R0 and R1. Initially both registers are empty.
For the 1st statement (t1 := a-b):
• getReg() returns R0, so L is R0. The computation is carried out in this location.
• Check the current location y' of a. a is not in a register and not in L, so generate MOV a, R0.
• Step 3 is OP z', L. b is only in memory, so generate SUB b, R0. Then update the descriptors: R0 contains t1, and t1 is in R0.
For the 2nd statement (t2 := a-c), getReg() returns R1.
• Now L is R1.
• Check whether a is in a register or only in memory. The register descriptor shows that R0 contains t1, not a.
• So a MOV instruction is needed: MOV a, R1.
• c is not in any register either, so generate SUB c, R1. R1 now contains t2.
For the 3rd statement (t3 := t1+t2), step 3 of the algorithm can be performed directly: ADD R1, R0, that is, the content of t2 (in R1) is added to t1 (in R0). R0 now contains t3.
For the 4th statement (d := t3+t2), the add instruction can again be performed directly according to the register descriptor.
• The content of t2 is added to t3: ADD R1, R0.
• Since this is the final statement and d is live on exit, a store is also needed: MOV R0, d.
• d is now in both memory and register R0.
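The walkthrough can be checked with a small simulation of the algorithm. This is a sketch under stated assumptions, not a full implementation: statements are tuples (x, op, y, z), only + and - are handled, next-use is found by scanning the rest of the block, getreg first reuses the register that already holds y when y has no further use (which is how the 3rd and 4th statements avoid a MOV), and the fall-back to a memory location is not modelled.

```python
def generate(block, live_on_exit, num_regs=2):
    """Generate code for a basic block of statements (x, op, y, z)."""
    opcode = {"+": "ADD", "-": "SUB"}
    reg_desc = {f"R{i}": None for i in range(num_regs)}  # register -> name held
    addr_desc = {}                                       # name -> set of locations
    code = []

    def locations(name):
        # Names never assigned in the block live in memory under their own name.
        return addr_desc.setdefault(name, {name})

    def best_location(name):
        # Prefer a register copy over the memory copy.
        regs = sorted(loc for loc in locations(name) if loc in reg_desc)
        return regs[0] if regs else name

    def used_later(name, i):
        later = any(name in (y, z) for _, _, y, z in block[i + 1:])
        return later or name in live_on_exit

    def getreg(i, y, z):
        y_loc = best_location(y)
        if y_loc in reg_desc and not used_later(y, i):
            return y_loc                     # reuse the register holding y
        for reg, held in reg_desc.items():
            if held is None:
                return reg                   # an empty register
        for reg, held in reg_desc.items():
            if held != z and not used_later(held, i):
                return reg                   # occupied by a dead name
        raise RuntimeError("memory fall-back is not modelled in this sketch")

    for i, (x, op, y, z) in enumerate(block):
        L = getreg(i, y, z)
        y_loc = best_location(y)
        if y_loc != L:
            code.append(f"MOV {y_loc}, {L}")
        code.append(f"{opcode[op]} {best_location(z)}, {L}")
        # L now holds x and nothing else; x lives only in L.
        if reg_desc[L] is not None:
            addr_desc[reg_desc[L]].discard(L)
        for reg in reg_desc:
            if reg_desc[reg] == x:
                reg_desc[reg] = None
        reg_desc[L] = x
        addr_desc[x] = {L}
        # Free registers whose operand has no next use.
        for name in {y, z} - {x}:
            if not used_later(name, i):
                for reg in reg_desc:
                    if reg_desc[reg] == name:
                        reg_desc[reg] = None
                        addr_desc[name].discard(reg)

    # Store names that are live on exit but not yet in memory.
    for name in sorted(live_on_exit):
        if name not in locations(name):
            code.append(f"MOV {best_location(name)}, {name}")
            addr_desc[name].add(name)
    return code, reg_desc, addr_desc


block = [("t1", "-", "a", "b"), ("t2", "-", "a", "c"),
         ("t3", "+", "t1", "t2"), ("d", "+", "t3", "t2")]
code, reg_desc, addr_desc = generate(block, live_on_exit={"d"})
print("\n".join(code))
```

The scan in used_later assumes each temporary is assigned only once in the block, which holds for the example.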
Register Allocation and Assignment
• Efficient utilization of registers is important in a code generation strategy.
• Register allocation decides which values in a program should reside in registers.
• Register assignment decides in which register each value should reside.
• Within a basic block, some variables are kept in registers, and at the end of the block all live variables must be stored to memory.
• In global register assignment, variables that are used frequently throughout a loop are kept in fixed registers for the whole loop.
• For example, if there are 6 registers, 3 of them may be fixed for holding the frequently used variables.
Usage count in Register Allocation
• Consider a variable x.
1. If x is in a register, we save 1 unit of cost for each reference to x that is not preceded by an assignment to x in the same block.
2. We save 2 units of cost if we can avoid a store of x at the end of a block. This applies when x is assigned a value in the block and is live at the end of the block, because x then stays in the register instead of being stored to memory.
3. The benefit of allocating x to a register over a loop L is calculated with the formula
$$\sum_{B \in L} \bigl( use(x, B) + 2 \cdot live(x, B) \bigr)$$
4. Here B is a block and L is a loop, which can contain any number of blocks. use(x, B) is the number of uses of x in B before any assignment to x, and live(x, B) is 1 if x is assigned in B and live on exit from B, and 0 otherwise (so it is 0 if x is dead). The summation is taken over all the blocks in L.
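The usage-count rule (1 unit per use of x before any assignment in the block, plus 2 units when x is assigned in the block and live on exit) can be sketched as a short function. The per-block values below are illustrative assumptions, chosen only so that the total matches the 4 units the notes report for variable a.

```python
def benefit(blocks):
    """Sum use + 2 * live over the blocks of a loop for one variable.

    blocks is a list of (use, live) pairs, one per block:
      use  = number of uses of x before any assignment to x in the block
      live = True if x is assigned in the block and live on exit
    """
    return sum(use + (2 if live else 0) for use, live in blocks)


# Hypothetical counts for a over 4 blocks B1..B4.
a_counts = [(0, True), (1, False), (1, False), (0, False)]
print(benefit(a_counts))
```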

Simple Flow Graph for Inner Loop

• This loop consists of 4 blocks.
• Every block contains instructions.
• b, c, d and f are live on entry to B1, and a, c, d, e and f are live on exit from B1.
Finding of Usage Count
• In B1, take x to be the variable a. Count how many times a appears on the right-hand side without being preceded by an assignment to a. Here a is used once, but that use is preceded by the assignment to a in the 1st instruction, so the use count is 0.
• Next check whether a is live on exit from the block and is assigned a value in the block. If so, 2 units are added.
• Summed over all the blocks, the total for a is 4, which means we save 4 units of cost by keeping a in one of the global registers.
• We save 6 units of cost if we assign b to a register.
• Assume 3 global registers R0, R1, R2.
• We can assign a to R0 (e or f could be chosen instead), b to R1 and d to R2.
