3 Intermediate Code Generation

This chapter discusses intermediate code generation which involves translating source code into an intermediate representation. It covers directed acyclic graphs, three address code, symbol tables, assignment statements, boolean expressions, flow control, procedure calls, and code generation. The benefits of intermediate code generation include enabling optimization and creating compilers that are independent of the source and target languages.

Uploaded by

AKASH PAL
Copyright
© © All Rights Reserved
Chapter 3

Intermediate Code Generation

LEARNING OBJECTIVES

• Introduction
• Directed Acyclic Graphs (DAG)
• Three address code
• Symbol table operations
• Assignment statements
• Boolean expression
• Flow control of statements
• Procedure calls
• Code generation
• Next use information
• Run-time storage management
• DAG representations of basic blocks
• Peephole optimization

Introduction

In the analysis-synthesis model, the front end translates a source program into an intermediate representation (IR). From the IR, the back end generates target code.

[Figure: Source code → Front end → Intermediate representation → Back end → Target representation. Labels under the three stages: "Target independent, source dependent"; "Mostly target dependent, source independent"; "Target dependent, source independent".]

There are different types of intermediate representations:
• High level IR, i.e., AST (Abstract Syntax Tree)
• Medium level IR, i.e., Three address code
• Low level IR, i.e., DAG (Directed Acyclic Graph)
• Postfix Notation (Reverse Polish Notation, RPN)

AST and RPN have already been discussed in the previous sections.

Benefits of intermediate code generation: The benefits of ICG are
1. We can obtain optimized code.
2. Compilers can be created for different machines by attaching a different back end to the existing front end.
3. Compilers can be created for different source languages by attaching a different front end to the existing back end.

Directed Acyclic Graphs (DAG) for Expressions

• A DAG for an expression identifies the common subexpressions in the given expression.
• A node N in a DAG has more than one parent if N represents a common subexpression.
• A DAG gives the compiler important clues regarding the generation of efficient code to evaluate the expression.

Example 1: DAG for a + a*(b - c) + (b - c)*d

[Figure: DAG. Root + (P13) has children + (P7) and * (P12). P7 has children a (P1, P2) and * (P6). P6 has children a and - (P5, P10). P12 has children - (P5, P10) and d (P11). The - node has children b (P3, P8) and c (P4, P9).]

P1 = makeleaf (id, a)
P2 = makeleaf (id, a) = P1
P3 = makeleaf (id, b)
P4 = makeleaf (id, c)
P5 = makenode (-, P3, P4)
P6 = makenode (*, P1, P5)
P7 = makenode (+, P1, P6)
P8 = makeleaf (id, b) = P3
P9 = makeleaf (id, c) = P4
P10 = makenode (-, P8, P9) = P5
P11 = makeleaf (id, d)
P12 = makenode (*, P10, P11)
P13 = makenode (+, P7, P12)

Example 2: a := a - 10

[Figure: DAG with root := whose children are a and -; the - node has children a and 10, and both uses of a share one leaf.]

Three-Address Code

In three-address code, each statement usually contains 3 addresses: 2 for the operands and 1 for the result.

Example: x = y OP z
• x, y, z are names, constants or compiler-generated temporaries.
• OP stands for any operator, arithmetic or logical.

Example: Consider the statement x = y * -z + y * -z

[Figure: syntax tree. Root = has children x and +. The + node has two * children. Each * node has children y and unary-minus, and each unary-minus node has child z.]

The corresponding three-address code will be like this:

Syntax tree          DAG
t1 = -z              t1 = -z
t2 = y * t1          t2 = y * t1
t3 = -z              t5 = t2 + t2
t4 = y * t3          x = t5
t5 = t2 + t4
x = t5

The postfix notation for the syntax tree is: x y z unaryminus * y z unaryminus * + =

• Three-address code is a "linearized representation" of the syntax tree.
• Three-address code can be produced by syntax-directed translation; attributes are added whenever necessary.

Example: Consider the SDD below with the following specifications:
E has the attributes E.place and E.code.
E.place: the name that holds the value of E.
E.code: the sequence of intermediate code that evaluates E.
newtemp: returns a new temporary variable each time it is called.
newlabel: returns a new label.

The SDD to produce three-address code for expressions is given below:

Production          Semantic rules

S → id ASN E        S.code = E.code || gen (ASN, id.place, E.place)

E → E1 PLUS E2      E.place = newtemp ();
                    E.code = E1.code || E2.code || gen (PLUS, E.place, E1.place, E2.place);

E → E1 MUL E2       E.place = newtemp ();
                    E.code = E1.code || E2.code || gen (MUL, E.place, E1.place, E2.place);

E → UMINUS E1       E.place = newtemp ();
                    E.code = E1.code || gen (NEG, E.place, E1.place);

E → LP E1 RP        E.code = E1.code;
                    E.place = E1.place

E → IDENT           E.place = id.place;
                    E.code = empty.list ();
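The makeleaf/makenode construction from the DAG section and the two three-address listings for x = y * -z + y * -z can be tied together in a short Python sketch. It is illustrative only: the class `ExprBuilder`, the function `three_address_code`, and the `share` flag are names invented here, and node sharing is done with a dictionary lookup, which is one possible way to detect common subexpressions. The walk assigns each node a name in the spirit of E.place and calls a counter in place of newtemp.

```python
from itertools import count


class ExprBuilder:
    """Creates expression nodes; with share=True identical nodes are reused (DAG)."""

    def __init__(self, share):
        self.share = share
        self.index = {}   # node key -> node id, used to detect repeated nodes
        self.nodes = []   # node id -> key: ("id", name) or (op, child ids...)

    def _node(self, key):
        if self.share and key in self.index:
            return self.index[key]          # common subexpression: reuse the node
        self.nodes.append(key)
        self.index[key] = len(self.nodes) - 1
        return self.index[key]

    def makeleaf(self, name):
        return self._node(("id", name))

    def makenode(self, op, *children):
        return self._node((op, *children))


def three_address_code(builder, root, target):
    """Post-order walk; place[n] plays the role of E.place for node n."""
    code, place, temps = [], {}, count(1)

    def walk(n):
        if n not in place:                  # a shared node is translated only once
            op, *kids = builder.nodes[n]
            if op == "id":
                place[n] = kids[0]
            else:
                args = [walk(k) for k in kids]
                temp = f"t{next(temps)}"    # stands in for newtemp()
                if len(args) == 1:
                    code.append(f"{temp} = {op}{args[0]}")
                else:
                    code.append(f"{temp} = {args[0]} {op} {args[1]}")
                place[n] = temp
        return place[n]

    result = walk(root)
    code.append(f"{target} = {result}")
    return code


def build(builder):
    """Builds the nodes for y * -z + y * -z and returns the root."""
    def product():
        neg = builder.makenode("-", builder.makeleaf("z"))
        return builder.makenode("*", builder.makeleaf("y"), neg)
    return builder.makenode("+", product(), product())


tree = ExprBuilder(share=False)
tree_code = three_address_code(tree, build(tree), "x")
dag = ExprBuilder(share=True)
dag_code = three_address_code(dag, build(dag), "x")
print("\n".join(tree_code))
print("\n".join(dag_code))
```

With sharing switched off the sketch emits six statements, as in the syntax-tree listing. With sharing on it emits four; the temporaries are numbered consecutively, so the sum lands in t3 rather than t5.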

Types of Three Address Statement

Assignment
• Binary assignment: x := y OP z   Store the result of y OP z to x.
• Unary assignment: x := op y   Store the result of the unary operation on y to x.

Copy
• Simple copy: x := y   Store y to x.
• Indexed copy: x := y[i]   Store the contents of y[i] to x.
• x[i] := y   Store y to the (x + i)th address.

Address and pointer manipulation
x := &y    Store the address of y to x.
x := *y    Store the contents of the location pointed to by y to x.
*x := y    Store y to the location pointed to by x.

Jump
• Unconditional jump: goto L, jumps to L.
• Conditional:
    if (x relop y)
        goto L1;
    else
    {
        goto L2;
    }
  where relop is <, <=, >, >=, = or ≠.

Procedure call
Param x1;
Param x2;
.
.
.
Param xn;
Call p, n, x;    Call procedure p with n parameters and store the result in x.
return x         Use x as the result from the procedure.

Declarations
• Global x, n1, n2: Declare a global variable named x at offset n1 having n2 bytes of space.
• Proc x, n1, n2: Declare a procedure x with n1 bytes of parameter space and n2 bytes of local variable space.
• Local x, m: Declare a local variable named x at offset m from the procedure frame.
• End: Declare the end of the current procedure.

Adaptation for object-oriented code
• x = y field z: Look up the field named z within y, store its address to x.
• Class x, n1, n2: Declare a class named x with n1 bytes of class variables and n2 bytes of class method pointers.
• Field x, n: Declare a field named x at offset n in the class frame.
• New x: Create a new instance of the class named x.

Implementation of Three Address Statements

Three address statements can be implemented as records with fields for the operator and the operands. There are 3 types of representations:
1. Quadruples
2. Triples
3. Indirect triples

Quadruples
A quadruple has four fields: op, arg1, arg2 and result.
• Unary operators do not use arg2.
• Param uses neither arg2 nor result.
• Jumps put the target label in result.
• The contents of the fields are pointers to the symbol table entries for the names represented by these fields.
• Quadruples are easier to optimize because code can be moved around.

Example 1: For the expression x = y * -z + y * -z, the quadruple representation is

      Op       Arg1   Arg2   Result
(0)   Uminus   z             t1
(1)   *        y      t1     t2
(2)   Uminus   z             t3
(3)   *        y      t3     t4
(4)   +        t2     t4     t5
(5)   =        t5            x

Example 2: Read (x)

      Op       Arg1   Arg2   Result
(0)   Param    x
(1)   Call     READ   (x)

Example 3: WRITE (A*B, x + 5)

      Op       Arg1   Arg2   Result
(0)   *        A      B      t1
(1)   +        x      5      t2
(2)   Param    t1
(3)   Param    t2
(4)   Call     Write  2

Triples
Triples have three fields: op, arg1, arg2.
• Temporaries are not used; instead, references to instructions are made.
• Triples are also known as two address code.
• Triples take less space when compared with quadruples.
• Optimization by moving code around is difficult.
• The DAG and triple representations of expressions are equivalent.
• For the expression a = y * -z + y * -z the triple representation is

      Op       Arg1   Arg2
(0)   Uminus   z
(1)   *        y      (0)
(2)   Uminus   z
(3)   *        y      (2)
(4)   +        (1)    (3)
(5)   =        a      (4)

Array references
Example: For A[I] := B, the quadruple representation is

      Op       Arg1   Arg2   Result
(0)   []=      A      I      T1
(1)   =        B             T2

The same can be represented in triple form. []= is called the l-value; it specifies the address of an element.
Triple representation of A[I] := B:

      Op     Arg1   Arg2
(0)   []=    A      I
(1)   =      (0)    B

Example 2: A := B[I]

      Op     Arg1   Arg2
(0)   =[]    B      I
(1)   =      A      (0)

=[] is called the r-value; it specifies the value of an element.

Indirect Triples
• In indirect triples, a list of pointers to triples is kept instead of the triples themselves.
• Optimization by moving code around is easy.
• Indirect triples take less space when compared with quadruples.
• Indirect triples and quadruples are almost equally efficient.

Example: Indirect triple representation of three-address code

      Statement
(0)   (14)
(1)   (15)
(2)   (16)
(3)   (17)
(4)   (18)
(5)   (19)

       Op       Arg1   Arg2
(14)   Uminus   z
(15)   *        y      (14)
(16)   Uminus   z
(17)   *        y      (16)
(18)   +        (15)   (17)
(19)   =        x      (18)

Symbol Table Operations

Treat symbol tables as objects.
• mktable (previous)
  • Create a new symbol table.
  • Link it to the symbol table previous.
• enter (table, name, type, offset)
  • Insert a new identifier name with type and offset into table.
  • Check for possible duplication.
• addwidth (table, width)
  • Increase the size of the symbol table by width.
• enterproc (table, name, newtable)
  • Enter a procedure name into table.
  • The symbol table of name is newtable.
• lookup (name, table)
  • Check whether name is declared in the symbol table; if it is in the table then return the entry.

Example:
Declaration → M1 D
M1 → ∈      {top (offset) := 0;}
D → D ID
D → id : T   {enter (top (tblptr), id.name, T.type, top (offset));
              top (offset) := top (offset) + T.width;}
T → integer  {T.type := integer; T.width := 4;}
T → double   {T.type := double; T.width := 8;}
T → *T1      {T.type := pointer (T1.type); T.width := 4;}

We need to remember the current offset before entering a block, and to restore it after the block is closed.

Example:
Block → begin M4 Declarations statements end
             {pop (tblptr); pop (offset);}
M4 → ∈      {t := mktable (top (tblptr)); push (t, tblptr); push (top (offset), offset);}

The block number technique can also be used to avoid creating a new symbol table.

Field names in records
• A record declaration is treated as entering a block as far as the offset is concerned.
• A new symbol table is needed.

Example:
T → record M5 D end
             {T.type := record (top (tblptr));
              T.width := top (offset);
              pop (tblptr);
              pop (offset);}
M5 → ∈      {t := mktable (null);
              push (t, tblptr);
              push (0, offset);}

Assignment Statements

Expressions can be of type integer, real, array and record. As part of the translation of assignments into three address code, we show how names can be looked up in the symbol table and how elements of arrays can be accessed.

Code generation for assignment statements: gen ([address #1], [assignment], [address #2], operator, [address #3]);

Variable accessing: Depending on the type of [address #i], generate different code.
Types of [address #i]:
• Local temp space
• Parameter
• Local variable
• Non-local variable
• Global variable
• Registers, constants, …
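The symbol-table operations and the declaration scheme described earlier in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the class `SymbolTable` and the helper `declare` are hypothetical names, the widths 4 and 8 are taken from the T → integer and T → double rules, and `lookup` is assumed to continue into the enclosing table when a name is not found locally.

```python
class SymbolTable:
    def __init__(self, previous=None):
        # mktable(previous): new table linked to the enclosing one
        self.previous = previous
        self.entries = {}     # name -> (type, offset)
        self.width = 0        # total bytes declared in this table

    def enter(self, name, type_, offset):
        # enter(table, name, type, offset), with the duplicate check
        if name in self.entries:
            raise KeyError(f"duplicate declaration: {name}")
        self.entries[name] = (type_, offset)

    def addwidth(self, width):
        self.width += width

    def lookup(self, name):
        # Assumption: search outward through the chain of enclosing tables.
        table = self
        while table is not None:
            if name in table.entries:
                return table.entries[name]
            table = table.previous
        return None


WIDTH = {"integer": 4, "double": 8}


def declare(table, declarations):
    """Processes a list of (name, type) pairs the way D → id : T does."""
    offset = 0                            # M1 → ∈ sets the offset to 0
    for name, type_ in declarations:
        table.enter(name, type_, offset)
        offset += WIDTH[type_]            # offset := offset + T.width
    table.addwidth(offset)
    return offset


outer = SymbolTable()
declare(outer, [("a", "integer"), ("b", "double")])
inner = SymbolTable(previous=outer)       # entering a nested block
declare(inner, [("c", "integer")])
print(outer.lookup("b"), inner.lookup("a"), inner.lookup("missing"))
```

Each table starts its own offset at 0 in this sketch, so the stack of saved offsets from the Block rule is implicit in the local variable `offset`.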
Error handling routine: error-msg (error information);
The error messages can be written and stored in another file.

Temp space management:
• This is used for generating code for expressions.
• newtemp (): allocates a temp space.
• freetemp (t): frees t if it is allocated in the temp space.

Label management
• This is needed in generating branching statements.
• newlabel (): generates a label in the target code that has never been used.

Names in the symbol table

S → id := E   {p := lookup (id.name, top (tblptr));
               if p is not null then gen (p, ":=", E.place);
               else error ("var undefined", id.name);}
E → E1 + E2   {E.place = newtemp ();
               gen (E.place, ":=", E1.place, "+", E2.place);
               freetemp (E1.place);
               freetemp (E2.place);}
E → -E1       {E.place = newtemp ();
               gen (E.place, ":=", "uminus", E1.place);
               freetemp (E1.place);}
E → (E1)      {E.place = E1.place;}
E → id        {p := lookup (id.name, top (tblptr));
               if p ≠ null then E.place = p.place
               else error ("var undefined", id.name);}

Type conversions

Assume there are only two data types: integer and float.
For the expression E → E1 + E2:

if E1.type = E2.type then
    generate no conversion code
    E.type = E1.type;
else
    E.type = float;
    temp1 = newtemp ();
    if E1.type = integer then
        gen (temp1, ':=', int-to-float, E1.place);
        gen (E, ':=', temp1, '+', E2.place);
    else
        gen (temp1, ':=', int-to-float, E2.place);
        gen (E, ':=', temp1, '+', E1.place);
    freetemp (temp1);

Addressing array elements

Let us assume
low: lower bound
w: element data width
start_addr: starting address

1D array: A[i]
• start_addr + (i - low) * w = i * w + (start_addr - low * w)
• The value called base, (start_addr - low * w), can be computed at compile time and then stored in the symbol table.

Example: array [-8 … 100] of integer.
This declares an integer array with elements [-8], [-7], …, [100] in Pascal.

2D array A[i1, i2]
Row major order: row by row. A[i] means the ith row.
1st row: A[1, 1], A[1, 2]
2nd row: A[2, 1], A[2, 2]
A[i, j] = A[i][j]
Column major order: column by column.
1st column: A[1, 1], A[2, 1]
2nd column: A[1, 2], A[2, 2]

Address for A[i1, i2]:
start_addr + ((i1 - low1) * n2 + (i2 - low2)) * w
where low1 and low2 are the lower bounds of i1 and i2, n2 is the number of values that i2 can take, and high2 is the upper bound on the value of i2: n2 = high2 - low2 + 1.
We can rewrite the address for A[i1, i2] as ((i1 × n2) + i2) × w + (start_addr - ((low1 × n2) + low2) × w). The value (start_addr - low1 × n2 × w - low2 × w) can be computed at compile time and then stored in the symbol table.

Multi-dimensional array A[i1, i2, …, ik]
Address for A[i1, i2, …, ik]:

$$\Bigl(i_1 \prod_{j=2}^{k} n_j + i_2 \prod_{j=3}^{k} n_j + \cdots + i_k\Bigr) \times w + \Bigl(\text{start\_addr} - low_1 \times w \times \prod_{j=2}^{k} n_j - low_2 \times w \times \prod_{j=3}^{k} n_j - \cdots - low_k \times w\Bigr)$$

The first term can be computed incrementally in the grammar rules:
f (1) = i1;
f (j) = f (j - 1) * nj + ij;
f (k) is the value we wanted to compute.

Attributes needed in the translation scheme for addressing array elements:
elesize: size of each element in the array.
array: a pointer to the symbol table entry containing information about the array declaration.
ndim: the current dimension index.
base: base address of this array.
place: where a variable is stored.
limit (array, m) = nm is the number of elements in the mth coordinate.

Translation scheme for array elements

Consider the grammar
S → L := E
E → L
L → id
L → Elist ]
Elist → Elist1, E
Elist → id [ E
E → id
E → E + E
E → (E)

• S → L := E    {if L.offset = null then /* L is a simple id */
                     gen (L.place, ":=", E.place);
                 else
                     gen (L.place, "[", L.offset, "]", ":=", E.place);}
• E → E1 + E2   {E.place = newtemp ();
                 gen (E.place, ":=", E1.place, "+", E2.place);}
• E → (E1)      {E.place = E1.place}
• E → L         {if L.offset = null then /* L is a simple id */
                     E.place := L.place;
                 else begin
                     E.place := newtemp ();
                     gen (E.place, ":=", L.place, "[", L.offset, "]");
                 end}
• L → id        {p := lookup (id.name, top (tblptr));
                 if p ≠ null then
                 begin
                     L.place := p.place;
                     L.offset := null;
                 end
                 else
                     error ("var undefined", id.name);}
• L → Elist ]   {L.offset := newtemp ();
                 gen (L.offset, ":=", Elist.elesize, "*", Elist.place);
                 freetemp (Elist.place);
                 L.place := Elist.base;}
• Elist → Elist1, E
                {t := newtemp (); m := Elist1.ndim + 1;
                 gen (t, ":=", Elist1.place, "*", limit (Elist1.array, m));
                 gen (t, ":=", t, "+", E.place); freetemp (E.place);
                 Elist.array := Elist1.array;
                 Elist.place := t; Elist.ndim := m;}
• Elist → id [ E
                {Elist.place := E.place; Elist.ndim := 1;
                 p := lookup (id.name, top (tblptr)); check for id errors;
                 Elist.elesize := p.size; Elist.base := p.base;
                 Elist.array := p.place;}
• E → id        {p := lookup (id.name, top (tblptr));
                 check for id errors; E.place := p.place;}

Boolean Expressions

There are two choices for the implementation of Boolean expressions:
1. Numerical representation
2. Flow of control

Numerical representation: Encode the true and false values numerically, 1: true, 0: false.
Flow of control: Represent the value of a Boolean expression by a position reached in the program.

Short circuit code: Generate the code to evaluate a Boolean expression in such a way that it is not necessary for the code to evaluate the entire expression.
• a1 or a2: if a1 is true then a2 is not evaluated.
• a1 and a2: if a1 is false then a2 is not evaluated.

Numerical representation

B → id1 relop id2
{B.place := newtemp ();
 gen ("if", id1.place, relop.op, id2.place, "goto", nextstat + 3);
 gen (B.place, ":=", "0");
 gen ("goto", nextstat + 2);
 gen (B.place, ":=", "1");}

Example 1: Translate the statement (if a < b or c < d and e < f) without short circuit evaluation.

100: if a < b goto 103
101: t1 := 0
102: goto 104
103: t1 := 1 /* true */
104: if c < d goto 107
105: t2 := 0 /* false */
106: goto 108
107: t2 := 1
108: if e < f goto 111
109: t3 := 0
110: goto 112
111: t3 := 1
112: t4 := t2 and t3
113: t5 := t1 or t4

Flow of Control Statements

B → id1 relop id2
{
    B.true := newlabel ();
    B.false := newlabel ();
    B.code := gen ("if", id1, relop, id2, "goto",
                   B.true, "else", "goto", B.false) ||
              gen (B.true, ":")
}
S → if B then S1    S.code := B.code || S1.code || gen (B.false, ':')

|| is the code concatenation operator.

1. If-then implementation:
S → if B then S1 {gen (B.false, ":");}

[Figure: code layout. B.code, with jumps to B.true and to B.false; B.true: S1.code; B.false: (next statement).]

2. If-then-else
P → S    {S.next := newlabel ();
          P.code := S.code || gen (S.next, ":")}
S → if B then S1 else S2
         {S1.next := S.next;
          S2.next := S.next;
          S.code := B.code || S1.code ||
                    gen ("goto", S.next) || gen (B.false, ":")
                    || S2.code}

We need to use the inherited attributes of S to define the attributes of S1 and S2.

[Figure: code layout. B.code, with jumps to B.true and to B.false; B.true: S1.code; goto S.next; B.false: S2.code; S.next: (next statement).]

3. While loop:
B → id1 relop id2    B.true := newlabel ();
                     B.false := newlabel ();
                     B.code := gen ('if', id1, relop, id2, 'goto', B.true, 'else', 'goto', B.false) ||
                               gen (B.true, ':');
S → while B do S1    S.begin := newlabel ();
                     S.code := gen (S.begin, ':') || B.code || S1.code ||
                               gen ('goto', S.begin) || gen (B.false, ':');

[Figure: code layout. S.begin: B.code, with jumps to B.true and to B.false; B.true: S1.code; goto S.begin; B.false: (next statement).]

4. Switch/case statement:
The C-like syntax of switch case is
switch expr {
    case V[1]: S[1]
    .
    .
    .
    case V[k]: S[k]
    default: S[d]
}

Translation sequence
• Evaluate the expression.
• Find which value in the list matches the value of the expression; match default only if there is no match.
• Execute the statement associated with the matched value.

How to find the matched value? The matched value can be found in the following ways:
1. Sequential test
2. Lookup table
3. Hash table
4. Back patching

Two different translation schemes for sequential test are shown below:

1.        code to evaluate E into t
          goto test
   L[1]:  code for S[1]
          goto next
          .
          .
          .
   L[k]:  code for S[k]
          goto next
   L[d]:  code for S[d]
          goto next
   test:  if t = V[1] goto L[1]
          .
          .
          .
          goto L[d]
   next:

2. Can easily be converted into a lookup table

            if t <> V[1] goto L[1]
            code for S[1]
            goto next
   L[1]:    if t <> V[2] goto L[2]
            code for S[2]
            goto next
            .
            .
            .
   L[k-1]:  if t <> V[k] goto L[k]
            code for S[k]
            goto next
   L[k]:    code for S[d]
   next:
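The first sequential-test scheme can be sketched as a small Python generator that lays out the case bodies first and the chain of tests last. The function name `switch_sequential` and its argument layout are assumptions made for this illustration; the labels `L1`, `Ld`, `test` and `next` follow the naming used in the scheme, and the expression value is assumed to be left in `t`.

```python
def switch_sequential(expr_code, cases, default_code):
    """Emits scheme 1: bodies first, then the sequence of tests.

    expr_code    -- statements that evaluate the switch expression into t
    cases        -- list of (value, body statements) in source order
    default_code -- statements of the default branch
    """
    out = list(expr_code)
    out.append("goto test")
    for i, (_, body) in enumerate(cases, start=1):
        out.append(f"L{i}:")
        out.extend(body)
        out.append("goto next")
    out.append("Ld:")
    out.extend(default_code)
    out.append("goto next")
    out.append("test:")
    for i, (value, _) in enumerate(cases, start=1):
        out.append(f"if t = {value} goto L{i}")
    out.append("goto Ld")            # no value matched: take the default branch
    out.append("next:")
    return out


generated = switch_sequential(
    expr_code=["t := a + b"],
    cases=[(1, ["x := 10"]), (2, ["x := 20"])],
    default_code=["x := 0"],
)
print("\n".join(generated))
```

Placing the tests at the end means the generator does not need to know all case values before it starts emitting the bodies, which suits a single left-to-right pass over the source.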
2. Lookup table: Use a table and a loop to find the address to jump to.

   Value   Label
   V[1]    L[1]        L[1]: S[1]
   V[2]    L[2]        L[2]: S[2]
   V[3]    L[3]

3. Hash table: When there are more than two entries, use a hash table to find the correct table entry.
4. Back patching:
   • Generate a series of branching statements with the targets of the jumps temporarily left unspecified.
   • The targets are determined later from a label table: each entry contains a list of places that need to be back patched.
   • Can also be used to implement labels and gotos.

Procedure Calls

• Space must be allocated for the activation record of the called procedure.
• Arguments are evaluated and made available to the called procedure in a known place.
• Save the current machine status.
• When a procedure returns:
  • Place the return value in a known place.
  • Restore the activation record.

Example:
S → call id (Elist)    {for each item p on the queue Elist.queue do
                            gen ('PARAM', p);
                        gen ('call', id.place);}
Elist → Elist, E       {append E.place to the end of Elist.queue}
Elist → E              {initialize Elist.queue to contain only E.place}

Use a queue to hold the parameters, then generate the code for the params.

Code for E1, store in t1
.
.
.
Code for Ek, store in tk
PARAM t1
.
.
.
PARAM tk
Call P

Terminology:
Procedure declaration: parameters, formal parameters.
Procedure call: arguments, actual parameters.
The values of a variable: x = y
r-value: the value of the variable, i.e., on the right side of the assignment. Ex: y, in the above assignment.
l-value: the location/address of the variable, i.e., on the left side of the assignment. Ex: x, in the above assignment.

There are different modes of parameter passing
1. call-by-value
2. call-by-reference
3. call-by-value-result (copy-restore)
4. call-by-name

Call by value
The calling procedure copies the r-values of the arguments into the called procedure's activation record.
Changing a formal parameter has no effect on the actual parameter.

Example:
#include <stdio.h>

void add (int C)
{
    C = C + 10;
    printf ("\nc = %d", C);    /* prints the local copy */
}
int main (void)
{
    int a = 5;
    printf ("a = %d", a);
    add (a);
    printf ("\na = %d", a);
    return 0;
}

In main, a will not be affected by calling add (a).
It prints
a = 5
c = 15
a = 5
Only the value of C in add () is changed to 15.

Usage:
1. Used by PASCAL and C++ if we use non-var parameters.
2. The only mode used in C.

Advantages:
1. No aliasing.
2. Easier for static optimization analysis.
3. Faster execution because there is no need for redirecting.

Call by reference
The calling procedure copies the l-values of the arguments into the called procedure's activation record, i.e., the address is passed to the called procedure.
• Changing a formal parameter affects the corresponding actual parameter.
• It will have some side effects.

Example:
#include <stdio.h>

void add (int *c)
{
    *c = *c + 10;
    printf ("\nc = %d", *c);
}
int main (void)
{
    int a = 5;
    printf ("\na = %d", a);
    add (&a);
    printf ("\na = %d", a);
    return 0;
}

Output:
a = 5
c = 15
a = 15
That is, here the actual parameter is also modified.

Advantages
1. Efficiency in passing large objects.
2. Only addresses need to be copied.

Call-by-value-result
Equivalent to call-by-reference except when there is aliasing. That is, the program produces the same result, but not the same code will be generated.
Aliasing: Two expressions that have the same l-value are called aliases. They access the same location from different places.
Aliasing happens through pointer manipulation, or through:
1. Call by reference with a global variable as an argument.
2. Call by reference with the same expression as an argument twice.
Example: test (x, y, x)

Advantages:
1. If there is no aliasing, we can implement it by using call-by-reference for large objects.
2. No implicit side effect if pointers are not passed.

Call by name
Used in Algol.
• The procedure body is substituted for the call in the calling procedure.
• Each occurrence of a parameter in the called procedure is replaced with the corresponding argument.
• Similar to macro expansion.
• A parameter is not evaluated unless its value is needed during computation.

Example:
void show (int x)
{
    for (int y = 0; y < 10; y++)
        x++;
}
main ()
{
    int j;
    j = -1;
    show (j);
}

After substitution, it will be like this:
main ()
{
    int j;
    j = -1;
    for (int y = 0; y < 10; y++)
        j++;
}

• Instead of passing values or addresses as arguments, a function is passed for each argument.
• These functions are called thunks.
• Each time a parameter is used, the thunk is called, then the address returned by the thunk is used.
  y = 0: use the return value of the thunk for y as the l-value.

Advantages
• More efficient when passing parameters that are never used.
• This saves a lot of time when evaluating an unused parameter would take a long time.

Code Generation

Code generation is the final phase of the compiler model.

[Figure: Source program → Front end → Intermediate code → Code optimization → Intermediate code → Code generation → Target program.]

The requirements imposed on a code generator are
1. Output code must be correct.
2. Output code must be of high quality.
3. The code generator should run efficiently.

Issues in the Design of a Code Generator

The generic issues in the design of code generators are
• Input to the code generator
• Target programs
• Memory management
• Instruction selection
• Register allocation
• Choice of evaluation order

Input to the code generator
The intermediate representation together with the symbol table is the input to the code generator.
• High-level intermediate representation
  Example: Abstract Syntax Tree (AST)
• Medium-level intermediate representation
  Example: control flow graph of complex operations
• Low-level intermediate representation
  Example: Quadruples, DAGs
• Code for an abstract stack machine, i.e., postfix code.

Target programs
The output of the code generator is the target program. The output may take on a variety of forms:
1. Absolute machine language
2. Relocatable machine language
3. Assembly language

Absolute machine language
• The final memory area for a program is statically known.
• Hard coded addresses.
• Sufficient for very simple systems.
Advantages:
• Fast for small programs
• No separate compilation
Disadvantages: Cannot call modules from other languages/compilers.

Relocatable code: It needs
• A relocation table
• A relocating linker + loader (or) runtime relocation in the Memory Management Unit (MMU).
Advantage: More flexible.

Assembly language: Generates assembly code and uses an assembler tool to convert this to binary (object) code. It needs (i) an assembler and (ii) a linker and loader.
Advantage: Easier to handle and closer to the machine.

Memory management
Mapping names in the source program to addresses of data objects in runtime memory is done by the front end and the code generator.
• A name in a three address statement refers to a symbol table entry for the name.
• Stack, heap and garbage collection are handled here.

Instruction selection
Instruction selection depends on factors like
• Uniformity
• Completeness of the instruction set
• Instruction speed
• Machine idioms
• Choose a set of instructions equivalent to the intermediate representation code.
• Minimize execution time, used registers and code size.

Example: x = y + z in three address statements:
MOV y, R0    /* load y into R0 */
ADD z, R0
MOV R0, x    /* store R0 into x */

Register allocation
• Instructions with register operands are faster. So, keep frequently used values in registers.
• Some registers are reserved.
  Example: SP, PC, etc.
• Minimize the number of loads and stores.

Evaluation order
• The order of evaluation can affect the efficiency of the target code.
• Some orders require fewer registers to hold intermediate results.

Target Machine

Let us assume the target computer is
• Byte addressable with 4 bytes per word
• It has n general purpose registers R0, R1, R2, …, Rn-1
• It has 2-address instructions of the form
  OP source, destination    [cost: 1 + added]
Example: The op may be MOV, ADD, MUL.

Generally the cost will be like this:

Source     Destination   Cost
Register   Register      1
Register   Memory        2
Memory     Register      2
Memory     Memory        3

Addressing modes:

Mode                Form     Address                     Cost
Absolute            M        M                           2
Register            R        R                           1
Indexed             C(R)     C + contents (R)            2
Indirect register   *R       contents (R)                1
Indirect indexed    *C(R)    contents (C + contents (R)) 2

Example: x := y - z
MOV y, R0    → cost = 2
SUB z, R0    → cost = 2
MOV R0, x    → cost = 2
Total cost = 6
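The cost calculation in the example above can be reproduced with a short Python sketch. It assumes one reading of the cost tables: every instruction costs 1, and each operand that is not a plain or indirect register (that is, any operand needing a memory address or constant) adds 1. The names `operand_cost` and `instruction_cost` and the register pattern are assumptions made for this illustration.

```python
import re

# R0 and *R0 need no extra word; M, C(R) and *C(R) are assumed to add one.
REGISTER = re.compile(r"^\*?R\d+$")


def operand_cost(operand):
    return 0 if REGISTER.match(operand) else 1


def instruction_cost(instruction):
    """Cost of 'OP source, destination' as 1 + added cost of both operands."""
    _, operands = instruction.split(None, 1)
    source, destination = [part.strip() for part in operands.split(",")]
    return 1 + operand_cost(source) + operand_cost(destination)


program = ["MOV y, R0", "SUB z, R0", "MOV R0, x"]
costs = [instruction_cost(instruction) for instruction in program]
total = sum(costs)
print(costs, total)
```

Under this reading a register-to-register instruction costs 1 and a memory-to-memory instruction costs 3, which matches the source/destination table.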
Runtime Storage Management

Storage Organization
To run a compiled program, the compiler requests a block of memory from the operating system. This block of memory is called runtime storage.
The runtime storage is subdivided into the generated target code, data objects and information which keeps track of procedure activations.
The fixed data (generated code) is stored in a statically determined area of the memory. The target code is placed at the lower end of the memory.
The data objects are stored in a statically determined area when their size is known at compile time. The compiler stores these data objects in a statically determined area because their addresses are compiled into the target code. This static data area is placed on top of the code area.
The runtime storage also contains the stack and the heap. The stack contains activation records and the program counter; data objects within an activation record are also stored in the stack with the relevant information.
The heap area allocates the memory for dynamic data (for example, data items that are allocated under program control).
The sizes of the stack and heap grow or shrink according to the program execution.

Activation Record
Information needed during an execution of a procedure is kept in a block of storage called an activation record.
• Storage for names local to the procedure appears in the activation record.
• Each execution of a procedure is referred to as an activation of the procedure.
• If the procedure is recursive, several of its activations might be alive at a given time.
• Runtime storage is subdivided into
  1. Generated target code area
  2. Data objects area
  3. Stack
  4. Heap

[Figure: memory layout, from top to bottom: Code, Static data, Stack, …, Heap.]

• The sizes of the stack and heap can change during program execution.

For code generation there are two standard storage allocations:
1. Static allocation: The position of an activation record in memory is fixed at compile time.
2. Stack allocation: A new activation record is pushed onto the stack for each execution of the procedure. The record is popped when the activation ends.

Control stack: The control stack is used for managing active procedures. When a call occurs, the execution of the current activation is interrupted and its status information is saved on the stack.
When control returns from a call, the suspended activation is resumed after restoring the values of the relevant registers, including the program counter, which is set to point immediately after the call.
The size of the stack is not fixed.

Scope of declarations: The scope of a declaration refers to a certain portion of the program text, determined by the rules of the language.
Within the defined scope, an entity can legally access the declared entities.
The scope of a declaration always contains the immediate scope. The immediate scope is the region of the declarative portion that immediately encloses the declaration.
The scope starts at the beginning of the declaration and continues till the end of the declaration, whereas for an overloadable declaration the immediate scope begins when the callable entity profile is determined.
The visible part refers to the text portion of the declaration which is visible from outside.

Flow Graph
A flow graph is a graph representation of three address statement sequences.
• Useful for code generation algorithms.
• Nodes in the flow graph represent computations.
• Edges represent the flow of control.

Basic Blocks
Basic blocks are sequences of consecutive statements in which the flow of control enters at the beginning and leaves at the end without a halt or branching.
1. First determine the set of leaders
   • The first statement is a leader.
   • Any target of a goto is a leader.
   • Any statement that follows a goto is a leader.
2. For each leader, its basic block consists of the leader and all statements up to the next leader.
Initial node: the block whose leader is the first statement.

Example: Consider the following fragment of code that computes the dot product of two vectors x and y of length 10.
begin
    Prod := 0;
i := 1;
repeat
begin
Prod := Prod + x[i] * y[i];
i := i + 1;
end
until i > 10;
end

B1   (1) Prod := 0
     (2) i := 1
B2   (3) t1 := 4 * i
     (4) t2 := x[t1]
     (5) t3 := 4 * i
     (6) t4 := y[t3]
     (7) t5 := t2 * t4
     (8) t6 := Prod + t5
     (9) Prod := t6
     (10) t7 := i + 1
     (11) i := t7
     (12) if i <= 10 goto (3)

The flow graph for this code will be

[Figure: flow graph with an edge from b1 to b2 and an edge from b2 back to itself]

Here b1 is the initial node/block.
•• Once the basic blocks have been defined, a number of transformations can be applied to them to improve the quality of code.
   1. Global: Data flow analysis
   2. Local:
      •• Structure preserving transformations
      •• Algebraic transformations
•• Basic blocks compute a set of expressions. These expressions are the values of the names live on exit from the block.
•• Two basic blocks are equivalent if they compute the same set of expressions.

Structure preserving transformations:
1. Common sub-expression elimination:
      a := b + c          a := b + c
      b := a - d          b := a - d
      c := b + c    ⇒     c := b + c
      d := a - d          d := b
2. Dead code elimination: Code that computes values for names that will be dead, i.e., never subsequently used, can be removed.
3. Renaming of temporary variables
4. Interchange of two independent adjacent statements

Algebraic Transformations
Algebraic identities represent another important class of optimizations on basic blocks. For example, we may apply arithmetic identities, such as
x + 0 = 0 + x = x
x * 1 = 1 * x = x
x - 0 = x
x / 1 = x

Next-Use Information
•• Next-use information is used in code generation and register allocation.
•• Remove variables from registers if they are not used again.
•• A statement of the form A = B op C defines A and uses B and C.
•• Scan each basic block backwards.
•• Assume all temporaries are dead on exit and all user variables are live on exit.

Algorithm to compute next use information
Suppose we are scanning
i: x := y op z
in backward scan
•• attach to i the information currently in the symbol table about x, y, z.
•• set x to not live and no next-use in the symbol table.
•• set y and z to be live, with next use i, in the symbol table.
Consider the following code:
1: t1 = a * a
2: t2 = a * b
3: t3 = 2 * t2
4: t4 = t1 + t3
5: t5 = b * b
6: t6 = t4 + t5
7: x = t6

Statements:
7: no temporary is live
6: t6: use (7); t4, t5 not live
5: t5: use (6)
4: t4: use (6); t1, t3 not live
3: t3: use (4); t2 not live
2: t2: use (3)
1: t1: use (4)
Symbol Table:
t1   dead   use in 4
t2   dead   use in 3
t3   dead   use in 4
t4   dead   use in 6
t5   dead   use in 6
t6   dead   use in 7
The six temporaries in the basic block can be packed into two locations t1 and t2:
1: t1 = a * a
2: t2 = a * b
3: t2 = 2 * t2
4: t1 = t1 + t2
5: t2 = b * b
6: t1 = t1 + t2
7: x = t1

Code Generator
•• Consider each statement.
•• Remember if an operand is in a register.
•• Descriptors are used to keep track of register contents and addresses for names.
•• There are 2 types of descriptors
   1. Register Descriptor
   2. Address Descriptor

Register Descriptor
Keeps track of what is currently in each register. Initially all registers are empty.

Address Descriptors
•• Keep track of the location where the current value of the name can be found at runtime.
•• The location might be a register, stack, memory address or a set of all these.

Issues in design of code generation  The issues in the design of code generation are
1. Intermediate representation
2. Target code
3. Address mapping
4. Instruction set.

Intermediate Representation  It is represented in postfix, 3-address code (or) quadruples and syntax tree (or) DAG.

Target Code  The target code could be absolute code, relocatable machine code (or) assembly language code. Absolute code can be executed immediately because it has fixed addresses. Relocatable code requires a linker and loader to place the code at the appropriate location. Assembly code requires an assembler to convert it into machine-level code before execution.

Address mapping  A mapping is defined from names in the intermediate representation to addresses in the target code. It is based on the run-time environment, such as static, stack or heap allocation.

Instruction set  It should provide a complete set, in such a way that all operations of the intermediate representation can be implemented.

Code Generation Algorithm
For each three address statement x = y op z do
•• Invoke a function getreg to determine location L where x must be stored. Usually L is a register.
•• Consult the address descriptor of y to determine y′, a current location of y. Prefer a register for y′. If the value of y is not already in L, generate
      MOV y′, L
•• Generate
      OP z′, L
   where z′ is a current location of z. Again prefer a register for z′. Update the address descriptor of x to indicate that x is in L. If L is a register, update its descriptor to indicate that it contains x, and remove x from all other register descriptors.
•• If the current values of y and/or z have no next use, are dead on exit from the block and are in registers, then change the register descriptor to indicate that it no longer contains y and/or z.

Function getreg
1. If y is in a register, and y is not live and has no next use after x = y OP z, then return the register of y for L.
2. Failing (1), return an empty register.
3. Failing (2), if x has a next use in the block or OP requires a register, then get a register R, store its contents into a memory location M and use it.
4. Else select the memory location of x as L.

Example:  d := (a - b) + (a - c) + (a - c)

Stmt        Code generated    Register descriptor    Address descriptor
t = a - b   MOV a, R0         R0 contains t          t in R0
            SUB b, R0
u = a - c   MOV a, R1         R0 contains t          t in R0
            SUB c, R1         R1 contains u          u in R1
v = t + u   ADD R1, R0        R0 contains v          u in R1
                              R1 contains u          v in R0
d = v + u   ADD R1, R0        R0 contains d          d in R0
            MOV R0, d                                d in R0 and memory

Conditional Statements
Machines implement conditional jumps in 2 ways:
1. Based on the value of a designated register (R)
   Branch if the value of R meets one of six conditions.
   (i) Negative       (ii) Zero
   (iii) Positive     (iv) Non-negative
   (v) Non-zero       (vi) Non-positive
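To make the code generation algorithm and getreg described above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the book's implementation: the names live_after and generate, the tuple layout (x, y, op, z), the OPCODES table and the choice to spill the first register in rule 3 are all hypothetical. Next-use information is approximated by a backward liveness scan over the block.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}  # assumed mnemonics

def live_after(block, live_on_exit):
    """Backward scan: result[i] is the set of names live just after statement i."""
    live = set(live_on_exit)
    result = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):
        x, y, _, z = block[i]
        result[i] = set(live)
        live.discard(x)       # x is defined here, so it is dead just before i
        live.update((y, z))   # y and z are used here, so they are live before i
    return result

def generate(block, live_on_exit, num_regs=2):
    alive = live_after(block, live_on_exit)
    reg_desc = {f"R{n}": None for n in range(num_regs)}  # register -> name held
    addr_desc = {}                                       # name -> register holding it
    code = []

    def getreg(y, i):
        reg = addr_desc.get(y)
        if reg is not None and y not in alive[i]:
            return reg                        # rule 1: reuse the register of y
        for reg, held in reg_desc.items():
            if held is None:
                return reg                    # rule 2: an empty register
        reg = next(iter(reg_desc))            # rule 3 (simplified): spill a register
        code.append(f"MOV {reg}, {reg_desc[reg]}")
        del addr_desc[reg_desc[reg]]
        reg_desc[reg] = None
        return reg

    for i, (x, y, op, z) in enumerate(block):
        L = getreg(y, i)
        y_loc = addr_desc.get(y, y)
        if y_loc != L:
            code.append(f"MOV {y_loc}, {L}")
        code.append(f"{OPCODES[op]} {addr_desc.get(z, z)}, {L}")
        # L now holds only x; drop stale entries for L and for an old copy of x.
        for name in [n for n, r in addr_desc.items() if r == L or n == x]:
            reg_desc[addr_desc.pop(name)] = None
        reg_desc[L], addr_desc[x] = x, L
        # Free registers whose operands have no next use.
        for name in (y, z):
            reg = addr_desc.get(name)
            if reg is not None and name != x and name not in alive[i]:
                reg_desc[reg] = None
                del addr_desc[name]
    for name in sorted(live_on_exit):         # store live names back to memory
        if name in addr_desc:
            code.append(f"MOV {addr_desc[name]}, {name}")
    return code

example_block = [("t", "a", "-", "b"), ("u", "a", "-", "c"),
                 ("v", "t", "+", "u"), ("d", "v", "+", "u")]
print("\n".join(generate(example_block, {"d"})))
```

For the block t = a - b, u = a - c, v = t + u, d = v + u with only d live on exit, this sketch emits the same seven instructions as the table above. Rule 4 of getreg (using a memory location for x) is left out to keep the sketch short.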
Example:  Three address statement: if x < y goto z
It can be implemented by subtracting y from x in R, then jumping to z if the value of R is negative.
2. Based on a set of condition codes that indicate whether the last quantity computed or loaded into a location is negative (or) zero (or) positive.
   •• A compare instruction sets the condition codes without actually computing the value.
      Example:  CMP x, y
                CJL z
   •• The code generator maintains a condition code descriptor, which tells the name that last set the condition codes.
      Example:  x := y + z
                if x < 0 goto z
      can be implemented by
                MOV y, R0
                ADD z, R0
                MOV R0, x
                CJN z

DAG Representation of Basic Blocks
•• DAGs are useful data structures for implementing transformations on basic blocks.
•• A DAG tells how the value computed by a statement is used in subsequent statements.
•• It is a good way of determining common sub expressions.
•• A DAG for a basic block has the following labels on the nodes:
   •• Leaves are labeled by unique identifiers, either variable names or constants.
   •• Interior nodes are labeled by an operator symbol.
   •• Nodes are also optionally given a sequence of identifiers as labels.
Example:
 1: t1 := 4 * i
 2: t2 := a[t1]
 3: t3 := 4 * i
 4: t4 := b[t3]
 5: t5 := t2 * t4
 6: t6 := prod + t5
 7: prod := t6
 8: t7 := i + 1
 9: i := t7
10: if i <= 20 goto (1)

[Figure: DAG for the block. Node + (t6, prod) has children prod0 and * (t5). Node * (t5) has children [] (t2) and [] (t4), which index a and b by the shared node * (t1, t3) over the leaves 4 and i0. Node <= with target (1) has children + (t7, i) and 20. Node + (t7, i) has children i0 and 1.]

Code Generation from DAG:
The original code (left) and the code regenerated from the DAG (right):

S1 = 4 * i                S1 = 4 * i
S2 = add(A) - 4           S2 = add(A) - 4
S3 = S2[S1]               S3 = S2[S1]
S4 = 4 * i
S5 = add(B) - 4           S5 = add(B) - 4
S6 = S5[S4]               S6 = S5[S1]
S7 = S3 * S6              S7 = S3 * S6
S8 = prod + S7            prod = prod + S7
prod = S8
S9 = I + 1
I = S9                    I = I + 1
if I <= 20 goto (1)       if I <= 20 goto (1)

Rearranging order of the code
Consider the following basic block
t1 := a + b
t2 := c + d
t3 := e - t2
x := t1 - t3
and its DAG

[Figure: DAG with root - (x) whose children are + (t1) and - (t3); + (t1) has leaves a and b; - (t3) has leaf e and child + (t2); + (t2) has leaves c and d]

Three address code for the DAG:
(Assuming only two registers are available)
MOV a, R0
ADD b, R0
MOV c, R1
ADD d, R1
MOV R0, t1      Register Spilling
MOV e, R0
SUB R1, R0
MOV t1, R1      Register Reloading
SUB R0, R1
MOV R1, x

Rearranging the code as
t2 := c + d
t3 := e - t2
t1 := a + b
x := t1 - t3
The rearrangement gives the code:
MOV c, R0
ADD d, R0
MOV e, R1
SUB R0, R1
MOV a, R0
ADD b, R0
SUB R1, R0
MOV R0, x

Error detection and Recovery  The errors that arise while compiling are
1. Lexical errors
2. Syntactic errors
3. Semantic errors
4. Run-time errors

Lexical errors  A lexical error occurs when variables (or) constants are declared (or) defined in a way that does not follow the rules of the language, when special symbols that are not part of the language are included, etc.
The lexical analyzer is constructed from pattern recognizing rules that form tokens. When the source code is split into tokens and these tokens do not follow the rules, errors are generated.
Consider a C program statement
Printf ('Hello World');
Here Printf, (, ', Hello World, ', ), ; are tokens.
Printf is not a recognizable pattern; it should be printf. It generates an error.

Syntactic error  These errors include missing semicolons, missing braces, etc., which violate the grammar rules of the language. The parser reports these errors.

Semantic errors  This type of error arises when an operation is performed over incompatible types of variables, on double declaration, on assigning values to undefined variables, etc.

Runtime errors  Runtime errors are the ones detected at runtime. These include dereferencing pointers assigned NULL values, accessing a variable outside its bounds, illegal arithmetic operations, etc.
After the detection of errors, the following recovery strategies should be implemented.
  1. Panic mode recovery
  2. Phrase level recovery
  3. Error production
  4. Global correction.

Peephole Optimization
•• Target code often contains redundant instructions and suboptimal constructs.
•• Peephole optimization improves the performance of the target program by examining a short sequence of target instructions (the peephole) and replacing these instructions by a shorter or faster sequence.
•• The peephole is a small, moving window on the target program. Some well known peephole optimizations are
   1. Eliminating redundant instructions
   2. Eliminating unreachable code
   3. Flow of control optimizations or eliminating jumps over jumps
   4. Algebraic simplifications
   5. Strength reduction
   6. Use of machine idioms

Elimination of Redundant Loads and Stores
Example 1:  (1) MOV R0, a
            (2) MOV a, R0
We can delete instruction (2), because the value of a is already in R0.
Example 2:  Load x, R0
            Store R0, x
If there are no modifications to R0 or x, then the store instruction can be deleted.
Example 3:  (1) Load x, R0
            (2) Store R0, x
Example 4:  (1) Store R0, x
            (2) Load x, R0
The second instruction can be deleted from both examples 3 and 4.
Example 5:  Store R0, x
            Load x, R0
Here the load instruction can be deleted.

Eliminating Unreachable Code
An unlabeled instruction immediately following an unconditional jump may be removed.
•• May be produced due to debugging code introduced during development.
•• May be due to updates in programs without considering the whole program segment.
Example:  Let print = 0

    if print = 1 goto L1
    goto L2
    L1: print instructions
    L2:
⇒
    if print != 1 goto L2
    print instructions
    L2:
⇒
    if 0 != 1 goto L2
    print instructions
    L2:
⇒
    goto L2
    print instructions
    L2:

In all of the above cases the print instructions are unreachable.
∴ The print instructions can be eliminated.
Example:  goto L2
          …
          L2:

Flow of control optimizations  Unnecessary jumps can be eliminated.
Jumps like:
Jumps to jumps,
Jumps to conditional jumps,
Conditional jumps to jumps.
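Returning to the elimination of redundant loads and stores described above, the rule can be sketched as a small peephole pass in Python. This is only an illustration: the function name remove_redundant_moves and the tuple layout (label, op, src, dst) are assumptions made for the example. The check on the label reflects that a labeled instruction may be reached by a jump, in which case the preceding move is not guaranteed to have executed.

```python
def remove_redundant_moves(code):
    """Peephole pass over a list of (label_or_None, op, src, dst) tuples.

    Deletes an unlabeled 'MOV a, R0' that immediately follows 'MOV R0, a'
    (or the mirror-image store after a load), since the value is already
    in place.
    """
    result = []
    for instr in code:
        label, op, src, dst = instr
        if result and label is None and op == "MOV":
            _, prev_op, prev_src, prev_dst = result[-1]
            if prev_op == "MOV" and prev_src == dst and prev_dst == src:
                continue  # redundant: it just copies the value back
        result.append(instr)
    return result

store_then_load = [(None, "MOV", "R0", "a"), (None, "MOV", "a", "R0")]
print(remove_redundant_moves(store_then_load))
```

In this sketch a window of two instructions is enough, because the rule only looks at the instruction just emitted and the one about to be emitted.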
Example 1:  We can replace the jump sequence
    goto L1
    …
    L1: goto L2
by the sequence
    goto L2
    …
    L1: goto L2
If there are no jumps to L1, then it may be possible to eliminate the statement L1: goto L2.

Example 2:  If there is only one jump to L1, the sequence
    goto L1
    ...
    L1: if a < b goto L2
    L3:
can be replaced by
    if a < b goto L2
    goto L3
    ...
    L3:
which sometimes skips "goto L3".

Reduction in strength
•• x² is cheaper to implement as x * x than as a call to an exponentiation routine.
•• Replacement of multiplication by left shift.
   Example:  x * 2³ ⇒ x << 3
•• Replace division by right shift.
   Example:  x >> 2 (is x/2²)

Use of machine Idioms
•• Auto increment and auto decrement addressing modes can be used whenever possible.
   Example:  replace add #1, R by INC R
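As a small illustration of reduction in strength, the Python sketch below rewrites a multiplication or division by a power of two into a shift, and a square into a multiplication. The instruction format (op, dest, src, constant) and the operation names mul, div, pow, shl and shr are assumptions made for this example only.

```python
def reduce_strength(instr):
    """Rewrite one (op, dest, src, constant) tuple into a cheaper form if possible."""
    op, dest, src, k = instr
    is_power_of_two = isinstance(k, int) and k > 0 and (k & (k - 1)) == 0
    if is_power_of_two:
        shift = k.bit_length() - 1          # k == 2 ** shift
        if op == "mul":
            return ("shl", dest, src, shift)
        if op == "div":
            # Valid as written for non-negative operands (or floor division);
            # truncating division of negative values needs extra care.
            return ("shr", dest, src, shift)
    if op == "pow" and k == 2:
        return ("mul", dest, src, src)      # x ** 2 becomes x * x
    return instr

print(reduce_strength(("mul", "t", "x", 8)))
print(reduce_strength(("div", "t", "x", 4)))
```

Instructions whose constant is not a power of two are returned unchanged, so the function can be mapped over a whole instruction list.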

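The jump-to-jump case of the flow of control optimizations (a goto whose target is itself an unconditional goto) can be sketched as follows. The program representation, a list of (label, instruction) pairs with instructions as tuples such as ("goto", "L2"), and the name thread_jumps are assumptions for illustration.

```python
def thread_jumps(program):
    """Retarget every 'goto L' whose target instruction is itself 'goto L2'.

    program is a list of (label_or_None, instruction) pairs, where an
    instruction is a tuple whose first element is the operation name.
    """
    target_of = {label: instr for label, instr in program if label is not None}

    def final_target(label):
        seen = set()  # guards against cycles such as L1: goto L1
        while (label in target_of and target_of[label][0] == "goto"
               and label not in seen):
            seen.add(label)
            label = target_of[label][1]
        return label

    result = []
    for label, instr in program:
        if instr[0] == "goto":
            instr = ("goto", final_target(instr[1]))
        result.append((label, instr))
    return result

program = [(None, ("goto", "L1")),
           (None, ("add", "x", "y")),
           ("L1", ("goto", "L2")),
           ("L2", ("halt",))]
print(thread_jumps(program))
```

The sketch only retargets jumps. Removing the statement L1: goto L2 once nothing jumps to L1 would be a separate unreachable-code step.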
Exercises
Practice Problems 1
Directions for questions 1 to 15:  Select the correct alternative from the given choices
 1. Consider the following expression tree on a machine with load-store architecture in which memory can be accessed only through load and store instructions. The variables p, q, r, s and t are initially stored in memory. The binary operators used in this expression tree can be evaluated by the machine only when the operands are in registers. The instructions produce results only in a register. If no intermediate results can be stored in memory, what is the minimum number of registers needed to evaluate this expression?

            +
          /   \
         −     −
        / \   / \
       p   q t   +
                / \
               r   s

    (A) 2 (B) 9
    (C) 5 (D) 3
 2. Consider the program given below with lexical scoping and nesting of procedures permitted.
    Program main ( )
    {
      Var …
      Procedure A1 ( )
      {
        Var …
        call A2;
      }
      Procedure A2 ( )
      {
        Var …
        Procedure A21 ( )
        {
          Var …
          call A1 ( );
        }
        Call A21;
      }
      Call A1;
    }
    Consider the calling chain: main ( ) → A1 ( ) → A2 ( ) → A21 ( ) → A1 ( ).
    The correct set of activation records along with their access links is given by
    [Figure: options (A) to (D) each show the stack of activation records main, A1, A2, A21, A1 with the frame pointer and access links drawn as arrows; the arrows are not recoverable in this copy]
 3. Consider the program fragment:
    sum = 0;
    For (i = 1; i <= 20; i++)
      sum = sum + a[i] + b[i];
    How many instructions are there in the three-address code for this?
    (A) 15 (B) 16
    (C) 17 (D) 18
 4. Suppose the instruction set of the processor has only two registers. The code optimization allowed is code motion. What is the minimum number of spills to memory in the compiled code?
    c = a + b;
    d = c*a;
    e = c + a;
    x = c*c;
    If (x > a)
    {
      y = a*a;
    }
    Else
    {
      d = d*d; e = e*e;
    }
    (A) 0 (B) 1
    (C) 2 (D) 3
 5. What is the minimum number of registers needed to compile the above problem's code segment without any spill to memory?
    (A) 3 (B) 4
    (C) 5 (D) 6
 6. Convert the following expression into postfix notation:
    a = (-a + 2*b)/a
    (A) aa – 2b *+a/= (B) a – 2ba */+ =
    (C) a2b * a/+ (D) a2b – * a/+
 7. In the quadruple representation of the following program, how many temporaries are used?
    int a = 2, b = 8, c = 4, d;
    For ( j = 0; j <= 10; j++)
      a = a * ( j* (b/c));
    d = a * ( j* (b/c));
    (A) 4 (B) 7
    (C) 8 (D) 10
 8. Let A = 2, B = 3, C = 4 and D = 5, what is the final value of the prefix expression: + * AB – CD
    (A) 5 (B) 10
    (C) –10 (D) –5
 9. Which of the following is a valid expression?
    (A) BC * D – + (B) * ABC –
    (C) BBB ***- + (D) -*/bc
10. What is the final value of the postfix expression B C D A D – + – + where A = 2, B = 3, C = 4, D = 5?
    (A) 5 (B) 4
    (C) 6 (D) 7
11. Consider the expression x = (a + b)* –C/D. In the quadruple representation of this expression, in which instruction is the '/' operation used?
    (A) 3rd (B) 4th
    (C) 5th (D) 8th
12. In the triple representation of x = (a + b)*– c/d, in which instruction will the result of (a + b) * – c/d be assigned to x?
    (A) 3rd (B) 4th
    (C) 5th (D) 8th
13. Consider the three address code for the following program:
    While (A < C and B > D) do
      If (A = = 1) then C = C + 1;
      Else
        While (A <= D) do
          A = A + 3;
    How many temporaries are used?
    (A) 2 (B) 3
    (C) 4 (D) 0
14. Code generation can be done by
    (A) DAG (B) Labeled tree
    (C) Both (A) and (B) (D) None of these
15. Live variables analysis is used as a technique for
    (A) Code generation (B) Code optimization
    (C) Type checking (D) Run time management

Practice Problems 2
Directions for questions 1 to 19:  Select the correct alternative from the given choices
 1. Match the correct code optimization technique to the corresponding code:
    (i)   i = i * 1                 ⇒  j = 2 * i              (p) Reduction in strength
          j = 2 * i
    (ii)  A = B + C                 ⇒  A = B + C              (q) Machine Idioms
          D = 10 + B + C                D = 10 + A
    (iii) For i = 1 to 10           ⇒  t = B + C              (r) Common sub expression elimination
            A[i] = B + C                for i = 1 to 10
                                          A[i] = t;
    (iv)  x = 2 * y                 ⇒  x = y << 1;            (s) Code motion
    (A) i – r, iii – s, iv – p, ii – q
    (B) i – q, ii – r, iii – s, iv – p
    (C) i – s, iii – p, iii – q, iv – r
    (D) i – q, ii – p, iii – r, iv – s
 2. What will be the optimized code for the following expression represented in DAG?
    a = q * -r + q * -r
    (A) t1 = -r             (B) t1 = -r
        t2 = q * t1             t2 = q * t1
        t3 = a * t1             t3 = t2 + t2
        t4 = t2 + t3            a = t3
        a = t4
    (C) t1 = -r             (D) All of these
        t2 = q
        t3 = t1 * t2
        t4 = t3 + t3
        a = t4
 3. In static allocation, names are bound to storage at _______ time.
    (A) Compile (B) Runtime
    (C) Debugging (D) Both (A) and (B)
 4. The mechanism in which the actual parameters are evaluated and their r-values are passed to the called procedure is known as
    (A) call-by-reference
    (B) call-by-name
    (C) call-by-value
    (D) copy-restore
 5. If the expression – (a + b) * (c + d) + (a + b + c) is translated into quadruple representation, then how many temporaries are required?
    (A) 5 (B) 6
    (C) 7 (D) 8
 6. If the above expression is translated into triples representation, then how many instructions are there?
    (A) 6 (B) 10
    (C) 5 (D) 8
 7. In the indirect triple representation for the expression A = (E/F) * (C – D), the first pointer address refers to
    (A) C – D
    (B) E/F
    (C) Both (A) and (B)
    (D) (E/F) * (C – D)
 8. For the given assembly language, what is the cost for it?
    MOV b, a
    ADD c, a
    (A) 3 (B) 4
    (C) 6 (D) 2
 9. Consider the expression
    ((4 + 2 * 3 + 7) + 8 * 5). The polish postfix notation for this expression is
    (A) 423* + 7 + 85*+ (B) 423* + 7 + 8 + 5*
    (C) 42 + 37 + *85* + (D) 42 + 37 + 85** +

Common data for questions 10 to 15: Consider the following basic block, in which all variables are integers, and ** denotes exponentiation.
    a := b + c
    z := a ** 2
    x := 0 * b
    y := b + c
    w := y * y
    u := x + 3
    v := u + w
Assume that the only variables that are live at the exit of this block are v and z. In order, apply the following optimizations to this basic block.
10. After applying algebraic simplification, how many instructions will be modified?
    (A) 1 (B) 2
    (C) 4 (D) 5
11. After applying common sub expression elimination to the above code, which of the following are true?
    (A) a := b + c (B) y := a
    (C) z = a + a (D) None of these
12. Among the following instructions, which will be modified after applying copy propagation?
    (A) a := b + c (B) z := a * a
    (C) y := a (D) w := y * y
13. Which of the following is obtained after constant folding?
    (A) u := 3 (B) v := u + w
    (C) x := 0 (D) Both (A) and (C)
14. In order to apply dead code elimination, what are the statements to be eliminated?
    (A) x = 0
    (B) y = b + c
    (C) Both (A) and (B)
    (D) None of these
15. How many instructions will be there after optimizing the above result further?
    (A) 1 (B) 2
    (C) 3 (D) 4
16. Consider the following program:
    L0: e := 0
        b := 1
        d := 2
    L1: a := b + 2
        c := d + 5
        e := e + c
        f := a * a
        If f < c goto L3
    L2: e := e + f
        goto L4
    L3: e := e + 2
    L4: d := d + 4
        b := b – 4
        If b != d goto 4
    L5:
    How many blocks are there in the flow graph for the above code?
    (A) 5
    (B) 6
    (C) 8
    (D) 7
17. A basic block can be analyzed by
    (A) Flow graph
    (B) A graph with cycles
    (C) DAG
    (D) None of these
18. In call by value the actual parameters are evaluated. What type of values is passed to the called procedure?
    (A) l-values
    (B) r-values
    (C) Text of actual parameters
    (D) None of these
19. Which of the following is FALSE regarding a Block?
    (A) The first statement is a leader.
    (B) Any statement that is a target of conditional / unconditional goto is a leader.
    (C) Immediately next statement of goto is a leader.
    (D) The last statement is a leader.

Previous Years' Questions

 1. The least number of temporary variables required to create a three-address code in static single assignment form for the expression q + r/3 + s – t * 5 + u * v/w is ________ [2015]
 2. Consider the intermediate code given below.
    (1) i = 1
    (2) j = 1
    (3) t1 = 5 * i
    (4) t2 = t1 + j
    (5) t3 = 4 * t2
    (6) t4 = t3
    (7) a[t4] = –1
    (8) j = j + 1
    (9) if j <= 5 goto (3)
    (10) i = i + 1
    (11) if i < 5 goto (2)
    The number of nodes and edges in the control-flow-graph constructed for the above code, respectively, are [2015]
    (A) 5 and 7 (B) 6 and 7
    (C) 5 and 5 (D) 7 and 8
 3. Consider the following code segment. [2016]
    x = u – t;
    y = x * v;
    x = y + w;
    y = t – z;
    y = x * y;
    The minimum number of total variables required to convert the above code segment to static single assignment form is _____ .
 4. What will be the output of the following pseudo-code when parameters are passed by reference and dynamic scoping is assumed? [2016]
    a = 3;
    void n(x) { x = x * a; print (x); }
    void m(y) { a = 1; a = y – a; n(a); print (a) }
    void main( ) { m(a); }
    (A) 6,2 (B) 6,6
    (C) 4,2 (D) 4,4
 5. Consider the following intermediate program in three address code
    p = a − b
    q = p * c
    p = u * v
    q = p + q
    Which one of the following corresponds to a static single assignment form of the above code? [2017]
    (A) p1 = a − b          (B) p3 = a − b
        q1 = p1 * c             q4 = p3 * c
        p1 = u * v              p4 = u * v
        q1 = p1 + q1            q5 = p4 + q4
    (C) p1 = a − b          (D) p1 = a − b
        q1 = p2 * c             q1 = p * c
        p3 = u * v              p2 = u * v
        q2 = p4 + q3            q2 = p + q

Answer Keys
Exercises
Practice Problems 1
1. D 2. D 3. C 4. C 5. B 6. A 7. B 8. A 9. A 10. A
11. B 12. C 13. A 14. C 15. B

Practice Problems 2
1. B 2. B 3. A 4. B 5. B 6. A 7. B 8. C 9. A 10. A
11. B 12. D 13. A 14. C 15. C 16. A 17. C 18. B 19. D

Previous Years’ Questions


1. 8 2. B 3. 10 4. D 5. B
