Chapter 10 Intermediate Code Generator
Intermediate Code Generation
• Intermediate code is machine independent, yet close to machine
instructions.
• The intermediate code generator converts a program in the source
language into an equivalent program in an intermediate language.
• Many different languages can serve as the intermediate language; the
designer of the compiler chooses one.
– Syntax trees can be used as an intermediate language.
– Postfix notation can be used as an intermediate language.
– Three-address code (quadruples) can be used as an intermediate language.
• We will use quadruples to discuss intermediate code generation.
• Quadruples are close to machine instructions, but they are not actual machine instructions.
– Some programming languages have well-defined intermediate languages.
• Java – Java Virtual Machine
• Prolog – Warren Abstract Machine
• In fact, there are byte-code emulators to execute instructions in these intermediate languages.
Three-Address Code (Quadruples)
• A three-address statement has the form:
x := y op z
where x, y and z are names, constants or compiler-generated
temporaries; op is any operator.
• Types
1. Assignment statement of the form
X := Y op Z
where op is a binary arithmetic or logical operator.
2. Assignment statement of the form
X := op Y
where op is a unary operator.
3. Copy statement
X := Y
where the value of Y is assigned to X.
4. The Unconditional Jump statement
goto L
5. The conditional jump
if X relop Y then goto L
6. Procedure call and return
param x and call p, n for procedure calls, and return y,
where the returned value y is optional.
For a procedure call p(x1, x2, …, xn) the three-address statements are
param x1
param x2
…
param xn
call p,n
where n represents number of actual parameters.
7. Indexed assignment of the form
X[i] := y and X := Y[i]
8. Address and pointer assignment
X := &Y and X := *Y and *X := Y
Implementation of Three Address Statements
E.g., a := b * -c + b * -c
• Quadruple implementation :
– Most direct
– Each record has four fields: op, arg 1, arg 2, result
     op      arg 1   arg 2   result
(0)  uminus  c               t1
(1)  *       b       t1      t2
(2)  uminus  c               t3
(3)  *       b       t3      t4
(4)  +       t2      t4      t5
(5)  :=      t5              a
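
The quadruple table can be sketched in Python as a list of four-field records. This is a minimal illustration, not a prescribed implementation: the names Quad, emit and newtemp are assumptions made for the example, and the temporaries are numbered in the order they are created.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quad:
    """One quadruple: op, arg 1, arg 2, result (hypothetical record type)."""
    op: str
    arg1: Optional[str]
    arg2: Optional[str]
    result: str


quads = []          # the quadruple array
_temp_count = 0     # counter behind newtemp()


def newtemp():
    """Return a fresh compiler-generated temporary name: t1, t2, ..."""
    global _temp_count
    _temp_count += 1
    return f"t{_temp_count}"


def emit(op, arg1, arg2=None, result=None):
    """Append a quadruple; allocate a temporary when no result is given."""
    if result is None:
        result = newtemp()
    quads.append(Quad(op, arg1, arg2, result))
    return result


# a := b * -c + b * -c
t1 = emit("uminus", "c")
t2 = emit("*", "b", t1)
t3 = emit("uminus", "c")
t4 = emit("*", "b", t3)
t5 = emit("+", t2, t4)
emit(":=", t5, result="a")

for i, q in enumerate(quads):
    print(f"({i})", q.op, q.arg1, q.arg2 or "", q.result)
```

Because every result has a name, a later pass can move quadruples around without renumbering anything; the cost is that each temporary needs an entry in the symbol table.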
• Triple implementation :
– Avoids putting temporary names in symbol table
– Each record has three fields: op, arg 1, arg 2
     op      arg 1   arg 2
(0)  uminus  c
(1)  *       b       (0)
(2)  uminus  c
(3)  *       b       (2)
(4)  +       (1)     (3)
(5)  :=      a       (4)
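
A possible Python sketch of the triple representation follows. The names Ref, Triple and emit are assumptions made for this example; the point it illustrates is that an operand is either a name or a reference to the position of an earlier triple, so no temporary names are needed.

```python
from typing import NamedTuple, Optional, Union


class Ref(NamedTuple):
    """Reference to the value computed by the triple at this position."""
    index: int


Operand = Union[str, Ref, None]


class Triple(NamedTuple):
    op: str
    arg1: Operand
    arg2: Operand = None


triples = []  # the triple array; a triple is identified by its position


def emit(op, arg1, arg2=None):
    """Append a triple and return a reference to its position."""
    triples.append(Triple(op, arg1, arg2))
    return Ref(len(triples) - 1)


def show(operand):
    """Format an operand the way the table does: names as-is, refs as (n)."""
    if operand is None:
        return ""
    if isinstance(operand, Ref):
        return f"({operand.index})"
    return operand


# a := b * -c + b * -c
r0 = emit("uminus", "c")
r1 = emit("*", "b", r0)
r2 = emit("uminus", "c")
r3 = emit("*", "b", r2)
r4 = emit("+", r1, r3)
emit(":=", "a", r4)

for i, t in enumerate(triples):
    print(f"({i})", t.op, show(t.arg1), show(t.arg2))
```

Since results are identified by position, moving a triple changes the meaning of every reference to it.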
• Indirect Triple implementation :
– Array of pointers to triples, rather than array of triples themselves
Statements          Triples
                          op      arg 1   arg 2
(0)  (14)           (14)  uminus  c
(1)  (15)           (15)  *       b       (14)
(2)  (16)           (16)  uminus  c
(3)  (17)           (17)  *       b       (16)
(4)  (18)           (18)  +       (15)    (17)
(5)  (19)           (19)  :=      a       (18)
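
The sketch below illustrates, under assumptions, why the extra level of indirection is useful: the statement list can be reordered without touching the triples. The dictionary layout and the helper names refs and is_valid_order are hypothetical; references are written as strings such as "(14)" to mirror the numbering 14 to 19 used in this section.

```python
# Triple storage, keyed by the positions 14..19 used in this section.
pool = {
    14: ("uminus", "c", None),
    15: ("*", "b", "(14)"),
    16: ("uminus", "c", None),
    17: ("*", "b", "(16)"),
    18: ("+", "(15)", "(17)"),
    19: (":=", "a", "(18)"),
}

# The statement list holds pointers (here: keys) into the pool.
statements = [14, 15, 16, 17, 18, 19]


def refs(triple):
    """Positions of the triples that this triple uses as operands."""
    return [int(a[1:-1]) for a in triple[1:]
            if isinstance(a, str) and a.startswith("(")]


def is_valid_order(order):
    """True if every referenced triple is executed before it is used."""
    seen = set()
    for pos in order:
        if any(r not in seen for r in refs(pool[pos])):
            return False
        seen.add(pos)
    return True


# Reordering only edits the statement list; the pool is left unchanged.
reordered = [16, 17, 14, 15, 18, 19]
print(is_valid_order(statements), is_valid_order(reordered))
```

With plain triples the same reordering would require renumbering every reference to a moved triple.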
Syntax-directed translation into three-address code
E.g., a := b + c
Production        Semantic rule
S → id := E       S.code := E.code || gen(id.place ‘:=’ E.place)
E → E1 + E2       E.place := newtemp;
                  E.code := E1.code || E2.code || gen(E.place ‘:=’ E1.place ‘+’ E2.place)
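
The two semantic rules can be tried out with a small Python sketch. It is only an illustration under assumptions: expressions are given as nested tuples instead of being parsed, the Attr record and the function names are invented for the example, and a rule for E → id (E.place := id.place, empty code) is assumed because the table does not list one.

```python
from dataclasses import dataclass, field
from itertools import count

_temps = count(1)


def newtemp():
    """Return a fresh temporary name: t1, t2, ..."""
    return f"t{next(_temps)}"


def gen(*parts):
    """Build one three-address statement as a string."""
    return " ".join(parts)


@dataclass
class Attr:
    """Synthesized attributes of E: place and code."""
    place: str
    code: list = field(default_factory=list)


def translate_expr(node):
    if isinstance(node, str):
        # Assumed rule for E -> id: the place is the name, no code.
        return Attr(place=node)
    op, left, right = node               # E -> E1 op E2
    e1 = translate_expr(left)
    e2 = translate_expr(right)
    place = newtemp()                    # E.place := newtemp
    code = e1.code + e2.code + [gen(place, ":=", e1.place, op, e2.place)]
    return Attr(place, code)


def translate_assign(target, expr):
    """S -> id := E: S.code := E.code || gen(id.place ':=' E.place)."""
    e = translate_expr(expr)
    return e.code + [gen(target, ":=", e.place)]


code = translate_assign("a", ("+", "b", "c"))
print("\n".join(code))
# t1 := b + c
# a := t1
```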
Control statement
Code layout for the while statement (figure):
S.begin:  E.code       (jumps to E.true or E.false)
E.true:   S1.code
          goto S.begin
E.false:  …            (E.false = S.next)
Consider the following example
while a<b do
if c<d then
x:=y+z
else
x:=y-z
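
One way to see the three-address code for this example is the Python sketch below. It is a hedged illustration: statements are written as nested tuples, the label names L1, L2, … and Lnext are arbitrary, and the layouts assumed are "begin: E.code, true: S1.code, goto begin" for while and "E.code, true: S1.code, goto next, false: S2.code" for if-then-else.

```python
from itertools import count

_labels = count(1)
_temps = count(1)


def newlabel():
    return f"L{next(_labels)}"


def newtemp():
    return f"t{next(_temps)}"


def gen_cond(cond, true_label, false_label):
    """E -> id1 relop id2: jump to true_label, else to false_label."""
    left, relop, right = cond
    return [f"if {left} {relop} {right} goto {true_label}",
            f"goto {false_label}"]


def gen_stmt(stmt, next_label):
    """Return the code for stmt; next_label plays the role of S.next."""
    kind = stmt[0]
    if kind == "assign":
        _, target, (left, op, right) = stmt
        t = newtemp()
        return [f"{t} := {left} {op} {right}", f"{target} := {t}"]
    if kind == "if":
        _, cond, s1, s2 = stmt
        true_l, false_l = newlabel(), newlabel()
        return (gen_cond(cond, true_l, false_l)
                + [f"{true_l}:"] + gen_stmt(s1, next_label)
                + [f"goto {next_label}"]
                + [f"{false_l}:"] + gen_stmt(s2, next_label))
    if kind == "while":
        _, cond, body = stmt
        begin, true_l = newlabel(), newlabel()
        return ([f"{begin}:"] + gen_cond(cond, true_l, next_label)
                + [f"{true_l}:"] + gen_stmt(body, begin)
                + [f"goto {begin}"])
    raise ValueError(f"unknown statement kind: {kind}")


program = ("while", ("a", "<", "b"),
           ("if", ("c", "<", "d"),
            ("assign", "x", ("y", "+", "z")),
            ("assign", "x", ("y", "-", "z"))))

code = gen_stmt(program, "Lnext") + ["Lnext:"]
print("\n".join(code))
```

Both branches of the if end by jumping back to L1, the beginning of the loop, because the statement following the if inside the loop body is the loop test.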
Case Statements
Consider the following switch statement
switch E
begin
case V1 : S1
case V2 : S2
…….
case Vn-1 : Sn-1
default : Sn
end
We can generate three-address statements of the form shown below.
case V1 L1
case V2 L2
…
case Vn-1 Ln-1
case t Ln
label next
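
The Python sketch below builds this sequence of case statements for a given list of values. The second function is an assumption for illustration only: it shows one way a later phase might lower each "case Vi Li" into a conditional jump on the temporary t, which is taken here to hold the value of E.

```python
def case_table(values, temp="t"):
    """Return the 'case Vi Li' statements, the default case, and the label."""
    lines = [f"case {v} L{i}" for i, v in enumerate(values, start=1)]
    lines.append(f"case {temp} L{len(values) + 1}")   # default: always matches
    lines.append("label next")
    return lines


def expand(lines, temp="t"):
    """Assumed lowering of the case statements into jumps."""
    out = []
    for line in lines:
        kind, *rest = line.split()
        if kind == "case":
            value, label = rest
            if value == temp:
                out.append(f"goto {label}")           # default branch
            else:
                out.append(f"if {temp} = {value} goto {label}")
        else:
            out.append("next:")
    return out


table = case_table(["V1", "V2", "V3"])
print("\n".join(table))
print("\n".join(expand(table)))
```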
Backpatching
• The main problem with generating code for Boolean expressions and
flow-of-control statements is that, in a single pass, we may not yet
know the labels to which control is transferred when the jump statements
are generated.
• makelist(i) creates a new list containing only i, an index into the array of
quadruples; makelist returns a pointer to the list it has made.
• merge(p1,p2) concatenates the lists pointed to by p1 and p2, and returns a
pointer to the concatenated list.
• backpatch(p,i) inserts i as the target label for each of the statements on
the list pointed to by p.
if (a < b) then I := I+1 else j := I+1
100: if a < b then goto ???
101: goto ???
102: t1 = I+1
103: I = t1
104: goto ???
105: t1 = I+1
106: j = t1
107:
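
The following Python sketch shows how makelist, merge and backpatch could fill in the ??? targets of this example. It is an illustration under assumptions: lists are plain Python lists rather than pointers, quadruples are stored as strings, and the variable names truelist, falselist and nextlist are chosen for the example.

```python
quads = {}        # quadruple index -> text
nextquad = 100    # index of the next quadruple to be generated


def emit(text):
    """Store a quadruple at nextquad and return its index."""
    global nextquad
    quads[nextquad] = text
    nextquad += 1
    return nextquad - 1


def makelist(i):
    return [i]


def merge(p1, p2):
    return p1 + p2


def backpatch(p, target):
    """Insert target as the jump destination of every quadruple on p."""
    for i in p:
        quads[i] = quads[i].replace("???", str(target))


truelist = makelist(emit("if a < b then goto ???"))   # 100
falselist = makelist(emit("goto ???"))                # 101
then_start = nextquad                                 # 102
emit("t1 = I+1")
emit("I = t1")
nextlist = makelist(emit("goto ???"))                 # 104: skip the else part
else_start = nextquad                                 # 105
emit("t1 = I+1")
emit("j = t1")

backpatch(truelist, then_start)    # 100 jumps to 102
backpatch(falselist, else_start)   # 101 jumps to 105
backpatch(nextlist, nextquad)      # 104 jumps to 107

for i in sorted(quads):
    print(f"{i}: {quads[i]}")
```

Each jump is emitted with an empty target and remembered on a list; the target is filled in as soon as the index of the destination quadruple is known.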
Boolean Expression
• We insert a marker non-terminal M into the grammar so that a semantic
action picks up, at the appropriate times, the index of the next quadruple
to be generated.
• Consider the following grammar,
(1) E → E1 or M E2
(2) E → E1 and M E2
(3) E → not E1
(4) E → ( E1 )
(5) E → id1 relop id2
(6) E → true
(7) E → false
(8) M → ε
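• E has two attributes, truelist and falselist, which store the lists of goto
instructions whose destinations are still empty.
truelist: jumps to be filled in with the true exit (goto TRUELABEL)
falselist: jumps to be filled in with the false exit (goto FALSELABEL)
M.quad: the index of the next quadruple at the point where M is reduced
makelist(quad): creates a list containing only quad
nextquad: holds the index of the next quadruple to be generated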
Consider the expression a < b or c < d and e < f
Annotated parse tree (E.t = truelist, E.f = falselist, M.q = quad):
E   E.t={100,104}  E.f={103,105}
    E   E.t={100}  E.f={101}          a < b
    or
    M   M.q=102                       ε
    E   E.t={104}  E.f={103,105}
        E   E.t={102}  E.f={103}      c < d
        and
        M   M.q=104                   ε
        E   E.t={104}  E.f={105}      e < f
The subexpression a < b is reduced to E using production (5) from Figure 10.20.
The reduction of e < f to E by production (5) generates the following quadruples:
104: if e < f goto ???
105: goto ???
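
The whole translation of a < b or c < d and e < f can be replayed with the Python sketch below. It is a minimal illustration under assumptions: the reductions are called by hand in the order a bottom-up parser would perform them, the function names relop, and_ and or_ are invented for the example, each E is represented as a (truelist, falselist) pair, and an unfilled target is printed as "_".

```python
BASE = 100        # index of the first quadruple, as in the example
quads = []        # each entry is [text, target]; target None = not yet known


def nextquad():
    return BASE + len(quads)


def emit(text):
    quads.append([text, None])


def makelist(i):
    return [i]


def merge(p1, p2):
    return p1 + p2


def backpatch(p, target):
    for i in p:
        quads[i - BASE][1] = target


def relop(a, op, b):
    """E -> id1 relop id2: emit a conditional and an unconditional jump."""
    truelist = makelist(nextquad())
    falselist = makelist(nextquad() + 1)
    emit(f"if {a} {op} {b} goto")
    emit("goto")
    return truelist, falselist


def or_(e1, m_quad, e2):
    """E -> E1 or M E2: if E1 is false, continue at the start of E2."""
    backpatch(e1[1], m_quad)
    return merge(e1[0], e2[0]), e2[1]


def and_(e1, m_quad, e2):
    """E -> E1 and M E2: if E1 is true, continue at the start of E2."""
    backpatch(e1[0], m_quad)
    return e2[0], merge(e1[1], e2[1])


def listing():
    return [f"{BASE + i}: {text} {'_' if target is None else target}"
            for i, (text, target) in enumerate(quads)]


e1 = relop("a", "<", "b")      # quadruples 100, 101
m1 = nextquad()                # M.quad = 102
e2 = relop("c", "<", "d")      # quadruples 102, 103
m2 = nextquad()                # M.quad = 104
e3 = relop("e", "<", "f")      # quadruples 104, 105
e = or_(e1, m1, and_(e2, m2, e3))

print("\n".join(listing()))
print("truelist:", e[0], "falselist:", e[1])
```

The jumps left with "_" are exactly those on E.truelist and E.falselist of the whole expression; they are filled in later, when the enclosing statement knows its true and false exits.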