18CS61 - SS and C - Module 5

Syntax directed translation is a technique used in compiler design to perform semantic analysis and code generation. It uses an annotated parse tree with attributes and semantic rules to check for correctness and enable proper program execution. Attributes can be synthesized, meaning their values are computed from child nodes, or inherited, where values are computed from parent or sibling nodes. Semantic rules specify how to compute the values of attributes associated with grammar symbols.

Uploaded by

Juice Kuditya

System Software and Compilers (18CS61) Module 5

Global Academy of Technology

RR Nagar, Bengaluru – 560098
Department of Computer Science and Engineering
(Accredited by NBA 2019-2022)

VI Semester
System Software and Compiler Design (18CS61)

Module 5
Syntax Directed Translation

Contents :
Syntax Directed Translation: Intermediate code generation, Code generation
Text book 2: Chapter 5.1, 5.2, 5.3, 6.1, 6.2, 8.1, 8.2

Introduction

• The third phase of the compiler is semantic analysis. Its main goal is to check the program
for correctness and to enable proper execution.
• The semantic analysis phase acts as an interface between the syntax analysis phase and the code
generation phase. It accepts the parse tree from the syntax analysis phase, adds semantic
information to it, and performs checks based on this information.

Actions performed by semantic analysis phase

• Type checking : The number and types of arguments in a function call must match those in the
function definition. Otherwise, a semantic error results.
• Object binding : Associates variables with their respective function definitions.
• Automatic type conversion of integers in mixed-mode operations.
• Helps in intermediate code generation.
• Displays appropriate error messages.

The semantics of a language can be described using two notations

1. Syntax Directed Definition (SDD)
2. Syntax Directed Translation (SDT)

Syntax Directed Definition (SDD)

It is a Context Free Grammar with attributes and semantic rules.

Attributes are associated with grammar symbols.
Semantic rules are associated with productions. These rules are used to compute the attribute
values.

Courtesy VML, Dept. of CSE, GAT pg. 1

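Example:
Production : E → E + T
Semantic rule: E.val = E.val + T.val
Here val is the attribute. The E.val on the left side of the rule belongs to the parent E, and the
E.val on the right side belongs to its child E.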

Note :
• The semantic rule is associated with the production.
• The attribute name val is associated with each non-terminal used in the rule.

Attribute

An attribute is a property of a programming language construct. Attributes are always associated
with grammar symbols.
If X is a grammar symbol and a is an attribute, then X.a denotes the value of the attribute a at a
particular node X in the parse tree.

Examples of attributes
1. The data types associated with variables such as int, float, char etc.
2. The value of an expression
3. The location of a variable in memory
4. The object code of a function
5. The number of significant digits in a number etc

Semantic rule

The rule that describes how to compute the attribute values associated with a grammar symbol is
called semantic rule.
Example :
Consider the production E → E + T
The attribute value of E on the LHS of the production, denoted by E.val, is calculated by
adding the attribute values of E and T on the RHS of the production.
E.val = E.val + T.val

Types of attributes

1. Synthesized attribute
2. Inherited attribute

Synthesized attribute

The attribute value of a non-terminal A derived from the attribute values of its children or itself is
called a synthesized attribute.
Hence the values of synthesized attributes are passed from the children to the parent node in a
bottom-up manner.

Example:
Production : E→E+T
Semantic rule: E.val = E.val + T.val

Parse tree with attribute values:

            E.val = 30
          /     |      \
E.val = 10      +      T.val = 20

Inherited attribute

The attribute value of a non-terminal A derived from the attribute values of its siblings, its
parent or itself is called an inherited attribute.
Hence the values of inherited attributes are passed from siblings or from the parent to the children
in a top-down manner.
Example:
Production : D→T V
Where D – declaration
T – type such as int
V – variable such as sum

Semantic rule: V.inh = T.type

Parse tree with attribute values:

              D
            /   \
T.type = int     V.inh = int
                    |
                 id.entry

The type int obtained from the lexical analyzer is already stored in T.type, and its value is
transferred to its sibling V, i.e., V.inh = T.type.
Since the attribute value of V is obtained from its sibling, it is an inherited attribute and is
denoted by inh.
Similarly, the value int stored in V.inh is transferred to its child id. Hence entry is an
inherited attribute of id, and its value is denoted by id.entry.

Synthesized vs Inherited attributes

Synthesized attribute                        Inherited attribute

An attribute is synthesized if its value     An attribute is inherited if its value
at a parse tree node is determined by the    at a parse tree node is determined by the
attribute values at the children.            attribute values at the parent and/or siblings.

It can be evaluated during a single          It can be evaluated during a single
bottom-up traversal of the parse tree.       top-down traversal of the parse tree.

Annotated Parse Tree

• A parse tree showing the attribute values of each node is called an annotated parse tree.
• The terminals in the annotated parse tree can have only synthesized attribute values, and these
are obtained directly from the lexical analyzer. So, there are no semantic rules in an SDD to get
lexical values into the terminals of the annotated parse tree. Terminals can never have inherited
attributes.
• The other nodes in the annotated parse tree may have synthesized or inherited attributes.

Questions

1. Write SDD for the following grammar

S → E n where n represents end of file marker
E → E + T | T
T → T * F | F
F → (E) | digit

Solution :
Let us assume an input string 4 * 5 + 6 for computing synthesized attributes. The annotated parse
tree for the input string is

[Figure: parse tree and annotated parse tree for the input string 4 * 5 + 6]

Productions        Semantic rules

S → E n            S.val = E.val
E → E + T          E.val = E.val + T.val
E → T              E.val = T.val
T → T * F          T.val = T.val * F.val
T → F              T.val = F.val
F → (E)            F.val = E.val
F → digit          F.val = digit.lexval

2. Write the grammar and SDD for a simple desk calculator and show annotated parse tree
for the expression (3+4)*(5+6).

Solution :
A simple desk calculator performs operations such as addition, subtraction, multiplication and
division with or without ().
Grammar :
S → E n where n represents end of file marker
E → E + T | E - T | T
T → T * F | T / F | F
F → (E) | digit

Annotated parse tree for the expression (3+4)*(5+6) consisting of attribute values for each non-
terminal is given below.

[Figure: annotated parse tree for the expression (3+4)*(5+6)]

Productions        Semantic rules

S → E n            S.val = E.val
E → E + T          E.val = E.val + T.val
E → E - T          E.val = E.val - T.val
E → T              E.val = T.val
T → T * F          T.val = T.val * F.val
T → T / F          T.val = T.val / F.val
T → F              T.val = F.val
F → (E)            F.val = E.val
F → digit          F.val = digit.lexval

3. Consider the following grammar

S → E N
E → E + T | E - T | T
T → T * F | T / F | F
F → (E) | digit
N → ;

i) Obtain SDD for the grammar

ii) Construct annotated parse tree for the input string 5*6+7

Solution :
Productions        Semantic rules
S → E N            S.val = E.val
E → E + T          E.val = E.val + T.val
E → E - T          E.val = E.val - T.val
E → T              E.val = T.val
T → T * F          T.val = T.val * F.val
T → T / F          T.val = T.val / F.val
T → F              T.val = F.val
F → (E)            F.val = E.val
F → digit          F.val = digit.lexval
N → ;              ;

[Figure: parse tree and annotated parse tree for the input string 5*6+7]

4. The SDD to translate binary integer number into decimal is shown below.
Construct the parse tree and annotated parse tree for the string 1100.

Production         Semantic rule

B → L              B.val = L.val
L → L B            L.val = 2 * L.val + B.val
L → B              L.val = B.val
B → 0              B.val = 0
B → 1              B.val = 1

[Figure: annotated parse tree for the string 1100]
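
The effect of the rules L.val = B.val and L.val = 2 * L.val + B.val can be checked with a few lines of code. This is a minimal sketch rather than part of the original notes; the function name binary_value is hypothetical, and the bits are processed left to right in the order in which the parse tree for L grows.

```python
def binary_value(bits):
    """Apply the rules for L -> B and L -> L B to a string of 0s and 1s."""
    if not bits or any(b not in "01" for b in bits):
        raise ValueError("expected a non-empty string of 0s and 1s")
    val = int(bits[0])                 # L -> B   : L.val = B.val
    for b in bits[1:]:
        val = 2 * val + int(b)         # L -> L B : L.val = 2 * L.val + B.val
    return val


print(binary_value("1100"))            # prints 12
```

For the string 1100 the successive values of L.val are 1, 3, 6 and 12.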

Syntax Directed Translation (SDT)

In Syntax Directed Translation, semantic rules (also called semantic actions) are attached to the
productions of the grammar.
Syntax Directed Translation = Grammar + Semantic rules
SDTs are used
• To build syntax trees for programming constructs
• To translate infix expressions into postfix notation
• To evaluate expressions

Types of SDT
1. S-attributed SDT
2. L-attributed SDT

S-attributed SDT

1) It uses only synthesized attributes.

Example :
A → BCD { A.s = f(B.s, C.s, D.s) } i.e., A can take its value from the children B, C and D.
2) Semantic actions are placed at the right end of the production body. Hence it is also called
a Postfix SDT.
Example :
A → BCD {……}
3) Attributes are evaluated during bottom-up parsing.
Example :
A → BCD
We can find the value of A only after evaluating the values of B, C and D.

L-attributed SDT

1) It uses both synthesized and inherited attributes. Inherited attributes can inherit values from
either the parent or left siblings only.
Example :
A → BCD { C.inh = A.inh, C.inh = B.inh, D.inh = B.inh, D.inh = A.inh }
But C.inh = D.inh is invalid because it takes its value from a right sibling.
2) Semantic actions can be placed anywhere on the RHS of the production.
Example :
A → {…} BC | B {…} C | BC {…}
3) Attributes are evaluated by traversing the parse tree depth first, left to right.

Note
If a definition is S-attributed, then it is also L-attributed but NOT vice-versa.

S-attributed SDT                             L-attributed SDT

Based on synthesized attributes.             Based on both synthesized and inherited
                                             (parent / left sibling) attributes.

Uses bottom-up parsing.                      Uses top-down parsing (depth first, left to right).

Semantic rules are always written at the     Semantic rules are written anywhere in the RHS.
rightmost position in the RHS.

Problems

1. Check whether the following grammar is S-attributed or L-attributed.

A → LM { L.inh = f(A.inh);
         M.inh = f(L.syn);
         A.syn = f(M.syn);
       }
Solution :
Semantic rule       Synthesized   Inherited   S-attributed   L-attributed
L.inh = f(A.inh)    No            Yes         No             Yes (from parent)
M.inh = f(L.syn)    No            Yes         No             Yes (from left sibling)
A.syn = f(M.syn)    Yes           No          Yes            Yes (from child)

So, this grammar is L-attributed.

2. Check whether the following grammar is S-attributed or L-attributed.

A → QR { R.inh = f(A.inh);
         Q.inh = f(R.inh);
         A.syn = f(Q.syn);
       }
Solution :
Semantic rule       Synthesized   Inherited   S-attributed   L-attributed
R.inh = f(A.inh)    No            Yes         No             Yes (from parent)
Q.inh = f(R.inh)    No            Yes         No             No (from right sibling)
A.syn = f(Q.syn)    Yes           No          Yes            Yes (from child)

So, this grammar is neither S-attributed nor L-attributed.
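
The checks in the two problems above can be mechanised. The sketch below is an illustration, not part of the original notes. The function names and the encoding of a rule as a (target, sources) pair are assumptions, and the check is simplified: it only looks at the positions of the symbols and does not detect cyclic dependencies within a single symbol.

```python
def is_s_attributed(head, rules):
    """True if every rule defines an attribute of the head (synthesized only)."""
    return all(target.split(".")[0] == head for target, _ in rules)


def is_l_attributed(head, body, rules):
    """True if no inherited attribute depends on a right sibling."""
    position = {symbol: i for i, symbol in enumerate(body)}
    for target, sources in rules:
        target_symbol = target.split(".")[0]
        if target_symbol == head:
            continue                   # synthesized attribute of the head
        for source in sources:
            source_symbol = source.split(".")[0]
            if source_symbol == head:
                continue               # value inherited from the parent
            if position[source_symbol] > position[target_symbol]:
                return False           # value taken from a right sibling
    return True


# Problem 1: A -> L M
problem1 = [("L.inh", ["A.inh"]), ("M.inh", ["L.syn"]), ("A.syn", ["M.syn"])]
# Problem 2: A -> Q R
problem2 = [("R.inh", ["A.inh"]), ("Q.inh", ["R.inh"]), ("A.syn", ["Q.syn"])]

print(is_l_attributed("A", ["L", "M"], problem1))   # prints True
print(is_l_attributed("A", ["Q", "R"], problem2))   # prints False
```

Neither definition is S-attributed, because both contain rules that define inherited attributes.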

3. Given the grammar A → PQ and A → XY

Rule 1 : { P.i = A.i + 2, Q.i = P.i + A.i, A.s = P.s + Q.s }
Rule 2 : { X.i = A.i + Y.s, Y.i = X.s + A.i }
Which of the following is true?
1) Both rules are L-attributed
2) Only Rule 1 is L-attributed
3) Only Rule 2 is L-attributed
4) Neither Rule 1 nor Rule 2 is L-attributed

Option 2 is correct. Only Rule 1 is L-attributed. In Rule 2, X.i depends on Y.s, an attribute of
the right sibling Y, which is not allowed in an L-attributed definition.

4. Write the SDD for a simple type declaration and write the annotated parse tree for the
declaration “ float id1, id2, id3”.

Solution :
The grammar for the simple type declaration is
D → T L                where D : Declaration
T → int | float              T : Data type (int / float)
L → L1 , id | id             L : List of identifiers or identifier

Input string : float id1, id2, id3

1) The declaration D consists of a basic data type T followed by a list L of identifiers. T can be
either int or float. The type corresponding to the token int or float obtained from the lexical
analyzer, i.e., integer or float, is copied into the attribute T.type. The corresponding
productions and semantic rules are
Production          Semantic rule
T → int             T.type = integer
T → float           T.type = float

2) The attribute value T.type available in the left subtree should be transferred to the right
subtree L. Since the value is transferred from the left sibling to the right sibling, the attribute
must be an inherited attribute. It is denoted by L.inh and is obtained with the following
production.
Production          Semantic rule
D → T L             L.inh = T.type

3) The type L.inh must be passed down to the identifier id, so it has to be copied into L1.inh,
where L1 is the leftmost child on the RHS of the production L → L1 , id. This is done with
the following production.
Production          Semantic rule
L → L1 , id         L1.inh = L.inh

4) The attribute value L.inh must in turn be entered as the type of the identifier id in the
production L → id. This can be done as follows.
Production          Semantic rule
L → id              Addtype (id.entry, L.inh)

Addtype()
• id.entry is a lexical value that points to the symbol table.
• L.inh is the type being assigned to every identifier in the list.
• The function installs L.inh as the type of the corresponding identifier.

5) The attribute value L.inh must also be entered as the type of the identifier id that is the
rightmost child on the RHS of the production L → L1 , id. This can be done as follows.
Production          Semantic rule
L → L1 , id         Addtype (id.entry, L.inh)

SDD for the grammar

Production          Semantic rule

D → T L             L.inh = T.type
T → int             T.type = integer
T → float           T.type = float
L → L1 , id         L1.inh = L.inh
                    Addtype (id.entry, L.inh)
L → id              Addtype (id.entry, L.inh)

Dependency Graph

A graph that shows the flow of information which helps in computation of various attribute values
in a particular parse tree is called dependency graph.
An edge from one attribute instance to another attribute instance indicates that the attribute value of
the first is needed to compute the attribute value of the second.
While Annotated parse tree shows the values of attributes, Dependency graph shows how these
values are computed.

Example :
Production          Semantic rules
E → E + T           E.val = E.val + T.val

[Figure: parse tree for E → E + T together with its dependency graph]

In the figure, the dotted lines and the nodes connected by them represent the parse tree.
The shaded nodes labelled val, together with the solid arrows that run from one node to
another, form the dependency graph.
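
An evaluation order for the attributes can be obtained by sorting the dependency graph topologically. The sketch below is illustrative only: it uses Python's standard graphlib module (Python 3.9 or later), the name evaluate_attributes and the dictionary layout are assumptions, E1 stands for the child E on the RHS, and the leaf values 10 and 20 are taken from the earlier synthesized attribute example.

```python
from graphlib import TopologicalSorter

# attribute instance -> (attribute instances it depends on, rule that computes it)
rules = {
    "E1.val": ((), lambda: 10),
    "T.val": ((), lambda: 20),
    "E.val": (("E1.val", "T.val"), lambda e1, t: e1 + t),   # E.val = E1.val + T.val
}


def evaluate_attributes(rules):
    # Each edge of the dependency graph runs from a dependency to the
    # attribute that needs it, so dependencies are listed as predecessors.
    graph = {attr: set(deps) for attr, (deps, _) in rules.items()}
    values = {}
    for attr in TopologicalSorter(graph).static_order():
        deps, rule = rules[attr]
        values[attr] = rule(*(values[d] for d in deps))
    return values


print(evaluate_attributes(rules)["E.val"])   # prints 30
```

If the dependency graph contained a cycle, no evaluation order would exist; graphlib reports this case by raising CycleError.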

Example :
S → E
E → E + T | E - T | T
T → T * F | T / F | F
F → (E) | digit
Input : 7+3*2

Productions        Semantic rules

S → E              S.val = E.val
E → E + T          E.val = E.val + T.val
E → E - T          E.val = E.val - T.val
E → T              E.val = T.val
T → T * F          T.val = T.val * F.val
T → T / F          T.val = T.val / F.val
T → F              T.val = F.val
F → (E)            F.val = E.val
F → digit          F.val = digit.lexval

Question :

1. Give the SDD to process a sample variable declared in C and dependency graph for the
input “int a, b, c”

Solution :
Production          Semantic rule

D → T L             L.inh = T.type
T → int             T.type = integer
T → float           T.type = float
L → L1 , id         L1.inh = L.inh
                    Addtype (id.entry, L.inh)
L → id              Addtype (id.entry, L.inh)

[Figure: dependency graph for the input int a, b, c]

Construction of Syntax Tree

The syntax tree is an abstract representation of the language constructs. Syntax trees are used to
write the translation routines using SDD. The syntax tree for an expression can be built by
processing the expression in postfix order, so that the operands of an operator are created before
the operator node itself.

Functions
1. mknode(op, left, right)
2. mkleaf(id, entry)
3. mkleaf(num, val)

1. mknode(op, left, right)

This function creates an operator node with label op and two pointers, left and right, to its operands.

2. mkleaf(id, entry)

This function creates a node for an identifier with label id and a pointer entry to its symbol
table entry.

3. mkleaf(num, val)

This function creates a node for a number with label num; val holds the value of that number.

Questions

1. Construct syntax tree for the expression x*y-5+z

Solution :
Convert infix expression to postfix expression : xy*5-z+

Symbol Operation
x p1=mkleaf(id, ptr for x)
y p2=mkleaf(id, ptr for y)
* p3=mknode(*, p1, p2)
5 p4=mkleaf(num, 5)
- p5=mknode(-, p3, p4)
z p6=mkleaf(id, ptr for z)
+ p7=mknode(+, p5, p6)
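
The sequence of calls in the table can be reproduced with a stack. The following is a sketch, not the notes' own code: the Node class, the single field value (which holds either the symbol table entry or the number) and build_from_postfix are assumptions made for the illustration, and identifier names stand in for symbol table pointers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                          # operator, "id" or "num"
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: object = None                # entry for id, numeric value for num


def mknode(op, left, right):
    return Node(op, left, right)


def mkleaf(kind, value):
    return Node(kind, value=value)


def build_from_postfix(tokens):
    """Create leaves for operands and combine them when an operator is read."""
    operators = {"+", "-", "*", "/"}
    stack = []
    for tok in tokens:
        if tok in operators:
            right = stack.pop()
            left = stack.pop()
            stack.append(mknode(tok, left, right))
        elif tok.isdigit():
            stack.append(mkleaf("num", int(tok)))
        else:
            stack.append(mkleaf("id", tok))
    return stack.pop()


root = build_from_postfix(["x", "y", "*", "5", "-", "z", "+"])
print(root.label, root.left.label, root.left.left.label)   # prints + - *
```

The root corresponds to p7 in the table, its left child to p5 and its right child to the leaf p6 for z.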

Syntax tree

2. Construct syntax tree for the expression 3*5+4

Solution :
Convert infix expression to postfix expression : 35*4+

Symbol Operation
3 p1=mkleaf(num, 3)
5 p2=mkleaf(num, 5)
* p3=mknode(*, p1, p2)
4 p4=mkleaf(num, 4)
+ p5=mknode(+, p3, p4)

Syntax tree

3. Assuming suitable SDD, construct syntax tree for the expression a-4+e
Solution :

Production          Semantic Rule

S → E n             S.val = E.val
E → E + T           E.val = E.val + T.val
E → E - T           E.val = E.val - T.val
E → T               E.val = T.val
T → F               T.val = F.val
F → digit           F.val = digit.lexval
F → id              F.val = addType(id, entry)

Convert infix expression to postfix expression : a4-e+

Symbol Operation
a p1=mkleaf(id, ptr for a)
4 p2=mkleaf(num, 4)
- p3=mknode(-, p1, p2)
e p4=mkleaf(id, ptr for e)
+ p5=mknode(+, p3, p4)

By Vanishree M L, Dept. of CSE, GAT pg. 17

Syntax tree

Intermediate Code Generation

In the analysis-synthesis model of a compiler, the front end translates a source
program into a machine-independent intermediate code, and the back end then uses this
intermediate code to generate the target code.
The benefits of using machine-independent intermediate code are
• It is easy to change the source or the target language by adapting only the front end or the back end.
• It makes optimization easier.
• The intermediate representation can be directly interpreted.

Intermediate representations

The intermediate representation is chosen in such a way that

• It should be easy to translate the source language to the intermediate representation.
• It should be easy to translate the intermediate representation to the machine code.
• The intermediate representation should be suitable for optimization.

The following are commonly used intermediate code representations

1. Linear representation - Postfix notation
2. Graphical representation
3. Three-address code
4. Static Single Assignment form (SSA)

1. Linear representation - Postfix notation

Infix expression : a+b
Postfix expression : ab+
No ( ) are needed in postfix notation because the position and number of arguments of the operators
permit only one way to decode a postfix expression. In postfix notation, the operator follows its
operands.
Example :
Infix representation : (a-b)*(c+d)+(a-b)
Postfix representation : ab-cd+*ab-+
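
The conversion from infix to postfix can be automated with an operator stack. The sketch below uses the well-known shunting-yard method, which the notes do not name; it is offered only as one possible way to obtain the postfix form. The function name to_postfix is hypothetical, operands are assumed to be single letters or digits, and all four operators are treated as left associative.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_postfix(infix):
    output, stack = [], []
    for tok in infix.replace(" ", ""):
        if tok.isalnum():
            output.append(tok)                     # operands go straight to the output
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced parentheses")
            stack.pop()                            # discard the matching "("
        elif tok in PRECEDENCE:
            # pop operators of higher or equal precedence (left associativity)
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        else:
            raise ValueError(f"unexpected character {tok!r}")
    while stack:
        op = stack.pop()
        if op == "(":
            raise ValueError("unbalanced parentheses")
        output.append(op)
    return "".join(output)


print(to_postfix("(a-b)*(c+d)+(a-b)"))             # prints ab-cd+*ab-+
```

The same function gives xy*5-z+ for x*y-5+z, the postfix string used in the syntax tree questions.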

2. Graphical representation
a) Syntax Tree
Syntax tree is a condensed form of a parse tree. The operator and keyword nodes of the
parse tree are moved to their parents, and a chain of single productions is replaced by a single
link in the syntax tree. The internal nodes are operators and the child nodes are operands. To form
a syntax tree, first parenthesize the expression fully. This makes it easy to recognize which
operands belong to each operator.
Example :
x = -a * b + -a * b

b) DAG (Directed Acyclic Graph)

DAG is a directed graph that contains no cycles. A DAG is used to represent common sub
expressions. It has leaves corresponding to operands and interior nodes corresponding to
operators. A node may have more than one parent.

Example 1 :
t0 = a + b
t1 = t0 + c
d = t0 + t1

[Figure: DAG after each of the instructions t0 = a + b, t1 = t0 + c and d = t0 + t1]

Example 2 :
a + a * (b – c) + ( b – c) * d

Construction of DAG
Step 1:
For each three-address instruction of the form x := y op z, do the following activities.
a) Find a node labeled y. If none exists, create it.
b) Find a node labeled z. If none exists, create it.
c) Find a node labeled op with y as the left child and z as the right child. If none exists,
create it. Call this node N.
d) Add x to the list of identifiers attached to N.
Step 2:
For each three-address instruction of the form x := y, do the following activities.
a) Find a node labeled y. If none exists, create it. Call this node N.
b) Add x to the list of identifiers attached to N.

Step 3:
For each three-address instruction of the form x := -y, do the following activities.
a) Find a node labeled y. If none exists, create it.
b) Find a node labeled - with y as the child. If none exists, create it. Call this node N.
c) Add x to the list of identifiers attached to N.
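
The three steps can be sketched as a small class. This is an illustration under stated assumptions rather than a definitive implementation: the class and method names are hypothetical, and instead of keeping a list of identifiers on each node, the dictionary current records which node holds the latest value of each name.

```python
class DAG:
    def __init__(self):
        self.nodes = []      # node id -> (label, left child id, right child id)
        self.index = {}      # (label, left, right) -> node id, used to find existing nodes
        self.current = {}    # name -> id of the node that currently holds its value

    def _node(self, label, left=None, right=None):
        key = (label, left, right)
        if key not in self.index:              # "if none exists, create it"
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]

    def _operand(self, name):
        if name not in self.current:
            self.current[name] = self._node(name)   # leaf for an initial value
        return self.current[name]

    def binary(self, x, y, op, z):             # Step 1: x := y op z
        left, right = self._operand(y), self._operand(z)
        self.current[x] = self._node(op, left, right)

    def copy(self, x, y):                      # Step 2: x := y
        self.current[x] = self._operand(y)

    def unary_minus(self, x, y):               # Step 3: x := -y
        self.current[x] = self._node("uminus", self._operand(y))


dag = DAG()
dag.binary("t1", "a", "+", "b")
dag.binary("t2", "t1", "+", "c")
dag.binary("t3", "t1", "*", "t2")
print(len(dag.nodes))                          # prints 6
```

The six nodes are the leaves a, b and c and the three operator nodes; the node for a + b is created once and shared by t2 and t3.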

Questions

1. Give the DAG representation for the following basic block.

x=x*3
y=y+x
x=y-z
y=x

Solution :

[Figure: DAG after each instruction: 1. x = x*3, 2. y = y+x, 3. x = y-z, 4. y = x]

2. Obtain the DAG representation for the expression (a+b) * (a+b+c)

Solution :
Convert to three-address instructions.
t1 = a+b
t2 = t1+c
t3 = t1*t2

[Figure: DAG after each of the instructions t1 = a+b, t2 = t1+c and t3 = t1*t2]

3. Obtain the DAG representation for the expression a + a * (b - c) + (b – c) * d

Solution :
Note : a + a is not parenthesized, and * has higher precedence than + and -.
Hence the expression is evaluated as (a + (a * (b - c))) + ((b - c) * d).
Convert to three-address instructions.
t1 = b - c
t2 = a * t1
t3 = t1 * d
t4 = a + t2
t5 = t4 + t3

4. Obtain the DAG representation for the 3-address representation

a=b*c
d=b
e=d*c
b=e
f=b+c
g=f+d

Solution :
[Figure: DAG after each of the instructions a = b*c, d = b, e = d*c, b = e, f = b+c and g = f+d]

5. Obtain the DAG representation for the expression (((a+a) + (a+a)) + ((a+a) + (a+a)))

Solution :
Convert to three-address instructions.
t1 = a+a
t2 = t1+t1
t3 = t2
t4 = t3+t3

[Figure: DAG after each of the instructions t1 = a+a, t2 = t1+t1, t3 = t2 and t4 = t3+t3]

6. Obtain the DAG representation for the expression (a + b) * (a + b) + c + d

Solution :
Convert to three-address instructions.
t1 = a+b
t2 = t1*t1
t3 = t2+c
t4 = t3+d

[Figure: DAG after each of the instructions t1 = a+b, t2 = t1*t1, t3 = t2+c and t4 = t3+d]

7. Obtain the DAG representation for the expression (a + a) * (b - c) + (b - c) * d

Solution :
Convert to three-address instructions.
t1 = a+a
t2 = b-c
t3 = t1*t2
t4 = t2*d
t5 = t3+t4

[Figure: DAG after each of the instructions t1 = a+a, t2 = b-c, t3 = t1*t2, t4 = t2*d and t5 = t3+t4]

8. Obtain the DAG representation for the expression ((x+y) - ((x+y) * (x-y))) + ((x+y) * (x-y))
Solution :
Convert to three-address instructions.
t1 = x+y
t2 = x-y
t3 = t1*t2
t4 = t1-t3
t5 = t3
t6 = t4+t5

[Figure: DAG after each of the instructions t1 = x+y, t2 = x-y, t3 = t1*t2, t4 = t1-t3, t5 = t3 and t6 = t4+t5]

3. Three address code

In the three address code form, at the most, three addresses are used to represent any statement. The
general form of three address code representation is a:= b op c where a, b, c are operands and op is
an operator.
Example :
Consider the expression x = a + b + c. Its three-address code will be
t0 = a + b
t1 = t0 + c
x = t1
where t0 and t1 are temporary names generated by the compiler.
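
The generation of temporaries can be sketched with a stack over the postfix form of the expression. This is a minimal illustration, not the notes' own algorithm; the function name three_address and the list-of-strings output are assumptions.

```python
def three_address(target, postfix):
    """Emit one instruction per operator, naming results t0, t1, ..."""
    operators = {"+", "-", "*", "/"}
    code, stack, count = [], [], 0
    for tok in postfix:
        if tok in operators:
            right = stack.pop()
            left = stack.pop()
            temp = f"t{count}"                 # new compiler-generated temporary
            count += 1
            code.append(f"{temp} = {left} {tok} {right}")
            stack.append(temp)
        else:
            stack.append(tok)
    code.append(f"{target} = {stack.pop()}")   # final copy statement
    return code


for line in three_address("x", ["a", "b", "+", "c", "+"]):
    print(line)
# prints:
# t0 = a + b
# t1 = t0 + c
# x = t1
```

Each instruction has at most one operator on its right-hand side, so no instruction uses more than three addresses.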
The various types of three-address statements are

1. Assignment statement : x := y op z
   op is an arithmetic or logical operator that performs a binary operation.
2. Assignment instruction : x := op y
   Performs a unary operation; op here is a unary operator.
3. Copy statement : x := y
   The value of y is assigned to x.
4. Unconditional jump : goto L
   Control goes to the statement labelled L.
5. Conditional jump : if x relop y goto L
   relop is a relational operator such as <, >, <=, >=. If x relop y is true, the goto L
   statement is executed.
6. Procedure call : pqr (x) { ….. return y; }
   Here x is used as the parameter to procedure pqr. The return statement returns the value of y.
7. Indexed assignment : x := y[i]
   The value of array y at index i is assigned to x.
   x[i] := y
   The value of identifier y is assigned to index i of array x.
8. Address and pointer assignment : x := &y
   The value of x will be the address or location of y.
   x := *y
   y is a pointer; the value stored at the location it points to is assigned to x.
   *x := y
   The r-value of the object pointed to by x is set to the r-value of y.

Implementation of three address statements

Three address code is an abstract form of intermediate code that can be implemented as records
with fields for operator and operands. The three representations are
a) Quadruple representation
b) Triple representation
c) Indirect triple representation

a) Quadruple representation
In quadruple representation, each instruction is divided into four fields - op, arg1, arg2 and
result.
• The op field is used to represent the internal code for the operator.
• The arg1 and arg2 represent two operands
• The result is used to store the result of the expression.

Example :
a := -b * (c + d)

Three-address code   Location   op       arg1   arg2   result
t1 = -b              (0)        uminus   b      -      t1
t2 = c + d           (1)        +        c      d      t2
t3 = t1 * t2         (2)        *        t1     t2     t3
a = t3               (3)        :=       t3     -      a

b) Triple representation
• In this representation, the use of temporary variables is avoided.
• Instead, references to the instructions that compute the values are made.
• A triple is a record containing three fields: op, arg1 and arg2.

Example :
a := -b * (c + d)

Three-address code   Location   op       arg1   arg2
t1 = -b              (0)        uminus   b      -
t2 = c + d           (1)        +        c      d
t3 = t1 * t2         (2)        *        (0)    (1)
a = t3               (3)        :=       (2)    -
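
The relation between the two tables can be sketched in code: every temporary in a quadruple is replaced by the location of the instruction that computes it. The tuple layouts and the name to_triples are assumptions made for this illustration. One deliberate difference from the table: the sketch stores the copy as (":=", target, source) so that the target name a is not lost.

```python
# Quadruples (op, arg1, arg2, result) for t1 = -b, t2 = c + d, t3 = t1 * t2, a = t3
quads = [
    ("uminus", "b", None, "t1"),
    ("+", "c", "d", "t2"),
    ("*", "t1", "t2", "t3"),
    (":=", "t3", None, "a"),
]


def to_triples(quads):
    position = {}        # result name -> location of the triple that computes it
    triples = []
    for op, arg1, arg2, result in quads:
        a1 = position.get(arg1, arg1)          # replace a temporary by its location
        a2 = position.get(arg2, arg2)
        if op == ":=":
            triples.append((op, result, a1))   # keep the assignment target
        else:
            triples.append((op, a1, a2))
            position[result] = len(triples) - 1
    return triples


for location, triple in enumerate(to_triples(quads)):
    print(location, triple)
```

In the output, integer arguments are locations of earlier triples, while strings are names of program variables.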

c) Indirect Triple representation

This representation uses a separately stored list of pointers to the triples, rather than
listing the triples themselves in execution order.
It is similar in utility to the quadruple representation but requires less space.

Example :
a = b * -c + b * -c

Three-address code   Location   op       arg1   arg2
t1 = -c              (1)        uminus   c      -
t2 = b * t1          (2)        *        b      (1)
t3 = -c              (3)        uminus   c      -
t4 = b * t3          (4)        *        b      (3)
t5 = t2 + t4         (5)        +        (2)    (4)
a = t5               (6)        :=       (5)    -

Address   Location
30        (1)
31        (2)
32        (3)
33        (4)
34        (5)
35        (6)

4. Static Single Assignment form (SSA)

SSA is an intermediate representation in which every variable is assigned exactly once. This
facilitates certain code optimizations.
Example :
Note that the subscripts distinguish each definition of the variables p and q in the SSA representation.

Three-address code   SSA form

p = a + b            p1 = a + b
q = p - c            q1 = p1 - c
p = q * d            p2 = q1 * d
p = e - p            p3 = e - p2
q = p + q            q2 = p3 + q1
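
For straight-line code, the renaming shown in the table can be sketched as follows. This is only an illustration: the function name to_ssa and the tuple format (target, left, op, right) are assumptions, and code with branches would additionally need merge (phi) functions, which this sketch does not handle.

```python
def to_ssa(statements):
    version = {}                               # variable -> latest subscript

    def use(name):
        # a variable that was never assigned here keeps its plain name
        return f"{name}{version[name]}" if name in version else name

    result = []
    for target, left, op, right in statements:
        lhs, rhs = use(left), use(right)       # read operands before redefining target
        version[target] = version.get(target, 0) + 1
        result.append(f"{target}{version[target]} = {lhs} {op} {rhs}")
    return result


block = [
    ("p", "a", "+", "b"),
    ("q", "p", "-", "c"),
    ("p", "q", "*", "d"),
    ("p", "e", "-", "p"),
    ("q", "p", "+", "q"),
]
for line in to_ssa(block):
    print(line)
# prints:
# p1 = a + b
# q1 = p1 - c
# p2 = q1 * d
# p3 = e - p2
# q2 = p3 + q1
```

Operands are renamed before the target receives its new subscript, which is why p = e - p becomes p3 = e - p2.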

Code Generation

• Code generation is the final phase of the compiler.

• The code optimizer accepts the intermediate code representation generated by the
front end of the compiler and produces another, optimized intermediate code representation.
• The code generator takes the intermediate representation produced by the code optimizer, along
with supplementary information in the symbol table of the source program, and produces as output an
equivalent target program.

• The code generator has 3 main tasks.

  o Instruction selection : Choose appropriate target machine instructions to
    implement the intermediate representation statements.
  o Register allocation and assignment : Decide what values to keep in which registers.
  o Instruction ordering : Decide in what order to schedule the execution of
    instructions.

Issues in the design of a code generator

1. Input to code generator

The input to the code generator is the intermediate code generated by the front end, along with
information in the symbol table that determines the run-time addresses of the data objects
denoted by the names in the intermediate representation. Intermediate code may be
represented as quadruples, triples, indirect triples, postfix notation, syntax trees,
DAGs, etc. The code generation phase proceeds on the assumption that its input is free
from syntactic and static semantic errors, that the necessary type checking has taken place
and that type-conversion operators have been inserted wherever necessary.

2. Target program
The target program is the output of the code generator. The output may be absolute machine
language, relocatable machine language or assembly language. Absolute machine language
has the advantage that it can be placed in a fixed memory location and
immediately executed. Relocatable machine language allows subprograms and
subroutines to be compiled separately. Relocatable object modules can be linked together and
loaded by a linking loader, at the added expense of linking and loading. Assembly
language as output makes code generation easier: we can generate symbolic instructions
and use the macro facilities of the assembler. However, an additional assembly
step is needed after code generation.

3. Memory Management
Mapping the names in the source program to the addresses of data objects is done by the front
end and the code generator. A name in the three address statements refers to the symbol table
entry for name. Then from the symbol table entry, a relative address can be determined for the
name.

4. Instruction selection
Selecting the best instructions improves the efficiency of the program. The instruction set
should be complete and uniform. Instruction speeds and machine idioms also
play a major role when efficiency is considered. But if we do not care about the efficiency of
the target program, then instruction selection is straightforward.
For example, the three-address statements below would be translated into the code
sequence that follows them:

P := Q + R
S := P + T

MOV Q, R0
ADD R, R0
MOV R0, P
MOV P, R0
ADD T, R0
MOV R0, S

Here the fourth instruction is redundant, as it reloads the value of P that was just stored
by the previous instruction. This leads to an inefficient code sequence. A
given intermediate representation can be translated into many code sequences, with significant
cost differences between the different implementations. Prior knowledge of instruction costs
is needed in order to design good sequences, but accurate cost information is difficult to
predict.
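
The statement-by-statement translation and the removal of the redundant load can be sketched as follows. This is an illustration only: the function names are hypothetical, only statements of the form dest := left + right are handled, and the clean-up pass assumes that the removed instruction is not the target of a jump.

```python
def naive_codegen(statements):
    """Translate each (dest, left, right), meaning dest := left + right, on its own."""
    code = []
    for dest, left, right in statements:
        code += [f"MOV {left}, R0", f"ADD {right}, R0", f"MOV R0, {dest}"]
    return code


def drop_redundant_loads(code):
    """Remove 'MOV X, R' when it directly follows 'MOV R, X'."""
    result = []
    for instr in code:
        if result and instr.startswith("MOV ") and result[-1].startswith("MOV "):
            prev_src, prev_dst = [s.strip() for s in result[-1][4:].split(",")]
            src, dst = [s.strip() for s in instr[4:].split(",")]
            if src == prev_dst and dst == prev_src:
                continue               # the value is already in the register
        result.append(instr)
    return result


code = naive_codegen([("P", "Q", "R"), ("S", "P", "T")])
for line in drop_redundant_loads(code):
    print(line)
# prints:
# MOV Q, R0
# ADD R, R0
# MOV R0, P
# ADD T, R0
# MOV R0, S
```

The naive translation produces the six instructions shown above; the clean-up pass removes MOV P, R0 and leaves five.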

5. Register allocation issues

Computations on registers are faster than computations on memory, so efficient
utilization of registers is important. The use of registers is subdivided into two subproblems:
1. During register allocation, we select the set of variables that will reside in
registers at each point in the program.
2. During the subsequent register assignment phase, we pick the specific register in which
each variable will reside.
As the number of variables increases, the optimal assignment of registers to variables becomes
difficult. Mathematically, this problem is NP-complete. Certain machines require
register pairs consisting of an even and the next odd-numbered register. Example : M a, b
This type of multiplication instruction involves a register pair where a, the multiplicand, is the
even register and b, the multiplier, is the odd register of the even/odd register pair.

Evaluation order
The code generator decides the order in which the computations are performed. The order of
computations affects the efficiency of the target code: some orders require fewer registers to
hold the intermediate results than others. However, picking the best order in the general case
is a difficult, NP-complete problem.

Approaches to code generation issues

A code generator must always generate correct code. Correctness takes on special significance
because of the number of special cases that a code generator might face. Some of the design goals
of a code generator are:
 Correct
 Easily maintainable
 Testable
 Efficient

The target language

It is a three-address machine language with the following instruction format:

op target, source1, source2
The target machine is byte-addressable, i.e., each memory address identifies 8 bits (one byte) of
information.
It has n registers, denoted by R0, R1, R2, ..., Rn-1.

The various types of instructions that are supported by the target machine are
1) Load instructions
2) Store instructions
3) Arithmetic (computational) instructions
4) Unconditional jumps
5) Conditional jumps

1. Load instructions
They copy data into the destination operand, which must be a register.
Syntax :
LD destination, address
The second operand can be either a register or a memory location.
Example :
LD R1, R2
LD R1, A

2. Store instructions
It is the opposite of the load instruction: it copies data from a register into the memory location
specified by the destination operand.
Syntax :
ST destination, register
The destination must be a memory location.
Example :
ST A, R1

3. Arithmetic instructions
These instructions perform arithmetic operations.
Syntax :
OP destination, source1, source2
Example :
ADD R0, R1, R2
SUB R0, R0, R1
MUL R2, R0, R1

4. Unconditional jump instructions

A branch instruction without any condition is called an unconditional jump instruction.
Syntax :
BR label or JMP label
where BR stands for BRanch.

5. Conditional Jump instructions

If branching takes place based on the value stored in a register, i.e., whether it is positive,
negative or zero, then the instructions are called conditional branch instructions.
Syntax :
B[condition] register, label
B – Branch
Condition – for example, LT for less than and GT for greater than
Example :
BL R0, T1
BLTZ R1, T2
(Branch on less than zero)

The addressing modes supported by the generalized target machine are
a) Direct addressing
The address of the data to be accessed is directly present in the instruction, i.e., if a location is
identified by a variable name x, the value stored in that memory location can be accessed
directly using x.
Example : LD R1, x
Load the content of memory location x into register R1. A register can also be named directly
as an operand: LD R1, R2 loads the content of register R2 into R1.

b) Indexed addressing
The data is accessed from a memory location computed from a base address and an index register.
Example : LD R1, A[R2]
Load the content of memory address (address of A + content of register R2) to register R1.

c) Indexed addressing where the base address is an integer

Same as the previous mode, except that the base memory address is given as an integer.
Example : LD R1, 100[R2]
Load the content of memory address (100 + content of register R2) to register R1.

d) Indirect addressing
The data is accessed by de-referencing an address, written with the * operator.
Example :
LD R1, *(R2)
LD R1, @100
The second instruction loads into register R1 the content of the memory location whose address is
stored at memory address 100.

e) Immediate addressing
The data to be manipulated is directly present in the instruction and is preceded by #.
Example : LD R1, #100

Questions

1. Generate the code for 3-address statement x = y – z

LD R1, y // R1=y
LD R2, z // R2=z
SUB R1, R1, R2 // R1=R1-R2
ST x, R1 // x=R1
2. Generate the code for 3-address statement x=*p
LD R1, p // R1=p
LD R2, 0(R1) // R2=*p
ST x, R2 // x=R2

3. Generate the code for 3-address statement *p=x

LD R1, x // R1=x
LD R2, p // R2=p
ST 0(R2), R1 // *p=x

4. Generate the code for 3-address statement : if(x<y) goto L

LD R1, x // R1 = x
LD R2, y // R2 = y
SUB R1, R1, R2 // R1 = R1 - R2
BLTZ R1, M // if R1<0 jump to M, the machine-code label that corresponds to L

5. Generate the code for 3-address statement b=a[i]

LD R1, i // R1=i
MUL R1, R1, 8 // R1=R1*8, array elements are 8 byte values
LD R2, a(R1) // R2=contents(a+R1), i.e. a[i]
ST b, R2 // b=R2

6. Generate the code for 3-address statement a[i]=b

LD R1, b // R1=b
LD R2, i // R2=i
MUL R2, R2, 8 // R2=R2*8
ST a(R2), R1 // a[i]=b
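
The pattern used in question 1 (load both operands, apply the operation, store the result) can be written as a small translation routine. The following is a minimal Python sketch, assuming statements of the form x = y op z and assuming that registers R1 and R2 are always free; the names OPCODES and gen_binary are hypothetical and not part of the notes.

```python
# Map three-address operators to the target machine's opcodes.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def gen_binary(target, left, op, right):
    """Translate 'target = left op right' into LD/LD/OP/ST instructions.

    Assumption: R1 and R2 are free, so no register allocation is attempted.
    """
    return [
        f"LD R1, {left}",
        f"LD R2, {right}",
        f"{OPCODES[op]} R1, R1, R2",
        f"ST {target}, R1",
    ]


for line in gen_binary("x", "y", "-", "z"):
    print(line)
```

Calling gen_binary("x", "y", "-", "z") reproduces the four instructions of question 1. A real code generator would track which values are already in registers instead of reloading them for every statement.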

Program and instruction cost

Cost of the program = compilation cost + run time cost

Cost of each instruction = 1 + cost(addressing mode)
where
 Register addressing mode, where the operands are registers, has cost = 0
 Addressing modes involving variables (memory locations) have cost = 1
 Addressing modes involving constants have cost = 1
 Indirect addressing modes have cost = 1
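
The cost rule can be checked mechanically. The following Python sketch is an illustration under stated assumptions: an operand written as R followed by digits is a register (cost 0) and every other operand form costs 1; the function names operand_cost, instruction_cost and program_cost are hypothetical. The sketch adds the cost of every operand; in the instruction sequences of this module at most one operand per instruction is not a register, so each instruction costs either 1 or 2.

```python
import re

REGISTER = re.compile(r"R\d+")


def operand_cost(operand):
    """Return 0 for a plain register operand, 1 for any other addressing mode."""
    return 0 if REGISTER.fullmatch(operand) else 1


def instruction_cost(instruction):
    """Cost = 1 + the added cost of each operand's addressing mode."""
    _opcode, operand_text = instruction.split(None, 1)
    operands = [part.strip() for part in operand_text.split(",")]
    return 1 + sum(operand_cost(operand) for operand in operands)


def program_cost(instructions):
    """Total cost of a sequence of instructions."""
    return sum(instruction_cost(instruction) for instruction in instructions)


program = ["LD R0, y", "LD R1, z", "ADD R0, R0, R1", "ST x, R0"]
print(program_cost(program))  # 2 + 2 + 1 + 2 = 7
```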

Questions

1. Determine the cost of the following sequence of instructions

LD R0, y // cost = 1 + 1 = 2
LD R1, z // cost = 1 + 1 = 2
ADD R0, R0, R1 // cost = 1 + 0 = 1
ST x, R0 // cost = 1 + 1 = 2
Total cost = 7

2. Determine the cost of the following sequence of instructions

LD R0, i // cost = 1 + 1 = 2
MUL R0, R0, 8 // cost = 1 + 1 = 2
LD R1, a(R0) // cost = 1 + 1 = 2
ST b, R1 // cost = 1 + 1 = 2
Total cost = 8

3. Determine the cost of the following sequence of instructions

LD R0, c // cost = 1 + 1 = 2
LD R1, i // cost = 1 + 1 = 2
MUL R1, R1, 8 // cost = 1 + 1 = 2
ST a(R1), R0 // cost = 1 + 1 = 2
Total cost = 8

4. Determine the cost of the following sequence of instructions

LD R0, p // cost = 1 + 1 = 2
LD R1, 0(R0) // cost = 1 + 1 = 2
ST x, R1 // cost = 1 + 1 = 2
Total cost = 6

Code Optimization Techniques

1. Compile Time Evaluation
2. Common Sub-Expression Elimination
3. Code Movement
4. Dead Code Elimination
5. Strength Reduction

1. Compile Time Evaluation

Two techniques that fall under compile time evaluation are
a) Constant Folding
In this technique,
 As the name suggests, the constants are folded.
 Expressions whose operands have constant values at compile time are evaluated by the
compiler.
 Those expressions are then replaced with their respective results.

Example:
Circumference of Circle = (22/7) x Diameter
Here,
 This technique evaluates the expression 22/7 at compile time.
 The expression is then replaced with its result, approximately 3.14.
 This saves time at run time.
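
Constant folding can be sketched with Python's standard ast module, which is used here only as a stand-in for a compiler's intermediate representation (ast.unparse needs Python 3.9 or later). The names FOLDABLE, ConstantFolder and fold_constants are hypothetical, and only the four basic arithmetic operators are handled.

```python
import ast
import operator

# Binary operators this sketch knows how to fold.
FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


class ConstantFolder(ast.NodeTransformer):
    """Replace 'constant op constant' subtrees with their computed value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        fold = FOLDABLE.get(type(node.op))
        if (fold is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            try:
                value = fold(node.left.value, node.right.value)
            except (ZeroDivisionError, TypeError):
                return node  # cannot be evaluated now; leave it for run time
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def fold_constants(source):
    """Return the source text with constant sub-expressions folded."""
    tree = ConstantFolder().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


print(fold_constants("circumference = (22 / 7) * diameter"))
print(fold_constants("x = 2 * 3 + y"))
```

In the first statement the sub-expression 22 / 7 is replaced by its value, while diameter is left alone because its value is not known at compile time.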

b) Constant Propagation
In this technique,
 If a variable has been assigned a constant value, the compiler replaces later uses of that
variable in the program with the constant value.
 The condition is that the value of the variable must not be altered in between.
Example:
pi = 3.14, radius = 10, Area of circle = pi x radius x radius
Here,
 This technique substitutes the values of the variables 'pi' and 'radius' at compile time.
 It then evaluates the expression 3.14 x 10 x 10.
 The expression is then replaced with its result 314.
 This saves time at run time.
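
The substitution step can be sketched for straight-line code as follows. This is an illustrative Python sketch, not the notes' own algorithm: each statement is assumed to be a (target, tokens) pair, where tokens is a list of variable names, numeric constants and operator strings, and the function name propagate_constants is hypothetical.

```python
def propagate_constants(statements):
    """Substitute known constant values into later straight-line statements.

    Each statement is (target, tokens), where tokens is a list of variable
    names, numeric constants, and operator strings.
    """
    constants = {}   # variable name -> known constant value
    result = []
    for target, tokens in statements:
        # Replace every variable whose value is currently a known constant.
        new_tokens = [constants.get(tok, tok) if isinstance(tok, str) else tok
                      for tok in tokens]
        result.append((target, new_tokens))
        if len(new_tokens) == 1 and isinstance(new_tokens[0], (int, float)):
            constants[target] = new_tokens[0]   # target is now a constant
        else:
            constants.pop(target, None)         # target was altered
    return result


program = [
    ("pi", [3.14]),
    ("radius", [10]),
    ("area", ["pi", "*", "radius", "*", "radius"]),
]
for target, tokens in propagate_constants(program):
    print(target, "=", " ".join(str(tok) for tok in tokens))
```

After propagation the last statement reads area = 3.14 * 10 * 10, which constant folding can then evaluate. A variable is dropped from the table as soon as it is assigned a non-constant value, which enforces the condition that its value must not be altered in between.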

2. Common Sub-Expression Elimination

In this technique,
 As the name suggests, it involves eliminating the common sub expressions.
 The redundant expressions are eliminated to avoid their re-computation.
 The already computed result is reused later in the program whenever it is required.

Example :

Code Before Optimization                Code After Optimization

S1 = 4 x i                              S1 = 4 x i
S2 = a[S1]                              S2 = a[S1]
S3 = 4 x j                              S3 = 4 x j
S4 = 4 x i // Redundant Expression      S5 = n
S5 = n                                  S6 = b[S1] + S5
S6 = b[S4] + S5
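
The elimination in this example can be sketched as a single pass over the statements. The Python sketch below is illustrative only: statements are assumed to be (target, op, arg1, arg2) quadruples, S6 is split into two quadruples using a hypothetical temporary T1, and the function name eliminate_common_subexpressions is also hypothetical. It assumes that no variable or array element is reassigned within the block.

```python
def eliminate_common_subexpressions(statements):
    """Local CSE over straight-line (target, op, arg1, arg2) quadruples.

    Assumption: no variable or array element is reassigned in the block,
    so a previously computed expression is never invalidated.
    """
    available = {}   # (op, arg1, arg2) -> variable already holding the value
    alias = {}       # eliminated target -> variable that replaces it
    result = []
    for target, op, arg1, arg2 in statements:
        # Rewrite operands that refer to an eliminated temporary.
        arg1 = alias.get(arg1, arg1)
        arg2 = alias.get(arg2, arg2)
        key = (op, arg1, arg2)
        if key in available:
            alias[target] = available[key]   # reuse the earlier result
            continue
        available[key] = target
        result.append((target, op, arg1, arg2))
    return result


block = [
    ("S1", "*", "4", "i"),
    ("S2", "[]", "a", "S1"),     # S2 = a[S1]
    ("S3", "*", "4", "j"),
    ("S4", "*", "4", "i"),       # same expression as S1
    ("S5", "copy", "n", None),   # S5 = n
    ("T1", "[]", "b", "S4"),     # T1 = b[S4]
    ("S6", "+", "T1", "S5"),
]
for quad in eliminate_common_subexpressions(block):
    print(quad)
```

The statement computing S4 disappears, and the later use of S4 is rewritten to S1, matching the optimized column of the example.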

3. Code Movement
In this technique,
 As the name suggests, it involves movement of the code.
 Code inside a loop is moved out of the loop if its result does not depend on the loop
iteration, so that it does not matter whether it is present inside or outside.
 Otherwise, such code is unnecessarily executed again and again with each iteration of the loop.
 This wastes time at run time.

Example:
Code Before Optimization                Code After Optimization
for ( int j = 0 ; j < n ; j ++)         x = y + z;
{                                       for ( int j = 0 ; j < n ; j ++)
    x = y + z;                          {
    a[j] = 6 x j;                           a[j] = 6 x j;
}                                       }
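
Deciding which statements may be moved out of a loop can be sketched as a simple check on operands. The following Python sketch is an illustration under stated assumptions: each statement is a (target, operands) pair in which operands lists everything the statement reads, including array index variables; the body is straight-line code; each target is assigned once; and the loop executes at least once. The function name hoist_invariants is hypothetical.

```python
def hoist_invariants(body, loop_var):
    """Split a loop body into (hoisted, remaining) statements.

    A statement is treated as loop-invariant when none of its operands is
    the loop variable or a variable assigned inside the loop. This is a
    conservative test: it never hoists a statement that depends on the loop.
    """
    defined_in_loop = {target for target, _ in body} | {loop_var}
    hoisted, remaining = [], []
    for target, operands in body:
        if any(operand in defined_in_loop for operand in operands):
            remaining.append((target, operands))
        else:
            hoisted.append((target, operands))
    return hoisted, remaining


body = [
    ("x", ["y", "z"]),       # x = y + z
    ("a[j]", ["6", "j"]),    # a[j] = 6 x j, reads the loop variable j
]
hoisted, remaining = hoist_invariants(body, "j")
print("before loop:", hoisted)
print("inside loop:", remaining)
```

Here x = y + z is hoisted because neither y nor z changes inside the loop, while the assignment to a[j] stays inside because it reads j.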

4. Dead Code Elimination

In this technique,
 As the name suggests, it involves eliminating the dead code.
 Statements that are never executed (unreachable) or whose output is never used are
eliminated.
Example :
Code Before Optimization                Code After Optimization
i = 0;                                  i = 0;
if (i == 1)
{
    a = x + 5;
}
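
The example above removes a branch that can never execute. The other case named in this section, a statement whose output is never used, can be found with a backward pass over straight-line code. The following Python sketch is illustrative only: statements are assumed to be (target, operands) pairs with no side effects, live_out is the set of variables needed after the block, and the function name eliminate_dead_assignments is hypothetical.

```python
def eliminate_dead_assignments(statements, live_out):
    """Remove assignments whose target is never read afterwards.

    Each statement is (target, operands); live_out is the set of variables
    still needed after the block. Assumption: statements have no side
    effects other than assigning to their target.
    """
    live = set(live_out)
    kept = []
    for target, operands in reversed(statements):
        if target in live:
            kept.append((target, operands))
            live.discard(target)     # this assignment satisfies the use
            live.update(operands)    # its operands are now needed
        # otherwise the value is never used, so the statement is dropped
    kept.reverse()
    return kept


block = [
    ("a", ["x"]),   # a = x + 5, never used later
    ("b", ["y"]),   # b = y * 2
    ("c", ["b"]),   # c = b + 1
]
print(eliminate_dead_assignments(block, live_out={"c"}))
```

Only the assignments to b and c survive, because a is neither used inside the block nor needed after it.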

5. Strength Reduction
In this technique,
 As the name suggests, it involves reducing the strength of expressions.
 This technique replaces expensive operators with simpler and cheaper ones.
Example :
Code Before Optimization                Code After Optimization
B = A x 2                               B = A + A

Here,
 The expression “A x 2” is replaced with the expression “A + A”.
 This is because the cost of the multiplication operator is higher than that of the addition
operator.
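
This rewrite can be expressed as a small pattern match. The Python sketch below is an illustration, assuming statements are (target, op, arg1, arg2) tuples; the function name reduce_strength is hypothetical, and only multiplication by the constant 2 is handled.

```python
def reduce_strength(statement):
    """Rewrite 'target = v * 2' (or '2 * v') as 'target = v + v'.

    A statement is a (target, op, arg1, arg2) tuple; other statements are
    returned unchanged.
    """
    target, op, arg1, arg2 = statement
    if op == "*":
        if arg2 == 2:
            return (target, "+", arg1, arg1)
        if arg1 == 2:
            return (target, "+", arg2, arg2)
    return statement


print(reduce_strength(("B", "*", "A", 2)))  # ('B', '+', 'A', 'A')
```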
