Unit 5-Semantic Analysis and Intermediate Code Generation
Unit-5
Introduction
• We can associate information with a language
construct by attaching attributes to the grammar
symbols.
• A syntax directed definition specifies the values
of attributes by associating semantic rules with
the grammar productions.
Production Semantic Rule
E->E1+T E.code=E1.code||T.code||’+’
• We may alternatively insert the semantic actions inside the grammar
E -> E1+T {print ‘+’}
Syntax Directed Definitions
• An SDD is a context-free grammar together with attributes and rules
• Synthesized attributes
– A synthesized attribute at node N is defined only in terms of
attribute values at the children of N and at N itself
• Inherited attributes
– An inherited attribute at node N is defined only in terms of
attribute values at N’s parent, N itself and N’s siblings
Example of S-attributed SDD
Example of mixed attributes
Evaluation orders for SDD’s
• A dependency graph is used to determine the order of
computation of attributes
• Dependency graph
– For each parse tree node, the dependency graph has a node for each attribute
associated with that node
– If a semantic rule defines the value of synthesized attribute A.b in
terms of the value of X.c then the dependency graph has an edge from
X.c to A.b
– If a semantic rule defines the value of inherited attribute B.c in terms
of the value of X.a then the dependency graph has an edge from X.a to
B.c
• Example!
Ordering the evaluation of attributes
• If the dependency graph has an edge from node M to node N,
then the attribute corresponding to M must be evaluated
before the attribute of N
• Thus the only allowable orders of evaluation
are those sequences of nodes N1,N2,…,Nk such
that if there is an edge from Ni to Nj then i<j
• Such an ordering is called a topological sort of the
graph; one exists only if the graph has no cycles
• Example!
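A minimal sketch of how an evaluation order could be obtained from a dependency graph, using Kahn's algorithm for topological sorting. The function name topological_order and the three-attribute graph are assumptions made for illustration; they are not taken from the slides.

```python
from collections import deque

def topological_order(nodes, edges):
    """Return one evaluation order consistent with the dependency edges.

    nodes: list of attribute names, e.g. "T.val"
    edges: list of (m, n) pairs meaning "m must be evaluated before n"
    Raises ValueError if the graph has a cycle (no valid order exists).
    """
    successors = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for m, n in edges:
        successors[m].append(n)
        indegree[n] += 1

    # Attributes with no unevaluated dependencies can be computed right away.
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        m = ready.popleft()
        order.append(m)
        for n in successors[m]:
            indegree[n] -= 1
            if indegree[n] == 0:
                ready.append(n)

    if len(order) != len(indegree):
        raise ValueError("dependency graph has a cycle")
    return order

# Hypothetical graph for one use of E -> E1 + T with rule E.val = E1.val + T.val
nodes = ["E1.val", "T.val", "E.val"]
edges = [("E1.val", "E.val"), ("T.val", "E.val")]
order = topological_order(nodes, edges)
print(order)
```

Any order the function returns satisfies the i<j condition above; a cyclic graph is reported as an error because no such order exists.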
S-Attributed definitions
• An SDD is S-attributed if every attribute is
synthesized
• We can have a post-order traversal of parse-tree to
evaluate attributes in S-attributed definitions
postorder(N) {
for (each child C of N, from the left) postorder(C);
evaluate the attributes associated with node N;
}
• S-Attributed definitions can be implemented during bottom-up
parsing without the need to explicitly create parse trees
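The postorder scheme above can be sketched in Python as follows. This is an illustration only: it assumes the usual expression grammar E -> E + T | T, T -> T * F | F, F -> (E) | digit with a single synthesized attribute val, and the Node class and helper names are hypothetical.

```python
class Node:
    def __init__(self, label, children=(), lexval=None):
        self.label = label            # grammar symbol at this parse-tree node
        self.children = list(children)
        self.lexval = lexval          # set by the lexer for digit leaves
        self.val = None               # synthesized attribute

def evaluate(n):
    """Apply the semantic rule of the production used at node n."""
    kids = n.children
    if n.label == "digit":
        n.val = n.lexval
    elif len(kids) == 3 and kids[1].label == "+":
        n.val = kids[0].val + kids[2].val
    elif len(kids) == 3 and kids[1].label == "*":
        n.val = kids[0].val * kids[2].val
    elif len(kids) == 3 and kids[0].label == "(":
        n.val = kids[1].val
    elif len(kids) == 1:
        n.val = kids[0].val
    # operator and parenthesis leaves carry no attribute

def postorder(n):
    for c in n.children:              # children first, left to right
        postorder(c)
    evaluate(n)                       # then the node itself

def F(v):
    return Node("F", [Node("digit", lexval=v)])

# Parse tree for 3 * 5 + 4
t = Node("T", [Node("T", [F(3)]), Node("*"), F(5)])
e = Node("E", [Node("E", [t]), Node("+"), Node("T", [F(4)])])
postorder(e)
print(e.val)
```

Because every attribute is synthesized, each node's children are fully evaluated before the node's own rule runs.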
L-Attributed definitions
• An SDD is L-attributed if the edges in its dependency graph go
from left to right but not from right to left.
• More precisely, each attribute must be either
– Synthesized
– Inherited, but if there is a production A->X1X2…Xn and there is
an inherited attribute Xi.a computed by a rule associated with
this production, then the rule may only use:
• Inherited attributes associated with the head A
• Either inherited or synthesized attributes associated with the
occurrences of symbols X1,X2,…,Xi-1 located to the left of Xi
• Inherited or synthesized attributes associated with this occurrence of
Xi itself, but in such a way that there is no cycle in the graph
Application of Syntax Directed Translation
• Type checking and intermediate code generation
• Construction of syntax trees
– Leaf nodes: Leaf(op,val)
– Interior node: Node(op,c1,c2,…,ck)
• Example:
Production       Semantic Rules
1) E -> E1 + T   E.node = new Node(‘+’, E1.node, T.node)
2) E -> E1 - T   E.node = new Node(‘-’, E1.node, T.node)
3) E -> T        E.node = T.node
4) T -> (E)      T.node = E.node
5) T -> id       T.node = new Leaf(id, id.entry)
6) T -> num      T.node = new Leaf(num, num.val)
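A small sketch of what the Leaf and Node constructors might look like, and of the calls the rules above would make bottom-up for the input a - 4 + c. The class layout, the show helper and the sample input are assumptions for illustration; symbol-table entries are stood in for by plain strings.

```python
class Leaf:
    def __init__(self, op, val):
        self.op = op              # token class: "id" or "num"
        self.val = val            # symbol-table entry or numeric value

class Node:
    def __init__(self, op, *children):
        self.op = op              # operator label
        self.children = children  # pointers to the operand subtrees

def show(n):
    """Render a tree as a fully parenthesized string (binary operators only)."""
    if isinstance(n, Leaf):
        return str(n.val)
    left, right = n.children
    return f"({show(left)} {n.op} {show(right)})"

# Calls made by the semantic rules while parsing a - 4 + c
p1 = Leaf("id", "a")          # T -> id
p2 = Leaf("num", 4)           # T -> num
p3 = Node("-", p1, p2)        # E -> E1 - T
p4 = Leaf("id", "c")          # T -> id
p5 = Node("+", p3, p4)        # E -> E1 + T
print(show(p5))
```

The productions E -> T and T -> (E) create no node; they only pass the existing pointer upward, which is why parentheses do not appear in the tree.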
Syntax tree for L-attributed definition
Postfix translation schemes
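• The simplest SDDs to implement are those whose grammar can be parsed
bottom-up and whose attributes are all synthesized (S-attributed)
• Every action can then be placed at the end of its production; such an
SDT is called a postfix SDT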
Example of postfix SDT
1) L -> E n {print(E.val);}
2) E -> E1 + T {E.val=E1.val+T.val;}
3) E -> T {E.val = T.val;}
4) T -> T1 * F {T.val=T1.val*F.val;}
5) T -> F {T.val=F.val;}
6) F -> (E) {F.val=E.val;}
7) F -> digit {F.val=digit.lexval;}
Parse-Stack implementation of postfix
SDT’s
• In a shift-reduce parser we can easily implement
semantic actions using the parser stack
• With each nonterminal (or state) on the stack we can
associate a record holding its attributes
• Then in a reduction step we can execute the semantic
action at the end of a production to evaluate the
attribute(s) of the nonterminal on the left side of the
production
• And put the value on the stack in place of the
right side of the production
Example
L -> E n       {print(stack[top-1].val);
                top=top-1;}
E -> E1 + T    {stack[top-2].val=stack[top-2].val+stack[top].val;
                top=top-2;}
E -> T
T -> T1 * F    {stack[top-2].val=stack[top-2].val*stack[top].val;
                top=top-2;}
T -> F
F -> (E)       {stack[top-2].val=stack[top-1].val;
                top=top-2;}
F -> digit
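The stack manipulation above can be tried out with a short Python sketch. It is an illustration under stated assumptions: the sequence of shift and reduce moves for 3 * 5 + 4 n is written out by hand (a real LR parser would derive it from its parsing table), and all function names are hypothetical.

```python
stack = []   # each entry is a record: {"sym": grammar symbol, "val": attribute}

def shift(sym, val=None):
    stack.append({"sym": sym, "val": val})

def reduce_unit(head):
    # E -> T, T -> F, F -> digit: the value stays put, only the symbol changes
    stack[-1]["sym"] = head

def reduce_binary(head, op):
    # E -> E1 + T or T -> T1 * F: the result overwrites the record at top-2
    top = len(stack) - 1
    left, right = stack[top - 2]["val"], stack[top]["val"]
    stack[top - 2] = {"sym": head, "val": left + right if op == "+" else left * right}
    del stack[top - 1:]           # top = top - 2

def reduce_paren():
    # F -> ( E ): copy E's value down over the "(" record
    top = len(stack) - 1
    stack[top - 2] = {"sym": "F", "val": stack[top - 1]["val"]}
    del stack[top - 1:]           # top = top - 2

def reduce_line():
    # L -> E n: print E's value, then top = top - 1
    top = len(stack) - 1
    result = stack[top - 1]["val"]
    print(result)
    stack.pop()
    stack[-1]["sym"] = "L"
    return result

# Hand-written moves for the input 3 * 5 + 4 n
shift("digit", 3); reduce_unit("F"); reduce_unit("T")
shift("*")
shift("digit", 5); reduce_unit("F")
reduce_binary("T", "*")           # T.val = 3 * 5
reduce_unit("E")
shift("+")
shift("digit", 4); reduce_unit("F"); reduce_unit("T")
reduce_binary("E", "+")           # E.val = 15 + 4
shift("n")
result = reduce_line()
```

No parse tree is ever built: the attribute values live only in the stack records and are consumed by the reductions.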
SDT’s with actions inside productions
SDT’s with actions inside productions (cont)
• Any SDT can be implemented as follows
1. Ignore the actions and produce a parse tree
2. Examine each interior node N and add actions as new
children at the correct position
3. Perform a postorder traversal and execute actions
when their nodes are visited
[Figure: parse tree over the digits 3, 5 and 4 with the actions {print(‘+’);}, {print(‘*’);}, {print(3);}, {print(5);} and {print(4);} added as extra children]
SDT’s for L-Attributed definitions
• We can convert an L-attributed SDD into an SDT
using the following two rules:
– Embed the action that computes the inherited
attributes for a nonterminal A immediately before that
occurrence of A. If several inherited attributes of A are
dependent on one another in an acyclic fashion, order
them so that those needed first are computed first
– Place the action of a synthesized attribute for the head
of a production at the end of the body of the
production
Example
Production            Semantic Rules
S -> while (C) S1     L1=new();
                      L2=new();
                      S1.next=L1;
                      C.false=S.next;
                      C.true=L2;
                      S.code=label||L1||C.code||label||L2||S1.code
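The rules for the while-statement can be mimicked with a short Python sketch. Everything named here (new_label, gen_while, and the cond and body callbacks) is hypothetical; the callbacks stand in for C.code and S1.code, which can only be produced once their inherited attributes are known.

```python
import itertools

_counter = itertools.count(1)

def new_label():
    """Plays the role of new(): returns a fresh label name."""
    return f"L{next(_counter)}"

def gen_while(cond_code, body_code, s_next):
    """Build S.code for S -> while (C) S1.

    cond_code(true_label, false_label) returns C.code,
    body_code(next_label) returns S1.code,
    s_next is the inherited attribute S.next.
    """
    L1 = new_label()
    L2 = new_label()
    c_code = cond_code(true_label=L2, false_label=s_next)  # C.true, C.false
    s1_code = body_code(next_label=L1)                     # S1.next
    # S.code = label || L1 || C.code || label || L2 || S1.code
    return f"{L1}:\n{c_code}{L2}:\n{s1_code}"

# Hypothetical code generators for the condition i < n and the body i = i + 1
def cond(true_label, false_label):
    return f"  if i < n goto {true_label}\n  goto {false_label}\n"

def body(next_label):
    return f"  i = i + 1\n  goto {next_label}\n"

code = gen_while(cond, body, s_next="Lnext")
print(code)
```

The inherited attributes are passed as arguments before the child code is requested, and the synthesized S.code is assembled last, which matches the placement of actions in the two conversion rules.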
Intermediate Code Generation
• Variants of Syntax Trees
• Three-address code
• Types and declarations
• Translation of expressions
• Type checking
• Control flow
• Backpatching
Introduction
• Intermediate code is the interface between front end
and back end in a compiler
• Ideally the details of the source language are confined to
the front end and the details of the target machine to
the back end, so that m×n compilers can be built from
m front ends and n back ends
• In this chapter we study intermediate
representations, static type checking and
intermediate code generation
[Figure: DAG for the expression a + a * (b - c) + (b - c) * d]
SDD for creating DAG’s
Production Semantic Rules
1) E -> E1+T E.node= new Node(‘+’, E1.node,T.node)
2) E -> E1-T E.node= new Node(‘-’, E1.node,T.node)
3) E -> T E.node = T.node
4) T -> (E) T.node = E.node
5) T -> id T.node = new Leaf(id, id.entry)
6) T -> num T.node = new Leaf(num, num.val)
Example: construction steps for a + a * (b - c) + (b - c) * d
1) p1=Leaf(id, entry-a)
2) p2=Leaf(id, entry-a)=p1
3) p3=Leaf(id, entry-b)
4) p4=Leaf(id, entry-c)
5) p5=Node(‘-’,p3,p4)
6) p6=Node(‘*’,p1,p5)
7) p7=Node(‘+’,p1,p6)
8) p8=Leaf(id,entry-b)=p3
9) p9=Leaf(id,entry-c)=p4
10) p10=Node(‘-’,p3,p4)=p5
11) p11=Leaf(id,entry-d)
12) p12=Node(‘*’,p5,p11)
13) p13=Node(‘+’,p7,p12)
Value-number method for
constructing DAG’s
Array of records for the DAG of i = i + 10 (the index is the value number):
1   id    to entry for i
2   num   10
3   +     1   2
4   =     1   3
• Algorithm
– Search the array for a node M with label op, left child l and right child r
– If there is such a node, return the value number M
– If not, create in the array a new node N with label op, left child l, and
right child r and return its value number
• A hash table keyed on (op, l, r) avoids searching the whole array
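A minimal sketch of the value-number method with a hash table, here a Python dict keyed on (op, left, right). The Dag class and its method names are assumptions for illustration; symbol-table entries are stood in for by plain strings.

```python
class Dag:
    def __init__(self):
        self.nodes = []    # nodes[k-1] is the record with value number k
        self.index = {}    # (op, left, right) -> value number

    def node(self, op, left=None, right=None):
        """Return the value number of the node <op, left, right>,
        creating the node only if no identical one exists."""
        key = (op, left, right)
        if key in self.index:
            return self.index[key]
        self.nodes.append(key)
        self.index[key] = len(self.nodes)
        return self.index[key]

    def leaf(self, kind, value):
        # A leaf is stored like a node whose single field is its entry or value.
        return self.node(kind, value)

dag = Dag()
# a + a * (b - c) + (b - c) * d
p1 = dag.leaf("id", "a")
p2 = dag.leaf("id", "a")                 # same value number as p1
p3 = dag.leaf("id", "b")
p4 = dag.leaf("id", "c")
p5 = dag.node("-", p3, p4)
p6 = dag.node("*", p1, p5)
p7 = dag.node("+", p1, p6)
p10 = dag.node("-", dag.leaf("id", "b"), dag.leaf("id", "c"))  # same as p5
p11 = dag.leaf("id", "d")
p12 = dag.node("*", p5, p11)
p13 = dag.node("+", p7, p12)
print(len(dag.nodes))
```

Thirteen constructor calls yield only nine distinct nodes, because the repeated leaves and the repeated subexpression b - c are found in the table instead of being rebuilt.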
Three address code
Forms of three address instructions
• x = y op z
• x = op y
• x = y
• goto L
• if x goto L and ifFalse x goto L
• if x relop y goto L
• Procedure calls using:
– param x
– call p,n
– y = call p,n
• x = y[i] and x[i] = y
• x = &y and x = *y and *x = y
Example
Symbolic labels              Position numbers
L: t1 = i + 1                100: t1 = i + 1
   i = t1                    101: i = t1
   t2 = i * 8                102: t2 = i * 8
   t3 = a[t2]                103: t3 = a[t2]
   if t3 < v goto L          104: if t3 < v goto 100
Data structures for three address
codes
• Quadruples
– Has four fields: op, arg1, arg2 and result
• Triples
– Temporaries are not used and instead references
to instructions are made
• Indirect triples
– In addition to triples we use a list of pointers to
triples
Example Three address code
• Expression: b * minus c + b * minus c, assigned to a
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5
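The three-address code above can be held as quadruples and then rewritten as triples. The sketch below is illustrative: the Quad and Triple tuples and the quads_to_triples helper are hypothetical names, and the conversion assumes that every instruction other than a copy writes a fresh temporary. In a triple, an integer argument refers to the instruction at that position.

```python
from collections import namedtuple

Quad = namedtuple("Quad", "op arg1 arg2 result")
Triple = namedtuple("Triple", "op arg1 arg2")

quads = [
    Quad("minus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("minus", "c", None, "t3"),
    Quad("*", "b", "t3", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad("=", "t5", None, "a"),
]

def quads_to_triples(quads):
    """Replace each temporary by the position of the triple that computes it."""
    position = {}    # temporary name -> index of its defining triple
    triples = []
    for q in quads:
        arg1 = position.get(q.arg1, q.arg1)
        arg2 = position.get(q.arg2, q.arg2)
        if q.op == "=":
            # A copy names its destination explicitly: (=, a, source)
            triples.append(Triple("=", q.result, arg1))
        else:
            position[q.result] = len(triples)
            triples.append(Triple(q.op, arg1, arg2))
    return triples

triples = quads_to_triples(quads)
# Indirect triples: a separate list of pointers fixes the execution order,
# so instructions can be reordered without renumbering their references.
instruction_order = list(range(len(triples)))

for i, t in enumerate(triples):
    print(i, t.op, t.arg1, t.arg2)
```

The trade-off is visible in the data: quadruples carry an explicit result name, while triples save that field but tie each reference to an instruction's position.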
Type Expressions
Example: int[2][3]
array(2,array(3,integer))
Type Equivalence
Declarations
Storage Layout for Local Names
• Computing types and their widths
Storage Layout for Local Names
Sequences of Declarations
Fields in Records and Classes
Translation of Expressions and
Statements
• We have discussed how to find the types and offsets
of variables
• This gives us the preparation needed to
discuss translation to intermediate code
• We also discuss type checking
Three-address code for expressions
Incremental Translation
Addressing Array Elements
• Layouts for a two-dimensional array:
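As an illustration of the two common layouts, row-major and column-major, the address of A[i][j] could be computed as sketched below. The function and parameter names are assumptions, and indices are taken to start at 0.

```python
def row_major_address(base, i, j, n2, w):
    """Address of A[i][j] when rows are stored one after another.
    n2 = number of elements in a row, w = width of one element in bytes."""
    return base + (i * n2 + j) * w

def column_major_address(base, i, j, n1, w):
    """Address of A[i][j] when columns are stored one after another.
    n1 = number of elements in a column (the number of rows)."""
    return base + (j * n1 + i) * w

# A 2 x 3 array of 4-byte integers starting at address 0
print(row_major_address(0, 1, 0, n2=3, w=4))     # A[1][0] follows the 3 elements of row 0
print(column_major_address(0, 1, 0, n1=2, w=4))  # A[1][0] is the 2nd element of column 0
```

The same element lands at different addresses in the two layouts, so the code generator has to know which layout the source language prescribes.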
Semantic actions for array reference
Translation of Array References
Conversions between primitive
types in Java
Introducing type conversions into
expression evaluation
Abstract syntax tree for the
function definition
fun length(x) =
if null(x) then 0 else length(tl(x)) + 1
Inferring a type for the function length
Algorithm for Unification
Unification algorithm
boolean unify (Node m, Node n) {
    s = find(m); t = find(n);
    if ( s = t ) return true;
    else if ( nodes s and t represent the same basic type ) return true;
    else if ( s is an op-node with children s1 and s2 and
              t is an op-node with children t1 and t2 ) {
        union(s , t) ;
        return unify(s1, t1) and unify(s2, t2);
    }
    else if ( s or t represents a variable ) {
        union(s, t) ;
        return true;
    }
    else return false;
}
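The pseudocode above could be made runnable roughly as follows. This is a sketch under assumptions: the TypeNode class, the simple pointer-chasing find, and the choice of representative in union are illustrative, the op-node case is generalized to any number of children, and an explicit check that both op-nodes carry the same operator is added. Like the pseudocode, it performs no occurs check.

```python
class TypeNode:
    def __init__(self, kind, name, children=()):
        self.kind = kind                # "basic", "var" or "op"
        self.name = name                # e.g. "integer", "alpha", "->", "list"
        self.children = list(children)
        self.rep = self                 # pointer to the class representative

def find(n):
    """Return the representative of n's equivalence class."""
    while n.rep is not n:
        n = n.rep
    return n

def union(s, t):
    """Merge two classes, keeping a non-variable representative if there is one."""
    if s.kind == "var":
        s.rep = t
    else:
        t.rep = s

def unify(m, n):
    s, t = find(m), find(n)
    if s is t:
        return True
    if s.kind == "basic" and t.kind == "basic":
        return s.name == t.name
    if s.kind == "op" and t.kind == "op":
        if s.name != t.name or len(s.children) != len(t.children):
            return False
        union(s, t)
        return all(unify(a, b) for a, b in zip(s.children, t.children))
    if s.kind == "var" or t.kind == "var":
        union(s, t)
        return True
    return False

alpha = TypeNode("var", "alpha")
beta = TypeNode("var", "beta")
integer = TypeNode("basic", "integer")
boolean = TypeNode("basic", "boolean")

# Unify  alpha -> beta  with  integer -> list(integer)
f = TypeNode("op", "->", [alpha, beta])
g = TypeNode("op", "->", [integer, TypeNode("op", "list", [integer])])
ok = unify(f, g)
print(ok, find(alpha).name, find(beta).name)
```

After a successful call, the substitution is read off by following each variable to its representative with find.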
Control Flow
Short-Circuit Code
Flow-of-Control Statements
Syntax-directed definition
Generating three-address code for booleans
translation of a simple if-statement
Backpatching
• The earlier translation of Boolean expressions emits jumps to symbolic labels
• It therefore needs a separate pass to bind those labels to addresses
• Backpatching avoids this extra pass: a jump is emitted with its target left
empty, and the target is filled in once it is known
• We assume instructions are stored in an array, so labels are indices into
the array
• For nonterminal B we use two attributes, B.truelist and B.falselist, together
with the following functions:
– makelist(i): creates a new list containing only i, an index into the array of
instructions
– merge(p1,p2): concatenates the lists pointed to by p1 and p2 and returns a
pointer to the concatenated list
– backpatch(p,i): inserts i as the target label for each of the instructions on
the list pointed to by p
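The three functions can be sketched in a few lines of Python. The instruction format (a dict with a text and a target field), the emit, relop and listing helpers, and the sample condition x < 100 || x > 200 are assumptions for illustration. The handling of || follows the standard rule for B -> B1 || M B2: backpatch B1.falselist with the index of the first instruction of B2, merge the two truelists, and pass on B2.falselist.

```python
code = []    # instruction array; a label is simply an index into it

def emit(text):
    """Append a jump whose target is not yet known and return its index."""
    code.append({"text": text, "target": None})
    return len(code) - 1

def makelist(i):
    return [i]

def merge(p1, p2):
    return p1 + p2

def backpatch(p, i):
    for k in p:
        code[k]["target"] = i

def relop(test):
    """B -> E1 rel E2: one conditional and one unconditional jump, both unfilled."""
    t = emit(f"if {test} goto")
    f = emit("goto")
    return makelist(t), makelist(f)      # (truelist, falselist)

def listing():
    return [f"{k}: {ins['text']} {'_' if ins['target'] is None else ins['target']}"
            for k, ins in enumerate(code)]

# x < 100 || x > 200
b1_true, b1_false = relop("x < 100")
m = len(code)                            # index of the next instruction (start of B2)
b2_true, b2_false = relop("x > 200")
backpatch(b1_false, m)                   # if B1 is false, go on to test B2
truelist = merge(b1_true, b2_true)
falselist = b2_false

print("\n".join(listing()))
```

The jumps still marked with `_` are exactly those on truelist and falselist; they are filled in later by whatever construct encloses the expression.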
Backpatching for Boolean Expressions
Backpatching for Boolean Expressions
Flow-of-Control Statements
Translation of a switch-statement