Unit 5-Semantic Analysis and Intermediate Code Generation

Uploaded by

btechgaming05

CE442: DESIGN OF LANGUAGE PROCESSOR

Unit-5

Semantic Analysis & Intermediate Code Generation
Outline
• Syntax Directed Definitions
• Evaluation Orders of SDD’s
• Applications of Syntax Directed Translation
• Syntax Directed Translation Schemes

Introduction
• We can associate information with a language
construct by attaching attributes to the grammar
symbols.
• A syntax directed definition specifies the values
of attributes by associating semantic rules with
the grammar productions.
Production Semantic Rule
E->E1+T E.code=E1.code||T.code||’+’
• We may alternatively insert the semantic actions inside the grammar
E -> E1+T {print ‘+’}

Syntax Directed Definitions
• An SDD is a context-free grammar with attributes and rules
• Attributes are associated with grammar symbols and rules with productions
• Attributes may be of many kinds: numbers, types, table references, strings, etc.
• Synthesized attributes
– A synthesized attribute at node N is defined only in terms of
attribute values at the children of N and at N itself
• Inherited attributes
– An inherited attribute at node N is defined only in terms of
attribute values at N’s parent, N itself and N’s siblings

Example of S-attributed SDD

Production Semantic Rules


1) L -> E n L.val = E.val
2) E -> E1 + T E.val = E1.val + T.val
3) E -> T E.val = T.val
4) T -> T1 * F T.val = T1.val * F.val
5) T -> F T.val = F.val
6) F -> (E) F.val = E.val
7) F -> digit F.val = digit.lexval

Example of mixed attributes

Production Semantic Rules


1) T -> F T’        T’.inh = F.val
                    T.val = T’.syn
2) T’ -> * F T’1    T’1.inh = T’.inh * F.val
                    T’.syn = T’1.syn
3) T’ -> ε          T’.syn = T’.inh
4) F -> digit       F.val = digit.lexval

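The mixed-attribute SDD above can be sketched as a small recursive-descent evaluator, in which the inherited attribute T’.inh travels down as a function parameter and the synthesized attribute T’.syn comes back as the return value. This is only an illustrative sketch: the function names and the token-list input format are assumptions, not part of the slides.

```python
def parse_F(tokens):
    # F -> digit : F.val = digit.lexval
    return int(tokens[0]), tokens[1:]

def parse_Tprime(tokens, inh):
    # T' -> * F T'1 : T'1.inh = T'.inh * F.val ; T'.syn = T'1.syn
    if tokens and tokens[0] == "*":
        f_val, rest = parse_F(tokens[1:])
        return parse_Tprime(rest, inh * f_val)
    # T' -> epsilon : T'.syn = T'.inh
    return inh, tokens

def parse_T(tokens):
    # T -> F T' : T'.inh = F.val ; T.val = T'.syn
    f_val, rest = parse_F(tokens)
    return parse_Tprime(rest, f_val)

val, remaining = parse_T(["3", "*", "5"])
print(val)  # 15
```

Each call returns the attribute value together with the unread tokens, so the left operand is always available before the right operand is parsed.
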
Evaluation orders for SDD’s
• A dependency graph is used to determine the order of
computation of attributes

• Dependency graph
– For each parse tree node, the dependency graph has a node for each attribute
associated with that node
– If a semantic rule defines the value of synthesized attribute A.b in
terms of the value of X.c then the dependency graph has an edge from
X.c to A.b
– If a semantic rule defines the value of inherited attribute B.c in terms
of the value of X.a then the dependency graph has an edge from X.a to
B.c
• Example!
Ordering the evaluation of attributes
• If the dependency graph has an edge from node M to node N
then the attribute at M must be evaluated before the attribute
at N
• Thus the only allowable orders of evaluation
are those sequences of nodes N1,N2,…,Nk such
that if there is an edge from Ni to Nj then i<j
• Such an ordering is called a topological sort of the
graph
• Example!
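A topological sort of a dependency graph can be sketched with Python's standard graphlib module (Python 3.9 or later). The graph below is a hand-built assumption: it models the attributes that would arise for the input 3*5 under the grammar T -> F T’, T’ -> * F T’ | ε, and the node names are illustrative labels only.

```python
from graphlib import TopologicalSorter

# Each attribute maps to the set of attributes it depends on,
# i.e. the sources of the edges that point at it.
deps = {
    "F1.val": {"digit1.lexval"},
    "F2.val": {"digit2.lexval"},
    "T1'.inh": {"F1.val"},
    "T2'.inh": {"T1'.inh", "F2.val"},
    "T2'.syn": {"T2'.inh"},
    "T1'.syn": {"T2'.syn"},
    "T.val": {"T1'.syn"},
}

order = list(TopologicalSorter(deps).static_order())
position = {attr: i for i, attr in enumerate(order)}
print(order)
```

Several orders are valid for the same graph; any of them evaluates every attribute after the attributes it depends on.
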
S-Attributed definitions
• An SDD is S-attributed if every attribute is
synthesized
• We can have a post-order traversal of parse-tree to
evaluate attributes in S-attributed definitions

postorder(N) {
for (each child C of N, from the left) postorder(C);
evaluate the attributes associated with node N;
}
• S-Attributed definitions can be implemented during bottom-up
parsing without the need to explicitly create parse trees
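The postorder procedure above can be sketched in Python for the val attribute of the expression grammar. The sketch assumes a simplified tree whose interior nodes are labelled directly with the operator rather than with the nonterminals E, T and F; the class and helper names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ParseNode:
    label: str                                   # "+", "*" or "digit"
    children: List["ParseNode"] = field(default_factory=list)
    val: Optional[int] = None                    # synthesized attribute

def postorder(n):
    # Visit the children from the left, then evaluate the attribute at n.
    for c in n.children:
        postorder(c)
    if n.label == "+":
        n.val = n.children[0].val + n.children[1].val
    elif n.label == "*":
        n.val = n.children[0].val * n.children[1].val
    # A "digit" leaf keeps the lexval it was created with.

def digit(v):
    return ParseNode("digit", val=v)

tree = ParseNode("+", [ParseNode("*", [digit(3), digit(5)]), digit(4)])
postorder(tree)
print(tree.val)  # 19
```
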

L-Attributed definitions
• An SDD is L-attributed if the edges in its dependency graph go
from left to right but never from right to left.
• More precisely, each attribute must be either
– Synthesized
– Inherited, but if there is a production A->X1X2…Xn and there is
an inherited attribute Xi.a computed by a rule associated with
this production, then the rule may only use:
• Inherited attributes associated with the head A
• Either inherited or synthesized attributes associated with the
occurrences of symbols X1,X2,…,Xi-1 located to the left of Xi
• Inherited or synthesized attributes associated with this occurrence of
Xi itself, but in such a way that there is no cycle in the graph

Application of Syntax Directed Translation
• Type checking and intermediate code generation
• Construction of syntax trees
– Leaf nodes: Leaf(op,val)
– Interior node: Node(op,c1,c2,…,ck)
• Example:
Production          Semantic Rules
1) E -> E1 + T      E.node = new Node(‘+’, E1.node, T.node)
2) E -> E1 - T      E.node = new Node(‘-’, E1.node, T.node)
3) E -> T           E.node = T.node
4) T -> (E)         T.node = E.node
5) T -> id          T.node = new Leaf(id, id.entry)
6) T -> num         T.node = new Leaf(num, num.val)

Syntax tree for L-attributed definition

Production          Semantic Rules
1) E -> T E’        E.node = E’.syn
                    E’.inh = T.node
2) E’ -> + T E1’    E1’.inh = new Node(‘+’, E’.inh, T.node)
                    E’.syn = E1’.syn
3) E’ -> - T E1’    E1’.inh = new Node(‘-’, E’.inh, T.node)
                    E’.syn = E1’.syn
4) E’ -> ε          E’.syn = E’.inh
5) T -> (E)         T.node = E.node
6) T -> id          T.node = new Leaf(id, id.entry)
7) T -> num         T.node = new Leaf(num, num.val)

Syntax directed translation schemes
• An SDT is a Context Free grammar with program fragments
embedded within production bodies

• Those program fragments are called semantic actions

• They can appear at any position within production body

• Any SDT can be implemented by first building a parse tree and
then performing the actions in a left-to-right depth first order

• Typically SDT’s are implemented during parsing without
building a parse tree

Postfix translation schemes
• The simplest SDDs to implement are those whose grammar can be
parsed bottom-up and whose SDD is S-attributed

• For such cases we can construct an SDT in which each action is
placed at the end of its production and is executed along
with the reduction of the body to the head of that production

• SDT’s with all actions at the right ends of the production
bodies are called postfix SDT’s

Example of postfix SDT

1) L -> E n {print(E.val);}
2) E -> E1 + T {E.val=E1.val+T.val;}
3) E -> T {E.val = T.val;}
4) T -> T1 * F {T.val=T1.val*F.val;}
5) T -> F {T.val=F.val;}
6) F -> (E) {F.val=E.val;}
7) F -> digit {F.val=digit.lexval;}

Parse-Stack implementation of postfix SDT’s
• In a shift-reduce parser we can easily implement
semantic action using the parser stack
• For each nonterminal (or state) on the stack we can
associate a record holding its attributes
• Then in a reduction step we can execute the semantic
action at the end of a production to evaluate the
attribute(s) of the nonterminal on the left side of the
production
• And put the result on the stack in place of the
right side of the production

Example

L -> E n        {print(stack[top-1].val);
                 top=top-1;}
E -> E1 + T     {stack[top-2].val=stack[top-2].val+stack[top].val;
                 top=top-2;}
E -> T
T -> T1 * F     {stack[top-2].val=stack[top-2].val*stack[top].val;
                 top=top-2;}
T -> F
F -> (E)        {stack[top-2].val=stack[top-1].val;
                 top=top-2;}
F -> digit

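The stack actions above can be sketched in Python by keeping one record per grammar symbol on a list. This is not a real shift-reduce parser: the sequence of shifts and reductions for the input 3*5+4 is written out by hand, and the helper names (shift, reduce_unit, reduce_binary) are assumptions made for illustration.

```python
import operator

stack = []  # one record per grammar symbol; the last element is stack[top]

def shift(symbol, val=None):
    stack.append({"sym": symbol, "val": val})

def reduce_unit(head):
    # E -> T, T -> F, F -> digit: the value is already in place.
    stack[-1]["sym"] = head

def reduce_binary(head, op):
    # E -> E1 + T or T -> T1 * F:
    # stack[top-2].val = stack[top-2].val op stack[top].val; top = top-2
    right = stack.pop()          # stack[top]
    stack.pop()                  # the operator at stack[top-1]
    left = stack[-1]             # stack[top-2] is now the top
    left["val"] = op(left["val"], right["val"])
    left["sym"] = head

# Hand-written reduction sequence for 3*5+4
shift("digit", 3); reduce_unit("F"); reduce_unit("T")
shift("*")
shift("digit", 5); reduce_unit("F")
reduce_binary("T", operator.mul)
reduce_unit("E")
shift("+")
shift("digit", 4); reduce_unit("F"); reduce_unit("T")
reduce_binary("E", operator.add)

print(stack[-1]["val"])  # 19
```
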
SDT’s with actions inside productions

• For a production B->X {a} Y
– If the parser is bottom-up then we perform action “a” as soon as this
occurrence of X appears on the top of the parser stack
– If the parser is top-down we perform “a” just before we expand Y
• Sometimes we can't do things as easily as explained above
• One example is when we are parsing this SDT with a bottom-up
parser: the actions in productions 2 and 4 would have to run before
the parser knows whether a + or * will appear in the input
1) L -> E n
2) E -> {print(‘+’);} E1 + T
3) E -> T
4) T -> {print(‘*’);} T1 * F
5) T -> F
6) F -> (E)
7) F -> digit {print(digit.lexval);}

SDT’s with actions inside productions (cont)
• Any SDT can be implemented as follows
1. Ignore the actions and produce a parse tree
2. Examine each interior node N and add actions as new
children at the correct position
3. Perform a postorder traversal and execute actions
when their nodes are visited
[Figure: parse tree for 3*5+4 with the actions {print(‘+’);}, {print(‘*’);}, {print(3);}, {print(5);} and {print(4);} added as extra children]

SDT’s for L-Attributed definitions
• We can convert an L-attributed SDD into an SDT
using the following two rules:
– Embed the action that computes the inherited
attributes for a nonterminal A immediately before that
occurrence of A. If several inherited attributes of A are
dependent on one another in an acyclic fashion, order
them so that those needed first are computed first
– Place the action of a synthesized attribute for the head
of a production at the end of the body of the
production

Example
Production             Semantic Rules
S -> while (C) S1      L1=new();
                       L2=new();
                       S1.next=L1;
                       C.false=S.next;
                       C.true=L2;
                       S.code=label||L1||C.code||label||L2||S1.code

Corresponding SDT:
S -> while ( {L1=new();L2=new();C.false=S.next;C.true=L2;}
     C ) {S1.next=L1;}
     S1 {S.code=label||L1||C.code||label||L2||S1.code;}

Intermediate Code Generation
• Variants of Syntax Trees
• Three-address code
• Types and declarations
• Translation of expressions
• Type checking
• Control flow
• Backpatching

Introduction
• Intermediate code is the interface between front end
and back end in a compiler
• Ideally the details of the source language are confined to
the front end and the details of the target machine to
the back end, so that m front ends and n back ends give m*n compilers
• In this chapter we study intermediate
representations, static type checking and
intermediate code generation

[Figure: Parser -> Static Checker -> Intermediate Code Generator -> Code Generator. The first three stages form the front end; the code generator is the back end.]

Variants of syntax trees
• It is sometimes beneficial to create a DAG
instead of a tree for expressions.
• This way we can easily show the common subexpressions
and then use that knowledge during code generation
• Example: a+a*(b-c)+(b-c)*d
[Figure: DAG for a+a*(b-c)+(b-c)*d, in which the leaf a and the subexpression b-c are each shared by two parents]

SDD for creating DAG’s
Production Semantic Rules
1) E -> E1+T E.node= new Node(‘+’, E1.node,T.node)
2) E -> E1-T E.node= new Node(‘-’, E1.node,T.node)
3) E -> T E.node = T.node
4) T -> (E) T.node = E.node
5) T -> id T.node = new Leaf(id, id.entry)
6) T -> num T.node = new Leaf(num, num.val)

Example:
1) p1=Leaf(id, entry-a)
2) p2=Leaf(id, entry-a)=p1
3) p3=Leaf(id, entry-b)
4) p4=Leaf(id, entry-c)
5) p5=Node(‘-’,p3,p4)
6) p6=Node(‘*’,p1,p5)
7) p7=Node(‘+’,p1,p6)
8) p8=Leaf(id, entry-b)=p3
9) p9=Leaf(id, entry-c)=p4
10) p10=Node(‘-’,p3,p4)=p5
11) p11=Leaf(id, entry-d)
12) p12=Node(‘*’,p5,p11)
13) p13=Node(‘+’,p7,p12)

Value-number method for constructing DAG’s
Array of records for the DAG of i = i + 10:
1   id    (to entry for i)
2   num   10
3   +     1   2
4   =     1   3

• Algorithm
– Search the array for a node M with label op, left child l and right child r
– If there is such a node, return the value number of M
– If not, create in the array a new node N with label op, left child l, and
right child r and return its value number
• We may use a hash table
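A minimal sketch of the value-number method with a hash table, assuming a Python dict as the table and a list as the array of records. Index 0 of the list is left unused so that value numbers start at 1, as in the array shown above; the function name value_number is hypothetical.

```python
records = [None]   # records[k] is the node with value number k
table = {}         # hash table: (op, l, r) -> value number

def value_number(op, l, r=None):
    key = (op, l, r)
    if key in table:            # such a node already exists
        return table[key]
    records.append(key)         # create a new node in the array
    table[key] = len(records) - 1
    return table[key]

# DAG for i = i + 10
i_left = value_number("id", "i")
i_right = value_number("id", "i")      # found, not created again
ten = value_number("num", 10)
plus = value_number("+", i_right, ten)
assign = value_number("=", i_left, plus)

print(records[1:])
```

The dict lookup replaces the linear search of the array, which is the point of using a hash table here.
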

Three address code

• In three address code there is at most one
operator on the right side of an instruction
• Example:
t1 = b – c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
[Figure: the DAG for a+a*(b-c)+(b-c)*d from which this code is generated]

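A sketch of how such code might be produced from an expression, assuming expressions are written as nested Python tuples (op, left, right) and names as plain strings. A dictionary of already-translated subexpressions stands in for the sharing that the DAG provides; the function and variable names are illustrative.

```python
from itertools import count

def gen(expr, code, temps, fresh):
    """Append instructions for expr to code and return its address."""
    if isinstance(expr, str):       # a name is its own address
        return expr
    if expr in temps:               # common subexpression: reuse it
        return temps[expr]
    op, left, right = expr
    l = gen(left, code, temps, fresh)
    r = gen(right, code, temps, fresh)
    t = f"t{next(fresh)}"
    code.append(f"{t} = {l} {op} {r}")
    temps[expr] = t
    return t

b_minus_c = ("-", "b", "c")
expr = ("+", ("+", "a", ("*", "a", b_minus_c)), ("*", b_minus_c, "d"))

code = []
gen(expr, code, {}, count(1))
print("\n".join(code))
```

Because b-c is looked up rather than translated twice, t1 is reused in the fourth instruction, matching the example above.
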
Forms of three address instructions
• x = y op z
• x = op y
• x=y
• goto L
• if x goto L and ifFalse x goto L
• if x relop y goto L
• Procedure calls using:
– param x
– call p,n
– y = call p,n
• x = y[i] and x[i] = y
• x = &y and x = *y and *x =y

Example

• do i = i+1; while (a[i] < v);

Symbolic labels            Position numbers
L: t1 = i + 1              100: t1 = i + 1
   i = t1                  101: i = t1
   t2 = i * 8              102: t2 = i * 8
   t3 = a[t2]              103: t3 = a[t2]
   if t3 < v goto L        104: if t3 < v goto 100

Data structures for three address codes
• Quadruples
– Has four fields: op, arg1, arg2 and result
• Triples
– Temporaries are not used and instead references
to instructions are made
• Indirect triples
– In addition to triples we use a list of pointers to
triples

Example
• a = b * minus c + b * minus c

Three address code
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5

Quadruples
     op      arg1   arg2   result
0    minus   c             t1
1    *       b      t1     t2
2    minus   c             t3
3    *       b      t3     t4
4    +       t2     t4     t5
5    =       t5            a

Triples
     op      arg1   arg2
0    minus   c
1    *       b      (0)
2    minus   c
3    *       b      (2)
4    +       (1)    (3)
5    =       a      (4)

Indirect Triples
instruction list        triples
35   (0)                     op      arg1   arg2
36   (1)                0    minus   c
37   (2)                1    *       b      (0)
38   (3)                2    minus   c
39   (4)                3    *       b      (2)
40   (5)                4    +       (1)    (3)
                        5    =       a      (4)

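The relationship between the quadruple and triple tables can be sketched as a conversion routine. The tuple layout and the function name to_triples are assumptions; references to earlier triples are written as strings such as "(0)" to match the tables above.

```python
quads = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]

def to_triples(quads):
    position = {}   # temporary name -> index of the triple computing it
    triples = []

    def ref(arg):
        # Replace a temporary by a reference to the instruction that made it.
        return f"({position[arg]})" if arg in position else arg

    for op, arg1, arg2, result in quads:
        if op == "=":
            # A copy keeps its target as arg1 and the source as arg2.
            triples.append((op, result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            position[result] = len(triples) - 1
    return triples

triples = to_triples(quads)
for index, triple in enumerate(triples):
    print(index, triple)
```

Indirect triples would add a separate list of indices into this table, so instructions can be reordered by permuting that list without renumbering the references.
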
Type Expressions
Example: int[2][3]
array(2,array(3,integer))

• A basic type is a type expression
• A type name is a type expression
• A type expression can be formed by applying the array type constructor
to a number and a type expression.
• A record is a data structure with named fields; a type expression can be
formed by applying the record type constructor to the field names and
their types
• A type expression can be formed by using the type constructor → for
function types
• If s and t are type expressions, then their Cartesian product s*t is a type
expression
• Type expressions may contain variables whose values are type
expressions

Type Equivalence

Two type expressions are structurally equivalent if and only if one of
the following holds:
• They are the same basic type.
• They are formed by applying the same
constructor to structurally equivalent types.
• One is a type name that denotes the other.

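Type expressions and the three equivalence conditions can be sketched with small Python classes. Everything here is an assumption made for illustration: the class names, the choice to write type names as plain strings, and the names dictionary that records what each type name denotes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Basic:
    name: str

@dataclass(frozen=True)
class Array:
    size: int
    elem: object

@dataclass(frozen=True)
class Function:
    param: object
    result: object

def equivalent(s, t, names=None):
    names = names or {}
    # A type name is replaced by the type expression it denotes.
    while isinstance(s, str):
        s = names[s]
    while isinstance(t, str):
        t = names[t]
    if isinstance(s, Basic) or isinstance(t, Basic):
        return s == t                       # same basic type
    if type(s) is not type(t):
        return False                        # different constructors
    if isinstance(s, Array):
        return s.size == t.size and equivalent(s.elem, t.elem, names)
    if isinstance(s, Function):
        return (equivalent(s.param, t.param, names)
                and equivalent(s.result, t.result, names))
    return False

integer = Basic("integer")
int_2_3 = Array(2, Array(3, integer))       # int[2][3]
names = {"row": Array(3, integer)}

print(equivalent(int_2_3, Array(2, "row"), names))  # True
```
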
Declarations

Storage Layout for Local Names
• Computing types and their widths

Storage Layout for Local Names

Syntax-directed translation of array types

Sequences of Declarations

• Actions at the end:


Fields in Records and Classes

Translation of Expressions and Statements
• We have discussed how to find the types and offsets
of variables
• We therefore have the necessary preparation to
discuss translation to intermediate code
• We also discuss type checking

Three-address code for expressions

Incremental Translation

Addressing Array Elements
• Layouts for a two-dimensional array:

Semantic actions for array reference

Translation of Array References

Nonterminal L has three synthesized attributes:


• L.addr
• L.array
• L.type

Conversions between primitive types in Java

Introducing type conversions into expression evaluation

Abstract syntax tree for the function definition
fun length(x) =
if null(x) then 0 else length(tl(x)) + 1

This is a polymorphic function in the ML language

Inferring a type for the function length

Algorithm for Unification

Unification algorithm
boolean unify (Node m, Node n) {
    s = find(m); t = find(n);
    if ( s == t ) return true;
    else if ( nodes s and t represent the same basic type ) return true;
    else if ( s is an op-node with children s1 and s2 and
              t is an op-node with children t1 and t2 ) {
        union(s, t);
        return unify(s1, t1) and unify(s2, t2);
    }
    else if ( s or t represents a variable ) {
        union(s, t);
        return true;
    }
    else return false;
}

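The unification pseudocode can be sketched in runnable Python with a simple union-find over type-graph nodes. The TypeNode class and its fields are assumptions made for this sketch, as is the extra check that two op-nodes carry the same operator; union keeps a non-variable node as the representative whenever one is available.

```python
class TypeNode:
    def __init__(self, kind, label, children=()):
        self.kind = kind              # "basic", "var" or "op"
        self.label = label            # e.g. "integer", "alpha", "->"
        self.children = list(children)
        self.parent = self            # union-find link

def find(n):
    while n.parent is not n:
        n = n.parent
    return n

def union(s, t):
    # Prefer a non-variable node as the representative of the merged class.
    if s.kind == "var":
        s.parent = t
    else:
        t.parent = s

def unify(m, n):
    s, t = find(m), find(n)
    if s is t:
        return True
    if s.kind == "basic" and t.kind == "basic":
        return s.label == t.label
    if s.kind == "op" and t.kind == "op":
        if s.label != t.label or len(s.children) != len(t.children):
            return False
        union(s, t)
        return all(unify(a, b) for a, b in zip(s.children, t.children))
    if s.kind == "var" or t.kind == "var":
        union(s, t)
        return True
    return False

integer = TypeNode("basic", "integer")
boolean = TypeNode("basic", "boolean")
alpha = TypeNode("var", "alpha")
beta = TypeNode("var", "beta")

# Unify alpha -> alpha with integer -> beta
ok = unify(TypeNode("op", "->", [alpha, alpha]),
           TypeNode("op", "->", [integer, beta]))
print(ok, find(alpha).label, find(beta).label)  # True integer integer
```
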
Control Flow

Boolean expressions are often used to:
• Alter the flow of control.
• Compute logical values.

Short-Circuit Code

Flow-of-Control Statements

Syntax-directed definition

Generating three-address code for booleans

Translation of a simple if-statement

Backpatching
• The earlier translation schemes for Boolean expressions emit jumps to symbolic labels
• They therefore need a separate pass to bind those labels to instruction addresses
• We can use a technique named backpatching to avoid this second pass
• We assume instructions are saved into an array and labels are indices into
the array
• For nonterminal B we use two attributes B.truelist and B.falselist together
with the following functions:
– makelist(i): creates a new list containing only i, an index into the array of
instructions
– merge(p1,p2): concatenates the lists pointed to by p1 and p2 and returns a
pointer to the concatenated list
– backpatch(p,i): inserts i as the target label for each of the instructions on
the list pointed to by p

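The three functions can be sketched in Python, with a list standing in for the instruction array. Each instruction is stored as a two-element list [text, target] whose target stays None until it is backpatched; this representation, the emit and relop helpers and the 0-based indices are all assumptions. The example translates x < 100 || x > 200 && x != y.

```python
instructions = []      # instruction array; list indices act as labels

def emit(text):
    instructions.append([text, None])   # jump target not yet known
    return len(instructions) - 1

def makelist(i):
    return [i]

def merge(p1, p2):
    return p1 + p2

def backpatch(p, i):
    for j in p:
        instructions[j][1] = i

def relop(a, op, b):
    # B -> a relop b: one conditional jump and one unconditional jump
    true_jump = emit(f"if {a} {op} {b} goto")
    false_jump = emit("goto")
    return makelist(true_jump), makelist(false_jump)

b1_true, b1_false = relop("x", "<", "100")
m1 = len(instructions)                  # first instruction of the right operand of ||
b2_true, b2_false = relop("x", ">", "200")
m2 = len(instructions)                  # first instruction of the right operand of &&
b3_true, b3_false = relop("x", "!=", "y")

# B2 && B3: a true B2 falls into B3
backpatch(b2_true, m2)
and_true, and_false = b3_true, merge(b2_false, b3_false)

# B1 || (B2 && B3): a false B1 falls into the right operand
backpatch(b1_false, m1)
truelist, falselist = merge(b1_true, and_true), and_false

for index, (text, target) in enumerate(instructions):
    print(index, text, "_" if target is None else target)
```

The jumps left on truelist and falselist remain unfilled until the enclosing statement knows where control should go.
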
Backpatching for Boolean Expressions

Backpatching for Boolean Expressions

• Annotated parse tree for x < 100 || x > 200 && x != y

Flow-of-Control Statements

Translation of a switch-statement
