Unit 4 Syntax-Directed Translation & Intermediate Code Generation

This document discusses syntax-directed translation and intermediate code generation. It covers syntax-directed definitions, dependency graphs, S-attributed definitions, L-attributed definitions, and applications of syntax-directed translation like type checking and code generation. It also describes different syntax-directed translation schemes like postfix translation schemes and schemes with actions inside productions.

Uploaded by

hello

CS450: DESIGN OF LANGUAGE PROCESSOR

Unit-4

Syntax-Directed Translation &


Intermediate Code Generation
Outline
Syntax-Directed Translation
• Syntax Directed Definitions

• Evaluation Orders of SDD’s

• Dependency Graphs

• S-attributed Definitions

• L-attributed Definitions

• Applications of Syntax Directed Translation

• Syntax Directed Translation Schemes


Intermediate Code Generation
• Variants of Syntax Trees
Introduction
• We can associate information with a language construct
by attaching attributes to the grammar symbols.
• A syntax directed definition specifies the values of
attributes by associating semantic rules with the
grammar productions.

Production        Semantic Rule
E -> E1 + T       E.code = E1.code || T.code || ‘+’
• We may alternatively insert the semantic actions inside the
grammar: E -> E1 + T {print ‘+’}
Syntax Directed Definitions
• A SDD is a context free grammar with attributes and
rules

• Attributes are associated with grammar symbols and


rules with productions

• Attributes may be of many kinds: numbers, types, table


references, strings, etc.

• Synthesized attributes
– A synthesized attribute at node N is defined only in
terms of attribute values at the children of N and at N
itself
• Inherited attributes
– An inherited attribute at node N is defined only in
terms of attribute values at N’s parent, N itself and N’s
siblings
Example of S-attributed SDD

Production Semantic Rules


1) L -> E n L.val = E.val
2) E -> E1 + T E.val = E1.val + T.val
3) E -> T E.val = T.val
4) T -> T1 * F T.val = T1.val * F.val
5) T -> F T.val = F.val
6) F -> (E) F.val = E.val
7) F -> digit F.val = digit.lexval
Example of mixed attributes

Production Semantic Rules


1) T -> FT’ T’.inh = F.val
T.val = T’.syn
2) T’ -> *FT’1 T’1.inh = T’.inh*F.val
T’.syn = T’1.syn
3) T’ -> ε T’.syn = T’.inh
4) F -> digit F.val = digit.lexval
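The way T’.inh flows down and T’.syn flows back up can be sketched with a small recursive evaluator. This is only an illustration under assumptions: the function names and the token-list input are hypothetical, the inherited attribute is modelled as a parameter, and the synthesized attribute as a return value.

```python
def parse_F(tokens):
    """F -> digit : F.val = digit.lexval. Returns (F.val, remaining tokens)."""
    return int(tokens[0]), tokens[1:]

def parse_T_prime(tokens, inh):
    """Returns (T'.syn, remaining tokens) given the inherited value T'.inh."""
    if tokens and tokens[0] == "*":
        f_val, rest = parse_F(tokens[1:])
        # T' -> * F T'1 : T'1.inh = T'.inh * F.val ; T'.syn = T'1.syn
        return parse_T_prime(rest, inh * f_val)
    # T' -> epsilon : T'.syn = T'.inh
    return inh, tokens

def parse_T(tokens):
    """T -> F T' : T'.inh = F.val ; T.val = T'.syn."""
    f_val, rest = parse_F(tokens)
    return parse_T_prime(rest, f_val)

value, leftover = parse_T(["3", "*", "5"])
```

For the input 3 * 5, F.val = 3 is passed down as T’.inh, multiplied by 5 on the way, and the product comes back up as T’.syn.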
Dependency graph

• A dependency graph is used to determine the order of


computation of attributes

• Dependency graph
– For each parse tree node, the dependency graph has a
node for each attribute associated with that node
– If a semantic rule defines the value of synthesized
attribute A.b in terms of the value of X.c then the
dependency graph has an edge from X.c to A.b
– If a semantic rule defines the value of inherited
attribute B.c in terms of the value of X.a then the
dependency graph has an edge from X.a to B.c
Example of Dependency graph
• Synthesized attributes are represented by .val.
• Hence, E.val, E1.val, and E2.val are synthesized
attributes.
• Dependencies are shown by solid arrows.
• Arrows from E1.val and E2.val to E.val show that the
value of E depends upon E1 and E2.
Ordering the evaluation of
attributes
• If the dependency graph has an edge from node M to node N,
then the attribute at M must be evaluated before the attribute at N
• Thus the only allowable orders of evaluation are those
sequences of nodes N1,N2,…,Nk such that if there is an
edge from Ni to Nj then i<j
• Such an ordering is called a topological sort of the graph;
one exists only if the graph has no cycles
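A minimal sketch of finding such an order, using Python's standard graphlib module (Python 3.9 or later). The attribute names and edges are an assumed dependency graph for the input 3 + 4 under E -> E1 + T; they are not taken from the slides.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical dependency graph: each key maps an attribute to the
# attributes it depends on (the sources of its incoming edges).
depends_on = {
    "F1.val": {"digit1.lexval"},
    "T1.val": {"F1.val"},
    "E1.val": {"T1.val"},
    "F2.val": {"digit2.lexval"},
    "T2.val": {"F2.val"},
    "E.val": {"E1.val", "T2.val"},
}

def evaluation_order(graph):
    """Return one topological order; raises CycleError if the graph is cyclic."""
    return list(TopologicalSorter(graph).static_order())

order = evaluation_order(depends_on)
```

Any order returned this way lists every attribute after all the attributes it depends on; several valid orders usually exist.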
S-Attributed definitions
• An SDD is S-attributed if every attribute is synthesized
• We can have a post-order traversal of parse-tree to evaluate
attributes in S-attributed definitions

postorder(N) {
for (each child C of N, from the left) postorder(C);
evaluate the attributes associated with node N;
}
• S-Attributed definitions can be implemented during bottom-up
parsing without the need to explicitly create parse trees
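The postorder evaluation can be sketched in Python as follows. The Node class and the hand-built parse tree for 3 * 5 + 4 n are assumptions made for illustration; the rules applied are those of the S-attributed SDD with the val attribute.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    symbol: str                                   # grammar symbol, e.g. "E", "digit", "+"
    children: list = field(default_factory=list)
    lexval: int = 0                               # only meaningful for digit leaves
    val: int = 0                                  # synthesized attribute, set by postorder()

def postorder(n):
    """Evaluate the children left to right, then apply the rule for n's production."""
    for c in n.children:
        postorder(c)
    kids = [c.symbol for c in n.children]
    if n.symbol == "digit":
        n.val = n.lexval                                   # supplied by the lexer
    elif kids == ["E", "+", "T"]:
        n.val = n.children[0].val + n.children[2].val      # E -> E1 + T
    elif kids == ["T", "*", "F"]:
        n.val = n.children[0].val * n.children[2].val      # T -> T1 * F
    elif kids == ["(", "E", ")"]:
        n.val = n.children[1].val                          # F -> ( E )
    elif len(kids) == 1 or kids == ["E", "n"]:
        n.val = n.children[0].val                          # E -> T, T -> F, F -> digit, L -> E n

def d(v):
    """Hypothetical helper: the subtree F -> digit with lexval v."""
    return Node("F", [Node("digit", lexval=v)])

tree = Node("L", [
    Node("E", [
        Node("E", [Node("T", [Node("T", [d(3)]), Node("*"), d(5)])]),
        Node("+"),
        Node("T", [d(4)]),
    ]),
    Node("n"),
])
postorder(tree)
```

After the call, tree.val holds the value of the whole expression, because every node is evaluated only after all of its children.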
L-Attributed definitions
• An SDD is L-attributed if the edges in the dependency graph
between attributes of a production body go from left to right
but not from right to left.
• More precisely, each attribute must be either
– Synthesized
– Inherited, but if there is a production A->X1X2…Xn and
there is an inherited attribute Xi.a computed by a rule
associated with this production, then the rule may only use:

• Inherited attributes associated with the head A


• Either inherited or synthesized attributes associated with the
occurrences of symbols X1,X2,…,Xi-1 located to the left of Xi
• Inherited or synthesized attributes associated with this
occurrence of Xi itself, but in such a way that there is no cycle in
the graph
Application of Syntax Directed
Translation
• Type checking and intermediate code generation
• Construction of syntax trees
– Leaf nodes: Leaf(op,val)
– Interior node: Node(op,c1,c2,…,ck)
• Example:
Production Semantic Rules
1) E -> E1 + T E.node=new node(‘+’, E1.node,T.node)
2) E -> E1 - T E.node=new node(‘-’, E1.node,T.node)
3) E -> T E.node = T.node
4) T -> (E) T.node = E.node
5) T -> id T.node = new Leaf(id,id.entry)
6) T -> num T.node = new Leaf(num,num.val)
Syntax tree for L-attributed
definition

Production Semantic Rules


1) E -> TE’ E.node=E’.syn
E’.inh=T.node
2) E’ -> + TE1’ E1’.inh=new node(‘+’, E’.inh,T.node)
E’.syn=E1’.syn
3) E’ -> - TE1’ E1’.inh=new node(‘-’, E’.inh,T.node)
E’.syn=E1’.syn
4) E’ -> ε E’.syn = E’.inh
5) T -> (E) T.node = E.node
6) T -> id T.node=new Leaf(id,id.entry)
7) T -> num T.node = new Leaf(num,num.val)
Syntax directed translation
schemes
• An SDT is a Context Free grammar with program
fragments embedded within production bodies

• Those program fragments are called semantic actions

• They can appear at any position within production body

• Any SDT can be implemented by first building a parse


tree and then performing the actions in a left-to-right
depth first order

• Typically SDT’s are implemented during parsing without


building a parse tree
Postfix translation schemes
• The simplest case is an SDD that is S-attributed and whose
grammar can be parsed bottom-up
• For such cases we can construct an SDT where each action
is placed at the end of the production and is executed
along with the reduction of the body to the head of that
production
• SDT’s with all actions at the right ends of the production
bodies are called postfix SDT’s
Example of postfix SDT

1) L -> E n {print(E.val);}
2) E -> E1 + T {E.val=E1.val+T.val;}
3) E -> T {E.val = T.val;}
4) T -> T1 * F {T.val=T1.val*F.val;}
5) T -> F {T.val=F.val;}
6) F -> (E) {F.val=E.val;}
7) F -> digit {F.val=digit.lexval;}
Parse-Stack implementation of
postfix SDT’s
• In a shift-reduce parser we can easily implement
semantic actions using the parser stack
• With each grammar symbol (or state) on the stack we can
associate a record holding its attributes
• Then in a reduction step we can execute the semantic
action at the end of a production to evaluate the
attribute(s) of the nonterminal on the left side of the
production
• The result is put on the stack in place of the right side
of the production
Example

L -> E n {print(stack[top-1].val);
top=top-1;}
E -> E1 + T {stack[top-2].val=stack[top-2].val+stack[top].val;
top=top-2;}
E -> T
T -> T1 * F {stack[top-2].val=stack[top-2].val*stack[top].val;
top=top-2;}
T -> F
F -> (E) {stack[top-2].val=stack[top-1].val;
top=top-2;}
F -> digit
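A rough Python sketch of these stack actions. It replays a hand-written sequence of shift and reduce moves, which is an assumption standing in for a real LR parser; the function name run, the move encoding and the MOVES list are hypothetical. Each stack slot holds only the val field, and stack[-1] plays the role of stack[top].

```python
def run(moves):
    """Replay shift/reduce moves on a value stack; return what L -> E n prints."""
    stack, printed = [], []
    for move, arg in moves:
        if move == "shift":
            stack.append(arg)                    # lexval for a digit, None otherwise
        elif arg == "L -> E n":
            printed.append(stack[-2])            # print(stack[top-1].val)
            stack.pop()                          # top = top - 1
        elif arg == "E -> E + T":
            stack[-3] = stack[-3] + stack[-1]    # result goes where E1 was
            del stack[-2:]                       # top = top - 2
        elif arg == "T -> T * F":
            stack[-3] = stack[-3] * stack[-1]
            del stack[-2:]
        elif arg == "F -> ( E )":
            stack[-3] = stack[-2]
            del stack[-2:]
        # E -> T, T -> F, F -> digit: the value is already in place

    return printed

# Moves a shift-reduce parser would make on the input 3 * 5 + 4 n.
MOVES = [
    ("shift", 3), ("reduce", "F -> digit"), ("reduce", "T -> F"),
    ("shift", None), ("shift", 5), ("reduce", "F -> digit"),
    ("reduce", "T -> T * F"), ("reduce", "E -> T"),
    ("shift", None), ("shift", 4), ("reduce", "F -> digit"),
    ("reduce", "T -> F"), ("reduce", "E -> E + T"),
    ("shift", None), ("reduce", "L -> E n"),
]
result = run(MOVES)
```

The unit productions need no action because the value of the single body symbol already sits in the slot that the head will occupy.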
SDT’s with actions inside
productions
• For a production B->X {a} Y
– If the parse is bottom-up then we perform action “a” as soon as
this occurrence of X appears on the top of the parser stack
– If the parser is top down we perform “a” just before we expand Y
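• Sometimes we can’t do things as easily as explained above
• One example is when we are parsing this SDT with a bottom-up
parser: the actions that print ‘+’ and ‘*’ would have to run before
the parser knows whether these operators appear in the input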

1) L -> E n
2) E -> {print(‘+’);} E1 + T
3) E -> T
4) T -> {print(‘*’);} T1 * F
5) T -> F
6) F -> (E)
7) F -> digit {print(digit.lexval);}
SDT’s with actions inside
productions (cont)
• Any SDT can be implemented as follows
1. Ignore the actions and produce a parse tree
2. Examine each interior node N and add actions as new
children at the correct position
3. Traverse the tree depth-first, left to right, and execute
each action when its node is visited
[Figure: parse tree for 3 * 5 + 4 with the actions {print(‘+’);}, {print(‘*’);}, {print(3);}, {print(5);} and {print(4);} added as extra children]
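A minimal sketch of this tree-based implementation in Python. The tuple encoding of the parse tree, the emit helper and the hand-built tree for 3 * 5 + 4 are assumptions for illustration; actions are stored as extra leaf children and fire when the walk reaches them.

```python
def run_actions(node, out):
    """Left-to-right depth-first walk; an action leaf fires when it is visited."""
    if callable(node):
        node(out)                        # an embedded semantic action
    elif isinstance(node, tuple):        # (symbol, child, child, ...)
        for child in node[1:]:
            run_actions(child, out)
    # plain strings are terminals and carry no action

def emit(text):
    """Hypothetical stand-in for print(text): appends to an output list."""
    return lambda out: out.append(text)

def digit(v):
    return ("F", "digit", emit(str(v)))  # F -> digit {print(digit.lexval);}

# Parse tree for 3 * 5 + 4 n with the actions inserted at their positions.
tree = ("L",
        ("E", emit("+"),                                   # E -> {print('+');} E1 + T
              ("E", ("T", emit("*"),                       # T -> {print('*');} T1 * F
                          ("T", digit(3)), "*", digit(5))),
              "+",
              ("T", digit(4))),
        "n")

output = []
run_actions(tree, output)
```

The collected output is the prefix form of the expression, which is what this SDT is meant to produce.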
SDT’s for L-Attributed definitions
• We can convert an L-attributed SDD into an SDT using
the following two rules:
– Embed the action that computes the inherited
attributes for a nonterminal A immediately before that
occurrence of A. If several inherited attributes of A
depend on one another in an acyclic fashion, order
them so that those needed first are computed first
– Place the action that computes a synthesized attribute
of the head of a production at the end of the body of the
production
Example
SDD:
S -> while (C) S1   L1=new();
                    L2=new();
                    S1.next=L1;
                    C.false=S.next;
                    C.true=L2;
                    S.code=label||L1||C.code||label||L2||S1.code

SDT:
S -> while ( {L1=new();L2=new();C.false=S.next;C.true=L2;}
     C ) {S1.next=L1;}
     S1 {S.code=label||L1||C.code||label||L2||S1.code;}
Intermediate Code Generation
• Variants of Syntax Trees
• Three-address code
• Control flow
Introduction
• Intermediate code is the interface between front
end and back end in a compiler
• Ideally the details of the source language are
confined to the front end and the details of the target
machine to the back end, so that m front ends and n back
ends can be combined into m*n compilers
• In this chapter we study intermediate
representations, static type checking and
intermediate code generation

[Figure: Parser → Static Checker → Intermediate Code Generator (front end) → Code Generator (back end)]
Intermediate Code

The following are commonly used intermediate code
representations:
• Syntax tree
• Postfix notation
• Three-address code
Syntax tree
A syntax tree is a condensed form of a parse tree. The
operator and keyword nodes of the parse tree are moved
to their parents, and a chain of single productions is
replaced by a single link. In the syntax tree the internal
nodes are operators and their children are operands.
Fully parenthesizing the expression first makes it easy
to see which operands belong to which operator.
Syntax tree
Example: x = (a + b * c) / (a – b * c)
Postfix notation

• The ordinary (infix) way of writing the sum of a and b is
with an operator in the middle: a + b
• The postfix notation for the same expression places the
operator at the right end: a b +. In general, if e1 and e2
are any postfix expressions and + is any binary
operator, the result of applying + to the values denoted
by e1 and e2 is written in postfix notation as e1 e2 +. No
parentheses are needed in postfix notation because the
position and arity (number of arguments) of the operators
permit only one way to decode a postfix expression. In
postfix notation, the operator follows its operands.
Postfix notation

Example: The postfix representation of the expression
(a - b) * (c + d) + (a - b) is:
a b - c d + * a b - +
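Postfix notation is what a postorder walk of the expression tree produces, and a single stack is enough to decode it. The sketch below assumes a hypothetical tuple encoding (op, left, right) for expression trees and a dictionary of variable values; neither comes from the slides.

```python
def to_postfix(e):
    """e is an operand name (str) or a tuple (op, left, right)."""
    if isinstance(e, str):
        return [e]
    op, left, right = e
    return to_postfix(left) + to_postfix(right) + [op]   # operands first, then operator

def eval_postfix(tokens, env):
    """Decode a postfix token list with one stack; env maps names to integers."""
    stack = []
    for t in tokens:
        if t in env:
            stack.append(env[t])
        else:
            r = stack.pop()              # right operand was pushed last
            l = stack.pop()
            stack.append({"+": l + r, "-": l - r, "*": l * r}[t])
    return stack.pop()

# (a - b) * (c + d) + (a - b)
expr = ("+", ("*", ("-", "a", "b"), ("+", "c", "d")), ("-", "a", "b"))
postfix = " ".join(to_postfix(expr))
```

No parentheses appear in the result because each operator applies to the two most recent complete operands on the stack.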
Three-Address Code
• A statement involving no more than three references (two
for operands and one for the result) is known as a
three-address statement. A sequence of three-address
statements is known as three-address code. A
three-address statement has the form x = y op z, where x, y, and z
are addresses (memory locations). Sometimes a
statement contains fewer than three references, but it
is still called a three-address statement.
• For example: a=b+c*d;
• The intermediate code generator divides this
expression into subexpressions and then generates the
corresponding code:
r1 = c*d;
r2 = b+r1;
a = r2;
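One way to sketch this division into subexpressions is a recursive walk that allocates a fresh temporary for every operator node. The function name gen_tac, the tuple encoding of expressions and the temporary names r1, r2, … are assumptions for illustration.

```python
import itertools

def gen_tac(target, expr):
    """Return three-address statements for `target = expr`.

    expr is a variable name (str) or a tuple (op, left, right).
    """
    code = []
    counter = itertools.count(1)

    def walk(e):
        if isinstance(e, str):
            return e                         # a name is already an address
        op, left, right = e
        l = walk(left)
        r = walk(right)
        temp = f"r{next(counter)}"           # fresh temporary for this operator
        code.append(f"{temp} = {l} {op} {r}")
        return temp

    result = walk(expr)
    code.append(f"{target} = {result}")
    return code

# a = b + c * d
tac = gen_tac("a", ("+", "b", ("*", "c", "d")))
```

Each emitted statement has at most one operator on its right side, and operands are evaluated before the operator that uses them.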
Variants of syntax trees

A syntax tree basically has two variants, which are
described below:
• Directed Acyclic Graphs for Expressions
• The Value-Number Method for Constructing DAG's
Directed Acyclic Graphs for Expressions
• It is sometimes beneficial to create a DAG instead of a tree
for expressions.
• This way we can easily show the common
sub-expressions and then use that knowledge during
code generation
• Example: a+a*(b-c)+(b-c)*d

[Figure: DAG for a + a * (b - c) + (b - c) * d, in which the leaf a and the node for b - c each have two parents]
SDD for creating DAG’s
Production Semantic Rules
1) E -> E1+T E.node= new Node(‘+’, E1.node,T.node)
2) E -> E1-T E.node= new Node(‘-’, E1.node,T.node)
3) E -> T E.node = T.node
4) T -> (E) T.node = E.node
5) T -> id T.node = new Leaf(id, id.entry)
6) T -> num T.node = new Leaf(num, num.val)

Example (for a+a*(b-c)+(b-c)*d):
1) p1=Leaf(id, entry-a)
2) p2=Leaf(id, entry-a)=p1
3) p3=Leaf(id, entry-b)
4) p4=Leaf(id, entry-c)
5) p5=Node(‘-’,p3,p4)
6) p6=Node(‘*’,p1,p5)
7) p7=Node(‘+’,p1,p6)
8) p8=Leaf(id,entry-b)=p3
9) p9=Leaf(id,entry-c)=p4
10) p10=Node(‘-’,p3,p4)=p5
11) p11=Leaf(id,entry-d)
12) p12=Node(‘*’,p5,p11)
13) p13=Node(‘+’,p7,p12)
Value-number method for
constructing DAG’s
[Figure: DAG for i = i + 10 and its nodes stored in an array]
Value number   Label   Fields
1              id      pointer to entry for i
2              num     10
3              +       1, 2
4              =       1, 3

• Algorithm
– Search the array for a node M with label op, left child l and right
child r
– If there is such a node, return the value number of M
– If not, create in the array a new node N with label op, left child l,
and right child r, and return its value number
• We may use a hash table to avoid searching the whole array
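A minimal sketch of the value-number method with a hash table, here a Python dict keyed by (op, left, right). The class name ValueNumberDAG, the 1-based numbering and the encoding of leaves as ("id", name, None) or ("num", value, None) are assumptions for illustration.

```python
class ValueNumberDAG:
    def __init__(self):
        self.nodes = []      # nodes[k-1] is the node with value number k
        self.index = {}      # hash table: (op, left, right) -> value number

    def node(self, op, left=None, right=None):
        """Return the value number of (op, left, right), creating it if needed."""
        key = (op, left, right)
        if key not in self.index:            # no existing node M: create node N
            self.nodes.append(key)
            self.index[key] = len(self.nodes)
        return self.index[key]

# Nodes for the assignment i = i + 10.
dag = ValueNumberDAG()
i_leaf = dag.node("id", "i")
ten = dag.node("num", 10)
plus = dag.node("+", dag.node("id", "i"), ten)   # the second request for i reuses node 1
assign = dag.node("=", i_leaf, plus)
```

Because children are identified by their value numbers, two requests for the same operator over the same children hit the same dictionary key and share one node.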
Three address code

• In three-address code there is at most one operator on
the right side of an instruction
• Example: a + a * (b - c) + (b - c) * d
t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
[Figure: the DAG for this expression, with + at the root]
Forms of three address
instructions
• x = y op z
• x = op y
• x=y
• goto L
• if x goto L and ifFalse x goto L
• if x relop y goto L
• Procedure calls using:
– param x
– call p,n
– y = call p,n
• x = y[i] and x[i] = y
• x = &y and x = *y and *x =y
Example

• do i = i+1; while (a[i] < v);

Symbolic labels:
L: t1 = i + 1
   i = t1
   t2 = i * 8
   t3 = a[t2]
   if t3 < v goto L

Position numbers:
100: t1 = i + 1
101: i = t1
102: t2 = i * 8
103: t3 = a[t2]
104: if t3 < v goto 100
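Going from symbolic labels to position numbers amounts to recording the position of each labelled instruction and patching every goto target. The sketch below assumes a hypothetical input format of (label or None, instruction text) pairs and a starting position of 100.

```python
def resolve_labels(lines, start=100):
    """Replace symbolic labels by position numbers.

    Each input line is a pair (label_or_None, instruction_text).
    """
    # First pass: the position of every labelled instruction.
    position = {label: start + i
                for i, (label, _) in enumerate(lines) if label}
    resolved = []
    # Second pass: patch the operand that follows each goto.
    for i, (_, text) in enumerate(lines):
        words = text.split()
        if "goto" in words:
            k = words.index("goto") + 1
            words[k] = str(position[words[k]])
        resolved.append(f"{start + i}: {' '.join(words)}")
    return resolved

loop = [
    ("L", "t1 = i + 1"),
    (None, "i = t1"),
    (None, "t2 = i * 8"),
    (None, "t3 = a[t2]"),
    (None, "if t3 < v goto L"),
]
numbered = resolve_labels(loop)
```

Two passes are used so that a jump to a label defined later in the code could be resolved as well.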


Data structures for three address
codes
• Quadruples
– Has four fields: op, arg1, arg2 and result
• Triples
– Temporaries are not used and instead references to
instructions are made
• Indirect triples
– In addition to triples we use a list of pointers to triples
Example
• a = b * minus c + b * minus c

Three address code
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5

Quadruples
     op     arg1   arg2   result
0    minus  c             t1
1    *      b      t1     t2
2    minus  c             t3
3    *      b      t3     t4
4    +      t2     t4     t5
5    =      t5            a

Triples
     op     arg1   arg2
0    minus  c
1    *      b      (0)
2    minus  c
3    *      b      (2)
4    +      (1)    (3)
5    =      a      (4)

Indirect triples
Instruction list (each entry points to a triple in the table above)
35   (0)
36   (1)
37   (2)
38   (3)
39   (4)
40   (5)
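The difference between quadruples and triples can be sketched by generating both from the same expression tree. The function name assign and the tuple encoding ("minus", e) / (op, e1, e2) are assumptions for illustration; None marks an unused field.

```python
def assign(target, expr):
    """Return (quadruples, triples) for `target = expr`."""
    quads, triples = [], []

    def walk(e):
        """Return (name used in quadruples, reference used in triples)."""
        if isinstance(e, str):
            return e, e
        op, *args = e
        pairs = [walk(a) for a in args]
        q = [p[0] for p in pairs] + [None] * (2 - len(pairs))
        t = [p[1] for p in pairs] + [None] * (2 - len(pairs))
        temp = f"t{len(quads) + 1}"               # quadruples name their result
        quads.append((op, q[0], q[1], temp))
        triples.append((op, t[0], t[1]))
        return temp, f"({len(triples) - 1})"      # triples refer to a position

    temp, ref = walk(expr)
    quads.append(("=", temp, None, target))
    triples.append(("=", target, ref))
    return quads, triples

# a = b * minus c + b * minus c
term = ("*", "b", ("minus", "c"))
quads, triples = assign("a", ("+", term, term))
```

Indirect triples would add a separate list of indices into the triples table, so instructions can be reordered by permuting that list without renumbering the references.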
Control Flow

Boolean expressions are often used to:


• Alter the flow of control.
• Compute logical values.
Short-Circuit Code


Flow-of-Control Statements
Syntax-directed definition
Generating three-address code for
booleans
Translation of a simple
if-statement
