Compiler Design Unit 3
(Eg) Let us consider the grammar for arithmetic expressions (desk calculator).
The Syntax Directed Definition associates with each non-terminal a synthesized attribute
called val.
Definition. An S-Attributed Definition is a Syntax Directed Definition that uses only synthesized
attributes.
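Because every attribute is synthesized, an S-attributed definition can be evaluated in a single bottom-up (post-order) pass over the parse tree. The following is a minimal sketch of that idea for the desk calculator. The grammar shape it assumes (E → E + T | T, T → T * F | F, F → ( E ) | digit) and the names Node, lexval and evaluate are illustrative assumptions, not taken from these notes.

```python
class Node:
    """Hypothetical parse-tree node; val is the synthesized attribute."""
    def __init__(self, symbol, children=(), lexval=None):
        self.symbol = symbol            # grammar symbol, e.g. "E", "+", "digit"
        self.children = list(children)  # subtrees, left to right
        self.lexval = lexval            # token value, used only by digit leaves
        self.val = None

def evaluate(node):
    """Compute node.val in post-order: children first, then the parent."""
    for child in node.children:
        evaluate(child)
    kids = node.children
    if node.symbol == "digit":                        # F -> digit
        node.val = node.lexval
    elif len(kids) == 3 and kids[1].symbol == "+":    # E -> E1 + T
        node.val = kids[0].val + kids[2].val
    elif len(kids) == 3 and kids[1].symbol == "*":    # T -> T1 * F
        node.val = kids[0].val * kids[2].val
    elif len(kids) == 3 and kids[0].symbol == "(":    # F -> ( E )
        node.val = kids[1].val
    elif len(kids) == 1:                              # unit productions copy val
        node.val = kids[0].val
    return node.val

def leaf(symbol, lexval=None):
    return Node(symbol, lexval=lexval)

def digit(n):
    return Node("F", [leaf("digit", n)])

# Parse tree for 3 * 5 + 4
tree = Node("E", [
    Node("E", [Node("T", [Node("T", [digit(3)]), leaf("*"), digit(5)])]),
    leaf("+"),
    Node("T", [digit(4)]),
])
result = evaluate(tree)
```

Each val depends only on the val of the children, which is why no other traversal order is needed.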
Types
1. Postfix
2. Syntax Tree
Three-address statements are akin to assembly code. Statements can have symbolic labels, and
there are statements for flow of control. A symbolic label represents the index of a three-address
statement in the array holding the intermediate code.
4. The unconditional jump goto L. The three-address statement with label L is the next to be
executed.
6. param x and call p, n for procedure calls, and return y, where y, representing a returned value,
is optional. Their typical use is as the sequence of three-address statements
param x1
param x2
…
param xn
call p, n
generated as part of a call of the procedure p(x1, x2, …, xn).
7. Indexed assignments of the form x=y[i] and x[i]=y. The first of these sets x to the value in the
location i memory units beyond location y. The statement x[i]=y sets the contents of the location i units
beyond x to the value of y. In both these instructions, x, y, and i refer to data objects.
8. Address and pointer assignments of the form x=&y, x=*y and *x=y
Implementation of Three Address Statements:
Quadruples
A quadruple is a record structure with four fields, which we call op, arg1, arg2, and result. The
op field contains an internal code for the operator. The three-address statement x = y op z is
represented by placing y in arg1, z in arg2, and x in result. Statements with unary operators like
x = -y or x = y do not use arg2. Operators like param use neither arg2 nor result. Conditional and
unconditional jumps put the target label in result.
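The field usage described above can be sketched as a small record type. This is an illustrative sketch only: the operator codes "uminus", "param" and "goto" and the helper to_text are assumed names, and None is used here to mark an unused field.

```python
from collections import namedtuple

# Field names follow the text; None marks a field the statement does not use.
Quad = namedtuple("Quad", ["op", "arg1", "arg2", "result"])

quads = [
    Quad("+", "y", "z", "x"),          # x = y + z : all four fields used
    Quad("uminus", "y", None, "x"),    # x = -y    : unary, arg2 unused
    Quad("param", "x1", None, None),   # param x1  : neither arg2 nor result
    Quad("goto", None, None, "L"),     # goto L    : target label in result
]

def to_text(q):
    """Render a quadruple back as a three-address statement."""
    if q.op == "goto":
        return f"goto {q.result}"
    if q.op == "param":
        return f"param {q.arg1}"
    if q.arg2 is None:
        return f"{q.result} = {q.op} {q.arg1}"
    return f"{q.result} = {q.arg1} {q.op} {q.arg2}"

for q in quads:
    print(to_text(q))
```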
Triples
To avoid entering temporary names into the symbol table, we might refer to a temporary value
by the position of the statement that computes it.
Doing so, the three-address statements can be represented by records with only three fields:
op, arg1, and arg2. The quadruple and triple representations for the assignment a=b+-c+b+-c are
shown below:
In a quadruple, the contents of the fields arg1, arg2, and result are normally pointers to the
symbol-table entries for the names represented by these fields. If so, temporary names must be
entered into the symbol table as they are created.
Indirect Triples
Another implementation of three address code that has been considered is that of listing
pointers to triples, rather than listing the triples themselves. This implementation is called
indirect triples.
(Eg) Implement the Quadruple, Triple and Indirect Triple for the expression E:=(a*b)+c
Soln
T1=a*b
T2=T1+c
E:=T2
Quadruple
     OP   Arg 1  Arg 2  Result
(0)  *    a      b      T1
(1)  +    T1     c      T2
(2)  :=   T2            E
Triple
     OP   Arg 1  Arg 2
(0)  *    a      b
(1)  +    (0)    c
(2)  :=   E      (1)
Indirect Triple
(0) (100)
(1) (101)
(2) (102)
       OP   Arg 1  Arg 2
(100)  *    a      b
(101)  +    (100)  c
(102)  :=   E      (101)
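The step from three-address code to triples and indirect triples in this example can be sketched mechanically. In the sketch below, the function names to_triples and to_indirect are hypothetical, an integer argument stands for a reference to the triple at that position (written (0), (100), etc. in the tables), and the base address 100 is taken from the example.

```python
# Three-address code of the solution as (op, arg1, arg2, result) quadruples.
quads = [
    ("*", "a", "b", "T1"),
    ("+", "T1", "c", "T2"),
    (":=", "T2", None, "E"),
]

def to_triples(quads):
    """Replace each temporary by the position of the triple that computes it."""
    position = {}   # temporary name -> index of the triple defining it
    triples = []

    def ref(arg):
        # A known temporary becomes an integer position; other names stay as is.
        return position.get(arg, arg)

    for op, arg1, arg2, result in quads:
        if op == ":=":
            # Assignment: the target goes in arg1, the value in arg2.
            triples.append((op, result, ref(arg1)))
        else:
            position[result] = len(triples)
            triples.append((op, ref(arg1), ref(arg2)))
    return triples

def to_indirect(triples, base=100):
    """Store triples at base, base+1, ... and return a list of pointers to them."""
    def shift(arg):
        return base + arg if isinstance(arg, int) else arg

    table = {base + i: (op, shift(a1), shift(a2))
             for i, (op, a1, a2) in enumerate(triples)}
    statements = [base + i for i in range(len(triples))]
    return statements, table

triples = to_triples(quads)
statements, table = to_indirect(triples)
```

With indirect triples, reordering statements only means permuting the statements list; the stored triples and their internal references stay untouched.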
Two attributes
S → id := E
E → E1 + E2
E → E1 * E2
{ E.place := newtmp; E.code := E1.code || E2.code || gen(E.place := E1.place * E2.place) }
E → -E1
E → (E1)
E → id
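A minimal sketch of how the two attributes place and code could be computed for these productions follows. Only the rule for E → E1 * E2 appears in these notes; the handling of the other productions, the tuple encoding of expressions, the operator name "uminus" and the function names are assumptions made for illustration. Parentheses (E → (E1)) are implicit in the tree shape and generate no code.

```python
import itertools

_counter = itertools.count(1)

def newtemp():
    """Return a fresh temporary name t1, t2, ... (plays the role of newtmp)."""
    return f"t{next(_counter)}"

def translate(expr):
    """Return (place, code) for an expression tree.

    expr is an identifier string, ("-", e) for unary minus,
    or (op, e1, e2) with op being "+" or "*".
    """
    if isinstance(expr, str):            # E -> id: place is the name, no code
        return expr, []
    if len(expr) == 2:                   # E -> -E1
        place1, code1 = translate(expr[1])
        place = newtemp()
        return place, code1 + [f"{place} := uminus {place1}"]
    op, e1, e2 = expr                    # E -> E1 + E2  or  E -> E1 * E2
    place1, code1 = translate(e1)
    place2, code2 = translate(e2)
    place = newtemp()
    # E.code := E1.code || E2.code || gen(E.place := E1.place op E2.place)
    return place, code1 + code2 + [f"{place} := {place1} {op} {place2}"]

def translate_assign(name, expr):        # S -> id := E
    place, code = translate(expr)
    return code + [f"{name} := {place}"]

# a := b * -c + d
code = translate_assign("a", ("+", ("*", "b", ("-", "c")), "d"))
for line in code:
    print(line)
```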
• Example:
• Consider the statement n=f(a[i]), where a is an array of integers and f is a function from
integers to integers.
• The three-address code for the procedure call is as follows:
t1 = i * 4
t2 = a [t1]
param t2
t3 = call f, 1
n = t3
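The param/call part of this sequence follows a fixed pattern: one param statement per argument, then a call carrying the argument count. A small sketch, assuming a hypothetical helper gen_call and taking the element width 4 and the temporaries t1 to t3 directly from the example:

```python
def gen_call(func, arg_places, result=None):
    """Emit 'param' statements for each argument, then 'call func, n'."""
    code = [f"param {place}" for place in arg_places]
    call = f"call {func}, {len(arg_places)}"
    # The returned value is optional; assign it only if a result place is given.
    code.append(f"{result} = {call}" if result else call)
    return code

# n = f(a[i]) with 4-byte integers, as in the example above
code = (
    ["t1 = i * 4",        # offset of a[i] from the start of a
     "t2 = a [t1]"]       # indexed load of the argument value
    + gen_call("f", ["t2"], result="t3")
    + ["n = t3"]
)
for line in code:
    print(line)
```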
• Implementation of Symbol Table: The following data structures are commonly used for
implementing a symbol table:
1. List:
In this method, an array is used to store names and associated information.
A pointer “available” is maintained at the end of all stored records, and new names
are added in the order in which they arrive.
To search for a name, we scan from the beginning of the list up to the “available” pointer; if it is
not found, we report the error “use of undeclared name”.
While inserting a new name, we must ensure that it is not already present;
otherwise the error “Multiple defined name” occurs.
Appending a record is fast, O(1), not counting the duplicate check, but lookup is slow for large tables: O(n) on average.
The advantage is that it takes a minimum amount of space.
2. Linked List:
This implementation uses a linked list. A link field is added to each record.
Names are searched in the order given by the link fields.
A pointer “First” is maintained to point to the first record of the symbol table.
Insertion is fast, O(1), but lookup is slow for large tables: O(n) on average.
3. Hash Table:
In the hashing scheme, two tables are maintained: a hash table and a symbol table.
This is the most commonly used method of implementing symbol tables.
A hash table is an array with index range 0 to table size – 1. Its entries are
pointers to the names in the symbol table.
To search for a name, we use a hash function that maps the name to an integer between
0 and table size – 1.
Insertion and lookup can be made very fast: O(1) on average.
The advantage is that quick search is possible; the disadvantage is that hashing is
complicated to implement.
4. Binary Search Tree:
Another approach to implementing a symbol table is to use a binary search tree, i.e. we
add two link fields, left and right child, to each record.
Every name is inserted as a descendant of the root node in such a way that the
binary search tree property is always maintained.
Insertion and lookup are O(log2 n) on average.
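As an illustration of the hashing scheme, here is a minimal sketch with the two tables kept separately: a hash table of buckets whose entries point (by index) into a symbol table of records. The class name, the table size of 8, the character-sum hash function and the use of chaining for collisions are all assumptions chosen for brevity. The two error messages are borrowed from the list method above.

```python
class SymbolTable:
    """Sketch of a hashed symbol table with chaining (illustrative only)."""

    def __init__(self, size=8):
        self.size = size
        self.buckets = [[] for _ in range(size)]  # hash table: indices into records
        self.records = []                          # symbol table: (name, info) pairs

    def _hash(self, name):
        # Simple illustrative hash: sum of character codes modulo table size,
        # so the result always lies between 0 and table size - 1.
        return sum(ord(c) for c in name) % self.size

    def insert(self, name, info):
        bucket = self.buckets[self._hash(name)]
        if any(self.records[i][0] == name for i in bucket):
            raise KeyError(f"Multiple defined name: {name}")
        self.records.append((name, info))
        bucket.append(len(self.records) - 1)

    def lookup(self, name):
        # Only the records in one bucket are compared, not the whole table.
        for i in self.buckets[self._hash(name)]:
            if self.records[i][0] == name:
                return self.records[i][1]
        raise KeyError(f"use of undeclared name: {name}")

table = SymbolTable()
table.insert("value", "int")
table.insert("pro_one", "procedure")
print(table.lookup("value"))
```

A production table would also need a better hash function and resizing; the sketch only shows how the hash table and the symbol table relate.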
• Scope Management
o A compiler maintains two types of symbol tables: a global symbol table, which can
be accessed by all the procedures, and scope symbol tables, which are created for
each scope in the program.
o To determine the scope of a name, symbol tables are arranged in a hierarchical
structure, as shown in the example below:
...
int value=10;

void pro_one()
{
    int one_1;
    int one_2;
    {                  \
        int one_3;      |_ inner scope 1
        int one_4;      |
    }                  /
    int one_5;
    {                  \
        int one_6;      |_ inner scope 2
        int one_7;      |
    }                  /
}

void pro_two()
{
    int two_1;
    int two_2;
    {                  \
        int two_3;      |_ inner scope 3
        int two_4;      |
    }                  /
    int two_5;
}
...
o The above program can be represented in a hierarchical structure of symbol
tables:
o The global symbol table contains the name of one global variable (int value) and
two procedure names, which are available to all the child tables in the
hierarchy. The names in the pro_one symbol table (and all its child
tables) are not available to the pro_two symbol table and its child tables.
o This symbol table hierarchy is stored in the semantic analyzer, and
whenever a name needs to be looked up, it is searched for using
the following algorithm:
First, the symbol is searched for in the current scope, i.e. the current symbol
table.
If the name is found, the search is complete; otherwise it is searched for in
the parent symbol table,
until either the name is found or the global symbol table has been searched for
the name.
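The lookup algorithm above can be sketched with one table per scope and a link to the parent table. The class name Scope and its methods are hypothetical, and only a few of the names from the example program are declared to keep the sketch short.

```python
class Scope:
    """One symbol table in the hierarchy, linked to its parent table."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent     # None for the global symbol table
        self.symbols = {}

    def declare(self, name, type_):
        self.symbols[name] = type_

    def lookup(self, name):
        """Search the current scope, then each parent up to the global table."""
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.name, scope.symbols[name]
            scope = scope.parent
        return None              # not found even in the global table

global_scope = Scope("global")
global_scope.declare("value", "int")
global_scope.declare("pro_one", "procedure")
global_scope.declare("pro_two", "procedure")

pro_one = Scope("pro_one", global_scope)
pro_one.declare("one_1", "int")
inner1 = Scope("inner scope 1", pro_one)
inner1.declare("one_3", "int")

pro_two = Scope("pro_two", global_scope)
pro_two.declare("two_1", "int")

print(inner1.lookup("one_1"))   # found in the enclosing pro_one table
print(pro_two.lookup("one_1"))  # pro_one names are not visible from pro_two
```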