CD Unit 3
Fig: Example
2. Evaluation Strategies
Parse-tree methods (dynamic):
1. Build the parse tree.
2. Build the dependency graph.
3. Topologically sort the graph.
4. Evaluate the attributes in that order (this fails if the graph is cyclic).
Q: What if the dependency graph has cycles?
It is hard to tell, for a given grammar, whether there exists any parse tree whose dependency graph
has a cycle.
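The four steps above can be sketched in Python, assuming the dependency graph has already been built. The attribute-instance names (E.val, T.val, ...) and the function evaluation_order are hypothetical, chosen only for illustration; the standard-library graphlib module (Python 3.9+) does the sorting and reports cycles.

```python
from graphlib import TopologicalSorter, CycleError

def evaluation_order(dependencies):
    """Return an order in which attribute instances can be evaluated.

    `dependencies` maps each attribute instance to the set of instances
    it depends on (its predecessors in the dependency graph).
    Returns None when the graph is cyclic, i.e. no valid order exists.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError:
        return None

# Hypothetical attribute instances for one small parse tree.
acyclic = {
    "E.val": {"E1.val", "T.val"},
    "E1.val": {"T1.val"},
    "T.val": set(),
    "T1.val": set(),
}
# A circular dependency: each attribute needs the other first.
cyclic = {"A.s": {"A.i"}, "A.i": {"A.s"}}

order = evaluation_order(acyclic)
print(order)
print(evaluation_order(cyclic))
```

This only detects a cycle for one concrete parse tree; it does not answer the harder question of whether some parse tree of the grammar yields a cyclic graph.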
Top-Down (LL)
L-attributed grammar:
Informally: dependency-graph edges may go from left to right, but not the other way around.
Given a production A → X1 X2 ··· Xn:
Inherited attributes of Xj depend only on:
Inherited attributes of A
Arbitrary attributes of X1, X2, ···, Xj−1.
Synthesized attributes of A depend only on its inherited attributes and arbitrary RHS
attributes.
Synthesized attributes of an action depend only on its inherited attributes.
That is, the evaluation order is: Inh(A), Inh(X1), Syn(X1), ..., Inh(Xn), Syn(Xn), Syn(A).
This is precisely the order of evaluation for an LL parser.
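As a rough illustration of that order, the sketch below walks a parse tree depth-first, left to right, and records when each node's inherited and synthesized attributes would be computed. The Node class and visit function are hypothetical helpers, not part of any parser library, and no real attribute values are computed.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    symbol: str
    children: list = field(default_factory=list)

def visit(node, trace):
    """Record the L-attributed evaluation order for the subtree at `node`."""
    # Inherited attributes are available on the way down.
    trace.append(f"Inh({node.symbol})")
    # Children are handled strictly left to right.
    for child in node.children:
        visit(child, trace)
    # Synthesized attributes are computed on the way back up.
    trace.append(f"Syn({node.symbol})")
    return trace

# Parse tree for a production A -> X1 X2.
tree = Node("A", [Node("X1"), Node("X2")])
trace = visit(tree, [])
print(", ".join(trace))
```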
Bottom-Up (LR)
S-attributed grammar:
UNIT-IV COMPILER DESIGN
L-attributed, with only synthesized attributes for non-terminals and actions at the far right of a
RHS. An S-attributed grammar can be evaluated in one bottom-up (LR) pass.
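A simplified sketch of one-pass bottom-up evaluation follows. It assumes a value stack kept alongside the parser stack and a hand-written list of shift and reduce actions standing in for a real LR parser; the function run_reductions and the action format are hypothetical, and only operand values are pushed.

```python
def run_reductions(actions):
    """Evaluate synthesized attributes on a value stack during reductions."""
    values = []
    for action, arg in actions:
        if action == "shift":
            # Shifting an operand pushes its synthesized value.
            values.append(arg)
        elif action == "reduce":
            # Reducing by E -> E1 op E2: pop the RHS values and
            # push the synthesized value of E in their place.
            right = values.pop()
            left = values.pop()
            values.append(left + right if arg == "+" else left * right)
    return values[-1]

# Action sequence an LR parser might produce for 2 + 3 * 4.
actions = [
    ("shift", 2), ("shift", 3), ("shift", 4),
    ("reduce", "*"), ("reduce", "+"),
]
result = run_reductions(actions)
print(result)
```

Because every action sits at the far right of its production, each semantic rule can run at the moment of reduction, with all the values it needs already on the stack.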
3. Introduction to ICG
The front end translates a source program into an intermediate representation from which the back
end generates target code.
Benefits of using a machine-independent intermediate form are:
1. Retargeting is facilitated. That is, a compiler for a different machine can be created by
attaching a back end for the new machine to an existing front end.
2. A machine-independent code optimizer can be applied to the intermediate representation.
E → ( E1 )    E.nptr := E1.nptr
t1 := - c                         t1 := - c
t2 := b * t1                      t2 := b * t1
t3 := - c                         t5 := t2 + t2
t4 := b * t3                      a := t5
t5 := t2 + t4
a := t5
(a) Code for the syntax tree      (b) Code for the dag
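The shorter dag column comes from reusing the temporary that already holds a repeated subexpression. A minimal sketch, assuming expressions are given as nested tuples (a hypothetical encoding) and that temporaries are numbered consecutively, so the final temporary is t3 rather than the t5 shown above:

```python
import itertools

def gen_dag_code(target, expr):
    """Emit three-address code, sharing temporaries for repeated subexpressions.

    `expr` is a name (str), a unary tuple (op, operand),
    or a binary tuple (op, left, right).
    """
    counter = itertools.count(1)
    place_of = {}   # subexpression -> temporary that already holds its value
    code = []

    def walk(e):
        if isinstance(e, str):
            return e                      # a plain name needs no code
        if e in place_of:
            return place_of[e]            # common subexpression: reuse it
        if len(e) == 2:
            op, operand = e
            rhs = f"{op} {walk(operand)}"
        else:
            op, left, right = e
            left_place = walk(left)
            right_place = walk(right)
            rhs = f"{left_place} {op} {right_place}"
        temp = f"t{next(counter)}"
        code.append(f"{temp} := {rhs}")
        place_of[e] = temp
        return temp

    result = walk(expr)
    code.append(f"{target} := {result}")
    return code

# a := b * - c + b * - c
product = ("*", "b", ("-", "c"))
dag_code = gen_dag_code("a", ("+", product, product))
print("\n".join(dag_code))
```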
The reason for the term “three-address code” is that each statement usually contains three
addresses, two for the operands and one for the result.
Types of Three-Address Statements:
The common three-address statements are:
1. Assignment statements of the form x := y op z, where op is a binary arithmetic or logical operation.
2. Assignment instructions of the form x := op y, where op is a unary operation. Essential unary
operations include unary minus, logical negation, shift operators, and conversion operators that, for
example, convert a fixed-point number to a floating-point number.
3. Copy statements of the form x := y where the value of y is assigned to x.
4. The unconditional jump goto L. The three-address statement with label L is the next to be executed.
5. Conditional jumps such as if x relop y goto L. This instruction applies a relational operator
(<, =, >=, etc.) to x and y, and executes the statement with label L next if x stands in relation
relop to y. If not, the three-address statement following if x relop y goto L is executed next, as in the
usual sequence.
6. param x and call p, n for procedure calls, and return y, where y, representing a returned value, is
optional. For example,
param x1
param x2
...
param xn
call p, n
is generated as part of a call of the procedure p(x1, x2, ..., xn).
7. Indexed assignments of the form x := y[i] and x[i] := y.
8. Address and pointer assignments of the form x := &y, x := *y, and *x := y.
Syntax-Directed Translation into Three-Address Code:
When three-address code is generated, temporary names are made up for the interior nodes of a
syntax tree. For example, id := E consists of code to evaluate E into some temporary t, followed by the
assignment id.place := t.
Given input a := b * - c + b * - c, the three-address code is as shown above. The synthesized
attribute S.code represents the three-address code for the assignment S.
The nonterminal E has two attributes :
1. E.place, the name that will hold the value of E , and
2. E.code, the sequence of three-address statements evaluating E.
PRODUCTION      SEMANTIC RULES
E → E1 + E2     E.place := newtemp;
                E.code := E1.code || E2.code || gen(E.place ‘:=’ E1.place ‘+’ E2.place)
E → E1 * E2     E.place := newtemp;
                E.code := E1.code || E2.code || gen(E.place ‘:=’ E1.place ‘*’ E2.place)
E → - E1        E.place := newtemp;
                E.code := E1.code || gen(E.place ‘:=’ ‘uminus’ E1.place)
E → ( E1 )      E.place := E1.place;
                E.code := E1.code
E → id          E.place := id.place;
                E.code := ‘ ’
Table: Syntax-directed definition to produce three-address code for assignments
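The semantic rules in the table can be sketched as a recursive function that returns the pair (E.place, E.code). This is an illustration only: the nested-tuple encoding of expressions and the helper translate_assignment are assumptions, while newtemp and gen mirror the names used in the table.

```python
import itertools

def translate_assignment(name, expr):
    """Return three-address code for `name := expr`.

    `expr` is a name (str), ("uminus", e1), or (op, e1, e2) with op in + or *.
    """
    counter = itertools.count(1)

    def newtemp():
        # Each call returns a distinct temporary name t1, t2, ...
        return f"t{next(counter)}"

    def gen(*parts):
        # Build one three-address statement as a string.
        return " ".join(parts)

    def translate(e):
        if isinstance(e, str):                  # E -> id
            return e, []                        # place is the name, code is empty
        if e[0] == "uminus":                    # E -> - E1
            p1, c1 = translate(e[1])
            place = newtemp()
            return place, c1 + [gen(place, ":=", "uminus", p1)]
        op, e1, e2 = e                          # E -> E1 + E2 or E -> E1 * E2
        p1, c1 = translate(e1)
        p2, c2 = translate(e2)
        place = newtemp()
        return place, c1 + c2 + [gen(place, ":=", p1, op, p2)]

    place, code = translate(expr)
    # Code for E, followed by the assignment to the identifier.
    return code + [gen(name, ":=", place)]

# a := b * - c + b * - c
product = ("*", "b", ("uminus", "c"))
tree_code = translate_assignment("a", ("+", product, product))
print("\n".join(tree_code))
```

Parenthesized expressions need no case of their own here, since E → ( E1 ) just passes E1.place and E1.code through unchanged.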
Semantic rules generating code for a while statement
5. TYPE CHECKING
A compiler must check that the source program follows both syntactic and semantic
conventions of the source language.
This checking, called static checking, detects and reports programming errors.
Some examples of static checks:
1. Type checks – A compiler should report an error if an operator is applied to an incompatible
operand. Example: an array variable and a function variable are added together.
2. Flow-of-control checks – Statements that cause flow of control to leave a construct must have some
place to which to transfer the flow of control. Example: a break statement transfers control out of an
enclosing statement such as a switch; an error occurs when no such enclosing statement exists.
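The flow-of-control check for break can be sketched as a walk over the statement structure that tracks how many enclosing statements a break could leave. The tuple encoding of statements and the function misplaced_breaks are hypothetical, and only while and switch are treated as valid targets here.

```python
BREAKABLE = {"while", "switch"}

def misplaced_breaks(stmts, depth=0):
    """Count break statements that have no enclosing while or switch.

    Each statement is a tuple: ("break",), ("while", body),
    ("switch", body) or ("block", body), where body is a list of statements.
    """
    count = 0
    for stmt in stmts:
        kind = stmt[0]
        if kind == "break":
            if depth == 0:
                count += 1            # nowhere to transfer control: an error
        elif kind in BREAKABLE:
            count += misplaced_breaks(stmt[1], depth + 1)
        elif kind == "block":
            # A plain block does not give break a target.
            count += misplaced_breaks(stmt[1], depth)
    return count

program = [
    ("break",),                        # error: no enclosing statement
    ("while", [("break",)]),           # fine
    ("block", [("break",)]),           # error: a block is not a target
]
errors = misplaced_breaks(program)
print(errors)
```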
Position of type checker
A type checker verifies that the type of a construct matches that expected by its context. For
example, the arithmetic operator mod in Pascal requires integer operands, so a type checker verifies
that the operands of mod have type integer.
Type information gathered by a type checker may be needed when code is generated.
Type Systems:
The design of a type checker for a language is based on information about the syntactic constructs in
the language, the notion of types, and the rules for assigning types to language constructs.
For example: “if both operands of the arithmetic operators +, - and * are of type integer, then the result
is of type integer”
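A rule of this kind can be written as a small function from operand types to a result type. In the sketch below, types are plain strings, the function name check_binary is hypothetical, and the string "type_error" is used to signal a failed check; real type checkers handle many more cases, such as coercions between integer and real.

```python
INTEGER_ONLY_OPS = {"+", "-", "*", "mod"}

def check_binary(op, left_type, right_type):
    """Return the result type of `left op right`, or "type_error"."""
    if op in INTEGER_ONLY_OPS:
        # Both operands must be integer; the result is then integer.
        if left_type == "integer" and right_type == "integer":
            return "integer"
    # Any other combination is rejected in this simplified rule set.
    return "type_error"

print(check_binary("+", "integer", "integer"))
print(check_binary("mod", "real", "integer"))
```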
Type Expressions:
The type of a language construct will be denoted by a “type expression.”
A type expression is either a basic type or is formed by applying an operator called a type
constructor to other type expressions.
The sets of basic types and constructors depend on the language to be checked.
The following are the definitions of type expressions:
1. Basic types such as boolean, char, integer, real are type expressions.
A special basic type, type_error , will signal an error during type checking; void denoting “the
absence of a value” allows statements to be checked.
2. Since type expressions may be named, a type name is a type expression.
3. A type constructor applied to type expressions is a type expression.
Constructors include:
Arrays: If T is a type expression then array (I,T) is a type expression denoting the type of an
array with elements of type T and index set I.
Products: If T1 and T2 are type expressions, then their Cartesian product T1 X T2 is a type
expression.
Records: The difference between a record and a product is that the fields of a record have names.
The record type constructor will be applied to a tuple formed from field names and field types.
For example:
type row = record
    address: integer;
    lexeme: array[1..15] of char
end;
var table: array[1..101] of row;
declares the type name row representing the type expression record((address X integer) X (lexeme
X array(1..15,char))) and the variable table to be an array of records of this type.
Pointers: If T is a type expression, then pointer(T) is a type expression denoting the type “pointer
to an object of type T”.
For example, var p: ↑ row declares variable p to have type pointer(row).
Functions: A function in programming languages maps a domain type D to a range type R. The type
of such function is denoted by the type expression D → R
4. Type expressions may contain variables whose values are type expressions.
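One way to picture these definitions is as a small set of tree node classes, one per constructor. The sketch below is illustrative only: the class names, field names and the show helper are assumptions, and records and type variables are left out for brevity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Basic:
    name: str                 # boolean, char, integer, real, ...

@dataclass(frozen=True)
class Array:
    index: str                # index set, e.g. "1..15"
    element: object           # element type expression

@dataclass(frozen=True)
class Product:
    left: object
    right: object

@dataclass(frozen=True)
class Pointer:
    target: object

@dataclass(frozen=True)
class Function:
    domain: object
    result: object            # the range type R

def show(t):
    """Render a type expression in the notation used in the text."""
    if isinstance(t, Basic):
        return t.name
    if isinstance(t, Array):
        return f"array({t.index}, {show(t.element)})"
    if isinstance(t, Product):
        return f"{show(t.left)} x {show(t.right)}"
    if isinstance(t, Pointer):
        return f"pointer({show(t.target)})"
    if isinstance(t, Function):
        return f"{show(t.domain)} -> {show(t.result)}"
    raise TypeError(f"not a type expression: {t!r}")

char = Basic("char")
integer = Basic("integer")

# The function type char x char -> pointer(integer).
signature = Function(Product(char, char), Pointer(integer))
lexeme = Array("1..15", char)
print(show(signature))
print(show(lexeme))
```

Frozen dataclasses compare field by field, so two separately built copies of the same type expression are equal, which is one simple way to test structural equivalence of types.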
Tree representation for char x char → pointer (integer)
→
x pointer