Chapter - 4: Semantic Analysis

Compiler design

Uploaded by

Nhatty Özil

Chapter – 4

Semantic analysis
Syntax directed translation
 A syntax directed definition is a generalization of the
CFG in which each grammar symbol has an associated set
of attributes (synthesized and inherited).
 An attribute can represent anything we choose (a string, a number, a type, a memory location, etc.).
 The value of a synthesized attribute is computed from the
values of attributes at the children of that node in the parse
tree.
 The value of an inherited attribute is computed from the
values of attributes at the siblings and parent of that node in
the parse tree.
Attributes
 The process of computing the value of an attribute is referred to as binding.
 Static: bound prior to execution, e.g., the number of digits in a number
 Dynamic: bound during execution, e.g., the value of a variable

Check the class of the following attributes:
1. Value in grammar 1 (Parse: 27)
2. Data type in grammar 2 (Parse: int x)
Semantic rules/Attributed Grammar
 Semantic rules are attached to productions and specify how the attribute values are computed.
 An attributed grammar is S-attributed when all of its attributes are synthesized.
 The L-attributed class of grammars allows a limited kind of inherited attributes.
Dependency Graph
 A dependency graph shows the dependencies between attributes and makes it possible to find an evaluation order for the semantic rules.
 A parse tree showing attributes with their values is called an annotated or decorated parse tree.
Examples
 Construct the semantic rules (attributed grammar) and dependency graph for the following grammars
Parse: 27
Parse: 5+5+3
Parse: int x
/* Binary number */
N → L1 . L2
L1 → L2 B
L → B
B → 0
B → 1
Parse: 1001.10
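As an illustration of synthesized attributes, the binary-number grammar above can be evaluated bottom-up. The following is a minimal sketch: the attribute names val and len, the semantic rules in the comments, and the function names are assumptions chosen for the example, since the slides leave the rules as an exercise.

```python
# Sketch: synthesized attributes val and len for the binary-number grammar.
# The semantic rules below are assumptions, not taken from the slides:
#   B  -> 0        B.val = 0
#   B  -> 1        B.val = 1
#   L  -> B        L.val = B.val;                L.len = 1
#   L1 -> L2 B     L1.val = 2 * L2.val + B.val;  L1.len = L2.len + 1
#   N  -> L1 . L2  N.val = L1.val + L2.val / 2 ** L2.len

def eval_b(bit):
    """B -> 0 | 1: the value is synthesized from the terminal itself."""
    if bit not in ("0", "1"):
        raise ValueError(f"not a binary digit: {bit!r}")
    return int(bit)

def eval_l(bits):
    """Return (val, len) for a non-empty string of binary digits."""
    if not bits:
        raise ValueError("empty digit list")
    if len(bits) == 1:                       # L -> B
        return eval_b(bits), 1
    val, length = eval_l(bits[:-1])          # L1 -> L2 B
    return 2 * val + eval_b(bits[-1]), length + 1

def eval_n(text):
    """N -> L1 . L2: combine the attributes of the two digit lists."""
    whole, frac = text.split(".")
    whole_val, _ = eval_l(whole)
    frac_val, frac_len = eval_l(frac)
    return whole_val + frac_val / 2 ** frac_len

print(eval_n("1001.10"))  # 9.5
```

Every value here flows from the children to the parent, which is what makes the attributes synthesized.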
Type checking
 The compiler must check whether the source program follows the semantic conventions of the source language.
 Static checking: performed at compile time
 Dynamic checking: performed at run time
 The following are examples of static checks:
 Type Checks.
 Uniqueness Checks.
 Flow-of-Control Checks.
 Name-Related Checks.
 A type checker verifies that the type of a construct matches the type expected by its context.
 For example, a type checker should verify that the type of the value assigned to a variable is compatible with the type of the variable.
 Mod (%) expects two integers.
Type checking(cont. . .)
 In almost all languages, types are either basic or constructed
 Basic types are boolean, character, integer and real
 Constructed types include arrays, records and sets
Type Expression
 The type of a language construct will be denoted by a type expression
 A type expression is either a basic type or is formed by applying an operator called a type constructor to other type expressions
 The following are type expressions:
 A basic type is a type expression
 A type constructor applied to a type expression is a type expression.
Constructors include:
 Arrays. If I is an index set and T is a type expression, then array(I, T) is a type expression
 Pointers. If T is a type expression, then pointer(T) is a type expression
 Functions. The type expression of a function has the form D→R, where D is the type expression of the parameters and R is the type expression of the returned value.
 Etc.
Examples of type checker
The following example gives a type checking rule for function calls:

E→E1 (E2) {E.type := if E2.type = s and E1.type = s → t then t else type_error}
Examples of type checker
Expressions:

E→literal {E.type := char}

E→num {E.type := integer}

E→id {E.type := lookup (id.entry)}

E→E1 mod E2 {E.type := if E1.type = integer and E2.type = integer then integer else type_error}

E→E1[E2] {E.type := if E2.type = integer and E1.type = array(s, t) then t else type_error}

E→^E1 {E.type := if E1.type = pointer(t) then t else type_error}

Examples of type checker
Statements:
S→id := E {S.type := if id.type = E.type then void else type_error}

S→if E then S1 {S.type := if E.type = boolean then S1.type else type_error}

S→while E do S1 {S.type := if E.type = boolean then S1.type else type_error}

S→S1 ; S2 {S.type := if S1.type = void and S2.type = void then void else type_error}
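The expression and statement rules above translate almost line for line into a recursive checker. The sketch below is one possible encoding, not the slides' implementation: the tuple-based tree shapes ("mod", "index", "deref", "call", "assign", "seq"), the dictionary used as a symbol table, and the function names are all assumptions made for illustration.

```python
# Sketch of a type checker for the rules above.
# Basic types are strings; constructed types are tuples:
#   ("array", index_set, element), ("pointer", target), ("func", domain, range)
TYPE_ERROR = "type_error"

def expr_type(e, symtab):
    """Return the type of expression tree e, or TYPE_ERROR."""
    kind = e[0]
    if kind == "literal":                      # E -> literal
        return "char"
    if kind == "num":                          # E -> num
        return "integer"
    if kind == "id":                           # E -> id (lookup in symbol table)
        return symtab.get(e[1], TYPE_ERROR)
    if kind == "mod":                          # E -> E1 mod E2
        t1, t2 = expr_type(e[1], symtab), expr_type(e[2], symtab)
        return "integer" if t1 == "integer" and t2 == "integer" else TYPE_ERROR
    if kind == "index":                        # E -> E1[E2]
        t1, t2 = expr_type(e[1], symtab), expr_type(e[2], symtab)
        if t2 == "integer" and isinstance(t1, tuple) and t1[0] == "array":
            return t1[2]
        return TYPE_ERROR
    if kind == "deref":                        # E -> ^E1
        t1 = expr_type(e[1], symtab)
        if isinstance(t1, tuple) and t1[0] == "pointer":
            return t1[1]
        return TYPE_ERROR
    if kind == "call":                         # E -> E1(E2)
        t1, t2 = expr_type(e[1], symtab), expr_type(e[2], symtab)
        if isinstance(t1, tuple) and t1[0] == "func" and t1[1] == t2:
            return t1[2]
        return TYPE_ERROR
    return TYPE_ERROR

def stmt_type(s, symtab):
    """Return "void" for a well-typed statement, else TYPE_ERROR."""
    kind = s[0]
    if kind == "assign":                       # S -> id := E
        t = expr_type(s[2], symtab)
        ok = t != TYPE_ERROR and symtab.get(s[1]) == t
        return "void" if ok else TYPE_ERROR
    if kind in ("if", "while"):                # S -> if E then S1 | while E do S1
        if expr_type(s[1], symtab) == "boolean":
            return stmt_type(s[2], symtab)
        return TYPE_ERROR
    if kind == "seq":                          # S -> S1 ; S2
        both_void = (stmt_type(s[1], symtab) == "void"
                     and stmt_type(s[2], symtab) == "void")
        return "void" if both_void else TYPE_ERROR
    return TYPE_ERROR

symtab = {"x": "integer", "p": ("pointer", "char"), "c": "char", "flag": "boolean"}
print(stmt_type(("assign", "x", ("mod", ("id", "x"), ("num", 2))), symtab))  # void
print(stmt_type(("assign", "x", ("deref", ("id", "p"))), symtab))            # type_error
```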
Equivalence of type expression
 Two type expressions are structurally equivalent if and only if they are the same basic type or are formed by applying the same constructor to structurally equivalent types
 For example, integer is equivalent only to integer
 int a[2] is structurally equivalent to int b[4] when array bounds are not treated as part of the type, since both are arrays constructed from the structurally equivalent element type int
Type conversion
Chapter – 5

Intermediate Code Generation


Intermediate code generation
 In a compiler, the front end translates a source program into an intermediate representation, and the back end generates the target code from this intermediate representation
 The benefits of using a machine-independent intermediate code (IC) are:
 retargeting to another machine is facilitated
 optimization can be done on the machine-independent code
 In practice, IC generation and type checking can be done at the same time, as both use syntax-directed translation
Intermediate code generation
 While generating machine code directly from source code is possible, it entails two problems
 With m languages and n target machines, we need to write m front ends, m × n optimizers, and m × n code generators
 The code optimizer, which is one of the largest and most difficult-to-write components of a compiler, cannot be reused (it is both machine and language dependent)
 By converting source code to an intermediate code, a machine-independent code optimizer may be written
 This means just m front ends, n code generators and 1 optimizer
Intermediate code generation
 Intermediate code must be easy to produce and easy to
translate to machine code
 A sort of universal assembly language
 Should not contain any machine-specific parameters (registers,
addresses, etc.)
 Quadruples, triples, three-address code, abstract syntax trees, and directed acyclic graphs are the classical forms used for intermediate code
Intermediate code generation

Using triples, we refer to the result of an operation x op y by its position, rather than by an explicit temporary name.
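To make the difference concrete, the sketch below emits both quadruples and triples for a + b * c. The tuple-based tree encoding, the temporary names t1, t2, and the function names are assumptions made for this illustration.

```python
# Sketch: quadruples vs. triples for the expression a + b * c.
# An expression tree is either a name (str) or a tuple (op, left, right).

def to_quadruples(tree):
    """Quadruple = (op, arg1, arg2, result); results are named temporaries."""
    quads = []
    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        l, r = walk(left), walk(right)
        temp = f"t{len(quads) + 1}"
        quads.append((op, l, r, temp))
        return temp
    walk(tree)
    return quads

def to_triples(tree):
    """Triple = (op, arg1, arg2); an int argument refers to an earlier triple."""
    triples = []
    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        l, r = walk(left), walk(right)
        triples.append((op, l, r))
        return len(triples) - 1        # position of the result, not a name
    walk(tree)
    return triples

tree = ("+", "a", ("*", "b", "c"))
print(to_quadruples(tree))  # [('*', 'b', 'c', 't1'), ('+', 'a', 't1', 't2')]
print(to_triples(tree))     # [('*', 'b', 'c'), ('+', 'a', 0)]
```

In the triples, the second operand of + is the position 0 of the earlier multiplication, so no temporary name is needed.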
Intermediate code generation
Postfix Notation
 Postfix notation is practical for an intermediate representation because the operands are found just before the operator
 In fact, postfix notation is a linearized representation of a syntax tree
 e.g., 1 + 2 * 3 will be represented in postfix notation as 1 2 3 * +
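Postfix notation falls out of a post-order walk of the syntax tree, and it can be evaluated with a single stack. The sketch below assumes a tuple-based tree encoding and the hypothetical names to_postfix and eval_postfix.

```python
# Sketch: postfix notation as a post-order traversal of a syntax tree.
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def to_postfix(node):
    """A tree is either an operand (str) or a tuple (op, left, right)."""
    if isinstance(node, str):
        return [node]
    op, left, right = node
    return to_postfix(left) + to_postfix(right) + [op]

def eval_postfix(tokens):
    """Evaluate integer postfix code; operands sit just below the operator."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(int(tok))
    return stack.pop()

tree = ("+", "1", ("*", "2", "3"))       # 1 + 2 * 3, with * binding tighter
code = to_postfix(tree)
print(" ".join(code))                    # 1 2 3 * +
print(eval_postfix(code))                # 7
```

By contrast, the tree for (1 + 2) * 3 yields 1 2 + 3 *.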
Intermediate code generation
Three-Address code
 Three-address code is a sequence of statements of the form: X := Y op Z
 Only one operator is allowed on the right-hand side of the assignment.
 Like postfix notation, three-address code is a linearized representation of a syntax tree.
 Each statement usually contains three addresses (the two operands and the result)
Intermediate code generation
Statement | Format | Comments
Assignment (binary operation) | X := Y op Z | Arithmetic and logical operators used
Assignment (unary operation) | X := op Y |
Copy statement | X := Y |
Unconditional jump | goto L |
Conditional jump | if X relop Y goto L |
Function call | param X1 | The parameters are specified by param
 | param X2 |
 | ... |
 | param Xn |
 | call p, n | The procedure p is called by indicating the number of parameters n
Indexed arguments | X := Y[I] | X will be assigned the value at the address Y + I
 | Y[I] := X | The value at the address Y + I will be assigned X
Address and pointer assignments | X := &Y | X is assigned the address of Y
 | X := *Y | X is assigned the element at the address Y
 | *X := Y | The value at the address X is assigned Y
Intermediate code generation(Examples)
 Generate three address code and syntax tree for the
following examples
 a+b+(a+b)
 a + a * (b - c) + (b - c) * d
 a+b*c-d/(b*c)
 (( x + y ) - (( x + y) * (x - y))) + ((x + y) * (x - y))
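These expressions all contain repeated subexpressions, which a DAG represents only once. The sketch below emits three-address code while reusing the temporary of any subexpression it has already seen. The tuple-based tree encoding and the name gen_tac_dag are assumptions, and the reuse is only safe here because no operand is reassigned inside the expression.

```python
# Sketch: three-address code with sharing of common subexpressions (DAG style).
# An expression tree is either a name (str) or a tuple (op, left, right).

def gen_tac_dag(tree):
    code = []
    seen = {}                 # (op, left, right) -> temporary holding the value
    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        key = (op, walk(left), walk(right))
        if key not in seen:   # first occurrence: emit one statement
            seen[key] = f"T{len(seen) + 1}"
            code.append(f"{seen[key]} := {key[1]} {op} {key[2]}")
        return seen[key]      # later occurrences reuse the temporary
    walk(tree)
    return code

# a + b + (a + b), parsed left to right as (a + b) + (a + b)
tree = ("+", ("+", "a", "b"), ("+", "a", "b"))
for line in gen_tac_dag(tree):
    print(line)
# T1 := a + b
# T2 := T1 + T1
```

A plain syntax-tree walk without the seen table would emit a + b twice.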
Intermediate code generation(Examples)
 C-Program

int a=3, b=4, dot_prod, i;


dot_prod = 0;
for (i=0; i<10; i++) dot_prod += a*b;

 Intermediate code

dot_prod = 0;
i = 0;
L1: if(i >= 10) goto L2
T1 = a
T2 = b
T3 = T1*T2
T4 = dot_prod+T3
dot_prod = T4
T5 = i+1
i = T5
goto L1
L2:
Intermediate code generation(Examples)
 C-Program

int a, b, dot_prod, i; int* a1; int* b1;


dot_prod = 0; a1 = a; b1 = b;
for (i=0; i<10; i++) dot_prod += *a1++ * *b1++;

 Intermediate code
Intermediate code generation (Examples)
 C-Program (function)
int dot_prod(int x, int y){
int d, i; d = 0;
for (i=0; i<10; i++) d += x*y;
return d;
}
 Intermediate code
func begin dot_prod
d = 0;
i = 0;
L1: if(i >= 10) goto L2
T1 = x
T2 = y
T3 = T1*T2
T4 = d+T3
d = T4
T5 = i+1
i = T5
goto L1
L2: return d
func end
Intermediate code generation(Examples)
1. C-Program

do i = i+1; while (a[i] < v) ;


Intermediate code ?

2. C-Program

if ((a+b < c+d) || ((e==f) && (g > h-k)))
a = a+1;
else a = a-1; c = c+3;
Intermediate code ?

3. C-Program
int fact(int n)
{
if (n==0) return 1;
else return (n*fact(n-1));
}
Intermediate code ?
Chapter – 6

Code Generation
Code Generation
 Back end operation
 Machine architecture (the CPU as seen by the machine-language programmer)
 Instruction set operation
 Addressing mode
 Data format
 CPU registers
 I/O instructions
 Code Generation
Atoms → Binary coded instruction(Object language program)
Code Generation
 Query #1:
 Can we use a C++ backend while developing a new language?
Code Generation
 Query #2:
 Given
 Source code for Pascal compiler
 Pascal Compiler that runs on Mac

 Required
 Ada compiler which runs on Mac
Code Generation
 Query #3:
 Given
 Source code for Pascal compiler
 Pascal Compiler that runs on Mac

 Required
 Pascal compiler for a new machine called RISC
Code Generation
 Converting atoms to instructions
 Eg. C=A+B
 Converting conditional branches
 If(A>B) A=B*C
 MOV supporting architecture
MOV 1, T1
TST A, B, 3, L1
MOV 0, T1
LBL L1
TST T1, 0, 1, L2
MUL B, C, A
JMP L3
LBL L2
LBL L3
 LOD followed by STO architecture
LOD R1, ='1'
STO R1, T1
LOD R1, A
CMP R1, B, 3
JMP L1
LOD R1, ='0'
STO R1, T1
L1:
LOD R1, T1
CMP R1, ='0', 1
JMP L2
LOD R1, B
MUL R1, C
STO R1, A
CMP 0,0,0
JMP L3
L2:
L3:
Code Generation
 Register Allocation
 CPU
 Single arithmetic register
 Specific purpose registers or
 General purpose registers (here we need register allocation)
 Register Allocation:
 Process of assigning a purpose to a particular register
 Used to
 Maximize utilization of CPU registers
 Minimize memory references
 Minimize number of instructions (can also optimize code)
 Allows CG to maintain information on
 Which registers are used
 Which are available for reuse
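The bookkeeping described above can be as simple as a free list plus a map of what each busy register holds. The class below is a minimal sketch; the name RegisterPool, its methods, and the first-free allocation policy are assumptions for illustration, and a real code generator would spill a register to memory where this sketch raises an error.

```python
# Sketch: tracking which registers are in use and which can be reused.

class RegisterPool:
    def __init__(self, names):
        self.free = list(names)   # registers available for reuse
        self.in_use = {}          # register -> description of the value it holds

    def allocate(self, holds):
        """Assign the first free register to the value described by `holds`."""
        if not self.free:
            raise RuntimeError("no free register; a spill would be needed")
        reg = self.free.pop(0)
        self.in_use[reg] = holds
        return reg

    def release(self, reg):
        """Mark a register as available again."""
        del self.in_use[reg]
        self.free.append(reg)

pool = RegisterPool(["R1", "R2"])
r_prod = pool.allocate("C*D")     # R1 now holds the product
r_sum = pool.allocate("B+C*D")    # R2 holds the sum
pool.release(r_prod)              # the product is no longer needed
print(pool.free, pool.in_use)     # ['R1'] {'R2': 'B+C*D'}
```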
Code Generation
 Show the code that can be generated for the following
C++ code segments with efficient register utilization
 Example #1:
A=B+C*D;
B=A-C*D;
One solution can be
LOD R1, C
MUL R1, D
STO R1, Temp1
LOD R1, B
ADD R1, Temp1
STO R1, A
LOD R1, C
MUL R1, D
STO R1, Temp2
LOD R1, A
SUB R1, Temp2
STO R1, B
Problems
1. Number of instructions
2. Number of memory references
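The two problems can be quantified by counting instructions and memory operands. The sketch below compares the single-register solution with a hypothetical alternative that keeps C*D in R1 and works in a second register. That alternative assumes a machine with at least two registers whose ADD and SUB accept a register as the second operand, which the slides do not state. Only operand references are counted, not instruction fetches.

```python
# Sketch: counting instructions and memory operands of a code sequence.

REGISTERS = {"R1", "R2"}  # assumption: the machine has at least two registers

def cost(program):
    """Return (instruction count, number of memory operands)."""
    instructions = 0
    memory_refs = 0
    for line in program.strip().splitlines():
        _opcode, operands = line.split(None, 1)
        instructions += 1
        for operand in operands.split(","):
            if operand.strip() not in REGISTERS:
                memory_refs += 1
    return instructions, memory_refs

SINGLE_REGISTER = """
LOD R1, C
MUL R1, D
STO R1, Temp1
LOD R1, B
ADD R1, Temp1
STO R1, A
LOD R1, C
MUL R1, D
STO R1, Temp2
LOD R1, A
SUB R1, Temp2
STO R1, B
"""

# Hypothetical alternative: R1 keeps C*D, R2 holds B+C*D (the new A).
TWO_REGISTERS = """
LOD R1, C
MUL R1, D
LOD R2, B
ADD R2, R1
STO R2, A
SUB R2, R1
STO R2, B
"""

print(cost(SINGLE_REGISTER))  # (12, 12)
print(cost(TWO_REGISTERS))    # (7, 5)
```

After STO R2, A the register R2 still holds the new value of A, so SUB R2, R1 computes A - C*D without reloading A or recomputing the product.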

 Example #2:
 a - b/c + d*(e-f + g+h)
