0% found this document useful (0 votes)
68 views

Compiler 2

This chapter introduces the concepts needed to build a simple syntax-directed translator. It discusses defining a context-free grammar, parsing source code using predictive parsing, and implementing syntax-directed translation to generate intermediate code. The chapter also provides examples of translating infix arithmetic expressions to postfix expressions and mapping code fragments to three-address code.

Uploaded by

mamudu francis
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
68 views

Compiler 2

This chapter introduces the concepts needed to build a simple syntax-directed translator. It discusses defining a context-free grammar, parsing source code using predictive parsing, and implementing syntax-directed translation to generate intermediate code. The chapter also provides examples of translating infix arithmetic expressions to postfix expressions and mapping code fragments to three-address code.

Uploaded by

mamudu francis
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 45

Chapter 2

Chang Chi-Chung
2008.03 rev.1
A Simple Syntax-Directed
Translator
 This chapter contains introductory material to
Chapters 3 to 8
 To create a syntax-directed translator that maps
infix arithmetic expressions into postfix
expressions.
 Building a simple compiler involves:
 Defining the syntax of a programming language
 Develop a source code parser: for our compiler
we will use predictive parsing
 Implementing syntax directed translation to
generate intermediate code
A Code Fragment To Be
Translated
To extend syntax-directed translator to map code fragments into three-
address code. See appendix A.

1: i = i + 1
2: t1 = a [ i ]
{ 3: if t1 < v goto 1
int i; int j; 4: j = j -1
float[100] a; float v; float x; 5: t2 = a [ j ]
while (true) { 6: if t2 > v goto 4
do i = i + 1; while ( a[i] < v ); 7: ifFalse i >= j goto 9
do j = j – 1; while ( a[j] > v ); 8: goto 14
if ( i>= j ) break; 9: x = a [ i ]
x = a[i]; a[i] = a[j]; a[j] = x; 10: t3 = a [ j ]
} 11: a [ i ] = t3
} 12: a [ j ] = x
13: goto 1
14:
A Model of a Compiler Front
End
Source Lexical Token Syntax Intermediate Three-address
program analyzer stream Parser tree Code code
Generator
Character
Stream

Symbol
Table
Two Forms of Intermediate
Code
 Abstract syntax trees  Tree-Address instructions

do-while

1: i = i + 1
body > 2: t1 = a [ i ]
3: if t1 < v goto
1
assign [] v

i + a i

i 1
Syntax Definition

 Using Context-free grammar (CFG)


 BNF: Backus-Naur Form
 Context-free grammar has four components:
 A set of tokens (terminal symbols)
 A set of nonterminals
 A set of productions
 A designated start symbol
Example of CFG

 G = <T, N, P, S>
 T = { +,-,0,1,2,3,4,5,6,7,8,9 }
 N = { list, digit }
 P=
 list  list + digit

 list  list – digit

 list  digit

 digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
 S = list
Derivations

 The set of all strings (sequences of tokens) g


enerated by the CFG using derivation
 Begin with the start symbol
 Repeatedly replace a nonterminal symbol in the c
urrent sentential form with one of the right-hand si
des of a production for that nonterminal
Example of the Derivations
list  Production
 list + digit  list  list + digit
 list - digit + digit  list  list – digit
 digit - digit + digit
 list  digit
 9 - digit + digit
 digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
 9 - 5 + digit
9-5+2
 Leftmost derivation
 replaces the leftmost nonterminal (underlined) in each step.
 Rightmost derivation
 replaces the rightmost nonterminal in each step.
Parser Trees
 Given a CFG, a parse tree according to the grammar is a tree wit
h following propertes.
 The root of the tree is labeled by the start symbol

 Each leaf of the tree is labeled by a terminal (=token) or 

 Each interior node is labeled by a nonterminal

 If A  X X … X is a production, then node A has immediate chil


1 2 n
dren X1, X2, …, Xn where Xi is a (non)terminal or  ( denotes the
empty string)

 Example A
 A  XYZ

X Y Z
Example of the Parser Tree
 Parse tree of the string 9-5+2 using grammar G
list

list digit

list digit

digit
The sequence of
9 - 5 + 2 leafs is called the
yield of the parse tree
Ambiguity

 Consider the following context-free grammar


G = <{string}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, string>

P = string  string + string | string - string | 0 | 1 | … | 9

 This grammar is ambiguous, because more


than one parse tree represents the string 9-
5+2
Ambiguity (Cont’d)

string string

string string string

string string string string string

9 - 5 + 2 9 - 5 + 2
Associativity of Operators
 Left-associative
 If an operand with an operator on both sides of it, then it belo
ngs to the operator to its left.
 string a+b+c has the same meaning as (a+b)+c
 Left-associative operators have left-recursive productions
 left  left + term | term
 Right-associative
 If an operand with an operator on both sides of it, then it belo
ngs to the operator to its right.
 string a=b=c has the same meaning as a=(b=c)
 Right-associative operators have right-recursive productions
 right  term = right | term
Associativity of Operators (co
nt’d)
list right

list digit letter right

list digit letter right

digit letter

a + b + c a = b = c

left-associative right-associative
Precedence of Operators
 String 9+5*2 has the same meaning as 9+(5*2)
 * has higher precedence than +
 Constructs a grammar for arithmetic expression
s with precedence of operators.
 left-associative : + - (expr)
 left-associative : * / (term)

Step 1: Step 3:
factor  digit | ( expr ) expr  expr + term
| expr – term
| term
Step 2: Step 4:
term  term * factor expr  expr + term | expr – term | term
| term / factor term  term * factor | term / factor | factor
| factor factor  digit | ( expr )
An Example: Syntax of
Statements
 The grammar is a subset of Java statements.

 This approach prevents the build-up of semicolons a


fter statements such as if- and while-, which end wit
h nested substatements.
stmt  id = expression ;
| if ( expression ) stmt
| if ( expression ) stmt else stmt
| while ( expression ) stmt
| do stmt while ( expression ) ;
| { stmts }

stmts  stmts stmt


| 
Syntax-Directed Translation
 Syntax-Directed translation is done by attaching rules
or program fragments to productions in a grammar.
 Translate infix expressions into postfix notation. ( in t
his chapter )
 Infix: 9 – 5 + 2
 Postfix: 9 5 – 2 +
 An Example
 expr  expr1 + term
 The pseudo-code of the translation

translate expr1 ;
translate term ;
handle + ;
Syntax-Directed Translation
(Cont’d)
 Two concepts (approaches) related to
Syntax-Directed Translation.
 Synthesized Attributes
 Syntax-directed definition
 Build up a translation by attaching strings (semantic
rules) as attributes to the nodes in the parse tree.
 Translation Schemes
 Syntax-directed translation
 Build up a translation by program fragments which are
called semantic actions and embedded within production
bodies.
Syntax-directed definition
 The syntax-directed definition associates
 With each grammar symbol (terminals and nonterminals), a s
et of attributes.
 With each production, a set of semantic rules for computing t
he values of the attributes associated with the symbols appea
ring in the production.
 An attribute is said to be
 Synthesized
 if its value at a parse-tree node is determined from attribute valu
es at its children and at the node itself.
 Inherited
 if its value at a parse-tree node is determined from attribute valu
es at the node itself, its parent, and its siblings in the parse tree.
An Example: Synthesized
Attributes
An annotated parse tree
 Suppose a node N in a parse tree is labeled by gr
ammar symbol X.
 The X.a is denoted the value of attribute a of X at
node N.
expr.t = “95-2+”

expr.t = “95-” term.t = “2”

expr.t = “9” term.t = “5”

term.t = “9”

9 - 5 + 2
Semantic Rules
Production Semantic Rules
expr  expr1 + term expr.t = expr1.t || term.t || ‘+’
expr  expr1 - term expr.t = expr1.t || term.t || ‘-’
expr  term expr.t = term.t
term  0 term.t = ‘0’
term.t = ‘1’
term  1


term.t = ‘9’
term  9
|| is the operator for string concatenation in semantic rule.
Depth-First Traversals
 Tree traversals
 Breadth-First
 Depth-First
 Preorder: N L R
 Inorder: L N R
 Postorder: L R N
 Depth-First Traversals: Postorder 、 From left to right
procedure visit(node N)
{
for ( each child C of N, from left to right )
{
visit(C);
}
evaluate semantic rules at node N;
}
Example: Depth-First
Traversals
expr.t = 95-2+

expr.t = 95- term.t = 2

expr.t = 9 term.t = 5

term.t = 9

9 - 5 + 2

Note: all attributes are the synthesized type


Translation Schemes
 A translation scheme is a CFG embedded
with semantic actions
 Example
 rest  + term { print(“+”) } rest

Embedded Semantic Action

rest

+ term { print(“+”) } rest


An Example: Translation
expr
Scheme
expr + term { print(‘+’) }

expr - term { print(‘-’) } 2 { print(‘2’) }

term 5 { print(‘5’) }
expr  expr + term { print(‘+’)
}
9 { print(‘9’) } expr  expr – term { print(‘-’) }
expr  term
term  0 { print(‘0’) }
term  1 { print(‘1’) }


Parsing
 The process of determining if a string of
terminals (tokens) can be generated by a
grammar.
 Time complexity:
 For any CFG there is a parser that takes at most
O(n3) time to parse a string of n terminals.
 Linear algorithms suffice to parse essentially all
languages that arise in practice.
 Two kinds of methods
 Top-down: constructs a parse tree from root to leaves
 Bottom-up: constructs a parse tree from leaves to root
Top-Down Parsing
 Recursive descent parsing is a top-down method
of syntax analysis in which a set of recursive proced
ures is used to process the input.
 One procedure is associated with each nonterminal of a gr
ammar.
 If a nonterminal has multiple productions, each production i
s implemented in a branch of a selection statement based
on input lookahead information
 Predictive parsing
 A special form of recursive descent parsing
 The lookahead symbol unambiguously determines the flow
of control through the procedure body for each nonterminal.
An Example: Top-Down
Parsing
stmt  expr ;
| if ( expr ) stmt
| for ( optexpr ; optexpr ; optexpr ) stmt
| other
optexpr  
| expr

stmt

for ( optexpr ; optexpr ; optexpr ) stmt

ε expr expr other


void stmt() {

Pseudocode For switch ( lookahead ) {


case expr:
match(expr); match(‘;’); break;
a Predictive Pa case if:
match(if); match(‘(‘);

rser
match(expr); match(‘)’);
stmt(); break;
case for:
match(for); match(‘(‘);
optexpr(); match(‘;’);
stmt  expr ; optexpr(); match(‘;’);
| if ( expr ) stmt optexpr(); match(‘)’);
| for ( optexpr ; optexpr ; optexpr ) stmt stmt(); break;
| other case other:
match(other); break;
default:
report(“syntax error”);
}
Use ε- }

Productions
optexpr   | expr
void optexpr() {
if ( lookahead == expr ) match(expr);
}

void match(terminal t) {
if ( lookahead == t )
lookahead = nextTerminal;
else
report(“syntax error”);
}
Example: Predictive Parsing
Parse LL(1)
Tree stmt

for ( optexpr ; optexpr ; optexpr ) stmt

match(for) match(‘(‘) optexpr()match(‘;‘)


optexpr()match(‘;‘) optexpr()
match(‘)‘) stmt()

Input

for ( ; expr ; expr ) other

lookahead
FIRST
 FIRST() is the set of terminals that appear a
s the first symbols of one or more strings gen
erated from 
  is Sentential Form
 Example
 FIRST(stmt) = { expr, if, for, other }
 FIRST(expr ;) = { expr }
stmt  expr ;
| if ( expr ) stmt
| for ( optexpr ; optexpr ; optexpr ) stmt
| other
Examples: First
type  simple
| ^ id
| array [ simple ] of type
simple  integer
| char
| num dotdot num

FIRST(simple) = { integer, char, num }


FIRST(^ id) = { ^ }
FIRST(type) = { integer, char, num, ^, array }
Designing a Predictive Parser
 A predictive parser is a program consisting of a
procedure for every nonterminal.
 The procedure for nonterminal A
 It decides which A-production to use by examining t
he lookahead symbol.
 Left Factor
 Left Recursion
 ε Production
 Mimics the body of the chosen production.
 Applying translation scheme
 Construct a predictive parser, ignoring the actions.
 Copy the actions from the translation scheme into th
e parser
Left Factor
 Left Factor
 One production for nonterminal A starts with the s
ame symbols.

 Example:
stmt  if ( expr ) stmt
| if ( expr ) stmt else stmt

 Use Left Factoring to fix it


stmt  if ( expr ) stmt rest
rest  else stmt | ε
Left Recursion
 Left Recursive
 A production for nonterminal A starts with a self re
ference.
 A  Aα | β
 An Example:
 expr  expr + term | term
 Rewrite the left recursive to right recursive by
using the following rules.
A  βR
R  αR | ε
Example: Left and Right
Recursive A A
… R
A R
A …

A R
R
β α α …. α β α α …. α ε

left recursive right recursive


Abstract and Concrete
Syntax +
- 2
expr
9 5
expr term

expr term
helper
term

9 - 5 + 2
Conclusion: Parsing and
Translation Scheme
 Give a CFG grammar G as below:
expr  expr + term { print(‘+’) }
expr  expr – term { print(‘-’) }
expr  term
term  0 { print(‘0’) }
term  1 { print(‘1’) }

term  9 { print(‘9’) }

 Semantic actions for translating into postfix notation.


Conclusion: Parsing and
Translation
 Step 1
Scheme
 To elimination left-recursion
 Technique
A  Aα | Aβ | γ
into
A  γR
R  αR | βR | ε
 Use the rule to transforms G.
Conclusion: Parsing and
Translation Scheme
 Left-Recursion-elimination
expr  term rest
rest  + term { print(‘+’) } rest
| – term { print(‘-’) } rest
| ε
term  0 { print(‘0’) }
term  1 { print(‘1’) }

term  9 { print(‘9’) }
An Example: Left-Recursion-elimin
ation expr

term rest

9 { print(‘9’) } - term { print(‘-’) } rest

5 { print(‘5’) } + term { print(‘+’) } rest

2 { print(‘2’) } ε

expr  term rest


rest  + term { print(‘+’) } rest
| – term { print(‘-’) } rest
| ε
term  0 { print(‘0’) } | 1 { print(‘1’) } | … | 9 { print(‘9’) }
Conclusion: Parsing and
 Translation Scheme
void expr() {
Step 2 term(); rest();
}
 Procedures for
void rest() {
Nonterminals. if ( lookahead == ‘+’ ) {
match(‘+’); term();
print(‘+’); rest();
}
else if ( lookahead == ‘-’ ) {
match(‘-’); term();
print(‘-’); rest();
}
else { } //do nothing with the input
}

void term() {
if ( lookahead is a digit ) {
t = lookahead; match(lookahead);
print(t);
}
else
report(“syntax error”);
}
Conclusion: Parsing and
Translation Scheme
 Step 3
 Simplifying the Translator
void rest() {
void rest() { while ( true ) {
if ( lookahead == ‘+’ ) { if ( lookahead == ‘+’ ) {
match(‘+’); term(); match(‘+’); term();
print(‘+’); rest(); print(‘+’); continue;
} }
else if (lookahead == ‘-’) { else if (lookahead == ‘-’) {
match(‘-’); term(); match(‘-’); term();
print(‘-’); rest(); print(‘-’); continue;
} }
else { } break;
}
}
Conclusion: Parsing and
 Translation
Complete Scheme
import java.io.*;

class Parser {
static int lookahead;

public Parser() throws IOException {


lookahead = System.in.read();
}

void expr() { void term() throws IOException {


term(); if (Character.isDigit((char)lookahead){
while ( true ) { System.out.write((char)lookahead);
if ( lookahead == ‘+’ ) { match(lookahead);
match(‘+’); term(); }
System.out.write(‘+’); else throw new Error(“syntax error”);
continue; }
}
else if (lookahead == ‘-’) { void match(int t) throws IOException {
match(‘-’); term(); if ( lookahead == t )
System.out.write(‘-’); lookahead = System.in.read();
continue; else throw new Error(“syntax error”);
} }
else return; }
}

You might also like