2. Simple Syntax Directed Translation
• Intermediate-code generation produces one of two common forms:
abstract syntax trees, or simply syntax trees, which represent the hierarchical
syntactic structure of the source program, and
three-address code, a sequence of three-address instructions.
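To make the two intermediate forms concrete, here is a minimal sketch that walks a syntax tree and emits three-address instructions. The tuple encoding of the tree, the function name three_address, and the temporary names t1, t2, ... are assumptions chosen for illustration, not notation from the slides.

```python
import itertools

def three_address(tree):
    """Translate a syntax tree into a list of three-address instructions.

    A tree is either an operand string such as "9", or a tuple
    (op, left, right) for a binary operator (hypothetical encoding).
    Returns (instructions, name_holding_the_result).
    """
    code = []
    temps = itertools.count(1)  # numbers for fresh temporaries t1, t2, ...

    def walk(node):
        if isinstance(node, str):      # leaf: the operand names itself
            return node
        op, left, right = node
        left_name = walk(left)         # translate operands first
        right_name = walk(right)
        temp = f"t{next(temps)}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = walk(tree)
    return code, result

# Syntax tree for 9-5+2, grouped as (9-5)+2
code, result = three_address(("+", ("-", "9", "5"), "2"))
print("\n".join(code))
```

Each emitted instruction has at most one operator on its right side, which is what makes it a three-address instruction.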
• Terminals: + - 0 1 2 3 4 5 6 7 8 9
A string of terminals is a sequence of zero or
more terminals.
The string of zero terminals is called the empty
string (ε).
• Non-terminals: list digit
• Start symbol: list
Derivations
• A grammar derives strings by beginning with
the start symbol & repeatedly replacing a non-
terminal by the body of a production for that
non-terminal.
• The terminal strings that can be derived from
the start symbol form the language defined by
the grammar.
Example
(1) list → list + digit
(2) list → list - digit
(3) list → digit
(4) digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
• 9-5+2 is a list:
a) 9 is a list by production (3), since 9 is a digit.
b) 9-5 is a list by production (2), since 9 is a list and 5
is a digit.
c) 9-5+2 is a list by production (1), since 9-5 is a list
and 2 is a digit.
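The reasoning in steps a) to c) can be sketched as a small recursive check. This sketch assumes the productions are numbered (1) list → list + digit, (2) list → list - digit, and (3) list → digit, which is what the steps above imply. The function name derive_list is hypothetical.

```python
DIGITS = set("0123456789")

def derive_list(s):
    """Return the production steps showing that s is a list, or None if it is not."""
    # Production (3): list -> digit
    if len(s) == 1 and s in DIGITS:
        return [f"(3) list -> digit: {s}"]
    # Productions (1) and (2): list -> list + digit | list - digit
    if len(s) >= 3 and s[-1] in DIGITS and s[-2] in "+-":
        prefix = derive_list(s[:-2])   # the shorter string must itself be a list
        if prefix is None:
            return None
        number = "(1)" if s[-2] == "+" else "(2)"
        return prefix + [f"{number} list -> list {s[-2]} digit: {s}"]
    return None

for step in derive_list("9-5+2"):
    print(step)
```

A result of None means the string cannot be derived from list, for example "9-" or the empty string.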
Parsing
• Parsing is the problem of taking a string of terminals
and figuring out how to derive it from the start symbol
of the grammar, or,
if it cannot be derived from the start symbol of the
grammar, reporting syntax errors within the
string.
• Parsing is one of the most fundamental problems in all
of compiling.
• A source program has multi-character lexemes that are
grouped by the lexical analyzer into tokens, whose first
components are the terminals processed by the parser.
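As an illustration of the last point, the sketch below groups multi-character lexemes into (terminal, attribute) tokens and then checks only the first components, reporting a syntax error when they do not fit. The token names num and id, the function names, and the simplified pattern num (('+' | '-') num)* are assumptions made for this example.

```python
import re

# One alternative per token class; whitespace is skipped because nothing matches it.
TOKEN_RE = re.compile(
    r"(?P<num>[0-9]+)|(?P<id>[A-Za-z_][A-Za-z_0-9]*)|(?P<op>\S)"
)

def tokenize(source):
    """Group lexemes into (terminal, attribute) pairs."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        lexeme = match.group()
        if match.lastgroup == "num":
            tokens.append(("num", int(lexeme)))
        elif match.lastgroup == "id":
            tokens.append(("id", lexeme))
        else:
            tokens.append((lexeme, None))  # operators stand for themselves
    return tokens

def check_list(tokens):
    """Raise SyntaxError unless the terminals form: num (('+' | '-') num)*."""
    terminals = [terminal for terminal, _ in tokens]
    for i, terminal in enumerate(terminals):
        if i % 2 == 0:
            ok, expected = terminal == "num", "num"
        else:
            ok, expected = terminal in ("+", "-"), "+ or -"
        if not ok:
            raise SyntaxError(f"token {i}: expected {expected}, found {terminal!r}")
    if len(terminals) % 2 == 0:
        raise SyntaxError("unexpected end of input: expected num")

print(tokenize("count = count + 31"))
check_list(tokenize("90 - 5 + 2"))  # passes silently
```

Note that the parser side (check_list) never looks at the attribute values such as 31 or "count"; it sees only the terminals.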
Parse Trees
• A parse tree pictorially shows how the start
symbol of a grammar derives a string in the
language.
• If non-terminal A has a production A → XYZ,
then a parse tree may have an interior node labeled A with
three children labeled X, Y, and Z, from left to right.
Properties of the Parse Tree
1. The root is labeled by the start symbol.
2. Each leaf is labeled by a terminal or by ε.
3. Each interior node is labeled by a non-terminal.
4. If A is the non-terminal labeling some interior node and
X1, X2, ..., Xn are the labels of the children of that
node from left to right, then there must be a
production A → X1 X2 ... Xn.
• Here, X1, X2, ..., Xn each stand for a symbol that is
either a terminal or a non-terminal.
• As a special case, if A → ε is a production, then a node
labeled A may have a single child labeled ε.
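The four properties can be checked mechanically. The following is a minimal sketch for the list/digit grammar used earlier. The Node class and the helper names are hypothetical, and the production set is written out under the assumption that the grammar is list → list + digit | list - digit | digit.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

NONTERMINALS = {"list", "digit"}
PRODUCTIONS = {
    ("list", ("list", "+", "digit")),
    ("list", ("list", "-", "digit")),
    ("list", ("digit",)),
} | {("digit", (d,)) for d in "0123456789"}

def _check(node):
    if not node.children:                       # property 2: leaves are terminals or ε
        return node.label not in NONTERMINALS
    if node.label not in NONTERMINALS:          # property 3: interior nodes are non-terminals
        return False
    body = tuple(child.label for child in node.children)
    return (node.label, body) in PRODUCTIONS and all(   # property 4
        _check(child) for child in node.children
    )

def is_parse_tree(root, start="list"):
    return root.label == start and _check(root)  # property 1

def tree_yield(node):
    """Concatenate the leaf labels from left to right."""
    if not node.children:
        return "" if node.label == "ε" else node.label
    return "".join(tree_yield(child) for child in node.children)

def digit(d):
    return Node("digit", [Node(d)])

# Parse tree for 9-5+2
inner = Node("list", [Node("list", [digit("9")]), Node("-"), digit("5")])
tree = Node("list", [inner, Node("+"), digit("2")])
print(is_parse_tree(tree), tree_yield(tree))
```

Reading the leaves from left to right gives back the derived string, which is why the tree shows how the start symbol derives it.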
Tree Terminology
A tree consists of one or more nodes. Nodes may have
labels, which in this book typically will be grammar
symbols.
When we draw a tree, we often represent the nodes by
these labels only.
• The string 9-5+2 can be grouped in two ways: (9-5)+2 or 9-(5+2).
Associativity of Operators
• By convention, 9+5+2 is equivalent to (9+5)+2.
• 9-5-2 is equivalent to (9-5)-2.
• When an operand like 5 has operators to its left
and right, conventions are needed for deciding
which operator applies to that operand.
• The operator + associates to the left, because an
operand with plus signs on both sides of it
belongs to the operator to its left.
• In most programming languages, addition,
subtraction, multiplication, and division are
left-associative.
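The choice of associativity changes the result for subtraction, which a short sketch makes visible. The helper names eval_left and eval_right are hypothetical. They simply group a chain of operands from the left or from the right.

```python
from functools import reduce
import operator

def eval_left(operands, op):
    """Group from the left: ((a op b) op c) ..."""
    return reduce(op, operands)

def eval_right(operands, op):
    """Group from the right: a op (b op (c ...))"""
    result = operands[-1]
    for x in reversed(operands[:-1]):
        result = op(x, result)
    return result

print(eval_left([9, 5, 2], operator.sub))   # (9-5)-2
print(eval_right([9, 5, 2], operator.sub))  # 9-(5-2)
print(eval_left([9, 5, 2], operator.add), eval_right([9, 5, 2], operator.add))
```

For addition the two groupings agree, but for subtraction they do not, so the convention has to be fixed by the language.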
• Exponentiation is right-associative.
• The assignment operator = in C and its
descendants is right-associative:
the expression a=b=c is treated in the
same way as the expression a=(b=c).
• Strings such as a=b=c are generated by the grammar:
• right → letter = right | letter
• letter → a | b | . . . | z
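Because the non-terminal right appears at the right end of its own production, a parser that follows the grammar directly nests assignments to the right. Below is a minimal sketch of that idea. The tuple representation of the tree and the function names are assumptions for illustration.

```python
LETTERS = set("abcdefghijklmnopqrstuvwxyz")

def parse_right(s, pos=0):
    """Parse right -> letter = right | letter, starting at index pos.

    Returns (tree, next_position). A tree is a single letter or a
    tuple (letter, "=", subtree).
    """
    if pos >= len(s) or s[pos] not in LETTERS:
        raise SyntaxError(f"expected a letter at position {pos}")
    if pos + 1 < len(s) and s[pos + 1] == "=":
        subtree, end = parse_right(s, pos + 2)   # recurse on the right operand
        return (s[pos], "=", subtree), end
    return s[pos], pos + 1

def parse(s):
    tree, end = parse_right(s)
    if end != len(s):
        raise SyntaxError(f"unexpected input at position {end}")
    return tree

print(parse("a=b=c"))
```

The nested tuple mirrors the grouping a=(b=c): the innermost assignment b=c is the right operand of the outer one.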
Parse Tree for 9-5-2 and a=b=c
Operators listed in order of increasing precedence:
left-associative: + -
left-associative: * /
We create two nonterminals expr and term for the two levels of
precedence, and an extra nonterminal factor for generating basic
units in expressions.
The basic units in expressions are presently digits and
parenthesized expressions.
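One way to see the three non-terminals at work is a small evaluator with one method per non-terminal. This is a sketch under stated assumptions: it takes the grammar to be expr → expr + term | expr - term | term, term → term * factor | term / factor | factor, and factor → digit | ( expr ), and it replaces the left-recursive productions with loops, a common implementation substitute that preserves left associativity. The class and function names are hypothetical.

```python
class Parser:
    def __init__(self, text):
        self.text = text.replace(" ", "")
        self.pos = 0

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def advance(self):
        ch = self.text[self.pos]
        self.pos += 1
        return ch

    def expr(self):
        # Lowest precedence level: + and -, grouped from the left
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # Higher precedence level: * and /, grouped from the left
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        # Basic units: a digit or a parenthesized expression
        ch = self.peek()
        if ch is not None and ch in "0123456789":
            return int(self.advance())
        if ch == "(":
            self.advance()
            value = self.expr()
            if self.peek() != ")":
                raise SyntaxError(f"expected ) at position {self.pos}")
            self.advance()
            return value
        raise SyntaxError(f"expected digit or ( at position {self.pos}")

def evaluate(text):
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected input at position {parser.pos}")
    return value

print(evaluate("9+5*2"), evaluate("(9+5)*2"), evaluate("9-5-2"))
```

Because term is called from inside expr, every * or / is applied before the surrounding + or -, which is how the two levels of non-terminals encode the two levels of precedence.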