Syntax-Directed Translation
CS3300 - Compiler Design
V. Krishna Nandivada
IIT Madras

Syntax Directed Translation
Attach rules or program fragments to productions in a grammar.
Syntax directed definition (SDD)
  E1 → E2 + T    E1.code = E2.code || T.code || '+'
Syntax directed translation scheme (SDT)
  E → E + T { print '+' }   // semantic action
  F → id { print id.val }
V. Krishna Nandivada (IIT Madras) CS3300 - Aug 2019

SDD and SDT scheme
SDD: specifies the values of attributes by associating semantic rules with the productions.
SDT scheme: embeds program fragments (also called semantic actions) within production bodies.
  The position of the action (in the middle of the production body or at its end) defines the order in which the action is executed.
SDD is easier to read; easy for specification.
SDT scheme can be more efficient; easy for implementation.

Example: SDD vs SDT scheme - infix to postfix translation
PRODUCTION     SDT SCHEME ACTION    SDD SEMANTIC RULE
E → E1 + T     { print '+' }        E.code = E1.code || T.code || '+'
E → E1 - T     { print '-' }        E.code = E1.code || T.code || '-'
E → T                               E.code = T.code
T → 0          { print '0' }        T.code = '0'
T → 1          { print '1' }        T.code = '1'
···
T → 9          { print '9' }        T.code = '9'
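
The two styles in the infix-to-postfix example can be contrasted with a small Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the input is a string of single digits separated by + or -, and the left-recursive production for E is handled with a loop rather than by a real parser. `translate_sdt` emits output as a side effect at the point where each action sits, while `translate_sdd` computes a `code` attribute bottom-up.

```python
import io


def parse_digit(tokens, pos):
    """T -> 0 | 1 | ... | 9. Returns (digit, next position)."""
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise SyntaxError(f"digit expected at position {pos}")
    return tokens[pos], pos + 1


def translate_sdt(tokens):
    """SDT scheme: actions write output in the order they are reached."""
    out = io.StringIO()
    digit, pos = parse_digit(tokens, 0)
    out.write(digit)                      # T -> d   { print 'd' }
    while pos < len(tokens):
        op = tokens[pos]
        if op not in ("+", "-"):
            raise SyntaxError(f"operator expected at position {pos}")
        digit, pos = parse_digit(tokens, pos + 1)
        out.write(digit)                  # T -> d   { print 'd' }
        out.write(op)                     # E -> E op T   { print op }
    return out.getvalue()


def translate_sdd(tokens):
    """SDD: the translation is the value of the synthesized attribute code."""
    digit, pos = parse_digit(tokens, 0)
    code = digit                          # E -> T : E.code = T.code
    while pos < len(tokens):
        op = tokens[pos]
        if op not in ("+", "-"):
            raise SyntaxError(f"operator expected at position {pos}")
        t_code, pos = parse_digit(tokens, pos + 1)
        code = code + t_code + op         # E.code = E1.code || T.code || op
    return code
```

Both functions map the input 9-5+2 to 95-2+; the SDT version never holds the whole translation as an attribute, which is one reason it can be cheaper to implement.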

Syntax directed translation - overview
1. Construct a parse tree.
2. Compute the values of the attributes at the nodes of the tree by visiting the tree.
Key: we don't need to build a parse tree all the time.
  Translation can be done during parsing.
  Two classes of SDTs allow this: "L-attributed translations" and "S-attributed translations".

Syntax directed definition
SDD is a CFG along with attributes and rules.
  An attribute is associated with grammar symbols (attribute grammar).
  Rules are associated with productions.

Attributes
An attribute is any quantity associated with a programming construct.
Example: data types, line numbers, instruction details.
Two kinds of attributes, for a non-terminal A at a parse tree node N:
  A synthesized attribute: defined by a semantic rule associated with the production at N.
    Defined only in terms of attribute values at the children of N and at N itself.
  An inherited attribute: defined by a semantic rule associated with the production at the parent of N.
    Defined only in terms of attribute values at the parent of N, the siblings of N, and N itself.

Specifying the actions: Attribute grammars
Idea: attribute the syntax tree
  can add attributes (fields) to each node
  specify equations to define values (unique)
  can use attributes from parent and children
Example: to ensure that constants are immutable:
  add type and class attributes to expression nodes
  rules for the production on := that
    1. check that LHS.class is variable
    2. check that LHS.type and RHS.type are consistent or conform
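
The constant-immutability rules for the assignment production could look roughly like the following Python sketch. The `Expr` record, the field name `cls` (chosen because `class` is a Python keyword), and the function `check_assign` are hypothetical names introduced for illustration; "consistent" is simplified here to type equality.

```python
from dataclasses import dataclass


@dataclass
class Expr:
    name: str
    type: str   # type attribute, e.g. "int" or "real"
    cls: str    # class attribute: "variable" or "constant"


def check_assign(lhs, rhs):
    """Rules attached to the production for `lhs := rhs`; returns error messages."""
    errors = []
    if lhs.cls != "variable":              # rule 1: LHS.class must be variable
        errors.append(f"{lhs.name} is not a variable")
    if lhs.type != rhs.type:               # rule 2: simplified conformance check
        errors.append(f"type mismatch: {lhs.type} := {rhs.type}")
    return errors
```

For example, assigning to a constant `PI` of type real from an int variable reports both a class error and a type error, while `x := 1` with matching int types reports nothing.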

Attribute grammars
To formalize such systems Knuth introduced attribute grammars:
  grammar-based specification of tree attributes
  value assignments associated with productions
  each attribute uniquely, locally defined
  label identical terms uniquely
Can specify context-sensitive actions with attribute grammars.

Example
PRODUCTION       SEMANTIC RULES
D → T L          L.in := T.type
T → int          T.type := integer
T → real         T.type := real
L → L1 , id      L1.in := L.in
                 addtype(id.entry, L.in)
L → id           addtype(id.entry, L.in)
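
A minimal Python sketch of the declaration example follows, assuming the declaration has already been split into a type keyword and a list of identifier names. `declare`, `visit_L`, and the dictionary standing in for the symbol table are hypothetical; only `addtype` and the attributes `in` and `type` come from the table of rules.

```python
def declare(type_name, ids):
    """D -> T L for a declaration such as `real a, b, c` (already tokenized)."""
    if not ids:
        raise ValueError("at least one identifier is required")
    symtab = {}

    def addtype(entry, typ):
        symtab[entry] = typ

    def visit_L(names, inh):
        # L -> L1 , id : L1.in := L.in, then addtype(id.entry, L.in)
        if len(names) > 1:
            visit_L(names[:-1], inh)
        # L -> id uses the same rule for its single identifier
        addtype(names[-1], inh)

    t_type = {"int": "integer", "real": "real"}[type_name]   # T.type
    visit_L(ids, t_type)                                     # L.in := T.type
    return symtab
```

The inherited attribute `in` shows up as the parameter `inh`: it is passed down the list unchanged, so every identifier receives the type synthesized at T.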

Example: Evaluate signed binary numbers
PRODUCTION            SEMANTIC RULES
NUM → SIGN LIST       LIST.pos := 0
                      if SIGN.neg
                        NUM.val := -LIST.val
                      else
                        NUM.val := LIST.val
SIGN → +              SIGN.neg := false
SIGN → -              SIGN.neg := true
LIST → BIT            BIT.pos := LIST.pos
                      LIST.val := BIT.val
LIST → LIST1 BIT      LIST1.pos := LIST.pos + 1
                      BIT.pos := LIST.pos
                      LIST.val := LIST1.val + BIT.val
BIT → 0               BIT.val := 0
BIT → 1               BIT.val := 2^BIT.pos

Example (continued)
The attributed parse tree for -101:
NUM (val: -5)
  SIGN (neg: T)
    -
  LIST (pos: 0, val: 5)
    LIST (pos: 1, val: 4)
      LIST (pos: 2, val: 4)
        BIT (pos: 2, val: 4)
          1
      BIT (pos: 1, val: 0)
        0
    BIT (pos: 0, val: 1)
      1
val and neg are synthesized attributes.
pos is an inherited attribute.
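
The semantic rules for signed binary numbers can be traced with a short Python sketch. It assumes the input is a string made of a sign followed by one or more bits; `eval_num` and `eval_list` are hypothetical names. The inherited attribute pos becomes a function parameter and the synthesized attribute val becomes the return value.

```python
def eval_list(bits, pos):
    """LIST with inherited attribute pos; returns the synthesized LIST.val."""
    *rest, bit = bits
    bit_val = 2 ** pos if bit == "1" else 0      # BIT.pos := LIST.pos
    if not rest:                                 # LIST -> BIT
        return bit_val
    return eval_list(rest, pos + 1) + bit_val    # LIST -> LIST1 BIT


def eval_num(text):
    """NUM -> SIGN LIST; returns NUM.val."""
    if len(text) < 2 or text[0] not in "+-" or set(text[1:]) - {"0", "1"}:
        raise ValueError(f"not a signed binary number: {text!r}")
    neg = text[0] == "-"                         # SIGN.neg
    list_val = eval_list(text[1:], 0)            # LIST.pos := 0
    return -list_val if neg else list_val
```

Calling `eval_num("-101")` walks the same tree as the slide: the leftmost bit gets pos 2 and contributes 4, the middle bit contributes 0, the rightmost contributes 1, and the sign negates the sum.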

Dependences between attributes
values are computed from constants & other attributes
synthesized attribute: value computed from children
inherited attribute: value computed from siblings & parent
key notion: induced dependency graph

The attribute dependency graph
nodes represent attributes
edges represent flow of values
graph is specific to parse tree
size is related to parse tree's size
can be built alongside parse tree
The dependency graph must be acyclic.
Evaluation order:
  topologically sort the dependency graph to order attributes
  using this order, evaluate the rules
The order depends on both the grammar and the input string.

Example (continued)
The attribute dependency graph:
[Figure: parse tree for -101 annotated with the dependency edges between attributes; the node labels are listed below.]
NUM (val: -5)
  SIGN (neg: T)
  LIST0 (pos: 0, val: 5)
    LIST1 (pos: 1, val: 4)
      LIST2 (pos: 2, val: 4)
        BIT0 (pos: 2, val: 4)
      BIT1 (pos: 1, val: 0)
    BIT2 (pos: 0, val: 1)

Example: A topological order
 1. SIGN.neg
 2. LIST0.pos
 3. LIST1.pos
 4. LIST2.pos
 5. BIT0.pos
 6. BIT1.pos
 7. BIT2.pos
 8. BIT0.val
 9. LIST2.val
10. BIT1.val
11. LIST1.val
12. BIT2.val
13. LIST0.val
14. NUM.val
Evaluating in this order yields NUM.val: -5
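
The dependency graph for -101 and its evaluation by topological sort can be sketched in Python with the standard-library `graphlib` module (Python 3.9 or later). The `rules` dictionary is an assumed encoding: each attribute maps to the attributes it reads plus a function computing it, using the node subscripts of the slide. The order produced by `static_order()` is a valid topological order but need not match the slide's numbering.

```python
from graphlib import TopologicalSorter

# attribute -> (attributes it depends on, rule that computes it from known values)
rules = {
    "SIGN.neg":  ((), lambda v: True),                              # SIGN -> -
    "LIST0.pos": ((), lambda v: 0),
    "LIST1.pos": (("LIST0.pos",), lambda v: v["LIST0.pos"] + 1),
    "LIST2.pos": (("LIST1.pos",), lambda v: v["LIST1.pos"] + 1),
    "BIT0.pos":  (("LIST2.pos",), lambda v: v["LIST2.pos"]),
    "BIT1.pos":  (("LIST1.pos",), lambda v: v["LIST1.pos"]),
    "BIT2.pos":  (("LIST0.pos",), lambda v: v["LIST0.pos"]),
    "BIT0.val":  (("BIT0.pos",), lambda v: 2 ** v["BIT0.pos"]),     # BIT -> 1
    "BIT1.val":  ((), lambda v: 0),                                 # BIT -> 0
    "BIT2.val":  (("BIT2.pos",), lambda v: 2 ** v["BIT2.pos"]),     # BIT -> 1
    "LIST2.val": (("BIT0.val",), lambda v: v["BIT0.val"]),
    "LIST1.val": (("LIST2.val", "BIT1.val"),
                  lambda v: v["LIST2.val"] + v["BIT1.val"]),
    "LIST0.val": (("LIST1.val", "BIT2.val"),
                  lambda v: v["LIST1.val"] + v["BIT2.val"]),
    "NUM.val":   (("SIGN.neg", "LIST0.val"),
                  lambda v: -v["LIST0.val"] if v["SIGN.neg"] else v["LIST0.val"]),
}

# graphlib expects a mapping from each node to its predecessors.
graph = {attr: set(deps) for attr, (deps, _) in rules.items()}
order = list(TopologicalSorter(graph).static_order())

values = {}
for attr in order:
    values[attr] = rules[attr][1](values)   # every dependency is already in values
```

After the loop, `values["NUM.val"]` holds -5, matching the attributed parse tree.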

Evaluation strategies
Parse-tree methods (dynamic)
  1. build the parse tree
  2. build the dependency graph
  3. topologically sort the graph
  4. evaluate it (cyclic graph fails)
What if there are cycles?

Avoiding cycles
It is hard to tell, for a given grammar, whether there exists any parse tree whose dependency graph has cycles.
Focus on classes of SDDs that guarantee an evaluation order, i.e. that do not permit dependency graphs with cycles.
  L-attributed: class of SDTs called "L-attributed translations".
  S-attributed: class of SDTs called "S-attributed translations".
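
For the parse-tree method, a cyclic dependency graph can at least be detected for one specific tree at evaluation time. The sketch below uses Python's `graphlib`, whose `static_order()` raises `CycleError` on a cyclic graph; the function name and the two tiny graphs are hypothetical. It checks a single dependency graph only and does not answer the harder grammar-level question.

```python
from graphlib import CycleError, TopologicalSorter


def evaluation_order(dependencies):
    """Return a valid attribute order, or None if the graph has a cycle."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError:
        return None


# Hypothetical circular rules: A.s := B.i and B.i := A.s
cyclic = {"A.s": {"B.i"}, "B.i": {"A.s"}}
# Hypothetical well-formed rules: A.s := B.s, where B.s is a constant
acyclic = {"A.s": {"B.s"}, "B.s": set()}
```

Here `evaluation_order(cyclic)` returns None, which is step 4 of the parse-tree method failing, while the acyclic graph yields B.s before A.s.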

Top-down (LL) on-the-fly one-pass evaluation
L-attributed grammar:
Informally: dependency-graph edges may go from left to right, not the other way around.
  given production A → X1 X2 ··· Xn
  inherited attributes of Xj depend only on:
    1. inherited attributes of A
    2. arbitrary attributes of X1, X2, ···, Xj−1
  synthesized attributes of A depend only on its inherited attributes and arbitrary RHS attributes
  synthesized attributes of an action depend only on its inherited attributes
  i.e., evaluation order:
    Inh(A), Inh(X1), Syn(X1), ..., Inh(Xn), Syn(Xn), Syn(A)
This is precisely the order of evaluation for an LL parser.

Bottom-up (LR) on-the-fly one-pass evaluation
S-attributed grammar:
  L-attributed
  only synthesized attributes for non-terminals
  actions at far right of a RHS
Can evaluate S-attributed in one bottom-up (LR) pass.
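
The L-attributed evaluation order Inh(A), Inh(X1), Syn(X1), ..., Syn(A) maps naturally onto recursive descent: inherited attributes become parameters and synthesized attributes become return values. The Python sketch below assumes a hypothetical grammar that is not in the slides: E → T R, R → + T R1 | - T R1 | ε, T → digit, with R.inh carrying the value accumulated so far.

```python
def parse_T(tokens, pos):
    """T -> digit : T.val := value of the digit. Returns (T.val, next position)."""
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise SyntaxError(f"digit expected at position {pos}")
    return int(tokens[pos]), pos + 1


def parse_R(tokens, pos, inh):
    """R with inherited attribute inh. Returns (R.syn, next position)."""
    if pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        t_val, pos = parse_T(tokens, pos + 1)
        # R -> op T R1 : R1.inh depends only on R.inh and T.val (both to its left)
        r1_inh = inh + t_val if op == "+" else inh - t_val
        return parse_R(tokens, pos, r1_inh)      # R.syn := R1.syn
    return inh, pos                              # R -> ε : R.syn := R.inh


def parse_E(tokens):
    """E -> T R : R.inh := T.val ; E.val := R.syn"""
    t_val, pos = parse_T(tokens, 0)
    val, pos = parse_R(tokens, pos, t_val)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected input at position {pos}")
    return val
```

With this scheme `parse_E("9-5+2")` returns 6; the inherited attribute keeps subtraction left-associative even though the grammar is right-recursive.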

Evaluate S-attributed grammar in bottom-up parsing
Evaluate it in any bottom-up order of the nodes in the parse tree.
(One option:) Apply postorder to the root of the parse tree:
    void postorder(N) {
        for (each child C of N)
            postorder(C);
        evaluate the attributes associated with N;
    }
Postorder traversal of the parse tree corresponds to the exact order in which bottom-up parsing builds the parse tree.
Thus, we can evaluate S-attributed in one bottom-up (LR) pass.

Inherited vs Synthesized attributes
Synthesized attributes are limited.
Inherited attributes (are good): derive values from constants, parents, siblings
  used to express context (context-sensitive checking)
  inherited attributes are more "natural"
We want to use both kinds of attributes.
  can always rewrite L-attributed LL grammars (using markers and copying) to avoid inherited attribute problems with LR
Self reading (if interested): Dragon book Section 5.5.4.
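
The postorder evaluation of an S-attributed grammar described earlier can be sketched in Python. The `Node` class, the rule table keyed by production strings, and the example tree for 3+4 are assumptions made for illustration; token leaves carry their lexeme in `val` and have no rule.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    prod: str                                  # production at this node, "" for a token
    children: list = field(default_factory=list)
    val: object = None                         # lexeme for tokens, attribute otherwise


# One synthesized-attribute rule per production, reading only the children.
rules = {
    "E -> E + T": lambda kids: kids[0].val + kids[2].val,
    "E -> T":     lambda kids: kids[0].val,
    "T -> digit": lambda kids: int(kids[0].val),
}


def postorder(node, trace):
    for child in node.children:
        postorder(child, trace)
    if node.prod:                              # evaluate the attributes at this node
        node.val = rules[node.prod](node.children)
        trace.append(node.prod)


def tok(lexeme):
    return Node("", val=lexeme)


# Parse tree for the input 3+4
tree = Node("E -> E + T", [
    Node("E -> T", [Node("T -> digit", [tok("3")])]),
    tok("+"),
    Node("T -> digit", [tok("4")]),
])
trace = []
postorder(tree, trace)
```

The recorded `trace` lists productions in the order a bottom-up parser would reduce them, which is why the same rules can run during LR parsing without building the tree.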

LL parsers and actions
How does an LL parser handle (i.e., execute) actions?
Expand productions before scanning RHS symbols, so:
  push actions onto parse stack like other grammar symbols
  pop and perform action when it comes to top of parse stack

LL parsers and actions
    push EOF
    push Start Symbol
    token ← next_token()
    repeat
        pop X
        if X is a terminal or EOF then
            if X = token then
                token ← next_token()
            else error()
        else if X is an action
            perform X
        else /* X is a non-terminal */
            if M[X,token] = X → Y1 Y2 ··· Yk then
                push Yk, Yk−1, ···, Y1
            else error()
    until X = EOF
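
The table-driven LL algorithm with action symbols can be made concrete with a small Python sketch. The grammar (E → T R, R → + T {print '+'} R | - T {print '-'} R | ε, T → digit {print digit}), the parse table, and all names are assumptions for illustration; action symbols are represented as Python callables pushed on the parse stack next to grammar symbols.

```python
EOF = "$"


def emit_last(state):
    state["out"].append(state["last"])     # print the digit just matched


def emit_plus(state):
    state["out"].append("+")


def emit_minus(state):
    state["out"].append("-")


# M[non-terminal][lookahead] -> RHS; callables inside a RHS are action symbols.
TABLE = {
    "E": {"digit": ["T", "R"]},
    "R": {"+": ["+", "T", emit_plus, "R"],
          "-": ["-", "T", emit_minus, "R"],
          EOF: []},
    "T": {"digit": ["digit", emit_last]},
}


def kind(tok):
    return "digit" if tok.isdigit() else tok


def parse(text):
    tokens = list(text) + [EOF]
    pos = 0
    state = {"out": [], "last": None}
    stack = [EOF, "E"]                     # push EOF, push start symbol
    while stack:
        X = stack.pop()
        tok = tokens[pos]
        if callable(X):                    # action: perform it when popped
            X(state)
        elif X in TABLE:                   # non-terminal: expand via the table
            rhs = TABLE[X].get(kind(tok))
            if rhs is None:
                raise SyntaxError(f"unexpected {tok!r} at position {pos}")
            stack.extend(reversed(rhs))    # push Yk, ..., Y1
        else:                              # terminal or EOF: must match lookahead
            if X != kind(tok):
                raise SyntaxError(f"expected {X!r} at position {pos}")
            state["last"] = tok
            if X != EOF:
                pos += 1
    return "".join(state["out"])
```

Because each action is pushed in its position within the RHS, it fires exactly when the symbols to its left have been matched; `parse("9-5+2")` produces the postfix string 95-2+.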

LR parsers and action symbols
What about LR parsers?
Scan entire RHS before applying production, so:
  cannot perform actions until entire RHS scanned
  can only place actions at very end of RHS of production
  introduce new marker non-terminals and corresponding productions to get around this restriction†
    A → w action β
  becomes
    A → M β
    M → w action
† yacc, bison, CUP do this automatically

Action-controlled semantic stacks
Approach:
  stack is managed explicitly by action routines
  actions take arguments from top of stack
  actions place results back on stack
Advantages:
  actions can directly access entries in stack without popping (efficient)
Disadvantages:
  implementation is exposed
  action routines must include explicit code to manage stack (or use stack abstract data type)
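
An action-controlled semantic stack can be illustrated with a few Python action routines that pop their arguments and push their results themselves. The routine names and the driver `run_actions` are hypothetical; the driver simply replays, as a postfix string, the order in which a parser would fire the actions for single-digit operands.

```python
def act_number(stack, lexeme):
    """Action for an operand: push its value."""
    stack.append(int(lexeme))


def act_binary(stack, op):
    """Action for E -> E op T: take two arguments from the top, push the result."""
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right if op == "+" else left - right)


def run_actions(postfix):
    """Fire the actions in the order given by a postfix string such as 95-2+."""
    stack = []
    for sym in postfix:
        if sym.isdigit():
            act_number(stack, sym)
        elif sym in ("+", "-"):
            act_binary(stack, sym)
        else:
            raise ValueError(f"unknown symbol {sym!r}")
    return stack
```

The disadvantage named on the slide is visible here: every routine contains explicit push and pop code, and a routine that pops in the wrong order silently computes the wrong value.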

LR parser-controlled semantic stacks
Idea: let parser manage the semantic stack
LR parser-controlled semantic stacks:
  parse stack contains already parsed symbols
  maintain semantic values in parallel with their symbols
  add space in parse stack or parallel stack for semantic values
  every matched grammar symbol has semantic value
  pop semantic values along with symbols
⇒ LR parsers have a very nice fit with semantic processing

LL parser-controlled semantic stacks
Problems:
  parse stack contains predicted symbols, not yet matched
  often need semantic value after its corresponding symbol is popped
Solution:
  use separate semantic stack
  push entries on semantic stack along with their symbols
  on completion of production, pop its RHS's semantic values
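
The LR idea of keeping semantic values in parallel with parsed symbols can be sketched in Python. This is not a table-driven LR parser: for the assumed grammar E → E + T | E - T | T, T → digit, hand-coded handle detection stands in for the LR tables, and the names `reduce_handle` and `parse` are hypothetical. What it does show is that every reduction pops the RHS symbols together with their values and pushes the LHS with its computed value.

```python
def reduce_handle(symbols, values, lhs, rhs_len, rule):
    """Pop rhs_len symbols and their values together; push lhs with its value."""
    args = values[len(values) - rhs_len:]
    del symbols[len(symbols) - rhs_len:]
    del values[len(values) - rhs_len:]
    symbols.append(lhs)
    values.append(rule(args))


def parse(text):
    symbols, values = [], []                   # parallel stacks
    for tok in text:
        if tok.isdigit():
            symbols.append("digit")            # shift the token with its value
            values.append(int(tok))
            reduce_handle(symbols, values, "T", 1, lambda a: a[0])   # T -> digit
            if symbols[-3:] == ["E", "+", "T"]:
                reduce_handle(symbols, values, "E", 3, lambda a: a[0] + a[2])
            elif symbols[-3:] == ["E", "-", "T"]:
                reduce_handle(symbols, values, "E", 3, lambda a: a[0] - a[2])
            else:
                reduce_handle(symbols, values, "E", 1, lambda a: a[0])   # E -> T
        elif tok in ("+", "-"):
            symbols.append(tok)                # operators carry no useful value
            values.append(None)
        else:
            raise SyntaxError(f"unexpected {tok!r}")
    if symbols != ["E"]:
        raise SyntaxError("input is not a complete expression")
    return values[0]
```

Semantic rules never touch the stacks directly; they only receive the values of the RHS, which is the sense in which the parser, not the action routines, controls the semantic stack.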
Attribute Grammars
Advantages
clean formalism
automatic generation of evaluator
high-level specification
Disadvantages
evaluation strategy determines efficiency
increased space requirements
parse tree evaluators need dependency graph
results distributed over tree
circularity testing
Intel’s 80286 Pascal compiler used an attribute grammar evaluator to
perform context-sensitive analysis.
Historically, attribute grammar evaluators have been deemed too large
and expensive for commercial-quality compilers.