BottomUpParsing ShiftReduceParsing
Bottom-Up Parsing
Bottom-up parsing is more general than
(deterministic) top-down parsing,
but just as efficient.
It builds on ideas from top-down parsing.
Bottom-up parsing is the preferred method in practice.
Bottom-Up Parsing
Bottom-up parsers don’t need left-factored
grammars
Revert to the “natural” grammar for our
example:
E → T + E | T
T → int * T | int | (E)
Consider the string: int * int + int
Bottom-Up Parsing
Bottom-Up parsing reduces a string to the
start symbol by inverting productions.
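As a minimal sketch of what "inverting productions" means, the Python below replays one sequence of reductions for int * int + int under the example grammar. The helper name reduce_at and the (position, lhs, rhs) encoding of each step are illustrative assumptions, not something the slides define.

```python
def reduce_at(form, pos, lhs, rhs):
    """Replace the occurrence of rhs at index pos in form by lhs (an inverted production)."""
    if form[pos:pos + len(rhs)] != rhs:
        raise ValueError(f"{rhs} does not occur at position {pos} in {form}")
    return form[:pos] + [lhs] + form[pos + len(rhs):]


# One reduction per line: (position, lhs, rhs), applied to int * int + int.
steps = [
    (2, "T", ["int"]),            # int * int + int  ->  int * T + int
    (0, "T", ["int", "*", "T"]),  # int * T + int    ->  T + int
    (2, "T", ["int"]),            # T + int          ->  T + T
    (2, "E", ["T"]),              # T + T            ->  T + E
    (0, "E", ["T", "+", "E"]),    # T + E            ->  E
]

form = "int * int + int".split()
trace = [form]
for pos, lhs, rhs in steps:
    form = reduce_at(form, pos, lhs, rhs)
    trace.append(form)

for sentential_form in trace:
    print(" ".join(sentential_form))
```

The last line printed is the start symbol E, so the input has been reduced successfully.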
Bottom-Up Parsing
Note that the productions, read backwards,
trace a right-most derivation
Bottom-Up Parsing
Fact 1: A bottom-up parser traces a right-
most derivation in reverse.
Shift-Reduce Parsing
The fact that a bottom-up parser traces a
right-most derivation in reverse has an
important consequence.
Let αβω be a step of a bottom-up parse.
Assume that the next reduction is by X → β.
Then ω is a string of terminals.
Why? Because αXω → αβω is a step in a
right-most derivation: X is the right-most
non-terminal of αXω, so everything to its
right (ω) consists of terminals.
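The claim can be checked mechanically on the running example. The sketch below takes the reductions of int * int + int, reverses them, and verifies that each step expands the right-most non-terminal; it then confirms that ω contains only terminals. The function name rightmost_step and the list encoding of sentential forms are assumptions made for this illustration.

```python
NONTERMINALS = {"E", "T"}
PRODUCTIONS = {
    ("E", ("T", "+", "E")), ("E", ("T",)),
    ("T", ("int", "*", "T")), ("T", ("int",)), ("T", ("(", "E", ")")),
}


def rightmost_step(before, after):
    """Return (alpha, X, beta, omega) if `after` is obtained from `before` by
    expanding its right-most non-terminal X with a production X -> beta, else None."""
    positions = [i for i, sym in enumerate(before) if sym in NONTERMINALS]
    if not positions:
        return None
    i = positions[-1]
    alpha, omega = before[:i], before[i + 1:]
    end = len(after) - len(omega)
    if end < i or after[:i] != alpha or after[end:] != omega:
        return None
    beta = after[i:end]
    if (before[i], tuple(beta)) not in PRODUCTIONS:
        return None
    return alpha, before[i], beta, omega


# Bottom-up reductions of int * int + int, from the input down to E.
reductions = ["int * int + int", "int * T + int", "T + int", "T + T", "T + E", "E"]

# Read backwards, they should form a right-most derivation starting at E.
derivation = [form.split() for form in reversed(reductions)]
steps = [rightmost_step(a, b) for a, b in zip(derivation, derivation[1:])]

for alpha, x, beta, omega in steps:
    print(f"alpha={alpha} X={x} beta={beta} omega={omega}")
```

Every step is accepted, and in every step ω is free of non-terminals, which is exactly what lets the parser treat ω as unread input.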
Shift-Reduce Parsing
Idea: Split the string into two substrings
The right substring is as yet unexamined by the parser (a string of terminals)
The left substring has terminals and non-terminals
The dividing point is marked by a |
Shift-Reduce Parsing
Bottom-up parsing uses only two kinds
of actions
Shift (moves | one place to the right)
• ABC|xyz ➔ ABCx|yz
Reduce (applies an inverse production at the right
end of the left string)
• If A → xy is a production, then
• CBxy|ijk ➔ CBA|ijk
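The two actions can be sketched as functions on a configuration (left, right), where the | sits between the two lists. The names shift and reduce_by are hypothetical and chosen only for this example.

```python
def shift(left, right):
    """Move | one place to the right: the next input symbol joins the left string."""
    if not right:
        raise ValueError("no input left to shift")
    return left + right[:1], right[1:]


def reduce_by(left, right, lhs, rhs):
    """Apply the inverse of lhs -> rhs at the right end of the left string."""
    rhs = list(rhs)
    cut = len(left) - len(rhs)
    if cut < 0 or left[cut:] != rhs:
        raise ValueError(f"{rhs} is not at the right end of {left}")
    return left[:cut] + [lhs], right


# ABC|xyz  =>  ABCx|yz
after_shift = shift(list("ABC"), list("xyz"))
print(after_shift)

# With the production A -> xy:  CBxy|ijk  =>  CBA|ijk
after_reduce = reduce_by(list("CBxy"), list("ijk"), "A", "xy")
print(after_reduce)
```

Note that reduce_by never touches the right string, and shift never rewrites the left string; the | only ever moves to the right.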
Shift-Reduce Parsing
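|int * int + int      shift
int | * int + int     shift
int * | int + int     shift
int * int | + int     reduce T → int
int * T | + int       reduce T → int * T
T | + int             shift
T + | int             shift
T + int |             reduce T → int
T + T |               reduce E → T
T + E |               reduce E → T + E
E |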
Shift-Reduce Parsing
Left string can be implemented by a stack
Top of the stack is |
Shift pushes a terminal on the stack
Reduce
Pops symbols off the stack (production rhs)
Pushes a non-terminal on the stack (production
lhs)
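A minimal sketch of the stack view: the function below replays a fixed script of shift and reduce actions for int * int + int on an explicit Python list used as the stack. The action encoding ("shift" or an (lhs, rhs) pair) and the name run are assumptions for this illustration; choosing the actions automatically is a separate problem that this sketch does not address.

```python
def run(tokens, actions):
    """Replay shift/reduce actions; return the (stack, remaining input) after each one."""
    stack, pos, trace = [], 0, []
    for action in actions:
        if action == "shift":
            stack.append(tokens[pos])      # shift pushes the next terminal
            pos += 1
        else:
            lhs, rhs = action
            cut = len(stack) - len(rhs)
            if cut < 0 or stack[cut:] != rhs:
                raise ValueError(f"{rhs} is not on top of the stack {stack}")
            del stack[cut:]                # reduce pops the production rhs
            stack.append(lhs)              # and pushes the production lhs
        trace.append((list(stack), tokens[pos:]))
    return trace


tokens = "int * int + int".split()
actions = [
    "shift", "shift", "shift",
    ("T", ["int"]),
    ("T", ["int", "*", "T"]),
    "shift", "shift",
    ("T", ["int"]),
    ("E", ["T"]),
    ("E", ["T", "+", "E"]),
]

trace = run(tokens, actions)
for stack, rest in trace:
    print(" ".join(stack), "|", " ".join(rest))
```

The parse succeeds when the stack holds only the start symbol E and no input remains.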
Shift-Reduce Parsing
In a given state, more than one action (shift or
reduce) may lead to a valid parse
If it is legal to shift or reduce, there is a
shift-reduce conflict.
Not good, but usually easier to remove
If it is legal to reduce by two different
productions, there is a reduce-reduce
conflict.
Bad: usually indicates a problem with the grammar
Handles
Recall: the left string can be implemented by a stack
Top of the stack is the |
Shift pushes a terminal on the stack
Reduce
Pops 0 or more symbols off the stack
• production rhs
Pushes a non-terminal on the stack
• production lhs
Handles
How do we decide when to shift or reduce?
Example grammar:
E → T + E | T
T → int * T | int | (E)
Consider the step int | * int + int
We could reduce by T → int, giving T | * int + int
A fatal mistake!
• No way to reduce to the start symbol E
• No production has a right-hand side containing T *
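To see that the early reduction really is fatal, one can search exhaustively over every possible sequence of reductions. The sketch below does this by brute force for the example grammar; the function name reducible_to_start is an assumption, and the search is exponential in general, so it is only an illustration for tiny inputs, not a parsing method.

```python
from functools import lru_cache

START = "E"
PRODUCTIONS = [
    ("E", ("T", "+", "E")),
    ("E", ("T",)),
    ("T", ("int", "*", "T")),
    ("T", ("int",)),
    ("T", ("(", "E", ")")),
]


@lru_cache(maxsize=None)
def reducible_to_start(form):
    """True if some sequence of reductions turns the tuple `form` into (START,)."""
    if form == (START,):
        return True
    for lhs, rhs in PRODUCTIONS:
        n = len(rhs)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == rhs:
                reduced = form[:i] + (lhs,) + form[i + n:]
                if reducible_to_start(reduced):
                    return True
    return False


good = ("int", "*", "T", "+", "int")   # after reducing the second int
bad = ("T", "*", "int", "+", "int")    # after reducing the first int too early

print(reducible_to_start(good))
print(reducible_to_start(bad))
```

The search terminates for this grammar because no reduction lengthens the string and the length-preserving ones (int to T, T to E) cannot cycle.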
Handles
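Intuition: Want to reduce only if the result
can still be reduced to the start symbol
Assume a right-most derivation
S →* αXω → αβω
Then the reduction by X → β in the position after α
is a handle of αβω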
Handles
Handles formalize the intuition
A handle is a reduction that allows further
reductions back to the start symbol.
Handles
Important Fact #2 about bottom-up parsing:
In shift-reduce parsing, handles appear only at the
top of the stack, never inside it.
Handles
Informal induction on the number of reduce moves:
True initially: the stack is empty
Immediately after reducing a handle:
The right-most non-terminal is on top of the stack
The next handle must be to the right of the right-most
non-terminal, because this is a right-most
derivation
A sequence of shift moves reaches the next handle
Handles
In shift-reduce parsing, handles always
appear at the top of the stack.
Recognizing Handles
Bad news
No efficient algorithm is known for recognizing
handles
Good news
There are good heuristics for guessing handles
On some CFGs, the heuristics always guess correctly
Recognizing Handles
All CFGs ⊃ Unambiguous CFGs ⊃ LR(k) CFGs ⊃ LALR(k) CFGs ⊃ SLR(k) CFGs
(each class contains the next)
Recognizing Handles
A string α is a viable prefix if there is an ω such
that α|ω is a state of a shift-reduce parse.
What does this tell us?
A viable prefix does not extend past the right end
of the handle
It’s a viable prefix because it is a prefix of the
handle.
As long as a parser has viable prefixes on the
stack, no parsing error has been encountered.
Recognizing Handles
Important Fact #3 about bottom-up parsing
(non-obvious):
For any grammar, the set of viable prefixes is a
regular language.
We show how to compute an automaton that
accepts viable prefixes.
Recognizing Handles
An item is a production with a “.” somewhere
on the rhs.
The items for T → (E) are:
T → .(E)
T → (.E)
T → (E.)
T → (E).
The only item for X → ε is X → .
Items are often called “LR(0) items”.
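Enumerating the items of a production is mechanical: place the dot at each position from 0 to the length of the rhs. A minimal sketch, assuming an item is encoded as a (lhs, rhs, dot) triple; the names items and show are illustrative.

```python
def items(lhs, rhs):
    """All LR(0) items of the production lhs -> rhs, as (lhs, rhs, dot) triples."""
    rhs = tuple(rhs)
    return [(lhs, rhs, dot) for dot in range(len(rhs) + 1)]


def show(item):
    """Render an item with a '.' marking the dot position."""
    lhs, rhs, dot = item
    return f"{lhs} → {''.join(rhs[:dot])}.{''.join(rhs[dot:])}"


for item in items("T", ["(", "E", ")"]):
    print(show(item))

# An epsilon production has an empty rhs and therefore exactly one item.
print(show(items("X", [])[0]))
```

A production whose rhs has n symbols always yields n + 1 items.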
Recognizing Handles
Consider the input (int)
E → T + E | T
T → int * T | int | (E)
Then (E|) is a state of the shift-reduce parse
“(E” is a prefix of the rhs of T → (E)
• It will be reduced after the next shift
Item T → (E.) says that so far we have seen “(E” of
this production and hope to see “)”.
Recognizing Handles
The stack may have many prefixes of rhs’s:
Prefix_1 Prefix_2 … Prefix_{n-1} Prefix_n
That is, the stack is a stack of prefixes of rhs’s
Let Prefix_i be a prefix of the rhs of X_i → α_i
Prefix_i will eventually reduce to X_i
The missing part of α_{i-1} starts with X_i
i.e. there is an X_{i-1} → Prefix_{i-1} X_i β for some β
Recursively, Prefix_{k+1} … Prefix_n eventually
reduces to the missing part of α_k
Recognizing Handles
Consider the string (int * int):
(int * | int) is a state of the shift-reduce parse.
“(” is a prefix of the rhs of T → (E)
“ε” is a prefix of the rhs of E → T
“int *” is a prefix of the rhs of T → int * T
Recognizing Handles
The stack of items
T → (.E)
E → .T
T → int * .T
says:
We have seen “(” of T → (E)
We have seen “ε” of E → T
We have seen “int *” of T → int * T
Recognizing Handles
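To recognize viable prefixes, we must recognize a
sequence of partial rhs’s of productions, where each
partial rhs can eventually reduce to part of the
missing suffix of its predecessor.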
The start state of the automaton is S’ → .S, where
S’ → S is a dummy production added to the grammar.
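One common way to obtain a deterministic automaton over items is the closure/goto formulation sketched below. This is an assumption about presentation: it is the textbook construction, applied here to the example grammar with the dummy production S' → E, and all function names are illustrative. Every state is treated as accepting, so a string is a viable prefix exactly when the automaton does not get stuck on it.

```python
GRAMMAR = [
    ("S'", ("E",)),                 # dummy start production
    ("E", ("T", "+", "E")),
    ("E", ("T",)),
    ("T", ("int", "*", "T")),
    ("T", ("int",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add X -> .gamma for every non-terminal X that appears right after a dot."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs, gamma in GRAMMAR:
                item = (lhs, gamma, 0)
                if lhs == rhs[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item of `state` that allows it."""
    moved = {(l, r, d + 1) for l, r, d in state if d < len(r) and r[d] == symbol}
    return closure(moved)


def build_dfa():
    """Explore all item sets reachable from the start state S' -> .E."""
    start = closure({("S'", ("E",), 0)})
    states, transitions, work = {start}, {}, [start]
    while work:
        state = work.pop()
        for symbol in {r[d] for _, r, d in state if d < len(r)}:
            target = goto(state, symbol)
            transitions[(state, symbol)] = target
            if target not in states:
                states.add(target)
                work.append(target)
    return start, states, transitions


START, STATES, TRANSITIONS = build_dfa()


def is_viable_prefix(symbols):
    """Run the DFA on the stack contents; getting stuck means not viable."""
    state = START
    for symbol in symbols:
        state = TRANSITIONS.get((state, symbol))
        if state is None:
            return False
    return True


print(len(STATES))
print(is_viable_prefix(["(", "int", "*"]))
print(is_viable_prefix(["T", "*"]))
```

The stack contents "( int *" from the earlier example are accepted, while "T *" is rejected, matching the observation that reducing the first int too early leaves the parser with no way forward.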
Valid Items
The states of the DFA are called “canonical
collections of items” or “canonical collections
of LR(0) items”.
The Dragon Book gives another way of defining LR(0)
items.
Item X → β.γ is valid for a viable prefix
αβ if
S’ →* αXω → αβγω (by a right-most
derivation)
After parsing αβ, the valid items are the
possible tops of the stack of items.
SLR Parsing
LR(0) Parsing: Assume
Stack contains α
Next input is t
DFA on input α terminates in state s
Reduce by X → β if
s contains item X → β.
Shift if
s contains item X → β.tω
Equivalent to saying s has a transition
labeled t
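The LR(0) rules above can be sketched as a function from a DFA state and the next input to the set of permitted actions. The state used here, containing E → T.+E and E → T., is the one reached after reading T in the example grammar E → T + E | T; the (lhs, rhs, dot) encoding and the name lr0_actions are assumptions for this illustration.

```python
def lr0_actions(state, t):
    """Actions allowed by the LR(0) rules in `state` when the next input is t."""
    actions = set()
    for lhs, rhs, dot in state:
        if dot == len(rhs):
            actions.add(("reduce", lhs, rhs))   # complete item X -> beta.
        elif rhs[dot] == t:
            actions.add(("shift", t))           # item X -> beta.t omega
    return actions


# State reached after reading T:  E -> T.+E   and   E -> T.
state_after_T = {
    ("E", ("T", "+", "E"), 1),
    ("E", ("T",), 1),
}

on_plus = lr0_actions(state_after_T, "+")
on_rparen = lr0_actions(state_after_T, ")")
print(on_plus)
print(on_rparen)
```

With next input + the function returns both a shift and a reduce, so under the LR(0) rules this state has a shift-reduce conflict.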
SLR Parsing
SLR = Simple LR
SLR improves on LR(0) shift/reduce
heuristics
Fewer states have conflicts
SLR Parsing
Idea: Assume
Stack contains α
Next input is t
DFA on input α terminates in state s
Reduce by X → β if
s contains item X → β. and
t ∈ Follow(X)
Shift if
s contains item X → β.tω
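A minimal sketch of the SLR rules for the grammar E → T + E | T, T → int * T | int | (E). The Follow sets are worked out by hand for this grammar (with $ as an assumed end-of-input marker) rather than computed, and the name slr_actions and the item encoding are illustrative.

```python
# Follow sets for the example grammar, derived by hand; "$" marks end of input.
FOLLOW = {
    "E": {"$", ")"},
    "T": {"+", "$", ")"},
}


def slr_actions(state, t):
    """Actions allowed by the SLR rules in `state` when the next input is t."""
    actions = set()
    for lhs, rhs, dot in state:
        if dot == len(rhs):
            if t in FOLLOW[lhs]:                # reduce only if t can follow X
                actions.add(("reduce", lhs, rhs))
        elif rhs[dot] == t:
            actions.add(("shift", t))
    return actions


# State after reading T:    E -> T.+E   and   E -> T.
state_after_T = {("E", ("T", "+", "E"), 1), ("E", ("T",), 1)}
# State after reading int:  T -> int.*T and   T -> int.
state_after_int = {("T", ("int", "*", "T"), 1), ("T", ("int",), 1)}

print(slr_actions(state_after_T, "+"))
print(slr_actions(state_after_T, "$"))
print(slr_actions(state_after_int, "*"))
print(slr_actions(state_after_int, "+"))
```

Because + is not in Follow(E) and * is not in Follow(T), each state now permits at most one action per input symbol, so the conflicts that arise without the Follow check disappear.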
SLR Parsing
If there are conflicts under these rules
then the grammar is not SLR
The rules amount to a heuristic for
detecting handles
The SLR grammars are those where the
heuristics detect exactly the handles
SLR Parsing
A lot of grammars are not SLR
Including all ambiguous grammars
More grammars can be parsed using
precedence declarations
Instructions for resolving conflicts
SLR Parsing
Consider an ambiguous grammar
E → E + E | E * E | (E) | int
The DFA for this grammar contains a state with the
following items:
E → E * E.
E → E. + E
With next input +, this is a shift/reduce conflict.
Declaring “* has higher precedence than +” resolves
this conflict in favor of reducing.
The term “precedence declaration” is misleading.
These declarations don’t define a precedence relation;
rather, they define conflict resolutions.
That is not quite the same thing.
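The sketch below shows one way such a declaration can act purely as a conflict-resolution rule, in the style of yacc-like tools: compare the last operator in the completed production with the next input. The numeric precedence table, the left-associativity table used to break ties, and the name resolve are all assumptions for this illustration; the slides only state that * is declared to bind tighter than +.

```python
# Assumed declarations: a larger number means higher precedence,
# and both operators are taken to be left-associative.
PRECEDENCE = {"+": 1, "*": 2}
ASSOCIATIVITY = {"+": "left", "*": "left"}


def resolve(rhs, lookahead):
    """Pick 'shift' or 'reduce' for a complete item X -> rhs. facing an operator."""
    operators = [sym for sym in rhs if sym in PRECEDENCE]
    rule_op = operators[-1]                     # last operator in the production
    if PRECEDENCE[rule_op] > PRECEDENCE[lookahead]:
        return "reduce"
    if PRECEDENCE[rule_op] < PRECEDENCE[lookahead]:
        return "shift"
    # Equal precedence: fall back on associativity.
    return "reduce" if ASSOCIATIVITY[rule_op] == "left" else "shift"


# Items E -> E * E.  and  E -> E. + E  with next input +
print(resolve(("E", "*", "E"), "+"))
# Items E -> E + E.  and  E -> E. * E  with next input *
print(resolve(("E", "+", "E"), "*"))
```

The function never consults a grammar-level notion of precedence; it only decides which of two legal actions wins in a conflicting state.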
SLR Parsing
If there is a conflict in the last step then the
grammar is not SLR(k)
k is the amount of lookahead.
In practice, k = 1