Syntax Analyzer 2-up to LR(0)
• Constructs a parse tree for an input string, beginning at the leaves (the
bottom) and working towards the root (the top)
• Example: id*id
Grammar:
E -> E + T | T
T -> T * F | F
F -> (E) | id
[Figure: parse-tree snapshots for the bottom-up parse of id*id, one per step: id*id, F*id, T*id, T*F, T, E]
Bottom Up Parsers
Shift-reduce parser
• The reductions trace this rightmost derivation in reverse: E=>T=>T*F=>T*id=>F*id=>id*id
Shift-reduce parser
• Shift: move the current input symbol from the input buffer onto the
stack (push the current input symbol onto the stack).
Parse the input string id – id x id using a shift-reduce parser.
Grammar: E → E – E | E x E | id
Stack          Input Buffer      Parsing Action
$              id – id x id $    Shift
$ id           – id x id $       Reduce E → id
$ E            – id x id $       Shift
$ E –          id x id $         Shift
$ E – id       x id $            Reduce E → id
$ E – E        x id $            Shift
$ E – E x      id $              Shift
$ E – E x id   $                 Reduce E → id
$ E – E x E    $                 Reduce E → E x E
$ E – E        $                 Reduce E → E – E
$ E            $                 Accept
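• Handle: a substring of a right sentential form that matches the
body of a production (Handle = RHS of production) and whose reduction
is one step in the reverse of a rightmost derivation.
• A handle is identified by three things: the right sentential form, the
position in it where the reduction can be performed, and the production
used for the reduction.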
Handle pruning
• Basic operations:
• Shift
• Reduce
• Accept
• Error
• Example: id*id
Stack     Input     Action
$         id*id$    shift
$id       *id$      reduce by F->id
$F        *id$      reduce by T->F
$T        *id$      shift
$T*       id$       shift
$T*id     $         reduce by F->id
$T*F      $         reduce by T->T*F
$T        $         reduce by E->T
$E        $         accept
Handle will appear on top of the stack
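[Figure: two parse trees rooted at S showing the two possible positions of successive handles; the first has leaves α β γ y z, the second has leaves α γ x y z]
Case 1                   Case 2
Stack     Input          Stack     Input
$αβγ      yz$            $αγ       xyz$
$αβB      yz$            $αBxy     z$
$αβBy     z$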
Conflicts during shift reduce parsing
Stack                      Input
… if expr then stmt        else …$
Conflicts during shift reduce parsing
• There are two kinds of conflicts that can occur in an SLR(1) parsing table.
A shift-reduce conflict occurs in a state that requests both a shift action and
a reduce action. A reduce-reduce conflict occurs in a state that requests two
or more different reduce actions.
How to determine?
• A full parsing table is not needed, only the canonical collection. In the
canonical collection, find all states (item sets) that contain a final item,
i.e. an item with the dot at the end, and see if:
• The state contains both a final item and an item with the dot before a
terminal ("shift-reduce", s/r)
• The state contains two or more different final items ("reduce-reduce", r/r)
If neither is true for any state, there are no conflicts, even in LR(0). If
some state has one of the above, SLR(1) may still resolve it.
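The check described above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: an item is written as a `(head, body, dot)` triple, and the name `lr0_conflicts` and the two sample states are hypothetical.

```python
def lr0_conflicts(state, terminals):
    """Return the kinds of LR(0) conflict present in one item set."""
    # Final items: the dot has reached the end of the body (a reduce is requested).
    final_items = [(head, body) for head, body, dot in state if dot == len(body)]
    # A shift is requested when some item has the dot in front of a terminal.
    shifts = any(dot < len(body) and body[dot] in terminals
                 for head, body, dot in state)
    kinds = []
    if final_items and shifts:
        kinds.append("shift-reduce")
    if len(final_items) > 1:
        kinds.append("reduce-reduce")
    return kinds


TERMINALS = {"+", "*", "(", ")", "id"}
# {E -> T. , T -> T.*F}: one final item plus a possible shift on '*'
state_t = {("E", ("T",), 1), ("T", ("T", "*", "F"), 1)}
# {T -> F.}: a single final item, no conflict
state_f = {("T", ("F",), 1)}

print(lr0_conflicts(state_t, TERMINALS))  # ['shift-reduce']
print(lr0_conflicts(state_f, TERMINALS))  # []
```

The first state is the usual example of a conflict that SLR(1) resolves: the reduce by E -> T is only entered for symbols in FOLLOW(E), and * is not one of them.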
Shift reduce conflict and Reduce/reduce conflict
1. LR(0) Parser
2. Simple LR-Parser (SLR)
3. Canonical LR Parser (CLR)
4. Look-Ahead LR Parser (LALR)
Comparison of LL & LR Methods
LL(1)
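[Figure: model of an LR parser. The scanner supplies the input a1 a2 … ai … an $ to the LR parsing engine, which keeps a stack s0 … Xm-1 sm-1 Xm sm and consults the LR parsing tables (Action and Goto) that a parser generator builds from the grammar.]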
Bottom-Up Parsing Algorithms
LR(k) parsing
L: scan input Left to right
R: produce Rightmost derivation
k tokens of lookahead
LR(0)
zero tokens of look-ahead
SLR
Simple LR: like LR(0), but uses FOLLOW sets to build
more “precise” parsing tables
LR(0) is a toy, so we focus on SLR
LR Family:
LR(0)<= SLR(1)<=LALR(1)<=CLR(1)
• Shift: move the current input symbol from the input buffer onto the stack.
• Reduce: replace the handle on top of the stack with the head (LHS) of the
matching production.
• Accept: the stack contains only the start symbol and the input buffer is
empty at the same time.
• Error: the parser can neither shift nor reduce, and it cannot perform the
accept action either.
LR(0) steps:
• Right sentential form
For a CFG G with start symbol S, any string α of terminals and non terminals
with S =>* α is a sentential form. A right sentential form is a sentential
form that can be derived by a rightmost derivation.
• LR(0) items
An LR(0) item is a production with a dot at some position in its right side,
including the beginning or the end. For example, the production A-> XYZ
yields four items:
• A-> •XYZ
• A-> X•YZ
• A-> XY•Z
• A-> XYZ•
• The production A-> ε generates only one item, A-> •. Likewise, for an
epsilon production B -> ε, B -> • is an item.
• Complete Item : an item whose dot is at the end of the RHS.
• The LR parser consists of
1) Input 2)Output 3)Stack 4) Driver Program 5) Parsing Table
• Only the Parsing Table changes from one parser to the other.
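• In the SLR method the stack holds states from the LR(0)
automaton; the canonical LR and LALR methods are similar.
• The Driver Program uses the Stack to store a string
s0 X1 s1 X2 … Xm sm, where sm is on top, each Xi is a grammar symbol and
each si is a state.
• The Parsing Table consists of an ACTION function and a GOTO function.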
Closure algorithm
SetOfItems CLOSURE(I) {
  J=I;
  repeat
    for (each item A->α.Bβ in J)
      for (each production B->γ of G)
        if (B->.γ is not in J)
          add B->.γ to J;
  until no more items are added to J on one round;
  return J;
}
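As a rough illustration, the closure computation can be written in Python with a worklist instead of the repeat/until loop. The representation is an assumption made for this sketch, not something the slides prescribe: items are `(head, body, dot)` triples and the grammar is a dictionary from non terminals to lists of bodies.

```python
# Assumed representation: each nonterminal maps to a list of production bodies.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    """Return CLOSURE(items) as a frozenset of (head, body, dot) triples."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        # Only items with the dot in front of a nonterminal B add anything.
        if dot < len(body) and body[dot] in grammar:
            for gamma in grammar[body[dot]]:
                new_item = (body[dot], gamma, 0)  # the item B -> .gamma
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


i0 = closure({("E'", ("E",), 0)}, GRAMMAR)
for head, body, dot in sorted(i0):
    print(head, "->", " ".join(body[:dot]) + "." + " ".join(body[dot:]))
```

For the expression grammar this yields the seven items of I0. The worklist visits each item once, which gives the same result as repeating whole rounds until nothing changes.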
GOTO algorithm
SetOfItems GOTO(I,X) {
  J=empty;
  for (each item A->α.X β in I)
    add A-> αX. β to J;
  return CLOSURE(J);
}
The Action Table
• Parsing is completed
• The GOTO table specifies which state to put on top of the stack
after a reduce.
• The GOTO Table is important to find out the next state after every
reduction.
• The GOTO Table is indexed by a state of the parser and a Non Terminal
(Grammar Symbol).
ex : GOTO[s, A], where s is a state and A is a Non Terminal
• The GOTO Table simply indicates the next state of the parser once it
has recognized a certain Non Terminal.
LR(0) Parser
• The LR Parser is a Shift-reduce Parser that makes use of a
Deterministic Finite Automaton, which recognizes the set of all viable
prefixes by reading the stack from bottom to top.
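[Figure: LR(0) automaton for the grammar S → E$, E → T | E+T, T → i | (E). States, with outgoing transitions in brackets:]
S0: S → •E$, E → •T, E → •E+T, T → •i, T → •(E)   [E: S1, T: S6, i: S5, (: S7]
S1: S → E•$, E → E•+T   [+: S3, $: S2]
S2: S → E$•
S3: E → E+•T, T → •i, T → •(E)   [T: S4, i: S5, (: S7]
S4: E → E+T•
S5: T → i•
S6: E → T•
S7: T → (•E), E → •T, E → •E+T, T → •i, T → •(E)   [E: S8, T: S6, i: S5, (: S7]
S8: T → (E•), E → E•+T   [+: S3, ): S9]
S9: T → (E)•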
Construction of LR(0) Items
[Figure: transition diagram over the item sets I0 … I9, with edges labelled by the grammar symbols E, T, i, ( and )]
Construction of LR(0) Parsing Table
         Action Part (Terminals)   Goto Part (Non-Terminals)
State    i    +    (    )    $     E    T      Action
0        5         7               1    6      Shift
1             3              2                 Shift
2                                              Accept (S→E$)
3        5         7                    4      Shift
4                                              Reduce E→E+T
5                                              Reduce T→i
6                                              Reduce E→T
7        5         7               8    6      Shift
8             3         9                      Shift
9                                              Reduce T→(E)
String Acceptance
         Action Part (Terminals)   Goto Part (Non-Terminals)
State    i    +    (    )    $     E    T      Action
0        5         7               1    6      Shift
1             3              2                 Shift
2                                              Accept (S→E$)
3        5         7                    4      Shift
4                                              Reduce E→E+T
5                                              Reduce T→i
6                                              Reduce E→T
7        5         7               8    6      Shift
8             3         9                      Shift
9                                              Reduce T→(E)
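To make string acceptance concrete, here is a small Python sketch of a driver loop over the table above. The dictionary layout and the names `SHIFT`, `GOTO`, `REDUCE`, `ACCEPT_STATE` and `accepts` are assumptions made for illustration; only the state numbers and productions come from the table.

```python
# Shift entries: state -> {terminal: next state}
SHIFT = {
    0: {"i": 5, "(": 7},
    1: {"+": 3, "$": 2},
    3: {"i": 5, "(": 7},
    7: {"i": 5, "(": 7},
    8: {"+": 3, ")": 9},
}
# Goto entries: state -> {nonterminal: next state}
GOTO = {0: {"E": 1, "T": 6}, 3: {"T": 4}, 7: {"E": 8, "T": 6}}
# Reduce states: state -> (head, length of the production body)
REDUCE = {4: ("E", 3), 5: ("T", 1), 6: ("E", 1), 9: ("T", 3)}
ACCEPT_STATE = 2


def accepts(tokens):
    """Run the LR(0) driver; tokens must end with the '$' marker."""
    stack = [0]  # stack of states only; the symbols are implied by the states
    pos = 0
    while True:
        state = stack[-1]
        if state == ACCEPT_STATE:
            return pos == len(tokens)
        if state in REDUCE:
            head, length = REDUCE[state]
            del stack[-length:]  # pop one state per body symbol
            target = GOTO.get(stack[-1], {}).get(head)
            if target is None:
                return False
            stack.append(target)
        else:
            if pos >= len(tokens):
                return False
            target = SHIFT.get(state, {}).get(tokens[pos])
            if target is None:
                return False  # error entry in the table
            stack.append(target)
            pos += 1


print(accepts(list("i+(i)$")))  # True
print(accepts(list("i+$")))     # False
```

Because the table is LR(0), the choice between shift and reduce depends on the state alone; the lookahead is only consulted to pick the shift target.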
• Augmented grammar:
• G with addition of a production: S’->S
• Closure of item sets:
• If I is a set of items, closure(I) is a set of items constructed from I by the
following rules:
• Add every item in I to closure(I)
• If A->α.Bβ is in closure(I) and B->γ is a production then add the item
B->.γ to closure(I). Repeat until no new items can be added.
• Example:
Grammar:
E’->E
E -> E + T | T
T -> T * F | F
F -> (E) | id
I0=closure({[E’->.E]}):
E’->.E
E->.E+T
E->.T
T->.T*F
T->.F
F->.(E)
F->.id
Constructing canonical LR(0) item sets
(cont.)
• Goto (I,X) where I is an item set and X is a grammar symbol is closure of
set of all items [A-> αX. β] where [A-> α.X β] is in I
• Example
I0=closure({[E’->.E]}): E’->.E, E->.E+T, E->.T, T->.T*F, T->.F, F->.(E), F->.id
I1=Goto(I0,E): E’->E., E->E.+T
I2=Goto(I0,T): E->T., T->T.*F
I3=Goto(I0,( ): F->(.E), E->.E+T, E->.T, T->.T*F, T->.F, F->.(E), F->.id
Canonical LR(0) items
Void items(G’) {
C= CLOSURE({[S’->.S]});
repeat
for (each set of items I in C)
for (each grammar symbol X)
if (GOTO(I,X) is not empty and not in C)
add GOTO(I,X) to C;
until no new set of items are added to C on a round;
}
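The items(G') procedure above could be sketched in Python roughly as follows. The `(head, body, dot)` item representation and the function names are assumptions for this sketch. States are numbered in discovery order, so the numbers need not match the I0, I1, … labels used in the slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    # Move the dot over `symbol` in every item that allows it, then close.
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved) if moved else frozenset()


def canonical_collection(start_item):
    start = closure({start_item})
    states, index, transitions = [start], {start: 0}, {}
    i = 0
    while i < len(states):  # states appended below are visited later
        state = states[i]
        symbols = {b[d] for h, b, d in state if d < len(b)}
        for x in sorted(symbols):
            target = goto(state, x)
            if target not in index:
                index[target] = len(states)
                states.append(target)
            transitions[(i, x)] = index[target]
        i += 1
    return states, transitions


states, transitions = canonical_collection(("E'", ("E",), 0))
print(len(states), "item sets,", len(transitions), "transitions")
```

Only symbols that actually follow a dot in the current state are tried, which avoids computing empty GOTO sets for every grammar symbol.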
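Example: LR(0) automaton for the grammar
E’->E
E -> E + T | T
T -> T * F | F
F -> (E) | id
[Figure: LR(0) automaton. Item sets, with outgoing transitions in brackets:]
I0=closure({[E’->.E]}): E’->.E, E->.E+T, E->.T, T->.T*F, T->.F, F->.(E), F->.id   [E: I1, T: I2, F: I3, (: I4, id: I5]
I1: E’->E., E->E.+T   [$: acc, +: I6]
I2: E->T., T->T.*F   [*: I7]
I3: T->F.
I4: F->(.E), E->.E+T, E->.T, T->.T*F, T->.F, F->.(E), F->.id   [E: I8, T: I2, F: I3, (: I4, id: I5]
I5: F->id.
I6: E->E+.T, T->.T*F, T->.F, F->.(E), F->.id   [T: I9, F: I3, (: I4, id: I5]
I7: T->T*.F, F->.(E), F->.id   [F: I10, (: I4, id: I5]
I8: E->E.+T, F->(E.)   [+: I6, ): I11]
I9: E->E+T., T->T.*F   [*: I7]
I10: T->T*F.
I11: F->(E).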
Use of LR(0) automaton
• Example: id*id
[Figure: LR parser reading the INPUT a1 … ai … an $]
Disadvantage:
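String : abcd
[Figure: parse tree with root S and children a, A, d, where A has children b and c]
Precedence relations: $ ⋖ a ⋖ b ≐ c ⋗ d ⋗ $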
Operator precedence parsing
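1. Suppose A-> X1 X2 … Xn is a production. For each position i:
• If Xi and Xi+1 are terminals,
then Xi ≐ Xi+1
• If Xi and Xi+2 are terminals and Xi+1 is a non terminal,
then Xi ≐ Xi+2
• If Xi is a terminal and Xi+1 is a non terminal,
then Xi ⋖ a for every a in Lead(Xi+1)
• If Xi is a non terminal and Xi+1 is a terminal,
then a ⋗ Xi+1 for every a in Trail(Xi)
• $ ⋖ a for every a in Lead(S), and a ⋗ $ for every a in Trail(S)
Here Lead(A) is the set of terminals that can be the first terminal of a
string derived from A (preceded by at most one non terminal), and Trail(A)
is the set of terminals that can be the last terminal of such a string.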
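The Lead and Trail sets used in these rules can be computed by a simple fixed-point iteration. The Python sketch below is one possible way to do it, assuming an operator grammar without ε-productions; the function name `lead_sets` and the dictionary representation of the grammar are hypothetical.

```python
def lead_sets(grammar, trailing=False):
    """Compute Lead(A) for every nonterminal (Trail(A) when trailing=True)."""
    sets = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                # Trail is Lead computed on the reversed production body.
                symbols = body[::-1] if trailing else body
                new = set()
                first = symbols[0]
                if first in grammar:
                    # A -> B ...   : everything in Lead(B) is in Lead(A)
                    new |= sets[first]
                    # A -> B a ... : the terminal right after B also counts
                    if len(symbols) > 1 and symbols[1] not in grammar:
                        new.add(symbols[1])
                else:
                    # A -> a ...   : the terminal itself
                    new.add(first)
                if not new <= sets[head]:
                    sets[head] |= new
                    changed = True
    return sets


GRAMMAR = {"T": [("T", "+", "T"), ("T", "*", "T"), ("id",)]}
print(lead_sets(GRAMMAR))                  # Lead(T) contains +, * and id
print(lead_sets(GRAMMAR, trailing=True))   # Trail(T) contains +, * and id
```

For an ambiguous grammar such as T -> T+T | T*T | id, applying the rules mechanically to these sets gives both + ⋖ + and + ⋗ +, so the relation table is normally fixed by choosing operator precedence and associativity by hand.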
Operator precedence parsing
• Example:
1) T -> T +T | T * T | id
      id   +    *    $
id    -    ⋗    ⋗    ⋗
+     ⋖    ⋗    ⋖    ⋗
*     ⋖    ⋗    ⋗    ⋗
$     ⋖    ⋖    ⋖    -
Operator precedence parsing
1) T -> T +T | T * T | id
Input String: id+id*id $
Operator Precedence Relation: id ⋗ * ⋗ + ⋗ $
      id   +    *    $
id    -    ⋗    ⋗    ⋗
+     ⋖    ⋗    ⋖    ⋗
*     ⋖    ⋗    ⋗    ⋗
$     ⋖    ⋖    ⋖    -
Each step compares the top of the stack with the next input symbol:
Step  Relation   Action        Stack after
1.    $ ⋖ id     push          $ id
2.    id ⋗ +     pop           $
3.    $ ⋖ +      push          $ +
4.    + ⋖ id     push          $ + id
5.    id ⋗ *     pop           $ +
6.    + ⋖ *      push          $ + *
7.    * ⋖ id     push          $ + * id
8.    id ⋗ $     pop           $ + *
9.    * ⋗ $      pop           $ +
10.   + ⋗ $      pop           $
11.   $  $       stop, accept  $
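The push/pop trace above can be reproduced with a short Python sketch. As in the slide, it tracks terminals only, so it does not check that operators actually have operands. The table encoding (`"<"` for ⋖, `">"` for ⋗) and the names `PREC` and `op_precedence_trace` are assumptions made for this example.

```python
# Precedence relations between the terminal on top of the stack (row)
# and the next input terminal (column); missing entries are errors.
PREC = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+": {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*": {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$": {"id": "<", "+": "<", "*": "<"},
}


def op_precedence_trace(tokens):
    """Return the list of (action, stack) steps for a token list without '$'."""
    tokens = list(tokens) + ["$"]  # append the end marker
    stack, pos, trace = ["$"], 0, []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "$" and look == "$":
            trace.append(("accept", " ".join(stack)))
            return trace
        relation = PREC.get(top, {}).get(look)
        if relation == "<":      # top yields precedence: shift the input symbol
            stack.append(look)
            pos += 1
            trace.append(("push", " ".join(stack)))
        elif relation == ">":    # top takes precedence: reduce by popping it
            stack.pop()
            trace.append(("pop", " ".join(stack)))
        else:
            raise SyntaxError(f"no relation between {top!r} and {look!r}")


for step, (action, stack) in enumerate(op_precedence_trace(["id", "+", "id", "*", "id"]), 1):
    print(step, action, stack)
```

A full operator-precedence parser would pop until the symbol below is related by ⋖ to the last symbol popped; with this table there are no ≐ entries, so popping one terminal per step gives the same sequence.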
Operator precedence parsing
• Example:
1) T -> T +T | T * T | id
[Figure: parse tree for the input string, built from T nodes]
Input String: id + id * id $
Operator precedence parsing
• Example:
1) T -> T +T | T * T | id
The graph representing the precedence
functions is-
Precedence relation table, size: N*N
      id   +    *    $
id    -    ⋗    ⋗    ⋗
+     ⋖    ⋗    ⋖    ⋗
*     ⋖    ⋗    ⋗    ⋗
$     ⋖    ⋖    ⋖    -
Operator precedence parsing
• Example:
1) T -> T +T | T * T | id
The graph representing the precedence
functions is-
fid → g* → f+ → g+ → f$ (longest path from fid, length 4)
gid → f* → g* → f+ → g+ → f$ (longest path from gid, length 5)
Operator precedence parsing
• Example:
1) T -> T +T | T * T | id
The graph representing the precedence functions is- (not in closed loop)
Precedence relation table, size: N*N
      id   +    *    $
id    -    ⋗    ⋗    ⋗
+     ⋖    ⋗    ⋖    ⋗
*     ⋖    ⋗    ⋗    ⋗
$     ⋖    ⋖    ⋖    -
Precedence function table, size: 2*N
      id   +    *    $
f     4    2    4    0
g     5    1    3    0
fid < gid …. 4<5
f+ < g* …….2<3
Longest paths in the graph:
fid → g* → f+ → g+ → f$
gid → f* → g* → f+ → g+ → f$
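The f and g values can be derived mechanically as longest path lengths in the precedence graph. The Python sketch below assumes the graph has no cycles and the table has no ≐ entries (both hold for this example); the edge direction follows the usual construction, with an edge f_a → g_b for a ⋗ b and an edge g_b → f_a for a ⋖ b. The function name and table encoding are hypothetical.

```python
from functools import lru_cache

PREC = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+": {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*": {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$": {"id": "<", "+": "<", "*": "<"},
}


def precedence_functions(table):
    """Return (f, g), each mapping a terminal to its longest path length."""
    edges = {}
    for a, row in table.items():
        for b, relation in row.items():
            if relation == ">":    # a ⋗ b : edge f_a -> g_b
                edges.setdefault(("f", a), set()).add(("g", b))
            elif relation == "<":  # a ⋖ b : edge g_b -> f_a
                edges.setdefault(("g", b), set()).add(("f", a))

    @lru_cache(maxsize=None)
    def longest(node):
        # Longest path starting at node; recursion terminates only if acyclic.
        return max((1 + longest(nxt) for nxt in edges.get(node, ())), default=0)

    symbols = set(table) | {b for row in table.values() for b in row}
    f = {a: longest(("f", a)) for a in symbols}
    g = {a: longest(("g", a)) for a in symbols}
    return f, g


f, g = precedence_functions(PREC)
print(f)
print(g)
```

With these functions the parser stores 2*N numbers instead of the N*N relation table and compares f(top of stack) with g(next input), at the cost of no longer being able to distinguish error entries from valid relations.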