Compiler Design(Unit-II)
Basic Parsing
Vibhor Kr. Vishnoi
Assistant Professor
Phases of Compiler
Parser
• Bottom up parsing
• Bottom up parsing is also known as shift-reduce parsing.
• Bottom up parsing is used to construct a parse tree for an input string.
• In bottom up parsing, the parse starts with the input symbols and constructs the parse tree up to the start symbol by tracing out the rightmost derivation of the string in reverse.
Parser
• Bottom up parsing
• Example
• E→T
• T→T*F
• T → id
• T →F
• F → id
Type of Parsing
Bottom-Up Parsing
o Bottom-up parser: constructs a parse tree for an input string, beginning at the leaves (the bottom) and working up towards the root (the top).
o We can think of this process as "reducing" a string w to the start symbol of a grammar.
o Bottom-up parsing is also known as shift-reduce parsing because its two main actions are shift and reduce.
At each shift action, the current symbol in the input string is pushed onto a stack.
At each reduction step, the symbols at the top of the stack (this symbol sequence is the right side of a production) are replaced by the non-terminal on the left side of that production.
Handle
Input string
A Shift-Reduce Parser
Grammar:
E → E+T | T
T → T*F | F
F → (E) | id
Rightmost derivation of id+id*id:
E ⇒ E+T ⇒ E+T*F ⇒ E+T*id ⇒ E+F*id ⇒ E+id*id ⇒ T+id*id ⇒ F+id*id ⇒ id+id*id
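A shift-reduce parser has four possible actions:
1. Shift: Push the next input symbol onto the top of the stack.
2. Reduce: Replace the handle on the top of the stack by the non-terminal.
3. Accept: Announce successful completion of parsing.
4. Error: The parser discovers a syntax error and calls an error recovery routine.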
Shift reduce parsing
• Example:
• Grammar:
• S → S+S
• S → S-S
• S → (S)
• S→a
• Input string:
• a1-(a2+a3)
Shift reduce parsing
Parsing table:
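The rows of such a parsing table (the action taken at each step) can be reproduced with a small simulation. The following Python sketch is only an illustration under stated assumptions: it uses a naive strategy (reduce whenever the top of the stack matches a right-hand side, otherwise shift), and it treats a1, a2 and a3 as the single token a. The names GRAMMAR and shift_reduce are hypothetical. A real shift-reduce parser chooses between shift and reduce with a parsing table, not by greedy matching.

```python
# Grammar: S -> S+S | S-S | (S) | a, written as (left side, right side) pairs.
GRAMMAR = [
    ("S", ("S", "+", "S")),
    ("S", ("S", "-", "S")),
    ("S", ("(", "S", ")")),
    ("S", ("a",)),
]


def shift_reduce(tokens, grammar=GRAMMAR, start="S"):
    """Return the list of actions taken by a naive reduce-first parser."""
    stack, trace, pos = [], [], 0
    while True:
        # Try to reduce: does the top of the stack match some right side?
        for lhs, rhs in grammar:
            if tuple(stack[-len(rhs):]) == rhs:
                del stack[-len(rhs):]
                stack.append(lhs)
                trace.append(f"reduce {lhs} -> {' '.join(rhs)}")
                break
        else:
            # No reduction possible: shift, accept, or report an error.
            if pos < len(tokens):
                stack.append(tokens[pos])
                trace.append(f"shift {tokens[pos]}")
                pos += 1
            elif stack == [start]:
                trace.append("accept")
                return trace
            else:
                trace.append("error")
                return trace


if __name__ == "__main__":
    # a1-(a2+a3) with every a_i mapped to the token "a" (an assumption).
    for step in shift_reduce(["a", "-", "(", "a", "+", "a", ")"]):
        print(step)
```

For the input a1-(a2+a3) the trace ends with the accept action, and an incomplete input such as a+ ends with error.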
Operator precedence parsing
E → E+E | E*E | id
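Precedence relations (row: terminal on top of the stack, column: next input symbol):
       id     +      *      $
id     -      >      >      >
+      <      >      <      >
*      <      >      >      >
$      <      <      <      accept
(Two id symbols can never be adjacent, so id/id has no relation; $ against $ means the input is accepted.)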
Operator precedence parsing
w = id+id*id
Parse tree (figure): E at the root, E + E * E on the next level, id id id at the leaves.
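How the precedence relations drive the parse of w = id+id*id can be sketched in Python. This is a minimal illustration, not a full parser: it tracks terminals only (non-terminals are ignored), and it assumes that the id/id pair has no relation (error) and that $ against $ means accept. The names PREC and op_precedence_parse are hypothetical.

```python
# Precedence relations: PREC[(stack_top, next_input)].
# Missing pairs are treated as syntax errors (an assumption of this sketch).
PREC = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}


def op_precedence_parse(tokens):
    """Return the handles (terminals only) in the order they are reduced."""
    stack = ["$"]
    tokens = list(tokens) + ["$"]
    pos = 0
    handles = []
    while True:
        top, nxt = stack[-1], tokens[pos]
        if top == "$" and nxt == "$":
            return handles                      # accept
        relation = PREC.get((top, nxt))
        if relation == "<":
            stack.append(nxt)                   # shift
            pos += 1
        elif relation == ">":
            # Reduce: pop until the new top yields precedence (<)
            # to the terminal popped most recently.
            handle = []
            while True:
                popped = stack.pop()
                handle.append(popped)
                if PREC.get((stack[-1], popped)) == "<":
                    break
            handles.append(handle[::-1])
        else:
            raise SyntaxError(f"no precedence relation between {top} and {nxt}")


if __name__ == "__main__":
    print(op_precedence_parse(["id", "+", "id", "*", "id"]))
```

In the returned list the handle containing * comes before the one containing +, which reflects that * has higher precedence than + in the table.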
LR Parser
• LR(0) Table
• If a state goes to another state on a terminal, the entry is a shift move.
• If a state goes to another state on a variable (non-terminal), the entry is a goto move.
• If a state contains a final item (an item with • at the right end), write the reduce entry across the entire row of that state.
LR (0) Parsing
• LR(0) Table
Explanation:
I0 on S goes to I1, so write 1.
I0 on A goes to I2, so write 2.
I2 on A goes to I5, so write 5.
I3 on A goes to I6, so write 6.
I0, I2 and I3 on a go to I3, so write S3, which means shift 3.
I0, I2 and I3 on b go to I4, so write S4, which means shift 4.
I4, I5 and I6 contain final items because • is at the rightmost end. So write the reduce entry with the corresponding production number.
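The transitions listed in the explanation can be checked by building the canonical collection of LR(0) items mechanically. The Python sketch below assumes the grammar S → AA, A → aA | b (the grammar used in the CLR(1) example of this unit; the LR(0) slide itself does not state it), augmented with S' → S. The state numbers depend on the order in which symbols are explored, which is chosen here as S, A, a, b. All names are hypothetical.

```python
# Productions as (left side, right side); production 0 is the augmented one.
GRAMMAR = [("S'", ("S",)), ("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]
NONTERMINALS = {"S'", "S", "A"}
SYMBOLS = ["S", "A", "a", "b"]      # exploration order (an assumption)


def closure(items):
    """An item is (production index, dot position)."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for prod, dot in list(items):
            rhs = GRAMMAR[prod][1]
            if dot < len(rhs) and rhs[dot] in NONTERMINALS:
                for q, (lhs, _) in enumerate(GRAMMAR):
                    if lhs == rhs[dot] and (q, 0) not in items:
                        items.add((q, 0))
                        changed = True
    return frozenset(items)


def goto(items, symbol):
    """Move the dot over `symbol` and close the result; None if empty."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else None


def canonical_collection():
    states = [closure({(0, 0)})]    # I0
    transitions = {}                # (state number, symbol) -> state number
    i = 0
    while i < len(states):
        for symbol in SYMBOLS:
            target = goto(states[i], symbol)
            if target is None:
                continue
            if target not in states:
                states.append(target)
            transitions[(i, symbol)] = states.index(target)
        i += 1
    return states, transitions


if __name__ == "__main__":
    states, transitions = canonical_collection()
    for (state, symbol), target in sorted(transitions.items()):
        kind = "goto" if symbol in NONTERMINALS else "shift"
        print(f"I{state} on {symbol}: {kind} {target}")
```

Under these assumptions the collection has seven states, I0 to I6, and the computed transitions agree with the entries described in the explanation.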
SLR (1) Parsing
• SLR (1) refers to simple LR parsing. It works like LR(0) parsing; the only difference is in the parsing table. To construct the SLR (1) parsing table, we use the canonical collection of LR (0) items.
• In SLR (1) parsing, a reduce move is placed only under the terminals in FOLLOW of the production's left-hand side. The steps involved in SLR (1) parsing are:
1. Write a context free grammar for the given input string
2. Check the grammar for ambiguity
3. Add the augmented production to the grammar
4. Create the canonical collection of LR (0) items
5. Draw the deterministic finite automaton (DFA) of the item sets
6. Construct the SLR (1) parsing table
SLR (1) Parsing
• SLR (1) Table Construction
• The steps used to construct the SLR (1) table are given below:
• If a state (Ii) goes to another state (Ij) on a terminal, the entry is a shift move in the action part.
• If a state (Ii) goes to another state (Ij) on a variable, the entry is a goto move in the goto part.
• If a state (Ii) contains a final item such as A → ab•, which has no transition to a next state, the production is known as a reduce production. For every terminal X in FOLLOW (A), write the reduce entry with the production number.
SLR (1) Parsing
• Example
• S→E
E→E+T|T
T→T*F|F
F → id
• Add the augmented production and insert the '•' symbol at the first position of every production in G
• S' → •E
E → •E + T
E → •T
T → •T * F
T → •F
F → •id
SLR (1) Parsing
• I0 = S' → •E
E → •E + T
E → •T
T → •T * F
T → •F
F → •id
SLR (1) Parsing
• I1 = Go to (I0, E) = closure (S' → E•, E → E• + T)
I2 = Go to (I0, T) = closure (E → T•, T → T• * F)
I3 = Go to (I0, F) = closure (T → F•) = T → F•
I4 = Go to (I0, id) = closure (F → id•) = F → id•
• I5 = Go to (I1, +) = closure (E → E + •T) =
E → E + •T
T → •T * F
T → •F
F → •id
• I6 = Go to (I2, *) = closure (T → T * •F) =
T → T * •F
F → •id
• I7 = Go to (I5, T) = closure (E → E + T•, T → T• * F)
I8 = Go to (I6, F) = closure (T → T * F•) = T → T * F•
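To place the reduce entries of the SLR (1) table, FOLLOW of each left-hand side is needed. The Python sketch below computes FOLLOW for this grammar and lists the reduce entries of the states that contain a final item. It is an illustration under assumptions: the production numbering 0 to 5 and the names GRAMMAR, FINAL_ITEMS, follow_sets and reduce_entries are hypothetical, and the FIRST computation is simplified because this grammar has no ε-productions.

```python
GRAMMAR = [
    ("S'", ["E"]),            # 0: augmented production
    ("E", ["E", "+", "T"]),   # 1
    ("E", ["T"]),             # 2
    ("T", ["T", "*", "F"]),   # 3
    ("T", ["F"]),             # 4
    ("F", ["id"]),            # 5
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

# State number -> production whose final item (dot at the end) it contains.
# State 1 holds S' -> E•, which gives "accept" on $ and is not listed here.
FINAL_ITEMS = {2: 2, 3: 4, 4: 5, 7: 1, 8: 3}


def first_sets():
    """FIRST of every non-terminal (valid only without ε-productions)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets():
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow["S'"].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):            # something follows sym
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                           # sym ends the production
                    new = follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


def reduce_entries():
    """Map (state, terminal) -> 'r<production number>'."""
    follow = follow_sets()
    return {(state, t): f"r{prod}"
            for state, prod in FINAL_ITEMS.items()
            for t in follow[GRAMMAR[prod][0]]}


if __name__ == "__main__":
    for key, action in sorted(reduce_entries().items()):
        print(key, action)
```

With FOLLOW(E) = { +, $ }, state I2 (E → T•) gets a reduce entry only under + and $, so the column for * stays free for the shift to I6.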
CLR (1) Parsing
• LR (1) item
• An LR (1) item is an LR (0) item together with a lookahead symbol.
LR (1) item = LR (0) item + lookahead
• The lookahead determines where the reduce entry for a final item is placed.
• The lookahead of the augmented production is always the $ symbol.
CLR (1) Parsing
• Example
• CLR ( 1 ) Grammar
• S → AA
• A → aA
• A→b
CLR (1) Parsing
• Add the augmented production, insert the '•' symbol at the first position of every production in G, and add the lookahead.
Rules for the lookahead:
1. The augmented production must have $ as its lookahead.
2. If A → α•Bβ, a/b is in the item set, then add B → •γ with the lookahead FIRST(β), or a/b if β can derive null (ε).
• S' → •S, $
• S → •AA, $
• A → •aA, a/b
• A → •b, a/b
CLR (1) Parsing
• I0 =
S' → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b
• I1 = Go to (I0, S) = S' → S•, $
• I2 = Go to (I0, A) =
S → A•A, $
A → •aA, $
A → •b, $
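The lookahead rule can be made concrete with a short LR (1) closure computation. The Python sketch below recomputes I0 and I2 for this grammar. It is an illustration under assumptions: an item is stored as (production index, dot position, lookahead), so A → •aA, a/b appears as two items; FIRST is simplified because the grammar has no ε-productions and no left recursion; all names are hypothetical.

```python
# Productions as (left side, right side); production 0 is the augmented one.
GRAMMAR = [("S'", ("S",)), ("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]
NONTERMINALS = {"S'", "S", "A"}


def first_of(symbols):
    """FIRST of a symbol string; only the first symbol matters here
    because the grammar has no ε-productions."""
    head = symbols[0]
    if head not in NONTERMINALS:
        return {head}
    result = set()
    for lhs, rhs in GRAMMAR:
        if lhs == head and rhs[0] != head:
            result |= first_of(rhs)
    return result


def closure(items):
    items = set(items)
    work = list(items)
    while work:
        prod, dot, lookahead = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot == len(rhs) or rhs[dot] not in NONTERMINALS:
            continue
        # For A -> α•Bβ, a: new lookaheads are FIRST(βa).
        beta_a = rhs[dot + 1:] + (lookahead,)
        for q, (lhs, _) in enumerate(GRAMMAR):
            if lhs != rhs[dot]:
                continue
            for b in first_of(beta_a):
                item = (q, 0, b)
                if item not in items:
                    items.add(item)
                    work.append(item)
    return items


def goto(items, symbol):
    moved = {(p, d + 1, la) for p, d, la in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def show(item):
    prod, dot, lookahead = item
    lhs, rhs = GRAMMAR[prod]
    return f"{lhs} -> {' '.join(rhs[:dot])}•{' '.join(rhs[dot:])}, {lookahead}"


if __name__ == "__main__":
    i0 = closure({(0, 0, "$")})
    print("I0:", sorted(show(i) for i in i0))
    print("I2:", sorted(show(i) for i in goto(i0, "A")))
```

In I0 the A-items receive the lookaheads a and b because the symbol after the first A in S → •AA is another A, whose FIRST set is { a, b }. In I2 nothing follows the second A, so the lookahead $ is inherited.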
LALR (1) Parsing
• Add the augmented production, insert the '•' symbol at the first position of every production in G, and add the lookahead.
• S' → •S, $
• S → •AA, $
• A → •aA, a/b
• A → •b, a/b
Non-Recursive Descent
LL(1) Parsing:
The first L means that the input is scanned from left to right.
The second L means that the parser produces a leftmost derivation.
The 1 is the number of lookahead symbols, that is, how many input symbols the parser examines when it makes a parsing decision.
Non-Recursive Descent
Left Factoring:
If a grammar contains two productions of the form
S → aα and S → aβ
it is not suitable for top down parsing without backtracking.
Problems of this form can sometimes be removed from the grammar by a technique called left factoring.
In left factoring, we replace { S → aα, S → aβ } by
{ S → aS', S' → α, S' → β }, compare S → a(α|β)
(ideally, α and β start with different symbols).
Non-Recursive Descent
Left Factoring
Left factoring for G = { S → aSb | c | ab }:
S → aSb | ab | c = a(Sb | b) | c, which gives
S → aS' | c
S' → Sb | b
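The transformation can also be carried out mechanically. The Python sketch below is a minimal illustration: it factors out only a single common leading symbol (the form S → aα, S → aβ used above), not the longest common prefix, and it performs one pass only. The function name left_factor and the representation of alternatives as tuples of symbols are assumptions of this sketch.

```python
def left_factor(nonterminal, alternatives):
    """Factor out a shared leading symbol from the alternatives of one
    non-terminal. Each alternative is a non-empty tuple of symbols."""
    # Group the alternatives by their first symbol (dict keeps insertion order).
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[0], []).append(alt)

    result = {nonterminal: []}
    new_name = nonterminal
    for head, alts in groups.items():
        if len(alts) == 1:
            result[nonterminal].append(alts[0])      # nothing to factor
        else:
            new_name += "'"                          # S', S'', ...
            result[nonterminal].append((head, new_name))
            # The remainders become the alternatives of the new non-terminal;
            # an empty remainder is written as ε.
            result[new_name] = [alt[1:] or ("ε",) for alt in alts]
    return result


if __name__ == "__main__":
    grammar = left_factor("S", [("a", "S", "b"), ("c",), ("a", "b")])
    for lhs, alts in grammar.items():
        print(lhs, "→", " | ".join(" ".join(alt) for alt in alts))
```

For G = { S → aSb | c | ab } this yields S → a S' | c and S' → S b | b. If the new alternatives still share a leading symbol, the function would have to be applied again to the new non-terminal.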
Non-Recursive Descent
LL(1) Parsing:
Construction of LL(1) Parsing Table:
To construct the Parsing table, we have two functions:
1: First(): the set of terminal symbols that can begin the strings derived from a variable.
2: Follow(): the set of terminal symbols that can appear immediately after a variable in some step of a derivation.
Non-Recursive Descent
Grammar:              First( ):
E  → TE'              FIRST(E)  = { (, id }
E' → +TE' | ε         FIRST(E') = { +, ε }
T  → FT'              FIRST(T)  = { (, id }
T' → *FT' | ε         FIRST(T') = { *, ε }
F  → (E) | id         FIRST(F)  = { (, id }
Non-Recursive Descent
Grammar:              Follow( ):
E  → TE'              FOLLOW(E)  = { $, ) }
E' → +TE' | ε         FOLLOW(E') = { $, ) }
T  → FT'              FOLLOW(T)  = { +, $, ) }
T' → *FT' | ε         FOLLOW(T') = { +, $, ) }
F  → (E) | id         FOLLOW(F)  = { +, *, $, ) }
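The FIRST and FOLLOW sets of this grammar can be computed by iterating until no set changes any more. The Python sketch below is one possible way to do this, not a definitive implementation. The names GRAMMAR, first_of_string, compute_first and compute_follow are hypothetical, and E is assumed to be the start symbol.

```python
EPS = "ε"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a sequence of symbols, given the FIRST sets known so far."""
    out = set()
    for sym in symbols:
        f = first[sym] if sym in GRAMMAR else {sym}   # terminal: itself
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)          # every symbol can vanish (or the string is empty)
    return out


def compute_first():
    first = {n: set() for n in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for n, alternatives in GRAMMAR.items():
            for alt in alternatives:
                new = first_of_string(alt, first)
                if not new <= first[n]:
                    first[n] |= new
                    changed = True
    return first


def compute_follow(first, start="E"):
    follow = {n: set() for n in GRAMMAR}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for n, alternatives in GRAMMAR.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if sym not in GRAMMAR:
                        continue
                    rest = first_of_string(alt[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:               # the rest can vanish
                        new |= follow[n]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


if __name__ == "__main__":
    first = compute_first()
    follow = compute_follow(first)
    for n in GRAMMAR:
        print(n, sorted(first[n]), sorted(follow[n]))
```

The computed sets agree with the FIRST and FOLLOW sets listed for this grammar.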
Non-Recursive Descent
First( ):                   Follow( ):
FIRST(E)  = { (, id }       FOLLOW(E)  = { $, ) }
FIRST(E') = { +, ε }        FOLLOW(E') = { $, ) }
FIRST(T)  = { (, id }       FOLLOW(T)  = { +, $, ) }
FIRST(T') = { *, ε }        FOLLOW(T') = { +, $, ) }
FIRST(F)  = { (, id }       FOLLOW(F)  = { +, *, $, ) }
LL(1) grammar:
Every entry of the parsing table holds at most one production, so no location has multiple entries. A grammar whose parsing table has this property is called an LL(1) grammar.
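The single-entry property can be checked by building the table. The Python sketch below takes the FIRST and FOLLOW sets of the expression grammar used in this unit as given, fills the LL(1) table, tests that no cell holds more than one production, and runs a table-driven (non-recursive) parse. The names build_table, is_ll1 and parse are hypothetical, and the sketch is an illustration rather than a complete parser (it reports only success or failure).

```python
EPS = "ε"
PRODUCTIONS = [
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]), ("E'", [EPS]),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]), ("T'", [EPS]),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {"$", ")"}, "E'": {"$", ")"}, "T": {"+", "$", ")"},
          "T'": {"+", "$", ")"}, "F": {"+", "*", "$", ")"}}


def first_of_string(symbols):
    out = set()
    for sym in symbols:
        f = FIRST.get(sym, {sym})         # a terminal (or ε) is its own FIRST
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out


def build_table():
    """table[(non-terminal, terminal)] -> list of right sides."""
    table = {}
    for lhs, rhs in PRODUCTIONS:
        f = first_of_string(rhs)
        lookaheads = f - {EPS}
        if EPS in f:                      # rhs can vanish: use FOLLOW(lhs)
            lookaheads |= FOLLOW[lhs]
        for terminal in lookaheads:
            table.setdefault((lhs, terminal), []).append(rhs)
    return table


def is_ll1(table):
    return all(len(entries) == 1 for entries in table.values())


def parse(tokens, table, start="E"):
    """Table-driven predictive parse; returns True if tokens are accepted."""
    stack = ["$", start]
    tokens = list(tokens) + ["$"]
    pos = 0
    while stack:
        top = stack.pop()
        if top == tokens[pos]:            # matching terminal (or $)
            pos += 1
        elif (top, tokens[pos]) in table:
            rhs = table[(top, tokens[pos])][0]
            stack.extend(s for s in reversed(rhs) if s != EPS)
        else:
            return False                  # empty table cell: syntax error
    return pos == len(tokens)


if __name__ == "__main__":
    table = build_table()
    print("LL(1):", is_ll1(table))
    print(parse(["id", "+", "id", "*", "id"], table))
```

A grammar with a conflict would show up as a cell whose list contains two or more right sides, in which case is_ll1 returns False.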