CD Unit3

Parsing or Syntax Analysis

Parsing is the process that takes an input string and either produces a parse tree or reports a syntax error.

• It is the second phase of compilation.

• The syntax analyser takes tokens from the lexical analyser and groups them into program structures; if the syntax cannot be recognized, it reports a syntax error.

Example:- for the program statement (a + b) * c the parser produces the following parse tree
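(The slide's figure is not reproduced here; as a sketch, one possible shape of the resulting tree, drawn as an operator tree rather than the full parse tree with non-terminals and parentheses, is:)

        *
       / \
      +   c
     / \
    a   b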
Role of Parser
The parser constructs the parse tree based on a derivation process.

A derivation is the process of deriving the input string by applying production rules:

1. The parser scans the input from left to right.
2. The parser uses production rules to derive the given input string.
3. Finally, a parse tree is constructed.
Ex:- x = a + b * c;
Role of Parser
1. It verifies the structure generated by the tokens based on the grammar.
2. It constructs the parse tree.
3. It reports the errors.
4. It performs error recovery.

Parsing Techniques

Depending upon how the parse tree is built, parsing techniques are classified into top-down parsing and bottom-up parsing.
Top Down vs Bottom Up Parsing

Top Down Parsing
• The parse tree is generated from top to bottom (from root to leaves).
• The leftmost derivation is applied at each derivation step.
• In top-down parsing selection of proper production rule is very important.

Problems with Top Down Parsing


1. Backtracking

2. Left Recursion

3. Left Factoring

4. Ambiguity
Top Down Parsing- Backtracking

Backtracking:- It is a technique in which non-terminal symbols are expanded on a trial-and-error basis until a match for the input string is found.

Disadvantages:
1. Although it is a powerful technique, it is slow and in general requires exponential time.
2. Hence backtracking is not preferred in practical compilers.


Top Down Parsing- Backtracking
Example:- Consider the input string ‘acb’ for the following grammar
Top Down Parsing-Left Recursion
Left recursion is problematic for top-down parsers, whereas right recursion and general recursion do not create any problem for them.
Therefore, left recursion has to be eliminated from the grammar, as in the example below.
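A standard worked example of the elimination (the general rule, not taken from this slide's figure): a left-recursive pair is rewritten as a right-recursive pair.

A → Aα | β        becomes        A → βA’
                                  A’ → αA’ | ε

For instance, E → E+T | T becomes
E  → TE’
E’ → +TE’ | ε
(this transformed grammar is used again in the LL(1) example later in this unit).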
Top Down Parsing- Left Factoring
Left factoring is a process that transforms a grammar whose productions have common prefixes.
The grammar obtained after left factoring is called a left-factored grammar; an example is shown below.
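A short worked example (not from the slide's figure): productions with a common prefix α are factored so the parser can defer the choice until after α has been matched.

A → αβ1 | αβ2      becomes      A → αA’
                                 A’ → β1 | β2

For instance, S → iEtS | iEtSeS | a becomes
S  → iEtSS’ | a
S’ → eS | ε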
Top Down Parsing- Ambiguity
A grammar is said to be ambiguous if, for some string it generates, it produces more than one parse tree
(equivalently, more than one derivation tree, syntax tree, leftmost derivation, or rightmost derivation).

Example:- X → X+X | X*X | X | a


Two different leftmost derivations of the string a+a*a:

X → X+X                X → X*X
  → a+X                  → X+X*X
  → a+X*X                → a+X*X
  → a+a*X                → a+a*X
  → a+a*a                → a+a*a

Here we are able to create two parse trees for the same string, so the given grammar is ambiguous.
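The two parse trees implied by these derivations can be sketched as follows (one groups a*a first, the other groups a+a first):

        X                      X
      / | \                  / | \
     X  +  X                X  *  X
     |    /|\              /|\     |
     a   X * X            X + X    a
         |   |            |   |
         a   a            a   a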
Recursive Descent
• It is a kind of top-down parser.
• It uses a collection of recursive procedures (one for each non-terminal) to parse the given input string.
• The CFG is used to build the recursive routines.
• Each non-terminal in the grammar is implemented as a function.
• Begin with the start symbol S of the grammar by calling the function S().
• Based on the first token received, apply the appropriate grammar rule for S.
• Continue in this manner until S is “satisfied.”

Advantages:-
• Simple to build

Disadvantages:-
• Error-recovery is difficult.
• They are not able to handle as large a set of grammars as other parsing methods.
Recursive Descent- procedure
The procedure starts from the root node and applies production rules until the leaf nodes are reached.

Step1: If the current grammar symbol is a non-terminal, call the corresponding procedure of that non-terminal.

Step2: If the current grammar symbol is a terminal, compare it with the current input symbol; if it matches, the input pointer advances.

Step3: If a non-terminal has more than one production, the code for all of its productions is written in the corresponding function.

Step4: There is no need to define main(); if main() is defined, it must call the function of the start symbol.
Recursive Descent- procedure
Example:-
E -> i E’
E’ -> + i E’ | ε

There are 2 Non-terminals ( E & E’ ) so we define 2 functions

E()
{
    if (input == ‘i’ )
    {
        input++;
        Eprime();
    }
}

Eprime()
{
    if (input == ‘+’ )
    {
        input++;
        if (input == ‘i’ )
        {
            input++;
            Eprime();
        }
    }
    else
        return;
}
sample input string = i+i $
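The pseudocode above can be turned into a small runnable program. The sketch below is an illustration (not part of the original notes): it uses a global pointer over the token string, and a main() is added only for demonstration, calling the start-symbol function E() and then checking for the $ end marker.

#include <stdio.h>
#include <stdlib.h>

const char *input = "i+i$";            /* sample input string; $ is the end marker */

void Eprime(void);

void E(void)                           /* E -> i E' */
{
    if (*input == 'i') {
        input++;
        Eprime();
    }
}

void Eprime(void)                      /* E' -> + i E' | epsilon */
{
    if (*input == '+') {
        input++;
        if (*input == 'i') {
            input++;
            Eprime();
        } else {
            printf("Syntax error: 'i' expected after '+'\n");
            exit(1);
        }
    }
    /* otherwise take the epsilon production and consume nothing */
}

int main(void)
{
    E();                               /* start from the start symbol */
    if (*input == '$')
        printf("String accepted\n");
    else
        printf("Syntax error\n");
    return 0;
}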
First & Follow functions
Compiler design is the process of creating software that translates source code written in a high-
level programming language into machine code that can be executed by a computer. Parsing is one of
the crucial steps in the process of compiler design that involves breaking down the source code into
smaller pieces and analyzing their syntax to generate a parse tree.

FIRST and FOLLOW in Compiler Design functions are used to generate a predictive parser, which is a
type of syntax analyzer used to check whether the input source code follows the syntax of the
programming language.
The FIRST and FOLLOW sets are used to construct a predictive parser, which can predict which production to apply for the next input token. FIRST tells the parser which terminals can begin the string derived from a production, whereas FOLLOW tells it which terminals can follow a non-terminal.

By using FIRST and FOLLOW, the parser knows in advance which production rule to apply for each input symbol.
First & Follow functions
To compute the first set of a nonterminal, we must consider all possible productions of the
nonterminal symbol and compute the first set of the symbols that can appear as the first token in
the right-hand side of each production. If any of these symbols can derive the empty string (i.e., the
ε symbol), then we must also include the First set of the next symbol in the production.

RULES:-
1. If A → aα, where a is a terminal and α ∈ (V ∪ T)*, then a ∈ FIRST(A).
2. If A → ε, then FIRST(A) = { ε }.
3. If A → BC, then
   a) FIRST(A) = FIRST(B), if FIRST(B) does not contain ε
   b) FIRST(A) = (FIRST(B) − {ε}) ∪ FIRST(C), if FIRST(B) contains ε

Example:-
E  → TE’
E’ → +TE’ | ε
T  → FT’
T’ → *FT’ | ε
F  → (E) | id

FIRST(E)  = { ( , id }
FIRST(E’) = { + , ε }
FIRST(T)  = { ( , id }
FIRST(T’) = { * , ε }
FIRST(F)  = { ( , id }
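For example, FIRST(E) is obtained by applying rule 3(a) twice (a short worked chain, not on the original slide):

FIRST(E) = FIRST(T)        since E → TE’ and FIRST(T) does not contain ε
FIRST(T) = FIRST(F)        since T → FT’ and FIRST(F) does not contain ε
FIRST(F) = { ( , id }      by rule 1, since F → (E) | id
Hence FIRST(E) = FIRST(T) = FIRST(F) = { ( , id }.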
First & Follow functions
The FOLLOW set of a non-terminal is the set of terminal symbols that can appear immediately after that non-terminal in any valid derivation.

To compute the FOLLOW set of a non-terminal, we consider every production in which it appears on the RHS and add the FIRST set of whatever follows it. If the non-terminal appears at the end of a production, or is followed only by symbols that can derive ε, we also add the FOLLOW set of the production’s LHS.

RULES:-

1. If S is the start symbol then FOLLOW(S) = { $ }

2. If A → αBβ then
   a) FOLLOW(B) = FIRST(β) − {ε}
   b) FOLLOW(B) also includes FOLLOW(A), if β ⇒ ε (i.e., ε ∈ FIRST(β))

3. If A → αB then FOLLOW(B) = FOLLOW(A)
First & Follow functions
Example:-
E  → TE’
E’ → +TE’ | ε
T  → FT’
T’ → *FT’ | ε
F  → (E) | id

1) FOLLOW(E) = { $, ) }
• $ is included as E is the start symbol
• ) is included as it is the symbol following E in F → (E)

2) FOLLOW(E’) = { $, ) }
• By applying rule 3 to E → TE’, FOLLOW(E’) = FOLLOW(E)

3) FOLLOW(T) = { +, $, ) }
• By applying rule 2(a) to E → TE’, FOLLOW(T) = FIRST(E’) − {ε}
• FIRST(E’) = { +, ε } -------(1), but a FOLLOW set should not contain ε
• So substitute ε for E’ in E → TE’, giving E → T
• For E → T, FOLLOW(T) = FOLLOW(E) = { $, ) } -------(2)
• Combining (1) and (2) we get FOLLOW(T) = { +, $, ) }

4) FOLLOW(T’) = { +, $, ) }
• By applying rule 3 to T → FT’, FOLLOW(T’) = FOLLOW(T)

5) FOLLOW(F) = { *, +, $, ) }
• Apply the same procedure as for point 3 (here * comes from FIRST(T’))
LL(1) Parser
• LL(1) parser / Predictive Parser / Non-Recursive Descent Parser

• The parser program works with the following three components to produce its output:

1) INPUT: contains the string to be parsed, with $ as its end marker.

2) STACK: contains a sequence of grammar symbols, with $ as its bottom marker. Initially the stack contains only $.

3) PARSING TABLE: a two-dimensional array M[A, a], where A is a non-terminal and a is a terminal.

It is a tabular implementation of recursive descent parsing, where the stack is maintained explicitly by the parser rather than implicitly by the run-time stack of the language in which the parser is written.
LL(1) Parser
Construction of the LL(1) parser

STEP1:- Elimination of left recursion

STEP2:- Left factoring of the grammar

STEP3:- Calculation of FIRST and FOLLOW

STEP4:- Construction of the parsing table

STEP5:- String validation: check whether the input string is accepted by the parser or not
LL(1) Parser
Example:-
E → E+T | T
T → T*F | F
F → (E) | id

STEP1:- Elimination of left recursion

E → E+T | T is converted to
E  → TE’
E’ → +TE’ | ε

T → T*F | F is converted to
T  → FT’
T’ → *FT’ | ε

After removing left recursion we get
E  → TE’
E’ → +TE’ | ε
T  → FT’
T’ → *FT’ | ε
F  → (E) | id
LL(1) Parser
STEP2:- Left factoring

E  → TE’
E’ → +TE’ | ε
T  → FT’
T’ → *FT’ | ε
F  → (E) | id

The above grammar has no common prefixes, so no left factoring is needed.
LL(1) Parser
STEP3:- Calculation of FIRST and FOLLOW (computed exactly as in the previous section)

E  → TE’
E’ → +TE’ | ε
T  → FT’
T’ → *FT’ | ε
F  → (E) | id

FIRST(E)  = { ( , id }          FOLLOW(E)  = { $, ) }
FIRST(E’) = { + , ε }           FOLLOW(E’) = { $, ) }
FIRST(T)  = { ( , id }          FOLLOW(T)  = { +, $, ) }
FIRST(T’) = { * , ε }           FOLLOW(T’) = { +, $, ) }
FIRST(F)  = { ( , id }          FOLLOW(F)  = { *, +, $, ) }
LL(1) Parser
STEP4:- Construction of the parsing table
• Rows are non-terminals and columns are terminals.
• Each production is placed under the elements of the FIRST set of its RHS; all the null (ε) productions of the grammar go under the elements of the FOLLOW set of their LHS.
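For the running example, applying these placement rules to the FIRST and FOLLOW sets computed above gives the following table (reconstructed here in text form since the slide's table is a figure; blank entries are errors):

          id         +            *            (          )          $
E      E → TE’                             E → TE’
E’                E’ → +TE’                           E’ → ε     E’ → ε
T      T → FT’                             T → FT’
T’                T’ → ε       T’ → *FT’              T’ → ε     T’ → ε
F      F → id                              F → (E)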
LL(1) Parser
STEP5:- String validation: check whether the input string is accepted by the parser or not.
Actions are decided by looking up the top of the stack and the current input symbol in the parsing table.

After parsing the entire string, if the stack contains only $ then the string is accepted by the parser.
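A minimal sketch of this table-driven parse for the same grammar (an illustration, not part of the original notes): non-terminals are encoded as single characters, with e and t standing for E’ and T’, id is abbreviated to the token 'i', and the parsing table is encoded as a small lookup function.

#include <stdio.h>
#include <string.h>

/* Returns the RHS to expand non-terminal nt on input symbol a,
   "" for an epsilon production, or NULL for an error entry. */
static const char *table(char nt, char a)
{
    switch (nt) {
    case 'E': return (a == '(' || a == 'i') ? "Te" : NULL;          /* E  -> T E'   */
    case 'e': if (a == '+') return "+Te";                           /* E' -> + T E' */
              return (a == ')' || a == '$') ? "" : NULL;            /* E' -> eps    */
    case 'T': return (a == '(' || a == 'i') ? "Ft" : NULL;          /* T  -> F T'   */
    case 't': if (a == '*') return "*Ft";                           /* T' -> * F T' */
              return (a == '+' || a == ')' || a == '$') ? "" : NULL;/* T' -> eps    */
    case 'F': if (a == '(') return "(E)";                           /* F  -> ( E )  */
              return (a == 'i') ? "i" : NULL;                       /* F  -> id     */
    default:  return NULL;
    }
}

int main(void)
{
    const char *p = "i+i*i$";          /* input: id + id * id, with $ end marker */
    char stack[100];
    int top = 0;
    stack[top++] = '$';                /* bottom-of-stack marker */
    stack[top++] = 'E';                /* start symbol */

    while (top > 0) {
        char X = stack[top - 1], a = *p;
        if (X == '$' && a == '$') {    /* only $ left on stack and in input */
            printf("String accepted\n");
            return 0;
        }
        if (X == a) {                  /* terminal on top matches input: pop and advance */
            top--;
            p++;
        } else if (X == 'E' || X == 'e' || X == 'T' || X == 't' || X == 'F') {
            const char *rhs = table(X, a);
            if (rhs == NULL)
                break;                 /* empty table entry: syntax error */
            top--;                     /* pop the non-terminal ...                        */
            for (int k = (int)strlen(rhs) - 1; k >= 0; k--)
                stack[top++] = rhs[k]; /* ... and push its RHS in reverse (eps pushes nothing) */
        } else {
            break;                     /* terminal mismatch: syntax error */
        }
    }
    printf("Syntax error\n");
    return 1;
}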
LL(1) Parser
Now draw the parse tree from the Action column of the parsing table.
BOTTOM UP PARSING
Reduction is the activity in which the parser tries to match a substring of the input string with the RHS of a production rule and replaces it by the corresponding LHS.

A handle is the substring of the input string that matches the RHS of a production.

Bottom-up parsing is the process of reducing the input string to the start symbol of the grammar, i.e., constructing the rightmost derivation in reverse order.

1) The input string is taken first.

2) Then we try to reduce substrings of the input string with the help of the grammar.

3) Finally we obtain the start symbol.

The parse tree is constructed starting from the leaf nodes.
Leaf nodes are reduced to internal nodes, and these internal nodes are further reduced until eventually the root node is obtained.
BOTTOM UP PARSING
Example:-
S → aABe
A → Abc | b
B → d
Let the input string be “abbcde”.

STEP1: “abbcde”

STEP2: “aAbcde”  The substring ‘b’ of the input string matches the production A → b, so replace ‘b’ with the corresponding LHS A.

STEP3: “aAde”    The substring ‘Abc’ matches the production A → Abc, so replace ‘Abc’ with the corresponding LHS A.

STEP4: “aABe”    The substring ‘d’ matches the production B → d, so replace ‘d’ with the corresponding LHS B.

STEP5: S         The substring ‘aABe’ matches the production S → aABe, so replace ‘aABe’ with the corresponding LHS S.
BOTTOM UP PARSING
Handle pruning is the process of obtaining the rightmost derivation in reverse order.
Example:-
S → aABe          Rightmost derivation:  S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
A → Abc | b
B → d
Let the input string be “abbcde”.

Right sentential form     Handle     Production
abbcde                    b          A → b
aAbcde                    Abc        A → Abc
aAde                      d          B → d
aABe                      aABe       S → aABe
S

Here bottom-up parsing produces the rightmost derivation in reverse order.


Shift Reduce Parser
The shift-reduce parser is a bottom-up parsing technique, i.e. the parse tree is constructed from the leaves (bottom) to the root (up).

This parser requires the following data structures:

• An input buffer for storing the input string; the string is terminated with $.
• A stack for storing the symbols of production rules; initially the bottom of the stack contains $.

At each step the parser performs one of the following actions:

1. Shift: push the next symbol from the input buffer onto the stack.
2. Reduce: when a match for a handle (the RHS of a production) is found at the top of the stack,
   the reduce operation applies the corresponding production rule,
   i.e., it pops the RHS of the production rule off the stack and
   pushes the LHS of the production rule onto the stack.
3. Accept: if only the start symbol is present on the stack and the input buffer is empty,
   the parsing action is called accept. When the accept action is reached,
   it means parsing has completed successfully and the string is accepted.
4. Error: the situation in which the parser can perform neither a shift action nor a reduce action,
   nor the accept action.
Shift Reduce Parser-example1
Example 1 – Consider the grammar S –> S + S | S * S | id
Perform Shift Reduce parsing for input string “id + id + id”.

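The slide's trace table is a figure, so a possible sequence of moves is sketched below. Note that this grammar is ambiguous, so other valid move sequences exist; this one reduces as early as possible.

Stack          Input Buffer       Parsing Action
$              id + id + id $     Shift
$ id           + id + id $        Reduce S → id
$ S            + id + id $        Shift
$ S +          id + id $          Shift
$ S + id       + id $             Reduce S → id
$ S + S        + id $             Reduce S → S + S
$ S            + id $             Shift
$ S +          id $               Shift
$ S + id       $                  Reduce S → id
$ S + S        $                  Reduce S → S + S
$ S            $                  Accept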
Shift Reduce Parser-example2
Example 2 – Consider the grammar S → ( L ) | a and L → L , S | S
Perform shift-reduce parsing for the input string “( a, ( a, a ) )”.
Stack            Input Buffer          Parsing Action

$                ( a , ( a , a ) ) $   Shift

$ (              a , ( a , a ) ) $     Shift

$ ( a            , ( a , a ) ) $       Reduce S → a

$ ( S            , ( a , a ) ) $       Reduce L → S

$ ( L            , ( a , a ) ) $       Shift

$ ( L ,          ( a , a ) ) $         Shift

$ ( L , (        a , a ) ) $           Shift

$ ( L , ( a      , a ) ) $             Reduce S → a

$ ( L , ( S      , a ) ) $             Reduce L → S

$ ( L , ( L      , a ) ) $             Shift


Shift Reduce Parser-example2

Stack               Input Buffer     Parsing Action

$ ( L , ( L ,       a ) ) $          Shift

$ ( L , ( L , a     ) ) $            Reduce S → a

$ ( L , ( L , S     ) ) $            Reduce L → L , S

$ ( L , ( L         ) ) $            Shift

$ ( L , ( L )       ) $              Reduce S → ( L )

$ ( L , S           ) $              Reduce L → L , S

$ ( L               ) $              Shift

$ ( L )             $                Reduce S → ( L )

$ S                 $                Accept
LR Parser or LR(k) parser
LR parsing is a type of bottom-up parsing. It is used to parse a large class of grammars.

In LR parsing, "L" stands for left-to-right scanning of the input,
"R" stands for constructing a rightmost derivation in reverse, and
"k" is the number of look-ahead input symbols used to make parsing decisions.

NOTE:-
LR(0) and SLR(1) use the canonical collection of LR(0) items.
LALR(1) and CLR(1) use the canonical collection of LR(1) items.
LR Parser or LR(k) parser
1. The input buffer contains the string to be parsed followed by a $ symbol.
2. A stack is used to hold a sequence of grammar symbols, with $ at the bottom of the stack.
3. The parsing table is a two-dimensional array. It contains two parts:
   a) the Action part and
   b) the Goto part.
NOTE:- The LR algorithm requires a stack, input, output and parsing table. In all types of LR parsing, the input,
output and stack are the same; only the parsing table differs.
LR Parser or LR(k) parser-- terminology
The dot (•) indicates how much of the input has been scanned up to a given point in the process of parsing.
An item is any production rule with a dot (•) at the beginning of its RHS.
An LR(0) item is any production rule with a dot (•) at some position of its RHS.
A final item is any production rule with a dot (•) at the end of its RHS.
Ex:- for the production S → AA
S → •AA   // item
S → A•A   // LR(0) item
S → AA•   // final item
Canonical collection represents the set of valid states for the LR parser
a) Canonical LR (0) collection
b) Canonical LR (1) collection
Canonical LR (0) collection helps to construct LR parsers like LR(0) & SLR parsers
To create Canonical LR (0) collection for Grammar, 3 things are required −
1. Augmented Grammar
2. Closure Function
3. goto Function
LR Parser or LR(k) parser-- terminology
An augmented grammar adds a new production S’ → S so that the new start symbol S’ appears only on the LHS of a production. It is used to identify when to stop the parser and declare the string accepted.
Ex:- For the grammar S → AA, A → aA | b (used in the examples below),
S’ → S is the augmenting production; here S’ appears only on the LHS.

Closure( ) helps to find the states of the automaton.
To compute the closure of an item set: if a dot (•) appears immediately before a non-terminal, add all productions of that
non-terminal with a dot (•) at the beginning of their RHS, and repeat until no new items can be added.

Goto( ) helps to construct the transitions of the automaton.
It moves the dot (•) past the given grammar symbol in every item where the dot appears immediately before that symbol, and then takes the closure.
Syntax:----- Goto(old_state, X)
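Ex:- (a worked illustration using the grammar S → AA, A → aA | b that is parsed in the next section, since the slide's own examples are figures)

Closure( { S’ → •S } ) = { S’ → •S,      // the dot is before S, so add all S-productions
                           S  → •AA,     // the dot is before A, so add all A-productions
                           A  → •aA,
                           A  → •b }

Goto( I0 , A ) = Closure( { S → A•A } )
               = { S → A•A, A → •aA, A → •b }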
LR Parser or LR(k) parser-- implementation
STEPS to implement LR(0) parser

1. For given Input string write CFG

2. Check the Ambiguity of the grammar

3. Add augment production to the given grammar


4. Create canonical collection of LR(0) items

5. Draw the DFA (goto graph) of the item sets

6. Construct LR(0) parsing table

7. Parse the input string


LR Parser or LR(k) parser-- Example
Example:-
S → AA
A → aA | b

STEP 1&2 :- CFG & ambiguity


Already satisfied

STEP3:- Add augment grammar


Add the augment production and insert the ‘•’ symbol at the first position of every production in G
S` → •S // Augment grammar
S → •AA
A → •aA
A → •b
LR Parser or LR(k) parser-- Example
STEP4:- canonical collection of LR(0) items
I0 = S` → •S
S → •AA
A → •aA
A → •b

I1= Go to (I0, S) = closure (S` → S•) = S` → S•

I2= Go to (I0, A) = closure (S → A•A)


I2 = S → A•A
     A → •aA    //Add all productions of A into I2 because “•” is followed by the non-terminal A
     A → •b

I3= Go to (I0, a) = Closure (A → a•A)

I3 = A → a•A    //Add all productions of A into I3 because “•” is followed by the non-terminal A
     A → •aA
     A → •b
LR Parser or LR(k) parser-- Example

I4= Go to (I0, b) = closure (A → b•) = A → b•

Go to (I2,a) = Closure (A → a•A) = (same as I3)


Go to (I2, b) = Closure (A → b•) = (same as I4)
Go to (I3, a) = Closure (A → a•A) = (same as I3)
Go to (I3, b) = Closure (A → b•) = (same as I4)

I5= Go to (I2, A) = Closure (S → AA•) = S → AA•


I6= Go to (I3, A) = Closure (A → aA•) = A → aA•
LR Parser or LR(k) parser-- Example
STEP5:- Draw the DFA (goto graph) of the canonical collection
LR Parser or LR(k) parser-- Example

STEP6:-construct LR(0) Parsing Table


• If a state goes to some other state on a terminal, it corresponds to a shift move.
• If a state goes to some other state on a non-terminal, it corresponds to a goto move.
• If a state contains a final item, write the reduce move in the entire row of that state.

let

S` → S ----------------Accepted
S → AA ----------------r1
A → aA ----------------r2
A→b ----------------r3
LR Parser or LR(k) parser-- Example

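Applying these rules to the item sets I0–I6 above gives the following LR(0) parsing table (reconstructed here in text form since the slide's table is a figure; Sn = shift and go to state n, rn = reduce by production n, blank = error):

             Action                    Goto
State     a      b      $           S      A
  0       S3     S4                 1      2
  1                     Accept
  2       S3     S4                        5
  3       S3     S4                        6
  4       r3     r3     r3
  5       r1     r1     r1
  6       r2     r2     r2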
LR Parser or LR(k) parser-- Example
Step7:- Parsing the input string
• For every push operation the pointer advances to the next symbol.
NOTE:- After a PUSH, increment the pointer to point to the next symbol in the input string.
LR Parser or LR(k) parser-- Example
Explanation
• I0 on S is going to I1 so write it as 1.
• I0 on A is going to I2 so write it as 2.
• I2 on A is going to I5 so write it as 5.
• I3 on A is going to I6 so write it as 6.
• I0, I2 and I3 on a are going to I3, so write it as S3, which means shift 3.
• I0, I2 and I3 on b are going to I4, so write it as S4, which means shift 4.
• I4, I5 and I6 all contain a final item because they contain • at the right-most end, so write the reduce move with the corresponding production number.
• I1 contains the final item (S` → S•), so action {I1, $} = Accept.
• I4 contains the final item A → b•, which corresponds to production number 3, so write r3 in the entire row.
• I5 contains the final item S → AA•, which corresponds to production number 1, so write r1 in the entire row.
• I6 contains the final item A → aA•, which corresponds to production number 2, so write r2 in the entire row.
SLR parser
• SLR stands for Simple LR.
• Of the LR parsers it accepts the smallest class of grammars and has the fewest states.
• SLR is very easy to construct and is similar to LR(0) parsing.
• The only difference between the SLR(1) parser and the LR(0) parser is that in the SLR(1) parsing table we place
  a reduce move only under the symbols in FOLLOW of the LHS, not in the entire row as in LR(0).

Steps for constructing the SLR parsing table:

1. For the given input string write the CFG
2. Check the ambiguity of the grammar
3. Write the augmented grammar
4. Find the LR(0) collection of items
5. Draw the DFA (goto graph)
6. Find FOLLOW of the LHS of each production
7. Construct the parsing table
8. Parse the input string
SLR Parser- Example
Example:-
S → AA
A → aA | b

STEP 1&2 :- CFG & ambiguity


Already satisfied

STEP3:- Add augment grammar


Add the augment production and insert the ‘•’ symbol at the first position of every production in G
S` → •S // Augment grammar
S → •AA
A → •aA
A → •b
SLR Parser- Example
STEP4:- canonical collection of LR(0) items
I0 = S` → •S
S → •AA
A → •aA
A → •b

I1= Go to (I0, S) = closure (S` → S•) = S` → S•

I2= Go to (I0, A) = closure (S → A•A)


I2 = S → A•A
     A → •aA    //Add all productions of A into I2 because “•” is followed by the non-terminal A
     A → •b

I3= Go to (I0, a) = Closure (A → a•A)

I3 = A → a•A    //Add all productions of A into I3 because “•” is followed by the non-terminal A
     A → •aA
     A → •b
SLR Parser- Example

I4= Go to (I0, b) = closure (A → b•) = A → b•

Go to (I2,a) = Closure (A → a•A) = (same as I3)


Go to (I2, b) = Closure (A → b•) = (same as I4)
Go to (I3, a) = Closure (A → a•A) = (same as I3)
Go to (I3, b) = Closure (A → b•) = (same as I4)

I5= Go to (I2, A) = Closure (S → AA•) = S → AA•


I6= Go to (I3, A) = Closure (A → aA•) = A → aA•
SLR Parser- Example
STEP5:- Draw the DFA (goto graph) of the canonical collection
SLR Parser-Example
STEP6:- Find FOLLOW of the LHS of each production

FOLLOW(S) = { $ }
FOLLOW(A) = { a, b, $ }

let

S` → S  ---------------- Accepted
S → AA  ---------------- r1
A → aA  ---------------- r2
A → b   ---------------- r3
SLR Parser- Example
STEP7:- Construct the SLR(1) parsing table
• If a state goes to some other state on a terminal, it corresponds to a shift move (Sn).
• If a state goes to some other state on a non-terminal, it corresponds to a goto move (n).
• If a state contains a final item, write the reduce move only under the symbols in FOLLOW of the LHS.
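With FOLLOW(S) = { $ } and FOLLOW(A) = { a, b, $ }, the SLR(1) table differs from the LR(0) table only in its reduce entries (reconstructed here in text form):

             Action                    Goto
State     a      b      $           S      A
  0       S3     S4                 1      2
  1                     Accept
  2       S3     S4                        5
  3       S3     S4                        6
  4       r3     r3     r3
  5                     r1
  6       r2     r2     r2

Note: r1 appears only under $ (FOLLOW(S)), while r2 and r3 appear under a, b and $ (FOLLOW(A)).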
SLR Parser- Example
STEP8:- Parsing the input string
• For every push operation the pointer advances to the next symbol.
NOTE:- After a PUSH, increment the pointer to point to the next symbol in the input string.
CLR Parser
• The CLR parser stands for canonical LR parser.
• It is a more powerful LR parser.
• It makes use of look-ahead symbols.
• This method uses a large set of items called LR(1) items.

LR(1) items=LR(0) items + look ahead


RULES for calculating look-aheads
1. The look-ahead for the augment production is always the $ symbol.
Ex:-
S’ → •S , $   // look-ahead is $ as it is the augment production

2. To calculate the look-ahead of a non-terminal (NT) in a production,
   find (•NT) in the producing item and compute FIRST(remaining part after •NT, followed by that item’s look-ahead).
Ex:-
S’ → •S , $    // in the producing item, •NT is •S
S  → •AA , ?   // look-ahead = FIRST(remaining part after •NT in the producing item) i.e. FIRST($) = $,
               // resulting in S → •AA , $
CLR Parser-- Implementation
STEPS to implement CLR(1) parser

1. For given Input string write CFG

2. Check the Ambiguity of the grammar

3. Add augment production to the given grammar


4. Create canonical collection of LR(1) items

5. Draw the DFA (goto graph)

6. Construct CLR(1) parsing table

7. Parse the input string


CLR Parser-Example
Example:-
S → AA
A → aA | b

STEP 1&2 :- CFG & ambiguity


Already satisfied

STEP3:- Add augment grammar


Add the augment production and insert the ‘•’ symbol at the first position of every production in G
S` → •S // Augment grammar
S → •AA
A → •aA
A → •b
CLR Parser-- Example
STEP4:- canonical collection of LR(1) items
I0 = S` → •S , $ // look-ahead is $ as it is the augment production
S → •AA , $ //look-ahead= FIRST(remaining part after .NT in previous production ) i.e, FIRST($)= $
A → •aA , a/b //look-ahead= FIRST(remaining part after .NT in previous production ) i.e, FIRST(A,$)= a/b
A → •b , a/b

I1= Go to (I0, S) = closure (S` → S•) = S` → S• , $ //write the look-ahead as it is in I0

I2= Go to (I0, A) = closure (S → A•A) s


I2 =S→A•A , $ //write the look-ahead as it is in I0
A → •aA , $ //Add all productions of A in to I2 State because "•" is followed by the non-terminal
A → •b , $ // look-ahead= FIRST(remaining part after .NT in previous production ) i.e, FIRST($)= $

I3= Go to (I0,a) = Closure (A → a•A)


I3= A → a•A , a/b   //write the look-ahead as it is in I0
    A → •aA , a/b   //Add all productions of A into I3 because “•” is followed by the non-terminal
    A → •b , a/b    // look-ahead = FIRST(remaining part after •NT followed by the look-ahead) i.e. FIRST(a/b) = a/b
CLR Parser-- Example
I4= Go to (I0, b) = closure (A → b•) = A → b•, a/b //write the look-ahead as it is in I0

I5= Go to (I2, A) = Closure (S → AA•) = S → AA• , $ //write the look-ahead as it is in I2

I6= Go to (I2, a) = Closure (A → a•A)

I6 = A → a•A , $    //write the look-ahead as it is in I2
     A → •aA , $    //Add all productions of A into I6 because “•” is followed by the non-terminal
     A → •b , $     // look-ahead = FIRST(remaining part after •NT followed by the look-ahead) i.e. FIRST($) = $

I7= Go to (I2, b) = Closure (A → b•) = A → b• , $   //write the look-ahead as it is in I2

Go to (I3, a) = Closure (A → a•A , a/b) = (same as I3)
Go to (I3, b) = Closure (A → b• , a/b) = (same as I4)

I8= Go to (I3, A) = Closure (A → aA•) = A → aA• , a/b   //write the look-ahead as it is in I3

Go to (I6, a) = Closure (A → a•A , $) = (same as I6)
Go to (I6, b) = Closure (A → b• , $) = (same as I7)

I9= Go to (I6, A) = Closure (A → aA•) = A → aA• , $   //write the look-ahead as it is in I6


CLR Parser-- Example
STEP5:- Draw the DFA (goto graph) of the canonical collection
CLR Parser--Example

STEP6:- Construct the CLR(1) parsing table

• If a state goes to some other state on a terminal, it corresponds to a shift move.
• If a state goes to some other state on a non-terminal, it corresponds to a goto move.
• If a state contains a final item, write the reduce move under the look-ahead symbols of that item.

The only difference between the SLR parser and the CLR(1) parser is that in the CLR(1) parsing table we
place the reduce move only under the look-ahead symbols, not under the FOLLOW of the LHS.

let

S` → S ----------------Accepted
S → AA ----------------r1
A → aA ----------------r2
A→b ----------------r3
CLR Parser -- Example

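From the item sets I0–I9 above, the CLR(1) parsing table can be reconstructed as follows (text form, since the slide's table is a figure; reduce entries appear only under the look-ahead symbols of the final items):

             Action                    Goto
State     a      b      $           S      A
  0       S3     S4                 1      2
  1                     Accept
  2       S6     S7                        5
  3       S3     S4                        8
  4       r3     r3
  5                     r1
  6       S6     S7                        9
  7                     r3
  8       r2     r2
  9                     r2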
CLR Parser -- Example
Explanation
The placement of the shift entries in the CLR(1) parsing table is the same as in the SLR(1) parsing table; the only difference is in the placement of the reduce entries.

I4 contains the final item (A → b•, a/b), so action {I4, a} = r3 and action {I4, b} = r3.
I5 contains the final item (S → AA•, $), so action {I5, $} = r1.
I7 contains the final item (A → b•, $), so action {I7, $} = r3.
I8 contains the final item (A → aA•, a/b), so action {I8, a} = r2 and action {I8, b} = r2.
I9 contains the final item (A → aA•, $), so action {I9, $} = r2.
CLR Parser-- Example
Step7:- Parsing the input string
• For every push operation the pointer advances to the next symbol.
NOTE:- After a PUSH, increment the pointer to point to the next symbol in the input string.
LALR Parser
• LALR refers to the look-ahead LR parser.
• Its power lies between SLR(1) and CLR(1); it is the LR parser most widely used in practice.
• It can handle a large class of grammars.
• To construct the LALR(1) parsing table, we use the canonical collection of LR(1) items.
• LALR(1) parsing is the same as CLR(1) parsing; the only difference is in the parsing table.

STEPS to implement the LALR(1) parser

1. For the given input string write the CFG
2. Check the ambiguity of the grammar
3. Add the augment production to the given grammar
4. Create the canonical collection of LR(1) items
5. Draw the DFA (goto graph)
6. Construct the LALR(1) parsing table
7. Parse the input string
LALR Parser-Example
Example:-
S → AA
A → aA | b

STEP 1&2 :- CFG & ambiguity


Already satisfied

STEP3:- Add augment grammar


Add the augment production and insert the ‘•’ symbol at the first position of every production in G
S` → •S // Augment grammar
S → •AA
A → •aA
A → •b
LALR Parser-- Example
STEP4:- canonical collection of LR(1) items
I0 = S` → •S , $ // look-ahead is $ as it is the augment production
S → •AA , $ //look-ahead= FIRST(remaining part after .NT in previous production ) i.e, FIRST($)= $
A → •aA , a/b //look-ahead= FIRST(remaining part after .NT in previous production ) i.e, FIRST(A,$)= a/b
A → •b , a/b

I1= Go to (I0, S) = closure (S` → S•) = S` → S• , $ //write the look-ahead as it is in I0

I2= Go to (I0, A) = closure (S → A•A) s


I2 =S→A•A , $ //write the look-ahead as it is in I0
A → •aA , $ //Add all productions of A in to I2 State because "•" is followed by the non-terminal
A → •b , $ // look-ahead= FIRST(remaining part after .NT in previous production ) i.e, FIRST($)= $

I3= Go to (I0,a) = Closure (A → a•A)


I3= A → a•A , a/b   //write the look-ahead as it is in I0
    A → •aA , a/b   //Add all productions of A into I3 because “•” is followed by the non-terminal
    A → •b , a/b    // look-ahead = FIRST(remaining part after •NT followed by the look-ahead) i.e. FIRST(a/b) = a/b
LALR Parser-- Example
I4= Go to (I0, b) = closure (A → b•) = A → b•, a/b //write the look-ahead as it is in I0

I5= Go to (I2, A) = Closure (S → AA•) = S → AA• , $ //write the look-ahead as it is in I2

I6= Go to (I2, a) = Closure (A → a•A)

I6 = A → a•A , $    //write the look-ahead as it is in I2
     A → •aA , $    //Add all productions of A into I6 because “•” is followed by the non-terminal
     A → •b , $     // look-ahead = FIRST(remaining part after •NT followed by the look-ahead) i.e. FIRST($) = $

I7= Go to (I2, b) = Closure (A → b•) = A → b• , $   //write the look-ahead as it is in I2

Go to (I3, a) = Closure (A → a•A , a/b) = (same as I3)
Go to (I3, b) = Closure (A → b• , a/b) = (same as I4)

I8= Go to (I3, A) = Closure (A → aA•) = A → aA• , a/b   //write the look-ahead as it is in I3

Go to (I6, a) = Closure (A → a•A , $) = (same as I6)
Go to (I6, b) = Closure (A → b• , $) = (same as I7)

I9= Go to (I6, A) = Closure (A → aA•) = A → aA• , $   //write the look-ahead as it is in I6


LALR Parser-- Example
If we analyze the LR(0) items of I3 and I6, they are the same; they differ only in their look-aheads.

I3 = { A → a•A, a/b        I6 = { A → a•A, $
       A → •aA, a/b               A → •aA, $
       A → •b,  a/b }             A → •b,  $ }

Clearly I3 and I6 have the same LR(0) items but differ in their look-aheads, so we can combine them
into a state called I36.
I36 = { A → a•A, a/b/$
        A → •aA, a/b/$
        A → •b,  a/b/$ }

I4 and I7 are the same but differ only in their look-aheads, so we can combine them into I47.
I47 = { A → b•, a/b/$ }

I8 and I9 are the same but differ only in their look-aheads, so we can combine them into I89.
I89 = { A → aA•, a/b/$ }
LALR Parser-- Example
STEP5:- Draw the DFA (goto graph) of the merged collection
LALR Parser--Example

STEP6:- Construct the LALR(1) parsing table

• If a state goes to some other state on a terminal, it corresponds to a shift move.
• If a state goes to some other state on a non-terminal, it corresponds to a goto move.
• If a state contains a final item, write the reduce move under the look-ahead symbols of that item.

The only difference between the CLR(1) parser and the LALR(1) parser is that in the LALR(1) parsing
table we combine states that are similar but have different look-aheads.

Ex:- I4 = {A → b•, a/b} & I7 = {A → b•, $ } = I47 = {A → b•, a/b/$ }

let
S` → S ----------------Accepted
S → AA ----------------r1
A → aA ----------------r2
A→b ----------------r3
LALR Parser -- Example

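Merging the states as described gives the LALR(1) parsing table below (reconstructed in text form; the merged states keep their combined names 36, 47 and 89):

             Action                      Goto
State     a       b       $           S       A
  0       S36     S47                 1       2
  1                       Accept
  2       S36     S47                         5
  36      S36     S47                         89
  47      r3      r3      r3
  5                       r1
  89      r2      r2      r2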
LALR Parser -- Example
Explanation
The placement of the shift entries in the LALR(1) parsing table is the same as in the CLR(1) parsing table; the only
difference is that in the LALR parsing table construction we merge the similar states.

I3 and I6 are similar except for their look-aheads.
I4 and I7 are similar except for their look-aheads.
I8 and I9 are similar except for their look-aheads.

Wherever there is a 3 or a 6, make it 36 (the combined form).
Wherever there is a 4 or a 7, make it 47 (the combined form).
Wherever there is an 8 or a 9, make it 89 (the combined form).
LALR Parser-- Example

Step7:- Parsing the input string
• For every push operation the pointer advances to the next symbol.
