Module 2a - With soln
Role of parser
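[Figure: role of the parser. Source program → Lexical analyzer → (token) → Parser → (parse tree) → Rest of front end → IR. The parser sends "get next token" requests to the lexical analyzer. The symbol table is shown alongside these phases.]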
Nonterminal symbol:
The name of a syntax category of a language, e.g., noun, verb, etc.
It is written as a single capital letter, or as a name enclosed between < … >, e.g., A or <Noun>.
<Noun Phrase> → <Article><Noun>
<Article> → a | an | the
<Noun> → boy | apple
Context free grammar
• A context free grammar (CFG) is a 4-tuple G = (V, Σ, S, P) where,
V is a finite set of non terminals,
Σ is a disjoint finite set of terminals,
S is an element of V and it is the start symbol,
P is a finite set of formulas of the form A → α where A ∈ V and α ∈ (V ∪ Σ)*
Terminal symbol:
A symbol in the alphabet.
It is denoted by a lowercase letter or by a punctuation mark used in the language.
Start symbol:
The first nonterminal symbol of the grammar is called the start symbol.
Production:
A production, also called a rewriting rule, is a rule of the grammar. It has the form
Nonterminal symbol → String of terminal and nonterminal symbols
<Noun Phrase> → <Article><Noun>
<Article> → a | an | the
<Noun> → boy | apple
Example: Grammar
Write the terminals, non terminals, start symbol, and productions for the following grammar.
E → E O E | (E) | -E | id
O → + | - | * | / | ↑
Terminals: id + - * / ↑ ( )
Non terminals: E, O
Start symbol: E
Productions: E → E O E | (E) | -E | id
O → + | - | * | / | ↑
Derivation & Ambiguity
Derivation
• Derivation is used to find whether the string belongs to a given
grammar or not.
• Types of derivations are:
1. Leftmost derivation
2. Rightmost derivation
Leftmost derivation
• A derivation of a string in a grammar is a leftmost derivation if at every step the leftmost non terminal is replaced.
• Grammar: S → S+S | S-S | S*S | S/S | a    Output string: a*a-a
Leftmost derivation:
S
⇒ S-S
⇒ S*S-S
⇒ a*S-S
⇒ a*a-S
⇒ a*a-a
Parse tree (the parse tree represents the structure of the derivation):
        S
      / | \
     S  -  S
   / | \   |
  S  *  S  a
  |     |
  a     a
Rightmost derivation
• A derivation of a string in a grammar is a rightmost derivation if at every step the rightmost non terminal is replaced.
• It is also called canonical derivation.
• Grammar: S → S+S | S-S | S*S | S/S | a    Output string: a*a-a
Rightmost derivation:
S
⇒ S*S
⇒ S*S-S
⇒ S*S-a
⇒ S*a-a
⇒ a*a-a
Parse tree:
        S
      / | \
     S  *  S
     |   / | \
     a  S  -  S
        |     |
        a     a
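Both derivations of a*a-a can be replayed step by step. This is a small sketch, assuming single-character symbols and a hypothetical helper derive that is given the sequence of right-hand sides to apply; it does not check that each right-hand side is a real production.

```python
def derive(start, choices, leftmost=True, nonterminals=("S",)):
    """Apply the given right-hand sides in order, always rewriting the
    leftmost (or rightmost) nonterminal, and return every sentential form."""
    form = [start]
    steps = ["".join(form)]
    for rhs in choices:
        positions = [i for i, sym in enumerate(form) if sym in nonterminals]
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = list(rhs)  # replace one nonterminal by the chosen RHS
        steps.append("".join(form))
    return steps

left = derive("S", ["S-S", "S*S", "a", "a", "a"], leftmost=True)
right = derive("S", ["S*S", "S-S", "a", "a", "a"], leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

The two runs end in the same string but pass through different sentential forms and correspond to different parse trees, which is why this grammar is ambiguous.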
Exercise: Derivation
1. Perform leftmost derivation and draw parse tree.
S → A1B
A → 0A | ε
B → 0B | 1B | ε
Output string: 1001
2. Perform leftmost derivation and draw parse tree.
S → 0S1 | 01    Output string: 000111
3. Perform rightmost derivation and draw parse tree.
E → E+E | E*E | id | (E) | -E
Output string: id + id * id
Ambiguity
• An ambiguity is a word, phrase, or statement that has more than one meaning.
• Example: the word "chip".
• In a grammar, a string α is ambiguous when it can be derived from either N1 or N2.
Ambiguous grammar
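• An ambiguous grammar is one that produces more than one leftmost or more than one rightmost derivation for the same sentence.
• Grammar: S → S+S | S*S | (S) | a    Output string: a+a*a
Leftmost derivation 1:
S ⇒ S*S ⇒ S+S*S ⇒ a+S*S ⇒ a+a*S ⇒ a+a*a
Parse tree 1:
        S
      / | \
     S  *  S
   / | \   |
  S  +  S  a
  |     |
  a     a
Leftmost derivation 2:
S ⇒ S+S ⇒ a+S ⇒ a+S*S ⇒ a+a*S ⇒ a+a*a
Parse tree 2:
        S
      / | \
     S  +  S
     |   / | \
     a  S  *  S
        |     |
        a     a
• Here, two leftmost derivations for the string a+a*a are possible, so the grammar is ambiguous.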
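Since each parse tree corresponds to exactly one leftmost derivation, ambiguity for a given string can be checked by counting parse trees. The sketch below is written specifically for the grammar S → S+S | S*S | (S) | a; the function name count_trees is an assumption for illustration.

```python
from functools import lru_cache

def count_trees(s):
    """Count parse trees of s under S -> S+S | S*S | (S) | a."""
    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees that derive the substring s[i:j] from S.
        total = 0
        if j - i == 1 and s[i] == "a":                      # S -> a
            total += 1
        if j - i >= 3 and s[i] == "(" and s[j - 1] == ")":  # S -> (S)
            total += count(i + 1, j - 1)
        for k in range(i + 1, j - 1):                       # S -> S+S | S*S
            if s[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(s))

print(count_trees("a+a*a"))
```

A count greater than 1 for any string shows that the grammar is ambiguous; a count of 1 for one particular string proves nothing about the grammar as a whole.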
Exercise: Ambiguous Grammar
Check Ambiguity in following grammars:
1. S → aS | Sa | ε (output string: aaaa)
2. S → aSbS | bSaS | ε (output string: abab)
3. S → SS+ | SS* | a (output string: aa+a*)
4. <exp> → <exp> + <term> | <term>
<term> → <term> * <letter> | <letter>
<letter> → a|b|c|…|z (output string: a+b*c)
5. Prove that the CFG with productions S → a | Sa | bSS | SSb | SbS is ambiguous (Hint: choose an output string yourself)
Left recursion & Left factoring
Left recursion
• A grammar is said to be left recursive if it has a non terminal A such that there is a derivation A ⇒+ Aα for some string α.
• Elimination of left recursion: the productions
A → Aα | β
are replaced by
A → βA'
A' → αA' | ε
Examples: Left recursion elimination
E → E+T | T
becomes
E → TE'
E' → +TE' | ε

T → T*F | F
becomes
T → FT'
T' → *FT' | ε

X → X%Y | Z
becomes
X → ZX'
X' → %YX' | ε
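The transformation in these examples can be sketched in a few lines. This is a minimal sketch for immediate left recursion only, assuming alternatives are lists of symbols and [] stands for ε; the function name is hypothetical.

```python
def eliminate_immediate_left_recursion(nt, alternatives):
    """Rewrite A -> Aα | β as A -> βA', A' -> αA' | ε."""
    # α parts: alternatives that start with the nonterminal itself.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nt]
    # β parts: all other alternatives.
    betas = [alt for alt in alternatives if not alt or alt[0] != nt]
    if not alphas:
        return {nt: alternatives}  # nothing to eliminate
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],  # [] is ε
    }

result = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
print(result)
```

Indirect left recursion (for example A → Bx, B → Ay) is not handled by this sketch.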
Exercise: Left recursion
1. A → Abd | Aa | a
B → Be | b
2. A → AB | AC | a | b
3. S → A | B
A → ABC | Acd | a | aa
B → Bee | b
4. Exp → Exp+term | Exp-term | term
Left factoring
Left factoring is a grammar transformation that is useful for
producing a grammar suitable for predictive parsing.
S → aAB | aCD
becomes
S → aS'
S' → AB | CD

A → xByA | xByAzA | a
becomes
A → xByAA' | a
A' → ε | zA

A → aAB | aA | a
becomes
A → aA'
A' → AB | A | ε
and, after left factoring A' again,
A' → AA'' | ε
A'' → B | ε
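One left factoring step, as in the examples above, can be sketched as follows. The sketch assumes single-character symbols and factors the longest common prefix of the first group of alternatives that share a leading symbol; left_factor_once is a hypothetical name, and repeated application (as in the third example) is left to the caller.

```python
def left_factor_once(nt, alternatives):
    """Factor the longest common prefix of alternatives sharing a first symbol."""
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[0] if alt else None, []).append(alt)
    for first, group in groups.items():
        if first is None or len(group) < 2:
            continue
        # Longest common prefix of every alternative in the group.
        prefix = group[0]
        for alt in group[1:]:
            n = 0
            while n < min(len(prefix), len(alt)) and prefix[n] == alt[n]:
                n += 1
            prefix = prefix[:n]
        new_nt = nt + "'"
        rest = [alt for alt in alternatives if alt not in group]
        return {
            nt: [prefix + [new_nt]] + rest,
            new_nt: [alt[len(prefix):] for alt in group],  # [] is ε
        }
    return {nt: alternatives}  # no common prefix found

factored = left_factor_once("A", [list("xByA"), list("xByAzA"), list("a")])
print(factored)
```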
Exercise
1. S → iEtS | iEtSeS | a
2. A → ad | a | ab | abc | x
Parsing
• Parsing is a technique that takes an input string and produces either a parse tree, if the string is a valid sentence of the grammar, or an error message indicating that the string is not a valid sentence.
• Types of parsing are:
1. Top down parsing: the parser builds the parse tree from the top (root) to the bottom (leaves).
2. Bottom up parsing: the parser starts from the leaves and works up to the root.
Classification of parsing methods
[Figure: classification tree of parsing methods]
Backtracking
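• In backtracking, while expanding a nonterminal symbol we choose one alternative, and if any mismatch occurs we try another alternative.
• Grammar: S → cAd, A → ab | a    Input string: cad
1. Expand S → cAd and match c.
2. Make prediction A → ab: a matches, but b does not match d, so backtrack.
3. Make prediction A → a: a matches, then d matches, and the parse succeeds.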
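The backtracking search on the grammar S → cAd, A → ab | a can be sketched with a generator that tries the alternatives in order. The names GRAMMAR, parse and accepts are assumptions for illustration, and the sketch would loop forever on a left recursive grammar.

```python
GRAMMAR = {"S": [["c", "A", "d"]], "A": [["a", "b"], ["a"]]}

def parse(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Nonterminal: try each alternative in order; a failed alternative
        # simply yields nothing, which is the backtracking step.
        for alt in GRAMMAR[head]:
            yield from parse(alt + rest, text, pos)
    elif pos < len(text) and text[pos] == head:
        # Terminal: it must match the current input symbol.
        yield from parse(rest, text, pos + 1)

def accepts(text):
    return any(end == len(text) for end in parse(["S"], text, 0))

print(accepts("cad"))
```

For the input cad, the alternative A → ab is tried first and fails at b, after which A → a succeeds.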
LL(1) parser (predictive parser)
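• LL(1) is a non recursive top down parser.
1. The first L indicates that the input is scanned from left to right.
2. The second L means that it uses a leftmost derivation for the input string.
3. The 1 means that it uses only one input symbol (the lookahead) to predict the parsing process.
[Figure: model of a predictive parser. An input buffer (a + b $) and a stack (X Y Z $) feed the predictive parsing program, which consults the parsing table M and produces the output.]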
LL(1) parsing (predictive parsing)
Steps to construct LL(1) parser
1. Remove left recursion / Perform left factoring (if any).
2. Compute FIRST and FOLLOW of non terminals.
3. Construct predictive parsing table.
4. Parse the input string using parsing table.
Rules to compute first of non terminal
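1. If A → aα and a is a terminal, add a to FIRST(A).
2. If A → ε, add ε to FIRST(A).
3. If X is a nonterminal and X → Y1Y2…Yk is a production, then place a in FIRST(X) if for some i, a is in FIRST(Yi), and ε is in all of FIRST(Y1), …, FIRST(Yi-1); that is, Y1…Yi-1 ⇒* ε. If ε is in FIRST(Yj) for all j = 1, 2, …, k, then add ε to FIRST(X).
Everything in FIRST(Y1) is surely in FIRST(X). If Y1 does not derive ε, then we do nothing more to FIRST(X), but if Y1 ⇒* ε, then we add FIRST(Y2), and so on.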
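Rules to compute first of non terminal
Simplification of Rule 3
If A → Y1Y2…Yk,
• If Y1 does not derive ε, then FIRST(A) = FIRST(Y1)
• If Y1 derives ε, then FIRST(A) = (FIRST(Y1) − {ε}) ∪ FIRST(Y2)
• If Y1 & Y2 derive ε, then FIRST(A) = (FIRST(Y1) − {ε}) ∪ (FIRST(Y2) − {ε}) ∪ FIRST(Y3)
• If Y1, Y2 & Y3 derive ε, then FIRST(A) = (FIRST(Y1) − {ε}) ∪ (FIRST(Y2) − {ε}) ∪ (FIRST(Y3) − {ε}) ∪ FIRST(Y4), and so on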
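The FIRST rules can be applied repeatedly until no set changes. The following is a minimal sketch of that fixed-point computation, assuming an illustrative expression grammar encoded as a dictionary, with [] standing for an ε-production; the names EPS, GRAMMAR, first_of_string and first_sets are assumptions.

```python
EPS = "ε"

# Illustrative expression grammar; [] stands for an ε-production.
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}

def first_of_string(symbols, first):
    """FIRST of a string Y1 Y2 ... Yk, given the FIRST sets known so far."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})  # FIRST of a terminal is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result                  # Yi cannot vanish: stop here
    result.add(EPS)                        # every Yi can vanish (or k = 0)
    return result

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                         # repeat until nothing is added
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_string(alt, first) - first[nt]
                if new:
                    first[nt] |= new
                    changed = True
    return first

FIRST = first_sets(GRAMMAR)
for nt, symbols in FIRST.items():
    print(nt, sorted(symbols))
```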
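Rules to compute follow of non terminal
1. Place $ in FOLLOW(S), where S is the start symbol.
2. If A → αBβ, then everything in FIRST(β) except ε is placed in FOLLOW(B).
3. If A → αB, or A → αBβ where FIRST(β) contains ε, then everything in FOLLOW(A) is placed in FOLLOW(B).
Which rule to apply for A → αBβ:
• β is absent: Rule 3
• β is present and does not derive ε: Rule 2
• β is present and derives ε: Rule 2 + Rule 3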
Rules to construct predictive parsing table
1. For each production A → α of the grammar, do steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M[A, a].
3. If ε is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A). If ε is in FIRST(α), and $ is in FOLLOW(A), add A → α to M[A, $].
4. Make each undefined entry of M be error.
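The table construction rules can be sketched as below. To keep the sketch self-contained, the FIRST and FOLLOW sets of the small grammar S → aBa, B → bB | ε are written out by hand; all names (GRAMMAR, FIRST, FOLLOW, build_table) are assumptions for illustration.

```python
EPS = "ε"

GRAMMAR = {"S": [["a", "B", "a"]], "B": [["b", "B"], []]}  # [] is ε
FIRST = {"S": {"a"}, "B": {"b", EPS}}   # written out by hand
FOLLOW = {"S": {"$"}, "B": {"a"}}       # written out by hand

def first_of_string(symbols):
    """FIRST of a right-hand side, using the FIRST sets above."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # FIRST of a terminal is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table(grammar, follow):
    table = {}
    for nt, alternatives in grammar.items():      # Rule 1: each production
        for alt in alternatives:
            first_alpha = first_of_string(alt)
            targets = first_alpha - {EPS}          # Rule 2: terminals in FIRST(α)
            if EPS in first_alpha:
                targets |= follow[nt]              # Rule 3: FOLLOW(A), including $
            for terminal in targets:
                table.setdefault((nt, terminal), []).append(alt)
    return table                                   # Rule 4: missing keys are errors

TABLE = build_table(GRAMMAR, FOLLOW)
IS_LL1 = all(len(entries) == 1 for entries in TABLE.values())
print(TABLE, IS_LL1)
```

Each cell holds a list so that conflicts stay visible: a grammar is LL(1) only if no cell ends up with more than one production.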
Example-1: LL(1) parsing
S → aBa
B → bB | ε
Step 1: Not required
Step 2: Compute FIRST
First(S)
S → aBa: by Rule 1, add a to FIRST(S).
FIRST(S) = { a }
First(B)
B → bB: by Rule 1, add b to FIRST(B).
B → ε: by Rule 2, add ε to FIRST(B).
FIRST(B) = { b, ε }
NT   First
S    { a }
B    { b, ε }
Example-1, Step 2 (continued): Compute FOLLOW
Follow(S)
Rule 1: Place $ in FOLLOW(S)
FOLLOW(S) = { $ }
Follow(B)
S → aBa: by Rule 2, add FIRST(a) = { a } to FOLLOW(B).
B → bB: by Rule 3, FOLLOW(B) receives FOLLOW(B), which adds nothing new.
FOLLOW(B) = { a }
Example-1, Step 3: Prepare predictive parsing table
S → aBa
B → bB | ε
NT   First      Follow
S    { a }      { $ }
B    { b, ε }   { a }
Rule 2: for A → α and each terminal a in FIRST(α), M[A, a] = A → α
S → aBa: FIRST(aBa) = { a }, so M[S, a] = S → aBa
B → bB: FIRST(bB) = { b }, so M[B, b] = B → bB
Rule 3: for A → ε and each terminal b in FOLLOW(A), M[A, b] = A → ε
B → ε: FOLLOW(B) = { a }, so M[B, a] = B → ε
All remaining entries are Error.
NT   Input Symbol
     a          b          $
S    S → aBa    Error      Error
B    B → ε      B → bB     Error
Example-2: LL(1) parsing
S → aB | ε
B → bC | ε
C → cS | ε
Step 1: Not required
Step 2: Compute FIRST
First(S)
S → aB: by Rule 1, add a to FIRST(S).
S → ε: by Rule 2, add ε to FIRST(S).
FIRST(S) = { a, ε }
First(B)
B → bC: by Rule 1, add b to FIRST(B).
B → ε: by Rule 2, add ε to FIRST(B).
FIRST(B) = { b, ε }
First(C)
C → cS: by Rule 1, add c to FIRST(C).
C → ε: by Rule 2, add ε to FIRST(C).
FIRST(C) = { c, ε }
NT   First
S    { a, ε }
B    { b, ε }
C    { c, ε }
Example-2, Step 2 (continued): Compute FOLLOW
S → aB | ε
B → bC | ε
C → cS | ε
Follow(S)
Rule 1: Place $ in FOLLOW(S)
C → cS: by Rule 3, everything in FOLLOW(C) is in FOLLOW(S).
FOLLOW(S) = { $ }
Follow(B)
S → aB: by Rule 3, everything in FOLLOW(S) is in FOLLOW(B).
FOLLOW(B) = { $ }
Follow(C)
B → bC: by Rule 3, everything in FOLLOW(B) is in FOLLOW(C).
FOLLOW(C) = { $ }
NT   First      Follow
S    { a, ε }   { $ }
B    { b, ε }   { $ }
C    { c, ε }   { $ }
Example-2, Step 3: Prepare predictive parsing table
S → aB | ε
B → bC | ε
C → cS | ε
NT   First      Follow
S    { a, ε }   { $ }
B    { b, ε }   { $ }
C    { c, ε }   { $ }
Rule 2: for A → α and each terminal a in FIRST(α), M[A, a] = A → α
S → aB: FIRST(aB) = { a }, so M[S, a] = S → aB
B → bC: FIRST(bC) = { b }, so M[B, b] = B → bC
C → cS: FIRST(cS) = { c }, so M[C, c] = C → cS
Rule 3: for A → ε and each terminal b in FOLLOW(A), M[A, b] = A → ε
S → ε: FOLLOW(S) = { $ }, so M[S, $] = S → ε
B → ε: FOLLOW(B) = { $ }, so M[B, $] = B → ε
C → ε: FOLLOW(C) = { $ }, so M[C, $] = C → ε
All remaining entries are Error.
NT   Input Symbol
     a         b         c         $
S    S → aB    Error     Error     S → ε
B    Error     B → bC    Error     B → ε
C    Error     Error     C → cS    C → ε
Example-3: LL(1) parsing
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
Step 2: Compute FIRST
First(E')
E' → +TE': by Rule 1, add + to FIRST(E').
E' → ε: by Rule 2, add ε to FIRST(E').
FIRST(E') = { +, ε }
First(T')
T' → *FT': by Rule 1, add * to FIRST(T').
T' → ε: by Rule 2, add ε to FIRST(T').
FIRST(T') = { *, ε }
NT   First
E    { (, id }
E'   { +, ε }
T    { (, id }
T'   { *, ε }
F    { (, id }
Example-3, Step 2 (continued): Compute FOLLOW
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
FOLLOW(E)
Rule 1: Place $ in FOLLOW(E)
F → (E): by Rule 2, add FIRST( ) ) = { ) } to FOLLOW(E).
FOLLOW(E) = { $, ) }
FOLLOW(E')
E → TE': by Rule 3, everything in FOLLOW(E) is in FOLLOW(E').
E' → +TE': by Rule 3, FOLLOW(E') receives FOLLOW(E'), which adds nothing new.
FOLLOW(E') = { $, ) }
FOLLOW(T)
E → TE': by Rule 2, add FIRST(E') − {ε} = { + }; by Rule 3 (E' derives ε), add FOLLOW(E) = { $, ) }.
E' → +TE': by Rule 2, add { + }; by Rule 3, add FOLLOW(E') = { $, ) }.
FOLLOW(T) = { +, $, ) }
FOLLOW(T')
T → FT': by Rule 3, everything in FOLLOW(T) is in FOLLOW(T').
T' → *FT': by Rule 3, FOLLOW(T') receives FOLLOW(T'), which adds nothing new.
FOLLOW(T') = { +, $, ) }
FOLLOW(F)
T → FT': by Rule 2, add FIRST(T') − {ε} = { * }; by Rule 3 (T' derives ε), add FOLLOW(T) = { +, $, ) }.
T' → *FT': by Rule 2, add { * }; by Rule 3, add FOLLOW(T') = { +, $, ) }.
FOLLOW(F) = { *, +, $, ) }
NT   First       Follow
E    { (, id }   { $, ) }
E'   { +, ε }    { $, ) }
T    { (, id }   { +, $, ) }
T'   { *, ε }    { +, $, ) }
F    { (, id }   { *, +, $, ) }
Example-3, Step 3: Construct predictive parsing table
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
Rule 2: for A → α and each terminal a in FIRST(α), M[A, a] = A → α
E → TE': FIRST(TE') = { (, id }, so M[E, (] = E → TE' and M[E, id] = E → TE'
E' → +TE': FIRST(+TE') = { + }, so M[E', +] = E' → +TE'
T → FT': FIRST(FT') = { (, id }, so M[T, (] = T → FT' and M[T, id] = T → FT'
T' → *FT': FIRST(*FT') = { * }, so M[T', *] = T' → *FT'
F → (E): FIRST((E)) = { ( }, so M[F, (] = F → (E)
F → id: FIRST(id) = { id }, so M[F, id] = F → id
Rule 3: for A → ε and each terminal b in FOLLOW(A), M[A, b] = A → ε
E' → ε: FOLLOW(E') = { $, ) }, so M[E', $] = E' → ε and M[E', )] = E' → ε
T' → ε: FOLLOW(T') = { +, $, ) }, so M[T', +] = T' → ε, M[T', $] = T' → ε and M[T', )] = T' → ε
• Step 4: Make each undefined entry of table be Error
NT   Input Symbol
     id         +            *            (          )         $
E    E → TE'    Error        Error        E → TE'    Error     Error
E'   Error      E' → +TE'    Error        Error      E' → ε    E' → ε
T    T → FT'    Error        Error        T → FT'    Error     Error
T'   Error      T' → ε       T' → *FT'    Error      T' → ε    T' → ε
F    F → id     Error        Error        F → (E)    Error     Error
Example-3: LL(1) parsing
Step 4: Parse the string: id + id * id $
STACK        INPUT        OUTPUT
E$           id+id*id$
TE'$         id+id*id$    E → TE'
FT'E'$       id+id*id$    T → FT'
idT'E'$      id+id*id$    F → id
T'E'$        +id*id$
E'$          +id*id$      T' → ε
+TE'$        +id*id$      E' → +TE'
TE'$         id*id$
FT'E'$       id*id$       T → FT'
idT'E'$      id*id$       F → id
T'E'$        *id$
*FT'E'$      *id$         T' → *FT'
FT'E'$       id$
idT'E'$      id$          F → id
T'E'$        $
E'$          $            T' → ε
$            $            E' → ε
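The stack trace above can be reproduced by a short table-driven loop. This is a sketch under the assumption that the parsing table of this example is stored as a dictionary keyed by (nonterminal, terminal), with [] for ε; TABLE, NONTERMINALS and ll1_parse are illustrative names.

```python
# Parsing table of the expression grammar; [] stands for ε.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens, start="E"):
    """Return the list of productions used, or raise SyntaxError."""
    tokens = tokens + ["$"]
    stack = ["$", start]          # top of the stack is the end of the list
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, lookahead))
            if rhs is None:       # undefined entry of M means error
                raise SyntaxError(f"no entry M[{top}, {lookahead}]")
            output.append((top, rhs))
            stack.extend(reversed(rhs))  # push RHS so its first symbol is on top
        elif top == lookahead:
            pos += 1              # matched a terminal (or the final $)
        else:
            raise SyntaxError(f"expected {top}, found {lookahead}")
    return output

productions = ll1_parse(["id", "+", "id", "*", "id"])
for lhs, rhs in productions:
    print(lhs, "→", " ".join(rhs) or "ε")
```

The printed productions appear in the same order as the OUTPUT column of the trace.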
Check whether the following grammars are LL(1).
1. S → aBa
B → bB | ε
2. S → aSbS | bSaS | ε
3. S → (S) | ε
4. S → iEtS | iEtSeS | a
E → b
5. S → A
A → Bb | Cd
B → aB | ε
C → cC | ε
6. S → (L) | a
L → L,S | S
Note: Ensure that each cell in the parsing table contains at most one production rule. If any cell contains more than one production rule, the grammar is not LL(1).
Recursive descent parsing
• A top down parser that executes a set of recursive procedures to process the input without backtracking is called a recursive descent parser.
• There is a procedure for each non terminal in the grammar.
• The RHS of each production rule is treated as the definition of the procedure.
• As it reads the expected input symbol, it advances the input pointer to the next position.
Example: Recursive descent parsing
Grammar:
E → num T
T → * num T | ε

Procedure E
{
    If lookahead = num
    {
        Match(num);
        T();
    }
    Else
        Error();
    If lookahead = $
    {
        Declare success;
    }
    Else
        Error();
}

Procedure T
{
    If lookahead = '*'
    {
        Match('*');
        If lookahead = num
        {
            Match(num);
            T();
        }
        Else
            Error();
    }
    Else
        NULL
}

Procedure Match(token t)
{
    If lookahead = t
        lookahead = next_token;
    Else
        Error();
}

Procedure Error
{
    Print("Error");
}

Input 3 * 4 $ : Success
Input 3 4 * $ : Error
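The procedures above translate almost line for line into a small class. This is a minimal sketch, assuming the input has already been tokenized into "num" and "*" tokens; the Parser class and the accepts helper are illustrative names, and errors are raised as exceptions instead of being printed.

```python
class Parser:
    """Recursive descent parser for E -> num T, T -> * num T | ε."""

    def __init__(self, tokens):
        self.tokens = tokens + ["$"]   # $ marks the end of the input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def error(self):
        raise SyntaxError(f"unexpected {self.lookahead!r} at position {self.pos}")

    def match(self, token):
        if self.lookahead == token:
            self.pos += 1              # advance the input pointer
        else:
            self.error()

    def E(self):
        self.match("num")
        self.T()
        if self.lookahead != "$":      # the whole input must be consumed
            self.error()

    def T(self):
        if self.lookahead == "*":
            self.match("*")
            self.match("num")
            self.T()
        # otherwise T -> ε: consume nothing

def accepts(tokens):
    try:
        Parser(tokens).E()
        return True
    except SyntaxError:
        return False

print(accepts(["num", "*", "num"]))   # corresponds to 3 * 4 $
print(accepts(["num", "num", "*"]))   # corresponds to 3 4 * $
```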