
Top Down Parser - Compiler Design - Dr. D. P. Sharma - NITK Surathkal by Wahid311

The document discusses top-down parsing. It begins by explaining that a top-down parser starts with the root node of the parse tree, labeled with the goal symbol. It then describes the top-down parsing algorithm of constructing the root node and repeatedly expanding nodes until the fringe matches the input string. Left recursion is identified as a problem for top-down parsers as it can lead to non-termination, and the technique of eliminating left recursion by transforming grammars to right-recursive form is covered.

Uploaded by

Abdul Wahid Khan
Copyright
© Attribution Non-Commercial (BY-NC)

Top-Down Parser
• A top-down parser starts with the root of the parse tree.
• The root node is labeled with the goal symbol of the grammar.
Top-Down Parsing Algorithm
• Construct the root node of the parse tree.
• Repeat until the fringe of the parse tree matches the input string.
Top-Down Parsing
• At a node labeled A, select a production with A on its lhs.
• For each symbol on its rhs, construct the appropriate child.
Top-Down Parsing
• When a terminal symbol is added to the fringe and it does not match the input string, backtrack.
• Find the next node to be expanded.
Top-Down Parsing
• The key is picking the right production in the first step, when a production for A is selected.
• That choice should be guided by the input string.
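The expand, match, and backtrack steps above can be sketched as a small recognizer. This is only an illustration, not the deck's own code: the names GRAMMAR, derive, and accepts are hypothetical, tokens are given by kind ("id", "num"), and the toy grammar is deliberately right-recursive (an assumption) so that the depth-first search always terminates.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

# Hypothetical toy grammar (not the one in the slides): right-recursive, so the
# depth-first search below cannot expand forever without consuming input.
GRAMMAR: Dict[str, List[Tuple[str, ...]]] = {
    "Goal": [("expr",)],
    "expr": [("term", "+", "expr"), ("term", "-", "expr"), ("term",)],
    "term": [("factor", "*", "term"), ("factor", "/", "term"), ("factor",)],
    "factor": [("num",), ("id",), ("(", "expr", ")")],
}


def derive(form: Tuple[str, ...], tokens: Sequence[str], pos: int) -> Iterator[int]:
    """Yield each input position at which `form`, matched from `pos`, can end."""
    if not form:
        yield pos
        return
    head, rest = form[0], form[1:]
    if head in GRAMMAR:
        # Select a production for the leftmost non-terminal. Trying the
        # alternatives in order is what implements backtracking.
        for rhs in GRAMMAR[head]:
            yield from derive(rhs + rest, tokens, pos)
    elif pos < len(tokens) and tokens[pos] == head:
        # The terminal on the fringe matches the input: advance.
        yield from derive(rest, tokens, pos + 1)
    # Otherwise the terminal does not match: nothing is yielded, so the
    # caller falls back to its next alternative.


def accepts(tokens: Sequence[str]) -> bool:
    """True if some derivation from Goal consumes the whole input."""
    return any(end == len(tokens) for end in derive(("Goal",), tokens, 0))
```

For example, accepts(["id", "-", "num", "*", "id"]) corresponds to recognizing x - 2 * y. Every wrong choice costs a re-scan of the input, which is why the choice of production matters so much.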
Expression Grammar
 1  Goal   → expr
 2  expr   → expr + term
 3         | expr – term
 4         | term
 5  term   → term * factor
 6         | term ∕ factor
 7         | factor
 8  factor → number
 9         | id
10         | ( expr )
Top-Down Parsing
• Let’s try parsing x – 2 * y
P   Sentential Form     Input (↑ = current position)
-   Goal                ↑x – 2 * y
1   expr                ↑x – 2 * y
2   expr + term         ↑x – 2 * y
4   term + term         ↑x – 2 * y
7   factor + term       ↑x – 2 * y
9   <id,x> + term       ↑x – 2 * y
-   <id,x> + term       x ↑– 2 * y

This worked well, except that “–” does not match “+”.
The parser must backtrack to the step where it chose rule 2.
P   Sentential Form     Input (↑ = current position)
-   Goal                ↑x – 2 * y
1   expr                ↑x – 2 * y
3   expr – term         ↑x – 2 * y
4   term – term         ↑x – 2 * y
7   factor – term       ↑x – 2 * y
9   <id,x> – term       ↑x – 2 * y
-   <id,x> – term       x ↑– 2 * y
-   <id,x> – term       x – ↑2 * y

This time the “–” and “–” matched.
We can advance past “–” to look at “2”.
Now, we need to expand “term”.
P   Sentential Form       Input (↑ = current position)
-   <id,x> – term         x – ↑2 * y
7   <id,x> – factor       x – ↑2 * y
8   <id,x> – <num,2>      x – ↑2 * y
-   <id,x> – <num,2>      x – 2 ↑* y

“2” matches “2”.
We have more input but no non-terminals left to expand.
• The expansion terminated too soon.
• Need to backtrack.
P   Sentential Form               Input (↑ = current position)
-   <id,x> – term                 x – ↑2 * y
5   <id,x> – term * factor        x – ↑2 * y
7   <id,x> – factor * factor      x – ↑2 * y
8   <id,x> – <num,2> * factor     x – ↑2 * y
-   <id,x> – <num,2> * factor     x – 2 ↑* y
-   <id,x> – <num,2> * factor     x – 2 * ↑y
9   <id,x> – <num,2> * <id,y>     x – 2 * ↑y
-   <id,x> – <num,2> * <id,y>     x – 2 * y↑

Success! We matched and consumed all the input.
Another Possible Parse
P   Sentential Form                       Input (↑ = current position)
-   Goal                                  ↑x – 2 * y
1   expr                                  ↑x – 2 * y
2   expr + term                           ↑x – 2 * y
2   expr + term + term                    ↑x – 2 * y
2   expr + term + term + term             ↑x – 2 * y
2   expr + term + term + term + ....      ↑x – 2 * y

Consuming no input!!
Wrong choice of expansion leads to non-termination.
Parser must make the right choice.
Left Recursion
Top-down parsers cannot handle left-recursive grammars.
Left Recursion
Formally,
A grammar is left recursive if ∃ A ∈ NT such that ∃ a derivation A ⇒⁺ A α, for some string α ∈ (NT ∪ T)*
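The definition can be checked mechanically. The sketch below is one possible approach, not something taken from the deck: it records which non-terminals can appear leftmost after one expansion (skipping over non-terminals that can derive the empty string) and then asks whether any non-terminal can reach itself. The names Grammar, nullable_set, and left_recursive are hypothetical, and a grammar is assumed to be a dict from non-terminal to a list of right-hand-side tuples.

```python
from typing import Dict, List, Set, Tuple

Grammar = Dict[str, List[Tuple[str, ...]]]


def nullable_set(grammar: Grammar) -> Set[str]:
    """Non-terminals that can derive the empty string (an empty tuple is ε)."""
    nullable: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs not in nullable and any(
                all(sym in nullable for sym in rhs) for rhs in alternatives
            ):
                nullable.add(lhs)
                changed = True
    return nullable


def left_recursive(grammar: Grammar) -> Set[str]:
    """Return every non-terminal A that has a derivation A ⇒+ A α."""
    nullable = nullable_set(grammar)
    # leftmost[A]: non-terminals that can come first after one expansion of A.
    leftmost: Dict[str, Set[str]] = {lhs: set() for lhs in grammar}
    for lhs, alternatives in grammar.items():
        for rhs in alternatives:
            for sym in rhs:
                if sym not in grammar:
                    break  # a terminal blocks everything to its right
                leftmost[lhs].add(sym)
                if sym not in nullable:
                    break
    result: Set[str] = set()
    for start in grammar:
        # Follow leftmost edges until nothing new is reachable.
        seen: Set[str] = set()
        stack = list(leftmost[start])
        while stack:
            nt = stack.pop()
            if nt not in seen:
                seen.add(nt)
                stack.extend(leftmost[nt])
        if start in seen:
            result.add(start)
    return result


# The expression grammar used in the parse traces.
EXPR_GRAMMAR: Grammar = {
    "Goal": [("expr",)],
    "expr": [("expr", "+", "term"), ("expr", "-", "term"), ("term",)],
    "term": [("term", "*", "factor"), ("term", "/", "factor"), ("factor",)],
    "factor": [("number",), ("id",), ("(", "expr", ")")],
}
```

Calling left_recursive(EXPR_GRAMMAR) reports expr and term, the two non-terminals whose first alternative begins with themselves.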
Left Recursion
• Our expression grammar is left recursive.
• This can lead to non-termination in a top-down parser.
Left Recursion
• Non-termination is bad in any part of a compiler!
Left Recursion
• For a top-down parser, any recursion must be a right recursion.
• We would like to convert left recursion to right recursion.
Eliminating Left Recursion
To remove left recursion, we transform the grammar.
Eliminating Left Recursion
Consider a grammar fragment:

A → A α
  | β

where neither α nor β starts with A.
Eliminating Left Recursion
We can rewrite this as:

A  → β A'
A' → α A'
   | ε

where A' is a new non-terminal.
Eliminating Left Recursion
A  → β A'
A' → α A'
   | ε

• This accepts the same language but uses only right recursion.
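The rewrite of A → A α | β into A → β A' and A' → α A' | ε can be sketched in a few lines. This is a minimal illustration under stated assumptions: it handles only immediate (direct) left recursion, represents ε as an empty tuple, and builds the new non-terminal's name by appending a prime, which is assumed not to clash with an existing name. The function name is hypothetical.

```python
from typing import Dict, List, Tuple

Grammar = Dict[str, List[Tuple[str, ...]]]


def eliminate_immediate_left_recursion(grammar: Grammar) -> Grammar:
    """Rewrite A → A α | β as A → β A', A' → α A' | ε (ε is the empty tuple)."""
    result: Grammar = {}
    for lhs, alternatives in grammar.items():
        # Split the alternatives into left-recursive ones (keep their α tails)
        # and the rest (the β alternatives).
        alphas = [rhs[1:] for rhs in alternatives if rhs[:1] == (lhs,)]
        betas = [rhs for rhs in alternatives if rhs[:1] != (lhs,)]
        if not alphas:
            result[lhs] = list(alternatives)  # nothing to transform
            continue
        new_nt = lhs + "'"  # assumed not to clash with an existing name
        result[lhs] = [beta + (new_nt,) for beta in betas]
        result[new_nt] = [alpha + (new_nt,) for alpha in alphas] + [()]
    return result


LEFT_RECURSIVE: Grammar = {
    "expr": [("expr", "+", "term"), ("expr", "-", "term"), ("term",)],
    "term": [("term", "*", "factor"), ("term", "/", "factor"), ("factor",)],
    "factor": [("number",), ("id",), ("(", "expr", ")")],
}
RIGHT_RECURSIVE = eliminate_immediate_left_recursion(LEFT_RECURSIVE)
```

Indirect left recursion (A → B ..., B → A ...) is not handled here; it needs an additional substitution pass before this step.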
Eliminating Left Recursion
The expression grammar we have been using contains two cases of left recursion.
Eliminating Left Recursion
expr → expr + term
     | expr – term
     | term
term → term * factor
     | term ∕ factor
     | factor
Eliminating Left Recursion
Applying the transformation yields

expr  → term expr'
expr' → + term expr'
      | – term expr'
      | ε
Eliminating Left Recursion
Applying the transformation yields

term  → factor term'
term' → * factor term'
      | ∕ factor term'
      | ε
Eliminating Left Recursion
• These fragments use only right recursion.
• They retain the original left associativity.
• A top-down parser will terminate using them.
 1  Goal   → expr
 2  expr   → term expr'
 3  expr'  → + term expr'
 4         | – term expr'
 5         | ε
 6  term   → factor term'
 7  term'  → * factor term'
 8         | ∕ factor term'
 9         | ε
10  factor → number
11         | id
12         | ( expr )
Predictive Parsing
• If a top-down parser picks the wrong production, it may need to backtrack.
• The alternative is to look ahead in the input and use context to pick correctly.
Predictive Parsing
• How much lookahead is needed?
• In general, an arbitrarily large amount.
Predictive Parsing
• Fortunately, large classes of CFGs can be parsed with limited lookahead.
• Most programming language constructs fall in those subclasses.
Predictive Parsing
Basic Idea:
Given A → α | β, the parser should be able to choose between α and β.
Predictive Parsing
FIRST Sets:
For some rhs α ∈ G, define FIRST(α) as the set of tokens that appear as the first symbol in some string that derives from α.
Predictive Parsing
That is,
x ∈ FIRST(α) iff α ⇒* x γ, for some γ.
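FIRST sets are usually computed by iterating to a fixed point. The sketch below shows one way to do that for the right-recursive expression grammar; it is an illustration rather than the deck's own algorithm. The names GRAMMAR, first_of, and first_sets are hypothetical, ε is represented by the string "ε" in the sets and by an empty tuple in the grammar, and ASCII "-" and "/" stand in for the operator tokens.

```python
from typing import Dict, List, Sequence, Set, Tuple

Grammar = Dict[str, List[Tuple[str, ...]]]
EPS = "ε"

GRAMMAR: Grammar = {
    "Goal": [("expr",)],
    "expr": [("term", "expr'")],
    "expr'": [("+", "term", "expr'"), ("-", "term", "expr'"), ()],
    "term": [("factor", "term'")],
    "term'": [("*", "factor", "term'"), ("/", "factor", "term'"), ()],
    "factor": [("number",), ("id",), ("(", "expr", ")")],
}


def first_of(symbols: Sequence[str], first: Dict[str, Set[str]], grammar: Grammar) -> Set[str]:
    """FIRST of a string of symbols, given the FIRST sets found so far."""
    out: Set[str] = set()
    for sym in symbols:
        if sym not in grammar:  # a terminal is its own first token
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return out
    out.add(EPS)  # every symbol can vanish, or the string is empty
    return out


def first_sets(grammar: Grammar) -> Dict[str, Set[str]]:
    """FIRST of every non-terminal, computed by iterating until nothing changes."""
    first: Dict[str, Set[str]] = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of(rhs, first, grammar)
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first
```

With this grammar, FIRST(factor) comes out as number, id, and "(", while FIRST(expr') contains "+", "-", and ε.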
Predictive Parsing
The LL(1) Property
If A → α and A → β both appear in the grammar, we would like
FIRST(α) ∩ FIRST(β) = ∅
Predictive Parsing
Predictive parsers accept LL(k) grammars:
• first “L”: left-to-right scan of the input
• second “L”: left-most derivation
• “k”: k tokens of lookahead
Predictive Parsing
The LL(1) Property
FIRST(α) ∩ FIRST(β) = ∅
allows the parser to make a correct choice with a lookahead of exactly one symbol!
Predictive Parsing
What about ε-productions?
They complicate the definition of LL(1).
Predictive Parsing
If A → α and A → β and ε ∈ FIRST(α), then we need to ensure that FIRST(β) is disjoint from FOLLOW(α), too.
Predictive Parsing
FOLLOW(α) is the set of all words in the grammar that can legally appear after an α.
Predictive Parsing
For a non-terminal X, FOLLOW(X) is the set of symbols that might follow the derivation of X.
Predictive Parsing
[Figure: FIRST and FOLLOW of a non-terminal X]
Predictive Parsing
Define FIRST+(α) as
• FIRST(α) ∪ FOLLOW(α), if ε ∈ FIRST(α)
• FIRST(α), otherwise
Predictive Parsing
Then a grammar is LL(1) iff A → α and A → β implies
FIRST+(α) ∩ FIRST+(β) = ∅
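The LL(1) test can be sketched end to end: compute FIRST, compute FOLLOW, form FIRST+ for each alternative, and check that the alternatives of every non-terminal are pairwise disjoint. This is a minimal sketch under stated assumptions, not the deck's code: all function names are hypothetical, "eof" is an assumed end-of-input marker, and FOLLOW is taken on the left-hand-side non-terminal of the production being examined.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Set, Tuple

Grammar = Dict[str, List[Tuple[str, ...]]]
EPS, EOF = "ε", "eof"  # "eof" as the end-of-input marker is an assumption


def first_of(symbols: Sequence[str], first: Dict[str, Set[str]], grammar: Grammar) -> Set[str]:
    out: Set[str] = set()
    for sym in symbols:
        if sym not in grammar:  # terminal
            return out | {sym}
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return out
    return out | {EPS}


def first_sets(grammar: Grammar) -> Dict[str, Set[str]]:
    first: Dict[str, Set[str]] = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of(rhs, first, grammar)
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first


def follow_sets(grammar: Grammar, first: Dict[str, Set[str]], start: str) -> Dict[str, Set[str]]:
    follow: Dict[str, Set[str]] = {nt: set() for nt in grammar}
    follow[start].add(EOF)
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue
                    tail = first_of(rhs[i + 1:], first, grammar)
                    new = tail - {EPS}
                    if EPS in tail:  # whatever follows lhs can follow sym
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def is_ll1(grammar: Grammar, start: str) -> bool:
    first = first_sets(grammar)
    follow = follow_sets(grammar, first, start)
    for lhs, alternatives in grammar.items():
        plus = []
        for rhs in alternatives:
            f = first_of(rhs, first, grammar)
            plus.append(f | follow[lhs] if EPS in f else f)  # FIRST+
        if any(a & b for a, b in combinations(plus, 2)):
            return False
    return True


RIGHT_RECURSIVE: Grammar = {
    "Goal": [("expr",)],
    "expr": [("term", "expr'")],
    "expr'": [("+", "term", "expr'"), ("-", "term", "expr'"), ()],
    "term": [("factor", "term'")],
    "term'": [("*", "factor", "term'"), ("/", "factor", "term'"), ()],
    "factor": [("number",), ("id",), ("(", "expr", ")")],
}
LEFT_RECURSIVE: Grammar = {
    "Goal": [("expr",)],
    "expr": [("expr", "+", "term"), ("expr", "-", "term"), ("term",)],
    "term": [("term", "*", "factor"), ("term", "/", "factor"), ("factor",)],
    "factor": [("number",), ("id",), ("(", "expr", ")")],
}
```

Under these assumptions the right-recursive grammar passes the check and the original left-recursive grammar fails it, because all three alternatives for expr begin with the same tokens.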
Predictive Parsing
Given a grammar that has the LL(1) property
• we can write a simple routine to recognize each lhs
• code is simple and fast
Predictive Parsing
Consider
A → β1 | β2 | β3
which satisfies the LL(1) property: FIRST+(βi) ∩ FIRST+(βj) = ∅ for i ≠ j
/* find an A */
if (token ∈ FIRST(β1))
    find a β1 and return true
else if (token ∈ FIRST(β2))
    find a β2 and return true
else if (token ∈ FIRST(β3))
    find a β3 and return true
else
    error and return false
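The same selection logic can be written as a small runnable sketch. Everything concrete here is an assumption made for illustration: the FIRST sets for β1, β2, and β3 are invented, the recognizer helper merely records which alternative was chosen, and find plays the role of the "find an A" routine.

```python
from typing import Callable, List, Set, Tuple

# Each alternative βi is a (FIRST(βi), routine) pair. The routine is assumed
# to consume βi from the input and report success.
Alternative = Tuple[Set[str], Callable[[], bool]]

trace: List[str] = []  # records which alternative was chosen


def recognizer(name: str) -> Callable[[], bool]:
    """Stand-in for a real "find a βi" routine (hypothetical)."""
    def run() -> bool:
        trace.append(name)
        return True
    return run


def find(token: str, alternatives: List[Alternative]) -> bool:
    """Pick the one alternative whose FIRST set contains the lookahead token."""
    for first_set, find_beta in alternatives:
        if token in first_set:
            return find_beta()
    return False  # no alternative can start with this token: error


# Invented FIRST sets for A → β1 | β2 | β3, pairwise disjoint as LL(1) requires.
ALTERNATIVES_FOR_A: List[Alternative] = [
    ({"number"}, recognizer("β1")),
    ({"id"}, recognizer("β2")),
    ({"("}, recognizer("β3")),
]
```

Because the sets are disjoint, at most one test can succeed, so the order of the tests does not affect the result.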
Predictive Parsing
Grammars with the LL(1) property are called predictive grammars because the parser can “predict” the correct expansion at each point in the parse.
Predictive Parsing
• Parsers that capitalize on the LL(1) property are called predictive parsers.
• One kind of predictive parser is the recursive descent parser.
Recursive Descent Parsing
 1  Goal   → expr
 2  expr   → term expr'
 3  expr'  → + term expr'
 4         | – term expr'
 5         | ε
 6  term   → factor term'
 7  term'  → * factor term'
 8         | ∕ factor term'
 9         | ε
10  factor → number
11         | id
12         | ( expr )
Recursive Descent Parsing
This leads to a parser with six mutually recursive routines:
• Goal
• Expr
• EPrime
• Term
• TPrime
• Factor
Recursive Descent Parsing
Each recognizes one non-terminal (NT) or terminal (T).
Recursive Descent Parsing
• The term descent refers to the direction in which the parse tree is built.
• Here are some of these routines written as functions.
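The routines themselves are not included in this transcript, so the following is only a sketch of what six mutually recursive routines for the right-recursive expression grammar might look like. It is not the author's code. Assumptions: the class name Parser and its helpers are hypothetical, tokens are given by kind ("id", "number") rather than as <kind, lexeme> pairs, "eof" is an assumed end-of-input marker, and the routines only recognize input without building a tree.

```python
from typing import List


class Parser:
    """Recognizer with one method per non-terminal of the expression grammar."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens + ["eof"]  # "eof" end marker is an assumption
        self.pos = 0

    @property
    def token(self) -> str:
        return self.tokens[self.pos]

    def advance(self) -> None:
        self.pos += 1

    def goal(self) -> bool:  # Goal → expr
        return self.expr() and self.token == "eof"

    def expr(self) -> bool:  # expr → term expr'
        return self.term() and self.eprime()

    def eprime(self) -> bool:  # expr' → + term expr' | - term expr' | ε
        if self.token in ("+", "-"):
            self.advance()
            return self.term() and self.eprime()
        # Take the ε alternative only if the lookahead is in FOLLOW(expr').
        return self.token in (")", "eof")

    def term(self) -> bool:  # term → factor term'
        return self.factor() and self.tprime()

    def tprime(self) -> bool:  # term' → * factor term' | / factor term' | ε
        if self.token in ("*", "/"):
            self.advance()
            return self.factor() and self.tprime()
        # Take the ε alternative only if the lookahead is in FOLLOW(term').
        return self.token in ("+", "-", ")", "eof")

    def factor(self) -> bool:  # factor → number | id | ( expr )
        if self.token in ("number", "id"):
            self.advance()
            return True
        if self.token == "(":
            self.advance()
            if self.expr() and self.token == ")":
                self.advance()
                return True
        return False
```

For example, Parser(["id", "-", "number", "*", "id"]).goal() corresponds to recognizing x - 2 * y. Each routine decides what to do from a single lookahead token, so no backtracking is needed.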
