Top-Down Parser
A top-down parser starts
with the root of the parse
tree.
The root node is labeled
with the goal symbol of the
grammar
Top-Down Parsing Algorithm
Construct the root node of
the parse tree
Repeat until the fringe of the
parse tree matches the input
string:
Top-Down Parsing
1. At a node labeled A, select
a production with A on its
lhs and, for each symbol on
its rhs, construct the
appropriate child
Top-Down Parsing
2. When a terminal symbol is
added to the fringe and it
does not match the input
string, backtrack
3. Find the next node to be
expanded
Top-Down Parsing
The key is picking the right
production in step 1.
That choice should be
guided by the input string
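The steps above can be sketched as a small backtracking recognizer. This is only an illustration in Python under stated assumptions: the toy grammar, the token names, and the functions `derive` and `parses` are hypothetical and do not come from the slides. The grammar is chosen so that every expansion eventually starts with a token, which guarantees that the search terminates.

```python
# Hypothetical toy grammar: every production of `term` starts with a token,
# so each expansion path consumes input and the search terminates.
GRAMMAR = {
    "expr": [["term", "+", "expr"], ["term", "-", "expr"], ["term"]],
    "term": [["id"], ["num"]],
}


def derive(symbols, tokens, pos):
    """Yield each input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Step 1: select a production for the non-terminal and expand it.
        for rhs in GRAMMAR[head]:
            yield from derive(rhs + rest, tokens, pos)
        # Running out of productions means backtracking to an earlier choice.
    elif pos < len(tokens) and tokens[pos] == head:
        # The terminal matches the input: move on to the next symbol.
        yield from derive(rest, tokens, pos + 1)
    # Otherwise the terminal does not match: yield nothing (step 2, backtrack).


def parses(tokens, goal="expr"):
    """True if some sequence of production choices matches all of `tokens`."""
    return any(end == len(tokens) for end in derive([goal], tokens, 0))
```

For the token list `["id", "-", "num", "+", "id"]` the recognizer first tries `term + expr`, fails when "+" meets "-", and then succeeds with `term - expr`. Trying productions blindly in order is exactly the cost that a choice guided by the input avoids.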
Expression Grammar
1 Goal → expr
2 expr → expr + term
3 | expr - term
4 | term
5 term → term * factor
6 | term ∕ factor
7 | factor
8 factor → number
9 | id
10 | ( expr )
Top-Down Parsing
Let’s try parsing
x–2*y
P   Sentential Form      input   (↑ marks the next token to match)
-   Goal                 ↑x – 2 * y
1   expr                 ↑x – 2 * y
2   expr + term          ↑x – 2 * y
4   term + term          ↑x – 2 * y
7   factor + term        ↑x – 2 * y
9   <id,x> + term        ↑x – 2 * y
9   <id,x> + term        x ↑– 2 * y
This worked well except that “–” does not
match “+”
The parser must backtrack to the step where it
chose production 2 (expr → expr + term)
P   Sentential Form      input
-   Goal                 ↑x – 2 * y
1   expr                 ↑x – 2 * y
3   expr – term          ↑x – 2 * y
4   term – term          ↑x – 2 * y
7   factor – term        ↑x – 2 * y
9   <id,x> – term        ↑x – 2 * y
9   <id,x> – term        x ↑– 2 * y
This time the “–” and “–” matched
-   <id,x> – term        x – ↑2 * y
We can advance past “–” to look at “2”
Now, we need to expand “term”
P   Sentential Form      input
-   <id,x> – term        x – ↑2 * y
7   <id,x> – factor      x – ↑2 * y
8   <id,x> – <num,2>     x – ↑2 * y
-   <id,x> – <num,2>     x – 2 ↑* y
“2” matches “2”
We have more input but no non-terminals
left to expand
The expansion terminated
too soon
Need to backtrack
P   Sentential Form                input
-   <id,x> – term                  x – ↑2 * y
5   <id,x> – term * factor         x – ↑2 * y
7   <id,x> – factor * factor       x – ↑2 * y
8   <id,x> – <num,2> * factor      x – ↑2 * y
-   <id,x> – <num,2> * factor      x – 2 ↑* y
-   <id,x> – <num,2> * factor      x – 2 * ↑y
9   <id,x> – <num,2> * <id,y>      x – 2 * ↑y
-   <id,x> – <num,2> * <id,y>      x – 2 * y ↑
Success! We matched and consumed all the input
Another Possible Parse
P   Sentential Form                    input
-   Goal                               x – 2 * y
1   expr                               x – 2 * y
2   expr + term                        x – 2 * y
2   expr + term + term                 x – 2 * y
2   expr + term + term + term          x – 2 * y
2   expr + term + term + term + ...    x – 2 * y
Consuming no input!!
Wrong choice of expansion leads to non-termination
Parser must make the right choice
Left Recursion
Top-down parsers cannot
handle left-recursive
grammars
Left Recursion
Formally,
A grammar is left recursive
if ∃ A ∈ NT such that ∃ a
derivation A ⇒+ A α, for
some string α ∈ (NT ∪ T)*
Left Recursion
Our expression grammar is
left recursive.
This can lead to non-
termination in a top-down
parser
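Whether a grammar is left recursive can be checked mechanically. The following is a minimal sketch in Python, not a definitive implementation: the dictionary encoding of the grammar, the name `left_recursive`, and the use of ASCII `-` and `/` for the operators are assumptions made for illustration. It also assumes the grammar has no ε-productions, which holds for the expression grammar.

```python
# The expression grammar from the slides, encoded as
# non-terminal -> list of right-hand sides (each a list of symbols).
EXPR = {
    "Goal":   [["expr"]],
    "expr":   [["expr", "+", "term"], ["expr", "-", "term"], ["term"]],
    "term":   [["term", "*", "factor"], ["term", "/", "factor"], ["factor"]],
    "factor": [["number"], ["id"], ["(", "expr", ")"]],
}


def left_recursive(grammar):
    """Return the non-terminals A for which A derives A followed by something.

    Assumes no production can derive the empty string, so only the first
    symbol of each right-hand side can appear leftmost.
    """
    # reach[A] = non-terminals that can appear leftmost after one step from A
    reach = {
        a: {rhs[0] for rhs in alternatives if rhs and rhs[0] in grammar}
        for a, alternatives in grammar.items()
    }
    # Transitive closure: keep adding what the reachable symbols can reach.
    changed = True
    while changed:
        changed = False
        for a in grammar:
            for b in list(reach[a]):
                extra = reach[b] - reach[a]
                if extra:
                    reach[a] |= extra
                    changed = True
    return {a for a in grammar if a in reach[a]}
```

Applied to `EXPR`, the function reports `expr` and `term`, the two non-terminals whose productions begin with themselves.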
Left Recursion
Non-termination is bad in
any part of a compiler!
Left Recursion
For a top-down parser, any
recursion must be a right
recursion
We would like to convert
left recursion to right
recursion
Eliminating Left Recursion
To remove left recursion, we
transform the grammar
Eliminating Left Recursion
Consider a grammar fragment:
A → A α
  | β
where neither α nor β starts
with A.
Eliminating Left Recursion
We can rewrite this as:
A → β A'
A' → α A'
   | ε
where A' is a new non-terminal
Eliminating Left Recursion
A → β A'
A' → α A'
   | ε
This accepts the same
language but uses only right
recursion
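The transformation can be written as a short function. This is a sketch under assumptions: the grammar encoding, the function name, and the convention of naming the new non-terminal by appending `'` are all illustrative, and only immediate left recursion (A → A α) is handled, with every β assumed to be non-empty.

```python
EPS = "ε"  # marker for the empty right-hand side (a convention of this sketch)

EXPR = {
    "Goal":   [["expr"]],
    "expr":   [["expr", "+", "term"], ["expr", "-", "term"], ["term"]],
    "term":   [["term", "*", "factor"], ["term", "/", "factor"], ["factor"]],
    "factor": [["number"], ["id"], ["(", "expr", ")"]],
}


def eliminate_immediate_left_recursion(grammar):
    """Rewrite each A -> A α | β as A -> β A' and A' -> α A' | ε."""
    result = {}
    for lhs, alternatives in grammar.items():
        # α parts: what follows the leading A in the left-recursive productions
        alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == lhs]
        # β parts: the productions that do not start with A
        betas = [rhs for rhs in alternatives if not rhs or rhs[0] != lhs]
        if not alphas:
            result[lhs] = [list(rhs) for rhs in alternatives]
            continue
        new = lhs + "'"  # assumed not to clash with an existing name
        result[lhs] = [beta + [new] for beta in betas]
        result[new] = [alpha + [new] for alpha in alphas] + [[EPS]]
    return result


RIGHT_RECURSIVE = eliminate_immediate_left_recursion(EXPR)
```

After the call, `RIGHT_RECURSIVE["expr"]` holds the single production `term expr'`, and `RIGHT_RECURSIVE["expr'"]` holds `+ term expr'`, `- term expr'`, and ε.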
Eliminating Left Recursion
The expression grammar we
have been using contains two
cases of left recursion
Eliminating Left Recursion
expr → expr + term
| expr – term
| term
term → term * factor
| term ∕ factor
| factor
Eliminating Left Recursion
Applying the transformation yields
expr → term expr'
expr' → + term expr'
| – term expr'
| ε
Eliminating Left Recursion
Applying the transformation yields
term → factor term'
term' → * factor term'
| ∕ factor term'
| ε
Eliminating Left Recursion
These fragments use only
right recursion
They retain the original left
associativity
A top-down parser will
terminate using them.
1 Goal → expr
2 expr → term expr'
3 expr' → + term expr'
4 | – term expr'
5 | ε
6 term → factor term'
7 term' → * factor term'
8 | ∕ factor term'
9 | ε
10 factor → number
11 | id
12 | ( expr )
Predictive Parsing
If a top down parser picks
the wrong production, it
may need to backtrack
The alternative is to look ahead
in the input and use context to
pick correctly
Predictive Parsing
How much lookahead is
needed?
In general, an arbitrarily
large amount
Predictive Parsing
Fortunately, large classes
of CFGs can be parsed
with limited lookahead
Most programming
language constructs fall in
those subclasses
Predictive Parsing
Basic Idea:
Given A → α | β,
the parser should be
able to choose
between α and β.
Predictive Parsing
FIRST Sets:
For some rhs α ∈ G, define
FIRST(α) as the set of
tokens that appear as the
first symbol in some string
that derives from α.
Predictive Parsing
That is,
x ∈ FIRST(α)
iff α ⇒* x γ, for some γ.
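FIRST sets are usually computed by iterating until nothing changes. The sketch below is one such fixed-point computation in Python for the right-recursive expression grammar. The encoding, the function names, and the ASCII `-` and `/` tokens are assumptions for illustration. It also records ε in FIRST when a right-hand side can derive the empty string, a common convention that goes slightly beyond the token-only definition above.

```python
EPS = "ε"  # marker for the empty string (a convention of this sketch)

GRAMMAR = {
    "Goal":   [["expr"]],
    "expr":   [["term", "expr'"]],
    "expr'":  [["+", "term", "expr'"], ["-", "term", "expr'"], [EPS]],
    "term":   [["factor", "term'"]],
    "term'":  [["*", "factor", "term'"], ["/", "factor", "term'"], [EPS]],
    "factor": [["number"], ["id"], ["(", "expr", ")"]],
}


def first_of_string(symbols, first):
    """FIRST of a sequence of symbols, given FIRST of every non-terminal."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        if sym not in first:            # a terminal starts the string
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:       # this non-terminal cannot vanish
            return result
    result.add(EPS)                     # every symbol can derive ε
    return result


def first_sets(grammar):
    """Grow FIRST for each non-terminal until no set changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[lhs])
                first[lhs] |= first_of_string(rhs, first)
                changed = changed or len(first[lhs]) != before
    return first


FIRST = first_sets(GRAMMAR)
```

For this grammar, `FIRST["factor"]` contains `number`, `id`, and `(`, while `FIRST["expr'"]` contains `+`, `-`, and ε.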
Predictive Parsing
The LL(1) Property
If A → α and A → β both
appear in the grammar, we
would like
FIRST(α) ∩ FIRST(β) = ∅
Predictive Parsing
Predictive parsers accept LL(k)
grammars
L: “left-to-right” scan of input
L: left-most derivation
k: “k” tokens of lookahead
Predictive Parsing
The LL(1) Property
FIRST(α) ∩ FIRST(β) = ∅
allows the parser to make a
correct choice with a
lookahead of exactly one
symbol!
Predictive Parsing
What about ε-productions?
They complicate the
definition of LL(1)
Predictive Parsing
If A → α and A → β and
ε ∈ FIRST(α), then we need
to ensure that FIRST(β) is
disjoint from FOLLOW(α), too
Predictive Parsing
FOLLOW(α)
is the set of all words in
the grammar that can
legally appear after an α.
Predictive Parsing
For a non-terminal X,
FOLLOW(X)
is the set of symbols
that might follow the
derivation of X.
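FOLLOW sets can be computed with a similar fixed-point iteration. The sketch below is illustrative only: the grammar encoding, the `eof` end-of-input marker, and the function name `follow_sets` are assumptions, and the FIRST table is written out by hand for the right-recursive expression grammar so that the block stands alone.

```python
EPS, EOF = "ε", "eof"  # empty-string and end-of-input markers (assumed names)

GRAMMAR = {
    "Goal":   [["expr"]],
    "expr":   [["term", "expr'"]],
    "expr'":  [["+", "term", "expr'"], ["-", "term", "expr'"], [EPS]],
    "term":   [["factor", "term'"]],
    "term'":  [["*", "factor", "term'"], ["/", "factor", "term'"], [EPS]],
    "factor": [["number"], ["id"], ["(", "expr", ")"]],
}

# FIRST of each non-terminal, worked out by hand for this grammar.
FIRST = {
    "Goal":   {"number", "id", "("},
    "expr":   {"number", "id", "("},
    "expr'":  {"+", "-", EPS},
    "term":   {"number", "id", "("},
    "term'":  {"*", "/", EPS},
    "factor": {"number", "id", "("},
}


def follow_sets(grammar, first, goal="Goal"):
    """Grow FOLLOW for each non-terminal until no set changes."""
    follow = {nt: set() for nt in grammar}
    follow[goal].add(EOF)
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                # trailer = tokens that can come right after the current symbol
                trailer = set(follow[lhs])
                for sym in reversed(rhs):
                    if sym == EPS:
                        continue
                    if sym in grammar:
                        if not trailer <= follow[sym]:
                            follow[sym] |= trailer
                            changed = True
                        if EPS in first[sym]:
                            trailer = trailer | (first[sym] - {EPS})
                        else:
                            trailer = set(first[sym])
                    else:
                        trailer = {sym}
    return follow


FOLLOW = follow_sets(GRAMMAR, FIRST)
```

Walking each right-hand side from right to left keeps track of what can follow the current symbol. For this grammar, `FOLLOW["expr'"]` comes out as `eof` and `)`.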
Predictive Parsing
[Figure: FIRST and FOLLOW of a non-terminal X]
Predictive Parsing
Define FIRST+(α) as
FIRST(α) ∪ FOLLOW(α), if
ε ∈ FIRST(α)
FIRST(α), otherwise
Predictive Parsing
Then a grammar is LL(1)
iff A → α and A → β
implies
FIRST+(α) ∩ FIRST+(β) = ∅
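The LL(1) condition can be tested directly once FIRST and FOLLOW are known. The following sketch makes several assumptions: the FIRST and FOLLOW tables are written out by hand for the right-recursive expression grammar, the function names are hypothetical, and for a production A → α the code uses FOLLOW(A), the set of tokens that can follow the left-hand side, as the FOLLOW term in FIRST+.

```python
from itertools import combinations

EPS, EOF = "ε", "eof"  # empty-string and end-of-input markers (assumed names)

GRAMMAR = {
    "Goal":   [["expr"]],
    "expr":   [["term", "expr'"]],
    "expr'":  [["+", "term", "expr'"], ["-", "term", "expr'"], [EPS]],
    "term":   [["factor", "term'"]],
    "term'":  [["*", "factor", "term'"], ["/", "factor", "term'"], [EPS]],
    "factor": [["number"], ["id"], ["(", "expr", ")"]],
}

# FIRST and FOLLOW of each non-terminal, worked out by hand for this grammar.
FIRST = {
    "Goal": {"number", "id", "("}, "expr": {"number", "id", "("},
    "term": {"number", "id", "("}, "factor": {"number", "id", "("},
    "expr'": {"+", "-", EPS}, "term'": {"*", "/", EPS},
}
FOLLOW = {
    "Goal": {EOF}, "expr": {EOF, ")"}, "expr'": {EOF, ")"},
    "term": {"+", "-", EOF, ")"}, "term'": {"+", "-", EOF, ")"},
    "factor": {"*", "/", "+", "-", EOF, ")"},
}


def first_of_string(symbols, first):
    """FIRST of a sequence of symbols, given FIRST of every non-terminal."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        if sym not in first:            # a terminal starts the string
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)
    return result


def first_plus(lhs, rhs, first, follow):
    """FIRST+ of the production lhs -> rhs."""
    f = first_of_string(rhs, first)
    return (f | follow[lhs]) if EPS in f else f


def is_ll1(grammar, first, follow):
    """True if the FIRST+ sets of each non-terminal's alternatives are disjoint."""
    for lhs, alternatives in grammar.items():
        sets = [first_plus(lhs, rhs, first, follow) for rhs in alternatives]
        if any(a & b for a, b in combinations(sets, 2)):
            return False
    return True
```

With these tables `is_ll1(GRAMMAR, FIRST, FOLLOW)` holds. A grammar such as A → a b | a c fails the check, because both alternatives begin with the same token.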
Predictive Parsing
Given a grammar that has
the LL(1) property
• we can write a simple
routine to recognize each
lhs
• code is simple and fast
Predictive Parsing
Consider
A → β1 | β2 | β3
which satisfies the LL(1)
property FIRST+(βi) ∩ FIRST+(βj) = ∅
for i ≠ j
/* find an A */
if (token ∈ FIRST(β1))
    find a β1 and return true
else if (token ∈ FIRST(β2))
    find a β2 and return true
else if (token ∈ FIRST(β3))
    find a β3 and return true
else report an error and return false
Predictive Parsing
Grammars with the LL(1)
property are called predictive
grammars because the parser
can “predict” the correct
expansion at each point in the
parse.
Predictive Parsing
Parsers that capitalize on
the LL(1) property are
called predictive parsers
One kind of predictive
parser is the recursive
descent parser
Recursive Descent Parsing
1 Goal → expr
2 expr → term expr'
3 expr' → + term expr'
4 | - term expr'
5 | ε
6 term → factor term'
7 term' → * factor term'
8 | ∕ factor term'
9 | ε
10 factor → number
11 | id
12 | ( expr )
Recursive Descent Parsing
This leads to a parser with
six mutually recursive
routines
Goal, Expr, EPrime,
Term, TPrime, Factor
Recursive Descent Parsing
Each recognizes one non-
terminal (NT) or terminal (T)
Recursive Descent Parsing
The term descent refers to
the direction in which the
parse tree is built.
Here are some of these
routines written as functions
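As a rough illustration, the six routines could look like the following in Python. This is a sketch under assumptions, not the code from the slides: the `Parser` class, the `ParseError` exception, the `accepts` helper, and the token names (`id`, `number`, `eof`, and ASCII operators) are all hypothetical. Each method recognizes one non-terminal of the right-recursive grammar and uses a single token of lookahead to choose a production.

```python
class ParseError(Exception):
    """Raised when the lookahead token fits no production."""


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["eof"]  # assumed end-of-input marker
        self.pos = 0

    @property
    def token(self):
        return self.tokens[self.pos]

    def advance(self):
        self.pos += 1

    def goal(self):                 # Goal -> expr
        self.expr()
        if self.token != "eof":
            raise ParseError(f"unexpected {self.token!r}")

    def expr(self):                 # expr -> term expr'
        self.term()
        self.eprime()

    def eprime(self):               # expr' -> + term expr' | - term expr' | ε
        if self.token in ("+", "-"):
            self.advance()
            self.term()
            self.eprime()
        # any other token: take the ε-production and consume nothing

    def term(self):                 # term -> factor term'
        self.factor()
        self.tprime()

    def tprime(self):               # term' -> * factor term' | / factor term' | ε
        if self.token in ("*", "/"):
            self.advance()
            self.factor()
            self.tprime()

    def factor(self):               # factor -> number | id | ( expr )
        if self.token in ("number", "id"):
            self.advance()
        elif self.token == "(":
            self.advance()
            self.expr()
            if self.token != ")":
                raise ParseError("expected ')'")
            self.advance()
        else:
            raise ParseError(f"unexpected {self.token!r}")


def accepts(tokens):
    """True if the token sequence is a sentence of the expression grammar."""
    try:
        Parser(tokens).goal()
        return True
    except ParseError:
        return False
```

The input x – 2 * y corresponds to the token list `["id", "-", "number", "*", "id"]`, which `accepts` recognizes without any backtracking.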