Unit 03 Parser

The document discusses the role of parsers in compiler design, specifically focusing on syntax analysis and context-free grammars. It explains the process of generating parse trees, derivation sequences, and the challenges posed by left recursion and ambiguity in grammars. Additionally, it outlines various parsing techniques, including top-down and bottom-up parsing, and the methods for eliminating left recursion and performing left factoring.

Uploaded by

Eshan Jinabade

System Software

Compiler Design

School of Computer Engineering and Technology
6/5/2021
Role of Parser / Syntax Analysis
[Figure: Source Program → Lexical Analyzer → (Token) → Parser → (Parse Tree) → Rest of Front End → Intermediate Representation. The Parser asks the Lexical Analyzer for input with "Get Next Token"; both consult the Symbol Table.]
Role of Parser / Syntax Analysis
●Checks whether the token stream meets the grammatical specification of the language and generates the syntax tree.
●A grammar of a programming language is typically described by a Context Free Grammar, which also defines the structure of the parse tree.
●A syntax error is produced by the compiler when the program does not meet the grammatical specification.
Definition of Context-Free Grammars
A context-free grammar G = (T, N, S, P) consists of:
1. T, a set of terminals (scanner tokens).
2. N, a set of nonterminals (syntactic variables generated by productions).
3. S, a designated start nonterminal.
4. P, a set of productions. Each production has the form A ::= α, where A is a nonterminal and α is a sentential form, i.e., a string of zero or more grammar symbols.
Context-Free Grammars
● A context-free grammar defines the syntax of a programming language
● The syntax defines the syntactic categories for language constructs
◦ Statements
◦ Expressions
◦ Declarations
● Categories are subdivided into more detailed categories
◦ A Statement is a
🞄 For-statement
🞄 If-statement
🞄 Assignment

<statement> ::= <for-statement> | <if-statement> | <assignment>
<for-statement> ::= for ( <expression> ; <expression> ; <expression> ) <statement>
<assignment> ::= <identifier> := <expression>
Syntax Analysis

Syntax Analysis Problem Statement: To find a derivation sequence in a grammar G for the input token stream (or say that none exists).
Derivation
Given the following grammar:

E → E + E | E * E | ( E ) | - E | id

Is the string -(id + id) a sentence in this grammar?

Yes, because there is the following derivation:

E ⇒ -E ⇒ -(E) ⇒ -(E + E) ⇒ -(id + E) ⇒ -(id + id)

Where ⇒ reads “derives in one step”.
Parse trees
A parse tree is a graphical representation of a derivation sequence of a sentential form.

Tree nodes represent symbols of the grammar (nonterminals or terminals) and tree edges represent derivation steps.
Derivation
E → E + E | E * E | ( E ) | - E | id

Let's examine this derivation:

E ⇒ -E ⇒ -(E) ⇒ -(E + E) ⇒ -(id + E) ⇒ -(id + id)

[Figure: the parse tree after each derivation step, growing from the single node E down to the leaves - ( id + id ).]

This is a top-down derivation because we start building the parse tree at the top.
Another Derivation Example
E → E + E | E * E | ( E ) | - E | id
Find a derivation for the expression: id + id * id

[Figure: two parse trees for id + id * id. In the first the root is E + E and the right operand expands to E * E; in the second the root is E * E and the left operand expands to E + E.]

According to the grammar, both are correct.

A grammar that produces more than one parse tree for some input sentence is said to be an ambiguous grammar.
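Parse Trees and Derivations
Top-down parsing (leftmost derivation, tree built from the root):
E ⇒ E + E ⇒ id + E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id

Bottom-up parsing (rightmost derivation, traced in reverse, tree built from the leaves):
E ⇒ E + E ⇒ E + E * E ⇒ E + E * id ⇒ E + id * id ⇒ id + id * id

[Figure: the partial parse trees corresponding to the steps of each derivation.]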
Top–Down Parsing
● A parse tree is created from root to leaves
● The traversal of parse trees is a preorder traversal
● Tracing leftmost derivation
● Two types:
◦ Backtracking parser: Try different structures and backtrack if it does not match the input
◦ Predictive parser: Guess the structure of the parse tree from the next input

Bottom–Up Parsing
● A parse tree is created from leaves to root
● The traversal of parse trees is a reversal of postorder traversal
● Tracing rightmost derivation
● More powerful than top-down parsing
Parsing Techniques
● Top-Down Parser
◦ Recursive Descent Parser
◦ Predictive [LL(1)] Parser
● Bottom-Up Parser
◦ Shift-Reduce Parser
◦ LR Parser
🞄 SLR Parser
🞄 LALR Parser
🞄 Canonical LR Parser
Left Recursion
Consider the grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id

A top-down parser might loop forever when parsing an expression using this grammar.

[Figure: the parse tree grows without bound as the leftmost E is repeatedly expanded to E + T.]
Left Recursion
Consider the grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id

A grammar that has at least one production of the form A → Aα is a left-recursive grammar.

Top-down parsers do not work with left-recursive grammars.

Left recursion can often be eliminated by rewriting the grammar.
Elimination of Left Recursion
● A grammar is left recursive if it has a NT A such that there is a derivation A ⇒+ Aα for some string α.
● Top down parsing methods cannot handle left recursive grammars, so a transformation that eliminates left recursion is needed.
E.g. A → Aα | β
It has left recursion. To eliminate it we rewrite it as
A → βA’
A’ → αA’ | є
Contd

● The technique to eliminate left recursion is:

A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn

where no βi begins with A.
So we replace the A-productions by
A → β1A’ | β2A’ | … | βnA’
A’ → α1A’ | α2A’ | … | αmA’ | є

e.g. consider the G as

S → Aa | b
A → Ac | Sd | є
Left Recursion
This left-recursive grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id

can be rewritten to eliminate the immediate left recursion:

E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
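
The rewrite above can be sketched in a few lines of Python. This is only an illustrative sketch for immediate left recursion on a single nonterminal; the function name, the list-of-symbols representation and the convention that an empty list stands for ε are assumptions made for this example, not part of the slides.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn
    as A -> b1 A' | ... | bn A' and A' -> a1 A' | ... | am A' | epsilon.

    Each alternative is a list of grammar symbols; [] stands for epsilon.
    """
    # The alpha parts: alternatives that start with A, with the leading A removed.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The beta parts: alternatives that do not start with A.
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}  # nothing to eliminate
    new_nt = nonterminal + "'"
    return {
        nonterminal: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],
    }


if __name__ == "__main__":
    print(eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
    print(eliminate_immediate_left_recursion("T", [["T", "*", "F"], ["F"]]))
```

Indirect left recursion (as in the S → Aa | b, A → Ac | Sd | є example) is not handled by this sketch; it needs the nonterminals to be ordered and substituted first.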

Left Factoring
The following grammar:

stmt → if expr then stmt else stmt
     | if expr then stmt

cannot be parsed by a predictive parser that looks one element ahead.

But the grammar can be rewritten:

stmt → if expr then stmt stmt’
stmt’ → else stmt | ε

where ε is the empty string.
Rewriting a grammar to eliminate multiple productions starting with the same token is called left factoring.
How to do left factoring: algorithm
Algorithm: Left factoring a grammar
Input: Grammar G
Output: Equivalent left-factored grammar
For each NT A, find the longest prefix α common to two or more of its alternatives.
If α ≠ є, i.e. there is a common prefix, replace all the A-productions
A → αβ1 | αβ2 | … | αβn | γ
(where γ stands for all alternatives that do not begin with α) by
A → αA’ | γ
A’ → β1 | β2 | … | βn
Here A’ is a new NT.
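
One pass of this algorithm for a single nonterminal might look like the following Python sketch. The helper names and the list-of-symbols representation are assumptions for illustration; a complete implementation would repeat the step until no two alternatives share a prefix and would generate fresh names for each new nonterminal.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor_once(nonterminal, alternatives):
    """Apply one left-factoring step to the alternatives of one nonterminal.

    Each alternative is a list of grammar symbols; [] stands for epsilon.
    """
    # Find the longest prefix shared by at least two alternatives.
    best = []
    for i in range(len(alternatives)):
        for j in range(i + 1, len(alternatives)):
            prefix = common_prefix(alternatives[i], alternatives[j])
            if len(prefix) > len(best):
                best = prefix
    if not best:
        return {nonterminal: alternatives}  # no common prefix
    k = len(best)
    new_nt = nonterminal + "'"
    factored = [alt[k:] for alt in alternatives if alt[:k] == best]  # the beta parts
    rest = [alt for alt in alternatives if alt[:k] != best]          # the gamma parts
    return {nonterminal: [best + [new_nt]] + rest, new_nt: factored}


if __name__ == "__main__":
    stmt = [
        ["if", "expr", "then", "stmt", "else", "stmt"],
        ["if", "expr", "then", "stmt"],
    ]
    print(left_factor_once("stmt", stmt))
```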
Recursive descent parsing

● A parser that uses a set of recursive procedures to recognize its input is called a recursive descent parser.
● It is an attempt to find a leftmost derivation for an input string.
● It can be viewed as an attempt to construct a parse tree for the i/p starting from the root and creating the nodes of the parse tree in preorder.
● Predictive parsing is a special case of RDP where no backtracking is required.
● Recursive descent parsers look ahead one input symbol and advance the input stream reading pointer when proper matches occur.
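● Consider the G
S → cAd
A → ab | a
i/p is cad
S ⇒ cAd
S ⇒ cAd ⇒ cabd (the first alternative A → ab does not match the input, so the parser backtracks)
S ⇒ cAd ⇒ cad (the second alternative A → a matches)
A left recursive grammar can cause a RDP, even with backtracking, to go into an infinite loop.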
● The procedures for the arithmetic expression grammar:
E → TE’
E’ → +TE’ | є
T → FT’
T’ → *FT’ | є
F → (E) | id
Recursive procedures to recognize arithmetic expressions:

procedure E( ):
begin
    T( )
    E’( )
end;

procedure E’( ):
if input_symbol = ‘+’ then
begin
    ADVANCE( )
    T( )
    E’( )
end;

procedure T( ):
begin
    F( )
    T’( )
end;

procedure T’( ):
if input_symbol = ‘*’ then
begin
    ADVANCE( )
    F( )
    T’( )
end;

procedure F( ):
if input_symbol = ‘id’ then
    ADVANCE( )
else if input_symbol = ‘(’ then
begin
    ADVANCE( )
    E( )
    if input_symbol = ‘)’ then
        ADVANCE( )
    else
        ERROR( )
end
else
    ERROR( )
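
For readers who want to run the procedures, here is a minimal Python sketch that mirrors them one to one. The class name, the `ParseError` exception, the `accepts` helper and the use of a pre-tokenized list ending in `$` are assumptions for this example; the slides only give the pseudocode.

```python
class ParseError(Exception):
    """Raised where the pseudocode calls ERROR( )."""


class RecursiveDescentParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of the input
        self.pos = 0

    @property
    def input_symbol(self):
        return self.tokens[self.pos]

    def advance(self):
        self.pos += 1

    def error(self):
        raise ParseError(f"unexpected {self.input_symbol!r} at position {self.pos}")

    def E(self):            # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):      # E' -> + T E' | epsilon
        if self.input_symbol == "+":
            self.advance()
            self.T()
            self.E_prime()

    def T(self):            # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):      # T' -> * F T' | epsilon
        if self.input_symbol == "*":
            self.advance()
            self.F()
            self.T_prime()

    def F(self):            # F -> ( E ) | id
        if self.input_symbol == "id":
            self.advance()
        elif self.input_symbol == "(":
            self.advance()
            self.E()
            if self.input_symbol == ")":
                self.advance()
            else:
                self.error()
        else:
            self.error()

    def parse(self):
        self.E()
        if self.input_symbol != "$":  # the whole input must be consumed
            self.error()
        return True


def accepts(tokens):
    try:
        return RecursiveDescentParser(tokens).parse()
    except ParseError:
        return False


if __name__ == "__main__":
    print(accepts(["id", "+", "id", "*", "id"]))
    print(accepts(["id", "+"]))
```

The end-of-input check in `parse` is an addition: the pseudocode leaves it implicit, but without it a string such as `id id` would be accepted after its first token.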
Top Down Predictive Parser (LL(1))

●Here, in LL(1):
◦ L: scans input from Left to Right
◦ L: generates a Leftmost derivation tree
◦ 1: one (1) look-ahead symbol

●We can have LL(k) parsers also.
●No Backtracking Parser:
●Writing a special grammar: eliminating left recursion and left factoring.
●Must know the current input symbol (a) and the non-terminal (A) to be expanded.
A → α1 | α2 | α3 | … | αn
●LL(1) Parser uses an explicit STACK rather than recursive calls.
●Stack Implementation:

Stack                 W (Input String)
$ Start Symbol        Input String $
--------              --------
--------              --------
$                     $                 accept
A Predictive Parser

Grammar:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id

Parsing Table (rows: non-terminal, columns: input symbol):

        id         +            *            (          )         $
E       E → TE’                              E → TE’
E’                 E’ → +TE’                            E’ → ε    E’ → ε
T       T → FT’                              T → FT’
T’                 T’ → ε       T’ → *FT’               T’ → ε    T’ → ε
F       F → id                               F → (E)
A Predictive Parser

INPUT: id + id * id $
The predictive parsing program consults the parsing table for the grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → ( E ) | id at every step (Aho, Sethi, Ullman, pp. 186, 188).

STACK (top at left)   INPUT              OUTPUT
E $                   id + id * id $     E → TE’
T E’ $                id + id * id $     T → FT’
F T’ E’ $             id + id * id $     F → id
id T’ E’ $            id + id * id $     match id
T’ E’ $               + id * id $        T’ → ε
E’ $                  + id * id $        …

Action when Top(Stack) = input ≠ $: pop stack, advance input.

The predictive parser proceeds in this fashion, emitting the following productions:
E’ → +TE’
T → FT’
F → id
T’ → *FT’
F → id
T’ → ε
E’ → ε

[Figure: the finished parse tree for id + id * id.]

When Top(Stack) = input = $ the parser halts and accepts the input string.
LL(k) Parser

This parser parses from left to right, and does a leftmost derivation. It looks 1 symbol ahead to choose its next action. Therefore, it is known as an LL(1) parser.

An LL(k) parser looks k symbols ahead to decide its action.
The Parsing Table
Given this grammar:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id

How is the parsing table for this grammar built?
FIRST and FOLLOW
We need to build a FIRST set for each symbol and a FOLLOW set for each non-terminal in the grammar.

The elements of FIRST and FOLLOW are terminal symbols; in addition, FIRST may contain ε and FOLLOW may contain the end marker $.

FIRST(α) is the set of terminal symbols that can begin any string derived from α.

FOLLOW(α) is the set of terminal symbols that can follow α: t ∈ FOLLOW(α) ↔ ∃ derivation containing αt
Rules to Create FIRST
GRAMMAR:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id

FIRST rules:
1. If X is a terminal, FIRST(X) = {X}
2. If X → ε, then ε ∈ FIRST(X)
3. If X → Y1Y2 ••• Yk and Y1 ••• Yi-1 ⇒* ε and a ∈ FIRST(Yi), then a ∈ FIRST(X)

SETS:
FIRST(id) = {id}
FIRST(*) = {*}
FIRST(+) = {+}
FIRST(() = {(}
FIRST()) = {)}
FIRST(E’) = {+, ε}
FIRST(T’) = {*, ε}
FIRST(F) = {(, id}
FIRST(T) = FIRST(F) = {(, id}
FIRST(E) = FIRST(T) = {(, id}
Rules to Create FOLLOW
FIRST sets computed so far:
FIRST(E’) = {+, ε}
FIRST(T’) = {*, ε}
FIRST(F) = {(, id}
FIRST(T) = {(, id}
FIRST(E) = {(, id}

GRAMMAR:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id

FOLLOW rules (A and B are non-terminals, α and β are strings of grammar symbols):
1. If S is the start symbol, then $ ∈ FOLLOW(S)
2. If A → αBβ, and a ∈ FIRST(β) and a ≠ ε, then a ∈ FOLLOW(B)
3. If A → αB and a ∈ FOLLOW(A), then a ∈ FOLLOW(B)
3a. If A → αBβ and β ⇒* ε and a ∈ FOLLOW(A), then a ∈ FOLLOW(B)

SETS:
FOLLOW(E) = {), $}
FOLLOW(E’) = {), $}
FOLLOW(T) = {+, ), $}
FOLLOW(T’) = {+, ), $}
FOLLOW(F) = {+, *, ), $}
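
The FIRST and FOLLOW rules lend themselves to a fixed-point computation: apply the rules repeatedly until no set changes. The Python sketch below follows that idea for the expression grammar; the dictionary encoding of the grammar, the `EPS` and `END` markers and the function names are assumptions made for this illustration.

```python
EPS = "ε"
END = "$"

# Each nonterminal maps to its alternatives; [] stands for an epsilon production.
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a string of grammar symbols (FIRST rules 1 to 3)."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # terminal: FIRST(a) = {a}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can vanish, or the string is empty
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # iterate until no set grows any more
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_string(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)  # FOLLOW rule 1
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for alt in alternatives:
                for i, b in enumerate(alt):
                    if b not in grammar:
                        continue  # FOLLOW is only defined for nonterminals
                    beta_first = first_of_string(alt[i + 1:], first)
                    new = beta_first - {EPS}   # FOLLOW rule 2
                    if EPS in beta_first:      # FOLLOW rules 3 and 3a
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


if __name__ == "__main__":
    FIRST = compute_first(GRAMMAR)
    FOLLOW = compute_follow(GRAMMAR, "E", FIRST)
    for nt in GRAMMAR:
        print(nt, sorted(FIRST[nt]), sorted(FOLLOW[nt]))
```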
Rules to Build Parsing Table
(Aho, Sethi, Ullman, pp. 189)

GRAMMAR:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id

FIRST SETS:
FIRST(E’) = {+, ε}
FIRST(T’) = {*, ε}
FIRST(F) = {(, id}
FIRST(T) = {(, id}
FIRST(E) = {(, id}

FOLLOW SETS:
FOLLOW(E) = {), $}
FOLLOW(E’) = {), $}
FOLLOW(T) = {+, ), $}
FOLLOW(T’) = {+, ), $}
FOLLOW(F) = {+, *, ), $}

1. If A → α: if a ∈ FIRST(α), add A → α to M[A, a]
2. If A → α: if ε ∈ FIRST(α), add A → α to M[A, b] for each terminal b ∈ FOLLOW(A)
3. If A → α: if ε ∈ FIRST(α) and $ ∈ FOLLOW(A), add A → α to M[A, $]

PARSING TABLE (rows: non-terminal, columns: input symbol):

        id         +            *            (          )         $
E       E → TE’                              E → TE’
E’                 E’ → +TE’                            E’ → ε    E’ → ε
T       T → FT’                              T → FT’
T’                 T’ → ε       T’ → *FT’               T’ → ε    T’ → ε
F       F → id                               F → (E)
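
The three table-building rules can be tried out with the short Python sketch below. It takes the FIRST and FOLLOW sets listed on the slides as given constants; the dictionary layout, the names `build_table` and `first_of_string`, and the choice to raise `ValueError` on a doubly defined entry are assumptions for this example.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],   # [] stands for the epsilon production
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
    "T'": {"*", EPS}, "F": {"(", "id"},
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
    "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"},
}


def first_of_string(symbols):
    """FIRST of a right-hand side, using the FIRST sets above."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar):
    """Return M as a dict mapping (nonterminal, terminal) to a right-hand side."""
    table = {}
    for a, alternatives in grammar.items():
        for alpha in alternatives:
            alpha_first = first_of_string(alpha)
            lookaheads = alpha_first - {EPS}          # rule 1
            if EPS in alpha_first:
                lookaheads |= FOLLOW[a]               # rules 2 and 3 ($ is in FOLLOW)
            for terminal in lookaheads:
                if (a, terminal) in table:
                    # Two productions in one cell: the grammar is not LL(1).
                    raise ValueError(f"conflict at M[{a}, {terminal}]")
                table[(a, terminal)] = alpha
    return table


if __name__ == "__main__":
    for (nt, terminal), rhs in sorted(build_table(GRAMMAR).items()):
        print(f"M[{nt}, {terminal}] = {nt} -> {' '.join(rhs) or EPS}")
```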
LL(1) Parsing Algorithm (Table Based)
●Given an LL(1) grammar, a parsing algorithm that uses the LL(1) parsing table.
●Note: assuming that ‘$’ indicates the bottom of the stack and the end of the input string.
1. Push the start symbol onto the top of the parsing stack.
2. While the top of the stack ≠ $ do
   if the top of the parsing stack is terminal a and the next input token = a
   then (match and pop)
       pop the parsing stack
       advance the input
   else if the top of the parsing stack is non-terminal A and the next input is terminal a and parsing table entry M[A, a] contains production A → X1 X2 … Xn
   then (generate)
       pop the parsing stack
       for (i = n; i >= 1; i--)
           push Xi on top of the stack
   else error
3. If (the top of the parsing stack = $) and the next input token = $ then accept, else error.
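
As a rough illustration, the table-driven algorithm can be written in Python as follows. The `TABLE` dictionary transcribes the parsing table for the expression grammar; the function name `ll1_parse`, the use of the built-in `SyntaxError`, and returning the list of applied productions are choices made for this sketch only.

```python
# M[A, a] for the grammar E -> TE', E' -> +TE' | ε, T -> FT', T' -> *FT' | ε, F -> (E) | id.
# An empty list stands for an epsilon production.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the productions applied, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]     # "$" marks the end of the input
    stack = ["$", start]              # "$" marks the bottom of the stack
    pos = 0
    output = []
    while stack[-1] != "$":
        top, lookahead = stack[-1], tokens[pos]
        if top not in NONTERMINALS:
            if top != lookahead:
                raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
            stack.pop()               # match: pop the stack and advance the input
            pos += 1
        elif (top, lookahead) in TABLE:
            rhs = TABLE[(top, lookahead)]
            stack.pop()               # generate: replace A by X1 ... Xn,
            stack.extend(reversed(rhs))  # pushing Xn first so X1 ends up on top
            output.append((top, rhs))
        else:
            raise SyntaxError(f"no entry M[{top}, {lookahead}]")
    if tokens[pos] != "$":
        raise SyntaxError("input left over after the stack emptied")
    return output


if __name__ == "__main__":
    for lhs, rhs in ll1_parse(["id", "+", "id", "*", "id"]):
        print(lhs, "->", " ".join(rhs) or "ε")
```

Pushing the right-hand side in reverse is what keeps the leftmost symbol on top of the stack, so the parser expands nonterminals in leftmost-derivation order.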
Bottom Up Parser
●Bottom-up parsers generate a parse tree starting from the leaves (bottom) and creating nodes up to the root of the parse tree.
●The reduction steps trace a rightmost derivation in reverse.
●More powerful than top-down parsers.
Bottom Up Parser
●Stack Implementation:

Stack                 W (Input String)
$                     Input String $
--------              --------
--------              --------
$ Start Symbol        $                 accept
●Different types of Bottom up parser:
- Shift Reduce Parser
- Operator Precedence Parser
- LR Parsers:
🞄 Simple LR (SLR) parser
🞄 LALR parser
🞄 Canonical LR (CLR) parser
●Actions:
- Shift
- Reduce
- Accept
- Error
Bottom-Up Parser

Consider the Grammar:
S → aABe
A → Abc | b
B → d

We want to parse the input string abbcde.
Bottom-Up Parser Example
(Aho, Sethi, Ullman, pp. 195)

Productions:
S → aABe
A → Abc
A → b
B → d

INPUT                 REDUCTION
a b b c d e $         reduce the first b by A → b
a A b c d e $         reduce A b c by A → Abc
a A d e $             reduce d by B → d
a A B e $             reduce a A B e by S → aABe
S $                   accept

At the step a A b c d e the remaining b also matches A → b. We are not reducing it here in this example. A parser would reduce, get stuck and then backtrack!

[Figure: the parse tree is built from the leaves upward, with S at the root over a A B e, A over A b c, the inner A over b, and B over d.]

This parser is known as an LR Parser because it scans the input from Left to right, and it constructs a Rightmost derivation in reverse order.
Handle pruning
● Principles of Bottom Up Parsing – Handles
● The leftmost simple phrase of a sentential form is called the handle.

The basic steps of a bottom-up parser are
- to identify a substring within a rightmost sentential form which matches the RHS of a rule.
- when this substring is replaced by the LHS of the matching rule, it must produce the previous rightmost sentential form.
Such a substring is called a handle.
● A handle of a right sentential form γ is
- a production rule A → β, and
- an occurrence of a sub-string β in γ
such that when the occurrence of β is replaced by A in γ, we get the previous right sentential form in a rightmost derivation of γ.
● If S ⇒* αAw ⇒ αβw in a rightmost derivation, then the rule A → β and the occurrence of β is the handle in αβw.

Grammar is
E → E + E
E → E * E
E → ( E )
E → id

Derive string “id+id*id” using rightmost derivation.

Note: String at the right of handle contains only terminal symbols.

The process of discovering a handle & reducing it to the appropriate left-hand side is called handle pruning.
Handle pruning forms the basis for a bottom-up parsing method.
Reductions made by a shift-reduce parser

Right sentential form     Handle     Reducing production
id1 + id2 * id3           id1        E → id
E + id2 * id3             id2        E → id
E + E * id3               id3        E → id
E + E * E                 E * E      E → E * E
E + E                     E + E      E → E + E
E
Contd…

E → E + T | E − T | T
T → T ∗ F | T / F | F
F → P ∗∗ F | P
P → −P | B
B → (E) | id

Input string is: − id ** id / id
Contd..
Stack       Input                  Parser move
$           − id ** id / id $      shift −
$ −         id ** id / id $        shift id
$ − id      ** id / id $           reduce by B → id
$ − B       ** id / id $           reduce by P → B
$ − P       ** id / id $           reduce by P → −P
$ P         ** id / id $           shift **
…           …                      …
$ E         $                      accept
Contd…
• Shift- reduce parsers require the following
data structures
1. a buffer for holding the input string to be parsed
2. a data structure for detecting handles (stack)
3. a data structure for storing and accessing the LHS
and RHS of rules.
Contd…
• Stack implementation of shift-reduce parsing
In handle pruning, 2 problems are to be solved:
1. Locate the substring to be reduced in a right sentential form.
2. Determine which production to choose in case there is more than one production with that substring on the right side.

- The parser operates by shifting zero or more i/p symbols onto the stack until a handle β is on top.
- Then reduce β to the left side of the appropriate production.
- Repeat until an error is detected or the stack contains the start symbol and the i/p is empty.

• Show moves by parser for string “id1+id2*id3” using arithmetic expression G.
Contd…
● Basic actions of the shift-reduce parser are:

Shift: Moving a single token from the input buffer onto the stack till a handle appears on the stack.
Reduce: When a handle appears on the stack, it is popped and replaced by the left hand side of the corresponding production.
Accept: When the stack contains only the start symbol and input buffer is empty, the parser halts announcing a successful parse.
Error: When the parser can neither shift nor reduce nor accept.
Halts announcing an error.
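
The four actions can be seen at work in the deliberately naive Python sketch below. It does not use any parsing table: at every step it tries each possible reduce and then a shift, backtracking on failure, which is the "reduce, get stuck and then backtrack" behaviour mentioned in the S → aABe example. The function name, the tuple encoding of productions and the exhaustive search strategy are all assumptions for illustration; real shift-reduce parsers decide between actions with a table instead of searching. The sketch assumes a grammar without ε-productions and without cyclic unit productions, otherwise the search might not terminate.

```python
def shift_reduce_accepts(productions, start, tokens):
    """Backtracking shift-reduce recognizer (exhaustive search, illustration only)."""
    tokens = tuple(tokens)
    seen = set()  # configurations already explored without success

    def search(stack, pos):
        if (stack, pos) in seen:
            return False
        seen.add((stack, pos))
        # Accept: only the start symbol on the stack and the input is empty.
        if stack == (start,) and pos == len(tokens):
            return True
        # Reduce: try every production whose right-hand side is on top of the stack.
        for lhs, rhs in productions:
            n = len(rhs)
            if n <= len(stack) and stack[-n:] == rhs:
                if search(stack[:-n] + (lhs,), pos):
                    return True
        # Shift: move the next input token onto the stack.
        if pos < len(tokens):
            return search(stack + (tokens[pos],), pos + 1)
        # Error: the parser can neither shift, reduce nor accept.
        return False

    return search((), 0)


# Grammar of the earlier example: S -> aABe, A -> Abc | b, B -> d
GRAMMAR = [
    ("S", ("a", "A", "B", "e")),
    ("A", ("A", "b", "c")),
    ("A", ("b",)),
    ("B", ("d",)),
]

if __name__ == "__main__":
    print(shift_reduce_accepts(GRAMMAR, "S", "abbcde"))
    print(shift_reduce_accepts(GRAMMAR, "S", "abcde"))
```

On the input abbcde the search first reduces the second b by A → b, fails, and only then finds the parse that shifts b and c and reduces by A → Abc.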
Contd

• Conflicts in a Shift-Reduce Parser
The following conflicting situations may arise in a shift-reduce parser:
1. Shift-reduce conflict
A handle β occurs on TOS; the next token a is such that βaγ happens to be another handle.
The parser has two options:
• Reduce the handle using A → β
• Ignore the handle β; shift a and continue parsing and eventually reduce using B → βaγ
2. Reduce-reduce conflict
The stack contents are αβγ and both βγ and γ are handles with A → βγ and B → γ as the corresponding rules.
Then the parser has two reduce possibilities.
Resolving conflicts:
• Choose shift (or reduce) in a shift-reduce conflict
• Prefer one reduce (over others) in a reduce-reduce conflict
LR Parsers
●Used for a large class of context-free grammars (CFG).
●Called LR(k) parsing.

Bottom Up LR Parsers Con…

●Here, in LR(k):
◦ L: scans input from Left to Right
◦ R: generates a Rightmost derivation tree in reverse
◦ k: number of look-ahead symbols
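Properties of LR Parsers
●Can be constructed for virtually all programming language constructs for which it is possible to write a CFG.
●Most general non-backtracking shift-reduce parsing method.
●The class of grammars that LR parsers can handle is a proper superset of the class that can be parsed by predictive parsers.
●Can detect syntactic errors as soon as it is possible to do so on a left-to-right scan of the i/p.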
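Block Schematic of LR Parser:
● A table driven parser has an i/p buffer, a stack, a parsing table and an o/p stream along with a driver program.

[Figure: the input buffer a1 … ai … an $ feeds the LR Parsing Program, which produces the output. The program reads and updates a stack holding S0 … Xm-1 Sm-1 Xm Sm (Sm on top) and consults a parsing table with two parts, action and goto.]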
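
A driver program of the kind shown in the schematic can be sketched in Python as below. The ACTION and GOTO tables are the SLR tables for the expression grammar (1) E → E + T, (2) E → T, (3) T → T * F, (4) T → F, (5) F → ( E ), (6) F → id as given in Aho, Sethi, Ullman (pp. 220); the dictionary encoding, the function name `lr_parse` and the decision to keep only states on the stack are assumptions made for this example.

```python
# Production number -> (left-hand side, length of right-hand side)
PRODUCTIONS = {
    1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
    4: ("T", 1), 5: ("F", 3), 6: ("F", 1),
}

# "sN" = shift and go to state N, "rN" = reduce by production N, "acc" = accept.
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Return the production numbers used for reductions, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = [0]          # stack of states; the grammar symbols are implied by them
    pos = 0
    reductions = []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if act == "acc":
            return reductions
        if act[0] == "s":                      # shift: push the new state
            stack.append(int(act[1:]))
            pos += 1
        else:                                  # reduce by production `number`
            number = int(act[1:])
            lhs, length = PRODUCTIONS[number]
            del stack[-length:]                # pop one state per right-hand-side symbol
            stack.append(GOTO[stack[-1]][lhs])
            reductions.append(number)


if __name__ == "__main__":
    print(lr_parse(["id", "*", "id", "+", "id"]))
```

Read backwards, the returned list of production numbers spells out a rightmost derivation of the input.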
LR Parser Example
(CMPUT 680 - Compiler Design and Optimization; Aho, Sethi, Ullman, pp. 220)

GRAMMAR:
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → ( E )
(6) F → id

INPUT: id * id + id $

Parsing table:

State   action                                  goto
        id    +     *     (     )     $         E    T    F
0       s5                s4                    1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                    8    2    3
5             r6    r6          r6    r6
6       s5                s4                         9    3
7       s5                s4                              10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5

Moves of the LR parsing program (stack bottom at left):

STACK                 INPUT              ACTION
0                     id * id + id $     shift 5
0 id 5                * id + id $        reduce by (6) F → id, goto[0, F] = 3
0 F 3                 * id + id $        reduce by (4) T → F, goto[0, T] = 2
0 T 2                 * id + id $        shift 7
0 T 2 * 7             id + id $          shift 5
0 T 2 * 7 id 5        + id $             reduce by (6) F → id, goto[7, F] = 10
0 T 2 * 7 F 10        + id $             reduce by (3) T → T * F, goto[0, T] = 2
0 T 2                 + id $             reduce by (2) E → T

[Figure: the OUTPUT panel shows the parse tree growing from the leaves as each reduction is made.]
GRAMMAR:
(1) E →E
+
(2)T E → T
(3) T → T * F
LR Parser
OUTPUT:
(4) T → F
(5) F Example
INPUT: id * id + id $
→(E)
(6) F E
→ id
LR Parsing
STACK: 0 T
Program

T * F
Stat actio got
e id + n * ( o ) F id
$ E T F
0 s5
s6 s4 1
acc 2
3
id
r2 r2
1
2 s7 s r2 8
3 r4
r6 4 r6 r4
4 s5 s 2 9
5 r4
r6 4 r6 r4
6 s5 s6 s s11 3 3
7 s5 r1 4 r1 10
8 CMPUT 680 - Compiler Design
9
11 and
s7
r r r1
r r 102
Optimization (Aho,Sethi,Ullman, pp. 220)
GRAMMAR:
(1) E →
E+T
(2) E’ → T LR Parser OUTPUT:
(3) T → T * F
(4) T → F
(5) F →
Example
INPUT: id * id + id $

(E) E
(6) F →
1
LR Parsing
STACK:
id T
Program
E
0 T * F
State action goto
id + * ( ) $ E T F F id
0 s5 s4 1 2 3
1 s6 acc id
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
CMPUT 680 - Compiler Design
9 r1 s7
and
r1 r1 102
10 r3 r3
Optimization r3 r3 (Aho,Sethi,Ullman, pp. 220)
GRAMMAR:
(1) E →
E+T
(2) E’ → T LR Parser OUTPUT:
(3) T → T * F
(4) T → F Example
INPUT: id * id + id $
(5) F →
(E) E
(6) F →
6
LR Parsing
STACK:
id T
Program
+
1 T * F
E
State action goto
0 id + * ( ) $ E T F F id
0 s5 s4 1 2 3
1 s6 acc id
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
CMPUT 680 - Compiler Design
9 r1 s7
and
r1 r1 103
10 r3 r3
Optimization r3 r3 (Aho,Sethi,Ullman, pp. 220)
GRAMMAR:
(1) E →
E+T
(2) E’ → T LR Parser OUTPUT:
(3) T → T * F
(4) T → F Example
INPUT: id * id + id $
(5)
(6) F
F → →
id ( E ) E

5
LR Parsing
STACK: T F
Program
id
6 T * F
+
State action goto
1 id + * ( ) $ E T F F id
id
E 0 s5 s4 1 2 3
0 1 s6 acc id
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
CMPUT 680 - Compiler Design
9 r1 s7
and
r1 r1 104
10 r3 r3
Optimization r3 r3 (Aho,Sethi,Ullman, pp. 220)
GRAMMAR:
(1) E →
E+T
(2) E’ → T
LR Parser
OUTPUT:
(3) T → T * F
(4) T → F Example
INPUT: id * id + id $
(5)
(6) F
F → →
id ( E ) E

6
LR Parsing
STACK: T F
Program
+
1 T * F
E
Stat actio got
0 e id + n * ( o ) F id
id
$ E T F
0 s5
s6 s4 1
acc 2
3
id
r2 r2
1
2 s7 s r2 8
3 r4
r6 4 r6 r4
4 s5 s 2 9
5 r4
r6 4 r6 r4
6 s5 s6 s s11 3 3
7 s5 r1 4 r1 10
8 CMPUT 680 - Compiler Design
9
11 and
s7
r r r1
r r 106
Optimization (Aho,Sethi,Ullman, pp. 220)
GRAMMAR:
(1) E →
E+T
(2) E’ → T
LR Parser
OUTPUT:
(3) T→
(4) T
(5) F
→TF*F
Example
INPUT: id * id + id $
→(E)
(6) F E T
→ id
LR Parsing
STACK: 3 T F
Program
F
6 T * F
+
State action goto
1 id + * ( ) $ E T F F id
id
E 0 s5 s4 1 2 3
0 1 s6 acc id
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
CMPUT 680 - Compiler Design
9 r1 s7
and
r1 r1 107
10 r3 r3
Optimization r3 r3 (Aho,Sethi,Ullman, pp. 220)
GRAMMAR:
(1) E →
E+T
(2) E’ → T LR Parser OUTPUT:
(3) T→
(4) T
(5) F
→TF*F
Example
INPUT: id * id + id $
→(E)
(6) F E
→ id
LR Parsing
STACK: 6 T F
Program
+
1 T * F
E
Stat actio got
0 e id + n * ( o ) F id
id
$ E T F
0 s5
s6 s4 1
acc 2
3
id
r2 r2
1
2 s7 s r2 8
3 r4
r6 4 r6 r4
4 s5 s 2 9
5 r4
r6 4 r6 r4
6 s5 s6 s s11 3 3
7 s5 r1 4 r1 10
8 CMPUT 680 - Compiler Design
9
11 and
s7
r r r1
r r 108
Optimization (Aho,Sethi,Ullman, pp. 220)
GRAMMAR:
(1) E →E
+ T E’ → T
(2)
(3) T → T * F
LR Parser
OUTPUT:
(4) T → F
(5) F → Example
INPUT: id * id + id $
E
(E)
(6) F → E + T
id LR Parsing
STACK: 9 T F
Program
T
6 T * F
+
State action goto
1 id + * ( ) $ E T F F id
id
E 0 s5 s4 1 2 3
0 1 s6 acc id
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
CMPUT 680 - Compiler Design
9 r1 s7
and
r1 r1 109
10 r3 r3
Optimization r3 r3 (Aho,Sethi,Ullman, pp. 220)
GRAMMAR:
(1) E →E
+ TE → T
(2)
(3) T → T * F LR Parser OUTPUT:
Example
(4) T → F
(5) F → E
INPUT: id * id + id $
(E)
(6) F → E + T
id
LR Parsing
STACK: 0 T F
Program

T * F
Stat actio got
e id + n * ( o ) F id
id
$ E T F
0 s5
s6 s4 1
acc 2
3
id
r2 r2
1
2 s7 s r2 8
3 r4
r6 4 r6 r4
4 s5 s 2 9
5 r4
r6 4 r6 r4
6 s5 s6 s s11 3 3
7 s5 r1 4 r1 10
8 CMPUT 680 - Compiler Design
9
11 and
s7
r r r1
r r 110
Optimization (Aho,Sethi,Ullman, pp. 220)
GRAMMAR:
(1) E →
E+T
(2) E’ → T LR Parser OUTPUT:
(3) T → T * F
(4) T → F Example
INPUT: id * id + id $
E
(5) F →
(E) E + T
(6) F →
1
LR Parsing
STACK:
id T F
Program
E
0 T * F
State action goto
id + * ( ) $ E T F F id
id
0 s5 s4 1 2 3
1 s6 acc id
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
CMPUT 680 - Compiler Design
9 r1 s7
and
r1 r1 111
10 r3 r3
Optimization r3 r3 (Aho,Sethi,Ullman, pp. 220)
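The LR Parsing Program that these slides step through can be sketched in a few lines of Python. This is only an illustrative sketch: the names `PRODUCTIONS`, `ACTION`, `GOTO` and `lr_parse` are made up for this example, the table entries are copied from the slide (row 11 is the standard r5 row for F → ( E )), and the stack holds states only, since the grammar symbols are implied by the states.

```python
# Production number -> (left-hand side, length of right-hand side).
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}

# ACTION[state][terminal]: "sN" = shift to state N, "rN" = reduce by production N.
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

# GOTO[state][nonterminal] -> next state.
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Return the production numbers in the order the parser reduces by them."""
    stack = [0]                      # state stack; state 0 is the initial state
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    output = []
    while True:
        state, lookahead = stack[-1], tokens[pos]
        act = ACTION[state].get(lookahead)
        if act is None:              # blank table entry means error
            raise SyntaxError(f"unexpected {lookahead!r} in state {state}")
        if act == "acc":
            return output
        if act[0] == "s":            # shift: push the new state, consume the token
            stack.append(int(act[1:]))
            pos += 1
        else:                        # reduce: pop |rhs| states, then take the goto
            prod = int(act[1:])
            lhs, rhs_len = PRODUCTIONS[prod]
            del stack[-rhs_len:]
            stack.append(GOTO[stack[-1]][lhs])
            output.append(prod)
```

Calling `lr_parse(["id", "*", "id", "+", "id"])` follows the same moves as the slides and collects the reductions in the order they are applied.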
Constructing Parsing Tables
All LR parsers use the same parsing program that we demonstrated in the previous slides.
What differentiates the LR parsers are the action and the goto tables:
●Simple LR (SLR): succeeds for the fewest grammars, but is the easiest to implement. (See AhoSethiUllman pp. 221-230).
●Canonical LR: succeeds for the most grammars, but is the hardest to implement. It splits states when necessary to prevent reductions that would get the parser stuck. (See AhoSethiUllman pp. 230-236).
●Lookahead LR (LALR): succeeds for most common syntactic constructions used in programming languages, but produces LR tables much smaller than canonical LR. (See AhoSethiUllman pp. 236-247).
(Aho,Sethi,Ullman, pp. 221)
Closure()
●I: a set of items of G
●Closure(I):
Initially, every item of I is included in Closure(I).
● Repeat: if A->α.Bβ is in Closure(I) and B->γ is a production, add B->.γ to Closure(I) (if it is not already present), until no new items can be added to Closure(I).
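The Closure() rule can be tried out on the expression grammar used in these slides. The sketch below is an illustration under assumptions: an item is represented as a tuple `(lhs, rhs, dot)`, and the names `GRAMMAR` and `closure` are chosen here, not taken from the slides.

```python
# Augmented expression grammar: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Closure of a set of LR(0) items; an item is (lhs, rhs, dot position)."""
    result = set(items)          # initially every item of I is in Closure(I)
    worklist = list(items)
    while worklist:              # repeat until no new item can be added
        lhs, rhs, dot = worklist.pop()
        # A -> alpha . B beta with B a nonterminal: add B -> . gamma
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for gamma in GRAMMAR[rhs[dot]]:
                new_item = (rhs[dot], gamma, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


# I0 is the closure of the initial item E' -> .E
I0 = closure({("E'", ("E",), 0)})
```

With this representation `I0` contains the seven items listed for I0 in the LR(0) item slides, including `("F", ("id",), 0)`, that is F -> .id.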
goto()
●Goto(I,X): I is a set of items, X is a grammar symbol
●Goto(I,X) := closure({A->αX.β | A->α.Xβ ∈ I})
●If I is the set of valid items for a viable prefix γ, then goto(I,X) is the set of valid items for the viable prefix γX
●Kernel Items: these include the initial item S’->.S, and all items whose dots are not at the left end
Set of Items Construction
procedure items(G’);
begin
    C := {closure({S’->.S})};
    repeat
        for each set of items I in C and each grammar symbol X
            such that goto(I,X) is not empty and not in C do
                add goto(I,X) to C
    until no more sets of items can be added to C
end
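The goto function and the items(G’) procedure can be sketched together for the expression grammar of these slides. This is a sketch under assumptions: items are tuples `(lhs, rhs, dot)`, the helper names (`closure`, `goto`, `canonical_collection`) are chosen for this example, and the symbol order in `SYMBOLS` is picked so that the sets come out numbered I0 to I11 in the same order as on the slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
# Grammar symbols, ordered so that state numbers match the slides.
SYMBOLS = ["E", "T", "F", "(", "id", "+", "*", ")"]


def closure(items):
    """Closure of a set of LR(0) items (lhs, rhs, dot)."""
    result, worklist = set(items), list(items)
    while worklist:
        _, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for gamma in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    """closure of {A -> alpha X . beta | A -> alpha . X beta in items}."""
    moved = {(lhs, rhs, dot + 1)
             for lhs, rhs, dot in items
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved) if moved else frozenset()


def canonical_collection():
    """Build C and the transition function of the DFA over item sets."""
    collection = [closure({("E'", ("E",), 0)})]
    transitions = {}
    for state in collection:            # the list grows while we iterate over it
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if not target:
                continue                # goto(I, X) is empty
            if target not in collection:
                collection.append(target)
            transitions[(collection.index(state), symbol)] = collection.index(target)
    return collection, transitions


collection, transitions = canonical_collection()
```

The returned `transitions` dictionary is the goto function in tabular form; for example `transitions[(0, "(")]` gives the number of the set reached from I0 on "(".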
●E’ -> E
●E -> E + T | T
●T -> T * F | F
●F -> (E) | id
Sets of LR(0) items to construct for this grammar: I0, I1, …, I11
LR(0) Items
I0 = E’ -> .E
     E -> .E + T
     E -> .T
     T -> .T * F
     T -> .F
     F -> .(E)
     F -> .id
I1 = E’ -> E.
     E -> E. + T
I2 = E -> T.
     T -> T. * F
I3 = T -> F.
I4 = F -> (.E)
     E -> .E + T
     E -> .T
     T -> .T * F
     T -> .F
     F -> .(E)
     F -> .id
I5 = F -> id.
I6 = E -> E +. T
     T -> .T * F
     T -> .F
     F -> .(E)
     F -> .id
I7 = T -> T *. F
     F -> .(E)
     F -> .id
I8 = F -> (E.)
     E -> E. + T
I9 = E -> E + T.
     T -> T. * F
I10 = T -> T * F.
I11 = F -> (E).
● E’ -> E
● E -> E + T | T
● T -> T * F | F
● F -> (E) | id
DFA: Each State is a Final State
Transitions (the start state is I0):
I0: E to I1, T to I2, F to I3, ( to I4, id to I5
I1: + to I6
I2: * to I7
I4: E to I8, T to I2, F to I3, ( to I4, id to I5
I6: T to I9, F to I3, ( to I4, id to I5
I7: F to I10, ( to I4, id to I5
I8: ) to I11, + to I6
I9: * to I7
Constructing an SLR Parsing Table
●Input: An augmented grammar G’
●Output: The SLR parsing table functions action and goto for G’.
●Method:
1. Construct C = {I0, I1, …, In}, the collection of sets of LR(0) items for G’.
2. State i is constructed from Ii. The parsing actions for state i are determined as follows:
a] If [A->α.aβ] is in Ii and goto(Ii, a) = Ij, then set action[i, a] to “shift j”; here a must be a terminal.
b] If [A->α.] is in Ii, then set action[i, a] to “reduce A->α” for all a in FOLLOW(A); here A may not be S’.
c] If [S’->S.] is in Ii, then set action[i, $] to “accept”.
If any conflicting actions are generated by the above rules (a, b, c), we say the grammar is not SLR(1). The algorithm fails to produce a parser in this case.
3. The goto transitions for state i are constructed for all nonterminals A using the rule: if goto(Ii, A) = Ij, then goto[i, A] = j.
4. All entries not defined by rules (2) and (3) are made “error”.
5. The initial state of the parser is the one constructed from the set of items containing [S’->.S].
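Rule (b) of the SLR method depends on FOLLOW(A). A small fixed-point computation for the expression grammar of these slides could look like the sketch below. It assumes a grammar without ε-productions (true for this grammar, not in general), and the names `GRAMMAR`, `START`, `first_sets` and `follow_sets` are chosen for this illustration.

```python
GRAMMAR = {
    "E'": [["E"]],
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}
START = "E'"


def first_sets():
    """FIRST of every nonterminal, assuming no epsilon-productions."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nt, bodies in GRAMMAR.items():
            for body in bodies:
                head = body[0]
                new = first[head] if head in GRAMMAR else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def follow_sets():
    """FOLLOW of every nonterminal, assuming no epsilon-productions."""
    first = first_sets()
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add("$")                   # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for nt, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue             # FOLLOW is defined for nonterminals only
                    if i + 1 < len(body):    # A -> alpha B beta: add FIRST(beta)
                        nxt = body[i + 1]
                        new = first[nxt] if nxt in GRAMMAR else {nxt}
                    else:                    # A -> alpha B: add FOLLOW(A)
                        new = follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FOLLOW = follow_sets()
```

For state 2, which contains the item E -> T., rule (b) places “reduce E -> T” under every terminal in `FOLLOW["E"]`, which is where the r2 entries under +, ) and $ in row 2 of the parsing table come from.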
Constructing Canonical LR Parsing Table
●The extra information is incorporated into the state by redefining items to include a terminal symbol as a second component.
●An item now has the form [A->α.β, a], where A->αβ is a production and a is a terminal or $. The second component a is the lookahead of the item.
●We call such an object an LR(1) item.
Closure(I)
begin
    repeat
        for each item [A->α.Bβ, a] in I,
            each production B->γ in G’,
            and each terminal b in FIRST(βa)
            such that [B->.γ, b] is not in I do
                add [B->.γ, b] to I;
    until no new items can be added to I;
    return I
end
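The LR(1) closure can be sketched in Python as follows. Everything here is an assumption made for illustration: the small grammar S’ -> S, S -> C C, C -> c C | d is the usual textbook example (it is not spelled out on this slide), an item is a tuple `(lhs, rhs, dot, lookahead)`, and FIRST(βa) is computed under the assumption that the grammar has no ε-productions.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("C", "C")],
    "C": [("c", "C"), ("d",)],
}


def first_sets():
    """FIRST of every nonterminal, assuming no epsilon-productions."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nt, bodies in GRAMMAR.items():
            for body in bodies:
                head = body[0]
                new = first[head] if head in GRAMMAR else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = first_sets()


def first_of(sequence):
    """FIRST of a non-empty symbol string (no epsilon-productions assumed)."""
    head = sequence[0]
    return FIRST[head] if head in GRAMMAR else {head}


def closure(items):
    """Closure of a set of LR(1) items (lhs, rhs, dot, lookahead)."""
    result, worklist = set(items), list(items)
    while worklist:
        _, rhs, dot, a = worklist.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:      # [A -> alpha . B beta, a]
            beta_a = rhs[dot + 1:] + (a,)
            for gamma in GRAMMAR[rhs[dot]]:
                for b in first_of(beta_a):              # each b in FIRST(beta a)
                    item = (rhs[dot], gamma, 0, b)      # [B -> . gamma, b]
                    if item not in result:
                        result.add(item)
                        worklist.append(item)
    return frozenset(result)


I0 = closure({("S'", ("S",), 0, "$")})
```

In `I0` the items for C carry the lookaheads c and d but not $, because FIRST(C $) = {c, d}; this is the extra information that the LR(0) closure does not track.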
goto(I, X)
begin
    let J be the set of items [A->αX.β, a] such that [A->α.Xβ, a] is in I;
    return closure(J)
end;
procedure items(G’);
begin
    C := {closure({[S’->.S, $]})};
    repeat
        for each set of items I in C and each grammar symbol X
            such that goto(I, X) is not empty and not in C do
                add goto(I, X) to C
    until no more sets of items can be added to C
end.
Example

6/5/2021 137
Constructing the Canonical LR (CLR) Parsing Table
●Input: An augmented grammar G’
●Output: The canonical LR parsing table functions action and goto for G’.
●Method:
1. Construct C = {I0, I1, …, In}, the collection of sets of LR(1) items for G’.
2. State i is constructed from Ii. The parsing actions for state i are determined as follows:
a] If [A->α.aβ, b] is in Ii and goto(Ii, a) = Ij, then set action[i, a] to “shift j”; here a must be a terminal.
b] If [A->α., a] is in Ii, then set action[i, a] to “reduce A->α”; here A may not be S’.
c] If [S’->S., $] is in Ii, then set action[i, $] to “accept”.
If any conflicting actions are generated by the above rules (a, b, c), we say the grammar is not LR(1). The algorithm fails to produce a parser in this case.
3. The goto transitions for state i are constructed for all nonterminals A using the rule: if goto(Ii, A) = Ij, then goto[i, A] = j.
4. All entries not defined by rules (2) and (3) are made “error”.
5. The initial state of the parser is the one constructed from the set of items containing [S’->.S, $].
Canonical LR parsing table
State   Action           Goto
        c     d     $    S    C
0       s3    s4         1    2
1                   acc
2       s6    s7              5
3       s3    s4              8
4       r3    r3
5                   r1
6       s6    s7              9
7                   r3
8       r2    r2
9                   r2
Constructing Lookahead-LR (LALR) Parsing Table
●The tables obtained are smaller than the canonical LR tables.
●LALR can also handle some constructs that cannot be handled by an SLR grammar.
●We look for sets of LR(1) items having the same core, that is, the same set of first components, and we may merge these sets with common cores into one set of items.
LALR Parsing Table Construction
● Input: An augmented grammar G’
● Output: The LALR parsing table functions action and goto for G’.
● Method:
1. Construct C = {I0, I1, …, In}, the collection of sets of LR(1) items.
2. For each core present among the sets of LR(1) items, find all sets having that core, and replace these sets by their union.
3. Let C’ = {J1, J2, …, Jm} be the resulting sets of LR(1) items. The parsing actions for state i are constructed from Ji in the same manner as in the CLR algorithm.
4. The goto table is constructed as follows. If J is the union of one or more sets of LR(1) items, that is, J = I1 ∪ I2 ∪ … ∪ Ik, then the cores of goto(I1, X), goto(I2, X), …, goto(Ik, X) are the same, since I1, I2, …, Ik all have the same core. Let K be the union of all sets of items having the same core as goto(I1, X). Then goto(J, X) = K.
If there are no parsing action conflicts, then the given grammar is said to be an LALR(1) grammar.
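Step 2 of the method, merging sets of LR(1) items that share a core, can be sketched as below. The item sets `I4`, `I5`, `I7`, `I8` and `I9` are hypothetical inputs written out by hand for the textbook grammar S -> C C, C -> c C | d; the tuple layout `(lhs, rhs, dot, lookahead)` and the function names are assumptions of this example.

```python
from collections import defaultdict


def core(item_set):
    """Drop the lookahead (second component) of every LR(1) item."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in item_set)


def merge_by_core(collection):
    """Replace all item sets that share a core by their union."""
    groups = defaultdict(set)
    for item_set in collection:
        groups[core(item_set)] |= item_set
    # dicts keep insertion order, so merged sets appear in first-seen order
    return [frozenset(items) for items in groups.values()]


# Hand-written LR(1) item sets (assumed, for illustration only).
I4 = frozenset({("C", ("d",), 1, "c"), ("C", ("d",), 1, "d")})            # C -> d., c/d
I5 = frozenset({("S", ("C", "C"), 2, "$")})                              # S -> CC., $
I7 = frozenset({("C", ("d",), 1, "$")})                                  # C -> d., $
I8 = frozenset({("C", ("c", "C"), 2, "c"), ("C", ("c", "C"), 2, "d")})  # C -> cC., c/d
I9 = frozenset({("C", ("c", "C"), 2, "$")})                              # C -> cC., $

merged = merge_by_core([I4, I5, I7, I8, I9])
```

Here `I4` and `I7` collapse into one set with lookaheads c, d and $, and `I8` and `I9` collapse likewise, which corresponds to the combined states 47 and 89 in an LALR table.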
LALR parsing table
State   Action           Goto
        c     d     $    S    C
0       s36   s47        1    2
1                   acc
2       s36   s47             5
36      s36   s47             89
47      r3    r3    r3
5                   r1
89      r2    r2    r2
Compare: SLR vs CLR vs LALR
Semantic Analysis
●Here the compiler tries to discover the meaning of a program by analyzing its parse tree or abstract syntax tree.
●Checks whether the source program (SP) conforms to the syntactic and semantic conventions of the source language.
●Also known as context-sensitive analysis.
●Answer depends on value, not syntax.
Attribute Grammar
●Attributes on symbols
●Attribute evaluation rules
●Indexing of grammar symbols

●TYPE CHECKING is the main activity in


semantic analysis.
●Goal: calculate and ensure consistency of
the type of every expression in a program
●If there are type errors, we need to notify
the user.
●Otherwise, we need the type information
to generate code that is correct. 14
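As a small illustration of calculating the type of every expression, the sketch below checks expressions over a made-up mini language. The tuple-based syntax tree, the two types "int" and "bool", the operator set and the function name `type_of` are all assumptions of this example, not part of the slides.

```python
def type_of(expr, env):
    """Return the type of expr or raise TypeError.

    expr is an int or bool literal, a variable name (str),
    or a tuple (operator, left, right). env maps variable names to types.
    """
    if isinstance(expr, bool):           # test bool first: bool is a subclass of int
        return "bool"
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, str):
        if expr not in env:
            raise TypeError(f"undeclared variable {expr!r}")
        return env[expr]
    op, left, right = expr
    lt, rt = type_of(left, env), type_of(right, env)
    if op in ("+", "*"):                 # arithmetic: int x int -> int
        if lt == rt == "int":
            return "int"
    elif op == "<":                      # comparison: int x int -> bool
        if lt == rt == "int":
            return "bool"
    elif op == "and":                    # logical: bool x bool -> bool
        if lt == rt == "bool":
            return "bool"
    # type error: this is where the user is notified
    raise TypeError(f"operator {op!r} cannot be applied to {lt} and {rt}")
```

With `env = {"x": "int", "y": "int"}`, the expression `("+", "x", ("*", 2, "y"))` is assigned the type "int", while `("+", "x", True)` raises a `TypeError`.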
