Chapter 3 - Syntax Analysis

The document discusses the role of a parser in syntax analysis. It begins by introducing context-free grammars and how they are used to specify the structure of legal programs in a programming language. It then covers various parsing concepts like derivation, parse trees, ambiguity, left and right recursion, and how these issues are resolved. The document aims to provide an overview of key parser and grammar related topics for syntax analysis.


Syntax Analyzer Parser

Chapter 3

By Esubalew Alemneh

Contents (Session-1)
Introduction
Context-free grammar
Derivation
Parse Tree
Ambiguity
Resolving Ambiguity

Immediate & Indirect Left Recursion

Eliminating Immediate & Indirect Left Recursion

Left Factoring
Non-Context Free Language Constructs

Introduction

Abstract representations of the input program could be:

abstract-syntax tree/parse tree + symbol table
intermediate code
object code
Syntax analysis is done by the parser, which:
Produces a parse tree, from which intermediate code can be generated, by detecting whether the program is written following the grammar rules
Reports syntax errors and attempts error correction and recovery
Collects information into symbol tables

Introduction
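[Figure: position of the parser in the compiler front end. The source program feeds the lexical analyzer; the parser sends it a request for a token and receives a token back; the parser passes the parse tree to the rest of the front end, which produces intermediate code. A symbol table is shared by these phases, and the parser reports errors.]

Parsers can be top-down or bottom-up.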

Context Free Grammars(CFG)

CFG is used to specify the structure of legal programs.
The design of the grammar is an initial phase of the design of a programming language.

Formally, a CFG G = (Vt, Vn, S, P), where:
Vt is the set of terminal symbols in the grammar (i.e., the set of tokens returned by the scanner)
Vn, the non-terminals, are variables that denote sets of (sub)strings occurring in the language. These impose a structure on the grammar.
S is the start/goal symbol, a distinguished non-terminal in Vn denoting the entire set of strings in L(G).
P is a finite set of productions specifying how terminals and non-terminals can be combined to form strings in the language. Each production must have a single non-terminal on its left-hand side.

The set V = Vt ∪ Vn is called the vocabulary of G.

Context Free Grammars(CFG)

Example (G1):

E → E+E | E-E | E*E | E/E | -E
E → (E)
E → id

Where
Vt = {+, -, *, /, (, ), id}, Vn = {E}
S = E
The productions are shown above.

Sometimes → is replaced by ::=

CFG is more expressive than RE: every language that can be described by a regular expression can also be described by a CFG.
L = {a^n b^n | n >= 1} is an example of a language that can be expressed by a CFG but not by an RE.

Context-free grammar is sufficient to describe most programming languages.

Context Free Grammars(CFG)

BNF (Backus Normal Form or Backus-Naur Form) is a notation technique for context-free grammars, often used to describe the syntax of languages used in computing.
It has many extensions and variants:
Extended Backus-Naur Form (EBNF)
Augmented Backus-Naur Form (ABNF)

A BNF specification is a set of derivation rules, written as

<symbol> ::= expression

BNF for valid arithmetic expression

<expr> ::= <expr> <op> <expr>

<expr> ::= ( <expr> )
<expr> ::= - <expr>
<expr> ::= id
<op> ::= + | - | * | /

Derivation
A sequence of replacements of non-terminal symbols to obtain strings/sentences is called a derivation.
If we have a production E → E+E, then we can replace E by E+E.
In general, a derivation step is αAβ ⇒ αγβ if there is a production rule A → γ in the grammar, where α and β are arbitrary strings of terminal and non-terminal symbols.

Derivation of a string should start from a production with the start symbol on the left.
If S ⇒* α, then:
α is a sentential form (terminals and non-terminals mixed)
α is a sentence if it contains only terminal symbols

Derivation
Derive the string -(id+id) from G1:

E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id) (LMD)
OR
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(E+id) ⇒ -(id+id) (RMD)

At each derivation step, we can choose any of the non-terminals in the sentential form of G for the replacement.
If we always choose the left-most non-terminal in each derivation step, the derivation is called a left-most derivation (LMD).
If we always choose the right-most non-terminal in each derivation step, the derivation is called a right-most derivation (RMD).
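
A left-most derivation can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides: the function name `leftmost_derivation` and the list-of-symbols representation are illustrative assumptions, and the sketch does not check that each right-hand side is really a production of G1.

```python
def leftmost_derivation(start, steps, nonterminals):
    """Replay a derivation by always rewriting the left-most non-terminal.

    `steps` is the list of right-hand sides to apply, in order.
    Returns every sentential form, starting with [start].
    """
    form = [start]
    forms = [form[:]]
    for rhs in steps:
        # Index of the left-most non-terminal in the current sentential form.
        i = next(k for k, sym in enumerate(form) if sym in nonterminals)
        form = form[:i] + list(rhs) + form[i + 1:]
        forms.append(form[:])
    return forms


# Right-hand sides used by the left-most derivation of -(id+id) in G1.
steps = [["-", "E"], ["(", "E", ")"], ["E", "+", "E"], ["id"], ["id"]]
forms = leftmost_derivation("E", steps, {"E"})
derivation = " => ".join("".join(f) for f in forms)
print(derivation)
```

Replacing `next(...)` with a search from the right end would give the right-most derivation instead.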

Parse Tree
A parse tree can be seen as a graphical representation of a derivation.
Inner nodes of a parse tree are non-terminal symbols.
The leaves of a parse tree are terminal symbols.

[Figure: the parse tree for -(id+id), grown one step at a time alongside the derivation E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id)]

Ambiguity
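An ambiguous grammar is one that produces more than one LMD or more than one RMD for the same sentence.

Two left-most derivations of id+id*id:
E ⇒ E+E ⇒ id+E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id
E ⇒ E*E ⇒ E+E*E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id

[Figure: the two corresponding parse trees, one with + at the root and one with * at the root]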

Ambiguity
For most parsers, the grammar must be unambiguous.
If a grammar is unambiguous, there is a unique parse tree for each sentence.
We should eliminate the ambiguity in the grammar during the design phase of the compiler.
An unambiguous grammar should be written to eliminate the ambiguity.
To disambiguate a grammar, we pick the preferred parse tree for each ambiguous sentence and restrict the grammar to that choice.

Ambiguity: Dangling If

stmt → if expr then stmt |
if expr then stmt else stmt |
otherstmts

The sentence

if E1 then if E2 then S1 else S2

has two parse trees:
1. the else belongs to the outer if: if E1 then (if E2 then S1) else S2
2. the else belongs to the inner if: if E1 then (if E2 then S1 else S2)

We prefer the second parse tree (else matches the closest if).
So, we have to disambiguate our grammar.

Resolving Ambiguity
Option 1: add a meta-rule, e.g. precedence and associativity rules
For example, "else associates with the closest previous if"
works, keeps original grammar intact
ad hoc and informal

Option 2: rewrite the grammar to resolve ambiguity explicitly
stmt → matchedstmt | unmatchedstmt
matchedstmt → if expr then matchedstmt else matchedstmt | otherstmts
unmatchedstmt → if expr then stmt | if expr then matchedstmt else unmatchedstmt
formal, no additional rules beyond syntax
sometimes obscures original grammar

Resolving Ambiguity
Option 3: redesign the language to remove the ambiguity
Stmt ::= ... |
if Expr then Stmt end |
if Expr then Stmt else Stmt end
formal, clear, elegant
allows a sequence of Stmts in then and else branches, no { , } needed
extra end required for every if

Left Recursion
A grammar is left recursive if it has a non-terminal A such that there is a derivation

A ⇒+ Aα for some string α

Top-down parsing techniques cannot handle left-recursive grammars.
So, we have to convert our left-recursive grammar into an equivalent grammar which is not left-recursive.
Two types of left recursion:
immediate left recursion: appears in a single step of the derivation (A → Aα)
indirect left recursion: appears in more than one step of the derivation

Eliminating Immediate Left Recursion

A → Aα | β, where β does not start with A

eliminate immediate left recursion:

A → βA'
A' → αA' | ε
(an equivalent grammar)

In general,
A → Aα1 | ... | Aαm | β1 | ... | βn, where β1 ... βn do not start with A

eliminate immediate left recursion:

A → β1A' | ... | βnA'
A' → α1A' | ... | αmA' | ε
(an equivalent grammar)

Eliminating Left Recursion

Remove left recursion from the grammar below:
E → E+T | T
T → T*F | F
F → id | (E)
Answer:
E → T E'
E' → +T E' | ε
T → F T'
T' → *F T' | ε
F → id | (E)
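
The rewriting rule for immediate left recursion can be sketched in Python as follows. This is an illustrative sketch, not part of the original slides: the function name `eliminate_immediate`, the representation of a right-hand side as a list of symbols (with the empty list standing for ε), and the convention of naming the new non-terminal by appending a prime are all assumptions.

```python
def eliminate_immediate(nt, alternatives):
    """Remove immediate left recursion from the productions of `nt`.

    A -> A a1 | ... | A am | b1 | ... | bn   becomes
    A  -> b1 A' | ... | bn A'
    A' -> a1 A' | ... | am A' | epsilon
    """
    new_nt = nt + "'"
    # The alpha_i: what follows the leading A in each left-recursive alternative.
    alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == nt]
    # The beta_j: alternatives that do not start with A.
    betas = [rhs for rhs in alternatives if not rhs or rhs[0] != nt]
    if not alphas:
        return {nt: alternatives}  # nothing to do
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],  # [] is epsilon
    }


e_rules = eliminate_immediate("E", [["E", "+", "T"], ["T"]])
t_rules = eliminate_immediate("T", [["T", "*", "F"], ["F"]])
f_rules = eliminate_immediate("F", [["id"], ["(", "E", ")"]])
print(e_rules, t_rules, f_rules)
```

Applied to E, T and F of the expression grammar, the sketch yields the same productions as the answer above; F has no left recursion and is returned unchanged.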

Indirect Left-Recursion
A grammar may not be immediately left-recursive and still be left-recursive.
By just eliminating the immediate left recursion, we may not get a grammar which is not left-recursive.
S → Aa | b
A → Sc | d
This grammar is not immediately left-recursive, but it is still left-recursive:
S ⇒ Aa ⇒ Sca
or
A ⇒ Sc ⇒ Aac
Both derivations lead to left recursion.
So, we have to eliminate all left recursion from our grammar.

Eliminating Indirect Left-Recursion

Arrange the non-terminals in some order: A1 ... An
We remove indirect left recursion by constructing an equivalent grammar G' such that
- if Ai → Ajα is any production of G', then i < j
For each non-terminal Ai in turn (i = 1, ..., n), do:
For each non-terminal Aj such that 1 ≤ j < i and we have a production rule of the form Ai → Ajγ, where the Aj productions are Aj → β1 | ... | βn, do:
Replace the production rule Ai → Ajγ with the rules Ai → β1γ | ... | βnγ
Eliminate any immediate left recursion among the Ai productions

Eliminating Indirect Left-Recursion

Example 1

S → Aa | b
A → Ac | Sd | f
- Order of non-terminals: S = A1, A = A2
A1 → A2 a | b
A2 → A2 c | A1 d | f
The only production with j < i is A2 → A1 d
for A:
- Replace it with A2 → A2 ad | bd
A2 → A2 c | A2 ad | bd | f
- Eliminate the immediate left recursion in A
A2 → bdA' | fA'
A' → cA' | adA' | ε

So, the resulting equivalent grammar, which is not left-recursive, is:

S → Aa | b
A → bdA' | fA'
A' → cA' | adA' | ε
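
The whole procedure (substitute earlier non-terminals, then remove immediate left recursion) can be sketched in Python. This is a hedged illustration rather than the slides' own code: the names `eliminate_immediate` and `eliminate_left_recursion`, the list-of-symbols representation (with [] for ε) and the primed names for new non-terminals are assumptions, and, like the textbook algorithm, the sketch assumes the input grammar has no cycles and no ε-productions.

```python
def eliminate_immediate(nt, alternatives):
    """A -> A a | b  becomes  A -> b A',  A' -> a A' | epsilon."""
    new_nt = nt + "'"
    alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == nt]
    betas = [rhs for rhs in alternatives if not rhs or rhs[0] != nt]
    if not alphas:
        return {nt: alternatives}
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


def eliminate_left_recursion(grammar, order):
    """Remove immediate and indirect left recursion, visiting `order` in turn."""
    g = {nt: [list(rhs) for rhs in alts] for nt, alts in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for rhs in g[ai]:
                if rhs and rhs[0] == aj:
                    # Replace Ai -> Aj gamma by Ai -> beta gamma for each Aj -> beta.
                    expanded.extend(beta + rhs[1:] for beta in g[aj])
                else:
                    expanded.append(rhs)
            g[ai] = expanded
        g.update(eliminate_immediate(ai, g[ai]))
    return g


grammar = {"S": [["A", "a"], ["b"]], "A": [["A", "c"], ["S", "d"], ["f"]]}
result = eliminate_left_recursion(grammar, ["S", "A"])
for nt, alts in result.items():
    print(nt, "->", " | ".join(" ".join(rhs) or "ε" for rhs in alts))
```

With the order S, A this reproduces the grammar derived in Example 1; a different order generally gives a different but equivalent grammar.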

Eliminating Indirect Left-Recursion

Example 2
A1 → A2 A3
A2 → A3 A1 | b
A3 → A1 A1 | a
Replace A3 → A1 A1 by A3 → A2 A3 A1,
and then replace this by
A3 → A3 A1 A3 A1 and A3 → b A3 A1
Eliminating direct left recursion in the above gives:
A3 → aK | b A3 A1 K
K → A1 A3 A1 K | ε

The resulting grammar is then:

A1 → A2 A3
A2 → A3 A1 | b
A3 → aK | b A3 A1 K
K → A1 A3 A1 K | ε

Left Factoring
A predictive parser (a top-down parser without backtracking) insists that the grammar be left-factored.
stmt → if expr then stmt else stmt |
if expr then stmt
When we see if, we cannot know which production rule to choose to rewrite stmt in the derivation.
In general,
A → αβ1 | αβ2, where α is non-empty and the first symbols of β1 and β2 (if they have one) are different.
When processing α, we cannot know whether to expand A to αβ1 or A to αβ2.

Left Factoring
But if we rewrite the grammar as follows:

A → αA'
A' → β1 | β2

then we can immediately expand A to αA'.

Left Factoring Algorithm

For each non-terminal A with two or more alternatives (production rules) with a common non-empty prefix α, say
A → αβ1 | ... | αβn | γ1 | ... | γm
convert it into
A → αA' | γ1 | ... | γm
A' → β1 | ... | βn

Left Factoring
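Example 1

A → abB | aB | cdg | cdeB | cdfB

factor the common prefix a:
A → aA' | cdg | cdeB | cdfB
A' → bB | B

factor the common prefix cd:
A → aA' | cdA''
A' → bB | B
A'' → g | eB | fB

Example 2

A → ad | a | ab | abc | b

factor the common prefix a:
A → aA' | b
A' → d | ε | b | bc

factor the common prefix b in A':
A → aA' | b
A' → d | ε | bA''
A'' → ε | c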
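
Left factoring can also be sketched in Python. The sketch below is an illustration under stated assumptions, not the slides' own algorithm text: alternatives are grouped by their first symbol, the longest prefix common to the whole group is factored out, and newly created non-terminals (named by appending primes, a hypothetical convention) are factored again in turn. Right-hand sides are lists of symbols and [] stands for ε.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor(grammar):
    out = {nt: [list(rhs) for rhs in alts] for nt, alts in grammar.items()}
    work = list(out)  # non-terminals still to be examined
    while work:
        nt = work.pop(0)
        groups = {}  # first symbol -> alternatives starting with it
        for rhs in out[nt]:
            groups.setdefault(rhs[0] if rhs else None, []).append(rhs)
        new_alts = []
        for first, group in groups.items():
            if first is None or len(group) == 1:
                new_alts.extend(group)  # epsilon or nothing to factor
                continue
            prefix = group[0]
            for rhs in group[1:]:
                prefix = common_prefix(prefix, rhs)
            new_nt = nt + "'"
            while new_nt in out:  # pick an unused primed name
                new_nt += "'"
            out[new_nt] = [rhs[len(prefix):] for rhs in group]
            new_alts.append(prefix + [new_nt])
            work.append(new_nt)  # the new rules may need factoring too
        out[nt] = new_alts
    return out


example2 = {"A": [["a", "d"], ["a"], ["a", "b"], ["a", "b", "c"], ["b"]]}
factored = left_factor(example2)
for nt, alts in factored.items():
    print(nt, "->", " | ".join(" ".join(rhs) or "ε" for rhs in alts))
```

On Example 2 the sketch produces A → aA' | b, A' → d | ε | bA'', A'' → ε | c, matching the hand-derived result.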

Non-Context Free Language Constructs

There are some language constructs in programming languages which are not context-free. This means that we cannot write a context-free grammar for these constructs.
L1 = { ωcω | ω is in (a|b)* } is not context-free.
It models declaring an identifier and later checking whether it is declared. We cannot do this with a context-free language; we need a semantic analyzer (which is not context-free).
L2 = { a^n b^m c^n d^m | n ≥ 1 and m ≥ 1 } is not context-free.
It models declaring two functions (one with n parameters, the other with m parameters) and then calling them with actual parameters.

Contents(Session-2)
Top Down Parsing
Recursive-Descent Parsing
Predictive Parser
Recursive Predictive Parsing
Non-Recursive Predictive Parsing

LL(1) Parser Parser Actions

Constructing LL(1) - Parsing Tables
Computing FIRST and FOLLOW functions
LL(1) Grammars
Properties of LL(1) Grammars

Top Down Parsing

Top-down parsing involves constructing a parse tree for the input string, starting from the root.
Basically, top-down parsing can be viewed as finding a leftmost derivation for an input string.
How does it work? Start with a tree of one node labeled with the start symbol and repeat the following steps until the fringe of the parse tree matches the input string:
1. At a node labeled A, select a production with A on its LHS and, for each symbol on its RHS, construct the appropriate child
2. When a terminal is added to the fringe that doesn't match the input string, backtrack
3. Find the next node to be expanded
Minimize the number of backtracks as much as possible!

Top Down Parsing

Two types of top-down parsing:

Recursive-Descent Parsing
Backtracking is needed (if a choice of a production rule does not work, we backtrack to try other alternatives).
It is a general parsing technique, but not widely used because it is not efficient.
Predictive Parsing
No backtracking, and hence efficient.
Needs a special form of grammars (LL(1) grammars).
Two types:
Recursive Predictive Parsing is a special form of Recursive-Descent Parsing without backtracking.
Non-Recursive (Table-Driven) Predictive Parser is also known as the LL(1) parser.

Recursive-Descent Parsing
It tries to find the left-most derivation.
Backtracking is needed.
Example
S → aBc
B → bc | b
input: abc
(The parser first tries B → bc, then fails to match the final c, and backtracks to B → b.)
A left-recursive grammar can cause a recursive-descent parser, even one with backtracking, to go into an infinite loop.
That is, when we try to expand a non-terminal B, we may eventually find ourselves again trying to expand B without having consumed any input.

Predictive Parser
A grammar → eliminate left recursion → left factor → a grammar suitable for predictive parsing (an LL(1) grammar); no 100% guarantee.
When rewriting a non-terminal in a derivation step, a predictive parser can uniquely choose a production rule by just looking at the current symbol in the input string.
stmt → if ...... |
while ...... |
begin ...... |
for ......
Note: when we are trying to rewrite the non-terminal stmt, we can uniquely choose the production rule by just looking at the current token.
However, even though we eliminate the left recursion in the grammar and left factor it, it may not be suitable for predictive parsing (not an LL(1) grammar).

Recursive Predictive Parsing

Predictive parsing can be recursive or non-recursive.
In recursive predictive parsing, each non-terminal corresponds to a procedure/function.
Example

A → aBb | bAB
proc A {
case of the current token {
a: - match the current token with a, and move to the next token;
- call B;
- match the current token with b, and move to the next token;
b: - match the current token with b, and move to the next token;
- call A;
- call B;
}
}
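
The procedure above can be turned into a small runnable Python sketch. The slides do not give the productions of B, so the rule B → c used here is purely a hypothetical assumption to make the example complete; the class name `PredictiveParser` and the helper `accepts` are likewise illustrative.

```python
class PredictiveParser:
    """Recursive predictive parser for A -> aBb | bAB, with B -> c assumed."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # $ marks the end of the input
        self.pos = 0

    def match(self, expected):
        """Match the current token with `expected` and move to the next token."""
        if self.tokens[self.pos] != expected:
            raise SyntaxError(f"expected {expected!r}, got {self.tokens[self.pos]!r}")
        self.pos += 1

    def A(self):
        token = self.tokens[self.pos]
        if token == "a":      # A -> aBb
            self.match("a")
            self.B()
            self.match("b")
        elif token == "b":    # A -> bAB
            self.match("b")
            self.A()
            self.B()
        else:
            raise SyntaxError(f"unexpected token {token!r}")

    def B(self):              # B -> c (hypothetical rule, not in the slides)
        self.match("c")


def accepts(text):
    parser = PredictiveParser(text)
    try:
        parser.A()
        parser.match("$")     # the whole input must be consumed
    except SyntaxError:
        return False
    return True


print(accepts("acb"), accepts("bacbc"), accepts("ab"))
```

Each procedure decides which production to use from the current token alone, so no backtracking is needed.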

Recursive Predictive Parsing

When to apply ε-productions?

A → aA | bB | ε
If all other productions fail, we should apply an ε-production. For example, if the current token is not a or b, we may apply the ε-production.
Most correct choice: we should apply an ε-production for a non-terminal A when the current token is in the follow set of A (the terminals that can follow A in the sentential forms).

Recursive Predictive Parsing

Non-Recursive Predictive Parsing

A non-recursive predictive parser can be built by maintaining a stack explicitly, rather than implicitly via recursive calls.
Non-recursive predictive parsing is a table-driven top-down parser.

[Figure: model of a table-driven predictive parser]

Non-Recursive Predictive Parsing

Input buffer
our string to be parsed. We will assume that its end is marked with a special symbol $.

Output
a production rule representing a step of the derivation sequence (left-most derivation) of the string in the input buffer.

Stack
contains the grammar symbols
at the bottom of the stack, there is a special end-marker symbol $.
initially the stack contains only the symbol $ and the starting symbol S.
when the stack is emptied (i.e. only $ is left in the stack), the parsing is completed.

Parsing table
a two-dimensional array M[A,a], where A is a non-terminal and a is a terminal or $

LL(1) Parser Parser Actions

The symbol at the top of the stack (say X) and the current symbol in the input string (say a) determine the parser action.
There are four possible parser actions.
1. If X and a are both $: the parser halts (successful completion).
2. If X and a are the same terminal symbol (different from $): the parser pops X from the stack and moves to the next symbol in the input buffer.
3. If X is a non-terminal: the parser looks at the parsing table entry M[X,a]. If M[X,a] holds a production rule X → Y1Y2...Yk, it pops X from the stack and pushes Yk,Yk-1,...,Y1 onto the stack. The parser also outputs the production rule X → Y1Y2...Yk to represent a step of the derivation.
4. None of the above: error. This covers a terminal X that does not match a, and an empty entry M[X,a].
LL(1) Parser Example1

S → aBa
B → bB | ε

LL(1) parsing table:
        a           b           $
S       S → aBa
B       B → ε       B → bB

stack     input     output
$S        abba$     S → aBa
$aBa      abba$
$aB       bba$      B → bB
$aBb      bba$
$aB       ba$       B → bB
$aBb      ba$
$aB       a$        B → ε
$a        a$
$         $         accept, successful completion

We will see how to construct the parsing table very soon.

LL(1) Parser Example2

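E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
E is the start symbol.

LL(1) parsing table:
        id          +             *             (           )          $
E       E → TE'                                 E → TE'
E'                  E' → +TE'                               E' → ε     E' → ε
T       T → FT'                                 T → FT'
T'                  T' → ε        T' → *FT'                 T' → ε     T' → ε
F       F → id                                  F → (E)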

LL(1) Parser Example2

stack       input     output
$E          id+id$    E → TE'
$E'T        id+id$    T → FT'
$E'T'F      id+id$    F → id
$E'T'id     id+id$
$E'T'       +id$      T' → ε
$E'         +id$      E' → +TE'
$E'T+       +id$
$E'T        id$       T → FT'
$E'T'F      id$       F → id
$E'T'id     id$
$E'T'       $         T' → ε
$E'         $         E' → ε
$           $         accept
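
The table-driven loop behind this trace can be sketched in Python. This is a minimal illustration rather than a definitive implementation: the dictionary `TABLE` encodes the LL(1) parsing table of the expression grammar (an empty list stands for an ε right-hand side), and the names `TABLE`, `NONTERMINALS` and `ll1_parse` are assumptions introduced for the example.

```python
# LL(1) parsing table for E -> TE', E' -> +TE' | eps, T -> FT', T' -> *FT' | eps, F -> (E) | id
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the productions of the left-most derivation, or raise SyntaxError."""
    stack = ["$", start]
    tokens = list(tokens) + ["$"]
    pos = 0
    output = []
    while True:
        top, a = stack[-1], tokens[pos]
        if top == "$" and a == "$":
            return output                      # action 1: accept
        if top in NONTERMINALS:
            rhs = TABLE.get((top, a))
            if rhs is None:
                raise SyntaxError(f"no entry M[{top},{a}]")
            stack.pop()
            stack.extend(reversed(rhs))        # action 3: push Yk ... Y1
            output.append((top, rhs))
        elif top == a:
            stack.pop()                        # action 2: match a terminal
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, got {a!r}")


steps = ll1_parse(["id", "+", "id"])
for lhs, rhs in steps:
    print(lhs, "->", " ".join(rhs) or "ε")
```

The printed productions appear in the same order as the output column of the trace for id+id.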

LL(1) Parser Example3

Taking the input id+id*id, which is formed from the grammar of Example 2.

Constructing LL(1) Parsing Tables

Two functions are used in the construction of LL(1) parsing tables:
FIRST
FOLLOW

FIRST(α) is the set of terminal symbols which occur as first symbols in strings derived from α, where α is any string of grammar symbols.
If α derives ε, then ε is also in FIRST(α).

FOLLOW(A) is the set of terminals which occur immediately after the non-terminal A in the strings derived from the starting symbol.
A terminal a is in FOLLOW(A) if S ⇒* αAaβ.

Compute FIRST for a String X

1. If X is a terminal symbol, then FIRST(X) = {X}

2. If X is ε, then FIRST(X) = {ε}

3. If X is a non-terminal symbol and X → ε is a production rule, then add ε to FIRST(X).

4. If X is a non-terminal symbol and X → Y1Y2..Yn is a production rule, then if a terminal a is in FIRST(Yi) and ε is in all FIRST(Yj) for j=1,...,i-1, then a is in FIRST(X). If ε is in all FIRST(Yj) for j=1,...,n, then ε is in FIRST(X).

Compute FIRST for a String X

Example

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id

From Rule 1
FIRST(id) = {id}
From Rule 2
FIRST(ε) = {ε}
From Rules 3 and 4
FIRST(F) = {(, id}
FIRST(T') = {*, ε}
FIRST(T) = {(, id}
FIRST(E') = {+, ε}
FIRST(E) = {(, id}
Others
FIRST(TE') = {(, id}
FIRST(+TE') = {+}
FIRST(FT') = {(, id}
FIRST(*FT') = {*}
FIRST((E)) = {(}
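
The FIRST rules can be applied until nothing changes (a fixed-point iteration). The Python sketch below illustrates this under stated assumptions: the names `GRAMMAR`, `first_of_string` and `compute_first` are introduced for the example, a right-hand side is a list of symbols with [] standing for ε, and any symbol that is not a key of `GRAMMAR` is treated as a terminal.

```python
EPS = "ε"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a string Y1 Y2 ... Yn, given the FIRST sets found so far."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # terminal: FIRST(a) = {a}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result          # Yi cannot vanish, so later symbols are not visible
    result.add(EPS)                # every Yi can derive ε (or the string is empty)
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                 # repeat until no FIRST set grows
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
for nt, symbols in FIRST.items():
    print(f"FIRST({nt}) =", sorted(symbols))
```

`first_of_string` also gives the FIRST set of a whole right-hand side, for example FIRST(TE').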

Compute FOLLOW (for non-terminals)

1. $ is in FOLLOW(S), if S is the start symbol
2. Look at the occurrences of a non-terminal on the RHS of a production which are followed by something:
if A → αBβ is a production rule, then everything in FIRST(β) except ε is in FOLLOW(B)
3. Look at B on the RHS that is not followed by anything:
if (A → αB is a production rule) or (A → αBβ is a production rule and ε is in FIRST(β)), then everything in FOLLOW(A) is in FOLLOW(B).

Compute FOLLOW (for non-terminals)

Example

i. E → TE'
ii. E' → +TE' | ε
iii. T → FT'
iv. T' → *FT' | ε
v. F → (E) | id

FOLLOW(E) = { $, ) }, because
From Rule 1, FOLLOW(E) contains $
From Rule 2, FOLLOW(E) contains FIRST( ) ) = { ) }, from the production F → (E)

FOLLOW(E') = { $, ) }, by Rule 3

FOLLOW(T) = { +, ), $ }
From Rule 2, + is in FOLLOW(T)
From Rule 3, everything in FOLLOW(E) is in FOLLOW(T), since FIRST(E') contains ε

FOLLOW(T') = { +, ), $ }, by Rule 3

FOLLOW(F) = { +, *, ), $ }, by the same reasoning as for FOLLOW(T)
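
The three FOLLOW rules can be applied repeatedly until no set grows. The Python sketch below illustrates this; it is not taken from the slides. The FIRST sets of the non-terminals are supplied as given data, and the names `GRAMMAR`, `FIRST`, `first_of_string` and `compute_follow` are assumptions made for the example ([] stands for an ε right-hand side).

```python
EPS = "ε"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
# FIRST sets of the non-terminals, taken as given input here.
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}


def first_of_string(symbols):
    """FIRST of a string of grammar symbols; the empty string gives {ε}."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # a terminal is its own FIRST set
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                          # Rule 1
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                for i, b in enumerate(rhs):
                    if b not in grammar:
                        continue                    # only non-terminals have FOLLOW sets
                    beta_first = first_of_string(rhs[i + 1:])
                    new = beta_first - {EPS}        # Rule 2
                    if EPS in beta_first:
                        new |= follow[a]            # Rule 3
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, "E")
for nt, symbols in FOLLOW.items():
    print(f"FOLLOW({nt}) =", sorted(symbols))
```

Rule 3 covers both "B is last on the right-hand side" and "everything after B can derive ε", because FIRST of the empty string is {ε}.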

Constructing LL(1) Parsing Table -- Algorithm

For each production rule A → α of a grammar:

1. for each terminal a in FIRST(α), add A → α to M[A,a]
2. if ε is in FIRST(α), for each terminal a in FOLLOW(A), add A → α to M[A,a]
3. if ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A,$]

Constructing LL(1) Parsing Table -- Example

E → TE'
FIRST(TE') = {(, id}
E → TE' into M[E,(] and M[E,id]

E' → +TE'
FIRST(+TE') = {+}
E' → +TE' into M[E',+]

E' → ε
FIRST(ε) = {ε}
none, but since ε is in FIRST(ε) and FOLLOW(E') = {$, )}: E' → ε into M[E',$] and M[E',)]

T → FT'
FIRST(FT') = {(, id}
T → FT' into M[T,(] and M[T,id]

T' → *FT'
FIRST(*FT') = {*}
T' → *FT' into M[T',*]

T' → ε
FIRST(ε) = {ε}
none, but since ε is in FIRST(ε) and FOLLOW(T') = {$, ), +}: T' → ε into M[T',$], M[T',)] and M[T',+]

F → (E)
FIRST((E)) = {(}
F → (E) into M[F,(]

F → id
FIRST(id) = {id}
F → id into M[F,id]
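
The table-filling rules can be sketched in Python. In this illustration the FIRST sets of the right-hand sides and the FOLLOW sets are supplied as given data rather than computed; the names `PRODUCTIONS`, `FIRST_OF_RHS`, `FOLLOW` and `build_ll1_table` are assumptions made for the example, and the empty tuple stands for an ε right-hand side.

```python
EPS = "ε"
PRODUCTIONS = [
    ("E", ("T", "E'")),
    ("E'", ("+", "T", "E'")), ("E'", ()),
    ("T", ("F", "T'")),
    ("T'", ("*", "F", "T'")), ("T'", ()),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
FIRST_OF_RHS = {
    ("T", "E'"): {"(", "id"}, ("+", "T", "E'"): {"+"}, (): {EPS},
    ("F", "T'"): {"(", "id"}, ("*", "F", "T'"): {"*"},
    ("(", "E", ")"): {"("}, ("id",): {"id"},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}


def build_ll1_table(productions, first_of_rhs, follow):
    """Map (non-terminal, lookahead) to the list of applicable right-hand sides."""
    table = {}
    for lhs, rhs in productions:
        lookaheads = first_of_rhs[rhs] - {EPS}        # step 1
        if EPS in first_of_rhs[rhs]:
            lookaheads |= follow[lhs]                 # steps 2 and 3 ($ is in FOLLOW)
        for a in lookaheads:
            table.setdefault((lhs, a), []).append(rhs)
    return table


M = build_ll1_table(PRODUCTIONS, FIRST_OF_RHS, FOLLOW)
is_ll1 = all(len(entries) == 1 for entries in M.values())
print(len(M), "entries, LL(1):", is_ll1)
```

Each table entry is kept as a list so that a multiply defined entry, which would show that the grammar is not LL(1), is visible instead of being silently overwritten.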

LL(1) Grammars
A grammar whose parsing table has no multiply defined entries is said to be an LL(1) grammar.
The first L refers to the input being scanned from left to right, the second L refers to a left-most derivation, and 1 refers to one input symbol being used as a lookahead symbol to determine the parser action.
A grammar G is LL(1) if and only if the following conditions hold for any two distinct production rules A → α and A → β:
1. α and β cannot both derive strings starting with the same terminal.
2. At most one of α and β can derive ε.
3. If β can derive ε, then α cannot derive any string starting with a terminal in FOLLOW(A).

From 1 and 2, FIRST(α) ∩ FIRST(β) = ∅.
From 3, if ε is in FIRST(β), then FIRST(α) ∩ FOLLOW(A) = ∅, and likewise with α and β swapped.

A Grammar which is not LL(1)

An entry of the parsing table may contain more than one production rule.
In this case, we say that the grammar is not an LL(1) grammar.
S → iCtSE | a
E → eS | ε
C → b

FIRST(iCtSE) = {i}
FIRST(a) = {a}
FIRST(eS) = {e}
FIRST(ε) = {ε}
FIRST(b) = {b}
FOLLOW(S) = { $, e }
FOLLOW(E) = { $, e }
FOLLOW(C) = { t }

LL(1) parsing table:
        a         b         e                  i              t       $
S       S → a                                  S → iCtSE
E                           E → eS, E → ε                             E → ε
C                 C → b

There are two production rules for M[E,e].
Problem: ambiguity

A Grammar which is not LL(1)

What do we have to do if the resulting parsing table contains multiply defined entries?

Eliminate left recursion in the grammar, if it has not been eliminated. A left-recursive grammar cannot be an LL(1) grammar:
A → Aα | β
any terminal that appears in FIRST(β) also appears in FIRST(Aα) because Aα ⇒ βα.
If β is ε, any terminal that appears in FIRST(α) also appears in FIRST(Aα) and FOLLOW(A).
Left factor the grammar, if it is not left factored. A grammar that is not left factored cannot be an LL(1) grammar:
A → αβ1 | αβ2
any terminal that appears in FIRST(αβ1) also appears in FIRST(αβ2).
If the new grammar's parsing table still contains multiply defined entries, that grammar is ambiguous or it is inherently not an LL(1) grammar.
An ambiguous grammar cannot be an LL(1) grammar.

Error Recovery in Predictive Parsing

An error may occur in predictive parsing (LL(1) parsing):
if the terminal symbol on the top of the stack does not match the current input symbol.
if the top of the stack is a non-terminal A, the current input symbol is a, and the parsing table entry M[A,a] is empty.
What should the parser do in an error case?
The parser should be able to give an error message (as meaningful as possible).
It should recover from that error case, and it

Contents (Session-3)
Bottom Up Parsing
Handle Pruning
Implementation of A Shift-Reduce Parser
LR Parsers
LR Parsing Algorithm
Actions of A LR-Parser
Constructing SLR Parsing Tables
SLR(1) Grammar
Error Recovery in LR Parsing

Bottom-Up Parsing
A bottom-up parser creates the parse tree of the given input starting from the leaves and working towards the root.
A bottom-up parser tries to find the RMD of the given input in reverse order.
Bottom-up parsing is also known as shift-reduce parsing because its two main actions are shift and reduce.
Shift: at each shift action, the current symbol in the input string is pushed onto a stack.
Reduce: at each reduction step, the symbols at the top of the stack (matching the right side of a production) are replaced by the non-terminal on the left side of that production.
Accept: successful completion of parsing.
Error: the parser discovers a syntax error and calls an error recovery routine.

Bottom-Up Parsing
A shift-reduce parser tries to reduce the given input string to the starting symbol.

a string → (reduced to) → the starting symbol

At each reduction step, a substring of the input matching the right side of a production rule is replaced by the non-terminal on the left side of that production rule.
If the substring is chosen correctly, the right-most derivation of that string is created in reverse order.
Rightmost derivation: S ⇒rm ... ⇒rm ω
Shift-reduce parser finds: ω ⇐rm ... ⇐rm S

Shift-Reduce Parsing -- Example

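S → aABb
A → aA | a
B → bB | b

input string: aaabb

Reductions: aaabb, aaAbb, aAbb, aABb, S (each string is obtained from the previous one by one reduction)
Rightmost derivation: S ⇒rm aABb ⇒rm aAbb ⇒rm aaAbb ⇒rm aaabb (right-sentential forms)
How do we know which substring is to be replaced at each reduction step?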

Handle
Informally, a handle of a string is a substring that matches the right side of a production rule.
But not every substring that matches the right side of a production rule is a handle.

A handle of a right-sentential form γ (= αβω) is a production rule A → β and a position in γ where the string β may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ:

S ⇒*rm αAω ⇒rm αβω

If the grammar is unambiguous, then every right-sentential form of the grammar has exactly one handle.

Handle Pruning
A right-most derivation in reverse can be obtained by handle pruning.
S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm ... ⇒rm γn-1 ⇒rm γn = ω (the input string)
Start from γn, find a handle An → βn in γn, and replace βn by An to get γn-1.
Then find a handle An-1 → βn-1 in γn-1, and replace βn-1 by An-1 to get γn-2.
Repeat this until we reach S.

A Shift-Reduce Parser - example

E → E+T | T
T → T*F | F
F → (E) | id

Right-most derivation of id+id*id:
E ⇒ E+T ⇒ E+T*F ⇒ E+T*id ⇒ E+F*id ⇒ E+id*id ⇒ T+id*id ⇒ F+id*id ⇒ id+id*id

Right-most sentential form    Handle              Reducing production
id+id*id                      id (first)          F → id
F+id*id                       F                   T → F
T+id*id                       T                   E → T
E+id*id                       id (after +)        F → id
E+F*id                        F                   T → F
E+T*id                        id                  F → id
E+T*F                         T*F                 T → T*F
E+T                           E+T                 E → E+T
E

The handle of each right-sentential form is listed in the second column.

A Stack Implementation of A Shift-Reduce Parser

Stack       Input        Action
$           id+id*id$    shift
$id         +id*id$      reduce by F → id
$F          +id*id$      reduce by T → F
$T          +id*id$      reduce by E → T
$E          +id*id$      shift
$E+         id*id$       shift
$E+id       *id$         reduce by F → id
$E+F        *id$         reduce by T → F
$E+T        *id$         shift
$E+T*       id$          shift
$E+T*id     $            reduce by F → id
$E+T*F      $            reduce by T → T*F
$E+T        $            reduce by E → E+T
$E          $            accept

The initial stack contains only the end-marker $, and the end of the input string is marked by the end-marker $.

[Figure: the parse tree built by these reductions]

Shift-Reduce Parsers
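The most prevalent type of bottom-up parser today is based on a concept called LR(k) parsing:
L: the input is scanned left to right
R: a right-most derivation is constructed (in reverse)
k: the number of lookahead symbols (when k is omitted, it is 1)

LR parsers cover a wide range of grammars:
Simple LR parser (SLR)
Look Ahead LR (LALR)
most general LR parser (LR)

[Figure: nested classes of grammars, with SLR inside LALR inside LR inside CFG]

SLR, LR and LALR parsers work the same way; only their parsing tables are different.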

LR Parsers
LR parsing is attractive because:
LR parsers can be constructed to recognize virtually all programming-language constructs for which context-free grammars can be written.
LR parsing is the most general non-backtracking shift-reduce parsing method, yet it is still efficient.
The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers:
LL(1) grammars ⊂ LR(1) grammars
An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.
The drawback of the LR method is that it is too much work to construct an LR parser by hand.
Use tools, e.g. yacc.

LR Parsing Algorithm
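[Figure: model of an LR parser. The input buffer holds a1 ... ai ... an $. The stack holds S0 X1 S1 ... Xm-1 Sm-1 Xm Sm, with Sm on top. The LR parsing algorithm reads both, consults the action table (rows are states, columns are terminals and $, each entry is one of four different actions) and the goto table (rows are states, columns are non-terminals, each entry is a state number), and produces the output.]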

A Configuration of LR Parsing Algorithm
A configuration of an LR parser is:

( S0 S1 ... Sm, ai ai+1 ... an $ )

The first component is the stack; the second is the rest of the input.
Sm and ai decide the parser action by consulting the parsing action table. (Initially the stack contains just S0.)
A configuration of an LR parser represents the right-sentential form:
X1 ... Xm ai ai+1 ... an $
Xi is the grammar symbol represented by state Si.

Actions of A LR-Parser
1. If ACTION[Sm, ai] = shift s, the parser executes a shift move; it shifts the next state s onto the stack, entering the configuration
( S0 S1 ... Sm, ai ai+1 ... an $ ) → ( S0 S1 ... Sm s, ai+1 ... an $ )

2. If ACTION[Sm, ai] = reduce A → β, then the parser executes a reduce move, changing the configuration from
( S0 S1 ... Sm, ai ai+1 ... an $ ) to ( S0 S1 ... Sm-r s, ai ... an $ )
where r is the length of β, and s = GOTO[Sm-r, A]. The output is the reducing production A → β.
Here the parser first pops r state symbols off the stack, exposing state Sm-r, and then pushes s.

3. If ACTION[Sm, ai] = accept, parsing is successfully completed.

4. If ACTION[Sm, ai] = error, the parser has detected an error (an empty entry in the action table).

LR-parsing algorithm

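(SLR) Parsing Tables for Expression Grammar

Expression grammar:
1) E → E+T
2) E → T
3) T → T*F
4) T → F
5) F → (E)
6) F → id

[Table: action table and goto table for the expression grammar, one row per state; the action table has one column per terminal and $, the goto table one column per non-terminal]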

Actions of A (S)LR-Parser -Example

For id*id+id:

stack               input        action               output
0                   id*id+id$    shift 5
0id5                *id+id$      reduce by F → id     F → id
0F3                 *id+id$      reduce by T → F      T → F
0T2                 *id+id$      shift 7
0T2*7               id+id$       shift 5
0T2*7id5            +id$         reduce by F → id     F → id
0T2*7F10 (*)        +id$         reduce by T → T*F    T → T*F
0T2                 +id$         reduce by E → T      E → T
0E1                 +id$         shift 6
0E1+6               id$          shift 5
0E1+6id5            $            reduce by F → id     F → id
0E1+6F3             $            reduce by T → F      T → F
0E1+6T9 (**)        $            reduce by E → E+T    E → E+T
0E1                 $            accept

After a reduction, the new state comes from the goto table: F3 because goto(0, F) = 3, T2 because goto(0, T) = 2, F10 because goto(7, F) = 10.
(*) T2*7F10 is reduced by T → T*F; the new top is T2 because goto(0, T) = 2.
(**) E1+6T9 is reduced by E → E+T.
Conflicts During Shift-Reduce Parsing

There are context-free grammars for which shift-reduce parsers cannot be used.
The stack contents and the next input symbol may not be enough to decide the action:
shift/reduce conflict: the parser cannot decide whether to make a shift operation or a reduction.
reduce/reduce conflict: the parser cannot decide which of several reductions to make.
If a shift-reduce parser cannot be used for a grammar, that grammar is called a non-LR(k) grammar.
An ambiguous grammar can never be an LR grammar.

Constructing SLR Parsing Tables

LR(0) Item
An LR parser makes shift-reduce decisions by maintaining states to keep track of where we are in a parse.
An LR(0) item of a grammar G is a production of G with a dot at some position of the right side.

Ex: A → aBb
Possible LR(0) items (four different possibilities):
A → .aBb
A → a.Bb
A → aB.b
A → aBb.

Sets of LR(0) items will be the states of the action and goto tables of the SLR parser, i.e. states represent sets of "items".
A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for constructing SLR parsers.

Constructing SLR Parsing Tables

To construct the canonical LR(0) collection for a grammar, we define an augmented grammar and two functions, CLOSURE and GOTO.
Augmented grammar:
G' is G with a new production rule S' → S, where S' is the new starting symbol.
Purpose: to provide a single production that, when reduced, signals the end of parsing.
If I is a set of LR(0) items for a grammar G, then closure(I) is the set of LR(0) items constructed from I by the two rules:
1. Initially, every LR(0) item in I is added to closure(I).
2. If A → α.Bβ is in closure(I), then for all production rules B → γ in G, add B → .γ to closure(I).
We apply this rule until no more new LR(0) items can be added to closure(I).

Closure (I) . Example

Given the grammar (augmented with E' → E)
E → E+T | T
T → T*F | F
F → (E) | id
Then,
closure({T → T.*F}) = {T → T.*F}
closure({T → T.*F, T → T*.F}) = {T → T.*F, T → T*.F, F → .(E), F → .id}
closure({F → (.E)}) = {F → (.E), E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id}
closure({E' → .E}) = {E' → .E, E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id}
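
The closure computation can be sketched in Python. In this illustration an LR(0) item is represented as a tuple (left-hand side, right-hand side, dot position); that representation and the names `GRAMMAR` and `closure` are assumptions made for the example.

```python
# Augmented expression grammar; right-hand sides are tuples of symbols.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Close a set of LR(0) items (lhs, rhs, dot) under the two closure rules."""
    result = set(items)                 # rule 1: every item of I is in closure(I)
    pending = list(items)
    while pending:
        lhs, rhs, dot = pending.pop()
        # Rule 2: a non-terminal B right after the dot brings in every B -> .gamma
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], production, 0)
                if item not in result:
                    result.add(item)
                    pending.append(item)
    return frozenset(result)


i_paren = closure({("F", ("(", "E", ")"), 1)})   # closure({F -> (.E)})
for lhs, rhs, dot in sorted(i_paren):
    print(lhs, "->", " ".join(rhs[:dot]) + " . " + " ".join(rhs[dot:]))
```

Returning a frozenset lets a closed item set be compared with other sets and stored in collections, which is convenient when item sets serve as parser states.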

Goto Operation
If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as follows:
If A → α.Xβ is in I, then every item in closure({A → αX.β}) will be in goto(I,X).

Example:

I = {E' → .E, E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id}

goto(I,E) = closure({E' → E., E → E.+T}) = {E' → E., E → E.+T}
goto(I,T) = closure({E → T., T → T.*F}) = {E → T., T → T.*F}
goto(I,F) = closure({T → F.}) = {T → F.}
goto(I,() = closure({F → (.E)}) = {F → (.E), E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id}
goto(I,id) = closure({F → id.}) = {F → id.}
goto({E' → E., E → E.+T}, +) = closure({E → E+.T}) = {E → E+.T, T → .T*F, T → .F, F → .(E), F → .id}

Construction of The Canonical LR(0)

Collection
To create the SLR parsing tables for a grammar G, we create the canonical LR(0) collection of the augmented grammar G'.
Algorithm:
void items(G') {
    C = { closure({S' → .S}) }
    repeat
        for (each set of items I in C)
            for (each grammar symbol X)
                if (goto(I,X) is not empty and not in C)
                    add goto(I,X) to C
    until no new sets of items are added to C
}
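
The algorithm can be sketched end to end in Python for the augmented expression grammar. This is an illustration under stated assumptions: items are tuples (left-hand side, right-hand side, dot position), and the names `GRAMMAR`, `closure`, `goto` and `canonical_collection` are introduced for the example. The order in which states are discovered depends on the order in which symbols are tried, so the state numbers need not match a hand-built collection.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
# Every grammar symbol that occurs on some right-hand side.
SYMBOLS = sorted({sym for alts in GRAMMAR.values() for rhs in alts for sym in rhs})


def closure(items):
    result, pending = set(items), list(items)
    while pending:
        lhs, rhs, dot = pending.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], production, 0)
                if item not in result:
                    result.add(item)
                    pending.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(lhs, rhs, dot + 1) for lhs, rhs, dot in items
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved)


def canonical_collection():
    states = [closure({("E'", ("E",), 0)})]
    index = 0
    while index < len(states):          # newly added states are processed too
        for symbol in SYMBOLS:
            target = goto(states[index], symbol)
            if target and target not in states:
                states.append(target)
        index += 1
    return states


states = canonical_collection()
print(len(states), "sets of LR(0) items")
```

The loop stops once every state has been examined without producing a new one, which corresponds to the "until no new sets of items are added" condition.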

The Canonical LR(0) Collection - Example

closure({E' → . E}) = {E' → . E, E → . E + T, E → . T, T → . T * F, T → . F, F → . ( E ), F → . id}.
This gives us the items for the first state (state 0, I0) of our DFA.

Now we need to compute the Goto function for all of the relevant symbols in the set.
In this case, we care about the symbols E, T, F, (, and id, since those are the symbols that have a . in front of them in some item of I0.

For symbol E, Goto(I0, E) = closure({E' → E ., E → E . + T}) = {E' → E ., E → E . + T} = I1
For symbol T, Goto(I0, T) = closure({E → T ., T → T . * F}) = {E → T ., T → T . * F} = I2
For symbol F, Goto(I0, F) = closure({T → F .}) = {T → F .} = I3
For symbol (, Goto(I0, () = closure({F → ( . E )}) = {F → ( . E ), E → . E + T, E → . T, T → . T * F, T → . F, F → . ( E ), F → . id} = I4
For symbol id, Goto(I0, id) = closure({F → id .}) = {F → id .} = I5

Repeat this step for the newly created states (I1, I2, I3, I4, I5) until no new states are produced.

For symbol +, Goto(I1, +) = closure({E → E + . T}) = {E → E + . T, T → . T * F, T → . F, F → . ( E ), F → . id} = I6
For symbol *, Goto(I2, *) = closure({T → T * . F}) = {T → T * . F, F → . ( E ), F → . id} = I7
For symbol E, Goto(I4, E) = closure({F → ( E . ), E → E . + T}) = {F → ( E . ), E → E . + T} = I8

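Summary of states obtained, and the state each item in a state goes to
[Figure: summary of the item sets and their transitions]

Transition Diagram (DFA) of Goto Function
[Figure: LR(0) automaton for the example. Surviving annotation: in closure({E' → . E}), the item E' → . E comes from closure Rule 1, and E → . E + T, E → . T, T → . T * F come from Rule 2.]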

Constructing SLR Parsing Table - Example

Before we start constructing the SLR action/goto tables, we need to compute the Follow sets for all of the non-terminals in the grammar, and we need to number the productions.

0: E' → E
1: E → E + T
2: E → T
3: T → T * F
4: T → F
5: F → ( E )
6: F → id

Follow(E') = {$}
Follow(E) = {$, ), +}
Follow(T) = {$, ), +, *}
Follow(F) = {$, ), +, *}

Each terminal is a column of the action table, each non-terminal is a column of the goto table, and each state is a row of both tables.
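
The Follow sets can be cross-checked with a short fixed-point computation. The sketch below assumes the grammar has no empty productions (true for this expression grammar), which keeps the First computation to a single case; the dictionary layout and function names are assumptions for the illustration.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def first_sets():
    """First sets, assuming no production derives the empty string."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                leading = body[0]
                new = first[leading] if leading in GRAMMAR else {leading}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def follow_sets(start="E'"):
    first = first_sets()
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add("$")              # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, symbol in enumerate(body):
                    if symbol not in GRAMMAR:
                        continue        # Follow is defined for non-terminals only
                    if i + 1 < len(body):
                        nxt = body[i + 1]
                        new = first[nxt] if nxt in GRAMMAR else {nxt}
                    else:
                        new = follow[head]   # symbol ends the body
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow
```

Note that * ends up in Follow(T) because of the production T → T * F, where * appears directly after T.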

Constructing SLR Parsing Table

1. Construct the canonical collection of sets of LR(0) items for G'.
C ← {I0,...,In}
2. Create the parsing action table as follows:
2.1. If a is a terminal, A → α . a β is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j.
2.2. If A → α . is in Ii, then action[i,a] is reduce A → α for all a in FOLLOW(A), where A ≠ S'.
2.3. If S' → S . is in Ii, then action[i,$] is accept.
2.4. If any conflicting actions are generated by these rules, the grammar is not SLR(1).
3. Create the parsing goto table:
for all non-terminals A, if goto(Ii,A) = Ij then goto[i,A] = j.

4. All entries not defined by (2) and (3) are errors.

5. The initial state of the parser is the one containing S' → . S.
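
Steps 1 to 3 can be sketched together in Python for the expression grammar. Everything named here (PRODUCTIONS, SYMBOLS, FOLLOW, set_action, build_slr_table) is an assumption for the illustration; the Follow sets are hard-coded rather than computed, and items are encoded as (production number, dot position) pairs.

```python
PRODUCTIONS = [
    ("E'", ("E",)),             # 0
    ("E", ("E", "+", "T")),     # 1
    ("E", ("T",)),              # 2
    ("T", ("T", "*", "F")),     # 3
    ("T", ("F",)),              # 4
    ("F", ("(", "E", ")")),     # 5
    ("F", ("id",)),             # 6
]
NONTERMINALS = {"E", "T", "F"}
SYMBOLS = ["E", "T", "F", "(", "id", "+", "*", ")"]
FOLLOW = {"E": {"$", ")", "+"}, "T": {"$", ")", "+", "*"}, "F": {"$", ")", "+", "*"}}


def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        p, dot = worklist.pop()
        body = PRODUCTIONS[p][1]
        if dot < len(body):
            for q, (head, _) in enumerate(PRODUCTIONS):
                if head == body[dot] and (q, 0) not in result:
                    result.add((q, 0))
                    worklist.append((q, 0))
    return frozenset(result)


def goto(items, symbol):
    moved = {(p, d + 1) for p, d in items
             if d < len(PRODUCTIONS[p][1]) and PRODUCTIONS[p][1][d] == symbol}
    return closure(moved) if moved else frozenset()


def set_action(action, state, terminal, value):
    # Rule 2.4: two different actions in one cell mean the grammar is not SLR(1).
    if action.get((state, terminal), value) != value:
        raise ValueError(f"conflict in state {state} on {terminal!r}")
    action[(state, terminal)] = value


def build_slr_table():
    states = [closure({(0, 0)})]        # initial state contains S' -> . S
    action, goto_table = {}, {}
    for i, state in enumerate(states):  # the list grows as states are found
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if not target:
                continue
            if target not in states:
                states.append(target)
            j = states.index(target)
            if symbol in NONTERMINALS:
                goto_table[(i, symbol)] = j                   # step 3
            else:
                set_action(action, i, symbol, ("shift", j))   # rule 2.1
        for p, dot in state:
            head, body = PRODUCTIONS[p]
            if dot == len(body):
                if p == 0:
                    set_action(action, i, "$", ("accept",))   # rule 2.3
                else:
                    for a in FOLLOW[head]:
                        set_action(action, i, a, ("reduce", p))  # rule 2.2
    return action, goto_table
```

Entries missing from the two dictionaries correspond to the error entries of step 4. For this grammar no conflict is raised, so the sketch agrees with the claim that the expression grammar is SLR(1).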

Constructing SLR Parsing Table: Example
From Rule 2.1.

Take F → . ( E ) from I0, Goto(I0, ( ) = I4, then action[0, (] = shift 4
Take E → E . + T from I1, Goto(I1, +) = I6, then action[1, +] = shift 6
Take T → T . * F from I2, Goto(I2, *) = I7, then action[2, *] = shift 7
Other shifts can be populated in the same way.

From Rule 2.2.

Take E → T . from I2, Follow(E) = {$, ), +}
action[2, $] = reduce 2 (2: E → T)
action[2, )] = reduce 2
action[2, +] = reduce 2
Other reduces can be done in the same way.

From Rule 2.3.

E' → E . is in I1, so action[1, $] = accept

Constructing SLR Parsing Table: Example
From 3 - Creating the parsing goto table

Take E in I0, goto(I0,E) = I1, then goto[0, E] = 1
Take T in I0, goto(I0,T) = I2, then goto[0, T] = 2
Take F in I0, goto(I0,F) = I3, then goto[0, F] = 3
Take E in I4, goto(I4,E) = I8, then goto[4, E] = 8
Take T in I4, goto(I4,T) = I2, then goto[4, T] = 2
Take F in I4, goto(I4,F) = I3, then goto[4, F] = 3
Take T in I6, goto(I6,T) = I9, then goto[6, T] = 9
Take F in I6, goto(I6,F) = I3, then goto[6, F] = 3
Take F in I7, goto(I7,F) = I10, then goto[7, F] = 10

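Parsing Tables of Expression Grammar

        Action Table                          Goto Table
state   id    +     *     (     )     $       E    T    F
0       s5                s4                  1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                  8    2    3
5             r6    r6          r6    r6
6       s5                s4                       9    3
7       s5                s4                            10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5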
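
To see how the tables drive a parse, here is a minimal sketch of the LR parsing loop in Python. The ACTION and GOTO dictionaries hard-code the SLR entries for the expression grammar, with states numbered as in the goto entries listed earlier (goto[0,E]=1, goto[4,E]=8, goto[7,F]=10, and so on); the dictionary layout and the parse function are assumptions for the illustration.

```python
# Production number -> (head, length of body), numbered as in this chapter.
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}


def parse(tokens):
    """Return the production numbers in reduction order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = [0]                      # stack of states; state 0 is the start
    position = 0
    reductions = []
    while True:
        entry = ACTION[stack[-1]].get(tokens[position])
        if entry is None:            # empty action entry: syntax error
            raise SyntaxError(f"unexpected {tokens[position]!r} at token {position}")
        if entry == "acc":
            return reductions
        number = int(entry[1:])
        if entry[0] == "s":          # shift: push the state, consume the token
            stack.append(number)
            position += 1
        else:                        # reduce: pop |body| states, then goto
            head, length = PRODUCTIONS[number]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][head])
            reductions.append(number)
```

Calling parse(["id", "*", "id", "+", "id"]) returns [6, 4, 6, 3, 2, 6, 4, 1], which is a rightmost derivation of the input in reverse. The stack here holds only states; grammar symbols are implied by the states and do not need to be stored.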

Exercise
Construct the SLR parse table for the augmented grammar and show how the parser accepts the input string () ()
1: S' → S
2: S → ( S ) S
3: S → λ
Answer

SLR(1) Grammar
An LR parser using SLR(1) parsing tables for a grammar G is called the SLR(1) parser for G.

If a grammar G has an SLR(1) parsing table, it is called an SLR(1) grammar (or SLR grammar for short).

Every SLR grammar is unambiguous, but not every unambiguous grammar is an SLR grammar.

If the SLR parsing table of a grammar G has a conflict, we say that the grammar is not an SLR grammar.
The conflict can be a shift/reduce conflict or a reduce/reduce conflict.

Error Recovery in LR Parsing

An LR parser detects an error when it consults the parsing action table and finds an error entry. All empty entries in the action table are error entries. Typical errors are:
missing operand
unbalanced right parenthesis

Errors are never detected by consulting the goto table.
Some error recovery techniques are:
Discard zero or more input symbols until a symbol a is found on which parsing can resume.
Mark each empty entry in the action table with a specific error routine.

Assignment 3
Given the following grammar, where a, b, and c are terminals and S, X, Y are non-terminals:

S → XaYb | Y | λ
X → aY | c
Y → bX | a

Build the LL(1) parsing table for the grammar (show all the necessary steps).
What can you say about the ambiguity of the grammar?
Show how the parser accepts/rejects the input cabbab.

Build the Simple LR parsing table for the grammar (show all the necessary steps).
Are there any conflicts during shift-reduce parsing?
Show how the parser accepts/rejects the string cabbab.
