Unit Iii

Uploaded by

Thenmozhi Elumalai

UNIT III

SYNTAX ANALYSIS

SYNTAX ANALYSIS - Introduction

Syntax analysis is the second phase of the compiler. It takes the stream of tokens produced by
the lexical analyzer as input and generates a syntax tree or parse tree.

Advantages of grammar for syntactic specification :

1. A grammar gives a precise and easy-to-understand syntactic specification of a
programming language.
2. An efficient parser can be constructed automatically from a properly designed grammar.
3. A grammar imparts a structure to a source program that is useful for its translation into
object code and for the detection of errors.
4. New constructs can be added to a language more easily when there is a grammatical
description of the language.

THE ROLE OF PARSER

The parser or syntactic analyzer obtains a string of tokens from the lexical analyzer and
verifies that the string can be generated by the grammar for the source language. It reports any
syntax errors in the program. It also recovers from commonly occurring errors so that it can
continue processing its input.

Position of parser in compiler model

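[Figure: source program → lexical analyzer → (token) → parser → (parse tree) → rest of front end
→ intermediate representation. The parser requests each token from the lexical analyzer with
"get next token". The symbol table is drawn below these phases.]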

Functions of the parser :

1. It verifies the structure generated by the tokens based on the grammar.

2. It constructs the parse tree.
3. It reports the errors.
4. It performs error recovery.

Issues :

Parser cannot detect errors such as:

1. Variable re-declaration
2. Use of a variable before it is initialized.
3. Data type mismatch for an operation.

The above issues are handled by Semantic Analysis phase.

Syntax error handling :

Programs can contain errors at many different levels. For example :

1. Lexical, such as misspelling a keyword.
2. Syntactic, such as an arithmetic expression with unbalanced parentheses.
3. Semantic, such as an operator applied to an incompatible operand.
4. Logical, such as an infinitely recursive call.

Functions of error handler :

1. It should report the presence of errors clearly and accurately.

2. It should recover from each error quickly enough to be able to detect subsequent errors.
3. It should not significantly slow down the processing of correct programs.

Error recovery strategies :

The different strategies that a parser uses to recover from a syntactic error are:

1. Panic mode
2. Phrase level
3. Error productions
4. Global correction

Panic mode recovery:

On discovering an error, the parser discards input symbols one at a time until a
synchronizing token is found. The synchronizing tokens are usually delimiters, such as
semicolon or end. It has the advantage of simplicity and does not go into an infinite loop. When
multiple errors in the same statement are rare, this method is quite useful.
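
The skipping step of panic mode recovery can be sketched as follows. This is a minimal illustration, not part of the original notes: the token list, the synchronizing set and the function name `panic_mode_skip` are all assumptions chosen for the example.

```python
# Assumed synchronizing tokens (delimiters such as semicolon or end).
SYNC_TOKENS = {";", "end"}


def panic_mode_skip(tokens, pos, sync=SYNC_TOKENS):
    """Discard tokens one at a time, starting at index pos, until a
    synchronizing token is found. Return the index just past that token
    (or len(tokens) if the input ends first), where parsing can resume."""
    while pos < len(tokens) and tokens[pos] not in sync:
        pos += 1
    return min(pos + 1, len(tokens))


# Hypothetical input: an error is detected at index 2 (the stray '+').
tokens = ["x", "=", "+", "*", ";", "y", "=", "1", ";"]
resume_at = panic_mode_skip(tokens, 2)
```

Because the position only moves forward and stops at the end of the input, the loop always terminates, which matches the claim that panic mode cannot go into an infinite loop.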

Phrase level recovery:

On discovering an error, the parser performs local correction on the remaining input that
allows it to continue. Example: Insert a missing semicolon or delete an extraneous semicolon etc.

Error productions:

The parser is constructed using a grammar augmented with error productions. If an error
production is used by the parser, appropriate error diagnostics can be generated to indicate the
erroneous construct recognized in the input.

Global correction:
Given an incorrect input string x and grammar G, certain algorithms can be used to find a
parse tree for a related string y, such that the number of insertions, deletions and changes of
tokens required to transform x into y is as small as possible. However, these methods are in
general too costly in terms of time and space.

CONTEXT-FREE GRAMMARS

A Context-Free Grammar is a quadruple that consists of terminals, non-terminals, start

symbol and productions.

Terminals : These are the basic symbols from which strings are formed.

Non-Terminals : These are the syntactic variables that denote a set of strings. These help to
define the language generated by the grammar.

Start Symbol : One non-terminal in the grammar is denoted as the “Start-symbol” and the set of
strings it denotes is the language defined by the grammar.

Productions : It specifies the manner in which terminals and non-terminals can be combined to
form strings. Each production consists of a non-terminal, followed by an arrow, followed by a
string of non-terminals and terminals.

Example of context-free grammar: The following grammar defines simple arithmetic

expressions:

expr → expr op expr

expr → (expr)
expr → - expr
expr → id
op → +
op → -
op → *
op → /
op → ↑

In this grammar,

• id + - * / ↑ ( ) are terminals.
• expr , op are non-terminals.
• expr is the start symbol.
• Each line is a production.

Derivations:

Two basic requirements for a grammar are :

1. To generate a valid string.
2. To recognize a valid string.

Derivation is a process that generates a valid string with the help of the grammar by repeatedly
replacing a non-terminal (the left side of a production) with the string on the right side of that
production.

Example : Consider the following grammar for arithmetic expressions :

E → E+E | E*E | ( E ) | - E | id

To generate the valid string - ( id+id ) from the grammar, the steps are

1. E ⇒ - E
2.   ⇒ - ( E )
3.   ⇒ - ( E+E )
4.   ⇒ - ( id+E )
5.   ⇒ - ( id+id )

In the above derivation,

• E is the start symbol.
• - (id+id) is the required sentence (only terminals).
• Strings such as E, -E, -(E), . . . are called sentential forms.

Types of derivations:

The two types of derivation are:

1. Leftmost derivation
2. Rightmost derivation

• In leftmost derivations, the leftmost non-terminal in each sentential form is always chosen
first for replacement.

• In rightmost derivations, the rightmost non-terminal in each sentential form is always chosen
first for replacement.

Example:

Given grammar G : E → E+E | E*E | ( E ) | - E | id

Sentence to be derived : – (id+id)

LEFTMOST DERIVATION        RIGHTMOST DERIVATION

E ⇒ - E                    E ⇒ - E
  ⇒ - ( E )                  ⇒ - ( E )
  ⇒ - ( E+E )                ⇒ - ( E+E )
  ⇒ - ( id+E )               ⇒ - ( E+id )
  ⇒ - ( id+id )              ⇒ - ( id+id )

• Strings that appear in a leftmost derivation are called left sentential forms.
• Strings that appear in a rightmost derivation are called right sentential forms.

Sentential forms:

Given a grammar G with start symbol S, if S ⇒* α (α can be derived from S in zero or more
steps), where α may contain non-terminals or terminals, then α is called a sentential form of G.

Yield or frontier of tree:

Each interior node of a parse tree is a non-terminal, and its children can be terminals or
non-terminals. The leaves of the tree, read from left to right, form a sentential form that is
called the yield or frontier of the tree.

Ambiguity:

A grammar that produces more than one parse for some sentence is said to be ambiguous
grammar.

Example : Given grammar G : E → E+E | E*E | ( E ) | - E | id

The sentence id+id*id has the following two distinct leftmost derivations:

E ⇒ E + E                  E ⇒ E * E
  ⇒ id + E                   ⇒ E + E * E
  ⇒ id + E * E               ⇒ id + E * E
  ⇒ id + id * E              ⇒ id + id * E
  ⇒ id + id * id             ⇒ id + id * id

The two corresponding parse trees are :

        E                          E
      / | \                      / | \
     E  +  E                    E  *  E
     |   / | \                / | \   |
    id  E  *  E              E  +  E  id
        |     |              |     |
       id    id             id    id

WRITING A GRAMMAR

There are four topics to consider when writing a grammar :

1. Regular expressions vs. context-free grammars
2. Eliminating ambiguity
3. Eliminating left recursion
4. Left factoring

Each parsing method can handle grammars only of a certain form; hence, the initial grammar may
have to be rewritten to make it parsable.

Regular Expressions vs. Context-Free Grammars:

REGULAR EXPRESSION                                  CONTEXT-FREE GRAMMAR

It is used to describe the tokens of                It consists of a quadruple (V, T, P, S), where
programming languages.                              V is the set of variables (non-terminals), T the
                                                    set of terminals, P the set of productions and
                                                    S the start symbol.

It is used to check whether the given input is      It is used to check whether the given input is
valid or not using a transition diagram.            valid or not using a derivation.

The transition diagram has a set of states and      The context-free grammar has a set of
edges.                                              productions.

It has no start symbol.                             It has a start symbol.

It is useful for describing the structure of        It is useful for describing nested structures
lexical constructs such as identifiers,             such as balanced parentheses, matching
constants, keywords, and so forth.                  begin-end’s and so on.

• The lexical rules of a language are simple, and regular expressions are sufficient to describe
them.

• Regular expressions provide a more concise and easier to understand notation for tokens
than grammars.

• More efficient lexical analyzers can be constructed automatically from regular expressions
than from arbitrary grammars.

• Separating the syntactic structure of a language into lexical and nonlexical parts provides
a convenient way of modularizing the front end into two manageable-sized components.

Eliminating ambiguity:

An ambiguous grammar, that is, one that produces more than one parse tree (equivalently, more
than one leftmost or rightmost derivation) for some sentence, can sometimes be made unambiguous
by rewriting the grammar.

Consider this example, G: stmt → if expr then stmt | if expr then stmt else stmt | other

This grammar is ambiguous since the string if E1 then if E2 then S1 else S2 has the following
two parse trees (children are shown indented under their parent):

1. stmt
       if
       expr (E1)
       then
       stmt
           if
           expr (E2)
           then
           stmt (S1)
           else
           stmt (S2)

2. stmt
       if
       expr (E1)
       then
       stmt
           if
           expr (E2)
           then
           stmt (S1)
       else
       stmt (S2)

To eliminate ambiguity, the following grammar may be used:

stmt → matched_stmt | unmatched_stmt

matched_stmt → if expr then matched_stmt else matched_stmt | other

unmatched_stmt → if expr then stmt | if expr then matched_stmt else unmatched_stmt

Eliminating Left Recursion:

A grammar is said to be left recursive if it has a non-terminal A such that there is a
derivation A ⇒+ Aα (in one or more steps) for some string α. Top-down parsing methods cannot
handle left-recursive grammars. Hence, left recursion can be eliminated as follows:

If there is a production A → Aα | β it can be replaced with a sequence of two productions

A → βA’

A’ → αA’ | ε

without changing the set of strings derivable from A.

Example : Consider the following grammar for arithmetic expressions:

E → E+T | T

T → T*F | F

F → (E) | id

First eliminate the left recursion for E as

E → TE’

E’ → +TE’ | ε

Then eliminate for T as

T → FT’

T’→ *FT’ | ε

Thus the obtained grammar after eliminating left recursion is

E → TE’

E’ → +TE’ | ε

T → FT’

T’ → *FT’ | ε

F → (E) | id
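
The rule A → Aα | β ⇒ A → βA’, A’ → αA’ | ε can be sketched in Python as below. This is only an illustration under assumed conventions that are not in the original notes: a production body is a list of symbols, ε is the empty list, and the function name is hypothetical.

```python
def eliminate_immediate_left_recursion(nt, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b1 A' | ..., A' -> a1 A' | ... | epsilon.

    alternatives is a list of production bodies (lists of symbols);
    the empty list stands for epsilon."""
    # The alphas: bodies that start with A itself, with the leading A removed.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nt]
    # The betas: bodies that do not start with A.
    others = [alt for alt in alternatives if not alt or alt[0] != nt]
    if not recursive:
        return {nt: alternatives}  # nothing to do
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],
    }


# E -> E+T | T  becomes  E -> TE',  E' -> +TE' | epsilon
result = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
```

Applying the same function to T → T*F | F gives T → FT’ and T’ → *FT’ | ε, while F → (E) | id is returned unchanged because it has no left-recursive alternative.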

Algorithm to eliminate left recursion:

1. Arrange the non-terminals in some order A1, A2 . . . An.

2. for i := 1 to n do begin
       for j := 1 to i-1 do begin
           replace each production of the form Ai → Aj γ
           by the productions Ai → δ1 γ | δ2 γ | . . . | δk γ
           where Aj → δ1 | δ2 | . . . | δk are all the current Aj-productions;
       end
       eliminate the immediate left recursion among the Ai-productions
   end

Left factoring:

Left factoring is a grammar transformation that is useful for producing a grammar
suitable for predictive parsing. When it is not clear which of two alternative productions to use to
expand a non-terminal A, we can rewrite the A-productions to defer the decision until we have
seen enough of the input to make the right choice.

If there is any production A → αβ1 | αβ2 , it can be rewritten as

A → αA’

A’ → β1 | β2

Consider the grammar , G : S → iEtS | iEtSeS | a

E→b

Left factored, this grammar becomes

S → iEtSS’ | a
S’ → eS | ε
E→b
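
One round of left factoring can be sketched as follows. The representation (bodies as lists of symbols, ε as the empty list) and the names `common_prefix` and `left_factor_once` are assumptions made for this example. The sketch groups alternatives by their first symbol and factors out the longest prefix common to each group; it does not repeat the process on the newly created non-terminal.

```python
def common_prefix(bodies):
    """Longest sequence of symbols that every body starts with."""
    prefix = []
    for symbols in zip(*bodies):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor_once(nt, alternatives):
    """Rewrite A -> alpha b1 | alpha b2 | g as A -> alpha A' | g, A' -> b1 | b2."""
    groups = {}
    for alt in alternatives:
        key = alt[0] if alt else None  # group bodies by their first symbol
        groups.setdefault(key, []).append(alt)
    result = {nt: []}
    new_nt = nt + "'"
    for key, group in groups.items():
        if key is None or len(group) < 2:
            result[nt].extend(group)  # nothing to factor
        else:
            alpha = common_prefix(group)
            result[nt].append(alpha + [new_nt])
            result[new_nt] = [alt[len(alpha):] for alt in group]
            new_nt += "'"  # fresh name if another group needs factoring
    return result


# S -> iEtS | iEtSeS | a  becomes  S -> iEtSS' | a,  S' -> epsilon | eS
factored = left_factor_once("S", [list("iEtS"), list("iEtSeS"), ["a"]])
```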

PARSING

It is the process of analyzing a continuous stream of input in order to determine its

grammatical structure with respect to a given formal grammar.

Parse tree:

Graphical representation of a derivation or deduction is called a parse tree. Each interior

node of the parse tree is a non-terminal; the children of the node can be terminals or non-
terminals.

Types of parsing:

1. Top-down parsing
2. Bottom-up parsing

• Top-down parsing : A parser can start with the start symbol and try to transform it to the
input string.
Example : LL Parsers.

• Bottom-up parsing : A parser can start with the input and attempt to rewrite it into the start
symbol.
Example : LR Parsers.

TOP-DOWN PARSING

It can be viewed as an attempt to find a left-most derivation for an input string or an

attempt to construct a parse tree for the input starting from the root to the leaves.
Types of top-down parsing :

1. Recursive descent parsing

2. Predictive parsing

1. RECURSIVE DESCENT PARSING

• Recursive descent parsing is one of the top-down parsing techniques that uses a set of
recursive procedures to scan its input.

• This parsing method may involve backtracking, that is, making repeated scans of the
input.

Example for backtracking :

Consider the grammar G : S → cAd

A → ab | a
and the input string w=cad.

The parse tree can be constructed using the following top-down approach :

Step1:

Initially create a tree with a single node labeled S. An input pointer points to ‘c’, the first symbol
of w. Expand the tree with the production of S.

      S
    / | \
   c  A  d

Step2:

The leftmost leaf ‘c’ matches the first symbol of w, so advance the input pointer to the second
symbol of w ‘a’ and consider the next leaf ‘A’. Expand A using the first alternative.

      S
    / | \
   c  A  d
     / \
    a   b

Step3:

The second symbol ‘a’ of w also matches the second leaf of the tree. So advance the input pointer
to the third symbol of w ‘d’. But the third leaf of the tree is b, which does not match the input
symbol d.
Hence discard the chosen production and reset the pointer to the second position. This is called
backtracking.

Step4:

Now try the second alternative for A.

      S
    / | \
   c  A  d
      |
      a

The leaf ‘a’ matches the second symbol of w and the leaf ‘d’ matches the third symbol, so we can
halt and announce the successful completion of parsing.

Example for recursive descent parsing:

A left-recursive grammar can cause a recursive-descent parser to go into an infinite loop. Hence,
left recursion must be eliminated before parsing.

Consider the grammar for arithmetic expressions

E → E+T | T

T → T*F | F

F → (E) | id

After eliminating the left-recursion the grammar becomes,

E → TE’

E’ → +TE’ | ε

T → FT’

T’ → *FT’ | ε

F → (E) | id

Now we can write the procedure for grammar as follows:

Recursive procedure:

Procedure E( )
begin
    T( );
    EPRIME( );
end

Procedure EPRIME( )
begin
    if input_symbol = ’+’ then
    begin
        ADVANCE( );
        T( );
        EPRIME( );
    end
end

Procedure T( )
begin
    F( );
    TPRIME( );
end

Procedure TPRIME( )
begin
    if input_symbol = ’*’ then
    begin
        ADVANCE( );
        F( );
        TPRIME( );
    end
end

Procedure F( )
begin
    if input_symbol = ’id’ then
        ADVANCE( );
    else if input_symbol = ’(’ then
    begin
        ADVANCE( );
        E( );
        if input_symbol = ’)’ then
            ADVANCE( );
        else ERROR( );
    end
    else ERROR( );
end

Stack implementation (trace of procedure calls):

To recognize input id+id*id (the second column shows the input that remains when the procedure
is called) :

PROCEDURE       REMAINING INPUT

E( )            id+id*id
T( )            id+id*id
F( )            id+id*id
ADVANCE( )      id+id*id
TPRIME( )       +id*id
EPRIME( )       +id*id
ADVANCE( )      +id*id
T( )            id*id
F( )            id*id
ADVANCE( )      id*id
TPRIME( )       *id
ADVANCE( )      *id
F( )            id
ADVANCE( )      id
TPRIME( )       (empty)
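
The recursive procedures for the grammar without left recursion can be sketched in Python as follows. This is a minimal recognizer, assuming the input has already been split into tokens such as "id", "+", "*", "(" and ")"; the class name, the `ParseError` exception and the end marker "$" are choices made for this example, not part of the notes.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


class RecursiveDescentParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of the input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def advance(self):
        self.pos += 1

    def parse(self):
        self.E()
        if self.lookahead != "$":
            raise ParseError(f"unexpected {self.lookahead!r}")
        return True

    def E(self):           # E -> T E'
        self.T()
        self.EPRIME()

    def EPRIME(self):      # E' -> + T E' | epsilon
        if self.lookahead == "+":
            self.advance()
            self.T()
            self.EPRIME()

    def T(self):           # T -> F T'
        self.F()
        self.TPRIME()

    def TPRIME(self):      # T' -> * F T' | epsilon
        if self.lookahead == "*":
            self.advance()
            self.F()
            self.TPRIME()

    def F(self):           # F -> ( E ) | id
        if self.lookahead == "id":
            self.advance()
        elif self.lookahead == "(":
            self.advance()
            self.E()
            if self.lookahead != ")":
                raise ParseError("expected ')'")
            self.advance()
        else:
            raise ParseError(f"unexpected {self.lookahead!r}")


accepted = RecursiveDescentParser(["id", "+", "id", "*", "id"]).parse()
```

Each non-terminal becomes one method, and the ε-alternatives of E’ and T’ correspond to simply returning without consuming input.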

2. PREDICTIVE PARSING

• Predictive parsing is a special case of recursive descent parsing where no backtracking is
required.

• The key problem of predictive parsing is to determine the production to be applied for a
non-terminal in case of alternatives.

Non-recursive predictive parser

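[Figure: model of a table-driven predictive parser. The predictive parsing program reads the
INPUT buffer (a + b $), uses a STACK (X on top, Y below it, $ at the bottom), consults the
Parsing Table M and produces the OUTPUT.]

The table-driven predictive parser has an input buffer, a stack, a parsing table and an output
stream.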

Input buffer:

It contains the string to be parsed, followed by $ to indicate the end of the input string.

Stack:

It contains a sequence of grammar symbols preceded by $ to indicate the bottom of the stack.
Initially, the stack contains the start symbol on top of $.

Parsing table:

It is a two-dimensional array M[A, a], where ‘A’ is a non-terminal and ‘a’ is a terminal.

Predictive parsing program:

The parser is controlled by a program that considers X, the symbol on top of stack, and a, the
current input symbol. These two symbols determine the parser action. There are three
possibilities:

1. If X = a = $, the parser halts and announces successful completion of parsing.

2. If X = a ≠ $, the parser pops X off the stack and advances the input pointer to the next
input symbol.
3. If X is a non-terminal , the program consults entry M[X, a] of the parsing table M. This
entry will either be an X-production of the grammar or an error entry.
If M[X, a] = {X → UVW}, the parser replaces X on top of the stack by WVU (with U on top).
If M[X, a] = error, the parser calls an error recovery routine.

Algorithm for nonrecursive predictive parsing:

Input : A string w and a parsing table M for grammar G.

Output : If w is in L(G), a leftmost derivation of w; otherwise, an error indication.

Method : Initially, the parser has $S on the stack with S, the start symbol of G on top, and w$ in
the input buffer. The program that utilizes the predictive parsing table M to produce a parse for
the input is as follows:

set ip to point to the first symbol of w$;

repeat
    let X be the top stack symbol and a the symbol pointed to by ip;
    if X is a terminal or $ then
        if X = a then
            pop X from the stack and advance ip
        else error()
    else /* X is a non-terminal */
        if M[X, a] = X → Y1 Y2 … Yk then begin
            pop X from the stack;
            push Yk, Yk-1, … , Y1 onto the stack, with Y1 on top;
            output the production X → Y1 Y2 … Yk
        end
        else error()
until X = $ /* stack is empty */

Predictive parsing table construction:

The construction of a predictive parser is aided by two functions associated with a grammar G :

1. FIRST

2. FOLLOW

Rules for first( ):

1. If X is terminal, then FIRST(X) is {X}.

2. If X → ε is a production, then add ε to FIRST(X).

3. If X is non-terminal and X → aα is a production then add a to FIRST(X).

4. If X is non-terminal and X → Y1 Y2…Yk is a production, then place a in FIRST(X) if for some
i, a is in FIRST(Yi), and ε is in all of FIRST(Y1),…,FIRST(Yi-1); that is, Y1…Yi-1 ⇒* ε. If ε is
in FIRST(Yj) for all j=1,2,..,k, then add ε to FIRST(X).

Rules for follow( ):

1. If S is a start symbol, then FOLLOW(S) contains $.

2. If there is a production A → αBβ, then everything in FIRST(β) except ε is placed in
FOLLOW(B).

3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε, then
everything in FOLLOW(A) is in FOLLOW(B).

Algorithm for construction of predictive parsing table:

Input : Grammar G

Output : Parsing table M

Method :

1. For each production A → α of the grammar, do steps 2 and 3.

2. For each terminal a in FIRST(α), add A → α to M[A, a].

3. If ε is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A). If ε is in

FIRST(α) and $ is in FOLLOW(A) , add A → α to M[A, $].

4. Make each undefined entry of M be error.

Example:

Consider the following grammar :

E → E+T | T
T → T*F | F
F → (E) | id

After eliminating left-recursion the grammar is

E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id

First( ) :

FIRST(E) = { ( , id}

FIRST(E’) ={+ , ε }

FIRST(T) = { ( , id}

FIRST(T’) = {*, ε }

FIRST(F) = { ( , id }

Follow( ):

FOLLOW(E) = { $, ) }

FOLLOW(E’) = { $, ) }

FOLLOW(T) = { +, $, ) }

FOLLOW(T’) = { +, $, ) }

FOLLOW(F) = {+, * , $ , ) }
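
The FIRST and FOLLOW rules can be applied mechanically by repeating them until no set changes. The sketch below does this for the grammar above. The dictionary representation, the constant names and the function names are assumptions for the example; the empty list stands for an ε-production.

```python
EPS = "ε"
END = "$"

# Grammar after eliminating left recursion; [] is the epsilon production.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols, given FIRST of the non-terminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # terminal: FIRST is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can derive epsilon (or the sequence is empty)
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # repeat the rules until nothing new is added
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)  # rule 1
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue  # FOLLOW is only defined for non-terminals
                    rest = first_of_string(body[i + 1:], first)
                    new = rest - {EPS}          # rule 2
                    if EPS in rest:
                        new |= follow[nt]       # rule 3
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST, "E")
```

The computed sets agree with the FIRST and FOLLOW sets listed above.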

Predictive parsing table :

NON-TERMINAL   id          +            *            (           )          $

E              E → TE’                               E → TE’
E’                         E’ → +TE’                             E’ → ε     E’ → ε
T              T → FT’                               T → FT’
T’                         T’ → ε       T’ → *FT’                T’ → ε     T’ → ε
F              F → id                                F → (E)

Stack implementation:

STACK          INPUT           OUTPUT

$E             id+id*id $
$E’T           id+id*id $      E → TE’
$E’T’F         id+id*id $      T → FT’
$E’T’id        id+id*id $      F → id
$E’T’          +id*id $
$E’            +id*id $        T’ → ε
$E’T+          +id*id $        E’ → +TE’
$E’T           id*id $
$E’T’F         id*id $         T → FT’
$E’T’id        id*id $         F → id
$E’T’          *id $
$E’T’F*        *id $           T’ → *FT’
$E’T’F         id $
$E’T’id        id $            F → id
$E’T’          $
$E’            $               T’ → ε
$              $               E’ → ε
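
The non-recursive predictive parsing algorithm, driven by the table above, can be sketched as follows. The table is copied by hand into a Python dictionary; the names `TABLE`, `NONTERMINALS` and `predictive_parse` are assumptions for this example, and the empty list stands for ε.

```python
# Parsing table M for the expression grammar: (non-terminal, terminal) -> body.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def predictive_parse(tokens, start="E"):
    """Return the productions of a leftmost derivation, or raise SyntaxError."""
    stack = ["$", start]            # start symbol on top of $
    tokens = list(tokens) + ["$"]
    ip = 0
    output = []
    while True:
        top, a = stack[-1], tokens[ip]
        if top == "$" and a == "$":
            return output           # X = a = $: successful completion
        if top not in NONTERMINALS:
            if top != a:
                raise SyntaxError(f"expected {top!r}, found {a!r}")
            stack.pop()             # X = a: pop and advance the input pointer
            ip += 1
        else:
            body = TABLE.get((top, a))
            if body is None:        # blank entry: error
                raise SyntaxError(f"no entry M[{top}, {a}]")
            stack.pop()
            stack.extend(reversed(body))  # push Yk ... Y1, with Y1 on top
            output.append((top, body))


derivation = predictive_parse(["id", "+", "id", "*", "id"])
```

For the input id+id*id the function outputs the same eleven productions, in the same order, as the OUTPUT column of the trace above.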

LL(1) grammar:

If every entry of the parsing table contains at most one production (no entry is multiply
defined), the grammar is called an LL(1) grammar.

Consider this following grammar:

S → iEtS | iEtSeS | a
E→b

After left factoring, we have

S → iEtSS’ | a
S’→ eS | ε
E→b

To construct a parsing table, we need FIRST() and FOLLOW() for all the non-terminals.

FIRST(S) = { i, a }

FIRST(S’) = {e, ε }

FIRST(E) = { b}

FOLLOW(S) = { $ ,e }
FOLLOW(S’) = { $ ,e }

FOLLOW(E) = {t}

Parsing table:

NON-TERMINAL   a          b          e            i              t          $

S              S → a                              S → iEtSS’
S’                                   S’ → eS                                S’ → ε
                                     S’ → ε
E                         E → b

Since the entry M[S’, e] contains more than one production, the grammar is not an LL(1) grammar.
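
The table-construction steps, and the check for multiply-defined entries, can be sketched as below. To keep the example short, FIRST of each right side and the FOLLOW sets are entered by hand from the values computed above rather than recomputed; the list and function names are assumptions for this illustration.

```python
# Productions of the left-factored grammar; [] is the epsilon production.
PRODUCTIONS = [
    ("S", ["i", "E", "t", "S", "S'"]),
    ("S", ["a"]),
    ("S'", ["e", "S"]),
    ("S'", []),
    ("E", ["b"]),
]
# FIRST of each right side, in the same order (entered by hand).
FIRST_OF_RHS = [{"i"}, {"a"}, {"e"}, {"ε"}, {"b"}]
FOLLOW = {"S": {"$", "e"}, "S'": {"$", "e"}, "E": {"t"}}


def build_table(productions, first_of_rhs, follow):
    """Map (non-terminal, terminal) to the list of productions placed there."""
    table = {}
    for (head, body), first in zip(productions, first_of_rhs):
        targets = first - {"ε"}          # step 2: terminals in FIRST(alpha)
        if "ε" in first:
            targets |= follow[head]      # step 3: terminals (and $) in FOLLOW(A)
        for terminal in targets:
            table.setdefault((head, terminal), []).append(body)
    return table


def conflicts(table):
    """Entries holding more than one production."""
    return {cell: bodies for cell, bodies in table.items() if len(bodies) > 1}


table = build_table(PRODUCTIONS, FIRST_OF_RHS, FOLLOW)
multiply_defined = conflicts(table)
```

The only multiply-defined entry is M[S’, e], which holds both S’ → eS and S’ → ε; this is the dangling-else ambiguity showing up in the table.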

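Actions performed in predictive parsing:

1. Match – the terminal on top of the stack equals the current input symbol; pop it and advance
the input pointer.
2. Expand – replace the non-terminal on top of the stack by the right side of the production in
M[X, a].
3. Accept
4. Error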

Implementation of predictive parser:

1. Eliminate ambiguity and left recursion, and left-factor the grammar.

2. Construct FIRST() and FOLLOW() for all non-terminals.
3. Construct predictive parsing table.
4. Parse the given input string using stack and parsing table.

BOTTOM-UP PARSING

Constructing a parse tree for an input string beginning at the leaves and going towards the root is
called bottom-up parsing.

A general type of bottom-up parser is a shift-reduce parser.

SHIFT-REDUCE PARSING
Shift-reduce parsing is a type of bottom-up parsing that attempts to construct a parse tree
for an input string beginning at the leaves (the bottom) and working up towards the root (the
top).

Example:
Consider the grammar:

S → aABe
A → Abc | b
B → d

The sentence to be recognized is abbcde.

REDUCTION (LEFTMOST)          RIGHTMOST DERIVATION

abbcde   (A → b)              S ⇒ aABe
aAbcde   (A → Abc)              ⇒ aAde
aAde     (B → d)                ⇒ aAbcde
aABe     (S → aABe)             ⇒ abbcde
S

The reductions trace out the right-most derivation in reverse.

Handles:

A handle of a string is a substring that matches the right side of a production, and whose
reduction to the non-terminal on the left side of the production represents one step along the
reverse of a rightmost derivation.

Example:

Consider the grammar:

E → E+E
E → E*E
E → (E)
E → id

And the input string id1+id2*id3

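The rightmost derivation is :

E ⇒ E+E
  ⇒ E+E*E
  ⇒ E+E*id3
  ⇒ E+id2*id3
  ⇒ id1+id2*id3

In the above derivation, the handle of each right-sentential form is the substring introduced by
the corresponding derivation step: id1 in id1+id2*id3, id2 in E+id2*id3, id3 in E+E*id3, E*E in
E+E*E, and E+E in E+E.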

Handle pruning:

A rightmost derivation in reverse can be obtained by “handle pruning”.

(i.e.) if w is a sentence or string of the grammar at hand, then w = γn, where γn is the nth
right-sentential form of some as yet unknown rightmost derivation.

Stack implementation of shift-reduce parsing :

STACK          INPUT             ACTION

$              id1+id2*id3 $     shift
$ id1          +id2*id3 $        reduce by E → id
$ E            +id2*id3 $        shift
$ E+           id2*id3 $         shift
$ E+id2        *id3 $            reduce by E → id
$ E+E          *id3 $            shift
$ E+E*         id3 $             shift
$ E+E*id3      $                 reduce by E → id
$ E+E*E        $                 reduce by E → E*E
$ E+E          $                 reduce by E → E+E
$ E            $                 accept

Actions in shift-reduce parser:

• shift – The next input symbol is shifted onto the top of the stack.
• reduce – The parser replaces the handle on top of the stack with a non-terminal.
• accept – The parser announces successful completion of parsing.
• error – The parser discovers that a syntax error has occurred and calls an error recovery
routine.

Conflicts in shift-reduce parsing:

There are two conflicts that occur in shift-reduce parsing:

1. Shift-reduce conflict: The parser cannot decide whether to shift or to reduce.

2. Reduce-reduce conflict: The parser cannot decide which of several reductions to make.

1. Shift-reduce conflict:

Example:

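Consider the grammar:

E → E+E | E*E | id and input id+id*id

With E+E on the stack and * as the next input symbol, the parser can either reduce or shift:

Choice 1: reduce first

STACK          INPUT       ACTION

$ E+E          *id $       Reduce by E → E+E
$ E            *id $       Shift
$ E*           id $        Shift
$ E*id         $           Reduce by E → id
$ E*E          $           Reduce by E → E*E
$ E            $

Choice 2: shift first

STACK          INPUT       ACTION

$ E+E          *id $       Shift
$ E+E*         id $        Shift
$ E+E*id       $           Reduce by E → id
$ E+E*E        $           Reduce by E → E*E
$ E+E          $           Reduce by E → E+E
$ E            $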

2. Reduce-reduce conflict:

Consider the grammar:

M → R+R | R+c | R
R→c
and input c+c

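With R+c on the stack, the parser can reduce either by R → c or by M → R+c:

Choice 1: reduce by R → c

STACK          INPUT       ACTION

$              c+c $       Shift
$ c            +c $        Reduce by R → c
$ R            +c $        Shift
$ R+           c $         Shift
$ R+c          $           Reduce by R → c
$ R+R          $           Reduce by M → R+R
$ M            $

Choice 2: reduce by M → R+c

STACK          INPUT       ACTION

$              c+c $       Shift
$ c            +c $        Reduce by R → c
$ R            +c $        Shift
$ R+           c $         Shift
$ R+c          $           Reduce by M → R+c
$ M            $

LR PARSERS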
An efficient bottom-up syntax analysis technique that can be used to parse a large class of
CFG is called LR(k) parsing. The ‘L’ is for left-to-right scanning of the input, the ‘R’ for
constructing a rightmost derivation in reverse, and the ‘k’ for the number of input symbols of
lookahead used in making parsing decisions. When ‘k’ is omitted, it is assumed to be 1.

Advantages of LR parsing:
• It recognizes virtually all programming language constructs for which a CFG can be
written.
• It is an efficient non-backtracking shift-reduce parsing method.
• The class of grammars that can be parsed using LR methods is a proper superset of the class
of grammars that can be parsed with predictive parsers.
• It detects a syntactic error as soon as it is possible to do so on a left-to-right scan of the
input.

Drawbacks of LR method:
It is too much work to construct an LR parser by hand for a programming language
grammar. A specialized tool, called an LR parser generator, is needed. Example: YACC.

Types of LR parsing method:

1. SLR - Simple LR
   • Easiest to implement, least powerful.
2. CLR - Canonical LR
   • Most powerful, most expensive.
3. LALR - Look-Ahead LR
   • Intermediate in size and cost between the other two methods.

The LR parsing algorithm:

The schematic form of an LR parser is as follows:

[Figure: model of an LR parser. The LR parsing program reads the INPUT buffer (a1 … ai … an $),
maintains a STACK holding s0 at the bottom up to Xm-1 sm-1 Xm sm on top, consults the parsing
table (action and goto parts) and produces the OUTPUT.]

It consists of : an input, an output, a stack, a driver program, and a parsing table that has two
parts (action and goto).

• The driver program is the same for all LR parsers.

• The parsing program reads characters from an input buffer one at a time.

• The program uses a stack to store a string of the form s0X1s1X2s2…Xmsm, where sm is on
top. Each Xi is a grammar symbol and each si is a state.

• The parsing table consists of two parts : action and goto functions.

Action : The parsing program determines sm, the state currently on top of the stack, and ai, the
current input symbol. It then consults action[sm, ai] in the action table, which can have one of four
values :

1. shift s, where s is a state,

2. reduce by a grammar production A → β,
3. accept, and
4. error.

Goto : The function goto takes a state and grammar symbol as arguments and produces a state.

LR Parsing algorithm:

Input: An input string w and an LR parsing table with functions action and goto for grammar G.

Output: If w is in L(G), a bottom-up-parse for w; otherwise, an error indication.

Method: Initially, the parser has s0 on its stack, where s0 is the initial state, and w$ in the input
buffer. The parser then executes the following program :

set ip to point to the first input symbol of w$;

repeat forever begin
    let s be the state on top of the stack and
        a the symbol pointed to by ip;
    if action[s, a] = shift s’ then begin
        push a then s’ on top of the stack;
        advance ip to the next input symbol
    end
    else if action[s, a] = reduce A → β then begin
        pop 2 * |β| symbols off the stack;
        let s’ be the state now on top of the stack;
        push A then goto[s’, A] on top of the stack;
        output the production A → β
    end
    else if action[s, a] = accept then
        return
    else error( )
end

CONSTRUCTING SLR(1) PARSING TABLE:

To perform SLR parsing, take the grammar as input and do the following:

1. Find the LR(0) items.
2. Compute the closure of each set of items.
3. Compute goto(I, X), where I is a set of items and X is a grammar symbol.

LR(0) items:
An LR(0) item of a grammar G is a production of G with a dot at some position of the
right side. For example, production A → XYZ yields the four items :

A → . XYZ
A → X . YZ
A → XY . Z
A → XYZ .

Closure operation:
If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I
by the two rules:

1. Initially, every item in I is added to closure(I).

2. If A → α . Bβ is in closure(I) and B → γ is a production, then add the item B → . γ to
closure(I), if it is not already there. We apply this rule until no more new items can be added
to closure(I).

Goto operation:
Goto(I, X) is defined to be the closure of the set of all items [A→ αX . β] such that
[A→ α . Xβ] is in I.
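
The closure and goto operations can be sketched in Python as follows. The sketch assumes the expression grammar used throughout these notes, augmented with E’ → E, and represents an LR(0) item as a pair (production index, dot position); this encoding and the function names are choices made for the example.

```python
# Augmented expression grammar: production 0 is E' -> E.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """Add B -> . gamma for every item with the dot in front of non-terminal B."""
    result = set(items)
    worklist = list(items)
    while worklist:
        prod, dot = worklist.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for index, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    worklist.append((index, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over symbol in every item that allows it, then close the set."""
    moved = {(prod, dot + 1) for prod, dot in items
             if dot < len(GRAMMAR[prod][1]) and GRAMMAR[prod][1][dot] == symbol}
    return closure(moved)


def canonical_collection():
    """All sets of LR(0) items reachable from closure({E' -> . E})."""
    states = [closure({(0, 0)})]
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    for state in states:  # the list grows while it is being scanned
        for symbol in symbols:
            target = goto(state, symbol)
            if target and target not in states:
                states.append(target)
    return states


I0 = closure({(0, 0)})
states = canonical_collection()
```

For this grammar the initial set has seven items and the collection has twelve sets. The order in which this sketch discovers the sets is not guaranteed to match any particular numbering of the states.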

Steps to construct SLR parsing table for grammar G are:

1. Augment G and produce G’

2. Construct the canonical collection of set of items C for G’
3. Construct the parsing action function action and goto using the following algorithm that
requires FOLLOW(A) for each non-terminal of grammar.

Algorithm for construction of SLR parsing table:

Input : An augmented grammar G’

Output : The SLR parsing table functions action and goto for G’
Method :
1. Construct C = {I0, I1, …, In}, the collection of sets of LR(0) items for G’.
2. State i is constructed from Ii. The parsing actions for state i are determined as follows:
   (a) If [A → α . aβ] is in Ii and goto(Ii, a) = Ij, then set action[i, a] to “shift j”. Here a must
       be a terminal.
   (b) If [A → α .] is in Ii, then set action[i, a] to “reduce A → α” for all a in FOLLOW(A).
       Here A may not be S’.
   (c) If [S’ → S .] is in Ii, then set action[i, $] to “accept”.

   If any conflicting actions are generated by the above rules, we say the grammar is not SLR(1).
3. The goto transitions for state i are constructed for all non-terminals A using the rule:
   If goto(Ii, A) = Ij, then goto[i, A] = j.
4. All entries not defined by rules (2) and (3) are made “error”.
5. The initial state of the parser is the one constructed from the set of items containing
   [S’ → . S].

Example for SLR parsing:

Construct the SLR parsing table for the following grammar :
G : E → E + T | T
    T → T * F | F
    F → ( E ) | id

The given grammar is :

G : E → E + T ------ (1)
E →T ------ (2)
T → T * F ------ (3)
T→F ------ (4)
F → (E) ------ (5)
F → id ------ (6)

Step 1 : Convert given grammar into augmented grammar.

Augmented grammar :
E’ → E
E →E+T
E →T
T →T*F
T→F
F → (E)
F → id

Step 2 : Find LR (0) items.

I0 : E’ → . E
     E → . E + T
     E → . T
     T → . T * F
     T → . F
     F → . ( E )
     F → . id

GOTO ( I0 , E )
I1 : E’ → E .
     E → E . + T

GOTO ( I0 , T )
I2 : E → T .
     T → T . * F

GOTO ( I0 , F )
I3 : T → F .

GOTO ( I0 , ( )
I4 : F → ( . E )
     E → . E + T
     E → . T
     T → . T * F
     T → . F
     F → . ( E )
     F → . id

GOTO ( I0 , id )
I5 : F → id .

GOTO ( I1 , + )
I6 : E → E + . T
     T → . T * F
     T → . F
     F → . ( E )
     F → . id

GOTO ( I2 , * )
I7 : T → T * . F
     F → . ( E )
     F → . id

GOTO ( I4 , E )
I8 : F → ( E . )
     E → E . + T

GOTO ( I4 , T ) = I2
GOTO ( I4 , F ) = I3
GOTO ( I4 , ( ) = I4
GOTO ( I4 , id ) = I5

GOTO ( I6 , T )
I9 : E → E + T .
     T → T . * F

GOTO ( I6 , F ) = I3
GOTO ( I6 , ( ) = I4
GOTO ( I6 , id ) = I5

GOTO ( I7 , F )
I10 : T → T * F .

GOTO ( I7 , ( ) = I4
GOTO ( I7 , id ) = I5

GOTO ( I8 , ) )
I11 : F → ( E ) .

GOTO ( I8 , + ) = I6
GOTO ( I9 , * ) = I7

FOLLOW (E) = { $ , ) , + }
FOLLOW (T) = { $ , + , ) , * }
FOLLOW (F) = { * , + , ) , $ }

SLR parsing table:

         ACTION                                  GOTO

STATE    id     +      *      (      )      $      E     T     F

I0       s5                   s4                   1     2     3
I1              s6                          ACC
I2              r2     s7            r2     r2
I3              r4     r4            r4     r4
I4       s5                   s4                   8     2     3
I5              r6     r6            r6     r6
I6       s5                   s4                         9     3
I7       s5                   s4                               10
I8              s6                   s11
I9              r1     s7            r1     r1
I10             r3     r3            r3     r3
I11             r5     r5            r5     r5

Blank entries are error entries.

Stack implementation:

Check whether the input id + id * id is valid or not.

STACK                      INPUT             ACTION

0                          id + id * id $    action[0, id] = s5 ; shift
0 id 5                     + id * id $       action[5, +] = r6 ; reduce by F → id ; goto[0, F] = 3
0 F 3                      + id * id $       action[3, +] = r4 ; reduce by T → F ; goto[0, T] = 2
0 T 2                      + id * id $       action[2, +] = r2 ; reduce by E → T ; goto[0, E] = 1
0 E 1                      + id * id $       action[1, +] = s6 ; shift
0 E 1 + 6                  id * id $         action[6, id] = s5 ; shift
0 E 1 + 6 id 5             * id $            action[5, *] = r6 ; reduce by F → id ; goto[6, F] = 3
0 E 1 + 6 F 3              * id $            action[3, *] = r4 ; reduce by T → F ; goto[6, T] = 9
0 E 1 + 6 T 9              * id $            action[9, *] = s7 ; shift
0 E 1 + 6 T 9 * 7          id $              action[7, id] = s5 ; shift
0 E 1 + 6 T 9 * 7 id 5     $                 action[5, $] = r6 ; reduce by F → id ; goto[7, F] = 10
0 E 1 + 6 T 9 * 7 F 10     $                 action[10, $] = r3 ; reduce by T → T * F ; goto[6, T] = 9
0 E 1 + 6 T 9              $                 action[9, $] = r1 ; reduce by E → E + T ; goto[0, E] = 1
0 E 1                      $                 action[1, $] = accept
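
The LR parsing algorithm can be sketched as a small driver over the SLR table above. The table is copied by hand into dictionaries, and the names `ACTION`, `GOTO`, `PRODUCTIONS` and `lr_parse` are assumptions for this example. One simplification differs from the algorithm as stated: the stack here holds only states, so a reduction by A → β pops |β| entries rather than 2 * |β|.

```python
# Production number -> (head, length of the right side).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1),
               5: ("F", 3), 6: ("F", 1)}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}


def lr_parse(tokens):
    """Return the production numbers used, in order, or raise SyntaxError."""
    stack = [0]                     # states only; state 0 is the initial state
    tokens = list(tokens) + ["$"]
    ip = 0
    reductions = []
    while True:
        act = ACTION[stack[-1]].get(tokens[ip])
        if act is None:             # blank entry: error
            raise SyntaxError(f"unexpected {tokens[ip]!r}")
        if act == "acc":
            return reductions
        if act[0] == "s":           # shift: push the new state, advance the input
            stack.append(int(act[1:]))
            ip += 1
        else:                       # reduce by production number act[1:]
            number = int(act[1:])
            head, length = PRODUCTIONS[number]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][head])
            reductions.append(number)


used = lr_parse(["id", "+", "id", "*", "id"])
```

For id + id * id the reductions come out in the order 6, 4, 2, 6, 4, 6, 3, 1, which is the sequence of reduce steps in the trace above.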

TYPE CHECKING

A compiler must check that the source program follows both syntactic and semantic conventions
of the source language.
This checking, called static checking, detects and reports programming errors.

Some examples of static checks:

1. Type checks: A compiler should report an error if an operator is applied to an incompatible operand. Example: an array variable and a function variable are added together.
