Presented by Jyoti Thakur

The document discusses the phases and process of compilation. It outlines the major phases as lexical analysis, syntax analysis, semantic analysis, intermediate code generation, optimization and target code generation. It describes lexical analysis as breaking the program into tokens by removing whitespace and comments. It discusses the different types of grammars as left recursive and right recursive, and how left recursion is eliminated. It also covers topics like tokens, patterns, lexemes, parsers, LR parsing and the construction of LR parsing tables.


Outline
 Introduction
 Phases of compiler
 Tokens, Patterns, Lexemes
 Lexical Analysis
 Types of Grammar
 Parsers
Introduction
 A compiler is a language processor that translates a
source program into an object program (machine
language).
 The target language can also be assembly language.
Source Program (input) --> Compiler --> Target Program (output)
The compiler also reports error messages.
The Analysis-Synthesis Model
of Compilation
 There are two parts to compilation:
 Analysis
 Synthesis
Phases of compiler
 Lexical Analysis
 Syntax Analysis
 Semantic Analysis
 Intermediate Code Generation
 Machine-independent code optimization
 Target code generation
 Machine-dependent code optimization
Tokens, Patterns and Lexemes
 A lexeme is a sequence of input characters that makes up a
single token. Ex: total, =, 2 etc.
 Tokens are classes of similar lexemes. Ex: identifier,
keyword, constant etc.
 A pattern is a rule that describes a token. Ex: the pattern for
an identifier is a letter followed by letters or digits.
Lexical Analysis
 It reads the whole high-level language program one
character at a time.
 It breaks the program into tokens.
 It removes white space and comments.
 It keeps track of the line numbers of the program.
 It creates storage for identifiers in the symbol
table.
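The duties listed above can be sketched as a small scanner. This is only an illustration, assuming a toy language with `//` line comments; the keyword set, the token names and the `tokenize` helper are assumptions, not part of the slides.

```python
import re

KEYWORDS = {"int", "if", "else", "while"}  # assumed keyword set

# Each token class is described by a pattern (a regular expression).
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t\r]+"),
    ("CONSTANT", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),  # letter followed by letters or digits
    ("OPERATOR", r"[=+\-*/;()]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return (tokens, symbol_table); each token is (class, lexeme, line)."""
    tokens, symbol_table, line, pos = [], {}, 1, 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"line {line}: unexpected character {source[pos]!r}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "NEWLINE":
            line += 1                      # maintain the line number
        elif kind in ("SKIP", "COMMENT"):
            continue                       # remove white space and comments
        else:
            if kind == "IDENTIFIER":
                if lexeme in KEYWORDS:
                    kind = "KEYWORD"
                else:                      # create storage in the symbol table
                    symbol_table.setdefault(lexeme, {"first_line": line})
            tokens.append((kind, lexeme, line))
    return tokens, symbol_table

tokens, symbols = tokenize("int total = 2; // init\ntotal = total + 40;")
```

Here `total`, `=` and `2` are lexemes, while `IDENTIFIER`, `OPERATOR` and `CONSTANT` are the tokens they belong to.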
Types of Grammar
 With respect to recursion, there are two types of grammar:
1. Left Recursive Grammar.
2. Right Recursive Grammar.
• A left recursive grammar may not be equivalent to a
right recursive grammar.
• A left recursive grammar can send a top-down parser
into an infinite loop.
• A right recursive grammar doesn’t create this
problem.
Elimination of left recursion
 A grammar is left recursive if it has a non-terminal A
such that there is a derivation A =>+ Aα (one or more steps) for some string α
 Top down parsing methods can’t handle left-recursive
grammars
 A simple rule for direct left recursion elimination:
 For a rule like:
 S-> Sa|b
 We may replace it with
 S -> b S’
 S’ -> a S’ | ɛ
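The rule above can be sketched in a few lines. This is a minimal sketch for direct left recursion only; the function name and the list-of-symbols representation (with `[]` standing for ɛ) are assumptions made for illustration.

```python
def eliminate_direct_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ...  as  A -> b1 A' | ...,  A' -> a1 A' | ɛ.

    Each alternative is a list of symbols; [] stands for ɛ.
    """
    # Alternatives of the form A -> A alpha (keep only alpha).
    recursive = [alt[1:] for alt in alternatives if alt[:1] == [nonterminal]]
    # Alternatives that do not start with A.
    others = [alt for alt in alternatives if alt[:1] != [nonterminal]]
    if not recursive:
        return {nonterminal: alternatives}   # nothing to do
    new_nt = nonterminal + "'"
    return {
        nonterminal: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],
    }

result = eliminate_direct_left_recursion("S", [["S", "a"], ["b"]])
```

For `S -> Sa | b` the result corresponds to `S -> b S'` and `S' -> a S' | ɛ`, matching the replacement shown in the slide.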
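Left factoring
 When several productions of a non-terminal begin with the same
prefix, the parser cannot tell which one to use while generating a string.
 Left factoring pulls out the common prefix to avoid backtracking.
For a rule like:
 S -> aα1 | aα2 | aα3 …
We may replace it with:
 S -> aS’
 S’ -> α1 | α2 | α3 …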
Left factoring (cont.)
 Example:
 S -> i E t S | i E t S e S | a
 E -> b
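One round of left factoring on the example above might look like the following. This is a sketch under assumptions: alternatives are lists of symbols, `[]` stands for ɛ, and the helper names are hypothetical.

```python
from collections import defaultdict

def common_prefix(sequences):
    """Longest prefix shared by all symbol sequences."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(nonterminal, alternatives):
    """One round of left factoring; [] stands for ɛ."""
    groups = defaultdict(list)
    for alt in alternatives:                 # group alternatives by first symbol
        groups[alt[0] if alt else None].append(alt)
    result = {nonterminal: []}
    primes = 0
    for first, group in groups.items():
        if first is None or len(group) == 1:
            result[nonterminal].extend(group)   # nothing to factor
            continue
        primes += 1
        new_nt = nonterminal + "'" * primes
        prefix = common_prefix(group)
        result[nonterminal].append(prefix + [new_nt])
        result[new_nt] = [alt[len(prefix):] for alt in group]
    return result

factored = left_factor("S", [list("iEtS"), list("iEtSeS"), ["a"]])
```

The result corresponds to `S -> iEtSS' | a` and `S' -> ɛ | eS`. A full implementation would repeat the step until no two alternatives share a prefix.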
Introduction
 A top-down parser tries to build a parse tree from the
root towards the leaves, scanning the input from left to
right
 It can also be viewed as finding a leftmost derivation
for an input string
 Example: id+id*id
E -> TE’
E’ -> +TE’ | Ɛ
T -> FT’
T’ -> *FT’ | Ɛ
F -> (E) | id
[Figure: the parse tree for id+id*id grown from the root E, one leftmost-derivation (lm) step at a time]
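Example
S->cAd
A->ab | a
Input: cad
[Figure: three parse trees. S is expanded to c A d; A is first expanded to a b, which fails to match the input; after backtracking, A is expanded to a, which matches.]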
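The backtracking in this example can be reproduced with a small recursive-descent recognizer. This is an illustrative sketch; the generator-based `parse` helper and the `accepts` wrapper are assumptions, not taken from the slides.

```python
GRAMMAR = {"S": [["c", "A", "d"]], "A": [["a", "b"], ["a"]]}

def parse(symbols, text, pos=0):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Try the alternatives in order; when one fails later on,
        # the loop moves to the next alternative (backtracking).
        for alternative in GRAMMAR[head]:
            for mid in parse(alternative, text, pos):
                yield from parse(rest, text, mid)
    elif pos < len(text) and text[pos] == head:
        yield from parse(rest, text, pos + 1)

def accepts(text):
    """True if the whole input can be derived from S."""
    return any(end == len(text) for end in parse(["S"], text))
```

On the input `cad`, the alternative `A->ab` is tried first and fails at `d`, then `A->a` succeeds. Note that a left-recursive rule would make `parse` call itself forever, which is why top-down parsers need left recursion removed.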
Computing First
 Rule 1:
First(ε) = {ε}.
 Rule 2:
If a is a terminal, then First(a) = {a}.
 Rule 3:
If A is a variable with a production A -> x1 x2 x3 …, then First(A)
includes First(x1) except ε. If First(x1) contains ε, also include First(x2),
and so on; if every xi can derive ε, then ε is in First(A).
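The three rules can be applied repeatedly until no First set changes. The sketch below does this for the expression grammar used in these slides; the dictionary representation and the function names are assumptions.

```python
EPS = "ε"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}

def first_of_string(symbols, first):
    """First set of a sequence x1 x2 ..., given the First sets found so far."""
    result = set()
    for x in symbols:
        if x == EPS:
            continue
        x_first = first[x] if x in first else {x}   # Rule 2: a terminal
        result |= x_first - {EPS}
        if EPS not in x_first:
            return result                            # x cannot vanish, stop here
    result.add(EPS)   # Rule 1, or every xi can derive ε (end of Rule 3)
    return result

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                                   # iterate to a fixed point
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_string(alt, first)    # Rule 3
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

FIRST = compute_first(GRAMMAR)
```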
Computing follow
 To compute Follow(A) for all non terminals A, apply the
following rules until nothing can be added to any
follow set:
1. Place $ in Follow(A) where A is the start symbol
2. If there is a production S -> βAαD then everything in
First(αD) except ɛ is in Follow(A).
3. If there is a production S -> βA, or a production
S -> βAD where First(D) contains ɛ, then everything in
Follow(S) is in Follow(A)
 Example!
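As a worked example, the sketch below computes the Follow sets of the expression grammar used in these slides. The First sets are hard-coded as an assumption to keep the example short, and the function names are hypothetical.

```python
EPS, END = "ε", "$"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}
# First sets of the non-terminals, assumed to be already known.
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}

def first_of_string(symbols):
    """First set of a sequence of symbols; the empty sequence gives {ε}."""
    result = set()
    for x in symbols:
        if x == EPS:
            continue
        x_first = FIRST.get(x, {x})
        result |= x_first - {EPS}
        if EPS not in x_first:
            return result
    result.add(EPS)
    return result

def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                       # rule 1
    changed = True
    while changed:                               # repeat until nothing is added
        changed = False
        for lhs, alternatives in grammar.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if sym not in grammar:       # only non-terminals have Follow
                        continue
                    tail = first_of_string(alt[i + 1:])
                    new = tail - {EPS}           # rule 2
                    if EPS in tail:              # rule 3
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FOLLOW = compute_follow(GRAMMAR, "E")
```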
Construction of predictive
parsing table (LL(1))
 For each production A->α in the grammar do the
following:
1. For each terminal a in First(α) add A->α to M[A,a]
2. If ɛ is in First(α), then for each terminal b in
Follow(A) add A->α to M[A,b]. If ɛ is in First(α) and
$ is in Follow(A), add A->α to M[A,$] as well
 If after performing the above, there is no production
in M[A,a] then set M[A,a] to error
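The two steps above can be sketched as follows for the expression grammar used in these slides. The First and Follow sets are hard-coded as assumptions, and a missing dictionary entry plays the role of "error".

```python
EPS = "ε"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}

def first_of_string(symbols):
    """First set of a production body."""
    result = set()
    for x in symbols:
        if x == EPS:
            continue
        x_first = FIRST.get(x, {x})
        result |= x_first - {EPS}
        if EPS not in x_first:
            return result
    result.add(EPS)
    return result

def build_table(grammar):
    """Map (non-terminal, terminal) to the list of productions placed there."""
    table = {}
    for lhs, alternatives in grammar.items():
        for alt in alternatives:
            first_alpha = first_of_string(alt)
            targets = first_alpha - {EPS}        # step 1
            if EPS in first_alpha:               # step 2 ($ is part of Follow)
                targets |= FOLLOW[lhs]
            for terminal in targets:
                table.setdefault((lhs, terminal), []).append(alt)
    return table

TABLE = build_table(GRAMMAR)
is_ll1 = all(len(entries) == 1 for entries in TABLE.values())
```

A cell that ends up with more than one production signals that the grammar is not LL(1).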
Example
E -> TE’
E’ -> +TE’ | Ɛ
T -> FT’
T’ -> *FT’ | Ɛ
F -> (E) | id

Non-terminal | First   | Follow
F            | {(,id}  | {+, *, ), $}
T            | {(,id}  | {+, ), $}
E            | {(,id}  | {), $}
E’           | {+,ɛ}   | {), $}
T’           | {*,ɛ}   | {+, ), $}
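Parsing table (rows: non-terminals, columns: input symbols)
Non-terminal | id       | +          | *          | (        | )       | $
E            | E -> TE’ |            |            | E -> TE’ |         |
E’           |          | E’ -> +TE’ |            |          | E’ -> Ɛ | E’ -> Ɛ
T            | T -> FT’ |            |            | T -> FT’ |         |
T’           |          | T’ -> Ɛ    | T’ -> *FT’ |          | T’ -> Ɛ | T’ -> Ɛ
F            | F -> id  |            |            | F -> (E) |         |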
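A table like this one drives a stack-based predictive parser. The sketch below is a minimal illustration with the table entries typed in by hand; the `ll1_parse` name and the token list format are assumptions.

```python
EPS, END = "ε", "$"
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [EPS], ("E'", "$"): [EPS],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [EPS], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [EPS], ("T'", "$"): [EPS],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens, start="E"):
    """Return the productions of a leftmost derivation, or raise SyntaxError."""
    tokens = list(tokens) + [END]
    stack = [END, start]
    pos, output = 0, []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, lookahead))
            if body is None:                      # empty cell means error
                raise SyntaxError(f"no entry M[{top}, {lookahead}]")
            output.append((top, body))
            # Push the body right to left so its first symbol is on top.
            stack.extend(s for s in reversed(body) if s != EPS)
        elif top == lookahead:
            pos += 1                              # match a terminal (or $)
        else:
            raise SyntaxError(f"expected {top}, found {lookahead}")
    return output

steps = ll1_parse(["id", "+", "id", "*", "id"])
```

Each entry of `steps` is one production applied in the leftmost derivation of `id+id*id`.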


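Another example
S -> iEtSS’ | a
S’ -> eS | Ɛ
E -> b

Parsing table (rows: non-terminals, columns: input symbols)
Non-terminal | a      | b      | e                | i           | t | $
S            | S -> a |        |                  | S -> iEtSS’ |   |
S’           |        |        | S’ -> Ɛ, S’ -> eS |             |   | S’ -> Ɛ
E            |        | E -> b |                  |             |   |
The entry M[S’, e] holds two productions, so this grammar is not LL(1).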
Introduction
 A bottom-up parser constructs a parse tree for an input string
beginning at the leaves (the bottom) and working towards the
root (the top)
 Example: id*id
E -> E + T | T
T -> T * F | F
F -> (E) | id
[Figure: the parse tree for id*id built bottom-up through the forms id*id, F*id, T*id, T*F, T, E]
Shift-reduce parser
 The key decisions during bottom-up parsing are about
when to reduce and about what production to
apply
 A reduction is a reverse of a step in a derivation
 The goal of a bottom-up parser is to construct a
derivation in reverse:
 E=>T=>T*F=>T*id=>F*id=>id*id
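Reading the derivation above from right to left gives the sequence of reductions. The sketch below recovers which production is reduced at each step; it does not decide when to reduce (a real shift-reduce parser uses a parsing table for that), and the `find_reduction` helper is an assumption made for illustration.

```python
GRAMMAR = [
    ("E", ["E", "+", "T"]), ("E", ["T"]),
    ("T", ["T", "*", "F"]), ("T", ["F"]),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]

# The sentential forms of the derivation, listed in reverse order.
FORMS = [
    ["id", "*", "id"], ["F", "*", "id"], ["T", "*", "id"],
    ["T", "*", "F"], ["T"], ["E"],
]

def find_reduction(before, after):
    """Return (position, head, body) of the reduction turning `before` into `after`."""
    for head, body in GRAMMAR:
        for i in range(len(before) - len(body) + 1):
            matches_body = before[i:i + len(body)] == body
            if matches_body and before[:i] + [head] + before[i + len(body):] == after:
                return i, head, body
    return None

reductions = [find_reduction(a, b) for a, b in zip(FORMS, FORMS[1:])]
```

Each reduction replaces the body of a production by its head, which is exactly one derivation step undone.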
LR Parsing
 LR parsers are the most prevalent type of bottom-up parser
 The structure of an LR parser is similar to that of an LL(1) parser.
 It applies to unambiguous grammars.
 It uses a canonical collection of LR(0) items.
States of an LR parser
 States represent sets of items
 An LR(0) item of G is a production of G with the dot at
some position of the body:
 For A->XYZ we have following items
 A->.XYZ
 A->X.YZ
 A->XY.Z
 A->XYZ.
 In a state having A->.XYZ we hope to see a string
derivable from XYZ next on the input.
 What about A->X.YZ?
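An item can be represented as a production plus a dot position. The following sketch lists all items of one production; the tuple representation and the helper names are assumptions.

```python
def lr0_items(head, body):
    """All LR(0) items of head -> body, as (head, body, dot position) tuples."""
    return [(head, tuple(body), dot) for dot in range(len(body) + 1)]

def show(item):
    """Render an item with the dot, e.g. A->X.YZ."""
    head, body, dot = item
    return f"{head}->{''.join(body[:dot])}.{''.join(body[dot:])}"

items = [show(item) for item in lr0_items("A", "XYZ")]
```

The dot separates what has already been seen from what is still expected: in `A->X.YZ` a string derivable from X has been seen, and a string derivable from YZ is hoped for next.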
Constructing canonical LR(0)
item sets
 Augmented grammar:
 G with addition of a production: S’->S
 Closure of item sets:
 If I is a set of items, closure(I) is a set of items constructed from I by
the following rules:
 Add every item in I to closure(I)
 If A->α.Bβ is in closure(I) and B->γ is a production then add the
item B->.γ to closure(I).
 Example: I0=closure({[E’->.E]})
Grammar:
E’->E
E -> E + T | T
T -> T * F | F
F -> (E) | id
I0:
E’->.E
E->.E+T
E->.T
T->.T*F
T->.F
F->.(E)
F->.id
Constructing canonical LR(0)
item sets (cont.)
 Goto (I,X) where I is an item set and X is a grammar
symbol is closure of set of all items [A-> αX. β] where
[A-> α.X β] is in I
 Example
I0=closure({[E’->.E]}):
E’->.E
E->.E+T
E->.T
T->.T*F
T->.F
F->.(E)
F->.id
Goto(I0,E) = I1:
E’->E.
E->E.+T
Goto(I0,T) = I2:
E->T.
T->T.*F
Goto(I0,() = I4:
F->(.E)
E->.E+T
E->.T
T->.T*F
T->.F
F->.(E)
F->.id
Closure algorithm
SetOfItems CLOSURE(I) {
J=I;
repeat
for (each item A-> α.Bβ in J)
for (each production B->γ of G)
if (B->.γ is not in J)
add B->.γ to J;
until no more items are added to J on one round;
return J;
}
GOTO algorithm
SetOfItems GOTO(I,X) {
J=empty;
for (each item A-> α.X β in I)
add A-> αX. β to J;
return CLOSURE(J);
}
Canonical LR(0) items
void items(G’) {
C= { CLOSURE({[S’->.S]}) };
repeat
for (each set of items I in C)
for (each grammar symbol X)
if (GOTO(I,X) is not empty and not in C)
add GOTO(I,X) to C;
until no new sets of items are added to C on a round;
}
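The closure, goto and items procedures can be tried out on the augmented expression grammar with a short program. This is a sketch under assumptions: items are `(head, body, dot)` tuples, worklists replace the "repeat until nothing is added" loops, and the states may be discovered in a different order than the I0..I11 numbering used in the slides.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}
SYMBOLS = sorted({s for _, body in GRAMMAR for s in body})

def closure(items):
    """Add B->.γ for every item A->α.Bβ until nothing new appears."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for h, b in GRAMMAR:
                if h == body[dot] and (h, b, 0) not in result:
                    result.add((h, b, 0))
                    worklist.append((h, b, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved) if moved else frozenset()

def canonical_collection():
    """Collect all item sets reachable from closure({[E'->.E]})."""
    start = closure({("E'", ("E",), 0)})
    collection, worklist = [start], [start]
    while worklist:
        state = worklist.pop(0)
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in collection:
                collection.append(target)
                worklist.append(target)
    return collection

C = canonical_collection()
```

For this grammar the collection contains 12 item sets, and `C[0]` holds the seven items of I0.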
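Example
E’->E
E -> E + T | T
T -> T * F | F
F -> (E) | id

Item sets:
I0=closure({[E’->.E]}): E’->.E, E->.E+T, E->.T, T->.T*F, T->.F, F->.(E), F->.id
I1: E’->E., E->E.+T
I2: E->T., T->T.*F
I3: T->F.
I4: F->(.E), E->.E+T, E->.T, T->.T*F, T->.F, F->.(E), F->.id
I5: F->id.
I6: E->E+.T, T->.T*F, T->.F, F->.(E), F->.id
I7: T->T*.F, F->.(E), F->.id
I8: E->E.+T, F->(E.)
I9: E->E+T., T->T.*F
I10: T->T*F.
I11: F->(E).

Transitions:
From I0: E to I1, T to I2, F to I3, ( to I4, id to I5
From I1: + to I6, $ to acc
From I2: * to I7
From I4: E to I8, T to I2, F to I3, ( to I4, id to I5
From I6: T to I9, F to I3, ( to I4, id to I5
From I7: F to I10, ( to I4, id to I5
From I8: ) to I11, + to I6
From I9: * to I7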
THANK YOU!!
