Atcd Unit 2

Uploaded by

Sanam Durgarani
Copyright
© © All Rights Reserved
AUTOMATA THEORY AND COMPILER DESIGN

UNIT-II
Context free grammar
A context-free grammar (CFG) is a formal grammar used to
generate all possible strings of a given formal language.
A context-free grammar G is defined as a 4-tuple:
G = (V, T, P, S)
where
V is a finite set of non-terminal symbols (variables),
T is a finite set of terminal symbols,
P is a finite set of production rules of the form A → α, with A ∈ V and α ∈ (V ∪ T)*,
S ∈ V is the start symbol.

In a CFG, a string is derived from the start symbol: a non-terminal
is repeatedly replaced by the right-hand side of one of its
productions until all non-terminals have been replaced by
terminal symbols.
Example
1. Construct a CFG for the language having any number of a's over the alphabet ∑ = {a}. Check string: aaaa
2. Construct a CFG for the regular expression (0+1)*. Check string: 1010
3. Construct a CFG for the language L = {wcwR | w ∈ {a, b}*}, where wR is the reverse of w. Check string: abbcbba
4. Construct a CFG for the language L = {a^n b^2n | n ≥ 1}. Check string: aabbbb
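As a rough illustration of exercise 4, the sketch below stores one possible grammar for the language, S → aSbb | abb, and derives the check string by replacing the leftmost non-terminal at each step. The productions, the name GRAMMAR and the helper derive are assumptions made for this example, not part of the notes.

```python
# One possible CFG for L = { a^n b^(2n) | n >= 1 } (assumed productions).
# Non-terminals are the dictionary keys; every other character is a terminal.
GRAMMAR = {"S": ["aSbb", "abb"]}


def derive(start, choices):
    """Apply the chosen production (by index) to the leftmost non-terminal at each step."""
    form = start
    steps = [form]
    for index in choices:
        # Position of the leftmost non-terminal in the current sentential form.
        pos = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:pos] + GRAMMAR[form[pos]][index] + form[pos + 1:]
        steps.append(form)
    return steps


print(" => ".join(derive("S", [0, 1])))  # S => aSbb => aabbbb
```

Choosing production 0 once and then production 1 yields the check string aabbbb; choosing production 0 more often yields longer strings of the same shape.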

Context-free grammars (CFGs) are used to describe context-free languages.


A context-free grammar is a set of recursive rules used to generate patterns
of strings. Context-free grammars can describe all regular languages and
more, but they cannot describe all possible languages.
Derivation
A derivation is a sequence of sentential forms, starting from the start symbol, in which
each step applies one production rule. It shows how the input string is obtained from the grammar.

Generation of Derivation Tree


A derivation tree or parse tree is an ordered rooted tree that graphically
represents the syntactic structure of a string derived from a context-free
grammar.
Representation Technique
Root vertex − Must be labeled by the start symbol.
Interior vertex − Labeled by a non-terminal symbol.
Leaves − Labeled by a terminal symbol or ε.
If S → x1x2 …… xn is a production rule in a CFG, then the parse tree /
derivation tree has root S with children x1, x2, ……, xn in left-to-right order.
Starting with the grammar's start symbol, a top-down parser expands each
non-terminal by one of its production rules and repeats this for every
non-terminal it introduces, until only terminal symbols remain.

The parser's output is a parse tree representing the structure of the input
according to the grammar.

Sentential Form and Partial Derivation Tree
• A partial derivation tree is a sub-tree of a derivation
tree/parse tree such that, for each of its vertices, either all of
the vertex's children are in the sub-tree or none of them are.
• If a partial derivation tree contains the root S, its leaves read
from left to right form a sentential form.
Leftmost and Rightmost Derivation of a String
•Leftmost derivation − A leftmost derivation is obtained by
applying a production to the leftmost variable in each step.
•Rightmost derivation − A rightmost derivation is obtained by
applying a production to the rightmost variable in each step.
•Examples: find the leftmost and rightmost derivations of the given string.
1) X → X+X | X*X | X | a over the alphabet {a, +, *}; string: a+a*a
2) E → E + E | E - E | a | b; string: a - b + a
3) S → AB | ε, A → aB, B → Sb; string: abb
4) S → aB | bA, A → a | aS | bAA, B → b | bS | aBB; string: aabbabba
5) S → A1B, A → 0A | ε, B → 0B | 1B | ε; string: 1000101
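The difference between the two derivation orders can be sketched in a few lines for the grammar X → X+X | X*X | X | a and the string a+a*a. The names RULES, step and derivation are hypothetical helpers for this illustration; the production to apply at each step is chosen by hand.

```python
# Productions of example 1: X -> X+X | X*X | X | a (indices 0..3).
RULES = {"X": ["X+X", "X*X", "X", "a"]}


def step(form, rule_index, leftmost=True):
    """Rewrite the leftmost (or rightmost) non-terminal with the chosen production."""
    positions = [i for i, sym in enumerate(form) if sym in RULES]
    pos = positions[0] if leftmost else positions[-1]
    return form[:pos] + RULES[form[pos]][rule_index] + form[pos + 1:]


def derivation(choices, leftmost):
    """Return every sentential form produced by the given sequence of rule choices."""
    form = "X"
    forms = [form]
    for rule_index in choices:
        form = step(form, rule_index, leftmost)
        forms.append(form)
    return forms


# Leftmost:  X => X+X => a+X => a+X*X => a+a*X => a+a*a
print(" => ".join(derivation([0, 3, 1, 3, 3], leftmost=True)))
# Rightmost: X => X+X => X+X*X => X+X*a => X+a*a => a+a*a
print(" => ".join(derivation([0, 1, 3, 3, 3], leftmost=False)))
```

Both derivations end in the same string and correspond to the same parse tree; only the order in which the variables are expanded differs.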
Ambiguity in Grammar:
• A grammar is said to be ambiguous if, for some input string, there exists
more than one leftmost derivation, more than one rightmost derivation, or
more than one parse tree. If the grammar is not ambiguous, it is
called unambiguous.
• An ambiguous grammar is not good for compiler construction. No general
method can automatically detect and remove ambiguity, but an ambiguous
grammar can often be rewritten by hand as an equivalent unambiguous grammar.
1) Consider a grammar G with the production rules
E → I
E → E + E
E → E * E
E → (E)
I → ε | 0 | 1 | 2 | ... | 9
For the string "3 * 2 + 5", this grammar can generate two parse trees
(two different leftmost derivations), so it is ambiguous.
2) Check whether the given grammar G is ambiguous or not. String: "id + id - id"
E → E + E
E → E - E
E → id
3) Check whether the given grammar G is ambiguous or not. String: "aabb"
S → aSb | SS
S → ε
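One way to check such exercises is to count parse trees. The sketch below does this only for grammars of the shape E → E op E | operand, as in exercise 2; it is not a general ambiguity test, and the function name count_parse_trees is an assumption of this example. A count greater than 1 for some string shows that the grammar is ambiguous.

```python
from functools import lru_cache


def count_parse_trees(tokens, operators=("+", "-")):
    """Count parse trees of `tokens` under E -> E op E | operand."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways to derive tokens[i:j] from E.
        if j - i == 1:
            return 0 if tokens[i] in operators else 1
        total = 0
        for k in range(i + 1, j - 1):
            # Try every operator as the root of the tree for this span.
            if tokens[k] in operators:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "-", "id"]))  # 2
```

The two trees correspond to grouping the string as (id + id) - id and as id + (id - id).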
Top-down Approach −
•Starts with the starting symbol S
•Goes down to tree leaves using productions
Bottom-up Approach −
•Starts from tree leaves
•Proceeds upward to the root which is the starting symbol S
Classification of the Top-Down Parser:
•Recursive-descent parsers: Recursive-descent parsers are a type of top-down
parser that uses a set of recursive procedures to parse the input. Each
non-terminal symbol in the grammar corresponds to a procedure that
parses input for that symbol.
•Backtracking parsers: Backtracking parsers are a type of top-down parser
that can handle non-deterministic grammars. When a parsing
decision leads to a dead end, the parser can backtrack and try another
alternative. Backtracking parsers are not as efficient as other top-down
parsers because they can potentially explore many parsing paths.
•Non-backtracking parsers: Non-backtracking parsers are top-down parsers that
never revisit already-explored paths in the parse tree during the
parsing process.
•Predictive parsers:
Predictive parsers are top-down parsers that use a parsing table to predict
which production rule to apply based on the next input symbol. Predictive
parsers are also called LL parsers because they scan the input from left to right
and construct a leftmost derivation of the input string.
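A minimal sketch of a recursive-descent, predictive parser, assuming the grammar S → A1B, A → 0A | ε, B → 0B | 1B | ε: each non-terminal gets one procedure, and one symbol of lookahead selects the production, so no backtracking is needed. The class and method names (Parser, parse_S, accepts and so on) are hypothetical.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


class Parser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        # Next input symbol, or "$" at the end of the input.
        return self.text[self.pos] if self.pos < len(self.text) else "$"

    def match(self, expected):
        if self.peek() != expected:
            raise ParseError(f"expected {expected!r} at position {self.pos}")
        self.pos += 1

    def parse_S(self):  # S -> A 1 B
        self.parse_A()
        self.match("1")
        self.parse_B()

    def parse_A(self):  # A -> 0 A | epsilon
        if self.peek() == "0":
            self.match("0")
            self.parse_A()
        # Any other lookahead selects A -> epsilon.

    def parse_B(self):  # B -> 0 B | 1 B | epsilon
        if self.peek() in ("0", "1"):
            self.match(self.peek())
            self.parse_B()
        # Any other lookahead selects B -> epsilon.


def accepts(text):
    """Return True if the whole text is derivable from S."""
    parser = Parser(text)
    try:
        parser.parse_S()
    except ParseError:
        return False
    return parser.pos == len(text)


print(accepts("1000101"))  # True
print(accepts("000"))      # False
```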

Advantages of Top-Down Parsing
Top-down parsing is simple to understand and implement.
It is easy to see which action the top-down parser takes at each step.

Disadvantages of Top-Down Parsing
Top-down parsing cannot handle left recursion present in the grammar.
When using recursive descent parsing, the parser may need to backtrack when it encounters
a symbol that does not match the expected token. This can make the parsing process slower
and less efficient.
Left-Recursion
A production of a grammar is said to be left recursive if the leftmost
symbol of its RHS is the same as the variable on its LHS.
Left recursion is a problem for top-down parsers, because expanding such a
production can repeat forever without consuming any input.
Therefore, left recursion has to be eliminated from the grammar.
Left-Recursion Elimination
We can eliminate left recursion by replacing the pair of productions
A → Aα / β
with
A → βA’
A’ → αA’ / ∈
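The replacement rule can be sketched as a small function for immediate left recursion only (indirect left recursion, such as A → Ba with B → Ab, would first need substitution). The function name and the list-of-symbols representation are assumptions of this example; an empty list stands for ∈.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    `alternatives` is a list of right-hand sides, each a list of symbols.
    An empty list stands for epsilon.
    """
    # alpha parts: what follows the leading A in the left-recursive alternatives
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # beta parts: alternatives that do not start with A
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}  # nothing to do
    new_nt = nonterminal + "'"
    return {
        nonterminal: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


# A -> ABd / Aa / a
print(eliminate_immediate_left_recursion("A", [["A", "B", "d"], ["A", "a"], ["a"]]))
# {'A': [['a', "A'"]], "A'": [['B', 'd', "A'"], ['a', "A'"], []]}
```

For A → ABd / Aa / a this gives A → aA’ and A’ → BdA’ / aA’ / ∈.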
1) Remove left recursion in the given grammar
A → ABd / Aa / a
B → Be / b
2) Remove left recursion in the given grammar
A → Ba / Aa / c
B → Bb / Ab / d
3) Remove left recursion in the given grammar
A → (B) / b
B → BxA / A
Left-Factoring
Left factoring is a process by which a grammar with common prefixes
is transformed to make it suitable for top-down parsers.
Example
A → αβ1 / αβ2 / αβ3
This kind of grammar is a problem for top-down parsers: seeing only the
common prefix α, the parser cannot decide which production must be chosen to
parse the string in hand.
To remove this confusion, we use left factoring, which rewrites the productions as
A → αA’
A’ → β1 / β2 / β3
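Problem-1: Do left factoring in the following grammar-
A → aAB / aBc / aAc
Step 1: factor out the common prefix a.
A → aA’
A’ → AB / Bc / Ac
Step 2: factor out the common prefix A in the A’ productions.
A → aA’
A’ → AD / Bc
D → B / c
This is a left factored grammar.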
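Problem-2: Do left factoring in the following grammar-
S → aSSbS / aSaSb / abb / b
Step 1: factor out the common prefix a.
S → aS’ / b
S’ → SSbS / SaSb / bb
Step 2: factor out the common prefix S in the S’ productions.
S → aS’ / b
S’ → SA / bb
A → SbS / aSb
This is a left factored grammar.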
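The two-step procedure used in these problems can be sketched as follows. This is one simple variant, assumed for illustration: alternatives are grouped by their first symbol, the longest common prefix of each group is factored out, and the new non-terminals are named by appending primes (so the D of Problem-1 appears here as A''). The names common_prefix and left_factor are hypothetical.

```python
def common_prefix(sequences):
    """Longest list of symbols that every sequence starts with."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(grammar):
    """grammar maps a non-terminal to a list of alternatives (lists of symbols)."""
    result = {nt: [list(alt) for alt in alts] for nt, alts in grammar.items()}
    work = list(result)
    while work:
        nt = work.pop(0)
        # Group the alternatives by their first symbol.
        groups = {}
        for alt in result[nt]:
            groups.setdefault(alt[0] if alt else None, []).append(alt)
        new_alts = []
        for first, group in groups.items():
            if first is None or len(group) == 1:
                new_alts.extend(group)  # nothing to factor
                continue
            prefix = common_prefix(group)
            fresh = nt + "'"
            while fresh in result:  # pick an unused name
                fresh += "'"
            # The remainders after the prefix become the new non-terminal's rules.
            result[fresh] = [alt[len(prefix):] for alt in group]
            new_alts.append(prefix + [fresh])
            work.append(fresh)  # the remainders may need factoring too
        result[nt] = new_alts
    return result


problem_1 = {"A": [["a", "A", "B"], ["a", "B", "c"], ["a", "A", "c"]]}
for nt, alts in left_factor(problem_1).items():
    print(nt, "->", " / ".join(" ".join(alt) for alt in alts))
```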
LL(1) Parsing: Here the first L indicates that the input is scanned
from left to right, and the second L indicates that the parser
constructs a leftmost derivation. Finally, the 1 is the number of
look-ahead symbols, that is, how many input symbols the parser
examines when it makes a decision.
The essential conditions to check first are as follows:
1. The grammar is free from left recursion.
2. The grammar is not ambiguous.
3. The grammar is left factored, so that it is deterministic.
These conditions are necessary but not sufficient for a grammar to be
LL(1).
Algorithm to construct LL(1) Parsing Table:
Step 1: First check all the essential conditions mentioned above and go
to step 2.
Step 2: Calculate First() and Follow() for all non-terminals. To construct the
parsing table, we use these two functions:
1. First(): the set of terminal symbols that can begin the strings derived
from a variable (plus ε if the variable can derive ε).
2. Follow(): the set of terminal symbols that can appear immediately after a
variable in the process of derivation (plus $ for the end of the input).
Step 3: For each production A –> α (A derives alpha):
1. Find First(α) and for each terminal in First(α), make entry A –> α in
the table.
2. If First(α) contains ε (epsilon), then find Follow(A) and
for each terminal in Follow(A), make entry A –> α in the table.
3. If First(α) contains ε and Follow(A) contains $, then
make entry A –> α in the table for $.
In the table, the rows are the non-terminals and the columns
are the terminal symbols (including $).

All the null productions of the grammar go under the elements of the Follow
set, and the remaining productions lie under the elements
of the First set.
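The steps above can be sketched as follows for the expression grammar after left recursion has been removed (E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id). That transformed grammar and all function names are assumptions of this example; the fixed-point loops are one common way to compute the sets, not the only one.

```python
EPS = "ε"
# Assumed example grammar: the expression grammar without left recursion.
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}
START = "E"


def first_of(symbols, first):
    """FIRST of a sequence of grammar symbols."""
    out = set()
    for sym in symbols:
        if sym == EPS:
            continue
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)  # every symbol in the sequence can vanish
    return out


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:  # repeat until no set grows any more
        changed = False
        for nt, alts in GRAMMAR.items():
            for alt in alts:
                new = first_of(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add("$")
    changed = True
    while changed:
        changed = False
        for nt, alts in GRAMMAR.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in GRAMMAR:
                        continue
                    rest = first_of(alt[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:  # what follows nt also follows sym
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def build_table(first, follow):
    table = {}
    for nt, alts in GRAMMAR.items():
        for alt in alts:
            alt_first = first_of(alt, first)
            lookaheads = alt_first - {EPS}
            if EPS in alt_first:
                lookaheads |= follow[nt]
            for terminal in lookaheads:
                if (nt, terminal) in table:
                    raise ValueError(f"not LL(1): conflict at ({nt}, {terminal})")
                table[(nt, terminal)] = alt
    return table


def parse(tokens, table):
    """Table-driven LL(1) parse; returns True if the tokens are accepted."""
    stack = ["$", START]
    tokens = list(tokens) + ["$"]
    pos = 0
    while stack:
        top = stack.pop()
        if top in GRAMMAR:
            alt = table.get((top, tokens[pos]))
            if alt is None:
                return False
            stack.extend(sym for sym in reversed(alt) if sym != EPS)
        elif top == tokens[pos]:
            pos += 1
        else:
            return False
    return pos == len(tokens)


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
TABLE = build_table(FIRST, FOLLOW)
print(parse(["id", "+", "id"], TABLE))  # True
```

The ε-productions end up under the FOLLOW columns, for example TABLE[("E'", ")")] is the production E' → ε, which matches the rule stated above.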
1) Construct the LL(1) parsing table for the given grammar and verify w = id+id
E → E+T / T
T → T*F / F
F → (E) / id
2) Construct the LL(1) parsing table for the given grammar and verify w = aadd
S → W
W → ZXY / XY
Y → C / ∈
Z → a / d
X → Xb / ∈
3) Construct the LL(1) parsing table for the given grammar
S → iEtS / iEtSeS / a
E → c
Bottom-up parsing
• Bottom-up parsing starts from the leaf nodes of a tree and works in the upward
direction until it reaches the root node.
• Here, we start from a sentence and then apply production rules in reverse
in order to reach the start symbol.
• (Figure: the bottom-up parsers available.)
Shift-Reduce Parsing
Shift-reduce parsing uses two steps for bottom-up parsing.
These steps are known as the shift step and the reduce step.
•Shift step: The shift step refers to the advancement of the input
pointer to the next input symbol, which is called the shifted symbol.
This symbol is pushed onto the stack. The shifted symbol is treated as a
single node of the parse tree.
•Reduce step: When the parser finds the complete right-hand side (RHS) of a
grammar rule on top of the stack and replaces it with the left-hand side (LHS),
it is known as the reduce step. This occurs when
the top of the stack contains a handle. To reduce, a POP operation is
performed on the stack, which pops off the handle and replaces it with
the LHS non-terminal symbol.
Shift reduce parsing procedure:
Shift reduce parsing is a process of reducing a string to the start symbol
of a grammar.
Shift reduce parsing uses a stack to hold grammar symbols and an input tape
to hold the string.

•Shift reduce parsing performs two actions: shift and reduce.
That is why it is known as shift reduce parsing.
•At the shift action, the current symbol in the input string is pushed
onto the stack.
•At each reduction, the symbols on top of the stack are replaced by a
non-terminal. The symbols are the right side of a production and the
non-terminal is the left side of that production.
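A naive sketch of the shift and reduce actions, assuming the grammar E → 2E2 | 3E3 | 4 and the input 32423. It reduces whenever the top of the stack equals some right-hand side and shifts otherwise. This greedy handle detection happens to work for this grammar but is not a general shift-reduce parser (LR parsers use a table to decide). The names PRODUCTIONS and shift_reduce are hypothetical.

```python
# Assumed example grammar: E -> 2E2 | 3E3 | 4
PRODUCTIONS = [("E", ["2", "E", "2"]), ("E", ["3", "E", "3"]), ("E", ["4"])]
START = "E"


def shift_reduce(tokens):
    """Return (accepted, trace) for a greedy shift-reduce run over the tokens."""
    stack, trace = [], []
    remaining = list(tokens)
    while True:
        for lhs, rhs in PRODUCTIONS:
            if stack[-len(rhs):] == rhs:
                # Reduce: pop the handle and push the LHS non-terminal.
                del stack[-len(rhs):]
                stack.append(lhs)
                trace.append(("reduce", f"{lhs} -> {''.join(rhs)}", "".join(stack)))
                break
        else:
            if not remaining:
                break  # nothing to reduce and nothing left to shift
            # Shift: push the next input symbol onto the stack.
            stack.append(remaining.pop(0))
            trace.append(("shift", stack[-1], "".join(stack)))
    return stack == [START], trace


accepted, trace = shift_reduce("32423")
for action, detail, stack_after in trace:
    print(f"{action:6} {detail:10} stack: {stack_after}")
print("accepted:", accepted)
```

The input is accepted when the whole string has been consumed and the stack holds only the start symbol.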
Grammar

Check input string is a-(a+a)

Construct a shift reduce parser for the given grammar
E → 2E2
E → 3E3 / 4
Check input string is 32423

Construct a shift reduce parser for the given grammar
S → (L) / a
L → L,S / S
Check input string is (a,(a,a))
LR Parser
• The LR parser is a non-recursive, shift-reduce, bottom-up parser. It
handles a wide class of context-free grammars, which makes it an
efficient and widely used syntax analysis technique.
• LR parsers are also known as LR(k) parsers, where L stands for
left-to-right scanning of the input stream, R stands for the construction
of a rightmost derivation in reverse, and k denotes the number of
lookahead symbols used to make decisions.
Steps for constructing the LR(0) parsing table :
1. For the given input string write the context free grammar.
2. Write the augmented grammar.
3. Create the canonical LR(0) collection of items.
4. Draw the DFA (deterministic finite automaton) over the item sets.
5. Using the DFA, construct the parse table.
The parse table defines 2 functions: goto (columns for the non-terminals)
and action (columns for the terminals).
6. Use the stack implementation to verify the input string.
7. Construct the parse tree for the input string.
• Augmented grammar :
If G is a grammar with starting symbol S, then G’
(the augmented grammar for G) is G with a new
starting symbol S’ and the production S’ -> S. The corresponding initial item is S’ -> .S
• The purpose of this new starting production is to indicate
to the parser when it should stop parsing.
• The ‘ . ‘ in an item marks the parser's position: the part to the left of ‘ . ‘
has been read by the parser and the part to the right of ‘ . ‘ is yet to be
read.
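The canonical LR(0) collection can be sketched with the usual closure and goto operations. The example assumes the augmented grammar E' → E, E → BB, B → cB | d; an item is written as a pair (production index, dot position), and all names below are hypothetical.

```python
# Assumed augmented grammar; production 0 is the new start production E' -> E.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("B", "B")),
    ("B", ("c", "B")),
    ("B", ("d",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add B -> .gamma for every non-terminal B that appears right after a dot."""
    result = set(items)
    pending = list(items)
    while pending:
        prod, dot = pending.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    pending.append((index, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def canonical_collection():
    states = [closure({(0, 0)})]
    transitions = {}
    for state in states:  # the list grows while it is being traversed
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions


STATES, TRANSITIONS = canonical_collection()
print(len(STATES))  # 7
```

The transitions dictionary is the DFA over the item sets: entries on terminals become shift actions and entries on non-terminals become goto entries in the parse table.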
Construct an LR parsing table for the given context-free grammar –
STEP 1 – Write the context free grammar
STEP 2 – Write the augmented grammar
STEP 3 – Find LR(0) collection of items
STEP 4 – Draw the DFA
STEP 5 – Parse Table
1) Construct an LR parsing table for the given context-free grammar. Verify w = ccdd and also construct the parse tree.
E -> BB
B -> cB / d
2) Construct an LR parsing table for the given context-free grammar –
E -> E+T / T
T -> TF / F
F -> F* / a / b
SLR Parser :
SLR is simple LR. It handles the smallest class of grammars among the
one-lookahead LR parsers (SLR, LALR, canonical LR) and uses a small
number of states.
SLR is very easy to construct and is similar to LR(0) parsing. The
only difference between the SLR parser and the LR(0) parser is that in the
LR(0) parsing table, there is a chance of a shift-reduce conflict
because we enter ‘reduce’ under all terminals in a state that contains a
completed item.
We can solve this problem by entering ‘reduce’ only under the terminals in
FOLLOW of the LHS of the production in that state. The result is
called the SLR(1) parsing table.
Steps for constructing the SLR parsing table :
1. Write the augmented grammar.
2. Find the LR(0) collection of items.
3. Find FOLLOW of the LHS of each production.
4. Define 2 functions in the parsing table: goto [list of non-terminals] and
action [list of terminals].
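To see what step 3 changes, the sketch below computes FOLLOW for the grammar E → T+E | T, T → id and compares the columns that receive a reduce entry under LR(0) and under SLR. It assumes a grammar without ε-productions and without left recursion, which keeps FIRST trivial; the function names are hypothetical.

```python
# Assumed example grammar (no epsilon productions): E -> T + E | T, T -> id
GRAMMAR = {"E": [["T", "+", "E"], ["T"]], "T": [["id"]]}
START = "E"
TERMINALS = ["+", "id", "$"]


def first(symbol):
    """FIRST of one symbol; enough here because nothing derives epsilon."""
    if symbol not in GRAMMAR:
        return {symbol}
    return set().union(*(first(alt[0]) for alt in GRAMMAR[symbol]))


def follow_sets():
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, alts in GRAMMAR.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in GRAMMAR:
                        continue
                    # Next symbol's FIRST, or FOLLOW(lhs) at the end of the rule.
                    new = first(alt[i + 1]) if i + 1 < len(alt) else follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def reduce_columns(lhs, follow, slr=True):
    """Terminals under which a completed item for `lhs` gets a reduce entry."""
    return sorted(follow[lhs]) if slr else sorted(TERMINALS)


FOLLOW = follow_sets()
print(reduce_columns("E", FOLLOW, slr=False))  # ['$', '+', 'id']
print(reduce_columns("E", FOLLOW, slr=True))   # ['$']
```

In the state that contains both E → T.+E and E → T., LR(0) would put the reduce by E → T under every terminal, which clashes with the shift on +. SLR puts it only under $, because FOLLOW(E) = {$}, so the conflict disappears.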
1) Construct an SLR parsing table for the given context-free grammar. Verify w = id*id+id and also construct the parse tree.
E -> E+T / T
T -> T*F / F
F -> (E) / id
2) Construct an SLR parsing table for the given context-free grammar. Verify w = id+id+id and also construct the parse tree.
E -> T+E / T
T -> id
Advantages and drawbacks of SLR parsing
