
Chapter_2_Finite State Automata_Part_3

The document discusses Context-Free Grammar (CFG) and its significance in recognizing languages that cannot be processed by finite automata. It explains the concept of Push-Down Automata (PDA) and how CFG can generate strings for various languages, including palindromes and specific patterns of 'a's and 'b's. Additionally, it covers derivation processes, types of derivations, and the concept of ambiguous grammar.

Context Free Grammar

 We have introduced two different but equivalent
methods of describing languages, i.e. Finite
Automata and Regular Expressions.
 Regular expressions describe regular languages, and
finite automata are the machines that recognize
those languages.
 However, there is a class of languages, i.e. Non Regular
Languages, which cannot be recognized by any NFA/DFA.
 In order to recognize these languages, there is a need
to come up with a new kind of automaton.
Context Free Grammar
 This is what brings us to the concept of Push-Down
Automata (PDA).
 A PDA is a more powerful version of automata which
can recognize certain languages that are not
recognized by an NFA/DFA.
 Languages recognized by PDA are generated by a set of
rules called Context-Free Grammar(CFG) and are
known as Context-Free Languages (CFL).
 Context Free Languages are a larger class of languages
that encompasses all regular languages and some non
regular languages.
Context Free Grammar
 A Context Free Grammar is a set of rules used to
generate the strings that belong to a Context
Free Language.
 Context Free Grammars are more expressive than finite
automata: if a language L is accepted by a finite
automaton, then L can also be generated by a
context-free grammar. Beware: The converse is NOT true.
 CFG can describe certain features that have a recursive
structure, which makes them useful in a variety of
applications.
Context Free Grammar
 CFGs are studied in the fields of theoretical computer
science, compiler design, and linguistics.
 Designers of compilers and interpreters for
programming languages often start by obtaining a
grammar for the language.
 Most compilers and interpreters contain a component
called a parser that extracts the meaning of a program
prior to generating the compiled code or performing
the interpreted execution.
 So the construction of a parser becomes easier once the
CFG of the language has been described.
Context Free Grammar
 Formally, a Context Free Grammar G can be defined by the
quadruple:
G=(V, ∑, R, S)
Where, V= Alphabet of the grammar: the set of variables /
non-terminals (upper case letters) together with
the set of terminals (lower case letters)
∑=Set of terminals (lower case letters), ∑ ⊆ V
S= Start symbol, S Є V − ∑
R =Set of production rules
Each rule in R is of the form:
A→α, where A Є V − ∑ (a single non-terminal) and α Є V*
Context Free Grammar
 Let us consider a CFG as
G=(V, ∑, R, S)
where, V= {A,B,S,a,b}
∑= {a,b}
R consists of:
S → AbB
A → aA
A→ε
B → bB
B→ε
S= S is a start symbol
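The quadruple can be written down directly as a small data structure. The following is a minimal sketch, assuming single-character symbols and the empty string "" standing for ε; the class name CFG and its field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    alphabet: frozenset   # V: non-terminals and terminals together
    terminals: frozenset  # Sigma, a subset of V
    rules: tuple          # R: pairs (head, body); body "" stands for epsilon
    start: str            # S

    @property
    def nonterminals(self):
        # V - Sigma
        return self.alphabet - self.terminals

    def is_well_formed(self):
        """Check Sigma is a subset of V, S is a non-terminal,
        and every rule has a single non-terminal head and a body over V."""
        if not self.terminals <= self.alphabet:
            return False
        if self.start not in self.nonterminals:
            return False
        return all(
            head in self.nonterminals
            and all(sym in self.alphabet for sym in body)
            for head, body in self.rules
        )

# The example grammar from this slide
G = CFG(
    alphabet=frozenset("ABSab"),
    terminals=frozenset("ab"),
    rules=(("S", "AbB"), ("A", "aA"), ("A", ""), ("B", "bB"), ("B", "")),
    start="S",
)
print(G.is_well_formed())      # True
print(sorted(G.nonterminals))  # ['A', 'B', 'S']
```

A rule such as a → b would be rejected by is_well_formed, because its head is a terminal rather than a single non-terminal.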
Context Free Grammar
 The above CFG can be used to generate the strings that
belong to the language L = { a^n b b^m : m, n >= 0 }
 For example, the string w=aaabbbb can be derived from
the given grammar by using the production rules.
Context Free Grammar
The derivation of the string w=aaabbbb can be shown as:
S → AbB [ using S → AbB ]
→ aAbB [ using A → aA ]
→ aaAbB [ using A → aA ]
→ aaaAbB [ using A → aA ]
→ aaabB [ using A → ε ]
→ aaabbB [ using B → bB ]
→ aaabbbB [ using B → bB ]
→ aaabbbbB [ using B → bB ]
→ aaabbbb [ using B → ε ]
which is the required string generated by the given grammar.
Context Free Grammar
 Grammar rules for some regular expressions (here + denotes union) are:
 a*
S → aS------------- (i)
S →ε ------------------(ii)
 a+b
S → a------------- (i)
S → b------------- (ii)
 (a+b)*
S → aS------------- (i)
S → bS------------- (ii)
S →ε ------------------(iii)
Context Free Grammar
 a*.b*
S → AB------------- (i)
S →ε ------------------(ii)
A→ aA------------- (iii)
A →ε ------------------(iv)
B→ bB------------- (v)
B→ε ------------------(vi)
 a*+b*
S → A------------- (i)
S → B------------- (ii)
S →ε ------------------(iii)
A→ aA------------- (iv)
A →ε ------------------(v)
B→ bB------------- (vi)
B→ε ------------------(vii)
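One way to gain confidence in grammars like these is to enumerate every string they generate up to a small length and compare the result with the regular expression. The sketch below does this under some assumptions: symbols are single characters, "" stands for ε, and the helper names generate and matching are hypothetical. Python's re module writes union as | where the slides write +. The search only terminates for grammars, such as these, whose sentential forms cannot keep growing without adding terminals.

```python
import re
from collections import deque
from itertools import product

def generate(rules, start, max_len):
    """Return every terminal string of length <= max_len derivable from start.

    rules maps a non-terminal (one character) to a list of right-hand
    sides; the empty string "" stands for epsilon.
    """
    seen, results = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # position of the leftmost non-terminal, if any
        idx = next((i for i, c in enumerate(form) if c in rules), None)
        if idx is None:
            results.add(form)  # only terminals left
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for c in new if c not in rules)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return results

def matching(pattern, max_len, alphabet="ab"):
    """All strings over alphabet of length <= max_len matching pattern."""
    words = ("".join(p) for k in range(max_len + 1)
             for p in product(alphabet, repeat=k))
    return {w for w in words if re.fullmatch(pattern, w)}

GRAMMARS = {
    "(a|b)*": {"S": ["aS", "bS", ""]},
    "a*b*": {"S": ["AB", ""], "A": ["aA", ""], "B": ["bB", ""]},
    "a*|b*": {"S": ["A", "B", ""], "A": ["aA", ""], "B": ["bB", ""]},
}
for pattern, rules in GRAMMARS.items():
    # Prints the pattern followed by True when both sets agree
    print(pattern, generate(rules, "S", 4) == matching(pattern, 4))
```

Agreement up to length 4 is only a bounded sanity check, not a proof that the grammar and the regular expression describe the same language.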
CFG_Example_1
Write the CFG for the language of all strings of length at
least 2 over ∑ = {0,1}.
Solution,
The regular expression for the strings of length at
least 2 is:
R.E= (0+1)(0+1)(0+1)*
The rules of its grammar can be written as:
S → AAB------------- (i)
A→ 1 ------------- (ii)
A → 0 ------------------(iii)
B→ 1B------------- (iv)
B →0B ------------------(v)
B→ε ------------------(vi)
CFG_Example_1
Let G be the CFG for the given language which is given by
quadruple as:
G=(V, ∑, R, S)
where, V= {A, B, S, 0, 1} is the set of variables and terminals
∑= {0, 1} is the set of terminals
R consists of:
S → AAB------------- (i)
A→ 1 ------------- (ii)
A → 0 ------------------(iii)
B→ 1B------------- (iv)
B →0B ------------------(v)
B→ε ------------------(vi)
S= S is a start symbol
CFG_Example_2
 Write the CFG for the language of palindrome
strings over ∑ = {a,b}.

Solution,
Some palindrome strings over {a,b} are:
{a, b, aa, bb, aba, baab, bbbaabbb, aaabbaaa, ababbaba, …}

The property of a palindrome string is that the first symbol
must match the last symbol, the second symbol must match
the second last symbol, and so on.
CFG_Example_2
 Let us consider a CFG as
G=(V, ∑, R, S)
where, V= {A, a, b} is the set of variables and terminals
∑= {a,b} is the set of terminals
R consists of:
A→ aAa
A→ bAb
A→a
A→b
A→ε
S = A is the start symbol
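The recursive shape of these rules maps directly onto a recursive recognizer. The following is a minimal sketch, assuming the input is a Python string; the function name is_palindrome is hypothetical. Each branch undoes one production: the base case covers A → ε, A → a and A → b, and the recursive case strips a matching outer pair as in A → aAa or A → bAb.

```python
def is_palindrome(w: str) -> bool:
    """Decide whether w can be derived from A using
    A -> aAa | bAb | a | b | epsilon."""
    if any(c not in "ab" for c in w):
        return False  # symbol outside the terminal alphabet
    if len(w) <= 1:
        return True   # A -> epsilon, A -> a, A -> b
    if w[0] == w[-1]:
        return is_palindrome(w[1:-1])  # A -> aAa or A -> bAb
    return False

print(is_palindrome("ababbaba"))  # True
print(is_palindrome("ab"))        # False
```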
CFG_Example_3
 Construct CFG for language: { a^m b^n : m > n, n ≥ 0 }.
Solution,
Some strings of the language over {a,b} are:
{a, aa, aaa, aaab, aaaabbb, aaaaabbbb, aaaaab, …}

The property of the strings in this language is that the
number of a's is always greater than the number of b's; also
the number of b's can be zero, i.e. no occurrence of b.
CFG_Example_3
 Let us consider a CFG as
G=(V, ∑, R, S)
where, V= {S, A, a, b}
∑= {a,b}
R consists of:
S→ aS | aA
A →aAb | ε
S = S is the start symbol
CFG_Example_4
 Construct CFG for language: { a^m b^n : m, n > 0, m ≥ n }.
Solution,
Some strings of the language over {a,b} are:
{ab, aabb, aab, aaabb, aaaaabbbb, aaaabbb, aaaaab, …}

The property of the strings in this language is that the
number of a's can be equal to or greater than the number
of b's.
CFG_Example_4
 Let us consider a CFG as
G=(V, ∑, R, S)
where, V= {S, A, B, a, b}
∑= {a,b}
R consists of:
S→ AB
A→ aA| a
B→ aBb| b
S = S is the start symbol
Derivation of String from CFG
 Let G = (V, ∑, R, S) be a context free grammar. If w1, w2,
w3, …, wn are strings over V such that:
w1 → w2 → w3 → … → wn then we say that wn is
derivable from w1.
 The sequence of substitutions used to obtain a string is
called a derivation.
 We write w1 →* wn to say that w1 derives wn; the sequence of
steps that obtains wn from w1 is a derivation of wn from w1.
 A derivation of a string w in a given grammar G is a
sequence of substitutions starting with the start
symbol and resulting in w.
Derivation of String from CFG
 Language of CFG (L(G)):
 If G = (V, ∑, R, S) is a CFG, then the language of G,
denoted by L(G), is the set of terminal strings that have
derivations from the start symbol.
i.e. L(G) = { w ϵ ∑* : S →* w }
Derivation of String from CFG
 There are two types of derivation:
 Left Most Derivation (LMD)
 Right Most Derivation (RMD)
 In the leftmost derivation (LMD), at each step the
production rule for the leftmost non-terminal is used
whereas in rightmost derivation (RMD) the
production rule for the rightmost non-terminal is
used.
Derivation of String from CFG
Consider a grammar G as:
G=(V, ∑, R, S)
where, V= {S, A, B, a, b} is the set of variables and terminals
∑= {a, b} is the set of terminals
R consists of:
S → aAS------------- (i)
S→ aS ------------- (ii)
S → b ------------------(iii)
A→ bB------------- (iv)
A →a ------------------(v)
B→ aA ------------------(vi)
B→ b ------------------(vii)
S= S is a start symbol
Derivation of String from CFG
Let us derive the string w=abaaab using different
derivation.
Left Most Derivation (LMD)
S → aAS------------rule (i)
→ abBS------------rule (iv)
→ abaAS------------rule (vi)
→ abaaS------------rule (v)
→ abaaaS------------rule (ii)
→ abaaab------------rule (iii)
Derivation of String from CFG
Right Most Derivation (RMD)
S → aAS------------rule (i)
→ aAaS------------rule (ii)
→ aAab------------rule (iii)
→ abBab------------rule (iv)
→ abaAab------------rule (vi)
→ abaaab------------rule (v)
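Both derivations can be replayed mechanically. The sketch below assumes upper case letters are non-terminals and keys the rules by the numbering (i) to (vii) used above; the helper name derive is hypothetical. At every step it rewrites the leftmost or the rightmost non-terminal and refuses a rule whose head does not match that non-terminal.

```python
RULES = {
    "i": ("S", "aAS"), "ii": ("S", "aS"), "iii": ("S", "b"),
    "iv": ("A", "bB"), "v": ("A", "a"),
    "vi": ("B", "aA"), "vii": ("B", "b"),
}

def derive(start, rule_names, leftmost=True):
    """Apply the named rules in order, always rewriting the leftmost
    (or rightmost) non-terminal; return every sentential form."""
    form, steps = start, [start]
    for name in rule_names:
        head, body = RULES[name]
        positions = [i for i, c in enumerate(form) if c.isupper()]
        pos = positions[0] if leftmost else positions[-1]
        if form[pos] != head:
            raise ValueError(f"rule ({name}) does not apply to {form!r}")
        form = form[:pos] + body + form[pos + 1:]
        steps.append(form)
    return steps

lmd = derive("S", ["i", "iv", "vi", "v", "ii", "iii"], leftmost=True)
rmd = derive("S", ["i", "ii", "iii", "iv", "vi", "v"], leftmost=False)
print(" → ".join(lmd))  # S → aAS → abBS → abaAS → abaaS → abaaaS → abaaab
print(" → ".join(rmd))  # S → aAS → aAaS → aAab → abBab → abaAab → abaaab
```

The two derivations use the same rules the same number of times; only the order in which the non-terminals are expanded differs.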
Derivation Tree/ Parse Tree
 A derivation tree is a pictorial description of the
derivation of a string using the production rules
defined by the grammar.
 It is a hierarchical representation of a derivation that
shows how the start symbol of a grammar derives a
string in the language.
 For a CFG, G = (V, ∑, R, S), a parse tree has the following
properties:
 The root is labeled by the start symbol (S).
 Each interior node of the parse tree is labeled by a variable (an element of V − ∑).
Derivation Tree/ Parse Tree
 Each leaf node is labeled by terminal symbol ( ∑ ) or ϵ .
 If an interior node is labeled with non-terminal A and its
children are x1, x2, …, xn from left to right, then there is a
production P in R of the form:
A → x1 x2 … xn, where each xi Є V
 The left hand side of a production rule is the root node at
each level, and the right hand side of the production rule
is divided into multiple branches.
Derivation Tree/ Parse Tree
Consider a production rule
S → X1 X2 X3
X2→ aX3
Its derivation tree can be drawn as:
S
├── X1
├── X2
│   ├── a
│   └── X3
└── X3
Derivation Tree/ Parse Tree
Consider a grammar G as:
G=(V, ∑, R, S)
where, V= {S, A, B, a, b} is the set of variables and terminals
∑= {a, b} is the set of terminals
R consists of:
S → aAS------------- (i)
S→ aS ------------- (ii)
S → b ------------------(iii)
A→ bB------------- (iv)
A →a ------------------(v)
B→ aA ------------------(vi)
B→ b ------------------(vii)
S= S is a start symbol
Derive the string w= abaaab using LMD and draw its parse tree.
Derivation Tree/ Parse Tree
Left Most Derivation (LMD)
S → aAS------------rule (i)
→ abBS------------rule (iv)
→ abaAS------------rule (vi)
→ abaaS------------rule (v)
→ abaaaS------------rule (ii)
→ abaaab------------rule (iii)
Derivation Tree/ Parse Tree
Parse Tree for derivation
S
├── a
├── A
│   ├── b
│   └── B
│       ├── a
│       └── A
│           └── a
└── S
    ├── a
    └── S
        └── b
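The parse tree for abaaab can also be held as nested data and checked against the properties listed earlier. This is a minimal sketch, assuming each node is a pair (label, children) and a leaf has an empty child list; the names TREE, tree_yield and uses_only_rules are hypothetical.

```python
# node = (label, [children]); a leaf has an empty child list
TREE = ("S", [
    ("a", []),
    ("A", [("b", []), ("B", [("a", []), ("A", [("a", [])])])]),
    ("S", [("a", []), ("S", [("b", [])])]),
])

RULES = {
    ("S", "aAS"), ("S", "aS"), ("S", "b"),
    ("A", "bB"), ("A", "a"),
    ("B", "aA"), ("B", "b"),
}

def tree_yield(node):
    """Concatenate the leaf labels from left to right."""
    label, children = node
    if not children:
        return label
    return "".join(tree_yield(child) for child in children)

def uses_only_rules(node):
    """Check that every interior node and its children form a production."""
    label, children = node
    if not children:
        return True
    body = "".join(child[0] for child in children)
    return (label, body) in RULES and all(uses_only_rules(c) for c in children)

print(tree_yield(TREE))       # abaaab
print(uses_only_rules(TREE))  # True
```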
Derivation Tree/ Parse Tree
Consider a grammar G as:
G=(V, ∑, R, S)
where, V= {S, A, B, a, b} is the set of variables and terminals
∑= {a, b} is the set of terminals
R consists of:
S → aA | bB
A→ b | bS | aAA
B → a | aS | bBB
S= S is a start symbol
For the string bbaababa, find its LMD and RMD and draw
the parse tree.
Derivation Tree/ Parse Tree
Left Most Derivation (LMD)
S → bB------------[ using S → bB ]
→ bbBB------------[ using B → bBB ]
→ bbaB------------[ using B → a]
→ bbaaS------------[ using B → aS ]
→ bbaabB------------[ using S → bB ]
→ bbaabaS------------[ using B → aS ]
→ bbaababB------------[ using S → bB ]
→ bbaababa------------[ using B → a ]
Derivation Tree/ Parse Tree
Right Most Derivation (RMD)
S → bB------------[ using S → bB ]
→ bbBB------------[ using B → bBB ]
→ bbBaS------------[ using B → aS]
→ bbBabB------------[ using S → bB ]
→ bbBabaS------------[ using B → aS ]
→ bbBababB------------[ using S → bB ]
→ bbBababa------------[ using B → a ]
→ bbaababa------------[ using B → a ]
Derivation Tree/ Parse Tree
Parse Tree of Derivation
S
├── b
└── B
    ├── b
    ├── B
    │   └── a
    └── B
        ├── a
        └── S
            ├── b
            └── B
                ├── a
                └── S
                    ├── b
                    └── B
                        └── a
Ambiguous Grammar
 A grammar G = (V, ∑, R, S) is said to be ambiguous if
there is a string w Є L(G) for which we can construct two
or more distinct derivation trees rooted at S and
yielding the string w.
 In other words, if some string has more than one leftmost
derivation (equivalently, more than one rightmost
derivation) in a grammar, then the grammar is ambiguous.
Ambiguous Grammar
 Let G = (V, ∑, R, S) be a CFG with the production rule of
the form:
S → AB | aaB
A → a | Aa
B→b
Now, for the string w=aab, we have two leftmost derivations:
First derivation:
S → AB
→ AaB
→ aaB
→ aab
Second derivation:
S → aaB
→ aab
Ambiguous Grammar
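 The derivation trees can be:
First tree (using S → AB):
S
├── A
│   ├── A
│   │   └── a
│   └── a
└── B
    └── b
Second tree (using S → aaB):
S
├── a
├── a
└── B
    └── b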
 Since there are two parse trees for the same string w=aab,
this grammar is ambiguous.
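Ambiguity for a particular string can be checked by counting its parse trees. The sketch below is one possible way to do this, not a general parser: it assumes the grammar has no ε rules and no unit rules (true for the grammar S → AB | aaB, A → a | Aa, B → b), so every symbol on a right-hand side must cover at least one character. The names RULES and count_trees are hypothetical.

```python
from functools import lru_cache

RULES = {
    "S": [("A", "B"), ("a", "a", "B")],
    "A": [("a",), ("A", "a")],
    "B": [("b",)],
}

def count_trees(w, start="S"):
    """Count the distinct parse trees of w rooted at start."""

    @lru_cache(maxsize=None)
    def count(sym, i, j):
        # number of trees with root sym whose yield is w[i:j]
        if sym not in RULES:  # terminal symbol
            return 1 if w[i:j] == sym else 0
        return sum(count_seq(rhs, i, j) for rhs in RULES[sym])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # number of ways the symbols in rhs jointly derive w[i:j]
        if not rhs:
            return 1 if i == j else 0
        head, rest = rhs[0], rhs[1:]
        total = 0
        # head takes w[i:k]; every remaining symbol needs >= 1 character
        for k in range(i + 1, j - len(rest) + 1):
            total += count(head, i, k) * count_seq(rest, k, j)
        return total

    return count(start, 0, len(w))

print(count_trees("aab"))  # 2
print(count_trees("ab"))   # 1
```

A count greater than 1 for any single string is enough to call the grammar ambiguous; a count of 1 for the strings tried does not prove it unambiguous.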
Closure Properties of CFG
 Context Free Languages are closed under
 Union
 Concatenation
 Kleene Star
 That is, applying any of these operations to context free
languages again gives a context free language.
 To prove each closure property, we construct a grammar for
the resulting language by redefining the four components
of the grammar tuple.
Closure Properties of CFG (Union)
 Let G1 = (V1, ∑1 , R1, S1) and G2 = (V2, ∑2 , R2, S2) be two
context free grammars.
 Let us consider that they have disjoint set of non terminals.
 To show that the union of two CFLs is also a CFL, we need to
define a new start symbol, say S, which is not in G1 or G2.
 Then construct a new grammar G = (V, ∑ , R, S) where ,
V = V1 ∪ V2 ∪ {S}
∑ = ∑1 ∪ ∑2
R = R1 ∪ R2 ∪ {S →S1 | S2}
 Here G is clearly a context free grammar because two new
rules added are also of correct form.
Closure Properties of CFG (Union)
 Now , we can claim that L(G) = L(G1) ∪ L(G2)
 It is because of the rule S →S1 | S2.
 If we want to generate the string as per the rule of G1
we can use the rule S→S1 and using the production
rule of R1 it can generate the string that belongs to
L(G1).
 Again if we want to generate the string as per the rule
of G2 we can use the rule S→S2 and using the
production rule of R2 it can generate the string that
belongs to L(G2).
 So, context free languages are closed under union.
Closure Properties of CFG (Concatenation)
 Let G1 = (V1, ∑1 , R1, S1) and G2 = (V2, ∑2 , R2, S2) be two
context free grammars.
 Let us consider that they have disjoint set of non terminals.
 To show that the concatenation of two CFLs is also a CFL, we
need to define a new start symbol, say S, which is not in G1
or G2.
 Then construct a new grammar G = (V, ∑ , R, S) where ,
V = V1 ∪ V2 ∪ {S}
∑ = ∑1 ∪ ∑2
R = R1 ∪ R2 ∪ {S →S1S2}
 Here G is clearly a context free grammar because new rule
added is also of correct form.
Closure Properties of CFG (Concatenation)
 Here, if S1→*w1 and S2→*w2 then we can claim
S →* w1w2 as S →S1S2
 Now , we can claim that L(G) = L(G1).L(G2)
 The string w1w2 can be generated by first applying the
rule S →S1S2 and then deriving w1 from S1 and w2
from S2.
 The symbol S1 generates the string that belongs to
L(G1) and S2 generates the string that belongs to
L(G2).
 So, context free languages are closed under
concatenation.
Closure Properties of CFG (Kleene Star)
 Let G1 = (V1, ∑1 , R1, S1) be a context free grammar.
 To show that the Kleene star of a CFL is also a CFL, we need to
define a new start symbol, say S, which is not in G1.
 Then construct a new grammar G = (V, ∑ , R, S) where,
V = V1 ∪ {S}
∑ = ∑1
R = R1 ∪ {S → SS1 | ε}
 Here G is clearly a context free grammar because the new
rules added are also of the correct form.
Closure Properties of CFG (Kleene Star)
 Here, if S1 →* w1, S1 →* w2, …, S1 →* wk, then we can claim
S →* w1w2…wk by using S→SS1 k times followed by S→ ε.
 Now, we can claim that L(G) = L(G1)*
 To generate the string w1w2…wk, we use S→SS1 repeatedly,
once for each wi, then S→ ε, and after that the rules of
R1 to derive each wi from its S1.
 Each symbol S1 generates a string that belongs to
L(G1).
 So, context free languages are closed under Kleene star.
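The three constructions only add one or two rules for a new start symbol, which is easy to see when they are written as functions. The following is a minimal sketch under stated assumptions: a grammar is a pair (rules, start), rules maps each non-terminal name to a list of bodies, a body is a list of symbol names, and the empty list stands for ε. The function names union, concat and star and the sample grammars G1 and G2 are hypothetical.

```python
def _check_fresh(new_start, *grammars):
    """The new start symbol must be unused; two grammars must not
    share non-terminals."""
    heads = [set(rules) for rules, _ in grammars]
    assert all(new_start not in h for h in heads), "start symbol must be new"
    if len(heads) == 2:
        assert heads[0].isdisjoint(heads[1]), "non-terminals must be disjoint"

def union(g1, g2, new_start="S"):
    _check_fresh(new_start, g1, g2)
    # R = R1 ∪ R2 ∪ {S → S1 | S2}
    return {**g1[0], **g2[0], new_start: [[g1[1]], [g2[1]]]}, new_start

def concat(g1, g2, new_start="S"):
    _check_fresh(new_start, g1, g2)
    # R = R1 ∪ R2 ∪ {S → S1 S2}
    return {**g1[0], **g2[0], new_start: [[g1[1], g2[1]]]}, new_start

def star(g1, new_start="S"):
    _check_fresh(new_start, g1)
    # R = R1 ∪ {S → S S1 | ε}
    return {**g1[0], new_start: [[new_start, g1[1]], []]}, new_start

G1 = ({"S1": [["a", "S1", "b"], ["a", "b"]]}, "S1")  # a^n b^n, n >= 1
G2 = ({"S2": [["c", "S2"], ["c"]]}, "S2")            # c^n, n >= 1

print(union(G1, G2)[0]["S"])   # [['S1'], ['S2']]
print(concat(G1, G2)[0]["S"])  # [['S1', 'S2']]
print(star(G1)[0]["S"])        # [['S', 'S1'], []]
```

In every case the rules of the original grammars are carried over unchanged, which is why the result is again a context free grammar.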
Closure Properties of CFG
 The Context Free Language are not closed under
Intersection and Complement.
 This property of CFL can be shown by using the
Pumping Lemma.
 We know that L1 = { 0^n 1^n 2^n | n >= 1 } is not a CFL (use
the Pumping Lemma).
 However, L2 = { 0^n 1^n 2^i | n >= 1, i >= 1 } is a CFL, and its
CFG is:
S → AB
A → 0A1 | 01
B → 2B | 2
Closure Properties of CFG
 Similarly, L3 = { 0^i 1^n 2^n | n >= 1, i >= 1 } is a CFL, and its CFG is:
S → AB
A → 0A | 0
B → 1B2 | 12
But L1 = L2 ∩ L3 is NOT a CFL.
So, context free languages are not closed under
intersection.
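The identity L1 = L2 ∩ L3 can be sanity-checked on short strings. The sketch below tests membership directly from the set definitions rather than from the grammars; the function names in_L1, in_L2 and in_L3 are hypothetical, and checking all strings up to length 7 is only a bounded test, not a proof.

```python
import re
from itertools import product

def in_L2(w):
    # 0^n 1^n 2^i with n, i >= 1
    m = re.fullmatch(r"(0+)(1+)2+", w)
    return bool(m) and len(m.group(1)) == len(m.group(2))

def in_L3(w):
    # 0^i 1^n 2^n with n, i >= 1
    m = re.fullmatch(r"0+(1+)(2+)", w)
    return bool(m) and len(m.group(1)) == len(m.group(2))

def in_L1(w):
    # 0^n 1^n 2^n with n >= 1
    m = re.fullmatch(r"(0+)(1+)(2+)", w)
    return bool(m) and len(m.group(1)) == len(m.group(2)) == len(m.group(3))

# every string over {0, 1, 2} of length 0 to 7
words = ["".join(p) for k in range(8) for p in product("012", repeat=k)]
print(all((in_L2(w) and in_L3(w)) == in_L1(w) for w in words))  # True
print(sorted(w for w in words if in_L1(w)))  # ['001122', '012']
```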
Closure Properties of CFG
 CFLs are NOT closed under Complement.
 Suppose that for every CFL L, its complement L^c were also a CFL.
By De Morgan's law, for any two CFLs L1 and L2: L1 ∩ L2 = (L1^c ∪ L2^c)^c
Since CFLs are also closed under union, it would follow that
the CFLs are closed under intersection.
However, we know CFLs are not closed under intersection.
Hence CFLs are not closed under complement.
Chomsky Hierarchy of Grammar

Assignment:
• Go through the production rule of each grammar
End of Chapter 2
Thank You !!!!
