CH 3

This document discusses context-free grammars and languages. It defines context-free grammars, derivations, sentential forms, and languages of grammars. It also discusses leftmost and rightmost derivations, ambiguity in grammars, and Chomsky normal form.

Uploaded by

ephremweleba
Chapter 3

Context-Free Grammars
Context-Free Grammars and Languages
• Defn. 3.1.1 A context-free grammar is a quadruple (V, Σ, P, S), where
  • V is a finite set of variables (non-terminals)
  • Σ, the alphabet, is a finite set of terminal symbols
  • P is a finite set of rules, each an element of V × (V ∪ Σ)*, and
  • S ∈ V is the start symbol

• A production rule of the form A → w, where w ∈ (V ∪ Σ)*, applied to the string uAv yields uwv; u and v define the context in which A occurs.
• Because the context places no limitations on the applicability of a rule, such a grammar is called a context-free grammar (CFG).
• Defn. 3.1.2. Let G = (V, Σ, P, S) be a CFG and v ∈ (V ∪ Σ)*. The set of strings derivable from v is defined recursively as follows:
  i) Basis: v is derivable from v
  ii) Recursion: If u = xAy is derivable from v and A → w ∈ P, then xwy is derivable from v
  iii) Closure: Exactly the strings constructed from v by a finite number of applications of (ii) are derivable from v

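• The derivability of w ∈ (V ∪ Σ)* from v ∈ (V ∪ Σ)+ is denoted v ⇒*_G w (the subscript G names the grammar and is dropped when G is clear), with the variants v ⇒+ w (one or more rule applications), v ⇒^n w (exactly n rule applications), and v ⇒* w (zero or more rule applications).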

• The language of the grammar G is the set of terminal strings derivable from the start symbol of G
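The recursive definition of derivability can be explored mechanically. The following is a minimal sketch, assuming one character per symbol and λ written as the empty string; the grammar S → aSb | λ and the names one_step and derivable are illustrative assumptions, not part of the slides.

```python
def one_step(form, rules):
    """All strings obtained from `form` by one rule application (xAy yields xwy)."""
    results = set()
    for i, symbol in enumerate(form):
        for rhs in rules.get(symbol, ()):
            results.add(form[:i] + rhs + form[i + 1:])
    return results


def derivable(v, rules, max_steps):
    """Strings derivable from v with at most max_steps rule applications."""
    reached = {v}            # basis: v is derivable from v
    frontier = {v}
    for _ in range(max_steps):
        # recursion: apply one more rule to everything found in the last round
        frontier = {w for u in frontier for w in one_step(u, rules)} - reached
        reached |= frontier
    return reached


# Illustrative grammar (an assumption, not from the slides): S -> aSb | λ,
# with λ written as the empty string and one character per symbol.
RULES = {"S": ["aSb", ""]}

forms = derivable("S", RULES, 3)
# Keep only the terminal strings, i.e. those containing no variable.
sentences = {w for w in forms if not any(c in RULES for c in w)}
print(sorted(sentences, key=len))
```

The set forms holds every string reached within three steps, with or without variables; filtering out the strings that still contain a variable leaves the part of the language reachable within that bound.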
CFG and Languages
• Defn. 3.1.3. Let G = (V, Σ, P, S) be a CFG
  (i) A string w ∈ (V ∪ Σ)* is a sentential form of G if S ⇒*_G w
  (ii) A string w ∈ Σ* is a sentence of G if S ⇒*_G w
  (iii) The language of G, denoted L(G), is the set { w ∈ Σ* | S ⇒*_G w }
• A set of strings L over an alphabet Σ is called a context-free language (CFL) if there is a CFG that generates L

• Leftmost (rightmost) derivation: a derivation in which every rule application rewrites the first variable found when the string is scanned left-to-right (right-to-left)
  e.g., Fig. 3.1(a) and (b) exhibit leftmost derivations, whereas Fig. 3.1(c) shows a rightmost derivation

• The derivation of a string can be graphically depicted by a derivation/parse tree
• Design CFGs for the following languages:
  (i) The set { 0^n 1^n | n ≥ 0 }.
  (ii) The set { a^i b^j c^k | i ≠ j or j ≠ k }, i.e., the set of strings of a's followed by b's followed by c's such that there are either a different number of a's and b's or a different number of b's and c's, or both.

• Given the following grammar:
    S → A1B
    A → 0A | λ
    B → 0B | 1B | λ
  give the leftmost and rightmost derivations of the string 00101.
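One way to check an answer to this exercise is to replay a chosen sequence of right-hand sides mechanically. The sketch below assumes one character per symbol and λ written as the empty string; the helper derive is a hypothetical name, and the two rule sequences are one worked answer rather than part of the slides.

```python
# Grammar from the exercise; λ is written as the empty string.
RULES = {"S": ["A1B"], "A": ["0A", ""], "B": ["0B", "1B", ""]}


def derive(start, rhs_choices, leftmost=True):
    """Rewrite the leftmost (or rightmost) variable with each chosen right-hand side."""
    form, steps = start, [start]
    for rhs in rhs_choices:
        positions = [i for i, s in enumerate(form) if s in RULES]
        i = positions[0] if leftmost else positions[-1]
        if rhs not in RULES[form[i]]:
            raise ValueError(f"{form[i]} -> {rhs!r} is not a rule")
        form = form[:i] + rhs + form[i + 1:]
        steps.append(form)
    return steps


left = derive("S", ["A1B", "0A", "0A", "", "0B", "1B", ""], leftmost=True)
right = derive("S", ["A1B", "0B", "1B", "", "0A", "0A", ""], leftmost=False)
print(" => ".join(left))    # S => A1B => 0A1B => 00A1B => 001B => 0010B => 00101B => 00101
print(" => ".join(right))   # S => A1B => A10B => A101B => A101 => 0A101 => 00A101 => 00101
```

Both derivations use the same rules the same number of times; only the order in which the variables are rewritten differs.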

• Defn. 3.1.4. Let G = (V, Σ, P, S) be a CFG and S ⇒*_G w a derivation. The derivation tree, DT, of S ⇒*_G w is an ordered tree that can be built iteratively as follows:
  (i) Initialize the tree T with root S
  (ii) If A → x1 ... xn, where xi ∈ (V ∪ Σ), is a rule in the derivation applied to uAv, then add x1 ... xn as the children of A in T
  (iii) If A → λ is a rule in the derivation applied to uAv, then add λ as the only child of A in T
  e.g., Fig. 3.2 for Fig. 3.1(a): S ⇒ AA ⇒ aA ⇒ aAAA ⇒ abAAA ⇒ abaAA ⇒ ababAA ⇒ ababaA ⇒ ababaa
  Fig. 3.3 for Fig. 3.1(a)...(d)
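The iterative construction can be sketched in code. This is a minimal illustration, assuming one character per symbol and a leftmost derivation; the rules in STEPS are read off the derivation S ⇒ AA ⇒ ... ⇒ ababaa quoted above, and the dictionary-based node layout and function names are assumptions made for this example.

```python
VARIABLES = {"S", "A"}


def leftmost_open(node):
    """Leftmost leaf that is a variable and has not been expanded yet."""
    if not node["children"]:
        return node if node["symbol"] in VARIABLES else None
    for child in node["children"]:
        found = leftmost_open(child)
        if found is not None:
            return found
    return None


def build_tree(start, rule_sequence):
    """Build the derivation tree of a leftmost derivation given as (lhs, rhs) pairs."""
    root = {"symbol": start, "children": []}          # step (i)
    for lhs, rhs in rule_sequence:
        node = leftmost_open(root)
        if node is None or node["symbol"] != lhs:
            raise ValueError(f"rule {lhs} -> {rhs} does not apply")
        symbols = rhs if rhs else "λ"                 # step (iii): λ is the only child
        node["children"] = [{"symbol": s, "children": []} for s in symbols]  # step (ii)
    return root


def frontier(node):
    """Concatenate the leaves from left to right, skipping λ."""
    if not node["children"]:
        return "" if node["symbol"] == "λ" else node["symbol"]
    return "".join(frontier(child) for child in node["children"])


# Rules applied, in order, in the leftmost derivation of ababaa.
STEPS = [("S", "AA"), ("A", "a"), ("A", "AAA"), ("A", "bA"),
         ("A", "a"), ("A", "bA"), ("A", "a"), ("A", "a")]
tree = build_tree("S", STEPS)
print(frontier(tree))   # ababaa
```

The frontier of the finished tree, read left to right, is the derived string.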

• Example. Let G be the CFG .. P = S → zMNz, M → aMa | z, N → bNb | z, which generates strings of the form z a^n z a^n b^m z b^m z, where n, m ≥ 0
3.2 Examples of Context-Free Grammar (CFG)
• Many CFGs are unions of simpler CFGs: the rules of the individual grammars, with start symbols S1, S2, ..., Sn, are put together under a new start symbol S with the rule
    S → S1 | S2 | ... | Sn

• Example. Consider the language { 0^n 1^n | n ≥ 0 } ∪ { 1^n 0^n | n ≥ 0 }
  Step 1. Construct the CFG for the language { 0^n 1^n | n ≥ 0 }
    S1 → 0 S1 1 | λ
  Step 2. Construct the CFG for the language { 1^n 0^n | n ≥ 0 }
    S2 → 1 S2 0 | λ
  Step 3. Construct the CFG for the language { 0^n 1^n | n ≥ 0 } ∪ { 1^n 0^n | n ≥ 0 }
    S → S1 | S2
    S1 → 0 S1 1 | λ
    S2 → 1 S2 0 | λ
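The union construction in Steps 1 to 3 can be sketched as follows. The representation (right-hand sides as tuples of symbols, λ as the empty tuple) and the helper names union_grammar and sentences are assumptions for this illustration; the enumerator terminates for this grammar but is not meant as a general-purpose parser.

```python
def union_grammar(start, parts):
    """Combine grammars with disjoint variable names under a new start symbol.

    parts is a list of (start_symbol, rules); rules maps a variable to a list
    of right-hand sides, each a tuple of symbols.
    """
    rules = {start: [(s,) for s, _ in parts]}    # S -> S1 | S2 | ... | Sn
    for _, part_rules in parts:
        rules.update(part_rules)
    return rules


def sentences(rules, start, max_len):
    """Terminal strings of length <= max_len, found by expanding the leftmost variable."""
    found, seen, stack = set(), set(), [(start,)]
    while stack:
        form = stack.pop()
        i = next((k for k, s in enumerate(form) if s in rules), None)
        if i is None:                            # no variable left: a sentence
            found.add("".join(form))
            continue
        for rhs in rules[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            terminals = sum(1 for s in new if s not in rules)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                stack.append(new)
    return found


G1 = ("S1", {"S1": [("0", "S1", "1"), ()]})      # Step 1
G2 = ("S2", {"S2": [("1", "S2", "0"), ()]})      # Step 2
G = union_grammar("S", [G1, G2])                 # Step 3
print(sorted(sentences(G, "S", 4), key=len))
```

Up to length 4 the combined grammar yields exactly the strings of the two component languages: λ, 01, 10, 0011 and 1100.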
• Example. Consider the following grammar:
    S → aSa | bSb | a | b | λ
  where the rules S → aSa | bSb capture the recursive generation process, and the grammar generates the set of palindromes over {a, b}

• Example. Consider a CFG that generates the language of strings over {a, b} with an even number of a's and an even number of b's:
    S → aB | bA | λ    {S: even a's and even b's}
    A → aC | bS        {A: even a's and odd b's}
    B → aS | bC        {B: odd a's and even b's}
    C → aA | bB        {C: odd a's and odd b's}
• Example. Same as above, except odd a's and odd b's:
    S → aB | bA
    A → aC | bS
    B → aS | bC
    C → aA | bB | λ
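In both grammars every rule has the form X → cY or X → λ, and each variable has exactly one rule per terminal, so the derivation of a given string is forced symbol by symbol. A minimal sketch that follows the rules like a state machine and compares the result with a direct parity count is shown below; the table DELTA and the function generates are names chosen for this illustration.

```python
from itertools import product

# Read off the rules: S -> aB means "in S, emitting a moves to B", and so on.
DELTA = {
    ("S", "a"): "B", ("S", "b"): "A",
    ("A", "a"): "C", ("A", "b"): "S",
    ("B", "a"): "S", ("B", "b"): "C",
    ("C", "a"): "A", ("C", "b"): "B",
}


def generates(word, final_variable):
    """True if S =>* word in the grammar whose only λ-rule belongs to final_variable."""
    variable = "S"
    for ch in word:
        variable = DELTA[(variable, ch)]
    return variable == final_variable


# Compare with a direct parity count for every string of length at most 6.
for n in range(7):
    for letters in product("ab", repeat=n):
        word = "".join(letters)
        a_odd, b_odd = word.count("a") % 2, word.count("b") % 2
        assert generates(word, "S") == (a_odd == 0 and b_odd == 0)   # first grammar
        assert generates(word, "C") == (a_odd == 1 and b_odd == 1)   # second grammar
```

The only difference between the two grammars is which variable carries the λ-rule, which is why a single transition table serves both checks.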
4.5 Chomsky Normal Form
• A simplified normal form which restricts the length and composition of the right-hand side (R.H.S.) of a rule in a CFG
• Defn 4.5.1. A CFG G = (V, Σ, P, S) is in Chomsky normal form if each rule in G has one of the following forms:
  i) A → BC
  ii) A → a
  iii) S → λ
  where A, B, C, S ∈ V, B, C ∈ V − { S }, and a ∈ Σ
• The derivation tree for a string generated by a CFG in Chomsky normal form is a binary tree
• Theorem 4.5.2. Let G = (V, Σ, P, S) be a CFG. There is an algorithm to construct a grammar G' = (V', Σ', P', S') in Chomsky normal form that is equivalent to G
  Proof (sketch):
  (i) For each rule A → w, where |w| > 1, replace each terminal symbol a occurring in w by a distinct variable Y and create a new rule Y → a.
  (ii) For each modified rule X → w, w is either a terminal or a string in V+. Rules of the latter form must be broken into a sequence of rules, each of whose R.H.S. consists of two variables.
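The two steps of the proof sketch can be illustrated in code. The following is a minimal sketch under stated assumptions: the input grammar has no λ-rules and no chain rules A → B, rules are given as (lhs, rhs) pairs with rhs a tuple of symbols, and the new variable names T_a and X1, X2, ... as well as the input grammar are hypothetical choices made for this example.

```python
def to_cnf(rules, variables):
    """Apply steps (i) and (ii) to a list of (lhs, rhs) rules, rhs a tuple of symbols.

    Assumes the grammar has no λ-rules and no chain rules A -> B, so every
    right-hand side is a single terminal or has length at least 2.
    """
    result, terminal_var, fresh_count = [], {}, 0
    for lhs, rhs in rules:
        if len(rhs) == 1:
            result.append((lhs, rhs))            # already of the form A -> a
            continue
        # Step (i): replace each terminal a by a variable T_a with rule T_a -> a.
        symbols = []
        for s in rhs:
            if s not in variables:
                if s not in terminal_var:
                    terminal_var[s] = f"T_{s}"
                    result.append((terminal_var[s], (s,)))
                s = terminal_var[s]
            symbols.append(s)
        # Step (ii): break a long right-hand side into rules with two variables each.
        head = lhs
        while len(symbols) > 2:
            fresh_count += 1
            fresh = f"X{fresh_count}"
            result.append((head, (symbols[0], fresh)))
            head, symbols = fresh, symbols[1:]
        result.append((head, tuple(symbols)))
    return result


# Hypothetical input grammar: S -> aAb | c, A -> aAb | ab
RULES = [("S", ("a", "A", "b")), ("S", ("c",)),
         ("A", ("a", "A", "b")), ("A", ("a", "b"))]
cnf = to_cnf(RULES, {"S", "A"})
for lhs, rhs in cnf:
    print(lhs, "->", " ".join(rhs))
```

For the rule S → aAb, step (i) gives S → T_a A T_b and step (ii) splits it into S → T_a X1 and X1 → A T_b. Removing λ-rules and chain rules beforehand is a separate preprocessing task that this sketch does not attempt.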
• Example 4.5.1
• One application of CFGs in Chomsky normal form:
  - Constructing binary search trees to accomplish “optimal” time and space search complexity for parsing an input string
3.5 Leftmost Derivations and Ambiguity
• Theorem 3.5.1 Let G = (V, Σ, P, S) be a CFG. A string w ∈ L(G) iff there is a leftmost derivation of w from S.
  Proof. It is clear that if there is a leftmost derivation of w from S, then w ∈ L(G).
  Conversely, we can show that every string w ∈ L(G) is derivable in a leftmost manner, i.e., there is a leftmost derivation S ⇒* w.
  If any rule application in a derivation of w is not leftmost, the rule applications can be reordered so that they are leftmost; because rules are context-free, rewriting one variable does not affect how another variable is rewritten.

• Is there a unique leftmost derivation for every string in a CFL?
• Answer: No. (Consider the two leftmost derivations in Fig. 3.1.)
• The possibility of a string having several leftmost derivations introduces the notion of ambiguity.
• Ambiguity increases the burden of debugging a program and should be avoided.
• Defn. 3.5.2 A CFG G is ambiguous if there is a string w ∈ L(G) that can be derived by two distinct leftmost derivations. A grammar that is not ambiguous is called unambiguous.
• Example 3.5.1 The grammar G, which is defined as
    S → aS | Sa | a
  is ambiguous, since there are two leftmost derivations of aa:
    S ⇒ aS ⇒ aa and S ⇒ Sa ⇒ aa
  however, G', which is defined as S → aS | a, is unambiguous.
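The definition can be tested on small cases by counting leftmost derivations. The sketch below assumes one character per symbol and a grammar without λ-rules or chain rules, so that sentential forms never shrink and the search terminates; count_leftmost is a hypothetical helper name.

```python
def count_leftmost(rules, start, target):
    """Number of distinct leftmost derivations of `target` from `start`.

    Assumes single-character symbols and no λ-rules or chain rules, so a
    sentential form longer than the target can be discarded.
    """
    def count(form):
        if len(form) > len(target):
            return 0
        i = next((k for k, s in enumerate(form) if s in rules), None)
        if i is None:                       # no variables left
            return 1 if form == target else 0
        if form[:i] != target[:i]:          # terminal prefix already disagrees
            return 0
        return sum(count(form[:i] + rhs + form[i + 1:]) for rhs in rules[form[i]])

    return count(start)


G = {"S": ["aS", "Sa", "a"]}         # the ambiguous grammar of Example 3.5.1
G_PRIME = {"S": ["aS", "a"]}         # the unambiguous grammar G'

print(count_leftmost(G, "S", "aa"))        # 2
print(count_leftmost(G_PRIME, "S", "aa"))  # 1
```

A count above 1 for some string shows that the grammar is ambiguous. A count of 1 on the strings tried is only evidence, not a proof, that a grammar is unambiguous.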

• Unfortunately, some CFLs cannot be generated by any unambiguous grammar. Such languages are called inherently ambiguous.
• A grammar is unambiguous if, at each leftmost-derivation step, there is only one rule that can lead to a derivation of the desired string.
• Example 3.5.2 The ambiguous grammar G,
    S → bS | Sb | a
  can be converted into the unambiguous grammar G1 or G2, where
    G1: S → bS | aA    A → bA | λ
    G2: S → bS | A     A → Ab | a

• Example 3.5.3 The following grammar G is ambiguous:
    S → aSb | aSbb | λ (in Example 3.2.4), since
    S ⇒ aSb ⇒ aaSbbb ⇒ aabbb, and
    S ⇒ aSbb ⇒ aaSbbb ⇒ aabbb
  which can be converted into an unambiguous grammar
    S → aSb | A | λ    A → aAbb | abb
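A short calculation suggests why the converted grammar is unambiguous. Every derivation in it has one of the two shapes below, where the counts i and j are symbols introduced for this sketch (i uses of S → aSb, j uses of the A rules):

```latex
\begin{align*}
S &\Rightarrow^{i} a^{i} S\, b^{i} \Rightarrow a^{i} A\, b^{i} \Rightarrow^{j} a^{i+j}\, b^{i+2j}, && i \ge 0,\ j \ge 1,\\
S &\Rightarrow^{i} a^{i} S\, b^{i} \Rightarrow a^{i} b^{i}, && i \ge 0.
\end{align*}
```

For a string a^n b^m with m > n, only the first shape applies and it forces j = m - n and i = 2n - m; for m = n, only the second shape applies with i = n. Either way the rule sequence is determined by the string. For aabbb this gives i = 1, j = 1, i.e., S ⇒ aSb ⇒ aAb ⇒ aabbb.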
• Example. An inherently ambiguous language
    L = { a^n b^n c^m | n, m ≥ 0 } ∪ { a^n b^m c^m | m, n ≥ 0 }
• Every grammar that generates L is ambiguous
• Consider the following grammar of L:
    S → S1 | S2,
    S1 → S1c | A,    A → aAb | λ
    S2 → aS2 | B,    B → bBc | λ
• The strings { a^n b^n c^n | n ≥ 0 } always have two different DTs, e.g.,

          S                      S
          |                      |
          S1                     S2
         /  \                   /  \
       S1    c                 a    S2
      /  \                         /  \
    S1    c                       a    S2
    ...                                ...
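As a concrete instance, the string abc (the case n = 1) has one leftmost derivation through each branch of the grammar above; the derivations below are worked out from its rules:

```latex
\begin{align*}
S &\Rightarrow S_1 \Rightarrow S_1 c \Rightarrow A c \Rightarrow aAb\,c \Rightarrow abc,\\
S &\Rightarrow S_2 \Rightarrow a S_2 \Rightarrow aB \Rightarrow a\,bBc \Rightarrow abc.
\end{align*}
```

The first derivation treats abc as a member of { a^n b^n c^m }, the second as a member of { a^n b^m c^m }.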
• Another example of an inherently ambiguous language:
    L = { a^n b^n c^m d^m | n, m > 0 } ∪ { a^n b^m c^m d^n | n, m > 0 }
• The problem of determining whether an arbitrary context-free language is inherently ambiguous is recursively unsolvable,
  i.e., there is no algorithm that determines whether an arbitrary context-free language is inherently ambiguous.

• Reference:
  S. Ginsburg and J. Ullian, “Ambiguity in context free languages,” Journal of the ACM, 13(1): 62-89, January 1966.
