
Closure Properties of Context Free Languages

Last Updated : 05 Feb, 2025

Context-Free Languages (CFLs) are an essential class of languages in the field of automata theory and formal languages. They are generated by context-free grammars (CFGs) and are recognized by pushdown automata (PDAs). Understanding the closure properties of CFLs helps in determining which operations preserve the context-free nature of a language.

CFLs are closed under some operations, meaning that applying these operations to CFLs will always result in another CFL. However, they are not closed under certain operations, which means that performing these operations on CFLs might result in a language that is not context-free.

A context-free grammar has productions (substitution rules) of the form:

A → α (where A ∈ N and α ∈ (T ∪ N)*, N is the set of non-terminals, and T is the set of terminals)

Properties of Context Free Languages 

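CFLs are closed under union, concatenation, Kleene closure, reversal, homomorphism, and inverse homomorphism, but not under intersection or complementation. Each operation is discussed below: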

1. Union

If L and M are two context-free languages, then their union L ∪ M is also a CFL.

Construction:

  1. Consider two context-free grammars, G and H, for L and M respectively.
  2. Assume that G and H have no common variables (this can be ensured by renaming variables if needed).
  3. Introduce a new start symbol S and add the rule: S → S₁ | S₂. Here, S₁ and S₂ are the start symbols of G and H, respectively.
  4. The resulting grammar generates L ∪ M, proving that CFLs are closed under union.
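The construction above can be sketched in a few lines of Python. This is only an illustration: the dictionary representation of a grammar (variable mapped to a list of right-hand sides) and the function name `union_grammar` are assumptions made for this sketch, not part of any standard library.

```python
def union_grammar(g, g_start, h, h_start, new_start="S"):
    """Return (grammar, start) for L(g) ∪ L(h).

    A grammar is a dict: variable -> list of right-hand sides,
    where each right-hand side is a tuple of symbols.
    Assumes g and h share no variables and new_start is unused.
    """
    assert not (set(g) & set(h)), "rename variables first"
    assert new_start not in g and new_start not in h
    combined = {**g, **h}
    # New rule S -> S1 | S2
    combined[new_start] = [(g_start,), (h_start,)]
    return combined, new_start


# G: S1 -> a S1 b | ε   (generates aⁿbⁿ)
G = {"S1": [("a", "S1", "b"), ()]}
# H: S2 -> c S2 | ε     (generates c*)
H = {"S2": [("c", "S2"), ()]}

U, start = union_grammar(G, "S1", H, "S2")
print(U[start])  # [('S1',), ('S2',)]
```

The original productions of G and H are copied unchanged; only the new start rule is added.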

Example: Let L₁ = { aⁿbⁿcᵐ | m ≥ 0, n ≥ 0 } and L₂ = { aⁿbᵐcᵐ | n ≥ 0, m ≥ 0 }.

  • L₁ enforces that the number of a’s equals the number of b’s.
  • L₂ enforces that the number of b’s equals the number of c’s.
  • Their union requires only that at least one of these conditions holds. A grammar for it combines the grammars for L₁ and L₂ under a new start symbol, so the resulting language is context-free.

Note: CFLs are closed under union.

2. Concatenation

If L and M are CFLs, then their concatenation LM is also a CFL.

Construction:

  1. Let G and H be the context-free grammars for L and M, respectively.
  2. Assume that G and H have no common variables.
  3. Introduce a new start symbol S and add the production: S → S₁ S₂. Here, S₁ and S₂ are the start symbols of G and H, respectively.
  4. This ensures that any derivation from S will first generate a string in L and then a string in M, proving that CFLs are closed under concatenation.

Example 

L1 = { aⁿbⁿ | n ≥ 0 } and L2 = { cᵐdᵐ | m ≥ 0 }
L3 = L1.L2 = { aⁿbⁿcᵐdᵐ | m ≥ 0 and n ≥ 0 } is also context-free.
L1 requires the number of a’s to equal the number of b’s, and L2 requires the number of c’s to equal the number of d’s. Their concatenation requires first that the a’s match the b’s, and then that the c’s match the d’s. So we can build a PDA that pushes for a’s, pops for b’s, pushes for c’s, and then pops for d’s. The language can therefore be accepted by a pushdown automaton and is context-free.
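The push/pop behaviour described for aⁿbⁿcᵐdᵐ can be sketched as a small stack-based recognizer. This is a hand-written simulation of that idea, not a general PDA implementation, and the function name `accepts_anbncmdm` is chosen only for this illustration.

```python
def accepts_anbncmdm(s):
    """Check membership in { aⁿbⁿcᵐdᵐ } with one stack."""
    stack = []
    phase = "a"  # latest block reached; blocks must appear in order a, b, c, d
    for ch in s:
        if ch not in "abcd" or ch < phase:
            return False              # unknown symbol or block out of order
        if ch in "cd" and phase in "ab" and stack:
            return False              # some a's were never matched by b's
        phase = ch
        if ch in "ac":
            stack.append(ch)          # push for a's and c's
        else:
            expected = "a" if ch == "b" else "c"
            if not stack or stack[-1] != expected:
                return False          # nothing left to match
            stack.pop()               # pop for b's and d's
    return not stack                  # every push must have been matched


print(accepts_anbncmdm("aabbcd"))  # True
print(accepts_anbncmdm("aabcd"))   # False
```

The stack is empty again after the b’s, which is why the same stack can be reused for the c’s and d’s.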

Note: CFLs are closed under concatenation.

3. Kleene Closure

If L is a CFL, then its Kleene closure L* (zero or more repetitions of strings in L) is also a CFL.

Construction:

  1. Let G be the context-free grammar for L with start symbol S₁.
  2. Introduce a new start symbol S and add the rule: S → S₁ S | ε. This rule ensures that S can derive zero or more copies of S₁, proving closure under the Kleene star.
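As a rough check of this construction, the sketch below builds the starred grammar and lists every string it generates up to a small length bound. The grammar representation and the helper names `star_grammar` and `strings_up_to` are assumptions made for this example. The helper computes the strings by a simple fixed-point iteration, which is adequate for tiny grammars only.

```python
def star_grammar(g, g_start, new_start="S"):
    """Add S -> S1 S | ε on top of grammar g (dict: variable -> list of tuples)."""
    assert new_start not in g
    starred = dict(g)
    starred[new_start] = [(g_start, new_start), ()]
    return starred, new_start


def strings_up_to(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from start."""
    lang = {v: set() for v in grammar}
    changed = True
    while changed:
        changed = False
        for var, rhss in grammar.items():
            for rhs in rhss:
                partial = {""}
                for sym in rhs:
                    choices = lang[sym] if sym in grammar else {sym}
                    partial = {p + c for p in partial for c in choices
                               if len(p) + len(c) <= max_len}
                if not partial <= lang[var]:
                    lang[var] |= partial
                    changed = True
    return lang[start]


# G: S1 -> a S1 b | ε   (generates aⁿbⁿ)
G = {"S1": [("a", "S1", "b"), ()]}
G_star, start = star_grammar(G, "S1")
print(sorted(strings_up_to(G_star, start, 4), key=lambda s: (len(s), s)))
# ['', 'ab', 'aabb', 'abab']
```

The string abab appears because it is two copies of ab, even though it is not itself of the form aⁿbⁿ.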

Example

L1 = { aⁿbⁿ | n ≥ 0 }
L1* = { aⁿbⁿ | n ≥ 0 }* is also context-free.

Note: CFLs are closed under Kleene closure.

4. Intersection and complementation

If L1 and L2 are two context-free languages, their intersection L1 ∩ L2 need not be context-free.

Example

L1 = { aⁿbⁿcᵐ | n ≥ 0 and m ≥ 0 } and L2 = { aᵐbⁿcⁿ | n ≥ 0 and m ≥ 0 }
L3 = L1 ∩ L2 = { aⁿbⁿcⁿ | n ≥ 0 } is not context-free.
L1 requires the number of a’s to equal the number of b’s, and L2 requires the number of b’s to equal the number of c’s. Their intersection requires both conditions at once, but a pushdown automaton has a single stack: once it has popped the a’s to match the b’s, it has no count left to compare against the c’s. So L3 cannot be accepted by a pushdown automaton and is not context-free (this can be proved formally with the pumping lemma for CFLs).
Similarly, the complement of a context-free language L1, which is Σ* − L1, need not be context-free. If CFLs were closed under complementation, then, because they are closed under union, De Morgan’s law would make them closed under intersection as well, contradicting the example above.

Note: CFLs are not closed under intersection or complementation.
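To see concretely which strings survive the intersection, the following sketch checks every string over {a, b, c} up to length 6 against both languages. The membership tests `in_L1` and `in_L2` are written directly from the set definitions above and are only illustrative. The brute-force search shows what the intersection looks like but proves nothing about context-freeness.

```python
import re
from itertools import product


def in_L1(s):
    """aⁿbⁿcᵐ: a's, then b's, then c's, with #a == #b."""
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return bool(m) and len(m.group(1)) == len(m.group(2))


def in_L2(s):
    """aᵐbⁿcⁿ: a's, then b's, then c's, with #b == #c."""
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return bool(m) and len(m.group(2)) == len(m.group(3))


both = []
for k in range(7):
    for letters in product("abc", repeat=k):
        s = "".join(letters)
        if in_L1(s) and in_L2(s):
            both.append(s)

print(both)  # ['', 'abc', 'aabbcc']
```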

5. Reversal

If L is a CFL, then its reversal L^R (where each string is reversed) is also a CFL.

Construction:

  1. Take the context-free grammar G for L.
  2. Modify each production so that the right-hand side of every rule is reversed.
  3. The resulting grammar generates L^R, proving that CFLs are closed under reversal.

Example:

  • Original Grammar: S → 0S1 | 01
  • Reversed Grammar: S → 1S0 | 10
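Reversing every right-hand side is a one-line transformation. The sketch below assumes a grammar is stored as a dictionary from each variable to a list of right-hand sides (tuples of symbols). That representation and the name `reverse_grammar` are choices made for this example.

```python
def reverse_grammar(g):
    """Reverse the right-hand side of every production."""
    return {var: [tuple(reversed(rhs)) for rhs in rhss]
            for var, rhss in g.items()}


# Original grammar: S -> 0S1 | 01
G = {"S": [("0", "S", "1"), ("0", "1")]}
print(reverse_grammar(G))  # {'S': [('1', 'S', '0'), ('1', '0')]}
```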

6. Homomorphism

A homomorphism is a function that replaces each symbol in a string with another string.

If L is a CFL and h is a homomorphism, then h(L) is also a CFL.

Example:

  • Let G have the production S → 0S1 | 01.
  • Define a homomorphism h(0) = ab, h(1) = ε.
  • Then h(L(G)) is generated by the grammar obtained by replacing each terminal in G with its image under h, which gives the productions: S → abS | ab

This shows that CFLs are closed under homomorphism.
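The substitution of terminals by their images can be sketched as follows. The dictionary-based grammar format and the function name `apply_homomorphism` are assumptions for illustration. The homomorphism h is given as a plain dictionary from each terminal to its image string.

```python
def apply_homomorphism(g, h):
    """Replace each terminal a in every production by the symbols of h(a)."""
    image = {}
    for var, rhss in g.items():
        image[var] = []
        for rhs in rhss:
            new_rhs = []
            for sym in rhs:
                if sym in g:
                    new_rhs.append(sym)      # variables are left unchanged
                else:
                    new_rhs.extend(h[sym])   # one terminal per character of h(sym)
            image[var].append(tuple(new_rhs))
    return image


# S -> 0S1 | 01 with h(0) = ab, h(1) = ε
G = {"S": [("0", "S", "1"), ("0", "1")]}
h = {"0": "ab", "1": ""}
print(apply_homomorphism(G, h))  # {'S': [('a', 'b', 'S'), ('a', 'b')]}
```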

7. Inverse Homomorphism

If L is a CFL and h is a homomorphism, then h⁻¹(L) is also a CFL.

Construction:

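  • Instead of using a grammar, the construction uses a PDA.
  • The PDA for h⁻¹(L) reads an input symbol a, places h(a) in a finite buffer, and simulates the PDA for L on the buffered symbols one at a time. It accepts if the simulated PDA accepts once the input is exhausted and the buffer is empty.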
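The idea behind this construction is that a string w belongs to h⁻¹(L) exactly when h(w) belongs to L. The sketch below illustrates only that membership condition, not the PDA itself. The language L = { aⁿbⁿ }, the homomorphism values, and all function names are hypothetical choices for this example.

```python
def h_image(w, h):
    """Apply the homomorphism symbol by symbol."""
    return "".join(h[ch] for ch in w)


def in_L(s):
    """Membership in L = { aⁿbⁿ | n >= 0 }."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def in_inverse_image(w, h):
    """w is in h⁻¹(L) exactly when h(w) is in L."""
    return in_L(h_image(w, h))


# Example homomorphism: h(0) = aa, h(1) = bb, h(2) = ε
h = {"0": "aa", "1": "bb", "2": ""}
print(in_inverse_image("2012", h))  # True, since h(2012) = aabb
print(in_inverse_image("10", h))    # False, since h(10) = bbaa
```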

Deterministic Context-free Languages 

Deterministic CFLs (DCFLs) are the subset of CFLs that can be recognized by a deterministic PDA (DPDA). A deterministic PDA has at most one possible move for any combination of state, input symbol, and top-of-stack symbol, i.e., it never has a choice. For a language to be a DCFL, it must always be clear when to push and when to pop.

For example, L1 = { aⁿbⁿcᵐ | m ≥ 0 and n ≥ 0 } is a DCFL because we can push for each a, pop for each b, and then simply read the c’s. It can be recognized by a deterministic PDA. On the other hand, L3 = { aⁿbⁿcᵐ } ∪ { aⁿbᵐcᵐ } (n ≥ 0, m ≥ 0) cannot be recognized by a DPDA, because the machine cannot know in advance whether it should match the a’s against the b’s or the b’s against the c’s. So it can only be recognized by a nondeterministic PDA (NPDA). Thus, it is a CFL but not a DCFL.
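Note: Among the operations discussed above, DCFLs are closed under complementation and inverse homomorphism, but not under union, concatenation, Kleene closure, intersection, reversal, or homomorphism.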

Multiple-choice questions (MCQs)

Question: Consider the languages L1, L2, and L3 given below.
L1 = { aᵐbⁿ | m, n ≥ 0 }
L2 = { aⁿbⁿ | n ≥ 0 }
L3 = { aⁿbⁿcⁿ | n ≥ 0 }
Which of the following statements is NOT TRUE?
A. Push Down Automata (PDA) can be used to recognize L1 and L2
B. L1 is a regular language
C. All the three languages are context free
D. Turing machine can be used to recognize all the three languages

Solution: Statement (A) says a PDA can be used to recognize L1 and L2. L1 contains all strings with any number of a’s followed by any number of b’s, so it can be accepted by a PDA. L2 contains strings with n a’s followed by n b’s, which can also be accepted by a PDA. So statement (A) is true.
Statement (B) says that L1 is regular. It is true, as the regular expression for L1 is a*b*.
Statement (C) says L1, L2, and L3 are context-free. L3 contains all strings with n a’s followed by n b’s followed by n c’s, which cannot be accepted by a PDA. So statement (C) is not true.
Statement (D) is true, as a Turing machine can be used to recognize all three languages.
So, the answer is option (C).

Question: The language L = { 0ⁱ12ⁱ | i ≥ 0 } over the alphabet {0, 1, 2} is:
A. Not recursive
B. Is recursive and deterministic CFL
C. Is regular
D. Is CFL but not deterministic CFL.

Solution: The above language is a deterministic CFL: for each 0 we push a 0 onto the stack, the single 1 marks the switch from pushing to popping, and for each 2 we pop a corresponding 0. As there is never any ambiguity about which move to take, it is deterministic. Since every CFL is recursive, L is recursive as well. So, the correct option is (B).

Question : Consider the following languages: 
L1 = { 0ⁿ1ⁿ | n ≥ 0 }
L2 = { wcw^R | w ∈ {a,b}* }
L3 = { ww^R | w ∈ {a,b}* }
Which of these languages are deterministic context-free languages? 
A. None of the languages 
B. Only L1 
C. Only L1 and L2 
D. All three languages 

Solution: L1 contains all strings in which n 0’s are followed by n 1’s. A deterministic PDA can be constructed to accept L1: for each 0 we push onto the stack, and for each 1 we pop from the stack. Hence, it is a DCFL.

L2 contains all strings of the form wcw^R, where w is a string of a’s and b’s and w^R is the reverse of w, for example aabbcbbaa. To accept this language, we can construct a PDA that pushes every symbol onto the stack until it reads c. After c, each input symbol must match the symbol on top of the stack, which is then popped. The marker c tells the machine exactly when to switch from pushing to popping, so L2 can be accepted by a deterministic PDA and is also a DCFL.

L3 contains all strings of the form ww^R, where w is a string of a’s and b’s and w^R is the reverse of w. But there is no marker showing where w ends and w^R starts. For example, aabbaa is a string in L3. The first a is pushed onto the stack. The next a could be either a continuation of w or the start of w^R (if w = a). So there can be multiple moves from a state on the same input symbol, and only a nondeterministic PDA can accept this language. Hence, L3 is a CFL but not a DCFL.
So, the correct option is (C): only L1 and L2 are DCFLs.
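The deterministic strategy described for L2 can be sketched as a single left-to-right pass with one stack. This is a hand-written simulation for illustration, and the function name `accepts_wcwr` is an assumption of this example.

```python
def accepts_wcwr(s):
    """Deterministic stack check for { w c w^R | w in {a,b}* }."""
    stack = []
    seen_c = False
    for ch in s:
        if ch == "c":
            if seen_c:
                return False          # only one centre marker is allowed
            seen_c = True
        elif ch in "ab":
            if not seen_c:
                stack.append(ch)      # before c: push
            elif not stack or stack.pop() != ch:
                return False          # after c: must match the popped symbol
        else:
            return False              # symbol outside the alphabet
    return seen_c and not stack


print(accepts_wcwr("aabbcbbaa"))  # True
print(accepts_wcwr("aabbaa"))     # False (no centre marker)
```

No such single-pass procedure works for L3 without guessing, because nothing in the input marks the midpoint.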

Question: Which one of the following grammars generates the language L = { aⁱbʲ | i ≠ j }?
A. S → AC | CB, C → aCb | a | b, A → aA | ε, B → Bb | ε
B. S → aS | Sb | a | b
C. S → AC | CB, C → aCb | ε, A → aA | ε, B → Bb | ε
D. S → AC | CB, C → aCb | ε, A → aA | a, B → Bb | b

Solution: The best way to solve this type of question is to eliminate the options that violate the condition. The condition for language L is that the number of a’s and the number of b’s must be unequal.

In option (B), S ⇒ aS ⇒ ab. It can generate strings with equal numbers of a’s and b’s, so this option is incorrect.
In option (C), S ⇒ AC ⇒ C ⇒ ε. In ε, the numbers of a’s and b’s are equal (both 0), so it is not the correct option.
In option (A), S is replaced by either AC or CB. C generates either one more a than b or one more b than a. But an extra a can be compensated by B → Bb | ε, and an extra b can be compensated by A → aA | ε. For example, S ⇒ AC ⇒ aC ⇒ ab. So it can generate strings with equal numbers of a’s and b’s, and it is not the correct option.
In option (D), S is replaced by either AC or CB. C always generates equal numbers of a’s and b’s. If we replace S by AC, A adds at least one extra a, and if we replace S by CB, B adds at least one extra b. So this grammar never generates a string with equal numbers of a’s and b’s.

So, option (D) is correct.
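The elimination argument can be cross-checked by brute force for short strings. The sketch below encodes options (A) and (D) as dictionaries and lists everything they generate up to length 6. The representation and the helper `strings_up_to` (a simple fixed-point computation) are assumptions of this example. A bounded check like this supports the reasoning but is not a proof.

```python
def strings_up_to(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from start."""
    lang = {v: set() for v in grammar}
    changed = True
    while changed:
        changed = False
        for var, rhss in grammar.items():
            for rhs in rhss:
                partial = {""}
                for sym in rhs:
                    choices = lang[sym] if sym in grammar else {sym}
                    partial = {p + c for p in partial for c in choices
                               if len(p) + len(c) <= max_len}
                if not partial <= lang[var]:
                    lang[var] |= partial
                    changed = True
    return lang[start]


EPS = ()  # empty right-hand side
option_a = {
    "S": [("A", "C"), ("C", "B")],
    "C": [("a", "C", "b"), ("a",), ("b",)],
    "A": [("a", "A"), EPS],
    "B": [("B", "b"), EPS],
}
option_d = {
    "S": [("A", "C"), ("C", "B")],
    "C": [("a", "C", "b"), EPS],
    "A": [("a", "A"), ("a",)],
    "B": [("B", "b"), ("b",)],
}

# Every aⁱbʲ with i != j and total length at most 6
expected = {"a" * i + "b" * j
            for i in range(7) for j in range(7)
            if i + j <= 6 and i != j}

print(strings_up_to(option_d, "S", 6) == expected)  # True
print("ab" in strings_up_to(option_a, "S", 6))      # True
```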

