7B Midterm Review Solutions
Solution
We can construct a DFA with an omitted garbage state that recognizes L:
[Figure: state diagram of the DFA; transition labels 0, 1, 2.]
All accepted strings must have length at least 1 since the initial state is not accepting. We
branch from qs depending on the first symbol of w. WLOG, let the first symbol of w be 0. Once
we take the transition from qs to q00 , we return to the accepting state q00 if and only if we
read a sequence of symbols (after the initial 0) of the form (120)∗ , which ensures the first two
conditions of the strings in the language.
COMS W3261 Fall 2022 Handout 7b: Midterm Review Solutions
2. The set of strings over the alphabet Σ = {a, b, . . . , z} that contain at least one m between any two a’s
in the string; for example abc, john, mama, american are in the language, but papa, panamerican are
not.
Solution
We can construct a DFA that recognizes this language:
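[Figure: state diagram of the DFA with states q0, q1, q2. Visible labels: a self-loop on Σ \ {a}
at q0, an a-transition from q0 to q1, a self-loop on Σ \ {a, m} at q1, an a-transition from q1
to q2, and a self-loop on Σ at q2.]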
After reading an a, the DFA rejects if it reads another a before reading an m.
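The behavior of this DFA can be sketched as a short simulation. This is a minimal sketch, not the handout's construction: the function name is hypothetical, and both the m-transition from q1 back to q0 and the choice of q0 and q1 as accepting states are assumptions inferred from the language description.

```python
def accepts_m_between_as(w: str) -> bool:
    """Simulate a 3-state DFA for 'at least one m between any two a's'."""
    # q0: no a is pending; q1: an a was read and no m since; q2: rejecting sink.
    state = "q0"
    for ch in w:
        if state == "q0":
            state = "q1" if ch == "a" else "q0"
        elif state == "q1":
            if ch == "a":
                state = "q2"  # a second a arrived before any m
            elif ch == "m":
                state = "q0"  # assumed transition: an m clears the pending a
            # any other symbol keeps the DFA in q1
        # q2 loops on every symbol
    return state != "q2"  # assumption: q0 and q1 are the accepting states
```

Running it on the examples from the problem statement accepts abc, john, mama, and american, and rejects papa and panamerican.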
3. Challenge: The set of binary strings which represent in binary a number that is an integer multiple
of 3 (leading zeros are allowed). For example, 00, 110 are in the language (they correspond to 0, 6
respectively), but 001, 101 are not (they correspond to 1, 5). Hint: Three states are enough. Think
about what each additional symbol in a binary string does to the number; it might be helpful to write
out some examples.
Solution
As the hint mentions, three states are enough for our DFA; thus, it is natural to think about
how we can use the three states as a counter for the modulus operation, where 0 mod 3 is the
accepting state for our binary string. We see the following relationship between the values mod
3:
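[Figure: state diagram of the DFA with states q0, q1, q2, where q0 is the start state and the
only accepting state. Transitions: q0 on 0 → q0, q0 on 1 → q1, q1 on 0 → q2, q1 on 1 → q0,
q2 on 0 → q1, q2 on 1 → q2.]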
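The relationship between the values mod 3 follows from the fact that appending a bit b to a binary string with value n gives the value 2n + b, so a remainder r becomes (2r + b) mod 3. A minimal sketch of the resulting three-state machine, with the state stored as the current remainder (the function name is a hypothetical choice):

```python
def is_multiple_of_3(bits: str) -> bool:
    """Run the 3-state DFA: the state is the value read so far, mod 3."""
    state = 0  # the empty prefix has value 0
    for b in bits:
        # Appending bit b maps value n to 2n + b, so remainder r maps to (2r + b) mod 3.
        state = (2 * state + int(b)) % 3
    return state == 0
```

For example, `is_multiple_of_3("110")` is true (the value is 6) and `is_multiple_of_3("101")` is false (the value is 5).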
For the following problems, if a language L is given, prove that L is regular or prove that L is nonregular.
4. L = {ww | w ∈ {0, 1}∗ and w contains at least one 0 and at least one 1} over the alphabet Σ = {0, 1}.
Solution
Define the following language:
A = {ww | w ∈ {0, 1}∗ and w does not contain both a 0 and a 1}
We see that A = L((00)∗ ∪ (11)∗ ), as every string in A must have even length (because every
string in A has the form ww) and must be composed of only 0s or only 1s (in order to not contain
both a 0 and a 1).
We then observe that L ∪ A = B, where
B = {ww | w ∈ {0, 1}∗ }
Since A can be expressed as a regular expression, A is regular. Now assume, for the sake of
contradiction, that L is also regular. Since the class of regular languages is closed under the
union operation, if L and A are both regular, then B must also be regular.
However, B is not context-free and therefore not regular.ᵃ
Since B is not regular, then due to closure properties, it cannot be the case that both L and A
are regular, so L must be nonregular.
ᵃ Since it is true that if a language is regular, it is context-free, the contrapositive must be true. It was proved
5. Prove that the union of a regular language L1 and nonregular language L2 , L1 ∪L2 , such that L1 ∩L2 = ∅
results in a nonregular language.
Solution
To prove that L1 ∪ L2 is nonregular, we assume for the sake of contradiction that it is regular.
Then, we aim to show that if L1 ∪ L2 is regular, then L2 will always be regular, which is a
contradiction.
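To construct an expression for L2 in terms of L1 ∪ L2 and L1 , we observe that, since L1 ∩ L2 = ∅,
L2 = (L1 ∪ L2 ) ∩ L1ᶜ
where L1ᶜ = Σ∗ \ L1 is the complement of L1 . Since L1 is regular and the class of regular
languages is closed under complement and intersection, L2 would be regular, which contradicts
the assumption that L2 is nonregular. Therefore, L1 ∪ L2 is nonregular.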
6. L = {a^i b^j c^k | i + j = k}.
Solution
L is nonregular as it does not satisfy the pumping lemma for regular languages. We choose the
string w = a^p b^p c^{2p}, and it is clear that w ∈ L and |w| = 4p ≥ p. For all partitions w = xyz,
due to the conditions |xy| ≤ p and |y| > 0, we see that y must consist of at least one a and is
composed only of as, or in other words, y = a^k, 0 < k ≤ p. We then consider the pumped-up
string xy^2z = a^{p+k} b^p c^{2p} and see that xy^2z ∉ L (as (p + k) + p = 2p + k ≠ 2p since k ≠ 0);
for all partitions w = xyz, w cannot be pumped. Since L does not satisfy the pumping lemma for
regular languages, L is not a regular language.
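The case analysis above can be checked mechanically for small values of p. The sketch below is only a sanity check under stated assumptions, not a proof: `in_L` and `every_split_fails` are hypothetical helper names, and the loop enumerates every split w = xyz with |xy| ≤ p and |y| > 0.

```python
def in_L(s: str) -> bool:
    """Membership test for L = {a^i b^j c^k | i + j = k}."""
    i = len(s) - len(s.lstrip("a"))        # number of leading a's
    rest = s[i:]
    j = len(rest) - len(rest.lstrip("b"))  # number of b's that follow
    tail = rest[j:]
    return set(tail) <= {"c"} and i + j == len(tail)


def every_split_fails(p: int) -> bool:
    """Check that pumping y once breaks membership for every legal split of w."""
    w = "a" * p + "b" * p + "c" * (2 * p)
    assert in_L(w)
    for xy_len in range(1, p + 1):           # |xy| <= p
        for y_len in range(1, xy_len + 1):   # |y| > 0
            x = w[: xy_len - y_len]
            y = w[xy_len - y_len : xy_len]
            z = w[xy_len:]
            if in_L(x + y * 2 + z):
                return False
    return True
```

Calling `every_split_fails(p)` for a few small p confirms that no split of w survives pumping, matching the argument.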
7. L = {0^k u 0^k | k ≥ 1, u ∈ Σ∗ }.
Solution
We prove that L is regular by providing a regular expression. Every string in L starts and ends
with 0, since k ≥ 1. Conversely, taking k = 1, L contains every string of length at least 2 that
starts and ends with 0 (note that u includes all characters between the starting and ending
characters; this is possible since u ∈ Σ∗ ). Due to this observation,
L = L(0Σ∗ 0), so L is regular.
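The equality L = L(0Σ∗0) can be spot-checked by brute force. This is a minimal sketch assuming Σ = {0, 1}, which the problem does not state explicitly; `in_L_by_definition` and `agree_up_to` are hypothetical helper names.

```python
import re
from itertools import product


def in_L_by_definition(s: str) -> bool:
    """True if s = 0^k u 0^k for some k >= 1 and some u over {0, 1}."""
    # The two blocks of k zeros must not overlap, so 2k <= len(s).
    return any(
        s.startswith("0" * k) and s.endswith("0" * k)
        for k in range(1, len(s) // 2 + 1)
    )


PATTERN = re.compile(r"0[01]*0")  # the regular expression 0 Σ* 0 over Σ = {0, 1}


def agree_up_to(n: int) -> bool:
    """Compare the definition with the regular expression on all strings of length <= n."""
    for length in range(n + 1):
        for chars in product("01", repeat=length):
            s = "".join(chars)
            if in_L_by_definition(s) != bool(PATTERN.fullmatch(s)):
                return False
    return True
```

Agreement on all short strings does not replace the argument above, but it is a quick way to catch an off-by-one in the regular expression (for instance, the single-character string 0 is in neither set).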
8. L = {0^k 1 u 0^k | k ≥ 1, u ∈ Σ∗ }.
Solution
L is nonregular as it does not satisfy the pumping lemma for regular languages. We choose the
string w = 0^p 1 0^p, and it is clear that w ∈ L and |w| = 2p + 1 ≥ p. For all partitions w = xyz,
due to the conditions |xy| ≤ p and |y| > 0, we see that y must consist of at least one 0 and
is composed only of 0s, or in other words, y = 0^m, 0 < m ≤ p. We then consider the pumped-up
string xy^2z = 0^{p+m} 1 0^p and see that xy^2z ∉ L: the string has p + m 0s before its first 1,
so membership in L would require it to end with p + m 0s, but only p symbols follow the 1.
For all partitions w = xyz, w cannot be pumped.ᵃ Since L does not satisfy the pumping lemma for
regular languages, L is not a regular language.
ᵃ By pumping up, we see the difference between this language L and the previous language
L_R = {0^k u 0^k | k ≥ 1, u ∈ Σ∗ }: the fixed 1 allows for the distinction between the substring
of the first occurrence of 0^k and u, which in turn allows us to pump up the string w in our solution.
9. L = {0^i 1^j | i, j ≥ 0, i ≠ j}.
Solution
We will prove that L is nonregular using closure properties. Let Lᶜ denote the complement of L.
Assume, for the sake of contradiction, that Lᶜ is regular. Then, we see that
L(0∗ 1∗ ) ∩ Lᶜ = {0^n 1^n | n ≥ 0}
It is important to note that while Lᶜ does contain strings of the form 0^n 1^n , n ≥ 0, Lᶜ also
contains strings that are not of the form 0∗ 1∗ , as these strings are not in L. Thus, when we take
the intersection of L(0∗ 1∗ ) and Lᶜ, the result is the set of all strings of the form 0∗ 1∗ where the
number of 0s is equal to the number of 1s. However, the resulting language {0^n 1^n | n ≥ 0} is
nonregular, as was proved in class.ᵃ
Since {0^n 1^n | n ≥ 0} is nonregular, due to closure properties, it cannot be the case that both
L(0∗ 1∗ ) and Lᶜ are regular; since L(0∗ 1∗ ) is regular, Lᶜ must be nonregular. Since the class of
regular languages is closed under the complement operation, if Lᶜ is nonregular, then L itself
must be nonregular.ᵇ
ᵃ See Lecture 8 (10/3). If you would like to use results from class, you must state that you are doing so; simply
stating that the resulting language is nonregular without stating that it was proved such in class is not sufficient.
ᵇ To see this, consider what would be true about Lᶜ if L were to be regular.
10. L = {0^n 1^n | 0 ≤ n ≤ 3}
Solution
Since L is finite (L = {ε, 01, 0011, 000111}), L is regular.
11. Challenge: Let L be the language consisting of all strings of a’s and b’s with an equal number of
occurrences of ab and ba as substrings. (The string aabbbaa has one occurrence of each of the substrings
ab and ba.) Is L a regular language? Prove your answer.
Solution
Yes, L is regular: In fact, L can be described equivalently as the set of strings over {a, b} that
are empty or begin and end with the same symbol. We break into two scenarios: the string contains
no occurrences of ab or ba, or it contains one or more occurrences of ab and ba. The first case is
trivial; for the second scenario, WLOG, let the string start with a. Once ab is introduced to the
string, the string must end with an a to balance the number of occurrences of ab and ba. If the
string instead ends with b, then there are more occurrences of ab than ba. We can construct an
NFA that recognizes L as follows:
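[Figure: state diagram of the NFA with states q0, q1, q2, q3, q4, where q0 is the start state.
The branch through q1 and q3 carries a-transitions and a self-loop on a, b; the branch through
q2 and q4 carries b-transitions and a self-loop on a, b.]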
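The claimed characterization (equal numbers of ab and ba exactly when the string is empty or begins and ends with the same symbol) can be checked exhaustively on short strings. The following is a sketch with hypothetical helper names; it tests the characterization, not the NFA itself.

```python
from itertools import product


def count_occurrences(s: str, sub: str) -> int:
    """Count occurrences of sub in s, scanning every start position."""
    return sum(1 for i in range(len(s) - len(sub) + 1) if s[i : i + len(sub)] == sub)


def equal_ab_ba(s: str) -> bool:
    return count_occurrences(s, "ab") == count_occurrences(s, "ba")


def same_ends(s: str) -> bool:
    """Empty, or first and last symbols coincide."""
    return s == "" or s[0] == s[-1]


def characterizations_agree(n: int) -> bool:
    """Compare both descriptions of L on all strings over {a, b} of length <= n."""
    return all(
        equal_ab_ba("".join(t)) == same_ends("".join(t))
        for length in range(n + 1)
        for t in product("ab", repeat=length)
    )
```

The example from the problem statement, aabbbaa, has one occurrence of each substring and indeed begins and ends with a.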
L = {a^i b^j | i, j ≥ 0, i ≤ j ≤ 3i}
is context-free.
Solution
The language L is context-free because the following CFG G3 generates it. Let G3 =
({S}, {a, b}, R, S) where R = {S → aSb | aSbb | aSbbb | ε}. An informal justification is as
follows: for each non-empty string (the string ε is trivially in L and generated by G3), each
derivation step generates one a on the left and n b’s on the right where 1 ≤ n ≤ 3. If i a’s are
generated, then the minimum number of b’s generated is j = i and the maximum number of b’s
generated is j = 3i, so i ≤ j ≤ 3i.
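The counting argument can be illustrated by enumerating which pairs (i, j) the grammar reaches. This is a sketch under the observation that every derivation applies one of the three non-ε rules some number of times and then S → ε, so a derived string is a^i b^j with j a sum of i values from {1, 2, 3}; `derivable_pairs` is a hypothetical helper name.

```python
def derivable_pairs(max_i: int) -> set:
    """Pairs (i, j) such that a^i b^j is derivable with at most max_i non-epsilon steps."""
    pairs = {(0, 0)}      # S -> epsilon
    frontier = {(0, 0)}
    for _ in range(max_i):
        # Each step S -> aSb | aSbb | aSbbb adds one a and 1, 2, or 3 b's.
        frontier = {(i + 1, j + n) for (i, j) in frontier for n in (1, 2, 3)}
        pairs |= frontier
    return pairs


def expected_pairs(max_i: int) -> set:
    """Pairs (i, j) with i <= max_i and i <= j <= 3i, as in the definition of L."""
    return {(i, j) for i in range(max_i + 1) for j in range(i, 3 * i + 1)}
```

Comparing `derivable_pairs(n)` with `expected_pairs(n)` for small n shows both inclusions at once: the grammar generates nothing outside L, and every admissible j between i and 3i is reached.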
L = {a^i b^j c^k : i = j or j = k where i, j, k ≥ 0}
Solution
An approach is to consider L as the union of two languages where i = j and j = k, and construct
the CFG accordingly:
S → S1 | S2
S1 → U V
U → aU b | ε
V → cV | ε
S2 → XY
X → aX | ε
Y → bY c | ε
An informal justification: S1 derives the strings where i = j, because the rule U → aU b ensures
that the number of a’s equals the number of b’s, and the rule S1 → U V ensures the order of abc.
S2 derives the strings where j = k, because the rule Y → bY c ensures that the number of b’s
equals the number of c’s, and the rule S2 → XY ensures the order of abc.
This grammar is ambiguous: given the discussion at the beginning, w = a^n b^n c^n satisfies both
i = j and j = k and thus can be derived from both S1 and S2. This gives two parse trees for
w.
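The source of the ambiguity can be made concrete with two membership tests, one per branch of S → S1 | S2. This is a sketch: the functions below test the languages generated from S1 and S2 rather than building parse trees, and their names are hypothetical.

```python
import re

BLOCKS = re.compile(r"(a*)(b*)(c*)")  # splits a string of the form a^i b^j c^k


def derivable_from_S1(w: str) -> bool:
    """S1 -> UV generates a^i b^i c^k."""
    m = BLOCKS.fullmatch(w)
    return bool(m) and len(m.group(1)) == len(m.group(2))


def derivable_from_S2(w: str) -> bool:
    """S2 -> XY generates a^i b^j c^j."""
    m = BLOCKS.fullmatch(w)
    return bool(m) and len(m.group(2)) == len(m.group(3))


def has_two_top_level_derivations(w: str) -> bool:
    """True when both S -> S1 and S -> S2 lead to w."""
    return derivable_from_S1(w) and derivable_from_S2(w)
```

Strings such as aabbcc pass both tests, while aabbc is reachable only through S1 and abbcc only through S2.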
L = {a^n b^k c^n | n, k ≥ 0}
Solution
We construct PDA P as follows:
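[Figure: state diagram of P with states q0, q1, q2, q3, q4. Transitions: q0 → q1 on ε, ε → $;
self-loop at q1 on a, ε → a; q1 → q2 on ε, ε → ε; self-loop at q2 on b, ε → ε; q2 → q3 on
ε, ε → ε; self-loop at q3 on c, a → ε; q3 → q4 on ε, $ → ε.]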
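A PDA of this shape can be simulated by searching over configurations (state, input position, stack). The sketch below rests on assumptions: the transition table is one reading of the diagram's labels (each label "x, y → z" read as: consume x, pop y, push z), q0 is taken as the start state and q4 as the only accepting state, and the names `DELTA`, `ACCEPT`, and `pda_accepts` are hypothetical.

```python
from collections import deque

# (state, input symbol or "", symbol to pop or "") -> list of (next state, symbol to push or "")
DELTA = {
    ("q0", "", ""): [("q1", "$")],    # mark the bottom of the stack
    ("q1", "a", ""): [("q1", "a")],   # push one a per a read
    ("q1", "", ""): [("q2", "")],
    ("q2", "b", ""): [("q2", "")],    # b's do not touch the stack
    ("q2", "", ""): [("q3", "")],
    ("q3", "c", "a"): [("q3", "")],   # pop one a per c read
    ("q3", "", "$"): [("q4", "")],    # stack is back to the bottom marker
}
ACCEPT = {"q4"}


def pda_accepts(w: str) -> bool:
    """Breadth-first search over the PDA's configurations on input w."""
    start = ("q0", 0, ())
    seen = {start}
    queue = deque([start])
    while queue:
        state, pos, stack = queue.popleft()
        if state in ACCEPT and pos == len(w):
            return True
        for (q, a, pop), moves in DELTA.items():
            if q != state:
                continue
            if a and (pos >= len(w) or w[pos] != a):
                continue  # the required input symbol is not next
            if pop and (not stack or stack[-1] != pop):
                continue  # the required stack symbol is not on top
            rest = stack[:-1] if pop else stack
            for nxt, push in moves:
                config = (nxt, pos + (1 if a else 0), rest + ((push,) if push else ()))
                if config not in seen:
                    seen.add(config)
                    queue.append(config)
    return False
```

The search terminates here because this table has no ε-cycles; a general PDA simulator would need a bound on stack growth.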
[Figure: state diagram of a PDA with states s, q, p, r. Visible labels: ε, ε → $; 0, ε → X;
1, ε → ε; 1, X → ε; 1, ε → ε; ε, $ → ε.]
Solution
5. Challenge: For any language L, let SUFFIX(L) = {v | uv ∈ L for some string u}. Show that the class
of context-free languages is closed under the SUFFIX operation.
Solution
To show that the class of context-free languages is closed under the SUFFIX operation, we
show that if L is context-free, SUFFIX(L) is also context-free. Let L be a context-free language
recognized by PDA M = (Q, Σ, Γ, δ, q0 , F ); to show that SUFFIX(L) is context-free, we construct
a PDA M ′ = (Q′ , Σ, Γ, δ ′ , q0′ , F ) that recognizes SUFFIX(L). We define the new set of states Q′ ,
transition function δ ′ , and start state q0′ :
• Q′ = Q ∪ Q̂, where Q̂ = ⋃_{q∈Q} {q̂}. In other words, we add a new state q̂ that corresponds
(the relationship will be defined in the transition function) to each existing state q; the
new and existing states make up the set of states of M ′ . We will call q̂ the corresponding
state of q for all q ∈ Q.
• δ ′ : Q′ × Σε × Γε → P(Q′ × Γε ) is defined as:
– For all q ∈ Q, a ∈ Σε , and b ∈ Γε , δ ′ (q, a, b) = δ(q, a, b). In other words, the original
transition function is copied over to our new transition function.
– For all q ∈ Q and b ∈ Γε , δ ′ (q̂, ε, b) contains (r̂, c) for every a ∈ Σε and every
(r, c) ∈ δ(q, a, b). In other words, for each transition from state q that reads a, pops b
from the stack, goes to state r, and pushes c onto the stack, the corresponding state q̂
does not consume any input symbol, pops b from the stack, goes to the corresponding
state r̂, and pushes c onto the stack. This effectively performs what M would do upon
seeing symbol a, but without actually consuming such a symbol.
– For all q ∈ Q, δ ′ (q̂, ε, ε) additionally contains (q, ε). In other words, the transition
from the corresponding state q̂ to the existing state q can be performed without
consuming a symbol or pushing/popping from the stack. The corresponding states have
no transitions that consume an input symbol.
• q0′ = q̂0 . In other words, the new start state is the corresponding state of q0 .
At a high level, we are creating a duplicate version of M where each transition does not read in an
input symbol, and creating a transition from each state in the duplicate M to its corresponding
state in the original M . Now, we show that v ∈ SUFFIX(L) if and only if M ′ accepts v.
• If v ∈ SUFFIX(L), then there is some string u such that uv ∈ L. Since M is the PDA that
recognizes L, then M accepts uv; there exists an accepting computation path from q0 that
ends up in an accepting state after reading uv. To show that M ′ accepts v, we show that
there exists some computation path that results in the acceptance of v. From the start
state q0′ = q̂0 , we can take the transitions δ ′ (q̂, ε, b) that correspond to the transitions
that would occur in the previously mentioned accepting computation path after reading
in u and push/pop from the stack accordingly (without reading any input symbols).
After taking these transitions, the state of our current computation is the following: while
our stack is identical to the resulting stack if M were to run on u along the accepting
computation path, we are currently at the corresponding state q̂ to the state q that M
would currently be at. We then take the transition δ ′ (q̂, ε, ε) to reach the existing state q.
From here, we can read in the input string v and take the transitions δ ′ (q, a, b) that are
along the accepting computation path. Therefore, if v ∈ SUFFIX(L), M ′ accepts v.
• If M ′ accepts v, then some computation path must end up in an accepting state. Each
computation path starts by running some number of transitions from the corresponding
states before eventually getting to an existing state q ∈ Q; upon arriving at this existing
state q, the computation path follows the transitions of the original PDA M . This means
that there exists some string u such that if we read u first in M and then read in v, then
M will accept. Therefore, uv ∈ L and v ∈ SUFFIX(L).
is context-free.
Solution
First note that L is the union of two languages L1 , L2 where L1 = {xy | x, y ∈ {a, b}∗ , |x| =
|y|, x ≠ y} and L2 = {w | |w| ≡ 1 mod 2} (i.e., the string is of odd length). We show L1 and
L2 are context-free by defining CFGs G1 and G2 where L(G1 ) = L1 and L(G2 ) = L2 .
G1 :
S1 → AB | BA
A → a | aAa | aAb | bAa | bAb
B → b | aBa | aBb | bBa | bBb
directly as a result of how the rules are defined. We can see that x and y will be at the same
position in the first and second halves of the string; thus w ∈ L1 .
We then show that if w ∈ L1 , then w ∈ L(G1 ). Since we have w = xy where |x| = |y| and x ≠ y,
x and y must differ in at least one position; we derive the differing symbols through the rules
A → a and B → b, and derive the rest of the string using the production rules
A → aAa | aAb | bAa | bAb and B → aBa | aBb | bBa | bBb.
G2 :
S2 → aX | bX
X → aS2 | bS2 | ε
Informal justification: the grammar only generates strings of odd length since we alternate
between containing the variables S2 and X, and only X can end the derivation (deriving ε).
Since the class of context-free languages is closed under union, we can union together the two
CFLs L1 and L2 to yield L, which proves that L is context-free.
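The claim L(G1) = L1 can be spot-checked on short strings. The sketch below relies on one observation about the rules: A derives exactly the odd-length strings over {a, b} whose middle symbol is a, and B those whose middle symbol is b. The function names are hypothetical, and agreement on short strings is a sanity check rather than a proof.

```python
from itertools import product


def derivable_from_A(s: str) -> bool:
    """A -> a | aAa | aAb | bAa | bAb: odd length with middle symbol a."""
    return len(s) % 2 == 1 and s[len(s) // 2] == "a"


def derivable_from_B(s: str) -> bool:
    """B -> b | aBa | aBb | bBa | bBb: odd length with middle symbol b."""
    return len(s) % 2 == 1 and s[len(s) // 2] == "b"


def derivable_from_S1(w: str) -> bool:
    """S1 -> AB | BA: try every split point of w."""
    return any(
        (derivable_from_A(w[:i]) and derivable_from_B(w[i:]))
        or (derivable_from_B(w[:i]) and derivable_from_A(w[i:]))
        for i in range(1, len(w))
    )


def in_L1(w: str) -> bool:
    """w = xy with |x| = |y| and x != y."""
    half, odd = divmod(len(w), 2)
    return odd == 0 and w[:half] != w[half:]


def grammar_matches_L1(n: int) -> bool:
    """Compare G1 with the definition of L1 on all strings of length <= n."""
    return all(
        derivable_from_S1("".join(t)) == in_L1("".join(t))
        for length in range(n + 1)
        for t in product("ab", repeat=length)
    )
```

The split point between the A-part and the B-part does not have to be the midpoint of w, which is why `derivable_from_S1` tries every position.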