
COMS W3261 Fall 2022 Handout 7b: Midterm Review Solutions

Helen Chu and Eumin Hong

October 23, 2022



1 Problem Set A: Regular Languages


Design deterministic finite automata for the following languages. You can give the DFAs by their transition
diagrams. You do not need to show that they are correct.
Note: In your transition diagrams, you can use shorthand notation on the labels of the edges. For example,
you can label an edge by Σ \ {a} to indicate that the transition takes place for all input symbols except a.
Make sure to specify the starting state and the accepting states in your diagrams.
1. L is the language over the alphabet Σ = {0, 1, 2} consisting of all strings that satisfy the following:
• Every 0 is immediately followed by a 1, every 1 is immediately followed by a 2, and every 2 is
immediately followed by a 0.
• The string starts and ends with the same symbol.
• The string must have length at least 1.

Solution
We can construct a DFA with an omitted garbage state that recognizes L:

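[Transition diagram, reconstructed from the extracted labels and the description below. Start state: qs.
Accepting states: q00, q11, q22. All missing transitions go to the omitted garbage state.]
qs --0--> q00,  q00 --1--> q01,  q01 --2--> q02,  q02 --0--> q00
qs --1--> q11,  q11 --2--> q12,  q12 --0--> q10,  q10 --1--> q11
qs --2--> q22,  q22 --0--> q20,  q20 --1--> q21,  q21 --2--> q22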

All accepted strings must have length at least 1 since the initial state is not accepting. We
branch from qs depending on the first symbol of w. WLOG, let the first symbol of w be 0. Once
we take the transition from qs to q00 , we return to the accepting state q00 if and only if we
read a sequence of symbols (after the initial 0) of the form (120)∗ , which ensures the first two
conditions of the strings in the language.
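The DFA above can be checked mechanically. The following is a minimal sketch, assuming that the accepting states are q00, q11, q22 (by the symmetry argument above) and that "immediately followed by" constrains only symbols that have a successor, which is the reading under which the one-symbol string 0 is accepted. The names `DELTA`, `dfa_accepts`, `in_language`, and `check_up_to` are ours, not part of the handout.

```python
from itertools import product

# Transition table of the DFA; any missing entry leads to the omitted
# garbage state, represented here by None. (Assumed reconstruction.)
DELTA = {
    ("qs", "0"): "q00", ("q00", "1"): "q01", ("q01", "2"): "q02", ("q02", "0"): "q00",
    ("qs", "1"): "q11", ("q11", "2"): "q12", ("q12", "0"): "q10", ("q10", "1"): "q11",
    ("qs", "2"): "q22", ("q22", "0"): "q20", ("q20", "1"): "q21", ("q21", "2"): "q22",
}
ACCEPT = {"q00", "q11", "q22"}


def dfa_accepts(w):
    """Run the DFA on w; falling off the table means the garbage state."""
    state = "qs"
    for ch in w:
        state = DELTA.get((state, ch))
        if state is None:
            return False
    return state in ACCEPT


def in_language(w):
    """Check the three conditions of the problem statement directly."""
    successor = {"0": "1", "1": "2", "2": "0"}
    follows = all(w[i + 1] == successor[w[i]] for i in range(len(w) - 1))
    return len(w) >= 1 and follows and w[0] == w[-1]


def check_up_to(max_len):
    """Compare the DFA with the direct check on all strings up to max_len."""
    for n in range(max_len + 1):
        for symbols in product("012", repeat=n):
            w = "".join(symbols)
            if dfa_accepts(w) != in_language(w):
                return False
    return True
```

Calling `check_up_to(7)` compares the two definitions on every string of length at most 7 over {0, 1, 2}.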


2. The set of strings over the alphabet Σ = {a, b, . . . , z} that contain at least one m between any two a’s
in the string; for example abc, john, mama, american are in the language, but papa, panamerican are
not.

Solution
We can construct a DFA that recognizes this language:

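[Transition diagram, reconstructed from the extracted labels. Start state: q0. Accepting states: q0, q1.
The m edge out of q1 is inferred from the language definition.]
q0 --Σ \ {a}--> q0,  q0 --a--> q1
q1 --Σ \ {a, m}--> q1,  q1 --m--> q0,  q1 --a--> q2
q2 --Σ--> q2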

Whenever we see an a, if we ever see another a before seeing any ms, we reject.
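As a quick sanity check, the three-state machine can be simulated on the example words from the problem statement. This is a sketch under the assumption that q1 returns to q0 on m and that q0 and q1 are the accepting states; the function name `accepts` is ours.

```python
def accepts(w):
    """Simulate the three-state DFA.

    q0: no a is pending; q1: an a was seen and no m has followed it yet;
    q2: rejecting sink (two a's with no m between them).
    """
    state = "q0"
    for ch in w:
        if state == "q0" and ch == "a":
            state = "q1"
        elif state == "q1" and ch == "a":
            state = "q2"
        elif state == "q1" and ch == "m":
            state = "q0"
        # every other (state, symbol) pair is a self-loop
    return state in {"q0", "q1"}
```

For instance, `accepts("american")` holds because the m resets the machine before the second a, while `accepts("panamerican")` fails at the second a of "pana".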

3. Challenge: The set of binary strings which represent in binary a number that is an integer multiple
of 3 (leading zeros are allowed). For example, 00, 110 are in the language (they correspond to 0, 6
respectively), but 001, 101 are not (they correspond to 1, 5). Hint: Three states are enough. Think
about what each additional symbol in a binary string does to the number; it might be helpful to write
out some examples.

Solution
As the hint mentions, three states are enough for our DFA; thus, it is natural to think about
how we can use the three states as a counter for the modulus operation, where 0 mod 3 is the
accepting state for our binary string. We see the following relationship between the values mod
3:

curr mod 3 | x | next mod 3
-----------+---+------------
     0     | 0 |     0
     0     | 1 |     1
     1     | 0 |     2
     1     | 1 |     0
     2     | 0 |     1
     2     | 1 |     2

Thus we construct our DFA as follows:

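[Transition diagram, reconstructed from the table above. Start state and only accepting state: q0,
where qi means that the number read so far is congruent to i mod 3.]
q0 --0--> q0,  q0 --1--> q1
q1 --0--> q2,  q1 --1--> q0
q2 --0--> q1,  q2 --1--> q2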
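The table follows from the fact that appending a bit x to a binary string with value v produces the value 2v + x, so the remainder r becomes (2r + x) mod 3. A minimal sketch that runs the resulting DFA and compares it with ordinary arithmetic is given below; the names `DELTA` and `multiple_of_three` are ours, and treating the empty string as the number 0 is an assumption.

```python
# (current remainder, next bit) -> next remainder, i.e. (2 * r + x) mod 3
DELTA = {
    (0, "0"): 0, (0, "1"): 1,
    (1, "0"): 2, (1, "1"): 0,
    (2, "0"): 1, (2, "1"): 2,
}


def multiple_of_three(bits):
    """Run the three-state DFA; the state is the remainder mod 3 so far."""
    remainder = 0  # start state q0
    for bit in bits:
        remainder = DELTA[(remainder, bit)]
    return remainder == 0  # q0 is the only accepting state
```

For example, `multiple_of_three("110")` holds (the string represents 6), while `multiple_of_three("101")` does not (it represents 5).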


For the following problems, if a language L is given, prove that L is regular or prove that L is nonregular.
4. L = {ww | w ∈ {0, 1}∗ and w contains at least one 0 and at least one 1} over the alphabet Σ = {0, 1}.

Solution
Define the following language:

A = {ww | w ∈ {0, 1}∗ and w does not contain both a 0 and a 1}

We see that A = L((00)∗ ∪ (11)∗ ), as every string in A must have even length (because every string
in A has the form ww) and must be composed of only 0s or only 1s (in order to not contain
both a 0 and a 1).
We then observe that L ∪ A = B, where

B = {ww | w ∈ {0, 1}∗ }

Since A can be expressed as a regular expression, A is regular. Now assume, for the sake of
contradiction, that L is also regular. Since the class of regular languages is closed under the
union operation, if L and A are both regular, then B must also be regular.
However, B is not context-free and therefore not regular.[a]
Since B is not regular, then due to closure properties, it cannot be the case that both L and A
are regular, so L must be nonregular.
[a] Every regular language is context-free, so by the contrapositive, a language that is not
context-free is not regular. It was proved in Lecture 12 (10/17) that B is not context-free.

5. Prove that the union of a regular language L1 and nonregular language L2 , L1 ∪L2 , such that L1 ∩L2 = ∅
results in a nonregular language.

Solution
To prove that L1 ∪ L2 is nonregular, we assume for the sake of contradiction that it is regular.
Then, we aim to show that if L1 ∪ L2 is regular, then L2 will always be regular, which is a
contradiction.
To construct an expression for L2 in terms of L1 ∪ L2 and L1 , we observe that

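$L_2 = (L_1 \cup L_2) \cap \overline{L_1}$

where $\overline{L_1}$ denotes the complement of $L_1$. In other words, $L_2$ consists of everything
that is in $L_1$ or in $L_2$ (which is $L_1 \cup L_2$) and that is not in $L_1$ (which is
$\overline{L_1}$). This identity relies on the assumption $L_1 \cap L_2 = \emptyset$: distributing
the intersection gives $(L_1 \cap \overline{L_1}) \cup (L_2 \cap \overline{L_1}) = \emptyset \cup L_2$,
since every string of $L_2$ lies outside $L_1$. Since $L_1$ is regular and the class of regular
languages is closed under the complement operation, $\overline{L_1}$ must also be regular. Since (we
assumed for the sake of contradiction that) $L_1 \cup L_2$ is regular and the class of regular
languages is closed under the intersection operation, $(L_1 \cup L_2) \cap \overline{L_1} = L_2$ must
be regular. However, this leads us to a contradiction, as $L_2$ was stated to be nonregular. Thus,
$L_1 \cup L_2$ must be nonregular.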

6. $L = \{a^i b^j c^k \mid i + j = k\}$.

Solution
L is nonregular as it does not satisfy the pumping lemma for regular languages. Let p be the pumping
length. We choose the string $w = a^p b^p c^{2p}$, and it is clear that $w \in L$ and
$|w| = 4p \ge p$. For all partitions $w = xyz$, due to the conditions $|xy| \le p$ and $|y| > 0$,
we see that y must consist of at least one a and is composed only of a's, or in other words,
$y = a^k$ with $0 < k \le p$. We then consider the pumped-up string
$xy^2z = a^{p+k} b^p c^{2p}$ and see that $xy^2z \notin L$ (as $(p + k) + p = 2p + k \ne 2p$
since $k \ne 0$); for all partitions $w = xyz$, w cannot be pumped. Since L does not satisfy the
pumping lemma for regular languages, L is not a regular language.
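The case analysis above can be replayed by brute force for small values of p. The sketch below enumerates every split w = xyz with |xy| ≤ p and |y| > 0 and confirms that pumping y once more leaves the language; the helper names `in_L` and `every_split_fails` are ours, and we assume i, j, k ≥ 0.

```python
def in_L(s):
    """Membership in {a^i b^j c^k | i + j = k}, assuming i, j, k >= 0."""
    i = len(s) - len(s.lstrip("a"))          # number of leading a's
    rest = s[i:]
    j = len(rest) - len(rest.lstrip("b"))    # number of b's that follow
    tail = rest[j:]
    return set(tail) <= {"c"} and i + j == len(tail)


def every_split_fails(p):
    """True if xy^2z is outside L for every legal split of a^p b^p c^(2p)."""
    w = "a" * p + "b" * p + "c" * (2 * p)
    assert in_L(w)
    for xy_len in range(1, p + 1):
        for y_len in range(1, xy_len + 1):
            x = w[:xy_len - y_len]
            y = w[xy_len - y_len:xy_len]
            z = w[xy_len:]
            if in_L(x + y * 2 + z):
                return False
    return True
```

A finite check like this is only an illustration of the argument for particular p, not a replacement for the proof, which must hold for every pumping length.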


7. $L = \{0^k u 0^k \mid k \ge 1, u \in \Sigma^*\}$.

Solution
We prove that L is regular by providing a regular expression. Every string in L starts and ends
with 0 and has length at least 2, since k ≥ 1. Conversely, taking k = 1 shows that L contains
every string of the form 0u0 with u ∈ Σ∗ , that is, every string of length at least 2 that starts
and ends with 0 (note that u covers all characters between the starting and ending characters).
Together, these two observations give L = L(0Σ∗ 0), so L is regular.
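The equality L = L(0Σ∗0) can be spot-checked by comparing the definition with the regular expression on all short strings. This is a sketch assuming Σ = {0, 1}; the names `in_L` and `agrees_with_regex` are ours.

```python
import re
from itertools import product


def in_L(s):
    """Definition check: s = 0^k u 0^k for some k >= 1 (so len(s) >= 2k)."""
    return any(
        s.startswith("0" * k) and s.endswith("0" * k)
        for k in range(1, len(s) // 2 + 1)
    )


def agrees_with_regex(max_len):
    """Compare the definition with 0 Sigma* 0 on all strings up to max_len."""
    pattern = re.compile(r"0[01]*0")
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            s = "".join(symbols)
            if in_L(s) != (pattern.fullmatch(s) is not None):
                return False
    return True
```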

8. $L = \{0^k 1 u 0^k \mid k \ge 1, u \in \Sigma^*\}$.

Solution
L is nonregular as it does not satisfy the pumping lemma for regular languages. Let p be the pumping
length. We choose the string $w = 0^p 1 0^p$, and it is clear that $w \in L$ and
$|w| = 2p + 1 \ge p$. For all partitions $w = xyz$, due to the conditions $|xy| \le p$ and
$|y| > 0$, we see that y must consist of at least one 0 and is composed only of 0s, or in other
words, $y = 0^k$ with $0 < k \le p$. We then consider the pumped-up string
$xy^2z = 0^{p+k} 1 0^p$ and see that $xy^2z \notin L$: the only 1 is preceded by p + k 0s, but
only p 0s follow it, so the string cannot end with $0^{p+k}$. For all partitions $w = xyz$, w
cannot be pumped.[a] Since L does not satisfy the pumping lemma for regular languages, L is not a
regular language.
[a] By pumping up, we see the difference between this language L and the previous language
$L_R = \{0^k u 0^k \mid k \ge 1, u \in \Sigma^*\}$: the fixed 1 allows for the distinction between
the substring of the first occurrence of $0^k$ and u, which in turn allows us to pump up the string
w in our solution.
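The same kind of brute-force replay works here. In the sketch below, membership uses the observation that the 1 must directly follow the leading block of 0s, so k is forced to be the number of leading 0s; we assume Σ = {0, 1}, and the names `in_L` and `pumped_up_strings` are ours.

```python
def in_L(s):
    """Membership in {0^k 1 u 0^k | k >= 1}: k must equal the leading 0 count."""
    k = len(s) - len(s.lstrip("0"))
    return k >= 1 and len(s) >= 2 * k + 1 and s[k] == "1" and s.endswith("0" * k)


def pumped_up_strings(p):
    """Yield xy^2z for every split of 0^p 1 0^p with |xy| <= p and |y| > 0."""
    w = "0" * p + "1" + "0" * p
    for xy_len in range(1, p + 1):
        for y_len in range(1, xy_len + 1):
            x = w[:xy_len - y_len]
            y = w[xy_len - y_len:xy_len]
            z = w[xy_len:]
            yield x + y * 2 + z
```

It is worth noticing that pumping down would not work for this choice of w: removing one leading 0 from 0^3 1 0^3 gives 0^2 1 0^3, which is still in L with k = 2 and u = 0.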

9. $L = \{0^i 1^j \mid i, j \ge 0, i \ne j\}$.

Solution
We will prove that L is nonregular using closure properties. Assume, for the sake of contradiction,
that $\overline{L}$ (the complement of L) is regular. Then, we see that

$L(0^* 1^*) \cap \overline{L} = \{0^n 1^n \mid n \ge 0\}$

It is important to note that while $\overline{L}$ does contain strings of the form $0^n 1^n$,
$n \ge 0$, $\overline{L}$ also contains strings that are not of the form $0^* 1^*$, as these strings
are not in L. Thus, when we take the intersection of $L(0^* 1^*)$ and $\overline{L}$, the result is
the set of all strings of the form $0^* 1^*$ where the number of 0s is equal to the number of 1s.
However, the resulting language $\{0^n 1^n \mid n \ge 0\}$ is nonregular, as was proved in class.[a]
Since $\{0^n 1^n \mid n \ge 0\}$ is nonregular and the class of regular languages is closed under
intersection, it cannot be the case that both $L(0^* 1^*)$ and $\overline{L}$ are regular, so
$\overline{L}$ must be nonregular. Since the class of regular languages is closed under the
complement operation, if $\overline{L}$ is nonregular, then L itself must be nonregular.[b]
[a] See Lecture 8 (10/3). If you would like to use results from class, you must state that you are
doing so; simply stating that the resulting language is nonregular without stating that it was
proved such in class is not sufficient.
[b] To see this, consider what would be true about $\overline{L}$ if L were to be regular.
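The set identity at the heart of this argument, that intersecting L(0∗1∗) with the complement of L leaves exactly the strings with equally many 0s and 1s, can be verified on all short binary strings. This is a minimal sketch; the names `in_L` and `identity_holds` are ours.

```python
import re
from itertools import product


def in_L(s):
    """Membership in {0^i 1^j | i, j >= 0, i != j}."""
    m = re.fullmatch(r"(0*)(1*)", s)
    return m is not None and len(m.group(1)) != len(m.group(2))


def identity_holds(max_len):
    """Check L(0*1*) intersected with the complement of L against {0^n 1^n}."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            s = "".join(symbols)
            lhs = re.fullmatch(r"0*1*", s) is not None and not in_L(s)
            rhs = n % 2 == 0 and s == "0" * (n // 2) + "1" * (n // 2)
            if lhs != rhs:
                return False
    return True
```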

10. $L = \{0^n 1^n \mid 0 \le n \le 3\}$

Solution
Since L is finite, L is regular.


11. Challenge: Let L be the language consisting of all strings of a’s and b’s with an equal number of
occurrences of ab and ba as substrings. (The string aabbbaa has one occurrence of each of the substrings
ab and ba.) Is L a regular language? Prove your answer.

Solution

Yes, L is regular: in fact, L can be described equivalently as the set of strings over {a, b} that
are empty or begin and end with the same symbol. We break into two scenarios: either the string
contains no occurrences of ab or ba, or it contains at least one such occurrence. The first case is
trivial, since the string then consists of a single repeated symbol; for the second scenario, WLOG,
let the string start with a. Once ab is introduced to the string, the string must end with an a to
balance the number of occurrences of ab and ba. If the string instead ends with b, then there are
more occurrences of ab than ba. We can construct an NFA that recognizes L as follows:

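[NFA transition diagram with start state q0 and states q1, q2, q3, q4. From the extracted labels:
q0 branches on a toward q1 and q3 and on b toward q2 and q4; q1 and q2 carry self-loops on a, b;
q1 --a--> q3 and q2 --b--> q4. The remaining details of the figure are not recoverable.]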

Or equivalently, L can be generated by the regular expression

ε ∪ a ∪ b ∪ (a(a ∪ b)∗ a) ∪ (b(a ∪ b)∗ b)
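The claimed equivalence between "equally many occurrences of ab and ba" and the regular expression above can be tested exhaustively on short strings. A minimal sketch follows; the names `occurrences` and `agree` are ours, and the Python pattern is our transcription of the expression above.

```python
import re
from itertools import product

# Transcription of: ε ∪ a ∪ b ∪ a(a ∪ b)*a ∪ b(a ∪ b)*b
PATTERN = re.compile(r"(?:a[ab]*a|b[ab]*b|a|b)?")


def occurrences(s, pair):
    """Count (possibly overlapping) occurrences of a two-symbol substring."""
    return sum(1 for i in range(len(s) - 1) if s[i:i + 2] == pair)


def agree(max_len):
    """Compare the counting definition with the regex up to max_len."""
    for n in range(max_len + 1):
        for symbols in product("ab", repeat=n):
            s = "".join(symbols)
            balanced = occurrences(s, "ab") == occurrences(s, "ba")
            if balanced != (PATTERN.fullmatch(s) is not None):
                return False
    return True
```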


2 Problem Set B: Context Free Languages


1. Show that the language

$L = \{a^i b^j \mid i, j \ge 0,\ i \le j \le 3i\}$

is context-free.

Solution
The language L is context-free because the following CFG G3 generates it. Let G3 =
({S}, {a, b}, R, S) where R = {S → aSb | aSbb | aSbbb | ε}. An informal justification is as
follows: for each non-empty string (the string ε is trivially in L and generated by G3), each
application of a non-ε rule generates one a on the left and n b's on the right, where 1 ≤ n ≤ 3.
If i a's are generated, then the minimum number of b's generated is j = i and the maximum number
of b's generated is j = 3i, so i ≤ j ≤ 3i. Conversely, every j with i ≤ j ≤ 3i can be written as
a sum of i values from {1, 2, 3}, so every string of L is generated.
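To illustrate the justification, we can enumerate every string the grammar derives with a bounded number of rule applications and compare the result with the definition of L. This is a sketch; the function name `generated` is ours.

```python
def generated(max_a):
    """All strings derivable with at most max_a uses of S -> aSb | aSbb | aSbbb,
    followed by S -> ε."""
    layer = {""}          # strings derived with exactly 0 non-ε steps
    result = set(layer)
    for _ in range(max_a):
        # wrap each inner string with one a on the left and 1..3 b's on the right
        layer = {"a" + s + "b" * n for s in layer for n in (1, 2, 3)}
        result |= layer
    return result
```

For example, `generated(2)` contains aabbbb (j = 4, obtained as 1 + 3 or 2 + 2) but not aab, which would need j < i.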

2. Give a context-free grammar that generates the language

$L = \{a^i b^j c^k : i = j \text{ or } j = k, \text{ where } i, j, k \ge 0\}$

Is your grammar ambiguous? Why or why not? (Consider the string $a^n b^n c^n$.)

Solution
An approach is to consider L as the union of two languages where i = j and j = k, and construct
the CFG accordingly:

S → S1 | S2
S1 → U V
U → aU b | ε
V → cV | ε
S2 → XY
X → aX | ε
Y → bY c | ε

An informal justification: S1 derives the strings where i = j, because the rule U → aU b ensures
that the number of a's equals the number of b's, and the rule S1 → U V ensures the order of abc.
S2 derives the strings where j = k, because the rule Y → bY c ensures that the number of b's
equals the number of c's, and the rule S2 → XY ensures the order of abc.
This grammar is ambiguous: the string $w = a^n b^n c^n$ satisfies both i = j and j = k and thus
can be derived from both S1 and S2. This gives two parse trees for w.
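The ambiguity can be made concrete by counting parse trees. The following is a minimal sketch of a memoized tree counter for the grammar above; the names `RULES`, `count_trees`, and `count_seq` are ours, and the approach relies on this particular grammar having no cyclic ε or unit derivations, so the recursion terminates.

```python
from functools import lru_cache

# The grammar from the solution; () stands for an ε right-hand side.
RULES = {
    "S": [("S1",), ("S2",)],
    "S1": [("U", "V")],
    "U": [("a", "U", "b"), ()],
    "V": [("c", "V"), ()],
    "S2": [("X", "Y")],
    "X": [("a", "X"), ()],
    "Y": [("b", "Y", "c"), ()],
}


@lru_cache(maxsize=None)
def count_trees(symbol, s):
    """Number of parse trees deriving the string s from symbol."""
    if symbol not in RULES:  # terminal symbol
        return 1 if s == symbol else 0
    return sum(count_seq(rhs, s) for rhs in RULES[symbol])


@lru_cache(maxsize=None)
def count_seq(rhs, s):
    """Number of ways the symbol sequence rhs can derive s."""
    if not rhs:
        return 1 if s == "" else 0
    head, rest = rhs[0], rhs[1:]
    total = 0
    for i in range(len(s) + 1):
        left = count_trees(head, s[:i])
        if left:  # only recurse on the remainder when the head can match
            total += left * count_seq(rest, s[i:])
    return total
```

With this counter, `count_trees("S", "aabbcc")` is 2 (one tree through S1 and one through S2), whereas a string such as aabbc, which satisfies only i = j, has a single tree.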


3. Construct a pushdown automaton for the language

$L = \{a^n b^k c^n \mid n, k \ge 0\}$

Solution
We construct PDA P as follows:

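[State diagram of P, reconstructed from the extracted labels. Start state: q0. Accepting state: q4.]
q0 --ε, ε → $--> q1
q1 --a, ε → a--> q1 (loop)
q1 --ε, ε → ε--> q2
q2 --b, ε → ε--> q2 (loop)
q2 --ε, ε → ε--> q3
q3 --c, a → ε--> q3 (loop)
q3 --ε, $ → ε--> q4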

The equivalence proof is similar to that of HW3 question 2.


First, we show that every string w of the form a^n b^k c^n with n, k ≥ 0 is accepted by P . When
running P on input w, first $ is pushed to the stack and we enter state q1 . Then we loop on q1 n
times so that n a's are pushed to the stack, and we enter state q2 , where all k b's are read. We
then move to q3 , where we read n c's, popping n a's from the stack. This leaves us with just $ on
the stack, so P transitions to the accept state q4 .
Then, we show that if P accepts a string w, then w ∈ L. P first pushes a $ to the stack. Then it
either reads and pushes n > 0 a's or moves to state q2 immediately. In q2 , P can only read b's,
so the only way for the computation path to reach state q3 with only c's remaining is if all b's
occur after the a's and before the c's. From q3 , we have to read exactly as many c's as a's and
pop that many a's from the stack, since the $ is exposed only once every a has been popped. This
way, we transition to the accept state by popping the final $ on the stack.
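The PDA can also be simulated directly. Below is a minimal sketch of a nondeterministic PDA simulator that searches over configurations (state, input position, stack); the transition list is our transcription of the diagram, with "" standing for ε, and the names `TRANSITIONS`, `pda_accepts`, and `agrees_with_definition` are ours. The search terminates here because this PDA has no cycle of ε-moves.

```python
import re
from itertools import product

# (state, read, pop, push, next_state); "" stands for ε.
TRANSITIONS = [
    ("q0", "", "", "$", "q1"),
    ("q1", "a", "", "a", "q1"),
    ("q1", "", "", "", "q2"),
    ("q2", "b", "", "", "q2"),
    ("q2", "", "", "", "q3"),
    ("q3", "c", "a", "", "q3"),
    ("q3", "", "$", "", "q4"),
]


def pda_accepts(w, start="q0", accept=("q4",)):
    """Explore all configurations; the stack is a string with its top at the end."""
    frontier = [(start, 0, "")]
    seen = set(frontier)
    while frontier:
        state, pos, stack = frontier.pop()
        if state in accept and pos == len(w):
            return True
        for q, read, pop, push, nxt in TRANSITIONS:
            if q != state:
                continue
            if read and (pos >= len(w) or w[pos] != read):
                continue
            if pop and not stack.endswith(pop):
                continue
            new_stack = (stack[:-1] if pop else stack) + push
            config = (nxt, pos + len(read), new_stack)
            if config not in seen:
                seen.add(config)
                frontier.append(config)
    return False


def agrees_with_definition(max_len):
    """Compare the PDA with the definition of L on all strings up to max_len."""
    for n in range(max_len + 1):
        for symbols in product("abc", repeat=n):
            w = "".join(symbols)
            m = re.fullmatch(r"(a*)(b*)(c*)", w)
            expected = m is not None and len(m.group(1)) == len(m.group(3))
            if pda_accepts(w) != expected:
                return False
    return True
```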


4. Describe the language that the following PDA recognizes:

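[State diagram, reconstructed from the extracted labels. Start state: s. Accepting state: r.]
s --ε, ε → $--> q
q --0, ε → X--> q (loop)
q --1, ε → ε--> q (loop)
q --1, ε → ε--> p
p --1, X → ε--> p (loop)
p --ε, $ → ε--> r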

Solution

L = {xy | x ∈ {0, 1}∗ , y ∈ {1}∗ , the number of 0’s in x < |y|}


We prove equivalence by proving both directions of an if-and-only-if relationship.
We first show that if a string w is accepted by the PDA, then w ∈ L. We transition from state
s to state q by pushing a $ marker to the stack, then we loop on state q where, for every 0
we see, we push an X to the stack. For an input of 1 we have two choices: to stay on q, or
nondeterministically choose to transition to state p, where for every 1 we see, we pop an X off
the stack. Finally, we transition to state r by popping $ off the stack. Thus we see that state
q reads a string in {0, 1}∗ while the stack counts the number of 0's, the transition to state p
takes in a 1, and we must loop on state p until we've popped all of the X's off the stack before
we can transition to the accept state; thus the string must have a suffix of 1's that is strictly
longer than the number of 0's in the string.
We then show that if w ∈ L then w is accepted by the PDA. Let n be the number of 0's in w.
Because of the nondeterministic transitions from state q, the PDA accepts any string whose suffix
of 1's has length greater than n, since we can loop on q and take the 1 transition to p at the
(n + 1)-th character from the end of the string; the remaining n 1's then pop the n X's.
Thus L is equal to the language recognized by the PDA.
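Because the stack of this PDA always has the shape $ followed by some number of X's, a configuration can be summarized by a state and a stack height, and the nondeterminism can be handled by tracking the set of reachable configurations. The sketch below uses that representation, which is our simplification rather than part of the handout; the names `accepts` and `in_L` are ours.

```python
from itertools import product


def accepts(w):
    """Track all reachable (state, number of X's above $) pairs."""
    configs = {("q", 0)}  # after s --ε, ε → $--> q
    for ch in w:
        nxt = set()
        for state, height in configs:
            if state == "q" and ch == "0":
                nxt.add(("q", height + 1))       # push X
            if state == "q" and ch == "1":
                nxt.add(("q", height))           # stay on q
                nxt.add(("p", height))           # guess that the suffix starts
            if state == "p" and ch == "1" and height > 0:
                nxt.add(("p", height - 1))       # pop X
        configs = nxt
    # p --ε, $ → ε--> r is possible only when no X remains above $
    return ("p", 0) in configs


def in_L(w):
    """w = xy with y in 1* and (number of 0's in x) < |y|: equivalently, the
    trailing run of 1's is strictly longer than the number of 0's in w."""
    trailing_ones = len(w) - len(w.rstrip("1"))
    return trailing_ones > w.count("0")
```

For example, `accepts("011")` holds (one 0, two trailing 1's), while `accepts("01")` does not.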


5. Challenge: For any language L, let SUFFIX(L) = {v | uv ∈ L for some string u}. Show that the class
of context-free languages is closed under the SUFFIX operation.

Solution
To show that the class of context-free languages is closed under the SUFFIX operation, we
show that if L is context-free, SUFFIX(L) is also context-free. Let L be a context-free language
recognized by PDA M = (Q, Σ, Γ, δ, q0 , F ); to show that SUFFIX(L) is context-free, we construct
a PDA M ′ = (Q′ , Σ, Γ, δ ′ , q0′ , F ) that recognizes SUFFIX(L). We define the new set of states Q′ ,
transition function δ ′ , and start state q0′ :
• Q′ = Q ∪ Q̂, where Q̂ = ⋃_{q∈Q} {q̂}. In other words, we add a new state q̂ that corresponds
(the relationship will be defined in the transition function) to each existing state q; the
new and existing states make up the set of states of M ′ . We will call q̂ the corresponding
state of q for all q ∈ Q.
• δ ′ : Q′ × Σε × Γε → P(Q′ × Γε ) is defined as:
– For all q ∈ Q, a ∈ Σε , and b ∈ Γε , δ ′ (q, a, b) = δ(q, a, b). In other words, the original
transition function is copied over to our new transition function.
– For all q ∈ Q and b ∈ Γε , δ ′ (q̂, ε, b) contains (r̂, c) whenever (r, c) ∈ δ(q, a, b) for some
a ∈ Σε . In other words, for each transition from state q that reads a, pops b from the
stack, goes to state r, and pushes c onto the stack, the corresponding state q̂ does not
consume any input symbol, pops b from the stack, goes to the corresponding state r̂, and
pushes c onto the stack. This effectively performs what M would do upon seeing symbol a,
but without actually consuming such symbol.
– For all q ∈ Q, δ ′ (q̂, ε, ε) additionally contains (q, ε). In other words, the transition
from the corresponding state q̂ to the existing state q can be performed without
consuming a symbol or pushing/popping from the stack.
– For all q ∈ Q, a ∈ Σ, and b ∈ Γε , δ ′ (q̂, a, b) = ∅. In other words, the corresponding
states never consume an input symbol.
• q0′ = q̂0 . In other words, the new start state is the corresponding state of q0 .
At a high level, we are creating a duplicate version of M where each transition does not read in an
input symbol, and creating a transition from each state in the duplicate M to its corresponding
state in the original M . Now, we show that v ∈ SUFFIX(L) if and only if M ′ accepts v.
• If v ∈ SUFFIX(L), then there is some string u such that uv ∈ L. Since M is the PDA that
recognizes L, then M accepts uv; there exists an accepting computation path from q0 that
ends up in an accepting state after reading uv. To show that M ′ accepts v, we show that
there exists some computation path that results in the acceptance of v. From the start
state q0′ = q̂0 , we can take the transitions δ ′ (q̂, ε, b) that correspond to the transitions
that would occur in the previously mentioned accepting computation path after reading
in u and push/pop from the stack accordingly (without reading any input symbols).
After taking these transitions, the state of our current computation is the following: while
our stack is identical to the resulting stack if M were to run on u along the accepting
computation path, we are currently at the corresponding state q̂ of the state q that M
would currently be at. We then take the transition δ ′ (q̂, ε, ε) to reach the existing state q.
From here, we can read in the input string v and take the transitions δ ′ (q, a, b) that are
along the accepting computation path. Therefore, if v ∈ SUFFIX(L), M ′ accepts v.
• If M ′ accepts v, then some computation path must end up in an accepting state. Each
computation path starts by running some number of transitions from the corresponding
states before eventually getting to an existing state q ∈ Q; upon arriving at this existing
state q, the computation path follows the transitions of the original PDA M . This means
that there exists some string u such that if we read u first in M and then read in v, then
M will accept. Therefore, uv ∈ L and v ∈ SUFFIX(L).


6. Challenge: Show that the complement of the language

L = {ww | w ∈ {a, b}∗ }

is context-free.

Solution

First note that $\overline{L}$, the complement of L, is the union of two languages $L_1$, $L_2$ where
$L_1 = \{xy \mid x, y \in \{a, b\}^*, |x| = |y|, x \ne y\}$ and
$L_2 = \{w \mid |w| \equiv 1 \pmod 2\}$ (i.e., the string is of odd length). We show L1 and
L2 are context-free by defining CFGs G1 and G2 where L(G1 ) = L1 and L(G2 ) = L2 .

G1 :

S1 → AB | BA
A → a | aAa | aAb | bAa | bAb
B → b | aBa | aBb | bBa | bBb

We show that L(G1 ) = L1 through an iff relation.

We first show that if w ∈ L(G1 ), then w ∈ L1 . We can see that

$L(G_1) = \{w_1 x w_2 v_1 y v_2 \mid |w_1| = |w_2| = k, |v_1| = |v_2| = l, x, y \in \{a, b\}, x \ne y\}$

directly as a result of how the rules are defined. Such a string has length 2(k + l + 1); x is its
(k + 1)-th symbol, and y is its (2k + l + 2)-th symbol, which is the (k + 1)-th symbol of the second
half. Thus x and y are at the same place in the first and second half of the string, so the two
halves differ and w ∈ L1 .
We then show that if w ∈ L1 then w ∈ L(G1 ). Since we have w = xy where |x| = |y| and x ̸= y, x and
y must differ in at least one position; we derive the rest of the string using the production rules
A → aAa | aAb | bAa | bAb and B → aBa | aBb | bBa | bBb, and derive the two differing characters
through the rules A → a and B → b.

G2 :

S2 → aX | bX
X → aS2 | bS2 | ε

Informal justification: the grammar only generates strings of odd length since we alternate
between containing the variables S2 and X, and only X can end the derivation (deriving ε).
Thus, we can union together the two CFLs L1 and L2 to yield $\overline{L}$; since the class of
context-free languages is closed under union, this proves that $\overline{L}$ is context-free.
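The decomposition can be checked on short strings by deriving, length by length, everything that G1 and G2 generate and comparing the union with the set of strings that are not of the form ww. This is a sketch that transcribes the productions above; the function names `derive`, `g1`, `s2`, `x_var`, and `not_square` are ours.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def derive(var, n):
    """Strings of length n derivable from A or B in G1."""
    if n == 1:
        return frozenset({"a" if var == "A" else "b"})   # A -> a, B -> b
    if n < 1 or n % 2 == 0:
        return frozenset()
    # A -> aAa | aAb | bAa | bAb (and likewise for B)
    return frozenset(x + s + y for s in derive(var, n - 2) for x in "ab" for y in "ab")


def g1(n):
    """Strings of length n derivable from S1 -> AB | BA."""
    out = set()
    for i in range(1, n):
        for first, second in (("A", "B"), ("B", "A")):
            out |= {u + v for u in derive(first, i) for v in derive(second, n - i)}
    return out


@lru_cache(maxsize=None)
def s2(n):
    """Strings of length n derivable from S2 -> aX | bX."""
    if n < 1:
        return frozenset()
    return frozenset(c + s for c in "ab" for s in x_var(n - 1))


@lru_cache(maxsize=None)
def x_var(n):
    """Strings of length n derivable from X -> aS2 | bS2 | ε."""
    if n == 0:
        return frozenset({""})
    return frozenset(c + s for c in "ab" for s in s2(n - 1))


def not_square(n):
    """All strings of length n over {a, b} that are not of the form ww."""
    words = ("".join(t) for t in product("ab", repeat=n))
    return {w for w in words if n % 2 == 1 or w[: n // 2] != w[n // 2:]}
```

For each small n, `g1(n) | s2(n)` can then be compared with `not_square(n)`.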
