
Models of Computation Lecture 3: Finite-State Machines [Fa’14]

Caveat lector! This is the first edition of this lecture note. A few topics are missing, and
there are almost certainly a few serious errors. Please send bug reports and suggestions to
[email protected].

Life only avails, not the having lived. Power ceases in the instant of repose;
it resides in the moment of transition from a past to a new state,
in the shooting of the gulf, in the darting to an aim.
— Ralph Waldo Emerson, “Self Reliance”, Essays, First Series (1841)

O Marvelous! what new configuration will come next?


I am bewildered with multiplicity.
— William Carlos Williams, “At Dawn” (1914)

3 Finite-State Machines
3.1 Intuition
Suppose we want to determine whether a given string w[1 .. n] of bits represents a multiple of 5
in binary. After a bit of thought, you might realize that you can read the bits in w one at a time,
from left to right, keeping track of the value modulo 5 of the prefix you have read so far.

MultipleOf5(w[1 .. n]):
  rem ← 0
  for i ← 1 to n
    rem ← (2 · rem + w[i]) mod 5
  if rem = 0
    return True
  else
    return False
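The pseudocode above translates almost directly into Python. The following sketch is mine, not part of the original notes; the function name is an invention for illustration.

```python
def multiple_of_5(w: str) -> bool:
    """Return True if the bit string w represents a multiple of 5 in binary."""
    rem = 0
    for bit in w:
        # Reading one more bit doubles the prefix value and adds the new bit.
        rem = (2 * rem + int(bit)) % 5
    return rem == 0
```

Note that the empty string is accepted, since the empty prefix has value 0, which matches the DFA's accepting start state.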

Aside from the loop index i, which we need just to read the entire input string, this algorithm
has a single local variable rem, which has only five different values (0, 1, 2, 3, or 4).
This algorithm already runs in O(n) time, which is the best we can hope for—after all, we
have to read every bit in the input—but we can speed up the algorithm in practice. Let’s define a
change or transition function δ : {0, 1, 2, 3, 4} × {0, 1} → {0, 1, 2, 3, 4} as follows:

δ(q, a) = (2q + a) mod 5.

(Here I’m implicitly converting the symbols 0 and 1 to the corresponding integers 0 and 1.) Since
we already know all values of the transition function, we can store them in a precomputed table,
and then replace the computation in the main loop of MultipleOf5 with a simple array lookup.
We can also modify the return condition to check for different values modulo 5. To be
completely general, we replace the final if-then-else lines with another array lookup, using an
array A[0 .. 4] of booleans describing which final mod-5 values are “acceptable”.
After both of these modifications, our algorithm can be rewritten as follows, either iteratively
or recursively (with q = 0 in the initial call):

© Copyright 2014 Jeff Erickson.


This work is licensed under a Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/4.0/).
Free distribution is strongly encouraged; commercial distribution is expressly forbidden.
See http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/ for the most recent revision.


DoSomethingCool(w[1 .. n]):
  q ← 0
  for i ← 1 to n
    q ← δ[q, w[i]]
  return A[q]

DoSomethingCool(q, w):
  if w = ε
    return A[q]
  else
    decompose w = a · x
    return DoSomethingCool(δ(q, a), x)

If we want to use our new DoSomethingCool algorithm to implement MultipleOf5, we simply


give the arrays δ and A the following hard-coded values:

q δ[q, 0] δ[q, 1] A[q]


0 0 1 True
1 2 3 False
2 4 0 False
3 1 2 False
4 3 4 False
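Stored as arrays, this table drives the main loop with a single lookup per symbol. A Python sketch (not part of the notes; names are mine):

```python
# Hard-coded tables from the text: delta[q][a] = (2*q + a) mod 5, A[q] = (q == 0).
delta = [[0, 1], [2, 3], [4, 0], [1, 2], [3, 4]]
A = [True, False, False, False, False]

def do_something_cool(w: str) -> bool:
    q = 0
    for a in w:
        q = delta[q][int(a)]   # one table lookup per input symbol
    return A[q]
```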

We can also visualize the behavior of DoSomethingCool by drawing a directed graph, whose
vertices represent possible values of the variable q—the possible states of the algorithm—and
whose edges are labeled with input symbols to represent transitions between states. Specifically,
the graph includes the labeled directed edge p -a-> q if and only if δ(p, a) = q. To indicate the
proper return value, we draw the “acceptable” final states using doubled circles. Here is the
resulting graph for MultipleOf5:

State-transition graph for MultipleOf5

If we run the MultipleOf5 algorithm on the string 00101110110 (representing the number
374 in binary), the algorithm performs the following sequence of transitions:
0 -0-> 0 -0-> 0 -1-> 1 -0-> 2 -1-> 0 -1-> 1 -1-> 3 -0-> 1 -1-> 3 -1-> 2 -0-> 4

Because the final state is not the “acceptable” state 0, the algorithm correctly returns False.
We can also think of this sequence of transitions as a walk in the graph, which is completely
determined by the start state 0 and the sequence of edge labels; the algorithm returns True if
and only if this walk ends at an “acceptable” state.

3.2 Formal Definitions


The object we have just described is an example of a finite-state machine. A finite-state machine
is a formal model of any system/machine/algorithm that can exist in a finite number of states
and that transitions among those states based on a sequence of input symbols.
Finite-state machines are also commonly called deterministic finite-state automata, abbre-
viated DFAs. The word “deterministic” means that the behavior of the machine is completely


determined by the input string; we’ll discuss nondeterministic automata in the next lecture.
The word “automaton” (plural “automata”) comes from the ancient Greek αὐτόματος, meaning
“self-acting”, from the roots αὐτο- (“self”) and -ματος (“thinking, willing”, the root of Latin
mentus).
Formally, every finite-state machine consists of five components:
• An arbitrary finite set Σ, called the input alphabet.
• Another arbitrary finite set Q, whose elements are called states.
• An arbitrary transition function δ : Q × Σ → Q.
• A start state s ∈ Q.
• A subset A ⊆ Q of accepting states.
The behavior of a finite-state machine is governed by an input string w, which is a finite
sequence of symbols from the input alphabet Σ. The machine reads the symbols in w one at a
time in order (from left to right). At all times, the machine has a current state q; initially q is
the machine’s start state s. Each time the machine reads a symbol a from the input string, its
current state transitions from q to δ(q, a). After all the characters have been read, the machine
accepts w if the current state is in A and rejects w otherwise. In other words, every finite state
machine runs the algorithm DoSomethingCool! The language of a finite state machine M ,
denoted L(M) is the set of all strings that M accepts.
More formally, we extend the transition function δ : Q × Σ → Q of any finite-state machine
to a function δ∗ : Q × Σ∗ → Q that transitions on strings as follows:
δ∗(q, w) := q if w = ε,
δ∗(q, w) := δ∗(δ(q, a), x) if w = a · x.

Finally, a finite-state machine accepts a string w if and only if δ∗ (s, w) ∈ A, and rejects w
otherwise. (Compare this definition with the recursive formulation of DoSomethingCool!)
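The recursive definition of δ∗ transcribes directly into code. A Python sketch (mine, not the notes'), using the mod-5 transition function as the running example:

```python
def delta_star(delta, q, w):
    """Extended transition function: follows the recursive definition of delta*."""
    if w == "":
        return q                      # base case: delta*(q, eps) = q
    a, x = w[0], w[1:]                # decompose w = a . x
    return delta_star(delta, delta(q, a), x)

def mult5_delta(q, a):
    return (2 * q + int(a)) % 5
```

The machine accepts w exactly when `delta_star(delta, s, w)` lands in the accepting set.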
For example, our final MultipleOf5 algorithm is a DFA with the following components:
• input alphabet: Σ = {0, 1}
• state set: Q = {0, 1, 2, 3, 4}
• transition function: δ(q, a) = (2q + a) mod 5
• start state: s = 0
• accepting states: A = {0}
This machine rejects the string 00101110110, because

δ∗(0, 00101110110) = δ∗(δ(0, 0), 0101110110)
= δ∗(0, 0101110110) = δ∗(δ(0, 0), 101110110)
= δ∗(0, 101110110) = δ∗(δ(0, 1), 01110110) = · · ·
· · · = δ∗(1, 110) = δ∗(δ(1, 1), 10)
= δ∗(3, 10) = δ∗(δ(3, 1), 0)
= δ∗(2, 0) = δ∗(δ(2, 0), ε)
= δ∗(4, ε) = 4 ∉ A.


We have already seen a more graphical representation of this entire sequence of transitions:
0 -0-> 0 -0-> 0 -1-> 1 -0-> 2 -1-> 0 -1-> 1 -1-> 3 -0-> 1 -1-> 3 -1-> 2 -0-> 4

The arrow notation is easier to read and write for specific examples, but surprisingly, most people
actually find the more formal functional notation easier to use in formal proofs. Try them both!
We can equivalently define a DFA as a directed graph whose vertices are the states Q, whose
edges are labeled with symbols from Σ, such that every vertex has exactly one outgoing edge
with each label. In our drawings of finite state machines, the start state s is always indicated
by an incoming arrow, and the accepting states A are always indicated by doubled circles. By
induction, for any string w ∈ Σ∗ , this graph contains a unique walk that starts at s and whose
edges are labeled with the symbols in w in order. The machine accepts w if this walk ends at an
accepting state. This graphical formulation of DFAs is incredibly useful for developing intuition
and even designing DFAs. For proofs, it’s largely a matter of taste whether to write in terms of
extended transition functions or labeled graphs, but (as much as I wish otherwise) I actually find
it easier to write correct proofs using the functional formulation.

3.3 Another Example


The following drawing shows a finite-state machine with input alphabet Σ = {0, 1}, state set
Q = {s, t}, start state s, a single accepting state t, and the transition function

δ(s, 0) = s, δ(s, 1) = t, δ(t, 0) = t, δ(t, 1) = s.

A simple finite-state machine.

For example, the two-state machine M shown above accepts the string 00101110100
after the following sequence of transitions:
s -0-> s -0-> s -1-> t -0-> t -1-> s -1-> t -1-> s -0-> s -1-> t -0-> t -0-> t.

The same machine M rejects the string 11101101 after the following sequence of transitions:

s -1-> t -1-> s -1-> t -0-> t -1-> s -1-> t -0-> t -1-> s.

Finally, M rejects the empty string, because the start state s is not an accepting state.
From these examples and others, it is easy to conjecture that the language of M is the set of
all strings of 0s and 1s with an odd number of 1s. So let’s prove it!
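Before diving into the proof, we can sanity-check the conjecture by brute force over all short strings. This Python sketch is mine, not part of the notes:

```python
from itertools import product

TRANS = {('s', '0'): 's', ('s', '1'): 't',
         ('t', '0'): 't', ('t', '1'): 's'}

def accepts(w: str) -> bool:
    """Run the two-state machine; accept iff it ends in state t."""
    q = 's'
    for a in w:
        q = TRANS[(q, a)]
    return q == 't'

# Conjecture: L(M) = { w : #(1, w) is odd }.  Check all strings of length <= 10.
conjecture_holds = all(
    accepts(w) == (w.count('1') % 2 == 1)
    for n in range(11)
    for w in (''.join(bits) for bits in product('01', repeat=n))
)
```

Of course an exhaustive check of short strings is evidence, not a proof; the induction below handles all strings.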

Proof (tedious case analysis): Let #(a, w) denote the number of times symbol a appears in
string w. We will prove the following stronger claims, for any string w.
δ∗(s, w) = s if #(1, w) is even, and δ∗(s, w) = t if #(1, w) is odd;
δ∗(t, w) = t if #(1, w) is even, and δ∗(t, w) = s if #(1, w) is odd.

Let w be an arbitrary string. Assume that for any string x that is shorter than w, we have
δ∗ (s, x) = s and δ∗ (t, x) = t if x has an even number of 1s, and δ∗ (s, x) = t and δ∗ (t, x) = s if
x has an odd number of 1s. There are five cases to consider.


• If w = ε, then w contains an even number of 1s and δ∗(s, w) = s and δ∗(t, w) = t by definition.

• Suppose w = 1 x and #(1, w) is even. Then #(1, x) is odd, which implies

δ∗ (s, w) = δ∗ (δ(s, 1), x) by definition of δ∗


= δ∗ (t, x) by definition of δ
=s by the inductive hypothesis

δ∗ (t, w) = δ∗ (δ(t, 1), x) by definition of δ∗


= δ∗ (s, x) by definition of δ
= t by the inductive hypothesis

Since the remaining cases are similar, I’ll omit the line-by-line justification.

• If w = 1 x and #(1, w) is odd, then #(1, x) is even, so the inductive hypothesis implies

δ∗ (s, w) = δ∗ (δ(s, 1), x) = δ∗ (t, x) = t


δ∗ (t, w) = δ∗ (δ(t, 1), x) = δ∗ (s, x) = s

• If w = 0 x and #(1, w) is even, then #(1, x) is even, so the inductive hypothesis implies

δ∗ (s, w) = δ∗ (δ(s, 0), x) = δ∗ (s, x) = s


δ∗ (t, w) = δ∗ (δ(t, 0), x) = δ∗ (t, x) = t

• Finally, if w = 0 x and #(1, w) is odd, then #(1, x) is odd, so the inductive hypothesis
implies

δ∗ (s, w) = δ∗ (δ(s, 0), x) = δ∗ (s, x) = t


δ∗ (t, w) = δ∗ (δ(t, 0), x) = δ∗ (t, x) = s ƒ

Notice that this proof contains |Q|² · |Σ| + |Q| separate inductive arguments. For every pair of
states p and q, we must argue about the set of strings w such that δ∗(p, w) = q, and we
must consider each possible first symbol in w. We must also argue about δ∗(p, ε) for every state p. Each of
those arguments is typically straightforward, but it’s easy to get lost in the deluge of cases.
For this particular proof, however, we can reduce the number of cases by switching from tail
recursion to head recursion. The following identity holds for all strings x ∈ Σ∗ and symbols
a ∈ Σ:
δ∗ (q, x a) = δ(δ∗ (q, x), a)

We leave the inductive proof of this identity as a straightforward exercise (hint, hint).
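As a quick sanity check (not the requested inductive proof), the identity can be verified exhaustively for a small machine. This Python sketch is mine; it uses the renamed parity machine δ(q, a) = (q + a) mod 2 that appears in the next proof:

```python
from itertools import product

def delta(q, a):
    # Parity machine with states renamed 0 and 1.
    return (q + int(a)) % 2

def delta_star(q, w):
    # Tail-recursive definition: delta*(q, eps) = q, delta*(q, a.x) = delta*(delta(q, a), x).
    return q if w == "" else delta_star(delta(q, w[0]), w[1:])

# Check the head-recursion identity delta*(q, x.a) = delta(delta*(q, x), a)
# for all states q, strings x of length < 8, and symbols a.
identity_holds = all(
    delta_star(q, x + a) == delta(delta_star(q, x), a)
    for q in (0, 1)
    for n in range(8)
    for x in (''.join(bits) for bits in product('01', repeat=n))
    for a in '01'
)
```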

Proof (clever renaming, head induction): Let’s rename the states 0 and 1 instead of s and t.
Then the transition function can be described concisely as δ(q, a) = (q + a) mod 2.
Now we claim that for every string w, we have δ∗ (0, w) = #(1, w) mod 2. So let w be
an arbitrary string, and assume that for any string x that is shorter than w that δ∗ (0, x) =
#(1, x) mod 2. There are only two cases to consider: either w is empty or it isn’t.

• If w = ε, then δ∗(0, w) = 0 = #(1, w) mod 2 by definition.


• Otherwise, w = x a for some string x and some symbol a, and we have


δ∗ (0, w) = δ(δ∗ (0, x), a)
= δ(#(1, x) mod 2, a) by the inductive hypothesis
= (#(1, x) mod 2 + a) mod 2 by definition of δ
= (#(1, x) + a) mod 2 by definition of mod 2
= (#(1, x) + #(1, a)) mod 2 because #(1, 0) = 0 and #(1, 1) = 1
= (#(1, x a)) mod 2 by definition of #
= (#(1, w)) mod 2 because w = x a ƒ

Hmmm. This “clever” proof is certainly shorter than the earlier brute-force proof, but is it really
“better”? “Simpler”? More intuitive? Easier to understand? I’m skeptical. Sometimes brute force
really is more effective.

3.4 Yet Another Example


As a more complex example, consider the Rubik’s cube, a well-known mechanical puzzle invented
independently by Ernő Rubik in Hungary and Terutoshi Ishigi in Japan in the mid-1970s. This
puzzle has precisely 519,024,039,293,878,272,000 distinct configurations. In the unique solved
configuration, each of the six faces of the cube shows exactly one color. We can change the
configuration of the cube by rotating one of the six faces of the cube by 90 degrees, either clockwise
or counterclockwise. The cube has six faces (front, back, left, right, up, and down), so there
are exactly twelve possible turns, typically represented by the symbols R, L, F, B, U, D, R̄, L̄, F̄, B̄, Ū, D̄,
where the letter indicates which face to turn and the presence or absence of a bar over the letter
indicates turning counterclockwise or clockwise, respectively. Thus, we can represent a Rubik’s
cube as a finite-state machine with 519,024,039,293,878,272,000 states and an input alphabet
with 12 symbols; or equivalently, as a directed graph with 519,024,039,293,878,272,000 vertices,
each with 12 outgoing edges. In practice, the number of states is far too large for us to actually
draw the machine or explicitly specify its transition function; nevertheless, the number of states
is still finite. If we let the start state s and the sole accepting state be the solved state, then
the language of this finite state machine is the set of all move sequences that leave the cube
unchanged.

A complicated finite-state machine.

3.5 Building DFAs


This section describes a few examples of building DFAs that accept particular languages, thereby
proving that those languages are automatic. As usual in algorithm design, there is no purely


mechanical recipe—no automatic method—no algorithm—for building DFAs in general. However,


the following examples show several useful design strategies.

3.5.1 Superstrings

Perhaps the simplest rule of thumb is to try to construct an algorithm that looks like MultipleOf5:
A simple for-loop through the symbols, using a constant number of variables, where each variable
(except the loop index) has only a constant number of possible values. Here, “constant” means
an actual number that is not a function of the input size n. You should be able to compute the
number of possible values for each variable at compile time.
For example, the following algorithm determines whether a given string in Σ = {0, 1} contains
the substring 11.

Contains11(w[1 .. n]):
  found ← False
  for i ← 1 to n
    if i = 1
      last2 ← w[1]
    else
      last2 ← w[i − 1] · w[i]
    if last2 = 11
      found ← True
  return found

Aside from the loop index, this algorithm has exactly two variables.

• A boolean flag found indicating whether we have seen the substring 11. This variable has
exactly two possible values: True and False.

• A string last2 containing the last (up to) two symbols we have read so far. This variable
has exactly 7 possible values: ε, 0, 1, 00, 01, 10, and 11.

Thus, altogether, the algorithm can be in at most 2 × 7 = 14 possible states, one for each possible
pair (found, last2). Thus, we can encode the behavior of Contains11 as a DFA with fourteen
states, where the start state is (False, ε) and the accepting states are all seven states of the form
(True, ∗). The transition function is described in the following table (split into two parts to save
space):

q δ[q, 0] δ[q, 1] q δ[q, 0] δ[q, 1]


(False, ε) (False, 0) (False, 1) (True, ε) (True, 0) (True, 1)
(False, 0) (False, 00) (False, 01) (True, 0) (True, 00) (True, 01)
(False, 1) (False, 10) (True, 11) (True, 1) (True, 10) (True, 11)
(False, 00) (False, 00) (False, 01) (True, 00) (True, 00) (True, 01)
(False, 01) (False, 10) (True, 11) (True, 01) (True, 10) (True, 11)
(False, 10) (False, 00) (False, 01) (True, 10) (True, 00) (True, 01)
(False, 11) (False, 10) (True, 11) (True, 11) (True, 10) (True, 11)

For example, given the input string 1001011100, this DFA performs the following sequence of


transitions and then accepts.

(False, ε) -1-> (False, 1) -0-> (False, 10) -0-> (False, 00) -1-> (False, 01)
-0-> (False, 10) -1-> (False, 01) -1-> (True, 11)
-1-> (True, 11) -0-> (True, 10) -0-> (True, 00)
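The same behavior can be sketched in Python by carrying the pair (found, last2) directly, rather than writing out the table; this sketch and its helper names are mine:

```python
def step(state, a):
    """One transition of the brute-force DFA: state is a pair (found, last2)."""
    found, last2 = state
    last2 = (last2 + a)[-2:]      # remember at most the last two symbols
    if last2 == "11":
        found = True
    return (found, last2)

def contains11(w: str) -> bool:
    state = (False, "")           # start state (False, eps)
    for a in w:
        state = step(state, a)
    return state[0]
```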

3.5.2 Reducing states

You can probably guess that the brute-force DFA we just constructed has considerably more states
than necessary, especially after seeing its transition graph:

Our brute-force DFA for strings containing the substring 11

For example, we don’t actually need to remember both of the last two symbols, but only the
penultimate symbol, because the last symbol is the one we’re currently reading. This observation
allows us to reduce the number of states from fourteen to only six. Once the flag part of the state
is set to True, we know the machine will eventually accept, so we might as well merge the two
accepting states together. Finally, and more subtly, because all transitions out of (False, ") and
(False, 0) are identical, we can merge those two states together as well. In the end, we obtain
the following DFA with just three states:
• The start state, which indicates that the machine has not read the substring 11 and did not
just read the symbol 1.
• An intermediate state, which indicates that the machine has not read the substring 11 but
just read the symbol 1.
• A unique accept state, which indicates that the machine has read the substring 11.
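A sketch of this three-state machine in Python, with states named after what they remember (the names and the exhaustive comparison are mine):

```python
from itertools import product

# States: 'start'  = no 11 yet and the last symbol was not 1,
#         'one'    = no 11 yet but the last symbol was 1,
#         'accept' = the substring 11 has been read.
MIN11 = {('start', '0'): 'start',   ('start', '1'): 'one',
         ('one', '0'): 'start',     ('one', '1'): 'accept',
         ('accept', '0'): 'accept', ('accept', '1'): 'accept'}

def accepts_11(w: str) -> bool:
    q = 'start'
    for a in w:
        q = MIN11[(q, a)]
    return q == 'accept'

# Sanity check against the naive substring test on all strings of length <= 8.
agrees = all(accepts_11(w) == ('11' in w)
             for n in range(9)
             for w in (''.join(bits) for bits in product('01', repeat=n)))
```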

A minimal DFA for superstrings of 11

At the end of this note, I’ll describe an efficient algorithm to transform any given DFA into an
equivalent DFA with the fewest possible states. Given that this minimization algorithm exists,
there is very little incentive to optimize DFAs by hand. Clarity is infinitely more important than
brevity, especially in this class.


3.5.3 Every this after that

Suppose we want to accept the set of strings in which every occurrence of the substring 00 occurs
after every occurrence of the substring 11. Equivalently, we want to reject every string in which
some 00 occurs before 11. Often the easiest way to design a DFA to check whether a string is not
in some set is first to build a DFA for the strings that are in that set, and then invert which states
in that machine are accepting.
From the previous example, we know that there is a three-state DFA M11 that accepts the set
of strings with the substring 11 and a nearly identical DFA M00 that accepts the set of strings
containing the substring 00. By identifying the accept state of M00 with the start state of M11 ,
we obtain a five-state DFA that accepts the set of strings with 00 before 11. Finally, by inverting
which states are accepting, we obtain the DFA we want.
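The inversion trick is easy to implement: swap accepting and non-accepting states. A Python sketch (mine), assuming a DFA is represented as a tuple (states, transition dict, start, accepting):

```python
def run(dfa, w):
    """Run a DFA given as (states, delta, start, accepting)."""
    Q, delta, s, A = dfa
    q = s
    for a in w:
        q = delta[(q, a)]
    return q in A

def complement(dfa):
    """Accept exactly the strings the original rejects, by inverting the accepting set."""
    Q, delta, s, A = dfa
    return (Q, delta, s, Q - A)

# The two-state odd-number-of-1s machine from Section 3.3.
M = ({'s', 't'},
     {('s', '0'): 's', ('s', '1'): 't', ('t', '0'): 't', ('t', '1'): 's'},
     's', {'t'})
```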

Building a DFA for the language of strings in which every 00 is after every 11.

3.5.4 Both This and That: The Product Construction

Now suppose we want to accept all strings that contain both 00 and 11 as substrings, in either
order. Intuitively, we’d like to run two of our earlier DFAs in parallel—the DFA M00 to detect
superstrings of 00 and the DFA M11 to detect superstrings of 11—and then accept the input
string if and only if both of these DFAs accept. In fact, we can encode precisely this “parallel
computation” into a single DFA, whose states are all ordered pairs (p, q), where p is a state in
M00 and q is a state in M11. The new “parallel” DFA includes the transition (p, q) -a-> (p′, q′) if
and only if M00 contains the transition p -a-> p′ and M11 contains the transition q -a-> q′. Finally,
the state (p, q) is accepting if and only if p and q are accepting states in their respective machines.
The resulting nine-state DFA is shown on the next page.
More generally, let M1 = (Σ, Q 1 , δ1 , s1 , A1 ) be an arbitrary DFA that accepts some language
L1 , and let M2 = (Σ, Q 2 , δ2 , s2 , A2 ) be an arbitrary DFA that accepts some language L2 (over the
same alphabet Σ). We can construct a third DFA M = (Σ, Q, δ, s, A) that accepts the intersection
language L1 ∩ L2 as follows.

Q := Q1 × Q2 = { (p, q) | p ∈ Q1 and q ∈ Q2 }
s := (s1, s2)
A := A1 × A2 = { (p, q) | p ∈ A1 and q ∈ A2 }
δ((p, q), a) := (δ1(p, a), δ2(q, a))
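A Python sketch of the product construction (mine, not the notes'), assuming each DFA is a tuple (states, transition dict, start, accepting):

```python
from itertools import product as pairs

def run(dfa, w):
    Q, delta, s, A = dfa
    q = s
    for a in w:
        q = delta[(q, a)]
    return q in A

def product_dfa(M1, M2, alphabet):
    """DFA accepting L(M1) ∩ L(M2), following the construction in the text."""
    Q1, d1, s1, A1 = M1
    Q2, d2, s2, A2 = M2
    Q = set(pairs(Q1, Q2))
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)]) for (p, q) in Q for a in alphabet}
    A = {(p, q) for p in A1 for q in A2}
    return (Q, delta, (s1, s2), A)

# Three-state machines for "contains 00" and "contains 11" (state names are mine).
M00 = ({'s', 'a', 'b'},
       {('s', '0'): 'a', ('s', '1'): 's', ('a', '0'): 'b', ('a', '1'): 's',
        ('b', '0'): 'b', ('b', '1'): 'b'}, 's', {'b'})
M11 = ({'s', 'a', 'b'},
       {('s', '0'): 's', ('s', '1'): 'a', ('a', '0'): 's', ('a', '1'): 'b',
        ('b', '0'): 'b', ('b', '1'): 'b'}, 's', {'b'})
```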


The product DFA for the language of strings that contain both substrings 00 and 11.

To convince yourself that this product construction is actually correct, consider the extended
transition function δ∗ : (Q1 × Q2) × Σ∗ → (Q1 × Q2), which acts on strings instead of individual
symbols. Recall that this function is defined recursively as follows:

δ∗((p, q), w) := (p, q) if w = ε,
δ∗((p, q), w) := δ∗(δ((p, q), a), x) if w = a · x.

Inductive definition-chasing gives us the identity δ∗((p, q), w) = (δ1∗(p, w), δ2∗(q, w)) for any
string w:

δ∗((p, q), ε) = (p, q)    by the definition of δ∗
= (δ1∗(p, ε), δ2∗(q, ε))    by the definitions of δ1∗ and δ2∗;

δ∗((p, q), a · x) = δ∗(δ((p, q), a), x)    by the definition of δ∗
= δ∗((δ1(p, a), δ2(q, a)), x)    by the definition of δ
= (δ1∗(δ1(p, a), x), δ2∗(δ2(q, a), x))    by the induction hypothesis
= (δ1∗(p, a · x), δ2∗(q, a · x))    by the definitions of δ1∗ and δ2∗.


It now follows from this seemingly impenetrable wall of notation that for any string w, we have
δ∗ (s, w) ∈ A if and only if both δ1∗ (s1 , w) ∈ A1 and δ2∗ (s2 , w) ∈ A2 . In other words, M accepts w if
and only if both M1 and M2 accept w, as required.
As usual, this construction technique does not necessarily yield minimal DFAs. For example,
in our first example of a product DFA, illustrated above, the central state (a, a) cannot be reached
by any other state and is therefore redundant. Whatever.
Similar product constructions can be used to build DFAs that accept any other boolean
combination of languages; in fact, the only part of the construction that needs to be changed is
the choice of accepting states. For example:

• To accept the union L1 ∪ L2, define A = { (p, q) | p ∈ A1 or q ∈ A2 }.



• To accept the difference L1 \ L2, define A = { (p, q) | p ∈ A1 and q ∉ A2 }.

• To accept the symmetric difference L1 ⊕ L2, define A = { (p, q) | p ∈ A1 xor q ∈ A2 }.

Moreover, by cascading this product construction, we can construct DFAs that accept arbitrary
boolean combinations of arbitrary finite collections of regular languages.
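Since only the accepting set changes, all of these variants can share one construction. A Python sketch (representation and names are mine):

```python
def run(dfa, w):
    Q, delta, s, A = dfa
    q = s
    for a in w:
        q = delta[(q, a)]
    return q in A

def combine(M1, M2, alphabet, keep):
    """Product DFA whose accepting pairs are selected by keep(p_accepts, q_accepts)."""
    Q1, d1, s1, A1 = M1
    Q2, d2, s2, A2 = M2
    Q = {(p, q) for p in Q1 for q in Q2}
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)]) for (p, q) in Q for a in alphabet}
    A = {(p, q) for (p, q) in Q if keep(p in A1, q in A2)}
    return (Q, delta, (s1, s2), A)

# intersection:         keep = lambda x, y: x and y
# union:                keep = lambda x, y: x or y
# difference:           keep = lambda x, y: x and not y
# symmetric difference: keep = lambda x, y: x != y

# Example machines: odd number of 1s, and odd number of 0s.
M_odd1 = ({'e', 'o'}, {('e', '0'): 'e', ('e', '1'): 'o',
                       ('o', '0'): 'o', ('o', '1'): 'e'}, 'e', {'o'})
M_odd0 = ({'e', 'o'}, {('e', '0'): 'o', ('e', '1'): 'e',
                       ('o', '0'): 'e', ('o', '1'): 'o'}, 'e', {'o'})
```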

3.6 Decision Algorithms


ÆÆÆ It’s unclear how much we can say here, since we haven’t yet talked about graph algorithms,
or even really about graphs. Perhaps this discussion should simply be moved to the graph-
traversal notes.
• Is w ∈ L(M)? Follow the unique path from q0 with label w. By definition, w ∈ L(M ) if
and only if this path leads to an accepting state.

• Is L(M) empty? The language L(M ) is empty if and only if no accepting state is
reachable from q0 . This condition can be checked in O(n) time via whatever-first search,
where n is the number of states. Alternatively, but less usefully, L(M ) = ∅ if and only if
L(M ) contains no string w such that |w| < n.
• Is L(M) finite? Remove all states unreachable from q0 (via whatever first search).
Then L(M ) is finite if and only if the reduced DFA is a dag; this condition can be checked
by depth-first search. Alternatively, but less usefully, L(M ) is finite if and only if L(M )
contains no string w such that n ≤ |w| < 2n.

• Is L(M) = Σ∗ ? Remove all states unreachable from q0 (via whatever first search). Then
L(M ) = Σ∗ if and only if every state in M is an accepting state.
• Is L(M) = L(M′)? Build a DFA N such that L(N) = L(M) ⊕ L(M′) using a standard
product construction, and then check whether L(N) = ∅.
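The emptiness test is a plain graph search over reachable states; here is a Python sketch (mine) using breadth-first search as the "whatever-first" search:

```python
from collections import deque

def is_empty(dfa, alphabet):
    """L(M) is empty iff no accepting state is reachable from the start state."""
    Q, delta, s, A = dfa
    seen = {s}
    frontier = deque([s])
    while frontier:                      # breadth-first search from s
        q = frontier.popleft()
        if q in A:
            return False                 # reachable accepting state => L(M) nonempty
        for a in alphabet:
            r = delta[(q, a)]
            if r not in seen:
                seen.add(r)
                frontier.append(r)
    return True

# The odd-number-of-1s machine is nonempty; a machine whose only accepting
# state is unreachable has empty language.
M = ({'s', 't'},
     {('s', '0'): 's', ('s', '1'): 't', ('t', '0'): 't', ('t', '1'): 's'},
     's', {'t'})
M_dead = ({'s', 'u'},
          {('s', '0'): 's', ('s', '1'): 's', ('u', '0'): 'u', ('u', '1'): 'u'},
          's', {'u'})
```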

3.7 Closure Properties


ÆÆÆ We haven’t yet proved that automatic languages are regular, so formally, for now, some
of these are closure properties of automatic languages.
• Complement (easy for DFAs, hard for regular expressions.)
• Concatenation (trivial for regular expressions, hard for DFAs)
• Union (trivial for regular expressions, easy for DFAs via product)
• Intersection (hard for regular expressions, easy for DFAs via product)
• Difference (hard for regular expressions, easy for DFAs via product)
• Kleene star: wait for NFAs (trivial for regular expression, hard for DFAs)
• Homomorphism: only mention in passing
• Inverse homomorphism: only mention in passing

3.8 Fooling Sets


Fix an arbitrary language L over an arbitrary alphabet Σ. For any strings x, y, z ∈ Σ∗ , we say that
z distinguishes x from y if exactly one of the strings xz and yz is in L. If no string distinguishes
x and y, we say that x and y are L-equivalent and write x ≡L y. Thus,

x ≡L y ⇐⇒ For every string z ∈ Σ∗ , we have xz ∈ L if and only if yz ∈ L.


For example, let L eo denote the language of strings over {0, 1} with an even number of 0s
and an odd number of 1s. Then the strings x = 01 and y = 0011 are distinguished by the string
z = 100, because

xz = 01 • 100 = 01100 ∉ L eo
yz = 0011 • 100 = 0011100 ∈ L eo .

On the other hand, it is quite easy to prove (hint, hint) that the strings 0001 and 1011 are
L eo -equivalent.
Let M be an arbitrary DFA for an arbitrary language L, and let x and y be arbitrary strings. If
x and y lead to the same state in M —that is, if δ∗ (s, x) = δ∗ (s, y)—then we have

δ∗ (s, xz) = δ∗ (δ∗ (s, x), z) = δ∗ (δ∗ (s, y), z) = δ∗ (s, yz)

for any string z. In particular, either M accepts both x and y, or M rejects both x and y, and
therefore x ≡ L y. It follows that if x and y are not L-equivalent, then any DFA that accepts L
has at least two distinct states δ∗ (s, x) 6= δ∗ (s, y).
Finally, a fooling set for L is a set F of strings such that every pair of strings in F has a
distinguishing suffix. For example, F = {01, 101, 010, 1010} is a fooling set for the language L eo
of strings with an even number of 0s and an odd number of 1s, because each pair of strings in F
has a distinguishing suffix:

• 0 distinguishes 01 and 101;


• 0 distinguishes 01 and 010;
• 0 distinguishes 01 and 1010;
• 10 distinguishes 101 and 010;
• 1 distinguishes 101 and 1010;
• 1 distinguishes 010 and 1010.
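Each claimed distinguishing suffix can be checked mechanically against L eo; this Python sketch is mine:

```python
def in_Leo(w: str) -> bool:
    """Membership in L_eo: an even number of 0s and an odd number of 1s."""
    return w.count('0') % 2 == 0 and w.count('1') % 2 == 1

# (x, y, z) triples: z is the claimed distinguishing suffix for x and y.
triples = [("01", "101", "0"), ("01", "010", "0"), ("01", "1010", "0"),
           ("101", "010", "10"), ("101", "1010", "1"), ("010", "1010", "1")]

# z distinguishes x from y iff exactly one of xz, yz lies in L_eo.
all_distinguished = all(in_Leo(x + z) != in_Leo(y + z) for x, y, z in triples)
```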

The pigeonhole principle now implies that for any integer k, if language L is accepted by a DFA
with k states, then every fooling set for L contains at most k strings. This simple observation has
two immediate corollaries.
First, for any integer k, if L has a fooling set of size k, then every DFA that accepts L has at
least k states. For example, the fooling set {01, 101, 010, 1010} proves that any DFA for L eo has at
least four states. Thus, we can use fooling sets to prove that certain DFAs are as small as possible.
Second, and more interestingly, if a language L is accepted by any DFA, then every fooling set
for L must be finite. Equivalently:

If L has an infinite fooling set, then L is not accepted by any DFA.

This is arguably both the simplest and most powerful method for proving that a language is
non-regular. Here are a few canonical examples of the fooling-set technique in action.

Lemma 3.1. The language L = {0^n 1^n | n ≥ 0} is not regular.

Proof: Consider the set F = {0^n | n ≥ 0}, or more simply F = 0∗. Let x and y be arbitrary
distinct strings in F. Then we must have x = 0^i and y = 0^j for some integers i ≠ j. The suffix
z = 1^i distinguishes x and y, because xz = 0^i 1^i ∈ L, but yz = 0^j 1^i ∉ L. We conclude that F is a
fooling set for L. Because F is infinite, L cannot be regular. ƒ


Lemma 3.2. The language L = {w w^R | w ∈ Σ∗} of even-length palindromes is not regular.

Proof: Let x and y be arbitrary distinct strings in 0∗1. Then we must have x = 0^i 1 and y = 0^j 1
for some integers i ≠ j. The suffix z = 10^i distinguishes x and y, because xz = 0^i 1 1 0^i ∈ L, but
yz = 0^j 1 1 0^i ∉ L. We conclude that 0∗1 is a fooling set for L. Because 0∗1 is infinite, L cannot be
regular. ƒ
Lemma 3.3. The language L = {0^(2^n) | n ≥ 0} is not regular.

Proof: Let x and y be arbitrary distinct strings in L. Then we must have x = 0^(2^i) and y = 0^(2^j) for
some integers i ≠ j. The suffix z = 0^(2^i) distinguishes x and y, because xz = 0^(2^i + 2^i) = 0^(2^(i+1)) ∈ L,
but yz = 0^(2^j + 2^i) ∉ L. We conclude that L itself is a fooling set for L. Because L is infinite, L
cannot be regular. ƒ

Lemma 3.4. The language L = {0^p | p is prime} is not regular.

Proof: Again, we use 0∗ as our fooling set, but the actual argument is somewhat more
complicated than in our earlier examples.
Let x and y be arbitrary distinct strings in 0∗. Then we must have x = 0^i and y = 0^j for some
integers i ≠ j. Without loss of generality, assume that i < j. Let p be any prime number larger
than i. Because p + 0 · (j − i) = p is prime and p + p(j − i) > p is not, there must be a positive integer
k ≤ p such that p + (k − 1)(j − i) is prime but p + k(j − i) is not. Then the suffix z = 0^(p+(k−1)j−ki)
distinguishes x and y:

xz = 0^i 0^(p+(k−1)j−ki) = 0^(p+(k−1)(j−i)) ∈ L    because p + (k − 1)(j − i) is prime;
yz = 0^j 0^(p+(k−1)j−ki) = 0^(p+k(j−i)) ∉ L    because p + k(j − i) is not prime.

(Because i < j and i < p, the suffix 0^(p+(k−1)j−ki) = 0^((p−i)+(k−1)(j−i)) has positive length and
therefore actually exists!) We conclude that 0∗ is indeed a fooling set for L, which implies that L
is not regular. ƒ
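The existence of the index k in this argument can be checked numerically for specific i < j and a prime p > i. This Python sketch is mine and only illustrates the argument:

```python
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def find_k(i: int, j: int, p: int) -> int:
    """Smallest k >= 1 with p + (k-1)(j-i) prime but p + k(j-i) composite.
    The proof guarantees such a k exists with k <= p, given i < j and a prime p > i."""
    k = 1
    while True:
        if is_prime(p + (k - 1) * (j - i)) and not is_prime(p + k * (j - i)):
            return k
        k += 1
```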

One natural question that many students ask is “How did you come up with that fooling set?”
Perhaps the simplest rule of thumb is that for most languages L—in particular, for almost all
languages that students are asked to prove non-regular on homeworks or exams—either some
simple regular language like 0∗ or 10∗ 1 is a fooling set, or the language L itself is a fooling set.
(Of course, there are well-engineered counterexamples.)

⋆3.9 The Myhill-Nerode Theorem
The fooling set technique implies a necessary condition for a language to be accepted by a
DFA—the language must have no infinite fooling sets. In fact, this condition is also sufficient.
The following powerful theorem was first proved by Anil Nerode in 1958, strengthening a 1957
result of John Myhill.¹

The Myhill-Nerode Theorem. For any language L, the following are equal:

(a) the minimum number of states in a DFA that accepts L,
(b) the maximum size of a fooling set for L, and
(c) the number of equivalence classes of ≡_L.

In particular, L is accepted by a DFA if and only if every fooling set for L is finite.

¹Myhill considered the finer equivalence relation x ∼_L y, meaning wxz ∈ L if and only if wyz ∈ L for all strings
w and z, and proved that L is regular if and only if ∼_L defines a finite number of equivalence classes. Like most
of Myhill's early automata research, this result appears in an unpublished Air Force technical report. The modern
Myhill-Nerode theorem appears (in an even more general form) as a minor lemma in Nerode's 1958 paper, which (not
surprisingly) does not cite Myhill.

Proof: Let L be an arbitrary language.


We have already proved that the size of any fooling set for L is at most the number of states
in any DFA that accepts L, so (a) ≥ (b). It also follows directly from the definitions that F ⊆ Σ∗ is
a fooling set for L if and only if F contains at most one string in each equivalence class of ≡_L;
thus, (b) = (c). We complete the proof by showing that (a) ≤ (c).
We have already proved that if ≡ L has an infinite number of equivalence classes, there is no
DFA that accepts L, so assume that the number of equivalence classes is finite. For any string w,
let [w] denote its equivalence class. We define a DFA M≡ = (Σ, Q, s, A, δ) as follows:

Q := { [w] | w ∈ Σ∗ }
s := [ε]
A := { [w] | w ∈ L }
δ([w], a) := [w • a]

We claim that this DFA accepts the language L; this claim completes the proof of the theorem.
But before we can prove anything about this DFA, we first need to verify that it is actually
well-defined. Let x and y be two strings such that [x] = [ y]. By definition of L-equivalence,
for any string z, we have xz ∈ L if and only if yz ∈ L. It immediately follows that for any
symbol a ∈ Σ and any string z′, we have xaz′ ∈ L if and only if yaz′ ∈ L. Thus, by definition of
L-equivalence, we have [xa] = [ya] for every symbol a ∈ Σ. We conclude that the function δ is
indeed well-defined.
An easy inductive proof implies that δ∗([ε], x) = [x] for every string x. Thus, M≡ accepts
string x if and only if [x] = [w] for some string w ∈ L. But if [x] = [w], then by definition
(setting z = ε), we have x ∈ L if and only if w ∈ L. So M≡ accepts x if and only if x ∈ L. In other
words, M≡ accepts L, as claimed, so the proof is complete. ƒ
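The equivalence classes of ≡_L can be explored computationally. The sketch below is my own illustration (the helper names, the membership test, and the length bounds are all assumptions, not from the notes): it groups strings by their behavior on all short suffixes, which can only over-approximate ≡_L, but for the multiple-of-5 language from the start of this lecture the five residues modulo 5 are already separated by length-3 suffixes, so exactly five classes appear:

```python
from itertools import product

def approx_classes(in_lang, alphabet, max_len, suffix_len):
    """Group all strings of length <= max_len by which suffixes of length
    <= suffix_len complete them to a member of the language.  Strings in
    different groups are provably inequivalent under the relation ≡_L;
    strings in the same group are only plausibly equivalent, since some
    longer suffix might still separate them."""
    strs = ["".join(t) for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)]
    sufs = ["".join(t) for n in range(suffix_len + 1)
            for t in product(alphabet, repeat=n)]
    classes = {}
    for x in strs:
        signature = tuple(in_lang(x + z) for z in sufs)
        classes.setdefault(signature, []).append(x)
    return list(classes.values())

# Binary multiples of 5, treating the empty string as the number 0.
mult5 = lambda w: int(w, 2) % 5 == 0 if w else True
classes = approx_classes(mult5, "01", max_len=6, suffix_len=4)
print(len(classes))   # 5 classes, one per residue modulo 5
```

This matches the theorem: the MultipleOf5 algorithm tracks exactly one of five remainders, and no DFA with fewer than five states can accept this language.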

?
3.10 Minimal Automata
Given a DFA M = (Σ, Q, s, A, δ), suppose we want to find another DFA M′ = (Σ, Q′, s′, A′, δ′) with
the fewest possible states that accepts the same language. In this final section, we describe
an efficient algorithm to minimize DFAs, first described (in slightly different form) by Edward
Moore in 1956. We analyze the running time of Moore's algorithm in terms of two parameters: n = |Q| and
σ = |Σ|.
In the preprocessing phase, we find and remove any states that cannot be reached from the
start state s; this filtering can be performed in O(nσ) time using any graph traversal algorithm.
So from now on we assume that all states are reachable from s.
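The preprocessing step is a plain graph traversal over the transition function. Here is a minimal breadth-first sketch (my own; the dictionary encoding of δ is an assumption, not the notes' convention):

```python
from collections import deque

def reachable_states(delta, alphabet, start):
    """Breadth-first search over the transition function
    delta[(state, symbol)] -> state; returns every state reachable
    from start.  Each state is enqueued once, so this runs in O(n·σ) time."""
    seen = {start}
    queue = deque([start])
    while queue:
        p = queue.popleft()
        for a in alphabet:
            r = delta[p, a]
            if r not in seen:
                seen.add(r)
                queue.append(r)
    return seen

# Toy example: state 2 can never be entered from state 0.
delta = {(0, '0'): 0, (0, '1'): 1,
         (1, '0'): 0, (1, '1'): 1,
         (2, '0'): 0, (2, '1'): 2}
print(reachable_states(delta, '01', 0))   # {0, 1}
```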
Now define two states p and q in the trimmed DFA to be distinguishable, written p ≁ q, if at
least one of the following conditions holds:

• p ∈ A and q ∉ A,
• p ∉ A and q ∈ A, or
• δ(p, a) ≁ δ(q, a) for some a ∈ Σ.


Equivalently, p ≁ q if and only if there is a string z such that exactly one of the states δ∗(p, z)
and δ∗ (q, z) is accepting. (Sound familiar?) Intuitively, the main algorithm assumes that all
states are equivalent until proven otherwise, and then repeatedly looks for state pairs that can be
proved distinguishable.
The main algorithm maintains a two-dimensional table, indexed by the states, where
Dist[p, q] = True indicates that we have proved states p and q are distinguishable. Initially, for all
states p and q, we set Dist[p, q] ← True if p ∈ A and q ∉ A or vice versa, and Dist[p, q] ← False
otherwise. Then we repeatedly consider each pair of states and each symbol to find more
distinguishable pairs, until we make a complete pass through the table without modifying it. The
table-filling algorithm can be summarized as follows:²

MinDFATable(Σ, Q, s, A, δ):
  for all p ∈ Q
    for all q ∈ Q
      if (p ∈ A and q ∉ A) or (p ∉ A and q ∈ A)
        Dist[p, q] ← True
      else
        Dist[p, q] ← False
  notdone ← True
  while notdone
    notdone ← False
    for all p ∈ Q
      for all q ∈ Q
        if Dist[p, q] = False
          for all a ∈ Σ
            if Dist[δ(p, a), δ(q, a)]
              Dist[p, q] ← True
              notdone ← True
  return Dist

The algorithm must eventually halt, because there are only a finite number of entries in the
table that can be marked. In fact, the main loop is guaranteed to terminate after at most n
iterations, which implies that the entire algorithm runs in O(σn³) time. Once the table is filled,
any two states p and q such that Dist[p, q] = False are equivalent and can be merged into a
single state. The remaining details of constructing the minimized DFA are straightforward.
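A direct Python transcription of the table-filling algorithm, together with the merge step, might look as follows. This is my own sketch (the DFA encoding and helper names are assumptions): the table is stored as a dictionary keyed by unordered state pairs, and merging relies on the fact that indistinguishability is transitive once the table has stabilized.

```python
def min_dfa_table(states, alphabet, accept, delta):
    """Moore's table-filling algorithm: returns the set of unordered pairs
    {p, q} proved distinguishable.  delta maps (state, symbol) -> state."""
    dist = {frozenset((p, q)): (p in accept) != (q in accept)
            for p in states for q in states if p != q}
    changed = True
    while changed:
        changed = False
        for pair in dist:
            if not dist[pair]:
                p, q = tuple(pair)
                # a pair becomes distinguishable if some symbol leads to
                # an already-distinguishable pair (same-state pairs are not)
                if any(dist.get(frozenset((delta[p, a], delta[q, a])), False)
                       for a in alphabet):
                    dist[pair] = True
                    changed = True
    return {pair for pair, d in dist.items() if d}

def merge_classes(states, distinguishable):
    """Group states into equivalence classes: p and q belong together iff
    the pair {p, q} was never marked distinguishable."""
    classes = []
    for p in states:
        for c in classes:
            if frozenset((p, c[0])) not in distinguishable:
                c.append(p)
                break
        else:
            classes.append([p])
    return classes

# Example: a redundant 4-state DFA for "strings ending in 1";
# states 0 and 2 are equivalent, as are states 1 and 3.
states = [0, 1, 2, 3]
accept = {1, 3}
delta = {(0, '0'): 0, (0, '1'): 1, (1, '0'): 2, (1, '1'): 3,
         (2, '0'): 0, (2, '1'): 1, (3, '0'): 2, (3, '1'): 3}
classes = merge_classes(states, min_dfa_table(states, '01', accept, delta))
print(classes)   # [[0, 2], [1, 3]]
```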

ÆÆÆ Need to prove that the main loop terminates in at most n iterations.

With more care, Moore’s minimization algorithm can be modified to run in O(σn2 ) time. A
faster DFA minimization algorithm, due to John Hopcroft, runs in O(σn log n) time.
²More experienced readers should become queasy at the mere suggestion that any algorithm merely fills in a table,
as opposed to evaluating a recurrence. This algorithm is no exception. Consider the boolean function Dist(p, q, k),
which equals True if and only if p and q can be distinguished by some string of length at most k. This function obeys
the following recurrence:

(p ∈ A) ⊕ (q ∈ A) if k = 0,
Dist(p, q, k) = _ 
Dist(p, q, k − 1) ∨ Dist δ(p, a), δ(q, a), k − 1 otherwise.
a∈Σ

Moore’s “table-filling” algorithm is just a space-efficient dynamic programming algorithm to evaluate this recurrence.


Example

To get a better idea how this algorithm works, let’s visualize the algorithm running on our
earlier brute-force DFA for strings containing the substring 11. This DFA has four unreachable
states: (False, 11), (True, ε), (True, 0), and (True, 1). We remove these states, and relabel the
remaining states for easier reference. (In an actual implementation, the states would almost
certainly be represented by indices into an array anyway, not by mnemonic labels.)

[Figure: Our brute-force DFA for strings containing the substring 11, after removing all four unreachable states.]

The main algorithm initializes (the bottom half of) a 10 × 10 table as follows. (In the
implementation, cells marked ≁ have value True and blank cells have value False.)

        0   1   2   3   4   5   6   7   8
    1
    2
    3
    4
    5
    6   ≁   ≁   ≁   ≁   ≁   ≁
    7   ≁   ≁   ≁   ≁   ≁   ≁
    8   ≁   ≁   ≁   ≁   ≁   ≁
    9   ≁   ≁   ≁   ≁   ≁   ≁
In the first iteration of the main loop, the algorithm discovers several distinguishable pairs
of states. For example, the algorithm sets Dist[0, 2] ← True because Dist[δ(0, 1), δ(2, 1)] =
Dist[2, 9] = True. After the iteration ends, the table looks like this:
        0   1   2   3   4   5   6   7   8
    1
    2   ≁   ≁
    3           ≁
    4   ≁   ≁       ≁
    5           ≁       ≁
    6   ≁   ≁   ≁   ≁   ≁   ≁
    7   ≁   ≁   ≁   ≁   ≁   ≁
    8   ≁   ≁   ≁   ≁   ≁   ≁
    9   ≁   ≁   ≁   ≁   ≁   ≁
The second iteration of the while loop makes no further changes to the table (we got lucky!), so
the algorithm terminates.


The final table implies that the states of our trimmed DFA fall into exactly three equivalence
classes: {0, 1, 3, 5}, {2, 4}, and {6, 7, 8, 9}. Replacing each equivalence class with a single state
gives us the three-state DFA that we already discovered.
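The merged machine is easy to check by brute force. In the sketch below (my own; the state names 'a', 'b', 'c' stand for the classes {0, 1, 3, 5}, {2, 4}, and {6, 7, 8, 9} respectively), the three-state DFA is run on every binary string up to length 9 and compared against a direct substring test:

```python
from itertools import product

# Three-state DFA for "contains the substring 11":
# 'a' = no 1 just seen, 'b' = one 1 just seen, 'c' = 11 already seen (accepting).
delta = {('a', '0'): 'a', ('a', '1'): 'b',
         ('b', '0'): 'a', ('b', '1'): 'c',
         ('c', '0'): 'c', ('c', '1'): 'c'}

def accepts(w: str) -> bool:
    state = 'a'
    for ch in w:
        state = delta[state, ch]
    return state == 'c'

for n in range(10):
    for t in product('01', repeat=n):
        w = ''.join(t)
        assert accepts(w) == ('11' in w)
print('DFA agrees with the substring test on all strings up to length 9')
```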

[Figure: Equivalence classes of states in the trimmed DFA, and the resulting minimal equivalent DFA.]

Exercises
1. For each of the following languages in {0, 1}∗ , describe a deterministic finite-state machine
that accepts that language. There are infinitely many correct answers for each language.
“Describe” does not necessarily mean “draw”.

(a) Only the string 0110.


(b) Every string except 0110.
(c) Strings that contain the substring 0110.
(d) Strings that do not contain the substring 0110.
? (e) Strings that contain an even number of occurrences of the substring 0110. (For
example, this language contains the strings 0110110 and 01011.)
(f) Strings that contain the subsequence 0110.
(g) Strings that do not contain the subsequence 0110.
(h) Strings that contain an even number of 1s and an odd number of 0s.
(i) Strings that represent a number divisible by 7 in binary.
(j) Strings whose reversals represent a number divisible by 7 in binary.
(k) Strings in which the substrings 01 and 10 appear the same number of times.
(l) Strings such that in every prefix, the number of 0s and the number of 1s differ by at
most 1.
(m) Strings such that in every prefix, the number of 0s and the number of 1s differ by at
most 4.
(n) Strings that end with 0^10 = 0000000000.


(o) Strings in which the number of 1s is even, the number of 0s is divisible by 3, the
overall length is divisible by 5, the binary value is divisible by 7, the binary value of
the reversal is divisible by 11, and does not contain thirteen 1s in a row. [Hint: This
is more tedious than difficult.]

2. (a) Let L ⊆ 0∗ be an arbitrary unary language. Prove that L ∗ is regular.


(b) Prove that there is a binary language L ⊆ (0 + 1)∗ such that L ∗ is not regular.

3. Describe and analyze algorithms for the following problems. In each case, the input is a
DFA M over the alphabet Σ = {0, 1}.

(a) Does M accept any string whose length is a multiple of 5?


(b) Does M accept every string that represents a number divisible by 7 in binary?
(c) Does M accept an infinite number of strings containing an odd number of 0s?
(d) Does M accept a finite number of strings that contain the substring 0110110 and
whose length is divisible by five?
(e) Does M accept only strings whose lengths are perfect squares?
(f) Does M accept any string whose length is composite?
? (g) Does M accept any string whose length is prime?

ÆÆÆ Move these to the graph traversal notes?

4. Prove that each of the following languages cannot be accepted by a DFA.


(a) {0^{n²} | n ≥ 0}
(b) {0^{n³} | n ≥ 0}
(c) {0^{f(n)} | n ≥ 0}, where f(n) is any fixed polynomial in n with degree at least 2.
(d) {0^n | n is composite}
(e) {0^n 1 0^n | n ≥ 0}
(f) {0^i 1^j | i ≠ j}
(g) {0^i 1^j | i < 3j}
(h) {0^i 1^j | i and j are relatively prime}
(i) {0^i 1^j | j − i is a perfect square}
(j) {w#w | w ∈ (0 + 1)∗}
(k) {ww | w ∈ (0 + 1)∗}
(l) {w#0^{|w|} | w ∈ (0 + 1)∗}
(m) {w0^{|w|} | w ∈ (0 + 1)∗}
(n) {xy | x, y ∈ (0 + 1)∗ and |x| = |y| but x ≠ y}
(o) {0^m 1^n 0^{m+n} | m, n ≥ 0}


(p) {0^m 1^n 0^{mn} | m, n ≥ 0}
(q) Strings in which the substrings 00 and 11 appear the same number of times.
(r) The set of all palindromes in (0 + 1)∗ whose length is divisible by 7.
(s) {w ∈ (0 + 1)∗ | w is the binary representation of a perfect square}
Æ (t) {w ∈ (0 + 1)∗ | w is the binary representation of a prime number}

5. For each of the following languages over the alphabet Σ = {0, 1}, either describe a DFA
that accepts the language or prove that no such DFA exists. Recall that Σ+ denotes the
set of all nonempty strings over Σ. [Hint: Believe it or not, most of these languages can be
accepted by DFAs.]

(a) {0^n w 1^n | w ∈ Σ∗ and n ≥ 0}
(b) {0^n 1^n w | w ∈ Σ∗ and n ≥ 0}
(c) {w 0^n 1^n x | w, x ∈ Σ∗ and n ≥ 0}
(d) {0^n w 1^n x | w, x ∈ Σ∗ and n ≥ 0}
(e) {0^n w 1 x 0^n | w, x ∈ Σ∗ and n ≥ 0}
(f) {wxw | w, x ∈ Σ∗}
(g) {wxw | w, x ∈ Σ+}
(h) {wxw^R | w, x ∈ Σ+}
(i) {wwx | w, x ∈ Σ+}
(j) {ww^R x | w, x ∈ Σ+}
(k) {wxwy | w, x, y ∈ Σ+}
(l) {wxw^R y | w, x, y ∈ Σ+}
(m) {xwwy | w, x, y ∈ Σ+}
(n) {xww^R y | w, x, y ∈ Σ+}
(o) {wxxw | w, x ∈ Σ+}
? (p) {wxw^R x | w, x ∈ Σ+}

