Automata
Contents
1 Why should you read this?
3 Languages
11 References
2 A word about notation
The notation used here deviates slightly from the notation presented to you during the
lecture. Be assured that both notations mean the same thing.
First of all, we use the term finite automaton (FA). In many books this is also called
a finite state machine. Both are exactly the same. In the script and in the lecture this
was referred to as the Endlicher Automat.
We denote an FA by the 5-tuple (Q, E, q0 , δ, A), where Q is a set of states, E is an
alphabet, q0 is the starting state, δ is a transition function, and A is the set of accepting
states.
In the script (and the slides presented in the lecture), this 5-tuple is denoted by
(E, X, f, x0, F). There E is the alphabet, X corresponds to our Q (set of states), f
corresponds to our δ (transition function), x0 corresponds to our q0 (initial state), and
lastly F corresponds to our A (set of accepting states).
3 Languages
To define what we mean by language, we first have to define what an alphabet is.
An alphabet is a finite set of symbols which are used to form words in a language.
An example of an alphabet might be a set like {a, b}. In this note we will de-
note an alphabet using the symbol E. A string over E is some number of elements
of E (possibly none) placed in order. So if E = {a, b} then strings over E can be
a, ab, bbaa, abab, aaaabbaab and so on and so forth. A very important string, which is
always a string over E, no matter what E is, is the null string denoted by ε. This is the
string with no symbols. If x is a string over E then we will use |x| to denote the length
of x. Hence |aba| = 3, |a| = 1, and |ε| = 0. Note also that strings of length one over
E are the same as elements of E.
For some alphabet E, we use E ∗ to denote the set of all possible strings over E.
Applying ∗ to some set is called the closure operation. So if E = {a, b} then
{a, b}∗ = {ε, a, b, aa, ab, ba, bb, aaa, aab, aba, abb, baa, . . .}
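The listing above can be reproduced mechanically. The following is a minimal Python sketch, assuming we only want the strings up to some maximum length (E∗ itself is infinite); the function name strings_up_to is introduced here for illustration and is not part of the lecture notation. The null string ε is represented by the empty Python string.

```python
from itertools import product

def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        # product(..., repeat=0) yields one empty tuple, i.e. the null string.
        for symbols in product(sorted(alphabet), repeat=n):
            yield "".join(symbols)

print(list(strings_up_to({"a", "b"}, 2)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```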
In addition to the standard set operations, we can also use operations on strings
like concatenation to generate new languages. If x and y are strings over E then the
concatenation of x and y, denoted by xy is a new string formed by writing the symbols
of x followed by the symbols of y. So if x = aa and y = bbb then xy = aabbb. If L1
and L2 are two languages then we can generate a new language L1 L2 which is defined
as follows.
L1 L2 = {xy | x ∈ L1 and y ∈ L2 }
We can extend this notion of concatenation as follows. If L is a language over E then
for any integer k ≥ 0 we can define a language
L^k = LL . . . L (k copies of L)
where, by the usual convention, L^0 = {ε}.
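For finite languages, concatenation and powers can be tried out directly. The sketch below represents a language as a Python set of strings; the names concat and power are chosen here for illustration, and the choice that the zeroth power is {ε} follows the usual convention rather than anything stated explicitly above.

```python
def concat(l1, l2):
    """L1 L2 = {xy | x in L1 and y in L2}."""
    return {x + y for x in l1 for y in l2}

def power(lang, k):
    """Concatenate `lang` with itself k times; k = 0 gives {ε} (assumed convention)."""
    result = {""}
    for _ in range(k):
        result = concat(result, lang)
    return result

print(concat({"aa"}, {"bbb"}))        # {'aabbb'}
print(sorted(power({"a", "b"}, 2)))   # ['aa', 'ab', 'ba', 'bb']
```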
Clearly, there is no hard line distinguishing the two approaches. As you read this
note you will learn more sophisticated ways of generating languages than the
examples of constructing languages given so far. You will also read more about
recognizing languages, which is more or less what this note is all about.
To explain our definition of regular languages a bit more clearly, we might say
that they are the languages containing only ε or a single string of length one, together
with those which can be obtained from such languages by a finite sequence of steps,
where each step consists of applying one of the three operations to the languages that
were obtained at an earlier step.
Example 3 Let L be the language consisting of all strings of 0s and 1s that have even
length. Note that since ε is of length zero, and zero is even, ε is in L.
Note that L can be thought of as consisting of a number, possibly zero, of strings of
length 2 concatenated. Hence the regular expression corresponding to L can be given
as (00 + 01 + 10 + 11)∗.
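The claim in Example 3 can be checked by brute force. The sketch below uses Python's re module, writing the + of the regular expression as | in Python's syntax; the function name in_even_length_language is introduced here for illustration only. It compares the regular expression against the direct definition (even length) for all strings of 0s and 1s up to length 6.

```python
import re
from itertools import product

# (00 + 01 + 10 + 11)* in Python regex syntax.
EVEN_LENGTH = re.compile(r"(?:00|01|10|11)*")

def in_even_length_language(s):
    """True if the whole string s matches (00 + 01 + 10 + 11)*."""
    return EVEN_LENGTH.fullmatch(s) is not None

# Compare with the direct definition on every string up to length 6.
for n in range(7):
    for symbols in product("01", repeat=n):
        s = "".join(symbols)
        assert in_even_length_language(s) == (len(s) % 2 == 0)
```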
Exercise 1 What would be the regular expression corresponding to the language which
consists of all strings of 0s and 1s that have odd length?
Exercise 2 Prove that every finite language is regular. (Hint: Use mathematical
induction)
5 Deterministic Finite Automata
In this section we will discuss simple machines which will be used for recognizing
the languages introduced in the last section. This means that given a language L,
we will design a machine ML which, given any string s as input, will accept it if
s ∈ L, and reject it otherwise.
We will see that regular languages can be characterized in terms of the “memory”
required to recognize them. We will restrict ourselves to machines which will read
any string presented to them in a single pass from left to right. This will help us to
clarify what information needs to be “remembered” during the process of recognizing
a language, and allows a classification of languages on the basis of how much needs to
be remembered at each step in order to recognize the language. Regular languages are
the simplest in this respect; there are other, more “difficult” languages which we will
not consider here.
Try to convince yourself that L is regular, and corresponds to the regular expression
(1 + 01)∗.
Now let us try to design the machine for recognizing the language L in Example 4.
This machine examines any input string one character at a time and at some stage
decides to accept or reject the string. The easiest case, when the string can be disposed
of, is when 00 occurs as a substring. Let us call this case N; for it we just need
to remember that 00 has occurred.
If 00 has not yet occurred while we are reading the string, there might be two other
cases: case L0 when the last symbol read is 0, and case L1 when the last symbol read
is 1. If we are in case L1 we might assume that the string is in L. In this case, if the
next symbol read is a 0 then we are in case L0, and if the next symbol read is a 1 then
we continue in case L1. In case L0, the symbols 0 and 1 would take us to case N and
L1 respectively. The only other case that we have left out is ε, when no symbol has
been read yet. This case should be treated separately because receiving 0 or 1 in this
case requires different transitions from those in the other cases.
So we see that for recognizing L we don’t need to remember exactly what substring
we have read so far, but only remember which of the four states we are currently in.
The discussion above can be summarized in the form of the diagram shown in
Figure 1.
The arrow to the circle labeled ε indicates where to start, when no symbols of the
input string have been examined yet. The double circle indicates that if we are at this
state when we reach the end of the string, then the answer is “Yes, the input string
belongs to the language L”. Ending at any other state indicates that the string is not in
the language. Note that the four circles correspond to the four different cases described
above.
You can think of the above diagram as a “machine” because it is possible to visual-
ize a piece of hardware doing the job we described above. At any time the machine is
in one of the four states/cases which we have labeled as ε, L0, L1 and N. Initially the
machine is in state ε and as it reads one symbol at a time from the input string, it jumps
from one state to the other. The double circle is an accepting state since it indicates
that the substring read so far belongs to the language L. So if the machine is in this
state after the entire input string has been read, then this indicates that the string is in
the language.
[Figure 1: transition diagram of the machine, with states ε, L0, L1 and N.]
Observe that at the heart of the machine is a set of states and a function which
receives a state and a symbol as input and outputs the next state. The crucial property
of the machine is the finiteness of the set of states. So while recognizing a string, we
need not remember the entire substring that has been read so far, but only which of the
n (n = 4 in our case) states the substring belongs to. This set of states is precisely the
“memory” aspect that we talked about at the beginning of this section.
Now we will give a formal definition of machines like the one we just described.
We call such a machine a (deterministic) finite automaton or a finite-state machine.
Note that it is possible to extend our definition so that the transition function takes as
input not a single symbol, but a string of symbols. Let δ∗ denote this new transition
function. It is defined recursively: δ∗(q, ε) = q, and for any string y ∈ E∗, symbol
a ∈ E, and state q ∈ Q, δ∗(q, ya) = δ(δ∗(q, y), a). The rest of the functionality of the
FA remains the same.
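The extended transition function is easy to sketch in code. The Python below encodes the four-state machine discussed earlier in this section (states ε, L0, L1, N, written here as "eps", "L0", "L1", "N"). The transitions out of L0, L1 and N follow the case analysis in the text; the two transitions out of the start state (0 to L0, 1 to L1) and the choice of L1 as the only accepting state are assumptions made for this sketch, since they are only visible in Figure 1.

```python
# Transition function δ as a dictionary: (state, symbol) -> next state.
DELTA = {
    ("eps", "0"): "L0", ("eps", "1"): "L1",   # assumed (read off Figure 1)
    ("L0", "0"): "N",   ("L0", "1"): "L1",
    ("L1", "0"): "L0",  ("L1", "1"): "L1",
    ("N", "0"): "N",    ("N", "1"): "N",      # once 00 has occurred we stay in N
}
START = "eps"
ACCEPTING = {"L1"}                            # assumed accepting state

def delta_star(q, x):
    """δ*(q, x): apply δ one symbol at a time, so δ*(q, ya) = δ(δ*(q, y), a)."""
    for a in x:
        q = DELTA[(q, a)]
    return q

def accepts(x):
    return delta_star(START, x) in ACCEPTING

print(accepts("0101"))   # True: no 00, ends in 1
print(accepts("1001"))   # False: contains 00
```

Note that only the current state is stored while the string is read, which is exactly the finite "memory" described above.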
Now let us state a theorem, whose proof we will work out later.
Using the above theorem, we can state another theorem describing an important
property about regular languages.
We will now try to sketch the proof of the above theorem. Let L1 and L2 be rec-
ognized by the FAs M1 = (P, E, p0 , δ1 , A1 ) and M2 = (Q, E, q0 , δ2 , A2 ) respectively.
Suppose that we wish to find an FA M = (R, E, r0 , δ, A) to recognize L1 ∪ L2 . If we can
find M then by Theorem 1 we prove that L1 ∪ L2 is regular.
Note that the FA M while processing a string needs to keep track of the status
of both M1 and M2 simultaneously, and M accepts when either of M1 or M2 accepts.
How can we do this? Basically, M1 should remember the state it is in and M2 should
remember the state it is in, and M needs to remember the states of both M1 and M2 . For
this to happen, we construct the states of M to be pairs (p, q) where p ∈ P and q ∈ Q.
So R = P × Q. Initially M1 is in the state p0 and M2 is in the state q0 , and so M should
be in the state (p0 , q0 ). If M is in the state (p, q) and receives input a, then it should
go to the state (δ1 (p, a), δ2 (q, a)), since δ1 (p, a) and δ2 (q, a) are the states to which
M1 and M2 would respectively go. Since M has to accept a string whenever M1 or M2
does, the accepting states of M should be pairs (p, q) for which either p ∈ A1 or q ∈ A2 .
That leads to the fact that M accepts L1 ∪ L2 .
By now you have probably understood that exactly the same strategy works for
the cases L1 ∩ L2 and L1 − L2, except for the definition of accepting states. In the first
case (p, q) should be an accepting state if both p ∈ A1 and q ∈ A2, and in the second
case (p, q) is an accepting state if p ∈ A1 and q ∉ A2.
For the last case, i.e. the complement of L1, note that we can reduce it to the third
case by noting that the complement of L1 is E∗ − L1. That solves this case. However,
we can solve it in an even simpler way by noting that the M which accepts the
complement of L1 is the same as M1 accepting L1, with the only difference that the
accepting states of M are P − A1, i.e. exactly those states which are not accepting
states in M1.
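The pair construction from this proof sketch can be written down in a few lines. The Python below is a minimal sketch, assuming an FA is stored as a tuple (Q, E, q0, δ, A) with δ a dictionary; the function name product_fa and the two example machines (even number of 0s, and strings ending in 1) are hypothetical and chosen only for illustration.

```python
from itertools import product

def product_fa(m1, m2, mode):
    """Build M with states P x Q; mode is 'union', 'intersection' or 'difference'."""
    states1, sigma, start1, delta1, accepting1 = m1
    states2, _, start2, delta2, accepting2 = m2
    states = set(product(states1, states2))
    # M moves both components at once: δ((p, q), a) = (δ1(p, a), δ2(q, a)).
    delta = {
        ((p, q), a): (delta1[(p, a)], delta2[(q, a)])
        for (p, q) in states
        for a in sigma
    }
    rule = {
        "union": lambda p, q: p in accepting1 or q in accepting2,
        "intersection": lambda p, q: p in accepting1 and q in accepting2,
        "difference": lambda p, q: p in accepting1 and q not in accepting2,
    }[mode]
    accepting = {(p, q) for (p, q) in states if rule(p, q)}
    return states, sigma, (start1, start2), delta, accepting

def accepts(machine, x):
    _, _, state, delta, accepting = machine
    for a in x:
        state = delta[(state, a)]
    return state in accepting

# Hypothetical example machines over {0, 1}.
EVEN_ZEROS = ({"e", "o"}, {"0", "1"}, "e",
              {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"},
              {"e"})
ENDS_IN_ONE = ({"n", "y"}, {"0", "1"}, "n",
               {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"},
               {"y"})

print(accepts(product_fa(EVEN_ZEROS, ENDS_IN_ONE, "union"), "01"))         # True
print(accepts(product_fa(EVEN_ZEROS, ENDS_IN_ONE, "intersection"), "01"))  # False
```

Only the rule for accepting pairs changes between the three cases; the states, start state and transitions are identical.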
[Figure 2: FA recognizing the language L.]
[Figure 3: nondeterministic automaton for the same language, with states A and B.]
The only difference these machines will have is that from each state there might be
multiple transitions possible for any given input symbol, whereas in FAs for a given
state and input symbol there was exactly one “next state”.
Try to convince yourself that the FA shown in Figure 2 corresponds to the language L.
The automaton shown in Figure 3 is similar to the one in Figure 2, but is not an
FA since in state A, the input 0 offers possible transitions to two different states. It is
unclear which path should be taken if the input 0 occurs in state A. However, if we
decide on the rule that this automaton accepts strings for which there exists some path
to an accepting state, then note that it reflects the structure in the regular expression
much more than the FA in Figure 2 does. For any string in the language L it is easy to
describe a path that leads to the accepting state B: start at A and continue looping back
to A until the first occurrence of either 000 or 111, use these to go to the accepting state
B, and continue looping back to B for each of the remaining symbols. Note that this is
exactly what the regular expression in Example 5 also says. Further, also note that for
any string not in L, there doesn’t exist any path which starts in A and leads to B.
Now let us formally define machines like the one described in Figure 3, which we will
call nondeterministic finite automata (NFAs).
Definition 4 (Nondeterministic Finite Automaton) A nondeterministic finite automaton
(NFA) is a 5-tuple (Q, E, q0, δ, A) where Q, E, q0 and A are exactly the same as
described in the case of an FA. The function δ is different and is defined from Q × E to
2^Q (i.e. the set of all subsets of Q).
The above definition simply says that an NFA is exactly the same as an FA (or a
Deterministic FA) except that from each state there can be multiple different transitions
for any given input symbol.
Note that it is also possible to define δ as a relation instead of a function, in which
case δ ⊆ (Q × E) × Q. The two definitions are clearly equivalent.
As in the case of FA, it is also possible to extend the definition of δ to δ∗ so that it
accepts strings of symbols. Since δ now returns a set of states, we take δ∗(p, ε) = {p},
and for a string y ∈ E∗, symbol a ∈ E, and state p ∈ Q, δ∗(p, ya) is the union of the
sets δ(r, a) over all r ∈ δ∗(p, y).
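The extension of δ to strings can be sketched by tracking the set of all states the NFA could be in after each symbol. The Python below does this for an NFA accepting strings that contain 000 or 111, in the spirit of the automaton with states A and B described above. The intermediate states z1, z2, o1, o2 and the exact transition table are assumptions made for this sketch, since the figure itself is not reproduced here; a missing dictionary entry stands for the empty set.

```python
# δ maps (state, symbol) to a SET of possible next states.
DELTA = {
    ("A", "0"): {"A", "z1"}, ("A", "1"): {"A", "o1"},   # loop on A, or guess a run starts
    ("z1", "0"): {"z2"}, ("z2", "0"): {"B"},            # second and third 0 of 000
    ("o1", "1"): {"o2"}, ("o2", "1"): {"B"},            # second and third 1 of 111
    ("B", "0"): {"B"}, ("B", "1"): {"B"},               # loop on B
}
START = "A"
ACCEPTING = {"B"}

def delta_star(p, x):
    """All states reachable from p on input x (following every possible path)."""
    current = {p}
    for a in x:
        current = set().union(*(DELTA.get((r, a), set()) for r in current))
    return current

def accepts(x):
    # Accept if SOME path ends in an accepting state.
    return bool(delta_star(START, x) & ACCEPTING)

print(accepts("0100011"))  # True: contains 000
print(accepts("0101011"))  # False: neither 000 nor 111
```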
Lastly, we can generalize our definition of an NFA even further by including
ε-transitions, i.e. state transitions which require only the null string (ε) as input. We
will see that this additional extension will simplify the process of finding an abstract
machine for recognizing a given language. Therefore we can define an NFA with
ε-transitions (denoted by NFA-ε) to be the same as an NFA, except that the transition
function is defined as δ : Q × (E ∪ {ε}) → 2^Q.
So the ε-closure of a set S is simply the set of states that can be reached from the
elements of S using only ε-transitions.
Algorithm to calculate ε(S): Begin with T = S, and at each step add to T the union
of all the sets δ(q, ε) for q ∈ T . Stop when the set T doesn’t change any more. ε(S) is
the final value of T .
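The algorithm above translates almost line by line into code. In the Python sketch below, the ε-transitions are stored in a dictionary from a state to the set δ(q, ε); the dictionary EPS_MOVES is a made-up example (q0 has ε-transitions to q1 and q2, no other state has any) and is not taken from a specific figure.

```python
def eps_closure(eps_moves, S):
    """Return ε(S): all states reachable from S using only ε-transitions."""
    T = set(S)                        # begin with T = S
    while True:
        # Union of the sets δ(q, ε) for all q currently in T.
        step = set()
        for q in T:
            step |= eps_moves.get(q, set())
        if step <= T:                 # T does not change any more: stop
            return T
        T |= step

EPS_MOVES = {"q0": {"q1", "q2"}}      # hypothetical ε-transitions
print(sorted(eps_closure(EPS_MOVES, {"q0"})))   # ['q0', 'q1', 'q2']
```

The loop terminates because T only grows and Q is finite.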
Now note that an FA can be considered to be a special case of an NFA, since a
function from Q × E to Q can in an obvious way be identified with a function from
Q × E to 2^Q, whose values are all sets with one element. Similarly, an NFA can be
considered to be a special case of NFA-ε, one in which for each q ∈ Q, δ(q, ε) = ∅.
Therefore, any language that is recognized by an FA can be recognized by an NFA,
and any language that is recognized by an NFA can be recognized by an NFA-ε.
The equivalence of FA, NFA, and NFA-ε in terms of the class of languages that they
recognize, is proved with the additional fact that any language which is recognized by
an NFA-ε can be recognized by an FA. This completes the loop, and proves that allow-
ing nondeterminism doesn’t enlarge the class of languages that FAs can recognize.
We will not give a formal proof of the above theorem, but we will sketch what the
proof would look like. We will do this in two parts. First we will find an NFA M1
(without ε-transitions) recognizing L, and second we will find an FA equivalent to this
NFA.
δ1 (q, a) = δ∗ (q, a)
Note that δ∗ (q, a) is the set of all states that can be reached from q, using the input
symbol a but allowing ε-transitions both before and after. Therefore, the way we have
defined δ1 , if M can move from p to q using the input symbol a together with ε-
transitions, then M1 can move from p to q using the input a alone.
Finally, as we mentioned before, it might be necessary to make q0 an accepting
state in M1 . For this, we define
(c) M1 recognizes L
Since M recognizes L, a string x is in L if and only if δ∗(q0, x) ∩ A ≠ ∅. On the other
hand, x is accepted by M1 if and only if δ1∗(q0, x) ∩ A1 ≠ ∅. First let us consider the
case when ε({q0}) ∩ A = ∅ in M. Here A1 is defined to be A. If |x| ≥ 1 then from part
(b) above, δ∗(q0, x) = δ1∗(q0, x). So x is in L if and only if x is accepted by M1. If x = ε
then x is not accepted by either M or M1. Hence it follows that M1 recognizes L.
In the second case, when ε({q0}) ∩ A ≠ ∅, we have A1 = A ∪ {q0}. ε is accepted by both
M and M1. If |x| ≥ 1, δ∗(q0, x) = δ1∗(q0, x). Either this set contains an element of A (in
which case both M and M1 accept x), or this set contains neither q0 nor any elements
of A (in which case both M and M1 reject x). It is impossible for this set to contain
q0 and no elements of A, since if q0 ∈ δ∗(q0, x), then since δ∗(q0, x) is the ε-closure of
another set (by definition), δ∗(q0, x) contains ε({q0}) and thus contains an element of
A. Hence M and M1 accept exactly the same strings.
(a) Definition of M2
We are trying to eliminate the nondeterminism present in M1 , which means that in M2
each combination of state and input symbol should result in exactly one state. Note that
the transition function δ1 takes a pair (q, a) and returns a set of elements of Q1 . Now
suppose we define our notion of state to be a set of elements from Q1 . Let s be such a
subset of Q1 . For any element p ∈ s, there is a set δ1 (p, a) of (possibly several) elements
of Q1 to which M1 may go on input a. But for a single subset s of Q1 there
is a single subset of Q1 in which M1 may end up: the union of the
sets δ1(p, a) for all elements p ∈ s. So for each state-input pair (with our new notion
of state), there is one and only one next state. The machine obtained in this way clearly
simulates in a natural way the action of the original machine M1, provided the initial
and the final states are defined correctly. Hence we have eliminated nondeterminism
by what is called the subset construction: states in Q2 are subsets of Q1.
From the above discussion, we can now define M2 = (Q2 , E, q2 , δ2 , A2 ) as follows.
The definition of A2 follows from the fact that for a string to be accepted in M1 , starting
in q1 , the machine can end up only in sets of states which contain an element of A1 .
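The subset construction described in part (a) can be sketched as follows. This Python version builds only the subsets that are actually reachable from the start subset, and takes the start state of M2 to be {q1} and the accepting states to be the subsets containing an element of A1. The function names and the small example NFA (strings ending in 01) are hypothetical and serve only as an illustration.

```python
from collections import deque

def subset_construction(sigma, delta1, start1, accepting1):
    """Turn an NFA (without ε-transitions) into an FA whose states are subsets of Q1."""
    start = frozenset({start1})
    delta2 = {}
    seen = {start}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for a in sigma:
            # δ2(s, a) is the union of the sets δ1(p, a) for all p in s.
            t = frozenset().union(*(delta1.get((p, a), set()) for p in s))
            delta2[(s, a)] = t
            if t not in seen:          # create a state only when it is needed
                seen.add(t)
                queue.append(t)
    accepting2 = {s for s in seen if s & accepting1}
    return seen, start, delta2, accepting2

def run(start, delta2, accepting2, x):
    state = start
    for a in x:
        state = delta2[(state, a)]
    return state in accepting2

# Hypothetical NFA accepting strings over {0, 1} that end in 01.
NFA_DELTA = {("a", "0"): {"a", "b"}, ("a", "1"): {"a"}, ("b", "1"): {"c"}}
states, start, delta2, accepting2 = subset_construction({"0", "1"}, NFA_DELTA, "a", {"c"})
print(len(states))                              # 3 reachable subsets
print(run(start, delta2, accepting2, "1101"))   # True
```

If some subset has no outgoing transition on a symbol, the empty frozenset appears as an extra non-accepting state, which plays the role of a dead state.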
(b) M2 recognizes L
Clearly this will follow if we can show that
Example 6 Consider the NFA-ε shown in Figure 4(a). Figure 4(b) shows the corre-
sponding NFA and Figure 4(c) the corresponding FA.
Let us consider a few steps in obtaining the NFA and the FA in the above example.
Let M = (Q, E, q, δ, A) be the NFA-ε, M1 = (Q1 , E, q1 , δ1 , A1 ) be the NFA, and M2 =
(Q2 , E, q2 , δ2 , A2 ) the FA.
Figure 4: Obtaining an FA from a given NFA-ε. (a) denotes an NFA-ε, (b) the equivalent
NFA, and (c) the equivalent FA
For finding δ1 (q0 , 0), for example, we use
δ1(q0, 0) = δ∗(q0, 0) = ε( ⋃ { δ(p, 0) | p ∈ ε({q0}) } )
The steps involved are: calculating ε({q0}), finding δ(p, 0) for each p in this set, taking
the union of the δ(p, 0)s, and calculating the ε-closure of the result. We can see that
from the state q0 with input 0, M can move to state q1 (using an ε-transition to q1 first),
or to state q3 (first to q2 with ε, and then from there to q3 with 0). There are no
ε-transitions from q1 or q3, and so δ1(q0, 0) = {q1, q3}. A similar procedure will give
the remaining values of δ1. Finally, since ε({q0}) does not contain q3, A1 = A = {q3}.
Figure 4(b) shows the NFA.
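The computation of δ1(q0, 0) can be replayed in code. The Python sketch below stores δ as a dictionary in which the empty string "" stands for ε (a representation chosen here, not part of the notes). The table PARTIAL_DELTA contains only the transitions mentioned in the paragraph above; it is an assumption and not the complete machine of Figure 4(a).

```python
def eps_closure(delta, S):
    """ε(S): states reachable from S using only ε-transitions (key symbol "")."""
    T, stack = set(S), list(S)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in T:
                T.add(r)
                stack.append(r)
    return T

def delta1(delta, q, a):
    """δ1(q, a) = ε( union of δ(p, a) for p in ε({q}) )."""
    moved = set()
    for p in eps_closure(delta, {q}):
        moved |= delta.get((p, a), set())
    return eps_closure(delta, moved)

# Only the transitions named in the text; the rest of the machine is omitted.
PARTIAL_DELTA = {
    ("q0", ""): {"q1", "q2"},   # ε-transitions from q0
    ("q1", "0"): {"q1"},
    ("q2", "0"): {"q3"},
}
print(sorted(delta1(PARTIAL_DELTA, "q0", "0")))   # ['q1', 'q3']
```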
For the DFA shown in Figure 4(c), first note that it has only seven states and not
sixteen, which is the number of subsets of {q0, q1 , q2 , q3 }. This is because we create a
state only when it is needed, during the construction process of the FA.
From {q0 } the two states that can be reached with one symbol are {q1 , q3 } and
{q2 , q3 }. To compute δ2 ({q2 , q3 }, 0), we proceed as follows.
Since the state {q1} has not appeared yet during our construction, we add it.
Proceeding in this fashion we reach a point where for each state q that has been drawn
so far, δ2(q, 0) and δ2(q, 1) are both already drawn, and this indicates that we have all
the reachable states.
Lastly, note that δ2({q3}, 0) = δ2({q3}, 1) = ∅. To complete the FA we add an extra
state φ for handling this case. The resulting FA is the one shown in Figure 4(c).
[Figures: NFA-ε constructions combining the machines Mr and Ms, with start states qr
and qs, using ε-transitions and a new start state q0.]
Q = Qr ∪ {q0}
A = {q0}
For q ∈ Q and a ∈ E:
    δ(q, a) = δr(q, a)            if q ∈ Qr
    δ(q, a) = ∅                   if q = q0
For q ∈ Q:
    δ(q, ε) = δr(q, ε)            if q ∈ Qr − Ar
    δ(q, ε) = δr(q, ε) ∪ {q0}     if q ∈ Ar
    δ(q, ε) = {qr}                if q = q0
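The definition above can be turned into a small program. The Python sketch below builds the new machine from Mr exactly as defined (new start state q0, which is also the only accepting state; an ε-transition from q0 to qr; ε-transitions from the accepting states of Mr back to q0) and then simulates it. The empty string "" is used as the symbol for ε, and the example machine Mr, which accepts only the string ab, is hypothetical.

```python
def star(states_r, start_r, delta_r, accepting_r, new_start="q0"):
    """Build the NFA-ε M from Mr following the definition of Q, A and δ."""
    assert new_start not in states_r
    delta = {key: set(targets) for key, targets in delta_r.items()}
    for q in accepting_r:                          # δ(q, ε) = δr(q, ε) ∪ {q0}
        delta.setdefault((q, ""), set()).add(new_start)
    delta[(new_start, "")] = {start_r}             # δ(q0, ε) = {qr}
    return states_r | {new_start}, new_start, delta, {new_start}

def closure(delta, S):
    T, stack = set(S), list(S)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in T:
                T.add(r)
                stack.append(r)
    return T

def accepts(machine, x):
    _, start, delta, accepting = machine
    current = closure(delta, {start})
    for a in x:
        moved = set()
        for q in current:
            moved |= delta.get((q, a), set())
        current = closure(delta, moved)
    return bool(current & accepting)

# Hypothetical Mr accepting exactly the string "ab".
MR = ({"r0", "r1", "r2"}, "r0", {("r0", "a"): {"r1"}, ("r1", "b"): {"r2"}}, {"r2"})
M = star(*MR)
print([w for w in ["", "ab", "abab", "a", "aba"] if accepts(M, w)])
# ['', 'ab', 'abab']
```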
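[Figure: the NFA-ε M built from Mr, with the new start state q0 joined to qr by an
ε-transition and ε-transitions from the accepting states of Mr back to q0 (note that
these are not accepting states in M).]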
L(p, q) = {x ∈ E ∗ | δ∗ (p, x) = q}
L(p, q) is the set of strings that allow M to reach state q if it begins in state p. If we
can show that each language L(p, q) corresponds to a regular expression, then since the
language recognized by M is the union of the languages L(q0, q) for all q ∈ A, a regular
expression for L can be obtained by combining these individual regular expressions
using +. We will give an inductive proof that each language L(p, q) is regular.
On what shall we base our induction? For this, consider the elements of Q to be
labeled with integers from 1 to N. Next we formalize the idea of a path going through
a state s. If x ∈ E ∗ , we say x represents a path from p to q through s if x can be written
in the form x = yz for some y and z with |y|, |z| > 0, δ∗ (p, y) = s and δ∗ (s, z) = q. For
any J ≥ 0 we define the set L(p, q, J) as follows. L(p, q, J) = {x ∈ E ∗ | x corresponds
to a path from p to q that goes through no state numbered higher than J}. Note that
L(p, q, N) = L(p, q) since N is the highest numbered state in the FA. Thus it will be
sufficient to show that L(p, q, N) is regular, and the way we shall show this is by proving
that L(p, q, J) is regular for each J, 0 ≤ J ≤ N. We will use mathematical induction over
J.
For the basis step we have to show that L(p, q, 0) is regular. Now a path from p to
q can go from p to q without going to any state numbered higher than 0 (i.e. without
going through any other state) only if it corresponds to a single symbol, or if p = q
and the path corresponds to the string ε. Thus L(p, q, 0) is a subset of E ∪ {ε} and is
regular.
Now we want to show that if the language L(p, q, K) is regular then L(p, q, K + 1) is
also regular for 1 ≤ p, q ≤ N, and 0 ≤ K ≤ N − 1. A string x ∈ L(p, q, K + 1) represents
a path from p to q that goes through no state numbered higher than K + 1. There are
two possible ways in which this can happen: the path could bypass the state K + 1
altogether, in which case x ∈ L(p, q, K) and is regular, or the path could go from p to
K + 1, possibly looping back to K + 1 several times and then go from K + 1 to q, never
going to any states higher than K + 1. In the latter case, we can write x in the form
x1 y x2, where
δ∗(p, x1) = K + 1
δ∗(K + 1, y) = K + 1
δ∗(K + 1, x2) = q
If we include all the looping from K + 1 back to itself within the string y, then x1 ∈
L(p, K + 1, K) and x2 ∈ L(K + 1, q, K). This simply says that before arriving at K + 1
for the first time and after leaving K + 1 for the last time, the path goes through no
state numbered higher than K. Further, if y ≠ ε, and each separate loop in the path is
represented by a string yi, then y = y1 y2 . . . yl and each yi is an element of
L(K + 1, K + 1, K). Therefore y ∈ L(K + 1, K + 1, K)∗. Hence it follows from the above
discussion that
L(p, q, K + 1) = L(p, q, K) ∪ L(p, K + 1, K) L(K + 1, K + 1, K)∗ L(K + 1, q, K)
Each language on the right-hand side is regular by the induction hypothesis, and
regular languages are closed under union, concatenation and closure, so L(p, q, K + 1)
is regular.
11 References
1. “Introduction to Automata Theory, Languages, and Computation”, by John
Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman, Addison-Wesley, Reading,
Massachusetts, USA, 2001.
2. “Elements of the Theory of Computation”, by Harry R. Lewis and Christos H.
Papadimitriou, Prentice-Hall International, 1981.
3. “Introduction to Languages and the Theory of Computation”, by John C. Martin,
McGraw-Hill Inc., 1991.