Introduction To Languages and The Theory of Computation
Introduction to Languages
and The Theory of
Computation
Third Edition
John C. Martin
North Dakota State University
Boston
Burr Ridge, IL  Dubuque, IA  Madison, WI  New York  San Francisco  St. Louis
Bangkok  Bogotá  Caracas  Kuala Lumpur  Lisbon  London  Madrid  Mexico City
Milan  Montreal  New Delhi  Santiago  Seoul  Singapore  Sydney  Taipei  Toronto
McGraw-Hill Higher Education
A Division of The McGraw-Hill Companies
Some ancillaries, including electronic and print components, may not be available to
customers outside the United States.
ISBN 0-07-232200-4
ISBN 0-07-119854-7 (ISE)
www.tmhhe.com
ABOUT THE AUTHOR
Preface ix
Introduction xi
Regular Languages and Finite Automata 83

CHAPTER 6
Context-Free Grammars 203
6.1 Examples and Definitions 203
6.2 More Examples 210
6.3 Regular Grammars 216
6.4 Derivation Trees and Ambiguity 220
6.5 An Unambiguous CFG for Algebraic Expressions 226
6.6 Simplified Forms and Normal Forms 232
Exercises 240
More Challenging Problems 247

CHAPTER 7
Pushdown Automata 251
7.1 Introduction by Way of an Example 251
7.2 The Definition of a Pushdown Automaton 255
7.3 Deterministic Pushdown Automata 260
7.4 A PDA Corresponding to a Given Context-Free Grammar 265
7.5 A Context-Free Grammar Corresponding to a Given PDA 273
7.6 Parsing 280
Exercises 290
More Challenging Problems 295

PART IV
Turing Machines and Their Languages 317

CHAPTER 9
Turing Machines 319
9.1 Definitions and Examples 319
9.2 Computing a Partial Function with a Turing Machine 328
9.3 Combining Turing Machines 332
9.4 Variations of Turing Machines: Multitape TMs 337
9.5 Nondeterministic Turing Machines 341
9.6 Universal Turing Machines 347
9.7 Models of Computation and the Church-Turing Thesis 352
Exercises 354
More Challenging Problems 361

CHAPTER 10
Recursively Enumerable Languages 365
10.1 Recursively Enumerable and Recursive 365
10.2 Enumerating a Language 368
10.3 More General Grammars 371
10.4 Context-Sensitive Languages and the Chomsky Hierarchy 380
10.5 Not All Languages Are Recursively Enumerable 387
Exercises 397
More Challenging Problems 401

PART V
Unsolvable Problems and Computable Functions 405

CHAPTER 11
Unsolvable Problems 407
11.1 A Nonrecursive Language and an Unsolvable Problem 407
11.2 Reducing One Problem to Another: The Halting Problem 411
11.3 Other Unsolvable Problems Involving TMs 416
11.4 Rice’s Theorem and More Unsolvable Problems 419
11.5 Post’s Correspondence Problem 422
11.6 Unsolvable Problems Involving Context-Free Languages 430
Exercises 436
More Challenging Problems 439

CHAPTER 12
Computable Functions 442
12.1 Primitive Recursive Functions 442
12.2 Primitive Recursive Predicates and Some Bounded Operations 451
12.3 Unbounded Minimalization and μ-Recursive Functions 459
12.4 Gödel Numbering 461
12.5 All Computable Functions Are μ-Recursive 465
12.6 Nonnumeric Functions, and Other Approaches to Computability 470
Exercises 474
More Challenging Problems 477

PART VI
Introduction to Computational Complexity 479

CHAPTER 13
Measuring and Classifying Complexity 481
13.1 Growth Rates of Functions 481
13.2 Time and Space Complexity of a Turing Machine 486
13.3 Complexity Classes 492
Exercises 497
More Challenging Problems 499

CHAPTER 14
Tractable and Intractable Problems 500
14.1 Tractable and Possibly Intractable Problems: P and NP 500
14.2 Polynomial-Time Reductions and NP-Completeness 506
14.3 Cook’s Theorem 510
14.4 Some Other NP-Complete Problems 517
Exercises 522
More Challenging Problems 524

References 527
Bibliography 529
Index of Notation 531
Index 535
VI are substantially independent of the first three parts, the text can also be used in a
course on Turing machines, computability, and complexity.
I am grateful to the many people who have helped me with all three editions of
this text. Particular thanks are due to Ting-Lu Huang, who pointed out an error in the
proof of Theorem 4.2 in the second edition, and to Jonathan Goldstine, who provided
several corrections to Chapters 7 and 8. I appreciate the thoughtful and detailed com-
ments of Bruce Wieand, North Carolina State University; Edward Ashcroft, Arizona
State University; Ding-Zhu Du, University of Minnesota; William D. Shoaff, Florida
Institute of Technology; and Sharon Tuttle, Humboldt State University, who reviewed
the second edition, and Ding-Zhu Du, University of Minnesota; Leonard M. Faltz,
Arizona State University; and Nilufer Onder, Michigan Tech, who reviewed a preliminary
version of this edition. Their help has resulted in a number of improvements,
including the modification in Chapter 9 mentioned earlier. Melinda Dougharty at
McGraw-Hill has been delightful to work with, and I also appreciate the support and
professionalism of Betsy Jones and Peggy Selle. Finally, thanks once again to my
wife Pippa for her help, both tangible and intangible.
John C. Martin
INTRODUCTION
In order to study the theory of computation, let us try to say what a computation
is. We might say that it consists of executing an algorithm: starting with some
input and following a step-by-step procedure that will produce a result. Exactly
what kinds of steps are allowed in an algorithm? One approach is to think about
the steps allowed in high-level languages that are used to program computers (C, for
example). Instead, however, we will think about the computers themselves. We will
say that a step will be permitted in a computation if it is an operation the computer
can make. In other words, a computation is simply a sequence of steps that can be
performed by a computer! We will be able to talk precisely about algorithms and
computations once we know precisely what kinds of computers we will study.
The computers will not be actual computers. In the first place, a theory based on
the specifications of an actual piece of hardware would not be very useful, because
it would have to be changed every time the hardware was changed or enhanced.
Even more importantly, actual computers are much too complicated; the idealized
computers we will study are simple. We will study several abstract machines, or
models of computation, which will be defined mathematically. Some of them are as
powerful in principle as real computers (or even more so, because they are not subject
to physical constraints on memory), while the simpler ones are less powerful. These
simpler machines are still worth studying, because they make it easier to introduce
some of the mathematical formalisms we will use in our theory and because the
computations they can carry out are performed by real-world counterparts in many
real-world situations.
We can understand the “languages” part of the subject by considering the idea of
a decision problem, a computational problem for which every specific instance can
be answered “yes” or “no.” A familiar numerical example is the problem: Given a
positive integer n, is it prime? The number n is encoded as a string of digits, and a
computation that solves the problem starts with this input string. We can think about
this as a language recognition problem: to take an arbitrary string of digits and deter-
mine whether it is one of the strings in the language of all strings representing primes.
In the same way, solving any decision problem can be thought of as recognizing a
certain language, the language of all strings representing instances of the problem for
which the answer is “yes.” Not all computational problems are decision problems,
and the more powerful of our models of computation will allow us to handle more
general kinds; however, even a more general problem can often be approached by
considering a comparable decision problem. For example, if f is a function, being
able to answer the question: given x and y, is y = f(x)? is tantamount to being
able to compute f(x) for an arbitrary x. The problem of language recognition will
be a unifying theme in our discussion of abstract models of computation. Comput-
ing machines of different types can recognize languages of different complexity, and
computational problem to that of another. Some problems that are solvable in princi-
ple are not really solvable in practice, because their solution would require impossible
amounts of time and space. A simple criterion involving Turing machines is generally
used to distinguish the tractable problems from the intractable ones. Although the
criterion is simple, however, it is not always easy to decide which problems satisfy
it. In the last chapter we discuss an interesting class of problems, those for which no
one has found either a good algorithm or a convincing proof that none exists.
People have been able to compute for many thousands of years, but only very
recently have people made machines that can, and computation as a pervasive part
of our lives is an even more recent phenomenon. The theory of computation is
slightly older than the electronic computer, because some of the pioneers in the
field, Turing and others, were perceptive enough to anticipate the potential power of
computers; their work provided the conceptual model on which the modern digital
computer is based. The theory of computation has also drawn from other areas:
mathematics, philosophy, linguistics, biology, and electrical engineering, to name a
few. Remarkably, these elements fit together into a coherent, even elegant, theory,
which has the additional advantage that it is useful and provides insight into many
areas of computer science.
CHAPTER 1
Basic Mathematical Objects
1.1 | Sets
A set is determined by its elements. An easy way to describe or specify a finite set is
to list all its elements. For example,
A = {11, 12, 21, 22}
When we enumerate a set this way, the order in which we write the elements is
irrelevant. The set A could just as well be written {11, 21, 22, 12}. Writing an
element more than once does not change the set: The sets {11, 21, 22, 11, 12, 21} and
{11, 21, 22, 12} are the same.
Even if a set is infinite, it may be possible to start listing the elements in a way
that makes it clear what they are. For example,
B = {3, 5, 7, 9, 11, ...}
describes the set of odd integers greater than or equal to 3. However, although this
way of describing a set is common, it is not always foolproof. Does {3, 5, 7, ...}
represent the same set, or does it represent the set of odd primes, or perhaps the set
of integers bigger than 1 whose names contain the letter “e”?
A precise way of describing a set without listing the elements explicitly is to give
a property that characterizes the elements. For example, we might write
B = {x |x is an odd integer greater than 1}
or
A = {x |x is a two-digit integer, each of whose digits is 1 or 2}
The notation “{x|” at the beginning of both formulas is usually read “the set of all x
such that.”
To say that x is an element of the set A, we write
x ∈ A
PART 1 Mathematical Notation and Techniques
Using this notation we might describe the set C = {3, 5, 7, 9, 11} by writing
“the set of numbers 3i + 7j, where i and j are nonnegative integers,” and a
concise way to write this is
C = {3i + 7j | i, j ∈ N}
For two sets A and B, we say that A is a subset of B, and write A ⊆ B, if every
element of A is an element of B. Because a set is determined by its elements, two
sets are equal if they have exactly the same elements, and this is the same as saying
that each is a subset of the other. When we want to prove that A = B, we will need
to show both statements: that A ⊆ B and that B ⊆ A.
The complement of a set A is the set A’ of everything that is not an element of
A. This makes sense only in the context of some “universal set” U containing all the
elements we are discussing.
A′ = {x ∈ U | x ∉ A}
Here the symbol ∉ means “is not an element of.” If U is the set of integers, for
example, then {1, 2}′ is the set of integers other than 1 or 2. The set {1, 2}′ would be
different if U were the set of all real numbers or some other universe.
Two other important operations involving sets are union and intersection. The
union of A and B (sometimes referred to as “A union B”) is the set
A ∪ B = {x | x ∈ A or x ∈ B}
and the intersection of A and B is the set
A ∩ B = {x | x ∈ A and x ∈ B}
We can define another useful set operation, set difference, in terms of intersections
and complements. The difference A — B is the set of everything in A but not in B.
In other words,
A − B = {x | x ∈ A and x ∉ B}
= {x | x ∈ A} ∩ {x | x ∉ B}
= A ∩ B′
To illustrate how identities of this type might be proved, let us give a proof of
(1.12), the second De Morgan law. Since (1.12) asserts that two sets are equal, we
will show that each of the two sets is a subset of the other.
To show (A ∩ B)′ ⊆ A′ ∪ B′, we must show that every element of (A ∩ B)′ is an element
of A′ ∪ B′. Let x be an arbitrary element of (A ∩ B)′. Then by definition of complement,
x ∉ A ∩ B. By definition of intersection, x is not an element of both A and B; therefore,
either x ∉ A or x ∉ B. Thus, x ∈ A′ or x ∈ B′, and so x ∈ A′ ∪ B′.
To show A′ ∪ B′ ⊆ (A ∩ B)′, let x be any element of A′ ∪ B′. Then x ∈ A′ or x ∈ B′.
Therefore, either x ∉ A or x ∉ B. Thus, x is not an element of both A and B, and so
x ∉ A ∩ B. Therefore, x ∈ (A ∩ B)′.
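The two inclusions proved above can also be spot-checked mechanically for particular finite sets. The following sketch is not part of the text; the universe U and the sample sets A and B are made up for illustration, and Python's built-in set type supplies the operations.

```python
# Spot-check of the second De Morgan law, (A ∩ B)' = A' ∪ B',
# inside a small, hypothetical universe U.

def complement(s, universe):
    """Everything in the universe that is not in s."""
    return universe - s

def de_morgan_holds(a, b, universe):
    """Check (A ∩ B)' == A' ∪ B' for the given sets."""
    left = complement(a & b, universe)
    right = complement(a, universe) | complement(b, universe)
    return left == right

U = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
```

A check like this does not replace the element-by-element proof, but it catches transcription errors in an identity quickly.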
In order to visualize a set that is formed from primitive sets by using the set
operations, it is often helpful to draw a Venn diagram. The idea is to draw a large region
representing the universe and within that to draw schematic diagrams of the primitive
sets, overlapping so as to show one region for each membership combination. (This
may be difficult to do when there are more than three primitive sets; just as in our
list of identities above, however, three is usually enough.) If we shade the primitive
regions differently, the set we are interested in can be identified by the appropriate
combination of shadings.
In the case of two sets A and B, the basic Venn diagram is shown in Figure 1.1a.
The four disjoint regions of the picture, corresponding to the sets (A ∪ B)′, A − B,
B − A, and A ∩ B, are unshaded, shaded one way only, shaded the other way only,
and shaded both ways, respectively. Figure 1.2 shows an unshaded Venn diagram for
three sets. For practice, you might want to label each of the eight subregions, using
a formula involving A, B, C, and appropriate set operations.
Venn diagrams can be used to confirm the truth of set identities. For example,
Figure 1.1b illustrates the set (A ∩ B)′, the left side of identity (1.12). It is obtained
Figure 1.1 |
(a) A basic Venn diagram; (b), (c) The second De Morgan identity.
Figure 1.2 |
A three-set Venn diagram.
by shading all the regions that are not shaded twice in Figure 1.1a. On the other hand,
Figure 1.1c shows the two sets A’ and B’, shaded in different ways. The union of
these two sets (the right side of (1.12)) corresponds to the region in Figure 1.1c that
is shaded at least once, and this is indeed the region shown in Figure 1.1b.
The symmetric difference A ⊕ B of the two sets A and B is defined by the formula
A ⊕ B = (A − B) ∪ (B − A)
and corresponds to the region in Figure 1.1a that is shaded exactly once. In Exercise
1.5, you are asked to use Venn diagrams to verify that the symmetric difference
operation satisfies the associative law:
A ⊕ (B ⊕ C) = (A ⊕ B) ⊕ C
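Exercise 1.5 asks for a Venn-diagram argument; for particular finite sets the associative law can also be checked by direct computation. A sketch, not from the text, with made-up sample sets:

```python
def sym_diff(a, b):
    """A ⊕ B = (A − B) ∪ (B − A); Python sets also write this as a ^ b."""
    return (a - b) | (b - a)

def associative_on(sets):
    """Check A ⊕ (B ⊕ C) == (A ⊕ B) ⊕ C for every triple drawn from `sets`."""
    return all(sym_diff(a, sym_diff(b, c)) == sym_diff(sym_diff(a, b), c)
               for a in sets for b in sets for c in sets)

SAMPLES = [set(), {1}, {1, 2}, {2, 3, 4}, {1, 3, 5}]
```

Checking a handful of triples is evidence, not proof; the general argument is the subject of the exercise.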
When Venn diagrams are used properly they can provide the basis for arguments
that are both simple and convincing. (In Exercise 1.48 you are asked to show the
associative property of symmetric difference without using Venn diagrams, and you
will probably decide that the diagrams simplify the argument considerably.) Never-
theless, a proof based on pictures may be misleading, because the pictures may not
show all the relevant properties of the sets involved. Because of the limitations of
Venn diagrams in reasoning about sets, it is also important to be able to work with
the identities directly. As an illustration of how these identities can be applied, let us
simplify the expression A U (B — A):
A ∪ (B − A) = A ∪ (B ∩ A′)
= (A ∪ B) ∩ (A ∪ A′)
= (A ∪ B) ∩ U
= A ∪ B
Unions and intersections of three or more sets work in the natural way; for example,
A ∪ B ∪ C = {x | x ∈ A or x ∈ B or x ∈ C}
= {x | x is an element of at least one of the sets A, B, and C}
More generally, we use the notation
⋃_{i=1}^n A_i
to mean the set {x | x ∈ A_i for at least one i with 1 ≤ i ≤ n}; and
⋃_{P(i)} A_i
means the set {x | x ∈ A_i for at least one i satisfying P(i)}. In Chapter 4 we will
encounter a set with the slightly intimidating formula
encounter a set with the slightly intimidating formula
⋃_{p ∈ δ*(q,x)} δ(p, a)
We do not need to know what the sets δ*(q, x) and δ(p, a) are in order to understand
that the formula stands for the union of all the sets δ(p, a), one for each element p
of δ*(q, x). If δ*(q, x) were {r, s, t}, for example, this formula would give us
δ(r, a) ∪ δ(s, a) ∪ δ(t, a).
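Read operationally, a generalized union of this kind is just the two-set union folded over the index set. The sketch below is not from the text; the dictionary standing in for δ is hypothetical (the actual δ*(q, x) and δ(p, a) are defined in Chapter 4).

```python
# ⋃_{p ∈ S} family[p]: the union of the sets family[p], one for each p in S.

def big_union(index_set, family):
    """Union of family[p] over all p in index_set; empty index set gives ∅."""
    result = set()
    for p in index_set:
        result |= family[p]
    return result

# If δ*(q, x) were {r, s, t}, the formula would give δ(r,a) ∪ δ(s,a) ∪ δ(t,a).
delta = {'r': {1, 2}, 's': {2, 3}, 't': {4}}
```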
The elements of a set can be sets themselves. For any set A, the set of all subsets
of A is referred to as the power set of A, which we shall write 2^A. The reason for this
terminology and this notation is that if A has n elements, then 2^A has 2^n elements.
To illustrate, suppose A = {1, 2, 3}. Then
2^A = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}
Notice that ∅ and A are both elements: The empty set is a subset of every set, and
every set is a subset of itself.
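The power-set construction is easy to mirror in code, and doing so confirms the 2^n count. A sketch, not from the text, using Python's itertools:

```python
from itertools import chain, combinations

def power_set(a):
    """All subsets of a, as a set of frozensets (the set 2^A)."""
    elems = list(a)
    # Subsets of each size r = 0, 1, ..., n; frozenset makes them hashable
    # so they can themselves be elements of a set.
    return {frozenset(c)
            for c in chain.from_iterable(combinations(elems, r)
                                         for r in range(len(elems) + 1))}
```

Frozensets are needed because ordinary Python sets are mutable and cannot be elements of another set; this mirrors the point that the elements of 2^A are themselves sets.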
Here is one more important way to construct a set from given sets. For any two
sets A and B, we may consider a new set called the Cartesian product of A and B,
written A × B and read “A cross B.” It is the set of all ordered pairs (a, b), where
a ∈ A and b ∈ B.
A × B = {(a, b) | a ∈ A and b ∈ B}
The word ordered means that the pair (a, b) is different from the pair (b, a), unless
a and b happen to be the same. If A has n elements and B has m elements, then the
set A × B has exactly nm elements. For example,
{a, b} × {b, c, d} = {(a, b), (a, c), (a, d), (b, b), (b, c), (b, d)}
More generally, the set of all “ordered n-tuples” (a₁, a₂, ..., aₙ), where aᵢ ∈ Aᵢ for
each i, is denoted by A₁ × A₂ × ⋯ × Aₙ.
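Both A × B and the n-fold product can be generated directly; a sketch, not from the text, with made-up sample sets:

```python
from itertools import product

A = {'a', 'b'}
B = {'b', 'c', 'd'}

# A × B: all ordered pairs (x, y) with x in A and y in B; |A × B| = nm.
pairs = set(product(A, B))

# Ordered n-tuples: A1 × A2 × ... × An via product with several arguments.
triples = set(product({0, 1}, {0, 1}, {0, 1}))
```

Note that tuples record order: ('a', 'b') and ('b', 'a') are distinct, and the second is in A × B only if 'b' ∈ A and 'a' ∈ B.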
1.2 | Logic
x² < 4.
a² + b² = 3.
He has never held public office.
These look like precise, objective statements, but as they stand they cannot be
said to have truth values. Essential information is missing: What are the values
of x, a, and b, and who is “he”? Each of these propositions involves one or more
free variables; for the proposition to have a truth value, each free variable must be
assigned a specific value from an appropriate domain, or universe. In the first case,
an appropriate domain for the free variable x is a set of numbers. If the domain is
chosen to be the set of natural numbers, for example, the values of x that make the
proposition true are those less than 2, and every other value of x will make it false. If
the domain is the set R of all real numbers, the proposition will be true for all choices
of x that are both greater than —2 and less than 2. The free variables in the second
statement apparently also represent numbers. For the domain of natural numbers
there are no values for a and b that would make the proposition true; for the domain
of real numbers, a = 1 and b = √2 is one of many choices that would work. An
appropriate domain for the third statement would be a set of people.
Just as we can combine numerical values by using algebraic operations like ad-
dition and subtraction, we can combine truth values, using logical connectives. A
compound proposition is one formed from simpler propositions using a logical con-
nective. When we add two numerical expressions, all we need to know in order to
determine the numerical value of the answer is the numerical value of the two expres-
sions. Similarly, when we combine logical expressions using a logical connective,
all we need to know (in order to determine the truth value of the result) is the truth
values of the original expressions. In other words, we can define a specific logical
connective by saying, for each combination of truth values of the expressions being
combined, what the truth value of the resulting expression is.
The three most common logical connectives are conjunction, disjunction, and
negation, the symbols for which are ∧, ∨, and ¬, respectively. These correspond to
the English words and, or, and not. The conjunction p ∧ q of p and q is read “p and
q,” and the disjunction p ∨ q is read “p or q.” The negation ¬p of p is read “not p.”
These three connectives can be defined by the truth tables below.
The entries in the truth tables are probably the ones you would expect, on the
basis of the familiar meanings of the English words and, or, and not. For example,
CHAPTER 1 Basic Mathematical Objects 11
the proposition p ∧ q is true if p and q are both true and false otherwise. The
proposition p ∨ q is true if p is true, or if q is true, or if both p and q are true.
(In everyday conversation “or” sometimes means exclusive or. Someone who says
“Give me liberty or give me death!” probably expects only one of the two. However,
exclusive or is a different logical connective, not the one we have defined.) Finally,
the negation of a proposition is true precisely when the proposition is false.
Another important logical connective is the conditional. The proposition p → q
is commonly read “if p, then q.” Although this connective also comes up in everyday
speech, it may not be obvious how it should be defined precisely. Consider someone
giving directions, who says “If you cross a railroad track, you’ve gone too far.” This
statement should be true if you cross a railroad track and you have in fact gone too far.
If you cross a railroad track and you have not gone too far, the statement should be
false. The less obvious cases are the ones in which you do not cross a railroad track.
Normally we do not consider the statement to be false in these cases; we assume that
the speaker chooses not to commit himself, and we still give him credit for being a
truthful person. Following this example, we define the truth table for the conditional
as follows.
If you are not convinced by the example that the last two entries of the truth table
are the most reasonable choices, consider the proposition
(x < 1) → (x < 2)
where the domain associated with the free variable x is the set of natural numbers.
You would probably agree that this proposition ought to be true, no matter what value
is substituted for x. However, by choosing x appropriately we can obtain all three
of the cases in which the truth-table value is true. If x is chosen to be 0, then both
x < 1 and x < 2 are true; if x = 1, the first is false and the second is true; and if
x > 1, both are false. Therefore, if this compound proposition is to be true for every
x, the truth table must be the one shown. The conditional proposition is taken to be
true except in the single case where it must certainly be false.
One slightly confusing aspect of the conditional proposition is that when it is
expressed in words, the word order is sometimes inverted. The statements “if p then
q” and “q if p” mean the same thing: The crucial point is that the “if” comes just
before p in both cases. Perhaps even more confusing, however, is another common
way of expressing the conditional. The proposition p → q is often read “p only if
q.” It is important to understand that “if” and “only if” have different meanings. The
two statements “p if q” (q → p) and “p only if q” (p → q) are both conditional
12 PART 1 Mathematical Notation and Techniques
statements, but with the order of p and q reversed. Each of these two statements is
said to be the converse of the other.
The proposition (p → q) ∧ (q → p) is abbreviated p ↔ q, and the connective
↔ is called the biconditional. According to the previous paragraph, p ↔ q might
be read “p only if q, and p if q.” It is usually shortened to “p if and only if q.” (It
might seem more accurate to say “only if and if”; however, we will see shortly that
it doesn’t matter.) Another common way to read p ↔ q is to say “if p then q, and
conversely.”
With a compound proposition, composed of propositional variables (like p and
q) and logical connectives, we can determine the truth value of the entire proposition
from the truth values of the propositional variables. This is a routine calculation based
on the truth tables of the connectives. We can take care of all the cases at once by
constructing the entire truth table for the compound proposition, considering one at a
time the connectives from which it is built. We illustrate the way this might be done
for the proposition (p ∨ q) ∧ ¬(p → q).
The last column of the table, which is the desired result, is obtained by combining
the third and fifth columns using the ∧ operation. Another way of carrying out the
same calculation is to include a column of the table for each operation in the expression
and to fill in the columns in the order in which the operations might be carried out.
The table below illustrates this approach.
The first two columns to be computed are those corresponding to the subexpressions
p ∨ q and p → q. Column 3 is obtained by negating column 2, and the final
result in column 4 is obtained by combining columns 1 and 3 using the ∧ operation.
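The column-by-column procedure just described can be automated: enumerate every assignment of truth values to the variables, then evaluate the expression under each. A sketch, not from the text, applied to the same example (p ∨ q) ∧ ¬(p → q):

```python
from itertools import product

def truth_table(expr, variables):
    """Rows (value of each variable..., result) over every T/F assignment."""
    rows = []
    for values in product([True, False], repeat=len(variables)):
        env = dict(zip(variables, values))
        rows.append(values + (expr(env),))
    return rows

def implies(p, q):
    # p → q is false only in the single case p true, q false.
    return (not p) or q

# The worked example from the text: (p ∨ q) ∧ ¬(p → q).
table = truth_table(lambda e: (e['p'] or e['q']) and not implies(e['p'], e['q']),
                    ['p', 'q'])
```

Running this reproduces the final column of the text's table: the proposition is true only in the row where p is true and q is false.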
A compound proposition is called a tautology if it is true in every case (that is,
for every combination of truth values of the simpler propositions from which it is
built).
natural numbers. If we now modify this proposition by attaching the phrase “there
exists an x such that” at the beginning, then the proposition has been changed from
a statement about x to a statement about the domain. The status of the variable x
changes as well: It is no longer a free variable. If we tried to substitute a specific
value from the universe for all occurrences of x in the proposition, we would obtain
a statement such as “there exists a 3 such that 3² < 4,” which is nonsense. We say
that the statement “there exists an x such that x² < 4” is a quantified statement. The
phrase “there exists” is called the existential quantifier. The variable x is said to be
bound to the quantifier and is referred to as a bound variable. We write this statement
in the compact form
∃x(x² < 4)
The other quantifier is the universal quantifier, written ∀. The formula ∀x(x² < 4)
stands for the statement “for every x, x² < 4.” If you are not familiar with this
notation, the way to remember it is that ∀ is an upside-down A, for “all,” and ∃ is a
backwards E, for “exists.”
If P(x) is any proposition involving the free variable x over some domain U, then
by definition, the quantified statement ∃x(P(x)) is true if there is at least one value of
x in U that makes the formula P(x) true, and false otherwise. Similarly, ∀x(P(x)) is
true precisely when P(x) is true no matter what element of U is substituted for x. If
P(x) is the formula x² < 4, then ∃x(P(x)) is true if the domain is the set of natural
numbers, because 0² < 4. It is false for the domain of positive even integers. The
quantified statement ∀x(P(x)) is false for both of these domains.
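Over a finite domain, ∃x(P(x)) and ∀x(P(x)) correspond exactly to Python's any() and all(). In the sketch below, not from the text, the infinite domains are replaced by finite initial segments; that is enough to witness the cases just discussed, although a finite sample cannot by itself settle a universal claim about an infinite domain.

```python
def P(x):
    """The formula x² < 4 from the text."""
    return x ** 2 < 4

naturals = range(0, 100)        # finite stand-in for the natural numbers
pos_evens = range(2, 100, 2)    # finite stand-in for the positive even integers

exists_nat = any(P(x) for x in naturals)     # true, witnessed by 0² < 4
exists_even = any(P(x) for x in pos_evens)   # false on this sample: 2² = 4 already
forall_nat = all(P(x) for x in naturals)     # false, e.g. 2² = 4 is not < 4
```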
The notation for quantified propositions occasionally varies. For example, the
statement ∃x(x² < 4) is sometimes written ∃x : x² < 4. We have chosen to use the
parentheses in order to clarify the scope of the quantifier. In our example ∃x(x² < 4),
the scope of the quantifier is the statement x² < 4. If the quantified statement appears
within a larger formula, then any x outside this scope means something different. If
you have studied a block-structured programming language such as Pascal or C, you
may be reminded of the scope of a variable in a block of the program, and in fact
the situations are very similar. If a block in a C program contains a declaration of
an identifier A, then the scope of that declaration is limited to that block and its
subblocks. To refer to A outside the block is either an error or a reference to some
other identifier declared outside the block.
Paying attention to the scope of quantifiers is particularly important in proposi-
tions with more than one quantifier. Consider, for example, the two propositions
as a statement about the specific value x. In other words, we can interpret “∃y” as
“there exists a number y, which may depend on x.” In this case the proposition is
true, since y could be chosen to be x, for example. In the second case, the existential
quantifier is outside the scope of “∀x,” which means that for the statement to be true
there would have to be a single number y that satisfies the inequality no matter what
x is. This is not the case, because the inequality fails if x = y + 2.
Although we will not be using the ∃ and ∀ notation very often after this chapter,
there are times when it is useful. One advantage of writing a quantified statement
using this notation is that it forces you to specify, and therefore to understand, the
scope of each quantifier.
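The effect of quantifier order can be made concrete over a small finite domain, again using any() and all(). A sketch, not from the text; the predicate y ≠ x is chosen for illustration only:

```python
# Order of quantifiers matters. Over the finite domain D, compare
#   ∀x ∃y (y ≠ x): for each x we may pick a different y, so this holds;
#   ∃y ∀x (y ≠ x): one fixed y must differ from every x, impossible since y ∈ D.

D = range(5)

forall_exists = all(any(y != x for y in D) for x in D)
exists_forall = any(all(y != x for x in D) for y in D)
```

The nesting of the generator expressions mirrors the scopes: the inner quantifier may depend on the outer variable, never the other way around.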
∃x(x ∈ A ∧ P(x))
In the second case, the form of the statement is to be ∀x(Q(x)). If we tried the same choice
for Q(x), we would be saying not only that every x satisfies P(x), but also that every x is an
element of A. This is not what we want. The condition that x be an element of A is supposed
to make the statement weaker, not stronger—we do not want to say that every x satisfies P(x),
only that every x in A does. To say it another way, every x satisfies P(x) if it is also an element
of A. A conditional statement is a reasonable choice for Q(x), and our statement becomes
∀x(x ∈ A → P(x))
Sometimes the quantifier notation is relaxed even further, in order to write propositions like
∀x > 0 (P(x))
the same as saying that every divisor of p is either p or 1. Adapting the second part of the
previous example, we can restate this as “for every k, if k is a divisor of p, then k is either p
or 1.” Putting all these pieces together, we obtain for “p is prime” the proposition
∀x(∃y(∀z(P(x, y, z))))
We negate s as follows.
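Negating a statement of the shape s = ∀x(∃y(∀z(P(x, y, z)))) is governed by the standard rule: moving ¬ inward past a quantifier turns ∀ into ∃ and vice versa. A sketch of the steps, derived from that general rule rather than copied from the text:

```latex
\begin{align*}
\neg s &= \neg\,\forall x\,(\exists y\,(\forall z\,(P(x,y,z)))) \\
       &= \exists x\,\neg\,(\exists y\,(\forall z\,(P(x,y,z)))) \\
       &= \exists x\,(\forall y\,\neg\,(\forall z\,(P(x,y,z)))) \\
       &= \exists x\,(\forall y\,(\exists z\,(\neg P(x,y,z))))
\end{align*}
```

The quantifiers flip one at a time as the negation passes through, and the final negation applies only to the inner formula P(x, y, z).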
1.3 | Functions
Functions, along with sets, are among the most basic objects in mathematics. The
ones you are most familiar with are probably those that involve real numbers, with
formulas like x² and log x. The first of these formulas specifies for a real number
x another real number x². The second assigns a real number log x to a positive real
number x (log x makes sense only if x > 0). In general, a function assigns to each
element of one set a single element of another set. The first set is called the domain
of the function, the second set is the codomain, and if the function is f, the element
of the codomain that is assigned to an element x of the domain is denoted f(x).
Natural choices for the domains of the functions with formulas x² and log x
would probably be ℝ (the set of real numbers) and {x ∈ ℝ | x > 0}, respectively;
however, when we define a function f, we are free to choose the domain to be any
set we want, as long as the rule or formula we have in mind gives us a meaningful
value f(x) for every x in the set. The codomain of a function f is specified in order
to make it clear what kind of object f(x) will be; however, it can also be chosen
arbitrarily, provided that f(x) belongs to the codomain for every x in the domain.
(This apparent arbitrariness can sometimes be confusing, and we discuss it a little
more on page 19.)
18 PART 1 Mathematical Notation and Techniques
We write
f : A → B
In the last example, we are making the assumption that no two humans are exactly
the same height, because otherwise there are sets x for which the phrase “the tallest
person in x” does not actually specify a function value.
To show that this function is onto, we must be able to find for each human being y
at least one nonempty set S of humans so that y is the tallest member of S. This is
easy: The set {y} is such a set.
Again assuming that f : A → B, we say f is one-to-one, or injective, or an
injection, if no single element y of B can be f(x) for more than one x in A. In
other words, f is one-to-one if, whenever f(x₁) = f(x₂), then x₁ = x₂. To say
it yet another way, f is one-to-one if, whenever x₁ and x₂ are different elements of
A, then f(x₁) ≠ f(x₂). A bijection is a function that is both one-to-one and onto.
If f : A → B is a bijection, we sometimes say that A and B are in one-to-one
correspondence, or that there is a one-to-one correspondence between the two sets.
Of our four examples, only f₁ is one-to-one. The other three are not, because
there are two people having the same mother, there are two people with the same
number of siblings, and there are many nonempty sets of people with the same tallest
element.
The terminology “one-to-one,” although standard, is potentially confusing. It
does not mean that to one element of A there is assigned (only) one element of B.
This property goes without saying; it is part of what f : A → B means. Nor does it
mean that for one y € B there is one x € A with f(x) = y. If f is not onto, then
there is a y for which there is no such x. “One-to-one” means the opposite of “many-
to-one.” One element of B can be associated with only (at most) one element of A.
To a large extent, whether f is one-to-one or onto depends on how we have
specified the domain and codomain. Consider these examples. Here ℝ denotes the
set of all real numbers and ℝ⁺ the set of all nonnegative real numbers.
1. f : ℝ → ℝ, defined by f(x) = x², is neither one-to-one nor onto. It is not
one-to-one, because f(−1) = f(1). It is not onto, since −1 cannot be f(x) for
any x.
2. f : ℝ → ℝ⁺, defined by f(x) = x², is onto but not one-to-one.
3. f : ℝ⁺ → ℝ, defined by f(x) = x², is one-to-one but not onto.
4. f : ℝ⁺ → ℝ⁺, defined by f(x) = x², is both one-to-one and onto.
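These four cases can be checked mechanically on finite sets. The sketch below is illustrative and not from the text: a program cannot enumerate ℝ, so the sets dom_R, dom_Rp, cod_R, and cod_Rp are small integer samples standing in for ℝ and ℝ⁺.

```python
def is_one_to_one(f, domain):
    # f is one-to-one iff no value is hit by two different arguments
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)

def is_onto(f, domain, codomain):
    # f is onto iff every element of the codomain is some f(x)
    return {f(x) for x in domain} == set(codomain)

square = lambda x: x * x
dom_R, dom_Rp = {-2, -1, 0, 1, 2}, {0, 1, 2}   # stand-ins for R and R+
cod_R, cod_Rp = {-4, -1, 0, 1, 4}, {0, 1, 4}   # matching codomain stand-ins

# 1. "R -> R":   neither one-to-one nor onto
print(is_one_to_one(square, dom_R), is_onto(square, dom_R, cod_R))    # False False
# 2. "R -> R+":  onto but not one-to-one
print(is_one_to_one(square, dom_R), is_onto(square, dom_R, cod_Rp))   # False True
# 3. "R+ -> R":  one-to-one but not onto
print(is_one_to_one(square, dom_Rp), is_onto(square, dom_Rp, cod_R))  # True False
# 4. "R+ -> R+": both
print(is_one_to_one(square, dom_Rp), is_onto(square, dom_Rp, cod_Rp)) # True True
```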
In less formal discussions, you often come across phrases like “the function
f(x) = x².” Although such a phrase may be sufficient, depending on the context,
these four examples should make it clear that in order to specify a function f com-
pletely and discuss its properties, one must provide not only a rule by which the value
f (x) can be obtained from x, but also the domain and codomain of f.
On the other hand, we have already mentioned that the choice of domain and
codomain can be somewhat arbitrary. These examples seem to confirm that, and we
might want to consider a little more carefully whether this arbitrariness serves any
purpose. When people say “the function f(x) = x²” or “the function f(x) = log x,”
they might be assumed to be using the convention that the domain is the largest set for
which the formula makes sense (for x² the set ℝ and for log x the set {x ∈ ℝ | x > 0},
assuming in both cases that only real numbers are involved). In the case of the
codomain, it is not clear why we would choose ℝ instead of ℝ⁺ as the codomain
of the function with formula f(x) = x², since it is true that x² ≥ 0 for every real
number x. In fact, it might seem that a good choice of codomain for any function f
with domain A is simply the range f(A), the set of all values obtained by applying
the function to elements of the domain. (As we have already noticed, the range must
always be a subset of the codomain, and if it is chosen as the codomain then f is
automatically onto.)
There are, however, valid reasons for allowing ourselves to specify the domain
and codomain of a function as we wish. It might be appropriate because of the
circumstances to limit the domain of the function: If f(x) represents the weight of
an object x centimeters long, there is no reason to consider x < 0, even if the formula
for f(x) makes sense for these values. People do not always specify the codomain
of a function f with domain A to be the set f(A), because it may be difficult to
say exactly what set this is. In examples involving real numbers, for example, it is
tempting to write f : ℝ → ℝ at the outset—assuming that f(x) is a real number for
every real number x—and worry about exactly what subset of ℝ the range is only if it
is necessary. It may also be that the focus of the discussion is the two sets themselves,
rather than the function. We might, for example, want to ask whether two specific
sets A and B can be placed in one-to-one correspondence (in other words, whether
there is a function with domain A and codomain B that is a bijection). In any case,
even though it might occasionally seem unnecessary, we will try to specify both the
domain and the codomain of any function we discuss.
It follows by combining these two observations that if f and g are both bijections,
then the composition g o f is also a bijection.
Suppose that f : A → B is a bijection (one-to-one and onto). Then for any
y ∈ B, there is at least one x ∈ A with f(x) = y, since f is onto; and for any y ∈ B,
there is at most one x ∈ A with f(x) = y, since f is one-to-one. Therefore, for any
y ∈ B, it makes sense to speak of the element x ∈ A for which f(x) = y, and we
denote this x by f⁻¹(y). We now have a function f⁻¹ from B to A: f⁻¹(y) is the
element x ∈ A for which f(x) = y.
Note that we obtain the formula f(f⁻¹(y)) = y immediately. For any x ∈ A,
f⁻¹(f(x)) is defined to be the element z ∈ A for which f(z) = f(x). Since x is
also such an element, and since there can be only one, z = x. Thus we also have the
formula f⁻¹(f(x)) = x. These two formulas summarize, and can be taken as the
defining property of, the inverse function f⁻¹ : B → A:
f(f⁻¹(y)) = y for every y ∈ B        f⁻¹(f(x)) = x for every x ∈ A
f⁻¹(S) = {x ∈ A | f(x) ∈ S}
so that f⁻¹ associates a subset of A to each subset of B. The set f⁻¹(B) is simply
the domain A. Note also that if f happens to be a bijection, so that f⁻¹ is a function
from B to A, then the set f⁻¹(S) can be obtained as in the beginning of Section 1.3.1:
3x − xy
to be a function with domain ℝ × ℝ and codomain ℝ, since the formula makes sense,
and yields a real number, for any ordered pair (x, y) of real numbers.
We have seen several examples of this already in Section 1.1. If U is a set, we
may form the union of any two subsets A and B of U. The function f given by the
formula
f(A, B)=AUB
may be viewed as a function with domain 2^U × 2^U (the set of ordered pairs of subsets
of U) and codomain 2^U.
For an even more familiar example, consider the operation of addition on the set
Z of integers. For any two integers x and y, x + y is an integer. We may therefore
view addition as being a function from Z x Z to Z.
Union is a binary operation on the set 2^U of subsets of U. Addition is a binary
operation on the set of integers. In general, a binary operation on a set S is a function
from S × S to S; in other words, it is a function that takes two elements of S and
produces an element of S. Other binary operations on 2^U are those of intersection
and set difference. Other binary arithmetic operations on ℤ are multiplication and
subtraction. In most of the familiar situations where a binary operation • on a set is
involved, it is common to use the “infix” notation x • y rather than the usual functional
notation •(x, y). For example, we write A ∪ B instead of ∪(A, B), and x + y instead
of +(x, y).
A unary operation on S is simply a function from S to S. For example, the
complement operation is a unary operation on 2^U, and negation is a unary operation on ℤ.
If • is an arbitrary binary operation on a set S, and T is a subset of S, we say that
T is closed under the operation • if T • T ⊆ T. In other words, T is closed under • if
the result of applying the operation to two elements of T is, in fact, an element of T.
Similarly, if u is a unary operation on S, and T ⊆ S, T is closed under u if u(T) ⊆ T.
For example, the set ℕ is closed under the operation of addition (the sum of two
natural numbers is a natural number), but not under the operation of subtraction (the
difference of two natural numbers is not always a natural number). The set of finite
subsets of ℝ is closed under all the operations union, intersection, and set difference,
since if A and B are finite sets of real numbers, all three of the sets A ∪ B, A ∩ B,
and A − B are finite sets of real numbers. The set of finite subsets of ℝ is not closed
under the complement operation, since the complement of a finite set of real numbers
is not finite. The significance of a subset T of S being closed under • is that we can
then think of • as an operation on T itself; that is, if • is a binary operation, there is
a function from T × T to T whose value at each pair (x, y) ∈ T × T is the same
as that of •. (It is tempting to say that the function is •, except that we identify two
functions only if they have the same domains and codomains.)
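Closure can be spot-checked by applying an operation to sample pairs and testing membership. The sketch below is illustrative only: a finite sample of pairs can refute closure by producing a counterexample, but it cannot prove closure for an infinite set like ℕ.

```python
def closed_under(op, contains, samples):
    """Spot-check: does op applied to sample pairs stay inside the set?"""
    return all(contains(op(x, y)) for x in samples for y in samples)

is_nat = lambda n: isinstance(n, int) and n >= 0   # membership test for N
samples = range(5)

# N appears closed under addition (evidence only, not a proof)
print(closed_under(lambda x, y: x + y, is_nat, samples))   # True
# N is NOT closed under subtraction: e.g. 0 - 1 = -1 is not in N
print(closed_under(lambda x, y: x - y, is_nat, samples))   # False
```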
Note that a few paragraphs back, division was not included among the binary
arithmetic operations on Z. There are two reasons for this. First, not every pair
(x, y) of integers is included in the domain of the (real) division operation, since
division by 0 is not defined. Second, even if y ≠ 0, the quotient x/y may not be an
integer; that is, the set of nonzero integers, thought of as a subset of ℝ, is not closed
under the operation of division. One way around the second problem would be to
use a different operation, such as integer division (in which the value is the integer
quotient and the remainder is ignored). This leaves the problem of division by zero.
One approach would be to say that integer division is a binary operation on the set of
nonzero integers. Another approach, in which 0/x would still make sense whenever
x ≠ 0, would be to say that for any fixed nonzero x, the set of integers is closed under
the unary operation of integer division by x.
Most of the time, we will be interested in unary or binary operations on a set.
However, there are times when it will be useful to consider more general types. An
n-ary operation on a set S is a function from the n-fold Cartesian product S × S × ⋯ × S
to S. As we saw in the case of union and intersection, when we start with an associative
CHAPTER 1 Basic Mathematical Objects 23
binary operation •, one for which x • (y • z) = (x • y) • z, there is a natural way to
obtain for each n > 2 an n-ary operation. This is what we are doing, for example,
when we consider the union of n sets instead of two.
1.4 Relations
A mathematical relation is a way of making more precise the intuitive idea of a
relationship between objects. Since a function will turn out to be a special type of
relation, we can start by giving a more precise definition of a function than the one
in Section 1.3, and then generalize it.
In calculus, when you draw the graph of a function f : R — R, you are
specifying a set of points, or ordered pairs: all the ordered pairs (x, y) for which
y = f(x). We might actually identify the function with its graph and say that the
function is this set of pairs. Saying that a function is a set of ordered pairs makes
it unnecessary to say that it is a “rule,” or a “way of assigning,” or any other such
phrase. In general, we may define a function f : A → B to be a subset f of A × B
so that for each a ∈ A, there is exactly one element b ∈ B for which (a, b) ∈ f. For
each a ∈ A, this element b is what we usually write as f(a).
A function from A to B is a restricted type of correspondence between elements
of the set A and elements of the set B, restricted in that to every a ∈ A there must
correspond one and only one b ∈ B. A bijection from A to B is even more restricted;
in the ordered-pair definition, for any a ∈ A there must be one and only one b ∈ B so
that the pair (a, b) belongs to the function, and for any b ∈ B there is one and only one
a ∈ A so that (a, b) belongs to the function. If we relax both these restrictions, then
an element of A can correspond to several elements of B, or possibly to none, and an
element of B can correspond to several elements of A, or possibly none. Although
such a correspondence is no longer necessarily a function, either from A to B or from
B to A, it still makes sense to describe the correspondence by specifying a subset of
A × B. This is how we can define a relation from A to B: It is simply a subset of A × B.
For an element a ∈ A, a corresponds to, or is related to, an element b ∈ B if the pair
(a, b) is in the subset. We will be interested primarily in the special case where A
and B are the same set, and in that case we refer to the relation as a relation on A.
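Since a relation from A to B is just a subset of A × B, it can be stored as a set of pairs, and the extra condition that makes it a function can be tested directly. An illustrative sketch (the sets A, B and the relations f and r below are made-up examples, not from the text):

```python
def is_function(R, A, B):
    """A relation R, a subset of A x B, is a function from A to B iff
    each a in A occurs as the first coordinate of exactly one pair."""
    return (all(b in B for (_, b) in R) and
            all(len([b for (x, b) in R if x == a]) == 1 for a in A))

A, B = {1, 2, 3}, {"x", "y"}
f = {(1, "x"), (2, "x"), (3, "y")}   # every element of A used exactly once
r = {(1, "x"), (1, "y"), (2, "x")}   # 1 relates to two elements, 3 to none

print(is_function(f, A, B))   # True
print(is_function(r, A, B))   # False: r is a relation but not a function
```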
You are already familiar with many examples of relations on sets, even if you
have not seen this formal definition before. When you write a = b, where a and b
are elements of some set A, you are using the relation of equality. If we think of = as
a subset of A × A, then we can write (a, b) ∈ = instead of a = b. Of course, we are
more accustomed to the notation a = b. For this reason, in the case of an arbitrary
relation R on a set A, we often write aRb instead of (a, b) ∈ R. Both these notations
mean that a is related to b—or, if there is some doubt as to which relation is intended,
related to b via R.
The subset = of ℕ × ℕ, for example, is the set {(0, 0), (1, 1), (2, 2), …}, containing
all pairs (i, i). The relation on ℕ specified by the subset
{(0, 1),
(0, 2), (1, 2),
(0, 3), (1, 3), (2, 3),
…}
of ℕ × ℕ is the relation <. Other familiar relations on ℕ include ≤, >, ≥, and ≠.
One that may not be quite as familiar is the “congruence mod n” relation. If n is a
fixed positive integer, we say a is congruent to b mod n, written a ≡ₙ b, if a − b is an
integer multiple of n. In the interest of precision we write this symbolically: a ≡ₙ b
means ∃k(a − b = k * n). Note that the domain of the quantifier is the set of integers:
The integer k in this formula can be negative or 0. To illustrate, let n = 3. The subset
≡₃ of ℕ × ℕ contains the ordered pairs (0, 0), (1, 1), (1, 4), (4, 1), (7, 10), (3, 6),
(8, 14), (16, 4), and every other pair (a, b) for which a − b is divisible by 3.
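The condition ∃k(a − b = k * n) amounts to checking that a − b leaves remainder 0 on division by n. An illustrative sketch (not from the text):

```python
def congruent_mod(n):
    """Return a test for the relation a congruent to b mod n:
    a - b must be an integer multiple of n."""
    return lambda a, b: (a - b) % n == 0

cong3 = congruent_mod(3)
print(cong3(1, 4), cong3(7, 10), cong3(8, 14))   # True True True
print(cong3(2, 4))                               # False: 2 - 4 is not a multiple of 3
```

Note that Python's `%` operator returns a nonnegative remainder even when a − b is negative, so the pairs (1, 4) and (4, 1) both test as related.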
So far, all our examples of relations are well-known ones that have commonly
accepted names, such as = and <. These are not the only ones, however. We are
free to invent a relation on a set, either by specifying a condition that describes what
it means for two elements to be related, or simply by listing the ordered pairs we
want to be included. Consider the set A = {1, 2, 3, 4}. We might be interested in the
relation R₁ on A defined as follows: for any a and b in A, aR₁b if and only if |a − b|
is prime. The ordered pairs in R₁ are (1, 3), (3, 1), (2, 4), (4, 2), (1, 4), and (4, 1).
We might also wish to consider the relation R₂ = {(1, 1), (1, 4), (3, 4), (4, 2)}. In
this relation, 1 is related to both itself and 4, 3 is related to 4, 4 is related to 2, and
there are no other relationships. Even if there is no simple way to say what it means
for a to be related to b, other than to say that (a, b) is one of these four pairs, R₂ is a
perfectly acceptable relation on A.
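A relation defined by a condition, such as R₁, can be generated by filtering A × A. An illustrative sketch (the trial-division primality test below is an assumption adequate only for small numbers):

```python
def is_prime(n):
    # naive trial division; fine for the tiny numbers in this example
    return n >= 2 and all(n % d for d in range(2, n))

A = {1, 2, 3, 4}
# R1 = all pairs (a, b) in A x A with |a - b| prime
R1 = {(a, b) for a in A for b in A if is_prime(abs(a - b))}
print(sorted(R1))   # [(1, 3), (1, 4), (2, 4), (3, 1), (4, 1), (4, 2)]
```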
It is useful to note one difference between the reflexive property and the other
two properties in Definition 1.2. In order for the relation R to be reflexive, every
element of A must be related to itself. In particular, there are a number of ordered
pairs that must be in the relation: those of the form (a, a). Another way to say this
is that R must contain as a subset the relation of equality on A. In the case of the
other two properties, the definition says only that if certain elements are related, then
certain others are related. If R is symmetric, for example, two elements a and b need
not be related; however, if one of the two pairs (a, b) and (b, a) is in the relation,
then the other is also.
The definitions of reflexive, symmetric, and transitive all start out “For every... .”
The negation of the quantified statement “For every. ..” is a quantified statement of
the form “There exists....” This means that to show a relation R on a set is not
reflexive, it is sufficient to show that there is at least one pair (x, x) not in R. To show
R is not symmetric, all we have to do is find one pair (a, b) so that aRb and not bRa.
And to show R is not transitive, we just need to find one choice of a, b, and c in the
set so that aRb and bRc but not aRc. In the last case, a, b, and c do not need to be all
different. For example, if there are elements a and b of the set A so that aRb, bRa,
and not aRa, then R is not transitive.
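For a finite relation stored as a set of pairs, the three properties can be tested by direct search, exactly as the counterexample criteria above suggest. A sketch (illustrative; it reuses the four-pair relation R₂ on {1, 2, 3, 4} from the earlier example):

```python
def reflexive(R, A):
    return all((a, a) in R for a in A)

def symmetric(R):
    return all((b, a) in R for (a, b) in R)

def transitive(R):
    # whenever aRb and bRc, we must also have aRc
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

A  = {1, 2, 3, 4}
R2 = {(1, 1), (1, 4), (3, 4), (4, 2)}
print(reflexive(R2, A), symmetric(R2), transitive(R2))   # False False False
# e.g. (2, 2) is missing; (1, 4) is in R2 but (4, 1) is not;
# (1, 4) and (4, 2) are in R2 but (1, 2) is not
```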
The relations ≤ and ≥ on the set ℕ are reflexive (for every a ∈ ℕ, a ≤ a and
a ≥ a) but not symmetric; for example, the statement 1 ≤ 3 is true but 3 ≤ 1 is not.
The < relation and the > relation are neither reflexive nor symmetric. The relation ≠
is neither reflexive nor transitive; for example, although 1 ≠ 3 and 3 ≠ 1, the
statement 1 ≠ 1, which would be required by transitivity, fails.
The simplest example of an equivalence relation on any set is the equality relation.
The three properties in Definition 1.2 are fundamental properties of equality: Any
element is equal to itself; if a is equal to b, then b is equal to a; and if a = b and
b = c, then a = c. It seems reasonable to require that any relation we refer to as
“equivalence” should also satisfy these properties, just because of the way we use
the word informally. (In Section 1.2 we have also used the word in a more precise
way, and in Exercise 1.33 you are asked to show that logical equivalence is in fact an
equivalence relation.) An equivalence relation can be thought of as a generalization
of the equality relation.
Let us show that for any fixed positive integer n, the congruence relation ≡ₙ on
the set ℕ is an equivalence relation. First, it is reflexive because for every a ∈ ℕ,
a − a = 0 * n and therefore a − a is a multiple of n. Second, it is symmetric
because for every a and b, if a ≡ₙ b, then for some k, a − b = k * n; it follows that
b − a = −k * n = (−k) * n, and therefore b ≡ₙ a. Finally, it is transitive. If a ≡ₙ b
and b ≡ₙ c, then for some integers k and m, a − b = kn and b − c = mn; therefore
a − c = (a − b) + (b − c) = (k + m)n, and a − c is a multiple of n.
An important general property of equivalence relations can be illustrated by this
example. For the sake of concreteness, we fix a value of n, say 4. The set of elements
of ℕ equivalent to 0 is the set of natural numbers i for which i − 0 is a multiple of
4: in other words, the set
{0, 4, 8, 12, …}
The elements in the bin that contains a are precisely the elements of A (including a
itself) that are related to a via this relation E.
Now we can turn this around and show that any equivalence relation on a set can
be described in exactly this way. Suppose R is an equivalence relation on A. For any
element a of A, we denote by [a]_R, or simply by [a], the equivalence class containing
a:
[a]_R = {x ∈ A | xRa}
Note that the “equivalence class containing a” really does contain a. Because R is
reflexive, aRa, which means that a is one of the elements of [a]. Note also that since
R is symmetric, saying that x ∈ [a] is really the same as saying that a ∈ [x], or aRx.
The reason is that if x ∈ [a], then xRa, and then by the symmetry of R, aRx.
We have started with an equivalence relation on A, and we have obtained a
collection of bins—namely, the equivalence classes. Now we check that they do
actually form a partition of A. Of the two properties, the second is easier to check.
Saying that the union of the equivalence classes is A is the same as saying that every
element of A is in some equivalence class; however, as we have already noted, for any
a ∈ A, a ∈ [a]. The other condition is that any two distinct equivalence classes are
disjoint; in other words, for any a and b, if [a] ≠ [b], then [a] ∩ [b] = ∅. Let us show
the contrapositive statement, which amounts to the same thing: If [a] and [b] are not
disjoint, then [a] = [b]. We start by showing that if [a] ∩ [b] ≠ ∅, then [a] ⊆ [b].
The same argument will show that [b] ⊆ [a], and it will follow that [a] = [b].
If [a] ∩ [b] ≠ ∅, then there is an x that is an element of both [a] and [b]. As we
have noted above, if x ∈ [a], then aRx. Let y be any element of [a]. Then we have
yRa    aRx    xRb
Using the transitivity of R once, along with the first two statements, we may say that
yRx; using it again with yRx and xRb, we obtain yRb. What we have now shown
is that any element of [a] is an element of [b], so that [a] ⊆ [b].
We now have a partition of A into bins, or equivalence classes. If a and b are
elements satisfying aRb, then a ∈ [b] and b ∈ [b]; in other words, if two elements are
equivalent, they are in the same bin. On the other hand, if a and b are in the same bin,
then since b ∈ [b], we must have a ∈ [b], so that a and b are related. The conclusion
is that two elements are related if and only if they are in the same bin. Abstractly the
equivalence relation R on A is no different from the relation E described previously
in terms of the partition.
This discussion can be summarized by the following theorem.
aRb
a and b are in the same equivalence class
a ∈ [b]_R
b ∈ [a]_R
[a]_R = [b]_R
[a]_R ⊆ [b]_R
[b]_R ⊆ [a]_R
[a]_R ∩ [b]_R ≠ ∅
If we have a nonempty subset S of A, then in order to show it is an equivalence
class, we must show two things:
[n − 1] = {n − 1, 2n − 1, 3n − 1, …}
These n equivalence classes are distinct, because no two of the integers 0, 1, …, n − 1
differ by a multiple of n, and they are the only ones, because among them they clearly
account for all the nonnegative integers.
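The partition of a set by an equivalence relation can be computed by dropping each element into the bin of the first representative it is related to. A sketch (illustrative, not from the text; it assumes the relation passed in really is an equivalence relation, since otherwise the bins need not be well defined):

```python
def equivalence_classes(A, related):
    """Partition the finite iterable A into equivalence classes,
    assuming `related` is reflexive, symmetric, and transitive."""
    classes = []
    for a in A:
        for c in classes:
            # compare a with any one representative of the class
            if related(a, next(iter(c))):
                c.add(a)
                break
        else:
            classes.append({a})   # a starts a new bin
    return classes

cong4 = lambda a, b: (a - b) % 4 == 0
print([sorted(c) for c in equivalence_classes(range(12), cong4)])
# [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]
```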
1.5 Languages
By a language we mean simply a set of strings involving symbols from some al-
phabet. This definition will allow familiar languages like natural languages and
high-level programming languages, and it will also allow random assortments of un-
related strings. It may be helpful before we go any further to consider how, or even
whether, it makes sense to view a language like English as simply a set of strings.
An English dictionary contains words: dumpling, inquisition, notational, put, etc.
However, writing a sequence of English words is not the same as writing English. It
makes somewhat more sense to say that English is a collection of legal sentences. We
can say that the sentence “The cat is in the hat” is an element of the English language
and that the string “dumpling dumpling put notational inquisition” is not. Taking
sentences to be the basic units may seem arbitrary (why not paragraphs?), except that
rules of English grammar usually deal specifically with constructing sentences. In
the case of a high-level programming language, such as C or Pascal, the most rea-
sonable way to think of the language as a set of strings is perhaps to take the strings
to be complete programs in the language. (We normally ask a compiler to check the
syntax of programs or subprograms, rather than individual statements.) Though we
sometimes speak of “words” in a language, we should therefore keep in mind that if
by a word we mean a string that is an element of the language, then a single word
incorporates the rules of syntax or grammar that characterize the language.
When we discuss a language, we begin by specifying the alphabet, which contains
all the legal symbols that can be used to form strings in the language. An alphabet is a
finite set of symbols. In the case of common languages like English, we would want
the alphabet to include the 26 letters, both uppercase and lowercase, as well as blanks
and various punctuation symbols. In the case of programming languages, we would
add the 10 numeric digits. Many examples in this book, however, involve a smaller
alphabet, sometimes containing only two symbols and occasionally containing only
one. Such alphabets make the examples easier to describe and can still accommodate
most of the features we will be interested in.
A string over an alphabet Σ is obtained by placing some of the elements of Σ
(possibly none) in order. The length of a string x over Σ is the number of symbols
in the string, and we will denote this number by |x|. Some of the strings over the
alphabet {a, b} are a, baa, aba, and aabba, and we have |a| = 1, |baa| = |aba| = 3,
and |aabba| = 5. Note that when we write a, we might be referring either to the
symbol in the alphabet or to the string of length 1; for our purposes it will usually not
be necessary to distinguish these. The null string (the string of length 0) is a string
over Σ, no matter what alphabet Σ is. We denote it by Λ. (To avoid confusion, we
will never allow the letter Λ to represent an element of Σ.)
For any alphabet Σ, the set of all strings over Σ is denoted by Σ*, and a language
over Σ is therefore a subset of Σ*. For Σ = {a, b}, we have
Σ* = {a, b}* = {Λ, a, b, aa, ab, ba, bb, aaa, aab, aba, abb, baa, …}
In the fourth example, n_a(x) and n_b(x) represent the number of a’s and the number
of b’s, respectively, in the string x.
Because languages are sets of strings, new languages can be constructed using set
operations. For any two languages over an alphabet Σ, their union, intersection, and
difference are also languages over Σ. When we speak of the complement of a language
over Σ, we take the universal set to be the language Σ*, so that L′ = Σ* − L. Note
that any two languages can be considered to be languages over a common alphabet:
if L₁ ⊆ Σ₁* and L₂ ⊆ Σ₂*, then L₁ and L₂ are both subsets of (Σ₁ ∪ Σ₂)*. This
creates the possibility of confusion, since now the complement of L₁ might be taken
to be either Σ₁* − L₁ or (Σ₁ ∪ Σ₂)* − L₁. However, it will usually be clear which
alphabet is referred to.
The concatenation operation on strings will also allow us to construct new lan-
guages. If x and y are elements of Σ*, the concatenation of x and y is the string xy
formed by writing the symbols of x and the symbols of y consecutively. If x = abb
and y = ba, xy = abbba and yx = baabb. For any string x, xΛ = Λx = x.
Clearly, concatenation is associative: For any strings x, y, and z, (xy)z = x(yz).
This allows us to concatenate several strings without specifying the order in which
the various concatenation operations are actually performed.
We say that a string x is a substring of another string y if there are strings w and
z, either or both of which may be null, so that y = wxz. The string car is a substring
of each of the strings descartes, vicar, carthage, and car, but not of the string charity.
A prefix of a string is an initial substring. For example, the prefixes of abaa are A
(which is a prefix of every string), a, ab, aba, and abaa. Similarly, a suffix is a final
substring.
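The definitions of substring and prefix translate directly into code. A sketch (illustrative; Python's own `in` operator already performs the substring test, but the version below follows the y = wxz definition literally):

```python
def is_substring(x, y):
    """x is a substring of y iff y = wxz for some strings w and z,
    either or both of which may be null."""
    return any(y[i:i + len(x)] == x for i in range(len(y) - len(x) + 1))

def prefixes(x):
    """All initial substrings of x, from the null string up to x itself."""
    return [x[:i] for i in range(len(x) + 1)]

print([is_substring("car", y) for y in ("descartes", "vicar", "charity")])
# [True, True, False]
print(prefixes("abaa"))   # ['', 'a', 'ab', 'aba', 'abaa']
```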
Now that we have the concatenation operation, we can apply it to languages as
well as to strings. If L₁, L₂ ⊆ Σ*, then L₁L₂ = {xy | x ∈ L₁ and y ∈ L₂}.
For example, {a, ab}{b, ba} = {ab, aba, abb, abba}. For a symbol a, a string x, and
a language L, we also write
a^k = aa⋯a (k copies)
x^k = xx⋯x (k copies)
a^0 = Λ    x^0 = Λ
L^0 = {Λ}
These last four rules are analogous to ones from algebra, and you can understand
them the same way. For any real number x, x^0 is defined to be 1. One reason is that
we want the formula x^p x^q = x^(p+q) to hold for every p and q, and in particular we
want x^0 x^q to be x^q. This means that x^0 should be the unit of multiplication (the real
number u for which uy = y for every y), which is 1. The string that is the unit of
concatenation is Λ (because Λy = y for any string y), and the unit of concatenation
for languages is {Λ}. Therefore, these are the appropriate choices for x^0 and L^0,
respectively.
L^k means the set of all strings that can be obtained by concatenating k elements
of L. Next we define the set of all strings that can be obtained by concatenating any
number of elements of L:
L* = ⋃_{i=0}^∞ L^i
The operation * in this formula is often called the Kleene star, after the mathematician
S. C. Kleene. This use of the * symbol is consistent with using Σ* to represent the set of
strings over Σ, because strings are simply concatenations of zero or more symbols.
Note that Λ is always an element of L*, no matter what L is, since L^0 = {Λ}.
Finally, we denote by L^+ the set of all strings obtainable by concatenating one or
more elements of L:
L^+ = ⋃_{i=1}^∞ L^i
You can check that L^+ = L*L = LL*. The two languages L* and L^+ may in fact
be equal—see Exercise 1.38.
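Although L* is infinite whenever L contains a nonnull string, the strings built from a bounded number of factors can be enumerated. A sketch (illustrative, not from the text; star_up_to(L, n) computes only L^0 ∪ L^1 ∪ ⋯ ∪ L^n, a finite approximation of L*):

```python
def concat(L1, L2):
    """L1 L2 = {xy | x in L1 and y in L2}."""
    return {x + y for x in L1 for y in L2}

def power(L, k):
    """L^k: all concatenations of k elements of L."""
    result = {""}              # L^0 = {Lambda}, the unit of concatenation
    for _ in range(k):
        result = concat(result, L)
    return result

def star_up_to(L, n):
    """Finite slice of L*: strings built from at most n factors."""
    s = set()
    for i in range(n + 1):
        s |= power(L, i)
    return s

L = {"a", "bb"}
print(sorted(power(L, 2)))       # ['aa', 'abb', 'bba', 'bbbb']
print("" in star_up_to(L, 3))    # True: the null string is always in L*
```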
Strings, by definition, are finite (have only a finite number of symbols). Most
interesting languages are infinite (contain an infinite number of strings). However,
in order to work with these languages we must be able to specify or describe them in
ways that are finite. At this point we may distinguish two possible approaches to this
problem, which can be illustrated by examples:
We will be studying, on the one hand, more and more general ways of generating
languages, beginning with ways similar to the ones used in the definition of L;, and
on the other hand, corresponding methods of greater and greater sophistication for
recognizing strings in these languages. In the second approach, it will be useful to
think of the algorithm for recognizing the language as being embodied in an abstract
machine, and a precise description of the machine will effectively give us a precise
way of specifying which strings are in the language. Initially these abstract machines
will be fairly primitive, since it turns out that languages like L; can be recognized
easily. A language like L2 will require a more powerful type of abstract machine to
recognize it, as well as a more general method of generating it. Before we are through,
we will study machines equivalent in power to the most sophisticated computer.
EXERCISES
1.1. Describe each of these infinite sets precisely, using a formula that does not
involve “...”. If you wish, you can use ℕ, ℝ, ℤ, and other sets discussed in
the chapter.
a. {0, −1, 2, −3, 4, −5, …}
b. {1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, 1/16, 3/16, 5/16, 7/16, …}
c. {10, 1100, 111000, 11110000, ...} (a subset of {0, 1}*)
d. {{0}, {1}, {2}, …}
Cre Ols (0, Or hs 2) Ondo 73 ex. 3
Pole (ORL 2. SOF INO 354 251 ONT AOn ea
(OR? So oes}
1.2. Label each of the eight regions in Figure 1.2, using A, B, C, and appropriate
set operations.
1.3. Use Venn diagrams to verify each of the set identities (1.1)–(1.19).
1.4. Assume that A and B are sets. In each case, find a simpler expression
representing the given set. The easiest way is probably to use Venn
diagrams, but also practice manipulating the formulas using the set identities
(1.1)-(1.19).
ae AR}
ere AN
CROLL BY SA
d. (A − B) ∪ (B − A) ∪ (A ∩ B)
e. (A′ ∩ B′)′
f. (A′ ∪ B′)′
g. A ∪ (B ∩ (A − (B ∩ A)))
bh A (Bb (A U( Ba Ay)
1.5. Show using Venn diagrams that the symmetric difference operation ®
satisfies the associative property A ⊕ (B ⊕ C) = (A ⊕ B) ⊕ C.
1.6. In each case, find a simpler statement (one not involving the symmetric
difference operation) equivalent to the given one. Assume in each case that
CHAPTER 1 Basic Mathematical Objects 33
1.22. In this problem, as usual, R denotes the set of real numbers, R⁺ the set of
nonnegative real numbers, N the set of natural numbers (nonnegative
integers), and 2^R the set of subsets of R. [0, 1] denotes the set
{x ∈ R | 0 ≤ x ≤ 1}. In each case, say whether the indicated function is
one-to-one, and say what its range is.
a. fₐ : R⁺ → R⁺ defined by fₐ(x) = x + a (where a is some fixed
element of R⁺)
b. d : R⁺ → R⁺ defined by d(x) = 2x
c. t : N → N defined by t(x) = 2x
d. g : R⁺ → N defined by g(x) = ⌊x⌋ (the largest integer ≤ x)
e. p : R⁺ → R⁺ defined by p(x) = x + ⌊x⌋
f. i : 2^R → 2^R defined by i(A) = A ∩ [0, 1]
g. u : 2^R → 2^R defined by u(A) = A ∪ [0, 1]
h. m : R⁺ → R⁺ defined by m(x) = min(x, 2)
i. M : R⁺ → R⁺ defined by M(x) = max(x, 2)
j. s : R⁺ → R⁺ defined by s(x) = min(x, 2) + max(x, 2)
1.23. Suppose A and B are sets, f : A → B, and g : B → A. If f(g(y)) = y for
every y ∈ B, then f is a ___ function and g is a ___ function. Give
reasons for your answers.
1.24. Let A = {2, 3, 4, 6, 7, 12, 18} and B = {7, 8, 9, 10}.
36 PART 1 Mathematical Notation and Techniques
a. d ∘ fₐ
b. fₐ ∘ d
c. g ∘ fₐ
d. u ∘ i
e. i ∘ u
f. m ∘ s
g. s ∘ m
h. s ∘ u
1.26. In each case, show that f is a bijection and find a formula for f⁻¹.
a. f : R → R defined by f(x) = x
b. f : R⁺ → {x ∈ R | 0 < x ≤ 1} defined by f(x) = 1/(1 + x)
c. f : R × R → R × R defined by f(x, y) = (x + y, x − y)
1.27. Show that if f : A → B is a bijection, then f⁻¹ is also a bijection, and
(f⁻¹)⁻¹ = f.
1.28. In each case, a relation on the set {1, 2, 3} is given. Of the three properties,
reflexivity, symmetry, and transitivity, determine which ones the relation has.
Give reasons.
a. R = {(1, 3), (3, 1), (1, 2)}
b. R = {(1, 1), (2, 2), (3, 3), (1, 2)}
c. R = ∅
1.29. For each of the eight lines of the table below, construct a relation on {1, 2, 3}
that fits the description.
1.30. Three relations are given on the set of all nonempty subsets of N. In each
case, say whether the relation is reflexive, whether it is symmetric, and
whether it is transitive.
a. R is defined by: ARB if and only if A ⊆ B.
b. R is defined by: ARB if and only if A ∩ B ≠ ∅.
c. R is defined by: ARB if and only if 1 ∈ A ∩ B.
1.31. How would your answer to Exercise 1.30 change if in each case R were the
indicated relation on the set of all subsets of N?
1.32. Let R be a relation on a set S. Write three quantified statements (the domain
being S in each case), which say, respectively, that R is not reflexive, R is
not symmetric, and R is not transitive.
1.33. In each case, a set A is specified, and a relation R is defined on it. Show that
R is an equivalence relation.
a. A = 2^S, for some set S. An element X of A is related via R to an
element Y if there is a bijection from X to Y.
b. A is an arbitrary set, and it is assumed that for some other set B,
f : A → B is a function. For x, y ∈ A, xRy if f(x) = f(y).
c. Suppose U is the set {1,2,..., 10}. A is the set of all statements over
the universe U—that is, statements involving at most one free variable,
which can have as its value an element of U. (Included in A are the
statements false and true.) For two elements r and s of A, rRs if r ⇔ s.
d. A is the set R, and for x, y ∈ A, xRy if x − y is an integer.
e. A is the set of all infinite sequences x = x₀x₁x₂··· of 0's and 1's. For
two such sequences x and y, xRy if there exists an integer k so that
xᵢ = yᵢ for every i ≥ k.
1.34. In Exercise 1.33a, if S has exactly 10 elements, how many equivalence
classes are there for the relation R? Describe them. What are the elements of
the equivalence class containing {a, b} (assuming a and b are two elements
of S)?
1.35. In Exercise 1.33b, if A and B are both the set of real numbers, and f is the
function defined by f(x) = x², describe the equivalence classes.
1.36. In Exercise 1.33b, suppose A has n elements and B has m elements.
a. If f is one-to-one (and not necessarily onto), how many equivalence
classes are there?
b. If f is onto (and not necessarily one-to-one), how many equivalence
classes are there?
1.37. In Exercise 1.33c, how many equivalence classes are there? List some
elements in the equivalence class containing the statement
(x = 3) ∨ (x = 7). List some elements in the equivalence class containing
the statement true, and some in the equivalence class containing false.
1.38. Let L be a language. It is clear from the definitions that L⁺ ⊆ L*. Under
what circumstances are they equal?
38 PART 1 Mathematical Notation and Techniques
1.39. a. Find a language L over {a, b} that is neither {Λ} nor {a, b}* and satisfies
L = L*.
b. Find an infinite language L over {a, b} for which L ≠ L*.
1.40. In each case, give an example of languages L₁ and L₂ satisfying L₁L₂ =
L₂L₁ as well as the additional conditions indicated.
a. Neither language is a subset of the other, and neither language is {Λ}.
b. L₁ is a proper nonempty subset of L₂ (proper means L₁ ≠ L₂), and
L₁ ≠ {Λ}.
1.41. Let L₁ and L₂ be subsets of {0, 1}*, and consider the two languages L₁* ∪ L₂*
and (L₁ ∪ L₂)*.
a. Which of the two is always a subset of the other? Why? Give an
example (i.e., say what L₁ and L₂ are) so that the opposite inclusion
does not hold.
b. If L₁ ⊆ L₂, then (L₁ ∪ L₂)* = L₂* = L₁* ∪ L₂*. Similarly if L₂ ⊆ L₁.
Give an example of languages L₁ and L₂ for which L₁ ⊄ L₂, L₂ ⊄ L₁,
and L₁* ∪ L₂* = (L₁ ∪ L₂)*.
1.42. Show that if A and B are languages over Σ and A ⊆ B, then A* ⊆ B*.
1.43. Show that for any language L, L* = (L*)* = (L⁺)* = (L*)⁺.
1.44. For a finite language L, denote by |L| the number of elements of L. (For
example, |{Λ, a, ababb}| = 3.) Is it always true that for finite languages A
and B, |AB| = |A||B|? Either prove the equality or find a counterexample.
1.45. List some elements of {a, ab}*. Can you describe a simple way to recognize
elements of this language? In other words, try to find a proposition p(x) so
that
a. {a, ab}* is precisely the set of strings x satisfying p(x); and
b. for any x, there is a simple procedure to test whether x satisfies p(x).
1.46. a. Consider the language L of all strings of a’s and b’s that do not end with
b and do not contain the substring bb. Find a finite language S so that
L = S*.
b. Show that there is no language S so that the language of all strings of a’s
and b’s that do not contain the substring bb is equal to S*.
1.47. Let L₁, L₂, and L₃ be languages over some alphabet Σ. In each part below,
two languages are given. Say what the relationship is between them. (Are
they always equal? If not, is one always a subset of the other?) Give reasons
for your answers, including counterexamples if appropriate.
a. L₁(L₂ ∩ L₃), L₁L₂ ∩ L₁L₃
⋃_{r<1} {x : |x − a| < r}
1.55. In each case, write a quantified statement, using the formal notation
discussed in the chapter, that expresses the given statement. In both cases the
set A is assumed to be a subset of the domain, not necessarily the entire
domain.
a. There is exactly one element x in the set A satisfying the condition
P—that is, for which the proposition P(x) holds.
b. There are at least two distinct elements in the set A satisfying the
condition P.
1.56. Below are four pairs of statements. In all cases, the universe for the
quantified statements is assumed to be the set N. We say one statement
logically implies the other if, for any choice of statements p(x) and q(x) for
which the first is true, the second is also true. In each case, say whether the
first statement logically implies the second, and whether the second logically
implies the first. In each case where the answer is no, give an example of
statements p(x) and q(x) to illustrate.
a.
∀x(p(x) ∨ q(x))
∀x(p(x)) ∨ ∀x(q(x))
b.
∀x(p(x) ∧ q(x))
∀x(p(x)) ∧ ∀x(q(x))
c.
∃x(p(x) ∨ q(x))
∃x(p(x)) ∨ ∃x(q(x))
d.
∃x(p(x) ∧ q(x))
∃x(p(x)) ∧ ∃x(q(x))
1.57. Suppose A and B are sets and f : A → B. Let S and T be subsets of A.
a. Is the set f(S ∪ T) a subset of f(S) ∪ f(T)? If so, give a proof; if not,
give a counterexample (i.e., specify sets A, B, S, and T and a function
f).
b. Is the set f(S) U f(T) a subset of f(S UT)? Give either a proof or a
counterexample.
c. Repeat part (a) with intersection instead of union.
d. Repeat part (b) with intersection instead of union.
e. In each of the first four parts where your answer is no, what extra
assumption on the function f would make the answer yes? Give reasons
for your answer.
Mathematical Induction
and Recursive Definitions
2.1 PROOFS
A proof of a statement is essentially just a convincing argument that the statement is
true. Ideally, however, a proof not only convinces but explains why the statement is
true, and also how it relates to other statements and how it fits into the overall theory. A
typical step in a proof is to derive some statement from (1) assumptions or hypotheses,
(2) statements that have already been derived, and (3) other generally accepted facts,
using general principles of logical reasoning. In a very careful, detailed proof, we
might allow no “generally accepted facts” other than certain axioms that we specify
initially, and we might restrict ourselves to certain specific rules of logical inference,
by which each step must be justified. Being this careful, however, may not be feasible
or worthwhile. We may take shortcuts (“It is obvious that ...” or “It is easy to show
that ...”) and concentrate on the main steps in the proof, assuming that a conscientious
or curious reader could fill in the low-level details.
Usually what we are trying to prove involves a statement of the form p → q. A
direct proof assumes that the statement p is true and uses this to show q is true.
Proof
We start by saying more precisely what our assumption means. An integer n is odd if there
exists an integer x so that n = 2x + 1. Now let a and b be any odd integers. Then according
to this definition, there is an integer x so that a = 2x + 1, and there is an integer y so that
b = 2y + 1. The product is then ab = (2x + 1)(2y + 1) = 4xy + 2x + 2y + 1 = 2(2xy + x + y) + 1.
Since we have shown that there is a z, namely, 2xy + x + y, so that ab = 2z + 1, the proof is
complete.
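The algebra in this proof is easy to spot-check mechanically. The following Python sketch (our own illustration; the name `is_odd` is not from the text) verifies both the conclusion and the witness z = 2xy + x + y over a range of odd integers:

```python
def is_odd(n):
    # n is odd exactly when n = 2x + 1 for some integer x.
    return n % 2 != 0

# For every pair of odd integers a and b in a small range, the product
# ab should again be odd, with the witness z = 2xy + x + y from the proof.
for a in range(-21, 22, 2):                  # -21, -19, ..., 21: all odd
    for b in range(-21, 22, 2):
        x, y = (a - 1) // 2, (b - 1) // 2    # a = 2x + 1, b = 2y + 1
        z = 2 * x * y + x + y
        assert a * b == 2 * z + 1
        assert is_odd(a * b)
```

A check like this is no substitute for the proof, but it is a quick way to catch algebra slips in a candidate witness.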
Proof
The statement we wish to prove is of the general form "For every x, if p(x), then q(x)." For
each x, the statement "If p(x) then q(x)" is logically equivalent to "If not q(x) then not p(x),"
and therefore (by a general principle of logical reasoning) the statement we want to prove is
equivalent to this: For any positive integers i, j, and n, if it is not the case that i ≤ √n or
j ≤ √n, then i * j ≠ n.
If it is not true that i ≤ √n or j ≤ √n, then i > √n and j > √n. A generally accepted
fact from mathematics is that if a and b are numbers with a > b, and c is a number > 0, then
ac > bc. Applying this to the inequality i > √n with c = j, we obtain i * j > √n * j. Since
n > 0, we know that √n > 0, and we may apply the same fact again to the inequality j > √n,
this time letting c = √n, to obtain j√n > √n√n = n. We now have i * j > j√n > n, and
it follows that i * j ≠ n.
The second paragraph in this proof illustrates the fact that a complete proof, with no details
left out, is usually not feasible. Even though the statement we are proving here is relatively
simple, and our proof includes more detail than might normally be included, there is still a lot
left out. Here are some of the details that were ignored:
Even if we include all these details, we have not stated explicitly the rules of inference
we have used to arrive at the final conclusion, and we have used a number of facts about real
numbers that could themselves be proved from more fundamental axioms. In presenting a
proof, one usually tries to strike a balance: enough left out to avoid having the minor details
obscure the main points and put the reader to sleep, and enough left in so that the reader will
be convinced.
√2 Is Irrational
A real number x is rational if there are two integers m and n so that x = m/n. We present
one of the most famous examples of proof by contradiction: the proof, known to the ancient
Greeks, that √2 is irrational.
Proof
Suppose for the sake of contradiction that √2 is rational. Then there are integers m′ and n′
with √2 = m′/n′. By dividing both m′ and n′ by all the factors that are common to both, we
obtain √2 = m/n, for some integers m and n having no common factors. Since m/n = √2,
m = n√2. Squaring both sides of this equation, we obtain m² = 2n², and therefore m² is even
(divisible by 2). The result proved in Example 2.1 is that for any integers a and b, if a and b are
odd, then ab is odd. Since a conditional statement is logically equivalent to its contrapositive,
we may conclude that for any a and b, if ab is not odd, then either a is not odd or b is not odd.
However, an integer is not odd if and only if it is even (Exercise 2.21), and so for any a and b,
if ab is even, then a or b is even. If we apply this when a = b = m, we conclude that since
m² is even, m must be even. This means that for some k, m = 2k. Therefore, (2k)² = 2n².
Simplifying this and canceling 2 from both sides, we obtain 2k² = n². Therefore, n² is even.
The same argument that we have already used shows that n must be even, and so n = 2j for
some j. We have shown that m and n are both divisible by 2. This contradicts the previous
statement that m and n have no common factor. The assumption that √2 is rational therefore
leads to a contradiction, and the conclusion is that √2 is irrational.
Proof
Again we try a proof by contradiction. Suppose that A, B, and C are sets for which the
conditional statement is false. Then A ∩ B = ∅, C ⊆ B, and A ∩ C ≠ ∅. Therefore, there
exists x with x ∈ A ∩ C, so that x ∈ A and x ∈ C. Since C ⊆ B and x ∈ C, it follows
that x ∈ B. Therefore, x ∈ A ∩ B, which contradicts the assumption that A ∩ B = ∅. Since
the assumption that the conditional statement is false leads to a contradiction, the statement is
proved.
There is not always a clear line between a proof by contrapositive and one by
contradiction. Any proof by contrapositive that p → q is true can easily be reformulated
as a proof by contradiction. Instead of assuming that ¬q is true and trying
to show ¬p, assume that p and ¬q are true and derive ¬p; then the contradiction is
that p and ¬p are both true. In the last example it seemed slightly easier to argue
by contradiction, since we wanted to use the assumption that C ⊆ B. A proof by
contrapositive would assume that A ∩ C ≠ ∅ and would try to show that
¬((A ∩ B = ∅) ∧ (C ⊆ B))
CHAPTER 2 Mathematical Induction and Recursive Definitions 47
This approach seems a little more complicated, just because the formula we are trying
to obtain is more complicated.
It is often convenient (or necessary) to use several different proof techniques
within a single proof. Although the overall proof in the following example is not a
proof by contradiction, this technique is used twice within the proof.
Proof
Since n > 2, two distinct factors in n! are n and 2. Therefore, n! ≥ 2n = n + n > n + 1, and
thus n! − 1 > n. The number n! − 1 must have a factor p that is a prime. (See Example 1.2
for the definition of a prime. The fact that every integer greater than 1 has a prime factor is a
basic fact about positive integers, which we will prove in Example 2.11.) Since p is a divisor
of n! − 1, p ≤ n! − 1 < n!. This gives us one of the inequalities we need. To show the other
one, suppose for the sake of contradiction that p ≤ n. Then since p is one of the positive
integers less than or equal to n, p is a factor of n!. However, p cannot be a factor of both n! and
n! − 1; if it were, it would be a factor of 1, their difference, and this is impossible. Therefore,
the assumption that p ≤ n leads to a contradiction, and we may conclude that n < p < n!.
Another useful technique is to divide the proof into separate cases; this is illus-
trated by the next example.
Proof
We can show the result by considering two separate cases. If x contains two consecutive 0’s
or two consecutive 1’s, then the statement is true for a string y of length 1. In the other case,
any symbol that follows a 0 must be a 1, and vice versa, so that x must be either 0101 or 1010.
The statement is therefore true for a string y of length 2.
Even though the argument is simple, let us state more explicitly the logic on which it
depends. We want to show that some proposition P is true. The statement P is logically
equivalent to true → P. If we denote by p the statement that x contains two consecutive 0's
or two consecutive 1's, then p ∨ ¬p is true. This means true → P is logically equivalent to
(p ∨ ¬p) → P
which in turn is logically equivalent to
(p → P) ∧ (¬p → P)
This last statement is what we actually prove, by showing that each of the two separate condi-
tional statements is true.
In this proof, there was some choice as to which cases to consider. A less efficient approach
would have been to divide our two cases into four subcases: (i) x contains two consecutive 0’s;
(and so forth). An even more laborious proof would be to consider the 16 strings of length 4
individually, and to show that the result is true in each case. Any of these approaches is valid,
as long as our cases cover all the possibilities and we can complete the proof in each case.
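The proposition P being proved is stated just before this excerpt; assuming it says that every string x of length 4 over {0, 1} contains a nonnull substring of the form yy (which matches the two cases in the proof, |y| = 1 and |y| = 2), the laborious 16-string case analysis mentioned above takes only a few lines of Python (our own sketch; the name `has_square` is not from the text):

```python
from itertools import product

def has_square(x):
    # True if x contains a nonempty substring of the form yy
    # (the same block repeated twice in a row).
    n = len(x)
    return any(
        x[i:i + L] == x[i + L:i + 2 * L]
        for i in range(n)
        for L in range(1, (n - i) // 2 + 1)
    )

# Check all 16 strings of length 4 over {0, 1}, mirroring the two cases.
for bits in product("01", repeat=4):
    x = "".join(bits)
    if "00" in x or "11" in x:
        assert has_square(x)              # case 1: a witness y of length 1
    else:
        assert x in ("0101", "1010")      # the only two remaining strings
        assert has_square(x)              # case 2: a witness y of length 2
```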
The examples in this section provide only a very brief introduction to proofs.
Learning to read proofs takes a lot of practice, and creating your own is even harder.
One thing that does help is to develop a critical attitude. Be skeptical. When you
read a step in a proof, ask yourself, “Am I convinced by this?” When you have
written a proof, read it over as if someone else had written it (it is best to read aloud
if circumstances permit), and as you read each step ask yourself the same question.
∑_{i=1}^{n} i = n(n + 1)/2
The number of subsets of {1, 2, ..., n} is 2ⁿ.
It might be an inequality:
n < 2ⁿ
It might be some other assertion about n, or about a set with n elements, or a string
of length n:
(The term regular is defined in Chapter 3.) In this section, we discuss a common
approach to proving statements of this type.
In both the last two examples, it might seem as though the explicit mention of n
makes the statement slightly more awkward. It would be simpler to say, “Every finite
language is regular,” and this statement is true; it would also be correct to let the last
statement begin, "For any x and y in {0, 1}*, if x = 0y1, ...." However, in both cases
the simpler statement is equivalent to the assertion that the original statement is true
for every nonnegative value of n, and formulating the statement so that it involves n
will allow us to apply the proof technique we are about to discuss.
CHAPTER 2 Mathematical Induction and Recursive Definitions 49
0 + 1 + 2 + ··· + n = n(n + 1)/2
This formula is supposed to hold for every n ≥ 1; however, it makes sense to consider it for
n = 0 as well if we interpret the left side in that case to be the empty sum, which by definition
is 0. Let us therefore try to prove that the statement is true for every n ≥ 0.
How do we start? Unless we have any better ideas, we might very well begin by writing
out the formula for the first few values of n, to see if we can spot a pattern.
n = 0: 0 = 0(0 + 1)/2
n = 1: 0 + 1 = 1(1 + 1)/2
n = 2: 0 + 1 + 2 = 2(2 + 1)/2
n = 3: 0 + 1 + 2 + 3 = 3(3 + 1)/2
n = 4: 0 + 1 + 2 + 3 + 4 = 4(4 + 1)/2
As we are verifying these formulas, we probably realize after a few lines that in checking a
specific case, say n = 4, it is not necessary to do all the arithmetic on the left side: 0+ 1+2+
3 +4. We can take the left side of the previous formula, which we have already calculated,
and add 4. When we calculated 0 + 1 + 2 + 3, we obtained 3(3 + 1)/2. So our answer for
n = 4 is
3(3 + 1)/2 + 4 = 4(3/2 + 1) = 4(3 + 2)/2 = 4(4 + 1)/2
which is the one we wanted. Now that we have done this step, we can take care of n = 5 the
same way, by taking the sum we just obtained for n = 4 and adding 5:
0 + 1 + 2 + ··· + n = n(n + 1)/2 for every n ≥ 0
n = 0: 0 = 0(0 + 1)/2
n = 1: 0 + 1 = 0(0 + 1)/2 + 1 (by using the result for n = 0)
= 1(0/2 + 1)
= 1(0 + 2)/2
= 1(1 + 1)/2
n = 2: 0 + 1 + 2 = 1(1 + 1)/2 + 2 (by using the result for n = 1)
= 2(1/2 + 1)
= 2(1 + 2)/2
= 2(2 + 1)/2
Since this pattern continues indefinitely, the formula is true for every n ≥ 0.
Now let us criticize this proof. The conclusion, "the formula is true for every
n ≥ 0," is supposed to follow from the fact that "this pattern continues indefinitely."
The phrase “this pattern” refers to the calculation that we have done three times, to
derive the formula for n = 1 from n = 0, for n = 2 from n = 1, and for n = 3 from
n = 2. There are at least two clear deficiencies in the proof. One is that we have
not said explicitly what “this pattern” is. The second, which is more serious, is that
we have not made any attempt to justify the assertion that it continues indefinitely.
In this example, the pattern is obvious enough that people might accept the assertion
without much argument. However, it would be fair to say that the most important
statement in the proof is the one for which no reasons are given!
Our second version of the proof tries to correct both these problems at once: to
describe the pattern precisely by doing the calculation, not just for three particular
values of n but for an arbitrary value of n, and in the process, to demonstrate that the
pattern does not depend on the value of n and therefore does continue indefinitely.
n = 0: 0 = 0(0 + 1)/2
n = 1: 0 + 1 = 0(0 + 1)/2 + 1 (by using the result for n = 0)
= 1(0/2 + 1)
= 1(0 + 2)/2
= 1(1 + 1)/2
n = 2: 0 + 1 + 2 = 1(1 + 1)/2 + 2 (by using the result for n = 1)
= 2(1/2 + 1)
= 2(1 + 2)/2
= 2(2 + 1)/2
n = 3: 0 + 1 + 2 + 3 = 2(2 + 1)/2 + 3 (by using the result for n = 2)
= 3(2/2 + 1)
= 3(2 + 2)/2
= 3(3 + 1)/2
In general, for any value of k ≥ 0, the formula for n = k + 1 can be derived from the one
for n = k as follows:
0 + 1 + ··· + k + (k + 1) = k(k + 1)/2 + (k + 1) (by using the result for n = k)
= (k + 1)(k/2 + 1)
= (k + 1)(k + 2)/2
= (k + 1)((k + 1) + 1)/2
We might now say that the proof has more than it needs. Presenting the calculations
for three specific values of n originally made it easier for the reader to spot
the pattern; now, however, the pattern has been stated explicitly. To the extent that
the argument for these three specific cases is taken to be part of the proof, it obscures
the two essential parts of the proof: (1) checking the formula for the initial value
of n, n = 0, and (2) showing in general that once we have obtained the formula for
one value of n (n = k), we can derive it for the next value (n = k + 1). These two
facts together are what allow us to conclude that the formula holds for every n ≥ 0.
Neither by itself would be enough. (On one hand, the formula for n = 0, or even
for the first million values of n, might be true just by accident. On the other hand, it
would not help to know that we can always derive the formula for the case n = k + 1
from the one for the case n = k, if we could never get off the ground by showing that
it is actually true for some starting value of k.)
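The two essential parts of the proof correspond directly to the two pieces of a simple loop: an initialization (the formula for n = 0) and a repeated step (passing from n = k to n = k + 1). A small Python sketch of our own makes the correspondence explicit:

```python
def sum_up_to(n):
    # Build 0 + 1 + ... + n the way the induction does: start from the
    # base case n = 0 and repeatedly pass from n = k to n = k + 1.
    total = 0                      # basis step: the sum for n = 0 is 0
    for k in range(n):
        total += k + 1             # induction step: add the next term
        # invariant: total is now the sum 0 + 1 + ... + (k + 1)
    return total

# The closed form n(n + 1)/2 should agree for every n we try.
for n in range(200):
    assert sum_up_to(n) == n * (n + 1) // 2
```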
The principle that we have used in this example can now be formulated in general.
1 + 2 + 3 + ··· + n = n(n + 1)/2
Induction hypothesis. k ≥ 1, and 1 + 2 + 3 + ··· + k = k(k + 1)/2.
1 + 2 + 3 + ··· + k + (k + 1) = k(k + 1)/2 + (k + 1)
= (k + 1)(k/2 + 1)
= (k + 1)(k + 2)/2
= (k + 1)((k + 1) + 1)/2
Whether or not you follow this format exactly, it is advisable always to include
in your proof explicit statements of the following:
The general statement involving n that is to be proved.
The statement to which it reduces in the basis step (the general statement, but
with n₀ substituted for n).
The induction hypothesis (the general statement, with k substituted for n, and
preceded by "k ≥ n₀, and").
The statement to be shown in the induction step (with k + 1 substituted for n).
The point during the induction step at which the induction hypothesis is used.
The advantage of formulating a general principle of induction is that it supplies
a general framework for proofs of this type. If you read in a journal article the phrase
“It can be shown by induction that ...,” even if the details are missing, you can
supply them. Although including these five items explicitly may seem laborious at
first, the advantage is that it can help you to clarify for yourself exactly what you are
trying to do in the proof. Very often, once you have gotten to this point, filling in the
remaining details is a straightforward process.
EXAMPLE 2.8
Strings of the Form 0y1 Must Contain the Substring 01
Let us prove the following statement: For any x ∈ {0, 1}*, if x begins with 0 and ends with 1
(i.e., x = 0y1 for some string y), then x must contain the substring 01.
You may wonder whether this statement requires an induction proof; let us begin with
an argument that does not involve induction, at least explicitly. If x = 0y1 for some string
y ∈ {0, 1}*, then x must contain at least one 1. The first 1 in x cannot occur at the beginning,
since x starts with 0; therefore, the first 1 must be immediately preceded by a 0, which means
that x contains the substring 01. It would be hard to imagine a proof much simpler than
this, and it seems convincing. It is interesting to observe, however, that this proof uses a fact
about natural numbers (every nonempty subset has a smallest element) that is equivalent to
the principle of mathematical induction. We will return to this statement later, when we have
a slightly modified version of the induction principle. See Example 2.12 and the discussion
before that example.
In any case, we are interested in illustrating the principle of induction at least as much
as in the result itself. Let us try to construct an induction proof. Our initial problem is that
mathematical induction is a way of proving statements of the form "For every n ≥ n₀, P(n),"
and our statement is not of this form. This is easy to fix, and the solution was suggested at
the beginning of this section. Consider the statement P(n): If |x| = n and x = 0y1 for some
string y € {0, 1}*, then x contains the substring 01. In other words, we are introducing an
integer n into our statement, specifically in order to use induction. If we can prove that P(n) is
true for every n ≥ 2, it will follow that the original statement is true. (The integer we choose is
the length of the string, and we could describe the method of proof as induction on the length
of the string. There are other possible choices; see Exercise 2.6.)
In the basis step, we wish to prove the statement "If |x| = 2 and x = 0y1 for some
string y ∈ {0, 1}*, then x contains the substring 01." This statement is true, because if |x| = 2
and x = 0y1, then y must be the null string Λ, and we may conclude that x = 01. Our
induction hypothesis will be the statement: k ≥ 2, and if |x| = k and x = 0y1 for some
string y ∈ {0, 1}*, then x contains the substring 01. In the induction step, we must show: if
|x| = k + 1 and x = 0y1 for some y ∈ {0, 1}*, then x contains the substring 01. (These three
statements are obtained from the original statement P(n) very simply: first, by substituting 2
for n; second, by substituting k for n, and adding the phrase "k ≥ 2, and" at the beginning;
third, by substituting k + 1 for n. These three steps are always the same, and the basis step
is often as easy to prove as it is here. Now the mechanical part is over, and we must actually
think about how to continue the proof!)
We have a string x of length k + 1, about which we want to prove something. We have an
induction hypothesis that tells us something about certain strings of length k, the ones that begin
with 0 and end with 1. In order to apply the induction hypothesis, we need a string of length k to
apply it to. We can get a string of length k from x by leaving out one symbol. Let us try deleting
the initial 0. (See Exercise 2.5.) The remainder, y1, is certainly a string of length k, and we
know that it ends in 1, but it may not begin with 0—and we can apply the induction hypothesis
only to strings that do. However, if y1 does not begin with 0, it must begin with 1, and in this
case x starts with the substring 01! If y1 does begin with 0, then the induction hypothesis tells
us that it must contain the substring 01, so that x = 0y1 must contain the substring too.
Now that we have figured out the crucial steps, we can afford to be a little more concise in
our official proof. We are trying to prove that for every n ≥ 2, P(n) is true, where P(n) is the
statement: If |x| = n and x = 0y1 for some string y ∈ {0, 1}*, then x contains the substring
01.
Basis step. We must show that the statement P(2) is true. P(2) says that if |x| = 2 and
x = 0y1 for some y ∈ {0, 1}*, then x contains the substring 01. P(2) is true, because if
|x| = 2 and x = 0y1 for some y, then x = 01.
Induction hypothesis. k ≥ 2 and P(k); in other words, if |x| = k and x = 0y1 for
some y ∈ {0, 1}*, then x contains the substring 01.
Statement to be shown in induction step. P(k + 1); that is, if |x| = k + 1 and
x = 0y1 for some y ∈ {0, 1}*, then x contains the substring 01.
Proof of induction step. Since |x| = k + 1 and x = 0y1, |y1| = k. If y begins with 1,
then x begins with the substring 01. If y begins with 0, then y1 begins with 0 and ends
with 1; by the induction hypothesis, y1 contains the substring 01, and therefore x does
also.
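Both the non-inductive argument and the induction proof can be sanity-checked by brute force. A short Python sketch of our own (the name `contains_01` is not from the text) confirms the statement for all strings x = 0y1 with |y| ≤ 8:

```python
from itertools import product

def contains_01(x):
    # The property asserted by P(n) for strings of the form 0y1.
    return "01" in x

# Every string of the form 0y1 should contain the substring 01.
for n in range(9):                          # all lengths |y| = 0, ..., 8
    for bits in product("01", repeat=n):
        x = "0" + "".join(bits) + "1"
        assert contains_01(x)
```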
Y := 1;
for I := 1 to n do
    Y := Y * x;
write(Y);
We would like to show that when this code is executed, the value printed out is xⁿ. We do this
in a slightly roundabout way, by introducing a new integer j, the number of iterations of the
loop that have been performed. Let P(j) be the statement that the value of Y after j iterations
is xʲ. The result we want will follow from the fact that P(j) is true for any j ≥ 0, and the fact
that "for I := 1 to n" results in n iterations of the loop.
Basis step. P(0) is the statement that after 0 iterations of the loop, Y has the value x⁰ = 1.
This is true because Y receives the initial value 1 and after 0 iterations of the loop its
value is unchanged.
Inductive hypothesis. k ≥ 0, and after k iterations of the loop the value of Y is xᵏ.
Statement to be proved in induction step. After k + 1 iterations of the loop, the value
of Y is xᵏ⁺¹.
Proof of induction step. The effect of the assignment statement Y := Y * x is to replace
the old value of Y by that value times x; therefore, the value of Y after any iteration is x
times the value before that iteration. Since x * xᵏ = xᵏ⁺¹, the proof is complete.
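The invariant P(j) can be checked directly by instrumenting a translation of the fragment. A Python sketch of our own (the fragment in the text is written in a Pascal-like pseudocode):

```python
def power(x, n):
    # Translation of the fragment: Y starts at 1, and each of the
    # n loop iterations multiplies Y by x.
    Y = 1                          # before the loop: P(0), since x ** 0 == 1
    for j in range(1, n + 1):
        Y = Y * x
        assert Y == x ** j         # the invariant P(j) after j iterations
    return Y
```

For example, power(3, 5) returns 243, which is 3⁵.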
Although the program fragment in this example is very simple, the example
should suggest that the principle of mathematical induction can be a useful technique
for verifying the correctness of programs. For another example, see Exercise 2.56.
You may occasionally find the principle of mathematical induction in a disguised
form, which we could call the minimal counterexample principle. The last example
in this section illustrates this.
(See Example 2.12.) Since we have verified P(0), k must be at least 1. Therefore,
k − 1 is at least 0, and since k is the smallest value for which P fails, P(k − 1) is true.
This means that 5ᵏ⁻¹ − 2ᵏ⁻¹ is a multiple of 3, say 3j. Then, however,
5ᵏ − 2ᵏ = 5 * 5ᵏ⁻¹ − 2 * 2ᵏ⁻¹ = 5(5ᵏ⁻¹ − 2ᵏ⁻¹) + (5 − 2) * 2ᵏ⁻¹ = 5 * 3j + 3 * 2ᵏ⁻¹ = 3(5j + 2ᵏ⁻¹)
This expression is divisible by 3. We have derived a contradiction, which allows us to conclude
that our original assumption is false. Therefore, P(n) is true for every n ≥ 0.
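The statement P(n) and the key identity in the proof are both easy to verify numerically. A Python sketch of our own (the name `divisible_by_3` is not from the text):

```python
def divisible_by_3(n):
    # P(n): 5**n - 2**n is a multiple of 3.
    return (5 ** n - 2 ** n) % 3 == 0

for k in range(1, 100):
    # The identity used in the proof:
    # 5**k - 2**k = 5 * (5**(k-1) - 2**(k-1)) + (5 - 2) * 2**(k-1)
    lhs = 5 ** k - 2 ** k
    rhs = 5 * (5 ** (k - 1) - 2 ** (k - 1)) + (5 - 2) * 2 ** (k - 1)
    assert lhs == rhs
    assert divisible_by_3(k)
```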
You can probably see the similarity between this proof and one that uses the
principle of mathematical induction. Although an induction proof has the advantage
that it does not involve proof by contradiction, both approaches are equally valid.
Not every statement involving an integer n is appropriate for mathematical in-
duction. Using this technique on the statement
(x²)ⁿ = x²ⁿ
would be silly because the proof of the induction step would not require the induction
hypothesis at all. The formula for n = k + 1, or for any other value, can be obtained
immediately by expanding the left side of the formula and using laws of exponents.
The proof would not be a real induction proof, and it would be misleading to classify
it as one.
A general rule of thumb is that if you are tempted to use a phrase like “Repeat this
process for each n,” or “Since this pattern continues indefinitely” in a proof, there
is a good chance that the proof can be made more precise by using mathematical
induction. When you encounter one of these phrases while reading a proof, it is very
likely a substitute for an induction argument. In this case, supplying the details of the
induction may help you to understand the proof better.
and 1. This means k + 1 = r * s for some positive integers r and s, neither of which is 1 or
k + 1. It follows that r and s must both be greater than 1 and less than k + 1.
In order to finish the induction step, we would like to show that r and s are both either
primes or products of primes; it would then follow, since k + 1 is the product of r and s, that
k + 1 is a product of two or more primes. Unfortunately, the only information our induction
hypothesis gives us is that k is a prime or a product of primes, and this tells us nothing about
r or s.
Consider, however, the following intuitive argument, in which we set about verifying the
statement P(n) one value of n at a time:
2 is a prime.
3 is a prime.
4 = 2 * 2, which is a product of primes since P(2) is known to be true.
5 is a prime.
6 = 2 * 3, which is a product of primes since P(2) and P(3) are known to be true.
7 is a prime.
8 = 2 * 4, which is a product of primes since P(2) and P(4) are known to be true.
9 = 3 * 3, which is a product of primes since P(3) is known to be true.
10 = 2 * 5, which is a product of primes since P(2) and P(5) are known to be true.
11 is a prime.
12 = 2 * 6, which is a product of primes since P(2) and P(6) are known to be true.
This seems as convincing as the intuitive argument given at the start of Example 2.7. Further-
more, we can describe explicitly the pattern illustrated by the first 11 steps: For each k ≥ 2,
either k + 1 is prime or it is the product of two numbers r and s for which the proposition P
has already been shown to hold.
The difference between the pattern appearing here and the one we saw in Example 2.7
is this: At each step in the earlier example we were able to obtain the truth of P(k + 1) by
knowing that P(k) was true, and here we need to know that P holds, not only for k but also
for all the values up to k. The following modified version of the induction principle will allow
our proof to proceed.
To use this principle in a proof, we follow the same steps as before except for the
way we state the induction hypothesis. The statement here is that k is some integer
≥ n₀ and that all the statements P(n₀), P(n₀ + 1), ..., P(k) are true. With this
change, we can finish the proof we began earlier.
Basis step. P(2) is the statement that 2 is either a prime or a product of two or more
primes. This is true because 2 is a prime.
Induction hypothesis. k ≥ 2, and for every n with 2 ≤ n ≤ k, n is either a prime or a
product of two or more primes.
Statement to be shown in induction step. k + 1 is either prime or a product of two or
more primes.
Proof of induction step. We consider two cases. If k + 1 is prime, the statement
P(k + 1) is true. Otherwise, by definition of a prime, k + 1 = r * s, for some positive
integers r and s, neither of which is 1 or k + 1. It follows that 2 ≤ r ≤ k and 2 ≤ s ≤ k.
Therefore, by the induction hypothesis, both r and s are either prime or the product of
two or more primes. Therefore, their product k + 1 is the product of two or more primes,
and P(k + 1) is true.
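The induction step is, in effect, a recursive algorithm: given an integer n ≥ 2, either report that it is prime or split it as r * s and factor r and s in the same way. A Python sketch (the function names are ours):

```python
def smallest_factor(n):
    """Smallest d >= 2 that divides n (assumes n >= 2)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor found: n is prime

def prime_factors(n):
    """Express n >= 2 as a list of primes, mirroring the strong-induction proof."""
    r = smallest_factor(n)
    if r == n:
        return [n]            # the "n is prime" case of the induction step
    s = n // r                # n = r * s with 2 <= r <= s < n
    return prime_factors(r) + prime_factors(s)

print(prime_factors(12))  # [2, 2, 3]
```

The recursion terminates because r and s are both strictly smaller than n, which is exactly the fact that makes the strong induction hypothesis applicable.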
containing n has a smallest element, then A does. With this in mind, we let P(n) be the
statement “Every subset of N containing n has a smallest element.” We prove that P(n) is
true for every n ≥ 0. (See Exercise 2.7.)
Basis step. P(0) is the statement that every subset of N containing 0 has a smallest
element. This is true because 0 is the smallest natural number and therefore the smallest
element of the subset.
Induction hypothesis. k ≥ 0, and for every n with 0 ≤ n ≤ k, every subset of N
containing n has a smallest element. (Put more simply, k ≥ 0 and every subset of N
containing an integer less than or equal to k has a smallest element.)
Statement to be shown in induction step. Every subset of N containing k + 1 has a
smallest element.
Proof of induction step. Let A be any subset of N containing k + 1. We consider two
cases. If A contains no natural number less than k + 1, then k + 1 is the smallest
element of A. Otherwise, A contains some natural number n with n ≤ k. In this case, by
the induction hypothesis, A contains a smallest element.
n! = 1                  if n = 0
n! = n * (n − 1)!       if n > 0
This is one of the simplest examples of a recursive, or inductive, definition. It defines
the factorial function on the set of natural numbers, first by defining the value at 0,
CHAPTER 2 Mathematical Induction and Recursive Definitions 59
and then by defining the value at any larger natural number in terms of its value at the
previous one. There is an obvious analogy here to the basis step and the induction step
in a proof by mathematical induction. The intuitive reason this is a valid definition is
the same as the intuitive reason the principle of induction should be true: If we think
of defining n! for all the values of n in order, beginning with n = 0, then for any
k => 0, eventually we will have defined the value of k!, and at that point the definition
will tell us how to obtain (k + 1)!.
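The definition translates line for line into a recursive function; a minimal sketch:

```python
def factorial(n):
    """n! defined recursively: 0! = 1 and n! = n * (n - 1)! for n > 0."""
    if n == 0:
        return 1                      # the basis part of the definition
    return n * factorial(n - 1)       # the recursive part

print(factorial(5))  # 120
```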
In this section, we will look at a number of examples of recursive definitions of
functions and examine more closely the relationship between recursive definitions
and proofs by induction. We begin with more examples of functions on the set of
natural numbers.
f(0) = 1
f(1) = 1
for every n ≥ 1, f(n + 1) = f(n) + f(n − 1)
To evaluate f(4), for example, we can use the definition in either a top-down fashion:

f(4) = f(3) + f(2) = (f(2) + f(1)) + (f(1) + f(0)) = ((f(1) + f(0)) + f(1)) + (f(1) + f(0)) = ((1 + 1) + 1) + (1 + 1) = 5

or a bottom-up fashion:
f(0) = 1
f(1) = 1
f(2) = f(1) + f(0) = 1 + 1 = 2
f(3) = f(2) + f(1) = 2 + 1 = 3
f(4) = f(3) + f(2) = 3 + 2 = 5
It is possible to give a nonrecursive algebraic formula for the number f(n); see Exercise 2.53.
However, the recursive definition is the one that most people remember and prefer to use.
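Both styles of evaluation can be written out directly; a sketch in Python (the function names are ours):

```python
def fib_top_down(n):
    """f(0) = f(1) = 1; f(n + 1) = f(n) + f(n - 1), evaluated by expansion."""
    if n <= 1:
        return 1
    return fib_top_down(n - 1) + fib_top_down(n - 2)

def fib_bottom_up(n):
    """The same function f, evaluated from f(0) and f(1) upward."""
    a, b = 1, 1                   # f(0) and f(1)
    for _ in range(n - 1):
        a, b = b, a + b           # slide the window up by one
    return a if n == 0 else b

print(fib_top_down(4), fib_bottom_up(4))  # 5 5
```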
If the definition of the factorial function is reminiscent of the principle of mathematical
induction, then the definition of the Fibonacci function suggests the strong principle of
induction. This is because the definition of f(n + 1) involves not only f(n) but f(n − 1).
This observation is useful in proving facts about f. For example, let us prove that

for every n ≥ 0, f(n) ≤ (5/3)^n
Basis step. We must show that f(0) ≤ (5/3)^0; this is true, since f(0) and (5/3)^0 are
both 1.
60 PART 1 Mathematical Notation and Techniques
Induction hypothesis. k ≥ 0, and for every n with 0 ≤ n ≤ k, f(n) ≤ (5/3)^n.
Statement to show in induction step. f(k + 1) ≤ (5/3)^(k+1).
Proof of induction step. Because of the way f is defined, we consider two cases in
order to make sure that our proof is valid for every value of k ≥ 0. If k = 0, then
f(k + 1) = f(1) = 1, by definition, and in this case the inequality is clearly true. If
k > 0, then we must use the recursive part of the definition, which is
f(k + 1) = f(k) + f(k − 1). Since both k and k − 1 are ≤ k, we may apply the
induction hypothesis to both terms, obtaining

f(k + 1) = f(k) + f(k − 1)
         ≤ (5/3)^k + (5/3)^(k−1)
         = (5/3)^(k−1)(5/3 + 1)
         = (5/3)^(k−1)(8/3)
         = (5/3)^(k−1)(24/9)
         < (5/3)^(k−1)(25/9) = (5/3)^(k+1)
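The bound just proved can be spot-checked numerically; a small sketch (the cutoff 30 is arbitrary):

```python
def f(n):
    """The Fibonacci function of this section: f(0) = f(1) = 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# The inequality proved by strong induction: f(n) <= (5/3)**n for n >= 0.
for n in range(30):
    assert f(n) <= (5 / 3) ** n
print("f(n) <= (5/3)**n verified for n = 0, ..., 29")
```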
⋃(i=1 to 1) A_i = A_1

for every n ≥ 1, ⋃(i=1 to n+1) A_i = (⋃(i=1 to n) A_i) ∪ A_(n+1)
For each n, ⋃(i=1 to n) A_i is a set, in particular a subset of U; therefore, it makes sense to view the
recursive definition as defining a function from the set of natural numbers to the set of subsets
of U. What we have effectively done in this definition is to extend the binary operation of
union so that it can be applied to n operands. (We discussed this possibility for associative
operations like union near the end of Section 1.1.) This procedure is familiar to you in other
settings, although you may not have encountered a formal recursive definition like this one
before. When you add n numbers in an expression like Σ(i=1 to n) a_i, for example, you are extending
the binary operation of addition to n operands. The notational device used to do this is the
summation sign: Σ bears the same relation to the binary operation + that ⋃ does to the binary
operation ∪. A recursive definition of Σ would follow the one above in every detail:
Σ(i=1 to 1) a_i = a_1

for every n ≥ 1, Σ(i=1 to n+1) a_i = (Σ(i=1 to n) a_i) + a_(n+1)
(The only difference between this definition and the last two is that we have not introduced a
notation for the concatenation operator; we have written the concatenation of x and y as
simply xy.) We are also free to use one of these
general definitions in a special case. For example, we might consider the concatenation of n
languages, all of which are the same language L:
L^0 = {Λ}
for every n ≥ 0, L^(n+1) = L^n L
Of course, we already have nonrecursive definitions of many of these things. Our defini-
tion of ⋃(i=1 to n) A_i in Section 1.1 is

⋃(i=1 to n) A_i = {x | x ∈ A_i for at least one i with 1 ≤ i ≤ n}

and we also described L^n as the set of all strings that can be obtained by concatenating n
elements of L. It may not be obvious that we have gained anything by introducing the recursive
definition. The nonrecursive definition of the n-fold union is clear enough, and even with the
ellipses (...) it is not difficult to determine which strings are included in L^n. However, the
recursive definition has the same advantage over the ellipses that we discussed in the first proof
in Example 2.7; rather than suggesting what the general step is in this n-fold concatenation, it
comes right out and says it. After all, when you construct an element of L^n, you concatenate
two strings, not n strings at once. The recursive definition is more consistent with the binary
nature of concatenation, and more explicit about how the concatenation is done: An (n + 1)-
fold concatenation is obtained by concatenating an element of L^n and an element of L. The
recursive definition has a dynamic, or algorithmic, quality that the other one lacks. Finally,
and probably most important, the recursive definition has a practical advantage. It gives us a
natural way of constructing proofs using mathematical induction.
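The recursive definition of L^n is itself an algorithm. A sketch in Python, representing a language as a set of strings (the function name is ours):

```python
def concat_power(L, n):
    """L**n by the recursive definition: L**0 = {Λ}, L**(n+1) = (L**n) L."""
    if n == 0:
        return {""}                                   # {Λ}: the empty string only
    # concatenate each (n-fold) string with each element of L: one binary
    # concatenation per step, exactly as the definition prescribes
    return {x + y for x in concat_power(L, n - 1) for y in L}

L = {"a", "ab"}
print(sorted(concat_power(L, 2)))  # ['aa', 'aab', 'aba', 'abab']
```

Note that each string really is built two at a time, never n at once, in keeping with the binary nature of concatenation discussed above.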
Suppose we want to prove the generalized De Morgan law: for every n ≥ 1,

(⋃(i=1 to n) A_i)′ = ⋂(i=1 to n) A_i′

The basis step is just the case n = 1, and in the induction step the recursive definition
allows us to write ⋃(i=1 to k+1) A_i = (⋃(i=1 to k) A_i) ∪ A_(k+1),
at which point we can use the original De Morgan law to complete the proof, since we have
expressed the (k + 1)-fold union as a two-fold union.
This example illustrates again the close relationship between the principle of
mathematical induction and the idea of recursive definitions. Not only are the two
ideas similar in principle, they are almost inseparable in practice. Recursive defini-
tions are useful in constructing induction proofs, and induction is the natural proof
technique to use on objects defined recursively.
The relationship is so close, in fact, that in induction proofs we might use recursive
definitions without realizing it. In Example 2.7, we proved that
1+2+---+n=n(n+1)/2
and the crucial observation in the induction step was that

1 + 2 + ··· + (k + 1) = (1 + 2 + ··· + k) + (k + 1)

In effect, we were using a recursive definition of the sum 1 + 2 + ··· + n without saying so.
Recursively defined sets offer the same opportunity. Here, for example, is a recursive
definition of L*, the set of all strings obtainable by concatenating elements of a language L
(Example 2.15):

1. Λ ∈ L*.
2. For any x ∈ L* and any y ∈ L, xy ∈ L*.
3. No string is in L* unless it can be obtained by using rules 1 and 2.
To illustrate, let L = {a, ab}. According to rule 1, Λ ∈ L*. One application of rule 2 adds the
strings in L^1 = L, which are Λa = a and Λab = ab. Another application of rule 2 adds the
strings in L^2, which are aa, aab, aba, and abab. For any k ≥ 0, a string obtained
by concatenating k elements of L can be produced from the definition by using k applications
of rule 2.
An even simpler illustration is to let L be Σ, which is itself a language over Σ. Then a
string of length k in Σ*, which is a concatenation of k symbols belonging to Σ, can be produced
by k applications of rule 2.
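The two rules can be applied mechanically. This sketch lists the strings of L* obtainable with at most k applications of rule 2 (the function name is ours):

```python
def star_up_to(L, k):
    """Strings of L* reachable with at most k applications of rule 2."""
    result = {""}                                     # rule 1: Λ is in L*
    for _ in range(k):
        # rule 2: for any x obtained so far and any y in L, add xy
        result |= {x + y for x in result for y in L}
    return result

S = star_up_to({"a", "ab"}, 2)
print(sorted(S, key=len))
```

With L = {a, ab} and k = 2 this produces Λ together with the strings of L^1 and L^2, matching the hand computation above.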
This way of defining L* recursively is not the only way. Here is another possibility.
1. Λ ∈ L*.
2. For any x ∈ L, x ∈ L*.
3. For any x and y in L*, xy ∈ L*.
4. No string is in L* unless it can be obtained by using rules 1, 2, and 3.
In this approach, rules 1 and 2 are both necessary in the basis part of the definition, since rule
1 by itself would provide no strings other than Λ to concatenate in the recursive part.
The first definition is a little closer to our original definition of L*, and perhaps a little
easier to work with, because there is a direct correspondence between the applications of rule
2 needed to generate an element of L* and the strings of L that are being concatenated. The
second definition allows more flexibility as to how to produce a string in the language. There
is a sense, however, in which both definitions capture the idea of all possible concatenations
of strings of L, and for this reason it may not be too difficult to convince yourself that both
definitions really do work—that the set being defined is L* in each case. Exercise 2.63 asks
you to consider the question in more detail.
1. Λ ∈ pal.
2. For any a ∈ Σ, a ∈ pal.
3. For any x ∈ pal and any a ∈ Σ, axa ∈ pal.
4. No string is in pal unless it can be obtained by using rules 1, 2, and 3.
The strings that can be obtained by using rules 1 and 3 exclusively are the elements of
pal of even length, and those obtained by using rules 2 and 3 are of odd length. A simple
nonrecursive definition of pal is that it is the set of strings that read the same backwards as
forwards. (See Exercise 2.60.)
In Example 2.14, we mentioned the algorithmic, or constructive, nature of the recursive
definition of L^n. In the case of a language such as pal, this aspect of the recursive definition
can be useful both from the standpoint of generating elements of the language and from the
standpoint of recognizing elements of the language. The definition says, on the one hand, that
we can construct palindromes by starting with either A or a single symbol, and continuing to
concatenate a symbol onto both ends of the current string. On the other hand, it says that if we
wish to test a string to see if it’s a palindrome, we may first compare the leftmost and rightmost
symbols, and if they are equal, reduce the problem to testing the remaining substring.
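The recognition procedure described in the last sentence, comparing the outermost symbols and recursing on what remains, is a few lines of Python:

```python
def is_pal(x):
    """Membership in pal, running the rules of the definition backwards."""
    if len(x) <= 1:
        return True                       # rules 1 and 2: Λ and single symbols
    # rule 3 in reverse: x must be a y a for some shorter palindrome y
    return x[0] == x[-1] and is_pal(x[1:-1])

print(is_pal("abba"), is_pal("abab"))  # True False
```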
1. i ∈ AE.
2. For any x, y ∈ AE, both (x + y) and (x − y) are elements of AE.
3. No string is in AE unless it can be obtained by using rules 1 and 2.
Some of the strings in AE are i, (i + i), (i − i), ((i + i) − i), and ((i − i) + i).
EXAMPLE 2.18  Finite Subsets of the Natural Numbers
We define a set F of subsets of the natural numbers as follows:
1. ∅ ∈ F.
2. For any n ∈ N, {n} ∈ F.
3. For any A and B in F, A ∪ B ∈ F.
4. Nothing is in F unless it can be obtained by using rules 1, 2, and 3.
We can obtain any two-element subset of N by starting with two one-element sets and using
rule 3. Because we can then apply rule 3 to any two-element set A and any one-element set B,
we obtain all the three-element subsets of N. It is easy to show using mathematical induction
that for any natural number n, any n-element subset of N is an element of F; we may conclude
that F is the collection of all finite subsets of N.
Let us consider the last statement in each of the recursive definitions in this sec-
tion. In each case, the previous statements describe ways of producing new elements
of the set being defined. The last statement is intended to remove any ambiguity:
Unless an object can be shown using these previous rules to belong to the set, it does
not belong to the set.
We might choose to be even a little more explicit. In Example 2.17, we might
say, “No string is in AE unless it can be obtained by a finite number of applications
of rules 1 and 2.” Here this extra precision hardly seems necessary, as long as it is
understood that “string” means something of finite length; in Example 2.18 it is easier
to see that it might be appropriate. One might ask whether there are any infinite sets
in F. We would hope that the answer is “no” because we have already agreed that the
definition is a reasonable way to define the collection of finite subsets of N. On the
one hand, we could argue that the only way rule 3 could produce an infinite set would
be for one of the sets A or B to be infinite already. On the other hand, think about
using the definition to show that an infinite set C is not in F. This means showing that
C cannot be obtained by using rule 3. For an infinite set C to be obtained this way,
either A or B (both of which must be elements of F) would have to be infinite—but
how do we know that cannot happen? The definition is not really precise unless it
makes it clear that rules 1 to 3 can be used only a finite number of times in obtaining an
element of F. Remember also that we think of a recursive definition as a constructive,
or algorithmic, definition, and that in any actual construction we would be able to
apply any of the rules in the definition only a finite number of times.
Let us describe even more carefully the steps that would be involved in “a finite
number of applications” of rules 1 to 3 in Example 2.18. Take a finite subset A of N
that we might want to obtain from the definition of F, say A = {2, 3,7, 11, 14}. There
are a number of ways we can use the definition to show that A € F. One obvious
approach is to start with {2} and use rule 3 four times, adding one more element
each time, so that the four steps give us the subsets {2, 3}, {2, 3, 7}, {2, 3, 7, 11}, and
{2, 3,7, 11, 14}. Each time, the new subset is obtained by applying rule 3 to two
sets, one a set we had obtained earlier by using rule 3, the other a one-element set
(one of the elements of F specified explicitly by rule 2). Another approach would
be to start with two one-element sets, say {2} and {7}, to add elements one at a time
to each so as to obtain the sets {2, 3} and {7, 11, 14}, and then to use rule 3 once
more to obtain their union. In both of these approaches, we can write a sequence
of sets representing the preliminary steps we take to obtain the one we want. We
might include in our sequence all the one-element sets we use, in addition to the two-,
three-, or four-element subsets we obtain along the way. In the first case, therefore,
the sequence might look like this:
1. {2}    2. {3}    3. {2, 3}    4. {7}    5. {2, 3, 7}
6. {11}   7. {2, 3, 7, 11}    8. {14}    9. {2, 3, 7, 11, 14}
and in the second case the sequence might be
1. {2}    2. {3}    3. {2, 3}    4. {7}    5. {11}
6. {7, 11}    7. {14}    8. {7, 11, 14}    9. {2, 3, 7, 11, 14}
In both cases, there is considerable flexibility as to the order of the terms. The
significant feature of both sequences is that every term is either one of the specific
sets mentioned in statements 1 and 2 of the definition, or it is obtained from two terms
appearing earlier in the sequence by using statement 3.
A precise way of expressing statement 4 in the definition would therefore be to
say:

No set A is in F unless there is a positive integer n and a sequence A_1, A_2, ..., A_n, so
that A_n = A, and for every i with 1 ≤ i ≤ n, A_i is either ∅, or a one-element set, or
A_j ∪ A_k for some j, k < i.
A recursive definition of a set like F is not usually this explicit. Probably the
most common approach is to say something like our original statement 4; a less formal
approach would be to say “Nothing else is in F,” and sometimes the final statement
is skipped altogether.
1. Λ ∈ L.
2. For any y ∈ L, both 0y and 0y1 are in L.
3. No string is in L unless it can be obtained from rules 1 and 2.
In order to determine what the strings in L are, we might try to use the definition to generate
the first few elements. We know Λ ∈ L. From rule 2 it follows that 0 and 01 are both in L.
Using rule 2 again, we obtain 00, 001, and 0011; one more application produces 000, 0001,
00011, and 000111. After studying these strings, we may be able to guess that the strings in L
are all those of the form 0^i 1^j, where i ≥ j ≥ 0. Let us prove that every string of this form is
in L.
To simplify the notation, let A = {0^i 1^j | i ≥ j ≥ 0}. The statement A ⊆ L, as it stands,
does not involve an integer. Just as we did in Example 2.8, however, we can introduce the
length of a string as an integer on which to base an induction proof.
To prove: A ⊆ L; i.e., for every n ≥ 0, every x ∈ A satisfying |x| = n is an element of L.
Basis step. We must show that every x in A with |x| = 0 is an element of L. This is true
because if |x| = 0, then x = Λ, and statement 1 in the definition of L tells us that Λ ∈ L.
Induction hypothesis. k ≥ 0, and every x in A with |x| = k is an element of L.
Statement to show in induction step. Every x in A with |x| = k + 1 is an element of L.
Proof of induction step. Suppose x ∈ A, and |x| = k + 1. Then x = 0^i 1^j, where
i ≥ j ≥ 0, and i + j = k + 1. We are trying to show that x ∈ L. According to the
definition, the only ways this can happen are for x to be Λ, for x to be 0y for some
y ∈ L, and for x to be 0y1 for some y ∈ L. The first case is impossible, since
|x| = k + 1 > 0. In either of the other cases, we can obtain the conclusion we want if
we know that the string y is in L. Presumably we will show this by using the induction
hypothesis. However, at this point we have a slight problem. If x = 0y1, then the string
y to which we want to apply the induction hypothesis is not of length k, but of length
k — 1. Fortunately, the solution to the problem is easy: Use the strong induction
principle instead, which allows us to use the stronger induction hypothesis. The
statement to be shown in the induction step remains the same.
Induction hypothesis (revised). k ≥ 0, and every x in A with |x| ≤ k is an element of L.
Proof of induction step (corrected version). Suppose x ∈ A and |x| = k + 1. Then
x = 0^i 1^j, where i ≥ j ≥ 0 and i + j = k + 1. We consider two cases. If i > j, then we can
write x = 0y for some y that still has at least as many 0’s as 1’s (i.e., for some
y ∈ A). In this case, since |y| = k, it follows from the induction hypothesis that y ∈ L;
therefore, since x = 0y, it follows from the first part of statement 2 in the
definition of L that x is also an element of L. In the second case, when i = j, we
know that there must be at least one 0 and one 1 in the string. Therefore, x = 0y1 for
some y. Furthermore, y ∈ A, because y = 0^(i−1) 1^(j−1) and i = j. Since |y| < k, it follows
from the induction hypothesis that y ∈ L. Since x = 0y1, we can use the second part of
statement 2 in the definition of L to conclude that x ∈ L.
The two sets L and A are actually equal. We have now proved half of this
statement. The other half, the statement L C A, can also be proved by induction
on the length of the string (see Exercise 2.41). In the next section, however, we
consider another approach to an induction proof, which is more naturally related to
the recursive definition of L and is probably easier.
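The recursive definition of L also yields a recognizer: a nonempty string of L must have been produced by rule 2, either as 0y or as 0y1, so we can peel off symbols and test y recursively. A sketch (the function name is ours):

```python
def in_L(x):
    """Membership in L: Λ is in L; if y is in L, so are 0y and 0y1."""
    if x == "":
        return True                       # rule 1
    if not x.startswith("0"):
        return False                      # every string from rule 2 starts with 0
    # x may have been produced as 0y or as 0y1; try both possibilities
    if in_L(x[1:]):
        return True
    return x.endswith("1") and in_L(x[1:-1])

print(in_L("00011"), in_L("011"))  # True False
```

Trying both cases makes the search exhaustive; by the result of this section, in_L accepts exactly the strings 0^i 1^j with i ≥ j ≥ 0.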
2.5 STRUCTURAL INDUCTION
We have already noticed the very close correspondence between recursive definitions
of functions on the set NV’ and proofs by mathematical induction of properties of
those functions. The correspondence is exact, in the sense that when we want to
prove something about f(k + 1) in the induction step of a proof, we can consult the
definition of f(k + 1), the recursive part of the definition.
When we began to formulate recursive definitions of sets, there was usually no
integer involved explicitly (see Examples 2.15-2.19), and it may have appeared that
any correspondence between the recursive definition and induction proofs involving
elements of the set would be less direct (even though there is a general similarity
between the recursive definition and the induction principle). In this section, we
consider the correspondence more carefully. We can start by identifying an integer
that arises naturally from the recursive definition, which can be introduced for the
purpose of an induction proof. However, when we look more closely at the proof,
we will be able to dispense with the integer altogether, and what remains will be an
induction proof based on the structure of the definition itself.
The example we use to illustrate this principle is essentially a continuation of
Example 2.19.
1. Λ ∈ L.
2. For every x ∈ L, both 0x and 0x1 are in L.
The third statement in the definition of L says that every element of L is obtained by starting
with Λ and applying statement 2 a finite number of times (zero or more).
We also have the language A = {0^i 1^j | i ≥ j ≥ 0}. In Example 2.19, we proved that
A C L, using mathematical induction on the length of the string. Now we wish to prove the
opposite inclusion L C A. As we have already pointed out, there is no reason why using the
length of the string would not work in an induction proof in this direction too—a string in L
has a length, just as every other string does. However, for each element x of L, there is another
integer that is associated with x not just because x is a string, but specifically because x is an
element of L. This is the number of times rule 2 is used in order to obtain x from the definition.
We construct an induction proof, based on this number, that L C A.
To prove: For every n ≥ 0, every x ∈ L obtained by n applications of rule 2 is an element
of A.
Basis step. We must show that if x € L and x is obtained without using rule 2 at all,
then x ∈ A. The only possibility for x in this case is Λ, and Λ ∈ A because Λ = 0^0 1^0.
Induction hypothesis. k ≥ 0, and every string in L that can be obtained by k
applications of rule 2 is an element of A.
Statement to show in induction step. Any string in L that can be obtained by k + 1
applications of rule 2 is in A.
Proof of induction step. Let x be an element of L that is obtained by k + 1 applications
of rule 2. This means that either x = 0y or x = 0y1, where in either case y is a string in
L that can be obtained by using the rule k times. By the induction hypothesis, y ∈ A, so
that y = 0^i 1^j, with i ≥ j ≥ 0. Therefore, either x = 0^(i+1) 1^j or x = 0^(i+1) 1^(j+1), and in
either case x ∈ A.
As simple as this proof is, we can make it even simpler. There is a sense in which the
integers k and k + 1 are extraneous. In the induction step, we wish to show that the new element
x of L, the one obtained by applying rule 2 of the definition to some other element y of L, is
an element of A. It is true that y is obtained by applying the rule k times, and therefore x is
obtained by applying the rule k+ 1 times. The only fact we need in the induction step, however,
is that y ∈ A and for any element y of L that is in A, both 0y and 0y1 are also elements of
A. Once we verify this, k and k + 1 are needed only to make the proof fit the framework of a
standard induction proof.
Why not just leave them out? We still have a basis step, except that instead of thinking
of Λ as the string obtained from zero applications of rule 2, we just think of it as the string in
rule 1, the basis part of the definition of L. In the recursive part of the definition, we apply one
of two possible operations to an element y of L. What we will need to know about the string
y in the induction step is that it is in A, and therefore it is appropriate to designate this as our
induction hypothesis. The induction step is simply to show that for this string y, 0y and 0y1
(the two strings obtained from y by applying the operations in the definition) are both in A.
We can call this version of mathematical induction structural induction. Although there is
an underlying integer involved, just as in an ordinary induction proof, it is usually unnecessary
to mention it explicitly. Instead, the steps of the proof follow the structure of the recursive
definition directly. Below is our modified proof for this example.
To prove: L ⊆ A.

Basis step. Λ ∈ A, because Λ = 0^0 1^0.

Induction hypothesis. y is a string in A.

Statement to show in induction step. Both 0y and 0y1 are in A.

Proof of induction step. Since y ∈ A, y = 0^i 1^j with i ≥ j ≥ 0; therefore
0y = 0^(i+1) 1^j and 0y1 = 0^(i+1) 1^(j+1), both of which are elements of A.
Recall the definition of AE from Example 2.17:

1. i ∈ AE.
2. For any x and y in AE, (x + y) and (x − y) are in AE.
3. No other strings are in AE.
This time we try to show that every element of the set has the property of not containing the
substring )(. As in the previous example, we could set up an induction proof based on the
number of times rule 2 is used in obtaining a string from the definition. Notice that in this
approach we would want the strong principle of induction: If z is obtained by a total of k + 1
applications of rule 2, then either z = (x + y) or z = (x — y), where in either case both x
and y are obtained by k or fewer (not exactly k) applications of rule 2. However, in a proof by
structural induction, since the integer k is not present explicitly, these details are unnecessary.
What we really need to know about x and y is that they both have the desired property, and the
statement that they do is the appropriate induction hypothesis.
To prove: No string in AE contains the substring )(.
Basis step. The string i does not contain the substring )(. (This is obvious.)
Induction hypothesis. x and y are strings that do not contain the substring )(.
Statement to show in induction step. Neither (x + y) nor (x — y) contains the
substring )(.
Proof of induction step. In both the expressions (x + y) and (x — y), the symbol
preceding x is not ), the symbol following x is not (, the symbol preceding y is not ), and
the symbol following y is not (. Therefore, the only way )( could appear would be for it
to occur in x or y separately.
Note that for the sake of simplicity, we made the induction hypothesis weaker than
we really needed to (that is, we proved slightly more than was necessary). In order to
use structural induction, we must show that if x and y are any strings in AE not
containing )(, then neither (x + y) nor (x — y) contains )(. In our induction step, we
showed this not only for x and y in AE, but for any x and y. This simplification is often,
though not always, possible (see Exercise 2.69).
EXAMPLE 2.22  The Language of Strings with More a’s than b’s
1. a ∈ L.
2. For any x ∈ L, ax ∈ L.
3. For any x and y in L, all the strings bxy, xby, and xyb are in L.
4. No other strings are in L.
Let us prove that every element of L has more a’s than b’s. Again we may use the structural
induction principle, and just as in the previous example, we can simplify the induction step by
proving slightly more than is necessary.
Basis step. The string a has more a’s than b’s. (This is obvious.)
Induction hypothesis. x and y are strings containing more a’s than b’s.
Statement to show in induction step. Each of the strings ax, bxy, xby, and xyb has
more a’s than b’s.
Proof of induction step. The string ax clearly has more a’s than b’s, since x does.
Since both x and y have more a’s than b’s, xy has at least two more a’s than b’s, and
therefore any string formed by inserting one more b still has more a’s than b’s.
In Exercise 2.64 you are asked to prove the converse, that every string in {a, b}*
having more a’s than b’s is in the language L.
1. R ⊆ R′.
2. For any x, y,z € S, if (x, y) € R’ and (y, z) € R’, then (x, z) € R’.
3. No other pairs are in R’.
(It makes sense to summarize statements 1 to 3 by saying that R’ is the smallest transitive
relation containing R.) Let us show that if R₁ and R₂ are relations on S with R₁ ⊆ R₂, then
R₁′ ⊆ R₂′.
Structural induction is appropriate here since the statement we want to show says that
every pair in R₁′ satisfies the property of membership in R₂′.
The basis step is to show that every element of R₁ is an element of R₂′. This is true because
R₁ ⊆ R₂ ⊆ R₂′; the first inclusion is our assumption, and the second is just statement 1 in the
definition of R₂′. The induction hypothesis is that (x, y) and (y, z) are elements of R₁′ that are
in R₂′, and in the induction step we must show that (x, z) ∈ R₂′. Once again the argument is
simplified slightly by proving more than we need to. For any pairs (x, y) and (y, z) in R₂′,
whether they are in R₁′ or not, (x, z) ∈ R₂′, because this is exactly what statement 2 in the
definition of R₂′ says.
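The definition of R′ becomes a computation if we apply statement 2 repeatedly until no new pairs appear; a sketch, representing a relation as a set of ordered pairs:

```python
def transitive_closure(R):
    """Smallest transitive relation containing R (the relation R' of the text)."""
    closure = set(R)                      # statement 1: R is contained in R'
    while True:
        # statement 2: if (x, y) and (y, z) are in R', then so is (x, z)
        new_pairs = {(x, z)
                     for (x, y1) in closure
                     for (y2, z) in closure
                     if y1 == y2}
        if new_pairs <= closure:          # nothing new can be added: stop
            return closure
        closure |= new_pairs

R = {(1, 2), (2, 3), (3, 4)}
print(sorted(transitive_closure(R)))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

The monotonicity property just proved is visible here: enlarging the input R can only enlarge the computed closure.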
A recursive definition of a set can also provide a useful way of defining a function
on the set. The function definition can follow the structure of the definition, in much
the same way that a proof using structural induction does. This idea is illustrated by
the next example.
1. Λ ∈ Σ*.
2. For every x ∈ Σ* and every a ∈ Σ, xa ∈ Σ*.
3. No other elements are in Σ*.
Two useful functions on Σ* are the length function, for which we have already given a
nonrecursive definition, and the reverse function rev, which assigns to each string the string
obtained by reversing the order of the symbols. Let us give a recursive definition of each of
these. The length function can be defined as follows:
1. |Λ| = 0.
2. For any x ∈ Σ* and any a ∈ Σ, |xa| = |x| + 1.
For the reverse function we will often use the notation xʳ to stand for rev(x), the reverse of x.
Here is one way of defining the function recursively.
1. Λʳ = Λ.
2. For any x ∈ Σ* and any a ∈ Σ, (xa)ʳ = axʳ.
To convince ourselves that these are both valid definitions of functions on Σ*, we could
construct a proof using structural induction to show that every element of Σ* satisfies the
property of being assigned a unique value by the function.
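Each clause of these two definitions corresponds to one line of code. Here is a Python sketch (ours, for illustration), in which the null string Λ is represented by the empty string:

```python
# Recursive definitions of length and reverse on strings, written so that
# each branch mirrors one statement of the definition on Sigma*.

def length(x):
    if x == "":                 # statement 1: |Lambda| = 0
        return 0
    return length(x[:-1]) + 1   # statement 2: |xa| = |x| + 1

def rev(x):
    if x == "":                 # statement 1: rev(Lambda) = Lambda
        return ""
    return x[-1] + rev(x[:-1])  # statement 2: (xa)^r = a x^r
```

In each case the recursion peels off the last symbol a, exactly as the definition of Σ* builds strings by appending one.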
Let us now show, using structural induction, a useful property of the length function: For
every x and y in Σ*,
|xy| = |x| + |y|
(The structural induction, like the recursive definition of the length function itself, follows the
recursive definition of Σ* above.) The formula seems obvious, of course, from the way we
normally think about the length function. The point is that we do not have to depend on our
intuitive understanding of length; the recursive definition is a practical one to use in discussing
properties of the function. For the proof we choose y as the string on which to base the induction
(see Exercise 2.34). That is, we interpret the statement as a statement about y—namely, that
for every x, |xy| = |x| + |y|. The basis step of the structural induction is to show that this
statement is true when y = Λ. It is, because for every x,
|xΛ| = |x| = |x| + 0 = |x| + |Λ|
The induction hypothesis is that y is a string for which the statement holds, and in the induction
step we consider ya, for an arbitrary a ∈ Σ. We want to show that for any x, |x(ya)| =
|x| + |ya|:
|x(ya)| = |(xy)a| (because concatenation is associative)
= |xy|+1 (by the recursive definition of the length function)
= (|x| + |y|) + 1 (by the induction hypothesis)
= |x| + (|y| + 1) (because addition is associative)
= |x| + |ya| (by the recursive definition of the length function)
In this case, structural induction is not much different from ordinary induction on the length of
y, but a little simpler. The proof involves lengths of strings because we are proving a statement
about lengths of strings; at least we did not have to introduce them gratuitously in order to
provide a framework for an induction proof.
Finally, it is worth pointing out that the idea of structural induction is general
enough to include the ordinary induction principle as a special case. When we prove
that a statement P(n) is true for every n ≥ n₀, we are showing that every element of
the set S = {n | n ≥ n₀} satisfies the property P. The set S can be defined recursively
as follows:
1. n₀ ∈ S.
2. For every n ∈ S, n + 1 ∈ S.
3. No integer is in S unless it can be obtained from rules 1 and 2.
EXERCISES
2.1. Prove that the statements (p ∨ q) → r and (p → r) ∨ (q → r) are logically
equivalent. (See Example 2.6.)
2.2. For each of Examples 2.5 and 2.6, how would you classify the given proof:
constructive, nonconstructive, or something in-between? Why?
2.3. Prove that if a and b are even integers, then ab is even.
2.4. Prove that for any positive integers i, j, and n, if i ∗ j = n, then either
i ≥ √n or j ≥ √n. (See Example 2.2. The statement in that example may
be more obviously useful, since it tells you that for an integer n > 1, if none
of the integers j in the range 2 ≤ j ≤ √n is a divisor of n, then n is prime.)
2.5. In the induction step in Example 2.8, starting with x = 0y1, we used y1 as
the string of length k to which we applied the induction hypothesis. Redo the
induction step, this time using 0y instead.
2.6. Prove the statement in Example 2.8 by using mathematical induction on the
number of 0’s in the string, rather than on the length of the string.
2.7. In Example 2.12, in order to show that every nonempty subset of N has a
smallest element, we chose P(n) to be the statement: Every subset of N
containing n has a smallest element. Consider this alternative choice for
P(n): Every subset of N containing at least n elements has a smallest
element. (We would want to try to prove that this statement P(n) is true for
every n > 1.) Why would this not be an appropriate choice?
In all the remaining exercises in this chapter, with the exception of 46 through
48, “prove” means “prove, using an appropriate version of mathematical induction.”
2.9. Suppose that a₀, a₁, ... is a sequence of real numbers. Prove that for any
n ≥ 1,
∑_{i=1}^{n} (a_i − a_{i−1}) = a_n − a_0
2.10. Prove that for any n ≥ 1,
∑_{i=1}^{n} 1/(i(i + 1)) = n/(n + 1)
2.12. For natural numbers n and i satisfying 0 <i <n, let C(n, i) denote the
number n!/(i!(n — i)!).
a. Show that if 0 < i < n, then C(n, i) = C(n − 1, i − 1) + C(n − 1, i).
(You don’t need mathematical induction for this.)
b. Prove that for any n ≥ 0,
∑_{i=0}^{n} C(n, i) = 2ⁿ
2.13. Suppose r is a real number other than 1. Prove that for any n ≥ 0,
∑_{i=0}^{n} r^i = (r^{n+1} − 1)/(r − 1)
74 PART 1 Mathematical Notation and Techniques
2.14. Prove that for any n ≥ 1,
1 + ∑_{i=1}^{n} i ∗ i! = (n + 1)!
2.15. Prove that for any n ≥ 4, n! > 2ⁿ.
2.16. Prove that if a₀, a₁, ... is a sequence of real numbers so that aₙ < aₙ₊₁ for
every n ≥ 0, then for every m, n ≥ 0, if m < n, then aₘ < aₙ.
2.17. Suppose x is any real number greater than —1. Prove that for any n > 0,
(1 + x)ⁿ ≥ 1 + nx. (Be sure you say in your proof exactly how you use the
assumption that x > —1.)
2.18. A fact about infinite series is that the series ∑_{i=1}^{∞} 1/i diverges (i.e., is
infinite). Prove the following statement, which implies the result: For every
n ≥ 1, there is an integer kₙ ≥ 1 so that ∑_{i=1}^{kₙ} 1/i ≥ n. (Hint: the sum
1/(k + 1) + 1/(k + 2) + ··· + 1/(2k) is at least how big?)
2.19. Prove that for every n > 1,
∑_{i=1}^{n} i ∗ 2^i = (n − 1)2^{n+1} + 2
i=l
2.20. Prove that for every n ≥ 2,
∑_{i=1}^{n} 1/√i > √n
2.21. Prove that for every n > 0, n is either even or odd, but not both. (By
definition, an integer n is even if there is an integer i so that n = 2 ∗ i, and n
is odd if there is an integer i so that n = 2 ∗ i + 1.)
2.22. Prove that for any language L ⊆ {0, 1}*, if L² ⊆ L, then L⁺ ⊆ L.
2.23. Suppose that Σ is an alphabet, and that f : Σ* → Σ* has the property that
f(a) = a for every a ∈ Σ and f(xy) = f(x)f(y) for every x, y ∈ Σ*.
Prove that for every x ∈ Σ*, f(x) = x.
2.24. Prove that for every n ≥ 0, n(n² + 5) is divisible by 6.
2.25. Suppose a and b are integers with 0 < a < b. Prove that for every n > 1,
bⁿ − aⁿ is divisible by b − a.
2.26. Prove that every positive integer is the product of a power of 2 and an odd
integer.
2.27. Suppose that A₁, A₂, ... are sets. Prove that for every n ≥ 1,
(⋃_{i=1}^{n} A_i)′ = ⋂_{i=1}^{n} A_i′
2.28. Prove that for every n ≥ 1, the number of subsets of {1, 2, ..., n} is 2ⁿ.
2.29. Prove that for every n ≥ 1 and every m ≥ 1, the number of functions from
{1, 2, ..., n} to {1, 2, ..., m} is mⁿ.
2.39. In each part below, a recursive definition is given of a subset of {a, b}*. Give
a simple nonrecursive definition in each case. Assume that each definition
includes an implicit last statement: “Nothing is in L unless it can be
obtained by the previous statements.”
a. a ∈ L; for any x ∈ L, xa and xb are in L.
b. a ∈ L; for any x ∈ L, bx and xb are in L.
c. a ∈ L; for any x ∈ L, ax and xb are in L.
d. a ∈ L; for any x ∈ L, xb, xa, and bx are in L.
e. a ∈ L; for any x ∈ L, xb, ax, and bx are in L.
f. a ∈ L; for any x ∈ L, xb and xba are in L.
2.40. Give recursive definitions of each of the following sets.
a. The set N of all natural numbers.
b. The set S of all integers (positive and negative) divisible by 7.
c. The set T of positive integers divisible by 2 or 7.
d. The set U of all strings in {0, 1}* containing the substring 00.
e. The set V of all strings of the form 0^i 1^j, where j ≤ i ≤ 2j.
f. The set W of all strings of the form 0^i 1^j, where i > 2j.
2.41. Let L and A be the languages defined in Example 2.19. Prove that L ⊆ A by
using induction on the length of the string.
2.42. Below are recursive definitions of languages L₁ and L₂, both subsets of
{a, b}*. Prove that each is precisely the language L of all strings not
containing the substring aa.
a. Λ ∈ L₁; a ∈ L₁;
For any x ∈ L₁, xb and xba are in L₁;
Nothing else is in L₁.
b. Λ ∈ L₂; a ∈ L₂;
For any x ∈ L₂, bx and abx are in L₂;
Nothing else is in L₂.
2.43. For each n ≥ 0, we define the strings aₙ and bₙ in {0, 1}* as follows:
a₀ = 0 and b₀ = 1; for every n ≥ 0, aₙ₊₁ = aₙbₙ and bₙ₊₁ = bₙaₙ.
Prove that for every n > 0, the following statements are true.
a. The strings a, and b, are of the same length.
b. The strings a, and b, differ in every position.
c. The strings a₂ₙ and b₂ₙ are palindromes.
d. The string a, contains neither the substring 000 nor the substring 111.
2.44. The “pigeonhole principle” says that if n + 1 objects are distributed among n
pigeonholes, there must be at least one pigeonhole that ends up with more
than one object. A more formal version of the statement is that if f: A > B
and the sets A and B have n + 1 and n elements, respectively, then f cannot
be one-to-one. Prove the second version of the statement.
2.45. The following argument cannot be correct, because the conclusion is false.
Say exactly which statement in the argument is the first incorrect one, and
why it is incorrect.
To prove: all circles have the same diameter. More precisely, for any
n > 1, if S is any set of n circles, then all the elements of S have the same
the same diameter. The basis step is to show that all the circles in a set of 1 circle have
the same diameter, and this is obvious. The induction hypothesis is that
k ≥ 1 and for any set S of k circles, all elements of S have the same diameter.
We wish to show that for this k, and any set S of k + 1 circles, all the circles
in S have the same diameter. Let S = {C₁, C₂, ..., Cₖ₊₁}. Consider the
two sets T = {C₁, C₂, ..., Cₖ} and R = {C₂, C₃, ..., Cₖ₊₁}. T is
simply S with the circle Cₖ₊₁ deleted, and R is S with the element C₁
deleted. Since both T and R are sets of k circles, all the circles in T have
the same diameter, and all the circles in R have the same diameter; these
statements follow from the induction hypothesis. Now observe that Cₖ
is an element of both T and R. If d is the diameter of this circle, then any
circle in T has diameter d, and so does any circle in R. Therefore, all the
circles in S have this same diameter.
2.49. Prove that for every n ≥ 1,
∑_{i=1}^{n} √i > 2n√n/3
2.50. Prove that every integer greater than 17 is a nonnegative integer combination
of 4 and 7. In other words, for every n > 17, there exist integers iₙ and jₙ,
both ≥ 0, so that n = iₙ ∗ 4 + jₙ ∗ 7.
2.51. Prove that for every n ≥ 1, the number of subsets of {1, 2, ..., n} having an
even number of elements is 2”~!.
2.52. Prove that the ordinary principle of mathematical induction implies the
strong principle of mathematical induction. In other words, show that if P(n)
is a statement involving n that we wish to establish for every n > N, and
1. The ordinary principle of mathematical induction is true;
2. P(N) is true;
3. For every k ≥ N, (P(N) ∧ P(N + 1) ∧ ··· ∧ P(k)) → P(k + 1)
then P(n) is true for every n > N.
2.53. Suppose that f is the Fibonacci function (see Example 2.13).
a. Prove that for every n ≥ 0, fₙ = c(aⁿ − bⁿ), where
c = 1/√5,   a = (1 + √5)/2,   and   b = (1 − √5)/2
and the two expressions are equal if and only if b; = a; for every i.
2.56. Consider the following loop, written in pseudocode:
while B do
S
The meaning of this is what you would expect. Test B; if it is true, execute S;
test B again; if it is still true, execute S again; test B again; and so forth. In
other words, continue executing S as long as the condition B remains true.
q = 0;
r = x;
while r ≥ y do
    q = q + 1;
    r = r − y;
(The loop condition B is r ≥ y, and the loop body S is the pair of
assignment statements.) By considering the condition (r ≥ 0) ∧
(x = q ∗ y + r), prove that when this loop terminates, the values of q
and r will be the integer quotient and remainder, respectively, when x is
divided by y; in other words, x = q ∗ y + r and 0 ≤ r < y.
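A Python sketch of this loop (ours; the variable names follow the exercise), with the suggested condition checked explicitly as an invariant on every pass:

```python
# The quotient/remainder loop from the exercise, with the invariant
# (r >= 0) and (x == q*y + r) asserted before each execution of the body.

def divide(x, y):
    """Integer quotient and remainder of x divided by y (x >= 0, y > 0)."""
    q, r = 0, x
    while r >= y:                          # loop condition B
        assert r >= 0 and x == q * y + r   # the invariant holds before S
        q, r = q + 1, r - y                # loop body S
    assert x == q * y + r and 0 <= r < y   # what termination guarantees
    return q, r
```

The final assertion is exactly the conclusion the exercise asks for: the invariant plus the negation of B gives x = q ∗ y + r with 0 ≤ r < y.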
2.57. Suppose f is a function defined on the set of positive integers and satisfying
these two conditions:
(i) f(1) = 1
(ii) for n ≥ 1, f(2n) = f(2n + 1) = 2f(n)
Prove that for every positive integer n, f (n) is the largest power of 2 less
than or equal to n.
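Condition (ii) can be applied directly, since both 2n and 2n + 1 give n under integer division by 2. A Python sketch (ours, for illustration):

```python
# The function f from the exercise, implemented directly from
# conditions (i) and (ii).

def f(n):
    if n == 1:
        return 1          # condition (i): f(1) = 1
    return 2 * f(n // 2)  # condition (ii): f(2n) = f(2n+1) = 2 f(n)
```

For example, f(100) evaluates to 64, the largest power of 2 not exceeding 100.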
2.58. The total time T(n) required to execute a particular recursive sorting
algorithm on an array of n elements is one second if n = 1, and otherwise no
more than Cn + 2T(n/2) for some constant C independent of n. Prove that
if n is any power of 2, say n = 2ᵏ, then T(n) ≤ Cn log₂ n + n.
2.59. The function rev : Σ* → Σ* was defined recursively in Example 2.24.
Using the recursive definition, prove the following facts about rev. (Recall
that rev(x) is also written xʳ.)
a. For any strings x and y in Σ*, (xy)ʳ = yʳxʳ.
b. For any string x ∈ Σ*, (xʳ)ʳ = x.
c. For any string x ∈ Σ* and any n ≥ 0, (xⁿ)ʳ = (xʳ)ⁿ.
2.60. On the one hand, we have a recursive definition of the set pal, given in
Example 2.16. On the other hand, we have a recursive definition of xʳ, the
reverse of the string x, in Example 2.24. Using these definitions, prove that
pal = {x ∈ Σ* | xʳ = x}.
2.61. Prove that the language L defined by the recursive definition below is the set
of all elements of {a, b}* not containing the substring aab.
Λ ∈ L;
For every x € L, xa, bx, and abx are in L;
Nothing else is in L.
2.62. In each part below, a recursive definition is given of a subset of {a, b}*. Give
a simple nonrecursive definition in each case. Assume that each definition
includes an implicit last statement: “Nothing is in L unless it can be
obtained by the previous statements.”
a. Λ, a, and aa are in L; for any x ∈ L, xb, xba, and xbaa are in L.
b. Λ ∈ L; for any x ∈ L, ax, xb, and xba are in L.
2.63. Suppose L ⊆ {0, 1}*. Two languages L₁ and L₂ are defined recursively
below. (These two definitions were both given in Example 2.15 as possible
definitions of L*.)
Definition of L₁:
1. Λ ∈ L₁;
2. For any x ∈ L₁ and any y ∈ L, xy ∈ L₁;
3. Nothing else is in L₁.
Definition of L₂:
1. Λ ∈ L₂;
2. For any x ∈ L₂ and any y ∈ L, yx ∈ L₂;
3. Nothing else is in L₂.
PART 2
Regular Languages and Finite Automata
CHAPTER 3
Regular Languages
and Finite Automata
85
86 PART 2 Regular Languages and Finite Automata
Definition 3.1 Regular Languages and Regular Expressions over Σ
The set R of regular languages over Σ, and the corresponding regular expressions, are defined as follows:
1. ∅ is an element of R, and the corresponding regular expression is ∅.
2. {Λ} is an element of R, and the corresponding regular expression is Λ.
3. For each a ∈ Σ, {a} is an element of R, and the corresponding regular expression is a.
4. If L₁ and L₂ are elements of R, with corresponding regular expressions r₁ and r₂, then
a. L₁ ∪ L₂ is an element of R, with corresponding regular expression (r₁ + r₂);
b. L₁L₂ is an element of R, with corresponding regular expression (r₁r₂);
c. L₁* is an element of R, with corresponding regular expression (r₁*).
Only the languages that can be obtained from statements 1-4 are regular languages over Σ.
The empty language is included in the definition primarily for the sake of con-
sistency. There will be a number of places where we will want to say things like
“To every something-or-other, there corresponds a regular language,” and without
the language ∅ we would need to make exceptions for trivial special cases.
Our definition of regular expressions is really a little more restrictive in several
respects than we need to be in practice. We use notation such as L² for languages,
and it is reasonable to use similar shortcuts in the case of regular expressions. Thus
we sometimes write (r²) to stand for the regular expression (rr), (r⁺) to stand for
the regular expression ((r*)r), and so forth. You should also note that the regular
CHAPTER 3 Regular Languages and Finite Automata 87
expressions we get from the definition are fully parenthesized. We will usually relax
this requirement, using the same rules that apply to algebraic expressions: The Kleene
* operation has the highest precedence and + the lowest, with concatenation in be-
tween. This rule allows us to write a + b*c, for example, instead of (a + ((b*)c)).
Just as with algebraic expressions, however, there are times when parentheses are
necessary. The regular expression (a + b)* is a simple example, since the languages
corresponding to (a + b)* and a + b* are not the same.
Let us agree to identify two regular expressions if they correspond to the same
language. At the end of the last paragraph, for example, we could simply have said
(a + b)* ≠ a + b*
instead of saying that the two expressions correspond to different languages. With this
convention we can look at a few examples of rules for simplifying regular expressions
over {0, 1}:
1* + Λ = 1*
1*1* = 1*
0* + 1* = 1* + 0*
(0*1*)* = (0+ 1)*
(0+ 1)*01(0 + 1)* + 1*0* = (0+ 1)*
(All five are actually special cases of more general rules. For example, for any two
regular expressions r and s, (r*s*)* = (r + s)*.) These rules are really statements
about languages, which we could have considered in Chapter 1. The last one is
probably the least obvious. It says that the language of all strings of 0’s and 1’s (the
right side) can be expressed as the union of two languages, one containing all the
strings having the substring 01, the other containing all the others. (Saying that all
the 1’s precede all the 0’s is the same as saying that 01 is not a substring.)
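Identities like these can be checked experimentally on all short strings. In the Python sketch below (ours; Python's re module writes | for the + of regular expressions), two expressions are compared on every string over {0, 1} of length at most 8:

```python
# A brute-force check of regular-expression identities: compare which
# strings over {0,1} of length up to max_len each expression matches.
import re
from itertools import product

def language(regex, max_len=8):
    strings = (''.join(p) for n in range(max_len + 1)
               for p in product('01', repeat=n))
    return {s for s in strings if re.fullmatch(regex, s)}

# (0*1*)* = (0+1)*
assert language('(0*1*)*') == language('(0|1)*')
# (0+1)*01(0+1)* + 1*0* = (0+1)*
assert language('(0|1)*01(0|1)*|1*0*') == language('(0|1)*')
```

Such a check is evidence, not proof: agreement on strings up to length 8 does not by itself establish the identity for all strings.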
Although there are times when it is helpful to simplify a regular expression as
much as possible, we will not attempt a systematic discussion of the algebra of regular
expressions. Instead, we consider a few more examples.
0*(10*10*)*10*
This is a way of showing the last 1 in the string explicitly, slightly different from the third
regular expression in this example.
(0 + 1)(0 + 1)(0 + 1)(0 + 1)(0 + 1)
or, in our extended notation, (0 + 1)⁵. To reduce the length, however, we may simply allow
some or all of the factors to be Λ. We may therefore describe L by the regular expression
(0 + 1 + Λ)⁵
L = {x ∈ {0, 1}* | x ends with 1 and does not contain the substring 00}
In order to find a regular expression for L, we try stating the defining property of strings in L in
other ways. Saying that a string does not contain the substring 00 is the same as saying that no
0 can be followed by 0, or in other words, every 0 either comes at the very end or is followed
immediately by 1. Since strings in L cannot have 0 at the end, every 0 must be followed by 1.
This means that copies of the strings 01 and 1 can account for the entire string and therefore
that every string in L corresponds to the regular expression (1 + 01)*. This regular expression
is a little too general, however, since it allows the null string. The definition says that strings in
L must end with 1, and this is stronger than saying they cannot end with 0. We cannot fix the
problem by adding a 1 at the end, to obtain (1 + 01)*1, because now our expression is not general
enough; it does not allow 01. Allowing this choice at the end, we obtain (1 + 01)*(1 + 01), or
(1 + 01)⁺.
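The reasoning can be spot-checked mechanically. This Python sketch (ours) compares the expression against the defining property of L on all strings of length at most 8, again writing | for +:

```python
# Check that (1+01)*(1+01) describes exactly the strings ending in 1
# with no occurrence of the substring 00.
import re
from itertools import product

def in_L(s):
    return s.endswith('1') and '00' not in s

for n in range(9):
    for p in product('01', repeat=n):
        s = ''.join(p)
        assert bool(re.fullmatch('(1|01)*(1|01)', s)) == in_L(s)
```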
Let us use the abbreviation l (for "letter") to stand for
a + b + ··· + z + A + B + ··· + Z
and d (for "digit") to stand for
0 + 1 + 2 + ··· + 9
An identifier in the C programming language is any string of length 1 or more that contains
only letters, digits, and underscores (_) and begins with a letter or an underscore. Therefore, a
regular expression for the language of all C identifiers is
d+_)jd+d+_)*
Suppose we keep the abbreviations / and d as in the previous example and introduce the
additional abbreviations s (for “sign”) and p (for “point”). The symbol s is shorthand for
Λ + a + m, where a is "plus" and m is "minus." Consider the regular expression
sdd*(pdd* + pdd*Esdd* + Esdd*)
(Here E is not an abbreviation, but one of the symbols in the alphabet.) A typical string
corresponding to this regular expression has this form: first a sign (plus, minus, or neither);
one or more digits; then either a decimal point and one or more digits, which may or may
not be followed by an E, a sign, and one or more digits, or just the E, the sign, and one or more
digits. (If nothing else, you should be convinced by now that one regular expression is often
worth several lines of prose.) This is precisely the specification for a real "literal," or constant,
in the Pascal programming language. If the constant is in exponential format, no decimal point
is needed. If there is a decimal point, there must be at least one digit immediately preceding
and following it.
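As a check of this description, here is a translation of the literal format into Python's re module (our sketch, built from the prose above: an optional sign plays the role of s, a literal decimal point the role of p, and a digit class the role of d, with "one or more digits" in each group):

```python
# Pascal real literals: sign, digits, then either a decimal point and
# digits (optionally followed by E, sign, digits) or just E, sign, digits.
import re

s = r'[+-]?'   # s: Lambda + plus + minus
d = r'[0-9]'   # d: a single digit
pascal_real = re.compile(
    s + d + d + r'*'                  # sign and one or more digits
    + r'(?:'
    + r'\.' + d + d + r'*'            # point and one or more digits ...
    + r'(?:E' + s + d + d + r'*)?'    # ... optionally E, sign, digits
    + r'|E' + s + d + d + r'*'        # or just E, sign, digits
    + r')'
)

def is_pascal_real(t):
    return bool(pascal_real.fullmatch(t))
```

For example, "3.14" and "-1E+7" are accepted, while "12" (no point or exponent) and "1." (no digit after the point) are not.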
The only string not accounted for is A. However, there is no need to distinguish between
A and a string ending with 1: Neither is in the language, because neither ends with 0, and
once we get one more symbol we will not remember enough to distinguish the resulting strings
anyway, because they will both end with the same symbol. The conclusion is that there are
only two cases (either the string ends with 0 or it does not), and at each step we must remember
only which case we have currently.
This time, let L = {0, 1}*{11}, the language of all strings in {0, 1}* ending with 11. We can eas-
ily formulate an algorithm for recognizing L in which we remember only the last two symbols
of the input string. This time, in fact, we can get by with even a little less.
First, it is not sufficient to remember only whether the current string ends with 11. For
example, suppose the algorithm does not distinguish between a string ending in 01 and one
ending in 00, on the grounds that neither ends in 11. Then if the next input is 1, the algorithm will
not be able to distinguish between the two new strings, which end in 11 and 01, respectively.
This is not correct, since only one of these strings is in L. The algorithm must remember
enough now to distinguish between 01 and 00, so that it will be able if necessary to distinguish
between 011 and 001 one symbol later.
Two strings ending in 00 and 10, however, do not need to be distinguished. Neither string
is in L, and no matter what the next symbol is, the two resulting strings will have the same last
two symbols. For the same reason, the string 1 can be identified with any string ending in 01.
Finally, the two strings 0 and A can be identified with all the other strings ending in 0: In
all these cases, at least two more input symbols are required to produce an element of L, and
at that point it will be unnecessary to remember anything but the last two symbols.
Any algorithm recognizing L and following the rules we have adopted must distinguish
the following three cases, and it is sufficient for the algorithm to remember which of these the
current string represents:
EXAMPLE 3.10 Strings with an Even Number of 0's and an Odd Number of 1's
Consider the language of strings x in {0, 1}* for which n₀(x) is even and n₁(x) is odd. One
way to get by with remembering less than the entire current string would be to remember just
the numbers of 0’s and 1’s we have read, ignoring the way the symbols are arranged. For
example, it is not necessary to remember whether the current string is 011 or 101. However,
remembering this much information would still require us to consider an infinite number of
distinct cases, and an algorithm that remembers much less information can still work correctly.
There is no need to distinguish between the strings 011 and 0001111, for example: The current
answers are both “no,” and the answers will continue to be the same, no matter what input
symbols we get from now on. In the case of 011 and 001111, the current answers are also both
“no”; however, these two strings must be distinguished, since if the next input is 1, the answer
should be “no” in the first case but “yes” in the second.
The reason 011 and 0001111 can be treated the same is that both have an odd number of
0’s and an even number of 1’s. The reason 011 and 001111 must be distinguished is that one
has an odd number of 0’s, the other an even number. It is essential to remember the parity (i.e.,
even or odd) of both the number of 0’s and the number of 1’s, and this is also sufficient. Once
again, we have four distinct cases, and the only information about an input string that we must
remember is which of these cases it represents.
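An algorithm that remembers only these four cases can be written in a few lines; the Python sketch below (ours) keeps just the two parities:

```python
# Recognizing strings with an even number of 0's and an odd number of
# 1's, remembering only the parity of each count (four cases in all).

def even0_odd1(x):
    n0_even, n1_even = True, True     # parities for the empty string
    for c in x:
        if c == '0':
            n0_even = not n0_even
        else:
            n1_even = not n1_even
    return n0_even and not n1_even

# even0_odd1("011") is False; even0_odd1("1") is True.
```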
L = {x ∈ {0, 1}* | x ends in 1 and does not contain the substring 00}
Suppose that in the course of processing an input string, we have seen the string s so far. If s
already contains the substring 00, then that fact is all we need to remember; s is not in L, and
no matter what input we get from here on, the result will never be in L. Let us denote this case
by the letter N.
Consider next two other cases, in both of which 00 has not yet occurred: case 0, in which
the last symbol of s is 0, and case 1, in which the last symbol is 1. In the first case, if the
next input is 0 we have case N, and if the next input is 1 we have case 1. Starting in case
1, the inputs 0 and 1 take us to cases 0 and 1, respectively. These three cases account for all
strings except Λ. This string, however, must be distinguished from all the others. It would
not be correct to say that the null string corresponds to case N, because unlike that case there
are possible subsequent inputs that would give us a string in L. Λ does not correspond to case
0, because if the next input is 0 the answers should be different in the two cases; and it does
not correspond to case 1, because the current answers should already be different.
Once again we have managed to divide the set {0, 1}* into four types of strings so that in
order to recognize strings in L it is sufficient at each step to remember which of the four types
we have so far.
Figure 3.1 |
(a) Strings ending in 0; (b) Strings with next-to-last symbol 0; (c) Strings ending
with 11; (d) Strings with no even and n, odd; (e) A recognition algorithm for the
language in Example 3.4.
machine that would work as follows: The machine is at any time in one of four
possible states, which we have arbitrarily labeled A, 0, 1, and N. When it is activated
initially, it is in state A. The machine receives successive inputs of 0 or 1, and as a
result of being in a certain state and receiving a certain input, it moves to the state
specified by the corresponding arrow. Finally, certain states are accepting states (state
1 is the only one in this example). A string of 0’s and 1’s is in L if and only if the
state the machine is in as a result of processing that string is an accepting state.
It seems reasonable to refer to something of this sort as a “machine,” since
one can visualize an actual piece of hardware that works according to these rough
specifications. The specifications do not say exactly how the hardware works—
exactly how the input is transmitted to the machine, for example, or whether a "yes"
answer corresponds to a flashing light or a beep. For that matter, the “machine”
might exist only in software form, so that the strings are input data to a program. The
Definition 3.2 A Finite Automaton
A finite automaton, or finite-state machine (abbreviated FA), is a 5-tuple
M = (Q, Σ, q₀, A, δ), where Q is a finite set of states, Σ is a finite input
alphabet, q₀ ∈ Q is the initial state, A ⊆ Q is the set of accepting states,
and δ : Q × Σ → Q is the transition function.
If you have not run into definitions like this before, you might enjoy what the
mathematician R. P. Boas had to say about them, in an article in The American
Mathematical Monthly (88: 727-731, 1981) entitled "Can We Make Mathematics
Intelligible?":
There is a test for identifying some of the future professional mathematicians at an early
age. These are students who instantly comprehend a sentence beginning "Let X be an
ordered quintuple (a, T, π, σ, B), where ..." They are even more promising if they add,
"I never really understood it before."
Whether or not you “instantly comprehend” a definition of this type, you can
appreciate the practical advantages. Specifying a finite automaton requires that we
specify five things: two sets Q and Σ, an element q₀ and a subset A of Q, and a
function from Q × Σ to Q. Defining a finite automaton to be a 5-tuple may seem
strange at first, but it is simply efficient use of notation. It allows us to talk about the
five things at once as though we are talking about one “object”; it will allow us to say
Let M = (Q, Σ, q₀, A, δ) be an FA
instead of
Let M be an FA with state set Q, input alphabet Σ, initial state q₀, set of
accepting states A, and transition function δ.
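The 5-tuple view also suggests how little code is needed to run an arbitrary FA. In this Python sketch (ours; the transition function is stored as a dictionary on Q × Σ), the example machine accepts strings with an odd number of 1's:

```python
# A generic acceptance routine for an FA given as a 5-tuple
# (Q, Sigma, q0, A, delta): process the string symbol by symbol and
# check whether the final state is accepting.

def fa_accepts(Q, Sigma, q0, A, delta, x):
    q = q0
    for a in x:
        assert a in Sigma       # inputs must come from the alphabet
        q = delta[(q, a)]       # the transition function Q x Sigma -> Q
    return q in A

# Example: states 'e' (even number of 1's so far) and 'o' (odd).
delta = {('e', '0'): 'e', ('e', '1'): 'o',
         ('o', '0'): 'o', ('o', '1'): 'e'}
# fa_accepts({'e','o'}, {'0','1'}, 'e', {'o'}, delta, "1101") -> True
```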
We can go one step further, using the same reasoning as before. In this new FA, the rows
for states Λ and A are the same, and neither Λ nor A is an accepting state. We can therefore
Figure 3.2 |
A finite automaton recognizing {0, 1}*{10}. (a) Transition diagram;
(b) Transition table.
Figure 3.3 |
A simplified finite automaton recognizing {0, 1}*{10}: (a) Transition
diagram; (b) Transition table.
include Λ in the state A as well. The three resulting states are A, B, and 10, and it is easy to see
that the number cannot be reduced any more. The final result and the corresponding transition
table are pictured in Figure 3.3.
We might describe state A as representing "no progress toward 10," meaning that the
machine has received either no input at all, or an input string ending with 0 but not 10. B
stands for “halfway there,” or last symbol 1. As we observed after Example 3.11, however,
these descriptions of the states are not needed to specify the abstract machine completely.
The analysis that led to the simplification of the FA in this example can be turned into a
systematic procedure for minimizing the number of states in a given FA. See Section 5.2 for
more details.
Definition 3.3 The Extended Transition Function δ*
Let M = (Q, Σ, q₀, A, δ) be an FA. The function δ* : Q × Σ* → Q is defined as follows:
1. For any q ∈ Q, δ*(q, Λ) = q.
2. For any q ∈ Q, y ∈ Σ*, and a ∈ Σ, δ*(q, ya) = δ(δ*(q, y), a).
Figure 3.4 |
The point is that the recursive definition is the best way to capture this intuitive idea
in a formal definition. Using this definition to calculate δ*(q, x) amounts to just what
you would expect, given what we wanted the function δ* to represent: processing the
symbols of x, one at a time, and seeing where the transition function δ takes us at
each step. Suppose for example that M contains the transitions shown in Figure 3.4.
Let us use Definition 3.3 to calculate δ*(q, abc):
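Each line below peels off the last input symbol using statement 2 of the definition, until the basis statement δ*(q, Λ) = q applies:

```latex
\begin{align*}
\delta^*(q, abc) &= \delta(\delta^*(q, ab), c)\\
&= \delta(\delta(\delta^*(q, a), b), c)\\
&= \delta(\delta(\delta(\delta^*(q, \Lambda), a), b), c)\\
&= \delta(\delta(\delta(q, a), b), c)
\end{align*}
```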
Note that in the calculation above, it was necessary to calculate δ*(q, a) by using the
recursive part of the definition, since the basis part involves δ*(q, Λ). Fortunately,
δ*(q, a) turned out to be δ(q, a); for strings of length 1 (i.e., elements of Σ), δ and δ*
can be used interchangeably. For a string x with |x| ≠ 1, however, writing δ(q, x) is
incorrect, because the pair (q, x) does not belong to the domain of δ.
Other properties you would expect 6* to satisfy can be derived from our definition.
For example, a natural generalization of statement 2 of the definition is the formula
δ*(q, xy) = δ*(δ*(q, x), y)
Figure 3.5 |
An FA that accepts every string: its single state is accepting, and every input symbol leaves the state unchanged.
Notice what the last statement in the definition does not say. It does not say that
L is accepted by M if every string in L is accepted by M. If it did, we could use
the FA in Figure 3.5 to accept any language, no matter how complex. The power
of a machine does not lie in the number of strings it accepts, but in its ability to
discriminate—to accept some and reject others. In order to accept a language L, an
FA has to accept all the strings in L and reject all the strings in L’.
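Definition 3.3 lends itself directly to a short program. The Python sketch below is ours, not the book's: it encodes a hand-built three-state FA for the language of Example 3.9 (the strings over {0, 1} ending in 10) and evaluates δ* by the recursive definition, then uses it to decide acceptance.

```python
# A sketch of Definitions 3.3 and 3.4 (the dictionary encoding and the
# state names A, B, C are ours).  The machine accepts the strings over
# {0, 1} that end in 10; A is initial and C is the only accepting state.

delta = {
    ('A', '0'): 'A', ('A', '1'): 'B',   # A: no progress toward a final "10"
    ('B', '0'): 'C', ('B', '1'): 'B',   # B: the last symbol read was 1
    ('C', '0'): 'A', ('C', '1'): 'B',   # C: the last two symbols were 10
}

def delta_star(q, x):
    """delta*(q, x): basis part delta*(q, Lambda) = q;
    recursive part delta*(q, ya) = delta(delta*(q, y), a)."""
    if x == '':                          # x is the null string Lambda
        return q
    return delta[(delta_star(q, x[:-1]), x[-1])]

def accepts(x):
    # x is accepted iff delta*(q0, x) is an accepting state
    return delta_star('A', x) in {'C'}

print(accepts('010'), accepts('000'))   # True False
```

Unwinding the recursion processes the symbols of x one at a time, exactly as in the informal description above.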
The terminology introduced in Definition 3.4 allows us to record officially the
following fact, which we have mentioned already but will not prove until Chapter 4.

Theorem 3.1
A language L ⊆ Σ* is regular if and only if there is a finite automaton that accepts L.

This theorem says on the one hand that if M is any FA, there is a regular expression
corresponding to the language L(M); and on the other hand, that given a regular
expression, there is an FA that accepts the corresponding language. The proofs in
Chapter 4 will actually give us ways of constructing both these things. Until then,
many examples are simple enough that we can get by without a formal algorithm.
Figure 3.6 |
A finite automaton accepting {00}*{11}*.
CHAPTER 3 Regular Languages and Finite Automata 101
The state labeled D serves the same purpose in this example that N did in Example 3.11.
Once the FA reaches D, it stays in that state; a string x for which δ*(A, x) = D cannot be a
prefix of any element of L.
The easiest way to reach the other accepting state B from A is with the string 11. Once the
FA is in state B, any even-length string of 1’s returns it to B, and these are the only strings that
do this. Therefore, if x is a string that causes the FA to go from A to B without revisiting A, x
must be of the form 11(11)^k for some k ≥ 0. The most general type of string that causes the FA
to reach state B is a string of this type preceded by one of the form (00)^j, and a corresponding
regular expression is (00)*11(11)*.
By combining the two cases (the strings x for which δ*(A, x) = A and those for which
δ*(A, x) = B), we conclude that the language L corresponds to the regular expression (00)* +
(00)*11(11)*, which can be simplified to
(00)*(11)*
Figure 3.7 |
A finite automaton M accepting
{a, b}*{baaa}.
string accepted by M must end in baaa. The language L(M) is the set of all strings ending in
baaa, and a regular expression corresponding to L(M) is (a + b)*baaa.
This is a roundabout way of arriving at a regular expression. It depends on our noticing
certain distinctive features of the FA, and it is not clear that the approach will be useful for
other machines. An approach that might seem more direct is to start at state A and try to
build a regular expression as we move toward E. Since δ(A, a) = A, we begin the regular
expression with a*. Since δ(A, b) = B and δ(B, b) = B, we might write a*bb* next. Now the
symbol a takes us to C, and we try a*bb*a. At this point it starts to get complicated, however,
because we can now go back to B, loop some more with input b, then return to C—and we
can repeat this any number of times. This might suggest a*bb*a(bb*a)*. As we get closer to
E, there are more loops, and loops within loops, to take into account, and the formulas quickly
become unwieldy. We might or might not be able to carry this to completion, and even if we
can, the resulting formula will be complicated. We emphasize again that we do not yet have
a systematic way to solve these problems, and there is no need to worry at this stage about
complicated examples.
Figure 3.8 |
Strings containing either ab or bba.
nonaccepting states is supposed to represent. First, being in a nonaccepting state means that
we have not yet received one of the two desired strings. Being in the initial state q0 should
presumably mean that we have made no progress at all toward getting one of these two strings.
It seems as though any input symbol at all represents some progress, however: An a is at least
potentially the first symbol in the substring ab, and b might be the first symbol in bba. This
suggests that once we have at least one input symbol, we should never need to return to the
initial state.
It is not correct to say that p is the state the FA should be in if the last input symbol
received was an a, because there are arrows labeled a that go to state t. It should be possible,
however, to let p represent the state in which the last input symbol was a and we have not yet
seen either ab or bba. If we are already in this state and the next input symbol we receive is
a, then nothing has really changed; in other words, δ(p, a) should be p.
We can describe the states r and s similarly. If the last input symbol was b, and it was not
preceded by a b, and we have not yet arrived in the accepting state, we should be in state r.
We should be in state s if we have just received two consecutive b’s but have not yet reached
the accepting state. What should δ(r, a) be? The b that got us to state r is no longer doing us
any good, since it was not followed by b. In other words, it looked briefly as though we were
making progress toward getting the string bba, but now it appears that we are not. We do not
have to start over, however, because at least we have an a. We conclude that δ(r, a) = p. We
can also see that δ(s, b) = s: If we are in state s and get input b, we have not made any further
progress toward getting bba, but neither have we lost any ground.
At this point, we have managed to define the missing transitions in Figure 3.8a without
adding any more states, and thus we have an FA that accepts the language L. The transition
diagram is shown in Figure 3.8b.
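The transitions worked out above can be collected into a table and checked mechanically. In this Python sketch (the dictionary encoding and the brute-force check are ours; Figure 3.8b is the authoritative diagram), t is the accepting state:

```python
from itertools import product

# The FA of Figure 3.8b, with the transition table assembled from the
# discussion above; t is the accepting state, and it is absorbing.
delta = {
    ('q0', 'a'): 'p', ('q0', 'b'): 'r',
    ('p',  'a'): 'p', ('p',  'b'): 't',   # an a followed by b: ab found
    ('r',  'a'): 'p', ('r',  'b'): 's',   # the b was wasted, but we have an a
    ('s',  'a'): 't', ('s',  'b'): 's',   # bb followed by a: bba found
    ('t',  'a'): 't', ('t',  'b'): 't',
}

def accepts(x):
    q = 'q0'
    for c in x:
        q = delta[(q, c)]
    return q == 't'

# Check the machine against the defining property on all short strings.
for n in range(8):
    for w in product('ab', repeat=n):
        x = ''.join(w)
        assert accepts(x) == ('ab' in x or 'bba' in x)
print('agrees on all strings of length < 8')
```

The loop confirms that the completed machine accepts exactly the strings containing ab or bba, at least on every string of length less than 8.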
r = (11 + 110)*0
and try to construct an FA accepting the corresponding language L. In the previous example,
our preliminary guess at the structure of the FA turned out to provide all the states we needed.
Here it will not be so straightforward. We will just proceed one symbol at a time, and for each
new transition try to determine whether it can go to a state that is already present or whether it
will require a new state.
The null string is not in L, which tells us that the initial state q0 should not be an accepting
state. The string 0, however, is in L, so that from q0 the input symbol 0 must take us to an
accepting state. The string 1 is not in L; furthermore, 1 must be distinguished from Λ, because
the subsequent input string 110 should take us to an accepting state in one case and not in the
other (i.e., 110 ∈ L and 1110 ∉ L). At this point, we have determined that we need at least the
states in Figure 3.9a.
The language L contains 0 but no other strings beginning with 0. It also contains no
strings beginning with 10. It is appropriate, therefore, to introduce a state s that represents all
the strings that fail for either reason to be a prefix of an element of L (Figure 3.9b). Once our
FA reaches the state s, which is not an accepting state, it never leaves this state.
Figure 3.9 |
A finite automaton accepting {11, 110}*{0}.
We must now consider the situation in which the FA is in state r and receives the input 1.
It should not stay in state r, because the strings 1 and 11 need to be distinguished (for example,
110 ∈ L, but 1110 ∉ L). It should not return to the initial state, because Λ and 11 need to be
distinguished. Therefore, we need a new state t. From t, the input 0 must lead to an accepting
state, since 110 ∈ L. This accepting state cannot be p, because 110 is a prefix of a longer string
in L and 0 is not. Let u be the new accepting state. If the FA receives a 0 in state u, then we
have the same situation as an initial 0: The string 1100 is in L but is not a prefix of any longer
string in L. So we may let δ(u, 0) = p. We have yet to define δ(t, 1) and δ(u, 1). States t and
u can be thought of as “the end of one of the strings 11 and 110.” (The reason u is accepting
is that 110 can also be viewed as 11 followed by 0.) In either case, if the next symbol is 1, we
should think of it as the first symbol in another occurrence of one of these two strings. This
means that it is appropriate to define δ(t, 1) = δ(u, 1) = r, and we arrive at the FA shown in
Figure 3.9c.
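Because Python's re module understands expressions of this kind, the finished machine can be cross-checked against r = (11 + 110)*0 itself. A sketch (the table is assembled from the construction above; Figure 3.9c is the authoritative diagram, and p and u are the accepting states):

```python
import re
from itertools import product

# The FA of Figure 3.9c: s is the dead state, p and u are accepting.
delta = {
    ('q0', '0'): 'p', ('q0', '1'): 'r',
    ('p',  '0'): 's', ('p',  '1'): 's',   # 0 extends to no longer string in L
    ('r',  '0'): 's', ('r',  '1'): 't',   # 10 is a prefix of nothing in L
    ('t',  '0'): 'u', ('t',  '1'): 'r',   # 11 complete; a 1 starts a new block
    ('u',  '0'): 'p', ('u',  '1'): 'r',   # 110 complete
    ('s',  '0'): 's', ('s',  '1'): 's',
}

def accepts(x):
    q = 'q0'
    for c in x:
        q = delta[(q, c)]
    return q in ('p', 'u')

pattern = re.compile(r'(?:11|110)*0\Z')   # the language of r = (11 + 110)*0
for n in range(12):
    for w in product('01', repeat=n):
        x = ''.join(w)
        assert accepts(x) == bool(pattern.match(x))
print('FA and regular expression agree on all strings of length < 12')
```

Of course this only tests strings up to a fixed length, but it is a quick way to catch a wrong transition in a hand-built machine.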
The procedure we have followed here may seem hit-or-miss. We continued to add states
as long as it was necessary, stopping only when all transitions from every state had been drawn
and went to states that were already present. Theorem 3.1 is the reason we can be sure that
the process will eventually stop. If we used the same approach for a language that was not
regular, we would never be able to stop: No matter how many states we created, defining the
transitions from those states would require yet more states. The step that is least obvious and
most laborious in our procedure is determining whether a given transition needs to go to a
new state, and if not, which existing state is the right one. The algorithm that we develop in
Chapter 4 uses a different approach that avoids this difficulty.
Definition 3.5
Let L be a language in Σ*, and let x be any string in Σ*. The set L/x is defined by

L/x = {z ∈ Σ* | xz ∈ L}

Two strings x and y are said to be distinguishable with respect to L if
L/x ≠ L/y. Any string z that is in one of the two sets but not the other (i.e.,
for which xz ∈ L and yz ∉ L, or vice versa) is said to distinguish x and y
with respect to L. If L/x = L/y, x and y are indistinguishable with respect
to L.
In order to show that two strings x and y are distinguishable with respect to a
language L, it is sufficient to find one string z so that either xz ∈ L and yz ∉ L,
or xz ∉ L and yz ∈ L (in other words, so that z is in one of the two sets L/x and
L/y but not the other). For example, if L is the language in Example 3.9, the set of
all strings in {0, 1}* that end in 10, we observed that 00 and 01 are distinguishable
with respect to L, because we can choose z to be the string 0; that is, 000 ∉ L and
010 ∈ L. The two strings 0 and 00 are indistinguishable with respect to L, because
the two sets L/0 and L/00 are equal; each is just the set L itself. (The only way 0z
or 00z can end in 10 is for z to have this property.)
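A distinguishing string z, when one exists, can be found by brute force. In the sketch below (the function name and the search bound are our own choices), the language is supplied as a membership predicate; failing to find a z up to the bound is evidence, not a proof, of indistinguishability.

```python
from itertools import product

def distinguishing_string(x, y, in_L, alphabet='01', max_len=6):
    """Search for a z with in_L(x+z) != in_L(y+z), i.e., a z that lies
    in exactly one of the sets L/x and L/y."""
    for n in range(max_len + 1):
        for w in product(alphabet, repeat=n):
            z = ''.join(w)
            if in_L(x + z) != in_L(y + z):
                return z
    return None                 # no witness found up to the bound

ends_in_10 = lambda w: w.endswith('10')   # the language of Example 3.9

print(distinguishing_string('00', '01', ends_in_10))   # '0': 000 not in L, 010 in L
print(distinguishing_string('0', '00', ends_in_10))    # None: indistinguishable
```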
Theorem 3.2
Suppose M = (Q, Σ, q0, A, δ) is an FA accepting the language L ⊆ Σ*. If x and y
are two strings in Σ* that are distinguishable with respect to L, then δ*(q0, x) ≠ δ*(q0, y).

Proof
The assumption that x and y are distinguishable with respect to L means
that there is a string z in one of the two sets L/x and L/y but not the other. In other
words, one of the two strings xz and yz is in L and the other is not. Because we are
also assuming that M accepts L, it follows that one of the two states δ*(q0, xz) and
δ*(q0, yz) is an accepting state and the other is not. In particular, δ*(q0, xz) ≠ δ*(q0, yz).
On the other hand, by the formula above,

δ*(q0, xz) = δ*(δ*(q0, x), z)
δ*(q0, yz) = δ*(δ*(q0, y), z)

Because the left sides of these two equations are unequal, the right sides must be, and
therefore δ*(q0, x) ≠ δ*(q0, y).
It follows that if there are n strings in Σ*, any two of which are distinguishable with
respect to L, then any FA accepting L must have at least n states.
Ln = {x ∈ {0, 1}* | |x| ≥ n and the nth symbol from the right in x is 1}
δ(abc, d) = bcd
The accepting states are the four for which the third symbol from the end is 1.
Figure 3.10 |
A finite automaton accepting L3.
We might ask whether there is some simpler FA, perhaps using a completely different
approach, that would cut the number of states to a number more manageable than 2^(n+1) − 1.
Although the number can be reduced just as in Example 3.12, we can see from Theorem 3.2
that any FA recognizing Ln must have at least 2^n states. To do this, we show that any two
strings of length n (of which there are 2^n in all) are distinguishable with respect to Ln.
Let x and y be two distinct strings of length n. They must differ in the ith symbol (from
the left), for some i with 1 ≤ i ≤ n. For the string z that we will use to distinguish these
two strings with respect to Ln, we can choose any string of length i − 1. Then xz and yz still
differ in the ith symbol, and now the ith position is precisely the nth from the right. In other
words, one of the strings xz and yz is in Ln and the other is not, which implies that L/x ≠ L/y.
Therefore, x and y are distinguishable with respect to Ln.
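For a fixed small n the argument can be checked exhaustively. A sketch for n = 3 (the helper name in_Ln is ours): for each pair of distinct strings of length 3, the string z = 0^(i−1), where i is a position in which they differ, distinguishes them with respect to L3.

```python
from itertools import product

n = 3
def in_Ln(w):
    # membership in Ln: |w| >= n and the nth symbol from the right is 1
    return len(w) >= n and w[-n] == '1'

strings = [''.join(w) for w in product('01', repeat=n)]   # all 2^n of them
for x in strings:
    for y in strings:
        if x == y:
            continue
        k = next(j for j in range(n) if x[j] != y[j])  # differing position, 0-based
        z = '0' * k                                    # |z| = i - 1 in the text's terms
        assert in_Ln(x + z) != in_Ln(y + z)            # z distinguishes x and y
print('all', len(strings), 'strings of length 3 are pairwise distinguishable')
```

Appending z of length k moves the differing position to exactly the nth place from the right, which is what the assertion verifies.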
Theorem 3.3
The language pal of palindromes over {0, 1} cannot be accepted
by any finite automaton, and it is therefore not regular.

Proof
As we described above, we will show that for any two distinct strings x
and y in {0, 1}*, x and y are distinguishable with respect to pal. To show
this, we consider first the case when |x| = |y|, and we let z = x^r. Then
xz = xx^r, which is in pal, and yz is not. If |x| ≠ |y|, we may as well
assume that |x| < |y|, and we let y = y1y2, where |y1| = |x|. Again,
we look for a string z so that xz ∈ pal and yz ∉ pal. Any z of the form
In Chapter 5 we will consider other nonregular languages and find other methods
for demonstrating that a language is nonregular. Definition 3.5 will also come up again
in Chapter 5; the indistinguishability relation can be used in an elegant description of
a “minimum-state” FA recognizing a given regular language.
accepting states so that the strings accepted are the ones we want. For the language
L1 ∪ L2, for example, x should be accepted if it is in either L1 or L2; this means that
the state (p, q) should be an accepting state if either p or q is an accepting state of
its respective FA. For the languages L1 ∩ L2 and L1 − L2, the accepting states of the
machine are defined similarly.
Theorem 3.4
Suppose M1 = (Q1, Σ, q1, A1, δ1) and M2 = (Q2, Σ, q2, A2, δ2) are finite automata
accepting L1 and L2, respectively. Let M be the FA (Q, Σ, q0, A, δ), where Q = Q1 × Q2,
q0 = (q1, q2), and δ((p, q), a) = (δ1(p, a), δ2(q, a)) for any p ∈ Q1, q ∈ Q2, and a ∈ Σ. Then:
1. If A = {(p, q) | p ∈ A1 or q ∈ A2}, M accepts the language L1 ∪ L2.
2. If A = {(p, q) | p ∈ A1 and q ∈ A2}, M accepts the language L1 ∩ L2.
3. If A = {(p, q) | p ∈ A1 and q ∉ A2}, M accepts the language L1 − L2.

Proof
In each case, the proof depends on the formula

δ*((p, q), x) = (δ1*(p, x), δ2*(q, x))

which can be proved by structural induction on x (Exercise 3.32). If A is as in statement 1,
for example, then x is accepted by M precisely when δ*((q1, q2), x) = (δ1*(q1, x), δ2*(q2, x)) ∈ A;
that is, when δ1*(q1, x) ∈ A1 or δ2*(q2, x) ∈ A2, which is to say when x ∈ L1 or x ∈ L2. The other two cases
are similar.
It often happens, as in the following example, that the FA we need for one of
these three languages is even simpler than the construction in Theorem 3.4 would
seem to indicate.
of {0, 1}*. The languages L1 and L2 are recognized by the FAs in Figure 3.11a.
The construction in the theorem, for any of the three cases, produces an FA with nine
states. In order to draw the transition diagram, we begin with the initial state (A, P). Since
δ1(A, 0) = B and δ2(P, 0) = Q, we have δ((A, P), 0) = (B, Q). Similarly, δ((A, P), 1) =
(A, P). Next we calculate δ((B, Q), 0) and δ((B, Q), 1). As we continue this process, as
Figure 3.11 |
Constructing a finite automaton to accept L1 − L2.
soon as a new state is introduced, we calculate the transitions from this state. After a few steps
we obtain the partial diagram in Figure 3.11b. We now have six states; each of them can be
reached from (A, P) as a result of some input string, and every transition from one of these
six goes to one of these six. We conclude that the other three states are not reachable from the
initial state, and therefore that they can be left out of our FA (Exercise 3.29).
Suppose now that we want our FA to recognize the language L1 − L2. Then we designate
as our accepting states those states (X, Y) from among the six for which X is either A or B
and Y is not R. These are (A, P) and (B, Q), and the resulting FA is shown in Figure 3.11c.
In fact, we can simplify this FA even further. None of the states (C, P), (C, Q), or (C, R)
is an accepting state, and once the machine enters one of these states, it remains in one of
them. Therefore, we may replace all of them with a single state, obtaining the FA shown in
Figure 3.11d.
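The reachable-states-only version of the construction in Theorem 3.4 is easy to program. The two machines in this sketch are our own small illustrations, not those of Figure 3.11: M1 accepts the strings with an even number of 0's, M2 those ending in 1, and the product machine accepts L1 − L2.

```python
# Product construction for L1 - L2, generating only the reachable pairs.
M1 = {'start': 'E', 'accept': {'E'},            # even number of 0's
      'delta': {('E','0'):'O', ('E','1'):'E', ('O','0'):'E', ('O','1'):'O'}}
M2 = {'start': 'N', 'accept': {'Y'},            # strings ending in 1
      'delta': {('N','0'):'N', ('N','1'):'Y', ('Y','0'):'N', ('Y','1'):'Y'}}

def difference_fa(M1, M2, alphabet='01'):
    start = (M1['start'], M2['start'])
    delta, states, frontier = {}, {start}, [start]
    while frontier:                              # explore reachable pairs only
        p, q = frontier.pop()
        for a in alphabet:
            nxt = (M1['delta'][(p, a)], M2['delta'][(q, a)])
            delta[((p, q), a)] = nxt
            if nxt not in states:
                states.add(nxt)
                frontier.append(nxt)
    accept = {(p, q) for (p, q) in states
              if p in M1['accept'] and q not in M2['accept']}   # L1 - L2
    return start, delta, accept

start, delta, accept = difference_fa(M1, M2)

def accepts(x):
    q = start
    for c in x:
        q = delta[(q, c)]
    return q in accept

print(accepts('00'), accepts('001'))   # True False
```

Changing the one-line definition of accept gives the union and intersection cases of the theorem.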
EXERCISES
3.1. In each case, find a string of minimum length in {0, 1}* not in the language
corresponding to the given regular expression.
ar 8 (OL)50°
ba = l*) (O81)O* 1%)
cr0* 007)"1*
dy" (O-F 10)"1*
3.2. Consider the two regular expressions
Cit)" ey"
(aa*bb*)* = Λ + a(a + b)*b
3.17. For each of the FAs pictured in Figure 3.12, describe, either in words or by
writing a regular expression, the strings that cause the FA to be in each state.
3.18. Let x be a string in {0, 1}* of length n. Describe an FA that accepts the string
x and no other strings. How many states are required?
Figure 3.12 |
3.19. For each of the languages in Exercise 3.9, draw an FA recognizing the
language.
3.20. For each of the following regular expressions, draw an FA recognizing the
corresponding language.
(0+ 1)*0
Ci? 10).
(0 + 1)*(1 + 00)(0 + 1)*
(111 + 100)*0
O-- 10* + 0170
(0 + 1)*(01 + 110)
pie
Noa
3.21. Draw an FA that recognizes the language of all strings of 0’s and 1’s of
length at least 1 that, if they were interpreted as binary representations of
integers, would represent integers evenly divisible by 3. Your FA should
accept the string 0 but no other strings with leading 0’s.
3.22. Suppose M is the finite automaton (Q, Σ, q0, A, δ).
a. Using mathematical induction or structural induction, show that for any
x and y in Σ*, and any q ∈ Q,
δ*(q, xy) = δ*(δ*(q, x), y)
Two reasonable approaches are to base the induction on x and to base it
on y. One, however, works better than the other.
b. Show that if for some state q, δ(q, a) = q for every a ∈ Σ, then
δ*(q, x) = q for every x ∈ Σ*.
c. Show that if for some state q and some string x, δ*(q, x) = q, then for
every n ≥ 0, δ*(q, x^n) = q.
3.23. If L is a language accepted by an FA M, then there is an FA accepting L with
more states than M (and therefore there is no limit to the number of states an
FA accepting L can have). Explain briefly why this is true.
3.24. Show by an example that for some regular language L, any FA recognizing
L must have more than one accepting state. Characterize those regular
languages for which this is true.
3.25. For the FA pictured in Figure 3.11d, show that there cannot be any other FA
with fewer states accepting the same language.
3.26. Let z be a fixed string of length n over the alphabet {0, 1}. What is the
smallest number of states an FA can have if it accepts the language
{0, 1}*{z}? Prove your answer.
3.27. Suppose L is a subset of {0, 1}*. Does an infinite set of distinguishable pairs
with respect to L imply an infinite set that is pairwise distinguishable with
respect to L? In particular, if x0, x1, ... is a sequence of distinct strings in
{0, 1}* so that for any n ≥ 0, xn and xn+1 are distinguishable with respect to
L, does it follow that L is not regular? Either give a proof that it does follow,
or provide an example of a regular language L and a sequence of strings x0,
x1, ... with this property.
3.28. Let L ⊆ {0, 1}* be an infinite language, and for each n ≥ 0, let
Ln = {x ∈ L | |x| = n}. Denote by s(n) the number of states an FA must
have in order to accept Ln. What is the smallest that s(n) can be if Ln ≠ ∅?
Give an example of an infinite language L so that for every n satisfying
Ln ≠ ∅, s(n) is this minimum number.
3.29. Suppose M = (Q, Σ, q0, A, δ) is an FA. If p and q are elements of Q, we
say q is reachable from p if there is a string x ∈ Σ* so that δ*(p, x) = q.
Let M1 be the FA obtained from M by deleting any states that are not
reachable from q0 (and any transitions to or from such states). Show that M1
and M recognize the same language. (Note: we might also try to simplify M
by eliminating the states from which no accepting state is reachable. The
result might be a pair (q, a) ∈ Q × Σ for which no transition is defined, so
that we do not have an FA. In Chapter 4 we will see that this simplification
can still be useful.)
3.30. Let M = (Q, Σ, q0, A, δ) be an FA. Let M1 = (Q, Σ, q0, R, δ), where R is
the set of states in Q from which some element of A is reachable (see
Exercise 3.29). What is the relationship between the language recognized by
M1 and the language recognized by M? Prove your answer.
3.31. Suppose M = (Q, Σ, q0, A, δ) is an FA, and suppose that there is some
string z so that for every q ∈ Q, δ*(q, z) ∈ A. What conclusion can you
draw, and why?
3.32. (For this problem, refer to the proof of Theorem 3.4.) Show that for any
x ∈ Σ* and any (p, q) ∈ Q, δ*((p, q), x) = (δ1*(p, x), δ2*(q, x)).
3.33. Let M1, M2, and M3 be the FAs pictured in Figure 3.13, recognizing
languages L1, L2, and L3, respectively.
Figure 3.13 |
Ln = {x ∈ {0, 1}* | |x| ≥ n and the nth symbol from the right in x is 1}
It is shown in that example that any FA recognizing Ln must have at least 2^n
states. Draw an FA with four states that recognizes L2. For any n ≥ 1,
describe how to construct an FA with 2^n states that recognizes Ln.
3.48. Let n be a positive integer and L = {x ∈ {0, 1}* | |x| = n and n0(x) =
n1(x)}. What is the minimum number of states in any FA that recognizes L?
Give reasons for your answer.
3.49. Let n be a positive integer and L = {x ∈ {0, 1}* | |x| = n and n0(x) <
n1(x)}. What is the minimum number of states in any FA that recognizes L?
Give reasons for your answer.
3.50. Let n be a positive integer, and let L be the set of all strings in pal of length
2n. In other words,
L = {xx^r | x ∈ {0, 1}^n}
Figure 3.14 |
transitions among the states of the first FA are the same as those among the
corresponding states of the other. For example, if δ1 and δ2 are the transition
functions,
In general, if M1 = (Q1, Σ, q1, A1, δ1) and M2 = (Q2, Σ, q2, A2, δ2) are
FAs and i : Q1 → Q2 is a bijection (i.e., one-to-one and onto), we say that i
is an isomorphism from M1 to M2 if these conditions are satisfied:
(i) i(q1) = q2
(ii) for any q ∈ Q1, i(q) ∈ A2 if and only if q ∈ A1
(iii) for every q ∈ Q1 and every a ∈ Σ, i(δ1(q, a)) = δ2(i(q), a)
and we say M1 is isomorphic to M2 if there is an isomorphism from M1 to
M2. This is simply a precise way of saying that M1 and M2 are "essentially
the same."
a. Show that the relation ~ on the set of FAs over Σ, defined by M1 ~ M2
if M1 is isomorphic to M2, is an equivalence relation.
b. Show that if i is an isomorphism from M1 to M2 (notation as above),
then for every q ∈ Q1 and x ∈ Σ*,
i(δ1*(q, x)) = δ2*(i(q), x)
c. Show that two isomorphic FAs accept the same language.
d. How many one-state FAs over the alphabet {0, 1} are there, no two of
which are isomorphic?
e. How many pairwise nonisomorphic two-state FAs over {0, 1} are there,
in which both states are reachable from the initial state and at least one
state is accepting?
f. How many distinct languages are accepted by the FAs in the previous
part?
g. Show that the FAs described by these two transition tables are
isomorphic. The states are 1-6 in the first, A-F in the second; the initial
states are 1 and A, respectively; the accepting states in the first FA are 5
and 6, and D and E in the second.
Figure 4.1 |
A simpler approach to accepting {11, 110}*{0}.
see that any path starting at q0 and ending at q4 corresponds to a string matching the regular
expression.
Let us now consider the ways in which the diagram in Figure 4.1 differs from that of an
FA, and the way in which it should be interpreted if we are to view it as a language-accepting
device. There are two apparently different problems. From some states there are not transitions
for both input symbols (from state q4 there are no transitions at all); and from one state, q0,
there is more than one arrow corresponding to the same input.
The way to interpret the first of these features is easy. The absence of an a-transition
from state q means that from q there is no input string beginning with a that can result in the
device’s being in an accepting state. We could create a transition by introducing a “dead” state
to which the device could go from q on input a, having the property that once the device gets to
that state it can never leave it. However, leaving this transition out makes the picture simpler,
and because it would never be executed during any sequence of moves leading to an accepting
state, leaving it out does not hurt anything except that it violates the rules.
The second violation of the rules for FAs seems to be more serious. The diagram indicates
two transitions from q0 on input 1. It does not specify an unambiguous action for that state-
input combination, and therefore apparently no longer represents a recognition algorithm or a
language-recognizing machine.
As we will see shortly, there is an FA that operates in such a way that it simulates the
diagram correctly and accepts the right language. However, even as it stands, we can salvage
much of the intuitive idea of a machine from the diagram, if we are willing to give the machine
an element of nondeterminism. An ordinary finite automaton is deterministic: The moves it
makes while processing an input string are completely determined by the input symbols and the
state it starts in. To the extent that a device is nondeterministic, its behavior is unpredictable.
There may be situations (i.e., state-input combinations) in which it has a choice of possible
moves, and in these cases it selects a move in some unspecified way. One way to describe this
is to say that it guesses a move.
CHAPTER 4 Nondeterminism and Kleene’s Theorem 125
We are almost ready for a formal definition of the abstract device illustrated by
Figure 4.1b. The only change we will need to make to the definition of an FA involves
the transition function. As we saw in Example 4.1, for a particular combination of
state and input symbol, there may be no states specified to which the device should go,
or there may be several. All we have to do in order to accommodate both these cases
is to let the value of the transition function be a set of states—possibly the empty set,
possibly one with several elements. In other words, our transition function 6 will still
be defined on Q × Σ but will now have values in 2^Q (the set of subsets of Q). Our
interpretation will be that δ(q, a) represents the set of states that the device can legally be in,
as a result of being in state q at the previous step and then processing input symbol a.
Just as in the case of FAs, it is useful to extend the transition function δ from
Q × Σ to the larger set Q × Σ*. For an NFA M, δ* should be defined so that δ*(p, x)
is the set of states M can legally be in as a result of starting in state p and processing
the symbols in the string x. The recursive formula for FAs was

δ*(q, ya) = δ(δ*(q, y), a)
A simpler way to say "M can get from p_{i−1} to p_i by processing a_i" (or "p_i is one of
the states to which M can get from p_{i−1} by processing a_i") is

p_i ∈ δ(p_{i−1}, a_i)
We may therefore define the function 5* as follows.
δ*(p, a_1 a_2 ··· a_n) is the set of all states q ∈ Q for which there is a sequence of states p = p_0, p_1,
..., p_n = q satisfying

p_i ∈ δ(p_{i−1}, a_i) for each i with 1 ≤ i ≤ n
Although the sequence of statements above was intended specifically to say that
p_n ∈ δ*(p_0, a_1 a_2 ··· a_n), there is nothing to stop us from looking at intermediate
points along the way and observing that for each i ≥ 1, p_i ∈ δ*(p, a_1 a_2 ··· a_i). In
particular, if y = a_1 a_2 ··· a_{n−1}, then

p_{n−1} ∈ δ*(p, y)
In other words, if q ∈ δ*(p, ya_n), then there is a state r (namely, r = p_{n−1}) in the
set δ*(p, y) so that q ∈ δ(r, a_n). It is also clear that this argument can be reversed:
If q ∈ δ(r, a_n) for some r ∈ δ*(p, y), then we may conclude from the definition that
q ∈ δ*(p, ya_n). The conclusion is the recursive formula we need:

δ*(p, ya) = ∪ {δ(r, a) | r ∈ δ*(p, y)}
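This recursive formula translates directly into a program. The sketch below (the dictionary encoding and helper names are ours) uses the NFA of Figure 4.2, with states q0 through q3 and q3 accepting; a missing move is represented by the empty set.

```python
delta = {
    ('q0', '0'): {'q0'}, ('q0', '1'): {'q0', 'q1'},
    ('q1', '0'): {'q2'}, ('q1', '1'): {'q2'},
    ('q2', '0'): {'q3'}, ('q2', '1'): {'q3'},
    ('q3', '0'): set(),  ('q3', '1'): set(),
}

def delta_star(p, x):
    """delta*(p, Lambda) = {p}; delta*(p, ya) is the union of the sets
    delta(r, a) over all r in delta*(p, y)."""
    if x == '':
        return {p}
    result = set()
    for r in delta_star(p, x[:-1]):
        result |= delta[(r, x[-1])]
    return result

def accepts(x):
    # accepted iff delta*(q0, x) contains an accepting state
    return bool(delta_star('q0', x) & {'q3'})

print(sorted(delta_star('q0', '11')))   # ['q0', 'q1', 'q2']
print(accepts('111'), accepts('011'))   # True False
```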
Figure 4.2 |
Then M can be represented by the transition diagram in Figure 4.2. Let us try to determine
L(M) by calculating δ*(q0, x) for a few strings x of increasing length. First observe that from
the nonrecursive definition of δ* it is almost obvious that δ and δ* agree for strings of length 1
(see also Exercise 4.3). We see from the table that δ*(q0, 0) = {q0} and δ*(q0, 1) = {q0, q1};
δ*(q0, 11) = ∪ {δ(r, 1) | r ∈ δ*(q0, 1)}
           = ∪ {δ(r, 1) | r ∈ {q0, q1}}
           = δ(q0, 1) ∪ δ(q1, 1)
           = {q0, q1} ∪ {q2}
           = {q0, q1, q2}

δ*(q0, 01) = ∪ {δ(r, 1) | r ∈ δ*(q0, 0)}
           = ∪ {δ(r, 1) | r ∈ {q0}}
           = δ(q0, 1)
           = {q0, q1}

δ*(q0, 111) = ∪ {δ(r, 1) | r ∈ δ*(q0, 11)}
            = δ(q0, 1) ∪ δ(q1, 1) ∪ δ(q2, 1)
            = {q0, q1, q2, q3}

δ*(q0, 011) = ∪ {δ(r, 1) | r ∈ δ*(q0, 01)}
            = δ(q0, 1) ∪ δ(q1, 1)
            = {q0, q1, q2}
We observe that 111 is accepted by M and 011 is not. You can see if you study the diagram in
Figure 4.2 that δ*(q0, x) contains q1 if and only if x ends with 1; that for any y with |y| = 2,
δ*(q0, xy) contains q3 if and only if x ends with 1; and, therefore, that the language recognized
by M is

{x ∈ {0, 1}* | |x| ≥ 3 and the third symbol of x from the right is 1}
This is the language we called L3 in Example 3.17, the set of strings with length at least 3
having a 1 in the third position from the end. By taking the diagram in Figure 4.2 as a model,
you can easily construct for any n ≥ 1 an NFA with n + 1 states that recognizes Ln. Since we
showed in Example 3.17 that any ordinary FA accepting Ln needs at least 2^n states, it is now
clear that an NFA recognizing a language may have considerably fewer states than the simplest
FA recognizing the language.
The formulas above may help to convince you that the recursive definition of δ* is a
workable tool for investigating the NFA, and in particular for testing a string for membership
in the language L(M). Another approach that provides an effective way of visualizing the
behavior of the NFA as it processes a string is to draw a computation tree for that string. This
is just a tree diagram tracing the choices the machine has at each step, and the paths through
the states corresponding to the possible sequences of moves. Figure 4.3 shows the tree for the
NFA above and the input string 101101.
Starting in q0, M can go to either q0 or q1 using the first symbol of the string. In either
case, there is only one choice at the next step, with the symbol 0. This tree shows several types
of paths: those that end prematurely (with fewer than six moves), because M arrives in a state
from which there is no move corresponding to the next input symbol; those that contain six
Figure 4.3 |
A computation tree for the NFA in Figure 4.2, as it processes
101101.
moves and end in a nonaccepting state; and one path of length 6 that ends in the accepting state
and therefore shows the string to be accepted.
It is easy to read off from the diagram the sets δ*(q0, y) for prefixes y of x. For example,
δ*(q0, 101) = {q0, q1, q3}, because these three states are the ones that appear at that level of
the tree. Deciding whether x is accepted is simply a matter of checking whether any accepting
states appear in the tree at level |x| (assuming that the root of the tree, the initial state, is at
level 0).
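A level-by-level traversal is just as easy to program: the set of states at level k of the computation tree is exactly δ*(q0, y) for the length-k prefix y. A sketch, again for the NFA of Figure 4.2 (the encoding is ours):

```python
delta = {
    ('q0', '0'): {'q0'}, ('q0', '1'): {'q0', 'q1'},
    ('q1', '0'): {'q2'}, ('q1', '1'): {'q2'},
    ('q2', '0'): {'q3'}, ('q2', '1'): {'q3'},
    ('q3', '0'): set(),  ('q3', '1'): set(),
}

def levels(x, q0='q0'):
    """Yield the level sets of the computation tree for input x."""
    level = {q0}
    yield level
    for a in x:
        # the next level is the union of the moves from the current one
        level = set().union(*(delta[(q, a)] for q in level))
        yield level

level_sets = list(levels('101101'))
print(sorted(level_sets[3]))    # ['q0', 'q1', 'q3'] = delta*(q0, 101)
print('q3' in level_sets[6])    # True: an accepting state at level |x|
```

Checking whether the input is accepted amounts to checking the level at depth |x| for an accepting state, as described above.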
We now want to show that although it may be easier to construct an NFA accepting
a given language than to construct an FA, nondeterministic finite automata as a group
are no more powerful than FAs: Any language that can be accepted by an NFA can
also be recognized by a (possibly more complicated) FA.
We have discussed the fact that a (deterministic) finite automaton can be in-
terpreted as an algorithm for recognizing a language. Although an NFA might not
represent an algorithm directly, there are certainly algorithms that can determine for
any string x whether or not there is a sequence of moves that corresponds to the
symbols of x and leads to an accepting state. Looking at the tree diagram in Fig-
ure 4.3, for example, suggests some sort of tree traversal algorithm, of which there
are several standard ones. A depth-first traversal corresponds to checking the possible
paths sequentially, following each path until it stops, and then seeing if it stops in an
accepting state after the correct number of steps. A breadth-first, or level-by-level,
traversal of the tree corresponds in some sense to testing the paths in parallel.
The question we are interested in, however, is not whether there is an algorithm for
recognizing the language, but whether there is one that corresponds to a (deterministic)
finite automaton. We will show that there is by looking carefully at the definition of
an NFA, which contains a potential mechanism for eliminating the nondeterminism
directly.
The definition of a finite automaton, either deterministic or nondeterministic,
involves the idea of a state. The nondeterminism present in an NFA appears whenever
there are state-input combinations for which there is not exactly one resulting state.
In a sense, however, the nondeterminism in an NFA is only apparent, and arises from
the notion of state that we start with. We will be able to transform the NFA into
an FA by redefining state so that for each combination of state and input symbol,
exactly one state results. The way to do this is already suggested by the definition
of the transition function of an NFA. Corresponding to a particular state-input pair,
we needed the function to have a single value, and the way we accomplished this
was to make the value a set. All we have to do now is carry this idea a little further
and consider a state in our FA to be a subset of Q, rather than a single element of Q.
(There is a partial precedent for this in the proof of Theorem 3.4, where we considered
states that were pairs of elements of Q.) Then corresponding to the "state" S and the
input symbol a (i.e., to the set of all the pairs (p, a) for which p ∈ S), there is exactly
one "state" that results: the union of all the sets δ(p, a) for p ∈ S. All of a sudden,
the nondeterminism has disappeared! Furthermore, the resulting machine simulates
the original device in an obvious way, provided that we define the initial and final
states correctly.
CHAPTER 4 Nondeterminism and Kleene’s Theorem 131
This technique is important enough to have acquired a name, the subset construction: States in the FA are subsets of the state set of the NFA.
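As a sketch of how the construction can be carried out mechanically, the following Python fragment (my own encoding, not the book's) generates only the subsets reachable from {q₀}; the driver NFA is illustrative, not the one from Example 4.2.

```python
from collections import deque

def subset_construction(delta, start, accepting, alphabet):
    """States of the resulting FA are frozensets of NFA states; only
    subsets reachable from {start} are generated."""
    start_set = frozenset([start])
    fa_delta, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            # the single resulting "state": union of delta(p, a) over p in S
            T = frozenset().union(*(delta.get((p, a), set()) for p in S))
            fa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    fa_accepting = {S for S in seen if S & accepting}
    return fa_delta, start_set, fa_accepting

# Hypothetical NFA (strings ending in "01"):
delta = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q1", "1"): {"q2"}}
fa_delta, s0, acc = subset_construction(delta, "q0", {"q2"}, "01")
print(len({S for (S, a) in fa_delta}))  # 3 subset-states are reachable
```

Generating only reachable subsets keeps the FA small in practice, even though in the worst case all 2^|Q| subsets can arise.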
132 PART 2 Regular Languages and Finite Automata
It is important to realize that the proof of the theorem provides an algorithm (the
subset construction) for removing the nondeterminism from an NFA. Let us illustrate
the algorithm by returning to the NFA in Example 4.2.
δ₁({q₀}, 0) = {q₀}
δ₁({q₀}, 1) = {q₀, q₁}
δ₁({q₀, q₁}, 0) = δ({q₀}, 0) ∪ δ({q₁}, 0) = {q₀} ∪ {q₂} = {q₀, q₂}
δ₁({q₀, q₁}, 1) = δ({q₀}, 1) ∪ δ({q₁}, 1) = {q₀, q₁} ∪ {q₂} = {q₀, q₁, q₂}
δ₁({q₀, q₂}, 0) = δ({q₀}, 0) ∪ δ({q₂}, 0) = {q₀} ∪ {q₃} = {q₀, q₃}
δ₁({q₀, q₂}, 1) = {q₀, q₁} ∪ {q₃} = {q₀, q₁, q₃}
It turns out that in the course of the calculation, eight distinct states (i.e., sets) arise. We
knew already from the discussion in Example 3.17 that at least this many would be necessary.
Figure 4.4 |
The subset construction applied to the NFA in Figure 4.2.
Therefore, although we should not expect this to happen in general, the calculation in this case
produces the FA with the fewest possible states recognizing the desired language. It is shown
in Figure 4.4.
You can recognize this as the FA shown in Figure 4.1a by substituting p for {q₂}, r for {q₁, q₂}, s for ∅, t for {q₀, q₃}, and u for {q₀, q₂}. (Strictly speaking, you should also substitute q₀ for {q₀}.)
Figure 4.5 |
(a) An NFA accepting {0}{1}*; (b) An NFA accepting {0}*{1}.
question is just how to connect the two. The string 0 takes M₁ from q₀ to q₁; since 0 is not
itself an element of L₁L₂, we do not expect the state q₁ to be an accepting state in M. We
might consider the possible ways to interpret an input symbol once we have reached the state
q₁. At that point, a 0 (if it is part of a string that will be accepted) can only be part of the 0*
term from the second language. This might suggest connecting q₁ to p₀ by an arrow labeled 0;
this cannot be the only connection, because a string in L does not require a 0 at this point. The
symbol 1 in state q₁ could be either part of the 1* term from the first language, which suggests
a connecting arrow from q₁ to p₀ labeled 1, or the 1 corresponding to the second language,
which suggests an arrow from q₁ to p₁ labeled 1. These three connecting arrows turn out to
be enough, and the resulting NFA is shown in Figure 4.6a.
We introduced a procedure in Chapter 3 to take care of the union of two languages provided
we have an FA for each one, but it is not hard to see that the method is not always satisfactory
if we have NFAs instead (Exercise 4.12). One possible NFA to handle L₁ ∪ L₂ is shown in
Figure 4.6b. This time it may not be obvious whether making q₀ the initial state, rather than
p₀, is a natural choice or simply an arbitrary one. The label 1 on the connecting arrow from q₀
to p₁ is the 1 in the regular expression 0*1, and the 0 on the arrow from q₀ to p₀ is part of the
0* in the same regular expression. The NFA accepts the language corresponding to the regular
expression 01* + 1 + 00*1, which can be simplified to 01* + 0*1.
For both L₁L₂ and L₁ ∪ L₂, we have found relatively simple composite NFAs, and it is
possible in both cases to see how the two original diagrams are incorporated into the result.
However, in both cases the structure of the original diagrams has been obscured somewhat in
the process of combining them, because the extra arrows used to connect the two depend on
the regular expressions being combined, not just on the combining operation. We do not yet
seem to have a general method that will work for two arbitrary NFAs.
One way to solve this problem, it turns out, is to combine the two NFAs so as to create
a nondeterministic device with even a little more freedom in guessing. In the case of
concatenation, for example, we will allow the new device to guess while in state q₁ that it will
receive no more inputs that are to be matched with 1*. In making this guess it commits itself
to proceeding to the second part of the regular expression 01*0*1; it will be able to make this
guess, however, without any reference to the actual structure of the second part; in particular,
without receiving any input (or, what amounts to the same thing, receiving only the null string
Λ as input). The resulting diagram is shown in Figure 4.7a.
In the second case, we provide the NFA with an initial state that has nothing to do with
either M₁ or M₂, and we allow it to guess before it has received any input whether it will be
Figure 4.6 |
(a) An NFA accepting {0}{1}*{0}*{1}; (b) An NFA accepting {0}{1}* ∪ {0}*{1}.
Figure 4.7 |
(a) An NFA-Λ accepting {0}{1}*{0}*{1}; (b) An NFA-Λ accepting {0}{1}* ∪ {0}*{1}.
looking for an input string in L₁ or one in L₂. Making this guess requires only Λ as input and
allows the machine to begin operating exactly like M₁ or M₂, whichever is appropriate. The
result is shown in Figure 4.7b.
To summarize, the devices in Figure 4.7 are more general than NFAs in that they are
allowed to make transitions, not only on input symbols from the alphabet, but also on null
inputs.
Figure 4.8 |
We know in advance that the set Λ(S) is finite. As a result, we can translate the
recursive definition into an algorithm for calculating Λ(S) (Exercise 2.70).
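One way to render that algorithm is the usual fixed-point loop: keep adding Λ-reachable states until nothing new appears, which must happen because Λ(S) is finite. In this Python sketch the transition function is hypothetical, with Λ encoded as the empty string:

```python
def lambda_closure(S, delta):
    """Lambda-closure of a set of states S: add states reachable by
    Lambda-transitions until nothing new appears (finite, so this stops)."""
    closure = set(S)
    frontier = set(S)
    while frontier:
        frontier = set().union(*(delta.get((q, ""), set()) for q in frontier)) - closure
        closure |= frontier
    return closure

# Hypothetical Lambda-transitions: q0 -Λ-> p, p -Λ-> t
delta = {("q0", ""): {"p"}, ("p", ""): {"t"}}
print(lambda_closure({"q0"}, delta))  # {'q0', 'p', 't'} (printed order may vary)
```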
The Λ-closure of a set is the extra ingredient we need to define the function δ*
recursively. If δ*(q, y) is the set of all the states that can be reached from q using the
symbols of y as well as Λ-transitions, then

⋃_{r ∈ δ*(q, y)} δ(r, a)

is the set of states we can reach in one more step by using the symbol a, and the
Λ-closure of this set includes any additional states that we can reach with subsequent
Λ-transitions.
Let M = (Q, Σ, q₀, A, δ) be an NFA-Λ. The extended transition function
δ* : Q × Σ* → 2^Q is defined as follows:
1. For every q ∈ Q, δ*(q, Λ) = Λ({q}).
2. For every q ∈ Q, y ∈ Σ*, and a ∈ Σ,
   δ*(q, ya) = Λ( ⋃_{r ∈ δ*(q, y)} δ(r, a) )
A string x ∈ Σ* is accepted by M if δ*(q₀, x) ∩ A ≠ ∅. The language recognized by
M is the set L(M) of all strings accepted by M.
δ*(q₀, Λ) = Λ({q₀}) = {q₀, p, t}
δ*(q₀, 0) = Λ( ⋃_{p ∈ δ*(q₀, Λ)} δ(p, 0) ) = {p, u}
Figure 4.9 |
The NFA-Λ for Example 4.6.
δ*(q₀, 01) = Λ( ⋃_{p ∈ δ*(q₀, 0)} δ(p, 1) )
= Λ(δ(p, 1) ∪ δ(u, 1))
= Λ({r})
= {r}
δ*(q₀, 010) = Λ(δ(r, 0))
= Λ({s})
= {s, w, q₀, p, t}
(The last equality is the observation we made near the beginning of this example.) Because
δ*(q₀, 010) contains w, and because w is an element of A, 010 is accepted.
Looking at the figure, you might argue along this line instead: The string Λ010Λ is the
same as 010; the picture shows the sequence

q₀ → p → p → r → s → w

of transitions, the five arrows corresponding to Λ, 0, 1, 0, and Λ respectively; therefore 010 is
accepted. In an example as simple as this one, going through
the detailed calculations is not necessary in order to decide whether a string is accepted. The
point, however, is that with the recursive definitions of Λ(S) and δ*, we can proceed on a
solid algorithmic basis and be confident that the calculations (which are indeed feasible) will
produce the correct answer.
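The two-part recursive definition turns directly into an iterative calculation: take the Λ-closure of the start state, then alternate "one symbol, then Λ-closure" for each input symbol. The Python sketch below is mine, and the machine at the bottom is hypothetical (its state names loosely echo this example, but it is not the book's Figure 4.9):

```python
def closure(S, delta):
    """Lambda-closure; Lambda is encoded as the empty string ""."""
    result, frontier = set(S), set(S)
    while frontier:
        frontier = set().union(*(delta.get((q, ""), set()) for q in frontier)) - result
        result |= frontier
    return result

def delta_star(q0, x, delta):
    # delta*(q0, Lambda) = Lambda({q0}); then, symbol by symbol,
    # delta*(q0, ya) = Lambda( union of delta(r, a) for r in delta*(q0, y) ).
    states = closure({q0}, delta)
    for a in x:
        states = closure(set().union(*(delta.get((r, a), set()) for r in states)), delta)
    return states

def accepts(q0, accepting, x, delta):
    return bool(delta_star(q0, x, delta) & accepting)

# Hypothetical NFA-Lambda (not the book's figure):
delta = {("q0", ""): {"p"}, ("p", "0"): {"p"}, ("p", "1"): {"r"},
         ("r", "0"): {"s"}, ("s", ""): {"w"}}
print(accepts("q0", {"w"}, "010", delta))  # True
```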
In Section 4.1, we showed (Theorem 4.1) that NFAs are no more powerful than
FAs with regard to the languages they can accept. In order to establish the same result
for NFA-Λs, it is sufficient to show that any NFA-Λ can be replaced by an equivalent
NFA; the notation we have developed now allows us to prove this.
Just as in the case of Theorem 4.1, the proof of Theorem 4.2 provides us with
an algorithm for eliminating Λ-transitions from an NFA-Λ. We illustrate the
algorithm in two examples, in which we can also practice the algorithm for eliminating
nondeterminism.
Figure 4.10 |
An NFA-Λ, an NFA, and an FA for {0}*{01}*{0}*.
δ*(A, 0) = Λ( ⋃_{r ∈ Λ({A})} δ(r, 0) )
In a more involved example we might feel more comfortable carrying out each step literally:
calculating Λ({A}), finding δ(r, 0) for each r in this set, forming the union, and calculating the
Λ-closure of the result. In this simple example, we can see that from A with input 0, M can
stay in A, move to B (using a 0 followed by a Λ-transition), move to C (using a Λ-transition
and then a 0), or move to D (using the 0 either immediately preceded or immediately followed
by two Λ-transitions). The other entries in the last two columns of the table can be obtained
similarly. Since M can move from the initial state to D using only Λ-transitions, A must be
an accepting state in M₁, which is shown in Figure 4.10b.
Having the values δ₁(q, 0) and δ₁(q, 1) in tabular form is useful in arriving at an FA. For
example (if we denote the transition function of the FA by δ₂), in order to compute δ₂({C, D}, 0)
we simply form the union of the sets in the third and fourth rows of the δ*(q, 0) column of the
table; the result is {D}. It turns out in this example that the sets of states that came up as we
filled in the table for the NFA were all we needed, and the resulting FA is shown in Figure 4.10c.
EXAMPLE 4.8 | Another Example of Converting an NFA-Λ to an NFA
For our last example we consider the NFA-Λ in Figure 4.11a, recognizing the language
{0}*({01}*{1} ∪ {1}*{0}). Again we show the transition function in tabular form, as well
as the transition function for the resulting NFA.
The NFA we end up with is shown in Figure 4.11b. Note that the initial state A is not
an accepting state, since in the original device an accepting state cannot be reached with only
Λ-transitions.
Unlike the previous example, when we start to eliminate the nondeterminism from our
NFA, new sets of states are introduced in addition to the ones shown. The calculations are
straightforward, and the result is shown in Figure 4.11c.
Figure 4.11 |
An NFA-Λ, an NFA, and an FA for {0}*({01}*{1} ∪ {1}*{0}).
4.3 | KLEENE'S THEOREM
Sections 4.1 and 4.2 of this chapter have provided the tools we need to prove Theo-
rem 3.1. For convenience we have stated the two parts, the “if” and the “only if,” as
separate results.
Proof
Because of Theorem 4.3, it is sufficient to show that every regular language
can be accepted by an NFA-Λ. The set of regular languages over the alphabet Σ
is defined recursively; the three basic languages can be accepted by the NFA-Λs
in Figure 4.12, and it remains to show that the union, concatenation, and Kleene *
of languages accepted by NFA-Λs can also be accepted by NFA-Λs, using the
constructions shown in Figure 4.13.
For either value of i, if x ∈ Lᵢ, then Mᵤ can accept x by executing
a Λ-transition and then executing the moves that cause Mᵢ to accept x. On
the other hand, if x is accepted by Mᵤ, there is a sequence of transitions
corresponding to x, starting at qᵤ and ending at an element of A₁ or A₂. The first
of these transitions must be a Λ-transition from qᵤ to either q₁ or q₂,
since there are no other transitions from qᵤ. Thereafter, since Q₁ ∩ Q₂ = ∅,
the remaining transitions stay within one of the two original machines, and so
x is accepted by M₁ or by M₂; in either case x ∈ L₁ ∪ L₂.
Figure 4.12 |
NFA-Λs for the three basic regular languages.
Figure 4.13 |
NFA-Λs for union, concatenation, and Kleene *.
δ_*(q_*, Λ) = {q₁} and δ_*(q_*, a) = ∅ for a ∈ Σ.
For q ∈ Q₁ and a ∈ Σ ∪ {Λ}, δ_*(q, a) = δ₁(q, a) unless q ∈ A₁ and
a = Λ.
For q ∈ A₁, δ_*(q, Λ) = δ₁(q, Λ) ∪ {q_*}.
Suppose x ∈ L₁*. If x = Λ, then clearly x is accepted by M_*. Otherwise,
for some m ≥ 1, x = x₁x₂···xₘ, where xᵢ ∈ L₁ for each i. M_* can
move from q_* to q₁ by a Λ-transition; for each i, M_* moves from q₁ to an
element fᵢ of A₁ by a sequence of transitions corresponding to xᵢ; and for
each i, M_* then moves from fᵢ back to q_* by a Λ-transition. It follows that
x is accepted by M_*. On the other hand, if x is accepted by M_*, then
x can be decomposed in the form

x = (Λx₁Λ)(Λx₂Λ)···(ΛxₘΛ)

where, for each i, there is a sequence of transitions corresponding to xᵢ from
q₁ to an element of A₁. Therefore, x ∈ L₁*.
Since we have constructed an NFA-Λ recognizing L in each of the three
cases, the proof is complete. ∎
The constructions in the proof of Theorem 4.4 provide an algorithm for
constructing an NFA-Λ corresponding to a given regular expression. The next example
illustrates its application, as well as the fact that there may be simplifications possible
along the way.
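The three combining constructions can be sketched concretely. In this Python fragment the encoding is my own, not the book's: an NFA-Λ is a tuple (states, delta, start, accepting), Λ-transitions are keyed by the empty string, and fresh state names come from a shared counter. The star builder follows the book's construction, with a new state that is both initial and the only accepting state.

```python
from itertools import count

_fresh = count()

def symbol_nfa(a):
    """NFA-Lambda accepting the one-symbol language {a}."""
    s, f = next(_fresh), next(_fresh)
    return ({s, f}, {(s, a): {f}}, s, {f})

def _merge(d1, d2):
    d = {k: set(v) for k, v in d1.items()}
    for k, v in d2.items():
        d.setdefault(k, set()).update(v)
    return d

def union_nfa(m1, m2):
    (Q1, d1, s1, A1), (Q2, d2, s2, A2) = m1, m2
    s = next(_fresh)                       # new start with Lambda-moves to both
    d = _merge(d1, d2)
    d[(s, "")] = {s1, s2}
    return (Q1 | Q2 | {s}, d, s, A1 | A2)

def concat_nfa(m1, m2):
    (Q1, d1, s1, A1), (Q2, d2, s2, A2) = m1, m2
    d = _merge(d1, d2)
    for f in A1:                           # Lambda-moves from A1 into M2
        d.setdefault((f, ""), set()).add(s2)
    return (Q1 | Q2, d, s1, A2)

def star_nfa(m1):
    (Q1, d1, s1, A1) = m1
    s = next(_fresh)                       # new state: both initial and accepting
    d = {k: set(v) for k, v in d1.items()}
    d[(s, "")] = {s1}
    for f in A1:                           # loop back through the new state
        d.setdefault((f, ""), set()).add(s)
    return (Q1 | {s}, d, s, {s})

m = star_nfa(union_nfa(symbol_nfa("0"), symbol_nfa("1")))  # NFA-Λ for (0 + 1)*
print(len(m[0]))  # 6
```

Because every building block has a single entry point and the combiners only add Λ-arrows, the pieces compose without any reference to each other's internal structure, which is exactly the point made in the text.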
Figure 4.14 |
Constructing an NFA-Λ for (00 + 1)*(10)*.
Figure 4.15 |
A simplified NFA-Λ for (00 + 1)*(10)*.
parallel those of Figure 4.14 and incorporate some obvious simplifications. One must be a
little careful with simplifications such as these, as Exercises 4.32 to 4.34 illustrate.
We know from Sections 4.1 and 4.2 of this chapter how to convert an NFA-Λ obtained
from Theorem 4.4 into an FA. Although we have not yet officially considered the question of
simplifying a given FA as much as possible, we will see how to do this in Chapter 5.
If we have NFA-Λs M₁ and M₂, the proof of Theorem 4.4 provides us with
algorithms for constructing new NFA-Λs to recognize the union, concatenation, and
Kleene * of the corresponding languages. The first two of these algorithms were
illustrated in Example 4.5. As a further example, we start with the FAs shown in
Figures 4.16a and 4.16b (which were shown in Examples 3.12 and 3.13 to accept
the languages {0, 1}*{10} and {00}*{11}*, respectively). We can apply the algorithms
for union and Kleene *, making one simplification in the second step, to obtain the
Figure 4.16 |
An NFA-Λ for ((0 + 1)*10 + (00)*(11)*)*.
Figure 4.17 |
The only property of L(p, q, 0) that is needed in the proof of Theorem 4.5 is that
it is finite. It is simple enough, however, to find an explicit formula. This formula
and the others used in the proof are given below and provide a summary of the steps
in the algorithm provided by the proof for finding a regular expression corresponding
to a given FA:
L(p, q, 0) = {a ∈ Σ | δ(p, a) = q}            if p ≠ q
L(p, q, 0) = {a ∈ Σ | δ(p, a) = p} ∪ {Λ}      if p = q
L(p, q, k + 1) = L(p, q, k) ∪ L(p, k + 1, k)L(k + 1, k + 1, k)*L(k + 1, q, k)
L(p, q) = L(p, q, n)
L = ⋃_{q ∈ A} L(q₀, q)
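The recurrence translates directly into a dynamic program over k that manipulates regular-expression strings. The sketch below is illustrative rather than the book's own presentation: states are numbered 1 through n, "@" stands for Λ, None stands for ∅, and no simplification of the resulting expressions is attempted.

```python
def fa_to_regex(n, delta, start, accepting):
    """States are 1..n; delta maps (p, a) -> q.  Returns an (unsimplified)
    regular-expression string for the language accepted, or None for the
    empty language."""
    # L(p, q, 0): symbols leading directly from p to q, plus Lambda if p == q
    L = {}
    for p in range(1, n + 1):
        for q in range(1, n + 1):
            syms = [a for (pp, a), qq in delta.items() if pp == p and qq == q]
            if p == q:
                syms.append("@")                       # "@" stands for Lambda
            L[p, q] = "+".join(syms) if syms else None  # None stands for the empty set
    # L(p, q, k+1) = L(p, q, k) + L(p, k+1, k) L(k+1, k+1, k)* L(k+1, q, k)
    for k in range(1, n + 1):
        L2 = {}
        for p in range(1, n + 1):
            for q in range(1, n + 1):
                parts = []
                if L[p, q] is not None:
                    parts.append(L[p, q])
                if L[p, k] is not None and L[k, q] is not None:
                    loop = "(%s)*" % L[k, k] if L[k, k] is not None else ""
                    parts.append("(%s)%s(%s)" % (L[p, k], loop, L[k, q]))
                L2[p, q] = "+".join(parts) if parts else None
        L = L2
    exprs = [L[start, q] for q in accepting if L[start, q] is not None]
    return "+".join(exprs) if exprs else None

# Illustrative two-state FA over {a, b}: accept once a b has been read.
delta = {(1, "a"): 1, (1, "b"): 2, (2, "a"): 2, (2, "b"): 2}
r = fa_to_regex(2, delta, 1, {2})
```

The output grows quickly, which is why the text notes that many entries of such a table are better obtained by inspection and then simplified.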
Figure 4.18 |
The expressions L(p, q, k), shown in tabular form.
Although many of these table entries can be obtained by inspection, the formula can
be used wherever necessary. In a number of cases, simplifications have been made in the
expressions produced by the formula.
If you wish, you can almost certainly find ways to simplify these expressions further.
EXERCISES
4.1. In the NFA pictured in Figure 4.19, calculate each of the following.
a. δ*(1, …)
b. δ*(1, bab)
c. δ*(1, aabb)
d. δ*(1, aabbab)
e. δ*(1, aba)
4.2. An NFA with states 1-5 and input alphabet {a, b} has the following
transition table.
Figure 4.19 |
Figure 4.20 |
Figure 4.21 |
Figure 4.22 |
Find:
a. Λ({2, 3})
b. Λ({1})
c. Λ({3, 4})
d. δ*(1, aba)
e. δ*(1, ab)
f. δ*(1, ababa)
4.17. A transition table is given for another NFA-Λ with seven states.
Figure 4.23 |
4.21. Let M = (Q, Σ, q₀, A, δ) be an FA, and let M₁ = (Q, Σ, q₀, A, δ₁) be the
NFA-Λ defined in the proof of Theorem 4.3, in which δ₁(q, Λ) = ∅ and
δ₁(q, a) = {δ(q, a)}, for every q ∈ Q and a ∈ Σ. Give a careful proof that
for every q ∈ Q and x ∈ Σ*, δ₁*(q, x) = {δ*(q, x)}. Recall that the two
functions δ* and δ₁* are defined differently.
4.22. Let M₁ be the NFA-Λ obtained from the FA M as in the proof of Theorem
4.3. The transition function δ₁ of M₁ is defined so that δ₁(q, Λ) = ∅ for
every state q. Would defining δ₁(q, Λ) = {q} also work? Give reasons for
your answer.
4.23. Let M = (Q, Σ, q₀, A, δ) be an NFA-Λ. The proofs of Theorems 4.2 and
4.1 describe a two-step process for obtaining an FA M₁ = (Q₁, Σ, q₁, A₁,
δ₁) that accepts the language L(M). Do it in one step, by defining Q₁, q₁,
A₁, and δ₁ directly in terms of M.
4.24. Let M = (Q, Σ, q₀, A, δ) be an NFA-Λ. In the proof of Theorem 4.2, the
NFA M₁ might have more accepting states than M: The initial state q₀ is
made an accepting state if Λ({q₀}) ∩ A ≠ ∅. Explain why it is not necessary
to make all the states q for which Λ({q}) ∩ A ≠ ∅ accepting states in M₁.
4.25. Suppose M = (Q, Σ, q₀, A, δ) is an NFA-Λ recognizing a language L. Let
M₁ be the NFA-Λ obtained from M by adding Λ-transitions from each
element of A to q₀. Describe (in terms of L) the language L(M₁).
Figure 4.24 |
Figure 4.25 |
4.32. In the construction of Mᵤ in the proof of Theorem 4.4, consider this
alternative to the construction described: Instead of a new state qᵤ and
Λ-transitions from it to q₁ and q₂, make q₁ the initial state of the new
NFA-Λ, and create a Λ-transition from it to q₂. Either prove that this works
in general, or give an example in which it fails.
4.33. In the construction of M_c in the proof of Theorem 4.4, consider the
simplified case in which M₁ has only one accepting state. Suppose that we
eliminate the Λ-transition from the accepting state of M₁ to q₂, and merge
these two states into one. Either show that this would always work in this
case, or give an example in which it fails.
4.34. In the construction of M_* in the proof of Theorem 4.4, suppose that instead
of adding a new state q_*, with Λ-transitions from it to q₁ and to it from each
accepting state of Q₁, we make q₁ both the initial state and the accepting
state, and create Λ-transitions from each accepting state of M₁ to q₁. Either
show that this works in general, or give an example in which it fails.
4.35. In each case below, find an NFA-Λ recognizing the language corresponding
to the regular expression, by applying literally the algorithm in the chapter.
Do not attempt to simplify the answer.
a. ((ab)*b + ab*)*
b. aa(ba)* + b*aba*
c. (ab + (aab)*)(aa + a)
4.36. In Figures 4.26a and 4.26b are pictured FAs M₁ and M₂, recognizing
languages L₁ and L₂, respectively. Draw NFA-Λs recognizing each of the
following languages, using the constructions in this chapter:
L₁L₂
L₁L₁L₂
L₁ ∪ L₂
L₁*
L₁* ∪ L₂
L₂L₁*
L₁L₂ ∪ (L₂L₁)*
Figure 4.26 |
Figure 4.27 |
4.37. Draw NFAs accepting L₁L₂ and L₂L₁, where L₁ and L₂ are as in the
preceding problem. Do this by connecting the two given diagrams directly,
by arrows with appropriate labels.
4.38. Use the algorithm of Theorem 4.5 to find a regular expression corresponding
to each of the FAs shown in Figure 4.27. In each case, if the FA has n states,
construct tables showing L(p, q, j) for each j with 0 ≤ j ≤ n − 1.
4.45. In Example 4.5, we started with NFAs M₁ and M₂ and incorporated them
into a composite NFA M accepting L(M₁) ∪ L(M₂) in such a way that no
new states were required.
a. Show by considering the languages {0}* and {1}* that this is not always
possible. (Each of the two languages can obviously be accepted by a
one-state NFA; show that their union cannot be accepted by a two-state
NFA.)
b. Describe a reasonably general set of circumstances in which this is
possible. (Find a condition that might be satisfied by one or both of the
NFAs that would make it possible.)
4.46. Suppose Σ₁ and Σ₂ are alphabets, and the function f : Σ₁* → Σ₂* is a
homomorphism; i.e., f(xy) = f(x)f(y) for every x, y ∈ Σ₁*.
a. Show that f(Λ) = Λ.
b. Show that if L ⊆ Σ₁* is regular, then f(L) is regular. (f(L) is the set
{y ∈ Σ₂* | y = f(x) for some x ∈ L}.)
c. Show that if L ⊆ Σ₂* is regular, then f⁻¹(L) is regular. (f⁻¹(L) is the
set {x ∈ Σ₁* | f(x) ∈ L}.)
4.47. Suppose M = (Q, Σ, q₀, A, δ) is an NFA-Λ. For two (not necessarily
distinct) states p and q, we define the regular expression e(p, q) as follows:
e(p, q) = l + r₁ + r₂ + ··· + rₖ, where l is either Λ (if δ(p, Λ) contains q)
or ∅, and the rᵢ's are all the elements a of Σ for which δ(p, a) contains q.
It's possible for e(p, q) to be ∅, if there are no transitions from p to q;
otherwise, e(p, q) represents the "most general" transition from p to q.
If we generalize this by allowing e(p, q) to be an arbitrary regular
expression over Σ, we get what is called an expression graph. If p and q are
two states in an expression graph G, and x ∈ Σ*, we say that x allows G to
move from p to q if there are states p₀, p₁, …, pₘ, with p₀ = p and
pₘ = q, so that x corresponds to the regular expression
e(p₀, p₁)e(p₁, p₂)···e(pₘ₋₁, pₘ). This allows us to say how G accepts a
string x (x allows G to move from the initial state to an accepting state), and
therefore to talk about the language accepted by G. It is easy to see that in
the special case where G is simply an NFA-Λ, the two definitions for the
language accepted by G coincide. (See Definition 4.5a.) It is also not hard to
convince yourself, using Theorem 4.4, that for any expression graph G, the
language accepted by G can be accepted by an NFA-Λ.
We can use the idea of an expression graph to obtain an alternate proof
of Theorem 4.5, as follows. Starting with an FA M accepting L, we may
easily convert it to an NFA-Λ M₁ accepting L, so that M₁ has no transitions
to its initial state q₀, exactly one accepting state q_f (which is different from
q₀), and no transitions from q_f. The remainder of the proof is to specify a
reduction technique to reduce by one the number of states other than q₀ and
q_f, obtaining an equivalent expression graph at each step, until q₀ and q_f are
the only states remaining. The regular expression e(q₀, q_f) then describes
the language accepted. If p is the state to be eliminated, the reduction step
involves redefining e(q, r) for every pair of states q and r other than p.
Describe in more detail how this reduction can be done. Then apply this
technique to the FAs in Figure 4.27 to obtain regular expressions
corresponding to their languages.
CHAPTER 5
Regular and Nonregular Languages
is regular, then the set of equivalence classes for the relation I_L is finite. In order to
obtain a characterization of regularity using this approach, we need to show that the
converse is also true, that if the set of equivalence classes is finite then L is regular.
Once we do this, we will have an answer to the first question above (how can
we tell whether L is regular?), in terms of the equivalence classes of the relation
I_L. Furthermore, it turns out that if our language L is regular, identifying these
equivalence classes will also give us an answer to the second question (how can we
find an FA?), because there is a simple way to use the equivalence classes to construct
an FA accepting L. This FA is the most natural one to accept L, in the sense that it has
the fewest possible states; and an interesting by-product of the discussion will be a
method for taking any FA known to accept L and simplifying it as much as possible.
We wish to show that if the set of equivalence classes of I_L is finite, then
there is a finite automaton accepting L. The discussion may be easier to understand,
however, if we start with a language L known to be regular, and with an FA M =
(Q, Σ, q₀, A, δ) recognizing L. If q ∈ Q, then adapting the notation introduced in
Section 4.3, we let L_q = {x ∈ Σ* | δ*(q₀, x) = q}. Now consider the formula

δ([x], a) = [xa]

where for any string z, [z] denotes the equivalence class containing z.
about a string x, this is a perfectly straightforward formula, which may be true or false,
depending on how 4([x], a) is defined. If we want the formula to be the definition
of 6([x], a), we must consider a potential problem. We are trying to define 5(q, a),
where gq is an equivalence class (a set of strings). We have taken a string x in the set
q, which allows us to write gq = [x]. However, there is nothing special about x; the
set g could just as easily be written as [y] for any other string y € q. If our definition
is to make any sense, it must tell us what 5(q, a) is, whether we write g = [x] or
q = [y]. The formula gives us [xa] in one case and [ya] in the other; obviously,
unless [xa] = [ya], our definition is nonsense. Fortunately, the next lemma takes
care of the potential problem.
Lemma 5.1 I_L is right invariant with respect to concatenation. In other words, for
any x, y ∈ Σ* and any a ∈ Σ, if x I_L y, then xa I_L ya. Equivalently, if [x] = [y],
then [xa] = [ya].
Proof Suppose x I_L y and a ∈ Σ. Then L/x = L/y, so that for any z′ ∈ Σ*, xz′
and yz′ are either both in L or both not in L. Therefore, for any z ∈ Σ*, xaz and yaz
are either both in L or both not in L (because we can apply the previous statement
with z′ = az), and we conclude that xa I_L ya. ∎
Theorem 5.1
Let L ⊆ Σ*, and let Q_L be the set of equivalence classes of the relation I_L
on Σ*. (Each element of Q_L is therefore a set of strings.) If Q_L is a finite
set, then M_L = (Q_L, Σ, q₀, A_L, δ) is an FA accepting L, where
q₀ = [Λ], A_L = {q ∈ Q_L | q ∩ L ≠ ∅}, and δ is defined
by the formula δ([x], a) = [xa]. Furthermore, M_L has the fewest states of
any FA accepting L.
Proof
One first verifies, by induction on y, that δ*([Λ], y) = [y] for every y ∈ Σ*;
the base case is y = Λ, for which both sides are [Λ]. It follows that a string x
is accepted by M_L if and only if [x] ∩ L ≠ ∅, and it remains to check that this
happens precisely when x ∈ L. If x ∈ L, then [x] ∩ L ≠ ∅, since x ∈ [x]. In the other direction,
if [x] contains an element y of L, then x must be in L. Otherwise the string
Λ would distinguish x and y with respect to L, and x and y could not both
be in the same equivalence class.
Corollary 5.1 L is a regular language if and only if the set of equivalence classes
of I_L is finite.
Proof Theorem 5.1 tells us that if the set of equivalence classes is finite, there is
an FA accepting L; and Theorem 3.2 says that if the set is infinite, there can be no
such FA. ∎
Corollary 5.1 was proved by Myhill and Nerode, and it is often called the Myhill-
Nerode theorem.
It is interesting to observe that to some extent, the construction of M_L in Theorem
5.1 makes sense even when the language L is not regular. I_L is an equivalence relation
for any language L, and we can consider the set Q_L of equivalence classes. Neither
Figure 5.1 |
the definition of δ : Q_L × Σ → Q_L, nor the proof that M_L accepts L depends on the
assumption that Q_L is a finite set. It appears that even in the most general case, we
have some sort of "device" that accepts L. If it is not a finite automaton, what is it?
Instead of inventing a name for something with an infinite number of states, let us
draw a (partial) picture of it in a simple case we have studied. Let L be the language
pal of all palindromes over {a, b}. As we observed in the proof of Theorem 3.3,
any two strings in {a, b}* are distinguishable with respect to L. Not only is the
set Q_L infinite, but there are as many equivalence classes as there are strings; each
equivalence class contains exactly one string. Even in this most extreme case, there
is no difficulty in visualizing M_L, as Figure 5.1 indicates.
The only problem, of course, is that there is no way to complete the picture, and
no way to implement M_L as a physical machine. As we have seen in other ways, the
crucial aspect of an FA is precisely the finiteness of the set of states.
Figure 5.2 |
A minimum-state FA recognizing
{0, 1}*{10}.
and consider the three strings A, 1, and 10. We can easily verify that any two of these strings
are distinguishable with respect to L: The string A distinguishes A and 10, and also 1 and 10,
while the string 0 distinguishes A and 1. Therefore, the three equivalence classes [A], [1], and
[10] are distinct.
However, any string y is equivalent to (indistinguishable from) one of these strings. If y
ends in 10, then y is equivalent to 10; if y ends in 1, y is equivalent to 1; otherwise (if y = Λ,
y = 0, or y ends with 00), y is equivalent to Λ. Therefore, these three equivalence classes are
the only ones.
Let M_L = (Q_L, {0, 1}, [Λ], {[10]}, δ) be the FA we constructed in Theorem 5.1. Then
δ([10], 0) = [100] = [Λ] and δ([10], 1) = [101] = [1], since 100 is equivalent to Λ and 101
is equivalent to 1. It follows that the FA M_L is the one
shown in Figure 5.2. Not surprisingly, this is the same FA we came up with in Example 3.12,
except for the names given to the states. (One reason it is not surprising is that the strings Λ,
1, and 10 were chosen to correspond to the three states in the previous FA!)
We used Theorem 3.2, which is the “only if” part of Theorem 5.1, to show that
the language of palindromes over {0, 1} is nonregular. We may use the same principle
to exhibit a number of other nonregular languages.
Some strings are not prefixes of any elements of L (examples include 1, 011, and 010),
and it is not hard to see that the set of all such strings is an equivalence class (Exercise 5.4).
The remaining strings are of three types: strings in L, strings of 0's (0ⁱ, for some i > 0), and
strings of the form 0ⁱ1ʲ with 0 < j < i.
The set L is itself an equivalence class of I_L. This is because for any string x ∈ L, Λ is
the only string that can follow x so as to produce an element of L.
Saying as we did above that "we must remember how many 0's we have seen" suggests
that two distinct strings of 0's should be in different equivalence classes. This is true: If i ≠ j,
the strings 0ⁱ and 0ʲ are distinguished by the string 1ⁱ, because 0ⁱ1ⁱ ∈ L and 0ʲ1ⁱ ∉ L. We
now know that [0ⁱ] ≠ [0ʲ]. To see exactly what these sets are, we note that for any string x
other than 0ⁱ, the string 01ⁱ⁺¹ distinguishes 0ⁱ and x (because 0ⁱ01ⁱ⁺¹ ∈ L and x01ⁱ⁺¹ ∉ L).
In other words, 0ⁱ is equivalent only to itself, and [0ⁱ] = {0ⁱ}.
Finally, consider the string 000011, for example. There is exactly one string z for which
000011z ∈ L: the string z = 11. However, any other string x having the property that xz ∈ L
if and only if z = 11 is equivalent to 000011, and these are the strings 0ʲ⁺²1ʲ for j > 0.
No string other than one of these can be equivalent to 000011, and we may conclude that the
equivalence class [000011] is the set {0ʲ⁺²1ʲ | j > 0}. Similarly, for each k > 0, the set
{0ʲ⁺ᵏ1ʲ | j > 0} is an equivalence class.
Let us summarize our conclusions. The set L and the set of all nonprefixes of elements
of L are two of the equivalence classes; for each i > 0, the set with the single element 0ⁱ is
an equivalence class; and for each k > 0, the infinite set {0ʲ⁺ᵏ1ʲ | j > 0} is an equivalence
class. Since every string is in one of these equivalence classes, these are the only equivalence
classes.
As we expected, we have shown in particular that there are infinitely many distinct equivalence classes, which allows us to conclude that L is not regular.
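The distinguishing-string arguments above can be checked mechanically for small cases. In this Python sketch, the membership test and the sample strings are mine:

```python
from itertools import product

def in_L(s):
    """Membership in L = { 0^i 1^i : i >= 0 }."""
    i = len(s) - len(s.lstrip("0"))
    return s == "0" * i + "1" * i

def distinguishes(z, x, y):
    """True if z distinguishes x and y with respect to L."""
    return in_L(x + z) != in_L(y + z)

# 1^3 distinguishes 0^3 and 0^5, as in the argument above:
print(distinguishes("111", "000", "00000"))  # True
# 000011 and 0001 (= 0^{1+2} 1^1) look equivalent: no short z separates them.
print(all(not distinguishes("".join(z), "000011", "0001")
          for n in range(7) for z in product("01", repeat=n)))  # True
```

A search like this can only fail to find a distinguishing string of bounded length; the proof that 000011 and 0001 are genuinely equivalent is the argument in the text, since both require exactly z = 11.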
is in L if and only if the numbers of left and right parentheses are the same. We may therefore
consider the set S = {(ⁿ | n ≥ 0}, in the same way that we considered the strings 0ⁿ in the
previous example. For 0 ≤ m < n, the string )ᵐ distinguishes (ᵐ and (ⁿ, and so any two
elements of S are distinguishable with respect to L (i.e., in different equivalence classes). We
conclude from this that L is not regular. Exercise 5.35 asks you to describe the equivalence
classes of I_L more precisely.
of all even-length strings of 0's and 1's whose first and second halves are identical. This time,
for a string z that distinguishes 0ᵐ and 0ⁿ when m ≠ n, we choose z = 1ᵐ0ᵐ1ᵐ. The string 0ᵐz
is in L, and the string 0ⁿz is not.
Exercise 5.20 asks you to get even a little more mileage out of the set {0ⁿ | n ≥ 0}
or some variation of it. We close this section with one more example.
Before we attack this problem, there is one obvious way in which we might be
able to reduce the number of states in M without affecting the L_q partition at all.
This is to eliminate the states q for which L_q = ∅. For such a q, there are no strings
x satisfying δ*(q0, x) = q; in other words, q is unreachable from q0. It is easy to
formulate a recursive definition of the set of reachable states of M and then use that to
obtain an algorithm that finds all reachable states. If all the others are eliminated, the
resulting FA still recognizes L (Exercise 3.29). For the remainder of this discussion,
therefore, we assume that all states of M are reachable from q0.
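The recursive definition just mentioned translates directly into a simple search. In this sketch (the dictionary-based encoding of δ is our own convention, not from the text), we repeatedly follow transitions from states already known to be reachable:

```python
def reachable_states(q0, alphabet, delta):
    # delta maps (state, symbol) pairs to states.
    reached = {q0}
    frontier = [q0]
    while frontier:
        q = frontier.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in reached:
                reached.add(r)
                frontier.append(r)
    return reached

# A 3-state FA in which state 2 cannot be reached from state 0:
delta = {(0, '0'): 1, (0, '1'): 0,
         (1, '0'): 1, (1, '1'): 0,
         (2, '0'): 2, (2, '1'): 2}
print(reachable_states(0, '01', delta))  # {0, 1}
```

Eliminating the states outside this set leaves an FA accepting the same language.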
It may be helpful at this point to look again at Example 3.12. Figure 5.3a shows
the original FA we drew for this language; Figure 5.3b shows the minimum-state FA
we arrived at in Example 5.1; Figure 5.3c shows the partition corresponding to the
original FA, with seven subsets; and Figure 5.3d shows the three equivalence classes
of I_L, which are the sets L_q for the minimum-state FA. We obtain the simpler FA
from the first one by merging the three states 1, 2, and 4 into the state A, and by
merging the states 3, 5, and 7 into B. State 6 becomes state C. Once we have done
this, we can easily determine the new transitions. From any of the states 1, 2, and
4, the input symbol 0 takes us to one of those same states. Therefore, the transition
from A with input 0 must go to A. From 1, 2, or 4, the input 1 takes us to 3, 5,
or 7. Therefore, the transition from A with input 1 goes to B. The other cases are
similar.
In general, starting with a finite automaton M, we may describe the problem in
terms of identifying the pairs (p, q) of states for which L_p and L_q are subsets of the
same equivalence class. Let us write this condition as p ≡ q. What we will actually
do is to solve the opposite problem: identify those pairs (p, q) for which p ≢ q. The
first step is to express the statement p ≡ q in a slightly different way.
Lemma 5.2 Suppose p, q ∈ Q, and x and y are strings with x ∈ L_p and y ∈ L_q
(in other words, δ*(q0, x) = p and δ*(q0, y) = q). Then these three statements are
all equivalent:
1. p ≡ q.
2. L/x = L/y (i.e., x I_L y, or x and y are indistinguishable with respect to L).
3. For any z ∈ Σ*, δ*(p, z) ∈ A ⟺ δ*(q, z) ∈ A (i.e., δ*(p, z) and δ*(q, z) are
either both in A or both not in A).
Proof To see that statements 2 and 3 are equivalent, we begin with the formulas
Figure 5.3 |
Two FAs for {0, 1}*{10} and the corresponding partitions of {0, 1}*.
same equivalence class, then they are subsets of different equivalence classes, so that
statement 2 does not hold. ■
Let us now consider how it can happen that p ≢ q. According to the lemma,
this means that for some z, exactly one of the two states δ*(p, z) and δ*(q, z) is in A.
178 PART 2 Regular Languages and Finite Automata
The simplest way this can happen is with z = Λ, so that only one of the states p and
q is in A. Once we have one pair (p, q) with p ≢ q, we consider the situation where
r, s ∈ Q, and for some a ∈ Σ, δ(r, a) = p and δ(s, a) = q. In this case r ≢ s as well,
since the string az distinguishes r and s. This suggests the following recursive
definition of the set S of pairs (p, q) with p ≢ q:
1. For any p and q for which exactly one of p and q is in A, (p, q) is in S.
2. For any pair (p, q) ∈ S, if (r, s) is a pair for which δ(r, a) = p and δ(s, a) = q
for some a ∈ Σ, then (r, s) is in S.
3. No other pairs are in S.
It is not difficult to see from the comments preceding the recursive definition that
for any pair (p, q) ∈ S, p ≢ q. On the other hand, it follows from Lemma 5.2 that
we can show S contains all such pairs by establishing the following statement: For
any string z ∈ Σ*, every pair of states (p, q) for which only one of the states δ*(p, z)
and δ*(q, z) is in A is an element of S.
We do this by using structural induction on z. For the basis step, if only one of
δ*(p, Λ) and δ*(q, Λ) is in A, then only one of the two states p and q is in A, and
(p, q) ∈ S because of statement 1 of the definition.
Now suppose that for some z, all pairs (p, q) for which only one of δ*(p, z) and
δ*(q, z) is in A are in S. Consider the string az, where a ∈ Σ, and suppose that (r, s)
is a pair for which only one of δ*(r, az) and δ*(s, az) is in A. If we let p = δ(r, a)
and q = δ(s, a), then we have

δ*(r, az) = δ*(δ(r, a), z) = δ*(p, z)    and    δ*(s, az) = δ*(δ(s, a), z) = δ*(q, z)

Our assumption on r and s is that only one of the states δ*(r, az) and δ*(s, az),
and therefore only one of the states δ*(p, z) and δ*(q, z), is in A. Our induction
hypothesis therefore implies that (p, q) ∈ S, and it then follows from statement 2
in the recursive definition that (r, s) ∈ S. (Note that in the recursive definition of
Σ* implicit in this structural induction, the recursive step of the definition involves
strings of the form az rather than za.)
Now it is a simple matter to convert this recursive definition into an algorithm to
identify all the pairs (p, q) for which p ≢ q.
CHAPTER 5 Regular and Nonregular Languages 179
Algorithm 5.1 (For Identifying the Pairs (p, q) with p ≢ q) List all (unordered)
pairs of distinct states (p, q). Make a sequence of passes through these
pairs. On the first pass, mark each pair of which exactly one element is in A. On
each subsequent pass, mark any pair (r, s) if there is an a ∈ Σ for which δ(r, a) = p,
δ(s, a) = q, and (p, q) is already marked. After a pass in which no new pairs are
marked, stop. The marked pairs (p, q) are precisely those for which p ≢ q. ■
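Algorithm 5.1 can be sketched directly in code. This version (the data representation is our own assumption, not from the text) keeps a set of marked pairs and makes passes until nothing changes; the pairs of distinct states left unmarked are exactly those that can be merged:

```python
from itertools import combinations

def mark_distinguishable(states, alphabet, delta, accepting):
    # First pass: mark pairs of which exactly one element is in A.
    marked = {frozenset(pr) for pr in combinations(states, 2)
              if (pr[0] in accepting) != (pr[1] in accepting)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            if frozenset((p, q)) in marked:
                continue
            # Mark (p, q) if some symbol leads to an already-marked pair.
            for a in alphabet:
                if frozenset((delta[(p, a)], delta[(q, a)])) in marked:
                    marked.add(frozenset((p, q)))
                    changed = True
                    break
    return marked

# FA for {0,1}*{10}, plus an extra state q3 equivalent to q0:
states = ['q0', 'q1', 'q2', 'q3']
delta = {('q0', '0'): 'q0', ('q0', '1'): 'q1',
         ('q1', '0'): 'q2', ('q1', '1'): 'q1',
         ('q2', '0'): 'q0', ('q2', '1'): 'q1',
         ('q3', '0'): 'q3', ('q3', '1'): 'q1'}
m = mark_distinguishable(states, '01', delta, {'q2'})
print(frozenset(('q0', 'q3')) in m)  # False: q0 and q3 can be merged
```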
When the algorithm terminates, any pair (p, q) that remains unmarked represents
two states in our FA that can be merged into one, since the corresponding sets of strings
are both subsets of the same equivalence class. In order to find the total number of
equivalence classes, or the minimum number of states, we can make one final pass
through the states of M; the first state of M to be considered corresponds to one
equivalence class; for each subsequent state g of M, q represents a new equivalence
class only if the pair (p, q) was marked by Algorithm 5.1 for every previous state p of
M. As we have seen in our example, once we have the states in the minimum-state FA,
determining the transitions is straightforward. We return once more to Example 5.1
to illustrate the algorithm.
Figure 5.4 |
Applying Algorithm 5.1 to the FA in Figure 5.3a.
q_0 = δ*(q_0, Λ)
q_1 = δ*(q_0, a_1)
q_2 = δ*(q_0, a_1a_2)
···
q_n = δ*(q_0, a_1a_2 ··· a_n)
must contain some state at least twice, by the pigeonhole principle (Exercise 2.44).
This is where our loop comes from. Suppose q_i = q_(i+p), where 0 ≤ i < i + p ≤ n.
Then
Figure 5.5 |
u = a_1a_2 ··· a_i        v = a_(i+1)a_(i+2) ··· a_(i+p)        w = a_(i+p+1)a_(i+p+2) ··· a_n
(See Figure 5.5.) The string u is interpreted to be Λ if i = 0, and w is interpreted to
be Λ if i + p = n.
Since δ*(q_i, v) = q_i, we have δ*(q_i, v^m) = q_i for every m ≥ 0, and it follows
that δ*(q_0, uv^m w) = q_n for every m ≥ 0. Since p > 0 and i + p ≤ n, we have
proved the following result.
This result is often referred to as the Pumping Lemma for Regular Languages,
since we can think of it as saying that for an arbitrary string in L, provided it is
sufficiently long, a portion of it can be “pumped up,” introducing additional copies
of the substring v, so as to obtain many more distinct elements of L.
The proof of the result was easy, but the result itself is complicated enough in its
logical structure that applying it correctly requires some care. It may be helpful first to
weaken it slightly by leaving out some information (where the integer n comes from).
Theorem 5.2a clarifies the essential feature and is sufficient for most applications.
In order to use the pumping lemma to show that a language L is not regular, we
must show that L fails to have the property described in the lemma. We do this by
assuming that the property is satisfied and deriving a contradiction.
The statement is of the form "There is an n so that for any x ∈ L with |x| ≥ n, ... ."
We assume, therefore, that we have such an n, although we do not know what it is.
We try to find a specific string x with |x| ≥ n so that the statements involving x in the
theorem will lead to a contradiction. (The theorem says that under the assumption
that L is regular, any x ∈ L with |x| ≥ n satisfies certain conditions; therefore,
our specific x satisfies these conditions; this leads to a contradiction; therefore, the
assumption leads to a contradiction; therefore, L is not regular.)
Remember, however, that we do not know what n is. In effect, therefore, we
must show that for any n, we can find an x € L with |x| > n so that the statements
about x in the theorem lead to a contradiction. It may be that we have to choose x
carefully in order to obtain a contradiction. We are free to pick any x we like, as long
as |x| ≥ n; but since we do not know what n is, the choice of x must involve n.
Once we have chosen x, we are not free to choose the strings u, v, and w into
which the theorem says x can be decomposed. What we know is that there is some
way to write x as uvw so that equations (5.2)-(5.4) are true. Because we must
guarantee that a contradiction is produced, we must show once we have chosen x that
any choice of u, v, and w satisfying equations (5.1)—(5.4) produces a contradiction.
Let us use as our first illustration one of the languages that we already know is not
regular.
have worked for the language L (at least if n is even), no longer works for L_1. The reason
this string works for L is that even if v happens to contain both 0's and 1's, uv^2 w is not of the
form 0^i1^i. (The contradiction is obtained, not by looking at the numbers of 0's and 1's, but by
looking at the order of the symbols.) The reason it does not work for L_1 is that if v contains
equal numbers of 0's and 1's, then uv^m w also has equal numbers of 0's and 1's, no matter what
m we use, and there is no contradiction.
If we had set out originally to show that L_1 was not regular, we might have chosen an x
in L_1 but not in L. An example of an inappropriate choice is the string x = (01)^n. Although
this string is in L_1, and its length is at least n, look what happens when we try to produce a
contradiction. If x = uvw, we have these possibilities for v:

1. v = (01)^j for some j
2. v = 0(10)^j for some j
3. v = (10)^j for some j
4. v = 1(01)^j for some j
Unfortunately, none of the conditions (5.2)-(5.4) gives us any more information about v,
except for some upper bounds on j. In cases 2 and 4 we can obtain a contradiction because
the string v that is being pumped has unequal numbers of 0’s and 1’s. In the other two cases,
however, there is no contradiction, because uv^m w has equal numbers of 0's and 1's for any
m. We cannot guarantee that one of these cases does not occur, and therefore we are unable to
finish the proof using this choice of x.
Then just as in the two previous examples, if equations (5.1)-(5.4) are true, the string v is a
substring of the form 0^j (with j > 0) from the first part of x. We can obtain a contradiction
using either m = 0 or m > 1. In the first case, uv^m w = 0^(n-j)10^n, and in the second case, if
m = 2 for example, uv^m w = 0^(n+j)10^n. Neither of these is a palindrome, and it follows that L
cannot be regular.
It is often possible to get by with a weakened form of the pumping lemma. Here
are two versions that leave out many of the conclusions of Theorem 5.2a but are still
strong enough to show that certain languages are not regular.
Theorem 5.3 would be sufficient for Example 5.7, as you are asked to show in
Exercise 5.21, but it is not enough to take care of Examples 5.8 or 5.9.
Theorem 5.4 would not be enough to show that the language in Example 5.7 is
not regular. The next example shows a language for which it might be used.
The phrase not prime means factorable into factors 2 or bigger. We could choose m = p,
which would give

p + mq = p + pq = p(1 + q)

except that we are not certain that p ≥ 2. Instead let m = p + 2q + 2. Then

p + mq = p + (p + 2q + 2)q
       = (p + 2q) + (p + 2q)q
       = (p + 2q)(1 + q)
and this is clearly not prime.
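The algebra in this step is easy to confirm mechanically. This quick check (purely illustrative, not part of the proof) verifies that with m = p + 2q + 2, the number p + mq factors as (p + 2q)(1 + q), and that both factors are at least 2 whenever q ≥ 1:

```python
for p in range(0, 50):
    for q in range(1, 50):          # q = |v| > 0 in the pumping lemma
        m = p + 2 * q + 2
        assert p + m * q == (p + 2 * q) * (1 + q)
        assert p + 2 * q >= 2 and 1 + q >= 2   # so p + mq is not prime
print("all checks passed")
```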
This example has a different flavor from the preceding ones and seems to have more to
do with arithmetic, or number theory, than with languages. Yet it illustrates the fact, which
will become even more obvious in the later parts of this book, that many statements about
computation can be formulated as statements about languages. What we have found in this
example is that a finite automaton is not a powerful enough device (it does not have enough
memory) to solve the problem of determining, for an arbitrary integer, whether it is prime.
Corollary 5.1, in the first part of this chapter, gives a condition involving a
language that is necessary and sufficient for the language to be regular. Theorem 5.2a
gives a necessary condition. One might hope that it is also sufficient. This result (the
converse of Theorem 5.2a) would imply that for any nonregular language L, the
pumping lemma could be used to prove L nonregular; constructing the proof would
just be a matter of making the right choice for x. The next example shows that this
is not correct: Showing that the conclusions of Theorem 5.2a hold (i.e., showing that
there is no choice of x that produces a contradiction) is not enough to show that the
language is regular.
Any string of the form uv^m w is still of the form a^i b^j c^j and is therefore an element of L (whether
or not i is 0). If x = b^j c^k, then again let u = Λ and let v be the first symbol in x. It is still true
that uv^m w ∈ L for every m ≥ 0.
However, L is not regular, as you can show using Corollary 5.1. The details are almost
identical to those in Example 5.7 and are left to Exercise 5.22.
4. Given two finite automata M_1 and M_2, are there any strings that are accepted by
both?
5. Given two FAs M_1 and M_2, do they accept the same language? In other words,
is L(M_1) = L(M_2)?
6. Given two FAs M_1 and M_2, is L(M_1) a subset of L(M_2)?
7. Given two regular expressions r_1 and r_2, do they correspond to the same
language?
8. Given an FA M, is it a minimum-state FA accepting the language L(M)?
Problem 1 is a version of the membership problem for regular languages, except
that we start with a regular expression rather than a finite automaton. Because we
have an algorithm from Chapter 4 to take an arbitrary regular expression and produce
an FA accepting the corresponding language, we can reduce problem 1 to the version
of the membership problem previously mentioned.
Section 5.2 gives a decision algorithm for problem 8: Apply the minimization
algorithm to M, and see if the number of states is reduced. Of the remaining problems,
some are closely related to others. In fact, if we had an algorithm to solve problem
2, we could construct algorithms to solve problems 4 through 7. For problem 4, we
could first use the algorithm presented in Section 3.5 to construct a finite automaton
M recognizing L(M_1) ∩ L(M_2), and then apply to M the algorithm for problem
2. Problem 6 could be solved the same way, with L(M_1) ∩ L(M_2) replaced by
L(M_1) − L(M_2), because L(M_1) ⊆ L(M_2) if and only if L(M_1) − L(M_2) = ∅.
Problem 5 can be reduced to problem 6, since two sets are equal precisely when each
is a subset of the other. Finally, a solution to problem 6 would give us one to problem
7, because of our algorithm for finding a finite automaton corresponding to a given
regular expression.
Problems 2 and 3 remain. With regard to problem 2, one might ask how a finite
automaton could fail to accept any strings. A trivial way is for it to have no accepting
states. Even if M does have accepting states, however, it fails to accept anything
if none of its accepting states is reachable from the initial state. We can determine
whether this is true by calculating T_k, the set of states that can be reached from q0 by
using strings of length k or less, as follows:

T_0 = {q0}
T_k = T_(k-1) ∪ {δ(q, a) | q ∈ T_(k-1) and a ∈ Σ}   (for k > 0)

(T_k contains, in addition to the elements of T_(k-1), the states that can be reached in one
step from the elements of T_(k-1).)
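The computation of the sets T_k stops as soon as no new states appear, and L(M) is empty precisely when the final set contains no accepting state. A minimal sketch (the encoding of M is our own assumption):

```python
def accepts_nothing(q0, alphabet, delta, accepting):
    T = {q0}                                   # T_0
    while True:
        T_next = T | {delta[(q, a)] for q in T for a in alphabet}
        if T_next == T:                        # T_k = T_(k-1): done
            return not (T & accepting)         # empty iff no reachable accepting state
        T = T_next

# FA whose only accepting state, 2, is unreachable from state 0:
delta = {(0, 'a'): 1, (1, 'a'): 1, (2, 'a'): 2}
print(accepts_nothing(0, 'a', delta, {2}))  # True
```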
If n is the number of states of M, then one of the two outcomes of the algorithm
must occur by the time T_n has been computed. This implies that the following
algorithm would also work.
Begin testing all input strings, in nondecreasing order of length, for acceptance by M. If
no strings of length n or less are accepted, then L(M) = ∅.
Note, however, that this approach is likely to be much less efficient. For example,
if we test the string 0101100 and later the string 01011000, all but the last step of the
second test is duplicated effort.
The idea of testing individual strings in order to decide whether an FA accepts
something is naturally tempting, but useless as an algorithm without some way to
stop if the individual tests continue to fail. Only the fact that we can stop after testing
strings of length n makes the approach feasible. Theorem 5.2, the original form of
the pumping lemma, is another way to see that this is possible. The pumping lemma
implies that if x is any string in L of length at least n, then there is a shorter string in
L (the one obtained by deleting the middle portion v). Therefore, it is impossible for
the shortest string in the language to have length n or greater.
Perhaps surprisingly, the pumping lemma allows us to use a similar approach
with problem 3. If the FA M has n states, and x is any string in L of length at least
n, then there is a string y in L that is shorter than x but not too much shorter: There
exist u, v, and w with 0 < |v| ≤ n so that x = uvw ∈ L and y = uw ∈ L, so
that the difference in length between x and y is at most n. Now consider strings in
L whose length is at least n. If there are any at all, then the pumping lemma implies
that L must be infinite (because there are infinitely many strings of the form uv^m w);
in particular, if there is a string x ∈ L with n ≤ |x| < 2n, then L is infinite. On the
other hand, if there are strings in L of length at least n, it is impossible for the shortest
such string x to have length 2n or greater, because as we have seen, there would
then have to be a shorter string y ∈ L close enough in length to x so that |y| ≥ n.
Therefore, if L is infinite, there must be a string x ∈ L with n ≤ |x| < 2n. We have
therefore established that the following algorithm is a solution for problem 3.
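The resulting algorithm for problem 3 simply tests every string of length between n and 2n − 1, where n is the number of states. A brute-force sketch (the encoding of M is our own assumption; the search is exponential in n, so this is only an illustration of the idea):

```python
from itertools import product

def is_infinite(states, alphabet, delta, q0, accepting):
    def accepts(x):
        q = q0
        for a in x:
            q = delta[(q, a)]
        return q in accepting

    n = len(states)
    # L(M) is infinite iff M accepts some x with n <= |x| < 2n.
    return any(accepts(x)
               for length in range(n, 2 * n)
               for x in product(alphabet, repeat=length))

# One-state FA accepting a*: its language is infinite.
print(is_infinite([0], 'a', {(0, 'a'): 0}, 0, {0}))  # True
```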
There are at least two reasons for discussing, and trying to solve, decision prob-
lems like the ones in our list. One is the obvious fact that solutions may be useful.
For a not entirely frivolous example, picture your hard-working instructor grading an
exam question that asks for an FA recognizing a specific language. He or she knows
a solution, but one student’s paper shows a different FA. The instructor must then try
to determine whether the two are equivalent, and this means answering an instance
of problem 5. If the answer to a specific instance of the problem is the primary con-
cern, then whether there is an efficient, or feasible, solution is at least as important as
whether there is a solution in principle. The solution sketched above for problem 5
involves solving problem 2, and the second version of the decision algorithm given
for problem 2 would not help much in the case of machines with a hundred states,
even if a computer program and a fast computer were available.
Aside from the question of finding efficient algorithms, however, there is another
reason for considering these decision problems, a reason that will assume greater
significance later in the book. It is simply that not all decision problems can be solved.
An example of an easy-to-state problem that cannot be solved by any decision
algorithm was formulated in the 1930s by the mathematician Alan Turing. He de-
scribed a type of abstract machine, now called a Turing machine, more general than
a finite automaton. These machines can recognize certain languages in the same way
that FAs can recognize regular languages, and Turing’s original unsolvable problem
is simply the membership problem for this more general class of languages: Given a
Turing machine M and a string x, does M accept x? (see Section 11.2). Turing ma-
chines are involved in this discussion in another way as well, because such a machine
turns out to be a general model of computation. This is what allows us to formulate
the idea of an algorithm precisely and to say exactly what “unsolvable” means.
Showing the existence of unsolvable problems—particularly ones that arise nat-
urally and are easy to state—was a significant development in the theory of computa-
tion. The conclusion is that there are definite theoretical limits on what it is possible
to compute. These limits have nothing to do with how smart we are, or how good at
designing software; and they are not simply practical limits having to do with effi-
ciency and the amount of time available, or physical considerations like the number
of atoms available for constructing memory devices. Rather, they are fundamental
limits inherent in the rules of logic and the nature of computation. We will be in-
vestigating these matters later in the book; for the moment, it is reassuring to find
that many of the natural problems involving regular languages do have algorithmic
solutions.
EXERCISES
5.1. For which languages L ⊆ {0, 1}* is there only one equivalence class with
respect to the relation I_L?
5.2. Let x be an arbitrary string in {0, 1}*, and let L = {x}. How many
equivalence classes are there for the relation I_L? Describe them.
5.3. Find a language L ⊆ {0, 1}* for which every equivalence class of I_L has
exactly one element.
5.4. Show that for any language L ⊆ Σ*, the set
S = {x ∈ Σ* | x is not a prefix of any element of L}
is one equivalence class of I_L, provided it is not empty.
5.5. Let L ⊆ Σ* be any language. Show that if [Λ] (the equivalence class of I_L
containing Λ) is not {Λ}, then it is infinite.
5.6. Show that if L ⊆ Σ* is a language, x ∈ Σ*, and [x] (the equivalence class of
I_L containing x) is finite, then x is a prefix of an element of L.
5.7. For a certain language L ⊆ {a, b}*, I_L has exactly four equivalence classes.
They are [Λ], [a], [ab], and [b]. It is also true that the three strings a, aa,
and abb are all equivalent, and that the two strings b and aba are equivalent.
Finally, ab ∈ L, but Λ and a are not in L, and b is not even a prefix of any
element of L. Draw an FA accepting L.
5.8. Suppose there is a 3-state FA accepting L ⊆ {a, b}*. Suppose Λ ∉ L, b ∉ L,
and ba ∈ L. Suppose also that a I_L b, Λ I_L bab, a I_L aaa, and b I_L bb. Draw an
FA accepting L.
5.9. Suppose there is a 3-state FA accepting L ⊆ {a, b}*. Suppose Λ ∈ L, b ∈ L,
ba ∈ L, and baba ∈ L, and that Λ I_L a and a I_L bb. Draw an FA accepting L.
5.10. Find all possible languages L ⊆ {a, b}* for which I_L has these three
equivalence classes: the set of all strings ending in b, the set of all strings
ending in ba, and the set of all strings ending in neither b nor ba.
5.11. Find all possible languages L ⊆ {a, b}* for which I_L has three equivalence
classes, corresponding to the regular expressions ((a + b)a*b)*,
((a + b)a*b)*aa*, and ((a + b)a*b)*ba*, respectively.
5.12. In Example 5.2, if the language is changed to {0^n1^n | n ≥ 0} (i.e., Λ is added
to the original language), are there any changes in the partition of {0, 1}*
corresponding to I_L? Explain.
5.13. Consider the language L = {x ∈ {0, 1}* | n_0(x) = n_1(x)} (where n_0(x) and
n_1(x) are the number of 0's and the number of 1's, respectively, in x).
a. Show that if n_0(x) − n_1(x) = n_0(y) − n_1(y), then x I_L y.
b. Show that if n_0(x) − n_1(x) ≠ n_0(y) − n_1(y), then x and y are
distinguishable with respect to L.
c. Describe all the equivalence classes of I_L.
5.14. Let M = (Q, Σ, q0, A, δ) be an FA, and suppose that Q_1 is a subset of Q
such that δ(q, a) ∈ Q_1 for every q ∈ Q_1 and every a ∈ Σ.
a. Show that if Q_1 ∩ A = ∅, then for any p and q in Q_1, p ≡ q.
b. Show that if Q_1 ⊆ A, then for any p and q in Q_1, p ≡ q.
5.15. For a language L over Σ, and two strings x and y in Σ* that are
distinguishable with respect to L, let d_(L,x,y) be the length of the shortest
string z that distinguishes x and y with respect to L.
a. For the language L = {x ∈ {0, 1}* | x ends in 010}, find the maximum
of the numbers d_(L,x,y) over all possible pairs of distinguishable strings x
and y.
b. If L is the language of balanced strings of parentheses, and |x| = m and
|y| = n, find an upper bound involving m and n on the numbers d_(L,x,y).
5.16. For each of the FAs pictured in Figure 5.6, use the minimization algorithm
described in Algorithm 5.1 and illustrated in Example 5.6 to find a
minimum-state FA recognizing the same language. (It’s possible that the
given FA may already be minimal.)
5.17. Find a minimum-state FA recognizing the language corresponding to each of
these regular expressions.
a. (0*10 + 1*0)(01)*
b. (010)*1 + (1*0)*
5.18. Suppose that in applying Algorithm 5.1, we establish some fixed order in
which to process the pairs, and we follow the same order on each pass.
a. What is the maximum number of passes that might be required? Describe
an FA, and an ordering of the pairs, that would require this number.
b. Is there always a fixed order (depending on M) that would guarantee no
pairs are marked after the second pass, so that the algorithm terminates
after three passes?
5.19. For each of the NFA-Λs pictured in Figure 5.7, find a minimum-state FA
accepting the same language.
Figure 5.6 |
Figure 5.6 |
Continued
Figure 5.7 |
5.20. In each of the following cases, prove that L is nonregular by showing that
any two elements of the infinite set {0^n | n ≥ 0} are distinguishable with
respect to L.
a. L = {0^n10^n | n ≥ 0}
b. L = {0^n10^k | k > n ≥ 0}
c. L = {0^i1^j | j = i or j = 2i}
5.40. For two languages L_1 and L_2 over Σ, we define the quotient of L_1 and L_2 to
be the language
L_1/L_2 = {x | for some y ∈ L_2, xy ∈ L_1}
Show that if L_1 is regular and L_2 is any language, then L_1/L_2 is regular.
5.41. Suppose L is a language over Σ, and x_1, x_2, ..., x_n are strings that are
pairwise distinguishable with respect to L; that is, for any i ≠ j, x_i and x_j
are distinguishable. How many distinct strings are necessary in order to
distinguish between the x_i's? In other words, what is the smallest number k
so that for some set {z_1, z_2, ..., z_k}, any two distinct x_i's are distinguished,
relative to L, by some z_m? Prove your answer. (Here is a way of thinking
about the question that may make it easier. Think of the x_i's as points on a
piece of paper, and think of the z_m's as cans of paint, each z_m representing a
different primary color. Saying that z_m distinguishes x_i and x_j means that one
of those two points is colored with that primary color and the other isn't. We
allow a single point to have more than one primary color applied to it, and
we assume that two distinct combinations of primary colors produce
different resulting colors. Then the question is, how many different primary
colors are needed in order to color the points so that no two points end up the
same color?)
5.42. Suppose M = (Q, Σ, q0, A, δ) is an FA accepting L. We know (Lemma 5.2)
that if p, q ∈ Q and p ≢ q, then there is a string z so that exactly one of the
two states δ*(p, z) and δ*(q, z) is in A. Find an integer n (depending only
on M) so that for any p and q with p ≢ q, there is such a z with |z| ≤ n.
5.43. Show that L is regular if and only if there is an integer n so that any two
strings distinguishable with respect to L can be distinguished by a string of
length < n. (Use the two previous exercises.)
5.44. Suppose that M_1 = (Q_1, Σ, q_1, A_1, δ_1) and M_2 = (Q_2, Σ, q_2, A_2, δ_2) are
both FAs accepting the language L, and both have as few states as possible.
Show that M_1 and M_2 are isomorphic (see Exercise 3.55). Note that in both
cases, the sets L_q forming the partition of Σ* are precisely the equivalence
classes of I_L. This tells you how to come up with a bijection from Q_1 to Q_2.
What you must do next is to show that the other conditions of an
isomorphism are satisfied.
5.45. Use the preceding exercise to describe another decision algorithm to answer
the question “Given two FAs, do they accept the same language?”
5.46. Suppose L and L_1 are both languages over Σ, and M is an FA with alphabet
Σ. Let us say that M accepts L relative to L_1 if M accepts every string in
the set L ∩ L_1 and rejects every string in the set L_1 − L. Note that this is not
in general the same as saying that M accepts the language L ∩ L_1.
Now suppose L_1, L_2, ... are regular languages over Σ, L_i ⊆ L_(i+1) for
each i, and ∪L_i = Σ*. For each i, let n_i be the minimum number of
states required to accept L relative to L_i. If there is no FA accepting L
relative to L_i, we say n_i is ∞.
x2 = uvw
|v| > 0
CHAPTER 6
Context-Free Grammars
1. Λ ∈ L.
2. For any S ∈ L, Sa ∈ L.
3. For any S ∈ L, Sb ∈ L.
4. No other strings are in L.
to describe the sequence of steps (the application of rules 2, 3, 3, and 1) used to obtain, or
204 PART 3 Context-Free Languages and Pushdown Automata
derive, the string bba. The derivation comes to an end at the point where we replace S by an
actual string of alphabet symbols (in this case Λ); each step before that is roughly analogous
to a recursive call, since we are replacing S by a string that still contains a variable.
We can simplify the notation even further by introducing the symbol | to mean “or” and
writing the first three rules as
S → Λ | Sa | Sb
(In our new notation, we dispense with writing rule 4, even though it is implicitly still in effect.)
We note for future reference that in an expression such as Sa | Sb, the two alternatives are Sa
and Sb, not a and S—in other words, the concatenation operation takes precedence over the |
operation.
In Example 2.15 we also considered this alternative definition of L:
1. Λ ∈ L.
2. a ∈ L.
3. b ∈ L.
4. For every x and y in L, xy ∈ L.
5. No other strings are in L.

Using our new notation, we would summarize the "grammar rules" by writing

S → Λ | a | b | SS
With this approach there is more than one way to obtain the string bba. Two derivations are
shown below:

S ⇒ SS ⇒ bS ⇒ bSS ⇒ bbS ⇒ bba
S ⇒ SS ⇒ Sa ⇒ SSa ⇒ bSa ⇒ bba

The five steps in the first line correspond to rules 4, 3, 4, 3, and 2, and those in the second line
to rules 4, 2, 4, 3, and 3.
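Derivations like these can be simulated mechanically by replacing the leftmost occurrence of S at each step. This small sketch (our own illustration, not from the text) reproduces the first of the two derivations of bba:

```python
def derive(start, rhs_sequence):
    # Apply each right-hand side in turn to the leftmost 'S'.
    steps = [start]
    current = start
    for rhs in rhs_sequence:
        current = current.replace('S', rhs, 1)
        steps.append(current)
    return steps

# S => SS => bS => bSS => bbS => bba
print(derive('S', ['SS', 'b', 'SS', 'b', 'a']))
# ['S', 'SS', 'bS', 'bSS', 'bbS', 'bba']
```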
In both cases in Example 6.1, the formulas obtained from the recursive definition
can be interpreted as the grammar rules in a context-free grammar. Before we give the
official definition of such a grammar, we consider two more examples. In Example
6.2, although the grammar is perhaps even simpler than that in Example 6.1, the
corresponding language is one we know to be nonregular. Example 6.3 is perhaps
more typical in that it includes a grammar containing more than one variable.
S → aSb | Λ
1. Λ ∈ L.
2. For every x ∈ L, axb ∈ L.
3. Nothing else is in L.
The language L is easily seen to be the nonregular language {a^n b^n | n ≥ 0}. The grammar
rules in the first part of Example 6.1, for the language {a, b}*, allow a's and b's to be added
independently of each other. Here, on the other hand, each time one symbol is added to one
end of the string by an application of the grammar rule S → aSb, the opposite symbol is
added simultaneously to the other end. As we have seen, this constraint is not one that can be
captured by any regular expression.
EXAMPLE 6.3   Palindromes
Let us consider both the language pal of palindromes over the alphabet {a, b} and its
complement N, the set of all nonpalindromes over {a, b}. From Example 2.16, we have the following
recursive definition of pal:
1. Λ, a, b ∈ pal.
2. For any S ∈ pal, aSa and bSb are in pal.
3. No other strings are in pal.
We can therefore describe pal by the context-free grammar with the grammar rules
S → Λ | a | b | aSa | bSb
The language N also obeys rule 2: For any nonpalindrome x, both axa and bxb are nonpalin-
dromes. However, a recursive definition of N cannot be as simple as this definition of pal,
because there is no finite set of strings that can serve as basis elements in the definition. There
is no finite set No so that every nonpalindrome can be obtained from an element of No by
applications of rule 2 (Exercise 6.42). Consider a specific nonpalindrome, say
abbaaba
If we start at the ends and work our way in, trying to match the symbols at the beginning
with those at the end, the string looks like a palindrome for the first two steps. What makes
it a nonpalindrome is the central portion baa, which starts with one symbol and ends with the
opposite symbol; the string between those two can be anything. A string is a nonpalindrome if
and only if it has a central portion of this type. These are the “basic” nonpalindromes, and we
can therefore write the following definition of N:

1. For every x ∈ {a, b}*, axb and bxa are in N.
2. For every x ∈ N, axa and bxb are in N.
3. No other strings are in N.

In order to obtain a context-free grammar describing N, we can now simply introduce a second
variable A, representing an arbitrary element of {a, b}*, and incorporate the grammar rules for
this language from Example 6.1:
S → aAb | bAa | aSa | bSb
A → Λ | Aa | Ab
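Because the variable A derives every string over {a, b}, membership in the language of this grammar can be tested directly from the structure of the productions. The following Python sketch (ours, not the text's) does so, and then checks exhaustively that the grammar's language agrees with the set of nonpalindromes up to length 7:

```python
from itertools import product

def derivable(x: str) -> bool:
    """Membership test for S -> aAb | bAa | aSa | bSb,
    A -> Λ | Aa | Ab, read off from the productions:
    A derives every string over {a, b}."""
    if len(x) >= 2 and x[0] != x[-1]:
        return True                     # S -> aAb or S -> bAa
    if len(x) >= 3 and x[0] == x[-1]:
        return derivable(x[1:-1])       # S -> aSa or S -> bSb
    return False

# the grammar generates exactly the nonpalindromes over {a, b}
for n in range(8):
    for t in product("ab", repeat=n):
        x = "".join(t)
        assert derivable(x) == (x != x[::-1])
print("checked all strings up to length 7")
```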
206 PART 3 Context-Free Languages and Pushdown Automata
α ⇒G β

means that the string β can be obtained from the string α by replacing some variable
that appears on the left side of a production in G by the corresponding right side; in other words, that

α = α₁Aα₂
β = α₁γα₂

and one of the productions in G is A → γ. (We can now understand better the term
context-free. If at some point in a derivation we have obtained a string α containing
the variable A, then we may continue by substituting γ for A, no matter what the
strings α₁ and α₂ are; that is, independent of the context.)

In this case, we will say that α derives β, or β is derived from α, in one step.
More generally, we write

α ⇒*G β

to mean that β can be obtained from α in zero or more steps.
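The one-step relation α ⇒ β can be implemented almost verbatim. In this Python sketch (our own illustration), variables are uppercase letters, Λ is written as the empty string, and a grammar is a dictionary from variables to lists of right sides:

```python
def one_step(alpha, productions):
    """All strings beta with alpha => beta in one step: replace a single
    occurrence of a variable A by the right side of some production
    A -> gamma.  `productions` maps each variable to its right sides."""
    out = set()
    for i, sym in enumerate(alpha):
        if sym in productions:                 # sym is a variable A
            for gamma in productions[sym]:     # A -> gamma
                out.add(alpha[:i] + gamma + alpha[i + 1:])
    return out

# S -> aSb | Λ  (Λ written as the empty string)
g = {"S": ["aSb", ""]}
print(sorted(one_step("aSb", g)))   # ['aaSbb', 'ab']
```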
CHAPTER 6 Context-Free Grammars 207
S → S + S | S − S | S ∗ S | S/S | (S) | a
The string a + (a ∗ a)/a − a can be obtained from the derivation

S ⇒ S − S ⇒ S + S − S ⇒ a + S − S ⇒ a + S/S − S ⇒ a + (S)/S − S ⇒ a + (S ∗ S)/S − S ⇒ a + (a ∗ S)/S − S ⇒ a + (a ∗ a)/S − S ⇒ a + (a ∗ a)/a − S ⇒ a + (a ∗ a)/a − a

It is easy to see that there are many other derivations as well. For example,

S ⇒ S/S ⇒ S + S/S ⇒ a + S/S ⇒ a + (S)/S ⇒ a + (S ∗ S)/S ⇒ a + (a ∗ S)/S ⇒ a + (a ∗ a)/S ⇒ a + (a ∗ a)/S − S ⇒ a + (a ∗ a)/a − S ⇒ a + (a ∗ a)/a − a
We would probably say that the first of these is more natural than the second. The first starts
with the production
S → S − S
and therefore indicates that we are interpreting the original expression as the difference of two
other expressions. This seems correct because the expression would normally be evaluated as
follows:
The expression “is” the difference of the subexpression with value C and the subexpression a.
The second derivation, by contrast, interprets the expression as a quotient. Although there is
nothing in the grammar to rule out this derivation, it does not reflect our view of the correct
structure of the expression.
One possible conclusion is that the context-free grammar we have given for the language
may not be the most appropriate. It does not incorporate in any way the standard conventions,
having to do with the precedence of operators and the left-to-right order of evaluation, that
we use in evaluating the expression. (Precedence of operators dictates that in the expression
a + b ∗ c, the multiplication is performed before the addition; and the expression a − b + c
means (a − b) + c, not a − (b + c).) Moreover, rather than having to choose between two
derivations of a string, it is often desirable to select, if possible, a CFG in which a string can have
only one derivation (except for trivial differences between the order in which two variables
in some intermediate string are chosen for replacement). We will return to this question in
Section 6.4, when we discuss ambiguity in a CFG.
The syntax of these two types of statements can be described by the rules

⟨statement⟩ → if ( ⟨expression⟩ ) ⟨statement⟩
⟨statement⟩ → while ( ⟨expression⟩ ) ⟨statement⟩

where ⟨expression⟩ is another variable, whose productions would also be difficult to describe
completely.

Although in both cases the last term on the right side specifies a single statement, the logic
of a program often requires more than one. It is therefore necessary to have our definition of
⟨statement⟩ allow a compound statement, which is simply a sequence of zero or more statements
enclosed within {}. We could easily write a definition for ⟨compound-statement⟩ that would
say this. A syntax diagram such as the one shown accomplishes the same thing.
[Syntax diagram: {, followed by zero or more occurrences of ⟨statement⟩, followed by }]
A path through the diagram begins with {, ends with }, and can traverse the loop zero or more
times.
More than one sentence derivable from this grammar does not quite work: “John reminded her-
self” and “Jane reminded himself,” for example. These could be eliminated in a straightforward
way (at the cost of complicating the grammar) by introducing productions like
⟨declarative sentence⟩ → ⟨masculine noun⟩ ⟨verb⟩ ⟨masculine reflexive pronoun⟩
A slightly more subtle problem is "Jane reminded Jane." Normally we do not say this, unless
we have in mind two different people named Jane, but there is no obvious way to prohibit it
without also prohibiting “Jane reminded John.” (At least, there is no obvious way without
essentially using a different production for every sentence we want to end up with. This trivial
option is available here, since the language is finite.) To distinguish “Jane reminded John,”
which is a perfectly good English sentence, from “Jane reminded Jane” requires using context,
and this is exactly what a context-free grammar does not allow.
S → Λ | 0S1 | 1S0
Not every string in L can be obtained from these productions, because some elements of L begin
and end with the same symbol; the strings 0110, 10001101, and 0010111100 are examples.
If we look for ways of expressing each of these in terms of simpler elements of L, we might
notice that each is the concatenation of two nonnull elements of L (for example, the third string
is the concatenation of 001011 and 1100). This observation suggests the production S > SS.
It is reasonably clear that if G is the CFG containing the productions we have so far, every
string derivable from S is in L. In the other direction, consider a string x ∈ L that begins and
ends with the same symbol, say x = 0y0; we would like to show that x has a derivation in G.
Such a derivation would have to start with the
production S → SS; in order to show that there is such a derivation, we would like to show
that x = wz, where w and z are shorter strings that can both be derived from S. (It will then
follow that we can start the derivation with S → SS, then continue by deriving w from the first
S and z from the second.) Another way to express this condition is to say that x has a prefix w
so that 0 < |w| < |x| and d(w) = 0, where d(u) denotes n₀(u) − n₁(u).
Let us consider d(w) for prefixes w of x. The shortest nonnull prefix is 0, and d(0) = 1;
the longest prefix shorter than x is 0y, and d(0y) = −1 (because the last symbol of x is 0,
and d(x) = 0). Furthermore, the d-value of a prefix changes by 1 each time an extra symbol
is added. It follows that there must be a prefix w, longer than 0 and shorter than 0y, with
d(w) = 0. This is what we wanted to prove. The case when x = 1y1 is almost the same, and
so the proof is concluded.
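The existence argument in this proof is constructive: scanning the prefixes of x and watching d change by 1 per symbol locates the split point. A short Python sketch (function names are ours):

```python
def d(w):
    """d(w) = (number of 0's in w) - (number of 1's in w)."""
    return w.count("0") - w.count("1")

def split(x):
    """For x with d(x) = 0 that begins and ends with the same symbol,
    return (w, z) with x = wz, 0 < |w| < |x|, and d(w) = d(z) = 0.
    Such a prefix exists because d changes by 1 per symbol and has
    opposite signs at the prefixes of length 1 and |x| - 1."""
    for k in range(1, len(x)):
        if d(x[:k]) == 0:
            return x[:k], x[k:]
    raise ValueError("x does not have the required form")

print(split("00111100"))   # → ('0011', '1100')
```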
Let us continue with the language L = {x ∈ {0, 1}* | n₀(x) = n₁(x)} of the last example;
this time we construct a CFG with three variables, based on a different approach to a recursive
definition of L.
One way to obtain an element of L is to add both symbols to a string already in L. Another
way, however, is to add a single symbol to a string that has one extra occurrence of the opposite
symbol. Moreover, every element of L can be obtained this way and in fact can be obtained by
adding this extra symbol at the beginning. Let us introduce the variables A and B, to represent
strings with an extra 0 and an extra 1, respectively, and let us denote these two languages by
L₀ and L₁:

S → 0B | 1A | Λ
It is also easy to find one production for each of the variables A and B. If a string in L₀ begins
with 0, or if a string in L₁ begins with 1, then the remainder is an element of L. Thus, it is
appropriate to add the productions
A → 0S        B → 1S
What remains are the strings in L₀ that start with 1 and the strings in L₁ that start with 0. In the
first case, if x = 1y and x ∈ L₀, then y has two more 0's than 1's. If it were true that y could
be written as the concatenation of two strings, each with one extra 0, then we could complete
the A-productions by adding A → 1AA, and we could handle B similarly.
In fact, the same technique we used in Example 6.7 will work here. If d(x) = 1 and
x = 1y, then Λ is a prefix of y with d(Λ) = 0, and y itself is a prefix of y with d(y) = 2.
Therefore, there is some intermediate prefix w of y with d(w) = 1, and y = wz, where
w, z ∈ L₀.
This discussion should make it at least plausible that the context-free grammar with
productions
S → 0B | 1A | Λ
A → 0S | 1AA
B → 1S | 0BB
generates the language L. By taking the start variable to be A or B, we could just as easily
think of it as a CFG generating L₀ or L₁. It is possible without much difficulty to give an
induction proof; see Exercise 6.50.
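One way to test such a claim experimentally is to enumerate every terminal string the grammar can derive up to some length and compare with the balanced strings of that length. A Python sketch (ours), with Λ written as the empty string:

```python
from itertools import product

# S: equal numbers of 0's and 1's; A: one extra 0; B: one extra 1
PRODS = {"S": ["0B", "1A", ""], "A": ["0S", "1AA"], "B": ["1S", "0BB"]}

def terminal_strings(prods, start, n):
    """All terminal strings of length <= n derivable from `start`, found
    by exhaustive leftmost derivation; a sentential form is pruned once
    it already contains more than n terminal symbols."""
    seen, stack, result = {start}, [start], set()
    while stack:
        s = stack.pop()
        i = next((i for i, c in enumerate(s) if c in prods), None)
        if i is None:                        # no variables left
            result.add(s)
            continue
        for rhs in prods[s[i]]:
            t = s[:i] + rhs + s[i + 1:]
            if t not in seen and sum(c not in prods for c in t) <= n:
                seen.add(t)
                stack.append(t)
    return result

derived = terminal_strings(PRODS, "S", 6)
expected = {"".join(p) for k in range(7) for p in product("01", repeat=k)
            if p.count("0") == p.count("1")}
assert derived == expected
print(f"grammar generates exactly the {len(expected)} balanced strings of length <= 6")
```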
The following theorem provides three simple ways of obtaining new CFLs from
languages that are known to be context-free.
Note that it really is necessary in the first two parts of the proof to make sure that
V₁ ∩ V₂ = ∅. Consider CFGs having productions

S₁ → XA        X → 0        A → 1

and

S₂ → A        A → a

respectively. If we applied the construction in the first part of the proof without
relabeling variables, the resulting grammar would allow the derivation

S ⇒ S₁ ⇒ XA ⇒ 0A ⇒ 0a

even though 0a is not derivable from either of the two original grammars.
Corollary 6.1 Every regular language is a CFL.
Proof
According to Definition 3.1, regular languages over Σ are the languages obtained
from ∅, {Λ}, and {a} (a ∈ Σ) by using the operations of union, concatenation, and
Kleene ∗. Each of the primitive languages ∅, {Λ}, and {a} is a context-free language.
(In the first case we can use the trivial grammar with no productions, and in the
other two cases one production is sufficient.) The corollary therefore follows from
Theorem 6.1, using the principle of structural induction.
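The three constructions in the proof of Theorem 6.1 can be sketched in Python. Here a grammar is a pair (start, productions), right sides are tuples of symbols, and appending the tags "1" and "2" performs the relabeling that keeps the two variable sets disjoint (assuming no original variable name already ends in such a tag):

```python
def relabel(g, tag):
    """Rename every variable X of grammar g = (start, prods) to X + tag,
    so that two grammars share no variables."""
    start, prods = g
    def r(sym):
        return sym + tag if sym in prods else sym
    new = {r(v): [tuple(r(s) for s in rhs) for rhs in alts]
           for v, alts in prods.items()}
    return start + tag, new

def union(g1, g2):
    s1, p1 = relabel(g1, "1")
    s2, p2 = relabel(g2, "2")
    return "S", {"S": [(s1,), (s2,)], **p1, **p2}     # S -> S1 | S2

def concat(g1, g2):
    s1, p1 = relabel(g1, "1")
    s2, p2 = relabel(g2, "2")
    return "S", {"S": [(s1, s2)], **p1, **p2}         # S -> S1 S2

def star(g):
    s1, p1 = relabel(g, "1")
    return "S", {"S": [(), (s1, "S")], **p1}          # S -> Λ | S1 S

g_anbn = ("S", {"S": [("a", "S", "b"), ()]})          # {a^n b^n}
g_c = ("S", {"S": [("c",)]})                          # {c}
print(union(g_anbn, g_c))
```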
(011 + 1)*(01)*
We can take a few obvious shortcuts in the algorithm provided by the proof of Theorem 6.1.
The productions
A → 011 | 1
generate the language {011, 1}. Following the third part of the theorem, we can use the
productions
B → AB | Λ
A → 011 | 1
with B as the start symbol to generate the language {011, 1}*. Similarly, we can use
C → DC | Λ
D → 01
to derive {01}* from the start symbol C. Finally, we generate the concatenation of the two
languages by adding the production S → BC. The final grammar has start symbol S, auxiliary
variables A, B, C, and D, and productions
S → BC
B → AB | Λ
A → 011 | 1
C → DC | Λ
D → 01
Starting with any regular expression, we can obtain an equivalent CFG using the
techniques illustrated in this example. In the next section we will see that any regular
language L can also be described by a CFG whose productions all have a very simple
form, and that such a CFG can be obtained easily from an FA accepting L.
S → 0 | S0 | 0S
We also need to be able to add 1's to our strings. We cannot expect that adding a 1 to an
element of L₀ will always produce an element of L₀; however, if we have two strings in L₀,
concatenating them produces a string with at least two more 0's than 1's, and then adding a
single 1 will still yield an element of L₀. We could add it at the left, at the right, or between
the two. The corresponding productions are
S → 1SS | SS1 | S1S
It is not hard to see that any string derived by using the productions
S → 0 | S0 | 0S | 1SS | SS1 | S1S
is an element of Lo (see Exercise 6.43). In the converse direction we can do even a little better:
if G₀ is the grammar with productions

S → 0 | 0S | 1SS | SS1 | S1S

then every element of L₀ is derivable from S in G₀. The proof is by strong induction on |x|;
in the case where x begins with 0, ends with 0, and contains at least one 1, we would like to
show that x can be written as w1z, where w and z are shorter elements of L₀.
Once we have done this, the induction hypothesis will tell us that both w and z can be derived
from S, so that we will be able to derive x by starting with the production
S → S1S
To show that x has this form, suppose x contains n 1's, where n ≥ 1. For each i with 1 ≤ i ≤ n,
let wᵢ be the prefix of x up to but not including the ith 1, and zᵢ the suffix of x that follows this
1. In other words, for each i,

x = wᵢ1zᵢ
where the 1 shown is the ith 1 in x. If d(wₙ) > 0, then we may let w = wₙ and z = zₙ. The string
zₙ is 0ʲ for some j > 0 because x ends with 0, and we have the result we want. Otherwise,
d(wₙ) ≤ 0. In this case we select the first i with d(wᵢ) ≤ 0, say i = m. Now since x
begins with 0, d(w₁) must be > 0, which implies that m ≥ 2. At this point, we can say that
d(wₘ₋₁) > 0 and d(wₘ) ≤ 0. Because wₘ has only one more 1 than wₘ₋₁, d(wₘ₋₁) can be no
more than 1. Therefore, d(wₘ₋₁) = 1. Since x = wₘ₋₁1zₘ₋₁, and d(x) > 0, it follows that
d(zₘ₋₁) > 0. This means that we get the result we want by letting w = wₘ₋₁ and z = zₘ₋₁.
The proof in this case is complete.
For the other two cases, the one in which x starts with 1 and the one in which x ends with
1, see Exercise 6.44.
Now it is easy enough to obtain a context-free grammar G generating L. We use S as our
start symbol, A as the start symbol of the grammar we have just derived generating L₀, and B
as the start symbol for the corresponding grammar generating L₁. The grammar G then has
the productions
S → A | B
A → 0 | 0A | 1AA | AA1 | A1A
B → 1 | 1B | 0BB | BB0 | B0B
The second part of Theorem 6.1, applied twice, reduces the problem to finding CFGs for these
three languages.
L₁ is essentially the language in Example 6.2, L₃ is the same with the symbols 0 and 1
reversed, and L₂ can be generated by the productions

B → 1B | 1

(The second production is B → 1, not B → Λ, since we want only nonnull strings.)
The final CFG G = (V, Σ, S, P) incorporating these pieces is shown below.

V = {S, A, B, C}        Σ = {0, 1}
P = { S → ABC,
      A → 0A1 | Λ,
      B → 1B | 1,
      C → 1C0 | Λ }

A derivation of 0¹1⁴0² = (01)(1)(1²0²), for example, is

S ⇒ ABC ⇒ 0A1BC ⇒ 01BC ⇒ 011C ⇒ 0111C0 ⇒ 01111C00 ⇒ 0111100
The FA shown in Figure 6.1 accepts the string 110001010. As it reads the string, the states it passes through after each prefix are:

1 → B,  11 → B,  110 → C,  1100 → A,  11000 → A,  110001 → B,  1100010 → C,  11000101 → B,  110001010 → C
Figure 6.1 |
[An FA with states A (the initial state), B, and C (the accepting state), and transitions δ(A, 0) = A, δ(A, 1) = B, δ(B, 0) = C, δ(B, 1) = B, δ(C, 0) = A, δ(C, 1) = B]
A ⇒ 1B ⇒ 11B ⇒ 110C ⇒ 1100A ⇒ 11000A ⇒ 110001B ⇒ 1100010C ⇒ 11000101B ⇒ 110001010C
This looks like a derivation in a grammar. The grammar can be obtained by
specifying the variables to be the states of the FA and starting with the productions
A → 1B
B → 1B
B → 0C
C → 0A
A → 0A
C → 1B
These include every production of the form
P → aQ
where
δ(P, a) = Q
is a transition in the FA. The start symbol is A, the initial state of the FA. To complete
the derivation, we must remove the C from the last string. We do this by adding the
production B — 0, so that the last step in the derivation is actually
11000101B ⇒ 110001010
Note that the production we have added is of the form
P → a
where
δ(P, a) = F
is a transition from P to an accepting state F.
Any FA leads to a grammar in exactly this way. In our example it is easy to see
that the language generated is exactly the one recognized by the FA. In general, we
must qualify the statement slightly because the rules we have described for obtaining
productions do not allow Λ-productions; however, it will still be true that the nonnull
strings accepted by the FA are precisely those generated by the resulting grammar.
A significant feature of any derivation in such a grammar is that until the last step
there is exactly one variable in the current string; we can think of it as the “state of
the derivation,” and in this sense the derivation simulates the processing of the string
by the FA.
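The construction just described is short enough to write out. The transitions below are the ones read off from the productions above (with C taken as the accepting state, since the added production was B → 0); the function itself is our sketch of the general rule:

```python
def fa_to_grammar(delta, q0, accepting):
    """Regular grammar from an FA: for each transition delta[(P, a)] = Q,
    add P -> aQ, and also P -> a whenever Q is an accepting state.
    Returns (start symbol, list of productions as (lhs, rhs) pairs)."""
    prods = []
    for (p, a), q in delta.items():
        prods.append((p, a + q))          # P -> aQ
        if q in accepting:
            prods.append((p, a))          # P -> a
    return q0, prods

# the FA of Figure 6.1: states A (initial), B, C (accepting)
delta = {("A", "0"): "A", ("A", "1"): "B",
         ("B", "0"): "C", ("B", "1"): "B",
         ("C", "0"): "A", ("C", "1"): "B"}
start, prods = fa_to_grammar(delta, "A", {"C"})
for lhs, rhs in prods:
    print(lhs, "->", rhs)
```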
In general: suppose M is an FA accepting a language, and G is the grammar obtained
from M as described, with a production B → aC for every transition δ(B, a) = C and,
in addition, a production B → a whenever the state δ(B, a) is accepting. If a string
x = a₁a₂⋯aₙ is accepted by M, and n ≥ 1, then the sequence of states through which M
passes corresponds to the sequence of variables in a derivation of x in G, the last step
using a production of the form B → a; conversely, every derivation in G traces out an
accepting computation of M.
Sometimes the term regular is applied to grammars that do not restrict the form
of the productions so severely. It can be shown (Exercise 6.12) that a language is
regular if and only if it can be generated, except possibly for the null string, by a
grammar in which all productions look like this:
B → xC
B → x
where B and C are variables and x is a nonnull string of terminals. Grammars of this
type are also called linear. The exercises discuss a few other variations as well.
a string α for which the production A → α is used in the derivation. (In the case of
a production A → Λ, the node labeled A has the single child Λ.)
In the simplest case, when the tree is the derivation tree for a string x ∈ L(G)
and there are no "Λ-productions" (of the form A → Λ), the leaf nodes of the tree
correspond precisely to the symbols of x. If there are Λ-productions, they show up
in the tree, so that some of the leaf nodes correspond to Λ; of course, those nodes can
be ignored as one scans the leaf nodes to see the string being derived, because Λ's
can be interspersed arbitrarily among the terminals without changing the string. In
the most general case, we will also allow “derivations” that begin with some variable
other than the start symbol of the grammar, and the string being derived may still
contain some variables as well as terminal symbols.
In Example 6.4 we considered the CFG with productions

S → S + S | S − S | S ∗ S | S/S | (S) | a

The first derivation given there of the string a + (a ∗ a)/a − a has the derivation tree shown in
Figure 6.2a, and the second has the derivation tree shown in Figure 6.2b. In general, any derivation of a string in
a CFL has a corresponding derivation tree (exactly one).
(There is a technical point here that is worth mentioning: With the two productions
S → SS | a, the sequence of steps S ⇒ SS ⇒ SSS ⇒* aaa can be interpreted
two ways, because in the second step it could be either the first S or the second that
is replaced by SS. The two interpretations correspond to two different derivation
trees. For this reason, we say that specifying a derivation means giving not only the
sequence of strings but also the position in each string at which the next substitution
occurs. The steps S ⇒ SS ⇒ SSS already represent two different derivations.)
Figure 6.2 |
Derivation trees for two algebraic
expressions.
Algebraic expressions such as the two shown in Figure 6.2 are often represented
by expression trees—binary trees in which terminal nodes correspond to identifiers
or constants and nonterminal nodes correspond to operators. The expression tree in
Figure 6.3 conveys the same information as the derivation tree in Figure 6.2a, except
that only the nodes representing terminal symbols are drawn.
One step in a derivation is the replacement of a variable (to be precise, a particular
occurrence of a variable) by the string on the right side of a production. The derivation
itself is the entire sequence of such steps, and in a sequence the order of the steps is
significant.

Figure 6.3 |
Expression tree corresponding to Figure 6.2a.
We conclude that a string of terminals has more than one derivation tree if and
only if it has more than one leftmost derivation. Notice that in this discussion “left-
most” could just as easily be “rightmost”; the important thing is not what order is
followed, only that some clearly defined order be followed consistently, so that the
two normalized versions can be compared meaningfully.
As we have already noticed, a string can have two or more essentially different
derivations in the same CFG.
It is not hard to see that the ambiguity defined here is closely related to the
ambiguity we encounter every day in written and spoken language. The reporter
who wrote the headline “Disabled Fly to See Carter,” which appeared during the
administration of the thirty-ninth U.S. President, probably had in mind a derivation
such as
S → ⟨adjective⟩ ⟨noun⟩ ⋯
S → S + S | S − S | S ∗ S | S/S | (S) | a

a + (a ∗ a)/a − a
and in fact the two derivations were both leftmost, which therefore demonstrates the ambiguity
of the grammar. This can also be demonstrated using only the productions S → S + S and
S → a; the string a + a + a has leftmost derivations

S ⇒ S + S ⇒ a + S ⇒ a + S + S ⇒ a + a + S ⇒ a + a + a

and

S ⇒ S + S ⇒ S + S + S ⇒ a + S + S ⇒ a + a + S ⇒ a + a + a
The corresponding derivation trees are shown in Figures 6.4a and 6.4b, respectively.
Figure 6.4 |
Two derivation trees for a + a + a.
Although the difference in the two interpretations of the string a + a +a is not quite as
dramatic as in Example 6.4 (the expression is viewed as the sum of two subexpressions in both
cases), the principle is the same. The expression is interpreted as a + (a + a) in one case, and
(a +a) +a in the other. The parentheses might be said to remove the ambiguity as to how the
expression is to be interpreted. We will examine this property of parentheses more carefully
in the next section, when we discuss an unambiguous CFG equivalent to this one.
It is easy to see by studying Example 6.12 that every CFG containing a production
of the general form A → AαA is ambiguous. However, there are more subtle ways
in which ambiguity occurs, and characterizing the ambiguous context-free grammars
in any nontrivial way turns out to be difficult or impossible (see Section 11.6).
describing the if statement of Example 6.5 as well as the related if-else statement, both part of
the C language. Now consider the statement

if ( expr1 ) if ( expr2 ) f( ); else g( );

This can be derived in two ways from the grammar rules. In one, illustrated in Figure 6.5a,
the else goes with the first if, and in the other, illustrated in Figure 6.5b, it goes with the second.
A C compiler should interpret the statement the second way, but not as a result of the syntax
rules given; this is additional information with which the compiler must be furnished.
Just as in Example 6.12, parentheses or their equivalent could be used to remove the
ambiguity in the statement.
Figure 6.5 |
Two interpretations of a "dangling else."
⟨statement⟩ → ⟨st1⟩ | ⟨st2⟩
⟨st1⟩ → if ( ⟨expression⟩ ) ⟨st1⟩ else ⟨st1⟩ | ⟨otherstatement⟩
⟨st2⟩ → if ( ⟨expression⟩ ) ⟨statement⟩ | if ( ⟨expression⟩ ) ⟨st1⟩ else ⟨st2⟩
These generate the same strings as the original rules and can be shown to be unambiguous.
Although we will not present a proof of either fact, you can see the intuitive reason for
the second. The variable ⟨st1⟩ represents a statement in which every if is matched by a
corresponding else, while any statement derived from ⟨st2⟩ contains at least one unmatched
if. The only variable appearing before else in these formulas is ⟨st1⟩; since the else cannot
match any of the ifs in the statement derived from ⟨st1⟩, it must match the if that appeared
in the formula with the else.
It is interesting to compare both these sets of formulas with the corresponding ones in the
official grammar for the Modula-2 programming language:
These obviously resemble the rules for C in the first set above. However, the explicit END after
each sequence of one or more statements allows the straightforward grammar rule to avoid the
“dangling else” ambiguity. The Modula-2 statement corresponding most closely to the tree in
Figure 6.5a is
S → S + S | S ∗ S | (S) | a
Once we obtain an unambiguous grammar equivalent to this one, it will be easy to
reinstate the other operators.

Our final grammar will not have either S → S + S or S → S ∗ S, because either
production by itself is enough to produce ambiguity. We will also keep in mind the
possibility, mentioned in Example 6.4, of incorporating into the grammar the standard
rules of order and operator precedence: ∗ should have higher precedence than +,
and a + a + a should "mean" (a + a) + a, not a + (a + a).
In trying to eliminate S → S + S and S → S ∗ S, it is helpful to remember
Example 2.15, where we discussed possible recursive definitions of L*. Two possible
ways of obtaining new elements of L* are to concatenate two elements of L* and
to concatenate an element of L* with an element of L; we observed that the second
approach preserves the direct correspondence between one application of the recursive
rule and one of the "primitive" strings being concatenated. Here this idea suggests
that we replace S → S + S by either S → S + T or S → T + S, where the variable
T stands for a term, an expression that cannot itself be expressed as a sum. If we
remember that a + a + a = (a + a) + a, we would probably choose S → S + T
as more appropriate; in other words, an expression consists of (all but the last term)
plus the last term. Because an expression can also consist of a single term, we will
also need the production S → T. At this point, we have

S → S + T | T
We may now apply the same principle to the set of terms. Terms can be products;
however, rather than thinking of a term as a product of terms, we introduce factors,
which are terms that cannot be expressed as products. The corresponding productions
are
T → T ∗ F | F
So far we have a hierarchy of levels. Expressions, the most general objects, are
sums of one or more terms, and terms are products of one or more factors. This hierar-
chy incorporates the precedence of multiplication over addition, and the productions
we have chosen also incorporate the fact that both the + and ∗ operations associate
to the left.
It should now be easy to see where parenthesized expressions fit into the hierarchy.
(Although we might say (A) could be an expression or a term or a factor, we should
permit ourselves only one way of deriving it, and we must decide which is most
appropriate.) A parenthesized expression cannot be expressed directly as either a sum
or a product, and it therefore seems most appropriate to consider it a factor. To say
it another way, evaluation of a parenthetical expression should take precedence over
any operators outside the parentheses; therefore, (A) should be considered a factor,
because in our hierarchy factors are evaluated first. What is inside the parentheses
should be an expression, since it is not restricted at all.
The grammar that we end up with is G1 = (V, Σ, S, P), where V = {S, T, F}
and P contains the productions
S → S + T | T
T → T ∗ F | F
F → (S) | a
We must now prove two things: first, that G1 is indeed equivalent to the original
grammar G, and second, that it is unambiguous. To avoid confusion, we relabel the
start symbol in G1.
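The claimed unambiguity of G1 is what makes a simple recursive-descent parser possible. The sketch below (our own illustration, not from the text) handles the left-recursive productions by iteration, which preserves the left-associative grouping, and returns a fully parenthesized form exhibiting the unique structure:

```python
def parse(s):
    """Recursive-descent parser for the unambiguous grammar
       S -> S + T | T,  T -> T * F | F,  F -> ( S ) | a
    returning a fully parenthesized string that shows the structure.
    Left recursion is handled by iterating: S = T (+ T)*, T = F (* F)*."""
    pos = 0
    def peek():
        return s[pos] if pos < len(s) else None
    def eat(c):
        nonlocal pos
        assert peek() == c, f"expected {c!r} at position {pos}"
        pos += 1
    def S():
        node = T()
        while peek() == "+":
            eat("+")
            node = f"({node}+{T()})"      # + associates to the left
        return node
    def T():
        node = F()
        while peek() == "*":
            eat("*")
            node = f"({node}*{F()})"      # * associates to the left
        return node
    def F():
        if peek() == "(":
            eat("(")
            node = S()
            eat(")")
            return node
        eat("a")
        return "a"
    node = S()
    assert pos == len(s), "trailing input"
    return node

print(parse("a+a+a"))      # ((a+a)+a): + groups to the left
print(parse("a+a*a"))      # (a+(a*a)): * binds tighter than +
```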
You should spend a minute convincing yourself that every left parenthesis in a
balanced string has a mate (Exercise 6.30). The first observation we make is that the
string of parentheses in any string obtained from S₁ in the grammar G1 is balanced.
Certainly it has equal numbers of left and right parentheses, since they are produced
in pairs. Moreover, for every right parenthesis produced by a derivation in G1, a left
parenthesis appearing before it is produced simultaneously, and so no prefix of the
string can have more right parentheses than left.
Secondly, observe that in any derivation in G1, the parentheses between and
including the pair produced by a single application of the production F → (S₁) form
a balanced string. This is because the parentheses within the string derived from S₁
do, and because enclosing a balanced string of parentheses within parentheses yields
a balanced string.
Now suppose that x ∈ L(G1), and (₀ is any left parenthesis in x. The statement
that there is only one leftmost derivation of x in G1 will follow if we can show that
G1 is unambiguous, and we will be able to do this very soon. For now, however, the
discussion above allows us to say that even if there are several leftmost derivations
of x, the right parenthesis produced at the same time as (₀ is the same for any of
them; it is simply the mate of (₀. To see this, let us consider a fixed derivation of x.
In this derivation, the step in which (₀ is produced also produces a right parenthesis,
which we call )₀. As we have seen in the previous paragraph, the parentheses in x
beginning with (₀ and ending with )₀ form a balanced string. This implies that the
mate of (₀ cannot appear after )₀, because of how "mate" is defined.
However, the mate of (₀ cannot appear before )₀ either. Suppose ω is the string
of parentheses starting just after (₀ and ending with the mate of (₀. The string ω
has an excess of right parentheses, because (₀ω is balanced. Let β be the string of
parentheses strictly between (₀ and )₀. Then β is balanced. If the mate of (₀ appeared
before )₀, ω would be a prefix of β, and this is impossible. Therefore, the mate of (₀
coincides with )₀.
The point of this discussion is that when we say that something is within parentheses,
we can be sure that the parentheses it is within are the two parentheses produced
by the production F → (S₁), no matter what derivation we have in mind. This is the
ingredient we need for our theorem.
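The mate of a left parenthesis can be computed by exactly the prefix-counting idea used in this argument: the mate of (₀ is the first position after it at which the counts of left and right parentheses balance. A Python sketch (ours):

```python
def mate(s, i):
    """Index of the mate of the left parenthesis at position i in the
    balanced string s: the first position j >= i such that s[i..j]
    contains equal numbers of left and right parentheses."""
    depth = 0
    for j in range(i, len(s)):
        if s[j] == "(":
            depth += 1
        elif s[j] == ")":
            depth -= 1
        if depth == 0:
            return j
    raise ValueError("no mate: the string is not balanced after i")

s = "(()(()))"
print(mate(s, 0), mate(s, 1), mate(s, 3))   # → 7 2 6
```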
If there are no Λ-productions, then the string β must be at least as long as α; if there
are no unit productions, α and β can be of equal length only if this step consists of
replacing a variable by a single terminal. To say it another way, if l and t represent
the length of the current string and the number of terminals in the current string,
respectively, then the quantity l + t must increase at each step of the derivation. The
value of l + t is 1 for the string S and 2k for a string x of length k in the language.
We may conclude that a derivation of x can have no more than 2k − 1 steps. In
particular, we now have an algorithm for determining whether a given string x is in
the language generated by the grammar: If |x| = k, try all possible sequences of
2k − 1 productions, and see if any of them produces x. Although this is not usually
a practical algorithm, at least it illustrates the fact that information about the form of
productions can be used to derive conclusions about the resulting language.
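The 2k − 1 bound turns directly into a (highly impractical) decision procedure. The following Python sketch (ours) carries it out for a grammar with no Λ-productions and no unit productions, pruning sentential forms longer than |x|, and illustrates it on S → aSb | ab:

```python
def member(prods, start, x):
    """Decide whether x is derivable, for a grammar with no Λ-productions
    and no unit productions: every derivation of x then has at most
    2|x| - 1 steps, so a bounded search suffices."""
    k = len(x)
    forms = {start}
    for _ in range(2 * k - 1):
        nxt = set()
        for s in forms:
            if s == x:
                return True
            for i, c in enumerate(s):
                if c in prods:
                    for rhs in prods[c]:
                        t = s[:i] + rhs + s[i + 1:]
                        if len(t) <= k:      # strings never shrink
                            nxt.add(t)
        forms = nxt
    return x in forms

PRODS = {"S": ["aSb", "ab"]}                 # {a^n b^n | n >= 1}
assert member(PRODS, "S", "aaabbb")
assert not member(PRODS, "S", "abab")
print("bounded search decides membership")
```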
In trying to eliminate Λ-productions from a grammar, we must begin with a
qualification. We obviously cannot eliminate all productions of this form if the string
Λ itself is in the language. This obstacle is only minor, however: We will be able to
show that for any context-free language L, L − {Λ} can be generated by a CFG with
no Λ-productions. A preliminary example will help us see how to proceed.
S → ABCBCDA
A → CD
B → Cb
C → a | Λ
D → bD | Λ
The first thing this example illustrates is probably obvious already: We cannot simply
throw away the Λ-productions without adding anything. In this case, if D → Λ is eliminated
then nothing can be derived, because the Λ-production is the only way to remove the variable
D from the current string.
Let us consider the production S → ABCBCDA, which we write temporarily as

S → ABC₁BC₂DA
The three variables C₁, C₂, and D on the right side all begin Λ-productions, and each can also
be used to derive a nonnull string. In a derivation we may replace none, any, or all of these
three by Λ. Without Λ-productions, we will need to allow for all these options by adding
productions of the form S → α, where α is a string obtained from ABCBCDA by deleting
some or all of {C₁, C₂, D}. In other words, we will need at least the productions

S → ABBCDA | ABCBDA | ABCBCA | ABBDA | ABBCA | ABCBA | ABBA

in addition to the one we started with, in order to make sure of obtaining all the strings that can
be obtained from the original grammar.
If we now consider the variable A, we see that these productions are still not enough.
Although A does not begin a Λ-production, the string Λ can be derived from A (as can various
nonnull strings). Starting with the production A → CD, we can leave out C or D, using the
same argument as before. We cannot leave out both, because we do not want the production
A → Λ in our final grammar. If we add subscripts to the occurrences of A, as we did to those
of C, so that the original production is

S → A₁BC₁BC₂DA₂

we need to add productions in which the right side is obtained by leaving out some subset of
{A₁, A₂, C₁, C₂, D}. There are 32 subsets, which means that from this original production we
obtain 31 others that will be added to our grammar.
The same reasoning applies to each of the original productions. If we can identify in
the production X → α all the variables occurring in α from which Λ can be derived, then
we can add all the productions X → α′, where α′ is obtained from α by deleting some of
these occurrences. In general this procedure might produce new Λ-productions; if so, they
are ignored. It might also produce productions of the form X → X, which likewise contribute
nothing to the grammar and can be omitted.
234 PART 3 Context-Free Languages and Pushdown Automata
In this case our final context-free grammar has 40 productions, including the 32 S-
productions already mentioned and the ones that follow:
A → CD | C | D
B → Cb | b
C → a
D → bD | b
The procedure outlined in Example 6.14 is the one that we will show works in
general. In presenting it more systematically, we give first a recursive definition of a
nullable variable (one from which Λ can be derived), and then we give the algorithm
suggested by this definition for identifying such variables.
Definition 6.6 The set of nullable variables of the CFG G = (V, Σ, S, P) is defined as follows.
1. Any variable A for which P contains the production A → Λ is nullable.
2. If P contains the production A → B1B2⋯Bn, where each Bi is a nullable variable, then A is nullable.
3. No other variables in V are nullable.
You can easily convince yourself that the variables defined in Definition 6.6 are
the variables A for which A ⇒* Λ. Obtaining the algorithm FindNull from the
definition is straightforward, and a similar procedure can be used whenever we have
such a recursive definition (see Exercise 2.70). When we apply the algorithm in
Example 6.14, the set N0 is {C, D}. The set N1 also contains A, as a result of the
production A → CD. Since no other productions have right sides in {A, C, D}*,
these three are the only nullable variables in the grammar.
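The algorithm FindNull suggested by the definition can be sketched as a simple iteration to a fixed point. The grammar representation below (a dictionary from each variable to a list of its right sides, with the empty Python string standing for Λ) is our own choice for illustration, not the book's notation:

```python
# A sketch of the FindNull idea: compute the nullable variables of a CFG
# by iterating the recursive definition of Definition 6.6 to a fixed point.
# Right sides are strings; "" stands for Lambda.

def find_null(productions):
    """Return the set of nullable variables of the grammar."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, right_sides in productions.items():
            if var in nullable:
                continue
            for rhs in right_sides:
                # rhs derives Lambda if it is Lambda itself, or consists
                # entirely of variables already known to be nullable
                if all(symbol in nullable for symbol in rhs):
                    nullable.add(var)
                    changed = True
                    break
    return nullable

# The grammar of Example 6.14
g = {
    "S": ["ABCBCDA"],
    "A": ["CD"],
    "B": ["Cb"],
    "C": ["a", ""],
    "D": ["bD", ""],
}
print(sorted(find_null(g)))  # ['A', 'C', 'D']
```

The first pass finds N0 = {C, D} (from their Λ-productions), the second adds A because of A → CD, and a final pass makes no changes, so the iteration stops.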
1. Initialize P1 to be P.
2. Find all nullable variables in V, using Algorithm FindNull.
3. For every production A → α in P, add to P1 every production that can be
obtained from this one by deleting from α one or more of the occurrences of
nullable variables in α.
4. Delete all Λ-productions from P1. Also delete any duplicates, as well as
productions of the form A → A.
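These steps can be sketched in code, with productions represented as (variable, right side) pairs and the empty string standing for Λ. This is an illustrative sketch, not the book's algorithm verbatim:

```python
# A sketch of Algorithm 6.1: for each production, add every variant
# obtained by deleting some subset of the occurrences of nullable
# variables, then discard Lambda-productions and productions A -> A.
# (Using a set takes care of duplicates automatically.)
from itertools import combinations

def eliminate_lambda(productions, nullable):
    p1 = set()
    for var, right_sides in productions.items():
        for rhs in right_sides:
            # positions of occurrences of nullable variables in rhs
            pos = [i for i, s in enumerate(rhs) if s in nullable]
            for k in range(len(pos) + 1):
                for subset in combinations(pos, k):
                    new = "".join(s for i, s in enumerate(rhs)
                                  if i not in subset)
                    if new and new != var:   # no Lambda- or A -> A productions
                        p1.add((var, new))
    return p1

g = {"S": ["ABCBCDA"], "A": ["CD"], "B": ["Cb"],
     "C": ["a", ""], "D": ["bD", ""]}
new_productions = eliminate_lambda(g, {"A", "C", "D"})
# 32 S-productions, and 40 productions in all, as in Example 6.14
print(sum(1 for v, _ in new_productions if v == "S"), len(new_productions))
```

Run on the grammar of Example 6.14, this reproduces the counts computed in the text: 32 S-productions and 40 productions altogether.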
1. Initialize P1 to be P.
2. For each A ∈ V, find the set of A-derivable variables.
3. For every pair (A, B) such that B is A-derivable, and every nonunit production
B → α, add the production A → α to P1 if it is not already present in P1.
4. Delete all unit productions from P1.
The proof is omitted (Exercise 6.62). It is worth pointing out, again without proof, that
if the grammar G is unambiguous, then the grammar G1 obtained from the algorithm
is unambiguous as well.
S → S + T | T
T → T * F | F
F → (S) | a
The S-derivable variables are T and F, and F is T-derivable. In step 3 of Algorithm 6.2, the
productions S → T * F | (S) | a and T → (S) | a are added to P1. When unit productions
are deleted, we are left with
S → S + T | T * F | (S) | a
T → T * F | (S) | a
F → (S) | a
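Algorithm 6.2 can also be sketched in code. The grammar representation is the same dictionary form used earlier, with right sides stored as strings, so a right side is a unit production exactly when it is a single variable. Again, this is a sketch rather than the book's algorithm verbatim:

```python
# A sketch of Algorithm 6.2, assuming (as in this section) that
# Lambda-productions have already been eliminated. A variable B is
# "A-derivable" if it can be reached from A by a chain of unit productions.

def eliminate_unit(productions, variables):
    def is_unit(rhs):
        return len(rhs) == 1 and rhs in variables

    # step 2: for each A, the set of A-derivable variables (fixed point)
    derivable = {a: set() for a in variables}
    for a in variables:
        frontier = {rhs for rhs in productions.get(a, []) if is_unit(rhs)}
        while frontier:
            b = frontier.pop()
            if b != a and b not in derivable[a]:
                derivable[a].add(b)
                frontier |= {rhs for rhs in productions.get(b, [])
                             if is_unit(rhs)}
    # steps 3 and 4: copy nonunit productions up the chains, drop unit ones
    p1 = {a: {rhs for rhs in productions.get(a, []) if not is_unit(rhs)}
          for a in variables}
    for a in variables:
        for b in derivable[a]:
            p1[a] |= {rhs for rhs in productions[b] if not is_unit(rhs)}
    return p1

g = {"S": ["S+T", "T"], "T": ["T*F", "F"], "F": ["(S)", "a"]}
p1 = eliminate_unit(g, {"S", "T", "F"})
print(sorted(p1["S"]))  # ['(S)', 'S+T', 'T*F', 'a']
```

On the algebraic-expression grammar above, it finds that T and F are S-derivable and F is T-derivable, and produces exactly the productions displayed in the text.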
In Chomsky normal form, every production must have the form A → a or A → BC,
where B and C are variables. A production whose right side is a longer string of
variables, such as
A → BCDBCE
would be replaced by
A → BY1
Y1 → CY2
Y2 → DY3
Y3 → BY4
Y4 → CE
The new variables Y1, Y2, Y3, Y4 are specific to this production and would be used
nowhere else. Although this may seem wasteful in terms of the number of variables,
at least there is no doubt that the combined effect of this set of five productions
is precisely equivalent to the original production. Adding these new variables and
productions therefore does not change the language generated.
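The replacement of a long right side by a chain of productions with fresh variables can be sketched as follows (an illustration under the simplifying assumption that every symbol is a single character):

```python
# A sketch of the step just described: a production whose right side is a
# string of three or more variables is replaced by a chain of two-symbol
# productions, using fresh variables specific to that production.

def split_long(var, rhs, counter):
    """Replace var -> rhs (len(rhs) >= 3) by a chain of binary productions.

    counter is a one-element list used to generate fresh names Y1, Y2, ...
    so that different productions never share the new variables.
    """
    result = []
    left = var
    while len(rhs) > 2:
        counter[0] += 1
        fresh = "Y%d" % counter[0]
        result.append((left, rhs[0] + fresh))   # peel off the first symbol
        left, rhs = fresh, rhs[1:]
    result.append((left, rhs))                  # final two-symbol right side
    return result

c = [0]
for lhs, r in split_long("A", "BCDBCE", c):
    print(lhs, "->", r)
# A -> BY1, Y1 -> CY2, Y2 -> DY3, Y3 -> BY4, Y4 -> CE
```

Applied to A → BCDBCE, this reproduces the five productions displayed above.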
If we are willing to let these informal arguments suffice, we have obtained the
following result.
S → AACD
A → aAb | Λ
C → aC | a
D → aDa | bDb | Λ
1. Eliminating Λ-productions. The nullable variables are A and D, and Algorithm 6.1
produces the grammar with productions
S → AACD | ACD | AAC | CD | AC | C
A → aAb | ab
C → aC | a
D → aDa | bDb | aa | bb
2. Eliminating unit productions. The only one is S → C. We add the productions
S → aC | a
and delete S → C.
3. Restricting the right sides of productions to single terminals or strings of two or
more variables. This step yields the productions
S → AACD | ACD | AAC | CD | AC | XaC | a
A → XaAXb | XaXb
C → XaC | a
D → XaDXa | XbDXb | XaXa | XbXb
Xa → a
Xb → b
4. The final step to CNF. There are six productions whose right sides are too long.
Applying our algorithm produces the grammar with productions
S → AT1    T1 → AT2    T2 → CD
S → AU1    U1 → CD
S → AV1    V1 → AC
EXERCISES
6.1. In each case, say what language is generated by the context-free grammar
with the indicated productions.
a. S → aSa | bSb | Λ
b. S → aSa | bSb | a | b
c. S → aSb | bSa | Λ
d. S → aSa | bSb | aAb | bAa
   A → aAa | bAb | a | b | Λ
(See Example 6.3.)
e. S → aS | bS | a
f. S → SS | bS | a
g. S → SaS | b
h.
Sartorin
Ea8'bs
6.2. Find a context-free grammar corresponding to the “syntax diagram” in
Figure 6.6.
Figure 6.6 |
6.3. A context-free grammar is sometimes specified in the form of BNF rules; the
letters are an abbreviation for Backus-Naur Form. In these rules, the symbol
::= corresponds to the usual →, and {X} means zero or more occurrences of
X. Find a context-free grammar corresponding to the BNF rules shown
below. Uppercase letters denote variables, lowercase letters denote terminals.
S → S01S | S10S | Λ
Figure 6.7 |
6.14. Draw an NFA accepting the language generated by the grammar with
productions
S → abA | bB | aba
A → b | aB | bA
B → aB | aA
6.15. Show that if the procedure described in the proof of Theorem 6.2 is applied
to an NFA instead of an FA, the result is still a regular grammar generating
the language accepted by the NFA.
6.16. Consider the following statement: For any language L ⊆ Σ*, L is regular if
and only if L can be generated by some grammar in which every production
takes one of the four forms B → a, B → Ca, B → aC, or B → Λ, where
B and C are variables and a ∈ Σ. For both the “if” and the “only if” parts,
give either a proof or a counterexample.
6.17. A context-free grammar is said to be self-embedding if there is some variable
A and two nonnull strings of terminals α and β so that A ⇒* αAβ. Show
that a language L is regular if and only if it can be generated by a grammar
that is not self-embedding.
6.18. Each of the following grammars, though not regular, generates a regular
language. In each case, find a regular grammar generating the language.
a. S → SSS | a | ab
b. S → AabB    A → aA | bA | Λ    B → Bab | Bb | ab | b
c. S → AAS | ab | aab    A → ab | ba | Λ
d. S → AB    A → aAa | bAb | a | b    B → aB | bB | Λ
e. S → AA | B    A → AAA | Ab | bA | a    B → bB | b
6.19. Refer to Example 6.4.
a. Draw the derivation tree corresponding to each of the two given
derivations of a + (a * a)/a − a.
b. Write the rightmost derivation corresponding to each of the trees in (a).
c. How many distinct leftmost derivations of this string are there?
d. How many derivation trees are there for the string a + a + a + a + a?
e. How many derivation trees are there for the string
(a + (a + a)) + (a + a)?
6.20. Give an example of a CFG and a string of variables and/or terminals
derivable from the start symbol for which there is neither a leftmost
derivation nor a rightmost derivation.
6.21. Consider the C statement
if (a > 2) if (a > 4) x = 2; else x = 3;
a. What is the resulting value of x if a = 3? If a = 1?
b. Same question as in (a), but this time assume that the statement is
interpreted as in Figure 6.5a.
6.22. Show that the CFG with productions
S → a | Sa | bSS | SSb | SbS
is ambiguous.
6.23. Consider the context-free grammar with productions
S → AB    A → aA | Λ    B → ab | bB | Λ
Any derivation of a string in this grammar must begin with the production
S → AB. Clearly, any string derivable from A has only one derivation from
A, and likewise for B. Therefore, the grammar is unambiguous. True or
false? Why? (Compare with the proof of Theorem 6.4.)
6.24. In each part of Exercise 6.1, decide whether the grammar is ambiguous or
not, and prove your answer.
6.25. For each of the CFGs in Examples 6.3, 6.9, and 6.11, determine whether or
not the grammar is ambiguous, and prove your answer.
6.26. In each case, show that the grammar is ambiguous, and find an equivalent
unambiguous grammar.
a. S → SS | a | b
SAB Arica aA | AO 1 ABS
BB (A
c. S → A | B    A → aAb | ab    B → abB | Λ
d. S → aSb | aaSb | Λ
e. S → aSb | abS | Λ
6.27. Find an unambiguous context-free grammar equivalent to the grammar with
productions
S → aaaaS | aaaaaaaS | Λ
(See Exercise 2.50.)
6.28. The proof of Theorem 6.1 shows how to find a regular grammar generating
L, given a finite automaton accepting L.
a. Under what circumstances is the grammar obtained this way
unambiguous?
b. Describe how the grammar can be modified if necessary in order to make
it unambiguous.
6.29. Describe an algorithm for starting with a regular grammar and finding an
equivalent unambiguous grammar.
6.30. Show that every left parenthesis in a balanced string has a mate.
6.31. Show that if a is a left parenthesis in a balanced string, and b is its mate, then
a is the last left parenthesis for which the string consisting of a and b and
everything in between is balanced.
6.32. Find an unambiguous context-free grammar for the language of all algebraic
expressions involving parentheses, the identifier a, and the four binary
operators +, −, *, and /.
6.33. Show that the nullable variables defined by Definition 6.6 are precisely those
variables A for which A ⇒* Λ.
6.34. In each case, find a context-free grammar with no Λ-productions that
generates the same language, except possibly for Λ, as the given CFG.
a.
S → ABA    A → aASb | a    B → bS
b.
S → AB | ABC
A → BA | BC | Λ | a
B → AC | CB | Λ | b
C → BC | AB | Λ | c
6.35. In each case, given the context-free grammar G, find a CFG G′ with no
Λ-productions and no unit productions that generates the language
L(G) − {Λ}.
a. G has productions
S → ABA    A → aA | Λ    B → bB | Λ
b. G has productions
S → aSa | bSb | A    A → aBb | bBa    B → aB | bB | Λ
c. G has productions
S => eA Biss hx
A variable that is not useful is useless. Clearly if a variable is either not live
or not reachable (Exercises 6.36–6.37), then it is useless.
a. Give an example in which a variable is both live and reachable but still
useless.
b. Let G be a CFG. Suppose G1 is obtained by eliminating all dead
variables from G and eliminating all productions in which dead
variables appear. Suppose G2 is then obtained from G1 by eliminating
all variables unreachable in G1, as well as productions in which such
variables appear. Show that G2 contains no useless variables, and
L(G2) = L(G).
c. Show that if the two steps are done in the opposite order, the resulting
grammar may still have useless variables.
d. In each case, given the context-free grammar G, find an equivalent CFG
with no useless variables.
i. G has productions
S → ABC | BaB    A → aA | BaC | aaa
B → DbBb | a    C → CA | AC
6.39. In each case, given the context-free grammar G, find a CFG G′ in Chomsky
normal form generating L(G) − {Λ}.
a. G has productions S → SS | (S) | Λ
b. G has productions S → S(S) | Λ
c. G is the CFG in Exercise 6.35c
d. G has productions
S → AaA | CA | BaB    A → aaBa | CDA | aa | DC
B → bB | bAB | bb | aS    C → Ca | bC | D    D → bDD | Λ
6.47. Show that the CFG with productions
S → bS | aT | Λ
T → aT | bU | Λ
U → aT | Λ
generates the language of all strings over the alphabet {a, b} that do not
contain the substring abb. One approach is to use mathematical induction to
prove two three-part statements. In both cases, each part starts with “For
every n ≥ 0, if x is any string of length n,”. In the first statement, the three
parts end as follows: (i) if S ⇒* x, then x does not contain the substring
abb; (ii) if T ⇒* x, then x does not contain the substring bb; (iii) if
U ⇒* x, then x does not start with b and does not contain the substring bb.
In the second statement, the three parts end with the converses of (i), (ii), and
(iii). The reason for using two three-part statements, rather than six separate
statements, is that in proving each of the two, the induction hypothesis will
say something about all three types of strings: those derivable from S, those
derivable from T, and those derivable from U.
6.48. What language over {0, 1} does the CFG with productions
S → 0B | 1A | Λ
A → 0S | 1AA
B → 1S | 0BB
generate?
generates the language L = {x ∈ {0, 1}* | n0(x) = n1(x)} (see Example 6.8).
It would be appropriate to formulate two three-part statements, as in
Exercise 6.47, this time involving the variables S, A, and B and the
languages L, L0, and L1.
6.51. Prove that the CFG with productions S → 0S1S | 1S0S | Λ generates the
language L = {x ∈ {0, 1}* | n0(x) = n1(x)}.
6.52. a. Describe the language generated by the CFG G with productions
S → SS | (S) | Λ
b. Show that the CFG G1 with productions
S1 → (S1)S1 | Λ
generates the same language. (One inclusion is easy. For the other one,
it may be helpful to prove the following statements for a string x ∈ L(G)
with |x| > 0. First, if there is no derivation of x beginning with the
production S → (S), then there are strings y and z, both in L(G) and
both shorter than x, for which x = yz. Second, if there are such strings y
and z, and if there are no other such strings y′ and z′ with y′ shorter than
y, then there is a derivation of y in G that starts with the production
S → (S).)
6.53. Show that the CFG with productions
S → aSaSbS | aSbSaS | bSaSaS | Λ
generates the language {x ∈ {a, b}* | na(x) = 2nb(x)}.
6.54. Does the CFG with productions
S → aSaSb | aSbSa | bSaSaS | Λ
generate the language of the previous problem? Prove your answer.
6.55. Show that the following CFG generates the language
{x ∈ {a, b}* | na(x) = 2nb(x)}.
S → SS | bTT | TbT | TTb | Λ    T → aS | SaS | Sa | a
6.56. For alphabets Σ1 and Σ2, a homomorphism from Σ1* to Σ2* is defined in
Exercise 4.46. Show that if f : Σ1* → Σ2* is a homomorphism and L ⊆ Σ1*
is a context-free language, then f(L) ⊆ Σ2* is also a context-free language.
6.57. Show that the CFG with productions
S → S(S) | Λ
is unambiguous.
6.58. Find context-free grammars generating each of these languages.
a. {a^i b^j c^k | i ≠ j + k}
b. {a^i b^j c^k | j ≠ i + k}
6.59. Find context-free grammars generating each of these languages, and prove
that your answers are correct.
a. {a^i b^j | j = 3i/2}
b. {a^i b^j | i/2 < j < 3i/2}
6.60. Let G be the context-free grammar with productions
S → aS | aSbS | c
Pushdown Automata
S → aSa | bSb | c
The strings in L are odd-length palindromes over {a, b} (Example 6.3), except that the middle
symbol is c. (We will consider ordinary palindromes shortly. For now, the “marker” in the
middle makes it easier to recognize the string.)
It is not hard to design an algorithm for recognizing strings in L, using a single left-to-right
pass. We will save the symbols in the first half of the string as we read them, so that once
we encounter the c we can begin matching incoming symbols with symbols already read. In
order for this to work, we must retrieve the symbols we have saved using the rule “last in,
first out” (often abbreviated LIFO): The symbol used to match the next incoming symbol is the one
most recently read, or saved. The data structure incorporating the LIFO rule is a stack, which
is usually implemented as a list in which one end is designated as the top. Items are always
added (“pushed onto the stack”) and deleted (“popped off the stack”) at this end, and at any
time, the only element of the stack that is immediately accessible is the one on top.
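The one-pass, LIFO algorithm just described can be sketched directly, with an ordinary Python list playing the role of the stack (an illustration of the idea, not code from the text):

```python
# A sketch of the LIFO algorithm for the simple palindromes x c x^r
# over {a, b}: save the symbols before the c on a stack, then match each
# later symbol against the most recently saved one.

def accepts_simple_pal(x):
    """Recognize strings of the form x c x^r, where x is a string over {a, b}."""
    stack = []
    i = 0
    while i < len(x) and x[i] != "c":   # first half: push onto the stack
        stack.append(x[i])
        i += 1
    if i == len(x):                     # no middle marker c at all
        return False
    for symbol in x[i + 1:]:            # second half: pop and compare
        if not stack or stack.pop() != symbol:
            return False
    return not stack                    # accept if every symbol was matched

print(accepts_simple_pal("abcba"))  # True
print(accepts_simple_pal("ab"))     # False
print(accepts_simple_pal("acaa"))   # False
```

The three test strings are the ones traced for the PDA later in this section, and the function agrees with those traces.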
In trying to incorporate this algorithm in an abstract machine, it would be reasonable to
say that the current “state” of the machine is determined in part by the current contents of the
stack. However, this approach would require an infinite set of “states,” because the stack needs
to be able to hold arbitrarily long strings. It is convenient instead to continue using a finite
set of states—although the machine is not a “finite-state machine” in the same way that an
FA is, because the current state is not enough to specify the machine’s status—and to think of
the stack as a simple form of auxiliary memory. This means that a move of our machine will
depend not only on the current state and input, but also on the symbol currently on top of the
stack. Carrying out the move may change the stack as well as the state.
In this simple example, the set Q of states will contain only three elements, q0, q1, and
q2. The state q0, the initial state, is sufficient for processing the first half of the string. In this
state, each input symbol is pushed onto the stack, regardless of what is currently on top. The
machine stays in q0 as long as it has not yet received the symbol c; when that happens, the
machine moves to state q1, leaving the stack unchanged. State q1 is for processing the second
half of the input string. Once the machine enters this state, the only string that can be accepted
is the one whose second half (after the c) is the reverse of the string already read. In this state
each input symbol is compared to the symbol currently on top of the stack. If they agree, that
symbol is popped off the stack and both are discarded; otherwise, the machine will crash and
the string will not be accepted. This phase of the processing ends when the stack is empty,
provided the machine has not crashed. An empty stack means that every symbol in the first
half of the string has been successfully matched with an identical input symbol in the second
half, and at that point the machine enters the accepting state q2.
Now we consider how to describe precisely the abstract machine whose operations we
have sketched. Each move of the machine will be determined by three things: the current
state, the next input symbol (or Λ), and the symbol on top of the stack; the move itself
consists of changing state and replacing the top stack symbol by a string of stack symbols.
Describing moves this way allows us to consider the two basic stack moves as special cases:
Popping the top symbol off the stack means replacing it by Λ, and pushing Y onto the stack
means replacing the top symbol X by YX (assuming that the left end of the string corresponds
to the top). We could enforce the stack rules more strictly by requiring that a single move
contain only one stack operation, either a push or a pop. However, replacing the stack symbol
X by the string α can be accomplished by a sequence of basic moves (a pop, followed by
a sequence of zero or more pushes), and allowing the more general move helps to keep the
number of distinct moves as small as possible.
In the case of a finite automaton, our transition function took the form
δ : Q × Σ → Q
Here, if we allow the possibility that the stack alphabet Γ (the set of symbols that can appear
on the stack) is different from the input alphabet Σ, it looks as though we want
δ : Q × Σ × Γ → Q × Γ*
Saying that
δ(q, a, X) = (p, α)
means that in state q, with X on top of the stack, we read the symbol a, move to state p, and
replace X on the stack by the string α.
This approach raises a few questions. First, how do we describe a move if the stack is
empty (δ(q, a, ?))? We avoid this problem by saying that initially there is a special start symbol
Z0 on the stack, and the machine is not allowed to move when the stack is empty. Provided
that Z0 is never removed from the stack and that no additional copies of Z0 are pushed onto
the stack, saying that Z0 is on top means that the stack is effectively empty.
Second, how do we describe a move when the input is exhausted (δ(q, ?, X))? (Remember
that in our example we want to move to q2 if the stack is empty when all the input has been
read.) The solution we adopt here is to allow moves that use only Λ as input, corresponding
to Λ-transitions in an NFA-Λ. This suggests that what we really want is
δ : Q × (Σ ∪ {Λ}) × Γ → Q × Γ*
Of course, once we have moves of the form δ(q, Λ, X), we can make them before all the
input has been read; if the next input symbol is not read in a move, it is still there to be read
subsequently.
We have already said that there may be situations when the machine will crash; that is,
when no move is specified. In the case of a finite automaton, when this happened we decided
to make δ(q, a) a subset of Q, rather than an element, so that it could have the value ∅. At the
same time we allowed for the possibility that δ(q, a) might contain more than one element, so
that the FA became nondeterministic. Here we do the same thing, except that since Q × Γ* is
an infinite set we should say explicitly that δ(q, a, X) and δ(q, Λ, X) will always be finite. In
our current example the nondeterminism is not necessary, but in many cases it is. Thus we are
left with
δ : Q × (Σ ∪ {Λ}) × Γ → the set of finite subsets of Q × Γ*
Now we can give a precise description of our simple-palindrome recognizer. Q will be the
set {q0, q1, q2}, q0 is the initial state, and q2 is the only accepting state. The input alphabet
Σ is {a, b, c}, and the stack alphabet Γ is {a, b, Z0}. The transition function δ is given by
Table 7.1. Remember that when we specify a string to be placed on the stack, the top of the
stack corresponds to the left end of the string. This convention may seem odd at first, since
if we were to push the symbols on one at a time we would have to do it right-to-left, or in
reverse order. The point is that when we get around to processing the symbols on the stack,
the order in which we encounter them is the same as the order in which they occurred in the
string.
Moves 1 through 6 push the input symbols a and b onto the stack, moves 7 through 9
change state without affecting the stack, moves 10 and 11 match an input symbol with a stack
symbol and discard both, and the last move is to accept provided there is nothing except Z0 on
the stack.
Let us trace the moves of the machine for three input strings: abcba, ab, and acaa.

Move    State    Unread input    Stack
(initially)    q0    abcba    Z0
1    q0    bcba    aZ0
4    q0    cba    baZ0
9    q1    ba    baZ0
11    q1    a    aZ0
10    q1    Λ    Z0
12    q2    Λ    Z0
(accept)

(initially)    q0    ab    Z0
1    q0    b    aZ0
4    q0    Λ    baZ0
(crash)

(initially)    q0    acaa    Z0
1    q0    caa    aZ0
8    q1    aa    aZ0
10    q1    a    Z0
12    q2    a    Z0
(crash)
Figure 7.1 |
Transition diagram for the pushdown automaton (PDA) in
Example 7.1.
replacing X on the stack by α. Even with the extra information required for labeling an arrow,
a diagram of this type does not capture completely the PDA’s behavior in the same way that
a transition diagram for an FA does. With an FA, you can start at any point with just the
diagram and the input symbols and trace the action of the machine by following the arrows. In
Figure 7.1, however, you cannot follow the arrows without keeping track of the stack contents—
possibly the entire contents—as you go. The number of possible combinations of state and
stack contents is infinite, and it is therefore not possible to draw a “finite-state diagram” in the
same sense as for an FA. In most cases we will describe pushdown automata in this chapter by
transition tables similar to the one in Table 7.1, although it will occasionally also be useful to
show a transition diagram.
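Because no two of the twelve moves ever apply in the same situation, the PDA of Example 7.1 can be replayed deterministically in code. The sketch below is our own encoding of the transition table (states and stacks as strings, with the top of the stack at the left, "Z" standing for Z0, and "" standing for Λ), not code from the text:

```python
# The moves of the PDA in Example 7.1, replayed in code.

DELTA = {
    ("q0", "a", "Z"): ("q0", "aZ"), ("q0", "b", "Z"): ("q0", "bZ"),
    ("q0", "a", "a"): ("q0", "aa"), ("q0", "b", "a"): ("q0", "ba"),
    ("q0", "a", "b"): ("q0", "ab"), ("q0", "b", "b"): ("q0", "bb"),
    ("q0", "c", "Z"): ("q1", "Z"),  ("q0", "c", "a"): ("q1", "a"),
    ("q0", "c", "b"): ("q1", "b"),
    ("q1", "a", "a"): ("q1", ""),   ("q1", "b", "b"): ("q1", ""),
    ("q1", "", "Z"): ("q2", "Z"),   # Lambda-transition into the accepting state
}

def run(x, accepting={"q2"}):
    state, stack = "q0", "Z"
    while True:
        if x and (state, x[0], stack[0]) in DELTA:    # consume an input symbol
            state, top = DELTA[(state, x[0], stack[0])]
            x, stack = x[1:], top + stack[1:]
        elif (state, "", stack[0]) in DELTA:          # Lambda-transition
            state, top = DELTA[(state, "", stack[0])]
            stack = top + stack[1:]
        else:                                         # no move: crash
            return False
        if not x and state in accepting:
            return True

print(run("abcba"), run("ab"), run("acaa"))  # True False False
```

Running it on the three strings traced above reproduces the same outcomes: abcba is accepted, while ab and acaa crash without being accepted.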
The stack alphabet Γ and the initial stack symbol Z0 are what make it necessary
to have a 7-tuple rather than a 5-tuple. Otherwise, the components of the tuple are the
same as in the case of an FA, except that the transition function δ is more complicated.
We can trace the operation of a finite automaton by keeping track of the current
state at each step. In order to trace the operation of a PDA M, we must also keep track
of the stack contents. If we are interested in what the machine does with a specific
input string, it is also helpful to monitor the portion of the string yet to be read. A
configuration of the PDA M = (Q, Σ, Γ, q0, Z0, A, δ) is a triple
(q, x, α)
where q ∈ Q, x ∈ Σ*, and α ∈ Γ*. Saying that (q, x, α) is the current configuration
of M means that q is the current state, x is the string of remaining unread input, and
α is the current stack contents, where as usual it is the left end of α that corresponds
to the top of the stack.
We write
(p, x, α) ⊢M (q, y, β)
to mean that one of the possible moves in the first configuration takes M to the
second. This can happen in two ways, depending on whether the move consumes an
input symbol or is a Λ-transition. In the first case, x = ay for some a ∈ Σ, and in
the second case x = y; we can summarize both cases by saying x = ay for some
a ∈ Σ ∪ {Λ}. In both cases, the string β of stack symbols is obtained from α by
replacing the first symbol X by a string ξ (in other words, α = Xγ for some X ∈ Γ
and some γ ∈ Γ*, and β = ξγ), and
(q, ξ) ∈ δ(p, a, X)
More generally, we write
either xsx^r or xx^r, and in either case, there is a permissible sequence of choices that involves
making the correct “yes” guess at just the right time to cause the string to be accepted. It is
still possible, of course, that for an input string z that is a palindrome the PDA guesses “yes”
at the wrong time or makes the wrong type of “yes” guess; it might end up accepting some
palindrome other than z, or simply stop in a nonaccepting state. This does not mean that the
PDA is incorrect, but only that the PDA did not choose the particular sequence of moves that
would have led to acceptance of z.
The transition table for our PDA is shown in Table 7.2. The sets Q, Γ, and A are the same
as in Example 7.1, and there are noticeable similarities between the two transition tables. The
moves in the first six lines of Table 7.1 show up as possible moves in the corresponding lines
of Table 7.2, and the last three lines of the two tables (which represent the processing of the
second half of the string) are identical.
The fact that the first six lines of Table 7.2 show two possible moves tells us that there is
genuine nondeterminism. The two choices in each of these lines are to guess “not yet,” as in
Table 7.1, and to guess that the input symbol is the middle symbol of the (odd-length) string.
The input symbol is read in both cases; the first choice causes it to be pushed onto the stack,
and the second choice causes it to be discarded.
However, there is also nondeterminism of a less obvious sort. Suppose for example that
the PDA is in state q0, the top stack symbol is a, and the next input symbol is a, as in line 3.
In addition to the two moves shown in line 3, there is a third choice, shown in line 8: not to
read the input symbol at all, but to execute a Λ-transition to state q1. This represents the other
“yes” guess, the guess that as a result of reading the most recent symbol (now on top of the
stack), we have reached the middle of the (even-length) string. This choice is made without
even looking at the next input symbol. (Another approach would have been to read the a, use it
to match the a on the stack, and move to q1, all on the same move; however, the moves shown
in the table preserve the distinction between the state q0 in which all the guessing occurs and
the state q1 in which all the comparison-making occurs.)
Note that the Λ-transition in line 8 is not in itself the source of nondeterminism. The
move in line 12, for example, is the only possible move from state q1 if Z0 is the top stack
Figure 7.2 |
Computation tree for the PDA in Table 7.2, with input baab.
symbol. Line 8 represents nondeterminism because if the PDA is in state q0 and a is the top
stack symbol, there is a choice between a move that reads an input symbol and one that does
not. We will return to this point in Section 7.3.
Just as in Section 4.1, we can draw a computation tree for a PDA such as this one, showing
the configuration at each step and the possible choices of moves at each step. Figure 7.2 shows
such a tree for the string baab, which is a palindrome.
Each time there is a choice, the possible moves are shown left-to-right in the order they
appear in Table 7.2. In particular, in each configuration along the left edge of Figure 7.2 except
the last one, the PDA is in state q0 and there is at least one unread input symbol. At each of
these points, the PDA can choose from three possible moves. Continuing down the left edge
of the figure represents a “not yet” guess that reads an input symbol and pushes it onto the
stack. The other two possibilities are the two moves to state q1, one that reads an input symbol
and one that does not.
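Exploring the computation tree exhaustively is one way to decide acceptance by a nondeterministic PDA: the string is accepted precisely when some branch reaches an accepting state with no unread input. The sketch below does this with a breadth-first search; the transition relation is our own reconstruction of Table 7.2 from the discussion above, not code from the text:

```python
# Breadth-first search of the computation tree of the palindrome PDA
# (our reconstruction of Table 7.2). "Z" stands for Z0, "" for Lambda,
# and the top of the stack is the left end of the stack string.
from collections import deque

delta = {}
for s in "ab":
    for top in "abZ":
        delta[("q0", s, top)] = {("q0", s + top),  # guess "not yet": push
                                 ("q1", top)}      # guess: s is the middle symbol
    delta[("q1", s, s)] = {("q1", "")}             # match and discard
for top in "abZ":
    delta.setdefault(("q0", "", top), set()).add(("q1", top))  # even-length guess
delta[("q1", "", "Z")] = {("q2", "Z")}

def accepts(x, accepting={"q2"}):
    queue = deque([("q0", x, "Z")])
    seen = set()
    while queue:
        state, rest, stack = queue.popleft()
        if not rest and state in accepting:
            return True                       # some branch accepts
        if (state, rest, stack) in seen or not stack:
            continue
        seen.add((state, rest, stack))
        options = [("", rest)]                # Lambda-moves read nothing
        if rest:
            options.append((rest[0], rest[1:]))
        for a, remaining in options:
            for nstate, push in delta.get((state, a, stack[0]), set()):
                queue.append((nstate, remaining, push + stack[1:]))
    return False

print(accepts("baab"), accepts("ab"))  # True False
```

The search terminates because every Λ-move here leads toward q2 without growing the stack, so the set of reachable configurations is finite; on baab it finds exactly the accepting branch shown in Figure 7.2.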
⊢ (q1, b, bZ0)
⊢ (q1, Λ, Z0)
⊢ (q2, Λ, Z0)
This sequence of moves is the one in which the “yes” guess of the right type is made at
exactly the right time. Paths that deviate from the vertical path too soon terminate before the
PDA has finished reading the input; the machine either crashes or enters the accepting state q2
prematurely (so that the string accepted is a palindrome of length 0 or 1, not the one we have
in mind). Paths that follow the vertical path too long cause the PDA either to crash or to run
out of input symbols before getting a chance to empty the stack.
Note that our definition does not require the transition function to be defined for
every combination of state, input, and stack symbol; in a deterministic PDA, it is
still possible for one of the sets δ(q, a, X) to be empty. In this sense, our notion of
determinism is a little less strict than in Chapter 4, where we called a finite automaton
nondeterministic if there was a pair (q, a) for which δ(q, a) did not have exactly one
element.
The last statement in the definition anticipates to some extent the results of the
next two sections, which show that the languages that can be accepted by PDAs are
precisely the context-free languages. The last statement also suggests another way
in which CFLs are more complicated than regular languages. We did not define a
“deterministic regular language” in Chapter 4, although we considered both NFAs
and deterministic FAs. The reason is that for any NFA there is an FA recognizing
the same language; any regular language can be accepted by a deterministic FA. Not
every context-free language, however, can be accepted by a deterministic PDA. It
probably seemed obvious in Example 7.2 that the standard approach to accepting the
language of palindromes cannot work without nondeterminism; we will be able to
show in Theorem 7.1 that no other PDA can do any better, and that the language of
palindromes is not a DCFL.
S → SS | [S] | {S} | Λ
(It is also possible to describe this type of “balanced” string using the approach of Definition 6.5;
see Exercise 7.20.)
Our PDA will have two states: the initial state q0, which is also the accepting state (note
that Λ is one element of L), and another state q1. Left brackets of either type are saved on the
stack, and one is discarded whenever it is on top of the stack and a right bracket of the same
type is encountered in the input. The feature of strings in L that makes this approach correct,
and therefore makes a stack the appropriate data structure, is that when a right bracket in a
balanced string is encountered, the left bracket it matches is the last left bracket of the same
type that has appeared previously and has not already been matched. The signal that the string
read so far is balanced is that the stack has no brackets on it (i.e., Z0 is the top symbol), and if
this happens in state q1 the PDA will return to the accepting state q0 via a Λ-transition, leaving
the stack unchanged. From this point, if there is more input, the machine proceeds as if from
the beginning.
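The matching rule just described is exactly what a stack provides. A sketch of the idea (ours, not the book's code), using a Python list as the stack:

```python
# A sketch of bracket matching with a stack: save left brackets, and pop
# one whenever a right bracket of the same type arrives.

def balanced(x):
    """Check balance of strings over the bracket alphabet [ ] { }."""
    opposite = {"]": "[", "}": "{"}
    stack = []
    for s in x:
        if s in "[{":
            stack.append(s)                      # save left brackets
        elif not stack or stack.pop() != opposite[s]:
            return False                         # wrong type, or nothing to match
    return not stack                             # empty stack = balanced

print(balanced("{[]}[]"), balanced("{[}]"), balanced("["))  # True False False
```

Note that the function accepts the empty string, just as the PDA does, since Λ is an element of L.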
Table 7.3 shows a transition table for such a deterministic PDA. To make it easier to read,
the parentheses with which we normally enclose a pair specifying a single move have been
omitted.
The input string {[ ]}[], for example, results in the following sequence of moves.
Move    State    Input    Stack symbol    Move
1    q0    a    Z0    (q1, aZ0)
2    q0    b    Z0    (q0, bZ0)
3    q0    a    b    (q0, Λ)
4    q0    b    b    (q0, bb)
5    q1    a    a    (q1, aa)
6    q1    b    a    (q0, Λ)
7    q0    Λ    a    (q1, a)
(all other combinations)    none
Move    State    Input    Stack symbol    Move
1    q0    a    Z0    (q1, Z0)
2    q0    b    Z0    (q0, bZ0)
3    q0    a    b    (q0, Λ)
4    q0    b    b    (q0, bb)
5    q1    a    Z0    (q1, aZ0)
6    q1    b    Z0    (q0, Z0)
7    q1    a    a    (q1, aa)
8    q1    b    a    (q1, Λ)
(all other combinations)    none
If we can provide a way for the PDA to determine in advance whether an a on the stack
is the only one, then we can eliminate the need to leave the accepting state when an a is popped
from the stack, and thereby eliminate Λ-transitions altogether. There are at least three natural
ways we might manage this. One is to say that we will push a’s onto the stack only when we
have an excess of at least two, so that in state q1, top stack symbol Z0 means one extra a, and
top stack symbol a means more than one. Another is to use a different stack symbol, say A,
for the first extra a. A third is simply to introduce a new state specifically for the case in which
there is exactly one extra a. The DPDA shown in Table 7.5 takes the first approach. As before,
q1 is the accepting state. There is no move specified from q1 with stack symbol b or from q0
with stack symbol a, because neither of these situations will ever occur.
This PDA may be slightly easier to understand with the transition diagram shown in
Figure 7.3.
We illustrate the operation of this machine on the input string abbabaa:
(q0, abbabaa, Z0) ⊢ (q1, bbabaa, Z0)
⊢ (q0, babaa, Z0)
⊢ (q0, abaa, bZ0)
⊢ (q0, baa, Z0)
⊢ (q0, aa, bZ0)
⊢ (q0, a, Z0)
⊢ (q1, Λ, Z0) (accept)
264 PART 3 Context-Free Languages and Pushdown Automata
Figure 7.3   The DPDA in Table 7.5.
See Exercise 7.17 for some other examples of CFLs that are not DCFLs, and see
Section 8.2 for some other methods of showing that languages are not DCFLs.
confirm x’s membership in the language, but also reveal a derivation of x in the
grammar. Because of languages like pal, however, finding such a deterministic PDA
is too much to expect in general.) As the simulation progresses, the machine will test
the input string to make sure that it is still consistent with the derivation-in-progress.
If the input string does in fact have a derivation from the grammar, and if the PDA’s
guesses are the ones that correctly simulate this derivation, the tests will confirm this
and allow the machine to reach an accepting state.
There are at least two natural ways a PDA can simulate a derivation in the gram-
mar. A step in the simulation corresponds to constructing a portion of the derivation
tree, and the two approaches are called top-down and bottom-up because of the order
in which these portions are constructed.
We will begin with the top-down approach. The PDA starts by pushing the start
symbol S (at the top of the derivation tree) onto the stack, and each subsequent step
in the simulated derivation is carried out by replacing a variable on the stack (at a
certain node in the tree) by the right side of a production beginning with that variable
(in other words, adding the children of that node to the tree). The stack holds the
current string in the derivation, except that as terminal symbols appear at the left of
the string they are matched with symbols in the input and discarded.
The two types of moves made by the PDA, after S is placed on the stack, are

1. Replace a variable A on top of the stack by the right side α of some production
A → α. This is where the guessing comes in.
2. Pop a terminal symbol from the stack, provided it matches the next input
symbol. Both symbols are then discarded.
At each step, the string of input symbols already read (which have been successfully
matched with terminal symbols produced at the beginning of the string by the derivation),
followed by the contents of the stack, exclusive of Z0, constitutes the current
string in the derivation. When a variable appears on top of the stack, it is because
terminal symbols preceding it in the current string have already been matched, and
thus it is the leftmost variable in the current string. Therefore, the derivation being
simulated is a leftmost derivation. If at some point there is no longer any part of
the current string remaining on the stack, the attempted derivation must have been
successful at producing the input string read so far, and the PDA can accept.

We are now ready to give a more precise description of this top-down PDA and
to prove that the strings it accepts are precisely those generated by the grammar.
CHAPTER 7 Pushdown Automata 267
EXAMPLE 7.5   A Top-Down PDA for the Strings with More a's than b's

Consider the language of strings in {a, b}* with more a's than b's, generated by the CFG with productions

S → a | aS | bSS | SSb | SbS
Following the construction in the proof of Theorem 7.2, we obtain the PDA M = (Q, Σ, Γ,
q0, Z0, A, δ), where Q = {q0, q1, q2}, Σ = {a, b}, Γ = {S, a, b, Z0}, A = {q2}, and the
transition function δ is defined by this table.

q0    Λ    Z0    (q1, SZ0)
q1    Λ    S     (q1, a), (q1, aS), (q1, bSS), (q1, SSb), (q1, SbS)
q1    a    a     (q1, Λ)
q1    b    b     (q1, Λ)
q1    Λ    Z0    (q2, Z0)
(all other combinations) none
We consider the string x = abbaaa ∈ L and compare the moves made by M in accepting
x with a leftmost derivation of x in the grammar. Each move in which a variable is replaced on
the stack by a string corresponds to a step in a leftmost derivation of x, and that step is shown
to the right of the move. Observe that at each step, the stack contains (in addition to Z0) the
portion of the current string in the derivation that remains after removing the initial string of
terminal symbols that have already been matched against the input.
You may wish to trace the other possible sequences of moves by which x can be accepted,
corresponding to other possible leftmost derivations of x in this CFG.
The opposite approach to top-down is bottom-up. In this approach, there are opposite
counterparts to both types of moves in the top-down PDA. Instead of replacing
a variable A on the stack by the right side α of a production A → α (which effectively
extends the tree downward), the PDA removes α from the stack and replaces it by
(or "reduces it to") A, so that the tree is extended upward. In both approaches, the
contents of the stack represent a portion of the current string in the derivation being
simulated; instead of removing a terminal symbol from the beginning of this portion
(which appeared on the stack as a result of applying a production), the PDA "shifts"
a terminal symbol from the input to the end of this portion, in order to prepare for a
reduction.
Note that because shifting input symbols onto the stack reverses their order, the
string α that is to be reduced to A will appear on the stack in reverse; thus the PDA
begins the reduction with the last symbol of α on top of the stack.
Note also that while the top-down approach requires only one move to apply a
production A → α, the corresponding reduction in the bottom-up approach requires
a sequence of moves, one for each symbol in the string α. We are interested primarily
in the sequence as a whole, and in the natural correspondence between a production
in the grammar and the sequence of moves that accomplishes the reduction.
The process terminates when the start symbol S, left on the stack by the last
reduction, is popped off the stack and Z0 is the only thing left. The entire process
simulates a derivation, in reverse order, of the input string. At each step, the current
string in the derivation is formed by the contents of the stack (in reverse), followed by
the string of unread input; because after each reduction the variable on top of the stack
is the rightmost one in the current string, the derivation being simulated in reverse is
a rightmost derivation.
The reason for numbering the productions will be seen presently. Suppose the input string is
a + a * a, which has the rightmost derivation

S ⇒ S + T
  ⇒ S + T * a
  ⇒ S + a * a
  ⇒ T + a * a
  ⇒ a + a * a
The corresponding steps or groups of steps executed by the bottom-up PDA as the string a + a * a
is processed are shown in Table 7.6. Remember that at each point, the reverse of the string on
the stack (omitting Z0), followed by the string of unread input, constitutes the current string in
the derivation, and the reductions occur in the opposite order from the corresponding steps in
the derivation. For example, since the last step in the derivation is to replace T by a, the first
reduction replaces a on the stack by T.
In Table 7.7 we show the details of the nondeterministic bottom-up PDA that carries out
these moves. The shift moves allow the next input symbol to be shifted onto the stack, regardless of
the current stack symbol. The sequence of moves in a reduction can begin when the top stack
symbol is the last symbol of α, for some string α in a production A → α. If |α| > 1, the
moves in the sequence proceed on the assumption that the symbols below the top one are in
fact the previous symbols of α; they remove these symbols, from back to front, and place A
shift                   aZ0
reduce T → a            TZ0
reduce S → T            SZ0
shift                   +SZ0
shift                   a+SZ0
reduce T → a            T+SZ0
shift                   *T+SZ0
shift                   a*T+SZ0
reduce T → T * a        T+SZ0
reduce S → S + T        SZ0
(pop S)                 Z0
(accept)
q     Λ    S     (q1, Λ)
q1    Λ    Z0    (q2, Λ)
(all other combinations) none
on the stack. Once such a sequence is started, a set of states unique to this sequence is what
allows the PDA to remember how to complete the sequence. Suppose for example that we
want to reduce the string T * a to T. If we begin in some state q, with a on top of the stack,
the first step will be to remove a and enter a state that we might call q3,1. (Here is where we
use the numbering of the productions: the notation is supposed to suggest that the PDA has
completed one step of the reduction associated with production 3.) Starting in state q3,1, the
machine expects to see * on the stack. If it does, it removes it and enters state q3,2, from which
the only possible move is to remove T from the stack, replace it by T, and return to q. Of
course, all the moves of this sequence are Λ-transitions, affecting only the stack.

Apart from the special states used for reductions, the PDA stays in the state q during
almost all the processing. When S is on top of the stack, it pops S and moves to q1, from
which it enters the accepting state q2 if at that point the stack is empty except for Z0. The input
alphabet is {a, +, *} and the stack alphabet is {a, +, *, S, T, Z0}.
Note that in the shift moves, a number of combinations of input and stack symbol could
be omitted. For example, when a string in the language is processed, the symbol + will never
occur simultaneously as both the input symbol and stack symbol. It does no harm to include
these, however, since no string giving rise to these combinations will be reduced to S.

It can be shown without difficulty that this nondeterministic PDA accepts the language
generated by G. Moreover, for any CFG, a nondeterministic PDA can be constructed along
the same lines that accepts the corresponding language.
Now we may return to the problem we are trying to solve. We have a PDA
M that accepts a language L by empty stack (let us represent this fact by writing
L = L,(M)), and we would like to construct a CFG generating L. It will be helpful
to try to preserve as much as possible the correspondence in Theorem 7.2 between
the operation of the PDA and the leftmost derivation being simulated. The current
string in the derivation will consist of two parts, the string of input symbols read by
the PDA so far and a remaining portion corresponding to the current stack contents.
In fact, we will define our CFG so that this remaining portion consists entirely of
variables, so as to highlight the correspondence between it and the stack contents: In
order to produce a string of terminals, we must eventually eliminate all the variables
from the current string, and in order for the input string to be accepted by the PDA
(by empty stack), all the symbols on the stack must eventually be popped off.
We consider first a very simple approach, which is too simple to work in general.
Take the variables in the grammar to be all possible stack symbols in the PDA,
renamed if necessary so that no input symbols are included; take the start symbol to
be Z0; ignore the states of the PDA completely; and for each PDA move that reads a
(either Λ or an element of Σ) and replaces A on the stack by B1B2 ⋯ Bm, introduce
the production

A → aB1B2 ⋯ Bm
This approach will give us the correspondence outlined above between the current
stack contents and the string of variables remaining in the current string being derived.
Moreover, it will allow the grammar to generate all strings accepted by the PDA. The
reason it is too simple is that by ignoring the states of the PDA we may be allowing
other strings to be derived as well. To see an example, we consider Example 7.1.
This PDA accepts the language {xcx^r | x ∈ {a, b}*}. The acceptance is by final state,
rather than by empty stack, but we can fix this, and eliminate the state q2 as well, by
changing move 12 to {(q1, Λ)}
instead of {(q2, Z0)}. We use A and B as stack symbols instead of a and b. The
moves of the PDA give rise to productions that include these:

Z0 → aAZ0
A → cA
A → a
Z0 → Λ
(q0, aca, Z0) ⊢ (q0, ca, AZ0) ⊢ (q1, a, AZ0) ⊢ (q1, Λ, Z0) ⊢ (q1, Λ, Λ)
If we run the PDA on the input string aa instead, the initial move is
(q0, aa, Z0) ⊢ (q0, a, AZ0)

and at this point the machine crashes, because it is only in state q1 that it is allowed
to read a and replace A on the stack by Λ. However, our grammar also allows the
derivation

Z0 ⇒ aAZ0 ⇒ aaZ0 ⇒ aa
In order to eliminate this problem, we must modify our grammar so as to in-
corporate the states of the PDA. Rather than using the stack symbols themselves as
variables, we try things of the form
[p, A, q]
where p and q are states. For the variable [p, A, q] to be replaced by a (either Λ
or a terminal symbol), it must be the case that there is a PDA move that reads a,
pops A from the stack, and takes the machine from state p to state q. More general
productions involving the variable [p, A, q] are to be thought of as representing any
sequence of moves that takes the PDA from state p to state q and has the ultimate
effect of removing A from the stack.
If the variable [p, A, q] appears in the current string of a derivation, our goal is
to replace it by Λ or a terminal symbol. This will be possible if there is a move that
takes the PDA from p to q and pops A from the stack. Suppose instead, however,
that there is a move from p to some state p1 that reads a and replaces A on the
stack by B1B2 ⋯ Bm. It is appropriate to introduce a into our current string at this
point, since we want the initial string of terminals to correspond to the input read so
far. But it is now also appropriate to think of our original goal as being modified,
as a result of all the new symbols that have been introduced on the stack. The most
direct way to eliminate these new symbols B1, ..., Bm is as follows: to start in p1
and make a sequence of moves, ending up in some state p2, say, that result in B1
being removed from the stack; then to make some more moves that remove B2 and
in the process move from p2 to some other state p3; ...; to move from pm−1 to some
pm and remove Bm−1; and finally, to move from pm to q and remove Bm. The actual
moves of the PDA may not accomplish these steps directly, but this is what we want
their ultimate effect to be. Because it does not matter what the states p2, p3, ..., pm
are, we will allow any string of the form

a[p1, B1, p2][p2, B2, p3] ⋯ [pm, Bm, q]

to replace [p, A, q] in the current string. In other words, we will introduce the
productions

[p, A, q] → a[p1, B1, p2][p2, B2, p3] ⋯ [pm, Bm, q]
for all possible sequences of states p2, ..., pm. Some such sequences will be dead
ends, in the sense that there will be no sequence of moves following this sequence
of states and having this ultimate effect. But no harm is done by introducing these
productions, because for any derivation in which one of these dead-end sequences
appears, there will be at least one variable that cannot be eliminated from the string,
and so the derivation will not produce a string of terminals. If we denote by S the
start symbol of the grammar, the productions that we need to begin are those of the
form
S → [q0, Z0, q]
where qo is the initial state. When we accept strings by empty stack the final state is
irrelevant, and thus we include a production of this type for every possible state q.
We now present the proof that the CFG we have described generates the language
accepted by M.
1     q0    a    Z0    (q0, AZ0)
2     q0    b    Z0    (q0, BZ0)
3     q0    a    A     (q0, AA)
4     q0    b    A     (q0, BA)
5     q0    a    B     (q0, AB)
6     q0    b    B     (q0, BB)
7     q0    c    Z0    (q1, Z0)
8     q0    c    A     (q1, A)
9     q0    c    B     (q1, B)
10    q1    a    A     (q1, Λ)
11    q1    b    B     (q1, Λ)
12    q1    Λ    Z0    (q1, Λ)
(all other combinations) none
⊢ (q1, b, BZ0)
⊢ (q1, Λ, Z0)
⊢ (q1, Λ, Λ)
The corresponding leftmost derivation in the grammar is
From the sequence of PDA moves, it may look as though there are several choices of
leftmost derivations. For example, we might start with the production S → [q0, Z0, q0].
Remember, however, that [q0, Z0, q] represents a sequence of moves from q0 to q that has the
ultimate effect of removing Z0 from the stack. Since the PDA ends up in state q1, it is clear
that q should be q1. Similarly, it may seem as if the second step could be
7.6 | PARSING

Suppose that G is a context-free grammar over an alphabet Σ. It is often useful to
parse a string x ∈ Σ* (to find a derivation of x in the grammar G, or to determine that
there is none). Parsing a statement in a programming language, for example, is
necessary in order to classify it according to syntax; parsing an algebraic expression
is essentially what allows us to evaluate the expression. The problem of finding
efficient parsing algorithms has led to a great deal of research, and there are many
specialized techniques that depend on specific properties of the grammar.
In this section we return to the two natural ways presented in Section 7.4 of
obtaining a PDA to accept the language L(G). In both cases, the PDA not only
accepts a string x in L(G) but does it by simulating a derivation of x (in one case a
leftmost derivation, in the other case rightmost). Although the official output of a
PDA is just a yes-or-no answer, it is easy enough to enhance the machine slightly by
allowing it to record its moves, so that any sequence of moves leading to acceptance
causes a derivation to be displayed. However, neither construction by itself can be said
to produce a parsing algorithm, because both PDAs are inherently nondeterministic.
In each case, the simulation proceeds by guessing the next step in the derivation; if
the guess is correct, its correctness will eventually be confirmed by the PDA.

One approach to obtaining a parsing algorithm would be to consider all possible
sequences of guesses the PDA might make, in order to see whether one of them
leads to acceptance. Exercise 7.47 asks you to use a backtracking strategy for doing
this with a simple CFG. However, it is possible with both types of nondeterministic
PDAs to confront the nondeterminism more directly: rather than making an arbitrary
choice and then trying to confirm that it was the right one, we can try instead to use all
the information available in order to select the choice that will be correct. In the
S → T$
T → [T]T | Λ
is an unambiguous CFG generating L. In the top-down PDA obtained from this grammar,
the only nondeterminism arises when the variable T is on the top of the stack, and we have a
choice of two moves using input Λ. If the next input symbol is [, then the correct move (or,
conceivably, sequence of moves) must produce a [ on top of the stack to match it. Replacing T
by [T]T will obviously do this; replacing T by Λ would have a chance of being correct only
if the symbol below T were either [ or T, and it is not hard to see that this never occurs. It
appears, therefore, that if T is on top of the stack, T should be replaced by [T]T if the next
input symbol is [ and by Λ if the next input is ] or $. The nondeterminism can be eliminated by
lookahead: using the next input symbol as well as the stack symbol to determine the move.
In the case when T is on the stack and the next input symbol is either ] or $, popping T
from the stack will lead to acceptance only if the symbol beneath it matches the input; thus the
PDA needs to remember the input symbol long enough to match it with the new stack symbol.
We can accomplish this by introducing the two states q] and q$, to which the PDA can move on
the respective input symbol when T is on top of the stack. In either of these states, the only
correct move is to pop the corresponding symbol from the stack and return to q1. For the sake
of consistency, we also use a state q[ for the case when T is on top of the stack and the next
input is [. Although in this case T is replaced on the stack by the longer string [T]T, the
move from q[ is also to pop the [ from the stack and return to q1. The alternative, which would be
slightly more efficient, would be to replace these two moves by a single one that leaves the
PDA in q1 and replaces T on the stack by T]T.
The transition table for the original nondeterministic PDA is shown in Table 7.8, and
Table 7.9 describes the deterministic PDA obtained by incorporating lookahead.
The sequence of moves by which the PDA accepts the string []$, and the corresponding
steps in the leftmost derivation of this string, are shown below.
S → T$
T → T[T] | Λ
The standard top-down PDA produced from this grammar is exactly the same as the one in
Example 7.8, except for the string replacing T on the stack in the first move of line 3. We
can see the potential problem by considering the input string [][][]$, which has the leftmost
derivation

S ⇒ T$ ⇒ T[T]$ ⇒ T[T][T]$ ⇒ T[T][T][T]$ ⇒ ⋯ ⇒ [][][]$

The correct sequence of moves for this input string therefore begins
where the string β does not begin with T. These allow all the strings βαⁿ, for n ≥ 0, to be
obtained from T. If these two productions are replaced by

T → βU        U → αU | Λ

the language is unchanged and the left recursion has been eliminated. In our example, with
α = [T] and β = Λ, we replace

T → T[T] | Λ

by

T → U        U → [T]U | Λ
and the resulting grammar allows us to construct a deterministic PDA much as in Example 7.8.
S → T$
T → [T]T | []T | [T] | []
This is the unambiguous grammar obtained from the one in Example 7.8 by removing Λ-
productions from the CFG with productions T → [T]T | Λ; the language is unchanged except
that it no longer contains the string $.

Although there is no left recursion in the CFG, we can tell immediately that knowing the
next input symbol will not be enough to choose the nondeterministic PDA's next move when T
is on top of the stack. The problem here is that the right sides of all four T-productions begin
with the same symbol. An appropriate remedy is to "factor" the right sides, as follows:
T → [U        U → T]T | ]T | T] | ]
More factoring is necessary because of the U-productions; in the ones whose right side begins
with T, we can factor out T]. We obtain the productions

S → T$        T → [U
U → T]W | ]W        W → T | Λ

We can simplify the grammar slightly by eliminating the variable T, and we obtain

S → [U$        U → ]W | [U]W        W → [U | Λ
1     q0    Λ    Z0    (q1, SZ0)
2     q1    [    S     (q[, [U$)
3     q1    [    U     (q[, [U]W)
4     q[    Λ    [     (q1, Λ)
5     q1    ]    U     (q], ]W)
6     q]    Λ    ]     (q1, Λ)
7     q1    [    W     (q[, [U)
8     q1    ]    W     (q], Λ)
9     q1    $    W     (q$, Λ)
10    q$    Λ    $     (q1, Λ)
11    q1    [    [     (q1, Λ)
12    q1    ]    ]     (q1, Λ)
13    q1    $    $     (q1, Λ)
14    q1    Λ    Z0    (q2, Z0)
(all other combinations) none
meaning that the nondeterministic top-down PDA produced from the grammar can be
turned into a deterministic top-down parser by looking ahead to the next symbol. A
grammar is LL(k) if looking ahead k symbols in the input is always enough to choose
the next move of the PDA. Such a grammar allows the construction of a deterministic
top-down parser, and there are systematic methods for determining whether a CFG
is LL(k) and for carrying out this construction (see the references).
For an LL(1) context-free grammar, a deterministic PDA is one way of formu-
lating the algorithm that decides the next step in the derivation of a string by looking
at the next input symbol. The method of recursive descent is another way. The name
refers to a collection of mutually recursive procedures corresponding to the variables
in the grammar.
S → [U$
U → ]W | [U]W
W → [U | Λ
We give a C++ version of a recursive-descent parser. The term recognizer is really more
accurate than parser, though it would not be difficult to add output statements to the program
that would allow one to reconstruct a derivation of the string being recognized.
The program involves functions s, u, and w, corresponding to the three variables. Calls
on these three functions correspond to substitutions for the respective variables during a
derivation—or to replacement of those variables on the stack in a PDA implementation. There
is a global variable curr_ch, whose value is assigned before any of the three functions is
called. If the current character is one of those that the function expects, it is matched, and
the input function is called to read and echo the next character. Otherwise, an error-handling
function is called and told what the character should have been; in this case, the program
terminates with an appropriate error message.
Note that the program’s correctness depends on the grammar’s being LL(1), since each
of the functions can select the correct action on the basis of the current input symbol.
#include <iostream>
#include <cstdlib>
using namespace std;

int main()
{   get_ch(); s();
    cout << endl << "Parsing complete. "
         << "The above string is in the language." << endl;
}

void w()   // recognizes [U | Λ
{   if (curr_ch == '[') { match('['); u(); }   }
Parsing complete. The above string is in the language.
ERROR : Expecting [.
[]]
ERROR : Expecting $.
[[] (End of Data)
ERROR : Expecting one of [ or ].
The program is less complete than it might be, in several respects. In the case of a string
not in the language, it reads and prints out only the symbols up to the first illegal one—that
is, up to the point where the DPDA would crash. In addition, it does not read past the $; if
the input string were [] $], for example, the program would merely report that [] $ is in the
language. Finally, the error messages may seem slightly questionable. The second error, for
example, is detected in the function s, after the return from the call on u. The production is
S — [U$; the function “expects” to see $ at this point, although not every symbol other than
$ would have triggered the error message. The symbol [ would be valid here and would have
resulted in a different sequence of function calls, so that the program would not have performed
the same test at this point in s.
(0) S → S1$
(1) S1 → S1 + T
(2) S1 → T
(3) T → T * a
(4) T → a
The last four are essentially those in Example 7.6; the endmarker $ introduced in production
(0) will be useful here, as it was in the discussion of top-down parsing.
Table 7.11 shows the nondeterministic PDA in Example 7.6 with the additional reduction
corresponding to grammar rule (0).
The other slight difference from the PDA in Example 7.6 is that because the start symbol
S occurs only in production (0) in the grammar, the PDA can move to the accepting state as
soon as it sees S on the stack.

Nondeterminism is present in two ways. First, there may be a choice as to whether to shift
an input symbol onto the stack or to try to reduce a string on top of the stack. For example, if
T is the top stack symbol, the first choice is correct if it is the T in the right side of T * a, and
the second is correct if it is the T in one of the S1-productions. Second, there may be some
q       Λ    T     (q1,1, Λ)
q1,1    Λ    +     (q1,2, Λ)
q1,2    Λ    S1    (q, S1)
doubt as to which reduction is the correct one; for example, there are two productions whose
right sides end with a. Answering the second question is easy. When we pop a off the stack, if
we find * below it, we should attempt to reduce T * a to T, and otherwise we should reduce
a to T. Either way, the correct reduction is the one that reduces the longest possible string.

Returning to the first question, suppose the top stack symbol is T, and consider the
possibilities for the next input. If it is +, we should soon have the string S1 + T, in reverse
order, on top of the stack, and so the correct move at this point is a reduction of either T or
S1 + T (depending on what is below T on the stack) to S1. If the next input is *, the reduction
will eventually be that of T * a to T, and since we have the T already, we should shift. Finally, if it is $,
we should reduce either T or S1 + T to S1, to allow the reduction of S1$. In any case, we can
make the decision on the basis of the next input symbol. What is true for this example is that
there are certain combinations of top stack symbol and input symbol for which a reduction is
always appropriate, and a shift is correct for all the other combinations. The set of pairs for
which a reduction is correct is an example of a precedence relation. (It is a relation from Γ to
Σ, in the sense of Section 1.4.) There are a number of types of precedence grammars, for which
precedence relations can be used to obtain a deterministic shift-reduce parser. Our example,
in which the decision to reduce can be made by examining the top stack symbol and the next
input, and in which a reduction always reduces the longest possible string, is an example of a
weak precedence grammar.
A deterministic PDA that acts as a shift-reduce parser for our grammar is shown in Ta-
ble 7.12. In order to compare it to the nondeterministic PDA, we make a few observations.
The stack symbols can be divided into three groups: (1) those whose appearance on top of the
stack requires a shift regardless of the next input (these are Z0, S1, +, and *); (2) those that
require a reduction or lead to acceptance (a, $, and S); and (3) one, T, for which the correct
choice can be made only by consulting the next input. In the DPDA, shifts in which the top
stack symbol is of the second type have been omitted, since they do not lead to acceptance of
any string and their presence would introduce nondeterminism. Shifts in which the top stack
symbol is of the first or third type are shown, labeled “shift moves.” If the top stack symbol
is of the second type, the moves in the reduction are all A-transitions. If the PDA reads a
symbol and decides to reduce, the input symbol will eventually be shifted onto the stack, once
the reduction has been completed (the machine must remember the input symbol during the
reduction); the eventual shift is shown, not under “shift moves,” but farther down in the table,
as part of the sequence of reducing moves.
Shift moves
1     q       σ    X     (q, σX)
      (σ is arbitrary; X is either Z0, S1, +, or *.)
2     q       σ    T     (q, σT)
      (σ is any input symbol other than + or $.)
Moves to reduce S1$ to S
3     q       Λ    $     (q$, Λ)
4     q$      Λ    S1    (q, S)
Moves to reduce T * a or a to T
5     q       Λ    a     (qa,1, Λ)
6     qa,1    Λ    *     (qa,2, Λ)
7     qa,2    Λ    T     (q, T)
8     qa,1    Λ    X     (q, TX)
      (X is any stack symbol other than *.)
Moves to reduce S1 + T or T to S1
9     q       σ    T     (qT,σ, Λ)
10    qT,σ    Λ    +     (q′T,σ, Λ)
11    q′T,σ   Λ    S1    (q, σS1)
12    qT,σ    Λ    X     (q, σS1X)
      (σ is either + or $; X is any stack symbol other than +.)
Move to accept
13    q       Λ    S     (q1, Λ)
(all other combinations) none
EXERCISES
7.1. For the PDA in Example 7.1, trace the sequence of moves made for each of
the input strings bbcbb and baca.
7.2. For the PDA in Example 7.2, draw the computation tree showing all possible
sequences of moves for the two input strings aba and aabab.
7.3. For a string x ∈ {a, b}* with |x| = n, how many possible complete
sequences of moves can the PDA in Example 7.2 make, starting with input
string x? (By a "complete" sequence of moves, we mean a sequence of
moves starting in the initial configuration (q0, x, Z0) and terminating in a
configuration from which no move is possible.)
7.4. Modify the PDA described in Example 7.2 to accept each of the following
subsets of {a, b}*.
a. The language of even-length palindromes.
b. The language of odd-length palindromes.
7.5. Give transition tables for PDAs recognizing each of the following
languages.
a. The language of all nonpalindromes over {a, b}.
b. {a^n x | n ≥ 0, x ∈ {a, b}* and |x| ≤ n}.
c. {a^i b^j c^k | i, j, k ≥ 0 and j = i or j = k}.
d. {x ∈ {a, b, c}* | n_a(x) < n_b(x) or n_a(x) < n_c(x)}.
CHAPTER 7 Pushdown Automata 291
7.6. In both cases below, a transition table is given for a PDA with initial state q0
and accepting state q2. Describe in each case the language that is accepted.
5   q1   a   b    (q1, Λ)
6   q1   b   b    (q1, Λ), (q2, b)
(all other combinations)   none
1   q0   a   Z0   (q0, XZ0)
2   q0   b   Z0   (q0, XZ0)
3   q0   a   X    (q0, XX)
4   q0   b   X    (q0, XX)
5   q0   c   X    (q1, X)
6   q0   c   Z0   (q1, Z0)
7   q1   a   X    (q1, Λ)
8   q1   b   X    (q1, Λ)
9   q1   Λ   Z0   (q2, Z0)
(all other combinations)   none
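Tables like this one can be explored by running them. The sketch below is a small simulator for nondeterministic PDAs; the dictionary encoding (with "" standing for a Λ-move and Z for Z0) is our own, loaded here with the second table above.

```python
# A small PDA simulator sketch for the second table in Exercise 7.6.
# delta[(state, input-or-"" for a lambda-move, stack top)] is a list of
# (new state, string pushed in place of the top symbol).
delta = {
    ("q0", "a", "Z"): [("q0", "XZ")],
    ("q0", "b", "Z"): [("q0", "XZ")],
    ("q0", "a", "X"): [("q0", "XX")],
    ("q0", "b", "X"): [("q0", "XX")],
    ("q0", "c", "X"): [("q1", "X")],
    ("q0", "c", "Z"): [("q1", "Z")],
    ("q1", "a", "X"): [("q1", "")],
    ("q1", "b", "X"): [("q1", "")],
    ("q1", "", "Z"): [("q2", "Z")],
}
ACCEPT = {"q2"}

def accepts(x: str) -> bool:
    # Exhaustive search over reachable configurations (state, unread, stack).
    configs, seen = [("q0", x, "Z")], set()
    while configs:
        state, rest, stack = configs.pop()
        if (state, rest, stack) in seen or not stack:
            continue
        seen.add((state, rest, stack))
        if state in ACCEPT and not rest:
            return True
        top = stack[0]
        for ns, push in delta.get((state, "", top), []):   # lambda-moves
            configs.append((ns, rest, push + stack[1:]))
        if rest:                                           # reading moves
            for ns, push in delta.get((state, rest[0], top), []):
                configs.append((ns, rest[1:], push + stack[1:]))
    return False
```

Trying a few inputs suggests the table accepts {xcy | x, y ∈ {a, b}* and |x| = |y|}, which is one way to approach the exercise.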
7.7. Give a transition table for a PDA accepting the language in Example 7.1 and
having only two states, the nonaccepting state q0 and the accepting state q1.
(Use additional stack symbols.)
7.8. Show that every regular language can be accepted by a deterministic PDA M
with only two states in which there are no Λ-transitions and no symbols are
ever removed from the stack.
7.9. Show that if L is accepted by a PDA in which no symbols are ever removed
from the stack, then L is regular.
7.10. Suppose L ⊆ Σ* is accepted by a PDA M, and for some fixed k and every
x ∈ Σ*, no sequence of moves made by M on input x causes the stack to
have more than k elements. Show that L is regular.
7.11. Show that if L is accepted by a PDA, then L is accepted by a PDA that never
crashes (i.e., for which the stack never empties and no configuration is
reached from which there is no move defined).
7.12. Show that if L is accepted by a PDA, then L is accepted by a PDA in which
every move either pops something from the stack (i.e., removes a stack
symbol without putting anything else on the stack); or pushes a single
symbol onto the stack on top of the symbol that was previously on top; or
leaves the stack unchanged.
292 PART 3 Context-Free Languages and Pushdown Automata
7.13. Give transition tables for deterministic PDAs recognizing each of the
following languages.
a. {x ∈ {a, b}* | n_a(x) = n_b(x)}
b. {x ∈ {a, b}* | n_a(x) ≠ n_b(x)}
c. {x ∈ {a, b}* | n_a(x) = 2n_b(x)}
d. {a^n … | n, m ≥ 0}
7.14. Suppose M1 and M2 are PDAs accepting L1 and L2, respectively. Describe a
procedure for constructing a PDA accepting each of the following languages.
Note that in each case, nondeterminism will be necessary. Be sure to say
precisely how the stack of the new machine works; no relationship is
assumed between the stack alphabets of M1 and M2.
a. L1 ∪ L2
b. L1L2
c. L1*
7.15. Show that if there are strings x and y in the language L so that x is a prefix
of y and x ≠ y, then no DPDA can accept L by empty stack.
7.16. Show that if there is a DPDA accepting L, and $ is not one of the symbols in
the input alphabet, then there is a DPDA accepting the language L{$} by
empty stack.
7.17. Show that none of the following languages can be accepted by a DPDA.
(Determine exactly what property of the language pal is used in the proof of
Theorem 7.1, and show that these languages also have that property.)
a. The set of even-length palindromes over {a, b}
b. The set of odd-length palindromes over {a, b}
c. {xx~ | x ∈ {0, 1}*} (where x~ means the string obtained from x by
changing 0's to 1's and 1's to 0's)
d. {xy | x ∈ {0, 1}* and y is either x or x~}
7.18. A counter automaton is a PDA with just two stack symbols, A and Z0, for
which the string on the stack is always of the form A^n Z0 for some n ≥ 0. (In
other words, the only possible change in the stack contents is a change in the
number of A's on the stack.) For some context-free languages, such as
{0^i 1^i | i ≥ 0}, the obvious PDA to accept the language is in fact a counter
automaton. Construct a counter automaton to accept the given language in
each case below.
a. {x ∈ {0, 1}* | n_0(x) = n_1(x)}
b. {x ∈ {0, 1}* | n_0(x) < 2n_1(x)}
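To make the definition concrete, here is a sketch of the obvious counter automaton for {0^i 1^i | i ≥ 0} mentioned above, with the stack A^n Z0 represented by the single integer n; the structure and names are ours.

```python
# Sketch of a counter automaton for {0^i 1^i | i >= 0}: the only stack
# information is the number n of A's above Z0.
def accepts(x: str) -> bool:
    n = 0                      # number of A's on the stack
    reading_ones = False       # have we started reading 1's?
    for c in x:
        if c == "0":
            if reading_ones:
                return False   # a 0 after a 1: the machine crashes
            n += 1             # push an A
        elif c == "1":
            reading_ones = True
            if n == 0:
                return False   # nothing to pop
            n -= 1             # pop an A
        else:
            return False
    return n == 0              # accept with stack back to Z0
```

The two languages in parts (a) and (b) call for the same idea, with the counter allowed to stand for a surplus of either symbol.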
7.19. Suppose that M = (Q, Σ, Γ, q0, Z0, A, δ) is a deterministic PDA accepting
a language L. If x is a string in L, then by definition there is a sequence of
moves of M with input x in which all the symbols of x are read. It is
conceivable, however, that for some strings y ∉ L, no sequence of moves
causes M to read all of y. This could happen in two ways: M could either
crash by not being able to move, or it could enter a loop in which there were
and x = (a + a*a)*a.
b. The grammar has productions S → S + S | S*S | (S) | a, and
x = (a*a + a).
c. The grammar has productions S → (S)S | Λ, and x = ()(()).
7.22. Let M be the PDA in Example 7.2, except that move number 12 is changed
to (q2, Λ), so that M does in fact accept by empty stack. Let x = ababa.
Find a sequence of moves of M by which x is accepted, and give the
corresponding leftmost derivation in the CFG obtained from M as in
Theorem 7.4.
7.23. Under what circumstances is the “nondeterministic” top-down PDA
described in Definition 7.4 actually deterministic? (For what kind of
language could this happen?)
7.24. In each case below, you are given a CFG and a string x that it generates. For
the nondeterministic bottom-up PDA that is constructed from the grammar
as in Example 7.6, trace a sequence of moves by which x is accepted,
showing at each step the state, the stack contents, and the unread input.
Show at the same time the corresponding rightmost derivation of x (in
reverse order) in the grammar. See Example 7.6 for a guide.
a. The grammar has productions S → S(S) | Λ, and x = …
b. The grammar has productions S → (S)S | Λ, and x = …
7.25. If the PDA in Theorem 7.4 is deterministic, what does this tell you about the
grammar that is obtained? Can the resulting grammar have this property
without the original PDA being deterministic?
7.26. Find the other useless variables in the CFG obtained in Example 7.7.
7.27. In each case, the grammar with the given productions satisfies the LL(1)
property. For each one, give a transition table for the deterministic PDA
obtained as in Example 7.8.
CHAPTER 8
Context-Free and Non-Context-Free Languages
where v, w, x, y, z ∈ Σ*. Within this derivation, both the strings x and wAy are
derived from A. We may write

S ⇒* vAz ⇒* vwAyz ⇒* vwxyz
In order to obtain our pumping lemma, we must show that this duplication of
variables occurs in the derivation of every sufficiently long string in L(G). It will
also be helpful to impose some restrictions on the strings v, w, x, y, and z, just as we
did on the strings u, v, and w in the simpler case.
The discussion will be a little easier if we can assume that the tree representing
a derivation is a binary tree, which means simply that no node has more than two
children. We can guarantee this by putting our grammar into Chomsky normal form
(Section 6.6). The resulting loss of the null string will not matter, because the result
we want involves only long strings.
Let us say that a path in a nonempty binary tree consists either of a single node
or of a node, one of its descendants, and all the nodes in between. We will say that the
length of a path is the number of nodes it contains, and the height of a binary tree is the
length of the longest path. In any derivation whose tree has a sufficiently long path,
some variable must reoccur. Lemma 8.1 shows that any binary tree will have a long
path if the number of leaf nodes is sufficiently large. (In the case we are interested
in, the binary tree is a derivation tree, and because there are no A-productions the
number of leaf nodes is simply the length of the string being derived.)
Lemma 8.1  For any h ≥ 1, a binary tree having more than 2^(h−1) leaf nodes must
have height greater than h.
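Lemma 8.1 can be checked computationally for small heights. The sketch below computes the largest possible number of leaves of a binary tree of a given height, with height measured (as in the text) by the number of nodes on the longest path; the function name is ours.

```python
# Brute-force companion to Lemma 8.1.
def max_leaves(h: int) -> int:
    # height = number of nodes on the longest root-to-leaf path
    if h == 1:
        return 1                  # a single node is its own leaf
    # a root with two subtrees, each of height at most h - 1,
    # maximizes the leaf count by doubling it
    return 2 * max_leaves(h - 1)

# A tree with more than 2^(h-1) leaves must therefore have height > h.
for h in range(1, 12):
    assert max_leaves(h) == 2 ** (h - 1)
```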
Theorem 8.1
Let G = (V, Σ, S, P) be a context-free grammar in Chomsky normal form,
with a total of p variables. Any string u ∈ L(G) with |u| > 2^p can be
written as u = vwxyz, for some strings v, w, x, y, and z satisfying |wy| > 0,
|wxy| ≤ 2^p, and vw^m x y^m z ∈ L(G) for every m ≥ 0.
Just as in the case of the earlier pumping lemma, it is helpful to restate the result
so as to emphasize the essential features.
Figure 8.1 |
A derivation tree in which u = (ab)(b)(ab)(b)(a) = vwxyz.
300 PART 3 Context-Free Languages and Pushdown Automata
Using the pumping lemma for context-free languages requires the same sorts of
precautions as in the case of the earlier pumping lemma for regular languages. In order
to show that L is not context-free, we assume it is and try to derive a contradiction.
Theorem 8.1a says only that there exists an integer n, nothing about its value; because
we can apply the theorem only to strings u with length ≥ n, the u we choose must be
defined in terms of n. Once we have chosen u, the theorem tells us only that there
exist strings v, w, x, y, and z satisfying the four properties; the only way to guarantee
a contradiction is to show that every choice of v, w, x, y, z satisfying the properties
leads to a contradiction.
According to Chapter 7, a finite-state machine with an auxiliary memory in the
form of a stack is enough to accept a CFL. An example of a language for which a
single stack is sufficient is the language of strings of the form (^i)^i (see Example 7.3),
which could just as easily have been called a^i b^i. The a's are saved on the stack so
that the number of a's can be compared to the number of b's that follow. By the
time all the b's have been matched with a's, the stack is empty, which means that the
machine has forgotten how many there were. This approach, therefore, would not
allow us to recognize strings of the form a^i b^i c^i. Rather than trying to show directly
that no other approach using a single stack can work, we choose this language for our
first proof using the pumping lemma.
EXAMPLE 8.1 | The Pumping Lemma Applied to {a^i b^i c^i}

Let L = {a^i b^i c^i | i ≥ 0}.
Suppose for the sake of contradiction that L is context-free, and let n be the integer
in Theorem 8.1a. An obvious choice for u with |u| ≥ n is u = a^n b^n c^n. Suppose v, w, x,
y, and z are any strings satisfying conditions (8.1)–(8.4). Since |wxy| ≤ n, the string wxy can
contain at most two distinct types of symbol (a's, b's, and c's), and since |wy| > 0, w and y
together contain at least one. The string vw^2 x y^2 z contains additional occurrences of the
symbols in w and y;
therefore, it cannot contain equal numbers of all three symbols. On the other hand, according
to (8.4), vw^2 x y^2 z ∈ L. This is a contradiction, and our assumption that L is a CFL cannot be
correct.
Note that to get the contradiction, we started with u ∈ L and showed that vw^2 x y^2 z fails
to be an element, not only of L but also of the bigger language

{x ∈ {a, b, c}* | n_a(x) = n_b(x) = n_c(x)}
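For small values of n, the argument of Example 8.1 can even be checked exhaustively: every legal decomposition of u = a^n b^n c^n really does pump out of L. A sketch, with our own membership test and loop bounds:

```python
# Exhaustive check of Example 8.1 for small n: for u = a^n b^n c^n,
# every decomposition u = vwxyz with |wxy| <= n and |wy| > 0 gives a
# pumped string v w^2 x y^2 z outside L.
def in_L(s: str) -> bool:
    n = len(s) // 3
    return len(s) % 3 == 0 and s == "a" * n + "b" * n + "c" * n

n = 4
u = "a" * n + "b" * n + "c" * n
for i in range(len(u) + 1):            # v = u[:i]
    for j in range(i, len(u) + 1):     # w = u[i:j]
        for k in range(j, len(u) + 1): # x = u[j:k]
            for e in range(k, len(u) + 1):   # y = u[k:e], z = u[e:]
                w, x, y = u[i:j], u[j:k], u[k:e]
                if len(w) + len(x) + len(y) <= n and len(w) + len(y) > 0:
                    pumped = u[:i] + w * 2 + x + y * 2 + u[e:]
                    assert not in_L(pumped)
```

Since wxy spans at most two of the three symbol groups, pumping always unbalances the counts, which is exactly what the assertion confirms.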
EXAMPLE 8.2 | The Pumping Lemma Applied to {ss | s ∈ {a, b}*}

Let L = {ss | s ∈ {a, b}*}. This language is similar in one obvious respect to the language of even-length palindromes,
which is a context-free language: In both cases, in order to recognize a string in the language,
a machine needs to remember the first half so as to be able to compare it to the second. For
palindromes, the last-in, first-out principle by which a stack operates is obviously appropriate.
In the case of L, however, when we encounter the first symbol in the second half of the string
(even assuming that we know when we encounter it), the symbol we need to compare it to is
the first in, not the last—in other words, it is buried at the bottom of the stack. Here again,
arguing that the obvious approach involving a PDA fails does not prove that a PDA cannot be
made to work. Instead, we apply the pumping lemma.
Suppose L is a CFL, and let n be the integer in Theorem 8.1a. This time the choice of
u is not as obvious; we try u = a^n b^n a^n b^n. Suppose that v, w, x, y, and z are any strings
satisfying (8.1)-(8.4). We must derive a contradiction from these facts, without making any
other assumptions about the five strings.
As in Example 8.1, condition (8.2) tells us that wxy can overlap at most two of the four
contiguous groups of symbols. We consider several cases.
First, suppose w or y contains at least one a from the first group of a's. Since |wxy| ≤ n,
neither w nor y can contain any symbols from the second half of u. Consider m = 0 in
condition (8.4). Omitting w and y causes at least one initial a to be omitted, and does not affect
the second half. In other words,

vw^0 x y^0 z = a^i b^j a^n b^n

where i < n and 1 ≤ j ≤ n. The midpoint of this string is somewhere within the substring
a^n, and therefore it is impossible for it to be of the form ss.
Next, suppose wxy contains no a's from the first group but that w or y contains at least
one b from the first group of b's. Again we consider m = 0; this time we can say

vw^0 x y^0 z = a^n b^i a^j b^n

where i < n and 1 ≤ j ≤ n. The midpoint is somewhere in the substring b^i a^j, and as before,
the string cannot be in L.
We can get by with two more cases: case 3, in which wxy contains no symbols from
the first half of u but at least one a, and case 4, in which w or y contains at least one b from
302 PART 3 Context-Free Languages and Pushdown Automata
the second group of b’s. The arguments in these cases are similar to those in cases 2 and 1,
respectively. We leave the details to you.
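The remaining details can also be confirmed exhaustively for small n: for u = a^n b^n a^n b^n, deleting w and y (the m = 0 string) never leaves a string of the form ss. A sketch, with our own membership test:

```python
# Exhaustive check of Example 8.2 for small n: for every decomposition
# of u = a^n b^n a^n b^n with |wxy| <= n and |wy| > 0, the string vxz
# obtained with m = 0 is not of the form ss.
def is_ss(s: str) -> bool:
    h = len(s) // 2
    return len(s) % 2 == 0 and s[:h] == s[h:]

n = 3
u = "a" * n + "b" * n + "a" * n + "b" * n
for i in range(len(u) + 1):                 # v = u[:i]
    for j in range(i, len(u) + 1):          # w = u[i:j]
        for k in range(j, len(u) + 1):      # x = u[j:k]
            for e in range(k, len(u) + 1):  # y = u[k:e], z = u[e:]
                if e - i <= n and (j - i) + (e - k) > 0:
                    vxz = u[:i] + u[j:k] + u[e:]
                    assert not is_ss(vxz)
```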
Just as in Example 8.1, there are other languages for which essentially the same proof
works. Two examples in this case are {a^i b^i a^i b^i | i ≥ 0} and {a^i b^j a^i b^j | i, j ≥ 0}. A similar
proof shows that {scs | s ∈ {a, b}*} also fails to be context-free. Although the marker in the
middle may appear to remove the need for nondeterminism, the basic problem that prevents a
PDA from recognizing this language is still present.
In proofs of this type, there are several potential trouble-spots. It may not be
obvious what string u to choose. In Example 8.2, although a^n b^n a^n b^n is not the only
choice that works, there are many that do not (Exercise 8.2).
Once u is chosen, deciding on cases to consider can be a problem. A straightfor-
ward way in Example 8.2 might have been to consider seven cases:
1. wy contains only a’s from the first group;
2. wy contains a’s from the first group and b’s from the first group;
3. wy contains only b’s from the first group;
EXAMPLE 8.3 | A Third Application of the Pumping Lemma

Let L = {a^i b^j c^k | i ≤ j and i ≤ k}.
then vw^0 x y^0 z still contains n a's; since wy contains one of the other two symbols, vw^0 x y^0 z
contains fewer occurrences of that symbol than u does and therefore is not in L. We have
obtained a contradiction and may conclude that L is not context-free.
The language {a^i b^j c^k | i < j and i < k} can be shown to be non-context-free by exactly
the same argument.
where both identifiers have n a’s. However, for a technical reason that will be mentioned in
a minute, we complicate the program by including two subsequent references to the variable
instead of one:
Here it is assumed that all three identifiers are a^n. There is one blank in the program, after
int, and it is necessary as a separator. About all that can be said for this program is that it
will make it past the compiler, possibly with a warning. It declares an integer variable, then,
twice, it evaluates the expression consisting of that identifier (the value is probably garbage,
since the program has not initialized the variable), and does nothing with the value.
According to the pumping lemma, u = vwxyz, where (8.2)–(8.4) are satisfied. In particular,
vw^0 x y^0 z is supposed to be a valid C program. However, this is impossible. If wy contains
the blank or any of the symbols before it, then vxz still contains at least most of the first
occurrence of the identifier, and without main () {int and the blank, it cannot be syntactically
correct. We are left with the case in which wy is a substring of a^n; a^n; a^n;}. If wy contains
either the final semicolon or bracket, the string vxz is also illegal. If it contains one of the
two intermediate semicolons, and possibly portions of one or both of the identifiers on either
304 PART 3 Context-Free Languages and Pushdown Automata
side, then vxz has two identifiers, which are now not the same. Finally, if wy contains only a
portion of one of the identifiers and nothing else, then vxz still has a variable declaration and
two subsequent expressions consisting of an identifier, but the three identifiers are not all the
same. In either of these last two situations, the declaration-before-use principle is violated. We
conclude that vxz is not a legal C program, and therefore that L is not a context-free language.
(The argument would almost work with the shorter program having only two occurrences
of the identifier, but not quite. The case in which it fails is the case in which wy contains the
first semicolon and nothing else. Deleting it still leaves a valid program, consisting simply of
a declaration of a variable with a longer name. Adding multiple copies is also legal because
they are interpreted as harmless “empty” statements.)
There are other examples of syntax rules whose violation cannot be detected by a PDA.
We noted in Example 8.2 that {a"b”a"b"} is not a context-free language, and we can imagine
a situation in which being able to recognize a string of this type is essentially what is required.
Suppose that two functions f and g are defined, having n and m formal parameters, respectively,
and then calls on f and g are made. Then the numbers of parameters in the calls must agree
with the numbers in the respective definitions.
Lemma 8.2  If the binary tree consisting of a node on the path and its descendants
has h or fewer branch points, its leaf nodes include no more than 2^h distinguished
nodes.

Proof  The proof is by induction and is virtually identical to that of Lemma 8.1,
except that the number of branch points is used instead of the height, and the number
of distinguished leaf nodes is used instead of the total number of leaf nodes. The
reason the statement involves 2^h rather than 2^(h−1) is that the bottom-most branch
point has two distinguished descendants rather than one.
Theorem 8.2  (Ogden's Lemma)
Note that the ordinary pumping lemma is identical to the special case of
Theorem 8.2 in which all the positions of u are distinguished.
(The reason for this choice will be clear shortly.) Let us also designate the first n positions of
u as the distinguished positions, and let us suppose that v, w, x, y, and z satisfy (8.5)–(8.9).
First, we can see that if either w or y contains two distinct symbols, then we can obtain a
contradiction by considering the string vw^2 x y^2 z, which will no longer have the form a*b*c*.
Second, we know that because wy contains at least one distinguished position, w or y consists
of a's. It follows from these two observations that unless w consists of a's and y consists of
the same number of b's, looking at vw^2 x y^2 z will give us a contradiction, because this string
has different numbers of a's and b's. Suppose now that w = a^j and y = b^j. Let k = n!/j,
which is still an integer, and let m = k + 1. Then the number of a's in vw^m x y^m z is

n + (m − 1) · j = n + k · j = n + n!

which is the same as the number of c's. We have our contradiction, and therefore L cannot be
context-free.
With just the pumping lemma here, we would be in trouble: There would be no way to
rule out the possibility that w and y contained only c’s, and therefore no way to guarantee a
contradiction.
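The arithmetic behind the choice m = n!/j + 1 can be checked directly; the sketch below assumes, as the example indicates, that w = a^j with 1 ≤ j ≤ n and that u has n + n! c's.

```python
# Checking the arithmetic in Example 8.5.
from math import factorial

def pumped_a_count(n: int, j: int) -> int:
    # number of a's in v w^m x y^m z when w = a^j and m = n!/j + 1
    m = factorial(n) // j + 1
    return n + (m - 1) * j

n = 7
for j in range(1, n + 1):
    assert factorial(n) % j == 0              # n!/j is an integer for j <= n
    assert pumped_a_count(n, j) == n + factorial(n)
```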
EXAMPLE 8.6 | Using Ogden's Lemma when the Pumping Lemma Fails

Let L = {a^p b^q c^r d^s | p = 0 or q = r = s}. It seems clear that L should not be a CFL,
because {a^i b^i c^i} is not. The first thing we show in this example, however, is that L satisfies
the properties of the pumping lemma; therefore, the pumping lemma will not help us to show
that L is not context-free.
Suppose n is any positive integer, and u is any string in L with |u| ≥ n, say u = a^p b^q c^r d^s.
We must show the existence of strings v, w, x, y, and z satisfying (8.1)-(8.4). We consider
two cases. If p = 0, then there are no restrictions on the numbers of b's, c's, or d's, and the
choices w = b, v = x = y = Λ work. If p > 0, then we know that q = r = s, and the
choices w = a, v = x = y = Λ work.
Now we can use Theorem 8.2 to show that L is indeed not a CFL. Suppose L is a CFL,
let n be the integer in the theorem, let u = a b^n c^n d^n, and designate all but the first position of
u as distinguished. Suppose that v, w, x, y, and z satisfy (8.5)–(8.9). Then the string wy must
contain one of the symbols b, c, or d and cannot contain all three. Therefore, vw^2 x y^2 z has one
a and does not have equal numbers of b's, c's, and d's, which means that it cannot be in L.
the intersection and complement operations to the list. We can now show, however,
that for context-free languages this is not possible.
The second part of the proof of Theorem 8.3 is a proof by contradiction and appears to be a
nonconstructive proof. If we examine it more closely, however, we can use it to find an example.
Let L1 and L2 be the languages defined in the first part of the proof. Then the language
L1 ∩ L2 = (L1′ ∪ L2′)′ is not a CFL. Therefore, because the union of CFLs is a CFL, at least
one of the three languages L1′, L2′, L1′ ∪ L2′ is a CFL whose complement is not a CFL. Let us
try to determine which.

There are two ways a string can fail to be in L1. It can fail to be an element of R, the
language {a}*{b}*{c}*, or it can be a string a^i b^j c^k for which i ≠ j. In other words,

L1′ = R′ ∪ {a^i b^j c^k | i ≠ j}
308 PART 3 Context-Free Languages and Pushdown Automata
The language R′ is regular because R is, and therefore R′ is context-free. The second
language involved in the union can be expressed as the concatenation

{a^i b^j c^k | i ≠ j} = {a^i b^j | i ≠ j}{c^k | k ≥ 0}

each factor of which is a CFL. Therefore, L1′ is a CFL. A similar argument shows that L2′ is
also a CFL. We conclude that L1′ ∪ L2′, or
Let M1 = (Q1, Σ, Γ, q1, Z1, A1, δ1) be a PDA accepting L1,
and let M2 = (Q2, Σ, q2, A2, δ2) be an FA recognizing L2. We define
Otherwise, y = y′a for some a
It is also worthwhile, in the light of Theorems 8.3 and 8.4, to re-examine the
proof that the complement of a regular language is regular. If the finite automaton
M = (Q, Σ, q0, A, δ) recognizes the language L, then the finite automaton
M′ = (Q, Σ, q0, Q − A, δ) recognizes Σ* − L. We are free to apply the same construction
to the pushdown automaton M = (Q, Σ, Γ, q0, Z0, A, δ) and to consider the PDA
M′ = (Q, Σ, Γ, q0, Z0, Q − A, δ). Theorem 8.3 says that even if M accepts the
context-free language L, M′ does not necessarily accept Σ* − L. Why not?
The problem is nondeterminism. It may happen that for some x ∈ Σ*,

(q0, x, Z0) ⊢*_M (p, Λ, α)

for some state p ∈ A, and

(q0, x, Z0) ⊢*_M (q, Λ, β)

for some other state q ∉ A. This means that the string x is accepted by M as well as by
M′, since q is an accepting state in M′. In the case of finite automata, nondeterminism
can be eliminated: Every NFA-Λ is equivalent to some FA. The corresponding result
about PDAs is false (Theorem 7.1), and the set of CFLs is not closed under the
complement operation.
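The finite-automaton half of this comparison is easy to make concrete. The sketch below complements a toy DFA (our own example, accepting strings that end in a) by replacing the accepting set A with Q − A; because a DFA has exactly one computation per input, the flipped machine accepts exactly the strings the original rejects.

```python
# The complement construction the text re-examines:
# M' = (Q, Sigma, q0, Q - A, delta).
def run(delta, q0, accept, x):
    q = q0
    for c in x:
        q = delta[(q, c)]        # deterministic: exactly one move
    return q in accept

# toy DFA over {a, b} accepting strings ending in 'a'
delta = {("s", "a"): "t", ("s", "b"): "s",
         ("t", "a"): "t", ("t", "b"): "s"}
accept = {"t"}
complement = {"s"}               # Q - A

for x in ["", "a", "ab", "ba", "aba"]:
    # every string is accepted by exactly one of the two machines
    assert run(delta, "s", accept, x) != run(delta, "s", complement, x)
```

For a nondeterministic PDA, several computations on the same input can end in different states, which is exactly why the same trick fails there.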
We would expect that if M is a deterministic PDA (DPDA) recognizing L, then
the machine M′ constructed as above would recognize Σ* − L. Unfortunately, this
is still not quite correct. One reason is that there might be input strings that cause M
to enter an infinite sequence of Λ-transitions and are never processed to completion.
Any such string would be accepted neither by M nor by M′. However, the result
in Exercise 7.46 shows that this difficulty can be resolved and that a DPDA can be
constructed that recognizes Σ* − L. It follows that the complement of a deterministic
context-free language (DCFL) is a DCFL, and in particular that any context-free
language whose complement is not a CFL (Theorem 8.3 and Example 8.7) cannot be
a DCFL.
EXERCISES
8.1. In each case, show using the pumping lemma that the given language is not a
CFL.
a. L = {a^i b^j c^k | …}
b. L = {x ∈ {a, b}* | n_a(x) = n_b(x)^2}
c. L = {… | n ≥ 0}
d. L = {x ∈ {a, b, c}* | n_a(x) = max{n_b(x), n_c(x)}}
e. L = {a^n b^n a^n b^(n+m) | m, n ≥ 0}
8.2. In the pumping-lemma proof in Example 8.2, give some examples of choices
of strings u ∈ L with |u| ≥ n that would not work.
8.3. In the proof given in Example 8.2 using the pumping lemma, the
contradiction was obtained in each case by considering the string vw^0 x y^0 z.
Would it have been possible instead to use vw^2 x y^2 z in each case?
8.4. In Example 8.4, is it possible with Ogden’s lemma rather than the pumping
lemma to use the string u mentioned first, with only two occurrences of the
identifier?
8.5. Decide in each case whether the given language is a CFL, and prove your
answer.
a. L = {a^n … | n ≥ 0}
b. L = {xayb | x, y ∈ {a, b}* and |x| = |y|}
c. L = {xcx | x ∈ {a, b}*}
d. L = {xyx | x, y ∈ {a, b}* and |x| = 1}
e. L = {x ∈ {a, b}* | n_a(x) < n_b(x) < 2n_a(x)}
f. L = {x ∈ {a, b}* | n_a(x) = 10n_b(x)}
g. L = the set of non-balanced strings of parentheses
8.6. State and prove theorems that generalize Theorems 5.3 and 5.4 to
context-free languages. Then give an example to illustrate each of the
following possibilities.
a. Theorem 8.1a can be used to show that the language is a CFL, but the
generalization of Theorem 5.3 cannot.
b. The generalization of Theorem 5.3 can be used to show the language is
not a CFL, but the generalization of Theorem 5.4 cannot.
c. The generalization of Theorem 5.4 can be used to show the language is
not a CFL.
8.7. Show that if L is a DCFL and R is regular, then L ∩ R is a DCFL.
8.8. In each case, show that the given language is a CFL but that its complement
is not. (It follows in particular that the given language is not a DCFL.)
a. {a^i b^j c^k | i = j or j = k}
b. {a^i b^j c^k | i ≠ j or i ≠ k}
c. {x € {a, b}* |x is not ww for any w}
8.9. Use Ogden's lemma to show that the languages below are not CFLs:
a. {a^i b^i c^k | k ≠ i}
b. {a^i b^i a^j b^j | j ≠ i}
c. {a^i b^j a^i | j ≠ i}
8.10. a. Show that if L is a CFL and F is finite, L − F is a CFL.
b. Show that if L is not a CFL and F is finite, then L − F is not a CFL.
c. Show that if L is not a CFL and F is finite, then L ∪ F is not a CFL.
8.11. For each part of the previous exercise, say whether the statement is true if
“finite” is replaced by “regular,” and give reasons.
8.12. For each part of Exercise 8.10, say whether the statement is true if “CFL” is
replaced by “DCFL,” and give reasons.
8.20. Use Exercise 7.40 and Exercise 8.7 to show that the following languages are
not DCFLs. This technique is used in Floyd and Beigel (1994), where the
language in Exercise 7.40 is referred to as Double-Duty(L).
a. pal, the language of palindromes over {0, 1} (Hint: Consider the regular
language corresponding to 0*1*0*#1*0*.)
b. {x ∈ {a, b}* | n_b(x) = n_a(x) or n_b(x) = 2n_a(x)}
c. {x ∈ {a, b}* | n_a(x) < n_b(x) or n_b(x) = 2n_a(x)}
8.21. (Refer to Exercise 7.40.) Consider the following argument to show that if
L ⊆ Σ* is a CFL, then so is {x#y | x ∈ L and xy ∈ L}. (# is assumed to be a
symbol not in Σ.)

Let M be a PDA accepting L, with state set Q. We construct a new PDA
M1 whose state set contains two states q and q′ for every state q ∈ Q. M1
copies M up to the point where the input symbol # is encountered, but using
the primed states rather than the original ones. Once this symbol is seen, if
the current state is q′ for some q ∈ A (i.e., if M would have accepted the
current string), then the machine switches over to the original states for the
rest of the processing. Therefore, it enters an accepting state subsequently if
and only if both the substring preceding the # and the entire string read so
far, except for the #, would be accepted by M.
Explain why this argument is not correct.
8.22. Show that the result at the beginning of the previous exercise is false. (Find
a CFL L so that {x#y |x € L and xy € L} is not a CFL.)
8.23. Show that if L ⊆ {a}* is a CFL, then L is regular.
8.24. Consider the language L = {x ∈ {a, b}* | n_a(x) = f(n_b(x))}. Exercise 5.52
is to show that L is regular if and only if f is ultimately periodic; in other
words, L is regular if and only if there is a positive integer p so that for each r
with 0 ≤ r < p, f is eventually constant on the set S_{p,r} = {jp + r | j ≥ 0}.
Show that L is a CFL if and only if there is a positive integer p so that for
each r with 0 ≤ r < p, f is eventually linear on the set S_{p,r}. "Eventually
linear" on S_{p,r} means that there are integers N, c, and d so that for every
j ≥ N, f(jp + r) = cj + d. (Suggestion: for the "if" direction, show how
to construct a PDA accepting L; for the converse, use the pumping lemma.)
8.25. Let
f(n) = 4n + 7 if n is even, …
PART 4
Turing Machines and Their Languages
CHAPTER 9
Turing Machines
320 PART 4 Turing Machines and Their Languages
Some of these elements should seem familiar. A Turing machine will have a finite
alphabet of symbols (actually two alphabets, an input alphabet and a possibly larger
alphabet for use during the computation) and a finite set of states, corresponding to
the possible “states of mind” of the human computer. Instead of a sheet of paper,
Turing specified a linear “tape,” which has a left end and is potentially infinite to
the right. The tape is marked off into squares, each of which can hold one symbol
from the alphabet; if a square has no symbol on it, we say that it contains the blank
symbol. For convenience, we may think of the squares as being numbered, left-to-
right, starting with 0, although this numbering is not part of the official model and it
is not necessary to refer to the numbers in describing the operation of the machine.
We think of the reading and writing as being done by a tape head, which at any time
is centered on one square of the tape. In our version of a Turing machine—which
is similar although not identical to the one proposed by Turing—a single move is
determined by the current state and the current tape symbol, and consists of three
parts:
1. Changing from the current state to a new state, which may be the same as the old one;
2. Replacing the symbol in the current square by a new symbol, which may be the same;
3. Leaving the tape head where it is, or moving it one square to the left or to the right.
The notation is interpreted to mean that the string xay appears on the tape, beginning
in square 0, that the tape head is on the square containing a, and that all squares to
the right of y are blank. For a nonnull string w, writing (q, xw) or (q, xwy) will
mean that the tape head is positioned at the first symbol of w. If (q, xay) represents
a configuration, then y may conceivably end in one or more blanks, and we would
also say that (q, xayΔ) represents the same configuration; usually, however, when
we write (q, xay) the string y will either be null or have a nonblank last symbol.
Just as in the case of PDAs, we can trace a sequence of moves by showing the
configuration at each step. We write
(q0, Δx)
no outcome, and he is left in suspense. As undesirable as this may seem, we will find
that it is sometimes inevitable. In the examples in this chapter, we can construct the
machine so that this problem does not arise, and every input string is either accepted
or explicitly rejected.
In most simple examples, it will be helpful once again to draw transition diagrams,
similar to but more complicated than those for FAs. The move

δ(q, X) = (r, Y, D)

(where D is R, L, or S) will be represented by an arrow labeled X/Y, D, as in Figure 9.1.

Figure 9.1 |
A single Turing machine move.

Our first example should make it clear, if it is not already, that Turing machines
are at least as powerful as finite automata.
Figure 9.2 |
An FA and a TM to accept {a, b}*{aba}{a, b}*.
Figure 9.3 |
An FA and a TM to accept {a, b}*{aba}.
A TM Accepting pal

To see a little more of the power of Turing machines, let us construct a TM to accept the
language pal of palindromes over {a, b}. Later in this chapter we will introduce the possibility
of nondeterminism in a TM, which would allow us to build a machine simulating the PDA
in Example 7.2 directly. However, the flexibility of TMs allows us to select any algorithm,
without restricting ourselves to a specific data structure such as a stack. We can easily formulate
a deterministic approach by thinking of how a long string might be checked by hand. You might
position your two forefingers at the two ends. As your eyes jump repeatedly back and forth
comparing the two end symbols, your fingers, which are the markers that tell your eyes how
far to go, gradually move toward the center. In order to translate this into a TM algorithm,
we can use blank squares for the markers at each end. Moving the markers toward the center
corresponds to erasing (i.e., changing to blanks) the symbols that have just been tested. The
tape head moves repeatedly back and forth, comparing the symbol at one end of the remaining
nonblank string to the symbol at the other end. The transition diagram is shown in Figure 9.4.
Again the tape alphabet is {a, b}, the same as the input alphabet. The machine takes the top
path each time it finds an a at the beginning and attempts to find a matching a at the end.
If it encounters a b in state q3, so that it is unable to match the a at the beginning, it enters
the reject state hr. (As in Figure 9.2c, this transition is not shown.) Similarly, it rejects from
the corresponding state on the bottom path if it is unable to match a b at the beginning.
We trace the moves made by the machine for three different input strings: a nonpalindrome,
an even-length palindrome, and an odd-length palindrome.
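The back-and-forth erasing strategy just described can also be mirrored at a higher level. The sketch below is ours (a whole left-to-right-and-back pass of the TM collapses into one loop iteration); it is not the machine of Figure 9.4 itself.

```python
def accepts_pal(w):
    """Mimic the palindrome TM at the level of whole passes: erase the
    leading symbol, run to the far end, and demand that the last remaining
    symbol match it; repeat until at most one symbol is left."""
    tape = list(w)
    while len(tape) > 1:
        first = tape.pop(0)   # erase (blank out) the symbol at the beginning
        last = tape.pop()     # the TM runs right and tests the final symbol
        if first != last:
            return False      # a mismatch sends the TM to the reject state
    return True               # zero or one symbol left: accept
```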
Figure 9.4 | A TM to accept palindromes over {a, b}.
A TM Accepting {ss | s ∈ {a, b}*}
The idea behind the TM will be to separate the processing into two parts: first, finding the
middle of the string, and making it easier for the TM to distinguish the symbols in the second
half from those in the first half; second, comparing the two halves. We accomplish the first task
by working our way in from both ends simultaneously, changing symbols to their uppercase
versions as we go. This means that our tape alphabet will include A and B in addition to the
input symbols a and b. Once we arrive at the middle—which will happen only if the string is
of even length—we may change the symbols in the first half back to their original form. The
second part of the processing is to start at the beginning again and, for each lowercase symbol
in the first half, compare it to the corresponding uppercase symbol in the second. We keep
track of our progress by changing lowercase symbols to uppercase and erasing the matching
uppercase symbols.

There are two ways that an input string can be rejected. If its length is odd, the TM will
discover this in the first phase. If the string has even length but a symbol in the first half fails
to match the corresponding symbol in the second half, the TM will reject the string during the
second phase.

The TM suggested by this discussion is shown in Figure 9.5. Again we trace it for three
strings: two that illustrate both ways the TM can reject the input, and one that is in the language.
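The two-phase plan can be summarized in a few lines of Python; this is our own high-level sketch of the algorithm the TM implements, not the machine of Figure 9.5 itself (the uppercase marking is replaced by index arithmetic).

```python
def accepts_ss(w):
    """Two phases, as in the discussion of Figure 9.5: first find the middle
    (rejecting odd-length input), then compare the halves symbol by symbol."""
    n = len(w)
    if n % 2 != 0:
        return False            # phase 1: odd length is discovered at the middle
    half = n // 2
    for i in range(half):       # phase 2: compare first half to second half
        if w[i] != w[half + i]:
            return False        # mismatch rejects during the second phase
    return True
```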
(q₀, Δabaa)
⊢ (q₁, Δabaa) ⊢ (q₂, ΔAbaa) ⊢* (q₂, ΔAbaaΔ)
⊢ (q₃, ΔAbaa) ⊢ (q₄, ΔAbaA) ⊢* (q₁, ΔAbaA)
⊢ (q₁, ΔAbaA) ⊢ (q₂, ΔABaA) ⊢* (q₂, ΔABaA)
⊢ (q₃, ΔABaA) ⊢ (q₄, ΔABAA) ⊢* (q₁, ΔABAA)
⊢ (q₅, ΔABAA) ⊢ (q₅, ΔABAA) ⊢* (q₅, ΔabAA)
(first phase completed)
Figure 9.5 | A Turing machine to accept {ss | s ∈ {a, b}*}.
(q₀, Δabab) ⊢*
(same as the previous case, up to the 3rd-from-last move)
⊢ (q₅, ΔABAB) ⊢ (q₇, ΔABAB) ⊢ (q₇, ΔABAB)
⊢ (q₉, ΔABA) ⊢ (q₉, ΔAB) ⊢ (q₉, ΔABΔ)
⊢ (hₐ, ΔAB) (accept)
Let T = (Q, Σ, Γ, q₀, δ) be a Turing machine, and let f be a partial function
on Σ* with values in Γ*. We say T computes f if for every x ∈ Σ* at
which f is defined,

(q₀, Δx) ⊢* (hₐ, Δf(x))

and no other input string is accepted by T.
It is still not quite correct to say that a TM computes only one function. One rea-
son is that two functions can look exactly alike except for having officially different
codomains (see Section 1.3). Another reason is that a TM might be viewed as comput-
ing either a function of one variable or a function of more than one. For example, if T
computes the function f : (Σ*)² → Γ*, then T also computes f₁ : Σ* → Γ* defined
by f₁(x) = f(x, Λ). We can say, however, that for any specified k, and any C ⊆ Γ*,
a given TM computes at most one function of k variables having codomain C.
Numerical functions of numerical arguments can also be computed by Turing ma-
chines, once we choose a way of representing the numbers by strings. We will restrict
ourselves to natural numbers (nonnegative integers), and we generally use the "unary"
representation, in which the integer n is represented by the string 1ⁿ = 11···1.
The TM we construct in Figure 9.6 to compute the function will reverse the input string "in
place" by moving from the ends toward the middle, at each step swapping a symbol in the first
half with the matching one in the second half. In order to keep track of the progress made so far,
symbols will also be changed to uppercase. A pass that starts in state q₁ with a lowercase symbol
on the left changes it to the corresponding uppercase symbol and remembers it (by going to
q₂ in the case of an a and q₄ in the case of a b) as it moves the tape head to the right. When the
TM arrives at q₃ or q₅, if there is a lowercase symbol on the right corresponding to the one on
Figure 9.6 | Reversing a string.
the left, the TM sees it, remembers it (by going to either q₆ or q₇), and changes it to the symbol
it remembers from the left. The tape head is moved back to the left, and the first uppercase
symbol that is encountered is changed to the (uppercase) symbol that had been on the right.
For even-length strings, the last swap will return the TM to q₁, and at that point the absence of
any more lowercase symbols sends the machine to q₈ and the final phase of processing. In the
case of an odd-length string, the last pass will not be completed normally, because the machine
will discover at either q₃ or q₅ that there is no lowercase symbol to swap with the one on the
left (which therefore turns out to have been the middle symbol of the string). When the swaps
have been completed, all that remains is to move the tape head to the end of the string and
make one final pass back to the left, changing all the uppercase symbols back to lowercase.
We trace the TM in Figure 9.6 for the odd-length string abb and the even-length string
baba.
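The swap-from-both-ends idea behind Figure 9.6 can be sketched at a higher level; in the Python version below (ours, not the book's machine) the uppercase progress markers are modeled by a pair of indices closing in on the middle.

```python
def reverse_in_place(w):
    """High-level mirror of the TM of Figure 9.6: repeatedly swap the leftmost
    and rightmost unprocessed symbols, working toward the middle. The TM marks
    progress with uppercase letters; here the index pair (i, j) plays that role."""
    tape = list(w)
    i, j = 0, len(tape) - 1
    while i < j:
        tape[i], tape[j] = tape[j], tape[i]  # one full left-right pass of the TM
        i += 1                               # these squares are now "uppercase"
        j -= 1
    return "".join(tape)                     # the final pass restores lowercase
```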
Figure 9.7 |
A Turing machine to compute n mod 2.
EXAMPLE 9.5  n mod 2
The numerical function that assigns to each natural number n the remainder when n is divided
by 2 can be computed by moving to the end of the input string, making a pass from right to
left in which the 1’s are counted and simultaneously erased, and either leaving a single 1 (if
the original number was odd) or leaving nothing. The TM that performs this computation is
shown in Figure 9.7.
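The unary convention and the counting-while-erasing pass of Figure 9.7 can be sketched directly; the helper names below are ours, chosen for illustration.

```python
def to_unary(n):
    """The unary representation: the natural number n is the string 1^n."""
    return "1" * n

def tm_mod2(tape):
    """Mirror the single right-to-left pass of Figure 9.7: count and erase
    the 1's simultaneously, then leave a single 1 (odd) or nothing (even)."""
    parity = 0
    for symbol in reversed(tape):
        if symbol == "1":
            parity ^= 1        # toggle parity as each 1 is erased
    return "1" * parity        # the final tape contents, in unary
```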
χ_L(x) = 1 if x ∈ L
χ_L(x) = 0 otherwise
Computing the function χ_L is therefore similar in one respect to accepting the language L (see
Section 9.1); instead of distinguishing between strings in L and strings not in L by accepting or
not accepting, the TM accepts every input, and distinguishes between the two types of strings
by ending up in the configuration (hₐ, Δ1) in one case and the configuration (hₐ, Δ0) in the
other.

If we have a TM T computing χ_L, we can easily obtain one that accepts L. All we have to
do is modify T so that when it leaves output 0, it enters the state hᵣ instead of hₐ. Sometimes it
is possible to go the other way; a simple example is the language L of palindromes over {a, b}
(Example 9.2). A TM accepting L is shown in Figure 9.4, and a TM computing χ_L is shown
in Figure 9.8.

It is obtained from the previous one by identifying the places in the transition diagram
where the TM might reject, and modifying the TM so that instead of entering the state hᵣ in
those situations, it continues in a way that ends up in state hₐ with output 0. For any language
L accepted by a TM T that halts on every input string, another TM can be constructed from T
that computes χ_L, although the construction may be more complicated than in this example.
A TM of either type effectively allows an observer to decide whether a given input string is in
L; a "no" answer is produced in one case by the input being rejected and in the other case by
output 0.
As we saw in Section 9.1, however, a TM can accept a language L and still leave the
question of whether x ∈ L unanswered for some strings x, by looping forever on those inputs.
Figure 9.8 | Computing χ_L for the set of palindromes.
(If we could somehow see that the TM was in an infinite loop, we would have the answer;
but if we were depending on the TM to tell us, we would wait forever.) In this case, a TM
computing the function χ_L would be better, because it would guarantee an answer for every
input. Unfortunately, it is no longer clear that such a machine can be obtained from T. We
will return to this question in Chapter 10.
state of T₁, and executes the moves of T₁ (using the function δ₁) up to the point where
T₁ would halt; for any move that would cause T₁ to halt in the accepting state, T₁T₂
executes the same move except that it moves instead to the initial state of T₂. At this
point the tape head is positioned at the square on which T₁ halted. From this point
on, the moves of T₁T₂ are the moves of T₂ (using the function δ₂). If either T₁ or T₂
would reject during this process, T₁T₂ does also, and T₁T₂ accepts precisely if and
when T₂ accepts.
In order to use this composite machine in a larger context, in a manner similar to
a transition diagram but without showing the states explicitly, we might also write

T₁ → T₂
We can also make the composition conditional, depending on the current tape
symbol when T₁ halts. We might write

T₁ →ᵃ T₂

to stand for the composite machine T₁T′T₂, where T′ is described by the diagram
in Figure 9.9. This composite machine can be described informally as follows: It
Figure 9.10 |
the accepting state; and if it is anything else, reject. In the first case, assuming T halts
normally, repeat the execution of T until T halts normally scanning some symbol
other than a; if at that point the tape symbol is b, halt normally, otherwise reject.
(The machine might also reject during one of the iterations of T, and it might loop
forever, either because one of the iterations of T does or because T halts normally
with current tape symbol a every time.)
Although giving a completely precise definition of an arbitrary combination of
TMs would be complicated, it is usually clear in specific examples what is involved.
There is one possible source of confusion, however, in the notation we are adopting.
Consider a TM T of the form T₁ →ᵃ T₂. If T₁ halts normally scanning some symbol
not specified explicitly (i.e., other than a), T rejects. However, if T₂ halts normally,
T does also—even though no tape symbols are specified explicitly. We could avoid
this seeming inconsistency by saying that if T₁ halts normally scanning a symbol
other than a, T halts normally, except that T would then not be equivalent to the
composition T₁T′T₂ described above, and this seems undesirable. In our notation, if
at the end of one sub-TM's operation at least one way is specified for the composite
TM to continue, then any option that allows accepting at that point must be shown
explicitly, as in Figures 9.11a and 9.11b. (The second figure is a shortened form of
the first.)
Some basic TM building blocks, such as moving the head a specified number of
positions in one direction or the other, writing a specific symbol in the current square,
and searching to one direction or the other for a specified symbol, are straightforward
and do not need to be spelled out. We consider a few slightly more involved operations.
Figure 9.11 |
Figure 9.12 | A Turing machine to copy strings.
Figure 9.13 | A Turing machine to delete a symbol.
Inserting a symbol a, or changing the tape contents from yz to yaz, would be done virtually
the same way, except that the single pass would go from left to right, and the move that starts
it off would write a instead of A. You are asked to complete this machine in Exercise 9.13.
The Delete machine transforms yaz to yz. What if it is called when the tape
contents are not yaz, but yazΔw, where w is some arbitrary string of symbols?
At first it might seem that the TM ought to be designed so as to finish with yzΔw
on the tape. A closer look, however, shows that this is unreasonable. Unless we
know something about the computations that have gone on before, or unless the
rightmost nonblank symbol has been marked somehow so that we can recognize it,
we have no way of finding it. The instructions "move the tape head to the square
containing the rightmost nonblank symbol" cannot ordinarily be executed by a TM
(Exercise 9.12). In general, a Turing machine is designed according to specifications,
which say that if it starts in a certain configuration it should halt normally in some
other specified configuration. The specifications may say nothing about what the
result should be if the TM starts in some different configuration. The machine may
satisfy its specifications and yet behave unpredictably, perhaps halting in hᵣ or looping
forever, in these abnormal situations.
Figure 9.14 |
Another way of accepting pal.
not to be elements of Γ. In both cases it is easy to see that the restrictions do not
reduce the power of the machine. You are referred to the exercises for the details.
One identifiable difference between a Turing machine and a typical human com-
puter is that a TM has a one-dimensional tape with a left end, rather than sheets of
paper that might be laid out in both directions. One way to try to increase the power of
the machine, therefore, might be to remove one or both of these restrictions: to make
the “tape” two-dimensional, or to remove the left end and make the tape potentially
infinite in both directions. In either case we would start by specifying the rules under
which the machine would operate, and the conventions that would be followed with
regard to input and output. Again the conclusion is that the power of the machine
is not significantly changed by either addition, and again we leave the details to the
exercises.
Rather than modifying the tape, we might instead add extra tapes. We could
decide in that case whether to have a single tape head, which would be positioned at
the same square on all the tapes, or a head on each tape that could move independently
of the others. We choose the second option. An n-tape Turing machine will be
specifiable as before by a 5-tuple T = (Q, Σ, Γ, q₀, δ). It will make a move on the
basis of its current state and the n-tuple of tape symbols currently being examined;
since the tape heads move independently, we describe the transition function as a
partial function

δ : Q × (Γ ∪ {Δ})ⁿ → (Q ∪ {hₐ, hᵣ}) × (Γ ∪ {Δ})ⁿ × {R, L, S}ⁿ

The notion of configuration generalizes in a straightforward way: A configuration of
an n-tape TM is specified by an (n + 1)-tuple of the form

(q, x₁a₁y₁, x₂a₂y₂, …, xₙaₙyₙ)
Then there is a one-tape TM T₂ = (Q₂, Σ, Γ₂, q₂, δ₂)
satisfying the following two conditions.
If the current square contains #, reject, since T₁ would have tried to
move the tape head off tape 2. If not, and if the new square does not contain a
pair of symbols (because D₂ = R and T₁ has not previously examined positions
this far to the right), convert the symbol a there to the pair (a, Δ′); if the new
square does contain a pair, say (a, b), convert it to (a, b′). Move the tape head
back to the beginning.

Locate the pair (a, c) again, as in step 1. Convert it to (b₁, c) and move …
9.5 | NONDETERMINISTIC TURING MACHINES
Nondeterminism plays different roles in the two simpler models of computation we
studied earlier. It is convenient but not essential in the case of FAs, whereas the
language pal is an example of a context-free language that cannot be accepted by
any deterministic PDA. Turing machines have enough computing power that once
again nondeterminism fails to add any more. Any language that can be accepted by a
nondeterministic TM can be accepted by an ordinary one. The argument we present
to show this involves a simulation more complex than that in the previous section.
Nevertheless, the idea of the proof is straightforward, and the fact that the details get
complicated can be taken as evidence that TMs capable of implementing complex
algorithms can be constructed from the same kinds of routine operations we have
used previously.
A nondeterministic Turing machine (NTM) T = (Q, Σ, Γ, q₀, δ) is defined ex-
actly the same way as an ordinary TM, except that values of the transition function δ
are subsets, rather than single elements, of the set (Q ∪ {hₐ, hᵣ}) × (Γ ∪ {Δ}) × {R, L, S}.
We do not need to say that δ is a partial function, because now δ(q, a) is allowed to
take the value ∅.
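A breadth-first exploration of an NTM's computation tree, in the spirit of the simulation this section goes on to describe, can be sketched as follows. Everything here is our own illustration: the state and symbol conventions, the step bound, and the toy example (an NTM that guesses which a to accept on).

```python
from collections import deque

def ntm_accepts(delta, w, max_steps=1000):
    """Breadth-first search of the computation tree of an NTM. delta maps
    (state, symbol) to a *set* of (state, symbol, direction) triples;
    "ha" is the accepting halt state and "_" is the blank."""
    start = ("q0", ("_",) + tuple(w), 0)
    frontier = deque([start])
    for _ in range(max_steps):         # bound the search so the sketch terminates
        if not frontier:
            return False               # every branch halted without accepting
        state, tape, head = frontier.popleft()
        if state == "ha":
            return True
        for new_state, write, d in delta.get((state, tape[head]), set()):
            new_tape = tape[:head] + (write,) + tape[head + 1:]
            new_head = head + {"R": 1, "L": -1, "S": 0}[d]
            if new_head < 0:
                continue               # fell off the left end: this branch rejects
            if new_head == len(new_tape):
                new_tape += ("_",)
            frontier.append((new_state, new_tape, new_head))
    return False

# Hypothetical toy NTM (ours, not from the book): in q1 it may either keep
# scanning or, on an a, guess that this is the symbol to accept on.
guess_delta = {
    ("q0", "_"): {("q1", "_", "R")},
    ("q1", "a"): {("ha", "a", "S"), ("q1", "a", "R")},
    ("q1", "b"): {("q1", "b", "R")},
}
```

The frontier queue plays the role of trying all one-move sequences, then all two-move sequences, and so on, just as in the proof sketched below.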
Figure 9.15 | The computation tree for a nondeterministic TM.
We can therefore think of T₂'s job as searching the tree for accepting
configurations. Because the tree might be infinite, a breadth-first approach is
appropriate: T₂ will try all possible single moves, then all possible sequences
of two moves, then all possible sequences of three moves, and so on. The
machine we actually construct will be inefficient in that every sequence of
n + 1 moves will involve repeating a sequence of n moves tried previously.

Even if the tree is finite (which means that for some n, every possible se-
quence of n moves T₁ can make on input x leads to a halt), T₂ will still loop
forever if T₁ never accepts: It will attempt to try longer and longer sequences
of moves, and the effect will be that it ends up repeating the same sequences
of moves, in the same order, over and over. However, if x ∈ L(T₁), then for
some n there is a sequence of n moves that causes T₁ to accept x, and
T₂ will eventually get around to trying that sequence.
We will take advantage of Theorem 9.1 by giving T₂ three tapes. The
first is used only to save the original input string, and its contents are never
changed. The second is used to keep track of the sequence of moves of T₁
that T₂ is currently attempting to execute. The third is the "working tape,"
corresponding to T₁'s tape, where T₂ actually carries out the steps specified
by the current string on tape 2. Every time T₂ begins trying a new sequence,
the third tape is erased and the input from tape 1 re-copied onto it.

A particular sequence of moves will be represented by a string of bi-
nary digits. The string 001, for example, represents the following sequence:
first, the move representing the first (i.e., 0th) of the two choices from the
initial configuration C₀, which takes T₁ to some configuration C₁; next,
the first possible move from the configuration C₁, which leads to some
configuration C₂; next, the second possible move from C₂. Because moves
Figure 9.16 | Simulating a nondeterministic TM by a three-tape TM.
0 and 1 may be the same, there may be several strings of digits that describe
the same sequences of moves. There may also be strings of digits that do not
correspond to sequences of moves, because the first few moves cause T₁ to
halt. When T₂ encounters a digit that does not correspond to an executable
move, it will abandon the string.
We will use the canonical ordering of {0, 1}*, in which the strings are
arranged in the order

Λ, 0, 1, 00, 01, 10, 11, 000, 001, …, 111, 0000, …
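Canonical order (shorter strings first; strings of equal length alphabetically) is easy to generate; a sketch, with the generator name chosen by us:

```python
from itertools import count, product

def canonical_order(alphabet=("0", "1")):
    """Yield the strings over the alphabet in canonical order: all strings
    of length 0, then length 1, then length 2, ..., each length
    enumerated in alphabetical order."""
    for n in count(0):
        for tup in product(alphabet, repeat=n):
            yield "".join(tup)
```

This is the order in which the simulating machine works through candidate move sequences, so every finite sequence is eventually reached.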
Figure 9.16 describes the general structure of T₂, which is composed
of five smaller TMs called InitializeTapes2&3, CopyInput, NextSequence,
Execute, and EraseTape3.
Figure 9.17 |
The one-tape version of NextSequence.
Figure 9.18 |
A typical small portion of Execute.
Figure 9.19 |
An NTM to accept strings of length 2ⁱ.
Now look at the nondeterministic TM T in Figure 9.19. T moves past the input string,
places a single 1 on the tape, separated from the input by a blank, and after positioning the tape
head on the blank, executes Double zero or more times before returning the tape head to square
0. Finally, it executes the TM Equal, which works as follows: Starting with tape contents
ΔxΔy, where x and y are strings of 1's, Equal accepts if and only if x = y (see Example 9.9).
The nondeterminism in T lies in the indeterminate number of times Double is executed.
When Equal is finally executed, the string following the input on the tape represents some
arbitrary power of 2. If the original input string happens to represent a power of 2, say 2ⁱ, then
there is a sequence of choices T can make that will cause the input to be accepted—namely,
the sequence in which Double is executed exactly i times. On the other hand, if the input string
is not a power of 2, it will fail to be accepted, because in the last step it is compared to a string
that must be a power of 2. Our conclusion is that T accepts the language {1^(2ⁱ) | i ≥ 0}.
We do not need nondeterminism in order to accept this language. (Nondeterminism is
never necessary, by Theorem 9.2.) It merely simplifies the description. One deterministic
way to test whether an integer is a power of 2 is to test the integer first to see if it is 1 and, if
not, perform a sequence of divisions by 2. If at any step before reaching 1 we get a nonzero
remainder, we answer no. If we finally obtain the quotient 1 without any of the divisions
producing a remainder, we answer yes. We would normally say, however, that multiplying
by 2 is easier than dividing by 2. An easier approach, therefore, might be to start with 1 and
perform a sequence of multiplications by 2. We could compare the result of each of these to the
original input, accepting if we eventually obtained a number equal to it, and either rejecting if
we eventually obtained a number larger than the input, or simply letting the iterations continue
forever.
The nondeterministic solution T in Figure 9.19 is closer to the second approach, except
that instead of comparing the input to each of the numbers 2ⁱ, it guesses a value of i and tests
that value only. Removing the nondeterminism means replacing the guess by an iteration in
which all the values are tested; a deterministic TM that did this would simply be a more efficient
version of the TM constructed in the proof of Theorem 9.2, which tests all possible sequences
of moves of T.
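The deterministic doubling strategy described above can be stated in a few lines; the function name is ours, and the loop body plays the role of one execution of the Double machine.

```python
def is_power_of_two_by_doubling(n):
    """The deterministic strategy from the text: start with 1 and repeatedly
    double, comparing against n, instead of guessing the exponent as the
    NTM of Figure 9.19 does."""
    if n < 1:
        return False
    m = 1
    while m < n:
        m *= 2                 # one execution of the "Double" machine
    return m == n              # the "Equal" machine's final test
```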
Convention. We assume from this point on that there are two fixed infinite sets
𝒬 = {q₁, q₂, …} and 𝒮 = {a₁, a₂, …} so that for any Turing machine T =
(Q, Σ, Γ, q₀, δ), we have Q ⊆ 𝒬 and Γ ⊆ 𝒮.
It should be clear that this assumption about states is not a restriction at all,
because the names assigned to the states of a TM are irrelevant. Furthermore, as
long as 𝒮 contains all the letters, digits, and other symbols we might want in our
input alphabets, the other assumption is equally harmless (no more restrictive, for
example, than limiting the character set on a computer to 256 characters). Once we
have a subscript attached to every possible state and tape symbol, we can represent a
state or a symbol by a string of 0’s of the appropriate length; 1’s are used as separators.
First we associate to each tape symbol (including Δ), to each state (including
hₐ and hᵣ), and to each of the three directions, a string of 0's. Let

s(Δ) = 0
s(aᵢ) = 0ⁱ⁺¹ (for each aᵢ ∈ 𝒮)
s(hₐ) = 0
s(hᵣ) = 00
s(qᵢ) = 0ⁱ⁺² (for each qᵢ ∈ 𝒬)
s(S) = 0
s(L) = 00
s(R) = 000
Each move m of a TM, described by the formula

δ(p, a) = (q, b, D)

is encoded by the string

e(m) = s(p)1s(a)1s(q)1s(b)1s(D)1

and any TM T with initial state q is encoded by the string

e(T) = s(q)1e(m₁)1e(m₂)1···e(mₖ)1

where m₁, m₂, …, mₖ are the distinct moves of T, arranged in some arbitrary
order. Finally, any string z = z₁z₂···zₖ, where each zᵢ ∈ 𝒮, is encoded by

e(z) = 1s(z₁)1s(z₂)1···s(zₖ)1
The 1 at the beginning of the string e(z) is included so that in a composite string
of the form e(T)e(z), there will be no doubt as to where e(T) stops. Notice that one
consequence is that the encoding s(a) of a single symbol a ∈ 𝒮 is different from the
encoding e(a) of the one-character string a.

Because the moves of a TM T can appear in the string e(T) in any order, there
will in general be many correct encodings of T. However, any string of 0's and 1's
can be the encoding of at most one TM.
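The encoding can be made concrete in a few lines of Python. The 0-and-1 scheme is the book's; the conventions for passing arguments are ours (symbols are given by their index in 𝒮, with 0 standing for the blank Δ, and states by names like "q3"). With a = a₁ and b = a₂, the encoding of the string baa comes out as 10001001001, matching the universal-TM discussion that follows.

```python
def s_symbol(i):
    """s(Delta) = 0, s(a_i) = 0^(i+1); i is the symbol's index in S (0 = blank)."""
    return "0" * (i + 1)

def s_state(name):
    if name == "ha":
        return "0"                     # s(h_a) = 0
    if name == "hr":
        return "00"                    # s(h_r) = 00
    return "0" * (int(name[1:]) + 2)   # s(q_i) = 0^(i+2), e.g. "q3" -> 00000

S_DIR = {"S": "0", "L": "00", "R": "000"}

def e_move(p, a, q, b, d):
    """e(m) = s(p)1 s(a)1 s(q)1 s(b)1 s(D)1 for the move delta(p, a) = (q, b, D)."""
    return "1".join([s_state(p), s_symbol(a), s_state(q), s_symbol(b), S_DIR[d]]) + "1"

def e_tm(q0, moves):
    """e(T) = s(q)1 e(m1)1 e(m2)1 ... e(mk)1, moves in arbitrary order."""
    return s_state(q0) + "1" + "".join(e_move(*m) + "1" for m in moves)

def e_string(indices):
    """e(z) = 1 s(z1)1 s(z2)1 ... s(zk)1; the leading 1 marks where e(T) stops."""
    return "1" + "".join(s_symbol(i) + "1" for i in indices)
```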
Figure 9.20 |
Remember that the first part of the string, in this case 0001, is to identify the initial state of the
TM. The individual moves in the remainder of the string are separated by spaces for readability.

The input to the universal TM Tᵤ will consist of a string of the form e(T)e(z),
where T is a TM and z is a string over T's input alphabet. In Example 9.11, if the
input string to T were baa, the corresponding input string to Tᵤ would consist of the
string e(T) given in the example, followed by 10001001001. On any input string of
the form e(T)e(z), we want Tᵤ to accept if and only if T accepts input z, and in this
case we want the output from Tᵤ to be the encoded form of the output produced by T
on input z.
Now we are ready to construct Tᵤ. It will be convenient to give it three tapes.
According to our convention for multitape TMs, the first tape will be both the input
and output tape. It will initially contain the input string e(T)e(z). The second tape
will be the working tape during the simulation of T, and the third tape will contain
the encoded form of the state T is currently in.

The first step of Tᵤ is to move the string e(z) (except for the initial 1) from the
end of tape 1 to tape 2, beginning in square 3. Since T begins with its leftmost square
blank, Tᵤ will write 01 (because 0 = s(Δ)) in squares 1 and 2 of tape 2; square 0
is left blank, and the tape head is positioned on square 1. The next step for Tᵤ is to
copy the encoded form of T's initial state from the beginning of tape 1 onto tape 3,
beginning in square 1, and to delete it from tape 1.
After these initial steps, Tᵤ is ready to begin simulating the action of T (encoded
on tape 1) on the input string (encoded on tape 2). As the simulation starts, the three
tape heads are all in square 1. The next move of T at any point is determined by T's
state (encoded on tape 3) and the current symbol on T's tape, whose encoding starts
in the current position on tape 2. In order to simulate this move, Tᵤ must search tape
1 for the 5-tuple whose first two parts match this state-input combination. Abstractly
this is a straightforward pattern-matching operation, and a TM that carries it out is
shown in Figure 9.21. Since the search operation never changes any tape symbols,
we have simplified the labeling slightly, writing only the symbols read by the three
tape heads and the directions in which they move.
Figure 9.21 |
Finding the right move on tape 1.
The 5-tuple on tape 1 specifies that T's current symbol should be changed to b,
the head should be moved left, and the state should be changed to state 3. These
operations can be carried out by Tᵤ in a fairly straightforward way, and we omit the
details. The final result is

Δ00010100001010001100001001000001000100110001…
Δ0100100010001Δ…
Δ00000Δ…

and Tᵤ is now ready to simulate the next move.
There are several ways this process might stop. T might halt abnormally, either
because there is no move specified or because the move calls for it to move its tape
head off the tape. In the first case, the search operation pictured in Figure 9.21 also
halts abnormally (although the move to hᵣ is not shown explicitly), because after the
last 5-tuple on tape 1 has been tried unsuccessfully, the second of the 1's at the end
takes the machine back to the initial state, and there is no move from that state with 1
on tape 1. We can easily arrange for Tᵤ to reject in the second case as well. Finally,
T may accept. Tᵤ detects this when it processes a 5-tuple on tape 1 whose third part
is a single 0. In this case, after Tᵤ has changed tape 2 appropriately, it erases tape 1,
copies tape 2 onto tape 1, and accepts.
1. The nature of the model makes it seem likely that all the steps that are crucial to
human computation can be carried out by a TM. Of course, there are
differences in the details of how they are carried out. A human normally works
with a two-dimensional sheet of paper, not a one-dimensional tape, and a
human is perhaps not restricted to transferring his or her attention to the
location immediately adjacent to the current one. However, although working
within the constraints imposed by a TM might make certain steps in a
computation awkward, it does not appear to limit the types of computation that
are possible. For example, if the two-dimensional aspect of the paper really
plays a significant role in a computation, the TM tape can be organized so as to
simulate two dimensions. This may mean that two locations contiguous on a
sheet of paper are not contiguous on the tape; the only consequence is that the
TM may require more moves to do what a human could do in one.
2. Various enhancements of the TM model have been suggested in order to make
the operation more like that of a human computer, or more convenient, or more
efficient. These include the enhancements mentioned in this chapter, such as
doubly infinite tapes, multiple tapes, and nondeterminism. In each case, it is
possible to show that the computing power of the machine is unchanged.
3. Other theoretical models of computation have been proposed. These include
machines such as those mentioned earlier in this section, machines that are
EXERCISES
9.1. Trace the TM in Figure 9.5 (the one accepting the language
{ss | s ∈ {a, b}*}) on the string aaba. Show the configuration at each step.
9.2. Below is a transition table for a TM.
same language that is more similar to the FA in that it accepts a string only
after it has read all the symbols of the string.
9.3. Figures 9.2 and 9.3 show two examples of converting an FA to a TM
accepting the same language. Describe precisely how this can be done for an
arbitrary FA.
9.6. Draw a transition diagram for a Turing machine accepting each of the
following languages.
a. {aⁱbʲ | i < j}
b. {aⁿbⁿcⁿ | n ≥ 0}
c. {x ∈ {a, b, c}* | n_a(x) = n_b(x) = n_c(x)}
d. The language of balanced strings of parentheses
e. The language of all nonpalindromes over {a, b}
f. {www | w ∈ {a, b}*}
9.7. Describe the language (a subset of {1}*) accepted by the TM in Figure 9.22.
9.8. We do not define Λ-transitions for a TM. Why not? What features of a TM
make it unnecessary or inappropriate to talk about Λ-transitions?
9.9. Suppose T₁ and T₂ are TMs accepting languages L₁ and L₂ (both subsets of
Σ*), respectively. If we were following as closely as possible the method
used in the case of finite automata to accept the language L₁L₂, we might
form the composite TM T₁T₂. (See the construction of M in the proof of
Theorem 4.4.) Explain why this approach, or any obvious modification of it,
will not work.
Figure 9.22 |
9.10. Given TMs T₁ = (Q₁, Σ₁, Γ₁, q₁, δ₁) and T₂ = (Q₂, Σ₂, Γ₂, q₂, δ₂), with
Γ₁ ⊆ Σ₂, give a precise definition of the TM T₁T₂ = (Q, Σ, Γ, q₀, δ). Say
precisely what Q, Σ, Γ, q₀, and δ are.
9.11. Suppose T is a TM accepting a language L. Describe how you would modify
T to obtain another TM accepting L that never halts in the reject state hᵣ.
9.12. Suppose T is a TM that accepts every input. We would like to construct a
TM R_T so that for every input string x, R_T halts in the accepting state with
exactly the same tape contents as when T halts on input x, but with the tape
head positioned at the rightmost nonblank symbol on the tape. (One reason
this is useful is that we might want to use T in a larger composite machine,
but to erase the tape after T has halted.)
a. Show that there is no fixed TM T₀ so that R_T = TT₀ for every T. (In
other words, there is no TM capable of executing the instruction "move
the tape head to the rightmost nonblank tape symbol" in every possible
situation.)
b. Describe a general method for constructing R_T, given T.
9.13. Draw the Insert(σ) TM, which changes the tape contents from yz to yσz.
Here y ∈ (Σ ∪ {Δ})*, σ ∈ Σ ∪ {Δ}, and z ∈ Σ*. You may assume that
Σ = {a, b}.
9.14. Does every TM compute a partial function? Explain.
9.15. In each case, draw a TM that computes the indicated function. In the first
five parts, the function is from ℕ to ℕ. In each of these parts, assume that
the TM uses unary notation; that is, the natural number n is represented by
the string 1^n.
a. f(x) = x + 2
b. f(x) = 2x
c. f(x) = x^2
d. f(x) = x/2 ("/" means integer division.)
e. f(x) = the smallest integer greater than or equal to log₂(x + 1) (i.e.,
f(0) = 0, f(1) = 1, f(2) = f(3) = 2, f(4) = ··· = f(7) = 3, and so
on.)
f. f : {a, b}* × {a, b}* → {0, 1} defined by f(x, y) = 1 if x = y,
f(x, y) = 0 otherwise.
g. f : {a, b}* × {a, b}* → {0, 1} defined by f(x, y) = 1 if x < y,
f(x, y) = 0 otherwise. Here < means with respect to "lexicographic,"
or alphabetical, order. For example, a < aa, abab < abb, etc.
h. f is the same as in the previous part, except that this time < refers to
canonical order. That is, a shorter string precedes a longer one, and the
order of two strings of the same length is alphabetical.
i. f : {a, b}* → {a, b}* defined by f(x) = a^{n_a(x)} b^{n_b(x)} (i.e., f(x) has the
same symbols as x but with all the a's at the beginning).
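Parts (g) and (h) contrast lexicographic order with canonical order. The two comparisons the machines must implement can be sketched in Python (the helper names are ours, not the book's):

```python
# Sketch: the two orderings used in Exercise 9.15, parts (g) and (h),
# for the alphabet {a, b} with a < b.

def lex_less(x: str, y: str) -> bool:
    """Lexicographic (dictionary) order: a < aa, abab < abb."""
    return x < y   # Python string comparison is exactly this order

def canonical_less(x: str, y: str) -> bool:
    """Canonical order: shorter strings first, ties broken alphabetically."""
    return (len(x), x) < (len(y), y)

print(lex_less("a", "aa"), lex_less("abab", "abb"))   # True True
print(canonical_less("aa", "b"))                      # False: "b" is shorter
```

Note that the two orders disagree on pairs like (aa, b): lexicographically aa < b, but canonically b precedes aa because it is shorter.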
9.16. The TM shown in Figure 9.23 computes a function from {a, b}* to {a, b}*.
For any string x € {a, b}*, describe the string f(x).
Figure 9.23 |

Figure 9.24 |
Exercises 9.24–9.25 involve a Turing machine with a doubly infinite tape. The
tape squares on such a machine can be thought of as numbered left to right, as in an
ordinary TM, but now the numbers include all negative integers as well as nonnegative
ones. A configuration can still be described by a pair (q, xay). There is no assumption
about which square the string x begins in; in other words, two configurations that
are identical except for the square in which x begins are considered the same. For
this reason, we may adopt the same convention about the string x as about y: when
we specify a configuration as (q, xay), we may assume that x does not begin with a
blank.
9.24. Construct a TM with a doubly-infinite tape that does the following: If it
begins with the tape blank except for a single a somewhere on it, it halts in
the accepting state with the head scanning the square with the a.
9.25. Let T = (Q, Σ, Γ, q₀, δ) be a TM. Show that there is a TM
T₁ = (Q₁, Σ, Γ₁, q₁, δ₁) with a doubly-infinite tape, with Γ ⊆ Γ₁, satisfying
these two conditions:
a. For any x ∈ Σ*, T₁ accepts input x if and only if T does.
b. For any x ∈ Σ*, if (q₀, Δx) ⊢*_T (h_a, yaz), then (q₁, Δx) ⊢*_{T₁} (h_a, yaz).
9.26. In defining a multitape TM, another option is to specify a single tape head
that scans the same position on all tapes simultaneously. Show that a
machine of this type is equivalent to the multitape TM defined in Section 9.5.
9.27. Draw the portion of the transition diagram for the one-tape TM M₂
embodying the six steps shown in the proof of Theorem 9.1 corresponding to
the move δ₁(p, a₁, a₂) = (q, b₁, b₂, R, L) of M₁.
9.28. Draw a transition diagram for a three-tape TM that works as follows:
starting in the configuration (q₀, Δx, Δy, Δ), where x and y are strings of
0's and 1's of the same length, it halts in the configuration (h_a, Δx, Δy, Δz),
where z is the string obtained by interpreting x and y as binary
representations and adding them.
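The column-by-column addition with carry that such a machine must carry out, scanning the two operands from the right, can be sketched in ordinary code (a Python sketch of the arithmetic, not of the TM itself):

```python
def add_binary(x: str, y: str) -> str:
    """Add two equal-length binary strings the way the three-tape TM
    would: scan from the right, one column at a time, propagating a carry."""
    assert len(x) == len(y)
    z, carry = [], 0
    for a, b in zip(reversed(x), reversed(y)):
        s = int(a) + int(b) + carry
        z.append(str(s % 2))   # the digit written on tape 3
        carry = s // 2         # the carry remembered in the machine's state
    if carry:
        z.append("1")          # a final carry lengthens the result by one
    return "".join(reversed(z))

print(add_binary("1011", "0011"))  # 1110  (11 + 3 = 14)
```

The carry is the only information the TM needs to remember between columns, which is why two states (carry 0 and carry 1) suffice for the scanning phase.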
9.29. What is the effect of the nondeterministic TM with input alphabet {0, 1}
whose transition table is shown below, assuming it starts with a blank tape?
(Assuming that it halts, where is the tape head when it halts, and what strings
might be on the tape?)
q    σ    δ(q, σ)
q₀   Δ    {(q₁, Δ, R)}
q₁   Δ    {(q₁, 0, R), (q₁, 1, R), (q₂, Δ, L)}
q₂   0    {(q₂, 0, L)}
q₂   1    {(q₂, 1, L)}
q₂   Δ    {(h_a, Δ, S)}
9.30. Call the NTM in the previous exercise G. Let Copy be the TM in
Example 9.4, which transforms Δx to ΔxΔx for an arbitrary string
x ∈ {0, 1}*. Finally, let Equal be a TM that works as follows: starting with
the tape ΔxΔy, it accepts if and only if x = y. Consider the NTM shown in
Figure 9.25. (It is nondeterministic because G is.) What language does it
accept?
Figure 9.25 |
(A composite machine built from the components G, Copy, Delete, and Equal.)
9.31. Using the idea in the previous exercise, draw a transition diagram for an
NTM that accepts the language {1^n | n = k² for some k ≥ 0}.
9.32. Using the same general technique, draw a transition diagram for an NTM
that accepts the language {1^n | n is a composite integer ≥ 4}.
9.33. Suppose L is accepted by a TM T. Describe how you could construct a
nondeterministic TM to accept each of the following languages.
a. The set of all prefixes of elements of L
b. The set of all suffixes of elements of L
c. The set of all substrings of elements of L
9.34. Figure 9.18b shows the portion of the Execute TM corresponding to the
portion of M₁ shown in Figure 9.18a. Consider the portion of M₁ shown in
Figure 9.26. Assume as before that the maximum number of choices at any
point in M₁ is 2, and that the moves shown are the only ones from state r.
Draw the corresponding portion of Execute.
9.35. Assuming the same encoding method discussed in Section 9.7, and assuming
that s(0) = 00 and s(1) = 000, draw the TM that is encoded by the string
9.36. Draw the portion of the universal TM T_u that is responsible for changing the
tape symbol and moving the tape head after the search operation has
identified the correct 5-tuple on tape 1. For example, the configuration
Δ00010100001010001100001001000001000100110001...
Δ010010010001Δ...
Δ0000Δ...
would be transformed to
Δ00010100001010001100001001000001000100110001...
Δ0100100010001Δ...
Δ00000Δ...
9.37. Table 7.2 describes a PDA accepting the language pal. Draw a TM that
accepts this language by simulating the PDA. You can make the TM
nondeterministic, and you can use a second tape to represent the stack.
9.38. Suppose we define the canonical order of strings in {0, 1}* to be the order in
which a string precedes any longer string, and two strings of the same length
are ordered numerically. For example, the strings 1, 01, 10, 000, 011, 100 are
listed here in canonical order. Describe informally how to construct a TM T
that enumerates the set of palindromes over {0, 1} in canonical order. In
other words, T loops forever, and for every positive integer n, there is some
point at which the initial portion of T's tape contains the string
ΔΔΔ0Δ1Δ00Δ11Δ···Δxₙ
where xₙ is the nth palindrome in canonical order, and this portion of the
tape is never subsequently changed.
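One way to see what such an enumerator must produce: a palindrome is determined by its first half, so generating first halves in numerical order within each length yields the palindromes of that length in canonical order. A Python sketch (our own, just to check the ordering):

```python
from itertools import count, product

def palindromes():
    """Generate the palindromes over {0, 1} in canonical order:
    by increasing length, numerically within each length."""
    for n in count(0):
        half = (n + 1) // 2           # a palindrome is fixed by its first half
        for bits in product("01", repeat=half):
            first = "".join(bits)
            # Mirror the first half (dropping the middle symbol if n is odd).
            yield first + first[: n // 2][::-1]

gen = palindromes()
print([next(gen) for _ in range(8)])
# ['', '0', '1', '00', '11', '000', '010', '101']
```

Mirroring preserves the numerical order of the first halves, which is why the output within each length is already canonical.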
nonnegative numbers (i.e., the right half of the tape) and odd-numbered
squares to represent the remaining squares.
9.43. In Figure 9.27 is a transition diagram for a TM M with a doubly-infinite
tape. First, trace the moves it makes on the input string abb. Then, for the
ordinary TM M₁ that you constructed in the previous exercise to simulate M,
trace the moves that M₁ makes in simulating M on the same input.
9.44. Suppose M₁ is a two-tape TM, and M₂ is the ordinary TM constructed in
Theorem 9.1 to simulate M₁. If M₁ requires n moves to process an input
string x, give an upper bound on the number of moves M₂ requires in order
to simulate the processing of x. Note that the number of moves M₁ has made
places a limit on the position of its tape head. Try to make your upper bound
as sharp as possible.
9.45. Show that if there is a TM T computing the function f : ℕ → ℕ, then there
is another one, T′, whose tape alphabet is {1}. Suggestion: suppose T has
tape alphabet Γ = {a₁, a₂, ..., aₙ}. Encode Δ and each of the aᵢ's by a
string of 1's and Δ's of length n + 1 (for example, encode Δ by n + 1
blanks, and aᵢ by 1^i Δ^{n+1−i}). Have T′ simulate T, but using blocks of n + 1
tape squares instead of single squares.
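The block encoding suggested here is easy to experiment with. The following Python sketch (with '_' standing for the blank Δ, and our own function name) encodes a short tape under the stated scheme:

```python
def encode_tape(symbols, alphabet):
    """Encode a tape over alphabet {a_1..a_n} into blocks of n+1 squares
    over {1, blank}, as Exercise 9.45 suggests: the blank Delta becomes
    n+1 blanks, and a_i becomes 1^i followed by n+1-i blanks ('_' here)."""
    n = len(alphabet)
    def block(s):
        if s == "_":                       # the blank symbol
            return "_" * (n + 1)
        i = alphabet.index(s) + 1          # s is a_i
        return "1" * i + "_" * (n + 1 - i)
    return "".join(block(s) for s in symbols)

# Alphabet {a, b}, so n = 2 and each block has 3 squares:
# 'a' -> '1__', 'b' -> '11_', blank -> '___'
print(encode_tape("ab_", "ab"))  # 1__11____
```

Because every block has the same length n + 1 and begins with a distinctive run of 1's, T′ can recover each simulated symbol by scanning one block at a time.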
Figure 9.27 |
9.46. Describe how you could construct a TM T₀ that would accept input strings of
0's and 1's and would determine whether the input was a string of the form
e(T) for some TM T. ("Determine" means compute the characteristic
function of the set of such encodings.)
9.47. Modify the construction in the proof of Theorem 9.2 so that if the NTM halts
on every possible sequence of moves, the TM constructed to simulate it halts
on every input.
9.48. Beginning with a nondeterministic Turing machine T₁, the proof of Theorem
9.2 shows how to construct an ordinary TM T₂ that accepts the same
language. Suppose |x| = n, T₁ never has more than two choices of moves,
and there is a sequence of n_x moves by which T₁ accepts x. Estimate as
precisely as possible the number of moves that might be required for T₂ to
accept x.
9.49. Formulate a precise definition of a two-stack automaton, which is like a
PDA, except that it is deterministic and a move takes into account the
symbols on top of both stacks and can replace either or both of them.
Describe informally how you might construct a machine of this type
accepting {a^i b^i c^i | i ≥ 0}. Do it in a way that could be generalized to
{a^i b^i c^i d^i | i ≥ 0}, {a^i b^i c^i d^i e^i | i ≥ 0}, etc.
9.50. Describe how a Turing machine can simulate a two-stack automaton;
specifically, show that any language that can be accepted by a two-stack
machine can be accepted by a TM.
9.51. A Post machine is similar to a PDA, but with the following differences. It is
deterministic; it has an auxiliary queue instead of a stack; and the input is
assumed to have been previously loaded onto the queue. For example, if the
input string is abb, then the symbol currently at the front of the queue is a.
Items can be added only to the rear of the queue, and deleted only from the
front. Assume that there is a marker Z₀ initially on the queue following the
input string (so that in the case of null input Z₀ is at the front). The machine
can be defined as a 7-tuple M = (Q, Σ, Γ, q₀, Z₀, A, δ), like a PDA. A
single move depends on the state and the symbol currently at the front of the
queue, and the move has three components: the resulting state, an indication
of whether or not to remove the current symbol from the front of the queue,
and what to add to the rear of the queue (a string, possibly null, of symbols
from the queue alphabet).
Construct a Post machine to accept the language {a^n b^n c^n | n ≥ 0}.
9.52. We can specify a configuration of a Post machine (see the previous exercise)
by specifying the state and the contents of the queue. If the original marker
Z₀ is currently in the queue, so that the string in the queue is of the form
αZ₀β, then the queue can be thought of as representing the tape of a Turing
machine, as follows. The marker Z₀ is thought of, not as an actual tape
symbol, but as marking the right end of the string on the tape; the string β is
at the beginning of the tape, followed by the string α; and the tape head is
currently centered on the first symbol of α (or, if α = Λ, on the first blank
square following the string β). In this way, the initial queue, which contains
the string wZ₀, represents the initial tape of the Turing machine with input
string w, except that the blank in square 0 is missing and the tape head scans
the first symbol of the input.
Using this representation, it is not difficult to see how most of the moves
of a Turing machine can be simulated by the Post machine. Here is an
illustration. Suppose that the queue contains the string abbZ₀ab, which we
take to represent the tape ababb. To simulate the Turing machine move that
replaces the a by c and moves to the right, we can do the following:
a. remove a from the front and add c to the rear, producing bbZ₀abc
b. add a marker, say $, to the rear, producing bbZ₀abc$
c. begin a loop that simply removes items from the front and adds them to
the rear, continuing until the marker $ appears at the front. At this point,
the queue contains $bbZ₀abc.
d. remove the marker, so that the final queue represents the tape abcbb
The Turing machine move that is hardest to simulate is a move to the left.
Devise a way to do it. Then give an informal proof, based on the simulation
outlined in this discussion, that any language that can be accepted by a
Turing machine can be accepted by a Post machine.
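Steps (a)–(d) can be tried out directly with a queue. The following Python sketch (using 'Z' for Z₀, and a function name of our own) reproduces the rotation trick for the right-moving case:

```python
from collections import deque

def simulate_right_move(queue, new_symbol):
    """Sketch of the simulation in Exercise 9.52: the queue alpha Z0 beta
    represents a tape with the head on the first symbol of alpha. To
    replace that symbol by new_symbol and move right: rewrite the front
    symbol at the rear, append a marker, then rotate until the marker
    comes back around to the front."""
    queue.popleft()               # (a) remove the scanned symbol from the front
    queue.append(new_symbol)      #     ... and add its replacement at the rear
    queue.append("$")             # (b) temporary marker
    while queue[0] != "$":        # (c) rotate front-to-rear until $ is in front
        queue.append(queue.popleft())
    queue.popleft()               # (d) discard the marker
    return queue

q = deque("abbZab")               # 'Z' plays the role of Z0; this is tape ababb
print("".join(simulate_right_move(q, "c")))   # bbZabc
```

The final queue bbZabc is exactly the one computed in step (c) of the text, and it represents the tape abcbb.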
9.53. Show how a two-stack automaton can simulate a Post machine, using the first
stack to represent the queue and using the second stack to help carry out the
various Post machine operations. The first step in the simulation is to load
the input string onto stack 1, using stack 2 first in order to get the symbols in
the right order. Give an informal argument that any language that can be
accepted by a Post machine can be accepted by a two-stack automaton. (The
conclusion from this exercise and Exercises 9.50 and 9.52 is that the three
types of machines (Turing machines, Post machines, and two-stack
automata) are equivalent with regard to the languages they can accept.)
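The heart of this exercise is the standard way of maintaining a queue with two stacks: one stack holds the front of the queue, the other the rear, and the second stack is used to reverse symbols when the front runs dry, just as it is used to load the input in the right order. A sketch (class and method names are ours):

```python
class TwoStackQueue:
    """Sketch for Exercise 9.53: a queue maintained with two stacks,
    the way a two-stack automaton can represent a Post machine's queue."""
    def __init__(self):
        self.front, self.back = [], []   # stack 1 and stack 2

    def enqueue(self, x):                # Post machine: add to the rear
        self.back.append(x)

    def dequeue(self):                   # Post machine: remove from the front
        if not self.front:
            # Pour stack 2 into stack 1, reversing it, so the symbols
            # come out in queue order -- the same trick as loading the input.
            while self.back:
                self.front.append(self.back.pop())
        return self.front.pop()

q = TwoStackQueue()
for c in "abb":
    q.enqueue(c)
print(q.dequeue(), q.dequeue())   # a b
```

Each symbol is moved between stacks at most once per pass, so the simulation overhead per Post-machine move is modest.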
Recursively Enumerable
Languages

Theorem 10.1
Every recursive language is recursively enumerable.

Proof
As we observed in Chapter 9, if T is a Turing machine recognizing L, then
we can get a TM accepting L by modifying T so that when it leaves the
output 0, it enters the reject state instead.
We have already identified the potential problem with the converse of Theorem
10.1. If T is a Turing machine accepting L, there may be strings not in L for which
T loops forever and therefore never produces an answer. Later we will see that this
possibility cannot always be eliminated: There are recursively enumerable languages
that are not recursive. For now, we record the partial result that is naturally suggested.
It will be useful to generalize it slightly, to nondeterministic machines.
Theorem 10.2
If L is accepted by a nondeterministic TM T, and every possible sequence of
moves of T halts (so that no looping can occur), then L is recursive.

Proof
The TM T′ constructed in the proof of Theorem 9.2 to simulate T can be
modified in two respects. First, if T′ finds a sequence of moves of T that
accepts, it creates the output Δ1 on tape 1 before it accepts. Second, if no
sequence of moves of T accepts, T′ records on a portion of tape 1 the
sequence of digits generated on tape 2; in this way, tape 1 keeps a history
of all the sequences that are unsuccessful. (In the proof of Theorem 9.2,
we assumed for the sake of simplicity that the sequences of moves were
represented by strings of digits.)
Both the union and intersection operations preserve the property of recursive
enumerability. The construction we use to prove this involves a TM capable of
simulating two other TMs simultaneously.
Theorem 10.3
If L₁ and L₂ are recursively enumerable languages over Σ, then L₁ ∪ L₂
and L₁ ∩ L₂ are also recursively enumerable.

Proof
Suppose T₁ = (Q₁, Σ, Γ₁, q₁, δ₁) and T₂ = (Q₂, Σ, Γ₂, q₂, δ₂) are TMs
accepting L₁ and L₂, respectively. We wish to construct TMs accepting
L₁ ∪ L₂ and L₁ ∩ L₂. In both cases, it is useful to use a two-tape
machine. We can model the construction after that of the FA in the proof of Theorem
3.4 and let the two tapes represent the tapes of T₁ and T₂, respectively.
We describe the solution for L₁ ∪ L₂. The two-tape machine T =
(Q, Σ, Γ, q₀, δ) begins by placing a copy of the input string, which is already
on tape 1, onto tape 2. It inserts the marker # at the beginning of both tapes,
in order to detect a crash resulting from T₁ or T₂ trying to move its tape head
off the tape. From this point on, the simultaneous simulation of T₁ on tape
1 and T₂ on tape 2 is carried out one move at a time, and T accepts as soon
as either simulated machine accepts.
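The essential point of the two-tape construction is lockstep: both simulations advance one move at a time, so that neither machine's looping starves the other. The idea can be sketched abstractly in Python, with the two simulated machines modeled as iterators (a toy stand-in for the actual tapes, not the book's construction):

```python
def accepts_union(sim1, sim2):
    """Sketch of the idea behind Theorem 10.3: run both simulations one
    move at a time, in lockstep, accepting as soon as either accepts.
    sim1 and sim2 are iterators (hypothetical stand-ins for the two
    tapes) that yield False while running and True on acceptance.
    If both machines loop forever without accepting, so does this loop,
    which is consistent with L1 union L2 being merely r.e."""
    while True:
        done1 = next(sim1, None)   # None means this machine has halted
        done2 = next(sim2, None)
        if done1 or done2:
            return True
        if done1 is None and done2 is None:
            return False           # both halted without accepting

def machine(moves, accept):        # toy simulation: run `moves` steps, then halt
    for _ in range(moves):
        yield False
    if accept:
        yield True

print(accepts_union(machine(5, False), machine(2, True)))   # True
```

For L₁ ∩ L₂ the same lockstep loop would instead accept only after both simulations have accepted.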
The set of recursive languages is also closed under unions and intersections; we
leave the details to the exercises. For recursive languages, we can add the complement
operation to the list as well.
Theorem 10.4
If L is recursive, so is L′.

Proof
If T is a Turing machine recognizing L, we can make it recognize L′ by
interchanging the two outputs.
This simple proof cannot be adapted in any obvious way for recursively enumerable
languages. It does not immediately follow that the corresponding statement for
recursively enumerable languages is false (which it turns out to be); the next result
suggests, however, that it is less likely to be true.
If L is a finite language, the Turing machine in the definition can either halt
normally when all the elements of L appear on tape 1, or continue to make moves
without printing any other strings on tape 1. If L is infinite, T continues to move
forever.
Now we wish to show that a language is recursively enumerable (can be accepted
by a TM) if and only if it can be enumerated by some TM. The idea of the proof is
simple, although one direction turns out to be a little more subtle than it might first
appear.
On the one hand, if we have a machine T enumerating L, then given an input
string x, we can test x for membership in L by just waiting to see whether x ever
appears on T's output tape. A TM T₁ that carries out this strategy is guaranteed to
accept L, because the strings for which the test is successful are precisely those in L;
for all the others, T₁ loops forever, unless L is finite.
On the other hand, if T is a TM accepting L, then we consider all the strings
in Σ*, in some order such as the canonical order described in Section 9.5. In this
ordering, shorter strings precede longer ones, and strings of the same length are
ordered alphabetically (assuming some initial, arbitrary ordering on the symbols in
Σ). For each string x, we try to decide whether to include x in our enumeration by
using T to determine whether x ∈ L. Here is the place where the argument needs
to be a little more sophisticated: If it should happen that T loops forever on input x,
and if we are not careful, we will never get around to considering any string beyond
x. The construction in our official proof will be able to handle this problem.
Theorem 10.6
A language L ⊆ Σ* is recursively enumerable if and only if there is a TM
enumerating L.

Proof
Suppose first that T is a TM enumerating L. We construct a two-tape TM
T₁ that, on input x, simulates T on tape 2 and compares x to each string
that T lists; if the two strings agree, T₁ accepts. Thus T₁ accepts precisely
the strings that are generated on tape 2 by T, which are the elements of L.
Conversely, suppose T is a TM accepting L. We construct a two-tape
TM T₂ that generates the strings of Σ* in canonical order on tape 2 and uses
T to test them, taking care not to get stuck on a string on which T loops
forever: in the nth round, T₂ simulates T for n moves on each of the first n
strings in canonical order, and it lists a string on tape 1 as soon as one of
these partial simulations discovers that T accepts it. Every element of L
will eventually be listed on tape 1, and no other string ever is.
In the second half of the proof of Theorem 10.6, you should notice that although
the strings in &* are generated in canonical order on tape 2, the strings in L will
not in general be listed in that order on tape 1. (For example, if T accepts a after
five moves and b after two, b would appear before a on tape 1.) With the stronger
assumption that T is actually recursive, however, the simple construction outlined
before the statement of the theorem can be carried out with no complications. On the
other hand, it is also easy to show that if there is a TM enumerating L in canonical
order, then L must be recursive. We state the result officially below, and leave the
proof to the exercises.
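The time-sharing trick in the second half of the proof (run T for n moves on each of the first n strings, for n = 1, 2, ...) can be sketched in Python. Here accepts_within is a hypothetical stand-in for a step-bounded simulation of T:

```python
from itertools import count, islice, product

def canonical_strings(alphabet="01"):
    """Yield all strings over the alphabet in canonical order."""
    yield ""
    for n in count(1):
        for tup in product(alphabet, repeat=n):
            yield "".join(tup)

def enumerate_language(accepts_within):
    """Sketch of the second half of Theorem 10.6: in round n, run the
    accepting TM for n moves on each of the first n strings in canonical
    order, listing a string the first time it is seen to be accepted.
    accepts_within(x, n) stands in for 'T accepts x within n moves'."""
    listed = set()
    for n in count(1):
        for x in islice(canonical_strings(), n):
            if x not in listed and accepts_within(x, n):
                listed.add(x)
                yield x

# Toy example: 'T' accepts strings of even length, taking len(x)+1 moves.
gen = enumerate_language(lambda x, n: len(x) % 2 == 0 and n > len(x))
print(list(islice(gen, 4)))   # ['', '00', '01', '10']
```

Notice that a string on which the simulated machine would loop is simply retried with a larger bound in every later round and never blocks the enumeration, which is the whole point of the dovetailing.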
We can summarize Theorems 10.6 and 10.7 informally by saying that a lan-
guage is recursively enumerable if there is an algorithm for listing its elements, and
a language is recursive if there is an algorithm for listing its elements in canoni-
cal order. Some characterizations of recursively enumerable languages in terms of
Turing-computable functions will also be discussed in the exercises.
if we wanted to allow the variable A to be replaced by the string γ, but only when
A is immediately preceded in the string by α and immediately followed by β.
Although productions of this type are general enough for our purposes, it is often more
convenient to write them in the form
αAβ → αγβ
Grammars with productions of this general form are called unrestricted, or
phrase-structure, grammars. Such a grammar is a 4-tuple G = (V, Σ, S, P), in which
V and Σ are disjoint sets of variables and terminals, respectively.
Much of the notation developed for context-free grammars can be carried over
intact. In particular,
α ⇒*_G β
means that β can be derived from α in zero or more steps, and
L(G) = {x ∈ Σ* | S ⇒*_G x}
To illustrate the generality of these grammars, we consider the first two examples
of non-context-free languages in Chapter 8.

EXAMPLE 10.1
L = {a^i b^i c^i | i ≥ 1}
Our grammar will involve variables A, B, C, as well as two others to be explained shortly.
There will be three types of productions: those that produce strings with equal numbers of
A’s, B’s, and C’s, though not always in the order we want; those that allow the appropriate
changes in the order of A’s, B’s, and C’s; and finally, those that change all the variables to the
corresponding terminals, if the variables are in the right order.
Productions of the first two types are easy to find. The context-free productions
S → ABCS | ABC
generate all strings of the form (ABC)^n, and the productions
BA → AB    CA → AC    CB → BC
will allow the variables to realign themselves properly. For the third type, we cannot simply
add productions like A → a, because they might be used too soon, before the variables line
themselves up correctly. Instead we say that C can be replaced by c, but only if it is preceded
by c or b:
cC → cc    bC → bc
bB → bb    aB → ab
aA → aa
Once we have an a at the beginning to start things off, these productions allow the string to
transform itself into lowercase, from left to right. But where does the first a come from? It is
still not correct to have A → a, even with our restrictions on b's and c's. This would allow
ABC to transform itself into abc wherever it occurs, and would therefore permit ABCABC
to become abcabc. One solution is to use an extra variable F to stand for the left end of the
string. Then we can say that A can be replaced by a only when it is preceded by a or F:
aA → aa    FA → a
We start the derivation with
S → FS₁
and modify the earlier productions so that they involve S₁ instead of S. The final grammar is
the one with productions
S → FS₁    S₁ → ABCS₁ | ABC
BA → AB    CA → AC    CB → BC
FA → a    aA → aa    aB → ab
bB → bb    bC → bc    cC → cc
The string aabbcc, for example, can be derived as follows. At each point, the underlined string
is the one that is replaced in the subsequent step.
It is easy to see that any string in L can be derived from this grammar. In the other
direction, any string of terminal symbols derived from S has equal numbers of a's, b's, and
c's; the only question is whether illegal combinations such as ba or ca can occur. Notice
first that if S ⇒* α, then α cannot have a terminal symbol appearing after a variable, and so
α ∈ Σ*V*. Furthermore, any subsequent production leaves the string of terminals intact and
either rearranges the variables or replaces one more by a terminal. Suppose u ∈ L(G) and u
has an illegal combination of terminals, say ba. Then u = vbaw, and there is a derivation of u
that starts S ⇒* vbβ for some β ∈ V*. This is impossible, however, because no matter what
β is, it is then impossible to produce a as the next terminal.
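Because the grammar is small, its claims can also be checked mechanically, by a bounded breadth-first search over sentential forms. The sketch below is our own (the variable S₁ is written T in the code), and it verifies both that aabbcc is derivable and that abcabc, the string the extra variable F was introduced to rule out, is not:

```python
from collections import deque

# The productions assembled in this example, with S1 written as T.
PRODUCTIONS = [
    ("S", "FT"), ("T", "ABCT"), ("T", "ABC"),
    ("BA", "AB"), ("CA", "AC"), ("CB", "BC"),
    ("FA", "a"), ("aA", "aa"), ("aB", "ab"),
    ("bB", "bb"), ("bC", "bc"), ("cC", "cc"),
]

def derivable(target, max_len=8):
    """Breadth-first search over sentential forms of bounded length."""
    seen, frontier = {"S"}, deque(["S"])
    while frontier:
        s = frontier.popleft()
        for lhs, rhs in PRODUCTIONS:
            for i in range(len(s) - len(lhs) + 1):
                if s[i:i + len(lhs)] == lhs:
                    t = s[:i] + rhs + s[i + len(lhs):]
                    if len(t) <= max_len and t not in seen:
                        seen.add(t)
                        frontier.append(t)
    return target in seen

print(derivable("aabbcc"), derivable("abcabc"))   # True False
```

The length bound is safe here because deriving a terminal string of length 6 never requires a sentential form longer than 7 symbols in this grammar.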
In the derivation in this example, the movement of A’s to the left and C’s to
the right and the “propagation” of terminal symbols to the right in the last phase
might suggest the motion of a Turing machine’s tape head and the moves the machine
makes as it transforms its tape. This similarity is not a coincidence. The proof that
these grammars can generate arbitrary recursively enumerable languages will use the
ability of a grammar to mimic a TM and to carry out the same sorts of “computations.”
In any case, the idea of symbols migrating through the string is a useful technique
and can be used again in our next example.
EXAMPLE 10.2
L = {ss | s ∈ {a, b}*}
This time we use a variable M to mark the middle of the string and a variable F to mark
the left end, beginning with the production
S → FM
Each time a new symbol σ is added, the migrating duplicate symbol will be the variable that
is the uppercase version of σ. The productions
F → FaA    F → FbB
Aa → aA    Ab → bA    Ba → aB    Bb → bB
allow the variables to migrate past the terminals in the first half. Eventually the migrating
variable hits M, at which point it deposits the corresponding terminal on the other side and
disappears, using the productions
AM → Ma    BM → Mb
Finally, F and M are erased by the productions
F → Λ    M → Λ
The string abbabb has the following derivation. As before, the underlined portion of each
string is the left side of the production used next.
It is reasonably clear that any string in L can be generated by our grammar. We argue informally
that no other strings are generated, as follows. First, every string in L(G) has even length,
because the only productions that change the ultimate length of the string increase it by 2.
Second, when M is finally eliminated in the derivation, all the terminal symbols in the final
string are present, and half come before M. Those preceding M are the terminals in the
productions F — FaA and F — FbB, because the only other productions that create
terminals create them to the right of M. The farther to the left a terminal in the first half is, the
more recently it appeared in the derivation, because the relative order of terminals in the first
half never changes. Of any two variables created by these same two productions, however, the
one appearing earliest reaches M first because the two can never be transposed. Therefore, of
two terminals in the second half, the one to the left came from the variable appearing more
recently, and therefore the second half of the final string matches the first.
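The migration scheme can be exercised mechanically. The following sketch is our own (it assumes the erasing productions F → Λ and M → Λ): it builds a derivation of ss for a given s by always adding symbols at the left, in reverse order, and migrating each new variable all the way to M before introducing the next.

```python
def derive_copy(s):
    """Build a derivation of ss in the grammar of this example, applying
    the productions in a fixed order: symbols are added at the left
    (F -> FaA, F -> FbB), so we feed s reversed; each capital then
    migrates right to M and deposits its lowercase twin."""
    steps = ["S", "FM"]                               # S -> FM
    form = "FM"
    for ch in reversed(s):
        cap = ch.upper()
        form = form.replace("F", "F" + ch + cap, 1)   # F -> FaA / FbB
        steps.append(form)
        i = form.index(cap)
        while form[i + 1] != "M":                     # Aa -> aA, Ab -> bA, ...
            form = form[:i] + form[i + 1] + cap + form[i + 2:]
            i += 1
            steps.append(form)
        form = form[:i] + "M" + ch + form[i + 2:]     # AM -> Ma, BM -> Mb
        steps.append(form)
    form = form.replace("F", "", 1)                   # F -> null
    steps.append(form)
    form = form.replace("M", "", 1)                   # M -> null
    steps.append(form)
    return steps

print(derive_copy("abb")[-1])   # abbabb
```

Running derive_copy("abb") reproduces, step by step, the derivation of abbabb referred to above.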
We are now ready to show that the languages generated by unrestricted grammars
are precisely those accepted by Turing machines. In one direction we can simply
construct a TM to simulate derivations in a grammar. In the other direction, we
will take advantage of some of the features of unrestricted grammars that we have
already observed, in order to construct grammars that can simulate Turing machine
computations.
Theorem 10.8
For any unrestricted grammar G = (V, Σ, S, P), there is a Turing machine
T = (Q, Σ, Γ, q₀, δ) with L(T) = L(G).
Proof
The TM we construct to accept L(G) will be the nondeterministic composite
machine
T = MovePastInput → Simulate → Equal
where the first component moves the tape head to the blank square following
the input string, the second simulates a derivation in G starting in this location
and leaves the resulting string on the tape, and the third compares this result
to the original input, accepting if and only if the two strings agree. If the
input string x is in L(G), the nondeterministic simulation can choose the
sequence of moves that simulates the derivation of x, and the result will
be that T accepts. The only way Simulate can leave a string of terminal
symbols on the tape is as a result of carrying out the steps of a derivation; if
x ∉ L(G), this component will either generate a string different from x or
fail to complete a derivation at all, and T will fail to accept.
The Simulate TM simulates a derivation in much the same way that a
nondeterministic top-down PDA simulates a derivation in a CFG (see Section
7.4). In the case of the PDA, however, terminal symbols at the left of
the current string are removed from the stack, and the production used in
EXAMPLE 10.3
Consider the unrestricted grammar with productions
S → aBS | Λ
aB → Ba    Ba → aB
B → b
which generates the language of strings in {a, b}* with equal numbers of a's and b's. Figure 10.1
shows the Simulate TM discussed in the proof of Theorem 10.8. Note that in this example, the
only productions in which the left and right sides are of unequal length are the S-productions,
and S appears only at the right end of the string. In a more general example, applying a
production like S → aBS could be accomplished by using an Insert TM (Exercise 9.13)
twice, and S → Λ would require a deletion.
You should trace the moves of Simulate as it simulates the derivation of a string in the
language, say abba.
Figure 10.1 |
The Simulate TM for Example 10.3.
378 PART 4 Turing Machines and Their Languages
CHAPTER 10 Recursively Enumerable Languages 379
Theorem 10.9
If L ⊆ Σ* is accepted by a Turing machine T = (Q, Σ, Γ, q₀, δ), then there
is an unrestricted grammar G generating L.

Proof (sketch)
A derivation in G first generates two copies of an arbitrary string x ∈ Σ*,
in the form of composite variables (σ₁σ₂) with σ₁ ∈ Σ ∪ {Λ} and
σ₂ ∈ Γ ∪ {Δ}; it then simulates the computation of T on the second copy;
and if the simulated computation accepts, the concluding steps of the
derivation produce a string from which everything except the first copy of
x is erased.
The underlined portion shows the left side of the production that will be used in the next step.
S ⇒ S(ΔΔ)
⇒ T(ΔΔ)
⇒ T(aa)(ΔΔ)
⇒ T(bb)(aa)(ΔΔ)
⇒ T(aa)(bb)(aa)(ΔΔ)
⇒ q₀(ΔΔ)(aa)(bb)(aa)(ΔΔ)        (q₀, Δaba)
⇒ (ΔΔ)q₁(aa)(bb)(aa)(ΔΔ)        ⊢ (q₁, Δaba)
⇒ (ΔΔ)(aΔ)q₂(bb)(aa)(ΔΔ)        ⊢ (q₂, ΔΔba)
⇒ (ΔΔ)(aΔ)(bb)q₂(aa)(ΔΔ)        ⊢ (q₂, ΔΔba)
⇒ (ΔΔ)(aΔ)(bb)(aa)q₂(ΔΔ)        ⊢ (q₂, ΔΔbaΔ)
⇒ (ΔΔ)(aΔ)(bb)q₃(aa)(ΔΔ)        ⊢ (q₃, ΔΔba)
⇒ (ΔΔ)(aΔ)q₄(bb)(aΔ)(ΔΔ)        ⊢ (q₄, ΔΔb)
⇒ (ΔΔ)q₄(aΔ)(bb)(aΔ)(ΔΔ)        ⊢ (q₄, ΔΔb)
⇒ (ΔΔ)(aΔ)q₁(bb)(aΔ)(ΔΔ)        ⊢ (q₁, ΔΔb)
⇒ (ΔΔ)(aΔ)(bΔ)q₅(aΔ)(ΔΔ)        ⊢ (q₅, ΔΔΔΔ)
⇒ (ΔΔ)(aΔ)q₀(bΔ)(aΔ)(ΔΔ)        ⊢ (q₀, ΔΔΔ)
⇒ (ΔΔ)(aΔ)(bΔ)h_a(aΔ)(ΔΔ)       ⊢ (h_a, ΔΔΔΔ)
⇒ (ΔΔ)(aΔ)(bΔ)h_a(aΔ)h_a(ΔΔ)
⇒ (ΔΔ)(aΔ)h_a(bΔ)h_a(aΔ)h_a(ΔΔ)
⇒ (ΔΔ)h_a(aΔ)h_a(bΔ)h_a(aΔ)h_a(ΔΔ)
⇒ h_a(ΔΔ)h_a(aΔ)h_a(bΔ)h_a(aΔ)h_a(ΔΔ)
⇒ h_a(aΔ)h_a(bΔ)h_a(aΔ)h_a(ΔΔ)
⇒ ah_a(bΔ)h_a(aΔ)h_a(ΔΔ)
⇒ abh_a(aΔ)h_a(ΔΔ)
⇒ abah_a(ΔΔ)
⇒ aba
Context-Sensitive Grammars
BA → AB    CA → AC    CB → BC
FA → a    aA → aa    aB → ab
bB → bb    bC → bc    cC → cc
Theorem 10.10
If L is a context-sensitive language, there is a linear-bounded automaton
accepting L.

Proof
Suppose G = (V, Σ, S, P) is a CSG generating L. In the proof of Theorem
10.8, the Turing machine used the portion of the tape to the right of the
input for the simulated derivation. That option is not available to us here,
but a satisfactory alternative is to convert the portion of the tape between (
and ) into two tracks, so that the second provides the necessary space. For
this reason we let the tape alphabet of our machine M contain pairs (a, b),
where a, b ∈ Σ ∪ V ∪ {Δ}, in addition to elements of Σ. There may be
other symbols needed as well.
The first action taken by M will be to convert the tape configuration
(x₁x₂···xₙ)
As you can probably see from the proof of Theorem 10.10, the significant feature
of an LBA is not that the tape head doesn't move past the input string at all,
but that its motion is restricted to a portion of the tape bounded by some linear
function of the input length (this explains the significance of the phrase linear-bounded).
As long as this condition is satisfied, an argument using multiple tape tracks
can be used to find an equivalent machine satisfying the stricter condition in
Definition 10.5.
The strict converse of Theorem 10.10 does not hold, since the null string might be
accepted by an LBA but cannot belong to any context-sensitive language. However,
the obvious modification of the statement is true.
Theorem 10.11
If there is a linear-bounded automaton M = (Q, Σ, Γ, q₀, δ) accepting the
language L ⊆ Σ*, then L − {Λ} is a context-sensitive language.

Proof
We give only a sketch of the proof, which is similar to that of Theorem
10.9. As before, the grammar is constructed so that a derivation generates
two copies of a string, simulates the action of M on one, and eliminates
everything except the other from the string if and when the simulated moves
of M lead to acceptance.
The grammar differs from the previous one in that more variables are
needed and they are more complicated. Previously the variables included
only S, T, left and right parentheses, and one variable for each possible state
of M. However, with these variables, productions such as h_a(σ₁σ₂) → σ₁
would violate the context-sensitive condition. The way to salvage productions
of this form is simply to interpret strings of the form (σ₁σ₂) and q(σ₁σ₂)
as variables. This approach could have been used in the earlier proof as well,
and many of the productions would have been context-sensitive as a result.
The difference is that now we no longer need strings containing (ΛΔ) and
productions like h_a(ΛΔ) → Λ, because initially there are no blanks between
the tape markers of M.
In addition, we must pay attention to the tape markers ( and ); thus we
also have variables such as q(σ₁σ₂), ((σ₁σ₂), and (σ₁σ₂)).
384 PART 4 Turing Machines and Their Languages
CHAPTER 10 Recursively Enumerable Languages 385
The phrase type 0 grammar was actually applied to a grammar in which all
productions are of the form a — , where a is a string of one or more variables; it is
easy to see, however, that any unrestricted grammar is equivalent to one of this type
(Exercise 10.19).
The characterizations of all these types of languages in terms of grammars make it obvious that for 1 ≤ i ≤ 3, every language of type i is of type i − 1, except that a context-free language is context-sensitive only if it does not contain the null string. (For any context-free language, the set of nonnull strings in the language is a context-sensitive language.) Theorem 10.12 shows that we can make an even stronger statement about the case i = 1.
For each i with 1 ≤ i ≤ 3, the set of languages of type i is a subset of the set
of languages of type i — 1. For i either 2 or 3, we know the inclusion is proper:
There are context-sensitive languages that are not context-free (see Example 10.5)
and context-free languages that are not regular. The last inclusion is also proper for a
trivial reason, because {Λ} is an example of a recursively enumerable, non-context-
sensitive language. In order to show there are nontrivial examples, we recall from
Theorems 10.1 and 10.12 that
CS ⊆ R ⊆ RE
(where the three sets contain context-sensitive, recursive, and recursively enumerable
languages, respectively). It would therefore be sufficient to show either that RE − R ≠ ∅ or that there is a recursive language L for which L − {Λ} is not context-sensitive.
Both these statements are true. The proof of the first is postponed until Section 11.1,
because it depends on a type of argument that will be introduced in the next section.
The second statement is Exercise 11.33. The conclusion of either statement is that the
class of recursive languages (which does not show up explicitly in Table 10.1, because
there is no known characterization involving grammars) falls strictly between the two
bottom levels of the hierarchy.
In spite of the results mentioned in the preceding paragraph, just about any
language you can think of is context-sensitive. In particular, programming languages
are. The non-context-free aspects of the C language mentioned in Example 8.4, such
as variables having to be declared before they are used, can be accommodated by
context-sensitive grammars.
The four levels of the Chomsky hierarchy satisfy somewhat different closure
properties. The set of regular languages is closed under all the standard operations:
union, intersection, complement, concatenation, Kleene star, and so forth. The set of
context-free languages is not closed under intersection or complement. Once we show
that there are recursively enumerable languages that are not recursive, it will follow
from Theorem 10.5 that the set of recursively enumerable languages is not closed
under complement. Although it is not difficult to show that the class of context-
sensitive languages is closed under many of these operations (Exercise 10.44), the
case of complements remained an open question for some time. Szelepcsényi and
Immerman answered it independently in 1987 and 1988, by proving that if L can be
accepted by a linear-bounded automaton, then so can L’. Open questions concerning
context-sensitive languages remain: It is unknown, for example, whether or not every
CSL can be accepted by a deterministic LBA.
elements of A (and in the process determining the number n): “one,” “two,” ... is
short for “this is the element that will correspond to 1,” “this is the element that will
correspond to 2,” and so forth. Rather than applying this process to A, then applying
it to B, then comparing the results, we could simply try to match up the elements of
A and B directly. In the case of {a, b, c} and {x, y, z} we can: There is a bijection
(a one-to-one, onto function) from A to B. An example is the function f defined by
f(a) =x, f(b) = y, and f(c) = z. Although A and B are different sets, we can
view them as the same except for the labels we use to describe the elements: We can
talk about a, b, and c, or about f(a), f(b), and f(c). (This is exactly what it means
to have a bijection from one to the other.) It seems appropriate, in particular, to say
that whenever we have such a function, A and B are the same size.
This criterion can be applied to infinite sets as well as to finite; therefore, we
adopt it as our definition.
Two sets are the same size if there is a bijection from one to the other.
Note that the relation “is the same size as” is an equivalence relation. In particular,
if A and B are the same size, then so are B and A; and if A and B are the same size,
and B and C are the same size, then so are A and C.
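For finite sets, the bijection criterion can be checked directly. The following sketch (the helper name is ours, not the book's) tests whether a function maps one finite set one-to-one onto another, using the example f(a) = x, f(b) = y, f(c) = z from the text.

```python
def is_bijection(f, A, B):
    """True if f maps the finite set A one-to-one onto the finite set B."""
    image = {f(a) for a in A}
    # one-to-one: no two elements of A collide; onto: the image is all of B
    return len(image) == len(A) and image == set(B)

# The function f with f(a) = x, f(b) = y, f(c) = z from the text:
f = {'a': 'x', 'b': 'y', 'c': 'z'}.get
```

With this f, is_bijection(f, {'a', 'b', 'c'}, {'x', 'y', 'z'}) holds, so by the definition above the two sets are the same size.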
Even though we are trying to avoid talking about the “size” of a set as a quantity,
we want to be able to say informally that one set is bigger than another. For example,
{p, q, r, s, t} is bigger than {a, b, c}. This is because there is a one-to-one function
from {a, b, c} to {p, q, r, s, t} (for example, the function f for which f(a) = p,
f(b) = r, and f(c) = s), but no bijection. In other words, there is a subset of
{p, q, r, s, t} that is the same size as {a, b, c} (the subset {p, r, s}, for example), but
the entire set is not. Again, this characterization of “bigger than” extends to infinite
sets as well.
A set A is bigger than a set B if there is a bijection from a subset of A to B but no bijection from A to B.
Figure 10.2 |
The function f defined by f(n) = 2n is a bijection from the first to the second. Finally, for
an even more dramatic example, the set B = R⁺ of all nonnegative real numbers is
not bigger than the interval A = [0, 1), because the formula f(x) = tan(πx/2) (whose
graph is shown in Figure 10.2c) defines a bijection from A to B. These results are
counterintuitive. There is clearly a sense in which the set of natural numbers is twice
as big as the set of nonnegative even integers, and R⁺ is infinitely many times as big
as [0, 1). However, it appears that according to our definition, “twice as big,” or even
“infinitely many times as big,” does not imply “bigger” in the case of infinite sets.
The following observation may make these examples seem less surprising: Say-
ing that there is a bijection from a set S to a proper subset of itself is equivalent to
saying that S is infinite. One direction is easy, because there can be no bijection from
a finite set S to a set with fewer elements. For the other direction, see Exercise 10.25.
As you can see from the preceding paragraphs, it is necessary to think carefully
about infinite sets, and it is dangerous to rely on intuitively obvious statements that
you take for granted in the case of finite sets.
Not only are some infinite sets larger than others, but there are many different
“sizes” of infinite sets (Exercises 10.47 and 10.49). For our purposes, however, it is
enough to distinguish two kinds of infinite sets: those that are the same size as the
set N of natural numbers, and those that are bigger, which account for all the rest.

A set S is countably infinite if there is a bijection from N to S, and countable if S is either finite or countably infinite. A set that is not countable is uncountably infinite, or uncountable.

Saying that f : N → S is a bijection means that these three conditions hold: f(n) is defined for every n ∈ N; f is one-to-one (f(m) ≠ f(n) whenever m ≠ n); and f is onto (every element of S is f(n) for some n). Therefore, saying that S is countably infinite (the same size as N) means that the elements of S can be listed (or “counted”) as f(0), f(1), ..., so that every element of S appears exactly once in the list. Saying that S is countable means the same thing, except that the list may stop after a finite number of terms.
There are at least two ways in which one might misinterpret the phrase “can be
listed.” First, of course, it is never possible to finish counting or listing the elements
of an infinite set. Saying that S is countably infinite means that we can count elements
of S (“zero,” “one,” “two,” ...) in such a way that for any x ∈ S, x would be counted
(included in the correspondence being established) if we continued the count long
enough.
A second possible source of confusion is that when we say a set S is countably
infinite, we are saying only that “there exists” a bijection f. There may or may not
be an algorithm allowing us to compute f, or an algorithm telling us how to list
the elements of S. Whether or not a bijection exists has to do only with how many
elements are in the set; whether or not there is a computable bijection also depends
on what the elements are. In particular, every language L over a finite alphabet is
countable (Lemma 10.2 and Example 10.7); however, there is such an algorithm only
if L is recursively enumerable, and as we will see, not all languages are.
Figure 10.3 illustrates one way of thinking of a countable set. We may think of
each underline as a space big enough for one element of a set. The set is countable if
it can be made to “fit” into the indicated spaces.
If there is a bijection from N to A and also one from N to B, then there is one
from A to B. Therefore, any two countably infinite sets are the same size. Similarly,
as we noticed earlier, a set that is the same size as a countable set is also countable.
Not all uncountable sets are the same size, if there are in fact many different sizes
of infinite sets. However, an immediate consequence of the following fact is that any
uncountable set is bigger than any countable one.
One other simple fact about countable sets will be useful, and we record it as
Lemma 10.2.
Figure 10.3 |
Spaces to put the elements of a countable set.
An immediate example of a countably infinite set is the set N itself, and we have
already seen examples of countably infinite subsets of N. It is not hard to find many
more examples. The set
S = {0, 1/2, 1, 3/2, 2, 5/2, ...}
is countably infinite; the way we have defined it is to list its elements. The set Z
of all integers, both nonnegative and negative, is countable, because we can list the
elements this way:
Z = {0, −1, 1, −2, 2, −3, 3, ...}
One way of thinking of Z is as the union of the two sets {0, 1, 2, ...} and {−1, −2,
−3, ...}. For any two countably infinite sets A = {a0, a1, a2, ...} and B = {b0, b1,
b2, ...}, we can list the elements of the union in the same way, {a0, b0, a1, b1, ...},
except that any x ∈ A ∩ B should be included only once in the list. The conclusion
is that the union of two countably infinite sets is countably infinite. A more dramatic
result, which provides a large class of examples, is the following, often expressed
informally by saying that a countable union of countable sets is countable.
Theorem 10.13
If S0, S1, S2, ... are countable sets, then ⋃∞n=0 Sn is also countable.
Figure 10.4 |
Listing the elements of ⋃∞n=0 Sn.
Since each of the sets Sn is a subset of S, we can also conclude from Lemma
10.2 that a finite union of countable sets is countable.
Theorem 10.13 can be interpreted as saying that any uncountable set A must be
much, much bigger than any countable set B, because even the union of countably
many sets the same size as B is still countable (the same size as B) and therefore
smaller than A.
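The diagonal path used in the proof of Theorem 10.13 can be sketched as a generator. Here element(m, n) is assumed to return the nth listed element of the mth set (the names, and the simplifying assumption that each listing is infinite, are ours); the generator follows the path of Figure 10.4 and skips repeats, just as the proof requires.

```python
from itertools import islice

def enumerate_union(element):
    """Enumerate the union of countably many listed sets, where element(m, n)
    is the n-th element of the m-th set, following the diagonal path."""
    seen = set()
    j = 0
    while True:
        for m in range(j + 1):        # the j-th pass hits the j + 1 pairs with m + n = j
            x = element(m, j - m)
            if x not in seen:         # include each element of the union only once
                seen.add(x)
                yield x
        j += 1

# Taking element(m, n) = (m, n) enumerates N x N itself:
pairs = list(islice(enumerate_union(lambda m, n: (m, n)), 6))
```

Here pairs comes out as the first six pairs on the path: (0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0).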
N × N = {(i, j) | i, j ≥ 0}
= {(0, 0), (0, 1), (0, 2), ...}
∪ {(1, 0), (1, 1), (1, 2), ...}
∪ {(2, 0), (2, 1), (2, 2), ...}
∪ ···
Equivalently,

N × N = ⋃∞m=0 ({m} × N)

and each of the sets {m} × N is countable, since the function fm : N → {m} × N defined by
fm(n) = (m, n) is a bijection. (Note that {m} × N is the set of elements in the mth row of Figure
10.4.) However, we can be more explicit about the bijection f from N to N × N illustrated
in Figure 10.4 and described in the proof of Theorem 10.13. Let us consider the inverse of the
bijection f, the function f⁻¹ : N × N → N, and give the formula for f⁻¹(m, n), in other
words the formula that enumerates the pairs in N × N.
We refer again to the path shown in Figure 10.4. Let j ≥ 0; as the path loops through the
array the jth time, it hits all the pairs (m, n) for which m + n = j, and there are j + 1 such
pairs. Furthermore, for a specific pair (m, n) with m + n = j, there are m other pairs (p, q)
with p + q = j that get hit by the path before this one. Therefore, the total number of pairs
preceding (m, n) in the enumeration is

1 + 2 + ··· + (m + n − 1) + (m + n) + m = (m + n)(m + n + 1)/2 + m
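The formula can be checked mechanically. In this sketch (the function name is ours), pair_index computes f⁻¹(m, n); running it over all pairs on the first few diagonals confirms that every position from 0 upward is used exactly once.

```python
def pair_index(m, n):
    """f^{-1}(m, n) = (m + n)(m + n + 1)/2 + m, the position of the pair
    (m, n) in the diagonal enumeration of N x N."""
    return (m + n) * (m + n + 1) // 2 + m

# All pairs on diagonals 0 through 9 occupy positions 0 through 54 exactly once:
positions = sorted(pair_index(m, j - m) for j in range(10) for m in range(j + 1))
```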
Σ* = ⋃∞n=0 Σn
where Σn is the set of strings over Σ of length n. Since Σn is finite and therefore countable, it
follows from Theorem 10.13 that Σ* is also countable. In the simple case when Σ = {a, b},
one way of listing the elements of Σ* is to use the canonical order:

{a, b}* = {Λ, a, b, aa, ab, ba, bb, aaa, aab, ...}

Finally, since a language L over Σ is a subset of Σ*, Lemma 10.2 implies that L is countable.
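Canonical order (strings listed by length, and within each length in the order induced by the alphabet) is easy to generate; this sketch, with names of our choosing, reproduces the listing of {a, b}* above.

```python
from itertools import count, islice, product

def canonical_order(symbols):
    """Yield the strings over the given alphabet in canonical order:
    shorter strings first, then in the dictionary order of the symbols."""
    for n in count(0):
        for letters in product(symbols, repeat=n):
            yield ''.join(letters)

# The empty string Lambda, then a, b, aa, ab, ba, bb, aaa, aab:
first_nine = list(islice(canonical_order('ab'), 9))
```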
encoding function e : T → {0, 1}* described in Section 9.6. The only property of e that we
need here is that it is one-to-one and therefore a bijection from T to some subset of {0, 1}*.
Since {0, 1}* is countable, any subset is, and therefore T is.

Now it is simple to show that RE is countable as well. By definition, a recursively
enumerable language L can be accepted by some Turing machine. For each L, let t(L) be such
a TM. The result is a function t from RE to T, and since a Turing machine accepts precisely
one language, t is one-to-one. Since T is countable, the same argument we used above shows
that RE is also.
Example 10.8 provides half of the result we are looking for. Now that we have
shown the set of recursively enumerable languages to be countable, proving that there
are uncountably many languages (i.e., that the set of languages is uncountable) will
show that there must be non-recursively-enumerable languages.
As our first example of an uncountable set, however, we consider the set R of real
numbers. (Theorem 10.14 actually says that a subset of R is uncountable, from which
it follows that R itself is.) The proof is due to the nineteenth-century mathematician
Georg Cantor. It is a famous example of a diagonal argument; although the logic is
similar to the proof we will give for the set of languages, this proof may be a little
easier to understand.
Theorem 10.14
The set of real numbers in the interval [0, 1) is uncountable.
Note that in the proof, although finding one number not in the list is enough to
obtain a contradiction, there are obviously many more. (The particular choice of xi is
arbitrary, as long as it is neither ai,i nor 9.) Saying that a set is uncountable means that
no matter what scheme you use to list elements, when you finish you will inevitably
find that you have left most of the elements out.
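The diagonal construction itself is mechanical once the list is given. In this sketch (the names are ours), digits(i, j) is assumed to give the jth decimal digit of the ith listed number; the chosen digit differs from ai,i and avoids 9, exactly as in the proof.

```python
def excluded_digit(digits, i):
    """The i-th digit of a number not in the list: any digit other than
    a_ii (and other than 9, which could produce a second decimal
    representation of the same number)."""
    return 5 if digits(i, i) != 5 else 6
```

For every i, excluded_digit(digits, i) differs from digits(i, i), so the number it defines differs from the ith listed number in the ith decimal place.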
Corollary 10.1 The set of languages over {0, 1} that are not recursively enumerable
is uncountable. In particular, there exists at least one such language.
Proof The corollary follows from Theorem 10.15, from the countability of the set
of recursively enumerable languages (Example 10.8), and from the fact that if S is
uncountable and S1 ⊆ S is countable, S − S1 is uncountable (Exercise 10.27). ∎
The proofs of Theorems 10.14 and 10.15 are nonconstructive, and the diagonal
argument in these proofs appears to be closely associated with proof by contradiction.
However, as we mentioned at the beginning of this section, we will construct an
example of a non-recursively-enumerable language by using a diagonal argument
that parallels these two very closely.
The results in Example 10.8 and Corollary 10.1 say that the set of languages
that are not recursively enumerable is much bigger than the set of languages that are.
Nevertheless, you might wonder how significant this conclusion is in the theory of
computation. The nonconstructive proof does not shed any light on what aspects of
a language might make it impossible for a TM to accept it. The same proof, in fact,
Figure 10.5 |
shows that there must also be languages we cannot even describe precisely (because a
precise description can be represented by a string, and the set of strings is countable).
Maybe any language we might ever wish to study, or any language we can describe
precisely, can be accepted by a TM; if this is true, then Corollary 10.1 is more of a
curiosity than a negative result. At this stage in our discussion, such a possibility is
conceivable. In the next chapter, however, we will see that things do not turn out
this way.
EXERCISES
10.1. Show that if L1 and L2 are recursive languages, then L1 ∪ L2 and L1 ∩ L2
are also recursive.
10.2. Consider the following alternative approach to the proof of Theorem 10.2.
Given TMs T1 and T2 accepting L1 and L2, respectively, a one-tape
machine is constructed to simulate these two machines sequentially. The
tape Δx is transformed to Δx#Δx. T1 is then simulated, using the second
copy of x as input and using the marker # to represent the end of the tape.
If and when T1 stops, either by accepting or crashing, the tape is erased
except for the original input, and T2 is simulated.
a. Can this approach be made to work in order to show that the union of
recursively enumerable languages is recursively enumerable? Why?
b. Can this approach be made to work in order to show that the intersection
of recursively enumerable languages is recursively enumerable? Why?
10.3. Is the following statement true or false? If L1, L2, ... are any recursively
enumerable subsets of Σ*, then ⋃∞i=1 Li is recursively enumerable. Give
reasons for your answer.
10.4. Suppose L1, L2, ..., Lk form a partition of Σ*; in other words, their union
is Σ* and any two are disjoint. Show that if each Li is recursively
enumerable, then each Li is recursive.
10.5. Prove Theorem 10.7, which says that a language is recursive if and only if
there is a Turing machine enumerating it in canonical order.
10.6. Suppose L ⊆ Σ*. Show that L is recursively enumerable if and only if
there is a computable partial function from Σ* to Σ* that is defined
precisely at the points of L.
10.7. The proof of Theorem 10.2 involves a result sometimes known as the
“infinity lemma,” which can be formulated in terms of trees. (See the
discussion in the proof of Theorem 9.2 of the computation tree
corresponding to a nondeterministic TM.) Show that if every node in a tree
has at most a finite number of children, and there is no infinite path in the
tree, then the tree is finite, which means that there must be a longest path
from the root to a leaf node. (Here is the beginning of a proof. Suppose for
the sake of contradiction that there is no longest path. Then for any n, there
is a path from the root with more than n nodes. This implies that the root
node has infinitely many descendants. Since the root has only a finite
number of children, at least one of its children must also have infinitely
many descendants.)
10.8. Describe algorithms to enumerate these sets. (You do not need to discuss
the mechanics of constructing Turing machines to execute the algorithms.)
a. The set of all pairs (n, m) for which n and m are relatively prime
positive integers (relatively prime means having no common factor
bigger than 1)
b. The set of all strings over {0, 1} that contain a nonnull substring of the
form www
c. {n ∈ N | for some positive integers x, y, and z, xⁿ + yⁿ = zⁿ}
10.9. In Definition 10.2, the strings x; appearing on the output tape of T are
required to be distinct. Show that if L can be enumerated in the weaker
sense, in which this requirement is dropped, then L is recursively
enumerable.
10.10. Show that if f : N → N is computable and strictly increasing, then the
range of f (or the set of strings representing elements of the range of f) is
recursive.
10.11. In each case, describe the language generated by the unrestricted grammar
with the given productions. The symbols a, b, and c are terminals, and all
other symbols are variables.
a.
S → LaR    L → LD | Λ    Da → aaD    DR → R    R → Λ
b.
S → LaR    L → LD | LT | Λ    Da → aaD    Ta → aaaT
DR → R    TR → R    R → Λ
c.
S → ABCS | ABC
AB → BA    AC → CA    BC → CB
BA → AB    CA → AC    CB → BC
A → a    B → b    C → c
d.
S → LAxR    A → a
L → LI    IA → AI    Ix → AxIJ    IR → AxR
JA → AJ    Jx → xJ    JR → AR
LA → EA    EA → AE    Ex → E    ER → Λ
10.40. Show that an infinite recursively enumerable set has an infinite recursive
subset.
10.41. Find unrestricted grammars to generate each of the following languages.
a. {x ∈ {a, b, c}* | na(x) < nb(x) and na(x) < nc(x)}
b. {x ∈ {a, b, c}* | na(x) − nb(x) = 2nc(x)}
c. {aⁿ | n = j(j + 1)/2 for some j ≥ 1} (Suggestion: if a string has j
groups of a’s, the ith group containing i a’s, then you can create j + 1
groups by adding an a to each of the j groups and adding a single extra
a at the beginning.)
10.42. Suppose G is a context-sensitive grammar. In other words, for every
production a > £ of G, |B| > |a|. Show that there is a grammar G’, with
L(G’) = L(G), in which every production is of the form
aAB > aXxB
where A is a variable and a, B, and X are strings of variables and/or
terminals, with X not null.
10.43. A context-sensitive grammar is said to be in Kuroda normal form if each of
its productions takes one of the four forms A → a, A → B, A → BC, or
AB → CD, where a is a terminal and the uppercase letters are variables.
Show that every CSL can be generated by a grammar in Kuroda normal
form.
10.44. Use the LBA characterization of context-sensitive languages to show that
the class of CSLs is closed under union, intersection, and concatenation,
and that if L is a CSL so is L⁺.
10.45. Suppose G1 and G2 are unrestricted grammars generating L1 and L2,
respectively.
a. By modifying G1 and G2 if necessary, find an unrestricted grammar
generating L1L2.
b. By modifying the procedure described in the proof of Theorem 6.1, find
an unrestricted grammar generating L1*.
c. Adapt your answer to part (b) to show that if L1 is a CSL, then L1* is
also.
10.46. In the proof of Theorem 10.15, we assumed that the elements of 2^N were
A0, A1, ..., and constructed a set A not in the list by letting
A = {i | i ∉ Ai}. Starting with the same list, find a different formula for a
set B not in the list.
10.47. The two parts of this exercise show that for any set S (not necessarily
countable), 2^S is larger than S. It follows that there are infinitely many
“orders of infinity.”
a. For any S, describe a simple bijection from S to a subset of 2^S.
b. Show that for any S, there is no bijection from S to 2^S. (You can copy
the proof of Theorem 10.15, as long as you avoid trying to list the
elements of S or making any reference to the countability of S.)
10.48. In each case, determine whether the given set is countable or uncountable.
Prove your answer.
a. The set of all real numbers that are roots of integer polynomials; in other
words, the set of real numbers x so that, for some nonnegative integer n
and some integers a0, a1, ..., an, x is a solution to the equation
anxⁿ + ··· + a1x + a0 = 0
CHAPTER 11
Unsolvable Problems
xT ∈ L if and only if xT ∉ L(T)
408 PART 5 Unsolvable Problems and Computable Functions
less arbitrary way. A natural choice for x7 is e(T), the string that describes T in our
encoding scheme. This approach makes it unnecessary to use countability explicitly,
since we do not need any particular ordering of either the set of TMs or the set of
strings.
The two languages SA and NSA are almost the complements of each other. More
precisely, {0, 1}* is the union of the three disjoint sets SA, NSA, and E′, where E =
SA ∪ NSA, the set of all strings of the form e(T) for some TM T. The following
simple result will make it easy to draw the conclusions we want from this fact.
Lemma 11.1 The language E = {e(T) | T is a TM} is recursive.
Proof The encoding function e is described in Section 9.6. It is easy to check that
a string x of 0’s and 1’s is in E if and only if it satisfies these conditions:
Any string satisfying these conditions represents a TM, whether or not it carries out
any meaningful computation. There is an algorithm to take an arbitrary element of
{0, 1}* and determine the truth or falsity of each condition, and it is not difficult to
implement such an algorithm on a Turing machine.
Of the three languages NSA, SA, and E, we now know that the first is not recur-
sively enumerable and therefore not recursive, and the third is recursive. It follows
from the formula NSA = SA′ ∩ E that SA cannot be recursive (Theorem 10.4 and Exercise 10.1). However, although the definitions of SA and NSA are obviously similar,
the first language does turn out to be recursively enumerable.
Theorem 11.2
The language SA is recursively enumerable but not recursive.

Proof
It remains only to show that SA is recursively enumerable. The intuitive reason NSA is not recursively enumerable is that a TM T for which e(T) ∉ L(T) might fail to accept e(T) by looping forever rather than halting.
The three languages SA, NSA, and E’ are closely related to the decision problem
Self-accepting: Given a TM T, does T accept the string e(T)?
The three sets contain the strings representing yes-instances, no-instances, and non-
instances, respectively, of the problem.
In order to solve a general decision problem P, we start the same way, by choosing
an encoding function e so that we can represent instances I by strings e(I) over some
alphabet Σ. Let us give the names Y(P) and N(P) to the sets of strings representing
yes-instances and no-instances of P. Then, if E(P) = Y(P) ∪ N(P), we have the
third set E’(P) of strings not representing instances, just as in our first example.
Any reasonable encoding function e must be one-to-one, so that a string can
represent at most one instance of P. It must be possible to decode a string e(I) and
recover the instance I. Finally, there should be an algorithm to decide whether a
Theorem 11.3
The problem Self-accepting is unsolvable.
The situation in which it is easiest to say precisely what the phrase algorithmic
procedure means is that in which P1 and P2 are the membership problems for two
languages L1 ⊆ Σ1* and L2 ⊆ Σ2*, respectively. In this case an instance of P1 is a
string x ∈ Σ1* and an instance of P2 is a string y ∈ Σ2*. Finding a y for each x means
computing a function from Σ1* to Σ2*, and this can be done directly by a TM. It makes
sense in this case to talk about reducing the first language to the second.
If L1 ≤ L2, being able to solve the membership problem for L2 allows us to
solve the problem for L1, as follows. If we have a string x ∈ Σ1* and we want to
decide whether x ∈ L1, we can answer the question indirectly by computing f(x)
and deciding whether that string is in L2. The answers to the two questions are the
same, because x ∈ L1 if and only if f(x) ∈ L2.
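The scheme just described can be sketched with a toy pair of languages (everything here, including the choice of L1, L2, and f, is an invented illustration, not one of the book's reductions): take L1 to be the strings over {a} of even length, L2 the even natural numbers, and f(x) = |x|; then x ∈ L1 if and only if f(x) ∈ L2.

```python
def f(x):
    """The reducing function: maps an instance of the L1 problem to one of L2."""
    return len(x)

def decide_L2(n):
    """An assumed decision procedure for L2, the even natural numbers."""
    return n % 2 == 0

def decide_L1(x):
    """Membership in L1 decided indirectly: compute f(x) and ask about L2."""
    return decide_L2(f(x))
```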
In a more general situation, as we discussed in the previous section, we can
normally identify the decision problems P1 and P2 with the membership problems
for the corresponding languages Y(P1) and Y(P2), assuming that we have appropriate
encoding functions. This means in particular that the statement P1 ≤ P2 is equivalent
to the statement Y(P1) ≤ Y(P2) (see Exercises 11.23–11.25). In the proof of Theorem
11.5, we discuss the reduction used in the proof, both at the level of problem instances,
which is normally a little easier to think about, and at the level of languages, which
allows us to be a little more precise. After that we will normally stick to the one most
directly applicable.
The most obvious reason for thinking about reductions is that we might be able
to solve a problem P by reducing it to another problem Q that we already know
how to solve. However, it is important to separate the idea of reducing one problem
to another from the question of whether either of the problems can be solved. The
specific reason for discussing reductions in this chapter is to obtain more examples
of unsolvable problems.
If the problem of whether a TM T accepts the string e(T) is unsolvable, we should
not expect to be able to solve the more general membership problem for recursively
enumerable languages, which we abbreviate Accepts.
Accepts: Given a TM T and a string w, is w ∈ L(T)?
It is worth mentioning once more why the obvious approach to the problem (give
the input string w to the TM T, and see what happens) is not a solution: This approach
will produce an answer only if T halts, not if it loops forever.
An instance of Accepts consists of a pair (T, w), where T is a Turing machine
and w a string. With our encoding function e, we can represent such a pair by the
string e(T)e(w), and so we consider the language
Acc = {e(T)e(w) | w ∈ L(T)}
Theorem 11.5
Accepts is unsolvable.

Proof
The intuitive idea of the proof is the observation we have already made: If we could decide, for an arbitrary TM T and an arbitrary string w, whether T accepted w, then we could decide whether an arbitrary TM T accepted e(T), and we know that this is impossible. To make this precise, we will reduce Self-accepting to Accepts and use the last statement in Theorem 11.4. In order to show that Self-accepting can be reduced to Accepts, we must describe an algorithm for producing an instance F(I) of Accepts from a given instance I of Self-accepting. I is a Turing machine T, and F(I) is the pair (T, e(T)).
In this proof, the argument involving instances T and (T, x) of the two problems
seems simpler and more straightforward than the one involving languages. However,
if you compare them carefully, you can see that the key steps are almost exactly the
same. The definition f(x) = xe(x) in the second argument is simply the string
version of the definition F(T) = (T, e(T)) in the first. The details that make the
second proof more complicated have to do first with deciding whether the input string
is an instance of the problem, and then with the necessary decoding and encoding of
strings.
The most well-known unsolvable problem, the halting problem, is closely related
to the membership problem for recursively enumerable languages. For a given TM
T and a given string w, instead of asking whether T accepts w, it asks whether T
halts (by accepting or rejecting) on input w. We abbreviate the problem Halts.
Halts: Given a TM T and a string w, does T halt on input w?
Just as before, we can consider the corresponding language
{e(T)e(w) | T halts on input w}
CHAPTER 11 = Unsolvable Problems 415
Theorem 11.6
Halts is unsolvable.
Proof
We reduce Accepts to Halts. For an instance (T, w) of Accepts, we describe
how to obtain a TM T1 that accepts the same strings as T but never halts in
the reject state. It is sufficient to change any move of the form
δ(p, a) = (h_r, b, D)
into the move δ(p, a) = (p, a, S). If T ever arrives at a configuration from
which it would enter the reject state, this change means that T1 is stuck in
this state and on this square forever; instead of entering the reject state, T1
simply fails to terminate. Therefore, T1 halts on input w if and only if T
accepts w, and a solution to Halts would provide a solution to Accepts.
It might seem as though, with a careful enough analysis, any infinite
loop can be detected.
Even without knowing anything about the halting problem, we can see that this
is unrealistic by considering an example from mathematics. Many famous, long-
standing open problems have to do with the existence or nonexistence of an integer
satisfying some specific property. Goldbach’s Conjecture, made in 1742, is that every
even integer 4 or greater can be expressed in at least one way as the sum of two primes.
(For example, 18 = 5 + 13 and 100 = 41 + 59.) Although the statement has been
confirmed for values of n up to about 4 × 10^14, and most mathematicians assume the
conjecture is true, no one has proved it. (In 2000 the publishing company Faber and
Faber offered a prize of $1 million to anyone who could furnish a proof by March 15,
2002.) However, testing whether a specific integer is the sum of two primes is a very
simple calculation. Therefore, it is easy to write a computer program, or construct a
Turing machine, to execute the following algorithm:
n = 4
conjecture = true
while (conjecture)
  { if (n is not the sum of two primes)
      conjecture = false
    else
      n = n + 2
  }
The program terminates if and only if there is an even integer greater than or equal to 4 that is
not the sum of two primes. Thus, in order to find out whether Goldbach’s conjecture
is true, all we would have to do is decide whether our program runs forever.
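The algorithm above is easy to make concrete. The following Python sketch is ours, not the book's; the helper names are illustrative, and a bound is added to the search so that this demonstration terminates. The unbounded search is exactly the program whose halting behavior is at issue.

```python
def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_sum_of_two_primes(n):
    """The 'very simple calculation' from the text, for a single even n."""
    return any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))

def search_counterexample(limit):
    """The loop above, with a bound added so this demonstration halts.
    The unbounded version terminates iff Goldbach's conjecture is false."""
    n = 4
    while n <= limit:
        if not is_sum_of_two_primes(n):
            return n
        n += 2
    return None
```

Running `search_counterexample` with no bound would decide the conjecture, which is precisely why no general termination test for programs can exist.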
In any case, whether we consider programs like this or programs you might write
in your computer science classes, the fact that the halting problem is unsolvable says
that there cannot be any general method to test a program and decide whether it
will terminate. This could be frustrating to mathematicians who are trying to prove
Goldbach’s conjecture. On the one hand, if they are unable to find a proof, they cannot
take much comfort from the unsolvability of the halting problem, because there may
be a simpler alternative method of deciding the conjecture (presumably a way that
uses facts about integers and primes, rather than simply facts about programs); on the
other hand, there may not!
Some problems related to the halting problem are discussed in the exercises. For
example, another question to which an answer would be useful is: Given a computer
program (or a Turing machine), are there any input values for which it would loop
forever? See Exercise 11.12.
First, rather than considering only the string e(T), we might try to restrict the
problem the other way, by fixing a Turing machine T and allowing a solution algorithm
to depend on T. For some machines T there is such an algorithm, but for at least one
there is not. Consider the universal Turing machine T_u introduced in Section 9.6, and
the decision problem P_u: Given w, does T_u accept w? If we had a general solution
to this problem, then we could solve Accepts, by taking an arbitrary pair (T, x) and
deciding whether T_u accepted the string e(T)e(x). (In other words, we can reduce
Accepts to the problem P_u by assigning to an instance (T, x) of Accepts the instance
e(T)e(x) of P_u.) Therefore, P_u is unsolvable.
Let us consider another special case of Accepts obtained by restricting the string,
this time to the null string. We define Accepts-Λ to be the decision problem: Given
a TM T, does T accept the null string Λ?
(i.e., does T eventually reach the accepting state, if it begins with a blank tape?)
We reduce Accepts to Accepts-Λ. For an instance (T, w) of Accepts, let T1 be a
TM that first writes w on its tape, returns its tape head to square 0, and then executes
T. With input Λ, T1 and T (the latter with input w) perform exactly the same
computation until they accept, and T1 accepts Λ if and only if T accepts w.
Therefore, we have the reduction we need.
Several of the decision problems in the previous section are of this type (Given a
TM T, does the language L(T) have property R?). In the case
of Accepts-Λ, having property R means containing the null string. In the first two
problems listed in Theorem 11.8, the language has property R if it is nonempty, or if
it is all of Σ*, respectively.
There is a good reason for concentrating on this class of decision problems: For
just about any property R we choose, the resulting problem P_R is unsolvable! “Just about
any” here means any nontrivial property: one satisfied by at least one recursively
enumerable language and not satisfied by at least one other. This is the content of
Rice’s theorem, and the proof can be sketched as follows. If R is not satisfied
by the empty language, Accepts-Λ can be reduced to P_R. If R is satisfied by
the empty language, we use an indirect argument to show that P_R is unsolvable:
we reduce Accepts-Λ to P_R′, where R′ is the complementary property.
P_R′ is unsolvable because R′ is a nontrivial property not satisfied by Ø,
and we conclude that P_R is unsolvable because P_R′ is.
Here isa list, somewhat arbitrary and certainly not complete, of decision problems
whose unsolvability follows immediately from Rice’s theorem. Some of them we
have already shown to be unsolvable, by directly reducing other unsolvable problems
to them.
AcceptsSomething: Given a TM T, is L(T) nonempty?
AcceptsTwo: Given a TM T, does T accept at least two strings?
AcceptsFinite: Given a TM T, is the language accepted by T finite?
AcceptsEverything: Given a TM T with input alphabet Σ, is L(T) = Σ*?
AcceptsRegular: Given a TM T, is the language accepted by T regular?
AcceptsRecursive: Given a TM T, is the language accepted by T recursive?
Many decision problems involving Turing machines do not fit the format required
for applying Rice’s theorem directly. The problems Accepts and Halts do not, because
in both cases an instance is not a TM but a pair (T, x). The problem Subset in Theorem
11.8 involves more than one Turing machine, as does the problem
can obviously be solved: Being “given” a TM means in particular being given enough
information to trace the processing of a fixed string for a certain fixed number of
moves. An example of an unsolvable problem that involves the operation of a
TM and therefore cannot be immediately proved unsolvable using Rice’s theorem
is WritesSymbol, in Theorem 11.8. In view of that problem, it may seem surprising
that the following problem is solvable:
Given a TM T, does T ever write a nonblank symbol when started with input Λ?
See Exercise 11.15.
Finally, even for a problem of the right form (Given T, does L(T) satisfy property
R?), Rice’s theorem cannot be applied if the property R is trivial. Remember that
“trivial” is used here to describe a property that is possessed either by all the recursively
enumerable languages or by none of them. Deciding whether the property is trivial
may not be trivial. If the property is trivial, however, then the decision problem is
trivial in the sense that the answer is either yes for every instance or no for every
instance. An example of the first case is the problem: Given a TM T, can L(T) be
accepted by a TM that never halts after an odd number of moves? Here the answer
is always yes. We can modify any TM if necessary so that instead of halting after
an odd-numbered move, it makes an extra (unnecessary) move before halting. An
example of the second case is the problem: Given a TM T, is L(T) the language
NSA? (See Definition 11.1.) Here the answer is always no: No matter what T is, L(T)
cannot be NSA, because NSA is not recursively enumerable.
The instance is a yes-instance if there is such a sequence, and we call the sequence a
solution sequence for the instance.
It is helpful in visualizing the problem to think of n distinct groups of dominoes,
each domino from the ith group having the string α_i on the top half and the string β_i on
the bottom half (see Figure 11.1a), and to imagine that there are an unlimited number
of identical dominoes in each group. Finding a solution sequence for this instance
means lining up one or more dominoes in a horizontal row, each one positioned
vertically, so that the string formed by their top halves matches the string formed by
their bottom halves (see Figure 11.1b). Duplicate dominoes can be used, and it is not
necessary to use all the distinct domino types.
Figure 11.1
PCP has a feature shared by many of the unsolvable problems we have consid-
ered. There is a trivial way to arrive at the correct answer for any instance, if the
answer is yes: Just try all ways of lining up one domino, then all ways of lining
up two, and so forth. Of course, a mindless application of this approach is doomed
to failure in the case of a no-instance. Saying that PCP is unsolvable says on the
one hand that reasoning about this approach will not help (for any n, you can try
all sequences of n dominoes and still not be sure there is no sequence of n + 1 that
works), and on the other hand that no other approach is guaranteed to do better.
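The brute-force idea can be written out directly. This Python sketch is ours (the function name and the length bound are illustrative); it enumerates index sequences in order of length, so without the bound it is a semi-decision procedure: it halts exactly on yes-instances.

```python
from itertools import product

def pcp_search(dominoes, max_len):
    """dominoes: list of (top, bottom) string pairs.
    Try all index sequences of length 1, then 2, and so on; return the
    first (1-based) sequence whose concatenated tops equal its
    concatenated bottoms, or None if no sequence of length <= max_len
    works.  Removing the bound makes this loop forever on no-instances,
    which is exactly the situation described in the text."""
    n = len(dominoes)
    for length in range(1, max_len + 1):
        for seq in product(range(n), repeat=length):
            top = "".join(dominoes[i][0] for i in seq)
            bottom = "".join(dominoes[i][1] for i in seq)
            if top == bottom:
                return [i + 1 for i in seq]
    return None
```

For instance, for the pairs (ab, a) and (b, bb), the sequence 1, 2 is a solution, since ab·b = a·bb.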
We show that PCP is unsolvable by introducing a slightly different problem,
showing that it can be reduced to PCP, and then showing that it is unsolvable by
showing that the membership problem Accepts can be reduced to it. An instance
of the Modified Post’s correspondence problem (MPCP) is exactly the same as an
instance of PCP, except that a solution sequence for the instance is required to begin
with domino 1. In other words, a solution sequence consists of a sequence of zero or
more integers i_2, i_3, ..., i_k so that
α_1 α_{i_2} ⋯ α_{i_k} = β_1 β_{i_2} ⋯ β_{i_k}
the same symbol. It is conceivable that some of the other i_j’s are also n + 1,
but if i_m is the last i_j to equal n + 1, then 1, i_2, ..., i_m is also a solution
sequence for the instance J. It is then easy to check that i_2, ..., i_{m−1} yields a
solution sequence for the instance I of MPCP.
We have shown that I is a yes-instance of MPCP if and only if J = F(I) is a
yes-instance of PCP, which implies that
MPCP ≤ PCP
Theorem 11.14
MPCP is unsolvable.
Proof
We want to show that Accepts is reducible to MPCP. Let (T, w) be an arbitrary
instance of Accepts, so that T = (Q, Σ, Γ, q_0, δ) is a Turing machine
and w is a string over the input alphabet Σ. We wish to construct an instance
(α_1, β_1), (α_2, β_2), ..., (α_n, β_n) of MPCP (a modified correspondence
system) that has a solution sequence if and only if T accepts w.
It will be convenient to assume that T never halts in the reject state h_r.
Since there is an algorithm to convert any TM into an equivalent one that
enters an infinite loop whenever the original one would reject (see the proof
of Theorem 11.6), we may make this assumption without loss of generality.
Some additional notation and terminology will be helpful. For an instance
(α_1, β_1), (α_2, β_2), ..., (α_n, β_n) of MPCP, we will say that a partial solution
is a sequence i_2, i_3, ..., i_j so that α = α_1 α_{i_2} ⋯ α_{i_j} is a prefix of β =
β_1 β_{i_2} ⋯ β_{i_j}. A little less precisely, we might say that the string α obtained
this way is a partial solution, or that the two strings α and β represent a
partial solution. Secondly, we introduce temporarily a new notation for
representing TM configurations. For x, y ∈ (Γ ∪ {Δ})* with y not ending
in Δ, and q ∈ Q, we will write xqy to represent the configuration that we
normally denote by (q, xy), or by (q, xΔ) in the case where y = Λ. In
other words, the symbols in x are those preceding the tape head, which is
centered on the first symbol of y if y ≠ Λ and on Δ otherwise.
In order to simplify the notation, we assume from here on that w ≠ Λ.
This assumption will not play an essential part in the proof, and we will
indicate later how to take care of the case when w = Λ.
Here is a rough outline of the proof. The symbols involved in our
pairs will be those that appear in the configurations of T, together with
an additional symbol #. We want to specify pairs (α_i, β_i) in our modified
correspondence system so that for any j, if
q_0Δw, x_1q_1y_1, ..., x_jq_jy_j
are successive configurations through which T moves in processing the input
string w, starting with the initial configuration q_0Δw, a partial solution can
be found whose string β contains these configurations, each followed by #,
and is one configuration ahead of its string α. The way the partial solution
can be extended is determined by the pairs (α_i, β_i), which are such that every
extension mirrors a move of T.
β ends with #z_j#, where z_j = u h_a v is an accepting configuration and the strings
u and v may be null. If at least one is nonnull, we may
extend the partial solution, using one pair of type 3 and others of type 1, to
obtain a partial solution whose final string between #’s still contains h_a but
has at least one fewer symbol than z_j. In a
similar way, we can continue to extend the partial solution, so that the strings
between consecutive #’s decrease in length by either one or two symbols at
each step, until we have a partial solution whose final string between #’s is
h_a alone.
Figure 11.21
For the input string Λ, pair 1 is (#, #q_0#). The following partial solution is the longest
possible, and the only one (except for smaller portions of it) ending in #:
where the α_i’s and β_i’s are strings over Σ. Let
C = {c_1, c_2, ..., c_n}
where the c_i’s are symbols not contained in Σ. The terminal symbols of both our
grammars will be the symbols of Σ ∪ C. Let G_α be the CFG with start symbol S_α
and the 2n productions
Theorem 11.12
The problem CFGNonemptyIntersection: Given two CFGs G_1 and G_2, is
L(G_1) ∩ L(G_2) nonempty? is unsolvable.
Proof
We can reduce PCP to CFGNonemptyIntersection as follows. For an
arbitrary instance I of PCP, consisting of the pairs (α_i, β_i) with 1 ≤ i ≤ n,
we construct the instance (G_α, G_β) consisting of the two CFGs described
above. A string in L(G_α) ∩ L(G_β) matches, on the one hand, a sequence of
α_i’s followed by the corresponding c_i’s in reverse order and, on the other
hand, a sequence of β_i’s followed by the corresponding c_i’s in reverse order;
such a string exists if and only if I is a yes-instance of PCP.
Theorem 11.13
The problem IsAmbiguous: Given a CFG G, is G ambiguous? is unsolvable.
Proof
Again we can reduce PCP to this problem. For an arbitrary instance I of
PCP, let G_α and G_β be the CFGs described above, and let G be the CFG
whose start symbol is S and whose productions are those of G_α and G_β,
along with the two additional productions S → S_α and S → S_β. If a string
x is in L(G_α) ∩ L(G_β), then x has two leftmost derivations in G, beginning
S ⇒ S_α and S ⇒ S_β, respectively, and so G is ambiguous if and only if I
is a yes-instance of PCP.
Proof We prove the result for L_1, and the proof for L_2 is similar. We can show that
L_1 is a CFL by describing how to construct a PDA M accepting it. M will have in
its alphabet both states and tape symbols of T (including Δ), and these two sets are
assumed not to overlap. A finite automaton is able to check that the input string is of
the form z#z′#, where z and z′ are in the set Γ* Q Γ*, and this is part of what M does,
rejecting if the input string is illegal.
For the rest of the proof, we need to show that M can operate so that if the first
portion z of the input is a configuration of T, then the stack contents when z has been
processed are (z′)^r Z_0, where z ⊢_T z′. (This allows M to process the input remaining
after the first # by simply matching input symbols against stack symbols.)
We consider the case in which
z = xpay
where p is a state and a is a tape symbol of T, and the move of T in state p with tape
symbol a is
δ(p, a) = (q, b, L)
The other cases can be handled similarly.
In this case, if x = x_1c for some string x_1 and some tape symbol c, then T moves
from the configuration xpay = x_1cpay to the configuration x_1qcby. If x is null, T
rejects by trying to move its tape head left from square 0.
M can operate by pushing input symbols onto its stack until it sees a state of T;
at this point, the stack contains the string x^r Z_0. We may specify that M rejects if the
top stack symbol is Z_0 (that is, if x is null). Otherwise, the stack contains
c x_1^r Z_0
M has now read the state p, and the next input symbol is a. It pops the c, replaces it by
bcq, and continues to read symbols and push them onto the stack until it encounters
the first #. This means z has been processed completely. The stack contents are then
y^r bcq x_1^r Z_0 = (x_1qcby)^r Z_0
and the string x_1qcby is the configuration we want. ∎
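The single move analyzed in this case can be checked mechanically. The sketch below uses an illustrative encoding of our own (a configuration xpay is given as the triple (x, p, ay)), and applies one left move δ(p, a) = (q, b, L):

```python
def step_left(x, p, y, q, b):
    """Apply one move delta(p, a) = (q, b, L) to the configuration xpay,
    where a is the scanned symbol (the first symbol of y).  Returns the
    next configuration as a triple, or None if T rejects by trying to
    move its tape head left from square 0."""
    if not x:
        return None                # head already at square 0: T rejects
    rest = y[1:] if y else ""      # the scanned symbol a is consumed ...
    x1, c = x[:-1], x[-1]
    return (x1, q, c + b + rest)   # ... and b is written in its place
```

With x = x_1c = "ac", the configuration "ac" p "ad" steps to "a" q "cbd", matching the string x_1qcby computed in the proof.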
Proof A string x over this alphabet fails to be in C_T if x does not end in #;
otherwise, x fails to be in C_T precisely if one of several conditions holds, the last
of which is:
6. For some odd i, z_i and z_{i+1} are configurations but the condition z_i ⊢_T (z_{i+1})^r fails.
It is easy to see that each condition individually can be tested by a PDA; in some
cases an FA would suffice, and in the others we can use arguments similar to those in
the proof of Lemma 11.2 (for the last two conditions in particular, nondeterminism
can be used to select a particular value of i, and testing that the condition fails for
that i is no harder than testing that it holds). Therefore, C_T is the union of CFLs, and
so it is a CFL itself.
The underlying fact that allows us to apply these results is that the problem of
determining whether there are any valid computations for a given TM is unsolvable.
This is just another way of expressing the unsolvability of AcceptsSomething, which
is a corollary of Rice’s theorem. We list two immediate applications, and there is
further discussion in the exercises.
EXERCISES
11.1. Show that the relation < on the set of languages (or on the set of decision
problems) is reflexive and transitive. Give an example to show that it is not
symmetric.
11.2. Let P_2 be the decision problem: Given a natural number n, is n evenly
divisible by 2? Consider the numerical function f defined by the formula
f(n) = 3n.
a. To what other decision problem P does f reduce P2?
b. Find a numerical function g that reduces P to P>. It should have the
same property that f does; namely, computing the function does not
explicitly require solving the problem that the function is supposed to
reduce.
11.3. Show that if L_1 and L_2 are languages over Σ and L_2 is recursively
enumerable and L_1 ≤ L_2, then L_1 is recursively enumerable.
11.4. Show that if L ⊆ Σ* is neither empty nor all of Σ*, then any recursive
language over Σ can be reduced to L.
11.5. Fermat’s last theorem, until recently one of the most famous unproved
statements in mathematics, asserts that there are no integer solutions
(x, y, z, n) to the equation x^n + y^n = z^n satisfying x, y > 0 and n > 2.
Show how a solution to the halting problem would allow you to determine
the truth or falsity of the statement.
11.6. Show that every recursively enumerable language can be reduced to the
language Acc = {e(T)e(w) | T is a TM and T accepts input w}.
11.7. As discussed at the beginning of Section 11.3, there is at least one TM T so
that the decision problem Given w, does T accept w? is unsolvable. Show
that any TM accepting a nonrecursive language has this property.
11.8. Show that for any x ∈ Σ*, the problem Accepts can be reduced to the
problem: Given a TM T, does T accept x? (This shows that, just as
Accepts-Λ is unsolvable, so is Accepts-x, for any x.)
11.9. Construct a reduction from Accepts-Λ to the problem Accepts-{Λ}: Given
a TM T, is L(T) = {Λ}?
11.10. a. Given two sets A and B, find two sets C and D, defined in terms of A
and B, so that A = B if and only if C ⊆ D.
b. Show that the problem Equivalent can be reduced to the problem
Subset.
11.11. a. Given two sets A and B, find two sets C and D, defined in terms of A
and B, so that A ⊆ B if and only if C = D.
b. Show that the problem Subset can be reduced to the problem
Equivalent.
11.12. For each decision problem given, determine whether it is solvable or
unsolvable, and prove your answer.
a. Given a TM T, does it ever reach a state other than its initial state if it
starts with a blank tape?
b. Given a TM T and a nonhalting state q of T, does T ever enter state q
when it begins with a blank tape?
c. Given a TM T and a nonhalting state q of T, is there an input string x
that would cause T eventually to enter state q?
d. Given a TM T, does it accept the string Λ in an even number of moves?
e. Given a TM T, is there a string it accepts in an even number of moves?
f. Given a TM T and a string w, does T loop forever on input w?
g. Given a TM T, are there any input strings on which T loops forever?
h. Given a TM T and a string w, does T reject input w?
i. Given a TM T, are there any input strings rejected by T?
j. Given a TM T, does T halt within ten moves on every string?
k. Given a TM T, is there a string on which T halts within ten moves?
l. Given TMs T_1 and T_2, is L(T_1) ⊆ L(T_2) or L(T_2) ⊆ L(T_1)?
11.13. Let us make the informal assumption that Turing machines and computer
programs written in the C language are equally powerful, in the sense that
anything that can be programmed on one can be programmed on the other.
Give a convincing argument that both these decision problems are
unsolvable:
a. Given a C program and a statement s in the program and a specific set I
of input data, is s ever executed when the program is run on input I?
b. Given a C program and a statement s in the program, is there a set I of
input data so that s is executed when the program runs on input I?
The conclusion reached here is false; explain precisely what is wrong with
the argument.
11.17. Refer to the correspondence system in Example 11.2, in the case where the
input string is ab. Find the solution sequence.
11.18. In each case below, either give a solution to the correspondence system or
show that none exists.
11.19. Show that the special case of PCP in which the alphabet has only two
symbols is still unsolvable.
11.20. Show that the special case of PCP in which the alphabet has only one
symbol is solvable.
11.21. Show that each of these decision problems for CFGs is unsolvable.
a. Given two CFGs G_1 and G_2, is L(G_1) = L(G_2)?
b. Given two CFGs G_1 and G_2, is L(G_1) ⊆ L(G_2)?
c. Given a CFG G and a regular language R, is L(G) = R?
11.22. In the second proof of Theorem 11.12, given at the end of Section 11.6,
describe in reasonable detail the steps of the algorithm which, starting with
a TM T, constructs CFGs G_1 and G_2 so that L(G_1) ∩ L(G_2) is the set of
valid computations of T.
appropriately for strings x not of the form e(T)e(z), then f defines a
reduction from Acc′ to AE.)
e. Show that AE is not recursively enumerable.
11.27. If AE is the language defined in the previous exercise, show that if L is any
language whose complement is not recursively enumerable, then L < AE.
11.28. Find two unsolvable decision problems, neither of which can be reduced to
the other, and prove it.
11.29. In this problem TMs are assumed to have input alphabet {0, 1}. For a finite
set S ⊆ {0, 1}*, P_S denotes the decision problem: Given a TM T, is
S ⊆ L(T)?
a. Show that if x, y ∈ {0, 1}*, then P_{x} ≤ P_{y}.
b. Show that if x, y, z ∈ {0, 1}*, then P_{x} ≤ P_{y,z}.
c. Show that if x, y, z ∈ {0, 1}*, then P_{x,y} ≤ P_{z}.
d. Show that for any two finite subsets S and U of {0, 1}*, P_S ≤ P_U.
11.30. Repeat the previous problem, but this time letting Ps denote the problem:
Given aTM T, is L(T) = S?
11.31. For each decision problem given, determine whether it is solvable or
unsolvable, and prove your answer.
a. Given a TM T, does T eventually enter every one of its nonhalting
states if it begins with a blank tape?
b. Given a TM T, is there an input string that causes T to enter every one
of its nonhalting states?
11.32. Show that the problem CSLIsEmpty: given a linear-bounded automaton,
is the language it accepts empty? is unsolvable. Suggestion: use the fact
that Post’s correspondence problem is unsolvable, by starting with an
arbitrary correspondence system and constructing an LBA that accepts
precisely the strings α representing solutions to the correspondence system.
11.33. This exercise establishes the fact that there is a recursive language over
{a, b} that is not context-sensitive. (Note that the argument outlined below
uses a diagonal argument. At this point, a diagonal argument or something
comparable is the only technique known for constructing languages that are
not context-sensitive.)
a. Describe a way to enumerate explicitly the set of context-sensitive
grammars generating languages over {a, b}. You may make the
assumption that for some set A = {A_1, A_2, ...}, every such grammar
has start symbol A_1 and only variables that are elements of A.
b. If G_1, G_2, ... is the enumeration in part (a), and x_1, x_2, ... are the
nonnull elements of {a, b}* listed in canonical order, let
L = {x_i | x_i ∉ L(G_i)}. Show that L is recursive and not
context-sensitive.
11.34. Is the decision problem: Given a CFG G, and a string x, is L(G) = {x}?
solvable or unsolvable? Give reasons for your answer.
CHAPTER 12
Computable Functions
of how busy a TM of this type can be before it halts. It has also been suggested that the term
“busy beaver” might refer to the resemblance between 1’s on the tape and twigs arranged by
a beaver.)
Suppose, for the sake of contradiction, that b is computable. Then it is possible to find a
TM T_b having tape alphabet {0, 1} that computes it (Exercise 9.45). Let T = T_b T_1, where T_1
is a TM also having tape alphabet {0, 1} that moves its tape head to the first square to the right
of its starting position in which there is either a 0 or a blank, writes a 1 there, and halts. Let
m be the number of states of T. By definition of the function b, no TM with m states and tape
alphabet {0, 1} can end up with more than b(m) 1’s on the tape if it halts on input 1^m. However,
T is a machine of this type that halts with output 1^{b(m)+1}. This contradiction shows that b is
not computable.
The function b has been precisely defined, but it is not computable.
Constant functions: For each k ≥ 0 and each a ∈ N, the constant function
C_a^k : N^k → N is defined by the formula C_a^k(X) = a for every X ∈ N^k.
The successor function: s : N → N is defined by s(x) = x + 1.
Projection functions: For each k ≥ 1 and each i with 1 ≤ i ≤ k, the projection
function p_i^k : N^k → N is defined by the formula
p_i^k(x_1, x_2, ..., x_k) = x_i
Now we are ready to consider ways of combining functions to obtain new ones.
We start with composition, essentially as defined in Chapter 1, and another operation
involving a type of recursive definition.
Definition 12.1 Composition
We have chosen here to restrict ourselves to functions whose values are single
integers, rather than k-tuples; otherwise, we could write h = f ∘ g, where g is the
function from N^m to N^k defined by g(X) = (g_1(X), ..., g_k(X)).
Notice that in this definition, in order for h(X) to be defined, it is necessary and
sufficient that each g_i(X) be defined and that f be defined at the point (g_1(X), ...,
g_k(X)). If all the functions f, g_1, ..., g_k are total, then h is total.
For a familiar example, let Add : N × N → N be the usual addition function
(Add(x, y) = x + y), and let f and g be partial functions from N to N. Then the
function Add(f, g) obtained from Add, f, and g by composition is normally written
f + g.
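Composition as just described is easy to mirror in code. In this Python sketch (the names are ours), `None` stands for "undefined", so partial functions behave exactly as in the definition: h(X) is defined iff every g_i(X) is defined and f is defined at the resulting point.

```python
def compose(f, gs):
    """h(X) = f(g1(X), ..., gk(X)), with None modeling 'undefined'."""
    def h(*xs):
        values = [g(*xs) for g in gs]
        if any(v is None for v in values):
            return None        # some gi(X) undefined, so h(X) undefined
        return f(*values)
    return h

# The familiar example: Add composed with f and g is the function f + g.
add = lambda x, y: x + y
f = lambda x: x * x                       # total
g = lambda x: x - 3 if x >= 3 else None   # partial: undefined below 3
f_plus_g = compose(add, [f, g])
```

Here f_plus_g(5) = 25 + 2 = 27, while f_plus_g(1) is undefined because g(1) is.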
The simplest way to define a function f from N to N recursively is to define
f(0) first, and then for any k ≥ 0 to define f(k + 1) in terms of f(k). A standard
example is the factorial function:
0! = 1    (k + 1)! = (k + 1) · k!
In the recursive step, the expression for f(k + 1) involves both k and f(k). We can
generalize this by substituting any expression of the form h(k, f(k)), where h is a
function of two variables. In order to use this approach for a function f of more than
one variable, we simply restrict the recursion to the last coordinate. In other words,
we start by saying what f(x_1, x_2, ..., x_n, 0) is, for any choice of (x_1, ..., x_n). This
means specifying a constant when n = 0 and a function of n variables in general.
Then in the recursive step, we say what f(x_1, x_2, ..., x_n, k + 1) is, in terms of
f(x_1, ..., x_n, k). Let the n-tuple (x_1, ..., x_n) be denoted by X. In the most general
case, f(X, k + 1) may depend on X and k directly, in addition to f(X, k), just as
(k + 1)! depended on k as well as on k!. Thus a reasonable way to formulate the
recursive step is to say that
f(X, k + 1) = h(X, k, f(X, k))
for some function h of n + 2 variables.
In the factorial example, n = 0, g is the number 1 (or the constant function of
zero variables C_1^0), and h(x, y) = (x + 1) · y.
Here again, if the functions g and h are total functions, f is total. If either g
or h is not total, the situation is a little more complicated. If g(X) is undefined for
some X ∈ N^n, then f(X, 0) is undefined, f(X, 1) = h(X, 0, f(X, 0)) is undefined,
and in general f(X, k) is undefined for each k. For exactly the same reason, if
f(X, k) is undefined for some k, say k = k_0, then f(X, k) is undefined for every
k > k_0; equivalently, if f(X, k_1) is defined, then f(X, k) is defined for every k ≤ k_1.
These observations will be useful a little later in showing that a function obtained by
primitive recursion from computable functions is also computable.
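The scheme f(X, 0) = g(X), f(X, k + 1) = h(X, k, f(X, k)) can be transcribed directly. In this Python sketch (our own illustration), an iterative loop replaces the recursion, and `None` again models undefinedness in exactly the way just described:

```python
def primitive_recursion(g, h):
    """Return f with f(X, 0) = g(X) and f(X, k + 1) = h(X, k, f(X, k)).
    If g(X) is undefined (None), every f(X, k) is undefined; if some
    f(X, k0) is undefined, so is f(X, k) for every k > k0."""
    def f(*args):
        xs, k = args[:-1], args[-1]
        value = g(*xs)
        for i in range(k):
            if value is None:
                return None
            value = h(*xs, i, value)
        return value
    return f

# The factorial example: n = 0, g is the constant 1, h(k, y) = (k + 1) * y.
factorial = primitive_recursion(lambda: 1, lambda k, y: (k + 1) * y)
```

The loop computes f(X, k) by building up f(X, 0), f(X, 1), ..., which is also how the Turing-machine argument for computability proceeds.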
At this point we have a class of initial functions, and we have two operations
with which to obtain new functions. Although other operations are necessary in order
to obtain all computable functions, it will be useful to formalize the set of functions
we can obtain with the tools we have developed.
Add(x, y) = x + y    Mult(x, y) = x · y
are both primitive recursive. We start by finding a primitive recursive derivation for Add. Since
Add is not an initial function, and there is no obvious way to obtain it by composition, we try
to obtain it from simpler functions using primitive recursion. If Add is obtained from g and h
by primitive recursion, g and h must be functions of one and three variables, respectively. The
equations are
Add(x, 0) = g(x)
Add(x, k + 1) = h(x, k, Add(x, k))
Add(x, 0) should be x, and thus we may take g to be the initial function p_1^1. In order to get
x + k + 1 (i.e., Add(x, k + 1)) from the three quantities x, k, and x + k, we can simply take
the successor of x + k. In other words, h(x, k, Add(x, k)) should be s(Add(x, k)). This means
that h(x_1, x_2, x_3) should be s(x_3), or s(p_3^3(x_1, x_2, x_3)). Therefore, a derivation for Add can be
obtained as follows:
This way of ordering the five functions is not the only correct one. Any ordering in which Add
is last and s and p_3^3 both precede s(p_3^3) would work just as well.
To obtain Mult, we try primitive recursion again. We have
Mult(x, 0) = 0
Mult(x, k + 1) = x · (k + 1)
= x · k + x
= Add(Mult(x, k), x)
Remember that we are attempting to write this in the form h(x, k, Mult(x, k)). Since x and
Mult(x, k) are the first and third coordinates of the 3-tuple (x, k, Mult(x, k)), we use the
function f = Add(p_3^3, p_1^3), obtained from Add, p_3^3, and p_1^3 by composition. The function Mult
is obtained from 0 (i.e., the initial function C_0^1) and f using the operation of primitive recursion.
Therefore, Mult is also primitive recursive.
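The derivations of Add and Mult can actually be executed. This self-contained Python sketch (our own illustration) defines the initial functions and the two operations, then follows the derivations in the text step by step:

```python
def compose(f, gs):
    # h(X) = f(g1(X), ..., gk(X))
    return lambda *xs: f(*(g(*xs) for g in gs))

def prim_rec(g, h):
    # f(X, 0) = g(X); f(X, k + 1) = h(X, k, f(X, k))
    def f(*args):
        xs, k = args[:-1], args[-1]
        value = g(*xs)
        for i in range(k):
            value = h(*xs, i, value)
        return value
    return f

s = lambda x: x + 1                  # the successor function

def p(i, k):
    """Projection p_i^k (k is recorded only for documentation)."""
    return lambda *xs: xs[i - 1]

# Add is obtained from g = p_1^1 and h = s(p_3^3):
Add = prim_rec(p(1, 1), compose(s, [p(3, 3)]))
# Mult is obtained from the constant 0 and f = Add(p_3^3, p_1^3):
Mult = prim_rec(lambda x: 0, compose(Add, [p(3, 3), p(1, 3)]))
```

For example, Add(3, 4) applies the successor four times starting from 3, and Mult(3, 4) adds 3 four times starting from 0.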
This derivation of Mult, and many arguments involving primitive recursive func-
tions, can be simplified somewhat by using the following general result.
where we define 0^0 = 1 in order to make the function total. To show that f is primitive
recursive, we look first at the function f_1 defined by f_1(x, y) = x^y. We can write
f_1(x, 0) = 1
f_1(x, k + 1) = Mult(x, x^k)
= Mult(x, f_1(x, k))
By considering the formula h(x, y, z) = Mult(x, z) and using part 1 of the theorem, we can
see that f_1 is primitive recursive. Since y^x = f_1(y, x), it follows from part 2 that the first term
in the formula for f is primitive recursive. The second and third terms are primitive recursive
functions of x because of parts 4 and 3 of the theorem, respectively, and therefore primitive
recursive functions of x and y as a result of part 1. Finally, since f(x, y) is obtained by
applying Add twice to these three terms, we can use the fact that composition preserves
primitive recursiveness to conclude that f is primitive recursive.
Pred(x) = 0 if x = 0, and Pred(x) = x − 1 if x ≥ 1.
The formulas
Pred(0) = 0
Pred(k + 1) = k
together with part 1 of Theorem 12.1 show that Pred can be derived from primitive recursive
functions using primitive recursion. If we define Sub by
Sub(x, y) = x − y if x ≥ y, and Sub(x, y) = 0 otherwise
then
Sub(x, 0) = x
Sub(x, k + 1) = Pred(Sub(x, k))
from which it follows that Sub is primitive recursive. This operation is often written ∸ and is
referred to as proper subtraction, or the monus operation.
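Pred and Sub follow the same pattern. This self-contained Python sketch (our own illustration; the primitive-recursion operator is defined again here so the example stands alone) mirrors the two recursions just given:

```python
def prim_rec(g, h):
    # f(X, 0) = g(X); f(X, k + 1) = h(X, k, f(X, k))
    def f(*args):
        xs, k = args[:-1], args[-1]
        value = g(*xs)
        for i in range(k):
            value = h(*xs, i, value)
        return value
    return f

# Pred(0) = 0 and Pred(k + 1) = k:
Pred = prim_rec(lambda: 0, lambda k, y: k)
# Sub(x, 0) = x and Sub(x, k + 1) = Pred(Sub(x, k))  (proper subtraction):
Sub = prim_rec(lambda x: x, lambda x, k, y: Pred(y))
```

Note that Sub(2, 5) = 0, not −3: repeated applications of Pred never go below zero, which is exactly the monus behavior.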
Theorem 12.2
Every primitive recursive function is a computable total function.
Proof
The way we have defined the set of primitive recursive functions makes
structural induction appropriate for proving things about them. We show
the following three statements: Every initial function is a total computable
function; any function obtained from total computable functions by
composition is also a total computable function; and any function obtained
from total computable functions by primitive recursion is also a total
computable function.
We have previously observed, in fact, that the initial functions are total
functions and that functions obtained from total functions by composition
or primitive recursion are total; thus we may concentrate on the conclusions
involving computability. It is almost obvious that all the initial functions are
computable, and we omit the details. For the sake of simplicity, we show
that if h : Nᵐ → N is obtained from f : N² → N and g₁ and g₂, both
functions from Nᵐ to N, by composition, and if these three functions are
computable, then h is. The argument is valid even if all the functions are
partial functions, and it extends in an obvious way to the more general case
in which f is a function of k variables.
Let T_f, T₁, and T₂ be TMs computing f, g₁, and g₂, respectively. We
will construct a TM T_h to compute h. To simplify notation, we denote by X
the m-tuple (x₁, x₂, ..., xₘ) and by 1^X the string 1^{x₁} Δ 1^{x₂} Δ ··· Δ 1^{xₘ}.
The TM T_h begins with tape contents Δ1^X, and it must use this input
twice, once to compute each of the g_i's. It does this by copying the input to
produce the tape

Δ1^X Δ1^X

executing T₁ to produce Δ1^X Δ1^{g₁(X)}, and then making another copy of the
input and executing T₂ to obtain

Δ1^X Δ1^{g₁(X)} Δ1^{g₂(X)}

At this point it deletes the original input and executes T_f on the string
Δ1^{g₁(X)} Δ1^{g₂(X)}, which produces the desired output.
For any choice of X, T_h fails to accept during the execution of T_i if g_i(X)
is undefined, and fails to accept during the execution of T_f if both g_i(X)'s
are defined but f(g₁(X), g₂(X)) is undefined. Therefore, T_h computes h.
For the final step of the proof, suppose that g : Nⁿ → N and h :
N^{n+2} → N are computable and that f is obtained from g and h by primitive
recursion. We let T_g and T_h be TMs computing g and h, respectively, and
we construct a TM T_f to compute f. The original tape of T_f looks like this:

Δ1^{x₁} Δ1^{x₂} Δ ··· Δ1^{xₙ} Δ1^{k}
450 PART 5 Unsolvable Problems and Computable Functions
There are functions defined in more conventional ways that can be shown to be
total, computable, and not primitive recursive. One of the most well known is called
Ackermann’s function; its definition involves a sort of recursion, so that it is clearly
computable, but it can be shown to grow more rapidly than any primitive recursive
function. A readable discussion of this function can be found in the text by Hennie
(1977).
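The text does not reproduce the definition, but one standard two-variable formulation of Ackermann's function (the version commonly given in textbooks; the one Hennie discusses may differ in details) is the following, with the recursion written out directly in Python:

```python
def ackermann(x, y):
    """A(0, y) = y + 1; A(x + 1, 0) = A(x, 1); A(x + 1, y + 1) = A(x, A(x + 1, y)).
    Total and clearly computable, yet it grows faster than any primitive
    recursive function."""
    if x == 0:
        return y + 1
    if y == 0:
        return ackermann(x - 1, 1)
    return ackermann(x - 1, ackermann(x, y - 1))
```

Even small arguments such as x = 4 already produce astronomically large values, which is one way the rapid growth shows itself in practice.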
χ_P(X) = 1    if P(X) is true
       = 0    otherwise
Since χ_P is a numerical function, all the properties of functions that we have discussed
in Section 12.1 are applicable to it and, by association, to P. In particular, P is
computable if χ_P is, and P is primitive recursive if χ_P is. If the characteristic
function χ_L of a language L is computable, we can decide whether a given input is
in L. When we say that χ_P is computable, we are saying something similar: There
is an algorithm to determine whether a given X satisfies P or makes P(X) true.
Predicates take the values true and false, and therefore it makes sense to apply the
logical operators ∧ (AND), ∨ (OR), and ¬ (NOT) to them. For example, (P₁ ∧ P₂)(X)
is true if and only if both P₁(X) and P₂(X) are true. Not surprisingly, these operations
preserve the primitive recursive property.
Sg(0) = 0        Sg(k + 1) = 1

This function takes the value 0 if x = 0 and the value 1 otherwise, and its definition makes it
clear that it is primitive recursive. Now we may write

χ_LT(x, y) = Sg(Sub(y, x))

Although the other four relational predicates can be handled in the same way, it is easier
to use the formulas
LE = LT ∨ EQ
GT = ¬LE
GE = ¬LT
NE = ¬EQ
which together with Theorem 12.4 imply that all these predicates are primitive recursive.
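As an illustration (ours, not the text's), the characteristic functions of all six relational predicates can be assembled from Sg, the monus operation, and the logical identities above:

```python
def sg(x):
    """Sg(0) = 0; Sg(k + 1) = 1."""
    return 0 if x == 0 else 1

def monus(x, y):
    return x - y if x >= y else 0

# LT and EQ directly; the other four via LE = LT or EQ, GT = not LE, etc.
def lt(x, y): return sg(monus(y, x))
def eq(x, y): return 1 - sg(monus(x, y) + monus(y, x))
def le(x, y): return max(lt(x, y), eq(x, y))   # OR of characteristic values
def gt(x, y): return 1 - le(x, y)              # NOT
def ge(x, y): return 1 - lt(x, y)
def ne(x, y): return 1 - eq(x, y)
```

Only operations already shown to preserve primitive recursiveness appear here, so each of the six characteristic functions is primitive recursive.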
x = y * Div(x, y) + Mod(x, y)
R(x, y) = Mod(y, x)
According to part 2 of Theorem 12.1, the primitive recursiveness of Mod follows from that of
R. The following formulas can be verified easily.
R(x, 0) = Mod(0, x) = 0
R(x, k + 1) = Mod(k + 1, x)
           = R(x, k) + 1    if x ≠ 0 and R(x, k) + 1 < x
           = 0              if x ≠ 0 and R(x, k) + 1 = x
           = k + 1          if x = 0
For example,
R(5,9+ 1) = Mod(10, 5) = 0
h(x₁, x₂, x₃) = x₃ + 1    if x₁ ≠ 0 and x₃ + 1 < x₁
             = 0          if x₁ ≠ 0 and x₃ + 1 = x₁
             = x₂ + 1     if x₁ = 0

is not a total function, since it is undefined if x₁ ≠ 0 and x₃ + 1 > x₁. However, the modification

h(x₁, x₂, x₃) = x₃ + 1    if x₁ ≠ 0 and x₃ + 1 < x₁
             = x₂ + 1     if x₁ = 0
             = 0          otherwise

works just as well. The function R is obtained by primitive recursion from C₀¹ and this modified
h, and Theorem 12.5 implies that h is primitive recursive. Therefore, so are R and Mod.
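The primitive recursion for R, and hence Mod, can be traced in code. This Python sketch (an illustration of ours) follows the three cases of the step function, including the convention Mod(y, 0) = y:

```python
def mod_pr(y, x):
    """Mod(y, x) computed through the primitive recursion on
    R(x, k) = Mod(k, x) described above."""
    r = 0                     # R(x, 0) = Mod(0, x) = 0
    for k in range(y):
        if x != 0 and r + 1 < x:
            r = r + 1         # remainder grows by one
        elif x != 0:          # r + 1 == x: wrap around to 0
            r = 0
        else:                 # x == 0: Mod(k + 1, 0) = k + 1
            r = k + 1
    return r
```

Each pass through the loop is one application of the (modified, total) step function h.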
The function Div can now be handled in a similar way. If we define Q(x, y) to be
Div(y, x), then it is not hard to check that Q is obtained by primitive recursion from C₀¹ and
the primitive recursive function h₁, defined by

h₁(x₁, x₂, x₃) = x₃ + 1    if x₁ ≠ 0 and Mod(x₂, x₁) + 1 = x₁
              = x₃         if x₁ ≠ 0 and Mod(x₂, x₁) + 1 < x₁
              = 0          if x₁ = 0
(Note that for any choice of (x₁, x₂, x₃), precisely one of the predicates appearing in this
definition is true.)
The operations that can be applied to predicates to produce new ones include not
only logical operations such as AND, but also universal and existential quantifiers.
For example, if Sq is the 2-place predicate defined by
Sq(x, y) = (y² = x)
then it is reasonable to apply the existential quantifier (“there exists”) to the second
variable in order to obtain the 1-place predicate PerfectSquare, defined by
PerfectSquare(x) = (there exists y with y² = x)
The predicate Sq is primitive recursive. Does it follow that PerfectSquare is? The
answer is: No, it does not follow, but yes, this predicate is primitive recursive.
Placing a quantifier in front of a primitive recursive predicate does not always produce
a primitive recursive predicate, and placing a quantifier in front of a computable
predicate does not always produce something computable.
We can easily find an example to illustrate the second statement, by considering
an unsolvable problem from Chapter 11 that can be obtained this way. Given an
alphabet Σ, we can impose an ordering on it, which makes it possible to consider
the canonical order on Σ*. For a natural number x, denote by s_x the xth string with
respect to that ordering. Let T_u be the universal Turing machine of Section 9.6, and
let H be the 2-place predicate defined by

H(x, y) = (T_u halts after exactly y moves on input s_x)
The predicate H is computable: to compute H(x, y), we can simply trace the moves of T_u
for y steps on input s_x. However, the predicate obtained from H by applying the existential
quantifier,

Halts(x) = (there exists y such that H(x, y))

is not computable, because to compute it would mean solving the halting problem.
One difference between these two examples, which is enough to guarantee that
PerfectSquare is computable even though Halts is not, is that for a given x there is
a bound on the values of y that need to be tested in order to determine whether the
predicate "there exists y such that y² = x" is true. Since y² ≥ y, for example, any y
for which y² = x must satisfy y ≤ x. In particular, there is an algorithm to determine
whether PerfectSquare(x) is true: Try values of y in increasing order until a value is
found satisfying either y² = x or y² > x. The predicate Halts illustrates the fact that
if the simple trial-and-error algorithm that comes with such a bound is not available,
there may be no algorithm at all.
This discussion suggests that if we start with any (n + 1)-place predicate P, we
may consider the new predicate E_P that results from applying the existential quantifier
to the last variable in a restricted way, by specifying a bounded range for this variable.
We can do the same thing with the universal quantifier (“for every”), and in both cases
this bounded quantification preserves the primitive recursive property.
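A sketch (ours) of bounded quantification, with PerfectSquare as the example; as argued above, a witness y, if one exists, satisfies y ≤ x, so x itself serves as the bound:

```python
def bounded_exists(P, X, k):
    """E_P(X, k): does P(X, y) hold for some y <= k?"""
    return any(P(*X, y) for y in range(k + 1))

def bounded_forall(P, X, k):
    """A_P(X, k): does P(X, y) hold for every y <= k?"""
    return all(P(*X, y) for y in range(k + 1))

def perfect_square(x):
    """PerfectSquare via bounded existential quantification."""
    return bounded_exists(lambda x, y: y * y == x, (x,), x)
```

The two helpers only ever test finitely many values of y, which is precisely why the bounded versions preserve computability.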
In order to simplify the proof of Theorem 12.6, it is useful to introduce two other
“bounded operations.” We start by considering a simple special case in which the
resulting function is a familiar one.
The factorial function is defined recursively in Chapter 2. Here we use the
definition

x! = ∏_{i=1}^{x} i

where, in general, ∏_{i=j}^{k} p_i stands for the product p_j * p_{j+1} * ··· * p_k if k ≥ j, and 1
if k < j. (In the second case, we think of it as the empty product; 1 is the appropriate
value, since we want the empty product multiplied by any other product to be that
other product.)
other product.) We can generalize this definition by allowing the factors to be more
general than i—in particular, to involve other variables—and by allowing sums as
well as products.
Lemma 12.1 Let n ≥ 0, and suppose that g : N^{n+1} → N is primitive recursive.
Then the functions f₁, f₂ : N^{n+1} → N defined below are also primitive recursive:

f₁(X, k) = Σ_{i=0}^{k} g(X, i)

f₂(X, k) = ∏_{i=0}^{k} g(X, i)
for any X ∈ Nⁿ and k ≥ 0. (The functions f₁ and f₂ are said to be obtained from g
by bounded sums and bounded products, respectively.)

Proof We give the proof for f₁, and the other is almost identical. We may write

f₁(X, 0) = g(X, 0)
f₁(X, k + 1) = f₁(X, k) + g(X, k + 1)

Therefore, f₁ is obtained by primitive recursion from the two primitive recursive
functions g₁ and h, where g₁(X) = g(X, 0) and h(X, y, z) = z + g(X, y + 1). ∎
Note the slight discrepancy between the definition of bounded product, in which
the product starts with i = 0, and the previous definition of x!. It is not difficult to
generalize the theorem slightly so as to allow the sum or product to begin with the
i = i₀ term, for any fixed i₀ (Exercise 12.33).
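Bounded sums and products, and the factorial as a product starting at i = 1, might be sketched as follows (an illustration of ours, not part of the text):

```python
def bounded_sum(g, X, k):
    """f1(X, k): the sum of g(X, i) for i = 0 .. k."""
    return sum(g(*X, i) for i in range(k + 1))

def bounded_product(g, X, k):
    """f2(X, k): the product of g(X, i) for i = 0 .. k."""
    result = 1
    for i in range(k + 1):
        result *= g(*X, i)
    return result

def factorial(x):
    """x! as a product beginning with the i = 1 term; x = 0 gives the
    empty product, 1."""
    result = 1
    for i in range(1, x + 1):
        result *= i
    return result
```

Starting the product at 1 rather than 0 is exactly the slight generalization mentioned above.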
So far, the bounded versions of the operations in this section preserve the primitive
recursive property, whereas the unbounded versions do not even preserve computability.
In order to characterize the computable functions as those obtained by
starting with initial functions and applying certain operations, we need at least one
operation that preserves computability but not primitive recursiveness. This is
because the initial functions are primitive recursive, and not all computable functions
are. The operation of minimalization turns out to have this feature. We introduce its
bounded version here and examine the general operation in the next section.

For an (n + 1)-place predicate P, and a given X ∈ Nⁿ, we may consider the
smallest value of y for which P(X, y) is true. To turn this operation into a bounded
one, we specify a value of k and ask for the smallest value of y that is less than or
equal to k and satisfies P(X, y). There may be no such y (whether or not we bound
the possible choices by k); therefore, because we want the bounded version of our
function to be total, we introduce an appropriate default value for the function in this
case.
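A Python sketch (ours) of bounded minimalization, assuming the convention that the default value is k + 1 when no y ≤ k satisfies the predicate, applied to the computation of PrNo; the helper `is_prime` is ours, and Euclid's argument supplies the bound x! + 1:

```python
import math

def m(P, X, k):
    """Bounded minimalization m_P(X, k): the smallest y <= k with P(X, y)
    true, with k + 1 (the assumed default) when there is no such y."""
    for y in range(k + 1):
        if P(*X, y):
            return y
    return k + 1

def is_prime(y):        # helper of ours, not part of the text
    return y >= 2 and all(y % d for d in range(2, y))

def prno(k):
    """PrNo(0) = 2; PrNo(k + 1) = m_P(PrNo(k), PrNo(k)! + 1), where
    P(x, y) = (y is prime and y > x)."""
    p = 2
    P = lambda x, y: is_prime(y) and y > x
    for _ in range(k):
        p = m(P, (p,), math.factorial(p) + 1)
    return p
```

Because some prime between PrNo(k) + 1 and PrNo(k)! + 1 always exists, the default value never actually arises here.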
Then

PrNo(0) = 2
PrNo(k + 1) = m_P(PrNo(k), PrNo(k)! + 1)

We have shown that PrNo can be obtained by primitive recursion from the two functions C₂⁰
and h, where h(x₁, x₂) = m_P(x₂, x₂! + 1).
The fact that we want M_P to be a computable partial function for any computable
predicate P also has another consequence. Suppose, again, that the algorithm we are
relying on for computing M_P(X) is the simple-minded one of evaluating P(X, y) for
increasing values of y. Suppose also that for a particular y₀, P(X, y₀) is undefined.
Although there might be a value y₁ > y₀ for which P(X, y₁) is true, we will never
get around to considering P(X, y₁) if we get stuck in an infinite loop while trying
to evaluate P(X, y₀). We can avoid this problem by stipulating that unbounded
minimalization be applied only to total predicates or total functions.
Unbounded minimalization is the last of the operations we need in order to char-
acterize the computable functions. Notice that in the definition below, this operator
is applied only to predicates defined by some numeric function being zero.
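Unbounded minimalization applied to a predicate of the form f(X, y) = 0, for a total f, might be sketched this way (our code); the resulting function is partial, and the loop genuinely diverges when no such y exists:

```python
def mu(f):
    """Unbounded minimalization of the predicate f(X, y) = 0, for a total f:
    returns the smallest such y, looping forever if there is none."""
    def g(*X):
        y = 0
        while f(*X, y) != 0:
            y += 1
        return y
    return g

# Example: the exact square root, defined only on perfect squares.
isqrt_exact = mu(lambda x, y: abs(y * y - x))
```

A call such as `isqrt_exact(2)` never returns, which is exactly the behavior the definition requires of a partial function at a point outside its domain.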
numerical formulas expressing relations between numbers. His ingenious use of these
techniques allowed him to establish unexpected results concerning logical systems.
Gödel's incompleteness theorem says, roughly speaking, that any formal system com-
prehensive enough to include the laws of arithmetic must, if it is consistent, contain
true statements that cannot be proved within the system.
Although we will not be discussing Gödel's results directly, the idea of "Gödel
numbering” will be useful. The first step is simply to encode sequences of several
numbers as single numbers. One application will be to show that a more general
type of recursive definition than we have considered so far gives rise to primitive
recursive functions. A little later we will extend our “arithmetization” to objects such
as TMs. This will allow us to represent a sequence of calculations involving numbers
by a sequence of numbers, and it will be the principal ingredient in the proof that all
computable functions are μ-recursive.
There are a variety of Gödel-numbering schemes. Most depend on a familiar
fact about numbers: Every positive integer can be factored into primes, and this
factorization is unique except for differences in the order of the factors.
The Gödel number of any sequence is greater than or equal to 1, and every integer
greater than or equal to 1 is the Gödel number of a sequence. The function gn is not
one-to-one; for example, appending 0's to a sequence does not change its Gödel number:
gn(1, 2) = 2¹ * 3² = 18 = gn(1, 2, 0).
If we start with a positive integer g and wish to decode g to find a sequence x₀,
x₁, ..., xₙ whose Gödel number is g, we may proceed by factoring g into primes.
For each i, x_i is the number of times PrNo(i) appears as a factor of g. For example,
the number 59895 has the factorization

59895 = 2⁰ * 3² * 5¹ * 7⁰ * 11³

so that 59895 = gn(0, 2, 1, 0, 3).
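Encoding and decoding can be sketched directly (our code, not the text's). `decode` recovers the exponents of successive primes, so trailing zeros of the original sequence are lost, illustrating that gn is not one-to-one:

```python
def primes():
    """Generate PrNo(0), PrNo(1), ... = 2, 3, 5, 7, ..."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def gn(seq):
    """Gödel number of a sequence: the product of PrNo(i) ** seq[i]."""
    g, gen = 1, primes()
    for x in seq:
        g *= next(gen) ** x
    return g

def decode(g):
    """x_i is the number of times PrNo(i) divides g; stop once the
    remaining cofactor is 1."""
    seq, gen = [], primes()
    while g > 1:
        p, e = next(gen), 0
        while g % p == 0:
            g //= p
            e += 1
        seq.append(e)
    return seq
```

Decoding is just repeated division, mirroring the factorization argument in the text.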
Many common recursive definitions do not obviously fit the strict pattern required
by the operation of primitive recursion. The standard definition of the Fibonacci
function, for example, involves the formula
f(n + 1) = f(n) + f(n - 1)

The right side apparently is not of the form h(n, f(n)) because it also depends on
f(n - 1). In a more general situation, f(n + 1) might depend on even more, conceivably
all, of the terms f(0), f(1), ..., f(n). This type of recursion is known as
course-of-values recursion, and it bears the same relation to ordinary primitive recur-
sion that the strong principle of mathematical induction does to the ordinary principle
(see Section 2.3).
One simple way to use Gödel numbers is to recast such recursive definitions to fit
the required form. Suppose f is defined recursively, so that f(n + 1) depends on some
or all of the numbers f(0), ..., f(n) (and possibly also directly on n). Intuitively,
what we need in order to describe f in terms of primitive recursion is another function
f₁ for which

1. Knowing f₁(n) would allow us to calculate f(n).
2. f₁(n + 1) depends only on n and f₁(n).
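For the Fibonacci function, the idea can be made concrete (our sketch, not the text's): carry the single number f₁(k) = gn(f(0), ..., f(k)) and, at each step, unpack from it the two exponents needed:

```python
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

def exponent(p, g):
    """The exponent of the prime p in the factorization of g."""
    e = 0
    while g % p == 0:
        g //= p
        e += 1
    return e

def fib_via_godel(n):
    """f(0) = 0, f(1) = 1, f(n + 1) = f(n) + f(n - 1). Each step consults
    only the single number f1(k) = gn(f(0), ..., f(k)), appending the new
    value as a power of the next prime."""
    f1 = 2 ** 0 * 3 ** 1          # gn(f(0), f(1))
    for k in range(1, n):
        prev = exponent(PRIMES[k - 1], f1)
        curr = exponent(PRIMES[k], f1)
        f1 *= PRIMES[k + 1] ** (prev + curr)
    return exponent(PRIMES[n], f1)
```

Both requirements above hold: f(n) is recoverable from f₁(n), and each new value of f₁ depends only on the previous one (and the position).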
Now we are ready to apply our Gödel numbering techniques to Turing machines.
A computable function f is computed by a sequence of steps. If we can manage to
represent these steps as operations on numbers, then we will have a way of building
the function f from more rudimentary functions. Because a TM move can be thought
of as a transformation of the machine from one configuration to another, all we need
do to describe the move numerically is represent the TM configuration by a number.
We begin by assigning a number to each state. The halt states h_a and h_r are
assigned the numbers 0 and 1, respectively. If Q is the set of nonhalting states, then
we let the elements of Q be q₂, q₃, ..., q_s, where q₂ is always assumed to be the
initial state.
The natural number to use in describing the tape head position is the number of
the tape square the head is scanning. Finally, we assign the number 0 to the blank
symbol Δ (we will sometimes write 0 instead of Δ), and we assume that the nonblank
tape symbols are 1, 2,...,t. This allows us to define the tape number of the TM at
any point to be the Gödel number of the sequence of symbols currently on the tape.
Note that because we are identifying Δ with 0, the tape number is the same no matter
how many trailing blanks we include in the sequence. The tape number of a blank
tape is 1.
Since the configuration of the TM is determined by the state, the tape head
position, and the current contents of the tape, we define the configuration number to
be the number
gn(q, P, tn)
where q is the number of the current state, P is the current head position, and tn is
the current tape number. The most important feature of the configuration number is
that from it we can reconstruct all the details of the configuration; we will be more
explicit about this in the next section.
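A small sketch (ours) of configuration numbers, using the first few primes:

```python
PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)

def gn(seq):
    """Gödel number of a sequence of natural numbers."""
    g = 1
    for p, x in zip(PRIMES, seq):
        g *= p ** x
    return g

def tape_number(symbols):
    """Blanks are the symbol 0, so trailing blanks do not change the
    number; a blank tape has tape number 1."""
    return gn(symbols)

def configuration_number(state, position, symbols):
    """gn(q, P, tn) for state number q, head position P, and tape number tn."""
    return gn([state, position, tape_number(symbols)])
```

For example, a machine in its initial state q₂ (state number 2) scanning square 0 of a blank tape gets configuration number 2² * 5¹ = 20, and every detail of the configuration can be read back off from the exponents.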
The main outline of the proof has been provided for us by the Gödel numbering
scheme presented in the last section and the resulting arithmetization of Turing machines.
If f is computed by the Turing machine T, we will complete the proof by
defining the functions appearing in the formula

f = Result_T ∘ f_T ∘ InitConfig^(n)

and showing that they are μ-recursive and that the formula holds.
For any n-tuple X, InitConfig^(n)(X) will be the number of the initial TM con-
figuration corresponding to input X. This number does not depend on the TM we
use, because we have agreed to label the initial state of any TM as q₂. The numeric
function f_T corresponds to the processing done by T. For an input X in the domain
of f, if n is the number representing the initial configuration of T corresponding
to input X, then f_T(n) represents the accepting configuration ultimately reached by
T; for integers n corresponding to other inputs X, f_T(n) is undefined. The function
Result_T has the property that if n is the number of an accepting configuration in which
the string representing output f(X) is on the tape, then Result_T(n) = f(X).

The function Result_T is one of several whose value at a number m depends on
whether m is the number of a configuration of T. The first step is therefore to examine
the 1-place predicate IsConfig_T defined by

IsConfig_T(n) = (n is a configuration number for T)
For numbers m of this form, the conditions on q and tn are equivalent to the statement

(Exponent(0, m) ≤ s) ∧ (Exponent(2, m) ≥ 1) ∧ (for every i, Exponent(i, tn) ≤ t)

In order to show that the conjunction of these two predicates is primitive recursive,
it is sufficient to show that both of the universal quantifications can be replaced by
bounded universal quantifications. This is true because Exponent(i, n) = 0 when
i > n; the first occurrence of "for every i" can be replaced by "for every i ≤ m" and
the second by "for every i ≤ tn." ∎
where t^(n)(x₁, ..., xₙ) is the tape number of the tape containing the input string
Δ1^{x₁} Δ1^{x₂} Δ ··· Δ1^{xₙ}. It is therefore sufficient to show that the function t^(n) is primitive
recursive. The proof is by mathematical induction on n. The basis step, n = 0, is
clear, since t^(0) is constant in that case. Suppose that k ≥ 0 and that t^(k) is primitive
recursive. The number t^(k+1)(x₁, ..., x_k, x_{k+1}) = t^(k+1)(X, x_{k+1}) is the tape number
for the tape containing

Δ1^{x₁} Δ ··· Δ1^{x_k} Δ1^{x_{k+1}}
Counting the symbols in the string Δ1^{x₁} Δ ··· Δ1^{x_k}, we find that the string 1^{x_{k+1}}
occupies tape squares k + Σ_{i=1}^{k} x_i + 1 through k + Σ_{i=1}^{k} x_i + x_{k+1}. This means that
the additional factors in the tape number resulting from the 1's in this last block are
those of the form

PrNo(k + Σ_{i=1}^{k} x_i + j)    (1 ≤ j ≤ x_{k+1})

In other words, we may write

t^(k+1)(X, x_{k+1}) = t^(k)(X) * ∏_{j=1}^{x_{k+1}} PrNo(k + Σ_{i=1}^{k} x_i + j)
The first factor, viewed as a function of X, is primitive recursive according to the
induction hypothesis. Therefore, by Theorem 12.1, it is still primitive recursive when
viewed as a function of k + 1 variables. The second is of the form
∏_{j=1}^{x_{k+1}} g(X, j)
for a primitive recursive function g, and is therefore primitive recursive by Lemma
12.1. The result follows because the set of primitive recursive functions is closed
under multiplication.
As we discussed earlier, we want Result_T(n) to be f(X) if n represents the accepting
configuration with output f(X). This will be the case if, for any n representing
a configuration, we simply define Result_T(n) to be the number of the tape square
containing the last nonblank symbol on the tape in this configuration, or 0 if there
are no nonblank symbols. We may also let Result_T(n) be 0 if n does not represent a
configuration.
where for any positive k, HighestPrime(k) is the number of the largest prime factor
of k, and HighestPrime(0) = 0 (e.g., HighestPrime(2^a * 5^b * 19^c) = 7 for any
positive exponents, because 19 is PrNo(7)). It is not hard to see that the function
HighestPrime is primitive recursive, and it follows that Result_T is also. ∎
The only remaining piece is f_T, the numerical function corresponding to the
processing done by T itself. At this point we make the simplifying assumption that
T never attempts to move its tape head left from square 0. This involves no loss
of generality because any TM is equivalent to one with this property. It will be
helpful next to introduce explicitly the functions that produce the current state, tape
head position, tape number, and tape symbol from the configuration number. The
respective formulas are

State(m) = Exponent(0, m)
Posn(m) = Exponent(1, m)
TapeNum(m) = Exponent(2, m)
Symbol(m) = Exponent(PrNo(Posn(m)), TapeNum(m))

for any m that is a configuration number for T, and 0 otherwise. Because IsConfig_T
is a primitive recursive predicate, all four functions are primitive recursive.
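These four extraction functions are easy to mirror in code (our sketch; here the exponent is taken by prime index, i.e., the exponent of PrNo(i) in m, as in the earlier uses of Exponent):

```python
PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)

def exponent(i, m):
    """The exponent of PrNo(i) in the prime factorization of m."""
    p, e = PRIMES[i], 0
    while m % p == 0:
        m //= p
        e += 1
    return e

def state(m):    return exponent(0, m)
def posn(m):     return exponent(1, m)
def tape_num(m): return exponent(2, m)

def symbol(m):
    """The symbol in the scanned square: the exponent, in TapeNum(m),
    of the prime corresponding to the head position."""
    return exponent(posn(m), tape_num(m))
```

Everything about the configuration is reconstructed by repeated division, which is the concrete content of the claim that the configuration number determines the configuration.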
The main ingredient in the description of f_T is another function

Move_T : N → N

which describes the effect of a single move of T on the configuration number: for a
configuration number m,

Move_T(m) = gn(NewState(m), NewPosn(m), NewTapeNum(m))

The three functions NewState, NewPosn, and NewTapeNum all have the value 0 at
any point m that is not a configuration number. For a configuration number m,
NewState(m) is the resulting state if T can move from configuration m, and State(m)
otherwise; the other two functions are defined similarly. Thus, in order to show that
Move_T is primitive recursive, it is sufficient to show that these three New functions
are. In the argument it will help to have one more function, NewSymbol, defined
analogously.
So far, our description of NewState(m) has involved three cases. One case
corresponds to the primitive recursive predicate ¬IsConfig_T. The other two cases may
be divided into subcases, corresponding to the possible combinations of State(m) and
Symbol(m). Because these two functions are primitive recursive, so are the predicates
defining the subcases. In each subcase, the value of NewState(m) is either State(m) or the
value specified by the transition table for T. Therefore, since NewState is defined by
cases that involve only primitive recursive functions and predicates, it must also be
primitive recursive. The argument to show that NewSymbol is primitive recursive is
exactly the same.
The proof for NewPosn is almost the same. This function may also be defined
by cases, the same ones involved in the definition of NewState. In each case,
NewPosn(m) is either 0, if m is not a configuration number; Posn(m), if T cannot
move from configuration m, or if the move does not change the position of the tape
head; Posn(m) + 1, if the move shifts the head to the right; or Posn(m) - 1, if the
move shifts the head to the left. Therefore NewPosn is primitive recursive.
The definition of NewTapeNum can also be made using the same cases, with
slightly more complicated formulas. Suppose that Posn(m) = i, Symbol(m) = j, and
NewSymbol(m) = j′. The difference between TapeNum(m) and NewTapeNum(m)
is that the first number involves the factor PrNo(i)^j and the second has PrNo(i)^{j′}
instead; the exponents differ by j′ - j = NewSymbol(m) - Symbol(m). Thus in this
subcase, NewTapeNum(m) can be expressed as

TapeNum(m) * PrNo(Posn(m))^{NewSymbol(m) - Symbol(m)}

if NewSymbol(m) ≥ Symbol(m), and

Div(TapeNum(m), PrNo(Posn(m))^{Symbol(m) - NewSymbol(m)})

otherwise. Since both formulas define primitive recursive functions, the function
NewTapeNum is primitive recursive. ∎
Now that we have described the effect of one move of T on the configuration
number, we can generalize to a sequence of k moves. Consider the function Trace_T :
N² → N defined as follows:

Trace_T(m, 0) = m    if IsConfig_T(m)
            = 0     otherwise

Trace_T(m, k + 1) = Move_T(Trace_T(m, k))
It is clear from Lemma 12.5 that Trace_T can be obtained by primitive recursion from
two primitive recursive functions and is therefore primitive recursive itself. Assuming
that m is a configuration number, we may describe Trace_T(m, k) as the number of the
configuration after k moves if T starts in configuration m, or, if T is unable to make
as many as k moves from configuration m, as the number of the last configuration T
reaches starting from configuration m.
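Trace_T can be sketched generically (our code), taking the roles of Move_T and IsConfig_T as parameters; provided the move function returns its argument unchanged for configurations from which T cannot move, the description above falls out:

```python
def trace(move_t, is_config_t, m, k):
    """Trace_T(m, k): iterate the single-move function k times starting
    from m, returning 0 when m is not a configuration number. If T gets
    stuck earlier, the result is the last configuration reached."""
    if not is_config_t(m):
        return 0
    for _ in range(k):
        m = move_t(m)
    return m
```

The `for` loop is the primitive recursion on k made explicit.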
We need just one more auxiliary function before we can complete the proof of
Theorem 12.10. Let Acceptingr : N — N be defined by
0 if IsConfig; (m) A Exponent(0, m) = 0
Accepting, (m) =
Poem) 1 otherwise
We define gn : Σ* → N as follows. Assign to the symbols of Σ the numbers 1, 2, ..., |Σ|,
and for a symbol a let s(a) be the number assigned to a. For a nonnull string
x = a₀a₁ ··· aₙ,

gn(x) = ∏_{i=0}^{n} PrNo(i)^{s(a_i)}

and the Gödel number of Λ is defined to be 1.
The fact that none of the exponents in the formula for gn(x) can be 0 has two
consequences. First, since the factorization of a positive integer into primes is unique,
the function gn : Σ* → N is one-to-one. Second, numbers such as 3 = 2⁰3¹ or
10 = 2¹3⁰5¹ cannot be the Gödel number of any string. Note also that for any
x ∈ Σ*, the highest power to which any prime can appear in the factorization of
gn(x) is the number of symbols in Σ.
Because gn : Σ* → N is not a bijection, it is not correct to speak of gn⁻¹. It is
convenient, however, to define a function from N to Σ* that is a left inverse of gn,
as follows:

gn′(n) = x    if n = gn(x)
      = Λ    if n is not gn(x) for any string x

The default value Λ is chosen arbitrarily. Saying that gn′ is a left inverse of gn means
that for any x ∈ Σ*,

gn′(gn(x)) = x
Now suppose that f : Σ₁* → Σ₂* is a partial function, where Σ₁ and Σ₂ are
alphabets. We define the corresponding numerical function p_f : N → N by saying
that if n is the Gödel number of x, then p_f(n) is the Gödel number of f(x). Note that
the one-to-one property of gn is necessary for this definition to make sense. We can
also express p_f concisely in terms of the left inverses of the two Gödel-numbering
functions. If g₁ : Σ₁* → N and g₂ : Σ₂* → N are the functions that assign Gödel
numbers, and g₁′ and g₂′ are their respective left inverses, then the formula

p_f(g₁(x)) = g₂(f(g₁′(g₁(x)))) = g₂(f(x))

holds for any string x in Σ₁*. This formula says that in the figure, both ways of getting
from the upper left to the lower right produce the same result, and this is just another
holds for any string x in X}. This formula says that in the figure, both ways of getting
from the upper left to the lower right produce the same result, and this is just another
Figure 12.1 (diagram: f maps Σ₁* to Σ₂* along the top, p_f maps N to N along the
bottom, with g₁, g₁′ on the left and g₂, g₂′ on the right as the vertical arrows
connecting the two rows)
way of saying that the numerical function p_f mirrors the action of the string function
f: If f takes the string x to the string y, p_f takes the number gn(x) to the number
gn(y).
We also have the formula

g₂′(p_f(n)) = f(g₁′(n))

In Figure 12.1, this formula can be interpreted as saying that both paths from the
lower left to the upper right produce the same result. Applying this formula when
n = g₁(x), we obtain

g₂′(p_f(g₁(x))) = f(g₁′(g₁(x))) = f(x)
which cause variables to be incremented and decremented; “conditional go-to” state-
ments of the form
if X ≠ 0 go to L
where L is an integer label in the program; statements of the form
read(X)
write(X)
and nothing else. Even with a language such as this, it is possible to compute all
Turing-computable functions. One approach to proving this would be to write a
program in this language to simulate an arbitrary TM. Doing so would involve some
sort of arithmetization of TMs similar to Gödel numbering: One integer variable
would represent the state, another the head position, a third the tape contents, and so
on. Another approach would be via Theorem 12.10: to show that the set of functions
computable using this language contains the initial functions and is closed under all
the operations permitted for μ-recursive functions.
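An interpreter for such a restricted language can itself be sketched in a few lines (ours; the tuple encoding of statements and the use of statement indices as labels are choices made here, not specified in the text):

```python
def run(program, inputs):
    """Interpret a program in the restricted language: increment and
    decrement statements, `if X != 0 goto L`, read(X), and write(X).
    Variables default to 0 and never go below 0."""
    variables, output, inputs = {}, [], list(inputs)
    pc = 0
    while pc < len(program):
        op = program[pc]
        if op[0] == "inc":
            variables[op[1]] = variables.get(op[1], 0) + 1
        elif op[0] == "dec":
            variables[op[1]] = max(0, variables.get(op[1], 0) - 1)
        elif op[0] == "goto_if":              # if X != 0 goto L
            if variables.get(op[1], 0) != 0:
                pc = op[2]
                continue
        elif op[0] == "read":
            variables[op[1]] = inputs.pop(0)
        elif op[0] == "write":
            output.append(variables.get(op[1], 0))
        pc += 1
    return output

# Addition: move Y into X one unit at a time. W is set once and stays
# nonzero, so `goto_if W` serves as an unconditional jump.
add_prog = [
    ("read", "X"),          # 0
    ("read", "Y"),          # 1
    ("inc", "W"),           # 2
    ("goto_if", "Y", 5),    # 3: if Y != 0, enter the loop body
    ("goto_if", "W", 8),    # 4: unconditional: exit the loop
    ("dec", "Y"),           # 5
    ("inc", "X"),           # 6
    ("goto_if", "W", 3),    # 7: unconditional: back to the test
    ("write", "X"),         # 8
]
```

Even addition already requires the trick of a permanently nonzero variable for unconditional jumps, which gives some feel for how a full TM simulation in this language would proceed.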
Finally, this language, or even a less restricted programming language, can com-
pute only Turing-computable functions. This might be shown directly, by simulating
on a TM each feature of the language, and in this way building a TM to execute a
program in the language. It can also be shown with the help of Theorem 12.8. Just as
a TM configuration can be described by specifying a state, a tape head position, and a
EXERCISES
12.1. Let f : N → N be the function defined as follows: f(n) is the maximum
number of moves an n-state TM with tape alphabet {0, 1} can make if it
starts with input 1” and eventually halts. Show that f is not computable.
12.2. Define f : N → N by letting f(n) be the maximum number of 1's that an
n-state TM with no more than n tape symbols can leave on the tape,
assuming that it starts with input 1” and always halts. Show that f is not
computable.
12.3. Show that the uncomputability of the busy-beaver function (Example 12.1)
implies the unsolvability of the halting problem.
12.4. Suppose we define bb(n) to be the maximum number of 1’s that can be
printed by an n-state Turing machine with tape alphabet {0, 1}, assuming it
starts with a blank tape and eventually halts. Show that bb is not
computable.
12.5. Show that if f : N → N is a total function, then f is computable if and
only if the decision problem: Given natural numbers n and C, is
f(n) > C? is solvable.
12.6. Suppose that instead of including all constant functions in the set of initial
functions, C₀⁰ were the only constant function included. Describe what the
set PR obtained by Definition 12.4 would be.
12.7. Suppose that in Definition 12.4, the operation of composition is allowed but
that of primitive recursion is not. What functions are obtained?
12.8. If g(x) =x and h(x, y, z) = z+ 2, what function is obtained from g and h
by primitive recursion?
12.9. Here is a primitive recursive derivation. f₀ = C₀¹; f₁ = C₀³; f₂ is obtained
from f₀ and f₁ by primitive recursion; f₃ = p₂³; f₄ is obtained from f₂ and
f₃ by composition; f₅ = C₁⁰; f₆ is obtained from f₅ and f₄ by primitive
recursion; f₇ = p₁¹; f₈ = p₃³; f₉ = s; f₁₀ is obtained from f₉ and f₈ by
composition; f₁₁ is obtained from f₇ and f₁₀ by primitive recursion;
f₁₂ = p₁²; f₁₃ is obtained from f₆ and f₁₂ by composition; f₁₄ is obtained
from f₁₁, f₁₂, and f₁₃ by composition; and f₁₅ is obtained from f₅ and f₁₄
by primitive recursion. Give simple formulas for f₂, f₆, f₁₄, and f₁₅.
12.10. Find two functions g and h so that the function f defined by f(x) = x² is
obtained from g and h by primitive recursion.
12.11. Give complete primitive recursive derivations for each of the following
functions.
a. f : N² → N defined by f(x, y) = 2x + 3y
b. f : N → N defined by f(n) = n!
c. f : N → N defined by f(n) = 2^n
d. f : N → N defined by f(n) = n² - 1
e. f : N² → N defined by f(x, y) = |x - y|
12.12. Show that for any n ≥ 1, the functions Addₙ and Multₙ from Nⁿ to N
defined by

Addₙ(x₁, x₂, ..., xₙ) = x₁ + x₂ + ··· + xₙ
Multₙ(x₁, ..., xₙ) = x₁ * x₂ * ··· * xₙ

respectively, are both primitive recursive.
12.13. Show that if f : N → N is primitive recursive, A ⊆ N is a finite set, and
g is a total function agreeing with f at every point not in A, then g is
primitive recursive.
12.14. Show that if f : N → N is an eventually periodic total function, then f is
primitive recursive. Eventually periodic means that for some n₀ and some
p > 0, f(x + p) = f(x) for every x ≥ n₀.
12.15. Show that each of the following functions is primitive recursive.
a. f : N² → N defined by f(x, y) = max{x, y}
b. f : N² → N defined by f(x, y) = min{x, y}
c. f : N → N defined by f(x) = ⌊√x⌋ (the largest natural number less
than or equal to √x)
d. f : N → N defined by f(x) = ⌊log₂(x + 1)⌋
12.16. Suppose P is a primitive recursive (k + 1)-place predicate, and f and g are
primitive recursive functions of one variable. Show that the predicates
A_{f,g}P and E_{f,g}P defined by

A_{f,g}P(X, k) = (for every i with f(k) ≤ i ≤ g(k), P(X, i))
E_{f,g}P(X, k) = (there exists i with f(k) ≤ i ≤ g(k) so that P(X, i))

are both primitive recursive.
12.17. Show that if g : N² → N is primitive recursive, then f : N → N defined
by

x
12.18. Show that the function HighestPrime, introduced in the proof of Lemma
12.4, is primitive recursive.
12.19. In addition to the bounded minimalization of a predicate, we might define
the bounded maximalization of a predicate P to be the function m_P defined
by
m_P(X, k) = max{y ≤ k | P(X, y) is true}   if this set is not empty
m_P(X, k) = 0                              otherwise
b. f is defined by f(x) = ax
f is defined by f(x) = x’
12.36. a. Give definitions of primitive recursive and recursive for a function
f : (Σ*)ⁿ → Σ* of n string variables.
b. Using your definition, show that the concatenation function from (Σ*)²
to Σ* is primitive recursive.
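Exercises 12.11 and 12.12 become easier to experiment with if the initial functions and the two combining operations are modeled directly. The following Python sketch is ours, not the text's; the names zero, succ, proj, compose, and prim_rec are assumptions chosen to mirror the chapter's definitions.

```python
# A sketch of primitive recursive derivations in Python.
# Initial functions: the zero function, the successor, and the projections.

def zero(*args):
    return 0

def succ(x):
    return x + 1

def proj(i):
    """The projection p_i: return the i-th argument (1-based)."""
    return lambda *args: args[i - 1]

def compose(f, *gs):
    """Composition: (f o (g1,...,gk))(X) = f(g1(X), ..., gk(X))."""
    return lambda *args: f(*(g(*args) for g in gs))

def prim_rec(g, h):
    """Primitive recursion:
       f(X, 0)     = g(X)
       f(X, k + 1) = h(X, k, f(X, k))"""
    def f(*args):
        *X, k = args
        acc = g(*X)
        for i in range(k):
            acc = h(*X, i, acc)
        return acc
    return f

# Addition: Add(x, 0) = x, Add(x, k+1) = succ(Add(x, k)).
add = prim_rec(proj(1), compose(succ, proj(3)))

# Multiplication: Mult(x, 0) = 0, Mult(x, k+1) = Add(x, Mult(x, k)).
mult = prim_rec(zero, compose(add, proj(1), proj(3)))
```

For instance, add(3, 4) iterates the recursion four times starting from 3, and mult is built from add in exactly the way the exercises require.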
Introduction to
Computational Complexity
So far in our discussion of decision problems, we have considered only the quali-
tative question of whether the problem is solvable. In real life, the computational
resources available for solving problems are limited: There is only so much time and
space available. As a result, there are problems solvable in principle for which even
medium-sized instances are too hard in practice. In Part VI, we consider the idea of
trying to identify these intractable problems, by describing in some way the amounts
of computer time and memory needed in order to answer instances of a certain size.
In Chapter 13, we first introduce notation and terminology involving growth
rates that allow us to discuss in a meaningful way questions such as “How much
time?” In the rest of the chapter, we relate these quantitative issues to our specific
model of computation and then discuss some of the basic complexity classes: ways of
categorizing decision problems and languages according to their inherent complexity.
In trying to distinguish between tractable and intractable problems, a criterion
commonly used is polynomial-time solvability on a Turing machine. Many prob-
lems are tractable according to this criterion, and some can be shown not to be;
we can adapt the reduction technique of Chapter 11 in order to obtain examples of
both types. Perhaps even more interesting are problems whose status with respect
to this criterion is still up in the air—problems for which no one has found either a
polynomial-time solution or a proof that none exists. The notion of NP-completeness
is a way of approaching this topic. In the last two sections, we give Cook’s proof of the
NP-completeness of the satisfiability problem and several examples of combinatorial
problems that can be shown to be NP-complete by using polynomial-time reductions.
CHAPTER 13
Measuring and Classifying Complexity
482 PART 6 Introduction to Computational Complexity
n        2n²        n² + 3n + 7    n³               2ⁿ
2        8          17             8                4
5        50         47             125              32
10       200        137            1000             1024
20       800        467            8000             1,048,576
50       5000       2657           125,000          1.13 × 10¹⁵
100      20,000     10,307         1,000,000        1.27 × 10³⁰
1000     2,000,000  1,003,007      1,000,000,000    1.07 × 10³⁰¹
exponential function. Once we say how to talk precisely about growth rates, one of the most
useful distinctions will be that between polynomial growth rates and exponential growth rates.
Many of the features evident in this example will persist in general.
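The kind of tabulation used in Example 13.1 is easy to regenerate; the column functions below (two quadratics, a cubic, and an exponential) are our reading of the example and should be treated as an illustration rather than a transcript.

```python
# Tabulate two quadratics, a cubic, and an exponential for increasing n.

def growth_row(n):
    return (n, 2 * n**2, n**2 + 3*n + 7, n**3, 2**n)

print(f"{'n':>5} {'2n^2':>9} {'n^2+3n+7':>9} {'n^3':>13} {'2^n':>11}")
for n in [2, 5, 10, 20, 50]:
    n, f1, f2, f3, f4 = growth_row(n)
    print(f"{n:>5} {f1:>9} {f2:>9} {f3:>13} {f4:>11.4g}")
```

Both quadratics stay close together, the cubic pulls ahead of them, and the exponential eventually dwarfs all three.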
The simplest situation in which two functions f and g will be said to have
the same growth rate is when f is exactly proportional to g, or f = Cg for some
constant C. (The size of C is irrelevant, as long as it is independent of n.) Because
it is unusual for two runtimes to be exactly proportional, we generalize by allowing
one function to be approximately proportional to the other, which means that f < Cg
for some constant C and f > Dg for some other (positive) constant D. Again, the
sizes of C and D are not relevant. The first part of Definition 13.1 involves a single
inequality, so that we can talk about one growth rate being no greater than another; in
order to consider functions with equal growth rates we can simply use the statement
twice, the second time with the two functions reversed. The other way in which
the definition generalizes the simplest case is that it allows the inequality to fail at a
finite set of values of n, to take care of the case when functions are undefined or have
unrepresentative values at a few points.
The statements f = O(g), f = Θ(g), and f = o(g) are read “f is big-oh of g,”
“f is big-theta of g,” and “f is little-oh of g,” respectively. All these statements can
be rephrased in terms of the ratio f(n)/g(n), provided that g(n) is eventually greater
than 0. Saying that f = o(g) means that the limit of this ratio as n approaches infinity
is 0; the statement f = O(g) means only that the ratio is bounded. If f = Θ(g), and
both functions are eventually nonzero, then both the ratios f/g and g/f are bounded,
which is the same as saying that the ratio f/g must stay between two fixed positive
values (or is “approximately constant”). If the statement f = O(g) fails, we write
f ≠ O(g), and similarly for the other two. Saying that f ≠ O(g) means that it is
impossible to find a constant C so that f(n) ≤ Cg(n) for all sufficiently large n; in
other words, the ratio f(n)/g(n) is unbounded. This means that although the ratio
f(n)/g(n) may not be large for all large values of n, it is large for infinitely many
values of n.
A statement like f = O(g) describes a relationship between two functions. It
is not an equation, and it makes no sense, for example, to write O(f) = g. The
notation is fairly well-established, although a variation that is a little more precise is
to define a set O(g) and to write f € O(g) instead of f = O(g).
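The ratio characterization lends itself to a quick numerical sanity check (an illustration, not a proof); the helper below simply samples f(n)/g(n) at large n.

```python
# Sample the ratio f(n)/g(n) to illustrate f = Theta(g), f = o(g),
# and f != O(g): the ratio is approximately constant, tends to 0,
# or is unbounded, respectively.

import math

def ratios(f, g, ns):
    return [f(n) / g(n) for n in ns]

ns = [10**k for k in range(1, 7)]

# f(n) = 5n^2 + n is Theta(n^2): the ratio settles near the constant 5.
print(ratios(lambda n: 5*n**2 + n, lambda n: n**2, ns))

# f(n) = n log n is o(n^2): the ratio tends to 0.
print(ratios(lambda n: n * math.log(n), lambda n: n**2, ns))

# f(n) = n^3 is not O(n^2): the ratio grows without bound.
print(ratios(lambda n: n**3, lambda n: n**2, ns))
```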
The statement f = O(g) conveys no information about the values of f(n) and
g(n) for any particular n. It says that in the long run (that is, for sufficiently large
values of n), f(n) is no larger than a function proportional to g. The constant of
proportionality may be very large, so that the actual value of f(n) may be much
larger than g(n). Nevertheless, we say in this case that the growth rate of f is no
larger than that of g. The terminology is most appropriate when the two functions f
and g are both nondecreasing, or at least eventually nondecreasing, and most of the
functions we are interested in will have this property. If f = Θ(g), we say f and
g have the same growth rate; as we have seen, this is a way of saying that the two
values are approximately proportional, or that the ratio is approximately constant, for
large values of n. If f = o(g), it is appropriate to say that the growth rate of f is
smaller than that of g, because according to the definition, no positive constant is small
enough to remain a constant of proportionality as n gets large. In this case, although
f (n) will eventually be smaller than g(n), in fact much smaller, the statement says
nothing about how large n must be before this happens.
We have introduced the idea of two functions having the same growth rate, or of
one having a smaller growth rate than another. This terminology is not misleading,
in the sense that these two relations (on the set of partial functions from N to N
defined for all sufficiently large n) satisfy at least most of the crucial properties we
associate with the corresponding relations on numbers. It is clear from the definitions
that if f = o(g), then f = O(g), so that “smaller than” implies “no larger than” in
reference to growth rates. We would also expect that if f = o(g), then g ≠ O(f)
(if the growth rate of f is smaller than that of g, then it cannot be true that the growth
rate of g is no larger than that of f), and this is easy to check. Theorem 13.1 contains
several other straightforward properties of these relations.
where a_k ≠ 0. In the latter case, k is the degree of the polynomial. It is easy to check
that if the leading coefficient a_k is positive, then p(n) > 0 for all sufficiently large n.
Usually we will be interested only in polynomials having this property, and we may
CHAPTER 13 Measuring and Classifying Complexity 485
g(n) = aⁿ
for some fixed number a > 1. If the coefficients a_i of a polynomial or the base a of an
exponential function are not integers, we may obtain an integer function by ignoring
the fractional part; in both cases, the growth rate is not affected.
On the basis of Example 13.1, we would probably conjecture that any quadratic
polynomial has a smaller growth rate than any cubic, and that either one has a smaller
growth rate than an exponential function. The next theorem generalizes both these
statements.
A tabulation such as the one in Example 13.1 would obviously look different
for polynomials of different degrees and for exponential functions with a different
base. At n = 1000, for example, (1.01)ⁿ is still only about 20959. However, the
“exponential growth” is still there, as the second part of the theorem confirms; it just
takes a little longer for its effects to become obvious. When n = 10000, (1.01)ⁿ is
greater than 10⁴³, whereas (10000)³ = 10¹², or one trillion.
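These particular values are easy to confirm directly, for example in Python:

```python
# Confirming the comparison of (1.01)^n with n^3 at n = 1000 and n = 10000.
print(round(1.01 ** 1000))      # roughly 20959
print(1.01 ** 10000 > 1e43)     # the exponential has pulled far ahead
print(10000 ** 3 == 10 ** 12)   # the cubic is only at one trillion
```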
4k² + 5k + 4 = n² + 5n/2 + 4
moves. Because odd-length input strings require fewer moves, we may conclude that
t_T(n) = O(n²)
Figure 13.1
A Turing machine to accept {ss | s ∈ {a, b}*}.
We might ask whether there is a TM accepting this language with a significantly smaller time
complexity. A complete answer to this question is a little complicated. It looks at first as though
the number of actual comparisons a TM must make in order to recognize a string in the language
should be simply proportional to the string’s length. The quadratic behavior of the machine in
Figure 13.1 is the result of the repeated back-and-forth motions of the tape head, which seem
to be necessary in phases one and three. There are ways to reduce the number of these motions
without changing the overall approach. For example, in the first phase we could make the TM
convert two symbols at the left end to uppercase, then two at the right, and so forth, and in the
final matching phase we could ask the TM to remember two symbols from the first half before
it sends its tape head to the second half to try to match them. These improvements ought to cut
the back-and-forth motion almost in half, at the expense of an increase in the number of states.
Furthermore, if going from one to two is good, going from one to a number bigger than two
ought to be even better. (If there were some way to get the TM to remember the entire first
half of the input in the third phase, we could eliminate the extraneous motions in that phase
altogether. There is no way to do this, unfortunately, because the number of states is finite.)
Any Turing machine that looks at all n input symbols must make at least n moves. For
this language, any TM following an approach similar to that in Figure 13.1 will probably need
at least twice that many. Our tentative conclusion from the preceding paragraph is that we may
be able to reduce the runtime beyond that bare minimum by an arbitrarily large factor. This
conclusion turns out to be correct, not only for this example but in general. By increasing the
number of states and/or tape symbols, one can produce a comparable “linear speed-up” of any
Turing machine, and there are similar results for space complexity. We may conclude that the
number of moves a TM makes or the number of tape squares it uses on a particular input string
is not by itself very meaningful; a more significant indicator is the growth rate of the function.
Reducing the runtime by a large constant factor still leaves us with a quadratic growth rate,
and it can be shown that the time complexity of any one-tape TM accepting this language is
quadratic or higher.
Another way to reduce the runtime would be to increase the number of tapes. In this
example, using a two-tape TM makes it possible to recognize the language in linear runtime
by avoiding back-and-forth motions altogether (Exercise 13.22). In more general examples,
this is too much to expect, and the growth rate of the runtime can be reduced in this way only
to about the square root of the original (Exercise 13.25).
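The linear-time idea behind the two-tape machine (copy the first half, then compare it to the second half in one pass) can be sketched outside the TM model; the helper name below is ours.

```python
# Check membership in {ss | s in {a, b}*} in linear time, in contrast to
# the quadratic one-tape TM of Figure 13.1.

def is_square_string(x: str) -> bool:
    n = len(x)
    if n % 2 != 0:
        return False
    half = n // 2
    # With a second tape, the machine would copy the first half and then
    # compare it to the second half in one pass; slicing plays that role here.
    return x[:half] == x[half:]
```

Each input symbol is examined only a constant number of times, which is exactly what the back-and-forth head motion of a one-tape machine cannot achieve.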
As Example 13.2 has suggested, looking at the growth rate of a TM’s time
complexity tells us something about the efficiency of the algorithm embodied by the
machine. Of two TMs recognizing the same language (with the same number of tapes),
the one for which the growth rate of the time complexity is smaller will be preferable,
at least for sufficiently long input strings. We can now turn this statement around in
order to compare the complexity, or difficulty, of two languages. If recognizing L₁ can
be accomplished by a k-tape TM with time complexity f₁, and the time complexity
of any k-tape TM recognizing L₂ grows faster than f₁, it is reasonable to say that L₂
is more complex, or difficult to recognize, than L₁.
You might not expect that the time or space complexity of a nondeterministic
Turing machine would be a useful concept; after all, such a machine is allowed to
take shortcuts by making guesses. However, many of the most interesting decision
problems have solutions that can be described easily by using nondeterminism, and
we will see that adding this ingredient will be helpful in categorizing languages, or
decision problems, according to their complexity.
= {x ∈ {a, b}* | for some k ≥ 2 and some w with |w| ≥ 1, x = wᵏ}
|w| = 2, ..., or for a string w with length |x|/2. The machine in Figure 13.2 uses nondeterminism
instead. Let us temporarily ignore the first component, Place($), and concentrate on
the remaining parts, which also begin with the tape head on square 0. The machine moves
past the input string, generates an arbitrary string w on the tape, makes an identical copy of w,
makes an arbitrary number of additional copies, deletes all the blanks that separate the copies,
and compares the resulting string to the original input. The nondeterminism appears both in
the construction of w and in the choice of the number of copies made. If the input string is
of the form wᵏ, where |w| ≥ 1 and k ≥ 2, then the sequence of moves in which that string
is generated on the tape causes the TM to accept; because the string generated is of this form,
any input that is not causes the final comparison to fail and the TM to reject.
The purpose of Place($) is to prevent the possibility of infinite loops, which would oth-
erwise be possible in two places: in the generation of w or in the production of additional
copies of w. Place($) places the special marker in square 3n + 2, where n is the length of the
input. Thereafter, since no moves are specified for the tape symbol $, the machine crashes if
the tape head ever moves that far right on the tape. The number 3n + 2 is chosen to allow for
the extreme case in which |w| = 1 and n copies of w are generated, so that the copies and the
blanks separating them require a total of 2n tape squares.
Although calculating the exact nondeterministic time complexity of this machine is com-
plicated, finding a big-oh answer is not so hard. If the string w that is generated has length m,
and k copies are made, then in order for the input string of length n to be accepted, km must
be n. From this point on we can argue as follows. Moving past the input and generating w
takes time roughly proportional to n + m. Creating each additional copy of w requires time
proportional to m?, and this occurs k — 1 times, so that this total is proportional to km? = mn.
The first deletion of a blank takes time proportional to m, the second requires time proportional
to 2m, ..., and the last requires time proportional to (k — 1)m; the total of these is therefore
proportional to k?m = kn. Finally, still assuming that the input and the string created non-
deterministically have the same length n, comparing them takes time proportional to n?. It is
now clear that the overall time complexity is O(n²).
Figure 13.2
The space complexity of our nondeterministic Turing machine is easy to calculate. The
rightmost tape square visited is the one just to the right of the last copy of w that is created.
Again, we are only interested in the case when the input string is accepted, so that the total
amount of space used (including the space required for the input) is n + 2 + k(m + 1) =
2n + k + 2, where as before, k is the number of copies of w that are created. Because k is no
larger than n, we conclude that the TM has linear space complexity.
Example 13.3 illustrates in a simple way something that we will often see later.
To decide whether x is in this language, we must answer the question: Do there exist
a string w and an integer k so that x = wᵏ? A deterministic algorithm to answer
the question consists of testing every possible choice of w and k to see if any choice
works. Although nondeterminism can eliminate the “every possible choice” aspect
of the algorithm by guessing, the TM must still test deterministically the choice it
guesses. Many interesting decision problems have the same general form: Is there
some number (or some choice of numbers, or some path, ...) that works? The obvious
nondeterministic approach to such a problem is to choose values by guessing and then
test them to see if they work. If T is a nondeterministic TM answering the question
in this way, we can interpret the time complexity t_T informally by thinking of t_T(n)
as measuring the time required to test a possible solution. The assumption is that the
time required for guessing a solution to test is relatively small.
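For this particular example, the deterministic “try every possible choice” algorithm is short enough to state explicitly; the following sketch (our own naming) tests each candidate length m for w, which must divide n if x = w^(n/m).

```python
# Deterministic counterpart of the NTM of Example 13.3: decide whether
# x = w^k for some w with |w| >= 1 and some k >= 2 by testing every
# candidate length of w instead of guessing it.

def is_nontrivial_power(x: str) -> bool:
    n = len(x)
    for m in range(1, n // 2 + 1):      # candidate |w|; then k = n/m >= 2
        if n % m == 0 and x == x[:m] * (n // m):
            return True
    return False
```

Nondeterminism removes the outer loop over candidates but not the final deterministic comparison, which is the point made above.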
Step-counting Functions
halts when T halts if that happens first. In other words, T can be used as a clock in
conjunction with other machines. In a similar way, if we want to constrain the space
resources of a TM T during its computation, we can use a step-counting function f
in order to mark a space of exactly f(n) squares to be used by T.
It is obvious from the definition that if f is a step-counting function, then f can
be computed by a TM in such a way that the number of steps in the computation of
f(n) is essentially f(n). It is also true, though much less obvious, that a relaxed
form of this condition is still sufficient for a function to be a step-counting function.
By combining Theorems 13.5 and 13.6, we see in particular that for a step-
counting function f, NTime(f) ⊆ Time(C^f) for some constant C. In fact, this
result does not require the assumption that f be step-counting, and it can easily be
obtained directly from the proof of Theorem 9.2. In that proof, we constructed a
Turing machine to try all possible finite sequences of moves of a given NTM. The
number of sequences of k or fewer moves is simply the number of nodes in the first k
levels of the “computation tree.” Because there is an upper bound on the number of
moves possible at each step, this number is bounded by cᵏ for some constant c, and
the result follows without difficulty.
EXERCISES
13.1. Suppose f, g, h, k : N → N.
a. Show that if f = O(h) and g = O(k), then f + g = O(h + k) and
fg = O(hk).
b. Show that if f = o(h) and g = o(k), then f + g = o(h + k) and
fg = o(hk).
c. Show that f + g = O(max(f, g)).
13.2. Let f₁(n) = 2n³, f₂(n) = n³ + 3n^(5/2), f₃(n) = n³/log n, and
f₄(n) = 2n³ if n is even, … if n is odd.
For each of the twelve combinations of i and j, determine whether
fᵢ = O(f_j), and whether fᵢ = o(f_j), and give reasons.
13.3. Suppose f is a total function from N to N and that f = O(p) for some
polynomial function p. Show that there are constants C and D so that
f(n) ≤ Cp(n) + D for every n.
13.4. a. Show that each of the functions n!, nⁿ, and 2^(n²) has a growth rate greater
than that of any exponential function.
b. Of these three functions, which has the largest growth rate and which
the smallest?
13.5. a. Show that each of the functions 2^√n and n^(log n) has a growth rate greater
than that of any polynomial and less than that of any exponential
function.
b. Which of these two functions has the larger growth rate?
13.6. Classify the function (log n)^(log n) with respect to its growth rate (polynomial,
exponential, in-between, etc.)
13.7. Give a proof of the second statement in Theorem 13.2 that does not use
logarithms. One way to do it is to write n = n₀ + m, and to consider the
formula
((n₀ + 1)/n₀)ᵏ ((n₀ + 2)/(n₀ + 1))ᵏ ··· ((n₀ + m)/(n₀ + m − 1))ᵏ
13.9. Find the time complexity function for each of these TMs:
a. The TM in Example 9.2 accepting the language of palindromes over
{0, 1}.
b. The Copy TM shown in Figure 9.12.
13.10. Show that if L can be recognized by a TM T with a doubly infinite tape,
and t_T = f, then L can be recognized by an ordinary TM with time
complexity O(f).
13.11. Show that for any solvable decision problem, there is a way to encode
instances of the problem so that the corresponding language can be
recognized by a TM with linear time complexity.
13.12. Show that if f and g are step-counting functions, then so are f + g, f ∗ g,
f ∘ g, and 2^f.
13.13. Show that any polynomial with positive integer coefficients and a nonzero
constant term is a step-counting function.
13.14. Show that the following decision problem is unsolvable: Given a Turing
machine T and a step-counting function f, is the language accepted by T
in Time(f)?
13.15. Is the problem “Given a Turing machine T, is t_T(n) ≤ 2n?” solvable or
unsolvable? Give reasons.
13.16. Suppose s is a step-counting function satisfying s(n) ≥ n. Let L be a
language accepted by a (multitape) TM T, and suppose that the tape heads
of T do not move past square s(n) on any of the tapes for an input string of
length n. Show that L ∈ Space(s). (Note: the reason it is not completely
obvious is that T may have infinite loops. Use the fact that if during a
computation of T some configuration repeats, then T is in an infinite loop.)
13.17. If T is a TM recognizing L, and T reads every symbol in the input string,
then t_T(n) ≥ 2n + 2. Show that any language that can be accepted by a
TM T with t_T(n) = 2n + 2 is regular.
13.18. Suppose L₁, L₂ ⊆ Σ*, L₁ ∈ Time(f₁), and L₂ ∈ Time(f₂). Find functions
g and h so that L₁ ∪ L₂ ∈ Time(g) and L₁ ∩ L₂ ∈ Time(h).
13.19. As we mentioned in Section 13.3, we might consider an alternate Turing
machine model, in which there is an input tape on which the tape head can
move in both directions but cannot write, and one or more work tapes, one
of which serves as an output tape. For a function f, denote by DSpace(f)
the set of languages that can be recognized by a Turing machine of this type
which uses no more than f (n) squares on any work tape for any input
string of length n. The only restriction we need to make on f is that
f(n) > 0 for every n. Show that both the language of palindromes over
{0, 1} and the language of balanced strings of parentheses are in
DSpace(1 + ⌈log₂(n + 1)⌉). (⌈x⌉ means the smallest integer greater than or
equal to x.)
CHAPTER 14 Tractable and Intractable Problems 501
The sets P and PSpace include any language that can be recognized by a TM with
time complexity or space complexity, respectively, bounded by some polynomial.
We may speak informally of decision problems being in P or PSpace, provided
we keep in mind our earlier comments about reasonable encoding methods. Saying
that the tractable problems are precisely those in P cannot be completely correct. For
example, if two TMs solving the same problem had time complexities (1.000001)ⁿ
and n¹⁰⁰⁰, respectively, one might prefer to use the first machine on a typical problem
instance, in spite of Theorem 13.2. However, in real life, “polynomial” is more likely
to mean n² or n³ than n¹⁰⁰⁰, and “exponential” normally turns out to be 2ⁿ rather than
(1.000001)ⁿ.
Another point in favor of the polynomial criterion is that it seems to be invariant
among the various models of computation. Changing the model can change the time
complexity, but by no more than a polynomial factor; roughly speaking, if a problem
can be solved in polynomial time on some computer, then it is in P.
It is obvious from Theorem 13.5 that P ⊆ PSpace and NP ⊆ NPSpace. The
following result, which follows easily from the theorems in Section 13.3, describes
some other simple relationships among these four sets. In particular, having defined
NPSpace, we can now forget about it.
One would assume that allowing nondeterminism should allow dramatic im-
provements in the time complexity required to solve some problems. An NTM, as we
have seen in Section 13.2, has an apparently significant advantage over an ordinary
TM, because of its ability to guess. The quadratic time complexity of the NTM in
Example 13.3 reflected the fact that we can test a proposed solution in quadratic time;
it does not obviously follow that we can test all solutions within quadratic time. As we
observed in Section 13.3, replacing a nondeterministic TM by a deterministic one as
in the proof of Theorem 9.2 can cause the time complexity to increase exponentially;
in fact, this is true of all the general methods known for eliminating nondeterminism
from a TM. It is therefore not surprising that there are many languages, or decision
problems, in NP for which no deterministic polynomial-time algorithms are known.
In other words, there are languages in NP that are not known to be in P.
What is surprising is that there are no languages in NP that are known not to
be in P. Although the most reasonable guess is that P is a proper subset of NP, no
one has managed to prove this statement. It is not for lack of trying; the problems
known to be in NP include a great many interesting problems that have been studied
intensively for many years, for which researchers have tried without success to find
either a polynomial-time algorithm or a proof that none exists. A general rule of
thumb is that finding good lower bounds is harder than finding upper bounds. In
principle, it is easier to exhibit a solution to a problem than to show that the problem
has no efficient solutions. In any case, the P = NP question is one of the outstanding
open problems in theoretical computer science.
Whether the second inclusion in Theorem 14.1 is a strict inclusion is also an
open question. We can summarize both these questions by saying that the role of
nondeterminism in the description of complexity is not thoroughly understood.
In the last part of this section, we will study two problems in NP that are interesting
for different reasons. The first can easily be shown to be in NP; however, we will see
in the next section that it is, in a precise sense, a hardest problem in NP. The second
also turns out to be in NP, though not obviously so, since it does not seem to fit the
“guess a solution and test it in polynomial time” pattern.
C₁ ∧ C₂ ∧ ··· ∧ C_c
of subexpressions Cᵢ, each of which is a disjunction (i.e., formed with ∨’s) of literals. For
example,
(x₁ ∨ x̄₃ ∨ x₄) ∧ (x̄₁ ∨ x₃) ∧ (x̄₁ ∨ x₄ ∨ x̄₂) ∧ x₃ ∧ (x̄₂ ∨ x₄)
is a CNF expression with five conjuncts. In general, one variable might appear more than once
within a conjunct, and a conjunct itself might be duplicated. We do, however, impose the extra
requirement that for some v ≥ 1, the distinct variables be precisely x₁, x₂, ..., x_v, with none
left out.
You can verify easily that the expression above is satisfied—made true—by the truth
assignment
The CNF-satisfiability problem (CNF-Sat for short) is this: Given an expression in conjunctive
normal form, is there a truth assignment that satisfies it?
We can encode instances of the CNF-satisfiability problem in a straightforward way,
omitting parentheses and ∨’s and using unary notation for variable subscripts. For example,
the expression
n ≤ k(v + 1) + c ≤ k² + 2k
The first inequality depends on the fact that the strings x1ⁱ and x̄1ʲ all have length ≤ v + 1,
and the second is true because v and c are both no larger than k. This relationship between n
and k implies that any polynomial in n is bounded by a polynomial in k, and k seems like a
reasonable measure of the size of the problem instance. Therefore, if CNF-Satisfiable ∈ NP, it
makes sense to say that the decision problem CNF-Sat is in NP.
We can easily describe in general terms the steps a one-tape Turing machine T needs
to follow in order to accept CNF-Satisfiable. The first step is to verify that the input string
represents a valid CNF expression in which the variables are precisely x₁, x₂, ..., x_v for some
v. Assuming the string is valid, T attempts to satisfy the expression, keeping track as it
proceeds of which conjuncts have been satisfied so far and which variables within the unsatisfied
conjuncts have been assigned values. The iterative step consists of finding the first conjunct
not yet satisfied; choosing a literal within that conjunct that has not been assigned a value (this
is the only place where nondeterminism is used); giving the variable in that literal the value
that satisfies the conjunct; marking the conjunct as satisfied; and giving the same value to all
subsequent occurrences of that variable in unsatisfied conjuncts, marking any conjuncts that
are satisfied as a result. The loop terminates in one of two ways. Either all conjuncts are
eventually satisfied, or the literals in the first unsatisfied conjunct are all found to have been
falsified. In the first case T accepts, and in the second it rejects. If the expression is satisfiable,
and only in this case, the correct choice of moves causes T to guess a truth assignment that
works.
The TM T can be constructed so that, except for a few steps that take time proportional to
n, all its actions are minor variations of the following operation: Begin with a string of 1’s in the
input, delimited at both ends by some symbol other than 1, and locate some or all of the other
occurrences of this string that are similarly delimited. We leave it to you to convince yourself
that a single operation of this type can be done in polynomial time, and that the number of such
operations that must be performed is also no more than a polynomial. (The nondeterministic
time complexity of T is O(n); see Exercise 14.2.) Our conclusion is that CNF-Satisfiable,
and therefore CNF-Sat, is in NP.
The number of distinct truth assignments to an expression with j distinct variables is 2ʲ.
Although this fact does not by itself imply that the decision problem is not in P, it tells us
that the brute-force approach of trying all solutions will not be helpful in attempting to find a
polynomial-time algorithm. We will return to CNF-Sat in the next section.
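The brute-force approach is easy to write down, which makes the 2ʲ behavior concrete. The list-of-signed-integers encoding below is a common convention of our choosing, not the unary string encoding used in the text.

```python
# Brute force over all 2^j truth assignments for a CNF expression,
# represented as a list of conjuncts, each conjunct a list of nonzero
# integers: i stands for x_i and -i for the negation of x_i.
from itertools import product

def cnf_satisfiable(conjuncts, num_vars):
    for bits in product([False, True], repeat=num_vars):
        def value(lit):
            return bits[abs(lit) - 1] if lit > 0 else not bits[abs(lit) - 1]
        if all(any(value(lit) for lit in c) for c in conjuncts):
            return True
    return False

# (x1 v x2) ^ (not-x1 v x2) ^ (x1 v not-x2) is satisfied by x1 = x2 = True.
print(cnf_satisfiable([[1, 2], [-1, 2], [1, -2]], 2))   # True
# x1 ^ not-x1 is unsatisfiable.
print(cnf_satisfiable([[1], [-1]], 1))                  # False
```

The outer loop is the exponential part; the test inside it runs in time polynomial in the size of the expression, which is exactly the shape of problem the nondeterministic machine above exploits.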
used instead, because the number of binary digits needed to encode n is only about log n. Let
us agree that for this problem “polynomial-time solution” means polynomial, not in n, but in
log n, the length of the input string. In particular, therefore, the algorithm in which we actually
test all possible divisors up to √n is not helpful. Even if we could test each divisor in constant
time, the required time would be proportional to √n, which is not bounded by any polynomial
function of log n.
This problem also seems to illustrate the importance of whether a problem is posed positively or negatively. The composite decision problem: Given an integer n > 1, is it composite (i.e., nonprime)? has a simple nondeterministic solution—namely, guess a possible factorization n = p · q and test by multiplying that it is correct—and therefore is in NP. The primality
problem is not obviously in NP, since there is no obvious way to “guess a solution.” At the
language level, the fact that a language is in NP does not immediately imply that its complement
is (Exercises 14.3 and 14.4).
To see that the primality problem is in NP, we need some facts from number theory. First recall from Chapter 1 the congruence-mod-n relation ≡_n, defined by a ≡_n b if and only if a − b is divisible by n. Of the two facts that follow, the first is due to Fermat, and we state it without proof.

1. A positive integer n is prime if and only if there is a number x, with 1 ≤ x ≤ n − 1, satisfying x^(n−1) ≡_n 1, and for every m with 1 ≤ m < n − 1, x^m ≢_n 1.
2. If n is not prime, then for any x with 0 ≤ x ≤ n − 1 satisfying x^(n−1) ≡_n 1, we must also have x^((n−1)/p) ≡_n 1 for some p that is a prime factor of n − 1.
We can check the second statement without too much trouble. If x^(n−1) ≡_n 1 and n is not prime, then by statement 1, x^m ≡_n 1 for some m < n − 1. We observe that the smallest such m must be a divisor of n − 1. The reason is that when we divide n − 1 by m, we get a quotient q and a remainder r, so that

n − 1 = q·m + r    and    0 ≤ r < m

This means that x^(n−1) = x^(q·m+r) = (x^m)^q · x^r, and because x^(n−1) and x^m are both congruent to 1 mod n, we must have (x^m)^q ≡_n 1 and therefore x^r ≡_n 1. It follows that the remainder r must be 0, because r < m and by definition m is the smallest positive integer with x^m ≡_n 1. Therefore, n − 1 is divisible by m.

Now, any proper divisor m of n − 1 is of the form (n − 1)/j, for some j > 1 that is a product of (one or more) prime factors of n − 1. Therefore, some multiple of m, say a·m, is (n − 1)/p for a single prime p. Because x^(a·m) = (x^m)^a ≡_n 1, statement 2 follows.
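Both statements can be checked numerically for small n. The sketch below searches exhaustively for an x as in statement 1 (exponential time, for illustration only; the function name is our own):

```python
def satisfies_statement_1(n):
    """Exhaustive check of statement 1: is there an x, 1 <= x <= n - 1,
    with x^(n-1) congruent to 1 mod n, while x^m is not congruent to 1
    for every m with 1 <= m < n - 1?  Such an x (a primitive root
    mod n) exists exactly when n is prime."""
    for x in range(1, n):
        if pow(x, n - 1, n) != 1:   # three-argument pow: modular exponentiation
            continue
        if all(pow(x, m, n) != 1 for m in range(1, n - 1)):
            return True
    return False

print([n for n in range(2, 30) if satisfies_statement_1(n)])  # the primes below 30
```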
The significance of the first statement is that it gives us a way of expressing the primeness of n that starts, "there is a number x so that . . . ," and thus we have a potential nondeterministic solution: Guess an x and test it. At first, testing x seems to require that we test all the numbers m with 1 ≤ m < n − 1 to make sure that x^m ≢_n 1. If this is really necessary, the nondeterminism is no help—we might as well go back to the usual test for primeness, trying divisors of n. The significance of statement 2 is that in order to test x, we do not have to try all the m's, but only those of the form (n − 1)/p for some prime factor p of n − 1. (According to statement 2, if n is not a prime, some m of this form will satisfy x^m ≡_n 1.) How do we find the prime factors of n − 1? We guess!
CHAPTER 14 Tractable and Intractable Problems 505
With this introduction, it should now be possible to see that the following recursive
nondeterministic procedure accepts the set of primes.
Is_Prime(n)
    if n = 2 return true
    else if n > 2 and n is even return false
    else
    {   guess x with 1 < x < n
        if x^(n−1) ≢_n 1 return false
        guess a factorization p_1, p_2, . . . , p_k of n − 1
        for i = 1 to k
            if not Is_Prime(p_i) return false
        if p_1 · p_2 · · · p_k ≠ n − 1 return false
        for i = 1 to k
            if x^((n−1)/p_i) ≡_n 1 return false
        return true
    }
A TM can simulate this recursion by using its tape to keep a stack of "activation records," as
mentioned once before in the sketch of the proof of Theorem 13.6. In order to execute the
“return false” statement, however, it can simply halt in the reject state.
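A deterministic rendition of Is_Prime may help make the certificate visible. In this sketch the two "guess" steps are replaced by exhaustive search and by trial-division factoring, so it runs in exponential, not polynomial, time; it only shows what a correct sequence of guesses would verify:

```python
def is_prime(n):
    """Deterministic rendition of the nondeterministic Is_Prime: the
    'guess' steps are replaced by exhaustive search, so this version
    is NOT polynomial-time; it merely checks the same certificate."""
    if n == 2:
        return True
    if n < 2 or n % 2 == 0:
        return False
    factors = prime_factors(n - 1)          # stands in for the guessed p_1..p_k
    for x in range(2, n):
        if pow(x, n - 1, n) != 1:
            continue
        # by statement 2, it suffices to test m = (n-1)/p for primes p | n-1
        if all(pow(x, (n - 1) // p, n) != 1 for p in set(factors)):
            return True
    return False

def prime_factors(m):
    """Trial-division factorization of m (replaces the recursive guesses)."""
    out, d = [], 2
    while d * d <= m:
        while m % d == 0:
            out.append(d)
            m //= d
        d += 1
    if m > 1:
        out.append(m)
    return out

print([n for n in range(2, 40) if is_prime(n)])
```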
It is still necessary to show that this nondeterministic algorithm can be executed in poly-
nomial (nondeterministic) time. We present only the general idea, and leave most of the details
to the exercises. It is helpful to separate the time required to execute Is_Prime(n) into two
parts: the time required for the k recursive calls Is_Prime(p_i), and all the rest. It is not hard to see that for some constant c and some integer d, the nondeterministic time for everything but the recursive calls is bounded by c(log n)^d (remember that log n is the length of the input string). If T(n) is the total nondeterministic time, then

T(n) ≤ c(log n)^d + Σ_{i=1}^{k} T(p_i)
This inequality can be used in an induction proof that T(n) ≤ C(log n)^(d+1), where C is some sufficiently large constant. In the induction step, if we know from our hypothesis that T(p_i) ≤ C(log p_i)^(d+1) for each i, then we only need to show that

c(log n)^d + Σ_{i=1}^{k} C(log p_i)^(d+1) ≤ C(log n)^(d+1)
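One way to carry out the induction step is sketched below, under the assumption that n > 3 (so that n − 1 is even, k ≥ 2, and every prime factor satisfies p_i ≤ (n − 1)/2, hence log p_i ≤ log n − 1); the details are left to the exercises, so this is only one possible route:

```latex
% Since the p_i multiply to n - 1, we have sum_i log p_i <= log n, and so
\[
\sum_{i=1}^{k} C(\log p_i)^{d+1}
  \;\le\; C\Bigl(\max_i \log p_i\Bigr)^{d}\sum_{i=1}^{k}\log p_i
  \;\le\; C(\log n - 1)^{d}\log n .
\]
% By the mean value theorem, (log n)^d - (log n - 1)^d >= d (log n - 1)^{d-1}, so
\[
C(\log n)^{d+1} - C(\log n - 1)^{d}\log n
  \;=\; C\log n\,\bigl[(\log n)^{d} - (\log n - 1)^{d}\bigr]
  \;\ge\; C\,d\,(\log n - 1)^{d-1}\log n ,
\]
% which dominates c (log n)^d once C is chosen large enough
% (the finitely many small values of n are absorbed by enlarging C further).
```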
The following properties of the relation ≤_p are not surprising and are consistent with our understanding of what "no harder than" should mean in this context.

Theorem 14.2

1. ≤_p is transitive: if L₁ ≤_p L₂ and L₂ ≤_p L₃, then L₁ ≤_p L₃.
2. If L₂ ∈ P and L₁ ≤_p L₂, then L₁ ∈ P.
3. If L₂ ∈ NP and L₁ ≤_p L₂, then L₁ ∈ NP.

even though CNF-Sat appears to be a difficult problem in NP, the other problem is no easier.
x = ⋀_{i=1}^{k} ⋁_{j=1}^{n_i} a_{i,j}

where each a_{i,j} is a literal. We want the vertices of G_x to correspond precisely to the occurrences of the terms a_{i,j} in x; we let

V_x = {(i, j) | 1 ≤ i ≤ k and 1 ≤ j ≤ n_i}

The edges of G_x are now specified so that the vertex (i, j) is adjacent to (l, m) if and only if
the corresponding literals are in different conjuncts of x and there is a truth assignment to x
Figure 14.1
then determine a complete subgraph of G,,, because we have specified the edges of G, so that
any two of these vertices are adjacent.
On the other hand, suppose there is a complete subgraph of G, with k, vertices. Because
none of the corresponding literals is the negation of another, there is a truth assignment that
makes them all true; and because these literals must be in distinct conjuncts, this assignment
makes at least one literal in each conjunct true. Therefore, x is satisfiable.
Now let us consider how long it takes, beginning with the string w representing x, to
construct the string representing (G,, k,). The vertices of the graph can be constructed in a
single scan of w. For a particular literal in a particular conjunct of x, a new edge is obtained
for each literal in another conjunct that is not the negation of the first one. Finding another
conjunct, identifying another literal within that conjunct, and comparing that literal to the
original one can each be done within polynomial time, and it follows that the overall time is
still polynomial.
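The construction of (G_x, k_x) can be sketched directly. Here clauses are lists of signed integers (+v for x_v, −v for its negation), a convention of ours, not the book's:

```python
from itertools import combinations

def cnf_to_graph(conjuncts):
    """Construct (G_x, k_x) from a CNF expression x: one vertex per
    occurrence (i, j) of a literal, with (i, j) adjacent to (l, m) iff
    the occurrences lie in different conjuncts and the two literals
    are not negations of each other.  Literals are signed integers."""
    vertices = [(i, j) for i, clause in enumerate(conjuncts)
                for j in range(len(clause))]
    edges = set()
    for (i, j), (l, m) in combinations(vertices, 2):
        if i != l and conjuncts[i][j] != -conjuncts[l][m]:
            edges.add(((i, j), (l, m)))
    return vertices, edges, len(conjuncts)

# x = (x1 or x2) and (not x1 or x2): satisfiable, and G_x has a
# complete subgraph with k_x = 2 vertices (the two occurrences of x2)
vertices, edges, k = cnf_to_graph([[1, 2], [-1, 2]])
print(len(vertices), len(edges), k)  # 4 3 2
```

Everything here is a single pass over pairs of literal occurrences, which matches the polynomial-time bound argued in the text.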
Just as before, Definition 14.3 and Theorem 14.3 can also be extended to decision
problems. An NP-complete problem is one for which the corresponding language is
NP-complete, and Theorem 14.3 provides a way of obtaining more NP-complete
problems—provided that we can find one to start with. It is not at all obvious that
we can. The set NP contains problems that are diverse and seemingly unrelated:
problems involving graphs, networks, sets and partitions, scheduling, number theory,
logic, and more. It is reasonable to expect that some of these problems will be more
complex than others, and perhaps even that some will be “hardest.” An NP-complete
problem, however, is not only hard but archetypal: Finding a good algorithm to solve
it guarantees that there will be comparable algorithms for every other problem in NP!
Exercise 14.14 describes a way to obtain an “artificial” NP-complete language. In
the next section we will see a remarkable result of Stephen Cook: The language CNF-
Satisfiable (or the decision problem CNF-Sat) is NP-complete. This fact, together
with the technique of polynomial-time reduction and Theorem 14.3, will allow us to
show that many interesting and widely studied problems are NP-complete.
Theorem 14.3 indicates both ways in which the idea of NP-completeness is sig-
nificant in complexity theory. On the one hand, if someone were ever to demonstrate
that some NP-complete problem could be solved by a polynomial-time algorithm,
then the P = NP question would be resolved; NP would disappear as a separate en-
tity, and researchers would redouble their efforts to find polynomial-time algorithms
for problems now known to be in NP, confident that they were not on a wild-goose
chase. On the other hand, as long as the question remains open (or if someone actually
succeeds in proving that P # NP), the difficulty of a problem P can be established
convincingly by showing that some other problem already known to be NP-hard can
be polynomial-time reduced to P.
14.3|COOK’S THEOREM
The idea of NP-completeness was introduced by Stephen Cook in 1971, and our
first example of an NP-complete problem is the CNF-satisfiability problem (Example
14.1), which he proved is NP-complete. The details of the proof are complicated,
and that is perhaps to be expected. Rather than using specific features of a decision
problem to construct a reduction, as we were able to do in Example 14.3, we must
now show there is a “generic” polynomial-time reduction from any problem in NP to
this one. Fortunately, once we have one NP-complete problem, obtaining others will
be considerably easier.
⋀_{i=1}^{k} A_i

to denote the conjunction

A₁ ∧ A₂ ∧ · · · ∧ A_k

and the same sort of shorthand with disjunction. As in Example 14.1, ā stands for the negation of a.
In one section of the proof, the following result about Boolean formulas is useful.
Lemma 14.1  Let F be any Boolean expression involving the variables a₁, a₂, . . . , a_t. Then F is logically equivalent to an expression in conjunctive normal form: one of the form

⋀_{i=1}^{s} ⋁_{j=1}^{t_i} b_{i,j}
Proof  It will be easiest to show first that any such F is equivalent to an expression in disjunctive normal form—that is, one of the form

⋁_{i=1}^{s} ⋀_{j=1}^{t_i} b_{i,j}

To do this, we introduce the following notation. For any assignment Φ of truth values to the variables, and for any j from 1 to t, we let a_j^Φ be a_j if Φ assigns a_j the value true, and ā_j otherwise. If S is the set of assignments that satisfy F, let

F₁ = ⋁_{Φ∈S} ⋀_{j=1}^{t} a_j^Φ

The assignments satisfying F₁
are precisely those that satisfy F. Since F and F₁ are satisfied by exactly the same assignments, they must be logically equivalent.
We can finish the proof by applying our preliminary result to the expression ¬F. If ¬F is equivalent to the disjunctive normal form expression

⋁_{i=1}^{s} ⋀_{j=1}^{t_i} b_{i,j}

then by De Morgan's laws, F = ¬(¬F) is equivalent to

⋀_{i=1}^{s} ⋁_{j=1}^{t_i} b̄_{i,j}

which is in conjunctive normal form.
Recall from Example 14.1 that CNF-Satisfiable is the language of encoded yes-instances of the CNF-satisfiability problem. It is a language over the alphabet Σ introduced in Example 14.1 for encoding CNF expressions. We can now state the theorem.
Soon we will look at two other decision problems involving undirected graphs.
Our next example is one of several possible variations on the CNF-satisfiability prob-
lem; see the book Computational Complexity (Addison-Wesley, Reading, MA, 1994)
by Papadimitriou for a discussion of some of the others. We denote by 3-Sat the
following decision problem: Given an expression in CNF in which every conjunct is
the disjunction of three or fewer literals, is there a truth assignment satisfying the ex-
pression? The language 3-Satisfiable will be the corresponding language of encoded
Although we now have five examples of NP-complete problems, the problems now known to be NP-complete number in the thousands, and the list is growing constantly. The book by Garey and Johnson remains a very good reference for a general discussion of the topic and contains a varied list of problems, grouped according to category (graphs, sets, and so on).
NP-completeness is still a somewhat mysterious property. Some decision problems are in P, and others that seem similar turn out to be NP-complete (Exercises 14.15 and 14.22). In the absence of either an answer to the P = NP question or a definitive characterization of tractability, people generally take a pragmatic approach. Many real-life decision problems require some kind of solution. If a polynomial-time algorithm does not present itself, maybe the problem can be shown to be NP-complete, by choosing a problem that is, from the large group available, and constructing a reduction. In this case, it is probably not worthwhile spending a lot more time looking for a polynomial-time solution. The next-best thing might be to look for an algorithm that produces an approximate solution, or one that provides a solution for a restricted set of instances. Both approaches represent active areas of research.
EXERCISES
14.1. In studying the CNF-satisfiability problem, what is the reason for imposing
the restriction that an instance must contain precisely the variables x_1, . . . , x_v, with none left out?
14.2. The nondeterministic Turing machine we described that accepts
CNF-Satisfiable repeats the following operation or minor variations of it:
starting with a string of 1’s in the input string, delimited at both ends by a
symbol other than 1, and locating some or all of the other occurrences of
this string that are similarly delimited. How long does an operation of this
type take on a one-tape TM? Use your answer to argue that the TM
accepting CNF-Satisfiable has time complexity O(n²).
14.3. a. Show that if L ∈ Time(f), then L′ ∈ Time(f).
b. Show that if L ∈ P, then L′ ∈ P, and if L ∈ PSpace, then L′ ∈ PSpace.
c. Explain carefully why the fact that L ∈ NP does not obviously imply
that L′ ∈ NP.
14.4. a. Let L₁ and L₂ be languages over Σ₁ and Σ₂, respectively. Show that if
L₁ ≤_p L₂, then L₁′ ≤_p L₂′.
b. Show that if there is an NP-complete language L whose complement is
in NP, then the complement of any language in NP is in NP.
14.5. Show that if L₁, L₂ ⊆ Σ*, L₁ ∈ P, and L₂ is neither ∅ nor Σ*, then
L₁ ≤_p L₂.
14.15. Show that the 2-colorability problem (Given a graph, is there a 2-coloring
of the vertices?) is in P.
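For Exercise 14.15, one standard polynomial-time approach is breadth-first search, assigning alternating colors and rejecting exactly when an odd cycle is found; a sketch:

```python
from collections import deque

def two_colorable(n, edges):
    """BFS 2-coloring in O(n + |E|) time, so 2-colorability is in P.
    Vertices are 0..n-1; edges is a list of pairs."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [None] * n
    for s in range(n):                  # handle each connected component
        if color[s] is not None:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if color[w] is None:
                    color[w] = 1 - color[u]
                    q.append(w)
                elif color[w] == color[u]:
                    return False        # odd cycle: no 2-coloring exists
    return True

print(two_colorable(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # True (a 4-cycle)
print(two_colorable(3, [(0, 1), (1, 2), (2, 0)]))          # False (a triangle)
```

The contrast with the k-colorability problem for k ≥ 3, which is NP-complete, is exactly the kind of similarity-with-a-gap the text describes.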
14.16. Consider the following algorithm to solve the vertex cover problem. First,
we generate all subsets of the vertices containing exactly k vertices. There
are O(n^k) such subsets. Then we check whether any of the resulting
subgraphs is complete. Why is this not a polynomial-time algorithm (and
thus a proof that P = NP)?
14.17. Let f be a function in PF, the set of functions from Σ* to Σ* computable
in polynomial time. Let A (a language in Σ*) be in P. Show that f⁻¹(A) is
in P, where by definition, f⁻¹(A) = {x ∈ Σ* | f(x) ∈ A}.
14.18. In an undirected graph G, with vertex set V and edge set E, an independent
set of vertices is a set V₁ ⊆ V so that no two elements of V₁ are joined by
an edge in E. Let IS be the decision problem: Given a graph G and an
integer k, is there an independent set of vertices with at least k elements?
Denote by VC and CSG the vertex cover problem and the complete
subgraph problem, respectively. Construct a polynomial-time reduction
from each of the three problems IS, VC, and CSG to each of the others.
(Part of this problem has already been done, in the proof of Theorem 14.7.
In the remaining parts, you are not to use the NP-completeness of any of
these problems.)
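Two of the instance transformations asked for in Exercise 14.18 are especially direct: S is an independent set in G if and only if V − S is a vertex cover, and if and only if S is a complete subgraph of the complement of G. A sketch (graphs as edge lists over vertices 0..n−1, our own convention):

```python
def is_to_vc(n, edges, k):
    """IS(G, k) -> VC(G, n - k): S is independent iff V - S is a cover."""
    return n, edges, n - k

def is_to_csg(n, edges, k):
    """IS(G, k) -> CSG(G', k): complement the edge set; an independent
    set in G is a complete subgraph of the complement G'."""
    present = {frozenset(e) for e in edges}
    comp = [(u, v) for u in range(n) for v in range(u + 1, n)
            if frozenset((u, v)) not in present]
    return n, comp, k

# on a triangle with k = 1: the cover target becomes n - k = 2, and the
# complement graph has no edges at all
print(is_to_vc(3, [(0, 1), (1, 2), (0, 2)], 1))
print(is_to_csg(3, [(0, 1), (1, 2), (0, 2)], 1))
```

Both transformations clearly run in polynomial time; composing them with their inverses gives several of the six reductions the exercise requests.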
L₁ ⊕ L₂ = L₁{0} ∪ L₂{1}
a. Show that L₁ ≤_p L₁ ⊕ L₂ and L₂ ≤_p L₁ ⊕ L₂.
b. Show that for any languages L, L₁, and L₂ over {0, 1}, with
L ≠ {0, 1}*, if L₁ ≤_p L and L₂ ≤_p L, then L₁ ⊕ L₂ ≤_p L.
14.22. Show that the 2-Satisfiability problem is in P.
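Exercise 14.22 can be solved with the well-known implication-graph method (not from the book): each clause (a ∨ b) forces ¬a → b and ¬b → a, and the expression is unsatisfiable exactly when some variable and its negation lie in the same strongly connected component. A sketch using Kosaraju's algorithm:

```python
def two_sat(num_vars, clauses):
    """2-Satisfiability in polynomial time via the implication graph.
    Literals are signed ints; nodes 2i and 2i+1 encode x_{i+1} and its
    negation.  Unsatisfiable iff x and not-x share an SCC."""
    N = 2 * num_vars
    node = lambda lit: 2 * (abs(lit) - 1) + (0 if lit > 0 else 1)
    neg = lambda v: v ^ 1
    adj = [[] for _ in range(N)]
    radj = [[] for _ in range(N)]
    for a, b in clauses:
        for u, v in ((neg(node(a)), node(b)), (neg(node(b)), node(a))):
            adj[u].append(v)
            radj[v].append(u)
    # Kosaraju: iterative DFS for finish order, then label SCCs on the
    # reverse graph in decreasing finish time
    order, seen = [], [False] * N
    for s in range(N):
        stack = [(s, 0)]
        while stack:
            u, i = stack.pop()
            if i == 0:
                if seen[u]:
                    continue
                seen[u] = True
            if i < len(adj[u]):
                stack.append((u, i + 1))
                stack.append((adj[u][i], 0))
            else:
                order.append(u)
    comp = [None] * N
    for c, s in enumerate(reversed(order)):
        if comp[s] is not None:
            continue
        stack, comp[s] = [s], c
        while stack:
            u = stack.pop()
            for v in radj[u]:
                if comp[v] is None:
                    comp[v] = c
                    stack.append(v)
    return all(comp[2 * i] != comp[2 * i + 1] for i in range(num_vars))

print(two_sat(2, [(1, 2), (-1, 2), (1, -2)]))  # True
print(two_sat(1, [(1, 1), (-1, -1)]))          # False
```

The whole computation is linear in the number of clauses, which is what separates 2-Sat so sharply from 3-Sat.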
14.23. Show that both P and NP are closed under the operations of union,
intersection, concatenation, and Kleene *.
(The significance of the name is that w_i and p_i are viewed
as the weight and the profit of the ith item, respectively; we have a
"knapsack" that can hold no more than W pounds, and the problem is
asking whether it is possible to choose items to put into the knapsack
subject to that constraint, so that the total profit is at least P.)
Show that the 0-1 knapsack problem is NP-complete by constructing a
reduction from the partition problem.
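One way to set up the reduction: given a partition instance a_1, . . . , a_n, take w_i = p_i = a_i and W = P = (Σ a_i)/2. A sketch of the instance transformation (the handling of odd totals by a fixed no-instance is our own choice):

```python
def partition_to_knapsack(nums):
    """Reduce partition to 0-1 knapsack: the multiset nums splits into
    two halves of equal sum iff the knapsack instance with
    w_i = p_i = a_i and W = P = sum(nums) / 2 is a yes-instance."""
    total = sum(nums)
    if total % 2 != 0:
        # odd total: certainly a no-instance, so map it to a fixed
        # no-instance of knapsack (one item too heavy to reach profit 1)
        return [1], [1], 0, 1   # weights, profits, W, P
    half = total // 2
    return nums, nums, half, half

weights, profits, W, P = partition_to_knapsack([3, 1, 1, 2, 2, 1])
# total is 10: asks for a subset with weight <= 5 and profit >= 5,
# i.e. a subset summing to exactly 5
print(weights, profits, W, P)
```

A chosen subset then has weight ≤ W and profit ≥ P exactly when its total is exactly half the overall sum, which is the partition condition.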
REFERENCES
There are a number of texts that discuss the topics in this book. Hopcroft, Motwani, and Ullman (2001) and Lewis and Papadimitriou (1998) are both recent editions of books that previously established themselves as standard references. Others that might serve as useful complements to this book include Sipser (1997), Sudkamp (1996), and Linz (2001), and more comprehensive books for further reading include Davis, Sigal, and Weyuker (1994) and Floyd and Beigel (1994).
The idea of a finite automaton appears in McCulloch and Pitts (1943) as a way of modeling neural nets. Kleene (1956) introduces regular expressions and proves their equivalence to FAs, and NFAs are investigated in Rabin and Scott (1959). Theorem 5.1 and Corollary 5.1 are due to Nerode (1958) and Myhill (1957). Algorithm 5.1 appears in Huffman (1954) and in Moore (1956). The Pumping Lemmas (Theorem 5.2a and Theorem 8.1a) were proved in Bar-Hillel, Perles, and Shamir (1961).
Context-free grammars were introduced in Chomsky (1956) and pushdown automata in Oettinger (1961); their equivalence is shown in Chomsky (1962), Evey (1963), and Schützenberger (1963). LL(1) grammars are introduced in Lewis and Stearns (1968) and LL(k) grammars in Rosenkrantz and Stearns (1970). Knuth (1965) characterizes the grammars corresponding to DCFLs.
Turing's 1936 paper introduces Turing machines and universal Turing machines and proves the unsolvability of the halting problem. Church (1936) contains the first explicit statement of the Church-Turing thesis. Hennie (1977) contains a good general introduction to Turing machines.
Chomsky's papers of 1956 and 1959 contain the proofs of Theorems 11.1 and 11.2, as well as the definition of the Chomsky hierarchy. The equivalence of context-sensitive grammars and linear-bounded automata is shown in Kuroda (1964).
Post's correspondence problem was discussed in Post (1946). A number of the original papers by Turing, Post, Kleene, Church, and Gödel having to do with computability and solvability are reprinted in Davis (1965). Kleene (1952), Davis (1958), and Rogers (1967) are further references on computability, recursive function theory, and related topics.
Rabin (1963) and Hartmanis and Stearns (1965) are two early papers dealing with computational complexity. The class P was introduced in Cobham (1964), and Cook (1971) contains the definition of NP-completeness and the proof that the satisfiability problem is NP-complete. Karp's 1972 paper exhibited a number of NP-complete problems and helped to establish the idea as a fundamental one in complexity theory. Garey and Johnson (1979) is the standard introductory reference on NP-completeness and contains a catalogue of the problems then known to be NP-complete. Recent references on computational complexity include Balcázar (1988 and 1990), Bovet and Crescenzi (1994), Papadimitriou (1994), and many of the chapters of van Leeuwen (1990).
BIBLIOGRAPHY
Balcázar JL, Díaz J, Gabarró J: Structural Complexity I. New York: Springer-Verlag, 1988.
Balcázar JL, Díaz J, Gabarró J: Structural Complexity II. New York: Springer-Verlag, 1990.
Bar-Hillel Y, Perles M, Shamir E: On Formal Properties of Simple Phrase Structure Grammars, Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 14: 143-172, 1961.
Boas RP: Can We Make Mathematics Intelligible? American Mathematical Monthly 88: 727-731, 1981.
Bovet DP, Crescenzi P: Introduction to the Theory of Complexity. Englewood Cliffs, NJ: Prentice Hall, 1994.
Carroll J, Long D: Theory of Finite Automata with an Introduction to Formal Languages. Englewood Cliffs, NJ: Prentice Hall, 1989.
Chomsky N: Three Models for the Description of Language, IRE Transactions on Information Theory 2: 113-124, 1956.
Chomsky N: On Certain Formal Properties of Grammars, Information and Control 2: 137-167, 1959.
Chomsky N: Context-free Grammars and Pushdown Storage, Quarterly Progress Report No. 65, Cambridge, MA: Massachusetts Institute of Technology Research Laboratory of Electronics, 1962, pp. 187-194.
Church A: An Unsolvable Problem of Elementary Number Theory, American Journal of Mathematics 58: 345-363, 1936.
Cobham A: The Intrinsic Computational Difficulty of Functions, Proceedings of the 1964 Congress for Logic, Mathematics, and Philosophy of Science, New York: North Holland, 1964, pp. 24-30.
Cook SA: The Complexity of Theorem Proving Procedures, Proceedings of the Third Annual ACM Symposium on the Theory of Computing, New York: Association for Computing Machinery, 1971, pp. 151-158.
Davis MD: Computability and Unsolvability. New York: McGraw-Hill, 1958.
Davis MD: The Undecidable. Hewlett, NY: Raven Press, 1965.
Davis MD, Sigal R, Weyuker EJ: Computability, Complexity, and Languages: Fundamentals of Theoretical Computer Science, 2nd ed. New York: Academic Press, 1994.
Dowling WF: There Are No Safe Virus Tests, American Mathematical Monthly 96: 835-836, 1989.
Earley J: An Efficient Context-free Parsing Algorithm, Communications of the ACM 13(2): 94-102, 1970.
Evey J: Application of Pushdown Store Machines, Proceedings, 1963 Fall Joint Computer Conference, Montvale, NJ: AFIPS Press, 1963, pp. 215-227.
Floyd RW, Beigel R: The Language of Machines: An Introduction to Computability and Formal Languages. New York: Freeman, 1994.
Garey MR, Johnson DS: Computers and Intractability: A Guide to the Theory of NP-Completeness. New York: Freeman, 1979.
Hartmanis J, Stearns RE: On the Computational Complexity of Algorithms, Transactions of the American Mathematical Society 117: 285-306, 1965.
Hennie FC: Introduction to Computability. Reading, MA: Addison-Wesley, 1977.
Hopcroft JE, Motwani R, Ullman J: Introduction to Automata Theory, Languages, and Computation, 2nd ed. Reading, MA: Addison-Wesley, 2001.
Huffman DA: The Synthesis of Sequential Switching Circuits, Journal of the Franklin Institute 257: 161-190, 275-303, 1954.
Immerman N: Nondeterministic Space is Closed under Complementation, SIAM Journal of Computing 17: 935-938, 1988.
Karp RM: Reducibility Among Combinatorial Problems. In Complexity of Computer Computations. New York: Plenum Press, 1972, pp. 85-104.
Kleene SC: Introduction to Metamathematics. New York: Van Nostrand, 1952.
Kleene SC: Representation of Events in Nerve Nets and Finite Automata. In Shannon CE, McCarthy J (eds), Automata Studies. Princeton, NJ: Princeton University Press, 1956, pp. 3-42.
Knuth DE: On the Translation of Languages from Left to Right, Information and Control 8: 607-639, 1965.
Kuroda SY: Classes of Languages and Linear-Bounded Automata, Information and Control 7: 207-223, 1964.
Levine JR, Mason T, Brown D: lex & yacc, 2nd ed. Sebastopol, CA: O'Reilly & Associates, 1992.
Lewis HR, Papadimitriou C: Elements of the Theory of Computation, 2nd ed. Englewood Cliffs, NJ: Prentice Hall, 1998.
Lewis PM II, Stearns RE: Syntax-directed Transduction, Journal of the ACM 15: 465-488, 1968.
Linz P: An Introduction to Formal Languages and Automata, 3rd ed. Sudbury, MA: Jones and Bartlett, 2001.
McCulloch WS, Pitts W: A Logical Calculus of the Ideas Immanent in Nervous Activity, Bulletin of Mathematical Biophysics 5: 115-133, 1943.
Moore EF: Gedanken Experiments on Sequential Machines. In Shannon CE, and McCarthy J (eds), Automata Studies. Princeton, NJ: Princeton University Press, 1956, pp. 129-153.
Myhill J: Finite Automata and the Representation of Events, WADD TR-57-624, Wright Patterson Air Force Base, OH, 1957, pp. 112-137.
Nerode A: Linear Automaton Transformations, Proceedings of the American Mathematical Society 9: 541-544, 1958.
Oettinger AG: Automatic Syntactic Analysis and the Pushdown Store, Proceedings of the Symposia in Applied Mathematics 12, Providence, RI: American Mathematical Society, 1961, pp. 104-109.
Ogden W: A Helpful Result for Proving Inherent Ambiguity, Mathematical Systems Theory 2: 191-194, 1968.
Papadimitriou CH: Computational Complexity. Reading, MA: Addison-Wesley, 1994.
Paulos JA: Once upon a Number: The Hidden Logic of Stories. New York: Basic Books, 1999.
Post EL: A Variant of a Recursively Unsolvable Problem, Bulletin of the American Mathematical Society 52: 246-268, 1946.
Rabin MO: Real-Time Computation, Israel Journal of Mathematics 1: 203-211, 1963.
Rabin MO, Scott D: Finite Automata and their Decision Problems, IBM Journal of Research and Development 3: 115-125, 1959.
Rogers H, Jr: Theory of Recursive Functions and Effective Computability. New York: McGraw-Hill, 1967.
Rosenkrantz DJ, Stearns RE: Properties of Deterministic Top-down Grammars, Information and Control 17: 226-256, 1970.
Salomaa A: Jewels of Formal Language Theory. Rockville, MD: Computer Science Press, 1981.
Savitch WJ: Relationships between Nondeterministic and Deterministic Tape Complexities, Journal of Computer and System Sciences 4: 2, 177-192, 1970.
Schützenberger MP: On Context-free Languages and Pushdown Automata, Information and Control 6: 246-264, 1963.
Sipser M: Introduction to the Theory of Computation. Boston, MA: PWS, 1997.
Sudkamp TA: Languages and Machines: An Introduction to the Theory of Computer Science, 2nd ed. Reading, MA: Addison-Wesley, 1996.
Szelepcsényi R: The Method of Forcing for Nondeterministic Automata, Bulletin of the EATCS 33: 96-100, 1987.
Turing AM: On Computable Numbers with an Application to the Entscheidungsproblem, Proceedings of the London Mathematical Society 2: 230-265, 1936.
van Leeuwen J (ed): Handbook of Theoretical Computer Science (Volume A, Algorithms and Complexity). Amsterdam: MIT Press/Elsevier, 1990.
Younger DH: Recognition and Parsing of Context-free Languages in Time n³, Information and Control 10(2): 189-208, 1967.
INDEX OF NOTATION
{x € A| P(x)} Ng (x) 29
{3i +77 |i, 7 = O} xy, XYZ 30
‘= L,L2 30
Al
Ga SEF 30
U peal be 31
¢ Lx] 35
AUB,ANB
|L| 38
A-—B
RSS Rem Re 41,70
4)
n! 47, 58
A@B n
i=1 60
AN Big C
pal 63
jot Ai AE 63
Upe Ai F
Up Ai rev, x" 71
QA C(n, i) 183
»
(a, b) GAE 75
AxB + (in a regular expression) 85
(Qj,2s ->5Gn) WwW
WW
PW
HA
HR (ri +12) (rir2), (r*)
DW 86
A, X A2x...An ©7),@) 86
IN =e FA 95
— (conditional) M = (Q, =, qo, A, 5) (an FA) 95
<-> 6* (in an FA) 98
=> (logical implication) L(M) (for an FA) 99
? L/x 105
Ax (P (x)), Vx(P(x)) Ln 107
dx € A(P(x)) rreyv 114
ay > 0(P()) sh(r) 114
f(x) NFA 125
R M = (Q, =, qo, A, 5) (an NFA) 125
(aca B 6* (in an NFA) 126
N L(M) (for an NFA) 127
f(S) NFA-A 135
Rt M = (Q, %, qo, A, 5) (an NFA-A) 135
sof 5* (in an NFA-A) 136
f-@) A(S) 137
f"S8) L(M) (for an NFA-A) 137
Z My = (Qu, 2, qu, Au, Su) 146
xey,e(x,y),TeT Mz = (Qc, 2, Ge, Acs Sc) 147
aRb My = (Qk, U, qk, Ak, 9k) 148
a=, b L(p,q) 152
L(p, 4; J) 152
[a], [alr
INDEX
Logically equivalent, 13
k-colorability problem, 519 Logically implies, 13
NP-completeness of, 520 Lookahead, 281
k-coloring, 519 Loop in a path in an FA, 180
Kleene, S.C., 31 Loop invariant, 79
Kleene star, 31
Kleene’s Theorem
Part 1, 146 Many-one reducibility, 412
Part 2, 151 Mate of a left parenthesis, 230
Mathematical induction. See Principle of mathematical
induction
Λ, 29
Λ-closed, 165
Λ-closure, 136
algorithm to calculate, 137
Λ-productions, 232
algorithm to eliminate, 234
Λ-transitions
eliminating from an NFA-Λ, 140
in an NFA-Λ, 135
in a PDA, 253, 258
Language, 28 Migrating symbols, 374
accepted by a PDA, 256 Minimal counterexample principle, 54
accepted by a TM, 322 Minimalization, 457
accepted by an FA, 99 bounded, 458
accepted by an NFA, 127 unbounded, 460
context-free. See Context-free languages Minimization algorithm, 179
recognized by an FA, 99 Minimum-state FA for a regular language, 170
recursive, 365 Mod, 454
recursively enumerable, 365 Model of computation, 189, 352
regular. See Regular languages Modified correspondence system, 425
Last in, first out, 251 Modified Post’s Correspondence Problem, 423
LBA. See Linear-bounded automaton reduction to PCP, 424
Left inverse, 471 unsolvability of, 425
Left recursion, 283 Modula-2, 226
eliminating, 283 Modus ponens, 34
Leftmost derivation, 222 Monus operation, 448
simulated by a top-down PDA, 266 MPCP. See Modified Post’s Correspondence Problem
Leftmost square of a TM tape, 320 j4-recursive functions, 460
Length of a string, 29, 71 computability of, 461
lex, 190 Mult (multiplication function), primitive recursive
Lexical analysis, 189 derivation of, 446
LIFO. See Last in, first out Multiplication, 446
Linear grammar, 220 Multitape TM, 338
Linear speed-up, 489 configuration of, 338
Linear-bounded automaton, 382 simulation of by an ordinary TM, 339
corresponding to a given CSG, 382 Mutual recursion, 206
Listing the elements of a set, 390 Myhill-Nerode Theorem, 171
Literal, 89, 114, 502
Little-oh, 483 N
Live variable in a CFG, 246 Natural numbers, 4
LL(k) grammar, 285 Negating quantified statements, 17
Logical connective, 10 Negation, 10