
Languages and Machines, Thomas A. Sudkamp, Third Edition

THIRD EDITION

An Introduction to the Theory of Computer Science

Languages and Machines

Thomas A. Sudkamp
WRIGHT STATE UNIVERSITY

PEARSON

Boston San Francisco New York London Toronto Sydney Tokyo Singapore Madrid Mexico City Munich Paris Cape Town Hong Kong Montreal

Acquisitions Editor: Matt Goldstein
Project Editor: Katherine Harutunian
Production Supervisor: Marilyn Lloyd
Marketing Manager: Michelle Brown
Marketing Coordinator: Jake Zavracky
Project Management: Windfall Software
Composition: Windfall Software
Copyeditor: Yonie Overton
Technical Illustration: Horizon Design
Proofreader: Jennifer McClain
Indexer: Thomas Sudkamp
Cover Design Manager: Joyce Cosentino Wells
Cover Designer: Alison R. Paddock
Cover Image: © 2005 Nova Development
Prepress and Manufacturing: Caroline Fell
Printer: Hamilton Printing

Access the latest information about Addison-Wesley titles from our World Wide Web site: http://www.aw-bc.com/computing

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial caps or all caps.

The programs and applications presented in this book have been included for their instructional value. They have been tested with care, but are not guaranteed for any particular purpose. The publisher does not offer any warranties or representations, nor does it accept any liabilities with respect to the programs or applications.

Library of Congress Cataloging-in-Publication Data
Sudkamp, Thomas A.
Languages and machines : an introduction to the theory of computer science / Thomas A. Sudkamp.--3rd ed.
p. cm.
Includes bibliographical references and index.
ISBN 0-321-32221-5 (alk. paper)
1. Formal languages. 2. Machine theory. 3. Computational complexity. I. Title.
QA267.3.S83 2005
511.3--dc22 2004030342

Copyright © 2006 by Pearson Education, Inc.
For information on obtaining permission for use of material in this work, please submit a written request to Pearson Education, Inc., Rights and Contract Department, 75 Arlington Street, Suite 300, Boston, MA 02116 or fax your request to (617) 848-7047.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or any other media embodiments now known or hereafter to become known, without the prior written permission of the publisher. Printed in the United States of America.

ISBN 0-321-32221-5
1 2 3 4 5 6 7 8 9 10-HAM-08 07 06 05

⟨dedication⟩ → ⟨parents⟩
⟨parents⟩ → ⟨first name⟩ ⟨last name⟩
⟨first name⟩ → Donald | Mary
⟨last name⟩ → Sudkamp

Contents

Introduction

Foundations

Chapter 1 Mathematical Preliminaries
1.1 Set Theory
1.2 Cartesian Product, Relations, and Functions
1.3 Equivalence Relations
1.4 Countable and Uncountable Sets
1.5 Diagonalization and Self-Reference
1.6 Recursive Definitions
1.7 Mathematical Induction
1.8 Directed Graphs
Exercises
Bibliographic Notes

Chapter 2 Languages
2.1 Strings and Languages
2.2 Finite Specification of Languages
2.3 Regular Sets and Expressions
2.4 Regular Expressions and Text Searching
Exercises
Bibliographic Notes

Grammars, Automata, and Languages

Chapter 3 Context-Free Grammars
3.1 Context-Free Grammars and Languages
3.2 Examples of Grammars and Languages
3.3 Regular Grammars
3.4 Verifying Grammars
3.5 Leftmost Derivations and Ambiguity
3.6 Context-Free Grammars and Programming Language Definition
Exercises
Bibliographic Notes

Chapter 4 Normal Forms for Context-Free Grammars
4.1 Grammar Transformations
4.2 Elimination of λ-Rules
4.3 Elimination of Chain Rules
4.4 Useless Symbols
4.5 Chomsky Normal Form
4.6 The CYK Algorithm
4.7 Removal of Direct Left Recursion
4.8 Greibach Normal Form
Exercises
Bibliographic Notes

Chapter 5 Finite Automata
5.1 A Finite-State Machine
5.2 Deterministic Finite Automata
5.3 State Diagrams and Examples
5.4 Nondeterministic Finite Automata
5.5 λ-Transitions
5.6 Removing Nondeterminism
5.7 DFA Minimization
Exercises
Bibliographic Notes

Chapter 6 Properties of Regular Languages
6.1 Finite-State Acceptance of Regular Languages
6.2 Expression Graphs
6.3 Regular Grammars and Finite Automata
6.4 Closure Properties of Regular Languages
6.5 A Nonregular Language
6.6 The Pumping Lemma for Regular Languages
6.7 The Myhill-Nerode Theorem
Exercises
Bibliographic Notes

Chapter 7 Pushdown Automata and Context-Free Languages
7.1 Pushdown Automata
7.2 Variations on the PDA Theme
7.3 Acceptance of Context-Free Languages
7.4 The Pumping Lemma for Context-Free Languages
7.5 Closure Properties of Context-Free Languages
Exercises
Bibliographic Notes

Computability

Chapter 8 Turing Machines
8.1 The Standard Turing Machine
8.2 Turing Machines as Language Acceptors
8.3 Alternative Acceptance Criteria
8.4 Multitrack Machines
8.5 Two-Way Tape Machines
8.6 Multitape Machines
8.7 Nondeterministic Turing Machines
8.8 Turing Machines as Language Enumerators
Exercises
Bibliographic Notes

Chapter 9 Turing Computable Functions
9.1 Computation of Functions
9.2 Numeric Computation
9.3 Sequential Operation of Turing Machines
9.4 Composition of Functions
9.5 Uncomputable Functions
9.6 Toward a Programming Language
Exercises
Bibliographic Notes

Chapter 10 The Chomsky Hierarchy
10.1 Unrestricted Grammars
10.2 Context-Sensitive Grammars
10.3 Linear-Bounded Automata
10.4 The Chomsky Hierarchy
Exercises
Bibliographic Notes

Chapter 11 Decision Problems and the Church-Turing Thesis
11.1 Representation of Decision Problems
11.2 Decision Problems and Recursive Languages
11.3 Problem Reduction
11.4 The Church-Turing Thesis
11.5 A Universal Machine
Exercises
Bibliographic Notes

Chapter 12 Undecidability
12.1 The Halting Problem for Turing Machines
12.2 Problem Reduction and Undecidability
12.3 Additional Halting Problem Reductions
12.4 Rice's Theorem
12.5 An Unsolvable Word Problem
12.6 The Post Correspondence Problem
12.7 Undecidable Problems in Context-Free Grammars
Exercises
Bibliographic Notes

Chapter 13 Mu-Recursive Functions
13.1 Primitive Recursive Functions
13.2 Some Primitive Recursive Functions
13.3 Bounded Operators
13.4 Division Functions
13.5 Gödel Numbering and Course-of-Values Recursion
13.6 Computable Partial Functions
13.7 Turing Computability and Mu-Recursive Functions
13.8 The Church-Turing Thesis Revisited
Exercises
Bibliographic Notes

Computational Complexity

Chapter 14 Time Complexity
14.1 Measurement of Complexity
14.2 Rates of Growth
14.3 Time Complexity of a Turing Machine
14.4 Complexity and Turing Machine Variations
14.5 Linear Speedup
14.6 Properties of Time Complexity of Languages
14.7 Simulation of Computer Computations
Exercises
Bibliographic Notes

Chapter 15 P, NP, and Cook's Theorem
15.1 Time Complexity of Nondeterministic Turing Machines
15.2 The Classes P and NP
15.3 Problem Representation and Complexity
15.4 Decision Problems and Complexity Classes
15.5 The Hamiltonian Circuit Problem
15.6 Polynomial-Time Reduction
15.7 P = NP?
15.8 The Satisfiability Problem
15.9 Complexity Class Relations
Exercises
Bibliographic Notes

Chapter 16 NP-Complete Problems
16.1 Reduction and NP-Complete Problems
16.2 The 3-Satisfiability Problem
16.3 Reductions from 3-Satisfiability
16.4 Reduction and Subproblems
16.5 Optimization Problems
16.6 Approximation Algorithms
16.7 Approximation Schemes
Exercises
Bibliographic Notes

Chapter 17 Additional Complexity Classes
17.1 Derivative Complexity Classes
17.2 Space Complexity
17.3 Relations between Space and Time Complexity
17.4 P-Space, NP-Space, and Savitch's Theorem
17.5 P-Space Completeness
17.6 An Intractable Problem
Exercises
Bibliographic Notes

Deterministic Parsing

Chapter 18 Parsing: An Introduction
18.1 The Graph of a Grammar
18.2 A Top-Down Parser
18.3 Reductions and Bottom-Up Parsing
18.4 A Bottom-Up Parser
18.5 Parsing and Compiling
Exercises
Bibliographic Notes

Chapter 19 LL(k) Grammars
19.1 Lookahead in Context-Free Grammars
19.2 FIRST, FOLLOW, and Lookahead Sets
19.3 Strong LL(k) Grammars
19.4 Construction of FIRSTₖ Sets
19.5 Construction of FOLLOWₖ Sets
19.6 A Strong LL(1) Grammar
19.7 A Strong LL(k) Parser
19.8 LL(k) Grammars
Exercises
Bibliographic Notes

Chapter 20 LR(k) Grammars
20.1 LR(0) Contexts
20.2 An LR(0) Parser
20.3 The LR(0) Machine
20.4 Acceptance by the LR(0) Machine
20.5 LR(1) Grammars
Exercises
Bibliographic Notes

Appendix I Index of Notation
Appendix II The Greek Alphabet
Appendix III The ASCII Character Set
Appendix IV Backus-Naur Form Definition of Java
Bibliography
Subject Index

Preface

The objective of the third edition of Languages and Machines: An Introduction to the Theory of Computer Science remains the same as that of the first two editions, to provide a mathematically sound presentation of the theory of computer science at a level suitable for junior- and senior-level computer science majors. The impetus for the third edition was threefold: to enhance the presentation by providing additional motivation and examples; to expand the selection of topics, particularly in the area of computational complexity; and to provide additional flexibility to the instructor in the design of an introductory course in the theory of computer science.

While many applications-oriented students question the importance of studying theoretical foundations, it is this subject that addresses the "big picture" issues of computer science. When today's programming languages and computer architectures are obsolete and solutions have been found for problems currently of interest, the questions considered in this book will still be relevant. What types of patterns can be algorithmically detected? How can languages be formally defined and analyzed? What are the inherent capabilities and limitations of algorithmic computation? What problems have solutions that require so much time or memory that they are realistically intractable? How do we compare the relative difficulty of two problems? Each of these questions will be addressed in this text.

Organization

Since most computer science students at the undergraduate level have little or no background in abstract mathematics, the presentation is intended not only to introduce the foundations of computer science but also to increase the student's mathematical sophistication.
This is accomplished by a rigorous presentation of the concepts and theorems of the subject accompanied by a generous supply of examples. Each chapter ends with a set of exercises that reinforces and augments the material covered in the chapter.

To make the topics accessible, no special mathematical prerequisites are assumed. Instead, Chapter 1 introduces the mathematical tools of the theory of computing: naive set [...]

[...] precise and unambiguous definitions of the concepts, structures, and operations. The following notational conventions will be used throughout the book:

Items                  Description                                                    Examples
Elements and strings   Italic lowercase letters from the beginning of the alphabet   $a$, $b$, $abc$
Functions              Italic lowercase letters                                       $f$, $g$, $h$
Sets and relations     Capital letters                                                X, Y, Z, $\Sigma$, $\Gamma$
Grammars               Capital letters                                                G, G$_1$, G$_2$
Variables of grammars  Italic capital letters                                         $A$, $B$, $C$, $S$
Abstract machines      Capital letters                                                M, M$_1$, M$_2$

The use of roman letters for sets and mathematical structures is somewhat nonstandard but was chosen to make the components of a structure visually identifiable. For example, a context-free grammar is a structure G = ($\Sigma$, V, P, $S$). From the fonts alone it can be seen that G consists of three sets and a variable $S$.

A three-part numbering system is used throughout the book; a reference is given by chapter, section, and item. One numbering sequence records definitions, lemmas, theorems, corollaries, and algorithms. A second sequence is used to identify examples. Tables, figures, and exercises are referenced simply by chapter and number.

The end of a proof is marked by ■ and the end of an example by □. An index of symbols, including descriptions and the numbers of the pages on which they are introduced, is given in Appendix I.

Supplements

Solutions to selected exercises are available only to qualified instructors.
Please contact your local Addison-Wesley sales representative or send email to [email protected] for information on how to access them.

Acknowledgments

First and foremost, I would like to thank my wife Janice and daughter Elizabeth, whose kindness, patience, and consideration made the successful completion of this book possible. I would also like to thank my colleagues and friends at the Institut de Recherche en Informatique de Toulouse, Université Paul Sabatier, Toulouse, France. The first draft of this revision was completed while I was visiting IRIT during the summer of 2004. A special thanks to Didier Dubois and Henri Prade for their generosity and hospitality.

The number of people who have made contributions to this book increases with each edition. I extend my sincere appreciation to all the students and professors who have used this book and have sent me critiques, criticisms, corrections, and suggestions for improvement. Many of the suggestions have been incorporated into this edition. Thank you for taking the time to send your comments and please continue to do so. My email address is tsudkamp@cs.wright.edu.

This book, in its various editions, has been reviewed by a number of distinguished computer scientists including Professors Andrew Astromoff (San Francisco State University), Dan Cooke (University of Texas-El Paso), Thomas Fernandez, Sandeep Gupta (Arizona State University), Raymond Gumb (University of Massachusetts-Lowell), Thomas F. Hain (University of South Alabama), Michael Harrison (University of California at Berkeley), David Hemmendinger (Union College), Steve Homer (Boston University), Dan Jurca (California State University-Hayward), Klaus Kaiser (University of Houston), C. Kim (University of Oklahoma), D. T. Lee (Northwestern University), Karen Lemone (Worcester Polytechnic Institute), C. L. Lin (University of Illinois at Urbana-Champaign), Richard J. Lorentz (California State University-Northridge), Fletcher R.
Norris (The University of North Carolina at Wilmington), Jeffery Shallit (University of Waterloo), Frank Stomp (Wayne State University), William Ward (University of South Alabama), Dan Ventura (Brigham Young University), Charles Wallace (Michigan Technological University), Kenneth Williams (Western Michigan University), and Hsu-Chun Yen (Iowa State University). Thank you all.

I would also like to gratefully acknowledge the assistance received from the people at the Computer Science Education Division of the Addison-Wesley Publishing Company and Windfall Software who were members of the team that successfully completed this project.

Thomas A. Sudkamp
Dayton, Ohio

Introduction

The theory of computer science began with the questions that spur most scientific endeavors: how and what. After these had been answered, the question that motivates many economic decisions, how much, came to the forefront. The objective of this book is to explain the significance of these questions for the study of computer science and provide answers whenever possible.

Formal language theory was initiated by the question, "How are languages defined?" In an attempt to capture the structure and nuances of natural language, linguist Noam Chomsky developed formal systems called grammars for defining and generating syntactically correct sentences. At approximately the same time, computer scientists were grappling with the problem of explicitly and unambiguously defining the syntax of programming languages. These two studies converged when the syntax of the programming language ALGOL was defined using a formalism equivalent to a context-free grammar.

The investigation of computability was motivated by two fundamental questions: "What is an algorithm?" and "What are the capabilities and limitations of algorithmic computation?" An answer to the first question requires a formal model of computation.
It may seem that the combination of a computer and high-level programming language, which clearly constitute a computational system, would provide the ideal framework for the study of computability. Only a little consideration is needed to see difficulties with this approach. What computer? How much memory should it have? What programming language? Moreover, the selection of a particular computer or language may have inadvertent and unwanted consequences on the answer to the second question. A problem that may be solved on one computer configuration may not be solvable on another.

The question of whether a problem is algorithmically solvable should be independent of the model of computation used: Either there is an algorithmic solution to a problem or there is no such solution. Consequently, a system that is capable of performing all possible algorithmic computations is needed to appropriately address the question of computability. The characterization of general algorithmic computation has been a major area of research for mathematicians and logicians since the 1930s. Many different systems have been proposed as models of computation, including recursive functions, the lambda calculus of Alonzo Church, Markov systems, and the abstract machines developed by Alan Turing. All of these systems, and many others designed for this purpose, have been shown to be capable of solving the same set of problems. One interpretation of the Church-Turing Thesis, which will be discussed in Chapter 11, is that a problem has an algorithmic solution only if it can be solved in any (and hence all) of these computational systems.

Because of its simplicity and the similarity of its components to those of a modern day computer, we will use the Turing machine as our framework for the study of computation. The Turing machine has many features in common with a computer: It processes input, writes to memory, and produces output.
Although Turing machine instructions are primitive compared with those of a computer, it is not difficult to see that the computation of a computer can be simulated by an appropriately defined sequence of Turing machine instructions. The Turing machine model does, however, avoid the physical limitations of conventional computers; there is no upper bound on the amount of memory or time that may be used in a computation. Consequently, any problem that can be solved on a computer can be solved with a Turing machine, but the converse of this is not guaranteed.

After accepting the Turing machine as a universal model of effective computation, we can address the question, "What are the capabilities and limitations of algorithmic computation?" The Church-Turing Thesis assures us that a problem is solvable only if there is a suitably designed Turing machine that solves it. To show that a problem has no solution reduces to demonstrating that no Turing machine can be designed to solve the problem. Chapter 12 follows this approach to show that several important questions concerning our ability to predict the outcome of a computation are unsolvable.

Once a problem is known to be solvable, one can begin to consider the efficiency or optimality of a solution. The question how much initiates the study of computational complexity. Again the Turing machine provides an unbiased platform that permits the comparison of the resource requirements of various problems. The time complexity of a Turing machine measures the number of instructions required by a computation. Time complexity is used to partition the set of solvable problems into two classes: tractable and intractable. A problem is considered tractable if it is solvable by a Turing machine in which the number of instructions executed during a computation is bounded by a polynomial function of the length of the input.
A problem that is not solvable in polynomial time is considered intractable because of the excessive amount of computational resources required to solve all but the simplest cases of the problem.

The Turing machine is not the only abstract machine that we will consider; rather, it is the culmination of a series of increasingly powerful machines whose properties will be examined. The analysis of effective computation begins with an examination of the properties of deterministic finite automata. A deterministic finite automaton is a read-once machine in which the instruction to be executed is determined by the state of the machine and the input symbol being processed. Although structurally simple, deterministic finite automata have applications in many disciplines including pattern recognition, the design of switching circuits, and the lexical analysis of programming languages.

A more powerful family of machines, known as pushdown automata, is created by adding an external stack memory to finite automata. The addition of the stack extends the computational capabilities of a finite automaton. As with the Turing machines, our study of computability will characterize the computational capabilities of both of these families of machines.

Language definition and computability, the dual themes of this book, are not two unrelated topics that fall under the broad heading of computer science theory, but rather they are inextricably intertwined. The computations of a machine can be used to recognize a language; an input string is accepted by the machine if the computation initiated with the string indicates its syntactic correctness. Thus each machine has an associated language, the set of strings accepted by the machine. The computational capabilities of each family of abstract machines are characterized by the languages accepted by the machines in the family.
With this in mind, we begin our investigations into the related topics of language definition and effective computation.

Foundations

Theoretical computer science includes the study of language definition, pattern recognition, the capabilities and limitations of algorithmic computation, and the analysis of the complexity of problems and their solutions. These topics are built on the foundations of set theory and discrete mathematics. Chapter 1 reviews the mathematical concepts, operations, and notation required for the study of formal language theory and the theory of computation.

Formal language theory has its roots in linguistics, mathematical logic, and computer science. A set-theoretic definition of language is given in Chapter 2. This definition is sufficiently broad to include both natural (spoken and written) languages and formal languages, but the generality is gained at the expense of not providing an effective method for generating the strings of a language. To overcome this shortcoming, recursive definitions and set operations are used to give finite specifications of languages. This is followed by the introduction of regular sets, a family of languages that arises in automata theory, formal language theory, switching circuits, and neural networks. The section ends with an example of the use of regular expressions (a shorthand notation for regular sets) in describing patterns for searching text.

CHAPTER 1
Mathematical Preliminaries

Set theory and discrete mathematics provide the mathematical foundation for formal language theory, computability theory, and the analysis of computational complexity. We begin our study of these topics with a review of the notation and basic operations of set theory. Cardinality measures the size of a set and provides a precise definition of an infinite set.
One of the interesting results of the investigations into the properties of sets by German mathematician Georg Cantor is that there are different sizes of infinite sets. While Cantor's work showed that there is a complete hierarchy of sizes of infinite sets, it is sufficient for our purposes to divide infinite sets into two classes: countable and uncountable. A set is countably infinite if it has the same number of elements as the set of natural numbers. Sets with more elements than the natural numbers are uncountable.

In this chapter we will use a construction known as the diagonalization argument to show that the set of functions defined on the natural numbers is uncountably infinite. After we have agreed upon what is meant by the terms effective procedure and computable function (reaching this consensus is a major goal of Part II of this book), we will be able to determine the size of the set of functions that can be algorithmically computed. A comparison of the sizes of these two sets will establish the existence of functions whose values cannot be computed by any algorithmic process.

While a set may consist of an arbitrary collection of objects, we are interested in sets whose elements can be mechanically produced. Recursive definitions are introduced to generate the elements of a set. The relationship between recursively generated sets and mathematical induction is developed, and induction is shown to provide a general proof technique for establishing properties of elements in recursively generated infinite sets.

This chapter ends with a review of directed graphs and trees, structures that will be used throughout the book to graphically illustrate the concepts of formal language theory and the theory of computation.

1.1 Set Theory

We assume that the reader is familiar with the notions of elementary set theory. In this section, the concepts and notation of that theory are briefly reviewed.
The symbol $\in$ signifies membership; $x \in X$ indicates that $x$ is a member or element of the set X. A slash through a symbol represents not, so $x \notin X$ signifies that $x$ is not a member of X. Two sets are equal if they contain the same members. Throughout this book, sets are denoted by capital letters. In particular, X, Y, and Z are used to represent arbitrary sets. Italics are used to denote the elements of a set. For example, symbols and strings of the form $a$, $b$, $A$, $B$, $aaaa$, and $abc$ represent elements of sets.

Brackets { } are used to indicate a set definition. Sets with a small number of members can be defined explicitly; that is, their members can be listed. The sets

$$X = \{1, 2, 3\} \qquad Y = \{a, b, c, d, e\}$$

are defined in an explicit manner. Sets having a large finite or infinite number of members must be defined implicitly. A set is defined implicitly by specifying conditions that describe the elements of the set. The set consisting of all perfect squares is defined by

$$\{n \mid n = m^2 \text{ for some natural number } m\}.$$

The vertical bar $\mid$ in an implicit definition is read "such that." The entire definition is read "the set of $n$ such that $n$ equals $m$ squared for some natural number $m$."

The previous example mentioned the set of natural numbers. This important set, denoted $\mathbf{N}$, consists of the numbers 0, 1, 2, 3, . . . . The empty set, denoted $\emptyset$, is the set that has no members and can be defined explicitly by $\emptyset = \{\ \}$.

A set is determined completely by its membership; the order in which the elements are presented in the definition is immaterial. The explicit definitions

$$X = \{1, 2, 3\}, \quad Y = \{2, 1, 3\}, \quad Z = \{1, 3, 2, 2, 2\}$$

describe the same set. The definition of Z contains multiple instances of the number 2. Repetition in the definition of a set does not affect the membership. Set equality requires that the sets have exactly the same members, and this is the case; each of the sets X, Y, and Z has the natural numbers 1, 2, and 3 as its members.
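The explicit and implicit styles of definition described above can be tried out with Python's built-in `set` type. This is only an illustrative sketch: the helper `perfect_squares` and its `limit` parameter are assumptions introduced here so that the infinite set of perfect squares can be enumerated up to a finite bound.

```python
# Explicit definitions: order and repetition do not affect membership.
X = {1, 2, 3}
Y = {2, 1, 3}
Z = {1, 3, 2, 2, 2}


def perfect_squares(limit):
    """Return {n | n = m*m for some natural number m}, restricted to n <= limit.

    The bound `limit` is an assumption of this sketch; the set in the text
    is infinite and cannot be listed in full.
    """
    return {m * m for m in range(limit + 1) if m * m <= limit}


print(X == Y == Z)                  # True: all three describe the same set
print(sorted(perfect_squares(30)))  # [0, 1, 4, 9, 16, 25]
```

The set comprehension mirrors the implicit definition: the condition after `if` plays the role of the text following the vertical bar.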
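A set Y is a subset of X, written $Y \subseteq X$, if every member of Y is also a member of X. The empty set is trivially a subset of every set. Every set X is a subset of itself. If Y is a subset of X and $Y \neq X$, then Y is called a proper subset of X. The set of all subsets of X is called the power set of X and is denoted $\mathcal{P}(X)$.

Example 1.1.1
Let $X = \{1, 2, 3\}$. The subsets of X are

$$\emptyset \quad \{1\} \quad \{2\} \quad \{3\} \quad \{1, 2\} \quad \{2, 3\} \quad \{3, 1\} \quad \{1, 2, 3\}.$$
□

Set operations are used to construct new sets from existing ones. The union of two sets is defined by

$$X \cup Y = \{z \mid z \in X \text{ or } z \in Y\}.$$

The or is inclusive. This means that $z$ is a member of $X \cup Y$ if it is a member of X or Y or both. The intersection of two sets is the set of elements common to both. This is defined by

$$X \cap Y = \{z \mid z \in X \text{ and } z \in Y\}.$$

Two sets whose intersection is empty are said to be disjoint. The union and intersection of $n$ sets, $X_1, X_2, \ldots, X_n$, are defined by

$$\bigcup_{i=1}^{n} X_i = X_1 \cup X_2 \cup \cdots \cup X_n = \{x \mid x \in X_i, \text{ for some } i = 1, 2, \ldots, n\}$$

$$\bigcap_{i=1}^{n} X_i = X_1 \cap X_2 \cap \cdots \cap X_n = \{x \mid x \in X_i, \text{ for all } i = 1, 2, \ldots, n\},$$

respectively. Subsets $X_1, X_2, \ldots, X_n$ of a set X are said to partition X if

i) $X = \bigcup_{i=1}^{n} X_i$
ii) $X_i \cap X_j = \emptyset$, for $1 \le i, j \le n$ and $i \neq j$.

The difference of sets X and Y, written $X - Y$, consists of the elements of X that are not in Y:

$$X - Y = \{z \mid z \in X \text{ and } z \notin Y\}.$$

The complement of X with respect to a universal set U, written $\overline{X}$, is the set $U - X$. DeMorgan's Laws relate complements to unions and intersections:

$$\overline{X \cup Y} = \overline{X} \cap \overline{Y} \qquad \overline{X \cap Y} = \overline{X} \cup \overline{Y}.$$

Example 1.1.2
Let $X = \{0, 1, 2, 3\}$ and $Y = \{2, 3, 4, 5\}$, with complements taken with respect to $\mathbf{N}$. Then

$X \cup Y = \{0, 1, 2, 3, 4, 5\}$        $\overline{X} = \{n \mid n > 3\}$
$X \cap Y = \{2, 3\}$                    $\overline{Y} = \{0, 1\} \cup \{n \mid n > 5\}$
$X - Y = \{0, 1\}$                       $\overline{X} \cap \overline{Y} = \{n \mid n > 5\}$
$Y - X = \{4, 5\}$                       $\overline{X \cup Y} = \{n \mid n > 5\}$

The final two sets in the right-hand column exhibit the equality required by DeMorgan's Law. □

The definition of subset provides the method for proving that a set X is a subset of Y; we must show that every element of X is also an element of Y. When X is finite, we can explicitly check each element of X for membership in Y. When X contains infinitely many elements, a different approach is needed. The strategy is to show that an arbitrary element of X is in Y.

Example 1.1.3
We will show that $X = \{8n - 1 \mid n > 0\}$ is a subset of $Y = \{2m + 1 \mid m \text{ is odd}\}$. To gain a better understanding of the sets X and Y, it is useful to generate some of the elements of X and Y:

X: $8 \cdot 1 - 1 = 7$, $8 \cdot 2 - 1 = 15$, $8 \cdot 3 - 1 = 23$, $8 \cdot 4 - 1 = 31$, . . .
Y: $2 \cdot 1 + 1 = 3$, $2 \cdot 3 + 1 = 7$, $2 \cdot 5 + 1 = 11$, $2 \cdot 7 + 1 = 15$, . . .

To establish the inclusion,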
we must show that every element of X is also an element of Y. An arbitrary element $x$ of X has the form $8n - 1$, for some $n > 0$. Let $m = 4n - 1$. Then $m$ is an odd natural number and

$$2m + 1 = 2(4n - 1) + 1 = 8n - 2 + 1 = 8n - 1 = x.$$

Thus $x$ is also in Y and $X \subseteq Y$. □

Set equality can be defined using set inclusion; sets X and Y are equal if $X \subseteq Y$ and $Y \subseteq X$. This simply states that every element of X is also an element of Y and vice versa. When establishing the equality of two sets, the two inclusions are usually proved separately and combined to yield the equality.

Example 1.1.4
We prove that the sets

$$X = \{n \mid n = m^2 \text{ for some natural number } m > 0\}$$
$$Y = \{n^2 + 2n + 1 \mid n \ge 0\}$$

are equal. First, we show that every element of X is also an element of Y. Let $x \in X$; then $x = m^2$ for some natural number $m > 0$. Let $m_0$ be that number. Then $x$ can be written

$$x = (m_0)^2 = (m_0 - 1 + 1)^2 = (m_0 - 1)^2 + 2(m_0 - 1) + 1.$$

Letting $n = m_0 - 1$, we see that $x = n^2 + 2n + 1$ with $n \ge 0$. Consequently, $x$ is a member of the set Y.

We now establish the opposite inclusion. Let $y = (n_0)^2 + 2n_0 + 1$ be an element of Y. Factoring yields $y = (n_0 + 1)^2$. Thus $y$ is the square of a natural number greater than zero and therefore an element of X. Since $X \subseteq Y$ and $Y \subseteq X$, we conclude that $X = Y$. □

1.2 Cartesian Product, Relations, and Functions

The Cartesian product is a set operation that builds a set consisting of ordered pairs of elements from two existing sets. The Cartesian product of sets X and Y, denoted $X \times Y$, is defined by

$$X \times Y = \{[x, y] \mid x \in X \text{ and } y \in Y\}.$$

A binary relation on X and Y is a subset of $X \times Y$. The ordering of the natural numbers can be used to generate a relation LT (less than) on the set $\mathbf{N} \times \mathbf{N}$. This relation is the subset of $\mathbf{N} \times \mathbf{N}$ defined by

$$LT = \{[i, j] \mid i < j \text{ and } i, j \in \mathbf{N}\}.$$

The notation $[i, j] \in LT$ indicates that $i$ is less than $j$; for example, $[0, 1], [0, 2] \in LT$ and $[1, 1] \notin LT$.
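As a rough illustration of the Cartesian product and of a relation as a subset of it, the sketch below builds LT on a finite prefix of the natural numbers. The function names `cartesian` and `less_than` and the `bound` parameter are assumptions made for this example, and Python tuples `(i, j)` stand in for the ordered pairs written $[i, j]$ in the text.

```python
from itertools import product


def cartesian(xs, ys):
    """Return the Cartesian product of xs and ys as a set of ordered pairs."""
    return set(product(xs, ys))


def less_than(bound):
    """Return LT restricted to the natural numbers 0, 1, ..., bound - 1."""
    naturals = range(bound)
    return {(i, j) for i, j in product(naturals, naturals) if i < j}


LT = less_than(4)
print((0, 1) in LT, (0, 2) in LT, (1, 1) in LT)  # True True False
print(LT <= cartesian(range(4), range(4)))       # True: LT is a subset of the product
```

Restricting to a finite bound is a concession to enumeration only; the relation LT in the text is defined on all of $\mathbf{N} \times \mathbf{N}$.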
The Cartesian product can be generalized to construct new sets from any finite number of sets. If $x_1, x_2, \ldots, x_n$ are $n$ elements, then $[x_1, x_2, \ldots, x_n]$ is called an ordered $n$-tuple. An ordered pair is simply another name for an ordered 2-tuple. Ordered 3-tuples, 4-tuples, and 5-tuples are commonly referred to as triples, quadruples, and quintuples, respectively. The Cartesian product of $n$ sets $X_1, X_2, \ldots, X_n$ is defined by
$X_1 \times X_2 \times \cdots \times X_n = \{[x_1, x_2, \ldots, x_n] \mid x_i \in X_i, \text{ for } i = 1, 2, \ldots, n\}$.
An $n$-ary relation on $X_1, X_2, \ldots, X_n$ is a subset of $X_1 \times X_2 \times \cdots \times X_n$. 1-ary, 2-ary, and 3-ary relations are called unary, binary, and ternary, respectively.

Example 1.2.1
Let $X = \{1, 2, 3\}$ and $Y = \{a, b\}$. Then
a) $X \times Y = \{[1, a], [1, b], [2, a], [2, b], [3, a], [3, b]\}$
b) $Y \times X = \{[a, 1], [a, 2], [a, 3], [b, 1], [b, 2], [b, 3]\}$
c) $Y \times Y = \{[a, a], [a, b], [b, a], [b, b]\}$
d) $X \times Y \times Y = \{[1, a, a], [1, b, a], [2, a, a], [2, b, a], [3, a, a], [3, b, a], [1, a, b], [1, b, b], [2, a, b], [2, b, b], [3, a, b], [3, b, b]\}$ □

Informally, a function from a set $X$ to a set $Y$ is a mapping of elements of $X$ to elements of $Y$ in which each element of $X$ is mapped to at most one element of $Y$. A function $f$ from $X$ to $Y$ is denoted $f : X \to Y$. The element of $Y$ assigned by the function $f$ to an element $x \in X$ is denoted $f(x)$. The set $X$ is called the domain of the function and the elements of $X$ are the arguments or operands of the function $f$. The range of $f$ is the subset of $Y$ consisting of the members of $Y$ that are assigned to elements of $X$. Thus the range of a function $f : X \to Y$ is the set $\{y \in Y \mid y = f(x) \text{ for some } x \in X\}$.

The relationship that assigns to each person his or her age is a function from the set of people to the natural numbers. Note that an element in the range may be assigned to more than one element of the domain; there are many people who have the same age. Moreover, not all natural numbers are in the range of the function; it is unlikely that the number 1000 is assigned to anyone.
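Returning to the generalized Cartesian product, the sets of Example 1.2.1 can be generated mechanically. This is a minimal sketch that assumes Python tuples stand in for ordered $n$-tuples; the helper name `n_fold_product` is introduced here for illustration.

```python
from itertools import product

X = [1, 2, 3]
Y = ["a", "b"]


def n_fold_product(*sets):
    """Return the Cartesian product of the given sets as a set of n-tuples."""
    return set(product(*sets))


xy = n_fold_product(X, Y)       # part (a)
yx = n_fold_product(Y, X)       # part (b)
yy = n_fold_product(Y, Y)       # part (c)
xyy = n_fold_product(X, Y, Y)   # part (d)

print(len(xy), len(yx), len(yy), len(xyy))
# The order of the factors matters: X x Y and Y x X contain different pairs.
print(xy == yx)
```

The sizes multiply (3·2, 2·3, 2·2, and 3·2·2), which agrees with the number of tuples listed in each part of the example.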
The domain of a function is a set, but this set is often the Cartesian product of two or more sets. A function
$f : X_1 \times X_2 \times \cdots \times X_n \to Y$
is said to be an $n$-variable function or operation. The value of the function with variables $x_1, x_2, \ldots, x_n$ is denoted $f(x_1, x_2, \ldots, x_n)$. Functions with one, two, or three variables are often referred to as unary, binary, and ternary operations. The function $sq : \mathbf{N} \to \mathbf{N}$ that assigns $n^2$ to each natural number is a unary operation. When the domain of a function consists of the Cartesian product of a set $X$ with itself, the function is simply said to be a binary operation on $X$. Addition and multiplication are examples of binary operations on $\mathbf{N}$.

A function $f$ relates members of the domain to members of the range of $f$. A natural definition of function is in terms of this relation. A total function $f$ from $X$ to $Y$ is a binary relation on $X \times Y$ that satisfies the following two properties:
i) For each $x \in X$, there is a $y \in Y$ such that $[x, y] \in f$.
ii) If $[x, y_1] \in f$ and $[x, y_2] \in f$, then $y_1 = y_2$.
Condition (i) guarantees that each element of $X$ is assigned a member of $Y$, hence the term total. The second condition ensures that this assignment is unique. The previously defined relation LT is not a total function since it does not satisfy the second condition. A relation on $\mathbf{N} \times \mathbf{N}$ representing greater than fails to satisfy either of the conditions. Why?

Example 1.2.2
Let $X = \{1, 2, 3\}$ and $Y = \{a, b\}$. The eight total functions from $X$ to $Y$ are listed below.

x | f(x)    x | f(x)    x | f(x)    x | f(x)
1 | a       1 | a       1 | a       1 | b
2 | a       2 | a       2 | b       2 | a
3 | a       3 | b       3 | a       3 | a

x | f(x)    x | f(x)    x | f(x)    x | f(x)
1 | a       1 | b       1 | b       1 | b
2 | b       2 | a       2 | b       2 | b
3 | b       3 | b       3 | a       3 | b
□

A partial function $f$ from $X$ to $Y$ is a relation on $X \times Y$ in which $y_1 = y_2$ whenever $[x, y_1] \in f$ and $[x, y_2] \in f$. A partial function $f$ is defined for an argument $x$ if there is a $y \in Y$ such that $[x, y] \in f$. Otherwise, $f$ is undefined for $x$. A total function is simply a partial function defined for all elements of the domain.
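The relational definitions of partial and total functions translate directly into checks on finite sets of pairs. The following sketch is one possible encoding, not the text's own; the function names are assumptions, and a relation is represented as a Python set of tuples.

```python
from itertools import product


def is_partial_function(relation):
    """Condition (ii): no argument x is paired with two different values."""
    assigned = {}
    for x, y in relation:
        if x in assigned and assigned[x] != y:
            return False
        assigned[x] = y
    return True


def is_total_function(relation, domain):
    """Conditions (i) and (ii): every x in the domain has exactly one value."""
    defined_for = {x for x, _ in relation}
    return is_partial_function(relation) and defined_for == set(domain)


def total_functions(domain, codomain):
    """Enumerate every total function from domain to codomain as a set of pairs."""
    domain = list(domain)
    for values in product(codomain, repeat=len(domain)):
        yield set(zip(domain, values))


X, Y = [1, 2, 3], ["a", "b"]
functions = list(total_functions(X, Y))
print(len(functions))  # one function per choice of a value for each argument

# A fragment of LT pairs 0 with both 1 and 2, so condition (ii) fails.
print(is_partial_function({(0, 1), (0, 2), (1, 2)}))
```

Each of the three arguments has two possible values, so the enumeration yields $2^3 = 8$ functions, matching the count in Example 1.2.2.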
Although functions have been formally defined in terms of relations, we will use the standard notation $f(x) = y$ to indicate that $y$ is the value assigned to $x$ by the function $f$, that is, that $[x, y] \in f$. The notation $f(x)\uparrow$ indicates that the partial function $f$ is undefined for the argument $x$. The notation $f(x)\downarrow$ is used to show that $f(x)$ is defined without explicitly giving its value.

Integer division defines a binary partial function div from $\mathbf{N} \times \mathbf{N}$ to $\mathbf{N}$. The quotient obtained from the division of $i$ by $j$, when defined, is assigned to $div(i, j)$. For example, $div(3, 2) = 1$, $div(4, 2) = 2$, and $div(1, 2) = 0$. Using the previous notation, $div(i, 0)\uparrow$ and $div(i, j)\downarrow$ for all values of $j$ other than zero.

A total function $f : X \to Y$ is said to be one-to-one if each element of $X$ maps to a distinct element in the range. Formally, $f$ is one-to-one if $x_1 \neq x_2$ implies $f(x_1) \neq f(x_2)$. A function $f : X \to Y$ is said to be onto if the range of $f$ is the entire set $Y$. A total function that is both one-to-one and onto defines a correspondence between the elements of the domain and the range.

Example 1.2.3
The functions $f$, $g$, and $s$ are defined from $\mathbf{N}$ to $\mathbf{N} - \{0\}$, the set of positive natural numbers.
i) $f(n) = 2n + 1$
ii) $g(n) = \begin{cases} 1 & \text{if } n = 0 \\ n & \text{otherwise} \end{cases}$
iii) $s(n) = n + 1$
The function $f$ is one-to-one but not onto; the range of $f$ consists of the odd numbers. The mapping from $\mathbf{N}$ to $\mathbf{N} - \{0\}$ defined by $g$ is clearly onto but not one-to-one since $g(0) = g(1) = 1$. The function $s$ is both one-to-one and onto, defining a correspondence that maps each natural number to its successor. □

Example 1.2.4
In the preceding example we noted that the function $f(n) = 2n + 1$ is one-to-one, but not onto the set $\mathbf{N} - \{0\}$. It is, however, a mapping from $\mathbf{N}$ to the set of odd natural numbers that is both one-to-one and onto. We will use $f$ to demonstrate how to prove that a function has these properties.
One-to-one: To prove that a function is one-to-one, we show that $n$ and $m$ must be the same whenever $f(n) = f(m)$. The assumption $f(n) = f(m)$ yields
$2n + 1 = 2m + 1$, or
$2n = 2m$, and finally,
$n = m$.
It follows that $n \neq m$ implies $f(n) \neq f(m)$, and $f$ is one-to-one.

Onto: To establish that $f$ maps $\mathbf{N}$ onto the set of odd natural numbers, we must show that every odd natural number is in the range of $f$. If $m$ is an odd natural number, it can be written $m = 2n + 1$ for some $n \in \mathbf{N}$. Then $f(n) = 2n + 1 = m$ and $m$ is in the range of $f$. □

1.3 Equivalence Relations

A binary relation over a set $X$ has been formally defined as a subset of the Cartesian product $X \times X$. Informally, we use a relation to indicate whether a property holds between two elements of a set. An ordered pair is in the relation if its elements satisfy the prescribed condition. For example, the property is less than defines a binary relation on the set of natural numbers. The relation defined by this property is the set $LT = \{[i, j] \mid i < j\}$.

Infix notation is often used to express membership in many common binary relations. In this standard usage, $i < j$ indicates that $i$ is less than $j$ and consequently the pair $[i, j]$ is in the relation LT defined above.

We now consider a type of relation, known as an equivalence relation, that can be used to partition the underlying set. Equivalence relations are generally denoted using the infix notation $a \equiv b$ to indicate that $a$ is equivalent to $b$.

Definition 1.3.1
A binary relation $\equiv$ over a set $X$ is an equivalence relation if it satisfies
i) Reflexivity: $a \equiv a$, for all $a \in X$
ii) Symmetry: $a \equiv b$ implies $b \equiv a$, for all $a, b \in X$
iii) Transitivity: $a \equiv b$ and $b \equiv c$ implies $a \equiv c$, for all $a, b, c \in X$.

Definition 1.3.2
Let $\equiv$ be an equivalence relation over $X$. The equivalence class of an element $a \in X$ defined by the relation $\equiv$ is the set $[a]_{\equiv} = \{b \in X \mid a \equiv b\}$.
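On a finite set, Definitions 1.3.1 and 1.3.2 can be checked by brute force. The sketch below is a hedged illustration: the relation `same_remainder_mod_3` is a hypothetical example chosen here, not one taken from the text, and a relation is passed as a two-argument predicate.

```python
def is_equivalence_relation(elements, equivalent):
    """Check reflexivity, symmetry, and transitivity by exhaustive search."""
    elements = list(elements)
    reflexive = all(equivalent(a, a) for a in elements)
    symmetric = all(
        equivalent(b, a)
        for a in elements for b in elements if equivalent(a, b)
    )
    transitive = all(
        equivalent(a, c)
        for a in elements for b in elements for c in elements
        if equivalent(a, b) and equivalent(b, c)
    )
    return reflexive and symmetric and transitive


def equivalence_class(a, elements, equivalent):
    """Return [a] = {b in elements | a is equivalent to b}."""
    return frozenset(b for b in elements if equivalent(a, b))


def equivalence_classes(elements, equivalent):
    """Collect the distinct equivalence classes of a finite set."""
    elements = list(elements)
    return {equivalence_class(a, elements, equivalent) for a in elements}


def same_remainder_mod_3(a, b):
    # Illustrative relation (an assumption of this sketch).
    return a % 3 == b % 3


X = range(10)
classes = equivalence_classes(X, same_remainder_mod_3)
for cls in sorted(classes, key=min):
    print(sorted(cls))
```

Because the classes are stored in a set, two elements that generate the same class contribute only one entry, so the choice of representative does not affect the result.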
Example 1.3.1
Let $\equiv_P$ be the parity relation over $\mathbf{N}$ defined by $n \equiv_P m$ if, and only if, $n$ and $m$ have the same parity (even or odd). To prove that $\equiv_P$ is an equivalence relation, we must show that it is symmetric, reflexive, and transitive.
i) Reflexivity: For every natural number $n$, $n$ has the same parity as itself and $n \equiv_P n$.
ii) Symmetry: If $n \equiv_P m$, then $n$ and $m$ have the same parity and $m \equiv_P n$.
iii) Transitivity: If $n \equiv_P m$ and $m \equiv_P k$, then $n$ and $m$ have the same parity and $m$ and $k$ have the same parity. It follows that $n$ and $k$ have the same parity and $n \equiv_P k$.
The two equivalence classes of the parity relation $\equiv_P$ are $[0]_{\equiv_P} = \{0, 2, 4, \ldots\}$ and $[1]_{\equiv_P} = \{1, 3, 5, \ldots\}$. □

An equivalence class is usually written $[a]_{\equiv}$, where $a$ is an element in the class. In the preceding example, $[0]_{\equiv_P}$ was used to represent the set of even natural numbers. Lemma 1.3.3 shows that if $a \equiv b$, then $[a]_{\equiv} = [b]_{\equiv}$. Thus the element chosen to represent the class is irrelevant.

Lemma 1.3.3
Let $\equiv$ be an equivalence relation over $X$ and let $a$ and $b$ be elements of $X$. Then either $[a]_{\equiv} = [b]_{\equiv}$ or $[a]_{\equiv} \cap [b]_{\equiv} = \emptyset$.

Proof. Assume that the intersection of $[a]_{\equiv}$ and $[b]_{\equiv}$ is not empty. Then there is some element $c$ that is in both of the equivalence classes. Using symmetry and transitivity, we show that $[b]_{\equiv} \subseteq [a]_{\equiv}$. Since $c$ is in both $[a]_{\equiv}$ and $[b]_{\equiv}$, we know $a \equiv c$ and $b \equiv c$. By symmetry, $c \equiv b$. Using transitivity, we conclude that $a \equiv b$.

Now let $d$ be any element in $[b]_{\equiv}$. Then $b \equiv d$. The combination of $a \equiv b$, $b \equiv d$, and transitivity yields $a \equiv d$. That is, $d \in [a]_{\equiv}$. We have shown that every element in $[b]_{\equiv}$ is also in $[a]_{\equiv}$, so $[b]_{\equiv} \subseteq [a]_{\equiv}$. By a similar argument, we can establish that $[a]_{\equiv} \subseteq [b]_{\equiv}$. The two inclusions combine to produce the desired set equality. ∎

Theorem 1.3.4
Let $\equiv$ be an equivalence relation over $X$. The equivalence classes of $\equiv$ partition $X$.

Proof. By Lemma 1.3.3, we know that the equivalence classes form a disjoint family of subsets of $X$. Let $a$ be any element of $X$.
By reflexivity, $a \in [a]_{\equiv}$. Thus each element of $X$ is in one of the equivalence classes. It follows that the union of the equivalence classes is the entire set $X$. ∎

1.4 Countable and Uncountable Sets

Cardinality is a measure that compares the size of sets. Intuitively, the cardinality of a set is the number of elements in the set. This informal definition is sufficient when dealing with finite sets; the cardinality can be obtained by counting the elements of the set. There are obvious difficulties in extending this approach to infinite sets.

Two finite sets can be shown to have the same number of elements by constructing a one-to-one correspondence between the elements of the sets. For example, the mapping
$a \to 1 \qquad b \to 2 \qquad c \to 3$
demonstrates that the sets $\{a, b, c\}$ and $\{1, 2, 3\}$ have the same size. This approach, comparing the size of sets using mappings, works equally well for sets with a finite or infinite number of members.

Definition 1.4.1
i) Two sets $X$ and $Y$ have the same cardinality if there is a total one-to-one function from $X$ onto $Y$.
ii) The cardinality of a set $X$ is less than or equal to the cardinality of a set $Y$ if there is a total one-to-one function from $X$ into $Y$.

Note that the two definitions differ only by the extent to which the mapping covers the set $Y$. If the range of the one-to-one mapping is all of $Y$, then the two sets have the same cardinality.

The cardinality of a set $X$ is denoted $card(X)$. The relationships in (i) and (ii) are denoted $card(X) = card(Y)$ and $card(X) \le card(Y)$, respectively. The cardinality of $X$ is said to be strictly less than that of $Y$, written $card(X) < card(Y)$, if $card(X) \le card(Y)$ and $card(X) \neq card(Y)$. The Schröder-Bernstein Theorem establishes the familiar relationship between $\le$ and $=$ for cardinality. The proof of the Schröder-Bernstein Theorem is left as an exercise.

Theorem 1.4.2 (Schröder-Bernstein)
If $card(X) \le card(Y)$ and $card(Y) \le card(X)$, then $card(X) = card(Y)$.
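For finite sets, the conditions of Definition 1.4.1 can be checked directly on a proposed mapping. The following is a minimal sketch under the assumption that a mapping is given as a Python dictionary; the function names are introduced here for illustration.

```python
def is_total_one_to_one(f, X, Y):
    """f (a dict) is defined on all of X, maps into Y, and repeats no value."""
    values = list(f.values())
    return (
        set(f) == set(X)
        and set(values) <= set(Y)
        and len(values) == len(set(values))
    )


def shows_same_cardinality(f, X, Y):
    """Definition 1.4.1(i): f is a total one-to-one function from X onto Y."""
    return is_total_one_to_one(f, X, Y) and set(f.values()) == set(Y)


def shows_at_most_cardinality(f, X, Y):
    """Definition 1.4.1(ii): f is a total one-to-one function from X into Y."""
    return is_total_one_to_one(f, X, Y)


mapping = {"a": 1, "b": 2, "c": 3}
print(shows_same_cardinality(mapping, {"a", "b", "c"}, {1, 2, 3}))

# The same mapping into a larger set witnesses only the weaker relationship.
print(shows_same_cardinality(mapping, {"a", "b", "c"}, {1, 2, 3, 4}))
print(shows_at_most_cardinality(mapping, {"a", "b", "c"}, {1, 2, 3, 4}))
```

A failed check says only that this particular mapping is not a witness; the definition asks whether some suitable mapping exists.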
The cardinality of a finite set is denoted by the number of elements in the set. Thus $card(\{a, b\}) = 2$. A set that has the same cardinality as the set of natural numbers is said to be countably infinite or denumerable. Intuitively, a set is denumerable if its members can be put into an order and counted. The mapping $f$ that establishes the correspondence with the natural numbers provides such an ordering; the first element is $f(0)$, the second $f(1)$, the third $f(2)$, and so on. The term countable refers to sets that are either finite or denumerable. A set that is not countable is said to be uncountable.

The set $\mathbf{N} - \{0\}$ is countably infinite; the function $s(n) = n + 1$ defines a one-to-one mapping from $\mathbf{N}$ onto $\mathbf{N} - \{0\}$. It may seem paradoxical that the set $\mathbf{N} - \{0\}$, obtained by removing an element from $\mathbf{N}$, has the same number of elements as $\mathbf{N}$. Clearly, there is no one-to-one mapping of a finite set onto a proper subset of itself. It is this property that differentiates finite and infinite sets.

Definition 1.4.3
A set is infinite if it has a proper subset of the same cardinality.

Example 1.4.1
The set of odd natural numbers is countably infinite. The function $f(n) = 2n + 1$ from Example 1.2.4 establishes the one-to-one correspondence between $\mathbf{N}$ and the odd numbers. □

A set is countably infinite if its elements can be put in a one-to-one correspondence with the natural numbers. A diagram of a mapping from $\mathbf{N}$ onto a set graphically illustrates the countability of the set. The one-to-one correspondence between the natural numbers and the set of all integers
integers: $\ldots \quad -3 \quad -2 \quad -1 \quad 0 \quad 1 \quad 2 \quad 3 \quad \ldots$
natural numbers: $\ldots \quad 6 \quad 4 \quad 2 \quad 0 \quad 1 \quad 3 \quad 5 \quad \ldots$
exhibits the countability of the set of integers. This correspondence is defined by the function
$f(n) = \begin{cases} div(n, 2) + 1 & \text{if } n \text{ is odd} \\ -div(n, 2) & \text{if } n \text{ is even.} \end{cases}$

Example 1.4.2
The points of an infinite two-dimensional grid can be used to show that $\mathbf{N} \times \mathbf{N}$, the set of ordered pairs of natural numbers, is denumerable. The grid is constructed by labeling the axes with the natural numbers.
The position defined by the $i$th entry on the horizontal axis and the $j$th entry on the vertical axis represents the ordered pair $[i, j]$.

[Figure: grid of the ordered pairs $[i, j]$, with arrows indicating the order in which the pairs are listed]

The elements of the grid can be listed sequentially by following the arrows in the diagram. This creates the correspondence
$0 \leftrightarrow [0, 0]$, $1 \leftrightarrow [0, 1]$, $2 \leftrightarrow [1, 0]$, $3 \leftrightarrow [0, 2]$, $4 \leftrightarrow [1, 1]$, $5 \leftrightarrow [2, 0]$, $6 \leftrightarrow [0, 3]$, $7 \leftrightarrow [1, 2]$, ...
that demonstrates the countability of $\mathbf{N} \times \mathbf{N}$. The one-to-one correspondence outlined above maps the ordered pair $[i, j]$ to the natural number $((i + j)(i + j + 1)/2) + i$. □

The sets of interest in language theory and computability are almost exclusively finite or denumerable. We state, without proof, several closure properties of countable sets.

Theorem 1.4.4
i) The union of two countable sets is countable.
ii) The Cartesian product of two countable sets is countable.
iii) The set of finite subsets of a countable set is countable.
iv) The set of finite-length sequences consisting of elements of a nonempty countable set is countably infinite.

The preceding theorem indicates that the property of countability is retained under many standard set-theoretic operations. Each of these closure results can be established by constructing a one-to-one correspondence between the new set and a subset of the natural numbers.

A set is uncountable if it is impossible to sequentially list its members. The following proof technique, known as Cantor's diagonalization argument, is used to show that there is an uncountable number of total functions from $\mathbf{N}$ to $\mathbf{N}$. Two total functions $f : \mathbf{N} \to \mathbf{N}$ and $g : \mathbf{N} \to \mathbf{N}$ are equal if they have the same value for every element in the domain. That is, $f = g$ if $f(n) = g(n)$ for all $n \in \mathbf{N}$. To show that two functions are distinct, it suffices to find a single input value for which the functions differ.

Assume that the set of total functions from the natural numbers to the natural numbers is denumerable. Then there is a sequence $f_0, f_1, f_2, \ldots$
that contains all the functions. The values of the functions are exhibited in the two-dimensional grid with the input values on the horizontal axis and the functions on the vertical axis.

      | 0        | 1        | 2        | 3        | 4        | ...
$f_0$ | $f_0(0)$ | $f_0(1)$ | $f_0(2)$ | $f_0(3)$ | $f_0(4)$ | ...
$f_1$ | $f_1(0)$ | $f_1(1)$ | $f_1(2)$ | $f_1(3)$ | $f_1(4)$ | ...
$f_2$ | $f_2(0)$ | $f_2(1)$ | $f_2(2)$ | $f_2(3)$ | $f_2(4)$ | ...
$f_3$ | $f_3(0)$ | $f_3(1)$ | $f_3(2)$ | $f_3(3)$ | $f_3(4)$ | ...
$f_4$ | $f_4(0)$ | $f_4(1)$ | $f_4(2)$ | $f_4(3)$ | $f_4(4)$ | ...
...

Consider the function $f : \mathbf{N} \to \mathbf{N}$ defined by $f(n) = f_n(n) + 1$. The values of $f$ are obtained by adding 1 to the values on the diagonal of the grid, hence the name diagonalization. By the definition of $f$, $f(i) \neq f_i(i)$ for every $i$. Consequently, $f$ is not in the sequence $f_0, f_1, f_2, \ldots$. This is a contradiction since the sequence was assumed to contain all the total functions. The assumption that the number of functions is countably infinite leads to a contradiction. It follows that the set is uncountable.

Diagonalization is a general proof technique for demonstrating that a set is not countable. As seen in the preceding example, establishing uncountability using diagonalization is a proof by contradiction. The first step is to assume that the set is countable and therefore its members can be exhaustively listed. The contradiction is achieved by producing a member of the set that cannot occur anywhere in the list. No conditions are put on the listing of the elements other than that it must contain all the elements of the set. Producing a contradiction by diagonalization shows that there is no possible exhaustive listing of the elements and consequently that the set is uncountable. This technique is exhibited again in the following examples.

Example 1.4.3
A function $f$ from $\mathbf{N}$ to $\mathbf{N}$ has a fixed point if there is some natural number $i$ such that $f(i) = i$. For example, $f(n) = n^2$ has fixed points 0 and 1, while $f(n) = n^2 + 1$ has no fixed points. We will show that the number of functions that do not have fixed points is uncountable. The argument is similar to the proof that the number of all functions from $\mathbf{N}$
to $\mathbf{N}$ is uncountable, except that we now have an additional condition that must be met when constructing an element that is not in the listing.

Assume that the number of the functions without fixed points is countable. Then these functions can be listed $f_0, f_1, f_2, \ldots$. To obtain a contradiction to our assumption that the set is countable, we construct a function that has no fixed points and is not in the list. Consider the function $f(n) = f_n(n) + n + 1$. The addition of $n + 1$ in the definition of $f$ ensures that $f(n) > n$ for all $n$. Thus $f$ has no fixed points. By an argument similar to that given above, $f(i) \neq f_i(i)$ for all $i$. Consequently, the listing $f_0, f_1, f_2, \ldots$ is not exhaustive, and we conclude that the number of functions without fixed points is uncountable. □

Example 1.4.4
$\mathcal{P}(\mathbf{N})$, the set of subsets of $\mathbf{N}$, is uncountable. Assume that the set of subsets of $\mathbf{N}$ is countable. Then they can be listed $N_0, N_1, N_2, \ldots$. Define a subset $D$ of $\mathbf{N}$ as follows: For every natural number $j$,
$j \in D$ if, and only if, $j \notin N_j$.
By our construction, $0 \in D$ if $0 \notin N_0$, $1 \in D$ if $1 \notin N_1$, and so on. The set $D$ is clearly a set of natural numbers. By our assumption, $N_0, N_1, N_2, \ldots$ is an exhaustive listing of the subsets of $\mathbf{N}$. Hence, $D = N_i$ for some $i$. Is the number $i$ in the set $D$? By definition of $D$,
$i \in D$ if, and only if, $i \notin N_i$.
But since $D = N_i$, this becomes
$i \in D$ if, and only if, $i \notin D$,
which is a contradiction. Thus, our assumption that $\mathcal{P}(\mathbf{N})$ is countable must be false and we conclude that $\mathcal{P}(\mathbf{N})$ is uncountable.

To appreciate the "diagonal" technique, consider a two-dimensional grid with the natural numbers on the horizontal axis and the vertical axis labeled by the sets $N_0, N_1, N_2, \ldots$. The position of the grid designated by row $N_i$ and column $j$ contains yes if $j \in N_i$. Otherwise, the position defined by $N_i$ and column $j$ contains no. The set $D$ is constructed by considering the relationship between the entries along the diagonal of the grid: the number $j$ and the set $N_j$.
By the way that we have defined $D$, the number $j$ is an element of $D$ if, and only if, the entry in the position labeled by $N_j$ and $j$ is no. □

1.5 Diagonalization and Self-Reference

In addition to its use in cardinality proofs, diagonalization provides a method for demonstrating that certain properties or relations are inherently contradictory. These results are used in nonexistence proofs since there can be no object that satisfies such a property. Diagonalization proofs of nonexistence frequently depend upon contradictions that arise from self-reference: an object analyzing its own actions, properties, or characteristics. Russell's paradox, the undecidability of the Halting Problem for Turing Machines, and Gödel's proof of the undecidability of number theory are all based on contradictions associated with self-reference.

The diagonalization proofs in the preceding section used a table with operators listed on the vertical axis and their arguments on the horizontal axis to illustrate the relationship between the operators and arguments. In each example, the operators were of a different type than their arguments. In self-reference, the same family of objects comprises the operators and their arguments. We will use the barber's paradox, an amusing simplification of Russell's paradox, to illustrate diagonalization and self-reference.

The barber's paradox is concerned with who shaves whom in a mythical town. We are told that every man who is able to shave himself does so and that the barber of the town (a man himself) shaves all and only the people who cannot shave themselves. We wish to consider the possible truth of such a statement and the existence of such a town. In this case, the set of males in the town make up both the operators and the arguments; they are doing the shaving and being shaved. Let $M = \{p_1, p_2, p_3, \ldots, p_i, \ldots\}$ be the set of all males in the town.
A tabular representation of the shaving relationship has the form

[Table: rows and columns both labeled $p_1, p_2, p_3, \ldots, p_i, \ldots$]

where the $i, j$th position of the table has a 1 if $p_i$ shaves $p_j$ and a 0 otherwise. Every column will have one entry with a 1 and all the other entries will be 0; each person either shaves himself or is shaved by the barber. The barber must be one of the people in the town, so he is $p_i$ for some value $i$. What is the value of the position $i, i$ in the table? This is classic self-reference; we are asking what occurs when a particular object is simultaneously the operator (the person doing the shaving) and the operand (the person being shaved).

Who shaves the barber? If the barber is able to shave himself, then he cannot do so since he shaves only people who are unable to shave themselves. If he is unable to shave himself, then he must shave himself since he shaves everyone who cannot shave themselves. We have shown that the properties describing the shaving habits of the town are contradictory, so such a town cannot exist.

Russell's paradox follows the same pattern, but its consequences were much more significant than the nonexistence of a mythical town. One of the fundamental tenets of set theory as proposed by Cantor in the late 1800s was that any property or condition that can be described defines a set: the set of objects that satisfy the condition. There may be no objects, finitely many, or infinitely many that satisfy the property, but regardless of the number or the type of elements, the objects form a set. Russell devised an argument based on self-reference to show that this claim cannot be true.

The relationship examined by Russell's paradox is that of the membership of one set in another. For each set $X$ we ask the question, "Is a set $Y$ an element of $X$?" This is not an unreasonable question, since one set can certainly be an element of another. The table below gives both some negative and positive examples of this question.

$X$ | $Y$ | $Y \in X$?
$\{a\}$ | $\{a\}$ | no
$\{\{a\}, b\}$ | $\{a\}$ | yes
$\{\{a\}, a, \emptyset\}$ | $\emptyset$ | yes
$\{\{a, b\}, \{a\}\}$ | $\{\{a\}\}$ | no
$\{\{\{a\}, b\}, b\}$ | $\{\{a\}, b\}$ | yes

It is important to note that the question is not whether $Y$ is a subset of $X$, but whether it is an element of $X$.

The membership relation can be depicted by the table

[Table: rows and columns both labeled $X_1, X_2, X_3, \ldots$]

where axes are labeled by the sets. A table entry $[i, j]$ is 1 if $X_j$ is an element of $X_i$ and 0 if $X_j$ is not an element of $X_i$.

A question of self-reference can be obtained by identifying the operator and the operand in the membership question. That is, we ask if a set $X_i$ is an element of itself. The diagonal entry $[i, i]$ in the preceding table contains the answer to the question, "Is $X_i$ an element of $X_i$?" Now consider the property that a set is not an element of itself. Does this property define a set? There are clearly examples of sets that satisfy the property; the set $\{a\}$ is not an element of itself. The satisfaction of the property is indicated by the complement of the diagonal. A set $X_i$ is not an element of itself if, and only if, entry $[i, i]$ is 0.

Assume that $S = \{X \mid X \notin X\}$ is a set. Is $S$ in $S$? If $S$ is an element of itself, then it is not in $S$ by the definition of $S$. Moreover, if $S$ is not in $S$, then it must be in $S$ since it is not an element of itself. This is an obvious contradiction. We were led to this contradiction by our assumption that the collection of sets that satisfy the property $X \notin X$ form a set.

We have constructed a describable property that cannot define a set. This shows that Cantor's assertion about the universality of sets is demonstrably false. The ramifications of Russell's paradox were far-reaching. The study of set theory moved from a foundation based on naive definitions to formal systems of axioms and inference rules and helped initiate the formalist philosophy of mathematics.

In Chapter 12 we will use self-reference to establish a fundamental result in the theory of computer science, the undecidability of the Halting Problem.
1.6 Recursive Definitions

Many, in fact most, of the sets of interest in formal language and automata theory contain an infinite number of elements. Thus it is necessary that we develop techniques to describe, generate, or recognize the elements that belong to an infinite set. In the preceding section we described the set of natural numbers utilizing ellipsis dots (...). This seemed reasonable since everyone reading this text is familiar with the natural numbers and knows what comes after 0, 1, 2, 3. However, this description would be totally inadequate for an alien unfamiliar with our base 10 arithmetic system and numeric representations. Such a being would have no idea that the symbol 4 is the next element in the sequence or that 1492 is a natural number.

In the development of a mathematical theory, such as the theory of languages or automata, the theorems and proofs may utilize only the definitions of the concepts of that theory. This requires precise definitions of both the objects of the domain and the operations. A method of definition must be developed that enables our friend the alien, or a computer that has no intuition, to generate and "understand" the properties of the elements of a set.

A recursive definition of a set $X$ specifies a method for constructing the elements of the set. The definition utilizes two components: a basis and a set of operations. The basis consists of a finite set of elements that are explicitly designated as members of $X$. The operations are used to construct new elements of the set from the previously defined members. The recursively defined set $X$ consists of all elements that can be generated from the basis elements by a finite number of applications of the operations.

The key word in the process of recursively defining a set is generate. Clearly, no process can list the complete set of natural numbers.
Any particular number, however, can be obtained by beginning with zero and constructing an initial sequence of the natural numbers. This intuitively describes the process of recursively defining the set of natural numbers. This idea is formalized in the following definition.

Definition 1.6.1
A recursive definition of $\mathbf{N}$, the set of natural numbers, is constructed using the successor function $s$.
i) Basis: $0 \in \mathbf{N}$.
ii) Recursive step: If $n \in \mathbf{N}$, then $s(n) \in \mathbf{N}$.
iii) Closure: $n \in \mathbf{N}$ only if it can be obtained from 0 by a finite number of applications of the operation $s$.

The basis explicitly states that 0 is a natural number. In (ii), a new natural number is defined in terms of a previously defined number and the successor operation. The closure section guarantees that the set contains only those elements that can be obtained from 0 using the successor operator. Definition 1.6.1 generates an infinite sequence $0, s(0), s(s(0)), s(s(s(0))), \ldots$. This sequence is usually abbreviated $0, 1, 2, 3, \ldots$. However, anything that can be done with the familiar Arabic numerals could also be done with the more cumbersome unabbreviated representation.

The essence of a recursive procedure is to define complicated processes or structures in terms of simpler instances of the same process or structure. In the case of the natural numbers, "simpler" often means smaller. The recursive step of Definition 1.6.1 defines a number in terms of its predecessor.

The natural numbers have now been defined, but what does it mean to understand their properties? We usually associate operations of addition, multiplication, and subtraction with the natural numbers. We may have learned these by brute force, either through memorization or tedious repetition. For the alien or a computer to perform addition, the meaning of "add" must be appropriately defined.
One cannot memorize the sum of all possible combinations of natural numbers, but we can use recursion to establish a method by which the sum of any two numbers can be mechanically calculated. The successor function is the only operation on the natural numbers that has been introduced. Thus the definition of addition may use only 0 and $s$.

Definition 1.6.2
In the following recursive definition of the sum of $m$ and $n$, the recursion is done on $n$, the second argument of the sum.
i) Basis: If $n = 0$, then $m + n = m$.
ii) Recursive step: $m + s(n) = s(m + n)$.
iii) Closure: $m + n = k$ only if this equality can be obtained from $m + 0 = m$ using finitely many applications of the recursive step.

The closure step is often omitted from a recursive definition of an operation on a given domain. In this case, it is assumed that the operation is defined for all the elements of the domain. The operation of addition given above is defined for all elements of $\mathbf{N} \times \mathbf{N}$.

The sum of $m$ and the successor of $n$ is defined in terms of the simpler case, the sum of $m$ and $n$, and the successor operation. The choice of $n$ as the recursive operand was arbitrary; the operation could also have been defined in terms of $m$, with $n$ fixed.

Following the construction given in Definition 1.6.2, the sum of any two natural numbers can be computed using 0 and $s$, the primitives used in the definition of the natural numbers. Example 1.6.1 traces the recursive computation of $3 + 2$.

Example 1.6.1
The numbers 3 and 2 abbreviate $s(s(s(0)))$ and $s(s(0))$, respectively. The sum is computed recursively by
$s(s(s(0))) + s(s(0))$
$= s(s(s(s(0))) + s(0))$
$= s(s(s(s(s(0))) + 0))$
$= s(s(s(s(s(0)))))$ (basis case).
This final value is the representation of the number 5. □

Figure 1.1 illustrates the process of recursively generating a set $X$ from basis $X_0$. Each of the concentric circles represents a stage of the construction.
$X_1$ represents the basis elements and the elements that can be obtained from them using a single application of an operation defined in the recursive step. $X_i$ contains the elements that can be constructed with $i$ or fewer operations. The generation process in the recursive portion of the definition produces a countably infinite sequence of nested sets. The set $X$ can be thought of as the infinite union of the $X_i$'s. Let $x$ be an element of $X$ and let $X_j$ be the first set in which $x$ occurs. This means that $x$ can be constructed from the basis elements using exactly $j$ applications of the operators. Although each element of $X$ can be generated by a finite number of applications of the operators, there is no upper bound on the number of applications needed to generate the entire set $X$. This property, generation using a finite but unbounded number of operations, is a fundamental property of recursive definitions.

The successor operator can be used recursively to define relations on the set $\mathbf{N} \times \mathbf{N}$. The Cartesian product $\mathbf{N} \times \mathbf{N}$ is often portrayed by the grid of points representing the ordered pairs. Following the standard conventions, the horizontal axis represents the first component of the ordered pair and the vertical axis the second. The shaded area in Figure 1.2(a) contains the ordered pairs $[i, j]$ in which $i < j$. This set is the relation LT, less than, that was described in Section 1.2.

Example 1.6.2
The relation LT is defined as follows:
i) Basis: $[0, 1] \in LT$.
ii) Recursive step: If $[m, n] \in LT$, then $[m, s(n)] \in LT$ and $[s(m), s(n)] \in LT$.
iii) Closure: $[m, n] \in LT$ only if it can be obtained from $[0, 1]$ by a finite number of applications of the operations in the recursive step.
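The recursive definition of LT can be run stage by stage, starting from the basis and applying the recursive step to everything generated so far. The sketch below is illustrative: the function names are assumptions, pairs are Python tuples, and the successor $s(n)$ is written as `n + 1`.

```python
def recursive_step(pairs):
    """Apply the recursive step of Example 1.6.2 once to every pair."""
    generated = set()
    for m, n in pairs:
        generated.add((m, n + 1))      # [m, s(n)]
        generated.add((m + 1, n + 1))  # [s(m), s(n)]
    return generated


def lt_stage(i):
    """Pairs obtainable from the basis [0, 1] with i or fewer applications."""
    stage = {(0, 1)}
    for _ in range(i):
        stage = stage | recursive_step(stage)
    return stage


for i in range(3):
    print(i, sorted(lt_stage(i)))

# Every generated pair [m, n] satisfies m < n, as the relation requires.
print(all(m < n for m, n in lt_stage(5)))
```

Each stage contains the previous one, so the stages form a nested sequence; any particular pair $[m, n]$ with $m < n$ appears after finitely many stages, although no single stage contains all of LT.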
Recursive generation of X:
X_0 = {x | x is a basis element}
X_{i+1} = X_i ∪ {x | x can be generated by i + 1 operations}
X = {x | x ∈ X_j for some j ≥ 0}
FIGURE 1.1 Nested sequence of sets in recursive definition.

Using the infinite union description of recursive generation, the definition of LT generates the sequence LT_i of nested sets where
LT_0 = {[0, 1]}
LT_1 = LT_0 ∪ {[0, 2], [1, 2]}
LT_2 = LT_1 ∪ {[0, 3], [1, 3], [2, 3]}
LT_3 = LT_2 ∪ {[0, 4], [1, 4], [2, 4], [3, 4]}
...
LT_i = LT_{i-1} ∪ {[j, i + 1] | j = 0, 1, ..., i}
...

The construction of LT shows that the generation of an element in a recursively defined set may not be unique. The ordered pair [1, 3] ∈ LT_2 is generated by the two distinct sequences of operations:

Basis:  [0, 1]                  [0, 1]
1       [0, s(1)] = [0, 2]      [s(0), s(1)] = [1, 2]
2       [s(0), s(2)] = [1, 3]   [1, s(2)] = [1, 3]

[Figure: two grids of points on N × N, panels (a) and (b), each with a shaded region]
FIGURE 1.2 Relations on N × N.

Example 1.6.3
The shaded area in Figure 1.2(b) contains all the ordered pairs with second component 3, 4, 5, or 6. A recursive definition of this set, call it X, is given below.
i) Basis: [0, 3], [0, 4], [0, 5], and [0, 6] are in X.
ii) Recursive step: If [m, n] ∈ X, then [s(m), n] ∈ X.
iii) Closure: [m, n] ∈ X only if it can be obtained from the basis elements by a finite number of applications of the operation in the recursive step.
The sequence of sets X_i generated by this recursive process is defined by
X_i = {[j, 3], [j, 4], [j, 5], [j, 6] | j = 0, 1, ..., i}.

1.7 Mathematical Induction

Establishing relationships between the elements of sets and operations on the sets requires the ability to construct proofs that verify the hypothesized properties. It is impossible to prove that a property holds for every member in an infinite set by considering each element individually.
The principle of mathematical induction gives sufficient conditions for proving that a property holds for every element in a recursively defined set. Induction uses the family of nested sets generated by the recursive process to extend a property from the basis to the entire set.

Principle of Mathematical Induction
Let X be a set defined by recursion from the basis X_0 and let X_0, X_1, X_2, ..., X_i, ... be the sequence of sets generated by the recursive process. Also let P be a property defined on the elements of X. If it can be shown that
i) P holds for each element in X_0,
ii) whenever P holds for every element in the sets X_0, X_1, ..., X_i, P also holds for every element in X_{i+1},
then, by the principle of mathematical induction, P holds for every element in X.

The soundness of the principle of mathematical induction can be intuitively exhibited using the sequence of sets constructed in the recursive definition of X. Shading the circle X_i indicates that P holds for every element of X_i. The first condition requires that the interior set be shaded. Condition (ii) states that the shading can be extended from any circle to the next concentric circle. Figure 1.3 illustrates how this process eventually shades the entire set X.

The justification for the principle of mathematical induction should be clear from the preceding argument. Another justification can be obtained by assuming that conditions (i) and (ii) are satisfied but P is not true for every element in X. If P does not hold for all elements of X, then there is at least one set X_i for which P does not universally hold. Let X_j be the first such set. Since condition (i) asserts that P holds for all elements of X_0, j cannot be zero. Now P holds for all elements of X_{j-1} by our choice of j. Condition (ii) then requires that P hold for all elements in X_j. This implies that there is no first set in the sequence for which the property P fails.
Consequently, P must be true for all the X_i's, and therefore for X.

An inductive proof consists of three distinct steps. The first step is proving that the property P holds for each element of a basis set. This corresponds to establishing condition (i) in the definition of the principle of mathematical induction. The second is the statement of the inductive hypothesis. The inductive hypothesis is the assumption that the property P holds for every element in the sets X_0, X_1, ..., X_i. The inductive step then proves, using the inductive hypothesis, that P can be extended to each element in X_{i+1}. Completing the inductive step satisfies the requirements of the principle of mathematical induction. Thus, it can be concluded that P is true for all elements of X.

In Example 1.6.2, a recursive definition was given to generate the relation LT, which consists of ordered pairs [i, j] that satisfy i < j.
