MA203 Real Analysis: Lecture Notes
Contents
1 Introduction
  1.1 What is this course?
  1.2 What will it achieve?
  1.3 Who should take it?
  1.4 Course Content
  1.5 Lecturer
  1.6 Teaching
  1.7 Classes and Office Hours
  1.8 Exercises
  1.9 Books
  1.10 Assessment
4 Differentiation
  4.1 Introduction
  4.2 Derivative of functions f : R → R
      4.2.1 Definition of the derivative
      4.2.2 Differentiability and continuity
      4.2.3 Maxima, Minima, and the derivative
      4.2.4 Rolle’s Theorem and the Mean Value Theorem
  4.3 Differentiation of functions f : R^n → R^m
      4.3.1 Partial and directional derivatives
      4.3.2 The derivative of f : R^n → R^m
  4.4 Learning outcomes
  4.5 Comments on selected activities
  4.6 Exercises
5 Topology of R^m
  5.1 Introduction
  5.2 Open and closed subsets of R
      5.2.1 Open sets of real numbers
      5.2.2 Collections of sets
      5.2.3 Properties of open sets
      5.2.4 Closed sets of real numbers
  5.3 Open and closed subsets of R^m
      5.3.1 Open balls
      5.3.2 The definition of open set
      5.3.3 Closed sets in R^m
  5.4 Continuity
      5.4.1 Continuity and open balls
      5.4.2 Continuity in terms of open sets
  5.5 Compactness
      5.5.1 Compact sets
      5.5.2 Characterising compact subsets of R^m
      5.5.3 Continuous functions on compact sets
  5.6 Learning outcomes
  5.7 Comments on selected activities
  5.8 Exercises
6 Metric spaces
  6.1 Introduction
  6.2 Metrics and Metric Spaces
      6.2.1 Towards the idea of a metric space
      6.2.2 Definition of a metric space
      6.2.3 Important examples of metric spaces
      6.2.4 Bounded subsets
      6.2.5 Open balls
  6.3 Open sets
      6.3.1 The definition of open set
  6.4 Continuity in Metric Spaces
      6.4.1 The definition of continuity
      6.4.2 Continuity in terms of open sets
  6.5 Convergence and closed sets in metric spaces
      6.5.1 Definition of convergence
7 Uniform convergence
  7.1 Introduction
  7.2 Pointwise and uniform convergence
  7.3 Uniform convergence as convergence in a metric space
  7.4 Uniform convergence and continuity
  7.5 Learning outcomes
  7.6 Comments on selected activities
  7.7 Exercises
Preface
These notes have been developed over several years. This current version is an edited version of a
Subject Guide produced for the University of London external programme. That guide was itself based
on MA203 lecture notes.
I am grateful to Malwina Luczak and Keith Martin for carefully reading a draft of the subject guide
and for suggesting ways in which to improve it.
Chapter 1
Introduction
1.1 What is this course?
      This is a course in real analysis, designed for those who already know some real analysis (such as that
      encountered in MA103 Introduction to Abstract Mathematics). The emphasis is on functions,
      sequences and series in n-dimensional real space. The general concept of a metric space will also be
      studied and, if time allows, we will look briefly at topological spaces.
1.2 What will it achieve?
      After studying this course, you should be equipped with a knowledge of concepts (such as continuity
      and compactness) which are central not only to further mathematical courses, but to applications of
      mathematics in economics and other areas. For example, as we shall see, compactness is a very
      important idea in optimisation. The course will also enable you to set the real analysis you previously
      encountered in a larger context, to see that there is a ‘bigger picture’. More generally, a course of this
      nature, with the emphasis on abstract reasoning and proof, will help you to think in an analytical way
      and to formulate mathematical arguments in a precise, logical manner.
1.3 Who should take it?
      Most students taking this course will have already taken MA103 Introduction to Abstract
      Mathematics or some other course based on formal definitions and proofs and, ideally, covering the
      concept of limit: indeed, such a course is a formal pre-requisite. Students who have not covered the
      notion of a limit may be able to take this course after carrying out some preliminary reading (Chapters
      1 to 3 of Bryant’s ‘Yet Another Introduction to Analysis’, for example), but familiarity with proof
      techniques really is essential. Some of you (for example, BSc Mathematics and Economics students)
      will be required to take this course: others will simply be interested in learning more about real
      analysis.
1.5 Lecturer
1.6 Teaching
This is a half-unit course. Lectures will take place in the Michaelmas term, as follows.
      Mondays, 11-12 in room U8 (Tower 1) and Fridays 11-12 in room D1 (the Hong Kong Theatre,
      Clement House).
1.7 Classes and Office Hours
Classes start in Week 2, and run until the first week of Lent term (inclusive).
      Classes for the course are taught by Dr Elizabeth Boardman and Dr Eleni Katirtzoglou. You are
      encouraged to consult your class teacher during her office hour if you are having problems with the
      work of the classes and think you would benefit from one-to-one advice. If you are unable to see your
      class teacher, then you should see me. If you cannot attend my office hours, then I can see you at
      other times, by arrangement. (If you need to make an appointment, it is easiest if you email me.) My
      office hours are Mondays 3.30-4.30 and Fridays 9.30-10.30. (These are for Michaelmas term: Lent
      term office hours will probably be different.) Office hours are a very valuable, and often under-used,
      resource for students: please do talk with us if you are having difficulties.
1.8 Exercises
      Exercises will be assigned on a weekly basis. It is very important that you attempt all the exercises
      suggested for handing in, and hand in work to your class teacher by the arranged time. Working
      through examples is the best way of ensuring you understand key concepts and techniques. Work
       handed in will be marked, graded, and returned to you. Answers to all the exercises will be made
       available after the work has been discussed in class.
1.9 Books
       There are many books that would be useful for this course, since Mathematical Analysis is a major
       component of most university-level mathematics degree programmes. There is no single book that
       corresponds exactly to this course, but there are many books that are useful for parts of it. There is
       no requirement to buy a book.
       Bryant, Victor. Yet Another Introduction to Analysis. (Cambridge University Press: Cambridge, 1990)
       [ISBN 052138835X]
       Brannan, David. A First Course in Mathematical Analysis. (Cambridge University Press: Cambridge,
       2006) [ISBN 0521684242]
       Bartle, R.G. and D.R. Sherbert. Introduction to Real Analysis. Third edition. (John Wiley and Sons:
       New York, 1999) [ISBN 0471321486]
       Bryant, Victor. Metric Spaces: Iteration and Application. (Cambridge University Press: Cambridge,
       1985) [ISBN 0521318971]
       Sutherland, W. A. Introduction to Metric and Topological Spaces. (Oxford University Press: Oxford,
       1995) [ISBN 0198531613]
       None of these books covers all of the topics in this course. ‘Yet Another Introduction to Analysis’ will
       be useful for Chapters 2 and 4, and it will also be useful for revising the material you will need to
       know from Introduction to Abstract Mathematics. The Binmore book will be useful for Chapters
       2, 3 and 4. Brannan’s book will be useful for Chapters 2 and 4. The book by Bartle and Sherbert will
       be useful for Chapters 2, 4 and 7, and will also be of some use for Chapters 5 and 6. Of more use for
       Chapters 5 and 6 are the ‘Metric Spaces’ book of Bryant and the Sutherland book. Note, however,
       that most of the Sutherland book covers more advanced topics than this course, and the Bryant Metric
       Spaces book takes a slightly different approach from that taken here.
       Many other books cover the topics of this course, and the library has a range of texts on real analysis
       (under QA300).
1.10 Assessment
       There will be a formal 2-hour examination in the Summer term. Selected past papers and solutions
       will be available.
Chapter 2
Series of real numbers
Contents
         2.1     Introduction
         2.2     Revision: sequences
         2.3     Series
         2.4     Convergence of series
         2.5     Special series
         2.6     Some useful tests for non-negative series
         2.7     Alternating series
         2.8     Absolute convergence
         2.9     Power series
         2.10    Learning outcomes
         2.11    Comments on selected activities
         2.12    Exercises
Reading
Bartle, R.G. and D.R. Sherbert. Introduction to Real Analysis. Chapters 3 and 9.
2.1 Introduction
      The first main topic of the unit is series. This chapter looks at how one can formalise and deal
      properly with infinite sums. A key question is whether an infinite sum exists (that is, whether a series
      converges).
      To understand series, we need to understand sequences. We start, therefore, by racing through some
      of the results you should know already from Introduction to Abstract Mathematics about
      sequences. (The discussion of this background material is therefore deliberately brief.)
      Formally, a sequence is a function f from N to R. We call f(n) the nth term of the sequence and we
      often denote the sequence by $(f(n))_{n=1}^{\infty}$ or simply $(f(n))$. Informally, a sequence is an infinite list of
      real numbers, one for each positive integer; for example,
      $a_1, a_2, a_3, \ldots$
      We denote it $(a_n)_{n=1}^{\infty}$ or $(a_n)$ (or indeed $(a_r)$, $(a_i)$, etc.). Then we call $a_n$ the nth term of this
      sequence.
        A sequence may be defined by giving an explicit formula for the nth term. For example, the formula
        $a_n = \frac{1}{n}$ defines the sequence whose value at the positive integer n is $\frac{1}{n}$.
        Definition 2.1 (Finite limit of a sequence) The sequence $(x_n)$ is said to tend to the (finite) limit
        $L$ if for all $\epsilon > 0$, there is an integer $N$ such that for all $n > N$ we have $|x_n - L| < \epsilon$. That is, $(x_n)$
        tends to $L$ if
        $$\forall \epsilon > 0,\ \exists N \text{ such that } n > N \implies |x_n - L| < \epsilon.$$
        We write
        $$x_n \to L \text{ as } n \to \infty$$
        or
        $$\lim_{n \to \infty} x_n = L,$$
convergent ⟹ bounded.
        Definition 2.5 (Monotonic sequences) A sequence $(a_n)$ is increasing (decreasing) if for all n,
        $a_{n+1} \ge a_n$ ($a_{n+1} \le a_n$). A sequence is monotonic if it is either increasing or decreasing.
        Theorem 2.6 An increasing (decreasing) sequence which is bounded above (below) is convergent.
        That is,
                                    bounded + monotonic ⟹ convergent.
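As a numerical illustration of Theorem 2.6 (a sketch, not a proof), the following Python fragment looks at the sequence $a_n = 1 - 1/n$. This sequence and the function name `a` are our own illustrative choices and do not come from the notes. The sequence is increasing and bounded above by 1, and the computed terms get close to 1.

```python
from fractions import Fraction

def a(n):
    """n-th term of the illustrative sequence a_n = 1 - 1/n (our choice of example)."""
    return 1 - Fraction(1, n)

terms = [a(n) for n in range(1, 1001)]

# Increasing: each term is at least as large as the previous one.
is_increasing = all(x <= y for x, y in zip(terms, terms[1:]))
# Bounded above by 1.
is_bounded_above = all(t < 1 for t in terms)
# Distance between the 1000th term and the candidate limit 1.
gap = 1 - terms[-1]
print(is_increasing, is_bounded_above, gap)
```

A finite check like this can only suggest the behaviour; the theorem is what guarantees that the limit exists.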
Theorem 2.7 (‘Algebra of limits’ results) Suppose $(a_n)$ and $(b_n)$ are convergent sequences with
$a_n \to a$, $b_n \to b$, as $n \to \infty$.
Then as $n \to \infty$,
2.2.4 Subsequences
        Definition 2.8 (Subsequence) Let $(a_n)$ be a sequence and let f be a strictly increasing function
        from N to N. The sequence $(a_{f(n)})$ is called a subsequence of the sequence $(a_n)$.
[Note that a function is strictly increasing if $f(n + 1) > f(n)$ for all positive n.]
Often we will use shorthand to denote a sequence. For example, for a sequence
$a_1, a_2, a_3, a_4, \ldots$
        We have written the increasing function f explicitly in terms of its value, by saying exactly which
        terms of the original sequence to take; that is, we take terms $k_1, k_2, k_3$ and so on. Note that this will
        be just for notation’s sake and in these cases the underlying function has not disappeared; in fact the
        increasing function here is given by $f(i) = k_i$ for $i = 1, 2, 3, \ldots$.
        Another notation, sometimes useful, is to let $A = \{f(n) : n \in N\}$ and denote the subsequence $(a_{f(n)})$
        by $(a_n)_{n \in A}$. For any infinite set A of natural numbers, $(a_n)_{n \in A}$ is a subsequence of $(a_n)$. This
        notation is particularly useful if we ever have to form a subsequence of a subsequence. For example, if
        B and A are infinite subsets of N, with B ⊆ A, then $(a_n)_{n \in B}$ is a subsequence of $(a_n)_{n \in A}$, which in
        turn is a subsequence of $(a_n)$. This approach avoids the need for double and triple subscripts.
        Theorem 2.9 Let $(a_n)$ be a sequence which tends to a limit L. Then any subsequence also tends to
        the limit L.
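To make the idea of a subsequence concrete, here is a small Python sketch. The sequence $a_n = 1/n$, the index function $f(n) = 2n$ and the helper name `subsequence` are illustrative assumptions of ours, not part of the notes. The sketch also forms a subsequence of a subsequence by composing index functions.

```python
from fractions import Fraction

def a(n):
    """Illustrative sequence a_n = 1/n (our choice of example)."""
    return Fraction(1, n)

def f(n):
    """A strictly increasing function from N to N: f(n) = 2n."""
    return 2 * n

def subsequence(seq, index, count):
    """First `count` terms of the subsequence (seq(index(n))) for n = 1, 2, ..."""
    return [seq(index(n)) for n in range(1, count + 1)]

# Check that f is strictly increasing on the range we inspect.
assert all(f(n + 1) > f(n) for n in range(1, 100))

even_terms = subsequence(a, f, 4)                     # a_2, a_4, a_6, a_8
# A subsequence of a subsequence: compose the index functions.
every_fourth = subsequence(a, lambda n: f(f(n)), 3)   # a_4, a_8, a_12
print(even_terms, every_fourth)
```

In the set notation, the first list corresponds to A = {2, 4, 6, ...} and the second to B = {4, 8, 12, ...}, with B ⊆ A.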
        Theorem 2.11 (Bolzano-Weierstrass Theorem) Every bounded real sequence has a convergent
        subsequence.
      (The proof is immediate. Let $(a_n)$ be a bounded sequence; by the preceding theorem, it has a
      monotonic subsequence $(a_{f(n)})$. This subsequence is then also bounded and we have seen that
      bounded monotonic sequences are convergent.)
2.3 Series
The previous material is revision from Abstract Mathematics. Now we start on new material.
      In this part of the unit, we will be concerned with how one can formalise the idea of summing an
      infinite list of numbers
                                                 $a_1 + a_2 + a_3 + \cdots.$
      As you would expect, this will once again involve the notion of a limit. We begin with a basic
      definition:
      $$s_n = a_1 + a_2 + \cdots + a_n = \sum_{k=1}^{n} a_k.$$
      The series with nth term $a_n$, denoted $\sum a_n$, is, formally, the sequence $(s_n)$. We call $a_n$ the nth term
      of the series and $s_n$ is called the nth partial sum of the series.
      Although we denote a series by the notation $\sum a_n$, some textbooks use the notation $\sum_{n=1}^{\infty} a_n$. We
      shall reserve that notation for something different, as I will explain below.
      Note that, as far as we are concerned, a series involves an infinite list of numbers. We do not discuss
      ‘finite’ series, since there are no convergence issues there.
      Example $\sum (-1)^n$.
The nth term is $(-1)^n$ and the nth partial sum $s_n$ is $-1$ if n is odd and 0 if n is even.
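The partial sums in this example can be computed directly. The following is a minimal Python sketch; the helper name `partial_sums` is our own and is only meant to mirror the definition of $s_n$ as the sum of the first n terms.

```python
from itertools import accumulate

def partial_sums(term, count):
    """Return [s_1, ..., s_count], where s_n = term(1) + ... + term(n)."""
    return list(accumulate(term(k) for k in range(1, count + 1)))

# Partial sums of the series with n-th term (-1)^n.
s = partial_sums(lambda n: (-1) ** n, 6)
print(s)  # [-1, 0, -1, 0, -1, 0]
```

The list alternates between −1 and 0, matching the description of $s_n$ for odd and even n.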
Proof. Because the series converges, there is some number $L$ such that $s_n \to L$ as $n \to \infty$. Now, if
$s_n \to L$, we also have $s_{n-1} \to L$. It follows that $s_n - s_{n-1} \to L - L = 0$. But what is $s_n - s_{n-1}$?
Well, it is precisely $a_n$. So we have $a_n \to 0$ as $n \to \infty$. □
Note that, here, we have used the symbol ‘□’ to denote the end of the proof. This is a convenient
way of indicating when a proof is over and the main text continues.
WARNING! You should be aware that the converse of this result is false: $a_n$ tending to 0 does not
necessarily mean that the series $\sum a_n$ converges. We shall see specific examples shortly of series in
which $a_n \to 0$ and yet the series diverges. Finding sufficient conditions for a series to converge is the
main aim in what follows, and it’s not easy. Life would be simple if it were the case that $\sum a_n$
converges if and only if $a_n \to 0$, but it isn’t so. What this means is that you should never find
yourself saying or writing “$a_n \to 0$ and therefore the series converges.” It is never possible to conclude
that a series converges just from the fact that $a_n \to 0$.
However, the result is useful, sometimes, for proving that a series does not converge, for it is
equivalent to the following result. (This is the contrapositive of Theorem 2.14.)
Theorem 2.15 If $a_n$ does not tend to 0 as $n \to \infty$, then the series $\sum a_n$ diverges.
You should not think that if a series diverges, then it must be the case that $s_n \to \infty$. This will turn
out to be the case if all terms $a_n$ of the series are non-negative, but it is not true in general. For
example, $\sum (-1)^n$ diverges, but we do not have $s_n \to \infty$.
The following result establishes that, for a series with non-negative terms (a ‘non-negative series’),
either the series converges, or the partial sums tend to infinity.
Theorem 2.16 Suppose that $\sum a_n$ is a series in which $a_n \ge 0$ for all n. If the series diverges, then
the nth partial sum, $s_n$, is such that $s_n \to \infty$ as $n \to \infty$.
Proof. We start by observing that the partial sums of a non-negative series form an increasing
sequence. This is because $s_{n+1} = s_n + a_{n+1} \ge s_n$, since $a_{n+1} \ge 0$. If the sequence $(s_n)$ were bounded
above, then it would converge, because an increasing sequence that is bounded above converges. Since
the series diverges, it must be the case that $(s_n)$ is not bounded above. So, for each $K$ there is $N$ such
that $s_N > K$. But then we have that, for all $n \ge N$, $s_n \ge s_N > K$. This shows that $s_n \to \infty$. □
Comment. You might well see that the proof of Theorem 2.16 shows us something else: namely, that
if the partial sums of a non-negative series form a bounded sequence, then the series converges. This
is because the partial sums are an increasing sequence. So, for non-negative series, the question of
convergence becomes one about whether the partial sums are bounded: explicitly, for a non-negative
series, the series converges if and only if the partial sums are bounded.
WARNING! Theorem 2.16 is only true for series with non-negative terms. If that’s not clear,
consider $\sum (-1)^n$. The partial sums of this series are bounded (for they are all either −1 or 0), but
the series does not converge.
      Proof. The partial sum of the geometric series is $s_n = \frac{a(1 - r^n)}{1 - r}$ if $r \neq 1$. If $|r| < 1$ then this
      converges to the limit $\frac{a}{1 - r}$ because in that case $r^n \to 0$. If $|r| > 1$, it does not converge because
      $r^n \to \infty$ when $r > 1$ and $r^n$ oscillates unboundedly when $r < -1$. The only remaining cases are when
      $r = 1$ and $r = -1$. When $r = 1$, $s_n$ is simply $na$ and this tends to infinity and therefore does not
      converge. When $r = -1$, the partial sums are alternately $a$ and 0, and this sequence of numbers does
      not converge. □
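The partial-sum formula used in this proof can be checked numerically. The sketch below assumes that the geometric series is $a + ar + ar^2 + \cdots$, so that $s_n$ is the sum of the first n of these terms; this reading is inferred from the formula and from the cases $r = 1$ and $r = -1$ above. The function names and the sample values $a = 3$, $r = 1/2$ are our own illustrative choices.

```python
from fractions import Fraction

def geometric_partial_sum(a, r, n):
    """s_n = a + ar + ... + ar^(n-1), computed by direct summation."""
    return sum(a * r ** k for k in range(n))

def closed_form(a, r, n):
    """The formula a(1 - r^n)/(1 - r), valid only when r != 1."""
    return a * (1 - r ** n) / (1 - r)

a, r = Fraction(3), Fraction(1, 2)
# Direct summation and the closed form agree for n = 1, ..., 19.
checks = [geometric_partial_sum(a, r, n) == closed_form(a, r, n) for n in range(1, 20)]
limit = a / (1 - r)                            # a/(1 - r) for these sample values
gap = limit - geometric_partial_sum(a, r, 10)  # how far s_10 is from the limit
print(all(checks), gap)

# With r = -1 the partial sums alternate between a and 0.
alternating = [geometric_partial_sum(3, -1, n) for n in range(1, 5)]
print(alternating)
```

Exact fractions are used so that the comparison between the two expressions is not affected by floating-point rounding.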
      The following result is extremely useful. The series $\sum \frac{1}{n}$ is so special that it has a special name: the
      harmonic series.
      Theorem 2.18 The harmonic series $\sum 1/n$ diverges.
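The divergence of the harmonic series is slow, which makes it easy to overlook numerically. A standard grouping argument gives the lower bound $s_{2^k} \ge 1 + k/2$ for the partial sums; the Python sketch below checks this bound for small k. It is an illustration under that stated bound, not a reproduction of the proof in these notes, and the function name `harmonic_partial_sum` is our own.

```python
from fractions import Fraction

def harmonic_partial_sum(n):
    """s_n = 1 + 1/2 + ... + 1/n, as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Compare s_{2^k} with the lower bound 1 + k/2 for k = 0, ..., 10.
rows = [(k, harmonic_partial_sum(2 ** k), 1 + Fraction(k, 2)) for k in range(11)]
bound_holds = all(s >= bound for _, s, bound in rows)
print(bound_holds)
for k, s, bound in rows:
    print(k, float(s), float(bound))
```

Since $1 + k/2$ is unbounded as k grows, the partial sums cannot be bounded above, even though the terms $1/n$ tend to 0.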
      The following more general result is going to be very useful to us. The second part of its proof is
      difficult and you would not be expected to reproduce it in an examination.
      Proof. The first part of the proof, in which we suppose $s \le 1$, is similar to the proof of divergence of
      the harmonic series. (It can alternatively be proved by using Theorem 2.16 together with
      Theorem 2.18 and the fact that $1/n^s \ge 1/n$ if $s \le 1$.) If $s_n$ denotes the partial sum, then we have
Let $s_n$ denote the nth partial sum of the series. To prove that the series converges is, by definition, to
prove that the sequence $(s_n)$ converges. Since the terms of the series are positive, the sequence $(s_n)$
is increasing, so to establish convergence it is sufficient to show that it is bounded above. (Remember
that an increasing sequence that is bounded above must converge.)
Obviously $s_n \le s_{2^n - 1}$, although you might wonder why we make this observation. (You’ll see...)
Now,
                                     1       1              1
                                        + s + ··· + n
                                s2n −1 = 1 +
                                      s
                                    2       3           (2 − 1)s
                                                                                             
                                 
      1    1      1    1    1    1                   1            1                    1     
=1+      +     +     +    +    +      +  · · · +            +             + · · · +           .
                                                                                             
      2s   3s     4s   5s   6s   7s               (2n−1 )s   (2n−1 + 1)s           (2n − 1)s 
                                                 
                                                  |                   {z                    }
                                                                           2n−1 terms
What we’ve done here is simply group the terms together. This does not change the value of the expression. We have taken the first term, then the next 2 together, then the next 4, then the next 8, and so on, until the last group, which is of size $2^{n-1}$. (Note that $1 + 2 + \cdots + 2^{n-1}$ does indeed equal $2^n - 1$.) The reason for doing this is that we can now bound this expression by noting that in each group, the largest term is the first in that group, so the value of each bracketed quantity is no more than the number of terms inside the brackets multiplied by the first of the terms. So,
\[
\begin{aligned}
s_{2^n - 1} &\le 1 + 2\,\frac{1}{2^s} + 4\,\frac{1}{4^s} + 8\,\frac{1}{8^s} + \cdots + 2^{n-1}\,\frac{1}{(2^{n-1})^s} \\
&= 1 + \frac{1}{2^{s-1}} + \frac{1}{4^{s-1}} + \frac{1}{8^{s-1}} + \cdots + \frac{1}{(2^{n-1})^{s-1}} \\
&= 1 + \frac{1}{2^{s-1}} + \frac{1}{(2^{s-1})^2} + \frac{1}{(2^{s-1})^3} + \cdots + \frac{1}{(2^{s-1})^{n-1}},
\end{aligned}
\]
where we have just used the fact that $(2^i)^{s-1} = (2^{s-1})^i$. Now, consider the geometric series $\sum \frac{1}{(2^{s-1})^{n-1}}$. This has common ratio $1/2^{s-1}$ which is positive and less than 1, so the series converges. In fact, its sum is
\[
L = \frac{1}{1 - (1/2^{s-1})}.
\]
But the most important fact is that its partial sums are bounded (all are less than $L$). The calculation above shows that if $t_n$ is the $n$th partial sum of this geometric series, then $s_{2^n - 1} \le t_n$. It follows that $s_{2^n - 1} \le L$ and hence, as required, we have shown that the partial sums $s_{2^n - 1}$ (and hence $s_n$) are bounded. This finishes the proof.
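The chain of bounds in this proof, $s_{2^n - 1} \le t_n \le L$, can be checked numerically. The sketch below does so for a few illustrative values of $s > 1$; the function names are hypothetical helpers chosen for this example, and the small tolerance only absorbs floating-point rounding.

```python
def p_series_partial_sum(s, n):
    """Return 1 + 1/2^s + ... + 1/n^s."""
    return sum(1.0 / k**s for k in range(1, n + 1))


def bounding_geometric_sum(s, n):
    """Return t_n, the sum of the first n terms of the geometric series
    with first term 1 and common ratio 1/2^(s-1)."""
    ratio = 2.0 ** (1 - s)
    return sum(ratio**k for k in range(n))


def geometric_limit(s):
    """Return L = 1 / (1 - 1/2^(s-1)), valid for s > 1."""
    return 1.0 / (1.0 - 2.0 ** (1 - s))


if __name__ == "__main__":
    for s in (1.5, 2.0, 3.0):  # illustrative exponents, all greater than 1
        for n in (1, 2, 5, 10):
            print(s, n,
                  p_series_partial_sum(s, 2**n - 1),
                  bounding_geometric_sum(s, n),
                  geometric_limit(s))
```

For example, with $s = 2$ the bound is $L = 2$, so every partial sum of $\sum 1/n^2$ is at most 2.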
Prove that if $s \le 1$ then $\sum 1/n^s$ diverges, by using Theorem 2.16 together with Theorem 2.18 and the fact that $1/n^s \ge 1/n$ if $s \le 1$.
There also exist some ‘Algebra of Limits’ results which can be proved directly from the corresponding results for sequences:
Theorem 2.20 Suppose $\sum a_n$ and $\sum b_n$ converge, and that $\sum_{n=1}^{\infty} a_n = L$ and $\sum_{n=1}^{\infty} b_n = M$. Then, for any real number $c$, the series $\sum (a_n + b_n)$ and $\sum c a_n$ converge, and $\sum_{n=1}^{\infty} (a_n + b_n) = L + M$ and $\sum_{n=1}^{\infty} c a_n = cL$.
But... note that the same does not hold for products. For example, if $a_n = (-1)^n/\sqrt{n}$, then, as we will shortly see, $\sum a_n$ converges. However, $\sum (a_n \times a_n)$ diverges. This latter series is simply $\sum \frac{1}{n}$, the harmonic series.
        A series is non-negative if all its terms are non-negative. (Later we look at series which have some
        negative terms, but it’s easiest at the moment to stick to non-negative series.) The aim now is to
        develop a range of tests for convergence.
Theorem 2.21 (Comparison Test) Let $(a_n)$, $(b_n)$ be non-negative sequences such that $a_n \le b_n$ for all $n$. Then
 1. If $\sum b_n$ converges, then $\sum a_n$ does also, and $\sum_{n=1}^{\infty} a_n \le \sum_{n=1}^{\infty} b_n$.
 2. If $\sum a_n$ diverges, then $\sum b_n$ diverges.
Proof. The key observation is that if $s_n$ and $t_n$ are, respectively, the $n$th partial sums of $\sum a_n$ and $\sum b_n$, then $s_n \le t_n$. Suppose that $\sum b_n$ converges. This means precisely that the sequence $(t_n)$ converges (by the definition of convergence of a series). So the sequence $(t_n)$ is certainly bounded above. Now, $(s_n)$ is an increasing sequence and since $s_n \le t_n$ for all $n$, $(s_n)$ is bounded above too. So, as an increasing sequence which is bounded above, it converges. Furthermore,
\[
\sum_{n=1}^{\infty} a_n = \lim_{n \to \infty} s_n \le \lim_{n \to \infty} t_n = \sum_{n=1}^{\infty} b_n.
\]
Suppose, now, that $\sum a_n$ diverges. By Theorem 2.16, $s_n \to \infty$. So, $t_n \to \infty$ since $t_n \ge s_n$. Hence $(t_n)$ diverges and (by definition) $\sum b_n$ diverges.
When using the Comparison Test, it’s important to use it in the right direction. Suppose, for example, you want to use it to show that $\sum a_n$ converges. Then you need to find a series $\sum b_n$ that you know converges and which satisfies $0 \le a_n \le b_n$ for all $n$. If you wanted to use it to show that a series $\sum c_n$ diverges, you need a divergent series $\sum d_n$ with $c_n \ge d_n \ge 0$.
The hypothesis of the Comparison Test can be weakened slightly as follows. (Here, what we’ve done is replace ‘for all n’ with ‘for all sufficiently large n’.)
Theorem 2.22 Let $(a_n)$, $(b_n)$ be non-negative sequences such that there is some $N$ such that $a_n \le b_n$ for all $n \ge N$. Then
 1. If $\sum b_n$ converges, then $\sum a_n$ does also.
 2. If $\sum a_n$ diverges, then $\sum b_n$ diverges.
Proof. The key observation in the proof of the previous version of the Comparison Test was that, using the same notation, $s_n \le t_n$. That is no longer necessarily true in this case. However, it is true that there will be some constant $M$ such that $s_n \le t_n + M$ for all $n \ge N$. For,
\[
t_n - s_n = \sum_{i=1}^{n} (b_i - a_i) = \sum_{i=1}^{N-1} (b_i - a_i) + \sum_{i=N}^{n} (b_i - a_i).
\]
Now, let $M = \sum_{i=1}^{N-1} (a_i - b_i)$, so that the first sum on the right is $-M$. Then, noting that $\sum_{i=N}^{n} (b_i - a_i) \ge 0$ because $b_n \ge a_n$ for $n \ge N$, we see that
\[
t_n - s_n \ge -M + 0 = -M,
\]
that is, $s_n \le t_n + M$.
Now the proof is very similar to the one before.
Suppose that $\sum b_n$ converges. Then $(t_n)$ converges and so it is bounded above. The sequence $(s_n)$ is increasing and, since $s_n \le t_n + M$ for all $n \ge N$, $(s_n)$ is bounded above. So, as an increasing sequence which is bounded above, it converges. Suppose, now, that $\sum a_n$ diverges. By Theorem 2.16, $s_n \to \infty$. So, $t_n \to \infty$ since $t_n \ge s_n - M$. Hence $(t_n)$ diverges and (by definition) $\sum b_n$ diverges.
Example Consider $\sum \frac{n^2 + 1}{n^5 + n + 1}$. The $n$th term here behaves like $1/n^3$, because the dominant term on the numerator is $n^2$ and the dominant term in the denominator is $n^5$. But this needs to be made precise. We can formally compare the series with $\sum 1/n^3$ by noting that
\[
\frac{n^2 + 1}{n^5 + n + 1} \le \frac{n^2 + n^2}{n^5} = \frac{2}{n^3}.
\]
The series $\sum 2/n^3$ converges because $\sum 1/n^3$ does, by Theorem 2.19. Hence, by the Comparison Test, the given series converges also.
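The comparison in this example can be spot-checked with exact rational arithmetic. The following is a minimal sketch; the helper names and the range of $n$ tested are arbitrary choices made for illustration, and the check complements rather than replaces the algebraic inequality.

```python
from fractions import Fraction


def term(n):
    """n-th term (n^2 + 1)/(n^5 + n + 1) as an exact fraction."""
    return Fraction(n**2 + 1, n**5 + n + 1)


def bound(n):
    """Comparison term 2/n^3 as an exact fraction."""
    return Fraction(2, n**3)


if __name__ == "__main__":
    # The inequality term(n) <= bound(n) for the first few thousand n.
    print(all(term(n) <= bound(n) for n in range(1, 5001)))
    # Partial sums of the given series alongside those of the bounding series.
    for N in (10, 100, 1000):
        print(N,
              float(sum(term(n) for n in range(1, N + 1))),
              float(sum(bound(n) for n in range(1, N + 1))))
```

The partial sums of the given series stay below those of $\sum 2/n^3$, which are themselves bounded.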
        The following, more sophisticated, version of the Comparison Test, is more useful. We could call it
        the ‘Limiting’ Comparison Test, but we’ll usually just call it the Comparison Test (since the previous
        two versions of the Test can be thought of as special cases of this one.)
Theorem 2.23 (Comparison Test) Suppose that $(a_n)$, $(b_n)$ are positive and that $a_n/b_n \to L$, where $L \neq 0$ (and $L$ is finite) as $n \to \infty$. Then $\sum a_n$ and $\sum b_n$ either both converge or both diverge: that is, they have the same behaviour with respect to convergence.
Proof. Note that $L$ will be positive because $a_n, b_n \ge 0$ and $L \neq 0$. Because $a_n/b_n \to L$, there will be some $N$ so that for all $n \ge N$,
\[
\left| \frac{a_n}{b_n} - L \right| < \frac{L}{2}.
\]
This is just taking $\epsilon = L/2 > 0$ in the definition of the limit of a sequence. So, for all $n \ge N$,
\[
\frac{L}{2} < \frac{a_n}{b_n} < \frac{3L}{2}.
\]
If $\sum b_n$ converges, then so does $\sum (3L/2) b_n$ and hence, by the fact that $a_n \le (3L/2) b_n$ for all $n \ge N$, Theorem 2.22 shows that $\sum a_n$ converges also. On the other hand, if $\sum a_n$ converges, then so too does $\sum (2/L) a_n$ and, since $b_n \le (2/L) a_n$ for $n \ge N$, Theorem 2.22 shows that $\sum b_n$ converges too. So, $\sum a_n$ converges if and only if $\sum b_n$ converges. In other words, either they both converge or they both diverge.
Example Consider again $\sum \frac{n^2 + 1}{n^5 + n + 1}$. Using the limiting form of the Comparison Test to compare the series with $\sum 1/n^3$, we simply observe that, since
 1. $L < 1 \Rightarrow \sum a_n$ converges.
 2. $L > 1 \Rightarrow \sum a_n$ diverges. (This includes the case $L = \infty$.)
Proof. We prove the first part. (The second part can be proved similarly: try it!) Suppose that $L < 1$. Evidently, we may choose an $M$ such that $L < M < 1$. Hence there exists $N$ such that
\[
n \ge N \Rightarrow \frac{a_{n+1}}{a_n} < M.
\]
In particular, $a_{N+1} < M a_N$. From this we see that in general we have
\[
a_{N+n} < M^n a_N.
\]
Now since the geometric series $\sum M^n a_N$ converges (since $0 < M < 1$), we have by the Comparison Test that $\sum a_{N+n}$ converges, and hence $\sum a_n$ converges.
Example Consider again $\sum \frac{n^7}{6^n}$. Here,
\[
a_n^{1/n} = \left( \frac{n^7}{6^n} \right)^{1/n} = \frac{n^{7/n}}{6} = \frac{(n^{1/n})^7}{6}.
\]
Now, $n^{1/n} \to 1$, so $a_n^{1/n} \to 1/6$ as $n \to \infty$. By the Root Test, the series converges.
Again, note that the Root Test says nothing about the case L = 1.
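To see the limit $a_n^{1/n} \to 1/6$ from the example numerically, the sketch below evaluates $(n^7/6^n)^{1/n}$ through logarithms, which avoids overflow for large $n$. The function name is a hypothetical helper and the sample values of $n$ are arbitrary.

```python
import math


def nth_root_of_term(n):
    """Return (n^7 / 6^n)^(1/n), computed as exp((7 ln n - n ln 6) / n)."""
    return math.exp((7 * math.log(n) - n * math.log(6)) / n)


if __name__ == "__main__":
    for n in (1, 2, 10, 100, 10**4, 10**6):
        print(n, nth_root_of_term(n))
    print("1/6 =", 1 / 6)
```

Note that the early values exceed 1 (for instance at $n = 2$); only the limiting behaviour matters for the Root Test.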
Theorem 2.26 (Integral Test) Let $g$ be a positive, decreasing, integrable (for example, continuous) function on $[1, \infty)$, and let $G(n) = \int_1^n g(x)\,dx$. Then the series $\sum g(n)$ converges if and only if the sequence $(G(n))$ converges. In other words, $\sum g(n)$ converges if and only if the improper integral $\int_1^{\infty} g(x)\,dx$ exists.
Theorem 2.27 (Integral Test) Suppose that $a \ge 1$ is a fixed number. Let $g$ be a positive, decreasing function defined on $[a, \infty)$ and integrable on $[a, \infty)$, and let $G(n) = \int_a^n g(x)\,dx$. Then the series $\sum g(n)$ converges if and only if the sequence $(G(n))$ converges. In other words, $\sum g(n)$ converges if and only if the improper integral $\int_a^{\infty} g(x)\,dx$ exists.
      (This second version is useful when the integral exhibits improper behaviour near 1, as in the following
      example.)
A series is alternating if its terms are alternately positive and negative. Such a series takes the form $\pm \sum (-1)^{n+1} c_n$, where $c_n \ge 0$.
Theorem 2.28 (Leibniz Alternating Series Test (‘LAST’)) Suppose that $\sum a_n = \sum (-1)^{n+1} c_n$ is an alternating series, where $c_n \ge 0$. Then, if $(c_n)$ is a decreasing sequence and $\lim_{n \to \infty} c_n = 0$, the series $\sum a_n$ converges.
Corollary 2.29 $\sum \frac{(-1)^{n+1}}{n^s}$ converges for $s > 0$.
      WARNING! This test says that if the sequence (cn ) is decreasing and tends to 0, then the series
      converges. It says nothing at all if one of these two conditions fails to hold. This does not mean that
      these two conditions are necessary for convergence of the alternating series: it just means that the
      Leibniz test doesn’t work in those situations.
Example Let us use the Leibniz Alternating Series Test to prove that $\sum (-1)^n \frac{\sqrt{n}}{n + 1}$ converges.
The series is alternating, and takes the form $\sum (-1)^n c_n$ where $c_n = \sqrt{n}/(n + 1) \ge 0$. We have
\[
c_n = \frac{1/\sqrt{n}}{1 + 1/n} \to 0.
\]
Also, $(c_n)$ is decreasing. There is more than one way to show this. First, we could note that
\[
\begin{aligned}
\frac{c_{n+1}}{c_n} &= \frac{\sqrt{n + 1}/(n + 2)}{\sqrt{n}/(n + 1)} \\
&= \frac{(n + 1)\sqrt{n + 1}}{\sqrt{n}\,(n + 2)} \\
&= \sqrt{\frac{(n + 1)^2 (n + 1)}{n (n + 2)^2}} \\
&= \sqrt{\frac{n^3 + 3n^2 + 3n + 1}{n^3 + 4n^2 + 4n}},
\end{aligned}
\]
and this is at most 1 because $4n^2 + 4n \ge 3n^2 + 3n + 1$. Alternatively, we could note that if $f(x) = \sqrt{x}/(x + 1)$, then
\[
f'(x) = \frac{(1/(2\sqrt{x}))(x + 1) - \sqrt{x}}{(x + 1)^2} = \frac{1 - x}{2\sqrt{x}\,(x + 1)^2} \le 0 \quad \text{for } x \ge 1,
\]
and this shows that $f$ is decreasing for $x \ge 1$ and hence that $(c_n)$ is decreasing.
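The two hypotheses of the Leibniz test for $c_n = \sqrt{n}/(n + 1)$, and the resulting behaviour of the partial sums, can be illustrated numerically. This is only a sketch under the assumption that checking finitely many $n$ is informative; the function names are hypothetical.

```python
import math


def c(n):
    """c_n = sqrt(n) / (n + 1)."""
    return math.sqrt(n) / (n + 1)


def alternating_partial_sum(N):
    """Return the sum of (-1)^n c_n for n = 1, ..., N."""
    return sum((-1) ** n * c(n) for n in range(1, N + 1))


if __name__ == "__main__":
    # (c_n) is decreasing on the range tested and becomes small.
    print(all(c(n + 1) <= c(n) for n in range(1, 10001)))
    print(c(10**8))
    # Odd-indexed partial sums increase, even-indexed ones decrease,
    # and both squeeze towards a common limit.
    for N in (10, 11, 100, 101, 1000, 1001):
        print(N, alternating_partial_sum(N))
```

Consecutive partial sums differ by exactly $c_{N+1}$, so the gap between them shrinks to 0, which is the mechanism behind the test.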
Note that if $\sum a_n$ is a convergent series with non-negative terms, then it is absolutely convergent.
From what we saw earlier in Theorem 2.17, the geometric series $\sum a r^{n-1}$ converges absolutely if $|r| < 1$.
By Corollary 2.29 and the fact that $\sum 1/n$ diverges, the series $\sum (-1)^n/n$ converges conditionally. (Corollary 2.29 shows it converges, but it does not converge absolutely because $\sum 1/n$ diverges.)
Theorem 2.32 (Comparison Test) Let $(a_n)$, $(b_n)$ be sequences such that $|a_n| \le |b_n|$ for all $n$. Then
 1. If $\sum b_n$ converges absolutely, then $\sum a_n$ does also and $\sum_{n=1}^{\infty} |a_n| \le \sum_{n=1}^{\infty} |b_n|$.
 2. If $\sum |a_n|$ diverges, then $\sum |b_n|$ diverges.
Theorem 2.33 (Ratio Test) Let $\sum a_n$ be a series such that
\[
L = \lim_{n \to \infty} \frac{|a_{n+1}|}{|a_n|} \quad (L = \infty \text{ allowed}).
\]
Then
 1. $L < 1 \Rightarrow \sum a_n$ converges absolutely.
 2. $L > 1 \Rightarrow \sum a_n$ diverges.
      There isn’t enough time to cover power series in very great detail, but we look at how our
      convergence tests apply to power series.
Let’s take the exponential series first. It’s easy to show that this converges absolutely for all $x$. We simply observe that
\[
\frac{|x^{n+1}/(n + 1)!|}{|x^n/n!|} = \frac{|x|}{n + 1} \to 0,
\]
for any $x$, and so, by the Ratio Test, absolute convergence follows.
Example Let’s determine exactly those values of $x$ for which the series $\sum x^n/n$ is convergent. Taking $a_n = x^n/n$, the ratio $|a_{n+1}|/|a_n|$ is
\[
\frac{|x^{n+1}/(n + 1)|}{|x^n/n|} = |x| \, \frac{n}{n + 1} \to |x|.
\]
      The ratio test therefore tells us that the series converges absolutely if |x| < 1, and that it diverges if
      |x| > 1. But what if |x| = 1? Here, the ratio test is useless and we have to be more sophisticated.
Well, $|x| = 1$ corresponds to two cases: $x = 1$ and $x = -1$. We treat each separately. When $x = 1$, the series is the harmonic series $\sum 1/n$, which we know diverges. When $x = -1$, we have the series $\sum (-1)^n/n$. This is convergent, by the Leibniz Alternating Series Test. (You should check this!) So we have now determined exactly the values of $x$ where the series converges: it converges for $-1 \le x < 1$ and diverges for all other values of $x$.
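The behaviour at different values of $x$ can be observed numerically. The sketch below computes partial sums of $\sum x^n/n$ and compares them with $-\ln(1 - x)$, which is the known value of the sum for $-1 \le x < 1$ (a standard fact quoted here without proof; it is not established in these notes). The function name and sample points are illustrative assumptions.

```python
import math


def partial_sum(x, N):
    """Return the N-th partial sum of the series sum_{n>=1} x^n / n."""
    return sum(x**n / n for n in range(1, N + 1))


if __name__ == "__main__":
    # Inside the interval of convergence, and at the endpoint x = -1.
    for x in (0.5, -0.5, -1.0):
        print(x, partial_sum(x, 10**5), -math.log(1 - x))
    # At x = 1 the partial sums are those of the harmonic series and keep growing.
    for N in (10, 10**3, 10**5):
        print(N, partial_sum(1.0, N))
```

At $x = -1$ the convergence is slow (the error after $N$ terms is of order $1/N$), in keeping with the fact that the convergence there is only conditional.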
Theorem 2.35 For every sequence $(a_n)$, there is an $R$ such that the series $\sum a_n x^n$ converges absolutely for all $x \in (-R, R)$, and diverges for all $x$ with $|x| > R$. (It is possible that $R = \infty$.)
      In the case in which R is finite, what happens at ±R is not determined by this theorem, and has to
      be considered separately. The name radius of convergence is given to R.
At the end of this chapter and the relevant reading, you should be able to:
This, you might recognise, is the sum of an arithmetic progression, and so, using the formula for such a sum, we obtain
\[
s_n = \frac{1}{2} n(n + 1).
\]
Learning activity 2.2 There are at least two ways we can do this. First, as we already noted, the partial sum $s_n$ equals $-1$ if $n$ is odd and $0$ if $n$ is even. So the sequence $(s_n)$ alternates between the two values $-1$ and $0$, and for this reason it does not converge. So the series diverges. Alternatively, we could use Theorem 2.15: $a_n = (-1)^n$ and so $a_n$ does not tend to 0. Hence the series diverges.
Learning activity 2.3 Let $s_n$ be the $n$th partial sum of $\sum 1/n^s$. Because $s \le 1$, $1/n^s \ge 1/n$. So, if $t_n$ is the $n$th partial sum of $\sum 1/n$, then $s_n \ge t_n$. But $\sum 1/n$ diverges and $1/n \ge 0$ for all $n$, so by Theorem 2.16, $t_n \to \infty$. Since $s_n \ge t_n$, it follows that $s_n \to \infty$ also and hence the sequence $(s_n)$ diverges. But this means precisely that $\sum 1/n^s$ diverges.
2.12 Exercises
Exercise 2.1 Use the comparison test to prove that the series
\[
\sum \frac{n^2 - n + 1}{n^3 + 1}
\]
diverges.
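Exercise 2.7 For each of the following series say whether the series converges or diverges. In each case, give a brief reason or proof.
\[
\sum n^{1/n}, \quad \sum \frac{1}{n^{5/4}}, \quad \sum \cos\left(\frac{1}{n}\right),
\]
\[
\sum \frac{1}{\sqrt{n}}, \quad \sum \frac{\sqrt{n}}{2n^3 - 1}, \quad \sum \frac{2n + (-1)^n}{n^2 - n + 1}, \quad \sum \frac{n + (-1)^n \sqrt{n}}{(n + 1)^4}.
\]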
Exercise 2.8 Use the root test to determine whether the series $\sum \left( \frac{n}{2n + 1} \right)^n$ converges.
Exercise 2.9 For each of the following series say whether the series converges or diverges. In each case, give a brief reason or proof.
\[
\sum \frac{(n + 1)^2}{n!}, \quad \sum \frac{2 \cdot 5 \cdot 8 \cdots (3n - 1)}{4 \cdot 8 \cdot 12 \cdots (4n)}, \quad \sum \frac{(n!)^2}{(2n)!}\, 6^n, \quad \sum \frac{(n!)^2}{(2n)!}\, 4^n.
\]
       Exercise 2.10 Prove Theorem 2.25; i.e., verify the correctness of the Root Test. (Hint: try to follow
       the proof for the Ratio Test).
Exercise 2.11 Discuss the convergence of the series $\sum \frac{1}{(n + 1)(\log(n + 1))^s}$ for all $s > 0$.
Exercise 2.12 Determine whether each of the following series converges, in each case justifying your answer carefully.
\[
\sum 3^n 5^{-n^2}, \quad \sum \frac{(-1)^n n}{n^2 + 1}, \quad \sum \frac{(-1)^n n^2}{n^2 + 1}, \quad \sum (-1)^n \sin\left(\frac{1}{n}\right), \quad \sum \frac{\sin n}{n^{3/2}}.
\]
Chapter 3
Sequences, functions and limits in higher
   dimensions
Contents
           3.1     Introduction
           3.2     Sequences in $\mathbb{R}^m$
           3.3     Revision: limits and continuity of functions $f : \mathbb{R} \to \mathbb{R}$
           3.4     Limits and continuity of functions $f : \mathbb{R}^n \to \mathbb{R}^m$
           3.5     Learning outcomes
           3.6     Comments on selected activities
           3.7     Exercises
Reading
        Some of this chapter contains material that is revision from Introduction to Abstract
        Mathematics. Coverage in the textbooks of the other material in this chapter is weak. Chapter 19 of
        Binmore’s book is probably the best place to look.
3.1 Introduction
In this chapter we look at what it means for a sequence of vectors to converge. We then look at limits and continuity of functions from $\mathbb{R}^n$ to $\mathbb{R}^m$, reminding ourselves en route of the relevant concepts for functions from $\mathbb{R}$ to $\mathbb{R}$ that we met in Introduction to Abstract Mathematics.
3.2 Sequences in $\mathbb{R}^m$
3.2.1 Distance in $\mathbb{R}^m$
The Euclidean distance (or simply distance) between $x = (x_1, x_2, \ldots, x_m)$ and $y = (y_1, y_2, \ldots, y_m)$ in $\mathbb{R}^m$ is defined to be
\[
\|x - y\| = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}.
\]
(The case in which $m = 1$ corresponds to the distance $|x - y|$ between two real numbers.)
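The definition translates directly into a few lines of Python. This is a minimal sketch in which vectors are represented as tuples of numbers; the function name `euclidean_distance` is a hypothetical choice for this example.

```python
import math


def euclidean_distance(x, y):
    """Return the Euclidean distance between two points of R^m,
    given as equal-length sequences of coordinates."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of coordinates")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


if __name__ == "__main__":
    print(euclidean_distance((0, 0), (3, 4)))        # a 3-4-5 right triangle
    print(euclidean_distance((2,), (5,)), abs(2 - 5))  # m = 1 gives |x - y|
```

In the case $m = 1$ the formula reduces to $\sqrt{(x - y)^2} = |x - y|$, as the second printed line illustrates.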
There is a certain mathematical attraction in defining distances on rather more abstract or unusual spaces, and this leads to the notion of a metric space. This is something we will touch on later in this course. For the moment, we are primarily interested in Euclidean space $\mathbb{R}^m$, for some integer $m \ge 1$, and we shall always use the Euclidean distance.
Note: The Euclidean distance between two vectors $x, y$ in $\mathbb{R}^m$ is simply the norm or length of $x - y$, where the norm is the one arising from the usual inner product on $\mathbb{R}^m$. (See Linear Algebra.) Consequently, the Euclidean distance has some nice properties. For example, by the triangle inequality for norms, we have that for any $x, y, z \in \mathbb{R}^m$,
\[
\|x - z\| \le \|x - y\| + \|y - z\|.
\]
        Having equipped ourselves with a notion of distance in R^m, we can say what we mean by a bounded
        subset. A bounded subset of R^m is one in which there is some fixed number bounding the distance
        between any two points in the set. Formally:
        Definition 3.1 (Bounded subset of R^m) A subset B of R^m is bounded if there is K > 0 such that
        for all x, y ∈ B, ‖x − y‖ ≤ K.
        Note that K is fixed: it does not depend on x, y. (The definition would be meaningless if K were
        allowed to depend on x and y.) There are other, equivalent, ways to think about boundedness. For
        instance, we have the following characterisation.
        Theorem 3.2 A subset B of R^m is bounded if and only if there is some M such that ‖x‖ ≤ M for
        all x ∈ B.
        Prove Theorem 3.2. (You will have to show that Definition 3.1 implies the property described in
        Theorem 3.2, and also that the property described in that theorem implies the property of
        Definition 3.1.)
3.2.2 Convergence in Rm
Equivalently, x_n → x as n → ∞ if
‖x_n − x‖ → 0 as n → ∞.
        The following result says that a sequence converges to a point if and only if it converges in each
        co-ordinate.
        Theorem 3.4 Suppose (x_n) is a sequence in R^m and let x_n = (x_{1n}, x_{2n}, ..., x_{mn}). Then
        x_n → x = (x_1, ..., x_m) if and only if, for i = 1, ..., m, x_{in} → x_i as n → ∞.
        Proof. Suppose x_n → x and let ε > 0 be given. Then there is N such that for n > N,
        ‖x_n − x‖ < ε. But
            \[ \|x_n - x\| = \sqrt{\sum_{i=1}^{m} (x_{in} - x_i)^2} \ge |x_{in} - x_i| \]
        for any i between 1 and m, so for n > N, |x_{in} − x_i| < ε and hence x_{in} → x_i. On the other hand, if,
        for each i, |x_{in} − x_i| < α, then
            \[ \|x_n - x\| = \sqrt{\sum_{i=1}^{m} (x_{in} - x_i)^2} < \sqrt{m\alpha^2} = \sqrt{m}\,\alpha, \]
        so if we let α = ε/√m, we have:
            \[ |x_{in} - x_i| < \epsilon/\sqrt{m} \ \ (i = 1, 2, \dots, m) \implies \|x_n - x\| < \epsilon. \]
        If x_{in} → x_i for each i, we may take ε/√m in the definition of limit (in place of ε) to see that there is
        some N_i such that for n > N_i, |x_{in} − x_i| < ε/√m. Let N be the largest of N_1, ..., N_m. Then for
        n > N, |x_{in} − x_i| < ε/√m for all i and hence ‖x_n − x‖ < ε. This shows that x_n → x.
        Example Suppose
            \[ x_n = \begin{pmatrix} \dfrac{n}{4n+2} \\[2mm] \dfrac{n^2}{n^2-1} \end{pmatrix}. \]
        Then, as n → ∞,
            \[ x_n \to x = \begin{pmatrix} 1/4 \\ 1 \end{pmatrix}. \]
        To see this, we can simply observe that n/(4n + 2) → 1/4 and n^2/(n^2 − 1) → 1. Alternatively
        (though this is more difficult), we could calculate ‖x_n − x‖ and check that ‖x_n − x‖ → 0 as n → ∞.
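        The "more difficult" route in the example can at least be explored numerically. The following is a
        minimal sketch, not part of the original notes: the helper names x_n and distance are our own, and
        the sequence is only evaluated for n ≥ 2, since the second coordinate is undefined at n = 1.

```python
import math

def x_n(n):
    """n-th term (n/(4n+2), n^2/(n^2-1)) of the example sequence; requires n >= 2."""
    return (n / (4 * n + 2), n**2 / (n**2 - 1))

def distance(u, v):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

limit = (0.25, 1.0)          # the claimed limit x = (1/4, 1)
ns = (10, 100, 1000, 10000)
errors = [distance(x_n(n), limit) for n in ns]

for n, err in zip(ns, errors):
    print(n, err)            # the distances ||x_n - x|| shrink as n grows
```

        A table of shrinking distances is only evidence, of course; the proof is the coordinate-wise
        argument of Theorem 3.4.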
Thus, a sequence (x_n) is bounded if there is some number M such that ‖x_n‖ ≤ M for all n.
        Many of the results for sequences of real numbers extend to sequences in Rm for m > 1. For
        example, the Bolzano-Weierstrass theorem has the following generalisation.
        This section acts to remind us briefly of the important ideas of limit and continuity of functions from
        R to R.
        Definition 3.7 (Limit of a function at a point) Let f : R → R be a function. We say that L is the
        limit of f(x) as x tends to a, denoted by lim_{x→a} f(x) = L, if for each ε > 0, there exists δ > 0 such
        that
            \[ 0 < |x - a| < \delta \implies |f(x) - L| < \epsilon. \]
        The definition states that if someone gives us any arbitrarily small ε > 0, then there is some
        neighbourhood of a, (a − δ, a + δ), such that any x in this neighbourhood, other than possibly a
        itself, will have f(x) in the ε-neighbourhood (L − ε, L + ε) of L. (Note that what happens to the
        function at a is irrelevant.)
        Let f, g : R → R be two functions and c any real number. Then a new function (f + g) is obtained by
        defining for each x, (f + g)(x) = f(x) + g(x). Similarly, we may define the functions
        |f|, (cf), (f − g), (fg) and (f/g), the last provided g(x) ≠ 0. (For example, (fg)(x) = f(x)g(x).
        This should not be confused with the composite function f(g(x)).)
        Theorem 3.8 Let f , g : R → R be two functions and c any real number. Suppose that
        limx→a f (x) = L and limx→a g(x) = M . Then
        Definition 3.9 (One-sided limits) Let f : R → R be a function. We say that L is the limit of f(x)
        as x approaches a from the left, denoted by lim_{x→a−} f(x) = L, if for each ε > 0, there exists δ > 0
        such that
            \[ 0 < a - x < \delta \implies |f(x) - L| < \epsilon. \]
A similar definition applies to limits from the right, denoted lim_{x→a+} f(x) = L.
        So, to say that f is continuous on [a, b] means that f is continuous at each point in (a, b), and that it
        is continuous on the left at b and continuous on the right at a.
It follows from the results on the algebra of limits that there are ‘heredity’ results for continuity.
        Theorem 3.14 (Heredity results for continuity) Let f, g : R → R be functions that are continuous
        at a ∈ R and c be any real number. Then |f|, (cf), (f − g), (f + g), (fg) are all continuous at a, and
        (f/g) is continuous provided g(x) ≠ 0 for any x in some neighbourhood of a.
        As a corollary, any polynomial p(x) = \sum_{i=0}^{k} a_i x^i is continuous.
        Recall that if f, g are functions, then we may define the composite function f (g(x)). It turns out that
        if g is continuous at a, and f is continuous at g(a), then the composite function f (g(x)) is
        continuous at a.
        Suppose that f : R → R. As mentioned above, we say that f(x) tends to L as x → a if and only if
        given any ε > 0, there is δ > 0 such that
            \[ 0 < |x - a| < \delta \implies |f(x) - L| < \epsilon. \]
        This appears to use the fact that one can form the difference between any two real numbers.
        However, as above, we can interpret |x − a| as the distance between the real numbers x and a, in
        which case the above condition can be restated as
            0 < (distance between x and a) < δ  ⟹  (distance between f(x) and L) < ε.
        It should be clear from this that the condition doesn’t really use any algebraic properties of R, only
        ‘distance’ properties. This definition and many of its consequences will remain if we have as domain
        and codomain Rn and Rm for any m, n ≥ 1.
        Definition 3.15 Suppose f : R^n → R^m, and that a ∈ R^n and L ∈ R^m. We say that L is the limit of
        f(x) as x tends to a if for each ε > 0, there exists δ > 0 such that
            \[ 0 < \|x - a\| < \delta \implies \|f(x) - L\| < \epsilon. \]
(Note that we use the same notation, ‖·‖, for the lengths of vectors in both R^n and R^m.)
        The definition can be modified in the obvious way if the function f maps from some subset A of Rn :
        we simply add the qualification that x ∈ A.
        We now give two examples to illustrate some important points about considering limits for functions
        f : Rn → Rm . Two key observations are:
            The limit of f (x) as x → a exists and equals L only if, no matter how x tends to a, the value of
            the function approaches L.
            Consideration of particular approaches of x to a along particular ‘trajectories’ (such as lines) can
            be used to show that a limit does not exist, but it can never be used to show a limit does exist:
            that requires a more general argument that assumes nothing about how x approaches a.
        where we have used the fact that for any real numbers a and b, 2ab ≤ a^2 + b^2. (This follows from
        (a − b)^2 ≥ 0.)
        When looking at limits for functions from R to R, we noticed that one can define left and right limits
        and that these might be different. A counterpart to the idea of left and right limits for functions
        f : Rn → Rm when n > 1 is the idea of the limit along a path.
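            \[ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \frac{x_2^2 - x_1^2}{x_1^2 + x_2^2} \]
        Let’s consider what happens to f(x) as x tends to 0 = (0, 0)^T along the line x_2 = αx_1; that is,
        through x of the form (t, αt)^T. We have
            \[ f\begin{pmatrix} t \\ \alpha t \end{pmatrix} = \frac{\alpha^2 t^2 - t^2}{t^2 + \alpha^2 t^2} = \frac{\alpha^2 - 1}{\alpha^2 + 1}. \]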
        So, f (x) approaches different values as x → 0 along different lines. In particular, f (x) does not have
        a limit as x → 0.
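        The dependence on the slope α can be seen numerically. The following is a small illustrative sketch
        (the function names are our own choices): it evaluates f at points (t, αt) close to the origin and
        compares the result with (α^2 − 1)/(α^2 + 1).

```python
def f(x1, x2):
    """f(x1, x2) = (x2^2 - x1^2) / (x1^2 + x2^2), defined away from the origin."""
    return (x2**2 - x1**2) / (x1**2 + x2**2)

def value_along_line(alpha, t):
    """Value of f at the point (t, alpha*t) on the line x2 = alpha*x1."""
    return f(t, alpha * t)

def predicted(alpha):
    """The constant value (alpha^2 - 1)/(alpha^2 + 1) derived in the text."""
    return (alpha**2 - 1) / (alpha**2 + 1)

for alpha in (0.0, 1.0, 2.0):
    samples = [value_along_line(alpha, t) for t in (0.1, 0.001)]
    print(alpha, samples, predicted(alpha))
```

        Each slope gives a different constant, which is exactly why no single limit can exist at 0.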
        The function g is quite different. If we again investigate what happens as x → 0 along the lines
        x_2 = αx_1, we note that, for all α,
            \[ g\begin{pmatrix} t \\ \alpha t \end{pmatrix} = \frac{\alpha^2 t^3}{t^2 + \alpha^4 t^4} \to 0 \text{ as } t \to 0. \]
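        So, here, the limit as x tends to 0 along all lines is the same. However, we cannot deduce from this
        alone that g(x) has a limit as x → 0. For, consider what happens to g(x) as x → 0 along the
        parabola given by x_1 = αx_2^2 (that is, through points (αt^2, t)^T). We have
            \[ g\begin{pmatrix} \alpha t^2 \\ t \end{pmatrix} = \frac{\alpha t^4}{\alpha^2 t^4 + t^4} = \frac{\alpha}{\alpha^2 + 1}, \]
        a value that depends on α. So g(x) approaches different values along different parabolas, and
        therefore g(x) does not have a limit as x → 0.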
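        A numerical comparison of the two kinds of path makes the point concrete. This is a sketch under
        an assumption: we take g(x_1, x_2) = x_1 x_2^2 / (x_1^2 + x_2^4) away from the origin, which is the
        formula consistent with both path calculations above. The helper names are hypothetical.

```python
def g(x1, x2):
    """Assumed formula g(x1, x2) = x1*x2^2 / (x1^2 + x2^4), away from the origin."""
    return x1 * x2**2 / (x1**2 + x2**4)

def along_line(alpha, t):
    """Value of g at (t, alpha*t), a point on the line x2 = alpha*x1."""
    return g(t, alpha * t)

def along_parabola(alpha, t):
    """Value of g at (alpha*t^2, t), a point on the parabola x1 = alpha*x2^2."""
    return g(alpha * t**2, t)

t = 1e-3
for alpha in (1.0, 2.0):
    # Along lines the value is small; along parabolas it stays at alpha/(alpha^2 + 1).
    print(alpha, along_line(alpha, t), along_parabola(alpha, t))
```

        Along every line the values shrink towards 0, while along the parabolas they stay fixed at
        α/(α^2 + 1), so agreement along all lines is not enough to give a limit.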
        The two examples just given are very, very important and illustrate why the topic of limits for
        functions f : Rn → Rm is quite hard when n > 1. We repeat the main lessons to be learned. There
        are so many different ways in which x can approach a given a. The limit of f (x) as x → a exists and
        equals L only if, no matter how x tends to a, the value of the function approaches L. Consideration
        of particular approaches of x to a along particular ‘trajectories’ (such as lines) can be used to show
        that the limit does not exist (because, for instance, the values along different trajectories tend to
        different limits). However, to show that the limit of f as x tends to a exists, an argument needs to be
        given that does not assume any particular way in which x tends towards a.
        For a subset X of R^m, we say that f is continuous on X if for all a ∈ X, the limit of f(x), as x → a,
        with x ∈ X, exists and equals f(a).
        Theorem 3.17 Suppose that f : R^n → R^m and that f_1, f_2, ..., f_m : R^n → R are such that for all
        x ∈ R^n,
            \[ f(x) = (f_1(x), f_2(x), \dots, f_m(x))^T. \]
        Then f is continuous at a ∈ R^n if and only if f_1, f_2, ..., f_m are continuous at a.
        Note that if e_1, e_2, ..., e_m are the standard basis vectors of R^m (so that e_i has a 1 in position i and
        all other entries equal to 0), then the functions f_i referred to are given by
            \[ f_i(x) = \langle f(x), e_i \rangle = e_i^T f(x), \]
        where ⟨a, b⟩ denotes the usual inner product (scalar product) on R^m. We shall call the functions f_i
        the component functions of f.
        A useful observation is that all linear functions are continuous. Recall that a linear function
        f : Rn → Rm is one with the property that for all x, y ∈ Rn and all α, β ∈ R,
                                             f (αx + βy) = αf (x) + βf (y).
        Any linear function can be represented in matrix form: that is, there is some m × n matrix M such
        that f(x) = Mx for all x. In this case, for 1 ≤ i ≤ m,
            \[ f_i(x) = e_i^T f(x) = e_i^T M x. \]
        If we let m_i denote M^T e_i, then we see that
            \[ f_i(x) = e_i^T M x = (M^T e_i)^T x = m_i^T x, \]
        and so, for any a ∈ R^n, by the Cauchy-Schwarz inequality,
            \[ \|f_i(x) - f_i(a)\| = \|m_i^T x - m_i^T a\| = \|m_i^T (x - a)\| \le \|m_i^T\| \, \|x - a\|. \]
        As x → a, the right hand side tends to 0, because ‖m_i^T‖ is just a fixed number. This shows that f_i is
        continuous, for each i, and it follows that f is continuous.
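        The key inequality can be spot-checked on random points. The sketch below is only an illustration:
        the 2 × 3 matrix M and the point a are arbitrary choices of ours, and row i of M plays the role
        of m_i^T.

```python
import math
import random

def norm(v):
    """Euclidean norm of a list of numbers."""
    return math.sqrt(sum(c * c for c in v))

def mat_vec(M, x):
    """Matrix-vector product M x, with M given as a list of rows."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]

M = [[1.0, -2.0, 0.5],
     [3.0, 0.0, 4.0]]   # hypothetical matrix; row i is m_i^T

def check_bound(M, x, a, tol=1e-12):
    """Check |f_i(x) - f_i(a)| <= ||m_i|| * ||x - a|| for every component i."""
    fx, fa = mat_vec(M, x), mat_vec(M, a)
    diff = [xj - aj for xj, aj in zip(x, a)]
    return all(abs(fx[i] - fa[i]) <= norm(M[i]) * norm(diff) + tol
               for i in range(len(M)))

random.seed(0)
a = [1.0, 2.0, 3.0]
results = [check_bound(M, [aj + random.uniform(-1, 1) for aj in a], a)
           for _ in range(1000)]
print(all(results))
```

        The small tolerance tol only absorbs floating-point rounding; the inequality itself is exact.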
        Theorem 3.18 Suppose that f : Rn → Rm and that a ∈ Rn . Then f is continuous at a if and only
        if for every sequence (xn ) converging to a we have f (xn ) → f (a). Therefore f is continuous (on the
        whole of Rn ) if for every convergent sequence (xn ) in Rn , we have lim f (xn ) = f (lim xn ).
        Proof. Let (*) be the statement that for any sequence (x_n) such that lim_{n→∞} x_n = a,
        lim_{n→∞} f(x_n) = f(a).
        Suppose first that f is continuous at a. We prove that this implies (*). Let (x_n) be a sequence in
        R^n converging to a. We want to show that f(x_n) → f(a) as n → ∞, that is,
            \[ \forall \epsilon > 0 \ \exists N \ \forall n \ge N \quad \|f(x_n) - f(a)\| < \epsilon. \tag{$**$} \]
        To prove this, let ε > 0. Choose, according to the definition of continuity, a δ > 0 so that for all x,
        whenever ‖x − a‖ < δ, then ‖f(x) − f(a)‖ < ε. Since x_n → a as n → ∞, there is an N so that n ≥ N
        implies ‖x_n − a‖ < δ, which in turn implies ‖f(x_n) − f(a)‖ < ε. This shows (**) as desired.
        Conversely, assume that property (*) holds. In order to show continuity, we assume, to the contrary,
        that the function is discontinuous at a. This means that there is an ε > 0 so that for all δ > 0 there is
        an x with ‖x − a‖ < δ but ‖f(x) − f(a)‖ ≥ ε. In particular, for every natural number n, letting
        δ = 1/n, there is a point x, call it x_n, with ‖x_n − a‖ < 1/n but ‖f(x_n) − f(a)‖ ≥ ε. But then
        clearly x_n → a as n → ∞, but we do not have f(x_n) → f(a) as n → ∞, a contradiction to (*).
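        The sequence criterion is often the quickest way to exhibit a discontinuity: one sequence x_n → a
        with f(x_n) not tending to f(a) is enough. Here is a minimal sketch in the case n = m = 1, using a
        step function h that we introduce purely for illustration.

```python
def h(x):
    """Hypothetical step function: 0 for x <= 0 and 1 for x > 0."""
    return 0.0 if x <= 0 else 1.0

a = 0.0
xs = [1.0 / n for n in range(1, 10001)]   # a sequence converging to a = 0
gaps = [abs(h(x) - h(a)) for x in xs]     # |h(x_n) - h(a)| for each term

print(xs[-1], min(gaps))
```

        Here x_n = 1/n tends to 0 while |h(x_n) − h(0)| stays equal to 1, so by Theorem 3.18 h is not
        continuous at 0.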
At the end of this chapter and the relevant reading, you should be able to:
      Learning activity 3.1 This is an “if and only if” problem. Separate the two halves completely;
      otherwise only confusion will ensue.
      Draw pictures to see why this result is true; if you just launch into a calculation you won’t succeed.
      So, if the “diameter” K of B is finite, why can there not be points of B arbitrarily far from the
      origin, and how then can we put some specific upper bound M on the distance from the origin? For
      the other way round, if all points of B are at distance at most M from the origin, how far can a pair
      of points of B be from each other? Once you’ve figured out (in both directions) what to prove, the
      main weapon at your disposal is the triangle inequality.
      First suppose that B is bounded, i.e., there is some constant K such that ‖x − y‖ ≤ K for all
      x, y ∈ B. If B = ∅ then the result is trivial. If B ≠ ∅, then fix some y ∈ B, and observe that, for all
      x ∈ B, ‖x‖ = ‖x − 0‖ ≤ ‖x − y‖ + ‖y − 0‖ ≤ K + ‖y‖, by the triangle inequality. Therefore, taking
      M = K + ‖y‖, we see that ‖x‖ ≤ M for all x ∈ B.
      Now suppose that there is some M such that ‖x‖ ≤ M for all x ∈ B. Take any x, y in B, and
      observe that ‖x − y‖ ≤ ‖x − 0‖ + ‖0 − y‖ = ‖x‖ + ‖y‖ ≤ M + M = 2M. So, setting K = 2M, we
      see that ‖x − y‖ ≤ K for all x, y ∈ B, as required.
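      The two bounds in this argument can be checked on a concrete finite set. The sketch below is an
      illustration only: B is a random sample of 200 points in R^2 chosen by us, K is its largest pairwise
      distance and M its largest distance from the origin.

```python
import math
import random

def norm(v):
    """Euclidean norm of a tuple of numbers."""
    return math.sqrt(sum(c * c for c in v))

def sub(u, v):
    """Componentwise difference u - v."""
    return tuple(ui - vi for ui, vi in zip(u, v))

random.seed(1)
B = [(random.uniform(-3, 5), random.uniform(-2, 2)) for _ in range(200)]

K = max(norm(sub(x, y)) for x in B for y in B)   # the "diameter" of B
M = max(norm(x) for x in B)                      # largest distance from the origin
y = B[0]                                         # the fixed point y used in the proof

print(M <= K + norm(y), K <= 2 * M)
```

      Both comparisons come out true, matching M ≤ K + ‖y‖ from the first half of the proof and
      K ≤ 2M from the second.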
3.7 Exercises
      Exercise 3.1 For n ∈ N, let the point x_n in R^3 be given by x_n = (1/n, 1/n, 2/n). Calculate ‖x_n‖
      for each n, and hence show that x_n → 0 as n → ∞.
      Exercise 3.2 Let f, g : R → R be two functions, and c a real number. Suppose that limx→c f (x) = A
      and limx→c g(x) = B. Prove, directly from the definitions, that limx→c (f (x)g(x)) = AB.
      Hint: f (x)g(x) − AB = f (x)(g(x) − B) + (f (x) − A)B.
      Exercise 3.3 Show that lim_{x→0} f(x) = 0, when
          \[ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \frac{x_1^2 + x_2^2}{|x_1| + |x_2|}. \]
Exercise 3.4 Prove, directly from the definition, that the function f : R^2 → R defined by
    \[ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1 x_2 \]
is continuous.
Exercise 3.5 Does lim_{x→0} f(x) exist, when
    \[ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \frac{x_1 x_2}{|x_1| + |x_2|} ? \]
[You might make use of the fact that, for any u, v, u^2 + v^2 ≥ 2uv.]
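Exercise 3.7 Suppose g : R^2 → R is the function given by
    \[ g\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{cases} \dfrac{x_1 x_2^4}{x_1^2 + x_2^8} & \text{if } (x_1, x_2)^T \ne 0, \\ 0 & \text{if } (x_1, x_2)^T = 0. \end{cases} \]
Prove that g is not continuous at 0.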
Exercise 3.8 Let f : R^2 → R^2 be
    \[ f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{cases} (x, 0)^T & \text{if } y \le 0, \\ (x, x)^T & \text{if } y > 0. \end{cases} \]
Determine the set of a ∈ R^2 at which f is continuous.
Chapter 4
Differentiation
Contents
           4.1 Introduction
           4.2 Derivative of functions f : R → R
           4.3 Differentiation of functions f : R^n → R^m
           4.4 Learning outcomes
           4.5 Comments on selected activities
           4.6 Exercises
Reading
4.1 Introduction
        You will know already how useful differentiation is. In this chapter, we look at why the familiar
        techniques of calculus work, and we also see how the notion of derivative of a function f : R → R can
        be generalised to functions f : Rn → Rm .
        Let f be a real-valued function defined at all points of an interval (a, b). For each x ∈ (a, b), we
        define the derivative of f at x, denoted by f′(x), to be
            \[ \lim_{y \to x} \frac{f(y) - f(x)}{y - x} \]
if this limit exists (if not, then the function does not have a derivative at x).
        As you will know from calculus courses, we can think of f′(x) as being the slope of the tangent to the
        graph of f at x.
        If f′(x) is defined, we say that f is differentiable at x. If f is differentiable at each x ∈ (a, b), then f
        is said to be differentiable on (a, b).
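        To see the defining limit at work, one can tabulate the difference quotient for y close to x. The
        following is a minimal sketch; the choice f(x) = x^2 at x = 3 and the helper name
        difference_quotient are ours, not the notes'.

```python
def difference_quotient(f, x, y):
    """The quotient (f(y) - f(x)) / (y - x) from the definition of the derivative."""
    return (f(y) - f(x)) / (y - x)

def square(x):
    return x * x

x = 3.0
steps = (1e-1, 1e-3, 1e-5, -1e-5)   # y = x + h approaches x from both sides
quotients = [difference_quotient(square, x, x + h) for h in steps]

for h, q in zip(steps, quotients):
    print(h, q)                     # the quotients approach 2x = 6
```

        For this f the quotient equals x + y exactly, so it tends to 2x as y → x, in line with the familiar
        rule from calculus.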
        We can also define the right and left derivatives of f at x. For example, the right derivative, denoted
        by f′_r(x), is the limit
            \[ \lim_{y \to x^+} \frac{f(y) - f(x)}{y - x} \]
        if it exists. We define the left derivative f′_l(x) similarly.
Note that f′(x) exists if and only if f′_l(x) and f′_r(x) exist and are equal.
First suppose that a > 0. Then for y sufficiently close to a we have that y > 0 and so |y| = y. Hence
Differentiability is a stronger version of continuity, in the sense that differentiability implies continuity.
        It should not be thought that the converse is true; that is, that continuity implies differentiability.
        Indeed (see Brannan, David, A First Course in Mathematical Analysis, Chapter 6), there is a function
        that is continuous at all real numbers, but not differentiable anywhere. The following results are
        well-known to you.
Theorem 4.2 Let f, g be defined on (a, b) and differentiable at c ∈ (a, b). Then
        Try to prove the second of the results of Theorem 4.2 using the definition of derivative. (You will, of
        course, recognise this as the product rule.)
        Theorem 4.3 (Chain Rule) Let f be defined on (a, b) and suppose f′(c) exists. Let g be defined on
        the range of f and be differentiable at f(c). Define the new function
        Note that G(y) → g′(f(c)) as y → f(c). (The problem is that if y = f(c), then we cannot divide by
        y − f(c).) Now, if f(x) ≠ f(c),
            \[ \frac{g(f(x)) - g(f(c))}{x - c} = \frac{g(f(x)) - g(f(c))}{f(x) - f(c)} \cdot \frac{f(x) - f(c)}{x - c} = G(f(x)) \cdot \frac{f(x) - f(c)}{x - c}, \]
        and this conclusion is also valid if f(x) = f(c) and x ≠ c. As x → c, f(x) → f(c) (Theorem 4.1), so
        G(f(x)) → g′(f(c)). Also of course (f(x) − f(c))/(x − c) → f′(c) as x → c. The result follows.
        You will know from calculus that the sign of the derivative provides information about how a function
        behaves. For instance, if f′(a) > 0, then in some small neighbourhood of a, the f-values to the left
        of a are smaller than f(a), and those to the right are larger.
Theorem 4.4 If f′(a) > 0, then there exists δ > 0 such that
        Proof. Since lim_{x→a} (f(x) − f(a))/(x − a) = f′(a) > 0, then (taking ε = f′(a) in the definition of
        limit) we can choose a δ > 0 such that
            \[ 0 < |h| < \delta \implies \left| \frac{f(a + h) - f(a)}{h} - f'(a) \right| < f'(a). \]
        Definition 4.5 Let f : R → R. We say that f has a local maximum at a point c ∈ R if there exists
        δ > 0 such that f (c) ≥ f (x) for all x ∈ (c − δ, c + δ). We say that f has a local minimum at c if
        there exists δ > 0 such that f (c) ≤ f (x) for all x ∈ (c − δ, c + δ).
        Well, you know that to find local maxima or minima, you solve f′ = 0. But why? Can we give a
        formal justification for this? Indeed we can, as the following theorem and its proof show.
        Theorem 4.6 Let f : R → R. If f has a local maximum (or minimum) at c, and if f′(c) exists, then
        f′(c) = 0.
        Proof. Suppose that f has a local maximum at c. The proof in the case of a local minimum is
        similar. Then there is δ > 0 such that for all x ∈ I = (c − δ, c + δ), f(x) ≤ f(c). So, for x < c and
        x ∈ I,
            \[ \frac{f(x) - f(c)}{x - c} \ge 0 \]
        and for x > c and x ∈ I,
            \[ \frac{f(x) - f(c)}{x - c} \le 0. \]
        But f′(c) exists and so
            \[ f'(c) = \lim_{x \to c^-} \frac{f(x) - f(c)}{x - c} \ge 0. \]
(This left limit is non-negative because f(x) − f(c) ≤ 0 and, for such x, x − c < 0.) But, also,
            \[ f'(c) = \lim_{x \to c^+} \frac{f(x) - f(c)}{x - c} \le 0, \]
        and hence f′(c) = 0.
        The theorem tells us that if f is differentiable on (a, b), then in order to examine all local maxima or
        minima, we may restrict attention to the points where the derivative is zero. Of course, in general a
        function may have a local maximum or minimum at a point where it is not differentiable. (For
        example: f (x) = |x| has a local minimum at x = 0.)
        In this section, we look at three extremely useful and important results. First, we have the following
        (which we call the Extreme Value Theorem), which says that a continuous function will have a
        maximum and a minimum on a closed and bounded interval.
        Theorem 4.7 (Extreme Value Theorem) Suppose the real function f is continuous on the closed
        bounded interval [a, b]. Then f is bounded on [a, b] and attains its bounds; that is, there are
        c1 , c2 ∈ [a, b] such that
f (c1 ) = min{f (x) : x ∈ [a, b]}, f (c2 ) = max{f (x) : x ∈ [a, b]}.
        Proof. Suppose first that f is unbounded above. For each n ∈ N, let x_n be a point in [a, b] such
        that f(x_n) > n. The sequence (x_n) is bounded, so has a convergent subsequence (x_{n_k}), tending to
        some limit c. Necessarily c ∈ [a, b]. Since f is continuous at c, f(x_{n_k}) → f(c) as k → ∞. But this
        contradicts the construction of the sequence (x_n), since f(x_{n_k}) > n_k → ∞ as k → ∞.
        So f is bounded above. Let M = sup{f(x) : x ∈ [a, b]}. For each n ∈ N, let x_n be a point in [a, b]
        such that f(x_n) > M − 1/n. Again take a convergent subsequence (x_{n_k}) of (x_n), tending to some
        limit c ∈ [a, b]. Arguing as before, we see f(c) = M.
The argument showing that f attains a minimum value over [a, b] is very similar.
Note that this result does not hold if, for instance, the domain of f is an open interval (a, b).
        The next two results, Rolle’s Theorem and the Mean Value Theorem, concern the derivative. As we
        shall see, they are very useful.
        Theorem 4.8 (Rolle’s Theorem) Let f be continuous on [a, b] and differentiable on (a, b), and
        suppose that f(a) = f(b). Then there exists c ∈ (a, b) such that f′(c) = 0.
Proof. If f(c) = f(a) for all c ∈ [a, b] then we are done because in this case, the definition of
derivative shows that f′(c) = 0 for all c ∈ (a, b). Why? Otherwise we may suppose that there is some
c ∈ [a, b] such that f(c) > f(a) = f(b). Now consider a point c ∈ (a, b) such that
f(c) = max{f(x) : x ∈ [a, b]}. Such a point c exists by the Extreme Value Theorem, and we must
have f′(c) = 0 by Theorem 4.6.
Theorem 4.9 (The Mean Value Theorem) Let f be continuous on [a, b] and differentiable on
(a, b). Then there exists c ∈ (a, b) such that
    \[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
Proof. Let the constant α be given by α = (f(b) − f(a))/(b − a). Then the function g defined by
g(x) = f(x) − αx is continuous on [a, b] and differentiable on (a, b) (because f is) and it satisfies
g(a) = g(b) (this is why we chose α as we did). By Rolle’s Theorem, there is c ∈ (a, b) with
g′(c) = 0. But g′(x) = f′(x) − α, so there is c ∈ (a, b) with f′(c) = α, as required.
What is the point of the Mean Value Theorem? Well, there are a number of ways of thinking about
it. One useful interpretation is that it gives precise results about small changes in the value of a
function. We know that if b is close to a, then f(b) − f(a) ≈ f′(a)(b − a). The Mean Value Theorem
says that
    \[ f(b) - f(a) = f'(c)(b - a) \]
for some c between a and b. This is a precise statement: it is an equality, not an approximation.
Note that it involves f′(c) rather than f′(a). The precise value of c is not necessarily known, so it
could be argued that there is still some inherent uncertainty in the statement, but we at least know
that c is between a and b. Consider the following example.
Example Suppose that n is a positive integer and that we want to find a good approximation to
√(n + 1) − √n. If we let f(x) = √x (for x > 0) then f is differentiable on (0, ∞) and
    \[ \sqrt{n+1} - \sqrt{n} = f(n+1) - f(n). \]
Now, using the derivative in the standard way to obtain an approximation, we see that
    \[ f(n+1) - f(n) \approx f'(n)\,\big((n+1) - n\big) = f'(n), \]
which becomes
    \[ \sqrt{n+1} - \sqrt{n} \approx \frac{1}{2\sqrt{n}}. \]
That’s nice, but how close is the approximation? What does ≈ actually mean?
Suppose instead we use the Mean Value Theorem (MVT). This tells us that, precisely,
    \[ \sqrt{n+1} - \sqrt{n} = f'(c) = \frac{1}{2\sqrt{c}}, \]
where c is the point provided by the theorem.
Now, we don’t know c precisely, but we do know that it lies between n and n + 1. This implies that
\[ \frac{1}{2\sqrt{n+1}} < \frac{1}{2\sqrt{c}} < \frac{1}{2\sqrt{n}}. \]
So the MVT tells us that
\[ \frac{1}{2\sqrt{n+1}} < \sqrt{n+1} - \sqrt{n} < \frac{1}{2\sqrt{n}}. \]
This is much more useful: it shows not only that 1/(2√n) is an approximation to √(n + 1) − √n, but
it shows much more, giving a precise range of possible values, close to the approximation. In other
words, we now know something concrete about how precise the approximation is.
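The two-sided bound can be checked numerically. The following Python sketch is an illustration only: the function name sqrt_gap_bounds and the sample values of n are assumptions, and floating-point arithmetic stands in for exact real arithmetic.

```python
import math


def sqrt_gap_bounds(n):
    """Return (lower, gap, upper) for a positive integer n.

    gap is sqrt(n + 1) - sqrt(n); lower and upper are the bounds
    1 / (2 sqrt(n + 1)) and 1 / (2 sqrt(n)) given by the Mean Value Theorem.
    """
    lower = 1 / (2 * math.sqrt(n + 1))
    gap = math.sqrt(n + 1) - math.sqrt(n)
    upper = 1 / (2 * math.sqrt(n))
    return lower, gap, upper


for n in (1, 10, 100, 10_000):
    lower, gap, upper = sqrt_gap_bounds(n)
    # lower < gap < upper is the strict inequality guaranteed by the MVT
    print(n, lower, gap, upper, lower < gap < upper)
```

The printed bounds move closer together as n grows, which is the sense in which 1/(2√n) becomes a better approximation.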
        The Mean Value Theorem provides a useful tool for proving some of the familiar results about the
        derivative.
        Definition 4.10 Let f : I → R, where I is some interval. f is increasing (decreasing) on I if for each
        x, y ∈ I with x < y, we have f (x) ≤ f (y) (resp. f (x) ≥ f (y)).
The familiar results linking the sign of f′ on (a, b) to f being increasing or decreasing can be
proved by contradiction, using the Mean Value Theorem: they follow from the fact that, for each pair
x1 < x2 in (a, b),
\[ f(x_2) - f(x_1) = (x_2 - x_1) f'(c) \]
for some c ∈ (x1, x2).
Suppose that f : R^n → R. The partial derivative ∂f/∂xi at a point a ∈ R^n is the instantaneous rate
of change of the function with respect to xi, at a.
Formally,
\[ \frac{\partial f}{\partial x_i}(a) = \lim_{h \to 0} \frac{f(a_1, \dots, a_{i-1}, a_i + h, a_{i+1}, \dots, a_n) - f(a)}{h}, \]
if this limit exists.
We may think of the partial derivative ∂f/∂xi as the rate of change in f as we move in the direction
of the vector ei, because
\[ \frac{f(a_1, \dots, a_{i-1}, a_i + h, a_{i+1}, \dots, a_n) - f(a)}{h} = \frac{f(a + h e_i) - f(a)}{h}. \]
But we can move in many other directions, and such considerations lead to the notion of directional
derivative.
\[ \frac{f(\mathbf{0} + t\mathbf{v}) - f(\mathbf{0})}{t} = \frac{1}{t} \left( \frac{(tu)^2 (tv)}{(tu)^4 + (tv)^2} \right) = \frac{u^2 v}{t^2 u^4 + v^2}. \]
        Let’s start informally. Suppose we want to approximate the change in f when x changes from a to
        a + h where
                                                h = (h1 , h2 , . . . , hn ).
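Suppose also that all the partial derivatives exist. If the hi are small enough, then
\[
\begin{aligned}
f_1(a + h) - f_1(a) &\approx \frac{\partial f_1}{\partial x_1}(a)\,h_1 + \cdots + \frac{\partial f_1}{\partial x_n}(a)\,h_n \\
f_2(a + h) - f_2(a) &\approx \frac{\partial f_2}{\partial x_1}(a)\,h_1 + \cdots + \frac{\partial f_2}{\partial x_n}(a)\,h_n \\
&\;\;\vdots \\
f_m(a + h) - f_m(a) &\approx \frac{\partial f_m}{\partial x_1}(a)\,h_1 + \cdots + \frac{\partial f_m}{\partial x_n}(a)\,h_n,
\end{aligned}
\]
so we have that
\[
f(a + h) - f(a) =
\begin{pmatrix} f_1(a + h) - f_1(a) \\ \vdots \\ f_m(a + h) - f_m(a) \end{pmatrix}
\approx
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(a) & \cdots & \frac{\partial f_1}{\partial x_n}(a) \\
\vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1}(a) & \cdots & \frac{\partial f_m}{\partial x_n}(a)
\end{pmatrix}
\begin{pmatrix} h_1 \\ h_2 \\ \vdots \\ h_n \end{pmatrix}.
\]
This describes the linear approximation of f at a, and the matrix (or, equivalently, the linear mapping
it describes)
\[
Df(a) =
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(a) & \cdots & \frac{\partial f_1}{\partial x_n}(a) \\
\vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1}(a) & \cdots & \frac{\partial f_m}{\partial x_n}(a)
\end{pmatrix}
\]
is known as the derivative (or the Jacobian derivative) of f at a.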
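To see the matrix of partial derivatives in a concrete case, here is a minimal Python sketch that approximates it numerically. Everything named here is an assumption made for illustration: the helper jacobian, the step size h, and the example map from R^2 to R^2. The sketch uses central differences, a standard numerical substitute for the limit in the definition of the partial derivative, so it gives approximations rather than exact values.

```python
def jacobian(f, a, h=1e-6):
    """Approximate the m x n matrix of partial derivatives of f at a.

    f maps a list of n floats to a list of m floats. Entry (i, j) estimates
    the partial derivative of the i-th component with respect to x_j.
    """
    n = len(a)
    m = len(f(a))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward = list(a)
        backward = list(a)
        forward[j] += h                  # a + h e_j
        backward[j] -= h                 # a - h e_j
        f_plus, f_minus = f(forward), f(backward)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J


def example(x):
    # hypothetical map f(x, y) = (x^2 y, x + y^3)
    return [x[0] ** 2 * x[1], x[0] + x[1] ** 3]


# The exact matrix at (1, 2) is [[2xy, x^2], [1, 3y^2]] = [[4, 1], [1, 12]].
J = jacobian(example, [1.0, 2.0])
for row in J:
    print(row)
```

Each column of the result records how all the components of f respond to a small change in one coordinate, which is the content of the linear approximation above.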
        The ‘argument’ just given is not precise. We now take a more formal approach, in which we shall see
        that some conditions on the partial derivatives, other than simply their existence, are required to make
        the argument watertight. First, we start with the ‘proper’ formal definition of what is meant by the
derivative of a function f : R^n → R^m.
In the statement of the next theorem, by a neighbourhood of a ∈ R^n we mean a set of the form
{x ∈ R^n : ‖x − a‖ < ε}, for some ε > 0. (There are other, more general interpretations of
‘neighbourhood’, but this will do for now.)
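Suppose that f : R^n → R. Then the gradient of f at a point a is defined to be the column vector
\[
\nabla f(a) =
\begin{pmatrix}
\frac{\partial f}{\partial x_1}(a) \\
\frac{\partial f}{\partial x_2}(a) \\
\vdots \\
\frac{\partial f}{\partial x_n}(a)
\end{pmatrix}.
\]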
Theorem 4.14 Suppose that f : R^n → R^m, that f1, f2, . . . , fm are the component functions, and
that a ∈ R^n. Then:
    f is differentiable at a =⇒ f is continuous at a.
    f is differentiable at a ⇐⇒ fi is differentiable at a (for i = 1, 2, . . . , m).
    If f is differentiable at a then
\[
Df(a) =
\begin{pmatrix} (\nabla f_1(a))^T \\ \vdots \\ (\nabla f_m(a))^T \end{pmatrix}
=
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(a) & \cdots & \frac{\partial f_1}{\partial x_n}(a) \\
\vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1}(a) & \cdots & \frac{\partial f_m}{\partial x_n}(a)
\end{pmatrix}.
\]
    If ∂fi/∂xj (for i = 1, . . . , m and j = 1, . . . , n) all exist in a neighbourhood of a and are
    continuous at a, then f is differentiable at a.
The gradient has a useful interpretation. We have seen that the rate of change of f at a in the
direction v is the directional derivative
\[ D_v f(a) = Df(a)\,v. \]
This may be expressed as the inner (or scalar) product ⟨∇f(a), v⟩, which equals
\[ \|\nabla f(a)\| \, \|v\| \cos\theta = \|\nabla f(a)\| \cos\theta, \]
where θ denotes the angle between the vectors ∇f(a) and v, and where we have used the fact that
‖v‖ = 1. This quantity is maximised when cos θ = 1, so it is maximised when the direction v is in the
same direction as ∇f(a). Suppose ∇f(a) ≠ 0. Since directions have length 1, this means that the
maximising v is
\[ v = \frac{\nabla f(a)}{\|\nabla f(a)\|}. \]
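A small numerical experiment makes the maximising direction visible. The Python sketch below is an illustration under stated assumptions: the function f(x, y) = x² + 3xy, the point a = (1, 2), and all helper names are hypothetical choices. The gradient is entered by hand, and the directional derivative is computed as the inner product of the gradient with a unit vector.

```python
import math


def grad_f(x, y):
    # gradient of the hypothetical example f(x, y) = x^2 + 3xy
    return (2 * x + 3 * y, 3 * x)


def directional_derivative(grad, v):
    # inner product <grad, v>, where v is assumed to be a unit vector
    return grad[0] * v[0] + grad[1] * v[1]


a = (1.0, 2.0)
g = grad_f(*a)                           # equals (8, 3) at the point (1, 2)
norm_g = math.hypot(g[0], g[1])
best = (g[0] / norm_g, g[1] / norm_g)    # unit vector along the gradient

# Sample 360 unit vectors around the circle and record the largest rate of change.
samples = [(math.cos(math.radians(k)), math.sin(math.radians(k))) for k in range(360)]
largest_sampled = max(directional_derivative(g, v) for v in samples)

print(directional_derivative(g, best), norm_g, largest_sampled)
```

No sampled direction gives a rate of change larger than the norm of the gradient, and the direction of the gradient attains it (up to rounding).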
      WARNING! Note that directional derivatives in all directions can exist, even if f is not differentiable.
      In other words, the existence of all directional derivatives does not imply the existence of the
      derivative. To see this, consider the following example.
We saw in an earlier example that f has directional derivatives in all directions at 0. However, f is
not differentiable at 0. In fact, it is not even continuous there, since, for example,
f(t, t²) = 1/2 ↛ 0 = f(0) as t → 0.
      In the example just given, why does showing that f is not continuous at 0 establish that the
      derivative does not exist there?
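The behaviour in this example can be explored numerically. The Python sketch below assumes that the function in question is f(x, y) = x²y/(x⁴ + y²) away from the origin, with f(0) = 0, which is what the difference quotient in the earlier example and the value f(t, t²) = 1/2 suggest. The helper names are hypothetical.

```python
def f(x, y):
    # assumed form of the example function, with the value 0 at the origin
    if x == 0 and y == 0:
        return 0.0
    return x * x * y / (x ** 4 + y * y)


def difference_quotient(u, v, t):
    # (f(0 + t(u, v)) - f(0)) / t, the quotient whose limit as t -> 0 is the
    # directional derivative at 0 in the direction (u, v)
    return (f(t * u, t * v) - f(0.0, 0.0)) / t


u = v = 2 ** -0.5                        # the unit direction (1, 1) / sqrt(2)
for t in (1e-1, 1e-3, 1e-6):
    # the quotients settle down as t shrinks (towards u^2 / v here)
    print(t, difference_quotient(u, v, t))

for t in (1e-1, 1e-3, 1e-5):
    # along the curve y = x^2 the function stays at about 1/2, not near f(0) = 0
    print(t, f(t, t * t))
```

Along every straight line through the origin the quotients converge, yet along the parabola y = x² the function values do not approach f(0), which is the failure of continuity described above.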
At the end of this chapter and the relevant reading, you should be able to:
          state the definition of the derivative, and left and right derivatives, and be able to use the
          definitions to calculate the derivative
          state, and be able to prove, that differentiability implies continuity
          state the product, quotient and chain rules (but the proofs are not needed)
          state the definition of local maxima and minima
          be able to prove that if c is a local maximum or minimum of f, and f is differentiable at c, then
          f′(c) = 0
          state that if f : R → R is continuous on [a, b] then f is bounded and attains its bounds (proof
          not needed)
          state, and be able to prove, the Extreme Value Theorem
          state, and be able to prove and use, Rolle’s Theorem
          state, and be able to prove and use, the Mean Value Theorem
          state, and be able to use, the formal definitions of partial derivatives
          state what is meant by directions and directional derivatives, and be able to calculate directional
          derivatives
          state, and be able to use, the precise definition of the derivative of a function f : R^m → R^n
          state what’s meant by the gradient of f : R^m → R and be able to calculate it
          state that differentiability of f : R^m → R^n implies continuity (no proof needed)
          state that the derivative exists if each component function is differentiable (no proof needed); and
          that if each partial derivative exists and is continuous, then f is differentiable (no proof needed)
          calculate derivatives of functions f : R^m → R^n
          state, and be able to use, the connection between directional derivative and gradient when f is
          differentiable
      Check this! Now, as x → c, (f(x) − f(c))/(x − c) → f′(c), (g(x) − g(c))/(x − c) → g′(c) and,
      because g is continuous (since it is differentiable) at c, g(x) → g(c). So the limit of the right hand
      side as x → c exists and is f′(c)g(c) + f(c)g′(c). The limit of the left hand side must, of course, be
      the same. But, by definition of the derivative, the limit of the left hand side is (fg)′(c). Therefore
      (fg)′(c) = f′(c)g(c) + f(c)g′(c).
      Learning activity 4.2 If a function is differentiable at a point, then it is also continuous there. So if
      it is not continuous, then it cannot be differentiable.
4.6 Exercises
      Exercise 4.1 By considering (f(y) − f(x))/(y − x), prove that f(x) = x³ is differentiable, and that
      f′(x) = 3x².
      Exercise 4.2 Suppose f : R → R is differentiable on R and that f′(x) ≥ K for all x > N, where
      K > 0 and N is some real number. Prove that f(x) → ∞ as x → ∞.
      Exercise 4.3 Suppose that f : R → R is differentiable (on all of R) and that, for all x, |f′(x)| ≤ M.
      Prove that for all x, y ∈ R,
                                        |f (x) − f (y)| ≤ M |x − y|.
      Exercise 4.4 Using the Mean Value Theorem, prove that, for all k ≥ 2,
\[ \frac{1}{k} < \log k - \log(k - 1) < \frac{1}{k - 1}. \]
      [What function f(x) might you try applying the Mean Value Theorem to? What interval [a, b] is likely
      to be relevant?]
      Let sn = 1 + 1/2 + · · · + 1/n be the nth partial sum of the harmonic series ∑ 1/n. Use the
      inequalities above to show that
\[ s_n - 1 < \log n < s_{n-1} < s_n. \]
      Now let bn = sn − log n. Prove that (bn) is decreasing and bounded below. Hence show that there is
      a constant γ, with 0 ≤ γ ≤ 1, such that sn = log n + γ + en, where en → 0 as n → ∞. [The
      constant γ, known as Euler’s constant, is approximately 0.577.]
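      Before attempting the proof, it can help to look at the sequence (bn) numerically. The Python sketch below is only an illustration and proves nothing; the function names harmonic and b are assumptions, and floating-point sums stand in for exact values.

```python
import math


def harmonic(n):
    """Partial sum s_n = 1 + 1/2 + ... + 1/n of the harmonic series."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


def b(n):
    """b_n = s_n - log n."""
    return harmonic(n) - math.log(n)


for n in (1, 10, 100, 1000, 100_000):
    # the printed values decrease and appear to level off near 0.577
    print(n, b(n))
```

      The numerical values are consistent with the claims in the exercise: they decrease, stay above 0, and approach a limit close to the stated value of γ.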
Exercise 4.5 Suppose that f : R → R is differentiable on R and that for all real numbers x,
      Prove that
                                              f (x) → ∞         as x → ∞.
Exercise 4.6 Suppose f : R → R is such that for all n and for all x ∈ R, f^(n)(x) exists (that is, f is
infinitely differentiable on R). Suppose further that for all x ∈ R,
                                             f(x + 1) = f(x).
Use Rolle’s Theorem to prove that for every positive integer n, there is cn ∈ [0, 1) such that
f^(n)(cn) = 0.
[Hint: It’s enough to find a suitable point cn anywhere on the real line (why?). For n = 1 this is just
Rolle’s Theorem. Try it for n = 2, . . . .]
Exercise 4.7 Suppose a, b are real numbers with b > a. Apply Rolle’s Theorem to the function
f(x) = e^(−x)(x − a)(x − b) to prove that the equation
                                   (x − a)(x − b) = (x − a) + (x − b)
has a solution between a and b.
Exercise 4.8 Let f : R^2 → R be defined by f((x, y)^T) = x² − xy. Set a = (1, 1)^T, and let v be a unit
vector in the direction (2, 1)^T. Find the derivative Df(a), and hence the directional derivative
Dv f(a).
Use the definition of partial derivatives to show that ∂f/∂x and ∂f/∂y both exist at (0, 0)^T. Show,
however, that f is not differentiable at (0, 0)^T.
Exercise 4.12 Let a be a point in R^n, and v be a vector in R^n. The line segment [a, a + v] between
a and a + v is the set {a + tv : t ∈ [0, 1]}. Suppose that f : R^n → R is differentiable. Define
g : R → R by g(t) = f(a + tv). Show that g is differentiable, with derivative g′(t) given by
(Df(a + tv))(v).
[It might help to write v = ku, where u is a unit vector, and make use of the notion of the directional
derivative in direction u.]
By applying the (1-dimensional) Mean Value Theorem to g on the interval [0, 1], prove that there is
some c ∈ [a, a + v] such that
                                   f (a + v) − f (a) = Df (c)v.
Exercise 4.13 For u, v ∈ R^m, let ⟨u, v⟩ be the inner product, equal to u^T v (or, equally, v^T u). Let
f, g : R^n → R^m be differentiable at a, and define ⟨f, g⟩ : R^n → R to be the function given by
⟨f, g⟩(x) = ⟨f(x), g(x)⟩. Prove that ⟨f, g⟩ is differentiable at a and that
                             D⟨f, g⟩(a) = (f(a))^T Dg(a) + (g(a))^T Df(a).
Chapter 5
Topology of R^m
Contents
           5.1     Introduction
           5.2     Open and closed subsets of R
           5.3     Open and closed subsets of R^m
           5.4     Continuity
           5.5     Compactness
           5.6     Learning outcomes
           5.7     Comments on selected activities
           5.8     Exercises
Reading
Bartle, R.G. and D.R. Sherbert. Introduction to Real Analysis. Chapter 11.
        Neither of these readings is ideal in every way. The approaches taken by Bartle and Sherbert to the
        concepts of closed set and compactness are different from those taken here, and the Sutherland book
        is quite advanced, and of more use for the next chapter.
5.1 Introduction
        In this and the next chapter, we explore some important theoretical concepts in analysis. Partly, the
        aim is to enable us to generalise and place in a larger context some of the results that we have met
        earlier. For instance, the Extreme Value Theorem for a function f : R → R tells us that a continuous
        function will have a maximum and a minimum value on a closed interval. You might well ask what’s
        so special about closed intervals in order for this to work; or what’s so special about continuous
        functions? Will the Theorem also work for other types of domain rather than just closed intervals? To
        answer this question, we need to begin to consider some ‘topological’ ideas. (Do not be afraid: at this
        point, the word ‘topology’ is not meant to mean anything to you!)
5.2 Open and closed subsets of R
        Very roughly speaking, a set U of real numbers is said to be an open set if around every point of U
        there is some room to move in both directions (increasing and decreasing or, if you like, to the left
        and right) without leaving the set U . The formal definition is as follows:
        Definition 5.1 (Open set of real numbers) A set U ⊆ R is an open set (or is open) if for every
        y ∈ U there is some ε = ε(y) > 0 such that (y − ε, y + ε) ⊆ U.
        We write ε = ε(y) in this definition to emphasise that ε can, and will, generally, depend on y: that is,
        y is given and we then find a suitable ε.
        Example The open interval U = (1, 2) is open. To see this, let y ∈ U. Then y is between 1 and 2.
        Provided we take ε > 0 to be no more than the smaller of y − 1 and 2 − y, then (y − ε, y + ε) ⊆ U.
        Convince yourself of this! A similar argument shows that any open interval (a, b) is open (which is a
        relief, since we refer to it as an ‘open’ interval!).
        Make sure you understand why, with the chosen value of ε, we have (y − ε, y + ε) ⊆ U. Write down a
        formal proof.
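        The choice of ε in this example can be written as a tiny function, which makes its dependence on y explicit. This is a sketch for illustration only; the name radius_inside is an assumption, and the function returns the largest admissible value, although any smaller positive number would do equally well.

```python
def radius_inside(a, b, y):
    """For y in the open interval (a, b), return eps > 0 with (y - eps, y + eps) inside (a, b)."""
    if not a < y < b:
        raise ValueError("y must lie strictly between a and b")
    # the smaller of the distances from y to the two endpoints
    return min(y - a, b - y)


y = 1.25
eps = radius_inside(1.0, 2.0, y)
print(eps, (y - eps, y + eps))
```

        Points close to an endpoint receive a small ε, and no single ε works for every y in the interval, which is why the definition allows ε to depend on y.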
        Example The interval U = (1, 2] is not open, because if we take x = 2 then no matter how small ε > 0 is,
        the interval (2 − ε, 2 + ε) contains numbers greater than 2 and hence does not lie entirely in
        U = (1, 2]. (Note that, although there is an open interval in U around every other point of U, the
        fact that this fails to hold for the single point 2 is enough to show that U is not open: to be open, we
        would need the condition to hold for every point of U.)
        WARNING! It should not be thought that all open sets are open intervals: although every open
        interval is open, there are many other types of open set. For example, the set (1, 2) ∪ (3, 4) is open,
        but it is not an open interval.
        Some results (such as Theorem 5.2 below) are not true just for a finite collection of open sets, nor
        just for countably many: but for any collection. There’s some useful notation we can use when we
        want to work with collections, or families, of sets, and it’s worthwhile mentioning this at this stage.
        The reason for introducing this notation is that it will make it much simpler to work with infinite
        (rather than finite) collections of sets.
        Suppose that S is some set and I is some nonempty (indexing) set such that for each i ∈ I, we have
        a set Ai ⊆ S. Thus, {Ai : i ∈ I} is a collection, or family, of sets. The intersection and union of the
        sets in the family are easily defined:
\[ \bigcap_{i \in I} A_i = \{x : x \in A_i \text{ for all } i \in I\} \]
        and
\[ \bigcup_{i \in I} A_i = \{x : x \in A_i \text{ for at least one } i \in I\}. \]
        What’s the point here? Well, we could imagine having a set Ai for each positive integer i. But we
        could have even more sets: one for each i in some interval of real numbers, for example.
        Example Suppose that I = (1, ∞) and that, for each i ∈ I, Ai = (1/i, 2]. Then ⋃_{i∈I} Ai = (0, 2].
Theorem 5.2 The union of any collection of open sets is again an open set.
        Proof. The Theorem says that the union of any collection of open sets is open. And it really means
        any collection: not just a finite collection, not just a countably infinite collection . . .. So how do we
        prove this? We don’t know what kind of collection we’re dealing with. Well, the point is that any
        collection can be written as {Ui : i ∈ I} for some index set I. (This is completely general, and it
        covers all possibilities. Special cases are I = {1, 2, . . . , n} if there are n sets in the collection, I = N if
        there are countably many sets, and so on.) We need to show that U = ⋃_{i∈I} Ui is open. So let us take
        an arbitrary y ∈ U. We need to show that there is some ε > 0 such that (y − ε, y + ε) ⊆ U. Now, the
        fact that y ∈ U means precisely that, for some i ∈ I, we have y ∈ Ui. (There may, of course, be more
        than one such i.) Because Ui is open, there is some ε > 0 such that (y − ε, y + ε) ⊆ Ui. But Ui ⊆ U
        (since U is the union of all the Ui) and hence (y − ε, y + ε) ⊆ U, as required.
        WARNING! It is not true that the intersection of any collection of open sets is open. For instance,
        suppose that Ui = (−1/i, 1/i) for i ∈ N. Then each set Ui is open. However, the intersection
        ⋂_{i=1}^∞ Ui is {0}, the set containing only the number 0, and this is not open.
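        To see concretely why nothing but 0 survives in such an intersection, take the family Ui = (−1/i, 1/i) (assumed here to be the intended example, since it is the family whose intersection is {0}). The Python sketch below computes, for a given nonzero y, the first index i at which y drops out of Ui. The helper names are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
import math


def in_U(y, i):
    """Is y in the open interval U_i = (-1/i, 1/i)?"""
    return -Fraction(1, i) < y < Fraction(1, i)


def first_index_excluding(y):
    """For y != 0, return the smallest positive integer i with y not in U_i."""
    if y == 0:
        raise ValueError("0 lies in every U_i")
    # y is outside U_i exactly when |y| >= 1/i, that is, when i >= 1/|y|
    return math.ceil(1 / abs(Fraction(y)))


y = Fraction(3, 100)
i = first_index_excluding(y)
print(i, in_U(y, i - 1), in_U(y, i))
```

        Since every nonzero y is excluded from some Ui, no interval (−ε, ε) with ε > 0 can lie inside the intersection, so the intersection {0} is not open.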
It is true, though, that the intersection of a finite collection of open sets is open.
        Proof. We prove the result for the case of two sets, U1 and U2. (For any other finite number of sets
        you can prove the result in a similar way, or we can use induction on the number of sets.) Let
        U = U1 ∩ U2. Suppose that y ∈ U. Then, because y ∈ U1 and U1 is open, there is ε1 > 0 so that
        (y − ε1, y + ε1) ⊆ U1. Equally, because y ∈ U2 and U2 is open, there is ε2 > 0 so that
        (y − ε2, y + ε2) ⊆ U2. Let ε = min(ε1, ε2), the smaller of ε1 and ε2. Then we have
                                           (y − ε, y + ε) ⊆ (y − ε1, y + ε1) ⊆ U1
        and
                                           (y − ε, y + ε) ⊆ (y − ε2, y + ε2) ⊆ U2.
        So (y − ε, y + ε) ⊆ U1 ∩ U2 = U, and this is what we need.
        We also have the notion of closed sets. But before defining what this means, we need to clear up one
        source of potential confusion. By analogy with the use of the words ‘open’ and ‘closed’ in everyday
        language, we might think that a given set of real numbers must be either open or closed, and that if it
        is not open, it is closed. This, unfortunately, will not be the case: as we shall see, sets can be open
        but not closed, closed but not open, both open and closed, or neither open nor closed!
Definition 5.4 A set C ⊆ R is a closed set (or is closed) if whenever (xn ) is a convergent sequence
and xn ∈ C for all n, then the limit of the sequence, lim xn , is in C.
So a set C is closed if for any convergent sequence of members of C, the limit of the sequence is in
C. This is a tricky definition to work with, but as we shall see shortly, there is another way of
describing closed sets.
Example The interval C = [0, 1] is closed. To see this, suppose that (xn ) is any sequence in C,
converging to a limit L. Then for each n, xn ∈ C, so 0 ≤ xn ≤ 1. Now, it follows from this that
0 ≤ L ≤ 1 (prove this!), so L ∈ C, and hence C is closed.
Example The interval C = (0, 1] is not closed. Consider the sequence (xn ) where xn = 1/n. For all
n, xn ∈ C. The sequence converges to 0, but 0 is not in C. So C is not closed.
We mentioned that ‘closed’ is not the ‘opposite’ of ‘open’, but the following result linking open sets
and closed sets is very useful.
Theorem 5.5 A set C of real numbers is closed if and only if its complement R \ C is open.
Proof. Because this is an ‘if and only if’ result, there are two things to prove here: first, that if C is
closed then R \ C is open; secondly, that if R \ C is open then C is closed.
Suppose, first, that C is closed and consider its complement U = R \ C. We want to show U is open.
Suppose it isn’t. Then there is some y ∈ U such that for no ε > 0 do we have (y − ε, y + ε) ⊆ U. In
other words, for all ε > 0, the interval (y − ε, y + ε) does not lie entirely within U = R \ C and hence
must contain points of C. For any positive integer n, let’s take ε = 1/n. Then there is some
xn ∈ (y − 1/n, y + 1/n) such that xn ∈ C. Because |xn − y| < 1/n, we have that xn → y as
n → ∞. So here we have a sequence (xn) in C such that lim xn = y ∉ C. But this cannot happen
since C is closed. So what’s gone wrong? Well, we supposed that R \ C was not open, and this
supposition must therefore be wrong. So R \ C is open.
Next, suppose that R \ C is open. To prove that C is closed, we need to show that the limit of any
convergent sequence of points of C is in C. So suppose (xn ) is a convergent sequence, with xn ∈ C
for all n, and set L = lim xn . We need to show L ∈ C. Suppose this isn’t so. Then L is in the open
set R \ C, so there is some ε > 0 such that (L − ε, L + ε) ⊆ R \ C. Now, because xn → L, there is
some N such that for n > N, |xn − L| < ε, that is xn ∈ (L − ε, L + ε). But then for n > N,
xn ∈ R \ C. This is a contradiction to the fact that xn ∈ C. So we have gone wrong in assuming
that L is not in C. Therefore it is in C, and C is closed.
This theorem is an extremely useful characterisation of closed sets. In fact, we could, if we had
wanted, have taken the definition of a closed set to be a set C whose complement R \ C is open, and
many texts (including that of Bartle and Sherbert) do this.
Example Consider again the set C = [0, 1]. We showed this was closed by using Definition 5.4. But
we can also see that it is closed by considering its complement. For,
\[ \mathbb{R} \setminus C = (-\infty, 0) \cup (1, \infty), \]
and this is open because it is the union of two open sets. So, since the complement of C is open, C is
closed.
Example The set R of all real numbers is both open and closed. The interval (0, 1] is neither open
nor closed.
WARNING! As mentioned, it is wrong to think that a set of real numbers must be either open or
closed, and that if it is not open, it is closed. This is not the case. Sets can be open but not closed,
closed but not open, both open and closed, or neither open nor closed. Theorem 5.5 describes a
relationship between closed and open: they are not ‘opposites’.
For m > 1, the counterpart in Rm to the open interval (y − ε, y + ε) in R is the open ball.
        Definition 5.6 (Open Ball) For x ∈ Rm and ε > 0, the open ball of radius ε around x is
                                             Bε (x) = {y ∈ Rm : ‖x − y‖ < ε} .
        This is the set of those points y whose distance from x is less than ε.
        Example In R2 the open ball Bε (x) is the region enclosed by a circle of radius ε centred at x. Note
        that the points on this circle do not lie in Bε (x).
        We have already investigated the notion of open sets of real numbers. All the ideas and results extend
        to Rm .
        Definition 5.7 A subset U of Rm is open if for any y ∈ U , there is ε = ε(y) > 0 such that
        Bε (y) ⊆ U.
        Informally, a set is open if, from any point of the set, we can move some positive distance in any
        ‘direction’ without going outside the set.
        The following theorem shows us that any open ball is an open set, but there are other types of open
        set. Just as for open sets in R, the union of any collection of open subsets of Rm is again open.
        Theorem 5.8 Every open ball Bε (x) in Rm is an open set.
        Proof. Suppose that B = Bε (x) is an open ball. Let y ∈ B. We need to show there is η > 0 such
        that Bη (y) ⊆ B. (We use η rather than ε because the symbol ε is already used in the description of
        B.) Now, since y ∈ B, we have that ‖y − x‖ < ε, so the number η = ε − ‖y − x‖ is positive. We will
        show that Bη (y) ⊆ B. So, suppose z ∈ Bη (y). Then ‖z − y‖ < η and hence, by the triangle
        inequality,
                    ‖z − x‖ ≤ ‖z − y‖ + ‖y − x‖ < η + ‖y − x‖ = ε − ‖y − x‖ + ‖y − x‖ = ε.
        So, ‖z − x‖ < ε. This means z ∈ B. So we have established that Bη (y) ⊆ B. It now follows that B
        is open.
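The choice η = ε − ‖y − x‖ made in this proof can be checked numerically. The following Python sketch is only an illustration and not part of the proof: it works in R2, and the function names (norm, in_ball, inner_radius) and the sample points are our own assumptions.

```python
import math
import random

def norm(v):
    """Euclidean norm of a vector given as a tuple of floats."""
    return math.sqrt(sum(t * t for t in v))

def in_ball(z, centre, radius):
    """True if z lies in the open ball of the given radius around centre."""
    return norm(tuple(a - b for a, b in zip(z, centre))) < radius

def inner_radius(y, x, eps):
    """The radius eta = eps - ||y - x|| used in the proof."""
    return eps - norm(tuple(a - b for a, b in zip(y, x)))

x, eps = (0.0, 0.0), 1.0
y = (0.6, 0.3)                      # a point of B_eps(x)
eta = inner_radius(y, x, eps)

random.seed(0)
samples = []
while len(samples) < 1000:
    # draw z from a square around y and keep it only if z is in B_eta(y)
    z = (y[0] + random.uniform(-eta, eta), y[1] + random.uniform(-eta, eta))
    if in_ball(z, y, eta):
        samples.append(z)

# every sampled point of B_eta(y) should also lie in B_eps(x)
all_inside = all(in_ball(z, x, eps) for z in samples)
print(eta > 0, all_inside)
```

A finite sample can of course only support the inclusion Bη (y) ⊆ Bε (x), not prove it; the triangle inequality argument above is what does the work.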
        Definition 5.9 A set C ⊆ Rm is a closed set (or is closed) if whenever (xn ) is a convergent sequence
        and xn ∈ C for all n, then the limit of the sequence, lim xn , is in C.
So a set C is closed if for any convergent sequence of members of C, the limit of the sequence is in C.
        As for the case m = 1 investigated above, we have the following result, the proof of which is similar
        to the one given earlier.
        Theorem 5.10 A subset C of Rm is closed if and only if its complement Rm \ C is open.
5.4 Continuity
        Theorem 5.11 The function f : Rn → Rm is continuous at a ∈ Rn if and only if given any open ball
        Bε (f (a)), there exists δ > 0 such that
                                             f (Bδ (a)) ⊆ Bε (f (a)).
        Proof. Recall that f is continuous at a if given any ε > 0 there exists δ > 0 such that if ‖x − a‖ < δ
        then ‖f (x) − f (a)‖ < ε. The condition ‖x − a‖ < δ is exactly the same as x ∈ Bδ (a) and the
        condition ‖f (x) − f (a)‖ < ε is equivalent to f (x) ∈ Bε (f (a)). Therefore, f is continuous at a if
        and only if given any ε > 0 there exists δ > 0 such that
                                             f (Bδ (a)) ⊆ Bε (f (a)).
        There is a simple characterisation of continuity (on the whole of Rn ) involving open sets. To state
        this succinctly, we need a new notation. Suppose that f : Rn → Rm is a function, and that B ⊆ Rm .
        Then we denote by f −1 (B) the subset {x ∈ Rn : f (x) ∈ B} of Rn consisting of all points which f
        maps into B. The use of the notation f −1 should not be taken as meaning that the inverse function
        f −1 of f exists: the same symbol is used here, but it means something different. (In particular, f −1
        as defined here is not a mapping from Rm to Rn but is, instead, a mapping from all subsets of Rm to
        subsets of Rn .)
        Theorem 5.12 Suppose f : Rn → Rm . Then f is continuous if and only if for all open subsets U of
        Rm , f −1 (U ) is an open subset of Rn .
Proof. Because this is an ‘if and only if’ result, there are two things to prove.
        First, suppose that f is continuous. Then we want to show that if U is open, then V = f −1 (U ) is
        open. To do this, we need to show that for each x ∈ V there is some η > 0 so that Bη (x) ⊆ V .
        Consider f (x). Because x ∈ f −1 (U ), we have f (x) ∈ U and because U is open, there is some ε > 0
        such that Bε (f (x)) ⊆ U . By continuity of f , there is δ > 0 such that f (Bδ (x)) ⊆ Bε (f (x)). Thus,
        for any z ∈ Bδ (x), we have f (z) ∈ Bε (f (x)) ⊆ U. In particular, therefore, this shows that anything in
        Bδ (x) is mapped by f into U and hence Bδ (x) ⊆ f −1 (U ) = V . So we may take η = δ.
        Next, suppose that it is the case that f −1 (U ) is open whenever U is, and let a ∈ Rn . We want to
        show that f is continuous at a. Fix any ε > 0. Now, U = Bε (f (a)) is open because it is an open ball. So it follows
        that V = f −1 (Bε (f (a))) is open. We have f (a) ∈ U so a ∈ V . Because V is open, there is some
        δ > 0 such that Bδ (a) ⊆ V = f −1 (Bε (f (a))). This means that f (Bδ (a)) ⊆ Bε (f (a)) and hence that
        f is continuous at a.
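To make the notation f −1 (U ) concrete, here is a small Python sketch. It assumes the particular function f (x) = x^2 and the open set U = (1, 4), neither of which comes from the notes, and it compares membership of f −1 (U ) with membership of (−2, −1) ∪ (1, 2), which is the open set one would expect the preimage to be. The function names are illustrative.

```python
def f(x):
    """The continuous function f(x) = x^2 (chosen only for illustration)."""
    return x * x

def in_U(y):
    """Membership of the open interval U = (1, 4)."""
    return 1 < y < 4

def in_preimage(x):
    """x is in f^{-1}(U) exactly when f(x) is in U; no inverse function is needed."""
    return in_U(f(x))

def in_claimed_preimage(x):
    """Membership of (-2, -1) ∪ (1, 2), the set we expect f^{-1}(U) to be."""
    return -2 < x < -1 or 1 < x < 2

# compare the two descriptions on a grid of points in [-3, 3]
grid = [k / 100 for k in range(-300, 301)]
agree = all(in_preimage(x) == in_claimed_preimage(x) for x in grid)
print(agree)
```

Note that f has no inverse function on R, yet f −1 (U ) is perfectly well defined, and here it is a union of two open intervals, in line with Theorem 5.12.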
5.5 Compactness
        The idea of a compact set is extremely important in analysis and its applications (especially to
        optimisation). There are a number of ways of defining what we mean by a compact set. The
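        approach we take is through what is sometimes called ‘sequential compactness’.
        Definition 5.13 A subset C of Rm is compact if every sequence of points of C has a convergent
        subsequence whose limit lies in C.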
        WARNING! Make sure you understand this definition. It is not the same as the definition of a closed
        set, though at first glance you may think it similar. Recall that a set C is closed if, whenever a
        sequence of members of C converges, then its limit is in C. This says nothing at all about sequences
        that do not converge. On the other hand, the definition of compactness says something about any
        sequence, and not just convergent ones. What it says, to re-iterate, is that C is compact if: when we
        take any sequence whose members are in C then that sequence will have a subsequence which
        converges and, furthermore the limit of this subsequence lies in C.
        Theorem 5.14 (Bolzano-Weierstrass Theorem) Every bounded real sequence has a convergent
        subsequence.
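As an illustration of the Bolzano-Weierstrass Theorem, the following Python sketch looks at one bounded sequence that does not itself converge, xn = (−1)^n (1 + 1/n), and at its subsequence of even-indexed terms. The choice of sequence and the names used are our own assumptions; the code only inspects finitely many terms, so it illustrates the statement rather than proving anything.

```python
def x(n):
    """The bounded sequence x_n = (-1)^n (1 + 1/n), n = 1, 2, 3, ..."""
    return (-1) ** n * (1 + 1 / n)

terms = [x(n) for n in range(1, 2001)]
bounded = all(abs(t) <= 2 for t in terms)

# the subsequence x_2, x_4, x_6, ... (indices n_k = 2k)
even_terms = [x(2 * k) for k in range(1, 1001)]
# the distance of the k-th subsequence term from the candidate limit 1 is 1/(2k)
gaps = [abs(t - 1) for t in even_terms]

# consecutive terms of the full sequence stay about 2 apart, so (x_n) does not converge
jump = abs(x(1999) - x(2000))
print(bounded, gaps[0], gaps[-1], jump)
```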
        Consider a closed bounded interval [a, b] of real numbers (where a < b). Suppose that (xn ) is a
        sequence of real numbers each belonging to [a, b]. The Bolzano-Weierstrass theorem tells us that this
        has a convergent subsequence. Since each member of the subsequence is between a and b, so too is
        its limit L. So we have established:
        Theorem 5.15 Every closed bounded interval [a, b] of real numbers is compact.
        But what exactly are the compact subsets of R, and, more generally, Rm ? The following
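        characterisation, known as the Heine-Borel Theorem, is very useful.
        Theorem 5.16 (Heine-Borel Theorem) A subset C of Rm is compact if and only if it is closed and
        bounded.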
        Proof. We prove one half of the theorem, namely that if C is compact then it must be closed and
        bounded. So, suppose C is compact. We want to show that it is closed. To do so, we need to prove
        that whenever (xn ) is a convergent sequence in C then x = lim xn ∈ C. By compactness, there is
        some convergent subsequence of (xn ) whose limit is in C. But, since (xn ) converges to x, so too do
        all of its subsequences. We conclude that x ∈ C. Now we want to show that C is bounded. Suppose
        it is not. Then for each n ∈ N, there is some xn ∈ C with ‖xn‖ > n. We show that the sequence
        (xn ) has no convergent subsequence, contradicting the compactness of C. Suppose that (xnk ) is a
        subsequence and that xnk → x as k → ∞. Then there is some N so that for all k ≥ N ,
        ‖xnk − x‖ < 1. It follows that for all k ≥ N ,
                                            ‖xnk‖ ≤ ‖xnk − x‖ + ‖x‖ < 1 + ‖x‖.
        But
                                            ‖xnk‖ > nk ≥ k → ∞ as k → ∞,
        so this is not possible. So we conclude that C is indeed bounded.
For example, the subset [1, 2] ∪ [3, 6] of R is compact, but [1, 2) is not.
        WARNING! Some texts define compactness by saying that a set is compact if and only if it is closed
        and bounded. This is a reasonable approach when dealing with Rm , but the definition of compactness
        we have given is substantially more general. It can apply (as we shall see) to ‘metric spaces’ other
        than Rm .
        The following result is useful. Earlier we mentioned that a continuous real function on a closed
        interval [a, b] of the real numbers is bounded on the interval and attains its maximum and minimum.
        That result can be seen as a special case of the following one.
        Theorem 5.17 Suppose that f : Rm → Rn is continuous and that C ⊆ Rm is compact. Then the
        image of C under f , f (C) = {f (x) : x ∈ C} is a compact subset of Rn .
        Proof. To show that f (C) is compact, take any sequence (yn ) in f (C); we need to show that (yn )
        has a subsequence converging to some element of f (C). For each n, since yn ∈ f (C), there is some
        xn in C such that f (xn ) = yn . Consider the sequence (xn ) in C: since C is compact, there is some
        subsequence (xnk ) converging to a limit x ∈ C. We claim that f (xnk ) → f (x) as k → ∞: to see this,
        fix any ε > 0; by continuity of f there is some δ > 0 such that ‖y − x‖ < δ ⇒ ‖f (y) − f (x)‖ < ε, and
        by definition of convergence there is some K such that k ≥ K ⇒ ‖xnk − x‖ < δ. Combining these
        two facts gives us that, for any ε > 0, there is some K such that k ≥ K ⇒ ‖f (xnk ) − f (x)‖ < ε, as
        required. But now notice that the sequence (f (xnk )) = (ynk ) is a subsequence of the original
        sequence (yn ) converging to a limit f (x) ∈ f (C), which is what we need.
        Theorem 5.18 Suppose that C is a compact subset of Rm and that the function f : Rm → R is
        continuous on C. Then f is bounded on C and it achieves its maximum and minimum. In other
        words, the set {f (x) : x ∈ C} is bounded and has a maximum and a minimum: i.e., there are
        x1 , x2 ∈ C such that f (x1 ) = max{f (x) : x ∈ C} and f (x2 ) = min{f (x) : x ∈ C}.
        Note that the Extreme Value Theorem we met in Chapter 4 follows from this, by taking m = 1 and C
        to be a closed and bounded interval [a, b].
At the end of this chapter and the relevant reading, you should be able to:
            state the Heine-Borel theorem (that the compact subsets of Rm are precisely the closed and
            bounded sets), and be able to prove part of it, namely that the compact subsets are closed and
            bounded.
            state, and be able to prove, that the image, under a continuous function, of a compact set is
            compact.
      Learning activity 5.1 Because ε is the smaller of y − 1 and 2 − y, we have both that ε ≤ y − 1 and
      ε ≤ 2 − y. So:
                                            y − ε ≥ y − (y − 1) = 1
      and
                                                 y + ε ≤ y + (2 − y) = 2.
      Therefore, (y − ε, y + ε) ⊆ (1, 2) = U .
      Learning activity 5.2 We need to show that ⋃_{i∈I} Ai = (0, 2]. Probably the easiest approach is to
      prove that ⋃_{i∈I} Ai ⊆ (0, 2] and (0, 2] ⊆ ⋃_{i∈I} Ai . For any i, Ai = (1/i, 2] ⊆ (0, 2], so we certainly
      have ⋃_{i∈I} Ai ⊆ (0, 2]. The more difficult part is to show that (0, 2] ⊆ ⋃_{i∈I} Ai . To do this, we need
      to establish that if y ∈ (0, 2] then there is some i such that y ∈ Ai . So let y ∈ (0, 2]. If y > 1 then y
      belongs to all the Ai for i ∈ I = (1, ∞), so it certainly belongs to their union. Suppose now that
      y ≤ 1. We can see that y will belong to Ai if and only if 1/i < y, which means i > 1/y. So, if we
      take i = 2/y, then we’ll have i ≥ 2, so i ∈ (1, ∞) = I, and 1/i = y/2 < y, so y ∈ Ai .
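The argument for (0, 2] ⊆ ⋃_{i∈I} Ai can be sanity-checked in Python. Because Ai = (1/i, 2] excludes its left endpoint, membership of y requires the strict inequality 1/i < y; the sketch below therefore uses the index i = 2/y when y ≤ 1 (and i = 2 when y > 1). The function names in_A and witness are our own, and only finitely many sample values of y are tested.

```python
def in_A(i, y):
    """Membership of y in A_i = (1/i, 2]."""
    return 1 / i < y <= 2

def witness(y):
    """An index i in I = (1, ∞) with y in A_i, for y in (0, 2]."""
    return 2.0 if y > 1 else 2 / y

ys = [k / 1000 for k in range(1, 2001)]   # sample points of (0, 2]
ok = all(witness(y) > 1 and in_A(witness(y), y) for y in ys)

# by contrast, the index i = 1/y does not work: y is the excluded endpoint of A_{1/y}
endpoint_fails = not in_A(1 / 0.5, 0.5)
print(ok, endpoint_fails)
```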
      Learning activity 5.4 The proof is almost identical to that of the m = 1 result, Theorem 5.5: we
      simply replace open intervals by open balls.
      Suppose, first, that C is closed and consider its complement U = Rm \ C. We want to show U is
      open. Suppose it isn’t. Then there is some y ∈ U such that for no ε > 0 do we have Bε (y) ⊆ U . In
      other words, for all ε > 0, the open ball Bε (y) does not lie entirely within U = Rm \ C and hence
      must contain points of C. For any positive integer n, let’s take ε = 1/n. Then there is some
      xn ∈ B1/n (y) such that xn ∈ C. Because ‖xn − y‖ < 1/n, we have that xn → y as n → ∞. So here
      we have a sequence (xn ) in C such that lim xn = y ∉ C. But this cannot happen since C is closed.
      So what’s gone wrong? Well, we supposed that Rm \ C was not open, and this supposition must
      therefore be wrong. So Rm \ C is open.
      Next, suppose that Rm \ C is open. To prove that C is closed, we need to show that the limit of any
      convergent sequence of points of C is in C. So suppose (xn ) is a convergent sequence, with xn ∈ C
      for all n, and set L = lim xn . We need to show L ∈ C. Suppose this isn’t so. Then L is in the open
      set Rm \ C, so there is some ε > 0 such that Bε (L) ⊆ Rm \ C. Now, because xn → L, there is some
      N such that for n > N , ‖xn − L‖ < ε, that is xn ∈ Bε (L). But then for n > N , xn ∈ Rm \ C. This
      is a contradiction to the fact that xn ∈ C. So we have gone wrong in assuming that L is not in C.
      Therefore it is in C, and C is closed.
5.8 Exercises
      Exercise 5.1 Let z ∈ Rm and ε > 0. Show that the ‘closed ball’ {x ∈ Rm : ‖x − z‖ ≤ ε} is a closed
      subset of Rm .
Exercise 5.2 For each of the following sets, state whether they are open, closed, both or neither.
Justify your answers briefly. [It might be useful to start by sketching each set.]
                                      A = {0, 1} (a subset of R);
                                      B = {(x, y)T ∈ R2 : x > y};
                                    C = {(x, 0)T ∈ R2 : 0 < x < 1}.
Exercise 5.3 Suppose that f : Rm → R is a continuous function and that f (x∗ ) > 0. Show that
there is an open ball B = Bδ (x∗ ) such that f (x) > 0 for all x ∈ B.
Exercise 5.4 Suppose f : R → R is given by f (x) = x^2 . Let A = [0, 1] and B = [−1, 1]. Determine
f (A), f −1 (B), f −1 (f (A)), and f (f −1 (B)).
Exercise 5.5 Prove, using the definition of a closed set, that the intersection of any collection of
closed subsets of Rn is closed.
Now prove the result using both of the following facts: (i) a set is closed if and only if its complement
is open, and (ii) the union of any collection of open sets is open.
Exercise 5.6 Let f : Rn → Rm and g : Rp → Rn be two functions. Recall that the composition
f ◦ g : Rp → Rm is defined by (f ◦ g)(x) = f (g(x)). Show that, for any subset S ⊆ Rm ,
                                     (f ◦ g)−1 (S) = g −1 (f −1 (S)).
Hence show that, if f and g are continuous, then so is f ◦ g.
Exercise 5.7 By giving an example, show that it is not generally the case that for a continuous
function f : R → R, f (U ) is open whenever U is.
Exercise 5.8 Define the subset S of R2 by S = {(x, y)T : x > 0, y = sin(1/x)}. Sketch the set S.
Show that S is not closed. Show that T = S ∪ {(0, y)T : −1 ≤ y ≤ 1} is closed.
Exercise 5.9 Prove, using the definition of a closed set, that the union of a finite collection of closed
subsets of Rn is closed. Now prove the result using both of the following facts: (i) a set is closed if
and only if its complement is open, and (ii) the intersection of a finite collection of open sets is open.
Exercise 5.10 Suppose that {Ui : i ∈ I} is a family of subsets of Rm , and consider a function
f : Rn → Rm . Prove that
                                  f −1 ( ⋃_{i∈I} Ui ) = ⋃_{i∈I} f −1 (Ui ).
Exercise 5.11 Which of the following sets are compact? Explain your answers briefly. (You might
want to use the Heine-Borel theorem, which tells us that C ⊆ Rn is compact ⇐⇒ C is closed and
bounded.)
                                            {0} ∪ [1, 2],
                                 {(x1 , x2 )T : x1 + x2 ≤ 1, x1 ≥ 0, x2 ≥ 0},
                                 {(1/n, 1/n)T : n ∈ N},
                                 {(x, y)T : 1 ≤ x^2 y ≤ 2}.
Exercise 5.12 Suppose that C ⊆ Rn is compact and that D ⊆ C is a closed subset of C. Use the
Heine-Borel theorem to prove that D is also compact. Now prove the same result directly from the
definitions of closed and compact.
Exercise 5.13 Suppose that B ⊆ R is not compact. Prove that there is a continuous function
f : B → R which is not bounded on B.
[Hint: Recall that if B is not compact, then it is not bounded or not closed. Consider separately the
cases (a) B not bounded, (b) B not closed.]
Chapter 6
Metric spaces
Contents
           6.1     Introduction
           6.2     Metrics and Metric Spaces
           6.3     Open sets
           6.4     Continuity in Metric Spaces
           6.5     Convergence and closed sets in metric spaces
           6.6     Compactness in metric spaces
           6.7     Learning outcomes
           6.8     Comments on selected activities
           6.9     Exercises
Reading
        Neither of these readings is ideal in every way. The Sutherland book is quite advanced. The Bryant
        book is useful, but its approach is different, in that it starts with closed sets and only considers open
        sets much later on.
6.1 Introduction
        In this chapter we unify and generalise some of the key concepts that we met earlier in this course.
        The important notions of closed and open set, convergence, continuity and compactness are all set in
        the larger context of metric spaces.
        In the last chapter, we built up our definitions and results from some fairly simple concepts. We
        started by defining an open ball in Rn , and then we used that to define an open set. We can write our
        definition of what it means for a sequence to converge to a limit in terms of open balls if we like, and
        from that idea we defined the notion of a closed set, and a compact set in Rn . Our definition of
        continuous function can also be re-written in terms of open balls.
        So there are a lot of important concepts springing from the idea of an open ball, which is simply the
        set of points at “distance” less than ε from a given point. So really all our key definitions (and most
        of the results) depend only on the concept of distance.
        If for instance we could define the “distance” between two functions, then we could follow exactly the
        same process and get a (hopefully useful!) definition of what it means for a sequence of functions to
        converge to another function, or for a set of functions to be compact. Or if we define the “distance”
        between two matrices, we can do the same again.
        Rather than do exactly the same thing over and over again, the normal mathematical procedure is to
        define exactly what we mean by “distance”, and extend our definitions and (where possible) results to
        cover any notion of distance that qualifies. So a metric space is going to be a set X, equipped with a
        distance “function” (metric), satisfying certain properties. Our basic examples will include the sets R
        and Rn , equipped with the Euclidean distance. Then we will give definitions, and prove theorems,
        about metric spaces. Mostly these will be the same as we had in the previous chapter.
        To see what we want from our “distance”, let’s think about our motivating example of the Euclidean
        distance d(x, y) = ‖x − y‖ = ( ∑_{i=1}^{n} |xi − yi |^2 )^{1/2} between two elements x, y of Rn . Here d is a
        function from the set Rn × Rn of ordered pairs of elements of Rn to the set of real numbers. In
        addition, the distance function d satisfies a few fairly simple rules that go along with what we expect
        of anything called a distance. First, d(x, y) ≥ 0, and d(x, y) = 0 if and only if x = y. Also, the distance
        between x and y is the same as the distance between y and x; that is, d(x, y) = d(y, x). Further, we
        have the triangle inequality: for any x, y, z ∈ Rn ,
                                             d(x, z) ≤ d(x, y) + d(y, z).
        Abstracting from the above example, we obtain the definition of a metric (or distance function) d on
        an arbitrary set X:
        Definition 6.1 A metric space M = (X, d) consists of a set X together with a function, called a
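        metric, d : X × X → R such that, for all x, y, z ∈ X:
            (i) d(x, y) ≥ 0, and d(x, y) = 0 if and only if x = y;
            (ii) d(x, y) = d(y, x);
            (iii) d(x, z) ≤ d(x, y) + d(y, z).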
        Example For any set X, we have a rather trivial metric known as the discrete metric d0 . This is
        defined by
                                           d0 (x, y) = 0 if x = y;
                                           d0 (x, y) = 1 if x ≠ y.
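That the discrete metric really has the properties expected of a distance (non-negativity with d(x, y) = 0 exactly when x = y, symmetry, and the triangle inequality) can be checked by brute force on a small finite set. The Python sketch below does this; the helper is_metric and the sample set X are our own illustrative choices, and a check on finitely many points is of course no substitute for the general (easy) proof.

```python
from itertools import product

def d0(x, y):
    """Discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1

def is_metric(d, points):
    """Brute-force check of the three metric properties on a finite list of points."""
    for x, y, z in product(points, repeat=3):
        # non-negativity, and d(x, y) = 0 exactly when x = y
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            return False
        # symmetry
        if d(x, y) != d(y, x):
            return False
        # triangle inequality
        if d(x, z) > d(x, y) + d(y, z):
            return False
    return True

X = ["a", "b", "c", 1, 2.5]
print(is_metric(d0, X))

# for comparison: squared distance on R fails the triangle inequality
squared = lambda x, y: (x - y) ** 2
print(is_metric(squared, [0, 1, 2]))
```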
        This last example shows that for any set X, there is always at least one metric defined on X. On a
        given set, it may be possible to define a number of different metrics, as the following example shows.
        Example Returning again to Rn , we have the discrete metric d0 and, for any positive integer p, the
        following function is a metric:
                                        dp (x, y) = ( ∑_{i=1}^{n} |xi − yi |^p )^{1/p} .
        The metric d1 gives the distance one would have to travel between x and y, when only movement
        parallel to the coordinate axes was possible. (As a ‘real’ interpretation, note that in the USA, where
        many downtown areas have a rectangular grid of streets, d1 is often called the taxicab metric.)
        Another metric on Rn , which we shall denote d∞ , is defined by
                                        d∞ (x, y) = max{|xi − yi | : 1 ≤ i ≤ n}.
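To compare these metrics on a concrete pair of points, here is a short Python sketch. It takes d1 to be the sum of the coordinate differences, d2 the Euclidean distance, and d∞ the largest coordinate difference; the function names and the sample points are our own choices.

```python
import math

def d1(x, y):
    """Taxicab metric: sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def d2(x, y):
    """Euclidean metric."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_inf(x, y):
    """Maximum metric: largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))

x, y = (0.0, 0.0), (3.0, 4.0)
print(d1(x, y), d2(x, y), d_inf(x, y))   # prints 7.0 5.0 4.0
```

For these two points d∞ ≤ d2 ≤ d1, and the three distances are all different, so the same set Rn really does carry several distinct metrics.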
        So far, we have defined metrics on the nice, familiar, Euclidean spaces Rn . But the notion of metric
        space has far more to offer. Consider the following.
        Example Let C[0, 1] be the set of all continuous functions f : [0, 1] → R. Recall that each such
        function is bounded. As for Euclidean space, we can define a family of metrics on C[0, 1]. We define
                                        d∞ (f, g) = sup{|f (x) − g(x)| : x ∈ [0, 1]}
        and, for p ≥ 1,
                                        dp (f, g) = ( ∫_0^1 |f (x) − g(x)|^p dx )^{1/p} .
The most important examples are p = 1 and p = 2. The metric d∞ is often called the ‘sup metric’.
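These distances between functions can be approximated numerically. The Python sketch below assumes that d∞ (f, g) is the supremum of |f (x) − g(x)| over [0, 1] and that dp is the integral metric above; it approximates the supremum by a maximum over a grid and the integral by the midpoint rule, for the illustrative pair f (x) = x and g(x) = x^2 (our own choice). For this pair the exact values are d∞ = 1/4, d1 = 1/6 and d2 = (1/30)^{1/2}.

```python
def d_sup(f, g, n=10_000):
    """Approximate sup |f - g| on [0, 1] by the maximum over n + 1 equally spaced points."""
    return max(abs(f(k / n) - g(k / n)) for k in range(n + 1))

def d_int(f, g, p=1, n=10_000):
    """Approximate (integral over [0, 1] of |f - g|^p)^(1/p) by the midpoint rule."""
    total = sum(abs(f((k + 0.5) / n) - g((k + 0.5) / n)) ** p for k in range(n))
    return (total / n) ** (1 / p)

f = lambda x: x
g = lambda x: x * x
print(d_sup(f, g), d_int(f, g, p=1), d_int(f, g, p=2))
```

The three numbers differ, which shows that the choice of metric on C[0, 1] genuinely changes how far apart two functions are considered to be.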
        Example Again let A(m, n) be the set of m × n real matrices. Now define the norm ‖M‖ of a matrix
        M to be
                                        max{‖M x‖ : x ∈ Rn , ‖x‖ = 1}.
        To see that this is well-defined, notice that the function fM : Rn → R defined by fM (x) = ‖M x‖ is
        continuous, and the set S = {x ∈ Rn : ‖x‖ = 1} is closed and bounded, so compact. Hence the
        function fM has a maximum value on S, which is by definition ‖M‖. We then have, for any M and
        x, that ‖M x‖ ≤ ‖M‖‖x‖. Now we define a metric on A(m, n) by setting d(M, N ) = ‖M − N‖, for
        matrices M, N ∈ A(m, n).
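For matrices with two columns, the maximum in this definition can be approximated by sampling the unit circle. The following Python sketch does so; it is a crude numerical illustration under our own assumptions (the function names, the number of sample points and the two example matrices are not from the notes), not an efficient way to compute the norm.

```python
import math

def mat_vec(M, x):
    """Product of a matrix (list of rows) with a vector (tuple)."""
    return tuple(sum(m * t for m, t in zip(row, x)) for row in M)

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(t * t for t in v))

def op_norm_2d(M, samples=3600):
    """Approximate max{||Mx|| : ||x|| = 1} for a matrix with two columns
    by evaluating ||Mx|| at equally spaced points x of the unit circle."""
    best = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        best = max(best, norm(mat_vec(M, (math.cos(theta), math.sin(theta)))))
    return best

M = [[3.0, 0.0], [0.0, 1.0]]
N = [[1.0, 0.0], [0.0, 1.0]]
diff = [[a - b for a, b in zip(r, s)] for r, s in zip(M, N)]

# ||M|| is attained at x = (1, 0); d(M, N) = ||M - N||
print(op_norm_2d(M), op_norm_2d(diff))
```

Here M stretches the first coordinate by a factor of 3, so one expects ‖M‖ = 3, and M − N has diagonal entries 2 and 0, so one expects d(M, N ) = 2.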
        Thinking of a metric space as having a distance defined on it, we can discuss boundedness, and so on,
        in this more general context.
        Definition 6.2 Suppose that (X, d) is a metric space and Y ⊆ X. Then Y is bounded if there is
        K ∈ R such that for all x, y ∈ Y , d(x, y) ≤ K.
        Theorem 6.3 If Y1 , Y2 , . . . , Yn are bounded subsets of a metric space, then so too is ⋃_{i=1}^{n} Yi .
It is not necessarily true that an infinite union of bounded sets is bounded, however.
        Give an example of an infinite family of bounded sets in some metric space, the union of which is not
        bounded.
Let (X, d) be a metric space. For x ∈ X and ε > 0, the open ball of radius ε around x is
                                             Bε (x) = {y ∈ X : d(x, y) < ε}.
(If the metric is not clear, we use the notation Bε (x; d).)
        Example In the space (R2 , d2 ) (R2 with the usual metric), the open ball Bε (x) is the region enclosed
        by a circle of radius ε centred at x; note that the points on this circle do not lie in Bε (x). On the
        other hand, with respect to the metric d1 (x, y) = |x1 − y1 | + |x2 − y2 |, the open ball Bε (x) is a
        square, aligned diagonally, whose sides have length ε√2. With respect to the metric
        d∞ (x, y) = max(|x1 − y1 |, |x2 − y2 |), the open ball is an axis-aligned square of side-length 2ε.
Example If X is any set then the open balls in the discrete metric space M = (X, d0 ) are given by
                                            Bε (x) = {x} if ε ≤ 1;
                                            Bε (x) = X if ε > 1.
        The open balls of a metric space have the following key property: given an open ball Bδ (x) in a
        metric space and a point y of Bδ (x), there is ε = ε(y) > 0 such that Bε (y) ⊆ Bδ (x). To see this, set
        ε = δ − d(x, y), and use the triangle inequality. In general, we call sets with this property open sets.
        Definition 6.4 A subset U of a metric space M is open (in M ) if, for any y ∈ U , there is ε > 0
        (depending on y) such that Bε (y) ⊆ U.
        Informally, a set is open if from any point of the set, we can move some positive distance in any
        ‘direction’ without going outside the set.
        Example If d0 is the discrete metric on a set X, then every subset of X is an open set in (X, d0 ),
        since every singleton subset is an open ball (e.g., B1 (x) = {x}).
        Example A singleton subset {x} is not open in R with the usual metric, but is open if we use the
        discrete metric. For clarity, when it is not perfectly clear what metric is being used, we should speak
        of ‘d-open sets’. Usually there will be no confusion.
be the interior of a rectangle. Then U is an open set, but not an open ball.
        Example Consider again R2 , with the three different metrics d1 (the taxicab metric), d2 (the
        Euclidean metric), and d∞ . Let U be a d1 -open subset of R2 , and let x be any point of U . Then
        there is a d1 -open ball A = Bε (x; d1 ) around x contained in U ; so A is the interior of a
        diagonally-aligned square of side-length ε√2. Then it is possible to fit the d2 -open ball Bε/√2 (x; d2 ),
        a disc of radius ε/√2, and the d∞ -open ball Bε/2 (x; d∞ ), an axis-aligned square of side-length ε,
        inside A, and so inside U . This tells us that U is also d2 -open and d∞ -open.
        A similar argument shows that if U is d2 -open, then it is also d1 - and d∞ -open, and if U is d∞ -open,
        then it is also d1 - and d2 -open. We say that the three metrics are equivalent metrics.
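The inclusions used in this example, namely that the d2 -ball of radius ε/√2 and the d∞ -ball of radius ε/2 around x both fit inside the d1 -ball of radius ε around x, can be tested on random points. The Python sketch below does this in R2 with ε = 1 and centre the origin; these choices, the function names and the sample size are our own assumptions, and a random sample only supports the inclusions rather than proving them.

```python
import math
import random

def d1(x, y):
    """Taxicab metric on R^2."""
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

def d2(x, y):
    """Euclidean metric on R^2."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

def d_inf(x, y):
    """Maximum metric on R^2."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

random.seed(1)
centre, eps = (0.0, 0.0), 1.0
points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(5000)]

# sample points of the d2-ball of radius eps/sqrt(2) and of the d_inf-ball of radius eps/2
in_small_d2 = [p for p in points if d2(p, centre) < eps / math.sqrt(2)]
in_small_dinf = [p for p in points if d_inf(p, centre) < eps / 2]

# each of them should lie in the d1-ball of radius eps
d2_fits = all(d1(p, centre) < eps for p in in_small_d2)
dinf_fits = all(d1(p, centre) < eps for p in in_small_dinf)
print(d2_fits, dinf_fits)
```

The inclusions rest on the inequalities d1 (x, y) ≤ √2 d2 (x, y) and d1 (x, y) ≤ 2 d∞ (x, y), which hold for all x, y ∈ R2 .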
        Results about open sets in general metric spaces can be proved in much the same way as their
        counterparts applying to the case of Rn (with the usual metric). The following theorem provides an
        example of this.
        Theorem 6.5 In any metric space,
                                            U1 , U2 open =⇒ U1 ∩ U2 open;
                                            Ui (i ∈ I) open =⇒ ⋃_{i∈I} Ui open.
        We mentioned earlier that with a concept of distance, we ought to be able to generalise the definition
        of continuity to functions having as domain and codomain general metric spaces.
        Definition 6.6 Let (X, dX ) and (Y, dY ) be two metric spaces, and let f be a function from X to Y .
        Suppose that a ∈ X. Then, we say that f is (dX , dY )-continuous (or, simply, continuous) at a if,
        given any ε > 0, there exists δ > 0 such that
                                           dX (x, a) < δ =⇒ dY (f (x), f (a)) < ε.
        Definition 6.7 The function f : X → Y is continuous at a ∈ X if given any open ball Bε (f (a); dY )
        around f (a), there exists δ > 0 such that
                                           f (Bδ (a; dX )) ⊆ Bε (f (a); dY ),
        that is,
                                           Bδ (a; dX ) ⊆ f −1 (Bε (f (a); dY )) .
        Example Let (X, dX ) and (Y, dY ) be metric spaces. Suppose that dX is the discrete metric d0 .
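        Then, for any a ∈ X,
        \[ B_1(a; d_0) = \{a\}. \]
        Thus (taking δ = 1 in the definition of continuity) any function f from X to Y is (d0 , dY )-continuous.
        Suppose on the other hand that dY is the discrete metric, and again let f be a function from X to Y .
        For f to be (dX , d0 )-continuous at a point a ∈ X, we need (taking ε = 1) to find δ > 0 such that
        \[ B_\delta(a; d_X) \subseteq f^{-1}(B_1(f(a); d_0)) = f^{-1}(\{f(a)\}), \]
        that is, f must be constant on some open ball around a.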
        Continuity can be characterised completely in terms of open sets. (We saw a special case of this in
        the previous chapter, when the spaces are Rm and Rn with the usual metrics.)
        Theorem 6.8 Let (X, dX ) and (Y, dY ) be metric spaces. Then a mapping f : X → Y is
        (dX , dY )-continuous if and only if, for every dY -open subset U of Y , f −1 (U ) is a dX -open subset of
        X.
        Give an example to show that it is possible for a function f : X → Y between metric spaces to be
        continuous, for U to be an open subset of X, and yet for f (U ) not to be an open subset of Y .
        It is an easy matter to extend the notion of convergence to general metric spaces. Recall the
        definition of convergence of a real sequence: the sequence (xn ) converges to x ∈ R if for all ε > 0,
        there is N such that
                                               n ≥ N =⇒ |xn − x| < ε.
        Thus the following definition is completely natural.
        Definition 6.9 Suppose that (X, d) is a metric space and that (xn ) is a sequence in X. We say that
        (xn ) converges to x ∈ X if for any ε > 0 there is N such that
                                               n ≥ N =⇒ d(xn , x) < ε.
        Now that we have a definition of convergence, we can say precisely what we should mean by a closed
        subset of a metric space.
        Definition 6.10 Suppose (X, d) is a metric space. A subset C of X is closed if, whenever (xn ) is a
        sequence of elements of C converging to a limit x, the limit x is also in C.
        Example If d0 is the discrete metric on any set X, then a sequence (xn ) converges to x if and only if
        there is an N such that n ≥ N =⇒ xn = x. (I.e., a sequence is convergent if and only if it is
        eventually constant.) It follows that all subsets of X are closed in the discrete metric.
        Theorem 6.11 Suppose (X, d) is a metric space. A set C ⊆ X is closed if and only if its
        complement X \ C is open.
        Definition 6.12 Suppose that (X, d) is a metric space. A subset C of X is said to be (sequentially)
        compact if any sequence (xn ), where xn ∈ C for all n, has a subsequence converging to a point of C.
If X itself is a compact subset, we simply say that the metric space (X, d) is compact.
        Example Let X be any set and let d0 be the discrete metric on X. Then (X, d0 ) is compact if and
        only if X is finite.
Theorem 6.13 Any compact subset C of a metric space (X, d) is closed and bounded.
        We saw earlier that, for Rm with the usual metric, the converse of this result also holds:
        any closed and bounded subset of Rm is in fact compact. However, the next
        example shows that this is not true in general.
        Example Let d0 be the discrete metric on an infinite set X. Then X itself is a closed and bounded
        set in (X, d0 ), but it is not compact.
        The following result is a more general form of a key theorem given in the previous chapter. It is often
        phrased as ‘The continuous image of a compact set is compact.’
        Theorem 6.14 Suppose that (X, dX ) and (Y, dY ) are metric spaces, and that C is a compact subset
        of X. If f : X → Y is (dX , dY )-continuous, then f (C) is a compact subset of Y .
        Proof. One point about this proof: we must not assume that the elements of X and Y are real
        numbers, or elements of some Rn , or whatever. All we know is that they are elements of some metric
        space, and that’s all we may use.
        To show that f (C) is compact, take any sequence (yn ) in f (C); we need to show that (yn ) has a
        subsequence converging to some element of f (C). For each n, since yn ∈ f (C), there is some xn in
        C such that f (xn ) = yn . Consider the sequence (xn ) in C: since C is compact, there is some
        subsequence (xnk ) converging to a limit x ∈ C. We claim that f (xnk ) → f (x) as k → ∞: to see
        this, fix any ε > 0; by continuity of f there is some δ > 0 such that
        dX (y, x) < δ ⇒ dY (f (y), f (x)) < ε, and by definition of convergence there is some K such that
      k ≥ K ⇒ dX (xnk , x) < δ; combining these two facts gives us that, for any ε > 0, there is some K
      such that k ≥ K ⇒ dY (f (xnk ), f (x)) < ε, as required. But now notice that the sequence
      (f (xnk )) = (ynk ) is a subsequence of the original sequence (yn ) converging to a limit f (x) ∈ f (C),
      which is what we need.
      So we have the following important corollary, which generalises earlier results on the maximisation and
      minimisation of continuous functions.
      Theorem 6.15 Suppose that C is a compact subset of a metric space (X, d) and that f : X → R is
      continuous (with respect to d and the usual metric on R). Then f attains its bounds on C. That is,
      there are c, d ∈ C such that
      \[ f(c) \le f(x) \le f(d) \quad \text{for all } x \in C. \]
At the end of this chapter and the relevant reading, you should be able to:
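      Learning activity 6.1 We need to check that the three properties for a metric hold. Certainly,
      d(X, Y ) ≥ 0, and d(X, Y ) = 0 if X = Y . Moreover, if d(X, Y ) = 0 then max_{i,j} |xij − yij | = 0, so
      |xij − yij | = 0 for all i, j; in other words, xij = yij , and X = Y . Next,
      \[ d(X, Y) = \max_{i,j} |x_{ij} - y_{ij}| = \max_{i,j} |y_{ij} - x_{ij}| = d(Y, X). \]
      Lastly, we need to verify the triangle inequality for d. Let X, Y, Z ∈ A = A(m, n). We want to show
      that d(X, Y ) ≤ d(X, Z) + d(Y, Z). Now, for each i, j, by the standard triangle inequality for real
      numbers, |xij − yij | ≤ |xij − zij | + |zij − yij |. So, for all i, j,
      \[ |x_{ij} - y_{ij}| \le \max_{i,j} |x_{ij} - z_{ij}| + \max_{i,j} |z_{ij} - y_{ij}| = d(X, Z) + d(Z, Y), \]
      and taking the maximum over i, j on the left-hand side gives d(X, Y ) ≤ d(X, Z) + d(Z, Y ) = d(X, Z) + d(Y, Z).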
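      The metric properties of d(X, Y ) = max_{i,j} |xij − yij | can also be spot-checked numerically.
      The Python sketch below is only an illustration under stated assumptions: matrices are
      represented as lists of rows with small integer entries (so the arithmetic is exact), and the
      names d_max and check_metric_axioms are hypothetical helpers introduced here.

```python
import itertools
import random

def d_max(X, Y):
    """Largest absolute entrywise difference between two matrices of equal shape."""
    return max(abs(x - y)
               for row_x, row_y in zip(X, Y)
               for x, y in zip(row_x, row_y))

def random_matrix(m, n, rng):
    """An m-by-n matrix with integer entries in [-5, 5]."""
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(m)]

def check_metric_axioms(matrices):
    """Test positivity, symmetry and the triangle inequality on all triples."""
    for X, Y, Z in itertools.product(matrices, repeat=3):
        if d_max(X, Y) < 0:
            return False
        if (d_max(X, Y) == 0) != (X == Y):
            return False
        if d_max(X, Y) != d_max(Y, X):
            return False
        if d_max(X, Y) > d_max(X, Z) + d_max(Z, Y):
            return False
    return True

rng = random.Random(1)
sample = [random_matrix(2, 3, rng) for _ in range(12)]
print(check_metric_axioms(sample))
```

      Passing on a finite sample does not replace the argument above; it only guards against a
      misreading of the definition.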
      Learning activity 6.2 Let’s take the metric space to be R with the usual metric. For n ∈ N, let
      Un = (−n, n). Then Un is bounded, because for each x, y ∈ Un , we have d(x, y) < 2n. However,
      the union ⋃_{n=1}^∞ Un is the whole of R and this is not a bounded set.
      Learning activity 6.3 Consider f : R → R defined by f (x) = 0, for all x. With the usual metric on
      R, this is continuous but, although (0, 1) is open, f ((0, 1)) = {0} is not open.
      Learning activity 6.4 Suppose that X is finite. Then any sequence in X must take the same value,
      say x ∈ X, infinitely often, and this constant subsequence is convergent to x because all its members
      equal x. On the other hand, suppose X is infinite. Then there is a sequence with no repeated
      members. Such a sequence has no convergent subsequence because a sequence converges in the
      discrete metric if and only if all its terms are equal from some point onwards, and this sequence (and
      hence all of its subsequences) has none of its members equal. (You might want to look at the solution
      to Exercise 6.9 for a fuller explanation of why a sequence in the discrete metric converges if and only
      if it is ultimately constant.)
6.9 Exercises
      Exercise 6.1 Is it possible that in a metric space M containing more than one point, the only open
      subsets are M and ∅, the empty set?
Exercise 6.2 Let Z be the set of integers. Let p be a fixed prime number. Define
d:Z×Z→R
      by d(m, m) = 0, and d(m, n) = 1/r where r is such that m − n = p^{r−1} k, where r, k are integers and
      p does not divide k. Prove that d is a metric on Z.
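      Before attempting the proof, it can help to compute a few values of d. The following Python
      sketch is a brute-force sanity check only, under the reading that r − 1 is the number of times
      p divides m − n; the names d_p and triangle_holds, and the tested range of integers, are
      assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def d_p(m, n, p):
    """d(m, n) = 1/r, where m - n = p**(r-1) * k and p does not divide k."""
    if m == n:
        return Fraction(0)
    diff = abs(m - n)
    r = 1
    while diff % p == 0:
        diff //= p
        r += 1
    return Fraction(1, r)

def triangle_holds(p, bound=15):
    """Check d(a, c) <= d(a, b) + d(b, c) for all integers in [-bound, bound]."""
    pts = range(-bound, bound + 1)
    return all(d_p(a, c, p) <= d_p(a, b, p) + d_p(b, c, p)
               for a, b, c in product(pts, repeat=3))

print(d_p(0, 9, 3), d_p(1, 2, 3), triangle_holds(3))
```

      Exact fractions are used so that comparisons are not affected by rounding.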
Exercise 6.3 Prove that in any metric space (A, d), for any x, y, z ∈ A,
      Exercise 6.5 Suppose that, in a metric space (A, d), the sequence (an ) converges to a and (bn )
      converges to b. Prove that (in R, with the usual metric) the sequence of real numbers (d(an , bn ))
      converges to d(a, b). [You may find the result of Exercise 6.3 useful.]
      Exercise 6.6 Suppose that (A, d) is a metric space, X is a set, and that f : X → A is an injective
      (one-to-one) function. Define d0 : X × X → R by d0 (x, y) = d (f (x), f (y)) . Prove that d0 is a metric
      on X. Prove that d0 is not a metric if the function f is not injective.
      Exercise 6.7 Prove that a subset of a metric space is open if and only if it is a union of open balls.
      [Note: the union may be the union of an infinite family of open sets.]
      Exercise 6.8 Let M = (X, d) be a metric space, and let (xn ) be a sequence of points in X. Suppose
      that xn → x and xn → y. Show that x = y.
      Exercise 6.9 Suppose that (A, d) is a metric space. What does it mean to say that a sequence (xn )
      of members of A converges to x ∈ A (with respect to the metric d)?
Prove that a sequence in A converges with respect to the discrete metric if and only if there is N ∈ N
such that xr = xs for all r, s ≥ N . (That is, if and only if all of its terms are eventually equal to each
other.)
Chapter 7
Uniform convergence
Contents
         7.1     Introduction
         7.2     Pointwise and uniform convergence
         7.3     Uniform convergence as convergence in a metric space
         7.4     Uniform convergence and continuity
         7.5     Learning outcomes
         7.6     Comments on selected activities
         7.7     Exercises
Reading
This topic is not covered by many of the textbooks, but the following is worth reading.
7.1 Introduction
      This short chapter concerns the matter of how we might think about convergence of a sequence of
      functions (rather than of numbers or vectors).
      Suppose S ⊆ Rn and that for each n ∈ N, fn is a function from S to R. We may think of the
      functions f1 , f2 , f3 , . . ., as forming a sequence of functions (fn ). Of course, this is very different from
      a sequence of real numbers, but it is still possible to formulate some idea of ‘limit’ in this context.
      There are two main ways in which we might do this. One is through the definition of pointwise
      convergence and the other through uniform convergence. The following definitions describe these.
      Definition 7.1 (Pointwise convergence) The sequence (fn ) converges pointwise to the function f
      on S if and only if for each x ∈ S, fn (x) → f (x) as n → ∞; that is, given x ∈ S, and ε > 0, there
      exists N = N (x, ε) such that
      \[ n > N \implies |f_n(x) - f(x)| < \varepsilon. \]
      Definition 7.2 (Uniform convergence) The sequence (fn ) converges uniformly to the function f
      on S if and only if given ε > 0, there exists N = N (ε) such that
      \[ n > N \implies |f_n(x) - f(x)| < \varepsilon \quad \text{for all } x \in S. \]
      The difference between these definitions is that, with uniform convergence, the N depends only on ε,
      so that the same N works for every x ∈ S, whereas in pointwise convergence, x is given, and N can
      depend on x as well as on ε.
      Theorem 7.3 Suppose that (fn ) converges uniformly on S to f . Then (fn ) converges pointwise on
      S to f .
      Example Suppose that we take S = [0, 1] and fn (x) = x^n . Then it is easy to see that (fn ) converges
      pointwise on [0, 1] to the function
      \[ f(x) = \begin{cases} 0 & \text{if } 0 \le x < 1, \\ 1 & \text{if } x = 1. \end{cases} \]
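      To see how the threshold in pointwise convergence can depend on x, one can compute, for
      fn (x) = x^n and a fixed ε, the smallest n with x^n < ε at several points x < 1. The Python
      sketch below does this; the function name smallest_n and the sample points are illustrative
      assumptions.

```python
def smallest_n(x, eps):
    """Smallest n >= 1 with x**n < eps, assuming 0 <= x < 1 and eps > 0."""
    n = 1
    power = x
    while power >= eps:
        power *= x
        n += 1
    return n

# The required n grows without bound as x approaches 1.
for x in (0.5, 0.9, 0.99, 0.999):
    print(x, smallest_n(x, 0.01))
```

      Since x^n decreases in n for 0 ≤ x < 1, every larger n also satisfies x^n < ε. The growth of
      these thresholds as x approaches 1 suggests that no single N can serve all x ∈ [0, 1) at once.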
      Example Suppose again that fn (x) = x^n but that S = [0, 1/2]. Then (fn ) converges uniformly to
      the identically-0 function f (given by f (x) = 0 for all x) on [0, 1/2], because, for all x ∈ [0, 1/2],
      \[ |f_n(x) - f(x)| = x^n \le \left(\tfrac{1}{2}\right)^n. \]
      Hence, given ε > 0, we can choose N so that (1/2)^N < ε, so that, for any n > N and any
      x ∈ [0, 1/2], |fn (x) − f (x)| < ε.
      For S a subset of Rn , let F = FS be the set of bounded functions from S to R, i.e., the set of
      functions f : S → R for which there is some constant K with |f (x)| ≤ K for all x ∈ S.
      For two functions f, g ∈ FS , define d(f, g) = sup_{x∈S} |f (x) − g(x)|. Note that, since f and g are
      bounded, the set {|f (x) − g(x)| : x ∈ S} is bounded above, so the supremum d(f, g) does exist.
      Then the pair MS = (FS , d) is a metric space: d is sometimes called the sup metric and is often
      denoted by d∞ .
      To say that a sequence (fn ) of functions in FS converges to a limit f in the metric space MS means
      that, for all ε > 0, there exists some N = N (ε) such that n > N =⇒ d(fn , f ) < ε. Of course the last
      inequality can be rewritten as sup_{x∈S} |fn (x) − f (x)| < ε. This isn’t exactly the same as the definition
      we gave for uniform convergence, but nevertheless the following result is now fairly obvious.
      Theorem 7.4 Let S be any subset of Rn , let (fn ) be a sequence of bounded functions from S to R,
      and let f be a bounded function from S to R. Then (fn ) converges uniformly to f on S if and only if
      fn → f in MS .
      Proof. Suppose fn → f in MS . Take any ε > 0. Then there is some N = N (ε) such that, for all
      n > N (ε), d(fn , f ) = sup_{x∈S} |fn (x) − f (x)| < ε. Thus, for n > N (ε), and any x ∈ S,
      |fn (x) − f (x)| < ε. Thus (fn ) converges uniformly to f .
      Suppose (fn ) converges uniformly to f on S. Then for any ε > 0 there is some N = N (ε) such that
      n > N (ε), x ∈ S =⇒ |fn (x) − f (x)| < ε/2. Thus, for n > N (ε), sup_{x∈S} |fn (x) − f (x)| ≤ ε/2 < ε.
      Hence fn → f in MS .
      So, this gives us a method to test whether (fn ) converges uniformly to f . We need only to evaluate
      sup_{x∈S} |fn (x) − f (x)| = d(fn , f ) for each n, and see whether this tends to 0 as n → ∞.
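      This test can be explored numerically. The Python sketch below estimates d(fn , f ) by taking
      the largest value of |fn (x) − f (x)| over an evenly spaced grid, applied to fn (x) = x^n on
      [0, 1/2] with f identically 0. The helper names sup_distance, power and zero, and the grid
      size, are assumptions for illustration. A grid maximum is only a lower bound for the true
      supremum, so this is a heuristic and not a proof.

```python
def sup_distance(f, g, a, b, points=10001):
    """Largest |f(x) - g(x)| over an evenly spaced grid on [a, b] (a lower bound for the sup)."""
    step = (b - a) / (points - 1)
    return max(abs(f(a + i * step) - g(a + i * step)) for i in range(points))

def power(n):
    """The function x -> x**n."""
    return lambda x: x ** n

def zero(x):
    """The identically-0 function."""
    return 0.0

# On [0, 1/2] the estimates shrink like (1/2)**n.
for n in (1, 5, 10, 20):
    print(n, sup_distance(power(n), zero, 0.0, 0.5))
```

      Here the function is increasing, so the grid maximum sits at the right-hand endpoint and
      agrees with the exact value (1/2)^n up to rounding.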
      Example Consider again the sequence (fn ) of functions on [0, 1] given by fn (x) = x^n , and let f be
      the pointwise limit of (fn ). (So f (x) = 0 for all x < 1, and f (1) = 1.) Then
      sup_{x∈[0,1]} |fn (x) − f (x)| = sup_{x∈[0,1)} x^n = 1 for each n, and therefore (fn ) does not converge
      uniformly to f (as we saw before).
      Example Suppose fn : [0, 1] → R is given by fn (x) = nx/(n + x). Then
      \[ f_n(x) = \frac{x}{1 + x/n} \to x \]
      as n → ∞, so the sequence (fn ) converges pointwise on [0, 1] to f (x) = x. To check whether the
      convergence is uniform, we consider sup_{x∈[0,1]} |fn (x) − x|. Now,
      \[ \sup_{x \in [0,1]} |f_n(x) - x| = \sup_{x \in [0,1]} \left| \frac{nx}{n + x} - x \right| = \sup_{x \in [0,1]} \left| \frac{-x^2}{n + x} \right| = \sup_{x \in [0,1]} \frac{x^2}{n + x}. \]
      The derivative of x^2/(n + x), namely x(x + 2n)/(n + x)^2, is non-negative on [0, 1], so this function
      is increasing there and is hence maximised at
      x = 1, so the supremum is 1/(n + 1). This does tend to 0 as n → ∞, so the convergence is uniform.
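      The value 1/(n + 1) can be compared with a direct numerical estimate. In the Python sketch
      below, gap and grid_sup are hypothetical helper names and the grid size is an arbitrary
      choice; the grid maximum is printed next to 1/(n + 1) for a few values of n.

```python
def gap(n, x):
    """|f_n(x) - x| for f_n(x) = n*x / (n + x)."""
    return abs(n * x / (n + x) - x)

def grid_sup(n, points=10001):
    """Largest value of gap(n, x) over an evenly spaced grid on [0, 1]."""
    return max(gap(n, i / (points - 1)) for i in range(points))

for n in (1, 10, 100, 1000):
    print(n, grid_sup(n), 1 / (n + 1))
```

      Because the grid contains the endpoint x = 1, where the maximum is attained, the two printed
      columns should agree up to rounding error.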
      Example Define f0 : R → R by f0 (x) = 0 for all x. The open ball B1 (f0 ) consists of all functions f
      for which there is some t < 1 with |f (x)| ≤ t for all x.
      Example The set G of functions g : R → R such that |g(x)| < 1 for all x is (perhaps curiously) not
      an open set in the sup-metric space. For instance, consider the function g(x) = (2/π) tan^{−1}(x): there is
      no positive ε such that Bε (g) ⊆ G.
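      One way to see this is to shift g upwards: for 0 < ε < 2 the function h = g + ε/2 satisfies
      d(g, h) = ε/2 < ε, so h ∈ Bε (g), yet h(x) ≥ 1 once x is large enough, so h ∉ G. The Python
      sketch below finds such an x for a few values of ε; the names g and escape_point are
      illustrative assumptions.

```python
import math

def g(x):
    """g(x) = (2/pi) * arctan(x), which takes values strictly between -1 and 1."""
    return (2 / math.pi) * math.atan(x)

def escape_point(eps):
    """For 0 < eps < 2, return an x with g(x) + eps/2 >= 1."""
    # g(x) >= 1 - eps/2 exactly when x >= tan((pi/2) * (1 - eps/2)); add 1 for a safety margin.
    return math.tan((math.pi / 2) * (1 - eps / 2)) + 1

for eps in (0.5, 0.1, 0.001):
    x = escape_point(eps)
    print(eps, x, g(x) + eps / 2)
```

      The smaller ε is, the further out one has to go, but such a point always exists because
      g(x) → 1 as x → ∞.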
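      Example Consider the set of all functions from S = [0, 1] to [0, 1]. This is a subset of FS . It is a
      bounded set in the metric space MS , and it is closed. However it is not compact: consider again the
      sequence (fn ) defined by fn (x) = x^n . Any subsequence converges pointwise to the pointwise limit f
      of (fn ), so f is its only possible limit in MS ; but d(fn , f ) = 1 for every n, so no subsequence
      converges in MS .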
      The following result is sometimes useful, as we shall see, because it will lead to a method by which we
      can determine that some sequences of functions do not converge uniformly.
      Theorem 7.5 Suppose that (fn ) converges uniformly on S to f , and that each fn is continuous at a ∈ S.
      Then f is continuous at a.
      In terms of the sup-metric, this means that the set CS of continuous functions from S to R is a
      closed subset in the metric space MS .
      The theorem above results in the following, which sometimes gives an easy way to prove that
      convergence is not uniform.
      Corollary 7.6 Suppose (fn ) converges pointwise on S to f , and that each fn is continuous at a ∈ S.
      If f is not continuous at a, then (fn ) does not converge uniformly on S to f .
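      Example (As above) Suppose that we take S = [0, 1] and fn (x) = x^n . Then, as we have seen, (fn )
      converges pointwise on [0, 1] to the function
      \[ f(x) = \begin{cases} 0 & \text{if } 0 \le x < 1, \\ 1 & \text{if } x = 1. \end{cases} \]
      Each fn is continuous at 1 but f is not, so by Corollary 7.6 the convergence is not uniform.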
At the end of this chapter and the relevant reading, you should be able to:
          state what is meant by the pointwise and uniform convergence (on a set) of a sequence of
          continuous functions, and demonstrate that you understand the difference between these
          state, and use, the fact that uniform convergence implies pointwise convergence
          demonstrate that you understand that uniform convergence is equivalent to convergence in a
          metric space of functions with respect to the sup metric
          state the fact that (fn ) converges uniformly on S to f if and only if
          ‖fn − f‖_S = sup{|fn (x) − f (x)| : x ∈ S} tends to 0 as n → ∞; and be able to use this to prove
          uniform convergence, or that a sequence does not uniformly converge
          state, and use, the fact that if a sequence of continuous functions converges uniformly, then the
          limit function is also continuous (proof not necessary)
      Learning activity 7.1 Informally, the reason that uniform convergence implies pointwise convergence
      is that the former condition is stronger: it requires the same N (ε) to work for every x. Formally, all
      one has to observe to prove this is that, if the convergence is uniform and if N (ε) is as in
      Definition 7.2, then, for each x, we can take N (x, ε) to equal N (ε) in Definition 7.1. Definition 7.1 is
      then satisfied.
7.7 Exercises
      Exercise 7.1 Let fn : [0, 1] → R be defined by fn (x) = n x^n (1 − x). Prove that (fn ) converges
      pointwise on [0, 1] to the identically-0 function (the function f such that f (x) = 0 for all x). Prove,
      however, that (fn ) does not converge uniformly to f .
      Exercise 7.2 Let fn : R → R be defined by fn (x) = x e^{−nx^2} . Prove that (fn ) converges uniformly on
      [0, 1] to the identically-0 function.
      Exercise 7.3 Does the sequence (fn ) of functions converge pointwise, when fn takes the following
      form?
                                 (i) fn (x) = tan^{−1}(nx) (= arctan(nx)),
                                 (ii) fn (x) = x e^{−nx} .
      In each case, determine whether the sequences converge uniformly on R.
      Exercise 7.4 Consider the sequence of functions (gn ), each defined on (0, 1), given by
      \[ g_n(x) = \begin{cases} x & \text{if } x \ge 1/n, \\ 0 & \text{if } x < 1/n. \end{cases} \]
Show that (gn ) converges uniformly on (0, 1) to the limit g, where g(x) = x for all x.
Let G denote the set of all functions from (0, 1) to (0, 1), considered as a subset of the metric space
M(0,1) of bounded functions on (0, 1) with the sup-metric. Deduce from the previous part of the
question that G is not an open set in M(0,1) .
Exercise 7.5 Suppose that for each n ∈ N, gn : [0, 1] → R is the function defined by
\[ g_n(x) = \begin{cases} nx & \text{if } 0 \le x \le 1/n, \\ \dfrac{n}{n-1}(1 - x) & \text{if } 1/n \le x \le 1. \end{cases} \]
Find a function g such that (gn ) converges pointwise to g on [0, 1]. Prove that the convergence is
uniform on the interval [c, 1] for any c > 0, but that it is not uniform on [0, 1].
Exercise 7.6 Let C[0, 1] be the set of continuous functions from [0, 1] to R. (Recall that all such
functions are bounded and attain their bounds.) Consider the metric space M = (C[0, 1], d), where d
denotes the sup-metric.
Show that the set of functions in C[0, 1] whose image is contained in (0, 1) is an open set in C[0, 1].