Introanalysis

This document provides an introduction to analysis as the study of limiting behavior and approximation. It reviews basic concepts of sequences and limits of real numbers, including definitions of convergence, divergence, and Cauchy sequences. The document establishes that a sequence converges if and only if it is Cauchy and discusses properties of limits.


Lecture Notes M 517

Introduction to Analysis

Donald J. Estep

Department of Mathematics
Colorado State University
Fort Collins, CO 80523

[email protected]
http://www.math.colostate.edu/~estep
Contents

Chapter 1. Introduction
1.1. Review of Real Numbers
1.2. Functions and Sets
1.3. Unions and Intersection
1.4. Cardinality

Chapter 2. Metric Spaces
2.1. Quick Review of Rn
2.2. Definition and Initial Examples of a Metric Space
2.3. Components of a Metric Space

Chapter 3. Compactness
3.1. Compactness Arguments and Uniform Boundedness
3.2. Sequential Compactness
3.3. Separability
3.4. Notions of Compactness
3.5. Some Properties of Compactness
3.6. Compact Sets in Rn

Chapter 4. Cauchy Sequences in Metric Spaces
4.1. A Few Facts and Boundedness
4.2. Cauchy Sequences
4.3. Completeness
4.4. Cauchy Sequences and Compactness

Chapter 5. Sequences in Rn
5.1. Arithmetic Properties and Convergence
5.2. Sequences in R and Order
5.3. Series

Chapter 6. Continuous Functions on Metric Spaces
6.1. Limit of a Function
6.2. Continuous Functions
6.3. Uniform Continuity
6.4. Continuity and Compactness
6.5. Rn-valued Continuous Functions

Chapter 7. Sequences of Functions and C([a, b])
7.1. Convergent Sequences of Functions
7.2. Uniform Convergence: C([a, b]) is Closed, and Complete
7.3. C([a, b]) is Separable
7.4. Compact Sets in C([a, b])

CHAPTER 1

Introduction

What is analysis? The definition depends on your point of view. To an applied
mathematician, analysis means approximation and estimation. Applied mathematicians
try to describe the physical world in mathematical terms so as to understand how it
works and to predict future behavior and events. But the world is so complex that we
have to resort to approximations both to describe it and to understand the mathematical
descriptions themselves. To a pure mathematician, analysis is the study of the
limiting behavior of an infinite process.
Many mathematical objects, such as numbers, derivatives, and integrals, are
defined as the limit of an infinite process. Dealing with such limits in a rigorous way
is what distinguishes modern mathematics, beginning around the time of Newton
and Leibniz, from classical mathematics.
These are really the same notions: approximation and estimation require the
idea of a limit and, likewise, a limit implies the ability to approximate.
The analysis we learn in this course makes up a basic bag of tools for a working
mathematician. These tools are needed for any deeper study in applied mathematics,
differential equations, topology, functional analysis, and so on. As with any kind of
tool, you can only really learn them by using them, i.e., by doing problems.

1.1. Review of Real Numbers


This course is like an advanced calculus course, except that we work with
the more general notion of a “metric space” rather than only the real numbers.
Examples include R as well as spaces of functions. We first talk about the structure
of the space itself, as we might begin a rigorous calculus course by constructing real
numbers and discussing sequences, and then we develop properties of functions on
our space. It will probably be useful for you to compare our developments to what
holds for reals and functions on reals.
Recall
Definition 1.1.1. A sequence of real numbers
$$(a_1, a_2, \dots) = \{a_i\}_{i=1}^{\infty} = \{a_{\text{bike}}\}_{\text{bike}=1}^{\infty}$$
is a set of numbers listed in a specific order. Here, “∞” denotes a never-ending list.
The subscript “i” or “bike” is called the index.
Example 1.1.2.
$$\frac{1}{9} = .111111\ldots$$
So,
$$(.1,\ .11,\ .111,\ \dots,\ \underbrace{.11\ldots1_n}_{n\ 1\text{'s}},\ \dots)$$
is a sequence of reals that approximates $\frac{1}{9}$ in the sense that
$$\left| \frac{1}{9} - .1\ldots1_n \right| < 10^{-n}$$
for all n.
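The bound in Example 1.1.2 can be checked with exact rational arithmetic. The following is a minimal sketch; the helper names `repunit_decimal` and `approximation_error` are illustrative and not part of the notes.

```python
from fractions import Fraction


def repunit_decimal(n: int) -> Fraction:
    """Return .11...1 (n ones after the decimal point) as an exact fraction."""
    return Fraction(10**n - 1, 9 * 10**n)


def approximation_error(n: int) -> Fraction:
    """Exact distance between 1/9 and its n-digit truncation."""
    return abs(Fraction(1, 9) - repunit_decimal(n))


for n in (1, 2, 5, 10):
    err = approximation_error(n)
    # The error is exactly 10**(-n) / 9, which is smaller than 10**(-n).
    assert err == Fraction(1, 9 * 10**n)
    assert err < Fraction(1, 10**n)
    print(n, float(repunit_decimal(n)), float(err))
```

Using `Fraction` rather than floating point keeps the comparison with $10^{-n}$ exact, so the check does not depend on rounding.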
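Sequences can have many forms.
Example 1.1.3.
$$(2, 4, 6, \dots) = \{2n\}_{n=1}^{\infty}$$
$$(-1, 1, -1, 1, \dots) = \{(-1)^n\}_{n=1}^{\infty}$$
$$\left(1, \tfrac{1}{2}, \tfrac{1}{3}, \dots\right) = \left\{\tfrac{1}{n}\right\}_{n=1}^{\infty}$$
$$(1, 1, 1, \dots) = \{1\}_{n=1}^{\infty}$$
$$\left(3, 4.182, \tfrac{3}{5}, -14.68, \dots\right) = \{?\}_{n=1}^{\infty}$$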
Some of these sequences have the special property that as the index increases,
the corresponding numbers become closer and closer to one number.
Definition 1.1.4. If the numbers in the sequence $\{a_n\}$ approach a number $a$
as n increases, then we call $a$ the limit of the sequence and write
$$\lim_{n\to\infty} a_n = a \quad\text{and}\quad a_n \xrightarrow[n\to\infty]{} a.$$
We say the sequence converges.
Example 1.1.5.
$$\lim_{n\to\infty} .11\ldots1_n = \frac{1}{9}$$
$$\lim_{n\to\infty} \frac{1}{n} = 0$$
$$\lim_{n\to\infty} 5 = 5$$

To make this definition more precise, we use mathematical language:

Definition 1.1.6. $\lim_{n\to\infty} a_n = a$ means that for every $\epsilon > 0$, there is an N such
that $|a_n - a| < \epsilon$ for $n \ge N$.
Exercise 1.1.7. Verify $\lim_{n\to\infty} .111\ldots1_n = \frac{1}{9}$. (Find a concrete relation between
the $\epsilon$ and the N.)
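To see how Definition 1.1.6 is used in practice, here is a small sketch for a different sequence, $a_n = 1/n$ with limit 0 (so it does not give away Exercise 1.1.7). The function name `first_index_within` is an illustrative choice; the relation it encodes is that $1/n < \epsilon$ exactly when $n > 1/\epsilon$.

```python
import math
from fractions import Fraction


def first_index_within(eps: Fraction) -> int:
    """Return an N such that |1/n - 0| < eps for every n >= N."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    # 1/n < eps  <=>  n > 1/eps, so any integer N > 1/eps works.
    return math.floor(1 / eps) + 1


for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(3, 1000)):
    N = first_index_within(eps)
    # Spot-check the defining inequality on a stretch of indices n >= N.
    assert all(abs(Fraction(1, n) - 0) < eps for n in range(N, N + 1000))
    print(eps, N)
```

A finite spot-check like this is not a proof, of course; the proof is the one-line inequality in the comment.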
Definition 1.1.8. If a sequence does not converge, we say it diverges.
Divergence comes with a myriad of interesting behaviors. Here are
some examples of diverging sequences:
Example 1.1.9.
(1, 2, 3, 4, ...)
(1, −1, 1, −1, ...)
(random numbers)

After making this definition, the usual thing to do is to prove some basic
properties. Some of these are:
Theorem 1.1.10. Limits of Sequences of Real Numbers
(1) Limits are unique.
(2) Arithmetic Properties: Suppose $a_n \to a$ and $b_n \to b$. Then
(a) $a_n + b_n \to a + b$.
(b) $a_n b_n \to ab$.
An important concept related to convergence is the notion of a Cauchy sequence.
Definition 1.1.11. A sequence $\{a_n\}$ of real numbers is a Cauchy Sequence,
or is Cauchy, if for every $\epsilon > 0$, there is an N such that
$$|a_i - a_j| < \epsilon \quad\text{for } i, j > N.$$
The interest in the Cauchy criterion (Def. 1.1.11) is that it does not require the
value of the limit, i.e., what the sequence converges to. The standard criterion for
convergence (Def. 1.1.6) is practically useless when the limit is not known in advance,
because it can only be checked against a candidate value for the limit. We are able to prove
Theorem 1.1.12. A sequence of real numbers converges if and only if it is
Cauchy.
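As an illustration of why the Cauchy criterion is convenient, the sketch below checks it for the partial sums $s_n = \sum_{k=1}^{n} 1/k^2$ without ever using the value of the limit. The sequence is an example chosen here, not one from the notes. It relies on the elementary tail bound $|s_i - s_j| < 1/\min(i, j)$, which follows from $1/k^2 < 1/(k-1) - 1/k$ for $k \ge 2$.

```python
import math
from fractions import Fraction
from itertools import combinations


def partial_sums(count: int) -> list:
    """Return [s_1, ..., s_count] with s_n = 1/1^2 + ... + 1/n^2, exactly."""
    sums, total = [], Fraction(0)
    for k in range(1, count + 1):
        total += Fraction(1, k * k)
        sums.append(total)
    return sums


def cauchy_index(eps: Fraction) -> int:
    """An N that works for eps, using |s_i - s_j| < 1/min(i, j) <= 1/N."""
    return math.ceil(1 / Fraction(eps))


eps = Fraction(1, 20)
N = cauchy_index(eps)
s = partial_sums(N + 40)
# Check |s_i - s_j| < eps on a block of indices i, j > N (s[i - 1] holds s_i).
tail = range(N + 1, N + 41)
assert all(abs(s[i - 1] - s[j - 1]) < eps for i, j in combinations(tail, 2))
print(N, float(s[-1]))
```

Nothing in the check refers to the limit itself, which is the point of Definition 1.1.11.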
Sequences and limits are inherent to the notion of the real number system.
Recall that rational numbers, in general, have infinite decimal expansions. But,
at least these decimal expansions are periodic, e.g., .123412341234... . Most real
numbers (in a precise sense) have infinite, non-repeating decimal expansions. These
are called the irrational numbers. All kinds of difficulties arise because of this fact.
For example, how should we add the irrational numbers $e$ and $\sqrt{2}$?
We add numbers with finite decimal expansions by starting at the rightmost
digit. But, that is impossible with irrationals! What we do in practice is add
finite truncations, or approximations, of $e$ and $\sqrt{2}$, and count on this being an
approximation of $\sqrt{2} + e$.
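The idea of adding finite truncations can be tried out with Python's `decimal` module. This is only a sketch under stated assumptions: the "exact" values of $e$ and $\sqrt{2}$ are themselves 50-digit approximations computed by the library, and the helper names are illustrative.

```python
from decimal import Decimal, ROUND_DOWN, localcontext


def truncate(x: Decimal, digits: int) -> Decimal:
    """Keep `digits` decimal places of x and discard the rest."""
    return x.quantize(Decimal(1).scaleb(-digits), rounding=ROUND_DOWN)


def sum_of_truncations(digits: int, working_precision: int = 50):
    """Return (sum of truncations, reference value) for e + sqrt(2)."""
    with localcontext() as ctx:
        ctx.prec = working_precision
        e, root2 = Decimal(1).exp(), Decimal(2).sqrt()
        approx = truncate(e, digits) + truncate(root2, digits)
        return approx, e + root2


for digits in (2, 5, 10, 20):
    approx, reference = sum_of_truncations(digits)
    # Each truncation loses less than 10**(-digits), so the sum loses
    # less than twice that amount.
    assert 0 <= reference - approx < 2 * Decimal(1).scaleb(-digits)
    print(digits, approx)
```

The sequence of truncated sums is a sequence of rationals whose terms get close to each other, which is exactly the situation the Cauchy criterion is designed for.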
In fact, this is exactly the way we construct real numbers from the rationals,
which we understand better. Just as with “+” above, the main point is to show the
real numbers inherit the usual properties of rational numbers. The tricky part is
that we can’t write down reals until we have defined them! This is why the Cauchy
criterion is so important.
All of these difficulties carry over to defining functions of real numbers (how
should we compute $\sqrt{e}$?) and the central role of convergence is why continuity is
such an important property.
In this course, we replace real numbers by a more abstract space, which includes
$R$ and $R^n$ as well as other collections like functions. Then we develop the basic
properties of the space and functions on the space.

1.2. Functions and Sets


Definition 1.2.1. Let A, B be two sets and suppose to each element a in A
there is associated one element b in B, which we write as b = f(a). Then f is a function
or map from A into B. We write f : A → B.

Definition 1.2.2. Let A, B be sets and f : A → B. If C ⊂ A (read C is a
subset of A), then we define

$$f(C) = \{f(c) : c \in C\}$$

to be the image of C under f. We call A the domain of f and B the range of f.

Definition 1.2.3. If A, B are sets and f(A) = B, we say f maps A onto B.

Definition 1.2.4. Suppose A, B are sets and f : A → B. If C ⊂ B, then

$$f^{-1}(C) = \{a \in A \mid f(a) \in C\}$$

is the inverse image of C under f. $f^{-1}(C)$ may be the empty set.

Definition 1.2.5. Let A, B be sets and f : A → B. If y ∈ B, then
$f^{-1}(y) = \{a \in A \mid f(a) = y\}$. If for each y ∈ B, $f^{-1}(y)$ consists of at most one
element of A, then f is a 1-1 (one-to-one) map of A into B. This is equivalent to saying that

$$f(a_1) \neq f(a_2) \quad\text{when } a_1, a_2 \in A,\ a_1 \neq a_2.$$
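For finite sets, Definitions 1.2.2 through 1.2.5 translate directly into code. The sketch below uses hypothetical helper names (`image`, `inverse_image`, `is_one_to_one`, `maps_onto`) and the squaring map as an assumed example.

```python
def image(f, C):
    """f(C) = {f(c) : c in C}."""
    return {f(c) for c in C}


def inverse_image(f, A, C):
    """f^{-1}(C) = {a in A : f(a) in C}; may be empty."""
    return {a for a in A if f(a) in C}


def is_one_to_one(f, A):
    """True when distinct elements of A have distinct values under f."""
    values = [f(a) for a in A]
    return len(values) == len(set(values))


def maps_onto(f, A, B):
    """True when f(A) = B."""
    return image(f, A) == set(B)


def square(a):
    return a * a


A = {-2, -1, 0, 1, 2}
B = {0, 1, 2, 3, 4}
print(image(square, {-1, 0, 1}))        # image of a subset of A
print(inverse_image(square, A, {4}))    # two points map to 4
print(inverse_image(square, A, {3}))    # empty inverse image
print(is_one_to_one(square, A), maps_onto(square, A, B))
```

The squaring map on this A is neither 1-1 (both 2 and −2 map to 4) nor onto B (nothing maps to 2 or 3).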

The concept of a function is fundamental in many ways. As an application:

Definition 1.2.6. A sequence in a set A is a function f from the natural
numbers N into A. If $f(n) = a_n$ for n ∈ N, then we write the sequence as

$$\{a_n\}_{n=1}^{\infty} = (a_1, a_2, \dots)$$

The values of f are called the terms or elements of the sequence.

Note: N may be replaced by $N \cup \{0\} = Z_0$ or even Z. Also, the
$a_n$ need not be distinct.
Infinite sets and sequences have some nice properties that make them relatively
easy to use, as we will see. However, we do have to work with more complicated
sets in analysis, and though it is somewhat unfamiliar, the index notation can be
generalized to help out:

Definition 1.2.7. Let A and B be sets and suppose that for each a ∈ A there
is associated a subset of B called $C_a$. The set whose elements are the sets $C_a$ for
a ∈ A is denoted
$$\{C_a \mid a \in A\} = \{C_a\}_{a \in A} = \{C_a\}$$

Example 1.2.8. Let A = [−1, 1] and $C_a = \{x : \sin(x) = a\}$ for a ∈ A. For
example,
$$C_0 = \{\dots, -3\pi, -2\pi, -\pi, 0, \pi, 2\pi, \dots\}$$
has a lot of points in it!

Exercise 1.2.9. What is $\{C_a\}_{a \in [-1,1]}$?

Note: the relation $a \to C_a$ in Def. 1.2.7 is not a function of
one number to one number. We call this a set-valued function.

1.3. Unions and Intersection


Definition 1.3.1. Let A be a set and $\{C_a\}_{a \in A}$ a collection of subsets of A
such that $C_a \subset A$. The union of the sets $\{C_a\}$ is the set S such that s ∈ S if and
only if $s \in C_a$ for at least one a ∈ A. We write
$$S = \bigcup_{a \in A} C_a.$$
If A = N, then we write
$$S = \bigcup_{m=1}^{\infty} C_m = \bigcup_{m} C_m = \bigcup C_m.$$
If A = {1, 2, ..., n}, then we write
$$S = \bigcup_{m=1}^{n} C_m.$$
The intersection of the sets $\{C_a\}$ is the set T such that t ∈ T if and only if $t \in C_a$
for all a ∈ A. We write
$$T = \bigcap_{a \in A} C_a, \qquad \bigcap_{m=1}^{\infty} C_m, \qquad \bigcap_{m=1}^{n} C_m$$
in the same three cases.

Example 1.3.2.
$$\{-1, 2, 5, 8\} \cup \{3, 5, 8, 10\} = \{-1, 2, 3, 5, 8, 10\}$$
$$\{-1, 2, 5, 8\} \cap \{3, 5, 8, 10\} = \{5, 8\}$$
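For a finite index set, the union and intersection of an indexed family can be computed directly. The sketch below represents a family $\{C_a\}_{a \in A}$ as a dictionary from index to set (a representation assumed here for illustration) and reproduces Example 1.3.2.

```python
def union_of(family: dict) -> set:
    """Elements lying in C_a for at least one index a."""
    return set().union(*family.values())


def intersection_of(family: dict) -> set:
    """Elements lying in C_a for every index a (family must be nonempty)."""
    sets = list(family.values())
    if not sets:
        raise ValueError("intersection of an empty family is not defined here")
    return set.intersection(*sets)


family = {1: {-1, 2, 5, 8}, 2: {3, 5, 8, 10}}
print(sorted(union_of(family)))         # [-1, 2, 3, 5, 8, 10]
print(sorted(intersection_of(family)))  # [5, 8]
```

Infinite families such as the one in Example 1.2.8 cannot be handled this way; there the definitions have to be applied by reasoning rather than by enumeration.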
Example 1.3.3. Let A = (0, 1] and $C_a = (0, a)$ for a ∈ A. Then
(1) $C_a \subset C_b$ if and only if $0 < a \le b \le 1$
(2) $\bigcup_{a \in A} C_a = C_1$
(3) $\bigcap_{a \in A} C_a = \emptyset$ (the empty set)
Exercise 1.3.4. What are $\bigcup_{a \in A} C_a$ and $\bigcap_{a \in A} C_a$ in Example 1.2.8?
Definition 1.3.5. If A and B are two sets and $A \cap B$ is not empty, then we
say A and B intersect. Otherwise, they are disjoint.
There are many important properties of $\bigcup$ and $\bigcap$ that you can read on page
28 of Rudin’s “Principles of Mathematical Analysis”.

1.4. Cardinality
Cardinality refers to the “number of points” in a set. “Number of points” is in
quotes because we want to talk about infinite sets as well. Moreover, we want to
distinguish different kinds of infinite sets.
Example 1.4.1. {1, 2, 5, π, −8}, Z, and R turn out to have different cardinalities.
We start the discussion by developing a mechanism for comparing the number
of elements in two sets. We use functions:
Definition 1.4.2. If there is a 1-1 map of a set A onto a set B, then we say that
A and B are in 1-1 correspondence, have the same cardinality or cardinal
number, or are equivalent, and we write A ∼ B.

This relation has the following (obvious) properties:


Theorem 1.4.3. Let A, B, C be sets. Then
(1) A ∼ A (reflexive)
(2) A ∼ B ⇔ B ∼ A (symmetric)
(3) If A ∼ B and B ∼ C, then A ∼ C. (transitive)
Definition 1.4.4. A relation with the properties in Thm. 1.4.3 is called an
equivalence relation.
We now write down the basic classification of cardinality:
Definition 1.4.5. Let A be a set.
(1) A is finite if A ∼ {1, 2, ..., n} for some natural number n or if A is empty.
(2) A is infinite if A is not finite.
(3) A is countable if A ∼ N.
(4) A is uncountable if A is neither finite nor countable.
(5) A is at most countable if A is finite or countable.
Note: some texts use countable to mean at most countable.
Example 1.4.6. Z is countable.

Z:   0,   1,  −1,   2,  −2,   3,  −3, ...
     ↕    ↕    ↕    ↕    ↕    ↕    ↕
N:   1,   2,   3,   4,   5,   6,   7, ...

Here,
$$f(n) = \begin{cases} \dfrac{n}{2} & \text{for } n \text{ even} \\[4pt] -\dfrac{n-1}{2} & \text{for } n \text{ odd} \end{cases}$$
takes N to Z and is 1-1 and onto.
Note: this shows that an infinite set can be equivalent to a proper
subset of itself! This only happens for infinite sets.
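The map in Example 1.4.6 is easy to experiment with. The sketch below implements f and an inverse g (the name and formula for g are supplied here for illustration) and checks on an initial segment that they undo each other.

```python
def f(n: int) -> int:
    """The map N -> Z of Example 1.4.6: n/2 for even n, -(n-1)/2 for odd n."""
    if n < 1:
        raise ValueError("n must be a natural number (n >= 1)")
    return n // 2 if n % 2 == 0 else -((n - 1) // 2)


def g(z: int) -> int:
    """Inverse map Z -> N: positive z goes to 2z, the rest go to 1 - 2z."""
    return 2 * z if z > 0 else 1 - 2 * z


print([f(n) for n in range(1, 8)])  # [0, 1, -1, 2, -2, 3, -3]

# f and g undo each other on an initial segment, as a bijection should.
assert all(g(f(n)) == n for n in range(1, 1001))
assert all(f(g(z)) == z for z in range(-500, 501))
```

Having an explicit two-sided inverse is one standard way to show that a map is both 1-1 and onto.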
Example 1.4.7. Let $\{a_n\}_{n=1}^{\infty}$ be a sequence with distinct terms, and observe
$\{a_{2n}\}_{n=1}^{\infty}$ is a subsequence. Then the function $f : \{a_n\} \to \{a_{2n}\}$ given by
$f(a_m) = a_{2m}$ is a bijection and we see $\{a_n\}$ is equivalent to $\{a_{2n}\}$ as sets.
Lemma 1.4.8. Every countable set can be represented as a sequence.
Proof. Let A be a countable set. This means A ∼ N. So, there exists f :
N → A, a 1-1 and onto map. Let
$$a_1 = f(1), \quad a_2 = f(2), \quad \dots, \quad a_n = f(n), \quad \dots$$
Since f is 1-1 and onto, we have A represented as $\{a_n\}_{n=1}^{\infty}$. □
Note: The representation given in Lemma 1.4.8 is not unique!

Theorem 1.4.9. Every infinite subset of a countable set is countable.


Proof. Let A be countable and suppose C ⊂ A is infinite. Let A be arranged
$$A = (a_1, a_2, \dots)$$
where the $\{a_i\}$ are distinct. Let $n_1 \in N$ be the smallest integer such that $a_{n_1} \in C$.
Having chosen $n_1, n_2, \dots, n_{k-1}$ with k ≥ 2, let $n_k$ be the smallest integer larger than
$n_{k-1}$ such that $a_{n_k} \in C$. We let f : N → C be given by
$$f(k) = a_{n_k}.$$
Notice f is 1-1 and onto, and therefore C is countable. □
Thm. 1.4.9, along with the fact that every infinite set contains a countable
subset, says that countable sets are the “smallest” kind of infinite set in terms of
cardinality. The following result is very significant, though this won’t be readily
apparent until one studies measure theory.
Theorem 1.4.10. Suppose $\{E_n\}_{n=1}^{\infty}$ is a sequence of countable sets. Then
$$S = \bigcup_{n=1}^{\infty} E_n$$
is countable.
Proof. We let $E_n$ be denoted by $\{x_{nk}\}_{k=1}^{\infty} = \{x_{n1}, x_{n2}, \dots\}$. Since S is the
union of all these elements, we can think of it as an infinite array:

$$\begin{array}{ccccc}
x_{1,1} & x_{1,2} & x_{1,3} & x_{1,4} & \dots \\
x_{2,1} & x_{2,2} & x_{2,3} & x_{2,4} & \dots \\
x_{3,1} & x_{3,2} & x_{3,3} & \dots & \\
x_{4,1} & x_{4,2} & x_{4,3} & \dots & \\
\vdots & & & &
\end{array}$$

Figure 1.1

We want to number the elements in S using N. Consider the ordering shown in
Figure 1.1, which is listing

x_1  x_2  x_3  x_4  x_5  x_6  x_7  x_8  x_9  ...
 ↕    ↕    ↕    ↕    ↕    ↕    ↕    ↕    ↕
 1    2    3    4    5    6    7    8    9   ...

We can find a 1-1 and onto map between the elements and N in this way. □
Note: there is an obvious corollary for at most countable unions
of at most countable sets.
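One concrete way to number the entries of an infinite array is to walk along the anti-diagonals. The sketch below is an assumption about the ordering, since the arrows of Figure 1.1 are not reproduced here; any ordering that reaches every entry after finitely many steps serves the same purpose. The names `diagonal_pairs` and `position` are illustrative.

```python
from itertools import count, islice


def diagonal_pairs():
    """Yield every index pair (n, k) with n, k >= 1 exactly once,
    one anti-diagonal n + k = const at a time."""
    for total in count(2):              # total = n + k
        for n in range(1, total):
            yield (n, total - n)


def position(n: int, k: int) -> int:
    """The natural number assigned to x_{n,k} by the walk above."""
    d = n + k - 2                       # number of complete diagonals before this one
    return d * (d + 1) // 2 + n


first = list(islice(diagonal_pairs(), 10))
print(first)
# The closed-form position agrees with the order of enumeration.
assert all(position(n, k) == i for i, (n, k) in enumerate(first, start=1))
```

If the sets $E_n$ share elements, repeated entries have to be skipped while numbering, which still leaves a 1-1 correspondence with N because S is infinite.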

Next we show that the rational numbers, Q, are countable. To do this, we show
the following:
Theorem 1.4.11. Let A be a countable set. Let $B_n$ be the set of n-tuples
$(a_1, \dots, a_n)$, where $a_i \in A$ for 1 ≤ i ≤ n, and the $\{a_i\}$ in an n-tuple need not be
distinct. Then $B_n$ is countable.
Proof. By induction. Since $B_1 = A$, $B_1$ is countable. For the induction step,
assume $B_{n-1}$ is countable where n ∈ {2, 3, ...}. The elements of $B_n$ correspond
to $\{(b, a) \mid b \in B_{n-1},\ a \in A\}$. In particular, for any fixed $b \in B_{n-1}$, the set
$\{(b, a) \mid a \in A\} \sim A$ and hence is countable. Thus, $B_n$ is the countable union of
countable sets. □
Corollary 1.4.12. Q is countable.
Proof. Consider A = Z and n = 2 in Theorem 1.4.11. Then
$Q = \{\frac{a}{b} \mid a, b \in Z,\ b \neq 0\}$, corresponding to a subset of $\{(a, b) \mid a, b \in Z\}$.
We know the latter set is countable, so by Theorem 1.4.9, we have that Q is countable. □
Next, we will prove a fact about the cardinality of R.
Theorem 1.4.13. R is not countable.
Proof. The proof has one tricky point, which is that a real number can have
two decimal expansions. For example, $.2\overline{0} = .1\overline{9}$.
It suffices to show that the set of numbers {x | 0 < x < 1} = (0, 1) is uncountable.
If these numbers were countable, then there would be a sequence $\{s_n\}_{n=1}^{\infty}$
that gives them all. We show this is impossible by constructing a number in (0, 1)
but not in $\{s_n\}_{n=1}^{\infty}$.
We write each $s_n$ as a decimal expansion:
$$s_n = 0.d_{n1} d_{n2} d_{n3} \dots$$
where $d_{ni} \in \{0, 1, 2, \dots, 9\}$ for each i and n. Now define $y = 0.e_1 e_2 e_3 \dots$, where
$$e_m = \begin{cases} 1 & \text{if } d_{mm} \neq 1 \\ 2 & \text{if } d_{mm} = 1. \end{cases}$$
Notice y ∈ (0, 1), but $y \notin \{s_n\}_{n=1}^{\infty}$ (see Example 1.4.14). □
Example 1.4.14. Let
s1 = .123456789...
s2 = .246812461...
s3 = .691284823...
s4 = .444444444...
...
y = .2121...
Now y is different from s1 in the first digit, s2 in the second, s3 in the third, and
so on. Note that situations like $s_n$ = .199999... and y = .2000... cannot occur by
the choice of $\{e_m\}$.
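The construction of y can be carried out mechanically on the first few digits. The sketch below works with finite digit strings (an assumption made so the code terminates; the actual argument uses infinite expansions) and reproduces the value of y in Example 1.4.14. The function name `diagonal_digits` is illustrative.

```python
def diagonal_digits(expansions):
    """Given digit strings for s_1, s_2, ..., return the leading digits e_1 e_2 ...
    of y: e_m is 2 when the m-th digit of s_m is 1, and 1 otherwise."""
    return "".join(
        "2" if s[m] == "1" else "1"
        for m, s in enumerate(expansions)
    )


rows = ["123456789", "246812461", "691284823", "444444444"]
y = diagonal_digits(rows)
print("." + y)  # .2121

# y differs from s_m in the m-th digit, so it equals none of the listed numbers.
assert all(y[m] != rows[m][m] for m in range(len(rows)))
```

Because y uses only the digits 1 and 2, it has a single decimal expansion, which is why the ambiguity between .1999... and .2000... causes no trouble.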
This type of argument is called a Cantor diagonalization argument. It is a
powerful technique, which can be used to prove the following theorem:
Theorem 1.4.15. Let A be the set of all sequences whose elements are the digits
0 and 1. Then A is uncountable.
This theorem also implies that the reals are uncountable by using binary
expansions. It is also important in probability. We can denote an infinite sequence of
coin tosses as a sequence of 0’s and 1’s, and we see that the number of possibilities
is uncountable.
CHAPTER 2

Metric Spaces

The real numbers have properties that make it natural to discuss sequences.
Indeed, the reals are defined in terms of sequences. We would like to create an
abstract notion of a space in which it makes sense to talk about sequences and in
which sequences play as important a role as they do in the real numbers. In
a sense, we would like to abstract the “sequential nature” of the reals, as opposed
to their other properties.
Our abstract space should certainly include the reals as an example. It should
also include $R^n$. So, we have to lose the order properties (that is, properties
pertaining to the relation “<”) in our abstract space. But, $R^n$ itself has special properties
that will not be present in the abstract notion. Therefore, $R^n$ will always be a
special, important example of our abstract notion of a space.

2.1. Quick Review of Rn


Recall that
$$R^n = \{(x_1, x_2, \dots, x_n) \mid x_i \in R,\ 1 \le i \le n\},$$
where we define many properties of the n-tuple $(x_1, x_2, \dots, x_n)$.
There are at least four important structures on $R^n$:
(1) Algebraic Structure
$R^n$ is a vector space over the scalars R with the usual definitions of
addition, subtraction, and scalar multiplication carried out “component
by component” on n-tuples. These definitions satisfy all the properties
expected of a vector space.
(2) Inner Product Spaces
Consideration of how to “multiply” two vectors in $R^n$ leads to the dot
product, or inner product:
$$(x_1, \dots, x_n) \cdot (y_1, \dots, y_n) = x_1 y_1 + \dots + x_n y_n$$
This defines another structure on $R^n$, beginning with the fundamental
notion of orthogonality. Two vectors are orthogonal if their inner product is
zero. Beginning with this definition, we can develop all kinds of properties
of $R^n$.
(3) Length Structure
The inner product on $R^n$ induces a natural length via
$$\|x\| = \sqrt{x \cdot x} = \sqrt{\sum_{i=1}^{n} x_i^2}.$$
$\|\cdot\|$ is called a norm and it has some important properties. For
$\alpha \in R$, $x, y \in R^n$,
(a) $\|x\| \ge 0$ and $\|x\| = 0 \Leftrightarrow x = 0$
(b) $\|\alpha x\| = |\alpha| \, \|x\|$
(c) (Triangle Inequality) $\|x + y\| \le \|x\| + \|y\|$
(d) (Cauchy-Schwarz Inequality) $|x \cdot y| \le \|x\| \, \|y\|$
(4) Distance Structure
Finally, the norm induces a natural distance function. We define the
distance between x and y in $R^n$ as
$$d(x, y) = \|x - y\|$$
This distance function satisfies
(a) $d(x, y) \ge 0$ and $d(x, y) = 0 \Leftrightarrow x = y$
(b) $d(x, y) = d(y, x)$
(c) $d(x, z) \le d(x, y) + d(y, z)$
for $x, y, z \in R^n$.
The distance function is certainly critical for talking about the
convergence of sequences of points in $R^n$: convergence is tied to the notion
of points becoming closer and distance between points becoming smaller.
To define our abstract space, we skip past the other properties of $R^n$
and go right to distance.
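The chain "inner product, then norm, then distance" can be written out in a few lines. This is a minimal sketch with illustrative function names; the listed properties are spot-checked on sample vectors only, with a small tolerance for floating-point rounding.

```python
import math


def dot(x, y):
    """Inner product x . y = x_1 y_1 + ... + x_n y_n."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Length induced by the inner product."""
    return math.sqrt(dot(x, x))


def dist(x, y):
    """Distance induced by the norm."""
    return norm([a - b for a, b in zip(x, y)])


x, y, z = (3.0, 4.0, 0.0), (1.0, -2.0, 2.0), (0.0, 1.0, -1.0)
tol = 1e-12  # allowance for floating-point rounding

assert abs(norm([-2 * a for a in x]) - 2 * norm(x)) < tol            # (3b)
assert norm([a + b for a, b in zip(x, y)]) <= norm(x) + norm(y) + tol  # (3c)
assert abs(dot(x, y)) <= norm(x) * norm(y) + tol                     # (3d)
assert dist(x, y) == dist(y, x)                                      # (4b)
assert dist(x, z) <= dist(x, y) + dist(y, z) + tol                   # (4c)
print(norm(x), dist(x, y))
```

Note how `dist` never refers to `dot` directly: it only needs the norm, which is the sense in which each structure is induced by the previous one.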

2.2. Definition and Initial Examples of a Metric Space


Definition 2.2.1. A set X whose elements are called points is a metric space
if to any two points x, y ∈ X there is associated a real number d(x, y), called the
distance between x and y, such that for x, y, z ∈ X,
(1) d(x, y) > 0 if x ≠ y and
d(x, y) = 0 if x = y
(2) d(x, y) = d(y, x)
(3) d(x, z) ≤ d(x, y) + d(y, z).
d is called the metric on X. Property (3) is called the triangle inequality.
Example 2.2.2. R with d(x, y) = |x − y|
Example 2.2.3. $R^n$ with $d(x, y) = \|x - y\| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
Example 2.2.4. $R^n$ with $d(x, y) = \|x - y\|_1 = \sum_{i=1}^{n} |x_i - y_i|$
($\|\cdot\|_1$ is called the 1-norm.)
Exercise 2.2.5. Verify $\|\cdot\|_1$ is a norm and that this defines a metric.
Example 2.2.6. $R^n$ with $d(x, y) = \|x - y\|_\infty = \max_{1 \le i \le n} |x_i - y_i|$
($\|\cdot\|_\infty$ is called the ∞-norm.)
Exercise 2.2.7. Verify that $\|\cdot\|_\infty$ is a norm and that this induces a metric.
This is an interesting development: One space X might have several different
metrics!
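To make the point that one set can carry several metrics, the sketch below evaluates the three distances of Examples 2.2.3, 2.2.4, and 2.2.6 on the same pair of points. The function names `d2`, `d1`, and `d_inf` are labels chosen here.

```python
import math


def d2(x, y):
    """Euclidean distance (Example 2.2.3)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def d1(x, y):
    """Distance from the 1-norm (Example 2.2.4)."""
    return sum(abs(a - b) for a, b in zip(x, y))


def d_inf(x, y):
    """Distance from the infinity-norm (Example 2.2.6)."""
    return max(abs(a - b) for a, b in zip(x, y))


x, y = (0.0, 0.0), (3.0, 4.0)
print(d2(x, y), d1(x, y), d_inf(x, y))  # 5.0 7.0 4.0

# The same two points are at three different distances from each other.
assert d_inf(x, y) <= d2(x, y) <= d1(x, y)
```

The ordering seen here for this pair, with the ∞-distance smallest and the 1-distance largest, in fact holds for every pair of points in $R^n$.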
Definition 2.2.8. Let (X, d) be a metric space. The ball of radius r centered
at x ∈ X is defined
$$B_r(x) = \{y \in X \mid d(y, x) \le r\}.$$
The unit ball is $B_1(0)$.

Example 2.2.9. It is interesting to plot the unit “balls” in $R^2$ with respect to
the three metrics:

[Figure: the unit balls in $R^2$ for $d(x, y) = \|x - y\|$, $d(x, y) = \|x - y\|_\infty$, and
$d(x, y) = \|x - y\|_1$, with the points (1,0), (0,1), (−1,0), and (0,−1) marked.]

Exercise 2.2.10. Reproduce this picture.

Example 2.2.11. Let S be a sphere in $R^3$. We define a metric on its surface by
setting d(x, y) to be the shortest distance along the surface of S between x and y
(which are on the surface). The shortest path along the surface is called a geodesic.

[Figure: the distance d(x, y) between two points on the surface of the sphere.]

Showing this is a metric is hard!

Definition 2.2.12. For any set X, we define the discrete metric by
$$d(x, y) = \begin{cases} 1 & \text{if } x \neq y \\ 0 & \text{if } x = y. \end{cases}$$

Exercise 2.2.13. Verify that the function given in Def. 2.2.12 is a metric.
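The discrete metric makes sense on any set at all, which the following sketch illustrates with a mixed collection of objects. The checker `satisfies_metric_axioms` is a hypothetical helper that tests the three properties of Definition 2.2.1 by brute force on a finite sample; passing it is evidence, not a proof, so Exercise 2.2.13 still has to be done by hand.

```python
from itertools import product


def discrete_metric(x, y):
    """Definition 2.2.12: distance 1 between distinct points, 0 otherwise."""
    return 0 if x == y else 1


def satisfies_metric_axioms(d, points):
    """Brute-force check of properties (1)-(3) of Definition 2.2.1 on a finite sample."""
    pts = list(points)
    for x, y in product(pts, repeat=2):
        if x == y and d(x, y) != 0:
            return False
        if x != y and not d(x, y) > 0:
            return False
        if d(x, y) != d(y, x):
            return False
    return all(
        d(x, z) <= d(x, y) + d(y, z)
        for x, y, z in product(pts, repeat=3)
    )


sample = ["a", "b", "c", 1, 2.5]
print(satisfies_metric_axioms(discrete_metric, sample))

# For contrast: squared difference on numbers fails the triangle inequality.
print(satisfies_metric_axioms(lambda x, y: (x - y) ** 2, [0, 1, 2]))
```

The second call shows the checker rejecting a natural-looking candidate: with the squared difference, going from 0 to 2 directly costs 4 while going through 1 costs only 2.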

Definition 2.2.14. Consider the set of sequences $\{(x_1, x_2, x_3, \dots) \mid x_i \in R\}$.
The Hilbert Space $l^2$ is the subset of such sequences for which the terms are
“finitely square summable”, i.e.,
$$l^2 = \Big\{(x_1, x_2, x_3, \dots) \;\Big|\; x_i \in R \text{ for } 1 \le i \text{ and } \sum_{i=1}^{\infty} x_i^2 < \infty\Big\}$$
Notice that the terms in $\sum_{i=1}^{\infty} x_i^2$ are nonnegative. This implies, among other
things, that
$$\lim_{i\to\infty} x_i = 0 \quad\text{and}\quad \sum_{i=1}^{\infty} x_i^2 = \lim_{N\to\infty} \sum_{i=1}^{N} x_i^2.$$
In view of the latter fact, we can think of $l^2$ as something like $(R^n, \|\cdot\|)$ where
“n → ∞”. (Recall the usual definitions of +, etc. for sequences.)
Theorem 2.2.15. With $d(x, y) = \sqrt{\sum_{i=1}^{\infty} (x_i - y_i)^2}$ for $x, y \in l^2$, $(l^2, d)$ is a
metric space.
Note: Actually, $\|x\| = \sqrt{\sum_{i=1}^{\infty} x_i^2}$ defines a norm on $l^2$, and
$d(x, y) = \|x - y\|$. However, we do not pursue that.

Proof. The proof is not so hard, except that we have to be careful to treat
the infinite sequences in a mathematically correct way. For example, to verify (1)
of Defn. 2.2.1, we want to show that d(x, y) ≥ 0 and d(x, y) = 0 ⇔ x = y (two
sequences are equal if and only if all the elements are equal) for $x, y \in l^2$.
However, we first have to show that d(x, y) is defined and finite! For this we
require the following basic inequality:
Lemma 2.2.16. Let a, b ∈ R. Then $2|a||b| \le a^2 + b^2$.
Proof: $0 \le (|a| - |b|)^2 = |a|^2 - 2|a||b| + |b|^2$.
Now, for $x, y \in l^2$,
$$\Big\{ \sum_{i=1}^{N} (x_i - y_i)^2 \Big\}_{N=1}^{\infty}$$
is a monotone, nondecreasing sequence. If we prove the terms are bounded above
uniformly for all N, then it has a limit, i.e.,
$$\lim_{N\to\infty} \sum_{i=1}^{N} (x_i - y_i)^2$$
is defined and is finite. By Lemma 2.2.16,
$(x_i - y_i)^2 \le x_i^2 + 2|x_i||y_i| + y_i^2 \le 2x_i^2 + 2y_i^2$, so for any N, we have
$$\sum_{i=1}^{N} (x_i - y_i)^2 \le 2\sum_{i=1}^{N} x_i^2 + 2\sum_{i=1}^{N} y_i^2 \le 2\sum_{i=1}^{\infty} x_i^2 + 2\sum_{i=1}^{\infty} y_i^2,$$
where the sum on the right is finite by assumption.
We conclude that if $x, y \in l^2$, then d(x, y) is well-defined and finite.
Clearly,
$$d(x, y) = \lim_{N\to\infty} \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2} \ge 0.$$
If x = y, then d(x, y) = 0. Vice versa, if d(x, y) = 0, then
$$\lim_{N\to\infty} \sum_{i=1}^{N} (x_i - y_i)^2 = 0.$$
But the terms of $\{\sum_{i=1}^{N} (x_i - y_i)^2\}_{N=1}^{\infty}$ are nonnegative and nondecreasing, hence
$$\sum_{i=1}^{N} (x_i - y_i)^2 = 0$$
for all N. This implies $x_i = y_i$ for all i.
For property (2), it is clear that d(x, y) = d(y, x).
Finally, for property (3), suppose $x, y, z \in l^2$. By the properties of the standard
Euclidean distance on $R^N$,
$$\sqrt{\sum_{i=1}^{N} (x_i - z_i)^2} \le \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2} + \sqrt{\sum_{i=1}^{N} (y_i - z_i)^2}$$
for all N. Letting N → ∞, we see
$$d(x, z) \le d(x, y) + d(y, z).$$
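The two facts that make d well-defined on $l^2$, monotonicity of the partial sums and a uniform upper bound, can be observed numerically. The sketch below uses two sample sequences chosen here for illustration, $x_i = 1/i$ and $y_i = 2^{-i}$, for which $\sum x_i^2 = \pi^2/6$ and $\sum y_i^2 = 1/3$.

```python
import math


def x(i: int) -> float:
    """Sample element of l^2: x_i = 1/i."""
    return 1.0 / i


def y(i: int) -> float:
    """Sample element of l^2: y_i = 2**(-i)."""
    return 0.5 ** i


def partial_sums(N: int) -> list:
    """Return the sums of (x_i - y_i)**2 over i = 1..n, for n = 1..N."""
    total, out = 0.0, []
    for i in range(1, N + 1):
        total += (x(i) - y(i)) ** 2
        out.append(total)
    return out


# Upper bound from the proof: 2 * sum(x_i^2) + 2 * sum(y_i^2).
bound = 2 * (math.pi ** 2 / 6) + 2 * (1.0 / 3.0)

sums = partial_sums(2000)
assert all(a <= b for a, b in zip(sums, sums[1:]))  # nondecreasing
assert sums[-1] <= bound                            # bounded above
print(math.sqrt(sums[-1]), math.sqrt(bound))
```

The square root of the last partial sum approximates d(x, y) from below; the bound from the proof is far from sharp, but it only needs to be finite.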

An interesting fact about $l^2$ is that it must be “infinite dimensional” (whatever
that means!). Another infinite dimensional example:
Definition 2.2.17. We let C([a, b]) denote the vector space of continuous
functions on the closed interval [a, b].
Why do I claim this is infinite dimensional? Recall we can write a continuous
function on [a, b] in terms of a unique Fourier series, i.e. as a linear combination of
the orthonormal Fourier basis functions derived from sin and cos.
Definition 2.2.18. The maximum norm or sup norm on C([a, b]) is defined
as
$$\|f\|_\infty = \max_{a \le x \le b} |f(x)|.$$
(Actually, we should use
$$\sup_{a \le x \le b} |f(x)|,$$
but in the case of continuous functions, these are equivalent. We
will discuss the difference between sup and max later.)
Theorem 2.2.19. C([a, b]) with $d(f, g) = d_\infty(f, g) = \|f - g\|_\infty$ is a metric
space.
Proof. First, we recall that a continuous function on a closed, bounded
interval has a maximum value and a minimum value. Hence, if f ∈ C([a, b]), then
$\|f\|_\infty$ is well defined, and likewise if f, g ∈ C([a, b]), then |f(x) − g(x)| is
continuous on [a, b], and d(f, g) is well-defined. Clearly, d(f, g) ≥ 0 and d(f, g) = 0 ⇔
|f(x) − g(x)| = 0 for all x, or f(x) = g(x). Also, d(f, g) = d(g, f). Finally, if
f, g, h ∈ C([a, b]), then for any a ≤ x ≤ b,
$$|f(x) - h(x)| \le |f(x) - g(x)| + |g(x) - h(x)|.$$
So,
$$\max_{a \le x \le b} |f(x) - h(x)| \le \max_{a \le x \le b} \big(|f(x) - g(x)| + |g(x) - h(x)|\big)
\le \max_{a \le x \le b} |f(x) - g(x)| + \max_{a \le x \le b} |g(x) - h(x)|,$$
which shows property (3). □
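The metric $d_\infty$ can be estimated numerically by sampling. The sketch below is an approximation under a stated assumption: it takes the maximum over a uniform grid, which can only underestimate the true maximum over [a, b]. The function name `sup_distance` and the default grid size are illustrative choices.

```python
import math


def sup_distance(f, g, a, b, samples=10001):
    """Approximate max |f(x) - g(x)| over [a, b] on a uniform grid.
    The grid value is a lower bound for the true maximum."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + k * step) - g(a + k * step)) for k in range(samples))


def zero(t):
    return 0.0


a, b = 0.0, math.pi
d_fg = sup_distance(math.sin, math.cos, a, b)
d_gh = sup_distance(math.cos, zero, a, b)
d_fh = sup_distance(math.sin, zero, a, b)
print(d_fg, d_gh, d_fh)

# Triangle inequality for the sampled distances (small float tolerance).
assert d_fh <= d_fg + d_gh + 1e-12
```

On a fixed grid the triangle inequality holds for the same reason as in the proof: it holds at every sample point, and the maximum of a sum is at most the sum of the maxima.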

Exercise 2.2.20. Given f, g ∈ C([a, b]), we have
$$\max_{a \le x \le b} \big(|f(x)| + |g(x)|\big) \le \max_{a \le x \le b} |f(x)| + \max_{a \le x \le b} |g(x)|.$$
Construct an example that gives strict inequality in this bound.

2.3. Components of a Metric Space


Let (X, d) be a metric space and A ⊂ X a subset.
Definition 2.3.1. A neighborhood of a point x ∈ X is the set $N_r(x)$ of all
points y ∈ X with d(x, y) < r.
Example 2.3.2. In $R^n$, $N_r(x)$ is the open ball of radius r centered at x. In the
discrete metric, $N_r(x) = \{x\}$ when r ≤ 1 and $N_r(x) = X$ when r > 1.
Definition 2.3.3. A point x is a limit point of A if every neighborhood of x
contains a point y ≠ x in A.
Example 2.3.4. Suppose a, b, c ∈ (R, | · |) with a < b < c. Then any x ∈ (a, b)
is a limit point of (a, b). a and b are limit points of (a, b). c is not a limit point of
(a, b).
Definition 2.3.5. If x ∈ A and x is not a limit point of A, then x is an
isolated point of A.
Example 2.3.6. In Example 2.3.4, if $A = (a, b) \cup \{c\}$, then c is an isolated point
of A.
Definition 2.3.7. A is closed if every limit point of A belongs to A.
Example 2.3.8. In (R, | · |), [a, b] is closed while (a, b), (a, b], and [a, b) are not.
Definition 2.3.9. A point x ∈ A is an interior point if there is a
neighborhood $N_r(x) \subset A$ (for some r, not all r!).
Example 2.3.10. In (R, | · |), any x ∈ (a, b) is an interior point of (a, b), but a
and b are not.
Definition 2.3.11. A is open if every point of A is an interior point.
Example 2.3.12. In (R, | · |), (a, b) is open, but [a, b], [a, b), and (a, b] are not.
Definition 2.3.13. The complement $A^c$ of A is the set of points in X but
not in A. This is also written $A^c = X \setminus A$.
Before we define further components, we develop some basic connections
between neighborhoods, open sets, and closed sets.
Theorem 2.3.14. Any neighborhood is open.
[Figure 2.1: the neighborhoods $N_{r_1}(x)$ and $N_{r_2}(x)$ of a point x, drawn inside open sets G.]

Proof. Let $N_r(x)$ be a neighborhood of x ∈ X and choose $y \in N_r(x)$. Set
ρ = r − d(x, y) > 0. For all z such that d(y, z) < ρ, we have
$$d(x, z) \le d(x, y) + d(y, z) < r - \rho + \rho = r,$$
so $z \in N_r(x)$. In other words, $N_\rho(y) \subset N_r(x)$. So, y is an interior point of
$N_r(x)$. □
Theorem 2.3.15.
(1) Let $\{A_\alpha\}$ be a finite or infinite collection of sets $A_\alpha \subset X$. Then
$$\Big(\bigcup_\alpha A_\alpha\Big)^c = \bigcap_\alpha A_\alpha^c.$$
(2) A set A ⊂ X is open if and only if $A^c$ is closed.
(3) If $\{G_\alpha\}$ is a collection of open sets, then $\bigcup_\alpha G_\alpha$ is open.
(4) If $\{F_\alpha\}$ is a collection of closed sets, then $\bigcap_\alpha F_\alpha$ is closed.
(5) If $\{G_1, \dots, G_n\}$ is a finite collection of open sets, then $\bigcap_{i=1}^{n} G_i$ is open.
(6) If $\{F_1, \dots, F_n\}$ is a finite collection of closed sets, then $\bigcup_{i=1}^{n} F_i$ is closed.
Proof.
(1) If $x \in (\bigcup_\alpha A_\alpha)^c$, then $x \notin \bigcup_\alpha A_\alpha$, i.e., $x \notin A_\alpha$ for any α. So, $x \in A_\alpha^c$ for
all α. Hence, $(\bigcup_\alpha A_\alpha)^c \subset \bigcap_\alpha A_\alpha^c$. The reverse relation is very similar to
prove.
Exercise 2.3.16. Prove $\bigcap_\alpha A_\alpha^c \subset (\bigcup_\alpha A_\alpha)^c$.
(2) Suppose $A^c$ is closed and x ∈ A. Then $x \notin A^c$ and so x is not a limit point
of $A^c$. This implies there is a neighborhood N of x such that $A^c \cap N$ is
empty. So, N ⊂ A. Hence, x is an interior point of A and A is open.
If A is open and x is a limit point of $A^c$, then every neighborhood of
x contains a point of $A^c$, so x is not an interior point of A. Since A is
open, $x \in A^c$. So, $A^c$ is closed.
(3) Set $G = \bigcup_\alpha G_\alpha$. If x ∈ G, then $x \in G_\alpha$ for some α. x is an interior point
of $G_\alpha$, which implies x is an interior point of G, and so G is open.
(4) Now, $(\bigcap_\alpha F_\alpha)^c = \bigcup_\alpha F_\alpha^c$, and $F_\alpha^c$ is open for all α. So, $(\bigcap_\alpha F_\alpha)^c$ is open
and $\bigcap_\alpha F_\alpha$ is closed.
(5) Set $G = \bigcap_{i=1}^{n} G_i$. For x ∈ G, there are neighborhoods $N_{r_i}(x) \subset G_i$,
i = 1, ..., n, for some radii $\{r_i\}_{i=1}^{n}$. Set $r = \min\{r_1, \dots, r_n\}$, so $N_r(x) \subset G_i$,
1 ≤ i ≤ n. Then $N_r(x) \subset G$, so G is open. (See Figure 2.1.)
(6) Exercise 2.3.17. Use complements in (5) to prove this. □


Finiteness of the collection is essential in (5) and (6) of Thm. 2.3.15.
Example 2.3.18. If $G_n = (-\frac{1}{n}, \frac{1}{n})$ in (R, | · |), then $\bigcap_{n=1}^{\infty} G_n = \{0\}$ is not open.
Exercise 2.3.19. Construct an analogous example for (6).
Definition 2.3.20. If A ⊂ X, where (X, d) is a metric space, then let $A'$ be
the set of limit points of A. The closure of A is $\overline{A} = A \cup A'$.
Example 2.3.21. In (R, | · |), $\overline{(a, b)} = [a, b]$.
Theorem 2.3.22. Let A ⊂ X, where (X, d) is a metric space. Then
(1) $\overline{A}$ is closed.
(2) $A = \overline{A}$ ⇔ A is closed.
(3) $\overline{A} \subset F$ for every closed set F ⊂ X such that A ⊂ F.
(3) says that $\overline{A}$ is the “smallest” closed set in X that contains
A.
Proof.
(1) If x ∈ X and $x \notin \overline{A}$, then x is not in A and is not a limit point of A.
Hence, x has a neighborhood N in X that does not intersect A. Since N is open
(Thm. 2.3.14), no point of N is a limit point of A, so N does not intersect $\overline{A}$
either. So, $(\overline{A})^c$ is open, and $\overline{A}$ is closed by Thm. 2.3.15 (2).
(2) If $A = \overline{A}$, then (1) implies A is closed. If A is closed, then $A' \subset A$ and
$\overline{A} = A \cup A' = A$.
(3) If F is closed and A ⊂ F, then since $F \supset F' \supset A'$, we have $F \supset \overline{A}$. □
This has an important consequence in the case of (R, | · |).

Theorem 2.3.23. Let A be a nonempty set of real numbers that is bounded
above. Let y = sup A. Then, $y \in \overline{A}$, and y ∈ A if A is closed.
(We say that bounded closed sets of real numbers have maximum
values.)
Proof. If y ∈ A, then $y \in \overline{A}$. Assume y ∉ A. For r > 0, there is an x ∈ A
such that y − r < x < y (otherwise y − r would be an upper bound for A). So, y is
a limit point of A! Hence, $y \in \overline{A}$. □
In general, boundedness is often important.
Definition 2.3.24. A set A ⊂ X, where (X, d) is a metric space, is bounded
if there is a point y ∈ X and a number M such that
d(x, y) < M
for all x ∈ A. In other words, A ⊂ NM (y).
Example 2.3.25. Rn with d2 , d1 , or d∞ is not bounded, but Br (x) in any of
these metric spaces is bounded.
We next characterize limit points in terms of sequences. First, a preliminary
result:

Theorem 2.3.26. Let A ⊂ X, where (X, d) is a metric space. Then x ∈ X is


a limit point of A if and only if every neighborhood of x contains infinitely many
points of A. In particular, a finite set of points has no limit points.
Proof. The "if" direction is just the definition. Suppose x has a neighborhood
N that contains only a finite number of points of A. Call the points of
N ∩ A that are distinct from x {y_1, ..., y_n} and set
r = min_{1≤m≤n} d(x, y_m) > 0.
Now N_r(x) contains no point z of A such that z ≠ x. Hence, x cannot be a
limit point of A. □
Question: Why can we use ”min” instead of ”inf”, and why is
this important?
Definition 2.3.27. A sequence of points in a metric space (X, d) is an ordered
set of points {x1 , x2 , x3 , ...} with xi ∈ X for all i. If xi ∈ A ⊂ X for all i, then we
say the sequence is in A and write {xn } ∈ A.
Definition 2.3.28. A sequence {x_n}_{n=1}^∞ in a metric space (X, d) converges to
the limit x ∈ X, written
lim_{n→∞} x_n = x or x_n → x,
if for every ε > 0 there exists an N such that d(x_n, x) < ε for n ≥ N.
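As a small illustration of this definition (a sketch, not part of the notes), consider the sequence x_n = 1/n in (R, |·|) with limit 0. The helper name `threshold_index` is hypothetical; it returns one valid choice of N for a given ε, using exact rational arithmetic.

```python
import math
from fractions import Fraction


def threshold_index(eps: Fraction) -> int:
    """For x_n = 1/n and limit 0, return the smallest N such that
    d(x_n, 0) = 1/n < eps for every n >= N."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    # 1/n < eps  <=>  n > 1/eps, so the first admissible index is floor(1/eps) + 1.
    return math.floor(1 / eps) + 1


eps = Fraction(1, 10)
N = threshold_index(eps)
# Every term from index N onward lies within eps of the limit.
print(N, all(Fraction(1, n) < eps for n in range(N, N + 1000)))
```

Note that N depends on ε: a smaller ε generally forces a larger N.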
The characterizations we seek are
Theorem 2.3.29. Let A ⊂ X, where (X, d) is a metric space. Then
(1) A is closed if and only if every convergent sequence in A has its limit in
A.
(2) x is a limit point of A if and only if there is a sequence of distinct points
{xn } in A such that xn → x.
Proof. (1) Suppose A is closed, {x_n} ∈ A, and x_n → x. If B = {x_n}
is infinite, then every neighborhood of x contains an infinite number of
points of B, hence of A. So, x is a limit point of A and x ∈ A. If B is
finite, then some value is repeated infinitely often in the sequence, and x
must equal that value, so again x ∈ A.
For the converse, suppose every convergent sequence in A has its limit in
A, and let x be a limit point of A. Choose x_1 ∈ N_1(x) ∩ A, x_1 ≠ x.
Suppose x_1, ..., x_n have been chosen to be distinct points in A such
that
x_i ∈ N_{1/i}(x) ∩ A
for 1 ≤ i ≤ n. We choose x_{n+1} ∈ N_{1/(n+1)}(x) ∩ A such that x_{n+1} ≠
x_1, x_2, ..., x_n. Then, {x_n} is a sequence of distinct points in A such that
x_n → x. By assumption, x ∈ A, so A is closed.
(2) We actually proved the “only if” above in the proof of (1). If {xn } is a
sequence of distinct points in A such that xn → x, then every neighbor-
hood of x contains an infinite number of the xn ’s, so x is a limit point of
A.

There are a couple of definitions that will be useful later on:
Definition 2.3.30. If (X, d) is a metric space, A ⊂ X is dense in X if every
point in X is a limit point of A or is in A or both.

Example 2.3.31. Q is dense in R.


Definition 2.3.32. A subset A ⊂ X of a metric space (X, d) is perfect if it
is closed and every point in A is a limit point of A.
Example 2.3.33. In (R, |·|), let a < b < c. Then [a, b] is perfect, but [a, b] ∪ {c}
is closed and not perfect.
Finally, we address a subtle point about openness that is one reason that we
do not use open/closed sets as the fundamental tool in our investigations.
Example 2.3.34. Consider (0, 1) ⊂ (R, |·|), which is open. (0, 1) × {0} is not
open in (R^2, ‖·‖). Note that (0, 1) × {0} ⊂ R^1 ⊂ R^2 and the metric induced by |·|
in R^1 is the same as restricting the metric induced by ‖·‖ in R^2 to R^1.
Definition 2.3.35. Suppose (X, d) is a metric space and Y ⊂ X is a subset.
Then (Y, d) is also a metric space which we say is induced by the metric on X.
Now, returning to Ex. 2.3.34:
Definition 2.3.36. Suppose (X, d) is a metric space and A ⊂ Y ⊂ X. A is
open relative to Y if to each x ∈ A there is an r > 0 such that y ∈ A when d(x, y) < r
and y ∈ Y. In other words, N_r(x) ∩ Y ⊂ A.
Example 2.3.37. In Ex. 2.3.34, (0, 1) is open relative to R1 ⊂ R2 , but is not
open in R2 .
In the situation in Def. 2.3.36, it is natural to ask for the relations between
being open relative to Y and openness in X.
Theorem 2.3.38. Let (X, d) be a metric space and Y ⊂ X. A ⊂ Y is open
relative to Y if and only if A = Y ∩ G for some open set G ⊂ X.
Proof. If A is open relative to Y, then to each x ∈ A there is an r_x > 0 such
that d(x, y) < r_x and y ∈ Y ⇒ y ∈ A.
Let B_x = {y ∈ X | d(y, x) < r_x} and set G = ⋃_{x∈A} B_x. Since B_x is open
for every x, G is open. Moreover, x ∈ B_x for all x ∈ A, so A ⊂ G ∩ Y. Also,
B_x ∩ Y ⊂ A for all x ∈ A, so G ∩ Y ⊂ A.
Now, if G is open in X and A = G ∩ Y, then every x ∈ A has a neighborhood
B_x ⊂ G. Then B_x ∩ Y ⊂ A, so A is open relative to Y. □
CHAPTER 3

Compactness

One frequent goal of analysis is to generalize a property of some function that


holds at each point, or small part of a set, so that it holds uniformly over the entire
set at once. We might say that we are trying to draw a global conclusion from local
information. It turns out that the properties of the underlying set are critical. This
might seem a little abstract, but in fact you are very familiar with one example
covered in calculus.

3.1. Compactness Arguments and Uniform Boundedness


Definition 3.1.1. A function f : R → R is bounded or locally bounded at
each point in a set A ⊂ R if for each x ∈ A there are constants δx , Mx such that
|f (y)| ≤ Mx for x − δx ≤ y ≤ x + δx .
In this definition, the subscript “x” on δx and Mx is usually left off. We include
it to emphasize that each x may require different values of δx and Mx . Contrast
this with
Definition 3.1.2. A function f : R → R is uniformly bounded on a set
A ⊂ R if there is a constant M such that |f(y)| ≤ M for all y ∈ A.
In this case, M may depend on the set A, but does not vary with each choice
of point in A.
A basic problem that arises is:
Given a function f that is locally bounded on a set A, can we
conclude that f is uniformly bounded on A?
The answer depends on properties of A.
Example 3.1.3. f(x) = 1/x is locally bounded on (0, 1). For x ∈ (0, 1), set
δ_x = x/2 and M_x = 2/x. But, 1/x is not uniformly bounded on (0, 1).
What goes wrong? (0, 1) has the limit point 0 (where 1/x is undefined), but
0 ∉ (0, 1) and we have no assumption of 1/x being bounded at 0. We could avoid
this by assuming the set is closed.
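The choice δ_x = x/2, M_x = 2/x in Example 3.1.3 can be checked numerically. The following is only a sketch under that assumption; the function names are hypothetical and the check samples finitely many points of [x − δ_x, x + δ_x].

```python
from fractions import Fraction


def local_bound(x: Fraction):
    """Local data for f(y) = 1/y at a point x in (0, 1):
    delta_x = x/2 and M_x = 2/x."""
    return x / 2, 2 / x


def check_local_bound(x: Fraction, samples: int = 50) -> bool:
    """Check |1/y| <= M_x at sample points y in [x - delta_x, x + delta_x]."""
    delta, M = local_bound(x)
    ys = [x - delta + 2 * delta * Fraction(k, samples) for k in range(samples + 1)]
    return all(abs(1 / y) <= M for y in ys)


# The local bound holds at each x, but M_x = 2/x grows without bound as x -> 0,
# so no single M works on all of (0, 1).
print(check_local_bound(Fraction(1, 3)), local_bound(Fraction(1, 1000))[1])
```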
Example 3.1.4. 1/x is locally bounded on [ε, 1] for any 1 > ε > 0 and is also
uniformly bounded on [ε, 1] for 1 > ε > 0. Choose M = 1/ε.
Example 3.1.5. f (x) = x is locally bounded on [0, ∞) but is not uniformly
bounded on [0, ∞). For local boundedness, take δx = 1 and Mx = x+1 .
What goes wrong? [0, ∞) is closed and contains its limit points, but it is too
“big”, allowing x to grow without bound. We can avoid this by assuming the set
itself is bounded.

Example 3.1.6. f (x) = x is uniformly bounded on [0, N ] for any N > 0.


Choose M = N .
This suggests the following theorem:
Theorem 3.1.7. Suppose f : R → R is locally bounded on a set A ⊂ R. If A
is closed and bounded, then f is uniformly bounded on A.
We present several proofs of this theorem, all of which are called “compactness
arguments”. Each argument rests on formulating a property of A (“compactness”)
that follows from the assumption of A being closed and bounded. These properties
generalize to abstract metric spaces, and this will be our lead into compactness.
I want to emphasize that assuming a set of real numbers is closed and bounded
is somewhat natural, but this may or may not be natural in an abstract metric
space. Nevertheless, the properties that derive from being closed and bounded are
natural.
The first property is:
Definition 3.1.8. A set A ⊂ R has the Bolzano-Weierstrass property if
every sequence of points in A has a subsequence that converges to a point in A.
Example 3.1.9. Suppose A ⊂ R is finite. Then A has the Bolzano-Weierstrass
property. This is because {a_n} ∈ A implies there exists x ∈ A such that B :=
{i | a_i = x} is infinite. Let B = {b_1, b_2, ...} with i > j ⇒ b_i > b_j, and we see
(a_{b_1}, a_{b_2}, ...) is a subsequence of {a_n} that converges to x.
Example 3.1.10. N does not have the Bolzano-Weierstrass property. Take
the sequence {a_n}_{n=1}^∞ where a_n = n.

Exercise 3.1.11. Does {1/a | a ∈ Z, a ≠ 0} have the Bolzano-Weierstrass
property? Justify your answer.
The first proof of Theorem 3.1.7 is based on the following theorem:
Theorem 3.1.12. A set A ⊂ R is closed and bounded if and only if it has the
Bolzano-Weierstrass property.
Now, we are not going to prove this property here. It will follow from our
more general results later. What we will do is show how to use this to prove
Theorem 3.1.7.
Proof. Suppose f is not uniformly bounded on A. Then for each n ∈ N there exists
x_n ∈ A such that
|f(x_n)| > n.
By Theorem 3.1.12, the sequence {x_n} has a subsequence {x_{n_k}}_{k=1}^∞ that converges
to some y ∈ A. Since f is locally bounded at y, there are δ_y > 0 and M_y such that
|f(z)| ≤ M_y for |z − y| ≤ δ_y. There exists N_0 such that k > N_0 implies
|x_{n_k} − y| < δ_y, and hence |f(x_{n_k})| ≤ M_y. But if in addition n_k > M_y, then
|f(x_{n_k})| > n_k > M_y. This is a contradiction. □
The second property uses the notion of a covering:
Definition 3.1.13. Let A ⊂ R and O be a set of open intervals in R. If for
every x ∈ A there is at least one interval I ∈ O such that x ∈ I, then O is an open
cover of A.

Example 3.1.14. A = (a, b) ∪ {c}.
[Figure: an open cover of A, with the points a, b, c marked on the real line.]

Example 3.1.15. Since Q is countable, we can write Q = {a1 , a2 , a3 , ...}. Let


O = {(a_n − ε, a_n + ε)}. Then O is an open cover of Q for any ε > 0.
Definition 3.1.16. A set A ⊂ R has the Heine-Borel property if every
open cover of A can be reduced to an open cover with a finite number of sets,
i.e., if O is an open cover of A, then there are intervals I_1, ..., I_n ∈ O such that
A ⊂ I_1 ∪ ... ∪ I_n.
Example 3.1.17. Every finite set has the Heine-Borel Property.
Example 3.1.18. N does not have the Heine-Borel property. Take the cover
{(n − ε, n + ε) | n ∈ N} for some small ε > 0.
Example 3.1.19. A = {1/n | n ∈ N} does not have the Heine-Borel property.
Exercise 3.1.20. Prove the statement made in Example 3.1.19.
Now we will observe one more tool that we need in order to explore the second
proof of Theorem 3.1.7.
Theorem 3.1.21. A set A ⊂ R has the Heine-Borel property if and only if A
is closed and bounded.
Again this follows from our more general result later. We just use this result
right now.
Proof. Of Theorem 3.1.7.
Since f is locally bounded on A, for each x ∈ A, there is an open interval I_x ∋ x
and a number M_x such that
|f(y)| ≤ M_x
for y ∈ I_x ∩ A. Let O = {I_x | x ∈ A}.
O is an open cover of A, so by Theorem 3.1.21 there are I_{x_1}, I_{x_2}, ..., I_{x_n} such that
A ⊂ I_{x_1} ∪ ... ∪ I_{x_n}.
Let M = max{M_{x_1}, ..., M_{x_n}} (Why is having a finite number important?). Choose
x ∈ A, so x ∈ I_{x_i} for some 1 ≤ i ≤ n. Hence,
|f(x)| ≤ M_{x_i} ≤ M.

The next property uses the idea of a descending sequence of sets.
Definition 3.1.22. A sequence of sets {A_n}_{n=1}^∞ of R is descending if
A_1 ⊃ A_2 ⊃ A_3 ⊃ ....
Cantor asked: Under what conditions do we have ⋂_{n=1}^∞ A_n ≠ ∅?

Definition 3.1.23. A descending sequence of sets {A_n}_{n=1}^∞ in R has the
Cantor Intersection property if ⋂_{n=1}^∞ A_n ≠ ∅.
Example 3.1.24. Let A_n = (4 − 1/n, 5 + 1/n) for n ∈ N. Then ⋂_{n=1}^∞ A_n = [4, 5].
Example 3.1.25. Let A_n = [−1/n, 1/n] for n ∈ N. Then ⋂_{n=1}^∞ A_n = {0}.
Example 3.1.26. Let A_n = (0, 1/n) ⊂ R. Then {A_n} is descending, but
⋂_{n=1}^∞ A_n = ∅. To see this, notice 0 ∉ A_n for any n. Also, if x ≠ 0, then there exists
m ∈ N such that 1/m < |x|. Then x ∉ A_m, so x ∉ ⋂_{n=1}^∞ A_n. So, {A_n}_{n=1}^∞ does not
have the Cantor Intersection property.
Example 3.1.27. Let A_n = [n, ∞) ⊂ R for n ∈ N. Then {A_n} is descending
but ⋂_{n=1}^∞ A_n = ∅.
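The argument in Example 3.1.26 is constructive: for each x > 0 one can name an index m with x ∉ A_m. A minimal sketch of that step, with hypothetical helper names and exact rationals:

```python
import math
from fractions import Fraction


def in_A(x: Fraction, n: int) -> bool:
    """Membership test for A_n = (0, 1/n)."""
    return 0 < x < Fraction(1, n)


def excluding_index(x: Fraction) -> int:
    """For x > 0, return an m with 1/m < x, so that x is not in A_m."""
    if x <= 0:
        raise ValueError("x must be positive")
    # m > 1/x is equivalent to 1/m < x.
    return math.floor(1 / x) + 1


x = Fraction(3, 1000)
m = excluding_index(x)
print(m, in_A(x, m))
```

Since every positive x is excluded by some A_m, and nonpositive x lies in no A_n at all, the intersection is empty.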

In these two examples, the sets in the sequence are not closed and not bounded,
respectively. Cantor proved
Theorem 3.1.28. Let {An } be a descending sequence of nonempty, closed, and
bounded subsets of R. Then
⋂_{n=1}^∞ A_n ≠ ∅.

Again, we present a more general result later. For now, we use this property
for our third proof of Theorem 3.1.7.
Proof. Suppose f is not uniformly bounded on A. Since A is bounded, A ⊂
[a, b] for some a, b. Divide [a, b] into two sub-intervals of length (b − a)/2. f must be
unbounded on the intersection of A with at least one of these sub-intervals. Call
this sub-interval [a_1, b_1].
Now repeat the division argument to get a new subinterval [a_2, b_2] of [a_1, b_1] of
length (b − a)/2^2 such that f is unbounded on [a_2, b_2] ∩ A. Inductively, we obtain a
sequence {[a_n, b_n]} with (b_n − a_n) = (b − a)/2^n, [a_n, b_n] ⊂ [a_{n−1}, b_{n−1}] (so the sequence
is descending), and f is unbounded on [a_n, b_n] ∩ A.
By Theorem 3.1.28, there is a point x ∈ ⋂_{n=1}^∞ [a_n, b_n]. x is a limit point of A
(why?) and so x ∈ A. By local boundedness, there are δ_x and M_x such that
|f(y)| ≤ M_x for y ∈ (x − δ_x, x + δ_x) ∩ A. But, for n sufficiently large, [a_n, b_n] ⊂
(x − δ_x, x + δ_x), which gives a contradiction. □
The last property does not really give a new compactness argument. It is
a restatement of the Bolzano-Weierstrass property in terms of Cauchy sequences.
(Recall Definition 1.1.11.)
We will base our fourth proof of Theorem 3.1.7 on the following theorem:

Theorem 3.1.29. A set A ⊂ R is closed and bounded if and only if


(1) every sequence in A contains a Cauchy subsequence and
(2) every sequence in A that is a Cauchy sequence converges to a limit in A.
Both of these properties are needed, as we will see later. The proof is only a
slight modification of the first proof based on Theorem 3.1.12.
Proof. Of Theorem 3.1.7.
If f is not bounded on A, there is a sequence {x_n} ∈ A such that |f(x_n)| > n.
By (1), there must be a Cauchy subsequence {x_{n_k}}, which by (2) converges to some
y ∈ A. There are δ_y and M_y such that |f(z)| ≤ M_y for z ∈ (y − δ_y, y + δ_y) ∩ A.
But, for all k large, x_{n_k} ∈ (y − δ_y, y + δ_y), which yields a contradiction. □

3.2. Sequential Compactness


The “compactness arguments” presented in section 3.1 turn out to be powerful
tools in abstract metric spaces. But, the necessary properties of a set needed to use
these arguments do not follow from the assumption of being closed and bounded,
as they do in Rn (we will see this). Closed and bounded sets in Rn have special
properties because of the underlying properties of Rn .
So, our strategy is to assume the equivalent conditions for being closed and
bounded presented in Theorems 3.1.12, 3.1.21, and 3.1.28, i.e., the analogs of the
Bolzano-Weierstrass, the Heine-Borel, and the Cantor intersection properties, in
our general abstract metric spaces. After all, it is these conditions that are used in
our compactness arguments.
Nominally, these three conditions define three different characteristic properties
of the set in question. A major result we prove is that these are equivalent and
define just one characteristic property called compactness.
The first type of compactness we define is the analog of the Bolzano-Weierstrass
property. We define this and then explore some of the consequences that follow from
having the property.
Definition 3.2.1. A subset K of a metric space (X, d) is sequentially com-
pact if every sequence of points in K has a subsequence that converges to a point
in K. If X itself is sequentially compact, we call it a sequentially compact space.
It is common to use K for a compact set. Note: since (K, d) is also a metric
space, we can think of a sequentially compact subset as a sequentially compact
space.
Example 3.2.2. A closed, bounded interval in R is sequentially compact. (See
Theorem 3.1.12.)
Example 3.2.3. The set of rational numbers in [0, 1] is not a sequentially
compact subset of (R, |·|). Consider the sequence obtained by taking a finite number
of terms of the decimal expansion of 1/√2:
(.7, .70, .707, .7071, .70710, .707106, ...),
which converges to an irrational number. Any subsequence also converges to 1/√2.
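The sequence in Example 3.2.3 can be generated exactly. The sketch below (the function name `truncation` is an assumption of this illustration) builds the k-digit decimal truncation of 1/√2 as a rational number, using integer square roots so that no floating-point rounding enters.

```python
from fractions import Fraction
from math import isqrt


def truncation(k: int) -> Fraction:
    """First k decimal digits of 1/sqrt(2) as an exact rational number.

    floor(10**k / sqrt(2)) equals isqrt(10**(2k) // 2)."""
    return Fraction(isqrt(10 ** (2 * k) // 2), 10 ** k)


for k in range(1, 7):
    t = truncation(k)
    # Each term is rational, lies strictly below 1/sqrt(2),
    # and is within 10**(-k) of it.
    print(k, float(t), 2 * t * t < 1 < 2 * (t + Fraction(1, 10 ** k)) ** 2)
```

Every term is a rational number in [0, 1], yet the only possible limit, 1/√2, is not rational.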
The first consequence of this definition that we prove is
Theorem 3.2.4. Let K ⊂ X be a (sequentially) compact subset of a metric
space (X, d). Then K is closed and bounded.

Proof. First we prove closed. Suppose x were a limit point of K. Then there
exists a sequence of elements in K that converge to x. This sequence must have
a subsequence that converges to an element in K. However, the sequence and the
subsequence must have the same limit, so x must be in K.
Second, we prove bounded. Assume K is not bounded. Choose x0 ∈ K. For
each n ∈ N, let xn ∈ K with d(x0 , xn ) > n. Now, the sequence {xn } has the
property that every subsequence is unbounded, and hence cannot converge. So, K
cannot be sequentially compact. 

Note: being closed and bounded is not sufficient to guarantee


sequential compactness.
Example 3.2.5. For x, y ∈ R, define
d(x, y) = min{|x − y|, 1}.
(R, d) is a metric space and {xn } converges to x in (R, | · |) if and only if it converges
to x in (R, d).
Exercise 3.2.6. Show this last statement.
Now, (R, d) is closed and bounded. (Boundedness follows since d(x, 0) ≤ 1 for
all x ∈ R.) However, (R, d) is not sequentially compact. For example, {1, 2, 3, ...}
contains no convergent subsequence.
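A quick numerical look at Example 3.2.5, offered only as a sketch: under the truncated metric every point is within distance 1 of 0, yet distinct integers stay at distance exactly 1 from each other, so no subsequence of {1, 2, 3, ...} can be Cauchy.

```python
def d(x: float, y: float) -> float:
    """The bounded metric d(x, y) = min(|x - y|, 1) on R."""
    return min(abs(x - y), 1.0)


# Boundedness: every sampled point is within distance 1 of 0.
print(all(d(x, 0) <= 1 for x in [-1e9, -3.5, 0.25, 7, 1e12]))

# Distinct positive integers are at distance exactly 1 from each other.
print(all(d(m, n) == 1 for m in range(1, 30) for n in range(1, 30) if m != n))
```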

3.3. Separability
We saw that sequential compactness implies being closed and bounded, and
the closed and bounded sets in R are special with respect to making compactness
arguments. Another property of real numbers is that the rational numbers are
dense. This turns out to be even more fundamentally important than is obvious.
It also turns out that sequential compactness is related to this property.
Definition 3.3.1. A metric space (X, d) is separable if it contains a count-
able, dense subset. A subset A ⊂ X of a metric space (X, d) is separable if it is
separable considered as a metric space with the metric d.
Example 3.3.2. (R, | · |) and (Rn , k · k) are separable. So are closed, bounded
subsets of these metric spaces.
Theorem 3.3.3. Let K ⊂ X be a sequentially compact subset of a metric space
(X, d). Then K is separable.
We prove this by introducing a useful notion and proving another theorem.
Definition 3.3.4. Let A ⊂ X, where (X, d) is a metric space. If for ε > 0
there are points x_1, ..., x_n ∈ A such that
A ⊂ ⋃_{i=1}^n N_ε(x_i),
then {x_1, ..., x_n} is an ε-net for A.
Note: if {x_1, ..., x_n} is an ε-net for A, then for any x ∈ A,
d(x, x_i) < ε for at least one i.

Example 3.3.5. Given any set of real numbers that is bounded and any ε > 0,
we can find an ε-net consisting of points with finite decimal expansions. This is
very important for computation with real numbers on computers.
Note that the existence of an ε-net for any ε > 0 is a stronger condition than
boundedness. Not only is K contained in some large ball, which is obtained by
choosing some ε-net {x_1, ..., x_n} and taking a ball that contains ⋃_i N_ε(x_i), but it
is contained in the union of a finite number of neighborhoods of any small size.
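To make Example 3.3.5 concrete, here is a sketch for the particular bounded set A = [0, 1). That choice of set, and the helper names, are assumptions of this illustration. The net consists of the points k/10^m, which have finite decimal expansions and lie in A.

```python
from fractions import Fraction


def decimal_net(eps: Fraction):
    """An eps-net for [0, 1) made of points k / 10**m (finite decimal expansions).

    m is the least exponent with 10**(-m) < eps, so every x in [0, 1)
    is within 10**(-m) < eps of the grid point just below it."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    m = 0
    while Fraction(1, 10 ** m) >= eps:
        m += 1
    return [Fraction(k, 10 ** m) for k in range(10 ** m)]


def nearest_net_distance(x: Fraction, net) -> Fraction:
    """Distance from x to the closest point of the net."""
    return min(abs(x - p) for p in net)


eps = Fraction(3, 100)
net = decimal_net(eps)
print(len(net), nearest_net_distance(Fraction(1, 3), net) < eps)
```

The size of the net grows as ε shrinks, but it is finite for each fixed ε, which is exactly what total boundedness asks for.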
Example 3.3.6. Consider the closed unit “ball” in l^2 (see Definition 2.2.14):
N̄_1(0) = {x ∈ l^2 | √(∑_{n=1}^∞ x_n^2) ≤ 1}.
The sequence (x_1, x_2, x_3, ...) with
x_1 = (1, 0, 0, 0, ...)
x_2 = (0, 1, 0, 0, ...)
x_3 = (0, 0, 1, 0, ...)
...
is clearly in N̄_1(0). However,
d(x_m, x_n) = √2
for any m ≠ n, hence N_{1/2}(x) can contain at most one of the {x_n} for any x ∈ l^2. If
x_m ∈ N_{1/2}(x) for some m, i.e.,
d(x_m, x) < 1/2,
then for n ≠ m, we have
√2 = d(x_n, x_m) ≤ d(x_n, x) + d(x_m, x) < d(x_n, x) + 1/2.
This yields
d(x_n, x) > √2 − 1/2 > 1/2.
Thus, there can be no ε-net for N̄_1(0) in l^2 with ε = 1/2.
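The key computation in Example 3.3.6, d(x_m, x_n) = √2, can be checked on finite truncations of the unit sequences. This is only a sketch: the vectors are cut off at a hypothetical dimension `dim`, which does not change the distance because all later coordinates are zero.

```python
import math


def unit_vector(n: int, dim: int):
    """The sequence x_n truncated to its first dim coordinates (0-based index n)."""
    return [1.0 if i == n else 0.0 for i in range(dim)]


def d2(x, y) -> float:
    """Euclidean (l^2) distance between two finite coordinate lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


dim = 8
vectors = [unit_vector(n, dim) for n in range(dim)]
dists = {d2(u, v) for i, u in enumerate(vectors) for j, v in enumerate(vectors) if i != j}
# Every pairwise distance is sqrt(2) > 1, so no ball of radius 1/2
# (diameter at most 1) can contain two of the x_n.
print(dists)
```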
Example 3.3.7. (R, |·|) does not have an ε-net for any ε > 0.
Definition 3.3.8. Let (X, d) be a metric space. A set A ⊂ X is totally
bounded if for every ε > 0, there is an ε-net for A.
Theorem 3.3.3 follows from
Theorem 3.3.9. Let K ⊂ X be a sequentially compact subset of a metric space
(X, d). Then K is totally bounded.
Proof. Of Theorem 3.3.9. Assume for some ε > 0 there does not exist any
finite set of points {x_1, ..., x_n} in K with
K ⊂ ⋃_{i=1}^n N_ε(x_i).
We can choose a sequence of points {x_m}_{m=1}^∞ in K such that x_n ∉ ⋃_{m=1}^{n−1} N_ε(x_m)
for all n. This sequence can have no convergent subsequence. This is because if
{x_{m_k}}_{k=1}^∞ → x, then there exists l such that k > l implies d(x, x_{m_k}) < ε/2. But then
d(x_{m_{l+1}}, x_{m_{l+2}}) < ε, so x_{m_{l+2}} ∈ N_ε(x_{m_{l+1}}), contradicting our choice of
x_{m_{l+2}}. This proves the contrapositive. □
Exercise 3.3.10. This proof is stated a little more easily if we use the notion
of Cauchy sequences and the fact that a sequence that converges is Cauchy. Do
this.
Proof. Of Theorem 3.3.3. By Theorem 3.3.9, for ε = 1/m, m ∈ N, there is an
ε-net for K. Call it
{x_{m,1}, x_{m,2}, ..., x_{m,n_m}}.
So,
K ⊂ ⋃_{i=1}^{n_m} N_{1/m}(x_{m,i}).
The set of points
A = {x1,1 , ..., x1,n1 , x2,1 , ..., x2,n2 , x3,1 , ..., x3,n3 , x4,1 , ...}
is at most countable and dense in K.
A is at most countable because it is the countable union of finite sets. (The
sets may intersect, so the end result could be finite.)
We need a little more justification to say A is dense in K. We have to show
that every point in K − A is a limit point of A. Assume x ∈ K, x 6∈ A. (Note,
if K is finite, then x ∈ A is forced.) We construct a sequence of points in A that
converges to x.
For m ∈ N, choose x_m = x_{m,i} such that 1 ≤ i ≤ n_m and x ∈ N_{1/m}(x_{m,i}). Then
we see that {x_i}_{i=1}^∞ ∈ A and {x_i}_{i=1}^∞ → x. □
Summing up, so far we have that sequential compactness implies closed, bounded,
totally bounded, and separable.
There is one last fact about separability we will use. Separability is related to
open covers.
Definition 3.3.11. An open cover of a set A contained in a metric space
(X, d) is a collection of open subsets {G_α}_{α∈a} of X such that
A ⊂ ⋃_{α∈a} G_α.
A sub-cover of {G_α}_{α∈a} is a sub-collection {G_α}_{α∈a′}, a′ ⊂ a, that still covers A. A
countable or finite (sub)cover has a countable or finite number of sets respectively.
Theorem 3.3.12 (Lindelöf's Theorem). Every open cover of a separable metric
space has a countable or finite subcover.
Proof. Let (X, d) be a separable metric space and let {xn }n∈N be a dense set
of points in X. The set of neighborhoods
{N_{1/m}(x_n)}_{n,m∈N}
is an at most countable collection of open sets. We number these sets by {N1 , N2 , N3 , ...}.
[Figure 3.1: the point x, the point x_l, and the radii 1/n and r/4.]

We first show that if G ⊂ X is an open set that contains x ∈ X, then
x ∈ N_{n′} ⊂ G
for some n′. Since G is open, N_r(x) ⊂ G for some 2 > r > 0. We choose x_l from
the sequence {x_m} so that
d(x_l, x) < r/4.
We choose n ∈ N with
4/r > n > 2/r, or r/4 < 1/n < r/2.
(See Figure 3.1.) The triangle inequality implies
x ∈ N_{1/n}(x_l) ⊂ N_r(x).
So, we have N_{n′} = N_{1/n}(x_l).
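The choice of n above relies on the interval (2/r, 4/r) containing an integer, which holds because its length 2/r exceeds 1 when r < 2. A small sketch of one explicit choice (the function name is hypothetical):

```python
import math
from fractions import Fraction


def choose_n(r: Fraction) -> int:
    """For 0 < r < 2, return an integer n with 2/r < n < 4/r,
    i.e. r/4 < 1/n < r/2."""
    if not 0 < r < 2:
        raise ValueError("need 0 < r < 2")
    # floor(2/r) + 1 is the least integer above 2/r; it stays below 4/r
    # because 2/r + 1 < 4/r exactly when r < 2.
    return math.floor(2 / r) + 1


r = Fraction(3, 10)
n = choose_n(r)
print(n, r / 4 < Fraction(1, n) < r / 2)
```

With d(x_l, x) < r/4 < 1/n, the point x lies in N_{1/n}(x_l), and any y in that neighborhood satisfies d(y, x) < 1/n + r/4 < r/2 + r/4 < r.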
Let {G_α}_{α∈a} be an open cover of X. We use {N_i} to extract a countable, or
finite, subcover. We select a sequence {G_{α_1}, G_{α_2}, G_{α_3}, ...} from {G_α} by choosing,
when possible, any G_{α_k} such that N_k ⊂ G_{α_k}. We skip values of k when necessary
to get an index set K. {G_{α_k}}_{k∈K} is at most countable. (Note, there may be "gaps"
in the indices k ∈ K. For example,
α_1, α_2, α_10, α_11, α_153, ...
could be the indices. But, there is at most a countable number.)
Any x ∈ X is contained in some G_{α′} ∈ {G_α}. But, then x ∈ N_k ⊂ G_{α′} for some
k, and hence there is a G_{α_{k′}} ∈ {G_{α_k}}_{k∈K} such that x ∈ G_{α_{k′}}. Hence, {G_{α_k}}_{k∈K}
covers X. □

3.4. Notions of Compactness


We now generalize the other two conditions equivalent to being closed and
bounded for intervals in R in Theorems 3.1.21 and 3.1.28, the Heine-Borel property
and Cantor’s intersection property.
Definition 3.4.1. (Compare to Definition 3.1.16).
A subset K ⊂ X of a metric space (X, d) is compact if every open cover of
K contains a finite subcover. If a metric space X is compact, then we call it a
compact metric space.
Definition 3.4.2. A collection {Fα }α∈a of closed subsets of a metric space
(X, d) has the finite intersection property if every finite subcollection of the
closed subsets has a nonempty intersection.
Example 3.4.3. Define I_α = [0, α] for α > 0 and I = [3, 4]. Then {I_α}_{α>0} has
the finite intersection property, but {I_α}_{α>0} ∪ {I} does not.

We now show part of the Borel-Lebesgue Theorem. The rest will be proven in
Chapter 4.
Theorem 3.4.4. Borel-Lebesgue Theorem, part I
Let (X, d) be a metric space, and K ⊂ X. The following are equivalent.
(1) K is compact.
(2) Every collection of closed subsets of K with the finite intersection property
has a nonempty intersection.
(3) K is sequentially compact.
In light of this result, we only use the term “compact” and no longer refer to
“sequential compactness”. We may restate the results in Section 3.2 and 3.3:
Theorem 3.4.5. Let K ⊂ X be a compact subset of a metric space (X, d).
Then,
(1) K is closed.
(2) K is bounded.
(3) K is separable.
(4) K is totally bounded.
Proof. Of Theorem 3.4.4
We show that (1) → (2) → (3) → (1).
(1) → (2)
We assume K is compact. Let {F_α}_{α∈a} be a collection of closed subsets of
K such that ⋂_{α∈a} F_α = ∅. Consider the open sets {G_α}_{α∈a} with G_α = F_α^c, the
complement of F_α in X. Then,
⋃_α G_α = ⋃_α F_α^c = (⋂_α F_α)^c = ∅^c = X.
So,
K ⊂ ⋃_α G_α.
By compactness,
K ⊂ G_{α_1} ∪ ... ∪ G_{α_n}
for some α_1, ..., α_n. Moreover, since F_{α_i} ⊂ K,
F_{α_i} = K \ G_{α_i}
so
F_{α_1} ∩ ... ∩ F_{α_n} = (K \ G_{α_1}) ∩ ... ∩ (K \ G_{α_n})
= K \ (G_{α_1} ∪ ... ∪ G_{α_n}) = ∅.
Hence, {F_α}_{α∈a} cannot have the finite intersection property.
Example 3.4.6. Consider [0, 1] ⊂ (R, |·|). Define
F_n = [1/(n+3), 1/(n+1)], n ∈ N,
so {F_n} = {[1/4, 1/2], [1/5, 1/3], [1/6, 1/4], [1/7, 1/5], ...} (see Figure 3.2).
[Figure 3.2: the intervals F_1, F_2, F_3, F_4 drawn over a number line with ticks at 0, 1/7, 1/6, 1/5, 1/4, 1/3, 1/2, 1.]
Now, if G_n = F_n^c, then
G_n = (−∞, 1/(n+3)) ∪ (1/(n+1), ∞)
or
G_1 = (−∞, 1/4) ∪ (1/2, ∞)
G_2 = (−∞, 1/5) ∪ (1/3, ∞)
G_3 = (−∞, 1/6) ∪ (1/4, ∞)
G_4 = (−∞, 1/7) ∪ (1/5, ∞)
Clearly, ⋂_{n=1}^∞ F_n = ∅, since F_4 ∩ F_1 = ∅.
Moreover, plotting {G_n} shows that
[0, 1] ⊂ G_1 ∪ G_2 ∪ G_3 ∪ G_4.
{G_1, G_2, G_3, G_4} is the finite subcover constructed in the proof (see Figure 3.3).
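The two claims in Example 3.4.6 can be checked with exact arithmetic. The sketch below uses hypothetical helper names, represents each closed interval F_n by its endpoints, and tests the cover only at finitely many sample points of [0, 1].

```python
from fractions import Fraction


def F(n: int):
    """Endpoints (left, right) of the closed interval F_n = [1/(n+3), 1/(n+1)]."""
    return Fraction(1, n + 3), Fraction(1, n + 1)


def in_G(x: Fraction, n: int) -> bool:
    """Membership in G_n, the complement of F_n in R."""
    lo, hi = F(n)
    return x < lo or x > hi


def intersect(I, J):
    """Intersection of two closed intervals, or None if it is empty."""
    lo, hi = max(I[0], J[0]), min(I[1], J[1])
    return (lo, hi) if lo <= hi else None


# F_1 and F_4 are disjoint, so the whole family has empty intersection.
print(intersect(F(1), F(4)))

# Sampled points of [0, 1] all lie in G_1, G_2, G_3, or G_4.
samples = [Fraction(k, 1000) for k in range(1001)]
print(all(any(in_G(x, n) for n in range(1, 5)) for x in samples))
```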
Back to the proof: (2) → (3)
Let {x_m} be a sequence in K. Define F_n to be the closure of the tail of the
sequence,
F_n = closure of {x_n, x_{n+1}, ...} = closure of {x_m}_{m=n}^∞.
This is a descending sequence of closed sets with the finite intersection property,
since
x_{n′} ∈ F_{n_1} ∩ ... ∩ F_{n_k}
when n′ = max{n_1, ..., n_k} for any n_1, ..., n_k ∈ N.
Hence, there is an x ∈ K with x ∈ ⋂_{n=1}^∞ F_n. There are two possibilities. If
x = x_n for infinitely many n, then we extract the subsequence consisting of the repeated
values of x_n, which obviously converges to x.
If x = x_n for only finitely many n, then we construct a subsequence that converges
to x as follows: Choose n_1 large enough that x ≠ x_m for m ≥ n_1. Given n_k ∈ N,
use the fact that x, which lies in the closure of
{x_{n_k+1}, x_{n_k+2}, x_{n_k+3}, ...},
is a limit point of this set to choose n_{k+1} > n_k with 0 < d(x, x_{n_{k+1}}) < 1/(k+1).
Clearly, {x_{n_k}} → x.
[Figure 3.3: number line with ticks at 0, 1/7, 1/6, 1/5, 1/4, 1/3, 1/2, 1.]

Example 3.4.7. First, consider (1, 1/2, 1/3, 1/4, ...).
F_1 = {1, 1/2, 1/3, ...} ∪ {0}
F_2 = {1/2, 1/3, ...} ∪ {0}
F_3 = {1/3, 1/4, ...} ∪ {0}
...
⋂ F_n = {0}
In the proof, x = 0 and the sequence itself converges to x.
Next, consider {1, 1/2, 1, 1/3, 1, 1/4, 1, 1/5, ...}.
F_1 = {1, 1/2, 1, 1/3, 1, 1/4, 1, 1/5, ...} ∪ {0}
F_2 = {1/2, 1, 1/3, ...} ∪ {0}
F_3 = {1, 1/3, 1, 1/4, ...} ∪ {0}
...
⋂ F_n = {1, 0}
If we choose x = 1, then we choose the subsequence of all 1's. If we choose x = 0,
then we choose the subsequence {1/2, 1/3, 1/4, ...}.

Back to the proof: (3) → (1).
Assume K is sequentially compact and let {G_α}_{α∈a} be an open cover of K. By
Theorem 3.3.3, K is separable and by Theorem 3.3.12, {G_α}_{α∈a} can be reduced to
a finite or countable subcover.
Assume we have a countable cover {G_n}_{n=1}^∞. We show this can be reduced to
a finite subcover. Assume this is not true. For each n ∈ N, there is an x_n ∈ K
but not in ⋃_{m=1}^n G_m. Otherwise, {G_m}_{m=1}^n covers K. The sequence {x_n} has a
subsequence {x_{n_k}} that converges to a limit x ∈ K. Since K ⊂ ⋃_{m=1}^∞ G_m, x ∈ G_M
for some M > 0. Since G_M is open, x_{n_k} ∈ G_M for all sufficiently large k. But, this
contradicts the construction by which x_{n_k} ∉ G_M for n_k > M. □

3.5. Some Properties of Compactness


We now show some easy, but characteristic properties of compactness of subsets.
Theorem 3.5.1. Let (X, d) be a metric space.
(1) If {K_1, ..., K_n} are compact subsets of X, then ⋃_{m=1}^n K_m is compact.
(2) If {K_α}_{α∈a} is a collection of compact subsets of X, then ⋂_{α∈a} K_α is
compact.
Exercise 3.5.2. Prove Theorem 3.5.1.
Recall the “flaw” concerning openness and subsets discussed in Example 2.3.34.
If (X, d) is a metric space and Y ⊂ X, then (Y, d) is a metric space. A set G ⊂ Y
may be open in Y, but this does not mean it is open in X. Contrast this to
Theorem 3.5.3. Suppose (X, d) is a metric space and K ⊂ Y ⊂ X. K is a
compact subset of X if and only if K is a compact subset of Y.
Proof. Suppose K ⊂ X is compact and let {A_α}_{α∈a} be a collection of sets
that are open relative to Y such that K ⊂ ⋃_{α∈a} A_α. By Theorem 2.3.38, there are
open sets {G_α}_{α∈a} in X with
A_α = Y ∩ G_α
for α ∈ a. Since K is compact in X and is covered by {G_α}_{α∈a},
K ⊂ G_{α_1} ∪ ... ∪ G_{α_n}
for some α_1, ..., α_n ∈ a. But, since K ⊂ Y, this implies
K ⊂ A_{α_1} ∪ ... ∪ A_{α_n},
and K is compact in Y.
The other direction is simply the reverse of this argument. □
Finally, recall that Example 3.2.5 shows that being closed and bounded is not
sufficient to guarantee compactness. Interestingly,
Theorem 3.5.4. Closed subsets of a compact set in a metric space are compact.
Proof. Let (X, d) be a metric space, K ⊂ X compact, and F ⊂ K closed.
Let {G_α}_{α∈a} be an open cover of F. If F^c is added to {G_α}_{α∈a}, then we obtain
an open cover {G_α}_{α∈a} ∪ {F^c} of K (see Figure 3.4).
Since K is compact, there is a finite subcollection from {G_α}_{α∈a} ∪ {F^c} that
covers K, and hence covers F. We can remove F^c and still retain a cover of F.
Thus, a finite subcollection of {G_α}_{α∈a} covers F, and F is compact. □
[Figure 3.4: the open sets G_1, G_2, G_3.]

3.6. Compact Sets in Rn


As a special case, we consider R^n. First, we prove the generalization of the fact
that any closed and bounded interval is compact.
Definition 3.6.1. Let a_m < b_m be numbers in R for m = 1, 2, ..., n. The set
{x = (x_1, ..., x_n) ∈ R^n | a_m ≤ x_m ≤ b_m, 1 ≤ m ≤ n}
is an n-cell in R^n.
Theorem 3.6.2. Every n-cell of Rn is compact. This implies in particular that
[a, b] ⊂ R, a < b, is compact.
Proof. We first show a modified form of the finite intersection property for
n-cells. We begin with intervals in R1 :
Let {I_m}_{m=1}^∞ be a descending sequence of closed intervals in R^1, i.e., I_1 ⊃
I_2 ⊃ .... This implies {I_m}_{m=1}^∞ has the finite intersection property. We prove that
⋂_{m=1}^∞ I_m is nonempty.
Let I_m = [a_m, b_m]. The sequence {a_m} is bounded above by b_1, hence
x = sup_{m∈N} a_m < ∞.
We show that x ∈ ⋂_{m=1}^∞ I_m. For positive integers m, l,
a_l ≤ a_{l+m} ≤ b_{l+m} ≤ b_m,
so x ≤ b_m for all m. Since a_m ≤ x for all m, a_m ≤ x ≤ b_m for all m, and
x ∈ ⋂_{m∈N} I_m.
Now suppose {I_m} is a descending sequence of n-cells in R^n. Let
I_m = {x | a_{m,l} ≤ x_l ≤ b_{m,l}, 1 ≤ l ≤ n}, m ∈ N.
Set
I_{m,l} = [a_{m,l}, b_{m,l}] ⊂ R^1
for 1 ≤ l ≤ n, m ∈ N. In other
words,
I_m = I_{m,1} × ... × I_{m,n}.
For each l, {I_{m,l}}_{m∈N} is a descending sequence of intervals in R^1. Hence, there are
real numbers x_l such that
a_{m,l} ≤ x_l ≤ b_{m,l}
for 1 ≤ l ≤ n, m ∈ N. We have x = (x_1, x_2, ..., x_n) ∈ I_m for all m.
Now we prove that the n-cell
I = {x ∈ R^n | a_m ≤ x_m ≤ b_m, 1 ≤ m ≤ n}
is compact. Set δ = ‖b − a‖, where a = (a_1, a_2, ..., a_n) and b = (b_1, b_2, ..., b_n), so
‖x − y‖ ≤ δ
for all x, y ∈ I. Suppose there is an open cover {G_α}_{α∈a} of I that contains no finite
subcover. Set
c_m = (a_m + b_m)/2
for 1 ≤ m ≤ n. The intervals {[a_m, c_m], [c_m, b_m]} determine 2^n n-cells {I_l} whose
union is I.
One of these n-cells, at least, cannot be covered by any finite collection from
{G_α}_{α∈a}. Call this J_1. We next subdivide J_1 into 2^n n-cells in the same way.
Again, one of the resulting n-cells cannot be covered by any finite collection from
{G_α}_{α∈a}, and we call this J_2.
Inductively, we obtain a sequence of n-cells {J_m}_{m=1}^∞ with the properties
(1) J_1 ⊃ J_2 ⊃ ...
(2) J_m is not covered by any finite collection from {G_α}.
(3) If x ∈ J_m and y ∈ J_m, then ‖x − y‖ ≤ 2^{−m} δ.
By the discussion above, we know there is a point x ∈ ⋂_{m=1}^∞ J_m, and x ∈ G_α for
some α. Since G_α is open, there is an r > 0 with N_r(x) ⊂ G_α. If we choose m with
2^{−m} δ < r, then J_m ⊂ N_r(x) ⊂ G_α. But, this contradicts (2). □
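The subdivision step in this proof can be sketched in code. The following is an illustration only, with hypothetical function names: it splits an n-cell at the midpoints c_m into 2^n sub-cells and confirms that each has half the diameter of the original, which is where the factor 2^{−m} in property (3) comes from.

```python
import itertools
import math


def subdivide(a, b):
    """Split the n-cell with lower corner a and upper corner b into
    2**n sub-cells, cutting each side [a_m, b_m] at c_m = (a_m + b_m) / 2."""
    c = [(am + bm) / 2 for am, bm in zip(a, b)]
    halves = [((am, cm), (cm, bm)) for am, cm, bm in zip(a, c, b)]
    cells = []
    # Pick the lower or upper half independently in each coordinate.
    for choice in itertools.product(*halves):
        lo = tuple(side[0] for side in choice)
        hi = tuple(side[1] for side in choice)
        cells.append((lo, hi))
    return cells


def diameter(lo, hi) -> float:
    """The number delta = ||hi - lo||, the largest distance within the cell."""
    return math.dist(lo, hi)


a, b = (0.0, 0.0, 0.0), (1.0, 2.0, 4.0)
cells = subdivide(a, b)
print(len(cells), diameter(a, b), {round(diameter(lo, hi), 12) for lo, hi in cells})
```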
From this, it is easy to prove
Theorem 3.6.3. A set $K \subset \mathbb{R}^n$ is compact if and only if it is closed and bounded.
Proof. By Theorem 3.4.5, if $K$ is compact, then it is closed and bounded. On the other hand, if $K$ is closed and bounded, then it is contained in some compact $n$-cell. Since $K$ is a closed subset of a compact set, Theorem 3.5.4 shows it is compact. □
These last two theorems complete the proofs of Theorems 3.1.12 (Bolzano-Weierstrass property), 3.1.21 (Heine-Borel property), and 3.1.28 (Cantor Intersection Property).
CHAPTER 4

Cauchy Sequences in Metric Spaces

We have defined the notions of sequences, subsequences, and convergence, and


explored their relation to the essential property of compactness. Recall that a
serious flaw in the definition of convergence, from the point of application, is that
this definition requires the (possibly unknown) limit. Cauchy sequences are a way
to get around this.

4.1. A Few Facts and Boundedness


First, we present a few facts about sequences and discuss the property of bound-
edness.
Theorem 4.1.1. The limit of a convergent sequence in a metric space is unique.
Proof. Suppose that $\{x_n\}$ is a sequence in a metric space $(X, d)$ that converges to $x$ and $y$. Choose $\epsilon > 0$. There are $N$ and $M$ such that $d(x_n, x) < \epsilon$ for $n \ge N$ and $d(x_n, y) < \epsilon$ for $n \ge M$. Hence, for $n \ge \max\{N, M\}$, $d(x, y) \le d(x, x_n) + d(x_n, y) < 2\epsilon$. Since $d(x, y) < 2\epsilon$ for any $\epsilon > 0$, $d(x, y) = 0$ and so $x = y$. □

Theorem 4.1.2. Let $\{x_n\}$ be a sequence in a metric space $(X, d)$. $\{x_n\}$ converges to $x \in X$ if and only if every neighborhood of $x$ contains all but finitely many terms of $\{x_n\}$.
Proof. Suppose $\{x_n\} \to x$ and let $N_\epsilon(x)$ be a neighborhood of $x$. By convergence, there is an $N$ such that $d(x_n, x) < \epsilon$ for $n \ge N$. Hence, $x_n \in N_\epsilon(x)$ for $n \ge N$.
Now assume every neighborhood of $x$ contains all but a finite number of the terms of $\{x_n\}$. For $\epsilon > 0$, consider $N_\epsilon(x)$. By assumption, there is an $N > 0$ such that $x_n \in N_\epsilon(x)$, i.e., $d(x_n, x) < \epsilon$, for $n \ge N$. □

Recalling the notions of subsequences and subsequential limits, we have the


following useful fact.
Theorem 4.1.3. The subsequential limits of a sequence in a metric space form
a closed subset of the space.
Proof. Let {xn } be a sequence in a metric space (X, d). Let A be the set of
all subsequential limits of {xn }. Let x be a limit point of A. We want to show that
x ∈ A, which means showing that x is a subsequential limit of {xn }.
Choose $n_1$ so that $x_{n_1} \ne x$. If no such $n_1$ exists, then $A$ has one point, and we are done. Let $\delta = d(x, x_{n_1})$. Suppose $n_1, \dots, n_{m-1}$ are chosen. Since $x$ is a limit point of $A$, there is a $y \in A$ with $d(x, y) < 2^{-m}\delta$. Since $y \in A$, there is an $n_m > n_{m-1}$ with $d(y, x_{n_m}) < 2^{-m}\delta$.
We have
$$d(x, x_{n_m}) \le d(x, y) + d(y, x_{n_m}) < 2^{-m}\delta + 2^{-m}\delta = 2^{1-m}\delta.$$
We conclude that $\{x_{n_m}\} \to x$. □
Finally, we discuss the connection between boundedness (Defn. 2.3.24) and
convergence of sequences.
Definition 4.1.4. A sequence $\{x_n\}$ in a metric space $(X, d)$ is bounded if its range forms a bounded set in $X$. Otherwise, it is unbounded. Equivalently, $\{x_n\}$ is bounded if and only if there exists a bounded set $A \subset X$ such that $x_n \in A$ for all $n$.
Example 4.1.5. In $(\mathbb{R}, |\cdot|)$,
(1) $\{\frac{1}{n}\}_{n=1}^{\infty}$ converges and is bounded.
(2) $\{n^2\}_{n=1}^{\infty}$ diverges and is unbounded.
(3) $\{1 + (-1)^n\}_{n=1}^{\infty}$ diverges and is bounded.

We have the relation


Theorem 4.1.6. A sequence in a metric space that converges is bounded.
Proof. Suppose $\{x_n\}$ is a sequence in a metric space $(X, d)$ that converges to $x$. There is an integer $N$ such that $d(x_n, x) < 1$ for $n \ge N$. Set
$$r = \max\{1, d(x, x_1), \dots, d(x, x_{N-1})\}.$$
Then, $d(x_n, x) \le r$ for all $n$. □

4.2. Cauchy Sequences


The practical trouble with the standard definition of convergence is that it
involves the (usually) unknown limit.
Example 4.2.1. The sequence
$$\left\{ \sum_{m=1}^{n} \frac{4}{n} \sqrt{e^{4m/n} - \sin\!\left(\frac{2}{1 + \frac{4m}{n}}\right)} \right\}_{n=1}^{\infty}$$
converges, because its terms are Riemann sums for the integral
$$\int_0^4 \sqrt{e^{x} - \sin\!\left(\frac{2}{1+x}\right)}\,dx,$$
which we can prove exists by standard Calculus results. However, we do not know the value of this integral and cannot verify this by the definition of convergence.
The notion of a Cauchy sequence gives a way around this difficulty.
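The following is a minimal numerical sketch of this point, not a proof. It assumes the terms of Example 4.2.1 are the right-endpoint Riemann sums $s_n = \sum_{m=1}^{n} \frac{4}{n}\sqrt{e^{4m/n} - \sin(2/(1 + 4m/n))}$; the function name `riemann_term` is introduced here for illustration. Without knowing the limit, we can still observe that the gaps $|s_{2n} - s_n|$ shrink as $n$ grows.

```python
import math


def riemann_term(n: int) -> float:
    """n-th term s_n: right-endpoint Riemann sum with n subintervals of width 4/n."""
    h = 4.0 / n
    total = 0.0
    for m in range(1, n + 1):
        x = m * h  # sample point 4m/n
        total += h * math.sqrt(math.exp(x) - math.sin(2.0 / (1.0 + x)))
    return total


# Compare terms with each other, never with the (unknown) limit.
indices = (100, 200, 400, 800)
gaps = [abs(riemann_term(2 * n) - riemann_term(n)) for n in indices]
for n, gap in zip(indices, gaps):
    print(f"|s_{2 * n} - s_{n}| = {gap:.6f}")
```

Shrinking gaps between a few selected terms only suggest the Cauchy property; the definition requires the bound for all pairs of sufficiently large indices.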
The idea is that if a sequence converges to a limit, i.e., the terms in the sequence
approach a limit as the index increases, then the terms also approach each other as
the index increases.
Definition 4.2.2. A sequence $\{x_n\}$ in a metric space $(X, d)$ is a Cauchy sequence if for every $\epsilon > 0$ there is an $N > 0$ such that $d(x_n, x_m) < \epsilon$ for $n, m \ge N$.

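Example 4.2.3. $\{\frac{1}{n}\}$ is a Cauchy sequence in $(\mathbb{R}, |\cdot|)$. This is because
$$d\!\left(\tfrac{1}{n}, \tfrac{1}{m}\right) = \left|\tfrac{1}{n} - \tfrac{1}{m}\right| \le \frac{2}{\min\{n, m\}}.$$
So, given $\epsilon > 0$, if we choose $N > \frac{2}{\epsilon}$, then for $n, m \ge N$, $|\frac{1}{n} - \frac{1}{m}| < \epsilon$.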

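Example 4.2.4. $\{\frac{\sin(nx)}{n}\}$ is a Cauchy sequence in $C([0, \pi])$ since
$$\left|\frac{\sin(nx)}{n} - \frac{\sin(mx)}{m}\right| \le \frac{1}{n} + \frac{1}{m} \le \frac{2}{\min\{n, m\}}$$
for $0 \le x \le \pi$. Hence, given $\epsilon > 0$, if $N > \frac{2}{\epsilon}$, then
$$d\!\left(\frac{\sin(nx)}{n}, \frac{\sin(mx)}{m}\right) < \epsilon$$
for $n, m \ge N$.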
Fitting our intuition,
Theorem 4.2.5. Any sequence in a metric space that converges is a Cauchy
sequence.
Proof. Assume $\{x_n\}$ is a sequence in a metric space $(X, d)$ and $x_n \to x$. Choose $\epsilon > 0$. There is an $N$ such that $d(x_n, x) < \epsilon$ for $n \ge N$. Hence,
$$d(x_n, x_m) \le d(x_n, x) + d(x_m, x) < 2\epsilon$$
for $n, m \ge N$, and $\{x_n\}$ is Cauchy. □
Moreover,
Theorem 4.2.6. If a subsequence of a Cauchy sequence in a metric space con-
verges to a limit, then the Cauchy sequence itself converges to the same limit.
Proof. Let $\{x_n\}$ be a Cauchy sequence in a metric space $(X, d)$. Suppose the subsequence $\{x_{n_k}\}$ converges to $x$. Given $\epsilon > 0$, choose $N$ so that $d(x_n, x_m) < \frac{\epsilon}{2}$ for $n, m \ge N$. Choose $K$ such that $k \ge K$ implies $n_k \ge N$ and $d(x_{n_k}, x) < \frac{\epsilon}{2}$. Then for all $n \ge N$ and $k \ge K$, $d(x_n, x) \le d(x_n, x_{n_k}) + d(x_{n_k}, x) < \epsilon$. □
Finally, we observe one more characteristic of Cauchy sequences:
Theorem 4.2.7. Cauchy sequences are bounded.
Proof. Suppose $\{x_n\}$ is a Cauchy sequence in the metric space $(X, d)$. Then there exists $N \in \mathbb{N}$ such that $n, m \ge N$ implies $d(x_n, x_m) < 1$. We now have
$$d(x_N, x_k) \le \max\{d(x_N, x_1), d(x_N, x_2), \dots, d(x_N, x_{N-1}), 1\}$$
for all $k \in \mathbb{N}$. □

4.3. Completeness
Unfortunately, the converse to Theorem 4.2.5 does not hold in general. Not every Cauchy sequence in a metric space must converge to a point in the space.
Example 4.3.1. Consider $(0, 1) \subset (\mathbb{R}, |\cdot|)$. $\{\frac{1}{n}\}$ is a Cauchy sequence in $(0, 1)$, but does not converge to a limit in $(0, 1)$.
Example 4.3.2. Consider $\mathbb{Q} \subset (\mathbb{R}, |\cdot|)$. $\{(1 + \frac{1}{n})^n\}$ is a Cauchy sequence in $\mathbb{Q}$ because we know that $(1 + \frac{1}{n})^n \to e$ in $(\mathbb{R}, |\cdot|)$, but its limit $e \notin \mathbb{Q}$.
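As an illustration of Example 4.3.2, the sketch below computes the terms $(1 + \frac{1}{n})^n$ in exact rational arithmetic, so every term is visibly an element of $\mathbb{Q}$. The helper name `term` and the chosen indices are assumptions made for this illustration only.

```python
from fractions import Fraction
import math


def term(n: int) -> Fraction:
    """Exact rational value of (1 + 1/n)^n."""
    return (1 + Fraction(1, n)) ** n


# Every term is a ratio of two integers, i.e. a point of Q.
for n in (10, 100, 1000):
    x = term(n)
    print(n, float(x), "distance to e:", abs(float(x) - math.e))

# Late terms are close to each other, although the limit e is not rational.
gap = abs(term(2000) - term(1000))
print("gap between terms 1000 and 2000:", float(gap))
```

The terms cluster together inside $\mathbb{Q}$, while the point they cluster around lies outside $\mathbb{Q}$.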

Example 4.3.3. Consider the space of polynomials on $[a, b]$: $P([a, b]) \subset C([a, b])$. The sequence $\{1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}\}$ converges uniformly to $e^x$ on $[a, b]$, i.e.,
$$\max_{a \le x \le b} \left|1 + x + \cdots + \frac{x^n}{n!} - e^x\right| \xrightarrow[n \to \infty]{} 0$$
by Taylor's theorem. Hence, $\{1 + x + \cdots + \frac{x^n}{n!}\}$ is a Cauchy sequence in $P([a, b])$, but its limit is $e^x \notin P([a, b])$.
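A small numerical sketch of Example 4.3.3 follows. It approximates the sup distance between the Taylor polynomials and $e^x$ by a maximum over a finite grid; the interval $[-1, 2]$, the grid size, and the function names are assumptions chosen for illustration, and a grid maximum only bounds the true supremum from below.

```python
import math


def taylor_poly(n: int, x: float) -> float:
    """Evaluate 1 + x + x^2/2! + ... + x^n/n!."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k
        total += term
    return total


def sup_dist_to_exp(n: int, a: float = -1.0, b: float = 2.0, samples: int = 2001) -> float:
    """Grid approximation of max over [a, b] of |p_n(x) - e^x|."""
    grid = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(taylor_poly(n, x) - math.exp(x)) for x in grid)


errors = [sup_dist_to_exp(n) for n in (2, 4, 8, 16)]
for n, err in zip((2, 4, 8, 16), errors):
    print(f"n = {n:2d}: approximate sup distance = {err:.3e}")
```

Each $p_n$ is a polynomial, yet the function the sequence approaches in the sup metric is not.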
In these examples, the Cauchy sequence “acts” like it converges, but its limit
is not defined in the space we are working in.
Definition 4.3.4. A metric space is complete if every Cauchy sequence con-
verges to an element in the space.
Completeness is a property that has to be established, and this may or may
not be easy to do! As a first example, we prove
Theorem 4.3.5. Rn is complete.
Proof. Let $\{x_n\}$ be a Cauchy sequence in $\mathbb{R}^n$. By Theorem 4.2.7, $\{x_n\}$ is bounded, so it is contained in a compact $n$-cell. This means that it has a convergent subsequence, and by Theorem 4.2.6 it converges itself. □
As a second example, we present
Example 4.3.6. Recall that a real valued function f is bounded on an interval
[a, b] if there is a constant M such that
|f (x)| ≤ M for a ≤ x ≤ b.
Also, a continuous function is bounded on a finite closed interval but a function
that is bounded does not have to be continuous.
We define
$$M([a, b]) = \{f : [a, b] \to \mathbb{R} \mid f \text{ is bounded}\}$$
and introduce the metric
$$d(f, g) = \sup_{[a, b]} |f(x) - g(x)|$$
for $f, g \in M([a, b])$.


Exercise 4.3.7. Show (M([a, b]), d) is a metric space.
We show that M([a, b]) is complete. This is a three step process:
(1) Find a natural candidate for a limit for a Cauchy sequence.
(2) Verify that the limit is in the metric space.
(3) Prove the Cauchy sequence converges to this limit.
Let $\{f_n\}$ be a Cauchy sequence in $M([a, b])$. For a fixed $x$ in $[a, b]$, consider the sequence of numbers $\{f_n(x)\}$. This is a Cauchy sequence in $(\mathbb{R}, |\cdot|)$, since
$$|f_n(x) - f_m(x)| \le \sup_{a \le y \le b} |f_n(y) - f_m(y)| = d(f_n, f_m).$$
Since $\mathbb{R}$ is complete, $\{f_n(x)\}$ converges to a real number. Define the function $f : [a, b] \to \mathbb{R}$ by
$$f(x) = \lim_{n \to \infty} f_n(x), \quad a \le x \le b.$$
This is our candidate for the limit.
We next show $f \in M([a, b])$, i.e., $f$ is bounded. Choose $N$ so that $d(f_m, f_n) \le 1$ for $n, m \ge N$. In particular,
$$\sup_{a \le x \le b} |f_N(x) - f_m(x)| \le 1 \quad \text{for } m \ge N.$$
We let $m \to \infty$ in this inequality and obtain
$$\sup_{a \le x \le b} |f_N(x) - f(x)| \le 1.$$
Since $f_N \in M([a, b])$, there is an $M$ such that $\sup_{a \le x \le b} |f_N(x)| \le M$ and so
$$\sup_{a \le x \le b} |f(x)| \le M + 1.$$

Finally, we show $\{f_n\} \to f$ in $(M([a, b]), d)$. Note that we have pointwise convergence by construction, but we do not automatically have uniform convergence, which is the convergence notion in $(M([a, b]), d)$.
Let $\epsilon > 0$. There is an $N$ such that
$$d(f_m, f_N) < \epsilon \quad \text{for } m \ge N.$$
This means
$$|f_m(x) - f_N(x)| < \epsilon$$
for $a \le x \le b$ and $m \ge N$. Taking the limit as $m \to \infty$ yields
$$|f(x) - f_N(x)| \le \epsilon, \quad a \le x \le b.$$
For $n \ge N$,
$$|f_n(x) - f(x)| \le |f_n(x) - f_N(x)| + |f_N(x) - f(x)| < 2\epsilon$$
for $a \le x \le b$. Hence, $d(f_n, f) \le 2\epsilon$ for $n \ge N$.
We can characterize completeness in terms of a generalization of the Cantor
Intersection Property (recall Definition 3.1.23).
Definition 4.3.8. Let $A$ be a nonempty subset of a metric space $(\chi, d)$. The diameter of $A$ is defined by
$$\operatorname{diam}(A) = \sup_{x, y \in A} d(x, y).$$

Notice that diam(A) can be infinite and the diameter of a set consisting of a
single point is zero.
We have
Theorem 4.3.9. A metric space is complete if and only if the intersection of
every descending sequence of nonempty, closed sets whose diameters approach zero
consists of a single point.
Proof. Suppose first that $(\chi, d)$ is complete. Let $\{F_n\}$ be a sequence of closed sets in $(\chi, d)$ with $F_1 \supset F_2 \supset F_3 \supset \cdots$ and $\operatorname{diam}(F_n) \to 0$, where $F_n \ne \emptyset$ for all $n$.
Choose $x_n \in F_n$ for $n \ge 1$. Note that since the sets are descending,
$$\sup_{x, y \in F_n} d(x, y) \ge \sup_{x, y \in F_{n+1}} d(x, y).$$
Given $\epsilon > 0$, choose $N$ such that $\operatorname{diam}(F_n) < \epsilon$ for $n \ge N$. Since $x_n, x_m \in F_N$ for $n, m \ge N$, this implies that $d(x_n, x_m) < \epsilon$ for $n, m \ge N$. Hence $\{x_n\}$ is a Cauchy sequence and there is a point $x$ such that $x_n \to x$. We claim $\{x\} = \bigcap_{n=1}^{\infty} F_n$. First note that if $x, y \in \bigcap_{n=1}^{\infty} F_n$, then $x, y \in F_n$ for all $n$, and since $d(x, y) \le \operatorname{diam}(F_n)$ for all $n$, $d(x, y) = 0$, i.e., $x = y$. Hence, $\bigcap_{n=1}^{\infty} F_n$ can consist of at most one point.
Now we claim $x \in F_n$ for all $n$. If not, since $F_n^c$ is open for all $n$, there is an $n$ and an $\epsilon > 0$ such that $N_\epsilon(x) \subset F_n^c$. But then $d(x, y) \ge \epsilon$ for $y \in F_n$. Since $x_m \in F_m \subset F_n$ for $m \ge n$, this means $d(x, x_m) \ge \epsilon > 0$ for $m \ge n$, contradicting $x_n \to x$.
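Conversely, suppose the intersection property holds, and let $\{x_n\}$ be a Cauchy sequence in $(\chi, d)$. Choose $N_1 > 0$ such that $n, m \ge N_1$ implies $d(x_n, x_m) < \frac{1}{2}$. Set $x_{N_1}$ as the first term in a subsequence. Given $N_1, N_2, \dots, N_{k-1}$, choose $N_k > N_{k-1}$ such that $m, n \ge N_k$ implies $d(x_n, x_m) < \frac{1}{2^k}$. Let $x_{N_k}$ be the $k$th term in the subsequence. Now
$$d(x_{N_k}, x_{N_{k+1}}) < \frac{1}{2^k}$$
since $N_{k+1} > N_k$.
Define the sequence of closed "balls"
$$B_k = \overline{N}_{1/2^{k-1}}(x_{N_k}), \quad k = 1, 2, 3, \dots,$$
so that $B_1 = \overline{N}_{1}(x_{N_1})$, $B_2 = \overline{N}_{1/2}(x_{N_2})$, $B_3 = \overline{N}_{1/4}(x_{N_3})$, and so on. Here $\overline{N}_r(x)$ denotes the closed ball $\{y \mid d(x, y) \le r\}$.
These closed sets are nonempty, since $x_{N_k} \in B_k$, and $\operatorname{diam}(B_k) \le 2^{2-k} \to 0$. Moreover, they are descending. If $y \in B_{k+1}$, then
$$d(y, x_{N_k}) \le d(y, x_{N_{k+1}}) + d(x_{N_{k+1}}, x_{N_k}) \le \frac{1}{2^k} + \frac{1}{2^k} = \frac{1}{2^{k-1}},$$
so $B_{k+1} \subset B_k$. By assumption, there is a unique point $x \in \bigcap_{k=1}^{\infty} B_k$. Since $d(x, x_{N_k}) \le \frac{1}{2^{k-1}}$, $x_{N_k} \to x$. Since $\{x_n\}$ is Cauchy, $x_n \to x$ by Theorem 4.2.6. □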
We can characterize complete subsets of a complete metric space very nicely:
Theorem 4.3.10. Let (χ, d) be a complete metric space. A subspace Y ⊂ χ is
complete if and only if Y is closed.
Proof. Suppose $Y$ is closed and $\{x_n\}$ is a Cauchy sequence in $Y$. Since $\chi$ is complete, $x_n \to x \in \chi$. But $Y$ is closed, so $x \in Y$. This means $Y$ is complete.
If $Y$ is complete, then let $x$ be a limit point of $Y$. There is a sequence $\{x_n\}$ in $Y$ with $x_n \to x$. This sequence converges in $\chi$, so it is a Cauchy sequence in $\chi$, and therefore in $Y$. Since $Y$ is complete, $x_n \to \tilde{x} \in Y$, and by uniqueness of limits $\tilde{x} = x$. Hence $x \in Y$ and $Y$ is closed. □

4.4. Cauchy Sequences and Compactness


In this section, we develop the Cauchy sequence analog of sequential compact-
ness, and complete the last part of Theorem 3.4.4:
Theorem 4.4.1 (Borel-Lebesgue Theorem, part II). Let $(X, d)$ be a metric space and $K \subset X$. $K$ is compact if and only if it is complete and every sequence in $K$ has a Cauchy subsequence.
Proof. Suppose K is compact and let {xn } be a Cauchy sequence in K. {xn }
has a convergent subsequence {xnk } with limit in K. By Theorem 4.2.6, {xn }
converges to the same limit. So, K is complete.
For the other direction, assume {xn } is a sequence in K. We know that {xn }
has a Cauchy subsequence {xnk } that converges to x ∈ K, since K is complete.
This shows K is (sequentially) compact. 
We conclude by relating the concepts of the existence of a Cauchy subsequence
of an arbitrary sequence and total boundedness.
Theorem 4.4.2. A subset of a metric space is totally bounded if and only if
every sequence in the subset has a Cauchy subsequence.
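Proof. Assume $(X, d)$ is a metric space and $A \subset X$ is totally bounded. Let $\{x_n\}$ be a sequence in $A$.
For $\epsilon_1 = 1$, choose $y_{1,1}, \dots, y_{1,N_1} \in X$ such that $A \subset \bigcup_{k=1}^{N_1} N_{\epsilon_1}(y_{1,k})$. At least one of the neighborhoods $N_{\epsilon_1}(y_{1,k})$ contains an infinite number of the $\{x_n\}$. Suppose this is $N_{\epsilon_1}(y_{1,k_1})$. Choose $x_{n_1}$ to be one of the infinite number of $\{x_n\}$ in $N_{\epsilon_1}(y_{1,k_1})$.
For $\epsilon_2 = \frac{1}{2}$, choose $y_{2,1}, \dots, y_{2,N_2}$ such that $A \subset \bigcup_{k=1}^{N_2} N_{\epsilon_2}(y_{2,k})$. For some $k$, $N_{\epsilon_2}(y_{2,k}) \cap N_{\epsilon_1}(y_{1,k_1})$ contains an infinite number of the $\{x_n\}$. Say this is $N_{\epsilon_2}(y_{2,k_2}) \cap N_{\epsilon_1}(y_{1,k_1})$ and choose $x_{n_2}$, with $n_2 > n_1$, to be one of the points $\{x_n\}$ there.
Inductively, for $\epsilon_m = \frac{1}{m}$, we choose $y_{m,1}, \dots, y_{m,N_m}$ so that $A \subset \bigcup_{k=1}^{N_m} N_{\epsilon_m}(y_{m,k})$. For some $1 \le k_m \le N_m$, $N_{\epsilon_m}(y_{m,k_m}) \cap N_{\epsilon_{m-1}}(y_{m-1,k_{m-1}}) \cap \cdots \cap N_{\epsilon_1}(y_{1,k_1})$ contains an infinite number of $\{x_n\}$. Choose $x_{n_m}$, with $n_m > n_{m-1}$, to be one of these points. By construction, $\{x_{n_m}\}$ is Cauchy: for $k \ge m$, both $x_{n_m}$ and $x_{n_k}$ lie in $N_{\epsilon_m}(y_{m,k_m})$, so for $\epsilon > 0$,
$$d(x_{n_m}, x_{n_k}) < 2\epsilon_{\min(m,k)} = 2\max\left\{\frac{1}{m}, \frac{1}{k}\right\} < \epsilon$$
for $k, m > \frac{2}{\epsilon}$.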
Next, assume every sequence in $A$ has a Cauchy subsequence. For $\epsilon > 0$, choose $x_1 \in A$. Then choose $x_2 \in A$ such that $x_2 \notin N_\epsilon(x_1)$. Choose $x_3 \in A$ so $x_3 \notin N_\epsilon(x_2) \cup N_\epsilon(x_1)$. Proceed inductively. This process must stop after a finite number of steps, since otherwise we would obtain a sequence whose terms are pairwise at least $\epsilon$ apart, and such a sequence can have no Cauchy subsequence. Thus, $A \subset \bigcup_{m=1}^{N} N_\epsilon(x_m)$ for some $N$. □
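The greedy selection in the second half of the proof can be sketched in code. The example below is an illustration under assumptions, not part of the proof: it works on a finite sample of points in the unit square with the Euclidean metric, and the function name `greedy_net` and the sample itself are hypothetical.

```python
import math
import random


def greedy_net(points, eps):
    """Pick centers greedily: keep a point only if it lies outside every
    open eps-ball around the centers chosen so far."""
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers


random.seed(0)
points = [(random.random(), random.random()) for _ in range(500)]
eps = 0.25
net = greedy_net(points, eps)

# Every sample point lies within eps of some chosen center.
covered = all(any(math.dist(p, c) < eps for c in net) for p in points)
print(len(net), "centers; all points covered:", covered)
```

The chosen centers are pairwise at least `eps` apart, which is the property that forces the selection to stop in the totally bounded case.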

We can restate
Theorem 4.4.3 (Borel-Lebesgue Theorem II). A subset of a metric space is compact if and only if it is complete and totally bounded.
CHAPTER 5

Sequences in Rn

As an application of sequences, we consider Rn . Because of the vector space


structure on Rn and the order structure on R, we can say more about sequences
and convergence.

5.1. Arithmetic Properties and Convergence


The first issue is to develop the arithmetic properties of convergence:

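Theorem 5.1.1. Let $\{a_n\}$ and $\{b_n\}$ be convergent sequences in $(\mathbb{R}, |\cdot|)$ with $a_n \to a$ and $b_n \to b$. Then,
(1) $\lim_{n \to \infty} (a_n + b_n) = a + b$.
(2) $\lim_{n \to \infty} c a_n = ca$ and $\lim_{n \to \infty} (c + a_n) = c + a$ for any $c \in \mathbb{R}$.
(3) $\lim_{n \to \infty} a_n b_n = ab$.
(4) $\lim_{n \to \infty} \frac{1}{a_n} = \frac{1}{a}$, provided $a \ne 0$ and $a_n \ne 0$ for all $n$.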

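Proof. (1) Let $\epsilon > 0$ be given. There exists $N_1 \in \mathbb{N}$ such that $n \ge N_1$ implies $|a_n - a| < \frac{\epsilon}{2}$. Also, there exists $N_2 \in \mathbb{N}$ such that $n \ge N_2$ implies $|b_n - b| < \frac{\epsilon}{2}$. So, $n \ge \max(N_1, N_2)$ implies
$$|a_n + b_n - (a + b)| \le |a_n - a| + |b_n - b| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$
(2) Let $c \in \mathbb{R}$ and $\epsilon > 0$. Notice the result is obvious for $c = 0$, so assume $c \ne 0$. There exists $N$ such that $n \ge N$ implies $|a_n - a| < \min(\epsilon, \frac{\epsilon}{|c|})$. Then $n \ge N$ implies
$$|c + a_n - (c + a)| = |a_n - a| < \epsilon$$
and
$$|c a_n - ca| = |c||a_n - a| < |c| \frac{\epsilon}{|c|} = \epsilon.$$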
(3) Let $\epsilon > 0$. We have two cases. First, assume $ab = 0$. Then $a$ or $b$ is zero. Without loss of generality assume $a = 0$. We know by Theorem 4.2.7 that there exists $B > 0$ such that $|b_n| < B$ for all $n$. There exists $N \in \mathbb{N}$ such that $n \ge N$ implies $|a_n - a| = |a_n| < \frac{\epsilon}{B}$. But now $n \ge N$ implies
$$|a_n b_n - ab| = |a_n b_n| = |a_n||b_n| < \frac{\epsilon}{B} B = \epsilon.$$
Second, assume $ab \ne 0$. Again by Theorem 4.2.7, there exists $B > |b|$ such that $|a_n| < B$ and $|b_n| < B$ for all $n$. Also, there exist $N_1$ and $N_2$ such that $n \ge N_1$ implies $|a_n - a| < \frac{\epsilon}{2B}$ and $n \ge N_2$ implies $|b_n - b| < \frac{\epsilon}{2B}$. So, $n \ge \max(N_1, N_2)$ implies
$$|a_n b_n - ab| = |a_n(b_n - b) + b(a_n - a)| \le |a_n||b_n - b| + |b||a_n - a| < B\frac{\epsilon}{2B} + B\frac{\epsilon}{2B} = \epsilon.$$

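(4) Let $\epsilon > 0$. There exists $N \in \mathbb{N}$ such that $n \ge N$ implies $|a - a_n| < \min(\frac{|a|^2 \epsilon}{2}, \frac{|a|}{2})$. Then $n \ge N$ also implies $|a_n| > \frac{|a|}{2}$, and so
$$\left|\frac{1}{a_n} - \frac{1}{a}\right| = \frac{|a - a_n|}{|a a_n|} < \frac{|a - a_n|}{|a|^2 / 2} < \epsilon.$$
□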

Theorem 5.1.2. Suppose $\{x_m\}$ is a sequence in $(\mathbb{R}^n, \|\cdot\|)$. Then, $x_m \to x$ if and only if each component of $x_m$ converges to the corresponding component of $x$.
Proof. Let $x_m = (x_{m,1}, \dots, x_{m,n})$ and $x = (y_1, \dots, y_n)$. The claim is
$$(x_m \to x) \Leftrightarrow (x_{m,k} \to y_k), \quad 1 \le k \le n.$$
If $x_m \to x$, then the inequality
$$|x_{m,k} - y_k| \le \|x_m - x\|, \quad 1 \le k \le n,$$
shows $x_{m,k} \to y_k$ for $1 \le k \le n$.
If $x_{m,k} \to y_k$, $1 \le k \le n$, then given $\epsilon > 0$ there is an $N_k$ such that
$$|x_{m,k} - y_k| < \epsilon \quad \text{for } m \ge N_k$$
for each $1 \le k \le n$. For $m \ge \max\{N_1, \dots, N_n\}$,
$$\|x_m - x\| = \left(\sum_{k=1}^{n} (x_{m,k} - y_k)^2\right)^{1/2} < \sqrt{n}\,\epsilon.$$
□

Theorem 5.1.3. Suppose {xm } and {ym } are sequences in (Rn , k · k) and {am }
is a sequence in (R, | · |) such that xm → x, ym → y, am → a. Then
(1) lim (xm + ym ) = x + y.
m→∞
(2) lim xm · ym = x · y.
m→∞
(3) lim am xm = ax.
m→∞

Proof. (1) For each $1 \le k \le n$, the $k$th components of $\{x_m\}$ and $\{y_m\}$ converge to the $k$th components of $x$ and $y$, respectively. By Theorem 5.1.1, then, the $k$th component of $\{x_m + y_m\}$ converges to the $k$th component of $x + y$ for all $1 \le k \le n$. Now by Theorem 5.1.2, we have the desired result.
(2) Let $x_m = (x_{m,1}, x_{m,2}, \dots, x_{m,n})$ and $y_m = (y_{m,1}, y_{m,2}, \dots, y_{m,n})$ for all $m \in \mathbb{N}$, and let $x = (c_1, c_2, \dots, c_n)$ and $y = (d_1, d_2, \dots, d_n)$. Then
$$\lim_{m \to \infty} x_m \cdot y_m = \lim_{m \to \infty} \sum_{k=1}^{n} x_{m,k} y_{m,k} = \sum_{k=1}^{n} \lim_{m \to \infty} x_{m,k} y_{m,k} = \sum_{k=1}^{n} c_k d_k = x \cdot y,$$
where the third equality follows from Theorem 5.1.1.
(3) Using the notation above, notice $a_m x_m = (a_m x_{m,1}, \dots, a_m x_{m,n})$. By Theorem 5.1.1, the $k$th of these components converges to $a c_k$, so by Theorem 5.1.2, we have the desired result. □


5.2. Sequences in R and Order.


Order can be used to say a lot about convergence. For example,
Definition 5.2.1. A sequence {xn } in (R, | · |) is
(1) monotonically increasing if xn ≤ xn+1 for n = 1, 2, 3, ....
(2) monotonically decreasing if xn ≥ xn+1 for n = 1, 2, 3, ....
If it is either, we say it is monotonic.
Theorem 5.2.2. Suppose {xn } in (R, | · |) is monotonic. Then {xn } converges
if and only if it is bounded.
Proof. We know convergence implies boundedness. Conversely, assume $\{x_n\}$ is bounded and $x_n \le x_{n+1}$ for all $n$. Let $A = \{x_n \mid n \in \mathbb{N}\}$ (the range of $\{x_n\}$, i.e., the set of elements in the sequence), and since $\{x_n\}$ is bounded, let $x = \sup A < \infty$. We claim $x_n \to x$.
Given $\epsilon > 0$, there is an $N$ such that
$$x - \epsilon < x_N \le x,$$
otherwise $x - \epsilon$ is an upper bound for $A$ smaller than $x$. But then, since the sequence is increasing,
$$x - \epsilon < x_n \le x \quad \text{for all } n \ge N.$$
Now, if $\{x_n\}$ is monotonically decreasing, then $\{-x_n\}$ is monotonically increasing and converges, and by Theorem 5.1.1, part (2), $\{x_n\}$ converges. □
We would like to characterize the “size” of the range of a sequence. To do
this, we also have to consider the subsequential limits and the possibility that the
sequence is unbounded. For this, we consider two special instances of divergence.
Definition 5.2.3. If {xn } is a sequence in (R, | · |) such that for any M there
is an N with xn ≥ M for n ≥ N , we write xn → ∞ and say {xn } diverges to
infinity.
If {xn } is a sequence in (R, | · |) such that for any M there is an N with xn ≤ M
for n ≥ N , we write xn → −∞ and say that {xn } diverges to −∞.
We can now define the “bounds” on the range of a sequence in the limit of large
index.
Definition 5.2.4. Let $\{x_n\}$ be a sequence in $\mathbb{R}$ and let $A$ consist of all the subsequential limits of $\{x_n\}$ plus $\infty$ or $-\infty$ if there is a subsequence that diverges to $\infty$ or $-\infty$ respectively. Let
$$x^* = \sup A, \quad x_* = \inf A.$$
$x^*$ and $x_*$ are called the upper and lower limits of $\{x_n\}$ and we write
$$x^* = \limsup_{n \to \infty} x_n, \quad x_* = \liminf_{n \to \infty} x_n.$$
(Note: Here, we use $\infty$ as the supremum of a set of numbers unbounded above and $-\infty$ as the infimum of a set of numbers unbounded below.)
Example 5.2.5. $\{x_n\} = \{1, -1, 1^2, 1, -1, 2^2, 1, -1, 3^2, 1, -1, 4^2, \dots\}$.
Here, $A = \{1, -1, \infty\}$, $\limsup_{n \to \infty} x_n = \infty$, and $\liminf_{n \to \infty} x_n = -1$.

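Example 5.2.6. Let $\{x_n\}$ be an enumeration of $\mathbb{Q}$. Then $\limsup_{n \to \infty} x_n = \infty$ and $\liminf_{n \to \infty} x_n = -\infty$.
Example 5.2.7. $\{x_n\} = \{(-1)^n (1 + \frac{1}{n})\} = \{-2, \frac{3}{2}, \frac{-4}{3}, \frac{5}{4}, \frac{-6}{5}, \frac{7}{6}, \dots\}$.
Here, $\limsup_{n \to \infty} x_n = 1$ and $\liminf_{n \to \infty} x_n = -1$.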

We can immediately connect these notions to convergence:


Theorem 5.2.8. A sequence $\{x_n\}$ in $(\mathbb{R}, |\cdot|)$ converges to $x$ if and only if
$$\limsup_{n \to \infty} x_n = \liminf_{n \to \infty} x_n = x.$$

Exercise 5.2.9. Prove Theorem 5.2.8.


There are other ways to define lim sup and lim inf and these give alternatives
for computing their values.
Theorem 5.2.10. Let $\{x_n\}$ be a sequence in $(\mathbb{R}, |\cdot|)$. Then
(1) $\limsup_{n \to \infty} x_n = \lim_{n \to \infty} \sup\{x_n, x_{n+1}, x_{n+2}, \dots\}$.
(2) $\liminf_{n \to \infty} x_n = \lim_{n \to \infty} \inf\{x_n, x_{n+1}, x_{n+2}, \dots\}$.
(3) If $x = \limsup_{n \to \infty} x_n$, then for every $\epsilon > 0$ there is an $N$ such that $x_n \le x + \epsilon$ for $n \ge N$.
(4) If $x = \liminf_{n \to \infty} x_n$, then for every $\epsilon > 0$ there is an $N$ such that $x_n \ge x - \epsilon$ for $n \ge N$.
Exercise 5.2.11. Prove Theorem 5.2.10.
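Parts (1) and (2) of Theorem 5.2.10 suggest a way to estimate upper and lower limits numerically. The sketch below applies them to the sequence of Example 5.2.7. It is an illustration under an assumption: the infinite tails are truncated at a finite horizon, which is harmless for this particular sequence because each tail attains its supremum and infimum within its first two terms. The names `HORIZON`, `tail_sup`, and `tail_inf` are introduced here.

```python
def x(n: int) -> float:
    """n-th term of the sequence (-1)^n (1 + 1/n)."""
    return (-1) ** n * (1 + 1 / n)


HORIZON = 10_000  # finite stand-in for the infinite tail (assumption)
seq = [x(n) for n in range(1, HORIZON + 1)]


def tail_sup(n: int) -> float:
    """sup{x_n, x_{n+1}, ...}, truncated at HORIZON."""
    return max(seq[n - 1:])


def tail_inf(n: int) -> float:
    """inf{x_n, x_{n+1}, ...}, truncated at HORIZON."""
    return min(seq[n - 1:])


for n in (1, 10, 100, 1000):
    print(f"n = {n:4d}: tail sup = {tail_sup(n):.4f}, tail inf = {tail_inf(n):.4f}")
```

The tail suprema decrease toward 1 and the tail infima increase toward -1, matching the values stated in Example 5.2.7.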
The following result is interesting and important.
Theorem 5.2.12. Let $\{x_n\}$ be a sequence in $(\mathbb{R}, |\cdot|)$, $x^* = \limsup_{n \to \infty} x_n$, $x_* = \liminf_{n \to \infty} x_n$, and $A$ the set of subsequential limits of $\{x_n\}$ as in Definition 5.2.4. Then $x^*$ and $x_*$ are in $A$.
Exercise 5.2.13. Prove Theorem 5.2.12.
We can use these ideas to relate two different sequences.
Theorem 5.2.14. Let $\{x_n\}$ and $\{y_n\}$ be sequences in $(\mathbb{R}, |\cdot|)$.
(1) If $x_n \le y_n$ for all $n \ge N$ for some fixed $N$, then
$$\liminf_{n \to \infty} x_n \le \liminf_{n \to \infty} y_n, \quad \limsup_{n \to \infty} x_n \le \limsup_{n \to \infty} y_n.$$
(2) If $0 \le x_n \le y_n$ for $n \ge N$, where $N$ is fixed, and $y_n \to 0$, then $x_n \to 0$.
Exercise 5.2.15. Prove Theorem 5.2.14.

5.3. Series
Series are just a special kind of sequence.
Definition 5.3.1. Let $\{x_n\}$ be a sequence in $(\mathbb{R}, |\cdot|)$. The partial sums associated to $\{x_n\}$ are defined by
$$S_n = \sum_{m=1}^{n} x_m.$$
The sequence of partial sums may or may not converge. If it does, we define
Definition 5.3.2. Let $\{x_n\}$ be a sequence in $(\mathbb{R}, |\cdot|)$ and $\{S_n\}$ the sequence of partial sums. If $\{S_n\}$ converges we define the series associated to $\{x_n\}$ as
$$\sum_{n=1}^{\infty} x_n = \lim_{n \to \infty} S_n = \lim_{n \to \infty} \sum_{m=1}^{n} x_m$$
and say the series converges. Otherwise, we say the series diverges.
The Cauchy Criterion for convergence becomes
Theorem 5.3.3. A series $\sum_{n=1}^{\infty} x_n$ converges if and only if for every $\epsilon > 0$, there is an $N$ such that
$$\left|\sum_{k=n}^{m} x_k\right| < \epsilon \quad \text{for } m > n \ge N.$$

Exercise 5.3.4. Prove Theorem 5.3.3.


This implies
Theorem 5.3.5. If a series $\sum_{n=1}^{\infty} x_n$ converges, then $\lim_{n \to \infty} x_n = 0$.

Exercise 5.3.6. Prove Theorem 5.3.5


The converse does not hold.
Example 5.3.7. $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges.
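One way to see the divergence in Example 5.3.7 through the Cauchy Criterion (Theorem 5.3.3) is that the block of terms from index $n+1$ to $2n$ always sums to at least $\frac{1}{2}$, since it contains $n$ terms each at least $\frac{1}{2n}$. The sketch below checks this numerically for a few values of $n$; the helper name `partial_sum` is an assumption made for illustration.

```python
def partial_sum(n: int) -> float:
    """S_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


# S_{2n} - S_n is the sum of the terms with indices n+1, ..., 2n.
blocks = [partial_sum(2 * n) - partial_sum(n) for n in (1, 10, 100, 1000)]
for n, block in zip((1, 10, 100, 1000), blocks):
    print(f"S_{2 * n} - S_{n} = {block:.6f}")
```

Because these blocks never drop below $\frac{1}{2}$, the condition in Theorem 5.3.3 fails for $\epsilon = \frac{1}{2}$, even though the individual terms $\frac{1}{n}$ tend to zero.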
In general, if the summands in a series vary in sign, then convergence becomes
a delicate and difficult issue. When the summands have the same sign, things are
simpler. For example,
Theorem 5.3.8. A series of nonnegative terms converges if and only if the
partial sums form a bounded sequence.
Exercise 5.3.9. Prove Theorem 5.3.8
CHAPTER 6

Continuous Functions on Metric Spaces

We now study functions from one metric space to another. Recall Defini-
tion 1.2.1.

Example 6.0.10. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
$$f(x_1, x_2) = \begin{cases} \dfrac{x_1 x_2}{x_1^2 + x_2^2}, & (x_1, x_2) \ne (0, 0), \\ 0, & (x_1, x_2) = (0, 0). \end{cases}$$
Example 6.0.11. Define $T : C([a, b]) \to \mathbb{R}$ by
$$T(f) = \int_a^b f(s)\,ds.$$
Example 6.0.12. Define $S : C([a, b]) \to C([a, b])$ by
$$(S(f))(t) = \int_a^t f(s)\,ds, \quad a \le t \le b.$$
Example 6.0.13. Let $C^1([a, b])$ consist of those functions in $C([a, b])$ with a continuous first derivative. Define $D : C^1([a, b]) \to C([a, b])$ by
$$D(f)(x) = f'(x), \quad a \le x \le b.$$

Example 6.0.14. Suppose $f \in C([0, \pi])$. Recall that we define its Fourier sine series as
$$\sum_{k=1}^{\infty} a_k \sin(kx), \quad 0 \le x \le \pi,$$
with
$$a_k = \frac{2}{\pi} \int_0^{\pi} f(s) \sin(ks)\,ds.$$
We consider the function $F$ that takes a function $f$ to its sequence of Fourier coefficients $\{a_1, a_2, a_3, \dots\}$. Bessel's inequality states
$$\sum_{k=1}^{\infty} a_k^2 \le \frac{2}{\pi} \int_0^{\pi} f^2(s)\,ds < \infty.$$
Hence, $F : C([0, \pi]) \to \ell^2$.
This gives you some idea of the tremendous range of applications of functions on metric spaces.

6.1. Limit of a Function


Given our interest in sequences, it is no surprise that we consider functions that
have special behavior when applied to sequences. Namely, if a sequence converges,
then we want the image under a function in question to converge. We begin by
defining the notion of a limit of a function of a sequence.
Definition 6.1.1. Let $(X, d_x)$, $(Y, d_y)$ be metric spaces, $A \subset X$, $f$ a function with $f : A \to Y$, and $x$ a limit point of $A$. We write
$$\lim_{s \to x} f(s) = y \quad \text{or} \quad f(s) \to y \text{ as } s \to x$$
if
$$\lim_{n \to \infty} f(x_n) = y$$
for every sequence $\{x_n\}$ in $A$ such that $x_n \ne x$ for all $n$ and $\lim_{n \to \infty} x_n = x$. We say that $f$ has a limit at $x$.
Note: The convergence $\lim_{n \to \infty} x_n = x$ is in the metric of $X$, i.e., $\lim_{n \to \infty} d_x(x_n, x) = 0$, while the convergence $\lim_{n \to \infty} f(x_n) = y$ is in the metric of $Y$, i.e., $\lim_{n \to \infty} d_y(f(x_n), y) = 0$.
Example 6.1.2. Note that for the function in Figure 6.1, $\lim_{s \to x} f(s) = y \ne f(x)$. This is one reason for having the condition $x_n \ne x$ on the sequence.

[Figure 6.1: graph of a function $f$ on $A$ whose value $f(x)$ at the point $x$ differs from its limit there.]

Example 6.1.3. Consider $T$ in Ex. 6.0.11. $T$ has a limit at $f \in C([a, b])$ if there is a number $I$ with
$$\int_a^b f_n(s)\,ds \to I$$
for all sequences $\{f_n\}$ in $C([a, b])$ with $f_n \ne f$ and $f_n \to f$ in $C([a, b])$, i.e., uniform convergence. Is this true? More later...
It follows immediately that
Theorem 6.1.4. Let $f : A \subset X \to Y$, where $(X, d_x)$, $(Y, d_y)$ are metric spaces. If $f$ has a limit at a point $x$, then this limit is unique.
Example 6.1.5. Consider Ex. 6.0.10, and let $\{(x_{1,n}, x_{2,n})\}$ be a sequence in $\mathbb{R}^2$. If $(x_{1,n}, x_{2,n}) \to (0, 0)$, with $x_{1,n} \ne 0$ for all $n$ and $x_{2,n} = x_{1,n}$ for all $n$, then
$$f(x_{1,n}, x_{2,n}) = \frac{x_{1,n}^2}{2 x_{1,n}^2} = \frac{1}{2} \quad \text{for all } n.$$
If $\{(x_{1,n}, x_{2,n})\} = \{(0, x_{2,n})\}$, where $x_{2,n} \ne 0$ for all $n$, then
$$f(x_{1,n}, x_{2,n}) = 0 \quad \text{for all } n.$$
Since these two sequences give different limits, $f$ cannot have a limit at $(0, 0)$.
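The two sequences in Example 6.1.5 can be evaluated directly. The sketch below is only an illustration; the specific choices $x_{1,n} = x_{2,n} = \frac{1}{n}$ on the diagonal and $(0, \frac{1}{n})$ on the axis are assumptions that satisfy the conditions stated in the example.

```python
def f(x1: float, x2: float) -> float:
    """The function of Ex. 6.0.10."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 * x2 / (x1 ** 2 + x2 ** 2)


# Approach (0, 0) along the diagonal and along the x2-axis.
diagonal = [f(1 / n, 1 / n) for n in range(1, 8)]
axis = [f(0.0, 1 / n) for n in range(1, 8)]

print("along the diagonal:", diagonal)
print("along the axis:    ", axis)
```

Both sequences of points converge to $(0, 0)$, but the images stay at two different constants, so no single value can serve as the limit.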
This example points to the difficulties that can happen in a general metric
space. The condition “for all sequences {xn } with xn → x, xn 6= x all n” is very
strong.
We can characterize the notion of a limit at a point using sets.
Theorem 6.1.6. Let $(X, d_x)$, $(Y, d_y)$ be metric spaces, $A \subset X$, $f$ a function with $f : A \to Y$, and $x$ a limit point of $A$. $f$ has limit $y$ at $x$ if and only if for every $\epsilon > 0$ there is a $\delta > 0$ such that $d_y(f(s), y) < \epsilon$ for all $s \in A$ with $0 < d_x(x, s) < \delta$.
Proof. Suppose the second property is false. There is an $\epsilon > 0$ such that for every $\delta > 0$ there is an $s \in A$ for which $0 < d_x(x, s) < \delta$ but $d_y(f(s), y) \ge \epsilon$. Take $\delta_n = \frac{1}{n}$ for $n \in \mathbb{N}$. We get a sequence $\{x_n\}$ in $A$ with $x_n \ne x$ and $x_n \to x$, but $\{f(x_n)\}$ cannot converge to $y$.
On the other hand, suppose the second property holds and $\{x_n\}$ is a sequence in $A$ with $x_n \ne x$ for all $n$ and $x_n \to x$. Given $\epsilon > 0$ there is a $\delta > 0$ such that
$$d_y(f(s), y) < \epsilon \quad \text{for } s \in A,\ 0 < d_x(s, x) < \delta.$$
Given $\delta$, there is an $N$ such that
$$0 < d_x(x, x_n) < \delta, \quad n \ge N.$$
Hence,
$$d_y(f(x_n), y) < \epsilon, \quad n \ge N.$$
□


6.2. Continuous Functions


We now consider functions that “preserve” convergence of a sequence.
Definition 6.2.1. Let (X, dx ), (Y, dy ) be metric spaces, A ⊂ X, f a function
with f : A → Y, and x ∈ A. f is continuous at x if for every sequence {xn }
in A that converges to x, the sequence {f (xn )} converges to f (x) in Y. If f is
continuous at every point in A, then f is continuous on A.
Note: f must be defined at x in order to be continuous at x.
Note: A function is continuous at any isolated point at which it is defined.
Example 6.2.2. The function in Ex. 6.0.10 is not continuous at $(0, 0)$.
Example 6.2.3. Regarding Ex. 6.1.3, ideally we would like to have
$$\int_a^b f_n(s)\,ds \to \int_a^b f(s)\,ds$$
for any sequence $\{f_n\}$ in $C([a, b])$ that converges to $f \in C([a, b])$. This would make the integral $\int_a^b \cdot\, ds$ continuous on $C([a, b])$. More later...
There are other characterizations of continuity:
Theorem 6.2.4. Let $(X, d_x)$, $(Y, d_y)$ be metric spaces, $A \subset X$, $f : A \to Y$, and $x \in A$.
(1) $f$ is continuous at $x$ if and only if for every $\epsilon > 0$, there is a $\delta > 0$ such that
$$d_y(f(s), f(x)) < \epsilon$$
for all $s \in A$ with $d_x(s, x) < \delta$.
(2) $f$ is continuous at $x$ if and only if for every neighborhood $V$ of $f(x)$ there is a neighborhood $U$ of $x$ such that $f(U \cap A) \subset V$.
Exercise 6.2.5. Prove Theorem 6.2.4.
As Rudin remarks, none of these ideas depend on X \ A, so we can restrict the
discussion to the case A = X. There is a useful characterization of continuity on a
set:
Theorem 6.2.6. Let $(X, d_x)$, $(Y, d_y)$ be metric spaces. A map $f : X \to Y$ is continuous on $X$ if and only if $f^{-1}(G)$ is open in $X$ for every open set $G \subset Y$.
Proof. Let $f$ be continuous on $X$ and $G \subset Y$ open. Suppose $x \in X$ and $f(x) \in G$. There is an $\epsilon > 0$ such that $y \in G$ if $d_y(f(x), y) < \epsilon$. Since $f$ is continuous at $x$, there is a $\delta > 0$ such that
$$d_y(f(s), f(x)) < \epsilon \quad \text{if } d_x(s, x) < \delta.$$
Hence, $s \in f^{-1}(G)$ as soon as $d_x(s, x) < \delta$, so $x$ is an interior point of $f^{-1}(G)$, and $f^{-1}(G)$ must be open.
Conversely, suppose $f^{-1}(G)$ is open in $X$ for every open $G \subset Y$. Choose $x \in X$ and $\epsilon > 0$. Let $G$ be the set of $y \in Y$ such that $d_y(y, f(x)) < \epsilon$. $G$ is open, so $f^{-1}(G)$ is open. There is a $\delta > 0$ such that $s \in f^{-1}(G)$ as soon as $d_x(s, x) < \delta$. But if $s \in f^{-1}(G)$, then $f(s) \in G$, so $d_y(f(s), f(x)) < \epsilon$. □
During the rest of this chapter, we discuss properties of continuous functions
in various settings. As a first result,
Theorem 6.2.7. Let $(W, d_w)$, $(X, d_x)$, $(Y, d_y)$ be metric spaces, $A \subset W$, $f : A \to X$, $g : f(A) \to Y$ and define $h : A \to Y$ by $h(w) = g(f(w))$ for $w \in A$. If $f$ is continuous at $w \in A$ and $g$ is continuous at $f(w)$, then $h$ is continuous at $w$.
Proof. Let $\epsilon > 0$ be given. There is an $\eta > 0$ such that
$$d_y(g(x), g(f(w))) < \epsilon$$
for $x \in f(A)$ with $d_x(x, f(w)) < \eta$. But, there is a $\delta > 0$ such that
$$d_x(f(\bar{w}), f(w)) < \eta$$
if $\bar{w} \in A$ and $d_w(\bar{w}, w) < \delta$. So, $d_y(h(\bar{w}), h(w)) < \epsilon$ for $\bar{w} \in A$ with $d_w(\bar{w}, w) < \delta$. □
Definition 6.2.8. The function $h$ in Theorem 6.2.7 is called the composition of $g$ with $f$.

6.3. Uniform Continuity


Recall that we say a function is continuous on a set $A$ if it is continuous at each point of $A$. It is important to realize that this is actually a rather weak property because the "degree" of continuity is allowed to vary with the point. Specifically, if $f : A \to Y$, where $A \subset X$ and $(X, d_x)$, $(Y, d_y)$ are metric spaces, and $f$ is continuous on $A$, then given $x \in A$ and $\epsilon > 0$, there is a $\delta = \delta_{x,\epsilon}$ that depends on both $x$ and $\epsilon$ such that $d_y(f(y), f(x)) < \epsilon$ for all $y \in A$ with $d_x(y, x) < \delta_{x,\epsilon}$.
Example 6.3.1. Consider $f(x) = \frac{1}{x}$ on $(0, 1)$ and fix $x \in (0, 1)$. We have
$$|f(x) - f(y)| = \left|\frac{1}{x} - \frac{1}{y}\right| = \left|\frac{y - x}{xy}\right| = \frac{|y - x|}{yx}$$
for $y \in (0, 1)$. If we restrict $y$ so $y > \frac{x}{2}$ (which allows $y$ to be close to $x$), we get
$$|f(y) - f(x)| \le \frac{2}{x^2} |x - y|.$$
Given $\epsilon > 0$, if we assume that
$$|x - y| < \frac{x^2 \epsilon}{2} = \delta_{x,\epsilon},$$
then $|f(y) - f(x)| < \epsilon$ for $y \in (0, 1)$, $y > \frac{x}{2}$, $|x - y| < \delta_{x,\epsilon}$. To guarantee that $f$ changes no more than $\epsilon$ by this argument, we have to restrict $y$ to a neighborhood of $x$ whose size decreases as $x$ becomes closer to zero.
Exercise 6.3.2. Convince yourself that if we choose $y = x - \frac{x^2 \epsilon}{2}$, then $|f(x) - f(y)|$ is on the order of $\epsilon$.
Definition 6.3.3. Let (X, dx ), (Y, dy ) be metric spaces, A ⊂ X, and f : A →
Y. f is uniformly continuous on A if for every  > 0 there is a δ > 0 such that
dy (f (x), f (y)) <  for all x, y ∈ A with dx (x, y) < δ.
Example 6.3.4. x1 is not uniformly continuous on (0, 1) (see the problem?!).
So, continuity does not imply uniform continuity.
Example 6.3.5. Consider $f(x) = x^2$ on (0, 1) with the usual metrics. Choosing $x, y \in (0, 1)$,
we have
\[ |f(x) - f(y)| = |x^2 - y^2| = |x + y||x - y| < 2|x - y|. \]
Given $\epsilon > 0$, if we choose any $x, y \in (0, 1)$ with $|x - y| < \delta = \frac{\epsilon}{2}$, then
$|f(x) - f(y)| < \epsilon$.
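Example 6.3.6. Consider $x^2$ on R. Given any $\delta > 0$, suppose we choose x and
y so $y = x - \frac{\delta}{2}$. Then,
\[ |x^2 - y^2| = |x + y||x - y| = \left|2x - \frac{\delta}{2}\right| \cdot \frac{\delta}{2}. \]
Given any $\epsilon > 0$, it is possible to choose x large enough that
\[ \left|2x - \frac{\delta}{2}\right| \cdot \frac{\delta}{2} > \epsilon, \]
namely, $x > \frac{\epsilon}{\delta} + \frac{\delta}{4}$. So, $x^2$ is not uniformly continuous on R.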
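The choice of x in Example 6.3.6 can be verified with exact rational arithmetic. In this hedged Python sketch the function name `witness` and the particular offset `+ 1` are illustrative assumptions; any x larger than $\frac{\epsilon}{\delta} + \frac{\delta}{4}$ would do.

```python
from fractions import Fraction

def witness(eps, delta):
    """Return (x, y) with |x - y| = delta/2 < delta but |x^2 - y^2| > eps."""
    x = eps / delta + delta / 4 + 1   # any x > eps/delta + delta/4 works
    y = x - delta / 2
    return x, y

eps = Fraction(1)
for delta in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    x, y = witness(eps, delta)
    gap = abs(x * x - y * y)
    print(f"delta = {float(delta):.0e}: x = {float(x):.1f}, |x^2 - y^2| = {float(gap):.6f}")
```

With this particular x the gap works out to exactly $\epsilon + \delta$, so no $\delta$ can serve every pair of points on R.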

There is a special class of functions that are uniformly continuous.


Definition 6.3.7. Let (X, dx ), (Y, dy ) be metric spaces, A ⊂ X, and f : A →
Y. f is Lipschitz continuous on A if there is a constant L such that
dy (f (x), f (y)) ≤ Ldx (x, y)
for all x, y ∈ A.
A Lipschitz continuous function is automatically uniformly continuous. Lip-
schitz continuity is not often discussed in standard analysis texts, but it is very
important in practice and is one of the most commonly encountered forms of con-
tinuity in differential equations, numerical analysis, and engineering.
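The claim that Lipschitz continuity implies uniform continuity takes one line to check. As a sketch, assuming $L > 0$ (if $L = 0$ then f is constant and there is nothing to prove), the choice $\delta = \epsilon / L$ works for every pair of points at once:

```latex
\[
d_x(x, y) < \delta := \frac{\epsilon}{L}
\quad\Longrightarrow\quad
d_y(f(x), f(y)) \le L\, d_x(x, y) < L \cdot \frac{\epsilon}{L} = \epsilon
\qquad \text{for all } x, y \in A .
\]
```

Note that $\delta$ depends only on $\epsilon$ and L, not on the points x and y.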
It turns out that often continuity is too weak an assumption to guarantee
needed properties and we must use uniform continuity. As one example,
Example 6.3.8. The function $f(x) = \frac{x}{1-x}$ is continuous on [0, 1) with the usual
metrics, but is not uniformly continuous on [0, 1).
Exercise 6.3.9. Show this.
The sequence $\{x_n\}$, with $x_n = 1 - \frac{1}{n}$, is a Cauchy sequence in [0, 1).
Exercise 6.3.10. Show this.
However, $f(x_n) = n - 1$, so $\{f(x_n)\}$ is not a Cauchy sequence in $f([0, 1))$.
In other words, the image of a Cauchy sequence under a continuous function
need not be a Cauchy sequence! However, we can prove
Theorem 6.3.11. Let (X, dx ), (Y, dy ) be metric spaces, A ⊂ X, f : A → Y
uniformly continuous on A, and {xn } a Cauchy sequence in A. Then, {f (xn )} is
a Cauchy sequence in Y.
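Proof. Given $\epsilon > 0$, there is a $\delta > 0$ such that $d_y(f(x), f(y)) < \epsilon$ for all
$x, y \in A$ with $d_x(x, y) < \delta$. Since $\{x_n\}$ is Cauchy, for this $\delta$ there is an N such
that $d_x(x_n, x_m) < \delta$ for $n, m > N$. Hence, $d_y(f(x_n), f(x_m)) < \epsilon$ for $n, m > N$. $\square$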
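The contrast between Example 6.3.8 and Theorem 6.3.11 can be seen with exact arithmetic. The following Python sketch is illustrative only; the names `f` and `x_n` simply mirror the notation of the example.

```python
from fractions import Fraction

def f(x):
    """f(x) = x / (1 - x) on [0, 1)."""
    return x / (1 - x)

def x_n(n):
    """The Cauchy sequence x_n = 1 - 1/n in [0, 1)."""
    return 1 - Fraction(1, n)

for n, m in ((10, 20), (1000, 2000)):
    dx = abs(x_n(n) - x_n(m))
    dy = abs(f(x_n(n)) - f(x_n(m)))
    print(f"n = {n}, m = {m}: |x_n - x_m| = {dx}, |f(x_n) - f(x_m)| = {dy}")
```

The inputs get closer together while the outputs move further apart, so the image sequence cannot be Cauchy.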

6.4. Continuity and Compactness


If we consider the examples in Section 6.3, we see that the properties of the un-
derlying set A are relevant to whether a function is merely continuous or uniformly
continuous. It turns out that continuity interacts very well with compactness.
As a first result,
Theorem 6.4.1. Suppose (X, dx ), (Y, dy ) are metric spaces, A ⊂ X, and f :
A → Y is continuous on A. If A is compact, then f (A) is compact.
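Proof. Let $\{G_\alpha\}$ be an open cover of $f(A)$. Since f is continuous, each set
$f^{-1}(G_\alpha)$ is open in A. Moreover, $\{f^{-1}(G_\alpha)\}$ covers A. Since A is compact, there
are $\alpha_1, ..., \alpha_n$ such that $A \subset f^{-1}(G_{\alpha_1}) \cup ... \cup f^{-1}(G_{\alpha_n})$. Since
$f(f^{-1}(B)) \subset B$ for all $B \subset Y$,
\[ f(A) \subset G_{\alpha_1} \cup ... \cup G_{\alpha_n}. \]
$\square$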
In fact, it turns out that continuity and uniform continuity are the same on
compact sets.
Theorem 6.4.2. Let $(X, d_x)$, $(Y, d_y)$ be metric spaces, $A \subset X$, and $f : A \to Y$
continuous on A. If A is compact, then f is uniformly continuous on A.
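Proof. This is a compactness argument.
Let $\epsilon > 0$ be given. To each $x \in A$, there is a number $\delta_x > 0$ such that
\[ d_y(f(x), f(y)) < \epsilon \text{ for } y \in A,\ d_x(x, y) < \delta_x. \]
Let $G_x$ be the set of $y \in X$ such that $d_x(x, y) < \frac{1}{2}\delta_x$. $G_x$ is open and since
$x \in G_x$, $\{G_x\}_{x \in A}$ is an open cover of A. Since A is compact, there are points
$x_1, ..., x_n$ in A such that $A \subset G_{x_1} \cup ... \cup G_{x_n}$. Set $\delta = \frac{1}{2}\min\{\delta_{x_1}, ..., \delta_{x_n}\}$.
It is crucial that we can use min rather than inf here: the minimum of finitely
many positive numbers is positive.
Suppose $x, y \in A$ and $d_x(x, y) < \delta$. Now, $x \in G_{x_m}$ for some $1 \le m \le n$, so
\[ d_x(x, x_m) < \frac{1}{2}\delta_{x_m}. \]
We have
\[ d_x(y, x_m) \le d_x(y, x) + d_x(x, x_m) < \delta + \frac{1}{2}\delta_{x_m} \le \delta_{x_m}. \]
Thus,
\[ d_y(f(x), f(y)) \le d_y(f(x), f(x_m)) + d_y(f(x_m), f(y)) < 2\epsilon. \]
$\square$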
This last theorem turns out to have tremendous consequences.
As a last result, we note that it is possible to say something about the continuity
of the inverse map in the setting of a compact space.
Theorem 6.4.3. Suppose f is a continuous and 1-1 map of a compact metric
space X onto a metric space Y. The inverse map on Y is a continuous map from
Y to X.
Proof. We use Theorem 6.2.6 on $f^{-1}$, so we have to show that $f(G)$ is open
in Y for every open set G in X. Choose an open $G \subset X$. Since $G^c$ is a closed subset
of a compact space, it is compact. Hence, $f(G^c)$ is compact in Y and therefore
closed. Since f is 1-1 and onto, $f(G) = f(G^c)^c$, and so it is open. $\square$

6.5. $\mathbb{R}^n$-valued Continuous Functions


We now consider the special case of functions from a metric space (X, d) into
$\mathbb{R}$ or $\mathbb{R}^n$ with the usual metric.
First note that if f and g are such functions taking a metric space (X, d) into
$\mathbb{R}$, then we can define $f + g$, $f - g$, $fg$, $\frac{f}{g}$ in a natural way. If $f, g : (X, d) \to \mathbb{R}^n$,
then we can define $f + g$, $f - g$, and $f \cdot g$ in a natural way as well. We can then
write down the arithmetic properties of limits of such functions.
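Theorem 6.5.1. Suppose (X, d) is a metric space, $A \subset X$, $f, g : A \to \mathbb{R}$, with
the usual metric on $\mathbb{R}$, and x is a limit point of A. Then, whenever the limits of f
and g at x exist,
(1) $\lim_{y \to x} (f + g)(y) = \lim_{y \to x} f(y) + \lim_{y \to x} g(y)$.
(2) $\lim_{y \to x} (fg)(y) = (\lim_{y \to x} f(y))(\lim_{y \to x} g(y))$.
(3) $\lim_{y \to x} \left(\frac{f}{g}\right)(y) = \frac{\lim_{y \to x} f(y)}{\lim_{y \to x} g(y)}$, provided $\lim_{y \to x} g(y) \ne 0$.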

Proof. This follows from the properties of limits of sequences of real numbers.

In the same way, we prove
Theorem 6.5.2. Let (X, d) be a metric space and $A \subset X$.
(1) Suppose $f_1, ..., f_n : A \to \mathbb{R}$ with the usual metric and let $f : A \to \mathbb{R}^n$ be
defined by $f(x) = (f_1(x), ..., f_n(x))$, where we take the usual metric on $\mathbb{R}^n$.
Then f is continuous at $x \in A$, or on A, if and only if each component
$f_m$ is continuous at $x \in A$, or on A.
(2) If f and g are continuous maps from A into $(\mathbb{R}^n, \|\cdot\|)$, then $f + g$, $f \cdot g$
are continuous.
Example 6.5.3. If $x \in \mathbb{R}^n$ is written $x = (x_1, ..., x_n)$, then the coordinate
functions
\[ \phi_m(x) = x_m, \quad 1 \le m \le n, \]
are continuous since
\[ |\phi_m(x) - \phi_m(y)| \le \|x - y\|, \quad x, y \in \mathbb{R}^n. \]
Example 6.5.4. Repeated applications of Theorem 6.5.2 show that polynomials
$p : \mathbb{R}^n \to \mathbb{R}$,
\[ p(x) = \sum_{m_1=0}^{M_1} \sum_{m_2=0}^{M_2} \cdots \sum_{m_n=0}^{M_n} C_{m_1 m_2 ... m_n} x_1^{m_1} \cdots x_n^{m_n}, \]
are continuous. Furthermore, all rational functions $\frac{p(x)}{q(x)}$, where p, q are polynomials,
are continuous at all points where $q \ne 0$.
The latter result also requires Theorem 6.2.7.
Example 6.5.5. Let (X, d) be a metric space and z ∈ X. We can define
f : X → R using d as
f (x) = d(x, z), x ∈ X.
The triangle inequality implies
d(x, z) ≤ d(x, y) + d(y, z)
d(y, z) ≤ d(x, y) + d(x, z), x, y ∈ X
or
|d(x, z) − d(y, z)| ≤ d(x, y), x, y ∈ X
and therefore f is continuous on X.
We now consider what happens for $\mathbb{R}^n$-valued continuous functions on compact
sets.
Definition 6.5.6. Let (X, d) be a metric space, $A \subset X$, and $f : A \to \mathbb{R}^n$, with the
usual metric on $\mathbb{R}^n$. f is bounded on A if there is a constant M such that
\[ \|f(x)\| \le M \text{ for all } x \in A. \]
If f is bounded on X, we say it is bounded.
Theorem 6.5.7. Let (X, d) be a metric space, $A \subset X$. If $f : A \to (\mathbb{R}^n, \|\cdot\|)$
is continuous and A is compact, then f is bounded and $f(A)$ is closed.
Proof. See Theorem 6.4.1 and Theorem 3.6.3. $\square$

When f is real-valued, we can say even more.


Theorem 6.5.8. Let (X, d) be a metric space, $A \subset X$ compact, and $f : A \to \mathbb{R}$
continuous. Set $M = \sup_{x \in A} f(x)$ and $m = \inf_{x \in A} f(x)$. There are points $y, z \in A$
with $f(y) = M$ and $f(z) = m$.
Proof. $f(A)$ is closed and bounded. The result follows from Theorem 2.3.23.
$\square$
There is also the famous
Theorem 6.5.9. Intermediate Value Theorem
Suppose f is a continuous, real valued function defined on [a, b] (where we take
the usual metric) and $f(a) \ne f(b)$. Let c be any number between $f(a)$ and $f(b)$.
Then $f(x) = c$ for some x between a and b.
Proof. We treat the case $f(a) < f(b)$ and assume $f(a) < c < f(b)$. Define
\[ A = \{x \in [a, b] \mid f(x) < c\}. \]
Note that $a \in A$ and b is an upper bound for A, hence $x = \sup A$ is defined and
$a \le x \le b$. We claim $f(x) = c$. There is a sequence $\{x_n\}$ in A with $x_n \le x$ for all n
and $x_n \to x$.
Since $f(x_n) < c$ for all n and f is continuous at x,
\[ f(x) = \lim_{n \to \infty} f(x_n) \le c. \]
In particular, $x \ne b$, since $f(b) > c$. Choose any sequence $\{y_n\}$ with $x < y_n \le b$ for
all n and $y_n \to x$. Since $y_n > \sup A$, we have $y_n \notin A$, so $f(y_n) \ge c$ for all n, hence
\[ f(x) = \lim_{n \to \infty} f(y_n) \ge c. \]
Therefore $f(x) = c$. $\square$
CHAPTER 7

Sequences of Functions and C([a, b])

As an application of all of the theory we have developed, we want to explore
the properties of the metric space
\[ C([a, b]) = \{f \mid f \text{ is continuous on } [a, b]\} \]
with metric $d(f, g) = \sup_{[a,b]} |f(x) - g(x)|$. That is, we consider whether or not
C([a, b]) is
• closed
• complete
• separable
• compact
and if it does not have these properties, what kind of subsets do.
In all cases, we are investigating the properties of sequences of points in C([a, b]),
that is, sequences of continuous functions. We will be particularly interested in
sequences that converge.

7.1. Convergent Sequences of Functions


Following Ex. 4.3.6, we define
Definition 7.1.1. Let (X, dx ), (Y, dy ) be metric spaces, A ⊂ X, and {fn } a
sequence of functions with fn : A → Y for all n. Suppose the sequence of points
$\{f_n(x)\}$ in Y converges for all $x \in A$. We define a function $f : A \to Y$ by
\[ f(x) = \lim_{n \to \infty} f_n(x), \quad x \in A. \]
We say that {fn } converges pointwise to f on A and f is the pointwise limit
of {fn } on A.
Example 7.1.2. Consider
\[ f_n(x) = \frac{1}{n} \sin(nx) \text{ on } [0, \pi]. \]
(See figure 7.1.)
We have
\[ |f_n(x) - 0| = \frac{1}{n}|\sin(nx)| \le \frac{1}{n} \text{ for all } x \in [0, \pi], \]
so $\{f_n\} \to 0$.
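Example 7.1.3. Define on [0, 1] (See figure 7.2.),
\[ f_n(x) = \begin{cases} n^2 x, & 0 \le x \le \frac{1}{n} \\ n^2 \left(\frac{2}{n} - x\right), & \frac{1}{n} \le x \le \frac{2}{n} \\ 0, & \frac{2}{n} \le x \le 1 \end{cases} \]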
[Figure 7.1: graphs of $f_1$, $f_4$, and $f_n$ for n big.]

[Figure 7.2: graphs of $f_1$, $f_2$, $f_3$, $f_4$ on [0, 1].]

Given $x > 0$, $\frac{2}{n} \le x$ for all n sufficiently large, hence $f_n(x) = 0$ for all such n.
Since also $f_n(0) = 0$ for all n, $f_n \to 0$ pointwise on [0, 1].

This example shows that pointwise convergence can allow some pretty bad
behavior!
In most situations, we want to know whether or not some particular property
of a sequence of functions is inherited by the limit.

Example 7.1.4. When talking about C([a, b]), we want to know if the limit of
a sequence of continuous functions is continuous.

Example 7.1.5. In the numerical solution of differential equations, we want
to know if a sequence of approximate solutions converges to the true
solution. Consider the simplest equation
\[ (7.1) \quad \begin{cases} y'(x) = f(x), & a \le x \le b \\ y(a) = y_0 \end{cases} \]
with solution $y(x) = y_0 + \int_a^x f(s)\,ds$.
We can define approximate solutions $\{Y_n(x)\}$ using the rectangle rule to
approximate the integral. Given n, define $\Delta x = \frac{b-a}{n}$ and $x_m = a + \Delta x \cdot m$, $0 \le m \le n$.

[Figure: the mesh $a = x_0 < x_1 < x_2 < x_3 < ... < x_n = b$ with spacing $\Delta x = \frac{b-a}{n}$.]

If $x \in [a, b]$, then $x \in [x_{M-1}, x_M]$ for some $0 < M \le n$. Define
\[ Y_n(x) = y_0 + \sum_{m=0}^{M-2} f(x_m)\Delta x + f(x_{M-1}) \cdot (x - x_{M-1}), \text{ for } a \le x \le b, \]
which you can understand from the plot:

[Figure 7.3: rectangles with heights $f(x_m)$ over the mesh intervals between a and x, where $x \le x_M$.]

$Y_n(x) - y_0$ is the area of the rectangles shown above. $Y_n(x)$ is continuous on [a, b]
for all n if f is continuous, and we want to know if $Y_n \to y$, the solution
of (7.1), which is also continuous if f is continuous. In most cases, we do not know
y.
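The construction of $Y_n$ can be sketched in a few lines of Python. This is a minimal illustration, assuming the left-endpoint rectangle rule on a uniform mesh; the function name `rectangle_solution` and the test problem $y' = \cos x$, $y(0) = 0$ are choices made for the demonstration, not part of the notes.

```python
import math

def rectangle_solution(f, a, b, y0, n):
    """Return the function Y_n for y' = f(x), y(a) = y0 on a uniform mesh."""
    dx = (b - a) / n
    nodes = [a + dx * m for m in range(n + 1)]

    def Y(x):
        # Index M with x in [x_{M-1}, x_M], clamped to 1 <= M <= n.
        M = min(n, max(1, math.ceil((x - a) / dx)))
        full = sum(f(nodes[m]) for m in range(M - 1)) * dx   # m = 0, ..., M-2
        return y0 + full + f(nodes[M - 1]) * (x - nodes[M - 1])

    return Y

# Known solution of y' = cos(x), y(0) = 0 is y = sin(x).
for n in (10, 100, 1000):
    Y = rectangle_solution(math.cos, 0.0, 1.0, 0.0, n)
    print(n, abs(Y(1.0) - math.sin(1.0)))
```

For this smooth test problem the error at x = 1 shrinks roughly in proportion to $\Delta x$.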

It is important to realize that often the inheritance of an analytic property by
the limit of a sequence of functions is equivalent to whether or not it is justified to
switch the order of two limiting processes.

Example 7.1.6. Continuing Ex. 7.1.4, if the sequence $\{f_n\}$ in C([a, b]) converges
to f, we can rephrase the issue of inheriting continuity like this: Choose $a \le x \le b$.
Then for $n \ge 1$,
\[ \lim_{m \to \infty} f_n(x_m) = f_n(x) \]
for all sequences $\{x_m\}$ in [a, b] with $x_m \to x$. Since $f_n \to f$ pointwise,
\[ f(x) = \lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} \lim_{m \to \infty} f_n(x_m) \]
for all sequences $\{x_m\}$ in [a, b] with $x_m \to x$. For f to be continuous at x, we must
have
\[ \lim_{m \to \infty} f(x_m) = f(x) \]
for all such sequences. In other words, we require
\[ (7.2) \quad \lim_{m \to \infty} \lim_{n \to \infty} f_n(x_m) = \lim_{n \to \infty} \lim_{m \to \infty} f_n(x_m). \]
Example 7.1.7. Continuing Ex. 7.1.5, it is easy to see that given $x \in [a, b]$, for
all sufficiently fine meshes, i.e., sufficiently large n,
\[ \lim_{h \to 0} \frac{Y_n(x + h) - Y_n(x)}{h} \approx f(x_{m-1}) \]
and as $n \to \infty$, $x_{m-1} \to x$, so
\[ \lim_{n \to \infty} \lim_{h \to 0} \frac{Y_n(x + h) - Y_n(x)}{h} = f(x). \]
This says $Y_n(x)$ is an approximate solution.
On the other hand, if $Y_n(x) \to y(x)$ and we want to show that y solves (7.1),
then we want for $x \in [a, b]$,
\[ \lim_{h \to 0} \frac{y(x + h) - y(x)}{h} = f(x) \]
or
\[ \lim_{h \to 0} \lim_{n \to \infty} \frac{Y_n(x + h) - Y_n(x)}{h} = \lim_{n \to \infty} \lim_{h \to 0} \frac{Y_n(x + h) - Y_n(x)}{h}. \]
General Principle: Whenever there is more than one limiting process in some
situation, it is important to determine if the order of the limits matters.
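Example 7.1.8. Consider $\left\{\frac{m}{n + m}\right\}_{m,n=1}^{\infty}$.
\[ \lim_{m \to \infty} \lim_{n \to \infty} \frac{m}{n + m} = \lim_{m \to \infty} 0 = 0 \]
\[ \lim_{n \to \infty} \lim_{m \to \infty} \frac{m}{n + m} = \lim_{n \to \infty} 1 = 1. \]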
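A quick numerical look at Example 7.1.8 makes the asymmetry visible. In this Python sketch the constant `BIG` is only a stand-in for "the inner index tends to infinity"; it is an assumption of the illustration, not a proof of the limits.

```python
def a(m, n):
    """The doubly indexed sequence m / (n + m)."""
    return m / (n + m)

BIG = 10**9  # stand-in for the inner index going to infinity

# Inner limit in n first: for each fixed m the values are close to 0.
n_first = [a(m, BIG) for m in (1, 10, 100)]
# Inner limit in m first: for each fixed n the values are close to 1.
m_first = [a(BIG, n) for n in (1, 10, 100)]

print(n_first)
print(m_first)
```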
In fact, the limit of a sequence of continuous functions that converge pointwise
is not necessarily continuous.
Example 7.1.9. Consider $\{x^n\}_{n=0}^{\infty}$ on [0, 1]. The sequence is in C([0, 1]). It
converges pointwise on [0, 1] to
\[ \chi_1(x) = \lim_{n \to \infty} x^n = \begin{cases} 0, & 0 \le x < 1, \\ 1, & x = 1 \end{cases} \]
which is not continuous (see figure 7.4).

If $0 < x < 1$, then given $0 < \epsilon < 1$, choose $N > \frac{\log(\epsilon)}{\log(x)}$ (recall $\log(x) < 0$), so $x^n < \epsilon$
for $n \ge N$. However, $x^n = 1$ for $x = 1$ and all n.
[Figure 7.4: graphs of $x^0$, $x^1$, $x^2$, $x^3$ on [0, 1].]

7.2. Uniform Convergence: C([a, b]) is Closed, and Complete


Luckily, convergence in the sup metric of C([a, b]) is stronger than pointwise
convergence.
Example 7.2.1. The sequence $\{x^n\}$ in C([0, 1]) does not converge in C([0, 1]),
although it does converge pointwise (see Example 7.1.9). Any limit in C([0, 1])
would have to agree with the pointwise limit $\chi_1$, but for any $n \ge 1$,
\[ \sup_{0 \le x \le 1} |x^n - \chi_1(x)| = \sup_{0 \le x < 1} |x|^n = 1, \]
i.e.,
\[ d(x^n, \chi_1) = 1 \text{ for all } n. \]
Convergence in C([a, b]) is an example of uniform convergence.
Definition 7.2.2. Let $(X, d_x)$, $(Y, d_y)$ be metric spaces, $A \subset X$, and $\{f_n\}$ a
sequence of functions $f_n : A \to Y$ for all n. $\{f_n\}$ converges uniformly to f on
A if for every $\epsilon > 0$ there is an N such that
\[ d_y(f_n(x), f(x)) < \epsilon \text{ for all } x \in A \text{ and } n \ge N. \]
Compare this to Definition 7.1.1.
Example 7.2.3. The functions in Example 7.1.2 converge uniformly to 0 since
$|\frac{1}{n}\sin(nx) - 0| \le \frac{1}{n}$ for all $0 \le x \le \pi$.
Example 7.2.4. The functions in Example 7.1.3 do not converge uniformly on
[0, 1] since
\[ \sup_{0 \le x \le 1} |f_n(x) - 0| = n. \]

Example 7.2.5. The sequence $\{x^n\}$ does not converge uniformly to $\chi_1(x)$ on
[0, 1], but does on $[0, \frac{1}{2}]$.
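Examples 7.2.4 and 7.2.5 can both be probed numerically. The Python sketch below is an illustration under stated assumptions: `tent` implements the functions of Example 7.1.3, and `half_point` is a hypothetical helper that returns a point of [0, 1) where $x^n$ equals 1/2, which shows the sup distance on [0, 1] cannot go to zero.

```python
def tent(n, x):
    """f_n from Example 7.1.3 on [0, 1]."""
    if x <= 1 / n:
        return n * n * x
    if x <= 2 / n:
        return n * n * (2 / n - x)
    return 0.0

def half_point(n):
    """A point x in [0, 1) with x^n = 1/2."""
    return 0.5 ** (1.0 / n)

for n in (2, 10, 100):
    x = half_point(n)
    # peak of the tent, value of x^n at the bad point, sup of x^n on [0, 1/2]
    print(n, tent(n, 1 / n), x ** n, 0.5 ** n)
```

The first column of values grows like n (no uniform convergence for the tents), the second stays at 1/2 (no uniform convergence of $x^n$ on [0, 1]), and the third, the sup of $x^n$ on $[0, \frac{1}{2}]$, tends to zero.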
Uniform convergence goes well with continuity.
Theorem 7.2.6. Let $(X, d_x)$, $(Y, d_y)$ be metric spaces and $A \subset X$. Suppose
$\{f_n\}$ is a sequence of functions with $f_n : A \to Y$ continuous on A for all n and
$f_n \to f$ uniformly on A, where $f : A \to Y$. Then f is continuous on A.
Proof. Choose $x \in A$ and $\epsilon > 0$. We want to show we can make $d_y(f(y), f(x))$
smaller than a fixed multiple of $\epsilon$ by making $d_x(x, y)$ small. Uniform convergence
means that we can make $d_y(f(x), f_n(x))$ and $d_y(f(y), f_n(y))$ small, so for $y \in A$, we write
\[ d_y(f(x), f(y)) \le d_y(f(x), f_n(x)) + d_y(f_n(x), f_n(y)) + d_y(f_n(y), f(y)). \]
By uniform convergence, there is an N such that $d_y(f(x), f_n(x)) < \epsilon$ and
$d_y(f(y), f_n(y)) < \epsilon$ for $n \ge N$, independent of x and y. Fix any $n \ge N$. Since $f_n$ is
continuous at x, there is a $\delta > 0$ such that
\[ d_y(f_n(x), f_n(y)) < \epsilon, \]
for all $y \in A$, $d_x(x, y) < \delta$. Hence, using that value of n, we conclude
\[ d_y(f(x), f(y)) < 3\epsilon \text{ for all } y \in A,\ d_x(x, y) < \delta. \]
$\square$
The functions in Example 7.1.3 show the converse does not hold: a sequence of
continuous functions can converge to a continuous function without the convergence
being uniform.
We now discuss the related topic of completeness. We state the Cauchy criterion
for uniform convergence.
Theorem 7.2.7. Let $(X, d_x)$ and $(Y, d_y)$ be metric spaces, Y complete, $A \subset X$,
and $\{f_n\}$ a sequence with $f_n : A \to Y$ for all n. $\{f_n\}$ converges uniformly on A if
and only if for every $\epsilon > 0$ there is an N such that
\[ d_y(f_n(x), f_m(x)) < \epsilon \text{ for } n, m \ge N \text{ and } x \in A. \]
Proof. Suppose $\{f_n\}$ converges uniformly on A to f. Given $\epsilon > 0$, there is
an N such that
\[ d_y(f_n(x), f(x)) < \epsilon, \quad x \in A,\ n \ge N. \]
So,
\[ d_y(f_n(x), f_m(x)) \le d_y(f_n(x), f(x)) + d_y(f(x), f_m(x)) < 2\epsilon \]
for $x \in A$, $n, m \ge N$.
Conversely, suppose the Cauchy condition holds. The sequence of points $\{f_n(x)\}$
is a Cauchy sequence in Y for each $x \in A$, and therefore has a limit in Y that we
call $f(x)$. This defines $f : A \to Y$. $\{f_n\}$ converges pointwise to f on A and we
have to show the convergence is uniform.
Let $\epsilon > 0$ and choose N so
\[ d_y(f_n(x), f_m(x)) < \frac{\epsilon}{2} \text{ for } n, m \ge N,\ x \in A. \]
Fix n and let $m \to \infty$. Since $f_m(x) \to f(x)$ as $m \to \infty$ and $d_y(f_n(x), \cdot)$ is continuous,
$d_y(f_n(x), f(x)) \le \frac{\epsilon}{2} < \epsilon$ for $n \ge N$, $x \in A$. $\square$
Now, we observe that convergence in C([a, b]) is uniform convergence and R is
complete. Theorems 7.2.6 and 7.2.7 imply
Theorem 7.2.8. C([a, b]) is closed and complete.

7.3. C([a, b]) is Separable


We know R is separable and, in particular, Q is dense in R. This is extremely
important from a practical point of view because it means we can approximate
irrational numbers using rational numbers. This is what makes large scale scientific
computing possible, for example.
We prove that C([a, b]) is separable by first showing that continuous functions
can be approximated arbitrarily well by polynomials.
Theorem 7.3.1. Weierstrass Approximation Theorem Assume that f is
continuous on [a, b]. Given $\epsilon > 0$, there is a polynomial $p_n$ of sufficiently high degree
n such that
\[ d(f, p_n) = \sup_{a \le x \le b} |f(x) - p_n(x)| < \epsilon. \]

Another way to state this result is that there is a sequence of polynomials {pn }
(of course in C([a, b])) that converges to f in C([a, b]), that is, uniformly.
This theorem is profoundly important. It is the reason, for example, that the
use of polynomials is so widespread in numerical analysis, i.e., approximation of
functions, integrals, solutions of differential equations, and so on.
Note: unlike Taylor’s polynomials, this result does not require
increasing smoothness of f to increase the accuracy of the poly-
nomial approximations.
To prove that C([a, b]) is separable, we first note that if
\[ p(x) = \sum_{m=0}^{n} a_m x^m \text{ and } \tilde{p}(x) = \sum_{m=0}^{n} \tilde{a}_m x^m \]
are two polynomials on [a, b], then
\[ d(p, \tilde{p}) = \sup_{a \le x \le b} |p(x) - \tilde{p}(x)| \le c \cdot \sum_{m=0}^{n} |a_m - \tilde{a}_m| \le (n + 1) \cdot c \cdot \max_{0 \le m \le n} |a_m - \tilde{a}_m| \]
where c is a constant that depends on a, b, and n ($c = \max_{0 \le m \le n} (\max(|a|, |b|))^m$).
In particular, given a polynomial $p_n$ of degree n with real coefficients $\{a_m\}_{m=0}^{n}$
and $\epsilon > 0$, there is a polynomial $\tilde{p}_n$ of degree n with rational coefficients $\{\tilde{a}_m\}_{m=0}^{n}$
such that
\[ d(p_n, \tilde{p}_n) \le c(n + 1) \max_{0 \le m \le n} |a_m - \tilde{a}_m| < \epsilon \]
since the rationals are dense in R. (Note, however, that as the degree increases, the
coefficients generally must be approximated to increasing accuracy.) This means
that given a continuous function f on [a, b], and $\epsilon > 0$, we can find a polynomial
with rational coefficients $\tilde{p}_n$ such that $d(\tilde{p}_n, f) < \epsilon$. We first use Theorem 7.3.1 to
find a polynomial, with possibly real coefficients, that approximates f to within $\frac{\epsilon}{2}$
and then construct a polynomial with rational coefficients that approximates the
first polynomial to within $\frac{\epsilon}{2}$.
Since the set of polynomials with rational coefficients is countable, this proves
Theorem 7.3.2. C([a, b]) is separable.
We first note that it suffices to prove Theorem 7.3.1 on [0, 1]. We can map [0, 1]
onto [a, b] by $y = (b - a)x + a$ and vice-versa by $x = \frac{a - y}{a - b}$. If g is continuous on [a, b],
then $f(x) = g((b - a)x + a)$ is continuous on [0, 1]. If $p_n$ approximates f to within
$\epsilon$ on [0, 1], then $\tilde{p}_n(y) = p_n\left(\frac{a - y}{a - b}\right)$ is a polynomial that approximates $g(y)$ to within
$\epsilon$ on [a, b].
We give a constructive proof that uses probability.
Definition 7.3.3. Recall for $n \ge m \ge 0$, the binomial coefficient n choose m,
\[ \binom{n}{m} = \frac{n!}{m!(n - m)!} \]
Example 7.3.4.
\[ \binom{4}{2} = \frac{4!}{2!2!} = 6, \qquad \binom{3}{0} = \frac{3!}{3!0!} = 1 \]
$\binom{n}{m}$ is the number of distinct subsets with m objects that can be chosen from
a set of n objects. This is very important in probability.
Example 7.3.5. We compute the probability P of getting an ace of diamonds
in a poker hand of 5 cards chosen at random from a deck of 52 cards using
\[ P(\text{event}) = \frac{\text{number of outcomes in the event}}{\text{total number of possible outcomes}} \]
when all outcomes are equally likely.
The total number of 5 card poker hands is $\binom{52}{5}$. Obtaining a “good” hand
amounts to choosing any 4 cards from the remaining 51 cards after getting an ace
of diamonds. So, there are $\binom{51}{4}$ good hands.
\[ P = \frac{\binom{51}{4}}{\binom{52}{5}} = \frac{5}{52}. \]
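The counting in Examples 7.3.4 and 7.3.5 is easy to confirm with the Python standard library. This short sketch uses `math.comb` (available in Python 3.8 and later) and exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

good_hands = comb(51, 4)   # hands that contain the ace of diamonds
all_hands = comb(52, 5)    # all 5-card hands
p_ace_of_diamonds = Fraction(good_hands, all_hands)

print(comb(4, 2), comb(3, 0), p_ace_of_diamonds)
```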
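It is straightforward to show that
\[ \binom{n}{m} = \binom{n}{n - m}, \qquad \binom{n}{1} = \binom{n}{n - 1} = n, \qquad \binom{n}{n} = \binom{n}{0} = 1. \]
There is also an important result called the binomial expansion.
Theorem 7.3.6. For $n \in \mathbb{N}$ and $a, b \in \mathbb{R}$,
\[ (a + b)^n = \sum_{m=0}^{n} \binom{n}{m} a^m b^{n-m}. \]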
Using this, we can derive other formulas we need. Writing
\[ (7.3) \quad (x + b)^n = \sum_{m=0}^{n} \binom{n}{m} x^m b^{n-m} \]
and differentiating with respect to x gives
\[ n(x + b)^{n-1} = \sum_{m=0}^{n} m \binom{n}{m} x^{m-1} b^{n-m}. \]
Setting $x = a$ and multiplying by $\frac{a}{n}$ gives
\[ (7.4) \quad a(a + b)^{n-1} = \sum_{m=0}^{n} \frac{m}{n} \binom{n}{m} a^m b^{n-m}. \]
Differentiating (7.3) twice and manipulating gives
\[ (7.5) \quad \left(1 - \frac{1}{n}\right) a^2 (a + b)^{n-2} = \sum_{m=0}^{n} \left(\frac{m^2}{n^2} - \frac{m}{n^2}\right) \binom{n}{m} a^m b^{n-m}. \]

The approximating polynomials used to show Theorem 7.3.1 are constructed
using the so-called binomial polynomials. We set $b = 1 - x$ in (7.3) to get
\[ 1 = (x + (1 - x))^n = \sum_{m=0}^{n} \binom{n}{m} x^m (1 - x)^{n-m} \]

Definition 7.3.7. The n + 1 binomial polynomials of degree n are defined
\[ p_{n,m}(x) = \binom{n}{m} x^m (1 - x)^{n-m}, \quad m = 0, 1, ..., n. \]
Example 7.3.8.
\[ p_{2,0} = \binom{2}{0} x^0 (1 - x)^2 = (1 - x)^2 \]
\[ p_{2,1} = \binom{2}{1} x^1 (1 - x)^1 = 2x(1 - x) \]
\[ p_{2,2} = x^2 \]
We observe that if $0 \le x \le 1$ is the probability of an event E, then $p_{n,m}(x)$ is the
probability that E occurs exactly m times in n independent trials.
Example 7.3.9. Suppose we toss a coin with the probability X that heads H
occurs and $1 - X$ that tails T occurs. The coin is unfair if $X \ne \frac{1}{2}$. Consider a
sequence of tosses
\[ \underbrace{HTTHTHHHHTHTHHTTHTH}_{m \text{ heads in } n \text{ tosses}} \]
The probability of any particular sequence occurring is
\[ X^m (1 - X)^{n-m} \]
if it has m heads. There are $\binom{n}{m}$ sequences with m heads in n tosses, so this shows
the claim about $p_{n,m}(x)$.
The binomial polynomials have several useful properties following from (7.3), (7.4)
and (7.5):
\[ (7.6a) \quad \sum_{m=0}^{n} p_{n,m}(x) = 1 \]
\[ (7.6b) \quad \sum_{m=0}^{n} m\, p_{n,m}(x) = nx \]
\[ (7.6c) \quad \sum_{m=0}^{n} m^2 p_{n,m}(x) = (n^2 - n)x^2 + nx \]
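The identities (7.6a) through (7.6c) can be spot-checked with exact rational arithmetic. The Python sketch below is an illustration; the function names `p` and `moments` are assumptions chosen to mirror the notation $p_{n,m}$.

```python
from fractions import Fraction
from math import comb

def p(n, m, x):
    """Binomial polynomial p_{n,m}(x) = C(n, m) x^m (1 - x)^(n - m)."""
    return comb(n, m) * x**m * (1 - x)**(n - m)

def moments(n, x):
    """Return the three sums in (7.6a), (7.6b), (7.6c)."""
    s0 = sum(p(n, m, x) for m in range(n + 1))
    s1 = sum(m * p(n, m, x) for m in range(n + 1))
    s2 = sum(m * m * p(n, m, x) for m in range(n + 1))
    return s0, s1, s2

n, x = 7, Fraction(3, 10)
print(moments(n, x))
print((1, n * x, (n * n - n) * x * x + n * x))
```

Both printed triples agree, as the identities predict.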

We next use these polynomials to prove the Law of Large Numbers.

Suppose we have an event E with probability X of occurring. How might we
determine X experimentally? If we conduct many trials N, we might expect to see
the event occurred roughly NX times most of the time. This is not obviously true,
however. If we toss a fair coin 100,000 times, we expect to see around 50,000 heads.
We could get all tails, however, with probability $(\frac{1}{2})^{100,000}$. On the other hand, we
can show the probability of getting exactly half heads in 2n tosses goes like
\[ P \approx \frac{1}{\sqrt{\pi n}} \quad (n \text{ large}) \]
hence also tends to zero.

The Law of Large Numbers encapsulates this.

Theorem 7.3.10. Law of Large Numbers Assume event E occurs with probability
X and let m denote the number of times E occurs in n trials. Let $\epsilon > 0$ and
$\delta > 0$ be given. The probability that $\frac{m}{n}$ differs from X by less than $\delta$ is greater than
$1 - \epsilon$, i.e.,
\[ P\left(\left|\frac{m}{n} - X\right| < \delta\right) > 1 - \epsilon, \]
for all n sufficiently large.
Note: This does not say that E occurs exactly Xn times, nor
that E must occur roughly Xn times.

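Proof. In terms of binomial polynomials, we want to show that given $\epsilon, \delta > 0$,
\[ (7.7) \quad \sum_{\substack{0 \le m \le n \\ |\frac{m}{n} - X| < \delta}} p_{n,m}(X) > 1 - \epsilon \text{ for all } n \text{ large}. \]
Since lower bounds are difficult in general, we consider the complementary sum
giving the probability of what we don’t want:
\[ \sum_{\substack{0 \le m \le n \\ |\frac{m}{n} - X| \ge \delta}} p_{n,m}(X) = 1 - \sum_{\substack{0 \le m \le n \\ |\frac{m}{n} - X| < \delta}} p_{n,m}(X), \]
that we estimate as
\[ \sum_{\substack{0 \le m \le n \\ |\frac{m}{n} - X| \ge \delta}} p_{n,m}(X) \le \frac{1}{\delta^2} \sum_{\substack{0 \le m \le n \\ |\frac{m}{n} - X| \ge \delta}} \left(\frac{m}{n} - X\right)^2 p_{n,m}(X) \]
\[ \le \frac{1}{n^2 \delta^2} \sum_{m=0}^{n} (m - nX)^2 p_{n,m}(X) \]
\[ = \frac{1}{n^2 \delta^2} \left( \sum_{m=0}^{n} m^2 p_{n,m}(X) - 2nX \sum_{m=0}^{n} m\, p_{n,m}(X) + n^2 X^2 \sum_{m=0}^{n} p_{n,m}(X) \right). \]
Using (7.6a) - (7.6c), the sums on the right simplify to $nX(1 - X)$. Since $X(1 - X) \le \frac{1}{4}$
for $0 \le X \le 1$,
\[ (7.8) \quad \sum_{\substack{0 \le m \le n \\ |\frac{m}{n} - X| \ge \delta}} p_{n,m}(X) \le \frac{1}{4n\delta^2} \]
and
\[ \sum_{\substack{0 \le m \le n \\ |\frac{m}{n} - X| < \delta}} p_{n,m}(X) \ge 1 - \frac{1}{4n\delta^2}. \]
For given $\epsilon, \delta > 0$, we can ensure $(4n\delta^2)^{-1} < \epsilon$ by choosing $n > \frac{1}{4\delta^2 \epsilon}$. $\square$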
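The estimate (7.8) can be compared with the exact tail probability for small cases. The following Python sketch is illustrative only; the names `tail` and `bound` are assumptions, and the fair-coin parameters are chosen just for the demonstration.

```python
from fractions import Fraction
from math import comb

def p(n, m, X):
    """Binomial polynomial p_{n,m}(X)."""
    return comb(n, m) * X**m * (1 - X)**(n - m)

def tail(n, X, delta):
    """Exact probability that |m/n - X| >= delta."""
    return sum(p(n, m, X) for m in range(n + 1)
               if abs(Fraction(m, n) - X) >= delta)

def bound(n, delta):
    """Right-hand side of (7.8)."""
    return 1 / (4 * n * delta**2)

X, delta = Fraction(1, 2), Fraction(1, 10)
for n in (100, 400):
    print(n, float(tail(n, X, delta)), float(bound(n, delta)))
```

In these cases the exact tail is much smaller than the bound, which is typical: (7.8) is crude, but it is all the proof needs.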
Proof. Of Theorem 7.3.1. We first define the approximating polynomial,
named after the person who made this proof.
Definition 7.3.11. We partition [0, 1] by a uniform mesh with n + 1 nodes,
$x_m = \frac{m}{n}$, $m = 0, 1, ..., n$. The Bernstein polynomial of order n for f on [0, 1] is
\[ B_n(f, x) = B_n(x) = \sum_{m=0}^{n} f(x_m) p_{n,m}(x). \]
Note that $\deg(B_n) \le n$.

The reason that $B_n$ approximates f is intuitive:
\[ B_n(x) = \sum_{x_m \approx x} f(x_m) p_{n,m}(x) + \sum_{|x_m - x| \text{ large}} f(x_m) p_{n,m}(x). \]
The first sum converges to $f(x)$ as n increases because we can find $\frac{m}{n}$ arbitrarily
close to x while the second sum goes to zero by the Law of Large Numbers.
Example 7.3.12. Consider $x^2$ on [0, 1] with $n \ge 2$,
\[ B_n(x) = \sum_{m=0}^{n} \left(\frac{m}{n}\right)^2 p_{n,m}(x). \]
Using the identities (7.6a) - (7.6c),
\[ B_n(x) = \left(1 - \frac{1}{n}\right) x^2 + \frac{1}{n} x. \]
Note that $B_n(x^2, x) \ne x^2$ and the error is
\[ |x^2 - B_n(x)| = \frac{1}{n} x(1 - x) \]
which tends to zero like $\frac{1}{n}$ on (0, 1). This contrasts with interpolating polynomials
and Taylor polynomials, which both have the property that if f is a polynomial
then $p_n = f$ for $n \ge \deg(f)$.

[Figure 7.5. $B_1(x) > B_2(x) > B_3(x) > e^x$ on (0, 1); horizontal axis x from 0 to 1.]

Example 7.3.13. For $e^x$ on [0, 1] (see Figure 7.5),
\[ B_1(x) = (1 - x) + ex \]
\[ B_2(x) = (1 - x)^2 + 2e^{\frac{1}{2}} x(1 - x) + ex^2 \]
\[ B_3(x) = (1 - x)^3 + 3e^{\frac{1}{3}} x(1 - x)^2 + 3e^{\frac{2}{3}} x^2 (1 - x) + ex^3 \]
\[ \vdots \]
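Bernstein polynomials are simple to evaluate directly from Definition 7.3.11. The Python sketch below is a hedged illustration: `bernstein` follows the definition term by term, while `sup_error` only samples the error on a finite grid (the grid size is an arbitrary assumption), so it estimates rather than computes the sup metric.

```python
from math import comb, exp

def bernstein(f, n, x):
    """Bernstein polynomial B_n(f, x) on [0, 1] (Definition 7.3.11)."""
    return sum(f(m / n) * comb(n, m) * x**m * (1 - x)**(n - m)
               for m in range(n + 1))

def sup_error(f, n, pts=201):
    """Largest sampled value of |f(x) - B_n(f, x)| on a uniform grid."""
    grid = [k / (pts - 1) for k in range(pts)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)

# Example 7.3.12: B_n(x^2, x) = (1 - 1/n) x^2 + x/n.
print(bernstein(lambda t: t * t, 10, 0.3), (1 - 1 / 10) * 0.3**2 + 0.3 / 10)

# Example 7.3.13: the sampled error for e^x shrinks as n grows.
for n in (1, 10, 100):
    print(n, sup_error(exp, n))
```

The slow, roughly $\frac{1}{n}$ decay of the error is visible in the output; Bernstein polynomials are valued for the proof they give, not for fast convergence.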

We prove that given $\epsilon > 0$, there is an n such that
\[ \sup_{0 \le x \le 1} |f(x) - B_n(x)| < \epsilon. \]
Using (7.6a), we write
\[ f(x) - B_n(x) = \sum_{m=0}^{n} f(x) p_{n,m}(x) - \sum_{m=0}^{n} f(x_m) p_{n,m}(x) = \sum_{m=0}^{n} (f(x) - f(x_m)) p_{n,m}(x). \]
We expect that we can make $f(x) - f(x_m)$ small when x is close to $x_m$ by
continuity. For $\delta > 0$, we write
\[ (7.9) \quad f(x) - B_n(x) = \sum_{\substack{0 \le m \le n \\ |x - x_m| < \delta}} (f(x) - f(x_m)) p_{n,m}(x) + \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} (f(x) - f(x_m)) p_{n,m}(x). \]
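Theorem 6.4.2 implies f is uniformly continuous on [0, 1]. Given $\epsilon > 0$, there
is a $\delta > 0$ such that
\[ |f(x) - f(x_m)| < \frac{\epsilon}{2} \]
for all $x, x_m$ in [0, 1] with $|x - x_m| < \delta$. Given $\delta$, by the way, for all sufficiently
large n there are nodes $x_m$ with $|x - x_m| < \delta$, since the nodes are spaced $\frac{1}{n}$ apart.
Thus,
\[ \Big| \sum_{\substack{0 \le m \le n \\ |x - x_m| < \delta}} (f(x) - f(x_m)) p_{n,m}(x) \Big| \le \sum_{\substack{0 \le m \le n \\ |x - x_m| < \delta}} |f(x) - f(x_m)| p_{n,m}(x) \le \frac{\epsilon}{2} \sum_{0 \le m \le n} p_{n,m}(x) = \frac{\epsilon}{2}. \]
Now the second sum on the right in (7.9) is bounded after we realize that
Theorem 6.5.8 implies $|f|$ is bounded on [0, 1] by some constant M. Hence, (7.8) implies
\[ \Big| \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} (f(x) - f(x_m)) p_{n,m}(x) \Big| \le 2M \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} p_{n,m}(x) \le \frac{M}{2n\delta^2}. \]
Given $\delta$ from the first estimate, we can force $\frac{M}{2n\delta^2} < \frac{\epsilon}{2}$ by taking n sufficiently
large. Both estimates hold for every $x \in [0, 1]$, so $\sup_{0 \le x \le 1} |f(x) - B_n(x)| < \epsilon$. $\square$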

7.4. Compact Sets in C([a, b])


Example 7.2.1 shows that C([a, b]) is not compact.
Example 7.4.1. The sequence $\{x^n \mid n \in \mathbb{N}\}$ does not have a convergent
subsequence in C([0, 1]), i.e., there is no subsequence that converges uniformly.
This raises the issue of describing the compact subsets of C([a, b]).
By Theorem 3.2.4, if $K \subset C([a, b])$ is compact, then K is closed and bounded.
Closed is rather obvious: K is closed if for any sequence of functions $\{f_n\}$ in K
that converges to f in the metric of C([a, b]) (uniformly), we have $f \in K$.
Example 7.4.2. Let
\[ F = \{f \in C([a, b]) \mid \sup_{a \le x \le b} |f(x)| < 1\}. \]
F is not closed since, for example,
\[ \left\{1 - \frac{1}{n} \;\Big|\; n = 1, 2, 3, ...\right\} \]
is a sequence of (constant) functions in F that converges uniformly to $f(x) \equiv 1$, which is not
in F.
Example 7.4.3. Let
\[ F = \{f \in C([a, b]) \mid \sup_{a \le x \le b} |f(x)| \le 1\}. \]
We show F is closed. Choose a sequence $\{f_n\}$ in F that converges to f in C([a, b]).
We show $f \in F$, that is, $\sup_{a \le x \le b} |f(x)| \le 1$.
Suppose not. Then there is an $\epsilon > 0$ and an $x \in [a, b]$ such that $|f(x)| > 1 + \epsilon$. Because f is
continuous, there is a $\delta > 0$ such that $|f(y)| > 1 + \frac{\epsilon}{2}$ for $y \in (x - \delta, x + \delta) \cap [a, b]$.
But, for $y \in (x - \delta, x + \delta) \cap [a, b]$ and all n,
\[ |f(y) - f_n(y)| \ge \big| |f(y)| - |f_n(y)| \big| \ge 1 + \frac{\epsilon}{2} - 1 = \frac{\epsilon}{2}, \]
which contradicts $\sup_{a \le y \le b} |f_n(y) - f(y)| \to 0$ as $n \to \infty$.

$K \subset C([a, b])$ is bounded means there is a function $g \in C([a, b])$ and an M such
that
\[ d(f, g) = \sup_{a \le x \le b} |f(x) - g(x)| \le M \]
for all $f \in K$. Since such a g is itself bounded on [a, b], we see there is a (possibly
larger) M such that
\[ \sup_{a \le x \le b} |f(x)| \le M \text{ for all } f \in K. \]
This motivates
Definition 7.4.4. Let (X, d) be a metric space, $A \subset X$, and F a set of functions
from A into $\mathbb{R}^n$ with the usual metric. F is uniformly bounded on A if there is
an M such that
\[ \sup_{x \in A} \|f(x)\| \le M \text{ for all } f \in F. \]
We have shown that if $F \subset C([a, b])$ is bounded, then F is uniformly bounded.
The converse is obviously true.
Theorem 7.4.5. Let F ⊂ C([a, b]) be a set of continuous functions on [a, b].
Then, F is uniformly bounded on [a, b] if and only if F is a bounded subset of
C([a, b]).
Example 7.4.6. The set $F = \{x^n \mid n = 1, 2, 3, ...\}$ is bounded on [0, 1], but not
on [0, 2]. The qualification of boundedness does depend on [a, b]. On the other hand,
we do not expect that being merely closed and bounded guarantees compactness.
In fact, $\{x^n \mid n = 1, 2, 3, ...\}$ is closed and bounded on [0, 1], but is not compact.
We need something more.
Example 7.4.7. Consider a sequence $\{g_n\}$ in C([a, b]) that converges in C([a, b]) to
$g \in C([a, b])$. The set
\[ K = \{g_n \mid n = 1, 2, 3, ...\} \cup \{g\} \]
is compact. If $\{f_n\}$ is a sequence in K, then if $\{f_n\}$ contains a function in K
repeated infinitely often, then it has a subsequence that converges to an element
of K. Otherwise, infinitely many of the functions $\{g_n\}$ are contained in $\{f_n\}$, and
$\{f_n\}$ contains a subsequence that converges to g.
Since g is uniformly continuous on [a, b], given $\epsilon > 0$, there is a $\delta_0$ such that
\[ |g(x) - g(y)| < \frac{\epsilon}{3} \text{ for } x, y \in [a, b],\ |x - y| < \delta_0. \]
Since $g_n \to g$ uniformly, there is an N such that
\[ \sup_{a \le x \le b} |g_n(x) - g(x)| < \frac{\epsilon}{3} \text{ for } n \ge N. \]
Hence, for $n \ge N$, and $x, y \in [a, b]$ with $|x - y| < \delta_0$,
\[ |g_n(x) - g_n(y)| \le |g_n(x) - g(x)| + |g(x) - g(y)| + |g(y) - g_n(y)| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon. \]
Now the functions $g_1, ..., g_{N-1}$ are also uniformly continuous. Hence, there are
$\delta_1, ..., \delta_{N-1}$ such that $|g_m(x) - g_m(y)| < \epsilon$ for $x, y \in [a, b]$, $|x - y| < \delta_m$, for
$m = 1, 2, ..., N - 1$.
Setting $\delta = \min\{\delta_0, \delta_1, ..., \delta_{N-1}\}$, we see that for $\epsilon > 0$ there is a $\delta > 0$ such
that for all n,
\[ |g_n(x) - g_n(y)| < \epsilon \text{ for } x, y \in [a, b],\ |x - y| < \delta. \]
The functions in K are uniformly continuous with the same $\epsilon$ and $\delta$, sort of
“uniformly uniformly continuous”.
This motivates
Definition 7.4.8. Let $(X, d_x)$ and $(Y, d_y)$ be metric spaces, $A \subset X$, and F a
set of functions from A into Y. F is equicontinuous on A if for every $\epsilon > 0$ there
is a $\delta > 0$ such that for all $f \in F$,
\[ d_y(f(x), f(y)) < \epsilon \text{ for all } x, y \in A \text{ with } d_x(x, y) < \delta. \]
Example 7.4.9. The functions in Example 7.4.7 are equicontinuous.
Example 7.4.10. For fixed L > 0, let F be the set of functions
F = {f : [a, b] → R | |f (x) − f (y)| ≤ L|x − y| for all x, y ∈ [a, b]},
that is, F is the set of Lipschitz continuous functions on [a, b] with constant L.
We also say that F is uniformly Lipschitz continuous with constant L. Then, F is
equicontinuous.
Exercise 7.4.11. Show F as above is equicontinuous.
Example 7.4.12. Consider
\[ F = \{x^n \mid n = 1, 2, 3, ...\} \]
on [0, 1]. Take $\epsilon = \frac{1}{2}$ and choose $0 < \delta < 1$. Set $x = 1 - \frac{\delta}{2}$. Since $\lim_{n \to \infty} x^n = 0$, there
is an N such that $|1 - x^n| > \frac{1}{2} = \epsilon$ for $n \ge N$.
Hence, $|1 - x| < \delta$ while $|1^n - x^n| > \frac{1}{2} = \epsilon$ for $n \ge N$. Since $\delta$ is arbitrary, F
cannot be equicontinuous. The condition for equicontinuity fails at 1.
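The failure of equicontinuity at 1 can be made concrete: for each candidate $\delta$ one can find the first power n that breaks the $\epsilon = \frac{1}{2}$ condition. The helper name `first_bad_n` in this Python sketch is an assumption made for the illustration.

```python
def first_bad_n(delta):
    """Smallest n with |1 - x^n| > 1/2 at the point x = 1 - delta/2."""
    x = 1 - delta / 2
    n = 1
    while 1 - x ** n <= 0.5:
        n += 1
    return n

for delta in (0.5, 0.1, 0.01, 0.001):
    n = first_bad_n(delta)
    x = 1 - delta / 2
    print(f"delta = {delta}: n = {n}, |1 - x| = {1 - x:.4f}, |1 - x^n| = {1 - x ** n:.4f}")
```

Shrinking $\delta$ only postpones the failure to a larger n; it never removes it, which is exactly what equicontinuity would require.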
We now prove the important characterization of compactness in C([a, b]).
Theorem 7.4.13. Arzela-Ascoli Theorem Let K be a closed subset of C([a, b]).
Then K is compact if and only if K is uniformly bounded and equicontinuous.
Proof. First note that since K is closed and C([a, b]) is complete (Theo-
rem 7.2.8), K is complete. By Theorem 4.4.3, K is compact if and only if it is
totally bounded. We will prove that K is uniformly bounded and equicontinuous
if and only if it is totally bounded.
72 7. SEQUENCES OF FUNCTIONS AND C([a, b])

Suppose first K is totally bounded. This means in particular that it is bounded
and therefore uniformly bounded by Theorem 7.4.5. To show K is equicontinuous,
for ε > 0, let {f_1, ..., f_n} be a set of functions in K such that
K ⊂ ⋃_{m=1}^{n} N_ε(f_m).
This means that for any f ∈ K, there is an m, 1 ≤ m ≤ n, such that
sup_{a≤x≤b} |f(x) − f_m(x)| < ε.

Each of the f_m is uniformly continuous on [a, b] and since there is a finite number
of functions f_1, ..., f_n, there is a δ > 0 such that
|f_m(x) − f_m(y)| < ε for 1 ≤ m ≤ n, x, y ∈ [a, b], |x − y| < δ.
Choosing f ∈ K, we choose f_m as above and write
|f(x) − f(y)| ≤ |f(x) − f_m(x)| + |f_m(x) − f_m(y)| + |f_m(y) − f(y)|
and with δ chosen as above,
|f(x) − f(y)| < 3ε,
for all x, y ∈ [a, b] with |x − y| < δ.
Since f was chosen arbitrarily, K is equicontinuous.
Now, we show that if K is uniformly bounded and equicontinuous, then it is
totally bounded. So, given any ε > 0, we construct a finite set of functions F such
that
K ⊂ ⋃_{f∈F} N_ε(f),
which means that given g ∈ K there is an f ∈ F such that
sup_{a≤x≤b} |f(x) − g(x)| < ε.

Note that F does not have to be contained in K, just in C([a, b]).


Given ε > 0, by equicontinuity, there is a δ > 0 such that
|g(x) − g(y)| < ε/5 for all g ∈ K, x, y ∈ [a, b] with |x − y| < δ.
Using this δ, we create a mesh on [a, b] with nodes {x_1, ..., x_n} such that every
point in [a, b] is within δ of one of these points.

[Figure: the interval [a, b] with mesh nodes x_1, x_2, x_3, ...; each gap is less than δ.]

Note: we are using the fact that [a, b] is totally bounded!



Now choose M ∈ N such that |g(x)| ≤ M for all a ≤ x ≤ b and all g ∈ K.
Choose m ∈ N with 1/m < ε/5 and partition [−M, M] into 2Mm congruent intervals
with nodes
−M = y_0 < y_1 < ... < y_{2Mm} = M.
(See Figure 7.6.)

Figure 7.6. The partition of [−M, M] from y_0 = −M to y_{2Mm} = M, with spacing Δy = 2M/(2Mm) = 1/m < ε/5.

We now have a grid of points {(x_i, y_j) | 1 ≤ i ≤ n, 0 ≤ j ≤ 2Mm} for the
rectangle [a, b] × [−M, M].

[Figure: the grid on [a, b] × [−M, M], with nodes x_1, ..., x_n horizontally and y_0 = −M, ..., y_{2Mm} = M vertically.]

Let F be the set of continuous functions on [a, b] that are piecewise linear whose
“corner points” occur at points on the grid (see Figure 7.7).

Figure 7.7. 3 examples

F has a finite number, (2Mm + 1)^n, of elements. Choose g ∈ K. There is at least
one f ∈ F such that
|g(x_j) − f(x_j)| < ε/5, j = 1, 2, ..., n.

Figure 7.8. At each node x_j, the values of f and g differ by less than ε/5.

We want to show g(x) is close to f(x) for all a ≤ x ≤ b (see Figure 7.8). Choose
a ≤ x ≤ b. Now x_j ≤ x ≤ x_{j+1} for some 1 ≤ j ≤ n − 1. By the equicontinuity of K
and choice of δ, we know |g(y) − g(x_j)| < ε/5 for x_j ≤ y ≤ x_{j+1}. It follows that
|f(x_{j+1}) − f(x_j)| ≤ |f(x_{j+1}) − g(x_{j+1})| + |g(x_{j+1}) − g(x_j)| + |g(x_j) − f(x_j)|
< ε/5 + ε/5 + ε/5 = 3ε/5.
Since f is linear on [x_j, x_{j+1}],
|f(y) − f(x_j)| ≤ |f(x_{j+1}) − f(x_j)| < 3ε/5, x_j ≤ y ≤ x_{j+1}.

So,
|g(x) − f(x)| ≤ |g(x) − g(x_j)| + |g(x_j) − f(x_j)| + |f(x_j) − f(x)|
< ε/5 + ε/5 + 3ε/5 = ε.
F is an ε-net for K. □
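
The grid construction in the proof can be tried on a concrete function. The following is a minimal sketch under stated assumptions, not the notes' own procedure: the names build_net_element and eval_piecewise_linear are hypothetical, the mesh is taken uniform with a and b included as nodes, and f is chosen by rounding g(x_i) to the nearest grid height. The sample g(x) = sin(3x) on [0, 1] has |g| ≤ 1 and Lipschitz constant 3, so δ = ε/15 gives |g(x) − g(y)| < ε/5 whenever |x − y| < δ.

```python
import math


def build_net_element(g, a, b, M, eps, delta):
    """Return nodes xs and grid heights ys of a piecewise-linear f near g.

    Assumes |g| <= M on [a, b], with M a positive integer.
    """
    # Uniform mesh on [a, b] with spacing strictly less than delta.
    n = math.floor((b - a) / delta) + 2
    xs = [a + (b - a) * i / (n - 1) for i in range(n)]
    xs[-1] = b  # guard against floating-point drift at the right endpoint
    # Vertical spacing 1/m < eps/5, so grid heights are multiples of 1/m.
    m = math.floor(5.0 / eps) + 1
    ys = []
    for x in xs:
        j = round(g(x) * m)              # nearest grid index
        j = max(-M * m, min(M * m, j))   # stay inside [-M, M]
        ys.append(j / m)                 # |g(x) - j/m| <= 1/(2m) < eps/5
    return xs, ys


def eval_piecewise_linear(xs, ys, x):
    """Evaluate the piecewise-linear interpolant of (xs, ys) at x."""
    x = min(max(x, xs[0]), xs[-1])
    for j in range(len(xs) - 1):
        if xs[j] <= x <= xs[j + 1]:
            t = (x - xs[j]) / (xs[j + 1] - xs[j])
            return (1.0 - t) * ys[j] + t * ys[j + 1]
    raise ValueError("x outside the mesh")


def g(x):
    return math.sin(3.0 * x)


eps = 0.1
xs, ys = build_net_element(g, 0.0, 1.0, 1, eps, eps / 15.0)
err = max(abs(g(k / 2000) - eval_piecewise_linear(xs, ys, k / 2000))
          for k in range(2001))
print(len(xs), err, err < eps)
```

The sampled maximum of |g − f| only approximates the supremum, but it illustrates the bound |g(x) − f(x)| < ε obtained in the proof.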
Example 7.4.14. We don’t have time for details, but a classic application of
the Arzela-Ascoli Theorem is to show that the forward Euler approximation
Y_0 = y_0,
Y_n = Y_{n−1} + Δt f(Y_{n−1}), n = 1, 2, ..., N,
where Δt = T/N, for the initial value problem
y′ = f(y), 0 ≤ t ≤ T,
y(0) = y_0,
converges to y for 0 ≤ t ≤ T if f is continuous.
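
The forward Euler recursion of Example 7.4.14 is short enough to write out directly. This is an illustrative sketch only: the function name forward_euler and the test problem y′ = −y, y(0) = 1 (exact solution e^{−t}) are assumptions chosen for the demonstration, not taken from the notes.

```python
import math


def forward_euler(f, y0, T, N):
    """Return [Y_0, ..., Y_N] with Y_n = Y_{n-1} + dt * f(Y_{n-1}), dt = T/N."""
    dt = T / N
    Y = [y0]
    for n in range(1, N + 1):
        Y.append(Y[n - 1] + dt * f(Y[n - 1]))
    return Y


# Assumed test problem: y' = -y, y(0) = 1 on [0, 1], exact solution exp(-t).
errors = []
for N in (10, 100, 1000):
    Y = forward_euler(lambda y: -y, 1.0, 1.0, N)
    errors.append(abs(Y[-1] - math.exp(-1.0)))
    print(N, errors[-1])
```

For this smooth test problem the error at t = 1 shrinks as N grows, which is the behavior the convergence statement describes.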
