Mathematical Analysis

Volume I

Teo Lee Peng


January 1, 2024
Contents

Preface

Chapter 1 The Real Numbers
1.1 Logic, Sets and Functions
1.2 The Set of Real Numbers and Its Subsets
1.3 Bounded Sets and the Completeness Axiom
1.4 Distributions of Numbers
1.5 The Convergence of Sequences
1.6 Closed Sets and Limit Points
1.7 The Monotone Convergence Theorem
1.8 Sequential Compactness

Chapter 2 Limits of Functions and Continuity
2.1 Limits of Functions
2.2 Continuity of Functions
2.3 The Extreme Value Theorem
2.4 The Intermediate Value Theorem
2.5 Uniform Continuity
2.6 Monotonic Functions and Inverses of Functions

Chapter 3 Differentiating Functions of a Single Variable
3.1 Derivatives
3.2 Chain Rule and Derivatives of Inverse Functions
3.3 The Mean Value Theorem and Local Extrema
3.4 The Cauchy Mean Value Theorem
3.5 Transcendental Functions
    3.5.1 The Logarithmic Function
    3.5.2 The Exponential Functions
    3.5.3 The Trigonometric Functions
    3.5.4 The Inverse Trigonometric Functions
3.6 L’Hôpital’s Rules
3.7 Concavity of Functions

Chapter 4 Integrating Functions of a Single Variable
4.1 Riemann Integrals of Bounded Functions
4.2 Properties of Riemann Integrals
4.3 Functions that are Riemann Integrable
4.4 The Fundamental Theorem of Calculus
4.5 Integration by Substitution and Integration by Parts
    4.5.1 Integration by Substitution
    4.5.2 Integration by Parts
4.6 Improper Integrals

Chapter 5 Infinite Series of Numbers and Infinite Products
5.1 Limit Superior and Limit Inferior
5.2 Convergence of Series
5.3 Rearrangement of Series
5.4 Infinite Products
5.5 Double Sequences and Double Series

Chapter 6 Sequences and Series of Functions
6.1 Convergence of Sequences and Series of Functions
6.2 Uniform Convergence of Sequences and Series of Functions
6.3 Properties of Uniform Limits of Functions
6.4 Power Series
6.5 Taylor Series and Taylor Polynomials
6.6 Examples and Applications
    6.6.1 The Irrationality of e
    6.6.2 The Irrationality of π
    6.6.3 Infinitely Differentiable Functions that are Non-Analytic
    6.6.4 A Continuous Function that is Nowhere Differentiable
    6.6.5 The Weierstrass Approximation Theorem

References

Preface

Mathematical analysis is a standard course which introduces students to rigorous
reasoning in mathematics, as well as the theories needed for advanced analysis
courses. It is a compulsory course for all mathematics majors. It is also strongly
recommended for students who major in computer science, physics, data science,
financial analysis, and other areas that require strong analytical skills. Some
standard textbooks in mathematical analysis include the classical ones by Apostol
[Apo74] and Rudin [Rud76], and the modern ones by Bartle [BS92], Fitzpatrick
[Fit09], Abbott [Abb15], Tao [Tao16, Tao14] and Zorich [Zor15, Zor16].
This book is the first volume of a textbook intended for a one-year course in
mathematical analysis. We introduce the fundamental concepts in a pedagogical
way. Many examples are given to illustrate the theories. We assume that students
are familiar with the material of calculus, such as that in the book [SCW20].
Thus, we do not emphasize computational techniques. Emphasis is put on
building up analytical skills through rigorous reasoning.
Besides calculus, it is also assumed that students have taken introductory
courses in discrete mathematics and linear algebra, which cover topics such as
logic, sets, functions, vector spaces, inner products, and quadratic forms. Whenever
needed, these concepts will be briefly reviewed.
In this book, we have carefully defined all the mathematical terms we use.
While most of the terms have standard definitions, some may have
definitions that differ from author to author. Readers are advised to check the
definitions of the terms used in this book when they encounter them. This can be
easily done by using the search function provided by any PDF viewer. Readers
are also encouraged to fully utilize the hyper-referencing provided.

Teo Lee Peng


Chapter 1. The Real Numbers 1

Chapter 1

The Real Numbers

1.1 Logic, Sets and Functions

In this section, we give a brief review of propositional logic, sets and functions.
It is assumed that students have taken an introductory course which covers these
topics, such as a course in discrete mathematics [Ros18].

Definition 1.1 Proposition


A proposition, usually denoted by p, is a declarative sentence that is either
true or false, but not both.

Definition 1.2 Negation of a Proposition


If p is a proposition, ¬p is the negation of p. The proposition p is true if
and only if the negation ¬p is false.

From two propositions p and q, we can apply logical operators and obtain a
compound proposition.

Definition 1.3 Conjunction of Propositions


If p and q are propositions, p ∧ q is the conjunction of p and q, read as "p
and q". The proposition p ∧ q is true if and only if both p and q are true.

Definition 1.4 Disjunction of Propositions


If p and q are propositions, p ∨ q is the disjunction of p and q, read as "p or
q". The proposition p ∨ q is true if and only if at least one of p and q is true.

Definition 1.5 Implication of Propositions


If p and q are propositions, the proposition p → q is read as "p implies q".
It is false if and only if p is true but q is false.

p → q can also be read as "if p then q" or "p only if q". In mathematics, we
usually write p =⇒ q instead of p → q.

Definition 1.6 Double Implication


If p and q are propositions, the proposition p ←→ q is read as "p if and
only if q". It is the conjunction of p → q and q → p. Hence, it is true if and
only if both p and q are true, or both p and q are false.

The statement “p if and only if q” is often expressed as p ⇐⇒ q.

Two compound propositions p and q are said to be logically equivalent, denoted
by p ≡ q, provided that p is true if and only if q is true.
Logical equivalences are important for working with mathematical proofs.
Some equivalences, such as the commutative, associative and distributive laws, are
obvious. Other important equivalences are listed in the theorem below.

Theorem 1.1 Logical Equivalences


Let p, q, r be propositions.

1. p → q ≡ ¬p ∨ q

2. De Morgan’s Law

(i) ¬(p ∨ q) ≡ ¬p ∧ ¬q
(ii) ¬(p ∧ q) ≡ ¬p ∨ ¬q

A very important equivalence is the equivalence of an implication with its
contrapositive.

Theorem 1.2 Contraposition


If p and q are propositions, p → q is equivalent to ¬q → ¬p.
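
The equivalences in Theorem 1.1 and Theorem 1.2 can be checked mechanically by enumerating all truth assignments of p and q. The following is a minimal Python sketch; the helper names `implies` and `equivalent` and the dictionary keys are illustrative choices, not notation from the text.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    # Definition 1.5: p -> q is false only when p is true and q is false.
    return not (p and not q)

def equivalent(f, g) -> bool:
    # Two compound propositions are logically equivalent when they take
    # the same truth value for every assignment of p and q.
    return all(f(p, q) == g(p, q) for p, q in product([True, False], repeat=2))

checks = {
    "implication": equivalent(implies, lambda p, q: (not p) or q),
    "de_morgan_or": equivalent(lambda p, q: not (p or q),
                               lambda p, q: (not p) and (not q)),
    "de_morgan_and": equivalent(lambda p, q: not (p and q),
                                lambda p, q: (not p) or (not q)),
    "contrapositive": equivalent(implies, lambda p, q: implies(not q, not p)),
    # For contrast: an implication is NOT equivalent to its converse.
    "converse": equivalent(implies, lambda p, q: implies(q, p)),
}

for name, holds in checks.items():
    print(name, holds)
```

The last entry is included only for contrast: the truth tables of p → q and q → p differ, so the check reports that they are not equivalent.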

In mathematics, we are often dealing with statements that depend on variables.
Quantifiers are used to specify the extent to which such a statement is true. Two
commonly used quantifiers are "for all" (∀) and "there exists" (∃).
For negation of statements with quantifiers, we have the following generalized
De Morgan’s law.

Theorem 1.3 Generalized De Morgan’s Law

1. ¬ (∀x P (x)) ≡ ∃x ¬P (x)

2. ¬ (∃x P (x)) ≡ ∀x ¬P (x)

For nested quantifiers, the ordering is important if different types of quantifiers
are involved. For example, the statement

∀x ∃y x + y = 0

is not equivalent to the statement

∃y ∀x x + y = 0.

When the domains for x and y are both the set of real numbers, the first statement
is true, while the second statement is false.
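
The effect of swapping the quantifiers can be seen on a small finite domain, where "for all" and "there exists" become Python's `all` and `any`. This is only a sketch: the domain D = {−3, ..., 3} is an assumed finite stand-in for the set of real numbers, chosen so that every element has its negative in D.

```python
# A finite domain that is closed under negation (an illustrative assumption).
D = range(-3, 4)

# "for all x, there exists y such that x + y = 0"
forall_exists = all(any(x + y == 0 for y in D) for x in D)

# "there exists y such that for all x, x + y = 0"
exists_forall = any(all(x + y == 0 for x in D) for y in D)

print(forall_exists)  # True: take y = -x, which depends on x
print(exists_forall)  # False: no single y works for every x
```

In the first statement y may depend on x; in the second, one y must work for every x simultaneously.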
For a set A, we use the notation x ∈ A to denote x is an element of the set A;
and the notation x ∉ A to denote x is not an element of A.

Definition 1.7 Equal Sets


Two sets A and B are equal if they have the same elements. In logical
expression, A = B if and only if

x ∈ A ⇐⇒ x ∈ B.

Definition 1.8 Subset


If A and B are sets, we say that A is a subset of B, denoted by A ⊂ B,
if every element of A is an element of B. In logical expression, A ⊂ B
means that
x ∈ A =⇒ x ∈ B.

When A is a subset of B, we will also say that A is contained in B, or B
contains A.
We say that A is a proper subset of B if A is a subset of B and A ≠ B. In
some textbooks, the symbol "⊆" is used to denote subset, and the symbol "⊂"
is reserved for proper subset. In this book, we will not make such a distinction.
Whenever we write A ⊂ B, it means A is a subset of B, not necessarily a proper
subset.
There are operations that can be defined on sets, such as union, intersection,
difference and complement.

Definition 1.9 Union of Sets


If A and B are sets, the union of A and B is the set A ∪ B which contains
all elements that are either in A or in B. In logical expression,

x ∈ A ∪ B ⇐⇒ (x ∈ A) ∨ (x ∈ B).

Definition 1.10 Intersection of Sets


If A and B are sets, the intersection of A and B is the set A ∩ B which
contains all elements that are in both A and B. In logical expression,

x ∈ A ∩ B ⇐⇒ (x ∈ A) ∧ (x ∈ B).

Definition 1.11 Difference of Sets


If A and B are sets, the difference of A and B is the set A \ B which
contains all elements that are in A and not in B. In logical expression,

x ∈ A \ B ⇐⇒ (x ∈ A) ∧ (x ∉ B).

Definition 1.12 Complement of a Set


If A is a set that is contained in a universal set U, the complement of A in
U is the set A^C which contains all elements that are in U but not in A. In
logical expression,

x ∈ A^C ⇐⇒ (x ∈ U) ∧ (x ∉ A).

Since a universal set can vary from context to context, we will usually avoid
using the notation A^C and use U \ A instead for the complement of A in U. The
advantage of using the notation A^C is that De Morgan’s law takes a more succinct
form.

Proposition 1.4 De Morgan’s Law for Sets

If A and B are sets in a universal set U, and A^C and B^C are their
complements in U, then

1. (A ∪ B)^C = A^C ∩ B^C

2. (A ∩ B)^C = A^C ∪ B^C
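
De Morgan's law for sets can be tried out on small finite sets with Python's built-in `set` type. This is a sketch only: the particular sets U, A, B and the helper `complement` are illustrative choices, and a check on one example is not a proof.

```python
U = set(range(10))   # universal set (an arbitrary choice for illustration)
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def complement(S):
    # Complement of S in U, written U \ S in the text.
    return U - S

law_union = complement(A | B) == (complement(A) & complement(B))
law_intersection = complement(A & B) == (complement(A) | complement(B))

print(law_union, law_intersection)
```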

Definition 1.13 Functions


When A and B are sets, a function f from A to B, denoted by f : A → B,
is a correspondence that assigns every element of A a unique element in B.
If a is in A, the image of a under the function f is denoted by f (a), and it
is an element of B.
A is called the domain of f , and B is called the codomain of f .

Definition 1.14 Image of a Set


If f : A → B is a function and C is a subset of A, the image of C under f
is the set
f (C) = {f (c) | c ∈ C} .
f (A) is called the range of f .

Definition 1.15 Preimage of a Set


If f : A → B is a function and D is a subset of B, the preimage of D under
f is the set
f^{-1}(D) = {a ∈ A | f(a) ∈ D}.

Notice that f^{-1}(D) is merely notation; it does not mean that the function f has an
inverse.
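
For a function on a finite set, the image and preimage of Definitions 1.14 and 1.15 can be computed directly. The sketch below uses hypothetical helper names `image` and `preimage`, and the squaring function on A = {−2, −1, 0, 1, 2} as an assumed example. The squaring function has no inverse on this domain, yet every preimage is still well defined.

```python
def image(f, C):
    # f(C) = {f(c) | c in C}
    return {f(c) for c in C}

def preimage(f, A, D):
    # f^{-1}(D) = {a in A | f(a) in D}; the domain A must be given explicitly.
    return {a for a in A if f(a) in D}

A = {-2, -1, 0, 1, 2}

def square(a):
    return a * a

range_of_f = image(square, A)             # the range f(A)
pre_1_4 = preimage(square, A, {1, 4})     # two points map to 1, two map to 4
pre_3 = preimage(square, A, {3})          # nothing maps to 3

print(range_of_f, pre_1_4, pre_3)
```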
Next, we turn to discuss injectivity and surjectivity of functions.

Definition 1.16 Injection


We say that a function f : A → B is an injection, or the function f :
A → B is injective, or the function f : A → B is one-to-one, if no pair of
distinct elements of A are mapped to the same element of B. Namely,

a_1 ≠ a_2 =⇒ f(a_1) ≠ f(a_2).

By contraposition, a function is injective provided that

f(a_1) = f(a_2) =⇒ a_1 = a_2.

Definition 1.17 Surjection


We say that a function f : A → B is a surjection, or the function f : A →
B is surjective, or the function f : A → B is onto, if every element of B
is the image of some element in A. Namely,

∀b ∈ B, ∃a ∈ A, f (a) = b.

Equivalently, f : A → B is surjective if the range of f is B. Namely,

f(A) = B.

Definition 1.18 Bijection


We say that a function f : A → B is a bijection, or the function f : A → B
is bijective, if it is both injective and surjective.
A bijection is also called a one-to-one correspondence.
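
For functions between finite sets, Definitions 1.16 to 1.18 translate into short checks. The following Python sketch uses hypothetical helper names and assumes that f really maps A into B; it is meant only to illustrate the definitions.

```python
def is_injective(f, A):
    # No two distinct elements of A share an image.
    images = [f(a) for a in A]
    return len(images) == len(set(images))

def is_surjective(f, A, B):
    # The range f(A) equals the codomain B (assumes f maps A into B).
    return {f(a) for a in A} == set(B)

def is_bijective(f, A, B):
    return is_injective(f, A) and is_surjective(f, A, B)

# Squaring on {-2, ..., 2} onto {0, 1, 4}: surjective but not injective.
sq = (is_injective(lambda a: a * a, {-2, -1, 0, 1, 2}),
      is_surjective(lambda a: a * a, {-2, -1, 0, 1, 2}, {0, 1, 4}))

# Doubling from {0, 1, 2} into {0, ..., 4}: injective but not surjective.
dbl = (is_injective(lambda a: 2 * a, {0, 1, 2}),
       is_surjective(lambda a: 2 * a, {0, 1, 2}, {0, 1, 2, 3, 4}))

# Shifting from {0, 1, 2} to {1, 2, 3}: a bijection.
shift = is_bijective(lambda a: a + 1, {0, 1, 2}, {1, 2, 3})

print(sq, dbl, shift)
```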

Finally, we would like to make a remark about some notations. If f : A → B
is a function with domain A, and C is a subset of A, the restriction of f to C is
the function f|_C : C → B defined by f|_C(c) = f(c) for all c ∈ C. When no
confusion arises, we will often denote this function simply as f : C → B.

1.2 The Set of Real Numbers and Its Subsets

In this section, we introduce the set of real numbers using an intuitive approach.

Definition 1.19 Natural Numbers


The set of natural numbers N is the set that contains the counting numbers,
1, 2, 3, . . ., which are also called positive integers.

N is an inductive set. The number 1 is the smallest element of this set. If n is
a natural number, then n + 1 is also a natural number.
The number 0 corresponds to nothing.
For every positive integer n, −n is the number which produces 0 when added to
n. This number −n is called the negative of n, or the additive inverse of n.
The numbers −1, −2, −3, . . . are called negative integers.

Definition 1.20 Integers


The set of integers Z is the set that contains all positive integers, negative
integers and 0.

We will also use the notation Z+ to denote the set of positive integers.

Definition 1.21 Rational Numbers


The set of rational numbers Q is the set defined as

Q = {m/n | m, n ∈ Z, n ≠ 0}.

Each rational number is a quotient of two integers, where the denominator is
nonzero. The set of integers Z is a subset of the set of rational numbers Q.
Every rational number m/n has a decimal expansion. For example,

−23/4 = −5.75,
27/7 = 3.857142857142 . . . = 3.8̇57142̇.
The decimal expansion of a rational number is either finite or periodic.
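
One way to see why the expansion is finite or periodic is to carry out long division and record the remainders. Each remainder lies in {0, 1, ..., n − 1}, so either the remainder 0 appears and the expansion terminates, or some remainder repeats and the digits repeat from that point on. The following is a minimal Python sketch; the function name `decimal_expansion` and its return format are assumptions made for illustration.

```python
def decimal_expansion(m: int, n: int):
    """Return (sign, integer part, non-repeating digits, repeating digits) of m/n."""
    if n == 0:
        raise ZeroDivisionError("denominator must be nonzero")
    sign = "-" if (m < 0) != (n < 0) and m != 0 else ""
    m, n = abs(m), abs(n)
    integer_part, remainder = divmod(m, n)
    digits = []
    seen = {}  # remainder -> position of the digit it produces
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, n)
        digits.append(str(digit))
    if remainder == 0:
        # The division terminated: finite expansion, no repeating block.
        return sign, integer_part, "".join(digits), ""
    start = seen[remainder]  # the period begins where this remainder first occurred
    return sign, integer_part, "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(-23, 4))
print(decimal_expansion(27, 7))
print(decimal_expansion(1, 6))
```

The first two calls reproduce the examples above: −23/4 has the finite expansion −5.75, and 27/7 has integer part 3 followed by the repeating block 857142.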

Definition 1.22 Real Numbers


The set of real numbers R is intuitively defined to be the set that contains
all decimal numbers, whose expansions are not necessarily periodic.

The set of real numbers contains the set of rational numbers Q as a subset. If
a real number is not a rational number, we call it an irrational number. The set
of irrational numbers is R \ Q.
It has long been known that there are real numbers that are not rational numbers.
The best example is the number √2, which appears as the length of the diagonal
of a unit square (see Figure 1.1).

Figure 1.1: The number √2.

The addition and multiplication operations defined on the set of natural numbers
can be extended to the set of real numbers consistently.
If a and b are real numbers, a + b is the sum of a and b, and ab is the
product of a and b.
If a and b are positive real numbers, a + b and ab are also positive real numbers.
The set of real numbers with the addition and multiplication operations is
a field, a concept which you will learn in abstract algebra. These operations satisfy the
following properties.

Properties of Real Numbers

1. Commutativity of Addition

a+b=b+a

2. Associativity of Addition

(a + b) + c = a + (b + c)

3. Additive Identity
a+0=0+a=a
0 is called the additive identity.

4. Additive Inverse
For every real number a, the negative of a, denoted by −a, satisfies

a + (−a) = (−a) + a = 0

5. Commutativity of Multiplication

ab = ba

6. Associativity of Multiplication

(ab)c = a(bc)

7. Multiplicative Identity

a·1=1·a=a

1 is called the multiplicative identity.

8. Multiplicative Inverse
For every nonzero real number a, the reciprocal of a, denoted by 1/a,
satisfies

a · (1/a) = (1/a) · a = 1

9. Distributivity
a(b + c) = ab + ac

The set of complex numbers C is the set that contains all numbers of the
form a + ib, where a and b are real numbers, and i is the purely imaginary
number such that i^2 = −1. It contains the set of real numbers R as a subset.
Addition and multiplication can be extended to the set of complex numbers. These
two operations on complex numbers also satisfy all the properties listed above.
Nevertheless, we shall focus on the set of real numbers in this course.
There are special subsets of real numbers which are called intervals. There
are nine types of intervals: four types are finite, and five types are semi-infinite or
infinite. Their definitions are as follows.

Finite Intervals
1. (a, b) = {x ∈ R | a < x < b}

2. [a, b) = {x ∈ R | a ≤ x < b}

3. (a, b] = {x ∈ R | a < x ≤ b}

4. [a, b] = {x ∈ R | a ≤ x ≤ b}

For the intervals (a, b), [a, b), (a, b], [a, b], the points a and b are the end points
of the interval, while any point x with a < x < b is an interior point.

Semi-Infinite or Infinite Intervals


5. (a, ∞) = {x ∈ R | x > a}

6. [a, ∞) = {x ∈ R | x ≥ a}

7. (−∞, a) = {x ∈ R | x < a}

8. (−∞, a] = {x ∈ R | x ≤ a}

9. (−∞, ∞) = R.

For the intervals (a, ∞), [a, ∞), (−∞, a) and (−∞, a], a is the end point of
the interval, while any other point in the interval is an interior point.
The set of natural numbers is a well-ordered set: every nonempty subset
of positive integers has a smallest element. This statement is equivalent to the
principle of mathematical induction, which is one of the important strategies in
proving mathematical statements.

Proposition 1.5 Principle of Mathematical Induction

Let P (n) be a sequence of statements that are indexed by the set of positive
integers Z+ . Assume that the following two assertions are true.

1. The statement P (1) is true.

2. For every positive integer n, if the statement P (n) is true, the statement
P (n + 1) is also true.

Then we can conclude that for all positive integers n, the statement P (n) is
true.

Before ending this section, let us discuss the absolute value and some useful
inequalities.

Definition 1.23 Absolute Value


Given a real number x, the absolute value of x, denoted by |x|, is defined
to be the nonnegative number

|x| = x, if x ≥ 0,
|x| = −x, if x < 0.

In particular, |−x| = |x|.

For example, |2.7| = 2.7, |−2.7| = 2.7.


The absolute value |x| can be interpreted as the distance between the number
x and the number 0 on the number line. For any two real numbers x and y, |x − y|
is the distance between x and y. Hence, the absolute value can be used to express
an interval.

Intervals Defined by Absolute Values


Let a be a real number.

1. If r is a positive number,

|x − a| < r ⇐⇒ −r < x − a < r ⇐⇒ x ∈ (a − r, a + r).

2. If r is a nonnegative number,

|x − a| ≤ r ⇐⇒ −r ≤ x − a ≤ r ⇐⇒ x ∈ [a − r, a + r].

Absolute values behave well with respect to multiplication.

Proposition 1.6
Given real numbers x and y,

|xy| = |x||y|.

In general, |x + y| is not equal to |x| + |y|. Instead, we have an inequality,
known as the triangle inequality, which is very important in analysis.

Proposition 1.7 Triangle Inequality


Given real numbers x and y,

|x + y| ≤ |x| + |y|.

This is proved by discussing all four possible cases where x ≥ 0 or x < 0,
y ≥ 0 or y < 0.
A common mistake students tend to make is to replace both plus signs in the
triangle inequality directly by minus signs. This is absurd: for x = 0 and y = 1,
it would give 1 = |x − y| ≤ |x| − |y| = −1. The correct inequality is

|x − y| ≤ |x| + |−y| = |x| + |y|.

For the inequality in the other direction, we have



Proposition 1.8
Given real numbers x and y,

|x − y| ≥ ||x| − |y|| .

Proof
Since |x − y| ≥ 0, the statement is equivalent to

−|x − y| ≤ |x| − |y| ≤ |x − y|.

By triangle inequality,

|x − y| + |y| ≥ |x − y + y| = |x|.

Hence,
|x| − |y| ≤ |x − y|.
By triangle inequality again,

|x − y| + |x| = |y − x| + |x| ≥ |y − x + x| = |y|.

Hence,
−|x − y| ≤ |x| − |y|.
This completes the proof.
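
As a sanity check (not a proof), Propositions 1.6 to 1.8 can be tested on a finite grid of rational numbers using exact arithmetic. The grid of quarter-integers between −3 and 3 and the variable names are assumptions made for this sketch. The last flag tests the incorrect "inequality" obtained by replacing both plus signs with minus signs.

```python
from fractions import Fraction
from itertools import product

# Quarter-integers from -3 to 3, represented exactly.
samples = [Fraction(k, 4) for k in range(-12, 13)]
pairs = list(product(samples, repeat=2))

product_rule = all(abs(x * y) == abs(x) * abs(y) for x, y in pairs)
triangle = all(abs(x + y) <= abs(x) + abs(y) for x, y in pairs)
reverse_triangle = all(abs(x - y) >= abs(abs(x) - abs(y)) for x, y in pairs)

# The common mistake: |x - y| <= |x| - |y| fails, for example at x = 0, y = 1.
naive_minus = all(abs(x - y) <= abs(x) - abs(y) for x, y in pairs)

print(product_rule, triangle, reverse_triangle, naive_minus)
```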

Example 1.1

If |x − 5| ≤ 2, show that
9 ≤ x^2 ≤ 49.

Solution
|x − 5| ≤ 2 implies 3 ≤ x ≤ 7. This means that x is positive. The
inequality x ≥ 3 then implies that x^2 ≥ 9, and the inequality x ≤ 7 implies
that x^2 ≤ 49. Therefore,
9 ≤ x^2 ≤ 49.

Finally, we have the useful Cauchy’s inequality.



Proposition 1.9 Cauchy’s Inequality


For any real numbers a and b,

ab ≤ (a^2 + b^2)/2.

Proof
This is just a consequence of (a − b)^2 ≥ 0.

An immediate consequence of Cauchy’s inequality is the arithmetic mean-geometric
mean inequality. For any nonnegative numbers a and b, the geometric
mean of a and b is √(ab), and the arithmetic mean is (a + b)/2.

Proposition 1.10
If a ≥ 0, b ≥ 0, then

√(ab) ≤ (a + b)/2.
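
For completeness, here is one way the two inequalities above can be written out, as a sketch of the steps the text leaves to the reader. The second display applies Cauchy's inequality to the numbers √a and √b, and so relies on the existence of square roots of nonnegative numbers.

```latex
% Cauchy's inequality from (a - b)^2 >= 0:
\begin{align*}
(a-b)^2 \ge 0
  &\iff a^2 - 2ab + b^2 \ge 0 \\
  &\iff ab \le \frac{a^2 + b^2}{2}.
\end{align*}

% AM-GM: for a, b >= 0, apply Cauchy's inequality to \sqrt{a} and \sqrt{b}.
\[
\sqrt{ab} = \sqrt{a}\,\sqrt{b}
  \le \frac{(\sqrt{a})^2 + (\sqrt{b})^2}{2}
  = \frac{a+b}{2}.
\]
```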

Exercises 1.2
Question 1
Use induction to show that for any positive integer n,

n! ≥ 2^{n−1}.

Question 2: Bernoulli’s Inequality


Given that a > −1, use induction to show that

(1 + a)^n ≥ 1 + na

for all positive integers n.

Question 3
Let n be a positive integer. If c_1, c_2, . . . , c_n are numbers that lie in the
interval (0, 1), show that

(1 − c_1)(1 − c_2) · · · (1 − c_n) ≥ 1 − c_1 − c_2 − · · · − c_n.

1.3 Bounded Sets and the Completeness Axiom

In this section, we discuss a property of real numbers called completeness. The
set of rational numbers does not have this property.
First, we introduce the concept of boundedness.

Definition 1.24 Boundedness


Let S be a subset of R.

1. We say that S is bounded above if there is a number c such that

x ≤ c for all x ∈ S.

Such a c is called an upper bound of S.

2. We say that S is bounded below if there is a number b such that

x ≥ b for all x ∈ S.

Such a b is called a lower bound of S.

3. We say that S is bounded if it is bounded above and bounded below. In
this case, there is a number M such that

|x| ≤ M for all x ∈ S.

Let us look at some examples.

Example 1.2
Determine whether each of the following sets of real numbers is bounded
above, whether it is bounded below, and whether it is bounded.

(a) A = {x | x < 2}

(b) B = {x | x > −2}

(c) C = {x | − 2 < x < 2}.



Solution
(a) The set A is bounded above since every element of A is less than or
equal to 2. It is not bounded below, and so it is not bounded.

(b) The set B is bounded below since every element of B is larger than or
equal to −2. It is not bounded above, and so it is not bounded.

(c) The set C is equal to A∩B. So it is bounded above and bounded below.
Therefore, it is bounded.

Figure 1.2: The sets A, B, C in Example 1.2.

If S is a set of real numbers, the negative of S, denoted by −S, is the set

−S = {−x | x ∈ S} .

For example, the set B = {x | x > −2} is the negative of the set A = {x | x < 2},
and the set C = {x | −2 < x < 2} is the negative of itself (see Figure 1.2). It is
obvious that S is bounded above if and only if −S is bounded below.
Next, we recall the definition of maximum and minimum of a set.

Definition 1.25 Maximum and Minimum


Let S be a nonempty subset of real numbers.

1. A number c is called the largest element or maximum of S if c is an
element of S and
x ≤ c for all x ∈ S.
If the maximum of the set S exists, we denote it by max S.

2. A number b is called the smallest element or minimum of S if b is an
element of S and
x ≥ b for all x ∈ S.
If the minimum of the set S exists, we denote it by min S.

Obviously, b is the maximum of a set S if and only if −b is the minimum of
the set −S.

Example 1.3

For the set S_1 = [−2, 2], −2 is the minimum, and 2 is the maximum.
For the set S_2 = [−2, 2), −2 is the minimum, and there is no maximum.

This example shows that a bounded set does not necessarily have a maximum
or a minimum. However, a nonempty finite set always has a maximum and a minimum.

Proposition 1.11
If S is a nonempty finite set, then S has a maximum and a minimum.

Next, we introduce the concept of least upper bound.



Definition 1.26 Least Upper Bound


Let S be a nonempty subset of real numbers that is bounded above, and
let U_S be the set of upper bounds of S. Then U_S is a nonempty set that is
bounded below. If U_S has a smallest element u, we say that u is the least
upper bound or supremum of S, and denote it by

u = sup S.

Example 1.4

For the sets S_1 = [−2, 2] and S_2 = [−2, 2),

sup S_1 = sup S_2 = 2.

Notice that sup S, if it exists, is not necessarily an element of S. The following
proposition describes the relation between the maximum of a set (if it exists) and its
least upper bound.

Proposition 1.12 Supremum and Maximum


Let S be a nonempty subset of real numbers. Then S has a maximum if and
only if S is bounded above and sup S is in S.

One natural question to ask is: if S is a nonempty subset of real numbers that
is bounded above, does S necessarily have a least upper bound? The completeness
axiom asserts that this is true.

Completeness Axiom
If S is a nonempty subset of real numbers that is bounded above, then S
has a least upper bound.

The reason this is formulated as an axiom is that we cannot prove it from our
intuitive definition of real numbers. Therefore, we will assume it as a fact for
the set of real numbers. Many of the theorems that we are going to derive later are
consequences of this axiom.
Actually, the set of real numbers can be constructed axiomatically, taking it
to be a set that contains the set of rational numbers and satisfies all the properties
of the addition and multiplication operations, as well as the completeness axiom.
However, this is a tedious construction and would take us too far afield.
To show that the completeness axiom is not completely trivial, we show in
Example 1.6 that if we only consider the set of rational numbers, we can find a
subset of rational numbers A that is bounded above but does not have a least upper
bound in the set of rational numbers. We look at the following example first.

Example 1.5
Define the set of real numbers S by

S = {x ∈ R | x^2 < 2}.

Show that S is nonempty and is bounded above. Conclude that the set

A = {x ∈ Q | x^2 < 2}

is also nonempty and is bounded above by a rational number.

Solution
The number 1 is in S, and so S is nonempty. For any x ∈ S, x^2 < 2 < 4,
and hence x < 2. This shows that S is bounded above by 2. Since 1 and 2
are rational numbers, the same reasoning shows that the set A is nonempty
and is bounded above by a rational number.

Example 1.6
Consider the set

A = {x ∈ Q | x^2 < 2}.

By Example 1.5, A is a nonempty subset of rational numbers that is
bounded above by 2. Let U_A be the set of upper bounds of A in Q. Namely,

U_A = {c ∈ Q | x ≤ c for all x ∈ A}.

Show that U_A does not have a smallest element.



Solution
We use proof by contradiction. Assume that U_A has a smallest element c_1,
which is an upper bound of A that is smaller than or equal to any upper
bound of A. Then for any x ∈ A,

x ≤ c_1.

Since 1 is in A, c_1 is a positive rational number. Hence, there are positive
integers p and q such that

c_1 = p/q.

Since there are no rational numbers whose square is 2, we must have either
c_1^2 < 2 or c_1^2 > 2.
Define the positive rational number c_2 by

c_2 = (2p + 2q)/(p + 2q).

Notice that

c_1 − c_2 = [p(p + 2q) − q(2p + 2q)] / [q(p + 2q)] = (p^2 − 2q^2) / [q(p + 2q)],

c_1^2 − 2 = (p^2 − 2q^2) / q^2,

and

c_2^2 − 2 = [4p^2 + 8pq + 4q^2 − 2(p^2 + 4pq + 4q^2)] / (p + 2q)^2 = 2(p^2 − 2q^2) / (p + 2q)^2.

Case 1: c_1^2 < 2.
In this case, p^2 < 2q^2. It follows that c_1 < c_2 and c_2^2 < 2. But then c_1
and c_2 are both in A, and c_2 is an element in A that is larger than c_1, which
contradicts the fact that c_1 is an upper bound of A. Hence, we cannot have c_1^2 < 2.

Case 2: c_1^2 > 2.
In this case, p^2 > 2q^2. It follows that c_1 > c_2 and c_2^2 > 2. Since c_2^2 > 2, we
find that for any x ∈ A,
x^2 < 2 < c_2^2.
Thus,
−c_2 < x < c_2.
In particular, c_2 is also an upper bound of A. Namely, c_2 is in U_A. But then
c_1 and c_2 are both in U_A and c_1 > c_2. This contradicts the fact that c_1 is the smallest
element in U_A. Hence, we cannot have c_1^2 > 2.
Since both Case 1 and Case 2 lead to contradictions, we conclude that U_A
does not have a smallest element.

In the solution above, the construction of the positive rational number c_2 seems
a bit ad hoc. In fact, we can define c_2 by

c_2 = (mp + 2nq)/(np + mq)

for any positive integers m and n with m^2 > 2n^2. Then the proof still works.
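
The behaviour of this construction can be observed with exact rational arithmetic. The Python sketch below uses the standard `fractions.Fraction` type; the function name `step` and the sample starting values are illustrative assumptions. Starting below √2 the new number is larger and its square is still less than 2; starting above √2 it is smaller and its square is still greater than 2.

```python
from fractions import Fraction

def step(c: Fraction, m: int = 2, n: int = 1) -> Fraction:
    # Maps c = p/q to (m p + 2 n q) / (n p + m q); requires m^2 > 2 n^2.
    if m * m <= 2 * n * n:
        raise ValueError("need m^2 > 2 n^2")
    p, q = c.numerator, c.denominator
    return Fraction(m * p + 2 * n * q, n * p + m * q)

below = Fraction(1)        # 1^2 < 2
above = Fraction(3, 2)     # (3/2)^2 > 2

below_next = step(below)   # uses m = 2, n = 1 as in the solution
above_next = step(above)

print(below_next, below < below_next, below_next ** 2 < 2)
print(above_next, above_next < above, above_next ** 2 > 2)

# Another admissible choice: m = 3, n = 2, since 9 > 8.
print(step(below, 3, 2), step(above, 3, 2))
```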
Now let us see how the completeness axiom is used to guarantee that there is a
real number whose square is 2.

Example 1.7
Use the completeness axiom to show that there is a positive real number c such
that
c^2 = 2.

Solution
Define the set of real numbers S by

S = {x ∈ R | x^2 < 2}.

Example 1.5 asserts that S is a nonempty subset of real numbers that is
bounded above. The completeness axiom asserts that S has a least upper bound
c.
Since 1 is in S, c ≥ 1. We are going to prove that c^2 = 2 using proof by
contradiction. If c^2 ≠ 2, then c^2 < 2 or c^2 > 2.
Case 1: c^2 < 2.
Let d = 2 − c^2. Then 0 < d ≤ 1. Define the number c_1 by

c_1 = c + d/(4c).

Then c_1 > c, and

c_1^2 = c^2 + d/2 + d^2/(16c^2) ≤ c^2 + d/2 + d/16 < c^2 + d = 2.

This implies that c_1 is an element of S that is larger than c, which contradicts
the fact that c is an upper bound of S.
Case 2: c^2 > 2.
Let d = c^2 − 2. Then d > 0. Define the number c_1 by

c_1 = c − d/(2c).

Then c_1 < c, and

c_1^2 = c^2 − d + d^2/(4c^2) > c^2 − d = 2.

Since c_1 = (c^2 + 2)/(2c) is positive, for any x ∈ S, x^2 < 2 < c_1^2 gives x < c_1.
This implies that c_1 is an upper bound of S that is smaller than c, which
contradicts the fact that c is the least upper bound of S.
Since we obtain a contradiction if c^2 ≠ 2, we must have c^2 = 2.
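
The two constructions used in Case 1 and Case 2 can be tried on concrete rational numbers. This is only an illustration with exact arithmetic; the function names and the sample values 1, 7/5 and 3/2 are assumptions made for this sketch, and the actual least upper bound c is of course not rational.

```python
from fractions import Fraction

def improve_from_below(c: Fraction) -> Fraction:
    # Case 1: c >= 1 and c^2 < 2. Returns c_1 = c + d/(4c) with d = 2 - c^2.
    d = 2 - c * c
    return c + d / (4 * c)

def improve_from_above(c: Fraction) -> Fraction:
    # Case 2: c > 0 and c^2 > 2. Returns c_1 = c - d/(2c) with d = c^2 - 2.
    d = c * c - 2
    return c - d / (2 * c)

for c in (Fraction(1), Fraction(7, 5)):
    c1 = improve_from_below(c)
    print(c, "->", c1, c1 > c, c1 * c1 < 2)   # larger, and still in S

c = Fraction(3, 2)
c1 = improve_from_above(c)
print(c, "->", c1, c1 < c, c1 * c1 > 2)       # smaller, and still an upper bound
```

In each case the new number contradicts the assumed role of c, which is exactly how the proof rules out c^2 < 2 and c^2 > 2.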

In fact, the completeness axiom can be used to show that for any positive real
number a, there is a positive real number c such that

c^2 = a.

We denote this number c as √a, called the positive square root of a. The number
b = −√a is another real number such that b^2 = a.
More generally, if n is a positive integer and a is a positive real number, then there
is a positive real number c such that c^n = a. We denote this number c by

c = ⁿ√a,

called the positive n-th root of a.


Using the interplay between a set and its negative, we can define the greatest
lower bound of a set that is bounded below.

Definition 1.27 Greatest Lower Bound


Let S be a nonempty subset of real numbers that is bounded below, and
let L_S be the set of lower bounds of S. Then L_S is a nonempty set that is
bounded above. If L_S has a largest element ℓ, we say that ℓ is the greatest
lower bound or infimum of S, and denote it by

ℓ = inf S.

From the completeness axiom, we have the following.

Theorem 1.13
If S is a nonempty subset of real numbers that is bounded below, then S
has a greatest lower bound.

For a nonempty set S that is bounded, it has a least upper bound sup S and a
greatest lower bound inf S. The following is quite obvious.

Proposition 1.14
If S is a bounded nonempty subset of real numbers, it has a least upper
bound sup S and a greatest lower bound inf S. Moreover,

inf S ≤ sup S,

and inf S = sup S if and only if S contains exactly one element.

Let us emphasize again the characterization of the least upper bound and
greatest lower bound of a set.

Characterization of Supremum and Infimum


Let S be a nonempty subset of real numbers, and let a be a real number.

1. a = sup S if and only if the following two conditions are satisfied.

(i) For all x ∈ S, x ≤ a.


(ii) If b is a real number such that x ≤ b for all x ∈ S, then a ≤ b.

2. a = inf S if and only if the following two conditions are satisfied.

(i) For all x ∈ S, x ≥ a.


(ii) If b is a real number such that x ≥ b for all x ∈ S, then a ≥ b.

Example 1.8
For each of the following sets of real numbers, determine whether it has a least upper bound, and whether it has a greatest lower bound.

(a) $A = \{x \in \mathbb{R} \mid x^3 < 2\}$

(b) $B = \{x \in \mathbb{R} \mid x^2 < 10\}$

Solution
(a) The set $A$ is bounded above, since if $x \in A$, then $x^3 < 2 < 2^3$, and so $x < 2$. The set $A$ is not bounded below since it contains all negative numbers. Hence, $A$ has a least upper bound, but it does not have a greatest lower bound.

(b) If $x^2 < 10$, then $x^2 < 16$, and so $-4 < x < 4$. This shows that $B$ is bounded. Hence, $B$ has a least upper bound and a greatest lower bound.

Finally, we want to highlight again Proposition 1.12 together with its lower
bound versus infimum counterpart.

Existence of Maximum and Minimum


Let S be a nonempty subset of real numbers.

1. S has a maximum if and only if S is bounded above and sup S is in S.

2. S has a minimum if and only if S is bounded below and inf S is in S.



Exercises 1.3
Question 1
For each of the following sets of real numbers, find its least upper bound,
greatest lower bound, maximum, and minimum if any of these exists. If
any of these does not exist, explain why.

(a) A = (−∞, 20)

(b) B = [−3, ∞)

(c) C = [−10, −2) ∪ (1, 12]

(d) D = [−2, 5] ∩ (−1, 7]

Question 2
Use the completeness axiom to show that there is a positive real number $c$ such that

\[ c^2 = 5. \]

Question 3
For each of the following sets of real numbers, determine whether it has a least upper bound, and whether it has a greatest lower bound.

(a) $A = \{x \in \mathbb{R} \mid x^3 > 10\}$

(b) $B = \{x \in \mathbb{R} \mid x^2 < 2020\}$



1.4 Distributions of Numbers

In this section, we consider additional properties of the sets of integers, rational numbers and real numbers.
We start with a proposition about the distribution of integers.

Proposition 1.15

1. If n is an integer, there is no integer in the interval (n, n + 1).

2. For any real number c, there is exactly one integer in the interval [c, c +
1), and there is exactly one integer in the interval (c, c + 1].

These statements are quite obvious. For any real number c, the integer in the
interval [c, c + 1) is ⌈c⌉, called the ceiling of c. It is the smallest integer larger than
or equal to c. For example ⌈−2.5⌉ = −2, ⌈−3⌉ = −3. The integer in the interval
(c, c + 1] is ⌊c⌋ + 1, where ⌊c⌋ is the floor of c. It is the largest integer that is less
than or equal to c. For example, ⌊−2.5⌋ = −3, ⌊−3⌋ = −3.
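
The two integers described above can be computed with the standard `math.ceil` and `math.floor` functions. This is a small sketch for illustration; the helper names `integer_in_closed_open` and `integer_in_open_closed` are hypothetical.

```python
import math


def integer_in_closed_open(c):
    """Return the unique integer in [c, c + 1), namely the ceiling of c."""
    return math.ceil(c)


def integer_in_open_closed(c):
    """Return the unique integer in (c, c + 1], namely floor(c) + 1."""
    return math.floor(c) + 1


for c in (-3, -2.5, 0, 1.25):
    m = integer_in_closed_open(c)
    k = integer_in_open_closed(c)
    # m lies in [c, c + 1) and k lies in (c, c + 1].
    assert c <= m < c + 1
    assert c < k <= c + 1
    print(c, m, k)
```

The two integers coincide unless $c$ is itself an integer, in which case the first is $c$ and the second is $c + 1$.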
In Section 1.3, we have seen that a nonempty subset of real numbers that is bounded above does not necessarily have a maximum. Example 1.6 shows that a nonempty subset of rational numbers that is bounded above also does not necessarily have a maximum. However, this cannot happen for nonempty subsets of integers.

Proposition 1.16
Let S be a nonempty subset of integers.

1. If S is bounded above, it has a maximum.

2. If S is bounded below, it has a minimum.

The two statements are equivalent, and the second statement is a generalization of the well-ordering principle for the set of positive integers. It can be proved using mathematical induction.
Next we discuss another important property called the Archimedean property.
First let us show that the set of positive integers Z+ is not bounded above.

Theorem 1.17
The set of positive integers Z+ is not bounded above.

Proof
Assume to the contrary that the set of positive integers $\mathbb{Z}^+$ is bounded above. By the completeness axiom, it has a least upper bound $u$. Since $u - 1 < u$, $u - 1$ is not an upper bound of $\mathbb{Z}^+$. Hence, there is a positive integer $n$ such that

\[ n > u - 1. \]

It follows that

\[ n + 1 > u. \]

Since $n + 1$ is also a positive integer, this says that there is an element of $\mathbb{Z}^+$ that is larger than the least upper bound of $\mathbb{Z}^+$. This contradicts the definition of least upper bound. Hence, $\mathbb{Z}^+$ cannot be bounded above.

The proof uses the key fact that any number that is smaller than the least upper
bound of a set is not an upper bound of the set. This is a standard technique in
proofs.

Theorem 1.18 The Archimedean Property

1. For any positive number M , there is a positive integer n such that n >
M.

2. For any positive number ε, there is a positive integer n such that 1/n <
ε.

These two statements are equivalent, and the first statement is equivalent to
the fact that the set of positive integers is not bounded above.
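
The equivalence of the two statements can be seen by passing to reciprocals. The short derivation below is a reader's sketch rather than an argument taken from the text.

```latex
\begin{align*}
(1) \Rightarrow (2):&\quad \text{given } \varepsilon > 0, \text{ take } M = \tfrac{1}{\varepsilon} > 0; \text{ then } n > \tfrac{1}{\varepsilon} \text{ gives } \tfrac{1}{n} < \varepsilon. \\
(2) \Rightarrow (1):&\quad \text{given } M > 0, \text{ take } \varepsilon = \tfrac{1}{M} > 0; \text{ then } \tfrac{1}{n} < \tfrac{1}{M} \text{ gives } n > M.
\end{align*}
```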
In the following, we consider another property called denseness.

Definition 1.28 Denseness


Let S be a subset of real numbers. We say that S is dense in R if every
open interval (a, b) contains an element of S.

A key fact we want to prove is that the set of rational numbers Q is dense in
the set of real numbers.

Theorem 1.19 Denseness of the Set of Rational Numbers


The set of rational numbers Q is dense in the set of real numbers R.

Proof
Let (a, b) be an open interval. Then ε = b − a > 0. By the Archimedean
property, there is a positive integer n such that 1/n < ε. Hence,

nb − na = nε > 1,

and so
na + 1 < nb.
Consider the interval (na, na + 1]. There is an integer m that lies in this
interval. In other words,

na < m ≤ na + 1 < nb.

Dividing by $n$, we have

\[ a < \frac{m}{n} < b. \]
This proves that the open interval (a, b) contains the rational number m/n,
and thus completes the proof that the set of rational numbers is dense in the
set of real numbers.

Recall that a set $A$ is said to be countably infinite if there is a bijection $f : \mathbb{Z}^+ \to A$. A set that is either finite or countably infinite is said to be countable.
We assume that students have seen the proofs of the following.

Proposition 1.20
The set of integers Z and the set of rational numbers Q are countable, while
the set of real numbers R is not countable.

Since the union of countable sets is countable, this proposition implies that the
set of irrational numbers is uncountable. Therefore, there are far more irrational

numbers than rational numbers. Hence, it should not be surprising that the set of
irrational numbers is also dense in the set of real numbers. To prove this, let us
recall the following facts.

Rational Numbers and Irrational Numbers


1. If a and b are rational numbers, then a + b and ab are rational numbers.

2. If a is a nonzero rational number, b is an irrational number, then ab is an


irrational number.

Theorem 1.21 Denseness of the Set of Irrational Numbers


The set of irrational numbers R \ Q is dense in the set of real numbers R.

Proof
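Let $(a, b)$ be an open interval. Define

\[ c = \frac{a}{\sqrt{2}}, \qquad d = \frac{b}{\sqrt{2}}. \]

Then $c < d$. By the denseness of rational numbers, there is a rational number $u$ that lies in the interval $(c, d)$. We may take $u \neq 0$: if $c < 0 < d$, we apply denseness to the interval $(0, d)$ instead. Hence,

\[ \frac{a}{\sqrt{2}} = c < u < d = \frac{b}{\sqrt{2}}. \]

Let $v = \sqrt{2}\,u$. Since $u$ is a nonzero rational number and $\sqrt{2}$ is irrational, $v$ is an irrational number satisfying

\[ a < v < b. \]

This proves that the open interval $(a, b)$ contains the irrational number $v$, and thus completes the proof that the set of irrational numbers is dense in the set of real numbers.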

Example 1.9
Is the set of integers Z dense in R? Justify your answer.

Solution
(0, 1) is an open interval that does not contain any integers. Hence, the set
of integers is not dense in R.

Exercises 1.4
Question 1
Let S = Q \ Z. Is the set S dense in R? Justify your answer.

1.5 The Convergence of Sequences

Infinite sequences play important roles in analysis. We will consider infinite sequences that are indexed by the set of positive integers

\[ a_1, a_2, \ldots, a_n, \ldots \]

This can be considered as a function $f : \mathbb{Z}^+ \to \mathbb{R}$, where $a_n = f(n)$. The general term in the sequence is denoted by $a_n$. On some occasions, we may also want to consider sequences that start with $a_0$.
In the sequel, when we say a sequence, we always mean an infinite sequence that is indexed by the set of positive integers, unless otherwise specified. A sequence can be denoted by $\{a_n\}$ or $\{a_n\}_{n=1}^{\infty}$. This should not be confused with the set $\{a_n \mid n \in \mathbb{Z}^+\}$ that contains all terms in the sequence.
There are various ways to specify a sequence. One of the ways is to give an explicit formula for the general term $a_n$. For example, $\{1/n\}$ is the sequence with $a_n = 1/n$. More precisely, it is the sequence with first five terms given by

\[ 1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \ldots. \]
A sequence can also be defined recursively, such as the following example.

Example 1.10

Let $\{a_n\}$ be the sequence defined by $a_1 = 2$, and for $n \geq 2$,

\[ a_n = a_{n-1} + 3. \]

Find the first 5 terms of the sequence.

Solution
We compute recursively.
\begin{align*}
a_1 &= 2 \\
a_2 &= a_1 + 3 = 5 \\
a_3 &= a_2 + 3 = 8 \\
a_4 &= a_3 + 3 = 11 \\
a_5 &= a_4 + 3 = 14
\end{align*}

The sequence {an } in Example 1.10 is an example of an arithmetic sequence.


One can prove by induction that

\[ a_n = 3n - 1. \]
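
The recursive definition and the closed formula can be compared numerically. This is a minimal sketch; the function name `arithmetic_terms` is hypothetical, and a finite check like this illustrates the formula but does not replace the induction proof.

```python
def arithmetic_terms(count):
    """Return the first `count` terms (count >= 1) of a_1 = 2, a_n = a_{n-1} + 3."""
    terms = [2]
    while len(terms) < count:
        terms.append(terms[-1] + 3)
    return terms


first_five = arithmetic_terms(5)
print(first_five)

# Compare the recursion with the closed formula a_n = 3n - 1 for n = 1, ..., 100.
matches = all(a == 3 * n - 1 for n, a in enumerate(arithmetic_terms(100), start=1))
print(matches)
```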

Example 1.11

Let $\{s_n\}$ be the sequence defined by $s_1 = \frac{1}{2}$, and for $n \geq 2$,

\[ s_n = s_{n-1} + \frac{1}{2^n}. \]

Find the first 5 terms of the sequence.

Solution
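We compute recursively.
\begin{align*}
s_1 &= \frac{1}{2} \\
s_2 &= s_1 + \frac{1}{2^2} = \frac{3}{4} \\
s_3 &= s_2 + \frac{1}{2^3} = \frac{7}{8} \\
s_4 &= s_3 + \frac{1}{2^4} = \frac{15}{16} \\
s_5 &= s_4 + \frac{1}{2^5} = \frac{31}{32}
\end{align*}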

The sequence $\{s_n\}$ in Example 1.11 is the sequence of partial sums of the geometric sequence $\left\{\frac{1}{2^n}\right\}$. One can prove by induction that

\[ s_n = 1 - \frac{1}{2^n}. \]
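
The partial sums can be reproduced exactly with Python's `fractions` module. This is a sketch for illustration only; the function name `geometric_partial_sums` is hypothetical.

```python
from fractions import Fraction


def geometric_partial_sums(count):
    """Return s_1, ..., s_count, where s_n = 1/2 + 1/2^2 + ... + 1/2^n."""
    total = Fraction(0)
    sums = []
    for n in range(1, count + 1):
        total += Fraction(1, 2 ** n)
        sums.append(total)
    return sums


sums = geometric_partial_sums(5)
print([str(s) for s in sums])

# Each partial sum agrees with the closed formula 1 - 1/2^n.
agrees = all(s == 1 - Fraction(1, 2 ** n) for n, s in enumerate(sums, start=1))
print(agrees)
```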

Example 1.12

Let $\{s_n\}$ be the sequence defined by

\[ s_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}. \]

This sequence can also be defined recursively by $s_1 = 1$, and for $n \geq 2$,

\[ s_n = s_{n-1} + \frac{1}{n}. \]

For the sequence {sn } defined in Example 1.12, the general term sn cannot be
expressed as an explicit elementary function of n.

Example 1.13

Let $\{a_n\}$ be the sequence defined by $a_1 = 2$, and for $n \geq 1$,

\[ a_{n+1} = \begin{cases} a_n + \dfrac{1}{n} & \text{if } a_n < 3, \\[1ex] a_n - \dfrac{1}{n} & \text{if } a_n \geq 3. \end{cases} \]

Find the first six terms of the sequence.

Solution
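We compute recursively.
\begin{align*}
a_1 &= 2 < 3 \\
a_2 &= a_1 + 1 = 3 \geq 3 \\
a_3 &= a_2 - \frac{1}{2} = \frac{5}{2} < 3 \\
a_4 &= a_3 + \frac{1}{3} = \frac{17}{6} < 3 \\
a_5 &= a_4 + \frac{1}{4} = \frac{37}{12} \geq 3 \\
a_6 &= a_5 - \frac{1}{5} = \frac{173}{60}
\end{align*}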

From the examples above, we observe that some sequences are monotone.

Definition 1.29 Increasing and Decreasing Sequences

1. We say that a sequence {an } is increasing if

an ≤ an+1 for all n ∈ Z+ .

2. We say that a sequence is decreasing if

an ≥ an+1 for all n ∈ Z+ .

3. We say that a sequence {an } is monotone if it is an increasing sequence


or it is a decreasing sequence.

Example 1.14

1. The sequence $\{a_n\}$ defined in Example 1.10 is increasing.

2. The sequence $\left\{\frac{1}{n}\right\}$ is decreasing.

3. The sequence $\{a_n\}$ defined in Example 1.13 is neither increasing nor decreasing.

In analysis, we are often led to consider the behavior of a sequence $\{a_n\}$ when $n$ gets larger and larger. We are interested to know whether the sequence approaches a fixed value. This leads to the idea of convergence.

Definition 1.30 Convergence of Sequences

A sequence {an } is said to converge to the number a if for every positive


number ε, there is a positive integer N such that for all n ≥ N ,

|an − a| < ε.

Here the positive number $\varepsilon$ is used to measure the distance from the term $a_n$ to the number $a$. Since $\varepsilon$ can be any positive number, the distance can be made as small as we like.
One question that is natural to ask is whether a sequence $\{a_n\}$ can converge to two different numbers. This is impossible.

Figure 1.3: $|a_n - a| < \varepsilon$.

Theorem 1.22
A sequence cannot converge to two different numbers.

Proof
This is proved by contradiction. Assume that there is a sequence {an }
which converges to two different numbers b and c. Let

\[ \varepsilon = \frac{|b - c|}{2}. \]
Since b and c are distinct, |b − c| > 0 and so ε > 0. By definition of
convergence, there is a positive integer N1 such that for all n ≥ N1 ,

|an − b| < ε.

Similarly, there is a positive integer N2 such that for all n ≥ N2 ,

|an − c| < ε.

If N = max{N1 , N2 }, then N ≥ N1 and N ≥ N2 . It follows that

|b − c| = |(aN − c) − (aN − b)| ≤ |aN − c| + |aN − b| < ε + ε = |b − c|.

This gives |b − c| < |b − c|, which is a contradiction. Hence, we conclude


that a sequence cannot converge to two different numbers.

Figure 1.4: A sequence {an } cannot converge to two different numbers b and c.

Limit of a Sequence
If a sequence {an } converges to a number a, we say that the sequence is
convergent. Otherwise, we say that it is divergent. Theorem 1.22 says
that for a convergent sequence {an }, the number a that it converges to is
unique. We call this unique number a the limit of the convergent sequence
{an }, and express the convergence of {an } to a as

\[ \lim_{n \to \infty} a_n = a. \]

Using logical expression,

\[ \lim_{n \to \infty} a_n = a \iff \forall \varepsilon > 0,\ \exists N \in \mathbb{Z}^+,\ \forall n \geq N,\ |a_n - a| < \varepsilon. \]

Let us look at a simple example of a constant sequence.

Example 1.15

Let c be a real number and let {an } be the sequence with an = c for all
n ∈ Z+ . Then for any ε > 0, we take N = 1. For all n ≥ N = 1, we have

|an − c| = |c − c| = 0 < ε,

which shows that the limit of the constant sequence $\{a_n\}$ is $c$. Namely,

\[ \lim_{n \to \infty} c = c. \]

Another simple example is the sequence $\{a_n\}$ with $a_n = \frac{1}{n}$.

Example 1.16
Use the definition of convergence to show that

\[ \lim_{n \to \infty} \frac{1}{n} = 0. \]

Solution
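Given $\varepsilon > 0$, the Archimedean property asserts that there is a positive integer $N$ such that $1/N < \varepsilon$. If $n \geq N$, we have

\[ 0 < \frac{1}{n} \leq \frac{1}{N} < \varepsilon. \]

This gives

\[ \left| \frac{1}{n} - 0 \right| < \varepsilon \quad \text{for all } n \geq N. \]

By definition, we conclude that

\[ \lim_{n \to \infty} \frac{1}{n} = 0. \]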
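
The choice of $N$ in this argument can be made explicit. The sketch below picks $N = \lfloor 1/\varepsilon \rfloor + 1$, which is one valid choice among many, and spot-checks the inequality on a finite range; the function name `threshold_for_reciprocal` is hypothetical, and a finite check only illustrates the proof.

```python
from fractions import Fraction
import math


def threshold_for_reciprocal(eps):
    """Return a positive integer N with 1/N < eps, for a positive rational eps."""
    eps = Fraction(eps)
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.floor(1 / eps) + 1


eps = Fraction(1, 250)
N = threshold_for_reciprocal(eps)

# Spot-check |1/n - 0| < eps for n = N, ..., N + 999.
holds = all(abs(Fraction(1, n) - 0) < eps for n in range(N, N + 1000))
print(N, holds)
```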

Let f : Z+ → Z+ be a function satisfying

f (k) < f (k + 1) for all k ∈ Z+ .

Then f (Z+ ) is an infinite set of positive integers. If we let nk = f (k), then

n1 < n2 < n3 < · · · .

Namely, n1 , n2 , n3 , . . . is a strictly increasing sequence of positive integers.

Definition 1.31 Subsequence

Let $\{a_n\}$ be a sequence. A subsequence of $\{a_n\}$ is a sequence $\{a_{n_k}\}$ indexed by $k \in \mathbb{Z}^+$, where $n_k = f(k)$ is defined by a function $f : \mathbb{Z}^+ \to \mathbb{Z}^+$ satisfying

\[ f(k) < f(k + 1) \quad \text{for all } k \in \mathbb{Z}^+. \]

Example 1.17

The sequence $\{1/(2n - 1)\}$ with first three terms given by

\[ 1, \frac{1}{3}, \frac{1}{5}, \]

is a subsequence of the sequence $\{1/n\}$ whose first five terms are

\[ 1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}. \]

If a sequence $\{a_n\}$ converges to $a$, what can we say about its subsequences? It is natural to expect that every subsequence of $\{a_n\}$ also converges to $a$.

Theorem 1.23 Subsequence of a Convergent Sequence

If the sequence $\{a_n\}$ converges to $a$, then every subsequence of $\{a_n\}$ also converges to $a$.

Proof
Let $\{a_{n_k}\}$ be a subsequence of $\{a_n\}$. Notice that for all $k \in \mathbb{Z}^+$,

\[ n_k \geq k. \]

Given $\varepsilon > 0$, there is a positive integer $N$ such that for all $n \geq N$,

\[ |a_n - a| < \varepsilon. \]

Take $K = N$. Then for all $k \geq K$, $n_k \geq n_K = n_N \geq N$, and thus,

\[ |a_{n_k} - a| < \varepsilon. \]

This proves that $\{a_{n_k}\}$ indeed converges to $a$.
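
The inequality $n_k \geq k$ used at the start of the proof is stated without justification. One way to verify it, offered here as a reader's sketch, is by induction on $k$, using only that $n_1, n_2, n_3, \ldots$ are positive integers with $n_k < n_{k+1}$.

```latex
\begin{align*}
k = 1:&\quad n_1 \in \mathbb{Z}^+ \implies n_1 \geq 1. \\
k \to k + 1:&\quad n_{k+1} > n_k \geq k \text{ and } n_{k+1} \in \mathbb{Z} \implies n_{k+1} \geq k + 1.
\end{align*}
```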



Example 1.18
Find the limit

\[ \lim_{n \to \infty} \frac{1}{2^n} \]

if it exists.

Solution
   
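Notice that $\left\{\frac{1}{2^n}\right\}$ is a subsequence of $\left\{\frac{1}{n}\right\}$ with $n_k = 2^k$. By Example 1.16,

\[ \lim_{n \to \infty} \frac{1}{n} = 0. \]

We conclude from Theorem 1.23 that

\[ \lim_{n \to \infty} \frac{1}{2^n} = 0. \]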

Example 1.19

Show that the sequence $\{(-1)^n\}$ is divergent.

Solution
Let $a_n = (-1)^n$. Then for any positive integer $n$, $a_{2n-1} = -1$ and $a_{2n} = 1$. The subsequence $\{a_{2n-1}\}$ of $\{a_n\}$ converges to $-1$, while the subsequence $\{a_{2n}\}$ of $\{a_n\}$ converges to $1$. Since there are two subsequences of $\{a_n\}$ that converge to two different limits, by Theorem 1.23, the sequence $\{a_n\}$ is not convergent.

For the sequence $\{a_n\}$ defined in Example 1.10, we can see that the set $\{a_n \mid n \in \mathbb{Z}^+\}$ is not bounded above. Therefore, we would expect that the sequence does not converge to any number.
For simplicity, we say that a sequence $\{a_n\}$ is bounded above/bounded below/bounded if the set $\{a_n \mid n \in \mathbb{Z}^+\}$ is bounded above/bounded below/bounded. If the sequence $\{a_n\}$ is bounded above, we denote the supremum of the set $\{a_n \mid n \in \mathbb{Z}^+\}$ as $\sup\{a_n\}$. If the sequence $\{a_n\}$ is bounded below, we denote the infimum of the set $\{a_n \mid n \in \mathbb{Z}^+\}$ as $\inf\{a_n\}$.
We have the following theorem which guarantees that a convergent sequence must be bounded.

Theorem 1.24 Boundedness of Convergent Sequence

If a sequence {an } is convergent, then it is bounded. Equivalently, if a


sequence {an } is not bounded, then it is not convergent.

Proof
Let {an } be a convergent sequence that converges to the limit a. By
definition of convergence with ε = 1, there is a positive integer N such
that for all n ≥ N ,
|an − a| < 1.
This implies that

|an | ≤ |an − a| + |a| < 1 + |a| for all n ≥ N.

Define
M = max {|a1 |, |a2 |, . . . , |aN −1 |, |a| + 1} .
Then
|an | ≤ M for all n ∈ Z+ .
This shows that the sequence {an } is bounded.
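
The bound $M$ constructed in the proof can be computed for a concrete sequence. The example below is an assumption made for illustration only: it uses the hypothetical sequence $a_n = 3 + 5(-1)^n/n$, which converges to 3 and satisfies $|a_n - 3| = 5/n < 1$ for all $n \geq 6$, so $N = 6$ works for $\varepsilon = 1$.

```python
from fractions import Fraction


def term(n):
    """Hypothetical example sequence a_n = 3 + 5 * (-1)^n / n, with limit 3."""
    return 3 + Fraction((-1) ** n * 5, n)


limit = 3
N = 6  # for n >= 6 we have |a_n - 3| = 5/n < 1

# M = max{|a_1|, ..., |a_{N-1}|, |a| + 1}, as in the proof.
M = max([abs(term(n)) for n in range(1, N)] + [abs(limit) + 1])
print(M)

# Spot-check |a_n| <= M on a finite range of indices.
bounded = all(abs(term(n)) <= M for n in range(1, 2001))
print(bounded)
```

The first few terms are the only ones that can exceed $|a| + 1$, which is why the proof takes the maximum over them separately.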

Example 1.20

By Theorem 1.24, the sequence {an } defined in Example 1.10 is not


convergent.

If the sequence {an } is convergent, and c is a constant, it is natural to expect


that the sequence {can } is also convergent.

Proposition 1.25

If the sequence {an } converges to a, then the sequence {can } converges to


ca.

Proof
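Given $\varepsilon > 0$, the number

\[ \varepsilon_1 = \frac{\varepsilon}{|c| + 1} \]

is also positive. Since $\{a_n\}$ converges to $a$, there is a positive integer $N$ such that for all $n \geq N$,

\[ |a_n - a| < \varepsilon_1 = \frac{\varepsilon}{|c| + 1}. \]

It follows that for all $n \geq N$,

\[ |ca_n - ca| = |c||a_n - a| \leq \frac{|c|}{|c| + 1}\,\varepsilon < \varepsilon. \]

This proves that $\{ca_n\}$ converges to $ca$.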

Example 1.21
By Proposition 1.25, we find that for any constant $c$,

\[ \lim_{n \to \infty} \frac{c}{n} = 0. \]

In the following, we establish a comparison theorem for limits.

Theorem 1.26 Squeeze Theorem

Let {an }, {bn } and {cn } be three sequences. Assume that there is a positive
integer N0 such that for all n ≥ N0 ,

\[ b_n \leq a_n \leq c_n. \]

If both the sequences {bn } and {cn } converge to ℓ, then the sequence {an }
also converges to ℓ.

Proof
For a positive number ε, since the sequence {bn } converges to ℓ, there is a
positive integer N1 such that for all n ≥ N1 ,

|bn − ℓ| < ε.

This implies that for all n ≥ N1 ,

bn − ℓ > −ε.

Similarly, since the sequence {cn } converges to ℓ, there is a positive integer


N2 such that for all n ≥ N2 ,

|cn − ℓ| < ε.

This implies that for all n ≥ N2 ,

cn − ℓ < ε.

Let $N = \max\{N_0, N_1, N_2\}$. If $n \geq N$, then $n \geq N_0$, $n \geq N_1$ and $n \geq N_2$.


Therefore, if n ≥ N ,

an − ℓ ≥ bn − ℓ > −ε,

and
an − ℓ ≤ cn − ℓ < ε.
This proves that for all n ≥ N ,

|an − ℓ| < ε.

Therefore, the sequence {an } converges to ℓ.

When applying the squeeze theorem, we are interested in the limit of the sequence $\{a_n\}$. It is not enough to find two sequences $\{b_n\}$ and $\{c_n\}$ satisfying

\[ b_n \leq a_n \leq c_n \]

for all $n$ greater than or equal to a fixed $N_0$. The two sequences $\{b_n\}$ and $\{c_n\}$ must have the same limit.

Example 1.22

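For the sequence $\{a_n\}$ with

\[ a_n = \frac{(-1)^n}{n}, \]

we have

\[ -\frac{1}{n} \leq a_n \leq \frac{1}{n}. \]

Since

\[ \lim_{n \to \infty} \frac{1}{n} = 0, \]

we have

\[ \lim_{n \to \infty} \left( -\frac{1}{n} \right) = 0. \]

By the squeeze theorem,

\[ \lim_{n \to \infty} \frac{(-1)^n}{n} = 0. \]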

More generally, we have the following.

Theorem 1.27
The sequence {an } converges to 0 if and only if the sequence {|an |}
converges to 0.

A word of caution. If the sequence {|an |} is convergent, the sequence {an } is


not necessarily convergent. An example is the sequence {an } with an = (−1)n .
Theorem 1.27 asserts that if {|an |} converges to 0, then {an } is convergent, and
it converges to 0. Nevertheless, if the sequence {an } is convergent, the sequence
{|an |} is necessarily convergent (see Question 1.5.4).

Proof of Theorem 1.27


First assume that the sequence {an } converges to 0. Given ε > 0, there is a
positive integer N such that for all n ≥ N ,

|an − 0| < ε.
Notice that

\[ \bigl| |a_n| - 0 \bigr| = |a_n| = |a_n - 0|. \]

Hence, for all $n \geq N$,

\[ \bigl| |a_n| - 0 \bigr| < \varepsilon. \]
This proves that the sequence {|an |} converges to 0.
Next, we assume that the sequence {|an |} converges to 0. Then the
sequence {−|an |} also converges to 0. Since

−|an | ≤ an ≤ |an |,

the squeeze theorem implies that the sequence $\{a_n\}$ converges to 0.

In the following, we discuss two useful results that can be deduced from
specific information about a convergent sequence. They will be useful in the
proofs of other theorems that we are going to discuss.

Lemma 1.28 Sequence with Positive Limit

If $\{a_n\}$ is a sequence that converges to a positive number $a$, there is a positive integer $N$ such that $a_n > a/2 > 0$ for all $n \geq N$.

One can easily formulate a counterpart of this lemma for a sequence with
negative limit.

Proof
Take $\varepsilon = a/2$. Then $\varepsilon > 0$. Hence, there is a positive integer $N$ so that for all $n \geq N$,

\[ |a_n - a| < \frac{a}{2}. \]

This implies that for all $n \geq N$,

\[ a_n - a > -\frac{a}{2}. \]

Thus, for all $n \geq N$,

\[ a_n > \frac{a}{2} > 0. \]

Lemma 1.29

1. Suppose that $\{a_n\}$ is a sequence that is bounded above by $c$. If $\{a_n\}$ converges to $a$, then $a \leq c$.

2. Suppose that $\{a_n\}$ is a sequence that is bounded below by $b$. If $\{a_n\}$ converges to $a$, then $a \geq b$.

3. Suppose that $\{a_n\}$ is a sequence satisfying

\[ b \leq a_n \leq c \quad \text{for all } n \in \mathbb{Z}^+. \]

If $\{a_n\}$ converges to $a$, then $b \leq a \leq c$.

It suffices to prove the first statement. The second statement follows by considering the negative of the sequence. The third statement follows by combining the results of the first two statements.

Proof
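Given that

\[ \lim_{n \to \infty} a_n = a \quad \text{and} \quad a_n \leq c \ \text{for all } n \in \mathbb{Z}^+, \]

we want to show that $a \leq c$. Assume to the contrary that $a > c$. Take $\varepsilon = a - c$. Then $\varepsilon > 0$. By definition of convergence, there is a positive integer $N$ such that for all $n \geq N$,

\[ |a_n - a| < \varepsilon. \]

This implies that

\[ a_n - a > -\varepsilon = c - a \quad \text{when } n \geq N. \]

Hence,

\[ a_n > c \quad \text{when } n \geq N. \]

This contradicts the assumption that $a_n \leq c$ for all $n \in \mathbb{Z}^+$. Therefore, we must have $a \leq c$.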

In Proposition 1.25, we have seen what happens when a convergent sequence is multiplied by a constant. In the following, we inspect the behaviour of limits with respect to sums, products and quotients. We start with sums.

Theorem 1.30 Sums of Convergent Sequences

If the sequences {an } and {bn } converge to a and b respectively, the


sequence {an + bn } converges to a + b.

Linearity of Limits of Sequences


Combining Proposition 1.25 and Theorem 1.30, we obtain the following. If

\[ \lim_{n \to \infty} a_n = a, \qquad \lim_{n \to \infty} b_n = b, \]

then for any constants $\alpha$ and $\beta$,

\[ \lim_{n \to \infty} (\alpha a_n + \beta b_n) = \alpha a + \beta b. \]

Proof of Theorem 1.30


Given a positive number $\varepsilon$, the number $\varepsilon/2$ is also positive. Since the sequence $\{a_n\}$ converges to $a$, there is a positive integer $N_1$ such that for all $n \geq N_1$,

\[ |a_n - a| < \frac{\varepsilon}{2}. \]

Similarly, there is a positive integer $N_2$ such that for all $n \geq N_2$,

\[ |b_n - b| < \frac{\varepsilon}{2}. \]

Take $N = \max\{N_1, N_2\}$. Then $N$ is a positive integer and $N \geq N_1$, $N \geq N_2$. If $n \geq N$, the triangle inequality implies that

\begin{align*}
|(a_n + b_n) - (a + b)| &= |(a_n - a) + (b_n - b)| \\
&\leq |a_n - a| + |b_n - b| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} \\
&= \varepsilon.
\end{align*}

This proves that the sequence $\{a_n + b_n\}$ converges to $a + b$.



Now we consider products.

Theorem 1.31 Products of Convergent Sequences

If the sequences {an } and {bn } converge to a and b respectively, the


sequence {an bn } converges to ab.

Notice that Proposition 1.25 is actually a special case of this theorem when
{bn } is a constant sequence.

Proof of Theorem 1.31


Since {an } and {bn } are convergent sequences, Theorem 1.24 says that
each of them is bounded. We can choose a common positive number M so
that for all n ∈ Z+ ,

|an | ≤ M, |bn | ≤ M.

By Lemma 1.29,
|a| ≤ M, |b| ≤ M.
Now we want to show that the difference of $a_n b_n$ and $ab$ approaches zero when $n$ gets large. This should be achieved by the fact that $|a_n - a|$ and $|b_n - b|$ both approach 0 when $n$ gets large. To compare $a_n b_n - ab$ to $a_n - a$ and $b_n - b$, we do some manipulations as follows.

\[ a_n b_n - ab = (a_n - a) b_n + a (b_n - b). \]

It follows from the triangle inequality that

\[ |a_n b_n - ab| \leq |a_n - a||b_n| + |a||b_n - b| \leq M \left( |a_n - a| + |b_n - b| \right). \tag{1.1} \]

Now we can show that $\{a_n b_n\}$ converges to $ab$. Given $\varepsilon > 0$, since $\varepsilon/(2M)$ is also positive, there exists a positive integer $N_1$ such that

\[ |a_n - a| < \frac{\varepsilon}{2M} \quad \text{when } n \geq N_1. \]

Similarly, there exists a positive integer $N_2$ such that

\[ |b_n - b| < \frac{\varepsilon}{2M} \quad \text{when } n \geq N_2. \]

Take $N = \max\{N_1, N_2\}$. When $n \geq N$, $n \geq N_1$ and $n \geq N_2$. It follows from (1.1) that

\[ |a_n b_n - ab| < M \left( \frac{\varepsilon}{2M} + \frac{\varepsilon}{2M} \right) = \varepsilon. \]

This completes the proof that the sequence $\{a_n b_n\}$ converges to $ab$.

For the quotient of two sequences, we notice that if $y \neq 0$,

\[ \frac{x}{y} = x \times \frac{1}{y}, \]

which says that the quotient of $x$ by $y$ is the product of $x$ with the reciprocal of $y$. Hence, it is enough to consider the reciprocal of a nonzero sequence.

Theorem 1.32 Reciprocal of a Convergent Nonzero Sequence

If {an } is a nonzero sequence that converges to a nonzero limit a, the


reciprocal sequence {1/an } converges to 1/a.

Proof
Without loss of generality, assume that $a > 0$. Lemma 1.28 implies that there is a positive integer $N_1$ such that

\[ a_n > \frac{a}{2} > 0 \quad \text{when } n \geq N_1. \]

Given $\varepsilon > 0$, $a^2 \varepsilon / 2$ is also positive. By definition of convergence, there is a positive integer $N_2$ such that when $n \geq N_2$,

\[ |a_n - a| < \frac{a^2 \varepsilon}{2}. \]

Take $N = \max\{N_1, N_2\}$. If $n \geq N$,

\[ \left| \frac{1}{a_n} - \frac{1}{a} \right| = \frac{|a_n - a|}{|a_n||a|} < \frac{2}{a^2} \times \frac{a^2 \varepsilon}{2} = \varepsilon. \]

This proves that the sequence $\{1/a_n\}$ converges to $1/a$.



Remark 1.1 Reciprocal of a Sequence That Converges to 0


In the statement of Theorem 1.32, it is crucial that $a \neq 0$. To see this, consider the sequence $\{a_n\}$ with $a_n = 1/n$. It converges to $a = 0$. The sequence $\{1/a_n\}$ is the sequence of natural numbers $\{n\}$, which does not converge. In fact, whenever a nonzero sequence $\{a_n\}$ converges to 0, the sequence $\{1/a_n\}$ is not bounded. Hence, the sequence $\{1/a_n\}$ does not converge.

Corollary 1.33 Quotients of Convergent Sequences

Suppose that $\{a_n\}$ is a sequence that converges to $a$, and $\{b_n\}$ is a nonzero sequence that converges to $b$. If $b \neq 0$, the sequence $\{a_n/b_n\}$ converges to $a/b$.

The results about sums, products and quotients of convergent sequences can
be summarized in the following.

Operations on Convergent Sequences


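Given that

\[ \lim_{n \to \infty} a_n = a \quad \text{and} \quad \lim_{n \to \infty} b_n = b. \]

1. For any constants $\alpha$ and $\beta$, $\lim_{n \to \infty} (\alpha a_n + \beta b_n) = \alpha a + \beta b$.

2. $\lim_{n \to \infty} a_n b_n = ab$.

3. If $b_n \neq 0$ for all $n \in \mathbb{Z}^+$ and $b \neq 0$, $\lim_{n \to \infty} \frac{a_n}{b_n} = \frac{a}{b}$.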

These will be used repeatedly in the future. Let us now look at some examples of how these properties are applied.

Example 1.23
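Let $m$ be a positive integer. The product rule of limits implies that

\[ \lim_{n \to \infty} \frac{1}{n^m} = \underbrace{\lim_{n \to \infty} \frac{1}{n} \times \cdots \times \lim_{n \to \infty} \frac{1}{n}}_{m \text{ terms}} = 0. \]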

Example 1.24
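Determine whether the limit exists. If it exists, find the limit.

(a) $\lim_{n \to \infty} \left( 3 + \frac{(-1)^n}{3n - 2} \right)$

(b) $\lim_{n \to \infty} \frac{2n^2 + 3n + 4}{5 - 7n^2}$

(c) $\lim_{n \to \infty} \frac{n + 1}{n^2 + 1}$

(d) $\lim_{n \to \infty} \frac{n^2 + 1}{n + 1}$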

Solution
(a) Since $\{1/(3n - 2)\}$ is a subsequence of the sequence $\{1/n\}$, it converges to 0. By Theorem 1.27,

\[ \lim_{n \to \infty} \frac{(-1)^n}{3n - 2} = 0. \]

Hence,

\[ \lim_{n \to \infty} \left( 3 + \frac{(-1)^n}{3n - 2} \right) = \lim_{n \to \infty} 3 + \lim_{n \to \infty} \frac{(-1)^n}{3n - 2} = 3 + 0 = 3. \]

(b) The sequence $\{2n^2 + 3n + 4\}$ is not bounded. So it does not have a limit. We cannot apply the quotient rule of limits directly. Instead, we need to do some manipulations. Dividing the numerator and the denominator by $n^2$ and then applying the rules for limits, we have

\[ \lim_{n \to \infty} \frac{2n^2 + 3n + 4}{5 - 7n^2} = \lim_{n \to \infty} \frac{2 + \dfrac{3}{n} + \dfrac{4}{n^2}}{\dfrac{5}{n^2} - 7} = \frac{2 + 0 + 0}{0 - 7} = -\frac{2}{7}. \]

(c) Dividing the numerator and the denominator by $n^2$ and then applying the rules for limits, we have

\[ \lim_{n \to \infty} \frac{n + 1}{n^2 + 1} = \lim_{n \to \infty} \frac{\dfrac{1}{n} + \dfrac{1}{n^2}}{1 + \dfrac{1}{n^2}} = \frac{0 + 0}{1 + 0} = 0. \]

(d) Since the reciprocal of the sequence has limit 0 by part (c), we find that

\[ \lim_{n \to \infty} \frac{n^2 + 1}{n + 1} \]

does not exist.
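
The limits found in parts (b) and (c) can be sanity-checked by evaluating the terms for large $n$. The sketch below uses exact fractions and prints decimal approximations; the function names `b_term` and `c_term` are hypothetical, and numerical evidence of this kind supports but does not prove the limits.

```python
from fractions import Fraction


def b_term(n):
    """Term of the sequence in part (b): (2n^2 + 3n + 4) / (5 - 7n^2)."""
    return Fraction(2 * n * n + 3 * n + 4, 5 - 7 * n * n)


def c_term(n):
    """Term of the sequence in part (c): (n + 1) / (n^2 + 1)."""
    return Fraction(n + 1, n * n + 1)


for n in (10, 1000, 100000):
    # The first column should approach -2/7 and the second should approach 0.
    print(n, float(b_term(n)), float(c_term(n)))
```

The reciprocal of `c_term(n)` is the term in part (d), and it grows without bound as `c_term(n)` shrinks towards 0.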

We have seen in Section 1.3 that the supremum or infimum of a set is not
necessarily an element of the set. The supremum of a set is an element of the set
if and only if the set has a maximum. Analogously, the infimum of a set is an
element of the set if and only if the set has a minimum.
Even though the supremum and infimum of a set might fail to be an element
of the set, they are always limits of sequences in that set.

Lemma 1.34 Supremum and Infimum as Limits


Let $S$ be a nonempty subset of real numbers.

1. If S is bounded above, there is a sequence {un } in S that converges to


u = sup S.

2. If S is bounded below, there is a sequence {ℓn } in S that converges to


ℓ = inf S.

Example 1.25

Consider the set $S = (-\infty, \pi)$. It is bounded above with $\sup S = \pi$. The sequence $\{u_n\}$ with

\[ u_n = \pi - \frac{1}{n} \]

is a sequence in $S$ that converges to $\pi = \sup S$.

To prove Lemma 1.34, it suffices for us to prove the first statement.

Proof of Lemma 1.34


Assume that $S$ is bounded above. Then the completeness axiom asserts that $u = \sup S$ exists. For any positive integer $n$, $u - 1/n$ is smaller than $u$. Hence, $u - 1/n$ is not an upper bound of $S$. This implies that there is an element $u_n$ of $S$ such that

\[ u_n > u - \frac{1}{n}. \]

Since $u_n$ is in $S$ and $u$ is an upper bound of $S$, we have $u_n \leq u$. In other words, we have

\[ u - \frac{1}{n} < u_n \leq u \quad \text{for all } n \in \mathbb{Z}^+. \]

Since

\[ \lim_{n \to \infty} \left( u - \frac{1}{n} \right) = \lim_{n \to \infty} u = u, \]

the squeeze theorem implies that

\[ \lim_{n \to \infty} u_n = u. \]

This means $\{u_n\}$ is a sequence in $S$ that converges to $u$.



Exercises 1.5
Question 1
Let $a$ be a positive integer that is larger than 1. Show that $\lim_{n \to \infty} \frac{1}{a^n} = 0$.

Question 2
If $\{a_n\}$ is a sequence that converges to a negative number $a$, show that there is a positive integer $N$ such that $a_n < a/2 < 0$ for all $n \geq N$.

Question 3
Determine whether the limit exists. If it exists, find the limit.

(a) $\lim_{n \to \infty} \frac{3n + (-1)^n}{n + 2}$

(b) $\lim_{n \to \infty} \frac{4n + 2}{7n^2 + 3n}$

(c) $\lim_{n \to \infty} \frac{n}{2n^2 + n + 5}$

(d) $\lim_{n \to \infty} \frac{n^2 + 4n}{n + 3}$

Question 4
If {an } is a sequence that converges to a, use the definition of convergence
to show that the sequence {|an |} converges to |a|.

Question 5: Last Statement in Remark 1.1


Given that {an } is a nonzero sequence that converges to 0.

(a) Show that {1/an } is not bounded.

(b) Conclude that the sequence {1/an } is divergent.



Question 6
Let $\{a_n\}$ and $\{b_n\}$ be sequences. Assume that there is a real number $a$ such that

\[ |a_n - a| \leq b_n \quad \text{for all } n \in \mathbb{Z}^+. \]

If $\lim_{n \to \infty} b_n = 0$, show that

\[ \lim_{n \to \infty} a_n = a. \]

Question 7: The Convergence of the Sequence in Example 1.13

Consider the sequence $\{a_n\}$ defined in Example 1.13. It is defined recursively by $a_1 = 2$, and for $n \geq 1$,

\[ a_{n+1} = \begin{cases} a_n + \dfrac{1}{n} & \text{if } a_n < 3, \\[1ex] a_n - \dfrac{1}{n} & \text{if } a_n \geq 3. \end{cases} \]

(a) Show that $|a_{n+1} - 3| \leq \frac{1}{n}$ for all $n \in \mathbb{Z}^+$.
[Hint: Use induction.]

(b) Show that the sequence $\{a_n\}$ is convergent and find its limit.

1.6 Closed Sets and Limit Points

When we study convergence of sequences, we measure the closeness between


points by a positive number ε. A point x is within ε from the point a if x is in the
open interval (a − ε, a + ε). More generally, we define a neighbourhood of the
point a as follows.

Definition 1.32 Neighbourhood

Given a point $a$ in $\mathbb{R}$, a neighbourhood of $a$ is an open interval $(b, c)$ that contains $a$.

The concept of neighbourhood is closely related to the concept of interior


point.

Definition 1.33 Interior Point


If S is a set of real numbers, and there is a neighbourhood of the point a
that is contained in S, we call a an interior point of S.

In this section, we use sequences to define and study some properties of subsets
of real numbers. Given a subset S of real numbers, we say that a sequence {an }
is in S if each of the terms an is a point in S. In other words, the sequence {an }
is in S means that the set {an | n ∈ Z+ } is a subset of S. We will abuse notation
and write this as {an } ⊂ S when there is no confusion. We start with a simple but
useful lemma.

Lemma 1.35
Let S be a subset of real numbers. If {an } is a sequence in S that converges
to a, then every neighbourhood of a contains a point of S.

Proof
Let (b, c) be a neighbourhood of a. Since a is in (b, c), b < a < c, and
hence the number
ε = min{a − b, c − a}
is positive. By definition, a − b ≥ ε, c − a ≥ ε. Since {an } converges to a,
there is a positive integer N such that for all n ≥ N ,

|an − a| < ε.

In particular,
b ≤ a − ε < aN < a + ε ≤ c.
This shows that aN is a point in S that is in the neighbourhood (b, c) of a.

Figure 1.5: b < a < c and ε = c − a ≤ a − b.

Next, we revisit the concept of denseness.

Theorem 1.36
Let S be a subset of real numbers. Then S is dense in R if and only if every
real number x is the limit of a sequence in S.

Since we have proved that each of the set of rational numbers and the set of
irrational numbers is dense in the set of real numbers, we immediately obtain the
following.

Corollary 1.37
Let x be a real number.

1. There is a sequence of rational numbers {pn } that converges to x.

2. There is a sequence of irrational numbers {qn } that converges to x.



Proof of Theorem 1.36


First we assume that the set S is dense in R. Given a real number x, we
want to show that there is a sequence in S that converges to x. For each
positive integer n, since S is dense in R, there is an element of S in the
open interval (x − 1/n, x). Choose one of these elements and denote it by
an . Then {an } is a sequence in S satisfying
\[ x - \frac{1}{n} < a_n < x \quad \text{for all } n \in \mathbb{Z}^+. \]
By the squeeze theorem, the sequence {an } converges to x.
Conversely, assume that every real number x is the limit of a sequence in S.
We want to show that S is dense in R. Let (a, b) be an open interval. Take
any point x in the interval (a, b). By assumption, there is a sequence {cn }
in S which converges to x. Since (a, b) is a neighbourhood of x, Lemma 1.35
implies that the interval (a, b) contains a point of S. Thus we have shown
that every open interval (a, b) contains a point of S. This proves that S is
dense in R.

Example 1.26

Let $x = \sqrt{2}$, and define the sequences $\{p_n\}$ and $\{q_n\}$ by
\[ p_n = \frac{\lfloor 10^n \sqrt{2} \rfloor}{10^n}, \qquad q_n = \sqrt{2}. \]
Here $\lfloor a \rfloor$ is the floor of $a$. By definition,
\[ 10^n \sqrt{2} - 1 < \lfloor 10^n \sqrt{2} \rfloor \leq 10^n \sqrt{2}. \]
Therefore,
\[ \sqrt{2} - \frac{1}{10^n} < p_n \leq \sqrt{2}. \]
By the squeeze theorem, $\{p_n\}$ converges to $\sqrt{2}$. Since $\lfloor 10^n \sqrt{2} \rfloor$ is an integer,
$p_n$ is a rational number. Hence, $\{p_n\}$ is a sequence of rational numbers that
converges to $x = \sqrt{2}$. Obviously, $\{q_n\}$ is a sequence of irrational numbers
that converges to $x = \sqrt{2}$.

The number $p_n$ is the rational number obtained by truncating the decimal
expansion of $\sqrt{2}$ to give a number with $n$ decimal places. The first 7 terms of
the sequence $\{p_n\}$ are

1.4, 1.41, 1.414, 1.4142, 1.41421, 1.414213, 1.4142135.
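
The truncations in this example can be reproduced exactly, without floating-point error. The sketch below is only an illustration: it relies on the observation that the floor of $10^n\sqrt{2}$ is the integer square root of $2 \cdot 10^{2n}$, and the function name `truncated_sqrt2` is our own.

```python
from fractions import Fraction
from math import isqrt

def truncated_sqrt2(n):
    """p_n = floor(10^n * sqrt(2)) / 10^n, computed with integer arithmetic only."""
    # floor(10^n * sqrt(2)) is the integer square root of 2 * 10^(2n)
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

p = [truncated_sqrt2(n) for n in range(1, 8)]
print([float(x) for x in p])

# Exact check of  sqrt(2) - 1/10^n < p_n <= sqrt(2)  by comparing squares
squeezed = all(
    x * x <= 2 < (x + Fraction(1, 10 ** n)) ** 2
    for n, x in enumerate(p, start=1)
)
print(squeezed)
```

Comparing squares avoids computing $\sqrt{2}$ itself, so every comparison is between rational numbers.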

Now we introduce the concept of closed sets.

Definition 1.34 Closed Set


Let S be a subset of R. We say that S is closed in R provided that if {an }
is a sequence of points in S that converges to the limit a, the point a is also
in S.

Example 1.27
The three statements in Lemma 1.29 imply that intervals of the form
(−∞, a], [a, ∞) and [a, b] are closed subsets of R. In particular, we call
them closed intervals, and [a, b] is a closed and bounded interval.

Remark 1.2

1. By definition, R is closed in R.

2. ∅ is closed in R because the statement that defines a closed set is a


statement of the form p → q, where p is always false for an empty
set. Hence, for an empty set, this statement p → q that defines a closed
set is vacuously true.

Example 1.28

Is the interval (0, 2) closed in R?

Solution
The sequence {1/n} is a sequence in the interval (0, 2) that converges to
the point 0 that is not in (0, 2). Hence, the interval (0, 2) is not closed in R.

Remark 1.3
One can prove that if S is an interval of the form (a, b), or (a, b], or [a, b),
or (−∞, a), or (a, ∞), then S is not closed in R.

Example 1.29
Is the set of rational numbers Q closed in R?

Solution
We have seen in Example 1.26 that there is a sequence in the set Q that
converges to $\sqrt{2}$, which is not in Q. Hence, Q is not closed in R.

The concept of closed sets is defined in terms of limits of sequences. This


leads us to the concept of limit points.

Definition 1.35 Limit Points


Let S be a subset of real numbers. A point x in R is called a limit point of
the set S if there is a sequence of points in S \ {x} that converges to x.

Notice that {an } is a sequence in S \ {x} if and only if it is a sequence in S


with none of the terms an equal to x.

Example 1.30

In the solution of Example 1.28, we have seen that the sequence {1/n} in
(0, 2) converges to the point 0. Since none of the terms 1/n is 0, 0 is a limit
point of the set (0, 2).

Limits and Limit Points


Although the concepts of limits and limit points are closely related, one
should not confuse them. The limit of a convergent sequence {an } is not
necessarily a limit point of the set {an | n ∈ Z+ }. For example, c is the
limit of the constant sequence {an } with an = c for all n ∈ Z+ , but c is not
a limit point of the set {an | n ∈ Z+ } = {c}.

Example 1.31

Determine the set of limit points of the set (0, 2).

Solution
We claim that every point in [0, 2] is a limit point of the set (0, 2).
Example 1.30 shows that 0 is a limit point of (0, 2). The sequence {2−1/n}
is a sequence in (0, 2) that converges to 2. Hence, 2 is also a limit point of
(0, 2).
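For any c ∈ (0, 2), c > 0. Let m be a positive integer such that 1/m < c.
Then {c − 1/(n + m)} is a sequence in (0, 2) \ {c} that converges to c. Hence, c
is a limit point of (0, 2).
Finally, if x < 0 or x > 2, then x has a neighbourhood that contains no point
of (0, 2). By Lemma 1.35, x is not the limit of any sequence in (0, 2), and so
it is not a limit point of (0, 2).
This completes the proof that the set of limit points of (0, 2) is [0, 2].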

Remark 1.4

1. For intervals of the form (a, b), (a, b], [a, b) or [a, b], the set of limit
points is [a, b].

2. For intervals of the form (−∞, a) or (−∞, a], the set of limit points is
(−∞, a].

3. For intervals of the form (a, ∞) or [a, ∞), the set of limit points is
[a, ∞).

Example 1.32
Show that the set Z does not have limit points.

Solution
If n is an integer, n is contained in the open interval (n − 1, n + 1) that does
not contain any integer other than n itself. By Lemma 1.35, there is no
sequence in Z \ {n} that converges to n. Therefore, an integer n is not a limit
point of Z.

If x is not an integer, it is contained in the interval (⌊x⌋, ⌈x⌉) that does not
contain any integers. By Lemma 1.35, x is not a limit of a sequence in Z.
Therefore, x is not a limit point of Z.

Definition 1.36 Isolated Points


Let S be a subset of real numbers. We say that x is an isolated point of S if

(a) x is in S;

(b) x is not a limit point of S.

By definition, we have the following.

Isolated Points vs Limit Points


A point in a set S is either a limit point or an isolated point of the set.

Example 1.33
By Example 1.32, every point in the set of integers Z is an isolated point of
the set.

The following is quite obvious from the definition of isolated points and Lemma
1.35.

Theorem 1.38
Let S be a subset of real numbers. A point x in S is an isolated point if and
only if there is a neighbourhood (a, b) of x that intersects the set S only at
the point x.

We have seen that a limit point of a set is not necessarily a point of that set.
The following gives a characterization of closed sets in terms of limit points.

Theorem 1.39
Let S be a subset of real numbers. The set S is closed in R if and only if it
contains all its limit points.

To prove a statement of the form p ⇐⇒ q, we can prove p =⇒ q and ¬p =⇒ ¬q.

Proof
Assume first S is closed in R. Let x be a limit point of S. Then there is
a sequence {an } in S \ {x} that converges to x. In particular, {an } is a
sequence in S that converges to x. Since S is closed in R, x is in S. This
proves that S contains all its limit points.
Now assume that S is not closed in R. Then there is a sequence {an } in S
that converges to a point x, but x is not in S. Since x is not in S, none of
the terms in the sequence {an } is equal to x. Hence, {an } is a sequence in
S \ {x} that converges to x. Therefore, x is a limit point of S.
This shows that S does not contain all its limit points.

Exercises 1.6
Question 1
Show that every real number is a limit point of the set of rational numbers.

Question 2
Let S be the set
\[ S = \left\{ \frac{1}{n} \,\middle|\, n \in \mathbb{Z}^+ \right\} \cup \{0\}. \]
(a) Find the set of limit points and the set of isolated points of S.

(b) Is S a closed set?

Question 3
Determine whether each of the following is a closed set.

(a) A = [2, 3] ∪ [4, 7]

(b) B = (−∞, 2] ∪ [3, 5]

(c) C = R \ (−1, 1)

(d) D = [1, 2) ∪ [2, 4]

(e) E = (1, 2) ∪ (3, 4]



1.7 The Monotone Convergence Theorem

Recall that a sequence {an } is monotone if it is increasing or it is decreasing.
Obviously, an increasing sequence is bounded below, and a decreasing sequence
is bounded above. However, a monotone sequence is not necessarily convergent. A
simple example is the sequence of natural numbers {n}. In the following, we give
a characterization for a monotone sequence to be convergent.

Theorem 1.40 The Monotone Convergence Theorem

Let {an } be a monotone sequence.

1. If {an } is increasing, then {an } is convergent if and only if it is bounded
above. In this case,
\[ \lim_{n\to\infty} a_n = \sup\{a_n\}. \]

2. If {an } is decreasing, then {an } is convergent if and only if it is bounded
below. In this case,
\[ \lim_{n\to\infty} a_n = \inf\{a_n\}. \]

Convergence Criteria for Monotone Sequences


The monotone convergence theorem says that a monotone sequence is
convergent if and only if it is bounded.

It suffices to prove the case where {an } is an increasing sequence.

Proof
First suppose that {an } is an increasing sequence that is convergent. Then
{an } is bounded. So it is bounded above.
Conversely, suppose that {an } is an increasing sequence that is bounded
above. Then a = sup{an } exists. Now we use the same argument as in
the proof of Lemma 1.34. Given ε > 0, since a − ε is less than a, it is not
an upper bound of the set S = {an | n ∈ Z+ }. Hence, there is a positive
integer N such that
aN > a − ε.

It follows that

an ≥ aN > a − ε for all n ≥ N.

Since a is an upper bound of S, we also have an ≤ a for all n. Thus,

|an − a| < ε for all n ≥ N.

This shows that the sequence {an } converges to a.

The monotone convergence theorem is very useful because we can conclude
the convergence of a sequence without knowing the limit of the sequence a priori.
It is a consequence of the completeness axiom, which asserts that any nonempty
set that is bounded above has a supremum.

Example 1.34

Let $a$ be a number in the interval (0, 1). Show that
\[ \lim_{n\to\infty} a^n = 0. \]

Remark 1.5
It follows from Theorem 1.27 that for any $a$ in the interval (−1, 1),
\[ \lim_{n\to\infty} a^n = 0. \]

Solution to Example 1.34


Since $0 < a < 1$, for any positive integer $n$,
\[ a^{n+1} = a^n \times a < a^n. \]
Hence, the sequence $\{a^n\}$ is decreasing. On the other hand, $a^n > 0$ for all
$n \in \mathbb{Z}^+$. Hence, $\{a^n\}$ is a decreasing sequence that is bounded below. By
the monotone convergence theorem, $\{a^n\}$ converges to a number $\ell$.
Since $\{a^{n+1}\}$ is a subsequence of $\{a^n\}$, it also converges to $\ell$. Applying
the limit laws to
\[ a^{n+1} = a \times a^n, \]
we have
\[ \ell = \lim_{n\to\infty} a^{n+1} = a \lim_{n\to\infty} a^n = a\ell. \]
Since $a \neq 1$, we must have $\ell = 0$.

Example 1.35

Define the sequence $\{a_n\}$ inductively by $a_1 = 1$ and for all $n \geq 1$,
\[ a_{n+1} = \frac{2a_n + 2}{a_n + 2}. \]
Show that $\{a_n\}$ is convergent and find its limit.

Solution
First notice that $a_n > 0$ for all $n \in \mathbb{Z}^+$. When $n \geq 2$,
\[ a_{n+1} - a_n = \frac{2a_n + 2}{a_n + 2} - \frac{2a_{n-1} + 2}{a_{n-1} + 2} = \frac{2(a_n - a_{n-1})}{(a_n + 2)(a_{n-1} + 2)}. \]
Now, $a_2 = 4/3 > a_1$. Hence, we deduce by induction that $a_{n+1} - a_n > 0$ for all $n \in \mathbb{Z}^+$.
In other words, $\{a_n\}$ is an increasing sequence. For all $n \geq 1$,
\[ a_{n+1} = 2 - \frac{2}{a_n + 2} < 2. \]
Hence, $\{a_n\}$ is bounded above by 2. Since $\{a_n\}$ is an increasing sequence
that is bounded above, by the monotone convergence theorem, it converges
to a limit $u = \sup\{a_n\}$. Since $\{a_{n+1}\}$ is a subsequence of $\{a_n\}$, it also
converges to $u$. Applying the limit laws to
\[ a_{n+1} = \frac{2a_n + 2}{a_n + 2}, \]
we find that
\[ u = \frac{2u + 2}{u + 2}. \]
This implies that
\[ u^2 = 2. \]
Since $a_n > 0$, we must have $u \geq 0$. Hence, $u = \sqrt{2}$.

Notice that Example 1.35 is closely related to Example 1.6. The sequence
{an } defined in Example 1.35 is another sequence of rational numbers which
converges to $\sqrt{2}$.
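
The behaviour established in Example 1.35 can be observed numerically. The following sketch is an illustration under our own naming (`iterate`, `gaps`): it iterates the recursion with exact rational arithmetic and checks, for the first 12 terms only, that the sequence is increasing, stays below 2, and that $2 - a_n^2$ is positive and becomes small.

```python
from fractions import Fraction

def iterate(count):
    """Return [a_1, ..., a_count] with a_1 = 1 and a_{n+1} = (2 a_n + 2) / (a_n + 2)."""
    terms = [Fraction(1)]
    for _ in range(count - 1):
        a = terms[-1]
        terms.append((2 * a + 2) / (a + 2))
    return terms

terms = iterate(12)
increasing = all(x < y for x, y in zip(terms, terms[1:]))
below_two = all(x < 2 for x in terms)
gaps = [2 - x * x for x in terms]  # 2 - a_n^2 for each computed term
print(increasing, below_two, float(terms[-1]), float(gaps[-1]))
```

Every term is a rational number, so the squares never equal 2; the gaps only shrink towards 0.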
The next example is a classical one.

Example 1.36
Show that the limit
\[ \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n \]
exists.

Solution
Let
\[ a_n = \left(1 + \frac{1}{n}\right)^n. \]
Given a positive integer $n$, notice that
\[ \frac{a_{n+1}}{a_n} = \frac{n+2}{n+1} \times \left(\frac{(n+2)n}{(n+1)^2}\right)^n. \]
By Bernoulli’s inequality (see Question 1.2.2),
\[ \left(\frac{(n+2)n}{(n+1)^2}\right)^n = \left(1 - \frac{1}{(n+1)^2}\right)^n \geq 1 - \frac{n}{(n+1)^2} = \frac{n^2 + n + 1}{(n+1)^2}. \]
It follows that
\[ \frac{a_{n+1}}{a_n} \geq \frac{(n+2)(n^2 + n + 1)}{(n+1)^3} = \frac{n^3 + 3n^2 + 3n + 2}{n^3 + 3n^2 + 3n + 1} > 1. \]
This shows that
\[ a_{n+1} > a_n \quad \text{for all } n \in \mathbb{Z}^+. \]
Hence, $\{a_n\}$ is monotonically increasing. Using binomial expansion, we
have
\[ a_n = \sum_{k=0}^{n} \binom{n}{k} \frac{1}{n^k}. \]
For $k \geq 1$, Question 1.2.1 shows that
\[ \binom{n}{k} \frac{1}{n^k} = \frac{1}{k!} \, \frac{n(n-1)\cdots(n-k+1)}{n^k} \leq \frac{1}{k!} \leq \frac{1}{2^{k-1}}. \]
Therefore,
\[ a_n \leq 1 + 1 + \frac{1}{2} + \cdots + \frac{1}{2^{n-1}} = 3 - \frac{1}{2^{n-1}} \leq 3. \]
This proves that $\{a_n\}$ is bounded above by 3. Since $\{a_n\}$ is an increasing
sequence that is bounded above, the monotone convergence theorem asserts
that the limit
\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n \]
exists.

The Number e
The number e is defined as
\[ e = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n. \]
Correct to 15 decimal places, its numerical value is

e = 2.718281828459045

One can show that the sequence $\{b_n\}_{n=0}^{\infty}$ defined by $b_0 = 1$,
\[ b_n = b_{n-1} + \frac{1}{n!} \quad \text{for all } n \geq 1, \]
also converges to e. In series notation,
\[ e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} + \cdots. \]
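
The two sequences that define e can be compared numerically. The sketch below is only an illustration, with function names `a` and `b` chosen to match the notation above. It uses exact fractions to check, for n from 1 to 30, that {an } is increasing, that an ≤ bn ≤ 3, and that bn − an ≤ 3/(2n), which is the estimate stated in Question 2 of Exercises 1.7.

```python
from fractions import Fraction
from math import e, factorial

def a(n):
    """a_n = (1 + 1/n)^n as an exact fraction."""
    return (1 + Fraction(1, n)) ** n

def b(n):
    """b_n = 1 + 1/1! + 1/2! + ... + 1/n! as an exact fraction."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

ns = range(1, 31)
increasing = all(a(n) < a(n + 1) for n in ns)
ordered = all(a(n) <= b(n) <= 3 for n in ns)
gap_bound = all(b(n) - a(n) <= Fraction(3, 2 * n) for n in ns)
print(increasing, ordered, gap_bound)
print(float(a(30)), float(b(15)), e)
```

The printed values show that {bn } approaches e much faster than {an } does, which is why the series is preferred for computation.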

Exercises 1.7
Question 1
Given that the sequence $\{a_n\}$ is defined by $a_1 = 2$, and for all $n \geq 1$,
\[ a_{n+1} = \frac{3a_n + 1}{a_n + 2}. \]
Show that $\{a_n\}$ is convergent and find its limit.

Question 2
For $n \geq 1$, let
\[ a_n = \left(1 + \frac{1}{n}\right)^n. \]
Define the sequence $\{b_n\}_{n=0}^{\infty}$ by $b_0 = 1$, and for all $n \geq 1$,
\[ b_n = b_{n-1} + \frac{1}{n!}. \]
(a) Show that the sequence $\{b_n\}$ is convergent.

(b) For a positive integer $n$, use the binomial expansion of $a_n$ to show that
$a_n \leq b_n$ and
\[ b_n - a_n \leq \frac{3}{2n}. \]
(c) Conclude that the sequence $\{b_n\}$ converges to e.

1.8 Sequential Compactness

Let us first look at an example.

Example 1.37

Let $\{a_n\}$ be the sequence defined by
\[ a_n = (-1)^{n-1} \frac{n}{n+1}. \]
Obviously,
\[ |a_n| = \frac{n}{n+1} \leq 1. \]
Hence, the sequence $\{a_n\}$ is bounded. Now,
\[ a_{2n-1} = \frac{2n-1}{2n}, \qquad a_{2n} = -\frac{2n}{2n+1}. \]
The subsequence $\{a_{2n-1}\}$ converges to 1, whereas the subsequence $\{a_{2n}\}$
converges to −1. Since there are two subsequences that converge to two
different limits, the sequence $\{a_n\}$ is not convergent.
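
The two subsequences in Example 1.37 can be tabulated directly. In this small sketch (the names `a`, `odd_terms` and `even_terms` are ours), exact fractions are used to confirm the formulas for the odd-indexed and even-indexed terms and to show their distances from 1 and −1.

```python
from fractions import Fraction

def a(n):
    """a_n = (-1)^(n-1) * n / (n + 1)."""
    return (-1) ** (n - 1) * Fraction(n, n + 1)

odd_terms = [a(2 * k - 1) for k in range(1, 201)]   # a_1, a_3, a_5, ...
even_terms = [a(2 * k) for k in range(1, 201)]      # a_2, a_4, a_6, ...

# Distances to the two candidate limits: 1 - a_{2k-1} = 1/(2k), a_{2k} + 1 = 1/(2k+1)
print(float(1 - odd_terms[-1]), float(even_terms[-1] + 1))
```

Both distances shrink to 0, while the two subsequences stay close to different numbers, in line with the conclusion of the example.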

In this example, we find that although the sequence {an } is not convergent, it
has convergent subsequences. In this section, we are going to prove that every
bounded sequence has a convergent subsequence. By monotone convergence
theorem, it is sufficient to prove that every sequence has a monotone subsequence.
It can be achieved via a concept called peak index.

Definition 1.37 Peak Index


Let {an } be a sequence of real numbers. A positive integer m is called a
peak index of the sequence if

am ≥ an for all n ≥ m.

In other words, there is no term after the mth term that is larger than am .

If {an } is a decreasing sequence, every positive integer is a peak index of the
sequence. If {an } is an increasing sequence, m is a peak index if and only if
an = am for all n ≥ m, which means {an } is constant from the mth term on.

We can use the concept of peak indices to prove the following.

Theorem 1.41
Every sequence has a monotone subsequence.

Proof
Given a sequence {an }, let S be the set of its peak indices. It is a subset of
positive integers. We discuss the cases where S is infinite and S is finite.
Case 1: S is infinite.
Let n1 , n2 , n3 , . . . be the elements of S arranged in increasing order, namely,

n1 < n2 < n3 < · · · .

This is a subsequence of $\{n\}$. For any positive integer $k$, since $n_{k+1} > n_k$
and $n_k$ is a peak index, we have
\[ a_{n_{k+1}} \leq a_{n_k}. \]
This shows that $\{a_{n_k}\}$ is a decreasing subsequence of $\{a_n\}$.

Case 2: S is finite.
If S is an empty set, let $n_1 = 1$. If S is not empty, it has a largest element
$n_{\max}$. Let $n_1 = n_{\max} + 1$. Then for any integer $n$ such that $n \geq n_1$, $n$ is
not a peak index of the sequence. Since $n_1$ is not a peak index, there is an
$n_2 > n_1$ such that $a_{n_2} > a_{n_1}$. Suppose that we have chosen the positive
integers $n_1, n_2, \ldots, n_k$ such that $n_1 < n_2 < \cdots < n_k$ and
\[ a_{n_1} < a_{n_2} < \cdots < a_{n_k}. \]
Since $n_k \geq n_1$, $n_k$ is not a peak index. This implies that there is a positive
integer $n_{k+1}$ larger than $n_k$ such that
\[ a_{n_{k+1}} > a_{n_k}. \]
This procedure constructs the increasing subsequence $\{a_{n_k}\}$ inductively.
In both cases, we have shown that $\{a_n\}$ has a monotone subsequence.
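
To get a feeling for peak indices, one can compute them for a finite list of numbers. This is only a finite analogue of the definition, under the assumption that the list is the whole sequence: an index m counts as a peak when am ≥ an for every later index n in the list. The function name `peak_indices` and the sample data are hypothetical.

```python
def peak_indices(a):
    """Return the 1-based indices m with a[m] >= a[n] for all later n in the list."""
    peaks = []
    later_max = None  # largest value seen so far when scanning from the right
    for m in range(len(a) - 1, -1, -1):
        if later_max is None or a[m] >= later_max:
            peaks.append(m + 1)   # store the 1-based index, as in the text
            later_max = a[m]
    peaks.reverse()
    return peaks

sample = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
peaks = peak_indices(sample)
values = [sample[m - 1] for m in peaks]
print(peaks, values)
```

As in Case 1 of the proof, the terms picked out by the peak indices form a decreasing subsequence. In a finite list the last index is always a peak, which is one way the finite analogue differs from a genuine infinite sequence.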

Obviously, a subsequence of a bounded sequence is bounded. Combining
Theorem 1.41 with the monotone convergence theorem, we obtain the following
important assertion.

Theorem 1.42 Bolzano-Weierstrass Theorem


Every bounded sequence has a convergent subsequence.

Now we want to introduce a concept called Cauchy sequence, which is closely
related to the completeness axiom.

Definition 1.38 Cauchy Sequence

A sequence $\{a_n\}$ is called a Cauchy sequence provided that for any $\varepsilon > 0$,
there is a positive integer $N$ such that for all $m \geq n \geq N$,
\[ |a_m - a_n| < \varepsilon. \]

Example 1.38
For the sequence $\{a_n\}$ with $a_n = \dfrac{n+1}{n}$, it is easy to check that it is a
Cauchy sequence. Notice that if $m \geq n$,
\[ |a_m - a_n| = \left| \frac{1}{n} - \frac{1}{m} \right| = \frac{1}{n} - \frac{1}{m} < \frac{1}{n}. \]
Given $\varepsilon > 0$, the Archimedean property says that there is a positive integer
$N$ such that $1/N < \varepsilon$. Hence, if $m \geq n \geq N$,
\[ |a_m - a_n| < \frac{1}{n} \leq \frac{1}{N} < \varepsilon. \]
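
The choice of N in Example 1.38 can be made concrete. The sketch below is an illustration with our own helper names (`threshold`, `cauchy_check`): for a rational ε it picks N = ⌊1/ε⌋ + 1, so that 1/N < ε, and then checks the Cauchy inequality for a finite window of indices m ≥ n ≥ N.

```python
from fractions import Fraction
from math import floor

def a(n):
    """a_n = (n + 1) / n as an exact fraction."""
    return Fraction(n + 1, n)

def threshold(eps):
    """A positive integer N with 1/N < eps, as provided by the Archimedean property."""
    return floor(1 / eps) + 1

def cauchy_check(eps, span=40):
    """Check |a_m - a_n| < eps for N <= n <= m < N + span (a finite window only)."""
    N = threshold(eps)
    return all(
        abs(a(m) - a(n)) < eps
        for n in range(N, N + span)
        for m in range(n, N + span)
    )

tolerances = [Fraction(1, 2), Fraction(1, 10), Fraction(3, 1000)]
print([(threshold(eps), cauchy_check(eps)) for eps in tolerances])
```

A finite window cannot replace the argument in the example, but it shows how the inequality 1/N < ε drives the choice of N.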

There is a similarity between the definition of a Cauchy sequence and the
definition of convergence of a sequence. We can show that a linear combination
of Cauchy sequences is a Cauchy sequence, and a product of Cauchy sequences
is a Cauchy sequence. For the quotient, some care needs to be taken. We leave it
to the students to formulate the precise statement.
In the definition of a Cauchy sequence, we do not need to know whether the
sequence is convergent, or what is the limit of the sequence if it is convergent.
Nevertheless, a convergent sequence is a Cauchy sequence.

Theorem 1.43
If a sequence {an } is convergent, then it is a Cauchy sequence.

Proof
Let $a$ be the limit of the convergent sequence $\{a_n\}$. Given $\varepsilon > 0$, there is a
positive integer $N$ such that for all $n \geq N$,
\[ |a_n - a| < \frac{\varepsilon}{2}. \]
It follows from the triangle inequality that if $m \geq n \geq N$,
\[ |a_m - a_n| \leq |a_m - a| + |a_n - a| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Hence, {an } is a Cauchy sequence.

The converse is also true in the set of real numbers. It is proved using the fact
that every bounded sequence has a convergent subsequence.

Theorem 1.44 Cauchy Criterion for Convergent Sequences

If {an } is a Cauchy sequence of real numbers, then it converges to a real


number.

Proof
First we prove that a Cauchy sequence $\{a_n\}$ is bounded.
The proof is almost identical to the proof that a convergent sequence is
bounded. Take $\varepsilon = 1$. There is a positive integer $N_0$ such that for all
$m \geq n \geq N_0$,
\[ |a_m - a_n| < 1. \]
Taking $n = N_0$, this implies that
\[ |a_m| \leq |a_{N_0}| + 1 \quad \text{for all } m \geq N_0. \]
Let
\[ M = \max\{|a_1|, \ldots, |a_{N_0 - 1}|, |a_{N_0}| + 1\}. \]
Then $|a_n| \leq M$ for all $n \in \mathbb{Z}^+$, proving that the sequence is bounded. Since $\{a_n\}$ is a
bounded sequence, it has a convergent subsequence $\{a_{n_k}\}$ which converges
to a limit $a$. We want to prove that the sequence $\{a_n\}$ also converges to $a$.
Given $\varepsilon > 0$, there is a positive integer $N$ such that for all $m \geq n \geq N$,
\[ |a_m - a_n| < \frac{\varepsilon}{2}. \]
There is a positive integer $K$ such that for all $k \geq K$,
\[ |a_{n_k} - a| < \frac{\varepsilon}{2}. \]
Now let $n$ be an integer such that $n \geq N$. Since $\{n_k\}$ is a subsequence of
$\{n\}$, there is an integer $k$ such that $k \geq K$ and $n_k \geq n$. Then
\[ |a_n - a| \leq |a_{n_k} - a_n| + |a_{n_k} - a| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
This proves that for all $n \geq N$,
\[ |a_n - a| < \varepsilon. \]
Hence, the sequence $\{a_n\}$ indeed converges to $a$.

Theorem 1.44 is proved using the fact that every bounded sequence has a
convergent subsequence. The latter is a consequence of the monotone convergence
theorem, whose validity relies on the completeness axiom for real numbers. Hence,
the fact that every Cauchy sequence of real numbers is convergent is a consequence
of the completeness axiom.
If we consider the set of rational numbers, the assertion is not true. For
example, we have shown that there is a sequence of rational numbers {an } that
converges to $\sqrt{2}$. Therefore, the sequence {an } is a Cauchy sequence that does
not converge in the set of rational numbers.
The following combines the results of Theorem 1.43 and Theorem 1.44.

Cauchy Criterion for Convergent Sequences


A sequence of real numbers {an } is convergent if and only if it is a Cauchy
sequence.

Like the monotone convergence theorem, the Cauchy criterion can be used to
conclude the convergence of a sequence without knowing the limit of the
sequence a priori. It has wide applications, as we are going to see in later chapters.

Example 1.39
For a positive integer $n$, let
\[ s_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}. \]
Show that the sequence $\{s_n\}$ is divergent.

Solution
We prove that $\{s_n\}$ is not a Cauchy sequence, by showing that for $\varepsilon = 1/2$,
for any positive integer $N$, there are integers $m$ and $n$ with $m \geq n \geq N$
such that
\[ |s_m - s_n| \geq \frac{1}{2}. \]
For a given positive integer $N$, let $n = N$ and $m = 2N$. Then $m \geq n \geq N$
and $m - n = N$. Notice that
\[ s_m - s_n = \frac{1}{N+1} + \frac{1}{N+2} + \cdots + \frac{1}{2N} \geq \underbrace{\frac{1}{2N} + \frac{1}{2N} + \cdots + \frac{1}{2N}}_{m - n = N \text{ terms}} = \frac{1}{2}. \]
This shows that $\{s_n\}$ is not a Cauchy sequence. Hence, it is not convergent.
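
The key estimate in Example 1.39 is easy to test for particular values of N. The following sketch is an illustration only, with the function name `s` and the chosen sample values of N being ours. It computes the partial sums exactly and confirms that the block from N + 1 to 2N contributes at least 1/2.

```python
from fractions import Fraction

def s(n):
    """s_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# For each sample N, the difference s_{2N} - s_N = 1/(N+1) + ... + 1/(2N)
gaps = {N: s(2 * N) - s(N) for N in (1, 2, 5, 10, 100)}
print({N: float(g) for N, g in gaps.items()})
print(all(g >= Fraction(1, 2) for g in gaps.values()))
```

However large N is taken, the difference never drops below 1/2, so the Cauchy condition fails for ε = 1/2.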

We have studied the convergence of sequences, and the interplay between


sequences and sets. Now we define another property of sets called sequential
compactness.

Definition 1.39 Sequential Compactness


Let S be a subset of real numbers. We say that S is sequentially compact
provided that every sequence in S has a subsequence that converges to a
point in S.

Using logic, we find that a set S is not sequentially compact if and only if
there is a sequence in S that does not have a convergent subsequence with limit in S.
From the theory that we have developed in this chapter, it is not difficult to
prove the following.

Theorem 1.45
If S is a closed and bounded subset of real numbers, then it is sequentially
compact.

Proof
Let S be a subset of R that is closed and bounded. Given a sequence {an }
in S, since S is bounded, the sequence {an } is bounded. Therefore, there
is a subsequence {ank } that converges to a number a. Since {ank } is a
sequence in the set S that converges to a, and S is closed, the limit a must
be in S. In other words, we have shown that the sequence {an } in S has a
subsequence {ank } that converges to a point a that is in S. This proves that
S is sequentially compact.

Example 1.40

Since an interval of the form [a, b] is closed and bounded, it is sequentially


compact.

The converse to Theorem 1.45 is also true.

Theorem 1.46
Let S be a subset of R. If S is sequentially compact, then it is closed and
bounded.

This is a statement of the form p → q∧r. It is equivalent to (p → q)∧(p → r),



which in turn is equivalent to (¬q → ¬p) ∧ (¬r → ¬p). Hence, we will prove the
following two statements: if S is not closed, it is not sequentially compact; and if
S is not bounded, it is not sequentially compact.

Proof
First, we prove that if S is not closed, it is not sequentially compact. If S is
not closed, there is a sequence {an } in S which converges to a point a but
a is not in S. For this sequence, every subsequence is convergent with limit
a. Hence, this sequence does not have a convergent subsequence with limit
in S. This proves that S is not sequentially compact.
Next, we prove that if S is not bounded, it is not sequentially compact. If S
is not bounded, for each positive integer $n$, there is a point $a_n$ in S such that
\[ |a_n| \geq n. \]
Consider the sequence $\{a_n\}$. If $\{a_{n_k}\}$ is a subsequence of $\{a_n\}$,
\[ |a_{n_k}| \geq n_k \geq k. \]
Hence, the sequence $\{a_{n_k}\}$ is not bounded, and thus it is not convergent.
This shows that the sequence $\{a_n\}$ does not have any convergent
subsequence. Therefore, S is not sequentially compact.

Combining Theorem 1.45 and Theorem 1.46, we have the following.

Characterization of Sequentially Compact Sets


A subset of real numbers is sequentially compact if and only if it is closed
and bounded.

Notice that the only intervals that are both closed and bounded are those of
the type [a, b]. Hence, these are the only intervals that are sequentially compact.

Example 1.41
Determine whether each of the following sets is sequentially compact.

(a) Z

(b) A = [2, 5] \ {3}

(c) B = (0, 6] ∩ [4, 7].

Solution
(a) The set Z is not bounded. Hence, it is not sequentially compact.

(b) 3 is a limit point of the set A but it is not in A. Hence, A is not closed,
and so it is not sequentially compact.

(c) B = [4, 6] is closed and bounded. Hence, B is sequentially compact.

One might wonder why there is a need to introduce the concept of sequential
compactness if it is equivalent to being closed and bounded. We will see that
the following property of a closed and bounded subset of real numbers is very
important: every sequence in the set has a subsequence that converges to a
point in the set. By introducing the concept of sequential compactness, we can
avoid repeatedly proving this property for a set that is closed and bounded.
The next theorem gives an important feature of a sequentially compact set.

Theorem 1.47
Let S be a nonempty subset of real numbers. If S is closed and bounded, then
it has a maximum and a minimum. Equivalently, if S is sequentially compact, then
it has a maximum and a minimum.

Proof
Since S is bounded (and not empty), S has a least upper bound u and a greatest
lower bound ℓ. By Lemma 1.34, there are sequences {un } and {ℓn } in S that converge
to u and ℓ respectively. Since S is closed, u and ℓ are in S. Since u = sup S
is in S, S has a maximum. Since ℓ = inf S is in S, S has a minimum.

Exercises 1.8
Question 1
Given that the sequence $\{a_n\}$ is defined by
\[ a_n = 1 + \frac{1}{3} + \cdots + \frac{1}{2n-1}. \]
Show that $\{a_n\}$ is not a Cauchy sequence. Then conclude that the sequence
$\{a_n\}$ is divergent.

Question 2
Determine whether each of the following sequences is a Cauchy sequence.

(a) The sequence $\{a_n\}$ with $a_n = \dfrac{n + (-1)^n}{n - (-1)^n}$

(b) The sequence $\{b_n\}$ with $b_n = \dfrac{1 + n}{1 - (-1)^n n}$

Question 3
Determine whether each of the following sets is sequentially compact.

(a) A = {1, 2, · · · , 100}

(b) B = [4, 7] ∩ (6, 8]

Question 4
Show that the union of two sequentially compact sets is sequentially
compact.
Chapter 2. Limits of Functions and Continuity 82

Chapter 2

Limits of Functions and Continuity

In this chapter, we study functions f : D → R defined on a subset of real numbers


D, and taking values in the set of real numbers R. Polynomials and rational
functions are special examples. When we do not specify the domain of a function,
we will take its domain D to be the largest subset of real numbers where the
function can be defined.

Definition 2.1 Polynomials and Rational Functions


A polynomial is a function $p : \mathbb{R} \to \mathbb{R}$ of the form
\[ p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \]
where $a_0, a_1, \ldots, a_n$ are constants. We call $p(x)$ a polynomial of degree $n$
if $a_n \neq 0$. A rational function is a function of the form
\[ f(x) = \frac{p(x)}{q(x)}, \]
where $p(x)$ and $q(x)$ are polynomials, and $q(x)$ is not the zero polynomial.
The domain of this function is the set $D = \mathbb{R} \setminus S$, where $S$ is the finite
set containing all $x$ for which $q(x) = 0$.

For example, the domain of the rational function
\[ f(x) = \frac{x^2 + 1}{x + 2} \]
is the set $D = \mathbb{R} \setminus \{-2\}$.
To be able to apply tools in analysis, we are interested in functions that are
continuous. Continuity can be defined in two different ways that are equivalent.
One is using positive numbers δ and ε to measure distances of points in the domain
and range, while the other is using limits of sequences.

The limit of a function f : D → R when the variable x approaches a limit


point x0 of the domain D is an important concept in defining derivatives. This
concept can be defined for any function f : D → R whose domain D contains
limit points. There is a close relation between the limit of a function f (x) when x
approaches a limit point x0 , and the continuity of the function at x0 .
Although the continuity of a function can be defined independently of limits
of functions, we choose to consider limits of functions first.

2.1 Limits of Functions

In Section 1.6, we have defined the concept of limit points of a set D. The point
x0 is a limit point of the set D if there is a sequence of points in D \ {x0 } that
converges to x0 . A limit point of a set is not necessarily in that set. A set that
contains all its limit points is a closed set. If a point x0 is in a set D but is not a
limit point of D, it is called an isolated point of D. If x0 is an isolated point of D,
there is a neighbourhood (a, b) of x0 which intersects the set D only at the point
x0 .
Limits of functions can be defined using the ε − δ language or using limits of
sequences. We will define the concept using limits of sequences first, and then
show that it is equivalent to the ε − δ definition.

Definition 2.2 Limits of Functions


Let D be a subset of real numbers and let x0 be a limit point of D. Given
a function f : D → R, we say that the limit of f (x) as x approaches x0
is ℓ, provided that whenever {xn } is a sequence of points in D \ {x0 } that
converges to x0 , the sequence {f (xn )} converges to ℓ.
If the limit of f : D → R as x approaches x0 is ℓ, we write

lim f (x) = ℓ.
x→x0

Notice that we do not define lim f (x) if x0 is not a limit point of the domain
x→x0
where the function is defined.

Logical Expression for Definition of Limits

\[ \lim_{x \to x_0} f(x) = \ell \iff \Big( \forall \{x_n\} \subset D \setminus \{x_0\}, \ \lim_{n\to\infty} x_n = x_0 \implies \lim_{n\to\infty} f(x_n) = \ell \Big). \]

Let us first look at some examples.

Example 2.1
Find the limit if it exists.
(a) $\displaystyle \lim_{x \to 1} \frac{2x + 3}{x^2 + 1}$

(b) $\displaystyle \lim_{x \to 1} \frac{x^2 - 1}{x - 1}$

(c) $\displaystyle \lim_{x \to 1} \frac{x^2 + 1}{x - 1}$

Solution
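(a) The function
\[ f(x) = \frac{2x + 3}{x^2 + 1} \]
is defined on $D = \mathbb{R}$. If $\{x_n\}$ is a sequence in $\mathbb{R} \setminus \{1\}$ that converges
to 1, limit laws imply that
\[ \lim_{n\to\infty} f(x_n) = \lim_{n\to\infty} \frac{2x_n + 3}{x_n^2 + 1} = \frac{2 \times 1 + 3}{1^2 + 1} = \frac{5}{2}. \]
Hence,
\[ \lim_{x \to 1} \frac{2x + 3}{x^2 + 1} = \frac{5}{2}. \]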

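(b) The function
\[ f(x) = \frac{x^2 - 1}{x - 1} \]
is defined on $D = \mathbb{R} \setminus \{1\}$. If $\{x_n\}$ is a sequence in $\mathbb{R} \setminus \{1\}$ that
converges to 1, limit laws imply that
\[ \lim_{n\to\infty} f(x_n) = \lim_{n\to\infty} \frac{x_n^2 - 1}{x_n - 1} = \lim_{n\to\infty} (x_n + 1) = 2. \]
Hence,
\[ \lim_{x \to 1} \frac{x^2 - 1}{x - 1} = 2. \]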

(c) The function
\[ f(x) = \frac{x^2 + 1}{x - 1} \]
is defined on $D = \mathbb{R} \setminus \{1\}$. Consider the sequence $\{x_n\}$ with
\[ x_n = 1 + \frac{1}{n}. \]
We find that
\[ f(x_n) = \frac{\dfrac{1}{n^2} + \dfrac{2}{n} + 2}{\dfrac{1}{n}} = 2n + 2 + \frac{1}{n}. \]
The sequence $\{f(x_n)\}$ is not bounded, and so it is divergent. Hence,
the limit
\[ \lim_{x \to 1} \frac{x^2 + 1}{x - 1} \]
does not exist.
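
The computations in parts (b) and (c) can be checked along the particular sequence xn = 1 + 1/n. The sketch below is an illustration with our own names `f_b` and `f_c` for the two functions. It evaluates them exactly and compares with the closed forms 2 + 1/n and 2n + 2 + 1/n. Note that a single sequence can disprove the existence of a limit, as in (c), but cannot by itself establish one, as in (b).

```python
from fractions import Fraction

def f_b(x):
    """f(x) = (x^2 - 1) / (x - 1), defined for x != 1."""
    return (x * x - 1) / (x - 1)

def f_c(x):
    """f(x) = (x^2 + 1) / (x - 1), defined for x != 1."""
    return (x * x + 1) / (x - 1)

ns = range(1, 1001)
xs = {n: 1 + Fraction(1, n) for n in ns}   # x_n = 1 + 1/n, never equal to 1

matches_b = all(f_b(xs[n]) == 2 + Fraction(1, n) for n in ns)
matches_c = all(f_c(xs[n]) == 2 * n + 2 + Fraction(1, n) for n in ns)
print(matches_b, matches_c)
print(float(f_b(xs[1000])), float(f_c(xs[1000])))
```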

In part (b), we have used the fact that xn ̸= 1 to simplify f (xn ) to xn + 1.
Using laws for limits of sequences, it is immediate to see that limits of functions
respect taking linear combinations and products. They also respect taking
quotients provided that the limit of the function in the denominator is not 0.

Proposition 2.1 Limit Laws for Functions


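Let $D$ be a subset of real numbers. Given that $f : D \to \mathbb{R}$ and $g : D \to \mathbb{R}$
are functions defined on $D$, $x_0$ is a limit point of $D$, and
\[ \lim_{x \to x_0} f(x) = \ell_1, \qquad \lim_{x \to x_0} g(x) = \ell_2. \]

1. For any constants $\alpha$ and $\beta$, $\displaystyle \lim_{x \to x_0} (\alpha f + \beta g)(x) = \alpha \ell_1 + \beta \ell_2$.

2. $\displaystyle \lim_{x \to x_0} (fg)(x) = \ell_1 \ell_2$.

3. If $g(x) \neq 0$ for all $x \in D$, and $\ell_2 \neq 0$, then
\[ \lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{\ell_1}{\ell_2}. \]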

From this proposition, it follows that we can take limits of a rational function
easily at a point which is not a zero of the polynomial in the denominator.

Proposition 2.2

Let $p(x)$ and $q(x)$ be polynomials. If $x_0$ is a real number such that $q(x_0) \neq 0$, then
\[ \lim_{x \to x_0} \frac{p(x)}{q(x)} = \frac{p(x_0)}{q(x_0)}. \]

Let us now look at an example that involves the absolute values.

Example 2.2
Show that for any real number $x_0$,
\[ \lim_{x \to x_0} |x| = |x_0|. \]

Solution
Let $\{x_n\}$ be a sequence in $\mathbb{R} \setminus \{x_0\}$ that converges to $x_0$. By Question 1.5.4, the sequence $\{|x_n|\}$ converges to $|x_0|$. This proves that
\[ \lim_{x \to x_0} |x| = |x_0|. \]

Now we want to formulate an equivalent definition for limits.

Theorem 2.3 Equivalent Definitions for Limits

Let $D$ be a subset of real numbers, and let $x_0$ be a limit point of $D$. Given a function $f : D \to \mathbb{R}$, the following are two equivalent definitions for
\[ \lim_{x \to x_0} f(x) = \ell. \]

(i) Whenever $\{x_n\}$ is a sequence of points in $D \setminus \{x_0\}$ that converges to $x_0$, the sequence $\{f(x_n)\}$ converges to $\ell$.

(ii) For any $\varepsilon > 0$, there is a $\delta > 0$ such that if the point $x$ is in $D$ and $0 < |x - x_0| < \delta$, then $|f(x) - \ell| < \varepsilon$.

In logical notation, we can express (ii) as follows.

Logical Expression for Second Definition of Limits

\[ \lim_{x \to x_0} f(x) = \ell \iff \forall \varepsilon > 0,\ \exists \delta > 0,\ \forall x,\ (x \in D) \wedge (0 < |x - x_0| < \delta) \implies |f(x) - \ell| < \varepsilon. \]

Here δ is a measure of the closeness of the point x ∈ D to the point x0, and ε is a measure of the closeness of the function value f(x) to the number ℓ. The condition |x − x0| > 0 stresses that we only consider points x that are different from x0. From the definitions, we can see that the limit of a function f(x) as x approaches x0 does not depend on how the function f is defined at x0, and f does not need to be defined at x0 for the limit to be defined.
To prove Theorem 2.3, we need to show that (i) ⇐⇒ (ii). This is equivalent to (ii) =⇒ (i) and ¬ (ii) =⇒ ¬ (i).

Proof of Theorem 2.3


We start by showing that if (ii) holds, then (i) holds. Assume that (ii) holds.
To prove (i), we take a sequence {xn } in D \ {x0 } that converges to the
point x0 . We want to show that the sequence {f (xn )} converges to ℓ. We
prove this using the definition of convergence of sequences. Given ε > 0,
our assumption that (ii) holds implies that there is a δ > 0 such that for all x in D with 0 < |x − x0| < δ, we have |f(x) − ℓ| < ε. Since {xn} converges to x0, there is a positive integer N such that for all n ≥ N, |xn − x0| < δ. Since the points xn are in D \ {x0}, we also have |xn − x0| > 0. Hence, for all n ≥ N,
|f(xn) − ℓ| < ε.
Since we have shown that for all ε > 0, there is a positive integer N such
that for all n ≥ N , |f (xn ) − ℓ| < ε, we conclude that the sequence {f (xn )}
converges to ℓ. This proves (i) holds.
Now assume that (ii) is false. In logical notation, this means

∃ε > 0, ∀δ > 0, ∃x (x ∈ D) ∧ (0 < |x − x0 | < δ) ∧ |f (x) − ℓ| ≥ ε.

Namely, there is an $\varepsilon > 0$ such that for any $\delta > 0$, there is a point $x$ in $D \setminus \{x_0\}$ with $|x - x_0| < \delta$ but $|f(x) - \ell| \geq \varepsilon$. For this $\varepsilon > 0$, we construct a sequence $\{x_n\}$ in $D \setminus \{x_0\}$ in the following way. For each positive integer $n$, taking $\delta = 1/n$, there is a point $x_n$ in $D \setminus \{x_0\}$ such that $|x_n - x_0| < 1/n$ but $|f(x_n) - \ell| \geq \varepsilon$. Then $\{x_n\}$ is a sequence in $D \setminus \{x_0\}$ that satisfies
\[ |x_n - x_0| < \frac{1}{n}. \]
Since $\lim_{n \to \infty} 1/n = 0$, we find that the sequence $\{x_n\}$ converges to $x_0$. Since $|f(x_n) - \ell| \geq \varepsilon$ for all $n \in \mathbb{Z}^+$, the sequence $\{f(x_n)\}$ cannot converge to $\ell$. Hence, we have shown that there is a sequence $\{x_n\}$ in $D \setminus \{x_0\}$ that converges to $x_0$ but $\{f(x_n)\}$ does not converge to $\ell$. This proves that (i) does not hold.

Example 2.3 The Heaviside Function

The Heaviside function $H : \mathbb{R} \to \mathbb{R}$ is defined by
\[ H(x) = \begin{cases} 1, & \text{if } x \geq 0; \\ 0, & \text{if } x < 0. \end{cases} \]
For any real number $x_0$, determine whether the limit $\lim_{x \to x_0} H(x)$ exists.

Figure 2.1: The Heaviside function H(x).

Solution
We consider the cases where $x_0 > 0$, $x_0 < 0$ and $x_0 = 0$.

Case 1: $x_0 > 0$. In this case, we claim that $\lim_{x \to x_0} H(x) = 1$.
Given $\varepsilon > 0$, take $\delta = x_0$. Then $\delta > 0$. If $x$ is in $\mathbb{R}$ and $0 < |x - x_0| < \delta = x_0$, we have $x - x_0 > -x_0$ and hence $x > 0$. Thus $H(x) = 1$ and
\[ |H(x) - 1| = 0 < \varepsilon. \]
This proves that $\lim_{x \to x_0} H(x) = 1$.

Case 2: $x_0 < 0$. In this case, we claim that $\lim_{x \to x_0} H(x) = 0$.
Given $\varepsilon > 0$, take $\delta = -x_0$. Then $\delta > 0$. If $x$ is in $\mathbb{R}$ and $0 < |x - x_0| < \delta = -x_0$, we have $x - x_0 < -x_0$ and hence $x < 0$. Thus $H(x) = 0$ and
\[ |H(x) - 0| = 0 < \varepsilon. \]
This proves that $\lim_{x \to x_0} H(x) = 0$.

Case 3: $x_0 = 0$. In this case, we claim that $\lim_{x \to 0} H(x)$ does not exist. Let $\{u_n\}$ and $\{v_n\}$ be the sequences $\{1/n\}$ and $\{-1/n\}$ respectively. They are both sequences in $\mathbb{R} \setminus \{0\}$ that converge to 0. We have
\[ H(u_n) = 1 \quad \text{and} \quad H(v_n) = 0 \quad \text{for all } n \in \mathbb{Z}^+. \]
Therefore,
\[ \lim_{n \to \infty} H(u_n) = 1 \quad \text{and} \quad \lim_{n \to \infty} H(v_n) = 0. \]
Since $\{H(x_n)\}$ has different limits when we consider two different sequences $\{x_n\}$ in $\mathbb{R} \setminus \{0\}$ that converge to 0, we conclude that $\lim_{x \to 0} H(x)$ does not exist.

In this example, we can also use the $\varepsilon$-$\delta$ definition to show that $\lim_{x \to 0} H(x)$ does not exist. Assume that $\lim_{x \to 0} H(x)$ exists and is equal to $\ell$. Take $\varepsilon = 1/2$. There exists $\delta > 0$ such that for any $x \in \mathbb{R}$, if $0 < |x - 0| < \delta$, then
\[ |H(x) - \ell| < \varepsilon. \]
Now the points $x = x_1 = -\delta/2$ and $x = x_2 = \delta/2$ both satisfy $0 < |x - 0| < \delta$. We have $H(x_1) = 0$ and $H(x_2) = 1$. By the triangle inequality,
\[ |H(x_1) - H(x_2)| \leq |H(x_1) - \ell| + |H(x_2) - \ell| < 2\varepsilon = 1. \]
This gives
\[ 1 = |H(x_1) - H(x_2)| < 1, \]
which is a contradiction. Hence, $\lim_{x \to 0} H(x)$ does not exist.
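The two arguments for the Heaviside function at 0 can be illustrated numerically. The sketch below is only an illustration under the definition of Example 2.3; the function names `heaviside`, `one_sided_samples` and `gap_within` are hypothetical helpers, not notation from the text.

```python
def heaviside(x):
    """H(x) = 1 if x >= 0, and 0 otherwise, as in Example 2.3."""
    return 1 if x >= 0 else 0


def one_sided_samples(n_max):
    """Values of H along u_n = 1/n and v_n = -1/n for n = 1, ..., n_max."""
    right = [heaviside(1 / n) for n in range(1, n_max + 1)]
    left = [heaviside(-1 / n) for n in range(1, n_max + 1)]
    return right, left


def gap_within(delta):
    """|H(x1) - H(x2)| for x1 = -delta/2 and x2 = delta/2."""
    return abs(heaviside(-delta / 2) - heaviside(delta / 2))


if __name__ == "__main__":
    right, left = one_sided_samples(5)
    print(right, left)
    # However small delta is, two points within delta of 0 have values 1 apart.
    print([gap_within(d) for d in (1.0, 0.1, 1e-6)])
```

The constant gap of 1 is exactly what rules out any candidate limit when $\varepsilon = 1/2$.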
In calculus, we have defined the concepts of left limits and right limits to deal
with functions like the Heaviside function, which is defined by cases. Given a
subset of real numbers D and a point x0 , define

\[ D_{x_0,-} = \{x \in D \mid x < x_0\}, \qquad D_{x_0,+} = \{x \in D \mid x > x_0\}. \]
For example, consider $D = [0, 2)$. If $x_0 = 1$, then $D_{1,-} = [0, 1)$ and $D_{1,+} = (1, 2)$. If $x_0 = 0$, then $D_{0,-} = \emptyset$ and $D_{0,+} = (0, 2)$. If $x_0 = 2$, then $D_{2,-} = [0, 2)$ and $D_{2,+} = \emptyset$.

Notice that even though x0 is a limit point of D, it might not be a limit point of
Dx0 ,− or Dx0 ,+ . We define the left limit and right limit of a function f : D → R
when x approaches x0 in the following way.

Definition 2.3 Left Limits and Right Limits

Let $D$ be a subset of real numbers and let $f : D \to \mathbb{R}$ be a function defined on $D$.

1. If $x_0$ is a limit point of $D_{x_0,-}$, then $D_{x_0,-}$ is not an empty set. We say that the limit of the function $f : D \to \mathbb{R}$ as $x$ approaches $x_0$ from the left exists provided that the limit of the function $f : D_{x_0,-} \to \mathbb{R}$ as $x$ approaches $x_0$ exists. If the left limit exists, it is denoted by
\[ \lim_{x \to x_0,\, x < x_0} f(x) \quad \text{or simply as} \quad \lim_{x \to x_0^-} f(x). \]

2. If $x_0$ is a limit point of $D_{x_0,+}$, then $D_{x_0,+}$ is not an empty set. We say that the limit of the function $f : D \to \mathbb{R}$ as $x$ approaches $x_0$ from the right exists provided that the limit of the function $f : D_{x_0,+} \to \mathbb{R}$ as $x$ approaches $x_0$ exists. If the right limit exists, it is denoted by
\[ \lim_{x \to x_0,\, x > x_0} f(x) \quad \text{or simply as} \quad \lim_{x \to x_0^+} f(x). \]

Left Limits, Right Limits, and Limits

1. If $x_0$ is a limit point of both $D_{x_0,+}$ and $D_{x_0,-}$, then $\lim_{x \to x_0} f(x)$ exists if and only if both $\lim_{x \to x_0^-} f(x)$ and $\lim_{x \to x_0^+} f(x)$ exist and they are equal.

2. If $x_0$ is a limit point of $D_{x_0,-}$ but is not a limit point of $D_{x_0,+}$, then $\lim_{x \to x_0} f(x)$ exists if and only if $\lim_{x \to x_0^-} f(x)$ exists.

3. If $x_0$ is a limit point of $D_{x_0,+}$ but is not a limit point of $D_{x_0,-}$, then $\lim_{x \to x_0} f(x)$ exists if and only if $\lim_{x \to x_0^+} f(x)$ exists.

Example 2.4
For the Heaviside function, we have
\[ \lim_{x \to 0^-} H(x) = 0 \quad \text{and} \quad \lim_{x \to 0^+} H(x) = 1. \]
Since the left and right limits are not equal, $\lim_{x \to 0} H(x)$ does not exist.

Example 2.5 Dirichlet's Function

Dirichlet's function is the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) = \begin{cases} 1, & \text{if } x \text{ is rational}, \\ 0, & \text{if } x \text{ is irrational}. \end{cases} \]
For any real number $x_0$, determine whether the limit $\lim_{x \to x_0} f(x)$ exists.

This is a classical example of a function whose graph we cannot visualize.

Solution
Fix a real number $x_0$. For any positive integer $n$, there is a rational number $p_n$ and an irrational number $q_n$ in the open interval $(x_0 - 1/n, x_0)$. The sequences $\{p_n\}$ and $\{q_n\}$ are in $\mathbb{R} \setminus \{x_0\}$ and converge to $x_0$. Since
\[ f(p_n) = 1 \quad \text{and} \quad f(q_n) = 0 \quad \text{for all } n \in \mathbb{Z}^+, \]
we find that
\[ \lim_{n \to \infty} f(p_n) = 1 \quad \text{and} \quad \lim_{n \to \infty} f(q_n) = 0. \]
Since the sequence $\{f(x_n)\}$ has different limits when we consider two different sequences $\{x_n\}$ in $\mathbb{R} \setminus \{x_0\}$ that converge to $x_0$, we conclude that $\lim_{x \to x_0} f(x)$ does not exist.

For this example, if one wants to use the $\varepsilon$-$\delta$ definition of limits, one can proceed in the following way. For fixed $x_0$ in $\mathbb{R}$, assume that $\lim_{x \to x_0} f(x) = \ell$. When $\varepsilon = 1/2$, there is a $\delta > 0$ such that for any $x$ with $0 < |x - x_0| < \delta$, $|f(x) - \ell| < \varepsilon$. The open interval $(x_0 - \delta, x_0)$ contains a rational number $x_1$ and an irrational number $x_2$. Notice that $f(x_1) = 1$ and $f(x_2) = 0$. Both $x = x_1$ and $x = x_2$ satisfy $0 < |x - x_0| < \delta$. By the triangle inequality,
\[ |f(x_1) - f(x_2)| \leq |f(x_1) - \ell| + |f(x_2) - \ell| < 2\varepsilon = 1. \]
This gives
\[ 1 = |f(x_1) - f(x_2)| < 1, \]
which is a contradiction. Hence, $\lim_{x \to x_0} f(x)$ does not exist.
Next, we consider composite functions.

Proposition 2.4

Given the two functions $f : D \to \mathbb{R}$ and $g : U \to \mathbb{R}$, if $f(D) \subset U$, we can define the composite function $h = g \circ f : D \to \mathbb{R}$ by $h(x) = g(f(x))$. If $x_0$ is a limit point of $D$, $y_0$ is a limit point of $U$, $f(D \setminus \{x_0\}) \subset U \setminus \{y_0\}$, and
\[ \lim_{x \to x_0} f(x) = y_0, \qquad \lim_{y \to y_0} g(y) = \ell, \]
then
\[ \lim_{x \to x_0} h(x) = \lim_{x \to x_0} (g \circ f)(x) = \ell. \]

Proof
Let $\{x_n\}$ be a sequence in $D \setminus \{x_0\}$ that converges to $x_0$, and let $y_n = f(x_n)$ for all $n \in \mathbb{Z}^+$. By assumption, $\{y_n\}$ is a sequence in $U \setminus \{y_0\}$. Since $\lim_{x \to x_0} f(x) = y_0$, the sequence $\{f(x_n)\}$ converges to $y_0$. Since $\lim_{y \to y_0} g(y) = \ell$, the sequence $\{g(y_n)\}$ converges to $\ell$. In other words, the sequence $\{(g \circ f)(x_n)\}$ converges to $\ell$.
Since we have proved that whenever $\{x_n\}$ is a sequence in $D \setminus \{x_0\}$ that converges to $x_0$, the sequence $\{(g \circ f)(x_n)\}$ converges to $\ell$, we conclude that
\[ \lim_{x \to x_0} (g \circ f)(x) = \ell. \]

Using the result of Example 2.2, we obtain the following.


Corollary 2.5
Let $D$ be a subset of real numbers. Given a function $f : D \to \mathbb{R}$, if $x_0$ is a limit point of $D$ and $\lim_{x \to x_0} f(x) = \ell$, then
\[ \lim_{x \to x_0} |f(x)| = |\ell|. \]

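Example 2.6
For any $x_0 \geq 0$, show that
\[ \lim_{x \to x_0} \sqrt{x} = \sqrt{x_0}. \]

Solution
Let us use the $\varepsilon$-$\delta$ definition of limits. Consider the case $x_0 = 0$ first. Given $\varepsilon > 0$, take $\delta = \varepsilon^2$. Then $\delta > 0$. If $x \geq 0$ is such that $0 < |x - 0| < \delta = \varepsilon^2$, we have $0 < x < \varepsilon^2$, which implies that $0 < \sqrt{x} < \varepsilon$. Hence, if $x \geq 0$ and $0 < |x - 0| < \delta$,
\[ |\sqrt{x} - \sqrt{0}| < \varepsilon. \]
This proves that
\[ \lim_{x \to 0} \sqrt{x} = 0 = \sqrt{x_0}. \]

Now consider the case $x_0 > 0$. Notice that
\[ \sqrt{x} - \sqrt{x_0} = \frac{x - x_0}{\sqrt{x} + \sqrt{x_0}}. \]
If $x > x_0/4$, then $\sqrt{x} > \sqrt{x_0}/2$ and
\[ \frac{1}{\sqrt{x} + \sqrt{x_0}} < \frac{2}{3\sqrt{x_0}}. \]
Given $\varepsilon > 0$, let $\delta = \min\left\{ \dfrac{3}{4} x_0,\ \dfrac{3}{2} \varepsilon \sqrt{x_0} \right\}$. Then $\delta > 0$. If $x \geq 0$ and $0 < |x - x_0| < \delta$, then $|x - x_0| < \dfrac{3}{4} x_0$ and so $x > \dfrac{1}{4} x_0$. Therefore,
\[ \left| \sqrt{x} - \sqrt{x_0} \right| = \frac{|x - x_0|}{\sqrt{x} + \sqrt{x_0}} < \delta \times \frac{2}{3\sqrt{x_0}} \leq \varepsilon. \]
This proves that $\lim_{x \to x_0} \sqrt{x} = \sqrt{x_0}$.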
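The choice of $\delta$ in Example 2.6 can be spot-checked numerically. The sketch below is a finite sample check, not a proof; `delta_for_sqrt` and `sqrt_delta_works` are hypothetical helper names, and the sample count is an arbitrary assumption.

```python
import math


def delta_for_sqrt(x0, eps):
    """The delta chosen in Example 2.6 for f(x) = sqrt(x) at a point x0 >= 0."""
    if x0 == 0:
        return eps**2
    return min(0.75 * x0, 1.5 * eps * math.sqrt(x0))


def sqrt_delta_works(x0, eps, samples=1000):
    """Check |sqrt(x) - sqrt(x0)| < eps at sample points with 0 < |x - x0| < delta."""
    delta = delta_for_sqrt(x0, eps)
    for k in range(1, samples):
        t = k / samples  # 0 < t < 1, so 0 < t * delta < delta
        for x in (x0 - t * delta, x0 + t * delta):
            # Only points of the domain [0, infinity) are relevant.
            if x >= 0 and not abs(math.sqrt(x) - math.sqrt(x0)) < eps:
                return False
    return True


if __name__ == "__main__":
    for x0 in (0, 0.01, 1, 4, 100):
        for eps in (1, 0.1, 0.001):
            print(x0, eps, delta_for_sqrt(x0, eps), sqrt_delta_works(x0, eps))
```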

Figure 2.2: (a) The function $f(x) = \sqrt{x}$. (b) The function $f(x) = \sqrt[3]{x}$.

Using similar methods, one can prove that if $n$ is a positive integer, then for any $x_0$ in the domain of the function $f(x) = \sqrt[n]{x}$,
\[ \lim_{x \to x_0} \sqrt[n]{x} = \sqrt[n]{x_0}. \]

Now we want to give a brief discussion about limits that involve infinities.

Definition 2.4 Infinity as Limits of Sequences

Let $\{a_n\}$ be a sequence of real numbers.

1. We say that the sequence $\{a_n\}$ diverges to $\infty$, written as $\lim_{n \to \infty} a_n = \infty$, if for every positive number $M$, there is a positive integer $N$ such that for all $n \geq N$, $a_n \geq M$.

2. We say that the sequence $\{a_n\}$ diverges to $-\infty$, written as $\lim_{n \to \infty} a_n = -\infty$, if for every positive number $M$, there is a positive integer $N$ such that for all $n \geq N$, $a_n \leq -M$.

Example 2.7

(a) The sequence $\{n^2\}$ diverges to $\infty$.

(b) The sequence $\{-n^2\}$ diverges to $-\infty$.

(c) The sequence $\{(-1)^n n^2\}$ neither diverges to $\infty$ nor to $-\infty$.
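To make Definition 2.4 concrete for part (a), one can compute, for a given bound $M$, an index $N$ beyond which $n^2 \geq M$. The sketch below assumes $M$ is a positive integer so that exact integer arithmetic can be used; the names `first_index_for_squares` and `alternating_term` are illustrative only.

```python
import math


def first_index_for_squares(M):
    """Smallest positive integer N such that n**2 >= M for all n >= N.

    Assumes M is a positive integer.
    """
    return math.isqrt(M - 1) + 1


def alternating_term(n):
    """The n-th term of the sequence in part (c): (-1)**n * n**2."""
    return (-1) ** n * n**2


if __name__ == "__main__":
    for M in (1, 10, 16, 17, 10**6):
        print(M, first_index_for_squares(M))
    # In part (c), both signs occur beyond any index N, so neither
    # condition of Definition 2.4 can hold.
    N = 1000
    print(alternating_term(2 * N), alternating_term(2 * N + 1))
```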

Let {xn} be a sequence of real numbers. If {xn} diverges to ∞ or −∞, there is a positive integer N such that xn ≠ 0 for all n ≥ N. Hence, for sequences that diverge to ∞ or −∞, we can assume that none of the terms is zero.
The following is another characterization of boundedness for a set in terms of sequences that diverge to infinity.

1. A set D is not bounded above if and only if there is a sequence {xn} in D that diverges to ∞.

2. A set D is not bounded below if and only if there is a sequence {xn} in D that diverges to −∞.

Using these, we can make the following definitions.

Definition 2.5 Limits of Functions at Infinity

Let $D$ be a subset of real numbers that is not bounded above. Let $\ell$ be a real number and let $f : D \to \mathbb{R}$ be a function defined on the set $D$. The following two definitions for
\[ \lim_{x \to \infty} f(x) = \ell \]
are equivalent.

(i) Whenever $\{x_n\}$ is a sequence of points in $D$ that diverges to $\infty$, the sequence $\{f(x_n)\}$ converges to $\ell$.

(ii) For any $\varepsilon > 0$, there is a positive number $M$ such that if the point $x$ is in $D$ and $x > M$, then
\[ |f(x) - \ell| < \varepsilon. \]

Definition 2.6 Limits of Functions at Negative Infinity

Let $D$ be a subset of real numbers that is not bounded below. Let $\ell$ be a real number and let $f : D \to \mathbb{R}$ be a function defined on the set $D$. The following are two equivalent definitions for
\[ \lim_{x \to -\infty} f(x) = \ell. \]

(i) Whenever $\{x_n\}$ is a sequence of points in $D$ that diverges to $-\infty$, the sequence $\{f(x_n)\}$ converges to $\ell$.

(ii) For any $\varepsilon > 0$, there is a positive number $M$ such that if the point $x$ is in $D$ and $x < -M$, then
\[ |f(x) - \ell| < \varepsilon. \]

Now let us look at a simple example.

Example 2.8
Show that $\displaystyle \lim_{x \to \infty} \frac{1}{x} = 0$.

Solution
We use both definitions to prove the statement.
Using the sequence definition, let $\{x_n\}$ be a sequence of nonzero real numbers that diverges to $\infty$. We want to show that the sequence $\{1/x_n\}$ converges to 0. Given $\varepsilon > 0$, the number $M = 1/\varepsilon$ is also positive. Since the sequence $\{x_n\}$ diverges to $\infty$, there is a positive integer $N$ such that for all $n \geq N$,
\[ x_n > M = \frac{1}{\varepsilon}. \]
In particular, for all $n \geq N$, $x_n > 0$ and $0 < \dfrac{1}{x_n} < \varepsilon$. This proves that the sequence $\{1/x_n\}$ converges to 0. Therefore, $\displaystyle \lim_{x \to \infty} \frac{1}{x} = 0$.
Now consider the definition in terms of $\varepsilon$. Given $\varepsilon > 0$, let $M = 1/\varepsilon$. Then $M$ is a positive number. If $x$ in $\mathbb{R} \setminus \{0\}$ is such that $x > M$, then
\[ 0 < \frac{1}{x} < \frac{1}{M} = \varepsilon. \]
This proves that $\displaystyle \lim_{x \to \infty} \frac{1}{x} = 0$.

This example demonstrates that working with the definition in terms of ε is sometimes easier.
It is easy to see that the limit laws given in Proposition 2.1 and Proposition 2.4 also hold for the case where x → ∞ or x → −∞. We will skip the formulation and use them directly. For example, we have the following.

Example 2.9
For any positive integer $n$, $\displaystyle \lim_{x \to \infty} \frac{1}{x^n} = 0$.

Now let us look at some more examples.

Figure 2.3: (a) The function $f(x) = 1/x$. (b) The function $f(x) = 1/x^2$.

Example 2.10
Determine whether the limit
\[ \lim_{x \to \infty} \frac{2x^2 + 3x + 4}{x^2 + 7} \]
exists. If it exists, find the limit.

Solution
Dividing the numerator and denominator by $x^2$, we have
\[ \frac{2x^2 + 3x + 4}{x^2 + 7} = \frac{2 + \dfrac{3}{x} + \dfrac{4}{x^2}}{1 + \dfrac{7}{x^2}}. \]
Using limit laws and the fact that $\lim_{x \to \infty} 1/x = 0$, we find that
\[ \lim_{x \to \infty} \frac{2x^2 + 3x + 4}{x^2 + 7} = \frac{2 + 0 + 0}{1 + 0} = 2. \]
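A quick floating-point evaluation is consistent with the value 2 found in Example 2.10. This is only a numerical illustration; the names `r` and `r_divided` are introduced here and are not part of the text.

```python
def r(x):
    """The rational function of Example 2.10."""
    return (2 * x**2 + 3 * x + 4) / (x**2 + 7)


def r_divided(x):
    """The same function after dividing numerator and denominator by x**2 (x != 0)."""
    return (2 + 3 / x + 4 / x**2) / (1 + 7 / x**2)


if __name__ == "__main__":
    for x in (10.0, 1e2, 1e4, 1e6):
        # The two forms agree up to rounding, and both drift towards 2.
        print(x, r(x), r_divided(x))
```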
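Example 2.11
Determine whether the limit
\[ \lim_{x \to -\infty} \frac{x}{\sqrt{x^2 + 1}} \]
exists. If it exists, find the limit.

Solution
Notice that $\sqrt{x^2} = |x|$. Hence, for $x \neq 0$,
\[ \frac{x}{\sqrt{x^2 + 1}} = \frac{x}{|x| \sqrt{1 + \dfrac{1}{x^2}}}. \]
When $x < 0$,
\[ \frac{x}{|x|} = -1. \]
Therefore,
\[ \lim_{x \to -\infty} \frac{x}{|x|} = -1. \]
On the other hand, since
\[ \lim_{x \to -\infty} \left( 1 + \frac{1}{x^2} \right) = 1 \quad \text{and} \quad \lim_{y \to 1} \sqrt{y} = 1, \]
we find that
\[ \lim_{x \to -\infty} \frac{1}{\sqrt{1 + \dfrac{1}{x^2}}} = 1. \]
Hence,
\[ \lim_{x \to -\infty} \frac{x}{\sqrt{x^2 + 1}} = -1. \]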
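The sign behaviour in Example 2.11 is easy to see numerically. The following sketch evaluates the function at large negative and large positive inputs; the name `g` is an illustrative choice, and the sample points are arbitrary.

```python
import math


def g(x):
    """The function of Example 2.11."""
    return x / math.sqrt(x**2 + 1)


if __name__ == "__main__":
    for x in (-10.0, -1e3, -1e6):
        print(x, g(x))  # values approach -1 from above
    for x in (10.0, 1e3, 1e6):
        print(x, g(x))  # for contrast, values approach 1 from below
```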

Figure 2.4: The function $f(x) = \dfrac{x}{\sqrt{x^2 + 1}}$.

Remark 2.1
Using similar ideas, one can formulate analogous definitions for the following limits.
\[ \lim_{x \to x_0} f(x) = \infty, \qquad \lim_{x \to x_0} f(x) = -\infty, \]
\[ \lim_{x \to \infty} f(x) = \infty, \qquad \lim_{x \to \infty} f(x) = -\infty, \]
\[ \lim_{x \to -\infty} f(x) = \infty, \qquad \lim_{x \to -\infty} f(x) = -\infty. \]

Finally, we would like to mention that there is an analogue of the squeeze theorem for functions, whose proof is straightforward.

Theorem 2.6 Squeeze Theorem

Let $D$ be a subset of real numbers. Given that $f : D \to \mathbb{R}$, $g : D \to \mathbb{R}$, $h : D \to \mathbb{R}$ are functions defined on $D$ and
\[ g(x) \leq f(x) \leq h(x) \quad \text{for all } x \in D. \]
If $x_0$ is a limit point of $D$ and
\[ \lim_{x \to x_0} g(x) = \lim_{x \to x_0} h(x) = \ell, \]
then
\[ \lim_{x \to x_0} f(x) = \ell. \]

Exercises 2.1
Question 1
Find the limit if it exists.

(a) $\displaystyle \lim_{x \to -1} \frac{x^2 + 3x + 2}{x^2 + 1}$

(b) $\displaystyle \lim_{x \to -1} \frac{x^2 - 3x - 4}{x + 1}$

(c) $\displaystyle \lim_{x \to -1} \frac{x^2 + 1}{x + 1}$

(d) $\displaystyle \lim_{x \to -1} \frac{x^2 - 3x - 4}{x + 1}$

Question 2
Define the function $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) = \begin{cases} x + 3, & \text{if } x \geq 1, \\ 5 - x, & \text{if } x < 1. \end{cases} \]
For any real number $x_0$, determine whether the limit $\lim_{x \to x_0} f(x)$ exists.

Question 3
Define the function $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) = \begin{cases} x, & \text{if } x \text{ is rational}; \\ -x, & \text{if } x \text{ is irrational}. \end{cases} \]

(a) Use the squeeze theorem to show that $\lim_{x \to 0} f(x)$ exists and find the limit.

(b) If $x_0 \neq 0$, show that $\lim_{x \to x_0} f(x)$ does not exist.

Question 4
Determine whether the limit exists. If it exists, find the limit.

(a) $\displaystyle \lim_{x \to -\infty} \frac{2x^2 + x + 4}{5x^2 + 2}$

(b) $\displaystyle \lim_{x \to \infty} \frac{2x + 3}{\sqrt{4x^2 + 1}}$

Question 5
For any $x_0 \geq 0$, show that
\[ \lim_{x \to x_0} \sqrt[4]{x} = \sqrt[4]{x_0}. \]

2.2 Continuity of Functions

In this section, we introduce the concept of continuity of functions.

Definition 2.7 Continuity


Let D be a subset of real numbers that contains the point x0 , and let
f : D → R be a function defined on D. We say that the function f is
continuous at x0 provided that whenever {xn } is a sequence of points in
D that converges to x0 , the sequence {f (xn )} converges to f (x0 ).
We say that f : D → R is a continuous function if it is continuous at every
point of its domain D.

The definitions of limit and continuity are very similar. However, there is a
slight difference. To define continuity at a point x0 , x0 must be a point in the
domain of the function D. To define limit, x0 does not need to be a point in the
domain D but has to be a limit point of D. When the point x0 is in D and is also
a limit point of D, the relation between limit and continuity is as follows.

Proposition 2.7 Relation Between Limit and Continuity

Let $D$ be a subset of real numbers that contains the point $x_0$, and let $f : D \to \mathbb{R}$ be a function defined on $D$. If $x_0$ is a limit point of $D$, then $f$ is continuous at $x_0$ if and only if
\[ \lim_{x \to x_0} f(x) = f(x_0). \]

In other words, it says that if $x_0$ is a limit point of the domain $D$, then $f$ is continuous at $x_0$ if and only if
\[ \lim_{x \to x_0} \left( f(x) - f(x_0) \right) = 0, \]
if and only if
\[ \lim_{x \to x_0} |f(x) - f(x_0)| = 0. \]

The following fact is quite obvious.



Proposition 2.8
Let D be a subset of real numbers and let f : D → R be a function defined
on D. If f : D → R is continuous, then for any subset A of D, the function
f : A → R, which is the restriction of f to A, is also continuous.

Example 2.12
Proposition 2.2 says that a rational function is continuous.

Example 2.13

Example 2.3 says that the Heaviside function $H(x)$ is continuous at $x$ if $x \neq 0$. It is not continuous at $x = 0$.

Example 2.14
Example 2.5 says that Dirichlet's function is nowhere continuous.

Example 2.15

Example 2.6 says that the function $f(x) = \sqrt{x}$ is continuous. In general, for any positive integer $n$, the function $f(x) = \sqrt[n]{x}$ is continuous.

A natural question to ask is the continuity of a function at an isolated point of


its domain. Let us first prove the following.

Lemma 2.9
Let D be a subset of real numbers and let x0 be an isolated point of D. If {xn} is a sequence of points in D that converges to x0, then there is a positive integer N such that xn = x0 for all n ≥ N.

Proof
By Theorem 1.38, there is a neighbourhood (a, b) of x0 which intersects D
at x0 only. Let ε = min{x0 − a, b − x0 }. Then ε > 0 and (x0 − ε, x0 +
ε) ⊂ (a, b). Since the sequence {xn } converges to x0 , there is a positive
integer N such that for all n ≥ N , |xn − x0 | < ε. Hence, for all n ≥ N ,
xn ∈ (x0 − ε, x0 + ε) ⊂ (a, b). Since (a, b) ∩ D = {x0 }, we find that
xn = x0 for all n ≥ N .

Using this lemma, it is easy to prove the continuity of a function at an isolated


point of its domain.

Proposition 2.10 Continuity at an Isolated Point

Let D be a subset of real numbers that contains the point x0, and let f : D → R be a function defined on D. If x0 is an isolated point of D, then f is continuous at x0.

Proof
If {xn} is a sequence in D that converges to x0, Lemma 2.9 says that there is a positive integer N such that xn = x0 for all n ≥ N. Therefore, f(xn) = f(x0) for all n ≥ N. This implies that the sequence {f(xn)} converges to f(x0). By the definition of continuity, f is continuous at x0.

Example 2.16

Since every point of the set of positive integers Z+ is an isolated point, any
function f : Z+ → R defined on the set of positive integers is continuous.

This conclusion might be a bit counterintuitive for students who see it for the first time. One can think about it naively in the following way. An isolated point has no close neighbours to be compared to. Hence, the limit operation does not apply, and thus the function is continuous by default.
Let us summarize again the continuity of a function at a point.

Continuity of a Function at a Point

Let $D$ be a subset of real numbers and let $x_0$ be a point in $D$. Given that $f : D \to \mathbb{R}$ is a function defined on $D$.

1. If $x_0$ is an isolated point of $D$, then $f$ is continuous at $x_0$.

2. If $x_0$ is a limit point of $D$, then $f$ is continuous at $x_0$ if and only if
\[ \lim_{x \to x_0} f(x) = f(x_0). \]

Similar to limits, we also have an equivalent definition for continuity in terms of $\delta$ and $\varepsilon$.

Theorem 2.11 Equivalent Definitions for Continuity

Let D be a subset of real numbers and let x0 be a point in D. Given a function f : D → R, the following two definitions for f to be continuous at x0 are equivalent.

(i) Whenever {xn} is a sequence of points in D that converges to x0, the sequence {f(xn)} converges to f(x0).

(ii) For any ε > 0, there is a δ > 0 such that if the point x is in D and |x − x0| < δ, then |f(x) − f(x0)| < ε.

The proof of Theorem 2.11 is almost identical to the proof of Theorem 2.3.

Example 2.17

Use the $\varepsilon$-$\delta$ definition to show that the function $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ defined by $f(x) = \dfrac{1}{x}$ is continuous.

Solution
The domain of the function is $D = \mathbb{R} \setminus \{0\}$. Let $x_0$ be a point in $D$. Then $x_0 \neq 0$. Notice that
\[ |f(x) - f(x_0)| = \left| \frac{1}{x} - \frac{1}{x_0} \right| = \frac{|x - x_0|}{|x||x_0|}. \tag{2.1} \]
If
\[ |x - x_0| < \frac{|x_0|}{2}, \]
then
\[ |x| > \frac{|x_0|}{2} > 0. \]
Given $\varepsilon > 0$, let
\[ \delta = \min\left\{ \frac{|x_0|}{2},\ \frac{|x_0|^2}{2} \varepsilon \right\}. \]
If $x$ in $D$ is such that $|x - x_0| < \delta$, then $|x - x_0| < \dfrac{|x_0|}{2}$ and so $|x| > \dfrac{|x_0|}{2}$. It follows from (2.1) that
\[ |f(x) - f(x_0)| < \delta \times \frac{2}{|x_0|^2} \leq \varepsilon. \]
This proves that $f$ is continuous at $x_0$.

From Proposition 2.1, it follows immediately that continuity is preserved when


we perform certain operations on functions.

Proposition 2.12
Let D be a subset of real numbers that contains the point x0 . Given that the
functions f : D → R and g : D → R are continuous at x0 .

1. For any constants α and β, the function αf + βg : D → R is continuous


at x0 .

2. The function (f g) : D → R is continuous at x0 .

3. If g(x) ≠ 0 for all x ∈ D, then the function (f/g) : D → R is continuous at x0.

For composition of functions, we have the following, which is a counterpart of Proposition 2.4.

Proposition 2.13

Let f : D → R and g : U → R be two functions with f(D) ⊂ U. If x0 is a point of D, f is continuous at x0, and g is continuous at y0 = f(x0), then the composite function (g ◦ f) : D → R is continuous at x0.

This proposition can be proved easily using the definition of continuity in terms of convergent sequences.

Corollary 2.14
Let D be a subset of real numbers that contains the point x0 . If the function
f : D → R is continuous at x0 , then the function |f | : D → R is also
continuous at x0 .

Let us now look at an example of a piecewise function.

Example 2.18

Let $f : [-2, 3] \to \mathbb{R}$ be the function defined by
\[ f(x) = \begin{cases} 2x^2 - 3, & \text{if } -2 \leq x \leq 1, \\ cx + 2, & \text{if } 1 < x \leq 3. \end{cases} \]
Show that there is a value of $c$ for which $f$ is a continuous function.

Solution
The domain of the function $f$ is $D = [-2, 3]$. First we show that if $x_0 \in D \setminus \{1\}$, then $f$ is continuous at $x_0$.
If $x_0 \in [-2, 1)$, then $x_0 < 1$. If $\{x_n\}$ is a sequence in $D$ that converges to $x_0$, then there is a positive integer $N$ such that $x_n < 1$ for all $n \geq N$. This implies that for all $n \geq N$, $f(x_n) = 2x_n^2 - 3$. Hence, the sequence $\{f(x_n)\}$ converges to $f(x_0) = 2x_0^2 - 3$. This proves that $f$ is continuous at $x_0$.
Using similar arguments, we can show that if $x_0 \in (1, 3]$, $f$ is continuous at $x_0$.
Now, by the definitions of left limits and right limits,
\[ \lim_{x \to 1^-} f(x) = \lim_{x \to 1^-} (2x^2 - 3) = -1; \]
\[ \lim_{x \to 1^+} f(x) = \lim_{x \to 1^+} (cx + 2) = c + 2. \]
For $f$ to be continuous at $x_0 = 1$, $\lim_{x \to 1} f(x)$ must exist. So we must have
\[ \lim_{x \to 1^-} f(x) = \lim_{x \to 1^+} f(x). \]
This gives $c = -3$. In fact, when $c = -3$,
\[ \lim_{x \to 1} f(x) = -1 = f(1), \]
and hence $f$ is continuous at $x = 1$.
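The matching condition at $x = 1$ can be illustrated by comparing values of the two pieces just to the left and right of 1. In this sketch the parameter name `c` follows Example 2.18, while `jump_at_one` and the step size `h` are assumptions made for the illustration.

```python
def f(x, c):
    """The piecewise function of Example 2.18 on [-2, 3], with parameter c."""
    if not -2 <= x <= 3:
        raise ValueError("x must lie in [-2, 3]")
    return 2 * x**2 - 3 if x <= 1 else c * x + 2


def jump_at_one(c, h=1e-9):
    """Approximate (value just right of 1) minus (value just left of 1)."""
    return f(1 + h, c) - f(1 - h, c)


if __name__ == "__main__":
    for c in (-3, 0, 1):
        # The jump is close to (c + 2) - (-1) = c + 3, so it vanishes only for c = -3.
        print(c, jump_at_one(c))
```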

Figure 2.5: The function defined in Example 2.18.



Remark 2.2
We can formulate a general theorem from Example 2.18 as follows.
Let $D$ be a subset of real numbers that contains the point $x_0$, and let $D_{x_0,-}$ and $D_{x_0,+}$ be the intersections of $D$ with the sets $\{x \mid x < x_0\}$ and $\{x \mid x > x_0\}$ respectively. Suppose that $x_0$ is a limit point of both $D_{x_0,-}$ and $D_{x_0,+}$, and $f : D \to \mathbb{R}$ is a function such that its restrictions to $D_{x_0,-}$ and $D_{x_0,+}$ are continuous. If
\[ \lim_{x \to x_0^-} f(x) = \lim_{x \to x_0^+} f(x) = f(x_0), \]
then $f : D \to \mathbb{R}$ is a continuous function.

Finally, we define a special class of continuous functions called Lipschitz functions.

Definition 2.8 Lipschitz Function


Let D be a subset of real numbers. A function f : D → R is said to be a
Lipschitz function if there is a constant c such that

|f (x1 ) − f (x2 )| ≤ c|x1 − x2 | for all x1 , x2 ∈ D.

The constant c is called a Lipschitz constant of the function.

Notice that a Lipschitz constant is nonnegative. The only Lipschitz functions with Lipschitz constant 0 are the constant functions. If c0 is a Lipschitz constant of a Lipschitz function f : D → R, any number c that is larger than c0 is also a Lipschitz constant of f.

Example 2.19

Let f : R → R be the function given by f(x) = ax + b. Then f is a Lipschitz function with Lipschitz constant |a|.

Example 2.20

Let $f : [-10, 8] \to \mathbb{R}$ be the function defined by $f(x) = x^2$. Show that $f$ is Lipschitz.

Solution
For any $x_1$ and $x_2$ in $[-10, 8]$,
\[ |f(x_1) - f(x_2)| = |x_1^2 - x_2^2| = |x_1 + x_2||x_1 - x_2|. \]
The triangle inequality implies that
\[ |x_1 + x_2| \leq |x_1| + |x_2| \leq 10 + 10 = 20. \]
Hence,
\[ |f(x_1) - f(x_2)| \leq 20|x_1 - x_2|. \]
This shows that $f$ is a Lipschitz function with Lipschitz constant 20.
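The bound in Example 2.20 can be compared against difference quotients on a finite grid. The sketch below is a sampling check only, so it can support but not prove the bound; `max_difference_quotient` and the grid size are assumptions made for the illustration.

```python
def max_difference_quotient(f, a, b, steps):
    """Largest |f(x1) - f(x2)| / |x1 - x2| over distinct points of a uniform grid on [a, b]."""
    xs = [a + (b - a) * k / steps for k in range(steps + 1)]
    best = 0.0
    for i, x1 in enumerate(xs):
        for x2 in xs[:i]:
            best = max(best, abs(f(x1) - f(x2)) / abs(x1 - x2))
    return best


if __name__ == "__main__":
    # Grid of spacing 0.5 on [-10, 8]; the quotient equals |x1 + x2| for f(x) = x**2.
    q = max_difference_quotient(lambda x: x * x, -10, 8, 36)
    print(q)  # stays below the Lipschitz constant 20 from Example 2.20
```

On this grid the largest quotient comes from the two points nearest to $-10$, which is where $|x_1 + x_2|$ is largest on $[-10, 8]$.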

Example 2.21

Let $f : \mathbb{R} \to \mathbb{R}$ be the function defined by $f(x) = x^2$. Is $f$ a Lipschitz function?

Solution
If $f$ is a Lipschitz function, there is a positive constant $c$ such that
\[ |f(x_1) - f(x_2)| \leq c|x_1 - x_2| \]
for all real numbers $x_1$ and $x_2$. Take $x_1 = c + 1$ and $x_2 = 0$. We find that
\[ (c + 1)^2 = |f(x_1) - f(x_2)| \leq c|x_1 - x_2| = c(c + 1), \]
which implies that $c + 1 \leq 0$, a contradiction. Hence, $f$ is not a Lipschitz function.
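The contradiction in Example 2.21 comes from an explicit pair of points, which the following sketch reproduces for a few candidate constants. The function name `lipschitz_violation` is hypothetical, and integer candidates are used only so that the arithmetic is exact.

```python
def lipschitz_violation(c):
    """For f(x) = x**2 on R and a candidate constant c > 0, return the pair
    (x1, x2) = (c + 1, 0) used in Example 2.21."""
    if c <= 0:
        raise ValueError("c must be positive")
    return c + 1, 0


if __name__ == "__main__":
    for c in (1, 20, 10**6):
        x1, x2 = lipschitz_violation(c)
        lhs = abs(x1**2 - x2**2)   # |f(x1) - f(x2)| = (c + 1)**2
        rhs = c * abs(x1 - x2)     # c|x1 - x2| = c(c + 1)
        print(c, lhs, rhs, lhs > rhs)
```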

Here we see that whether a function is Lipschitz or not depends on the domain.
Finally we prove that a Lipschitz function is continuous.

Theorem 2.15
Let D be a subset of real numbers. If f : D → R is a Lipschitz function,
then it is continuous.

Proof
Since f : D → R is Lipschitz, there is a positive constant c such that for
any x1 and x2 in D,

|f (x1 ) − f (x2 )| ≤ c|x1 − x2 |.

Let x0 be a point in D. Given ε > 0, take δ = ε/c. Then δ > 0 and for any
x ∈ D, if |x − x0 | < δ,

|f (x) − f (x0 )| ≤ c|x − x0 | < cδ = ε.

This proves that f is continuous at x0 . Hence, f : D → R is a continuous


function.
Chapter 2. Limits of Functions and Continuity 114

Exercises 2.2
Question 1
Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) = \begin{cases} 2, & \text{if } x > 2, \\ x, & \text{if } x \leq 2. \end{cases} \]
Show that $f$ is a continuous function.

Question 2
Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) = \begin{cases} x^2, & \text{if } x \text{ is rational}, \\ -x^2, & \text{if } x \text{ is irrational}. \end{cases} \]
Show that $f$ is continuous at $x = 0$.

Question 3
Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) = \begin{cases} 2x + 5, & \text{if } x < -1, \\ ax^2 + x, & \text{if } x \geq -1. \end{cases} \]
Show that there is a value of $a$ for which $f$ is a continuous function.

Question 4

Let $f : [-7, 5] \to \mathbb{R}$ be the function defined by $f(x) = 2x^2 + 3x$. Show that $f$ is a Lipschitz function.

Question 5

Let $f : [1, \infty) \to \mathbb{R}$ be the function defined by $f(x) = \sqrt{x}$. Show that $f$ is a Lipschitz function.

2.3 The Extreme Value Theorem

For a real-valued function f : D → R, the maximum value is the largest value the
function can assume; while the minimum value is the smallest value the function
can assume.

Definition 2.9 Maximum and Minimum Values of a Function

Let D be a subset of real numbers, and let f : D → R be a real-valued function defined on D.

1. f has a maximum value if and only if there is a point x0 in D such that

f(x) ≤ f(x0) for all x ∈ D.

Such an x0 is called a maximizer of the function f.

2. f has a minimum value if and only if there is a point x0 in D such that

f(x) ≥ f(x0) for all x ∈ D.

Such an x0 is called a minimizer of the function f.

Extreme Values
The maximum value of a function f : D → R is the maximum of the set
f (D); while the minimum value is the minimum of the set f (D).
A maximum value or a minimum value of a function is called an extreme
value of the function.

Example 2.22

(a) For the function f : [−1, 2] → R, f (x) = 2x, D = [−1, 2] and f (D) =
[−2, 4]. Thus, f has minimum value −2 and maximum value 4.

(b) For the function g : [−1, 2) → R, g(x) = 2x, D = [−1, 2) and g(D) =
[−2, 4). Thus, g has minimum value −2, but it does not have maximum
value.

(c) For the function h : (−1, 2] → R, h(x) = 2x, D = (−1, 2] and h(D) = (−2, 4]. Thus, h has maximum value 4, but it does not have a minimum value.

Figure 2.6: The functions f (x), g(x) and h(x) defined in Example 2.22.

Example 2.22 shows that the existence of extreme values depends on the
domain of the function.
For a set to have maximum and minimum values, it is necessary (but not
sufficient) that the set is bounded. Let us first define what it means for a function
to be bounded.

Definition 2.10 Bounded Functions


We say that a real-valued function f : D → R is bounded if its range f (D)
is bounded. In other words, a function f : D → R is bounded if and only
if there is a positive constant M such that

|f (x)| ≤ M for all x ∈ D.

Example 2.23
All the three functions defined in Example 2.22 are bounded.

We are interested in a sufficient condition for a continuous function to have maximum and minimum values. Before we proceed, let us look at two examples.

Example 2.24
Consider the function $f : (0, 1) \to \mathbb{R}$ defined by $f(x) = \dfrac{1}{x}$. Although the domain of the function $D = (0, 1)$ is bounded, the range of the function $f(D) = (1, \infty)$ is not bounded.

This example shows that a continuous function does not necessarily map a
bounded set to a bounded set.

Example 2.25
Consider the function f : [1, ∞) → R defined by f (x) = 1/x. Although
the domain of the function D = [1, ∞) is closed, the range of the function
f (D) = (0, 1] is not closed.

This example shows that a continuous function does not necessarily map a
closed set to a closed set.
The situation changes when we combine closed and bounded. Recall that
we have defined the concept of sequential compactness in Chapter 1, Section
1.8. A set D is sequentially compact if every sequence in D has a subsequence
that converges to a point in D. We have proved that a subset of real numbers is
sequentially compact if and only if it is closed and bounded.
The following theorem says that a continuous function maps a closed and
bounded set to a closed and bounded set.

Theorem 2.16
Let D be a closed and bounded subset of R. If f : D → R is a continuous
function, then the set f (D) is closed and bounded.

Using the fact that a subset of real numbers is sequentially compact if and only
if it is closed and bounded, Theorem 2.16 is equivalent to the following.

Theorem 2.17
Let D be a sequentially compact subset of R. If f : D → R is a continuous
function, then the set f (D) is sequentially compact.

Proof
We use the definition of sequential compactness to prove this theorem. Let
{yn } be a sequence in f (D). We need to prove that there is a subsequence
of {yn } that converges to a point in f (D).
For each positive integer n, since yn is in f (D), there is an xn in D such
that f (xn ) = yn . This gives a sequence {xn } in D. Since D is sequentially
compact, there is a subsequence {xnk } of {xn } that converges to a point
x0 in D. Since f is continuous at x0 , the sequence {f (xnk )} converges to
f (x0 ). In other words, we have shown that the subsequence {ynk } of {yn }
converges to the point f (x0 ) in f (D).

Proving Theorem 2.16 without using sequential compactness is tedious, and it
essentially goes through some of the arguments used to prove that a subset of real
numbers is sequentially compact if and only if it is closed and bounded. From
here, we can see the usefulness of the concept of sequential compactness.
In Theorem 1.47, we have seen that a set that is closed and bounded must
have a maximum and a minimum. Hence, we obtain immediately the following
theorem.

Theorem 2.18 Extreme Value Theorem


Let D be a closed and bounded subset of R. If f : D → R is a continuous
function, then f has a maximum value and a minimum value.

Corollary 2.19

If f : [a, b] → R is a continuous function defined on a closed and bounded


interval, then f is bounded, and it has a maximum value and a minimum
value.

The extreme value theorem is used to guarantee the existence of a maximum value
and a minimum value before we proceed to find these values, so that the attempt
to look for extreme values is not futile. In some circumstances, knowing the
existence of such extreme values is sufficient.

Example 2.26
Show that the function f : R → R defined by

f (x) = |x − 1| + |x − 2| + |x − 3| + |x − 4| + |x − 5|

has a minimum value.

Solution
In this example, the domain of the function is not closed and bounded. We
cannot apply the extreme value theorem directly. However, we can proceed
in the following way. First, we justify that the function f : R → R is
continuous. A function of the form g(x) = x − a is continuous since it
is a polynomial. Absolute value of a continuous function is continuous.
Hence, a function of the form h(x) = |x − a| is continuous. Being a sum
of continuous functions, f (x) is a continuous function.
To prove the existence of a minimum value, we notice that for x ≥ 5,

f (x) = x − 1 + x − 2 + x − 3 + x − 4 + x − 5 = 5x − 15 ≥ 10.

For x ≤ 1,

f (x) = 1 − x + 2 − x + 3 − x + 4 − x + 5 − x = 15 − 5x ≥ 10.

Now restrict the domain to [1, 5]. The function f : [1, 5] → R is continuous,
and [1, 5] is closed and bounded. Hence, by the extreme value theorem, it has a
minimum value at some x0 ∈ [1, 5]. It follows that

f (x0 ) ≤ f (x) for all x ∈ [1, 5].

In particular,
f (x0 ) ≤ f (1) = 10.
Since f (x) ≥ 10 for all x outside [1, 5], this proves that f (x) ≥ f (x0 ) for
all x ∈ R. Hence, the function f : R →
R has a minimum value.
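
The argument above can be checked numerically. The following is a minimal sketch, not a proof: it samples f on a grid in [1, 5] (the grid size `n` and the names `xs` and `x_min` are choices made here for illustration) and compares the smallest sampled value with the lower bound 10 that holds outside [1, 5].

```python
def f(x):
    # f(x) = |x - 1| + |x - 2| + |x - 3| + |x - 4| + |x - 5|
    return sum(abs(x - k) for k in range(1, 6))

# Sample the closed and bounded interval [1, 5] on a uniform grid.
n = 4000
xs = [1 + 4 * i / n for i in range(n + 1)]

# Grid point with the smallest sampled value of f.
x_min = min(xs, key=f)
print(x_min, f(x_min))

# Outside [1, 5] the function is at least 10, as shown in the solution.
print(f(0.0), f(7.0))
```

A grid search only suggests where the minimizer lies; its existence is what the extreme value theorem guarantees.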

Exercises 2.3
Question 1
Determine whether the function is bounded.
(a) f : R → R, f (x) = x/√(x^2 + 4).
(b) f : (0, 1) → R, f (x) = x + 1/x.

Question 2
If a function f : D → R is continuous and bounded, does it necessarily
have a maximum value and a minimum value? Justify your answer.

Question 3
Let f : [−4, 4] → R be the function defined by

f (x) = (x^2 + x + 1)/√(4x^2 + 9).
Show that it has a maximum value and a minimum value.

2.4 The Intermediate Value Theorem

In this section, we are going to discuss the intermediate value theorem, which is
an important theorem for continuous functions. It is essentially a theorem about
existence of solutions for equations defined by continuous functions.

Theorem 2.20 Intermediate Value Theorem


Given that f : [a, b] → R is a continuous function. For any real number w
that is between f (a) and f (b), there exists a point c in [a, b] such that

f (c) = w.

Proof
The proof uses the bisection method, which provides a constructive way to
find the point c.
Without loss of generality, assume that f (a) < w < f (b).
We construct two sequences {an } and {bn } recursively. Define a1 = a,
b1 = b, and let
m1 = (a1 + b1 )/2
be the midpoint of a1 and b1 . The interval [a, b] = [a1 , b1 ] is bisected into
two subintervals [a1 , m1 ] and [m1 , b1 ] by the point m1 .
We want to define the interval [a2 , b2 ] to be one of these, based on the value
of f (m1 ).

• If f (m1 ) < w, define a2 = m1 and b2 = b1 .

• If f (m1 ) ≥ w, define a2 = a1 and b2 = m1 .

By definition,
a1 ≤ a2 < b2 ≤ b1 ,
f (a2 ) < w ≤ f (b2 ),
and the length of the interval [a2 , b2 ] is half the length of the interval [a1 , b1 ].

Suppose that we have defined a1 , . . . , an , b1 , . . . , bn , such that

a1 ≤ . . . ≤ an−1 ≤ an < bn ≤ bn−1 ≤ . . . ≤ b1 ,

f (ak ) < w ≤ f (bk ) for all 1 ≤ k ≤ n,


and
bk − ak = (bk−1 − ak−1 )/2 for all 2 ≤ k ≤ n.
Let
mn = (an + bn )/2
be the midpoint of an and bn .

• If f (mn ) < w, define an+1 = mn and bn+1 = bn .

• If f (mn ) ≥ w, define an+1 = an and bn+1 = mn .

By definition,
an ≤ an+1 < bn+1 ≤ bn ,
f (an+1 ) < w ≤ f (bn+1 ),
and
bn+1 − an+1 = (bn − an )/2.
This constructs the sequences {an } and {bn }. Notice that {an } is an
increasing sequence that is bounded above by b, while {bn } is a decreasing
sequence that is bounded below by a.
By the monotone convergence theorem, the sequence {an } converges to a
number c1 = sup{an } and the sequence {bn } converges to a number
c2 = inf{bn }. By induction, we find that

bn − an = (b − a)/2^(n−1).
Taking the limit as n → ∞, we conclude that

c2 − c1 = 0.

It follows that the number c = c1 = c2 satisfies

an ≤ c ≤ bn for all n ∈ Z+ , (2.2)

and
lim_{n→∞} an = c = lim_{n→∞} bn . (2.3)

Eq. (2.2) shows that c is in [a, b]. The continuity of the function f and (2.3)
imply that
f (c) = lim_{n→∞} f (an ) = lim_{n→∞} f (bn ).

Since
f (an ) < w and f (bn ) ≥ w for all n ∈ Z+ ,
we find that
f (c) ≤ w and f (c) ≥ w.
This proves that f (c) = w, and hence completes the proof of the theorem.

Figure 2.7: The intermediate value theorem.

The following is an example in which we use the intermediate value theorem to
justify that an equation has a solution.

Example 2.27
Show that the equation
x^6 + 6x + 1 = 0
has a real root.

Solution
Let f (x) = x^6 + 6x + 1. Since f (x) is a polynomial, it is a continuous
function. Notice that

f (0) = 1, f (−1) = −4.

Hence, f (−1) < 0 < f (0). Namely, 0 is a value between f (−1) and f (0).
By the intermediate value theorem, there is a point c in the interval (−1, 0) such
that f (c) = 0. Then x = c is a root of the equation

x^6 + 6x + 1 = 0.

In this solution, the choice of a = −1 and b = 0 is by trial and error.
In practice, one can use a computer to sample some x values and calculate the
corresponding values of f (x). The goal is to find a and b such that f (a) and f (b)
have opposite signs. To calculate the root c, one can implement the bisection
method numerically.
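
The bisection step from the proof of Theorem 2.20 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bisect`, the tolerance `tol` and the iteration cap `max_iter` are choices made here, and the routine assumes that f is continuous on [a, b] and that w lies between f (a) and f (b).

```python
def bisect(f, a, b, w=0.0, tol=1e-12, max_iter=200):
    """Approximate a point c in [a, b] with f(c) = w by repeated halving.

    Assumes f is continuous on [a, b] and w lies between f(a) and f(b).
    """
    fa, fb = f(a), f(b)
    if not min(fa, fb) <= w <= max(fa, fb):
        raise ValueError("w must lie between f(a) and f(b)")

    # Reduce to the case treated in the proof: g(a) <= 0 <= g(b).
    sign = 1.0 if fa <= fb else -1.0

    def g(x):
        return sign * (f(x) - w)

    for _ in range(max_iter):
        if b - a <= tol:
            break
        m = (a + b) / 2      # midpoint m_n of [a_n, b_n]
        if g(m) < 0:
            a = m            # keep the right half [m_n, b_n]
        else:
            b = m            # keep the left half [a_n, m_n]
    return (a + b) / 2


def f(x):
    return x**6 + 6 * x + 1


# The interval [-1, 0] is the one found by trial and error in Example 2.27.
root = bisect(f, -1.0, 0.0)
print(root, f(root))
```

Each pass halves the length of the interval, so the number of iterations needed grows only logarithmically with the required accuracy.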

Example 2.28
Let n be a positive integer, and let c be a positive number. Use the
intermediate value theorem to show that there is a positive real number
x such that
x^n = c.

Solution
Take a = 0, b = c + 1, and consider the function f : [a, b] → R defined by
f (x) = x^n . Then,
f (a) = f (0) = 0,
f (b) = f (c + 1) = (1 + c)^n ≥ 1 + nc > c.
Hence,
f (a) < c < f (b).
Since f is a continuous function, the intermediate value theorem asserts that
there is a number x in the interval [0, c + 1] such that f (x) = x^n = c. Since
f (0) = 0 ≠ c, this x is positive.
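
The choice of the interval [0, c + 1] in this solution also gives a practical way to approximate the n-th root. The sketch below is an illustration only: the name `nth_root`, the tolerance and the iteration cap are assumptions made here, and the loop is the bisection idea from the proof of the intermediate value theorem applied to f (x) = x^n.

```python
def nth_root(c, n, tol=1e-12, max_iter=200):
    """Approximate the positive x with x**n == c, for c > 0 and integer n >= 1."""
    if c <= 0 or n < 1:
        raise ValueError("need c > 0 and n >= 1")
    # As in Example 2.28: f(0) = 0 < c < (c + 1)**n = f(c + 1).
    lo, hi = 0.0, c + 1.0
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2
        if mid**n < c:
            lo = mid         # the root lies in [mid, hi]
        else:
            hi = mid         # the root lies in [lo, mid]
    return (lo + hi) / 2


print(nth_root(2.0, 2))      # approximates the square root of 2
print(nth_root(27.0, 3))     # approximates the cube root of 27
```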

In Chapter 1 Example 1.7, we used the completeness axiom to solve this problem
when n = 2 and c = 2. Here we use the intermediate value theorem to tackle the
general problem. The tedious part has been settled in the proof of the intermediate
value theorem.
In the following, we want to formulate a precise relation between intervals and
the intermediate value theorem. We first introduce a concept called convexity.

Definition 2.11 Convex Sets


Let S be a subset of real numbers. We say that S is convex if for any u and
v in S, (1 − t)u + tv is in S for all t ∈ [0, 1].
Equivalently, S is convex provided that whenever u and v are in S and
u < v, then any w in the interval [u, v] is also in S.

The equivalence of the two definitions is seen by observing that when t changes
from 0 to 1, (1 − t)u + tv goes through all the points in the interval [u, v].
Obviously, an interval is a convex set. The converse is also true.

Theorem 2.21
Let S be a subset of real numbers. If S is a convex set, then S is an interval.

Sketch of Proof
If S is bounded below, let a = inf S. Otherwise, set a = −∞. If S is
bounded above, let b = sup S. Otherwise, set b = ∞.
If c is a point in (a, b), then a < c < b. In particular, since c > a, it is not
a lower bound of S. Hence, there is a point u in S such that a ≤ u < c.
Since c < b, c is not an upper bound of S. Hence, there is a point v in S
such that c < v ≤ b. Since u and v are in S and S is convex, all points in
the interval [u, v] are in S. By construction, u < c < v. Hence, c is in S.
This proves that all the points in (a, b) are in S.
Finally, we just need to consider whether S contains a, and whether it
contains b.

Convex Sets and Intervals


Let S be a convex set. If S is bounded below, let a = inf S. If S is bounded
above, let b = sup S.

1. If S is bounded, and S contains neither inf S nor sup S, then S = (a, b).

2. If S is bounded, and S contains inf S but does not contain sup S, then
S = [a, b).

3. If S is bounded, and S contains sup S but does not contain inf S, then
S = (a, b].

4. If S is bounded, and S contains both inf S and sup S, then S = [a, b].

5. If S is bounded below but not bounded above, and S does not contain
inf S, then S = (a, ∞).

6. If S is bounded below but not bounded above, and S contains inf S, then
S = [a, ∞).

7. If S is bounded above but not bounded below, and S does not contain
sup S, then S = (−∞, b).

8. If S is bounded above but not bounded below, and S contains sup S,
then S = (−∞, b].

9. If S is neither bounded above nor bounded below, then S = (−∞, ∞) = R.

The following is a reformulation of the intermediate value theorem.

Theorem 2.22 Intermediate Value Theorem


Let I be an interval. If the function f : I → R is continuous, then f (I) is
an interval.

Proof
By Theorem 2.21, to show that f (I) is an interval, it suffices to show that
f (I) is convex. Take two distinct points u and v in f (I).
We need to show that any w in between u and v is in f (I). Since u and v are
in f (I), there exist a and b in I such that u = f (a) and v = f (b). Without
loss of generality, assume that a < b. Since I is an interval, it contains the
interval [a, b]. Since f is continuous on [a, b], and w is in between f (a) and
f (b), the version of the intermediate value theorem that we have proved
implies that there is a point c in the interval (a, b) such that f (c) = w. This
shows that w is also in f (I).

Exercises 2.4
Question 1

Show that the equation 2x + x2 + 1 = 0 has a real solution.

Question 2
Given that f : [−2, 10] → [−2, 10] is a continuous function. Show that
there is a point x in the interval [−2, 10] such that f (x) = x.

Question 3
Suppose that f : R → R is a bounded continuous function. Show that there
is a real number x such that f (x) = x.

Question 4
Let n be an odd positive integer, and let

p(x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0

be a polynomial of degree n. Show that p(x) = 0 has a real root.



2.5 Uniform Continuity

In Section 2.2, we have defined the concept of continuity at a point. This is a
local property which only depends on the function value in a neighbourhood of
a point. In this section, we want to define a concept called uniform continuity,
which depends on the behaviour of the function on the whole domain. Such a
property is called a global property.

Definition 2.12 Uniform Continuity


Let D be a subset of real numbers. A function f : D → R defined on D is
uniformly continuous provided that for any ε > 0, there exists δ > 0 such
that if x1 and x2 are in D and |x1 − x2 | < δ, then

|f (x1 ) − f (x2 )| < ε.

Theorem 2.23 Equivalent Definition of Uniform Continuity


Let D be a subset of real numbers. A function f : D → R defined on D is
uniformly continuous if and only if whenever {un } and {vn } are sequences
in D such that
lim_{n→∞} (un − vn ) = 0,

{f (un )} and {f (vn )} are sequences in f (D) such that

lim_{n→∞} (f (un ) − f (vn )) = 0.

Notice that we only require the sequence {un − vn } to converge to 0. We do
not require the sequence {un } nor the sequence {vn } to be convergent.
Theorem 2.23 can be proved in the same way as we prove the equivalence of
two definitions for limits of functions.
The following is quite obvious.

Theorem 2.24
Let D be a subset of real numbers. If f : D → R is a uniformly continuous
function, it is continuous.

Let us compare the definitions of continuity and uniform continuity using the
definitions in terms of ε-δ.

Continuity versus Uniform Continuity

• A function f : D → R is continuous if

∀x0 ∈ D, ∀ε > 0, ∃δ > 0, ∀x ∈ D, |x−x0 | < δ =⇒ |f (x)−f (x0 )| < ε.

• A function f : D → R is uniformly continuous if

∀ε > 0, ∃δ > 0, ∀x0 ∈ D, ∀x ∈ D, |x−x0 | < δ =⇒ |f (x)−f (x0 )| < ε.

The difference is in the order of the quantifiers. For a function to be continuous,
it must be continuous at each point in the domain. For each point x0 in the domain,
there should exist a positive δ for each positive ε. This number δ not only depends
on ε, but also on the point x0 . For uniform continuity, one needs to be able to find
a δ which only depends on ε but not on the point in the domain. This is where the
uniformity lies.
Let us look at some examples of functions that are continuous but not uniformly
continuous.

Example 2.29
Show that the function f : (0, 1) → R, f (x) = 1/x is not uniformly
continuous.

Solution
For a positive integer n, let
un = 1/(n + 1), vn = 1/(n + 2).
Then {un } and {vn } are sequences in the domain D = (0, 1), and
lim_{n→∞} (un − vn ) = lim_{n→∞} 1/(n + 1) − lim_{n→∞} 1/(n + 2) = 0.

Since f (un ) = n + 1 and f (vn ) = n + 2, we find that

lim_{n→∞} (f (un ) − f (vn )) = −1 ≠ 0.

Hence, f is not uniformly continuous.
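
The failure of uniform continuity for f (x) = 1/x on (0, 1) can also be seen through the dependence of δ on the point x0. The sketch below rests on a short computation that is not part of the text above and should be read as an assumption of this illustration: for fixed ε > 0 and x0 in (0, 1), the worst case is x = x0 − δ, where 1/x − 1/x0 = δ/(x0 (x0 − δ)), so the admissible δ are exactly those with δ ≤ ε x0^2 /(1 + ε x0 ). The name `largest_delta` is hypothetical.

```python
def largest_delta(x0, eps):
    """Largest delta that works at x0 for f(x) = 1/x on (0, 1).

    Obtained by solving delta / (x0 * (x0 - delta)) = eps for delta.
    """
    return eps * x0 * x0 / (1 + eps * x0)


eps = 0.1
for x0 in (0.5, 0.1, 0.01, 0.001):
    # The admissible delta shrinks towards 0 as x0 approaches 0.
    print(x0, largest_delta(x0, eps))
```

Since these values tend to 0 as x0 tends to 0, no single positive δ serves every point of (0, 1), which agrees with the conclusion of Example 2.29.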

Example 2.30

Show that the function f : (0, ∞) → R, f (x) = x^2 is not uniformly
continuous.

Solution
For a positive integer n, let
un = n + 1/n, vn = n.
Then {un } and {vn } are sequences in the domain D = (0, ∞), and
lim_{n→∞} (un − vn ) = lim_{n→∞} 1/n = 0.

Since
f (un ) − f (vn ) = (n + 1/n)^2 − n^2 = 2 + 1/n^2 ,
we find that
lim_{n→∞} (f (un ) − f (vn )) = 2 ≠ 0.

Hence, f is not uniformly continuous.
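
The two sequences used in this solution can be tabulated numerically. This is a small sketch for illustration; the helper name `gaps` is an assumption made here, and floating-point rounding means the printed values only approximate 1/n and 2 + 1/n^2.

```python
def f(x):
    return x * x


def gaps(n):
    """Return (u_n - v_n, f(u_n) - f(v_n)) for u_n = n + 1/n and v_n = n."""
    u, v = n + 1 / n, n
    return u - v, f(u) - f(v)


for n in (10, 100, 1000, 10000):
    du, df = gaps(n)
    # du shrinks towards 0 while df stays close to 2.
    print(n, du, df)
```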

If we change the domain of the function, the conclusion is different.

Example 2.31

Let f : [−10, 8] → R be the function defined by f (x) = x^2 . Show that f is
uniformly continuous.

Solution
In the solution of Example 2.20, we have shown that for any x1 and x2 in
the domain D = [−10, 8],

|f (x1 ) − f (x2 )| ≤ 20|x1 − x2 |.

Given ε > 0, take δ = ε/20. Then δ > 0. If x1 and x2 are in D and
|x1 − x2 | < δ, then

|f (x1 ) − f (x2 )| ≤ 20|x1 − x2 | < 20δ = ε.

This proves that f is uniformly continuous.

The function in Example 2.31 is Lipschitz. In fact, the proof of Theorem 2.15
can be easily modified to prove that a Lipschitz function is uniformly continuous.

Theorem 2.25
Let D be a subset of real numbers. If f : D → R is a Lipschitz function,
then it is uniformly continuous.


The converse is not true. For example, the function f : [0, 1] → R, f (x) = √x
is not Lipschitz, but it is uniformly continuous. We leave this to the exercise.
In the following, we give a sufficient condition for a function to be uniformly
continuous.

Theorem 2.26
Let D be a closed and bounded subset of real numbers. If f : D → R is a
continuous function, then it is uniformly continuous.

To prove this theorem, we start with a technical lemma.



Lemma 2.27
Let S be a sequentially compact set in R, and let {an } and {bn } be two
sequences in S. There is a strictly increasing sequence of positive integers
{n1 , n2 , n3 , . . .} such that each of the subsequences {an1 , an2 , an3 , . . .} and
{bn1 , bn2 , bn3 , . . .} converges to a point in S.

Using sequential compactness, we can guarantee that {an } has a subsequence
that converges to a point in S, and {bn } also has a subsequence that converges to a
point in S. However, the indices of these two subsequences might not be related.
We need to choose the subsequences carefully to make sure that the indices {nk }
are the same.

Proof
First, the sequential compactness of S guarantees that there is a
strictly increasing sequence of positive integers {k1 , k2 , k3 , . . .} so that the
subsequence {ak1 , ak2 , ak3 , . . .} converges to a point a in S. For a positive
integer j, let
cj = bkj .
Consider the sequence {cj } indexed by j ∈ Z+ . It is a subsequence of
{bn }, and it is also a sequence in S. Since S is sequentially compact, there
is a strictly increasing sequence of positive integers {j1 , j2 , j3 , . . .} such
that the subsequence {cj1 , cj2 , cj3 , . . .} = {bkj1 , bkj2 , bkj3 , . . .} converges
to a point b in S. For a positive integer m, let nm = kjm . Then
{n1 , n2 , n3 , . . .} is a strictly increasing sequence of positive integers.
The sequence {an1 , an2 , an3 , . . .} is a subsequence of {ak1 , ak2 , ak3 , . . .}.
Hence, it converges to a. The sequence {bn1 , bn2 , bn3 , . . .} is the sequence
{cj1 , cj2 , cj3 , . . .} which converges to b.
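
The bookkeeping of indices in this proof, nm = kjm , can be illustrated with a concrete pair of sequences. Everything in the sketch below is a hypothetical example chosen for this illustration: the sequences `a` and `b`, and the index maps `k` and `j`, are not taken from the text.

```python
def a(n):
    return (-1) ** n          # a_n alternates between -1 and 1


def b(n):
    return (n % 4) / 2        # b_n cycles through 0.5, 1.0, 1.5, 0.0


def k(i):
    return 2 * i              # a_{k_i} = 1 for every i


def j(m):
    return 2 * m              # c_j = b_{k_j}; along j_m = 2m it is constant


def n(m):
    return k(j(m))            # common indices n_m = k_{j_m} = 4m


for m in range(1, 6):
    # Both subsequences are constant along the common indices n_m.
    print(n(m), a(n(m)), b(n(m)))
```

The point of the construction is that the second selection is made inside the first one, so a single index sequence serves both subsequences.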

Now we return to the proof of Theorem 2.26.



Proof of Theorem 2.26


We use proof by contradiction. If f is not uniformly continuous, there is an
ε > 0 such that for all δ > 0, there are points u and v in D with |u − v| < δ
and |f (u) − f (v)| ≥ ε. This implies that for each n ∈ Z+ , there are points
un and vn in D with
|un − vn | < 1/n and |f (un ) − f (vn )| ≥ ε.
Thus, {un } and {vn } are two sequences in D such that

lim_{n→∞} (un − vn ) = 0. (2.4)

Since D is closed and bounded, it is sequentially compact. By Lemma
2.27, we find that there is a strictly increasing sequence of positive integers
{n1 , n2 , n3 , . . .} so that the subsequence {unk } converges to a point u0 in D;
and the subsequence {vnk } converges to a point v0 in D. Since f : D → R
is continuous, the sequence {f (unk )} converges to f (u0 ), and the sequence
{f (vnk )} converges to f (v0 ). By construction,

|f (unk ) − f (vnk )| ≥ ε for all k ∈ Z+ .

This implies that

|f (u0 ) − f (v0 )| ≥ ε.
Since {unk − vnk } is a subsequence of {un − vn }, (2.4) implies that

u0 − v0 = lim_{k→∞} (unk − vnk ) = 0.

In other words, u0 = v0 . Then we should have f (u0 ) = f (v0 ), which
contradicts |f (u0 ) − f (v0 )| ≥ ε. We conclude that f must be uniformly
continuous.

Example 2.32

Show that the function f : (0, 100) → R, f (x) = √x is uniformly
continuous.

Using the definition of uniform continuity to solve this problem is tedious.

Solution
The domain of the function Df = (0, 100) is not closed and bounded. We
cannot apply Theorem 2.26 directly. Consider the function g : [0, 100] → R
defined by g(x) = √x. It is a continuous function. Since the domain
Dg = [0, 100] is closed and bounded, g is uniformly continuous.
Since f : (0, 100) → R is the restriction of the function g to Df , it is also
uniformly continuous.

Exercises 2.5
Question 1

Determine whether the function f : [0, ∞) → R, f (x) = 2x^2 + 3x is
uniformly continuous.

Question 2
Show that the function f : (0, 20) → R, f (x) = x/√(x + 1) is uniformly
continuous.

Question 3

Let f : [0, 1] → R be the function defined by f (x) = √x. Show that f is
not Lipschitz, but it is uniformly continuous.

Question 4
Determine whether the function f : (0, 1) → R, f (x) = 1/√x is uniformly
continuous.

2.6 Monotonic Functions and Inverses of Functions

In this section, we study monotonic functions.

Definition 2.13 Monotonic and Strictly Monotonic Functions


Let D be a subset of real numbers and let f : D → R be a function defined
on D.

1. We say that f : D → R is an increasing function if for any x1 and x2 in D,
x1 ≤ x2 =⇒ f (x1 ) ≤ f (x2 ).

2. We say that f : D → R is a strictly increasing function if for any x1 and
x2 in D,
x1 < x2 =⇒ f (x1 ) < f (x2 ).

3. We say that f : D → R is a decreasing function if for any x1 and x2 in D,
x1 ≤ x2 =⇒ f (x1 ) ≥ f (x2 ).

4. We say that f : D → R is a strictly decreasing function if for any x1
and x2 in D,
x1 < x2 =⇒ f (x1 ) > f (x2 ).

5. We say that f : D → R is monotonic if it is increasing or it is decreasing.

6. We say that f : D → R is strictly monotonic if it is strictly increasing
or it is strictly decreasing.

The following is obvious from the definitions.

Proposition 2.28
Let D be a subset of real numbers and let f : D → R be a function defined
on D. If f is strictly monotonic, then it is one-to-one.

Example 2.33

(a) Let f : [−3, 4] → R be the function defined by

f (x) = −1, if −3 ≤ x ≤ 0,
f (x) = x − 1, if 0 < x ≤ 4.

It is an increasing function.

(b) Let g : [−3, 4] → R be the function defined by

g(x) = 1, if −3 ≤ x ≤ 0,
g(x) = 1 − x, if 0 < x ≤ 4.

It is a decreasing function.

Neither f nor g is strictly monotonic.

Figure 2.8: The functions f (x) and g(x) defined in Example 2.33.

Example 2.34

(a) Let f : (−∞, 0] → R be the function defined by f (x) = x^2 . Then f is
strictly decreasing.

(b) Let g : [0, ∞) → R be the function defined by g(x) = x^2 . Then g is
strictly increasing.

Figure 2.9: The functions f (x) and g(x) defined in Example 2.34.

Figure 2.10: An increasing function with a jump discontinuity.

The following is a characterization of the discontinuities of a monotonic function.



Theorem 2.29
Let f : [a, b] → R be a monotonic function. For any x0 in (a, b], the left
limit
f− (x0 ) = lim_{x→x0−} f (x)

exists. For any x0 in [a, b), the right limit

f+ (x0 ) = lim_{x→x0+} f (x)

exists. Define
f− (a) = f (a) and f+ (b) = f (b).
Then the function f : [a, b] → R is continuous at the point x0 in [a, b] if and
only if
f− (x0 ) = f (x0 ) = f+ (x0 ).
Otherwise, f has a jump discontinuity at x0 with jump

|f+ (x0 ) − f− (x0 )|.

Proof
If f is decreasing, then −f is increasing. Hence, we only need to consider
the case where f : [a, b] → R is increasing. Fix x0 in (a, b]. Define the
nonempty set S− by

S− = {f (x) | a ≤ x < x0 } .

Since f is increasing, f (x) ≤ f (x0 ) for any x in [a, x0 ). Therefore the set
S− is bounded above by f (x0 ). Let u = sup S− . Then u ≤ f (x0 ). We
claim that
u = lim_{x→x0−} f (x) = f− (x0 ).

Given ε > 0, u − ε < u and thus it is not an upper bound of S− . Hence,
there is a point x1 in [a, x0 ) such that

f (x1 ) > u − ε.

Let δ = x0 −x1 . Then δ > 0. If x is a point in [a, x0 ) such that |x−x0 | < δ,
then x1 < x < x0 , and thus

u − ε < f (x1 ) ≤ f (x) ≤ u.

From this, we have

|f (x) − u| < ε.
This proves that
f− (x0 ) = lim_{x→x0−} f (x) = u.

Using a similar argument, we find that for any x0 in [a, b), the right limit
lim_{x→x0+} f (x) exists, and

f+ (x0 ) = lim_{x→x0+} f (x) = inf {f (x) | x0 < x ≤ b} .

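By definition of continuity, the function f is continuous at x0 if and only if
f− (x0 ) = f (x0 ) = f+ (x0 ). Since f is increasing, f− (x0 ) ≤ f (x0 ) ≤ f+ (x0 ),
so this holds if and only if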
f− (x0 ) = f+ (x0 ). The statement about the jump is obvious.
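
The description of the one-sided limits of an increasing function as a supremum and an infimum can be explored numerically. The sketch below is an illustration under assumptions: the function `f` with a jump at x0 = 1 is a made-up example, and `left_limit` and `right_limit` only approximate the supremum and infimum by sampling, which is adequate here because f is increasing.

```python
def f(x):
    # An increasing function on [0, 2] with a jump at x = 1.
    return x if x < 1 else x + 1


def left_limit(f, x0, a, steps=10000):
    """Approximate sup{f(x) : a <= x < x0} by sampling points below x0."""
    return max(f(a + (x0 - a) * i / steps) for i in range(steps))


def right_limit(f, x0, b, steps=10000):
    """Approximate inf{f(x) : x0 < x <= b} by sampling points above x0."""
    return min(f(x0 + (b - x0) * i / steps) for i in range(1, steps + 1))


f_minus = left_limit(f, 1.0, 0.0)
f_plus = right_limit(f, 1.0, 2.0)
print(f_minus, f_plus, f_plus - f_minus)   # approximate jump at x0 = 1
```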

Corollary 2.30
Let I be an interval. If f : I → R is monotonic, then f is continuous if and
only if f (I) is an interval.

Proof
If f : I → R is continuous, the intermediate value theorem implies that f (I) is
an interval.
If f : I → R is not continuous, Theorem 2.29 implies that there is a point
x0 in the interval I for which either f− (x0 ) ≠ f (x0 ) or f+ (x0 ) ≠ f (x0 ). In
any case, f (I) cannot be an interval.

For a function f : I → R defined on an interval I, we have seen that if f is
strictly monotonic, it is one-to-one. It is true even if the function is not continuous.
If we assume that the function is continuous, the converse is also true. It is a
consequence of the intermediate value theorem.

Theorem 2.31
Let f : I → R be a function defined on an interval I. If f is continuous
and one-to-one, then f is strictly monotonic.

Proof
If f fails to be strictly monotonic, there exist three points a, x0 , b in I such
that a < x0 < b and one of the following holds.

(i) f (a) < f (b) < f (x0 )

(ii) f (x0 ) < f (a) < f (b)

(iii) f (a) > f (b) > f (x0 )

(iv) f (x0 ) > f (a) > f (b)

Consider case (i) where f (a) < f (b) < f (x0 ). Since w = f (b) is a value
between f (a) and f (x0 ), the intermediate value theorem implies that there is
a point c in the interval (a, x0 ) for which f (c) = w. Then c ≠ b, but
f (c) = f (b). This contradicts the assumption that f is one-to-one.
Using the same argument, we will reach a contradiction for the other three
cases. This proves that f must be strictly monotonic.

Now we consider invertibility of functions. We only consider functions that
are defined on intervals.

Definition 2.14 Invertibility of a Function


Let I be an interval and let f : I → R be a function defined on I. The
function f : I → R is invertible if and only if it is injective. If f : I → R is
injective, its inverse is the function f⁻¹ : f (I) → R defined in such a way
so that
f⁻¹ (y) = x ⇐⇒ f (x) = y for all y ∈ f (I).

Example 2.35
Consider the functions f and g that are defined in Example 2.34.

(a) The inverse of the function f : (−∞, 0] → R, f (x) = x^2 , is the
function f⁻¹ : [0, ∞) → (−∞, 0], f⁻¹ (x) = −√x.

(b) The inverse of the function g : [0, ∞) → R, g(x) = x^2 , is the function
g⁻¹ : [0, ∞) → [0, ∞), g⁻¹ (x) = √x.

Figure 2.11: The functions in Example 2.35.
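
The two inverses in Example 2.35 can be checked by composing each function with its inverse. This is a small sketch for illustration; the names `f_inv` and `g_inv` are chosen here, and Python does not enforce the domains, so the comments record the domain assumed for each function.

```python
import math


def f(x):
    return x * x              # considered on the domain (-inf, 0]


def f_inv(y):
    return -math.sqrt(y)      # inverse of f, defined on [0, inf)


def g(x):
    return x * x              # considered on the domain [0, inf)


def g_inv(y):
    return math.sqrt(y)       # inverse of g, defined on [0, inf)


for x in (-3.0, -1.5, 0.0):
    print(x, f_inv(f(x)))     # recovers x, since x lies in (-inf, 0]

for x in (0.0, 2.0, 7.25):
    print(x, g_inv(g(x)))     # recovers x, since x lies in [0, inf)
```

The same formula x^2 has different inverses on the two domains, which is why the domain is part of the definition of each function.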

Notice that the inverse of a strictly increasing function is strictly increasing.
The inverse of a strictly decreasing function is strictly decreasing.
In the next theorem, we prove that the inverse of a continuous function is
continuous.

Theorem 2.32
Let I be an interval and let f : I → R be a continuous function defined
on I. If f : I → R is one-to-one, then f⁻¹ : f (I) → R exists, and it is
continuous.

Proof
By Theorem 2.31, f is strictly monotonic. Without loss of generality, we
assume that f is strictly increasing.
Given a point y0 in the interval f (I), let x0 be the unique point in I such
that f (x0 ) = y0 . Given ε > 0, we need to prove that there is a δ > 0 such
that if y is a point in f (I) with |y − y0 | < δ, then |f⁻¹ (y) − f⁻¹ (y0 )| < ε.
For simplicity, assume that x0 is an interior point of I. Then there is an r > 0
such that [x0 − r, x0 + r] is in I. Take

ε1 = min{ε, r}.

Then ε1 > 0, ε1 ≤ ε and [x0 − ε1 , x0 + ε1 ] ⊂ [x0 − r, x0 + r] ⊂ I. Since f
is strictly increasing,

f (x0 − ε1 ) < f (x0 ) < f (x0 + ε1 ),

and the interval [f (x0 − ε1 ), f (x0 + ε1 )] is in f (I). Let

δ = min{f (x0 ) − f (x0 − ε1 ), f (x0 + ε1 ) − f (x0 )}.

Then δ > 0. If y is a real number such that |y − y0 | < δ, then

f (x0 − ε1 ) ≤ y0 − δ < y < y0 + δ ≤ f (x0 + ε1 ).

This implies that y is also in f (I). Since f⁻¹ is strictly increasing, we have

x0 − ε ≤ x0 − ε1 < f⁻¹ (y) < x0 + ε1 ≤ x0 + ε.

This implies that
|f⁻¹ (y) − f⁻¹ (y0 )| < ε,
which completes the proof that f⁻¹ is continuous at y0 .
If f⁻¹ (y0 ) = x0 is an endpoint of I, we need to modify the proof a bit to
show that f⁻¹ is continuous at y0 . The details are left to the students.

Remark 2.3
If I = (a, b) is an open interval and the function f : (a, b) → R is
continuous and one-to-one, we have seen in Theorem 2.31 that f is strictly
monotonic. In fact, one can prove that f (I) is also an open interval.
Without loss of generality, assume that f is strictly increasing. Since f
is continuous, f (I) is an interval. If f (I) is not an open interval, either
inf f (I) or sup f (I) is in f (I). If c = inf f (I) is in f (I), there is a point
u in (a, b) such that f (u) = c. But then u > a and so u1 = (u + a)/2 is also
a point in (a, b). Since u1 < u, f (u1 ) < f (u) = c. This contradicts
c = inf f (I). In the same way, one can show that sup f (I) is not in f (I).
Hence, f (I) must be an open interval.

Although we can use limits to show that when n is a positive integer, the
function f (x) = \sqrt[n]{x} is continuous, it is tedious. Using Theorem 2.32 is much
more succinct.

Example 2.36
Let n be a positive integer.

1. If n is odd, the function f : R → R, f (x) = x^n is continuous and
one-to-one. Hence, its inverse f⁻¹ : R → R, f⁻¹ (x) = \sqrt[n]{x} is a continuous
function.

2. If n is even, the function f : [0, ∞) → [0, ∞), f (x) = x^n is continuous
and one-to-one. Hence, its inverse f⁻¹ : [0, ∞) → [0, ∞), f⁻¹ (x) =
\sqrt[n]{x} is a continuous function.

Recall that a rational number r can be written as r = p/q, where p is an integer
and q is a positive integer. For a positive real number x, we define x^r by

x^r = \sqrt[q]{x^p} = (\sqrt[q]{x})^p .

It is easy to check that the two expressions for x^r are equal. Using the fact that
composition of continuous functions is continuous, we obtain the following.

Theorem 2.33
Let r be a rational number.

1. If r > 0, f : [0, ∞) → [0, ∞), f (x) = x^r is a strictly increasing
continuous function.

2. If r < 0, f : (0, ∞) → (0, ∞), f (x) = x^r is a strictly decreasing
continuous function.

Exercises 2.6
Question 1
Show that the function f : (−1, ∞) → R, f (x) = (2x + 1)/(x + 1) is strictly
monotonic, and find the inverse function f⁻¹ .
Chapter 3. Differentiating Functions of a Single Variable 146

Chapter 3

Differentiating Functions of a Single Variable

The simplest function is the constant function f (x) = c, whose function value
does not vary with the input. The next class of functions that are relatively easy
to study is a polynomial of degree one f (x) = ax + b, where a ≠ 0. Sometimes
we also call any function of the form f (x) = ax + b a linear function, as its
graph y = ax + b is a straight line in the xy-plane. However, this should not be
confused with the linear functions that are considered in linear algebra, which in the
single variable case, are the functions of the form f (x) = ax.

Definition 3.1 Graph of a Function


If $f : D \to \mathbb{R}$ is a function defined on a subset D of real numbers, its graph is the subset $G_f$ of $\mathbb{R}^2$ defined as
\[ G_f = \{(x, y) \mid x \in D,\ y = f(x)\}. \]

Figure 3.1: The graph of the function f (x) = ax + b when (i) a < 0, (ii) a = 0
and (iii) a > 0.

For the function y = f(x) = ax + b, we find that for any two distinct points $x_1$ and $x_2$,
\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} = \frac{a(x_2 - x_1)}{x_2 - x_1} = a. \]
In other words, the change in the y values,

∆y = y2 − y1 = f (x2 ) − f (x1 )

is proportional to the change in the x-values

∆x = x2 − x1 ,

with proportionality constant a. This constant a is called the rate of change of the
function, and it is the slope of the line y = ax + b. Its magnitude |a| characterizes
how fast y is changing with respect to x, and its sign determines the way y changes.
When a > 0, y increases as x increases. When a < 0, y decreases as x increases.
For a function that is more complicated, such as the quadratic function $y = f(x) = x^2$, we find that
\[ \Delta y = f(x_2) - f(x_1) = x_2^2 - x_1^2 = (x_2 + x_1)(x_2 - x_1) = (x_2 + x_1)\Delta x. \]
In this case, $\Delta y/\Delta x$ is not a constant. It depends on the points $x_1$ and $x_2$.


In real-life scenarios, functions are used to describe the dependence of one variable on another. For example, if one wants to record the distance travelled by an object, the independent variable is the time t, while the dependent variable is the distance s. In this case, one obtains a function s = s(t). The average speed at which the object travels between the time $t = t_1$ and the time $t = t_2$ is
\[ \frac{s(t_2) - s(t_1)}{t_2 - t_1}. \]
In general, one cannot expect this speed to be constant. If we are interested in the instantaneous speed of the object at time $t = t_1$, we can fix $t_1$, take $t_2$ closer and closer to $t_1$, and study the behaviour of the average speed. This leads to the concept of derivatives.

3.1 Derivatives

The derivative of a function y = f(x) is a measure of the rate of change of the y-values with respect to the change in the x-values. To be able to measure this rate of change at a particular point $x_0$, the function has to be defined in a neighbourhood of the point $x_0$. Henceforth, when we define derivatives, we will assume that the function is defined on an open interval (a, b). This includes the case where a is −∞ or b is ∞.

Definition 3.2 Derivatives


Given a function $f : (a, b) \to \mathbb{R}$ and a point $x_0$ in the interval (a, b), the derivative of f at $x_0$ is defined to be the limit
\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]
if it exists. If the limit exists, we say that f is differentiable at $x_0$, and its derivative at $x_0$ is denoted by $f'(x_0)$. Namely,
\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}. \]

Notice that in defining the derivative of a function f(x) at a point $x_0$, the function whose limit we are taking is
\[ g(x) = \frac{f(x) - f(x_0)}{x - x_0}, \]
which is defined on the set $D = (a, b) \setminus \{x_0\}$. It is easy to check that $x_0$ is indeed a limit point of the set D. The function g(x) is the quotient of the function $p(x) = f(x) - f(x_0)$ and the function $q(x) = x - x_0$. It is not defined at $x = x_0$ since $q(x_0) = 0$. Moreover, since $\lim_{x \to x_0} q(x) = q(x_0) = 0$, a necessary condition for f to be differentiable at the point $x_0$ is $\lim_{x \to x_0} p(x) = 0$, which says that the function f(x) is continuous at $x_0$.

Theorem 3.1 Differentiability Implies Continuity

Let x0 be a point in the open interval (a, b). If the function f : (a, b) → R
is differentiable at x0 , it is continuous at x0 .

Proof
If f is differentiable at $x_0$, the limit
\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]
exists. By limit laws,
\[ \lim_{x \to x_0} \bigl(f(x) - f(x_0)\bigr) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \cdot \lim_{x \to x_0} (x - x_0) = f'(x_0) \times 0 = 0. \]
Hence,
\[ \lim_{x \to x_0} f(x) = f(x_0), \]
which proves that f is continuous at $x_0$.

By writing $x = x_0 + h$, where $h = x - x_0$ is the change in the x-values, we can write the derivative of a function $f : (a, b) \to \mathbb{R}$ at a point $x_0$ as
\[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}. \]
The continuity of the function f at the point $x_0$ is then equivalent to
\[ \lim_{h \to 0} f(x_0 + h) = f(x_0). \]

Definition 3.3 Differentiable Functions


We say that a function $f : (a, b) \to \mathbb{R}$ is differentiable if it is differentiable at all points in (a, b). In this case, the derivative of f is the function $f' : (a, b) \to \mathbb{R}$, where
\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]

Let us look at the simplest example where f (x) = ax + b.



Example 3.1

Let $f : \mathbb{R} \to \mathbb{R}$ be the function f(x) = ax + b. For any x and $x_0$ with $x \neq x_0$, we have
\[ \frac{f(x) - f(x_0)}{x - x_0} = a. \]
This implies that
\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = a. \]
Hence, f is a differentiable function and its derivative is
\[ f'(x) = a \quad \text{for all } x \in \mathbb{R}. \]

Now let us look at a quadratic function.

Example 3.2

Let $f : \mathbb{R} \to \mathbb{R}$ be the function $f(x) = x^2$. Show that f is differentiable and find its derivative.

Solution
For any real number x,
\begin{align*}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{2xh + h^2}{h} \\
&= \lim_{h \to 0} (2x + h) \\
&= 2x.
\end{align*}
This shows that f is differentiable and its derivative is $f'(x) = 2x$.
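The limit in Example 3.2 can also be observed numerically. The following is a minimal sketch, assuming floating-point arithmetic is accurate enough for the step sizes chosen; the helper name `difference_quotient` and the sample point x = 3 are illustrative choices, not part of the text.

```python
def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


def square(x):
    return x * x


x = 3.0
steps = [10.0 ** (-k) for k in range(1, 7)]          # h = 0.1, 0.01, ..., 1e-6
quotients = [difference_quotient(square, x, h) for h in steps]

for h, q in zip(steps, quotients):
    # Algebraically the quotient equals 2x + h, so its gap to 2x is about h.
    print(f"h = {h:.0e}   quotient = {q:.8f}   gap to 2x = {abs(q - 2 * x):.2e}")
```

As h shrinks, the printed quotients approach 2x = 6, in agreement with the computation above.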

Finding the derivative means finding the limit of
\[ m_{x;x_0} = \frac{f(x) - f(x_0)}{x - x_0} \]
as x approaches $x_0$. Since $m_{x;x_0}$ is the slope of the secant line joining the two points $(x_0, f(x_0))$ and $(x, f(x))$ on the graph of the function, in the limit $x \to x_0$ we obtain a straight line that, in a neighbourhood of the point $(x_0, f(x_0))$, touches the graph only at this point. This line is called the tangent line of the curve y = f(x) at the point $(x_0, f(x_0))$.

Figure 3.2: Derivative as slope of tangent line.

Definition 3.4 Tangent Line

Let $x_0$ be a point in the interval (a, b). If the function $f : (a, b) \to \mathbb{R}$ is differentiable at $x_0$, then the tangent line to the curve y = f(x) at the point $(x_0, f(x_0))$ is
\[ y = f(x_0) + f'(x_0)(x - x_0). \]

Example 3.3

We have found in Example 3.2 that the derivative of the function $f(x) = x^2$ is $f'(x) = 2x$. At the point x = 3, f(3) = 9 and $f'(3) = 6$. Hence, the equation of the tangent line to the curve $y = x^2$ at the point (3, 9) is
\[ y = 9 + 6(x - 3) = 6x - 9. \]
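The computation in Example 3.3 can be sketched in code. The function name `tangent_line` is a hypothetical helper introduced only for illustration, and the derivative is supplied by hand rather than computed.

```python
def tangent_line(f, df, x0):
    """Return (slope, intercept) of the tangent line y = f(x0) + f'(x0)(x - x0)."""
    slope = df(x0)
    intercept = f(x0) - slope * x0
    return slope, intercept


def f(x):
    return x ** 2


def df(x):
    # Derivative found in Example 3.2.
    return 2 * x


slope, intercept = tangent_line(f, df, 3)
print(f"y = {slope}x + ({intercept})")
```

For $x_0 = 3$ this reproduces the slope 6 and the intercept −9 found above.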

Theorem 3.1 says that if a function $f : (a, b) \to \mathbb{R}$ is differentiable at $x_0$, then it is continuous at $x_0$. A natural question to ask is whether the converse is true. The answer is no, as shown by the following classical example.

Example 3.4

Consider the function $f : \mathbb{R} \to \mathbb{R}$, f(x) = |x|. We have seen in Chapter 2 that this is a continuous function. Let $x_0$ be a point in $\mathbb{R}$.
Case 1: If $x_0 > 0$, then for any x in the neighbourhood (0, ∞) of $x_0$, f(x) = |x| = x. It follows that
\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{x - x_0}{x - x_0} = 1. \]
This implies that f is differentiable at $x_0$ and $f'(x_0) = 1$.
Case 2: If $x_0 < 0$, then for any x in the neighbourhood (−∞, 0) of $x_0$, f(x) = |x| = −x. It follows that
\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{-x - (-x_0)}{x - x_0} = -\lim_{x \to x_0} \frac{x - x_0}{x - x_0} = -1. \]
This implies that f is differentiable at $x_0$ and $f'(x_0) = -1$.
Case 3: When $x_0 = 0$, we find that
\[ \lim_{x \to 0^+} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0^+} \frac{x}{x} = 1, \]
\[ \lim_{x \to 0^-} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0^-} \frac{-x}{x} = -1. \]
This implies that the limit
\[ \lim_{x \to 0} \frac{f(x) - f(0)}{x - 0} \]
does not exist. Hence, f is not differentiable at x = 0.

Graphically, we find that the curve y = |x| has a "sharp turn" at the point
(0, 0), and there is no well-defined tangent there (see Figure 3.3).

Figure 3.3: The graph of the function f (x) = |x| has a "sharp turn" at x = 0.

Remark 3.1 Left Derivatives and Right Derivatives


We can use left limits and right limits to define left derivatives and right
derivatives. Let f : [a, b] → R be a function defined on the closed interval
[a, b].

1. For any $x_0 \in (a, b]$, we say that the function is left-differentiable at $x_0$ provided that the left derivative at $x_0$, $f'_-(x_0)$, defined as the left limit
\[ f'_-(x_0) = \lim_{x \to x_0^-} \frac{f(x) - f(x_0)}{x - x_0}, \]
exists.

2. For any $x_0 \in [a, b)$, we say that the function is right-differentiable at $x_0$ provided that the right derivative at $x_0$, $f'_+(x_0)$, defined as the right limit
\[ f'_+(x_0) = \lim_{x \to x_0^+} \frac{f(x) - f(x_0)}{x - x_0}, \]
exists.

3. For any x0 ∈ (a, b), f is differentiable at x0 if and only if it is both left


differentiable and right differentiable at x0 .

4. We say that the function f : [a, b] → R is differentiable if it is


differentiable at all x0 ∈ (a, b), right differentiable at a and left
differentiable at b.
In the following, we will mainly discuss derivatives of functions defined on open intervals. The extension to closed intervals is straightforward by considering the one-sided derivatives at the end points.

Example 3.5 Example 3.4 Revisited

The function f(x) = |x| is left-differentiable and right-differentiable at x = 0, with
\[ f'_-(0) = -1, \qquad f'_+(0) = 1. \]
It is not differentiable at x = 0 since $f'_-(0) \neq f'_+(0)$.
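The two one-sided derivatives of |x| at 0 can be seen directly from the one-sided difference quotients. The sketch below uses a hypothetical helper `one_sided_quotients`; the step sizes are arbitrary illustrative values.

```python
def one_sided_quotients(f, x0, h):
    """Difference quotients of f at x0 from the left and from the right (h > 0)."""
    left = (f(x0 - h) - f(x0)) / (-h)
    right = (f(x0 + h) - f(x0)) / h
    return left, right


for h in (0.1, 0.01, 0.001):
    left, right = one_sided_quotients(abs, 0.0, h)
    print(f"h = {h}: left quotient = {left}, right quotient = {right}")
```

The left quotient stays at −1 and the right quotient stays at 1 however small h is, so the two-sided limit cannot exist.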

Leibniz Notation for Derivatives


For the function y = f(x), its derivative $f'(x)$ is also denoted by $\dfrac{dy}{dx}$ or $\dfrac{d}{dx}f(x)$.

For example, in Example 3.2, we have shown that
\[ \frac{d}{dx} x^2 = 2x. \]
In the following, we are going to derive derivative formulas. The simplest derivative formula is the one for the function $f(x) = x^n$, where n is a positive integer.

Proposition 3.2

Let n be a positive integer. The function $f(x) = x^n$ is differentiable with derivative
\[ \frac{d}{dx} x^n = n x^{n-1}. \]

Proof
We use the formula
\[ x^n - x_0^n = (x - x_0)\left(x^{n-1} + x^{n-2}x_0 + \cdots + x x_0^{n-2} + x_0^{n-1}\right). \]
Let $f(x) = x^n$. Then
\begin{align*}
f'(x_0) &= \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \\
&= \lim_{x \to x_0} \frac{x^n - x_0^n}{x - x_0} \\
&= \lim_{x \to x_0} \left(x^{n-1} + x^{n-2}x_0 + \cdots + x x_0^{n-2} + x_0^{n-1}\right) \\
&= \underbrace{x_0^{n-1} + x_0^{n-1} + \cdots + x_0^{n-1} + x_0^{n-1}}_{n \text{ terms}} \\
&= n x_0^{n-1}.
\end{align*}

Now let us look at the square root function.

Example 3.6

Determine the points where the function $f : [0, \infty) \to \mathbb{R}$, $f(x) = \sqrt{x}$ is differentiable, and find the derivatives at those points.

Solution
First we consider the case where $x_0 > 0$. When x > 0 and $x \neq x_0$,
\[ \frac{f(x) - f(x_0)}{x - x_0} = \frac{\sqrt{x} - \sqrt{x_0}}{x - x_0} = \frac{1}{\sqrt{x} + \sqrt{x_0}}. \tag{3.1} \]
Hence,
\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{1}{\sqrt{x} + \sqrt{x_0}} = \frac{1}{2\sqrt{x_0}}. \]
This shows that f is differentiable at $x_0$ with derivative $f'(x_0) = \dfrac{1}{2\sqrt{x_0}}$.
For $x_0 = 0$, we can only consider the right derivative. When x > 0, the formula (3.1) still holds. However, the limit
\[ \lim_{x \to 0^+} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0^+} \frac{1}{\sqrt{x}} \]
does not exist. Hence, the function $f(x) = \sqrt{x}$ is not differentiable at x = 0.
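To see why the right limit at 0 fails to exist, one can tabulate the difference quotient, which by (3.1) equals $1/\sqrt{h}$ at x = h. This is only a numerical sketch; the helper name `right_quotient_at_zero` and the step sizes are illustrative assumptions.

```python
import math


def right_quotient_at_zero(f, h):
    """Right-hand difference quotient (f(h) - f(0)) / h for h > 0."""
    return (f(h) - f(0.0)) / h


steps = [10.0 ** (-2 * k) for k in range(1, 6)]       # h = 1e-2, 1e-4, ..., 1e-10
quotients = [right_quotient_at_zero(math.sqrt, h) for h in steps]

for h, q in zip(steps, quotients):
    # The quotient behaves like 1 / sqrt(h), which is unbounded as h -> 0+.
    print(f"h = {h:.0e}   quotient = {q:.1f}")
```

The quotients grow without bound as h decreases, which matches the conclusion that the right derivative at 0 does not exist.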
Figure 3.4: The function $f(x) = \sqrt{x}$ is not differentiable at x = 0.

Using limit laws, one can find derivatives of linear combinations, products and
quotients of functions.

Proposition 3.3 Linearity of Derivatives

Let $x_0$ be a point in (a, b), and let $f : (a, b) \to \mathbb{R}$ and $g : (a, b) \to \mathbb{R}$ be functions that are differentiable at $x_0$. For any constants α and β, the function $\alpha f + \beta g : (a, b) \to \mathbb{R}$ is also differentiable at $x_0$ and
\[ (\alpha f + \beta g)'(x_0) = \alpha f'(x_0) + \beta g'(x_0). \]

Proof
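This is a straightforward derivation from the limit laws. By assumption, we have
\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \quad \text{and} \quad g'(x_0) = \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0}. \]
It follows that
\begin{align*}
(\alpha f + \beta g)'(x_0) &= \lim_{x \to x_0} \frac{(\alpha f + \beta g)(x) - (\alpha f + \beta g)(x_0)}{x - x_0} \\
&= \lim_{x \to x_0} \left( \alpha \frac{f(x) - f(x_0)}{x - x_0} + \beta \frac{g(x) - g(x_0)}{x - x_0} \right) \\
&= \alpha \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} + \beta \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} \\
&= \alpha f'(x_0) + \beta g'(x_0).
\end{align*}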

This formula can be extended to k functions for any positive integer k. If $f_1, \ldots, f_k$ are functions defined on (a, b) and differentiable at the point $x_0$, then for any constants $c_1, \ldots, c_k$, the function $c_1 f_1 + \cdots + c_k f_k$ is also differentiable at $x_0$, and
\[ (c_1 f_1 + \cdots + c_k f_k)'(x_0) = c_1 f_1'(x_0) + \cdots + c_k f_k'(x_0). \]

Proposition 3.4 Product Rule for Derivatives

Let x0 be a point in (a, b). Given that the functions f : (a, b) → R and
g : (a, b) → R are differentiable at x0 , the function (f g) : (a, b) → R is
also differentiable at x0 and

(f g)′ (x0 ) = f ′ (x0 )g(x0 ) + f (x0 )g ′ (x0 ).

Proof
Again, we are given that
\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \quad \text{and} \quad g'(x_0) = \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0}. \]
Since differentiability implies continuity, we have
\[ \lim_{x \to x_0} f(x) = f(x_0) \quad \text{and} \quad \lim_{x \to x_0} g(x) = g(x_0). \]
Just as in the proof of the product rule for limits, we need to do some manipulation:
\[ f(x)g(x) - f(x_0)g(x_0) = (f(x) - f(x_0))g(x_0) + f(x)(g(x) - g(x_0)). \]
It follows that
\begin{align*}
(fg)'(x_0) &= \lim_{x \to x_0} \frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} \\
&= \lim_{x \to x_0} \left( \frac{f(x) - f(x_0)}{x - x_0}\, g(x_0) + f(x)\, \frac{g(x) - g(x_0)}{x - x_0} \right) \\
&= \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \lim_{x \to x_0} g(x_0) + \lim_{x \to x_0} f(x) \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} \\
&= f'(x_0)g(x_0) + f(x_0)g'(x_0).
\end{align*}

For a different perspective, we denote $f(x_0)$ and $g(x_0)$ by u and v respectively, and let
\[ \Delta u = f(x) - f(x_0), \qquad \Delta v = g(x) - g(x_0). \]
Then
\begin{align*}
f(x)g(x) - f(x_0)g(x_0) &= (u + \Delta u)(v + \Delta v) - uv \\
&= v\Delta u + u\Delta v + \Delta u \Delta v.
\end{align*}
After dividing by $\Delta x = x - x_0$, the term $\Delta u \Delta v/\Delta x$ vanishes in the limit $\Delta x \to 0$, and we obtain the product rule.

General Product Rule


The product rule for derivatives can be expressed as
\[ \frac{d}{dx}(uv) = v\frac{du}{dx} + u\frac{dv}{dx}. \]
In general, when we have k functions $u_1, u_2, \ldots, u_k$, the product rule says that
\[ \frac{d}{dx}(u_1 u_2 \cdots u_k) = (u_2 \cdots u_k)\frac{du_1}{dx} + (u_1 u_3 \cdots u_k)\frac{du_2}{dx} + \cdots + (u_1 \cdots u_{k-1})\frac{du_k}{dx}. \]
This can be proved by induction on k.

Notice that the formula
\[ \frac{d}{dx} x^n = n x^{n-1}, \quad \text{where } n \in \mathbb{Z}^+, \]
follows from the general product rule and $\dfrac{d}{dx} x = 1$.
Finally we turn to the quotient rule.

Proposition 3.5 Quotient Rule for Derivatives

Let $x_0$ be a point in (a, b). Suppose that the functions $f : (a, b) \to \mathbb{R}$ and $g : (a, b) \to \mathbb{R}$ are differentiable at $x_0$, and $g(x) \neq 0$ for all x in (a, b). Then the function $(f/g) : (a, b) \to \mathbb{R}$ is differentiable at $x_0$ and
\[ \left(\frac{f}{g}\right)'(x_0) = \frac{f'(x_0)g(x_0) - f(x_0)g'(x_0)}{g(x_0)^2}. \]
The assumption $g(x) \neq 0$ for all x in (a, b) is to make sure that the function f/g is well-defined on (a, b). In practice, we only need $g(x_0) \neq 0$ and g to be differentiable at $x_0$. Indeed, g is then continuous at $x_0$, so the assumption $g(x_0) \neq 0$ implies that $g(x) \neq 0$ in a neighbourhood of $x_0$.

Proof
First, notice that
\begin{align*}
\frac{f(x)}{g(x)} - \frac{f(x_0)}{g(x_0)} &= \frac{f(x)g(x_0) - f(x_0)g(x)}{g(x)g(x_0)} \\
&= \frac{(f(x) - f(x_0))g(x_0) - f(x_0)(g(x) - g(x_0))}{g(x)g(x_0)}.
\end{align*}
Using the same reasoning as in the proof of the product rule, we obtain
\begin{align*}
\left(\frac{f}{g}\right)'(x_0) &= \lim_{x \to x_0} \frac{1}{g(x)g(x_0)} \times \left( g(x_0) \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} - f(x_0) \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} \right) \\
&= \frac{f'(x_0)g(x_0) - f(x_0)g'(x_0)}{g(x_0)^2}.
\end{align*}

Again, using the u and v notation, we have
\[ \frac{u + \Delta u}{v + \Delta v} - \frac{u}{v} = \frac{(u + \Delta u)v - (v + \Delta v)u}{v(v + \Delta v)} = \frac{v\Delta u - u\Delta v}{v(v + \Delta v)}. \]
This gives a different perspective on the quotient rule.


Let us use the quotient rule to derive the derivative of $f(x) = x^n$ when n is a negative integer.

Proposition 3.6
For any integer n,
\[ \frac{d}{dx} x^n = n x^{n-1}. \tag{3.2} \]
dx

Proof
We have proved the formula (3.2) when n ≥ 0. When n < 0, let m = −n. Then m is a positive integer. By the quotient rule, we have
\[ \frac{d}{dx} x^n = \frac{d}{dx} \frac{1}{x^m} = \frac{x^m \dfrac{d}{dx} 1 - 1 \cdot \dfrac{d}{dx} x^m}{x^{2m}} = -\frac{m x^{m-1}}{x^{2m}} = -\frac{m}{x^{m+1}} = n x^{n-1}. \]
Hence, the formula (3.2) also holds when n is a negative integer.

Definition 3.5 Higher Order Derivatives

If the function $f : (a, b) \to \mathbb{R}$ is differentiable, its derivative $f' : (a, b) \to \mathbb{R}$ is also a function defined on (a, b). We can investigate whether $f'$ is differentiable. If $f'$ is differentiable at a point $x_0$ in (a, b), we denote its derivative by $f''(x_0)$, called the second (order) derivative of the function f at $x_0$.
In the same way, we can define the $n^{\text{th}}$-order derivative of the function y = f(x) at a point $x_0$ for any positive integer n. We use the notation
\[ f^{(n)}(x) \quad \text{or} \quad \frac{d^n y}{dx^n} \]
to denote the $n^{\text{th}}$-order derivative of the function y = f(x). It is defined recursively by
\[ f^{(n)}(x) = \lim_{h \to 0} \frac{f^{(n-1)}(x + h) - f^{(n-1)}(x)}{h}, \]
where by default, $f^{(0)}(x) = f(x)$.
We say that a function $f : (a, b) \to \mathbb{R}$ is n times differentiable if $f^{(n)}(x)$ exists for all x in (a, b). A function is infinitely differentiable if it is n times differentiable for any positive integer n.

Example 3.7
Polynomial functions are infinitely differentiable. Moreover, if the degree of a polynomial p(x) is n, then $p^{(k)}(x) = 0$ for all k ≥ n + 1.

Example 3.8
Define the function $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) = \begin{cases} ax^2, & \text{if } x < 1, \\[4pt] x + \dfrac{b}{x}, & \text{if } x \geq 1. \end{cases} \]
Find the values of a and b so that f is differentiable.

Solution
The function f is differentiable at any point $x_0$ in the interval (−∞, 1) or the interval (1, ∞).
For f to be differentiable, it must also be differentiable, and hence continuous, at x = 1. For f to be continuous at x = 1, we must have
\[ \lim_{x \to 1^-} f(x) = \lim_{x \to 1^+} f(x). \]
This gives
\[ a = 1 + b. \]
For f to be differentiable at x = 1, we must have
\[ \lim_{x \to 1^-} \frac{f(x) - f(1)}{x - 1} = \lim_{x \to 1^+} \frac{f(x) - f(1)}{x - 1}. \]
Notice that
\[ \lim_{x \to 1^-} \frac{f(x) - f(1)}{x - 1} = \left.\frac{d}{dx}\right|_{x=1} ax^2 = 2a, \]
\[ \lim_{x \to 1^+} \frac{f(x) - f(1)}{x - 1} = \left.\frac{d}{dx}\right|_{x=1} \left(x + \frac{b}{x}\right) = 1 - b. \]
Hence, we must have
\[ 2a = 1 - b. \]
Solving for a and b, we have
\[ a = \frac{2}{3}, \qquad b = -\frac{1}{3}. \]
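The values found in Example 3.8 can be checked with exact rational arithmetic. This is a sanity check under the assumption that a small step h is representative of the limit; the names `quotient`, `left` and `right` are illustrative.

```python
from fractions import Fraction

a, b = Fraction(2, 3), Fraction(-1, 3)


def f(x):
    """The piecewise function of Example 3.8 with the values of a and b found above."""
    return a * x ** 2 if x < 1 else x + b / x


def quotient(x):
    """Difference quotient of f at the point 1, for x != 1."""
    return (f(x) - f(1)) / (x - 1)


h = Fraction(1, 10 ** 6)
left, right = quotient(1 - h), quotient(1 + h)

print("continuity condition a = 1 + b holds:", a == 1 + b)
print("left quotient :", float(left))
print("right quotient:", float(right))
```

Both one-sided quotients are close to 2a = 1 − b = 4/3, consistent with f being differentiable at x = 1 for these values.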

Figure 3.5: The function f (x) defined in Example 3.8.



Exercises 3.1
Question 1
Define the function $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) = \begin{cases} ax^2 + x, & \text{if } x < 1, \\[4pt] bx + \dfrac{3}{x^2}, & \text{if } x \geq 1. \end{cases} \]
Find the values of a and b so that f is differentiable.

Question 2
Let $x_0$ be a point in (a, b). Suppose that $f : [a, b] \to \mathbb{R}$ is a continuous function defined on [a, b] and differentiable at $x_0$. Let $g : [a, b] \to \mathbb{R}$ be the function defined by
\[ g(x) = \begin{cases} \dfrac{f(x) - f(x_0)}{x - x_0}, & \text{if } x \in [a, b] \setminus \{x_0\}, \\[6pt] f'(x_0), & \text{if } x = x_0. \end{cases} \]
Show that $g : [a, b] \to \mathbb{R}$ is a continuous function.

Question 3
Let x0 be a point in (a, b) and let f : (a, b) → R be a function defined on
(a, b).

(a) If $f : (a, b) \to \mathbb{R}$ is differentiable at $x_0$, show that
\[ \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0 - h)}{2h} = f'(x_0). \]

(b) If the limit
\[ \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0 - h)}{2h} \]
exists, is f necessarily differentiable at $x_0$?

3.2 Chain Rule and Derivatives of Inverse Functions

In this section, we are going to derive derivative formulas for composite functions
and inverse functions. First we discuss a different perspective for differentiability
of a function at a point.

Differentiability
Let $x_0$ be a point in the interval (a, b) and let $f : (a, b) \to \mathbb{R}$ be a function defined on (a, b). If f is differentiable at $x_0$, then
\[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}. \]
This implies that
\[ \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0) - f'(x_0)h}{h} = 0. \]
Conversely, if there is a number c such that
\[ \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0) - ch}{h} = 0, \]
limit laws imply that
\[ c = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}. \]
This implies that f is differentiable at $x_0$ and $f'(x_0) = c$. In other words, the function f is differentiable at $x_0$ if and only if there is a number c such that
\[ \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0) - ch}{h} = 0. \]
Since $x_0 \in (a, b)$, there is an r > 0 such that $(x_0 - r, x_0 + r) \subset (a, b)$. For a given real number c, let ε be the function defined for $h \in (-r, r)$, $h \neq 0$, by
\[ \varepsilon(h) = \frac{f(x_0 + h) - f(x_0) - ch}{h}. \]
Then the differentiability of f at $x_0$ is equivalent to $\lim_{h \to 0} \varepsilon(h) = 0$. Hence, $f : (a, b) \to \mathbb{R}$ is differentiable at $x_0$ if and only if there is a number c and a function ε(h) such that
\[ f(x_0 + h) = f(x_0) + ch + h\varepsilon(h), \]
and
\[ \varepsilon(h) \to 0 \quad \text{when } h \to 0. \]
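The role of the number c in this characterization can be illustrated with a concrete function. The sketch below takes $f(x) = x^3$ at $x_0 = 1$ (an illustrative choice, not from the text) and compares the error term ε(h) for the correct value c = 3 with that for a wrong value c = 2, using exact rational arithmetic.

```python
from fractions import Fraction


def epsilon(f, x0, c, h):
    """Error term in f(x0 + h) = f(x0) + c*h + h*epsilon(h), defined for h != 0."""
    return (f(x0 + h) - f(x0) - c * h) / h


def cube(x):
    return x ** 3


x0 = Fraction(1)
for k in range(1, 5):
    h = Fraction(1, 10 ** k)
    with_derivative = epsilon(cube, x0, 3, h)   # equals 3h + h^2, tends to 0
    with_wrong_c = epsilon(cube, x0, 2, h)      # equals 1 + 3h + h^2, tends to 1
    print(f"h = {float(h):.0e}   c = 3: {float(with_derivative):.6f}   "
          f"c = 2: {float(with_wrong_c):.6f}")
```

Only for c = f′(1) = 3 does ε(h) tend to 0, which is exactly the statement that the derivative is the unique such number c.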

Theorem 3.7 Chain Rule


Let $f : (a, b) \to \mathbb{R}$ and $g : (c, d) \to \mathbb{R}$ be functions such that $f((a, b)) \subset (c, d)$. If $x_0$ is a point in (a, b), f is differentiable at $x_0$, and g is differentiable at $f(x_0)$, then the composite function $(g \circ f) : (a, b) \to \mathbb{R}$ is differentiable at $x_0$ and
\[ (g \circ f)'(x_0) = g'(f(x_0))\, f'(x_0). \]

Proof
Let $y_0 = f(x_0)$, and define the functions $\varepsilon_1(h)$ and $\varepsilon_2(k)$ for h ≠ 0 and k ≠ 0 by
\[ \varepsilon_1(h) = \frac{f(x_0 + h) - f(x_0) - f'(x_0)h}{h}, \qquad \varepsilon_2(k) = \frac{g(y_0 + k) - g(y_0) - g'(y_0)k}{k}, \]
and set $\varepsilon_2(0) = 0$. Since f is differentiable at $x_0$ and g is differentiable at $y_0$, we have $\lim_{h \to 0} \varepsilon_1(h) = 0$ and $\lim_{k \to 0} \varepsilon_2(k) = 0 = \varepsilon_2(0)$, so that $\varepsilon_2$ is continuous at k = 0. Let
\[ k(h) = f(x_0 + h) - f(x_0) = f'(x_0)h + \varepsilon_1(h)h. \]
Then by the definitions of $\varepsilon_1(h)$ and $\varepsilon_2(k)$,
\begin{align*}
(g \circ f)(x_0 + h) - (g \circ f)(x_0) &= g(y_0 + k(h)) - g(y_0) \\
&= g'(y_0)k(h) + \varepsilon_2(k(h))k(h) \\
&= g'(y_0)f'(x_0)h + \varepsilon_3(h)h,
\end{align*}
where
\[ \varepsilon_3(h) = g'(y_0)\varepsilon_1(h) + \varepsilon_2(k(h))\,\frac{k(h)}{h}. \]
Since f is differentiable at $x_0$,
\[ \lim_{h \to 0} \frac{k(h)}{h} = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} = f'(x_0). \]
This implies that $\lim_{h \to 0} k(h) = 0$. By the limit law for composite functions,
\[ \lim_{h \to 0} \varepsilon_2(k(h)) = \lim_{k \to 0} \varepsilon_2(k) = 0. \]
Limit laws then imply that
\[ \lim_{h \to 0} \frac{(g \circ f)(x_0 + h) - (g \circ f)(x_0) - g'(y_0)f'(x_0)h}{h} = \lim_{h \to 0} \varepsilon_3(h) = 0. \]
This proves that the function $g \circ f$ is differentiable at $x_0$ and
\[ (g \circ f)'(x_0) = g'(y_0)f'(x_0) = g'(f(x_0))\, f'(x_0). \]

Heuristically, if we let u = f(x) and y = g(u) = g(f(x)), the chain rule says that
\[ \frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx}, \]
which is the limit of
\[ \frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} \times \frac{\Delta u}{\Delta x} \]
when $\Delta x \to 0$. The rigorous proof given above does not use this because we might face the problem that $\Delta u = f(x) - f(x_0)$ can be zero even when $x \neq x_0$.
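A quick numerical check of the chain rule can be made with a difference quotient. The functions $f(x) = x^2 + 1$ and $g(u) = u^3$, the point $x_0 = 1$ and the step size are illustrative choices; this is a sketch, not a substitute for the proof.

```python
def f(x):
    return x ** 2 + 1


def df(x):
    return 2 * x


def g(u):
    return u ** 3


def dg(u):
    return 3 * u ** 2


def composite(x):
    return g(f(x))


x0 = 1.0
h = 1e-6

# Difference quotient of the composite function g(f(x)) at x0.
numerical = (composite(x0 + h) - composite(x0)) / h
# Value predicted by the chain rule: g'(f(x0)) * f'(x0).
chain_rule = dg(f(x0)) * df(x0)

print("difference quotient:", numerical)
print("chain rule value   :", chain_rule)
```

Here the chain rule gives $g'(f(1))f'(1) = 3 \cdot 2^2 \cdot 2 = 24$, and the difference quotient agrees with this value up to an error of order h.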

Example 3.9
Let f : R → R be a differentiable function and let a be a constant. Show
that the function g : R → R defined by g(x) = f (ax) is differentiable, and
g ′ (x) = af ′ (ax).

Solution
The function u : R → R, u(x) = ax is differentiable with u′ (x) = a. By
chain rule, the function g(x) = (f ◦ u)(x) is also differentiable and

g ′ (x) = f ′ (u(x))u′ (x) = af ′ (ax).

Example 3.10

Given that the function $f : (0, 2) \to \mathbb{R}$ is differentiable at x = 1 and $f'(1) = a$, find the value of
\[ \lim_{x \to 1} \frac{f(x^3) - f(1)}{x - 1} \]
in terms of a.

Solution
Let $g(x) = x^3$. Then g(1) = 1 and g is differentiable at x = 1 with $g'(1) = 3$. We have
\[ \lim_{x \to 1} \frac{f(x^3) - f(1)}{x - 1} = \lim_{x \to 1} \frac{(f \circ g)(x) - (f \circ g)(1)}{x - 1}. \]
Since g is differentiable at x = 1 and f is differentiable at g(1), the chain rule implies that
\[ \lim_{x \to 1} \frac{f(x^3) - f(1)}{x - 1} = (f \circ g)'(1) = f'(g(1))\, g'(1) = 3f'(1) = 3a. \]

Recall that we have proved in Section 2.6 that if I is an interval and $f : I \to \mathbb{R}$ is strictly monotonic and continuous, then f is invertible and $f^{-1} : f(I) \to \mathbb{R}$ is also continuous. Strict monotonicity is a necessary and sufficient condition for a continuous function on an interval to be one-to-one. If $x_0$ is a point in the interior of I, and f is differentiable at $x_0$, we can ask whether the inverse function $f^{-1}$ is differentiable at the point $y_0 = f(x_0)$. Since $(f^{-1} \circ f)(x) = x$ for all $x \in I$, if $f^{-1}$ is differentiable at $y_0$, the chain rule implies that
\[ (f^{-1})'(y_0)\, f'(x_0) = (f^{-1})'(f(x_0))\, f'(x_0) = 1. \]
Therefore, a necessary condition for $f^{-1}$ to be differentiable at $y_0$ is that $f'(x_0)$ is nonzero. In the following theorem, we show that this condition is also sufficient.

Theorem 3.8 Derivative for Inverse Function


Let I be an open interval containing $x_0$, and let $f : I \to \mathbb{R}$ be a function that is strictly monotonic and continuous. If f is differentiable at $x_0$ and $f'(x_0) \neq 0$, the inverse function $f^{-1} : f(I) \to \mathbb{R}$ is differentiable at $y_0 = f(x_0)$, and
\[ (f^{-1})'(y_0) = \frac{1}{f'(x_0)}. \]

The formula for $(f^{-1})'(y_0)$ would follow from the chain rule if we knew a priori that $f^{-1}$ is differentiable at $y_0$. The gist of this theorem is that $f^{-1}$ is indeed differentiable at $y_0$.

Proof
Without loss of generality, assume that f is strictly increasing. By Theorem 2.32, $f^{-1} : f(I) \to \mathbb{R}$ is also continuous. There is a δ > 0 so that $[x_0 - \delta, x_0 + \delta] \subset I$. Then $(f(x_0 - \delta), f(x_0 + \delta))$ is an open interval in f(I) containing the point $y_0$. This implies that there is an r > 0 so that $(y_0 - r, y_0 + r) \subset f(I)$. For any $k \in (-r, r)$, let
\[ h(k) = f^{-1}(y_0 + k) - f^{-1}(y_0). \]
Then h is a strictly increasing continuous function of k and $\lim_{k \to 0} h(k) = 0$. Notice that
\[ y_0 + k = f(x_0 + h(k)). \]
Therefore,
\[ \frac{f^{-1}(y_0 + k) - f^{-1}(y_0)}{k} = \frac{h(k)}{f(x_0 + h(k)) - f(x_0)}. \]
Hence, by limit laws for quotients and composite functions, we find that
\begin{align*}
\lim_{k \to 0} \frac{f^{-1}(y_0 + k) - f^{-1}(y_0)}{k} &= \frac{1}{\displaystyle \lim_{k \to 0} \frac{f(x_0 + h(k)) - f(x_0)}{h(k)}} \\
&= \frac{1}{\displaystyle \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}} \\
&= \frac{1}{f'(x_0)}.
\end{align*}
This proves that $f^{-1}$ is differentiable at $y_0$ and
\[ (f^{-1})'(y_0) = \frac{1}{f'(x_0)}. \]

As a corollary, we have the following.

Corollary 3.9
Let I be an open interval, and let $f : I \to \mathbb{R}$ be a strictly monotonic differentiable function. If $f'(x) \neq 0$ for all $x \in I$, then the inverse function $f^{-1} : f(I) \to \mathbb{R}$ is also a strictly monotonic differentiable function with
\[ (f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))}. \]
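The formula in Theorem 3.8 can be tested numerically on a function whose inverse has no simple closed form. The sketch below uses $f(x) = x^3 + x$, which is strictly increasing with $f'(x) = 3x^2 + 1 \neq 0$; this function, the bisection routine `inverse`, and the bracket [−10, 10] are illustrative assumptions, not part of the text.

```python
def f(x):
    return x ** 3 + x


def df(x):
    return 3 * x ** 2 + 1


def inverse(y, lo=-10.0, hi=10.0, iterations=100):
    """Approximate f^{-1}(y) by bisection.

    Assumes f is strictly increasing and f(lo) <= y <= f(hi).
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


x0 = 1.0
y0 = f(x0)          # y0 = 2
k = 1e-5

# Difference quotient of the inverse function at y0.
numerical = (inverse(y0 + k) - inverse(y0)) / k
# Value predicted by Theorem 3.8: 1 / f'(x0).
formula = 1 / df(x0)

print("difference quotient of the inverse:", numerical)
print("1 / f'(x0)                        :", formula)
```

With $x_0 = 1$ we have $f'(1) = 4$, so the theorem predicts $(f^{-1})'(2) = 1/4$, and the difference quotient is close to this value.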

Example 3.11

Let r be a rational number, and let $f : (0, \infty) \to \mathbb{R}$ be the function $f(x) = x^r$. Show that f is differentiable and
\[ f'(x) = r x^{r-1}. \]

Solution
First we consider the case r = 1/n, where n is a positive integer. The function $f(x) = x^{1/n}$ is the inverse of the function $g(x) = x^n$, which is differentiable and strictly increasing. Hence, $f(x) = x^{1/n}$ is differentiable and strictly increasing. Moreover, since $g'(x) = n x^{n-1}$, we have
\[ f'(x) = \frac{1}{g'(f(x))} = \frac{1}{g'(x^{1/n})} = \frac{1}{n (x^{1/n})^{n-1}} = \frac{1}{n} x^{\frac{1}{n} - 1}. \]
Now for a general rational number r, there is an integer p and a positive integer q such that r = p/q. It follows that
\[ f(x) = (x^p)^{1/q} = (g \circ h)(x), \]
where
\[ h(x) = x^p, \qquad g(x) = x^{1/q}. \]
By Proposition 3.6, $h'(x) = p x^{p-1}$. We have just shown that $g'(x) = \dfrac{1}{q} x^{\frac{1}{q} - 1}$. By the chain rule,
\[ f'(x) = g'(h(x))\, h'(x) = \frac{1}{q} (x^p)^{\frac{1}{q} - 1} \times p x^{p-1} = \frac{p}{q} x^{\frac{p}{q} - 1} = r x^{r-1}. \]

Exercises 3.2
Question 1
Given that the function $f : (0, \infty) \to \mathbb{R}$ is defined by
\[ f(x) = \frac{1}{\sqrt{4 + x^2}}. \]
(a) Show that f is one-to-one.

(b) Show that f is differentiable.

(c) Show that f −1 exists and is differentiable.

(d) Find f −1 (x) and (f −1 )′ (x).

Question 2
Let a be a positive number. Recall that a function f : (−a, a) → R is even
if and only if

f (−x) = f (x) for all x ∈ (−a, a);

and a function f : (−a, a) → R is odd if and only if

f (−x) = −f (x) for all x ∈ (−a, a).

Let f : (−a, a) → R be a differentiable function.

(a) If f is even, show that f ′ is odd.

(b) If f is odd, show that f ′ is even.



3.3 The Mean Value Theorem and Local Extrema

The mean value theorem is one of the most important theorems in analysis. We
will first prove a special case of the mean value theorem called Rolle’s theorem.
To prove this, we need the extreme value theorem, which asserts the existence
of global maximum and global minimum for a continuous function defined on a
closed and bounded interval. As a matter of fact, what we actually need is a local
extremum, which we define as follows.

Definition 3.6 Local Maximum and Local Minimum


Let D be a subset of real numbers that contains the point x0 , and let f :
D → R be a function defined on D.

1. The point x0 is a local maximizer of f provided that there is a δ > 0


such that for all x in D with |x − x0 | < δ, we have

f (x) ≤ f (x0 ).

The value f (x0 ) is then a local maximum value of f .

2. The point x0 is a local minimizer of f provided that there is a δ > 0 such


that for all x in D with |x − x0 | < δ, we have

f (x) ≥ f (x0 ).

The value f (x0 ) is then a local minimum value of f .

3. The point x0 is a local extremizer if it is a local maximizer or a local


minimizer. The value f (x0 ) is a local extreme value if it is a local
maximum value or a local minimum value.

The definition of local extremum that we give here is quite general. We do


not impose conditions on the set D, nor require x0 to be an interior point of D.
Other mathematicians might define it differently. Under our definition, a global
extremum of a function is also a local extremum of the function.
The derivative is a useful tool in the search for local extrema. When a local extremizer of a function is an interior point of the domain, and f is differentiable at that point, the derivative of the function can only be zero at that point.

Figure 3.6: The function y = f(x) has local maxima at the points B and D, and local minima at the points A and C. The point A is also where the global minimum occurs, while the point B is where the global maximum occurs.

Theorem 3.10
Let (a, b) be a neighbourhood of the point x0 , and let f : (a, b) → R
be a function defined on (a, b). If x0 is a local extremizer of f , and f is
differentiable at x0 , then f ′ (x0 ) = 0.

Proof
Without loss of generality, assume that $x_0$ is a local maximizer. Then there is a δ > 0 such that $(x_0 - \delta, x_0 + \delta) \subset (a, b)$, and for all x in $(x_0 - \delta, x_0 + \delta)$, $f(x) \leq f(x_0)$. Since f is differentiable at $x_0$, the limit
\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]
exists and is equal to $f'(x_0)$. This implies that the left limit and the right limit both exist and are both equal to $f'(x_0)$. Namely,
\[ f'(x_0) = \lim_{x \to x_0^-} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0^+} \frac{f(x) - f(x_0)}{x - x_0}. \]
For the left limit, when x is in $(x_0 - \delta, x_0)$, $x - x_0 < 0$ and $f(x) - f(x_0) \leq 0$. Therefore,
\[ \frac{f(x) - f(x_0)}{x - x_0} \geq 0 \quad \text{when } x \in (x_0 - \delta, x_0). \]
Taking the limit $x \to x_0^-$, we find that $f'(x_0) \geq 0$. For the right limit, when x is in $(x_0, x_0 + \delta)$, $x - x_0 > 0$ but $f(x) - f(x_0) \leq 0$. Therefore,
\[ \frac{f(x) - f(x_0)}{x - x_0} \leq 0 \quad \text{when } x \in (x_0, x_0 + \delta). \]
Taking the limit $x \to x_0^+$, we find that $f'(x_0) \leq 0$. Since the left limit shows that $f'(x_0) \geq 0$ while the right limit shows that $f'(x_0) \leq 0$, we conclude that $f'(x_0) = 0$.

This theorem gives a necessary condition for a function f : (a, b) → R to


have a local extremum at a point where it is differentiable. Notice that it cannot
be applied if the local extremizer is not an interior point of the domain.

Definition 3.7 Stationary Points


Let D be a subset of real numbers and let f : D → R be a function defined
on D. If x0 is an interior point of D, f is differentiable at x0 and f ′ (x0 ) = 0,
we call x0 a stationary point of the function f .

Hence, Theorem 3.10 says that if $x_0$ is an interior point of D, and the function $f : D \to \mathbb{R}$ is differentiable at $x_0$, a necessary condition for $x_0$ to be a local extremizer of the function f is that $x_0$ is a stationary point. Nevertheless, this condition is not sufficient. For example, the function $f(x) = x^3$ has a stationary point at x = 0, but x = 0 is not a local extremizer of the function.
Now let us return to the mean value theorem. As a motivation, let us consider the distance s travelled by an object as a function of time t. We have discussed in Section 3.1 that to find the instantaneous speed of the object at a particular time $t_0$, we first find the average speed over the time interval from $t_0$ to $t_0 + \Delta t$, and take the limit $\Delta t \to 0$. Namely, the instantaneous speed at time $t_0$ is
\[ \lim_{\Delta t \to 0} \frac{s(t_0 + \Delta t) - s(t_0)}{\Delta t}, \]
which is precisely $s'(t_0)$, the derivative of s(t) at $t = t_0$. The mean value theorem asserts that the average speed of the object in a time interval $[t_1, t_2]$ must equal the instantaneous speed $s'(t_0)$ for some $t_0$ in that interval. Intuitively, this is something one would expect to be true.
Now let us prove a special case of the mean value theorem.

Theorem 3.11 Rolle’s Theorem


Let f : [a, b] → R be a function that satisfies the following conditions.

(i) f : [a, b] → R is continuous.

(ii) f : (a, b) → R is differentiable.

(iii) f (a) = f (b).

Then there is a point x0 in (a, b) such that f ′ (x0 ) = 0.

Proof
Since f : [a, b] → R is a continuous function defined on a closed
and bounded interval, the extreme value theorem says that it must have
a minimum value and a maximum value. In other words, there are two points
x1 and x2 in [a, b] such that
\[
f(x_1) \le f(x) \le f(x_2) \quad \text{for all } x \in [a, b].
\]
Notice that x1 and x2 are also local extremizers of the function f : [a, b] →
R. If f(x1) = f(x2), then f is a constant function. In this case, f′(x0) = 0
for all x0 in (a, b). If f(x1) ≠ f(x2), then f(x1) < f(x2). Since f(a) =
f(b), at least one of x1 and x2 must be in the open interval (a, b). In other words,
there is a local extremizer x0 in the interval (a, b). Since f is differentiable
at x0, Theorem 3.10 says that we must have f′(x0) = 0. In either case,
there is an x0 in (a, b) satisfying f′(x0) = 0.

Now we can prove the mean value theorem.


Figure 3.7: Rolle’s theorem.

Theorem 3.12 Mean Value Theorem


Let f : [a, b] → R be a function that satisfies the following conditions.

(i) f : [a, b] → R is continuous.

(ii) f : (a, b) → R is differentiable.

Then there is a point x0 in (a, b) such that

\[
f'(x_0) = \frac{f(b) - f(a)}{b - a}.
\]

The mean value theorem stated in Theorem 3.12 is also referred to as Lagrange’s
mean value theorem. Notice that Rolle’s theorem is a special case of the mean
value theorem where f (a) = f (b). The quantity

\[
\frac{f(b) - f(a)}{b - a}
\]
gives the average rate of change of the function f(x) over the interval [a, b],
and the mean value theorem says that this average rate of change is equal to
the rate of change at a particular point. To prove the mean value theorem, we
apply a transformation to the function f(x) to get a function g(x) that satisfies the
conditions in Rolle’s theorem.

Proof
Let g : [a, b] → R be the function defined by

g(x) = f (x) − mx,

where the constant m is determined by g(a) = g(b). This gives

f (a) − ma = f (b) − mb,

and so
\[
m = \frac{f(b) - f(a)}{b - a}.
\]
Notice that the function g : [a, b] → R is continuous, and g : (a, b) → R is
differentiable with
\[
g'(x) = f'(x) - m = f'(x) - \frac{f(b) - f(a)}{b - a}.
\]
By construction, g(a) = g(b). Hence, we can apply Rolle’s theorem to the
function g and conclude that there is a point x0 in (a, b) such that g′(x0) = 0.
For this point x0,
\[
f'(x_0) = \frac{f(b) - f(a)}{b - a}.
\]
This proves the mean value theorem.

Figure 3.8: The mean value theorem.
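
As a numerical illustration of Theorem 3.12 (a sketch only, not part of the proof), the following Python snippet searches for a point x0 at which the derivative equals the average rate of change. The helper name mvt_point, the sample function f(x) = x^3 on [0, 2] and the use of bisection are all assumptions made for this illustration. Bisection needs f′ to be continuous, with f′ − m changing sign between the end points, which is more than the theorem requires.

```python
def mvt_point(f, df, a, b, tol=1e-12):
    """Bisection search for x0 in (a, b) with df(x0) = (f(b) - f(a)) / (b - a).

    Assumption (stronger than Theorem 3.12): df is continuous on [a, b] and
    df - m takes opposite signs at a and b, where m is the average slope.
    """
    m = (f(b) - f(a)) / (b - a)          # average rate of change over [a, b]
    lo, hi = a, b
    g_lo = df(lo) - m
    if g_lo * (df(hi) - m) > 0:
        raise ValueError("df - m does not change sign at the end points")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = df(mid) - m
        if g_lo * g_mid <= 0:            # sign change lies in [lo, mid]
            hi = mid
        else:                            # sign change lies in [mid, hi]
            lo, g_lo = mid, g_mid
    return (lo + hi) / 2


f = lambda x: x**3
df = lambda x: 3 * x**2
x0 = mvt_point(f, df, 0.0, 2.0)          # average slope is (8 - 0) / 2 = 4
print(x0, df(x0))
```

For this sample function the equation 3x^2 = 4 can be solved by hand, giving x0 = 2/√3, so the numerical value can be compared with the exact one.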

Notice that for the mean value theorem to hold, the function f : [a, b] → R does
not need to be differentiable at the end points of the interval [a, b], and the point
x0 is guaranteed to be a point in the interior of the interval.
The mean value theorem has very wide applications. We will discuss a few in
this section.
Recall that the derivative of a constant function is 0. The converse is not
obvious, but it is an easy consequence of the mean value theorem.

Lemma 3.13
If the function f : [a, b] → R is continuous on [a, b], differentiable on (a, b),
and f ′ (x) = 0 for all x ∈ (a, b), then f is a constant function.

Proof
Take any x ∈ (a, b]. Then f is continuous on [a, x] and differentiable on (a, x).
Therefore, we can apply the mean value theorem to conclude that there is a
point c in (a, x) such that
\[
\frac{f(x) - f(a)}{x - a} = f'(c) = 0.
\]
This proves that f(x) = f(a). Therefore, the function f is a constant.

From this, we immediately obtain the following.

Theorem 3.14
Assume that the functions f : [a, b] → R and g : [a, b] → R are continuous
on [a, b], differentiable on (a, b), and

f ′ (x) = g ′ (x) for all x ∈ (a, b).

Then there is a constant C such that

f (x) = g(x) + C for all x ∈ [a, b].



Proof
Define the function h : [a, b] → R by

h(x) = f (x) − g(x).

Then the function h is continuous on [a, b], differentiable on (a, b), and
h′(x) = 0 for all x ∈ (a, b). By Lemma 3.13, h is a constant function.
Namely, there is a constant C so that h(x) = C for all x ∈ [a, b]. Therefore,

f (x) = g(x) + C for all x ∈ [a, b].

Theorem 3.14 implies the identity criterion, which says that if two functions
are differentiable in an open interval, their derivatives are the same, and
their values at a single point in the interval coincide, then these two
functions must be identical.

We have seen that if p(x) is a polynomial of degree n, and k is an integer larger
than n, then the k-th order derivative of p(x) is identically zero. Using Lemma 3.13,
we can prove that the converse is also true.

Example 3.12
Let n be a nonnegative integer. Assume that the function p : R → R is
(n + 1) times differentiable and $p^{(n+1)}(x) = 0$ for all real numbers x. Then
p(x) is a polynomial of degree at most n.

Proof
We prove this by induction on n. When n = 0, the statement says that
if p : R → R is a differentiable function and p′(x) = 0 for all x ∈ R,
then p(x) is a polynomial of degree at most 0, namely a constant. This
statement is true by Lemma 3.13, applied on each closed and bounded interval [a, b].
Now let n ≥ 1, and assume that we have proved that for any k < n, if
q : R → R is a function that is (k + 1) times differentiable and $q^{(k+1)}(x) = 0$
for all real numbers x, then q(x) is a polynomial of degree at most k.

Let p : R → R be a function that is (n + 1) times differentiable and
$p^{(n+1)}(x) = 0$ for all real numbers x. Lemma 3.13 says that there is a
constant C such that
\[
p^{(n)}(x) = C.
\]
Consider the function q : R → R defined by
\[
q(x) = p(x) - \frac{C}{n!}x^n.
\]
It is n times differentiable and
\[
q^{(n)}(x) = p^{(n)}(x) - C = 0 \quad \text{for all } x \in \mathbb{R}.
\]
By the inductive hypothesis, q(x) is a polynomial of degree at most n − 1.
Namely, there are constants $a_0, a_1, \ldots, a_{n-1}$ such that
\[
q(x) = a_{n-1}x^{n-1} + \cdots + a_1 x + a_0.
\]
This implies that
\[
p(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0,
\]
where $a_n = C/n!$. Hence, p(x) is a polynomial of degree at most n.

The mean value theorem can be used to estimate the magnitude of a function
provided that we know bounds on its derivative.

Example 3.13

Given that the function f : [0, 10] → R is continuous on [0, 10],
differentiable on (0, 10), and −3 < f′(x) < 8 for all x in (0, 10). If
f(0) = −2, find a range for the values of f(x).

Solution
Let x be a point in (0, 10]. By the mean value theorem, there is a c ∈ (0, x) such
that
\[
\frac{f(x) - f(0)}{x - 0} = f'(c).
\]
Since −3 < f′(c) < 8 and x > 0, we find that
\[
-3x < f(x) + 2 < 8x.
\]
Since 0 < x ≤ 10, this implies that
\[
-32 \le -3x - 2 < f(x) < 8x - 2 \le 78.
\]

Therefore, a range for the values of f (x) is (−32, 78).
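
The bounds in this solution can be checked numerically on a concrete function. The sketch below uses a hypothetical sample, f(x) = −2 + 2.5x + 5 sin x, chosen only because f(0) = −2 and f′(x) = 2.5 + 5 cos x stays inside (−3, 8); it is not part of the example itself. The grid of sample points is also an assumption of the illustration.

```python
import math


def f(x):
    # Hypothetical sample: f(0) = -2 and f'(x) = 2.5 + 5*cos(x) lies in [-2.5, 7.5].
    return -2 + 2.5 * x + 5 * math.sin(x)


xs = [k / 100 for k in range(1, 1001)]            # grid of points in (0, 10]

# Pointwise bounds from the mean value theorem: -3x - 2 < f(x) < 8x - 2.
inside = all(-3 * x - 2 < f(x) < 8 * x - 2 for x in xs)

lowest = min(f(x) for x in xs)
highest = max(f(x) for x in xs)
print(inside, lowest, highest)
```

A check on finitely many points proves nothing by itself; it only shows that the sample function behaves as the estimate predicts.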

The next example shows that the mean value theorem can be used to determine
the number of solutions of an equation.

Example 3.14
Recall that in Example 2.27, we have shown that the equation

\[
x^6 + 6x + 1 = 0
\]

has a real root. Determine the exact number of real roots of this equation.

Solution
Let f : R → R be the function $f(x) = x^6 + 6x + 1$. This is a differentiable
function with
\[
f'(x) = 6x^5 + 6.
\]
From this, we find that f′(x) = 0 if and only if $x^5 = -1$, if and only if x = −1.
If x1 and x2 are two points such that x1 < x2 and f(x1) = f(x2) = 0,
Rolle’s theorem says that there is a point u in (x1, x2) such that f′(u) = 0.

If f (x) = 0 has three distinct real roots, we can assume that these real roots
are x1 , x2 and x3 with x1 < x2 < x3 . Then there is a u1 in (x1 , x2 ), and
a u2 in (x2 , x3 ) such that f ′ (u1 ) = f ′ (u2 ) = 0. In other words, f ′ (x) = 0
has two distinct real roots u1 and u2 . But we have shown that there is only
one x such that f ′ (x) = 0. Therefore, f (x) = 0 can have at most two real
solutions.
Since f(−1) = −4 and f(0) = 1, we have f(−1) < 0 < f(0). By the intermediate value
theorem, there is a c1 ∈ (−1, 0) such that f(c1) = 0.
Since f(−2) = 53, we have f(−1) < 0 < f(−2). By the intermediate value
theorem, there is a c2 ∈ (−2, −1) such that f(c2) = 0.
We conclude that f (x) = 0 has exactly two real roots.
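
The two roots located in this solution can be approximated numerically. The following is a minimal sketch, assuming a plain bisection routine (the name bisect and the tolerance are choices made for this illustration); it relies on the sign changes of f on [−2, −1] and [−1, 0] established above.

```python
def f(x):
    return x**6 + 6 * x + 1


def bisect(func, lo, hi, tol=1e-12):
    """Bisection on [lo, hi], assuming func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_lo * f_mid <= 0:     # the sign change is in [lo, mid]
            hi = mid
        else:                     # the sign change is in [mid, hi]
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


# f(-2) = 53 > 0, f(-1) = -4 < 0, f(0) = 1 > 0
roots = [bisect(f, -2.0, -1.0), bisect(f, -1.0, 0.0)]
print(roots)
```

The code only approximates the two roots; the fact that there are no others comes from the argument with Rolle’s theorem.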

Another important application of the mean value theorem is to determine the
increasing or decreasing patterns of functions.

Theorem 3.15
Let f : [a, b] → R be a function that is continuous on [a, b] and
differentiable on (a, b).

1. If f ′ (x) > 0 for all x ∈ (a, b), then f : [a, b] → R is a strictly increasing
function.

2. If f ′ (x) < 0 for all x ∈ (a, b), then f : [a, b] → R is a strictly decreasing
function.

Notice that we only assume that f ′ is positive or negative on the open interval
(a, b). If f ′ exists at the end points, it can be 0 there, and the conclusion about the
strict monotonicity still holds for the entire closed interval [a, b].

Proof
It suffices for us to prove the first statement. Given any two points x1 and
x2 in the closed interval [a, b] with x1 < x2 , the function f is continuous on
[x1, x2], differentiable on (x1, x2), and f′(x) > 0 for any x ∈ (x1, x2). By the
mean value theorem, there is a point c in (x1, x2) such that
\[
\frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c).
\]
Since f ′ (c) > 0 and x2 − x1 > 0, we conclude that

f (x2 ) > f (x1 ).

This proves that f : [a, b] → R is strictly increasing.

We look at a simple example.

Example 3.15

Consider the function f : R → R, $f(x) = x^3$. Notice that f is differentiable
and $f'(x) = 3x^2$. Hence, f′(x) > 0 for x ≠ 0, but f′(0) = 0. Therefore, we
cannot apply Theorem 3.15 directly to conclude that f : R → R, $f(x) = x^3$
is a strictly increasing function. However, we can proceed in the following
way. Since f′(x) > 0 on the open interval (−∞, 0), Theorem 3.15 implies
that f is strictly increasing on the closed interval (−∞, 0]. Since f′(x) > 0
on the open interval (0, ∞), Theorem 3.15 again implies that f is strictly
increasing on the closed interval [0, ∞). Combining together, we conclude
that f : R → R, $f(x) = x^3$ is strictly increasing.

Remark 3.2
Let f : [a, b] → R be a function defined on [a, b], and let x1 , . . . , xn be
points in (a, b) such that the following conditions are satisfied.

(i) f is continuous on [a, b], differentiable on (a, b).

(ii) f ′ (xk ) = 0 for 1 ≤ k ≤ n.

(iii) f ′ (x) > 0 for any x ∈ (a, b) \ {x1 , . . . , xn }.

Using the same reasoning as in Example 3.15, one can prove that f is
strictly increasing on [a, b].

Example 3.15 shows that if a function f : (a, b) → R is differentiable and
strictly increasing, it is not necessary that f′(x) > 0 for all x ∈ (a, b). If we
relax the strict monotonicity to monotonicity, we will find that f′(x) ≥ 0 for all
x ∈ (a, b) is sufficient and necessary for f to be increasing.

Theorem 3.16
Let f : [a, b] → R be a function that is continuous on [a, b] and
differentiable on (a, b).

1. f : [a, b] → R is an increasing function if and only if f′(x) ≥ 0 for all
x ∈ (a, b).

2. f : [a, b] → R is a decreasing function if and only if f′(x) ≤ 0 for all
x ∈ (a, b).

Proof
Again, let us consider the first statement. If f′(x) ≥ 0 for all x ∈ (a, b),
the proof that f is increasing is almost verbatim the proof of Theorem 3.15,
with > replaced by ≥. For the converse, if f is increasing on [a, b], we want
to show that f′(x0) ≥ 0 for any x0 in (a, b). This follows from the fact that
\[
\frac{f(x) - f(x_0)}{x - x_0} \ge 0
\]
for any x in (a, b) \ {x0}, since f is increasing. Taking the limit x → x0 gives f′(x0) ≥ 0.

For a function f(x) that is differentiable, the condition f′(x0) = 0 is necessary
for an interior point x0 to be a local extremizer, but not sufficient. Theorem 3.15
provides the tool for determining whether such a point is a local extremizer. It is
called the first derivative test. We will not go into the general formulation.
Instead, we will apply Theorem 3.15 or Theorem 3.16 directly to solve such
problems.

Example 3.16
Consider the function f : R → R defined by
\[
f(x) = \frac{x}{x^2 + 1}.
\]
Find the local maximum value and the local minimum value of f , and find
the range of the function f .

Solution
Since f is a rational function, it is differentiable, and
\[
f'(x) = \frac{(x^2 + 1) - x(2x)}{(x^2 + 1)^2} = \frac{1 - x^2}{(x^2 + 1)^2} = -\frac{(x + 1)(x - 1)}{(x^2 + 1)^2}.
\]
Since f is differentiable everywhere, the only candidates for the local
maximizer and the local minimizer are those points x where f′(x) = 0,
which are the points x = −1 and x = 1.

• When x ∈ (−∞, −1), f′(x) < 0, and so f is decreasing on (−∞, −1].

• When x ∈ (−1, 1), f′(x) > 0, and so f is increasing on [−1, 1].

• When x ∈ (1, ∞), f′(x) < 0, and so f is decreasing on [1, ∞).

These imply that x = −1 is a local minimizer, and x = 1 is a local
maximizer. The local maximum value of f is $f(1) = \frac{1}{2}$, and the local
minimum value is $f(-1) = -\frac{1}{2}$. Notice that
\[
\lim_{x\to -\infty} f(x) = \lim_{x\to\infty} f(x) = 0.
\]
Since f is decreasing on (−∞, −1], for any x in (−∞, −1],
\[
-\frac{1}{2} = f(-1) \le f(x) < 0.
\]
Since f is increasing on [−1, 1], for any x in [−1, 1],
\[
-\frac{1}{2} = f(-1) \le f(x) \le f(1) = \frac{1}{2}.
\]
Since f is decreasing on [1, ∞), for any x in [1, ∞),
\[
0 < f(x) \le f(1) = \frac{1}{2}.
\]
Combining together, we conclude that the range of f is $\left[-\frac{1}{2}, \frac{1}{2}\right]$.

Figure 3.9: The function $f(x) = \dfrac{x}{x^2 + 1}$.
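
The conclusion of Example 3.16 can be sanity-checked by sampling the function. The snippet below is only a sketch: the grid on [−50, 50] with step 0.01 is an arbitrary choice made for this illustration, and sampling does not replace the monotonicity argument.

```python
def f(x):
    return x / (x**2 + 1)


xs = [k / 100 for k in range(-5000, 5001)]   # grid on [-50, 50], step 0.01
values = [f(x) for x in xs]

largest = max(values)
smallest = min(values)
argmax = xs[values.index(largest)]           # grid point where the maximum occurs
argmin = xs[values.index(smallest)]          # grid point where the minimum occurs
print(argmin, smallest, argmax, largest)
```

On this grid the extreme values are attained at the stationary points x = −1 and x = 1 found in the solution.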

There is also a second derivative test for determining whether a stationary
point is a local minimizer or a local maximizer.

Theorem 3.17 Second Derivative Test


Let (a, b) be an interval that contains the point x0 , and let f : (a, b) → R
be a differentiable function. Assume that f ′ (x0 ) = 0, and f ′′ (x0 ) exists.

1. If f ′′ (x0 ) > 0, then x0 is a local minimizer of f .

2. If f ′′ (x0 ) < 0, then x0 is a local maximizer of f .

The second derivative test is inconclusive if f′′(x0) = 0, as can be shown by
considering the three functions $f_1(x) = x^4$, $f_2(x) = -x^4$ and $f_3(x) = x^3$.
All three functions have x = 0 as a stationary point. Their second
derivatives are all equal to zero at x = 0. However, x = 0 is a local
minimizer of $f_1(x) = x^4$, it is a local maximizer of the function $f_2(x) = -x^4$,
and it is not a local extremizer of the function $f_3(x) = x^3$.

Proof of Theorem 3.17


We will give a proof of the first statement. The proof of the second
statement is similar. For the first statement, we are given that f′(x0) = 0
and f′′(x0) > 0. By definition,
\[
f''(x_0) = \lim_{x\to x_0} \frac{f'(x) - f'(x_0)}{x - x_0} = \lim_{x\to x_0} \frac{f'(x)}{x - x_0}.
\]
Take ε to be the positive number f′′(x0)/2. The definition of limit implies
that there is a δ > 0 such that (x0 − δ, x0 + δ) ⊂ (a, b), and for all the points
x in (x0 − δ, x0) ∪ (x0, x0 + δ),
\[
\left| \frac{f'(x)}{x - x_0} - f''(x_0) \right| < \frac{f''(x_0)}{2}.
\]
This implies that for all x ∈ (x0 − δ, x0) ∪ (x0, x0 + δ),
\[
\frac{f'(x)}{x - x_0} > f''(x_0) - \frac{f''(x_0)}{2} = \frac{f''(x_0)}{2} > 0. \tag{3.3}
\]
If x ∈ (x0 − δ, x0), x − x0 < 0. Equation (3.3) implies that f′(x) < 0.
Therefore, f is decreasing on (x0 − δ, x0]. This implies that
\[
f(x) \ge f(x_0) \quad \text{for all } x \in (x_0 - \delta, x_0).
\]
If x ∈ (x0, x0 + δ), x − x0 > 0. Equation (3.3) implies that f′(x) > 0.
Therefore, f is increasing on [x0, x0 + δ). This implies that
\[
f(x) \ge f(x_0) \quad \text{for all } x \in (x_0, x_0 + \delta).
\]
Combining together, we find that f(x) ≥ f(x0) for all x in (x0 − δ, x0 + δ).
This proves that x0 is a local minimizer of f.

Example 3.17

For the function f(x) considered in Example 3.16, we have shown that the
stationary points are x = −1 and x = 1. A tedious computation gives
\[
f''(x) = \frac{2x(x^2 - 3)}{(x^2 + 1)^3}.
\]
Hence,
\[
f''(-1) = \frac{1}{2}, \qquad f''(1) = -\frac{1}{2}.
\]
The second derivative test can then be used to conclude that x = −1 is a
local minimizer, and x = 1 is a local maximizer.
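
The arithmetic in Example 3.17 can be reproduced exactly with rational numbers. The sketch below hard-codes the derivative formulas from Examples 3.16 and 3.17; the function names are choices made for this illustration.

```python
from fractions import Fraction


def first_derivative(x):
    # f'(x) = (1 - x^2) / (x^2 + 1)^2, from Example 3.16
    x = Fraction(x)
    return (1 - x**2) / (x**2 + 1) ** 2


def second_derivative(x):
    # f''(x) = 2x(x^2 - 3) / (x^2 + 1)^3, from Example 3.17
    x = Fraction(x)
    return 2 * x * (x**2 - 3) / (x**2 + 1) ** 3


for point in (-1, 1):
    print(point, first_derivative(point), second_derivative(point))
```

Both points are stationary, and the signs of the second derivative there match the classification obtained from the first derivative test.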

Although applying the second derivative test seems straightforward, an analysis
using the first derivative test is more conclusive. Finding the second derivative can
also be tedious, as shown in the example above.
At the end of this section, we want to prove an analogue of the intermediate value
theorem for derivatives.

Theorem 3.18 Darboux’s Theorem


Let f : [a, b] → R be a differentiable function. If w is a value strictly
between $f'_+(a)$ and $f'_-(b)$, then there is a point c in (a, b) such that f′(c) = w.

If the function f′ is continuous, then Darboux’s theorem follows immediately
from the intermediate value theorem. The strength of Darboux’s theorem lies in
the fact that it does not assume the continuity of f′.

Proof
The proof is again an application of the extreme value theorem.
Without loss of generality, assume that $f'_+(a) < w < f'_-(b)$. Since f :
[a, b] → R is a differentiable function, it is continuous. Define the function
g : [a, b] → R by
\[
g(x) = f(x) - wx.
\]
Then g is differentiable and
\[
g'(x) = f'(x) - w.
\]
Notice that g : [a, b] → R is also continuous. By the extreme value theorem,
g : [a, b] → R has a minimum value. Now,
\[
g'_+(a) = f'_+(a) - w < 0, \qquad g'_-(b) = f'_-(b) - w > 0.
\]
By definition,
\[
g'_+(a) = \lim_{x\to a^+} \frac{g(x) - g(a)}{x - a}.
\]
Taking ε to be the positive number $-g'_+(a)/2$, we find that there is a δ > 0
such that δ ≤ b − a, and for all x ∈ (a, a + δ),
\[
\frac{g(x) - g(a)}{x - a} < g'_+(a) + \varepsilon = \frac{g'_+(a)}{2} < 0.
\]
In particular, for all x ∈ (a, a + δ), g(x) < g(a), and thus g(a) is not
a minimum value of the function g. Similarly, since $g'_-(b) > 0$, we find
that g(b) is not a minimum value of the function g. In other words, the
minimizer of g must be a point c inside (a, b). This is then also a local
minimizer. Since g is differentiable, we must have g′(c) = 0. This implies
that f′(c) = w.

Remark 3.3
As a consequence of Darboux’s theorem, we find that if a function f :
(a, b) → R is differentiable and f′(x) ≠ 0 for any x ∈ (a, b), then either
f′(x) > 0 for all x ∈ (a, b), or f′(x) < 0 for all x ∈ (a, b). In any case, this
means that such a function must be strictly monotonic.

Before closing this section, let us introduce some terminology.

Definition 3.8 $C^k$ functions

Let k be a nonnegative integer. We say that a function f : (a, b) → R
is a $C^k$-function if it is k times differentiable and the k-th derivative $f^{(k)}$ :
(a, b) → R is continuous.

It is easy to see that if f : (a, b) → R is a $C^k$-function, then for any 0 ≤ j < k,
$f^{(j)}$ : (a, b) → R is continuous.
A $C^0$ function is just a continuous function. A $C^1$ function is called a continuously
differentiable function. In general, a $C^k$ function is called a k-times continuously
differentiable function.
The definition of $C^k$ functions can be extended to the case where the function
f is defined on a closed interval [a, b].

Exercises 3.3
Question 1
Given that the function f : [−5, 8] → R is continuous on [−5, 8],
differentiable on (−5, 8), and −4 < f ′ (x) < 4 for all x in (−5, 8). If
f (0) = 2, find a range for the values of f (x).

Question 2
Show that the function f : R → R,
\[
f(x) = \frac{x^3}{x^2 + 1}
\]
is strictly increasing, and find the range of the function.

Question 3
Show that the equation
\[
x^5 + x + 32 = 0
\]
has exactly one real solution.

Question 4
Find the number of real solutions of the equation
\[
\frac{32x}{x^4 + 16} = 1.
\]

Question 5
Let n be a nonnegative integer, and let f : (a, b) → R be a differentiable
function. If the equation f ′ (x) = 0 has n distinct real roots in the interval
(a, b), show that the equation f (x) = 0 has at most (n + 1) distinct real
roots in the interval (a, b).

Question 6
Consider the function f : R → R defined by
\[
f(x) = \frac{x + 1}{x^2 + 15}.
\]
(a) Find the local maximum value and the local minimum value of f .

(b) Find the range of the function f .

Question 7
Let f : [a, b] → R be a function such that the limit
\[
L = \lim_{x\to b^-} \frac{f(x) - f(b)}{x - b}
\]
exists. If L > 0, show that there is a δ > 0 such that δ ≤ b − a and for all
x ∈ (b − δ, b),
\[
f(x) < f(b).
\]

Question 8
Let f : (a, b) → R be a differentiable function. If f′ : (a, b) →
R is monotonic, show that f′ : (a, b) → R is continuous.

3.4 The Cauchy Mean Value Theorem

In the previous section, we have seen that the mean value theorem is very useful in
analysing the behavior of a differentiable function. For future applications, we
will often quote it in the following form.

Alternative Form of Mean Value Theorem


If f : (a, b) → R is a differentiable function, x0 is a point in (a, b), h is
such that x0 + h is also in (a, b), then there is a number c ∈ (0, 1) such that

f (x0 + h) − f (x0 ) = f ′ (x0 + ch)h. (3.4)

To see this, let x1 = x0 + h. If h = 0, (3.4) is obviously true for any c in (0, 1).
If h ≠ 0, then when c runs through all values from 0 to 1, x0 + ch runs through
all points in the open interval I with x0 and x1 as endpoints. Thus, (3.4) says that
\[
f(x_1) - f(x_0) = f'(u)(x_1 - x_0)
\]
for some u in the open interval I, which is precisely the statement of the mean
value theorem.
When finding limits of functions, we often encounter situations like
\[
\lim_{x\to x_0} \frac{f(x)}{g(x)}
\]
where both $\lim_{x\to x_0} f(x)$ and $\lim_{x\to x_0} g(x)$ are zero. For example, let
\[
f(x) = x^{20} + 2x^9 - 3 \quad \text{and} \quad g(x) = x^7 - 1.
\]
Then
\[
\lim_{x\to 1} f(x) = f(1) = 0 \quad \text{and} \quad \lim_{x\to 1} g(x) = g(1) = 0.
\]
Hence, we cannot apply the quotient law for limits to evaluate
\[
\lim_{x\to 1} \frac{f(x)}{g(x)} = \lim_{x\to 1} \frac{x^{20} + 2x^9 - 3}{x^7 - 1}.
\]
Observe that
\[
\lim_{x\to 1} \frac{f(x)}{g(x)} = \lim_{h\to 0} \frac{f(1 + h)}{g(1 + h)}.
\]

Since we are only interested in the limit when x approaches 1, we are tempted to use
the mean value theorem in the form (3.4) and conclude that there are c1 and c2 in
(0, 1) such that
\[
\lim_{x\to 1} \frac{f(x)}{g(x)} = \lim_{h\to 0} \frac{f(1 + h) - f(1)}{g(1 + h) - g(1)} = \lim_{h\to 0} \frac{f'(1 + c_1 h)}{g'(1 + c_2 h)}. \tag{3.5}
\]
For the functions f and g that we consider above, f′ and g′ are both continuous at
x = 1 and g′(1) ≠ 0. Hence, we find that
\[
\lim_{h\to 0} \frac{f'(1 + c_1 h)}{g'(1 + c_2 h)} = \frac{f'(1)}{g'(1)}.
\]
For general differentiable functions f(x) and g(x) with f(1) = g(1) = 0, if we
do not assume that f′ and g′ are continuous, we cannot conclude the limit from
(3.5) since c1 and c2 are in general different functions of h.
In this section, we are going to prove a generalization of the mean value
theorem, called the Cauchy mean value theorem, which ensures that we can have
the same value for c1 and c2 .
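
For the polynomials f(x) = x^20 + 2x^9 − 3 and g(x) = x^7 − 1 used in the discussion above, one has f′(1) = 20 + 18 = 38 and g′(1) = 7, so the quotient f(x)/g(x) should approach 38/7 as x approaches 1. The following sketch checks this with exact rational arithmetic; the sample points 1 + 10^(−k) are an arbitrary choice made for this illustration.

```python
from fractions import Fraction


def ratio(x):
    # f(x) / g(x) for f(x) = x^20 + 2x^9 - 3 and g(x) = x^7 - 1, x != 1
    return (x**20 + 2 * x**9 - 3) / (x**7 - 1)


target = Fraction(38, 7)                      # f'(1) / g'(1)

# Distance between f(x)/g(x) and 38/7 at x = 1 + 10^(-k)
errors = [abs(ratio(1 + Fraction(1, 10**k)) - target) for k in (2, 4, 6)]
for k, err in zip((2, 4, 6), errors):
    print(k, float(err))
```

The errors shrink as x moves closer to 1, which is consistent with the limit computed from the derivatives.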

Theorem 3.19 Cauchy Mean Value Theorem

Let f : [a, b] → R and g : [a, b] → R be two functions that satisfy the
following conditions.

(i) f : [a, b] → R and g : [a, b] → R are continuous.

(ii) f : (a, b) → R and g : (a, b) → R are differentiable.

(iii) g′(x) ≠ 0 for all x ∈ (a, b).

Then there is a point x0 in (a, b) such that
\[
\frac{f'(x_0)}{g'(x_0)} = \frac{f(b) - f(a)}{g(b) - g(a)}.
\]

Notice that when g(x) = x, we recover Lagrange’s mean value theorem.

Proof
The proof uses the same idea as the proof of Lagrange’s mean value
theorem, with the function g(x) = x replaced by a general g(x). By
Remark 3.3, the condition g′(x) ≠ 0 for all x ∈ (a, b) implies that g is
strictly monotonic. Hence, g(a) ≠ g(b).
Define the function h : [a, b] → R by
\[
h(x) = f(x) - mg(x),
\]
where the number m is determined by h(a) = h(b). This means
\[
f(a) - mg(a) = f(b) - mg(b),
\]
which gives
\[
m = \frac{f(b) - f(a)}{g(b) - g(a)}.
\]
Again, the function h : [a, b] → R is continuous on [a, b], differentiable on
(a, b), and satisfies h(a) = h(b). By Rolle’s theorem, there is a point x0 in
(a, b) such that h′(x0) = 0. For this x0,
\[
f'(x_0) - mg'(x_0) = 0.
\]
Since g′(x0) ≠ 0 by assumption, we find that
\[
\frac{f'(x_0)}{g'(x_0)} = m = \frac{f(b) - f(a)}{g(b) - g(a)}.
\]

Example 3.18

Consider the functions f : [1, 7] → R, $f(x) = x^2$ and g : [1, 7] → R,
$g(x) = x^3 - 9x^2$. By Lagrange’s mean value theorem, there are points c1
and c2 in (1, 7) such that
\[
2c_1 = f'(c_1) = \frac{f(7) - f(1)}{7 - 1} = 8,
\]
and
\[
3c_2^2 - 18c_2 = g'(c_2) = \frac{g(7) - g(1)}{7 - 1} = -15.
\]
Solving for c1 and c2, we have c1 = 4 and c2 = 5.
By the Cauchy mean value theorem, there is a point c in (1, 7) such that
\[
\frac{2c}{3c^2 - 18c} = \frac{f'(c)}{g'(c)} = \frac{f(7) - f(1)}{g(7) - g(1)} = -\frac{8}{15}.
\]
Solving this equation gives
\[
c = \frac{19}{4}.
\]
Note that g′(6) = 0, so condition (iii) of Theorem 3.19 does not hold on the
whole interval (1, 7); the existence of such a point c is therefore checked here
by solving the equation directly.

An important application of the Cauchy mean value theorem is the following.

Theorem 3.20
Let n be a positive integer, and let (a, b) be an open interval that contains
the point x0. If the function f : (a, b) → R is n times differentiable, and
\[
f(x_0) = f'(x_0) = \cdots = f^{(n-1)}(x_0) = 0,
\]
then for any x in (a, b), there is a c ∈ (0, 1) such that
\[
f(x) = \frac{h^n}{n!} f^{(n)}(x_0 + ch), \quad \text{where } h = x - x_0. \tag{3.6}
\]

Proof
We apply the Cauchy mean value theorem n times to the given function
f : (a, b) → R and the function g : (a, b) → R defined by $g(x) = (x - x_0)^n$.
Notice that g is also n times differentiable,
\[
g(x_0) = g'(x_0) = \cdots = g^{(n-1)}(x_0) = 0,
\]
and
\[
g^{(n)}(x) = n! \quad \text{for all } x \in (a, b).
\]
Since f(x0) = 0, eq. (3.6) obviously holds for x = x0 with any c ∈ (0, 1).
So we only need to consider a point x = x1 in (a, b) \ {x0}. First assume
that x1 > x0. For any 1 ≤ k ≤ n, $g^{(k)}(x) \ne 0$ for any x ∈ (x0, x1). Thus
we can apply the Cauchy mean value theorem for the pairs (f, g), (f′, g′),
. . ., $(f^{(n-1)}, g^{(n-1)})$ over the interval (x0, x1).
Since f(x0) = g(x0) = 0, the Cauchy mean value theorem implies that there
exists a point u1 in (x0, x1) such that
\[
\frac{f(x_1)}{g(x_1)} = \frac{f(x_1) - f(x_0)}{g(x_1) - g(x_0)} = \frac{f'(u_1)}{g'(u_1)}.
\]
If n = 1, we are done. If n ≥ 2, then f′(x0) = g′(x0) = 0. Applying the Cauchy
mean value theorem again, we find that there is a u2 in (x0, u1) such that
\[
\frac{f(x_1)}{g(x_1)} = \frac{f'(u_1)}{g'(u_1)} = \frac{f'(u_1) - f'(x_0)}{g'(u_1) - g'(x_0)} = \frac{f''(u_2)}{g''(u_2)}.
\]
Continuing in this way n times, we find that there are points u1, . . . , un such
that x0 < un < un−1 < · · · < u1 < x1, and
\[
\frac{f(x_1)}{g(x_1)} = \frac{f'(u_1)}{g'(u_1)} = \cdots = \frac{f^{(n)}(u_n)}{g^{(n)}(u_n)}. \tag{3.7}
\]
Since un ∈ (x0, x1), there is c ∈ (0, 1) such that un = x0 + ch, where
h = x1 − x0. Since $g(x_1) = h^n$ and $g^{(n)}(u_n) = n!$, eq. (3.7) then implies that
\[
f(x_1) = \frac{h^n}{n!} f^{(n)}(x_0 + ch), \quad \text{where } h = x_1 - x_0.
\]
This completes the proof if x1 > x0. The proof for x1 < x0 is similar.

Let us look at a classical example.

Example 3.19

Let x0 be a point in the interval (a, b), and assume that the function f :
(a, b) → R is twice continuously differentiable. Prove that
\[
\lim_{h\to 0} \frac{f(x_0 + h) + f(x_0 - h) - 2f(x_0)}{h^2} = f''(x_0).
\]

Solution
Let r = min{x0 − a, b − x0}. Then r > 0 and (x0 − r, x0 + r) ⊂ (a, b).
Define the function g : (−r, r) → R by
\[
g(h) = f(x_0 + h) + f(x_0 - h) - 2f(x_0).
\]
Then g is twice continuously differentiable, and
\[
g'(h) = f'(x_0 + h) - f'(x_0 - h), \qquad g''(h) = f''(x_0 + h) + f''(x_0 - h).
\]
It is easy to check that
\[
g(0) = g'(0) = 0.
\]
By Theorem 3.20, for any h ∈ (−r, r), there is a c(h) ∈ (0, 1) such that
\[
g(h) = \frac{h^2}{2} g''(c(h)h).
\]
Hence, for h ≠ 0,
\[
\frac{g(h)}{h^2} = \frac{f''(x_0 + c(h)h) + f''(x_0 - c(h)h)}{2}.
\]
Now since c(h) ∈ (0, 1),
\[
|c(h)h| \le |h|.
\]
Therefore,
\[
\lim_{h\to 0} c(h)h = 0.
\]
Since f′′ is continuous,
\[
\lim_{k\to 0} f''(x_0 + k) = f''(x_0).
\]
By the limit law for composite functions, we find that
\[
\lim_{h\to 0} \frac{f''(x_0 + c(h)h) + f''(x_0 - c(h)h)}{2} = \frac{f''(x_0) + f''(x_0)}{2} = f''(x_0).
\]
Therefore,
\[
\lim_{h\to 0} \frac{f(x_0 + h) + f(x_0 - h) - 2f(x_0)}{h^2} = \lim_{h\to 0} \frac{g(h)}{h^2} = f''(x_0).
\]

Exercises 3.4
Question 1
Given that p(x) is a polynomial of degree at most 5, and
\[
p(1) = p^{(1)}(1) = p^{(2)}(1) = p^{(3)}(1) = p^{(4)}(1) = 0, \qquad p^{(5)}(1) = 1200.
\]

Find the polynomial p(x).

Question 2
Let (a, b) be an interval that contains the point x0. Given that the function
f : (a, b) → R is three times continuously differentiable, find the limit
\[
\lim_{h\to 0} \frac{f(x_0 + 2h) - 2f(x_0 + h) + 2f(x_0 - h) - f(x_0 - 2h)}{h^3}.
\]

3.5 Transcendental Functions

Up to now we have only dealt with algebraic functions, which are functions that
can be obtained by performing algebraic operations of addition, multiplication,
division and taking roots on polynomials. In this section, we introduce other
useful elementary functions – the class of transcendental functions which includes
exponential, logarithmic and trigonometric functions. These functions have been
introduced in a pre-calculus course, but not rigorously.
In this section, we are going to define these functions and derive their properties
using calculus. Everything will be done rigorously using the analytic tools that
we have developed so far, except for an existence theorem that we are going to
prove in Chapter 4.
Let us first state this existence theorem.

Theorem 3.21 Existence and Uniqueness Theorem

Let (a, b) be an open interval that contains the point x0 , and let y0 be any
real number. Given that f : (a, b) → R is a continuous function, there
exists a unique differentiable function F : (a, b) → R such that

F ′ (x) = f (x) for all x ∈ (a, b), F (x0 ) = y0 .

The function F(x) that satisfies F′(x) = f(x) is called an antiderivative of
f(x).

Definition 3.9 Antiderivative


Let I be an interval. If f : I → R and F : I → R are functions on I such
that F is differentiable and

F ′ (x) = f (x) for all x ∈ I,

then F (x) is called an antiderivative of f (x).

Theorem 3.21 asserts that a continuous function has an antiderivative. One
way to construct an antiderivative is to use integrals, a topic we are going to
discuss in Chapter 4. Theorem 3.14 says that any two antiderivatives of a given
function differ by a constant. The initial condition F(x0) = y0 fixes the constant.
Hence, only the existence part of Theorem 3.21 is pending a proof. The uniqueness
follows from what we have discussed.

3.5.1 The Logarithmic Function

It is easy to check that for any integer n that is not equal to −1, an antiderivative
of the function $f(x) = x^n$ is the function
\[
F(x) = \frac{x^{n+1}}{n + 1}.
\]
So far we have not seen any algebraic function whose derivative is equal to
$f(x) = \dfrac{1}{x}$. We define one such function and call it the natural logarithm function.
Definition 3.10 The Natural Logarithm Function

The natural logarithm function f : (0, ∞) → R, f(x) = ln x is defined to
be the unique differentiable function satisfying
\[
f'(x) = \frac{1}{x}, \qquad f(1) = 0.
\]

Since g : (0, ∞) → R, $g(x) = \dfrac{1}{x}$ is a continuous function, the existence and
uniqueness of the function f(x) = ln x is guaranteed by Theorem 3.21.
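
To get a feeling for the function singled out by Definition 3.10, one can approximate the solution of F′(t) = 1/t, F(1) = 0 numerically. The sketch below steps from 1 to x with a midpoint rule; this scheme, the name ln_approx and the number of steps are assumptions made for illustration only, and they are not the construction used in the text, whose existence proof is deferred to Chapter 4. The library function math.log is used purely as a reference value.

```python
import math


def ln_approx(x, steps=100_000):
    """Approximate F(x), where F'(t) = 1/t and F(1) = 0, for x > 0.

    The increment of F over each small step of length h is approximated by
    h times the slope 1/t taken at the midpoint of the step.
    """
    h = (x - 1) / steps                 # negative when 0 < x < 1
    total = 0.0
    for i in range(steps):
        midpoint = 1 + (i + 0.5) * h
        total += h / midpoint
    return total


for x in (0.5, 2.0, 3.0, 6.0):
    print(x, ln_approx(x), math.log(x))
```

The approximate values are negative for 0 < x < 1 and positive for x > 1, and ln_approx(6.0) is close to ln_approx(2.0) + ln_approx(3.0).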

The Natural Logarithm Function


By definition,
\[
\frac{d}{dx} \ln x = \frac{1}{x}, \qquad x > 0.
\]
Since 1/x > 0 for all x > 0, we find that f (x) = ln x is a strictly increasing
function. Moreover, since ln 1 = 0,

• when 0 < x < 1, ln x < 0;

• when x > 1, ln x > 0.

The following gives some useful properties of the natural logarithmic function.

Figure 3.10: The function y = ln x.

Proposition 3.22 Properties of the Natural Logarithm Function


Let x and y be any positive numbers, and let r be a rational number. We
have the following.

(a) ln(xy) = ln x + ln y

(b) ln(x/y) = ln x − ln y

(c) ln x^r = r ln x

In part (c), we require r to be a rational number since we have not defined x^r
when r is an irrational number.

Proof
To prove (a), we fix y > 0 and define the function f : (0, ∞) → R by

f(x) = ln(xy) − ln y.

Then f(1) = ln y − ln y = 0, and by the chain rule,

f ′(x) = y/(xy) = 1/x.

By the uniqueness asserted in Theorem 3.21 and the definition of the
natural logarithm function, we conclude that f(x) = ln x. This proves
(a).

To prove (b), we notice that part (a) gives

ln(x/y) + ln y = ln((x/y) × y) = ln x.

For (c), notice that it is obvious if r = 0. If r ≠ 0, define the function
f : (0, ∞) → R by

f(x) = (1/r) ln x^r.

Then f(1) = 0, and

f ′(x) = (1/r) × (r x^{r−1}/x^r) = 1/x.

This allows us to conclude that f(x) = ln x, and (c) is thus proved.
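
As a numerical illustration only, the following sketch approximates ln x by integrating 1/t from 1 to x, which is one way to realise the antiderivative that Theorem 3.21 promises (integrals are only treated in Chapter 4, so this is an assumption made for the sake of the experiment). The function name ln_by_quadrature and the choice of Simpson's rule are ours, not part of the text. It then checks property (a) on sample values.

```python
import math

def ln_by_quadrature(x, panels=2000):
    """Approximate ln x as the integral of 1/t from 1 to x (composite Simpson's rule).

    Hypothetical helper for illustration; not the construction used in the text.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    n = panels if panels % 2 == 0 else panels + 1  # Simpson's rule needs an even panel count
    h = (x - 1.0) / n                               # signed step, negative when x < 1
    total = 1.0 + 1.0 / x                           # values of 1/t at the endpoints t = 1 and t = x
    for k in range(1, n):
        t = 1.0 + k * h
        total += (4.0 if k % 2 == 1 else 2.0) / t
    return total * h / 3.0

x, y = 2.0, 3.0
lhs = ln_by_quadrature(x * y)
rhs = ln_by_quadrature(x) + ln_by_quadrature(y)
print(lhs, rhs)                                     # the two values should agree closely
print(abs(ln_by_quadrature(3.0) - math.log(3.0)))  # compare with the library logarithm
```

The agreement of the two printed values is evidence for, not a proof of, the identity ln(xy) = ln x + ln y.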

From part (c) of Proposition 3.22, we find that for any x > 0,

ln(1/x) = ln x^{−1} = − ln x;

and if n is a positive integer,

ln x^n = n ln x.

In particular, we find that

ln 2^n = n ln 2,
ln(1/2^n) = −n ln 2.

Since ln 2 > 0, we conclude the following.

Proposition 3.23

f : (0, ∞) → R, f(x) = ln x is a strictly increasing function with

lim_{x→0+} ln x = −∞, lim_{x→∞} ln x = ∞.

Hence, the range of f(x) = ln x is R.

3.5.2 The Exponential Functions

Since the function f : (0, ∞) → R, f (x) = ln x is continuous and strictly


increasing, its inverse function exists. We define this inverse function as the
exponential function exp(x). The domain of exp(x) is the range of ln x, which is
R. The range of exp(x) is the domain of ln x, which is (0, ∞).

Definition 3.11 The Natural Exponential Function


The natural exponential function exp : R → R is defined to be the inverse
of the function ln x. It satisfies

ln exp(x) = x for any x ∈ R, exp(ln x) = x for any x > 0.

Figure 3.11: The function y = exp(x).

We can deduce the following properties.

Proposition 3.24 Properties of the Natural Exponential Function I

The exponential function exp(x) is a strictly increasing differentiable
function defined on the set of real numbers. It has the following properties.

(a) exp(x) > 0 for all x ∈ R and exp(0) = 1.

(b) lim_{x→−∞} exp(x) = 0, lim_{x→∞} exp(x) = ∞.

(c) d/dx exp(x) = exp(x).

Proof
(a) and (b) are obvious from the corresponding properties of ln x. For part
(c), we employ the derivative formula for inverse functions. To make it less
confusing, let y = exp(x). Then

ln y = x.

Differentiating both sides with respect to x, we find that

(1/y) (dy/dx) = 1.

Therefore,
d/dx exp(x) = dy/dx = y = exp(x).

From the properties of the natural logarithm stated in Proposition 3.22, we


have the following.

Proposition 3.25 Properties of the Natural Exponential Function II


Let x and y be any real numbers, and let r be a rational number. We have
the following.

(a) exp(x + y) = exp(x) exp(y).

(b) exp(x − y) = exp(x)/exp(y).

(c) exp(x)^r = exp(rx).

Proof
Let u = exp(x) and v = exp(y). Then u and v are positive numbers and

x = ln u, y = ln v.

By Proposition 3.22,

ln(uv) = ln u + ln v = x + y.
Therefore,
exp(x) exp(y) = uv = exp(x + y).
Part (b) is proved in the same way. For part (c), Proposition 3.22 implies
that
ln(u^r) = r ln u = rx.
Therefore,
exp(x)^r = u^r = exp(rx).

Notice that part (c) says that for any positive number u, and any rational number
r,
u^r = exp(r ln u).
We can use this to define power functions with irrational powers.

Definition 3.12 Power Functions


For any real number r, the power function f(x) = x^r is the function defined
on (0, ∞) by the formula

x^r = exp(r ln x).

When r > 0, we can extend the definition to the point x = 0 by defining
f(0) = 0.

We have seen that this definition coincides with the old definition when r is
a rational number. For r > 0, since ln x → −∞ as x → 0+, r ln x → −∞
as x → 0+. Since exp(x) → 0 as x → −∞, we conclude that x^r → 0 as
x → 0+. Therefore, the definition f(0) = 0 makes the function f(x) = x^r
continuous. Using the fact that exp(x) and ln x are inverses of each other, we
have the following.

For any positive number x and any real number r,

ln(x^r) = r ln x.

Figure 3.12: (a) The function y = x^{√2}. (b) The function y = x^{−√2}.

Since both ln x and exp(x) are strictly increasing functions, it is easy to deduce
the following.

Monotonicity of Power Functions

1. When r > 0, the function f : [0, ∞) → R, f(x) = x^r is strictly
increasing.

2. When r < 0, the function f : (0, ∞) → R, f(x) = x^r is strictly
decreasing.

The following gives the properties of power functions.

Proposition 3.26
For any positive numbers x and y, and any real numbers r and s,

(a) (xy)^r = x^r y^r

(b) (x/y)^r = x^r / y^r

(c) x^{r+s} = x^r x^s

(d) x^{r−s} = x^r / x^s

(e) (x^r)^s = x^{rs}

Proof
For part (a), we have

(xy)^r = exp(r ln(xy)) = exp(r ln x + r ln y)
= exp(r ln x) exp(r ln y) = x^r y^r.

Part (b) is proved in the same way. For part (c),

x^{r+s} = exp((r + s) ln x) = exp(r ln x) exp(s ln x) = x^r x^s.

Part (d) is proved in the same way. For part (e),

(x^r)^s = exp(s ln(x^r)) = exp(rs ln x) = x^{rs}.

Using the chain rule, we find that f(x) = x^r is a differentiable function.

Proposition 3.27
For any real number r and any positive number x,
d/dx x^r = r x^{r−1}.

Proof
This follows from straightforward computation.
d/dx x^r = d/dx exp(r ln x) = exp(r ln x) × d/dx (r ln x) = x^r × r/x = r x^{r−1}.
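
The definition x^r = exp(r ln x) and the derivative formula above can be checked numerically. The sketch below is only an illustration: the helper names power and central_difference are ours, and the central difference quotient is an approximation to the derivative, not the derivative itself.

```python
import math

def power(x, r):
    """x**r for x > 0 and real r, computed as exp(r ln x) (Definition 3.12)."""
    if x <= 0:
        raise ValueError("x must be positive")
    return math.exp(r * math.log(x))

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, an approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

r, x = math.sqrt(2.0), 3.0                       # an irrational exponent
numeric = central_difference(lambda t: power(t, r), x)
exact = r * power(x, r - 1.0)                    # r x^(r-1) from Proposition 3.27
print(numeric, exact)                            # the two values should agree closely
```

For rational r the same helper can be compared with the usual roots, for example power(3.0, 0.5) against math.sqrt(3.0).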

Now we want to show that exp(1) = e, where e is the number we defined as

e = lim_{n→∞} (1 + 1/n)^n

in Chapter 1.

Theorem 3.28
We have
exp(1) = lim_{n→∞} (1 + 1/n)^n = e.
This implies that
ln e = 1.

Proof
We consider the differentiable function g(x) = ln(1 + x), x > −1, whose
derivative is
g ′(x) = 1/(1 + x).
By definition of derivative,

lim_{x→0} ln(1 + x)/x = lim_{x→0} (g(x) − g(0))/(x − 0) = g ′(0) = 1.

Since exp(x) is a continuous function, we find that

lim_{x→0} exp(ln(1 + x)/x) = exp(lim_{x→0} ln(1 + x)/x) = exp(1). (3.8)

Notice that {1/n} is a sequence of positive numbers that converges to 0.
Therefore, eq. (3.8) implies that

lim_{n→∞} exp(n ln(1 + 1/n)) = exp(1).

By definition,

exp(n ln(1 + 1/n)) = (1 + 1/n)^n.

Thus, we have shown that

exp(1) = lim_{n→∞} (1 + 1/n)^n = e.

Since exp(x) and ln x are inverses of each other, we find that ln e = 1.
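
To see the convergence in Theorem 3.28 numerically, one can evaluate (1 + 1/n)^n in the form exp(n ln(1 + 1/n)) used in the proof. This is a small sketch; the function name compound is ours, and math.log1p is used only to reduce rounding error for large n.

```python
import math

def compound(n):
    """(1 + 1/n)**n written as exp(n * ln(1 + 1/n)), as in the proof above."""
    return math.exp(n * math.log1p(1.0 / n))

for n in (10, 1000, 100000):
    # The gap to exp(1) shrinks as n grows.
    print(n, compound(n), math.exp(1.0) - compound(n))
```

The printed gaps are positive and decreasing, which is consistent with the sequence increasing to e, as shown in Chapter 1.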



Definition 3.13 General Exponential Functions


Let a be a positive real number such that a ≠ 1. The exponential function
f(x) = a^x is defined by

a^x = exp(x ln a), x ∈ R.

When a = e, f(x) = e^x is the natural exponential function

e^x = exp(x).

Henceforth, we will also use e^x to denote the natural exponential function
exp(x).
The following properties of the general exponential functions can be easily
derived from the corresponding properties of the exp(x) function.

Proposition 3.29
Let a be a positive number.

1. When 0 < a < 1, f(x) = a^x is a strictly decreasing function.

2. When a > 1, f(x) = a^x is a strictly increasing function.

Proposition 3.30

Let a be a positive number such that a ≠ 1. The function f(x) = a^x is
differentiable, and
d/dx a^x = a^x ln a.

Proposition 3.31
Let a be a positive number such that a ≠ 1. For any real numbers x and y,

1. a^{x+y} = a^x a^y

2. a^{x−y} = a^x / a^y

3. (a^x)^y = a^{xy}

3.5.3 The Trigonometric Functions

Now we consider the trigonometric functions. Recall that an angle is usually


measured in degrees, so that the angle of a full circle is 360◦ . But for analysis, we
need to make a change of units to radians.

Figure 3.13: An arc with central angle θ.

The number π is defined as the ratio of the circumference of a circle to its
diameter. Hence, a circle of radius 1 has circumference 2π. This number
π can be shown to be an irrational number. The radian measure of an angle
is chosen so that an arc with central angle θ radians on a circle of radius r has
length rθ, and hence the circumference of the circle is 2πr. Hence, the conversion
between degrees and radians is

θ° = (π/180) θ rad.

Historically, sine and cosine are defined using right-angled triangles, as shown
in Figure 3.14.
To extend the definitions of sin θ and cos θ so that θ can be any real number,
we use the unit circle x^2 + y^2 = 1. The angle measurement starts from the positive
x-axis and we take the counter-clockwise direction as the positive direction. For any
real number θ, find a point P(x, y) on the unit circle such that the line segment
between the origin O and the point P makes an angle of θ radians with the positive
x-axis (see Figure 3.15). Then we define cos θ and sin θ to be the x and y coordinates
of P:
x = cos θ, y = sin θ.
In this way, the functions sin θ and cos θ are defined rigorously, and when θ is
an acute angle, they coincide with the definitions using right-angled triangles. From
the definitions, it is obvious that sin θ and cos θ are periodic functions of period
2π.

Figure 3.14: Classical definitions of sine and cosine functions.

Figure 3.15: The definitions of sin θ and cos θ for (a) θ = 7π/3 and (b) θ = 3π/4.

Definition 3.14 Periodic Functions


A function f : R → R is said to be periodic if there is a positive number L
so that
f (x + L) = f (x) for all x ∈ R.
Such a number L is called a period of the function f . If L is a period of f ,
then for any positive integer n, nL is also a period of f .

From the definitions, it is quite obvious that sin θ and cos θ are continuous
functions. A rigorous proof is tedious. To show that these two functions are
differentiable is also possible, but complicated. Two crucial formulas are

sin(θ1 + θ2) = sin θ1 cos θ2 + cos θ1 sin θ2, (3.9a)
cos(θ1 + θ2) = cos θ1 cos θ2 − sin θ1 sin θ2. (3.9b)

The proofs of these two formulas by elementary means are tedious.
In this section, we are going to define the sine and cosine functions using a
different approach. We will show that the functions thus defined agree with the
old definitions.
First, we present an existence and uniqueness theorem.

Theorem 3.32 Existence and Uniqueness Theorem


Let α and β be any two real numbers. There exists a unique twice
differentiable function f : R → R satisfying

f ′′ (x) + f (x) = 0, f (0) = α, f ′ (0) = β.

Again, the proof of the existence requires knowledge from later chapters. We
will prove uniqueness here. We begin with a lemma that will be useful later.

Lemma 3.33
Let f : R → R be a twice differentiable function that satisfies

f ′′ (x) + f (x) = 0.

The following holds.

1. f is infinitely differentiable.

2. For any positive integer n, the nth derivative of f, g(x) = f^{(n)}(x),
satisfies
g ′′(x) + g(x) = 0.

3. The function f(x)^2 + f ′(x)^2 is a constant.

Proof
Since f is twice differentiable, f is continuous and differentiable. Since
f ′′(x) = −f(x), f ′′ is continuous and differentiable. This implies that f is
three times differentiable and f ′′′ = −f ′. Continuing to argue in this way, we
find that f is infinitely differentiable, and for any nonnegative integer n,

f^{(n+2)}(x) = −f^{(n)}(x).

The latter says that if g = f^{(n)}, then

g ′′(x) + g(x) = 0.

These prove the first and second statements. For the third statement, we
notice that

d/dx [f ′(x)^2 + f(x)^2] = 2f ′(x)f ′′(x) + 2f(x)f ′(x)
= 2f ′(x) (f ′′(x) + f(x)) = 0.

This implies that f(x)^2 + f ′(x)^2 is a constant.

Now we return to Theorem 3.32.

Proof of Theorem 3.32


If f1 and f2 are two functions that satisfy the given conditions, then the
function f = (f1 − f2) : R → R is a twice differentiable function satisfying

f ′′(x) + f(x) = 0, f(0) = 0, f ′(0) = 0.

To prove uniqueness, we only need to show that this function f must be
identically zero. By Lemma 3.33, there is a constant C such that

f ′(x)^2 + f(x)^2 = C.

Setting x = 0, we find that C = 0. Hence,

f ′(x)^2 + f(x)^2 = 0.

Since the square of a nonzero number is always positive, we must have

f(x) = f ′(x) = 0 for all x ∈ R.

This completes the proof that f is identically zero.

Notice that for a function f : R → R that satisfies f ′′(x) + f(x) = 0, we have

f^{(4)}(x) = −f ′′(x) = f(x).

This implies that for all positive integers n,

f^{(4n)}(x) = f(x), f^{(4n+1)}(x) = f ′(x),
f^{(4n+2)}(x) = f ′′(x), f^{(4n+3)}(x) = f ′′′(x).


If f is the unique solution to

f ′′ (x) + f (x) = 0, f (0) = α, f ′ (0) = β,

then its derivative g = f ′ is the unique solution to

g ′′ (x) + g(x) = 0, g(0) = β, g ′ (0) = −α.

Definition 3.15 The Sine and Cosine functions


The sine function S(x) = sin x is defined to be the unique twice
differentiable function satisfying

S ′′ (x) + S(x) = 0, S(0) = 0, S ′ (0) = 1.

The cosine function C(x) = cos x is defined as the derivative of S(x).


Namely, C(x) = S ′ (x). It is the unique twice differentiable function
satisfying

C ′′ (x) + C(x) = 0, C(0) = 1, C ′ (0) = 0.

Notice that once we prove the existence of the function S(x) = sin x, then the
function C(x) = cos x exists. One can then check that the function

f(x) = αC(x) + βS(x)

is a twice differentiable function satisfying

f ′′(x) + f(x) = 0, f(0) = α, f ′(0) = β.

In other words, to prove the existence part in Theorem 3.32, we only need to
establish the existence of the function S(x) = sin x.
In the following, we establish the properties of the functions S(x) and C(x).

Theorem 3.34
The functions S(x) and C(x) are infinitely differentiable functions that
satisfy the following.

(a) S ′ (x) = C(x) and C ′ (x) = −S(x) for all x ∈ R.

(b) S(x) is an odd function, C(x) is an even function.

(c) S(x)^2 + C(x)^2 = 1 for all x ∈ R.

(d) For any real numbers x and y, S(x + y) = S(x)C(y) + C(x)S(y).

(e) For any real numbers x and y, C(x + y) = C(x)C(y) − S(x)S(y).

Proof
S ′(x) = C(x) holds by the definition of C(x). Differentiating gives C ′(x) =
S ′′(x) = −S(x). This proves (a). To prove (b), one checks that the function
f(x) = −S(−x) satisfies f ′′(x) + f(x) = 0, f(0) = 0 and f ′(0) = 1. By uniqueness
of the function S(x), we have f(x) = S(x), which proves that S(x) is an odd
function. Since C(x) = S ′(x) and the derivative of an odd function is even,
C(x) is an even function. Lemma 3.33 says
that S(x)^2 + S ′(x)^2 is a constant. Hence, there is a constant A such that

S(x)^2 + C(x)^2 = A for all x ∈ R.

Setting x = 0 gives A = 1. This proves part (c). For part (d), fix a real
number y and consider the function

f(x) = S(x + y).

We find that
f ′(x) = S ′(x + y) = C(x + y),
f ′′(x) + f(x) = S ′′(x + y) + S(x + y) = 0,
and
f(0) = S(y), f ′(0) = C(y).
Since the function g(x) = S(y)C(x) + C(y)S(x) satisfies

g ′′(x) + g(x) = 0, g(0) = S(y), g ′(0) = C(y),

by uniqueness, we find that f(x) = g(x) for all x ∈ R. Therefore,

S(x + y) = S(x)C(y) + C(x)S(y).

Differentiating with respect to x gives

C(x + y) = C(x)C(y) − S(x)S(y).

Here we have used advanced analytic tools to prove the identities (3.9) in a
simple way. Part (c) in Theorem 3.34 says that

sin^2 x + cos^2 x = 1 for all x ∈ R.

This implies that

| sin x| ≤ 1, | cos x| ≤ 1 for all x ∈ R.

By definition, sin 0 = S(0) = 0 and cos 0 = C(0) = 1. What is not obvious is


that 0 is in the range of C(x).

Theorem 3.35
There is a smallest positive number u such that C(u) = 0.

Proof
Since S(x) is differentiable, we can apply the mean value theorem to conclude
that there is a point v in (0, 2) such that

(S(2) − S(0))/(2 − 0) = S ′(v) = C(v).

This gives
|C(v)| = (1/2)|S(2)| ≤ 1/2.
By part (e) and part (c) in Theorem 3.34,

C(2v) = C(v)^2 − S(v)^2 = 2C(v)^2 − 1 ≤ 1/2 − 1 < 0.

Since C(2v) < 0 < C(0), and C(x) is a continuous function, the intermediate
value theorem implies that there is a point w in (0, 2v) such that C(w) = 0.
Let
A = {w > 0 | C(w) = 0}.
We have just shown that A is a nonempty set. By definition, A is bounded
below by 0. Hence, u = inf A exists. By Lemma 1.34, there is a sequence
{wn} in A that converges to u. Since C(x) is continuous, the sequence
{C(wn)} converges to C(u). But C(wn) = 0 for all n. Hence, C(u) = 0.
Since C(0) = 1, u ≠ 0. Hence, u > 0. This proves that u is the smallest
positive number such that C(u) = 0.

Let u be the smallest positive number such that C(u) = 0. Then we must have
C(x) > 0 for all x ∈ [0, u). Since S ′ (x) = C(x), S(x) is strictly increasing on
[0, u]. Thus, S(x) > 0 for all x ∈ (0, u]. This, and S(u)^2 + C(u)^2 = 1, implies
that S(u) = 1. From part (d) and part (e) in Theorem 3.34, we find that

S(x + u) = S(x)C(u) + C(x)S(u) = C(x),
C(x + u) = C(x)C(u) − S(x)S(u) = −S(x).

It follows that

S(x + 2u) = C(x + u) = −S(x),


C(x + 2u) = −S(x + u) = −C(x).
S(x + 3u) = C(x + 2u) = −C(x),
C(x + 3u) = −S(x + 2u) = S(x).
S(x + 4u) = C(x + 3u) = S(x),
C(x + 4u) = −S(x + 3u) = C(x).

The last pair of equations shows that S(x) and C(x) are periodic functions of
period 4u. Since S(x) > 0 and C(x) > 0 for x ∈ (0, u), we have the following.

• For x ∈ (0, u), C(x) > 0, S(x) > 0.

• For x ∈ (u, 2u), C(x) < 0, S(x) > 0.

• For x ∈ (2u, 3u), C(x) < 0, S(x) < 0.

• For x ∈ (3u, 4u), C(x) > 0, S(x) < 0.

Together with S(0) = 0, C(0) = 1, S(u) = 1, C(u) = 0, we find that S(2u) = 0,
C(2u) = −1, S(3u) = −1, C(3u) = 0. These imply that for every P(x, y) on
the unit circle x^2 + y^2 = 1, there is a unique θ ∈ [0, 4u) such that

x = C(θ), y = S(θ).

What is not obvious is that this θ is exactly the radian measure of the angle that
the line segment OP makes with the positive x-axis. To show this, we can argue in the
following way. Assume that an object is travelling on the circle x^2 + y^2 = 1, and
its position at time t is (x(t), y(t)), where

x = C(t), y = S(t).

It follows that the velocity of the object at time t is (x′(t), y′(t)), where

x′(t) = −S(t), y′(t) = C(t).

This implies that the speed is

√(x′(t)^2 + y′(t)^2) = √(S(t)^2 + C(t)^2) = 1.

Hence, the object is travelling at a constant speed 1. The distance travelled up to
time t is then t. This proves that the arclength of the arc from (1, 0) to the point
P(C(t), S(t)) is t. Then t must be the radian measure of the angle OP makes with the
positive x-axis. Hence, the functions C(t) and S(t) coincide with the classical
cos t and sin t functions. Having proved this, by the definition of π, we have

2u = π.
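
The relation 2u = π can be observed numerically. The sketch below integrates the system S′ = C, C′ = −S with S(0) = 0, C(0) = 1 by the classical fourth-order Runge-Kutta method and stops at the first sign change of C. The choice of method, the step size and the function name first_zero_of_cosine are our assumptions for illustration; the text itself establishes existence by other means.

```python
import math

def first_zero_of_cosine(step=1e-3):
    """Estimate the smallest u > 0 with C(u) = 0, where S' = C, C' = -S,
    S(0) = 0, C(0) = 1, using classical RK4 (illustrative helper)."""
    def rhs(s, c):
        return c, -s                     # (S', C')

    x, s, c = 0.0, 0.0, 1.0
    while True:
        k1s, k1c = rhs(s, c)
        k2s, k2c = rhs(s + 0.5 * step * k1s, c + 0.5 * step * k1c)
        k3s, k3c = rhs(s + 0.5 * step * k2s, c + 0.5 * step * k2c)
        k4s, k4c = rhs(s + step * k3s, c + step * k3c)
        s_next = s + step * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0
        c_next = c + step * (k1c + 2 * k2c + 2 * k3c + k4c) / 6.0
        if c_next <= 0.0:
            # C changes sign in [x, x + step]; locate the zero by linear interpolation.
            return x + step * c / (c - c_next)
        x, s, c = x + step, s_next, c_next

u = first_zero_of_cosine()
print(2.0 * u, math.pi)                  # the two values should agree closely
```

The loop terminates because Theorem 3.35 guarantees that C has a positive zero.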

Hence, we can summarize the facts above as follows.

Properties of the Sine and Cosine Functions


The functions S(x) = sin x and C(x) = cos x are 2π-periodic infinitely
differentiable functions with
d/dx sin x = cos x, d/dx cos x = − sin x.
Moreover, they have the following properties.

1. sin(x + π/2) = cos x, cos(x + π/2) = − sin x.

2. sin(x + π) = − sin x, cos(x + π) = − cos x.

3. sin(x + 3π/2) = − cos x, cos(x + 3π/2) = sin x.

4. sin x is an odd function, cos x is an even function.

5. sin x = 0 if and only if x = nπ, where n is an integer.

6. cos x = 0 if and only if x = (n + 1/2)π, where n is an integer.

There are four other trigonometric functions. They are defined in terms of
sin x and cos x in the usual way.

Figure 3.16: The sine function S(x) = sin x.

Figure 3.17: The cosine function C(x) = cos x.

Definition 3.16 Trigonometric Functions

The tangent, cotangent, secant and cosecant functions are defined as

tan x = sin x / cos x, sec x = 1 / cos x, x ≠ (n + 1/2)π, n ∈ Z;
cot x = cos x / sin x, csc x = 1 / sin x, x ≠ nπ, n ∈ Z.

The following are easy to derive.

Proposition 3.36
tan x, cot x, sec x and csc x are infinitely differentiable functions with
d/dx tan x = sec^2 x, d/dx cot x = − csc^2 x,
d/dx sec x = sec x tan x, d/dx csc x = − csc x cot x.

Figure 3.18: The function f (x) = tan x.

Before closing this subsection, we want to prove some important limits and
inequalities for the function sin x.

Theorem 3.37

1. For any real number x, | sin x| ≤ |x|.

2. lim_{x→0} sin x / x = 1.

3. For any x ∈ [0, π/2],

(2/π) x ≤ sin x ≤ x.

Proof
When x = 0, sin x = 0 and | sin x| ≤ |x| is obviously true. If x ≠ 0, the mean
value theorem implies that there is a number c in (0, 1) such that

sin x / x = (sin x − sin 0)/(x − 0) = cos(cx). (3.12)

Hence,
| sin x / x | = | cos(cx)| ≤ 1,
which implies that | sin x| ≤ |x|. This proves the first statement.

For the second statement, the definition of derivative implies that

lim_{x→0} sin x / x = lim_{x→0} (sin x − sin 0)/(x − 0) = d/dx sin x |_{x=0} = cos 0 = 1.

For the third statement, define the function g : [0, π/2] → R by

g(x) = sin x / x, if 0 < x ≤ π/2,
g(x) = 1, if x = 0.

Then g is continuous on [0, π/2], and differentiable on (0, π/2), with

g ′(x) = (x cos x − sin x)/x^2 = (1/x) (cos x − sin x / x) when 0 < x < π/2.

As before, for each x ∈ (0, π/2], the mean value theorem implies that there is a
u ∈ (0, x) such that
sin x / x = cos u.
Since 0 < u < x and the cosine function is strictly decreasing on (0, π/2),
we find that

g ′(x) = (1/x) (cos x − sin x / x) = (1/x) (cos x − cos u) < 0.

This shows that g : [0, π/2] → R is a strictly decreasing function. Since
g(0) = 1 and g(π/2) = 2/π, we find that for all x ∈ (0, π/2],

2/π ≤ sin x / x ≤ 1.

Thus, for all x ∈ [0, π/2],
(2/π) x ≤ sin x ≤ x.
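
The two-sided bound in the third statement can be sampled on a grid as a sanity check. This is only a sketch: the function name jordan_gap and the grid size are ours, and a finite sample cannot replace the proof above.

```python
import math

def jordan_gap(samples=1000):
    """Return the smallest slack found in 2x/pi <= sin x and in sin x <= x
    over an evenly spaced grid on [0, pi/2] (illustrative helper)."""
    worst_lower = worst_upper = float("inf")
    for k in range(samples + 1):
        x = (math.pi / 2.0) * k / samples
        worst_lower = min(worst_lower, math.sin(x) - 2.0 * x / math.pi)
        worst_upper = min(worst_upper, x - math.sin(x))
    return worst_lower, worst_upper

lower_slack, upper_slack = jordan_gap()
print(lower_slack, upper_slack)   # both should be zero up to rounding error
```

Both slacks vanish only at the endpoints of the interval, where the inequalities become equalities, so values on the order of machine precision are expected.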

3.5.4 The Inverse Trigonometric Functions

In this section, we are going to define inverse functions for sin x, cos x and tan x.
Since trigonometric functions are periodic functions, they are not one-to-one.
Hence, we cannot find their inverses over the whole domain of their definitions.
However, we can restrict each of their domains to an interval on which each of
them is one-to-one to define the inverse. Such an interval should contain the interval
(0, π/2), which is where these functions are classically defined.

• The largest interval that contains the interval (0, π/2) and on which sin x is
one-to-one is [−π/2, π/2].

• The largest interval that contains the interval (0, π/2) and on which cos x is
one-to-one is [0, π].

• The largest interval that contains the interval (0, π/2) and on which tan x is
one-to-one is (−π/2, π/2).

Definition 3.17 Inverse Sine Function


The function sin^{−1} x is a function defined on [−1, 1] and with range [−π/2, π/2]
such that

sin(sin^{−1} x) = x for all x ∈ [−1, 1];
sin^{−1}(sin x) = x for all x ∈ [−π/2, π/2].

Definition 3.18 Inverse Cosine Function

The function cos^{−1} x is a function defined on [−1, 1] and with range [0, π]
such that

cos(cos^{−1} x) = x for all x ∈ [−1, 1];
cos^{−1}(cos x) = x for all x ∈ [0, π].

Definition 3.19 Inverse Tangent Function

The function tan^{−1} x is a function defined on R and with range (−π/2, π/2)
such that

tan(tan^{−1} x) = x for all x ∈ R;
tan^{−1}(tan x) = x for all x ∈ (−π/2, π/2).

The differentiability of the inverse trigonometric functions and their derivative
formulas follow immediately from Theorem 3.8.

Theorem 3.38
sin^{−1} : (−1, 1) → R is a differentiable function with

d/dx sin^{−1} x = 1/√(1 − x^2).

Theorem 3.39
cos^{−1} : (−1, 1) → R is a differentiable function with

d/dx cos^{−1} x = −1/√(1 − x^2).

Theorem 3.40
tan^{−1} : R → R is a differentiable function with
d/dx tan^{−1} x = 1/(1 + x^2).

Exercises 3.5
Question 1
Determine the following limits.

(a) lim_{n→∞} (1 − 1/n)^n

(b) lim_{n→∞} (1 + 2/n)^n

(c) lim_{n→∞} (1 − 2/n)^n

Question 2

For any x ∈ [−1, 1], show that sin^{−1} x + cos^{−1} x is a constant and find this
constant.

Question 3
Determine the following limits.

(a) lim_{x→(π/2)−} tan x

(b) lim_{x→(−π/2)+} tan x

(c) lim_{x→−∞} tan^{−1} x

(d) lim_{x→∞} tan^{−1} x

Question 4
Consider the function f : R → R defined by

f(x) = sin(1/x), if x ≠ 0,
f(x) = 0, if x = 0.

Determine whether f is a continuous function. If not, find the points where
the function f is not continuous.

Question 5
Consider the function f : R → R defined by

f(x) = x sin(1/x), if x ≠ 0,
f(x) = 0, if x = 0.

Show that f is a continuous function.

Question 6
Consider the function f : R → R defined by

f(x) = x^2 sin(1/x), if x ≠ 0,
f(x) = 0, if x = 0.

Let g : R → R be the function defined by

g(x) = x + f(x).

(a) Show that f is a differentiable function.

(b) Show that f ′ : R → R is not continuous.

(c) Show that g ′(0) = 1, but for any neighbourhood (a, b) of 0, g : (a, b) →
R is not increasing.

3.6 L’ Hôpital’s Rules

In this section, we will apply the Cauchy mean value theorem to prove l’
Hôpital’s rules. These are useful rules for finding limits of the form

lim_{x→x0} f(x)/g(x),

when we have one of the following two indeterminate forms.

1. Type 0/0, where lim_{x→x0} f(x) = 0 and lim_{x→x0} g(x) = 0.

2. Type ∞/∞, where lim_{x→x0} f(x) = ∞ and lim_{x→x0} g(x) = ∞.

Here x0 can be ∞ or −∞.

Let us first prove the following special case.

Theorem 3.41
Let f : (a, b) → R and g : (a, b) → R be differentiable functions that
satisfy the following conditions.

(i) lim_{x→a+} f(x) = lim_{x→a+} g(x) = 0.

(ii) lim_{x→a+} f ′(x)/g ′(x) = L.

Then lim_{x→a+} f(x)/g(x) = L.

Proof
The condition (i) implies that we can extend f and g to be continuous
functions on [a, b) by defining f(a) = g(a) = 0. Then by the Cauchy mean
value theorem, for any x ∈ (a, b), there is a point u(x) ∈ (a, x) such that

f(x)/g(x) = (f(x) − f(a))/(g(x) − g(a)) = f ′(u(x))/g ′(u(x)). (3.13)

Since
a < u(x) < x,
the squeeze theorem implies that

lim_{x→a+} u(x) = a.

By the limit law for composite functions, we find that

lim_{x→a+} f ′(u(x))/g ′(u(x)) = lim_{u→a+} f ′(u)/g ′(u) = L.

By (3.13), this proves that

lim_{x→a+} f(x)/g(x) = L.

It is easy to see that an analogue of Theorem 3.41 holds for left limits. Combining
the left limit and the right limit, we have the following.

Theorem 3.42 l’ Hôpital’s Rule I

Let x0 be a point in the open interval (a, b), and let D = (a, b) \ {x0}. Suppose
that f : D → R and g : D → R are differentiable functions that satisfy the
following conditions.

(i) lim_{x→x0} f(x) = lim_{x→x0} g(x) = 0.

(ii) lim_{x→x0} f ′(x)/g ′(x) = L.

Then we have lim_{x→x0} f(x)/g(x) = L.

We return to a problem that we discussed earlier.



Example 3.20
Determine the limit
lim_{x→1} (x^20 + 2x^9 − 3)/(x^7 − 1)
if it exists.

Solution
Let f(x) = x^20 + 2x^9 − 3 and g(x) = x^7 − 1. Then

lim_{x→1} f(x) = f(1) = 0 and lim_{x→1} g(x) = g(1) = 0.

f and g are continuously differentiable functions with

f ′(x) = 20x^19 + 18x^8 and g ′(x) = 7x^6.

Since
lim_{x→1} f ′(x)/g ′(x) = lim_{x→1} (20x^19 + 18x^8)/(7x^6) = 38/7,
l’ Hôpital’s rule implies that

lim_{x→1} (x^20 + 2x^9 − 3)/(x^7 − 1) = 38/7.

Let us look at some other examples.

Example 3.21
Determine whether the limit exists. If it exists, find the limit.

(a) lim_{x→0} (e^x − 1 − x)/x^2

(b) lim_{x→0} sin 2x/(3x)

(c) lim_{x→0} (cos 2x − 1)/x^2

Solution
(a) This is a limit of the form 0/0. Applying l’ Hôpital’s rule, we have

lim_{x→0} (e^x − 1 − x)/x^2 = lim_{x→0} (e^x − 1)/(2x).

Again, we have a limit of the form 0/0. Applying l’ Hôpital’s rule
again, we have

lim_{x→0} (e^x − 1 − x)/x^2 = lim_{x→0} (e^x − 1)/(2x) = lim_{x→0} e^x/2 = 1/2.

(b) This is a limit of the form 0/0. Applying l’ Hôpital’s rule, we have

lim_{x→0} sin 2x/(3x) = lim_{x→0} (2 cos 2x)/3 = 2/3.

(c) This is a limit of the form 0/0. Applying l’ Hôpital’s rule twice, we
have

lim_{x→0} (cos 2x − 1)/x^2 = lim_{x→0} (−2 sin 2x)/(2x) = lim_{x→0} (−4 cos 2x)/2 = −2.
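
The three limits in this example can be sanity-checked by evaluating the quotients at a small nonzero x. This sketch only samples the functions near 0 and proves nothing; the function names ratio_a, ratio_b and ratio_c are ours, and math.expm1 is used to limit cancellation error in e^x − 1.

```python
import math

def ratio_a(x):
    return (math.expm1(x) - x) / x**2        # (e^x - 1 - x) / x^2

def ratio_b(x):
    return math.sin(2.0 * x) / (3.0 * x)     # sin 2x / (3x)

def ratio_c(x):
    return (math.cos(2.0 * x) - 1.0) / x**2  # (cos 2x - 1) / x^2

for x in (1e-1, 1e-2, 1e-3):
    # The values should approach 1/2, 2/3 and -2 respectively as x shrinks.
    print(x, ratio_a(x), ratio_b(x), ratio_c(x))
```

Taking x much smaller than about 1e-5 is not advisable in floating point, since the numerators then suffer from cancellation.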

Using l’ Hôpital’s rule, we can give a second solution to Example 3.19.

Example 3.22
Since f is continuous, we have

lim_{h→0} (f(x0 + h) + f(x0 − h) − 2f(x0)) = 0.

Since we also have lim_{h→0} h^2 = 0, we can apply l’ Hôpital’s rule to get

lim_{h→0} (f(x0 + h) + f(x0 − h) − 2f(x0))/h^2 = lim_{h→0} (f ′(x0 + h) − f ′(x0 − h))/(2h).

Since f ′ is continuous,

lim_{h→0} (f ′(x0 + h) − f ′(x0 − h)) = 0.

Since we also have lim_{h→0} (2h) = 0, applying l’ Hôpital’s rule again gives

lim_{h→0} (f ′(x0 + h) − f ′(x0 − h))/(2h) = lim_{h→0} (f ′′(x0 + h) + f ′′(x0 − h))/2.

It follows from the continuity of f ′′ that

lim_{h→0} (f ′′(x0 + h) + f ′′(x0 − h))/2 = f ′′(x0).

These prove that

lim_{h→0} (f(x0 + h) + f(x0 − h) − 2f(x0))/h^2 = f ′′(x0).

In the future, we are going to see that Taylor’s approximation is an alternative
to l’ Hôpital’s rule when the point x0 is finite and the indeterminate form is of the
type 0/0. However, when x0 is infinite or the indeterminate form is of type ∞/∞,
l’ Hôpital’s rule becomes useful.
The following is for the case where x0 is infinite, and the limit is of the form
0/0.

Theorem 3.43 l’ Hôpital’s Rule II

Let a be a positive number. Suppose that f : (a, ∞) → R and g : (a, ∞) → R
are differentiable functions that satisfy the following conditions.

(i) lim_{x→∞} f(x) = lim_{x→∞} g(x) = 0.

(ii) lim_{x→∞} f ′(x)/g ′(x) = L.

Then we have lim_{x→∞} f(x)/g(x) = L.

Proof
Let b = 1/a, and define the functions f1 : (0, b) → R and g1 : (0, b) → R
by
f1(x) = f(1/x), g1(x) = g(1/x).
Then f1 and g1 are differentiable functions and

f1′(x) = −(1/x^2) f ′(1/x), g1′(x) = −(1/x^2) g ′(1/x).

Moreover,

lim_{x→0+} f1(x) = lim_{x→∞} f(x) = 0, lim_{x→0+} g1(x) = lim_{x→∞} g(x) = 0,

and

lim_{x→0+} f1′(x)/g1′(x) = lim_{x→0+} f ′(1/x)/g ′(1/x) = lim_{x→∞} f ′(x)/g ′(x) = L.

By Theorem 3.41,
lim_{x→0+} f1(x)/g1(x) = L.
This implies that
lim_{x→∞} f(x)/g(x) = L.

Let us look at the following example.

Example 3.23
Determine whether the limit lim_{x→∞} (x/(x + 2))^{x+1} exists. If it exists, find the
limit.

This is not of the type 0/0. But the logarithm of it can be turned into that form.

Solution
Consider the function

g(x) = ln (x/(x + 2))^{x+1} = (x + 1) ln(x/(x + 2)).

When x → ∞, we have something of the form ∞ · 0. We turn it to the
form 0/0 by

g(x) = ln(x/(x + 2)) / (1/(x + 1)) = (ln x − ln(x + 2)) / (1/(x + 1)).

l’ Hôpital’s rule implies that

lim_{x→∞} g(x) = lim_{x→∞} (1/x − 1/(x + 2)) / (−1/(x + 1)^2)
= −2 lim_{x→∞} (x^2 + 2x + 1)/(x^2 + 2x)
= −2.

By continuity of the exponential function, we have

lim_{x→∞} (x/(x + 2))^{x+1} = lim_{x→∞} e^{g(x)} = exp(lim_{x→∞} g(x)) = e^{−2}.
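
A quick numerical check of this example evaluates g(x) and e^{g(x)} for increasingly large x. The sketch rewrites ln(x/(x + 2)) as ln(1 − 2/(x + 2)) so that math.log1p can be used, which is our choice to avoid subtracting two nearly equal logarithms; the function names log_form and original_form are hypothetical.

```python
import math

def log_form(x):
    """g(x) = (x + 1) ln(x / (x + 2)), using ln(x/(x+2)) = ln(1 - 2/(x+2))."""
    return (x + 1.0) * math.log1p(-2.0 / (x + 2.0))

def original_form(x):
    """(x / (x + 2))**(x + 1), evaluated through its logarithm."""
    return math.exp(log_form(x))

for x in (1e2, 1e4, 1e6):
    # g(x) should approach -2 and the original expression should approach exp(-2).
    print(x, log_form(x), original_form(x), math.exp(-2.0))
```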

Suppose we want to find the limit

lim_{x→∞} x/e^x. (3.14)

This is a limit of the form ∞/∞. One may say that we can turn it to a limit of the
form 0/0 by writing
x/e^x = e^{−x}/x^{−1}.
Then l’ Hôpital’s rule says that if the limit

lim_{x→∞} (−e^{−x})/(−x^{−2}) (3.15)

exists and is equal to L, the limit (3.14) also exists and is equal to L. However, the
limit (3.15) is more complicated than the limit (3.14). So this strategy is useless.
Hence, there is a need for us to consider the ∞/∞ indeterminate case. We only
prove the theorem in the case x0 is finite. The case where x0 is infinite can be
dealt with in the same way as in the proof of Theorem 3.43.

Theorem 3.44 l’ Hôpital’s Rule III

Let x0 be a point in the open interval (a, b), and let D be the set D =
(a, b) \ {x0 }. Suppose that f : D → R and g : D → R are differentiable
functions that satisfy the following conditions.

(i) $\lim_{x\to x_0} f(x) = \lim_{x\to x_0} g(x) = \infty$.

(ii) $\lim_{x\to x_0} \dfrac{f'(x)}{g'(x)} = L$.

Then we have $\lim_{x\to x_0} \dfrac{f(x)}{g(x)} = L$.

The proof of this theorem is technical because of the infinite limits. The
strategy of rewriting this as
\[ \lim_{x\to x_0} \frac{1/g(x)}{1/f(x)} \]
is not useful, as has been demonstrated in our discussion before this theorem.

Proof
We will prove that the right limit $\lim_{x\to x_0^+} \dfrac{f(x)}{g(x)}$ is equal to L. The proof that
the left limit is equal to L is similar. Observe that if we fix a point u in
(x0 , b), then for any x in (x0 , u), Cauchy mean value theorem asserts that
there is a cx in (x, u) such that
\[ \frac{f(x) - f(u)}{g(x) - g(u)} = \frac{f'(c_x)}{g'(c_x)}. \]

This implies that
\[ \frac{(f(x) - Lg(x)) - (f(u) - Lg(u))}{g(x) - g(u)} = \frac{f'(c_x)}{g'(c_x)} - L. \]
Thus,
\[ \frac{f(x)}{g(x)} - L = \frac{g(x) - g(u)}{g(x)} \left( \frac{f'(c_x)}{g'(c_x)} - L \right) + \frac{f(u) - Lg(u)}{g(x)}. \tag{3.16} \]

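Fix ε > 0. By the assumption that
\[ \lim_{x\to x_0^+} \frac{f'(x)}{g'(x)} = L, \]
there exists a δ1 > 0 such that (x0 , x0 + δ1 ) ⊂ (a, b), and for any x ∈
(x0 , x0 + δ1 ),
\[ \left| \frac{f'(x)}{g'(x)} - L \right| < \frac{\varepsilon}{3}. \]
Take u = x0 + δ1 /2. Since $\lim_{x\to x_0} g(x) = \infty$, we find that
\[ \lim_{x\to x_0^+} \frac{g(x) - g(u)}{g(x)} = 1, \qquad \lim_{x\to x_0^+} \frac{f(u) - Lg(u)}{g(x)} = 0. \]
Therefore, there exists a number δ such that 0 < δ ≤ δ1 /2, and for all
x ∈ (x0 , x0 + δ),
\[ \left| \frac{g(x) - g(u)}{g(x)} \right| < 2, \qquad \left| \frac{f(u) - Lg(u)}{g(x)} \right| < \frac{\varepsilon}{3}. \]
If x is in (x0 , x0 + δ), x0 < x < u and hence x0 < cx < u < x0 + δ1 . This
implies that
\[ \left| \frac{f'(c_x)}{g'(c_x)} - L \right| < \frac{\varepsilon}{3}. \]
Eq. (3.16) then implies that for all x ∈ (x0 , x0 + δ),
\[ \left| \frac{f(x)}{g(x)} - L \right| \le \left| \frac{g(x) - g(u)}{g(x)} \right| \left| \frac{f'(c_x)}{g'(c_x)} - L \right| + \left| \frac{f(u) - Lg(u)}{g(x)} \right| < 2 \times \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \]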

This proves that
\[ \lim_{x\to x_0^+} \frac{f(x)}{g(x)} = L. \]

Notice that in the proof, we do not use the assumption that $\lim_{x\to x_0} f(x) = \infty$.
Hence, this assumption can be omitted from the conditions in the theorem. If f (x) is bounded
in a neighbourhood of x0 , there is no need to apply l’ Hôpital’s rule.
Let us now look at some examples.

Example 3.24
Let r be a positive number. Prove that
\[ \lim_{x\to\infty} \frac{\ln x}{x^r} = 0. \]
Deduce that for any positive number s,
\[ \lim_{x\to\infty} x^s e^{-x} = 0. \]

Solution
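The limit
\[ \lim_{x\to\infty} \frac{\ln x}{x^r} \]
is of the form ∞/∞. Applying l’ Hôpital’s rule, we have
\[ \lim_{x\to\infty} \frac{\ln x}{x^r} = \lim_{x\to\infty} \frac{\dfrac{1}{x}}{r x^{r-1}} = \frac{1}{r} \lim_{x\to\infty} \frac{1}{x^r} = 0. \]
Since $\lim_{x\to\infty} e^x = \infty$, and the function f (x) = x^s is a continuous function,
the substitution u = e^x gives
\[ \lim_{x\to\infty} x^s e^{-x} = \lim_{u\to\infty} \frac{(\ln u)^s}{u} = \lim_{u\to\infty} \left( \frac{\ln u}{u^{1/s}} \right)^s = 0^s = 0. \]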

The result of this example shows that when x becomes large,

• ln x goes to infinity slower than any positive power of x;

• any positive power of x goes to infinity slower than e^x .
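
The two growth comparisons above can be illustrated numerically. The sketch below is only an illustration under assumed sample values: the exponents r = 0.5 and s = 3 and the helper names `log_over_power` and `power_times_exp` are choices made here, not taken from the text. For smaller r the decay of ln x / x^r is much slower, so larger x would be needed to see it.

```python
import math

def log_over_power(x: float, r: float) -> float:
    """Evaluate ln(x) / x**r for x > 0 and r > 0."""
    return math.log(x) / x ** r

def power_times_exp(x: float, s: float) -> float:
    """Evaluate x**s * exp(-x) for x > 0 and s > 0."""
    return x ** s * math.exp(-x)

r, s = 0.5, 3.0  # illustrative exponents only

for x in (1e2, 1e4, 1e6, 1e8):
    print(f"x = {x:.0e}  ln(x)/x^r = {log_over_power(x, r):.6f}")

for x in (10.0, 50.0, 100.0):
    print(f"x = {x:5.1f}  x^s e^(-x) = {power_times_exp(x, s):.3e}")
```

Both printed columns decrease towards 0, in line with Example 3.24.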

Example 3.25
Show that there exists a number c so that the function
\[ f(x) = \begin{cases} x^x, & \text{if } x > 0, \\ c, & \text{if } x = 0, \end{cases} \]
is continuous.

Solution
Since
\[ f(x) = x^x = e^{x \ln x} \quad \text{when } x > 0, \]
f (x) is continuous on (0, ∞). To make f continuous, f must be continuous
at x = 0. This means
\[ c = f(0) = \lim_{x\to 0^+} f(x) = \lim_{x\to 0^+} e^{x \ln x}. \]
Let us look at the limit $\lim_{x\to 0^+} x \ln x$. It is of the form 0 · ∞. We turn it to
the form ∞/∞, and use l’ Hôpital’s rule.
\[ \lim_{x\to 0^+} x \ln x = \lim_{x\to 0^+} \frac{\ln x}{\dfrac{1}{x}} = \lim_{x\to 0^+} \frac{\dfrac{1}{x}}{-\dfrac{1}{x^2}} = - \lim_{x\to 0^+} x = 0. \]
Therefore, when
\[ c = \exp\left( \lim_{x\to 0^+} x \ln x \right) = e^0 = 1, \]
the function f : [0, ∞) → R is continuous.
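
A quick numerical look at x^x near 0 agrees with the value c = 1 found above. This is a minimal sketch; the function name `f_extended` and the sample points are assumptions made for illustration.

```python
import math

def f_extended(x: float) -> float:
    """x**x for x > 0, extended by the value c = 1 at x = 0."""
    if x < 0:
        raise ValueError("f is only defined on [0, infinity)")
    if x == 0:
        return 1.0
    return math.exp(x * math.log(x))

for x in (1e-1, 1e-2, 1e-4, 1e-8):
    print(f"x = {x:.0e}  x ln x = {x * math.log(x):+.3e}  x^x = {f_extended(x):.8f}")
```

As x decreases to 0, x ln x approaches 0 from below and x^x approaches 1.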



Exercises 3.6
Question 1
Determine whether the limit exists. If it exists, find the limit.

(a) $\lim_{x\to 1} \dfrac{x^{100} + x^{50} - 2}{3x^{101} + 4x^{55} - 7x}$

(b) $\lim_{x\to 0} \dfrac{2e^{-x} - 2 + 2x}{x^2 + 3x^3}$

(c) $\lim_{x\to 0} \dfrac{\tan^{-1} x}{x}$

(d) $\lim_{x\to 0} \dfrac{\tan x - x}{x^3}$

Question 2
Find the limit
\[ \lim_{x\to\infty} \left( \frac{3x - 1}{3x + 1} \right)^{2x+1}. \]

Question 3
Let r be a positive number. Prove that
\[ \lim_{x\to 0^+} x^r \ln x = 0. \]

Question 4
Determine whether the limit exists. If it exists, find the limit.

(a) $\lim_{x\to 1} \dfrac{x^x - 1}{x - 1}$

(b) $\lim_{x\to 0^+} \dfrac{x^x - 1}{x \ln x}$

(c) $\lim_{x\to 0^+} \dfrac{x^x \ln(1 + x)}{x}$

3.7 Concavity of Functions

In this section, we study concavity of functions. If a function is twice differentiable,
its concavity is determined by the second derivative.
Recall that if x1 and x2 are two points in R, then as t runs through all numbers
in the interval [0, 1],

x1 + t(x2 − x1 ) = (1 − t)x1 + tx2

runs through all points in the interval [x1 , x2 ]. We say that a subset S of R is
convex if and only if for any two points x1 and x2 in S, and for any t in [0, 1], the
point (1 − t)x1 + tx2 is also in S. We have proved that a subset of R is convex if
and only if it is an interval.

Definition 3.20 Concavity of Functions


Let I be an interval.

1. A function f : I → R is concave up (or convex) provided that for any
two points x1 and x2 in I, and for any t ∈ [0, 1],

f ((1 − t)x1 + tx2 ) ≤ (1 − t)f (x1 ) + tf (x2 ).

2. A function f : I → R is concave down provided that for any two points
x1 and x2 in I, and for any t ∈ [0, 1],

f ((1 − t)x1 + tx2 ) ≥ (1 − t)f (x1 ) + tf (x2 ).

3. A function f : I → R is strictly concave up (or strictly convex) provided
that for any two distinct points x1 and x2 in I, and for any t ∈ (0, 1),

f ((1 − t)x1 + tx2 ) < (1 − t)f (x1 ) + tf (x2 ).

4. A function f : I → R is strictly concave down provided that for any
two distinct points x1 and x2 in I, and for any t ∈ (0, 1),

f ((1 − t)x1 + tx2 ) > (1 − t)f (x1 ) + tf (x2 ).



Notice that a function f : I → R is concave up if and only if the function
−f : I → R is concave down. The same holds for strict concavity.
Geometrically, we draw a line L passing through the points P1 (x1 , f (x1 )) and
P2 (x2 , f (x2 )) on the graph y = f (x). If the equation of this line L is y = g(x),
and x0 = (1 − t)x1 + tx2 , then

g(x0 ) = (1 − t)f (x1 ) + tf (x2 ).

Hence, (x0 , g(x0 )) is a point on the line L. Therefore, a function y = f (x) is strictly
concave up if its graph is always below a line segment joining two points on the
graph; and it is strictly concave down if its graph is always above a line segment
joining two points on the graph.

Figure 3.19: (a) A strictly concave up function. (b) A strictly concave down
function.

Example 3.26

For any constants m and c, the function f (x) = mx + c is concave up and
concave down. It is neither strictly concave up nor strictly concave down.

Example 3.27

Show that the function f : R → R, f (x) = x2 is strictly concave up.



Solution
Let x1 and x2 be any two distinct real numbers, and let t be a number in the
interval (0, 1). Then

\begin{align*}
& f((1-t)x_1 + tx_2) - (1-t)f(x_1) - tf(x_2) \\
&= (1-t)^2 x_1^2 + 2t(1-t)x_1x_2 + t^2x_2^2 - (1-t)x_1^2 - tx_2^2 \\
&= -t(1-t)x_1^2 + 2t(1-t)x_1x_2 - t(1-t)x_2^2 \\
&= -t(1-t)(x_1 - x_2)^2.
\end{align*}

Since $(x_1 - x_2)^2 > 0$, t > 0 and 1 − t > 0, we find that

f ((1 − t)x1 + tx2 ) − (1 − t)f (x1 ) − tf (x2 ) < 0.

This proves that f is strictly concave up.
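
The identity used in this solution can be spot-checked numerically. The sketch below samples random points and compares the gap in the defining inequality with the closed form −t(1 − t)(x1 − x2)^2. The names `convexity_gap` and `square`, the sampling ranges, and the fixed random seed are assumptions made for illustration; a finite sample cannot prove strict convexity, it can only fail to find a counterexample.

```python
import random

def convexity_gap(f, x1: float, x2: float, t: float) -> float:
    """f((1-t)x1 + t x2) - (1-t) f(x1) - t f(x2).

    A negative value means the chord lies strictly above the graph at that point.
    """
    return f((1 - t) * x1 + t * x2) - (1 - t) * f(x1) - t * f(x2)

def square(x: float) -> float:
    return x * x

rng = random.Random(0)          # fixed seed so the run is reproducible
max_identity_error = 0.0
max_gap = float("-inf")

for _ in range(1000):
    x1 = rng.uniform(-10.0, 10.0)
    x2 = rng.uniform(-10.0, 10.0)
    t = rng.uniform(0.01, 0.99)
    if abs(x1 - x2) < 1e-3:     # skip nearly equal points, where rounding dominates
        continue
    gap = convexity_gap(square, x1, x2, t)
    closed_form = -t * (1 - t) * (x1 - x2) ** 2
    max_identity_error = max(max_identity_error, abs(gap - closed_form))
    max_gap = max(max_gap, gap)

print("largest deviation from the closed form:", max_identity_error)
print("largest gap observed (should be negative):", max_gap)
```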

In the definition of concavity, we do not assume any regularity about the
function. If a function is differentiable, we can characterize the concavity of the
function in terms of its tangent lines.
For a point x0 ∈ (a, b), the equation of the tangent line to the curve y = f (x)
at x = x0 is
y = f (x0 ) + f ′ (x0 )(x − x0 ).
We say that the graph of f is above the tangent line at x = x0 provided that

f (x) ≥ f (x0 ) + f ′ (x0 )(x − x0 ) for all x ∈ [a, b].

We say that the graph of f is strictly above the tangent line at x = x0 except at the
tangential point provided that

f (x) > f (x0 ) + f ′ (x0 )(x − x0 ) for all x ∈ [a, b] \ {x0 }.

Similarly, one can define what it means for the graph of f to be below a tangent
line, or strictly below.

Theorem 3.45
Let f : [a, b] → R be a function that is continuous on [a, b], and
differentiable on (a, b). The following three conditions are equivalent.

(a) f ′ is strictly increasing on (a, b).

(b) The graph of f is strictly above every tangent line except at the
tangential point.

(c) f is strictly concave up.

Proof
First we prove (a) =⇒ (b). Take any x0 ∈ (a, b). The equation of the
tangent line at x = x0 is

y = g(x) = f (x0 ) + f ′ (x0 )(x − x0 ).

If x ∈ [a, b],

f (x) − g(x) = f (x) − f (x0 ) − f ′ (x0 )(x − x0 ).

When x ̸= x0 , the mean value theorem implies that there exists u strictly
between x0 and x such that

f (x) − f (x0 ) = f ′ (u)(x − x0 ).

Therefore
f (x) − g(x) = (x − x0 )(f ′ (u) − f ′ (x0 )).
If a ≤ x < x0 , u < x0 and so f ′ (u) < f ′ (x0 ). This implies that f (x) −
g(x) > 0. If x > x0 , u > x0 and so f ′ (u) > f ′ (x0 ). Then we also have
f (x) − g(x) > 0. In other words, we have proved that for any x0 in (a, b),
for any x ∈ [a, b] \ {x0 },

f (x) > f (x0 ) + f ′ (x0 )(x − x0 ).



This proves that the graph of f is strictly above every tangent line except at
the tangential point.
Next, we prove (b) =⇒ (c). Given x1 and x2 in [a, b] with x1 < x2 , and
t ∈ (0, 1), let x0 = (1 − t)x1 + tx2 . Then x1 < x0 < x2 , and

x1 − x0 = −t(x2 − x1 ), x2 − x0 = (1 − t)(x2 − x1 ).

By assumption,

f (x1 ) > f (x0 ) + f ′ (x0 )(x1 − x0 ) = f (x0 ) − tf ′ (x0 )(x2 − x1 ),

f (x2 ) > f (x0 ) + f ′ (x0 )(x2 − x0 ) = f (x0 ) + (1 − t)f ′ (x0 )(x2 − x1 ).


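Multiplying the first inequality by 1 − t and the second by t, and then adding
them, the terms involving f ′ (x0 ) cancel. Therefore,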

(1 − t)f (x1 ) + tf (x2 ) > f (x0 ) = f ((1 − t)x1 + tx2 ).

This proves that f is strictly concave up.


Finally, we prove (c) =⇒ (a). First we will prove that f ′ is increasing
on (a, b). Given x1 and x2 in (a, b) with x1 < x2 , we want to show that
f ′ (x1 ) ≤ f ′ (x2 ). For any x ∈ (x1 , x2 ), there exists t ∈ (0, 1) such that
x = (1 − t)x1 + tx2 . Since f is strictly concave up, we have

f (x) = f ((1 − t)x1 + tx2 ) < (1 − t)f (x1 ) + tf (x2 ).

This implies that

(1 − t)(f (x) − f (x1 )) < t(f (x2 ) − f (x)).

Since x − x1 = t(x2 − x1 ) and x2 − x = (1 − t)(x2 − x1 ), we have

\[ \frac{f(x) - f(x_1)}{x - x_1} < \frac{f(x_2) - f(x)}{x_2 - x}. \tag{3.17} \]

Letting $x \to x_1^+$ in (3.17), we find that
\[ f'(x_1) \le \frac{f(x_2) - f(x_1)}{x_2 - x_1}. \]

Letting $x \to x_2^-$ in (3.17), we find that
\[ f'(x_2) \ge \frac{f(x_2) - f(x_1)}{x_2 - x_1}. \]
These prove that
\[ f'(x_2) \ge f'(x_1). \]
In other words, we have proved that f ′ is increasing on (a, b). If f ′ is not
strictly increasing on (a, b), there exist x1 and x2 in (a, b) with x1 < x2 but
f ′ (x1 ) = f ′ (x2 ). Since f ′ is increasing, we will have f ′ (x) = f ′ (x1 ) =
f ′ (x2 ) = m for all x ∈ (x1 , x2 ). This implies that

f (x) = mx + c for all x ∈ [x1 , x2 ].

But then for any t ∈ (0, 1),

f ((1 − t)x1 + tx2 ) = m [(1 − t)x1 + tx2 ] + c = (1 − t)f (x1 ) + tf (x2 ),

which contradicts the strict concavity of f . Therefore, f ′ must be strictly
increasing on (a, b).

By replacing f by −f in Theorem 3.45, we obtain the following immediately.

Theorem 3.46
Let f : [a, b] → R be a function that is continuous on [a, b], and
differentiable on (a, b). The following three conditions are equivalent.

(a) f ′ is strictly decreasing on (a, b).

(b) The graph of f is strictly below every tangent line except at the
tangential point.

(c) f is strictly concave down.

In Theorem 3.45, if we relax the strictness, the proofs are actually easier.

Theorem 3.47
Let f : [a, b] → R be a function that is continuous on [a, b], and
differentiable on (a, b). The following three conditions are equivalent.

(a) f ′ is increasing on (a, b).

(b) The graph of f is above every tangent line.

(c) f is concave up.

Theorem 3.48
Let f : [a, b] → R be a function that is continuous on [a, b], and
differentiable on (a, b). The following three conditions are equivalent.

(a) f ′ is decreasing on (a, b).

(b) The graph of f is below every tangent line.

(c) f is concave down.

If a function is twice differentiable, we can characterize concavity using second
derivatives.

Theorem 3.49
Let f : [a, b] → R be a function that is continuous on [a, b], and twice
differentiable on (a, b).

1. f (x) is concave up if and only if f ′′ (x) ≥ 0 for all x ∈ (a, b).

2. f (x) is concave down if and only if f ′′ (x) ≤ 0 for all x ∈ (a, b).

3. If f ′′ (x) > 0 for all x ∈ (a, b), then f is strictly concave up.

4. If f ′′ (x) < 0 for all x ∈ (a, b), then f is strictly concave down.

Proof
For 1 and 2, this is just the fact that f ′′ (x) ≥ 0 for all x ∈ (a, b) if and
only if f ′ is increasing; f ′′ (x) ≤ 0 for all x ∈ (a, b) if and only if f ′ is
decreasing.
For 3 and 4, we note that f ′′ (x) > 0 for all x ∈ (a, b) implies that f ′ is
strictly increasing on (a, b); while f ′′ (x) < 0 for all x ∈ (a, b) implies that
f ′ is strictly decreasing on (a, b).

We have seen that if a differentiable function g : (a, b) → R is strictly
increasing, it is not necessary that g ′ (x) > 0 for all x ∈ (a, b). This is why
for f : (a, b) → R to be strictly concave up, it is not necessary that f ′′ (x) > 0 for
all x ∈ (a, b). The function f : R → R, f (x) = x^4 gives an example of a function
that is strictly concave up, but it is not true that f ′′ (x) > 0 for all x ∈ R.

Example 3.28

Show that the function f : [0, π] → R, f (x) = sin x is strictly concave
down.

Solution
Since f ′′ (x) = − sin x < 0 for all x ∈ (0, π), we find that f : [0, π] → R,
f (x) = sin x is strictly concave down.

Example 3.29

Consider the power function $f(x) = x^r$. Since $f''(x) = r(r-1)x^{r-2}$, we
have the following.

1. When r < 0, the function f : (0, ∞) → R, $f(x) = x^r$ is strictly concave
up.

2. When 0 < r < 1, the function f : [0, ∞) → R, $f(x) = x^r$ is strictly
concave down.

3. When r > 1, the function f : [0, ∞) → R, $f(x) = x^r$ is strictly concave
up.

Next we look at a classical example where the concavity of a function can be
used to prove inequalities.

Example 3.30 Young’s Inequality


Let p and q be positive numbers such that
\[ \frac{1}{p} + \frac{1}{q} = 1. \]
If a and b are positive numbers, show that
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q}. \]
Equality holds if and only if $a^p = b^q$.

Solution
Notice that since p and q are positive, we have 1/p < 1 and 1/q < 1. This
implies that p > 1 and q > 1. Consider the function f : (0, ∞) → R,
f (x) = ln x. We find that
\[ f'(x) = \frac{1}{x}, \qquad f''(x) = -\frac{1}{x^2} < 0 \quad \text{for all } x > 0. \]
Hence, the function f : (0, ∞) → R, f (x) = ln x is strictly concave down.
This implies that for any two positive numbers x1 and x2 , and any t ∈ (0, 1),
\[ \ln((1-t)x_1 + tx_2) \ge (1-t)\ln x_1 + t \ln x_2. \tag{3.18} \]
The equality can hold if and only if x1 = x2 . Now, let t = 1/q. Then
t ∈ (0, 1) and 1 − t = 1/p. Given the positive numbers a and b, let
\[ x_1 = a^p, \qquad x_2 = b^q. \]
Then x1 and x2 are positive numbers. Eq. (3.18) implies that
\[ \ln\left( \frac{a^p}{p} + \frac{b^q}{q} \right) \ge \frac{1}{p} \ln a^p + \frac{1}{q} \ln b^q = \ln(ab), \]

with equality if and only if $a^p = b^q$. Therefore,
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q}. \]
Equality holds if and only if $a^p = b^q$.
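
Young's inequality can be spot-checked numerically. In the sketch below, `conjugate_exponent` and `young_gap` are hypothetical helper names, and the sampling ranges and the seed are arbitrary illustrative choices. The equality case uses a = 2, p = 3, q = 3/2 and b = 4, for which a^p = b^q = 8.

```python
import random

def conjugate_exponent(p: float) -> float:
    """Return q with 1/p + 1/q = 1, which requires p > 1."""
    if p <= 1:
        raise ValueError("p must be greater than 1")
    return p / (p - 1.0)

def young_gap(a: float, b: float, p: float) -> float:
    """a**p / p + b**q / q - a*b, which Young's inequality says is nonnegative."""
    q = conjugate_exponent(p)
    return a ** p / p + b ** q / q - a * b

rng = random.Random(1)  # fixed seed so the run is reproducible
smallest_gap = min(
    young_gap(rng.uniform(0.1, 5.0), rng.uniform(0.1, 5.0), rng.uniform(1.5, 4.0))
    for _ in range(1000)
)
equality_gap = young_gap(2.0, 4.0, 3.0)  # here a**p = b**q = 8

print("smallest gap over random samples:", smallest_gap)
print("gap in the equality case:", equality_gap)
```

The smallest sampled gap should be nonnegative up to rounding, and the gap in the equality case should be zero up to rounding.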

Exercises 3.7
Question 1

(a) Show that the function f : R → R, $f(x) = e^{-x}$ is strictly concave up.

(b) Show that the function f : (0, π) → R, $f(x) = \dfrac{1}{\sin x}$ is strictly concave
up.

Question 2
Let f : (a, b) → R be a twice differentiable function. If f : (a, b) → R
is concave down, and f (x) > 0 for all x ∈ (a, b), prove that the function
g : (a, b) → R,
\[ g(x) = \frac{1}{f(x)} \]
is concave up.

Question 3
Suppose that the function f : [a, b] → R is concave down. Show that for any
x1 , x2 , . . . , xn in [a, b], if t1 , t2 , . . . , tn are nonnegative numbers satisfying

t1 + t2 + · · · + tn = 1,

then

f (t1 x1 + t2 x2 + · · · + tn xn ) ≥ t1 f (x1 ) + t2 f (x2 ) + · · · + tn f (xn ).

Question 4: Arithmetic Mean-Geometric Mean Inequality


The arithmetic mean-geometric mean inequality states that if a1 , a2 , . . . , an
are n positive numbers, then
\[ \frac{a_1 + a_2 + \cdots + a_n}{n} \ge \sqrt[n]{a_1 a_2 \cdots a_n}. \]
Use the concavity of the function f : (0, ∞) → R, f (x) = ln x and the
result of the previous question to prove this inequality.
Chapter 4. Integrating Functions of a Single Variable 251

Chapter 4

Integrating Functions of a Single Variable

The concept of integrals arises naturally when one wants to compute the area
bounded by a curve, such as the area of a circle. Since ancient times, people
have had a good strategy for dealing with such problems. For example,
they used the areas of polygons to approximate the area of a circle. The circle is
partitioned into sectors, and the area of each sector is approximated by the area
of the inscribed triangle (see Figure 4.1). When the circle is partitioned into more
sectors, a better approximation is obtained.

Figure 4.1: Approximating the area of a circle by the area of a polygon.

The same idea can be used to find the area enclosed by any curve. This
motivated the definition of integrals. For curves defined by continuous functions,
it is not difficult to formulate a precise definition of the integral. However,
mathematicians soon discovered that we need to work with functions that are not
continuous as well. The process of making integrals rigorously defined was long and
tedious. We will follow the historical path and study the Riemann integrals in
this course. This lays down the foundation for the advanced theory of integration à
la Lebesgue. For practical applications and computations, Riemann integrals are
sufficient and easier to calculate.

4.1 Riemann Integrals of Bounded Functions

In this section, we define the Riemann integral for a function f : [a, b] → R that
is defined on a closed and bounded interval [a, b]. For this purpose, the function
is necessarily bounded. In Section 4.6, we will discuss how to deal with functions
that are not necessarily bounded, via some limiting processes.
For a closed and bounded interval [a, b], we will always assume that a < b.
We start with a few definitions.

Definition 4.1 Partitions


Let [a, b] be a closed and bounded interval. A partition of [a, b] is a finite
sequence of points x0 , x1 , x2 , . . . , xk , where

a = x0 < x1 < · · · < xk−1 < xk = b.

It is denoted by P = {x0 , x1 , . . . , xk }. For each 0 ≤ i ≤ k, xi is a partition
point. These points partition the interval [a, b] into k subintervals [x0 , x1 ],
[x1 , x2 ], . . ., [xk−1 , xk ]. The i-th subinterval is [xi−1 , xi ].

We have slightly abused notation and used set notation for a partition.

Example 4.1

P = {0, 2, 3, 5, 9, 10} is a partition of the interval [0, 10] into 5 subintervals

[0, 2], [2, 3], [3, 5], [5, 9] and [9, 10].

We use the lengths of the subintervals to measure how fine a partition is.

Definition 4.2 Gap of a Partition

Let P = {x0 , x1 , . . . , xk } be a partition of the interval [a, b]. The gap of the
partition P , denoted by |P | or gap P , is the length of the largest subinterval
in the partition. Namely,

|P | = gap P = max {xi − xi−1 | 1 ≤ i ≤ k} .



Example 4.2

For the partition P = {0, 2, 3, 5, 9, 10} of [0, 10],

|P | = max{2, 1, 2, 4, 1} = 4.

A partition where all subintervals have equal lengths is very useful.

Definition 4.3 Regular Partitions

Let [a, b] be a closed and bounded interval. A regular partition of [a, b] into
k intervals is the partition P = {x0 , x1 , . . . , xk }, where
\[ |P| = x_1 - x_0 = x_2 - x_1 = \cdots = x_k - x_{k-1} = \frac{b-a}{k}. \]
This implies that $x_i = x_0 + i\,\dfrac{b-a}{k}$, 1 ≤ i ≤ k.

Example 4.3

The regular partition of the interval [0, 10] into 5 intervals is the partition

P = {0, 2, 4, 6, 8, 10}.

The gap of this partition is $|P| = \dfrac{10 - 0}{5} = 2$.
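
The definitions of a partition, its gap, and a regular partition translate directly into code. The following Python sketch represents a partition as a list of points, following the set-like notation used above; the function names `is_partition`, `gap` and `regular_partition` are illustrative and not part of the text.

```python
def is_partition(points, a, b) -> bool:
    """Check that a = x0 < x1 < ... < xk = b."""
    return (
        len(points) >= 2
        and points[0] == a
        and points[-1] == b
        and all(left < right for left, right in zip(points, points[1:]))
    )

def gap(points):
    """Length of the largest subinterval [x_{i-1}, x_i]."""
    return max(right - left for left, right in zip(points, points[1:]))

def regular_partition(a, b, k):
    """The k + 1 equally spaced points x_i = a + i (b - a) / k."""
    return [a + i * (b - a) / k for i in range(k + 1)]

P = [0, 2, 3, 5, 9, 10]          # the partition of Example 4.1
Q = regular_partition(0, 10, 5)  # the regular partition of Example 4.3

print(is_partition(P, 0, 10), gap(P))
print(Q, gap(Q))
```

For these inputs the gaps come out as 4 and 2, matching Examples 4.2 and 4.3.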

Next, we define the Riemann sums and Darboux sums.

Definition 4.4 Riemann Sums


Let f : [a, b] → R be a function, and let P = {x0 , x1 , . . . , xk } be a partition
of [a, b]. For each 1 ≤ i ≤ k, choose an intermediate point ξi in the i-th
subinterval [xi−1 , xi ]. Denote this sequence of points $\{\xi_i\}_{i=1}^k$ by A. Then
the Riemann sum of f with respect to the partition P and the intermediate
points $A = \{\xi_i\}_{i=1}^k$ is the sum
\[ R(f, P, A) = \sum_{i=1}^{k} f(\xi_i)(x_i - x_{i-1}). \]

Example 4.4

Consider the function f : [0, 6] → R, $f(x) = 6x - x^2$, and the partition
P = {0, 2, 3, 5, 6} of [0, 6]. Let

A = {1, 3, 4, 5} .

Then
R(f, P, A) = 5 × 2 + 9 × 1 + 8 × 2 + 5 × 1 = 40.

As shown in Figure 4.2, the Riemann sum R(f, P, A) is the sum of the areas
of rectangles that are used to approximate the region bounded by the curve
$y = 6x - x^2$ and the x-axis.
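
The Riemann sum of Example 4.4 can be reproduced with a few lines of Python. This is a minimal sketch; the function name `riemann_sum` and the list representation of P and A are assumptions made for illustration.

```python
def riemann_sum(f, partition, tags):
    """Sum of f(xi_i) * (x_i - x_{i-1}) over the subintervals of the partition.

    `tags` holds one intermediate point per subinterval, in order.
    """
    if len(tags) != len(partition) - 1:
        raise ValueError("need exactly one intermediate point per subinterval")
    total = 0
    for left, right, xi in zip(partition, partition[1:], tags):
        if not left <= xi <= right:
            raise ValueError(f"{xi} does not lie in [{left}, {right}]")
        total += f(xi) * (right - left)
    return total

def f(x):
    return 6 * x - x ** 2

P = [0, 2, 3, 5, 6]
A = [1, 3, 4, 5]
print(riemann_sum(f, P, A))  # 5*2 + 9*1 + 8*2 + 5*1 = 40
```

A different choice of intermediate points for the same partition generally gives a different value, which is why the Darboux sums are introduced to bound all Riemann sums at once.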

Figure 4.2: Riemann sum is an approximation of area under a curve.

In general, if f : [a, b] → R is a nonnegative function, then a Riemann sum
R(f, P, A) is an approximation to the area bounded by the curve y = f (x), the
x-axis, and the lines x = a and x = b. In its definition, we do not need to
assume that f is a bounded function. Since a Riemann sum involves an arbitrary
choice of points in each subinterval, to give a bound to Riemann sums, we need
the concept of Darboux sums, whose definition requires f : [a, b] → R to be a
bounded function.

Definition 4.5 Darboux Sums


Let f : [a, b] → R be a bounded function, and let P = {x0 , x1 , . . . , xk } be
a partition of [a, b]. For each 1 ≤ i ≤ k, let
\[ m_i = \inf_{x_{i-1} \le x \le x_i} f(x), \qquad M_i = \sup_{x_{i-1} \le x \le x_i} f(x). \]
The Darboux lower sum L(f, P ) and the Darboux upper sum U (f, P ) are
defined by
\[ L(f, P) = \sum_{i=1}^{k} m_i (x_i - x_{i-1}), \qquad U(f, P) = \sum_{i=1}^{k} M_i (x_i - x_{i-1}). \]

Remark 4.1
For convenience, we denote

inf{f (x) | xi−1 ≤ x ≤ xi } and sup{f (x) | xi−1 ≤ x ≤ xi }

by $\inf_{x_{i-1} \le x \le x_i} f(x)$ and $\sup_{x_{i-1} \le x \le x_i} f(x)$ respectively. The assumption that f is
bounded is needed to ensure that mi and Mi exist for all 1 ≤ i ≤ k. We use
infimum and supremum because the function f might
not have a minimum or a maximum on an interval.

Figure 4.3: Darboux lower sum and Darboux upper sum.



Example 4.5

For the function f : [0, 6] → R, $f(x) = 6x - x^2$ and the partition P =
{0, 2, 3, 5, 6} of [0, 6] considered in Example 4.4, we find that

i    interval    xi − xi−1    mi    Mi
1    [0, 2]      2            0     8
2    [2, 3]      1            8     9
3    [3, 5]      2            5     9
4    [5, 6]      1            0     5

Hence, the Darboux lower sum L(f, P ) and the Darboux upper sum
U (f, P ) are

L(f, P ) = 0 × 2 + 8 × 1 + 5 × 2 + 0 × 1 = 18,
U (f, P ) = 8 × 2 + 9 × 1 + 9 × 2 + 5 × 1 = 48.
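
The Darboux sums of this example can be recomputed in code. Computing an infimum or supremum over an interval is not possible for an arbitrary function, so the sketch below makes a loud assumption: f is differentiable and all of its critical points are supplied by the caller, so that the extreme values on each closed subinterval are attained at an endpoint or at a critical point. The function name `darboux_sums` and its `critical_points` parameter are hypothetical.

```python
def darboux_sums(f, partition, critical_points=()):
    """Return (L(f, P), U(f, P)).

    Assumption: f is differentiable and `critical_points` lists every point
    where f' vanishes, so min and max on each subinterval are found among
    the endpoints and the critical points inside it.
    """
    lower = 0
    upper = 0
    for left, right in zip(partition, partition[1:]):
        candidates = [left, right] + [c for c in critical_points if left < c < right]
        values = [f(x) for x in candidates]
        lower += min(values) * (right - left)
        upper += max(values) * (right - left)
    return lower, upper

def f(x):
    return 6 * x - x ** 2   # f'(x) = 6 - 2x vanishes only at x = 3

P = [0, 2, 3, 5, 6]
lower, upper = darboux_sums(f, P, critical_points=[3])
print(lower, upper)  # 18 48
```

The Riemann sum 40 found in Example 4.4 lies between these two values, as expected.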

Example 4.6

If f : [a, b] → R is the constant function f (x) = c, it is obvious that for any
partition $P = \{x_i\}_{i=0}^k$ of [a, b], and for any choices of intermediate points
$A = \{\xi_i\}_{i=1}^k$,

L(f, P ) = U (f, P ) = R(f, P, A) = c(b − a).

The following can be easily deduced from the definitions.

Proposition 4.1

Let f : [a, b] → R be a bounded function such that

m ≤ f (x) ≤ M for all a ≤ x ≤ b.

For any partition $P = \{x_i\}_{i=0}^k$ of the interval [a, b], and any choice of
intermediate points $A = \{\xi_i\}_{i=1}^k$ for the partition P , we have

m(b − a) ≤ L(f, P ) ≤ R(f, P, A) ≤ U (f, P ) ≤ M (b − a).



Proof
For any 1 ≤ i ≤ k, let
\[ m_i = \inf_{x_{i-1} \le x \le x_i} f(x), \qquad M_i = \sup_{x_{i-1} \le x \le x_i} f(x). \]

Then
m ≤ mi ≤ f (ξi ) ≤ Mi ≤ M.
Therefore,

m(xi − xi−1 ) ≤ mi (xi − xi−1 ) ≤ f (ξi )(xi − xi−1 )


≤ Mi (xi − xi−1 ) ≤ M (xi − xi−1 ).

Summing over i from i = 1 to i = k, we obtain

m(b − a) ≤ L(f, P ) ≤ R(f, P, A) ≤ U (f, P ) ≤ M (b − a).

For a bounded nonnegative function f : [a, b] → R, if the region bounded by
the x-axis, the curve y = f (x), the lines x = a and x = b has an area, then a
Darboux lower sum is always less than or equal to the area, and a Darboux upper
sum is always larger than or equal to the area. This leads to the fact that a Darboux
lower sum is always less than or equal to a Darboux upper sum. To prove this for
any bounded function, we introduce the concept of refinement.

Definition 4.6 Refinement of a Partition


Let P and P ∗ be partitions of the interval [a, b]. We say that P ∗ is a refinement
of P if every partition point of P is also a partition point of P ∗ . In other
words, the set of points in P is a subset of the set of points in P ∗ .

Example 4.7

For the partition P = {0, 2, 3, 5, 9, 10} of [0, 10],

P ∗ = {0, 1, 2, 3, 5, 6, 8, 9, 10}

is a refinement.

Figure 4.4: A partition P of [0, 10] and its refinement P ∗ .

If P ∗ is a refinement of $P = \{x_i\}_{i=0}^k$, then for each 1 ≤ i ≤ k, P ∗ induces a
partition Pi of the interval [xi−1 , xi ].

Example 4.8
For the partitions P and P ∗ in Example 4.7, P ∗ induces the partitions P1 =
{0, 1, 2}, P2 = {2, 3}, P3 = {3, 5}, P4 = {5, 6, 8, 9} and P5 = {9, 10} of
the intervals [0, 2], [2, 3], [3, 5], [5, 9] and [9, 10] respectively.

Since the union of all the subintervals in the partition Pi , 1 ≤ i ≤ k is the
collection of all the subintervals in the partition P ∗ , the following is quite obvious.

Proposition 4.2

Let f : [a, b] → R be a bounded function, and let $P = \{x_i\}_{i=0}^k$ be a partition
of [a, b]. Given a refinement P ∗ of P , let Pi , 1 ≤ i ≤ k, be the partition that
P ∗ induces on the interval [xi−1 , xi ]. Then
\[ \sum_{i=1}^{k} L(f, P_i) = L(f, P^*), \qquad \sum_{i=1}^{k} U(f, P_i) = U(f, P^*). \]

From this, it is quite easy to obtain the following.

Theorem 4.3
Let f : [a, b] → R be a bounded function, and let P and P ∗ be partitions of
[a, b]. If P ∗ is a refinement of P , then

L(f, P ) ≤ L(f, P ∗ ) ≤ U (f, P ∗ ) ≤ U (f, P ).



Proof
Let $P = \{x_i\}_{i=0}^k$. For each 1 ≤ i ≤ k, let
\[ m_i = \inf_{x_{i-1} \le x \le x_i} f(x), \qquad M_i = \sup_{x_{i-1} \le x \le x_i} f(x). \]

Since
mi ≤ f (x) ≤ Mi for all x ∈ [xi−1 , xi ],
we find that

mi (xi − xi−1 ) ≤ L(f, Pi ) ≤ U (f, Pi ) ≤ Mi (xi − xi−1 ).

Summing over i from 1 to k gives
\[ \sum_{i=1}^{k} m_i (x_i - x_{i-1}) \le \sum_{i=1}^{k} L(f, P_i) \le \sum_{i=1}^{k} U(f, P_i) \le \sum_{i=1}^{k} M_i (x_i - x_{i-1}). \]

By Proposition 4.2, this gives

L(f, P ) ≤ L(f, P ∗ ) ≤ U (f, P ∗ ) ≤ U (f, P ).
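
The chain of inequalities in Theorem 4.3 can be observed on a concrete case. The sketch below reuses the function f (x) = 6x − x^2 and the partition P = {0, 2, 3, 5, 6} of Example 4.5; the refinement P ∗ = {0, 1, 2, 3, 4, 5, 6} is a choice made here for illustration. As before, the hypothetical helper `darboux_sums` assumes that every critical point of f is supplied, so that extreme values on each subinterval occur at endpoints or critical points.

```python
def darboux_sums(f, partition, critical_points=()):
    """Return (L(f, P), U(f, P)), assuming all critical points of f are given."""
    lower = 0
    upper = 0
    for left, right in zip(partition, partition[1:]):
        candidates = [left, right] + [c for c in critical_points if left < c < right]
        values = [f(x) for x in candidates]
        lower += min(values) * (right - left)
        upper += max(values) * (right - left)
    return lower, upper

def is_refinement(finer, coarser) -> bool:
    """Every partition point of the coarser partition is a point of the finer one."""
    return set(coarser) <= set(finer)

def f(x):
    return 6 * x - x ** 2   # the only critical point is x = 3

P = [0, 2, 3, 5, 6]
P_star = [0, 1, 2, 3, 4, 5, 6]

L_P, U_P = darboux_sums(f, P, critical_points=[3])
L_star, U_star = darboux_sums(f, P_star, critical_points=[3])

print(is_refinement(P_star, P))
print(L_P, L_star, U_star, U_P)  # 18 26 44 48
```

Refining the partition raises the lower sum from 18 to 26 and lowers the upper sum from 48 to 44.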

Figure 4.5: When a partition is refined, Darboux lower sum gets larger.

If P1 and P2 are partitions of [a, b], a common refinement of P1 and P2 is a
partition P ∗ which contains all the partition points of P1 and P2 . Such a common
refinement always exists. The smallest one is the one whose set of points is the
union of the set of points in P1 and the set of points in P2 .

Figure 4.6: When a partition is refined, Darboux upper sum gets smaller.

Corollary 4.4

Let f : [a, b] → R be a bounded function, and let P1 and P2 be any two
partitions of [a, b]. Then

L(f, P1 ) ≤ U (f, P2 ).

Proof

Take a common refinement P ∗ of the partitions P1 and P2 . By Theorem 4.3,

L(f, P1 ) ≤ L(f, P ∗ ) ≤ U (f, P ∗ ) ≤ U (f, P2 ).

Given a bounded function f : [a, b] → R, we consider the set of Darboux
lower sums and the set of Darboux upper sums of f .
\[ S_L(f) = \{ L(f, P) \mid P \text{ is a partition of } [a, b] \}, \]
\[ S_U(f) = \{ U(f, P) \mid P \text{ is a partition of } [a, b] \}. \]

If m and M are lower and upper bounds of f , then

m(b − a) ≤ L(f, P ) ≤ U (f, P ) ≤ M (b − a).

This implies that the sets SL (f ) and SU (f ) are bounded. When we use the
Darboux lower sums and upper sums to approximate areas, we are interested in
the least upper bound of the lower sums and the greatest lower bound of the upper
sums.

Definition 4.7 Lower Integrals and Upper Integrals

Let f : [a, b] → R be a bounded function.

1. The lower integral of f , denoted by $\underline{\int_a^b} f$, is defined as the least upper
bound of the Darboux lower sums.
\[ \underline{\int_a^b} f = \sup S_L(f) = \sup \{ L(f, P) \mid P \text{ is a partition of } [a, b] \}. \]

2. The upper integral of f , denoted by $\overline{\int_a^b} f$, is defined as the greatest lower
bound of the Darboux upper sums.
\[ \overline{\int_a^b} f = \inf S_U(f) = \inf \{ U(f, P) \mid P \text{ is a partition of } [a, b] \}. \]

Example 4.9

For the constant function f : [a, b] → R, f (x) = c,

L(f, P ) = U (f, P ) = c(b − a)

for any partition P of [a, b]. Thus, SL (f ) = SU (f ) = {c(b − a)}, and
\[ \underline{\int_a^b} f = \overline{\int_a^b} f = c(b - a). \]

By Corollary 4.4, we have the following.

Proposition 4.5

Let f : [a, b] → R be a bounded function. We have
\[ \underline{\int_a^b} f \le \overline{\int_a^b} f. \]

Proof
By definitions of infimum and supremum, for any positive integer n, there
are partitions P1 and P2 such that
\[ L(f, P_1) > \underline{\int_a^b} f - \frac{1}{n}, \qquad U(f, P_2) < \overline{\int_a^b} f + \frac{1}{n}. \]
These, together with Corollary 4.4, give the following.
\[ \overline{\int_a^b} f - \underline{\int_a^b} f > U(f, P_2) - L(f, P_1) - \frac{2}{n} \ge -\frac{2}{n}. \]
Taking the limit n → ∞, we deduce that
\[ \overline{\int_a^b} f - \underline{\int_a^b} f \ge 0. \]

Let f : [a, b] → R be a bounded function, and let P be a partition of [a, b]. Then

1. $L(f, P) \le \underline{\int_a^b} f \le \overline{\int_a^b} f \le U(f, P)$.

2. $0 \le \overline{\int_a^b} f - \underline{\int_a^b} f \le U(f, P) - L(f, P)$.

Notice that if w is a number such that
\[ \underline{\int_a^b} f \le w \le \overline{\int_a^b} f, \]
then for any partition P of [a, b],

L(f, P ) ≤ w ≤ U (f, P ).

As we mentioned above, for a nonnegative bounded function f : [a, b] → R,
a Darboux lower sum L(f, P ) is less than or equal to the area below the curve
y = f (x), while a Darboux upper sum is larger than or equal to the area, if such
an area is well-defined. Intuitively, the area would be well-defined if there is a
single number A that is larger than or equal to all the Darboux lower sums, and

less than or equal to all the Darboux upper sums. This is the case if and only if
the lower and the upper integrals are the same.

Definition 4.8 Riemann Integrability

Let f : [a, b] → R be a bounded function. We say that f is Riemann
integrable, or simply integrable, if
\[ \underline{\int_a^b} f = \overline{\int_a^b} f. \]
In this case, we define the integral of f over [a, b] as
\[ \int_a^b f = \underline{\int_a^b} f = \overline{\int_a^b} f. \]
It is the unique number that is larger than or equal to all the Darboux lower
sums, and less than or equal to all the Darboux upper sums.

Remark 4.2
If f : [a, b] → R is a continuous nonnegative function, we are going to
prove that f is Riemann integrable. It follows from our discussions above
that the integral $\int_a^b f$ is the area bounded by the curve y = f (x), the
x-axis, and the lines x = a and x = b.

Leibniz Notation
In Leibniz notation, the integral of f : [a, b] → R over [a, b] is denoted by
\[ \int_a^b f(x)\,dx. \]

Example 4.10

A constant function f : [a, b] → R, f (x) = c is integrable and
\[ \int_a^b f = c(b - a). \]

Let us look at an example of a function that is not integrable.

Example 4.11 Non-Integrability of the Dirichlet’s Function

The Dirichlet’s function is the function f : [0, 1] → R defined by
\[ f(x) = \begin{cases} 1, & \text{if } x \text{ is rational}, \\ 0, & \text{if } x \text{ is irrational}. \end{cases} \]
Show that f is not Riemann integrable.

Solution
Let P = {xi }ki=0 be any partition of the interval [0, 1]. For any 1 ≤ i ≤
k, by denseness of the set of rational numbers and the set of irrational
numbers, there exist a rational number and an irrational number in the
interval [xi−1 , xi ]. This shows that

mi = 0, Mi = 1, for all 1 ≤ i ≤ k.

Hence,
\[ L(f, P) = \sum_{i=1}^{k} m_i (x_i - x_{i-1}) = 0, \qquad U(f, P) = \sum_{i=1}^{k} M_i (x_i - x_{i-1}) = 1. \]
This shows that

SL (f ) = {0}, SU (f ) = {1}.

Therefore, the lower integral and the upper integral of f are
\[ \underline{\int_0^1} f = 0, \qquad \overline{\int_0^1} f = 1 \]
respectively. Since they are not equal, f is not Riemann integrable.

An interesting question now is what functions are Riemann integrable. Let us
first give alternative criteria for Riemann integrability.

Lemma 4.6
Let f : [a, b] → R be a bounded function. Then the following are
equivalent.

(a) f : [a, b] → R is Riemann integrable.

(b) For any ε > 0, there exists a partition P of [a, b] such that

U (f, P ) − L(f, P ) < ε.

Proof
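First, let us prove (a) implies (b). If f is Riemann integrable,
\[ \underline{\int_a^b} f = \overline{\int_a^b} f = \int_a^b f. \]
Given ε > 0, by the definitions of the lower integral as a supremum and the
upper integral as an infimum, there exist partitions $P_1$ and $P_2$ of [a, b]
such that
\[ L(f, P_1) > \underline{\int_a^b} f - \frac{\varepsilon}{2} \quad \text{and} \quad U(f, P_2) < \overline{\int_a^b} f + \frac{\varepsilon}{2}. \]
This gives
\[ U(f, P_2) - L(f, P_1) < \varepsilon. \]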
Let P be a common refinement of P1 and P2 . Then

L(f, P1 ) ≤ L(f, P ) ≤ U (f, P ) ≤ U (f, P2 ).

This implies that

U (f, P ) − L(f, P ) ≤ U (f, P2 ) − L(f, P1 ) < ε.

Conversely, assume that (b) holds. Then for every positive integer n, there
is a partition $P_n$ of [a, b] such that
\[ 0 \le U(f, P_n) - L(f, P_n) < \frac{1}{n}. \]
Since $L(f, P_n) \le \underline{\int_a^b} f \le \overline{\int_a^b} f \le U(f, P_n)$, it follows that
\[ 0 \le \overline{\int_a^b} f - \underline{\int_a^b} f < \frac{1}{n}. \]
Taking the n → ∞ limit, the squeeze theorem implies that
\[ \underline{\int_a^b} f = \overline{\int_a^b} f. \]

This shows that f is Riemann integrable.

Theorem 4.7 The Archimedes-Riemann Theorem


Let f : [a, b] → R be a bounded function. Then f : [a, b] → R is Riemann
integrable if and only if there is a sequence $\{P_n\}$ of partitions of [a, b] such
that
\[ \lim_{n\to\infty} \bigl(U(f, P_n) - L(f, P_n)\bigr) = 0. \tag{4.1} \]

In this case, the Riemann integral of f over [a, b] can be computed by
\[ \int_a^b f = \lim_{n\to\infty} L(f, P_n) = \lim_{n\to\infty} U(f, P_n). \tag{4.2} \]

This theorem says that the Riemann integrability of a function can be checked
by the existence of a sequence of partitions satisfying (4.1). Such a sequence of
partitions can also be used to compute the Riemann integral. Thus, we give such
a sequence a special name.

Definition 4.9 Archimedes Sequence of Partitions

Let f : [a, b] → R be a bounded function. A sequence $\{P_n\}$ of partitions
of [a, b] is called an Archimedes sequence of partitions for f provided that
\[ \lim_{n\to\infty} \bigl(U(f, P_n) - L(f, P_n)\bigr) = 0. \]

Hence, Theorem 4.7 says that f : [a, b] → R is Riemann integrable if and only
if it has an Archimedes sequence of partitions.

Proof of Theorem 4.7


If f : [a, b] → R is Riemann integrable, by Lemma 4.6, for every positive
integer n, there is a partition $P_n$ of [a, b] such that
\[ 0 \le U(f, P_n) - L(f, P_n) < \frac{1}{n}. \]
By the squeeze theorem,
\[ \lim_{n\to\infty} \bigl(U(f, P_n) - L(f, P_n)\bigr) = 0. \]

Conversely, if there is a sequence $\{P_n\}$ of partitions of [a, b] such that
\[ \lim_{n\to\infty} \bigl(U(f, P_n) - L(f, P_n)\bigr) = 0, \]
the definition of limit of sequences implies that for every ε > 0, there is a
positive integer N such that for all n ≥ N,
\[ |U(f, P_n) - L(f, P_n)| < \varepsilon. \]

In particular, we find that $P_N$ is a partition of [a, b] satisfying
\[ U(f, P_N) - L(f, P_N) < \varepsilon. \]

By Lemma 4.6 again, we find that f is Riemann integrable. This means
\[ \underline{\int_a^b} f = \overline{\int_a^b} f = \int_a^b f. \]
Since $L(f, P_n) \le \int_a^b f \le U(f, P_n)$ for every n, we have
\[ 0 \le U(f, P_n) - \int_a^b f \le U(f, P_n) - L(f, P_n). \]

By taking the n → ∞ limit, we find that
\[ \lim_{n\to\infty} U(f, P_n) = \int_a^b f. \]

Since $\lim_{n\to\infty} \bigl(U(f, P_n) - L(f, P_n)\bigr) = 0$, we find that
\[ \lim_{n\to\infty} L(f, P_n) = \lim_{n\to\infty} U(f, P_n) = \int_a^b f. \]

Let us look at an example of how to apply the Archimedes-Riemann theorem to
compute integrals.

Example 4.12
Let f : [1, 4] → R be the function $f(x) = x^2$. Show that f is Riemann
integrable and find the integral $\int_1^4 f(x)\,dx$.

Here we need the formulas
\[ \sum_{i=1}^{n} i = \frac{n(n+1)}{2}, \qquad \sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}. \]

Solution
Let n be a positive integer, and let $P_n = \{x_0, x_1, \ldots, x_n\}$ be the regular
partition of [1, 4] into n intervals. Then
\[ x_i = 1 + \frac{3i}{n}. \]
Notice that the function f : [1, 4] → R, $f(x) = x^2$ is an increasing function.
Therefore, on the interval $[x_{i-1}, x_i]$,
\[ m_i = f(x_{i-1}) = \left(1 + \frac{3i - 3}{n}\right)^2, \qquad M_i = f(x_i) = \left(1 + \frac{3i}{n}\right)^2. \]

From this, we find that
\begin{align*}
L(f, P_n) &= \sum_{i=1}^{n} m_i (x_i - x_{i-1}) = \frac{3}{n} \sum_{i=1}^{n} \left(1 + \frac{6(i-1)}{n} + \frac{9(i-1)^2}{n^2}\right) \\
&= \frac{3}{n} \left(n + 3(n-1) + \frac{3(n-1)(2n-1)}{2n}\right) \\
&= \frac{3}{2n^2} (14n^2 - 15n + 3),
\end{align*}
\begin{align*}
U(f, P_n) &= \sum_{i=1}^{n} M_i (x_i - x_{i-1}) = \frac{3}{n} \sum_{i=1}^{n} \left(1 + \frac{6i}{n} + \frac{9i^2}{n^2}\right) \\
&= \frac{3}{n} \left(n + 3(n+1) + \frac{3(n+1)(2n+1)}{2n}\right) \\
&= \frac{3}{2n^2} (14n^2 + 15n + 3).
\end{align*}
Moreover,
\[ U(f, P_n) - L(f, P_n) = \frac{45}{n}. \]
It follows that
\[ \lim_{n\to\infty} \bigl(U(f, P_n) - L(f, P_n)\bigr) = 0. \]

This proves that $\{P_n\}$ is an Archimedes sequence of partitions for f.
Therefore, f is Riemann integrable, and
\[ \int_1^4 f(x)\,dx = \lim_{n\to\infty} U(f, P_n) = \lim_{n\to\infty} \frac{3}{2} \left(14 + \frac{15}{n} + \frac{3}{n^2}\right) = 21. \]
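
The closed forms in Example 4.12 can be checked numerically. The following is a minimal sketch in Python using exact rational arithmetic; the helper name `darboux_sums_increasing` is an assumption made for illustration, and it relies on f being increasing, so that the infimum and supremum on each subinterval are attained at the left and right end points.

```python
from fractions import Fraction

def darboux_sums_increasing(f, a, b, n):
    """Lower and upper Darboux sums of an increasing f on the
    regular partition of [a, b] into n intervals (hypothetical helper)."""
    a, b = Fraction(a), Fraction(b)
    width = (b - a) / n
    points = [a + i * width for i in range(n + 1)]
    # For an increasing function, m_i = f(x_{i-1}) and M_i = f(x_i).
    lower = sum(f(points[i - 1]) * width for i in range(1, n + 1))
    upper = sum(f(points[i]) * width for i in range(1, n + 1))
    return lower, upper

def square(x):
    return x * x

for n in (1, 10, 100, 1000):
    lower, upper = darboux_sums_increasing(square, 1, 4, n)
    # Compare with the closed forms derived in the solution.
    assert lower == Fraction(3 * (14 * n * n - 15 * n + 3), 2 * n * n)
    assert upper == Fraction(3 * (14 * n * n + 15 * n + 3), 2 * n * n)
    assert upper - lower == Fraction(45, n)
    print(n, float(lower), float(upper))
```

Both printed columns approach 21 as n grows, with the lower sums increasing toward it and the upper sums decreasing toward it.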

The following gives an ε-δ characterization of Riemann integrability in terms
of Darboux sums.

Theorem 4.8 Equivalent Definitions of Riemann Integrability

Let f : [a, b] → R be a bounded function. Then the following two
statements are equivalent.

(i) f : [a, b] → R is Riemann integrable, in the sense that $\underline{\int_a^b} f = \overline{\int_a^b} f$.

(ii) For any ε > 0, there exists a δ > 0 so that if $P = \{x_i\}_{i=0}^{k}$ is a partition
of [a, b] with |P| < δ, then
\[ U(f, P) - L(f, P) < \varepsilon. \]

By Lemma 4.6, f : [a, b] → R is Riemann integrable if and only if for
every ε > 0, there is a partition P satisfying U (f, P ) − L(f, P ) < ε. The
highly nontrivial part of this theorem is that the existence of a single partition
satisfying U (f, P ) − L(f, P ) < ε is equivalent to the existence of a positive
number δ such that all partitions P with gaps less than δ satisfy
U (f, P ) − L(f, P ) < ε.

Proof
(ii) implies (i) follows trivially from Lemma 4.6.
Now assume that (i) holds. Since f : [a, b] → R is bounded, there exists a
positive number M such that

|f (x)| ≤ M for all x ∈ [a, b].

Given ε > 0, Lemma 4.6 implies that there is a partition
\[ P_0 = \{\tilde{x}_0, \tilde{x}_1, \ldots, \tilde{x}_s\} \]
of [a, b] such that
\[ U(f, P_0) - L(f, P_0) < \frac{\varepsilon}{2}. \]
Take
\[ \delta = \frac{\varepsilon}{8sM}. \]
Then δ > 0. If $P = \{x_0, x_1, \ldots, x_k\}$ is a partition of [a, b] with |P| < δ,
we want to show that
\[ U(f, P) - L(f, P) = \sum_{i=1}^{k} (M_i - m_i)(x_i - x_{i-1}) < \varepsilon. \]

Here
\[ m_i = \inf_{x_{i-1} \le x \le x_i} f(x), \qquad M_i = \sup_{x_{i-1} \le x \le x_i} f(x). \]

Let
\[ E_1 = \{\, 1 \le i \le k \mid \exists j,\ \tilde{x}_j \in [x_{i-1}, x_i] \,\} \]
be the set that contains those indices i where the interval $[x_{i-1}, x_i]$ contains
a partition point of $P_0$, and let $E_2 = \{1, 2, \ldots, k\} \setminus E_1$ be the set of those
indices that are not in $E_1$.

By definition, the point $\tilde{x}_0$ can only be in $[x_0, x_1]$, and the point $\tilde{x}_s$ can only
be in $[x_{k-1}, x_k]$. For any 1 ≤ j ≤ s − 1, $\tilde{x}_j$ can be in at most two different
subintervals of P. Hence, $E_1$ contains at most 2s elements.
Splitting the sum over i into a sum over $E_1$ and a sum over $E_2$, we have
\[ U(f, P) - L(f, P) = \sum_{i \in E_1} (M_i - m_i)(x_i - x_{i-1}) + \sum_{i \in E_2} (M_i - m_i)(x_i - x_{i-1}). \]

First we estimate the sum over $E_1$. Since
\[ -M \le f(x) \le M \quad \text{for all } x \in [a, b], \]
we find that for any 1 ≤ i ≤ k,
\[ 0 \le M_i - m_i \le 2M. \]

Since $x_i - x_{i-1} \le |P| < \delta$, we have
\[ \sum_{i \in E_1} (M_i - m_i)(x_i - x_{i-1}) \le 2M \sum_{i \in E_1} (x_i - x_{i-1}) \le 2M\delta |E_1| \le 4sM\delta \le \frac{\varepsilon}{2}. \]

Let $P^*$ be the common refinement of P and $P_0$ obtained by taking the union
of their partition points. By our definitions of $E_1$ and $E_2$, for each i in $E_2$,
$[x_{i-1}, x_i]$ is also a partition interval in $P^*$. Therefore,
\[ \sum_{i \in E_2} (M_i - m_i)(x_i - x_{i-1}) \le U(f, P^*) - L(f, P^*) \le U(f, P_0) - L(f, P_0) < \frac{\varepsilon}{2}. \]
The two estimates above imply that

U (f, P ) − L(f, P ) < ε,

which completes the proof that (i) implies (ii).

A disadvantage of working with Darboux sums is that we need to figure out the
infimum and supremum of a function over the partition intervals. Let us turn to
Riemann sums.

Lemma 4.9
Let f : [a, b] → R be a bounded function, and let $P = \{x_i\}_{i=0}^{k}$ be a partition
of [a, b]. For every ε > 0, there exist choices of intermediate points A and
B for the partition P such that
\[ 0 \le R(f, P, A) - L(f, P) < \varepsilon, \qquad 0 \le U(f, P) - R(f, P, B) < \varepsilon. \]

Proof
For 1 ≤ i ≤ k, let $m_i = \inf_{x_{i-1} \le x \le x_i} f(x)$ and $M_i = \sup_{x_{i-1} \le x \le x_i} f(x)$. By the
definitions of infimum and supremum, for each 1 ≤ i ≤ k, there are points
$\xi_i$ and $\eta_i$ in $[x_{i-1}, x_i]$ such that
\[ m_i \le f(\xi_i) < m_i + \frac{\varepsilon}{b - a}, \]
\[ M_i - \frac{\varepsilon}{b - a} < f(\eta_i) \le M_i. \]

Multiplying by $(x_i - x_{i-1})$ and summing over i, we find that
\[ \sum_{i=1}^{k} m_i (x_i - x_{i-1}) \le \sum_{i=1}^{k} f(\xi_i)(x_i - x_{i-1}) < \sum_{i=1}^{k} m_i (x_i - x_{i-1}) + \varepsilon, \]
\[ \sum_{i=1}^{k} M_i (x_i - x_{i-1}) - \varepsilon < \sum_{i=1}^{k} f(\eta_i)(x_i - x_{i-1}) \le \sum_{i=1}^{k} M_i (x_i - x_{i-1}). \]

Let $A = \{\xi_i\}_{i=1}^{k}$ and $B = \{\eta_i\}_{i=1}^{k}$. They are choices of intermediate points
for the partition P. The two inequalities above give
\[ L(f, P) \le R(f, P, A) < L(f, P) + \varepsilon, \]
\[ U(f, P) - \varepsilon < R(f, P, B) \le U(f, P), \]

which are the desired results.

The following gives an ε-δ definition for Riemann integrability of a bounded
function.

Theorem 4.10 Equivalent Definitions of Riemann Integrability

Let f : [a, b] → R be a bounded function. Consider the following two
definitions for f to be Riemann integrable.

(i) $\underline{\int_a^b} f = \overline{\int_a^b} f$.

(ii) There is a number I such that for any ε > 0, there exists a δ > 0 so
that if $P = \{x_i\}_{i=0}^{k}$ is a partition of [a, b] with |P| < δ and $A = \{\xi_i\}_{i=1}^{k}$
is a choice of intermediate points for P, then
\[ |R(f, P, A) - I| < \varepsilon. \]

These two statements are equivalent, and in case f is Riemann integrable,
\[ I = \int_a^b f = \underline{\int_a^b} f = \overline{\int_a^b} f. \]

Note that statement (ii) can be expressed as saying the limit of Riemann sums
\[ I = \lim_{|P| \to 0} R(f, P, A) \]
exists.

Proof
Assume that (i) holds. Let
\[ I = \underline{\int_a^b} f = \overline{\int_a^b} f. \]

Given ε > 0, by Theorem 4.8, there exists a δ > 0 such that if $P = \{x_i\}_{i=0}^{k}$
is a partition of [a, b] with |P| < δ, then
\[ U(f, P) - L(f, P) < \varepsilon. \]

If $A = \{\xi_i\}_{i=1}^{k}$ is any choice of intermediate points for the partition P,
\[ L(f, P) \le R(f, P, A) \le U(f, P). \]

Since we also have
\[ L(f, P) \le I \le U(f, P), \]
we find that
\[ |R(f, P, A) - I| \le U(f, P) - L(f, P) < \varepsilon. \]

This proves that (i) implies (ii).

Conversely, assume that (ii) holds. By Lemma 4.6, to show that (i) holds, it
suffices to prove that for any ε > 0, there is a partition P of [a, b] so that
\[ U(f, P) - L(f, P) < \varepsilon. \]

Given ε > 0, (ii) implies there is a δ > 0 such that if $P = \{x_i\}_{i=0}^{k}$ is a
partition of [a, b] with |P| < δ and $A = \{\xi_i\}_{i=1}^{k}$ is a choice of intermediate
points for the partition P, then
\[ |R(f, P, A) - I| < \frac{\varepsilon}{4}. \tag{4.3} \]
Here I is the limit of Riemann sums implied by (ii). Let $P = \{x_i\}_{i=0}^{n}$ be a
regular partition into n intervals, where n is large enough so that
\[ |P| = \frac{b - a}{n} < \delta. \]
By Lemma 4.9, there exist choices of intermediate points A and B for the
partition P which satisfy
\[ U(f, P) < R(f, P, A) + \frac{\varepsilon}{4}, \qquad L(f, P) > R(f, P, B) - \frac{\varepsilon}{4}. \]
These imply that
\[ U(f, P) - L(f, P) < R(f, P, A) - R(f, P, B) + \frac{\varepsilon}{2}. \]

By (4.3),
\[ |R(f, P, A) - R(f, P, B)| \le |R(f, P, A) - I| + |R(f, P, B) - I| < \frac{\varepsilon}{2}. \]
This proves that
\[ U(f, P) - L(f, P) < \varepsilon, \]
which completes the proof that (ii) implies (i).

As a consequence of Theorem 4.8 and Theorem 4.10, we have the following.

Corollary 4.11

Let f : [a, b] → R be a bounded function that is Riemann integrable, and
let $\{P_n\}$ be a sequence of partitions of [a, b] such that
\[ \lim_{n\to\infty} |P_n| = 0. \]

Then

(a) $\int_a^b f = \lim_{n\to\infty} U(f, P_n) = \lim_{n\to\infty} L(f, P_n)$.

(b) $\int_a^b f = \lim_{n\to\infty} R(f, P_n, A_n)$, where for each $n \in \mathbb{Z}^+$, $A_n$ is a choice of
intermediate points for the partition $P_n$.

This corollary says that if we know a priori that f : [a, b] → R is Riemann
integrable, then we can evaluate the integral by a sequence of partitions whose
gaps go to 0, using either the Darboux upper sums, or the Darboux lower sums,
or Riemann sums for any choice of intermediate points.

Proof
Let $I = \int_a^b f$. Given ε > 0, Theorem 4.8 and Theorem 4.10 imply that
there is a δ > 0 such that for any partition P with |P| < δ, and any choice of
intermediate points A for the partition P,
\[ U(f, P) - L(f, P) < \varepsilon \quad \text{and} \quad |R(f, P, A) - I| < \varepsilon. \]

Since $\lim_{n\to\infty} |P_n| = 0$, there is a positive integer N so that for all n ≥ N,
$|P_n| < \delta$. This implies that for all n ≥ N,
\[ U(f, P_n) - L(f, P_n) < \varepsilon \quad \text{and} \quad |R(f, P_n, A_n) - I| < \varepsilon. \]

Since $L(f, P_n) \le I \le U(f, P_n)$, we find that
\[ |U(f, P_n) - I| < \varepsilon \quad \text{and} \quad |L(f, P_n) - I| < \varepsilon \quad \text{for all } n \ge N. \]

These prove that
\[ I = \lim_{n\to\infty} U(f, P_n) = \lim_{n\to\infty} L(f, P_n) = \lim_{n\to\infty} R(f, P_n, A_n). \]

For every positive integer n, take $P_n$ to be the regular partition of [a, b] into
n intervals. This gives a sequence of partitions $\{P_n\}$ with
\[ \lim_{n\to\infty} |P_n| = \lim_{n\to\infty} \frac{b - a}{n} = 0. \]
For the choices of intermediate points An , one can take the left end point of
each interval, or the right end point, or the midpoint.

Example 4.13
We are going to prove in Section 4.3 that a continuous function is
integrable. The function f : [0, 6] → R, $f(x) = 6x - x^2$ is continuous. Use
Riemann sums to evaluate the integral $\int_0^6 f(x)\,dx$.

Solution
For a positive integer n, let $P_n = \{x_i\}_{i=0}^{n}$ be the regular partition of [0, 6]
into n intervals. Then
\[ x_i = \frac{6i}{n}, \qquad 0 \le i \le n. \]

Let $A_n = \{\xi_i\}_{i=1}^{n}$, where
\[ \xi_i = x_i = \frac{6i}{n}, \qquad 1 \le i \le n. \]
Then
\begin{align*}
R(f, P_n, A_n) &= \sum_{i=1}^{n} f(\xi_i)(x_i - x_{i-1}) \\
&= \frac{6}{n} \sum_{i=1}^{n} \left(\frac{36i}{n} - \frac{36i^2}{n^2}\right) \\
&= \frac{216}{n^2} \left(\frac{n(n+1)}{2} - \frac{(n+1)(2n+1)}{6}\right) \\
&= \frac{36(n^2 - 1)}{n^2}.
\end{align*}
Therefore,
\[ \int_0^6 f(x)\,dx = \lim_{n\to\infty} R(f, P_n, A_n) = 36. \]
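
As a quick numerical companion to Example 4.13, the sketch below evaluates Riemann sums of $f(x) = 6x - x^2$ on regular partitions of [0, 6] with the left end point, the midpoint, or the right end point as intermediate points. The function name `riemann_sum` and its `rule` parameter are assumptions made for illustration; exact rational arithmetic is used so that the closed form for the right end point rule can be compared exactly.

```python
from fractions import Fraction

def riemann_sum(f, a, b, n, rule="right"):
    """Riemann sum of f on the regular partition of [a, b] into n intervals."""
    a, b = Fraction(a), Fraction(b)
    width = (b - a) / n
    # Relative position of the intermediate point inside each subinterval.
    offset = {"left": Fraction(0), "mid": Fraction(1, 2), "right": Fraction(1)}[rule]
    return sum(f(a + (i + offset) * width) * width for i in range(n))

def f(x):
    return 6 * x - x * x

for n in (1, 10, 100, 1000):
    right = riemann_sum(f, 0, 6, n, "right")
    # Closed form obtained in the solution for the right end point rule.
    assert right == Fraction(36 * (n * n - 1), n * n)
    left = riemann_sum(f, 0, 6, n, "left")
    mid = riemann_sum(f, 0, 6, n, "mid")
    print(n, float(left), float(mid), float(right))
```

All three columns tend to 36, in line with Corollary 4.11: once integrability is known, any choice of intermediate points gives the same limit.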

Exercises 4.1
Question 1

Let f : [0, 2] → R be the function $f(x) = 4 - x^2$. Given a positive integer
n, let $P_n$ be the regular partition of [0, 2] into n subintervals.

(a) Compute $L(f, P_n)$ and $U(f, P_n)$.

(b) Show directly that $\lim_{n\to\infty} \bigl(U(f, P_n) - L(f, P_n)\bigr) = 0$.

(c) Use part (b) to conclude that f is Riemann integrable and find the
integral $\int_0^2 f(x)\,dx$.

Question 2

Given that the function f : [0, 4] → R, $f(x) = x^2 - 2x + 3$ is Riemann
integrable, use Riemann sums to evaluate the integral $\int_0^4 f(x)\,dx$.

4.2 Properties of Riemann Integrals

In this section, we derive some properties of the Riemann integrals. First we show
that the integral of a nonnegative function is nonnegative.

Theorem 4.12
If f : [a, b] → R is a bounded function that is Riemann integrable, and
\[ f(x) \ge 0 \quad \text{for all } x \in [a, b], \]
then
\[ \int_a^b f \ge 0. \]

Proof
Since f is Riemann integrable,
\[ \int_a^b f = \lim_{n\to\infty} L(f, P_n), \]
where $P_n$ is the regular partition of [a, b] into n intervals. Since f (x) ≥ 0
for all x ∈ [a, b], we find that
\[ L(f, P_n) \ge 0 \quad \text{for all } n \in \mathbb{Z}^+. \]

Therefore,
\[ \int_a^b f \ge 0. \]

Linearity is always an important property.



Theorem 4.13 Linearity of Integrals

Let f : [a, b] → R and g : [a, b] → R be bounded functions. If f and g are
Riemann integrable, then for any constants α and β, αf + βg : [a, b] → R
is also Riemann integrable, and
\[ \int_a^b (\alpha f + \beta g) = \alpha \int_a^b f + \beta \int_a^b g. \]

Proof
Here we use the fact that a function h : [a, b] → R is Riemann integrable
if and only if the limit $\lim_{|P| \to 0} R(h, P, A)$ exists. Since f : [a, b] → R and
g : [a, b] → R are bounded, αf + βg is also bounded. Since f : [a, b] → R
and g : [a, b] → R are Riemann integrable,
\[ \int_a^b f = \lim_{|P| \to 0} R(f, P, A), \qquad \int_a^b g = \lim_{|P| \to 0} R(g, P, A). \]

Notice that for any partition $P = \{x_i\}_{i=0}^{k}$ of [a, b], and any choice of
intermediate points $A = \{\xi_i\}_{i=1}^{k}$ for the partition P,
\begin{align*}
R(\alpha f + \beta g, P, A) &= \sum_{i=1}^{k} \bigl(\alpha f(\xi_i) + \beta g(\xi_i)\bigr)(x_i - x_{i-1}) \\
&= \alpha R(f, P, A) + \beta R(g, P, A).
\end{align*}

Limit laws imply that
\begin{align*}
\lim_{|P| \to 0} R(\alpha f + \beta g, P, A) &= \alpha \lim_{|P| \to 0} R(f, P, A) + \beta \lim_{|P| \to 0} R(g, P, A) \\
&= \alpha \int_a^b f + \beta \int_a^b g.
\end{align*}

This proves that αf + βg is Riemann integrable and
\[ \int_a^b (\alpha f + \beta g) = \alpha \int_a^b f + \beta \int_a^b g. \]
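
The key identity in the proof, $R(\alpha f + \beta g, P, A) = \alpha R(f, P, A) + \beta R(g, P, A)$, holds for every partition and every choice of intermediate points, not only in the limit. The sketch below checks it on randomly generated partitions with rational end points, so that the comparison is exact. The helper names, the sample functions and the constants are all assumptions chosen for illustration.

```python
from fractions import Fraction
import random

def riemann_sum(f, partition, tags):
    """Riemann sum R(f, P, A) for partition points P and intermediate points A."""
    return sum(f(t) * (x1 - x0) for x0, x1, t in zip(partition, partition[1:], tags))

def random_tagged_partition(a, b, k, rng):
    """A partition of [a, b] with at most k subintervals and one tag per subinterval."""
    a, b = Fraction(a), Fraction(b)
    # A set removes repeated interior points, so fewer than k intervals may result.
    interior = sorted({a + (b - a) * Fraction(rng.randint(1, 999), 1000) for _ in range(k - 1)})
    partition = [a] + interior + [b]
    tags = [x0 + (x1 - x0) * Fraction(rng.randint(0, 10), 10)
            for x0, x1 in zip(partition, partition[1:])]
    return partition, tags

def f(x):
    return x * x

def g(x):
    return 3 * x + 1

alpha, beta = Fraction(2), Fraction(-5)

def combo(x):
    return alpha * f(x) + beta * g(x)

rng = random.Random(0)
for _ in range(100):
    P, A = random_tagged_partition(-1, 2, 8, rng)
    assert riemann_sum(combo, P, A) == alpha * riemann_sum(f, P, A) + beta * riemann_sum(g, P, A)
```

Because the identity is exact at the level of sums, linearity of the integral follows from the limit laws alone.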

From the previous two theorems, we obtain a comparison theorem for integrals.

Theorem 4.14 Monotonicity

Let f : [a, b] → R and g : [a, b] → R be bounded functions. If f and g are
Riemann integrable, and
\[ f(x) \ge g(x) \quad \text{for all } x \in [a, b], \]
then
\[ \int_a^b f \ge \int_a^b g. \]

Proof
Define h : [a, b] → R to be the function h(x) = f (x) − g(x). Then
h(x) ≥ 0 for all x ∈ [a, b]. By Theorem 4.13, h is Riemann integrable and
\[ \int_a^b h = \int_a^b f - \int_a^b g. \]
By Theorem 4.12, $\int_a^b h \ge 0$. Hence,
\[ \int_a^b f \ge \int_a^b g. \]

We can apply the monotonicity theorem to obtain bounds for an integral from
the lower bound and the upper bound of the function.

Example 4.14

Let f : [a, b] → R be a Riemann integrable function satisfying
\[ m \le f(x) \le M \quad \text{for all } x \in [a, b]. \]

Then
\[ m(b - a) \le \int_a^b f \le M(b - a). \]

When an interval is partitioned into a finite collection of intervals, the integral
over the whole interval is expected to be equal to the sum of the integrals over the
subintervals. It is enough for us to consider two subintervals.

Theorem 4.15 Additivity

Let f : [a, b] → R be a bounded function, and let c be a point in (a, b).

(a) If f : [a, b] → R is Riemann integrable, then f : [a, c] → R and
f : [c, b] → R are Riemann integrable.

(b) If f : [a, c] → R and f : [c, b] → R are Riemann integrable, then
f : [a, b] → R is Riemann integrable.

In either case,
\[ \int_a^b f = \int_a^c f + \int_c^b f. \]

Proof
We use Lemma 4.6. First we prove (a). Given ε > 0, since f : [a, b] → R
is Riemann integrable, there is a partition P of [a, b] such that
\[ U(f, P) - L(f, P) < \varepsilon. \]

Let $P^*$ be the partition of [a, b] that is obtained by taking the union of
the partition points in P and $P_0 = \{a, c, b\}$. If P already contains c as
a partition point, then $P^* = P$. In any case, $P^*$ is a refinement of P.
Therefore,
\[ U(f, P^*) - L(f, P^*) \le U(f, P) - L(f, P) < \varepsilon. \]

Consider $P^*$ as a refinement of $P_0$, let $P_1$ be the partition of [a, c] induced
by $P^*$, and let $P_2$ be the partition of [c, b] induced by $P^*$. Then
\[ L(f, P^*) = L(f, P_1) + L(f, P_2), \qquad U(f, P^*) = U(f, P_1) + U(f, P_2). \]

These imply that
\[ \bigl(U(f, P_1) - L(f, P_1)\bigr) + \bigl(U(f, P_2) - L(f, P_2)\bigr) = U(f, P^*) - L(f, P^*) < \varepsilon. \]

Since $U(f, P_1) - L(f, P_1) \ge 0$ and $U(f, P_2) - L(f, P_2) \ge 0$, we find that
\[ U(f, P_1) - L(f, P_1) < \varepsilon \quad \text{and} \quad U(f, P_2) - L(f, P_2) < \varepsilon. \]

By Lemma 4.6, we conclude that f : [a, c] → R and f : [c, b] → R are
Riemann integrable.

Next, we prove (b). Given ε > 0, since f : [a, c] → R and f : [c, b] → R
are Riemann integrable, there exists a partition $P_1$ of [a, c], and a partition
$P_2$ of [c, b] such that
\[ U(f, P_1) - L(f, P_1) < \frac{\varepsilon}{2} \quad \text{and} \quad U(f, P_2) - L(f, P_2) < \frac{\varepsilon}{2}. \]
Let P be the partition of [a, b] obtained by taking the union of the partition
points in $P_1$ and $P_2$. Then
\[ L(f, P) = L(f, P_1) + L(f, P_2), \qquad U(f, P) = U(f, P_1) + U(f, P_2). \]

Therefore,
\[ U(f, P) - L(f, P) = \bigl(U(f, P_1) - L(f, P_1)\bigr) + \bigl(U(f, P_2) - L(f, P_2)\bigr) < \varepsilon. \]

This proves that f : [a, b] → R is Riemann integrable.

Now we prove the last statement. For any positive integer n, let $P_{1,n}$ be the
regular partition of [a, c] into n intervals, and let $P_{2,n}$ be the regular partition
of [c, b] into n intervals. Then let $P_n$ be the partition of [a, b] obtained by
taking the union of the partition points in $P_{1,n}$ and $P_{2,n}$. For the Darboux
upper sums, we have
\[ U(f, P_n) = U(f, P_{1,n}) + U(f, P_{2,n}). \]

Since the gaps $|P_{1,n}|$, $|P_{2,n}|$ and $|P_n|$ all tend to 0 as n → ∞, Corollary 4.11
applies to each of the three sequences of partitions. Taking the n → ∞ limits
on both sides, we conclude that
\[ \int_a^b f = \int_a^c f + \int_c^b f. \]
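
The splitting of Darboux sums used in the proof can be illustrated concretely. The sketch below is a minimal Python check under the assumption that f is increasing (so that suprema and infima sit at end points of subintervals); the function $x^3$, the interval [0, 2], the point c = 1/2 and all helper names are choices made for illustration only.

```python
from fractions import Fraction

def regular_partition(a, b, n):
    """Partition points of the regular partition of [a, b] into n intervals."""
    a, b = Fraction(a), Fraction(b)
    return [a + (b - a) * Fraction(i, n) for i in range(n + 1)]

def upper_sum_increasing(f, partition):
    # For an increasing f, the supremum on [x0, x1] is f(x1).
    return sum(f(x1) * (x1 - x0) for x0, x1 in zip(partition, partition[1:]))

def lower_sum_increasing(f, partition):
    # For an increasing f, the infimum on [x0, x1] is f(x0).
    return sum(f(x0) * (x1 - x0) for x0, x1 in zip(partition, partition[1:]))

def cube(x):
    return x ** 3

a, c, b = 0, Fraction(1, 2), 2
for n in (1, 10, 100):
    p1 = regular_partition(a, c, n)
    p2 = regular_partition(c, b, n)
    p = p1 + p2[1:]  # union of the partition points; c is listed once
    assert upper_sum_increasing(cube, p) == (
        upper_sum_increasing(cube, p1) + upper_sum_increasing(cube, p2))
    assert lower_sum_increasing(cube, p) == (
        lower_sum_increasing(cube, p1) + lower_sum_increasing(cube, p2))
    print(n, float(lower_sum_increasing(cube, p)), float(upper_sum_increasing(cube, p)))
```

The printed lower and upper sums squeeze together as n grows, and each is the sum of the corresponding quantities on [a, c] and [c, b].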

Extension of Definition of Integrals


The additivity allows us to extend the definition of the integral $\int_a^b f$ to the
case where a ≥ b. We define
\[ \int_a^a f = 0. \]

If a > b, define
\[ \int_a^b f = -\int_b^a f. \]
Then one can check that as long as two of the three integrals $\int_a^b f$, $\int_a^c f$,
$\int_c^b f$ exist, the third one also exists, and we always have
\[ \int_a^b f = \int_a^c f + \int_c^b f. \]

Using induction, we can extend the additivity theorem.

Corollary 4.16 General Additivity Theorem

Let f : [a, b] → R be a bounded function, and let $P_0 = \{a_0, a_1, \ldots, a_k\}$ be
a partition of [a, b]. Then f : [a, b] → R is Riemann integrable if and only
if for each 1 ≤ i ≤ k, $f : [a_{i-1}, a_i] \to \mathbb{R}$ is Riemann integrable. In this
case,
\[ \int_a^b f(x)\,dx = \int_a^{a_1} f + \int_{a_1}^{a_2} f + \cdots + \int_{a_{k-2}}^{a_{k-1}} f + \int_{a_{k-1}}^{b} f. \]

Exercises 4.2
Question 1
Given that f : [2, 7] → R is a function satisfying
\[ -3 \le f(x) \le 11 \quad \text{for all } x \in [2, 7]. \]

Find a lower bound and an upper bound for $\int_2^7 f(x)\,dx$.

Question 2
Given that f : [a, b] → R and g : [a, b] → R are bounded functions, P is a
partition of [a, b], and c and d are two points in [a, b] with c < d. Prove the
following.

(a) $\inf_{c \le x \le d} (f + g)(x) \ge \inf_{c \le x \le d} f(x) + \inf_{c \le x \le d} g(x)$.

(b) $\sup_{c \le x \le d} (f + g)(x) \le \sup_{c \le x \le d} f(x) + \sup_{c \le x \le d} g(x)$.

(c) $L(f + g, P) \ge L(f, P) + L(g, P)$.

(d) $U(f + g, P) \le U(f, P) + U(g, P)$.

Then use (c) and (d) to give a proof of the following statement: If f :
[a, b] → R and g : [a, b] → R are Riemann integrable, then f + g : [a, b] →
R is also Riemann integrable, and
\[ \int_a^b (f + g) = \int_a^b f + \int_a^b g. \]

4.3 Functions that are Riemann Integrable

In this section, we are going to derive Riemann integrability of a few classes of
functions. The first class of functions that are of interest is the class of continuous
functions. If f : [a, b] → R is a continuous function, then f ([a, b]) is sequentially
compact. In particular, f ([a, b]) is bounded. Hence, if f : [a, b] → R is a
continuous function, it is bounded. The crucial property for a continuous function
defined on a closed and bounded interval to be integrable is uniform continuity.

Theorem 4.17
Let f : [a, b] → R be a continuous function. Then f : [a, b] → R is
Riemann integrable.

Proof
Since f : [a, b] → R is a continuous function defined on a closed and
bounded interval, it is uniformly continuous. Given ε > 0, there exists
δ > 0 such that for any u and v in [a, b] with |u − v| < δ,
\[ |f(u) - f(v)| < \frac{\varepsilon}{b - a}. \]
Let $P = \{x_i\}_{i=0}^{k}$ be a partition of [a, b] with |P| < δ. For any 1 ≤ i ≤ k,
$f : [x_{i-1}, x_i] \to \mathbb{R}$ is continuous. By the extreme value theorem, there exist $u_i$
and $v_i$ in $[x_{i-1}, x_i]$ such that
\[ m_i = \inf_{x_{i-1} \le x \le x_i} f(x) = f(u_i), \qquad M_i = \sup_{x_{i-1} \le x \le x_i} f(x) = f(v_i). \]

Then
\[ |u_i - v_i| \le x_i - x_{i-1} \le |P| < \delta. \]
Therefore,
\[ M_i - m_i = f(v_i) - f(u_i) < \frac{\varepsilon}{b - a}. \]

Hence,
\begin{align*}
U(f, P) - L(f, P) &= \sum_{i=1}^{k} (M_i - m_i)(x_i - x_{i-1}) \\
&< \frac{\varepsilon}{b - a} \sum_{i=1}^{k} (x_i - x_{i-1}) = \varepsilon.
\end{align*}

This proves that f : [a, b] → R is Riemann integrable.

It follows from this theorem that all the following classes of functions
are integrable on a closed and bounded interval that is contained in their
domains.

• Polynomials

• Rational Functions

• Exponential Functions

• Logarithmic Functions

• Trigonometric Functions

Let us revisit the concept of area, which is our original motivation to define
integrals.

Remark 4.3 Area


Let f : [a, b] → R be a continuous function such that f (x) ≥ 0 for all
x ∈ [a, b], and let R be the region bounded between the x-axis, the lines
x = a and x = b, as well as the curve y = f (x).

Figure 4.7: Darboux lower sum underestimates area while Darboux upper sum
overestimates area.

Given a partition P of [a, b], the Darboux lower sum L(f, P ) is a sum of
areas of rectangles that are inside R. The Darboux upper sum U (f, P ) is a
sum of areas of rectangles whose union contains R. Therefore, if R has an
area A, L(f, P ) is less than or equal to A, while U (f, P ) is larger than or
equal to A. Since f is continuous, the Riemann integral $I = \int_a^b f(x)\,dx$
exists. By definition, I is the unique number such that
\[ L(f, P) \le I \le U(f, P) \]
for all partitions P of [a, b]. Therefore, we define the area of R to be this
number I. Namely,
\[ \text{Area of } R = \int_a^b f(x)\,dx. \]

There are also other classes of functions that are Riemann integrable, which
are useful. First, we relax the continuity condition slightly in the previous theorem.

Theorem 4.18
Let f : [a, b] → R be a bounded function that is continuous on (a, b). Then
f : [a, b] → R is Riemann integrable.

Here we only assume f is continuous on (a, b). The function can take on any
values on the boundary points a and b.

Proof
Since f : [a, b] → R is bounded, there is a positive constant M such that
\[ |f(x)| \le M \quad \text{for all } x \in [a, b]. \]

Given ε > 0, let
\[ r = \min\left\{ \frac{\varepsilon}{8M}, \frac{b - a}{3} \right\}. \]
Then r > 0 and a + r < b − r. The function f : [a + r, b − r] → R is
continuous. By Theorem 4.17, f : [a + r, b − r] → R is Riemann integrable.
Therefore, there is a partition $P_1$ of [a + r, b − r] such that
\[ U(f, P_1) - L(f, P_1) < \frac{\varepsilon}{2}. \]
Let P be the partition of [a, b] obtained by adding the points a and b to $P_1$.
Then
\begin{align*}
U(f, P) - L(f, P) &= U(f, P_1) - L(f, P_1) \\
&\quad + r \left( \sup_{a \le x \le a + r} f(x) - \inf_{a \le x \le a + r} f(x) \right) \\
&\quad + r \left( \sup_{b - r \le x \le b} f(x) - \inf_{b - r \le x \le b} f(x) \right) \\
&< \frac{\varepsilon}{2} + 4Mr \\
&\le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}
This proves that f : [a, b] → R is Riemann integrable.

As we can see in the proof above, the integral $\int_a^b f$ does not depend on the
function value at the end points. In fact, this is true for any finite number of points.

Theorem 4.19
Let f : [a, b] → R be a bounded function that is Riemann integrable.
Assume that g : [a, b] → R is a function and $S = \{a_1, a_2, \ldots, a_k\}$ is a
finite subset of [a, b] such that
\[ g(x) = f(x) \quad \text{for all } x \in [a, b] \setminus S. \]

Then g : [a, b] → R is Riemann integrable, and
\[ \int_a^b g = \int_a^b f. \]

Proof
Let h : [a, b] → R be the function h(x) = g(x) − f (x). Then h(x) = 0 for
x ∈ [a, b] \ S. Since S is a finite set, h is bounded, and so there is a positive
constant M such that |h(x)| ≤ M for all x ∈ [a, b]. Given a positive integer
n, let $P_n = \{x_0, x_1, \ldots, x_n\}$ be the regular partition of [a, b] into n intervals.
There are at most 2k of the intervals $[x_{i-1}, x_i]$ that contain a point of S. In
these intervals,
\[ -M \le \inf_{x_{i-1} \le x \le x_i} h(x) \le \sup_{x_{i-1} \le x \le x_i} h(x) \le M. \]

If $[x_{i-1}, x_i]$ does not contain any points of S, then
\[ \inf_{x_{i-1} \le x \le x_i} h(x) = \sup_{x_{i-1} \le x \le x_i} h(x) = 0. \]

These imply that
\[ U(h, P_n) = \sum_{i=1}^{n} \sup_{x_{i-1} \le x \le x_i} h(x)\,(x_i - x_{i-1}) \le \frac{2Mk(b - a)}{n}, \]
\[ L(h, P_n) = \sum_{i=1}^{n} \inf_{x_{i-1} \le x \le x_i} h(x)\,(x_i - x_{i-1}) \ge -\frac{2Mk(b - a)}{n}. \]

Therefore,
\[ -\frac{2Mk(b - a)}{n} \le L(h, P_n) \le U(h, P_n) \le \frac{2Mk(b - a)}{n}. \]
Taking n → ∞ limits, we find that
\[ \lim_{n\to\infty} U(h, P_n) = \lim_{n\to\infty} L(h, P_n) = 0. \]

By the Archimedes-Riemann theorem, h : [a, b] → R is Riemann integrable
and $\int_a^b h = 0$. Therefore, g = h + f is also Riemann integrable and
\[ \int_a^b g = \int_a^b h + \int_a^b f = \int_a^b f. \]
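
The bound $2Mk(b - a)/n$ in the proof can be observed numerically. The sketch below is an illustration under stated assumptions: h vanishes off a finite set S, the set S and the values of h on it are made up for the example, and the helper name `darboux_sums_finite_support` is hypothetical. Since every subinterval contains points outside S, the supremum of h on a subinterval is the larger of 0 and the values of h at the points of S that it contains, and similarly for the infimum.

```python
from fractions import Fraction

def darboux_sums_finite_support(values, a, b, n):
    """Darboux sums on the regular partition of [a, b] into n intervals for a
    function h that equals values[s] at each listed point s and 0 elsewhere."""
    a, b = Fraction(a), Fraction(b)
    width = (b - a) / n
    lower = upper = Fraction(0)
    for i in range(1, n + 1):
        x0, x1 = a + (i - 1) * width, a + i * width
        hits = [v for s, v in values.items() if x0 <= s <= x1]
        # Every subinterval contains points outside S, where h = 0.
        upper += max(hits + [0]) * width
        lower += min(hits + [0]) * width
    return lower, upper

# Illustrative data: S has k = 3 points in [0, 1].
values = {Fraction(0): Fraction(5), Fraction(1, 3): Fraction(-2), Fraction(1, 2): Fraction(7)}
M = max(abs(v) for v in values.values())
k = len(values)

for n in (1, 6, 60, 600):
    lower, upper = darboux_sums_finite_support(values, 0, 1, n)
    bound = 2 * M * k * Fraction(1 - 0) / n
    assert -bound <= lower <= 0 <= upper <= bound
    print(n, float(lower), float(upper), float(bound))
```

Both sums shrink to 0 at the rate 1/n, which is exactly what forces the integral of h to be 0.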

Remark 4.4
If f : (a, b) → R is a bounded function, we can extend the function to
[a, b] and discuss its integrability. By Theorem 4.19, this is not affected by
how we define the function at x = a and x = b. In case the extension is
Riemann integrable, we still denote the integral by $\int_a^b f$.

Definition 4.10 Piecewise Continuous Functions


We say that a function f : [a, b] → R is piecewise continuous if there is
a partition P0 = {a0 , a1 , . . . , ak } of [a, b] such that for each 1 ≤ i ≤ k,
f : (ai−1 , ai ) → R is continuous.

Using the general additivity theorem (Corollary 4.16) and Theorem 4.18, we
obtain the following immediately.

Theorem 4.20
Let f : [a, b] → R be a function that is bounded and piecewise continuous.
Then f : [a, b] → R is Riemann integrable.

Example 4.15

The function f : [−1, 2] → R defined by
\[ f(x) = \begin{cases} 2 - x, & \text{if } -1 \le x < 0, \\ x^2, & \text{if } 0 \le x \le 2, \end{cases} \]
is piecewise continuous and bounded. Hence, f : [−1, 2] → R is Riemann
integrable.

Figure 4.8: The piecewise continuous function defined in Example 4.15.

A special class of functions that are bounded and piecewise continuous is the
class of step functions.

Definition 4.11 Step Functions

We say that f : [a, b] → R is a step function if there is a partition
$P_0 = \{a_0, a_1, \ldots, a_k\}$ of [a, b] such that for each 1 ≤ i ≤ k, $f : (a_{i-1}, a_i) \to \mathbb{R}$
is a constant function.

By the previous theorem, a step function is Riemann integrable. In fact, it is easy
to compute its integral.

Proposition 4.21

Let $P_0 = \{a_0, a_1, \ldots, a_k\}$ be a partition of [a, b], and let f : [a, b] → R be a
step function such that for 1 ≤ i ≤ k,
\[ f(x) = c_i, \quad \text{when } a_{i-1} < x < a_i. \]

Then f : [a, b] → R is Riemann integrable and
\[ \int_a^b f = \sum_{i=1}^{k} c_i (a_i - a_{i-1}). \]

Example 4.16

Let f : [0, 5] → R be the function defined as
\[ f(x) = \begin{cases} 1, & \text{if } 0 \le x \le 1, \\ \lfloor 5/x \rfloor, & \text{if } 1 < x \le 5. \end{cases} \]
Show that f is Riemann integrable and find $\int_0^5 f$.

Solution
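The function f is given explicitly by
\[ f(x) = \begin{cases} 1, & \text{if } 0 \le x \le 1, \\ 4, & \text{if } 1 < x \le 5/4, \\ 3, & \text{if } 5/4 < x \le 5/3, \\ 2, & \text{if } 5/3 < x \le 5/2, \\ 1, & \text{if } 5/2 < x \le 5. \end{cases} \]

This is a step function. Hence, it is integrable, and
\[ \int_0^5 f = 1 \times 1 + 4 \times \frac{1}{4} + 3 \times \frac{5}{12} + 2 \times \frac{5}{6} + 1 \times \frac{5}{2} = \frac{89}{12}. \]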
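
The arithmetic in Example 4.16 follows the formula of Proposition 4.21 and can be replayed exactly. In the sketch below, the helper `step_integral` is a hypothetical name; the break points and heights are those of the explicit description of f, and the heights on (1, 5] are cross-checked against $\lfloor 5/x \rfloor$ at the midpoint of each piece.

```python
from fractions import Fraction
import math

def step_integral(breakpoints, heights):
    """Sum of c_i (a_i - a_{i-1}) for a step function with value c_i on (a_{i-1}, a_i)."""
    return sum(c * (x1 - x0) for c, x0, x1 in zip(heights, breakpoints, breakpoints[1:]))

breakpoints = [Fraction(0), Fraction(1), Fraction(5, 4), Fraction(5, 3), Fraction(5, 2), Fraction(5)]
heights = [1, 4, 3, 2, 1]

# On (1, 5] the height of each piece must agree with floor(5 / x).
for c, x0, x1 in list(zip(heights, breakpoints, breakpoints[1:]))[1:]:
    midpoint = (x0 + x1) / 2
    assert math.floor(5 / midpoint) == c

total = step_integral(breakpoints, heights)
assert total == Fraction(89, 12)
print(total)
```

The values of f at the break points themselves play no role, in agreement with Theorem 4.19.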

Figure 4.9: The function defined in Example 4.16.

The following theorem shows that monotonic functions are also Riemann
integrable.

Theorem 4.22
If f : [a, b] → R is a monotonic function, then it is Riemann integrable.

Proof
Without loss of generality, assume that f : [a, b] → R is an increasing
function. If $P = \{x_i\}_{i=0}^{k}$ is a partition of [a, b], then for any 1 ≤ i ≤ k,
\[ m_i = \inf_{x_{i-1} \le x \le x_i} f(x) = f(x_{i-1}), \qquad M_i = \sup_{x_{i-1} \le x \le x_i} f(x) = f(x_i). \]

Therefore,
\[ U(f, P) - L(f, P) = \sum_{i=1}^{k} \bigl(f(x_i) - f(x_{i-1})\bigr)(x_i - x_{i-1}). \]

For each positive integer n, let $P_n$ be the regular partition of [a, b] into n
intervals. Then
\begin{align*}
U(f, P_n) - L(f, P_n) &= \frac{b - a}{n} \sum_{i=1}^{n} \bigl(f(x_i) - f(x_{i-1})\bigr) \\
&= \frac{(b - a)(f(b) - f(a))}{n}.
\end{align*}

This implies that
\[ \lim_{n\to\infty} \bigl(U(f, P_n) - L(f, P_n)\bigr) = \lim_{n\to\infty} \frac{(b - a)(f(b) - f(a))}{n} = 0. \]
In other words, $\{P_n\}$ is an Archimedes sequence of partitions for f. By the
Archimedes-Riemann theorem, this proves that f is Riemann integrable.
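
The telescoping identity $U(f, P_n) - L(f, P_n) = (b - a)(f(b) - f(a))/n$ is easy to confirm with exact arithmetic. The following sketch uses the increasing function $f(x) = x^3 + x$ on [−1, 2], which is an example chosen here and not taken from the text; the helper name is also hypothetical.

```python
from fractions import Fraction

def darboux_gap_increasing(f, a, b, n):
    """U(f, P_n) - L(f, P_n) for an increasing f and the regular partition P_n."""
    a, b = Fraction(a), Fraction(b)
    width = (b - a) / n
    points = [a + i * width for i in range(n + 1)]
    upper = sum(f(x1) * width for x1 in points[1:])    # M_i = f(x_i)
    lower = sum(f(x0) * width for x0 in points[:-1])   # m_i = f(x_{i-1})
    return upper - lower

def f(x):
    return x ** 3 + x

a, b = -1, 2
for n in (1, 7, 50, 1000):
    gap = darboux_gap_increasing(f, a, b, n)
    # Here (b - a)(f(b) - f(a)) = 3 * (10 - (-2)) = 36.
    assert gap == Fraction((b - a) * (f(b) - f(a)), n)
    print(n, float(gap))
```

The gap depends only on the values of f at the two end points, which is why no continuity assumption is needed.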

Example 4.17

Let f : [0, 1] → R be the function defined by f (0) = 1, and for each
positive integer n,
\[ f(x) = 1 - \frac{1}{n + 1}, \quad \text{when } \frac{1}{n + 1} < x \le \frac{1}{n}. \]
One can verify that f : [0, 1] → R is a decreasing function. Hence, it is
Riemann integrable. However, f is not a piecewise continuous function,
since it has discontinuities at infinitely many points.
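
The function of Example 4.17 can be coded directly, since for 0 < x ≤ 1 the integer n with 1/(n + 1) < x ≤ 1/n is $\lfloor 1/x \rfloor$. The sketch below checks monotonicity on a sample grid (a spot check, not a proof) and confirms that on regular partitions of [0, 1] the gap between the Darboux sums equals $(f(0) - f(1))/n = 1/(2n)$. The helper names are assumptions made for illustration.

```python
from fractions import Fraction
import math

def f(x):
    """The function of Example 4.17 on [0, 1]."""
    x = Fraction(x)
    if x == 0:
        return Fraction(1)
    n = math.floor(1 / x)  # the integer n with 1/(n+1) < x <= 1/n
    return 1 - Fraction(1, n + 1)

def darboux_gap_decreasing(f, n):
    """U(f, P_n) - L(f, P_n) on the regular partition of [0, 1] for a decreasing f."""
    width = Fraction(1, n)
    points = [i * width for i in range(n + 1)]
    upper = sum(f(x0) * width for x0 in points[:-1])  # supremum at the left end point
    lower = sum(f(x1) * width for x1 in points[1:])   # infimum at the right end point
    return upper - lower

# Spot check that f is decreasing on a grid of sample points.
grid = [Fraction(i, 200) for i in range(201)]
assert all(f(u) >= f(v) for u, v in zip(grid, grid[1:]))

for n in (1, 10, 100):
    assert darboux_gap_decreasing(f, n) == Fraction(1, 2 * n)
```

Even though f jumps at every point 1/n, the gap still tends to 0, so the regular partitions form an Archimedes sequence of partitions for f.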

The Riemann integrability of a function implies the Riemann integrability of
its absolute value.

Theorem 4.23
Let f : [a, b] → R be a bounded function. If f : [a, b] → R is Riemann
integrable, then the function |f | : [a, b] → R is Riemann integrable.

Proof
We will first prove the following: For any c and d in [a, b] with c < d,
\[ \sup_{c \le x \le d} |f(x)| - \inf_{c \le x \le d} |f(x)| \le \sup_{c \le x \le d} f(x) - \inf_{c \le x \le d} f(x). \tag{4.4} \]

There are two sequences of points $\{u_n\}$ and $\{v_n\}$ in [c, d] such that
\[ \lim_{n\to\infty} |f(u_n)| = \inf_{c \le x \le d} |f(x)|, \qquad \lim_{n\to\infty} |f(v_n)| = \sup_{c \le x \le d} |f(x)|. \]

Since $u_n$ and $v_n$ are points in [c, d], we find that
\[ |f(v_n)| - |f(u_n)| \le |f(v_n) - f(u_n)| \le \sup_{c \le x \le d} f(x) - \inf_{c \le x \le d} f(x). \]

Passing to the n → ∞ limit, we obtain (4.4).

Now, given ε > 0, since f : [a, b] → R is Riemann integrable, there is a
partition $P = \{x_i\}_{i=0}^{k}$ of [a, b] such that
\[ U(f, P) - L(f, P) < \varepsilon. \]

But then
\begin{align*}
U(|f|, P) - L(|f|, P) &= \sum_{i=1}^{k} \left( \sup_{x_{i-1} \le x \le x_i} |f(x)| - \inf_{x_{i-1} \le x \le x_i} |f(x)| \right)(x_i - x_{i-1}) \\
&\le \sum_{i=1}^{k} \left( \sup_{x_{i-1} \le x \le x_i} f(x) - \inf_{x_{i-1} \le x \le x_i} f(x) \right)(x_i - x_{i-1}) \\
&= U(f, P) - L(f, P) < \varepsilon.
\end{align*}

This proves that |f | : [a, b] → R is Riemann integrable.

Figure 4.10: A function y = f (x) and its absolute value y = |f (x)|.



Remark 4.5
The converse of Theorem 4.23 is not true. Namely, if a function f : [a, b] →
R is bounded, the Riemann integrability of |f | : [a, b] → R does not imply that
f : [a, b] → R is Riemann integrable. For a counterexample, consider the
function f : [0, 1] → R defined as
\[ f(x) = \begin{cases} 1, & \text{if } x \text{ is rational}, \\ -1, & \text{if } x \text{ is irrational}. \end{cases} \]

One can prove that f : [0, 1] → R is not integrable, exactly the same way
as in Example 4.11. On the other hand, since |f | : [0, 1] → R is a constant
function, it is Riemann integrable.

The following theorem says that the product of Riemann integrable functions
is Riemann integrable.

Theorem 4.24
Let f : [a, b] → R and g : [a, b] → R be bounded functions. If f :
[a, b] → R and g : [a, b] → R are Riemann integrable, then the function
(f g) : [a, b] → R is also Riemann integrable.

Proof
We will apply Lemma 4.6 to prove the Riemann integrability of the function
h = (f g) : [a, b] → R. Since f : [a, b] → R and g : [a, b] → R are bounded
functions, there is a positive number M so that

|f (x)| ≤ M and |g(x)| ≤ M for all x ∈ [a, b].

We claim that for any c and d in [a, b] with c < d,

\[ \sup_{c\le x\le d} h(x) - \inf_{c\le x\le d} h(x) \le M \left( \sup_{c\le x\le d} f(x) - \inf_{c\le x\le d} f(x) + \sup_{c\le x\le d} g(x) - \inf_{c\le x\le d} g(x) \right). \tag{4.5} \]

There are two sequences of points $\{u_n\}$ and $\{v_n\}$ in [c, d] such that

\[ \lim_{n\to\infty} h(u_n) = \inf_{c\le x\le d} h(x), \qquad \lim_{n\to\infty} h(v_n) = \sup_{c\le x\le d} h(x). \]

Notice that

\[
\begin{aligned}
|h(v_n) - h(u_n)| &= |g(v_n)(f(v_n) - f(u_n)) + f(u_n)(g(v_n) - g(u_n))| \\
&\le |g(v_n)||f(v_n) - f(u_n)| + |f(u_n)||g(v_n) - g(u_n)|.
\end{aligned}
\]

Since $u_n$ and $v_n$ are in [c, d], we find that

\[ |f(v_n) - f(u_n)| \le \sup_{c\le x\le d} f(x) - \inf_{c\le x\le d} f(x), \]

\[ |g(v_n) - g(u_n)| \le \sup_{c\le x\le d} g(x) - \inf_{c\le x\le d} g(x). \]

Therefore, since $|g(v_n)| \le M$ and $|f(u_n)| \le M$,

\[ |h(v_n) - h(u_n)| \le M \left( \sup_{c\le x\le d} f(x) - \inf_{c\le x\le d} f(x) + \sup_{c\le x\le d} g(x) - \inf_{c\le x\le d} g(x) \right). \]

Passing to the n → ∞ limit gives (4.5).


Now given ε > 0, since f and g are Riemann integrable, there are partitions
$P_1$ and $P_2$ of [a, b] such that

\[ U(f, P_1) - L(f, P_1) < \frac{\varepsilon}{2M}, \qquad U(g, P_2) - L(g, P_2) < \frac{\varepsilon}{2M}. \]

Let $P^* = \{x_0, x_1, \ldots, x_k\}$ be a common refinement of $P_1$ and $P_2$. Then

\[ U(f, P^*) - L(f, P^*) < \frac{\varepsilon}{2M}, \qquad U(g, P^*) - L(g, P^*) < \frac{\varepsilon}{2M}. \]

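It follows that

\[
\begin{aligned}
U(h, P^*) - L(h, P^*)
&= \sum_{i=1}^{k} \left( \sup_{x_{i-1}\le x\le x_i} h(x) - \inf_{x_{i-1}\le x\le x_i} h(x) \right)(x_i - x_{i-1}) \\
&\le M \sum_{i=1}^{k} \left( \sup_{x_{i-1}\le x\le x_i} f(x) - \inf_{x_{i-1}\le x\le x_i} f(x) \right)(x_i - x_{i-1}) \\
&\quad + M \sum_{i=1}^{k} \left( \sup_{x_{i-1}\le x\le x_i} g(x) - \inf_{x_{i-1}\le x\le x_i} g(x) \right)(x_i - x_{i-1}) \\
&= M \left( U(f, P^*) - L(f, P^*) + U(g, P^*) - L(g, P^*) \right) \\
&< M \times \left( \frac{\varepsilon}{2M} + \frac{\varepsilon}{2M} \right) = \varepsilon.
\end{aligned}
\]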
This proves that (f g) : [a, b] → R is Riemann integrable.

Exercises 4.3
Question 1
Let f : [−1, 1] → R be the function defined by

\[ f(x) = \begin{cases} \sin\left(\dfrac{4}{x}\right), & \text{if } x \neq 0, \\ 0, & \text{if } x = 0. \end{cases} \]

Explain why f : [−1, 1] → R is Riemann integrable.

Question 2
Let f : [a, b] → R be a bounded function. If f : [a, b] → R is Riemann
integrable, show that
\[ \left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx. \]

Question 3
Let f : [0, 6] → R be the function defined as

\[ f(x) = \begin{cases} -2, & \text{if } 0 \le x < 1, \\ \lfloor 4/x \rfloor, & \text{if } 1 \le x \le 6. \end{cases} \]

Show that f is Riemann integrable and find $\int_0^6 f$.

Question 4
Let f : R → R be the function defined as

\[ f(x) = (x - \lfloor x \rfloor)^2. \]

Explain why the function f is Riemann integrable on any closed and
bounded interval [a, b].

Question 5
Let f : [a, b] → R and g : [a, b] → R be bounded functions. Define the
function h : [a, b] → R by

h(x) = max{f (x), g(x)}.

(a) Show that

\[ h = \frac{f + g + |f - g|}{2}. \]
(b) If f : [a, b] → R and g : [a, b] → R are Riemann integrable, show that
h : [a, b] → R is also Riemann integrable.

Question 6 [Cauchy-Schwarz Inequality]

Let f : [a, b] → R and g : [a, b] → R be bounded functions that are
Riemann integrable. Prove that

\[ \left( \int_a^b f(x)g(x)\,dx \right)^2 \le \left( \int_a^b f(x)^2\,dx \right) \left( \int_a^b g(x)^2\,dx \right). \]

4.4 The Fundamental Theorem of Calculus

In this section, we prove the fundamental theorem of calculus, which gives a
relation between integration and differentiation. It also provides a useful method
to compute integrals of certain functions. We first prove a few results about
integrals.
Given that a function f : [a, b] → R is bounded and Riemann integrable on
[a, b], it is Riemann integrable on any interval [c, d] that is contained in the interval
[a, b]. Thus, we can define a new function F : [a, b] → R by

\[ F(x) = \int_a^x f(u)\,du. \]

By definition, F (a) = 0. For any c and d in [a, b], one can check that

\[ F(d) - F(c) = \int_a^d f(u)\,du - \int_a^c f(u)\,du = \int_c^d f(u)\,du. \]

The following theorem says that F : [a, b] → R is a continuous function.

Theorem 4.25
Let f : [a, b] → R be a bounded function that is Riemann integrable, and
let F : [a, b] → R be the function defined by
\[ F(x) = \int_a^x f(u)\,du. \]

Then F : [a, b] → R is a Lipschitz function, and hence it is continuous.

Proof
It is sufficient to prove that F : [a, b] → R is Lipschitz; the continuity
then follows. Since f : [a, b] → R is bounded, there is a positive constant M
such that

|f (x)| ≤ M for all x ∈ [a, b].

For any $x_1$ and $x_2$ in [a, b] with $x_1 < x_2$, we have

\[ F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(u)\,du. \]

Therefore,

\[
\begin{aligned}
|F(x_2) - F(x_1)| &= \left| \int_{x_1}^{x_2} f(u)\,du \right| \le \int_{x_1}^{x_2} |f(u)|\,du \\
&\le \int_{x_1}^{x_2} M\,du = M(x_2 - x_1) = M|x_2 - x_1|.
\end{aligned}
\]

This proves that F : [a, b] → R is a Lipschitz function with Lipschitz
constant M .

The next is a mean value theorem for integrals.

Theorem 4.26 Mean Value Theorem for Integrals

Let f : [a, b] → R be a continuous function. Then there exists c in [a, b]
such that

\[ \frac{1}{b-a} \int_a^b f(x)\,dx = f(c). \]

This theorem is known as the mean value theorem since

\[ \frac{1}{b-a} \int_a^b f(x)\,dx \]

can be interpreted as the average of the values of f over the interval [a, b].

Proof
Since f : [a, b] → R is a continuous function, the extreme value theorem
says that there are points u and v in [a, b] such that

f (u) ≤ f (x) ≤ f (v) for all x ∈ [a, b].

This implies that

\[ f(u)(b - a) \le \int_a^b f(x)\,dx \le f(v)(b - a). \]

Therefore, the number

\[ w = \frac{1}{b-a} \int_a^b f(x)\,dx \]

satisfies

\[ f(u) \le w \le f(v). \]

By the intermediate value theorem, there is a point c in [a, b] such that
f (c) = w. This gives

\[ \frac{1}{b-a} \int_a^b f(x)\,dx = f(c). \]
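The proof above can be illustrated numerically. The following is a minimal sketch, not part of the original text: it takes the sample function f(x) = x² on [0, 3] (our choice), approximates the average value w by a midpoint rule, and then locates a point c with f(c) = w by bisection. The helper names `midpoint_integral` and `mean_value_point` are hypothetical, and the bisection step assumes that f is increasing on the interval.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def mean_value_point(f, a, b, tol=1e-12):
    """Return (c, w) with f(c) close to w, the average value of f on [a, b].

    Assumption for this sketch: f is continuous and increasing on [a, b],
    so bisection on f(x) - w locates the point c.
    """
    w = midpoint_integral(f, a, b) / (b - a)
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < w:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2, w


c, w = mean_value_point(lambda x: x * x, 0.0, 3.0)
print(c, w)  # c should be close to sqrt(3), w close to 3
```

For this f, the exact average is 9/3 = 3 and the point is c = √3, which lies in the open interval (0, 3).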

In fact, one can argue that the number c can be chosen to be in (a, b).

Example 4.18

Let f : [a, b] → R be a continuous function. If f (x) ≥ m for all x ∈ [a, b]
and

\[ \int_a^b f(x)\,dx = m(b - a), \]

prove that

f (x) = m for all x ∈ [a, b].

Solution
Let g : [a, b] → R be the function g(x) = f (x) − m. Then g : [a, b] → R is
a continuous function and g(x) ≥ 0 for all x ∈ [a, b]. Moreover,

\[ \int_a^b g(x)\,dx = \int_a^b f(x)\,dx - \int_a^b m\,dx = 0. \]

We want to show that g(x) = 0 for all x ∈ [a, b]. Suppose to the contrary
that there is a point $x_0$ in [a, b] such that $g(x_0) \neq 0$. Then $g(x_0) > 0$. Since
g is continuous, there exists a δ > 0 such that for all
$x \in (x_0 - \delta, x_0 + \delta) \cap (a, b)$,

\[ g(x) > \frac{g(x_0)}{2}. \]

Without loss of generality, assume that $\delta < \dfrac{b-a}{2}$. Then either $x_0 - \delta > a$
or $x_0 + \delta < b$. In any case, $(x_0 - \delta, x_0 + \delta) \cap (a, b) = (c, d)$ is an interval
of length at least δ. But then

\[
\begin{aligned}
\int_a^b g(x)\,dx &= \int_a^c g(x)\,dx + \int_c^d g(x)\,dx + \int_d^b g(x)\,dx \\
&\ge 0 \times (c - a) + \frac{g(x_0)}{2}(d - c) + 0 \times (b - d) \\
&= \frac{g(x_0)}{2}(d - c) > 0,
\end{aligned}
\]

which is a contradiction. Therefore, we must have g(x) = 0 for all x ∈
[a, b].

Remark 4.6
In the mean value theorem for integrals, we can strengthen the theorem to
have the point c being a point in the open interval (a, b). In the proof, we
have shown that for $w = \dfrac{1}{b-a} \displaystyle\int_a^b f(x)\,dx$,

\[ f(u) \le w \le f(v). \]

Here f (u) is the minimum value of f : [a, b] → R, and f (v) is the
maximum value. By Example 4.18, if w = f (u), then f is a constant.
In this case, we can take c to be any point in (a, b). If w = f (v), the same
reasoning as in Example 4.18 also shows that f is a constant, and so c also
can be any point in (a, b). If w ≠ f (u) and w ≠ f (v), then c is a point
strictly between u and v, and thus it is strictly between a and b.

Now we turn to the fundamental theorem of calculus. Consider the case that an
object is moving with speed v(t) at time t. To find s(t), the distance travelled up to
time t, we can partition the time interval [0, t] into a finite number of subintervals
$[0, t_1], [t_1, t_2], \ldots, [t_{k-1}, t_k]$, where $t_k = t$. For each time interval $[t_{i-1}, t_i]$, where
1 ≤ i ≤ k, take a point $t_i^* \in [t_{i-1}, t_i]$, and approximate the average speed over the
time interval $[t_{i-1}, t_i]$ by the speed at time $t_i^*$, $v(t_i^*)$. Then the distance travelled up
to time t is approximately

\[ \sum_{i=1}^{k} v(t_i^*)(t_i - t_{i-1}). \]

We recognize that this is a Riemann sum of the speed function v(t). The distance
travelled s(t) should be calculated as the limit where the gap of the partition goes
to zero. In other words,

\[ s(t) = \int_0^t v(\tau)\,d\tau. \]

In Chapter 3, we have motivated that v(t) = s′ (t). Hence,

\[ s(t) = \int_0^t s'(\tau)\,d\tau, \]

which means that differentiation and integration are inverse processes of each
other. The fundamental theorem of calculus gives a rigorous setting for this.

Theorem 4.27 Fundamental Theorem of Calculus I


Let f : [a, b] → R be a bounded function that is Riemann integrable, and
let F : [a, b] → R be the function defined by

\[ F(x) = \int_a^x f(u)\,du. \]

If $x_0$ is a point in (a, b), and f is continuous at $x_0$, then F is differentiable
at $x_0$ and

\[ F'(x_0) = f(x_0). \]

Proof of Fundamental Theorem of Calculus I


We need to show that the limit

\[ \lim_{h\to 0} \frac{F(x_0 + h) - F(x_0)}{h} \]

exists and is equal to $f(x_0)$. Given ε > 0, since f is continuous at $x_0$, there
is a δ > 0 such that $(x_0 - \delta, x_0 + \delta) \subset (a, b)$ and

\[ |f(x) - f(x_0)| < \frac{\varepsilon}{2} \quad \text{for all } x \in (x_0 - \delta, x_0 + \delta). \tag{4.6} \]

For h ∈ (−δ, δ),

\[
\begin{aligned}
F(x_0 + h) - F(x_0) - f(x_0)h &= \int_{x_0}^{x_0+h} f(u)\,du - f(x_0)h \\
&= \int_{x_0}^{x_0+h} \left( f(u) - f(x_0) \right) du.
\end{aligned}
\]

Eq. (4.6) implies that

\[ |F(x_0 + h) - F(x_0) - f(x_0)h| \le \frac{\varepsilon}{2}|h|. \]

Hence, if h ∈ (−δ, δ) \ {0},

\[ \left| \frac{F(x_0 + h) - F(x_0)}{h} - f(x_0) \right| \le \frac{\varepsilon}{2} < \varepsilon. \]

This proves that

\[ \lim_{h\to 0} \frac{F(x_0 + h) - F(x_0)}{h} = f(x_0). \]

In fact, the point x0 can be a or b if we consider one-sided derivatives.

Example 4.19

Consider the piecewise continuous function f : [0, 2] → R given by

\[ f(x) = \begin{cases} -1, & \text{if } 0 \le x < 1, \\ 2, & \text{if } 1 \le x < 2, \\ 1, & \text{if } x = 2. \end{cases} \]

We find that

\[ F(x) = \int_0^x f(u)\,du = \begin{cases} -x, & \text{if } 0 \le x < 1, \\ 2x - 3, & \text{if } 1 \le x \le 2. \end{cases} \]

The function F : [0, 2] → R is continuous. For x ∈ (0, 1), F is
differentiable and F ′ (x) = −1 = f (x). For x ∈ (1, 2), F is differentiable
and F ′ (x) = 2 = f (x). However, F is not differentiable at x = 1 since

\[ \lim_{x\to 1^-} \frac{F(x) - F(1)}{x - 1} = -1, \qquad \lim_{x\to 1^+} \frac{F(x) - F(1)}{x - 1} = 2. \]

Figure 4.11: The function f : [0, 2] → R and F : [0, 2] → R in Example 4.19.
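As a quick sanity check of Example 4.19, the one-sided difference quotients of F at x = 1 can be evaluated numerically. This is only an illustrative sketch; the function names are ours, and F is coded directly from the closed form found in the example.

```python
def F(x):
    """F(x) = integral of f from 0 to x, using the closed form of Example 4.19."""
    return -x if x < 1 else 2 * x - 3


def difference_quotient(x, h):
    """(F(x + h) - F(x)) / h for a nonzero step h."""
    return (F(x + h) - F(x)) / h


for h in (1e-1, 1e-3, 1e-6):
    left = difference_quotient(1.0, -h)   # approach x = 1 from the left
    right = difference_quotient(1.0, h)   # approach x = 1 from the right
    print(h, left, right)
```

The left quotients stay near −1 and the right quotients near 2, matching the two one-sided limits, so no single derivative exists at x = 1.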

Example 4.20
Evaluate the following derivatives.

(a) $\dfrac{d}{dx} \displaystyle\int_0^x \sin(u^2)\,du$

(b) $\dfrac{d}{dx} \displaystyle\int_x^1 \sin(u^2)\,du$

(c) $\dfrac{d}{dx} \displaystyle\int_x^{x^2} \sin(u^2)\,du$

Solution
The function f : R → R, $f(x) = \sin(x^2)$ is continuous. Hence, it is
Riemann integrable over any closed and bounded interval. Let

\[ F(x) = \int_0^x f(u)\,du = \int_0^x \sin(u^2)\,du. \]

By the fundamental theorem of calculus I, $F'(x) = f(x) = \sin(x^2)$.

(a) $\dfrac{d}{dx} \displaystyle\int_0^x \sin(u^2)\,du = \dfrac{d}{dx} F(x) = \sin(x^2)$.

(b) $\dfrac{d}{dx} \displaystyle\int_x^1 \sin(u^2)\,du = \dfrac{d}{dx} \left( F(1) - F(x) \right) = -F'(x) = -\sin(x^2)$.

(c) $\dfrac{d}{dx} \displaystyle\int_x^{x^2} \sin(u^2)\,du = \dfrac{d}{dx} \left( F(x^2) - F(x) \right) = 2xF'(x^2) - F'(x) = 2x\sin(x^4) - \sin(x^2)$.
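The answer to part (c) can be cross-checked numerically. The sketch below is an illustration under our own choices (a composite Simpson rule, a central difference with step 1e-5, and the sample point x = 1.3); the names `simpson`, `G` and `G_prime_formula` are hypothetical helpers, not notation from the text.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def G(x):
    """G(x) = integral of sin(u^2) from x to x^2, computed numerically."""
    return simpson(lambda u: math.sin(u * u), x, x * x)


def G_prime_formula(x):
    """The derivative obtained in part (c): 2x sin(x^4) - sin(x^2)."""
    return 2 * x * math.sin(x ** 4) - math.sin(x ** 2)


x, h = 1.3, 1e-5
numeric = (G(x + h) - G(x - h)) / (2 * h)  # central difference quotient
print(numeric, G_prime_formula(x))
```

The two printed numbers should agree to several decimal places, which is consistent with the chain rule computation above.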

Now we turn to the second fundamental theorem of calculus, which provides
a means for calculating integrals of continuous functions.

Theorem 4.28 Fundamental Theorem of Calculus II


Let F : [a, b] → R be a continuous function, and let f : [a, b] → R be a
bounded function that is continuous on (a, b). If

F ′ (x) = f (x) for all x ∈ (a, b),

then

\[ \int_a^b f(x)\,dx = F(b) - F(a). \]

Recall that if the functions F (x) and f (x) are related by

F ′ (x) = f (x) for all x ∈ (a, b),

then F is called an antiderivative of f . Hence, the fundamental theorem of calculus II
states that if the function f : (a, b) → R is continuous and it has an antiderivative
F (x) which can be extended to a continuous function F : [a, b] → R, then

\[ \int_a^b f(x)\,dx = \left[ F(x) \right]_a^b = F(b) - F(a). \]

We will present two proofs of the fundamental theorem of calculus II. The first
one uses the fundamental theorem of calculus I.

First Proof of Fundamental Theorem of Calculus II


Since f : [a, b] → R is a bounded function that is continuous on (a, b), it is
integrable over any subinterval of [a, b]. Let G : [a, b] → R be the function
defined by

\[ G(x) = \int_a^x f(u)\,du. \]

By definition, G(a) = 0. By Theorem 4.25, G is continuous on [a, b]. By
the fundamental theorem of calculus I,

G′ (x) = f (x) for all x ∈ (a, b).

Hence, F : [a, b] → R and G : [a, b] → R are continuous functions
satisfying

G′ (x) = F ′ (x) for all x ∈ (a, b).

By Theorem 3.14, there is a constant C such that

G(x) = F (x) + C for all x ∈ [a, b].

Since G(a) = 0, we find that C = −F (a). Therefore,

G(x) = F (x) − F (a),

and so

\[ \int_a^b f(x)\,dx = G(b) = F(b) - F(a). \]

In different textbooks, the ordering of the two fundamental theorems of calculus
might be different. One can use one to deduce the other. This is why a proof of
the fundamental theorem of calculus II without using the fundamental theorem of
calculus I is of interest.

Second Proof of Fundamental Theorem of Calculus II


Here we use the Lagrange mean value theorem. Since f : [a, b] → R is
a bounded function that is continuous on (a, b), it is integrable. Now if
$P = \{x_i\}_{i=0}^{k}$ is a partition of [a, b], for each 1 ≤ i ≤ k, the mean value
theorem implies that there is a $\xi_i \in (x_{i-1}, x_i)$ such that

\[ F(x_i) - F(x_{i-1}) = F'(\xi_i)(x_i - x_{i-1}) = f(\xi_i)(x_i - x_{i-1}). \]

Summing over i gives

\[ F(b) - F(a) = \sum_{i=1}^{k} \left( F(x_i) - F(x_{i-1}) \right) = \sum_{i=1}^{k} f(\xi_i)(x_i - x_{i-1}) = R(f, P, A), \]

where $A = \{\xi_i\}_{i=1}^{k}$. Since

\[ L(f, P) \le R(f, P, A) \le U(f, P), \]

we find that

\[ L(f, P) \le F(b) - F(a) \le U(f, P). \]

Notice that this is true for any partition P of [a, b]. By definitions of the
lower integral and the upper integral, we find that

\[ \underline{\int_a^b} f \le F(b) - F(a) \le \overline{\int_a^b} f. \]

Since f : [a, b] → R is Riemann integrable, the lower integral and the upper
integral are the same. Thus,

\[ \int_a^b f = \underline{\int_a^b} f = \overline{\int_a^b} f = F(b) - F(a). \]
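The key step of the second proof, that the Riemann sum taken at the mean value points equals F(b) − F(a) for every partition, can be seen on a concrete case. The sketch below is an illustration with our own choice F(x) = x³ and f(x) = 3x² on [0, 2]; for this F, the mean value point on a subinterval $[x_{i-1}, x_i]$ with $x_{i-1} \ge 0$ can be written in closed form. The function names are hypothetical.

```python
import math


def mvt_point(x_prev, x_next):
    """Point xi in (x_prev, x_next) with 3*xi^2 = (x_next^3 - x_prev^3) / (x_next - x_prev).

    Assumption for this sketch: 0 <= x_prev < x_next, so the positive
    square root lies strictly between the two endpoints.
    """
    return math.sqrt((x_next ** 2 + x_next * x_prev + x_prev ** 2) / 3)


def riemann_sum_at_mvt_points(partition):
    """Riemann sum of f(x) = 3x^2 with the mean value points as intermediate points."""
    f = lambda x: 3 * x * x
    return sum(f(mvt_point(p, q)) * (q - p) for p, q in zip(partition, partition[1:]))


partition = [0.0, 0.1, 0.5, 0.9, 1.4, 2.0]  # an arbitrary, non-regular partition of [0, 2]
print(riemann_sum_at_mvt_points(partition))
```

Up to rounding, the printed value is F(2) − F(0) = 8, even though the partition is coarse; the sum telescopes exactly as in the proof.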

We can relax the conditions in the fundamental theorem of calculus II to allow f
to be a piecewise continuous function.

Corollary 4.29 Generalized Fundamental Theorem of Calculus II

Let $S = \{a_0, a_1, \ldots, a_k\}$ be a finite subset of [a, b] that contains a and b,
and let f : [a, b] → R be a bounded function that is continuous on [a, b] \ S.
If F : [a, b] → R is a continuous function, differentiable on [a, b] \ S, and

F ′ (x) = f (x) for all x ∈ [a, b] \ S,

then

\[ \int_a^b f(x)\,dx = F(b) - F(a). \]

Proof
We can assume that

\[ a = a_0 < a_1 < \cdots < a_k = b. \]

Since f : [a, b] → R is bounded and piecewise continuous, it is Riemann
integrable. Moreover, by the generalized additivity theorem, we have

\[ \int_a^b f(x)\,dx = \sum_{i=1}^{k} \int_{a_{i-1}}^{a_i} f(x)\,dx. \]

Applying the fundamental theorem of calculus II to each of the integrals
$\int_{a_{i-1}}^{a_i} f(x)\,dx$, we find that

\[ \int_a^b f(x)\,dx = \sum_{i=1}^{k} \left( F(a_i) - F(a_{i-1}) \right) = F(b) - F(a). \]

This completes the proof. Note that it is crucial here that F is continuous
on [a, b].

As is well known, the fundamental theorem of calculus provides a practical
method for computing integrals of functions that have antiderivatives.

Example 4.21

Compute the integral of the piecewise continuous function f : [−1, 2] → R,

\[ f(x) = \begin{cases} 2 - x, & \text{if } -1 \le x < 0, \\ x^2, & \text{if } 0 \le x \le 2, \end{cases} \]

that is defined in Example 4.15.

Solution
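Using additivity,

\[ \int_{-1}^{2} f(x)\,dx = \int_{-1}^{0} f(x)\,dx + \int_{0}^{2} f(x)\,dx. \]

Using the fundamental theorem of calculus II,

\[ \int_{-1}^{0} f(x)\,dx = \int_{-1}^{0} (2 - x)\,dx = \left[ 2x - \frac{x^2}{2} \right]_{-1}^{0} = 0 - \left( -\frac{5}{2} \right) = \frac{5}{2}, \]

\[ \int_{0}^{2} f(x)\,dx = \int_{0}^{2} x^2\,dx = \left[ \frac{x^3}{3} \right]_{0}^{2} = \frac{8}{3} - 0 = \frac{8}{3}. \]

Hence,

\[ \int_{-1}^{2} f(x)\,dx = \frac{5}{2} + \frac{8}{3} = \frac{31}{6}. \]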
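The arithmetic in Example 4.21 can be double-checked in two independent ways. The following sketch (our own illustration, with hypothetical helper names) evaluates F(b) − F(a) on each piece in exact rational arithmetic, and separately approximates the integral with a midpoint rule that does not use any antiderivative.

```python
from fractions import Fraction


def f(x):
    """The piecewise function of Example 4.21."""
    return 2 - x if x < 0 else x * x


def exact_value():
    """Sum of F(b) - F(a) over the two pieces, in exact rational arithmetic."""
    left = lambda x: 2 * x - x * x / 2   # an antiderivative of 2 - x
    right = lambda x: x ** 3 / 3         # an antiderivative of x^2
    first = left(Fraction(0)) - left(Fraction(-1))
    second = right(Fraction(2)) - right(Fraction(0))
    return first + second


def midpoint_value(n=30_000):
    """Midpoint rule on [-1, 2]; n is a multiple of 3 so that 0 is a partition point."""
    h = 3 / n
    return h * sum(f(-1 + (i + 0.5) * h) for i in range(n))


print(exact_value(), midpoint_value())
```

The exact value is the fraction 31/6, and the midpoint approximation should agree with it to several decimal places.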

Remark 4.7 Alternative Proof of Mean Value Theorem for Integrals


Using the fundamental theorem of calculus, we can give an alternative proof
of the mean value theorem for integrals as follows. Since the function f :
[a, b] → R is continuous, the function F : [a, b] → R defined by

\[ F(x) = \int_a^x f(u)\,du \]

is continuous on [a, b], differentiable on (a, b), and F ′ (x) = f (x) for all
x ∈ (a, b). By the Lagrange mean value theorem, there is a c ∈ (a, b) such that

\[ \frac{1}{b-a} \int_a^b f(x)\,dx = \frac{F(b) - F(a)}{b - a} = F'(c) = f(c). \]

Finally, we can prove the existence and uniqueness theorem mentioned in
Chapter 3, Theorem 3.21.

Theorem 4.30 Existence and Uniqueness Theorem

Let (a, b) be an open interval that contains the point x0 , and let y0 be any
real number. Given that f : (a, b) → R is a continuous function, there
exists a unique differentiable function F : (a, b) → R such that

F ′ (x) = f (x) for all x ∈ (a, b), F (x0 ) = y0 .

Proof
As we mentioned before, the uniqueness follows from the identity criterion.
For the existence, notice that f is continuous on any closed and bounded
interval that is contained in (a, b). Hence, we can define the function F :
(a, b) → R by

\[ F(x) = \int_{x_0}^{x} f(u)\,du + y_0. \]

Then $F(x_0) = y_0$ by definition. By the fundamental theorem of calculus,
F ′ (x) = f (x) for all x ∈ (a, b).

Let us look at some other examples of how integrals can be applied.

Example 4.22
Find the limit

\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \sin\left( \frac{\pi k}{n} \right). \]

Solution
We try to identify

\[ \frac{1}{n} \sum_{k=1}^{n} \sin\left( \frac{\pi k}{n} \right) \]

as a Riemann sum. For 1 ≤ k ≤ n, let $\xi_k = \dfrac{\pi k}{n}$. These are equally
spaced points in the interval [0, π]. This motivates us to define the function
f : [0, π] → R, f (x) = sin x. Since f is a continuous function, it is
Riemann integrable. Let $P_n$ be the regular partition of [0, π] into n intervals.
Then with $A_n = \{\xi_k\}_{k=1}^{n}$, we have

\[ R(f, P_n, A_n) = \sum_{k=1}^{n} \sin\left( \frac{\pi k}{n} \right) \frac{\pi}{n}. \]

Since f is Riemann integrable,

\[ \lim_{n\to\infty} R(f, P_n, A_n) = \int_0^{\pi} f(x)\,dx. \]

By the fundamental theorem of calculus,

\[ \int_0^{\pi} f(x)\,dx = \int_0^{\pi} \sin x\,dx = \left[ -\cos x \right]_0^{\pi} = 2. \]

Therefore,

\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \sin\left( \frac{\pi k}{n} \right) = \frac{1}{\pi} \lim_{n\to\infty} R(f, P_n, A_n) = \frac{2}{\pi}. \]
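The limit found in Example 4.22 is easy to observe numerically. The short sketch below (an illustration only; the function name is ours) evaluates the averaged sum for increasing n and prints it next to 2/π.

```python
import math


def average_sine_sum(n):
    """(1/n) * sum of sin(pi*k/n) for k = 1, ..., n."""
    return sum(math.sin(math.pi * k / n) for k in range(1, n + 1)) / n


for n in (10, 100, 1000, 10000):
    print(n, average_sine_sum(n), 2 / math.pi)
```

The second column approaches the third as n grows, as expected of Riemann sums over regular partitions whose gap π/n tends to zero.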

Exercises 4.4
Question 1
Evaluate the following derivatives.

(a) $\dfrac{d}{dx} \displaystyle\int_0^x e^{u^2}\,du$

(b) $\dfrac{d}{dx} \displaystyle\int_x^1 \cos(u^2)\,du$

(c) $\dfrac{d}{dx} \displaystyle\int_x^{x^3} \sqrt{2 + \sin u}\,du$

Question 2
Let f : [−2, 6] → R be the function defined by

\[ f(x) = \begin{cases} x^2 - x, & \text{if } -2 \le x < 1, \\ x - \dfrac{1}{x}, & \text{if } 1 \le x \le 6. \end{cases} \]

Find a continuous function F : [−2, 6] → R such that F is differentiable
on (−2, 6), F (0) = 0, and

F ′ (x) = f (x) for all x ∈ (−2, 6).

Question 3
Find the limit

\[ \lim_{n\to\infty} \frac{1^7 + 2^7 + \cdots + n^7}{n^8}. \]

Question 4
Find the limit

\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \cos^2\left( \frac{2\pi k}{n} \right). \]

4.5 Integration by Substitution and Integration by Parts

In this section, we prove the integration by substitution formula and the integration
by parts formula. We will only deal with the case where the function that we are
integrating is continuous in the interior of the integration interval. For the general case
where the function is piecewise continuous, one can apply the additivity theorem.

4.5.1 Integration by Substitution

Theorem 4.31 Integration by Substitution

Let g : [a, b] → R be a function that satisfies the following conditions:

(i) g is continuous and one-to-one on [a, b];

(ii) g is continuously differentiable on (a, b);

(iii) g ′ (x) is bounded on (a, b).

Then g maps the interval [a, b] onto a closed and bounded interval [c, d] with
end points g(a) and g(b). If f : [c, d] → R is a function that is bounded and
continuous on (c, d), then the function h : [a, b] → R,

h(x) = f (g(x))g ′ (x)

is Riemann integrable and

\[ \int_a^b h(x)\,dx = \int_a^b f(g(x))g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du. \tag{4.7} \]

This is equivalent to

\[ \int_c^d f(u)\,du = \int_a^b f(g(x))|g'(x)|\,dx. \tag{4.8} \]

The function g : [a, b] → R that satisfies all the three given conditions defines
a smooth change of variables u = g(x) from x to u, in the sense that g is
continuously differentiable on (a, b).

Proof
Since g is continuous and one-to-one, we have g((a, b)) ⊂ (c, d). Therefore,
the function h : [a, b] → R, h(x) = f (g(x))g ′ (x) is continuous and bounded
on (a, b), and hence, it is Riemann integrable. For any x ∈ [a, b] and any
y ∈ [c, d], let

\[
\begin{aligned}
H_1(x) &= \int_a^x h(u)\,du = \int_a^x f(g(u))g'(u)\,du, \\
F(y) &= \int_c^y f(u)\,du, \\
H_2(x) &= \int_{g(a)}^{g(x)} f(u)\,du = F(g(x)) - F(g(a)).
\end{aligned}
\]

Then $H_1(a) = H_2(a) = 0$. By the fundamental theorem of calculus and the
chain rule, $H_1$ and $H_2$ are differentiable on (a, b), and for any x ∈ (a, b),

\[ H_1'(x) = h(x) = f(g(x))g'(x), \qquad H_2'(x) = f(g(x))g'(x). \]

Since $H_1'(x) = H_2'(x)$ for all x ∈ (a, b), and $H_1(a) = H_2(a)$, we conclude
that $H_1(x) = H_2(x)$ for all x ∈ [a, b]. Namely,

\[ \int_a^b f(g(x))g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du. \]

From this, we see that integration by substitution is just the inverse of the
chain rule for differentiation. To prove the equivalence of (4.7) and (4.8),
we consider two cases.
Case I: g is strictly increasing on [a, b].
In this case, g ′ (x) ≥ 0, and c = g(a), d = g(b). So (4.7) is equivalent to
(4.8).
Case II: g is strictly decreasing on [a, b].
In this case, g ′ (x) ≤ 0, g(a) = d and g(b) = c. Therefore,

\[ \int_a^b f(g(x))|g'(x)|\,dx = -\int_a^b f(g(x))g'(x)\,dx \]

and

\[ \int_{g(a)}^{g(b)} f(u)\,du = \int_d^c f(u)\,du = -\int_c^d f(u)\,du. \]

Thus, (4.7) and (4.8) are equivalent.

If we impose the condition that f is continuous at the boundary points c and d,
the condition that g is one-to-one can be removed. The points g(a) and g(b) might
not be the boundary points of the interval J = g([a, b]), but the proof still holds.

Theorem 4.32 General Integration by Substitution

Let g : [a, b] → R be a function that satisfies the following conditions:

(i) g is continuous on [a, b];

(ii) g is continuously differentiable on (a, b);

(iii) g ′ (x) is bounded on (a, b).

Then g maps [a, b] to a closed and bounded interval J. If f : J → R is a
continuous function, then the function h : [a, b] → R,

h(x) = f (g(x))g ′ (x)

is Riemann integrable and

\[ \int_a^b h(x)\,dx = \int_a^b f(g(x))g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du. \]

Example 4.23
Evaluate the integral $\displaystyle\int_{-2}^{3} x\sqrt{16 + x^2}\,dx$.

Solution

Let $f(x) = \sqrt{x}$ and $g(x) = 16 + x^2$. The function g is continuously
differentiable, with g ′ (x) = 2x, and it maps the interval [−2, 3] onto
the interval [16, 25]. However, it is not one-to-one. The function f is
continuous on [16, 25], so we can apply integration by substitution
(Theorem 4.32). In practice, we will do the substitution by letting
$u = 16 + x^2$, and find that

\[ \frac{du}{dx} = 2x. \]

This implies that we can replace x dx by du/2. When x = −2, u = 20;
when x = 3, u = 25. Thus,

\[ \int_{-2}^{3} x\sqrt{16 + x^2}\,dx = \frac{1}{2} \int_{20}^{25} \sqrt{u}\,du = \frac{1}{3} \left[ u^{3/2} \right]_{20}^{25} = \frac{125 - 20\sqrt{20}}{3}. \]

Students are invited to split the integral into a sum of two integrals, one
over the interval [−2, 0], and one over the interval [0, 3]. The function g(x)
is one-to-one on each of these two intervals. Check that the same answer is
obtained.
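The value obtained in Example 4.23, and the suggestion to split the integral at 0, can both be checked numerically. This is a rough sketch under our own choices (a composite Simpson rule with 2000 subintervals; the helper names are hypothetical), not a replacement for the exact computation.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def integrand(x):
    """The integrand x * sqrt(16 + x^2) of Example 4.23."""
    return x * math.sqrt(16 + x * x)


closed_form = (125 - 20 * math.sqrt(20)) / 3
whole = simpson(integrand, -2.0, 3.0)
split = simpson(integrand, -2.0, 0.0) + simpson(integrand, 0.0, 3.0)
print(closed_form, whole, split)
```

All three printed numbers should coincide to many decimal places, which supports both the substitution and the split into [−2, 0] and [0, 3].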

As we mentioned before, if the change of variables is given by a one-to-one
function u = g(x), the function f does not need to be continuous at the boundary
points. Using the additivity theorem, Theorem 4.31 still holds when f is a bounded
piecewise continuous function.

Example 4.24

Let a be a positive number, and let f : [0, a] → R be a piecewise continuous
function that is bounded. Show that

\[ \int_0^a f(x)\,dx = \int_0^a f(a - x)\,dx. \]

Solution
We consider the change of variables u = g(x) = a − x. This is a strictly
monotonic function with g ′ (x) = −1. Therefore, du = −dx. When x = 0,
u = a; when x = a, u = 0. Hence,

\[ \int_0^a f(x)\,dx = \int_a^0 f(a - u)(-du) = \int_0^a f(a - x)\,dx. \]

Example 4.25

Let a be a positive number, and let f : [−a, a] → R be a piecewise
continuous function that is bounded.

(a) If f is an even function, show that

\[ \int_{-a}^{a} f(x)\,dx = 2 \int_0^a f(x)\,dx. \]

(b) If f is an odd function, show that

\[ \int_{-a}^{a} f(x)\,dx = 0. \]

Solution
Notice that

\[ \int_{-a}^{a} f(x)\,dx = \int_{-a}^{0} f(x)\,dx + \int_0^a f(x)\,dx. \]

For the integral $\int_{-a}^{0} f(x)\,dx$, we consider the change of variables
u = g(x) = −x. This is a strictly monotonic function with g ′ (x) = −1.
Therefore, du = −dx. When x = −a, u = a; when x = 0, u = 0.
Hence,

\[ \int_{-a}^{0} f(x)\,dx = \int_a^0 f(-u)(-du) = \int_0^a f(-x)\,dx. \]

(a) When f is an even function, f (−x) = f (x) for all x ∈ [0, a].
Therefore,

\[ \int_{-a}^{a} f(x)\,dx = \int_0^a f(x)\,dx + \int_0^a f(x)\,dx = 2 \int_0^a f(x)\,dx. \]

(b) When f is an odd function, f (−x) = −f (x) for all x ∈ [0, a].
Therefore,

\[ \int_{-a}^{a} f(x)\,dx = -\int_0^a f(x)\,dx + \int_0^a f(x)\,dx = 0. \]

Figure 4.12: An even function.

Figure 4.13: An odd function.

Example 4.26 Area of a Circle


Find the area of a circle of radius r.

Solution
A circle of radius r with center at the origin has equation $x^2 + y^2 = r^2$.
By symmetry, it is enough for us to find the area in the first quadrant, and
then multiply by 4. The sector in the first quadrant is bounded by the curve
$y = \sqrt{r^2 - x^2}$, the lines x = 0, x = r, and the x-axis. Hence, the area of a
circle of radius r is

\[ A = 4 \int_0^r \sqrt{r^2 - x^2}\,dx. \]

Making a change of variables x = r sin θ, we find that

\[ \frac{dx}{d\theta} = r\cos\theta. \]

When x = 0, θ = 0; when x = r, θ = π/2. Therefore,

\[
\begin{aligned}
A &= 4 \int_0^{\pi/2} \sqrt{r^2 - r^2\sin^2\theta}\; r\cos\theta\,d\theta \\
&= 4r^2 \int_0^{\pi/2} \cos^2\theta\,d\theta.
\end{aligned}
\]

Using the formula

\[ \cos^2\theta = \frac{1 + \cos 2\theta}{2}, \]

we have

\[
\begin{aligned}
A &= 2r^2 \int_0^{\pi/2} (1 + \cos 2\theta)\,d\theta \\
&= 2r^2 \left[ \theta + \frac{\sin 2\theta}{2} \right]_0^{\pi/2} \\
&= 2r^2 \times \frac{\pi}{2} \\
&= \pi r^2.
\end{aligned}
\]
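Both forms of the area integral in Example 4.26, before and after the substitution x = r sin θ, can be approximated numerically and compared with πr². The sketch below is an illustration with our own helper names and the sample radius r = 2; the x-integral uses many subintervals because its integrand has an unbounded derivative near x = r.

```python
import math


def midpoint_integral(f, a, b, n):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def area_via_x(r, n=200_000):
    """4 * integral of sqrt(r^2 - x^2) over [0, r]."""
    return 4 * midpoint_integral(lambda x: math.sqrt(r * r - x * x), 0.0, r, n)


def area_via_theta(r, n=1_000):
    """4 r^2 * integral of cos^2(theta) over [0, pi/2], after x = r sin(theta)."""
    return 4 * r * r * midpoint_integral(lambda t: math.cos(t) ** 2, 0.0, math.pi / 2, n)


r = 2.0
print(area_via_x(r), area_via_theta(r), math.pi * r * r)
```

The three values should be close to each other, with the θ-form converging with far fewer subintervals because its integrand is smooth on the whole interval.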

4.5.2 Integration by Parts

Theorem 4.33 Integration by Parts

Let f : [a, b] → R and g : [a, b] → R be functions that satisfy the following
conditions:

(i) f and g are continuous on [a, b];

(ii) f and g are continuously differentiable on (a, b);

(iii) f ′ (x) and g ′ (x) are bounded on (a, b).

Then f g ′ and gf ′ are Riemann integrable on [a, b], and

\[ \int_a^b f(x)g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b g(x)f'(x)\,dx. \]

Proof
Since f and g are continuous on [a, b], they are bounded. Since f ′ (x) and
g ′ (x) are continuous and bounded on (a, b), f g ′ and f ′ g are continuous and
bounded on (a, b). Therefore, f g ′ and gf ′ are Riemann integrable on [a, b].
By the product rule, for any x ∈ (a, b),

(f g)′ (x) = f (x)g ′ (x) + g(x)f ′ (x).

So (f g)′ is also bounded and continuous on (a, b), and hence Riemann
integrable on [a, b]. Since f g is also continuous on [a, b], we can apply
the fundamental theorem of calculus, which gives

\[ \int_a^b (fg)'(x)\,dx = (fg)(b) - (fg)(a). \]

Therefore,

\[ \int_a^b f(x)g'(x)\,dx + \int_a^b g(x)f'(x)\,dx = f(b)g(b) - f(a)g(a). \]

This proves the integration by parts formula.

In a nutshell, the integration by parts formula is just the inverse of the product
rule of differentiation. But it is a very useful integration technique.

Integration by Parts
The integration by parts formula is often expressed as

\[ \int u\,dv = uv - \int v\,du. \]

In practice, we identify which part should be u and which part should be
dv. The function v is defined up to a constant. One can verify directly that
if v is replaced by v + C, where C is a constant, the right hand side of the
formula is not changed. Hence, we can choose a v that is most convenient.

Example 4.27
Let n be a positive integer. Evaluate the integral

\[ \int_1^e \frac{\ln x}{x^n}\,dx. \]

Solution
If n = 1, we use integration by substitution with u = ln x. Then

\[ \frac{du}{dx} = \frac{1}{x}. \]

When x = 1, u = 0; when x = e, u = 1. Therefore,

\[ \int_1^e \frac{\ln x}{x}\,dx = \int_0^1 u\,du = \left[ \frac{u^2}{2} \right]_0^1 = \frac{1}{2}. \]

If n ≥ 2, we use integration by parts. Let

\[ u(x) = \ln x, \qquad v'(x) = \frac{1}{x^n}. \]

Then

\[ \frac{du}{dx} = \frac{1}{x}, \qquad v(x) = -\frac{1}{n-1} \times \frac{1}{x^{n-1}}. \]

Both u(x) and v(x) are continuously differentiable functions on (0, ∞).
Therefore,

\[
\begin{aligned}
\int_1^e \frac{\ln x}{x^n}\,dx &= \left[ -\frac{1}{n-1} \times \frac{\ln x}{x^{n-1}} \right]_1^e + \frac{1}{n-1} \int_1^e \frac{1}{x^n}\,dx \\
&= -\frac{1}{n-1} \times \frac{1}{e^{n-1}} - \frac{1}{(n-1)^2} \left[ \frac{1}{x^{n-1}} \right]_1^e \\
&= \frac{1}{(n-1)^2} - \frac{n}{(n-1)^2 e^{n-1}}.
\end{aligned}
\]
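The closed forms in Example 4.27 can be compared against a numerical integration for several values of n. The following sketch is only an illustration; the Simpson helper and the names `closed_form` and `numeric` are ours.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def closed_form(n):
    """Value of the integral of ln(x)/x^n over [1, e] from Example 4.27."""
    if n == 1:
        return 0.5
    return 1 / (n - 1) ** 2 - n / ((n - 1) ** 2 * math.e ** (n - 1))


def numeric(n):
    """The same integral, approximated by Simpson's rule."""
    return simpson(lambda x: math.log(x) / x ** n, 1.0, math.e)


for n in range(1, 6):
    print(n, closed_form(n), numeric(n))
```

For each n from 1 to 5, the two columns should agree to many decimal places, including the separate case n = 1 handled by substitution.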

Example 4.28
Let I be an open interval that contains the point $x_0$, and let f : I → R
be a continuous function. Given a positive integer n, define the function
F : I → R by

\[ F(x) = \frac{1}{n!} \int_{x_0}^{x} (x - t)^n f(t)\,dt. \]

Prove that F is (n + 1) times continuously differentiable,

\[ F(x_0) = F'(x_0) = \cdots = F^{(n)}(x_0) = 0, \]

and

\[ F^{(n+1)}(x) = f(x) \quad \text{for all } x \in I. \]

Solution
Define the function g : I → R by

\[ g(x) = \int_{x_0}^{x} f(t)\,dt. \]

Then $g(x_0) = 0$, and by the fundamental theorem of calculus,

g ′ (x) = f (x) for all x ∈ I.

Now we prove the statement by induction on n. When n = 1,

\[ F(x) = \int_{x_0}^{x} (x - t)f(t)\,dt. \]

By definition, $F(x_0) = 0$. For a fixed x, using integration by parts with
u(t) = x − t and v ′ (t) = f (t), we find that

\[ \frac{du}{dt} = -1, \qquad v(t) = g(t). \]

It follows that

\[ F(x) = \Big[ (x - t)g(t) \Big]_{t=x_0}^{t=x} + \int_{x_0}^{x} g(t)\,dt = \int_{x_0}^{x} g(t)\,dt. \]

Here the boundary term vanishes because x − t = 0 when t = x, and
$g(x_0) = 0$. Notice that g(t) is continuously differentiable, and hence it is
continuous. By the fundamental theorem of calculus,

F ′ (x) = g(x) for all x ∈ I.

Therefore, $F'(x_0) = g(x_0) = 0$, and

F ′′ (x) = g ′ (x) = f (x) for all x ∈ I.

This proves that F (x) is twice continuously differentiable. Since we have
also shown that $F(x_0) = F'(x_0) = 0$, and F ′′ (x) = f (x) for all x ∈ I, the
statement is true when n = 1.
Assume that we have proved the statement when n = k − 1, where k ≥ 2.
When n = k,

\[ F(x) = \frac{1}{k!} \int_{x_0}^{x} (x - t)^k f(t)\,dt. \]

For a fixed x, using integration by parts with $u(t) = (x - t)^k$ and v ′ (t) =
f (t), we find that

\[ \frac{du}{dt} = -k(x - t)^{k-1}, \qquad v(t) = g(t). \]

It follows that

\[
\begin{aligned}
F(x) &= \frac{1}{k!} \Big[ (x - t)^k g(t) \Big]_{t=x_0}^{t=x} + \frac{1}{(k-1)!} \int_{x_0}^{x} (x - t)^{k-1} g(t)\,dt \\
&= \frac{1}{(k-1)!} \int_{x_0}^{x} (x - t)^{k-1} g(t)\,dt.
\end{aligned}
\]

By the inductive hypothesis, applied with the continuous function g in place
of f , the function F (x) satisfies

\[ F(x_0) = F'(x_0) = \cdots = F^{(k-1)}(x_0) = 0, \]

and

\[ F^{(k)}(x) = g(x) \quad \text{for all } x \in I. \]

The latter implies that $F^{(k)}(x_0) = g(x_0) = 0$, and F (x) is (k + 1) times
differentiable, with

\[ F^{(k+1)}(x) = g'(x) = f(x) \]

a continuous function. Therefore, the statement also holds when n = k.
By the principle of mathematical induction, the statement is true for all positive
integers n.
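For a concrete instance of Example 4.28, take f(t) = cos t and $x_0 = 0$ (our own choice for illustration). Solving $F^{(n+1)} = \cos$ with vanishing derivatives at 0 gives F(x) = 1 − cos x when n = 1 and F(x) = x − sin x when n = 2. The sketch below compares these with a numerical evaluation of the defining integral; the helper names are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def iterated_integral(f, x0, x, n):
    """F(x) = (1/n!) * integral from x0 to x of (x - t)^n f(t) dt, by Simpson's rule."""
    return simpson(lambda t: (x - t) ** n * f(t), x0, x) / math.factorial(n)


x = 1.7
print(iterated_integral(math.cos, 0.0, x, 1), 1 - math.cos(x))
print(iterated_integral(math.cos, 0.0, x, 2), x - math.sin(x))
```

In each printed line the two numbers should agree closely, in line with the statement that F is an (n + 1)-fold antiderivative of f that vanishes to order n at $x_0$.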

Exercises 4.5
Question 1
Let f : [a, b] → R be a bounded function that is Riemann integrable. Show
that for any real number c,

\[ \int_a^b f(x)\,dx = \int_{a+c}^{b+c} f(x - c)\,dx. \]

Question 2
Explain why

\[ \int_{-1}^{1} (x + 1)e^{-x^2}\,dx = 2 \int_0^1 e^{-x^2}\,dx. \]

Question 3
Let a be a positive number. Assume that the functions f : [0, a] → R and
g : [0, a] → R are bounded and piecewise continuous. Prove that

\[ \int_0^a f(x)g(a - x)\,dx = \int_0^a f(a - x)g(x)\,dx. \]

Question 4
Let m and n be nonnegative integers. Show that

\[ \int_0^1 x^m (1 - x)^n\,dx = \frac{m!\,n!}{(m + n + 1)!}. \]

Question 5
Let f : [a, b] → R be a continuous and strictly increasing function which
maps the interval [a, b] bijectively onto the interval [c, d], where c = f (a),
and d = f (b). Denote by g : [c, d] → R the inverse function of f . Notice
that f : [a, b] → R and g : [c, d] → R are Riemann integrable. This
question is regarding the proof of the formula

\[ \int_a^b f(x)\,dx = bf(b) - af(a) - \int_c^d g(x)\,dx. \tag{4.9} \]

(a) If a > 0 and c > 0, draw a figure to illustrate the formula.

(b) If f is continuously differentiable on (a, b), use integration by
substitution with u = g(x) to prove the formula (4.9).

(c) Let $P_n = \{x_i\}_{i=0}^{n}$ be the regular partition of the interval [a, b] into n
intervals. For 0 ≤ i ≤ n, let $y_i = f(x_i)$. Then $\widetilde{P}_n = \{y_i\}_{i=0}^{n}$ is a
partition of [c, d].

(i) Show that

\[ \sum_{i=1}^{n} f(x_{i-1})(x_i - x_{i-1}) + \sum_{i=1}^{n} g(y_i)(y_i - y_{i-1}) = bf(b) - af(a). \]

(ii) Show that $\lim_{n\to\infty} |P_n| = 0$ and $\lim_{n\to\infty} |\widetilde{P}_n| = 0$. You might want to
use uniform continuity.
(iii) Use part (i) and part (ii) to prove the formula (4.9).

4.6 Improper Integrals

In this section, we want to discuss Riemann integrals for functions f : I → R
defined on an interval I, where either I is not bounded, or f is not bounded on I,
or both.

Definition 4.12 Improper Integral


Let $I$ be an interval and let $f : I \to \mathbb{R}$ be a function defined on $I$. An
integral of the form
\[ \int_I f \]
is an improper integral if either $f$ is not bounded on $I$, or $I$ is an unbounded
interval.

This is not a rigorous definition. We will only be interested in the case where
we can make sense of $\int_I f$.
As an example, Theorem 4.30 says that there exists a differentiable function
$g : (-1, 1) \to \mathbb{R}$ satisfying
\[ g'(x) = \frac{1}{\sqrt{1 - x^2}}, \qquad g(0) = 0. \]
It is given by
\[ g(x) = \int_0^x \frac{1}{\sqrt{1 - u^2}}\,du. \]
Notice that the function $f(u) = \dfrac{1}{\sqrt{1 - u^2}}$ is bounded and continuous on the
interval $[0, x]$ if $0 < x < 1$, and on $[x, 0]$ if $-1 < x < 0$. Therefore, $g(x)$ is
a well-defined Riemann integral when $-1 < x < 1$. We are interested in extending
the definition of $g(x)$ to $x = 1$ and $x = -1$. But $f$ is not bounded on $(-1, 1)$, so
we cannot define the Riemann integral of $f$ on $[0, 1]$ or $[-1, 0]$. Our study of the
function $\sin x$ shows that $g(x) = \sin^{-1} x$ when $x \in (-1, 1)$. Thus,
\[ \lim_{x \to 1^-} g(x) = \sin^{-1} 1 = \frac{\pi}{2}, \qquad \lim_{x \to -1^+} g(x) = \sin^{-1}(-1) = -\frac{\pi}{2}. \]
Hence, it is reasonable to say that the improper integrals
\[ \int_0^1 \frac{1}{\sqrt{1 - u^2}}\,du \quad \text{and} \quad \int_0^{-1} \frac{1}{\sqrt{1 - u^2}}\,du \]
have values
\[ \lim_{x \to 1^-} \int_0^x \frac{1}{\sqrt{1 - u^2}}\,du = \frac{\pi}{2} \quad \text{and} \quad \lim_{x \to -1^+} \int_0^x \frac{1}{\sqrt{1 - u^2}}\,du = -\frac{\pi}{2} \]
respectively. This is how we are going to make sense of improper integrals.
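
The limiting procedure just described can be observed numerically. The following is only a sketch under stated assumptions: the helper `midpoint_integral`, its step count, and the sample points are illustrative choices that do not appear in the text, and the midpoint rule merely stands in for the Riemann integral on $[0, x]$.

```python
import math

def midpoint_integral(f, a, b, n=100_000):
    """Approximate the Riemann integral of f on [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def g(x):
    """Approximate g(x) = integral of 1/sqrt(1 - u^2) over [0, x], for 0 < x < 1."""
    return midpoint_integral(lambda u: 1.0 / math.sqrt(1.0 - u * u), 0.0, x)

for x in (0.9, 0.99, 0.999):
    # g(x) should track arcsin(x), which approaches pi/2 as x -> 1 from the left
    print(x, g(x), math.asin(x), math.pi / 2)
```

Each integral is taken over an interval $[0, x]$ on which the integrand is bounded; only the limit $x \to 1^-$ involves the unbounded behaviour.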

Definition 4.13 Improper Integrals of Unbounded Functions

1. If the function $f : (a, b] \to \mathbb{R}$ is not bounded, but it is bounded and
Riemann integrable on any interval $[c, b]$ with $a < c < b$, then we say
that the improper integral $\int_a^b f(x)\,dx$ is convergent if the limit
\[ \lim_{c \to a^+} \int_c^b f(x)\,dx \]
exists. Otherwise, we say that the improper integral is divergent. When
the improper integral is convergent, we define its value as
\[ \int_a^b f(x)\,dx = \lim_{c \to a^+} \int_c^b f(x)\,dx. \]

2. If the function $f : [a, b) \to \mathbb{R}$ is not bounded, but it is bounded and
Riemann integrable on any interval $[a, c]$ with $a < c < b$, then we say
that the improper integral $\int_a^b f(x)\,dx$ is convergent if the limit
\[ \lim_{c \to b^-} \int_a^c f(x)\,dx \]
exists. Otherwise, we say that the improper integral is divergent. When
the improper integral is convergent, we define its value as
\[ \int_a^b f(x)\,dx = \lim_{c \to b^-} \int_a^c f(x)\,dx. \]

Improper Integrals of Unbounded Functions


Put another way, if the function $f : (a, b] \to \mathbb{R}$ is not bounded, but it
is bounded and Riemann integrable on any interval $[x, b]$ with $a < x < b$,
we define the function $F : (a, b] \to \mathbb{R}$ by
\[ F(x) = \int_x^b f(u)\,du. \]
Then $F$ is a continuous function. We say that the improper integral
$\int_a^b f(x)\,dx$ is convergent if and only if
\[ \lim_{x \to a^+} F(x) \]
exists. Similarly, for a function $f : [a, b) \to \mathbb{R}$ that is not bounded, but is
bounded and Riemann integrable on any interval $[a, x]$ with $a < x < b$, we
say that the improper integral $\int_a^b f(x)\,dx$ is convergent if and only if the
continuous function $F : [a, b) \to \mathbb{R}$ defined by
\[ F(x) = \int_a^x f(u)\,du \]
has a limit when $x \to b^-$.

Example 4.29
$\int_0^1 \frac{1}{\sqrt{1 - x^2}}\,dx$ is an improper integral as the function $f : [0, 1) \to \mathbb{R}$,
$f(x) = \frac{1}{\sqrt{1 - x^2}}$, is not bounded. We have seen that this improper integral
is convergent and has value $\frac{\pi}{2}$.

Example 4.30
Let $p$ be a positive number. Determine those values of $p$ for which the
improper integral $\int_0^1 \frac{1}{x^p}\,dx$ is convergent. Find the value of the improper
integral when it is convergent.

Solution
For $p > 0$, define the function $F : (0, 1] \to \mathbb{R}$ by
\[ F(x) = \int_x^1 \frac{1}{u^p}\,du. \]
Then
\[ F(x) = \begin{cases} -\ln x, & \text{if } p = 1, \\[4pt] \dfrac{1 - x^{1-p}}{1 - p}, & \text{if } p \neq 1. \end{cases} \]
From this, we see that $\lim_{x \to 0^+} F(x)$ exists if and only if $0 < p < 1$. Hence,
the improper integral $\int_0^1 \frac{1}{x^p}\,dx$ is convergent if and only if $0 < p < 1$. In
this case,
\[ \int_0^1 \frac{1}{x^p}\,dx = \frac{1}{1 - p}, \qquad 0 < p < 1. \]
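
The dependence on $p$ can be made concrete by tabulating $F(x)$ as $x \to 0^+$. This is a small illustrative sketch: the function name `F` mirrors the solution, while the sample values of $p$ and $x$ are assumptions chosen only for display.

```python
import math

def F(x, p):
    """F(x) = integral of u^(-p) over [x, 1], via the closed form in the solution."""
    if p == 1:
        return -math.log(x)
    return (1.0 - x ** (1.0 - p)) / (1.0 - p)

for p in (0.5, 1.0, 2.0):
    # x = 10^-2, 10^-4, 10^-8 moves toward the singular endpoint 0
    values = [F(10.0 ** (-k), p) for k in (2, 4, 8)]
    print(p, values)
```

For $p = 0.5$ the values settle near $1/(1-p) = 2$, while for $p = 1$ and $p = 2$ they keep growing, in line with the criterion $0 < p < 1$.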

When $r \ge 0$, the integral $\int_0^1 x^r\,dx$ is just an ordinary integral. However, we
will sometimes abuse terminology and say that the integral $\int_0^1 x^r\,dx$ is convergent
if and only if $r > -1$.

If $c$ is a point in $(a, b)$ and we have a function $f : [a, b] \setminus \{c\} \to \mathbb{R}$ that is
not bounded, we will define the improper integral $\int_a^b f(x)\,dx$ as
\[ \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]
We say that the improper integral $\int_a^b f(x)\,dx$ is convergent provided that
both improper integrals $\int_a^c f(x)\,dx$ and $\int_c^b f(x)\,dx$ are convergent.

Next we consider improper integrals defined on unbounded intervals.



Definition 4.14 Improper integrals on Unbounded Intervals

1. If $f : [a, \infty) \to \mathbb{R}$ is a function that is bounded and Riemann
integrable on any bounded interval $[a, b]$, we say that the improper
integral $\int_a^{\infty} f(x)\,dx$ is convergent if the limit
\[ \lim_{b \to \infty} \int_a^b f(x)\,dx \]
exists. Otherwise, we say that the improper integral is divergent. If the
improper integral is convergent, we define its value as
\[ \int_a^{\infty} f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx. \]

2. If $f : (-\infty, b] \to \mathbb{R}$ is a function that is bounded and Riemann
integrable on any bounded interval $[a, b]$, we say that the improper
integral $\int_{-\infty}^{b} f(x)\,dx$ is convergent if the limit
\[ \lim_{a \to -\infty} \int_a^b f(x)\,dx \]
exists. Otherwise, we say that the improper integral is divergent. If the
improper integral is convergent, we define its value as
\[ \int_{-\infty}^{b} f(x)\,dx = \lim_{a \to -\infty} \int_a^b f(x)\,dx. \]

3. If $f : \mathbb{R} \to \mathbb{R}$ is a function that is bounded and Riemann integrable
on any bounded interval $[a, b]$, we say that the improper integral
$\int_{-\infty}^{\infty} f(x)\,dx$ is convergent if and only if for any real number $c$, both
the improper integrals
\[ \int_{-\infty}^{c} f(x)\,dx \quad \text{and} \quad \int_c^{\infty} f(x)\,dx \]
are convergent. In such a case, we define the improper integral as
\[ \int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{c} f(x)\,dx + \int_c^{\infty} f(x)\,dx. \tag{4.10} \]

Remark 4.8
To make the integral $\int_{-\infty}^{\infty} f(x)\,dx$ well defined when it is convergent, we
need to check that the right hand side of (4.10) does not depend on the
point $c$. In fact, we can show that if there is a real number $c_0$ so that both
the improper integrals
\[ \int_{-\infty}^{c_0} f(x)\,dx \quad \text{and} \quad \int_{c_0}^{\infty} f(x)\,dx \]
are convergent, then for any other value of $c$,
\[ \int_{-\infty}^{c} f(x)\,dx \quad \text{and} \quad \int_c^{\infty} f(x)\,dx \]
are convergent. This is just due to additivity, which says that
\[ \int_a^c f(x)\,dx = \int_a^{c_0} f(x)\,dx + \int_{c_0}^{c} f(x)\,dx, \]
\[ \int_c^b f(x)\,dx = \int_c^{c_0} f(x)\,dx + \int_{c_0}^{b} f(x)\,dx. \]
Thus, $\lim_{a \to -\infty} \int_a^c f(x)\,dx$ exists if and only if $\lim_{a \to -\infty} \int_a^{c_0} f(x)\,dx$ exists, and
$\lim_{b \to \infty} \int_c^b f(x)\,dx$ exists if and only if $\lim_{b \to \infty} \int_{c_0}^b f(x)\,dx$ exists. Moreover,
\[ \lim_{a \to -\infty} \int_a^c f(x)\,dx = \lim_{a \to -\infty} \int_a^{c_0} f(x)\,dx + \int_{c_0}^{c} f(x)\,dx, \]
\[ \lim_{b \to \infty} \int_c^b f(x)\,dx = \int_c^{c_0} f(x)\,dx + \lim_{b \to \infty} \int_{c_0}^{b} f(x)\,dx. \]
Since $\int_{c_0}^{c} f(x)\,dx = -\int_c^{c_0} f(x)\,dx$, we find that
\[ \int_{-\infty}^{c_0} f(x)\,dx + \int_{c_0}^{\infty} f(x)\,dx = \int_{-\infty}^{c} f(x)\,dx + \int_c^{\infty} f(x)\,dx. \]

Improper Integrals on Unbounded Intervals


Put another way, if $f : [a, \infty) \to \mathbb{R}$ is a function that is bounded
and Riemann integrable on any bounded interval, we define the function
$F : [a, \infty) \to \mathbb{R}$ by
\[ F(x) = \int_a^x f(u)\,du. \]
Then $F$ is a continuous function. We say that the improper integral
$\int_a^{\infty} f(x)\,dx$ is convergent if and only if the limit
\[ \lim_{x \to \infty} F(x) \]
exists. Similarly, for a function $f : (-\infty, b] \to \mathbb{R}$ that is bounded
and Riemann integrable on any bounded interval $[a, b]$, we say that the
improper integral $\int_{-\infty}^{b} f(x)\,dx$ is convergent if and only if the continuous
function $F : (-\infty, b] \to \mathbb{R}$ defined by
\[ F(x) = \int_x^b f(u)\,du \]
has a limit when $x \to -\infty$.

Example 4.31
Let $p$ be any real number. Determine those values of $p$ for which the
improper integral $\int_1^{\infty} \frac{1}{x^p}\,dx$ is convergent. Find the value of the improper
integral when it is convergent.

Solution
For a fixed real number $p$, define the function $F : [1, \infty) \to \mathbb{R}$ by
\[ F(x) = \int_1^x \frac{1}{u^p}\,du. \]
Then
\[ F(x) = \begin{cases} \ln x, & \text{if } p = 1, \\[4pt] \dfrac{x^{1-p} - 1}{1 - p}, & \text{if } p \neq 1. \end{cases} \]
From this, we see that the limit $\lim_{x \to \infty} F(x)$ exists if and only if $p > 1$. Hence,
the improper integral $\int_1^{\infty} \frac{1}{x^p}\,dx$ is convergent if and only if $p > 1$, and
\[ \int_1^{\infty} \frac{1}{x^p}\,dx = \frac{1}{p - 1}, \qquad p > 1. \]
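
As a quick sanity check of the criterion $p > 1$, one may tabulate $F(x)$ for large $x$. The sketch below reuses the closed form from the solution; the sample values of $p$ and $x$ are assumptions made for illustration.

```python
import math

def F(x, p):
    """F(x) = integral of u^(-p) over [1, x], via the closed form in the solution."""
    if p == 1:
        return math.log(x)
    return (x ** (1.0 - p) - 1.0) / (1.0 - p)

for p in (0.5, 1.0, 2.0, 3.0):
    values = [F(10.0 ** k, p) for k in (2, 4, 8)]
    # 1/(p - 1) is the value stated in the solution when p > 1
    limit = 1.0 / (p - 1.0) if p > 1 else math.inf
    print(p, values, limit)
```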

Example 4.32
Determine whether the improper integral is convergent. If yes, find the
value of the integral.
(a) $\int_0^{\infty} \frac{1}{1 + x^2}\,dx$
(b) $\int_{-\infty}^{0} e^x\,dx$
(c) $\int_{-\infty}^{\infty} \frac{x}{x^2 + 1}\,dx$

Solution
(a) Since $\frac{d}{dx} \tan^{-1} x = \frac{1}{1 + x^2}$, we find that
\[ \int_0^b \frac{1}{1 + x^2}\,dx = \tan^{-1} b - \tan^{-1} 0 = \tan^{-1} b. \]
Since
\[ \lim_{b \to \infty} \tan^{-1} b = \frac{\pi}{2}, \]
the improper integral $\int_0^{\infty} \frac{1}{1 + x^2}\,dx$ is convergent and its value is
\[ \int_0^{\infty} \frac{1}{1 + x^2}\,dx = \lim_{b \to \infty} \int_0^b \frac{1}{1 + x^2}\,dx = \lim_{b \to \infty} \tan^{-1} b = \frac{\pi}{2}. \]
(b) Since $e^a \to 0$ as $a \to -\infty$, we have
\[ \int_{-\infty}^{0} e^x\,dx = \lim_{a \to -\infty} \int_a^0 e^x\,dx = \lim_{a \to -\infty} (1 - e^a) = 1. \]
The improper integral $\int_{-\infty}^{0} e^x\,dx$ is convergent and is equal to 1.

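(c) Here, we consider the improper integrals
\[ \int_{-\infty}^{0} \frac{x}{x^2 + 1}\,dx \quad \text{and} \quad \int_0^{\infty} \frac{x}{x^2 + 1}\,dx. \]
Since
\[ \frac{d}{dx} \ln(1 + x^2) = \frac{2x}{1 + x^2}, \]
we find that
\[ \int_0^b \frac{x}{1 + x^2}\,dx = \frac{1}{2} \ln(1 + b^2). \]
But
\[ \lim_{b \to \infty} \ln(1 + b^2) = \infty. \]
Hence, the improper integral $\int_0^{\infty} \frac{x}{x^2 + 1}\,dx$ is divergent. So, the
improper integral $\int_{-\infty}^{\infty} \frac{x}{x^2 + 1}\,dx$ is also divergent.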

One is tempted to define the improper integral $\int_{-\infty}^{\infty} f(x)\,dx$ as
\[ \lim_{a \to \infty} \int_{-a}^{a} f(x)\,dx \]
if it exists. For part (c) in the example above, $f(x) = \frac{x}{1 + x^2}$ is an odd function.
Thus, $\int_{-a}^{a} \frac{x}{1 + x^2}\,dx = 0$ for any $a$, and so
\[ \lim_{a \to \infty} \int_{-a}^{a} \frac{x}{1 + x^2}\,dx = 0. \]
In fact, if $f : \mathbb{R} \to \mathbb{R}$ is an odd function, then we always have
\[ \lim_{a \to \infty} \int_{-a}^{a} f(x)\,dx = 0. \]
If we use the limit
\[ \lim_{a \to \infty} \int_{-a}^{a} f(x)\,dx \]
as a definition for the improper integral $\int_{-\infty}^{\infty} f(x)\,dx$, it will lead to undesirable
results, such as that the integral $\int_{-\infty}^{\infty} x\,dx$ is convergent. Nevertheless, the limit
\[ \lim_{a \to \infty} \int_{-a}^{a} f(x)\,dx, \]
if it exists, has some applications. It is called the Cauchy principal value of
$\int_{-\infty}^{\infty} f(x)\,dx$.
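
To see why the symmetric limit is too weak to serve as a definition, one can compare it with a limit in which the two endpoints move at different rates. The sketch below is an illustration only: the helper `integral` and the choice of endpoints $-a$ and $2a$ are assumptions, not part of the text; the antiderivative $\frac{1}{2}\ln(1+x^2)$ is the one used in Example 4.32.

```python
import math

def integral(a, b):
    """Integral of x / (1 + x^2) over [a, b], via the antiderivative (1/2) ln(1 + x^2)."""
    return 0.5 * math.log(1.0 + b * b) - 0.5 * math.log(1.0 + a * a)

for a in (1e1, 1e3, 1e6):
    symmetric = integral(-a, a)      # always 0, since the integrand is odd
    lopsided = integral(-a, 2 * a)   # tends to ln 2 rather than 0
    one_sided = integral(0.0, a)     # grows without bound
    print(a, symmetric, lopsided, one_sided)
```

Different ways of letting the endpoints tend to $\pm\infty$ give different answers here, which is exactly what cannot happen for a convergent improper integral.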

Definition 4.15 Cauchy Principal Value


If $f : \mathbb{R} \to \mathbb{R}$ is a function that is bounded and Riemann integrable on
any symmetric bounded interval $[-a, a]$, the Cauchy principal value of the
improper integral $\int_{-\infty}^{\infty} f(x)\,dx$, denoted by $\text{P.V.} \int_{-\infty}^{\infty} f(x)\,dx$, is defined as
\[ \text{P.V.} \int_{-\infty}^{\infty} f(x)\,dx = \lim_{a \to \infty} \int_{-a}^{a} f(x)\,dx, \]
if the limit exists.


Thus, we find that if $f : \mathbb{R} \to \mathbb{R}$ is an odd function, then $\text{P.V.} \int_{-\infty}^{\infty} f(x)\,dx = 0$.
It is also easy to prove the following.

Proposition 4.34
If the improper integral $\int_{-\infty}^{\infty} f(x)\,dx$ is convergent, then its Cauchy
principal value exists, and is equal to the improper integral. Namely,
\[ \text{P.V.} \int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx. \]

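Proof
If the improper integral $\int_{-\infty}^{\infty} f(x)\,dx$ is convergent, then the limits
\[ \lim_{c \to -\infty} \int_c^0 f(x)\,dx \quad \text{and} \quad \lim_{b \to \infty} \int_0^b f(x)\,dx \]
exist and
\[ \int_{-\infty}^{\infty} f(x)\,dx = \lim_{c \to -\infty} \int_c^0 f(x)\,dx + \lim_{b \to \infty} \int_0^b f(x)\,dx. \]
This implies that
\[ \lim_{a \to \infty} \int_{-a}^{a} f(x)\,dx = \lim_{a \to \infty} \int_{-a}^{0} f(x)\,dx + \lim_{a \to \infty} \int_0^a f(x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx. \]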

Consider the integral
\[ \int_0^{\infty} \frac{1}{\sqrt{x}(x + 1)}\,dx. \tag{4.11} \]
The function $f : (0, \infty) \to \mathbb{R}$,
\[ f(x) = \frac{1}{\sqrt{x}(x + 1)} \]
is not bounded on any interval $(0, b]$ when $b > 0$. Hence, the integral is an
improper integral of an unbounded function defined on an unbounded interval.
Using the same principle, we will say that it is convergent if and only if for any
$c > 0$, the improper integrals
\[ \int_0^c f(x)\,dx \quad \text{and} \quad \int_c^{\infty} f(x)\,dx \]
are convergent.
Another natural question to ask is whether one can determine whether an
improper integral is convergent without explicitly computing the integral. There
are some partial solutions to this.

If $J$ is an interval that is contained in the interval $I$, and the integral
$\int_J f(x)\,dx$ is divergent, then the integral $\int_I f(x)\,dx$ is divergent.

For instance, the integral $\int_0^{\infty} f(x)\,dx$ is divergent if the integral $\int_1^{\infty} f(x)\,dx$
is divergent.
The next proposition says that a linear combination of convergent integrals must
be convergent.

Proposition 4.35 Linearity

Let $I$ be an interval. If the improper integrals $\int_I f(x)\,dx$ and $\int_I g(x)\,dx$ are
convergent, then for any constants $\alpha$ and $\beta$, the improper integral
$\int_I (\alpha f + \beta g)$ is also convergent, and
\[ \int_I (\alpha f + \beta g) = \alpha \int_I f + \beta \int_I g. \]

This follows easily from limit laws. Now we want to prove some comparison
theorems for improper integrals. We start with integrals of nonnegative functions.
If a function $f$ is nonpositive, one just considers the function $-f$, which is then
nonnegative.

Lemma 4.36
Let $I$ be an interval. Suppose that $f : I \to \mathbb{R}$ is a nonnegative function that is
bounded and Riemann integrable on any closed and bounded interval that
is contained in $I$. Fix $x_0$ in $I$ and define the function $F : I \to \mathbb{R}$ by
\[ F(x) = \int_{x_0}^{x} f(u)\,du. \]
1. If $I = (a, b]$ or $I = (-\infty, b]$, then the integral $\int_I f(x)\,dx$ is convergent
if and only if the function $F(x)$ is bounded below.
2. If $I = [a, b)$ or $I = [a, \infty)$, then the integral $\int_I f(x)\,dx$ is convergent if
and only if the function $F(x)$ is bounded above.

Proof
Notice that since $f(u) \ge 0$ for all $u \in I$, for any $x_1$ and $x_2$ in $I$, if $x_1 < x_2$,
then
\[ F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(u)\,du \ge 0. \]
This implies that $F : I \to \mathbb{R}$ is an increasing function.

1. If $I = (a, b]$ or $I = (-\infty, b]$, the limit $\lim_{x \to a^+} F(x)$ or the limit $\lim_{x \to -\infty} F(x)$
exists if and only if $F(x)$ is bounded below.

2. If $I = [a, b)$ or $I = [a, \infty)$, the limit $\lim_{x \to b^-} F(x)$ or the limit $\lim_{x \to \infty} F(x)$
exists if and only if $F(x)$ is bounded above.

In Proposition 4.34, we have stated that if the improper integral $\int_{-\infty}^{\infty} f(x)\,dx$ is
convergent, then the Cauchy principal value $\text{P.V.} \int_{-\infty}^{\infty} f(x)\,dx$ exists. The converse
is true if the function $f : \mathbb{R} \to \mathbb{R}$ is nonnegative.

Theorem 4.37
Let $f : \mathbb{R} \to \mathbb{R}$ be a nonnegative function that is bounded and Riemann
integrable on any closed and bounded interval. The improper integral
$\int_{-\infty}^{\infty} f(x)\,dx$ is convergent if and only if the Cauchy principal value
$\text{P.V.} \int_{-\infty}^{\infty} f(x)\,dx$ exists. Moreover,
\[ \int_{-\infty}^{\infty} f(x)\,dx = \text{P.V.} \int_{-\infty}^{\infty} f(x)\,dx. \]

Proof
We just need to show that if the Cauchy principal value $\text{P.V.} \int_{-\infty}^{\infty} f(x)\,dx$
exists, then the improper integral $\int_{-\infty}^{\infty} f(x)\,dx$ is convergent.
Assume that the Cauchy principal value $\text{P.V.} \int_{-\infty}^{\infty} f(x)\,dx$ exists and is equal
to $I$. As in the proof of Lemma 4.36, the function
\[ F(x) = \int_0^x f(u)\,du \]
is an increasing function. For any real numbers $b$ and $c$ with $b \le c$, there is
a positive number $a$ such that
\[ -a \le b \le c \le a. \]
Hence,
\[ F(c) - F(b) = \int_b^c f(x)\,dx \le \int_{-a}^{a} f(x)\,dx \le I. \]
Taking $b = 0$ or $c = 0$, this proves that $-I \le F(x) \le I$ for all $x \in \mathbb{R}$. In other words,
the function $F : \mathbb{R} \to \mathbb{R}$ is bounded. Therefore, the improper integrals $\int_0^{\infty} f(x)\,dx$ and
$\int_{-\infty}^{0} f(x)\,dx$ are convergent, and so the improper integral $\int_{-\infty}^{\infty} f(x)\,dx$ is
convergent.

Now, we can present the comparison theorem for improper integrals.

Theorem 4.38 Comparison Theorem


Let $I$ be an interval. Suppose that $f : I \to \mathbb{R}$ and $g : I \to \mathbb{R}$ are nonnegative
functions that are bounded and Riemann integrable on any closed and
bounded interval that is contained in $I$. Assume that
\[ 0 \le f(x) \le g(x) \quad \text{for all } x \in I. \]
1. If the integral $\int_I g(x)\,dx$ is convergent, then the integral $\int_I f(x)\,dx$ is
convergent.
2. If the integral $\int_I f(x)\,dx$ is divergent, then the integral $\int_I g(x)\,dx$ is
divergent.

Proof
Notice that the second statement is the contrapositive of the first statement.
Hence, we only need to prove the first statement. Fix $x_0$ in the interval $I$,
and define
\[ F(x) = \int_{x_0}^{x} f(u)\,du, \qquad G(x) = \int_{x_0}^{x} g(u)\,du. \]
If $x > x_0$,
\[ 0 \le F(x) \le G(x). \]
Therefore, if $G$ is bounded above, then $F$ is bounded above. If $x < x_0$,
\[ F(x) = -\int_x^{x_0} f(u)\,du, \qquad G(x) = -\int_x^{x_0} g(u)\,du. \]
Since
\[ 0 \le \int_x^{x_0} f(u)\,du \le \int_x^{x_0} g(u)\,du, \]
we find that
\[ 0 \ge F(x) \ge G(x). \]
Therefore, if $G$ is bounded below, then $F$ is bounded below. The
assertions about the convergence of the integrals then follow from Lemma
4.36.

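Example 4.33
We can show that the integral $\int_0^{\infty} \frac{x}{x^2 + 1}\,dx$ is divergent without explicitly
computing the integral. Notice that for $x \ge 1$,
\[ 0 \le \frac{1}{2x} \le \frac{x}{x^2 + 1}. \]
Since the integral $\int_1^{\infty} \frac{1}{x}\,dx$ is divergent, so is $\int_1^{\infty} \frac{1}{2x}\,dx$, and hence the integral
$\int_1^{\infty} \frac{x}{x^2 + 1}\,dx$ is also divergent. Hence, the integral $\int_0^{\infty} \frac{x}{x^2 + 1}\,dx$ is divergent.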

Example 4.34
Determine whether the improper integral $\int_0^{\infty} \frac{1}{\sqrt{x}(x + 1)}\,dx$ is convergent.

Solution
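We determine the convergence of the two improper integrals
\[ \int_0^1 \frac{1}{\sqrt{x}(x + 1)}\,dx \quad \text{and} \quad \int_1^{\infty} \frac{1}{\sqrt{x}(x + 1)}\,dx \]
separately. For $0 < x \le 1$,
\[ 0 \le \frac{1}{\sqrt{x}(x + 1)} \le \frac{1}{\sqrt{x}}. \]
Since the integral $\int_0^1 \frac{1}{\sqrt{x}}\,dx$ is convergent, the integral $\int_0^1 \frac{1}{\sqrt{x}(x + 1)}\,dx$
is convergent. For $x \ge 1$,
\[ 0 \le \frac{1}{\sqrt{x}(x + 1)} \le \frac{1}{x\sqrt{x}} = \frac{1}{x^{3/2}}. \]
Since the integral $\int_1^{\infty} \frac{1}{x^{3/2}}\,dx$ is convergent, the integral
$\int_1^{\infty} \frac{1}{\sqrt{x}(x + 1)}\,dx$ is convergent. From these, we conclude that the
integral $\int_0^{\infty} \frac{1}{\sqrt{x}(x + 1)}\,dx$ is convergent.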
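
The two comparisons can be cross-checked against exact partial integrals. The sketch below relies on a computation that is not carried out in the text: the substitution $x = u^2$ gives the antiderivative $2\tan^{-1}\sqrt{x}$, so the improper integral has value $\pi$. The helper name `partial_integral` and the sample endpoints are illustrative assumptions.

```python
import math

def partial_integral(a, b):
    """Integral of 1 / (sqrt(x) (x + 1)) over [a, b] with 0 < a < b,
    using the antiderivative 2 * atan(sqrt(x))."""
    return 2.0 * (math.atan(math.sqrt(b)) - math.atan(math.sqrt(a)))

# Near 0 the integrand is compared with x^(-1/2); near infinity with x^(-3/2).
near_zero = [partial_integral(10.0 ** (-k), 1.0) for k in (2, 4, 8)]
near_infinity = [partial_integral(1.0, 10.0 ** k) for k in (2, 4, 8)]
print(near_zero)      # stays below the integral of x^(-1/2) over (0, 1], which is 2
print(near_infinity)  # stays below the integral of x^(-3/2) over [1, inf), which is 2
print(partial_integral(1e-12, 1e12), math.pi)
```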

Since the integral $\int_0^1 x^{-p}\,dx$ is convergent when $p < 1$, while the integral
$\int_1^{\infty} x^{-p}\,dx$ is convergent if $p > 1$, $\int_0^{\infty} x^{-p}\,dx$ is not convergent for any value of
$p$. Hence, to determine the convergence of the integral in the example above, we
need to split the integral into two parts and compare to different $g(x) = x^{-p}$. For
$x \to 0^+$, we ignore the part $1/(x + 1)$ which has a finite limit. For $x \to \infty$, the
leading term of $1/(x + 1)$ is $1/x$. This is how we identify the correct values of $p$
to compare to.
Theorem 4.38 provides a useful strategy to determine the convergence of an
integral in the case that the function is nonnegative. For a function that can take
both positive and negative values, we need other strategies.

Theorem 4.39
Let $I$ be an interval. Assume that $f : I \to \mathbb{R}$ is a function that is
bounded and Riemann integrable on any closed and bounded interval that
is contained in $I$. If the improper integral $\int_I |f(x)|\,dx$ is convergent, then
the improper integral $\int_I f(x)\,dx$ is convergent.

This theorem can be interpreted as absolute convergence implies convergence.

Proof
Define the functions $f_+ : I \to \mathbb{R}$ and $f_- : I \to \mathbb{R}$ by
\[ f_+(x) = \max\{f(x), 0\}, \qquad f_-(x) = \max\{-f(x), 0\}. \]
In other words,
\[ f_+(x) = \begin{cases} f(x), & \text{if } f(x) \ge 0, \\ 0, & \text{if } f(x) < 0, \end{cases} \qquad
f_-(x) = \begin{cases} -f(x), & \text{if } f(x) \le 0, \\ 0, & \text{if } f(x) > 0. \end{cases} \]
Notice that $f_+$ and $f_-$ are nonnegative functions, and
\[ f(x) = f_+(x) - f_-(x), \qquad |f(x)| = f_+(x) + f_-(x). \]
The second equality implies that
\[ 0 \le f_+(x) \le |f(x)|, \qquad 0 \le f_-(x) \le |f(x)| \quad \text{for all } x \in I. \]
Theorem 4.23 says that the function $|f| : I \to \mathbb{R}$ is Riemann integrable on
any closed and bounded interval that is contained in $I$.
Question 4.3.5 says that the functions $f_+ : I \to \mathbb{R}$ and $f_- : I \to \mathbb{R}$
are also Riemann integrable on any closed and bounded interval that is
contained in $I$. By Theorem 4.38, the improper integrals $\int_I f_+(x)\,dx$ and
$\int_I f_-(x)\,dx$ are convergent. By linearity, the improper integral $\int_I f(x)\,dx$ is
also convergent.

Combining Theorem 4.38 and Theorem 4.39, we have the following.

Theorem 4.40 General Comparison Theorem


Let $I$ be an interval. Suppose that $f : I \to \mathbb{R}$ and $g : I \to \mathbb{R}$ are functions that
are bounded and Riemann integrable on any closed and bounded interval
that is contained in $I$. If
\[ |f(x)| \le g(x) \quad \text{for all } x \in I, \]
and the integral $\int_I g(x)\,dx$ is convergent, then the integral $\int_I f(x)\,dx$ is
convergent.

Example 4.35
Show that the improper integral $\int_1^{\infty} \frac{\sin x}{x^2}\,dx$ is convergent.

Solution
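For any $x \ge 1$,
\[ \left| \frac{\sin x}{x^2} \right| \le \frac{1}{x^2}. \]
Since the integral $\int_1^{\infty} \frac{1}{x^2}\,dx$ is convergent, Theorem 4.40 shows that the integral
$\int_1^{\infty} \frac{\sin x}{x^2}\,dx$ is convergent.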
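
A numerical experiment makes the role of the bound $1/x^2$ visible: the contribution of $[b, b']$ to the integral cannot exceed $\int_b^{b'} x^{-2}\,dx = 1/b - 1/b'$. The code below is a rough sketch; the midpoint rule, the resolution of 200 subintervals per unit length, and the helper names are assumptions made for illustration.

```python
import math

def midpoint_integral(f, a, b, n):
    """Approximate the Riemann integral of f on [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f(x):
    return math.sin(x) / (x * x)

def partial(b):
    """Approximate the integral of sin(x)/x^2 over [1, b]."""
    return midpoint_integral(f, 1.0, b, int(200 * (b - 1)))

p10, p100, p1000 = partial(10.0), partial(100.0), partial(1000.0)
print(p10, p100, p1000)
# Each difference is bounded by the integral of 1/x^2 over the same range.
print(abs(p100 - p10), 1 / 10 - 1 / 100)
print(abs(p1000 - p100), 1 / 100 - 1 / 1000)
```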

There are some important special functions in mathematics and physics which
are defined in terms of improper integrals. One such function is the gamma
function, which students have probably seen in probability theory. In fact, the gamma
function is ubiquitous in mathematics.

Example 4.36
Let $s$ be a real number. Show that the improper integral $\int_0^{\infty} t^{s-1} e^{-t}\,dt$ is
convergent if and only if $s > 0$.

Solution
We split the integral into the two integrals $\int_0^1 t^{s-1} e^{-t}\,dt$ and $\int_1^{\infty} t^{s-1} e^{-t}\,dt$.
Notice that
\[ 0 \le t^{s-1} e^{-1} \le t^{s-1} e^{-t} \le t^{s-1} \quad \text{for all } t \in (0, 1]. \]
Since $\int_0^1 t^{s-1}\,dt$ is convergent if and only if $s > 0$, $\int_0^1 t^{s-1} e^{-t}\,dt$ is
convergent if and only if $s > 0$. For the integral $\int_1^{\infty} t^{s-1} e^{-t}\,dt$, notice
that
\[ \lim_{t \to \infty} t^{s-1} e^{-t/2} = \lim_{t \to \infty} \frac{t^{s-1}}{e^{t/2}} = 0. \]
Therefore, there is a number $t_0 > 1$ such that for all $t \ge t_0$, $t^{s-1} e^{-t/2} \le 1$.
Now the function
\[ g(t) = t^{s-1} e^{-t/2} \]
is continuous on the interval $[1, t_0]$. Hence, it is bounded on $[1, t_0]$. These
imply that there is a number $M \ge 1$ such that
\[ t^{s-1} e^{-t/2} \le M \quad \text{for all } t \ge 1. \]
Hence,
\[ 0 \le t^{s-1} e^{-t} \le M e^{-t/2} \quad \text{for all } t \ge 1. \]
Since the integral $\int_1^{\infty} e^{-t/2}\,dt$ is convergent, the integral $\int_1^{\infty} t^{s-1} e^{-t}\,dt$ is
convergent.
Hence, the integral $\int_0^{\infty} t^{s-1} e^{-t}\,dt$ is convergent if and only if $s > 0$.

The Gamma Function


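The gamma function $\Gamma(s)$ is defined as the improper integral
\[ \Gamma(s) = \int_0^{\infty} t^{s-1} e^{-t}\,dt \]
when $s > 0$. It is easy to find that
\[ \Gamma(1) = \int_0^{\infty} e^{-t}\,dt = 1. \]
When $s > 0$, using integration by parts with $u(t) = t^s$ and $v(t) = -e^{-t}$,
we have
\begin{align*}
\Gamma(s + 1) &= \lim_{\substack{a \to 0^+ \\ b \to \infty}} \int_a^b t^s e^{-t}\,dt \\
&= \lim_{\substack{a \to 0^+ \\ b \to \infty}} \left( \left[ -t^s e^{-t} \right]_a^b + s \int_a^b t^{s-1} e^{-t}\,dt \right) \\
&= \lim_{\substack{a \to 0^+ \\ b \to \infty}} \left( a^s e^{-a} - b^s e^{-b} \right) + s\Gamma(s) \\
&= s\Gamma(s).
\end{align*}
This gives the formula
\[ \Gamma(s + 1) = s\Gamma(s). \]
By induction, one can show that
\[ \Gamma(n + 1) = n!. \]
Hence, the gamma function is a function that interpolates the factorials.
Another special value is
\[ \Gamma\left(\frac{1}{2}\right) = \int_0^{\infty} t^{-1/2} e^{-t}\,dt. \]
Students have probably seen in multivariable calculus or probability that
\[ \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \]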

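Making a change of variables $t = u^2$, we find that
\begin{align*}
\int_0^{\infty} t^{-1/2} e^{-t}\,dt &= \lim_{\substack{a \to 0^+ \\ b \to \infty}} \int_a^b t^{-1/2} e^{-t}\,dt \\
&= \lim_{\substack{a \to 0^+ \\ b \to \infty}} 2 \int_{\sqrt{a}}^{\sqrt{b}} e^{-u^2}\,du \\
&= \int_{-\infty}^{\infty} e^{-x^2}\,dx.
\end{align*}
Hence,
\[ \Gamma\left(\frac{1}{2}\right) = \int_0^{\infty} t^{-1/2} e^{-t}\,dt = \sqrt{\pi}. \]
In the future, we are going to explore more about the gamma function. For
example, we will prove the useful formula for the beta integral, which says
that if $\alpha > 0$, $\beta > 0$,
\[ \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1}\,dt = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}. \]
A lot of other proper or improper integrals can be transformed to this.
When $\alpha$ and $\beta$ are positive integers, this formula can be proved by
induction. See Question 4.5.4.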
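
The identities above can be spot-checked with Python's standard library, whose `math.gamma` evaluates the gamma function. This is only a numerical sanity check under stated assumptions: the sample arguments, the helper `beta_midpoint`, and the midpoint rule are illustrative choices, and the beta integral is sampled with $\alpha, \beta \ge 1$ so that the integrand is bounded.

```python
import math

# Recurrence Gamma(s + 1) = s * Gamma(s) at a few sample points
for s in (0.5, 1.0, 2.5, 7.0):
    print(s, math.gamma(s + 1.0), s * math.gamma(s))

# Gamma(n + 1) = n! for small nonnegative integers n
for n in range(6):
    print(n, math.gamma(n + 1), math.factorial(n))

# Gamma(1/2) = sqrt(pi)
print(math.gamma(0.5), math.sqrt(math.pi))

def beta_midpoint(alpha, beta, n=200_000):
    """Midpoint-rule approximation of the beta integral over [0, 1]."""
    h = 1.0 / n
    return h * sum(((i + 0.5) * h) ** (alpha - 1) * (1 - (i + 0.5) * h) ** (beta - 1)
                   for i in range(n))

alpha, beta = 2.5, 3.0
print(beta_midpoint(alpha, beta),
      math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta))
```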

Exercises 4.6
Question 1
Let $a$ be a positive real number. Show that the integral $\int_0^{\infty} e^{-ax}\,dx$ is
convergent and find its value.

Question 2
Let $n$ be a positive integer. Find the value of the integral $\int_0^{\infty} x^n e^{-x^2}\,dx$.

Question 3
Explain why the given integral is an improper integral, and determine
whether it is convergent. If yes, find the value of the integral.
(a) $\int_{-3}^{0} \frac{x}{\sqrt{9 - x^2}}\,dx$
(b) $\int_0^1 x \ln x\,dx$
(c) $\int_0^2 \frac{dx}{(x - 1)^2}$

Question 4
Determine whether the improper integral is convergent. If yes, find its
value.
(a) $\int_0^{\infty} \frac{\sqrt{x}}{x + 1}\,dx$
(b) $\int_1^{\infty} \frac{\ln x}{x^2}\,dx$

Question 5
Determine whether the improper integral is convergent.
(a) $\int_0^{\infty} \frac{1}{(\sqrt{x} + 1)^2}\,dx$
(b) $\int_0^1 \frac{e^x}{\sqrt{x}}\,dx$
(c) $\int_0^{2\pi} \frac{\sin x}{x^{3/2}}\,dx$
(d) $\int_{-\infty}^{\infty} \frac{x^3}{(x^2 + x + 1)^2}\,dx$
Chapter 5. Infinite Series of Numbers and Infinite Products 354

Chapter 5

Infinite Series of Numbers and Infinite Products

In this chapter, we discuss infinite series of numbers and infinite products.

5.1 Limit Superior and Limit Inferior

In Chapter 1, we have seen that a bounded sequence might not be convergent. In
this section, we will discuss the concepts of limit inferior and limit superior,
which characterize the limits of subsequences of a sequence.
First we extend the definitions of supremum and infimum as follows.

Extensions of Infimum and Supremum

1. If a nonempty set S is not bounded below, we write inf S = −∞.

2. If a nonempty set S is not bounded above, we write sup S = ∞.

3. inf{−∞} = −∞, inf{∞} = ∞.

4. sup{−∞} = −∞, sup{∞} = ∞.

The definitions of limits are also extended to include −∞ and ∞ as limits.

Theorem 5.1
Let {an } be a sequence of real numbers.

1. The sequence $\{a_n\}$ is not bounded above if and only if there is a strictly
increasing subsequence $\{a_{n_k}\}$ such that $\lim_{k \to \infty} a_{n_k} = \infty$.

2. The sequence $\{a_n\}$ is not bounded below if and only if there is a strictly
decreasing subsequence $\{a_{n_k}\}$ such that $\lim_{k \to \infty} a_{n_k} = -\infty$.

Proof
It is sufficient to prove the first statement. If there is a subsequence $\{a_{n_k}\}$
of $\{a_n\}$ such that $\lim_{k \to \infty} a_{n_k} = \infty$, it is obvious that $\{a_n\}$ is not bounded
above.
Conversely, given that $\{a_n\}$ is not bounded above, we want to construct a
strictly increasing subsequence $\{a_{n_k}\}$ such that $\lim_{k \to \infty} a_{n_k} = \infty$. Let $n_1 = 1$.
Since $\{a_n\}$ is not bounded above, there is an $n_2 > 1$ such that $a_{n_2} \ge a_{n_1} + 1$.
Assume we have found $n_1, n_2, \ldots, n_{k-1}$ such that
\[ n_1 < n_2 < \cdots < n_{k-1}, \]
and
\[ a_{n_{j+1}} \ge a_{n_j} + 1 \quad \text{for all } 1 \le j \le k - 2. \]
Since $\{a_n\}$ is not bounded above, and discarding the finitely many terms with index
at most $n_{k-1}$ does not change this, there is an $n_k > n_{k-1}$ such that
$a_{n_k} \ge a_{n_{k-1}} + 1$. This constructs inductively a strictly increasing sequence $\{a_{n_k}\}$
which satisfies
\[ a_{n_{k+1}} \ge a_{n_k} + 1 \quad \text{for all } k \in \mathbb{Z}^+. \]
From this, we find that
\[ a_{n_k} \ge a_{n_1} + k - 1. \]
Therefore, $\lim_{k \to \infty} a_{n_k} = \infty$.
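
The inductive construction in the proof can be mimicked in code. The following is a sketch under assumptions: the proof only asserts that some suitable index $n_k$ exists, whereas this illustration always takes the first one; the function name, the `search_limit` safety cap, and the sample sequence $a_n = (-1)^n n$ are hypothetical choices made for the example.

```python
def increasing_subsequence_indices(a, count, search_limit=10**6):
    """Follow the construction in the proof: n_1 = 1, and n_k is the first index
    beyond n_{k-1} with a(n_k) >= a(n_{k-1}) + 1.  `a` maps a positive integer
    to a real number; `search_limit` is a safety cap for this sketch only."""
    indices = [1]
    while len(indices) < count:
        prev = indices[-1]
        n = prev + 1
        while a(n) < a(prev) + 1:
            n += 1
            if n > search_limit:
                raise ValueError("no suitable index found below search_limit")
        indices.append(n)
    return indices

a = lambda n: (-1) ** n * n   # not bounded above (and not bounded below)
idx = increasing_subsequence_indices(a, 5)
print(idx, [a(n) for n in idx])
```

Consecutive selected terms differ by at least 1, which is what forces the subsequence to tend to $\infty$.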

Associated with a given sequence {an }, we can define two sequences {bn }
and {cn }.

Definition 5.1
Given a sequence $\{a_n\}$, we can define two sequences $\{b_n\}$ and $\{c_n\}$ as
follows. For each positive integer $n$,
\[ b_n = \inf_{k \ge n} a_k = \inf\{a_k \mid k \ge n\}, \qquad c_n = \sup_{k \ge n} a_k = \sup\{a_k \mid k \ge n\}. \]

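Example 5.1
For the sequence $\{a_n\}$ with $a_n = \frac{1}{n}$,
\[ b_n = 0, \qquad c_n = \frac{1}{n} \quad \text{for all } n \ge 1. \]

Example 5.2

For the sequence $\{a_n\}$ with $a_n = n$,
\[ b_n = n, \qquad c_n = \infty \quad \text{for all } n \ge 1. \]

Example 5.3

For the sequence $\{a_n\}$ with $a_n = (-1)^n$,
\[ b_n = -1, \qquad c_n = 1 \quad \text{for all } n \ge 1. \]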

The following are obvious from the definitions and Theorem 5.1.

Proposition 5.2

Given that $\{a_n\}$ is a sequence of real numbers, for each $n \in \mathbb{Z}^+$, let
$b_n = \inf_{k \ge n} a_k$ and $c_n = \sup_{k \ge n} a_k$.

1. For any positive integer $n$, $b_n \le a_n \le c_n$.

2. For any positive integer $n$, $b_n$ cannot be $\infty$, $c_n$ cannot be $-\infty$.

3. $\{a_n\}$ is not bounded below if and only if $b_n = -\infty$ for all $n \ge 1$.

4. $\{a_n\}$ is not bounded above if and only if $c_n = \infty$ for all $n \ge 1$.

5. $\{b_n\}$ is an increasing sequence.

6. $\{c_n\}$ is a decreasing sequence.

Since $\{b_n\}$ is an increasing sequence, $\lim_{n \to \infty} b_n = \sup\{b_n\}$ in the general sense.
Similarly, $\lim_{n \to \infty} c_n = \inf\{c_n\}$.

Definition 5.2 Limit Inferior and Limit Superior

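Let $\{a_n\}$ be a sequence of real numbers.

1. The limit inferior or limit infimum of $\{a_n\}$, denoted by $\liminf_{n \to \infty} a_n$ or
$\underline{\lim}_{n \to \infty}\, a_n$, is defined as
\[ \liminf_{n \to \infty} a_n = \lim_{n \to \infty} b_n = \sup_{n \ge 1} \inf_{k \ge n} a_k. \]

2. The limit superior or limit supremum of $\{a_n\}$, denoted by $\limsup_{n \to \infty} a_n$ or
$\overline{\lim}_{n \to \infty}\, a_n$, is defined as
\[ \limsup_{n \to \infty} a_n = \lim_{n \to \infty} c_n = \inf_{n \ge 1} \sup_{k \ge n} a_k. \]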

Notice that using extended definitions of infimum and supremum, the limit
infimum and limit supremum of a sequence always exist, either as a finite number,
or ±∞.

Example 5.4

1. For the sequence $\{a_n\}$ with $a_n = \frac{1}{n}$ defined in Example 5.1,
\[ \liminf_{n \to \infty} a_n = 0, \qquad \limsup_{n \to \infty} a_n = 0. \]

2. For the sequence $\{a_n\}$ with $a_n = n$ defined in Example 5.2,
\[ \liminf_{n \to \infty} a_n = \infty, \qquad \limsup_{n \to \infty} a_n = \infty. \]

3. For the sequence $\{a_n\}$ with $a_n = (-1)^n$ defined in Example 5.3,
\[ \liminf_{n \to \infty} a_n = -1, \qquad \limsup_{n \to \infty} a_n = 1. \]

The following are obvious.

1. $\liminf_{n \to \infty} (-a_n) = -\limsup_{n \to \infty} a_n$.

2. $\limsup_{n \to \infty} (-a_n) = -\liminf_{n \to \infty} a_n$.

Since
\[ b_n \le c_n \quad \text{for all } n \in \mathbb{Z}^+, \]
we obtain the following immediately.

Proposition 5.3

For any sequence $\{a_n\}$,
\[ \liminf_{n \to \infty} a_n \le \limsup_{n \to \infty} a_n. \]

We also have the following comparison theorem.

Proposition 5.4

Let $\{u_n\}$ and $\{v_n\}$ be sequences of real numbers. If $u_n \le v_n$ for all positive
integers $n$, then
\[ \liminf_{n \to \infty} u_n \le \liminf_{n \to \infty} v_n \quad \text{and} \quad \limsup_{n \to \infty} u_n \le \limsup_{n \to \infty} v_n. \]

Example 5.5

Find $\liminf_{n \to \infty} a_n$ and $\limsup_{n \to \infty} a_n$ for the sequence $\{a_n\}$ defined by
\[ a_n = (-1)^n \left( 1 + \frac{1}{n} \right). \]

Solution
Notice that for any $n \ge 1$,
\[ a_{2n-1} = -1 - \frac{1}{2n - 1}, \qquad a_{2n} = 1 + \frac{1}{2n}. \]
We observe that
\[ a_1 < a_3 < \cdots < -1 < 1 < \cdots < a_4 < a_2. \]
The sequence $\{a_{2n-1}\}$ increases to $-1$, while the sequence $\{a_{2n}\}$ decreases
to $1$. Therefore,
\[ b_{2n} = b_{2n+1} = -1 - \frac{1}{2n + 1}, \qquad c_{2n-1} = c_{2n} = 1 + \frac{1}{2n}. \]
It follows that
\[ \liminf_{n \to \infty} a_n = \lim_{n \to \infty} b_n = -1, \qquad \limsup_{n \to \infty} a_n = \lim_{n \to \infty} c_n = 1. \]
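
The sequences $\{b_n\}$ and $\{c_n\}$ of this example can be tabulated on a finite window. The sketch below is an approximation under stated assumptions: `tail_inf_sup` and the window length `N` are illustrative, and a finite window only stands in for the infimum and supremum over all $k \ge n$. For this particular sequence the windowed values agree with $b_n$ and $c_n$ for every $n < N$, because the extreme values of each tail are attained within its first two terms.

```python
def tail_inf_sup(terms):
    """Given a_1, ..., a_N, return lists b, c with
    b[n-1] = min(a_n, ..., a_N) and c[n-1] = max(a_n, ..., a_N)."""
    b, c = list(terms), list(terms)
    # Sweep from the right so each entry reuses the result for the shorter tail.
    for i in range(len(terms) - 2, -1, -1):
        b[i] = min(terms[i], b[i + 1])
        c[i] = max(terms[i], c[i + 1])
    return b, c

N = 2000
a = [(-1) ** n * (1 + 1 / n) for n in range(1, N + 1)]
b, c = tail_inf_sup(a)
for n in (1, 2, 3, 10, 100, 1000):
    print(n, b[n - 1], c[n - 1])
```

The printed values of $b_n$ increase toward $-1$ and those of $c_n$ decrease toward $1$, matching the limits found in the solution.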

Example 5.6

Let $\{a_n\}$ be the sequence defined by $a_n = (-1)^n n$. Find $\liminf_{n \to \infty} a_n$ and
$\limsup_{n \to \infty} a_n$.

Solution
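For any $n \ge 1$,
\[ a_{2n-1} = -(2n - 1), \qquad a_{2n} = 2n. \]
The sequence $\{a_n\}$ is neither bounded below nor bounded above. Therefore,
\[ b_n = -\infty, \qquad c_n = \infty. \]
It follows that
\[ \liminf_{n \to \infty} a_n = -\infty, \qquad \limsup_{n \to \infty} a_n = \infty. \]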

For a monotonic sequence, it is easy to find its limit inferior and limit superior.

Theorem 5.5
Let $\{a_n\}$ be a monotonic sequence.

1. If $\{a_n\}$ is increasing, then $b_n = a_n$ and $c_n = \sup\{a_n\}$. Therefore,
\[ \liminf_{n \to \infty} a_n = \limsup_{n \to \infty} a_n = \lim_{n \to \infty} a_n = \sup\{a_n\}. \]

2. If $\{a_n\}$ is decreasing, then $b_n = \inf\{a_n\}$ and $c_n = a_n$. Therefore,
\[ \liminf_{n \to \infty} a_n = \limsup_{n \to \infty} a_n = \lim_{n \to \infty} a_n = \inf\{a_n\}. \]

In other words, for a monotonic sequence, the limit inferior, the limit superior, and
the limit are all the same.
In fact, if a sequence $\{a_n\}$ has a finite limit, then its limit inferior, limit
superior and limit are all the same.

Theorem 5.6
Let $\{a_n\}$ be a sequence, and let $a$ be a finite number. Then the following
two statements are equivalent.

(a) $\lim_{n \to \infty} a_n = a$.

(b) $\liminf_{n \to \infty} a_n = \limsup_{n \to \infty} a_n = a$.

Proof
For a positive integer $n$, define $b_n = \inf_{k \ge n} a_k$, $c_n = \sup_{k \ge n} a_k$. Then
\[ \liminf_{n \to \infty} a_n = \lim_{n \to \infty} b_n, \qquad \limsup_{n \to \infty} a_n = \lim_{n \to \infty} c_n. \]
By definition,
\[ b_n \le a_n \le c_n. \]
Hence, that (b) implies (a) follows from the squeeze theorem.
Now we prove that (a) implies (b). Given $\varepsilon > 0$, since $\lim_{n \to \infty} a_n = a$, there
is a positive integer $N$ such that for all $n \ge N$,
\[ a - \frac{\varepsilon}{2} < a_n < a + \frac{\varepsilon}{2}. \]
Hence, for all $n \ge N$,
\[ a - \frac{\varepsilon}{2} \le b_n \le a + \frac{\varepsilon}{2} \quad \text{and} \quad a - \frac{\varepsilon}{2} \le c_n \le a + \frac{\varepsilon}{2}. \]
These prove that for all $n \ge N$,
\[ |b_n - a| < \varepsilon \quad \text{and} \quad |c_n - a| < \varepsilon. \]
Thus,
\[ \liminf_{n \to \infty} a_n = \limsup_{n \to \infty} a_n = a. \]

Hence, we are left to consider sequences which do not have a finite limit.
Let us first characterize when the limit inferior and the limit superior of a sequence
can be −∞ or ∞.

Theorem 5.7
Let $\{a_n\}$ be a sequence of real numbers. Then the following three
statements are equivalent.

(a) $\limsup_{n \to \infty} a_n = \infty$.

(b) $\{a_n\}$ is not bounded above.

(c) There is a strictly increasing subsequence $\{a_{n_k}\}$ such that $\lim_{k \to \infty} a_{n_k} = \infty$.

Proof
Notice that $\limsup_{n \to \infty} a_n = \lim_{n \to \infty} c_n$, where $c_n = \sup_{k \ge n} a_k$. Since $\{c_n\}$ is a
decreasing sequence, $\lim_{n \to \infty} c_n = \infty$ if and only if $c_n = \infty$ for all $n \ge 1$.
Hence, this theorem follows from Theorem 5.1 and Proposition 5.2.

The limit inferior version of Theorem 5.7 is straightforward.

Theorem 5.8
Let $\{a_n\}$ be a sequence of real numbers. Then the following three
statements are equivalent.

(a) $\liminf_{n \to \infty} a_n = -\infty$.

(b) $\{a_n\}$ is not bounded below.

(c) There is a strictly decreasing subsequence $\{a_{n_k}\}$ such that $\lim_{k \to \infty} a_{n_k} = -\infty$.

Less trivial is the case where the limit superior is −∞, or the limit inferior is ∞.

Theorem 5.9
Let $\{a_n\}$ be a sequence of real numbers.

1. $\limsup_{n \to \infty} a_n = -\infty$ if and only if $\lim_{n \to \infty} a_n = -\infty$.

2. $\liminf_{n \to \infty} a_n = \infty$ if and only if $\lim_{n \to \infty} a_n = \infty$.

Proof
Notice that if $\limsup_{n \to \infty} a_n = -\infty$, we must have $\liminf_{n \to \infty} a_n = -\infty$.
Similarly, if $\liminf_{n \to \infty} a_n = \infty$, it is necessary that $\limsup_{n \to \infty} a_n = \infty$.
It is enough for us to prove the second statement. Let $b_n = \inf_{k \ge n} a_k$. Notice
that $b_n \le a_n$. If $\liminf_{n \to \infty} a_n = \lim_{n \to \infty} b_n = \infty$, we must have $\lim_{n \to \infty} a_n = \infty$.
Conversely, assume that $\lim_{n \to \infty} a_n = \infty$. Given $M > 0$, there is a positive
integer $N$ such that
\[ a_n \ge M \quad \text{for all } n \ge N. \]
This implies that
\[ b_n \ge M \quad \text{for all } n \ge N. \]
Hence, $\liminf_{n \to \infty} a_n = \lim_{n \to \infty} b_n = \infty$.
Chapter 5. Infinite Series of Numbers and Infinite Products 363

Example 5.7

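Consider the sequence $\{a_n\}$ with $a_n = n + (-1)^n$. The first few terms
are given by $0, 3, 2, 5, 4, 7, \ldots$. This sequence is neither increasing nor
decreasing. With $b_n = \inf_{k\ge n} a_k$ and $c_n = \sup_{k\ge n} a_k$, we have $b_1 = a_1 = 0$ and, for any $n \ge 1$,
\[ b_{2n} = b_{2n+1} = a_{2n+1} = 2n, \]
while $c_n = \infty$ for all $n \in \mathbb{Z}^+$. Therefore,
\[ \liminf_{n\to\infty} a_n = \lim_{n\to\infty} b_n = \infty. \]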

Combining Theorem 5.7, Theorem 5.8 and Theorem 5.9, we can summarize
the cases where the limit inferior or the limit superior is −∞ or ∞.

Infinities as Limit Superior or Limit Inferior


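Let $\{a_n\}$ be a sequence of real numbers.

1. $\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n = -\infty$ if and only if $\lim_{n\to\infty} a_n = -\infty$. In this
case, $\{a_n\}$ is bounded above, not bounded below.

2. $\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n = \infty$ if and only if $\lim_{n\to\infty} a_n = \infty$. In this case,
$\{a_n\}$ is bounded below, not bounded above.

3. $\liminf_{n\to\infty} a_n = -\infty$ and $\limsup_{n\to\infty} a_n = \infty$ if and only if $\{a_n\}$ is neither bounded
above nor bounded below.

4. $-\infty < \liminf_{n\to\infty} a_n < \infty$ and $\limsup_{n\to\infty} a_n = \infty$ if and only if $\{a_n\}$ is
bounded below but not bounded above, and $\{a_n\}$ does not diverge to $\infty$.

5. $\liminf_{n\to\infty} a_n = -\infty$ and $-\infty < \limsup_{n\to\infty} a_n < \infty$ if and only if $\{a_n\}$ is
bounded above but not bounded below, and $\{a_n\}$ does not diverge to $-\infty$.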

The following gives a relation of limit inferior and limit superior with limits
of subsequences.

Theorem 5.10
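Let $\{a_n\}$ be a sequence with
\[ \liminf_{n\to\infty} a_n = b \quad\text{and}\quad \limsup_{n\to\infty} a_n = c. \]
If $\{a_{n_k}\}$ is a subsequence of $\{a_n\}$ that converges to a number $\ell$, then
\[ b \le \ell \le c. \]
Here $b$ and $c$ can be $\pm\infty$.

Proof
For every positive integer $n$, let
\[ b_n = \inf_{k\ge n} a_k, \qquad c_n = \sup_{k\ge n} a_k. \]
Then
\[ b = \lim_{n\to\infty} b_n, \qquad c = \lim_{n\to\infty} c_n. \]
Given a subsequence $\{a_{n_k}\}$, $\{b_{n_k}\}$ and $\{c_{n_k}\}$ are subsequences of the
monotonic sequences $\{b_n\}$ and $\{c_n\}$. Hence, we have
\[ \lim_{k\to\infty} b_{n_k} = b, \qquad \lim_{k\to\infty} c_{n_k} = c, \]
which still holds in the extended sense. Since
\[ b_{n_k} \le a_{n_k} \le c_{n_k} \quad\text{for all } k \in \mathbb{Z}^+, \]
letting $k \to \infty$, we find that
\[ b \le \ell \le c. \]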

Now we turn to the case of finite limit superior and finite limit inferior. We
have the following equivalence.

Theorem 5.11
Let $\{a_n\}$ be a sequence of real numbers. Then the following two statements
are equivalent.

(a) $\limsup_{n\to\infty} a_n = c$ is finite.

(b) Given $\varepsilon > 0$,

(i) there exists a positive integer $N$ such that for all $n \ge N$, $a_n < c + \varepsilon$; and
(ii) for every positive integer $N$, there exists an integer $n \ge N$ such
that $a_n > c - \varepsilon$.

Proof
For a positive integer $n$, define $c_n = \sup_{k\ge n} a_k$, so that $\limsup_{n\to\infty} a_n = \lim_{n\to\infty} c_n$.
Let us first prove that (a) implies (b). Given $\varepsilon > 0$, since $\lim_{n\to\infty} c_n = c$, there
is a positive integer $N$ such that for all $n \ge N$, $|c_n - c| < \varepsilon$. If $n \ge N$, we
find that
\[ a_n \le \sup_{k\ge n} a_k = c_n < c + \varepsilon. \]
This proves (b)(i). Now given $\varepsilon > 0$, there is a positive integer $N_0$ such
that $c - \varepsilon < c_n < c + \varepsilon$ for all $n \ge N_0$. For any positive integer $N$, let
$N' = \max\{N, N_0\}$. Then $N' \ge N_0$. Hence,
\[ \sup_{k\ge N'} a_k = c_{N'} > c - \varepsilon. \]
This implies that $c - \varepsilon$ is not an upper bound of $\{a_k \mid k \ge N'\}$. Therefore,
there is an $n \ge N' \ge N$ such that $a_n > c - \varepsilon$. This proves (b)(ii).
Now we prove that (b) implies (a). Given $\varepsilon > 0$, (b)(i) implies that there is a
positive integer $N$ such that for all $n \ge N$,
\[ a_n < c + \frac{\varepsilon}{2}. \]
This implies that for all $n \ge N$,
\[ c_n \le c + \frac{\varepsilon}{2} < c + \varepsilon. \]
For any $n \ge N$, (b)(ii) implies that there is a $k \ge n$ such that $a_k > c - \varepsilon$.
Therefore, $c_n \ge a_k > c - \varepsilon$. This shows that for all $n \ge N$,
\[ c - \varepsilon < c_n < c + \varepsilon. \]
Hence, $\limsup_{n\to\infty} a_n = \lim_{n\to\infty} c_n = c$.

The limit inferior counterpart of Theorem 5.11 is the following.

Theorem 5.12
Let $\{a_n\}$ be a sequence of real numbers. Then the following two statements
are equivalent.

(a) $\liminf_{n\to\infty} a_n = b$ is finite.

(b) Given $\varepsilon > 0$,

(i) there exists a positive integer $N$ such that for all $n \ge N$, $a_n > b - \varepsilon$; and
(ii) for every positive integer $N$, there exists an integer $n \ge N$ such
that $a_n < b + \varepsilon$.

The following theorem says that the limit inferior and the limit superior are
limits of subsequences.

Theorem 5.13
Let $\{a_n\}$ be a sequence.

1. If $c = \limsup_{n\to\infty} a_n$ is finite, it is the limit of a subsequence of $\{a_n\}$.

2. If $b = \liminf_{n\to\infty} a_n$ is finite, it is the limit of a subsequence of $\{a_n\}$.

Proof
It is sufficient for us to prove the first statement. We use (b)(ii) of Theorem
5.11. Take $\varepsilon = 1$ and $N = 1$. There is an integer $n_1 \ge 1$ such that
$a_{n_1} > c - 1$. Suppose we have chosen $n_1, n_2, \ldots, n_{k-1}$ such that $n_1 < n_2 < \cdots < n_{k-1}$ and
\[ a_{n_j} > c - \frac{1}{j} \quad\text{for all } 1 \le j \le k - 1. \]
Take $\varepsilon = 1/k$ and $N = n_{k-1} + 1$. There is an $n_k \ge N > n_{k-1}$ such that
$a_{n_k} > c - 1/k$. By induction, we have constructed the subsequence $\{a_{n_k}\}$
with
\[ a_{n_k} > c - \frac{1}{k} \quad\text{for all } k \in \mathbb{Z}^+. \]
Notice that we also have $a_{n_k} \le c_{n_k}$, where $c_n = \sup_{k\ge n} a_k$. Therefore,
\[ c - \frac{1}{k} < a_{n_k} \le c_{n_k} \quad\text{for all } k \in \mathbb{Z}^+. \]
Since $\{c_{n_k}\}$ is a subsequence of $\{c_n\}$, $\lim_{k\to\infty} c_{n_k} = c$. By the squeeze theorem,
\[ \lim_{k\to\infty} a_{n_k} = c. \]

Theorem 5.7, Theorem 5.8, Theorem 5.10 and Theorem 5.13 give the following
characterization of limit superior and limit inferior.

1. The sequence $\{a_n\}$ is not bounded above if and only if $\limsup_{n\to\infty} a_n = \infty$,
if and only if there is a strictly increasing subsequence $\{a_{n_k}\}$ such that
$\lim_{k\to\infty} a_{n_k} = \infty$.

2. The sequence $\{a_n\}$ is not bounded below if and only if $\liminf_{n\to\infty} a_n = -\infty$,
if and only if there is a strictly decreasing subsequence $\{a_{n_k}\}$ such that
$\lim_{k\to\infty} a_{n_k} = -\infty$.

3. If the sequence $\{a_n\}$ is bounded above, there is a subsequence $\{a_{n_k}\}$
such that $\lim_{k\to\infty} a_{n_k} = \limsup_{n\to\infty} a_n$.

4. If the sequence $\{a_n\}$ is bounded below, there is a subsequence $\{a_{n_k}\}$
such that $\lim_{k\to\infty} a_{n_k} = \liminf_{n\to\infty} a_n$.

5. The limit of any subsequence $\{a_{n_k}\}$ must be between $\liminf_{n\to\infty} a_n$ and
$\limsup_{n\to\infty} a_n$.

Let us look at the following example.

Example 5.8

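Find $\liminf_{n\to\infty} a_n$ and $\limsup_{n\to\infty} a_n$ for the sequence $\{a_n\}$ defined by
\[ a_n = \sin\frac{2\pi n}{5}. \]

Solution
For any $n \ge 1$,
\[ a_{5n} = 0, \quad a_{5n+1} = \sin\frac{2\pi}{5}, \quad a_{5n+2} = \sin\frac{4\pi}{5}, \quad a_{5n+3} = \sin\frac{6\pi}{5}, \quad a_{5n+4} = \sin\frac{8\pi}{5}. \]
Hence, the possible limits of convergent subsequences of $\{a_n\}$ are exactly
\[ 0, \quad \sin\frac{2\pi}{5}, \quad \sin\frac{4\pi}{5}, \quad \sin\frac{6\pi}{5}, \quad \sin\frac{8\pi}{5}. \]
Since
\[ \sin\frac{8\pi}{5} < \sin\frac{6\pi}{5} < 0 < \sin\frac{4\pi}{5} < \sin\frac{2\pi}{5}, \]
we find that
\[ \liminf_{n\to\infty} a_n = \sin\frac{8\pi}{5}, \qquad \limsup_{n\to\infty} a_n = \sin\frac{2\pi}{5}. \]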
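
As a numerical sanity check on this example (a sketch, not part of the original text), the tail infimum and supremum can be approximated in Python. The helper name `tail_inf_sup` and the window length `horizon` are illustrative assumptions; because the sequence has period 5, any window of at least five consecutive terms already contains every value the tail takes.

```python
import math

def a(n):
    """The sequence a_n = sin(2*pi*n/5) from Example 5.8."""
    return math.sin(2 * math.pi * n / 5)

def tail_inf_sup(seq, n, horizon=50):
    """Approximate inf and sup of seq(k) over k >= n using k = n, ..., n + horizon - 1.

    This is only reliable for sequences (such as a periodic one) whose tail
    values all occur inside the window; `horizon` is a hypothetical cut-off.
    """
    window = [seq(k) for k in range(n, n + horizon)]
    return min(window), max(window)

for n in (1, 10, 100):
    b_n, c_n = tail_inf_sup(a, n)
    print(n, b_n, c_n)

# Values claimed in the solution, for comparison.
print(math.sin(8 * math.pi / 5), math.sin(2 * math.pi / 5))
```

For a non-periodic sequence a finite window only gives a heuristic, so the printed values should be read as evidence rather than proof.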

Exercises 5.1
Question 1
Find lim inf an and lim sup an for the sequence {an }, where
n→∞ n→∞

2n
an = .
n+1

Question 2
Find lim inf an and lim sup an for the sequence {an }.
n→∞ n→∞

2n
(a) an = (−1)n
2n + 1
2n + 1
(b) an = (−1)n
2n

Question 3
Find lim inf an and lim sup an for the sequence {an }.
n→∞ n→∞

(a) an = −2n + (−1)n n

(b) an = n − 2(−1)n n

Question 4
Find $\liminf_{n\to\infty} a_n$ and $\limsup_{n\to\infty} a_n$ for the sequence $\{a_n\}$ defined by
\[ a_n = \cos\frac{2\pi n}{9}. \]

Question 5
Prove or disprove: Given two sequences $\{a_n\}$ and $\{b_n\}$,
\[ \limsup_{n\to\infty} (a_n + b_n) = \limsup_{n\to\infty} a_n + \limsup_{n\to\infty} b_n. \]

5.2 Convergence of Series

In this section, we consider infinite series and their convergence. A series is a sum
of the form
\[ \sum_{n=1}^{\infty} a_n = a_1 + a_2 + \cdots + a_n + \cdots, \]
where $\{a_n\}_{n=1}^{\infty}$ is an infinite sequence. Sometimes a series might start with the
$n = 0$ term. Since a series is an infinite sum, we need to study whether the sum
makes sense. The natural thing to do is to define it using limits.
For convenience, we will deal with series that start with the $n = 1$ term in
this chapter. When necessary, we will explain what changes need to be made if
the series starts with the $n = 0$ term.

Definition 5.3 Convergence of Series

Given an infinite series $\sum_{n=1}^{\infty} a_n$, we define its $n$th partial sum $s_n$ by
\[ s_n = \sum_{k=1}^{n} a_k. \]
We say that the series is convergent or has a finite sum, if the sequence
$\{s_n\}$ has a finite limit. Otherwise, we say that the series is divergent. If the
series is convergent, we define its sum by
\[ \sum_{n=1}^{\infty} a_n = s = \lim_{n\to\infty} s_n = \lim_{n\to\infty} \sum_{k=1}^{n} a_k. \]

If the infinite series $\sum_{n=0}^{\infty} a_n$ starts with the $n = 0$ term, we still define its $n$th
partial sum by
\[ s_n = \sum_{k=0}^{n} a_k \quad\text{when } n \ge 0. \]

The convergence of a series $\sum_{n=1}^{\infty} a_n$ is not affected by a finite number
of terms in the series. Given a positive integer $n_0$, the series $\sum_{n=n_0}^{\infty} a_n$ is
convergent if and only if the series $\sum_{n=1}^{\infty} a_n$ is convergent. In case they are
convergent,
\[ \sum_{n=1}^{\infty} a_n - \sum_{n=n_0}^{\infty} a_n = \sum_{n=1}^{n_0-1} a_n. \]

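Example 5.9

For the series $\sum_{n=1}^{\infty} \frac{1}{2^n}$, $a_n = \frac{1}{2^n}$, and the $n$th partial sum is
\[ s_n = \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^n} = 1 - \frac{1}{2^n}. \]
Since $\lim_{n\to\infty} \frac{1}{2^n} = 0$, we find that $\lim_{n\to\infty} s_n = 1$. Hence, the series $\sum_{n=1}^{\infty} \frac{1}{2^n}$ is
convergent and
\[ \sum_{n=1}^{\infty} \frac{1}{2^n} = 1. \]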

Example 5.10 Harmonic Series

Determine whether the series $\sum_{n=1}^{\infty} \frac{1}{n}$ is convergent.

Solution
The $n$th partial sum of the series is
\[ s_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}. \]
For any positive integer $n$,
\[ s_{2n} - s_n = \frac{1}{n+1} + \cdots + \frac{1}{2n} \ge \frac{n}{2n} = \frac{1}{2}. \]
Therefore, for any positive integer $k$,
\[ s_{2^k} = \left(s_{2^k} - s_{2^{k-1}}\right) + \left(s_{2^{k-1}} - s_{2^{k-2}}\right) + \cdots + \left(s_2 - s_1\right) + s_1 \ge 1 + \frac{k}{2}. \]
This shows that the sequence $\{s_n\}$ is not bounded above. Hence, it is not
convergent. Therefore, the series $\sum_{n=1}^{\infty} \frac{1}{n}$ is divergent.
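
The bound on $s_{2^k}$ can be checked with exact rational arithmetic. The following is a small sketch, not part of the original text; the function name `harmonic` is an illustrative choice.

```python
from fractions import Fraction

def harmonic(n):
    """Exact n-th partial sum s_n = 1 + 1/2 + ... + 1/n as a Fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Compare s_{2^k} with the lower bound 1 + k/2 from the solution.
for k in range(8):
    s = harmonic(2 ** k)
    bound = 1 + Fraction(k, 2)
    print(k, float(s), float(bound), s >= bound)
```

The partial sums grow very slowly, roughly like $\ln n$, which is why the divergence is not visible from a handful of terms.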

Example 5.10 is a typical example where we determine the convergence of a
series without computing the exact value of its partial sums. In this section, we are
going to learn various strategies that can be used to do so.
From linearity of limits, we immediately deduce the following.

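Proposition 5.14 Linearity

Let $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ be convergent series. Then for any constants $\alpha$ and
$\beta$, the series $\sum_{n=1}^{\infty} (\alpha a_n + \beta b_n)$ is also convergent and
\[ \sum_{n=1}^{\infty} (\alpha a_n + \beta b_n) = \alpha \sum_{n=1}^{\infty} a_n + \beta \sum_{n=1}^{\infty} b_n. \]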

Let us look at a simple criterion that can be used to conclude that a series is
divergent.

Theorem 5.15

If a series $\sum_{n=1}^{\infty} a_n$ is convergent, then $\lim_{n\to\infty} a_n = 0$. Equivalently, if
$\lim_{n\to\infty} a_n$ does not exist or is not equal to $0$, then the series $\sum_{n=1}^{\infty} a_n$ is divergent.

Proof
We just need to prove the first statement. If $\sum_{n=1}^{\infty} a_n$ is convergent, the
sequence of partial sums $\{s_n\}$ converges to a number $s$. Notice that
$a_n = s_n - s_{n-1}$, where $s_0 = 0$ by default. Therefore,
\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} s_n - \lim_{n\to\infty} s_{n-1} = s - s = 0. \]


When one is determining the convergence of a series $\sum_{n=1}^{\infty} a_n$, it is always good
to start with checking whether the limit $\lim_{n\to\infty} a_n$ is zero.

Example 5.11

The series $\sum_{n=1}^{\infty} (-1)^n$ is divergent since the limit $\lim_{n\to\infty} (-1)^n$ does not exist.

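Example 5.12
Determine the convergence of the series
\[ \sum_{n=1}^{\infty} \left(1 + \frac{1}{n}\right)^n. \]

Solution
Since
\[ \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n = e \ne 0, \]
the series
\[ \sum_{n=1}^{\infty} \left(1 + \frac{1}{n}\right)^n \]
is divergent.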

The geometric series is a series for which we can find the partial sums explicitly.
It is useful for comparisons.

Theorem 5.16 Geometric Series

The geometric series $\sum_{n=0}^{\infty} r^n$ is convergent if and only if $|r| < 1$. Moreover,
\[ \sum_{n=0}^{\infty} r^n = \frac{1}{1-r} \quad\text{when } |r| < 1. \]

Proof
When $|r| \ge 1$, the sequence $\{r^n\}$ does not converge to $0$. Thus, the series $\sum_{n=0}^{\infty} r^n$ is divergent.
When $|r| < 1$, the $n$th partial sum is
\[ s_n = 1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1-r}. \]
In this case, $\lim_{n\to\infty} r^{n+1} = 0$, and so
\[ \lim_{n\to\infty} s_n = \lim_{n\to\infty} \frac{1 - r^{n+1}}{1-r} = \frac{1}{1-r}. \]
Hence, the series $\sum_{n=0}^{\infty} r^n$ is convergent when $|r| < 1$, and
\[ \sum_{n=0}^{\infty} r^n = \frac{1}{1-r} \quad\text{when } |r| < 1. \]

A general geometric series is a series of the form $\sum_{n=1}^{\infty} a r^{n-1}$, where $a \ne 0$ is
the first term of the series. It is convergent if and only if $|r| < 1$.
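
To see the closed form for the partial sums at work, here is a minimal Python sketch (not part of the original text). The function name `geometric_partial_sum` and the sample ratios are illustrative assumptions.

```python
def geometric_partial_sum(r, n):
    """Return s_n = 1 + r + r^2 + ... + r^n by direct summation."""
    total, term = 0.0, 1.0
    for _ in range(n + 1):
        total += term
        term *= r
    return total

for r in (0.5, -0.9, 0.99):
    n = 2000
    direct = geometric_partial_sum(r, n)
    closed_form = (1 - r ** (n + 1)) / (1 - r)   # formula from the proof
    limit = 1 / (1 - r)                          # sum of the infinite series
    print(r, direct, closed_form, limit)
```

For ratios close to 1 in absolute value, many more terms are needed before the partial sum is close to $1/(1-r)$.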


If all the terms $a_n$ in the series $\sum_{n=1}^{\infty} a_n$ are nonnegative, we notice that the
partial sums $\{s_n\}$ form an increasing sequence. For an increasing sequence, we
have the monotone convergence theorem. Applying it to the sequence of partial
sums, we have the following.

Theorem 5.17

If $a_n \ge 0$ for all $n \ge 1$, then the series $\sum_{n=1}^{\infty} a_n$ is convergent if and only if
the sequence of partial sums $\{s_n\}$ is bounded above.

In practice, it is sufficient that there is a positive integer $N$ so that $a_n \ge 0$ for
all $n \ge N$. In Example 5.10, we have used this criterion to show that the harmonic
series $\sum_{n=1}^{\infty} \frac{1}{n}$ is divergent.
Besides the geometric series, a series that is useful for comparisons is the
$p$-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$. When $p \le 0$, this series is not convergent since $\lim_{n\to\infty} \frac{1}{n^p} \ne 0$.
Hence, we will concentrate on the case where $p > 0$. To determine the
convergence of this series, a convenient tool is the integral test.

Theorem 5.18 Integral Test

Suppose that $f : [1, \infty) \to \mathbb{R}$ is a function that satisfies the following
conditions.

(i) $f$ is continuous.

(ii) $f$ decreases to $0$ monotonically.

For $n \ge 1$, let $a_n = f(n)$. Then the series $\sum_{n=1}^{\infty} a_n$ is convergent if and only
if the improper integral $\int_1^{\infty} f(x)\,dx$ is convergent.

Proof
Since $f(x)$ decreases to $0$ monotonically, $f(x) \ge 0$ for all $x \ge 1$, and so
$\{a_n\}$ is a nonnegative decreasing sequence with $\lim_{n\to\infty} a_n = 0$. Let $s_n = a_1 + a_2 + \cdots + a_n$
be the $n$th partial sum of the series, and let $F(x) = \int_1^x f(u)\,du$
when $x \ge 1$.
Given a positive integer $n$, since $f$ is decreasing, we find that
\[ f(n+1) \le f(x) \le f(n) \quad\text{for } n \le x \le n+1. \]
This implies that
\[ a_{n+1} \le \int_n^{n+1} f(x)\,dx \le a_n. \]
Therefore, when $n \ge 2$,
\[ \int_1^2 f(x)\,dx + \cdots + \int_{n-1}^{n} f(x)\,dx + \int_n^{n+1} f(x)\,dx \le s_n \le a_1 + \int_1^2 f(x)\,dx + \cdots + \int_{n-1}^{n} f(x)\,dx. \]
This gives
\[ F(n+1) \le s_n \le a_1 + F(n). \tag{5.1} \]
If the improper integral $\int_1^{\infty} f(x)\,dx$ is convergent, $\{F(n)\}$ is bounded
above by a number $M$. Therefore,
\[ s_n \le a_1 + M \quad\text{for all } n \ge 1, \]
and so the sequence $\{s_n\}$ is bounded above. Therefore, the series $\sum_{n=1}^{\infty} a_n$ is
convergent.
If the improper integral $\int_1^{\infty} f(x)\,dx$ is divergent, $\lim_{n\to\infty} F(n+1) = \infty$. By
(5.1), we find that the sequence $\{s_n\}$ is not bounded above. Thus, the series
$\sum_{n=1}^{\infty} a_n$ is divergent.

Let us now use the integral test to determine the convergence of the p-series.

Theorem 5.19 p-Series

Let $p$ be a positive number. The $p$-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ is convergent if and only if
$p > 1$.

Figure 5.1: The integral test.

Proof
Define the function $f : [1, \infty) \to \mathbb{R}$ by
\[ f(x) = \frac{1}{x^p}. \]
Then $f$ is a continuous function that decreases monotonically to $0$. By
Example 4.30, the improper integral $\int_1^{\infty} \frac{1}{x^p}\,dx$ is convergent if and only if
$p > 1$. By the integral test, the series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ is convergent if and only if $p > 1$.

Example 5.13

The series $\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}$ is divergent since it is a $p$-series with $p = \frac{1}{2} \le 1$.

Remark 5.1 Integral Approximation to Partial Sums

Given that $f : [1, \infty) \to \mathbb{R}$ is a continuous function that monotonically
decreases to $0$, let $a_k = f(k)$ and
\[ s_n = \sum_{k=1}^{n} f(k), \qquad t_n = \int_1^n f(x)\,dx. \]
From the proof of the integral test, we have
\[ t_n + f(n+1) \le s_n \le a_1 + t_n. \]
This implies that
\[ f(n+1) \le s_n - t_n \le a_1, \]
which gives bounds for the error when the partial sum $s_n = \sum_{k=1}^{n} a_k$ is
approximated by the integral $\int_1^n f(x)\,dx$. When the improper integral
$\int_1^{\infty} f(x)\,dx$ is convergent, the sum $\sum_{n=1}^{\infty} a_n$ is also convergent. In this case,
the sum of the infinite series satisfies
\[ \int_1^{\infty} f(x)\,dx \le \sum_{n=1}^{\infty} a_n \le \int_1^{\infty} f(x)\,dx + a_1. \]
If we use $s_n$ to approximate the sum $s = \sum_{n=1}^{\infty} a_n$, the error is
\[ s - s_n = \sum_{k=n+1}^{\infty} a_k. \]
The same reasoning shows that if $n \ge 1$,
\[ \int_{n+1}^{\infty} f(x)\,dx \le s - s_n \le \int_n^{\infty} f(x)\,dx. \]
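
The tail estimate can be tried out on $f(x) = 1/x^2$, for which $\int_n^{\infty} f(x)\,dx = 1/n$. The sketch below is not part of the original text; it assumes the classical value $\sum_{n=1}^{\infty} 1/n^2 = \pi^2/6$, which is not derived in this section, and the function name `partial_sum` is an illustrative choice.

```python
import math

def partial_sum(n):
    """s_n = sum of 1/k^2 for k = 1, ..., n."""
    return sum(1.0 / k ** 2 for k in range(1, n + 1))

# Assumption: the exact sum pi^2/6 is a known result quoted from outside this section.
s = math.pi ** 2 / 6

for n in (10, 100, 1000):
    error = s - partial_sum(n)
    lower = 1.0 / (n + 1)   # integral of 1/x^2 from n+1 to infinity
    upper = 1.0 / n         # integral of 1/x^2 from n to infinity
    print(n, lower, error, upper)
```

The printed rows show the error $s - s_n$ sitting between the two integrals, so roughly $n$ terms are needed for an error of size $1/n$.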

Example 5.14 Euler's Constant

We can prove that the limit
\[ \lim_{n\to\infty} \left(1 + \frac{1}{2} + \cdots + \frac{1}{n} - \ln n\right) \]
exists as follows. Let
\[ c_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \ln n. \]
Then
\[ \ln n = \int_1^n \frac{1}{x}\,dx = \int_1^2 \frac{1}{x}\,dx + \cdots + \int_{n-1}^{n} \frac{1}{x}\,dx \le 1 + \frac{1}{2} + \cdots + \frac{1}{n-1}. \]
Therefore, $c_n \ge 0$ for all $n \ge 1$. On the other hand,
\[ c_{n+1} - c_n = \frac{1}{n+1} - \ln(n+1) + \ln n = \frac{1}{n+1} - \int_n^{n+1} \frac{1}{x}\,dx \le 0. \]
Hence, $\{c_n\}$ is a decreasing sequence that is bounded below by $0$. By the
monotone convergence theorem, $\{c_n\}$ converges to a limit $\gamma$. This number
\[ \gamma = \lim_{n\to\infty} \left(1 + \frac{1}{2} + \cdots + \frac{1}{n} - \ln n\right) \]
is called the Euler-Mascheroni constant, or simply Euler's constant. It is
an important constant in mathematics. Numerically, it is equal to

0.577215664901532

correct to 15 decimal places.
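
The monotone convergence of $\{c_n\}$ can be observed numerically. This is a rough sketch, not part of the original text; the function name `c` mirrors the notation of the example, and plain floating-point summation is assumed to be accurate enough for the sizes of $n$ used here.

```python
import math

def c(n):
    """c_n = 1 + 1/2 + ... + 1/n - ln(n), computed in floating point."""
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n)

for n in (1, 10, 100, 1000, 10000):
    print(n, c(n))
```

The values decrease toward the quoted constant, but slowly: the gap $c_n - \gamma$ behaves roughly like $1/(2n)$, so this direct approach is not a practical way to obtain many digits.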

Now we return to the comparison test. Using Theorem 5.17, we obtain the
following test for nonnegative series.

Theorem 5.20 Comparison Test

Let $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ be two series satisfying
\[ 0 \le a_n \le b_n \quad\text{for all } n \ge 1. \]

1. If $\sum_{n=1}^{\infty} b_n$ is convergent, $\sum_{n=1}^{\infty} a_n$ is convergent.

2. If $\sum_{n=1}^{\infty} a_n$ is divergent, $\sum_{n=1}^{\infty} b_n$ is divergent.

Proof
Let $s_n = a_1 + \cdots + a_n$ and $t_n = b_1 + \cdots + b_n$ be respectively the $n$th partial
sums of the series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$. Then $\{s_n\}$ and $\{t_n\}$ are increasing
sequences and
\[ s_n \le t_n. \]

1. If $\sum_{n=1}^{\infty} b_n$ is convergent, the sequence $\{t_n\}$ is bounded above. Then the
sequence $\{s_n\}$ is also bounded above. Hence, $\sum_{n=1}^{\infty} a_n$ is convergent.

2. If $\sum_{n=1}^{\infty} a_n$ is divergent, the sequence $\{s_n\}$ is not bounded above. Then the
sequence $\{t_n\}$ is also not bounded above. Hence, $\sum_{n=1}^{\infty} b_n$ is divergent.

Example 5.15
Determine the convergence of the series
\[ \sum_{n=1}^{\infty} \frac{2^n}{3^n - 1}. \]

Solution
For $n \ge 1$,
\[ 3^n - 1 \ge \frac{1}{2} \times 3^n. \]
Therefore,
\[ \frac{2^n}{3^n - 1} \le \frac{2^{n+1}}{3^n}. \]
Since the series
\[ \sum_{n=1}^{\infty} \frac{2^{n+1}}{3^n} = 2 \sum_{n=1}^{\infty} \frac{2^n}{3^n} \]
is a geometric series with $r = 2/3$, it is convergent. By the comparison test,
the series $\sum_{n=1}^{\infty} \frac{2^n}{3^n - 1}$ is convergent.

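Example 5.16
Determine the convergence of the series
\[ \sum_{n=1}^{\infty} \frac{n}{n\sqrt{n} + 1}. \]

Solution
For $n \ge 1$,
\[ \frac{n}{n\sqrt{n} + 1} \ge \frac{n}{n\sqrt{n} + n\sqrt{n}} = \frac{1}{2\sqrt{n}}. \]
Since the series $\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}$ is a $p$-series with $p = 1/2 \le 1$, it is divergent.
So the series $\sum_{n=1}^{\infty} \frac{1}{2\sqrt{n}}$ is also divergent. By the comparison test, the series
$\sum_{n=1}^{\infty} \frac{n}{n\sqrt{n} + 1}$ is divergent.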

In applying the comparison test, we need to identify the correct series to
compare to, and prove some explicit inequalities. In Example 5.15, we compare
$a_n = \frac{2^n}{3^n - 1}$ to $b_n = \frac{2^n}{3^n}$, since $2^n$ and $3^n$ are the leading terms of the numerator
and the denominator of $a_n$ when $n$ is large. Since we know that the series $\sum_{n=1}^{\infty} \frac{2^n}{3^n}$
is convergent, we need to prove that $a_n$ is, up to a constant, less than or equal to
$b_n$, in order to use the comparison test to conclude that $\sum_{n=1}^{\infty} a_n$ is convergent.
Similarly, for Example 5.16, we compare $a_n = \frac{n}{n\sqrt{n} + 1}$ to $b_n = \frac{n}{n\sqrt{n}}$, since
$n$ and $n\sqrt{n}$ are respectively the leading terms of the numerator and denominator
of $a_n$. Since $\sum_{n=1}^{\infty} b_n$ is divergent, we want to conclude that $\sum_{n=1}^{\infty} a_n$ is divergent.
For this, we need to show that $a_n$ is larger than a constant times $b_n$.
Proving such inequalities is tedious, and we see that it might not be necessary.
In fact, we obtain the series to compare to by investigating the leading terms,
which is in essence a limiting process. Hence, we can replace the comparison test
by the limit comparison test.

Theorem 5.21 Limit Comparison Test

Suppose that the two series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ satisfy the following conditions.

(i) $a_n \ge 0$ and $b_n > 0$ for all $n \in \mathbb{Z}^+$.

(ii) The limit $L = \lim_{n\to\infty} \frac{a_n}{b_n}$ exists and is finite.

Since $\frac{a_n}{b_n} \ge 0$, we must have $L \ge 0$.

1. If $L = 0$, and the series $\sum_{n=1}^{\infty} b_n$ is convergent, then the series $\sum_{n=1}^{\infty} a_n$ is
convergent.

2. If $L > 0$, the series $\sum_{n=1}^{\infty} a_n$ is convergent if and only if the series $\sum_{n=1}^{\infty} b_n$
is convergent.

The condition (ii) says that when $n$ is large, $a_n$ is smaller than or equal to a
multiple of $b_n$.

Proof
First consider the case where $L = 0$. By definition of limit with $\varepsilon = 1$,
there is a positive integer $N$ such that for all $n \ge N$,
\[ \frac{a_n}{b_n} < 1. \]
Therefore,
\[ 0 \le a_n \le b_n \quad\text{for all } n \ge N. \]
Since the series $\sum_{n=1}^{\infty} b_n$ is convergent, the series $\sum_{n=N}^{\infty} b_n$ is convergent. Then
the comparison test implies that the series $\sum_{n=N}^{\infty} a_n$ is convergent. Therefore, the
series $\sum_{n=1}^{\infty} a_n$ is convergent.
Now for the case $L > 0$, take $\varepsilon = L/2$. There is a positive integer $N$ such
that for all $n \ge N$,
\[ \left|\frac{a_n}{b_n} - L\right| < \frac{L}{2}. \]
This implies that
\[ 0 \le \frac{L}{2} b_n \le a_n \le \frac{3L}{2} b_n \quad\text{for all } n \ge N. \]
The comparison test then shows that the series $\sum_{n=1}^{\infty} a_n$ is convergent if and only
if the series $\sum_{n=1}^{\infty} b_n$ is convergent.

Example 5.17 Example 5.15 Revisited

For the series $\sum_{n=1}^{\infty} \frac{2^n}{3^n - 1}$ considered in Example 5.15, we take $a_n = \frac{2^n}{3^n - 1}$
and $b_n = \frac{2^n}{3^n}$. Then
\[ \lim_{n\to\infty} \frac{a_n}{b_n} = \lim_{n\to\infty} \frac{1}{1 - \dfrac{1}{3^n}} = 1. \]
Since the series $\sum_{n=1}^{\infty} \frac{2^n}{3^n}$ is convergent, the series $\sum_{n=1}^{\infty} \frac{2^n}{3^n - 1}$ is convergent.

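Example 5.18 Example 5.16 Revisited

For the series $\sum_{n=1}^{\infty} \frac{n}{n\sqrt{n} + 1}$ considered in Example 5.16, we take $a_n = \frac{n}{n\sqrt{n} + 1}$
and $b_n = \frac{1}{\sqrt{n}}$. Then
\[ \lim_{n\to\infty} \frac{a_n}{b_n} = \lim_{n\to\infty} \frac{1}{1 + \dfrac{1}{n\sqrt{n}}} = 1. \]
Since the series $\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}$ is divergent, the series $\sum_{n=1}^{\infty} \frac{n}{n\sqrt{n} + 1}$ is divergent.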
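
The two limits used in Examples 5.17 and 5.18 can be checked numerically. The sketch below is not part of the original text; the function names `ratio_515` and `ratio_516` are illustrative, and each returns $a_n / b_n$ for the corresponding example.

```python
import math

def ratio_515(n):
    """a_n / b_n with a_n = 2^n / (3^n - 1) and b_n = 2^n / 3^n."""
    a_n = 2 ** n / (3 ** n - 1)
    b_n = 2 ** n / 3 ** n
    return a_n / b_n

def ratio_516(n):
    """a_n / b_n with a_n = n / (n*sqrt(n) + 1) and b_n = 1 / sqrt(n)."""
    a_n = n / (n * math.sqrt(n) + 1)
    b_n = 1 / math.sqrt(n)
    return a_n / b_n

for n in (1, 5, 10, 50):
    print(n, ratio_515(n), ratio_516(n))
```

Both columns approach 1, the first much faster than the second, which matches the sizes of the correction terms $1/3^n$ and $1/(n\sqrt{n})$.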

Let us now turn to series that can have negative terms. First we formulate
a Cauchy criterion for convergence of series. Recall that a sequence $\{s_n\}$ is a
Cauchy sequence if for every $\varepsilon > 0$, there is a positive integer $N$ so that for all
$m \ge n \ge N$,
\[ |s_m - s_n| < \varepsilon. \]
Applying the Cauchy criterion for convergence of sequences (see Theorem 1.43),
and the fact that if $m \ge n > 1$,
\[ s_m - s_{n-1} = a_n + a_{n+1} + \cdots + a_m, \]
we obtain the following Cauchy criterion for convergence of infinite series.

Theorem 5.22 Cauchy Criterion for Infinite Series

An infinite series $\sum_{n=1}^{\infty} a_n$ is convergent if and only if for every $\varepsilon > 0$, there
is a positive integer $N$ such that for all $m \ge n \ge N$,
\[ |a_n + a_{n+1} + \cdots + a_m| < \varepsilon. \]

Using this, we can prove the following.

Theorem 5.23

If the series $\sum_{n=1}^{\infty} |a_n|$ is convergent, then the series $\sum_{n=1}^{\infty} a_n$ is convergent.

Proof
Given $\varepsilon > 0$, since the series $\sum_{n=1}^{\infty} |a_n|$ is convergent, there is a positive
integer $N$ such that for all $m \ge n \ge N$,
\[ |a_n| + |a_{n+1}| + \cdots + |a_m| = \bigl|\,|a_n| + |a_{n+1}| + \cdots + |a_m|\,\bigr| < \varepsilon. \]
By the triangle inequality, we find that for $m \ge n \ge N$,
\[ |a_n + a_{n+1} + \cdots + a_m| \le |a_n| + |a_{n+1}| + \cdots + |a_m| < \varepsilon. \]
Using the Cauchy criterion, we conclude that the series $\sum_{n=1}^{\infty} a_n$ is convergent.


The converse of Theorem 5.23 is not true. Namely, there exist series $\sum_{n=1}^{\infty} a_n$
which are convergent but for which the corresponding absolute series $\sum_{n=1}^{\infty} |a_n|$ is not convergent.
Therefore, let us make the following definitions.

Definition 5.4 Absolute Convergence and Conditional Convergence

Suppose that the series $\sum_{n=1}^{\infty} a_n$ is convergent.

1. We say that the series $\sum_{n=1}^{\infty} a_n$ converges absolutely if the series $\sum_{n=1}^{\infty} |a_n|$
is convergent.

2. We say that the series $\sum_{n=1}^{\infty} a_n$ converges conditionally if the series
$\sum_{n=1}^{\infty} |a_n|$ is divergent.

Example 5.19

If $p > 1$, the series $\sum_{n=1}^{\infty} \frac{(-1)^n}{n^p}$ converges absolutely.

From the limit comparison test, we have the following.

Theorem 5.24 Limit Comparison Test II

Given the series $\sum_{n=1}^{\infty} a_n$, assume that there is a series $\sum_{n=1}^{\infty} b_n$ such that $b_n > 0$
for all $n \in \mathbb{Z}^+$, and the limit $L = \lim_{n\to\infty} \frac{|a_n|}{b_n}$ exists and is finite. If the series
$\sum_{n=1}^{\infty} b_n$ is convergent, then the series $\sum_{n=1}^{\infty} a_n$ converges absolutely.

Example 5.20
Show that the series
\[ \sum_{n=1}^{\infty} \frac{2^n + (-1)^n 3^n}{5^n + 1} \]
is convergent.

Solution
Let
\[ a_n = \frac{2^n + (-1)^n 3^n}{5^n + 1}, \qquad b_n = \frac{3^n}{5^n}. \]
Then $b_n > 0$ for all $n \in \mathbb{Z}^+$ and $\sum_{n=1}^{\infty} b_n$ is convergent. Now,
\[ \lim_{n\to\infty} \frac{|a_n|}{b_n} = \lim_{n\to\infty} \frac{1 + (-1)^n \left(\dfrac{2}{3}\right)^n}{1 + \dfrac{1}{5^n}} = 1. \]
Therefore, the series $\sum_{n=1}^{\infty} \frac{2^n + (-1)^n 3^n}{5^n + 1}$ converges absolutely, and thus is
convergent.

To give an example of a series that converges conditionally, let us discuss a
convergence test for a special class of series called alternating series.

Definition 5.5 Alternating Series

A series of the form
\[ \sum_{n=1}^{\infty} (-1)^{n-1} b_n = b_1 - b_2 + b_3 - b_4 + \cdots + b_{2n-1} - b_{2n} + \cdots, \]
where $b_n \ge 0$ for all $n \ge 1$, is called an alternating series.

Example 5.21
The series
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]
is an alternating series.


A necessary condition for an alternating series $\sum_{n=1}^{\infty} (-1)^{n-1} b_n$ to be convergent
is $\lim_{n\to\infty} b_n = 0$. The following theorem says that if $\{b_n\}$ is also decreasing, then
the alternating series is convergent.

Theorem 5.25 Alternating Series Test

If $\{b_n\}$ is a monotonically decreasing sequence with $\lim_{n\to\infty} b_n = 0$, the
alternating series $\sum_{n=1}^{\infty} (-1)^{n-1} b_n$ is convergent.

Proof
Since $\{b_n\}$ decreases monotonically to $0$, $b_n \ge 0$ for all $n \in \mathbb{Z}^+$. Let
$a_n = (-1)^{n-1} b_n$ be the $n$th term of the series $\sum_{n=1}^{\infty} (-1)^{n-1} b_n$, and let
$s_n = a_1 + a_2 + \cdots + a_n$ be the $n$th partial sum. We are given that
\[ b_1 \ge b_2 \ge \cdots \ge b_n \ge b_{n+1} \ge \cdots. \]
Therefore,
\[ s_{2n+1} = s_{2n-1} + a_{2n} + a_{2n+1} = s_{2n-1} - (b_{2n} - b_{2n+1}) \le s_{2n-1}, \]
\[ s_{2n+2} = s_{2n} + a_{2n+1} + a_{2n+2} = s_{2n} + (b_{2n+1} - b_{2n+2}) \ge s_{2n}. \]
This shows that $\{s_{2n-1}\}$ is a decreasing sequence and $\{s_{2n}\}$ is an increasing
sequence. Since
\[ s_{2n} = s_{2n-1} - b_{2n}, \]
we find that
\[ s_2 \le s_{2n} \le s_{2n-1} \le s_1. \]
Namely, the sequence $\{s_{2n-1}\}$ is bounded below by $s_2$, while the sequence
$\{s_{2n}\}$ is bounded above by $s_1$. By the monotone convergence theorem, the
limits
\[ s_o = \lim_{n\to\infty} s_{2n-1} \quad\text{and}\quad s_e = \lim_{n\to\infty} s_{2n} \]
exist. Since
\[ -b_{2n} = a_{2n} = s_{2n} - s_{2n-1}, \]
taking the limit $n \to \infty$ and using $\lim_{n\to\infty} b_n = 0$ gives
\[ s_o = s_e. \]
This proves that the sequence $\{s_n\}$ has a limit $s = s_o = s_e$, and thus the
alternating series $\sum_{n=1}^{\infty} (-1)^{n-1} b_n$ is convergent.


Notice that the sum of the alternating series $s = \sum_{n=1}^{\infty} (-1)^{n-1} b_n$ is the least
upper bound of $\{s_{2n}\}$, and the greatest lower bound of $\{s_{2n-1}\}$.

Remark 5.2 Approximating the Sum of an Alternating Series

If $\{b_n\}$ is a sequence that decreases monotonically to $0$, the alternating
series $\sum_{n=1}^{\infty} (-1)^{n-1} b_n$ converges to a sum $s$. If
\[ s_n = \sum_{k=1}^{n} (-1)^{k-1} b_k \]
is the $n$th partial sum, then the error in approximating $s$ by $s_n$ is
\[ s - s_n = \sum_{k=n+1}^{\infty} (-1)^{k-1} b_k, \]
which is also an alternating series. From the proof of Theorem 5.25, we
obtain a simple estimate
\[ |s - s_n| \le |b_{n+1}|. \]

Example 5.22
For the alternating series
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]
in Example 5.21, $b_n = \frac{1}{n}$. Since $\{b_n\}$ decreases monotonically to $0$, by the
alternating series test, the series
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]
is convergent. Since the harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ is divergent, the series
$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$ converges conditionally.
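
The error estimate $|s - s_n| \le b_{n+1}$ for an alternating series can be illustrated on this series. The sketch below is not part of the original text; it assumes the classical value $\sum_{n=1}^{\infty} (-1)^{n-1}/n = \ln 2$, which is not proved in this section, and the function name `alt_partial_sum` is an illustrative choice.

```python
import math

def alt_partial_sum(n):
    """s_n = 1 - 1/2 + 1/3 - ... + (-1)^(n-1)/n."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))

# Assumption: the sum of the alternating harmonic series equals ln 2 (quoted, not derived here).
s = math.log(2)

for n in (1, 10, 100, 1000):
    error = abs(s - alt_partial_sum(n))
    bound = 1 / (n + 1)   # b_{n+1}
    print(n, error, bound, error <= bound)
```

The actual error is about half of the bound, so the estimate is simple but reasonably sharp for this series.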

Example 5.23

For any $0 < p \le 1$, the sequence $\{1/n^p\}$ decreases to $0$ monotonically.
Hence, the alternating series $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^p}$ is convergent. Since the series
$\sum_{n=1}^{\infty} \frac{1}{n^p}$ is divergent, the series $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^p}$ converges conditionally.

Now we turn to two useful tests that are used for testing convergence of power
series. They are both based on comparisons with geometric series. We first prove the
following.

Theorem 5.26
Let $\{a_n\}$ be a sequence of positive numbers. Then
\[ \liminf_{n\to\infty} \frac{a_{n+1}}{a_n} \le \liminf_{n\to\infty} \sqrt[n]{a_n} \le \limsup_{n\to\infty} \sqrt[n]{a_n} \le \limsup_{n\to\infty} \frac{a_{n+1}}{a_n}. \]
Hence, if the limit $\lim_{n\to\infty} \frac{a_{n+1}}{a_n}$ exists, the limit $\lim_{n\to\infty} \sqrt[n]{a_n}$ also exists, and the
two limits are equal.

Since $a_n > 0$ for all $n \in \mathbb{Z}^+$, all the four limits in the theorem are nonnegative.

Proof
If $\{c_n\}$ is a sequence of positive numbers, it is easy to verify that
\[ \sup\left\{\frac{1}{c_n}\right\} = \frac{1}{\inf\{c_n\}}, \qquad \inf\left\{\frac{1}{c_n}\right\} = \frac{1}{\sup\{c_n\}}. \]
Therefore, if we prove that
\[ \limsup_{n\to\infty} \sqrt[n]{a_n} \le \limsup_{n\to\infty} \frac{a_{n+1}}{a_n}, \tag{5.2} \]
then
\[ \liminf_{n\to\infty} \frac{a_{n+1}}{a_n} \le \liminf_{n\to\infty} \sqrt[n]{a_n} \]
follows by applying (5.2) to the reciprocal sequence $\{1/a_n\}$. From
Proposition 5.23, we have the inequality $\liminf_{n\to\infty} \sqrt[n]{a_n} \le \limsup_{n\to\infty} \sqrt[n]{a_n}$.
Hence, we only need to prove (5.2).
If $\limsup_{n\to\infty} \frac{a_{n+1}}{a_n} = \infty$, there is nothing to prove. Hence, we consider the
case where
\[ u = \limsup_{n\to\infty} \frac{a_{n+1}}{a_n} \]
is finite. Given $\varepsilon > 0$, there is a positive integer $N$ such that
\[ \frac{a_{n+1}}{a_n} < u + \varepsilon \quad\text{for all } n \ge N. \]
By induction, we find that
\[ a_n \le a_N (u + \varepsilon)^{n-N} \quad\text{for all } n \ge N. \]
Let $c = a_N (u + \varepsilon)^{-N}$. Then
\[ \sqrt[n]{a_n} \le c^{1/n} (u + \varepsilon) \quad\text{for all } n \ge N. \]
This implies that
\[ \limsup_{n\to\infty} \sqrt[n]{a_n} \le \limsup_{n\to\infty} c^{1/n} (u + \varepsilon) = (u + \varepsilon) \lim_{n\to\infty} c^{1/n} = u + \varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, we conclude that
\[ \limsup_{n\to\infty} \sqrt[n]{a_n} \le u = \limsup_{n\to\infty} \frac{a_{n+1}}{a_n}. \]
This completes the proof of the theorem.
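
The inequalities between the ratio limits and the root limits can be strict. As an illustration that is not part of the original text, consider the hypothetical sequence $a_n = 2^{\,n + (-1)^n}$: its consecutive ratios alternate between $8$ and $1/2$, while its $n$th roots tend to $2$. A short Python sketch:

```python
def a(n):
    """Hypothetical test sequence a_n = 2^(n + (-1)^n), chosen only for illustration."""
    return 2.0 ** (n + (-1) ** n)

# Consecutive ratios a_{n+1}/a_n and n-th roots a_n^(1/n) for n = 1, ..., 199.
ratios = [a(n + 1) / a(n) for n in range(1, 200)]
roots = [a(n) ** (1.0 / n) for n in range(1, 200)]

print(min(ratios), max(ratios))   # the two values the ratio oscillates between
print(roots[-5:])                 # the roots settle near 2
```

Here the ratio limit does not exist, yet the root limit does, which is why the root test can succeed where the ratio test gives no information.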

Now we come to the proof of the root test.



Theorem 5.27 Root Test

Given a series $\sum_{n=1}^{\infty} a_n$, let
\[ \rho = \limsup_{n\to\infty} \sqrt[n]{|a_n|}. \]

1. If $\rho < 1$, the series $\sum_{n=1}^{\infty} a_n$ converges absolutely.

2. If $\rho > 1$, the series $\sum_{n=1}^{\infty} a_n$ is divergent.

3. If $\rho = 1$, the test is inconclusive.

Proof
If $\rho < 1$, take $\varepsilon = \frac{1-\rho}{2}$ in (b)(i) of Theorem 5.11. There is a positive
integer $N$ such that
\[ \sqrt[n]{|a_n|} < \rho + \varepsilon = \rho_1 \quad\text{for all } n \ge N. \]
Thus, we have
\[ |a_n| < \rho_1^n \quad\text{for all } n \ge N. \]
Notice that
\[ \rho_1 = \frac{1+\rho}{2} < 1. \]
Therefore, the geometric series $\sum_{n=N}^{\infty} \rho_1^n$ is convergent. By the comparison
test, the series $\sum_{n=1}^{\infty} |a_n|$ is convergent. Thus, the series $\sum_{n=1}^{\infty} a_n$ converges
absolutely.
If $\rho > 1$, take $\varepsilon = \frac{\rho-1}{2}$ in (b)(ii) of Theorem 5.11. There are positive
integers $n_1, n_2, \ldots$ such that $1 \le n_1 < n_2 < \cdots$ and
\[ \sqrt[n_k]{|a_{n_k}|} > \rho - \varepsilon = \rho_2 \quad\text{for all } k \in \mathbb{Z}^+. \]
Thus, we have
\[ |a_{n_k}| > \rho_2^{n_k} \quad\text{for all } k \in \mathbb{Z}^+. \tag{5.3} \]
Since
\[ \rho_2 = \frac{1+\rho}{2} > 1, \]
and $n_k \to \infty$ as $k \to \infty$, we find that $\lim_{k\to\infty} \rho_2^{n_k} = \infty$. In other words, the
sequence $\{\rho_2^{n_k}\}$ is not bounded above. Eq. (5.3) then implies that $\{|a_{n_k}|\}$
is also not bounded above. Therefore, the limit $\lim_{n\to\infty} a_n$ is not zero. Hence,
the series $\sum_{n=1}^{\infty} a_n$ is divergent.
Now, let us look at some examples where $\rho = 1$. First, notice that
\[ \lim_{n\to\infty} \sqrt[n]{n} = \lim_{n\to\infty} \exp\left(\frac{\ln n}{n}\right) = \exp\left(\lim_{x\to\infty} \frac{\ln x}{x}\right) = e^0 = 1. \tag{5.4} \]
For the $p$-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$, $a_n = n^{-p}$. Thus,
\[ \rho = \lim_{n\to\infty} \sqrt[n]{n^{-p}} = \lim_{n\to\infty} \left(\sqrt[n]{n}\right)^{-p} = 1. \]
But we have seen that the $p$-series is divergent if $p \le 1$, and it is convergent
when $p > 1$. This shows that the root test is inconclusive when $\rho = 1$.

Example 5.24
Determine the convergence of the series $\sum_{n=1}^{\infty} \left(\frac{1-n}{2n+1}\right)^n$.

Solution
Applying the root test,
\[ \rho = \limsup_{n\to\infty} \sqrt[n]{\left|\frac{1-n}{2n+1}\right|^n} = \lim_{n\to\infty} \frac{n-1}{2n+1} = \frac{1}{2}. \]
Since $\rho < 1$, we find that the series is convergent.

Finally, we have the ratio test.

Theorem 5.28 Ratio Test

Given a series $\sum_{n=1}^{\infty} a_n$ with $a_n \ne 0$ for all $n \in \mathbb{Z}^+$, let
\[ r = \liminf_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right|, \qquad R = \limsup_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right|. \]

1. If $R < 1$, the series $\sum_{n=1}^{\infty} a_n$ converges absolutely.

2. If $r > 1$, the series $\sum_{n=1}^{\infty} a_n$ is divergent.

3. If $r \le 1 \le R$, the test is inconclusive.

Proof
If $R < 1$, Theorem 5.26 implies that $\rho = \limsup_{n\to\infty} \sqrt[n]{|a_n|} \le R < 1$. Theorem
5.27 implies that $\sum_{n=1}^{\infty} a_n$ converges absolutely.
If $r > 1$, Theorem 5.26 implies that $\rho = \limsup_{n\to\infty} \sqrt[n]{|a_n|} \ge r > 1$. Theorem 5.27
implies that $\sum_{n=1}^{\infty} a_n$ is divergent.
The $p$-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ provides examples of $r = R = 1$, but the series is
convergent if $p > 1$, divergent when $p \le 1$. Hence, the ratio test is also
inconclusive when $r \le 1 \le R$.

The ratio test is useful for determining the convergence of power series. We are going
to study this in Chapter 6.

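Example 5.25
Determine whether the series is convergent.

(a) $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{2^n}{n+1}$

(b) $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{n+1}{2^n}$

Solution
(a) Using the ratio test with $a_n = (-1)^{n-1} \frac{2^n}{n+1}$, we find that
\[ r = R = \lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{2^{n+1}}{n+2} \times \frac{n+1}{2^n} = 2 \lim_{n\to\infty} \frac{n+1}{n+2} = 2. \]
Therefore, the series $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{2^n}{n+1}$ is divergent.

(b) Using the ratio test with $a_n = (-1)^{n-1} \frac{n+1}{2^n}$, we find that
\[ r = R = \lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{n+2}{2^{n+1}} \times \frac{2^n}{n+1} = \frac{1}{2} \lim_{n\to\infty} \frac{n+2}{n+1} = \frac{1}{2}. \]
Therefore, the series $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{n+1}{2^n}$ is convergent.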
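
The two ratios in this example can be computed exactly with rational arithmetic. This is a small sketch, not part of the original text; the names `term_a`, `term_b` and `abs_ratio` are illustrative.

```python
from fractions import Fraction

def term_a(n):
    """a_n = (-1)^(n-1) * 2^n / (n+1), the series in part (a)."""
    return Fraction((-1) ** (n - 1) * 2 ** n, n + 1)

def term_b(n):
    """a_n = (-1)^(n-1) * (n+1) / 2^n, the series in part (b)."""
    return Fraction((-1) ** (n - 1) * (n + 1), 2 ** n)

def abs_ratio(term, n):
    """|a_{n+1} / a_n| as an exact fraction."""
    return abs(term(n + 1) / term(n))

for n in (1, 10, 100, 1000):
    print(n, float(abs_ratio(term_a, n)), float(abs_ratio(term_b, n)))
```

The first column approaches 2 and the second approaches 1/2, in line with the limits computed in the solution.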

Convergence Tests
In this section, we have explored various strategies to determine the
convergence of a series $\sum_{n=1}^{\infty} a_n$. We make a summary as follows. This is
a useful manual for beginners, but it is not binding.

1. Check whether $\lim_{n\to\infty} a_n$ is $0$. If not, the series $\sum_{n=1}^{\infty} a_n$ is divergent.

2. Check whether it is a geometric series $\sum_{n=1}^{\infty} a r^{n-1}$ or a $p$-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$.
A geometric series $\sum_{n=1}^{\infty} a r^{n-1}$ is convergent if and only if $|r| < 1$. A
$p$-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ is convergent if and only if $p > 1$.

3. If $a_n$ contains powers of $n$ and functions such as $\ln n$, use the integral test.

4. If $a_n$ involves only expressions of the form $r^n$ for more than one $r$, do the
limit comparison test to compare with a geometric series.

5. If $a_n$ is a rational function of powers of $n$, do the limit comparison test with
a $p$-series.

6. For an alternating series that does not converge absolutely, check whether
the alternating series test can be applied.

7. If $a_n = b_n^n$ for each $n \in \mathbb{Z}^+$, and $\limsup_{n\to\infty} b_n$ exists, use the root test.

8. If $a_n$ is a product of a rational function of powers of $n$ and expressions
of the form $r^n$, use the ratio test.

Finally, we want to prove the following useful fact.

Theorem 5.29
Let $r$ be a real number with $|r| < 1$. For any real number $\alpha$,
\[ \lim_{n\to\infty} n^{\alpha} r^{n} = 0. \]

Proof
If $r = 0$, the limit is trivial. Hence, we consider the case $|r| < 1$ and $r \neq 0$.
If $\alpha \le 0$, the statement is also easy to prove since $\lim_{n\to\infty} r^n = 0$ and
\[ \lim_{n\to\infty} n^{\alpha} = \begin{cases} 0, & \text{if } \alpha < 0,\\ 1, & \text{if } \alpha = 0. \end{cases} \]
The nontrivial case is when $\alpha > 0$. In this case, $\lim_{n\to\infty} n^{\alpha} = \infty$. Since
\[ n^{\alpha}|r|^{n} = n^{\alpha} e^{n\ln|r|}, \]
and $\ln|r| < 0$, we can deduce that $\lim_{n\to\infty} n^{\alpha}|r|^{n} = 0$ from $\lim_{x\to\infty} \dfrac{x^{\alpha}}{e^{x}} = 0$. Nevertheless, let us present an alternative argument here, which is interesting in its own right.
Consider the series $\sum_{n=1}^{\infty} n^{\alpha} r^{n}$ with $a_n = n^{\alpha} r^{n}$. When $|r| < 1$ and $r \neq 0$,
\[ \lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} |r|\left(\frac{n+1}{n}\right)^{\alpha} = |r|\lim_{n\to\infty}\left(\frac{n+1}{n}\right)^{\alpha} = |r| < 1. \]
By the ratio test, the series $\sum_{n=1}^{\infty} n^{\alpha} r^{n}$ is convergent. Therefore,
\[ \lim_{n\to\infty} n^{\alpha} r^{n} = \lim_{n\to\infty} a_n = 0. \]
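Theorem 5.29 says that geometric decay eventually beats any power growth, even though the terms may grow for a while first. The sketch below is an illustration only; the values $\alpha = 5$ and $r = 0.9$ and the helper name `f` are assumptions made for this example, not taken from the text.

```python
def f(n, alpha, r):
    # the general term n^alpha * r^n of Theorem 5.29
    return n ** alpha * r ** n

alpha, r = 5.0, 0.9
for n in (1, 10, 47, 100, 500, 2000):
    # the values rise until n is near -alpha / ln(r), then decay towards 0
    print(n, f(n, alpha, r))
```

The initial growth is why the case $\alpha > 0$ needs an argument: the factor $n^{\alpha}$ dominates for small $n$, and only the ratio $|r|\left(\frac{n+1}{n}\right)^{\alpha}$ dropping below 1 forces the decay.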

Exercises 5.2
Question 1
Determine whether the series $\displaystyle\sum_{n=1}^{\infty}\frac{n^2+1}{3n^2+n+1}$ is convergent.

Question 2
Let $p$ be a positive number. Show that the series $\displaystyle\sum_{n=1}^{\infty}\frac{\ln n}{n^p}$ is convergent if and only if $p > 1$.

Question 3
Let $p$ be a positive number. Show that the series
\[ \sum_{n=1}^{\infty}(-1)^{n-1}\frac{\ln n}{n^p} \]
is convergent.

Question 4
Determine whether the series is convergent.

(a) $\displaystyle\sum_{n=1}^{\infty}\frac{3^n+(-1)^n 4^n}{5^n+2^n}$

(b) $\displaystyle\sum_{n=1}^{\infty}\frac{2^n-5^n}{4^n+3^n+1}$

Question 5
Determine whether the series $\displaystyle\sum_{n=1}^{\infty}(-1)^{n-1}\frac{\sqrt{n}}{n+1}$ is convergent.

Question 6
Determine whether the series is convergent.
∞ √
X 2n n + 3
(a)
n=1
5n2 − 2

X 4n2 − 7
(b) √
3 n+1
n=1
6n

Question 7
Use Theorem 5.26 to determine
\[ \lim_{n\to\infty}\sqrt[n]{n!}. \]

Question 8
Determine whether the series is convergent.

(a) $\displaystyle\sum_{n=1}^{\infty}\left(\frac{2\sqrt{n}-1}{\sqrt{n}+1}\right)^{n}$

(b) $\displaystyle\sum_{n=1}^{\infty}\left(\frac{2\sqrt{n}-1}{3\sqrt{n}+1}\right)^{n}$

Question 9
Determine whether the series is convergent.

(a) $\displaystyle\sum_{n=1}^{\infty}(-1)^{n-1}\frac{\sqrt{n}\,2^n}{3^n}$

(b) $\displaystyle\sum_{n=1}^{\infty}(-1)^{n-1}\frac{4^n}{3^n n^2}$

5.3 Rearrangement of Series

In this section, we want to explore more about the difference between a series that
converges absolutely and one that converges conditionally.

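Given a series $\sum_{n=1}^{\infty} a_n$ with terms $\{a_n\}$, define
\[ p_n = \frac{|a_n|+a_n}{2} = \begin{cases} a_n, & \text{if } a_n \ge 0,\\ 0, & \text{if } a_n < 0; \end{cases} \qquad q_n = \frac{|a_n|-a_n}{2} = \begin{cases} -a_n, & \text{if } a_n \le 0,\\ 0, & \text{if } a_n > 0. \end{cases} \]
Then $0 \le p_n \le |a_n|$, $0 \le q_n \le |a_n|$, and
\[ |a_n| = p_n + q_n, \qquad a_n = p_n - q_n. \]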
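Example 5.26
For the series $\displaystyle\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$,
\[ p_{2n-1} = \frac{1}{2n-1}, \quad p_{2n} = 0; \qquad q_{2n-1} = 0, \quad q_{2n} = \frac{1}{2n}. \]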

Theorem 5.30
Let $\sum_{n=1}^{\infty} a_n$ be a convergent series.

1. If the series $\sum_{n=1}^{\infty} a_n$ converges absolutely, then the series $\sum_{n=1}^{\infty} p_n$ and the series $\sum_{n=1}^{\infty} q_n$ are convergent.

2. If the series $\sum_{n=1}^{\infty} a_n$ converges conditionally, then the series $\sum_{n=1}^{\infty} p_n$ and the series $\sum_{n=1}^{\infty} q_n$ are divergent.

Proof
First we show that the two series $\sum_{n=1}^{\infty} p_n$ and $\sum_{n=1}^{\infty} q_n$ are either both convergent or both divergent. We have
\[ p_n = a_n + q_n, \qquad q_n = p_n - a_n, \]
and we are given that the series $\sum_{n=1}^{\infty} a_n$ is convergent. Therefore, if the series $\sum_{n=1}^{\infty} q_n$ is convergent, then the series $\sum_{n=1}^{\infty} p_n$ is convergent. Similarly, if the series $\sum_{n=1}^{\infty} p_n$ is convergent, then the series $\sum_{n=1}^{\infty} q_n$ is convergent.
If the series $\sum_{n=1}^{\infty} a_n$ converges absolutely, the series $\sum_{n=1}^{\infty} |a_n|$ is convergent. Since
\[ 0 \le p_n \le |a_n|, \qquad 0 \le q_n \le |a_n|, \]
the comparison test implies that the series $\sum_{n=1}^{\infty} p_n$ and $\sum_{n=1}^{\infty} q_n$ are convergent.
Conversely, if the series $\sum_{n=1}^{\infty} p_n$ and $\sum_{n=1}^{\infty} q_n$ are convergent, since
\[ |a_n| = p_n + q_n, \]
the series $\sum_{n=1}^{\infty} |a_n|$ must be convergent. Therefore, if the series $\sum_{n=1}^{\infty} a_n$ converges conditionally, which means the series $\sum_{n=1}^{\infty} |a_n|$ is divergent, then the series $\sum_{n=1}^{\infty} p_n$ and the series $\sum_{n=1}^{\infty} q_n$ must both be divergent.

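Example 5.27
For the series $\displaystyle\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$ in Example 5.26, the series $\displaystyle\sum_{n=1}^{\infty} p_n = \sum_{n=1}^{\infty}\frac{1}{2n-1}$ and the series $\displaystyle\sum_{n=1}^{\infty} q_n = \sum_{n=1}^{\infty}\frac{1}{2n}$ are divergent.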

Definition 5.6 Rearrangement of a Series
A rearrangement of a series $\sum_{n=1}^{\infty} a_n$ is the series $\sum_{n=1}^{\infty} a_{\pi(n)}$, where $\pi : \mathbb{Z}^+ \to \mathbb{Z}^+$ is a bijective correspondence.

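Example 5.28
Let $\pi : \mathbb{Z}^+ \to \mathbb{Z}^+$ be the bijective correspondence
\[ \pi(1) = 1,\ \pi(2) = 3,\ \pi(3) = 2,\ \pi(4) = 5,\ \pi(5) = 7,\ \pi(6) = 4,\ \ldots. \]
Namely,
\[ \pi(n) = \begin{cases} 4k-3, & \text{if } n = 3k-2,\\ 4k-1, & \text{if } n = 3k-1,\\ 2k, & \text{if } n = 3k. \end{cases} \]
The rearrangement of the series $\displaystyle\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$ induced by $\pi$ is
\[ 1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \cdots. \]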

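The main thing we want to discuss in this section is whether rearrangement affects the convergence of a series. Consider the rearrangement discussed in Example 5.28. We know that the original series
\[ \sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]
is convergent. We can find its sum in the following way. Let $s_n = a_1 + a_2 + \cdots + a_n$ be its $n$th partial sum. Then
\begin{align*}
s_{2n} &= 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{1}{2n-1} - \frac{1}{2n}\\
&= \left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{2n-1} + \frac{1}{2n}\right) - 2\left(\frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{2n}\right)\\
&= \left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{2n-1} + \frac{1}{2n}\right) - \left(1 + \frac{1}{2} + \cdots + \frac{1}{n}\right)\\
&= \sum_{k=1}^{2n}\frac{1}{k} - \sum_{k=1}^{n}\frac{1}{k}.
\end{align*}
Let
\[ c_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \ln n = \sum_{k=1}^{n}\frac{1}{k} - \ln n. \]
By Example 5.14, $\lim_{n\to\infty} c_n = \gamma$ is Euler's constant. We can write $s_{2n}$ as
\[ s_{2n} = c_{2n} + \ln(2n) - (c_n + \ln n) = c_{2n} - c_n + \ln 2. \]
Then we find that
\[ \lim_{n\to\infty} s_{2n} = \lim_{n\to\infty}(c_{2n} - c_n + \ln 2) = \gamma - \gamma + \ln 2 = \ln 2. \]
Since the series is convergent, its sum is the limit of the subsequence $\{s_{2n}\}$ of partial sums. This shows that
\[ \sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n} = \ln 2. \]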
For the rearranged series $\sum_{n=1}^{\infty} b_n = \sum_{n=1}^{\infty} a_{\pi(n)}$,
\[ b_{3k-2} = \frac{1}{4k-3}, \qquad b_{3k-1} = \frac{1}{4k-1}, \qquad b_{3k} = -\frac{1}{2k} \qquad \text{for all } k \in \mathbb{Z}^+. \]
Let $t_n = b_1 + b_2 + \cdots + b_n$ be the $n$th partial sum of the series $\sum_{n=1}^{\infty} b_n$. Now
\[ t_{3n} = \sum_{k=1}^{n}\left(\frac{1}{4k-3} + \frac{1}{4k-1}\right) - \sum_{k=1}^{n}\frac{1}{2k}. \]
As $k$ runs from 1 to $n$, $4k-3$ and $4k-1$ run through all positive odd integers between 1 and $4n$. Therefore,
\[ t_{3n} = \sum_{k=1}^{2n}\frac{1}{2k-1} - \sum_{k=1}^{n}\frac{1}{2k} = \sum_{k=1}^{4n}\frac{1}{k} - \sum_{k=1}^{2n}\frac{1}{2k} - \sum_{k=1}^{n}\frac{1}{2k}. \]
Using $c_n$, we can rewrite this as
\[ t_{3n} = c_{4n} + \ln(4n) - \frac{1}{2}\left(c_{2n} + \ln(2n)\right) - \frac{1}{2}\left(c_n + \ln n\right) = c_{4n} - \frac{1}{2}c_{2n} - \frac{1}{2}c_n + \frac{3}{2}\ln 2. \]
This allows us to conclude that
\[ \lim_{n\to\infty} t_{3n} = \frac{3}{2}\ln 2. \]
Since
\[ t_{3n+1} = t_{3n} + b_{3n+1}, \qquad t_{3n+2} = t_{3n} + b_{3n+1} + b_{3n+2}, \]
and $\lim_{n\to\infty} b_n = 0$, we find that
\[ \lim_{n\to\infty} t_{3n+1} = \lim_{n\to\infty} t_{3n+2} = \lim_{n\to\infty} t_{3n}. \]
This proves that the series $\sum_{n=1}^{\infty} b_n$ is convergent, and it converges to $\lim_{n\to\infty} t_{3n} = \frac{3}{2}\ln 2$.
Hence, although the series $\sum_{n=1}^{\infty} b_n$ is a rearrangement of the series $\sum_{n=1}^{\infty} a_n$, it has a different sum.
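The two sums $\ln 2$ and $\frac{3}{2}\ln 2$ can be compared numerically. This is a rough sketch rather than part of the argument; the function names are hypothetical, and floating-point partial sums only approximate the limits.

```python
import math

def alternating_harmonic_partial(n_terms):
    # s_n = 1 - 1/2 + 1/3 - ... with n_terms terms (original order)
    return sum((-1) ** (n - 1) / n for n in range(1, n_terms + 1))

def rearranged_partial(blocks):
    # t_{3n} with n = blocks: each block is 1/(4k-3) + 1/(4k-1) - 1/(2k),
    # i.e. two positive terms followed by one negative term
    total = 0.0
    for k in range(1, blocks + 1):
        total += 1 / (4 * k - 3) + 1 / (4 * k - 1) - 1 / (2 * k)
    return total

print(alternating_harmonic_partial(200000), math.log(2))
print(rearranged_partial(100000), 1.5 * math.log(2))
```

Both sums use exactly the same terms; only the order in which positive and negative terms are consumed differs.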
In the following, we prove that rearrangement of a nonnegative series would
not lead to different sums.

Lemma 5.31
If $a_n \ge 0$ for all $n \in \mathbb{Z}^+$ and the series $\sum_{n=1}^{\infty} a_n$ is convergent, then any rearrangement of the series has the same sum. Namely, for any bijection $\pi : \mathbb{Z}^+ \to \mathbb{Z}^+$,
\[ \sum_{n=1}^{\infty} a_{\pi(n)} = \sum_{n=1}^{\infty} a_n. \]

Proof
In a nutshell, this is just the fact that a nonnegative series is convergent if and only if the sequence of partial sums is bounded above, and the sum of the series is the least upper bound of the sequence of partial sums.
For a rigorous argument, define $s_n = a_1 + \cdots + a_n$ to be the $n$th partial sum of the series $\sum_{n=1}^{\infty} a_n$, and $t_n = a_{\pi(1)} + \cdots + a_{\pi(n)}$ to be the $n$th partial sum of the series $\sum_{n=1}^{\infty} a_{\pi(n)}$. Notice that both $\{s_n\}$ and $\{t_n\}$ are increasing sequences.
We are given that $s = \sup\{s_n\}$ exists. For any positive integer $n$, the set $\{\pi(1), \pi(2), \ldots, \pi(n)\}$ has a maximum $N_n$. This means that the set $\{\pi(1), \pi(2), \ldots, \pi(n)\}$ is contained in the set $\{1, 2, \ldots, N_n\}$. Therefore,
\[ t_n \le s_{N_n} \le s. \]
This shows that the increasing sequence $\{t_n\}$ is bounded above by $s$. Hence, $t = \lim_{n\to\infty} t_n = \sup\{t_n\}$ exists and $t \le s$. For the opposite inequality, observe that $\sum_{n=1}^{\infty} a_n$ is a rearrangement of $\sum_{n=1}^{\infty} a_{\pi(n)}$ induced by the bijection $\pi^{-1} : \mathbb{Z}^+ \to \mathbb{Z}^+$. Hence, the same argument shows that $s \le t$. Combining the two inequalities, we have $t = s$, thus proving that any rearrangement of the series $\sum_{n=1}^{\infty} a_n$ has the same sum.

Now we can prove that any rearrangement of an absolutely convergent series converges to the same sum.

Theorem 5.32 Rearrangement of Absolutely Convergent Series
If the series $\sum_{n=1}^{\infty} a_n$ converges absolutely, then any rearrangement of the series has the same sum. Namely, for any bijection $\pi : \mathbb{Z}^+ \to \mathbb{Z}^+$,
\[ \sum_{n=1}^{\infty} a_{\pi(n)} = \sum_{n=1}^{\infty} a_n. \]

Proof
Define the nonnegative series $\sum_{n=1}^{\infty} p_n$ and $\sum_{n=1}^{\infty} q_n$ by
\[ p_n = \frac{|a_n|+a_n}{2}, \qquad q_n = \frac{|a_n|-a_n}{2}. \]
Then
\[ a_n = p_n - q_n. \]
Since the series $\sum_{n=1}^{\infty} a_n$ converges absolutely, Theorem 5.30 says that the series $\sum_{n=1}^{\infty} p_n$ and $\sum_{n=1}^{\infty} q_n$ are convergent.
Lemma 5.31 says that for any bijection $\pi : \mathbb{Z}^+ \to \mathbb{Z}^+$, the series $\sum_{n=1}^{\infty} p_{\pi(n)}$ and $\sum_{n=1}^{\infty} q_{\pi(n)}$ are convergent, and
\[ \sum_{n=1}^{\infty} p_{\pi(n)} = \sum_{n=1}^{\infty} p_n, \qquad \sum_{n=1}^{\infty} q_{\pi(n)} = \sum_{n=1}^{\infty} q_n. \]
Therefore, the series $\sum_{n=1}^{\infty} a_{\pi(n)}$ is convergent and
\[ \sum_{n=1}^{\infty} a_{\pi(n)} = \sum_{n=1}^{\infty} p_{\pi(n)} - \sum_{n=1}^{\infty} q_{\pi(n)} = \sum_{n=1}^{\infty} p_n - \sum_{n=1}^{\infty} q_n = \sum_{n=1}^{\infty} a_n. \]

Finally, we come to the celebrated Riemann's theorem for series that converge conditionally.

Theorem 5.33 Riemann's Theorem for Conditionally Convergent Series
Let $\sum_{n=1}^{\infty} a_n$ be a series that converges conditionally, and let $b$ and $c$ be two extended real numbers with $b \le c$. There exists a bijection $\pi : \mathbb{Z}^+ \to \mathbb{Z}^+$ such that for the series $\sum_{n=1}^{\infty} a_{\pi(n)}$ with partial sums $t_n = a_{\pi(1)} + \cdots + a_{\pi(n)}$,
\[ \liminf_{n\to\infty} t_n = b, \qquad \limsup_{n\to\infty} t_n = c. \]

Here an extended real number is either an ordinary real number or $\pm\infty$. This theorem implies that a conditionally convergent series has rearrangements that diverge to $\pm\infty$, as well as rearrangements that converge to any prescribed real number.

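Proof
For $n \in \mathbb{Z}^+$, let
\[ p_n = \frac{|a_n|+a_n}{2}, \qquad q_n = \frac{|a_n|-a_n}{2}. \]
Since the series $\sum_{n=1}^{\infty} a_n$ converges conditionally, Theorem 5.30 says that the series $\sum_{n=1}^{\infty} p_n$ and $\sum_{n=1}^{\infty} q_n$ are divergent. Let
\[ S_+ = \left\{ n \in \mathbb{Z}^+ \mid a_n \ge 0 \right\}, \qquad S_- = \left\{ n \in \mathbb{Z}^+ \mid a_n < 0 \right\}. \]
Then
\[ S_+ \cup S_- = \mathbb{Z}^+, \qquad S_+ \cap S_- = \emptyset. \]
Since $\sum_{n=1}^{\infty} p_n$ and $\sum_{n=1}^{\infty} q_n$ are divergent, both $S_+$ and $S_-$ are infinite sets. Hence, there are strictly increasing maps $\pi_1 : \mathbb{Z}^+ \to \mathbb{Z}^+$ and $\pi_2 : \mathbb{Z}^+ \to \mathbb{Z}^+$ such that $\pi_1(\mathbb{Z}^+) = S_+$ and $\pi_2(\mathbb{Z}^+) = S_-$.
Define the nonnegative series $\sum_{n=1}^{\infty} u_n$ and $\sum_{n=1}^{\infty} v_n$ by
\[ u_n = a_{\pi_1(n)}, \qquad v_n = -a_{\pi_2(n)}. \]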

Then the sequences $\{u_n\}$ and $\{v_n\}$ are obtained from the sequences $\{p_n\}$ and $\{q_n\}$ by removing some zero terms. Hence, both nonnegative series $\sum_{n=1}^{\infty} u_n$ and $\sum_{n=1}^{\infty} v_n$ are divergent.
Now we start to define the bijection $\pi : \mathbb{Z}^+ \to \mathbb{Z}^+$. Construct two sequences of real numbers $\{b_n\}$ and $\{c_n\}$ such that $c_1 > 0$, $b_n \le c_n$ for all $n \in \mathbb{Z}^+$, and
\[ \lim_{n\to\infty} b_n = b, \qquad \lim_{n\to\infty} c_n = c. \]
Take $k_1$ to be the smallest positive integer such that
\[ C_1 = u_1 + u_2 + \cdots + u_{k_1} > c_1. \]
Then define
\[ \pi(1) = \pi_1(1), \ \ldots, \ \pi(k_1) = \pi_1(k_1). \]
Take $l_1$ to be the smallest positive integer such that
\[ B_1 = C_1 - (v_1 + v_2 + \cdots + v_{l_1}) < b_1. \]
Then define
\[ \pi(k_1+1) = \pi_2(1), \ \ldots, \ \pi(k_1+l_1) = \pi_2(l_1). \]
Take $k_2$ to be the smallest positive integer such that
\[ C_2 = B_1 + u_{k_1+1} + \cdots + u_{k_1+k_2} > c_2. \]
Then define
\[ \pi(k_1+l_1+1) = \pi_1(k_1+1), \ \ldots, \ \pi(k_1+l_1+k_2) = \pi_1(k_1+k_2). \]
Take $l_2$ to be the smallest positive integer such that
\[ B_2 = C_2 - (v_{l_1+1} + v_{l_1+2} + \cdots + v_{l_1+l_2}) < b_2. \]
Then define
\[ \pi(k_1+l_1+k_2+1) = \pi_2(l_1+1), \ \ldots, \ \pi(k_1+l_1+k_2+l_2) = \pi_2(l_1+l_2). \]


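Continue this construction inductively. Since $\sum_{n=1}^{\infty} u_n$ and $\sum_{n=1}^{\infty} v_n$ are nonnegative series that diverge to $\infty$, and $b_m \le c_m$ for all positive integers $m$, the existence of the positive integers $k_m$ and $l_m$ at each step is guaranteed. It is easy to see that the map $\pi : \mathbb{Z}^+ \to \mathbb{Z}^+$ is a bijection. For the series $\sum_{n=1}^{\infty} a_{\pi(n)}$, let $t_n = a_{\pi(1)} + \cdots + a_{\pi(n)}$ be its $n$th partial sum. Set $\alpha_0 = \beta_0 = 0$, and for $m \ge 1$, let
\begin{align*}
\alpha_m &= k_1 + k_2 + \cdots + k_m, & \beta_m &= l_1 + l_2 + \cdots + l_m,\\
\delta_m &= \alpha_{m-1} + \beta_{m-1} + k_m, & \lambda_m &= \delta_m + l_m = \alpha_m + \beta_m.
\end{align*}
Then
\begin{gather*}
1 \le \alpha_1 < \alpha_2 < \cdots < \alpha_m < \cdots,\\
1 \le \beta_1 < \beta_2 < \cdots < \beta_m < \cdots,\\
1 \le \delta_1 < \lambda_1 < \delta_2 < \lambda_2 < \cdots < \delta_m < \lambda_m < \cdots.
\end{gather*}
By construction,
\begin{gather*}
t_1 \le t_2 \le \cdots \le t_{\delta_1-1} \le c_1 < t_{\delta_1} \le c_1 + u_{\alpha_1},\\
t_{\delta_1} \ge t_{\delta_1+1} \ge t_{\delta_1+2} \ge \cdots \ge t_{\lambda_1-1} \ge b_1 > t_{\lambda_1} \ge b_1 - v_{\beta_1},\\
t_{\lambda_1} \le t_{\lambda_1+1} \le t_{\lambda_1+2} \le \cdots \le t_{\delta_2-1} \le c_2 < t_{\delta_2} \le c_2 + u_{\alpha_2},\\
t_{\delta_2} \ge t_{\delta_2+1} \ge t_{\delta_2+2} \ge \cdots \ge t_{\lambda_2-1} \ge b_2 > t_{\lambda_2} \ge b_2 - v_{\beta_2},\\
\vdots
\end{gather*}
Since the series $\sum_{n=1}^{\infty} a_n$ is convergent, $\lim_{n\to\infty} a_n = 0$. This implies that the sequences $\{u_n\}$ and $\{v_n\}$ converge to 0. Therefore,
\[ \lim_{m\to\infty} u_{\alpha_m} = \lim_{m\to\infty} v_{\beta_m} = 0. \]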

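Given $\varepsilon > 0$, there exists a positive integer $M_1$ so that for all $m \ge M_1$,
\[ 0 \le u_{\alpha_m} < \frac{\varepsilon}{2}, \qquad 0 \le v_{\beta_m} < \frac{\varepsilon}{2}. \]
There exists a positive integer $M_2$ so that $M_2 \ge M_1$ and for all $m \ge M_2$,
\[ b_m > b - \frac{\varepsilon}{2}, \qquad c_m < c + \frac{\varepsilon}{2}. \]
Let $N = \max\{\alpha_{M_1}, \beta_{M_1}, \lambda_{M_2}\}$. If $n \ge N$, then $n \ge \lambda_{M_2} > \delta_{M_2}$. Hence, there exists $m \ge M_2$ such that
\[ \delta_m \le n < \delta_{m+1}. \]
Then
\[ t_n \le \max\{c_m + u_{\alpha_m},\ c_{m+1} + u_{\alpha_{m+1}}\}. \]
Since $m \ge M_2$, $c_m$ and $c_{m+1}$ are less than $c + \varepsilon/2$. Since $m \ge M_2 \ge M_1$, $u_{\alpha_m}$ and $u_{\alpha_{m+1}}$ are less than $\varepsilon/2$. These imply that for all $n \ge N$,
\[ t_n < c + \varepsilon. \]
Hence,
\[ \limsup_{n\to\infty} t_n \le c. \]
Similarly, we can show that
\[ \liminf_{n\to\infty} t_n \ge b. \]
For all $m \in \mathbb{Z}^+$,
\[ b_m - v_{\beta_m} \le t_{\lambda_m} < b_m, \qquad c_m < t_{\delta_m} \le c_m + u_{\alpha_m}. \]
Taking the limit $m \to \infty$ shows that $\{t_{\lambda_m}\}$ is a subsequence of $\{t_n\}$ that converges to $b$, and $\{t_{\delta_m}\}$ is a subsequence of $\{t_n\}$ that converges to $c$. This completes the proof that
\[ \liminf_{n\to\infty} t_n = b, \qquad \limsup_{n\to\infty} t_n = c. \]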
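The construction in the proof is effectively an algorithm: add unused nonnegative terms until the partial sum passes the upper target, then unused negative terms until it drops below the lower target, and repeat. The sketch below is a simplified greedy variant under stated assumptions, not the proof's exact bookkeeping: it fixes the series $\sum (-1)^{n-1}/n$, takes $b = c$ equal to a single real `target`, and decides term by term instead of in blocks. The function name `rearrange_to_target` is hypothetical.

```python
def rearrange_to_target(target, n_terms):
    """Greedy rearrangement of sum (-1)^(n-1)/n aiming at `target` (case b = c)."""
    next_odd, next_even = 1, 2      # next unused index with a_n > 0 / a_n < 0
    total = 0.0
    order, partial = [], []         # order[i] plays the role of pi(i + 1)
    for _ in range(n_terms):
        if total <= target:
            # not above the target: take the next unused positive term
            total += 1 / next_odd
            order.append(next_odd)
            next_odd += 2
        else:
            # above the target: take the next unused negative term
            total -= 1 / next_even
            order.append(next_even)
            next_even += 2
        partial.append(total)
    return order, partial

order, partial = rearrange_to_target(1.0, 100000)
print(order[:10], partial[-1])
```

Every index is used at most once, and because both the positive part and the negative part of the series diverge, neither branch can be taken forever, so every index is eventually used. The overshoot at each crossing is bounded by the size of the term just added, which tends to 0.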

Exercises 5.3
Question 1
Show that the series ∞ ∞
X X (−1)n−1
an = √
n=1 n=1
n+1
+
is convergent. If π : Z → Z+ is a bijective correspondence, consider

X X∞
the rearrangement of the series an given by aπ(n) . Does the series
n=1 n=1

X ∞
X
aπ(n) necessarily converge to the same number as the series an ?
n=1 n=1

Question 2
Show that the series
\[ \sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}\sqrt{n}}{n^2+1} \]
is convergent. If $\pi : \mathbb{Z}^+ \to \mathbb{Z}^+$ is a bijective correspondence, consider the rearrangement of the series $\sum_{n=1}^{\infty} a_n$ given by $\sum_{n=1}^{\infty} a_{\pi(n)}$. Does the series $\sum_{n=1}^{\infty} a_{\pi(n)}$ necessarily converge to the same number as the series $\sum_{n=1}^{\infty} a_n$?

Question 3
Show that the series
\[ \sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}\sqrt{n}}{n+1} \]
is convergent. If $\pi : \mathbb{Z}^+ \to \mathbb{Z}^+$ is a bijective correspondence, consider the rearrangement of the series $\sum_{n=1}^{\infty} a_n$ given by $\sum_{n=1}^{\infty} a_{\pi(n)}$. Does the series $\sum_{n=1}^{\infty} a_{\pi(n)}$ necessarily converge to the same number as the series $\sum_{n=1}^{\infty} a_n$?

5.4 Infinite Products

In this section, we consider infinite products and study their convergence. An infinite product is a product of the form
\[ \prod_{n=1}^{\infty} u_n = u_1 u_2 \cdots u_n \cdots, \]
where $\{u_n\}$ is an infinite sequence. The definition of convergence of an infinite product is slightly more complicated than that of an infinite series.

Definition 5.7 Convergence of Infinite Product
Given a sequence $\{u_n\}$, consider the infinite product $\prod_{n=1}^{\infty} u_n$.

(a) If infinitely many of the terms $u_n$ are zero, then we say that the infinite product $\prod_{n=1}^{\infty} u_n$ is divergent.

(b) If only finitely many of the $u_n$ are zero, there is a positive integer $\ell$ such that $u_n$ is nonzero for all $n \ge \ell$. Form the partial product
\[ P[\ell]_n = \prod_{k=\ell}^{n} u_k, \qquad \text{for } n \ge \ell. \]

(i) If the limit $\lim_{n\to\infty} P[\ell]_n$ does not exist or the limit is 0, we say that the infinite product $\prod_{n=1}^{\infty} u_n$ is divergent.

(ii) If the limit $\lim_{n\to\infty} P[\ell]_n$ exists and is equal to a nonzero number $P[\ell]$, we say that the infinite product $\prod_{n=1}^{\infty} u_n$ converges to
\[ P = P[\ell]\prod_{k=1}^{\ell-1} u_k. \]

The convergence of an infinite product is not affected by finitely many terms in the product. If $u_n \neq 0$ for all $n \ge 1$, we will denote the partial product $P[1]_n = \prod_{k=1}^{n} u_k$ simply as $P_n$.

By definition, if the infinite product $\prod_{n=1}^{\infty} u_n$ converges to 0, then at least one of the $u_n$ is equal to 0, and there are only finitely many of the $u_n$ that are equal to 0.

Let us look at a few examples.

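Example 5.29
Determine the convergence of the infinite product $\displaystyle\prod_{n=1}^{\infty}\left(1+\frac{1}{n}\right)$.

Solution
For $n \ge 1$, $u_n = 1 + \dfrac{1}{n} \neq 0$. Notice that
\[ P_n = \prod_{k=1}^{n}\left(1+\frac{1}{k}\right) = \frac{2}{1}\times\frac{3}{2}\times\cdots\times\frac{n+1}{n} = n+1. \]
Since $\lim_{n\to\infty} P_n = \infty$, the infinite product $\prod_{n=1}^{\infty}\left(1+\frac{1}{n}\right)$ is divergent.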

Example 5.30
Determine the convergence of the infinite product $\displaystyle\prod_{n=1}^{\infty}\left(1-\frac{1}{n}\right)$.

Solution
For $n \ge 1$, $u_n = 1 - \dfrac{1}{n}$. We find that $u_1 = 0$ and $u_n > 0$ for all $n \ge 2$.
\[ P[2]_n = \prod_{k=2}^{n}\left(1-\frac{1}{k}\right) = \frac{1}{2}\times\frac{2}{3}\times\cdots\times\frac{n-1}{n} = \frac{1}{n}. \]
Since $\lim_{n\to\infty} P[2]_n = 0$, the infinite product $\prod_{n=1}^{\infty}\left(1-\frac{1}{n}\right)$ is divergent.

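Example 5.31
Determine the convergence of the infinite product $\displaystyle\prod_{n=1}^{\infty}\left(1-\frac{1}{n^2}\right)$.

Solution
For $n \ge 1$, $u_n = 1 - \dfrac{1}{n^2}$. We find that $u_1 = 0$ and $u_n > 0$ for all $n \ge 2$.
\begin{align*}
P[2]_n = \prod_{k=2}^{n}\left(1-\frac{1}{k^2}\right) &= \prod_{k=2}^{n}\left(1-\frac{1}{k}\right)\prod_{k=2}^{n}\left(1+\frac{1}{k}\right)\\
&= \frac{1}{2}\times\frac{2}{3}\times\cdots\times\frac{n-1}{n}\times\frac{3}{2}\times\cdots\times\frac{n+1}{n} = \frac{n+1}{2n}.
\end{align*}
Since $P[2] = \lim_{n\to\infty} P[2]_n = \dfrac{1}{2}$, the infinite product $\prod_{n=1}^{\infty}\left(1-\frac{1}{n^2}\right)$ is convergent, and it converges to $u_1 P[2] = 0$.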
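The closed forms for the partial products in Examples 5.29 to 5.31 can be confirmed with exact rational arithmetic. This is a small sketch for checking purposes only; the names `partial_product`, `u_plus`, `u_minus` and `u_square` are hypothetical.

```python
from fractions import Fraction

def partial_product(factor, start, n):
    # P[start]_n = product of factor(k) for k = start, ..., n
    prod = Fraction(1)
    for k in range(start, n + 1):
        prod *= factor(k)
    return prod

def u_plus(k):      # 1 + 1/k, Example 5.29
    return 1 + Fraction(1, k)

def u_minus(k):     # 1 - 1/k, Example 5.30
    return 1 - Fraction(1, k)

def u_square(k):    # 1 - 1/k^2, Example 5.31
    return 1 - Fraction(1, k * k)

n = 50
print(partial_product(u_plus, 1, n))    # n + 1
print(partial_product(u_minus, 2, n))   # 1/n
print(partial_product(u_square, 2, n))  # (n + 1)/(2n)
```

The second and third products start at $k = 2$ because $u_1 = 0$ in both examples, which matches the role of $\ell$ in Definition 5.7.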

When the sequence of partial products $\{P_n\}$ converges to 0, we consider the infinite product as divergent. This is so that the infinite product $\prod_{n=1}^{\infty} u_n$ is convergent if and only if the infinite product $\prod_{n=1}^{\infty} u_n^{-1}$ is convergent.

The following is obvious.

Proposition 5.34
If the infinite product $\prod_{n=1}^{\infty} u_n$ is convergent, then $\lim_{n\to\infty} u_n = 1$.

Using this proposition, when we consider the convergence of the infinite product $\prod_{n=1}^{\infty} u_n$, we can assume that $u_n > 0$ for all $n \in \mathbb{Z}^+$.
There is a Cauchy criterion for the convergence of infinite products.

Theorem 5.35 Cauchy Criterion for Infinite Product
Let $\{u_n\}$ be a sequence of positive numbers. The infinite product $\prod_{n=1}^{\infty} u_n$ is convergent if and only if it satisfies the Cauchy criterion, which says that for every $\varepsilon > 0$, there exists a positive integer $N$ such that for all $m \ge n \ge N$,
\[ \left|\prod_{k=n}^{m} u_k - 1\right| < \varepsilon. \]

The proof of this is more complicated than its infinite series counterpart.

Proof
Let $P_n = \prod_{k=1}^{n} u_k$ be the $n$th partial product. Then $P_n > 0$ for all $n \in \mathbb{Z}^+$. If the infinite product $\prod_{n=1}^{\infty} u_n$ is convergent, then the sequence $\{P_n\}$ converges to a positive number $P$. This implies that there is a positive integer $N_1$ such that
\[ P_n > \frac{P}{2} \qquad \text{for all } n \ge N_1. \]
Given $\varepsilon > 0$, applying the Cauchy criterion to the convergent sequence $\{P_n\}$, we find that there is a positive integer $N_2$ such that for all $m \ge n \ge N_2$,
\[ |P_n - P_m| < \frac{P\varepsilon}{2}. \]
Let $N = \max\{N_1, N_2\} + 1$. We find that for all $m \ge n \ge N$,
\begin{align*}
\left|\prod_{k=n}^{m} u_k - 1\right| &= \left|\frac{P_m}{P_{n-1}} - 1\right|\\
&= \frac{1}{P_{n-1}}\times|P_m - P_{n-1}|\\
&< \frac{2}{P}\times\frac{P\varepsilon}{2} = \varepsilon.
\end{align*}
Therefore, the Cauchy criterion for infinite product is satisfied.
Conversely, assume the Cauchy criterion for infinite product holds. Taking $\varepsilon = 1/2$, we find that there is an integer $N_1$ such that for all $m \ge n \ge N_1$,
\[ \left|\prod_{k=n}^{m} u_k - 1\right| < \frac{1}{2}. \]
This implies that
\[ \frac{1}{2} < \frac{P_n}{P_{N_1}} < \frac{3}{2} \qquad \text{for all } n \ge N_1. \tag{5.5} \]
Now given $\varepsilon > 0$, there is an integer $N \ge N_1$ such that for all $m \ge n \ge N$,
\[ \left|\prod_{k=n+1}^{m} u_k - 1\right| < \frac{2\varepsilon}{3P_{N_1}}. \]
This implies that when $m \ge n \ge N$,
\[ |P_m - P_n| = P_n\times\left|\prod_{k=n+1}^{m} u_k - 1\right| < \frac{3P_{N_1}}{2}\times\frac{2\varepsilon}{3P_{N_1}} = \varepsilon. \]
Hence, $\{P_n\}$ is a Cauchy sequence, and thus it is convergent. Eq. (5.5) then implies that
\[ \lim_{n\to\infty} P_n \ge \frac{1}{2}P_{N_1} > 0. \]
This proves that $\{P_n\}$ does not converge to 0. Therefore, the infinite product $\prod_{n=1}^{\infty} u_n$ is convergent.

The following gives a relation between the convergence of an infinite product and the convergence of an infinite series.

Theorem 5.36
Let $\{u_n\}$ be a sequence of positive numbers. Then the infinite product $\prod_{n=1}^{\infty} u_n$ is convergent if and only if the infinite series $\sum_{n=1}^{\infty} \ln u_n$ is convergent.

Proof
First assume that the infinite product $\prod_{k=1}^{\infty} u_k$ is convergent. Given $\varepsilon > 0$, since $\lim_{x\to 1} \ln x = 0$, there exists a $\delta$ such that $0 < \delta < 1$ and if $|x - 1| < \delta$, then $|\ln x| < \varepsilon$. By the Cauchy criterion for infinite products, there is a positive integer $N$ such that for all $m \ge n \ge N$,
\[ \left|\prod_{k=n}^{m} u_k - 1\right| < \delta. \]
It follows that for all $m \ge n \ge N$,
\[ \left|\sum_{k=n}^{m} \ln u_k\right| = \left|\ln\left[\prod_{k=n}^{m} u_k\right]\right| < \varepsilon. \]
This proves that the infinite series $\sum_{n=1}^{\infty} \ln u_n$ satisfies the Cauchy criterion. Hence, it is convergent.
Conversely, assume that the infinite series $\sum_{n=1}^{\infty} \ln u_n$ is convergent. Given $\varepsilon > 0$, since $\lim_{x\to 0} e^x = 1$, there exists $\delta > 0$ such that if $|x| < \delta$, then
\[ |e^x - 1| < \varepsilon. \]
Using the Cauchy criterion for infinite series, we find that there is a positive integer $N$ such that for all $m \ge n \ge N$,
\[ \left|\sum_{k=n}^{m} \ln u_k\right| < \delta. \]
It follows that for all $m \ge n \ge N$,
\[ \left|\prod_{k=n}^{m} u_k - 1\right| = \left|\exp\left(\sum_{k=n}^{m} \ln u_k\right) - 1\right| < \varepsilon. \]
This shows that the infinite product $\prod_{n=1}^{\infty} u_n$ satisfies the Cauchy criterion. Hence, it is convergent.

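Example 5.32
For any nonzero real number $a$, the infinite product $\displaystyle\prod_{n=1}^{\infty}\exp\left(\frac{a}{n}\right)$ is divergent since the infinite series $\displaystyle\sum_{n=1}^{\infty}\frac{a}{n}$ is divergent; while the infinite product $\displaystyle\prod_{n=1}^{\infty}\exp\left(\frac{a}{n^2}\right)$ is convergent since the infinite series $\displaystyle\sum_{n=1}^{\infty}\frac{a}{n^2}$ is convergent.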
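Theorem 5.36 amounts to the identity $P_N = \exp\left(\sum_{n \le N} \ln u_n\right)$ for partial products of positive terms. The sketch below illustrates this for the products of Example 5.32 with $a = 1$; the function name and the choice $a = 1$ are assumptions made for this illustration, and the comparison with $\pi^2/6$ uses the well-known value of $\sum 1/n^2$, which is not derived in this section.

```python
import math

def partial_product_and_log_sum(a, power, n_terms):
    # P_N = prod_{n<=N} exp(a / n^power), S_N = sum_{n<=N} ln(u_n) = sum a / n^power
    product, log_sum = 1.0, 0.0
    for n in range(1, n_terms + 1):
        product *= math.exp(a / n ** power)
        log_sum += a / n ** power
    return product, log_sum

# power = 2: the log-sum settles near pi^2 / 6, so the product settles too
p, s = partial_product_and_log_sum(1.0, 2, 100000)
print(p, math.exp(s), math.exp(math.pi ** 2 / 6))

# power = 1: the log-sum is the harmonic sum, which keeps growing
for n_terms in (1000, 100000):
    print(n_terms, partial_product_and_log_sum(1.0, 1, n_terms))
```

In the divergent case the partial product grows roughly like a constant multiple of $N$, mirroring the logarithmic growth of the harmonic sum.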

Since $\displaystyle\lim_{a\to 0}\frac{\ln(1+a)}{a} = 1$, it is natural to compare the convergence of the product $\prod_{n=1}^{\infty}(1+a_n)$ to the convergence of the series $\sum_{n=1}^{\infty} a_n$.

Theorem 5.37
Let $\{a_n\}$ be a sequence of real numbers such that $0 < a_n < 1$ for all $n \in \mathbb{Z}^+$. Then the following three statements are equivalent.

(a) The series $\sum_{n=1}^{\infty} a_n$ is convergent.

(b) The infinite product $\prod_{n=1}^{\infty}(1+a_n)$ is convergent.

(c) The infinite product $\prod_{n=1}^{\infty}(1-a_n)$ is convergent.

Proof
Since $0 < a_n < 1$ for all $n \in \mathbb{Z}^+$, we find that $1 + a_n > 0$ and $1 - a_n > 0$ for all $n \in \mathbb{Z}^+$. A necessary condition for the convergence of either $\sum_{n=1}^{\infty} a_n$, or $\prod_{n=1}^{\infty}(1+a_n)$, or $\prod_{n=1}^{\infty}(1-a_n)$, is
\[ \lim_{n\to\infty} a_n = 0. \]
By Theorem 5.36, it is then sufficient to prove that if $\{a_n\}$ is a sequence of real numbers satisfying $0 < a_n < 1$ for all $n \ge 1$, and $\lim_{n\to\infty} a_n = 0$, then the following three statements are equivalent.

(a) The series $\sum_{n=1}^{\infty} a_n$ is convergent.

(b′) The series $\sum_{n=1}^{\infty} \ln(1+a_n)$ is convergent.

(c′) The series $\sum_{n=1}^{\infty} \ln(1-a_n)$ is convergent.

Let $b_n = \ln(1+a_n)$ and $c_n = -\ln(1-a_n)$. Notice that $b_n$ and $c_n$ are also positive numbers.
Now since the sequence $\{a_n\}$ converges to 0, we find that
\begin{align*}
\lim_{n\to\infty}\frac{b_n}{a_n} &= \lim_{n\to\infty}\frac{\ln(1+a_n)}{a_n} = \lim_{x\to 0}\frac{\ln(1+x)}{x} = 1,\\
\lim_{n\to\infty}\frac{c_n}{a_n} &= \lim_{n\to\infty}\frac{-\ln(1-a_n)}{a_n} = \lim_{x\to 0}\frac{-\ln(1-x)}{x} = 1.
\end{align*}
By the limit comparison test for positive series, we find that $\sum_{n=1}^{\infty} a_n$ is convergent if and only if $\sum_{n=1}^{\infty} b_n$ is convergent, and $\sum_{n=1}^{\infty} a_n$ is convergent if and only if $\sum_{n=1}^{\infty} c_n$ is convergent. These establish the equivalence of (a) and (b′), and the equivalence of (a) and (c′).

Example 5.33
Theorem 5.37 can be used to deduce the following.

1. The infinite product $\displaystyle\prod_{n=1}^{\infty}\left(1+\frac{1}{n}\right)$ considered in Example 5.29 is divergent since the infinite series $\displaystyle\sum_{n=1}^{\infty}\frac{1}{n}$ is divergent.

2. The infinite product $\displaystyle\prod_{n=1}^{\infty}\left(1-\frac{1}{n}\right)$ considered in Example 5.30 is divergent since the infinite series $\displaystyle\sum_{n=1}^{\infty}\frac{1}{n}$ is divergent.

3. The infinite product $\displaystyle\prod_{n=1}^{\infty}\left(1-\frac{1}{n^2}\right)$ considered in Example 5.31 is convergent since the infinite series $\displaystyle\sum_{n=1}^{\infty}\frac{1}{n^2}$ is convergent.

Theorem 5.38
If the infinite product $\prod_{n=1}^{\infty}(1+|a_n|)$ is convergent, then the infinite product $\prod_{n=1}^{\infty}(1+a_n)$ is convergent.

Proof
Without loss of generality, we can assume that $|a_n| < 1$ for all $n \ge 1$. Given $\varepsilon > 0$, since the infinite product $\prod_{n=1}^{\infty}(1+|a_n|)$ is convergent, the Cauchy criterion says that there is a positive integer $N$ such that for all $m \ge n \ge N$,
\[ \left|\prod_{k=n}^{m}(1+|a_k|) - 1\right| < \varepsilon. \]
By an inequality in the exercises, we find that
\[ \left|\prod_{k=n}^{m}(1+a_k) - 1\right| \le \prod_{k=n}^{m}(1+|a_k|) - 1 < \varepsilon. \]
This proves that the infinite product $\prod_{n=1}^{\infty}(1+a_n)$ satisfies the Cauchy criterion. Hence, it is convergent.

Definition 5.8 Absolutely Convergent Infinite Products
We say that the infinite product $\prod_{n=1}^{\infty}(1+a_n)$ converges absolutely if the infinite product $\prod_{n=1}^{\infty}(1+|a_n|)$ is convergent.

Theorem 5.38 says that an absolutely convergent infinite product is convergent.

Corollary 5.39
Let $\sum_{n=1}^{\infty} a_n$ be a series that converges absolutely. Then the infinite product $\prod_{n=1}^{\infty}(1+a_n)$ converges absolutely.

Proof
Since $\sum_{n=1}^{\infty} a_n$ converges absolutely, $\lim_{n\to\infty} a_n = 0$. Without loss of generality, we can assume that $|a_n| < 1$ for all $n \ge 1$. Since $\sum_{n=1}^{\infty} |a_n|$ is convergent, Theorem 5.37 implies that the infinite product $\prod_{n=1}^{\infty}(1+|a_n|)$ is convergent. By Definition 5.8, this means that the infinite product $\prod_{n=1}^{\infty}(1+a_n)$ converges absolutely. Theorem 5.38 then implies that it is, in particular, convergent.

Example 5.34
The infinite product $\displaystyle\prod_{n=1}^{\infty}\left(1+\frac{(-1)^{n-1}}{n^2}\right)$ is convergent since the series $\displaystyle\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^2}$ converges absolutely.

Now it is natural to ask the following question. Is it true that the infinite product $\prod_{n=1}^{\infty}(1+a_n)$ is convergent if and only if the series $\sum_{n=1}^{\infty} a_n$ is convergent? The following two examples show that neither one implies the other.

Example 5.35
Let $\{a_n\}$ be the sequence defined by
\[ a_{2n-1} = \frac{1}{\sqrt{n+1}}, \qquad a_{2n} = -\frac{1}{\sqrt{n+1}} \qquad \text{for } n \ge 1. \]
If $s_n = \sum_{k=1}^{n} a_k$, then $s_{2n-1} = \dfrac{1}{\sqrt{n+1}}$ and $s_{2n} = 0$ for all $n \ge 1$. This implies that the series $\sum_{n=1}^{\infty} a_n$ converges to 0.
On the other hand, if $P_n = \prod_{k=1}^{n}(1+a_k)$, we find that
\[ P_{2n-1} = \prod_{k=2}^{n}\left(1-\frac{1}{k}\right)\left(1+\frac{1}{\sqrt{n+1}}\right), \qquad P_{2n} = \prod_{k=2}^{n+1}\left(1-\frac{1}{k}\right). \]
Since the infinite product $\prod_{n=2}^{\infty}\left(1-\frac{1}{n}\right)$ is divergent, the infinite product $\prod_{n=1}^{\infty}(1+a_n)$ is divergent.
This gives an example where $\sum_{n=1}^{\infty} a_n$ is convergent but $\prod_{n=1}^{\infty}(1+a_n)$ is divergent.

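Example 5.36
Let $\{a_n\}$ be the sequence defined by
\[ a_{2n-1} = \frac{1}{\sqrt{n}}, \qquad a_{2n} = -\frac{1}{\sqrt{n}+1} \qquad \text{for } n \ge 1. \]
Then
\[ 1 + a_{2n-1} = \frac{\sqrt{n}+1}{\sqrt{n}}, \qquad 1 + a_{2n} = \frac{\sqrt{n}}{\sqrt{n}+1}. \]
If $P_n = \prod_{k=1}^{n}(1+a_k)$, we find that $P_{2n-1} = \dfrac{\sqrt{n}+1}{\sqrt{n}}$ and $P_{2n} = 1$ for all $n \ge 1$. Hence, the infinite product $\prod_{n=1}^{\infty}(1+a_n)$ converges to 1.
If $s_n = \sum_{k=1}^{n} a_k$, then
\[ s_{2n} = \sum_{k=1}^{n}\left(\frac{1}{\sqrt{k}} - \frac{1}{\sqrt{k}+1}\right) = \sum_{k=1}^{n}\frac{1}{\sqrt{k}(\sqrt{k}+1)}. \]
Comparing with the series $\sum_{k=1}^{\infty}\frac{1}{k}$, we find that the series $\sum_{k=1}^{\infty}\frac{1}{\sqrt{k}(\sqrt{k}+1)}$ is divergent. Therefore, $\lim_{n\to\infty} s_{2n} = \infty$, which implies that $\lim_{n\to\infty} s_n$ does not exist. Hence, the series $\sum_{n=1}^{\infty} a_n$ is divergent.
This gives an example where $\prod_{n=1}^{\infty}(1+a_n)$ is convergent but $\sum_{n=1}^{\infty} a_n$ is divergent.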
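The behaviour in Examples 5.35 and 5.36 can be observed numerically by tracking $s_{2n}$ and $P_{2n}$ side by side. This is an illustrative floating-point sketch, with hypothetical function names, and it only suggests the limits rather than proving them.

```python
import math

def example_5_35(n):
    # a_{2k-1} = 1/sqrt(k+1), a_{2k} = -1/sqrt(k+1); returns (s_{2n}, P_{2n})
    s, p = 0.0, 1.0
    for k in range(1, n + 1):
        x = 1 / math.sqrt(k + 1)
        s += x
        s -= x
        p *= (1 + x) * (1 - x)
    return s, p

def example_5_36(n):
    # a_{2k-1} = 1/sqrt(k), a_{2k} = -1/(sqrt(k)+1); returns (s_{2n}, P_{2n})
    s, p = 0.0, 1.0
    for k in range(1, n + 1):
        x = 1 / math.sqrt(k)
        y = 1 / (math.sqrt(k) + 1)
        s += x - y
        p *= (1 + x) * (1 - y)
    return s, p

for n in (100, 10000):
    # 5.35: sum stays at 0, product shrinks like 1/(n+1)
    # 5.36: product stays at 1 (up to rounding), sum keeps growing
    print(n, example_5_35(n), example_5_36(n))
```

In Example 5.35 the paired terms cancel in the sum but not in the product, since $(1+x)(1-x) = 1 - x^2 < 1$; in Example 5.36 the pairing is arranged the other way round.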

Exercises 5.4
Question 1
Given that $\{a_n\}$ is a sequence of numbers with $a_n > -1$ for all $n \in \mathbb{Z}^+$. Prove that for all $n \in \mathbb{Z}^+$,
\[ \left|\prod_{k=1}^{n}(1+a_k) - 1\right| \le \prod_{k=1}^{n}(1+|a_k|) - 1. \]

Question 2
Let $s$ be a positive number. Show that the infinite product $\displaystyle\prod_{n=2}^{\infty}\left(1-\frac{1}{n^s}\right)$ is convergent if and only if $s > 1$.

Question 3
For $n \ge 1$, let
\[ u_n = \left(1+\frac{1}{n}\right)\exp\left(-\frac{1}{n}\right). \]
Show that the infinite product $\prod_{n=1}^{\infty} u_n$ is convergent and find its value.

5.5 Double Sequences and Double Series

In this section, we give a brief discussion about double sequences.

Definition 5.9 Double Sequences
A double sequence is a function $f : \mathbb{Z}^+ \times \mathbb{Z}^+ \to \mathbb{R}$ that is defined on the set $\mathbb{Z}^+ \times \mathbb{Z}^+$. It is customary to denote a general term $f(m,n)$ as $a_{m,n}$, and denote the double sequence by $\{f(m,n)\}_{m,n=1}^{\infty}$ or $\{a_{m,n}\}_{m,n=1}^{\infty}$.

The following gives some examples of double sequences.

Example 5.37
(a) $\left\{\dfrac{n(m+1)}{m(n+1)}\right\}_{m,n=1}^{\infty}$

(b) $\left\{\dfrac{mn}{m^2+n^2}\right\}_{m,n=1}^{\infty}$

Definition 5.10 Convergence of Double Sequence
We say that a double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ converges to a number $a$, written as
\[ a = \lim_{m,n\to\infty} a_{m,n}, \]
provided that for every $\varepsilon > 0$, there is a positive integer $N$ so that for all positive integers $m$ and $n$ with $m \ge N$, $n \ge N$,
\[ |a_{m,n} - a| < \varepsilon. \]

If a double sequence converges to a number $a$, this number $a$ is unique, and we say that the sequence is convergent. Otherwise, we say that the sequence is divergent.

Example 5.38
For the double sequence $\left\{\dfrac{n(m+1)}{m(n+1)}\right\}_{m,n=1}^{\infty}$ considered in Example 5.37, notice that
\[ a_{m,n} = \frac{n(m+1)}{m(n+1)} = \left(1-\frac{1}{n+1}\right)\left(1+\frac{1}{m}\right) = 1 - \frac{1}{n+1} + \frac{1}{m} - \frac{1}{m(n+1)}. \]
Given $\varepsilon > 0$, there is a positive integer $N$ so that $3/N < \varepsilon$. Then if $m \ge N$, $n \ge N$,
\[ |a_{m,n} - 1| < \frac{1}{n+1} + \frac{1}{m} + \frac{1}{m(n+1)} < \frac{1}{N} + \frac{1}{N} + \frac{1}{N} < \varepsilon. \]
This proves that
\[ \lim_{m,n\to\infty}\frac{n(m+1)}{m(n+1)} = 1. \]

Before we study the convergence of the double sequence $\left\{\dfrac{mn}{m^2+n^2}\right\}_{m,n=1}^{\infty}$, let us prove the following lemma, which says that for a double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ to be convergent, it should approach the same limit regardless of how $m$ and $n$ go to infinity.

Lemma 5.40
Let $\{a_{m,n}\}_{m,n=1}^{\infty}$ be a double sequence that converges to a number $a$, and let $g : \mathbb{Z}^+ \to \mathbb{Z}^+$ be a function such that $\lim_{n\to\infty} g(n) = \infty$. Define the sequence $\{b_n\}_{n=1}^{\infty}$ by
\[ b_n = a_{g(n),n}. \]
Then the sequence $\{b_n\}_{n=1}^{\infty}$ also converges to $a$.

Notice that $\{g(n)\}$ is a sequence of positive integers that diverges to $\infty$.

Proof
Given $\varepsilon > 0$, there is a positive integer $N_1$ such that for all $(m,n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$ with $m \ge N_1$ and $n \ge N_1$,
\[ |a_{m,n} - a| < \varepsilon. \]
Since $\lim_{n\to\infty} g(n) = \infty$, there is a positive integer $N \ge N_1$ such that $g(n) \ge N_1$ for all $n \ge N$. If $n \ge N$, then $g(n) \ge N_1$ and $n \ge N_1$. Therefore,
\[ |b_n - a| = |a_{g(n),n} - a| < \varepsilon. \]
This proves that the sequence $\{b_n\}_{n=1}^{\infty}$ converges to $a$.

Example 5.39
For the double sequence $\left\{\dfrac{mn}{m^2+n^2}\right\}_{m,n=1}^{\infty}$ considered in Example 5.37, assume that it converges to $a$. Take $g_1 : \mathbb{Z}^+ \to \mathbb{Z}^+$ to be the function $g_1(n) = n$. Then we find that
\[ a = \lim_{n\to\infty}\frac{n^2}{n^2+n^2} = \frac{1}{2}. \]
Take $g_2 : \mathbb{Z}^+ \to \mathbb{Z}^+$ to be the function $g_2(n) = 2n$. Then we find that
\[ a = \lim_{n\to\infty}\frac{2n^2}{4n^2+n^2} = \frac{2}{5}. \]
We get two different values of $a$. This is a contradiction. Therefore, the double sequence $\left\{\dfrac{mn}{m^2+n^2}\right\}_{m,n=1}^{\infty}$ is divergent.
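The path dependence in Example 5.39, and its absence in Example 5.38, can be seen with exact arithmetic. The following is a small sketch with hypothetical helper names `a` and `c`; it evaluates the two double sequences of Example 5.37 along the paths $m = n$ and $m = 2n$.

```python
from fractions import Fraction

def a(m, n):
    # a_{m,n} = mn / (m^2 + n^2), Example 5.37(b)
    return Fraction(m * n, m * m + n * n)

def c(m, n):
    # n(m+1) / (m(n+1)), Example 5.37(a)
    return Fraction(n * (m + 1), m * (n + 1))

for n in (1, 10, 1000):
    # along m = n the value is always 1/2; along m = 2n it is always 2/5
    print(n, a(n, n), a(2 * n, n))

for n in (10, 1000, 100000):
    # both paths approach 1 for the convergent double sequence
    print(n, float(c(n, n)), float(c(2 * n, n)))
```

Since `a` is constant along each path, no limiting process is even needed to see that the two paths disagree.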

It is easy to prove that linearity also holds for limits of double sequences.

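Proposition 5.41 Linearity
Assume that the double sequences $\{a_{m,n}\}_{m,n=1}^{\infty}$ and $\{b_{m,n}\}_{m,n=1}^{\infty}$ are convergent. Then for any constants $\alpha$ and $\beta$, the double sequence
\[ \{\alpha a_{m,n} + \beta b_{m,n}\}_{m,n=1}^{\infty} \]
is also convergent, and
\[ \lim_{m,n\to\infty}(\alpha a_{m,n} + \beta b_{m,n}) = \alpha\lim_{m,n\to\infty} a_{m,n} + \beta\lim_{m,n\to\infty} b_{m,n}. \]

Proof
Let $a = \lim_{m,n\to\infty} a_{m,n}$ and $b = \lim_{m,n\to\infty} b_{m,n}$. Given $\varepsilon > 0$, there are positive integers $N_1$ and $N_2$ such that
\begin{align*}
|a_{m,n} - a| &< \frac{\varepsilon}{2(|\alpha|+1)}, \qquad \text{for all } m \ge N_1,\ n \ge N_1;\\
|b_{m,n} - b| &< \frac{\varepsilon}{2(|\beta|+1)}, \qquad \text{for all } m \ge N_2,\ n \ge N_2.
\end{align*}
Let $N = \max\{N_1, N_2\}$. For all positive integers $m$ and $n$ with $m \ge N$ and $n \ge N$, we have
\begin{align*}
|(\alpha a_{m,n} + \beta b_{m,n}) - (\alpha a + \beta b)| &\le |\alpha||a_{m,n} - a| + |\beta||b_{m,n} - b|\\
&< \frac{|\alpha|}{2(|\alpha|+1)}\varepsilon + \frac{|\beta|}{2(|\beta|+1)}\varepsilon\\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}
This proves the assertion.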

In the proof, we divide ε/2 by |α| + 1 instead of |α|, because α can be 0.


For the double sequence we considered in Example 5.38, notice that
\[ \lim_{m\to\infty} \left( \lim_{n\to\infty} \frac{n(m+1)}{m(n+1)} \right) = \lim_{m\to\infty} \frac{m+1}{m} = 1 = \lim_{m,n\to\infty} \frac{n(m+1)}{m(n+1)}. \]

The question is whether we can find the limit of a double sequence $\{a_{m,n}\}$ by taking the limit $n \to \infty$ first and then the limit $m \to \infty$, or in the opposite order.
For the double sequence $\left\{ \dfrac{mn}{m^2+n^2} \right\}_{m,n=1}^{\infty}$, for fixed $m \geq 1$, taking the $n \to \infty$ limit, we have
\[ \lim_{n\to\infty} \frac{mn}{m^2+n^2} = 0. \]
Hence,
\[ \lim_{m\to\infty} \left( \lim_{n\to\infty} \frac{mn}{m^2+n^2} \right) = 0. \]
But we have shown that the double sequence $\left\{ \dfrac{mn}{m^2+n^2} \right\}_{m,n=1}^{\infty}$ is divergent. Therefore, to study the limit of a double sequence, in general we cannot take one limit first before the other. The following theorem says that if one knows a priori that the double sequence is convergent, one can take iterated limits under some conditions.

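Theorem 5.42
Assume that the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ converges to $a$, and for each $m \in \mathbb{Z}^+$, the limit
\[ b_m = \lim_{n\to\infty} a_{m,n} \]
exists. Then the sequence $\{b_m\}$ also converges to $a$. In other words,

\[ \lim_{m,n\to\infty} a_{m,n} = a \implies \lim_{m\to\infty} \left( \lim_{n\to\infty} a_{m,n} \right) = a \]

provided that the limit $\lim_{n\to\infty} a_{m,n}$ exists for all $m \in \mathbb{Z}^+$.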

Proof
Given $\varepsilon > 0$, there exists a positive integer $N$ such that for all $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$ with $m \geq N$ and $n \geq N$,
\[ |a_{m,n} - a| < \frac{\varepsilon}{2}. \]
Hence, for fixed $m \geq N$, taking the $n \to \infty$ limit gives
\[ |b_m - a| \leq \frac{\varepsilon}{2} < \varepsilon. \]
This proves that $\lim_{m\to\infty} b_m = a$.

The assumption that the limit $\lim_{n\to\infty} a_{m,n}$ exists for each $m \in \mathbb{Z}^+$ in Theorem 5.42 is needed, as the convergence of the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ does not guarantee that the limit $\lim_{n\to\infty} a_{m,n}$ exists. An example is shown below.

Example 5.40

Consider the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ with

\[ a_{m,n} = \frac{m + (-1)^{n-1}}{m^2}. \]

For fixed $m \in \mathbb{Z}^+$, the sequence $\left\{ \dfrac{m + (-1)^{n-1}}{m^2} \right\}_{n=1}^{\infty}$ does not have a limit since it oscillates between $\dfrac{m+1}{m^2}$ and $\dfrac{m-1}{m^2}$. But the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ converges to zero. This can be proved in the following way.
Given $\varepsilon > 0$, since $\lim_{m\to\infty} \dfrac{m+1}{m^2} = 0$, there exists a positive integer $N$ so that for all $m \geq N$,
\[ 0 < \frac{m+1}{m^2} < \varepsilon. \]
This implies that if $m \geq N$ and $n \geq N$, then
\[ 0 \leq a_{m,n} \leq \frac{m+1}{m^2} < \varepsilon. \]
Hence, the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ converges to zero.
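Both features of Example 5.40, the oscillation in $n$ for fixed $m$ and the uniform bound $(m+1)/m^2$, can be observed numerically. This is an illustrative sketch; the names `a` and `tail_bound` are our own, and a finite check is of course not a proof.

```python
def a(m: int, n: int) -> float:
    """Term a_{m,n} = (m + (-1)^(n-1)) / m^2."""
    return (m + (-1) ** (n - 1)) / m**2


def tail_bound(N: int) -> float:
    """The bound (N+1)/N^2, which dominates a_{m,n} whenever m >= N."""
    return (N + 1) / N**2


# For fixed m = 3 the row alternates between 4/9 and 2/9, so it has no limit.
row = [a(3, n) for n in range(1, 7)]
print(row)

# For m, n >= N every term lies in [0, (N+1)/N^2], and the bound shrinks with N.
N = 100
block = [a(m, n) for m in range(N, N + 30) for n in range(N, N + 30)]
print(max(block), tail_bound(N))
```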

Definition 5.11 Bounded Double Sequence

We say that a double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ is bounded if the set

\[ \left\{ a_{m,n} \mid (m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+ \right\} \]

is bounded.

Remark 5.3
If a double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ is convergent, it is not necessarily bounded. For example, consider the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ with

\[ a_{m,n} = \begin{cases} n, & \text{if } m = 1, \\ 1, & \text{if } m \geq 2. \end{cases} \]

Obviously, it is not bounded. However, it is not difficult to prove that the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ converges to 1.

Definition 5.12 Increasing Double Sequence

We say that a double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ is increasing in both indices provided that for fixed $m \in \mathbb{Z}^+$, $\{a_{m,n}\}_{n=1}^{\infty}$ is an increasing sequence in $n$; and for fixed $n \in \mathbb{Z}^+$, $\{a_{m,n}\}_{m=1}^{\infty}$ is an increasing sequence in $m$.

If a double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ is increasing in both indices, then for any positive integers $m_1, m_2, n_1, n_2$ with $m_2 \geq m_1$ and $n_2 \geq n_1$,

\[ a_{m_2,n_2} \geq a_{m_1,n_1}. \]

Figure 5.2: An illustration of the relative positions of $(m_1, n_1)$ and $(m_2, n_2)$ when $m_2 > m_1$ and $n_2 > n_1$.

Example 5.41

The double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ with

\[ a_{m,n} = \frac{mn}{(m+1)(n+1)} \]

is increasing in both indices.
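The monotonicity claimed in Example 5.41 can be spot-checked with exact rational arithmetic. The sketch below only tests a finite block of indices (the cutoff `M = 30` is an arbitrary choice of ours), so it illustrates the claim rather than proving it.

```python
from fractions import Fraction


def a(m: int, n: int) -> Fraction:
    """Exact value of a_{m,n} = mn / ((m+1)(n+1))."""
    return Fraction(m * n, (m + 1) * (n + 1))


M = 30  # arbitrary cutoff for the finite check

# Increasing in n for each fixed m, and increasing in m for each fixed n.
increasing_in_n = all(a(m, n) <= a(m, n + 1) for m in range(1, M) for n in range(1, M))
increasing_in_m = all(a(m, n) <= a(m + 1, n) for m in range(1, M) for n in range(1, M))

print(increasing_in_n, increasing_in_m)
print(float(a(1000, 1000)))  # below 1, and close to 1 for large indices
```

Since $a_{m,n} = \frac{m}{m+1}\cdot\frac{n}{n+1}$, every term is less than 1, which is consistent with the printed value.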

The following is a counterpart of the monotone convergence theorem for double sequences.

Theorem 5.43 Convergence of Increasing Double Sequences

Let $\{a_{m,n}\}_{m,n=1}^{\infty}$ be a double sequence that is increasing in both indices. Then the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ is convergent if and only if it is bounded above. In case it is convergent, it converges to $\displaystyle\sup_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} \{a_{m,n}\}$.

Proof
If the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ converges to $a$, then there is a positive integer $N$ such that for all $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$ with $m \geq N$ and $n \geq N$,

\[ |a_{m,n} - a| < 1. \]

This implies that

\[ a_{m,n} < a + 1 \quad \text{for all } m \geq N,\ n \geq N. \]

Given $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$, let $k = \max\{m, n, N\}$. Then $k \geq m$, $k \geq n$ and $k \geq N$. Therefore,
\[ a_{m,n} \leq a_{k,k} < a + 1. \]
This proves that the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ is bounded above by $a + 1$. In fact, the same reasoning shows that it is bounded above by $a + \varepsilon$ for any $\varepsilon > 0$, but we do not need this.
Conversely, if $\{a_{m,n}\}_{m,n=1}^{\infty}$ is bounded above, then

\[ a = \sup_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} \{a_{m,n}\} \]

exists. Given $\varepsilon > 0$, there exists $(m_0, n_0) \in \mathbb{Z}^+ \times \mathbb{Z}^+$ such that

\[ a_{m_0,n_0} > a - \varepsilon. \]

Take $N = \max\{m_0, n_0\}$. Then if $m \geq N \geq m_0$ and $n \geq N \geq n_0$,

\[ a_{m,n} \geq a_{m_0,n_0} > a - \varepsilon. \]

By definition, $a_{m,n} \leq a$. Therefore, for all $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$ with $m \geq N$ and $n \geq N$, we have
\[ |a_{m,n} - a| < \varepsilon. \]
This proves that the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ is convergent and it converges to $a = \displaystyle\sup_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} \{a_{m,n}\}$.

Now we turn to double series. A double series is a series of the form

\[ \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}, \]

where $\{a_{m,n}\}_{m,n=1}^{\infty}$ is a double sequence. For each $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$, we define the $(m, n)$ partial sum $s_{m,n}$ by
\[ s_{m,n} = \sum_{k=1}^{m} \sum_{l=1}^{n} a_{k,l}. \]

Definition 5.13 Convergence of Double Series

We say that the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is convergent provided that the double sequence of partial sums $\{s_{m,n}\}$ is convergent. In this case, the sum of the double series is
\[ \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n} = s = \lim_{m,n\to\infty} s_{m,n} = \lim_{m,n\to\infty} \sum_{k=1}^{m} \sum_{l=1}^{n} a_{k,l}. \]

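Notice that for any $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$, with the convention that $s_{m,0} = s_{0,n} = 0$,

\[ s_{m,n} - s_{m,n-1} = \sum_{k=1}^{m} a_{k,n}. \]

Therefore,
\[ s_{m,n} - s_{m,n-1} - s_{m-1,n} + s_{m-1,n-1} = \sum_{k=1}^{m} a_{k,n} - \sum_{k=1}^{m-1} a_{k,n} = a_{m,n}. \]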

From this, we obtain the following immediately.

Proposition 5.44
If the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is convergent, then the double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$ converges to 0.

If $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is a double series with $a_{m,n} \geq 0$ for all $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$, then the double sequence of partial sums $\{s_{m,n}\}$ is a double sequence that is increasing in both indices. From Theorem 5.43, we obtain the following.

Theorem 5.45
If $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is a double series with $a_{m,n} \geq 0$ for all $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$, then it is convergent if and only if the double sequence of partial sums $\{s_{m,n}\}$ is bounded above.

Figure 5.3: An illustration of the terms $a_{k,l}$ that are involved in $s_{m_2,n_2} - s_{m_1,n_1}$ when $m_2 > m_1$ and $n_2 > n_1$.

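Corollary 5.46
If $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is a double series with $a_{m,n} \geq 0$ for all $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$, then it is convergent if and only if the sequence $\{s_{n,n}\}_{n=1}^{\infty}$ is convergent. In this case,
\[ \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n} = \lim_{n\to\infty} s_{n,n} = \lim_{n\to\infty} \sum_{k=1}^{n} \sum_{l=1}^{n} a_{k,l}. \]

This says that we can determine the convergence of a nonnegative double series from the sequence $\{s_{n,n}\}_{n=1}^{\infty}$ instead of the double sequence $\{s_{m,n}\}_{m,n=1}^{\infty}$.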

Proof
Since $a_{m,n} \geq 0$ for all $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$, the double sequence $\{s_{m,n}\}$ is increasing in both indices, while the sequence $\{s_{n,n}\}$ is increasing.
If the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is convergent, Theorem 5.45 implies that the double sequence of partial sums $\{s_{m,n}\}$ is bounded above. Since its terms form a subset of the terms of $\{s_{m,n}\}$, the sequence $\{s_{n,n}\}_{n=1}^{\infty}$ is also bounded above. By the monotone convergence theorem, the sequence $\{s_{n,n}\}_{n=1}^{\infty}$ is convergent.

Conversely, assume that the sequence $\{s_{n,n}\}_{n=1}^{\infty}$ is convergent. Then it is bounded above. Let

\[ t = \sup_{n\in\mathbb{Z}^+} s_{n,n} = \lim_{n\to\infty} s_{n,n}. \]

For any positive integers $m$ and $n$,

\[ s_{m,n} \leq \max\{s_{m,m}, s_{n,n}\} \leq t. \]

This implies that the double sequence $\{s_{m,n}\}$ is bounded above by $t$. Hence, the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is convergent. From the argument above, we also find that

\[ \sup_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} s_{m,n} \leq t = \sup_{n\in\mathbb{Z}^+} s_{n,n}. \]

Since the opposite inequality is obvious, this is in fact an equality. Hence,

\begin{align*}
\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n} &= \sup_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} s_{m,n} = \sup_{n\in\mathbb{Z}^+} s_{n,n} \\
&= \lim_{n\to\infty} s_{n,n} = \lim_{n\to\infty} \sum_{k=1}^{n} \sum_{l=1}^{n} a_{k,l}.
\end{align*}

Let us look at an example.

Example 5.42
Show that the double series
\[ \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} \frac{1}{(m^2+n^2)^2} \]
is convergent.

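Solution
Notice that
\begin{align*}
s_{n,n} = \sum_{k=1}^{n} \sum_{l=1}^{n} \frac{1}{(k^2+l^2)^2} &\leq \sum_{k=1}^{n} \sum_{l=1}^{k} \frac{1}{(k^2+l^2)^2} + \sum_{l=1}^{n} \sum_{k=1}^{l} \frac{1}{(k^2+l^2)^2} \\
&\leq 2 \sum_{k=1}^{n} \sum_{l=1}^{k} \frac{1}{k^4} = 2 \sum_{k=1}^{n} \frac{k}{k^4} = 2 \sum_{k=1}^{n} \frac{1}{k^3} \leq 2 \sum_{k=1}^{\infty} \frac{1}{k^3}.
\end{align*}
Since the series $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k^3}$ is convergent, the sequence $\{s_{n,n}\}$ is bounded above. Being increasing, it is convergent. Hence, by Corollary 5.46, the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} \frac{1}{(m^2+n^2)^2}$ is convergent.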
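The bound used in the solution of Example 5.42 can be compared against the actual diagonal partial sums. This is a rough numerical sketch in floating point; the function names `s_diag` and `bound` are our own.

```python
def s_diag(n: int) -> float:
    """Diagonal partial sum s_{n,n} of the double series with terms 1/(k^2 + l^2)^2."""
    return sum(1.0 / (k * k + l * l) ** 2
               for k in range(1, n + 1)
               for l in range(1, n + 1))


def bound(n: int) -> float:
    """The comparison quantity 2 * sum_{k=1}^{n} 1/k^3 from the solution."""
    return 2.0 * sum(1.0 / k**3 for k in range(1, n + 1))


for n in (1, 5, 20, 50):
    print(n, s_diag(n), bound(n))
```

The diagonal sums increase with $n$ and stay below the comparison quantity, as the argument predicts.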

Next, we consider double series that have negative terms. Given a double sequence $\{a_{m,n}\}_{m,n=1}^{\infty}$, let $\{p_{m,n}\}_{m,n=1}^{\infty}$ and $\{q_{m,n}\}_{m,n=1}^{\infty}$ be double sequences defined by
\[ p_{m,n} = \frac{|a_{m,n}| + a_{m,n}}{2}, \qquad q_{m,n} = \frac{|a_{m,n}| - a_{m,n}}{2}. \]
Then
\[ |a_{m,n}| = p_{m,n} + q_{m,n}, \qquad a_{m,n} = p_{m,n} - q_{m,n}. \]
$\{p_{m,n}\}_{m,n=1}^{\infty}$ and $\{q_{m,n}\}_{m,n=1}^{\infty}$ are nonnegative double sequences with

\[ 0 \leq p_{m,n} \leq |a_{m,n}|, \qquad 0 \leq q_{m,n} \leq |a_{m,n}|. \]

Definition 5.14 Absolute Convergence of Double Series

We say that the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ converges absolutely if the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} |a_{m,n}|$ is convergent.

Theorem 5.47
If the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ converges absolutely, then it is convergent.

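Proof
For $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$, let

\[ p_{m,n} = \frac{|a_{m,n}| + a_{m,n}}{2}, \qquad q_{m,n} = \frac{|a_{m,n}| - a_{m,n}}{2}. \]
Then
\[ 0 \leq p_{m,n} \leq |a_{m,n}|, \qquad 0 \leq q_{m,n} \leq |a_{m,n}|. \]

Let $\{s^+_{m,n}\}$, $\{s^-_{m,n}\}$, $\{t_{m,n}\}$ and $\{s_{m,n}\}$ be respectively the double sequences of partial sums for the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} p_{m,n}$, $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} q_{m,n}$, $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} |a_{m,n}|$ and $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$. Then

\[ t_{m,n} = s^+_{m,n} + s^-_{m,n}, \qquad s_{m,n} = s^+_{m,n} - s^-_{m,n}. \]

Moreover,
\[ 0 \leq s^+_{m,n} \leq t_{m,n}, \qquad 0 \leq s^-_{m,n} \leq t_{m,n}. \tag{5.6} \]
Since $\{p_{m,n}\}$, $\{q_{m,n}\}$ and $\{|a_{m,n}|\}$ are nonnegative double sequences, $\{s^+_{m,n}\}$, $\{s^-_{m,n}\}$ and $\{t_{m,n}\}$ are nonnegative double sequences that are increasing in both indices. By assumption, the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} |a_{m,n}|$ is convergent. Therefore, the double sequence $\{t_{m,n}\}$ is bounded above. Eq. (5.6) implies that the double sequences $\{s^+_{m,n}\}$ and $\{s^-_{m,n}\}$ are also bounded above. Hence, the double sequences $\{s^+_{m,n}\}$ and $\{s^-_{m,n}\}$ are convergent. By linearity, the double sequence $\{s_{m,n}\}$ is also convergent and

\[ \lim_{m,n\to\infty} s_{m,n} = \lim_{m,n\to\infty} s^+_{m,n} - \lim_{m,n\to\infty} s^-_{m,n}. \]

This proves that the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is convergent, and

\[ \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n} = \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} p_{m,n} - \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} q_{m,n}. \]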

There is a simpler proof of this theorem using the same idea as in the proof for single series. The ideas in the proof presented above were used when we proved that any rearrangement of an absolutely convergent single series is convergent and has the same sum. It is a useful technique for dealing with absolutely convergent series. One should compare this proof to the proof of Theorem 4.39 for convergence of improper integrals. In fact, infinite series and improper integrals are closely related. An improper integral $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx$ is convergent if and only if the double limit
\[ \lim_{a\to-\infty,\ b\to\infty} \int_a^b f(x)\,dx \]
exists. This can be rephrased as follows: for any two sequences $\{a_m\}$ and $\{b_n\}$ satisfying $\lim_{m\to\infty} a_m = -\infty$ and $\lim_{n\to\infty} b_n = \infty$, the double sequence $\{F_{m,n}\}$, with

\[ F_{m,n} = \int_{a_m}^{b_n} f(x)\,dx, \]

is convergent and has the same limit.


As we have seen before, we cannot simply compute the limit of a double
sequence by taking the limit with respect to one index first before the other. For
double series, we cannot find the sum simply by taking the sum with respect to
one index first before the other. Let us look at the following example.

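Example 5.43

For $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$, let

\[ a_{m,n} = \begin{cases} m - n, & \text{if } |m - n| = 1, \\ 0, & \text{otherwise}, \end{cases} \]
and consider the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$. We find that

\[ \sum_{n=1}^{\infty} a_{m,n} = \begin{cases} -1, & \text{if } m = 1, \\ 0, & \text{if } m \geq 2; \end{cases} \qquad \sum_{m=1}^{\infty} a_{m,n} = \begin{cases} 1, & \text{if } n = 1, \\ 0, & \text{if } n \geq 2. \end{cases} \]

Therefore,
\[ \sum_{m=1}^{\infty} \left( \sum_{n=1}^{\infty} a_{m,n} \right) = -1, \qquad \sum_{n=1}^{\infty} \left( \sum_{m=1}^{\infty} a_{m,n} \right) = 1. \]

We find that changing the order of summation produces different sums.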

Figure 5.4: An illustration of the terms in the double series defined in Example
5.43.
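The row and column sums in Example 5.43 involve at most two nonzero terms each, so they can be computed exactly with finite loops. The sketch below uses our own helper names (`row_sum`, `col_sum`, `partial_sum`); the cutoff 50 for the outer sums is arbitrary.

```python
def a(m: int, n: int) -> int:
    """Term a_{m,n}: equals m - n when |m - n| = 1, and 0 otherwise."""
    return m - n if abs(m - n) == 1 else 0


def row_sum(m: int) -> int:
    """Sum over n of a_{m,n}; only n = m - 1 and n = m + 1 can contribute."""
    return sum(a(m, n) for n in range(1, m + 2))


def col_sum(n: int) -> int:
    """Sum over m of a_{m,n}; only m = n - 1 and m = n + 1 can contribute."""
    return sum(a(m, n) for m in range(1, n + 2))


def partial_sum(m: int, n: int) -> int:
    """Rectangular partial sum s_{m,n}."""
    return sum(a(k, l) for k in range(1, m + 1) for l in range(1, n + 1))


rows_then_cols = sum(row_sum(m) for m in range(1, 51))
cols_then_rows = sum(col_sum(n) for n in range(1, 51))
print(rows_then_cols, cols_then_rows)

# Rectangular partial sums near the diagonal for N = 10.
print(partial_sum(10, 10), partial_sum(10, 11), partial_sum(11, 10))
```

In this computation the square partial sums $s_{N,N}$ vanish while $s_{N,N+1}$ and $s_{N+1,N}$ take the values $-1$ and $1$, which suggests that the double sequence of partial sums has no limit.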

However, we have the following if the double series is convergent.

Theorem 5.48
Assume that the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ converges to $s$, and for every fixed $m \in \mathbb{Z}^+$, the series $\displaystyle\sum_{n=1}^{\infty} a_{m,n}$ is convergent with sum $u_m$. Then the series
\[ \sum_{m=1}^{\infty} u_m = \sum_{m=1}^{\infty} \left( \sum_{n=1}^{\infty} a_{m,n} \right) \]
is convergent and its sum is $s$.

Proof
Let
\[ s_{m,n} = \sum_{k=1}^{m} \sum_{l=1}^{n} a_{k,l}. \]

We are given that the double sequence $\{s_{m,n}\}_{m,n=1}^{\infty}$ converges to $s$. Notice that for fixed $m \in \mathbb{Z}^+$,
\[ \sum_{k=1}^{m} u_k = \lim_{n\to\infty} \sum_{k=1}^{m} \sum_{l=1}^{n} a_{k,l} = \lim_{n\to\infty} s_{m,n}. \]

This shows that for fixed $m$, the limit $b_m = \lim_{n\to\infty} s_{m,n}$ exists and it is equal to $\displaystyle\sum_{k=1}^{m} u_k$. By Theorem 5.42, the sequence $\{b_m\}$ converges to $s$. Therefore, the series $\displaystyle\sum_{m=1}^{\infty} u_m$ is convergent and has sum $s$.

Let us explore more about nonnegative double series first.

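Theorem 5.49
Let $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ be a double series with $a_{m,n} \geq 0$ for all $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$, and assume that it is convergent with sum $s$. Then we have the following.

(a) For all $m \in \mathbb{Z}^+$, $u_m = \displaystyle\sum_{n=1}^{\infty} a_{m,n}$ is finite.

(b) For all $n \in \mathbb{Z}^+$, $v_n = \displaystyle\sum_{m=1}^{\infty} a_{m,n}$ is finite.

(c) The series $\displaystyle\sum_{m=1}^{\infty} u_m$ and the series $\displaystyle\sum_{n=1}^{\infty} v_n$ both converge to $s$. Namely,

\[ \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{m,n} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{m,n} = \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}. \]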

Proof
Given positive integers $m$ and $n$, let
\[ s_{m,n} = \sum_{k=1}^{m} \sum_{l=1}^{n} a_{k,l}, \qquad u_{m,n} = \sum_{l=1}^{n} a_{m,l}, \qquad v_{m,n} = \sum_{k=1}^{m} a_{k,n}. \]

Then
\[ s_{m,n} = \sum_{k=1}^{m} u_{k,n} = \sum_{l=1}^{n} v_{m,l}. \tag{5.7} \]
Since $a_{m,n} \geq 0$ for all $m, n \in \mathbb{Z}^+$, we have

\[ u_{m,n} \leq s_{m,n}, \qquad v_{m,n} \leq s_{m,n} \quad \text{for all } (m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+. \]

For fixed $m$, $\{u_{m,n}\}_{n=1}^{\infty}$ and $\{s_{m,n}\}_{n=1}^{\infty}$ are increasing sequences. For fixed $n$, $\{v_{m,n}\}_{m=1}^{\infty}$ and $\{s_{m,n}\}_{m=1}^{\infty}$ are increasing sequences.
Since the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is convergent with sum $s$, Theorem 5.43 gives $s_{m,n} \leq s$ for all positive integers $m$ and $n$. Therefore, the sequences $\{u_{m,n}\}_{n=1}^{\infty}$, $\{s_{m,n}\}_{n=1}^{\infty}$, $\{v_{m,n}\}_{m=1}^{\infty}$ and $\{s_{m,n}\}_{m=1}^{\infty}$ are increasing sequences that are bounded above by $s$. Therefore, each of these sequences is convergent. The convergence of the sequences $\{u_{m,n}\}_{n=1}^{\infty}$ and $\{v_{m,n}\}_{m=1}^{\infty}$ are precisely the statements in (a) and (b). By definition,

\[ u_m = \sum_{n=1}^{\infty} a_{m,n} = \lim_{n\to\infty} u_{m,n}, \qquad v_n = \sum_{m=1}^{\infty} a_{m,n} = \lim_{m\to\infty} v_{m,n}. \]

Now let
\[ b_m = \sum_{k=1}^{m} u_k \quad \text{and} \quad c_n = \sum_{l=1}^{n} v_l \]
be the partial sums of the series $\displaystyle\sum_{m=1}^{\infty} u_m$ and $\displaystyle\sum_{n=1}^{\infty} v_n$. From (5.7), we find that
\[ \lim_{n\to\infty} s_{m,n} = \sum_{k=1}^{m} u_k = b_m, \qquad \lim_{m\to\infty} s_{m,n} = \sum_{l=1}^{n} v_l = c_n. \]

From these, we find that the sequences $\{b_m\}$ and $\{c_n\}$ are also increasing sequences that are bounded above by $s$.

Therefore,
\[ b = \lim_{m\to\infty} b_m \quad \text{and} \quad c = \lim_{n\to\infty} c_n \]
exist, and $b \leq s$, $c \leq s$. We are now left to prove that $b = c = s$. It is sufficient to prove that $b = s$. Then $c = s$ follows by interchanging the roles of $m$ and $n$. Given $\varepsilon > 0$, using the fact that $s = \sup\left\{ s_{n,n} \mid n \in \mathbb{Z}^+ \right\}$ from Corollary 5.46, we find that there is a positive integer $N$ such that

\[ s_{N,N} > s - \varepsilon. \]

But then
\[ s_{N,N} = \sum_{m=1}^{N} \sum_{n=1}^{N} a_{m,n} \leq \sum_{m=1}^{N} \sum_{n=1}^{\infty} a_{m,n} = b_N. \]

This shows that
\[ b_N > s - \varepsilon. \]
Hence,
\[ b = \sup_{m} b_m > s - \varepsilon. \]

Since $\varepsilon > 0$ is arbitrary, we conclude that $b \geq s$. Together with $b \leq s$ that is proved earlier, we conclude that $b = s$.

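Theorem 5.50
Let $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ be a double series with $a_{m,n} \geq 0$ for all $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$. Assume that for each $m \in \mathbb{Z}^+$, the series $\displaystyle\sum_{n=1}^{\infty} a_{m,n}$ converges to $u_m$. If the series $\displaystyle\sum_{m=1}^{\infty} u_m$ is convergent, then the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is convergent, and

\[ \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n} = \sum_{m=1}^{\infty} u_m = \sum_{m=1}^{\infty} \left( \sum_{n=1}^{\infty} a_{m,n} \right). \]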

Proof
It is sufficient to prove that the convergence of the series $\displaystyle\sum_{m=1}^{\infty} u_m$ implies the convergence of the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$. The last statement then follows from Theorem 5.48. Assume that the series $\displaystyle\sum_{m=1}^{\infty} u_m$ converges to $u$. Using the same notations as in the proof of Theorem 5.49, we find that for each positive integer $m$, the sequence $\{u_{m,n}\}_{n=1}^{\infty}$ increases to $u_m$. From (5.7), we find that for any positive integers $m$ and $n$,
\[ s_{m,n} \leq \sum_{k=1}^{m} u_k \leq u. \]

This shows that the double sequence $\{s_{m,n}\}_{m,n=1}^{\infty}$ is bounded above, and hence it is convergent. Therefore, the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is convergent.

Remark 5.4
Putting together Theorem 5.49 and Theorem 5.50, we conclude the following. Given a double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ with nonnegative terms $a_{m,n}$, we can determine its convergence and find its sum by first checking whether for each fixed $m$, the series $\displaystyle\sum_{n=1}^{\infty} a_{m,n}$ is convergent. If yes, find the sum, call it $u_m$, and check whether the series $\displaystyle\sum_{m=1}^{\infty} u_m$ is convergent. If yes, then the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is convergent and its sum is given by $\displaystyle\sum_{m=1}^{\infty} u_m$. Namely, the sum of the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ can be obtained by iterated summation.

We can also start with the series $\displaystyle\sum_{m=1}^{\infty} a_{m,n}$ for each fixed $n$. This shows that for double series with nonnegative terms, we can interchange the order of summation. In fact, with slightly more effort, one can prove that we can sum in any order.

If for some integer $m$, the series $\displaystyle\sum_{n=1}^{\infty} a_{m,n}$ is divergent, then the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is divergent. Even if the series $\displaystyle\sum_{n=1}^{\infty} a_{m,n}$ is convergent for all positive integers $m$, the series $\displaystyle\sum_{m=1}^{\infty} u_m$ can still be divergent. In this latter case, the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ is divergent. An example is given by the double series
\[ \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} \frac{1}{m^2+n^2}. \]

For fixed positive integer $m$, comparison with the series $\displaystyle\sum_{n=1}^{\infty} \frac{1}{n^2}$ shows that the series $\displaystyle\sum_{n=1}^{\infty} \frac{1}{m^2+n^2}$ is convergent. By the integral test, we find that

\[ u_m = \sum_{n=1}^{\infty} \frac{1}{m^2+n^2} \geq \int_0^{\infty} \frac{1}{m^2+x^2}\,dx - \frac{1}{m^2} = \frac{\pi}{2m} - \frac{1}{m^2} > 0. \]

Since the series $\displaystyle\sum_{m=1}^{\infty} \frac{1}{m}$ is divergent but the series $\displaystyle\sum_{m=1}^{\infty} \frac{1}{m^2}$ is convergent, the series $\displaystyle\sum_{m=1}^{\infty} \left( \frac{\pi}{2m} - \frac{1}{m^2} \right)$ is divergent. Hence, by the comparison test, the series $\displaystyle\sum_{m=1}^{\infty} u_m$ is divergent.

Finally, we can come back to series with negative terms. From Theorem 5.49,
we have the following.

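Theorem 5.51
Let $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ be a double series that converges absolutely, and let $s$ be its sum. Then we have the following.

(a) For all $m \in \mathbb{Z}^+$, $u_m = \displaystyle\sum_{n=1}^{\infty} a_{m,n}$ is finite.

(b) For all $n \in \mathbb{Z}^+$, $v_n = \displaystyle\sum_{m=1}^{\infty} a_{m,n}$ is finite.

(c) The series $\displaystyle\sum_{m=1}^{\infty} u_m$ and the series $\displaystyle\sum_{n=1}^{\infty} v_n$ both converge to $s$. Namely,

\[ \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{m,n} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{m,n} = \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}. \]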

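Proof
Using the same notations as in the proof of Theorem 5.47, since the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} |a_{m,n}|$ is convergent, the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} p_{m,n}$ and $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} q_{m,n}$ are convergent. Applying Theorem 5.49 to the nonnegative series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} p_{m,n}$ and $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} q_{m,n}$, we conclude that for all $m \in \mathbb{Z}^+$ and all $n \in \mathbb{Z}^+$, the series $\displaystyle\sum_{n=1}^{\infty} p_{m,n}$, $\displaystyle\sum_{n=1}^{\infty} q_{m,n}$, $\displaystyle\sum_{m=1}^{\infty} p_{m,n}$ and $\displaystyle\sum_{m=1}^{\infty} q_{m,n}$ are convergent. Since

\[ a_{m,n} = p_{m,n} - q_{m,n} \quad \text{for all } (m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+, \]

we conclude that the series $\displaystyle\sum_{n=1}^{\infty} a_{m,n}$ and the series $\displaystyle\sum_{m=1}^{\infty} a_{m,n}$ are convergent. The remaining assertions are concluded using the same arguments.

This theorem says that absolutely convergent double series enjoy almost the same privileges as nonnegative double series. The following theorem gives a summary.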

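Theorem 5.52
Let $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ be a double series that satisfies the following conditions.

(i) For each fixed $m \in \mathbb{Z}^+$, the series $\displaystyle\sum_{n=1}^{\infty} |a_{m,n}|$ is convergent.

(ii) $\displaystyle\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{m,n}|$ is convergent.

Then we have the following.

(a) The double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n}$ converges absolutely.

(b) For each fixed $n \in \mathbb{Z}^+$, the series $\displaystyle\sum_{m=1}^{\infty} a_{m,n}$ converges absolutely.

(c) For each fixed $m \in \mathbb{Z}^+$, the series $\displaystyle\sum_{n=1}^{\infty} a_{m,n}$ converges absolutely.

(d) Both the series $\displaystyle\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{m,n}$ and $\displaystyle\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{m,n}$ are convergent.

(e) The sum of the double series can be computed by iterated summation. Namely,
\[ \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} a_{m,n} = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{m,n} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{m,n}. \]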

Proof
By Theorem 5.50, (i) and (ii) imply that the double series $\displaystyle\sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} |a_{m,n}|$ is convergent, which gives (a).

By Theorem 5.49, (a) implies (b) and (c). Theorem 5.49 also implies that the two series
\[ \sum_{m=1}^{\infty} \left( \sum_{n=1}^{\infty} |a_{m,n}| \right) \quad \text{and} \quad \sum_{n=1}^{\infty} \left( \sum_{m=1}^{\infty} |a_{m,n}| \right) \]
are convergent. Since

\[ \left| \sum_{n=1}^{\infty} a_{m,n} \right| \leq \sum_{n=1}^{\infty} |a_{m,n}|, \qquad \left| \sum_{m=1}^{\infty} a_{m,n} \right| \leq \sum_{m=1}^{\infty} |a_{m,n}|, \]

the comparison test shows that the series $\displaystyle\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{m,n}$ and $\displaystyle\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{m,n}$ are convergent. This gives (d). The statement (e) follows from (a) and Theorem 5.51.

Exercises 5.5
Question 1
If $a$ and $b$ are positive constants, show that the double series
\[ \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} \frac{1}{am^2 + bn^2} \]
is divergent.

Question 2
Given that $a$ and $b$ are positive constants, $u$ and $v$ are real numbers, and $\alpha$ is a number larger than 1, show that the double series
\[ \sum_{(m,n)\in\mathbb{Z}^+\times\mathbb{Z}^+} \frac{\sin(mu + nv)}{(am^2 + bn^2)^{\alpha}} \]
is convergent.
Chapter 6. Sequences and Series of Functions 452

Chapter 6

Sequences and Series of Functions

In this chapter, we study sequences and series whose terms depend on a variable.

6.1 Convergence of Sequences and Series of Functions

Let $D$ be a subset of real numbers. For each positive integer $n$, let $f_n : D \to \mathbb{R}$ be a function defined on $D$. Then $\{f_n\}_{n=1}^{\infty}$ is a sequence of functions defined on $D$. Sometimes we will write $\{f_n : D \to \mathbb{R}\}$ or $\{f_n : D \to \mathbb{R}\}_{n=1}^{\infty}$ to make it explicit that each $f_n$ is a function defined on $D$.
Given a sequence of functions $\{f_n : D \to \mathbb{R}\}$ that are defined on $D$, for each $x \in D$, $\{f_n(x)\}$ is a sequence of real numbers. We can determine whether such a sequence is convergent.

Definition 6.1 Pointwise Convergence of Sequence of Functions

Given a sequence of functions $\{f_n : D \to \mathbb{R}\}$ that are defined on $D$, we say that it converges pointwise to the function $f : D \to \mathbb{R}$ provided that for every $x \in D$, the sequence $\{f_n(x)\}$ converges to $f(x)$. Namely,

\[ f(x) = \lim_{n\to\infty} f_n(x) \quad \text{for all } x \in D. \]

In this case, we also say that the function $f : D \to \mathbb{R}$ is the pointwise limit of the sequence of functions $\{f_n : D \to \mathbb{R}\}$.

Let us look at some examples.

Example 6.1

For each positive integer $n$, let $f_n : [0, 1] \to \mathbb{R}$ be the function $f_n(x) = x^n$. Study the pointwise convergence of the sequence of functions $\{f_n\}$.

Solution
Notice that
\[ \lim_{n\to\infty} x^n = \begin{cases} 0, & \text{if } 0 \leq x < 1, \\ 1, & \text{if } x = 1. \end{cases} \]
Therefore, the sequence of functions $\{f_n\}$ converges pointwise to the function $f : [0, 1] \to \mathbb{R}$, where
\[ f(x) = \begin{cases} 0, & \text{if } 0 \leq x < 1, \\ 1, & \text{if } x = 1. \end{cases} \]

Figure 6.1: The sequence of functions {fn } defined in Example 6.1.

Example 6.2

For each positive integer $n$, let $f_n : [0, 2] \to \mathbb{R}$ be the function $f_n(x) = x^n$. Study the pointwise convergence of the sequence of functions $\{f_n\}$.

Solution
For each $x \in [0, 1]$, the sequence $\{f_n(x)\}$ is convergent. For any $x \in (1, 2]$, the sequence $\{f_n(x)\}$ is divergent. Hence, the sequence of functions $\{f_n\}$ does not converge pointwise.

In Example 6.1, notice that each $f_n : [0, 1] \to \mathbb{R}$ is a continuous function, but the limit $f : [0, 1] \to \mathbb{R}$ is not a continuous function.
Given a sequence of functions $\{f_n : D \to \mathbb{R}\}$ that converges pointwise to the function $f : D \to \mathbb{R}$, we will consider the following questions.

1. If each $f_n$ is continuous, is $f$ continuous?

2. If each $f_n$ is a differentiable function defined on an open interval $I$, is $f$ differentiable on $I$? If yes, does the sequence $\{f_n' : I \to \mathbb{R}\}$ converge to $f' : I \to \mathbb{R}$?

3. If each $f_n$ is Riemann integrable on a closed and bounded interval $I$, is $f$ Riemann integrable on $I$? If yes, does the sequence of integrals $\left\{ \displaystyle\int_I f_n \right\}$ converge to the integral $\displaystyle\int_I f$?

We have seen that the answer to the first question is no, as given by Example 6.1. The answers to the second and third questions are also no. We will look at some examples.

Example 6.3
For each positive integer $n$, let $f_n : \mathbb{R} \to \mathbb{R}$ be the function
\[ f_n(x) = \frac{1}{1 + nx^2}. \]
Study the pointwise convergence of the sequence of functions $\{f_n\}$.

Solution
Since $f_n(0) = 1$ for all $n \in \mathbb{Z}^+$,

\[ \lim_{n\to\infty} f_n(0) = 1. \]

If $x \neq 0$,
\[ 0 \leq f_n(x) \leq \frac{1}{nx^2}. \]

By the squeeze theorem,

\[ \lim_{n\to\infty} f_n(x) = 0 \quad \text{when } x \neq 0. \]

Hence, the sequence of functions $\{f_n\}$ converges pointwise to the function $f : \mathbb{R} \to \mathbb{R}$, where
\[ f(x) = \begin{cases} 0, & \text{if } x \neq 0, \\ 1, & \text{if } x = 0. \end{cases} \]

Figure 6.2: The sequence of functions $\{f_n\}$ defined in Example 6.3.

In Example 6.3, each of the functions $f_n$ is differentiable. But the function $f$ is not differentiable at $x = 0$ since it is not continuous at $x = 0$.

Example 6.4
For each positive integer $n$, let $f_n : \mathbb{R} \to \mathbb{R}$ be the differentiable function
\[ f_n(x) = x e^{-nx^2}. \]

(a) Study the pointwise convergence of the sequence of functions $\{f_n\}$.

(b) Study the pointwise convergence of the sequence of functions $\{f_n'\}$.

Solution
(a) Since $f_n(0) = 0$ for all $n \in \mathbb{Z}^+$,

\[ \lim_{n\to\infty} f_n(0) = 0. \]

If $x \neq 0$, since $\lim_{u\to\infty} e^{-u} = 0$, we find that

\[ \lim_{n\to\infty} f_n(x) = x \lim_{n\to\infty} e^{-nx^2} = x \lim_{u\to\infty} e^{-u} = 0. \]

Hence, the sequence of functions $\{f_n\}$ converges pointwise to the function $f : \mathbb{R} \to \mathbb{R}$, where $f(x) = 0$ for all $x \in \mathbb{R}$.

(b) For $n \in \mathbb{Z}^+$,
\[ f_n'(x) = (1 - 2nx^2) e^{-nx^2}. \]
Since $f_n'(0) = 1$ for all $n \in \mathbb{Z}^+$,

\[ \lim_{n\to\infty} f_n'(0) = 1. \]

If $x \neq 0$, since $\lim_{u\to\infty} e^{-u} = 0$ and $\lim_{u\to\infty} u e^{-u} = 0$, we find that

\[ \lim_{n\to\infty} f_n'(x) = \lim_{n\to\infty} (1 - 2nx^2) e^{-nx^2} = \lim_{u\to\infty} (1 - 2u) e^{-u} = 0. \]

Hence, the sequence of functions $\{f_n'\}$ converges pointwise to the function $g : \mathbb{R} \to \mathbb{R}$, where

\[ g(x) = \begin{cases} 0, & \text{if } x \neq 0, \\ 1, & \text{if } x = 0. \end{cases} \]

Figure 6.3: The sequence of functions $\{f_n\}$ defined in Example 6.4.

Figure 6.4: The sequence of functions $\{f_n'\}$ in Example 6.4.

In Example 6.4, each of the functions $f_n$ is differentiable and the function $f$ is also differentiable. The sequence $\{f_n'\}$ also converges pointwise, but it does not converge to the function $f'$.
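The mismatch between the limit of the derivatives and the derivative of the limit in Example 6.4 shows up clearly in a quick numerical evaluation. This is a minimal floating-point sketch; the function names `f` and `fprime` are our own, and `fprime` simply encodes the derivative formula computed in the solution.

```python
import math


def f(n: int, x: float) -> float:
    """f_n(x) = x * exp(-n x^2)."""
    return x * math.exp(-n * x * x)


def fprime(n: int, x: float) -> float:
    """f_n'(x) = (1 - 2 n x^2) * exp(-n x^2), as derived in the solution."""
    return (1 - 2 * n * x * x) * math.exp(-n * x * x)


for n in (1, 10, 100, 10_000):
    # At x = 0 the derivative stays at 1; at x = 0.1 it tends to 0.
    print(n, fprime(n, 0.0), fprime(n, 0.1), f(n, 0.1))
```

The pointwise limit $f$ is identically zero, so $f'(0) = 0$, whereas the printed values of $f_n'(0)$ are all equal to 1.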

Example 6.5
For each positive integer $n$, let
\[ S_n = \left\{ \frac{p}{q} \;\middle|\; p, q \in \mathbb{Z},\ 0 \leq p \leq q \leq n,\ q \geq 1 \right\}. \]

Define the function $f_n : [0, 1] \to \mathbb{R}$ by

\[ f_n(x) = \begin{cases} 1, & \text{if } x \in S_n, \\ 0, & \text{if } x \notin S_n. \end{cases} \]

Study the pointwise convergence of the sequence of functions $\{f_n\}$.

Solution
If $x$ is a rational number in $[0, 1]$, there exist a nonnegative integer $p$ and a positive integer $q$ such that $0 \leq p \leq q$ and $x = p/q$. Therefore, $x \in S_n$ for all $n \geq q$. This implies that $f_n(x) = 1$ for all $n \geq q$. Hence,

\[ \lim_{n\to\infty} f_n(x) = 1 \quad \text{if } x \text{ is rational}. \]

If $x$ is not a rational number, then $x \notin S_n$ for any $n \in \mathbb{Z}^+$. Therefore, $f_n(x) = 0$ for all $n \in \mathbb{Z}^+$. Hence,

\[ \lim_{n\to\infty} f_n(x) = 0 \quad \text{if } x \text{ is irrational}. \]

These show that the sequence of functions $\{f_n\}$ converges pointwise to the Dirichlet function $f : [0, 1] \to \mathbb{R}$,
\[ f(x) = \begin{cases} 1, & \text{if } x \text{ is rational}, \\ 0, & \text{if } x \text{ is irrational}. \end{cases} \]

For each $n \in \mathbb{Z}^+$, the set $S_n$ on which $f_n(x) \neq 0$ is finite. Thus the function $f_n : [0, 1] \to \mathbb{R}$ is Riemann integrable. But the Dirichlet function $f$ is not Riemann integrable.

Example 6.6
For each positive integer $n$, let
\[ f_n(x) = \begin{cases} n^2 x(1 - nx), & \text{if } 0 \leq x \leq \frac{1}{n}, \\ 0, & \text{otherwise}. \end{cases} \]
Notice that $f_n$ is integrable on $[0, 1]$. Let $c_n = \displaystyle\int_0^1 f_n(x)\,dx$.

(a) Study the pointwise convergence of the sequence of functions $\{f_n\}$.

(b) Determine the limit of the sequence $\{c_n\}$.

Solution
(a) Since $f_n(0) = 0$ for all $n \in \mathbb{Z}^+$,

\[ \lim_{n\to\infty} f_n(0) = 0. \]

If $x > 0$, there is a positive integer $N$ so that $x > 1/N$. This implies that $f_n(x) = 0$ for all $n \geq N$. Hence, we also have

\[ \lim_{n\to\infty} f_n(x) = 0. \]

If $x < 0$, then $f_n(x) = 0$ for all $n$. Thus, the sequence of functions $\{f_n\}$ converges pointwise to the function $f : \mathbb{R} \to \mathbb{R}$, where $f(x) = 0$ for all $x \in \mathbb{R}$.

(b) We compute $c_n$ directly. For $n \in \mathbb{Z}^+$,

\[ c_n = n^2 \left[ \frac{x^2}{2} - \frac{nx^3}{3} \right]_0^{1/n} = \frac{1}{6}. \]

Hence, the sequence $\{c_n\}$ converges to $1/6$.

In Example 6.6, each of the functions $f_n$ is integrable and the function $f$ is also integrable. However, $\displaystyle\int_0^1 f_n$ does not converge to $\displaystyle\int_0^1 f$.

Figure 6.5: The sequence of functions {fn } defined in Example 6.6.
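The two computations in Example 6.6 can be reproduced with exact rational arithmetic. The sketch below is illustrative only; the names `f` and `c` are our own, and `c` evaluates the antiderivative $n^2\left(\frac{x^2}{2} - \frac{nx^3}{3}\right)$ from the solution at the endpoint $1/n$ rather than performing numerical integration.

```python
from fractions import Fraction


def f(n: int, x: Fraction) -> Fraction:
    """f_n(x) = n^2 x (1 - n x) on [0, 1/n], and 0 elsewhere."""
    if 0 <= x <= Fraction(1, n):
        return n**2 * x * (1 - n * x)
    return Fraction(0)


def c(n: int) -> Fraction:
    """Integral of f_n over [0, 1], via the antiderivative evaluated at 1/n."""
    b = Fraction(1, n)
    return n**2 * (b**2 / 2 - n * b**3 / 3)


x = Fraction(1, 100)
print([f(n, x) for n in (10, 50, 100, 200)])  # eventually 0 for this fixed x
print([c(n) for n in (1, 2, 10, 1000)])        # always 1/6
```

For each fixed $x$ the values $f_n(x)$ are eventually zero, yet every integral equals $1/6$, so the integrals do not converge to the integral of the zero function.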

Now let us consider series of functions.


Definition 6.2 Pointwise Convergence of Series of Functions

A series of functions defined on a set $A$ is a series of the form
\[ \sum_{n=1}^{\infty} f_n(x), \]
where $\{f_n : A \to \mathbb{R}\}$ is a sequence of functions defined on $A$. For such a series, we form the partial sum
\[ s_n(x) = \sum_{k=1}^{n} f_k(x) \quad \text{for } n \geq 1. \]

Then $\{s_n : A \to \mathbb{R}\}$ is a sequence of functions defined on $A$. The domain of convergence of the series $\displaystyle\sum_{n=1}^{\infty} f_n(x)$ is the set

\[ D = \{x \in A \mid \text{the sequence } \{s_n(x)\} \text{ is convergent}\}. \]

It is the largest subset $D$ of $A$ such that the sequence of functions $\{s_n : D \to \mathbb{R}\}$ converges pointwise. For each $x$ in $D$, let
\[ s(x) = \sum_{n=1}^{\infty} f_n(x) = \lim_{n\to\infty} s_n(x) \]
be the sum of the series. Then the sequence of functions $\{s_n : D \to \mathbb{R}\}$ converges pointwise to the function $s(x)$.

Let us reformulate Theorem 5.16 using series of functions.

Example 6.7 Geometric Series

For the series $\displaystyle\sum_{n=0}^{\infty} x^n$, the terms are the functions $f_n(x) = x^n$, $n \geq 0$. They are defined on $A = \mathbb{R}$. The partial sums are
\[ s_n(x) = 1 + x + \cdots + x^n = \begin{cases} \dfrac{1 - x^{n+1}}{1 - x}, & \text{if } x \neq 1, \\ n + 1, & \text{if } x = 1. \end{cases} \]

The domain of convergence is the set $D = (-1, 1)$. For $x \in D$,

\[ s(x) = \sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}. \]

Hence, the series of functions $\displaystyle\sum_{n=0}^{\infty} x^n$ converges pointwise on the interval $(-1, 1)$ to the function $s(x) = \dfrac{1}{1 - x}$.

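Example 6.8
Determine the domain of convergence of the series $\displaystyle\sum_{n=1}^{\infty} e^{-n^2 x}$.

Solution
If $x \leq 0$, then $-n^2 x \geq 0$ for all $n \in \mathbb{Z}^+$. Therefore, $\lim_{n\to\infty} e^{-n^2 x} \neq 0$, and so the series $\displaystyle\sum_{n=1}^{\infty} e^{-n^2 x}$ is divergent.
If $x > 0$, $\lim_{n\to\infty} e^{-n^2 x} = 0$. In this case, notice that $n^2 \geq n$ for all $n \in \mathbb{Z}^+$ implies that
\[ 0 \leq e^{-n^2 x} \leq e^{-nx} \quad \text{for all } n \in \mathbb{Z}^+. \]
Since the series $\displaystyle\sum_{n=1}^{\infty} e^{-nx}$ is a geometric series with positive constant ratio $r = e^{-x} < 1$, it is convergent. By the comparison test, the series $\displaystyle\sum_{n=1}^{\infty} e^{-n^2 x}$ is also convergent.
Therefore, the domain of convergence of the series $\displaystyle\sum_{n=1}^{\infty} e^{-n^2 x}$ is $D = (0, \infty)$.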
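The comparison used in Example 6.8 can be illustrated numerically. In the sketch below, `partial_sum` and `geometric_bound` are our own names; `geometric_bound` is the sum $\frac{r}{1-r}$ of the geometric series $\sum_{n\ge 1} r^n$ with $r = e^{-x}$, which requires $x > 0$.

```python
import math


def partial_sum(x: float, N: int) -> float:
    """Partial sum of exp(-n^2 x) for n = 1, ..., N."""
    return sum(math.exp(-n * n * x) for n in range(1, N + 1))


def geometric_bound(x: float) -> float:
    """Sum of the geometric series sum_{n>=1} exp(-n x), valid for x > 0."""
    r = math.exp(-x)
    return r / (1 - r)


for x in (0.1, 1.0, 5.0):
    print(x, partial_sum(x, 200), geometric_bound(x))

# At x = 0 every term equals 1, so the partial sums grow without bound.
print(partial_sum(0.0, 50), partial_sum(0.0, 500))
```

For positive $x$ the partial sums stay below the geometric bound, while at $x = 0$ the $N$-th partial sum equals $N$.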

Exercises 6.1
Question 1
For each positive integer $n$, let $f_n : \mathbb{R} \to \mathbb{R}$ be the function defined by
\[ f_n(x) = e^{-nx^2}. \]

(a) Determine the pointwise convergence of the sequence of functions $\{f_n\}$.

(b) Determine the pointwise convergence of the sequence of functions $\{f_n'\}$.

Question 2
For each positive integer $n$, let $f_n : (0, \infty) \to \mathbb{R}$ be the function defined by
\[ f_n(x) = \frac{1}{1 + x^n}. \]
(a) Determine the pointwise convergence of the sequence of functions $\{f_n\}$.

(b) Determine the pointwise convergence of the sequence of functions $\{f_n'\}$.

Question 3
For each positive integer $n$, let $f_n : [0, \infty) \to \mathbb{R}$ be the function defined by
\[ f_n(x) = \frac{x}{1 + nx}, \]
and let $c_n = \displaystyle\int_0^1 f_n(x)\,dx$.

(a) Determine the pointwise convergence of the sequence of functions $\{f_n\}$.

(b) Determine the convergence of the sequence $\{c_n\}$.

Question 4
For each positive integer $n$, let $f_n : \mathbb{R} \to \mathbb{R}$ be the function defined by
\[ f_n(x) = n \sin\frac{x}{n}, \]
and let $c_n = \displaystyle\int_0^1 f_n(x)\,dx$.

(a) Study the convergence of the sequence of functions $\{f_n\}$.

(b) Study the convergence of the sequence of functions $\{f_n'\}$.

(c) Determine the limit of the sequence $\{c_n\}$.

Question 5
Find the domain of convergence of the series of functions $\displaystyle\sum_{n=1}^{\infty} n e^{-nx^2}$.

Question 6
Find the domain of convergence of the series of functions $\displaystyle\sum_{n=1}^{\infty} n e^{-n^2 x}$.

6.2 Uniform Convergence of Sequences and Series of Functions

In Section 6.1, we have seen examples where a sequence of functions $\{f_n : D \to \mathbb{R}\}$ converges pointwise to a function $f : D \to \mathbb{R}$, but some properties of the sequence $\{f_n\}$, such as continuity, differentiability, or integrability, are lost in the limit function $f$. We have also seen an example where differentiability is preserved, but the derivative of $f$ is not the limit of the derivatives of the sequence $\{f_n\}$. There is also an example where each function $f_n$ is integrable over an interval $I$, $f$ is also integrable over $I$, but the limit of the sequence $\left\{\int_I f_n\right\}$ is not $\int_I f$.

Let $\{f_n : D \to \mathbb{R}\}$ be a sequence of functions that converges pointwise to the function $f : D \to \mathbb{R}$.

1. Continuity of each $f_n$ does not imply the continuity of $f$:
$$\lim_{x\to x_0}\lim_{n\to\infty} f_n(x) \ \text{ does not necessarily equal } \ \lim_{n\to\infty}\lim_{x\to x_0} f_n(x).$$

2. The derivative of $f$ does not necessarily equal the limit of $\{f_n'\}$:
$$\frac{d}{dx}\lim_{n\to\infty} f_n(x) \ \text{ does not necessarily equal } \ \lim_{n\to\infty}\frac{d}{dx} f_n(x).$$

3. The integral of $f$ over an interval $[a, b]$ does not necessarily equal the limit of the integrals of $f_n$ over $[a, b]$:
$$\int_a^b \lim_{n\to\infty} f_n(x)\,dx \ \text{ does not necessarily equal } \ \lim_{n\to\infty}\int_a^b f_n(x)\,dx.$$

Since derivatives and integrals are also limits, all these pathological behaviors have the same root. Namely, one cannot simply interchange the order of two limits, as has been shown in Section 5.5.

In this section, we introduce the concept of uniform convergence. We are going to see in the next section how this extra condition helps to remedy some of the pathological behaviors mentioned above.
Let us review Example 6.1. The sequence $f_n : [0, 1] \to \mathbb{R}$, $f_n(x) = x^n$, is found to converge pointwise to the function $f : [0, 1] \to \mathbb{R}$ given by
$$f(x) = \begin{cases} 0, & \text{if } 0 \le x < 1,\\ 1, & \text{if } x = 1.\end{cases}$$

For the point x = 1, {fn (1)} converges to f (1) = 1. For any ε > 0, we can take
N = 1. Then for all n ≥ N ,

|fn (1) − f (1)| = 0 < ε.

The same goes for the point $x = 0$. For any other $x$ in the interval $(0, 1)$, $\{f_n(x)\}$ converges to $f(x) = 0$. Given $\varepsilon > 0$, if $\varepsilon < 1$, the smallest $N$ such that
$$|f_n(x) - f(x)| = x^n < \varepsilon \quad \text{for all } n \ge N$$
is the smallest positive integer $N$ such that
$$N > \frac{\ln \varepsilon}{\ln x}.$$
One sees that this number $N$ becomes larger and larger as $x$ approaches 1. The idea of uniform convergence is to say that $N$ can be chosen to be independent of the point $x$ in the domain.
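The dependence of $N$ on $x$ can be seen numerically. The following Python sketch evaluates the threshold $N > \ln\varepsilon / \ln x$ for a few values of $x$ close to 1. The helper name `smallest_N` and the chosen sample values are introduced here only for illustration.

```python
import math

def smallest_N(x: float, eps: float) -> int:
    """Smallest positive integer N with x**n < eps for all n >= N.

    Assumes 0 < x < 1 and 0 < eps < 1, as in the discussion above.
    """
    bound = math.log(eps) / math.log(x)   # both logarithms are negative, so bound > 0
    return math.floor(bound) + 1          # smallest integer strictly greater than bound

eps = 0.01
thresholds = {x: smallest_N(x, eps) for x in (0.5, 0.9, 0.99, 0.999)}
for x, N in thresholds.items():
    print(f"x = {x}: N = {N}")
```

The printed thresholds grow without bound as $x$ approaches 1, so no single $N$ works for every $x$ in $(0, 1)$ at once.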

Definition 6.3 Uniform Convergence of Sequences of Functions


Let D be a subset of real numbers. We say that a sequence of functions
{fn : D → R} converges uniformly to the function f : D → R, provided
that for every ε > 0, there is a positive integer N such that for all n ≥ N ,
and all x ∈ D,
|fn (x) − f (x)| < ε.

Obviously, we have the following.

Proposition 6.1

Let D be a subset of real numbers. If {fn : D → R} is a sequence of


functions that converges uniformly to the function f : D → R, then the
sequence {fn : D → R} converges pointwise to f : D → R.

Figure 6.6: Uniform convergence of a sequence of functions.

Uniform Limit and Pointwise Limit


If a sequence of functions {fn : D → R} converges uniformly, the uniform
limit is the same as the pointwise limit.

Let us compare the definitions of pointwise and uniform convergence using


logical expressions.

Pointwise Convergence versus Uniform Convergence

• The sequence of functions {fn : D → R} converges pointwise to the


function f : D → R.

∀ x ∈ D, ∀ε > 0, ∃N ∈ Z+ , ∀ n ≥ N, |fn (x) − f (x)| < ε.

• The sequence of functions {fn : D → R} converges uniformly to the


function f : D → R.

∀ε > 0, ∃N ∈ Z+ , ∀ x ∈ D, ∀ n ≥ N, |fn (x) − f (x)| < ε.

One sees that it is a matter of the ordering of the quantifiers, but it makes a significant difference when we interchange the order of a universal quantifier and an existential quantifier.
One should also compare the definition of uniform convergence to that of uniform continuity, which we discussed in Section 2.5. In both cases, the uniformity is with respect to the domain D.


Before looking at some examples, let us highlight the negation of uniform convergence.

Non-Uniform Convergence
In logical expressions, the statement that the sequence of functions {fn : D → R} does not converge uniformly to the function f : D → R is expressed by

∃ ε > 0, ∀N ∈ Z+ , ∃ x ∈ D, ∃ n ≥ N, |fn (x) − f (x)| ≥ ε. (6.1)

The following gives a preliminary test for uniform convergence.

Proposition 6.2

If a sequence of functions {fn : D → R} does not converge pointwise,


then it does not converge uniformly.

If the sequence {fn : D → R} does converge pointwise, to show that it


does not converge uniformly, we only need to establish the statement (6.1) with
f : D → R the pointwise limit of the sequence {fn : D → R}.

Example 6.9

For n ≥ 1, let fn : [0, 1] → R be the function fn (x) = xn . Show that the


sequence {fn : [0, 1] → R} does not converge uniformly.

Solution
In Example 6.1, we have seen that the sequence $\{f_n\}$ converges pointwise to the function $f : [0, 1] \to \mathbb{R}$, where $f(x) = 0$ for $x \in [0, 1)$ and $f(1) = 1$. If $\{f_n : [0, 1] \to \mathbb{R}\}$ converges uniformly, it must converge to the same function $f : [0, 1] \to \mathbb{R}$. Take $\varepsilon = \frac{1}{2}$. There must be a positive integer $N$ such that for all $n \ge N$, for all $x \in [0, 1]$,
$$|f_n(x) - f(x)| < \frac{1}{2}.$$
In particular, this says that for all $x \in [0, 1)$,
$$x^N < \frac{1}{2}.$$
This is absurd since $\lim_{x \to 1^-} x^N = 1$. Hence, the sequence $\{f_n : [0, 1] \to \mathbb{R}\}$ does not converge uniformly.

Example 6.10
For $n \ge 1$, let $f_n : \mathbb{R} \to \mathbb{R}$ be the function $f_n(x) = xe^{-nx^2}$. Show that the sequence $\{f_n\}$ converges uniformly.

Solution
In Example 6.4, we have seen that the sequence $\{f_n\}$ converges pointwise to the function $f : \mathbb{R} \to \mathbb{R}$ that is identically 0. Notice that
$$f_n'(x) = (1 - 2nx^2)e^{-nx^2}.$$
This shows that $f_n'(x) > 0$ for $|x| < 1/\sqrt{2n}$, and $f_n'(x) < 0$ for $|x| > 1/\sqrt{2n}$. Since
$$\lim_{x\to-\infty} f_n(x) = 0 \quad \text{and} \quad \lim_{x\to\infty} f_n(x) = 0,$$
we find that $f_n(x)$ decreases from 0 to $f_n(-1/\sqrt{2n})$ when $x$ goes from $-\infty$ to $-1/\sqrt{2n}$, $f_n(x)$ increases from $f_n(-1/\sqrt{2n})$ to $f_n(1/\sqrt{2n})$ when $x$ goes from $-1/\sqrt{2n}$ to $1/\sqrt{2n}$, and $f_n(x)$ decreases from $f_n(1/\sqrt{2n})$ to 0 when $x$ goes from $1/\sqrt{2n}$ to $\infty$. Hence, the minimum and maximum values of $f_n$ are $f_n(-1/\sqrt{2n})$ and $f_n(1/\sqrt{2n})$ respectively. This shows that
$$|f_n(x)| \le f_n\!\left(\frac{1}{\sqrt{2n}}\right) = \frac{1}{\sqrt{2n}}\, e^{-\frac{1}{2}} \le \frac{1}{\sqrt{2n}} \quad \text{for all } x \in \mathbb{R}.$$
Given $\varepsilon > 0$, there is a positive integer $N$ such that $1/\sqrt{2N} < \varepsilon$. For all $n \ge N$, for all $x \in \mathbb{R}$, we find that
$$|f_n(x) - f(x)| = |f_n(x)| \le \frac{1}{\sqrt{2n}} \le \frac{1}{\sqrt{2N}} < \varepsilon.$$
This proves that the sequence of functions $\{f_n\}$ converges uniformly to the function $f$ that is identically 0.

By definition, if a sequence of functions $\{f_n : D \to \mathbb{R}\}$ converges uniformly to the function $f : D \to \mathbb{R}$, then there is a positive integer $N_0$ such that for all $n \ge N_0$,
$$|f_n(x) - f(x)| < 1 \quad \text{for all } x \in D.$$
This implies that for all $n \ge N_0$, the function $(f_n - f) : D \to \mathbb{R}$ is bounded, and thus $M_n = \sup_{x \in D} |f_n(x) - f(x)|$ exists.
The following theorem provides a systematic way to determine whether a
sequence of functions {fn : D → R} converges uniformly.

Theorem 6.3
Let D be a subset of real numbers, and let {fn : D → R} be a sequence of
functions defined on D.

I. If the sequence of functions {fn : D → R} does not converge


pointwise to a function, then it does not converge uniformly.

II. If the sequence of functions {fn : D → R} converges pointwise


to a function f : D → R, for each n ∈ Z+ , define the function
gn : D → R by gn (x) = fn (x) − f (x).

(a) If gn is not bounded for infinitely many n, then the sequence of


functions {fn : D → R} does not converge uniformly.
(b) If only finitely many of the functions $g_n$ are not bounded, there is a positive integer $N_0$ such that $g_n$ is bounded for all $n \ge N_0$. For $n \ge N_0$, let $M_n = \sup_{x \in D} |g_n(x)|$. Then the sequence of functions $\{f_n : D \to \mathbb{R}\}$ converges uniformly to the function $f : D \to \mathbb{R}$ if and only if $\lim_{n\to\infty} M_n = 0$.

Proof
We have addressed I. and II. (a). Let us now consider II. (b). If the sequence of functions $\{f_n : D \to \mathbb{R}\}$ converges uniformly to the function $f : D \to \mathbb{R}$, given $\varepsilon > 0$, there is a positive integer $N \ge N_0$ such that for all $n \ge N$ and for all $x \in D$,
$$|g_n(x)| = |f_n(x) - f(x)| < \frac{\varepsilon}{2}.$$
This gives
$$0 \le M_n = \sup_{x \in D} |g_n(x)| \le \frac{\varepsilon}{2} < \varepsilon \quad \text{for all } n \ge N.$$
Therefore, $\lim_{n\to\infty} M_n = 0$.
Conversely, if $\lim_{n\to\infty} M_n = 0$, given $\varepsilon > 0$, there is a positive integer $N \ge N_0$ such that
$$M_n < \varepsilon \quad \text{for all } n \ge N.$$
It follows that for all $n \ge N$, for all $x \in D$,
$$|f_n(x) - f(x)| = |g_n(x)| \le \sup_{x \in D} |g_n(x)| = M_n < \varepsilon.$$
This proves that the sequence of functions $\{f_n : D \to \mathbb{R}\}$ converges uniformly to the function $f : D \to \mathbb{R}$.

Example 6.11
For the sequence of functions discussed in Example 6.9,
$$f_n(x) - f(x) = \begin{cases} x^n, & \text{if } 0 \le x < 1,\\ 0, & \text{if } x = 1.\end{cases}$$
Therefore,
$$M_n = \sup_{0 \le x \le 1} |f_n(x) - f(x)| = 1.$$
Since $\lim_{n\to\infty} M_n = 1 \ne 0$, Theorem 6.3 implies that the sequence of functions $\{f_n : [0, 1] \to \mathbb{R}\}$ with $f_n(x) = x^n$ does not converge uniformly.

Example 6.12

For the sequence of functions discussed in Example 6.10, $f_n(x) - f(x) = f_n(x) = xe^{-nx^2}$. We have shown that
$$M_n = \sup_{x \in \mathbb{R}} |f_n(x) - f(x)| \le \frac{1}{\sqrt{2n}}.$$
This implies that $\lim_{n\to\infty} M_n = 0$. Hence, Theorem 6.3 says that the sequence of functions $\{f_n : \mathbb{R} \to \mathbb{R}\}$ with $f_n(x) = xe^{-nx^2}$ converges uniformly.
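The two behaviours of $M_n$ in Examples 6.11 and 6.12 can be checked numerically. The sketch below approximates the suprema by sampling on a finite grid, which only gives a lower bound for the true supremum. The function names, the grid size, and the truncation of $\mathbb{R}$ to $[-2, 2]$ are assumptions made for this illustration.

```python
import math

def sup_on_grid(g, a, b, points=20001):
    """Approximate sup |g| on [a, b] by sampling an evenly spaced grid."""
    step = (b - a) / (points - 1)
    return max(abs(g(a + i * step)) for i in range(points))

def M_power(n):
    # f_n(x) = x**n on [0, 1]; f_n - f equals x**n on [0, 1) and 0 at x = 1
    return sup_on_grid(lambda x: x ** n if x < 1 else 0.0, 0.0, 1.0)

def M_gauss(n):
    # f_n(x) = x * exp(-n x^2); the maximum of |f_n| is attained at |x| = 1/sqrt(2n),
    # which lies inside [-2, 2] for every n >= 1
    return sup_on_grid(lambda x: x * math.exp(-n * x * x), -2.0, 2.0)

for n in (1, 10, 100):
    exact = math.exp(-0.5) / math.sqrt(2 * n)   # value f_n(1/sqrt(2n)) from Example 6.10
    print(n, M_power(n), M_gauss(n), exact)
```

For the first sequence the grid values stay close to 1, while for the second they agree with $e^{-1/2}/\sqrt{2n}$ and decrease to 0.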

To apply Theorem 6.3, we need to know a priori the pointwise limit of the sequence of functions {fn } to be able to conclude the uniform convergence of the sequence. Sometimes it could be difficult to find the limit function. To circumvent this problem, we introduce the concept of a uniformly Cauchy sequence of functions.

Definition 6.4 Uniformly Cauchy Sequence of Functions

Let D be a subset of real numbers. A sequence of functions {fn : D → R}


is uniformly Cauchy provided that for every ε > 0, there is a positive
integer N such that for all m ≥ n ≥ N ,

|fm (x) − fn (x)| < ε for all x ∈ D.

We have the following.

Theorem 6.4
Cauchy Criterion for Uniform Convergence of Sequences of Functions

A sequence of functions {fn : D → R} converges uniformly if and only if


it is uniformly Cauchy.

Proof
If the sequence of functions $\{f_n : D \to \mathbb{R}\}$ converges uniformly to $f : D \to \mathbb{R}$, given $\varepsilon > 0$, there is a positive integer $N$ such that for all $n \ge N$,
$$|f_n(x) - f(x)| < \frac{\varepsilon}{2} \quad \text{for all } x \in D.$$
By the triangle inequality, for all $m \ge n \ge N$,
$$|f_m(x) - f_n(x)| \le |f_m(x) - f(x)| + |f(x) - f_n(x)| < \varepsilon \quad \text{for all } x \in D.$$
This proves that the sequence $\{f_n : D \to \mathbb{R}\}$ is uniformly Cauchy.

Conversely, if the sequence of functions $\{f_n : D \to \mathbb{R}\}$ is uniformly Cauchy, then for each $x \in D$, the sequence $\{f_n(x)\}$ is a Cauchy sequence. Hence, it converges to a number $f(x)$. This shows that the sequence of functions $\{f_n : D \to \mathbb{R}\}$ converges pointwise to a function $f : D \to \mathbb{R}$. To show that the convergence is uniform, given $\varepsilon > 0$, there is a positive integer $N$ such that for all $m \ge n \ge N$,
$$|f_m(x) - f_n(x)| < \frac{\varepsilon}{2} \quad \text{for all } x \in D.$$
For each $x \in D$, fix $n \ge N$ and take the limit $m \to \infty$. We find that
$$|f_n(x) - f(x)| \le \frac{\varepsilon}{2}.$$
This proves that for all $n \ge N$, for all $x \in D$,
$$|f_n(x) - f(x)| < \varepsilon.$$
Therefore, the sequence of functions $\{f_n : D \to \mathbb{R}\}$ converges uniformly.

Using Theorem 6.4, Theorem 6.3 can be refined as follows. The proof is straightforward and we leave it to the exercises.

Theorem 6.5
Given that $\{f_n : D \to \mathbb{R}\}$ is a sequence of functions defined on the subset $D$ of real numbers, for each pair $(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+$, define the extended real number $M_{m,n}$ as
$$M_{m,n} = \sup_{x \in D} |f_m(x) - f_n(x)|.$$
Then the sequence of functions $\{f_n : D \to \mathbb{R}\}$ converges uniformly if and only if the double sequence $\{M_{m,n}\}$ converges to 0.

Next we turn to series of functions.

Definition 6.5 Uniform Convergence of Series of Functions

Let $D$ be a subset of real numbers and let $\{f_n : D \to \mathbb{R}\}$ be a sequence of functions defined on $D$. We say that the series of functions $\sum_{n=1}^{\infty} f_n(x)$ converges uniformly to the function $s : D \to \mathbb{R}$ provided that the sequence of partial sums $\{s_n : D \to \mathbb{R}\}$ with $s_n(x) = \sum_{k=1}^{n} f_k(x)$ converges uniformly to the function $s(x)$.

Uniform Convergence of Series of Functions


A necessary condition for a series of functions to converge uniformly is that it converges pointwise.
When the series of functions $\sum_{n=1}^{\infty} f_n(x)$ converges pointwise to a function $s(x)$ on a set $D$, then for any $n \in \mathbb{Z}^+$ and any $x \in D$, the series
$$\sum_{k=n}^{\infty} f_k(x)$$
converges pointwise to the function $s(x) - s_{n-1}(x)$, where $s_n(x) = \sum_{k=1}^{n} f_k(x)$ is the $n$th partial sum, and $s_0(x) = 0$ by default.
Therefore, we can reformulate the definition of uniform convergence of series of functions as follows. The series $\sum_{n=1}^{\infty} f_n(x)$ converges uniformly to the function $s(x)$ on the set $D$ provided that for any $\varepsilon > 0$, there is a positive integer $N$ such that for all $n \ge N$,
$$\left|\sum_{k=n}^{\infty} f_k(x)\right| < \varepsilon \quad \text{for all } x \in D.$$
k=n

In most cases, such as Example 6.8, we can only show that a series of functions converges pointwise, but we cannot find an explicit closed form for the sum s(x). In this case, a Cauchy criterion becomes useful.

From Theorem 6.4, we obtain the following immediately.

Theorem 6.6
Cauchy Criterion for Uniform Convergence of Series of Functions

A series of functions $\sum_{n=1}^{\infty} f_n(x)$ converges uniformly on a set $D$ if and only if for every $\varepsilon > 0$, there is a positive integer $N$ such that for all $m \ge n \ge N$,
$$\left|\sum_{k=n}^{m} f_k(x)\right| < \varepsilon \quad \text{for all } x \in D.$$

Example 6.13

For the series $\sum_{n=1}^{\infty} e^{-n^2 x}$ considered in Example 6.8, we have shown that it converges pointwise on the interval $(0, \infty)$. Let us prove that the convergence is uniform on any set $D$ of the form $D = [a, \infty)$, where $a$ is a positive constant.
We notice that if $m \ge n$ and $x \ge a$,
$$0 \le \sum_{k=n}^{m} e^{-k^2 x} \le \sum_{k=n}^{m} e^{-kx} \le \sum_{k=n}^{\infty} e^{-kx} \le \sum_{k=n}^{\infty} e^{-ka} = \frac{e^{-na}}{1 - e^{-a}}.$$
Given $\varepsilon > 0$, since
$$\lim_{n\to\infty} \frac{e^{-na}}{1 - e^{-a}} = 0,$$
there exists a positive integer $N$ such that for all $n \ge N$,
$$0 \le \frac{e^{-na}}{1 - e^{-a}} < \varepsilon.$$
It follows that for all $m \ge n \ge N$, and for all $x \in [a, \infty)$,
$$\sum_{k=n}^{m} e^{-k^2 x} \le \frac{e^{-na}}{1 - e^{-a}} < \varepsilon.$$
By Theorem 6.6, the series $\sum_{n=1}^{\infty} e^{-n^2 x}$ converges uniformly on $[a, \infty)$.


The readers are invited to show that the series $\sum_{n=1}^{\infty} e^{-n^2 x}$ does not converge uniformly on the set $(0, \infty)$. It is a typical situation that although a series converges pointwise on a set $A$, it fails to converge uniformly on $A$, but it converges uniformly on subsets of $A$. Most of the time, we do not need uniform convergence on $A$; uniform convergence on a collection of subsets of $A$ whose union is $A$ is enough. In the example above, $\mathcal{C} = \{[a, \infty) \mid a > 0\}$ is a collection of subsets of $A = (0, \infty)$ whose union is $A$.

Definition 6.6 Absolute Convergence of Series of Functions



A series of functions $\sum_{n=1}^{\infty} f_n(x)$ is said to converge absolutely on a set $D$ if the series $\sum_{n=1}^{\infty} |f_n(x)|$ converges pointwise on $D$. In this case, the series $\sum_{n=1}^{\infty} f_n(x)$ also converges pointwise on $D$.

Now we present a useful test to show that a series of functions converges


absolutely and uniformly on a set D.

Theorem 6.7
Let $\{f_n : D \to \mathbb{R}\}$ be a sequence of functions defined on $D$. If the series $\sum_{n=1}^{\infty} |f_n(x)|$ converges uniformly on $D$, then the series $\sum_{n=1}^{\infty} f_n(x)$ converges absolutely and uniformly on $D$.

Proof

Since the series $\sum_{n=1}^{\infty} |f_n(x)|$ converges uniformly on $D$, it also converges pointwise. Hence, the series $\sum_{n=1}^{\infty} f_n(x)$ converges absolutely on $D$.
Since $\sum_{n=1}^{\infty} |f_n(x)|$ converges uniformly on $D$, it is uniformly Cauchy. Given $\varepsilon > 0$, there is a positive integer $N$ such that for all $m \ge n \ge N$,
$$\sum_{k=n}^{m} |f_k(x)| < \varepsilon \quad \text{for all } x \in D.$$
The triangle inequality implies that
$$\left|\sum_{k=n}^{m} f_k(x)\right| \le \sum_{k=n}^{m} |f_k(x)| < \varepsilon \quad \text{for all } x \in D.$$
Hence, the series $\sum_{n=1}^{\infty} f_n(x)$ is also uniformly Cauchy on $D$. Therefore, it also converges uniformly.

Theorem 6.8 Weierstrass M-Test

Let $\{f_n : D \to \mathbb{R}\}$ be a sequence of functions defined on $D$. Assume that the following conditions are satisfied.

(i) For each $n \in \mathbb{Z}^+$, there is a positive constant $M_n$ such that $|f_n(x)| \le M_n$ for all $x \in D$.

(ii) The series $\sum_{n=1}^{\infty} M_n$ is convergent.

Then the series $\sum_{n=1}^{\infty} f_n(x)$ converges absolutely and uniformly on $D$.

Proof

By Theorem 6.7, we only need to show that the series $\sum_{n=1}^{\infty} |f_n(x)|$ converges uniformly on $D$. Given $\varepsilon > 0$, since the series $\sum_{n=1}^{\infty} M_n$ is convergent, there is a positive integer $N$ such that for all $m \ge n \ge N$,
$$\sum_{k=n}^{m} M_k < \varepsilon.$$
This implies that
$$\sum_{k=n}^{m} |f_k(x)| \le \sum_{k=n}^{m} M_k < \varepsilon \quad \text{for all } x \in D.$$
By Theorem 6.6, the series $\sum_{n=1}^{\infty} |f_n(x)|$ converges uniformly on $D$.

Example 6.14

Let $a$ be a positive number. Show that the series $\sum_{n=1}^{\infty} (-1)^{n-1} e^{-n^2 x}$ converges absolutely and uniformly on $[a, \infty)$.

Solution
For $n \in \mathbb{Z}^+$, let $f_n(x) = (-1)^{n-1} e^{-n^2 x}$. For $x \in [a, \infty)$, $x \ge a$. Hence, for $n \in \mathbb{Z}^+$,
$$|f_n(x)| = e^{-n^2 x} \le e^{-nx} \le e^{-na}.$$
Since $r = e^{-a} < 1$, the geometric series $\sum_{n=1}^{\infty} e^{-na}$ is convergent. By the Weierstrass M-test with $M_n = e^{-na}$, the series $\sum_{n=1}^{\infty} (-1)^{n-1} e^{-n^2 x}$ converges absolutely and uniformly on $[a, \infty)$.
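As a numerical sanity check of the two hypotheses of the Weierstrass M-test in Example 6.14, the following Python sketch verifies the termwise bound at a few sample points and compares a partial sum of $\sum M_n$ with the closed form of the geometric series. The value $a = 0.5$, the sample points, and the function names are arbitrary choices made for this illustration. A finite check of this kind does not replace the proof.

```python
import math

a = 0.5  # illustrative choice of the left endpoint of [a, infinity)

def M(n):
    # bound from the solution: |f_n(x)| = exp(-n^2 x) <= exp(-n a) for x >= a
    return math.exp(-n * a)

def f(n, x):
    # n-th term of the series in Example 6.14
    return (-1) ** (n - 1) * math.exp(-n * n * x)

# closed form of the geometric series sum_{n>=1} e^{-na}, whose ratio is e^{-a}
geometric_sum = math.exp(-a) / (1 - math.exp(-a))
partial_M = sum(M(n) for n in range(1, 200))

# check the termwise bound |f_n(x)| <= M_n at a few sample points of [a, infinity)
samples = [a, a + 0.1, 1.0, 5.0, 50.0]
bound_holds = all(abs(f(n, x)) <= M(n) for n in range(1, 50) for x in samples)
print(partial_M, geometric_sum, bound_holds)
```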

Exercises 6.2
Question 1
For $n \ge 1$, let $f_n : [0, 1] \to \mathbb{R}$ be the function $f_n(x) = e^{-nx^2}$. Show that the sequence of functions $\{f_n\}$ does not converge uniformly.

Question 2
For $n \ge 1$, let $f_n : \mathbb{R} \to \mathbb{R}$ be the function $f_n(x) = n \sin\dfrac{x}{n}$. Show that the sequence of functions $\{f_n\}$ does not converge uniformly.

Question 3
For $n \ge 1$, let $f_n : [0, 2\pi] \to \mathbb{R}$ be the function $f_n(x) = n \sin\dfrac{x}{n}$. Show that the sequence of functions $\{f_n\}$ converges uniformly.

Question 4
For $n \ge 1$, let $f_n : [0, \infty) \to \mathbb{R}$ be the function $f_n(x) = \dfrac{x}{1 + nx}$. Determine whether the sequence of functions $\{f_n\}$ converges uniformly.

Question 5
Let $a$ be a positive constant. Show that the series $\sum_{n=1}^{\infty} (-1)^{n-1} n e^{-n^2 x}$ converges absolutely and uniformly on the set $[a, \infty)$.

6.3 Properties of Uniform Limits of Functions

In this section, we are going to see how uniform convergence can avoid the pathological behaviors we mentioned at the beginning of Section 6.2. First we show that the uniform limit of continuous functions is continuous. This is a very important result in mathematical analysis.

Theorem 6.9 Uniform Limit of Continuous Functions is Continuous


Given that D is a subset of real numbers, and {fn : D → R} is a sequence
of continuous functions that converges uniformly to the function f : D →
R. Then the function f : D → R is continuous.

Proof
The proof is a standard $\varepsilon/3$ argument. Given $x_0 \in D$, we want to show that $f$ is continuous at $x_0$ using the $\varepsilon$-$\delta$ argument. Given $\varepsilon > 0$, there is a positive integer $N$ such that for all $n \ge N$,
$$|f_n(x) - f(x)| < \frac{\varepsilon}{3} \quad \text{for all } x \in D.$$
We are only going to use this statement when $n = N$. Since $f_N$ is continuous at $x_0$, there is a $\delta > 0$ such that for all $x \in D$, if $|x - x_0| < \delta$, then
$$|f_N(x) - f_N(x_0)| < \frac{\varepsilon}{3}.$$
From these, we find that if $x$ is in $D$ and $|x - x_0| < \delta$, then
$$|f(x) - f(x_0)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(x_0)| + |f_N(x_0) - f(x_0)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$
This proves that $f$ is continuous at $x_0$.

Example 6.15

For the sequence of functions $\{f_n : [0, 1] \to \mathbb{R}\}$ with $f_n(x) = x^n$, its pointwise limit $f : [0, 1] \to \mathbb{R}$,
$$f(x) = \begin{cases} 0, & \text{if } 0 \le x < 1,\\ 1, & \text{if } x = 1,\end{cases}$$
is not continuous. Since each $f_n$, $n \in \mathbb{Z}^+$, is a continuous function, Theorem 6.9 can be used to infer that the sequence of functions $\{f_n : [0, 1] \to \mathbb{R}\}$ with $f_n(x) = x^n$ does not converge uniformly.

Applying Theorem 6.9 to series of functions, we have the following.

Corollary 6.10

Given that $\{f_n : D \to \mathbb{R}\}$ is a sequence of continuous functions defined on $D$. If the series of functions $\sum_{n=1}^{\infty} f_n(x)$ converges uniformly on $D$, then it defines a continuous function $s : D \to \mathbb{R}$ by
$$s(x) = \sum_{n=1}^{\infty} f_n(x).$$

Proof
We apply Theorem 6.9 to the sequence of partial sums $\{s_n(x)\}$. Since $s_n(x) = \sum_{k=1}^{n} f_k(x)$ is a finite sum of continuous functions, it is continuous. By Theorem 6.9, $s(x) = \lim_{n\to\infty} s_n(x)$ is continuous.

Next, we turn to integration.



Theorem 6.11
Assume that for each $n \in \mathbb{Z}^+$, the function $f_n : [a, b] \to \mathbb{R}$ is Riemann integrable. If the sequence of functions $\{f_n : [a, b] \to \mathbb{R}\}$ converges uniformly to the function $f : [a, b] \to \mathbb{R}$, then $f : [a, b] \to \mathbb{R}$ is also Riemann integrable, and the order of the limit operation and the integration operation can be interchanged. Namely,
$$\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx = \int_a^b \lim_{n\to\infty} f_n(x)\,dx. \tag{6.2}$$

Notice that we only assume that each fn is Riemann integrable. We do not


need to assume that it is continuous.

Proof
Given $\varepsilon > 0$, there is a positive integer $N$ such that for all $n \ge N$,
$$|f_n(x) - f(x)| < \frac{\varepsilon}{3(b - a)} \quad \text{for all } x \in [a, b]. \tag{6.3}$$
First we take $n = N$. Since $f_N : [a, b] \to \mathbb{R}$ is Riemann integrable, there is a partition $P = \{x_i\}_{i=0}^{k}$ of $[a, b]$ such that
$$U(f_N, P) - L(f_N, P) < \frac{\varepsilon}{3}.$$
From (6.3), we have
$$f_N(x) - \frac{\varepsilon}{3(b - a)} < f(x) < f_N(x) + \frac{\varepsilon}{3(b - a)} \quad \text{for all } x \in [a, b].$$
For any $1 \le i \le k$, if $x \in [x_{i-1}, x_i]$,
$$\inf_{x_{i-1} \le x \le x_i} f_N(x) - \frac{\varepsilon}{3(b - a)} \le f(x) \le \sup_{x_{i-1} \le x \le x_i} f_N(x) + \frac{\varepsilon}{3(b - a)}.$$
This implies that
$$\inf_{x_{i-1} \le x \le x_i} f_N(x) - \frac{\varepsilon}{3(b - a)} \le \inf_{x_{i-1} \le x \le x_i} f(x) \le \sup_{x_{i-1} \le x \le x_i} f(x) \le \sup_{x_{i-1} \le x \le x_i} f_N(x) + \frac{\varepsilon}{3(b - a)}.$$
Therefore,
$$U(f, P) \le U(f_N, P) + \frac{\varepsilon}{3}, \qquad L(f, P) \ge L(f_N, P) - \frac{\varepsilon}{3}.$$
Thus,
$$U(f, P) - L(f, P) \le U(f_N, P) - L(f_N, P) + \frac{2\varepsilon}{3} < \frac{\varepsilon}{3} + \frac{2\varepsilon}{3} = \varepsilon.$$
From this, we conclude that $f$ is Riemann integrable. This in turn implies that for any $n \in \mathbb{Z}^+$, the function $f_n - f$ is Riemann integrable on $[a, b]$, and so is the function $|f_n - f|$. With the same $\varepsilon > 0$, we find from (6.3) that for any $n \ge N$,
$$\left|\int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx\right| = \left|\int_a^b (f_n(x) - f(x))\,dx\right| \le \int_a^b |f_n(x) - f(x)|\,dx \le \frac{\varepsilon}{3} < \varepsilon.$$
This proves that (6.2) holds.

Applying Theorem 6.11 to series of functions, we have the following.

Corollary 6.12

Given that $\{f_n : [a, b] \to \mathbb{R}\}$ is a sequence of Riemann integrable functions. If the series $\sum_{n=1}^{\infty} f_n(x)$ converges uniformly, then the function $s(x) = \sum_{n=1}^{\infty} f_n(x)$ is Riemann integrable, the series $\sum_{n=1}^{\infty} \int_a^b f_n(x)\,dx$ is convergent, and we can integrate term by term. Namely,
$$\int_a^b s(x)\,dx = \int_a^b \sum_{n=1}^{\infty} f_n(x)\,dx = \sum_{n=1}^{\infty} \int_a^b f_n(x)\,dx.$$

Proof
We apply Theorem 6.11 to the sequence of partial sums $\{s_n(x)\}$. Since $s_n(x) = \sum_{k=1}^{n} f_k(x)$ is a finite sum of Riemann integrable functions, it is Riemann integrable. The rest follows from Theorem 6.11.

Example 6.16

In Example 6.6, the sequence of functions $\{f_n : [0, 1] \to \mathbb{R}\}$ converges pointwise to the function $f : [0, 1] \to \mathbb{R}$ that is identically 0. However, since $\int_0^1 f_n(x)\,dx = 1/6$, the sequence $\left\{\int_0^1 f_n(x)\,dx\right\}$ does not converge to $\int_0^1 f(x)\,dx = 0$. Theorem 6.11 can be used to deduce that $\{f_n : [0, 1] \to \mathbb{R}\}$ does not converge to $f : [0, 1] \to \mathbb{R}$ uniformly.
In fact, one can verify that
$$M_n = \sup_{0 \le x \le 1} |f_n(x) - f(x)| = \sup_{0 \le x \le 1/n} n^2 x(1 - nx) = \frac{n}{4}.$$
Since $\lim_{n\to\infty} M_n \ne 0$, $\{f_n : [0, 1] \to \mathbb{R}\}$ does not converge to $f : [0, 1] \to \mathbb{R}$ uniformly.
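The two numbers quoted in Example 6.16 can be reproduced numerically. The sketch below assumes that the functions of Example 6.6 are $f_n(x) = n^2 x(1 - nx)$ for $0 \le x \le 1/n$ and $f_n(x) = 0$ otherwise. This form is inferred from the supremum formula above and is not restated in this section. The integral is approximated by the midpoint rule and the supremum by a grid search, and both helper names are introduced only for this illustration.

```python
def f(n, x):
    # assumed form of the Example 6.6 functions, inferred from the supremum formula
    return n * n * x * (1 - n * x) if 0 <= x <= 1 / n else 0.0

def integral_01(n, cells=20_000):
    """Midpoint-rule approximation of the integral of f_n over [0, 1]."""
    h = 1.0 / cells
    return h * sum(f(n, (i + 0.5) * h) for i in range(cells))

def sup_01(n, points=20_001):
    """Grid approximation of sup f_n on [0, 1]."""
    return max(f(n, i / (points - 1)) for i in range(points))

for n in (1, 4, 16):
    print(n, integral_01(n), sup_01(n), n / 4)
```

Under this assumption the integrals stay near $1/6$ while the suprema grow like $n/4$, which matches the failure of uniform convergence.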

Now we consider differentiation. In Example 6.10, we have shown that the sequence $\{f_n : \mathbb{R} \to \mathbb{R}\}$ defined by $f_n(x) = xe^{-nx^2}$ converges uniformly to the function $f : \mathbb{R} \to \mathbb{R}$ that is identically zero. In Example 6.4, we have seen that the derivative sequence $\{f_n'\}$ converges to the function $g : \mathbb{R} \to \mathbb{R}$ given by
$$g(x) = \begin{cases} 0, & \text{if } x \ne 0,\\ 1, & \text{if } x = 0.\end{cases}$$
We find that
$$\left.\frac{d}{dx}\right|_{x=0} \lim_{n\to\infty} f_n(x) = 0 \quad \text{which is not equal to} \quad \lim_{n\to\infty} \left.\frac{d}{dx}\right|_{x=0} f_n(x) = 1.$$
Hence, even though the sequence of functions $\{f_n\}$ converges uniformly, we cannot interchange limit with differentiation.
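A short Python sketch makes the mismatch concrete, using the formula $f_n'(x) = (1 - 2nx^2)e^{-nx^2}$ from Example 6.10. The sample values of $n$ and the point $x = 0.1$ are arbitrary choices for illustration.

```python
import math

def f(n, x):
    return x * math.exp(-n * x * x)

def df(n, x):
    # derivative computed in Example 6.10: f_n'(x) = (1 - 2 n x^2) e^{-n x^2}
    return (1 - 2 * n * x * x) * math.exp(-n * x * x)

for n in (1, 10, 100, 1000):
    sup_fn = f(n, 1 / math.sqrt(2 * n))   # maximum of |f_n|, which tends to 0
    print(n, sup_fn, df(n, 0.0), df(n, 0.1))
```

The supremum of $|f_n|$ shrinks to 0 and $f_n'(0.1)$ tends to 0, yet $f_n'(0) = 1$ for every $n$, so the limit of the derivatives at 0 differs from the derivative of the limit.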
The following theorem gives a sufficient condition for interchanging limit with
differentiation.

Theorem 6.13
Given that {fn : (a, b) → R} is a sequence of functions which satisfies the
following conditions.

(i) There is a point x0 in the interval (a, b) such that the sequence
{fn (x0 )} converges to a number y0 .

(ii) For each n ∈ Z+ , fn : (a, b) → R is differentiable.

(iii) The sequence of derivative functions {fn′ : (a, b) → R} converges


uniformly to a function g : (a, b) → R.

Then we have the following.

(a) The sequence of functions {fn : (a, b) → R} converges uniformly to a


function f : (a, b) → R.

(b) The function f : (a, b) → R is differentiable.

(c) We can interchange differentiation and limits. Namely, for any $x \in (a, b)$,
$$f'(x) = \frac{d}{dx} \lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \frac{d}{dx} f_n(x) = g(x).$$

Proof
For each $n \in \mathbb{Z}^+$, since $f_n : (a, b) \to \mathbb{R}$ is differentiable, it is continuous. Given a point $c$ in the interval $(a, b)$, let $\{h_{n,c} : (a, b) \to \mathbb{R}\}$ be a sequence of functions defined by
$$h_{n,c}(x) = \begin{cases} \dfrac{f_n(x) - f_n(c)}{x - c}, & \text{if } x \ne c,\\[2mm] f_n'(c), & \text{if } x = c.\end{cases} \tag{6.4}$$

Then $h_{n,c} : (a, b) \to \mathbb{R}$ is a continuous function. For any positive integers $m$ and $n$, we have
$$h_{m,c}(x) - h_{n,c}(x) = \begin{cases} \dfrac{(f_m(x) - f_n(x)) - (f_m(c) - f_n(c))}{x - c}, & \text{if } x \ne c,\\[2mm] f_m'(c) - f_n'(c), & \text{if } x = c.\end{cases}$$
Applying the mean value theorem to the differentiable function $f_m(x) - f_n(x)$, we find that for any $x \in (a, b) \setminus \{c\}$, there is a point $\xi_x$ in between $x$ and $c$ such that
$$h_{m,c}(x) - h_{n,c}(x) = f_m'(\xi_x) - f_n'(\xi_x).$$
Thus, we find that for any $x \in (a, b)$,
$$|h_{m,c}(x) - h_{n,c}(x)| \le \sup_{a < x < b} |f_m'(x) - f_n'(x)|.$$
This implies that
$$\sup_{a < x < b} |h_{m,c}(x) - h_{n,c}(x)| \le \sup_{a < x < b} |f_m'(x) - f_n'(x)|. \tag{6.5}$$
Since the sequence of functions $\{f_n'\}$ converges uniformly, Theorem 6.5 implies that
$$\lim_{m,n\to\infty} \sup_{a < x < b} |f_m'(x) - f_n'(x)| = 0.$$
Eq. (6.5) then implies that
$$\lim_{m,n\to\infty} \sup_{a < x < b} |h_{m,c}(x) - h_{n,c}(x)| = 0.$$
By Theorem 6.5 again, we find that the sequence of functions $\{h_{n,c} : (a, b) \to \mathbb{R}\}$ converges uniformly.
Now we specialize to $c = x_0$. Notice that by definition,
$$f_m(x) - f_n(x) = f_m(x_0) - f_n(x_0) + (x - x_0)\left(h_{m,x_0}(x) - h_{n,x_0}(x)\right). \tag{6.6}$$
Given $\varepsilon > 0$, since the sequence $\{f_n(x_0)\}$ is convergent, there is a positive integer $N_1$ such that for all $m \ge n \ge N_1$,
$$|f_m(x_0) - f_n(x_0)| < \frac{\varepsilon}{2}.$$

Since the sequence of functions $\{h_{n,x_0}(x)\}$ converges uniformly, Theorem 6.4 implies that there is a positive integer $N \ge N_1$ such that for all $m \ge n \ge N$,
$$|h_{m,x_0}(x) - h_{n,x_0}(x)| < \frac{\varepsilon}{2(b - a)} \quad \text{for all } x \in (a, b).$$
Eq. (6.6) implies that for all $m \ge n \ge N$, and for all $x \in (a, b)$,
$$|f_m(x) - f_n(x)| \le |f_m(x_0) - f_n(x_0)| + |x - x_0|\,|h_{m,x_0}(x) - h_{n,x_0}(x)| < \frac{\varepsilon}{2} + (b - a) \times \frac{\varepsilon}{2(b - a)} = \varepsilon.$$
By Theorem 6.4, this proves that the sequence of functions $\{f_n : (a, b) \to \mathbb{R}\}$ converges uniformly. Let
$$f(x) = \lim_{n\to\infty} f_n(x)$$
be the limit function. Since $f$ is the limit of a sequence of continuous functions that converges uniformly, Theorem 6.9 says that the function $f : (a, b) \to \mathbb{R}$ is continuous.
Now we want to prove that $f$ is differentiable and $f'(x) = g(x)$ for each $x \in (a, b)$. For any fixed $c \in (a, b)$, since the sequence of continuous functions $\{h_{n,c}(x)\}$ converges uniformly, it also converges pointwise. Taking the limit $n \to \infty$ in (6.4), we find that
$$h_c(x) = \lim_{n\to\infty} h_{n,c}(x) = \begin{cases} \dfrac{f(x) - f(c)}{x - c}, & \text{if } x \ne c,\\[2mm] g(c), & \text{if } x = c.\end{cases}$$
Since $\{h_{n,c}(x)\}$ is a sequence of continuous functions that converges uniformly, Theorem 6.9 says that the limit function $h_c : (a, b) \to \mathbb{R}$ is continuous. Therefore,
$$g(c) = h_c(c) = \lim_{x\to c} h_c(x) = \lim_{x\to c} \frac{f(x) - f(c)}{x - c}.$$
This shows that $f$ is differentiable at $x = c$ and $f'(c) = g(c)$.

Remark 6.1

1. In Theorem 6.13, we do not need to assume that the sequence of functions $\{f_n\}$ converges uniformly. It is a consequence of the uniform convergence of the sequence $\{f_n'\}$. The condition that there is a point $x_0$ in $(a, b)$ so that the sequence $\{f_n(x_0)\}$ converges is necessary. Otherwise, if we let $\tilde{f}_n(x) = f_n(x) + n$ for $n \in \mathbb{Z}^+$, then $\tilde{f}_n'(x) = f_n'(x)$, but the sequence $\{\tilde{f}_n\}$ does not converge when the sequence $\{f_n\}$ is convergent.

2. If we assume that for all n ∈ Z+ , the function fn : (a, b) → R is


continuously differentiable, there is an easier proof for the conclusions
in Theorem 6.13.

Applying Theorem 6.13 to series of functions, we have the following.

Corollary 6.14

Let $\{f_n : (a, b) \to \mathbb{R}\}$ be a sequence of differentiable functions. Assume that there is an $x_0 \in (a, b)$ such that the series $\sum_{n=1}^{\infty} f_n(x_0)$ is convergent, and the series $\sum_{n=1}^{\infty} f_n'(x)$ converges uniformly on $(a, b)$. Then the series $\sum_{n=1}^{\infty} f_n(x)$ converges uniformly on $(a, b)$ to a differentiable function whose derivative is given by
$$\frac{d}{dx} \sum_{n=1}^{\infty} f_n(x) = \sum_{n=1}^{\infty} f_n'(x) \quad \text{for all } x \in (a, b).$$

Proof
We apply Theorem 6.13 to the sequence of partial sums $\{s_n(x)\}$. Since $s_n(x) = \sum_{k=1}^{n} f_k(x)$ is a finite sum of differentiable functions, it is differentiable. The rest follows from Theorem 6.13.

Example 6.17

Consider the series $\sum_{n=1}^{\infty} e^{-n^2 x}$ discussed in Example 6.13. We have shown that it converges uniformly on $[a, \infty)$ when $a$ is a positive number. For each $n \in \mathbb{Z}^+$, $f_n(x) = -\dfrac{1}{n^2} e^{-n^2 x}$ is a differentiable function with derivative
$$\frac{d}{dx}\left(-\frac{1}{n^2} e^{-n^2 x}\right) = e^{-n^2 x}.$$
Notice that
$$|f_n(x)| \le \frac{1}{n^2} \quad \text{for all } x \in [0, \infty).$$
Since the series $\sum_{n=1}^{\infty} \dfrac{1}{n^2}$ is convergent, the Weierstrass M-test shows that the series $\sum_{n=1}^{\infty} f_n(x) = \sum_{n=1}^{\infty} -\dfrac{1}{n^2} e^{-n^2 x}$ converges absolutely and uniformly on $[0, \infty)$. Corollary 6.14 shows that for any $x \in (a, \infty)$, we can do term by term differentiation and obtain
$$\frac{d}{dx} \sum_{n=1}^{\infty} -\frac{1}{n^2} e^{-n^2 x} = \sum_{n=1}^{\infty} e^{-n^2 x}. \tag{6.7}$$
Since $a > 0$ is arbitrary, eq. (6.7) holds for any $x > 0$. However, this is not true for $x = 0$ even if we only consider right derivatives, as the right hand side of the equation is divergent when $x = 0$.
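The identity (6.7) can be explored numerically with truncated sums. In the following Python sketch, `F` and `G` are partial sums of the two series in (6.7), and the derivative of `F` is approximated by a central difference. The truncation length, the step size, and the function names are assumptions of this illustration. For a finite sum the identity holds trivially, so the sketch only illustrates the values involved and their behaviour as $x$ approaches 0.

```python
import math

def F(x, terms=2000):
    """Partial sum of the series sum_{n>=1} -(1/n^2) e^{-n^2 x}."""
    return sum(-math.exp(-n * n * x) / (n * n) for n in range(1, terms + 1))

def G(x, terms=2000):
    """Partial sum of the termwise-differentiated series sum_{n>=1} e^{-n^2 x}."""
    return sum(math.exp(-n * n * x) for n in range(1, terms + 1))

def central_difference(func, x, h=1e-5):
    # symmetric difference quotient approximating func'(x)
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (0.1, 0.5, 2.0):
    print(x, central_difference(F, x), G(x))

# the right hand side of (6.7) grows without bound as x decreases to 0
for x in (1e-2, 1e-3, 1e-4):
    print(x, G(x))
```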

Exercises 6.3
Question 1

(a) Show that the series $\sum_{n=1}^{\infty} n^2 e^{-n^2 x}$ defines a continuous function on $(0, \infty)$.

(b) Show that the series $\sum_{n=1}^{\infty} e^{-n^2 x}$ defines a differentiable function on $(0, \infty)$, and for each $x > 0$,
$$\frac{d}{dx} \sum_{n=1}^{\infty} e^{-n^2 x} = -\sum_{n=1}^{\infty} n^2 e^{-n^2 x}.$$

Question 2
Let {fn : (a, b) → R} be a sequence of continuously differentiable
functions. Assume that there is a point x0 ∈ (a, b) such that the sequence
{fn (x0 )} converges to a point y0 , and the sequence of functions {fn′ :
(a, b) → R} converges uniformly to a function g : (a, b) → R. Use
the fundamental theorems of calculus and Theorem 6.11 to prove that
the sequence of functions {fn : (a, b) → R} converges uniformly to a
differentiable function f : (a, b) → R, and f ′ (x) = g(x) for all x ∈ (a, b).

6.4 Power Series

In this section, we turn to consider a special class of series of functions called


power series. The partial sums of a power series are polynomial functions. Hence,
power series are limits of polynomial sequences. They play important roles in
analysis.

Definition 6.7 Power series


A power series in the variable $x$ is a series of the form
$$\sum_{n=0}^{\infty} c_n (x - x_0)^n,$$
where $x_0$ is a fixed real number, and $c_0, c_1, c_2, \ldots$ are the coefficients.


Each term in a power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ is a simple polynomial $c_n (x - x_0)^n$, which is infinitely differentiable. However, as an infinite series, we need to address the convergence issue. Obviously, the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ converges when $x = x_0$.
Recall that in Chapter 5, we have discussed the ratio test in Theorem 5.28. Given that $\sum_{n=1}^{\infty} a_n$ is a series with $a_n \ne 0$ for all $n \in \mathbb{Z}^+$, let
$$r = \liminf_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| \quad \text{and} \quad R = \limsup_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right|.$$
Then the series $\sum_{n=1}^{\infty} a_n$ is divergent if $r > 1$ and convergent if $R < 1$, but the test is inconclusive if $r \le 1 \le R$. This test is useful if the limit $\lim_{n\to\infty} \left|\dfrac{a_{n+1}}{a_n}\right|$ exists. For then $r = R$, and we are left with only finitely many points at which we cannot conclude the convergence of the power series. Let us look at some examples.

Example 6.18

Find the domain of convergence of the power series $\sum_{n=0}^{\infty} \dfrac{x^n}{n!}$.

Solution
The power series is convergent when $x = 0$. When $x \ne 0$, using the ratio test with $a_n = \dfrac{x^n}{n!}$, we find that
$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{|x|}{n + 1} = 0.$$
Therefore, the series is convergent for all real numbers $x$. The domain of convergence is $\mathbb{R}$, the set of real numbers.

Example 6.19

Find the domain of convergence of the power series $\displaystyle\sum_{n=0}^{\infty} n!\,x^n$.

Solution
The power series is convergent when $x = 0$. When $x \neq 0$, using the ratio test with $a_n = n!\,x^n$, we have
\[
\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} (n+1)|x| = \infty.
\]
Hence, the series is divergent if $x \neq 0$. We conclude that the series $\displaystyle\sum_{n=0}^{\infty} n!\,x^n$ is only convergent when $x = 0$. The domain of convergence is the set $\{0\}$.

Example 6.20

Find the domain of convergence of the power series $\displaystyle\sum_{n=1}^{\infty} \frac{x^n}{n^2}$.

Solution
The power series is convergent when $x = 0$. When $x \neq 0$, using the ratio test with $a_n = \dfrac{x^n}{n^2}$, we have
\[
\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = |x| \lim_{n\to\infty} \frac{n^2}{(n+1)^2} = |x| \lim_{n\to\infty} \left(\frac{n}{n+1}\right)^2 = |x|.
\]
Therefore, the series is convergent if $|x| < 1$, and divergent if $|x| > 1$. When $|x| = 1$, the test is inconclusive.
But we know that the series $\displaystyle\sum_{n=1}^{\infty} \frac{1}{n^2}$ and the series $\displaystyle\sum_{n=1}^{\infty} \frac{(-1)^n}{n^2}$ are convergent. Therefore, the series $\displaystyle\sum_{n=1}^{\infty} \frac{x^n}{n^2}$ is convergent if and only if $|x| \leq 1$. The domain of convergence is the set $[-1, 1]$.
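
The limits used in Examples 6.18 to 6.20 can be observed numerically. The following is a minimal sketch, assuming Python with exact rational arithmetic; the helper name `coefficient_ratio` and the dictionary `examples` are illustrative choices, not notation from the text. It evaluates $|c_{n+1}/c_n|$ for a few large $n$, which suggests (but does not prove) the limits $0$, $\infty$ and $1$ found above.

```python
from fractions import Fraction
from math import factorial

def coefficient_ratio(c, n):
    """Return |c(n+1) / c(n)| exactly, as a Fraction (c is a hypothetical coefficient function)."""
    return abs(Fraction(c(n + 1)) / Fraction(c(n)))

# Coefficients c_n of the three power series in Examples 6.18, 6.19 and 6.20.
examples = {
    "x^n / n!": lambda n: Fraction(1, factorial(n)),
    "n! x^n": lambda n: Fraction(factorial(n)),
    "x^n / n^2": lambda n: Fraction(1, n * n),
}

# Ratios at n = 10, 100, 1000, converted to floats only for display.
ratios = {
    name: [float(coefficient_ratio(c, n)) for n in (10, 100, 1000)]
    for name, c in examples.items()
}

for name, values in ratios.items():
    print(name, values)
```

The ratios only decide the open interval of convergence; the end points in Example 6.20 still need the separate argument given in the solution.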

In the examples above, we apply the ratio test to determine the domain of convergence. This works fine when all the coefficients $c_n$ in the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ are nonzero, or only finitely many of them are zero. We need to find the limit inferior and limit superior of the sequence $\left\{\left|\dfrac{a_{n+1}}{a_n}\right|\right\}$, where $a_n$ is the $n$th term $c_n (x - x_0)^n$ in the power series. Since
\[
\left|\frac{a_{n+1}}{a_n}\right| = \left|\frac{c_{n+1} (x - x_0)^{n+1}}{c_n (x - x_0)^n}\right| = |x - x_0| \left|\frac{c_{n+1}}{c_n}\right|,
\]
essentially we need to find the limit inferior and limit superior of the sequence $\left\{\left|\dfrac{c_{n+1}}{c_n}\right|\right\}$, then multiply by $|x - x_0|$. If the limit of the sequence $\left\{\left|\dfrac{c_{n+1}}{c_n}\right|\right\}$ exists, the limit inferior and limit superior of this sequence are the same, and the domain of convergence can be determined up to the end points of an interval. We apply other convergence tests to check the convergence at these end points.
There are two problems with using the ratio test for determining the domain of convergence.

1. If the limit inferior and limit superior of the sequence $\left\{\left|\dfrac{c_{n+1}}{c_n}\right|\right\}$ are not the same, the ratio test is inconclusive for $x$ in an interval.

2. When infinitely many of the coefficients $c_n$ in the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ are zero, the ratio test cannot be applied. This problem can be circumvented if there is some pattern in the indices $n$ for which $c_n$ is 0. For example, if $c_{2n} = 0$ for all $n \geq 0$, the series only contains the odd terms, and it can be written as
\[
\sum_{n=1}^{\infty} c_{2n-1} (x - x_0)^{2n-1}.
\]
In this case, we can apply the ratio test with $a_n = c_{2n-1} (x - x_0)^{2n-1}$. However, the first problem might still be present.

To resolve these problems, we find that the root test (Theorem 5.27) is better from the theoretical point of view. Given a series $\sum_{n=1}^{\infty} a_n$, let
\[
\tilde{\rho} = \limsup_{n\to\infty} \sqrt[n]{|a_n|}.
\]
The root test says that the series $\sum_{n=1}^{\infty} a_n$ is convergent if $\tilde{\rho} < 1$, divergent if $\tilde{\rho} > 1$, and the test is inconclusive if $\tilde{\rho} = 1$.
Applying the root test to a power series, we have the following.

Theorem 6.15 Convergence of Power Series



Given a power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$, let
\[
\rho = \limsup_{n\to\infty} \sqrt[n]{|c_n|}.
\]

1. If $\rho = 0$, then the power series converges for all real numbers $x$.

2. If $\rho = \infty$, then the power series only converges at the point $x = x_0$.

3. If $\rho$ is a finite positive number, let $R = 1/\rho$. Then $R$ is a positive number. The power series is convergent for all $x$ satisfying $|x - x_0| < R$, and divergent for all $x$ satisfying $|x - x_0| > R$.

Proof of Theorem 6.15



For the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$, the $n$th term is $a_n = c_n (x - x_0)^n$. Hence,
\[
\tilde{\rho} = \limsup_{n\to\infty} \sqrt[n]{|a_n|} = |x - x_0| \limsup_{n\to\infty} \sqrt[n]{|c_n|} = \rho\,|x - x_0|.
\]
Now we apply the root test as stipulated in Theorem 5.27.

1. If $\rho = 0$, then $\tilde{\rho} = 0$, and so the power series converges for all real numbers $x$.

2. If $\rho = \infty$, then $\tilde{\rho} = \infty$ if $x \neq x_0$. Hence, the power series is divergent if $x \neq x_0$. Therefore, the power series only converges at the point $x = x_0$.

3. If $\rho$ is a finite positive number and $R = 1/\rho$, then when $|x - x_0| < R$, $\tilde{\rho} = |x - x_0|\rho < R\rho = 1$; when $|x - x_0| > R$, $\tilde{\rho} = |x - x_0|\rho > R\rho = 1$. Therefore, the power series is convergent when $|x - x_0| < R$, and divergent when $|x - x_0| > R$.

Corollary 6.16

Given a power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ such that $c_n \neq 0$ for all $n$, assume that the limit
\[
\rho = \lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right|
\]
exists in the extended sense.

1. If $\rho = 0$, then the power series converges for all real numbers $x$.

2. If $\rho = \infty$, then the power series only converges at the point $x = x_0$.

3. If $\rho$ is a finite positive number, let $R = 1/\rho$. Then $R$ is a positive number. The power series is convergent for all $x$ satisfying $|x - x_0| < R$, and divergent for all $x$ satisfying $|x - x_0| > R$.

Proof
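By Theorem 5.26, the existence of the limit $\displaystyle\lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right|$ implies that
\[
\limsup_{n\to\infty} \sqrt[n]{|c_n|} = \lim_{n\to\infty} \sqrt[n]{|c_n|} = \lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right|.
\]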

The rest follows from Theorem 6.15.

Domain of Convergence
Theorem 6.15 shows that the domain of convergence of a power series
centered at x0 can only be one of the following cases:

1. $\mathbb{R}$
2. $\{x_0\}$
3. $(x_0 - R, x_0 + R)$
4. $[x_0 - R, x_0 + R)$
5. $(x_0 - R, x_0 + R]$
6. $[x_0 - R, x_0 + R]$

Here R is a positive number.

Definition 6.8 Radius of Convergence



Given a power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$, let
\[
\rho = \limsup_{n\to\infty} \sqrt[n]{|c_n|}
\]
as an extended real number. Then $\rho \geq 0$. Let $R = 1/\rho$ in the extended sense. Namely, $R = \infty$ if $\rho = 0$, and $R = 0$ if $\rho = \infty$. This number $R$ is called the radius of convergence of the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$. The power series is convergent when $|x - x_0| < R$, and divergent when $|x - x_0| > R$.

Let us look at the following example.



Example 6.21

Let $\{c_n\}$ be the sequence defined by
\[
c_n = \begin{cases} n, & \text{if $n$ is even}, \\ 1, & \text{if $n$ is odd}. \end{cases}
\]
Find the domain of convergence of the power series $\displaystyle\sum_{n=1}^{\infty} c_n x^n$.

Solution
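Notice that
\[
\left|\frac{c_{n+1}}{c_n}\right| = \begin{cases} n + 1, & \text{if $n$ is odd}, \\[4pt] \dfrac{1}{n}, & \text{if $n$ is even}. \end{cases}
\]
Applying the ratio test with $a_n = c_n x^n$, we find that if $x \neq 0$,
\[
\liminf_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = |x| \liminf_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| = 0,
\]
\[
\limsup_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = |x| \limsup_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| = \infty.
\]
This shows that the ratio test is inconclusive for any $x$ except $x = 0$.
Let us turn to the root test. By (5.4), we have
\[
\lim_{n\to\infty} \sqrt[n]{n} = 1.
\]
Since $1 \leq \sqrt[n]{|c_n|} \leq \sqrt[n]{n}$ for all $n \in \mathbb{Z}^+$, this implies that
\[
\lim_{n\to\infty} \sqrt[n]{|c_n|} = 1.
\]
Therefore, the power series $\sum_{n=1}^{\infty} c_n x^n$ is convergent when $|x| < 1$, and divergent when $|x| > 1$. When $x = 1$ or $-1$, we have the series $\sum_{n=1}^{\infty} (\pm 1)^n c_n$. Since the sequence $\{c_n\}$ does not converge to 0, we conclude that the power series is divergent when $x = 1$ or $x = -1$. Hence, the domain of convergence of the power series is $(-1, 1)$.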
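
To see the contrast between the two tests in Example 6.21 concretely, here is a small sketch, assuming Python floating point arithmetic; the function names `c`, `ratio` and `nth_root` are illustrative. It tabulates $c_{n+1}/c_n$ and $\sqrt[n]{c_n}$ for four consecutive large indices: the ratios jump between very small and very large values, while the $n$th roots stay close to 1.

```python
def c(n):
    """Coefficients of Example 6.21: c_n = n for even n, and c_n = 1 for odd n."""
    return n if n % 2 == 0 else 1

def ratio(n):
    """The quotient c_{n+1} / c_n used by the ratio test."""
    return c(n + 1) / c(n)

def nth_root(n):
    """The n-th root of c_n used by the root test."""
    return c(n) ** (1.0 / n)

# Four consecutive indices, so that both parities of n are visible.
indices = range(1000, 1004)
ratios = [ratio(n) for n in indices]
roots = [nth_root(n) for n in indices]

for n, q, s in zip(indices, ratios, roots):
    print(n, q, s)
```

This is only numerical evidence at a handful of indices; the actual limits are the ones computed in the solution.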

This example shows that applying the ratio test naively can lead to an inconclusive result, while the root test resolves the problem. In practice, we usually want to avoid applying the root test because it is difficult to find the limit superior of the sequence $\{\sqrt[n]{|c_n|}\}$ when the coefficients $c_n$ are complicated. In the example above, we can avoid using the root test by writing the power series as a sum of two power series, and applying the ratio test to the two power series separately. In any case, the root test gives a theoretically decisive conclusion about the possible types of domain of convergence for a power series.
For a power series whose radius of convergence $R$ is 0, it only converges at a single point $x = x_0$. So there is no point in considering such power series. If the radius of convergence $R$ of a power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ is positive, the power series defines a function on the open interval $(x_0 - R, x_0 + R)$. We want to study the continuity, differentiability and integrability of such a function. Therefore, we need to determine whether the power series converges uniformly.
Unfortunately, in general, a power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ does not converge uniformly on the interval $(x_0 - R, x_0 + R)$. For example, consider the series $s(x) = \sum_{n=0}^{\infty} x^n$. In Example 6.7, we have seen that $\sum_{n=0}^{\infty} x^n$ is convergent when $|x| < 1$, and divergent when $|x| > 1$. Hence, its radius of convergence is $R = 1$. When $|x| < 1$, the power series $\sum_{n=0}^{\infty} x^n$ defines the function
\[
s(x) = \sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}.
\]
The $n$th partial sum of the series is
\[
s_n(x) = 1 + x + \cdots + x^n = \frac{1 - x^{n+1}}{1 - x}.
\]
Therefore, when $x \in (-1, 1)$,
\[
s(x) - s_n(x) = \frac{1}{1 - x} - \frac{1 - x^{n+1}}{1 - x} = \frac{x^{n+1}}{1 - x}.
\]
Since
\[
\lim_{x \to 1^-} \frac{x^{n+1}}{1 - x} = \infty,
\]
we find that
\[
\sup_{|x| < 1} |s(x) - s_n(x)| = \infty.
\]
Hence, the series $\sum_{n=0}^{\infty} x^n$ does not converge uniformly on $(-1, 1)$. However, if $a$ is a number such that $0 < a < 1$, then for $|x| \leq a$,
\[
\left|\frac{x^{n+1}}{1 - x}\right| \leq \frac{a^{n+1}}{1 - a}.
\]
Therefore,
\[
\sup_{|x| \leq a} |s(x) - s_n(x)| \leq \frac{a^{n+1}}{1 - a}.
\]
This implies that
\[
\lim_{n\to\infty} \sup_{|x| \leq a} |s(x) - s_n(x)| = 0.
\]
Hence, the series $\sum_{n=0}^{\infty} x^n$ converges uniformly on $[-a, a]$.
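
The uniform bound on $[-a, a]$ can be checked numerically. The sketch below assumes Python floating point arithmetic and approximates the supremum by sampling a uniform grid; the names `partial_sum` and `sup_error`, the choice $a = 0.9$, and the grid size are all illustrative assumptions. It compares the sampled maximum of $|s(x) - s_n(x)|$ with the bound $a^{n+1}/(1 - a)$.

```python
def partial_sum(x, n):
    """s_n(x) = 1 + x + ... + x^n for the geometric series."""
    return sum(x ** k for k in range(n + 1))

def sup_error(a, n, samples=2001):
    """Approximate the supremum of |s(x) - s_n(x)| over [-a, a] on a uniform grid."""
    worst = 0.0
    for i in range(samples):
        x = -a + 2 * a * i / (samples - 1)
        worst = max(worst, abs(1 / (1 - x) - partial_sum(x, n)))
    return worst

a = 0.9  # any fixed number with 0 < a < 1
for n in (10, 50, 100):
    bound = a ** (n + 1) / (1 - a)
    print(n, sup_error(a, n), bound)
```

A grid maximum only approximates a supremum from below, so this illustrates the estimate rather than proving it.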
A general power series also has similar behavior.

Theorem 6.17 Absolute and Uniform Convergence of a Power Series



Let $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ be a power series whose radius of convergence $R$ is positive. If $R_1$ is any number satisfying $0 < R_1 < R$, then the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ converges absolutely and uniformly on the set $D_1 = \{x \mid |x - x_0| \leq R_1\}$.

Proof

Let $R_2 = \dfrac{R + R_1}{2}$ (if $R = \infty$, take $R_2 = R_1 + 1$ instead). Then $R_1 < R_2 < R$. Hence, the series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ is convergent when $|x - x_0| = R_2$. Let $x_2 = x_0 + R_2$. Then $\sum_{n=0}^{\infty} c_n (x_2 - x_0)^n = \sum_{n=0}^{\infty} c_n R_2^n$ is convergent. This implies that $\displaystyle\lim_{n\to\infty} c_n R_2^n = 0$.
In particular, the sequence $\{c_n R_2^n\}$ is bounded. Let $M$ be a positive number such that
\[
|c_n R_2^n| \leq M \quad \text{for all } n \geq 0.
\]
We apply the Weierstrass $M$-test (Theorem 6.8) with $f_n(x) = c_n (x - x_0)^n$. We find that
\[
|c_n (x - x_0)^n| \leq |c_n| R_1^n \leq M \left(\frac{R_1}{R_2}\right)^n = M r^n \quad \text{when } |x - x_0| \leq R_1.
\]
Here $r = R_1/R_2$. Since $0 < r < 1$, the geometric series $\sum_{n=0}^{\infty} M r^n$ is convergent. By the Weierstrass $M$-test, the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ converges absolutely and uniformly on the set $D_1 = \{x \mid |x - x_0| \leq R_1\}$.

Remark 6.2 Radius of Convergence Revisited


In the proof of Theorem 6.17, essentially we show that if the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ is convergent when $x = x_2$, then it is convergent for all $x$ in the interval $(x_0 - R_2, x_0 + R_2)$, where $R_2 = |x_2 - x_0|$. The contrapositive of this statement says that if the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ is divergent when $x = x_3$, then it is divergent for all $x$ satisfying $|x - x_0| > R_3$, where $R_3 = |x_3 - x_0|$. Hence, if $S$ is the set
\[
S = \left\{ |x_1 - x_0| \;\middle|\; \sum_{n=0}^{\infty} c_n (x - x_0)^n \text{ is convergent when } x = x_1 \right\},
\]
then $S$ contains only nonnegative numbers. Obviously, 0 is in $S$. If $R_1$ is in $S$, any positive number $r$ that is less than $R_1$ is also in $S$. This implies that if $R = \sup S$, then $[0, R) \subset S$ and $(R, \infty)$ is disjoint from $S$. This provides an alternative way to define the radius of convergence of the power series without using the root test. Namely, the radius of convergence $R$ is defined as the supremum of the set $S$.

From Theorem 6.17, we obtain the following.

Theorem 6.18 Continuity of a Power Series



Let $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ be a power series whose radius of convergence $R$ is positive. It defines a function
\[
f(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n
\]
that is continuous on the set $D = \{x \mid |x - x_0| < R\}$.

Proof
Given any $x_1 \in D = \{x \mid |x - x_0| < R\}$, choose $R_1$ such that $|x_1 - x_0| < R_1 < R$. Theorem 6.17 says that the power series converges uniformly on the set $D_1 = \{x \mid |x - x_0| \leq R_1\}$, which contains an open interval around the point $x_1$.
For $n \geq 0$, the function $f_n(x) = c_n (x - x_0)^n$ is continuous. By Corollary 6.10, the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ is continuous at $x = x_1$.

The next theorem concerns term by term integration of a power series.

Theorem 6.19 Term by Term Integration of a Power Series



Let $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ be a power series whose radius of convergence $R$ is positive. If $[a, b]$ is a closed interval that is contained in the interval $(x_0 - R, x_0 + R)$, then the function
\[
f(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n
\]
is Riemann integrable on $[a, b]$, and we can integrate term by term. Namely,
\[
\int_a^b f(x)\,dx = \int_a^b \sum_{n=0}^{\infty} c_n (x - x_0)^n\,dx = \sum_{n=0}^{\infty} \int_a^b c_n (x - x_0)^n\,dx. \tag{6.8}
\]

Proof
Let $R_1 = \max\{|a - x_0|, |b - x_0|\}$. Then $0 < R_1 < R$ and $[a, b]$ is contained in $[x_0 - R_1, x_0 + R_1]$. By Theorem 6.17, the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ converges uniformly on $[x_0 - R_1, x_0 + R_1]$, and hence on $[a, b]$. For any $n \geq 0$, the function $f_n(x) = c_n (x - x_0)^n$ is Riemann integrable. By Corollary 6.12, the function $f(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n$ is Riemann integrable on $[a, b]$ and (6.8) holds.

Before we discuss term by term differentiation, we need to prove the uniform convergence of the derivative series. We will first prove the following lemma.

Lemma 6.20
Given that $\{a_n\}$ is a sequence of nonnegative numbers,
\[
\limsup_{n\to\infty} \sqrt[n]{n a_n} = \limsup_{n\to\infty} \sqrt[n]{a_n}.
\]

Proof
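For all $n \in \mathbb{Z}^+$, $\sqrt[n]{n} \geq 1$. By (5.4), we have $\displaystyle\lim_{n\to\infty} \sqrt[n]{n} = 1$. Hence, given $\varepsilon > 0$, there is a positive integer $N$ such that for all $n \geq N$,
\[
1 \leq \sqrt[n]{n} < 1 + \varepsilon.
\]
Therefore, for all $n \geq N$,
\[
\sqrt[n]{a_n} \leq \sqrt[n]{n a_n} \leq (1 + \varepsilon) \sqrt[n]{a_n}.
\]
This implies that
\[
\limsup_{n\to\infty} \sqrt[n]{a_n} \leq \limsup_{n\to\infty} \sqrt[n]{n a_n} \leq (1 + \varepsilon) \limsup_{n\to\infty} \sqrt[n]{a_n}.
\]
Since $\varepsilon$ can be any positive number, we conclude that
\[
\limsup_{n\to\infty} \sqrt[n]{n a_n} = \limsup_{n\to\infty} \sqrt[n]{a_n}.
\]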

Theorem 6.21

Let $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ be a power series with a positive radius of convergence $R$. Then the radius of convergence of the derived series $\sum_{n=1}^{\infty} n c_n (x - x_0)^{n-1}$ is also $R$.

Proof

Let $R'$ be the radius of convergence of the derived series $\sum_{n=1}^{\infty} n c_n (x - x_0)^{n-1}$. It is not difficult to see that it is the same as the radius of convergence of the series $\sum_{n=1}^{\infty} n c_n (x - x_0)^n$. By Lemma 6.20,
\[
\frac{1}{R'} = \limsup_{n\to\infty} \sqrt[n]{|n c_n|} = \limsup_{n\to\infty} \sqrt[n]{|c_n|} = \frac{1}{R}.
\]
This proves that $R' = R$.

Notice that if $k \in \mathbb{Z}^+$,
\[
\frac{d^k}{dx^k} (x - x_0)^n = n(n - 1) \cdots (n - k + 1)(x - x_0)^{n-k}.
\]
By induction, we can deduce the following.

Corollary 6.22

Let $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ be a power series whose radius of convergence $R$ is positive. For any $k \in \mathbb{Z}^+$, the series
\[
\sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1)\, c_n (x - x_0)^{n-k}
\]
has radius of convergence $R$.

Proof

The $k = 1$ case, which says that the series $\sum_{n=1}^{\infty} n c_n (x - x_0)^{n-1}$ has radius of convergence $R$, is given by Theorem 6.21. Applying Theorem 6.21 to the series $\sum_{n=1}^{\infty} n c_n (x - x_0)^{n-1}$, we find that the series $\sum_{n=2}^{\infty} n(n - 1) c_n (x - x_0)^{n-2}$ also has radius of convergence $R$. This is the statement we need to prove for the $k = 2$ case. For general $k \in \mathbb{Z}^+$, we proceed by induction.

The next theorem says that we can differentiate a power series term by term.

Theorem 6.23 Term by Term Differentiation of a Power Series



Let $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ be a power series whose radius of convergence $R$ is positive. Then the function
\[
f(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n
\]
is differentiable on $(x_0 - R, x_0 + R)$. When $x \in (x_0 - R, x_0 + R)$, we can differentiate the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ term by term to obtain
\[
f'(x) = \frac{d}{dx} \sum_{n=0}^{\infty} c_n (x - x_0)^n = \sum_{n=1}^{\infty} n c_n (x - x_0)^{n-1}. \tag{6.9}
\]

Proof
By Theorem 6.21, the radius of convergence of the derived series $\sum_{n=1}^{\infty} n c_n (x - x_0)^{n-1}$ is also $R$. By Theorem 6.17, the series $\sum_{n=1}^{\infty} n c_n (x - x_0)^{n-1}$ converges absolutely and uniformly on $[x_0 - R_1, x_0 + R_1]$ if $R_1 < R$.
Given $x_1 \in (x_0 - R, x_0 + R)$, $|x_1 - x_0| < R$. Choose $R_1$ such that $|x_1 - x_0| < R_1 < R$. Then $x_1 \in (x_0 - R_1, x_0 + R_1)$. Corollary 6.14 implies that the function
\[
f(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n
\]
is differentiable on $(x_0 - R_1, x_0 + R_1)$, and we can perform term by term differentiation to obtain (6.9). Since $x_1$ is any point in $(x_0 - R, x_0 + R)$, this proves the statement of the theorem.

By induction, we have the following.

Corollary 6.24

Let $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ be a power series with a positive radius of convergence $R$. Then the function
\[
f(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n
\]
is infinitely differentiable on $(x_0 - R, x_0 + R)$. For any $k \geq 1$ and $x \in (x_0 - R, x_0 + R)$,
\[
f^{(k)}(x) = \sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1)\, c_n (x - x_0)^{n-k}. \tag{6.10}
\]
In particular,
\[
f^{(k)}(x_0) = k!\, c_k.
\]

Let us summarize what we have learned about power series.

Functions Defined by Power Series



A power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ is convergent when $x = x_0$. If the series is convergent for some $x_1 \neq x_0$, then it has a positive radius of convergence $R$. The series is convergent for all $x$ satisfying $|x - x_0| < R$, and divergent for all $x$ satisfying $|x - x_0| > R$.
The power series defines a function
\[
f(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n
\]
on the interval $(x_0 - R, x_0 + R)$. This function $f(x)$ is infinitely differentiable, and we can perform term by term differentiation and term by term integration.
Functions that are representable by power series are called analytic functions. Their domains can be naturally extended to complex numbers. This is the main topic that is discussed in a course in complex analysis.

Definition 6.9 Power Series Expansion of a Function



If a power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ has positive radius of convergence $R$, it defines an analytic function $f : (x_0 - R, x_0 + R) \to \mathbb{R}$ by
\[
f(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n.
\]
We say that $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ is a power series expansion or power series representation of the function $f(x)$ on the interval $(x_0 - R, x_0 + R)$.

Example 6.22
When $|x| < 1$, the function $f(x) = \dfrac{1}{1 - x}$ has a power series expansion given by
\[
\frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + \cdots + x^n + \cdots. \tag{6.11}
\]

Applying term by term differentiation to (6.11), we obtain the following.



Theorem 6.25
Let $k$ be a nonnegative integer. Then for $|x| < 1$,
\[
\frac{1}{(1 - x)^{k+1}} = \sum_{n=k}^{\infty} \binom{n}{k} x^{n-k}. \tag{6.12}
\]
Here $\displaystyle\binom{n}{k} = \frac{n(n - 1) \cdots (n - k + 1)}{k!}$ are the binomial coefficients.

Proof
The $k = 0$ case is just the formula (6.11). By Corollary 6.24, we can differentiate (6.11) term by term $k$ times, and (6.10) gives
\[
\frac{k!}{(1 - x)^{k+1}} = \sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1)\, x^{n-k} \quad \text{when } |x| < 1.
\]
Dividing by $k!$ on both sides gives (6.12).
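
Formula (6.12) is easy to sanity check numerically. The sketch below assumes Python with `math.comb`; the names `lhs` and `rhs` and the truncation length of 400 terms are illustrative choices. It compares the closed form $1/(1-x)^{k+1}$ with a truncated version of the series for a few sample values of $x$ and $k$.

```python
from math import comb

def lhs(x, k):
    """Closed form 1 / (1 - x)^(k + 1) from formula (6.12)."""
    return 1.0 / (1.0 - x) ** (k + 1)

def rhs(x, k, terms=400):
    """Sum of C(n, k) x^(n - k) over n = k, ..., k + terms - 1 (a truncated series)."""
    return sum(comb(n, k) * x ** (n - k) for n in range(k, k + terms))

# Compare both sides at a few points inside the interval of convergence (-1, 1).
checks = [(x, k, lhs(x, k), rhs(x, k)) for x in (-0.5, 0.3, 0.7) for k in (0, 1, 2, 3)]

for x, k, left, right in checks:
    print(x, k, left, right)
```

The agreement is only up to truncation and rounding error, and it degrades as $|x|$ approaches 1, where more terms would be needed.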

The formula (6.12) is very useful. It has applications in probability theory.

Example 6.23
In probability theory, a geometric random variable is a random variable $X$ that depends on a parameter $p$ where $0 < p < 1$. If one performs a sequence of identical and independent Bernoulli trials, each with probability $p$ of success, then $X$ is the number of Bernoulli trials that need to be performed until the first success occurs. For any $n \in \mathbb{Z}^+$, the probability that $X$ is equal to $n$ is
\[
P(X = n) = (1 - p)^{n-1} p.
\]
The expected number of Bernoulli trials that need to be performed until the first success is
\[
E(X) = \sum_{n=1}^{\infty} n P(X = n) = p \sum_{n=1}^{\infty} n (1 - p)^{n-1}.
\]
Using (6.12) with $k = 1$ and $x = 1 - p$, we find that
\[
E(X) = \frac{p}{(1 - (1 - p))^2} = \frac{1}{p}.
\]
The variance of $X$ is given by $\operatorname{Var}(X) = E(X^2) - E(X)^2$. To find this, we compute $E(X^2)$ first.
\[
E(X^2) = \sum_{n=1}^{\infty} n^2 P(X = n) = p \sum_{n=1}^{\infty} n^2 (1 - p)^{n-1}.
\]
Using (6.12) with $k = 2$ and $x = 1 - p$, we find that
\[
\sum_{n=2}^{\infty} n(n - 1)(1 - p)^{n-2} = \frac{2}{(1 - (1 - p))^3} = \frac{2}{p^3}.
\]
Thus,
\[
\sum_{n=1}^{\infty} n^2 (1 - p)^{n-1} = \sum_{n=1}^{\infty} n(n - 1)(1 - p)^{n-1} + \sum_{n=1}^{\infty} n (1 - p)^{n-1} = \frac{2(1 - p)}{p^3} + \frac{1}{p^2} = \frac{2 - p}{p^3}.
\]
Hence $E(X^2) = \dfrac{2 - p}{p^2}$. Therefore, the variance of $X$ is
\[
\operatorname{Var}(X) = E(X^2) - E(X)^2 = \frac{2 - p}{p^2} - \frac{1}{p^2} = \frac{1 - p}{p^2}.
\]
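
The closed forms for the mean and variance of a geometric random variable can be compared with direct truncated sums of the defining series. This is a minimal sketch, assuming Python floating point arithmetic; the function name `geometric_moments` and the cutoff of 5000 terms are illustrative assumptions.

```python
def geometric_moments(p, terms=5000):
    """Approximate E(X) and Var(X) by truncating the series over n = 1, ..., terms."""
    q = 1.0 - p
    mean = sum(n * q ** (n - 1) * p for n in range(1, terms + 1))
    second_moment = sum(n * n * q ** (n - 1) * p for n in range(1, terms + 1))
    return mean, second_moment - mean ** 2

for p in (0.2, 0.5, 0.9):
    mean, variance = geometric_moments(p)
    # Closed forms from the example: E(X) = 1/p and Var(X) = (1 - p)/p^2.
    print(p, mean, 1 / p, variance, (1 - p) / p ** 2)
```

For $p$ very close to 0 the tail of the series decays slowly, so the fixed cutoff used here would no longer be adequate.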

Recall that the logarithm function $f(x) = \ln x$ is defined so that $f'(x) = \dfrac{1}{x}$. This gives
\[
\frac{d}{dx} \ln(1 + x) = \frac{1}{1 + x}.
\]
Using term by term integration, we can obtain a power series representation for the logarithm function.

Theorem 6.26 Power Series Expansion of Logarithm Function


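For $|x| < 1$, the power series $\displaystyle\sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n}$ is convergent and
\[
\ln(1 + x) = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n} = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^{n-1} \frac{x^n}{n} + \cdots.
\]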

Proof
Given any $x_1$ with $|x_1| < 1$, let $R_1 = |x_1|$. Since the geometric series $\sum_{n=0}^{\infty} x^n$ has radius of convergence 1, Theorem 6.19 says that we can integrate (6.11) term by term over the interval with 0 and $x_1$ as end points.
\[
\int_0^{x_1} \frac{1}{1 - x}\,dx = \int_0^{x_1} \sum_{n=0}^{\infty} x^n\,dx = \sum_{n=0}^{\infty} \int_0^{x_1} x^n\,dx.
\]
This gives
\[
-\ln(1 - x_1) = \sum_{n=0}^{\infty} \frac{x_1^{n+1}}{n + 1} = \sum_{n=1}^{\infty} \frac{x_1^n}{n}.
\]
Replacing $x_1$ by $-x$, we find that if $|x| < 1$, then
\[
\ln(1 + x) = -\sum_{n=1}^{\infty} (-1)^n \frac{x^n}{n} = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n}. \tag{6.13}
\]

The theory that we have developed so far does not allow us to take the limit $x \to 1^-$ term by term on the right hand side of (6.13). However, we can get around this problem in another way.

Example 6.24
Show that
\[
1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = \ln 2. \tag{6.14}
\]

Solution
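Notice that if $x \neq -1$, then for any $n \in \mathbb{Z}^+$,
\[
1 - x + x^2 - x^3 + \cdots + (-1)^{n-1} x^{n-1} = \frac{1 - (-x)^n}{1 + x} = \frac{1}{1 + x} + (-1)^{n-1} \frac{x^n}{1 + x}.
\]
Each of the functions is continuous on $[0, 1]$. Therefore,
\[
\int_0^1 \left( 1 - x + x^2 - x^3 + \cdots + (-1)^{n-1} x^{n-1} \right) dx = \int_0^1 \frac{1}{1 + x}\,dx + (-1)^{n-1} \int_0^1 \frac{x^n}{1 + x}\,dx.
\]
This gives
\[
\sum_{k=1}^{n} \frac{(-1)^{k-1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + (-1)^{n-1} \frac{1}{n} = \ln 2 + R_n,
\]
where
\[
R_n = (-1)^{n-1} \int_0^1 \frac{x^n}{1 + x}\,dx.
\]
When $0 \leq x \leq 1$, $1 + x \geq 1$, and so
\[
\frac{x^n}{1 + x} \leq x^n \quad \text{when } 0 \leq x \leq 1.
\]
Therefore,
\[
|R_n| \leq \int_0^1 \frac{x^n}{1 + x}\,dx \leq \int_0^1 x^n\,dx = \frac{1}{n + 1}.
\]
This implies that $\displaystyle\lim_{n\to\infty} R_n = 0$. Therefore,
\[
1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots = \lim_{n\to\infty} \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k} = \ln 2.
\]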
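
The remainder estimate $|R_n| \leq 1/(n+1)$ from the solution can be observed on actual partial sums. The following is a small sketch, assuming Python floating point arithmetic; the function name `alternating_partial_sum` is an illustrative choice. It prints the distance from each partial sum to $\ln 2$ next to the bound.

```python
from math import log

def alternating_partial_sum(n):
    """The partial sum 1 - 1/2 + 1/3 - ... + (-1)^(n-1) / n."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    error = abs(alternating_partial_sum(n) - log(2))
    bound = 1 / (n + 1)
    print(n, error, bound)
```

The slow decay of the error reflects that $x = 1$ is an end point of the interval of convergence, so this series is a poor way to compute $\ln 2$ in practice.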

Next, we give the power series that represents the exponential function.

Theorem 6.27 Power Series Expansion of Exponential Function


For any real number $x$,
\[
e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots.
\]

Proof

We have shown in Example 6.18 that the power series $\displaystyle\sum_{n=0}^{\infty} \frac{x^n}{n!}$ is convergent for all real numbers $x$. Corollary 6.24 says that it defines an infinitely differentiable function $f : \mathbb{R} \to \mathbb{R}$ by
\[
f(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots.
\]
From this, we have $f(0) = 1$. Term by term differentiation gives
\[
f'(x) = \sum_{n=1}^{\infty} \frac{n x^{n-1}}{n!} = \sum_{n=1}^{\infty} \frac{x^{n-1}}{(n - 1)!} = \sum_{n=0}^{\infty} \frac{x^n}{n!} = f(x).
\]
Let $g : \mathbb{R} \to \mathbb{R}$ be the function defined by
\[
g(x) = e^{-x} f(x).
\]
Then we find that
\[
g'(x) = e^{-x} f'(x) - e^{-x} f(x) = 0.
\]
This shows that there is a constant $C$ such that
\[
g(x) = C \quad \text{for all } x \in \mathbb{R}.
\]
Setting $x = 0$, we find that $C = g(0) = e^0 f(0) = 1$. Hence, $e^{-x} f(x) = 1$ for all real numbers $x$, which implies that $f(x) = e^x$ for all real numbers $x$.
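
As an informal check of Theorem 6.27, the partial sums of the series can be compared with a library value of the exponential function. The sketch below assumes Python floating point arithmetic; the function name `exp_partial_sum` and the sample points are illustrative. Each term is obtained from the previous one by multiplying by $x/k$, which avoids computing large factorials.

```python
from math import exp

def exp_partial_sum(x, n):
    """The partial sum 1 + x/1! + x^2/2! + ... + x^n/n!."""
    term = 1.0   # current term x^k / k!, starting with k = 0
    total = 1.0
    for k in range(1, n + 1):
        term *= x / k
        total += term
    return total

for x, n in ((1.0, 20), (-3.0, 40), (5.0, 60)):
    print(x, n, exp_partial_sum(x, n), exp(x))
```

The number of terms needed grows with $|x|$, even though the series converges for every real $x$.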

Now we return to an existence problem from Chapter 3. In Theorem 3.32, we claimed that there is a twice differentiable function $f(x)$ that satisfies the equation
\[
f''(x) + f(x) = 0
\]
and the initial conditions
\[
f(0) = 0, \quad f'(0) = 1.
\]
We define this function as $\sin x$. We can now prove the existence. This is actually the power series method for solving differential equations. Assume that $f(x)$ can be written as a power series
\[
f(x) = \sum_{n=0}^{\infty} c_n x^n.
\]
Then $f(0) = 0$ and $f'(0) = 1$ imply that $c_0 = 0$ and $c_1 = 1$. Differentiating twice, we have
\[
f''(x) = \sum_{n=2}^{\infty} n(n - 1) c_n x^{n-2} = \sum_{n=0}^{\infty} (n + 2)(n + 1) c_{n+2} x^n.
\]
Substituting into the equation $f''(x) + f(x) = 0$, we find that
\[
\sum_{n=0}^{\infty} \left[ (n + 2)(n + 1) c_{n+2} + c_n \right] x^n = 0.
\]
Hence, we find that if $\{c_n\}$ is defined recursively by $c_0 = 0$, $c_1 = 1$, and for all $n \geq 0$,
\[
c_{n+2} = -\frac{c_n}{(n + 1)(n + 2)},
\]
we get a candidate solution for our problem. The recursive formula for $\{c_n\}$ can be easily solved to give
\[
c_{2n-1} = \frac{(-1)^{n-1}}{(2n - 1)!}, \quad c_{2n} = 0 \quad \text{for all } n \in \mathbb{Z}^+.
\]
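
The recursion for $\{c_n\}$ can be run directly to confirm the closed form for the first few coefficients. This is a minimal sketch, assuming Python with exact rational arithmetic; the function name `sine_coefficients` is an illustrative choice.

```python
from fractions import Fraction
from math import factorial

def sine_coefficients(count):
    """Return [c_0, ..., c_{count-1}] from c_0 = 0, c_1 = 1, c_{n+2} = -c_n / ((n+1)(n+2))."""
    c = [Fraction(0), Fraction(1)]
    for n in range(count - 2):
        c.append(-c[n] / ((n + 1) * (n + 2)))
    return c[:count]

coeffs = sine_coefficients(12)

# Compare with the closed form c_{2n-1} = (-1)^(n-1) / (2n-1)! and c_{2n} = 0.
for n in range(1, 7):
    print(2 * n - 1, coeffs[2 * n - 1], Fraction((-1) ** (n - 1), factorial(2 * n - 1)))
```

Checking finitely many coefficients is only a consistency check; the general formula follows from the recursion by induction.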

It remains to justify that this is indeed a solution to our problem.

Theorem 6.28
The power series
\[
\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{(2n - 1)!} x^{2n-1} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots
\]
defines an infinitely differentiable function $f : \mathbb{R} \to \mathbb{R}$ that satisfies
\[
f''(x) + f(x) = 0, \quad f(0) = 0, \quad f'(0) = 1.
\]

Proof
First, we need to show that the power series is convergent everywhere. We can use the ratio test with $a_n = \dfrac{(-1)^{n-1}}{(2n - 1)!} x^{2n-1}$. Then if $x \neq 0$,
\[
\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = x^2 \lim_{n\to\infty} \frac{1}{2n(2n + 1)} = 0.
\]
This shows that the power series is convergent for all $x \in \mathbb{R}$. By Corollary 6.24, it defines an infinitely differentiable function $f : \mathbb{R} \to \mathbb{R}$,
\[
f(x) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{(2n - 1)!} x^{2n-1} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots.
\]
From here, it is straightforward to find that $f(0) = 0$. We can differentiate term by term to obtain
\[
f'(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots.
\]
This gives $f'(0) = 1$. Differentiating term by term again, we have
\[
\begin{aligned}
f''(x) &= \sum_{n=2}^{\infty} (2n - 1)(2n - 2) \frac{(-1)^{n-1}}{(2n - 1)!} x^{2n-3} \\
&= \sum_{n=2}^{\infty} \frac{(-1)^{n-1}}{(2n - 3)!} x^{2n-3} = -\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{(2n - 1)!} x^{2n-1} = -f(x).
\end{aligned}
\]
This proves that $f''(x) + f(x) = 0$, and thus the proof is completed.

As a byproduct, we obtain the power series expansion for the functions sin x
and cos x.

Theorem 6.29 Power Series Expansion of Sine and Cosine Functions


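For any real number $x$,
\[
\sin x = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{(2n - 1)!} x^{2n-1} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots,
\]
\[
\cos x = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!} x^{2n} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots.
\]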

The power series for $\cos x$ is obtained by differentiating the power series for $\sin x$ term by term.
Finally, we want to consider the multiplication of two power series. Given that $p(x)$ and $q(x)$ are polynomials of degree $k$ and $l$ respectively, with
\[
p(x) = a_0 + a_1 x + \cdots + a_k x^k \quad \text{and} \quad q(x) = b_0 + b_1 x + \cdots + b_l x^l,
\]
the product $p(x)q(x)$ is a polynomial of degree $k + l$, with
\[
\begin{aligned}
p(x)q(x) &= c_0 + c_1 x + \cdots + c_{k+l} x^{k+l} \\
&= (a_0 b_0) + (a_0 b_1 + a_1 b_0) x + \cdots + a_k b_l x^{k+l}.
\end{aligned}
\]
For $0 \leq n \leq \min\{k, l\}$, we find that
\[
c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_{n-1} b_1 + a_n b_0 = \sum_{m=0}^{n} a_m b_{n-m}.
\]
This motivates the following.

Definition 6.10 Cauchy Product of Two Series



Given the two infinite series $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$, their Cauchy product is the infinite series $\sum_{n=0}^{\infty} c_n$, where
\[
c_n = \sum_{m=0}^{n} a_m b_{n-m}.
\]

The following theorem is a special case of Mertens' theorem on Cauchy products.

Theorem 6.30 Term by Term Multiplication of Power Series



Let $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ and $\sum_{n=0}^{\infty} b_n (x - x_0)^n$ be two power series with positive radii of convergence $R_a$ and $R_b$ respectively. Define the sequence $\{c_n\}_{n=0}^{\infty}$ by
\[
c_n = \sum_{m=0}^{n} a_m b_{n-m}.
\]
Then the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ has radius of convergence $R \geq R_c$, where $R_c = \min\{R_a, R_b\}$. If
\[
f(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n, \quad g(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n, \quad h(x) = \sum_{n=0}^{\infty} b_n (x - x_0)^n
\]
are the functions defined by each of the power series on $(x_0 - R_c, x_0 + R_c)$, then we have
\[
f(x) = g(x) h(x).
\]

Proof
Without loss of generality, assume that $x_0 = 0$.
It is sufficient to prove that for any $x_1$ satisfying $0 < R_1 = |x_1| < R_c$, the series $\sum_{n=0}^{\infty} c_n x_1^n$ is convergent, and it converges to $AB$, where
\[
A = \sum_{n=0}^{\infty} a_n x_1^n, \quad B = \sum_{n=0}^{\infty} b_n x_1^n.
\]
This would imply that the series $\sum_{n=0}^{\infty} c_n x^n$ is convergent on $(-R_c, R_c)$, which proves that its radius of convergence $R$ is at least $R_c$.
Take an $R_2$ such that $R_1 < R_2 < R_c$. Then $R_2 < R_a$ and $R_2 < R_b$. Using the same reasoning as in the proof of Theorem 6.17, we find that there is a positive constant $M$ such that
\[
|a_n x_1^n| \leq M r^n \quad \text{and} \quad |b_n x_1^n| \leq M r^n \quad \text{for all } n \geq 0,
\]
where $r = R_1/R_2$ is a number satisfying $0 < r < 1$.
For a nonnegative integer $n$, let
\[
C_n = \sum_{k=0}^{n} c_k x_1^k, \quad A_n = \sum_{k=0}^{n} a_k x_1^k, \quad B_n = \sum_{k=0}^{n} b_k x_1^k
\]
be the partial sums of each series. For any $n \geq 0$,
\[
|B - B_n| = \left| \sum_{k=n+1}^{\infty} b_k x_1^k \right| \leq \sum_{k=n+1}^{\infty} \left| b_k x_1^k \right| \leq \sum_{k=n+1}^{\infty} M r^k = \frac{M r^{n+1}}{1 - r}.
\]
By the definition of the sequence $\{c_n\}$, we find that for $n \geq 0$,
\[
\begin{aligned}
C_n &= \sum_{l=0}^{n} \sum_{k=0}^{l} a_k b_{l-k} x_1^l = \sum_{k=0}^{n} a_k x_1^k \sum_{l=k}^{n} b_{l-k} x_1^{l-k} \\
&= \sum_{k=0}^{n} a_k x_1^k \sum_{l=0}^{n-k} b_l x_1^l = \sum_{k=0}^{n} a_k x_1^k B_{n-k} \\
&= B A_n - \sum_{k=0}^{n} a_k x_1^k (B - B_{n-k}).
\end{aligned}
\]
Therefore, for any $n \geq 0$,
\[
|C_n - B A_n| \leq \sum_{k=0}^{n} |a_k x_1^k|\,|B - B_{n-k}| \leq \sum_{k=0}^{n} M r^k \times \frac{M r^{n-k+1}}{1 - r} = \frac{M^2}{1 - r} (n + 1) r^{n+1}.
\]
By Theorem 5.29, $\displaystyle\lim_{n\to\infty} (n + 1) r^{n+1} = 0$. By the squeeze theorem, we find that
\[
\lim_{n\to\infty} C_n = B \lim_{n\to\infty} A_n = AB,
\]
which completes the proof of the theorem.



Let us look at an example.

Example 6.25

Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = e^x \sin x$. Find the power series expansion of $f(x)$ up to the $x^5$ term, and find $f^{(5)}(0)$.

Solution
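We know that
\[
e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24} + \frac{x^5}{120} + \cdots \quad \text{for all } x \in \mathbb{R},
\]
\[
\sin x = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^{2n-1}}{(2n - 1)!} = x - \frac{x^3}{6} + \frac{x^5}{120} - \cdots \quad \text{for all } x \in \mathbb{R}.
\]
By Theorem 6.30,
\[
\begin{aligned}
e^x \sin x &= \left( 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24} + \frac{x^5}{120} + \cdots \right) \left( x - \frac{x^3}{6} + \frac{x^5}{120} - \cdots \right) \\
&= x + x^2 + \frac{x^3}{2} + \frac{x^4}{6} + \frac{x^5}{24} - \frac{x^3}{6} - \frac{x^4}{6} - \frac{x^5}{12} + \frac{x^5}{120} + \cdots \\
&= x + x^2 + \frac{x^3}{3} - \frac{x^5}{30} + \cdots.
\end{aligned}
\]
This gives the power series expansion of $f(x)$ up to the $x^5$ term. From this, we find that
\[
f^{(5)}(0) = 5! \times \left( -\frac{1}{30} \right) = -4.
\]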
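
The Cauchy product in this example can be reproduced mechanically. The sketch below assumes Python with exact rational arithmetic; the function name `cauchy_product` and the truncation order `N = 6` are illustrative choices. It multiplies the truncated coefficient lists of $e^x$ and $\sin x$ and reads off the coefficient of $x^5$.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """Coefficients c_n = sum_{m=0}^{n} a_m b_{n-m} for n below min(len(a), len(b))."""
    size = min(len(a), len(b))
    return [sum(a[m] * b[n - m] for m in range(n + 1)) for n in range(size)]

N = 6  # keep the coefficients of x^0, ..., x^5
exp_coeffs = [Fraction(1, factorial(n)) for n in range(N)]
sin_coeffs = [
    Fraction(0) if n % 2 == 0 else Fraction((-1) ** ((n - 1) // 2), factorial(n))
    for n in range(N)
]

product = cauchy_product(exp_coeffs, sin_coeffs)
fifth_derivative = factorial(5) * product[5]  # uses f^(k)(0) = k! c_k

print(product)
print(fifth_derivative)
```

Only the coefficients up to $x^5$ are reliable here, because $c_n$ depends on $a_0, \ldots, a_n$ and $b_0, \ldots, b_n$ and both input lists were truncated at that order.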

Exercises 6.4
Question 1
Let $p$ be a positive number. Determine the domain of convergence of the power series $\displaystyle\sum_{n=1}^{\infty} \frac{x^n}{n^p}$.

Question 2
Show that when $|x| < 1$,
\[
\tan^{-1} x = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^{2n-1}}{2n - 1} = x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots.
\]

Question 3 [The Newton-Gregory Formula]


Show that
\[
1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n - 1} = \frac{\pi}{4}.
\]

Question 4

Find a closed form formula for the sum of the series $\displaystyle\sum_{n=1}^{\infty} n^3 x^n$ when $|x| < 1$.

Question 5
Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = e^x \cos x$. Find the power series expansion of $f(x)$ up to the $x^5$ term, and find $f^{(5)}(0)$.

6.5 Taylor Series and Taylor Polynomials

In Section 6.4, we have seen that the exponential function, logarithm function, sine and cosine functions have power series representations that are valid on their domains or on subsets of their domains. Power series are limits of sequences of polynomials. They are infinitely differentiable, and they can be differentiated term by term and integrated term by term. Thus, they are very useful. Hence, we can ask the following two questions.

1. If $I$ is an open interval that contains the point $x_0$, and the function $f : I \to \mathbb{R}$ is infinitely differentiable on $I$, does there exist a positive constant $R$ and a power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ such that
\[
f(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n \quad \text{when } |x - x_0| < R?
\]

2. If the power series expansion exists, what is the error when we approximate $f(x)$ by the partial sum $s_n(x) = \sum_{k=0}^{n} c_k (x - x_0)^k$?

For the first question, Corollary 6.24 says that if such a representation exists, then we must have
\[
f^{(n)}(x_0) = n!\, c_n \quad \text{for all } n \geq 0.
\]
This leads us to the following definition.

Definition 6.11 Taylor Series and Maclaurin Series


If $I$ is an open interval that contains the point $x_0$, and the function $f : I \to \mathbb{R}$ is infinitely differentiable on $I$, the Taylor series of $f(x)$ at $x_0$ is the series
\[
\sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n = f(x_0) + \frac{f'(x_0)}{1!} (x - x_0) + \cdots + \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n + \cdots.
\]
When $x_0 = 0$, the Taylor series at 0 is also called a Maclaurin series.

Here the Taylor series is defined as a power series as long as the function is infinitely differentiable in an open interval $I$ that contains the point $x_0$. We do not assume any convergence. Even if the Taylor series is convergent, we cannot assume that it converges to the function $f(x)$ itself. In Section 6.6.3, we are going to see a classical example of an infinitely differentiable function whose Taylor series converges, but to a different function.
Nevertheless, for functions that are defined by a power series centered at $x_0$, Corollary 6.24 gives the following.

Theorem 6.31

Assume that the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ has positive radius of convergence $R$. If $f(x)$ is the function defined by the power series $\sum_{n=0}^{\infty} c_n (x - x_0)^n$ on the interval $(x_0 - R, x_0 + R)$, then the Taylor series of $f(x)$ at $x_0$ is $\sum_{n=0}^{\infty} c_n (x - x_0)^n$. Namely,
\[
f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n \quad \text{when } x \in (x_0 - R, x_0 + R).
\]
This shows that the Taylor series of $f(x)$ converges to the function $f(x)$. It also says that the power series expansion of a function at a point $x_0$, if it exists, is unique, and it is the Taylor series of the function at $x_0$.

We have the following list of Maclaurin series from Section 6.4.

Useful Maclaurin Series



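1. $\dfrac{1}{1-x} = \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + x^3 + \cdots$ when $|x| < 1$.

2. $e^x = \sum_{n=0}^{\infty} \dfrac{x^n}{n!} = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots$ for all $x \in \mathbb{R}$.

3. $\sin x = \sum_{n=1}^{\infty} (-1)^{n-1} \dfrac{x^{2n-1}}{(2n-1)!} = x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!} - \dfrac{x^7}{7!} + \cdots$ for all $x \in \mathbb{R}$.

4. $\cos x = \sum_{n=0}^{\infty} (-1)^{n} \dfrac{x^{2n}}{(2n)!} = 1 - \dfrac{x^2}{2!} + \dfrac{x^4}{4!} - \dfrac{x^6}{6!} + \cdots$ for all $x \in \mathbb{R}$.

5. $\ln(1+x) = \sum_{n=1}^{\infty} (-1)^{n-1} \dfrac{x^n}{n} = x - \dfrac{x^2}{2} + \dfrac{x^3}{3} - \dfrac{x^4}{4} + \cdots$ when $|x| < 1$.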

Remark 6.3 Maclaurin Series for Odd Functions and Even Functions
Let a be a positive number and let f : (−a, a) → R be an infinitely
differentiable function. Since the derivative of an odd function is even,
and the derivative of an even function is odd, the following holds.

1. If f(x) is an odd function, the Taylor series of f(x) at x = 0 has the form
\[ \sum_{n=1}^{\infty} \frac{f^{(2n-1)}(0)}{(2n-1)!} x^{2n-1}, \]
which only contains the odd power terms.

2. If f(x) is an even function, the Taylor series of f(x) at x = 0 has the form
\[ \sum_{n=0}^{\infty} \frac{f^{(2n)}(0)}{(2n)!} x^{2n}, \]
which only contains the even power terms.

By the uniqueness of power series expansion asserted in Theorem 6.31, and
the results proved in Section 6.4, we can use term by term addition, multiplication,
differentiation and integration to obtain the power series for new functions from
old ones. This is a useful tactic to find Taylor series of a large class of functions
from the few elementary ones listed above.

Example 6.26
Find the power series expansion of the function $f(x) = \dfrac{x+1}{4+x^2}$ at $x = 0$,
and find the largest open interval where this series is convergent.

Solution
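Applying the formula
\[ \frac{1}{1-x} = \sum_{n=0}^{\infty} x^n, \quad |x| < 1, \]
with $x$ replaced by $-x^2/4$ gives
\[ \frac{1}{4+x^2} = \frac{1}{4\left(1 + \dfrac{x^2}{4}\right)} = \frac{1}{4}\sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{2^{2n}}, \quad |x| < 2. \]

Multiplying by $1 + x$, we find that when $|x| < 2$,
\[ \frac{x+1}{4+x^2} = \frac{1}{4}\sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2^{2n}} + \frac{1}{4}\sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{2^{2n}}. \]

The largest open interval where this series is convergent is (−2, 2).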
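
The expansion in Example 6.26 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `partial_sum` and `f` are hypothetical, and the number of terms is an arbitrary choice. It adds the two series term by term and compares the result with the closed form at points inside (−2, 2).

```python
def f(x):
    """Closed form (x + 1) / (4 + x^2)."""
    return (x + 1) / (4 + x * x)


def partial_sum(x, terms):
    """Sum the terms n = 0, ..., terms - 1 of both series in Example 6.26."""
    total = 0.0
    for n in range(terms):
        coeff = (-1) ** n / 4 ** (n + 1)  # (1/4) * (-1)^n / 2^(2n)
        total += coeff * (x ** (2 * n + 1) + x ** (2 * n))
    return total


# Inside the interval of convergence the partial sums approach f(x).
for x in (0.5, -1.0, 1.9):
    print(x, f(x), partial_sum(x, 400))
```

Convergence slows down as $|x|$ approaches 2, because consecutive terms shrink only by the factor $x^2/4$.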

Example 6.27
Let $f : \mathbb{R} \to \mathbb{R}$ be the function defined by
\[ f(x) = \begin{cases} \dfrac{\sin x}{x}, & \text{if } x \neq 0, \\ 1, & \text{if } x = 0. \end{cases} \]

Show that $f$ is infinitely differentiable, and find $f^{(n)}(0)$ for all $n \geq 0$.

Solution
For any real number $x$, the series
\[ \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^{2n-1}}{(2n-1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots \]
converges to $\sin x$. When $x \neq 0$, dividing by $x$, we find that the series
\[ \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^{2n-2}}{(2n-1)!} = \sum_{n=0}^{\infty} (-1)^{n} \frac{x^{2n}}{(2n+1)!} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \cdots \]
converges to $\dfrac{\sin x}{x}$. Therefore, the power series $\sum_{n=0}^{\infty} (-1)^{n} \dfrac{x^{2n}}{(2n+1)!}$
converges for all $x$, and when $x \neq 0$, it is equal to $f(x)$. When $x = 0$,
it has value 1, which is equal to $f(0)$. This proves that
\[ f(x) = \sum_{n=0}^{\infty} (-1)^{n} \frac{x^{2n}}{(2n+1)!} \quad \text{for all } x \in \mathbb{R}. \]

Since the function $f(x)$ has a power series expansion that converges
everywhere, it is an infinitely differentiable function. By Theorem 6.31,
$f^{(k)}(0)$ is $k!$ times the coefficient of $x^k$ in this expansion, so we find that
\[ f^{(2n+1)}(0) = 0, \quad f^{(2n)}(0) = \frac{(-1)^n}{2n+1} \quad \text{for all } n \geq 0. \]

An important power series expansion that cannot be derived from the list of
Taylor series for elementary functions is the binomial series. Recall that if n is a
positive integer, the binomial expansion of $(1 + x)^n$ is given by
\[ (1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k. \]

If n is a negative integer, let m = −n − 1. Then m is a nonnegative integer. By
Theorem 6.25, we find that when |x| < 1,
\[ (1 + x)^n = \frac{1}{(1+x)^{m+1}} = \sum_{k=m}^{\infty} \binom{k}{m} (-x)^{k-m} = \sum_{k=0}^{\infty} (-1)^k \binom{k+m}{m} x^k. \]

Notice that for k ≥ 0,
\begin{align*}
(-1)^k \binom{k+m}{m} &= (-1)^k \frac{(k+m)!}{k!\,m!} \\
&= (-1)^k \frac{(m+k)(m+k-1)\cdots(m+1)}{k!} \\
&= \frac{(-m-1)(-m-2)\cdots(-m-k)}{k!} \\
&= \frac{n(n-1)\cdots(n-k+1)}{k!}.
\end{align*}
Thus, when n is a negative integer, we find that $(1 + x)^n$ has a power series
expansion on the interval (−1, 1), which can be written as
\[ (1 + x)^n = \sum_{k=0}^{\infty} \frac{n(n-1)\cdots(n-k+1)}{k!} x^k. \]

This motivates us to extend the definition of the binomial coefficients.

Definition 6.12 Generalized Binomial Coefficients


For any real number α and any nonnegative integer k, we define the
generalized binomial coefficient $\binom{\alpha}{k}$ by
\[ \binom{\alpha}{0} = 1, \]
and for k ≥ 1,
\[ \binom{\alpha}{k} = \frac{\alpha(\alpha-1)\cdots(\alpha-k+1)}{k!}. \]
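
Definition 6.12 translates directly into a short computation. The sketch below is an illustration only: the function name `gen_binom` is hypothetical, and exact rational arithmetic via `fractions.Fraction` is an implementation choice, not something the text prescribes. It is exact when α is an integer or a `Fraction`.

```python
from fractions import Fraction
from math import comb


def gen_binom(alpha, k):
    """Generalized binomial coefficient alpha(alpha-1)...(alpha-k+1)/k!."""
    result = Fraction(1)  # the empty product covers the case k = 0
    for j in range(k):
        result = result * (alpha - j) / (j + 1)
    return result


# For a nonnegative integer alpha this agrees with the usual coefficient.
print(gen_binom(5, 2), comb(5, 2))
# For the negative integer n = -3 (so m = 2) it equals (-1)^k * C(k + m, m).
print([gen_binom(-3, k) for k in range(5)])
# A non-integer example.
print(gen_binom(Fraction(1, 2), 3))
```

When α is a nonnegative integer and k > α, one of the factors in the product is zero, so the coefficient vanishes and the series reduces to the finite binomial expansion.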

Example 6.28
Let α be a real number. Show that the Maclaurin series of the function
$f : (-1, 1) \to \mathbb{R}$, $f(x) = (1 + x)^{\alpha}$ is
\[ \sum_{k=0}^{\infty} \binom{\alpha}{k} x^k. \]

When α is not a nonnegative integer, show that the radius of convergence
of this power series is 1.

Solution
The function f is infinitely differentiable. By straightforward computation,
we have
\[ f^{(k)}(x) = \alpha(\alpha-1)\cdots(\alpha-k+1)(1+x)^{\alpha-k}. \]
This gives
\[ f^{(k)}(0) = \alpha(\alpha-1)\cdots(\alpha-k+1). \]
Therefore, the Maclaurin series of f is
\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k = \sum_{k=0}^{\infty} \frac{\alpha(\alpha-1)\cdots(\alpha-k+1)}{k!} x^k = \sum_{k=0}^{\infty} \binom{\alpha}{k} x^k. \]

For the radius of convergence, we note that
\[ c_k = \binom{\alpha}{k} = \frac{\alpha(\alpha-1)\cdots(\alpha-k+1)}{k!} \]
is nonzero for all k ≥ 0 when α is not a nonnegative integer. Thus, we can
apply the ratio test. Since
\[ \lim_{k\to\infty} \left|\frac{c_{k+1}}{c_k}\right| = \lim_{k\to\infty} \frac{|\alpha - k|}{k+1} = 1, \]
we find that the radius of convergence of the power series is 1.

In the example above, we have shown that the Maclaurin series of $f(x) = (1 + x)^{\alpha}$
is $\sum_{k=0}^{\infty} \binom{\alpha}{k} x^k$, which is a power series that converges on (−1, 1). But
we have not shown that the Maclaurin series converges to f(x) on (−1, 1), except
when α is an integer. To prove the convergence of the Maclaurin series to the
function, we will study the convergence of the sequence of partial sums.
The partial sums of Taylor series are called Taylor polynomials. They are
important in their own right.

Definition 6.13 Taylor Polynomials


Let I be an open interval that contains the point x0 , and let n be a positive
integer. If the function f : I → R is n times differentiable on I, the nth
Taylor polynomial of f (x) at x0 is the polynomial
\begin{align*}
T_n(x) &= \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k \\
&= f(x_0) + \frac{f'(x_0)}{1!}(x - x_0) + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n.
\end{align*}

In particular,

\begin{align*}
T_1(x) &= f(x_0) + f'(x_0)(x - x_0), \\
T_2(x) &= f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2, \\
T_3(x) &= f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2 + \frac{f'''(x_0)}{6}(x - x_0)^3,
\end{align*}
and so on.
Notice that to define Taylor polynomials of degree n for a function f , we do
not need to assume that f is infinitely differentiable. We just need to assume that
f is n times differentiable.

Example 6.29

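For the function $f(x) = x\cos x$, its Taylor series at $x = 0$ is
\[ f(x) = x \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} = x\left(1 - \frac{x^2}{2} + \frac{x^4}{24} + \cdots\right) = x - \frac{x^3}{2} + \frac{x^5}{24} + \cdots. \]

If $T_n(x)$ is the nth Taylor polynomial for $f(x)$ at $x = 0$, then
\begin{align*}
T_1(x) &= T_2(x) = x, \\
T_3(x) &= T_4(x) = x - \frac{x^3}{2}, \\
T_5(x) &= T_6(x) = x - \frac{x^3}{2} + \frac{x^5}{24},
\end{align*}
and so on.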
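
The Taylor polynomials in Example 6.29 can be generated mechanically from the coefficients $(-1)^m/(2m)!$ of $x^{2m+1}$. The sketch below is only an illustration; the helper names `maclaurin_coeffs_x_cos_x` and `evaluate` are hypothetical and not from the text.

```python
from fractions import Fraction
from math import cos, factorial


def maclaurin_coeffs_x_cos_x(n):
    """Coefficients c_0, ..., c_n of the nth Taylor polynomial of x*cos(x) at 0."""
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(1, n + 1, 2):  # only odd powers appear
        m = (k - 1) // 2          # x^(2m+1) has coefficient (-1)^m / (2m)!
        coeffs[k] = Fraction((-1) ** m, factorial(2 * m))
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at the float x."""
    return sum(float(c) * x ** k for k, c in enumerate(coeffs))


print(maclaurin_coeffs_x_cos_x(5))  # coefficients of x - x^3/2 + x^5/24
print(evaluate(maclaurin_coeffs_x_cos_x(9), 0.5), 0.5 * cos(0.5))
```

Since every even-indexed coefficient is zero, the list for $T_{2n}$ is the list for $T_{2n-1}$ with one extra zero appended.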

In this example, we notice that T2n−1 (x) = T2n (x) for all n ∈ Z+ . This is
because f (x) is an odd function.
We have noticed that the first Taylor polynomial
\[ T_1(x) = f(x_0) + f'(x_0)(x - x_0) \]
of a function f(x) at x = x0 is related to the tangent line to the graph of the
function. In fact, by the definition of derivatives, we have
\[ \lim_{x \to x_0} \frac{f(x) - T_1(x)}{x - x_0} = \lim_{x \to x_0} \frac{f(x) - f(x_0) - (x - x_0)f'(x_0)}{x - x_0} = 0. \]
We say that T1 (x) is a first order approximation of f (x) at x = x0 . In general, we
can define the following concept.

Definition 6.14 Order of Approximation


Let I be an open interval that contains the point x0 , and let n be a positive
integer. We say that two functions f : I → R and g : I → R are nth -order
approximations of each other at the point x0 if

\[ \lim_{x \to x_0} \frac{f(x) - g(x)}{(x - x_0)^n} = 0. \]

We will show that the nth Taylor polynomial of a function f(x) at x0 is an nth-order
approximation of the function at x0. First, we prove the following lemma,
which says that for any real number x0, any polynomial of degree n can be written
in the form $\sum_{k=0}^{n} c_k (x - x_0)^k$.

Lemma 6.32
Given a real number x0, and a polynomial p(x) of degree n, we have
\[ p(x) = \sum_{k=0}^{n} \frac{p^{(k)}(x_0)}{k!}(x - x_0)^k. \]

In other words, the nth Taylor polynomial of p(x) is p(x) itself, and the
Taylor series of p(x) is also p(x).

Proof
Let
\[ p(x) = a_0 + a_1 x + \cdots + a_n x^n, \]
and let $h = x - x_0$. Then $x = h + x_0$. Substituting $x_0 + h$ for $x$, we have
\[ p(x) = a_0 + a_1 (h + x_0) + \cdots + a_n (h + x_0)^n. \]

For 0 ≤ k ≤ n, $(h + x_0)^k$ is a polynomial of degree k in h. Thus,
\[ a_0 + a_1 (h + x_0) + \cdots + a_n (h + x_0)^n \]
is a polynomial of degree n in h. This implies that there are constants
$c_0, c_1, \ldots, c_n$ such that
\[ p(x) = \sum_{k=0}^{n} c_k h^k = \sum_{k=0}^{n} c_k (x - x_0)^k. \]

Differentiating both sides k times and setting $x = x_0$ gives
\[ p^{(k)}(x_0) = k!\,c_k. \]

This proves that
\[ p(x) = \sum_{k=0}^{n} \frac{p^{(k)}(x_0)}{k!}(x - x_0)^k. \]
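
Lemma 6.32 gives a concrete recipe for rewriting a polynomial in powers of $(x - x_0)$: compute $c_k = p^{(k)}(x_0)/k!$. The following is a minimal sketch under the assumption that the coefficients and $x_0$ are integers or `Fraction` values; the helper names `derivative`, `evaluate` and `recenter` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs):
    """Coefficients of p'(x), where p(x) = sum of coeffs[k] * x^k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def recenter(coeffs, x0):
    """Return [c_0, ..., c_n] with c_k = p^(k)(x0) / k!, as in Lemma 6.32."""
    shifted = []
    current = list(coeffs)
    for k in range(len(coeffs)):
        shifted.append(Fraction(evaluate(current, x0), factorial(k)))
        current = derivative(current)
    return shifted


# p(x) = 1 + 2x + 3x^2 rewritten in powers of (x - 1).
print(recenter([1, 2, 3], 1))
```

Evaluating the recentered coefficients at $h = x - x_0$ reproduces $p(x)$, which is the identity asserted by the lemma.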

As a corollary, we have the following, which can be deduced from Theorem 3.20.

Corollary 6.33

If p(x) is a polynomial of degree at most n, and there is a point x0 such that

p(x0 ) = p′ (x0 ) = · · · = p(n) (x0 ) = 0,

then p(x) is identically zero.

We would also like to emphasize again the following.



Corollary 6.34
Let I be an interval that contains the point x0, and let n be a positive integer.
Given that f : I → R is a function that is n times differentiable, let
\[ T_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k \]
be its nth Taylor polynomial at x0. For 0 ≤ k ≤ n, we have
\[ T_n^{(k)}(x_0) = f^{(k)}(x_0). \]

Proof
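Since $T_n(x)$ is a polynomial of degree at most n, Lemma 6.32 gives
\[ T_n(x) = \sum_{k=0}^{n} \frac{T_n^{(k)}(x_0)}{k!}(x - x_0)^k. \]

The result follows by comparing coefficients with the definition of $T_n(x)$.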

Now we prove the approximation theorem.

Theorem 6.35
Let I be an open interval that contains the point x0 , and let n be a positive
integer. Assume that the function f : I → R is n times differentiable.

(a) The nth Taylor polynomial
\[ T_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k \]
of f(x) at x0 is an nth-order approximation of f(x) at x0.

(b) If p(x) is a polynomial of degree at most n, and p(x) is an nth-order
approximation of f(x) at x0, then $p(x) = T_n(x)$.

Proof
Let us consider (a) first. If n = 1, we need to show that
\[ \lim_{x \to x_0} \frac{f(x) - T_1(x)}{x - x_0} = \lim_{x \to x_0} \frac{f(x) - f(x_0) - f'(x_0)(x - x_0)}{x - x_0} = 0. \]
But this is just the definition of $f'(x_0)$. Assume that we have proved the
statement for the n − 1 case. Now we look at
\[ \lim_{x \to x_0} \frac{f(x) - T_n(x)}{(x - x_0)^n}. \]

This is a limit of the indeterminate form 0/0. Notice that
\[ T_n'(x) = \sum_{k=1}^{n} \frac{f^{(k)}(x_0)}{(k-1)!}(x - x_0)^{k-1} = \sum_{k=0}^{n-1} \frac{(f')^{(k)}(x_0)}{k!}(x - x_0)^k \]
is the (n − 1)th Taylor polynomial for $f'$, and $f'$ is (n − 1) times
differentiable. By the inductive hypothesis,
\[ \lim_{x \to x_0} \frac{f'(x) - T_n'(x)}{(x - x_0)^{n-1}} = 0. \]

By l'Hôpital's rule,
\[ \lim_{x \to x_0} \frac{f(x) - T_n(x)}{(x - x_0)^n} = \lim_{x \to x_0} \frac{f'(x) - T_n'(x)}{n(x - x_0)^{n-1}} = 0. \]

This finishes the induction for (a).

Now we consider (b). Let $p(x) = \sum_{k=0}^{n} c_k (x - x_0)^k$ be a polynomial of degree
at most n which is an nth-order approximation of f(x) at x0. Then
\[ \lim_{x \to x_0} \frac{f(x) - p(x)}{(x - x_0)^n} = 0. \]

We have proved in part (a) that
\[ \lim_{x \to x_0} \frac{f(x) - T_n(x)}{(x - x_0)^n} = 0. \]

These give
\[ \lim_{x \to x_0} \frac{T_n(x) - p(x)}{(x - x_0)^n} = \lim_{x \to x_0} \frac{f(x) - p(x)}{(x - x_0)^n} - \lim_{x \to x_0} \frac{f(x) - T_n(x)}{(x - x_0)^n} = 0. \]

It follows that for all 0 ≤ k ≤ n,
\[ \lim_{x \to x_0} \frac{T_n(x) - p(x)}{(x - x_0)^k} = 0. \tag{6.15} \]

Notice that
\[ T_n(x) - p(x) = \sum_{k=0}^{n} \left(\frac{f^{(k)}(x_0)}{k!} - c_k\right)(x - x_0)^k. \]

Taking k = 0 in (6.15), we find that $c_0 = f(x_0)$. Then taking k = 1 shows that
$c_1 = f'(x_0)$. Inductively, we show that $c_k = \dfrac{f^{(k)}(x_0)}{k!}$ for all 0 ≤ k ≤ n.
This completes the proof of the theorem.

Theorem 6.35 says that the nth Taylor polynomial of a function f(x) at
a point x0 is the unique polynomial of degree at most n which is an nth-order
approximation of f(x) at x0. Hence, we also call Tn(x) the Taylor
polynomial of f(x) of order n at x0. One should avoid calling it the nth
degree Taylor polynomial, as we have seen that Tn(x) does not necessarily
have degree n.

Given that I is an interval that contains the point x0, and f : I → R is an
n times differentiable function, the Taylor polynomial $T_n(x)$ is well defined. By
Theorem 6.35,
\[ \lim_{x \to x_0} \frac{f(x) - T_n(x)}{(x - x_0)^n} = 0. \]
This implies that for any ε > 0, there is a δ > 0 such that (x0 − δ, x0 + δ) ⊂ I,
and for all x ∈ (x0 − δ, x0 + δ),

|f (x) − Tn (x)| ≤ ε|x − x0 |n . (6.16)

The function
Rn (x) = f (x) − Tn (x)

is called the remainder when we approximate the function f (x) by its nth Taylor
polynomial Tn (x) at x0 . Eq (6.16) says that when x approaches x0 , the order of
Rn (x) is smaller than the order of |x − x0 |n . If we assume that f has one more
derivative, we can say more.
We will first prove the Lagrange remainder theorem which assumes that f :
I → R is (n + 1) times differentiable.

Theorem 6.36 The Lagrange Remainder Theorem


Let I be an open interval that contains the point x0, and let n be a positive
integer. Given that f : I → R is a function that is (n + 1) times
differentiable, let
\[ T_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k \]
be its Taylor polynomial of order n at x0. For any x ∈ I \ {x0}, there is a
number c ∈ (0, 1) such that
\[ f(x) - T_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}(x - x_0)^{n+1}, \quad \text{where } \xi = x_0 + c(x - x_0). \]

Recall that ξ = x0 + c(x − x0) with c ∈ (0, 1) means that ξ is a point strictly
between x0 and x.

Proof
The proof is a straightforward application of Theorem 3.20, which is a
consequence of the Cauchy mean value theorem. Let g : I → R be the function
defined by
\[ g(x) = f(x) - T_n(x) = f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k. \]

Then g is (n + 1) times differentiable. Corollary 6.34 implies that

g(x0 ) = g ′ (x0 ) = · · · = g (n) (x0 ) = 0.



Applying Theorem 3.20 to the function g, we find that for any x ∈ I \ {x0},
there is a number c ∈ (0, 1) such that
\[ f(x) - T_n(x) = g(x) = \frac{g^{(n+1)}(\xi)}{(n+1)!}(x - x_0)^{n+1}, \quad \text{where } \xi = x_0 + c(x - x_0). \]
Since $T_n(x)$ is a polynomial of degree at most n, $T_n^{(n+1)}(x) = 0$ for all x ∈ I.
Therefore, $g^{(n+1)}(x) = f^{(n+1)}(x)$ for all x ∈ I. This concludes the proof.

The Lagrange remainder theorem also holds in the n = 0 case. This is just the
Lagrange mean value theorem. Thus Lagrange remainder theorem is an extension
of the Lagrange mean value theorem. It gives useful estimates on the error term
in approximating a function by its Taylor polynomial, especially if f (n+1) (x) is
always positive or always negative in a neighbourhood of x0 .

Example 6.30
In this example, we demonstrate how we can use the Lagrange remainder
theorem to show that the Taylor series of the function $f : \mathbb{R} \to \mathbb{R}$, $f(x) = e^x$,
converges to $f(x)$ for all real numbers x. The nth Taylor polynomial of
$f(x) = e^x$ at $x = 0$ is
\[ T_n(x) = 1 + x + \frac{x^2}{2} + \cdots + \frac{x^n}{n!}. \]
Since $f^{(n)}(x) = e^x$ for any n ∈ Z+, the Lagrange remainder theorem says that
for any n ≥ 0 and any real number x ≠ 0, there is a number ξ strictly
between 0 and x such that
\[ e^x - T_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} x^{n+1} = \frac{e^{\xi}}{(n+1)!} x^{n+1}. \tag{6.17} \]

For fixed x, ξ depends on n, but we can use ξ < |x| to get the estimate
$e^{\xi} \leq e^{|x|}$, which is independent of n. This implies that
\[ |e^x - T_n(x)| \leq e^{|x|} \frac{|x|^{n+1}}{(n+1)!}. \]


We have proved in Example 6.18 that the power series $\sum_{n=0}^{\infty} \dfrac{|x|^n}{n!}$ is
convergent. Therefore, $\lim_{n \to \infty} \dfrac{|x|^{n+1}}{(n+1)!} = 0$. This allows us to conclude
that
\[ e^x = \lim_{n \to \infty} T_n(x) = 1 + x + \frac{x^2}{2} + \cdots + \frac{x^n}{n!} + \cdots. \tag{6.18} \]
This is an alternative way to prove that the Taylor series of $e^x$ converges
to $e^x$, instead of the approach used in the proof of Theorem 6.27. From the
series expansion (6.18), it is easy to deduce that for all x > 0, and all n ≥ 1,
\[ e^x > 1 + x + \frac{x^2}{2} + \cdots + \frac{x^n}{n!}. \]
In particular, we have
\begin{align*}
e^x &> 1 + x, \\
e^x &> 1 + x + \frac{x^2}{2}, \\
e^x &> 1 + x + \frac{x^2}{2} + \frac{x^3}{6}, \\
e^x &> 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24}.
\end{align*}
For x < 0, the series (6.18) is alternating. The sequence $\{b_n\}$ with $b_n = \dfrac{|x|^n}{n!}$
is not decreasing. However, since $x^{2n-1} < 0$ and $x^{2n} > 0$ for all
n ≥ 1, and $e^{\xi} > 0$ for all ξ, we can use (6.17) to conclude that for x < 0,
\begin{align*}
e^x &> 1 + x, \\
e^x &< 1 + x + \frac{x^2}{2}, \\
e^x &> 1 + x + \frac{x^2}{2} + \frac{x^3}{6}, \\
e^x &< 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24}.
\end{align*}
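
The error estimate of Example 6.30 is easy to observe numerically. The sketch below is illustrative only; the names `taylor_exp` and `lagrange_bound` are hypothetical, and floating-point arithmetic is used, so very small errors are subject to rounding.

```python
from math import exp, factorial


def taylor_exp(x, n):
    """nth Taylor polynomial of e^x at 0, evaluated at x."""
    return sum(x ** k / factorial(k) for k in range(n + 1))


def lagrange_bound(x, n):
    """Upper bound e^|x| * |x|^(n+1) / (n+1)! for |e^x - T_n(x)|."""
    return exp(abs(x)) * abs(x) ** (n + 1) / factorial(n + 1)


# Actual error versus the bound for x = -3 and increasing n.
x = -3.0
for n in (2, 5, 10, 20):
    print(n, abs(exp(x) - taylor_exp(x, n)), lagrange_bound(x, n))
```

For a negative x the sign of the error alternates with n, in line with the inequalities listed for x < 0.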

In Theorem 3.37, we apply mean value theorem to prove that | sin x| ≤ |x| for
all real numbers x. In the following example, we extend this result partially.

Figure 6.7: The function f (x) = ex and its Taylor polynomials at x = 0.

Example 6.31

Show that for x ∈ (0, π), $\sin x > x - \dfrac{x^3}{6}$.

Solution
Let f(x) = sin x. Then f(x) is infinitely differentiable, with the third
Taylor polynomial at x = 0 given by
\[ T_3(x) = x - \frac{x^3}{6}. \]
Applying the Lagrange remainder theorem, we find that for any x ∈ (0, π),
there is a ξ ∈ (0, x) ⊂ (0, π) so that
\[ \sin x - x + \frac{x^3}{6} = \frac{f^{(4)}(\xi)}{24} x^4 = \frac{\sin \xi}{24} x^4. \]
Since sin ξ > 0 for ξ ∈ (0, π), this proves that
\[ \sin x > x - \frac{x^3}{6} \quad \text{for } x \in (0, \pi). \]

We have repeatedly used the fact that if f : I → R is a differentiable function
defined on an open interval I, and f′(x) = 0 for all x ∈ I, then f(x) is a constant
function. The next theorem extends this result.

Figure 6.8: The function f (x) = sin x and its Taylor polynomials at x = 0.

Theorem 6.37
Let I be an open interval, and let n be a positive integer. Assume that the
function f : I → R is (n + 1) times differentiable, and f (n+1) (x) = 0 for
all x ∈ I. Then f (x) is a polynomial of degree at most n.

Proof
If f (n+1) (x) = 0 for all x ∈ I, take any point x0 in I, and let Tn (x) be the
nth Taylor polynomial of f at x0 . By definition, f (x0 ) = Tn (x0 ). Given
x ∈ I \ {x0 }, the Lagrange remainder theorem implies that there is a point
ξ ∈ I such that

\[ f(x) - T_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}(x - x_0)^{n+1}. \]

Since f (n+1) (x) is identically 0, we find that for all x ∈ I, f (x) = Tn (x).
This proves that f is a polynomial of degree at most n.

As a corollary, we have the following.



Corollary 6.38
Let I be an open interval, and let n be a positive integer. Assume that
f : I → R and g : I → R are (n + 1) times differentiable functions such
that
\[ f^{(n+1)}(x) = g^{(n+1)}(x) \quad \text{for all } x \in I. \]
Then there is a polynomial p(x) of degree at most n such that
\[ f(x) = g(x) + p(x). \]

Next we turn to the Cauchy remainder theorem. In Example 4.28, we have
shown that if g : I → R is a continuous function, x0 is a point in I, and n is a
positive integer, then the function G : I → R defined by
\[ G(x) = \frac{1}{n!}\int_{x_0}^{x} (x - t)^n g(t)\,dt \]
is (n + 1) times continuously differentiable,
\[ G(x_0) = G'(x_0) = \cdots = G^{(n)}(x_0) = 0, \]
and
\[ G^{(n+1)}(x) = g(x) \quad \text{for all } x \in I. \]

Theorem 6.39 The Cauchy Remainder Formula
Let I be an open interval that contains the point x0, and let n be a positive
integer. Given that f : I → R is a function that is (n + 1) times continuously
differentiable, let
\[ T_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k \]
be its Taylor polynomial of order n at x0. For any x ∈ I,
\[ f(x) - T_n(x) = \frac{1}{n!}\int_{x_0}^{x} (x - t)^n f^{(n+1)}(t)\,dt. \]

Proof
Let H : I → R be the function defined by
\[ H(x) = f(x) - T_n(x) - \frac{1}{n!}\int_{x_0}^{x} (x - t)^n f^{(n+1)}(t)\,dt. \]

Then by the result proved in Example 4.28, we find that H is a function that
is (n + 1) times continuously differentiable,
\[ H(x_0) = H'(x_0) = \cdots = H^{(n)}(x_0) = 0, \]
and
\[ H^{(n+1)}(x) = f^{(n+1)}(x) - f^{(n+1)}(x) = 0 \quad \text{for all } x \in I. \]

By Theorem 6.37 and Corollary 6.33, H(x) = 0 for all x ∈ I. This
completes the proof of the assertion.

In the Cauchy remainder formula, the error term is expressed exactly as an integral,
although in practice it might not be possible to evaluate such an integral. Let us
now apply the Cauchy remainder formula to prove that the Maclaurin series of the
function $f(x) = (1 + x)^{\alpha}$ converges to f(x) when x ∈ (−1, 1).

Theorem 6.40
Let α be a real number. For |x| < 1,
\[ (1 + x)^{\alpha} = \sum_{k=0}^{\infty} \binom{\alpha}{k} x^k. \]

Proof
Let $f(x) = (1 + x)^{\alpha}$, −1 < x < 1. We have seen that $\sum_{k=0}^{\infty} \binom{\alpha}{k} x^k$ is the
Maclaurin series of f(x). The nth Taylor polynomial of f(x) at x = 0 is
\[ T_n(x) = \sum_{k=0}^{n} \binom{\alpha}{k} x^k. \]

We need to show that $\lim_{n \to \infty} T_n(x) = f(x)$ for all |x| < 1. It is easy to verify
that
\[ f^{(n+1)}(x) = (n+1)! \binom{\alpha}{n+1} (1 + x)^{\alpha - n - 1}. \]
By the Cauchy remainder formula, for x ∈ (−1, 1),
\begin{align*}
f(x) - T_n(x) &= \frac{1}{n!}\int_0^x (x - t)^n f^{(n+1)}(t)\,dt \\
&= (n+1)\binom{\alpha}{n+1} \int_0^x (x - t)^n (1 + t)^{\alpha - n - 1}\,dt.
\end{align*}

We need to estimate this last integral for x ≠ 0. Making the change of
variables t = xτ, we find that
\[ \int_0^x (x - t)^n (1 + t)^{\alpha - n - 1}\,dt = x^{n+1} \int_0^1 (1 - \tau)^n (1 + x\tau)^{\alpha - n - 1}\,d\tau. \]

Notice that since x ∈ (−1, 1), when τ ∈ [0, 1],
\[ (1 - \tau)^n (1 + x\tau)^{\alpha - n - 1} \geq 0. \]

For fixed x ∈ (−1, 1), the function g : [0, 1] → R, $g(\tau) = (1 + x\tau)^{\alpha - 1}$ is
continuous. Therefore, there is a constant M such that
\[ 0 \leq (1 + x\tau)^{\alpha - 1} \leq M \quad \text{for all } \tau \in [0, 1]. \]

This implies that
\[ 0 \leq \int_0^1 (1 - \tau)^n (1 + x\tau)^{\alpha - n - 1}\,d\tau \leq M \int_0^1 \left(\frac{1 - \tau}{1 + x\tau}\right)^n d\tau. \]

For any x ∈ (−1, 1) and τ ∈ [0, 1],
\[ 1 + x\tau \geq 1 - \tau \geq 0. \]

This implies that for any x ∈ (−1, 1),
\[ 0 \leq \left(\frac{1 - \tau}{1 + x\tau}\right)^n \leq 1 \quad \text{for all } \tau \in [0, 1]. \]

Therefore, we find that
\[ 0 \leq \int_0^1 \left(\frac{1 - \tau}{1 + x\tau}\right)^n d\tau \leq 1 \quad \text{for all } |x| < 1. \]
Hence,
\[ |f(x) - T_n(x)| \leq M \left| (n+1) \binom{\alpha}{n+1} x^{n+1} \right|. \tag{6.19} \]
In Example 6.28, we have proved that the series $\sum_{n=0}^{\infty} \binom{\alpha}{n} x^n$ is convergent
when |x| < 1. It follows from Theorem 6.21 that the derived series
$\sum_{n=0}^{\infty} (n+1) \binom{\alpha}{n+1} x^n$ is also convergent when |x| < 1. This implies that
\[ \lim_{n \to \infty} (n+1) \binom{\alpha}{n+1} x^n = 0. \]

Using the squeeze theorem, we deduce from (6.19) that
\[ \lim_{n \to \infty} T_n(x) = f(x). \]

This completes the proof.
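
Theorem 6.40 can be illustrated by computing partial sums of the binomial series. The following is a minimal floating-point sketch; the function name `binomial_partial_sum` is hypothetical. Each term is obtained from the previous one through the ratio $(\alpha - k)/(k+1)$ of consecutive generalized binomial coefficients.

```python
def binomial_partial_sum(alpha, x, n):
    """T_n(x) = sum of C(alpha, k) * x^k for k = 0, ..., n."""
    term = 1.0   # C(alpha, 0) * x^0
    total = 1.0
    for k in range(n):
        # C(alpha, k+1) x^(k+1) = C(alpha, k) x^k * (alpha - k) / (k + 1) * x
        term *= (alpha - k) / (k + 1) * x
        total += term
    return total


# The partial sums approach (1 + x)^alpha for |x| < 1.
for alpha, x in ((0.5, 0.5), (-1.0, 0.3), (-2.5, -0.9)):
    print(alpha, x, binomial_partial_sum(alpha, x, 2000), (1 + x) ** alpha)
```

Convergence is slow when |x| is close to 1, which is why a large number of terms is used for x = −0.9.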



Exercises 6.5
Question 1
Let $f : \mathbb{R} \to \mathbb{R}$ be the function defined by
\[ f(x) = \begin{cases} \dfrac{2 - 2\cos x}{x^2}, & \text{if } x \neq 0, \\ 1, & \text{if } x = 0. \end{cases} \]

Show that $f$ is infinitely differentiable, and find $f^{(n)}(0)$ for all $n \geq 0$.

Question 2
Show that for all x > 0,
\[ x - \frac{x^2}{2} < \ln(1 + x) < x. \]

Question 3
Show that for all x > 0,
\[ 1 + \frac{x}{2} - \frac{x^2}{8} < \sqrt{1 + x} < 1 + \frac{x}{2} - \frac{x^2}{8} + \frac{x^3}{16}. \]

Question 4
Show that for all x ∈ (−π, π),
\[ \cos x \geq 1 - \frac{x^2}{2}. \]

Question 5
Let α be a real number. Assume that α is not an integer. In Example 6.28,
we have shown that the power series $\sum_{k=0}^{\infty} \binom{\alpha}{k} x^k$, which is the Maclaurin
series of the function $f(x) = (1 + x)^{\alpha}$, has radius of convergence 1. Define
the function g : (−1, 1) → R by
\[ g(x) = \sum_{k=0}^{\infty} \binom{\alpha}{k} x^k. \]

In this question, you are asked to show that $g(x) = (1 + x)^{\alpha}$ for x ∈ (−1, 1),
without using the Cauchy remainder formula.

(a) Show that $(1 + x)g'(x) = \alpha g(x)$ for all x ∈ (−1, 1).

(b) Let h : (−1, 1) → R be the function defined by $h(x) = g(x)(1 + x)^{-\alpha}$.
Prove that h is a constant function.

(c) Conclude that $g(x) = (1 + x)^{\alpha}$ for all x ∈ (−1, 1).



6.6 Examples and Applications

In this section, we discuss some examples and applications.


The number e and the number π are two important numbers in mathematics.
In Section 6.6.1 and Section 6.6.2, we prove respectively that these two numbers
are irrational.
In Section 6.6.3, we prove that there is an infinitely differentiable function
whose Taylor series at a point does not converge to the function itself. We also
briefly discuss the applications of such functions, despite their non-analyticity.
In Section 6.6.4, we construct a continuous function that is differentiable
nowhere. It uses Theorem 6.9, which says that the uniform limit of continuous functions
is continuous.
In Section 6.6.5, we prove the Weierstrass approximation theorem, which says
that any continuous function defined on a closed and bounded interval can be
uniformly approximated by polynomials. We give a proof that uses Bernstein’s
approach. It uses the fact that a continuous function defined on a closed and
bounded interval is bounded and uniformly continuous. Later when we study
Fourier series, we are going to prove this important theorem again using the theory
of Fourier series.

6.6.1 The Irrationality of e

In Example 1.36, we have defined the number e as the limit of the increasing
sequence $\{a_n\}$, where $a_n = \left(1 + \dfrac{1}{n}\right)^n$. We have proved that $a_n \leq 3$ for all
n ∈ Z+. This implies that e ≤ 3. In Theorem 6.27, we proved that
\[ e = \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + \cdots. \]

Theorem 6.41 Irrationality of e


The number e is irrational.

Proof
Assume to the contrary that e is rational. Then since e is positive, there are
positive integers a and b such that
\[ e = \frac{a}{b}. \]
For any positive integer n, we apply the Lagrange remainder theorem to the nth
Taylor polynomial of $e^x$ at the point x0 = 0. With x = 1, we find that there
is a number $c_n$ in the interval (0, 1) such that
\[ e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + \frac{e^{c_n}}{(n+1)!}. \tag{6.20} \]

For n ≥ b, we find that n! is divisible by b, and so n!e is an integer. From
(6.20), we have
\[ 0 < n!\,e - \left(n! + n! + \frac{n!}{2!} + \cdots + \frac{n!}{n!}\right) = \frac{e^{c_n}}{n+1} < \frac{e}{n+1} \leq \frac{3}{n+1}. \]

Notice that for each 1 ≤ k ≤ n, n!/k! is an integer. Hence, for n ≥ b,
\[ n!\,e - \left(n! + n! + \frac{n!}{2!} + \cdots + \frac{n!}{n!}\right) \]
is a positive integer that is less than 3/(n + 1). For n > 3, 3/(n + 1) is less
than 1. Taking n larger than both b and 3, we obtain a positive integer that is
less than 1, which is a contradiction. Hence, e must be irrational.
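
The two facts that drive this proof, namely that $n!\,(1 + 1/1! + \cdots + 1/n!)$ is an integer and that $n!$ times the remaining tail lies strictly between 0 and 3/(n + 1), can be observed with exact rational arithmetic. The sketch below is an illustration only: the function names are hypothetical, and the infinite tail is truncated after `extra_terms` terms (an assumption of the sketch), so `scaled_tail` is a lower approximation of the true scaled tail.

```python
from fractions import Fraction
from math import factorial


def scaled_partial_sum(n):
    """n! * (1 + 1/1! + ... + 1/n!), computed exactly."""
    return sum(Fraction(factorial(n), factorial(k)) for k in range(n + 1))


def scaled_tail(n, extra_terms=40):
    """n! * (1/(n+1)! + ... ), truncated after extra_terms terms."""
    return sum(
        Fraction(factorial(n), factorial(k))
        for k in range(n + 1, n + 1 + extra_terms)
    )


for n in (1, 2, 5, 10):
    print(n, scaled_partial_sum(n), float(scaled_tail(n)), 3 / (n + 1))
```

If e were a rational number a/b, then for n ≥ b the scaled tail would have to be an integer, which is impossible once 3/(n + 1) < 1.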

6.6.2 The Irrationality of π

As in the case of the number e, we will show that π is an irrational number using
proof by contradiction. We begin with two lemmas.

Lemma 6.42
Let f : R → R and g : R → R be two infinitely differentiable
functions. For any n ∈ Z+, and any numbers α and β,
\begin{align*}
&\int_{\alpha}^{\beta} f^{(2n+1)}(x) g(x)\,dx + \int_{\alpha}^{\beta} f(x) g^{(2n+1)}(x)\,dx \\
&\quad = \sum_{k=0}^{2n} (-1)^k f^{(k)}(\beta) g^{(2n-k)}(\beta) - \sum_{k=0}^{2n} (-1)^k f^{(k)}(\alpha) g^{(2n-k)}(\alpha). \tag{6.21}
\end{align*}

Proof
Given n ∈ Z+, define the function F : R → R by
\[ F(x) = \sum_{k=0}^{2n} (-1)^k f^{(k)}(x) g^{(2n-k)}(x). \]

Then
\[ F'(x) = \sum_{k=0}^{2n} (-1)^k f^{(k+1)}(x) g^{(2n-k)}(x) + \sum_{k=0}^{2n} (-1)^k f^{(k)}(x) g^{(2n-k+1)}(x). \]

Because of the alternating signs, the k = 0 to k = 2n − 1 terms in the first
sum cancel with the k = 1 to k = 2n terms in the second sum. This gives
\[ F'(x) = f^{(2n+1)}(x) g(x) + f(x) g^{(2n+1)}(x). \]

By the fundamental theorem of calculus, we find that
\[ F(\beta) - F(\alpha) = \int_{\alpha}^{\beta} f^{(2n+1)}(x) g(x)\,dx + \int_{\alpha}^{\beta} f(x) g^{(2n+1)}(x)\,dx. \]

This proves (6.21).



Lemma 6.43
Let a, b and n be positive integers. Define the polynomial p : R → R by
\[ p(x) = \frac{x^n (a - bx)^n}{n!}. \]
For any integer k satisfying 0 ≤ k ≤ 2n, $p^{(k)}(0)$ and $p^{(k)}(a/b)$ are integers.

Proof
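Using the binomial expansion, we have
\[ p(x) = \frac{x^n}{n!} \sum_{m=0}^{n} \binom{n}{m} a^{n-m} (-1)^m b^m x^m. \]

By Lemma 6.32,
\[ p(x) = \sum_{k=0}^{2n} \frac{p^{(k)}(0)}{k!} x^k. \]
Comparing the coefficients, we find that
\[ p^{(k)}(0) = \begin{cases} 0, & \text{if } 0 \leq k \leq n - 1, \\ \dfrac{k!}{n!} \dbinom{n}{k-n} a^{2n-k} (-1)^{k-n} b^{k-n}, & \text{if } n \leq k \leq 2n. \end{cases} \]

Since k! is divisible by n! when k ≥ n, we find that $p^{(k)}(0)$ is an integer for
all 0 ≤ k ≤ 2n. Expanding p(x) in powers of (x − a/b), we find that
\[ p(x) = (-1)^n \frac{b^n}{n!} \left(x - \frac{a}{b}\right)^n \sum_{m=0}^{n} \binom{n}{m} \left(\frac{a}{b}\right)^{n-m} \left(x - \frac{a}{b}\right)^m. \]

By Lemma 6.32,
\[ p(x) = \sum_{k=0}^{2n} \frac{p^{(k)}(a/b)}{k!} \left(x - \frac{a}{b}\right)^k. \]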

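Comparing the coefficients, we find that
\[ p^{(k)}\!\left(\frac{a}{b}\right) = \begin{cases} 0, & \text{if } 0 \leq k \leq n - 1, \\ (-1)^n \dfrac{k!}{n!} \dbinom{n}{k-n} a^{2n-k} b^{k-n}, & \text{if } n \leq k \leq 2n. \end{cases} \]

Hence, $p^{(k)}(a/b)$ is also an integer for all 0 ≤ k ≤ 2n.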
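
The integrality claimed in Lemma 6.43 can be checked for specific values of a, b and n. The sketch below is illustrative only, with hypothetical helper names: it expands $p(x) = x^n(a - bx)^n/n!$ by repeated polynomial multiplication, reads off $p^{(k)}(0) = k!\,c_k$, and compares with the closed form obtained in the proof.

```python
from fractions import Fraction
from math import comb, factorial


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def derivatives_at_zero(a, b, n):
    """[p^(k)(0) for k = 0, ..., 2n], where p(x) = x^n (a - b x)^n / n!."""
    poly = [Fraction(1, factorial(n))]
    for _ in range(n):
        # x * (a - b x) = a x - b x^2
        poly = poly_mul(poly, [Fraction(0), Fraction(a), Fraction(-b)])
    return [factorial(k) * c for k, c in enumerate(poly)]


def closed_form(a, b, n, k):
    """The value of p^(k)(0) given in the proof of Lemma 6.43."""
    if k < n:
        return Fraction(0)
    return Fraction(factorial(k), factorial(n)) * comb(n, k - n) * a ** (2 * n - k) * (-b) ** (k - n)


print(derivatives_at_zero(22, 7, 3))
```

The values a = 22 and b = 7 are arbitrary sample inputs; the lemma applies to any positive integers.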

Now we can prove the theorem.

Theorem 6.44 Irrationality of π


The number π is irrational.

Proof
Assume that π is a rational number. Then there are positive integers a and
b such that
\[ \pi = \frac{a}{b}. \]
For n ∈ Z+, define the polynomial $p_n(x)$ by
\[ p_n(x) = \frac{x^n (a - bx)^n}{n!} = \frac{b^n x^n (\pi - x)^n}{n!}, \]
and let
\[ I_n = \int_0^{\pi} p_n(x) \sin x\,dx. \]

Take $f(x) = p_n(x)$, $g(x) = \cos x$ and α = 0, β = π in Lemma 6.42. Since
$p_n(x)$ is a polynomial of degree 2n, we find that $p_n^{(2n+1)}(x) = 0$. On the
other hand, for all k ≥ 0,
\begin{align*}
g^{(4k)}(x) &= \cos x, & g^{(4k+1)}(x) &= -\sin x, \\
g^{(4k+2)}(x) &= -\cos x, & g^{(4k+3)}(x) &= \sin x.
\end{align*}

In particular, $g^{(2n+1)}(x) = (-1)^{n-1} \sin x$. From (6.21), we have
\[ I_n = (-1)^{n-1} \left\{ \sum_{k=0}^{2n} (-1)^k p_n^{(k)}(\pi) g^{(2n-k)}(\pi) - \sum_{k=0}^{2n} (-1)^k p_n^{(k)}(0) g^{(2n-k)}(0) \right\}. \tag{6.22} \]

By Lemma 6.43, $p_n^{(k)}(0)$ and $p_n^{(k)}(\pi)$ are integers for all 0 ≤ k ≤ 2n. Since

sin 0 = 0, sin π = 0, cos 0 = 1, cos π = −1

are integers, we find that $g^{(k)}(0)$ and $g^{(k)}(\pi)$ are integers for all k ≥ 0. The
right hand side of (6.22) shows that $I_n$ is an integer for all n ∈ Z+. On the
other hand, for all 0 ≤ x ≤ π,
\[ 0 \leq x(\pi - x) \leq \left(\frac{\pi}{2}\right)^2. \]
Therefore, for 0 ≤ x ≤ π,
\[ 0 \leq p_n(x) \leq \frac{1}{n!} \left(\frac{\pi^2 b}{4}\right)^n. \]

Since we also have 0 ≤ sin x ≤ 1 for all 0 ≤ x ≤ π, we conclude that
\[ 0 \leq I_n = \int_0^{\pi} p_n(x) \sin x\,dx \leq \frac{\pi}{n!} \left(\frac{\pi^2 b}{4}\right)^n. \]
Because the series $\sum_{n=0}^{\infty} \dfrac{1}{n!} \left(\dfrac{\pi^2 b}{4}\right)^n$ is convergent, we find that
\[ \lim_{n \to \infty} \frac{1}{n!} \left(\frac{\pi^2 b}{4}\right)^n = 0. \]

Therefore, there is a positive integer N such that for all n ≥ N,
\[ \frac{1}{n!} \left(\frac{\pi^2 b}{4}\right)^n < \frac{1}{\pi}, \]
which gives
\[ 0 \leq I_n < 1 \quad \text{for all } n \geq N. \]

We only need the n = N case now. Since $I_N$ is an integer, we must have
$I_N = 0$. However, since $p_N(x) \sin x$ is a continuous function and it is positive
on (0, π), by Example 4.18,
\[ I_N = \int_0^{\pi} p_N(x) \sin x\,dx \]
cannot be zero. This gives a contradiction. Hence, π must be an irrational
number.

6.6.3 Infinitely Differentiable Functions that are Non-Analytic

We consider the function f : R → R defined by
\[ f(x) = \begin{cases} \exp\left(-\dfrac{1}{x^2}\right), & \text{if } x \neq 0, \\ 0, & \text{if } x = 0. \end{cases} \]

We will show that this function is infinitely differentiable and $f^{(n)}(0) = 0$ for all
n ≥ 0.

 
Figure 6.9: The function $f(x) = \exp\left(-\dfrac{1}{x^2}\right)$.

Let us first prove the following lemma.



Lemma 6.45
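If p(x) is a polynomial, then
\[ \lim_{x \to 0^+} p\left(\frac{1}{x}\right) \exp\left(-\frac{1}{x}\right) = 0, \tag{6.23a} \]
\[ \lim_{x \to 0} p\left(\frac{1}{x}\right) \exp\left(-\frac{1}{x^2}\right) = 0. \tag{6.23b} \]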

Proof
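In Example 3.24, we have shown that for any real number s, $\lim_{y \to \infty} y^s e^{-y} = 0$.
From this, we find that if k is an integer,
\[ \lim_{x \to 0^+} \frac{1}{x^k} \exp\left(-\frac{1}{x}\right) = \lim_{y \to \infty} y^k e^{-y} = 0, \]
and
\[ \lim_{x \to 0} \frac{1}{x^{2k}} \exp\left(-\frac{1}{x^2}\right) = \lim_{y \to \infty} y^k e^{-y} = 0. \]

The latter one implies that for any integer k,
\[ \lim_{x \to 0} \frac{1}{x^{2k-1}} \exp\left(-\frac{1}{x^2}\right) = \left[\lim_{x \to 0} x\right] \left[\lim_{x \to 0} \frac{1}{x^{2k}} \exp\left(-\frac{1}{x^2}\right)\right] = 0. \]

These prove (6.23).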

Next, we prove the following.

Theorem 6.46
Let f : R → R be the function defined by
\[ f(x) = \begin{cases} \exp\left(-\dfrac{1}{x^2}\right), & \text{if } x \neq 0, \\ 0, & \text{if } x = 0. \end{cases} \tag{6.24} \]

Then f is an infinitely differentiable function with $f^{(n)}(0) = 0$ for all n ≥ 0.

Proof
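We claim that for each positive integer n, there is a polynomial $p_n(x)$ of
degree 3n such that
\[ f^{(n)}(x) = \begin{cases} p_n\left(\dfrac{1}{x}\right) \exp\left(-\dfrac{1}{x^2}\right), & \text{if } x \neq 0, \\ 0, & \text{if } x = 0. \end{cases} \tag{6.25} \]

This will show that f is infinitely differentiable. In fact, Lemma 6.45
implies that
\[ \lim_{x \to 0} f^{(n)}(x) = \lim_{x \to 0} p_n\left(\frac{1}{x}\right) \exp\left(-\frac{1}{x^2}\right) = 0 = f^{(n)}(0), \]
which says that $f^{(n)}(x)$ is continuous at x = 0. We will prove (6.25) by
induction on n. When n = 1, we find from the definition (6.24) that
\[ f'(x) = \frac{2}{x^3} \exp\left(-\frac{1}{x^2}\right) \quad \text{when } x \neq 0. \]

When x = 0, we apply Lemma 6.45 to get
\[ f'(0) = \lim_{x \to 0} \frac{\exp\left(-\dfrac{1}{x^2}\right) - 0}{x} = \lim_{x \to 0} \frac{1}{x} \exp\left(-\frac{1}{x^2}\right) = 0. \]

Therefore, the n = 1 statement is true, with $p_1(x)$ the polynomial of degree 3
given by
\[ p_1(x) = 2x^3. \]
Assume that the statement is true for the n − 1 case. This means that there
is a polynomial $p_{n-1}(x)$ of degree 3n − 3 such that
\[ f^{(n-1)}(x) = \begin{cases} p_{n-1}\left(\dfrac{1}{x}\right) \exp\left(-\dfrac{1}{x^2}\right), & \text{if } x \neq 0, \\ 0, & \text{if } x = 0. \end{cases} \]

When x ≠ 0,
\[ f^{(n)}(x) = \left[ -\frac{1}{x^2} p_{n-1}'\left(\frac{1}{x}\right) + \frac{2}{x^3} p_{n-1}\left(\frac{1}{x}\right) \right] \exp\left(-\frac{1}{x^2}\right). \]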

This shows that when x ≠ 0,
\[ f^{(n)}(x) = p_n\left(\frac{1}{x}\right) \exp\left(-\frac{1}{x^2}\right), \]
where
\[ p_n(x) = -x^2 p_{n-1}'(x) + 2x^3 p_{n-1}(x). \]
By the inductive hypothesis, $p_{n-1}'(x)$ is a polynomial of degree 3n − 4. Thus,
$x^2 p_{n-1}'(x)$ is a polynomial of degree 3n − 2. Since $2x^3 p_{n-1}(x)$ is a
polynomial of degree 3n, $p_n(x)$ is a polynomial of degree 3n. For the
derivative at 0, Lemma 6.45 implies that
\[ f^{(n)}(0) = \lim_{x \to 0} \frac{f^{(n-1)}(x) - f^{(n-1)}(0)}{x} = \lim_{x \to 0} \frac{1}{x} p_{n-1}\left(\frac{1}{x}\right) \exp\left(-\frac{1}{x^2}\right) = 0. \]

This proves the statement for the n case, and thus completes the induction.
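
The recursion $p_n(x) = -x^2 p_{n-1}'(x) + 2x^3 p_{n-1}(x)$ from this proof is easy to run on coefficient lists, which makes the degree claim visible. The following is a minimal sketch; the function names `next_poly`, `p` and `degree` are hypothetical.

```python
def next_poly(prev):
    """Given the coefficients of p_{n-1}, return those of
    p_n(x) = -x^2 * p_{n-1}'(x) + 2 x^3 * p_{n-1}(x)."""
    out = [0] * (len(prev) + 3)
    for k, c in enumerate(prev):
        if k >= 1:
            out[k + 1] -= k * c  # -x^2 * (k c x^(k-1)) = -k c x^(k+1)
        out[k + 3] += 2 * c      # 2 x^3 * (c x^k) = 2 c x^(k+3)
    return out


def p(n):
    """Coefficients (lowest degree first) of p_n, starting from p_1(x) = 2x^3."""
    coeffs = [0, 0, 0, 2]
    for _ in range(n - 1):
        coeffs = next_poly(coeffs)
    return coeffs


def degree(coeffs):
    """Index of the last nonzero coefficient."""
    return max(k for k, c in enumerate(coeffs) if c != 0)


print(p(2))                                  # coefficients of -6x^4 + 4x^6
print([degree(p(n)) for n in range(1, 7)])   # degrees 3n
```

The leading coefficient doubles at each step, so it never vanishes and the degree increases by exactly 3.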

Now we prove our main theorem in this section.

Theorem 6.47
Let I be an open interval that contains the point x0 . There is an infinitely
differentiable function f : I → R whose Taylor series at the point x = x0
is convergent pointwise on I, but it does not converge to f (x) pointwise on
I.

Proof
Define the function $f : I \to \mathbb{R}$ by

$$f(x) = \begin{cases} \exp\left(-\dfrac{1}{(x - x_0)^2}\right), & \text{if } x \neq x_0,\\[2mm] 0, & \text{if } x = x_0. \end{cases}$$

Theorem 6.46 implies that the function $f(x)$ is infinitely differentiable on $I$, and $f^{(n)}(x_0) = 0$ for all $n \geq 0$. Hence, the Taylor series of $f$ at $x = x_0$,

$$\sum_{n=0}^{\infty}\frac{f^{(n)}(x_0)}{n!}(x - x_0)^n,$$

is the series that is identically 0. Therefore, it converges everywhere, but it does not converge to $f(x)$ except at the point $x = x_0$.
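
A small numerical illustration may help here. This is only a sketch in floating-point arithmetic, with the hypothetical names `flat` and `taylor_partial_sum`; it evaluates f and a partial sum of its Taylor series at x0, whose coefficients are all zero by Theorem 6.46.

```python
import math


def flat(x, x0=0.0):
    """f(x) = exp(-1/(x - x0)^2) for x != x0, and f(x0) = 0."""
    t = (x - x0) ** 2
    if t == 0.0:  # x == x0, or the square underflowed to zero
        return 0.0
    return math.exp(-1.0 / t)


def taylor_partial_sum(x, x0=0.0, terms=10):
    """Partial sum of the Taylor series at x0; each coefficient f^(n)(x0)/n! is 0."""
    return sum(0.0 * (x - x0) ** n for n in range(terms))
```

At any x different from x0 the first function returns a positive number (as long as it does not underflow), while the second returns 0 regardless of how many terms are taken.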

Using almost the same proof as for Theorem 6.46, we obtain the following.

Theorem 6.48
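Given a real number $x_0$, the function $g : \mathbb{R} \to \mathbb{R}$ defined by

$$g(x) = \begin{cases} \exp\left(-\dfrac{1}{x - x_0}\right), & \text{if } x > x_0,\\[2mm] 0, & \text{if } x \leq x_0, \end{cases} \tag{6.26}$$

is infinitely differentiable.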

The function g(x) defined by (6.26) is also not analytic. Nevertheless, it has
some important applications. It is often used to smooth out a function or to
truncate a function smoothly.

Theorem 6.49
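Given two real numbers $a$ and $b$ with $a < b$, let $g$ be the function defined by (6.26) with $x_0 = 0$, and define the function $h : \mathbb{R} \to \mathbb{R}$ by

$$h(x) = \frac{g(x - a)}{g(x - a) + g(b - x)} = \begin{cases} 0, & \text{if } x \leq a,\\[2mm] \dfrac{\exp\left(-\dfrac{1}{x - a}\right)}{\exp\left(-\dfrac{1}{x - a}\right) + \exp\left(-\dfrac{1}{b - x}\right)}, & \text{if } a < x < b,\\[4mm] 1, & \text{if } x \geq b. \end{cases} \tag{6.27}$$

Then $h$ is a function that is infinitely differentiable.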



Proof
We just need to show that g(x − a) + g(b − x) is nonzero for all x ∈ R.
The rest follows from Theorem 6.48 and the definition of h(x). Since the
function g is nonnegative, in order for g(x − a) + g(b − x) = 0, we must
have g(x − a) = g(b − x) = 0. But we know that g(x − a) = 0 only when
x ≤ a, and g(b − x) = 0 only when x ≥ b. Since the set {x | x ≤ a} and
the set {x | x ≥ b} are disjoint, we conclude that g(x − a) + g(b − x) is
never 0.
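
The construction of h translates directly into code. The sketch below assumes x0 = 0 in (6.26) and uses the illustrative names `g` and `smooth_step`; it works in floating point, so the denominator can underflow to zero when b − a is very small (roughly below 0.003), even though it is never zero mathematically.

```python
import math


def g(x):
    """g from (6.26) with x0 = 0: exp(-1/x) for x > 0, and 0 for x <= 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


def smooth_step(x, a, b):
    """h from (6.27): 0 for x <= a, 1 for x >= b, smooth in between."""
    if not a < b:
        raise ValueError("need a < b")
    num = g(x - a)
    den = num + g(b - x)  # nonzero mathematically because a < b
    return num / den
```

For example, `smooth_step(x, 0.0, 1.0)` vanishes for x ≤ 0, equals 1 for x ≥ 1, and passes through 1/2 at the midpoint by symmetry.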

Figure 6.10: The function g(x) defined by (6.26) when x0 = 0.

Figure 6.11: The function h(x) defined by (6.27).

Remark 6.4
The function h(x) defined by (6.27) is an example of an infinitely
differentiable function that is increasing but assumes constant values outside
a bounded interval.

6.6.4 A Continuous Function that is Nowhere Differentiable

In this section, we want to construct a continuous function f : R → R which is
not differentiable at any point. The main ingredient in the construction is the
observation that the function g : R → R, g(x) = |x| is continuous, but it is not
differentiable at x = 0.

Definition 6.15 The function hm


For any positive number $m$, let $h_m : \mathbb{R} \to \mathbb{R}$ be the function defined by

$$h_m(x) = |x| \quad \text{for all } -m \leq x \leq m,$$

and

$$h_m(x + 2m) = h_m(x) \quad \text{for all } x \in \mathbb{R}.$$

Figure 6.12: The functions $h_m(x)$ when $m = 1$, $\frac{1}{2}$ and $\frac{1}{4}$.

Let us first explore the properties of the function hm .

Lemma 6.50
Given a positive number $m$, define $x_{m,k} = mk$ for all $k \in \mathbb{Z}$. The function $h_m$ defined in Definition 6.15 has the following properties.

(a) $h_m$ is a continuous even function that is periodic of period $2m$.

(b) For $k \in \mathbb{Z}$, the graph of $h_m : [x_{m,2k}, x_{m,2k+1}] \to \mathbb{R}$ is a straight line segment of slope 1, while the graph of $h_m : [x_{m,2k-1}, x_{m,2k}] \to \mathbb{R}$ is a straight line segment of slope −1. Hence, the graph of $h_m$ is a union of straight line segments whose slopes alternate between 1 and −1.

(c) $0 \leq h_m(x) \leq m$ for all $x \in \mathbb{R}$.



Proof
Part (a) follows from hm (−m) = hm (m). Part (b) can be proved by
induction on k ≥ 0, using the periodicity of hm and the fact that hm is
an even function. Part (c) follows from the definition of hm and periodicity.
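
As a concrete companion to Definition 6.15 and Lemma 6.50, here is a minimal floating-point sketch of the function h_m. The name `h` and the reduction via `math.fmod` are choices made for this illustration only.

```python
import math


def h(m, x):
    """h_m(x): equal to |x| on [-m, m] and extended with period 2m."""
    if m <= 0:
        raise ValueError("m must be positive")
    r = abs(math.fmod(x, 2 * m))  # h_m is even and 2m-periodic, so 0 <= r < 2m
    return r if r <= m else 2 * m - r
```

The two branches correspond to the segments of slope 1 and slope −1 in part (b) of the lemma, and the returned value always lies between 0 and m, as in part (c).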

Lemma 6.51
Given a positive number $\ell$ and a point $x \in \mathbb{R}$, let

$$U = [x - \ell/2, x] \quad \text{and} \quad V = [x, x + \ell/2].$$

For a positive number $m$, let $h_m : \mathbb{R} \to \mathbb{R}$ be the function defined in Definition 6.15. Then one of the following holds.

(a) For each nonnegative integer $k$, the graph of $h_{2^k\ell} : U \to \mathbb{R}$ is a line segment of slope 1 or −1.

(b) For each nonnegative integer $k$, the graph of $h_{2^k\ell} : V \to \mathbb{R}$ is a line segment of slope 1 or −1.

Proof
The points nℓ, n ∈ Z, partition the real line into subintervals of the form
[nℓ, (n + 1)ℓ], each of length ℓ. Since U and V are adjacent intervals of
length ℓ/2, one of them must lie entirely inside one of the intervals of the
form [nℓ, (n + 1)ℓ].
By part (b) in Lemma 6.50, the graph of $h_\ell : [n\ell, (n + 1)\ell] \to \mathbb{R}$ is a line segment of slope 1 or −1, and so the same is true on whichever of $U$ or $V$ lies inside this interval. This proves the assertion when $k = 0$. To prove the assertion for $k \geq 1$, we notice that to obtain the graph of the function $h_\ell$ from the graph of the function $h_{2\ell}$, we divide each line segment in the graph of $h_{2\ell}$ into two equal parts, and one of the parts changes slope from 1 to −1 or from −1 to 1. Hence, if $W$ is an interval and the graph of $h_\ell : W \to \mathbb{R}$ is a line segment, the graph of $h_{2^k\ell} : W \to \mathbb{R}$ must also be a line segment for any $k \in \mathbb{Z}^+$. This completes the proof of the lemma.

Now we can prove the main theorem in this section.



Theorem 6.52
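For a positive number $m$, let $h_m : \mathbb{R} \to \mathbb{R}$ be the function

$$h_m(x) = |x| \ \text{ for } |x| \leq m, \quad \text{and} \quad h_m(x + 2m) = h_m(x) \ \text{ for all } x \in \mathbb{R}.$$

For $n \geq 0$, let $g_n : \mathbb{R} \to \mathbb{R}$ be the function defined by $g_n(x) = h_{m_n}(x)$, with $m_n = \dfrac{1}{4^n}$. Then the series $\displaystyle\sum_{n=0}^{\infty} g_n(x)$ converges uniformly to a function $f : \mathbb{R} \to \mathbb{R}$,

$$f(x) = \sum_{n=0}^{\infty} g_n(x).$$

$f(x)$ is a continuous function that is not differentiable at any point.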

Proof
By Lemma 6.50, for each integer $n \geq 0$, the function $g_n : \mathbb{R} \to \mathbb{R}$ is continuous and

$$|g_n(x)| \leq \frac{1}{4^n} \quad \text{for all } x \in \mathbb{R}.$$

Since the series $\displaystyle\sum_{n=0}^{\infty}\frac{1}{4^n}$ is convergent, the Weierstrass $M$-test implies that the series $\displaystyle\sum_{n=0}^{\infty} g_n(x)$ converges uniformly on $\mathbb{R}$. Since each $g_n(x)$ is a continuous function, Corollary 6.10 implies that the function $f(x) = \displaystyle\sum_{n=0}^{\infty} g_n(x)$ is continuous.

Now we are left to prove that $f(x)$ is not differentiable at any $x \in \mathbb{R}$. Given $x_0 \in \mathbb{R}$, assume that

$$f'(x_0) = \lim_{x\to x_0}\frac{f(x) - f(x_0)}{x - x_0}$$

exists. Then for any sequence $\{x_k\}_{k=0}^{\infty}$ in $\mathbb{R} \setminus \{x_0\}$, if $\displaystyle\lim_{k\to\infty} x_k = x_0$, then the limit

$$\lim_{k\to\infty}\frac{f(x_k) - f(x_0)}{x_k - x_0}$$

exists and is equal to $f'(x_0)$.

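We construct a sequence $\{x_k\}_{k=0}^{\infty}$ as follows. For each integer $k \geq 0$, we apply Lemma 6.51 with $\ell = m_k$. Since $m_n = 2^{2(k-n)} m_k$ for $0 \leq n \leq k$, the lemma implies that either the graph of $h_{m_n} : [x_0 - m_k/2, x_0] \to \mathbb{R}$ is a line segment with slope 1 or −1 for every $0 \leq n \leq k$, or the graph of $h_{m_n} : [x_0, x_0 + m_k/2] \to \mathbb{R}$ is a line segment with slope 1 or −1 for every $0 \leq n \leq k$. In the former case, we let $x_k = x_0 - m_k/2$. In the latter case, we let $x_k = x_0 + m_k/2$. In any case, we find that

$$|x_k - x_0| = \frac{m_k}{2} = \frac{1}{2^{2k+1}} \quad \text{for all } k \geq 0.$$

This shows that $\{x_k\}$ is a sequence in $\mathbb{R} \setminus \{x_0\}$ that converges to $x_0$.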
For fixed $k \geq 0$, $m_k/2$ is a multiple of $2m_n$ for all $n \geq k + 1$. By periodicity of $h_{m_n}$,

$$g_n(x_k) - g_n(x_0) = h_{m_n}\left(x_0 \pm \frac{m_k}{2}\right) - h_{m_n}(x_0) = 0 \quad \text{for all } n > k.$$

This implies that

$$\frac{f(x_k) - f(x_0)}{x_k - x_0} = \sum_{n=0}^{\infty}\frac{g_n(x_k) - g_n(x_0)}{x_k - x_0} = \sum_{n=0}^{k}\frac{g_n(x_k) - g_n(x_0)}{x_k - x_0}.$$

By the definition of $x_k$ and Lemma 6.51,

$$\frac{g_n(x_k) - g_n(x_0)}{x_k - x_0}$$

is equal to 1 or −1 for each $0 \leq n \leq k$. The sum of an odd number of terms, each equal to 1 or −1, must be odd. The sum of an even number of such terms must be even. Therefore,

$$c_k = \frac{f(x_k) - f(x_0)}{x_k - x_0},$$

being a sum of $k + 1$ such terms, is odd when $k$ is even, and is even when $k$ is odd. This implies that the sequence $\{c_k\}_{k=0}^{\infty}$ is an integer sequence that is alternately odd and even. A convergent sequence of integers must be eventually constant, so $\{c_k\}$ does not have a limit. This is a contradiction, which allows us to conclude that $f$ cannot be differentiable at $x_0$.
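
The parity argument can be observed numerically. The following sketch uses exact rational arithmetic through `fractions.Fraction`, so it applies only to rational x0; the function names and the `extra_terms` parameter are assumptions made for this illustration. It picks x_k as in the proof (the left half-interval if it fits inside one cell of width m_k, otherwise the right one) and evaluates the difference quotient c_k.

```python
from fractions import Fraction
from math import floor


def h(m, x):
    """h_m(x) for rational m > 0 and rational x."""
    r = abs(x) % (2 * m)
    return r if r <= m else 2 * m - r


def g(n, x):
    """g_n(x) = h_{m_n}(x) with m_n = 1/4^n."""
    return h(Fraction(1, 4 ** n), x)


def choose_xk(x0, k):
    """Pick x_k = x0 -/+ m_k/2 so that [x0, x_k] lies in one cell [j m_k, (j+1) m_k]."""
    ell = Fraction(1, 4 ** k)
    lo = x0 - ell / 2
    j = floor(lo / ell)
    if x0 <= (j + 1) * ell:   # the left half-interval fits in one cell
        return lo
    return x0 + ell / 2       # otherwise the right half-interval does


def difference_quotient(x0, k, extra_terms=5):
    """c_k, computed from the terms n = 0, ..., k + extra_terms of the series."""
    xk = choose_xk(x0, k)
    total = sum(g(n, xk) - g(n, x0) for n in range(k + 1 + extra_terms))
    return total / (xk - x0)
```

Including a few terms beyond n = k is deliberate: by periodicity they contribute nothing, so the result is an integer whose parity matches that of k + 1, in line with the proof.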

Figure 6.13: The functions gn (x) for n = 0, 1, 2, 3, and the function f (x).

6.6.5 The Weierstrass Approximation Theorem

In this section, we prove the Weierstrass approximation theorem using Bernstein's
ingenious approach. We start with a lemma.

Lemma 6.53
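The following identities hold.

(a) For $n \geq 0$, $\displaystyle\sum_{k=0}^{n}\binom{n}{k}x^k(1 - x)^{n-k} = 1$.

(b) For $n \geq 1$, $\displaystyle\sum_{k=1}^{n}\frac{k}{n}\binom{n}{k}x^k(1 - x)^{n-k} = x$.

(c) For $n \geq 2$, $\displaystyle\sum_{k=1}^{n}\frac{k^2}{n^2}\binom{n}{k}x^k(1 - x)^{n-k} = x^2 + \frac{x(1 - x)}{n}$.

(d) For $n \geq 2$, $\displaystyle\sum_{k=0}^{n}\left(x - \frac{k}{n}\right)^2\binom{n}{k}x^k(1 - x)^{n-k} = \frac{x(1 - x)}{n}$.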

Proof
The first identity (a) is just a consequence of the binomial expansion
theorem.

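For the identity in (b), notice that when $n \geq k \geq 1$,

$$\frac{k}{n}\binom{n}{k} = \frac{(n - 1)!}{(k - 1)!(n - k)!} = \binom{n-1}{k-1}.$$

Therefore,

$$\begin{aligned}\sum_{k=1}^{n}\frac{k}{n}\binom{n}{k}x^k(1 - x)^{n-k} &= x\sum_{k=1}^{n}\binom{n-1}{k-1}x^{k-1}(1 - x)^{n-k}\\ &= x\sum_{k=0}^{n-1}\binom{n-1}{k}x^k(1 - x)^{n-1-k} = x.\end{aligned}$$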

For part (c), we find that when $n \geq k \geq 2$,

$$\frac{k(k - 1)}{n(n - 1)}\binom{n}{k} = \frac{(n - 2)!}{(k - 2)!(n - k)!} = \binom{n-2}{k-2}.$$

It follows that

$$\sum_{k=2}^{n}\frac{k(k - 1)}{n(n - 1)}\binom{n}{k}x^k(1 - x)^{n-k} = x^2\sum_{k=0}^{n-2}\binom{n-2}{k}x^k(1 - x)^{n-2-k} = x^2.$$

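Writing $k^2 = k(k - 1) + k$, we have

$$\begin{aligned}\sum_{k=1}^{n}\frac{k^2}{n^2}\binom{n}{k}x^k(1 - x)^{n-k} &= \sum_{k=1}^{n}\frac{k(k - 1)}{n^2}\binom{n}{k}x^k(1 - x)^{n-k} + \sum_{k=1}^{n}\frac{k}{n^2}\binom{n}{k}x^k(1 - x)^{n-k}\\ &= \frac{n - 1}{n}x^2 + \frac{1}{n}x = x^2 + \frac{x(1 - x)}{n}.\end{aligned}$$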
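For the identity in part (d), a straightforward computation gives

$$\begin{aligned}\sum_{k=0}^{n}\left(x - \frac{k}{n}\right)^2\binom{n}{k}x^k(1 - x)^{n-k} &= \sum_{k=0}^{n}\left(x^2 - \frac{2k}{n}x + \frac{k^2}{n^2}\right)\binom{n}{k}x^k(1 - x)^{n-k}\\ &= x^2 - 2x^2 + x^2 + \frac{x(1 - x)}{n} = \frac{x(1 - x)}{n}.\end{aligned}$$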
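
The four identities of Lemma 6.53 can be spot-checked with exact rational arithmetic. This is a minimal sketch; the names `bernstein_weight` and `moments` are introduced here for illustration, and the check covers only the chosen values of n and x, so it is not a substitute for the proof.

```python
from fractions import Fraction
from math import comb


def bernstein_weight(n, k, x):
    """The term C(n, k) x^k (1 - x)^(n - k)."""
    return comb(n, k) * x ** k * (1 - x) ** (n - k)


def moments(n, x):
    """Return the left-hand sides of (a), (b), (c), (d) in Lemma 6.53."""
    w = [bernstein_weight(n, k, x) for k in range(n + 1)]
    s_a = sum(w)
    s_b = sum(Fraction(k, n) * w[k] for k in range(n + 1))
    s_c = sum(Fraction(k, n) ** 2 * w[k] for k in range(n + 1))
    s_d = sum((x - Fraction(k, n)) ** 2 * w[k] for k in range(n + 1))
    return s_a, s_b, s_c, s_d
```

Passing x as a `Fraction` keeps every sum exact, so the results can be compared with the right-hand sides 1, x, x² + x(1 − x)/n and x(1 − x)/n using equality rather than a tolerance.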

Definition 6.16 Bernstein Basis Polynomials


For any positive integer $n$, there are $n + 1$ Bernstein basis polynomials given by

$$p_{n,k}(x) = \binom{n}{k}x^k(1 - x)^{n-k}, \quad 0 \leq k \leq n.$$

 
Figure 6.14: The polynomials $p_{n,k}(x) = \binom{n}{k}x^k(1 - x)^{n-k}$ when $n = 2$ and $n = 3$, for all $0 \leq k \leq n$.

 
Figure 6.15: The polynomials $p_{n,k}(x) = \binom{n}{k}x^k(1 - x)^{n-k}$ when $n = 4$ and $n = 5$, for all $0 \leq k \leq n$.

Now we come to our main theorem.



 
Figure 6.16: The polynomials $p_{n,k}(x) = \binom{n}{k}x^k(1 - x)^{n-k}$ when $n = 6$ and $n = 7$, for all $0 \leq k \leq n$.

Theorem 6.54 Weierstrass Approximation Theorem

Let $f : [a, b] \to \mathbb{R}$ be a continuous function defined on $[a, b]$. Given $\varepsilon > 0$, there is a polynomial $p(x)$ such that

$$|f(x) - p(x)| < \varepsilon \quad \text{for all } x \in [a, b].$$

Proof
We first consider the case where $[a, b] = [0, 1]$. Since $f : [0, 1] \to \mathbb{R}$ is continuous on a closed and bounded interval, it is uniformly continuous and bounded. The boundedness of $f$ implies that there is a positive number $M$ such that

$$|f(x)| \leq M \quad \text{for all } x \in [0, 1].$$

Given $\varepsilon > 0$, since $f$ is uniformly continuous, there is a $\delta > 0$ such that for all $x_1$ and $x_2$ in $[0, 1]$, if $|x_1 - x_2| < \delta$, then

$$|f(x_1) - f(x_2)| < \frac{\varepsilon}{2}.$$
For any positive integer $n$, we define $p_n(x)$ to be the polynomial of degree at most $n$ given by the following linear combination of Bernstein basis polynomials:

$$p_n(x) = \sum_{k=0}^{n} f\left(\frac{k}{n}\right)p_{n,k}(x) = \sum_{k=0}^{n} f\left(\frac{k}{n}\right)\binom{n}{k}x^k(1 - x)^{n-k}.$$
Let us estimate the supremum of $|f(x) - p_n(x)|$ on $[0, 1]$. For fixed $x \in [0, 1]$, part (a) in Lemma 6.53 implies that

$$f(x) - p_n(x) = \sum_{k=0}^{n}\left[f(x) - f\left(\frac{k}{n}\right)\right]\binom{n}{k}x^k(1 - x)^{n-k}.$$

Since $x^k(1 - x)^{n-k} \geq 0$ for all $x \in [0, 1]$ and all $n \geq k \geq 0$, the triangle inequality gives

$$|f(x) - p_n(x)| \leq \sum_{k=0}^{n}\left|f(x) - f\left(\frac{k}{n}\right)\right|\binom{n}{k}x^k(1 - x)^{n-k}.$$

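For $0 \leq k \leq n$, if $\left|x - \dfrac{k}{n}\right| < \delta$, then

$$\left|f(x) - f\left(\frac{k}{n}\right)\right| < \frac{\varepsilon}{2}.$$

If $\left|x - \dfrac{k}{n}\right| \geq \delta$, then

$$\left|f(x) - f\left(\frac{k}{n}\right)\right| \leq |f(x)| + \left|f\left(\frac{k}{n}\right)\right| \leq 2M \leq \frac{2M}{\delta^2}\left(x - \frac{k}{n}\right)^2.$$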

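In any case, we find that

$$\left|f(x) - f\left(\frac{k}{n}\right)\right| < \frac{\varepsilon}{2} + \frac{2M}{\delta^2}\left(x - \frac{k}{n}\right)^2 \quad \text{for all } 0 \leq k \leq n.$$

Therefore,

$$|f(x) - p_n(x)| < \sum_{k=0}^{n}\left[\frac{\varepsilon}{2} + \frac{2M}{\delta^2}\left(x - \frac{k}{n}\right)^2\right]\binom{n}{k}x^k(1 - x)^{n-k}.$$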

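By parts (a) and (d) of Lemma 6.53, and the fact that

$$0 \leq x(1 - x) \leq \frac{1}{4} \quad \text{for all } 0 \leq x \leq 1,$$

we find that

$$|f(x) - p_n(x)| < \frac{\varepsilon}{2} + \frac{2M}{\delta^2}\,\frac{x(1 - x)}{n} \leq \frac{\varepsilon}{2} + \frac{M}{2\delta^2 n}.$$

If $n \geq \dfrac{M}{\varepsilon\delta^2}$, then $\dfrac{M}{2\delta^2 n} \leq \dfrac{\varepsilon}{2}$. For any such $n$, we find that

$$|f(x) - p_n(x)| < \varepsilon \quad \text{for all } 0 \leq x \leq 1.$$

This completes the proof when $[a, b] = [0, 1]$.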


For general $[a, b]$, let $u : [0, 1] \to \mathbb{R}$ be the polynomial function $u(t) = a + t(b - a)$. This is a continuous function mapping $[0, 1]$ bijectively onto $[a, b]$. The inverse is the continuous function $u^{-1}(x) = \dfrac{x - a}{b - a}$. The function $g = f \circ u : [0, 1] \to \mathbb{R}$, being a composition of continuous functions, is continuous. By what we have proved above, given $\varepsilon > 0$, there is a polynomial $q(t)$ so that

$$|f(u(t)) - q(t)| < \varepsilon \quad \text{for all } t \in [0, 1].$$

Let

$$p(x) = q(u^{-1}(x)) = q\left(\frac{x - a}{b - a}\right).$$

Then $p(x)$ is also a polynomial, and $p(u(t)) = q(t)$. Therefore,

|f (u(t)) − p(u(t))| < ε for all t ∈ [0, 1],

which implies that

|f (x) − p(x)| < ε for all x ∈ [a, b].

This completes the proof of the Weierstrass approximation theorem for the
general case.
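
The construction in this proof is explicit enough to run. Below is a minimal floating-point sketch, with illustrative names (`bernstein`, `bernstein_on`, `max_error`) that do not come from the text. The error is estimated on a finite grid only, so it indicates the behaviour rather than proving a bound, and for very large n the binomial coefficients would exceed the floating-point range.

```python
from math import comb


def bernstein(f, n):
    """Return p_n(x) = sum_k f(k/n) C(n, k) x^k (1 - x)^(n - k) on [0, 1]."""
    coeffs = [f(k / n) * comb(n, k) for k in range(n + 1)]

    def p(x):
        return sum(c * x ** k * (1 - x) ** (n - k) for k, c in enumerate(coeffs))

    return p


def bernstein_on(f, a, b, n):
    """Approximate f on [a, b] through u(t) = a + t (b - a), as in the proof."""
    q = bernstein(lambda t: f(a + t * (b - a)), n)
    return lambda x: q((x - a) / (b - a))


def max_error(f, p, a, b, samples=201):
    """Largest |f - p| over an evenly spaced grid in [a, b] (a sampled estimate)."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - p(x)) for x in xs)
```

As a usage example, take f(x) = |x − 1/2| on [0, 1], for which M = 1/2 works and δ = ε/2 works because f is Lipschitz with constant 1. With ε = 0.2 the proof asks for n ≥ M/(εδ²) = 250, and `max_error(f, bernstein(f, 250), 0.0, 1.0)` can be compared with ε. In practice the observed error is much smaller than the bound, since the estimate in the proof is far from sharp.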

One cannot extend the Weierstrass approximation theorem to the case where $f : I \to \mathbb{R}$ is a continuous function defined on an unbounded interval $I$. This is because a non-constant polynomial $p(x)$ is unbounded on $I$: $p(x)$ approaches $\infty$ or $-\infty$ when $x$ approaches $\infty$ or $-\infty$. Hence, no polynomial can stay uniformly close to a non-constant bounded function on $I$. However, there are bounded continuous functions defined on unbounded intervals. For example, the function

$$f(x) = \frac{x}{x^2 + 1}$$

is a non-constant bounded continuous function defined on $\mathbb{R}$.

Figure 6.17: Approximations of the continuous function $f(x)$ by the polynomials $p_n(x)$, where $f(x) = 3\sin(4\pi|x - 1/3|) + 2\sin(6\pi|x - 3/4|)$.

Remark 6.5
In probability theory, a binomial random variable $X$ with parameters $n$ and $p$ counts the number of successes in $n$ independent and identical Bernoulli trials, each having probability $p \in (0, 1)$ of being a success. $X$ can take integer values between 0 and $n$. The probability that $X = k$ is

$$P(X = k) = \binom{n}{k}p^k(1 - p)^{n-k}, \quad 0 \leq k \leq n.$$

The identity in (a) of Lemma 6.53 amounts to

$$\sum_{k=0}^{n}\binom{n}{k}p^k(1 - p)^{n-k} = 1,$$

which reflects that the total probability is 1. The identity in part (b) gives

$$E(X) = \sum_{k=0}^{n} k\binom{n}{k}p^k(1 - p)^{n-k} = np,$$

which is the expected value of a binomial random variable $X$ with parameters $n$ and $p$. The identity in part (c) gives

$$E(X^2) = \sum_{k=0}^{n} k^2\binom{n}{k}p^k(1 - p)^{n-k} = n^2p^2 + np(1 - p).$$

Together with the identity in part (b), the variance of X is given by

Var (X) = E(X 2 ) − E(X)2 = n2 p2 + np(1 − p) − n2 p2 = np(1 − p).

In fact, the variance of a random variable X is defined as

Var (X) = E([X − E(X)]2 ).

The identity in part (d) of Lemma 6.53 is just another way of computing the variance. Using part (d), we have

$$\begin{aligned}\text{Var}(X) &= \sum_{k=0}^{n}(k - np)^2\binom{n}{k}p^k(1 - p)^{n-k}\\ &= n^2\sum_{k=0}^{n}\left(\frac{k}{n} - p\right)^2\binom{n}{k}p^k(1 - p)^{n-k} = np(1 - p).\end{aligned}$$

References

[Abb15] Stephen Abbott, Understanding analysis, second ed., Undergraduate Texts in Mathematics, Springer, New York, 2015. MR 3331079

[Apo74] Tom M. Apostol, Mathematical analysis, second ed., Addison-Wesley Publishing Co., Reading, Mass.-London-Don Mills, Ont., 1974. MR 0344384

[BS92] Robert G. Bartle and Donald R. Sherbert, Introduction to real analysis, second ed., John Wiley & Sons, Inc., New York, 1992. MR 1135107

[Fit09] Patrick M. Fitzpatrick, Advanced calculus, second ed., American Mathematical Society, 2009.

[Ros18] Kenneth Rosen, Discrete mathematics and its applications, eighth ed., McGraw Hill, 2018.

[Rud76] Walter Rudin, Principles of mathematical analysis, third ed., International Series in Pure and Applied Mathematics, McGraw-Hill Book Co., New York-Auckland-Düsseldorf, 1976. MR 0385023

[SCW20] James Stewart, Daniel K. Clegg, and Saleem Watson, Calculus, ninth ed., Cengage Learning, 2020.

[Tao14] Terence Tao, Analysis. II, third ed., Texts and Readings in Mathematics, vol. 38, Hindustan Book Agency, New Delhi, 2014. MR 3310023

[Tao16] Terence Tao, Analysis. I, third ed., Texts and Readings in Mathematics, vol. 37, Hindustan Book Agency, New Delhi; Springer, Singapore, 2016, Electronic edition of [MR3309891]. MR 3728289

[Zor15] Vladimir A. Zorich, Mathematical analysis. I, second ed., Universitext, Springer-Verlag, Berlin, 2015, With Appendices A–F and new problems translated by Octavio Paniagua T. MR 3495809

[Zor16] Vladimir A. Zorich, Mathematical analysis. II, second ed., Universitext, Springer, Heidelberg, 2016. MR 3445604
