
Introduction to Ordinary Differential Equations

Lecture notes for the course in Differential Equations taught at the master's degree in Statistics, University of Bologna, Italy

Daniele Ritelli

December 2, 2021
Contents

1 Gamma and Beta functions
    1.1 Gamma function
        1.1.1 Historical background
        1.1.2 Main properties of Γ
    1.2 Beta function
        1.2.1 Γ(1/2) and the probability integral
        1.2.2 Legendre duplication formula
        1.2.3 Euler reflexion formula
    1.3 Definite integrals
    1.4 Double integration techniques

2 Sequences and series of functions
    2.1 Sequences and series of real or complex numbers
    2.2 Sequences of functions
    2.3 Uniform convergence
    2.4 Series of functions
    2.5 Power series: radius of convergence
    2.6 Taylor–MacLaurin series
        2.6.1 Binomial series
        2.6.2 The error function
        2.6.3 Abel theorem and series summation
    2.7 Basel problem
    2.8 Extension of elementary functions to the complex field
        2.8.1 Complex exponential
        2.8.2 Complex goniometric hyperbolic functions
        2.8.3 Complex logarithm
    2.9 Exercises
        2.9.1 Solved exercises
        2.9.2 Unsolved exercises

3 Multidimensional differential calculus
    3.1 Partial derivatives
    3.2 Differentiability
    3.3 Maxima and Minima
    3.4 Sufficient conditions
    3.5 Lagrange multipliers
    3.6 Mean–Value theorem
    3.7 Implicit function theorem
    3.8 Proof of Theorem 3.22
    3.9 Sufficient conditions

4 First order equations: general theory
    4.1 Preliminary notions
        4.1.1 Systems of ODEs: equations of higher order
    4.2 Existence of solutions: Peano theorem
    4.3 Existence and uniqueness: Picard–Lindelöf theorem
        4.3.1 Interval of existence
        4.3.2 Vector–valued differential equations
        4.3.3 Solution continuation

5 First order equations: explicit solutions
    5.1 Separable equations
        5.1.1 Exercises
    5.2 General solution
    5.3 Homogeneous Equations
        5.3.1 Exercises
    5.4 Quasi homogeneous equations
    5.5 Exact equations
        5.5.1 Exercises
    5.6 Integrating factor for non exact equations
        5.6.1 Exercises
    5.7 Linear equations of first order
        5.7.1 Exercises
    5.8 Bernoulli equation
        5.8.1 Exercises

6 First order equations: advanced topics
    6.1 Riccati equation
        6.1.1 Cross–Ratio property
        6.1.2 Reduced form of the Riccati equation
        6.1.3 Connection with the linear equation of second order
        6.1.4 Exercises
    6.2 Change of variable
        6.2.1 Daniel Bernoulli solution of the Riccati equation
        6.2.2 Exercises

7 Linear equations of second order
    7.1 Homogeneous equations
        7.1.1 Operator notation
        7.1.2 Wronskian determinant
        7.1.3 Order reduction
        7.1.4 Constant–coefficient equations
        7.1.5 Cauchy–Euler equations
        7.1.6 Invariant and Normal form
    7.2 Non–homogeneous equation
        7.2.1 Variation of parameters
        7.2.2 Non–homogeneous equations with constant coefficients
    7.3 Change of independent variable
    7.4 Harley differential resolvent
    7.5 Exercises

8 Series solutions
    8.1 Solution at ordinary points
    8.2 Airy differential equation
    8.3 Solution at regular singular points
    8.4 Bessel equation
1 Gamma and Beta functions

1.1 Gamma function


The material presented in this Chapter is based on many excellent textbooks [4], [10], [13], [16],
[20], [30], [37], [41].

1.1.1 Historical background


The Gamma function was introduced by Euler in relation with the factorial problem. From the early days of Calculus, in fact, the problem of understanding the behaviour of n!, with n ∈ N, and of the related binomial coefficients, had been under the attention of the mathematical community. In 1730, Stirling's¹ formula was discovered:
$$\lim_{n\to\infty}\frac{n!\,e^n}{n^n\sqrt{n}} = \sqrt{2\pi}. \tag{1.1}$$

In the same period, Euler introduced a function e(x), given by the explicit formula (1.2) and defined for any x > 0, which reduces to e(n) = n! when its argument is x = n ∈ N. Euler described his results in a letter to Goldbach², who, together with Bernoulli, had posed the interpolation problem to Euler. This problem was inspired by the fact that the additive counterpart of the factorial has a very simple solution:
$$s_n = 1 + 2 + \cdots + n = \frac{n(n+1)}{2},$$
and by the observation that the above sum function admits a continuation to C given by the following function:
$$f(x) = \frac{x(x+1)}{2}.$$
Euler's solution for the factorial is the following integral:
$$e(x) = \int_0^1 \left(\ln\frac{1}{t}\right)^{x} dt. \tag{1.2}$$

Legendre³ introduced the letter Γ(x) to denote (1.2), and he modified its representation as follows:
$$\Gamma(x) = \int_0^{\infty} e^{-t}\,t^{x-1}\,dt, \qquad x > 0. \tag{1.3}$$
Observe that (1.2) and (1.3) imply the equality Γ(x + 1) = e(x). In fact, the change of variable t = −ln u in the Legendre integral Γ(x + 1) yields:
$$\Gamma(x+1) = \int_0^1 \left(\ln\frac{1}{u}\right)^{x} du = e(x).$$
¹ James Stirling (1692–1770), Scottish mathematician.
² Christian Goldbach (1690–1764), German mathematician.
³ Adrien Marie Legendre (1752–1833), French mathematician.


1.1.2 Main properties of Γ


Function Γ is the natural continuation of the discrete factorial, since:
$$\Gamma(n+1) = n! \quad \text{for } n \in \mathbb{N}, \tag{1.4}$$
and, most importantly, Γ solves the functional equation:
$$\Gamma(x+1) = x\,\Gamma(x) \quad \text{for any } x > 0. \tag{1.5}$$
These aspects are treated in full generality in the Bohr–Mollerup⁴ Theorems 1.1–1.2. Note that Γ(x) appears in many formulæ of Mathematical Analysis, Physics and Mathematical Statistics.

Theorem 1.1. For any x > 0 , the recursion relation (1.5) is true. In particular, when x = n ∈
N , then (1.4) holds.

Proof. Consider (1.3) and integrate by parts:
$$\Gamma(x+1) = \int_0^{\infty} t^{x}\,e^{-t}\,dt = \left[-t^{x}\,e^{-t}\right]_0^{\infty} + x\int_0^{\infty} t^{x-1}\,e^{-t}\,dt = x\,\Gamma(x).$$

For the integer argument case, exploit (1.3) to observe that:
$$\Gamma(1) = \int_0^{\infty} e^{-t}\,dt = 1,$$

and use the just proved recursion (1.5) to compute:
$$\Gamma(2) = 1\cdot\Gamma(1) = 1, \qquad \Gamma(3) = 2\cdot\Gamma(2) = 2, \qquad \Gamma(4) = 3\cdot\Gamma(3) = 3\cdot 2 = 6,$$
and so on. Hence, Γ(n + 1) = n! can be inferred by an inductive argument.
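The recursion just established is easy to check numerically. The sketch below is a minimal test, assuming the third–party mpmath arbitrary–precision library is available; it compares Γ(n + 1) with n! for small integers:

```python
# Sanity check of (1.4): Gamma(n+1) = n!   (assumes mpmath is installed)
from mpmath import mp, gamma, factorial

mp.dps = 30  # work with 30 significant digits
for n in range(1, 8):
    # the two printed values coincide digit for digit
    print(n, gamma(n + 1), factorial(n))
```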

When x > 0, function Γ(x) is continuous and differentiable at any order. To evaluate its derivatives, we use differentiation of parametric integrals, obtaining:
$$\Gamma'(x) = \int_0^{\infty} e^{-t}\,t^{x-1}\ln t\,dt, \tag{1.6a}$$
$$\Gamma^{(2)}(x) = \int_0^{\infty} e^{-t}\,t^{x-1}(\ln t)^2\,dt, \tag{1.6b}$$
and, more generally:
$$\Gamma^{(n)}(x) = \int_0^{\infty} e^{-t}\,t^{x-1}(\ln t)^n\,dt. \tag{1.7}$$

From its definition, Γ(x) is strictly positive. Moreover, since (1.6b) shows that Γ⁽²⁾(x) ≥ 0, it follows that Γ(x) is also strictly convex. We thus infer the existence of the following couple of limits:
$$\ell_0 := \lim_{x\to 0^+}\Gamma(x), \qquad \ell_\infty := \lim_{x\to\infty}\Gamma(x).$$
To evaluate ℓ₀, we use the inequality chain:
$$\Gamma(x) > \int_0^1 t^{x-1}\,e^{-t}\,dt > \frac{1}{e}\int_0^1 t^{x-1}\,dt = \frac{1}{x\,e},$$
which ensures that ℓ₀ = +∞.
⁴ Harald August Bohr (1887–1951), Danish mathematician and soccer player; Johannes Mollerup (1872–1937), Danish mathematician.

To evaluate ℓ∞, since we know a priori that such a limit exists, we can restrict the focus to natural numbers, so that we have immediately:
$$\ell_\infty = \lim_{n\to\infty}\Gamma(n) = \lim_{n\to\infty}(n-1)! = +\infty.$$

Observing that Γ(2) = Γ(1) and using the Rolle Theorem⁵, we see that there exists ξ ∈ ]1, 2[ such that Γ'(ξ) = 0. On the other hand, since Γ⁽²⁾(x) > 0, the first derivative Γ'(x) is strictly increasing; thus, there is a unique ξ such that Γ'(ξ) = 0. Furthermore, 0 < x < ξ ⟹ Γ'(x) < 0 and x > ξ ⟹ Γ'(x) > 0. This means that ξ is the absolute minimum point of Γ(x) on ]0, ∞[. The numerical determination of ξ and Γ(ξ) is due to Legendre and Gauss⁶:
$$\xi = 1.4616321449683622, \qquad \Gamma(\xi) = 0.8856031944108886.$$
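Since ξ is the unique zero of Γ' and Γ'(x) = Γ(x) ψ(x), where ψ is the digamma function, ξ can be recovered as a root of ψ. A short numerical sketch, again assuming mpmath, reproduces the Legendre–Gauss values quoted above:

```python
# Recover the minimum of Gamma as the root of the digamma function.
# Assumes mpmath; the starting guess 1.5 lies in ]1, 2[ as in the text.
from mpmath import mp, findroot, digamma, gamma

mp.dps = 20
xi = findroot(digamma, 1.5)   # solves digamma(xi) = 0
print(xi)                     # 1.4616321449683623...
print(gamma(xi))              # 0.88560319441088870...
```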

An important property of Γ is that its logarithm is a convex function, as stated below.

Theorem 1.2. Γ(x) is logarithmically convex.

Proof. Recall the Schwarz inequality⁷ for functions whose second power is summable:
$$\left(\int_0^{\infty} f(t)\,g(t)\,dt\right)^2 \le \int_0^{\infty} f^2(t)\,dt \cdot \int_0^{\infty} g^2(t)\,dt.$$
If we take:
$$f(t) = e^{-\frac{t}{2}}\,t^{\frac{x-1}{2}}, \qquad g(t) = f(t)\ln t,$$
recalling (1.3), (1.6a) and (1.6b), we find the inequality:
$$\left(\Gamma'(x)\right)^2 \le \Gamma(x)\,\Gamma^{(2)}(x).$$
Hence, we conclude that:
$$\frac{d^2}{dx^2}\ln\Gamma(x) = \frac{\Gamma(x)\,\Gamma^{(2)}(x) - \left(\Gamma'(x)\right)^2}{\Gamma^2(x)} \ge 0.$$

Property (1.5) can be iterated; therefore, if n ∈ N and x > 0:
$$\Gamma(x+n) = (x+n-1)(x+n-2)\cdots(x+1)\,x\,\Gamma(x). \tag{1.8}$$

Definition 1.3. The quotient:
$$(x)_n := \frac{\Gamma(x+n)}{\Gamma(x)} = (x+n-1)(x+n-2)\cdots(x+1)\,x \tag{1.9}$$
is called Pochhammer symbol or increasing factorial.

Rewriting (1.8) as:
$$\Gamma(x) = \frac{\Gamma(x+n)}{(x+n-1)(x+n-2)\cdots(x+1)\,x} \tag{1.8b}$$
allows the evaluation of Γ(x) also for negative values of x, except the negative integers. For instance, when x ∈ ]−1, 0[, Gamma is given by:
$$\Gamma(x) = \frac{\Gamma(x+1)}{x};$$
⁵ Michel Rolle (1652–1719), French mathematician. For the Rolle Theorem see, for example, mathworld.wolfram.com/RollesTheorem.html
⁶ Carl Friedrich Gauss (1777–1855), German mathematician and physicist.
⁷ See, for example, mathworld.wolfram.com/SchwarzsInequality.html

in particular:
$$\Gamma\left(-\frac12\right) = -2\,\Gamma\left(\frac12\right).$$
When x ∈ ]−2, −1[, the evaluation is:
$$\Gamma(x) = \frac{\Gamma(x+2)}{(x+1)\,x},$$
thus, in particular:
$$\Gamma\left(-\frac32\right) = \frac43\,\Gamma\left(\frac12\right).$$
In other words, Γ(x) is defined on the real line, except at the singular points x = 0, −1, −2, …, as shown in Figure 1.1.
[Figure 1.1: Plot of Γ(x).]

1.2 Beta function


The Beta function B(x, y), also called Eulerian integral of the first kind, is defined as:
$$B(x,y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt, \qquad x,\,y > 0. \tag{1.10}$$
Notice that the change of variable t = 1 − s in (1.10) provides, immediately, the symmetry relation B(x, y) = B(y, x), while the change of variable t = cos²ϑ yields:
$$B(x,y) = 2\int_0^{\pi/2}\cos^{2x-1}\vartheta\,\sin^{2y-1}\vartheta\,d\vartheta. \tag{1.10a}$$
The main property of the Beta function is its relationship with the Gamma function, as expressed by Theorem 1.4 below.

Theorem 1.4. For any real x, y > 0, it holds:
$$B(x,y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x+y)}. \tag{1.11}$$

Proof. From the usual definition (1.3) of the Gamma function, after the change of variable t = u², we have:
$$\Gamma(x) = 2\int_0^{+\infty} u^{2x-1}\,e^{-u^2}\,du.$$
In the same way:
$$\Gamma(y) = 2\int_0^{+\infty} v^{2y-1}\,e^{-v^2}\,dv.$$
Now, we form the product of the last two integrals above, and we use the Fubini Theorem, obtaining:
$$\Gamma(x)\,\Gamma(y) = 4\iint_{[0,+\infty)\times[0,+\infty)} u^{2x-1}\,v^{2y-1}\,e^{-(u^2+v^2)}\,du\,dv.$$
At this point, we change variable in the double integral, using polar coordinates u = ρ cos ϑ, v = ρ sin ϑ, which leads to the relation:
$$\Gamma(x)\,\Gamma(y) = 4\left(\int_0^{+\infty}\rho^{2x+2y-1}\,e^{-\rho^2}\,d\rho\right)\left(\int_0^{\pi/2}\cos^{2x-1}\vartheta\,\sin^{2y-1}\vartheta\,d\vartheta\right) = \Gamma(x+y)\;2\int_0^{\pi/2}\cos^{2x-1}\vartheta\,\sin^{2y-1}\vartheta\,d\vartheta.$$
The thesis follows from (1.10a).

Remark 1.5. Theorem 1.4 can also be shown using, again, the Fubini Theorem, starting from:
$$\Gamma(x)\,\Gamma(y) = \int_0^{\infty}\!\!\int_0^{\infty} e^{-(t+s)}\,t^{x-1}\,s^{y-1}\,dt\,ds,$$
and using the changes of variable t = u v and s = u (1 − v).
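Identity (1.11) can be tested numerically by comparing a direct quadrature of the Beta integral (1.10) with the ratio of Gamma values; in the sketch below (assuming mpmath) the test point (x, y) = (0.7, 2.3) is an arbitrary choice:

```python
# Check (1.11): B(x,y) = Gamma(x) Gamma(y) / Gamma(x+y)   (assumes mpmath)
from mpmath import mp, quad, gamma

mp.dps = 25
x, y = mp.mpf('0.7'), mp.mpf('2.3')       # arbitrary test values
B_direct = quad(lambda t: t**(x - 1) * (1 - t)**(y - 1), [0, 1])
B_gamma = gamma(x) * gamma(y) / gamma(x + y)
print(B_direct, B_gamma)                  # agree to working precision
```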

1.2.1 Γ(1/2) and the probability integral

Using (1.11), we can evaluate Γ(1/2) and, then, the probability integral. In fact, by taking x = z and y = 1 − z, where 0 < z < 1, we obtain:
$$\Gamma(z)\,\Gamma(1-z) = B(z,1-z) = \int_0^1 t^{z-1}(1-t)^{-z}\,dt = \int_0^1\left(\frac{t}{1-t}\right)^{z}\frac{dt}{t}.$$

Now, the change of variable y = t (1 − t)⁻¹ leads to:
$$\Gamma(z)\,\Gamma(1-z) = \int_0^{\infty}\frac{y^{z-1}}{1+y}\,dy. \tag{1.12}$$
In particular, the choice z = 1/2 yields:
$$\Gamma\left(\frac12\right) = \sqrt{\pi}. \tag{1.13}$$
To see it, observe that (1.12) implies:
$$\left(\Gamma\left(\frac12\right)\right)^2 = \int_0^{\infty}\frac{1}{(1+y)\sqrt{y}}\,dy, \tag{1.14}$$

where the right hand–side integral is computed setting y = x², so that:
$$\left(\Gamma\left(\frac12\right)\right)^2 = 2\int_0^{\infty}\frac{1}{1+x^2}\,dx = 2\lim_{b\to\infty}\arctan b = \pi.$$
Evaluating (1.3) at x = 1/2, and then setting t = x², we find, in fact:
$$\Gamma\left(\frac12\right) = \int_0^{\infty}\frac{e^{-t}}{\sqrt{t}}\,dt = 2\int_0^{\infty} e^{-x^2}\,dx,$$
so that the probability integral evaluates as ∫₀^∞ e^(−x²) dx = √π/2.
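A quick numerical confirmation of the probability integral, assuming mpmath:

```python
# int_0^inf exp(-x^2) dx = sqrt(pi)/2 = Gamma(1/2)/2   (assumes mpmath)
from mpmath import mp, quad, exp, sqrt, pi, inf

mp.dps = 25
I = quad(lambda x: exp(-x**2), [0, inf])
print(I, sqrt(pi) / 2)   # both ~0.886226925452758...
```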

The change of variable s = t (1 − t)⁻¹, used in (1.10), provides an alternative representation of the Beta function:
$$B(x,y) = \int_0^{\infty}\frac{s^{x-1}}{(1+s)^{x+y}}\,ds. \tag{1.10b}$$
Setting s = t (1 − t)⁻¹, i.e. t = s (1 + s)⁻¹, in fact, leads to:
$$B(x,y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt = \int_0^{\infty} s^{x-1}(1+s)^{-x+1}(1+s)^{-y+1}(1+s)^{-2}\,ds = \int_0^{\infty}\frac{s^{x-1}}{(1+s)^{x+y}}\,ds.$$

The following representation Theorem 1.6 describes the family of integrals related to the Beta function.

Theorem 1.6. If x, y > 0 and a < b, then:
$$\int_a^b (s-a)^{x-1}(b-s)^{y-1}\,ds = (b-a)^{x+y-1}\,B(x,y). \tag{1.15}$$

Proof. In (1.10), employ the change of variable t = (s − a)/(b − a), to obtain:
$$B(x,y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt = \int_a^b (s-a)^{x-1}(b-a)^{-x+1}(b-s)^{y-1}(b-a)^{-y+1}(b-a)^{-1}\,ds = (b-a)^{-x-y+1}\int_a^b (s-a)^{x-1}(b-s)^{y-1}\,ds,$$
which ends our argument.

The particular situation a = −1 and b = 1 is interesting, since it gives:
$$\int_{-1}^{1}(1+s)^{x-1}(1-s)^{y-1}\,ds = 2^{x+y-1}\,B(x,y). \tag{1.15b}$$

1.2.2 Legendre duplication formula


The Legendre formula expresses Γ(2x) in terms of Γ(x) and Γ(x + 1/2).

Theorem 1.7. It holds:
$$2^{2x-1}\,\Gamma(x)\,\Gamma\left(x+\frac12\right) = \sqrt{\pi}\;\Gamma(2x). \tag{1.16}$$

Proof. Define the integrals:
$$I = \int_0^{\frac{\pi}{2}}\left(\sin t\right)^{2x}dt \qquad\text{and}\qquad J = \int_0^{\frac{\pi}{2}}\left(\sin(2t)\right)^{2x}dt.$$
Observe that, with the change of variable 2t = u in J, it follows that I = J. Observe, further, that:
$$I = \frac12\,B\left(x+\frac12,\,\frac12\right),$$
$$J = \int_0^{\frac{\pi}{2}}(2\sin t\cos t)^{2x}\,dt = 2^{2x-1}\,B\left(x+\frac12,\,x+\frac12\right).$$
Hence:
$$B\left(x+\frac12,\,\frac12\right) = 2^{2x-1}\,B\left(x+\frac12,\,x+\frac12\right). \tag{1.17}$$
Recalling (1.11), equality (1.17) implies (1.16).

Formula (1.16) is generalised by the Gauss multiplication Theorem 1.8, which we present without proof.

Theorem 1.8. For each m ∈ N, the following formula holds true:
$$\Gamma(x)\prod_{k=1}^{m-1}\Gamma\left(x+\frac{k}{m}\right) = m^{\frac12-mx}\,(2\pi)^{\frac{m-1}{2}}\,\Gamma(mx). \tag{1.18}$$
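Both (1.16) and (1.18) are easy to verify numerically; in the following sketch (assuming mpmath) the point x = 0.37 and the order m = 5 are arbitrary test values:

```python
# Check the duplication formula (1.16) and the Gauss multiplication
# formula (1.18) at an arbitrary point.  Assumes mpmath.
from mpmath import mp, gamma, sqrt, pi

mp.dps = 25
x = mp.mpf('0.37')

lhs = 2**(2*x - 1) * gamma(x) * gamma(x + mp.mpf(1)/2)
print(lhs, sqrt(pi) * gamma(2*x))                    # (1.16)

m = 5
prod = gamma(x)
for k in range(1, m):
    prod *= gamma(x + mp.mpf(k)/m)                   # Gamma(x + k/m)
rhs = m**(mp.mpf(1)/2 - m*x) * (2*pi)**(mp.mpf(m - 1)/2) * gamma(m*x)
print(prod, rhs)                                     # (1.18)
```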

1.2.3 Euler reflexion formula


This famous and beautiful relation was found by Euler and is stated in (1.19). It admits two alternative proofs, which are not presented here: one uses complex integrals, while the other exploits infinite products.

Theorem 1.9. For any x ∈ ]0, 1[, it holds:
$$\Gamma(x)\,\Gamma(1-x) = \frac{\pi}{\sin(\pi x)}. \tag{1.19}$$

From the reflexion formula (1.19), the computation of integral (1.20) follows immediately.

Corollary 1.10. For any x ∈ ]0, 1[, it holds:
$$\int_0^{\infty}\frac{u^{x-1}}{1+u}\,du = \frac{\pi}{\sin(\pi x)}. \tag{1.20}$$

It is possible to use (1.19) to establish a cosine reflexion formula. Setting x = 1/2 + p:
$$\Gamma\left(\frac12+p\right)\Gamma\left(\frac12-p\right) = \frac{\pi}{\sin\left(\left(\frac12+p\right)\pi\right)} = \frac{\pi}{\cos(p\,\pi)}, \tag{1.21}$$
in which we must assume p ≠ n + 1/2, for n = 0, 1, …. In terms of the original variable x, we also have:
$$\Gamma(x)\,\Gamma(1-x) = \frac{\pi}{\cos\left(\left(x-\frac12\right)\pi\right)}.$$
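A numerical sanity check of (1.19) and (1.20) at the arbitrary point x = 0.3, assuming mpmath:

```python
# Check Euler's reflexion formula (1.19) and the integral (1.20).
from mpmath import mp, quad, gamma, sin, pi, inf

mp.dps = 20
x = mp.mpf('0.3')
print(gamma(x) * gamma(1 - x), pi / sin(pi * x))     # (1.19)
I = quad(lambda u: u**(x - 1) / (1 + u), [0, inf])
print(I, pi / sin(pi * x))                           # (1.20)
```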

1.3 Definite integrals


Gamma and Beta functions are extremely useful for the computation of many definite integrals. Here, we present some integrals, which can also be found in [29]: to solve them, the reflection formulæ (1.19) and (1.21) are employed. We begin with an integral identity due to Legendre.

Theorem 1.11. If n > 2, then:
$$\int_0^1\frac{dx}{\sqrt{1-x^n}} = \cos\left(\frac{\pi}{n}\right)\int_0^{\infty}\frac{dx}{\sqrt{1+x^n}}. \tag{1.22}$$

Proof. Legendre established formula (1.22) in equation (z) of his treatise [25]. First observe that, for n > 2, both integrals converge. Define:
$$I_1 = \int_0^1\frac{dx}{\sqrt{1-x^n}}, \qquad I_2 = \int_0^{\infty}\frac{dx}{\sqrt{1+x^n}}.$$
Employing the change of variable xⁿ = t in both integrals yields:
$$I_1 = \frac1n\int_0^1\frac{t^{\frac1n-1}}{\sqrt{1-t}}\,dt = \frac1n\,B\left(\frac1n,\,\frac12\right),$$
$$I_2 = \frac1n\int_0^{\infty}\frac{t^{\frac1n-1}}{\sqrt{1+t}}\,dt = \frac1n\,B\left(\frac1n,\,\frac12-\frac1n\right).$$
In the integral I₂, above, we used the Beta representation (1.10b). Now, form the ratio I₁/I₂ and exploit Theorem 1.4, to obtain:
$$\frac{I_1}{I_2} = \frac{B\left(\frac1n,\frac12\right)}{B\left(\frac1n,\frac12-\frac1n\right)} = \frac{\Gamma\left(\frac1n\right)\Gamma\left(\frac12\right)}{\Gamma\left(\frac1n+\frac12\right)}\cdot\frac{\Gamma\left(\frac12\right)}{\Gamma\left(\frac1n\right)\Gamma\left(\frac12-\frac1n\right)} = \frac{\pi}{\Gamma\left(\frac1n+\frac12\right)\Gamma\left(\frac12-\frac1n\right)}.$$
Thus, recalling (1.21) with p = 1/n:
$$\frac{I_1}{I_2} = \frac{\pi}{\pi/\cos\frac{\pi}{n}} = \cos\left(\frac{\pi}{n}\right),$$
which is our statement.
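Identity (1.22) can be confirmed numerically; the following sketch (assuming mpmath) tests the case n = 3:

```python
# Check Legendre's identity (1.22) for n = 3.
from mpmath import mp, quad, cos, sqrt, pi, inf

mp.dps = 20
n = 3
I1 = quad(lambda x: 1 / sqrt(1 - x**n), [0, 1])
I2 = quad(lambda x: 1 / sqrt(1 + x**n), [0, inf])
print(I1, cos(pi / n) * I2)   # both ~1.4022
```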

The same argument followed to demonstrate Theorem 1.11 can be applied to prove the following Theorem 1.12; thus, we leave it as an exercise.

Theorem 1.12. If 2a < n, then:
$$\int_0^1\frac{x^{a-1}}{\sqrt{1-x^n}}\,dx = \cos\left(\frac{a\,\pi}{n}\right)\int_0^{\infty}\frac{z^{a-1}}{\sqrt{1+z^n}}\,dz. \tag{1.23}$$

Theorem 1.13. If n ∈ N, n ≥ 2, then:
$$\int_0^1\frac{dx}{\sqrt[n]{1-x^n}} = \frac{\pi}{n\,\sin\left(\frac{\pi}{n}\right)}. \tag{1.24}$$

Proof. Using, again, the change of variable xⁿ = t leads to:
$$\int_0^1\frac{dx}{\sqrt[n]{1-x^n}} = \frac1n\int_0^1 t^{\frac1n-1}(1-t)^{-\frac1n}\,dt = \frac1n\int_0^1 t^{\frac1n-1}(1-t)^{\left(1-\frac1n\right)-1}\,dt = \frac1n\,B\left(\frac1n,\,1-\frac1n\right) = \frac1n\,\Gamma\left(\frac1n\right)\Gamma\left(1-\frac1n\right).$$
Thesis (1.24) follows from the reflexion formula (1.19).

Theorem 1.14. For any n ≥ 2, it holds:
$$\int_0^{\infty}\frac{dx}{1+x^n} = \frac{\pi}{n\,\sin\left(\frac{\pi}{n}\right)}. \tag{1.25}$$
Moreover, if n − m > 1, then:
$$\int_0^{\infty}\frac{x^m}{1+x^n}\,dx = \frac{\pi}{n\,\sin\left(\frac{\pi}{n}\,(m+1)\right)}. \tag{1.26}$$

Proof. The change of variable 1 + xⁿ = 1/t is employed in the left hand–side integral of (1.25), and therefore dx = −(1/n) t^(−1/n−1)(1−t)^(1/n−1) dt:
$$\int_0^{\infty}\frac{dx}{1+x^n} = \frac1n\int_0^1 t^{\left(1-\frac1n\right)-1}(1-t)^{\frac1n-1}\,dt = \frac1n\,B\left(1-\frac1n,\,\frac1n\right) = \frac1n\,\Gamma\left(1-\frac1n\right)\Gamma\left(\frac1n\right).$$
Thesis (1.25) follows from the reflexion formula (1.19). Formula (1.26) also follows, using an analogous argument.

Formula (1.27), below, is needed to prove the following Theorem 1.15:
$$\Gamma\left(n-\frac12\right) = \frac{\sqrt{\pi}}{2^{n-1}}\prod_{k=1}^{n-1}(2k-1) = \frac{\sqrt{\pi}}{2^{n-1}}\,(2n-3)!!\,. \tag{1.27}$$

Theorem 1.15. If n ∈ N, it holds:
$$\int_{-\infty}^{\infty}\frac{dx}{(1+x^2)^n} = \frac{\pi\,(2n-3)!!}{2^{n-1}\,(n-1)!}. \tag{1.28}$$

Proof. Using the symmetry of the integrand, we have:
$$\int_{-\infty}^{\infty}\frac{dx}{(1+x^2)^n} = 2\int_0^{\infty}\frac{dx}{(1+x^2)^n}.$$
The change of variable 1 + x² = 1/t, that is, dx = −(1/2) t^(−3/2)(1−t)^(−1/2) dt, leads to:
$$\int_{-\infty}^{\infty}\frac{dx}{(1+x^2)^n} = B\left(n-\frac12,\,\frac12\right) = \frac{\sqrt{\pi}}{(n-1)!}\,\Gamma\left(n-\frac12\right).$$
Exploiting (1.27), we arrive at thesis (1.28).

By an analogous argument, the following Theorem 1.16 can be demonstrated.

Theorem 1.16. If n p − m > 1, then:
$$\int_0^{\infty}\frac{x^m}{(1+x^n)^p}\,dx = \frac{\Gamma\left(\frac{m+1}{n}\right)\Gamma\left(p-\frac{m+1}{n}\right)}{n\,\Gamma(p)}. \tag{1.29}$$
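A numerical check of (1.29), assuming mpmath, with the arbitrary admissible choice m = 2, n = 3, p = 3/2; for these values the integral also has the elementary closed form 2/3:

```python
# Check (1.29) for sample values with n*p - m > 1.
from mpmath import mp, quad, gamma, inf

mp.dps = 20
m, n, p = 2, 3, mp.mpf('1.5')        # here n*p - m = 2.5 > 1
I = quad(lambda x: x**m / (1 + x**n)**p, [0, inf])
a = mp.mpf(m + 1) / n
print(I, gamma(a) * gamma(p - a) / (n * gamma(p)))   # both 2/3
```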

1.4 Double integration techniques


As often happens in Analysis, double integration leads to some interesting integral identities, since the double integral can be evaluated in either order of integration. Here, we obtain a few such important identities, connecting the Eulerian integrals with the reversal of the order of integration. In particular, the Fresnel integrals are obtained, which are related to the probability integral, as well as the Dirichlet integral, following the presentation in [31].
We start by proving the following beautiful identity (1.30), which holds for any b > 0 and for 0 < p < 2, in order to provide convergence of the integral:
$$\int_0^{+\infty}\frac{\sin(b\,x)}{x^p}\,dx = \frac{\pi\,b^{p-1}}{2\,\Gamma(p)\,\sin\left(\frac{p\,\pi}{2}\right)}. \tag{1.30}$$
The starting point, to prove (1.30), is the double integral:
$$I(b,p) = \int_0^{+\infty}\!\!\int_0^{+\infty}\sin(b\,x)\,y^{p-1}\,e^{-x\,y}\,dx\,dy.$$
0 0

The assumption we made for the parameter b ensures summability. Hence, exploiting Fubini
Theorem, the integration can be performed regardless of the order. Let us integrate, first, with
respect to x : Z +∞
Z +∞

I(b , p) = y p−1 sin(b x) e−x y dx dy .
0 0
In this way, the integral above turns out to be an elementary one, since:
Z +∞
b
sin(b x) e−x y dx = 2 ,
0 b + y2
and then:
+∞
y p−1
Z
I(b , p) = b dy ,
0 b2 + y 2
y
for which, employing the change of variable t = b , we obtain:

tp−1
Z
p−1
I(b , p) = b dt .
0 1 + t2
The latest formula allows to use identity (1.26) and complete the first computation:

π b p−1
I(b , p) = . (1.31)
2 sin p π2

Now, reverse the order of integration:
$$I(b,p) = \int_0^{+\infty}\sin(b\,x)\left(\int_0^{+\infty} y^{p-1}\,e^{-x\,y}\,dy\right)dx.$$
The inner integral is immediately evaluated, in terms of the Gamma function, setting u = x y:
$$\int_0^{+\infty} y^{p-1}\,e^{-x\,y}\,dy = \frac{1}{x^p}\int_0^{\infty} u^{p-1}\,e^{-u}\,du = \frac{\Gamma(p)}{x^p}. \tag{1.32}$$
Equating (1.32) and (1.31) leads to:
$$\Gamma(p)\int_0^{\infty}\frac{\sin(b\,x)}{x^p}\,dx = \frac{\pi\,b^{p-1}}{2\,\sin\left(\frac{p\,\pi}{2}\right)},$$
which is nothing else but (1.30).
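Since the integrand in (1.30) oscillates, a plain quadrature converges poorly; mpmath offers quadosc for precisely this situation. The sketch below (assuming mpmath) tests (1.30) with the arbitrary values b = 2 and p = 1/2:

```python
# Check (1.30) with an oscillatory quadrature.  Assumes mpmath.
from mpmath import mp, quadosc, gamma, sin, pi, inf

mp.dps = 15
b, p = mp.mpf(2), mp.mpf('0.5')
I = quadosc(lambda x: sin(b*x) / x**p, [0, inf], period=2*pi/b)
print(I, pi * b**(p - 1) / (2 * gamma(p) * sin(p*pi/2)))   # ~0.8862
```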



A first consequence of equation (1.30), corresponding to the particular case b = p = 1, is the Dirichlet integral (1.33):
$$\int_0^{+\infty}\frac{\sin x}{x}\,dx = \frac{\pi}{2}. \tag{1.33}$$
Moreover, from (1.30), it is possible to establish a second integral formula (1.34), which generalises the Dirichlet integral and which holds for q > 1:
$$\int_0^{\infty}\frac{\sin x^q}{x^q}\,dx = \frac{1}{q-1}\,\Gamma\left(\frac1q\right)\cos\left(\frac{\pi}{2q}\right). \tag{1.34}$$
To prove (1.34), the first step is the quite natural change of variable x^q = u in the integral:
$$\int_0^{\infty}\frac{\sin x^q}{x^q}\,dx = \frac1q\int_0^{\infty}\frac{\sin u}{u^{2-\frac1q}}\,du.$$
The right hand–side integral, above, has the form (1.30), with b = 1 and p = 2 − 1/q; therefore:
$$\int_0^{\infty}\frac{\sin x^q}{x^q}\,dx = \frac{\pi}{2\,q\,\Gamma\left(2-\frac1q\right)\sin\left(\left(2-\frac1q\right)\frac{\pi}{2}\right)}.$$

Now, evaluating the sine:
$$\sin\left(\left(2-\frac1q\right)\frac{\pi}{2}\right) = \sin\left(\pi-\frac{\pi}{2q}\right) = \sin\left(\frac{\pi}{2q}\right),$$
and using the reflection formula (1.19):
$$\Gamma\left(2-\frac1q\right) = \left(1-\frac1q\right)\Gamma\left(1-\frac1q\right) = \frac{q-1}{q}\cdot\frac{\pi}{\sin\left(\frac{\pi}{q}\right)\Gamma\left(\frac1q\right)},$$
we arrive at:
$$\int_0^{\infty}\frac{\sin x^q}{x^q}\,dx = \frac{\Gamma\left(\frac1q\right)\sin\left(\frac{\pi}{q}\right)}{2\,(q-1)\,\sin\left(\frac{\pi}{2q}\right)}.$$
Finally, the goniometric identity:
$$\frac{\sin x}{\sin\frac{x}{2}} = 2\cos\frac{x}{2}$$
implies the equality below, which simplifies to (1.34):
$$\int_0^{\infty}\frac{\sin x^q}{x^q}\,dx = \frac{\Gamma\left(\frac1q\right)}{2\,(q-1)}\cdot 2\cos\left(\frac{\pi}{2q}\right).$$

We now show, employing again the reversal of the order of integration, a cosine relation similar to (1.30), namely:
$$\int_0^{\infty}\frac{\cos(b\,x)}{x^p}\,dx = \frac{\pi\,b^{p-1}}{2\,\Gamma(p)\,\cos\left(\frac{p\,\pi}{2}\right)}, \tag{1.35}$$
where we must assume that 0 < p < 1, to ensure convergence of the integral, due to the singularity at the origin. To prove (1.35), we consider the double integral:
$$\int_0^{\infty}\!\!\int_0^{\infty}\cos(b\,x)\,y^{p-1}\,e^{-x\,y}\,dx\,dy,$$
from which we show that (1.35) can be reached, via the Fubini Theorem, regardless of the order of integration. The starting point is, then, the equality:
$$\int_0^{\infty}\left(\int_0^{\infty}\cos(b\,x)\,y^{p-1}\,e^{-x\,y}\,dy\right)dx = \int_0^{\infty} y^{p-1}\left(\int_0^{\infty}\cos(b\,x)\,e^{-x\,y}\,dx\right)dy. \tag{1.36}$$

The inner integral in the right hand–side of (1.36) is elementary:
$$\int_0^{\infty}\cos(b\,x)\,e^{-x\,y}\,dx = \frac{y}{b^2+y^2}.$$
Thus, the right hand–side of (1.36) is:
$$\int_0^{\infty} y^{p-1}\,\frac{y}{b^2+y^2}\,dy = \int_0^{\infty}\frac{y^p}{b^2+y^2}\,dy = b^{p-1}\int_0^{\infty}\frac{t^p}{1+t^2}\,dt.$$
The last integral above is in the form (1.26). Therefore, the right hand–side integral of (1.36) turns out to be:
$$\int_0^{\infty}\frac{y^p}{b^2+y^2}\,dy = \frac{\pi\,b^{p-1}}{2\,\sin\left(\frac{p+1}{2}\,\pi\right)} = \frac{\pi\,b^{p-1}}{2\,\cos\left(\frac{p}{2}\,\pi\right)}. \tag{1.37}$$
The inner integral in the left hand–side of (1.36) is given by (1.32). Hence, the left hand–side integral of (1.36) is:
$$\Gamma(p)\int_0^{\infty}\frac{\cos(b\,x)}{x^p}\,dx. \tag{1.38}$$
Equating (1.37) and (1.38) leads to (1.35).

There is a further, very interesting consequence of equation (1.30), leading to the evaluation of the Fresnel⁸ integrals (1.39), which hold for b > 0 and k > 1:
$$\int_0^{\infty}\sin(b\,x^k)\,dx = \frac{1}{k\,b^{\frac1k}}\,\Gamma\left(\frac1k\right)\sin\left(\frac{\pi}{2k}\right). \tag{1.39}$$

To prove (1.39), we start from its left hand–side integral, inserting in it the change of variable u = x^k, i.e., du = k x^(k−1) dx; thus:
$$\int_0^{\infty}\sin(b\,x^k)\,dx = \frac1k\int_0^{\infty}\sin(b\,u)\,u^{\frac1k-1}\,du = \frac1k\int_0^{\infty}\frac{\sin(b\,u)}{u^{1-\frac1k}}\,du.$$
The latest integral above is in the form (1.30), with p = 1 − 1/k. Hence:
$$\int_0^{\infty}\frac{\sin(b\,u)}{u^{1-\frac1k}}\,du = \frac{\pi\,b^{-\frac1k}}{2\,\Gamma\left(1-\frac1k\right)\sin\left(\left(1-\frac1k\right)\frac{\pi}{2}\right)}, \tag{1.40}$$
and, thus:
$$\int_0^{\infty}\sin(b\,x^k)\,dx = \frac{\pi}{2\,k\,b^{\frac1k}\,\Gamma\left(1-\frac1k\right)\sin\left(\frac{\pi}{2}-\frac{\pi}{2k}\right)}.$$

At this point, employing the reflection formula (1.19):
$$\Gamma\left(\frac1k\right)\Gamma\left(1-\frac1k\right) = \frac{\pi}{\sin\frac{\pi}{k}},$$
⁸ Augustin–Jean Fresnel (1788–1827), French civil engineer and physicist.

and the goniometric identities:
$$\sin\left(\frac{\pi}{2}-\frac{\pi}{2k}\right) = \cos\left(\frac{\pi}{2k}\right), \qquad \frac{\sin x}{\cos\frac{x}{2}} = 2\sin\frac{x}{2},$$
we obtain (1.39).

In (1.39), the particular choices b = 1 and k = 2 correspond to the sine Fresnel integral:
$$\int_0^{\infty}\sin(x^2)\,dx = \frac12\,\Gamma\left(\frac12\right)\sin\left(\frac{\pi}{4}\right) = \sqrt{\frac{\pi}{8}}. \tag{1.41}$$
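A numerical confirmation of (1.41), assuming mpmath; quadosc is given the zeros of sin(x²), located at x = √(kπ):

```python
# The sine Fresnel integral (1.41): int_0^inf sin(x^2) dx = sqrt(pi/8).
from mpmath import mp, quadosc, sin, sqrt, pi, inf

mp.dps = 15
I = quadosc(lambda x: sin(x**2), [0, inf], zeros=lambda k: sqrt(k*pi))
print(I, sqrt(pi / 8))   # both ~0.6266570687...
```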

Exploiting the same technique that produced (1.39) from (1.30), it is possible to derive the cosine analogue (1.42) of the Fresnel integrals, which holds for b > 0 and k > 1:
$$\int_0^{\infty}\cos(b\,x^k)\,dx = \frac{1}{k\,b^{\frac1k}}\,\Gamma\left(\frac1k\right)\cos\left(\frac{\pi}{2k}\right). \tag{1.42}$$
To prove (1.42), the starting point is (1.35). Then, as in the sine case, we introduce the change of variable u = x^k, we choose p = 1 − 1/k, and, via calculations similar to those performed in the sine case, we arrive at:
$$\int_0^{\infty}\cos(b\,x^k)\,dx = \frac{\pi}{2\,k\,b^{\frac1k}\,\Gamma\left(1-\frac1k\right)\cos\left(\frac{\pi}{2}-\frac{\pi}{2k}\right)}.$$

At this point, exploiting the reflection formula (1.19):
$$\Gamma\left(1-\frac1k\right) = \frac{\pi}{\Gamma\left(\frac1k\right)\sin\frac{\pi}{k}},$$
and the trigonometric properties:
$$\cos\left(\frac{\pi}{2}-\frac{\pi}{2k}\right) = \sin\left(\frac{\pi}{2k}\right), \qquad \frac{\sin x}{\sin\frac{x}{2}} = 2\cos\frac{x}{2},$$
formula (1.42) follows.


2 Sequences and series of functions

2.1 Sequences and series of real or complex numbers


A sequence is a set of numbers u₁, u₂, u₃, …, in a definite order of arrangement, that is, a map u : N → R or u : N → C, formed according to a certain rule. Each number in the sequence is called a term; uₙ is called the n-th term. The sequence is called finite or infinite, according to the number of terms. The sequence u₁, u₂, u₃, …, when considered as a function, is also designated as (uₙ)ₙ∈ℕ or, briefly, (uₙ).

Definition 2.1. The real or complex number ℓ is called the limit of the infinite sequence (uₙ) if, for any positive number ε, there exists a positive number n(ε), depending on ε, such that |uₙ − ℓ| < ε for all integers n > n(ε). In such a case, we write:
$$\lim_{n\to\infty} u_n = \ell.$$


X
Given a sequence (un ) , we say that its associated infinite series un :
n=1

(i) converges, when it exists the limit:


n
X ∞
X
lim uk := S = un ;
n→∞
k=1 n=1

n
X
(ii) diverges, when the limit of the partial sums uk does not exist.
k=1

2.2 Sequences of functions


Given a real interval [a, b], we denote by F([a, b]) the collection of all real functions defined on [a, b]:
$$F([a,b]) = \{f \mid f : [a,b]\to\mathbb{R}\}.$$

Definition 2.2. A sequence of functions with domain [a, b] is a sequence of elements of F([a, b]).

Example 2.3. Functions fₙ(x) = xⁿ, where x ∈ [0, 1], form a sequence of functions in F([0, 1]).

Let us analyse what happens when n → ∞. It is easy to realise that a sequence of continuous functions may converge to a non–continuous function. Indeed, for the sequence of functions in Example 2.3, it holds:
$$\lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} x^n = \begin{cases} 1 & \text{if } x = 1, \\ 0 & \text{if } 0 \le x < 1. \end{cases}$$
Thus, even if every function of the sequence fₙ(x) = xⁿ is continuous, the limit function f(x), defined below, may not be continuous:
$$f(x) := \lim_{n\to\infty} f_n(x).$$

The convergence of a sequence of functions, like that of Example 2.3, is called simple convergence. We now provide its rigorous definition.

Definition 2.4. If (fₙ) is a sequence of functions on I ⊆ [a, b] and f is a real function on I, then fₙ pointwise converges to f if, for any x ∈ I, there exists the limit of the real sequence (fₙ(x)) and its value is f(x):
$$\lim_{n\to\infty} f_n(x) = f(x).$$
Pointwise convergence is denoted as follows:
$$f_n \xrightarrow{\;I\;} f.$$

Remark 2.5. Definition 2.4 can be reformulated as follows: it holds that fₙ → f on I if, for any ε > 0 and for any x ∈ I, there exists n(ε, x) ∈ N, depending on ε and x, such that:
$$|f_n(x) - f(x)| < \varepsilon$$
for any n ∈ N with n > n(ε, x).

Example 2.3 shows that the pointwise limit of a sequence of continuous functions may not be continuous.

2.3 Uniform convergence


Pointwise convergence does not allow, in general, the interchange of limit and integral operators, a possibility that we call passage to the limit. To explain this, consider the sequence of functions:
$$f_n(x) = n\,e^{-n^2 x^2},$$
defined on [0, ∞); it is a sequence that clearly converges pointwise to the zero function for x > 0. Employing the substitution n x = y, evaluation of the integral of fₙ yields:
$$\int_0^{\infty} f_n(x)\,dx = \int_0^{\infty} e^{-y^2}\,dy.$$
We do not yet have the tools to evaluate the integral on the right hand–side of the above equality (but we will soon); it is clear, though, that it is a positive real number, so we have:
$$\lim_{n\to\infty}\int_0^{\infty} f_n(x)\,dx = \int_0^{\infty} e^{-y^2}\,dy = \alpha > 0 \ne \int_0^{\infty}\lim_{n\to\infty} f_n(x)\,dx = 0.$$
To establish a 'good' notion of convergence, which allows the passage to the limit when we take the integral of the considered sequence, and which preserves continuity, we introduce the fundamental notion of uniform convergence.

Definition 2.6. If (fₙ) is a sequence of functions defined on the interval I, then fₙ converges uniformly to the function f if, for any ε > 0, there exists n_ε ∈ N such that, for n ∈ N, n > n_ε, it holds:
$$\sup_{x\in I}|f_n(x) - f(x)| < \varepsilon. \tag{2.1}$$
Uniform convergence is denoted by:
$$f_n \overset{I}{\Rightarrow} f.$$

Remark 2.7. Definition 2.6 is equivalent to requesting that, for any ε > 0, there exists n_ε ∈ N such that, for n ∈ N, n > n_ε, it holds:
$$|f_n(x) - f(x)| < \varepsilon, \quad \text{for any } x \in I. \tag{2.2}$$

Proof. Let fₙ ⇒ f on I. Then, for any ε > 0, there exists n_ε ∈ N such that:
$$\sup_{x\in I}|f_n(x) - f(x)| < \varepsilon, \quad \text{for any } n\in\mathbb{N},\ n > n_\varepsilon,$$
and this implies (2.2). Vice versa, if (2.2) holds then, for any ε > 0, there exists n_ε ∈ N such that:
$$\sup_{x\in I}|f_n(x) - f(x)| \le \varepsilon, \quad \text{for any } n\in\mathbb{N},\ n > n_\varepsilon,$$
that is to say, fₙ ⇒ f on I.

Remark 2.8. Uniform convergence implies pointwise convergence. The converse does not hold, as Example 2.3 shows.

In the next Theorem 2.9, we state the so–called Cauchy uniform convergence criterion.

Theorem 2.9. Given a sequence of functions (fₙ) on [a, b], the following statements are equivalent:

(i) (fₙ) converges uniformly;

(ii) for any ε > 0, there exists n_ε ∈ N such that, for n, m ∈ N, with n, m > n_ε, it holds:
$$|f_n(x) - f_m(x)| < \varepsilon, \quad \text{for any } x \in [a,b].$$

Proof. We show that (i) ⟹ (ii). Assume that (fₙ) converges uniformly, i.e., for a fixed ε > 0, there exists n_ε > 0 such that, for any n ∈ N, n > n_ε, the inequality |fₙ(x) − f(x)| < ε/2 holds for any x ∈ [a, b]. Using the triangle inequality, we have:
$$|f_n(x) - f_m(x)| \le |f_n(x) - f(x)| + |f(x) - f_m(x)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$
for n, m > n_ε.
To show that (ii) ⟹ (i), let us first observe that, for a fixed x ∈ [a, b], the numerical sequence (fₙ(x)) is indeed a Cauchy sequence; thus, it converges to a real number f(x). We prove that such a convergence is uniform. Let us fix ε > 0 and choose n_ε ∈ N such that, for n, m ∈ N, n, m > n_ε, it holds:
$$|f_n(x) - f_m(x)| < \varepsilon$$
for any x ∈ [a, b]. Now, taking the limit for m → +∞, we get:
$$|f_n(x) - f(x)| \le \varepsilon$$
for any x ∈ [a, b]. This completes the proof.

Example 2.10. The sequence of functions fₙ(x) = x (1 + n x)⁻¹ converges uniformly to f(x) = 0 on the interval [0, 1]. Since fₙ is nonnegative and increasing on [0, 1] for any n ∈ N, we have:
$$\sup_{x\in[0,1]}\frac{x}{1+n\,x} = \frac{1}{1+n} \to 0 \quad \text{as } n\to\infty.$$

[Figure 2.1: fₙ(x) = x (1 + n x)⁻¹, n = 1, …, 6.]

Example 2.11. The sequence of functions fₙ(x) = (1 + n x)⁻¹ does not converge uniformly to f(x) = 0 on the interval ]0, 1]. In spite of the pointwise convergence of fₙ to 0 for x ∈ ]0, 1], we have in fact:
$$\sup_{x\in\,]0,1]}\frac{1}{1+n\,x} = 1.$$

[Figure 2.2: fₙ(x) = (1 + n x)⁻¹, n = 1, …, 6.]

Example 2.12. If α ∈ R⁺, the sequence of functions defined on R⁺ by fₙ(x) = n^α x e^(−α n x) converges pointwise to 0 on R⁺, and uniformly if α < 1.
For x > 0, in fact, by taking the logarithm, we obtain:
$$\ln f_n(x) = \alpha\ln n + \ln x - \alpha\,n\,x.$$
It follows that lim_{n→∞} ln fₙ(x) = −∞ and, then, lim_{n→∞} fₙ(x) = 0. Pointwise convergence is proved.
For uniform convergence, we show that, for any n ∈ N, the function fₙ attains its absolute maximum on R⁺. By differentiating with respect to x, we obtain, in fact:
$$f_n'(x) = n^{\alpha}\,e^{-\alpha n x}\,(1-\alpha\,n\,x),$$
from which we see that fₙ attains its maximum value at xₙ = 1/(α n); such a maximum is absolute, since fₙ(0) = 0 and lim_{x→+∞} fₙ(x) = 0. We have thus shown that:
$$\sup_{x\in\mathbb{R}^+} f_n(x) = f_n\left(\frac{1}{\alpha\,n}\right) = \frac{n^{\alpha-1}}{\alpha\,e}.$$
Now, lim_{n→∞} sup_{x∈R⁺} fₙ(x) = 0 when α < 1. Hence, in this case, the convergence is indeed uniform.

In the following example, we compare two sequences of functions, apparently very similar, of which the first one is only pointwise convergent, while the second one is uniformly convergent.

Example 2.13. Consider the sequences of functions (fₙ) and (gₙ), both defined on [0, 1]:
$$f_n(x) = \begin{cases} n^2\,x\,(1-n\,x) & \text{if } 0 \le x < \frac1n, \\[4pt] 0 & \text{if } \frac1n \le x \le 1, \end{cases} \tag{2.3}$$
and
$$g_n(x) = \begin{cases} n\,x^2\,(1-n\,x) & \text{if } 0 \le x < \frac1n, \\[4pt] 0 & \text{if } \frac1n \le x \le 1. \end{cases} \tag{2.4}$$
Sequence (fₙ) converges pointwise to f(x) = 0 for x ∈ [0, 1]; in fact, fₙ(0) = 0 and fₙ(1) = 0 for any n ∈ N. When x ∈ (0, 1), since there exists n₀ ∈ N such that 1/n₀ < x, it follows that fₙ(x) = 0 for any n ≥ n₀.
The convergence of (fₙ) is not uniform; to show this, observe that ξₙ = 1/(2n) maximises fₙ, since:
$$f_n'(x) = \begin{cases} n^2\,(1-2\,n\,x) & \text{if } 0 \le x < \frac1n, \\[4pt] 0 & \text{if } \frac1n \le x \le 1. \end{cases}$$
It then follows:
$$\sup_{x\in[0,1]}|f_n(x) - f(x)| = \sup_{x\in[0,1]} f_n(x) = f_n(\xi_n) = \frac{n}{4},$$
which prevents uniform convergence. With similar considerations, we can prove that (gₙ) converges pointwise to g(x) = 0, and that the convergence is also uniform, since:
$$g_n'(x) = \begin{cases} n\,x\,(2-3\,n\,x) & \text{if } 0 \le x < \frac1n, \\[4pt] 0 & \text{if } \frac1n \le x \le 1, \end{cases}$$
implying that ηₙ = 2/(3n) maximises gₙ and that:
$$\sup_{x\in[0,1]}|g_n(x) - g(x)| = \sup_{x\in[0,1]} g_n(x) = g_n(\eta_n) = \frac{4}{27\,n},$$
which ensures the uniform convergence of (gₙ).
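The two suprema computed above are easy to observe numerically; the plain Python sketch below samples fₙ and gₙ on a grid of [0, 1] and compares the sampled maxima with n/4 and 4/(27 n):

```python
# Example 2.13 numerically: sup f_n grows like n/4 (no uniform
# convergence), while sup g_n decays like 4/(27 n).
def f(n, x):
    return n**2 * x * (1 - n*x) if x < 1/n else 0.0

def g(n, x):
    return n * x**2 * (1 - n*x) if x < 1/n else 0.0

N = 10**5  # grid resolution on [0, 1]
for n in (5, 50, 500):
    sup_f = max(f(n, k/N) for k in range(N + 1))
    sup_g = max(g(n, k/N) for k in range(N + 1))
    print(n, sup_f, n/4, sup_g, 4/(27*n))
```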

Uniform convergence implies remarkable properties. If a sequence of continuous functions is uniformly convergent, in fact, its limit is also a continuous function.

Theorem 2.14. If (fₙ) is a sequence of continuous functions on a closed and bounded interval [a, b], which converges uniformly to f, then f is a continuous function.

Proof. Let f(x) be the limit of fₙ. Choose ε > 0 and x₀ ∈ [a, b]. Due to uniform convergence, there exists n_ε ∈ N such that, if n ∈ N, n > n_ε, then:
$$\sup_{x\in[a,b]}|f_n(x) - f(x)| < \frac{\varepsilon}{3}. \tag{2.5}$$
Now fix one such n. Using the continuity of fₙ, we can see that there exists δ > 0 such that:
$$|f_n(x) - f_n(x_0)| < \frac{\varepsilon}{3} \tag{2.6}$$

for any x ∈ [a, b] with |x − x₀| < δ.
To end the proof, we have to show that, given x₀ ∈ [a, b], if x ∈ [a, b] is such that |x − x₀| < δ, then |f(x) − f(x₀)| < ε. By the triangle inequality:
$$|f(x) - f(x_0)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(x_0)| + |f_n(x_0) - f(x_0)|.$$
Observe that:
$$|f(x) - f_n(x)| < \frac{\varepsilon}{3}, \qquad |f_n(x_0) - f(x_0)| < \frac{\varepsilon}{3}, \qquad |f_n(x) - f_n(x_0)| < \frac{\varepsilon}{3},$$
the first two inequalities being due to (2.5), while the third one is due to (2.6). Hence:
$$|f(x) - f(x_0)| < \varepsilon$$
if |x − x₀| < δ; this concludes the proof.
When we are in presence of uniform convergence for a sequence of continuous functions defined on the bounded and closed interval [a, b], the following passage to the limit holds:
$$\lim_{n\to\infty}\int_a^b f_n(x)\,dx = \int_a^b\lim_{n\to\infty} f_n(x)\,dx. \tag{2.7}$$
We can, in fact, state the following Theorem 2.15.

Theorem 2.15. If (fₙ) is a sequence of continuous functions on [a, b], converging uniformly to f(x), then (2.7) holds true.

Proof. From Theorem 2.14, f(x) is continuous; thus, it is Riemann integrable. Now, choose ε > 0, so that n_ε ∈ N exists such that, for n ∈ N, n > n_ε:
$$|f_n(x) - f(x)| < \frac{\varepsilon}{b-a} \quad \text{for any } x\in[a,b]. \tag{2.8}$$
By integration:
$$\left|\int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx\right| \le \int_a^b |f_n(x) - f(x)|\,dx < \frac{\varepsilon}{b-a}\,(b-a) = \varepsilon,$$
which ends the proof.
Remark 2.16. The passage to the limit is sometimes possible under less restrictive hypotheses than those of Theorem 2.15. In the following example, the passage to the limit is possible without uniform convergence. Consider the sequence on [0, 1] given by fₙ(x) = n x (1 − x)ⁿ. For such a sequence, it is:
$$\sup_{x\in[0,1]}|f_n(x)| = f_n\left(\frac{1}{n+1}\right) = \frac{n}{n+1}\left(1-\frac{1}{n+1}\right)^{n},$$
thus fₙ is not uniformly convergent, since it holds:
$$\lim_{n\to\infty}\sup_{x\in[0,1]}|f_n(x)| = \frac{1}{e} \ne 0.$$
On the other hand, it holds that fₙ → 0 pointwise on [0, 1]. Moreover, we can use integration by parts as follows:
$$\lim_{n\to\infty}\int_0^1 f_n(x)\,dx = \lim_{n\to\infty}\int_0^1 n\,x\left(-\frac{(1-x)^{n+1}}{n+1}\right)'dx = \lim_{n\to\infty}\int_0^1\frac{n}{n+1}\,(1-x)^{n+1}\,dx = \lim_{n\to\infty}\frac{n}{(n+1)(n+2)} = 0,$$
and it also holds:
$$\int_0^1\lim_{n\to\infty} f_n(x)\,dx = \int_0^1 0\,dx = 0.$$
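The situation described in this remark can be reproduced numerically (assuming mpmath): the suprema stabilise near 1/e, yet the integrals vanish:

```python
# Remark 2.16: no uniform convergence, but the integrals still tend to 0.
from mpmath import mp, quad, exp, mpf

mp.dps = 15
for n in (10, 100, 1000):
    fn = lambda x, n=n: n * x * (1 - x)**n   # bind n via default argument
    sup = fn(mpf(1) / (n + 1))               # maximiser x = 1/(n+1)
    I = quad(fn, [0, 1])
    print(n, sup, I, mpf(n) / ((n + 1) * (n + 2)))
print(1 / exp(1))   # limit of the suprema, ~0.3678794
```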

Remark 2.17. Consider again the sequences of functions (2.3) and (2.4), defined on [0, 1], with fₙ → 0 pointwise and gₙ ⇒ 0 uniformly. Observing that:
$$\int_0^1 f_n(x)\,dx = \int_0^{\frac1n} n^2\,x\,(1-n\,x)\,dx = \frac16$$
and
$$\int_0^1 g_n(x)\,dx = \int_0^{\frac1n} n\,x^2\,(1-n\,x)\,dx = \frac{1}{12\,n^2},$$
it follows:
$$\lim_{n\to\infty}\int_0^1 f_n(x)\,dx = \frac16 \ne \int_0^1 f(x)\,dx = 0,$$
while:
$$\lim_{n\to\infty}\int_0^1 g_n(x)\,dx = \lim_{n\to\infty}\frac{1}{12\,n^2} = 0 = \int_0^1 g(x)\,dx.$$
In other words, the pointwise convergence of (fₙ) does not permit the passage to the limit, while the uniform convergence of (gₙ) does.
We provide a second example to illustrate, again, that pointwise convergence alone does not allow the passage to the limit.

Example 2.18. Consider the sequence of functions (fₙ) on [0, 1] defined by:
$$f_n(x) = \begin{cases} n^2\,x & \text{if } 0 \le x \le \frac1n, \\[4pt] 2\,n - n^2\,x & \text{if } \frac1n < x \le \frac2n, \\[4pt] 0 & \text{if } \frac2n < x \le 1. \end{cases}$$
Observe that each fₙ is a continuous function. Plots of fₙ are shown in Figure 2.3 for some values of n; it is clear that, pointwise, fₙ(x) → 0 for n → ∞.

[Figure 2.3: Plot of functions fₙ(x), n = 3, …, 6, in Example 2.18. Solid lines are used for even values of n; dotted lines are employed for odd n.]

By construction, though, each triangle in Figure 2.3 has area equal to 1; thus, for any n ∈ N:
$$\int_0^1 f_n(x)\,dx = 1.$$

In conclusion:
$$1 = \lim_{n\to\infty}\int_0^1 f_n(x)\,dx \ne \int_0^1\lim_{n\to\infty} f_n(x)\,dx = 0.$$
In presence of pointwise convergence alone, therefore, swapping integral and limit is not possible.

Uniform convergence leads to a third interesting consequence, connected with the behaviour of sequences of differentiable functions.

Theorem 2.19. Let (fₙ) be a sequence of continuous functions on [a, b]. Assume that each fₙ is differentiable, with continuous derivative, and that:

(i) lim_{n→∞} fₙ(x) = f(x), for any x ∈ [a, b];

(ii) (fₙ') converges uniformly on [a, b].

Then f(x) is differentiable and:
$$\lim_{n\to\infty} f_n'(x) = f'(x).$$

Proof. Define g(x) as:
$$g(x) = \lim_{n\to\infty} f_n'(x),$$
and recall that such a limit is uniform. By Theorems 2.14 and 2.15, g(x) is continuous on [a, b]. A classical result from Calculus states that, for any x ∈ [a, b]:
$$\int_a^x g(t)\,dt = \lim_{n\to\infty}\int_a^x f_n'(t)\,dt = \lim_{n\to\infty}\left(f_n(x) - f_n(a)\right) = f(x) - f(a).$$
This means that f(x) is differentiable and its derivative is g(x).

The hypotheses of Theorem 2.19 are essential, as shown by the following example.

Example 2.20. Consider the sequence (fₙ) on the open interval ]−1, 1[:
$$f_n(x) = \frac{2\,x}{1+n^2\,x^2}, \qquad n\in\mathbb{N}.$$
Observe that fₙ converges to 0 uniformly on ]−1, 1[, since:
$$\sup_{x\in\,]-1,1[}|f_n(x)| = \frac1n \xrightarrow[n\to\infty]{} 0.$$
Function fₙ is differentiable for any n ∈ N and, for any x ∈ ]−1, 1[ and any n ∈ N, the derivative of fₙ with respect to x is:
$$f_n'(x) = \frac{2\,(1-n^2\,x^2)}{(1+n^2\,x^2)^2}.$$
Now, consider the function g : ]−1, 1[ → R:
$$g(x) = \begin{cases} 0 & \text{if } x \ne 0, \\ 2 & \text{if } x = 0. \end{cases}$$
Clearly, fₙ' → g pointwise on ]−1, 1[, but not uniformly; by Theorem 2.14, in fact, uniform convergence of (fₙ') would imply g to be continuous, which it is not, in this case. Here, the hypotheses of Theorem 2.19 are not fulfilled, thus its thesis does not hold.
We end this section with Theorem 2.21, due to Dini¹, and the important Corollary 2.23, a consequence of the Dini Theorem, very useful in many applications. Theorem and corollary connect monotonicity and uniform convergence for a sequence of functions; for their proofs, we refer the Reader to [10].
¹ Ulisse Dini (1845–1918), Italian mathematician and politician.

Theorem 2.21 (Dini). Let (fₙ) be a sequence of continuous functions, converging pointwise to a continuous function f, defined on the interval [a, b]. Furthermore, assume that, for any x ∈ [a, b] and for any n ∈ N, it holds fₙ(x) ≥ fₙ₊₁(x). Then fₙ converges uniformly to f on [a, b].

Remark 2.22. In Theorem 2.21, the hypothesis fₙ(x) ≥ fₙ₊₁(x) can be replaced with the reverse monotonicity assumption fₙ(x) ≤ fₙ₊₁(x), obtaining the same thesis.

Corollary 2.23 (Dini). Let (fₙ) be a sequence of nonnegative, continuous and integrable functions, defined on R, and assume that it converges pointwise to f, which is also nonnegative, continuous and integrable. Suppose further that either 0 ≤ fₙ(x) ≤ fₙ₊₁(x) ≤ f(x) or 0 ≤ f(x) ≤ fₙ₊₁(x) ≤ fₙ(x), for any x ∈ R and any n ∈ N. Then:
$$\lim_{n\to\infty}\int_{-\infty}^{+\infty} f_n(x)\,dx = \int_{-\infty}^{+\infty} f(x)\,dx.$$

Example 2.24. Let us consider an application of Theorem 2.21 and Corollary 2.23. Define fₙ(x) = xⁿ sin(πx), x ∈ [0, 1]. It is immediate to see that, for any x ∈ [0, 1]:
$$\lim_{n\to\infty} x^n\sin(\pi x) = 0.$$
Moreover, since 0 ≤ fₙ₊₁(x) ≤ fₙ(x) for any x ∈ [0, 1], the convergence is uniform and, then:
$$\lim_{n\to\infty}\int_0^1 x^n\sin(\pi x)\,dx = 0.$$

2.4 Series of functions


The process that turns a sequence of real numbers into an infinite series works also for sequences of functions, which extend to series of functions.

Definition 2.25. The series of functions
$$\sum_{n=1}^{\infty} f_n(x) = f_1(x) + f_2(x) + \cdots + f_m(x) + \cdots \tag{2.9}$$
converges in [a, b] if the sequence of its partial sums:
$$s_n(x) = \sum_{k=1}^{n} f_k(x) \tag{2.10}$$
converges in [a, b]. The same applies to uniform convergence, that is, if (2.10) converges uniformly in [a, b], then (2.9) converges uniformly in [a, b].

Remark 2.26. Defining rₙ(x) := f(x) − sₙ(x), then (2.9) converges uniformly in [a, b] if, for any ε > 0, there exists n_ε such that:
$$\sup_{x\in[a,b]}|r_n(x)| < \varepsilon, \quad \text{for any } n > n_\varepsilon.$$

The following Theorem 2.27, due to Weierstrass and usually called the Weierstrass M–test, establishes a sufficient condition to ensure the uniform convergence of a series of functions.

Theorem 2.27 (Weierstrass M–test). Let (fₙ) be a sequence of functions defined on [a, b]. Assume that, for any n ∈ N, there exists Mₙ ∈ R such that |fₙ(x)| ≤ Mₙ for any x ∈ [a, b]. Moreover, assume convergence of the numerical series:
$$\sum_{n=1}^{\infty} M_n.$$
Then (2.9) converges uniformly in [a, b].



Proof. By the Cauchy criterion of convergence (Theorem 2.9), the series of functions (2.9) converges uniformly if and only if, for any ε > 0, there exists n_ε ∈ N such that:
$$\sup_{x\in[a,b]}\left|\sum_{k=n+1}^{m} f_k(x)\right| < \varepsilon, \quad \text{for any } m > n > n_\varepsilon.$$
In our case, once ε > 0 is fixed, since the numerical series ∑ₙ₌₁^∞ Mₙ converges, there exists n_ε ∈ N such that:
$$\sum_{k=n+1}^{m} M_k < \varepsilon, \quad \text{for any } m > n > n_\varepsilon.$$
Now, we use the triangle inequality:
$$\sup_{x\in[a,b]}\left|\sum_{k=n+1}^{m} f_k(x)\right| \le \sup_{x\in[a,b]}\sum_{k=n+1}^{m}|f_k(x)| \le \sum_{k=n+1}^{m} M_k < \varepsilon.$$
This proves the theorem.

Theorem 2.15 is useful for swapping sum and integral of a series, and Theorem 2.19 for term–by–term differentiability. We now state three further helpful theorems.

Theorem 2.28. If (fₙ) is a sequence of continuous functions on [a, b] and if their series (2.9) converges uniformly on [a, b], then:
$$\sum_{n=1}^{\infty}\int_a^b f_n(x)\,dx = \int_a^b\left(\sum_{n=1}^{\infty} f_n(x)\right)dx. \tag{2.11}$$

Proof. Define:
$$f(x) := \sum_{n=1}^{\infty} f_n(x) = \lim_{n\to\infty}\sum_{m=1}^{n} f_m(x). \tag{2.11a}$$
By Theorem 2.14, the function f(x) is continuous and, by Theorem 2.15:
$$\int_a^b f(x)\,dx = \lim_{n\to\infty}\int_a^b s_n(x)\,dx. \tag{2.11b}$$
Now, using the linearity of the integral:
$$\int_a^b s_n(x)\,dx = \int_a^b\sum_{m=1}^{n} f_m(x)\,dx = \sum_{m=1}^{n}\int_a^b f_m(x)\,dx.$$
The thesis (2.11) then follows.

We state (without proof) a more general result, which is based not on uniform convergence, but only on pointwise convergence and a few other assumptions.

Theorem 2.29. Let (fₙ) be a sequence of functions on an interval I = [a, b] ⊂ R. Assume that each fₙ is both piecewise continuous and integrable on I, and that (2.9) converges pointwise, on I, to a piecewise continuous function f. Moreover, assume convergence of the numerical (positive terms) series:
$$\sum_{n=1}^{\infty}\int_a^b|f_n(x)|\,dx.$$
Then the limit function f is integrable on [a, b] and:
$$\int_a^b f(x)\,dx = \sum_{n=1}^{\infty}\int_a^b f_n(x)\,dx.$$

Example 2.30. The series of functions:
$$\sum_{n=1}^{\infty}\frac{\sin(n\,x)}{n^2} \tag{2.12}$$
converges uniformly on any interval [a, b]. It is, in fact, easy to use the Weierstrass Theorem 2.27, verifying that:
$$\left|\frac{\sin(n\,x)}{n^2}\right| \le \frac{1}{n^2}.$$
Our statement follows from the convergence of the infinite series ∑ₙ₌₁^∞ 1/n², shown in formula (2.71) of § 2.7, later on.
Moreover, if f(x) denotes the sum of the series (2.12), then, due to the uniform convergence:
$$\int_0^{\pi} f(x)\,dx = \int_0^{\pi}\sum_{n=1}^{\infty}\frac{\sin(n\,x)}{n^2}\,dx = \sum_{n=1}^{\infty}\frac{1}{n^2}\int_0^{\pi}\sin(n\,x)\,dx = \sum_{n=1}^{\infty}\frac{1-\cos(n\,\pi)}{n^3}.$$
Now, observe that:
$$1-\cos(n\,\pi) = \begin{cases} 2 & \text{if } n \text{ is odd}, \\ 0 & \text{if } n \text{ is even}. \end{cases}$$
It is thus possible to infer:
$$\int_0^{\pi} f(x)\,dx = \sum_{n=1}^{\infty}\frac{2}{(2\,n-1)^3}.$$
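The term–by–term integration just performed can be checked numerically. The sketch below (assuming mpmath) truncates the series (2.12) at an arbitrary order N, so the two printed values agree only up to the small truncation error:

```python
# Example 2.30: compare the quadrature of the (truncated) sum with
# the series 2/(2n-1)^3.  Assumes mpmath.
from mpmath import mp, quad, nsum, sin, pi, inf

mp.dps = 15
N = 500   # truncation order; the neglected part of the integral is O(1/N^2)
f = lambda x: sum(sin(n*x) / n**2 for n in range(1, N + 1))
lhs = quad(f, [0, pi])
rhs = nsum(lambda n: 2 / (2*n - 1)**3, [1, inf])
print(lhs, rhs)   # both ~2.1036
```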

Theorem 2.31. Assume that the series (2.9), defined on [a, b], converges uniformly, that each fₙ has continuous derivative fₙ'(x) for any x ∈ [a, b], and that the series of the derivatives is uniformly convergent. If f(x) denotes the sum of the series (2.9), then f(x) is differentiable and, for any x ∈ [a, b]:
$$f'(x) = \sum_{n=1}^{\infty} f_n'(x). \tag{2.13}$$
The derivatives at the extreme points a and b are obviously understood as right and left derivatives, respectively.

Proof. We present here the proof given in [24]. Let us denote by g(x), with x ∈ [a, b], the sum of the series of the derivatives fₙ'(x):
$$g(x) = \sum_{n=1}^{\infty} f_n'(x).$$
By Theorem 2.14, the function g(x) is continuous and, by Theorem 2.15, we can integrate term by term on [a, x]:
$$\int_a^x g(\xi)\,d\xi = \sum_{n=1}^{\infty}\int_a^x f_n'(\xi)\,d\xi = \sum_{n=1}^{\infty}\left(f_n(x) - f_n(a)\right) = \sum_{n=1}^{\infty} f_n(x) - \sum_{n=1}^{\infty} f_n(a), \tag{2.14}$$
where linearity of the sum is used in the last step of the chain of equalities. Now, recalling definition (2.11a) of f(x), formula (2.14) can be rewritten as:
$$\int_a^x g(\xi)\,d\xi = f(x) - f(a). \tag{2.14a}$$
Differentiating both sides of (2.14a), by the Fundamental Theorem of Calculus² we obtain g(x) = f'(x), which means that:
$$f'(x) = g(x) = \sum_{n=1}^{\infty} f_n'(x).$$
Hence, the proof is complete.
² See, for example, mathworld.wolfram.com/FundamentalTheoremsofCalculus.html

Uniform convergence of series of functions satisfies the linearity properties expressed by the following Theorem 2.32, whose proof is left as an exercise.

Theorem 2.32. Given two series of functions, uniformly convergent in [a, b]:
$$\sum_{n=1}^{\infty} f_n(x), \qquad \sum_{n=1}^{\infty} g_n(x),$$
the following series converges uniformly for any α, β ∈ R:
$$\sum_{n=1}^{\infty}\left(\alpha\,f_n(x) + \beta\,g_n(x)\right).$$
Moreover, if h(x) is a continuous function defined on [a, b], then the following series is uniformly convergent:
$$\sum_{n=1}^{\infty} h(x)\,f_n(x).$$

2.5 Power series: radius of convergence


The problem dealt with in this section is as follows. Consider a sequence of real numbers (aₙ)ₙ≥₀ and the function defined by the so–called power series:
$$f(x) = \sum_{n=0}^{\infty} a_n\,(x-x_0)^n. \tag{2.15}$$
Given x₀ ∈ R, it is important to find all the values x ∈ R such that the series of functions (2.15) converges.

Example 2.33. With x₀ = 0, the power series:
$$\sum_{n=0}^{\infty}\frac{\left((2n)!\right)^2}{16^n\,(n!)^4}\,x^n$$
converges for |x| < 1.
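The claim follows from the ratio test of Theorem 2.38, stated later in this section; the short plain–Python sketch below shows the coefficient ratio approaching 1 (exact rational arithmetic avoids floating–point overflow for large n):

```python
# Ratio test for a_n = ((2n)!)^2 / (16^n (n!)^4): the ratio tends to 1,
# so the radius of convergence is r = 1.
from math import factorial
from fractions import Fraction

def a(n):
    return Fraction(factorial(2*n)**2, 16**n * factorial(n)**4)

for n in (10, 100, 1000):
    print(n, float(a(n + 1) / a(n)))   # approaches 1 from below
```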

Remark 2.34. It is not restrictive, up to a translation, to consider the following simplified–form power series, obtained from (2.15) with x₀ = 0:
$$\sum_{n=0}^{\infty} a_n\,x^n. \tag{2.16}$$
Obviously, the choice of x in (2.16) determines the convergence of the series. The following Lemma 2.35 is of some importance.

Lemma 2.35. If (2.16) converges for x = r₀, then, for any 0 ≤ r < |r₀|, it is absolutely and uniformly convergent in [−r, r].


X
Proof. It is assumed the convergence of the numerical series an r0n , that is to say, there
n=0
n
r
exists a positive constant K such that an r0 ≤ K . Since < 1 , then the geometrical
r0
∞ 
X r n

series converges. Now, for any n ≥ 0 and any x ∈ [−r, r] :
r0
n=0

n
n n
an xn = an r0n x ≤ K x ≤ K r .

r0 r0 r0 (2.17)

By Theorem 2.27, inequality (2.17) implies that (2.16) is uniformly convergent. Due to positivity,
the convergence is also absolute.

From Lemma 2.35 follows the fundamental Theorem 2.36, due to Cauchy and Hadamard³, which explains the behaviour of a power series.

Theorem 2.36 (Cauchy–Hadamard). Given the power series (2.16), exactly one of the following alternatives holds:

(i) series (2.16) converges for any x;

(ii) series (2.16) converges only for x = 0;

(iii) there exists a positive number r such that series (2.16) converges for any x ∈ ]−r, r[ and diverges for any x ∈ ]−∞, −r[ ∪ ]r, +∞[.

Proof. Define the set:
$$C := \left\{x\in[\,0,+\infty[\ :\ \sum_{n=0}^{\infty} a_n\,x^n \text{ converges}\right\}.$$
If C = [0, +∞[, then (i) holds. Otherwise, C is bounded. If C = {0}, then (ii) holds. If both (i) and (ii) are not true, then there exists the positive real number r = sup C. Now, choose any y ∈ ]−r, r[ and form ȳ = (|y| + r)/2, so that |y| < ȳ < r. Since ȳ is not an upper bound of C, a number z ∈ C with z ≥ ȳ exists, for which the series:
$$\sum_{n=0}^{\infty} a_n\,z^n$$
converges. As a consequence, by Lemma 2.35, series (2.16) converges for any x ∈ ]−z, z[ and, in particular, the series:
$$\sum_{n=0}^{\infty} a_n\,y^n$$
converges. To end the proof, take |y| > r and assume, by contradiction, that the series:
$$\sum_{n=0}^{\infty} a_n\,y^n$$
is still convergent. If so, using Lemma 2.35, it would follow that series (2.16) converges for any x ∈ ]−|y|, |y|[ and, in particular, it would converge for the number:
$$\frac{|y|+r}{2} > r,$$
which contradicts the assumption r = sup C.
³ Jacques Salomon Hadamard (1865–1963), French mathematician.

Definition 2.37. The interval within which (2.16) converges is called the interval of convergence, and r is called the radius of convergence.

The radius of convergence can be calculated as stated in Theorem 2.38.

Theorem 2.38 (Radius of convergence). Consider the power series (2.16) and assume that the following limit exists:
$$\ell = \lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right|.$$
Then:

(i) if ℓ = ∞, series (2.16) converges only for x = 0;

(ii) if ℓ = 0, series (2.16) converges for all x;

(iii) if ℓ > 0, series (2.16) converges for |x| < 1/ℓ.

Therefore, r = 1/ℓ is the radius of convergence of (2.16).
Proof. Consider the series:
$$\sum_{n=0}^{\infty}|a_n\,x^n| = \sum_{n=0}^{\infty}|a_n|\,|x|^n$$
and apply the ratio test, that is to say, study the limit of the ratio between the (n + 1)–th term and the n–th term of the series:
$$\lim_{n\to\infty}\frac{|a_{n+1}|\,|x|^{n+1}}{|a_n|\,|x|^n} = |x|\lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| = |x|\,\ell.$$
If ℓ = 0, then series (2.16) converges for any x ∈ R, since it holds:
$$\lim_{n\to\infty}\frac{|a_{n+1}|\,|x|^{n+1}}{|a_n|\,|x|^n} = 0 < 1.$$
If ℓ > 0, then:
$$\lim_{n\to\infty}\frac{|a_{n+1}|\,|x|^{n+1}}{|a_n|\,|x|^n} = |x|\,\ell < 1 \iff |x| < \frac{1}{\ell}.$$
Eventually, if ℓ = ∞, series (2.16) does not converge when x ≠ 0, since, for n large enough:
$$\frac{|a_{n+1}|\,|x|^{n+1}}{|a_n|\,|x|^n} > 1,$$
while, for x = 0, series (2.16) reduces to the zero series, which converges trivially.

Example 2.39. The power series (2.18), known as the geometric series, has radius of convergence r = 1:
$$\sum_{n=0}^{\infty} x^n. \tag{2.18}$$

Proof. In (2.18), aₙ = 1 for all n ∈ N, thus:
$$\lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| = 1,$$
which means that series (2.18) converges for −1 < x < 1. At the boundary of the interval of convergence, namely x = 1 and x = −1, the geometric series (2.18) does not converge. In conclusion, the interval of convergence of (2.18) is the open interval ]−1, 1[.

Example 2.40. The power series (2.19) has radius of convergence r = 1:
$$\sum_{n=1}^{\infty}\frac{x^n}{n}. \tag{2.19}$$

Proof. Here, aₙ = 1/n, thus:
$$\lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty}\frac{n}{n+1} = 1,$$
that is, (2.19) converges for −1 < x < 1.

At the boundary of the interval of convergence, (2.19) behaves as follows: when x = 1, it reduces to the divergent harmonic series:
$$\sum_{n=1}^{\infty}\frac{1}{n},$$
while, when x = −1, (2.19) reduces to the convergent alternating series:
$$\sum_{n=1}^{\infty}\frac{(-1)^n}{n}.$$
The interval of convergence of (2.19) is, thus, [−1, 1[.

Example 2.41. Series (2.20), given below, has infinite radius of convergence:
$$\sum_{n=0}^{\infty}\frac{x^n}{n!}. \tag{2.20}$$

Proof. Since aₙ = 1/n! for any n ∈ N, it follows that:
$$\lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty}\frac{1}{n+1} = 0.$$

It is possible to differentiate and integrate power series term by term, as stated in the following Theorem 2.42, which we include for completeness, as it represents a particular case of Theorems 2.28 and 2.31.

Theorem 2.42. Let f(x) be the sum of the power series (2.16), with radius of convergence r. The following results hold:

(i) f(x) is differentiable and, for any |x| < r, it is:
$$f'(x) = \sum_{n=1}^{\infty} n\,a_n\,x^{n-1};$$

(ii) if F(x) is the primitive of f(x) which vanishes for x = 0, then:
$$F(x) = \sum_{n=0}^{\infty}\frac{a_n}{n+1}\,x^{n+1}.$$

The radius of convergence of both the series of f'(x) and that of F(x) is that of f(x).

Power series behave nicely with respect to the usual arithmetical operations, as shown in Theorem 2.43, which states some useful results.

Theorem 2.43. Consider two power series, with radii of convergence r₁ and r₂ respectively:
$$f_1(x) = \sum_{n=0}^{\infty} a_n x^n \,, \qquad f_2(x) = \sum_{n=0}^{\infty} b_n x^n \,. \qquad (2.21)$$
Then r = min{r₁ , r₂} is the radius of convergence of:
$$(f_1 + f_2)(x) = \sum_{n=0}^{\infty} (a_n + b_n)\, x^n \,.$$
If α ∈ R , α ≠ 0 , then r₁ is the radius of convergence of:
$$\alpha f_1(x) = \sum_{n=0}^{\infty} \alpha\, a_n x^n \,.$$

We state, without proof, Theorem 2.44, concerning the product of two power series.

Theorem 2.44. Consider the two power series in (2.21), with radii of convergence r₁ and r₂ respectively. The product of the two power series is defined by the Cauchy formula:
$$\sum_{n=0}^{\infty} c_n x^n \,, \quad \text{where} \quad c_n = \sum_{j=0}^{n} a_j\, b_{n-j} \,, \qquad (2.22)$$
that is:
$$c_0 = a_0 b_0\,, \quad c_1 = a_0 b_1 + a_1 b_0\,, \quad \dots \,, \quad c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_{n-1} b_1 + a_n b_0 \,.$$
Series (2.22) has interval of convergence given by |x| < r = min{r₁ , r₂} , and its sum is the pointwise product f₁(x) f₂(x) .
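
The Cauchy formula (2.22) is easy to implement; the following Python sketch (an illustration of ours, not part of the original text) multiplies the truncated geometric series by itself and recovers the coefficients n + 1 of the series for 1/(1 − x)² :

    # Cauchy product (2.22) of two truncated power series.
    # With a_n = b_n = 1 (geometric series), c_n must equal n + 1,
    # matching the expansion of 1/(1 - x)^2.
    N = 10
    a = [1.0] * (N + 1)
    b = [1.0] * (N + 1)
    c = [sum(a[j] * b[n - j] for j in range(n + 1)) for n in range(N + 1)]
    print(c)   # [1.0, 2.0, 3.0, ..., 11.0]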

2.6 Taylor–MacLaurin series


Our starting point, here, is the Taylor4 formula with Lagrange5 remainder term. Let f : I → R
be a function that admits derivatives of any order at x0 ∈ I . The Taylor–Lagrange Theorem
states that, if x ∈ I , then there exists a real number ξ , between x and x0 , such that:
$$f(x) = P_n\big(f(x), x_0\big) + R_n\big(f(x), x_0\big) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\,(x - x_0)^k + \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x - x_0)^{n+1} \,. \qquad (2.23)$$

Since f has derivatives of any order, we may form the limit of (2.23) as n → ∞ ; a condition is
stated in Theorem 2.45 to detect when the passage to the limit is effective.
Theorem 2.45. If f has derivatives of any order in the open interval I , with x0 , x ∈ I , and if:

$$\lim_{n\to\infty} R_n\big(f(x), x_0\big) = \lim_{n\to\infty} \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x - x_0)^{n+1} = 0 \,,$$
then:
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}\,(x - x_0)^n \,. \qquad (2.24)$$
4 Brook Taylor (1685–1731), English mathematician.
5 Giuseppe Luigi Lagrange (1736–1813), Italian mathematician.

Definition 2.46. A function f (x) , defined on an open interval I , is analytic at x₀ ∈ I , if its Taylor series about x₀ converges to f (x) in some neighborhood of x₀ .

Remark 2.47. Assuming the existence of the derivatives of any order is not enough to infer that a function is analytic and, thus, that it can be represented by a convergent power series. For instance, the function:
$$f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \neq 0 \,, \\ 0 & \text{if } x = 0 \,, \end{cases}$$
has derivatives of any order at x₀ = 0 , but such derivatives are all zero, therefore its Taylor series reduces to the zero function, which coincides with f only at x = 0 . This happens because the Lagrange remainder does not vanish as n → ∞ .

Note that most of the functions of interest to us do not exhibit the behaviour shown in Remark 2.47. The series expansions of the most important, commonly used, functions can be inferred from Equation (2.23), i.e., from the Taylor–Lagrange Theorem, while Theorem 2.45 yields a sufficient condition ensuring that a given function is analytic.

Corollary 2.48. Consider f with derivatives of any order in the interval I = ] a , b [ . Assume that there exist L , M > 0 such that, for any n ∈ N ∪ {0} and for any x ∈ I :
$$\big| f^{(n)}(x) \big| \le M\, L^n \,. \qquad (2.25)$$
Then, for any x₀ ∈ I , function f (x) coincides with its Taylor series in I .

Proof. Assume x > x₀ . The Lagrange remainder for f (x) is given by:
$$R_n\big(f(x), x_0\big) = \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x - x_0)^{n+1} \,,$$
where ξ ∈ (x₀ , x) , which can be written as ξ = x₀ + α(x − x₀) , with 0 < α < 1 . Now, using condition (2.25), it follows:
$$\big| R_n\big(f(x), x_0\big) \big| \le M\, \frac{\big(L\,(b-a)\big)^{n+1}}{(n+1)!} \,.$$
The thesis follows from the limit:
$$\lim_{n\to\infty} \frac{\big(L\,(b-a)\big)^{n+1}}{(n+1)!} = 0 \,.$$

Corollary 2.48, together with Theorem 2.42, allows one to find the power series expansions of the most common elementary functions. Theorem 2.49 concerns a first group of power series, converging for any x ∈ R .

Theorem 2.49. For any x ∈ R , the exponential power series expansion holds:
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} \,. \qquad (2.26)$$

The goniometric and hyperbolic power series expansions also hold:
$$\sin x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} \,, \qquad (2.27) \qquad\qquad \cos x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} \,, \qquad (2.28)$$
$$\sinh x = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!} \,, \qquad (2.29) \qquad\qquad \cosh x = \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!} \,. \qquad (2.30)$$

Proof. First, observe that the general term in each of the five series (2.26)–(2.30) comes from the MacLaurin6 formula.
To show that f (x) = e^x is the sum of the series (2.26), let us use the fact that f^(n)(x) = e^x for any n ∈ N ; in this way, it is possible to infer that, in any interval [a , b] , inequality (2.25) is fulfilled if we take L = 1 and M = max{e^x : x ∈ [a, b]} .
To prove (2.27) and (2.28), in which derivatives of the goniometric functions sin x and cos x are considered, condition (2.25) is immediately verified by taking M = 1 and L = 1 .
Finally, (2.29) and (2.30) are a straightforward consequence of the definition of the hyperbolic functions in terms of the exponential:
$$\cosh x = \frac{e^x + e^{-x}}{2} \,, \qquad \sinh x = \frac{e^x - e^{-x}}{2} \,,$$
together with Theorem 2.43.
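
As a numerical illustration (ours, not part of the original text), the partial sums of (2.26) can be compared with the value of the exponential; the Python sketch below accumulates terms of the series at x = 2 :

    import math

    # Partial sums of the exponential series (2.26) at x = 2.
    x, s, term = 2.0, 0.0, 1.0
    for n in range(25):
        s += term                # add x^n / n!
        term *= x / (n + 1)      # next term: x^(n+1) / (n+1)!
    print(s, math.exp(x))        # both approximately 7.38905609893065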

Theorem 2.50 concerns a second group of power series, converging for |x| < 1 .

Theorem 2.50. If |x| < 1 , the following power series expansions hold:
$$\frac{1}{1-x} = \sum_{n=0}^{\infty} x^n \,, \qquad (2.31) \qquad\qquad \frac{1}{(1-x)^2} = \sum_{n=0}^{\infty} (n+1)\, x^n \,, \qquad (2.32)$$
$$\ln(1-x) = -\sum_{n=0}^{\infty} \frac{x^{n+1}}{n+1} \,, \qquad (2.33) \qquad\qquad \ln(1+x) = \sum_{n=0}^{\infty} (-1)^n\, \frac{x^{n+1}}{n+1} \,, \qquad (2.34)$$
$$\arctan x = \sum_{n=0}^{\infty} (-1)^n\, \frac{x^{2n+1}}{2n+1} \,, \qquad (2.35) \qquad\qquad \ln\!\left(\frac{1+x}{1-x}\right) = 2 \sum_{n=0}^{\infty} \frac{x^{2n+1}}{2n+1} \,. \qquad (2.36)$$
Proof. To prove (2.31), define f (x) = 1/(1 − x) and build the MacLaurin polynomial of order n , which is Pₙ(f (x) , 0) = 1 + x + x² + . . . + xⁿ ; the remainder can thus be estimated directly:
$$R_n\big(f(x), 0\big) = f(x) - P_n\big(f(x), 0\big) = \frac{1}{1-x} - (1 + x + x^2 + \cdots + x^n) = \frac{1}{1-x} - \frac{1-x^{n+1}}{1-x} = \frac{x^{n+1}}{1-x} \,. \qquad (2.37)$$
Assuming |x| < 1 , we see that the remainder vanishes for n → ∞ , thus (2.31) follows.
Identity (2.32) can be proven by employing both formula (2.31) and Theorem 2.44, with aₙ = bₙ = 1 .
To obtain (2.33), the geometric series in (2.31) can be integrated term by term, using Theorem 2.42; in fact, letting |x| < 1 , we can consider the integral:
$$\int_0^x \frac{dt}{1-t} = -\ln(1-x) \,.$$
6 Colin Maclaurin (1698–1746), Scottish mathematician.

Now, from Theorem 2.42 it follows:
$$\int_0^x \sum_{n=0}^{\infty} t^n \, dt = \sum_{n=0}^{\infty} \frac{x^{n+1}}{n+1} \,.$$
Formula (2.33) is then a consequence of formula (2.31). Formula (2.34) can be proven analogously to (2.33), by considering −x instead of x .
To prove (2.35), we use again formula (2.31), with x replaced by −t² , so that we have:
$$\frac{1}{1+t^2} = \sum_{n=0}^{\infty} (-1)^n t^{2n} \,.$$
Integrating and invoking Theorem 2.42, we obtain:
$$\arctan x = \int_0^x \frac{dt}{1+t^2} = \sum_{n=0}^{\infty} (-1)^n\, \frac{x^{2n+1}}{2n+1} \,.$$
Finally, to prove (2.36), let us consider x = t² in formula (2.31), so that:
$$\frac{1}{1-t^2} = \sum_{n=0}^{\infty} t^{2n} \,. \qquad (2.38)$$
Integrating, taking |x| < 1 , and using Theorem 2.42, the following result is obtained:
$$\int_0^x \frac{dt}{1-t^2} = \frac{1}{2} \int_0^x \left( \frac{1}{1+t} + \frac{1}{1-t} \right) dt = \frac{1}{2} \ln\frac{1+x}{1-x} = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{2n+1} \,.$$
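
A quick numerical check of two of these expansions, say (2.34) and (2.35) at x = 0.5 , can be carried out with the following Python sketch (an illustration of ours):

    import math

    # Partial sums (60 terms) of the series for ln(1 + x) and arctan x.
    x = 0.5
    log_sum = sum((-1)**n * x**(n + 1) / (n + 1) for n in range(60))
    atan_sum = sum((-1)**n * x**(2*n + 1) / (2*n + 1) for n in range(60))
    print(log_sum, math.log(1 + x))   # ~0.4054651081
    print(atan_sum, math.atan(x))     # ~0.4636476090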

2.6.1 Binomial series


The role of the so–called binomial series is pivotal. Let us recall the binomial formula (2.39). If
n ∈ N and x ∈ R then:
$$(1+x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k \,, \qquad (2.39)$$
where the binomial coefficient is defined as:
$$\binom{n}{k} = \frac{n \cdot (n-1) \cdots (n-k+1)}{k!} \,. \qquad (2.40)$$
Observe that the right hand side of (2.40) does not require n to be a natural number. Therefore, if α ∈ R and if n ∈ N , the generalized binomial coefficient is defined as:
$$\binom{\alpha}{n} = \frac{\alpha \cdot (\alpha-1) \cdots (\alpha-n+1)}{n!} \,. \qquad (2.41)$$

From (2.41) a useful property of the generalized binomial coefficient can be inferred, to be used later to expand the function f (x) = (1 + x)^α in a power series.

Proposition 2.51. For any α ∈ R and any n ∈ N , the following identity holds:
$$n \binom{\alpha}{n} + (n+1) \binom{\alpha}{n+1} = \alpha \binom{\alpha}{n} \,. \qquad (2.42)$$

Proof. The thesis follows from a straightforward computation:
$$n \binom{\alpha}{n} + (n+1) \binom{\alpha}{n+1} = n \binom{\alpha}{n} + (n+1)\, \frac{\alpha(\alpha-1)\cdots(\alpha-n)}{(n+1)!} = n \binom{\alpha}{n} + \frac{\alpha(\alpha-1)\cdots(\alpha-n)}{n!}$$
$$= n \binom{\alpha}{n} + (\alpha-n)\, \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!} = n \binom{\alpha}{n} + (\alpha-n) \binom{\alpha}{n} = \alpha \binom{\alpha}{n} \,.$$

By using Proposition 2.51, it is possible to prove the so–called generalised Binomial Theorem 2.52.

Theorem 2.52 (Generalised Binomial). For any α ∈ R and |x| < 1 , the following identity holds:
$$(1+x)^{\alpha} = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n \,. \qquad (2.43)$$

Proof. Let us denote by f (x) the sum of the generalised binomial series:
$$f(x) = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n \,,$$
and introduce the function g(x) as follows:
$$g(x) = \frac{f(x)}{(1+x)^{\alpha}} \,.$$
To prove the thesis, let us show that g(x) = 1 for any |x| < 1 . Differentiating g(x) we obtain:
$$g'(x) = \frac{(1+x)\, f'(x) - \alpha\, f(x)}{(1+x)^{\alpha+1}} \,. \qquad (2.44)$$
Moreover, differentiating f (x) term by term, using Theorem 2.42, we get:
$$(1+x)\, f'(x) = (1+x) \sum_{n=1}^{\infty} n \binom{\alpha}{n} x^{n-1} = \sum_{n=1}^{\infty} n \binom{\alpha}{n} x^{n-1} + \sum_{n=1}^{\infty} n \binom{\alpha}{n} x^n$$
$$= \sum_{n=0}^{\infty} (n+1) \binom{\alpha}{n+1} x^n + \sum_{n=0}^{\infty} n \binom{\alpha}{n} x^n = \sum_{n=0}^{\infty} \left[ (n+1) \binom{\alpha}{n+1} + n \binom{\alpha}{n} \right] x^n = \sum_{n=0}^{\infty} \alpha \binom{\alpha}{n} x^n = \alpha\, f(x) \,,$$
where identity (2.42) has been used in the last step. Thus, by (2.44), g'(x) = 0 for any |x| < 1 , which implies that g(x) is a constant function. It follows that g(x) = g(0) = f (0) = 1 , which proves thesis (2.43).
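
Formula (2.43) lends itself to direct numerical verification; in the Python sketch below (ours), the generalized binomial coefficients (2.41) are built through the recursion C(α, n+1) = C(α, n) (α − n)/(n + 1) :

    # Partial sums of the binomial series (2.43), alpha = -0.5, x = 0.3.
    alpha, x = -0.5, 0.3
    coeff, s = 1.0, 0.0
    for n in range(40):
        s += coeff * x**n
        coeff *= (alpha - n) / (n + 1)   # next generalized coefficient
    print(s, (1 + x)**alpha)   # both approximately 0.8770580193070292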
When considering the power series expansion of arcsin, the particular value α = −1/2 turns out to be important. Let us, then, study the generalised binomial coefficient (2.41) corresponding to such an α .

Proposition 2.53. For any n ∈ N , the following identity holds true:
$$\binom{-1/2}{n} = (-1)^n\, \frac{(2n-1)!!}{(2n)!!} \,, \qquad (2.45)$$
in which n!! denotes the double factorial function (or semi–factorial) of n .

Proof. Evaluation of the binomial coefficient yields, for α = −1/2 :
$$\binom{-1/2}{n} = \frac{-\frac12 \left(-\frac12 - 1\right)\left(-\frac12 - 2\right) \cdots \left(-\frac12 - n + 1\right)}{n!} = (-1)^n\, \frac{\frac12 \left(\frac12 + 1\right)\left(\frac12 + 2\right) \cdots \left(\frac12 + n - 1\right)}{n!} = (-1)^n\, \frac{\frac12 \cdot \frac32 \cdot \frac52 \cdots \frac{2n-1}{2}}{n!} \,.$$
Recalling that the double factorial n!! is the product of all integers from 1 to n of the same parity (odd or even) as n , we obtain:
$$\frac{1}{2} \cdot \frac{3}{2} \cdot \frac{5}{2} \cdots \frac{2n-1}{2} = \frac{(2n-1)!!}{2^n} \,.$$
Therefore:
$$\binom{-1/2}{n} = (-1)^n\, \frac{(2n-1)!!}{2^n\, n!} \,.$$
Recalling further that (2n)!! = 2ⁿ n! , thesis (2.45) follows.
The following Corollary 2.54 is a consequence of Proposition 2.53 and of Theorem 2.52.

Corollary 2.54. For any |x| < 1 , using the convention (−1)!! = 1 , it holds:
$$\frac{1}{\sqrt{1-x}} = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, x^n \,, \qquad (2.46)$$
$$\frac{1}{\sqrt{1+x}} = \sum_{n=0}^{\infty} (-1)^n\, \frac{(2n-1)!!}{(2n)!!}\, x^n \,. \qquad (2.47)$$

Formula (2.46), combined with Theorem 2.42, yields the MacLaurin series for arcsin x , while (2.47) gives the series for arcsinh x , as expressed in the following Theorem 2.55.

Theorem 2.55. Considering |x| < 1 and letting (−1)!! = 1 , then:
$$\arcsin x = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, \frac{x^{2n+1}}{2n+1} \,, \qquad (2.48)$$
$$\operatorname{arcsinh} x = \sum_{n=0}^{\infty} (-1)^n\, \frac{(2n-1)!!}{(2n)!!}\, \frac{x^{2n+1}}{2n+1} \,. \qquad (2.49)$$

Proof. For |x| < 1 , we can write:
$$\arcsin x = \int_0^x \frac{dt}{\sqrt{1-t^2}} \,.$$
Using (2.46) with x = t² and applying Theorem 2.42, it follows:
$$\arcsin x = \int_0^x \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, t^{2n} \, dt = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!} \int_0^x t^{2n} \, dt = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, \frac{x^{2n+1}}{2n+1} \,.$$
Equation (2.49) can be proved analogously.



Using the power series (2.46) and the Central Binomial Coefficient formula:
$$\binom{2n}{n} = \frac{2^n\, (2n-1)!!}{n!} \,, \qquad (2.50)$$
provable by induction, a result due to Lehmer7 [26] can be obtained.

Theorem 2.56 (Lehmer). If |x| < 1/4 , then:
$$\frac{1}{\sqrt{1-4x}} = \sum_{n=0}^{\infty} \binom{2n}{n} x^n \,. \qquad (2.51)$$

Proof. Formula (2.50) yields:
$$\sum_{n=0}^{\infty} \binom{2n}{n} x^n = \sum_{n=0}^{\infty} \frac{2^n\, (2n-1)!!}{n!}\, x^n = \sum_{n=0}^{\infty} \frac{4^n\, (2n-1)!!}{2^n\, n!}\, x^n \,.$$
Using again the relation (2n)!! = 2ⁿ n! , it follows:
$$\sum_{n=0}^{\infty} \binom{2n}{n} x^n = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, (4x)^n \,.$$
Finally, equality (2.51) follows from (2.46).
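
Lehmer's formula (2.51) converges geometrically for |x| < 1/4 ; the following Python sketch (ours) checks it at x = 0.2 , using math.comb for the central binomial coefficient:

    from math import comb, sqrt

    # Partial sum of Lehmer's series (2.51) at x = 0.2, where 4x = 0.8.
    x = 0.2
    s = sum(comb(2*n, n) * x**n for n in range(200))
    print(s, 1 / sqrt(1 - 4*x))   # both approximately sqrt(5) = 2.23606...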

2.6.2 The error function


We present here an example of how to deal with the power series expansion of a function which has great importance in Probability theory. In Statistics, it is fundamental to deal with the following definite integral:
$$F(x) = \int_0^x e^{-t^2}\, dt \,. \qquad (2.52)$$
The main issue with the integral (2.52) is that it is not possible to express it by means of the known elementary functions [28]. On the other hand, some probabilistic applications require knowing, at least numerically, the values of the function introduced in (2.52). A way to achieve this goal is integrating by series. Using the power series for the exponential function, it is possible to write:
$$e^{-t^2} = \sum_{n=0}^{\infty} (-1)^n\, \frac{t^{2n}}{n!} \,.$$
Since the power series is uniformly convergent, we can invoke Theorem 2.15 and transform the integral (2.52) into a series as:
$$\int_0^x e^{-t^2}\, dt = \sum_{n=0}^{\infty} (-1)^n \int_0^x \frac{t^{2n}}{n!}\, dt = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\, \frac{x^{2n+1}}{2n+1} \,. \qquad (2.53)$$

The error function, used in Statistics, is defined in the following way:
$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt \,. \qquad (2.54)$$
Our previous argument, which led to equation (2.53), shows that the power series expansion of the error function, introduced in (2.54), is:
$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\, \frac{x^{2n+1}}{2n+1} \,. \qquad (2.55)$$
Notice that from Theorem 2.38 it follows that the radius of convergence of the power series (2.55) is infinite.
7 Derrick Henry Lehmer (1905–1991), American mathematician.
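
Since (2.55) converges everywhere, it provides a practical way to tabulate erf; the Python sketch below (ours) compares the truncated series with the library value math.erf:

    import math

    # erf via the power series (2.55), truncated at 30 terms.
    def erf_series(x, terms=30):
        s = sum((-1)**n * x**(2*n + 1) / (math.factorial(n) * (2*n + 1))
                for n in range(terms))
        return 2 * s / math.sqrt(math.pi)

    print(erf_series(1.0), math.erf(1.0))   # both ~0.8427007929497149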

2.6.3 Abel theorem and series summation


We present here an important theorem, due to Abel8 , which explains the behavior of a given power series, with positive radius of convergence, at the boundary of the interval of convergence. In the previous Examples 2.39 and 2.40, we observed different behaviors at the boundary of the convergence interval: they can be explained by Abel Theorem 2.57, for the proof of which we refer to [6].

Theorem 2.57 (Abel). Denote by f (x) the sum of the power series (2.16), in which we assume that the radius of convergence is r > 0 . Assume further that the numerical series $\sum_{n=0}^{\infty} a_n r^n$ converges. Then:
$$\lim_{x \to r^-} f(x) = \sum_{n=0}^{\infty} a_n r^n \,. \qquad (2.56)$$
8 Niels Henrik Abel (1802–1829), Norwegian mathematician.

Proof. The generality of the proof is not affected by the choice r = 1 , as different radii can be reduced to this case by a straightforward change of variable. Let:
$$s_n = \sum_{m=0}^{n-1} a_m \,;$$
then:
$$s = \lim_{n\to\infty} s_n = \sum_{n=0}^{\infty} a_n \,.$$
Now, observe that a₀ = s₁ and aₙ = sₙ₊₁ − sₙ for any n ∈ N . If |x| < 1 , then 1 is the radius of convergence of the power series:
$$\sum_{n=0}^{\infty} s_{n+1}\, x^n \,. \qquad (2.57)$$
To show it, notice that:
$$\lim_{n\to\infty} \frac{s_{n+2}}{s_{n+1}} = \lim_{n\to\infty} \frac{a_{n+1} + s_{n+1}}{s_{n+1}} = 1 \,.$$
When |x| < 1 , series (2.57) can be multiplied by 1 − x , yielding:
$$(1-x) \sum_{n=0}^{\infty} s_{n+1}\, x^n = \sum_{n=0}^{\infty} s_{n+1}\, x^n - \sum_{n=0}^{\infty} s_{n+1}\, x^{n+1} = \sum_{n=0}^{\infty} s_{n+1}\, x^n - \sum_{n=1}^{\infty} s_n\, x^n \qquad (2.58)$$
$$= s_1 + \sum_{n=1}^{\infty} (s_{n+1} - s_n)\, x^n = a_0 + \sum_{n=1}^{\infty} a_n x^n = f(x) \,.$$

To obtain thesis (2.56) we have to show that, for any ε > 0 , there exists δ_ε > 0 such that |f (x) − s| < ε for any x such that 1 − δ_ε < x < 1 . From (2.58), and using formula (2.31) for the sum of the geometric series, we have:



$$f(x) - s = (1-x) \sum_{n=0}^{\infty} s_{n+1}\, x^n - s = (1-x) \sum_{n=0}^{\infty} s_{n+1}\, x^n - s\,(1-x) \sum_{n=0}^{\infty} x^n$$
$$= (1-x) \sum_{n=0}^{\infty} s_{n+1}\, x^n - (1-x) \sum_{n=0}^{\infty} s\, x^n = (1-x) \sum_{n=0}^{\infty} (s_{n+1} - s)\, x^n \,. \qquad (2.59)$$

Now, fixed ε > 0 , there exists n_ε ∈ N such that |sₙ₊₁ − s| < ε/2 for any n ∈ N , n > n_ε ; therefore, using the triangle inequality in (2.59), the following holds for 0 < x < 1 :
$$|f(x) - s| = \left| (1-x) \sum_{n=0}^{n_{\varepsilon}} (s_{n+1} - s)\, x^n + (1-x) \sum_{n=n_{\varepsilon}+1}^{\infty} (s_{n+1} - s)\, x^n \right|$$
$$\le (1-x) \sum_{n=0}^{n_{\varepsilon}} |s_{n+1} - s|\, |x|^n + \frac{\varepsilon}{2}\, (1-x) \sum_{n=n_{\varepsilon}+1}^{\infty} |x|^n \qquad (2.60)$$
$$\le (1-x) \sum_{n=0}^{n_{\varepsilon}} |s_{n+1} - s|\, |x|^n + \frac{\varepsilon}{2} \le (1-x) \sum_{n=0}^{n_{\varepsilon}} |s_{n+1} - s| + \frac{\varepsilon}{2} \,.$$

Observing that the function:
$$x \mapsto (1-x) \sum_{n=0}^{n_{\varepsilon}} |s_{n+1} - s|$$
is continuous and vanishes for x = 1 , it is possible to choose δ ∈ ]0 , 1[ such that, if 1 − δ < x < 1 , we have:
$$(1-x) \sum_{n=0}^{n_{\varepsilon}} |s_{n+1} - s| < \frac{\varepsilon}{2} \,.$$

Thesis (2.56) thus follows.

Theorem 2.57 allows one to compute, readily, the sum of many interesting series.

Example 2.58. Recalling the power series expansion (2.34), from Theorem 2.57, with x = 1 , it follows:
$$\ln 2 = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \,.$$

Example 2.59. Recalling the power series expansion (2.35), Theorem 2.57, with x = 1 , allows finding the sum of the Leibnitz–Gregory9 series:
$$\frac{\pi}{4} = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} \,.$$
9 James Gregory (1638–1675), Scottish mathematician and astronomer. Gottfried Wilhelm von Leibnitz (1646–1716), German mathematician and philosopher.

Example 2.60. Recalling the particular binomial expansion (2.47), Abel Theorem 2.57 implies that, for x = 1 , the following holds:
$$\frac{1}{\sqrt{2}} = \sum_{n=0}^{\infty} (-1)^n\, \frac{(2n-1)!!}{(2n)!!} \,.$$
Using the fact that arccos x = π/2 − arcsin x , it is possible to obtain a second series which gives π .

Example 2.61. Recalling the arcsin expansion (2.48), from Theorem 2.57 it follows:
$$\frac{\pi}{2} = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, \frac{1}{2n+1} \,.$$

Example 2.62. We show here two summation formulæ connecting π to the central binomial coefficients:
$$\sum_{n=0}^{\infty} \frac{\binom{2n}{n}}{4^n\,(2n+1)} = \frac{\pi}{2} \,; \qquad (2.61)$$
$$\sum_{n=0}^{\infty} \frac{\binom{2n}{n}}{16^n\,(2n+1)} = \frac{\pi}{3} \,. \qquad (2.62)$$
The key to show (2.61) and (2.62) lies in the representation (2.50) of the central binomial coefficient, whose insertion in the left hand side of (2.61) leads to the infinite series:
$$\sum_{n=0}^{\infty} \frac{(2n-1)!!}{2^n\, n!\, (2n+1)} \,. \qquad (2.63)$$
We further notice that, from the power expansion of the arcsin function (2.48), it is possible to infer the following equality:
$$\frac{\arcsin\sqrt{x}}{\sqrt{x}} = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{2^n\, n!\, (2n+1)}\, x^n \,. \qquad (2.64)$$
The radius of convergence of the power series (2.64) is 1 ; Abel Theorem 2.57 can thus be applied to arrive at (2.61). It is worth noting that (2.61) can also be obtained using the Lehmer series (2.51), via the change of variable y = 4x and integrating term by term.
A similar argument leads to (2.62); here, the starting point is the following power series expansion, which has, again, radius of convergence r = 1 :
$$\frac{\arcsin x}{x} = \sum_{n=0}^{\infty} \frac{\binom{2n}{n}}{4^n\,(2n+1)}\, x^{2n} \,. \qquad (2.65)$$
Equality (2.62) follows by evaluating formula (2.65) at x = 1/2 .
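
Both summation formulæ can be verified numerically; in the Python sketch below (ours), the quantity t = C(2n, n)/4ⁿ is updated through the ratio (4n + 2)/(4(n + 1)) of consecutive terms:

    from math import pi

    # Numerical check of (2.61) and (2.62); t = C(2n, n) / 4^n.
    s1, s2, t = 0.0, 0.0, 1.0
    for n in range(100000):
        s1 += t / (2*n + 1)
        if n < 60:                           # (2.62) converges fast
            s2 += t / (4.0**n * (2*n + 1))
        t *= (4*n + 2) / (4 * (n + 1))       # next C(2n, n) / 4^n
    print(s1, pi / 2)   # ~1.569 vs 1.5707963...: slow convergence
    print(s2, pi / 3)   # both ~1.0471975511965976

The very different speeds of convergence reflect the fact that (2.61) sits at the boundary of the interval of convergence, while (2.62) is evaluated well inside it.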

2.7 Basel problem


One of the most celebrated problems in Classical Analysis is the Basel Problem, which consists in determining the exact value of the infinite series:
$$\sum_{n=1}^{\infty} \frac{1}{n^2} \,. \qquad (2.66)$$

Mengoli10 originally posed this problem in 1644; it takes its name from Basel, the birthplace of Euler11, who first provided the correct solution π²/6 in [12].
There exist several solutions of the Basel problem; here we present the solution of Choe [7], based on the power series expansion of f (x) = arcsin x , shown in Formula (2.48), as well as on the Abel Theorem 2.57 and on the following integral Formula (2.67), which can be proved by induction on m ∈ N :
$$\int_0^{\pi/2} \sin^{2m+1} t \; dt = \frac{(2m)!!}{(2m+1)!!} \,. \qquad (2.67)$$
The first step towards solving the Basel problem is to observe that, in the sum (2.66), attention can be confined to odd indexes only. Namely, if E denotes the sum of the series (2.66), then E can be computed by considering, separately, the sums over even and odd indexes:
$$\sum_{n=1}^{\infty} \frac{1}{(2n)^2} + \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = E \,.$$
On the other hand:
$$\sum_{n=1}^{\infty} \frac{1}{(2n)^2} = \frac14 \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{E}{4} \,,$$
yielding:
$$\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \frac34\, E \,. \qquad (2.68)$$
Now, observe that E = π²/6 ⟺ (3/4) E = π²/8 . In other words, the Basel problem is equivalent to showing that:
$$\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \frac{\pi^2}{8} \,, \qquad (2.69)$$
whose proof can be found in [7].


Abel Theorem 2.57 applies to the power series (2.48), since we can prove that (2.48) converges for x = 1 , using the Raabe12 test, that is to say, forming:
$$\rho = \lim_{n\to\infty} n \left( \frac{a_n}{a_{n+1}} - 1 \right) ,$$
in which aₙ is the n–th series term, and proving that ρ > 1 . In the case of (2.48), with x = 1 :
$$\rho = \lim_{n\to\infty} n \left( \frac{(2n-1)!!}{(2n)!!\,(2n+1)} \cdot \frac{(2n+2)!!\,(2n+3)}{(2n+1)!!} - 1 \right) = \lim_{n\to\infty} n \left( \frac{2(n+1)(2n+3)}{(2n+1)^2} - 1 \right) = \lim_{n\to\infty} \frac{n\,(6n+5)}{(2n+1)^2} = \frac{3}{2} \,.$$
This implies, also, that the series (2.48) converges uniformly on [−1 , 1] . The change of variable x = sin t in both sides of (2.48) yields, when −π/2 < t < π/2 :
$$t = \sin t + \sum_{n=1}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, \frac{\sin^{2n+1} t}{2n+1} \,. \qquad (2.70)$$

10 Pietro Mengoli (1626–1686), Italian mathematician and clergyman from Bologna.
11 Leonhard Euler (1707–1783), Swiss mathematician and physicist.
12 Joseph Ludwig Raabe (1801–1859), Swiss mathematician.

Integrating (2.70) term by term on the interval [0 , π/2] , and using (2.67), we obtain:
$$\frac{\pi^2}{8} = 1 + \sum_{n=1}^{\infty} \frac{(2n-1)!!}{(2n)!!} \int_0^{\pi/2} \frac{\sin^{2n+1} t}{2n+1}\, dt = 1 + \sum_{n=1}^{\infty} \frac{(2n-1)!!}{(2n)!!} \cdot \frac{(2n)!!}{(2n+1)!!} \cdot \frac{1}{2n+1} = 1 + \sum_{n=1}^{\infty} \frac{1}{(2n+1)^2} = \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} \,.$$
This shows (2.69) and, thus, the Euler summation formula:
$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} \,. \qquad (2.71)$$
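
A numerical glance at (2.69) and (2.71) (an illustration of ours):

    from math import pi

    # Partial sums of the Basel series and of its odd-index part.
    N = 10**6
    basel = sum(1 / n**2 for n in range(1, N + 1))
    odd = sum(1 / (2*n + 1)**2 for n in range(N))
    print(basel, pi**2 / 6)   # ~1.6449331 vs 1.6449341
    print(odd, pi**2 / 8)     # ~1.2337003 vs 1.2337006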

2.8 Extension of elementary functions to the complex field


The set C of complex numbers, as well as the set R of reals, possesses, by virtue of the triangle inequality, the topological structure of a metric space. The theory of convergence of sequences and of sequences of functions with complex values is, therefore, analogous to that of real valued sequences and sequences of functions. As a consequence, it is possible to extend to the complex domain the elementary functions that are representable in terms of convergent power series.

2.8.1 Complex exponential


Let us start by considering the complex exponential. In C , the exponential function is defined in terms of the usual power series, thought of, here, as a function of a variable z ∈ C :

Definition 2.63.
$$e^z := \sum_{n=0}^{\infty} \frac{z^n}{n!} \,. \qquad (2.72)$$

Equations (2.26) and (2.72) only differ in the fact that, in the latter, the argument can be a complex number. Almost all the familiar properties of the exponential still hold, with the one exception of positivity, which makes no sense in the unordered field C . The fundamental property of the complex exponential is stated in the following Theorem 2.64, due to Euler.

Theorem 2.64 (Euler). For any z = x + iy ∈ C , with x , y ∈ R , it holds:

ex+iy = ex (cos y + i sin y) . (2.73)

Proof. Let z = x + iy ∈ C ; then:
$$e^z = e^{x+iy} = e^x \cdot e^{iy} = e^x \left( 1 + \frac{iy}{1!} + \frac{(iy)^2}{2!} + \frac{(iy)^3}{3!} + \frac{(iy)^4}{4!} + \cdots \right)$$
$$= e^x \left[ \left( 1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \cdots \right) + i \left( y - \frac{y^3}{3!} + \frac{y^5}{5!} - \cdots \right) \right] = e^x \left( \cos y + i \sin y \right) .$$

The last step, above, exploits the real power series expansion for the sine and cosine functions
given in (2.27) and (2.28) respectively.
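
Formula (2.73) can be checked at once with Python's complex arithmetic (a sketch of ours):

    import cmath, math

    # Check of (2.73) at z = 1 + 2i.
    x, y = 1.0, 2.0
    lhs = cmath.exp(complex(x, y))
    rhs = math.exp(x) * complex(math.cos(y), math.sin(y))
    print(lhs, rhs)   # both approximately (-1.1312 + 2.4717j)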

The first beautiful consequence of Theorem 2.64 is the famous Euler identity.

Corollary 2.65 (Euler identity).
$$e^{i\pi} + 1 = 0 \,. \qquad (2.74)$$
Proof. First observe that, if x = 0 in (2.73), then it holds, for any y ∈ R :
$$e^{iy} = \cos y + i \sin y \,. \qquad (2.75)$$
Now, with y = π , identity (2.74) follows.

The C–extension of the exponential has an important consequence: the exponential function, when considered as a function C → C , is no longer one–to–one, but is a periodic function. In fact, if z , w ∈ C , then:
$$e^z = e^w \iff z = w + 2n\pi i \,, \quad n \in \mathbb{Z} \,.$$

2.8.2 Complex goniometric hyperbolic functions


Equality (2.75) implies the following formulæ (2.76), again due to Euler and valid for any y ∈ R :
$$\sin y = \frac{e^{iy} - e^{-iy}}{2i} \,, \qquad \cos y = \frac{e^{iy} + e^{-iy}}{2} \,. \qquad (2.76)$$
It is thus possible to use (2.76) to extend the goniometric functions to C .

Definition 2.66. For any z ∈ C , define:
$$\sin z = \frac{e^{iz} - e^{-iz}}{2i} \,, \qquad \cos z = \frac{e^{iz} + e^{-iz}}{2} \,. \qquad (2.77)$$
In essence, for the sine and cosine functions, both in their goniometric and hyperbolic versions, the power series expansions (2.27), (2.28), (2.29) and (2.30) are understood as functions of a complex variable.

2.8.3 Complex logarithm


To define the complex logarithm, it must be taken into account that the C–exponential function is periodic, with period 2πi , thus the C–logarithm is not uniquely determined. With this in mind, we formulate the following definition:

Definition 2.67. If w ∈ C , a logarithm of w is any complex number z ∈ C such that e^z = w .

Remark 2.68. In C , as well as in R , the logarithm of zero is undefined, since, from (2.73), it follows e^z ≠ 0 for any z ∈ C .

Using the polar representation of a complex number, we can represent its logarithms as shown below.

Theorem 2.69. If w = ρ e^{iϑ} is a non–zero complex number, the logarithms of w are given by:
$$\log w = \ln \rho + i\,(\vartheta + 2n\pi) \,, \quad n \in \mathbb{Z} \,. \qquad (2.78)$$
Proof. Let w = e^z and let z = x + iy ; then, we have to solve the equation e^z = ρ e^{iϑ} , with:
$$e^z = e^{x+iy} = e^x\, e^{iy} = e^x (\cos y + i \sin y) \,, \qquad \rho\, e^{i\vartheta} = \rho\,(\cos\vartheta + i\sin\vartheta) \,,$$
from which the real and imaginary components of z are obtained:
$$x = \ln \rho \,, \quad \rho > 0 \,, \qquad y = \vartheta + 2n\pi \,.$$
Since log w = z , thesis (2.78) follows.

Among the infinitely many logarithms of a complex number, we pin down one, corresponding to the most convenient argument.

Definition 2.70. Consider w = ρ e^{iϑ} , w ≠ 0 . The main argument of w is ϑ , with −π < ϑ ≤ π , and it is referred to as arg(w) . Note, also, that ρ = |w| . Then, the principal determination of the logarithm of w is:
$$\operatorname{Log} w = \ln \rho + i\,\vartheta = \ln |w| + i \arg(w) \,.$$

Example 2.71. Compute Log(−1) . Here, w = ρ e^{iϑ} , with ρ = |−1| and ϑ = arg(−1) . Since ln 1 = 0 and arg(−1) = π , we obtain Log(−1) = iπ .

In other words, for a non–zero complex w , the principal determination (or principal value) Log w is the logarithm whose imaginary part lies in the interval (−π , π] .

We end this section introducing the complex power.

Definition 2.72. Given z ∈ C , z ≠ 0 , and w ∈ C , the complex power function is defined as:
$$z^w = e^{w \operatorname{Log} z} \,.$$

Example 2.73. Compute i^i . Applying Definition 2.72: i^i = e^{i Log i} . Since arg(i) = π/2 and |i| = 1 , then Log i = iπ/2 . Finally, i^i = e^{i · iπ/2} = e^{−π/2} .
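
The same computation can be reproduced with Python's cmath, whose log function returns the principal determination (a sketch of ours):

    import cmath, math

    # i^i = exp(i Log i) = e^{-pi/2}.
    print(cmath.exp(1j * cmath.log(1j)))   # (0.20787957635076193+0j)
    print(math.exp(-math.pi / 2))          # 0.20787957635076193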
2
Example 2.74. In C , it is possible to solve equations like sin z = 2 , obviously finding complex solutions. From the sin definition (2.77), in fact, we obtain:
$$e^{2iz} - 4i\, e^{iz} - 1 = 0 \,.$$
Thus:
$$e^{iz} = \left( 2 \pm \sqrt{3} \right) i \,.$$
Evaluating the logarithms, the following solutions are found:
$$z = \frac{\pi}{2} + 2n\pi - i \ln\!\left( 2 \pm \sqrt{3} \right) , \quad n \in \mathbb{Z} \,.$$

2.9 Exercises
2.9.1 Solved exercises
1. Given the following sequence of functions, establish whether it is pointwise and/or uniformly convergent:
$$f_n(x) = \frac{n x + x^2}{n^2} \,, \quad x \in [0, 1] \,.$$

2. Evaluate the pointwise limit of the sequence of functions:
$$f_n(x) = \sqrt[n]{1 + x^n} \,, \quad x \ge 0 \,.$$

3. Show that the following sequence of functions converges pointwise, but not uniformly, to f (x) = 0 :
$$f_n(x) = n\, x\, e^{-n x} \,, \quad x > 0 \,.$$

4. Show that the following sequence of functions converges uniformly to f (x) = 0 :
$$f_n(x) = \frac{\sqrt{1 - x^n}}{n^2} \,, \quad x \in [-1, 1] \,.$$

5. Show that:
$$\sum_{n=1}^{\infty} \frac{1}{n\, 2^n} = \ln 2 \,.$$

6. Evaluate:
$$\lim_{n\to\infty} \int_1^{\infty} \frac{n\, e^{-n x}}{1 + n x}\, dx \,.$$

7. Use the definite integral $\int_0^1 \frac{x(1-x)}{1+x}\, dx$ to prove that:
$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{(n+1)(n+2)} = \frac{3}{2} - \ln 4 \,.$$

8. Let $f_n(x) = \left( 1 + \frac{x^2}{n} \right)^{-n}$ , x ≥ 0 .

a. Show that (fₙ) is pointwise convergent to a function f (x) to be determined.

b. Show that:
$$\lim_{n\to\infty} \int_0^{\infty} f_n(x)\, dx = \int_0^{\infty} f(x)\, dx \,.$$

Solutions to Exercises 2.9.1

1. Sequence fₙ(x) converges pointwise to zero for any x ∈ [0, 1] since:
$$\lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \left( \frac{x}{n} + \frac{x^2}{n^2} \right) = \lim_{n\to\infty} \frac{x}{n} + \lim_{n\to\infty} \frac{x^2}{n^2} = 0 + 0 = 0 \,.$$
To establish whether such a convergence is also uniform, we evaluate:
$$\sup_{x\in[0,1]} |f_n(x) - 0| = \sup_{x\in[0,1]} \frac{n x + x^2}{n^2} = \frac{n+1}{n^2} \,.$$
Observe that:
$$\lim_{n\to\infty} \sup_{x\in[0,1]} |f_n(x) - 0| = \lim_{n\to\infty} \frac{n+1}{n^2} = 0 \,.$$
The uniform convergence on the interval [0, 1] follows.

2. If x = 0 then fₙ(0) = 1 for any n ∈ N .
If 0 < x < 1 , then xⁿ → 0 , and we can infer that there exists n₀ ∈ N such that 1/2 < 1 + xⁿ < 3/2 ; therefore, for any n > n₀ , we have:
$$\sqrt[n]{\frac{1}{2}} < f_n(x) < \sqrt[n]{\frac{3}{2}} \,.$$
Since $\lim_{n\to\infty} \sqrt[n]{1/2} = \lim_{n\to\infty} \sqrt[n]{3/2} = 1$ , we can use the Sandwich Theorem (or Squeeze Theorem13 ) to prove the limit relation, valid for x ≤ 1 (the case x = 1 being immediate, since fₙ(1) = 2^{1/n} → 1):
$$\lim_{n\to\infty} f_n(x) = 1 \,.$$
13 See, for example, mathworld.wolfram.com/SqueezingTheorem.html

Now, examine what happens when x > 1 . First, notice that:
$$f_n(x) = \sqrt[n]{1 + x^n} = \sqrt[n]{x^n \left( \frac{1}{x^n} + 1 \right)} = x\, \sqrt[n]{\frac{1}{x^n} + 1} \,.$$
Recalling that, here, 1/x < 1 , x ≠ 0 , we consider the change of variable t = 1/x and repeat the previous argument (followed in the case of a variable t < 1) to obtain $\lim_{n\to\infty} \sqrt[n]{t^n + 1} = 1$ , that is:
$$\lim_{n\to\infty} \sqrt[n]{\frac{1}{x^n} + 1} = 1 \,.$$
In other words, for x > 1 , we have shown that:
$$\lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \sqrt[n]{1 + x^n} = x \,.$$
Putting everything together, we have proven that:
$$\lim_{n\to\infty} f_n(x) = f(x) \quad \text{where} \quad f(x) = \begin{cases} 1 & \text{if } x \le 1 \,, \\ x & \text{if } x > 1 \,. \end{cases}$$

3. The pointwise limit of the sequence fₙ(x) = n x e^{−nx} is f (x) = 0 , due to the exponential decay of the factor e^{−nx} . To investigate the possible uniform convergence, we consider:
$$\sup_{x>0} |f_n(x) - f(x)| = \sup_{x>0} n\, x\, e^{-n x} \,.$$
Differentiating we find:
$$\frac{d}{dx}\left( n\, x\, e^{-n x} \right) = n\, e^{-n x}\,(1 - n x) \,,$$
showing that x = 1/n is a local maximizer, and the corresponding extremum is:
$$f_n\!\left( \frac{1}{n} \right) = \frac{1}{e} \,.$$
But this implies that the found convergence cannot be uniform, since:
$$\lim_{n\to\infty} \sup_{x>0} |f_n(x) - f(x)| = \frac{1}{e} \neq 0 \,.$$

4. For any x ∈ [−1, 1] and any n ∈ N , it holds that $\sqrt{1 - x^n} \le \sqrt{2}$ , thus:
$$f_n(x) = \frac{\sqrt{1 - x^n}}{n^2} \le \frac{\sqrt{2}}{n^2} \,. \qquad (2.79)$$
Now, observe that inequality (2.79) is independent of x ∈ [−1, 1] : this fact, taking the supremum with respect to x ∈ [−1, 1] , ensures uniform convergence.

5. Consider, for any n ∈ N , the definite integral:
$$\int_0^{1/2} x^{n-1}\, dx = \frac{1}{n\, 2^n} \,.$$
Summing over all positive integers n , we get:
$$\sum_{n=1}^{\infty} \frac{1}{n\, 2^n} = \sum_{n=1}^{\infty} \int_0^{1/2} x^{n-1}\, dx \,.$$
Since the geometric series in the right–hand side above converges uniformly on [0 , 1/2] , we can swap series and integral, obtaining:
$$\sum_{n=1}^{\infty} \frac{1}{n\, 2^n} = \int_0^{1/2} \left( \sum_{n=1}^{\infty} x^{n-1} \right) dx = \int_0^{1/2} \frac{1}{1-x}\, dx \,.$$
The thesis follows by integrating:
$$\int_0^{1/2} \frac{1}{1-x}\, dx = \big[ -\ln(1-x) \big]_{x=0}^{x=1/2} = \ln 2 \,.$$
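
The result of Exercise 5 is easily double-checked numerically (a sketch of ours):

    import math

    # Partial sum of sum 1/(n 2^n); terms beyond n = 50 are negligible.
    s = sum(1 / (n * 2**n) for n in range(1, 51))
    print(s, math.log(2))   # both approximately 0.6931471805599453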

6. Define the (decreasing) function hₙ(x) = n/(1 + nx) , with x ∈ [1, +∞) . Then:
$$h_n'(x) = -\frac{n^2}{(1 + n x)^2} < 0 \,.$$
Since:
$$\lim_{x\to\infty} \frac{n}{1 + n x} = 0 \qquad \text{and} \qquad \sup_{x\in[1,\infty)} \frac{n}{1 + n x} = h_n(1) = \frac{n}{1+n} \,,$$
we can infer that:
$$|h_n(x)| \le \frac{n}{1+n} < 1 \,.$$
Therefore:
$$\left| \frac{n\, e^{-n x}}{1 + n x} \right| < e^{-n x} \le e^{-n} \quad \text{for } x \ge 1 \,,$$
which shows uniform convergence of fₙ to 0 on [1, +∞) . We can now invoke Theorem 2.15 to obtain:
$$\lim_{n\to\infty} \int_1^{\infty} \frac{n\, e^{-n x}}{1 + n x}\, dx = \int_1^{\infty} \lim_{n\to\infty} \frac{n\, e^{-n x}}{1 + n x}\, dx = \int_1^{\infty} 0\, dx = 0 \,.$$

7. First, evaluate the definite integral:
$$\int_0^1 \frac{x(1-x)}{1+x}\, dx = \int_0^1 \left( 2 - x - \frac{2}{x+1} \right) dx = \frac{3}{2} - \ln 4 \,.$$
Then, recall that, for 0 ≤ x < 1 , the geometric series expansion holds:
$$\frac{1}{1+x} = \sum_{m=0}^{\infty} (-x)^m = \sum_{m=0}^{\infty} (-1)^m x^m \,.$$
It is thus possible to integrate term by term, obtaining:
$$\int_0^1 \frac{x(1-x)}{1+x}\, dx = \int_0^1 x(1-x) \sum_{m=0}^{\infty} (-1)^m x^m\, dx = \sum_{m=0}^{\infty} (-1)^m \int_0^1 x^{m+1}\,(1-x)\, dx \,.$$
Now, evaluating the last right–hand side integral, we get:
$$\int_0^1 \frac{x(1-x)}{1+x}\, dx = \sum_{m=0}^{\infty} (-1)^m \left( \frac{1}{m+2} - \frac{1}{m+3} \right) = \sum_{m=0}^{\infty} \frac{(-1)^m}{(m+2)(m+3)} \,.$$
Our statement follows using the change of index n = m + 1 .

8. Observe that:
$$\left( 1 + \frac{x^2}{n} \right)^{-n} = e^{-n \ln\left( 1 + \frac{x^2}{n} \right)} = e^{-n \left( \frac{x^2}{n} + o\left( \frac{x^2}{n} \right) \right)} = e^{-x^2 (1 + o(1))} \,,$$
where we have used ln(1 + t) = t + o(t) when t ≃ 0 . This means that:
$$\lim_{n\to\infty} \left( 1 + \frac{x^2}{n} \right)^{-n} = e^{-x^2} \,.$$
We have thus shown point (a). The second statement follows from Corollary 2.23.

2.9.2 Unsolved exercises


1. Show that the sequence of functions fₙ(x) = x/n , with x ∈ R , converges pointwise to f (x) = 0 , but the convergence is not uniform.
Show also that, on the other hand, when a > 0 , the sequence (fₙ) converges uniformly to f (x) = 0 for x ∈ [−a , a] .

2. Let $f_n(x) = \left( \cos \frac{x}{\sqrt{n}} \right)^n$ , x ∈ R . Show that:

a. fₙ converges pointwise to a non–zero function f (x) to be determined;

b. if a > 0 , the sequence (fₙ) converges uniformly on [−a , a] .

Hint. Consider the sequence gₙ(x) = ln fₙ(x) and use the power series (2.33) and (2.28).

3. Establish whether the sequence of functions (fₙ)ₙ∈N , defined, for x ∈ R , by $f_n(x) = \frac{x + x^2 e^{n x}}{1 + e^{n x}}$ , converges pointwise and/or uniformly.

4. Show that $\sum_{n=1}^{\infty} \frac{1}{n\, 3^n} = \ln \frac{3}{2}$ .

5. Show that $\lim_{n\to\infty} \int_0^1 e^{\frac{x+1}{n}}\, dx = 1$ .

6. Consider the following equality and say if (and why) it is true or false:
$$\lim_{n\to\infty} \int_0^1 \frac{x^4}{x^2 + n^2}\, dx = \int_0^1 \lim_{n\to\infty} \frac{x^4}{x^2 + n^2}\, dx \,.$$

7. Let $f_n(x) = \frac{n\,(x^3 + x)\, e^{-x}}{1 + n x}$ , with x ∈ [0 , 1] .

a. Show that (fₙ) is pointwise convergent to a function f (x) to be determined.

b. Show that, for any x ∈ [0, 1] and for any n ∈ N :
$$|f_n(x) - f(x)| \le \frac{2}{1 + n x} \,.$$

c. Show that, for any a > 0 , the sequence (fₙ) converges uniformly to f on [a , 1] , but the convergence is not uniform on [0 , 1] .

d. Evaluate $\lim_{n\to\infty} \int_0^1 f_n(x)\, dx$ .

8. Use the definite integral:
$$\int_0^1 \frac{1-x}{1-x^4}\, dx$$
to show that:
$$\sum_{n=0}^{\infty} \frac{1}{(4n+1)(4n+2)} = \frac{\pi}{8} + \frac{1}{4} \ln 2 \,.$$
Hint.
$$\frac{1}{(1+x)(1+x^2)} = \frac{1-x}{2\,(1+x^2)} + \frac{1}{2\,(1+x)} \,.$$

9. Use the definite integral:
$$\int_0^1 \frac{1 + \sqrt{x}}{1+x}\, dx$$
to show that:
$$\sum_{n=1}^{\infty} (-1)^{n-1}\, \frac{4n+1}{n\,(2n+1)} = 2 + \ln 2 - \frac{\pi}{2} \,.$$

10. Show that $\int_0^{\infty} \frac{x^5}{e^{x^2} - 1}\, dx = \sum_{n=1}^{\infty} \frac{1}{n^3}$ .

11. Show that $\cos z = 2 \iff z = 2n\pi - i \ln\left( 2 \pm \sqrt{3} \right)$ , n ∈ Z .
3 Multidimensional differential calculus

3.1 Partial derivatives


The most natural way to define derivatives of functions of several variables is to allow only one
variable at a time to move, while freezing the others. Thus, if f : V → R is a function of n
variables, whose domain is the open set V , we define the set {x1 } × · · · × {xj−1 } × [a, b] ×
{xj+1 } × · · · × {xn } , where [a , b] is chosen so to have {x1 } × · · · × {xj−1 } × {t} × {xj+1 } ×
· · · × {xn } ⊂ V for any t ∈ [a , b] . We shall denote the function:

g(t) := f (x1 , . . . , xj−1 , t , xj+1 , . . . , xn )

by
f (x1 , . . . , xj−1 , · , xj+1 , . . . , xn ) .
If g is differentiable at some t₀ ∈ (a , b) , then the first–order partial derivative of f at (x₁ , . . . , xⱼ₋₁ , t₀ , xⱼ₊₁ , . . . , xₙ) , with respect to xⱼ , is defined by:
$$f_{x_j}(x_1, \dots, x_{j-1}, t_0, x_{j+1}, \dots, x_n) := \frac{\partial f}{\partial x_j}(x_1, \dots, x_{j-1}, t_0, x_{j+1}, \dots, x_n) := g'(t_0) \,.$$
Therefore, the partial derivative fxj exists at a point a if and only if the following limit exists:

$$\frac{\partial f}{\partial x_j}(a) := \lim_{h\to 0} \frac{f(a + h\, e_j) - f(a)}{h} \,.$$
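
The limit defining a partial derivative suggests a simple numerical scheme; the Python sketch below (ours) uses a central difference quotient, here on the test function f (x, y) = x²y (our choice), whose exact partial derivatives at (1, 2) are 4 and 1 :

    # Central-difference approximation of f_{x_j}(a).
    def partial(f, a, j, h=1e-6):
        ah, al = list(a), list(a)
        ah[j] += h
        al[j] -= h
        return (f(*ah) - f(*al)) / (2 * h)

    f = lambda x, y: x**2 * y
    print(partial(f, (1.0, 2.0), 0))   # ~4.0, i.e. f_x(1, 2)
    print(partial(f, (1.0, 2.0), 1))   # ~1.0, i.e. f_y(1, 2)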
Higher–order partial derivatives are defined by iteration. For example, when it exists, the second–order partial derivative of f , with respect to xⱼ and xₖ , is defined by:
$$f_{x_j x_k} := \frac{\partial^2 f}{\partial x_k\, \partial x_j} := \frac{\partial}{\partial x_k} \left( \frac{\partial f}{\partial x_j} \right) .$$
Second–order partial derivatives are called mixed when j ≠ k .
Definition 3.1. Let V be a non–empty open subset of Rn , let f : V → R and p ∈ N.

(i) f is said to be C p on V if and only if every k–th order partial derivative of f , with k ≤ p ,
exists and is continuous on V .

(ii) f is said to be C ∞ on V if and only if f is C p for all p ∈ N .

If f is C p on V and q < p , then f is C q on V . The symbol C p (V ) denotes the set of functions


that are C p on an open set V .
For simplicity, in the following we shall state all results for the case m = 1 and n = 2 , denoting
x1 with x and x2 with y . With appropriate changes in notation, the same results hold for any
m, n ∈ N .


Example 3.2. By the Product Rule1 , if f_x and g_x exist, then:
$$\frac{\partial}{\partial x}(f g) = f\, \frac{\partial g}{\partial x} + g\, \frac{\partial f}{\partial x} \,.$$

Example 3.3. By the Mean–Value Theorem2 , if f ( · , y) is continuous on [a, b] and the partial derivative f_x( · , y) exists on (a , b) , then there exists a point c ∈ (a , b) (which may depend on y as well as on a and b) such that:
$$f(b, y) - f(a, y) = (b - a)\, \frac{\partial f}{\partial x}(c, y) \,.$$
In most situations, when dealing with higher–order partial derivatives, the order of computa-
tion of the derivatives is, in some sense, arbitrary. This is expressed by the Clairaut3 –Schwarz
Theorem.

Theorem 3.4 (Clairaut–Schwarz). Assume that V is open in R² , that (a , b) ∈ V and f : V → R . Assume further that f is C¹ on V and that one of the two second–order mixed partial derivatives of f exists on V and is continuous at the point (a, b) . Then, the other second–order mixed partial derivative exists at (a , b) and the following equality is verified:
$$\frac{\partial^2 f}{\partial y\, \partial x}(a, b) = \frac{\partial^2 f}{\partial x\, \partial y}(a, b) \,.$$

The hypotheses of Theorem 3.4 are met if f ∈ C 2 (V ) on V ⊆ R2 , V open.


For functions of n variables, the following Theorem 3.5 holds.

Theorem 3.5. If f is C² on an open subset V of Rⁿ , if a ∈ V , and if j ≠ k , then:
$$\frac{\partial^2 f}{\partial x_j\, \partial x_k}(a) = \frac{\partial^2 f}{\partial x_k\, \partial x_j}(a) \,.$$

Remark 3.6. Existence of partial derivatives does not ensure continuity. As an example, consider:
$$f(x, y) = \begin{cases} \dfrac{x y}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0) \,, \\ 0 & \text{if } (x, y) = (0, 0) \,. \end{cases}$$
This function is not continuous at (0 , 0) , but admits partial derivatives at any (x, y) ∈ R² , since:
$$\lim_{\Delta x \to 0} \frac{f(\Delta x, 0) - f(0, 0)}{\Delta x} = \lim_{\Delta x \to 0} 0 = 0 \qquad \text{and} \qquad \lim_{\Delta y \to 0} \frac{f(0, \Delta y) - f(0, 0)}{\Delta y} = \lim_{\Delta y \to 0} 0 = 0 \,.$$

3.2 Differentiability
In this section, we define what it means for a vector function f to be differentiable at a point
a . Whatever our definition, if f is differentiable at a , then we expect two things:

(1) f will be continuous at a ;

(2) all first–order partial derivatives of f will exist at a .


1 See, for example, mathworld.wolfram.com/ProductRule.html
2 See, for example, mathworld.wolfram.com/MeanValueTheorem.html
3 Alexis Claude Clairaut (1713–1765), French mathematician, astronomer, geophysicist.

To appreciate the following Definition 3.7 of the total derivative of a function of n variables, we consider one peculiar aspect of differentiable functions of one variable. Recall that f : R → R is differentiable at x ∈ R if the following limit is finite, i.e., it is a real number:
$$\lim_{h\to 0} \frac{f(x+h) - f(x)}{h} := f'(x) \,.$$
The definition above is equivalent to the following: f is differentiable at x ∈ R if there exist α ∈ R and a function ω : (−δ, δ) → R , with ω(0) = 0 and $\lim_{h\to 0} \omega(h) = 0$ , such that:
$$f(x+h) = f(x) + \alpha\, h + \omega(h)\, h \,. \qquad (3.1)$$

The definition of differentiability for functions of several variables extends Property (3.1).

Definition 3.7. Let f be a real function of n variables. f is said to be differentiable at a point a ∈ Rⁿ if and only if there exists an open set V ⊆ Rⁿ , such that a ∈ V and f : V → R , and there exists d ∈ Rⁿ such that:
$$\lim_{h\to 0} \frac{f(a + h) - f(a) - d \cdot h}{\|h\|} = 0 \,.$$
The vector d is called the total derivative of f at a .
Theorem 3.8. If f is differentiable at a, then:

(i) f is continuous at a ;

(ii) all first–order partial derivatives of f exist at a ;


 
(iii) $d = \nabla f(a) := \left( \dfrac{\partial f}{\partial x_1}(a)\,, \dots, \dfrac{\partial f}{\partial x_n}(a) \right)$ .

∇f (a) is called the gradient (or nabla) of f at a .

A partial converse of Theorem 3.8 also holds true.

Theorem 3.9. Let V be open in Rn , let a ∈ V and suppose that f : V → R . If all first–order
partial derivatives of f exist in V and are continuous at a , then f is differentiable at a .

The hypotheses of Theorem 3.9 are met if f ∈ C 1 (V ) on V ⊆ Rn , V open.

Theorem 3.10. Let α ∈ R , a ∈ Rⁿ , and assume that f , g : V → R are differentiable at a , V ⊆ Rⁿ being an open set. Then, the functions f + g and α f are differentiable at a , and the following equalities are verified:

(i) ∇(f + g)(a) = ∇f (a) + ∇g(a) ;

(ii) ∇(α f )(a) = α ∇f (a) ;

(iii) ∇(f g)(a) = g(a) ∇f (a) + f (a) ∇g(a) .

Moreover, if g(a) ≠ 0 , then f /g is differentiable at a , and it holds:

(iv) $\nabla\!\left( \dfrac{f}{g} \right)(a) = \dfrac{g(a)\, \nabla f(a) - f(a)\, \nabla g(a)}{g^2(a)}$ .
52 CHAPTER 3. MULTIDIMENSIONAL DIFFERENTIAL CALCULUS

The composition of functions follows, for differentiation, rules similar to those of the one–dimensional case. For instance, the Chain Rule4 holds in the following way. Consider a vector function g : I → Rⁿ , g = (g₁ , . . . , gₙ) , defined on an open interval I ⊆ R , and consider f : g(I) ⊆ Rⁿ → R . If each of the components gⱼ of g is differentiable at t₀ ∈ I , and if f is differentiable at a = (g₁(t₀) , . . . , gₙ(t₀)) , then the composition ϕ(t) := f (g(t)) is differentiable at t₀ , and we have:
$$\varphi'(t_0) = \nabla f(a) \cdot g'(t_0) \,,$$
where · is the dot (inner) product in Rⁿ and:
$$g'(t_0) := \big( g_1'(t_0)\,, \dots, g_n'(t_0) \big) \,.$$
In order to extend the notion of gradient, we introduce the Jacobian5 matrix associated to a
vector–valued function.

Definition 3.11. Let f : Rⁿ → Rᵐ be a function from the Euclidean n–space to the Euclidean m–space. f has m real–valued component functions f₁(x₁ , . . . , xₙ) , . . . , fₘ(x₁ , . . . , xₙ) . If the partial derivatives of the component functions exist, they can be organized in an m–by–n matrix, namely the Jacobian matrix J of f :
$$J = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{pmatrix} := \frac{\partial(f_1, \dots, f_m)}{\partial(x_1, \dots, x_n)} \,.$$
The i–th row of J corresponds to the gradient ∇fᵢ of the i–th component function fᵢ , for i = 1 , . . . , m .

We introduce, now, the class of positive homogeneous functions.

Definition 3.12. A function f : Rⁿ \ {0} → R is positive homogeneous, of degree k , if for any x ∈ Rⁿ \ {0} and any α > 0 :
$$f(\alpha\, x) = \alpha^k f(x) \,.$$

The following Theorem 3.13 is known as the Euler Theorem on homogeneous functions.

Theorem 3.13. If f : Rⁿ \ {0} → R is continuously differentiable, then f is positive homogeneous, of degree k , if and only if:
$$x \cdot \nabla f(x) = k\, f(x) \,.$$
3.3 Maxima and Minima


Definition 3.14. Let V be an open set in Rn , let a ∈ V and suppose that f : V → R . Then:

(i) f (a) is called a local minimum of f if and only if there exists r > 0 such that f (a) ≤ f (x)
for all x ∈ Br (a) , an open ball neighborhood of a ;

(ii) f (a) is called a local maximum of f if and only if there exists r > 0 such that f (a) ≥ f (x)
for all x ∈ Br (a) ;
4 See, for example, mathworld.wolfram.com/ChainRule.html
5 Carl Gustav Jacob Jacobi (1804–1851), German mathematician.

(iii) f (a) is called a local extremum of f if and only if f (a) is a local maximum or a local
minimum of f .

Remark 3.15. If the first–order partial derivatives of f exist at a , and if f (a) is a local extremum of f , then ∇f (a) = 0 .
In fact, the one–dimensional function:
$$g(t) = f(a_1, \dots, a_{j-1}, t, a_{j+1}, \dots, a_n)$$
has a local extremum at t = aⱼ for each j = 1 , . . . , n . Hence, by the one–dimensional theory:
$$\frac{\partial f}{\partial x_j}(a) = g'(a_j) = 0 \,.$$

As in the one–dimensional case, condition ∇f (a) = 0 is necessary but not sufficient for f (a) to
be a local extremum.
Example 3.16. There exist continuously differentiable functions satisfying ∇f (a) = 0 and
such that f (a) is neither a local maximum nor a local minimum.
Consider, for instance, in the case n = 2 , the following function:

$$f(x, y) = y^2 - x^2 \,.$$
It is easy to check that ∇f (0) = 0 , but the origin is a saddle point, as shown in Figure 3.1.

Figure 3.1: Saddle point of the function z = y² − x² .

Let us give a formal definition to such a situation.


Definition 3.17. Let V be open in Rn , let a ∈ V , and let f : V → R be differentiable at a .
Point a is called a saddle point of f if ∇f (a) = 0 and there exists r0 > 0 such that, given
any ρ ∈ (0 , r0 ) , there exist points x , y ∈ Bρ (a) satisfying:

f (x) < f (a) < f (y) .

3.4 Sufficient conditions


To establish sufficient conditions for optimization, we introduce the notion of Hessian6 matrix.
6
Ludwig Otto Hesse (1811–1874), German mathematician.

Definition 3.18. Let V ⊆ Rⁿ be an open set and let f : V → R be a C² function. The Hessian matrix of f at x ∈ V (or, simply, the Hessian) is the symmetric square matrix formed by the second–order partial derivatives of f , evaluated at the point x :
$$H(f)(x) := \left( \frac{\partial^2 f}{\partial x_i\, \partial x_j}(x) \right)_{i,j = 1, \dots, n} .$$

Tests for extrema and saddle points, in the simplest situation of n = 2 , are stated in Theorem
3.19.

Theorem 3.19. Let V be open in R² , consider (a , b) ∈ V , and suppose that f : V → R satisfies ∇f (a, b) = 0 . Suppose further that f ∈ C² and set:
$$D := f_{xx}(a, b)\, f_{yy}(a, b) - \big( f_{xy}(a, b) \big)^2 \,.$$

(i) If D > 0 and f_xx(a , b) > 0 , then f (a, b) is a local minimum.

(ii) If D > 0 and f_xx(a , b) < 0 , then f (a, b) is a local maximum.

(iii) If D < 0 , then (a , b) is a saddle point.

Notice that D is the determinant of the Hessian of f evaluated at (a , b) :
$$D = \det[H(f)(a, b)] \,.$$

Example 3.20. A couple of examples are provided here, and the Reader is invited to verify the
stated results.

(1) Function f (x , y) = x³ + 6xy − 3y² + 2 has a saddle point in (a , b) = (0 , 0) and a local maximum in (a , b) = (−2 , −2) .

(2) Function f (x , y) = x² + y³ − 2xy − y admits a saddle point of coordinates (a , b) = (−1/3 , −1/3) and a local minimum in (a , b) = (1 , 1) .
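
The classification can be reproduced with a computer algebra system; here is a sympy sketch (ours) for the first function of Example 3.20:

    import sympy as sp

    # Critical points and discriminant D of Theorem 3.19 for
    # f(x, y) = x^3 + 6 x y - 3 y^2 + 2.
    x, y = sp.symbols('x y')
    f = x**3 + 6*x*y - 3*y**2 + 2
    crit = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
    D = sp.diff(f, x, 2) * sp.diff(f, y, 2) - sp.diff(f, x, y)**2
    for p in crit:
        print(p, D.subs(p), sp.diff(f, x, 2).subs(p))
    # (0, 0): D = -36 < 0, saddle point
    # (-2, -2): D = 36 > 0 with f_xx = -12 < 0, local maximum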

Tests for extrema and saddle points, in the general situation of n variables, are stated in Theorem 3.21.

Theorem 3.21. In n variables, a critical point x₀ :

(i) is a local minimum for f ∈ C² if, for each k = 1 , . . . , n :
$$\det[\, H_k(f)(x_0)\, ] > 0 \,;$$

(ii) is a local maximum for f ∈ C² if, for each k = 1 , . . . , n :
$$\det[\, (-1)^k H_k(f)(x_0)\, ] > 0 \,,$$

where Hₖ(f ) denotes the leading principal minor of order k of H(f ) .

3.5 Lagrange multipliers


In applications, it is often necessary to optimize functions under some constraints. The Lagrange multipliers Theorem 3.22 provides necessary optimality conditions for a problem of the following kind:
$$\max f(x) \ \text{ subject to } \ g(x) = 0 \qquad \text{or} \qquad \min f(x) \ \text{ subject to } \ g(x) = 0 \,.$$

Theorem 3.22 (Lagrange multipliers – general case). Let m < n , let V be open in Rⁿ , and let f , gⱼ : V → R be C¹ on V , for j = 1 , 2 , . . . , m . Suppose that:
$$\frac{\partial(g_1, \dots, g_m)}{\partial(x_1, \dots, x_n)}$$
has rank m at x₀ ∈ V , where gⱼ(x₀) = 0 for j = 1, 2, . . . , m . Assume further that x₀ is a local extremum for f in the set:
$$M = \{ x \in V : g_j(x) = 0 \,, \ j = 1, \dots, m \} \,.$$
Then, there exist scalars λ₁ , . . . , λₘ such that:
$$\nabla\!\left( f(x_0) - \sum_{k=1}^{m} \lambda_k\, g_k(x_0) \right) = 0 \,.$$
We will limit the proof of the Lagrange multipliers Theorem 3.22 to a two–dimensional context. To this aim, it is first necessary to consider some preliminary results; we will resume the proof in §3.8.

3.6 Mean–Value theorem


We begin with recalling the definition of a segment in the Euclidean space.

Definition 3.23. Given x , y ∈ Rn , the segment joining x and y is defined as:

[x , y] := {z ∈ Rn : z = t x + (1 − t) y , 0 ≤ t ≤ 1} .

The one–dimensional Mean–Value theorem (already met in Example 3.3), also called Lagrange
Mean–Value theorem or First Mean–Value theorem, can be extended to the Euclidean space
Rn .

Theorem 3.24 (Mean–Value). Let A ⊂ Rⁿ , and f : A → R . Consider x , y ∈ Rⁿ such that [x , y] ⊂ A° , the interior of A . Assume that f is differentiable in [x , y] . Then, there exists z ∈ [x , y] such that:
$$f(x) - f(y) = \nabla f(z) \cdot (x - y) \,.$$

Proof. Define ϕ : [0 , 1] → Rⁿ , ϕ(t) = y + t (x − y) . Observe that ϕ is continuous and differentiable for any t ∈ (0 , 1) , with ϕ'(t) = x − y . It follows that g = f ∘ ϕ : [0 , 1] → R is continuous and differentiable in (0 , 1) . We can thus apply the one–dimensional version of the Mean–Value Theorem, to infer the existence of η ∈ (0 , 1) such that:
$$f(x) - f(y) = g(1) - g(0) = g'(\eta) \,.$$
On the other hand, the Chain Rule implies:
$$g'(\eta) = \nabla f(\varphi(\eta)) \cdot \varphi'(\eta) = \nabla f(\varphi(\eta)) \cdot (x - y) \,.$$
Since z = ϕ(η) ∈ [x , y] , Theorem 3.24 is proved.

3.7 Implicit function theorem


The following fundamental step is represented by the implicit function theorem, proved by Dini
in 1878. For simplicity, we provide its proof only in the R2 case, presented in Theorem 3.25, but
its generalization to Rn is quite straightforward, and we state it in Theorem 3.26.

Theorem 3.25 (Implicit Function – case n = 2). Let Ω be an open set in R² , and let f : Ω → R be a C¹ function. Suppose there exists (x₀ , y₀) ∈ Ω such that f (x₀ , y₀) = 0 and f_y(x₀ , y₀) ≠ 0 .
Then, there exist δ , ε > 0 such that, for any x ∈ (x₀ − δ , x₀ + δ) , there exists a unique y = ϕ(x) ∈ (y₀ − ε , y₀ + ε) such that:
$$f(x, y) = 0 \,.$$
Moreover, the function y = ϕ(x) is C¹ in (x₀ − δ , x₀ + δ) and it holds that, for any x ∈ (x₀ − δ , x₀ + δ) :
$$\varphi'(x) = -\frac{f_x(x, \varphi(x))}{f_y(x, \varphi(x))} \,.$$

Proof. Let us assume that f_y(x₀ , y₀) > 0 . Since the function f_y(x , y) is continuous, it is possible to find a ball B_{δ₁}(x₀ , y₀) in which it is verified that (x , y) ∈ B_{δ₁}(x₀ , y₀) ⟹ f_y(x, y) > 0 .
This means that, with an appropriate narrowing of the parameters ε and δ , the function y ↦ f (x , y) can be assumed to be increasing, for any x ∈ (x₀ − δ , x₀ + δ) .
In particular, y ↦ f (x₀ , y) is increasing and, since f (x₀ , y₀) = 0 by assumption, the following inequalities are verified, for ε small enough:
$$f(x_0, y_0 + \varepsilon) > 0 \qquad \text{and} \qquad f(x_0, y_0 - \varepsilon) < 0 \,.$$
Using, again, continuity of f and an appropriate narrowing of δ , we infer that, for any x ∈ (x₀ − δ , x₀ + δ) :
$$f(x, y_0 + \varepsilon) > 0 \qquad \text{and} \qquad f(x, y_0 - \varepsilon) < 0 \,.$$
In conclusion, using continuity of y ↦ f (x , y) and the Bolzano theorem7 on the existence of zeros, we have shown that, for any x ∈ (x₀ − δ , x₀ + δ) , there is a unique y = ϕ(x) ∈ (y₀ − ε , y₀ + ε) such that:
$$f(x, y) = f(x, \varphi(x)) = 0 \,.$$
To prove the second part of Theorem 3.25, we need to show that ϕ(x) is differentiable. To this aim, consider h ∈ R such that x + h ∈ (x₀ − δ , x₀ + δ) . In this way, from the Mean–Value Theorem 3.24, there exists θ ∈ (0, 1) such that:
$$0 = f\big(x + h,\ \varphi(x+h)\big) - f\big(x,\ \varphi(x)\big) = f_x\big(x + \theta h\,,\ \varphi(x) + \theta\,(\varphi(x+h) - \varphi(x))\big)\, h + f_y\big(x + \theta h\,,\ \varphi(x) + \theta\,(\varphi(x+h) - \varphi(x))\big)\, \big(\varphi(x+h) - \varphi(x)\big) \,,$$
thus:
$$\frac{\varphi(x+h) - \varphi(x)}{h} = -\frac{f_x\big(x + \theta h\,,\ \varphi(x) + \theta\,(\varphi(x+h) - \varphi(x))\big)}{f_y\big(x + \theta h\,,\ \varphi(x) + \theta\,(\varphi(x+h) - \varphi(x))\big)} \,.$$
The thesis follows by taking, in the equality above, the limit for h → 0 , observing that, by continuity of ϕ , it is ϕ(x + h) → ϕ(x) , and recalling that f (x , y) is C¹ .

We are now ready to state the Implicit function Theorem 3.26 in the general n–dimensional
case; here, Ω is an open set in Rn × R , thus (x, y) ∈ Ω means that x ∈ Rn and y ∈ R .
Theorem 3.26 (Implicit Function – general case). Let Ω ⊆ Rⁿ × R be open, and let f ∈ C¹(Ω , R) . Assume that there exists (x₀ , y₀) ∈ Ω such that f (x₀ , y₀) = 0 and f_y(x₀ , y₀) ≠ 0 . Then, there exist an open ball B_δ(x₀) , an open interval (y₀ − ε , y₀ + ε) and a function ϕ : B_δ(x₀) → (y₀ − ε , y₀ + ε) , such that:
7 Bernard Placidus Johann Nepomuk Bolzano (1781–1848), Czech mathematician, theologian and philosopher. For the theorem of Bolzano see, for example, mathworld.wolfram.com/BolzanoTheorem.html

(i) B_δ(x₀) × (y₀ − ε , y₀ + ε) ⊂ Ω ;

(ii) (x, y) ∈ B_δ(x₀) × (y₀ − ε , y₀ + ε) ⟹ f_y(x, y) ≠ 0 ;

(iii) for any (x, y) ∈ B_δ(x₀) × (y₀ − ε , y₀ + ε) it holds:
$$f(x, y) = 0 \iff y = \varphi(x) \,;$$

(iv) ϕ ∈ C¹(B_δ(x₀)) and:
$$\varphi_{x_j}(x) = -\frac{f_{x_j}\big(x, \varphi(x)\big)}{f_y\big(x, \varphi(x)\big)} \,.$$

3.8 Proof of Theorem 3.22


We can now prove the multipliers Theorem 3.22; as said before, the proof is given only for the
n = 2 case, presented in Theorem 3.27.

Theorem 3.27 (Lagrange multipliers – case n = 2). Let A ⊂ R2 be open, and let f , g : A → R
be C 1 functions. Consider the subset of A :

M = {(x , y) ∈ A : g(x , y) = 0} .

Assume that ∇g(x , y) ≠ 0 for any (x , y) ∈ M . Assume further that (x₀ , y₀) ∈ M is a maximum or a minimum point of f restricted to M .
Then, there exists λ ∈ R such that:

∇f (x0 , y0 ) = λ∇g(x0 , y0 ) .

Proof. Since ∇g(x₀ , y₀) ≠ 0 , we can assume that g_y(x₀ , y₀) ≠ 0 . Thus, from the Implicit function Theorem 3.25, there exist ε , δ > 0 such that, for x ∈ (x₀ − δ , x₀ + δ) and y = ϕ(x) ∈ (y₀ − ε , y₀ + ε) , it holds:
$$g(x, y) = g\big(x, \varphi(x)\big) = 0 \,.$$
Consider the function x ↦ f (x , ϕ(x)) := h(x) , for x ∈ (x₀ − δ , x₀ + δ) . By assumption, h(x) admits an extremum in x = x₀ , therefore its derivative in x₀ vanishes. Using the Chain Rule, it follows:
$$0 = h'(x_0) = f_x\big(x_0, \varphi(x_0)\big) + f_y\big(x_0, \varphi(x_0)\big)\, \varphi'(x_0) \,. \qquad (3.2)$$
Again, use the Implicit function Theorem 3.25, which gives:
$$\varphi'(x_0) = -\frac{g_x\big(x_0, \varphi(x_0)\big)}{g_y\big(x_0, \varphi(x_0)\big)} \,.$$
Substituting into (3.2), recalling that ϕ(x₀) = y₀ , we get:
$$f_x(x_0, y_0)\, g_y(x_0, y_0) - f_y(x_0, y_0)\, g_x(x_0, y_0) = 0 \,,$$
which can be rewritten as:
$$\det \begin{pmatrix} f_x(x_0, y_0) & f_y(x_0, y_0) \\ g_x(x_0, y_0) & g_y(x_0, y_0) \end{pmatrix} = 0 \,.$$
Since the above determinant is zero, it follows that the rows of the matrix are proportional, implying that there exists λ ∈ R such that:
$$\big( f_x(x_0, y_0)\,, f_y(x_0, y_0) \big) = \lambda\, \big( g_x(x_0, y_0)\,, g_y(x_0, y_0) \big) \,.$$

3.9 Sufficient conditions


The multiplier Theorem 3.22 expresses necessary conditions for the existence of an optimal
solution. Stating sufficient conditions is also important; to such an aim, in the two–dimensional
case, the main tool is the so–called Bordered Hessian.
Suppose we are dealing with the simplest case of constrained optimization, that is, find the
maximum value (max) or the minimum value (min) of f (x , y) under the constraint g(x , y) = 0 .
We form the Lagrangian functional L(x , y , λ) = f (x , y)−λ g(x , y) and, after solving the critical
point system:
 0 0
fx (x , y) − λ gx (x , y) = 0 ,

0 0
fy (x , y) − λ gy (x , y) = 0 ,

g (x , y) = 0 ,

we evaluate:
 00 00 
Lxx Lxy gx
00 00
Λ = det Lxy Lyy gy  .
gx gy 0

Then:

(a) Λ > 0 indicates a maximum value;

(b) Λ < 0 indicates a minimum value.

Example 3.28. An example of interest in Economics concerns the maximization of a production function of Cobb–Douglas kind. The mathematical problem can be modelled as:
$$\max f(x, y) = x^a y^{1-a} \quad \text{subject to} \quad p\, x + q\, y - c = 0 \,, \qquad (3.3)$$
where 0 < a < 1 , and p , q , c > 0 .
In a problem like (3.3), f (x , y) is referred to as the objective function, while, by defining the function w(x , y) = p x + q y − c , the constraint is given by w(x , y) = 0 .
The Lagrangian is L(x , y ; m) = f (x , y) − m w(x , y) . The critical point equations are:
$$\begin{cases} L_x(x, y; m) = a\, x^{a-1} y^{1-a} - m\, p = 0 \,, \\ L_y(x, y; m) = (1-a)\, x^a y^{-a} - m\, q = 0 \,, \\ L_m(x, y; m) = p\, x + q\, y - c = 0 \,. \end{cases}$$
Eliminating m from the first two equations, by subtraction, we obtain the two–by–two linear system in the variables x , y :
$$\begin{cases} (1-a)\, p\, x - a\, q\, y = 0 \,, \\ p\, x + q\, y - c = 0 \,. \end{cases}$$
Solving the 2 × 2 system and recovering m from m = (a x^{a−1} y^{1−a}) p^{−1} , we find the critical point:
$$x = \frac{a\, c}{p} \,, \qquad y = \frac{c\,(1-a)}{q} \,, \qquad m = (1-a)^{1-a}\, a^a\, p^{-a}\, q^{a-1} \,,$$

which is a maximum; the Bordered Hessian, evaluated at the critical point, is in fact:
$$\begin{pmatrix} a(a-1)\,x^{a-2}y^{1-a} & a(1-a)\,x^{a-1}y^{-a} & p \\ a(1-a)\,x^{a-1}y^{-a} & -a(1-a)\,x^{a}y^{-a-1} & q \\ p & q & 0 \end{pmatrix}_{\;x = \frac{ac}{p},\ y = \frac{c(1-a)}{q}}$$
and its determinant is positive:
$$\det = \frac{a^{a-1}\, p^{2-a}\, q^{a+1}}{c\,(1-a)^{a}} > 0 \,.$$
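
The critical point formulas of Example 3.28 can also be verified numerically; the sketch below (ours) uses the illustrative values a = 1/3 , p = 2 , q = 1 , c = 6 , which are our choice and not part of the original text:

    # Plain-Python check of the critical point of Example 3.28.
    a, p, q, c = 1/3, 2.0, 1.0, 6.0
    x, y = a*c/p, c*(1 - a)/q                       # x = 1, y = 4
    m = (1 - a)**(1 - a) * a**a * p**(-a) * q**(a - 1)
    print(a * x**(a - 1) * y**(1 - a) - m*p)        # L_x: ~0
    print((1 - a) * x**a * y**(-a) - m*q)           # L_y: ~0
    print(p*x + q*y - c)                            # constraint: 0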
Example 3.29. In this example, we revert the point of view between constraint and objective function in a problem like (3.3). Here, the idea is to minimize the total cost, fixing the level of production. For the sake of simplicity, we treat the particular two–dimensional problem of finding maxima and minima of f (x , y) = 2x + y , subject to the constraint x^{1/4} y^{3/4} = 1 , x > 0 , y > 0 .
The critical point equations are:
$$\begin{cases} 2 - \dfrac{m}{4}\, \dfrac{y^{3/4}}{x^{3/4}} = 0 \,, \\[1ex] 1 - \dfrac{3m}{4}\, \dfrac{x^{1/4}}{y^{1/4}} = 0 \,, \\[1ex] x^{1/4} y^{3/4} = 1 \,. \end{cases}$$
Eliminating m from the first two equations, by substitution, we obtain the system in the variables x , y :
$$\begin{cases} \dfrac{y}{x} = 6 \,, \\[1ex] x^{1/4} y^{3/4} = 1 \,. \end{cases}$$
Solving this system and recovering m from m = (4 y^{1/4})/(3 x^{1/4}) , the critical point is found:
$$x = 6^{-3/4} \,, \qquad y = 6^{1/4} \,, \qquad m = \frac{4 \times 2^{1/4}}{3^{3/4}} \,.$$
The Bordered Hessian is:
$$\Lambda = \begin{pmatrix} \dfrac{3m\, y^{3/4}}{16\, x^{7/4}} & -\dfrac{3m}{16\, x^{3/4} y^{1/4}} & \dfrac{y^{3/4}}{4\, x^{3/4}} \\[1ex] -\dfrac{3m}{16\, x^{3/4} y^{1/4}} & \dfrac{3m\, x^{1/4}}{16\, y^{5/4}} & \dfrac{3\, x^{1/4}}{4\, y^{1/4}} \\[1ex] \dfrac{y^{3/4}}{4\, x^{3/4}} & \dfrac{3\, x^{1/4}}{4\, y^{1/4}} & 0 \end{pmatrix} .$$
Evaluating Λ at the critical point and computing its determinant:
$$\det \Lambda = -\frac{3 \cdot 3^{1/4}}{2^{3/4}} \,,$$
we see that we have found a minimum.

Example 3.30. The problem presented here is typical in the determination of an optimal investment portfolio in Corporate Finance.
We seek to minimize f (x, y, z) = x² + 2y² + 3z² + 2xz + 2yz with the constraints:
$$x + y + z = 1 \,, \qquad 2x + y + 3z = 7 \,.$$
The Lagrangian is:
$$L(x, y, z; m, n) = x^2 + 2y^2 + 3z^2 + 2xz + 2yz - m\,(x + y + z - 1) - n\,(2x + y + 3z - 7) \,,$$
hence the optimality conditions are:
$$\begin{cases} 2x + 2z = m + 2n \,, \\ 4y + 2z = m + n \,, \\ 2x + 2y + 6z = m + 3n \,, \\ x + y + z = 1 \,, \\ 2x + y + 3z = 7 \,. \end{cases}$$
The solution to this 5 × 5 linear system is:
$$x = 0 \,, \quad y = -2 \,, \quad z = 3 \,, \quad m = -10 \,, \quad n = 8 \,.$$
The convexity of the objective function ensures that the found solution is the absolute minimum; though this statement should be proved rigorously, we do not treat it here.
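
The optimality conditions of Example 3.30 form a linear system, which can be solved directly (a numpy sketch of ours):

    import numpy as np

    # 5x5 system of Example 3.30; unknowns ordered as (x, y, z, m, n).
    A = np.array([[2, 0, 2, -1, -2],
                  [0, 4, 2, -1, -1],
                  [2, 2, 6, -1, -3],
                  [1, 1, 1,  0,  0],
                  [2, 1, 3,  0,  0]], dtype=float)
    b = np.array([0, 0, 0, 1, 7], dtype=float)
    print(np.linalg.solve(A, b))   # [ 0. -2.  3. -10.  8.]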
4 First order equations: general theory

Our goal, in introducing ordinary differential equations, is to provide a brief account of methods of explicit integration for the most common types of ordinary differential equations. However, the main theoretical problem, concerning existence and uniqueness of the solution of the Initial Value Problem modelled by (4.3), is not taken for granted. Indeed, the proof of the Picard–Lindelöf Theorem 4.17 is presented in detail: to do this, we will use some notions from the theory of uniform convergence of sequences of functions, already discussed in Theorem 2.15. An abstract approach, followed, for instance, in Chapter 2 of [39], is avoided here.
In the following Chapter 5, we present some classes of ordinary differential equations for which, using suitable techniques, the solution can be described in terms of known functions: in this case, we say that we are able to find an exact solution of the given ordinary differential equation.

4.1 Preliminary notions


Let x be an independent variable, moving on the real axis, and let y be a dependent variable, that is, y = y(x) . Let further y′ , y″ , . . . , y⁽ⁿ⁾ represent the successive derivatives of y with respect to x . An ordinary differential equation (ODE) is any relation of equality involving at least one of those derivatives and the function itself. For instance, the equation below:
$$\frac{dy}{dx}(x) := y'(x) = 2\, x\, y(x) \qquad (4.1)$$
states that the first derivative of the function y equals the product of 2x and y . An additional, implicit statement is that (4.1) holds only for all those x for which both the function and its first derivative are defined.
The term ordinary distinguishes this kind of equation from a partial differential equation, which would involve two or more independent variables, a dependent variable and the corresponding partial derivatives, as in, for example:
$$\frac{\partial f(x, y)}{\partial x} + 4\, x\, y\, \frac{\partial f(x, y)}{\partial y} = x + y \,.$$
The general ordinary differential equation of first order has the form:
F (x, y, y 0 ) = 0 . (4.2)

A function y = y(x) is called a solution of (4.2), on an interval J , if y(x) is differentiable on


J and if the following equality holds for all x ∈ J :
F (x , y(x) , y 0 (x)) ≡ 0 .
In general, we would like to know whether, under certain circumstances, a differential equation
has a unique solution. To accomplish this property, it is usual to consider the so–called Initial
Value Problem (or IVP) which, in the simplest scalar case, takes the form presented in Definition
4.1.

61
62 CHAPTER 4. FIRST ORDER EQUATIONS: GENERAL THEORY

Definition 4.1. Given f : Ω ⊂ R2 → R , being Ω an open set, the initial value problem (also
called Cauchy problem) takes the form:
(
y 0 = f (x , y) , x∈I ,
(4.3)
y(x0 ) = y0 , x 0 ∈ I , y0 ∈ J ,

where I × J ⊆ Ω are intervals, and where we have simply denoted y in place of y(x) .

Remark 4.2. We say that differential equations are studied by quantitative or exact methods
when they can be solved completely, that is to say, all their solutions are known and could be
written in closed form, in terms of elementary functions or, at times, in terms of special functions
(or in terms of inverses of elementary and special functions).

We now provide some examples of ordinary differential equations.

Example 4.3. Let us consider the differential equation:


1
y0 = . (4.4)
x2
If we rewrite equation (4.4) as:  
d 1
y(x) + = 0,
dx x
we see that we are dealing with a function whose derivative is zero. If we seek solutions defined
on an interval, then we can exploit a consequence of the Mean–Value Theorem 3.24 (namely, a
function that is continuous and differentiable on [a , b] and has null first–derivative on (a , b) , is
constant on (a , b)) , to see that:
1
y(x) + = C ,
x
for some constant C and for all x ∈ I , where I is an interval not containing zero. In other
words, as long as we consider the domain of solutions to be an interval like I , any solution of
the differential equation (4.4) takes the form:
1
y(x) = C − , for x ∈ I .
x
By choosing an initial condition, for example y(1) = 5 , a particular value C = 6 is determined,
so that:
1
y(x) = 6 − , for x ∈ I .
x

We can also follow a reverse approach, in the sense that, as illustrated in Example 4.4, given a
geometrical locus, we obtain its ordinary differential equation.

Example 4.4. Consider the family of parabolas of equation:

y = α x2 . (4.5)

Any parabola in the family has the y-axes as common axis, with vertex in the origin. Differen-
tiating, we get:
y0 = 2 α x . (4.6)
Eliminating α from (4.5) and (4.6), we obtain the differential equation:
2y
y0 = . (4.7)
x
This means that any parabola in the family is solution to the differential equation (4.7).
4.1. PRELIMINARY NOTIONS 63

4.1.1 Systems of ODEs: equations of higher order


It is possible to consider differential equations of order higher than one, or systems of many
differential equations of first order.
Example 4.5. The following ordinary differential equations are, respectively, of order 2 and of
order 3 :
x y 00 + 2 y 0 + 3 y − ex = 0 ,
(y (3) )2 + y 00 + y = x .
The second equation is quadratic in the highest derivative y (3) , therefore we say, also, that it
has degree 2.
A system of first–order differential equations is, for example, the following one:
(
y10 = y1 (a − b y2 ) ,
(4.8)
y20 = y2 (c y1 − d) ,
in which y1 = y1 (x) and y2 = y2 (x) are functions of a variable x that, in most applications,
takes the meaning of time. System (4.8) is probably the most famous system of ordinary dif-
ferential equations, as it represents the Lotka-Volterra predator prey system; see, for instance,
[15]. Notice that the left hand–sides in (4.8) are not dependent on x : in this particular case,
the system is called autonomous.
We now state, formally, the definition of Initial Value Problem for a system of n ordinary
differential equations, each of first order, and for a differential equation of order n , with integer
n ≥ 1 in both cases.
Definition 4.6. Consider Ω , open set in R × Rn , with integer n ≥ 1 , and let f : Ω → Rn be
a vector–valued continuous function of (n + 1)–variables. Let further (x0 , y) ∈ Ω and I be an
open interval such that x0 ∈ I .
Then, a vector–valued function s : I → Rn is a solution of the initial value problem:
(
y 0 = f (x , y)
(4.9)
y(x0 ) = y 0
if the following conditions are verified:

(i) s ∈ C 1 (I) ; (iii) s(x0 ) = y 0 ;

(iv) s0 (x) = f (x , s(x)) .



(ii) x , s(x) ∈ Ω for any x ∈ I ;

Remark 4.7. In the Lotka–Volterra  case (4.8), it is n = 2 , thus y = (y1 , y2 ) , the open set
is Ω = R × (0 , +∞) × (0  , +∞) and the continuos function is f (x , y) = f (x , y1 , y2 ) =
y1 (a − b y2 ) , y2 (c y1 − d) .
The rigorous definition of initial value problem for a differential equation of order n is provided
below.
Definition 4.8. Consider an open set Ω ⊆ R × Rn , where n ≥ 1 is integer. Let F : Ω → R
be a scalar continuous function of (n + 1)–variables. Let further (x0 , b) ∈ Ω and I be an open
interval such that x0 ∈ I . Finally, denote b = (b1 , . . . , bn ) .
Then, a real function s : I → R is a solution of the initial value problem:


 y (n) = F (x , y , y 0 , y 00 , · · · , y (n−1) )

y(x0 ) = b1



y 0 (x0 ) = b2 (4.10)

...





 (n−1)
y (x0 ) = bn
64 CHAPTER 4. FIRST ORDER EQUATIONS: GENERAL THEORY

if:

(i) s ∈ C n (I) ;

(ii) x , s(x) , s0 (x) , . . . , s(n−1) (x) ∈ Ω



for any x ∈ I ;

(iii) s(j) (x0 ) = bj+1 , j = 0,1,... ,n − 1;

(iv) s(n) (x) = F x , s(x) , s0 (x) , · · · , s(n−1) (x) .




Definition 4.9. Consider a family of functions y(x ; c1 , . . . , cn ) , depending on x and on n


parameters c1 , . . . , cn , which vary within a set M ⊂ Rn . Such a family is called a complete
integral, or a general solution, of the n–th order equation:

y (n) = F (x , y , y 0 , y 00 , · · · , y (n−1) ) , (4.11)

if it satisfies two requirements:

(1) each function y(x ; c1 , . . . , cn ) is a solution to (4.11)

(2) all solutions to (4.11) can be expressed as functions of the family itself, i.e., they take the
form y(x ; c1 , . . . , cn ) .

Remark 4.10. Systems of first–order differential equations like (4.9) and equations of order n
like (4.10) are intimately related. Given the n-th order equation (4.10), in fact, an equivalent
system can be build, that has form (4.9), by introducing a new vector variable z = (z1 , . . . , zn )
and considering the system of differential equations:


z10 = z2

z20 = z3



... (4.12)


 0
zn−1 = zn



 0
zn = F (x , z1 , z2 , . . . , zn )

with the set of initial conditions: 


z1 (x0 ) = b1 ,

... (4.13)

zn (x0 ) = bn .

System (4.12) can be represented in the vectorial form (4.9), simply by setting z 0 = (z10 , . . . , zn0 ) ,
b = (b1 , . . . , bn ) and:
 
z2

 z3 

f (x , z) = 
 ... .

 zn 
F (x , z1 , z2 , . . . , zn )

Form Remark 4.10, the following Theorem 4.11 can be inferred, whose straightforward proof is
omitted.

Theorem 4.11. Function s is solution of the n–th order initial value problem (4.10) if and
only if the vector function z solves system (4.12), with the initial conditions (4.13).

Remark 4.12. It is also possible to go in the reverse way, that is to say, any system of n
differential equations, of first order, can be transformed into a scalar differential equation of order
4.2. EXISTENCE OF SOLUTIONS: PEANO THEOREM 65

n . We illustrate this procedure with the Lotka-Volterra system (4.8). The first step consists in
computing the second derivative, with respect to x , of the first equation in (4.8):

y10 = y1 (a − b y2 ) =⇒ y100 = y10 (a − b y2 ) − b y1 y20 . (4.8a)

Then, the values of y10 and y20 from (4.8) are inserted in (4.8a), yielding:

y100 = y1 (a − b y2 )2 + b y2 (d − c y1 ) .

(4.8b)

Thirdly, using again the first equation in (4.8), y2 is expressed in terms of y1 and y10 , namely:

a y1 − y10
y10 = y1 (a − b y2 ) =⇒ y2 = . (4.8c)
b y1

Finally, (4.8c) is inserted into (4.8b), which provides the second–order differential equation for
y1 :
y 02
y100 = a y1 − y10 (d − c y1 ) + 1 .

(4.8d)
y1

4.2 Existence of solutions: Peano theorem


In this section, we briefly deal with the problem of the existence of solution for ordinary differ-
ential equations, for which continuity is the only essential hypothesis. The Peano1 Theorem 4.14
on existence is stated, but not demonstrated; the interested Reader is referred to Chapter 2 of
[19].
We first state Peano Theorem in the scalar case.

Theorem 4.13. Consider the rectangle R = [x0 −a , x0 +a]×[y0 −b , y0 +b] , and let f : R → R
be continuos. Then, the initial value problem (4.3) admits at least a solution in a neighborhood
of x0 .

To extend Theorem 4.13 to systems of ordinary differential equations, the rectangle R is replaced
by a parallelepiped, obtained as the Cartesian product of a real interval with an n–dimensional
closed ball.

Theorem 4.14 (Peano). Let us consider the n + 1–dimensional parallelepiped P = [x0 −


a , x0 + a] × B(y 0 , r) , and let f : P → Rn be a continuos function. Then, the initial value
problem (4.9) admits at least a solution in a neighborhood of x0 .

Remark 4.15. Under the sole continuity assumption, a solution needs not to be unique. Con-
sider, for example, the initial value problem:
(
y 0 (x) = 2 |y(x)| ,
p
(4.14)
y(0) = 0 .

The zero function y(x) = 0 is a solution of (4.14), which is solved, though, by function y(x) =
x |x| as well. Moreover, for each pair of real numbers α < 0 < β , the following ϕα ,β (x) function
solves (4.14) too: 
2
−(x − α)
 if x<α,
ϕα ,β (x) = 0 if α ≤ x ≤ β ,


(x − β) 2 if x>β .
In other words, the considered initial value problem admits infinite solutions. This phenomenon
is known as Peano funnel.
1
Giuseppe Peano (1858–1932), Italian mathematician and glottologist.
66 CHAPTER 4. FIRST ORDER EQUATIONS: GENERAL THEORY

4.3 Existence and uniqueness: Picard–Lindelöhf theorem


To ensure existence and uniqueness of the solution to the initial value problem (4.9), a more
restrictive condition than continuity needs to be considered and is presented in Theorem 4.17.
Given the importance of such a theorem, we provide here its proof, though in the scalar case
only; notice that the proof is constructive and turns out useful when trying to evaluate the
solution of the given ordinary differential equation.
The key notion to be introduced is Lipschitz continuity, which may be considered as a kind of
intermediate property, between continuity and differentiability.
For simplicity, we work in a scalar situation; the extension to systems of differential equations
is only technical; some details are provided in § 4.3.2.
We use again R to denote the rectangle:

R = [x0 , x0 + a] × [y0 − b , y0 + b] .

Definition 4.16. Function f : R → R is called uniformly Lipschitz–continuous in y , with


respect to x , if there exists L > 0 such that:

f (x , y1 ) − f (x , y2 ) < L y1 − y2 , for any (x , y1 ) , (x , y2 ) ∈ R . (4.15)

Using the Lipschitz2 continuity property, we prove the Picard–Lindelöhf3 Theorem.

Theorem 4.17 (Picard–Lindelöhf). Let f : R → R be uniformly Lipschitz continuous in y ,


with respect to x , and define:
 b
M = max f , α = min a , . (4.16)
R M
Then, problem (4.3) admits unique solution u ∈ C 1 [x0 , x0 + α] , R .


Proof. The proof is somewhat long, so we present it splitted into four steps.
First step. Let n ∈ N . Define the sequence of functions (un ) by recurrence:

 u0 (x) = y0 ,

Z x

un+1 (x) = y0 +
 f X , un (X) dX .
x0

We want to show that x , un (x) ∈ R for any x ∈ [x0 , x0 + α] . To this aim, it is enough to
prove that, for n ≥ 0 , the following inequality is verified:

|un (x) − y0 | ≤ b , for any x ∈ [x0 , x0 + α] . (4.17)

In the particular case n = 0 , inequality (4.17) is satisfied, since:

|u0 (x) − y0 | = |y0 − y0 | = 0 ≤ b .

In the general case, we have:


Z x

|un+1 (x) − y0 | ≤ f (X , un (X))dX
Z xx0
≤ |f (X , un (X))| dX ≤ M |x − x0 | ≤ M α ≤ b .
x0
2
Rudolf Otto Sigismund Lipschitz (1832–1903), German mathematician.
3
Charles Émile Picard (1856–1941), French mathematician.
Ernst Leonard Lindelöf (1870–1946), Finnish mathematician.
4.3. EXISTENCE AND UNIQUENESS: PICARD–LINDELÖHF THEOREM 67

It is precisely here that we can understand the reason of the peculiar definition (4.16) of the
number α , as such a choice turns out appropriate in correctly defining each (and any) term in
the sequence un . It also highlights the local nature of the solution of the initial value problem
(4.3).
Second step. We now show that (un ) converges uniformly on [x0 , x0 + α] . The identity:
n
X
un = u0 + (u1 − u0 ) + · · · + (un − un−1 ) = u0 + (uk − uk−1 )
k=1

suggests that any sequence (un ) can be thought of as an infinite series: its uniform convergence,
thus, can be proved by showing that the following series (4.18) converges totally on [x0 , x0 + α] :

X
(uk − uk−1 ) . (4.18)
k=1

To prove total convergence, we need to prove, for n ∈ N , the following bound:

Ln−1 |x − x0 |n
|un (x) − un−1 (x)| ≤ M , for any x ∈ [x0 , x0 + α] . (4.19)
n!
We proceed by induction. For n = 1 , the bound is verified, since:

|u1 (x) − u0 (x)| = |u1 (x) − y0 |


Z x Z x

= f (X , y0 ) dX ≤ |f (X , y0 )| dX ≤ M |x − x0 | .
x0 x0

We now prove (4.19) for n + 1 , assuming that it holds true for n . Indeed:
Z x 
 
|un+1 (x) − un (x)| =
f X , un (X) − f X , un−1 (X) dX
Z xx0
 
≤ f X , un (X) − f X , un−1 (X) dX
x0
Z x
≤L |un (X) − un−1 (X)| dX
x0
Ln−1 x
Ln |x − x0 |n+1
Z
≤M |X − x0 |n dX = M .
n! x0 (n + 1)!

Therefore (4.19) is proved and implies that series (4.18) is totally convergent for [x0 , x0 + α] ;
in fact:
∞ ∞ ∞
X X X Ln−1 αn
(un − un−1 ) ≤ sup |un − un−1 | ≤ M
n!
n=1 n=1 [x0 ,x0 +α] n=1

M X (L α)n M
eα L − 1 < +∞ .

= =
L n! L
n=1

Third step. We show that the limit of the sequence of functions (un ) solves the initial value
problem (4.3). From the equality:
 Z x 

lim un+1 (t) = lim y0 + f X , un (X) dX ,
n→∞ n→∞ x0

we obtain, when u = lim un , the fundamental relation:


n→∞
Z x 
u(x) = y0 + f X , u(X) dX , (4.20)
x0
68 CHAPTER 4. FIRST ORDER EQUATIONS: GENERAL THEORY

  
since f X , un (X) ≤ M ensures uniform convergence for f X , un (X) .
n∈N
Now, differentiating both sides of (4.20), we see that u(x) is solution of the initial value problem
(4.3).
Fourth step. We have to prove uniqueness of the solution of (4.3). By contradiction, assume
that v ∈ C 1 ([x0 , x0 + α] , R) solves (4.3) too. Thus:
Z x 
v(x) = y0 + f X , v(X) dX .
x0

As before, it is possible to show that, for any n ∈ N and any x ∈ [x0 , x0 + α] , the following
inequality holds true:
Ln |x − x0 |n
|u(x) − v(x)| ≤ K , (4.21)
n!
where K is given by:
K= max |u(x) − v(x)| .
x∈[x0 ,x0 +α]

Indeed: Z x  
|u(x) − v(x)| ≤ f X , u(X) − f x , v(X) dX ≤ K L |x − x0 | ,
x0

which proves (4.21) for n = 1 . Using induction, if we assume that (4.21) is satisfied for some
n ∈ N , then:
Z x
 
|u(x) − v(x)| ≤ f X , u(X) − f X , v(X) dX
x0
Z x Z x
Ln (X − x0 )n
≤L |u(X) − v(X)| dX ≤ L K dX .
x0 x0 n!

After calculating the last integral in the above inequality chain, we arrive at:

Ln+1 (x − x0 )n+1
|u(x) − v(x)| ≤ K ,
(n + 1)!

which proves (4.21) for index n + 1 . By induction, (4.21) holds true for any n ∈ N .
We can finally end our demonstration of Theorem 4.17. In fact, by taking the limit n → ∞ in
(4.21), we obtain that, for any x ∈ [x0 , x0 + α] , the following inequality is verified:

|u(x) − v(x)| ≤ 0 ,

which shows that u(x) = v(x) for any x ∈ [x0 , x0 + α] .

Remark 4.18. Let us go back to Remark 4.15. In such ap situation, where the initial value
problem (4.14) has multiple solutions, function f (x , y) = 2 |y| does not fulfill the Lipschitz
continuity property. In fact, taking, for istance, y1 , y2 > 0 yields:

|f (y1 ) − f (y2 )| 2
= √ √ ,
|y1 − y2 | y1 − y2

which is unbounded.

The proof of Theorem 4.17, based on successive Picard iterates, is also useful in some simple
situations, where it allows to compute an approximate solution of the initial value problem (4.3).
This is illustrated by Example 4.19.
4.3. EXISTENCE AND UNIQUENESS: PICARD–LINDELÖHF THEOREM 69

Example 4.19. Construct the Picard–Lindelhöf iterates for:


(
y 0 (x) = − 2 x y(x) ,
(4.22)
y(0) = 1 .

The first iterate is y0 (x) = 1 , while subsequent iterates are:


Z x Z x
y1 (x) = y0 (x) + f (t , y0 (t))dt = 1 − 2 t dt = 1 − x2 ,
0 0
Z x Z x
x4
y2 (x) = y0 (x) + f (t , y1 (t))dt = 1 − 2 t (1 − t2 )dt = 1 − x2 + ,
0 0 2
Z x 
t4 x4 x6

2
y3 (x) = 1 − 2 t 1−t + dt = 1 − x2 + − ,
0 4 2 6
Z x 
t4 t6 x4 x6 x8

y4 (x) = 1 − 2 t 1 − t2 + − dt = 1 − x2 + − + ,
0 4 6 2 6 24
and so on. A pattern emerges:

x2 x4 x6 x8 (−1)n x2n
yn (x) = 1 − + − + + ··· + .
1! 2! 3! 4! n!
The sequence of Picard–Lindelhöf iterates converges only if it also converges the series:
m
X (−1)n x2n
y(x) := lim .
m→∞ n!
n=0

Now, recalling the Taylor series for the exponential function:



x
X xn
e = ,
n!
n=0

it follows:

X (−x2 )n 2
y(x) = = e−x .
n!
n=0

We will show later, in Example 5.4, that this function is indeed the solution to (4.22).

We leave it to the Reader, as an exercise, to determine the successive approximations of the


IVP: (
y 0 (x) = y(x) ,
y(0) = 1 .
and to recognise that the solution is the exponential function y(x) = ex .

4.3.1 Interval of existence


The interval of existence of an initial value problem can be defined as the largest interval where
the solution is well defined. This means that the initial point x0 must be within the interval
of existence. In the following, we discuss how to detect such an interval with a theoretical
approach. When the exact solution is available, as in the case of the following Example 4.20, the
determination of the interval of existence is straightforward.

Example 4.20. The initial value problem:


(
y0 = 1 − y2
y(0) = 0
70 CHAPTER 4. FIRST ORDER EQUATIONS: GENERAL THEORY

is solved by the function y(x) = tanh x , so that its interval of existence is R . It is worth noting
that the similar initial value problem:
(
y0 = 1 + y2
y(0) = 0

behaves differently,
 since it is solved by y(x) = tan x and has, therefore, interval of existence
given by − π2 , π2 . Moreover, in this latter case, taking the limit at the boundary of the interval


of existence yields:
limπ y(x) = limπ tan x = ±∞.
x→± 2 x→± 2

This is not a special situation. Even when the interval of existence is bounded, for some theoret-
ical reason that we present later, in detail, the solution can be unbounded; this case is referred
to as a blow-up phenomenon.

4.3.2 Vector–valued differential equations


It is not difficult to adapt the argument presented in Theorem 4.17 to the vector–valued situation
of a function f : Ω → Rn defined on an open set Ω ⊂ R × Rn . In this case, the Lipschitz
continuity condition is:
||f (x , y 1 ) − f (x , y 2 )|| ≤ L ||y 1 − y 2 ||

for any (x , y 1 ) , (x , y 2 ) ∈ R , where the rectangle R is replaced by a cylinder:

R = [x0 , x0 + a] × B b (y 0 ) .

The vector–valued version of the Picard–Lindelöhf Theorem 4.17 is represented by the following
Theorem 4.21, whose proof is omitted, as it is very similar to that of Theorem 4.17.

Theorem 4.21 (Picard–Lindelöhf, vector–valued case). Let f : R → Rn br uniformly Lipschitz


continuous in y , with respect to x , and define:

b
M = max ||f || , α = min {a , }. (4.23)
R M

Then, problem (4.9) admits unique solution u ∈ C 1 ([x0 , x0 + α] , Rn ) .

4.3.3 Solution continuation


To detect the interval of existence, we start by observing that the Picard–Lindelhöf Theorem 4.17
leads to a solution of the IVP (4.3) which is, by construction, local, i.e., it is a solution defined
within a neighborhood of the initial independent data x0 . The radius of this neighborhood
depends on function f (x , y) in different ways: to understand them, we introduce the notion of
joined solutions, as a first (technical) step.

Remark 4.22. Let f : Ω → R be a continuos function, defined on an open set Ω ⊂ R × Rn .


Consider two solutions, y 1 ∈ C 1 ([a , b] , Rn ) and y 2 ∈ C 1 ([b , c] , Rn ) , of the differential equation
y 0 = f (x , y) , such that y 1 (b) = y 2 (b) . Then, function y : [a , c] → Rn defined as:
(
y 1 (x) if x ∈ [a , b]
y(x) =
y 2 (x) if x ∈ (b , c]

is also a solution of y 0 = f (x , y) .
4.3. EXISTENCE AND UNIQUENESS: PICARD–LINDELÖHF THEOREM 71

Function f represents a vector field.


With the Picard–Lindelöhf Theorem 4.21, we can build solutions to initial value problems as-
sociated to y 0 = f (x , y) , choosing the initial data in Ω . In other words, given a point
(x0 , y 0 ) ∈ Ω , we form the IVP (4.9), for which Theorem 4.21 ensures existence of a solution
u(x) in a neighborhood of x0 . If now, in the rectangle R = [x0 , x0 + a] × B b (y 0 ) , we choose
a , b > 0 so that R ⊂ Ω (which is always possible, since Ω is open), then the solution of IVP
(4.9) is defined at least up to the point x1 = x0 + α1 , where the constant α1 > 0 is given by
(4.23).
This allows us to continue and consider a new initial value problem:
(
y 0 = f (x, y)
(4.9a)
y(x1 ) = u(x1 ) := y 1
which is defined at least up to point x2 = x1 + α2 , where constant α2 > 0 is again given by
(4.23).
This procedure can be iterated, leading to the formal Definition 4.23 of maximal domain solution.
The idea of continuation of a solution may be better understood by looking at Figure 4.1.

Rn

y1
y0

x0 x1 R

Figure 4.1: Continuation of a solution

Definition 4.23 (Maximal domain solution). If u ∈ C 1 (I , Rn ) solves the initial value problem
(4.9), we say that u has maximal domain (or that u does not admit a continuation) if there
exists no function v ∈ C 1 (J , Rn ) which also solves (4.9) and such that I ⊂ J .
The existence of the maximal domain solution to IVP (4.9) can be understood euristically, as it
comes from indefinitely repeating the continuation procedure. Establishing it with mathematical
rigor is beyond the aim of these lecture notes, since it would require notions from advanced
theoretical Set theory, such as Zorn’s Lemma4 .
We end this section stating, in Theorem 4.24, a result on the asymptotic behaviour of a solution
with maximal domain, in the particular case where Ω = I × Rn , being I an open interval. Such
a result explains what observed in Example 4.20, though we do not provide a proof of Theorem
4.24.
Theorem 4.24. Let f be defined on an open set I × Rn ⊂ R × Rn . Given (x0 , y 0 ) ∈ I × Rn ,
assume that function y is a maximal domain solution of the initial value problem:
(
y 0 = f (x , y) ,
y(x0 ) = y 0 .
Denote (α , ω) the maximal domain of y . Then, one of two possibility holds respectively for α
and for ω :
4
See, for example, mathworld.wolfram.com/ZornsLemma.html
72 CHAPTER 4. FIRST ORDER EQUATIONS: GENERAL THEORY

(1) it is either α = inf I ,


or α > inf I , implying lim |y(x)| = +∞ ;
x→α+

(2) it is either ω = sup I ,


or ω < sup I , implying lim |y(x)| = +∞ .
x→ω −
5 First order equations: explicit so-
lutions

In the previous Chapter 4 we exposed the general theory, concerning conditions for existence and
uniqueness of an initial value problem. Here, we consider some important particular situations,
in which, due to the structure of certain kind of scalar ordinary differential equations, it is
possible to establish methods to determine their explicit solution

5.1 Separable equations


Definition 5.1. A differential equation is separable if it has the form:
(
y 0 (x) = a(x) b y(x) ,

(5.1)
y(x0 ) = y0 ,

where a(x) and b(y) are continuous functions, respectively defined on intervals Ia and Ib ,
such that x0 ∈ Ia and y0 ∈ Ib .

To obtain existence and uniqueness in the solution of (5.1), we have to assume that b(y0 ) 6= 0 .

Theorem 5.2. If, for any y ∈ Ib , it holds:

b(y) 6= 0 , (5.2)

then the unique solution to (5.1) is function y(x) , defined implicitly by:
Z y Z x
dz
= a(s) ds . (5.3)
y0 b(z) x0

Remark 5.3. The


p hypothesis (5.2) cannot be removed, as shown, for instance, in Remark 4.15,
where b(y) = 2 |y| , which means that b(0) = 0 .

Proof. Introduce the two-variable function:


Z y Z x
dz
F (x , y) := − a(s) ds , (5.3a)
y0 b(z) x0

for which F (x0 , y0 ) = 0, . Recalling (5.2) and since:

∂F (x, y) 1
= ,
∂y b(y)

it follows:
∂F (x0 , y0 ) 1
= 6= 0 .
∂y b(y0 )

73
74 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS

We can thus invoke the Implicit function theorem 3.25 and infer the existence of δ , ε > 0 for
which, given any x ∈ (x0 − δ , x0 + δ) , there is a unique C 1 function y = y(x) ∈ (y0 − ε , y0 + ε)
such that:
F (x, y) = 0
and such that, for any x ∈ (x0 − δ, x0 + δ) :

0 Fx x , y(x) a(x)
y (x) = −  = = a(x) b(y) .
Fy x , y(x) 1
b(y)

Function y , implicitly defined by (5.3), is thus a solution of (5.1); to complete the proof, we
still have to show its uniqueness. Assume that y1 (x) and y2 (x) are both solutions of (5.1), and
define: Z y
dz
B(y) := ;
y0 b(z)
then:
d   y10 (x) y20 (x)
B y1 (x) − B y2 (x) =  − 
dx b y1 (x) b y2 (x)
 
a(x) b y1 (x) a(x) b y2 (x)
=  −  = 0.
b y1 (x) b y2 (x)

Notice that
 we used the fact that both y1 (x) and y2 (x) are assumed to solve (5.1). Thus
B y1 (x) − B y2 (x) is a constant function, and its constant value is zero, since y1 (x0 ) =
y2 (x0 ) = y0 . In other words, we have shown that, for any x ∈ Ia :
 
B y1 (x) − B y2 (x) = 0 ,

which means, recalling the definition of B :


Z y1 (x) Z y2 (x) Z y1 (x)
dz dz dz
0= − = .
y0 b(z) y0 b(z) y2 (x) b(z)

At this point, using the Mean–Value Theorem 3.24, we infer the existence of a number X(x)
between the integration limits y2 (x) and y1 (x) , such that:
1 
 y1 (x) − y2 (x) = 0 .
b X(x)

But, from (5.2), it holds:


1
6= 0 ,
b(X(x))
thus:
y1 (x) − y2 (x) = 0 ,
and the theorem proof is completed.

Example 5.4. Consider once more the IVP studied, using successive approximations, in Ex-
ample 4.19: (
y 0 (x) = −2 x y(x) ,
y(0) = 1 .
Setting a(x) = −2 x , b(y) = y , x0 = 0 , y0 = 1 in (5.3) leads to:
Z y Z x
1 2
dz = (−2 z) dz ⇐⇒ ln y = −x2 ⇐⇒ y(x) = e−x .
1 z 0
5.1. SEPARABLE EQUATIONS 75

In the next couple of examples, some interesting, particular cases of separable equations are
considered.

Example 5.5. The choice b(y) = y in (5.3) yields the particular separable equation:
(
y 0 (x) = a(x) y(x) ,
(5.4)
y(x0 ) = y0 ,

where a(x) is a given continuous function. Using (5.3), we get:


Z y Z x Z x
1 y
dz = a(s) ds =⇒ ln = a(s) ds ,
y0 z x0 y0 x0

thus: Z x
a(s) ds
x0
y = y0 e .
x
For instance, if a(x) = − , the initial value problem:
2
( x
y 0 (x) = − y(x)
2
y(0) = 1

has solution:
x2
y(x) = e− 4 .

Example 5.6. In (5.3), let b(y) = y 2 , which leads to the separable equation:
(
y 0 (x) = a(x) y 2 (x) ,
(5.5)
y(x0 ) = y0 ,

with a(x) continuous function. Using (5.3), we find:


Z y Z x Z x
1 1 1
2
dz = a(s) ds =⇒ − + = a(s) ds ,
y0 z x0 y y0 x0

and, solving with respect to y :


1
y= Z x .
1
− a(s) ds
y0 x0

For instance, if a(x) = −2 x , the initial value problem:


(
y 0 (x) = −2 x y 2 (x)
y(0) = 1

has solution
1
y= .
1 + x2

We now provide some practical examples, recalling that a complete treatment needs, both,
finding the analytical expression of the solution and determining the maximal solution domain.

Example 5.7. Consider equation:


(
y 0 (x) = (x + 1) 1 + y 2 (x) ,


y(−1) = 0 .
76 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS

Using (5.3), we find:


Z y(x) Z x
1
dz = (s + 1) ds ,
0 1 + z2 −1
and, evaluating the integrals:
Z y(x) Z x
1 1
dz = arctan y(x) , (s + 1) ds = (x + 1)2 ,
0 1 + z2 −1 2
the solution is obtained:
(x + 1)2
y(x) = tan.
2
Observe that the solution y(x) is only well–defined for x in a neighborhood of x0 = −1 and
such that:
π (x + 1)2 π
− < < ,
2 2 2
that is, −1 − π < x < −1 + π .
Example 5.8. Solve the initial value problem:
y 0 (x) = x − 1 ,

y(x) + 1
y(0) = 0 .

From (5.3):
Z y(x) Z x
(z + 1)dz = (s − 1)ds ,
0 0
performing the relevant computations, we get:
1 2 1
y (x) + y(x) = x2 − x ,
2 2
so that: (
p x − 2,
y(x) = −1 ± (x − 1)2 =
−x .
Now, recall that x lies in a neighborhood of zero and that the initial condition requires y(0) = 0 ;
it can be inferred, therefore, that y(x) = −x must be chosen. To establish the maximal domain,
observe that x − 1 vanishes for x = 1 ; thus, we infer that x < 1 .
Example 5.9. As a varies in R , investigate the maximal domain of the solutions to the initial
value problem: (
u0 (x) = a 1 + u2 (x) cos x ,


u(0) = 0 .
Form (5.3), to obtain:
Z u(x) Z x
dz
= a cos s ds.
0 1 + z2 0
After performing the relevant computations, we get:
arctan u(x) = a sin x . (5.6)
It is clear that the Range of the right hand–side of (5.6) is [−a , a] . To obtain a solution defined
π
on R , we have to impose that a < . In such a case, solving with respect to u yields:
2
u(x) = tan (a sin x) .
π π
Viceversa, when a ≥ , since there exists x ∈ R+ for which a sin x = , then, the obtained
2 2
solution is defined in (−x , x) , and x is the minimum positive number verifying the equality
π
a sin x = .
2
5.1. SEPARABLE EQUATIONS 77

5.1.1 Exercises
1. Solve the following separable equations:
 x 
ex
 y 0 (x) = sin2 y(x) , y 0 (x) = ,

 y(x) (d) (1 + ex ) cosh y(x)
(a) r
π y(0) = 0 ,


y(0) =
 ,
2
ex

y 0 (x) = 2 x ,

y 0 (x) = ,
(b) cos y(x) (e) (1 + ex ) cos y(x)
y(0) = 0 ,

y(0) = 5π ,

 2x
y 0 (x) =
 , 
x2 2
cos y(x)  0
y (x) = y (x) ,
(c) (f) 1+x
y(0) = π ,

y(0) = 1 .

4

Solutions:
p 1
(1 + e2 x ) ,

(a) ya (x) = arccos(−x2 ) , (d) yd (x) = arcsinh ln 2
(e) ye (x) = arcsin ln 21 (1 + e2 x ) ,

(b) yf (x) = arcsin x2 + 5 π ,
2
(c) yc (x) = arcsin √1 + x2 ,
 (f) yb (x) = .
2 2(1 + x − ln(1 + x)) − x2

2. Show that the solution to the initial value problem:

e2x

 0
y (x) = y(x)
√ 4 + e2x
y(0) = 5


is y(x) = 4 + e2 x . What is the maximal domain of such a solution?

3. Show that the solution to the initial value problem:

y 0 (x) = x sin x

1 + y(x)
y(0) = 0


is y(x) = 2 sin x − 2 x cos x + 1 − 1 . Find the maximal domain of the solution.

4. Show that the solution to the initial value problem:

y 3 (x)

 0
y (x) =
1 + x2
y(0) = 2

2
is y(x) = √ . Find the maximal domain of the solution.
1 − 8 arctan x
5. Show that the solution to the initial value problem:
(
y 0 (x) = (sin x + cos x) e−y(x)
y(0) = 1

is y(x) = ln(1 + e + sin x − cos x) . Find the maximal domain of the solution.
78 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS

6. Show that the solution to the initial value problem:


(
y 0 (x) = (1 + y 2 (x)) ln(1 + x2 )
y(0) = 1

is y(x) = tan x ln(x2 + 1) − 2 x + 2 arctan x + π4 . Find the maximal domain of the solution.


7. Show that the solution to initial value problem:



y 0 (t) = − 1 y(x) − 1 y 4 (x)
x x
y(1) = 1

1
is y(x) = − √
3
. Find the maximal domain of the solution.
1 − 2x3
8. Solve the initial value problem:

00 0
y (x) = (sin x) y (x)

y(0) = 1

 0
y (1) = 0

Hint. Set z(x) = y 0 (x) and solve the equation z 0 (x) = (sin x) z(x) .

5.2 General solution


Given a differential equation, one may want to describe its general solution, without fixing a set
of initial conditions. Consider, for instance, the differential equation:

y 0 = (y − 1) (y − 2) . (5.7)

Equation (5.7) is separable, so we can easily adapt formula (5.3), using indefinite integrals and
adding a constant of integration:
y−2
ln = x + c1 .
y−1
Solving for y , the general solution to (5.7) is obtained:

1
y(x) = 1 + (5.8)
1 − cex
where we set c = ec1 .
Observe that the two constant functions y = 1 and y = 2 are solutions of equation (5.7).
Observe further that y = 2 is obtained from (5.8) taking c = 0 , thus such a solution is a
particular solution to (5.7). Viceversa, solution y = 1 cannot be obtained using the general
solution (5.8); for this reason this solution is called singular.
Singular solutions of a differential equation can be found with a computational procedure, illus-
trated in Remark 5.10.

Remark 5.10. Given the differential equation (4.2), suppose that its general solution is given
by Φ(x , y , c) = 0 . When there exists a singular integral of (4.2), it can be detected eliminating
c from the system: 
Φ(x , y , c) = 0 ,

∂Φ (5.9)

 (x , y , c) = 0 .
∂c
5.3. HOMOGENEOUS EQUATIONS 79

In the case of equation (5.7), system (5.9) becomes:


1

y = 1 +
 ,
x
1 − c ex
e
 =0
(1 − c ex )2

which confirms, eliminating c, that y = 1 is a singular integral of (5.7).


Remark 5.11. When the differential equation (5.3) is given in implicit form, uniqueness of
the solution does not hold; this generates the occurrence of a singular integral, which can be
detected without solving the differential equation (4.2), eliminating y 0 from system 5.10:

0
F (x , y , y ) = 0 ,

∂F (5.10)
 0 (x , y , y 0 ) = 0 .

∂y
A detailed discussion can be found in § 23 of [9].

5.3 Homogeneous Equations


To obtain exact solutions of non–separable differential equations, it is possible, in some specific
situations, to use some ansatz and transform the given equations. A few examples are provided in
the following. The first kind of transformable equations, that we consider here, are the so–called
homogeneous equations.
Theorem 5.12. Given f: [0 , ∞)×[0 , ∞) → R , if f (α x , α y) = f (x , y) , and x0 , y0 ∈ R , x0 6=
y0 y0
0 , are such that f 1 , 6= , then the change of variable y(x) = x u(x) can be employed
x0 x0
to transform the differential equation:
(
y 0 (x) = f x , y(x)

(5.11)
y(x0 ) = y0
into the separable equation:
 
 0
f 1 , u(x) − u(x)
u (x) =
 ,
x (5.12)
y
u(x0 ) = 0 .


x0
Proof. We represent the solution to (5.11) in the form y(x) = x u(x) and we look for the
auxiliary unknown u(x) ; this is a change of variable that, in force of homogeneity conditions,
transforms (5.11) into a separable equation, expressed in terms of u(x) . Differentiating y(x) =
x u(x) yields, in fact, y 0 (x) = x u0 (x) + u(x) . Now, imposing the equality y 0 (x) = f x , y(x)


leads to x u0 (x) + u(x) = f x , x u(x) = f 1 , u(x) , where we used the fact that f (α x , α y) =


f (x , y) . Therefore, u(x) solves the differential equation:


f (1 , u(x)) − u(x)
u0 (x) = .
x
Observe further that the initial condition y(x0 ) = y0 is changed into x0 u(x0 ) = y0 . Recalling
that x0 6= 0 , equation (5.11) is changed into the separable problem (5.12), whose solution u(x)
is defined by:
u(x)
Z
1
ds = ln |x| − ln |x0 | .
f (1 , s) − s
y0
x0
80 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS

Example 5.13. Consider the initial value problem:


2 2
y 0 (x) = x + y (x) ,

x y(x)
y(2) = 2 .

In this case, f (x , y) is an homogeneous function:

x2 + y 2
f (x, y) = .
xy

Using y(x) = x u(x) :


 1 + u2 (x) 1
f 1 , u(x) = = − u(x) .
u(x) u(x)
Thus, the transformed problem turns out to be in separable form:

u0 (x) = 1 ,

x u(x)
u(2) = 1 ,

and its solution can be found by integration:


u(x)
Z
s ds = ln |x| − ln 2 ,
1

yielding: p
2 ln |x| + 1 − ln 4 .
u(x) =
√ 2
Observe that the solution is defined on the interval x > eln 4−1 = √ .
e
At this point, going back to our original initial value problem, we arrive at the solution of the
homogeneous problem:
p 2
y(x) = x 2 ln |x| + 1 − ln 4 , x> √ .
e

5.3.1 Exercises
Solve the following initial values problems for homogeneous equations:

y 2 (x) y(x) 
 
1

 y 0 (x) = , y 0 (x) =
 y(x) + x e x ,
 2
x − x y(x) x
1. 3.
y(1) = 1 ,

y(1) = −1 ,

 
3
1 15 x + 11 y(x)
 
0 2 y 0 (x) = −

y (x) = − 2 x y(x) + y (x) ,
  ,
2. x 4. 9 x + 5 y(x)
 
y(1) = 1 , y(1) = 1 ,
 

Solutions:

1 + 3x2 − 1 3. y(x) = −x ln (e − ln x) ,
1. y(x) = ,
x x 1
2x 4. = ,
2 23/5 y 2/5 y 3/5
2. y(x) = 2 , 1+ 3+
3x − 1 x x
5.4. QUASI HOMOGENEOUS EQUATIONS 81

5.4 Quasi homogeneous equations


By employing a few smart ansatz, it is possible to transform some differential equations, non–
separable and non–homogeneous, into equivalent equations that are separable or homogeneous.
Here, we deal with differential equations of the form:
 
0 ax + by + c
y =f , (5.13)
αx + βy + γ

where  
a b
det 6 0.
= (5.14)
α β
In this situation, the linear system:
(
ax + by + c = 0
(5.15)
αx + βy + γ = 0

has a unique solution, say, (x , y) = (x1 , y1 ) . To obtain a homogeneous or a separable equation,


it is possible to exploit the solution uniqueness, employing the change of variable:
( (
X = x − x1 , x = X + x1 ,
⇐⇒
Y = y − y1 , y = Y + y1 .

Example 5.14 illustrates the transformation procedure.

Example 5.14. Consider the equation:

y 0 = 3 x + 4 ,

y−1
y(0) = 2 .

The first step consists in solving the system:


(
3x + 4 = 0 ,
y−1=0 ,

4
whose solution is x1 = − , y1 = 1 . In the second step, the change variable is performed:
3
(
X = x + 43 ,
Y =y−1 ,

which leads to the separable equation:



Y 0 = 3X ,

Y
4
Y
 =1,
3
with solution:  
2 16
3 X − = Y 2 − 1.
9
Recovering the original variables:
 4 16 
3 (x + )2 − = (y − 1)2 − 1
3 9
82 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS

and simplifying:  
8
3 x + x = y 2 − 2y
2
3
yields:
p
y =1± 3 x2 + 8 x + 1 .

Finally, recalling that y(0) = 2 , the solution is given by:



p −4 + 13
y(x) = 1 + 3 x2 + 8 x + 1 , x> .
3

The worked–out Example 5.15 illustrate the procedure to be followed if, when considering equa-
tion (5.13), condition (5.14) is not fulfilled.

Example 5.15. Consider the equation

y 0 = − x + y + 1 ,

2x + 2y + 1
y(1) = 2 .

Here, there is no solution to system:


(
x+y+1=0 ,
2x + 2y + 1 = 0 .

In this situation, since the two equations in the system are proportional, the change of variable
to be employed is: (
t = x,
z = x+y.

The given differential equation is, hence, transformed into the separable one:
z

z 0 = ,
2z + 1
z(1) = 3 .

Separating the variables leads to:


Z z Z t
2w + 1
dw = ds .
3 w 1

Thus:
2 z − 6 + ln z − ln 3 = t − 1 =⇒ x + 2 y + ln(x + y) = 5 + ln 3 .

Observe that, in this example, it is not possible to express the dependent variable y in an
elementary way, i.e., in terms of elementary functions.

5.5 Exact equations


Aim of this section is to provide full details on solving exact differential equations. To understand
the idea behind the treatment of this kind of equation, we present Example 5.16, that will help
in illustrating what an exact differential equation is, how its structure can be exploited, to arrive
at a solution, and why the process works as it does.
5.5. EXACT EQUATIONS 83

Example 5.16. Consider the differential equation:


3 x2 − 2 x y
y0 = . (5.16)
2 y + x2 − 1
First, rewrite (5.16) as:
2 x y − 3 x2 + (2 y + x2 − 1) y 0 = 0 . (5.16a)
Equation (5.16a) is solvable under the assumption that a suitable function Φ(x , y) can be found,
that verifies:
∂Φ ∂Φ
= 2 x y − 3 x2 , = 2 y + x2 − 1 .
∂x ∂y
Note that it is not always possible to determine such a Φ(x , y) . In the current Example 5.16,
though, we are able to define Φ(x , y) = y 2 + (x2 − 1) y − x3 . Therefore (5.16a) can be rewritten:
as
∂Φ ∂Φ 0
+ y = 0. (5.16b)
∂x ∂y
Invoking the multi–variable Chain Rule1 , we can write (5.16b) as:
d 
Φ x , y(x) = 0 . (5.16c)
dx
Since, when the ordinary derivative of a function is zero, the function is constant, there must
exist a real number c such that:

Φ x , y(x) = y 2 + (x2 − 1) y − x3 = c .

(5.17)

Thus (5.17) is an implicit solution for the differential equation (5.16); if an initial condition is
assigned, we can determine c .
It is not always possible to determine an explicit solution, expressed in terms of y . In the
particular situation of Example 5.16, though, this is feasible and we are able to find an explicit
solution. For instance, setting y(0) = 1 , we get c = 0 and:
1 p 
y(x) = 1 − x2 + x4 + 4 x3 − 2 x2 + 1 .
2

Let us, now, leave the particular case of Example 5.16, and return to the general situation, i.e.,
consider ordinary differential equation of the form:

M (x , y) + N (x , y) y 0 = 0 . (5.18)

We call exact the differential equation (5.18), if there exists a function Φ(x , y) such that:
∂Φ ∂Φ
= M (x , y) , = N (x , y) , (5.19)
∂x ∂y
and the (implicit) solution to an exact differential equation is constant:

Φ(x , y) = c .

In other words, finding Φ(x , y) constitues the central task in determining whether a differential
equation is exact and in computing its solution.
Establishing a necessary condition for (5.18) to be exact is easy. In fact, if we assume that (5.18)
is exact and that Φ(x , y) satisfies the hypotheses of Theorem 3.4, then the equality holds:
   
∂ ∂Φ ∂ ∂Φ
= .
∂x ∂y ∂y ∂x
1
See, for example, mathworld.wolfram.com/ChainRule.html
84 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS

Inserting (5.19), we obtain the necessary condition for an equation to be exact:


∂ ∂
N (x , y) = M (x , y) . (5.20)
∂x ∂y

The result in Theorem 5.17 speeds up the search of solution for exact equations.

Theorem 5.17. Define Q = { (x , y) ∈ R2 | a < x < b , c < y < d } and let M , N : Q → R


be C 1 with N (x , y) 6= 0 for any (x , y) ∈ Q . Assume that M , N, verify the closure condition
(5.20) for any (x , y) ∈ Q . Then, there exists a unique solution to the initial value problem:

y 0 = − M (x , y) ,

N (x , y) (5.21)
y(x0 ) = y0 , (x0 , y0 ) ∈ Q .

Such a solution is implicitly defined by:


Z x Z y
M (t , y0 ) dt + N (x , s) ds = 0 . (5.22)
x0 y0

Example 5.18. Consider the differential equation:


 2
y 0 = − 6 x + y ,
2xy + 1
y(1) = 1 .

The closure condition (5.20) is fullfilled, since:

∂M (x , y)
M (x , y) = 6 x + y 2 =⇒ = 2y,
∂y
∂N (x , y)
N (x , y) = 2 x y + 1 =⇒ = 2y.
∂x
Formula (5.22) then yields:
Z x Z x
M (t , 1) dt = (6 t + 1) dt = −4 + x + 3 x2 ,
Z 1y Z1 y
N (x , s) ds = (2 x s + 1) ds = −1 − x + y + x y 2 .
1 1

Hence, the solution to the given initial value problem is implicitly defined by:

x y 2 + y + 3 x2 − 5 = 0

and, solving for y , two solutions are reached:



−1 ± 1 + 20 x − 12 x3
y= .
2x
Recalling that y(1) = 1 , we choose one solution:

−1 + 1 + 20 x − 12 x3
y= .
2x
Example 5.19. Consider solving:

3 y e3 x − 2 x

 0
y =− ,
e3 x
y(1) = 1 .
5.6. INTEGRATING FACTOR FOR NON EXACT EQUATIONS 85

Here, it holds:

∂M (x , y)
M (x , y) = 3 y e3 x − 2 x =⇒ = 3 e3 x ,
∂y
∂N (x , y)
N (x , y) = e3 x =⇒ = 3 e3 x .
∂x
Using formula (5.22):
Z x Z x
M (t , 1) dt = (3 e3 t − 2 t) dt = −x2 + e3 x − e3 + 1 ,
Z 1y Z1 y
N (x , s) ds = (e3 x ) ds = (y − 1) e3 x .
1 1

The solution to the given initial value problem is, therefore:

−x2 + e3 x − e3 + 1 + (y − 1) e3 x = 0 y = e−3 x x2 + e3 − 1 .

=⇒

5.5.1 Exercises
1. Solve the following initial value problems, for exact equations:
3 2
y 0 = 9 x − 2 x y ,
 
y 0 = − 2 x + 3 y ,
(a) 3x + y − 1 (d) x2 + 2 y + 1
y(1) = 2 , y(0) = −3 ,
 
 2
2
y 0 = 9 x − 2 x y ,
 y 0 = 2 x y + 4 ,
(e) 2 (3 − x2 y)
(b) x2 + 2 y + 1
y(−1) = 8 ,

y(0) = −3 ,

2xy

 2
 − 2x
y 0 = 2 x y + 4 , 2

0 = 1+x

(f) y ,
(c) 2 (3 − x2 y) 
 2 − ln (1 + x2 )
y(−1) = −8 , y(0) = 1 .
 

2. Using the method for exact equation, described in this § 5.5, prove that the solution of the
initial value problem:
1 − 3 y 3 e3 x y

y 0 =
3 x y 2 e3 x y + 2 y e3 x y
y(0) = 1

is implicitly defined by y 2 e3 x y −x = 1 , and verify this result using the Dini Implicit function
Theorem 3.25.

5.6 Integrating factor for non exact equations


In § 5.5, we faced differential equations of the form (5.18), with the closure condition (5.20),
essential to detect the solution; we recall both formulæ , for convenience:

∂ ∂
M (x, y) + N (x, y) y 0 = 0 , N (x, y) = M (x , y) .
∂x ∂y

The case is more frequent, though, in which condition (5.19) is not satisfied, so that we are
unable to express the solution of the given differential equation in terms of the known functions.
There is, however, a general method of solution which, at times, allows the solution of the
general differential equation to be formulated using the known functions. In formula (5.18a)
86 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS

below, although it can hardly be considered as an orthodox procedure, we split the derivative
y 0 and, then, rewrite (5.18) in the so–called Pfaffian2 form:

M (x , y) dx + N (x , y) dy = 0 . (5.18a)

We do not assume condition (5.20). In this situation, there exists a function µ(x , y) such that,
multiplying both sides of (5.18a) by µ , an equivalent equation is obtained which is exact,
namely:
µ(x , y) M (x, y) dx + µ(x , y) N (x, y) dy = 0 . (5.18b)

This represents a theoretical statement, in the sense that it is easy to formulate conditions that
need to be satisfied by the integrating factor µ , namely:

∂  ∂ 
µ(x , y) N (x, y) = µ(x , y) M (x, y) . (5.23)
∂x ∂y

Evaluating the partial derivatives (and employing a simplified subscript notation for partial
derivatives), the partial differential equation for µ is obtained:

M (x , y) µy − N (x , y) µx = Nx (x , y) − My (x , y) µ . (5.23a)

Notice that solving (5.23a) may turn out to be harder than solving the original differential
equation (5.18a). However, depending on the particular structure of the functions M (x , y) and
N (x , y) , there exist favorable situations in which it is possibile to detect the integrating factor
µ(x , y) , provided that some restrictions are imposed on µ itself. In the following Theorems 5.20
and 5.23, we describe what happens when µ depends on one variable only.

Theorem 5.20. Equation (5.18a) admits an integrating factor µ depending on x only, if the
quantity:
My (x , y) − Nx (x , y)
ρ(x) = (5.24)
N (x , y)
also depends on x only. In this case, it is:
Z
ρ(x) dx
µ(x) = e , (5.25)

with ρ(x) given by (5.24).

Proof. Assume that µ(x, y) is a function of one variable only, say, it is a function of x only,
thus:

µ(x , y) = µ(x) , µx = = µ0x , µy = 0 .
dx
In this situation, equation (5.23a) reduces to:

N (x , y) µ0x = My (x , y) − Nx (x , y) µ ,

(5.23b)

that is:
µ0x My (x , y) − Nx (x , y)
= . (5.23c)
µ N (x , y)
Now, if the left hand–side of (5.23c) depends on x only, then (5.23c) is separable: solving it
leads to the integrating factor represented in thesis (5.25).
2
Johann Friedrich Pfaff (1765–1825), German mathematician.
5.6. INTEGRATING FACTOR FOR NON EXACT EQUATIONS 87

Example 5.21. Consider the initial value problem:


2
y 0 = − 3 x y − y ,

x (x − y) (5.26)
y(1) = 3 .

Equation (5.26) is not exact, nor separable. Let us rewrite it in Pfaffian form, temporarily
ignoring the initial condition:

(3 x y − y 2 ) dx + x (x − y) dy = 0 . (5.26a)

Setting M (x , y) = 3 x y − y 2 and N (x , y) = x (x − y) yields:

My (x , y) − Nx (x , y) 3 x − 2 y − (2 x − y) 1
= = ,
N (x , y) x (x − y) x

which is a function of x only. The hypotheses of Theorem 5.20 are fulfilled, and the integrating
factor comes from (5.25): Z
1
dx
x
µ(x) = e = x.
Multiplying equation (5.26a) by the integrating factor x , we form an exact equation, namely:

(3 x2 y − x y 2 ) dx + x2 (x − y) dy = 0 . (5.26b)

Now, we can define the modified functions that constitute (5.26b):

M1 (x , y) = x M (x , y) = 3 x2 y − x y 2 ,

N1 (x, y) = x N (x , y) = x2 (x − y) ,
and employ them in equation (5.22), which also incorporates the initial condition:
Z x Z y
M1 (t , 3) dt + N1 (x , s) ds = 0 ,
1 3

that is: Z x Z y
2
(−9 t + 9 t ) dt + (x2 − x s) ds = 0 .
1 3
Evaluating the integrals:
x2 y 2 3
x3 y − + = 0.
2 2
Solving for y , and recalling the initial condition, leads to the solution of the initial value problem
(5.26): √
3 + x4
y =x+ .
x
Example 5.22. Consider the following initial value problem, in which the differential equation
is not exact nor separable:
2
y 0 = − 4 x y + 3 y − x ,

x (x + 2y) (5.27)
y(1) = 1 .

Here, M (x , y) = 4 x y + 3 y 2 − x , N (x , y) = x (x + 2 y) , so that the quantity below turns out


to be a function of x :
My (x , y) − Nx (x , y) 2
= .
N (x , y) x
88 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS

Since the hypotheses of Theorem 5.20 are fulfilled, the integrating factor is given by (5.25):
Z
2
dx
x
µ(x) = e = x2 .

After defining the modified functions:

M1 (x , y) = x M (x , y) = 4 x3 y + 3 x2 y 2 − x3 ,

N1 (x , y) = x N (x , y) = x3 (x + 2 y) ,
we can use them into equation (5.22), which also incorporates the initial condition, obtaining:
Z x Z y
M1 (t , 1) dt + N1 (x , s) ds = 0 ,
1 1

that is: Z x Z y
2 3
(3 t + 3 t ) dt + x2 (2 s + x) ds = 0 .
1 1
Evaluating the integrals yields:
x4 7
x4 y − + x3 y 2 − = 0 .
4 4
Solving for y and recalling the initial condition, we get the solution of the initial value problem
(5.27): √
x5 + x4 + 7 x
y= − .
2x3/2 2

We examine, now, the case in which the integrating factor µ is a function of y only. Given the
analogy with Theorem 5.20, the proof of Theorem 5.23 is not provided here.
Theorem 5.23. Equation (5.18a) admits an integrating factor µ depending on y only, if the
quantity:
Nx (x , y) − My (x , y)
ρ(y) = (5.28)
M (x , y)
also depends on y only. In this case, the integrating factor is:
Z
ρ(y) dy
µ(y) = e , (5.29)

with ρ(y) given by (5.28).


Example 5.24. Consider the initial value problem, with non–separable and non–exact differ-
ential equation:
y 0 = − y (x + y + 1) ,

x (x + 3 y + 2) (5.30)
y(1) = 1 .

Functions M (x , y) = y (x + y + 1) and N (x , y) = x (x + 3 y + 2) are such that the following


quantity is dependant on y only:
Nx (x , y) − My (x , y) 1
= .
M (x , y) y
Formula (5.29) then leads to the integrating factor µ(y) = y , which in turn leads to the following
exact equation, written in Pfaffian form:

y 2 (x + y + 1) dx + x y (x + 3 y + 2) dy = 0 .
5.6. INTEGRATING FACTOR FOR NON EXACT EQUATIONS 89

Define the the modified functions:

M1 (x , y) = x M (x , y) = y 2 (x + y + 1) ,

N1 (x , y) = x N (x , y) = x y (x + 3 y + 2) ,
and employ them into equation (5.22), which also incorporates the initial condition, obtaining:
Z x Z y
M1 (t , 1) dt + N1 (x , s) ds ,
1 1

that is: Z x Z y
(2 + t) dt + s x (2 + 3 s + x) ds = 0 .
1 1

The solution to (5.30) can be thus expressed, in implicit form, as:

x2 y 2 5
+ x y3 + x y2 − = 0 .
2 2

To end this § 5.6, let us consider the situation of a family of differential equations for which an
integrating factor µ is available.

Theorem 5.25. Let Q = { (x , y) ∈ R2 | 0 < a < x < b , 0 < c < x < d } , and let f1 and
f2 be C 1 functions on Q , such that f1 (x y) − f2 (x y) 6= 0 . Define the functions M (x , y) and
N (x , y) as:
M (x , y) = y f1 (x y) , N (x , y) = x f2 (x y) .
Then:
1
µ(x , y) = 
x y f1 (x y) − f2 (x y)
is an integrating factor for:
y f1 (x y)
y0 = − .
x f2 (x y)

Proof. It suffices to insert the above expressions of µ , M and N into condition (5.23) and
verify that it gets satisfied.

5.6.1 Exercises
1. Solve the following initial value problems, using a suitable integrating factor.
2
y 0 = − y (x + y) ,
 
y 0 = 3 x + 2 y ,
(a) x + 2y − 1 (e) 2xy
y(1) = 1 , y(1) = 1 ,
 

 0 y − 2 x3


y 0 = − y
2
, y = ,
(b) x (y − ln x) (f) x
y(1) = 1 , y(1) = 1 ,

y 0 = y

y 0 = y ,
, (g) 3
y − 3x
(c) y − 3x − 3 y(0) = 1 ,
y(0) = 0 ,
 3 x
( y 0 = − y + 2 y e ,
y0 = x − y , (h) ex + 3 y 2
(d)
y(0) = 0 , y(0) = 0 .

90 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS

5.7 Linear equations of first order


Consider the differential equation:
(
y 0 (x) = a(x) y(x) + b(x) ,
(5.31)
y(x0 ) = y0 .

Let functions a(x) and b(x) be continuous on the interval I ⊂ R . The first–order differential
equation (5.31) is called linear, since y is represented by a polynomial of degree 1 . We can
establish a formula for its integration, following a procedure that is similar to what we did for
separable equations.
Theorem 5.26. The unique solution to (5.31) is:
Z x Z t
a(t) dt − a(s) ds
!
Z x
y(x) = e x0 y0 + b(t) e x0 dt
x0

i.e., in a more compact form:


!
Z x
y(x) = eA(x) y0 + b(t) e−A(t) dt (5.32)
x0

where: Z x
A(x) = a(s) ds . (5.33)
x0

Proof. To arrive at formula (5.32), we first examine the case b(x) = 0 , for which (5.31) reduces
to the separable (and linear) equation:
(
y 0 (x) = a(x) y(x) ,
(5.34)
y(x0 ) = c ,

having set y0 = c . If x0 ∈ I , the solution of (5.34) through point (x0 , c) is:

y(x) = c eA(x) . (5.35)

To find the solution to the more general differential equation (5.31), we use the method of
Variation of Parameters 3 , due to Lagrange: we assume that c is a function of x and search
for c (x) such that the function:
y(x) = c(x) eA(x) (5.36)
becomes, indeed, a solution of (5.31). To this aim, differentiate (5.36):

y 0 (x) = c0 (x) eA(x) + c(x) a(x) eA(x)

and impose that function (5.36) solves (5.31), that is:

c0 (x) eA(x) + c(x) a(x) eA(x) = a(x) c(x) eA(x) + b(x) ,

from which:
c0 (x) = b(x) e−A(x) . (5.37)
Integrating (5.37) between x0 and x , we obtain:
Z x
c(x) = b(t) e−A(t) dt + K ,
x0
3
See, for example, mathworld.wolfram.com/VariationofParameters.html
5.7. LINEAR EQUATIONS OF FIRST ORDER 91

with K constant. Finally, the solution to (5.31) is:


Z x
y(x) = eA(x) b(t) e−A(t) dt + K .

x0

Evaluating y(x0 ) and recalling the initial condition in (5.31), we see that y0 = K . Thesis (5.32)
thus follows.

Remark 5.27. An alternative proof to Theorem 5.26 can be provided, using Theorem 5.20 and
the integrating factor procedure. In fact, if we assume:

M (x , y) = a(x) y(x) + b(x) , N (x , y) = −1 ,

then:
My − Ny
= −a(x) ,
N
which yields the integrating factor µ(x) = e−A(x) , with A(x) defined as in (5.33). Considering
the following exact equation, equivalent to (5.31):

e−A(x) a(x) y + b(x)



0
y (x) = − ,
−e−A(x)
and employing relation (5.22), we obtain:
Z x Z y
 −A(t)
a(t) y0 + b(t) e dt − e−A(x) ds = 0 ,
x0 y0

which, after some straightforward computations, yields formula (5.32).

Remark 5.28. The general solution of the linear differential equation (5.31) can be described
when a particular solution of it is known, together with the general solution of the linear and
separable equation (5.34).
If y1 and y2 are both solutions of (5.31), in fact, there exist v1 , v2 ∈ R such that:
Z x
y1 (x) = eA(x) v1 + b(t) e−A(t) dt ,

x0
Z x
y2 (x) = eA(x) v2 + b(t) e−A(t) dt .

x0

Subtracting, we obtain:
y1 (x) − y2 (x) = v1 − v2 eA(x) ,


which means that y1 − y2 has the form (5.35) and, therefore, solves (5.34). Now, using the fact
that y1 is a solution of (5.31), the general solution to (5.31) can be written as:

y(x) = c eA(x) + y1 (x) , c ∈ R,

and all this is equivalent to saying that the general solution y = y(x) of (5.31) can be written
in the form:
y − y1
= c, c ∈ R.
y2 − y1
Example 5.29. Consider the equation:

3
y 0 (x) = 3 x2 y(x) + x ex ,


y(0) = 1 .
92 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS

3
Here, a(x) = 3 x2 and b(x) = x ex . Using (5.32)–(5.33), we get:
Z x Z x
A(x) = a(s) ds = 3 s2 ds = x3
x0 0

and
x x
x2
Z Z
3 3
b(t) e−A(t) dt = t et e−t dt = ,
x0 0 2
so that:
x2
 
x3
y(x) = e 1+ .
2
Example 5.30. Consider the equation:
(
y 0 (x) = 2 x y(x) + x ,
y(0) = 2 .

Note that a(x) = 2 x , b(x) = x . Then:


Z x Z x
A(x) = a(s) ds = 2 s ds = x2
0 0

and
" 2
#x 2
x x
e−t 1 e−x
Z Z
−A(t) −t2
b(t) e = te dt = − = − .
0 0 2 2 2
0

Therefore, the solution is:


2
2 1 e−x  5 x2 1
y(x) = ex 2+ − = e − .
2 2 2 2
Remark 5.31. When, in equation (5.32), functions a(x) and b(x) are constant, we obtain:
(
y 0 (x) = a y(x) + b ,
(5.32a)
y(x0 ) = y0 ,

and the solution is given by:


b  a (x−t0 ) b
y(x) = y0 + e − .
a a

5.7.1 Exercises
Solve the following initial value problems for linear equations.

1 1
 x
 0
y (x) = − y(x) + , y 0 (x) =

2
y(x) + 1 ,
1 + x 2 1 + x 2
3. 1 + x
1.
 
y(0) = 0 ,
y(0) = 0 ,

1
 
y 0 (x) = − sin x y(x) + sin x ,
 y 0 (x) = y(x) + x2 ,

2. 4. x

y(0) = 0 , 
y(1) = 0 ,

5.8. BERNOULLI EQUATION 93

1
 
3
y 0 (x) = 3 x2 y(x) + x ex ,
 y 0 (x) = x −
 y(x) ,
5. 6. 3x

y(0) = 1 , 
y(1) = 1 .

5.8 Bernoulli equation


Fixed α 6= 0 , 1 , a non–linear differential equation of the form:

y 0 = a(x)y + b(x)y α . (5.38)

is known as Bernoulli equation for the fact that Jacob Bernoulli (1654–1705) first solved it. In
the article [32] the historical process which led to solve the famous Bernoulli differential equation
is exposed in detail.
The change of dependent variable v(x) = y 1−α (x) transforms (5.38) into a linear equation:

v 0 (x) = 1 − α a(x) v(x) + 1 − α b(x) .


 
(5.39)

The next three examples illustrate how the change of variable works.

Example 5.32. Consider the differential equation:



y 0 = − 1 y − 1 y 4 ,
x x
y(2) = 1 .

Here α = 4 , and the change of variable is v(x) = y −3 (x) , i.e., y = v −1/3 , leading to:

1 −4/3 0 1 1
− v v = − v −1/3 − v −4/3 .
3 x x

Multiplcation by v 4/3 yields a linear differential equation in v :

1 0 1 1
− v =− v− .
3 x x
Simplifying and recalling the initial condition, we obtain a linear initial value problem:

v 0 = 3 v + 3 ,
x x
v(2) = 1 ,

1 3
solved by v(x) = x − 1 . Hence, the solution to the original problem is:
4
r
4
y(x) = 3 3 .
x −4

Example 5.33. Given x > 0 , solve the initial value problem:



y 0 = 1 y + 5 x2 y 3 ,
2x
y(1) = 1 .

This is a Bernoulli equation with exponent α = 3 . Consider the change of variable:


 1 − 1
y(x) = v(x) 1−3
= v(x) 2 .
94 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS

The associated linear equation in v(x) is:


1
v 0 = − v − 10 x2 ,
x
which is solved by:
7 5
v= − x3 .
2x 2
To recover y we have to assume:
r
7 5 7
− x3 > 0
4
⇐⇒ 0<x< .
2x 2 5
Finally: r
1 4 7
y=q , with 0<x< .
7
− 5
x3 5
2x 2

Example 5.34. Solve the initial value problem:



y 0 (x) = − 1 y(x) − 1 x2 y 5 (x) ,
4x 4
y(1) = 1 .

We have to solve a Bernoulli equation, with exponent α = 5 . The change of variable is:
 1 − 1
y(x) = v(x) 1−5 = v(x) 4 ,

which leads to the transformed linear equation:



v 0 (x) = 1 v(x) + x2 ,
x
v(1) = 1 ,

solved by:
x + x3
v(x) = .
2
Recovering y(x) , we find: r
4 2
y(x) =
x + x3
that is defined for x > 0 .

Remark 5.35. Bernoulli equation (5.38) can also be solved using the same approach used for
linear equations, that is, imposing a solution of the form:

y(x) = c(x) eA(x)

and obtaining the separable equation for c(x) :

c0 (x) = b(x) e(α−1) A(x) cα (x) ,

so that:
  1
1−α
1−α
c(x) = (1 − α) F (x) + c0 ,

where: Z x
F (x) = b(z) e(α−1) A(z) dz .
x0
5.8. BERNOULLI EQUATION 95

Remark 5.36. There is one more way to solve equation (5.38) transforming it into a separable
equation changing the dependent variable as follows

u = y 1−α e(α−1)A(x) (5.40)

where, as usual Z
A(x) = a(x) dx

In fact, if we assume the y solves (5.38) differentiating (5.40) we get

u0 = (1 − α) y −α y 0 e(α−1)A(x) + (α − 1)a(x)y 1−α e(α−1)A(x) (5.40b )

Now expressing y 0 in (5.40b ) using the fact that y solves (5.38) we arrive at

u0 = (1 − α) y −α (a(x)y + b(x)y α ) e(α−1)A(x) + (α − 1)a(x)y 1−α e(α−1)A(x)


(5.40c )
= (1 − α) b(x)e(α−1)A(x)

which integrating delivers u, namely, using c as integration constant:


Z
u(x) = (1 − α) b(x)e(α−1)A(x) dx + c (5.40d )

Therefore solution y of (5.38) is obtained inserting u from (5.40d ) in (5.40) and solving for y:
Z  1
1−α
A(x) (α−1)A(x)
y=e (1 − α) b(x)e dx + c (5.40e )

which agrees with the former Remark 5.35.

5.8.1 Exercises
Solve the following initial value problems for Bernoulli equations.

y 0 = 2 x y + 2√y ,
 
y 0 = − 1 − 1 y 2 ,
1. x ln x 3. 1+x
y(2) = 1 , y(0) = 1 ,
(  2 (
y 0 = x2 y + x5 + x2 y 3 , y0 = y + x y2 ,
2. 4.
y(0) = 1 , y(0) = 1 .
(
y 0 (x) − y(x) + (cos x) y(x)2 = 0 ,
5.
y(0) = 1 .
96 CHAPTER 5. FIRST ORDER EQUATIONS: EXPLICIT SOLUTIONS
6 First order equations: advanced
topics

6.1 Riccati equation


A Riccati differential equation has the following form:

y 0 = a(x) + b(x) y + c(x) y 2 . (6.1)

The solving strategy is based on knowing one particular solution y1 (x) of (6.1). Then, it is
assumed that the other solutions of (6.1) have the form y(x) = y1 (x) + u(x) , where u(x) is an
unknown function, to be found, and that solves the associated Bernoulli equation:

u0 (x) = b(x) + 2 c(x) y1 (x) u(x) + c(x) u2 (x) .



(6.1a)

Another way to form (6.1a) is via the substitution:


1
y(x) = y1 (x) + ,
u(x)
which transform (6.1) directly into the linear equation:

u0 (x) = −c(x) − b(x) + 2 c(x) y1 (x) u(x) .



(6.2)

Notice that, in this latter way, we combine together two substitutions: the first one maps the
Riccati1 equation into a Bernoulli equation; the second one linearizes the Bernoulli equation.
Example 6.1. Knowing that y1 (x) = 1 solves the Riccati equation:
1+x 1
y0 = − + y + y2 , (6.3)
x x
we want to show that the general solution to equation (6.3) is:
x2 e x
y(x) = 1 + .
c + (1 − x)ex
Let us use the change of variable:
1
y(x) = 1 + ,
u(x)
to obtain the linear equation:
1 2
u0 (x) = − − 1+ u(x) . (6.3a)
x x
To solve it, we proceed as we learned. First, compute A(x) :
e−x
Z
2
A(x) = − 1+ dx = −x − 2 ln x =⇒ eA(x) = .
x x2
1
Jacopo Francesco Riccati (1676–1754), Italian mathematician and jurist.

97
98 CHAPTER 6. FIRST ORDER EQUATIONS: ADVANCED TOPICS

Then, form:
Z Z Z
−A(x) 1 2 x
b(x) e dx = − x e dx = − x ex dx = (1 − x) ex .
x

The solution to (6.3a) is, therefore:

e−x x
 c e−x + 1 − x c + (1 − x) ex
u(x) = c + (1 − x) e = = .
x2 x2 x2 e x

Finally, the solution to (6.3) is:

1 x2 ex
y(x) = 1 + =1+ .
u(x) c + (1 − x) ex

Example 6.2. Using the fact that y1 (x) = x solves:

y(x)
y 0 (x) = −x5 + + x3 y 2 (x) , (6.4)
x

find the general solution of (6.4).


1
The substitution y = x + leads to the linear differential equation:
v
1
v0 x+  1 2 2 x5 + 1
5
1 − 2 = −x + v + x3 x + =⇒ v0 = − v − x3 ,
v x v x
whose solution is:
2 x5
c e− 5 1
v(x) = − ,
x 2x
where c is an integration constant. The general solution of (6.4) is, therefore:

2x
y= + x.
2 x5
2 c e− 5 −1

Remark 6.3. In applications, it may be useful to state conditions on the coefficient functions
a(x) , b(x) and c(x) , to the aim that the relevant Riccati equation (6.1) is solved by some
particular function having simple form. The following list summarizes such conditions and, for
each one, the correspondent simple–form solution y1 .

1. Monomial solution:
if a(x) + xn−1 x b(x) + c(x) xn+1 − n = 0 , then y1 (x) = xn .


2. Exponential solution:
if a(x) + en x b(x) + c(x) en x − n = 0 , then y1 (x) = en x .


3. Exponential monomial solution:


if a(x) + en x x b(x) + x2 c(x) en x − n x − 1 = 0 , then y1 (x) = x en x .


4. Sine solution: if a(x) + b(x) sin(n x) + c(x) sin2 (n x) − n cos(n x) = 0 , then y1 (x) =
sin(n x) .

5. Cosine solution: if a(x) + b(x) cos(n x) + c(x) cos2 (n x) + n sin(n x) = 0 , then y1 (x) =
cos(n x) .
6.1. RICCATI EQUATION 99

6.1.1 Cross–Ratio property


Solutions of Riccati equations posses some peculiar properties, due to the connection with linear
equations, as explained in the following Theorem 6.4.
Theorem 6.4. Given any three functions y1 , y2 , y3 , which satisfy (6.1), then the general solu-
tion y of (6.1) can be expressed in the form:
y − y2 y3 − y2
=c . (6.5)
y − y1 y3 − y1
Proof. We saw in § 6.1 that, if y1 is a solution of (6.1), then solutions y2 and y3 will be
determined by two particular choices of u in the substitution:
1
y = y1 + .
u
Let us denote such functions with u2 and u3 , respectively:
1 1
y2 = y1 + , y3 = y1 + .
u2 u3
Recalling that u2 and u3 are solutions to the linear equation (6.2), we know that the general
solution of (6.2) can be written as shown in Remark 5.28:
u − u2
= c.
u3 − u2
At this point, employing the reverse substitution, and following [22] (page 23):
1 1 1
u= , u2 = , u3 = ,
y − y1 y2 − y1 y3 − y1
we arrive at formula (6.5), representing the general solution of (6.1).

A consequence of Theorem 6.4 is the so–called Cross–Ratio property of the Riccati equation,
illustrated in the following Corollary 6.5.
Corollary 6.5. Given any four solutions y1 , . . . , y4 of the Riccati equation (6.1), their Cross–
Ratio is constant and is given by the quantity:
y4 − y2 y3 − y1
. (6.6)
y4 − y1 y3 − y2
Proof. Relation (6.5) implies that, if y4 is a fourth solution of (6.1), then:
y4 − y2 y3 − y2
=c ,
y4 − y1 y3 − y1
which, since c is constant, demonstrates thesis (6.6).

6.1.2 Reduced form of the Riccati equation


The particular differential equation:

u0 = A0 (x) + A1 (x) u2

is known as reduced form of the Riccati equation (6.1). Functions A0 (x) and A1 (x) are related
to functions a(x) , b(x) and c(x) appearing in (6.1). In fact, if B(x) is a primitive of b(x) , i.e.,
B 0 (x) = b(x) , the change of variable:

u(x) = e−B(x) y (6.7)


100 CHAPTER 6. FIRST ORDER EQUATIONS: ADVANCED TOPICS

trasforms (6.1) into the reduced Riccati equation:

u0 = a(x) e−B(x) + c(x) eB(x) u2 . (6.1b)

This can be seen by computing u0 = e−B(x) y 0 − y B 0 (x) from (6.7) and then substituting, in


the factor y 0 − y B 0 (x) , the equalities B 0 (x) = b(x) and y 0 = a(x) + b(x) y + c(x) y 2 , and finally
y = eB(x) u .

Sometimes, given a Riccati equation, its solution can be obtained by simply transforming it to
its reduced form. Example 6.6. illustrates this fact.

Example 6.6. Consider the initial value problem for the Riccati equation:

y 0 = 1 − 1 y + x y 2 ,
2x
y(1) = 0 .

To obtain its reduced form, define:


Z
1 1
B(x) = − dx = − ln x ,
2x 2
and the change of variable:
1 1
y = eB(x) u = e− 2 ln x u = x− 2 u .

The reduced (separable) Riccati equation is, then:


1
(
u0 = x 2 1 + u2 ,


u(1) = 0 ,

whose solution is:


1 3 
u(x) = tan (2 x 2 − 2) .
3
Remark 6.7. The reduced Riccati equation is also separable if and only if there exists a real
number λ such that:
λ a(x) e−B(x) = c(x) eB(x) .
In other words, to have separability of the reduced equation, the function:

c(x) 2 B(x)
e
a(x)

has to be constant and equal to a certain real number λ . The topic of separability of the Riccati
equation is presented in [1, 35, 36, 38, 40].

6.1.3 Connection with the linear equation of second order


Second–order differential equations will be discussed in Chapter 8, but the study of a particular
second–order differential equation is anticipated here, since it is related to the Riccati equation.
A linear differential equation of second order has the form:

y 00 + P (x) y 0 + Q(x) y = 0 , (6.8)

where P (x) and Q(x) are given continuos functions, defined on an interval I ⊂ R . The term
linear indicates the fact that the unknown function y = y(x) and its derivatives appear in
polynomial form of degree one.
6.1. RICCATI EQUATION 101

The second–order linear differential equation (6.8) is equivalent to a particular Riccati equation.
We follow the fair exposition given in Chapter 15 of [33]. Let us introduce a new variable
u = u(x) , setting: Z
y = e−U (x) , with U (x) = − u(x) dx . (6.9)

Compute the first and second derivatives of y , respectively:

y 0 = −u e−U (x) , y 00 = e−U (x) u2 − u0 .




Equation (6.8) gets then transformed into:

e−U (x) (u2 − u0 ) − P (x) u e−U (x) + Q(x) e−U (x) = 0 ,

simplifying which leads to the non–linear Riccati differential equation of first order:

u0 = Q(x) − P (x) u + u2 . (6.8a)

Viceversa, to find a linear differential equation, of second order, that is equivalent to the first–
order Riccati equation (6.1), let us proceed as follows. Consider the transformation:

w0
y=− , (6.10)
c(x) w

with first derivative:


w c0 (x) w0 − c(x) w w00 + c(x) (w0 )2
y0 = .
c2 (x) w2
Now, apply transformation (6.10) to the right hand–side of (6.1), i.e., to a(x) + b(x) y + c(x) y 2 :

b(x) w0 w2
a(x) − + .
c(x) w c(x) (w0 )2

By comparison, and after some algebra, we arrive at:

−a(x) c2 w + b(x) c(x) w0 + c0 (x) w0 − c(x) w00


=0,
c2 w
that is a linear differential equation of second order:

c(x) w00 − b(x) c(x) + c0 (x) w0 + a(x) c2 (x)w = 0



(6.8b)

equivalent to the Riccati equation (6.1).

Example 6.8. Consider the linear differential equation of order 2 :

x 1
y 00 − 2
y0 + y = 0, with − 1 < x < 1. (6.11)
1−x 1 − x2

Following the notations in (6.8):

x 1
P (x) = − , Q(x) = ,
1 − x2 1 − x2

and using the transformation (6.9), we arrive at the Riccati equation:

1 x
u0 = 2
+ u + u2 . (6.11a)
1−x 1 − x2
102 CHAPTER 6. FIRST ORDER EQUATIONS: ADVANCED TOPICS

To obtain the reduced form of (6.11a), we employ the transformation (6.7), observing that, here,
such a transformation works in the following way:
Z
x 1
b(x) = =⇒ B(x) = b(x) dx = − ln(1 + x2 )
1 − x2 2
p
=⇒ v = e−B(x) u = 1 − x2 u .

The reduced Riccati (separable) differential equation is, then:


1 1
v0 = √ +√ v2 , (6.12)
1 − x2 1 − x2
whose general solution is:
v(x) = tan (arcsin x + c) .
Recovering the u variable, we get the solution to equation (6.11a):

tan (arcsin x + c)
u(x) = √ .
1 − x2
To get the solution of the linear equation (6.11), we reuse relation (6.9).
First, a primitive U (x) of u(x) must be found:

tan arcsin x + c
Z Z

U (x) = u(x) dx = √ dx = − ln cos(arcsin x + c) − K .
1 − x2
where ±K is a constant whose sign can be made positive. Then, using (6.9), we can conclude
that the general solution to (6.11) is:

ln cos(arcsin x+c) +K
y(x) = e = eK cos(arcsin x + c) .

6.1.4 Exercises
1. Knowing that y1 (x) = 1 is a solution of:

y 0 (x) = −1 − ex + ex y(x) + y 2 (x) , (6.13)

show that the general solution to (6.13) is:


x
e2 x+e
y(x) = 1 + .
c − eex ex − 1
Z
x x
Hint: e2 x+e dx = ee (ex − 1) .

2. Find the general solution of the equation:


2
y 0 (x) = − + y 2 (x) ,
x2
1
knowing that y1 (x) = is a particular solution.
x
3. Find the general solution of the equation:
1 1
y 0 (x) = −1 + y + 2 y 2 (x) ,
x x
knowing that y1 (x) = x is a particular solution.
6.2. CHANGE OF VARIABLE 103

4. Solve the linear differential equation of second order:

x2 y 00 + 3 x y 0 + y = 0 .

Hint: Transform the equation into a Riccati equation, in reduced form, and then use the fact
that v1 (x) = x2 is a particular solution.

5. Solve the linear differential equation of order 2 :

(1 + x2 ) y 00 − 2 x y 0 + 2 y = 0 .

Hint: Transform the equation into a Riccati equation, then solve it, using the fact that u1 =
1
− is a particular solution.
x

6.2 Change of variable


A differential equation can be solved, at times, using an appropriate change of variable. Aim of
this section 6.2 is to provide a short account on how to apply any change of variable, to a given
differential equation, in a correct way. Details on how to find a change of variable, capable of
transforming a given differential equation into a simpler (possibly, the simplest) form, can be
found in [21], [3].

Consider the differential equation:


yx0 = f (x, y(x)) , (6.14)
where, the subscript x is used to emphasize that we are considering the x–derivative, in contrast
with the fact that, below, we are also going to form the derivative with respect to a new variable.
Consider, in fact, the mapping:
(x, y) 7→ (X, Y )
where X = X(x, y) and Y = Y (x, y) are C 1 functions, with:
 
Xx (x, y) Xy (x, y)
det 6= 0 .
Yx (x, y) Yy (x, y)

This last condition ensures uniqueness for the (x, y)–solution of the system:
(
X = X(x, y) ,
Y = Y (x, y) .

Equation (6.14) gets, thus, changed into:

Dx Y (x, y) Yx + Yy yx0
=
Dx X(x, y) Xx + Xy yx0

i.e.,
Yx + f (x, y) Yy
YX0 = . (6.14a)
Xx + f (x, y) Xy
The right hand–side of (6.14a) contains (x, y) . To complete the coordinate change, we have to
revert the mapping, solving the system:
( (
X = X(x, y) , x = x̂(X, Y ) ,
=⇒
Y = Y (x, y) , y = ŷ(X, Y ) ,

and substituting the founded expressions for x and y into (6.14a).


104 CHAPTER 6. FIRST ORDER EQUATIONS: ADVANCED TOPICS

Example 6.9. Consider the Riccati equation:


2y 1
yx0 = x y 2 −
− 3. (6.15)
x x
2
Consider further the change of variables (X , Y ) = (x y , ln x) , so that:
1
Yx = Dx Y (x , y) = Dx (ln x) = , Yy = Dy Y (x , y) = Dy (ln x) = 0 ,
x
Xx = Dx X(x , y) = Dx (x2 y) = 2 x y , Xy = Dy X(x , y) = Dy (x2 y) = x2 ,
thus:
1 2y 1
+ x y2 − − 3 0 1
YX0 = x x x = 4 2 . (6.15a)
2 y 1 x y −1
2 x y + x y2 − − 3 x2

x x
Coordinates (x , y) can be expressed in terms of the new coordinates (X , Y ) as:
( (
X = x2 y , x = eY ,
=⇒
Y = ln x , y = Xe−2 Y .
Substitution into (6.15a) leads to:
1 1 1 1 1
YX0 = = = ( − ). (6.15b)
e4 Y X 2 e−4 Y −1 X2 −1 2 X −1 X +1
Integrating (6.15b):
1 X +1
Y = ln c +
ln ,
2 X −1
where ln c is an integration constant. Recovering the original variables:
s
1 x2 y + 1 x2 y + 1 
ln x = ln c + ln 2 = ln c .
2 x y−1 x2 y − 1
The solution of (6.15) is, therefore:
s
x2 y + 1 c2 + x2
x=c =⇒ y= .
x2 y − 1 x2 (c2 − x2 )

6.2.1 Daniel Bernoulli solution of the Riccati equation


The equation
y 0 = ay 2 + bxα , a, b ∈ R (6.16)
is in our notation a particular occurrence of a Riccati reduced equation. Daniel Bernoulli (1700-
1782) established in [5] the following conditions of integrability by elementary functions, namely:
4n
α= ,n∈Z (6.17)
1 − 2n
6.2. CHANGE OF VARIABLE 105

Later, Joseph Liouville (1809-1882) proved in [27] that, beyond Bernoulli’s (6.17), equation
(6.16) cannot be solved in terms of elementary functions. In the general situation, solution of
(6.16) are expressed through Bessel functions, that we will study in section 8.4.
Before illustrate the Bernoulli argument we provide the integration of (6.16) in the “limit
case”of (6.17) i.e. taking the limit as n → ∞ which means α = −2. We therefore deal with
a
y0 = + by 2 (6.18)
x2
Change the dependent variable by:

1 z0
y= =⇒ y 0 = − 2
z z
then inserting in (6.18) we arrive at
 z 2
z 0 = −a −b (6.18’)
x
Use a second change of variable:
z
u=
x
being u the new dependent varaible, obtaing the separable equation

−b − u − au2
u0 = (6.18”)
x
The integration of (6.18”) is elementary and depends on the sign of ∆ = 1 − 4ab. As an example
if a = 1 and b = −6 we get: Z Z
du 1
2
= +c
6−u−u x
thus using partial fraction:
u+3
= (cx)5
u−2
To end the process we go back to the original variable using
z 1 1
u= ,z= =⇒ u =
x y xy

so, solving for y we arrive at


cx5 − 1
y=
x(3 + 2cx5 )
which solves
1
y0 = − 6y 2
x2
Now we pass to present the Daniel Bernoulli approach: we follow [23] page 21, aging starting
form the easiest particular case of (6.16) taking α = −4 which means to take n = 1 in (6.17):

b
y 0 = ay 2 + , a, b ∈ R (6.19)
x4
We use the change of variable suggested in [23]:


1 1
x =

X =
 
x X
a =⇒ X 1
 (6.20)
Y =
 , 
y = X − .
x (1 + axy) 
Y a
106 CHAPTER 6. FIRST ORDER EQUATIONS: ADVANCED TOPICS

The change of variables (6.20) transforms (6.19) in a separable equation:

a2 b
YX0 = + a = a + bY 2 (6.19a)
x2 (axy + 1)2

Solution of (6.19a) is therefore obtained with the following integration which depends on the
sign of ab as follows
 1 pa 
Z
dY √
 arctan b Y +c if ab > 0
+ c = ab (6.19b)
a + bY 2 1 p a 
√
 arctanh −b Y + c if ab < 0
−a b
Using (6.19b) we have the solution of the transformed differential equation (6.19a):
r
a √ 
tan a b X + c if ab > 0


b

Y = r
√ (6.19c)
a 
− − tanh −a b X + c if ab < 0


b

Going back to the original variable, if a b > 0 we arrive at


r √ !
1 a ab
x = tan +c
a + x2 y b x

and for a b < 0 we have r √ 


1 a −ab
x = − − tanh +c
a + x2 y b x
Thus we can express solution of (6.19):
 r √ !
 1 b 1 ab
− ax + a x2 cot +c if ab > 0


x

y= r √  (6.19s)
1 b 1 −ab


− + − coth +c if ab < 0


ax a x2 x

How and why Daniel’s change of variables separates the variables in Riccati’s equation (6.19)?
The answer is found, on one hand by returning to the equation (6.16) with arbitrary exponent
α and on the other by parameterizing the change of variables.
Changing variable in (6.16) as


α+3
1
X = x x = 3+α
 

X

a =⇒ 2 1 (6.21)
Y =
  aX − α+3 − Y X − α+3
x(a + xy) y=


aY
We arrive at the transformed equation
α+4
aX − α+3 + bY 2
YX0 =− (6.22)
α+3
Not having specified the parameters allows us to understand α = −4 is “special case”. Moreover
we see that change of variables (6.21) transforms equation (6.16) into an equation of the same
kind, but with a different exponent of x.
Consequently, iterating the process we identify the family of exponents that, after a finite
number of steps, produces a separable equation.
6.2. CHANGE OF VARIABLE 107

It is easily seen that by iterating the transformation of variables the sequence of the exponents
of the independent variable is

α+4 3α + 8 5α + 12 7α + 16 α(2n − 1) + 4n
α, − ,− ,− ,− ..., − , ...
α+3 2α + 5 3α + 7 4α + 9 αn + 2n + 1
Henceforth the iteration will produce a separable is α is such that there is a positive integer n
such that
4n
α(2n − 1) + 4n = 0 =⇒ αn =
1 − 2n
For instance for n = 2 that is α = −8/3 solution of (6.16) is
 √ √  √ 
3√ab

 3 3
x ab cot 3x + k + 3ab − x2/3
if ab > 0


 √  √ 
 x5/3 − 3x4/3 ab cot 3√ ab

3x + k

y= √ √  √ 
3 √−ab


 3 3
x −ab coth 3x + k + 3ab − x2/3
if ab < 0


 √  √
−ab

x5/3 − 3x4/3 −ab coth 3 √


3x + k

When α 6= αn (6.16) is transformed in a linear equation of second order by (Euler) substitution


u0
y = − au
u00 + abxα u = 0
which solution can be expressed in terms of Bessel Functions Jν and Yν

6.2.2 Exercises
1. Prove that the differential equation:

y − 4 x y 2 − 16 x3
yx0 =
y 3 + 4 x2 y + x

is transformed, by the change of variables (x , y) 7→ (X , Y ) , into:

YX0 = −2 X .
p y
where X(x , y) = 4 x2 + y 2 and Y (x , y) = arctan .
2x
Then, use this fact to integrate the original differential equation.

2. Prove that the differential equation:

y 3 + x2 y − y − x
yx0 =
x y 2 + x3 + y − x

is transformed, by the change of variables (x , y) 7→ (X , Y ) , into:

YX0 = Y (1 − Y 2 ) ,
y p
where X(x , y) = arctan and Y (x , y) = x2 + y 2 .
x
Then, use this fact to integrate the original differential equation.

3. Given the differential equation, in the unknown y = y(x) :

y 3 ey
y0 = , (6.23)
y 3 ey − ex
108 CHAPTER 6. FIRST ORDER EQUATIONS: ADVANCED TOPICS

transform it into an equation for Y = Y (X) , using the change of variables:

X(x , y) = − 1 ,

y
Y (x , y) = ex−y .

Then, express the solution to (6.23) in implicit form.

4. Given the differential equation, in the unknown y = y(x) :

y2

 0 y+1
y = 3+ ,
x x (6.24)
y(1) = 0 ,

transform it into an equation for Y = Y (X) , using the change of variables:


 y
X(x , y) = ,
x
Y (x , y) = − 1 ,
x
and then give the solution to (6.24).
7 Linear equations of second order

The general form of a differential equation of order n ∈ N was briefly introduced in equation
(4.10) of Chapter 4. The current Chapter is devoted to the particular situation of linear equations
of second order:
a(x) y 00 + b(x) y 0 + c(x) y = d(x) , (7.1)
where a , b , c and d are continuous real functions of the real variable x ∈ I , being I an interval
in R and a(x) 6= 0 . Equation (7.1) may be represented, at times, in operational notation:

M y = d(x) ,

where M : C 2 (I) → C(I) is a differential operator that acts on the function y ∈ C 2 (I) :

M y = a(x) y 00 + b(x) y 0 + c(x) y . (7.2)

In this situation, existence and uniqueness of solutions are verified, for any initial value problem
associated to (7.1).
Before dealing with the simplest case, in which the coefficient functions a(x) , b(x) , c(x) are
constant, we examine general properties, that hold in any situation. We will study some variable
coefficient equations, that are meaningful in applications. Our treatment can be easily extended
to equation of any order; for details, refer to Chapter 5 of [34] or Chapter 6 of [2].

7.1 Homogeneous equations


Assume that a(x) 6= 0 on a certain interval I . Hence, we can set:
b(x) c(x) d(x)
p(x) = , q(x) = , r(x) = , (7.3)
a(x) a(x) a(x)
and represent the differential equation (7.1) in the explicit form:

L y(x) = r(x) , (7.4)

where L is the differential operator:

L y = y 00 + p(x) y 0 + q(x) y , (7.5)

The homogeneous equation associated to (7.4) is:

Ly = 0. (7.4a)

The first step in studying (7.4) consists in the change of variable:

y(x) = f (x) u(x) ,

where u(x) is the new dependent variable, while f (x) is a function to be specified, in order to
simplify computations. We find:

L(f u) = f u00 + 2 f 0 + p f u0 + f 00 + p f 0 + q f u = r .
 
(7.6)

109
110 CHAPTER 7. LINEAR EQUATIONS OF SECOND ORDER

In (7.6), we can choose f so that the coefficient of u vanishes, that is:

f 00 + p f 0 + q f = 0 . (7.7)

In this way, equation (7.6) becomes easily solvable, since it reduces to a first–order linear equation
in the unknown v = u0 :
f v0 + 2 f 0 + p f v = r .

(7.6a)
At this point, if any particular solution to the homogeneous equation (7.4a) is available, the
solution of the non–homogeneous equation (7.4) can be obtained.
The set of solutions to a homogeneous equation forms a two–dimensional vector space, as illus-
trated in the following Theorem 7.1. The first, and easy, step is to recognize, that given two
solutions y1 and y2 of (7.4a), their linear combination:

y = α 1 y1 + α 2 y2 , α1 , α2 ∈ R ,

is also a solution to (7.4a). In particular, if y1 and y2 are solution to (7.4a), then:

(L y1 )(x) = y100 + p(x) y10 + q(x) y1 = 0 , (7.8)

(L y2 )(x) = y200 + p(x) y20 + q(x) y2 = 0 . (7.9)


To form their linear combination, we multiply both sides of (7.8) and (7.9) by α1 and α2 ,
respectively, and we sum the results, obtaining:

α1 y100 + α2 y200 + p(x) α1 y10 + α2 y20 + q(x) α1 y1 + α2 y2 = 0 .


  

Using the elementary properties of differentiation, we see that:

α1 y100 + α2 y200 = (α1 y1 + α2 y2 )00

and
α1 y10 + α2 y20 = (α1 y1 + α2 y2 )0 .
This shows that α1 y1 + α2 y2 is indeed a solution to (7.4a). This demonstrates also that, when
α2 = 0 , any multiple of one solution of (7.4a) solves (7.4a) too. By iteration, it holds that any
linear combination of solutions of (7.4a) solves (7.4a) too.

7.1.1 Operator notation


The operator notation (7.4) comes out handy in understanding, in details, the structure of the
solution set of a linear homogeneous differential equation,
The L introduced in (7.2) represents a linear operator between the (infinite dimensional) vector
space C (2) (I) , formed by all the functions f whose first and second derivatives, f 0 , f 00 , exist
and are continuous on I , and the (infinite dimensional) space C(I) of the continuous functions
on I :
L : C 2 (I) → C(I) . (7.10)
The task of solving the linear homogeneous differential equation (7.4a) becomes, thus, equivalent
to describing the kernel, denoted by ker(L) , of the linear operator L , that is the space of the
solutions of the linear homogeneous equation (7.4a):
n o
ker(L) = y ∈ C (2) (I) | L y = 0 . (7.11)

Even if C (2) (I) in a infinite-dimensional vector space, ker(L) is a subspace of dimension 2 , as


stated in Theorem 7.1.
7.1. HOMOGENEOUS EQUATIONS 111

Theorem 7.1. Consider the linear differential operator L : C 2 (I) → C(I) , defined by (7.5).
Then, the kernel of L has dimension 2 .

Proof. Fix x0 ∈ I and define the linear operator T : ker(L) → R2 , which maps each function
y ∈ ker(L) onto its initial value, evaluated at x0 , i.e.:

T y = y(x0 ) , y 0 (x0 ) .


The existence and uniqueness Theorem 4.17 means that T y = (0 , 0) implies y = 0 . Hence, by
the theory of linear operators, this mean that T is one–one operator and it holds:

dim ker(L) = dim R2 = 2 .

7.1.2 Wronskian determinant


To study the vector space structure of the set of solutions to (7.4a), it is useful to examine some
properties, related to the linear independence of functions. Given n real functions f1 , . . . , fn of
a real variable, all defined on the same interval I , we say that f1 , . . . , fn are linearly dependent
if there exist n numbers α1 , . . . , αn , not all zero and such that, for any x ∈ I :
n
X
αk fk (x) = 0 . (7.12)
k=1

If condition (7.12) holds only when all the αk are zero (i.e., αk = 0 for all k = 1 , . . . , n), then
functions f1 , . . . , fn are linearly independent.

We now provide a sufficient condition for linear independence of a set of functions. Let us as-
sume that f1 , . . . , fn are n–times differentiable. Then, from equation (7.12), applying successive
differentiations, we can form a system of n linear equations in the variables α1 , . . . , αn :

α1 f1 + α2 f2 + . . . + αn fn = 0,
α1 f10 + α2 f20 + ... + αn fn0 = 0,
α1 f100 + α2 f200 + ... + αn fn00 = 0, (7.13)
.. ..
. .
(n−1) (n−1) (n−1)
α1 f1 +α2 f2 + · · · + αn fn = 0.

Functions f1 , . . . , fn are linearly independent, if it holds:


 
f1 f2 ... fn
 f10 f2 0 ... fn0 
det  .. .. ..  6= 0 . (7.14)
 
 . . . 
(n−1) (n−1) (n−1)
f1 f2 ... fn

The determinant in (7.14) is called the Wronskian1 of functions f1 , . . . , fn , denoted as:


 
f1 (x) f2 (x) ... fn (x)
 f10 (x) f20 (x) ... fn0 (x) 
W (f1 , . . . , fn )(x) = det  .. .. ..  .
 
 . . . 
(n−1) (n−1) (n−1)
f1 (x) f2 (x) . . . fn (x)
1
Josef–Maria Hoëne de Wronski (1778–1853), Polish philosopher, mathematician, physicist, lawyer and
economist.
112 CHAPTER 7. LINEAR EQUATIONS OF SECOND ORDER

For example, functions f1 (x) = sin2 x , f2 (x) = cos2 x , f3 (x) = sin(2 x) , are linearly indepen-
dent on I = R , since their Wronskian is non–zero:
 
f1 (x) f2 (x) f3 (x)
W (f1 , f2 , f3 )(x) = det  f10 (x) f20 (x) f30 (x) 
f100 (x) f200 (x) f300 (x)
sin2 x cos2 x
 
sin(2 x)
= det  2 cos x sin x −2 cos x sin x 2 cos(2 x)  = 4 .
2 2 2 2
2 cos x − sin x) 2 sin x − cos x) −4 sin(2 x)

A non–vanishing Wronskian represents a sufficient condition for linear independence of functions.


It is worth noting that, in general, the Wronskian of a set of linearly independent functions may
vanish, but this situation cannot occurr when the functions are solutions to a linear differen-
tial equation. There exists, in fact, the important result, due to Abel2 , stated in the following
Theorem 7.2.

Theorem 7.2. Let functions y1 (x) and y2 (x) , defined on the interval I , be solutions to the
linear differential equation (7.4a). Then, a necessary and sufficient condition, for y1 and y2 to
be linearly independent, is provided by their Wronskian being non–zero on I .

Proof. The Wronskian of y1 (x) and y2 (x) is a function W : I → R defined as:


  
y1 (x) y2 (x) 0 0

W (x) = W (y1 , y2 )(x) = det 0 = y1 (x) y2 (x) − y2 (x) y1 (x) .
y1 (x) y20 (x)

Differentiating, we obtain:

d
W 0 (x) = W (y1 , y2 )(x) = y1 (x) y200 (x) − y2 (x) y100 (x) .
dx
Since y1 and y2 are solution to (7.4a), recalling that we assume a(x) 6= 0 in (7.1), it holds:

y100 (x) = −p(x) y10 (x) − q(x) y1 (x) ,

y200 (x) = −p(x) y20 (x) − q(x) y2 (x) ,


where p(x) , q(x) are as in (7.3). Then:
 
0 0 0
W (x) = −p(x) y1 (x) y2 (x) − y2 (x) y1 (x) = −p(x) W (x) .

In other words, the Wronskian solves the separable differential equation:

W 0 = −p(x) W . (7.15)

Solving (7.15) yields: Z x


− p(s) ds
W (x) = W (x0 ) e x0 , (7.16)
with p(s) is as in (7.3). Equation (7.16) implies that, if the Wronskian W (x) vanishes in x0 ∈ I ,
then W (x) is the zero function; viceversa, if there exists x0 ∈ I such that W (x0 ) 6= 0 , then
W (x) 6= 0 for each x ∈ I. Hence, to prove the thesis of Theorem 7.2, we need to prove that
there exists x0 ∈ I such that W (x0 ) 6= 0 . The demontration is by contradiction. Let us negate
the assumption:
∃ x0 ∈ I such that W (x0 ) 6= 0 ,
2
Niels Abel (1802-1829)
7.1. HOMOGENEOUS EQUATIONS 113

which means that the following holds true:

W (x0 ) = 0 ∀ x0 ∈ I .

Construct the 2 × 2 linear system of algebraic equations in the unknowns α1 , α2 :


     
y1 (x0 ) y2 (x0 ) α1 0
0 0 = . (7.17)
y1 (x0 ) y2 (x0 ) α2 0

By assumption, the determinant of this homogeneous system is zero, hence the system admits a
T
non trivial solution α1 , α2 , with α1 , α2 not simultaneously null. Now, define the function:

y(x) = α1 y1 (x) + α2 y2 (x) .

Since y(x) is a linear combination of solutions y1 (x) and y2 (x) of (7.4a), then y(x) is also a
solution to (7.4a). And since, by construction, (α1 , α2 ) solves (7.17), then it also holds that:

y(x0 ) = α1 y1 (x0 ) + α2 y2 (x0 ) = 0 ,


y 0 (x0 ) = α1 y10 (x0 ) + α2 y20 (x0 ) = 0 .

At this point, from the existence and uniqueness of the solutions of the initial value problem:
(
Ly = 0 ,
y(x0 ) = y 0 (x0 ) = 0 ,

it turns out that y(x) = 0 identically, implying that α1 = α2 = 0 . So, we arrive at a


contradiction. The theorem is proved.

Putting together Theorems 7.1 and 7.2, it is possible to establish if a pair of solutions to (7.4a)
is a basis for the set of solutions to the equation (7.4a), as illustrated in Theorem 7.3.

Theorem 7.3. Consider the linear differential operator L : C 2 (I) → C(I) defined in (7.4a). If
y1 and y2 are two independent elements of ker(L) , then any other element of ker(L) can be
expressed as a linear combination of y1 and y2 :

y(x) = c1 y1 (x) + c2 y2 (x)

for suitable constants c1 , c2 ∈ R .

7.1.3 Order reduction


When an integral of the homogeneous equation (7.4a) is known, possibly by inspection or by
an educated guess, a second independent solution to (7.4a) can be obtained, with a procedure
illustrated in Example 7.4.

Example 7.4. Knowing that y1 (x) = x solves the differential equation:

x2 (1 + x) y 00 − x (2 + 4 x + x2 ) y 0 + (2 + 4 x + x2 ) y = 0 , (7.18)

a second independent solution to (7.18) can be found, by seeking a solution of the form:

y2 (x) = y1 (x) u(x) = x u(x) .

Let us evaluate the first and the second derivative of y2 :

y20 (x) = u(x) + x u0 (x) , y200 (x) = 2 u0 (x) + x u00 (x) ,


114 CHAPTER 7. LINEAR EQUATIONS OF SECOND ORDER

and substitute the derivatives above into (7.18):

x3 (x + 1) u00 (x) − (x + 2) u0 (x) = 0 .




Now, introducing v(x) = u0 (x) , we see that v has to satisfy the first–order linear separable
differential equation:
x+2
v0 = v,
x+1
which is solved by:
v(x) = c (1 + x) ex .
We can assume c = 1 , since we are only interested in finding one particular solution of (7.18).
Function u(x) is then found by integration:
Z
u(x) = (1 + x) ex dx = x ex .

Therefore, a second solution to (7.18) is y2 (x) = x2 ex , where, again, we do not worry about
the integration constant. Functions y1 , y2 form an independent set of solutions to (7.18) if their
Wronskian:
x 2 ex
   
y1 (x) y2 (x) x
det 0 = det = x2 (x + 1) ex .
y1 (x) y20 (x) 1 x (x + 2) ex
is different from zero. Now, observe that the differential equation (7.18) has to be considered
in one of the intervals (−∞ , −1) , (−1 , 0) , (0, +∞) , where the leading coefficient x2 (1 + x) of
(7.18) does not vanish. On such intervals, the Wronskian does not vanish as well, thus f1 , f2
are linearly independent. In conclusion, the general solution to (7.18) is:

y(x) = c1 x + c2 x2 ex , c1 , c2 ∈ R . (7.19)

The procedure illustrated in Example 7.4 can be repeated in the general case. For simplicity, we
recall (7.4a) written in the explicit form:

y 00 + p(x) y 0 + q(x) y = 0 . (7.20)

If a solution y1 (x) of (7.20) is known, we look for a second solution of the form:

y2 (x) = u(x) y1 (x) ,

where u is a function to be determined. Computing the first and second derivatives of y2 :

y20 = y1 u0 + y10 u , y200 = y1 u00 + 2 y10 u0 + y100 u ,

and inserting them into (7.20), yields, after some computations:

y1 u00 + (2 y10 + p y1 ) u0 + (y100 + p y10 + q y1 ) u = 0 .

Now, since, y1 is a solution to (7.20), the previous equation reduces to:

y1 u00 + (2 y10 + p y1 ) u0 = 0 . (7.21)

Equation (7.21) is a first–order separable equation in the unknown u0 , exactly in the same way
as in the Example 7.4, and it can be integrated to obtain the second solution to (7.20).

The search for a second independent solution to (7.20) can also be pursued using the Wronskian
equation (7.16), without explicitly computing two solutions of (7.20). This is stated in the
following Theorem 7.5.
7.1. HOMOGENEOUS EQUATIONS 115

Theorem 7.5. If y1 (x) is a non–vanishing solution of the second–order equation (7.20), then
a second independent solution is given by:
Z t
−p(s) ds
Z x x0
e
y2 (x) = y1 (x) dt , (7.22)
x0 y12 (t)

with p(s) is as in (7.3).

Proof. Given the assumption that y1 is a non–vanishing function, rewrite the Wronskian as:

y1 y20 − y2 y10 
 
y y2
W (y1 , y2 ) = det 10 0 = y1 y20 − y2 y10 = y12 ,
y1 y2 y12

and observe that:


y1 y20 − y2 y10 d y2 
2 = .
y1 dx y1
In other words:
d y2  W (y1 , y2 )
= , (7.23)
dx y1 y12
integrating which leads to:
Z x
W (y1 , y2 )(s)
y2 (x) = y1 (x) ds , (7.24)
x0 y12 (s)

setting to zero the constant of integration W (x0 ) . At this point, thesis (7.22) follows from
inserting equation (7.16) into (7.24).

Example 7.6. Consider again Example 7.4. The solution y1 (x) = x of (7.18) can be used in
formula (7.22), to detect a second solution to such equation. Observe that, in this case:

−x (2 + 4 x + x2 ) 2 + 4 x + x2
p(x) = = − ,
x2 (1 + x) x (1 + x)

so that: Z
−p(x) dx = x + 2 ln x + ln(1 + x) ,

and: Z
−p(s) ds
e = x2 (1 + x) ex .
The second solution is, hence:
Z
y2 (x) = x (1 + x) ex dx = x2 ex ,

in accordance with the solution y2 found using the method order reduction in Example 7.4.

Example 7.7. Find the general solution of the homogeneous linear differential equation of
second order:
1 1 
y 00 + y 0 + 1 − y = 0.
x 4 x2
We seek a solution of the form y1 = xm sin x . For such a solution, the first and second derivatives
are: (
y10 = xm−1 x cos x + m sin x ,


y100 = xm−2 2 m x cos x + m (m − 1) sin x − x2 sin x .



116 CHAPTER 7. LINEAR EQUATIONS OF SECOND ORDER

Imposing that y1 solves the considered differential equation, we obtain:


1
xm−2 (2 m + 1) x cos x + m2 −

sin x = 0 ,
4
1
which implies, in particular, m = − . In this way, we have proved that
2
sin x
y1 = √
x
solves the given differential equation.
1
To obtain a second independent solution, we employ (7.22); in our case, it is p(x) = , which
x
gives: Z
sin x dx cos x
y2 = √ 2 =− √ .
x sin x x
Remark 7.8. To facilitate the search for some particular solution of a linear differential equa-
tion, conditions on the coefficients p(x) and q(x) , defined in (7.3), are provided below, each
leading to a function y1 that solves (7.20).

1. Monomial solution: if n2 − n + n x p(x) + x2 q(x) = 0 , then y1 (x) = xn .


2. Exponential solution: if n2 + n p(x) + q(x) = 0 , then y1 (x) = en x .
3. Exponential monomial solution: if n2 x+2 n+(1+n x) p(x)+x q(x) = 0 , then y1 (x) = x en x .
2
4. Exponential Gaussian solution: if 2 m+4 m2 x2 +2 m x p(x)+q(x) = 0 , then y1 (x) = em x .
5. Sine solution: if n p(x) cos(n x) − n2 sin(n x) + q(x) sin(n x) = 0 , then y1 (x) = sin(n x) .
6. Cosine solution: if q(x) cos(n x) − n2 cos(n x) − n p(x) sin(n x) = 0 , then y1 (x) = cos(n x) .

7.1.4 Constant–coefficient equations


In equation (7.1), the easiest case occurs when the coefficients functions a(x) , b(x) , c(x) are
constant. Suppose that a, b, c ∈ R, with a 6= 0. A constant–coefficient homogeneous differential
equation has the form:
M y = a y 00 + b y 0 + c y = 0 . (7.25)
We seek for solutions of (7.25) in the exponential form

y(x) = eλ x ,

where λ is a constant to be determined. Computing the first two derivatives:

y 0 (x) = λ y(x) , y 00 (x) = λ2 y(x) ,

and imposing that y(x) solves (7.25), leads to the algebraic equation, called characteristic equa-
tion of (7.25):
a λ2 + b λ + c = 0 . (7.26)
The roots of (7.26) determine solutions of (7.25). Namely, if the discriminant ∆ = b2 − 4 a c
is positive, so that equation (7.26) admits two distinct real roots λ1 and λ2 , then the general
solution to (7.25) is:
y = c1 eλ1 x + c2 eλ2 x . (7.27)
The independence of solutions y1 = c1 eλ1 x and y2 = c2 eλ2 x follows from the analysis of their
Wronskian, which is non–vanishing for any x ∈ R :
  √
y1 y2 (λ1 +λ2 ) x ∆ −b x
W (y1 , y2 )(x) = det 0 0 = (λ2 − λ1 ) e = e a 6= 0 .
y1 y2 a
7.1. HOMOGENEOUS EQUATIONS 117

When ∆ < 0 , equation (7.26) admits two distinct complex conjugate roots λ1 = α + i β and
λ2 = α − i β , and the general solution to (7.25) is:

y = eα x c1 cos(β x) + c2 sin(β x) .

(7.28)

Forming the complex exponential of λ1 and that of λ2 , two complex valued functions z1 , z2
are obtained:

z1 = eλ1 x = e(α+i β) x = eα x ei β x = eα x cos(β x) + i sin(β x) ,




z2 = eλ2 x = e(α−i β) x = eα x e−i β x = eα x cos(β x) − i sin(β x) ,




that have the same real and imaginary parts:

<(z1 ) = <(z2 ) = eα x cos(β x) , =(z1 ) = =(z2 ) = eα x sin(β x) .

Set, for example, y1 = eα x cos(β x) and y2 = eα x sin(β x) . Then, the real solution presented
in (7.28) is a linear combination of the real functions y1 and y2 , which are are independent,
since their Wronskian is non–vanishing:
 
y1 y2
W (y1 , y2 )(x) = det 0 = β e2 α x 6= 0 .
y1 y20

When ∆ = 0 , equation (7.26) has one real root with multiplicity 2 , and the correspondent
solution to (7.25) is:
b
y1 = e − 2 a x
.
In this situation, we need a second independent solution, that is obtained from formula (7.22)
of Theorem 7.5, using the just found y1 and with p(s) built for equation (7.26), thus:
Z
b
− dx
Z a
b e b
y2 = e− 2 a x b
dx = x e− 2 a x .
e− a x
In other words, when ∆ = 0 the general solution of (7.25) is:
b
y = e− 2 a x

c1 + c2 x . (7.29)

Observe that the Wronskian is:


 
y1 y2 b
W (y1 , y2 )(x) = det 0 = e− a x 6= 0 .
y1 y20

Note that the knowledge of the Wronskian expression is useful in the study of non–homogeneous
differential equations too, as it will be shown in § 7.2.

Example 7.9. Consider the initial value problem:


(
y 00 − 2 y 0 + 6 y = 0 ,
y(0) = 0 , y 0 (0) = 1 .

The characteristic equation is λ2 −2 λ+6 = 0 , with roots λ = 1±i 5 . Hence, two independent
solutions are: √  √ 
y1 (x) = ex cos 5x , y2 (x) = ex sin 5x ,
118 CHAPTER 7. LINEAR EQUATIONS OF SECOND ORDER

and the general solution can be expressed as y(x) = c1 y1 (x) + c2 y2 (x) . Now, forming the
initial conditions: (
y(0) = c1 y1 (0) + c2 y2 (0) = c1 ,

y 0 (0) = c1 y10 (0) + c2 y20 (0) = c1 + c2 5 ,
we see that constants c1 and c2 must verify:

c1 = 0 ,
(
c1 = 0 ,
√ =⇒ 1
c1 + c2 5 = 1 , c2 = √ .
5
In conclusion, the considered initial value problem is solved by:
ex √ 
y(x) = √ sin 5x .
5

7.1.5 Cauchy–Euler equations


A Cauchy–Euler differential equation is a particular second–order linear equation, with variable
coefficients, of the form:
a x2 y 00 + b x y 0 + c y = 0 , (7.30)
where a , b , c ∈ R , and with x > 0 . We seek solutions of (7.30) in power form, that is, y = xm ,
being m a constant to be determined and that must satisfy the algebraic equation:
a m (m − 1) + b m + c = 0 , (7.31)
i.e., a m2 + (b − a) m + c = 0 . Let m1 and m2 be the roots of equation (7.31) and denote its
discriminant with ∆ = (a − b)2 − 4 a c . According to the sign of ∆ , the differential equation
(7.30) is solved by a power–form function y defined as follows, with c1 , c2 ∈ R in all cases:
(i) if ∆ > 0 then m1 , m2 are real and distinct, hence y = c1 xm1 + c2 xm2 ;
(ii) if ∆ < 0 then m1 , m2 are complex and conjugate, say m1,2 = α ± i β , thus y =
α

x c1 cos(β ln x) + c2 sin(β ln x) ;
(iii) if ∆ = 0 then m1 = m2 = m , and the solution is y = c1 xm + c2 xm ln x .
Remark 7.10. The solution y of the Cauchy–Euler equation, illustrated in each of the three
cases above, has components y1 , y2 , that are linear independent. This statement can be verified
using the Wronskian.
When ∆ > 0 , equation (7.31) has two distinct real roots m1 6= m2 , which implies, defining
y1 = xm1 and y2 = xm2 , that the Wronskian is non–null:
 
y1 y2
W (y1 , y2 )(x) = det 0 = (m2 − m1 ) xm1 +m2 −1 6= 0 , for x > 0 .
y1 y20

When ∆ < 0 , there are two complex conjugate roots m1,2 = α ± i β of (7.31); then, setting for
example y1 = xα cos(β ln x) and y2 = xα sin(β ln x) , the Wronskian does not vanish:
 
y1 y2
W (y1 , y2 )(x) = det 0 = β x2α−1 6= 0 , for x > 0 .
y1 y20

When ∆ = 0 , equation (7.31) has one real root m of multiplicity 2 ; in this case y1 = xm and
y2 = y1 ln x ; again, the Wronskian does not vanish:
 
y1 y2
W (y1 , y2 )(x) = det 0 = x2 m−1 6= 0 , for x > 0 .
y1 y20

Notice, again, that knowing the Wronskian turns out useful also when studying the non–
homogeneous differential equations case (refer to § 7.2).
7.1. HOMOGENEOUS EQUATIONS 119

Example 7.11. Consider the initial value problem:


(
x2 y 00 − 2 x y 0 + 2 y = 0 ,
y(1) = 1 , y 0 (1) = 0 .

Here equation (7.30) assumes the form:

m (m − 1) − 2 m + 2 = 0 ,

that is m = 1 , m = 2 . Therefore, the general solution of the given equation is y = c1 x + c2 x2 ;


imposing the initial conditions yields the system:
(
c1 + c2 = 1 ,
c1 + 2 c2 = 0 .

In conclusion, the solution of the initial vale problem is y = 2 x − x2 .

Example 7.12. Consider the initial value problem:


(
x2 y 00 − x y 0 + 5 y = 0 ,
y(1) = 1 , y 0 (1) = 0 .

Here, equation (7.30) assumes the form:

m (m − 1) − m + 5 = 0 ,

that is m = 1 ± 2 i . Hence, the general solution of the given equation is y =


x ( c1 cos(2 ln x) + c2 sin(2 ln x) ) ; again, imposing the initial conditions, we obtain the sys-
tem: (
c1 = 1 ,
c1 + 2 c2 = 0 ,
leading to the solution:  
1
y=x cos(2 ln x) − sin(2 ln x) .
2

7.1.6 Invariant and Normal form


To conclude § 7.1, we introduce the fundamental notion of invariant of a homogeneous linear
differential equation of second order (7.4a) The basic fact is that any such equation can be
transformed into a differential equation without the term containing the first derivative, namely:

u00 + I(x) u = 0 , (7.32)

with:
1 0 1
I(x) = q(x) − p (x) − p2 (x) . (7.33)
2 4
In fact, assuming the knowledge of a solution to (7.4a) of the form y = f u leads to, in the
same way followed to obtain equation (7.6):

L(f u) = f u00 + (2 f 0 + p f ) u0 + (f 00 + p f 0 + q f ) u = 0 , (7.34)

in which f (x) can be chosen so that the coefficient of u0 vanishes, namely:


Z
− 12 p(x) dx
2 f0 + p f = 0 =⇒ f (x) = e .
120 CHAPTER 7. LINEAR EQUATIONS OF SECOND ORDER

Function f (x) does not vanish, and has first and second derivatives given by:

fp f (p2 − 2 p0 )
f0 = − , f 00 = .
2 4
Hence, f can be simplified out in (7.34), yielding the reduced form:
 
00 1 0 1 2
u + q− p − p u = 0, (7.35)
2 4

which is stated in (7.32)–(7.33). Equation (7.32) is called the normal form of equation (7.4a).
Function I(x) , introduced in (7.33), is called invariant of the homogeneous differential equation
(7.4a) and represents a mathematical invariant, in the sense expressed by the following Theorem
7.13.

Theorem 7.13. If the equation:

L1 y = y 00 + p1 y 0 + q1 y = 0 (7.36)

can be transformed into the equation:

L2 y = y 00 + p2 y 0 + q2 y = 0 , (7.37)

by the the change of dependent variable y = f u , then the invariants of (7.36)–(7.37) coincide:

1 0 1 1 1
I1 = q1 − p1 − p21 = q2 − p02 − p22 = I2 .
2 4 2 4
Viceversa, when equations (7.36) and (7.37) admit the same invariant, either equation can be
transformed into the other one, by:
Z
1


2 p1 (x) − p2 (x) dx
y(x) = u(x) e .

Remark 7.14. We can transform any second–order linear differential equation into its normal
form. Moreover, if we are able to solve the equation in normal form, then we can obtain, easily,
the general solution to the original equation. The next Example 7.15 clarifies this idea.

Example 7.15. Consider the homogeneous differential equation, depending on the real positive
parameter a :  
00 2 0 2 2
y − y + a + 2 y = 0. (7.38)
x a
Hence:
2 2 1 0 1
p(x) = − , q(x) = a2 + =⇒ I(x) = q(x) − p (x) − p2 (x) = a2 .
x a2 2 4
In this example, the invariant is not dependent of x , and the normal form is:

u00 + a2 u = 0 . (7.39)

The general solution to (7.39) is u(x) = c1 cos(a x)+c2 sin(a x) , where c1 , c2 are real parameter,
and the solution to the original equation (7.38) is:
Z
1

2 p(x) dx
y(x) = u(x) e = c1 x cos(a x) + c2 x sin(a x) .
7.1. HOMOGENEOUS EQUATIONS 121

Example 7.16. Find the general solution of the homogeneous linear differential equation of
second order:
y 00 − 2 tan(x) y 0 + y = 0 .

The first step consists in transforming the given equation into normal form, with the change of
variable: Z Z
1

2 p(x) dx tan x dx
u
y=u e =u e = .
cos x
The normal form is:
u00 + 2 u = 0 ,

which is a constant–coefficient equation, solved by:


√ √
u = c1 cos( 2 x) + c2 sin( 2 x) .

The solution to the given differential equation is:


√ √
cos( 2 x) sin( 2 x)
y = c1 + c2 .
cos x cos x

The normal form clarifies the structure of the solutions to a constant–coefficient, linear equation
of second–order.

Remark 7.17. Consider the constant–coefficient equation (7.25). The change of variable:

b
y = u e− 2 a x

allows to transform (7.25) into normal form:

b2 − 4 a c
u00 − u = 0, (7.40)
4 a2
since, in this case:

b c 1 0 1 2 −b2 + 4 a c
p=− , q=− =⇒ I =q− p − p = .
a a 2 4 4 a2

The normal form (7.40) explains the nature of the following formulæ (7.41), (7.42) and (7.43),
namely describing the structure of the solution to a constant–coefficient homogeneous linear
differential equation of second order. In the following, the discriminant is ∆ = b2 − 4 a c and
c1 , c2 are constant:

(i) if ∆ > 0 ,
 √ √ 
b
−2a x ∆  ∆ 
y(x) = e c1 cosh x + c2 sinh x ; (7.41)
2a 2a

(ii) if ∆ < 0 ,
 √ √ 
b
−2a x −∆  −∆ 
y(x) = e c1 cos x + c2 sin x ; (7.42)
2a 2a

(iii) if ∆ = 0 ,
b
y(x) = e− 2 a x (c1 x + c2 ) . (7.43)
122 CHAPTER 7. LINEAR EQUATIONS OF SECOND ORDER

7.2 Non–homogeneous equation


We finally deal with the non–homogeneous equation (7.4), which we recall and re–label for
convenience:
L y = y 00 + p(x) y 0 + q(x) y = r(x) , (7.44)
changing, slightly, the point of view, in comparison to the beginning of § 7.1. Here, we assume
to know, already, the general solution of the homogeneous equation associated to (7.44). Aim
of this section is, indeed, to describe the relation between solutions of L y = 0 and solutions
of L y = r(x) , being r(x) a given continuous function. The first and probably most important
step in this direction is represented by the following Theorem 7.18.

Theorem 7.18. Let y1 and y2 be independent solutions of L y = 0 , and let yp be a solution


of L y = r(x) . Then, any solution of the latest non–homogeneous equation has the form:

y(x) = c1 y1 (x) + c2 y2 (x) + yp (x) , (7.45)

where c1 , c2 are constant. for suitable constants c1 , c2 ∈ R .

Proof. Using the linearity of the operator L , we see that:

L (y − yp ) = L y − L yp = r − r = 0 .

This means that y − yp can be express by a linear combination of y1 and y2 . Hence, thesis
(7.45) is proved.

Formula (7.45) is called general solution of the non–homogeneous equation (7.44).

7.2.1 Variation of parameters


Theorem 7.18 indicates that, to describe the general solution of the linear non–homogeneous
equation (7.44), we need to know a particular solution to (7.44) and two independent solutions
of the associated homogeneous equation. As a matter of fact, the knowledge of two independent
solutions to L y = 0 allows to individuate a particular solution of (7.44), using the method of
the Variation of parameters, introduced by Lagrange in 1774.

Theorem 7.19. Let y1 and y2 be two independent solutions of the homogeneous equation
associated to (7.44). Then, a particular solution to (7.44) has the form:

yp (x) = k1 (x) y1 (x) + k2 (x) y2 (x) , (7.46)

where: Z Z
y2 (x) r(x) y1 (x) r(x)
k1 (x) = − dx , k2 (x) = dx . (7.47)
W (y1 , y2 )(x) W (y1 , y2 )(x)
Proof. Assume that y1 and y2 are independent solutions of the homogeneous equation associ-
ated to (7.44), and look for a particular solution of (7.44) in the desired form:

yp = k1 y1 + k2 y2 ,

where k1 , k2 are two C 1 functions to be determined. Computing the first derivative of yp yields:

yp0 = k10 y1 + k20 y2 + k1 y10 + k2 y20 . (7.48)

Let us impose a first condition on y1 and y2 , i.e., impose that they verify:

k10 y1 + k20 y2 = 0 , (7.49)


7.2. NON–HOMOGENEOUS EQUATION 123

so that (7.48) reduces to:


yp0 = k1 y10 + k2 y20 . (7.48a)
Now, compute yp00 , by applying differentiation to (7.48a):

yp00 = k1 y100 + k2 y200 + k10 y10 + k20 y20 .

At this point, imposing that yp solves equation (7.44) leads to forming the following expression
(in which variable x is discarded, to ease the notation):

yp00 + p yp0 + q yp = k1 y100 + k2 y200 + k10 y10 + k20 y20 + p k1 y10 + k2 y20 + q k1 y10 + k2 y20
  

= k1 y100 + p y10 + q y1 + k2 y200 + p y20 + q y2 + k10 y10 + k20 y20 .


  

In this way, a second condition on y1 and y2 is obtained:

k10 y10 + k20 y20 = r . (7.50)

Equations (7.49) and (7.50) form a 2 × 2 linear system, in the variables k10 , k20 , that admits a
unique solution:
y2 r y1 r
k10 = − , k20 = , (7.51)
W (y1 , y2 ) W (y1 , y2 )
since its coefficient matrix is the Wronskian W (y1 , y2 ) , which does not vanish, given the as-
sumption that y1 and y2 are independent. Thesis (7.47) follows by integration of k10 , k20 in
(7.51).

Example 7.20. In Example 7.4, we showed that the general solution of the homogeneous
equation (7.18) has the form (7.19), i.e., c1 x + c2 x2 ex , c1 , c2 ∈ R . Here, we use Theorem 7.19
to find the general solution of the non–homogeneous equation:
2
x2 (1 + x) y 00 − x (2 + 4 x + x2 ) y 0 + (2 + 4 x + x2 ) y = x2 (1 + x) . (7.52)

As a first step, let us rewrite (7.52) in explicit form, namely:

x (2 + 4 x + x2 ) 0 (2 + 4 x + x2 )
y 00 − y + y = x2 (1 + x) .
x2 (1 + x) x2 (1 + x)

Then, using equation (7.47), we obtain:

x3
Z Z
k1 (x) = − x2 dx = − , k2 (x) = x e−x dx = −(1 + x) e−x .
3
Hence, the general solution of (7.52) is:

x4
y(x) = c1 x + c2 x2 ex − − x2 (1 + x) .
3
Example 7.21. Consider the initial value problem:
(
x2 y 00 − x y 0 + 5 y = x2 ,
y(1) = y 0 (1) = 0 ,

In Example 7.12, the associated homogeneous equation was considered, of which two independent
solutions were found, namely y1 = x cos(2 ln x) and y2 = x sin(2 ln x) , whose Wronskian is 2 x .
Now, writing the given non–homogenous differential equation in explicit form:
1 0 5
y 00 − y + 2 y = 1,
x x
124 CHAPTER 7. LINEAR EQUATIONS OF SECOND ORDER

and using (7.47), we find:


Z Z
1 1
k1 (x) = − sin(2 ln x) dx , k2 (x) = cos(2 ln x) dx .
2 2
Evaluating the integrals:
1 2 1 
k1 (x) = x cos(2 ln x) − x sin(2 ln x) ,
2 5 5
1 2 1 
k1 (x) = x sin(2 ln x) − x cos(2 ln x) .
2 5 5
Hence, a particular solution of the non–homogenous equation is
x2
yp = k1 y1 + k2 y2 = ,
5
while the general solution is:
x2
y = c1 x cos(2 ln x) + c2 x sin(2 ln x) + .
5
To solve the initial value problem, c1 and c2 need to be determined, imposing the initial
conditions:
1 2
y(1) = c1 + = 0 , y 0 (1) = c1 + 2 c2 + = 0 ,
5 5
1 1
yielding c1 = − , c2 = − . In conclusion, the solution of the given initial value problem is:
5 10
1
y(x) = x (2 x − sin(2 ln x) − 2 cos(2 ln x)) .
10

7.2.2 Non–homogeneous equations with constant coefficients


While studying a second–order differential equation, the easiest situation occurs, probably, when
the equation coefficients are constant. In this case, the application of the method of Variation of
parameters can be performed systematically. Here, though, we only provide the results, that is,
we indicate how to search for some particular solution of a given constant–coefficient equation;
the interested Reader can, then, apply the Variation of parameters to validate our statements.
Assume that a constant–coefficient non–homogeneous differential equation of second order is
given:
M y = a y 00 + b y 0 + c y = r(x) , (7.53)
where r(x) is a continuous real function. Denote with K(λ) the characteristic polynomial asso-
ciated to the differential equation (7.53). A list is provide below, taken from [14], of particular
functions r(x) , together with a recipe to find a relevant particular solution of (7.53).

(1) Let r(x) = e α x Pn (x) , being Pn (x) a given polynomial of degree n .

(a) If K(α) 6= 0 , it means that α is not a root of the characteristic equation. Then, a
particular solution of (7.53) has the form:

yp = eα x Qn (x) , (7.54)

where Qn (x) is a polynomial of degree n to be determined.


(b) If K(α) = 0 , with multiplicity s ≥ 1 , it means that α is a root of the characteristic
equation. Then, a particular solution of (7.53) has the form:

yp = xs eα x Rn (x) , (7.55)

where Rn (x) is a polynomial of degree n to be determined.


7.2. NON–HOMOGENEOUS EQUATION 125
 
(2) Let r(x) = eαx Pn (x) cos(βx) + Qm (x) sin(βx) , being Pn and Qm given polynomials
of degree n and m , respectively.

(a) If K(α + i β) 6= 0 , it means that α + i β is not a root of the characteristic equation. In


this case, a particular solution of (7.53) has the form:
 
αx
yp = e Rp (x) cos(β x) + Sp (x) cos(β x) , (7.56)

with Rp (x) and Sp (x) polynomials of degree p = max{n , m} to be determined.


(b) if K(α + i β) = 0 , it means that α + i β is a root of the characteristic equation, with
multiplicity s ≥ 1 . Then, a particular solution of (7.53) has the form:
 
yp = xs eα x Rp (x) cos(β x) + Sp (x) cos(β x) , (7.57)

where Rp (x) and Sp (x) are polynomials of degree p = max{n , m} to be determined.

(3) Let r(x) = r1 (x) + . . . + rn (x) and let yk be solution of M yk = rk . Then, y = y1 + · · · + yn


solves M y = r . This fact is known as super–position principle.

Example 7.22. Consider the differential equation:

y 00 (x) + 2 y 0 (x) + y(x) = x2 + x .

Here α = 0 is not a root of the characteristic equation:

K(λ) = λ2 + 2 λ + 1 = 0 ,

thus, we are in situation (1a) and we look for a solution of the form (7.54), that is

yp (x) = s0 x2 + s1 x + s2 .

Differentiating: (
yp0 (x) = 2 s0 x + s1 ,
yp00 (x) = 2 s0 ,
and imposing that yp solves the differential equation, we obtain:

s0 x2 + (4 s0 + s1 ) x + 2 s0 + 2 s1 + s2 = x2 + x .

Therefore, it must be:


 
s0 = 1 ,
 s0 = 1 ,

4 s0 + s1 = 1 , =⇒ s1 = −3 ,
 
2 s0 + 2 s1 + s2 = 0 , s2 = 4 .
 

Hence, a particular solution of the given equation is

yp = x 2 − 3 x + 4 .

Finally, solving the associated homogeneous equation, we obtain the required general solution:

y(x) = x2 − 3 x + 4 + c1 e−x + c2 x e−x .


126 CHAPTER 7. LINEAR EQUATIONS OF SECOND ORDER

Example 7.23. Consider the differential equation:

y''(x) + y(x) = sin x + cos x .

Observe, first, that the general solution of the associated homogeneous equation is:

y₀(x) = c₁ cos x + c₂ sin x ,   c₁, c₂ ∈ R .

Observe, further, that the characteristic equation has roots ±i. Hence, we are in situation (2b) and we look for a solution of the form (7.57), that is:

y_p(x) = s₁ x cos x + s₂ x sin x ,   s₁, s₂ ∈ R .

Imposing that y_p(x) solves the given non–homogeneous equation, we find:

2 s₂ cos x − 2 s₁ sin x = cos x + sin x .

Solving the resulting system gives

s₁ = −1/2 ,   s₂ = 1/2 ,

which leads to the general solution of the given equation:

y(x) = (1/2) x sin x − (1/2) x cos x + c₁ cos x + c₂ sin x ,   c₁, c₂ ∈ R .
2 2
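The resonant particular solution can be verified directly; a minimal sketch, assuming SymPy is available:

    # Verifying the particular solution of Example 7.23 (SymPy sketch).
    import sympy as sp

    x = sp.symbols('x')
    yp = (x*sp.sin(x) - x*sp.cos(x))/2

    # residual of y'' + y = sin x + cos x
    print(sp.simplify(yp.diff(x, 2) + yp - sp.sin(x) - sp.cos(x)))  # 0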

7.3 Change of independent variable


This section is devoted to illustrating how to transform equation (7.4) by changing the independent variable x. We start with some working examples. In the first we study a family of equations we have already studied, the Cauchy–Euler equation, see equation (7.30).
Example 7.24. Consider (again) the linear second order differential equation

a x² y'' + b x y' + c y = 0   (7.30)

To solve it, in a different way, we use a change of independent variable from x to ξ. This means that we take a C² bijective function φ : R → R⁺ which realizes the variable transformation:

x = φ(ξ) ⟺ ξ = φ⁻¹(x)   (7.58)

In the case of (7.30) we use as transforming function φ the exponential function:

x = e^ξ ⟺ ξ = ln x   (7.59)

Hence we will obtain a transformed equation in a new dependent variable

z(ξ) = y(x) = y(e^ξ)

or, in other terms,

z(ln x) = y(x)   (7.60)

We now impose that the function y(x) introduced in (7.60) is indeed a solution to (7.30); with this aim in mind we compute the first and second derivative in (7.60), using (of course) the chain rule:

y'(x) = (1/x) z'(ln x)   (7.61)

y''(x) = −(1/x²) z'(ln x) + (1/x²) z''(ln x)   (7.62)

Inserting (7.60), (7.61) and (7.62) in (7.30) we get

a x² [ (1/x²) z''(ln x) − (1/x²) z'(ln x) ] + b x (1/x) z'(ln x) + c z(ln x) = 0

or, after a bit of calculation,

a z''(ln x) + (b − a) z'(ln x) + c z(ln x) = 0

The last equation is, as a matter of fact, a constant coefficient equation in the new independent variable ξ = ln x, namely:

a z''(ξ) + (b − a) z'(ξ) + c z(ξ) = 0   (7.63)

Once (7.63) is solved, returning to the original variable, the Cauchy–Euler equation (7.30) is solved. For instance, the particular equation

x² y'' + 3x y' + y = 0   (7.64)

where a = 1, b = 3, c = 1, is transformed into the constant coefficient equation

z''(ξ) + 2 z'(ξ) + z(ξ) = 0.   (7.65)

The general solution of (7.65) is z(ξ) = (c₁ + c₂ ξ) e^{−ξ}, thus, returning to the original variables via (7.59), we obtain the general solution of equation (7.64):

y(x) = (c₁ + c₂ ln x)/x
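The same result can be cross-checked symbolically; a minimal sketch, assuming SymPy is available:

    # Cross-checking Example 7.24: x^2 y'' + 3x y' + y = 0 (SymPy sketch).
    import sympy as sp

    x = sp.symbols('x', positive=True)
    y = sp.Function('y')

    ode = sp.Eq(x**2*y(x).diff(x, 2) + 3*x*y(x).diff(x) + y(x), 0)
    print(sp.dsolve(ode, y(x)))
    # Expected: y(x) = (C1 + C2*log(x))/x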
Before stating the general transformation result we give a second example of transformation.
Example 7.25. Consider the differential equation

(1 + x²)² y'' + 2x(1 + x²) y' + 4y = 0   (7.66)

We use here the change of variable

x = tan ξ ⟺ ξ = arctan x   (7.67)

So, since z(arctan x) = y(x), we compute the transformation formulas for the derivatives of y:

y'(x) = (1/(1 + x²)) z'(arctan x)   (7.68)

y''(x) = −(2x/(1 + x²)²) z'(arctan x) + (1/(1 + x²)²) z''(arctan x)   (7.69)

Hence (7.66) is transformed into (we drop the argument of z and its derivatives)

(1 + x²)² [ −(2x/(1 + x²)²) z' + (1/(1 + x²)²) z'' ] + 2x(1 + x²) (1/(1 + x²)) z' + 4z = 0

and after some simplifications we arrive at the (easy) constant coefficient equation

z'' + 4z = 0   ⟹   z(ξ) = c₁ cos(2ξ) + c₂ sin(2ξ)

Going back to the original variable we obtain

y(x) = c₁ cos(2 arctan x) + c₂ sin(2 arctan x)

But we are not done yet, since it is indeed possible to express y(x) without trigonometric functions; in fact, keeping in mind the relations

cos(2α) = (1 − tan²α)/(1 + tan²α) ,   sin(2α) = 2 tan α/(1 + tan²α)

and renaming the arbitrary constant 2c₂ as c₂, we can express the general solution of (7.66) in algebraic form:

y(x) = ( c₁ (1 − x²) + c₂ x ) / (1 + x²)

Remark 7.26. Keeping in mind how we proceeded in Examples 7.24 and 7.25, we now face the general situation. So consider once more equation

y'' + p(x) y' + q(x) y = 0   (7.4)

with the general change of independent variable that we now represent with

z(ξ) = z(ϕ(x)) = y(x)   (7.70)

or, what is the same,

x = ϕ⁻¹(ξ) ⟺ ξ = ϕ(x)   (7.70')

Thus the general transformation rules are

y(x) = z(ϕ(x))
y'(x) = ϕ'(x) z'(ϕ(x))   (7.71)
y''(x) = [ϕ'(x)]² z''(ϕ(x)) + ϕ''(x) z'(ϕ(x))

Inserting (7.71) we obtain the transformation of (7.4) (which we put in explicit form)

z''(ϕ(x)) + { [p(x)ϕ'(x) + ϕ''(x)] / [ϕ'(x)]² } z'(ϕ(x)) + { q(x) / [ϕ'(x)]² } z(ϕ(x)) = 0   (7.72)

Notice that the coefficients of the new linear differential operator in (7.72) are expressed as functions of the (old) variable x. To complete the variable transformation they have to be converted to the new variable ξ using x = ϕ⁻¹(ξ).

In both Examples 7.24 and 7.25 the transformed equation (7.72) turned out to be a constant coefficient equation. Of course this nice situation does not come out of the blue, but is the consequence of a very specific condition that depends on the coefficients p(x) and q(x) of the original equation. In general, it is not possible to transform a variable coefficient equation into a constant coefficient one using only a transformation involving solely the independent variable.

For brevity we denote the coefficients of the transformed differential operator (7.72) as

p̂(x) = [p(x)ϕ'(x) + ϕ''(x)] / [ϕ'(x)]² ,   q̂(x) = q(x) / [ϕ'(x)]²

To establish conditions under which (7.72) becomes an equation with constant coefficients, it is necessary at least that q̂ be a constant function, hence we impose the following condition:

ϕ'(x) = c √(q(x)) ,  for some c ∈ R.   (7.73)

Moreover, from (7.73) we have to assume q(x) ≥ 0 also. Thus, after this choice, we see that (7.72) can become a constant coefficient equation if and only if p̂ also is a constant function. Therefore, inserting (7.73) in the expression defining p̂, we see that the expression

(1/(2c)) · [2p(x)q(x) + q'(x)] / [q(x)]^{3/2}

must also be a constant function. Summing up, we can state the following result.

Theorem 7.27. If in equation (7.4) q(x) > 0 and the expression

[2p(x)q(x) + q'(x)] / [q(x)]^{3/2}   (7.74)

is found to be constant, then the change of independent variable

ξ = ϕ(x) = c ∫_{x₀}^{x} √(q(s)) ds ,   c ∈ R   (7.75)

transforms equation (7.4) to an equation with constant coefficients. If the expression (7.74) is not constant, no change of independent variable will reduce (7.4) to an equation with constant coefficients.
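The test (7.74) is straightforward to automate; here is a minimal sketch, assuming SymPy is available (the helper name is ours, not from the notes), applied to the equation of Example 7.28 below:

    # Sketch of the constancy test of Theorem 7.27 (SymPy assumed available).
    import sympy as sp

    x = sp.symbols('x', positive=True)

    def expression_774(p, q):
        """The quantity (2 p q + q') / q^(3/2) of (7.74)."""
        return sp.simplify((2*p*q + sp.diff(q, x)) / q**sp.Rational(3, 2))

    # Equation of Example 7.28 in normal form: p = 8x - 1/x, q = 20x^2
    print(expression_774(8*x - 1/x, 20*x**2))  # 8*sqrt(5)/5, a constant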

Example 7.28. Given the equation

x y'' + (8x² − 1) y' + 20x³ y = 0

we have, in normal form,

p(x) = 8x − 1/x ,   q(x) = 20x²   ⟹   [q'(x) + 2p(x)q(x)] / [q(x)]^{3/2} = 8/√5

Hence from Theorem 7.27 we know that the change of independent variable

ϕ(x) = c ∫₀ˣ √(20 s²) ds

transforms the given equation into an equation with constant coefficients. Of course we can choose c ∈ R in order to keep the transformation as simple as possible. In this case we take c = 2/√20 so that we can use ϕ(x) = x², x ≥ 0, as a change of independent variable, which gives the constant coefficient equation

z'' + 4z' + 5z = 0.

The solution of this equation is

z(ξ) = e^{−2ξ} (c₁ cos ξ + c₂ sin ξ)

and since ξ = ϕ(x) = x², the general solution of the given equation is

y(x) = e^{−2x²} (c₁ cos x² + c₂ sin x²) .


7.4 Harley differential resolvent


The English mathematician Robert Harley, [17] and [18], developed a method to solve some particular algebraic equations using a linear differential equation. This research was then, and still is, motivated by the fact that algebraic equations of degree greater than or equal to five are not solvable using only algebraic radicals. Here we limit ourselves to presenting the case of the third degree equation, studied by Harley, which originates a linear differential equation of order two with variable coefficients.

Consider the (trinomial) third degree equation

y³ − 3y + 2x = 0   (7.76)

in the unknown y, x being a parameter. Equation (7.76) defines an implicit function y = y(x), which will be determined by solving a differential equation. Using implicit differentiation we see that this function satisfies

y' = −(2/3) · 1/(y² − 1)   (7.77)

Now we “solve” for y² in (7.76), obtaining

y² = 3 − 2x/y   ⟹   y² − 1 = 2(y − x)/y

and inserting in (7.77) we get

y' = −(1/3) · y/(y − x)   (7.78)

Note that by solving the differential equations (7.77) and (7.78) we would find again, as integral curve, equation (7.76), without obtaining real information about its solutions. To get this information we need to do further transformations on equation (7.78), which is Harley's main contribution. The starting point is the algebraic identity

y(y² − x²) − 2(y − x) = y³ − yx² − 2y + 2x = y³ − yx² − 3y + y + 2x.   (7.79)

Recalling (7.76) we can rewrite (7.79) as

y(y² − x²) − 2(y − x) = y − yx² = y(1 − x²)   (7.80)

Now we go back to (7.78); in light of (7.80) we see that

y/(y − x) = (1/(1 − x²)) · y(1 − x²)/(y − x) = (1/(1 − x²)) · [y(y² − x²) − 2(y − x)]/(y − x) = (y² + xy − 2)/(1 − x²)   (7.81)

Inserting (7.81) in the differential equation (7.78) we obtain

y' = −(y² + xy − 2) / (3(1 − x²))   (7.82)

Now, first multiply both sides of (7.82) by 3(1 − x²), getting

3(1 − x²) y' = −(y² + xy − 2)   (7.83)

then differentiate (7.83), obtaining:

3(1 − x²) y'' − 6x y' = −2y y' − y − x y'   ⟺   3(1 − x²) y'' = (5x − 2y) y' − y   (7.84)

Now we try to use equality (7.81): first we multiply both sides of (7.84) by 3(1 − x²) so that:

9(1 − x²)² y'' = 3(1 − x²)(5x − 2y) y' − 3(1 − x²) y

thus, recalling (7.82), we find out

9(1 − x²)² y'' = 3(1 − x²)(5x − 2y) [ −(1/3)(y² + xy − 2)/(1 − x²) ] − 3(1 − x²) y
             = −(5x − 2y)(y² + xy − 2) − 3(1 − x²) y
             = 2y³ − 3xy² − (2x² + 7) y + 10x

Using again (7.76) we replace y³ with 3y − 2x, obtaining:

9(1 − x²)² y'' = 6x − (2x² + 1) y − 3xy²   (7.85)

Now take the linear combination of (7.83) and (7.85):

9(1 − x²)² y'' − 6x + (2x² + 1) y + 3xy² + µ [ 3(1 − x²) y' + (y² + xy − 2) ] = 0

then choose µ such that the coefficient of y² vanishes. Rearranging,

9(1 − x²)² y'' + 3µ(1 − x²) y' + (3x + µ) y² + (1 + µx + 2x²) y − 2(3x + µ) = 0

hence µ = −3x and, dividing out the common factor 1 − x², we eventually arrive at the Harley differential resolvent

9(1 − x²) y'' − 9x y' + y = 0   (7.86)

Referring to the notation of Theorem 7.27 we have

p(x) = −x/(1 − x²) ,   q(x) = 1/(9(1 − x²))   ⟹   [q'(x) + 2p(x)q(x)] / [q(x)]^{3/2} = 0

Thus we can change the independent variable, obtaining a constant coefficient equation, using

ξ = ϕ(x) = (c/3) ∫₀ˣ ds/√(1 − s²) = (c/3) arcsin x

which, taking c = 3, leads to

z'' + (1/9) z = 0   ⟹   z = c₁ cos(ξ/3) + c₂ sin(ξ/3)

Going back to the original variable we have the general solution of (7.86)

y(x) = c₁ cos((1/3) arcsin x) + c₂ sin((1/3) arcsin x)   (7.87)

The integration constants are determined as follows. For example, if we set x = 0 we will have the solutions of (7.76) y₀ = 0, y₁ = √3, y₂ = −√3, which correspond to three distinct initial value problems identified by the implicit function theorem, see equation (7.78):

9(1 − x²) y'' − 9x y' + y = 0 ,   y(0) = 0 ,    y'(0) = 2/3
9(1 − x²) y'' − 9x y' + y = 0 ,   y(0) = √3 ,   y'(0) = −1/3
9(1 − x²) y'' − 9x y' + y = 0 ,   y(0) = −√3 ,  y'(0) = −1/3

which generate respectively the solutions

y₀(x) = 2 sin((1/3) arcsin x) ,   y_{1,2}(x) = ±√3 cos((1/3) arcsin x) − sin((1/3) arcsin x)

[Figure: the locus y³ − 3y + 2x = 0 overlapped with the three (dashed) branches y₀, y₁ and y₂, drawn in red, green and yellow respectively.]
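A quick numerical confirmation that the three branches really satisfy the cubic (7.76) on (−1, 1) (plain Python sketch):

    # Numerical check: the branches y0, y1, y2 satisfy y^3 - 3y + 2x = 0.
    import math

    def y0(x): return 2*math.sin(math.asin(x)/3)
    def y1(x): return  math.sqrt(3)*math.cos(math.asin(x)/3) - math.sin(math.asin(x)/3)
    def y2(x): return -math.sqrt(3)*math.cos(math.asin(x)/3) - math.sin(math.asin(x)/3)

    for f in (y0, y1, y2):
        for x in (-0.9, -0.3, 0.0, 0.5, 0.99):
            assert abs(f(x)**3 - 3*f(x) + 2*x) < 1e-12
    print("all three branches satisfy the cubic")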

7.5 Exercises

1. Solve the following second order variable coefficient linear differential equations:

(a) y'' − (1 + 2/x) y' + (2/x) y = 0 ;

(b) y'' − (2x/(1 + x²)) y' + (2/(1 + x²)) y = 0 ;

(c) y'' − (1/x) y' − 4x² y = 0 .

Hint: y = e^{x^m}.

2. Solve the following initial value problems, using the transformation of the given differential equation in normal form:

(a) y'' + 2 (sin x) y' + ( (sin x)² + cos x − 6/x² ) y = 0 ,   y(1) = 0 ,   y'(1) = 1 ;

(b) y'' + (2/x) y' + y = 0 ,   y(1) = 1 ,   y'(1) = 0 .

3. Find the general solution of the differential equation

y'' + (cot x) y' − (1/(sin x)²) y = 0 ,

using the fact that y₁ = 1/sin x is a solution. Then, find, if it exists, the particular solution that vanishes for x → 0⁺. Say whether there exists a solution such that lim_{x→0⁺} y(x)/x = 1/2.
4. Solve the non–homogeneous equation:

y'' − (1/x) y' − 4x² y = −4x⁴ ,

using the fact that y₁(x) = e^{x²} is a solution of the associated homogeneous equation.

5. Finding the appropriate change of variable following Theorem 7.27, find the general solution of the following differential equations:

(a) y'' + (3/x) y' + (1/x⁶) y = 0

(b) x y'' + (4x² − 1) y' + 4x³ y = 2x³
8 Series solutions

Up to this point we have illustrated various techniques for solving second-order linear equations. However, several equations are not tractable using the methods we have presented: for instance, a simple equation like y'' = xy defies all of them. However, for a large class of equations it is still possible to express solutions in terms of power series. For this purpose we refer to the concepts exposed in section 2.5. Thus in what follows we will work with analytic functions, that is f : I → R, f ∈ C^∞, I being a real open interval, such that for x₀ ∈ I the Taylor series

Σ_{n=0}^∞ f⁽ⁿ⁾(x₀) (x − x₀)ⁿ / n!

converges in some neighborhood of x₀. Most elementary functions like sin x, cos x and e^x are analytic everywhere; moreover sums, differences and products of these functions are too. Quotients of analytic functions are analytic at all points where the denominator does not vanish.

8.1 Solution at ordinary points


Definition 8.1. Referring to the differential operator (7.4), the point x₀ is an ordinary point of the equation Ly = 0 if both p(x) and q(x) are analytic at x₀. If either of these functions is not analytic at x₀, then x₀ is a singular point of Ly = 0.

First of all observe that, using if necessary the translation x ↦ x − x₀, we can assume x₀ = 0 in what follows.

Theorem 8.2. If x = 0 is an ordinary point of Ly = 0, then the general solution of Ly = 0 in an interval containing this point has the form

y = Σ_{n=0}^∞ aₙ xⁿ = a₀ y₁(x) + a₁ y₂(x)

where a₀ and a₁ are arbitrary constants and y₁ and y₂ are independent functions that are analytic at x = 0.

To evaluate the coefficients aₙ as indicated in Theorem 8.2 we describe a five-step procedure, known as the power series method.

Step 1. Evaluate the derivatives of the power series

y = Σ_{n=0}^∞ aₙ xⁿ   (8.1)

y' = Σ_{n=0}^∞ n aₙ x^{n−1}   (8.2)

y'' = Σ_{n=0}^∞ n(n−1) aₙ x^{n−2}   (8.3)


Step 2. Collect powers of x and set the coefficient of each power equal to zero.

Step 3. The equation obtained by setting the coefficient of xⁿ to zero in Step 2 will contain a_j terms for a finite number of j's. Solve the equation for the a_j term having the largest index. The resulting equation is known as the recurrence formula for the given differential equation.

Step 4. Use the recurrence to determine a_j, j = 2, 3, …, in terms of a₀ and a₁.

Step 5. Insert the coefficients determined in Step 4 into (8.1) to express the solution of the differential equation Ly = 0.

We illustrate these five steps with some working examples. The first one is very simple in terms of computation, but it completely represents the procedure just described.
Example 8.3. Consider (the example is inspired by [34], section 107) the (constant coefficient) differential equation

y'' + 4y = 0

Step 1.

y'' + 4y = Σ_{n=0}^∞ n(n−1) aₙ x^{n−2} + 4 Σ_{n=0}^∞ aₙ xⁿ = 0   (S1)

Step 2. Change the indexes of the terms in the second series at the right hand side of (S1) so that the series will involve the power x^{n−2} in the general term; thus (S1) becomes

y'' + 4y = Σ_{n=0}^∞ n(n−1) aₙ x^{n−2} + 4 Σ_{n=2}^∞ a_{n−2} x^{n−2} = 0   (S2)

Step 3. Now add the two series in (S2) to obtain

y'' + 4y = Σ_{n=2}^∞ [ n(n−1) aₙ + 4a_{n−2} ] x^{n−2} = 0   (S3)

where the summation starts from n = 2 because the first two terms in the first series in (S2) are both zero.

Step 4. Since condition

n(n−1) aₙ + 4a_{n−2} = 0

holds true, solving for aₙ we obtain, for n ≥ 2,

aₙ = −4 a_{n−2} / (n(n−1))

thus a₀ and a₁ are free and arbitrary. So to start the recursion fix a₀, a₁ ∈ R, so that

a₂ = −(4/(2·1)) a₀ ,   a₃ = −(4/(3·2)) a₁
a₄ = −(4/(4·3)) a₂ ,   a₅ = −(4/(5·4)) a₃
a₆ = −(4/(6·5)) a₄ ,   a₇ = −(4/(7·6)) a₅

and in general

a_{2k} = −(4/(2k(2k−1))) a_{2k−2} ,   a_{2k+1} = −(4/((2k+1)·2k)) a_{2k−1}

Multiplying side by side the relations of even index, we obtain

a₂ · a₄ · … · a_{2k} = (−1)ᵏ (4ᵏ/(2k)!) a₀ · a₂ · … · a_{2k−2}

which simplifies to

a_{2k} = (−1)ᵏ (4ᵏ/(2k)!) a₀ ,   k ≥ 1

A similar argument for odd indexes leads to

a_{2k+1} = (−1)ᵏ (4ᵏ/(2k+1)!) a₁ ,   k ≥ 1

Step 5. Now, recalling the assumed series (8.1) for y, we rewrite it in the form

y = a₀ + Σ_{k=1}^∞ a_{2k} x^{2k} + a₁ x + Σ_{k=1}^∞ a_{2k+1} x^{2k+1}
  = a₀ ( 1 + Σ_{k=1}^∞ ((−1)ᵏ 4ᵏ/(2k)!) x^{2k} ) + a₁ ( x + Σ_{k=1}^∞ ((−1)ᵏ 4ᵏ/(2k+1)!) x^{2k+1} )
  = a₀ Σ_{k=0}^∞ ((−1)ᵏ/(2k)!) (2x)^{2k} + (a₁/2) Σ_{k=0}^∞ ((−1)ᵏ/(2k+1)!) (2x)^{2k+1}
  = a₀ cos(2x) + (a₁/2) sin(2x)

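The recurrence of Step 4 can also be iterated numerically and compared with the Taylor coefficients of cos(2x); a minimal sketch in plain Python:

    # Checking the recurrence a_n = -4 a_{n-2} / (n(n-1)) against cos(2x).
    from math import factorial

    a = [1.0, 0.0]                    # a0 = 1, a1 = 0 selects the cosine branch
    for n in range(2, 21):
        a.append(-4*a[n-2]/(n*(n-1)))

    for k in range(10):
        exact = (-1)**k * 4**k / factorial(2*k)   # coefficient of x^(2k) in cos(2x)
        assert abs(a[2*k] - exact) < 1e-12
    print("recurrence reproduces the Taylor coefficients of cos(2x)")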
Example 8.3 is interesting in order to understand how this method of searching for solutions works; the fact that, for the particular equation considered, the solution can easily be obtained with other methods allows one to check the validity of this new approach. Now let us see how to use the power series method to solve an equation with variable coefficients.

Example 8.4. Consider, near the ordinary point x₀ = 0, the equation

(1 − x²) y'' − 6x y' − 4y = 0

The first difference with the previous example lies in the fact that the coefficients are analytic for |x| < 1, so here the power series solution (8.1) is considered for |x| < 1. Inserting (8.1), (8.2) and (8.3) into the equation we obtain

Σ_{n=0}^∞ n(n−1) aₙ x^{n−2} − Σ_{n=0}^∞ n(n−1) aₙ xⁿ − Σ_{n=0}^∞ 6n aₙ xⁿ − Σ_{n=0}^∞ 4aₙ xⁿ = 0

Now we group the identical powers, getting

Σ_{n=0}^∞ n(n−1) aₙ x^{n−2} − Σ_{n=0}^∞ (n² + 5n + 4) aₙ xⁿ = 0

or

Σ_{n=0}^∞ n(n−1) aₙ x^{n−2} − Σ_{n=0}^∞ (n+1)(n+4) aₙ xⁿ = 0

So, rescaling the indexes in the second series, we obtain

Σ_{n=2}^∞ n(n−1) aₙ x^{n−2} − Σ_{n=2}^∞ (n−1)(n+2) a_{n−2} x^{n−2} = 0

Therefore for n ≥ 2 we have

n(n−1) aₙ − (n−1)(n+2) a_{n−2} = 0   ⟹   aₙ = ((n+2)/n) a_{n−2}

while a₀ and a₁ remain, as the theory requires, arbitrary, since

n = 0 ⟹ 0 · a₀ = 0 ,   n = 1 ⟹ 0 · a₁ = 0.

We examine the recurrences obtained equating equal powers:

a₂ = (4/2) a₀ ,   a₃ = (5/3) a₁
a₄ = (6/4) a₂ ,   a₅ = (7/5) a₃
a₆ = (8/6) a₄ ,   a₇ = (9/7) a₅
…

In general

a_{2k} = ((2k+2)/(2k)) a_{2k−2} ,   a_{2k+1} = ((2k+3)/(2k+1)) a_{2k−1}

Thus, multiplying the “even” column,

a₂ a₄ … a_{2k} = [4 · 6 · … · (2k+2)] / [2 · 4 · … · 2k] · a₀ a₂ … a_{2k−2}

which simplifies to

a_{2k} = ((2k+2)/2) a₀ = (k+1) a₀ ,   k ≥ 1

For the odd terms, the same argument leads to

a_{2k+1} = ((2k+3)/3) a₁ ,   k ≥ 1

Now, starting from the standard representation

y = ( a₀ + Σ_{k=1}^∞ a_{2k} x^{2k} ) + ( a₁ x + Σ_{k=1}^∞ a_{2k+1} x^{2k+1} )

it follows that

y = a₀ ( 1 + Σ_{k=1}^∞ (k+1) x^{2k} ) + a₁ ( x + Σ_{k=1}^∞ ((2k+3)/3) x^{2k+1} )

It is possible to verify with the usual convergence tests that both series converge for |x| < 1. Observe that in this particular example both power series can be summed in terms of elementary functions; in fact

y = a₀ Σ_{k=0}^∞ (k+1) x^{2k} + a₁ Σ_{k=0}^∞ ((2k+3)/3) x^{2k+1} = a₀ / (1 − x²)² + (a₁/3) (3x − x³) / (1 − x²)²
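The two closed forms are easy to test numerically; a minimal sketch in plain Python:

    # Numerical check of the closed forms obtained in Example 8.4, |x| < 1.
    def even_sum(x, N=400):    # sum_{k>=0} (k+1) x^(2k)
        return sum((k+1)*x**(2*k) for k in range(N))

    def odd_sum(x, N=400):     # sum_{k>=0} (2k+3)/3 x^(2k+1)
        return sum((2*k + 3)/3*x**(2*k + 1) for k in range(N))

    x = 0.4
    assert abs(even_sum(x) - 1/(1 - x**2)**2) < 1e-12
    assert abs(odd_sum(x) - (3*x - x**3)/(3*(1 - x**2)**2)) < 1e-12
    print("closed forms confirmed at x = 0.4")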

8.2 Airy differential equation


When the solutions of a differential equation cannot be expressed in terms of elementary functions, the search for solutions in the form of a power series becomes the main method of attack on the problem, provided the coefficients of the differential operator are analytic functions. Here we expose how the power series method works for the Airy equation, which is, in terms of its formulation, the simplest second order linear differential equation with variable coefficients:

y'' − xy = 0   (8.4)

However, the process for the solution in power series is more complicated than in the previous Examples 8.3 and 8.4. Let us start the usual power series procedure, inserting (8.1) and (8.3) into the Airy equation (8.4):

Σ_{n=0}^∞ n(n−1) aₙ x^{n−2} − Σ_{n=0}^∞ aₙ x^{n+1} = 0   (8.4a)

From (8.4a) we immediately infer that we must have a₂ = 0 (the constant term 2a₂ of the first series has no counterpart in the second one), and then we can rewrite (8.4a) as

Σ_{n=3}^∞ n(n−1) aₙ x^{n−2} − Σ_{n=0}^∞ aₙ x^{n+1} = 0   (8.4b)

Now we rearrange the indexes in the first series in (8.4b) to obtain

Σ_{n=0}^∞ [ (n+3)(n+2) a_{n+3} − aₙ ] x^{n+1} = 0   (8.4c)

Thus the sequence (aₙ) of the coefficients of the power series solution of equation (8.4) verifies the condition

a_{n+3} = aₙ / ((n+3)(n+2)) ,   a₀, a₁, a₂ ∈ R, with a₂ = 0.   (8.5)

Equation (8.5) identifies a recurrence of the third order, which means that three indexed families a_{3n}, a_{3n+1}, a_{3n+2} are generated by (8.5) starting from a₀, a₁, a₂. Since we saw that a₂ = 0, we see that a_{3n+2} = 0 for any n ∈ N, while a_{3n} and a_{3n+1} are determined as follows:

a₃ = a₀/(3·2) ,   a₄ = a₁/(4·3)
a₆ = a₃/(6·5) ,   a₇ = a₄/(7·6)
a₉ = a₆/(9·8) ,   a₁₀ = a₇/(10·9)
…

In general

a_{3n} = a_{3n−3} / (3n(3n−1)) ,   a_{3n+1} = a_{3n−2} / ((3n+1)·3n)

Thus, multiplying, as we did in Examples 8.3 and 8.4, by columns, we obtain, for n ≥ 1:

a_{3n} = a₀ / ∏_{k=1}^n 3k(3k−1) ,   a_{3n+1} = a₁ / ∏_{k=1}^n (3k+1)·3k   (8.6)

Both products in (8.6) can be computed in closed form using the Euler Gamma function. In fact, first write the product defining a_{3n} as:

∏_{k=1}^n 3k(3k−1) = 3^{2n} n! ∏_{k=1}^n (k − 1/3)

then observe that

∏_{k=1}^n (k − 1/3) = (n − 1/3)(n − 1 − 1/3) … (1 − 1/3) = (n − 1 + 2/3)(n − 2 + 2/3) … (2/3)

so we can invoke equations (1.8) and (1.9), see page 3, to arrive at

∏_{k=1}^n (k − 1/3) = Γ(n + 2/3) / Γ(2/3)

so that

∏_{k=1}^n 3k(3k−1) = 3^{2n} n! Γ(n + 2/3) / Γ(2/3)   (8.7)

A similar argument leads to

∏_{k=1}^n 3k(3k+1) = 3^{2n} n! Γ(n + 4/3) / Γ(4/3)   (8.8)

Now, starting from the representation

y = ( a₀ + Σ_{n=1}^∞ a_{3n} x^{3n} ) + ( a₁ x + Σ_{n=1}^∞ a_{3n+1} x^{3n+1} )

from (8.7) and (8.8) it follows that

y = a₀ ( 1 + Σ_{n=1}^∞ Γ(2/3) x^{3n} / (3^{2n} n! Γ(n + 2/3)) ) + a₁ ( x + Σ_{n=1}^∞ Γ(4/3) x^{3n+1} / (3^{2n} n! Γ(n + 4/3)) )

so that the general solution of (8.4) is

y = a₀ Γ(2/3) Σ_{n=0}^∞ x^{3n} / (3^{2n} n! Γ(n + 2/3)) + a₁ Γ(4/3) Σ_{n=0}^∞ x^{3n+1} / (3^{2n} n! Γ(n + 4/3))   (8.9)
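As a sanity check, a truncation of the first series in (8.9) should satisfy y'' = xy up to discretization error; a minimal sketch in plain Python (math.gamma is in the standard library):

    # Truncated Airy series from (8.9) with a0 = 1, a1 = 0, checked against y'' = x y.
    from math import gamma

    def y(x, N=40):
        return gamma(2/3)*sum(x**(3*n) / (9**n * gamma(n + 1) * gamma(n + 2/3))
                              for n in range(N))

    x, h = 0.7, 1e-4
    ypp = (y(x + h) - 2*y(x) + y(x - h)) / h**2   # central second difference
    print(abs(ypp - x*y(x)))                      # small (finite-difference error only)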

8.3 Solution at regular singular points


When, given the linear differential operator Ly = y'' + p(x)y' + q(x)y, there exists x₀ such that p(x) and/or q(x) are not analytic at x₀, we say that x₀ is a singular point.

Definition 8.5. The singular point x₀ is a regular singular point of equation Ly = 0 if both functions (x − x₀)p(x) and (x − x₀)²q(x) are analytic at x₀. Singular points that are not regular are called irregular.

Assume that x₀ = 0 is a regular singular point of Ly = 0. The power series method then has to be modified, following the so-called Frobenius method. Here we limit ourselves to a brief mention; see for instance [8] or [34] for an extensive treatment. In the presence of regular singular points we search for power series solutions of the form

y = x^α Σ_{n=0}^∞ aₙ xⁿ   (8.10)

α being a real constant, which will be determined by a quadratic equation, called the indicial equation.

We illustrate the procedure with a quite popular example, exposed also in [34] and in [11].

Example 8.6. Consider the equation

2x y'' + (1 + x) y' − 2y = 0   (8.11)

x₀ = 0 is a regular singular point for (8.11). We search for a solution of the form

y = Σ_{n=0}^∞ aₙ x^{n+α}   (8.12)

Differentiating (8.12) we obtain

y' = Σ_{n=0}^∞ (n+α) aₙ x^{n+α−1}   (8.13)


X
00
y = (n + α)(n + α − 1)an xn+α−2 (8.14)
n=0

Thus inserting in (8.11) we get



X ∞
X ∞
X ∞
X
n+α−1 n+α−1 n+α
2 (n + α)(n + α − 1)an x + (n + α)an x + (n + α)an x −2 an xn+α = 0
n=0 n=0 n=0 n=0

or after some simplifications



X ∞
X
(n + α)(2n + 2α − 1)an xn+α−1 + (n + α − 2)an xn+α (8.15)
n=0 n=0

To individuate α we equate the coeffcient of the lowest power of x in (8.15) to zero, here xα−1 ,
obtaining in this way the indicial equation

2α(α − 1) a0 + α a0 = 0 (8.16)

Solving (8.16) we obtain α = 0 and α = 21 ; notice, in view to the analysis of the general case,
that the difference beteween the two roots of the indicial equation is not an integer. To each
values of α corresponds a solution. Equating to zero the powers of xn+α in (8.15) we obtain the
recurrence rule for the coefficient of the power series solution to (8.11):

(n + α)(2n + 2α − 1)an + (n + α − 3)an−1 = 0 (8.17)

We start with α = 0. Form (8.17) we obtain the recursion relation

3−n
an = an−1
n(2n − 1)

which gives

n=1 a1 = 2a0
1 1
n=2 a2 = a1 = a0
6 3
n≥2 an = 0

So we found a polynomial solution y1 = a0 1 + 2x + 31 x2 .




If α = 21 (8.17) gives
(2n − 5)
an = − an−1
2n(2n + 1)
which, for the first values of n, gives

n = 1 →  a₁ = (3/(2·1·3)) a₀
n = 2 →  a₂ = (1/(2·2·5)) a₁
n = 3 →  a₃ = (−1/(2·3·7)) a₂
n = 4 →  a₄ = (−3/(2·4·9)) a₃
n = 5 →  a₅ = (−5/(2·5·11)) a₄
n = 6 →  a₆ = (−7/(2·6·13)) a₅

This suggests that the general term is expressed by

aₙ = −((2n−5)/(2n(2n+1))) a_{n−1}

The column product then yields, for n ≥ 1,

aₙ = (−1)ⁿ 3a₀ / (2ⁿ n! (2n−3)(2n−1)(2n+1))

Thus, choosing a₀ = 1, we can write the second solution of (8.11) as:

y₂ = x^{1/2} + Σ_{n=1}^∞ [ (−1)ⁿ 3 / (2ⁿ n! (2n−3)(2n−1)(2n+1)) ] x^{n+1/2}

We remark that, once the polynomial solution y₁ = 1 + 2x + x²/3 is obtained, a second independent solution can always be individuated using formula (7.22); but, in this specific case, the relevant integration is not possible in terms of elementary functions.
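That y₁ really solves (8.11) is a one-line verification; a minimal sketch, assuming SymPy is available:

    # Verifying the polynomial solution of Example 8.6 (SymPy sketch).
    import sympy as sp

    x = sp.symbols('x')
    y1 = 1 + 2*x + x**2/3

    print(sp.simplify(2*x*y1.diff(x, 2) + (1 + x)*y1.diff(x) - 2*y1))  # 0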

8.4 Bessel equation


In 1764, Euler, investigating the vibrations of a stretched membrane, was probably the first mathematician to get in touch with the differential equation which is now known as the Bessel equation. Later Lagrange, Laplace and Poisson worked with Bessel's equation as well. The German mathematical astronomer Friedrich Wilhelm Bessel studied the equation while working on dynamical astronomy; although Bessel functions were originally discovered by Daniel Bernoulli, they were named after Bessel. Among countless applications, see for instance the dedicated Wikipedia page, Bessel functions are used in mathematical statistics also, in order to express the probability density function of the product of two normally distributed random variables.

The Bessel equation of first kind is:

x² y'' + x y' + (x² − ν²) y = 0   (8.18)

Solutions can be found with the Frobenius method, which consists in finding functions of the form

y = Σ_{n=0}^∞ aₙ x^{n+α} ,   a₀ ≠ 0

y' = Σ_{n=0}^∞ (n+α) aₙ x^{n+α−1}

y'' = Σ_{n=0}^∞ (n+α)(n+α−1) aₙ x^{n+α−2}

Inserting into (8.18) we get

Σ_{n=0}^∞ (n+α)(n+α−1) aₙ x^{n+α} + Σ_{n=0}^∞ (n+α) aₙ x^{n+α} + (x² − ν²) Σ_{n=0}^∞ aₙ x^{n+α} = 0

going on

Σ_{n=0}^∞ [ (n+α)(n+α−1) + (α+n) − ν² ] aₙ x^{n+α} + Σ_{n=0}^∞ aₙ x^{n+α+2} = 0

and finally

Σ_{n=0}^∞ [ (α+n)² − ν² ] aₙ x^{n+α} + Σ_{n=2}^∞ a_{n−2} x^{n+α} = 0

Every coefficient of the powers of x has to be zero. Thus, we have the following equations:

(α² − ν²) a₀ = 0
((α+1)² − ν²) a₁ = 0
((α+n)² − ν²) aₙ + a_{n−2} = 0 ,   n ≥ 2

Since a₀ ≠ 0, we have α = ±ν.

First, we take α = ν. The other equations are

(1 + 2ν) a₁ = 0
n(2ν + n) aₙ + a_{n−2} = 0 ,   n ≥ 2

If 2ν is not a negative integer, these equations establish

a₁ = 0 ,   aₙ = −a_{n−2} / (n(2ν + n)) ,   n ≥ 2

This means that the coefficients of odd index are all zero, and that the ones with even index are determined by the following formula:

a_{2n} = −a_{2n−2} / (2² n(ν + n)) ,   n ∈ N

We can employ the “column” method starting from this recursion:

a₂ = −a₀ / (2² · 1 · (1 + ν))
a₄ = −a₂ / (2² · 2 · (2 + ν))
a₆ = −a₄ / (2² · 3 · (3 + ν))
a₈ = −a₆ / (2² · 4 · (4 + ν))
…

We can infer the general term (using the Pochhammer symbol):

a_{2n} = (−1)ⁿ a₀ / (2^{2n} n! ∏_{k=1}^n (k + ν)) = (−1)ⁿ a₀ / (2^{2n} n! (1 + ν)ₙ)

Or, employing the Gamma function instead of the Pochhammer symbol,

(1 + ν)ₙ = Γ(1 + ν + n) / Γ(1 + ν)

therefore we arrive at

a_{2n} = (−1)ⁿ Γ(1 + ν) a₀ / (2^{2n} n! Γ(1 + ν + n))

Now, since the solution of the Bessel equation has, by the previous argument, the structure

y = Σ_{n=0}^∞ a_{2n} x^{2n+ν} = a₀ Σ_{n=0}^∞ (−1)ⁿ Γ(1 + ν) x^{2n+ν} / (2^{2n} n! Γ(1 + ν + n))

the Bessel function Jν(x) of the first kind of order ν and argument x is obtained choosing

a₀ = 1 / (2^ν Γ(1 + ν))

giving the expression

Jν(x) = Σ_{n=0}^∞ ( (−1)ⁿ / (n! Γ(1 + ν + n)) ) (x/2)^{2n+ν}

where 2ν is not a negative integer.

The same process can be applied for α = −ν. In this case, we get

J₋ν(x) = Σ_{n=0}^∞ ( (−1)ⁿ / (n! Γ(1 − ν + n)) ) (x/2)^{2n−ν}

This time, we need to impose that 2ν is not a positive integer. Using the ratio test it is possible to show that the series defining the Bessel functions of the first kind of order ν and −ν are absolutely convergent for all x ∈ R (as a matter of fact, also in C). Jν and J₋ν are linearly independent: they form a basis for the vector space of the solutions of (8.18).
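The series defining Jν converges fast, so a short truncation already matches library implementations; a minimal sketch, assuming SciPy is available:

    # Comparing the Frobenius series for J_nu with SciPy's implementation.
    from math import gamma
    from scipy.special import jv

    def J_series(nu, x, N=30):
        return sum((-1)**n / (gamma(n + 1)*gamma(1 + nu + n)) * (x/2)**(2*n + nu)
                   for n in range(N))

    nu, x = 0.5, 2.0
    print(J_series(nu, x), jv(nu, x))  # the two values should agree closely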
Bibliography

[1] J.L. Allen and F.M. Stein. On solution of certain Riccati differential equations. Amer.
Math. Monthly, 71:1113–1115, 1964.

[2] Tom Mike Apostol. Calculus: Multi Variable Calculus and Linear Algebra, with Applications
to Differential Equations and Probability. John Wiley & Sons, New York, 1969.

[3] Daniel J. Arrigo. Symmetry analysis of differential equations: an introduction. John Wiley
& Sons, New York, 2015.

[4] E. Artin and M. Butler. The gamma function. Holt, Rinehart and Winston, New York,
1964.

[5] Daniel Bernoulli. Exercitationes quaedam mathematicæ. Apud Dominicum Lovisam, Venice,
1724.

[6] D. Brannan. A First Course in Mathematical Analysis. Cambridge University Press, Cambridge, 2006.

[7] B.R. Choe. An elementary proof of Σ_{n=1}^∞ 1/n² = π²/6. American Mathematical Monthly, 94(7):662–663, 1987.

[8] E.A. Coddington and N. Levinson. Theory of ordinary differential equations. Tata McGraw-
Hill Education, 1955.

[9] Lothar Collatz. Differential Equations: An Introduction With Applications. John Wiley
and Sons, 1986.

[10] P. Duren. Invitation to Classical Analysis, volume 17. American Mathematical Society,
Providence, 2012.

[11] Refaat El Attar. Ordinary Differential Equations. Lulu Press Incorporated, 2006.

[12] Leonhard Euler. De summis serierum reciprocarum. Commentarii academiæ scientiarum Petropolitanae, 7:123–134, 1740.

[13] O.J. Farrell and B. Ross. Solved problems: gamma and beta functions, Legendre polynomials,
Bessel functions. Macmillan, New York, 1963.

[14] Angelo Favini, Ermanno Lanconelli, Enrico Obrecht, and Cesare Parenti. Esercizi di analisi
matematica: Equazioni Differenziali, volume 2. Clueb, 1978.

[15] Herbert I. Freedman. Deterministic Mathematical Models in Population Ecology. M. Dekker, 1980.

[16] A. Ghizzetti, A. Ossicini, and L. Marchetti. Lezioni di complementi di matematica. 2nd ed.
Libreria eredi Virgilo Veschi, Roma, 1972.


[17] Robert Harley. On the theory of the transcendental solution of algebraic equations. Quar-
terly Journal of Pure and Applied Mathematics, 5:337–361, 1862.

[18] Robert Harley. On differential resolvents. Proceedings of the London Mathematical Society,
1(1):35–42, 1865.

[19] Philip Hartman. Ordinary Differential Equations. 2nd ed. SIAM, 2002.

[20] O. Hijab. Introduction to calculus and classical analysis. Springer, New York, 2011.

[21] P. Hydon. Symmetry methods for differential equations: a beginner’s guide. Cambridge
University Press, Cambridge, 2000.

[22] Edward L. Ince. Ordinary Differential Equations. Dover, 1956.

[23] Erich Kamke. Differentialgleichungen: Lösungsmethoden und Lösungen, Band 1. Chelsea, New York, 1971.

[24] Wilfred Kaplan. Advanced calculus. Addison-Wesley, Boston, 1952.

[25] A.M. Legendre. Traité des fonctions elliptiques et des intégrales Euleriennes, volume 2.
Huzard-Courcier, Paris, 1826.

[26] D.H. Lehmer. Interesting series involving the central binomial coefficient. The American
Mathematical Monthly, 92(7):449–457, 1985.

[27] Joseph Liouville. Remarques nouvelles sur l’equation de Riccati. Journal de mathématiques
pures et appliquées, 6:1–13, 1841.

[28] A.R. Magid. Lectures on differential Galois theory. Notices of the American Mathematical
Society, 7:1041–1049, 1994.

[29] C.C. Maican. Integral Evaluations Using the Gamma and Beta Functions and Elliptic
Integrals in Engineering: A Self-study Approach. International Press of Boston Inc., Boston,
2005.

[30] A.M. Mathai and H.J. Haubold. Special Functions for Applied Scientists. Springer, New
York, 2008.

[31] Paul J. Nahin. Inside Interesting Integrals. Springer, Berlin, 2014.

[32] Adam Parker. Who solved the Bernoulli equation and how did they do it? The College Mathematics Journal, 44:89–97, 2013.

[33] Earl David Rainville. Intermediate differential equations. Macmillan, New York, 1964.

[34] Earl David Rainville and Phillip E. Bedient. Elementary differential equations. 6th ed.
Macmillan, New York, 1981.

[35] P.R.P. Rao. The Riccati differential equation. Amer. Math. Monthly, 69:995–996, 1962.

[36] P.R.P. Rao and V.H. Hukidave. Some separable forms of the Riccati equation. Amer. Math.
Monthly, 75:38–39, 1968.

[37] L. Schwartz. Mathematics for the physical sciences. Addison-Wesley, New York, 1966.

[38] H. Siller. On the separability of the Riccati differential equation. Math. Mag., 43:197–202,
1970.

[39] Wolfgang Walter. Ordinary Differential Equations. Springer, Berlin, 1998.



[40] J.S.W. Wong. On solution of certain Riccati differential equations. Math. Mag., 39:141–143,
1966.

[41] R.C. Wrede and M.R. Spiegel. Schaum’s Outline of Advanced Calculus. McGraw-Hill, New
York, 2010.
