Amath350 Coursenotes
Amath350 Coursenotes
Fall 2011
1 Introduction 1
ii
4 Systems of Ordinary Differential Equations 61
4.1 Theory of Linear Systems . . . . . . . . . . . . . . . . . . . . . . . 64
4.1.1 The General Solution of the Homogeneous Equation . . . . . 65
4.1.2 The General Solution of the Inhomogeneous Equation . . . . 69
4.2 Homogeneous Systems with Constant Coefficients . . . . . . . . . . 70
4.2.1 The case n = 2 . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.2.2 The case n = 3 . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.2.3 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.3 Finding a Particular Solution of the Inhomogeneous Equation . . . 84
4.3.1 Method of Variation of Parameters . . . . . . . . . . . . . . 84
References 135
iii
Chapter 1
Introduction
Example 1.2:
Types of Differential Equations
dy
1. = 6x2 y (ODE)
dx
d2 y
2. + sin(y) = sin(x) (ODE)
dx2
dy1
3. = y1 − y2
dt
dy2
= y1 + 3y2 (ODE System)
dt
∂u ∂u
4. x +u =1 (PDE)
∂x ∂y
1
∂F ∂F 1 ∂ 2F
5. F = +S + S2 2 (Black-Scholes PDE)
∂t ∂S 2 ∂S
6. uxxx + uyy = vx
vxxx + vyy = u (PDE System)
To study the solutions of differential equations will classify them using the following
definitions
Definition 1.3: The order of a differential equation is the order of the highest
derivative in the equation.
Definition 1.4: A differential equation in which the dependent variable and all its
derivatives appear linearly is called linear; otherwise it is called nonlinear.
In the examples above 1, 3 and 4 are first order, 2 and 5 are second order and
example 6 is third order. Also, 1, 3, 5 and 6 are linear while 2 and 4 are nonlinear.
Definition 1.5: A solution of a differential equation is any function which satisfies
the equation.
Example 1.6:
3 dy
1. y(x) = 5e2x is a solution of the DE = 6x2 y. To check this, substitute the
dx
function into the equation.
dy
LHS =
dx
3
= 5e2x · 6x2
= 6x2 y
= RHS
2. y1 (t) = e2t , y2 (t) = −e2t is a solution of (3) above. To check this, substitute
the functions into the equations.
and
2
∂u ∂u
3. u(x, t) = sin(x − t) is a solution of the PDE + = 0. To check this, take
∂x ∂t
the appropriate partial derivatives:
∂u
= cos(x − t)
∂x
∂u
= − cos(x − t)
∂t
Note: In the examples above we made no restrictions on the domain of the solution
function. In some cases it is necessary to make a restriction.
Example 1.7:
∂u ∂u √
Consider again the PDE + = 0. The function u(x, t) = t − x clearly
∂x ∂t
satisfies the DE since
1 −1
ut = √ and ux = √
2 t−x 2 t−x
However u(x, y) is only defined if t − x ≥ 0, i.e., t ≥ x, and ut , ux are only defined
√
if t > x. Thus we say u = t − x is a solution of the DE for t > x.
Example 1.8:
Consider the second order linear ODE
y 00 + y = 0. (1.1)
It is easy to check that y1 = cos(x) and y2 = sin(x) are solutions of this DE. In
fact, so is any linear combination of y1 and y2 . That is, for any constants c1 and c2 ,
is a solution of (1.1).
It is easy to check that problem (A) has the solution y(x) = 2 sin(x), problem (B)
has solutions y(x) = c1 sin(x) for any constant c1 and that there is no solution of
the form (1.2) to problem (C).
The condition in (A) is called an initial condition, while those in (B) and
(C) are called boundary conditions. More generally, we have the following.
3
Definition 1.9: An initial value problem (IVP) is a problem which seeks to find
a solution of a differential equation subject to conditions on the unknown function
and its derivatives at one value of the independent variable. These conditions are
called the initial conditions (IC).
Definition 1.10: A boundary value problem (BVP) is a problem which seeks
to find a solution of a differential equation subject to conditions on the unknown
function and its derivatives at two or more values of the independent variable.
These conditions are called boundary conditions (BC).
There are three questions which form the basis of much of the fundamental
theory of differential equations.
The primary focus of this course is the third question, although we will briefly touch
upon questions 1 and 2.
4
Chapter 2
The most general first order, ordinary differential equation is a relationship of the
form
F (x, y, y 0 ) = 0 (2.1)
where y(x) is the (unknown) dependent function and x is the independent variable.
We will restrict our attention to equations that can be put into standard form:
dy
= f (x, y) (2.2)
dx
Definition 2.1: A solution of the first order ordinary differential equation (2.2)
is any differentiable function y = ψ(x) such that
dy
= ψ 0 (x) = f (x, ψ(x))
dx
for all x in some open interval I ⊆ R.
In other words, ψ, ψ 0 must be defined on I and satisfy the differential equation for
all x in I.
Definition 2.2: The interval I in Definition 2.1 is called the interval of existence
or interval of definition of the solution.
5
y = ψ(x), x ∈ I is a solution of (2.2).
y(x0 ) = y0 (2.3)
Questions 1 and 2 from the end of Chapter 1 can be answered by specifying con-
ditions on the function f (x, y) in (2.2). The following is an example of a theorem
which gives sufficient but not necessary conditions for the existence of a unique
solution.
Theorem 2.4 (Existence & Uniqueness). Suppose that f (x, y) and fy (x, y) are
(real, finite, single-valued and) continuous at all points (x, y) within the rectangle
R : {(x, y) : |x − x0 | ≤ a, |y − y0 | ≤ b}. Then the initial value problem consisting of
(2.2) and (2.3) has a unique solution defined on some interval I = |x−x0 | ≤ h ≤ a.
Proof. See section 2.8 of Boyce and DiPrima [1] or Appendix 1 of Edwards and
Penney [2].
(x0,y0) R
x
I
6
2.1 Separable, First Order Ordinary Differential
Equations
A separable, first order ordinary differential equation is an equation of the form
y 0 = f (x, y) where f (x, y) is the product of a function of x only and a function of
y only, i.e.
y 0 = g(x)h(y) (2.4)
or
1 dy
= g(x) (2.5)
h(y) dx
Loosely, a separable equation is one in which we can put everything to do with y
on one side of the equation and everything to do with x on the other side.
Example 2.5:
Solve the differential equation
dy
= −xy. (2.6)
dx
Solution:
If y 6= 0 we can put this in the form
1 dy
= −x
y dx
To solve this equation we integrate both sides with respect to x:
Z Z
1 dy
dx = − x dx.
y dx
dy
Now use the change of variable formula dx dx = dy:
Z Z
1
dy = − x dx.
y
Evaluating the integrals gives:
x2
ln |y| + c1 = − + c2 ,
2
or
x2
ln |y| = − + C,
2
where c3 = c2 − c1 . This defines the solution implicitly. To find an explicit solution,
solve for y as a function of x by taking the exponential of each side.
x2
|y| = e− 2 +c3
x2
= ec3 e− 2
7
There are two cases to consider. Solutions with y > 0 are given by
x2
y = ec3 e− 2 , x ∈ R
y = 0, x ∈ R
Some solutions for various values of the constant A are shown in Figure 2.2. The
2.5
A>0
A= 0
-10 -7.5 -5 -2.5 0 2.5 5 7.5 10
-2.5 A<0
-5
Figure 2.2: General solution of (2.6) for various values of the parameter A.
formula (2.7) containing all possible solutions of the differential equation is called
the general solution of the differential equation.
Definition 2.6: The general solution of a first order ordinary differential equa-
tion (2.2) is a solution containing one arbitrary constant that represents almost all
solutions of the differential equation.
8
Definition 2.7: A solution which is not represented by the general solution is
called a singular solution.
Definition 2.8: A particular solution of equation (2.2) is a solution containing
no arbitrary constants.
Note: Any constant k such that f (x, k) = 0, for all x ∈ R defines an equilibrium
solution of (2.2).
Example 2.10:
x2
As we saw in the Example 2.5, y = Ce− 2 , x ∈ R is the general solution of equation
2 2
(2.6). Some particular solutions are y = e−x /2 , x ∈ R; y = −5e−x /2 , x ∈ R;
y = 0, x ∈ R. The only equilibrium solution of the equation is y = 0, x ∈ R.
2. Separate:
1 dy
= g(x); h(y) 6= 0
h(y) dx
3. Integrate both sides with respect to x:
Z Z
1 dy
dx = g(x) dx
h(y) dx
dy
4. Change the variable of integration in left side using the formula dy = dx:
dx
Z Z
1
dy = g(x) dx
h(y)
9
5. Check for constant (equilibrium) solutions y = k by looking for values of k
that satisfy:
h(k) = 0
Example 2.11:
Solve the initial value problem
dy −x
= , y 6= 0, y(1) = 2 (2.8)
dx y
Solution:
Separate the equation:
dy
y = −x.
dx
Integrate both sides with respect to x:
Z Z
dy
y dx = − x dx.
dx
Use the change of variables formula:
Z Z
y dy = − x dx.
Unlike Example 2.5, however, there is no way to combine these into one general
solution. We can only define the general solution either for y > 0 or for y < 0.
√ √ dy
Note that in both cases y is defined for − C ≤ x ≤ C, but dx is defined only for
√ √
− C < x < C. Thus the general solution of the differential equation is
√ √ √
y = + C − x2 , − C < x < C for y > 0 (2.9)
√ √ √
y = − C − x2 , − C < x < C for y < 0 (2.10)
10
2.4
(1,2)
y>0
1.6
0.8
-4.8 -4 -3.2 -2.4 -1.6 -0.8 0 0.8 1.6 2.4 3.2 4 4.8
-0.8
-1.6 y<0
-2.4
Figure 2.3: Family of solutions for differential equation y 0 = −xy −1 . The specific
solution of (2.8) satisfying the initial condition y(1) = 2 is given by the thick curve.
These solutions determine the two families of curves shown in Figure 2.3
Now apply the initial condition y(1) = 2. This forces us to choose general
solution (2.9). Putting the initial condition into (2.9):
√
2 = C − 1 ⇒ 4 = C − 1 ⇒ C = 5.
This corresponds to the curve in Figure 2.3 that passes through the point (1, 2).
Suppose that a sum of money is invested in a fund that pays interest at a constant
annual rate of r% per year. The value of the investment at any time depends on
the interest rate, but also on the frequency with which interest is compounded.
Typical frequencies of compounding are yearly, quarterly and daily. We will set up
an initial value problem to model the situation.
Variables:
11
t - time (measured in years)
Constants:
1 dV
= r, V 6= 0
Z V dt Z
1 dV
dt = r dt
V dt
Z Z
1
dV = r dt
V
ln |V | = rt + c1
V = Cert C = ±ec1 .
V0 = Cer·0 ⇒ V0 = C.
V = V0 ert .
12
t(years) m=1 m = 4 m = 12 m = 365 continuous
1 1.0500 1.0509 1.0512 1.0513 1.0513
2 1.1025 1.1045 1.1049 1.1052 1.1052
5 1.2763 1.2820 1.12834 1.2840 1.2840
10 1.6289 1.6436 1.6470 1.6487 1.6487
50 11.4674 11.9951 12.1194 12.1803 12.1825
Table 2.1: Value of investment with a yearly interest rate of 0.05% if the interest
is compounded yearly, quarterly, monthly, daily and continuously.
Is this a good approximation? Table 2.1 compares the value of V (t)/V0 for an
obtained using the formula (2.11) and using continuous compounding with r =
0.05?? for various values of m and t. We can conclude from this example that
continuous compounding is a good approximation if m is large and t is not too
large (e.g. your daily interest bank account) and not good if m is small or t is
large (e.g. your Canada savings bonds).
Note: The continuous compounding approximation can also be useful in studying
investments where the interest rate is not constant but varies in time. In this case,
the initial value problem is
dV
= r(t)V, V (0) = V0
dt
13
2.1.2 Dimensions and Units
When setting up a model such as in the last problem it is important to keep track
of the units used to measure each variable and constant. A model is only correct
if it is dimensionally homogeneous. This means that the units of each term in
an equation should be the same.
In our example
dV
has units of dollars/year
dt
r has units of 1/year
So the equation
dV
= rV
dt
is dimensionally homogeneous with each side having units of dollars/year. Also,
both V (0) and V0 have units of dollars so our initial condition
V (0) = V0
Consider the price of a commodity such as oil or wheat. We wish to determine how
the price changes with time, given some basic assumptions.
Let
14
dq
=S−D
dt
Assumption: The rate of increase of price is proportional to the rate at which the
quantity of commodity declines, that is,
dp dq
= −α ,
dt dt
where α > 0 is a constant. This leads to
dp
= −α(S − D),
dt
or
dp
= α(D − S)
dt
In general, S and D can depend on price, time and even the rate of change of price.
This gives rise to the following first order differential equation for the price:
dp
= α(D(t, p, dp
dt
)) − S(t, p, dp
dt
))
dt
Depending on the form of the demand and supply functions, different types of
differential equations will result.
Example 2.12:
Consider a commodity where the supply and demand functions are given by
and the rate of change of price is twice the rate of change of the commodity.
Determine how the price varies as a function of time, given that the starting price
is p0 .
With these specifications, the model becomes:
dp
= 2[D − S]
dt
dp
= 2(100 + 10p − 3p2 − (100 − 20p + 2p2 ))
dt
= 2(30p − 5p2 )
= 10p(6 − p)
(2.12)
15
The differential equation is is separable, so we solve as in previous examples:
dp
= 10p(6 − p)
Z dt Z
1 dp
dt = 10 dt, p(6 − p) 6= 0
p(6 − p) dt
Z Z
1
dp = 10 dt.
p(6 − p)
To evaluate the integral on the left side we need to use partial fractions:
1 A B
= +
p(6 − p) p 6−p
A(6 − p) + Bp
=
p(6 − p)
6A + (B − A)p
=
p(6 − p)
16
so our solution to the initial value problem is
6
p(t) =
( p60 − 1)e−60t + 1
How does the price change with time? Consider the following limits
6
lim p(t) = lim = 6, for all p0
t→∞ t→∞ ( 6 − 1)e−60t + 1
p0
20
15
10
-0.025 0 0.025 0.05 0.075 0.1 0.125 0.15 0.175 0.2 0.225
Figure 2.4: Solutions to initial value problem (2.12) for various values of the initial
price p0 .
17
Conclusion: For any starting price p0 6= 0, 6 the price will gradually approach
$6. If p0 > 6 the price will decrease to $6, if p0 < 6 the price will increase to $6.
y 0 = f (x, y)
One can always find an explicit solution of a linear first order ordinary differ-
ential equation. In the following pages we will develop the procedure for doing
this.
Recall from Calculus that, by the product rule of differentiation,
dv du d
u +v = [u(x)v(x)].
dx dx dx
The expression on the left side is called an exact differential as it can be directly
integrated using the Fundamental Theorem of Calculus:
Z
d
[u(x)v(x)] dx = u(x)v(x) + C
dx
Recognizing expressions which can be written as an exact differential is a key step
in solving linear first order ordinary differential equations. Here are some examples:
d
(i) 2x cos(x) − x2 sin(x) = dx
[x2 cos(x)]
18
dy d
(ii) x dx +y = dx
[xy]
dy d
(iii) cos(x) dx − sin(x)y = dx
[cos(x) y]
If the left side of the linear differential equation (2.13) is an exact differential
then the differential equation is easy to solve.
Example 2.14:
dy
Find the general solution of x dx + y = cos(x).
Solution:
Use the product rule to rewrite the left side as an exact differential:
d
[xy] = cos(x).
dx
Integrate both sides of the equation with respect to x:
Z Z
d
[xy] = cos(x)
dx
xy = sin(x) + C.
19
This equation can be solved as in the last example to find the general solution
(exercise):
y(x) = x2 tan(x) + 2x − 2 tan(x) + C sec(x).
The solution is defined on any interval where cos(x) 6= 0, for example, x ∈ (− π2 , π2 ).
The function cos(x) is called an integrating factor for the differential equation
in the previous example. It turns out we can find an integrating factor for any linear
first order differential equation. We will now describe how to find it.
Consider the just the left side of a linear first order differential equation in
standard form:
dy
+ P (x)y. (2.14)
dx
Let µ(x) be the integrating factor. Multiply the expression (2.14) by µ(x)
dy
µ(x)
+ µ(x)P (x)y.
dx
Now assume this can be written as an exact differential, i.e.,
dy d
µ(x) + µ(x)P (x)y = [µ(x)y].
dx dx
Expanding the right hand side then gives
dy dy dµ
µ(x) + µ(x)P (x)y = µ(x) + y .
dx dx dx
Comparing the two sides, we see µ(x) will make (2.14) and exact differential only
if
dµ
µ(x)P (x) = .
dx
This is a separable differential equation for µ that can be solved as described in the
previous section.
1 dµ
= P (x)
µ dx
Z Z
1
dµ = P (x) dx
µ
Z
ln |µ| = P (x)dx + c1
20
R
Definition 2.16: The function µ(x) = e P (x)dx is called the integrating factor
(IF) for the first order, linear ordinary differential equation
dy
+ P (x)y = Q(x).
dx
With this information in hand, we can now describe a general procedure for
solving first order, linear ordinary differential equations.
6. Solve for y
Example 2.17:
Find the general solution of the differential equation xy 0 + 2y = 2x2 .
Solution:
Rewrite the equation in standard form
2
y 0 + y = 2x.
x
2
Here P (x) = x
so the integrating factor is:
2 2
R R
µ(x) = e P (x)dx
=e x
dx
= e2 ln |x| = eln(x ) = x2 .
21
Multiply through the differential equation (in standard form):
x2 y 0 + 2xy = 2x3 .
S(p) = a0
D(p) = a1 + b1 p
where a0 > 0, a1 > 0, b0 are constants and p is the price. If p0 is the initial price
and the rate of change of the price is equal to the negative of the rate of change of
the quantity of the commodity, determine how the price varies in time.
Solution:
Start with the general model described in subsection 2.1.3:
dp dq
= − = −(S − D) = D − S.
dt dt
Using the functions D and S as above gives:
dp
= a1 + b 1 p − a0 ,
dt
which can be rewritten:
dp
− b 1 p = a1 + a0 . (2.15)
dt
Defining the initial condition:
p(0) = p0 (2.16)
gives the initial value problem to be solved.
The differential equation is linear with integrating factor
R
µ(t) = e− b1 dt
= e−b1 t .
22
Multiply through the equation by µ(t):
dp
e−b1 t − e−b1 t b1 p = (a1 + a0 )e−b1 t .
dt
Rewrite the left hand side:
d −b1 t
[e p] = (a1 + a0 )e−b1 t .
dt
Integrate with respect to t:
a1 − a0 −b1 t
e−b1 t p = e +C
−b1
Solve to find the general solution:
a0 − a1
p(t) = + Ceb1 t
b1
For simplicity denote p∗ = a0b−a
1
1
. Note that p∗ is an equilibrium solution of the
equation. Now apply the initial condition to solve for C:
p0 = p∗ + C ⇒ C = p0 − p∗ .
23
Example 2.20:
In Example 2.12 we considered a supply and demand problem where the model was
the following differential equation
dp
= 10p(6 − p) = 60p − 10p2 ,
dt
which we solved as a separable equation. Show that this equation is also a Bernoulli
differential equation and use the transformation about to solve it.
Solution:
Rewriting the equation in the form
dp
− 60p = −10p2
dt
we see that it is a Bernoulli differential equation with n = 2. To solve, let
v = p1−2 = p−1
then
dv dp
= −p−2
dt dt
Using the differential equation we get
dv dp
= −p−2
dt dt
−2
= −p [60p − 10p2 ]
= −60p−1 + 10
= −60v + 10
dv
⇒ + 60v = 10
dt
This is a linear differential equation for v(t) with integrating factor
R
60dt
µ(t) = e = e60t ,
24
1 6
⇒ p(t) = =
v 1 + Ce−60t
This is exactly the general solution we found by solving the differential equation as
a separable equation (see Example 2.12).
For many systems we model with differential equations we are most interested in
the qualitative behaviour of the solution, that is
For some differential equations we can answer these equations without solving the
differential equation, just by using the properties of the differential equation itself.
We will restrict our attention to separable differential equations of the form
y 0 = f (y) (2.17)
Property 1: The equilibrium solutions of y 0 = f (y) are the zeros of the function
f (y).
dy
Property 2: Since dx
= f (y) we have
25
If f (y) is then y(x) is
positive and increasing increasing, concave up
positive and decreasing increasing, concave down
negative and increasing decreasing, concave down
negative and decreasing decreasing, concave up
Table 2.2: How the graph of the solutions y(x) of (2.17) is related to the properties
of f (y).
Properties 1 and 2 are summarized in Table 2.2 The information from Property 1
together with that in Table 2.2 is enough to give a qualitative sketch of the solution
of the differential equation.
Example 2.21:
Consider the differential equation
dp
= 10p(6 − p) = f (p).
dt
We’ve solved this in two ways already (see Examples 2.12 and 2.20). We will now
show how to apply the qualitative approach to this equation. Note the equilibrium
points are the zeros of f (p), that is p = 0 and p = 6. Now compute f 0 (p)
f 0 (p) = 60 − 20p.
2
Since ddt2p = f (p)f 0 (p) we can use Table 2.2 to determine how the properties of p(t)
relate to the values of p. This information is summarized in Table 2.3. We use this
information to qualitatively sketch the graph of p(t) in the various regions as in
Figure 2.5.
26
p
p=6
p=0
t
The power of this method is that it works on equations that we can’t solve or
that are hard to solve.
Example 2.22:
Give a qualitative sketch of the solutions of
dy
= (y − 1)2 (y − 2).
dx
Solution:
In this problem
f (y) = (y − 1)2 (y − 2)
and
Thus the equation has two equilibrium solutions: p = 1 and p = 2. The properties
of the solutions are as given in Table 2.4. Using the information in Table 2.4 we
can give a qualitative sketch of the solutions as shown in Figure 2.6. From this
sketch it is clear that the equilibrium points are both unstable.
27
Range in y f (y) is so y(x) is
y<1 negative, increasing decreasing, concave down
1 < y < 53 negative, decreasing decreasing, concave up
5
3
<y<2 negative, increasing decreasing, concave down
y>2 positive, increasing increasing, concave up
2.5
y=5/3
1.5
0.5
28
Chapter 3
The most general nth order ordinary differential equation is a relationship of the
form
G(x, y, y 0 , y 00 , . . . , y (n) ) = 0 (3.1)
where
dy 00 d2 y dn y
y0 = , y = 2 , . . . , y (n) = n .
dx dx dx
We will focus on linear differential equations. These are equations where G is a
linear function of y, y 0 , . . . , y (n) . Any linear nth order differential equation can be
written in the form
29
3.1 Theory of nth Order Linear ODEs
Note: An initial value problem for equation (3.2) must have n initial conditions:
Theorem 3.8 (Existence and Uniqueness). If there is an open interval I such that
2. an (x) 6= 0 on I,
then, for any x0 ∈ I, the initial value problem consisting of equations (3.2)–(3.3)
has a unique solution on I.
Example 3.9:
Consider the initial value problem
Here we have n = 2,
30
so the aj (x) and F (x) are all continuous on IR. Further a2 (x) = cos(x) is nonzero
on any interval not containing one of the points (2k + 1)π/2, k an integer. Since
x0 = 0, the conditions of Theorem 3.8 are satisfied on the interval I = (− π2 , π2 ).
Thus we can expect there will be a unique solution of the initial value problem on
this interval.
In order to facilitate our discussion of the theory of nth order ordinary differential
equations, we will introduce the following concept.
Definition 3.10: An operator is a transformation which maps functions to func-
tions
Notation: Let T be an operator that maps the function f to the function g, then
we write
T [f (x)] = g(x), or T f (x) = g(x), or T f = g.
The domain and range of an operator are sets of functions. Some important oper-
ators we will use include the following.
31
Let D2 = D ◦ D then
Similarly
dn f
Dn [f (x)] = D[D[· · · D[f (x)] · · · ]] = f (n) (x) = .
dxn
Using these ideas we may write equation (3.2) in operator notation as
or
(an (x)Dn + an−1 (x)Dn−1 + an−2 (x)Dn−2 + . . . + a1 (x)D + a0 (x)I) y = F (x) (3.4)
We can then interpret solving the differential equation as finding the functions y
that are mapped to the function F by the operator φ(D). Note that the operator
φ(D) maps n times differentiable functions to continuous functions, i.e.,
φ(D) : C n (R) → C(R).
Example 3.12:
The equation
cos(x)y 00 + sin(x)y 0 + 3y = tan(x),
can be written in operator notation as φ(D)y = tan(x) where
φ(D) = cos(x)D2 + sin(x)D + 3I.
The equation
d3 y dy
3
+ 10 + 5y = 0,
dx dx
can be written in operator notation as φ(D) = 0 where φ(D) = D3 + 10D + 5I.
Definition 3.13: An operator, T , is called linear if it has the following property.
Given two functions f1 (x) and f2 (x) (in the domain of T ) and two constants c1 ,
c2 ∈ R
T [c1 f1 (x) + c2 f2 (x)] = c1 T [f1 (x)] + c2 T [f2 (x)].
32
Exercise 3.14:
Show that Dj is a linear operator for any j = 1, 2, . . .
Exercise 3.15:
Show that φ(D) is a linear operator.
The following exercise illustrates the property that operators can be factored to
yield equivalent operators.
Exercise 3.16:
Verify that the operators
(D − 1)(D − 2) and D2 − 3D + 2
or
φ(D)y = 0
This is called the complementary equation or the associated homogeneous
equation of equation (3.5).
Let yp (x) be a particular solution of equation (3.5) and yh (x) be the general
solution of the associated homogeneous equation (3.6), then, by Theorem 3.17
y(x) = yp (x) + yh (x) is a solution of (3.5) containing n arbitrary constants. To
show it is the general solution, we need to show every nonsingular solution of (3.5)
can be written in this form. To do this we first obtain some results about the
general solution of (3.6).
33
3.2 The General Solution of the Homogeneous
Equation
Proof. The proof follows from the linearity of the operator φ(D). For x ∈ I we
have
c1 y1 (x) + · · · + cn yn (x)
34
Example 3.21:
The functions cos x, sin(x) are linearly independent on R. To check this consider
Example 3.25:
35
The following result shows how the Wronskian and linear independence are
related for functions which are solutions of the linear ODE (3.6).
Proof. Suppose that W (y1 , y2 , . . . , yn ) 6= 0 for all x ∈ I. Consider the linear system
y1 (x) y2 (x) ··· yn (x) α1 0
0 0 0
y1 (x) y2 (x) ··· yn (x) α2 0
. . = . . (3.8)
.
.
. .
. .
(n−1) (n−1) (n−1)
y1 (x) y2 (x) · · · yn (x) αn 0
Then u(x) satisfies the initial value problem consisting of (3.6) and the initial
conditions
y(x0 ) = 0, y 0 (x0 ) = 0, · · · , y (n−1) (x0 ) = 0
on the interval I. (The initial conditions come from putting (3.9) into (3.8) and
evaluating at x0 .) However, the function y(x) = 0, x ∈ I also satisfies this initial
value problem. By Theorem 3.8 the initial value problem has a unique solution,
hence we must have u(x) = 0, x ∈ I, i.e.,
36
Since not all of the ᾱj are zero, this implies that y1 (x), y2 (x), . . . , yn (x) are linearly
dependent. This is a contradiction, so we must have W (y1 , y2 , . . . , yn ) 6= 0 on I.
Theorem 3.27. Suppose that the conditions of Theorem 3.8 are satisfied on some
interval I and let y1 (x), y2 (x), . . . , yn (x) be n linearly independent solutions on I
of the differential equation (3.6). Then the general solution is
Proof. Let ψ(x) be any solution of equation (3.6) on I, let x0 ∈ I and consider the
initial value problem consisting of (3.6) and the initial conditions
Clearly, ψ(x) satisfies this initial value problem. Further y(x) defined by (3.10)
satisfies the ODE (3.6) for any constants c1 , . . . , cn . Applying the initial conditions
(3.11) to y(x) yields a linear system of equations for c1 , . . . , cn :
Thus the initial value problem consisting of (3.6) and (3.11) also has the solution
37
But, by Theorem 3.8 the initial value problem has a unique solution on I, therefore
Thus every solution of (3.6) on I can be written in the form (3.10), i.e., (3.10) is
the general solution of (3.6).
Note that this proof shows that every solution of (3.6) can be written in the
form (3.10). This implies that, under the conditions of the theorem, there are no
singular solutions of (3.6). Finally, we state a result that gives a useful formula for
the Wronskian.
for all x ∈ I.
38
From a standard property of determinants, the first n − 2 determinants are zero
and thus
y1 (x) ··· yn (x)
0
y1 (x) ··· yn0 (x)
y100 (x) ··· yn00 (x)
W 0 (x) = .. .. .
. .
(n−2) (n−2)
y1 (x) · · · yn (x)
(n) (n)
y1 (x) · · · yn (x)
Since each yj (x) satisfies (3.6) we have
n−1
!
(n) 1 X (i)
yj (x) = − ai (x)yj (x) ,
an (x) i=0
and hence
an−1 (x)
W0 = − W.
an (x)
39
3.3 The General Solution of the Inhomogeneous
Equation
Theorem 3.29. If yh (x) is the general solution of equation (3.6) and yp (x) is a
particular solution of equation (3.5) then yh (x) + yp (x) is the general solution of
the equation (3.5).
Proof. If follows from Theorem 3.17 that yh (x) + yp (x) is a solution of equation
(3.5). Further, since yh (x) is the general solution of the associated homogeneous
equation (3.6), it contains n arbitrary constants i.e., yh (x) = v(x; c1 , . . . , cn ). So
yh (x) + yp (x) is a solution of equation (3.5) containing n arbitrary constants.
To show that yh (x) + yp (x) is the general solution of equation (3.5) we need to
show that every solution of equation (3.5) can be written in the form yh (x) + yp (x)
for some particular values of the constants.
Let ψ(x) be any solution of (3.5) and let w(x) = ψ(x) − yp (x). Then
In other words, ψ(x) can be written in the form yh (x) + yp (x) for some appropriate
choice of the arbitrary constants.
40
Form the general solution of (3.6):
We will consider methods for step 1 in the following section. Approaches for step
2 will be considered in section 3.5.
It is difficult to find solutions of linear, variable coefficient, higher order ordinary dif-
ferential equations. Thus we will now specialize to the case of constant coefficients.
The general form of a constant coefficient nth order linear ordinary differential
equation is
An y (n) + An−1 y (n−1) + · · · + A1 y 0 + A0 y = F (x) (3.13)
where An 6= 0, Aj ∈ R, j = 0, . . . , n. In operator form we write this as
φ(D)[y] = F (x)
where
φ(D) = An Dn + An−1 Dn−1 + · · · + A1 D + A0 I. (3.14)
The associated homogeneous equation is
or
φ(D)[y] = 0.
To motivate our method of solution for (3.15), consider the case when n = 1:
dy
A1 + A0 y = 0.
dx
Rewriting this as
dy A0
= − y,
dx A1
41
A0
− x
we can solve this as a linear or separable equation to find y = Ce A1 . Thus the
solution of any first order, linear, constant coefficient ODE will be an exponential
function. This motivates us to try this same form of solution for higher order,
linear, constant coefficient ODEs.
Example 3.30:
Find the general solution of the differential equation y 00 − y 0 − 2y = 0.
Solution:
Let y = eλx where λ is a constant to be determined. Taking the derivatives of y(x):
y 0 = λeλx
y 00 = λ2 eλx
or
(λ2 − λ − 2)eλx = 0
Since eλx 6= 0 for all x we require λ to satisfy
λ2 − λ − 2 = 0.
So the solutions are linearly independent on (−∞, ∞) and the general solution is
42
Substituting y = eλx into the differential equation and using the fact that Dk [eλx ] =
dk eλx
dxk
= λk eλx yields the auxiliary equation
Since the right hand side of this equation is an degree n polynomial in λ we know
from a result from Linear Algebra that the equation will have n roots. This leads
to the following.
Summary - Solving the constant coefficient equation (3.16)
Step 3 is not always trivial. In the following we will develop several theorems to
handle this. The first theorem deals with the simplest situation.
Proof. Consider first the case where n = 2 and suppose that λ1 6= λ2 are the two
roots of the auxiliary equation. Calculating the Wronskian we have
" #
λ1 x λ2 x
e e
W (eλ1 x , eλ2 x ) = det λ1 x λ2 x
= (λ2 − λ1 )e(λ1 +λ2 )x = −(λ1 − λ2 )e(λ1 +λ2 )x
λ1 e λ2 e
Clearly W 6= 0 for all x ∈ R since the λj are distinct. The result then follows from
Theorem 3.27.
43
Two issues may arise that make step 3 more complicated.
1. The roots may not be distinct, i.e., the auxiliary equation may have repeated
roots (roots of multiplicity > 1).
The following propositions show how to deal with the first issue.
Proof. (For the case n = 2) Let y2 (x) = u(x)y1 (x) where y1 (x) is a solution of
equation (3.6). Substituting y2 (x) into (3.6) yields
a2 (x)y1 u00 + [2a2 (x)y10 + a1 (x)y1 ]u0 + [a2 (x)y100 + a1 (x)y10 + a0 (x)y1 ]u = 0.
Now the last term is zero, since y1 (x) is solution of equation (3.6). Hence that y2 (x)
will be a solution of equation (3.6) if u(x) is a solution of
But this equation can always be solved as we now describe. Substituting v(x) =
u0 (x), v 0 (x) = u00 (x) yields a first order, linear equation for v(x):
This equation can be solve for v(x) using the method of section 2.2. u(x) may then
be obtained by integration.
To see that the two solutions are linearly independent consider the Wronskian
" #
y1 (x) u(x)y1 (x)
W (y1 , y2 ) = det 0 0 0
= −u0 (x)y1 (x)2 .
y1 (x) u (x)y1 (x) + u(x)y1 (x)
Now y1 (x) 6= 0 by assumption and, since u(x) is not a constant, u0 (x) 6= 0. Thus
W 6= 0 and the two solutions are linearly independent.
44
Proposition 3.33 (Repeated Roots). If λ = b is a repeated root (root of multiplic-
ity 2) of the auxiliary equation (3.17), then two linearly independent solutions on
(−∞, ∞) of equation (3.16) are y1 (x) = ebx and y2 (x) = xebx .
p(λ)(λ − b)2 = 0
or
p(λ)(λ2 − 2bλ + b2 ) = 0,
where p(λ) is a polynomial of degree n − 2 with real coefficients and p(b) 6= 0. This
means that the differential equation can be written
We know that y1 (x) = ebx is one solution of the differential equation. We will use
the Proposition 3.32 to find another.
Let y = u(x)ebx . Substituting into the differential equation we obtain
Clearly this last equation will be satisfied, and hence u(x)ebx will be a solution
of equation (3.16), if u00 (x) = 0. Integrating two times with respect to x yields
u(x) = c1 x + c2 , where c1 , c2 are arbitrary constants. Hence
The second term on the right-hand side is just a multiple of y1 and so the simplest
form of the second solution is
y2 (x) = xebx .
45
Exercise 3.34:
Check y2 = xebx is a solution of
(D3 − 3D + 2I)[y] = 0.
λ3 − 3λ + 2 = 0.
A simple check shows that λ = −2 is one root. Factoring the polynomial yields
(λ − 1)2 (λ + 2) = 0.
y1 = ex , y2 = xex , y3 = e−2x .
One can verify the linear independence of these solutions using the Wronskian or
the definition (exercise). So the general solution is
Multiplicity Solutions
2 ebx , xebx
2
3 ebx , xebx , x2 ebx
.. ..
. .
2 m−1
m x
ebx , xebx , x2 ebx , · · · , (m−1)! ebx
Proof. Similar to Proposition 3.33. Use the Method of Reduction of Order to the
find solutions and the Wronskian to show that they are linearly independent.
46
Note: If one has several distinct roots of different multiplicities one can use the
Wronskian to show that the solutions determined by Proposition 3.36 are linearly
independent.
Example 3.37:
Find the general solution of y 0000 − 10y 000 + 37y 00 − 60y 0 + 36y = 0.
Solution:
In operator form the differential equation is
One can verify the linear independence of these solutions using the Wronskian or
the definition (exercise). So the general solution is
The next proposition tells us how to deal with complex roots of the auxiliary
equation.
Proof. Result 1 follows from the fact that the coefficients of the auxiliary equation
are real and the Complex Conjugate Root Theorem.
47
To prove 2, let
and
v(x) = eλx = e(α−iβ)x = eαx e−iβx = eαx (cos βx − i sin βx).
Now u(x) and v(x) are solutions on R of the differential equation since λ, λ are
roots of the auxiliary equation.
Recall from Theorem 3.18 that any linear combination of two solutions of the
differential equation is also a solution, thus
1 1
y1 (x) = u(x) + v(x) = eαx cos βx
2 2
and
1 1
y2 (x) = u(x) − v(x) = eαx sin βx
2i 2i
are also solutions on R of the differential equation. Further
" #
eαx cos βx eαx sin βx
W (y1 , y2 ) = det
αeαx cos βx − βeαx sin βx αeαx sin βx + βeαx cos βx
= αe2αx cos βx sin βx + βe2αx cos2 βx − [αe2αx cos βx sin βx − βe2αx sin2 βx]
= βe2αx 6= 0, for all x ∈ R
Example 3.39:
Solve the initial value problem y 00 − 2y 0 + 5y = 0, y(π/2) = 0, y 0 (π/2) = 2.
Solution:
Write differential equation in operator form:
(D2 − 2D + 5I)[y] = 0.
λ2 − 2λ + 5 = 0.
Using the quadratic formula, the roots are λ = 1 + 2i, λ = 1 − 2i. Applying
Proposition 3.38 with α = 1, β = 2, shows that two linearly independent solutions
are y1 = ex cos 2x, y2 = ex sin 2x and the hence general solution of the differential
equation is
y(x) = c1 ex cos 2x + c2 ex sin 2x, −∞ < x < ∞.
48
Applying the first initial condition, y(π/2) = 0, we have
π π
0 = c1 e 2 cos π + c2 e 2 sin π
π
0 = −c1 e 2
⇒ c1 = 0.
Therefore y(x) = c2 ex sin 2x and y 0 = c2 ex sin 2x+2c2 ex cos 2x. Applying the second
initial condition, y 0 (π/2) = 2, we obtain
π π
2 = c2 e 2 sin π + 2c2 e 2 cos π
π
2 = −2c2 e 2
π
⇒ c2 = −e− 2 .
Thus the solution of the initial value problem is
π
y(x) = −e− 2 ex sin 2x
π
= −ex− 2 sin 2x, −∞ < x < ∞.
Example 3.40:
Find the general solutions of y 0000 − 3y 00 − 4y = 0.
Solution:
The operator form of the differential equation is
(D4 − 3D2 − 4I)[y] = 0,
hence the auxiliary equation is
λ4 − 3λ2 − 4 = 0.
Let µ = λ2 , then µ satisfies the quadratic equation
µ2 − 3µ − 4 = 0,
which has solutions µ = 4 and µ = −1. Thus we have
λ2 = 4 ⇒ λ = ±2
or
λ2 = −1 ⇒ λ = ±i
The corresponding solutions of the differential equation are therefore:
y1 (x) = e2x , y2 (x) = e−2x , y3 (x) = cos(x), y4 (x) = sin(x).
One can show (exercise) that these solutions are linearly independent. The general
solution is thus
y(x) = c1 e2x + c2 e−2x + c3 cos(x) + c4 sin(x), −∞ < x < ∞.
49
3.5 Finding a Particular Solution of the Inhomo-
geneous Equation
Recall we found that the most general solution to the differential equation φ(D)[y] =
F (x) can be written as
y(x) = yh (x) + yp (x)
where
In the previous section we learned how to obtain solutions yh (x). In this section
we will see several approaches for finding a particular solution yp (x).
Solution:
We saw in Example 3.30 that the general solution of the homogeneous equation
(D2 − D − 2I)[y] = 0 is
We need to find a particular solution of equation (3.18) i.e., to find a function that
is mapped to cos(x) by the operator (D2 − D − 2I). Recall that
50
where M1 , M2 are to be determined. Taking the derivative of yp (x)
or
(−3M1 − M2 ) cos(x) + (−3M2 + M1 ) sin(x) = cos(x).
For this equality to hold for all values of x, the constants M1 and M2 must satisfy
−3M1 − M2 = 1
M1 − 3M2 = = 0.
Solution:
We will try the same approach as the last example. Noting that
51
or
0 = e2x .
This can never be true and so the method fails.
Why does it fail? Recall yh (x) = c1 e2x + c2 e−x so e2x is a solution of the
homogeneous equation, i.e., e2x is mapped to 0 by D2 − D − 2I so it can’t be
mapped to e2x . What other function generates e2x when you take its derivative?
Note that
D[xe2x ] = e2x + 2xe2x .
So we take yp = Lxe2x . Then D[yp ] = L(e2x + 2xe2x ), D2 [yp ] = L(4 + 4x)e2x and
substituting into equation (3.19) we obtain
Simplifying gives
L(3e2x ) = e2x
which implies
1
L= .
3
1
The particular solution is therefore yp = 3 xe2x and the general solution is
1
y(x) = yh (x) + yp (x) = c1 e2x + c2 e−x + xe2x , −∞ < x < ∞.
3
The approach used in Examples 3.41 and 3.42 is called the Method of Unde-
termined Coefficients. The idea is to use the specific form of F (x) to determine
a form for yp (x). A little thought reveals that this method will only work on the
constant coefficient equation (3.13) when F (x) consists of functions with have a
finite number of derivatives. The general procedure is outlined below.
Method of Undetermined Coefficients:
This method applies to equation (3.13) if F (x) consists of products or sums of sines,
cosines, exponentials and non-negative, integer powers of x.
1. Given F (x) choose the form for yp (x) according to the Table 3.1 where
Kj , Mj , L are unknown constants which will be determined.
52
Term in F (x) Form of yp (x)
eγx Leγx
xn Kn xn + Kn−1 xn−1 + · · · + K1 x + K0
cos ωx M1 cos ωx + M2 sin ωx
sin ωx
3. If any term in your choice for yp (x) is a solution of the associated homogeneous
equation (3.15), then multiply yp (x) by the lowest power of x so that it is no
longer a solution.
choose an appropriate ypj for each Fj (x), following steps 1-3, and then add
them together:
yp (x) = yp1 (x) + yp2 (x) + · · · + ypm (x)
5. Substitute the form you have found for yp (x) into the equation (3.13) and
determine the values for the unknown coefficients so that yp (x) satisfies the
equation.
Example 3.43:
Find the solution of the initial value problem
Solution:
The associated homogeneous equation is
y 00 + y = 0,
53
To find yp (x) we identify the terms of the the RHS of the differential equation
F (x) = x2 + cos(x)
= F1 (x) + F2 (x).
yp1 (x) = K2 x2 + K1 x + K0
yp2 (x) = M1 cos(x) + M2 sin(x).
However, the choice for yp2 is a solution of the associated homogeneous equation so
we must multiply it by x
Therefore we have
yp = K2 x2 + K1 x + K0 + M1 x cos(x) + M2 x sin(x).
Taking derivatives
x2 : K 2 = 1
x : K1 = 0
1 : K0 + 2K2 = 0 ⇒ K0 = −2
x cos(x) : 0 = 0
x sin(x) : 0 = 0
1
cos(x) : 2M2 = 1 ⇒ M2 =
2
sin(x) : −2M1 = 0 ⇒ M1 = 0.
54
Putting these values into the form for the particular solution yields
1
yp (x) = x2 − 2 + x sin(x), −∞ < x < ∞.
2
Thus the general solution is
1
y(x) = c1 cos(x) + c2 sin(x) + x2 − 2 + x sin(x), −∞ < x < ∞.
2
Applying the first initial condition, y(0) = 1:
c1 − 2 = 1 ⇒ c1 = 3.
c2 = 1.
we take yp (x) = yp1 (x) + yp2 (x). Following Table 3.1, the form for yp1 is
Multiplying this out and noting that there are just two distinct terms, we take
55
Similarly the form for yp2 from Table 3.1 is
so we take
Proposition 3.32 gave a method for using one solution of the homogeneous equa-
tion (3.6) to generate a second linearly independent solution. In fact this approach
can also be applied to find a particular solution of the inhomogeneous equation
(3.2).
Proof. Using Proposition 3.32 we know that we can generate a second linearly inde-
pendent solution of the homogeneous equation (3.2) of the form y2 (x) = u1 (x)y1 (x).
Now consider the inhomogeneous equation. Proceeding as in the proof of Propo-
sition 3.32, one can show that y(x) = u(x)y1 (x) is a solution of equation (3.2) if
u(x) is a solution of
But this equation can always be solved, just as in the proof of Proposition 3.32. Let
up (x) be the solution. Then yp (x) = up (x)y1 (x) is a solution of the inhomogeneous
equation (3.2). Thus the general solution of equation (3.2) is y(x) = c1 y1 (x) +
c2 y2 (x) + yp (x).
Note that this method works for any second order linear differential equations
(variable coefficient or constant coefficient) with any function, F (x), on the right
hand side.
56
Example 3.46:
Find the general solution of the differential equation y 00 + y = tan(x).
Solution:
The general solution of the associated homogeneous equation is
The Method of Undetermined Coefficients won’t work here since the right hand
side is tan(x). So we try the Method of Reduction of Order, with y1 (x) = cos(x).
We then have
Multiplying through the differential equation and solving in the usual way deter-
mines v(x):
57
Thus
u0 = sec(x) + C1 sec2 (x)
which may be integrated to find u(x):
where we have defined c∗1 = c1 + C2 , c∗2 = c2 + C1 . Note that we could omit the
arbitrary constants in the integration of v(x) and u(x) as these just replicate the
general solution of the associated homogeneous equation. The interval of existence
of this solution is any open interval on which cos(x) 6= 0, e.g., (− π2 , π2 ).
This method was first developed by Joseph Lagrange in 1774. The idea is similar
to the Method of Reduction of Order, but instead of just using one solution of the
associated homogeneous equation, we will use all of them. We will develop the
method for the case n = 2. The approach for higher order equations is similar.
Consider a second order inhomogeneous equation
has general solution yh (x) = c1 y1 (x)+c2 y2 (x). We will look for a particular solution
of (3.20) by “varying the parameters” in this solution, that is, by replacing the
constants c1 and c2 with arbitrary (unknown) functions of x:
Substituting this form into equation (3.20) will give one condition on the unknown
functions u1 (x) and u2 (x). We will arbitrarily impose another condition, which will
give two equations to solve for u1 (x) and u2 (x).
58
To begin, take the derivative of yp (x)
yp0 (x) = u01 (x)y1 (x) + u1 (x)y10 (x) + u02 (x)y2 (x) + u2 (x)y20 (x).
Then we have
a2 (x)(u01 y10 + u1 y100 + u02 y20 + u2 y100 ) + a1 (x)(u1 y10 + u2 y20 ) + a0 (x)(u1 y1 + u2 y2 (= F (x).
(a2 (x)y100 +a1 (x)y10 +a0 (x)y1 )u1 +(a2 (x)y100 +a1 (x)y10 +a0 (x)y1 )u2 +a2 (x)(u01 y10 +u02 y20 ) = F (x).
But the first two terms are zero since y1 and y2 are solutions of the associated
homogeneous equation (3.21). Thus we obtain the following condition on u1 and
u2
a2 (x)u01 y10 + a2 (x)u02 y20 = F (x). (3.24)
Note that equations (3.23) and (3.24) are a set of linear algebraic equations for u01
and u02 with coefficient matrix
" #
y1 (x) y2 (x)
M=
a2 (x)y10 (x) a2 (x)y20 (x)
59
We will use the Method of Variation of Parameters to find a particular solution of
the inhomogeneous equation. To begin, we assume a particular solution of the form
yp (x) = u1 (x) cos(x) + u2 (x) sin(x).
The first derivative is
yp0 (x) = u01 cos(x) − u1 sin(x) + u02 sin(x) + u2 cos(x).
Imposing the condition
u01 cos(x) + u02 sin(x) = 0 (3.25)
this becomes
yp0 (x) = −u1 sin(x) + u2 cos(x).
So the second derivative is
yp00 (x) = −u01 sin(x) − u1 cos(x) + u02 cos(x) − u2 sin(x).
Substituting into the differential equation, we obtain
−u01 sin(x) − u1 cos(x) + u02 cos(x) − u2 sin(x) + u1 cos(x) + u2 sin(x) = tan(x)
which simplifies to
−u01 sin(x) + u02 cos(x) = tan(x) (3.26)
Solving equations (3.25)–(3.26) yields
sin2 (x)
u01 = −
cos(x)
u02 = sin(x).
Integrating these equations gives1
u1 = − ln | sec(x) + tan(x)| + sin(x)
u2 = − cos(x).
Thus the particular solution is
yp (x) = − cos(x) ln | sec(x) + tan(x)| + cos(x) sin(x) − sin(x) cos(x)
= − cos(x) ln | sec(x) + tan(x)|,
and the general solution of the inhomogeneous equation is
y(x) = c1 (x) cos(x) + c2 (x) − cos(x) ln | sec(x) + tan(x)|,
with interval of existence (− π2 , π2 ). Note that this is the same solution as was
obtained using the Method of Reduction of Order in Example 3.46.
1
We omit the constants of integration as they will just regenerate the solution of the homoge-
neous equation.
60
Chapter 4
The order of the system is determined by the order of the highest derivative
that appears in the set of equations.
The system is linear the equations depend linearly on the unknown functions
and their derivatives, otherwise it is called nonlinear.
Here are some examples.
d2 r dr
= + φ + sin(t)
dt dt
dφ
= 5r − 3φ
dt
61
Example 4.2:
Consider a commodity where the supply and demand functions are related to price
through their derivatives:
dS
= F1 (t, P ) (4.1)
dt
dD
= F2 (t, P ) (4.2)
dt
adding in the equation for the price as described in subsection 2.1.3
dP
= α(D − S), where α > 0 is a constant (4.3)
dt
gives a 3 dimensional system of first order, ordinary differential equations for the
unknown functions S(t), D(t), P(t).
This system can be reduced to one equation for the price as follows. From (4.3)
d2 P dD dS
= α − α = α[F2 (t, P ) − F1 (t, P )] (4.4)
dt2 dt dt
which is a second order differential equation for P (t). If (4.4) can be solved for P (t)
(e.g., if F1 , F2 are linear in P ), then we can put the solution for P (t) into (4.1) and
(4.2) and solve for S(t) and D(t) by integration.
The approach of Example 4.2 for solving systems is called the method of
elimination. It does not always apply and is quite ad hoc, so we will develop a
more systematic approach to solving systems. First we state an important result.
where
dk y
y (k) = .
dxk
Introducing the following variables
y1 = y, y2 = y 0 , y3 = y 00 , · · · , yn = y (n−1) ,
62
we have
dy1
= y2 ,
dx
dy2
= y3 ,
dx
..
.
dyn
= g(x, y1 , y2 , · · · , yn ).
dx
This is an n dimensional system of first order, ordinary differential equations for the
unknown functions y1 (x), y2 (x), . . . , yn (x). Any solution of this system will generate
a corresponding solution of the original nth order ordinary differential equation.
The proof for systems of higher-order ordinary differential equations is similar.
Due to Theorem 4.3 we will only consider first order systems. The most general
dimensional system of first order, ordinary differential equations in standard form
is
dy1
= f1 (x, y1 , · · · , yn )
dx
dy2
= f2 (x, y1 , · · · , yn )
dx
..
. (4.5)
dyn
= fn (x, y1 , · · · , yn )
dx
We will focus on the case where the fj are linear functions of the yk . In this case
(4.5) may be written
dy1
= a11 (x)y1 + · · · + a1n (x)yn + f1 (x)
dx
..
. (4.6)
dyn
= an1 (x)y1 + · · · + ann (x)yn + fn (x)
dx
To simplify the development of the theory, we introduce the following notation.
63
Let
y1 a11 (x) · · · a1n (x) F1 (x)
. .. .. .
y = .. , A(x) = , F = .. .
. .
yn an1 (x) · · · ann (x) Fn (x)
Theorem 4.9 (Existence and Uniqueness Theorem). Suppose A(x) and F(x) are
continuous on the open interval I and x0 ∈ I. Then, given any vector
y0 = (y01 , · · · , y0n )T ∈ Rn the initial value problem consisting of (4.7) and (4.8)
has a unique solution on I.
64
It follows from Theorem 4.3 that any nth order linear ordinary differential equa-
tion can be written as an n dimensional first order linear system (4.7). Thus the
theory for solving (4.7) will be similar to that for nth order ordinary differential
equations.
Definition 4.10: The associated homogeneous equation for equation (4.7) is
dy
= A(x)y (4.10)
dx
Theorem 4.11. If u(x) is a solution of equation (4.10) on I1 and v(x) is a solution
of equation (4.7) on I2 then y(x) = u(x) + v(x) is a solution of equation (4.7) on
I = I1 ∩ I2 .
As for nth order linear ODEs, a candidate for the general solution of the inhomo-
geneous equation (4.7) is the sum of the general solution of (4.10) and a particular
solution of (4.7). Before proving this, we first derive some results about the general
solution of (4.10).
y = c1 y 1 + c2 y 2 + · · · + ck y k
65
To define the general solution of (4.10) we need the following
Definition 4.13: Let f1 (x), f2 (x), . . . , fk (x) be real n-dimensional vector-valued
functions defined on the interval I, i.e., fi : I → Rn . The functions f1 , f2 , . . . , fk
are linearly dependent on I if there exist k constants α1 , . . . , αk , which are not
all zero, such that
Solution:
Consider
α1 f1 + α2 f2 = 0
In component form this becomes
Clearly (4.12) is satisfied for any α1 , α2 with α1 = −α2 . Equation (4.11) however
requires α2 = 0 when x = 0 and α1 = 0 when x = π/2. Therefore, the only choice
that satisfies (4.11)–(4.12) for all x ∈ R is α1 = α2 = 0 and the functions f1 , f2 are
linearly independent.
Definition 4.15: The Wronskian of the functions f1 (x), . . . , fn (x) is defined as
f11 (x) f12 (x) · · · fn1 (x)
f12 (x) f22 (x) · · · fn2 (x)
W (f1 , f2 , · · · , fn ) = det .. .. ..
. . .
f1n (x) f2n (x) · · · fnn (x)
Example 4.16:
From the previous example
" #
sin(x) cos(x)
W (f1 , f2 ) = det = sin(x) cos(x)−cos2 (x) = cos(x)(sin(x)−cos(x))
cos(x) cos(x)
66
Proposition 4.17. Let A(x) be a continuous function from I to IRn×n and let
y1 , y2 , . . . , yn be solutions on I of (4.10). Then y1 , y2 , . . . , yn are linearly indepen-
dent on I if and only if W (y1 , y2 , . . . , yn ) 6= 0 for all x ∈ I
Proof. Suppose that W (y1 , y2 , . . . , yn ) 6= 0 for all x ∈ I. Consider the linear system
y11 (x) · · · yn1 (x) α1
.. .. ..
. = 0. (4.13)
. .
y1n (x) · · · ynn (x) αn
Then u(x) satisfies the initial value problem consisting of (4.10) and the initial
condition
y(x0 ) = 0
on the interval I. (The initial condition comes from putting (4.15) into (4.14) and
evaluating at x0 .) However, the function y(x) = 0, x ∈ I also satisfies this initial
value problem. By Theorem 4.9 the initial value problem has a unique solution,
hence we must have u(x) = 0, x ∈ I, i.e.,
Since not all of the ᾱj are zero, this implies that y1 (x), y2 (x), . . . , yn (x) are linearly
dependent. This is a contradiction, so we must have W (y1 , y2 , . . . , yn ) 6= 0 on
I.
67
Theorem 4.18. Let A(x) be a continuous function from I to IRn×n . If y1 , y2 , . . . , yn
are n linearly independent solutions on I of the DE (4.10) then the general solution
of (4.10) is
Proof. Let w(x) be any solution of (4.10). By Proposition 4.17, there is a point
x0 ∈ I such that W (y1 , . . . , yn ) 6= 0 at x = x0 . Define y0 = w(x0 ). Clearly w(x)
is a solution of the IVP consisting of the DE (4.10) and the IC y(x0 ) = y0 .
We now show that there is a solution to this IVP in the form (4.16). It follows
from the Theorem 4.12 that (4.16) is a solution of the DE (4.10). It remains to
show there are constants c1 , ..., cn , such that
Thus the IVP also has the solution k1 y1 (x) + · · · + kn yn (x). By the Existence and
Uniqueness Theorem, the IVP has a unique solution, thus we must have w(x) =
k1 y1 (x) + · · · + kn yn (x). That is, any solution of (4.10) can be written in the form
(4.16).
Note that this proof shows that every solution of (4.10) can be written in the
form (3.10). This implies that, under the conditions of the theorem, there are no
singular solutions of (4.10). Finally, we state a result that gives a useful formula
for the Wronskian.
68
Proposition 4.19. Suppose that the conditions of Theorem 4.9 are satisfied and
let y1 (x), y2 (x), . . . , yn (x) be n solutions on the interval I of the nth order linear
ODE (4.10). Then, for any x0 ∈ I, the Wronskian of these functions satisfies
R x Pn
− aii (x) dx
W (y1 , y2 , . . . , yn ) = W (x0 )e x0 i=1
for all x ∈ I.
Proof. The proof is similar to that of Proposition 3.28. See Section 7.3 of Boyce
and DiPrima [1].
We can now return to the problem of finding the general solution of the inhomoge-
neous equation (4.7).
Theorem 4.20. Let yh (x) be the general solution of equation (4.10) and yp (x) be
a particular solution of equation (4.7). Then yh (x) + yp (x) is the general solution
of (4.7).
Proof. It follows from Theorem 4.11 that yh (x) + yp (x) is a solution of (4.7).
Further, since yh (x) is the general solution of the associated homogeneous equation
(4.10), it contains n arbitrary constants, i.e., yh (x) = v(x; c1 , . . . , cn ).
To show that it is the general solution suppose that ŷ(x) is any solution of (4.7)
and consider
w(x) = ŷ(x) − yp (x).
Differentiating we find
dw dŷ dyp
= − = A(x)ŷ + F(x) − (A(x)yp + F (x))
dx dx dx
= A(x)(ŷ − yp )
= A(x)w,
so w is a solution of (4.10). Then, from Theorem 4.18, w(x) = yh (x) for some
particular values of the arbitrary constants, i.e., w(x) = v(x; K1 , . . . , Kn ). Thus
we have
That is, any solution of equation (4.7) can be written in the form yh (x)+yp (x).
69
Note that this proof shows that every solution of (4.7) can be written in the
form yh (x) + yp (x). This implies that, under the conditions of the theorem, there
are no singular solutions of (4.7).
Summary - Solving the inhomogeneous equation (4.7)
It follows from Theorems 4.18 and 4.20 that the general solution of equation (4.7)
may be found using the following steps.
yh (x) = c1 y1 + c2 y2 + · · · + cn yn
We will consider methods for step 1 in the following section. Approaches for step
2 will be considered in section 4.3.
y = eλx v
λv = Av. (4.21)
71
3. If λ = α+iβ is an eigenvalue of A with eigenvector v = u+iw, then λ = α−iβ
is an eigenvalue of A with eigenvector v = u − iw, where u, w ∈ Rn and u
and w are linearly independent vectors.
4. If λ is a root of multiplicity m of (4.23) it will have at least one and not more
than m associated eigenvectors.
Proof. It is clear from the previous discussion that eλi x vi , i = 0, . . . , k are solutions
of equation (4.20). To show they are linearly independent, consider
α1 v1 + α2 v2 + · · · + αk vk = 0.
72
Thus, by the Principle of Superposition, two real valued solutions are
1
y1 (x) = (z1 (x) + z2 (x)) = (u cos βx − w sin βx)eαx ,
2
1
y2 (x) = (z1 (x) − z2 (x)) = (u sin βx + w cos βx)eαx .
2i
We can check the linear independence of these solutions using the definition. Sup-
pose
α1 y1 (x) + α2 y2 (x) = 0 ∀x ∈ R. (4.25)
For x = 0 this becomes
α1 y1 (0) + α2 y2 (0) = 0
⇒ α1 u + α2 w = 0.
Example 4.23:
Find the general solution of
" #
dy 1 1
= Ay where A = .
dx 4 1
Solution:
First we find the eigenvalues of A. The characteristic equation is:
det(A − λI) = 0
" #
1−λ 1
⇒ det = (1 − λ)2 − 4 = 0
4 1−λ
⇒ λ2 − 2λ − 3 = 0.
73
" #" # " #
1−3 1 v11 0
=
4 1−3 v12 0
" #" # " #
−2 1 v11 0
=
4 −2 v12 0
−2v11 + v12 = 0
⇒ .
4v11 − 2v12 = 0
This system implies v12 = 2v11 so the eigenvectors are given by
!
k
k ∈ R.
2k
For λ2 = −1
(A + I)v2 = 0
" #" # " #
2 1 v21 0
=
4 2 v22 0
2v21 + v22 = 0
⇒ .
4v21 + 2v22 = 0
This system implies v22 = −2v21 so the eigenvector is given by
!
k
k ∈ R.
−2k
Applying Theorem 4.18 we know the general solution of the differential equation is
! ! !
3x 1 −x 1 c1 e3x + c2 e−x
y(x) = c1 e + c2 e = , x ∈ R.
2 −2 2c1 e3x − 2c2 e−x
74
Example 4.24:
Find the solution of the initial value problem
! !
dy −1 −1 1
= y, y(0) = .
dx 5 −3 2
Solution:
First we find the eigenvalues of A. The characteristic equation is
det(A − λI) = 0
!
−1 − λ −1
⇒ det =0
5 −3 − λ
⇒ (−1 − λ)(−3 − λ) + 5 = 0
⇒ λ2 + 4λ + 8 = 0.
(1 − 2i)v1 − v2 = 0
⇒
5v1 − (1 + 2i)v2 = 0
This system implies v2 = (1 − 2i)v1 so the eigenvector is given by
!
1
k k ∈ R.
1 − 2i
Choosing k = 1 we obtain
! ! !
1 1 0
v= = +i = u + iw.
1 − 2i 1 −2
75
5.0
2.5
y2 0.0
-2.5
5
-5.0 0 y1
-1 -5
0 1 2
x
Figure 4.1: Plot of solutions for Example 4.24. The thick curve is the solution of the
initial value problem. The thin curves are other members of the general solution.
So far all our theoretical results have been for general n. To proceed further we
need to consider specific values of n or there will be too many cases to consider.
76
1. The eigenvalues λ1 , λ2 are real and have linearly independent eigenvectors
v1 , v2 . Then Theorem 4.21 applies. Note the eigenvalues may be equal.
2. The eigenvalues are complex conjugates, λ = α ± iβ. Then Theorem 4.22 applies.
3. The eigenvalue λ is real and repeated, with only one linearly independent eigenvector v.
In case 3, we know one solution is y1 = ve^{λx}. A first guess would be to look for a second solution of the form wxe^{λx}, where w is to be determined.
Exercise 4.25:
Show that if y1 = veλx is one solution of (4.10) then there is no way to choose the
vector w so that wxeλx is also a solution.
Instead, we look for a second solution of the form
y2 = ue^{λx} + wxe^{λx},
where u and w are to be determined. Upon substitution into the differential equation (4.10), matching powers of x, we arrive at
x^0:  λu + w = Au  ⇒  (A − λI)u = w (4.26)
x^1:  λw = Aw. (4.27)
We summarize the above discussion in the following.
Proposition 4.27. Suppose that A is a 2 × 2 matrix with one repeated, real eigenvalue, λ, and every eigenvector of λ is a scalar multiple of the vector v. Then the general solution of the differential equation (4.20) is
y(x) = c1 e^{λx} v + c2 e^{λx} (u + xv),
where u is a solution of
(A − λI)u = v.
Example 4.28:
Find the general solution of
dy/dx = Ay,   where   A = [1  −1; 1  3].
Solution:
The characteristic equation is
det(A − λI) = 0
⇒ det[1 − λ  −1; 1  3 − λ] = 0
⇒ (1 − λ)(3 − λ) + 1 = 0
⇒ λ² − 4λ + 4 = 0
⇒ (λ − 2)² = 0.
Thus the eigenvalues are λ = 2, 2. Now solve to find the corresponding eigenvectors.
(A − 2I)v = 0
⇒ [1 − 2  −1; 1  3 − 2] (v1, v2)^T = (0, 0)^T
⇒ [−1  −1; 1  1] (v1, v2)^T = (0, 0)^T
⇒ −v1 − v2 = 0 and v1 + v2 = 0  ⇒  v2 = −v1.
Choose v = (1, −1)^T (every other eigenvector is a scalar multiple of this). One solution of the differential equation is then
y1(x) = e^{2x} (1, −1)^T.
Following Proposition 4.27, we seek a second solution y2 = e^{2x}(u + xv), where u satisfies
(A − 2I)u = v
⇒ [−1  −1; 1  1] (u1, u2)^T = (1, −1)^T
⇒ −u1 − u2 = 1 and u1 + u2 = −1  ⇒  u2 = −1 − u1.
Choose u1 = 0 so that u = (0, −1)^T.
So a second linearly independent solution is
y2(x) = e^{2x} [(0, −1)^T + x (1, −1)^T],
and the general solution is y(x) = c1 y1(x) + c2 y2(x).
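A quick numerical check of this repeated-eigenvalue construction (Python/numpy sketch, names illustrative):

import numpy as np

A = np.array([[1.0, -1.0],
              [1.0,  3.0]])
v = np.array([1.0, -1.0])   # eigenvector for lambda = 2
u = np.array([0.0, -1.0])   # generalized eigenvector: (A - 2I)u = v

print(np.allclose(A @ v, 2*v))                 # True
print(np.allclose((A - 2*np.eye(2)) @ u, v))   # True

# y2(x) = e^{2x}(u + x v)  =>  y2' = e^{2x}(2u + v + 2x v)
x = 0.3
y2  = np.exp(2*x)*(u + x*v)
dy2 = np.exp(2*x)*(2*u + v + 2*x*v)
print(np.allclose(dy2, A @ y2))                # True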
4.2.2 The case n = 3
2. λ1, λ2, λ3 ∈ R with only 2 linearly independent eigenvectors v1, v2: find the third solution as in the n = 2 case of repeated roots.
Example 4.29:
Find the general solution of
y′ = Ay,   A = [−3  0  −4; −1  −1  −1; 1  0  1].
Solution:
One can verify that the eigenvalues of A are λ = −1, −1, −1, with every eigenvector a scalar multiple of v = (0, 1, 0)^T. One solution is therefore
y1 = e^{λx} v = e^{−x} (0, 1, 0)^T.
A second solution is y2 = e^{λx}(u + xv), where u satisfies (A − λI)u = v; solving gives u = (−2, 0, 1)^T.
In a similar manner to the discussion of the n = 2 case, we can show that a third linearly independent solution is given by
y3(x) = e^{λx} (w + xu + (x²/2) v),
where u, v are as above and w satisfies
(A − λI)w = u.
Solving gives w = (−1, 0, 1)^T, so that the general solution is
y(x) = c1 y1(x) + c2 y2(x) + c3 y3(x)
     = e^{−x} (−2c2 − (1 + 2x)c3,  c1 + c2 x + (x²/2) c3,  c2 + (1 + x) c3)^T.
4.2.3 Application
Example 4.30:
Suppose a commodity has a constant supply S = 100 units and the rate of change
of the demand is given by
dD/dt = 80 − 4P,
where P is the price per unit of the commodity. The rate of increase of the price is equal to the rate of decrease of the quantity in stock, here D − S; the initial demand is 70 units and the initial price is $20/unit. Determine how the demand and price vary in time.
Solution:
The model is
dD/dt = 80 − 4P,   D(0) = 70
dP/dt = D − S = D − 100,   P(0) = 20.
Let y = (D, P)^T; then we can write this in vector form as
dy/dt = Ay + F(t), (4.28)
where
A = [0  −4; 1  0],   F = (80, −100)^T.
The associated homogeneous equation is thus
dy/dt = Ay.
The characteristic equation of A is found to be λ² + 4 = 0, so that the eigenvalues are λ = ±2i. We find the eigenvectors as usual:
For λ = 2i
(A − λI)v = 0
[−2i  −4; 1  −2i] (v1, v2)^T = (0, 0)^T
⇒ −2iv1 − 4v2 = 0 and v1 − 2iv2 = 0  ⇒  v2 = (1/2i) v1 = −(i/2) v1.
Choose
v = (2, −i)^T = (2, 0)^T + i(0, −1)^T.
Then λ̄ = −2i has eigenvector v̄ = (2, 0)^T − i(0, −1)^T = (2, i)^T.
Applying Theorem 4.22 with α = 0, β = 2, u = (2, 0)^T, w = (0, −1)^T, shows that the general solution of the associated homogeneous equation is:
yh(t) = c1 [(2, 0)^T cos 2t − (0, −1)^T sin 2t] + c2 [(2, 0)^T sin 2t + (0, −1)^T cos 2t]
      = (2c1 cos 2t + 2c2 sin 2t,  c1 sin 2t − c2 cos 2t)^T.
How do we find yp(t)? Since F(t) is a constant function we will look for a constant solution
yp = (k1, k2)^T,
where k1 , k2 are to be determined. Substituting this into differential equation (4.28)
yields
(0, 0)^T = [0  −4; 1  0] (k1, k2)^T + (80, −100)^T
⇒ −4k2 + 80 = 0 and k1 − 100 = 0  ⇒  k1 = 100, k2 = 20.
Therefore the particular solution is
yp(t) = (100, 20)^T.
Figure 4.2: Plot of price and demand functions vs. time. The price function, P , is
given by the solid line while the demand function, D, is represented by the dashed
line. The thick horizontals represent their equilibrium values.
Now apply the initial condition y(0) = (D(0), P(0))^T = (70, 20)^T:
(2c1 + 100, −c2 + 20)^T = (70, 20)^T  ⇒  c1 = −15, c2 = 0.
The solution is therefore
D(t) = 100 − 30 cos 2t,   P(t) = 20 − 15 sin 2t.
Note that the demand and price oscillate around their equilibrium values (see Figure 4.2).
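The model is easy to integrate numerically, which gives an independent check of this solution; a Python sketch using scipy (names illustrative):

import numpy as np
from scipy.integrate import solve_ivp

# y = (D, P):  D' = 80 - 4P,  P' = D - 100
def rhs(t, y):
    D, P = y
    return [80 - 4*P, D - 100]

sol = solve_ivp(rhs, (0.0, 10.0), [70.0, 20.0],
                rtol=1e-10, atol=1e-12, dense_output=True)
t = np.linspace(0.0, 10.0, 200)
D_exact = 100 - 30*np.cos(2*t)
P_exact = 20 - 15*np.sin(2*t)
num = sol.sol(t)
print(np.abs(num[0] - D_exact).max(), np.abs(num[1] - P_exact).max())
# both near 0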
4.3 Finding a Particular Solution of the Inhomogeneous Equation
4.3.1 Method of Variation of Parameters
This method can be used to find yp(x) or the general solution y(x) of equation (4.7). The main idea is exactly the same as in the Method of Variation of Parameters for nth order equations developed in subsection 3.5.3. We
will develop the method for the case n = 2 first, and give the generalization for
higher dimensional systems at the end.
Suppose that yh (x) = c1 y1 (x) + c2 y2 (x) is the general solution of the associated
homogeneous equation (4.10). Assume that there is a solution of (4.7) of the form
y(x) = u1 (x)y1 (x) + u2 (x)y2 (x) (4.29)
where u1(x), u2(x) are scalar functions to be determined. Taking the derivative:
y′ = u1′ y1 + u1 y1′ + u2′ y2 + u2 y2′,
and substituting into (4.7) gives
u1′ y1 + u1 y1′ + u2′ y2 + u2 y2′ = A u1(x) y1(x) + A u2(x) y2(x) + F.
Using the fact that y1, y2 are solutions of the homogeneous equation (4.10) this becomes
u1′ y1 + u1 Ay1 + u2′ y2 + u2 Ay2 = A u1 y1 + A u2 y2 + F,
which simplifies to
u1′ y1 + u2′ y2 = F. (4.30)
In component form we see this is a linear system in u1′, u2′:
u1′ y11 + u2′ y21 = F1
u1′ y12 + u2′ y22 = F2, (4.31)
with coefficient matrix
M = [y11  y21; y12  y22].
Since y1 and y2 are linearly independent on their interval of existence, I, it follows that det(M) ≠ 0 for all x ∈ I. Thus, for any x ∈ I, we can solve (4.31) to find expressions for u1′ and u2′, viz.,
u1′ = G1(x)
u2′ = G2(x),
where the Gi(x) depend on Fi(x) and yij. These differential equations can be solved by integrating:
u1 = ∫ G1(x) dx + c1,
u2 = ∫ G2(x) dx + c2. (4.32)
Notes:
1. If we leave out the constants of integration in (4.32) then we just get a particular solution of (4.7).
Example 4.31:
Find the general solution of
y′ = Ay + F(x)
where " # !
1 1 x
A= and F(x) = .
4 1 ex
Solution:
We already showed in Example 4.23 that the general solution of the associated homogeneous equation is
yh(x) = c1 e^{3x} (1, 2)^T + c2 e^{−x} (1, −2)^T.
We will apply Variation of Parameters to find the general solution of the given
inhomogeneous equation. Let y(x) = u1 (x)y1 (x) + u2 (x)y2 (x). As shown above,
this implies
u1′ y1 + u2′ y2 = F
or, in component form,
u1′ e^{3x} + u2′ e^{−x} = x
2u1′ e^{3x} − 2u2′ e^{−x} = e^x.
Multiplying the first equation by 2 and adding the second gives
4e^{3x} u1′ = 2x + e^x
⇒ u1′ = (1/2) x e^{−3x} + (1/4) e^{−2x}
⇒ u1 = (1/2) ∫ x e^{−3x} dx + (1/4) ∫ e^{−2x} dx + c1
      = −(1/6) x e^{−3x} − (1/18) e^{−3x} − (1/8) e^{−2x} + c1,
while multiplying the first equation by 2 and subtracting the second gives
4e^{−x} u2′ = 2x − e^x
⇒ u2′ = (1/2) x e^{x} − (1/4) e^{2x}
⇒ u2 = (1/2) ∫ x e^{x} dx − (1/4) ∫ e^{2x} dx + c2
      = (x/2) e^{x} − (1/2) e^{x} − (1/8) e^{2x} + c2.
Thus the general solution of the inhomogeneous differential equation is
y(x) = u1(x) y1(x) + u2(x) y2(x)
= [−(1/6) x e^{−3x} − (1/18) e^{−3x} − (1/8) e^{−2x} + c1] e^{3x} (1, 2)^T
+ [(x/2) e^{x} − (1/2) e^{x} − (1/8) e^{2x} + c2] e^{−x} (1, −2)^T
= c1 (e^{3x}, 2e^{3x})^T + c2 (e^{−x}, −2e^{−x})^T
+ [−x/6 − 1/18 − (1/8)e^x] (1, 2)^T + [x/2 − 1/2 − (1/8)e^x] (1, −2)^T
= (c1 e^{3x} + c2 e^{−x},  2c1 e^{3x} − 2c2 e^{−x})^T + ((1/3)x − 5/9 − (1/4)e^x,  −(4/3)x + 8/9)^T
= yh(x) + yp(x).
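Computations like this are easy to get wrong by hand, so it is worth checking the particular solution symbolically; a sympy sketch (names illustrative):

import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[1, 1], [4, 1]])
F = sp.Matrix([x, sp.exp(x)])

# Particular solution found above
yp = sp.Matrix([x/3 - sp.Rational(5, 9) - sp.exp(x)/4,
                -sp.Rational(4, 3)*x + sp.Rational(8, 9)])

print(sp.simplify(yp.diff(x) - A*yp - F))   # Matrix([[0], [0]])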
Chapter 5
Partial Differential Equations
Partial Differential Equations (PDEs) are differential equations where the unknown
function depends on two or more independent variables. Such equations will
involve partial derivatives. Recall our notation for partial derivatives:
∂u/∂x = ux,   ∂²u/∂x² = uxx,   ∂²u/∂x∂y = uxy.
Here are some examples:
∂u/∂x = y e^{5x} (1st order, linear PDE for u(x, y))
uxxx + cos(y) uxy + x²y² uy = x² u (3rd order, linear PDE for u(x, y))
∂²u/∂x∂y = 8x + sin(y) (2nd order, linear PDE for u(x, y))
φttt + x² φ² φtx + φxx² + t² φy φt = 0 (3rd order, nonlinear PDE for φ(x, t))
∂y/∂x + ∂y/∂t = cos(xt) (1st order, nonlinear PDE for y(x, t))
We will now restate some definitions for partial differential equations.
Definition 5.1: The order of a partial differential equation is the order of the
highest derivative that appears in the equation.
The most general 2nd order partial differential equation for u(x, y) is
G(x, y, u, ux, uy, uxx, uxy, uyy) = 0.
The most general linear, 2nd order partial differential equation for u(x, y) is
a(x, y)uxx +b(x, y)uxy +c(x, y)uyy +d(x, y)ux +e(x, y)uy +f (x, y)u = g(x, y). (5.2)
Recall: For a function F(x, y), ∂F/∂x means “differentiate F with respect to x holding y fixed”. We can use this idea to solve very simple partial differential equations as in the next two examples.
Example 5.3:
Solve the following partial differential equation: ∂u/∂x = y e^{5x}.
Solution:
Integrating both sides with respect to x holding y fixed gives
u(x, y) = (y/5) e^{5x} + F(y),
where F(y) is an arbitrary function of y only. This means that the functions (y/5)e^{5x}, (y/5)e^{5x} + y², and (y/5)e^{5x} + y sin(y)e^{y} are all solutions of the partial differential equation.
We can check that the solution is correct by differentiating:
∂u/∂x = ∂/∂x [(y/5) e^{5x}] + ∂/∂x [F(y)]
      = (y/5)(5e^{5x}) + 0
      = y e^{5x}.
Example 5.4:
Solve the following partial differential equation: ∂²u/∂x∂y = 8x + sin(y).
Solution:
Start by integrating with respect to x holding y fixed, to obtain
∂u/∂y = 4x² + x sin(y) + F(y),
where F(y) is an arbitrary function of y only. Now integrate with respect to y holding x fixed, to obtain
u(x, y) = 4x²y − x cos(y) + ∫ F(y) dy + G(x)
        = 4x²y − x cos(y) + H(y) + G(x),
where H(y) = ∫ F(y) dy and G(x) are arbitrary functions of y and x respectively.
Note: Example 5.3 is a 1st order partial differential equation and has one
arbitrary function in the solution. Example 5.4 is a 2nd order partial differential
equation and has two arbitrary functions in the solution.
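Solutions like these can be verified symbolically; for Example 5.4, a sympy sketch (with particular sample choices for the arbitrary functions):

import sympy as sp

x, y = sp.symbols('x y')

# Sample choices: H(y) = sin(y), G(x) = exp(x)
u = 4*x**2*y - x*sp.cos(y) + sp.sin(y) + sp.exp(x)

print(sp.simplify(sp.diff(u, x, y) - (8*x + sp.sin(y))))   # 0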
Definition 5.5: The general solution of an nth order partial differential equation
is an expression containing n arbitrary functions that represents all nonsingular
solutions of the differential equation.
Definition 5.6: A particular solution of a partial differential equation is a so-
lution containing no arbitrary functions.
Example 5.7:
The general solution of the PDE in Example 5.4 is u(x, y) = 4x²y − x cos(y) + H(y) + G(x). Particular solutions include
u(x, y) = 4x²y − x cos(y),   u(x, y) = 4x²y − x cos(y) + e^y,   u(x, y) = 4x²(y + 1) − cos(y)(x + 1).
Boundary conditions for a partial differential equation specify the unknown function or its derivatives along curves in the xy plane, for example
• u(x0, y) = f(y)
• ∂u/∂x |(x, y0) = g(x),
where f, g are known functions.
The number of boundary conditions needed depends on the order of the partial
differential equation.
Definition 5.8: A differential equation together with a boundary condition is called a boundary value problem (BVP).
Not all boundary value problems have solutions as the following example illus-
trates.
Example 5.9:
Find the solution (if any) of the boundary value problems consisting of the partial differential equation ∂u/∂x = y e^{5x} and each of the following boundary conditions:
(i) u(0, y) = y³, (ii) u(1, y) = 0, (iii) u(x, 0) = x², (iv) uy(0, y) = 1.
Solution:
Recall from Example 5.3 that the general solution of the partial differential equation is u(x, y) = (y/5) e^{5x} + F(y). Now apply each of the boundary conditions.
(i) u(0, y) = y³:
y/5 + F(y) = y³  ⇒  F(y) = y³ − y/5.
So the boundary value problem has the unique solution
u(x, y) = (y/5) e^{5x} + y³ − y/5 = y³ + (y/5)(e^{5x} − 1).
(ii) u(1, y) = 0:
(y/5) e^{5} + F(y) = 0  ⇒  F(y) = −(y/5) e^{5}.
So the boundary value problem has the unique solution
u(x, y) = (y/5) e^{5x} − (y/5) e^{5} = (y/5)(e^{5x} − e^{5}).
(iii) u(x, 0) = x2 :
F (0) = x2
There is no way to choose F to satisfy this equation, so there is no solution
to the boundary value problem.
(iv) uy(0, y) = 1:
uy(x, y) = (1/5) e^{5x} + F′(y)
uy(0, y) = 1
⇒ 1 = 1/5 + F′(y)
⇒ F′(y) = 4/5
⇒ F(y) = (4/5) y + C,
where C is an arbitrary constant. Thus there is a one parameter family of solutions to the boundary value problem,
u(x, y) = (y/5) e^{5x} + (4/5) y + C,
i.e., the boundary value problem has a solution but it is not unique.
Example 5.10:
Find the solution of the boundary value problem ∂²u/∂x∂y = 8x + sin(y), u(x, 0) = x², uy(0, y) = e^{2y}.
Solution:
From Example 5.4, the general solution of the differential equation is
u(x, y) = 4x²y − x cos(y) + H(y) + G(x).
Applying the boundary condition u(x, 0) = x² we have
−x + H(0) + G(x) = x²  ⇒  G(x) = x² + x − H(0).
To apply the second condition we first calculate the partial derivative with respect to y:
uy(x, y) = 4x² + x sin(y) + H′(y).
Then the boundary condition uy(0, y) = e^{2y} implies
H′(y) = e^{2y}
H(y) = (1/2) e^{2y} + C.
We thus have a final solution of
u(x, y) = 4x²y − x cos(y) + (1/2) e^{2y} + C + x² + x − H(0),
which simplifies to
u(x, y) = 4x²y − x cos(y) + (1/2)(e^{2y} − 1) + x² + x,
upon noting that H(0) = C + 1/2.
Geometric Interpretation
Example 5.11:
Solve the boundary value problem
∂²u/∂x∂y = 2 ∂u/∂y + y,   u(x, 0) = x²,   uy(0, y) = 0.
Figure 5.1: Plot of the solutions to the boundary value problems of Example 5.9
(i) and (ii).
Solution:
To begin, rewrite the differential equation as
∂/∂y (∂u/∂x − 2u) = y.
Next, integrate with respect to y holding x fixed:
∂u/∂x − 2u = y²/2 + F(x).
Thinking of y as fixed, this is a linear ordinary differential equation in x with integrating factor e^{−2x}. Multiplying through by the integrating factor and rearranging the left hand side we obtain
∂/∂x (e^{−2x} u) = e^{−2x} y²/2 + e^{−2x} F(x).
Now integrate with respect to x holding y fixed to obtain
e^{−2x} u = −(e^{−2x} y²)/4 + ∫ e^{−2x} F(x) dx + G(y),
where F(x) and G(y) are arbitrary functions. Solving for u(x, y) and simplifying the arbitrary functions as in Example 5.10, we obtain the general solution
u(x, y) = −y²/4 + e^{2x} ∫ e^{−2x} F(x) dx + e^{2x} G(y)
        = −y²/4 + H(x) + e^{2x} G(y),
where H(x) and G(y) are arbitrary functions.
Applying the boundary condition u(x, 0) = x² we have
H(x) + e^{2x} G(0) = x²  ⇒  H(x) = x² − e^{2x} G(0).
Applying uy(0, y) = 0 gives −y/2 + G′(y) = 0, so G(y) = y²/4 + G(0). Substituting these back yields the solution
u(x, y) = x² + (y²/4)(e^{2x} − 1),
which is plotted in Figure 5.2.
Recall that we showed any second order linear partial differential equation for
u(x, y) can be written in the form
a(x, y)uxx +b(x, y)uxy +c(x, y)uyy +d(x, y)ux +e(x, y)uy +f (x, y)u = g(x, y). (5.3)
It is convenient to rewrite this equation using operator notation. The operator Dx takes the partial derivative with respect to x of the input function:
Dx u(x, y) = ∂u/∂x = ux.
Figure 5.2: 3-D plot of the function x² + (y²/4)(e^{2x} − 1).
The operator Dy takes the partial derivative with respect to y of the input function:
Dy u(x, y) = ∂u/∂y = uy.
By extension, we have
Dx² u = Dx(Dx u) = ∂²u/∂x² = uxx,
Dy² u = Dy(Dy u) = ∂²u/∂y² = uyy.
We can now rewrite (5.3) in operator notation:
[a(x, y)Dx2 + b(x, y)Dx Dy + c(x, y)Dy2 + d(x, y)Dx + e(x, y)Dy + f (x, y)I]u = g(x, y).
(5.4)
Note that the operator acting on u is a “polynomial” of degree 2 in Dx and Dy
with coefficients which are functions of x and y. More generally, we can write any
nth order, linear partial differential equation for u(x, y) in the form
Φ(Dx, Dy)u = g(x, y),
where Φ(Dx, Dy) is a polynomial of degree n in Dx and Dy with function coefficients. As with linear ordinary differential equations, the general solution can be written
u = uh + up,
where uh is the general solution of the associated homogeneous equation Φ(Dx, Dy)u = 0 and up is any particular solution.
Example 5.13:
The PDE in Example 5.11 can be written in operator notation as
Φ(Dx, Dy)u = y,   where Φ(Dx, Dy) = Dx Dy − 2Dy.
The associated homogeneous equation is
Φ(Dx, Dy)u = 0,
with general solution uh(x, y) = H(x) + e^{2x} G(y). Thus the general solution of the original PDE can be written
u(x, y) = uh + up = H(x) + e^{2x} G(y) − y²/4.
The most general first order partial differential equation with two independent
variables is given by
G(x, y, u, ux , uy ) = 0. (5.8)
Definition 5.15: The first order PDE (5.8) is linear if G is a linear function of
u, ux and uy .
The most general linear, first order partial differential equation with two independent variables is given by
a(x, y)ux + b(x, y)uy + c(x, y)u = f(x, y). (5.9)
When b(x, y) = 0 this reduces to
a(x, y)ux + c(x, y)u = f(x, y), (5.10)
which can always be solved by fixing y and treating it as a first order linear differential equation in x. (Note that we did this in Example 5.11.) Similarly, one can always solve equation (5.9) in the case a(x, y) = 0, b(x, y) ≠ 0.
To deal with the general equation (5.9) when a(x, y) ≠ 0 and b(x, y) ≠ 0 we will make a change of variables to try to put (5.9) into a form like (5.10). We will illustrate the idea with an example and then discuss the general equation.
Example 5.16:
Solve ux + uy = 0.
Solution:
Consider the change of variables
ξ = x, η = x − y,
and define û(ξ, η) via u(x, y) = û(ξ(x, y), η(x, y)). Then, using the chain rule, we have
ux = ûξ ξx + ûη ηx = ûξ + ûη,
uy = ûξ ξy + ûη ηy = −ûη,
so the differential equation ux + uy = 0 becomes
ûξ = 0.
Integrating with respect to ξ holding η fixed gives û(ξ, η) = F(η) for an arbitrary function F, i.e., the general solution is
u(x, y) = F(x − y).
Note that the solution in Example 5.16 does not depend on x and y independently but only on the combination x − y. This means that along the lines x − y = constant the solution is constant. The lines thus give structure to the solution and are called the characteristics of the partial differential equation. A particular solution is shown in Figure 5.3. The characteristics are the thick lines.
How do we find the change of variables to simplify a partial differential equation
as in Example 5.16? We will show how this can be done for any linear, first order
equation (5.9). To do this we will need the following.
Figure 5.3: Plot of the particular solution of Example 5.16, u(x, y) = sin(x − y).
Lemma 5.17. Let φ(x, y) = k be the general solution of the first order ordinary differential equation dy/dx = f(x, y). Then φ satisfies
φx/φy = −f(x, y).
Proof. Differentiating φ(x, y(x)) = k with respect to x gives
φx + φy (dy/dx) = 0,   i.e.,   φx + φy f(x, y) = 0,
and the result follows on dividing by φy.
Consider equation (5.9), where a(x, y) ≠ 0. We will apply the change of variables
ξ = ξ(x, y),   η = η(x, y),
where ξ and η are chosen so that ξx ηy − ξy ηx ≠ 0. This means that the transformation is invertible and we can write x = x(ξ, η), y = y(ξ, η). Define
û(ξ, η) = u(x(ξ, η), y(ξ, η))
or, equivalently,
u(x, y) = û(ξ(x, y), η(x, y)).
As we saw in Example 5.16,
ux = ûξ ξx + ûη ηx
uy = ûξ ξy + ûη ηy,
so substituting into (5.9) gives
(a(x, y)ξx + b(x, y)ξy) ûξ + (a(x, y)ηx + b(x, y)ηy) ûη + c(x, y)û = f(x, y).
If η is chosen so that a(x, y)ηx + b(x, y)ηy = 0, this reduces to
(a(x, y)ξx + b(x, y)ξy) ûξ + c(x, y)û = f(x, y), (5.12)
and the condition on η can be written
ηx/ηy = −b(x, y)/a(x, y). (5.13)
Thus it follows from Lemma 5.17 that if we choose ξ = x and η = φ(x, y), where φ(x, y) = k is the general solution of the ordinary differential equation
dy/dx = b(x, y)/a(x, y), (5.15)
then (5.13) is satisfied and equation (5.12) becomes
â(ξ, η) ûξ + ĉ(ξ, η) û = f̂(ξ, η),
where â(ξ, η) = a(x(ξ, η), y(ξ, η)), ĉ(ξ, η) = c(x(ξ, η), y(ξ, η)) and f̂(ξ, η) = f(x(ξ, η), y(ξ, η)).
Definition 5.18: Equation (5.15) is called the characteristic equation of the
partial differential equation (5.9). The curves defined by η(x, y) = k, i.e., φ(x, y) =
k are called the characteristics of the partial differential equation (5.9).
Exercise 5.19:
Find the characteristic equation for the partial differential equation of Example 5.16
and show that the method outlined above leads to the change of variables used in
the example.
Example 5.20:
Find and sketch the characteristics of the partial differential equation
ux − xuy = 4,
then find the solution with the boundary condition u(0, y) = sin(y).
Figure 5.4: The characteristics y + x²/2 = k of Example 5.20, for k > 0, k = 0 and k < 0.
Solution:
We have a(x, y) = 1 and b(x, y) = −x so the characteristic equation is
dy/dx = −x.
The general solution of this is
y = −x²/2 + k,
which defines the characteristics for the equation. The characteristics are shown in
Figure 5.4.
Making the change of variables
ξ = x,   η = y + x²/2,
puts the differential equation in the form
ûξ = 4.
Integrating with respect to ξ gives û(ξ, η) = 4ξ + F(η), i.e., u(x, y) = 4x + F(y + x²/2). The boundary condition u(0, y) = sin(y) then gives F(y) = sin(y), so the solution is
u(x, y) = 4x + sin(y + x²/2).
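As always, the answer can be checked by substituting back into the equation and the boundary condition; a sympy sketch:

import sympy as sp

x, y = sp.symbols('x y')
u = 4*x + sp.sin(y + x**2/2)

print(sp.simplify(sp.diff(u, x) - x*sp.diff(u, y) - 4))   # 0
print(u.subs(x, 0))                                       # sin(y)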
Example 5.21:
Find and sketch the characteristics of the partial differential equation
x ux − y uy − 2x² u = x³ y,
then find the general solution.
Solution:
Here a = x and b = −y, so the characteristic equation is dy/dx = −y/x, with general solution xy = k. The characteristics are therefore the hyperbolas xy = k (together with the coordinate axes for k = 0). Making the change of variables
ξ = x,   η = xy
puts the differential equation in the form
ξ ûξ − 2ξ² û = ξ² η,
which simplifies to
ûξ − 2ξ û = ξη.
This is a first order, linear ordinary differential equation with integrating factor e^{−ξ²}. Thus it can be solved in the usual way:
∂/∂ξ (e^{−ξ²} û) = ξ e^{−ξ²} η
e^{−ξ²} û = −(1/2) e^{−ξ²} η + F(η)
û(ξ, η) = −η/2 + e^{ξ²} F(η).
2
Reverting to the original variables yields the general solution
u(x, y) = −xy/2 + e^{x²} F(xy),   x ≠ 0.
The most general linear, second order partial differential equation with two inde-
pendent variables is given by
a(x, y)uxx +b(x, y)uxy +c(x, y)uyy +d(x, y)ux +e(x, y)uy +f (x, y)u = g(x, y). (5.16)
Assume a(x, y) ≠ 0. We will attempt to simplify this equation using the same
approach that we used for first order partial differential equations, that is, using
the change of variables
ξ = ξ(x, y) η = η(x, y)
where det J(ξ, η) ≠ 0, i.e., ξx ηy − ξy ηx ≠ 0.
Let û(ξ, η) = u(x(ξ, η), y(ξ, η)). The first derivatives are as found in the previous
subsection:
ux = ûξ ξx + ûη ηx ,
uy = ûξ ξy + ûη ηy .
The second derivatives are given by
uxx = ûξξ ξx² + ûξ ξxx + 2ûξη ξx ηx + ûηη ηx² + ûη ηxx,
uxy = ûξξ ξx ξy + ûξ ξxy + ûξη ξx ηy + ûξη ξy ηx + ûηη ηx ηy + ûη ηxy,
uyy = ûξξ ξy² + ûξ ξyy + 2ûξη ξy ηy + ûηη ηy² + ûη ηyy.
Substituting these into (5.16) gives
A(ξ, η)ûξξ + B(ξ, η)ûξη + C(ξ, η)ûηη + D(ξ, η)ûξ + E(ξ, η)ûη + f̂(ξ, η)û = ĝ(ξ, η), (5.17)
where
A = aξx² + bξx ξy + cξy²
B = 2aξx ηx + b(ξx ηy + ξy ηx) + 2cξy ηy
C = aηx² + bηx ηy + cηy² (5.18)
D = aξxx + bξxy + cξyy + dξx + eξy
E = aηxx + bηxy + cηyy + dηx + eηy
and
fˆ = f (x(ξ, η), y(ξ, η)), ĝ(ξ, η) = g(x(ξ, η), y(ξ, η)).
Exercise 5.22:
Verify that the formulas given in equation (5.18) are correct.
Now ξ(x, y) and η(x, y) are real valued functions as are a(x, y), b(x, y) and c(x, y).
So whether it is possible to satisfy equations (5.19)-(5.22) depends on the functions
a, b, c. There are three different cases, which are used to classify all second order
partial differential equations as described below.
Definition 5.23: Suppose a(x, y), b(x, y), c(x, y) are continuous functions and a(x, y) ≠ 0 on the region D ⊆ R².
I. If b² − 4ac > 0 on D then the partial differential equation (5.16) is called hyperbolic.
II. If b² − 4ac = 0 on D then the partial differential equation (5.16) is called parabolic.
III. If b² − 4ac < 0 on D then the partial differential equation (5.16) is called elliptic.
Note that if a, b, c are constants (a, b, c ∈ R) then the partial differential equation
will be of one class on all of R2 . If any of a, b, c is a function of x and y then the
partial differential equation may change class as x and y vary.
Example 5.24:
Consider the partial differential equation
uxx + 4y uxy + 9 uyy − ux + uy = 0.
Here a = 1, b = 4y, c = 9, so b² − 4ac = 16y² − 36. Thus the equation is hyperbolic where |y| > 3/2, parabolic on the lines y = ±3/2, and elliptic where |y| < 3/2, as shown in Figure 5.6.
We will now show how each class of second order linear partial differential
equation can be reduced to its simplest form, which is called the canonical form.
Figure 5.6: Classification of the partial differential equation uxx + 4yuxy + 9uyy −
ux + uy = 0.
Class I: Hyperbolic (b² − 4ac > 0)
Let
φ1(x, y) = k (5.25)
φ2(x, y) = k (5.26)
be the general solutions of the ordinary differential equations
dy/dx = (b + √(b² − 4ac))/(2a), (5.23)
dy/dx = (b − √(b² − 4ac))/(2a), (5.24)
respectively. Then, by Lemma 5.17, choosing
ξ = φ1(x, y)
will guarantee
ξx/ξy = (−b − √(b² − 4ac))/(2a)  ⇒  A = 0,
while choosing
η = φ2(x, y)
will guarantee
ηx/ηy = (−b + √(b² − 4ac))/(2a)  ⇒  C = 0.
Note: It is easy to check that ξx ηy − ξy ηx ≠ 0, which guarantees that this transformation is one-to-one. Further, one can show that B ≠ 0.
It follows that the transformation ξ = φ1(x, y), η = φ2(x, y) reduces the partial differential equation (5.16) to
B ûξη + D ûξ + E ûη + f̂ û = ĝ.
Since B ≠ 0 we can divide through to get the canonical form for hyperbolic partial differential equations
ûξη + Φ(ξ, η, û, ûξ, ûη) = 0,
where Φ = (D ûξ + E ûη + f̂ û − ĝ)/B.
Definition 5.25: Equations (5.23) and (5.24) are called the characteristic equations for a hyperbolic partial differential equation. The families of curves in the xy plane corresponding to the general solutions (5.25) and (5.26) of these differential equations are called the characteristics.
Class II: Parabolic (b² − 4ac = 0)
Here the two characteristic equations (5.23) and (5.24) coincide:
dy/dx = b/(2a),
with general solution φ(x, y) = k. Choosing ξ = φ(x, y) gives A = 0, and since b² = 4ac it follows that B = 0 as well. If we choose η(x, y) = x it can be shown that the transformation is invertible and C ≠ 0. Thus, using the substitution
ξ = φ(x, y),   η = x,
the partial differential equation (5.16) reduces to
C ûηη + D ûξ + E ûη + f̂ û = ĝ.
Dividing through by C gives the canonical form for any parabolic partial differential equation
ûηη + Φ(ξ, η, û, ûξ, ûη) = 0,
where Φ = (D ûξ + E ûη + f̂ û − ĝ)/C.
Class III: Elliptic (b² − 4ac < 0)
Here b² − 4ac < 0, so the characteristic equations (5.23) and (5.24) do not have real valued solutions. Thus the partial differential equation has no characteristics. However, using other transformations, we may still simplify the partial differential equation as we now show.
Noting that
B = 2aξx ηx + b(ξx ηy + ξy ηx) + 2cξy ηy,
we see that the transformation ξ = φ(x, y), η = x, where φ(x, y) = k is the general solution of
dy/dx = (bηx + 2cηy)/(2aηx + bηy) = b/(2a),
will make B = 0. Thus the partial differential equation (5.16) becomes
A ûξξ + C ûηη + D ûξ + E ûη + f̂ û = ĝ.
Example 5.26:
Consider
uxx + 2uxy − 3uyy = 0.
Classify the equation and sketch the characteristics if any. Then put the equation
into its canonical form and find the general solution.
Solution:
Here we have a = 1, b = 2, c = −3, so b2 − 4ac = 4 + 12 = 16 > 0 for all (x, y) ∈ R2 .
Thus the partial differential equation is hyperbolic on R2 .
The characteristic equations are
dy/dx = (b + √(b² − 4ac))/(2a) = 3   and   dy/dx = (b − √(b² − 4ac))/(2a) = −1,
which have general solutions
y = 3x + k   and   y = −x + k.
Let ξ = y − 3x and η = y + x. Then A = C = 0, and since d = e = f = g = 0 we also have
D = E = f̂ = ĝ = 0.
Thus the equation reduces to
B ûξη = 0,
where
B = 2aξx ηx + b[ξx ηy + ξy ηx] + 2cξy ηy = −16 ≠ 0.
This implies that
ûξη = 0,
or
∂/∂ξ (∂û/∂η) = 0.
Integrating with respect to ξ holding η fixed yields
∂û/∂η = F(η).
A further integration with respect to η holding ξ fixed then gives
û(ξ, η) = ∫ F(η) dη + G(ξ) = H(η) + G(ξ).
Reverting to the original variables gives the general solution
u(x, y) = H(y + x) + G(y − 3x).
As a check, we compute
ux = H′(y + x) − 3G′(y − 3x)
uxx = H″(y + x) + 9G″(y − 3x)
uxy = H″(y + x) − 3G″(y − 3x)
uy = H′(y + x) + G′(y − 3x)
uyy = H″(y + x) + G″(y − 3x),
so uxx + 2uxy − 3uyy = (1 + 2 − 3)H″ + (9 − 6 − 3)G″ = 0, as required.
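This check can be automated for arbitrary twice differentiable H and G using sympy's undefined functions (sketch):

import sympy as sp

x, y = sp.symbols('x y')
H, G = sp.Function('H'), sp.Function('G')
u = H(y + x) + G(y - 3*x)

expr = sp.diff(u, x, 2) + 2*sp.diff(u, x, y) - 3*sp.diff(u, y, 2)
print(sp.simplify(expr))   # 0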
Example 5.27:
Classify the partial differential equation
4uxx + 4uxy + uyy = 0,
sketch the characteristics (if any), and put the equation into its canonical form.
Solution:
We have a = 4, b = 4, c = 1, so b² − 4ac = 16 − 16 = 0 and the partial differential equation is parabolic on R². The characteristic equation is
dy/dx = b/(2a) = 4/8 = 1/2,
which has general solution
y = x/2 + k.
Therefore, the characteristics are the lines y − x/2 = k.
Let ξ = y − x/2, η = x. Then, from the derivation for Class II, we know that A = 0 and B = 0. Further, since d = e = f = 0 and the second partial derivatives of ξ and η are zero, from equation (5.18) we have
C = aηx² + bηx ηy + cηy² = 4,   D = E = 0,   f̂ = ĝ = 0.
The equation therefore reduces to ûηη = 0, which has general solution û(ξ, η) = F(ξ) + η G(ξ), i.e.,
u(x, y) = F(y − x/2) + x G(y − x/2).
Not all second order partial differential equations may be solved using the ap-
proach of the last two examples. Thus we consider a more general approach.
5.4 The Fourier Transform
Definition 5.28: An integral transform of a function f(x) is an expression of the form
f̂(ω) = ∫_α^β K(ω, x) f(x) dx,
where α, β are constants. f̂(ω) is called the transform of f(x) and K(ω, x) is called the kernel of the transform.
Integral transforms are helpful in solving linear differential equations as they can be used to transform differential equations into simpler equations. For example, an integral transform may change an ordinary differential equation into an algebraic equation.
Definition 5.29: Let f(x) be defined for all x ∈ R. The Fourier Transform of f is defined by
F{f(x)} = f̂(ω) = ∫_{−∞}^{∞} f(x) e^{−iωx} dx.
All the usual rules for evaluating limits can be easily extended to complex valued
functions of a real variable. We will use these rules as needed.
Example 5.30:
Let
f(x) = { e^{bx},  x < 0
       { 0,       x ≥ 0,
where b > 0. Determine if F{f(x)} exists. If it does, calculate it.
Solution:
Consider
∫_{−∞}^{∞} f(x) e^{−iωx} dx = ∫_{−∞}^{0} e^{bx} e^{−iωx} dx
= lim_{s→−∞} ∫_s^0 e^{(b−iω)x} dx
= lim_{s→−∞} [e^{(b−iω)x}/(b − iω)]_s^0
= lim_{s→−∞} (1/(b − iω)) [1 − e^{bs} e^{−iωs}]
= lim_{s→−∞} (1/(b − iω)) [1 − e^{bs}(cos ωs − i sin ωs)]
= 1/(b − iω)
= (b + iω)/(b² + ω²).
Since the integral converges, the Fourier transform of f(x) exists and
F{f(x)} = (b + iω)/(b² + ω²).
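This transform can also be approximated by direct numerical integration, truncating the integral at a large negative limit; a numpy sketch with sample values b = 0.5, ω = 1.3:

import numpy as np

b, omega = 0.5, 1.3

# Riemann-sum approximation of the integral of e^{bx} e^{-i omega x}
# over (-infinity, 0], truncated at x = -40
dx = 1e-4
x = np.arange(-40.0, 0.0, dx)
approx = np.sum(np.exp(b*x)*np.exp(-1j*omega*x))*dx

exact = (b + 1j*omega)/(b**2 + omega**2)
print(abs(approx - exact))   # small (discretization error only)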
Example 5.31:
Let f(x) = e^{bx}, where b is a fixed real number. Determine if F{f(x)} exists. If it does, calculate it.
Solution:
Consider
∫_{−∞}^{∞} f(x) e^{−iωx} dx = ∫_{−∞}^{∞} e^{bx} e^{−iωx} dx
= lim_{s→−∞} ∫_s^0 e^{(b−iω)x} dx + lim_{t→∞} ∫_0^t e^{(b−iω)x} dx
= lim_{s→−∞} (1/(b − iω)) [1 − e^{bs}(cos ωs − i sin ωs)] + lim_{t→∞} (1/(b − iω)) [e^{bt}(cos ωt − i sin ωt) − 1].
If b < 0 the first limit diverges; if b > 0 the second limit diverges; if b = 0 both limits diverge. Thus the improper integral diverges in every case and the Fourier transform of f(x) = e^{bx} does not exist for any b.
Can one determine if the Fourier transform of a function exists without evalu-
ating the improper integral? The next theorem gives a partial answer.
Theorem 5.32. Suppose that
1. f is piecewise continuous on every interval [−M, M], M > 0, and
2. ∫_{−∞}^{∞} |f(x)| dx converges (i.e., f is absolutely integrable).
Then F{f(x)} exists.
Proof. By condition 1, one can write the improper integral in the definition of the Fourier transform as a sum of (improper) integrals of continuous functions. Now
|∫_{−∞}^{∞} f(x) e^{−iωx} dx| ≤ ∫_{−∞}^{∞} |f(x) e^{−iωx}| dx = ∫_{−∞}^{∞} |f(x)| dx,
since |e^{−iωx}| = 1. It follows from a comparison theorem for improper integrals that
∫_{−∞}^{∞} f(x) e^{−iωx} dx
converges.
Note that this theorem gives sufficient conditions for existence of the Fourier
transform. If a function does not satisfy one or both of these conditions, then the
Fourier transform of the function may or may not exist.
Example 5.33:
Let
f(x) = { −10,  |x| ≤ 1
       { 0,    otherwise.
Determine if F{f (x)} exists.
Solution:
Clearly f (x) is piecewise continuous on every interval [−M, M ] for any M > 0.
Now consider
∫_{−∞}^{∞} |f(x)| dx = ∫_{−1}^{1} 10 dx = 20.
Thus f(x) is absolutely integrable, and we can conclude, applying Theorem 5.32, that the Fourier transform of f(x) exists.
Another question we may ask is whether the Fourier transform is invertible, i.e.,
given fˆ(ω) can we find f (x) such that F{f (x)} = fˆ(ω)? The answer lies in the
following definition and theorem.
Definition 5.34: Let f̂(ω) be defined for all ω ∈ R. The Inverse Fourier Transform of f̂(ω) is given by
F^{−1}{f̂(ω)} = (1/2π) ∫_{−∞}^{∞} f̂(ω) e^{iωx} dω.
Theorem 5.35. If f is continuous and absolutely integrable and F{f(x)} = f̂(ω), then
F^{−1}{f̂(ω)} = f(x).
The following theorem shows that the Fourier transform is a linear transforma-
tion.
Theorem 5.36. Let f and g be two functions such that the Fourier transforms of f and g exist and let α and β be two real numbers. Then the Fourier transform of αf + βg exists and
F{αf(x) + βg(x)} = αF{f(x)} + βF{g(x)}.
Note that the second step follows from the fact that the Fourier transforms of f and g both exist, so all the improper integrals are convergent.
The next theorem shows how the Fourier transform maps derivatives of func-
tions. It is a key result needed when using Fourier transforms to solve differential
equations.
Theorem 5.37. Let f be differentiable on R and suppose that the Fourier transform of f exists and is given by F{f(x)} = f̂(ω). Then the Fourier transform of f′ exists and is given by F{f′(x)} = iω f̂(ω).
Proof. Since F{f(x)} exists, ∫_{−∞}^{∞} f(x) e^{−iωx} dx converges, which implies f(x) → 0 as x → ±∞.
Using Integration by Parts we have
∫_a^b f′(x) e^{−iωx} dx = [f(x) e^{−iωx}]_a^b − ∫_a^b f(x)(−iω) e^{−iωx} dx.
Letting a → −∞ and b → ∞, the boundary term vanishes (since f(x) → 0 as x → ±∞) and we obtain
F{f′(x)} = iω ∫_{−∞}^{∞} f(x) e^{−iωx} dx = iω f̂(ω).
Another useful property of the Fourier transform is given in the following the-
orem.
Theorem 5.40 (Shifting Property). Let f(x) be continuous and absolutely integrable with Fourier transform f̂(ω). Then
F^{−1}{f̂(ω) e^{−iωt}} = f(x − t).
Example 5.41:
Use the Fourier transform to find the general solution of the partial differential
equation
ux + uy = 0.
Solution:
Assume that u is continuous and absolutely integrable with respect to x (i.e., ∫_{−∞}^{∞} |u(x, y)| dx < ∞ for all y ∈ R). Then the Fourier transform of u with respect to x exists. Let
û(ω, y) = F{u(x, y)} = ∫_{−∞}^{∞} u(x, y) e^{−iωx} dx.
Taking the Fourier transform of the differential equation and applying Theorem 5.37 to the ux term gives
iω û + ûy = 0,
which we rewrite as
∂û/∂y = −iω û.
Fixing ω, this is a separable ordinary differential equation for û with respect to y. Solving in the usual way gives
∫ dû/û = −∫ iω dy
ln |û| = −iωy + F̂(ω)
û(ω, y) = Ĝ(ω) e^{−iωy},
where Ĝ(ω) = ±e^{F̂(ω)} is an arbitrary function of ω.
If we assume that Ĝ has an inverse Fourier transform G(x) which is absolutely integrable and continuous, then we may apply the Shifting Property (Theorem 5.40) to conclude that the general solution of the differential equation is
u(x, y) = F^{−1}{Ĝ(ω) e^{−iωy}} = G(x − y),
where G is an arbitrary (differentiable) function. Note that this solution is the
same as what we found in Example 5.16 using the method of characteristics.
In the last example, we needed to find the inverse Fourier transform of a function
that was a product of two functions of ω, û(ω, y) = Ĝ(ω)e−iωy . We were able to
do this directly using the Shifting Property, but that may not always be the case.
The next theorem deals with this situation in a more general way. First we need a
definition.
Definition 5.42: The convolution of f and g is
(f ∗ g)(x) = ∫_{−∞}^{∞} f(x − z) g(z) dz.
Theorem 5.43. Let f (x), g(x) be bounded, continuous and absolutely integrable
with Fourier transforms given by F{f (x)} = fˆ(ω) and F{g(x)} = ĝ(ω). Then
F −1 {fˆ(ω)ĝ(ω)} = f ∗ g.
1. f ∗ g = g ∗ f
Proof.
1. From the definition we have
(f ∗ g)(x) = ∫_{−∞}^{∞} f(x − z) g(z) dz.
Letting w = x − z, so that dz = −dw,
= −∫_{∞}^{−∞} f(w) g(x − w) dw = ∫_{−∞}^{∞} g(x − w) f(w) dw = (g ∗ f)(x).
uxx ∝ ut,
or
γ uxx = ut. (5.29)
The constant of proportionality, γ, is called the diffusivity or diffusion constant. The most common boundary value problems involving this equation involve one condition at a fixed time, typically an initial condition
u(x, 0) = f(x), (5.30)
together with boundary conditions as x → ±∞, e.g.,
lim_{x→−∞} u(x, t) = 0,   lim_{x→∞} u(x, t) = 0. (5.32)
Writing the equation as γ uxx − ut = 0, we will use the Fourier transform to solve the heat equation subject to the conditions (5.30) and (5.32). Note that the boundary conditions (5.32) are necessary conditions for u to be absolutely integrable with respect to x.
Assuming that the Fourier transform of u exists (u continuous and absolutely integrable with respect to x), let û(ω, t) = F{u(x, t)} = ∫_{−∞}^{∞} u(x, t) e^{−iωx} dx. Applying the differentiation theorem (Theorem 5.37) twice, we have
F{uxx} = (iω)² û = −ω² û,   F{ut} = ∂û/∂t.
Putting these into the given partial differential equation yields the simpler equation
∂û/∂t = −γω² û. (5.33)
Transforming the initial condition (5.30) gives
û(ω, 0) = F{f(x)} = f̂(ω), (5.34)
assuming this exists. We will solve the initial value problem consisting of (5.33) and (5.34).
For fixed ω, (5.33) is a separable differential equation for û with respect to t, which may be solved in the usual way:
∂û/∂t = −γω² û
∫ dû/û = −∫ γω² dt
ln |û| = −γω² t + G(ω)
û(ω, t) = H(ω) e^{−γω² t}.
Applying the initial condition (5.34) gives H(ω) = f̂(ω), so û(ω, t) = f̂(ω) e^{−γω² t}.
We know F^{−1}{f̂(ω)} = f(x). What about F^{−1}{e^{−γω² t}}? Applying the definition of the inverse transform and evaluating the resulting integral (by completing the square in the exponent) gives
F^{−1}{e^{−γω² t}} = e^{−x²/(4γt)} / (2√(πγt)) = g(x).
Thus we have
u(x, t) = F^{−1}{f̂(ω) ĝ(ω)},   where ĝ(ω) = e^{−γω² t},
and, by Theorem 5.43,
u(x, t) = (f ∗ g)(x)
= ∫_{−∞}^{∞} f(x − z) g(z) dz
= ∫_{−∞}^{∞} f(x − z) (e^{−z²/(4γt)} / (2√(πγt))) dz
for t > 0, x ∈ R. Setting y = −z/(2√(γt)) this can be expressed as
u(x, t) = (1/√π) ∫_{−∞}^{∞} f(x + 2√(γt) y) e^{−y²} dy.
Exercise 5.46:
Verify that the expression above satisfies the heat equation and the conditions
(5.30) and (5.32).
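For a concrete initial temperature the last integral can be evaluated numerically. With f(x) = e^{−x²} the solution is also known in closed form, u(x, t) = e^{−x²/(1+4γt)}/√(1 + 4γt), which makes a convenient test; a numpy sketch (sample values of γ and t):

import numpy as np

gamma, t = 0.7, 0.5

def u(x):
    # u(x,t) = (1/sqrt(pi)) * integral of f(x + 2*sqrt(gamma*t)*y) e^{-y^2} dy,
    # evaluated by a simple Riemann sum, with f(x) = exp(-x^2)
    dy = 1e-3
    y = np.arange(-10.0, 10.0, dy)
    f = np.exp(-(x + 2*np.sqrt(gamma*t)*y)**2)
    return np.sum(f*np.exp(-y**2))*dy/np.sqrt(np.pi)

exact = lambda x: np.exp(-x**2/(1 + 4*gamma*t))/np.sqrt(1 + 4*gamma*t)
print(u(1.0), exact(1.0))   # agree closely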
Chapter 6
Pricing of Derivatives
6.1 Terminology
An option is the right, but not obligation, to buy or sell a security (e.g., a stock)
for an agreed upon price at (or before) some agreed upon time in the future.
• The agreed upon price is called the strike price and will be denoted by K.
• The agreed upon time is called the strike time or expiry date and will be
denoted by T .
Consider first a call option, which gives its owner the right to buy the underlying security. Let S denote the price of the security at the strike time T. There are two possibilities:
• S > K: The owner of the option uses (exercises) it to buy stock at price K, then sells the stock at price S, for profit S − K. The value of the option at t = T is S − K. (The option is in the money.)
• S ≤ K: The owner of the option will not exercise it and makes no profit. We say it expires unexercised. The value of the option at t = T is 0. (The option is out of the money.)
Thus the value of a call option at the strike time is
C(T) = { S − K  if S > K
       { 0      if S ≤ K.
Similarly, one can show the value of a put option at the strike time is
P(T) = { 0      if S > K
       { K − S  if S ≤ K.
Arbitrage
Arbitrage exists whenever financial instruments are mispriced relative to each other.
Example:
Bank A charges 5% interest per year on a loan.
Bank B gives 6% interest per year on a savings account.
A person can take a one-year loan for $1000 from Bank A and put it in a savings account at Bank B. At the end of the year they have $1060 in the account and have to repay only $1050, thus making a $10 profit (without investing any of their own money).
Assumption: Whenever arbitrage exists prices will quickly change to eliminate it.
Thus, any model for the price of an option should be set up so there is no
arbitrage. Potential for arbitrage is usually measured by comparison with a “risk
free” investment (e.g., Government bonds) continuously compounded at interest
rate r. The following (overly simple) example illustrates this.
Example 6.1:
The price of a share of stock is currently $S. At time T the price will be $200
with probability p and $50 with probability (1 − p). An investor can purchase a
European call option with price C. The strike time of the option is T and strike
price is $150. How should C be chosen?
Simple approach: compare two investments
- with probability p : stock price is $200 > $150. The value of the option is
thus $200 - $150 = $50.
- with probability 1 − p: the stock price is $50 < $150. The value of the option is thus $0.
The expected value of the option at time T is 50p + 0(1 − p) = 50p. For there to be no arbitrage, this should equal the value at time T of C invested risk free:
C e^{rT} = 50p  ⇒  C = 50p e^{−rT}.
Note: Price should really depend on the initial price of stock. More detailed
analysis shows that p, r are related to the initial price.
6.2 Pricing Options
Let F be the value of an option on some asset. We will assume that F depends on
• t – time
• S – the price of the underlying asset.
Let S(t) be the price at time t and S(t + ∆t) be the price a small time ∆t later. Then the change in S is ∆S = S(t + ∆t) − S(t), and we can represent the model above as:
∆S = fdet(S, t, ∆t) + frand(S, t, ∆W(t)),
where ∆W(t) is a continuous-time stochastic process (a stochastic process can be thought of as a random variable with a time dependent probability density function). This means that S is also a stochastic process.
We will focus on a simple model of the following form:
∆S = µS ∆t + σS ∆W(t), (6.1)
where µ and σ are constants. µ is called the growth rate or drift and σ is called the volatility. In addition, we will assume that ∆W(t) is a normally distributed random variable with mean 0 and variance ∆t. In this case, ∆W(t) is called a Wiener process, and can be written ∆W(t) = φ√∆t, where φ is a normally distributed random variable with mean 0 and variance 1. In the limit ∆t → 0, (6.1) becomes the stochastic differential equation
dS = µS dt + σS dW(t). (6.2)
Exercise 6.2:
Let ∆W(t) = φ√∆t, where φ is a normally distributed random variable with mean 0 and variance 1. Verify the following.
1. ∆W (t) is a normally distributed random variable with mean 0 and variance
∆t.
2. (∆W(t))² has mean ∆t and variance 2(∆t)².
If the price of an asset, S(t), satisfies the stochastic DE (6.2) what is a model for
the value of an option on that asset, F (S, t)? Consider the change in F resulting
from a change ∆t in t (and a corresponding change ∆S in S):
∆F = (∂F/∂S) ∆S + (∂F/∂t) ∆t + (1/2)(∂²F/∂S²)(∆S)² + (∂²F/∂S∂t)(∆S)(∆t) + (1/2)(∂²F/∂t²)(∆t)² + · · ·
(Use (6.1))
= FS [µS∆t + σS∆W(t)] + Ft ∆t + (1/2) FSS [µS∆t + σS∆W(t)]²
  + FSt [µS∆t + σS∆W(t)] ∆t + (1/2) Ftt (∆t)² + · · ·
= σSFS ∆W(t) + [Ft + µSFS] ∆t + (1/2) σ²S² FSS (∆W(t))²
  + [µσS² FSS + σSFSt] ∆t ∆W(t) + [(1/2) Ftt + µSFSt + (1/2) µ²S² FSS] (∆t)² + · · ·
Figure 6.1: Typical solutions of the stochastic DE (6.2) with initial condition S(0) =
10. (a) µ = −0.05, σ = 0.1; (b) µ = 0.05, σ = 0.1; (c) µ = 0.05, σ = 0.2.
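Paths like those in Figure 6.1 can be generated with the Euler-Maruyama scheme, stepping (6.1) with ∆W drawn from N(0, ∆t); a Python sketch (names and step count illustrative):

import numpy as np

def sample_path(mu, sigma, S0=10.0, T=10.0, n=2000, seed=0):
    # Euler-Maruyama for dS = mu*S dt + sigma*S dW
    rng = np.random.default_rng(seed)
    dt = T/n
    S = np.empty(n + 1)
    S[0] = S0
    for k in range(n):
        dW = rng.normal(0.0, np.sqrt(dt))
        S[k+1] = S[k] + mu*S[k]*dt + sigma*S[k]*dW
    return S

path = sample_path(mu=0.05, sigma=0.2)   # compare panel (c)
print(path[0], path[-1])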
Now we will use the assumption that ∆W(t) = φ√∆t. Recall that this implies (∆W(t))² has mean ∆t and variance 2(∆t)². Thus we will approximate (∆W(t))² by ∆t. Retaining only the largest terms in ∆t then gives
∆F ≈ σSFS ∆W(t) + (Ft + µSFS + (1/2)σ²S²FSS) ∆t. (6.3)
Letting ∆t → 0 gives a stochastic DE for F:
dF = σSFS dW(t) + (Ft + µSFS + (1/2)σ²S²FSS) dt. (6.4)
Note that this is in the same form as (6.2) but the coefficients are dependent on
S, F and the derivatives of F . It is much more difficult to solve (even numerically)
than (6.2).
The idea of Black and Scholes was to eliminate the stochastic term from the DE for F in the following way. Consider a portfolio created by selling the option and buying ∆ units of the underlying asset. The value of this portfolio at time t is Π = F − ∆S. The change in this value over a small time change ∆t, assuming ∆ is constant over the time ∆t, is
∆Π = ∆F − ∆ ∆S
(Use (6.1) and (6.3))
= [σSFS − σS∆] ∆W(t) + [Ft + µSFS + (1/2)σ²S²FSS − µS∆] ∆t.
Next comes the key step. Setting ∆ = FS eliminates the stochastic term (and part of the deterministic term), leaving:
∆Π = [Ft + (1/2)σ²S²FSS] ∆t.
Then as ∆t → 0 this gives
dΠ = [Ft + (1/2)σ²S²FSS] dt,
which is equivalent to the deterministic differential equation
dΠ/dt = Ft + (1/2)σ²S²FSS. (6.5)
Now consider an alternate portfolio where the same amount of money is put in a risk free investment with interest rate r. Assuming continuous compounding, the risk free investment satisfies the DE
dΠ/dt = rΠ. (6.6)
Assuming that no arbitrage exists and there are no transaction fees, the returns
of the two portfolios should be equal for all time. To see why, suppose that, at
some time, the return from the first portfolio is more than that from the second.
Then a person could make a risk free profit by taking a loan at interest rate r and
using this to buy the first portfolio. Then the amount owed on the loan would
be less than the return on the investment. Similarly, if the return from the first portfolio is less than the second, then someone could make a risk free profit by selling the portfolio and investing the money at interest rate r.
Thus the right hand sides of (6.5) and (6.6) should be equal, i.e.,
rΠ = Ft + (1/2)σ²S²FSS.
Setting Π = F − ∆S and ∆ = FS gives the Black-Scholes PDE for F:
(1/2)σ²S²FSS + rSFS + Ft − rF = 0, (6.7)
where F = F(S, t), 0 ≤ t ≤ T and 0 ≤ S < ∞.
We will now formulate boundary conditions for this PDE. The boundary condi-
tions determine what kind of option it is. We will consider a European call option.
One condition comes from the value of the option at the strike time. Recall that
K is the strike price. At t = T there are two possibilities
• S > K. In this case the option is exercised and the price at the strike time is
S − K, i.e., F (S, T ) = S − K.
• S ≤ K. In this case the option expires unexercised since the option is worth-
less at the strike time, i.e., F (S, T ) = 0.
This gives the following condition (sometimes called the final condition):
F(S, T) = { S − K  if S > K
          { 0      if S ≤ K. (6.8)
A second condition comes from property (P1) above. If S = 0 at some time t ≤ T, then S = 0 < K at the strike time T and the option won't be exercised, i.e., F(0, T) = 0. Thus, as soon as S = 0 we know the option won't be exercised, so it becomes worthless. This can be expressed as the boundary condition
F(0, t) = 0. (6.9)
A third condition comes from considering the limit of large values of the price. If S → ∞ at some t0 < T, then S > K for all t > t0. In particular S > K at the strike time T, so F(S, T) = S − K. Thus, for very large S we know the option will be exercised and will have value S − K. This can be expressed as
F(S, t) = S − K as S → ∞. (6.10)
We will solve the Black-Scholes PDE by transforming it into the heat equation and
using the solution for the heat equation derived in the last chapter.
Step I. Change of independent and dependent variables
Consider the change of variables S = Ke^x, t = T − 2τ/σ². It is easy to check that this transformation is invertible and has inverse transformation x = ln(S/K), τ = (σ²/2)(T − t). Let F(S, t) = Kv(x, τ) and calculate the partial derivatives:
Ft = ∂(Kv)/∂t = K vτ (dτ/dt) = −(σ²K/2) vτ,
FS = ∂(Kv)/∂S = K vx (dx/dS) = (K/S) vx,
FSS = ∂FS/∂S = ∂/∂S [(K/S) vx] = −(K/S²) vx + (K/S) vxx (dx/dS) = (K/S²)(vxx − vx).
So the PDE (6.7) becomes
(σ²K/2)(vxx − vx) + rK vx − (σ²K/2) vτ − rKv = 0
⇓
vxx + (2r/σ² − 1) vx − vτ − (2r/σ²) v = 0,
or
vxx + (δ − 1) vx − vτ − δv = 0, (6.11)
where δ = 2r/σ² > 0. Note that this is a constant coefficient parabolic partial differential equation.
Note also that we have the following correspondence of limits:
t → 0 ⇔ τ → σ²T/2
t → T ⇔ τ → 0
S → 0 ⇔ x → −∞
S → ∞ ⇔ x → ∞.
Thus we will consider the solution of (6.11) on 0 ≤ τ ≤ σ²T/2 and −∞ < x < ∞.
The final condition, (6.8), becomes an initial condition:
v(x, 0) = max(e^x − 1, 0), (6.12)
and the boundary conditions (6.9) and (6.10) become
v(x, τ) → 0 as x → −∞, (6.13)
v(x, τ) = e^x as x → ∞. (6.14)
Step II. A second change of dependent variable
Now let v(x, τ) = e^{αx+βτ} u(x, τ). Choosing α = (1 − δ)/2 and β = −(δ + 1)²/4 eliminates the lower order terms in (6.11), leaving
uxx − uτ = 0. (6.15)
Note that this is the heat equation (5.29) with γ = 1. With this change of the dependent variable, initial condition (6.12) becomes
u(x, 0) = e^{−αx} v(x, 0) = max(e^{((δ+1)/2)x} − e^{((δ−1)/2)x}, 0). (6.16)
The boundary condition (6.14) becomes
e^{αx+βτ} u(x, τ) = e^x, as x → ∞.
Thus we have reduced the problem of solving the Black-Scholes PDE (6.7) on 0 ≤ t ≤ T and 0 ≤ S < ∞ subject to the final condition (6.8) to the problem of solving (6.15) on 0 ≤ τ ≤ σ²T/2 and −∞ < x < ∞ subject to the initial condition (6.16).
Step III. Using the solution of the heat equation
Recall (see the end of Chapter 5) that the heat equation BVP
γuxx = ut,   u(x, 0) = f(x),   lim_{x→±∞} u(x, t) = 0,
may be solved using the Fourier Transform, and that the solution is given by
u(x, τ) = (1/√π) ∫_{−∞}^{∞} f(x + 2√(γτ) y) e^{−y²} dy.
Here γ = 1 and f is the function in the initial condition (6.16), so the solution is
u(x, τ) = (1/√π) ∫_{−∞}^{∞} [e^{((δ+1)/2)(x+2√τ y)} − e^{((δ−1)/2)(x+2√τ y)}]_+ e^{−y²} dy,
where [·]_+ denotes the positive part.
To find an explicit expression for the solution, we need to evaluate the integral. To begin, note that
e^{((δ+1)/2)(x+2√τ y)} − e^{((δ−1)/2)(x+2√τ y)} > 0  ⇔  x + 2√τ y > 0.
Thus the lower limit of the integral may be replaced by −x/(2√τ) and the integral may be split into two parts:
u(x, τ) = (e^{((δ+1)/2)x}/√π) ∫_{−x/(2√τ)}^{∞} e^{(δ+1)√τ y − y²} dy − (e^{((δ−1)/2)x}/√π) ∫_{−x/(2√τ)}^{∞} e^{(δ−1)√τ y − y²} dy.
Now make the change of variables w1 = −√2 (y − ((δ+1)/2)√τ) in the first integral and w2 = −√2 (y − ((δ−1)/2)√τ) in the second. Then we have
u(x, τ) = (e^{((δ+1)/2)x + ((δ+1)²/4)τ}/√(2π)) ∫_{−∞}^{x/√(2τ) + ((δ+1)/2)√(2τ)} e^{−w1²/2} dw1
        − (e^{((δ−1)/2)x + ((δ−1)²/4)τ}/√(2π)) ∫_{−∞}^{x/√(2τ) + ((δ−1)/2)√(2τ)} e^{−w2²/2} dw2.
Recall that
Φ(z) = (1/√(2π)) ∫_{−∞}^{z} e^{−w²/2} dw
is the cumulative distribution function for a normal random variable with mean zero and standard deviation 1.
Thus we can write
u(x, τ) = e^{((δ+1)/2)x + ((δ+1)²/4)τ} Φ(x/√(2τ) + ((δ+1)/2)√(2τ)) − e^{((δ−1)/2)x + ((δ−1)²/4)τ} Φ(x/√(2τ) + ((δ−1)/2)√(2τ)).
Returning to the original variables, F(S, t) = Ke^{((1−δ)/2)x − ((δ+1)²/4)τ} u(x, τ), with x = ln(S/K), δ = 2r/σ², τ = (σ²/2)(T − t), gives the solution of the Black-Scholes equation for a European call option:
F(S, t) = S Φ(d1) − Ke^{−r(T−t)} Φ(d2),
where
d1 = [ln(S/K) + (r + σ²/2)(T − t)] / (σ√(T − t)),   d2 = d1 − σ√(T − t).
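This formula is straightforward to implement; a Python sketch using scipy.stats.norm for Φ (the numerical inputs in the example are illustrative):

import numpy as np
from scipy.stats import norm

def call_price(S, K, r, sigma, tau):
    # Black-Scholes European call; tau = T - t is the time to expiry
    d1 = (np.log(S/K) + (r + 0.5*sigma**2)*tau)/(sigma*np.sqrt(tau))
    d2 = d1 - sigma*np.sqrt(tau)
    return S*norm.cdf(d1) - K*np.exp(-r*tau)*norm.cdf(d2)

print(call_price(S=100.0, K=150.0, r=0.02, sigma=0.3, tau=1.0))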
The derivation above relies on a number of assumptions, including the following.
1. No arbitrage exists.
2. The price obeys the stochastic DE (6.2) with dW(t) a normal random variable with mean 0 and variance dt.
7. Assets are divisible, i.e., we do not have to buy or sell integer numbers of assets.