MST209 Mathematical Methods and Models
This unit introduces the topic of differential equations. It is an important
field of study, and several subsequent units are also devoted to it. There are
many applications of differential equations throughout the course.
The subject is developed without assuming that you have come across it
before, but the unit assumes that you have previously had a basic grounding
in calculus. In particular, you will need to have a good grasp of the basic
rules for differentiation and integration. (These were revised in Unit 1 of
this course.)
From the point of view of later studies, Sections 3 and 4 contain the most
important material.
The recommended study pattern is to study one section per study session,
and to study the sections in the order in which they appear.
You will need the computer algebra package for the course for Subsection 2.3
and for all of Section 5. The computer work for Subsection 2.3 may be
postponed until later (for example until your study of Section 5) without
affecting your ability to study the subsequent sections.
This publication forms part of an Open University course. Details of this and other Open
University courses can be obtained from the Course Information and Advice Centre, PO Box 724,
The Open University, Milton Keynes, MK7 6ZS, United Kingdom: tel. +44 (0)1908 653231.
Alternatively, you may visit the Open University website at http://www.open.ac.uk where you can
learn more about the wide range of courses and packs offered at all levels by The Open University.
To purchase a selection of Open University course materials, visit the webshop at www.ouw.co.uk,
or contact Open University Worldwide, Michael Young Building, Walton Hall, Milton Keynes,
MK7 6AA, United Kingdom, for a brochure: tel. +44 (0)1908 858785, fax +44 (0)1908 858787.
In the course you will meet many examples of differential equations. Fre-
quently these arise from studying the motion of physical objects, but we
shall start with an example drawn from biology and show how this leads
naturally to a particular differential equation.
Suppose that we are interested in the size of a particular population, and in
how it varies over time. The first point to make is that any population size is
measured in integers (whole numbers), so it is not clear how differentiation
will be relevant. (Differentiable functions must be continuous, and therefore
defined on an interval of real numbers in R.) Nevertheless, if the population
is large, say in the hundreds of thousands, a change of one unit will be
relatively very small, and in these circumstances we may choose to model
the population size as a continuous function of time. We shall write this
function as P (t), and our task is to show how P (t) may be described by a
differential equation.
Let us assume a fixed starting time (which we shall label t = 0). If the
population is not constant, then there will be ‘leavers’ and ‘joiners’. For
example, in a population of humans in a particular country, the former
will be those who die or emigrate, whilst the latter represent births and
immigrants.
It is usual to express birth rates as a proportion of the current population
size. For example, the UK Office for National Statistics quotes birth rates in
various age groups as a number per 1000 women. Death rates are specified
in a similar way. To emphasize that these rates are expressed as a proportion
of the current population, we shall use the terms ‘proportionate birth rate’
and ‘proportionate death rate’.
(Margin note: In our model the proportionate birth rate is expressed as a
proportion of the whole population, not just the number of women.)
For our simple model we shall ignore immigration and emigration, and concentrate
solely on births and deaths. Denote the proportionate birth rate
by b and the proportionate death rate by c. Then, in a short interval of
time δt, we would expect
\[ \text{number of births} \simeq bP(t)\,\delta t, \tag{1.1} \]
\[ \text{number of deaths} \simeq cP(t)\,\delta t, \tag{1.2} \]
where P (t) is the population size at time t.
Input–output principle: accumulation = input − output.
This principle applies to any quantity whose change, over a given time interval,
is due solely to the specified input and output.
The accumulation δP of population over the time interval δt is the popu-
lation at the end of the interval minus the population at the start of the
interval; that is,
δP = P (t + δt) − P (t).
The input is the number of births (Equation (1.1)), and the output is the
number of deaths (Equation (1.2)). The input–output principle now enables
us to express the accumulation δP of the population over the time interval
δt as
\[ \delta P \simeq bP(t)\,\delta t - cP(t)\,\delta t = (b - c)P(t)\,\delta t. \]
Dividing through by δt, we obtain
\[ \frac{\delta P}{\delta t} \simeq (b - c)P(t). \]
The approximations involved in deriving this equation become progressively
more accurate for shorter time intervals. So, finally, by letting δt tend to
zero, we obtain
\[ \frac{dP}{dt} = (b - c)P(t). \]
(Margin note: This is the step that requires P to be a continuous (rather
than discrete) function of t.)
(This follows because
\[ \frac{dP}{dt} = \lim_{\delta t \to 0} \frac{P(t + \delta t) - P(t)}{\delta t} \]
is the definition of the derivative of P .)
This is a differential equation because it describes dP /dt rather than the
eventual object of our interest (which is P itself). The purpose of this unit
is to enable you to solve a wide variety of such equations.
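The limiting argument above can be checked numerically. The sketch below repeatedly applies the accumulation rule δP ≃ (b − c)P δt over shorter and shorter time intervals and compares the result with P0 e^((b − c)t), the exact solution of dP/dt = (b − c)P with P(0) = P0. The function name and the values of b, c, P0 and the time span are illustrative assumptions, not data from the course.

```python
import math

def accumulate(p0, b, c, t_end, n_steps):
    """Apply deltaP = (b - c) * P * deltat repeatedly over [0, t_end]."""
    dt = t_end / n_steps
    p = p0
    for _ in range(n_steps):
        births = b * p * dt   # input, as in Equation (1.1)
        deaths = c * p * dt   # output, as in Equation (1.2)
        p += births - deaths  # accumulation = input - output
    return p

# Assumed illustrative values (rates per unit time, population in individuals).
P0, B, C, T = 100_000.0, 0.03, 0.01, 10.0
exact = P0 * math.exp((B - C) * T)

for n in (10, 100, 1000, 10000):
    approx = accumulate(P0, B, C, T, n)
    print(f"dt = {T / n:<8g} P(T) = {approx:12.2f}   gap to exact = {exact - approx:8.2f}")
```

The gap shrinks as δt is reduced, which is the numerical counterpart of letting δt tend to zero.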
Of course, we can simplify the above equation slightly by using the proportionate
growth rate r, which is the difference between the proportionate
birth and death rates: r = b − c. Then our model becomes
\[ \frac{dP}{dt} = rP. \]
For very simple population models, r is taken to be a constant. As we shall
see, this leads to a prediction of exponential growth (or, if r < 0, decay)
in population size with time, as illustrated in Figure 1.1. This may be a
very good approximation for certain populations, but it cannot be sustained
indefinitely if r > 0.
[Figure 1.1: graph of P against t for r > 0]
In practice, both the proportionate birth rate and the proportionate death
rate will vary, and so therefore will the proportionate growth rate. It turns
out to be convenient to model these changes as being dependent on the
population size, so that the proportionate growth rate r becomes a function
of P . The justification for this is as follows. When the population is low, one
may assume that there is potential for it to grow (assuming a reasonable
environment). The proportionate growth rate should therefore be high.
However, as the population grows, there will be competition for resources.
Thus the proportionate growth rate will decline, and in this way unlimited
(exponential) growth does not occur.
A particularly useful model arises from taking r(P ) to be a decreasing linear
function of P . We shall write this as
\[ r(P) = k\left(1 - \frac{P}{M}\right), \tag{1.3} \]
where k and M are positive constants. Looking at this formula, you can
see that the proportionate growth rate r decreases linearly with P , from the
value k (when P = 0) to 0 (when P = M ).
(Margin note: You will see later why this particular form is chosen.)
Using this expression for r, the above differential equation satisfied by P
becomes
\[ \frac{dP}{dt} = kP\left(1 - \frac{P}{M}\right). \tag{1.4} \]
This is well known to biologists as the logistic equation — we shall consider
it further in Section 2, and see how to solve it in Section 3. For now, we have
achieved our objective of showing that differential equations arise naturally
in modelling the real world.
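To see how Equations (1.3) and (1.4) behave, the following sketch evaluates the proportionate growth rate and the right-hand side of the logistic equation at a few population sizes. The function names and the values chosen for k and M are assumptions made purely for illustration.

```python
def growth_rate(p, k, m):
    """Proportionate growth rate r(P) = k(1 - P/M), Equation (1.3)."""
    return k * (1 - p / m)

def logistic_rhs(p, k, m):
    """Right-hand side of the logistic equation (1.4): kP(1 - P/M)."""
    return p * growth_rate(p, k, m)

# Assumed illustrative constants: k per unit time, M in individuals.
K, M = 0.5, 1000.0

for p in (0.0, 250.0, 500.0, 1000.0, 1250.0):
    print(f"P = {p:7.1f}   r(P) = {growth_rate(p, K, M):6.3f}   dP/dt = {logistic_rhs(p, K, M):8.2f}")
```

The output shows r(P) falling from k at P = 0 to zero at P = M, and dP/dt becoming negative once P exceeds M.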
\[ \frac{dy}{dx} = f(x, y). \]
(Margin note: Equation (1.4) is of this form, with $f(t, P) = kP\left(1 - \frac{P}{M}\right)$.)
The right-hand side here stands for an expression involving both, either or
neither of the variables x and y, but no other variables and no derivatives.
According to the definition above, a function has to satisfy a differential
equation in order to be regarded as a solution of it. The differential equation
is satisfied by the function provided that when the function is substituted
into the equation, the left- and right-hand sides of the equation give an
identical expression.
(Margin note: This substitution includes the requirement that the function
should be differentiable (i.e. that it should have a derivative) at all points
where it is claimed to be a solution.)
You are asked to verify in the next exercise that several functions are solutions
of corresponding first-order differential equations. Later in the unit,
you will see how all of these differential equations may be solved; but even
when a solution has been deduced, it is worth checking in the manner of this
exercise.
Exercise 1.2
(Margin note: Remember that an asterisk denotes an exercise (or part of one)
that is considered particularly important.)
*(a) $y = 2e^x - (x^2 + 2x + 2)$; $\dfrac{dy}{dx} = y + x^2$.
(b) $y = \tfrac12 x^2 + \tfrac32$; $\dfrac{dy}{dx} = x$.
*(c) $u = 2e^{x^2/2}$; $u' = xu$.
(d) $y = \sqrt{\dfrac{27 - x^2}{3}}$ $(-3\sqrt{3} < x < 3\sqrt{3})$; $\dfrac{dy}{dx} = -\dfrac{x}{3y}$ $(y \neq 0)$.
*(e) $y = t + e^{-t}$; $\dot{y} = -y + t + 1$.
*(f) $y = t + Ce^{-t}$; $\dot{y} = -y + t + 1$. (Here C is an arbitrary constant.)
(Margin note: The restriction $y \neq 0$ placed on the differential equation in
part (d) is necessary to ensure that $-x/3y$ is well defined.)
In the last two parts of Exercise 1.2 you were asked to verify that
\[ y = t + e^{-t} \quad \text{and} \quad y = t + Ce^{-t} \]
are solutions of the differential equation $\dot{y} = -y + t + 1$, where in the second
case C is an arbitrary constant. In saying that C is arbitrary, we mean that
it can assume any real value. Whatever number is chosen for C, the corresponding
expression for y(t) is always a solution of the differential equation.
The particular function $y = t + e^{-t}$ is just one example of such a solution,
obtained by choosing C = 1.
This demonstrates that solutions of a differential equation can exist in pro-
fusion; as a result, we need terms to distinguish between the totality of all
these solutions for a given equation and the individual solutions that are
completely specified.
Definitions
(a) The general solution of a differential equation is the collection
of all possible solutions of that equation.
(b) A particular solution of a differential equation is a single solu-
tion of the equation, and consists of a solution function whose rule
contains no arbitrary constant.
As you have seen, there are many solutions of a differential equation. How-
ever, a particular solution of the equation, representing a definite relation-
ship between the variables involved, is often what is needed. This is achieved
by using a further piece of information in addition to the differential equa-
tion. Often the extra information takes the form of a pair of values for the
independent and dependent variables.
For example, in the case of a population model, it would be natural to
specify the starting population, P0 say, and to start measuring time from
t = 0. We could then write
P = P0 when t = 0, or, equivalently, P (0) = P0 .
A requirement of this type is called an initial condition.
Definitions
(a) An initial condition associated with the differential equation
\[ \frac{dy}{dx} = f(x, y) \]
specifies that the dependent variable y takes some value y0 when
the independent variable x takes some value x0 . This is written
either as
y = y0 when x = x0
or as
y(x0 ) = y0 .
The numbers x0 and y0 are referred to as initial values.
(b) The combination of a first-order differential equation and an initial
condition is called an initial-value problem.
The word ‘initial’ in these definitions arises from those (frequent) cases in
which the independent variable represents time. In such cases, the differen-
tial equation describes how the system being modelled behaves once started,
while the initial condition specifies the configuration in which the system is
started off. In fact, if the initial condition is y(x0 ) = y0 , then we are often
interested in solving the corresponding initial-value problem for x > x0 .
(Margin note: If x represents time, then x > x0 is ‘the future’ after the
system has been started off.)
Example
Using the result given in Exercise 1.3(b), solve the initial-value problem
\[ \frac{dy}{dx} = x + y, \quad y(0) = 1. \]
Solution
From Exercise 1.3(b), on replacing the variables t, u by x, y, respectively,
the general solution of the differential equation here is
\[ y = Ce^x - x - 1. \]
The initial condition says that y = 1 when x = 0, and on feeding these values
into the general solution we find that
\[ 1 = Ce^0 - 0 - 1 = C - 1. \]
Hence C = 2, and the particular solution of the differential equation that
solves the initial-value problem is
\[ y = 2e^x - x - 1. \]
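A solution obtained this way can be checked by substitution, as recommended earlier in this section. The sketch below performs that check numerically rather than symbolically: it estimates dy/dx with a central difference (an assumed stand-in for exact differentiation) and compares it with x + y at several points. The function names and sample points are illustrative choices.

```python
import math

def y(x):
    """Candidate solution y = 2e^x - x - 1."""
    return 2 * math.exp(x) - x - 1

def f(x, y_val):
    """Right-hand side of the differential equation dy/dx = x + y."""
    return x + y_val

def derivative(func, x, eps=1e-6):
    """Central-difference estimate of func'(x)."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)

def max_residual(xs):
    """Largest mismatch between dy/dx and f(x, y) over the sample points."""
    return max(abs(derivative(y, x) - f(x, y(x))) for x in xs)

print("y(0) =", y(0.0))  # initial condition requires y(0) = 1
print("max residual =", max_residual([-1.0, 0.0, 0.5, 1.0, 2.0]))
```

A residual close to zero (limited only by the finite-difference estimate) is consistent with the function satisfying the initial-value problem.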
Finally in this subsection, note that one needs to keep an eye on the domain
of the function defining the differential equation. ‘Gaps’ in the domain
usually show up as some form of restriction on the nature of a solution
curve. For example, consider the differential equation
\[ \frac{dy}{dx} = \frac{1}{x}. \tag{1.5} \]
It turns out that there are two distinct families of solutions of this equation,
given by y = ln x + C (if x > 0) and y = ln(−x) + C (if x < 0). These two
families of solutions are illustrated in Figure 1.2. Notice that the right-hand
side of Equation (1.5) is not defined at x = 0, and that there is no solution
that crosses the y-axis.
(Margin note: Since |x| = −x if x < 0, you can see that this agrees with
what we know from Unit 1, namely that $\int \frac{1}{x}\,dx = \ln |x|$.)
[Figure 1.2: the two families of solution curves of Equation (1.5), plotted for x from −10 to 10]
This unit deals with numerical and analytic (symbolic) methods of solving
differential equations. However, before we can discuss numerical methods,
we need to know something about the way that errors and accuracy are
described: this is the topic of the next subsection.
In Exercise 1.5(b), you saw that an absolute error of less than 0.0005 in
the value of π results in an absolute error of the order of $1.8 \times 10^{11}$ in
the calculated value of f (π), for $f(x) = e^{10x}$. For this function f , errors
are severely magnified! However, the situation is not quite so bad as this
statement might suggest. The calculated value of f (π) is not completely
unreliable: it is accurate to two significant figures. The value of f (π) is
itself very large ($4.4 \times 10^{13}$), so an error of $1.8 \times 10^{11}$ is not quite so serious
as it sounds. We often want a measure of ‘error’ that takes into account the
size of the error relative to the size of the number being calculated.
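The contrast between the two measures of error can be reproduced with a few lines of code. The sketch below assumes that π is approximated by 3.142 (an assumption: this value is consistent with an input error below 0.0005, but the approximation actually used in Exercise 1.5 is not shown here), and computes the error divided by the size of the exact value.

```python
import math

def f(x):
    """The function f(x) = e^(10x) from the text."""
    return math.exp(10 * x)

PI_APPROX = 3.142  # assumed approximation to pi

exact = f(math.pi)
approx = f(PI_APPROX)

input_error = abs(PI_APPROX - math.pi)  # absolute error in the input
abs_error = abs(approx - exact)         # absolute error in f(pi)
rel_error = abs_error / abs(exact)      # error relative to the size of f(pi)

print(f"input absolute error : {input_error:.6f}")
print(f"f(pi)                : {exact:.3e}")
print(f"output absolute error: {abs_error:.3e}")
print(f"output relative error: {rel_error:.4f}")
```

Although the absolute error in the output is enormous, the error relative to the size of f(π) is well under one per cent.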
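\[ \frac{dy}{dx} = \frac{1}{\sqrt{1 - x^2}}. \]
(b) Using the result of part (a), find the solution of the initial-value problem
\[ \frac{dy}{dx} = \frac{1}{\sqrt{1 - x^2}}, \quad y\left(\tfrac12\right) = \frac{\pi}{2}. \]

\[ x = \tan(t + C) \quad \left(-\frac{\pi}{2} < t + C < \frac{\pi}{2}\right) \]
is a solution of the differential equation
\[ \dot{x} = 1 + x^2. \]
(b) Using the result of part (a), find the solution of the initial-value problem
\[ \dot{x} = 1 + x^2, \quad x\left(\frac{\pi}{4}\right) = 1. \]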
Definition
A direction field associates a unique direction to each point within
a specified region of the (x, y)-plane. The direction corresponding to
the point (x, y) may be thought of as the slope of a short line segment
through the point.
In particular, the direction field for the differential equation
\[ \frac{dy}{dx} = f(x, y) \]
associates the direction f (x, y) with the point (x, y).
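As a rough illustration of this definition, the sketch below tabulates the slope f(x, y) on a small grid and prints a crude text picture of the direction field. The sample right-hand side f(x, y) = x + y is borrowed from examples elsewhere in this unit; the grid, the function names and the threshold used to pick each symbol are arbitrary assumptions.

```python
def direction_field(f, xs, ys):
    """Return a dictionary mapping each grid point (x, y) to the slope f(x, y)."""
    return {(x, y): f(x, y) for x in xs for y in ys}

def glyph(slope, flat=0.25):
    """Pick a character suggesting a rising, falling or nearly flat segment."""
    if slope > flat:
        return "/"
    if slope < -flat:
        return "\\"
    return "-"

def f(x, y):
    """Sample right-hand side, dy/dx = x + y."""
    return x + y

xs = [-2, -1, 0, 1, 2]
ys = [2, 1, 0, -1, -2]  # listed top to bottom so the picture is upright
field = direction_field(f, xs, ys)

for y in ys:
    print(" ".join(glyph(field[(x, y)]) for x in xs))
```

A plotting library would give a far better picture; the point here is only that the field is a rule assigning one slope to each point.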
[Figure: direction field diagram for Equation (2.3), with x from −2 to 2]
From this diagram, we can gain a good qualitative impression of how the
graphs of particular solutions of Equation (2.3) behave. The aim is mentally
to sketch curves on the diagram in such a way that the tangents to the curves
are always parallel to the local slopes of the direction field. For example,
starting from the point (−1, 0.5) (that is, taking the initial condition to be
y(−1) = 0.5), we expect the solution graph initially to fall as we move to
the right. The magnitude of the negative slope decreases, however, and
eventually reaches zero, after which the slope becomes positive and then
increases. On this basis, we could sketch the graph of the corresponding
particular solution and obtain something like the curve shown in Figure 2.3.
[Figure 2.3: sketch of the particular solution through the point (−1, 0.5), drawn on the direction field, with x from −2 to 2]
[Figure: direction field diagram, with t from 0 to 10 on the horizontal axis and P from −500 to 1500 on the vertical axis]
(b) What does your answer to part (a) tell you about the predicted be-
haviour of a population whose size P (t) at time t is modelled by this
logistic equation?
[Figure: the first step of the construction, from P0 at (x0, y0) to P1 at (x1, Y1), along a line of slope f(x0, y0) through a horizontal distance h]
The idea is that the point P1 , whose coordinates have been denoted by
(x1 , Y1 ), provides an approximate value Y1 of the solution function y(x) at
x = x1 . Now, unless the solution function follows a straight line as x moves
from x0 to x1 , Y1 is unlikely to give the exact value of y(x1 ). However, the
hope is that, because we headed off from x0 along the correct slope, as given
by the direction field, Y1 will be reasonably close to the exact value. Before
worrying about accuracy, let us continue with the construction of the points
in our sequence.
(Margin note: The reason for using Y1 here, rather than y1 , will be
explained shortly.)
The next thing that we need to do, before proceeding to the second step
in the construction process, is determine formulae for x1 and Y1 in terms
of x0 , y0 , h and f (x0 , y0 ). By the construction described, as the point P1
is reached from P0 by taking a step to the right of horizontal length h, we
have
\[ x_1 = x_0 + h. \tag{2.5} \]
We can express Y1 in terms of other quantities by equating two expressions
for the slope of the line segment P0 P1 ,
\[ \frac{Y_1 - y_0}{h} = f(x_0, y_0), \]
and then rearranging to give
\[ Y_1 = y_0 + h f(x_0, y_0). \tag{2.6} \]
This completes the first step, and we now take a second step to the right.
The direction of the second step is along the line with slope defined by the
direction field at the point P1 , namely f (x1 , Y1 ). The second step moves us
from P1 through a further horizontal distance h to the right, to the point
labelled P2 , as illustrated in Figure 2.6. This point provides an approximate
value Y2 of the solution function y(x) at x = x2 .
[Figure 2.6: the second step of the construction, from P1 at (x1, Y1) to P2 at (x2, Y2), along a line of slope f(x1, Y1) through a horizontal distance h]
As in the first step, we need now to express the coordinates (x2 , Y2 ) of P2
in terms of x1 , Y1 , h and f (x1 , Y1 ). We have
\[ x_2 = x_1 + h \tag{2.7} \]
and (equating two expressions for the slope of the line segment P1 P2 )
\[ \frac{Y_2 - Y_1}{h} = f(x_1, Y_1), \]
which can be rearranged to give
\[ Y_2 = Y_1 + h f(x_1, Y_1). \tag{2.8} \]
Having carried out two steps of the process, it is possible to see that the
same procedure can be applied to construct any number of further steps,
and we next generalize to a description of what happens at the (i + 1)th
step, where i represents any non-negative integer.
Suppose that after i steps we have reached the point Pi , with coordinates
(xi , Yi ). For the (i + 1)th step, we move away from Pi along the line with
slope f (xi , Yi ), as defined by the direction field at Pi . After moving through a
horizontal distance h to the right, we reach the point Pi+1 , whose coordinates
are denoted by (xi+1 , Yi+1 ), as illustrated in Figure 2.7. The point Pi+1
provides an approximate value Yi+1 of the solution function y(x) at x = xi+1 .
[Figure 2.7: the (i + 1)th step of the construction, from Pi at (xi, Yi) to Pi+1 at (xi+1, Yi+1), along a line of slope f(xi, Yi) through a horizontal distance h]
[Figure: the points P0, P1, ..., P6 at x0, x1, ..., x6, shown with the graph of the exact solution y = y(x)]
Nevertheless, the formulae (2.9) and (2.10) provide a method for finding approximate
solutions to the initial-value problem (2.4), in terms of numerical
estimates Y1 , Y2 , Y3 , . . . at the respective domain values x1 , x2 , x3 , . . . . This
is called Euler’s method, and is summarized below.
(Margin note: The accuracy of such approximate solutions, and ways of
improving accuracy, will be considered shortly.)
Example
Consider the initial-value problem
\[ \frac{dy}{dx} = x + y, \quad y(0) = 1. \]
Use Euler’s method, with step size h = 0.2, to obtain an approximation to
y(1).
Solution
We have x0 = 0, Y0 = y0 = 1, and f (xi , Yi ) = xi + Yi . The step size is given
as h = 0.2. Equation (2.9) with i = 0 gives
x1 = x0 + h = 0 + 0.2 = 0.2,
and Equation (2.10) with i = 0 gives
Y1 = Y0 + hf (x0 , Y0 ) = 1 + 0.2 × (0 + 1) = 1.2.
For the second step, we have (from Equation (2.9) with i = 1)
x2 = x1 + h = 0.2 + 0.2 = 0.4,
and (from Equation (2.10) with i = 1)
Y2 = Y1 + hf (x1 , Y1 ) = 1.2 + 0.2 × (0.2 + 1.2) = 1.48.
If more than a couple of steps of such a calculation have to be computed by
hand, then it is a good idea to lay out the calculation as a table. In this
case, by continuing as above and putting i in turn equal to 2, 3 and 4, we
obtain Table 2.1.
So, at x = 1, Euler’s method with step size h = 0.2 gives the approximation
\[ y(1) \simeq 2.976\,64. \]
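The hand calculation above is easy to mechanize. The following is a minimal sketch of Euler's method, using the two update rules applied in the worked solution, $x_{i+1} = x_i + h$ and $Y_{i+1} = Y_i + h f(x_i, Y_i)$. The function name `euler` and its argument order are illustrative assumptions, not part of the course software.

```python
def euler(f, x0, y0, h, n_steps):
    """Euler's method for dy/dx = f(x, y), y(x0) = y0.

    Returns the lists [x_0, ..., x_n] and [Y_0, ..., Y_n].
    """
    xs, ys = [x0], [y0]
    for i in range(n_steps):
        # Y_{i+1} = Y_i + h f(x_i, Y_i), using the current point
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        # x_{i+1} = x_i + h, computed from x0 to limit accumulated rounding
        xs.append(x0 + (i + 1) * h)
    return xs, ys

# The initial-value problem of the example: dy/dx = x + y, y(0) = 1, h = 0.2.
xs, ys = euler(lambda x, y: x + y, 0.0, 1.0, 0.2, 5)

print(" i    x_i      Y_i")
for i, (x, y) in enumerate(zip(xs, ys)):
    print(f"{i:2d}   {x:.1f}   {y:.5f}")
```

The rows for i = 1 and i = 2 agree with the values Y1 = 1.2 and Y2 = 1.48 found by hand, and the last row gives the approximation to y(1).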
[Figure: Euler estimates at x = 0.4 obtained with step sizes h = 0.4, h = 0.2 and h = 0.1, compared with the exact solution y = y(x)]
In fact, it can be shown that the accuracy of Euler’s method does indeed
usually improve when we take a smaller step size.
To demonstrate this, consider the initial-value problem from Exercise 2.2.
This has the exact solution $y = e^x$ (as you can verify), and its value at
x = 1 is y(1) = e = 2.718 282, to six decimal places. In Exercise 2.2 you
showed that with a step size h = 0.2, Euler’s method gives the approximation
2.488 32 for y(1). Table 2.2 shows the corresponding results (to six decimal
places) obtained when we apply Euler’s method to this same initial-value
problem but with successively smaller step sizes h.
(Margin note: In Exercise 2.2, where h = 0.2, the value of y(1) was
approximated by Y5 . From the column for ‘number of steps’ in Table 2.2, you can
see that y(1) is approximated by: Y10 when h = 0.1;

Table 2.2
h      Approximation to y(1)   Absolute error   Number of steps
0.1    2.593 742               0.124 539        10
As expected, the absolute errors in the third column of the table become
progressively smaller as h is reduced.
Looking more carefully at these absolute errors, we notice that they seem
to tend towards a sequence in which each number is a tenth of the previous
one. Since each value of h in the table is a tenth of the previous one, this
suggests that:
absolute error is approximately proportional to step size h.
This turns out to be a general property of Euler’s method, for sufficiently
small values of the step size. So, not only do we know that accuracy can be
improved by decreasing the step size h, but this general property also tells us
that, by making h small enough, the absolute error in an approximation can
be made as small as we please.
(Margin note: You will see this property stated formally in Unit 26,
where you will also see how the property can be used to estimate the size of
absolute errors.)
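The proportionality between absolute error and step size can be observed directly. The sketch below assumes that the initial-value problem of Exercise 2.2 is dy/dx = y, y(0) = 1; this is an inference from the stated exact solution $y = e^x$ and from the quoted approximations, not something stated outright in the text above. The helper names are illustrative.

```python
import math

def euler_final(f, x0, y0, h, n_steps):
    """Return only the last Euler estimate Y_n for dy/dx = f(x, y), y(x0) = y0."""
    x, y = x0, y0
    for i in range(n_steps):
        y += h * f(x, y)
        x = x0 + (i + 1) * h
    return y

def error_table(step_counts):
    """Rows of (h, approximation to y(1), absolute error, number of steps)."""
    rows = []
    for n in step_counts:
        h = 1.0 / n
        approx = euler_final(lambda x, y: y, 0.0, 1.0, h, n)  # assumed problem
        rows.append((h, approx, abs(math.e - approx), n))
    return rows

rows = error_table([10, 100, 1000, 10000])
print("h         approximation   abs. error   steps   error / h")
for h, approx, err, n in rows:
    print(f"{h:<8g}  {approx:.6f}        {err:.6f}     {n:<6d}  {err / h:.4f}")
```

The final column settles towards a constant as h decreases, which is what "approximately proportional to h" means in practice.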
A few words of caution are necessary at this point. Although the absolute
error can be made as small as we please by making the step size h sufficiently
small, this is valid only if the arithmetic is performed using sufficient decimal
places. Where a calculator or computer is involved, the number of decimal
places that can be used is limited, and as a result rounding errors may
be introduced into the calculations. After a certain point, any increase in
accuracy brought about by reducing the size of h may be swamped by these
rounding errors.
Moreover, rounding errors are not the only problem. Before concluding that
h should always be chosen to be very small, we must also consider the cost
of this additional accuracy. Now, by cost is meant the effort involved, which
can be measured in a variety of ways; commonly for iterative methods (such
as Euler’s method) it is measured by the number of steps taken. In general
for numerical methods, the greater the accuracy required, the greater the
cost. To illustrate this, look back at Table 2.2. The last column of the
table shows how the number of steps required for the calculation goes up
in inverse proportion to the step size: to move from x = 0 to x = 1, it
takes 10 steps with step size h = 0.1, 100 steps with step size h = 0.01,
and so on. Since, for sufficiently small h, the error in Euler’s method is
approximately proportional to the step size, it follows that for this method
a ten-fold improvement in accuracy is paid for by a ten-fold increase in the
number of steps required.
(Margin note: In general, to move from a to b (where b > a) with step
size h takes (b − a)/h steps.)
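The trade-off between accuracy and cost can be turned into a rough planning calculation. The sketch below scales one known pair (step size, absolute error), taken from the h = 0.1 row of Table 2.2, under the assumption that error is exactly proportional to h. Since the text says only that this holds approximately and for sufficiently small h, the results are estimates, and both helper functions are hypothetical names introduced for illustration.

```python
def steps_needed(a, b, h):
    """Number of steps of size h needed to move from a to b, i.e. (b - a)/h."""
    return round((b - a) / h)

def step_size_for(target_error, known_h, known_error):
    """Estimate the step size giving target_error, assuming error is proportional to h."""
    return known_h * target_error / known_error

KNOWN_H, KNOWN_ERROR = 0.1, 0.124539  # from Table 2.2

for target in (1e-2, 1e-3, 1e-4):
    h = step_size_for(target, KNOWN_H, KNOWN_ERROR)
    print(f"target error {target:g}: h = {h:.3g}, about {steps_needed(0.0, 1.0, h)} steps")
```

Each ten-fold reduction in the target error multiplies the estimated number of steps by about ten.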
So, for Euler’s method and similar methods, the choice of step size has to be
based on a compromise between the two opposing requirements of accuracy
and cost. Methods for choosing the step size are discussed in Unit 26, which
also introduces other numerical methods for solving initial-value problems
that are considerably more efficient than Euler’s method. In fact, Euler’s
method is not suitable for high-accuracy work. Its virtue lies rather in its
simplicity and its clear illustration of the basic principles of how differential
equations may be solved numerically.
(Margin note: Greater efficiency means that the same or better numerical
accuracy is achieved with fewer numerical computations.)
In any practical case, calculations of the type described in this subsection
are ideally suited to being performed on a computer, as you will see in the
next subsection.
Use Euler’s method to obtain approximations to y(1) for the initial-value
problem
\[ \frac{dy}{dx} = x + y, \quad y(0) = 0, \]
using step sizes h = 1, 0.5, 0.2, 0.1, 0.01, 0.001, 0.0001, in turn. In each case,
plot the graph of the solution on an appropriate direction field diagram, and
observe how each graph compares with the previous one.
Euler’s method is to be used to estimate the value of the function y(x) at
x = 0.1, 0.2, . . . , 1 for the initial-value problem
\[ \frac{dy}{dx} = x + y, \quad y(0) = 0. \]
(a) Use the step sizes h = 0.1, 0.01, 0.001, 0.0001, in turn. Compare the
results in each case with the exact solution y = ex − x − 1, and comment
on how the size of the absolute error varies with h.
(b) Compare your estimates for the step sizes h = 0.1 and h = 0.01. Then
compare your estimates for all four step sizes. What can you conclude?
[Figure: direction field diagram, with x from −2 to 2]
(c) On the basis of the direction field, what can be said about the graphs
of solutions of the differential equation?
(d) Write down the formulae required in order to apply Euler’s method to
the initial-value problem
\[ \frac{dy}{dx} = y + x^2, \quad y(-1) = -0.2, \]
This means that the general solution of Equation (3.2) can be written down
directly as an indefinite integral; and, if the integration can be performed,
then the equation is solved.
Once the general solution has been found, it is possible to single out a
particular solution by specifying a value for the constant C. As before, this
value may be found by applying an initial condition.
Example
(a) Find the general solution of the differential equation
\[ \frac{dy}{dx} = e^{-3x}. \]
(b) Find the particular solution of this differential equation that satisfies
the initial condition $y(0) = \tfrac53$.
Solution
(a) On applying direct integration, we obtain the general solution
\[ y = \int e^{-3x}\,dx = -\tfrac13 e^{-3x} + C, \]
where C is an arbitrary constant.
(b) Applying the initial condition $y(0) = \tfrac53$ gives
\[ \tfrac53 = -\tfrac13 e^{0} + C, \]
so C = 2. The required particular solution is therefore
\[ y = -\tfrac13 e^{-3x} + 2. \]
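When the integral in direct integration cannot be found by hand, it can still be evaluated numerically. As a hedged illustration, the sketch below recomputes the particular solution of this example as $y(x) = y(0) + \int_0^x e^{-3s}\,ds$ using composite Simpson's rule (a standard quadrature rule chosen here for convenience; it is not a method introduced in this unit) and compares the result with the formula found above. All function names are illustrative.

```python
import math

def simpson(g, a, b, n=100):
    """Composite Simpson's rule for the integral of g from a to b (n even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def y_numeric(x, x0=0.0, y0=5 / 3):
    """y(x) = y0 + integral from x0 to x of e^(-3s) ds."""
    return y0 + simpson(lambda s: math.exp(-3 * s), x0, x)

def y_formula(x):
    """Particular solution found by hand: y = -(1/3)e^(-3x) + 2."""
    return -math.exp(-3 * x) / 3 + 2

for x in (0.0, 0.5, 1.0, 2.0):
    print(f"x = {x:.1f}   numeric = {y_numeric(x):.8f}   formula = {y_formula(x):.8f}")
```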
Procedure 3.1 uses x for the independent variable and y for the dependent
where other symbols are used for the variables. But remember that the
alone. Thus direct integration can be applied, for example, to the differential
equation
\[ \frac{dx}{dt} = \cos t, \]
to give the general solution
\[ x = \int \cos t\,dt = \sin t + C, \]
The answer to Exercise 3.2(b) can be generalized to any differential equation
of the form
\[ \frac{dy}{dx} = k\,\frac{f'(x)}{f(x)} \quad (f(x) \neq 0), \]
where k is a constant, to give the general solution
\[ y = k \ln |f(x)| + C, \]
where C is an arbitrary constant.
(Margin note: This is a simple extension of the result from Unit 1 that
$\int \frac{f'(x)}{f(x)}\,dx = \ln |f(x)| + C$, for $f(x) \neq 0$.)
\[ \frac{1}{1 + y^2}\frac{dy}{dx} = x, \]
and then integrate both sides with respect to x, which gives
\[ \int \frac{1}{1 + y^2}\frac{dy}{dx}\,dx = \int x\,dx. \tag{3.5} \]
Applying the rule for integration by substitution (in Leibniz notation) to
the left-hand side, we obtain
\[ \int \frac{1}{1 + y^2}\frac{dy}{dx}\,dx = \int \frac{1}{1 + y^2}\,dy, \]
so Equation (3.5) becomes
\[ \int \frac{1}{1 + y^2}\,dy = \int x\,dx. \]
(Margin note: See Section 6 of Unit 1.)
On performing the two integrations, we obtain
\[ \arctan y = \tfrac12 x^2 + C, \tag{3.6} \]
(Margin note: See the table of standard integrals in the Handbook.)
where C is an arbitrary constant. Making y the subject of the equation, we
obtain the solution expression
\[ y = \tan\left(\tfrac12 x^2 + C\right). \]
(Margin note: Note that one arbitrary constant suffices.)
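As with any solution found by manipulation, this one can be checked by substitution. The sketch below does so numerically for the equation $dy/dx = x(1 + y^2)$ (the form obtained by multiplying the separated equation through by $1 + y^2$), using a central-difference estimate of the derivative. The sample values of C and x are arbitrary assumptions, chosen so that $\tfrac12 x^2 + C$ stays between −π/2 and π/2.

```python
import math

def y(x, c):
    """Explicit solution y = tan(x^2/2 + C)."""
    return math.tan(0.5 * x * x + c)

def rhs(x, y_val):
    """Right-hand side x(1 + y^2) of the differential equation."""
    return x * (1 + y_val ** 2)

def residual(x, c, eps=1e-6):
    """|dy/dx - x(1 + y^2)|, with dy/dx estimated by a central difference."""
    dydx = (y(x + eps, c) - y(x - eps, c)) / (2 * eps)
    return abs(dydx - rhs(x, y(x, c)))

for c in (-0.5, 0.0, 0.3):
    worst = max(residual(x, c) for x in (0.0, 0.5, 1.0))
    print(f"C = {c:+.1f}   largest residual = {worst:.2e}")

# The implicit form (3.6), arctan y = x^2/2 + C, can be checked as well.
print(math.atan(y(1.0, 0.3)), 0.5 * 1.0 ** 2 + 0.3)
```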
The approach just demonstrated applies more widely. In principle, it works
for any differential equation of the form
\[ \frac{dy}{dx} = g(x)h(y). \tag{3.3} \]
On dividing this equation through by h(y) (for all values of y other than
those where h(y) = 0), we obtain
\[ \frac{1}{h(y)}\frac{dy}{dx} = g(x). \]
Integration with respect to x on both sides gives
\[ \int \frac{1}{h(y)}\frac{dy}{dx}\,dx = \int g(x)\,dx, \]
and, on applying the rule for integration by substitution to the left-hand
side, this becomes
\[ \int \frac{1}{h(y)}\,dy = \int g(x)\,dx. \tag{3.7} \]
(Margin note: This is the form that you need to remember! Note that
you can obtain it ‘informally’ by dividing Equation (3.3) by h(y), ‘multiplying
through by dx’, and then adding the two integral signs.)
If the two integrals can be evaluated at this stage, then we reach an equation
that relates x and y and features an arbitrary constant. This equation is
the general solution of the differential equation (for values of y other than
those where h(y) = 0); but usually y will not be the subject of this equation.
It is a form of the general solution called an implicit (general) solution of
the differential equation. (An example of an implicit solution is provided
by Equation (3.6).) Usually, the final aim is to make y the subject of the
equation, if possible — that is, to manipulate the equation into the form
y = function of x.
This is called the explicit (general) solution of the differential equation.
In either case (implicit or explicit), a particular solution may be obtained
from the general solution as before, by applying an initial condition.
The method just described for solving differential equations of the form (3.3)
is called the method of separation of variables since, in Equation (3.7), we
have separated the variables to either side of the equation, with only the
dependent variable appearing on the left and only the independent variable
on the right. The method is summarized below.
the form
\[ \frac{dy}{dx} = g(x)h(y). \tag{3.3} \]
(a) Divide both sides by h(y) (where $h(y) \neq 0$), and integrate both
sides with respect to x, to obtain
\[ \int \frac{1}{h(y)}\,dy = \int g(x)\,dx. \tag{3.7} \]
(b) If possible, perform the integrations, to obtain an implicit form of
The separation of variables method is useful, but there are some difficulties
with it. First, it may not be possible to perform the necessary integrations.
Second, the general solution obtained is restricted to those values of y such
that h(y) ≠ 0. Third, it may not be possible to perform the necessary
manipulations to obtain an explicit solution.
Of these difficulties, the first can be overcome by use of a numerical method,
such as Euler’s method. The second will be discussed shortly. The third
will usually also need numerical techniques.
It is necessary to be careful about the domain or image set of the solution
obtained, as the following example illustrates.
Example
(a) Find the general solution of the differential equation
\[ \frac{dy}{dx} = -\frac{x}{3y} \quad (y > 0). \]
(b) Find the particular solution that satisfies the initial condition y(0) = 3.
Solution
(a) The equation is of the form
\[ \frac{dy}{dx} = g(x)h(y), \]
and separating the variables gives
\[ \int 3y\,dy = \int -x\,dx. \]
(Margin note: With practice, you will be able to move directly to this
stage, as shown in Procedure 3.2.)
Evaluating the integrals gives
\[ \tfrac32 y^2 = -\tfrac12 x^2 + B, \]
where B is an arbitrary constant. This is an implicit form of the general
solution.
On solving for y (and noting the condition y > 0 given above, which
determines the sign of the square root), we obtain the explicit general
solution
\[ y = \sqrt{\tfrac13 \left(2B - x^2\right)}. \]
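This can be simplified slightly by writing $C$ in place of $2B$, where $C$
is an arbitrary constant, to give $y = \sqrt{\tfrac{1}{3}(C - x^2)}$. This
formula for $y$ represents a real quantity greater than zero only when
$C - x^2 > 0$, that is, when $C > 0$ and $-\sqrt{C} < x < \sqrt{C}$.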
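The worked solution to part (b) is not reproduced here, so the following is only a sketch of how it might be checked. It assumes the explicit general solution $y = \sqrt{(C - x^2)/3}$ from part (a), fixes $C$ from the initial condition $y(0) = 3$, and then tests the differential equation numerically with central differences. The function names and the sample points are illustrative choices.

```python
import math

def y_general(x, C):
    """Explicit general solution y = sqrt((C - x^2)/3), valid only where C - x^2 > 0."""
    return math.sqrt((C - x * x) / 3.0)

# The initial condition y(0) = 3 gives 3 = sqrt(C/3), so C = 3 * 3^2 = 27.
C = 3.0 * 3.0 ** 2

def residual(x, C, h=1e-6):
    """dy/dx (central difference) minus the right-hand side -x/(3y)."""
    dydx = (y_general(x + h, C) - y_general(x - h, C)) / (2 * h)
    return dydx - (-x / (3.0 * y_general(x, C)))

# Sample points inside the interval -sqrt(27) < x < sqrt(27).
max_residual = max(abs(residual(x, C)) for x in (-4.0, -1.0, 0.0, 2.0, 5.0))
print(C, y_general(0.0, C), max_residual)
```

Under these assumptions the particular solution is $y = \sqrt{(27 - x^2)/3}$, defined for $-\sqrt{27} < x < \sqrt{27}$.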
\[ \frac{dm}{dt} = -\lambda m \quad (m > 0), \]
where the decay constant $\lambda$ is a positive constant characteristic of the
uranium isotope. (This model can be applied to other radioactive substances by
selecting the appropriate value of the parameter $\lambda$.)
(a) Find the general solution of this differential equation.
(b) Find the particular solution for which the initial amount of uranium
The condition m > 0 in Exercise 3.3 arose from the modelling context. This
condition enabled us to find the general solution without needing to worry
about dividing by zero at Step (a) of the separation of variables method
(and hence without needing to restrict the image set further). Suppose we
were to forget the modelling context — that is, suppose we were to remove
the restriction m > 0. How does this affect the solution process? And how
do we cope with the case where m = 0? These questions are answered in the
following example where, to emphasize the absence of the previous modelling
context, the variables used are x and y.
Example
Find the general solution of the differential equation
\[ \frac{dy}{dx} = -\lambda y, \]
where $\lambda$ is a non-zero constant.
Solution
To apply the separation of variables method, we need to exclude the cases
where $y = 0$. So, for $y \neq 0$, on dividing through by $y$, integrating with
respect to $x$, and using the rule for integration by substitution on the
left-hand side, we obtain
\[ \int \frac{1}{y}\,dy = \int (-\lambda)\,dx. \tag{3.8} \]
Integrating, we obtain
\[ \ln|y| = -\lambda x + B, \]
where $B$ is an arbitrary constant. (You saw in Unit 1 that
$\int \frac{1}{y}\,dy = \ln|y|$ for $y \neq 0$.) Taking exponentials gives
\[ |y| = e^{-\lambda x + B} \]
or, removing the modulus sign,
\[ y = \pm e^{-\lambda x + B} = \pm e^{B} e^{-\lambda x} = Ce^{-\lambda x}, \]
where $C = \pm e^{B}$ is a non-zero but otherwise arbitrary constant.
This is not quite the general solution, as we have to consider what happens
when y = 0. Now, looking at the above solution, it is natural to ask what
happens when C = 0. This gives the zero function, y = 0 for all x, and
inspection of the differential equation shows that this is a particular solution.
So we now have the general solution
\[ y = Ce^{-\lambda x}, \]
where C is an arbitrary constant. (Positive C corresponds to y > 0, negative
C to y < 0, and C = 0 to the particular solution y = 0.)
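The claim that every value of $C$, including $C = 0$, gives a solution can be checked numerically. The sketch below is only an illustration: the value $\lambda = 0.7$, the sample points and the helper names are assumptions chosen for the check, not part of the text.

```python
import math

def y(x, C, lam):
    """General solution y = C * exp(-lam * x) of dy/dx = -lam * y."""
    return C * math.exp(-lam * x)

def residual(x, C, lam, h=1e-6):
    """dy/dx (central difference) plus lam * y; zero for an exact solution."""
    dydx = (y(x + h, C, lam) - y(x - h, C, lam)) / (2 * h)
    return dydx + lam * y(x, C, lam)

lam = 0.7  # illustrative non-zero constant
# Positive, zero and negative C correspond to y > 0, y = 0 and y < 0.
worst = max(abs(residual(x, C, lam))
            for C in (2.0, 0.0, -3.5)
            for x in (-1.0, 0.0, 2.0))

def constant_from_initial_condition(x0, y0, lam):
    """Value of C for which y(x0) = y0, obtained by rearranging y0 = C * exp(-lam * x0)."""
    return y0 * math.exp(lam * x0)

C0 = constant_from_initial_condition(1.0, 5.0, lam)
print(worst, C0, y(1.0, C0, lam))
```

The last three lines show how an initial condition picks out a particular solution from the general one.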
4 Solving linear differential equations
This section presents one final method of analytic solution for first-order
differential equations, but you should be aware that the idea generalizes to
higher-order differential equations and is important from a theoretical point
of view. (Linear second-order differential equations are considered in Unit 3.)
Definitions
(a) A first-order differential equation for $y = y(x)$ is linear if it can be
written in the form
\[ \frac{dy}{dx} + g(x)y = h(x), \tag{4.1} \]
where $g(x)$ and $h(x)$ are given functions.
(b) A linear first-order differential equation is said to be homogeneous if
$h(x) = 0$ for all $x$, and inhomogeneous or non-homogeneous otherwise.
(Equation (4.1) can be written in the general form $dy/dx = f(x, y)$ that we
have been using, by putting $f(x, y) = -g(x)y + h(x)$.)
\[ \frac{dy}{dx} = xy^2 \]
is not, due to the presence of the non-linear term $y^2$.
is linear.
(a) $\dfrac{dy}{dx} + x^3 y = x^5$
(b) $\dfrac{dy}{dx} = x \sin x$
(c) $\dfrac{dz}{dt} = -3z^{1/2}$
(d) $\dot{y} + y^2 = t$
(e) $x\dfrac{dy}{dx} + y = y^2$
(f) $(1 + x^2)\dfrac{dy}{dx} + 2xy = 3x^2$
Theorem
If the functions $g(x)$ and $h(x)$ are continuous throughout an interval
$(a, b)$ and $x_0$ belongs to this interval, then the initial-value problem
\[ \frac{dy}{dx} + g(x)y = h(x), \quad y(x_0) = y_0, \]
has a unique solution throughout the interval $(a, b)$. (This includes the
possibility that either $a = -\infty$ or $b = \infty$, so the interval might be
all of the real line.)
This is a very powerful result, since it means that once you have found a
solution in a particular interval, that solution will be the only one.
There is a particularly useful technique for solving linear differential
equations, to which we turn next.
This solution was arrived at by noting that the left-hand side of
Equation (4.2) is of the form
\[ p\frac{dy}{dx} + \frac{dp}{dx}y, \tag{4.4} \]
where $p = 1 + x^2$, and that this form can be re-expressed, using the Product
Rule, as
\[ \frac{d}{dx}(py). \]
Linear differential equations need not come in this convenient form. For
example, the left-hand side of the equation
\[ \frac{dy}{dx} + \left(\frac{2x}{1 + x^2}\right)y = \frac{3x^2}{1 + x^2} \tag{4.5} \]
is not of the form (4.4). However, Equation (4.2) can be obtained from
Equation (4.5) on multiplying through by $p = 1 + x^2$. For this reason,
$p = 1 + x^2$ may be called an integrating factor for Equation (4.5): it is
the factor by which Equation (4.5) needs to be multiplied in order that the
resulting differential equation has a left-hand side of the form (4.4), enabling
direct integration to be performed.
This leaves the question of how such an integrating factor can be found,
starting from Equation (4.5). The answer comes from writing down the two
properties that such a function p = p(x) must satisfy, as follows.
• Multiplying Equation (4.5) by $p$ gives, on the left-hand side,
\[ p\frac{dy}{dx} + p\left(\frac{2x}{1 + x^2}\right)y. \]
• The left-hand side must be of the form
\[ p\frac{dy}{dx} + \frac{dp}{dx}y. \tag{4.4} \]
Comparison of these two expressions shows that $p$ must itself be a particular
solution of the differential equation
\[ \frac{dp}{dx} = \left(\frac{2x}{1 + x^2}\right)p. \tag{4.6} \]
This is a homogeneous linear first-order differential equation, and we can
solve it by separation of variables. Indeed, following Procedure 3.2, the
equation becomes (for $p \neq 0$)
\[ \int \frac{dp}{p} = \int \frac{2x}{1 + x^2}\,dx. \]
Performing the left-hand integral gives
\[ \ln|p| = \int \frac{2x}{1 + x^2}\,dx, \]
so
\[ |p| = \exp\left(\int \frac{2x}{1 + x^2}\,dx\right). \tag{4.7} \]
= D(1 + x2 ),
where D (= exp(A)) is a positive but otherwise arbitrary constant. Hence The case D = 0 corresponds
to the solution p = 0 of
p = ±D(1 + x ), 2
Equation (4.6), but this
solution is not of interest.
which, by redefining D, can be written as
p = D(1 + x2 ),
where D is now a non-zero but otherwise arbitrary constant.
Thus an integrating factor for Equation (4.5) is $p(x) = D(1 + x^2)$.
Multiplying through the equation by this factor yields
\[ D(1 + x^2)\frac{dy}{dx} + 2Dxy = 3Dx^2, \]
and now you can see that (since $D \neq 0$) the arbitrary constant $D$ can be
chosen without affecting the applicability of the form (4.4). Therefore we
choose the integrating factor to have the simplest possible form: in this
case we obtain $p(x) = 1 + x^2$.
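The defining property of the integrating factor, namely that $p$ satisfies Equation (4.6) whatever non-zero value is taken for $D$, can be checked numerically. The following is a minimal sketch under that reading; the function names, the sample values of $D$ and the sample points are illustrative assumptions.

```python
def g(x):
    """Coefficient of y in Equation (4.5): 2x / (1 + x^2)."""
    return 2.0 * x / (1.0 + x * x)

def p(x, D=1.0):
    """Candidate integrating factor p(x) = D * (1 + x^2)."""
    return D * (1.0 + x * x)

def defect(x, D, h=1e-5):
    """dp/dx (central difference) minus g(x) * p(x); zero when Equation (4.6) holds."""
    dpdx = (p(x + h, D) - p(x - h, D)) / (2 * h)
    return dpdx - g(x) * p(x, D)

# Try several non-zero values of D, including a negative one.
worst = max(abs(defect(x, D))
            for D in (1.0, -2.5, 7.0)
            for x in (-3.0, 0.0, 0.5, 4.0))
print(worst)
```

Since every non-zero $D$ passes the same check, nothing is lost by taking $D = 1$.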
As you have seen, this leads to the solution of Equation (4.5) by direct
integration, and the formula for this integrating factor is given by
Equation (4.7) as
\[ p = \exp\left(\int \frac{2x}{1 + x^2}\,dx\right). \tag{4.8} \]
This approach generalizes to any linear first-order differential equation,
provided that the integrals involved can be evaluated. For an equation written
in the form
\[ \frac{dy}{dx} + g(x)y = h(x), \tag{4.1} \]
the function $g(x)$ takes the place of $2x/(1 + x^2)$ in Equation (4.5). To find
an integrating factor $p = p(x)$ for Equation (4.1), the argument proceeds as
above, with $2x/(1 + x^2)$ replaced by $g(x)$ at each step. This leads to the
generalized form of Equation (4.8), namely
\[ p = \exp\left(\int g(x)\,dx\right), \tag{4.9} \]
which defines the integrating factor for Equation (4.1). (Remember that
calculation of the integrating factor does not require the inclusion of a
constant of integration.)
When Equation (4.1) is multiplied through by the integrating factor, the
resulting differential equation is
\[ p(x)\frac{dy}{dx} + p(x)g(x)y = p(x)h(x), \tag{4.10} \]
the left-hand side of which, by the definition of $p$, is of the form (4.4); so
Equation (4.10) can be re-expressed, using the Product Rule, as
\[ \frac{d}{dx}\bigl(p(x)y\bigr) = p(x)h(x). \tag{4.11} \]
The definition of $p$ ensures that the left-hand side of Equation (4.10) is of
the form (4.4), since
\[ \frac{dp}{dx} = \frac{d}{dx}\left[\exp\left(\int g(x)\,dx\right)\right]
   = \exp\left(\int g(x)\,dx\right) g(x) = p(x)g(x). \]
Direct integration can then be used on Equation (4.11) to try to find the
general solution.
This integrating factor method is summarized below.
(a) Determine the integrating factor
\[ p = \exp\left(\int g(x)\,dx\right). \]
(b) Multiply the differential equation (4.1) through by $p(x)$, to recast it
as
\[ p(x)\frac{dy}{dx} + p(x)g(x)y = p(x)h(x). \]
(You can, if you wish, check that you have found $p$ correctly by checking that
$p(x)\,dy/dx + p(x)g(x)y = \frac{d}{dx}\bigl(p(x)y\bigr)$, i.e. by checking that
$dp/dx = p(x)g(x)$.)
(c) Rewrite the differential equation as
\[ \frac{d}{dx}\bigl(p(x)y\bigr) = p(x)h(x). \]
(d) Integrate this last equation, to obtain
\[ p(x)y = \int p(x)h(x)\,dx. \]
(e) Divide through by $p(x)$, to obtain the general solution in explicit
form.
(It is a good idea to check, by substitution into the original equation, that
the function obtained is indeed a solution.)
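When the integrals cannot be found in closed form, the same steps can be carried out numerically. The sketch below is one possible arrangement, not a definitive implementation: it assumes an initial condition $y(x_0) = y_0$, fixes the constant in the integrating factor so that $p(x_0) = 1$, and uses Simpson's rule for both integrals. The names `integrate` and `solve_linear` and the test equation $dy/dx + y = x$, $y(0) = 1$ (exact solution $y = x - 1 + 2e^{-x}$) are illustrative and do not come from the text.

```python
import math

def integrate(f, a, b, n=200):
    """Composite Simpson's rule for the integral of f from a to b (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def solve_linear(g, h, x0, y0, x):
    """Value at x of the solution of dy/dx + g(x) y = h(x) with y(x0) = y0."""
    def p(s):
        # Integrating factor, with the constant chosen so that p(x0) = 1.
        return math.exp(integrate(g, x0, s))
    # Integrating d/dx (p y) = p h from x0 to x gives p(x) y(x) - y0 = integral of p h.
    return (y0 + integrate(lambda s: p(s) * h(s), x0, x)) / p(x)

# Illustrative test (an assumption): dy/dx + y = x, y(0) = 1.
approx = solve_linear(lambda x: 1.0, lambda x: x, 0.0, 1.0, 1.0)
exact = 1.0 - 1.0 + 2.0 * math.exp(-1.0)
print(approx, exact)
```

Recomputing $p$ inside the outer integral is wasteful but keeps the sketch short; tabulating $p$ once would be the natural refinement.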
Example
Use the integrating factor method to find the general solution of each of the
following differential equations.
(a) $\dfrac{dy}{dx} = x - \dfrac{2xy}{x^2 + 1}$
(b) $\dfrac{dy}{dx} = \dfrac{y - 1}{x}$ $(x > 0)$
(c) $\dfrac{dy}{dx} = \dfrac{2y}{1 + x^2}$
(The first example cannot be solved by separation of variables. The latter two
can, as you saw in Exercise 3.4. You can compare these answers with those
obtained earlier.)
Solution
(a) On rearranging the differential equation as
\[ \frac{dy}{dx} + \frac{2xy}{x^2 + 1} = x, \]
we see that it is in the form of Equation (4.1) with $g(x) = 2x/(x^2 + 1)$
and $h(x) = x$. The integrating factor is
\[ p = \exp\left(\int \frac{2x}{x^2 + 1}\,dx\right). \]
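Evaluating the integral (and omitting the constant of integration) gives
$p = \exp(\ln(x^2 + 1)) = x^2 + 1$. Multiplying through by this integrating
factor gives
\[ (x^2 + 1)\frac{dy}{dx} + 2xy = x(x^2 + 1), \]
and the differential equation becomes
\[ \frac{d}{dx}\bigl((x^2 + 1)y\bigr) = x(x^2 + 1). \]
Integrating both sides gives
\[ (x^2 + 1)y = \int x(x^2 + 1)\,dx = \int (x^3 + x)\,dx
   = \tfrac{1}{4}x^4 + \tfrac{1}{2}x^2 + C, \]
where $C$ is an arbitrary constant. Finally, to obtain an explicit solution
we divide by $x^2 + 1$ to obtain
\[ y = \frac{x^4 + 2x^2 + 4C}{4(x^2 + 1)}. \]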
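(b) On rearranging the differential equation as
\[ \frac{dy}{dx} - \frac{1}{x}y = -\frac{1}{x}, \]
we see that it is in the form of Equation (4.1) with $g(x) = h(x) = -1/x$.
The integrating factor is
\[ p = \exp\left(\int \left(-\frac{1}{x}\right)dx\right)
   = \exp(-\ln x) \quad (\text{since } x > 0)
   = \exp\left(\ln\left(\frac{1}{x}\right)\right) = \frac{1}{x}. \]
(Recall that $a \ln x = \ln(x^a)$ and hence, in particular,
$-\ln x = \ln(x^{-1}) = \ln(1/x)$. Checking, we see that
$dp/dx = -1/x^2 = g(x)p(x)$.)
Multiplying through the equation by $p(x) = 1/x$ gives
\[ \frac{1}{x}\frac{dy}{dx} - \frac{1}{x^2}y = -\frac{1}{x^2}, \]
and the differential equation becomes
\[ \frac{d}{dx}\left(\frac{1}{x}y\right) = -\frac{1}{x^2}. \]
Integration then gives
\[ \frac{y}{x} = \int \left(-\frac{1}{x^2}\right)dx = \frac{1}{x} + C, \]
and multiplying through by $x$ gives
\[ y = 1 + Cx, \]
where $C$ is an arbitrary constant.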
(c) In order to put the given differential equation into the form (4.1), we
need to bring the term in $y$ to the left-hand side to obtain
\[ \frac{dy}{dx} - \frac{2}{1 + x^2}y = 0. \tag{4.12} \]
Hence, in this case, we have $g(x) = -2/(1 + x^2)$ and $h(x) = 0$. (The
equation is homogeneous.) The integrating factor is
\[ p = \exp\left(\int \left(-\frac{2}{1 + x^2}\right)dx\right)
   = \exp(-2 \arctan x) = e^{-2 \arctan x}. \]
Checking, we see that
\[ \frac{dp}{dx} = \frac{-2e^{-2 \arctan x}}{1 + x^2}
   = e^{-2 \arctan x}\left(-\frac{2}{1 + x^2}\right) = p(x)g(x). \]
Multiplying through by the integrating factor gives
\[ e^{-2 \arctan x}\frac{dy}{dx} - \frac{2y}{1 + x^2}e^{-2 \arctan x} = 0. \]
Thus the differential equation can be rewritten as
\[ \frac{d}{dx}\left(e^{-2 \arctan x}y\right) = 0. \]
It follows, on integrating, that
\[ e^{-2 \arctan x}y = C, \quad \text{or, equivalently,} \quad y = Ce^{2 \arctan x}, \]
where $C$ is an arbitrary constant. This is the general solution.
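The advice to check a solution by substitution can also be followed numerically. The sketch below tests the three general solutions of this example against their original differential equations, using central differences for $dy/dx$. The value $C = 1.5$, the sample points and the helper name `check` are illustrative assumptions.

```python
import math

def check(rhs, sol, xs, h=1e-6):
    """Largest |dy/dx - rhs(x, y)| over xs, with dy/dx from central differences."""
    worst = 0.0
    for x in xs:
        dydx = (sol(x + h) - sol(x - h)) / (2 * h)
        worst = max(worst, abs(dydx - rhs(x, sol(x))))
    return worst

C = 1.5               # illustrative value of the arbitrary constant
xs = [0.5, 1.0, 2.0]  # kept positive because part (b) requires x > 0

# (a) dy/dx = x - 2xy/(x^2 + 1),  y = (x^4 + 2x^2 + 4C) / (4(x^2 + 1))
err_a = check(lambda x, y: x - 2 * x * y / (x * x + 1),
              lambda x: (x ** 4 + 2 * x * x + 4 * C) / (4 * (x * x + 1)), xs)
# (b) dy/dx = (y - 1)/x,  y = 1 + Cx
err_b = check(lambda x, y: (y - 1) / x, lambda x: 1 + C * x, xs)
# (c) dy/dx = 2y/(1 + x^2),  y = C exp(2 arctan x)
err_c = check(lambda x, y: 2 * y / (1 + x * x),
              lambda x: C * math.exp(2 * math.atan(x)), xs)
print(err_a, err_b, err_c)
```

A residual close to zero at a handful of points does not prove the solution is correct, but a large one is a quick sign of an algebra slip.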
(c) $\dfrac{dv}{du} + 5v = 0$
(d) $(1 + x^2)\dfrac{dy}{dx} + 2xy = 1 + x^2$
Exercise
Solve each of the following initial-value problems.
(a) $\dot{y} + y = t + 1$, $y(1) = 0$.
(b) $e^{3t}\dot{y} = 1 - e^{3t}y$, $y(0) = 3$.
(The differential equation in part (a) is equivalent to that considered in
parts (e) and (f) of Exercise 1.2.)
(b) $\dfrac{dv}{dt} + 4v = 3\cos 2t$
(Hint:
\[ \int e^{at}\cos bt\,dt = \frac{e^{at}}{a^2 + b^2}(a\cos bt + b\sin bt) + C, \]
where $C$ is an arbitrary constant.)
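The standard integral quoted in the hint can be confirmed by differentiating its right-hand side numerically and comparing with the integrand $e^{at}\cos bt$. This is a sketch only; the function names and the sample values of $a$, $b$ and $t$ are illustrative assumptions.

```python
import math

def antiderivative(t, a, b):
    """Right-hand side of the quoted formula, with the constant C taken as 0."""
    return math.exp(a * t) * (a * math.cos(b * t) + b * math.sin(b * t)) / (a * a + b * b)

def integrand(t, a, b):
    """The function being integrated: exp(a t) cos(b t)."""
    return math.exp(a * t) * math.cos(b * t)

def max_mismatch(a, b, ts, h=1e-6):
    """Largest gap between the central-difference derivative of the formula and the integrand."""
    return max(abs((antiderivative(t + h, a, b) - antiderivative(t - h, a, b)) / (2 * h)
                   - integrand(t, a, b))
               for t in ts)

worst = max(max_mismatch(a, b, [-1.0, 0.0, 0.5, 1.0])
            for a, b in [(1.0, 1.0), (4.0, 2.0), (-0.5, 3.0)])
print(worst)
```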