Morris W. Hirsch and Stephen Smale, Differential Equations, Dynamical Systems, and Linear Algebra. Academic Press, 1974.

Differential Equations, Dynamical Systems, and Linear Algebra

MORRIS W. HIRSCH AND STEPHEN SMALE
University of California, Berkeley
This book is about dynamical aspects of ordinary differential equations and the relations between dynamical systems and certain fields outside pure mathematics. A prominent role is played by the structure theory of linear operators on finite-dimensional vector spaces; we have included a self-contained treatment of that subject.
The background material needed to understand this book is differential calculus
of several variables. For example, Serge Lang’s Calculus of Several Variables, up to
the chapter on integration, contains more than is needed to understand much of our
text. On the other hand, after Chapter 7 we do use several results from elementary
analysis such as theorems on uniform convergence; these are stated but not proved.
This mathematics is contained in Lang's Analysis I, for instance. Our treatment of
linear algebra is systematic and self-contained, although the most elementary parts
have the character of a review; in any case, Lang’s Calculus of Several Variables
develops this elementary linear algebra at a leisurely pace.
While this book can be used as early as the sophomore year by students with a
strong first year of calculus, it is oriented mainly toward upper division mathematics
and science students. It can also be used for a graduate course, especially if the later
chapters are emphasized.
It has been said that the subject of ordinary differential equations is a collection
of tricks and hints for finding solutions, and that it is important because it can
solve problems in physics, engineering, etc. Our view is that the subject can be
developed with considerable unity and coherence; we have attempted such a de-
velopment with this book. The importance of ordinary differential equations
vis à vis other areas of science lies in its power to motivate, unify, and give force to
those areas. Our four chapters on “applications” have been written to do exactly
this, and not merely to provide examples. Moreover, an understanding of the ways
that differential equations relate to other subjects is a primary source of insight
and inspiration for the student and working mathematician alike.
Our goal in this book is to develop nonlinear ordinary differential equations in open subsets of real Cartesian space, R^n, in such a way that the extension to
manifolds is simple and natural. We treat chiefly autonomous systems, emphasizing
qualitative behavior of solution curves. The related themes of stability and physical
significance pervade much of the material. Many topics have been omitted, such as
Laplace transforms, series solutions, Sturm theory, and special functions.
The level of rigor is high, and almost everything is proved. More important,
however, is that ad hoc methods have been rejected. We have tried to develop proofs that add insight to the theorems and that are important methods in their own right.
We have avoided the introduction of manifolds in order to make the book more
widely readable; but the main ideas can easily be transferred to dynamical systems
on manifolds.
The first six chapters, especially Chapters 3-6, give a rather intensive and com-
plete study of linear differential equations with constant coefficients. This subject
matter can almost be identified with linear algebra; hence those chapters constitute
a short course in linear algebra as well. The algebraic emphasis is on eigenvectors and how to find them. We go far beyond this, however, to the "semisimple + nilpotent" decomposition of an arbitrary operator, and then on to the Jordan form and its real
analogue. Those proofs that are far removed from our use of the theorems are
relegated to appendices. While complex spaces are used freely, our primary concern
is to obtain results for real spaces. This point of view, so important for differential
equations, is not commonly found in textbooks on linear algebra or on differential
equations.
Our approach to linear algebra is a fairly intrinsic one; we avoid coordinates
where feasible, while not hesitating to use them as a tool for computations or proofs.
On the other hand, instead of developing abstract vector spaces, we work with
linear subspaces of R^n or C^n, a small concession which perhaps makes the abstraction
more digestible.
Using our algebraic theory, we give explicit methods of writing down solutions
to arbitrary constant coefficient linear differential equations. Examples are included.
In particular, the S + N decomposition is used to compute the exponential of an
arbitrary square matrix.
Chapter 2 is independent from the others and includes an elementary account
of the Keplerian planetary orbits.
The fundamental theorems on existence, uniqueness, and continuity of solutions
of ordinary differential equations are developed in Chapters 8 and 16. Chapter 8 is
restricted to the autonomous case, in line with our basic orientation toward dynami-
cal systems.
Chapters 10, 12, and 14 are devoted to systematic introductions to mathematical
models of electrical circuits, population theory, and classical mechanics, respectively.
The Brayton-Moser circuit theory is presented as a special case of the more general
theory recently developed on manifolds. The Volterra-Lotka equations of competing
species are analyzed, along with some generalizations. In mechanics we develop
the Hamiltonian formalism for conservative systems whose configuration space is
an open subset of a vector space.
The remaining five chapters contain a substantial introduction to the phase
portrait analysis of nonlinear autonomous systems. They include a discussion of
"generic" properties of linear flows, Liapunov and structural stability, Poincaré-Bendixson theory, periodic attractors, and perturbations. We conclude with an
Afterword which points the way toward manifolds.
The following remarks should help the reader decide on which chapters to read
and in what order.
Chapters 1 and 2 are elementary, but they present many ideas that recur through-
out the book.
Chapters 3-7 form a sequence that develops linear theory rather thoroughly.
Chapters 3, 4, and 5 make a good introduction to linear operators and linear differential equations. The canonical form theory of Chapter 6 is the basis of the stability results proved in Chapters 7, 9, and 13; however, this heavy algebra might be postponed at a first exposure to this material and the results taken on faith.
The existence, uniqueness, and continuity of solutions, proved in Chapter 8, are
used (often implicitly) throughout the rest of the book. Depending on the reader’s
taste, proofs could be omitted.
A reader interested in the nonlinear material, who has some background in linear theory, might start with the stability theory of Chapter 9. Chapters 12 (ecology), 13 (periodic attractors), and 16 (perturbations) depend strongly on Chapter 9, while the section on dual vector spaces and gradients will make Chapters 10 (electrical circuits) and 14 (mechanics) easier to understand.
Chapter 12 also depends on Chapter 11 (Poincaré-Bendixson); and the material in Section 2 of Chapter 11 on local sections is used again in Chapters 13 and 16.
Chapter 15 (nonautonomous equations) is a continuation of Chapter 8 and is
used in Chapters 11, 13, and 16; however it can be omitted at a first reading.
The logical dependence of the later chapters is summarized in the following chart:
The book owes much to many people. We only mention four of them here. Ikuko
Workman and Ruth Suzuki did an excellent job of typing the manuscript. Dick
Palais made a number of useful comments. Special thanks are due to Jacob Palis,
who read the manuscript thoroughly, found many minor errors, and suggested
several substantial improvements. Professor Hirsch is grateful to the Miller Institute
for its support during part of the writing of the book.
Chapter 1
First Examples
The purpose of this short chapter is to develop some simple examples of differen-
tial equations. This development motivates the linear algebra treated subsequently
and moreover gives in an elementary context some of the basic ideas of ordinary
differential equations. Later these ideas will be put into a more systematic exposi-
tion. In particular, the examples themselves are special cases of the class of differen-
tial equations considered in Chapter 3. We regard this chapter as important since
some of the most basic ideas of differential equations are seen in simple form.
§1. The Simplest Examples

The differential equation

(1)  dx/dt = ax

is the simplest differential equation. It is also one of the most important. First, what does it mean? Here x = x(t) is an unknown real-valued function of a real variable t and dx/dt is its derivative (we will also use x' or x'(t) for this derivative). The equation tells us that for every value of t the equality

x'(t) = ax(t)

is true. Here a denotes a constant.
The solutions to (1) are obtained from calculus: if K is any constant (real number), the function f(t) = Ke^{at} is a solution since

f'(t) = aKe^{at} = af(t).
FIG. A
Moreover, there are no other solutions. To see this, let u(t) be any solution and compute the derivative of u(t)e^{-at}:

d/dt (u(t)e^{-at}) = u'(t)e^{-at} + u(t)(-ae^{-at})
= au(t)e^{-at} - au(t)e^{-at} = 0.

Therefore u(t)e^{-at} is a constant K, so u(t) = Ke^{at}. This proves our assertion.
The constant K appearing in the solution is completely determined if the value u_0 of the solution at a single point t_0 is specified. Suppose that a function x(t) satisfying (1) is required such that x(t_0) = u_0; then K must satisfy Ke^{at_0} = u_0. Thus equation (1) has a unique solution satisfying a specified initial condition x(t_0) = u_0. For simplicity, we often take t_0 = 0; then K = u_0. There is no loss of generality in taking t_0 = 0, for if u(t) is a solution with u(0) = u_0, then the function v(t) = u(t - t_0) is a solution with v(t_0) = u_0.
It is common to restate (1) in the form of an initial value problem:

(2)  x' = ax,  x(0) = K.

A solution x(t) to (2) must not only satisfy the first condition (1), but must also take on the prescribed initial value K at t = 0. We have proved that the initial value problem (2) has a unique solution.
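As a numerical aside (not in the text), the initial value problem (2) can be checked against its closed-form solution Ke^{at}; the values of a and K below are arbitrary samples.

```python
import math

# Compare a crude Euler approximation of x' = a*x, x(0) = K with the
# closed-form solution K*e^{a*t}; a and K are arbitrary sample values.
def euler(a, K, t_end, steps):
    dt = t_end / steps
    x = K
    for _ in range(steps):
        x += dt * a * x          # one Euler step of x' = a*x
    return x

a, K, t_end = -0.7, 2.0, 1.0
approx = euler(a, K, t_end, steps=100_000)
exact = K * math.exp(a * t_end)
print(approx, exact)             # the two values agree closely
```

With 100,000 steps the Euler error is of order 10^{-6}, consistent with the uniqueness statement above.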
The constant a in the equation x' = ax can be considered as a parameter. If a changes, the equation changes and so do the solutions. Can we describe qualitatively the way the solutions change?
The sign of a is crucial here:

if a > 0, lim_{t→∞} Ke^{at} equals ∞ when K > 0, and equals -∞ when K < 0;
if a = 0, Ke^{at} = constant;
if a < 0, lim_{t→∞} Ke^{at} = 0.
FIG. B (Ax = (2, -1/2))
Initial conditions are of the form x(t_0) = u where u = (u_1, u_2) is a given point of R^2. Geometrically, this means that when t = t_0 the curve is required to pass through the given point u.
The map (that is, function) A: R^2 → R^2 (or x → Ax) can be considered a vector field on R^2. This means that to each point x in the plane we assign the vector Ax. For purposes of visualization, we picture Ax as a vector "based at x"; that is, we assign to x the directed line segment from x to x + Ax. For example, if a_1 = 2, a_2 = -3, and x = (1, 1), then at (1, 1) we picture an arrow pointing from (1, 1) to (1, 1) + (2, -3) = (3, -2) (Fig. B). Thus if Ax = (2x_1, -x_2/2), we attach to each point x in the plane an arrow with tail at x and head at x + Ax and obtain the picture in Fig. C.
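The arrow computation in this example is easy to check mechanically; the sketch below hard-codes the diagonal coefficients a_1 = 2, a_2 = -3 used in the text.

```python
# Check the arrow construction for the vector field x -> Ax of the uncoupled
# system x1' = 2*x1, x2' = -3*x2 (A is the diagonal matrix with entries 2, -3).
def A(x):
    return (2 * x[0], -3 * x[1])

x = (1, 1)
Ax = A(x)
head = (x[0] + Ax[0], x[1] + Ax[1])   # the arrow at x runs from x to x + Ax
print(Ax, head)                        # (2, -3) (3, -2)
```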
Solving the differential equation (3) or (3') with initial conditions (u_1, u_2) at t = 0 means finding in the plane a curve x(t) that satisfies (3') and passes through the point u = (u_1, u_2) when t = 0. A few solution curves are sketched in Fig. D. The trivial solution (x_1(t), x_2(t)) = (0, 0) is also considered a "curve."
The family of all solution curves as subsets of R^2 is called the "phase portrait" of equation (3) (or (3')).
The one-dimensional equation x' = ax can also be interpreted geometrically: the phase portrait is as in Fig. E, which should be compared with Fig. A. It is clearer to picture the graphs of (1) and the solution curves for (3) since two-dimensional pictures are better than either one- or three-dimensional pictures. The graphs of
FIG. E (a > 0; a < 0)
φ_t(u + v) = φ_t(u) + φ_t(v) and φ_t(λu) = λφ_t(u), for all vectors u, v, and all real numbers λ.
As time proceeds, every point of the plane moves simultaneously along the trajectory passing through it. In this way the collection of maps φ_t: R^2 → R^2, t ∈ R, is a one-parameter family of transformations. This family is called the flow or dynamical system on R^2 determined by the vector field x → Ax, which in turn is equivalent to the system (3).
The dynamical system on the real line R corresponding to equation (1) is particularly easy to describe: if a < 0, all points move toward 0 as time goes to ∞; if a > 0, all points except 0 move away from 0 toward ±∞; if a = 0, all points stand still.
We have started from a differential equation and have obtained the dynamical system φ_t. This process is established through the fundamental theorem of ordinary differential equations as we shall see in Chapter 8.
Later we shall also reverse this process: starting from a dynamical system φ_t, a differential equation will be obtained (simply by differentiating φ_t(u) with respect to t).
It is seldom that differential equations are given in the simple uncoupled form (3). Consider, for example, the system:

(4)  x_1' = 5x_1 + 3x_2,
     x_2' = -6x_1 - 4x_2.

Introduce new coordinates (y_1, y_2) by

y_1 = 2x_1 + x_2,
y_2 = x_1 + x_2;

equivalently,

x_1 = y_1 - y_2,
x_2 = -y_1 + 2y_2.
FIG. F
Then

y_1' = 2x_1' + x_2',
y_2' = x_1' + x_2'.

By substitution,

y_1' = 2(5x_1 + 3x_2) + (-6x_1 - 4x_2) = 4x_1 + 2x_2,
y_2' = (5x_1 + 3x_2) + (-6x_1 - 4x_2) = -x_1 - x_2.
Another substitution yields

y_1' = 4(y_1 - y_2) + 2(-y_1 + 2y_2),
y_2' = -(y_1 - y_2) - (-y_1 + 2y_2),

or

(5)  y_1' = 2y_1,
     y_2' = -y_2.
The last equations are in diagonal form and we have already solved this class of systems. The solution (y_1(t), y_2(t)) such that (y_1(0), y_2(0)) = (u_1, u_2) is

y_1(t) = e^{2t}u_1,
y_2(t) = e^{-t}u_2.
The phase portrait of this system (5) is evidently given in Fig. D. We can find the phase portrait of the original system (4) by simply plotting the new coordinate axes y_1 = 0, y_2 = 0 in the (x_1, x_2)-plane and sketching the trajectories y(t) in these coordinates. Thus y_1 = 0 is the line L_1: x_2 = -2x_1 and y_2 = 0 is the line L_2: x_2 = -x_1.
Thus we have the phase portrait of (4) as in Fig. F, which should be compared with Fig. D.
Formulas for the solution to (4) can be obtained by substitution as follows. Let (u_1, u_2) be the initial values (x_1(0), x_2(0)) of a solution (x_1(t), x_2(t)) to (4). Corresponding to (u_1, u_2) is the initial value (v_1, v_2) of a solution (y_1(t), y_2(t)) to (5), where

v_1 = 2u_1 + u_2,
v_2 = u_1 + u_2.

Thus

x_1(t) = e^{2t}(2u_1 + u_2) - e^{-t}(u_1 + u_2)

and

x_2(t) = -e^{2t}(2u_1 + u_2) + 2e^{-t}(u_1 + u_2).
If we compare these formulas to Fig. F, we see that the diagram instantly gives us the qualitative picture of the solutions, while the formulas convey little geometric information. In fact, for many purposes, it is better to forget the original equation (4) and the corresponding solutions and work entirely with the "diagonalized" equations (5), their solutions and phase portrait.
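As a numerical aside, the diagonalization above can be verified: the coefficient matrix of (4) has eigenvalues 2 and -1, and the closed-form solution obtained through the coordinates y_1, y_2 satisfies x' = Ax. A small sketch (pure Python, centered finite differences; the sample point is arbitrary):

```python
import math

# System (4): x' = A x with A = [[5, 3], [-6, -4]].
a11, a12, a21, a22 = 5.0, 3.0, -6.0, -4.0

# eigenvalues from the characteristic polynomial of a 2 x 2 matrix
tr, det = a11 + a22, a11 * a22 - a12 * a21
disc = math.sqrt(tr * tr - 4 * det)
lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2
print(lam1, lam2)  # 2.0 -1.0, matching the diagonal system (5)

# closed-form solution with x(0) = (u1, u2), through y1 = 2*x1 + x2, y2 = x1 + x2
def x(t, u1, u2):
    y1 = math.exp(2 * t) * (2 * u1 + u2)
    y2 = math.exp(-t) * (u1 + u2)
    return (y1 - y2, -y1 + 2 * y2)

# verify x'(t) = A x(t) at a sample point by a centered difference
t, h, u1, u2 = 0.3, 1e-6, 1.0, 0.0
p, q = x(t - h, u1, u2), x(t + h, u1, u2)
d1, d2 = (q[0] - p[0]) / (2 * h), (q[1] - p[1]) / (2 * h)
v1, v2 = x(t, u1, u2)
err1 = abs(d1 - (a11 * v1 + a12 * v2))
err2 = abs(d2 - (a21 * v1 + a22 * v2))
print(err1, err2)  # both tiny
```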
PROBLEMS
like. Then sketch the phase portrait of the corresponding differential equation
x’ = Ax, guessing where necessary.
(1)  dx_1/dt = a_{11}x_1 + ... + a_{1n}x_n,
     ...
     dx_n/dt = a_{n1}x_1 + ... + a_{nn}x_n.

Here the a_{ij} (i = 1, ..., n; j = 1, ..., n) are n^2 constants (real numbers), while each x_i denotes an unknown real-valued function of a real variable t. Thus (4) of Section 1 is an example of the system (1) with n = 2, a_{11} = 5, a_{12} = 3, a_{21} = -6, a_{22} = -4.
At this point we are not trying to solve (1); rather, we want to place it in a geometrical and algebraic setting in order to understand better what a solution means. At the most primitive level, a solution of (1) is a set of n differentiable real-valued functions x_i(t) that make (1) true.
In order to reach a more conceptual understanding of (1) we introduce real n-dimensional Cartesian space R^n. This is simply the set of all n-tuples of real numbers. An element of R^n is a "point" x = (x_1, ..., x_n); the number x_i is the ith coordinate of the point x. Points x, y in R^n are added coordinatewise:

x + y = (x_1, ..., x_n) + (y_1, ..., y_n) = (x_1 + y_1, ..., x_n + y_n).
x'(t) = lim_{h→0} (1/h) (x(t + h) - x(t)).

It has a natural geometric interpretation as the vector based at x(t) which is a translate of x'(t). This vector is called the tangent vector to the curve at t (or at x(t)).
If we imagine t as denoting time, then the length |x'(t)| of the tangent vector is interpreted physically as the speed of a particle describing the curve x(t).
To write (1) in an abbreviated form we call the doubly indexed set of numbers a_{ij} an n × n matrix A, denoted thus:

A = [a_{ij}],  1 ≤ i, j ≤ n.

To each point x the matrix A assigns the point Ax whose ith coordinate is

a_{i1}x_1 + ... + a_{in}x_n;

note that this is the ith row in the right-hand side of (1). In this way the matrix A is interpreted as a map

A: R^n → R^n

which to x assigns Ax.
With this notation (1) is rewritten

(2)  x' = Ax.

Thus the system (1) can be considered as a single "vector differential equation" (2). (The word equation is classically reserved for the case of just one variable; we shall call (2) both a system and an equation.)
We think of the map A: R^n → R^n as a vector field on R^n: to each point x ∈ R^n it assigns the vector based at x which is a translate of Ax. Then a solution of (2) is a curve x: R → R^n whose tangent vector at any given t is the vector Ax(t) (translated to x(t)). See Fig. D of Section 1.
In Chapters 3 and 4 we shall give methods of explicitly solving (2), or equivalently (1). In subsequent chapters it will be shown that in fact (2) has a unique solution x(t) satisfying any given initial condition x(0) = u_0 ∈ R^n. This is the fundamental theorem of linear differential equations with constant coefficients; in Section 1 this was proved for the special case n = 1.
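Although the text defers explicit solution methods to Chapters 3 and 4, the unique solution can be previewed numerically as x(t) = e^{tA}u_0; the sketch below truncates the power series of the matrix exponential (an illustration only, not the book's S + N method):

```python
import math

# Sketch (not the book's method): the solution of x' = Ax, x(0) = u0 is
# x(t) = e^{tA} u0, where e^{tA} = sum_{k>=0} (tA)^k / k!.
# Here the series is simply truncated after a fixed number of terms.
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=60):
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# sanity check against the n = 1 case solved in Section 1: e^{ta} for t*a = -0.35
val = mat_exp([[-0.35]])[0][0]
print(val, math.exp(-0.35))  # the two agree to machine precision
```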
PROBLEMS
1. For each of the following matrices A sketch the vector field x → Ax in R^3. (Missing matrix entries are 0.)
I L L
0 -1 -1
1 0 1 1
-3 1 1
2. For A as in (a), (b), (c) of Problem 1, solve the initial value problem

x' = Ax,  x(0) = (k_1, k_2, k_3).
3. Let A be as in (e), Problem 1. Find constants a, b, c such that the curve t → (a cos t, b sin t, ce^{-t}) is a solution to x' = Ax with x(0) = (1, 0, 3).
4. Find two different matrices A, B such that the curve

x(t) = (e^t, 2e^{2t}, 4e^t)

satisfies both the differential equations

x' = Ax and x' = Bx.
5. Let A = [a_{ij}] be an n × n diagonal matrix, that is, a_{ij} = 0 if i ≠ j. Show that
the differential equation
x’ = Ax
has a unique solution for every initial condition.
6. Let A be an n × n diagonal matrix. Find conditions on A guaranteeing that

lim_{t→∞} x(t) = 0

for all solutions to x' = Ax.
(b) Let A = [1 -2]. Find solutions u(t), v(t) to x' = Ax such that every solution can be expressed in the form αu(t) + βv(t) for suitable constants α, β.
Notes
The background needed for a reader of Chapter 1 is a good first year of college
calculus. One good source is S. Lang's Second Course in Calculus [12, Chapters I, II, and IX]. In this reference the material on derivatives, curves, and vectors in R^n and matrices is discussed much more thoroughly than in our Section 2.
Chapter 2
Newton’s Equation and Kepler’s Law
We shall go into details of this field in Section 6. Other important examples of force
fields are derived from electrical forces, magnetic forces, and so on.
The connection between the physical concept of force field and the mathematical
concept of differential equation is Newton’s second law: F = ma. This law asserts
that a particle in a force field moves in such a way that the force vector at the location of the particle, at any instant, equals the acceleration vector of the particle times the mass m. If x(t) denotes the position vector of the particle at time t, where x: R → R^3 is a sufficiently differentiable curve, then the acceleration vector is the second derivative of x(t) with respect to time,

a(t) = ẍ(t).

(We follow tradition and use dots for time derivatives in this chapter.) Newton's second law states

F(x(t)) = mẍ(t).

Thus we obtain a second order differential equation:

ẍ = (1/m) F(x).
In Newtonian physics it is assumed that m is a positive constant. Newton's law of gravitation is used to derive the exact form of the function F(x). While these equations are the main goal of this chapter, we first discuss simple harmonic motion and then basic background material.
often in calculus courses, (2) is the only solution of (1) satisfying these initial conditions. Later we will show in a systematic way that these facts are true.
Using basic trigonometric identities, (2) may be rewritten in the form

(3)  x(t) = a cos (pt + t_0),

where a = (A^2 + B^2)^{1/2} is called the amplitude, and cos t_0 = A(A^2 + B^2)^{-1/2}.
In Section 6 we will consider equation (1) where a constant term is added (representing a constant disturbing force):

(4)  ẍ + p^2 x = K.

Then, similarly to (1), every solution of (4) has the form
Thus (x, x) = |x|^2. If x, y: I → R^n are C^1 functions, then a version of the Leibniz product rule for derivatives is

(x, y)' = (x', y) + (x, y'),

as can be easily checked using coordinate functions.
We will have occasion to consider functions f: R^n → R (which, for example, could be given by temperature or density). Such a map f is called C^1 if the map R^n → R given by each partial derivative x → ∂f/∂x_i(x) is defined and continuous (in Chapter 5 we discuss continuity in more detail). In this case the gradient of f, called grad f, is the map R^n → R^n that sends x into (∂f/∂x_1(x), ..., ∂f/∂x_n(x)). Grad f is an example of a vector field on R^n. (In Chapter 1 we considered only linear vector fields, but grad f may be more general.)
Next, consider the composition of two C^1 maps as follows:

g: I → R^n,  f: R^n → R.

The chain rule can be expressed in this context as

d/dt f(g(t)) = Σ_i ∂f/∂x_i (g(t)) g_i'(t);

using the definitions of gradient and inner product, the reader can prove that this is equivalent to

d/dt f(g(t)) = (grad f(g(t)), g'(t)).
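The gradient form of the chain rule can be illustrated numerically; f and g below are arbitrary sample maps, and all derivatives are approximated by centered differences.

```python
import math

# Numerical illustration of the chain rule in gradient form,
# (d/dt) f(g(t)) = (grad f(g(t)), g'(t)),
# for sample maps f: R^2 -> R and g: R -> R^2 (arbitrary choices).
f = lambda x, y: x * x * y + math.sin(y)
g = lambda t: (math.cos(t), t * t)

h = 1e-6  # step for centered finite differences

def grad_f(x, y):
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

t = 0.7
x, y = g(t)
gp = ((g(t + h)[0] - g(t - h)[0]) / (2 * h),
      (g(t + h)[1] - g(t - h)[1]) / (2 * h))      # g'(t)
lhs = (f(*g(t + h)) - f(*g(t - h))) / (2 * h)     # (f o g)'(t)
rhs = grad_f(x, y)[0] * gp[0] + grad_f(x, y)[1] * gp[1]
print(abs(lhs - rhs))  # near zero
```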
A vector field F: R^3 → R^3 is called a force field if the vector F(x) assigned to the point x is interpreted as a force acting on a particle placed at x.
Many force fields appearing in physics arise in the following way. There is a C^1 function

V: R^3 → R

such that

F(x) = -grad V(x).

(The negative sign is traditional.) Such a force field is called conservative. The function V is called the potential energy function. (More properly V should be called a potential energy since adding a constant to it does not change the force field -grad V(x).) Problem 4 relates potential energy to work.
Here ẋ(t) is interpreted as the velocity vector at time t; its length |ẋ(t)| is the speed at time t. If we consider the function x: R → R^3 as describing a curve in R^3, then ẋ(t) is the tangent vector to the curve at x(t).
For a particle moving in a conservative force field F = -grad V, the potential energy at x is defined to be V(x). Note that whereas the kinetic energy depends on the velocity, the potential energy is a function of position.
The total energy (or sometimes simply energy) is

E = T + V.
This has the following meaning. If x(t) is the trajectory of a particle moving in the conservative force field, then E is a real-valued function of time:

E(t) = (1/2) m |ẋ(t)|^2 + V(x(t)).
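That E(t) is constant along trajectories (conservation of energy) can be observed numerically; the sketch below uses the sample conservative field F(x) = -x with m = 1, which is not the gravitational field discussed later.

```python
# A particle of mass m = 1 in the conservative field F(x) = -x, i.e. the
# potential V(x) = |x|^2 / 2 (a sample choice of field, not from the text).
# Along a numerically integrated trajectory (leapfrog scheme), the total
# energy E = |v|^2/2 + V(x) should stay essentially constant.
def energy(x, v):
    return 0.5 * (v[0]**2 + v[1]**2) + 0.5 * (x[0]**2 + x[1]**2)

x, v, dt = [1.0, 0.0], [0.0, 0.5], 1e-3
E0 = energy(x, v)
drift = 0.0
for _ in range(20_000):
    v = [v[i] - 0.5 * dt * x[i] for i in range(2)]  # half kick, F(x) = -x
    x = [x[i] + dt * v[i] for i in range(2)]        # drift step
    v = [v[i] - 0.5 * dt * x[i] for i in range(2)]  # half kick
    drift = max(drift, abs(energy(x, v) - E0))
print(E0, drift)  # drift stays tiny
```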
A force field F is called central if F(x) points in the direction of the line through x and the origin, for every x. In other words, the vector F(x) is always a scalar multiple of x, the coefficient depending on x:

F(x) = λ(x)x.

We often tacitly exclude from consideration a particle at the origin; many central force fields are not defined (or are "infinite") at the origin.
Lemma  Let F be a conservative force field. Then the following statements are equivalent:

(a) F is central,
(b) F(x) = f(|x|)x,
(c) F(x) = -grad V(x) and V(x) = g(|x|).

Proof. Suppose (c) is true. To prove (b) we find, from the chain rule:

F(x) = -g'(|x|) x/|x|;

this proves (b) with f(|x|) = -g'(|x|)/|x|. It is clear that (b) implies (a). To show that (a) implies (c) we must prove that V is constant on each sphere

S_a = {x ∈ R^3 : |x| = a},  a > 0.
Since any two points in S_a can be connected by a curve in S_a, it suffices to show that V is constant on any curve in S_a. Hence if J ⊂ R is an interval and u: J → S_a is a C^1 map, we must show that the derivative of the composition V ∘ u

J → S_a ⊂ R^3 → R

is identically 0. This derivative is

d/dt V(u(t)) = (grad V(u(t)), u'(t)) = 0

because |u(t)| ≡ a implies (u(t), u'(t)) = 0, while (a) implies that grad V(u(t)) is a scalar multiple of u(t).
In Section 5 we shall consider a special conservative central force field obtained
from Newton’s law of gravitation.
Consider now a central force field, not necessarily conservative.
Suppose at some time t_0, that P ⊂ R^3 denotes the plane containing the particle, the velocity vector of the particle, and the origin. The force vector F(x) for any point x in P also lies in P. This makes it plausible that the particle stays in the plane P for all time. In fact, this is true: a particle moving in a central force field moves in a fixed plane.
The proof depends on the cross product (or vector product) u × v of vectors u, v in R^3. We recall the definition

u × v = (u_2v_3 - u_3v_2, u_3v_1 - u_1v_3, u_1v_2 - u_2v_1) ∈ R^3

and that u × v = -v × u = |u| |v| N sin θ, where N is a unit vector perpendicular to u and v, with (u, v, N) oriented as the axes ("right-hand rule"), and θ is the angle between u and v.
Then the vector u × v = 0 if and only if one vector is a scalar multiple of the other; if u × v ≠ 0, then u × v is orthogonal to the plane containing u and v. If u and v are functions of t in R, then a version of the Leibniz product rule asserts (as one can check using Cartesian coordinates):

d/dt (u × v) = u̇ × v + u × v̇.
Now let x(t) be the path of a particle moving under the influence of a central force field. We have

d/dt (x × ẋ) = ẋ × ẋ + x × ẍ = x × ẍ = 0

because ẍ is a scalar multiple of x. Therefore x(t) × ẋ(t) is a constant vector y. If y ≠ 0, this means that x and ẋ always lie in the plane orthogonal to y, as asserted. If y = 0, then ẋ(t) = g(t)x(t) for some scalar function g(t). This means that the velocity vector of the moving particle is always directed along the line through the
origin and the particle, as is the force on the particle. This makes it plausible that the particle always moves along the same line through the origin. To prove this let (x_1(t), x_2(t), x_3(t)) be the coordinates of x(t). Then we have three differential equations

ẋ_k(t) = g(t)x_k(t),  k = 1, 2, 3.
By integration we find

x_k(t) = x_k(0) exp(∫_0^t g(s) ds);

hence x(t) is always a scalar multiple of x(0), and the particle stays on a fixed line through the origin.

In polar coordinates (r, θ) in the plane of motion, with i the unit radial vector at x and j the unit vector 90° counterclockwise from i, the acceleration works out to

ẍ = (r̈ - rθ̇^2) i + (1/r) d/dt (r^2 θ̇) j.

For a central force the j-component of ẍ vanishes, so r^2 θ̇ is constant along the motion.
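The constancy of x × ẋ for central fields is easy to confirm numerically; the sketch below integrates a sample inverse-square field with a leapfrog scheme (mass 1, arbitrary initial state).

```python
# Check numerically that x(t) x xdot(t) is constant for a central force field;
# the sample field is F(x) = -x/|x|^3 and the mass is 1.
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def accel(x):
    r3 = (x[0]**2 + x[1]**2 + x[2]**2) ** 1.5
    return tuple(-c / r3 for c in x)

x, v, dt = (1.0, 0.0, 0.2), (0.0, 0.9, 0.1), 1e-4
y0 = cross(x, v)                       # x(0) cross v(0)
a = accel(x)
for _ in range(50_000):
    v = tuple(v[i] + 0.5 * dt * a[i] for i in range(3))
    x = tuple(x[i] + dt * v[i] for i in range(3))
    a = accel(x)
    v = tuple(v[i] + 0.5 * dt * a[i] for i in range(3))
print(y0, cross(x, v))                 # the two vectors agree closely
```

In fact the leapfrog scheme, like the exact flow, conserves this cross product exactly for central forces, so the agreement is at roundoff level.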
We can now prove one of Kepler's laws. Let A(t) denote the area swept out by the vector x(t) in the time from t_0 to t. In polar coordinates dA = (1/2) r^2 dθ. We define the areal velocity to be

Ȧ = (1/2) r^2 θ̇,

the rate at which the position vector sweeps out area. Kepler observed that the line segment joining a planet to the sun sweeps out equal areas in equal times, which we interpret to mean Ȧ = constant. We have proved more generally that this is true for any particle moving in a conservative central force field; this is a consequence of conservation of angular momentum.
§5. States

The second order equation (1), mẍ = F(x), may be rewritten as a first order system by introducing the velocity v = ẋ as a separate variable:

(1')  dx/dt = v,
      m dv/dt = F(x).

A solution to (1') is a curve t → (x(t), v(t)) in the state space R^3 × R^3 such that

ẋ(t) = v(t) and v̇(t) = m^{-1}F(x(t)) for all t.
It can be seen then that the solutions of (1) and (1') correspond in a natural fashion. Thus if x(t) is a solution of (1), we obtain a solution of (1') by setting v(t) = ẋ(t). The map R^3 × R^3 → R^3 × R^3 that sends (x, v) into (v, m^{-1}F(x)) is a vector field on the space of states, and this vector field defines the differential equation (1').
A solution (x(t), v(t)) to (1') gives the passage of the state of the system in time. Now we may interpret energy as a function on the state space, E: R^3 × R^3 → R, defined by E(x, v) = (1/2) m |v|^2 + V(x). The statement that "the energy is an integral" then means that the composite function

t → (x(t), v(t)) → E(x(t), v(t))

is constant, or that on a solution curve in the state space, E is constant.
We abbreviate R^3 × R^3 by S. An integral (for (1')) on S is then any function that is constant on every solution curve of (1'). It was shown in Section 4 that, in addition to energy, angular momentum is also an integral for (1'). In the nineteenth century, the idea of solving a differential equation was tied to the construction of a sufficient number of integrals. However, it is realized now that integrals do not exist for differential equations very generally; the problems of differential equations have been considerably freed from the need for integrals.
Finally, we observe that the force field may not be defined on all of R^3, but only on some portion of it, for example, on an open subset U ⊂ R^3. In this case the path x(t) of the particle is assumed to lie in U. The force and velocity vectors, however, are still allowed to be arbitrary vectors in R^3. The force field is then a vector field on U, denoted by F: U → R^3. The state space is the Cartesian product U × R^3, and (1') is a first order equation on U × R^3.
§6. Elliptical Planetary Orbits

We now pass to consideration of Kepler's first law, that planets have elliptical orbits. For this, a central force is not sufficient. We need the precise form of V as given by the "inverse square law."
We shall show that in polar coordinates (r, θ), an orbit with nonzero angular momentum h is the set of points satisfying

r(1 + e cos θ) = l = constant,  e = constant,

which defines a conic, as can be seen by putting r cos θ = x, r^2 = x^2 + y^2.
Astronomical observations have shown the orbits of planets to be (approxi-
mately) ellipses.
Newton's law of gravitation states that a body of mass m_1 exerts a force on a body of mass m_2. The magnitude of the force is gm_1m_2/r^2, where r is the distance between their centers of gravity and g is a constant. The direction of the force on m_2 is from m_2 to m_1.
Thus if m_1 lies at the origin of R^3 and m_2 lies at x ∈ R^3, the force on m_2 is

F(x) = -g m_1 m_2 x/|x|^3.
We must now face the fact that both bodies will move. However, if m_1 is much greater than m_2, its motion will be much less since acceleration is inversely proportional to mass. We therefore make the simplifying assumption that one of the bodies does not move; in the case of planetary motion, of course, it is the sun that is assumed at rest. (One might also proceed by taking the center of mass at the origin, without making this simplifying assumption.)
We place the sun at the origin of R^3 and consider the force field corresponding to a planet of given mass m. This field is then

F(x) = -C x/|x|^3,

where C is a constant. We then change the units in which force is measured to obtain the simpler formula

F(x) = -x/|x|^3.
It is clear this force field is central. Moreover, it is conservative, since

-x/|x|^3 = -grad V,

where

V = -1/|x|.

Observe that F(x) is not defined at 0.
As in the previous section we restrict attention to particles moving in the plane R^2; or, more properly, in R^2 - 0. The force field is the Newtonian gravitational field in R^2, F(x) = -x/|x|^3.
Consider a particular solution curve of our differential equation ẍ = m^{-1}F(x). The angular momentum h and energy E are regarded as constants in time since they are the same at all points of the curve. The case h = 0 is not so interesting; it corresponds to motion along a straight line toward or away from the sun. Hence we assume h ≠ 0.
Introduce polar coordinates (r, θ); along the solution curve they become functions of time (r(t), θ(t)). Since r^2 θ̇ is constant and not 0, the sign of θ̇ is constant along the curve. Thus θ is always increasing or always decreasing with time. Therefore r is a function of θ along the curve.
Let u(t) = 1/r(t); then u is also a function of θ(t). Note that

u = -V.
§6. ELLIPTICAL PLANETARY ORBITS

We have a convenient formula for kinetic energy T.

Lemma    T = (h^2 / 2m) [ (du/dθ)^2 + u^2 ].

Proof.  From the formula for x' in Section 4 and the definition of T we have

T = (m/2) [ (r')^2 + (r θ')^2 ].

Also,

r' = d/dt (1/u) = -(1/u^2) (du/dθ) θ' = -(h/m) (du/dθ)

by the chain rule and the definitions of u and h; and also

r θ' = h/(mr) = hu/m.

Substitution in the formula for T proves the lemma.
Now we find a differential equation relating u and θ along the solution curve. Observe that T = E - V = E + u. From the lemma we get

(2)    d^2u/dθ^2 + u = m/h^2,

where m/h^2 is a constant.
We re-examine the meaning of just what we are doing and of (2). A particular orbit of the planar central force problem is considered, the force being gravitational. Along this orbit, the distance r from the origin (the source of the force) is a function of θ, as is 1/r = u. We have shown that this function u = u(θ) satisfies (2), where h is the constant angular momentum and m is the mass.
The solution of (2) (as was seen in Section 1) is

(3)    u(θ) = m/h^2 + C cos(θ - θ0),

where

C = ± (1/h^2) (2mh^2 E + m^2)^{1/2}.
2. NEWTON'S EQUATION AND KEPLER'S LAW

(4)    u = (m/h^2) [1 + (1 + 2Eh^2/m)^{1/2} cos θ].
We recall from analytic geometry that the equation of a conic in polar coordinates is

(5)    u = (1/l)(1 + e cos θ),    u = 1/r.

Here l is the latus rectum and e ≥ 0 is the eccentricity. The origin is a focus and the three cases e > 1, e = 1, e < 1 correspond respectively to a hyperbola, parabola, and ellipse. The case e = 0 is a circle.
Since (4) is in the form (5), we have shown that the orbit of a particle moving under the influence of a Newtonian gravitational force is a conic of eccentricity

e = (1 + 2Eh^2/m)^{1/2}.

Along the orbit u > 0, so at every angle θ reached by the orbit

(1 + 2Eh^2/m)^{1/2} cos θ > -1.
But if θ = ±π radians, cos θ = -1, and hence

(1 + 2Eh^2/m)^{1/2} < 1.

This is equivalent to E < 0. For some of the planets, including the earth, complete revolutions have been observed; for these planets cos θ = -1 at least once a year. Therefore their orbits are ellipses. Indeed, from only a few observations of any planet it can be shown that its orbit is an ellipse.
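The chain of identities above can also be checked numerically. The following sketch (not from the text; it uses units with m = 1 and F(x) = -x/|x|^3, and an arbitrarily chosen Runge-Kutta step size and initial condition) integrates Newton's equation in the plane and compares the extreme radii of the computed orbit with the conic predicted by (4) and (5): r_min = l/(1 + e) and r_max = l/(1 - e), with latus rectum l = h^2 and eccentricity e = (1 + 2Eh^2)^{1/2}.

```python
import math

# Sketch: integrate x'' = -x/|x|^3 (mass m = 1) with classical RK4 and
# compare the orbit's extreme radii with the conic of eccentricity
# e = (1 + 2*E*h^2)**0.5 and latus rectum l = h^2 derived above.

def accel(x, y):
    r3 = (x * x + y * y) ** 1.5
    return -x / r3, -y / r3

def rk4_step(s, dt):
    def deriv(s):
        x, y, vx, vy = s
        ax, ay = accel(x, y)
        return (vx, vy, ax, ay)
    def shift(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = deriv(s)
    k2 = deriv(shift(s, k1, dt / 2))
    k3 = deriv(shift(s, k2, dt / 2))
    k4 = deriv(shift(s, k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

x0, y0, vx0, vy0 = 1.0, 0.0, 0.0, 0.8           # chosen so that E < 0
E = 0.5 * (vx0**2 + vy0**2) - 1.0 / math.hypot(x0, y0)  # energy T + V
h = x0 * vy0 - y0 * vx0                          # angular momentum (m = 1)
e = math.sqrt(1.0 + 2.0 * E * h * h)             # predicted eccentricity
l = h * h                                        # predicted latus rectum

s, radii = (x0, y0, vx0, vy0), []
for _ in range(20000):                           # several full revolutions
    s = rk4_step(s, 2e-3)
    radii.append(math.hypot(s[0], s[1]))
r_min, r_max = min(radii), max(radii)
print(r_min - l / (1 + e), r_max - l / (1 - e))  # both residuals near 0
```

With these initial conditions the motion starts at apoapsis, and the printed residuals reflect only integration and sampling error.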
PROBLEMS
The work done by the force field F along a path y is

∫ (F(y(s)), y'(s)) ds,

where y'(s) is the (unit) tangent vector to the path. Prove that the force field is conservative if and only if the work is independent of the path. In fact, if F = -grad V, then the work done along a path from z0 to z1 is V(z0) - V(z1).
5. How can we determine whether the orbit of (a) Earth and (b) Pluto is an
ellipse, parabola, or hyperbola?
6. Fill in the details of the proof of the theorem in Section 4.
7. Prove the angular momentum h, energy E, and mass m of a planet are related by the inequality

E ≥ -m/(2h^2).
Notes

Lang's Second Course in Calculus [12] is a good background reference for the mathematics in this chapter, especially his Chapters 3 and 4. The physics material is covered extensively in a fairly elementary (and perhaps old-fashioned) way in Principles of Mechanics by Synge and Griffith [23]. One can also find the mechanics discussed in the book on advanced calculus by Loomis and Sternberg [15, Chapter 13].
The unsystematic ad hoc methods used in Section 6 are successful here because of the relative simplicity of the equations. These methods do not extend very far into mechanics. In general, there are not enough "integrals."
The model of planetary motion in this chapter is quite idealized; it ignores the gravitational effect of the other planets.
Chapter 3

Linear Systems with Constant Coefficients and Real Eigenvalues
The purpose of this chapter is to begin the study of the theory of linear operators, which are basic to differential equations. Section 1 is an outline of the necessary facts about vector spaces. Since it is long, it is divided into Parts A through F. A reader familiar with some linear algebra should use Section 1 mainly as a reference.
In Section 2 we show how to diagonalize an operator having real, distinct eigenvalues. This technique is used in Section 3 to solve the linear, constant coefficient system x' = Ax, where A is an operator having real distinct eigenvalues. The last section is an introduction to complex eigenvalues. This subject will be studied further in Chapter 4.
We emphasize that for many readers this section should be used only as a reference or a review.
The setting for most of the differential equations in this book is Cartesian space R^n; this space was defined in Chapter 1, Section 2, as were the operations of addition and scalar multiplication of vectors. The following familiar properties of these operations hold for all x, y, z in R^n and all λ, μ in R:

VS1:  x + y = y + x,
      x + 0 = x,
      x + (-x) = 0,
      (x + y) + z = x + (y + z).

VS2:  (λ + μ)x = λx + μx,
      λ(x + y) = λx + λy,
      1x = x,
      0x = 0  (the first 0 in R, the second in R^n).
These operations satisfying VS1 and VS2 define the vector space structure on R^n. Frequently, our development relies only on the vector space structure and ignores the Cartesian (that is, coordinate) structure of R^n. To emphasize this idea, we may write E for R^n and call E a vector space.
The standard coordinates are often ill suited to the differential equation being studied; we may seek new coordinates, as we did in Chapter 1, giving the equation a simpler form. The goal of this and subsequent chapters on algebra is to explain this process. It is very useful to be able to treat vectors (and later, operators) as objects independent of any particular coordinate system.
The reader familiar with linear algebra will recognize VS1 and VS2 as the defining axioms of an abstract vector space. With the additional axiom of finite dimensionality, abstract vector spaces could be used in place of R^n throughout most of this book.
Let A = [a_ij] be an n × n matrix as in Section 2 of Chapter 1. Thus each a_ij is a real number, where (i, j) ranges over all ordered pairs of integers with 1 ≤ i ≤ n, 1 ≤ j ≤ n. The matrix A can be considered as a map A: R^n → R^n where the ith coordinate of Ax is Σ_{j=1}^n a_ij x_j, for each x = (x1, . . . , xn) in R^n. It is easy to check that this map satisfies, for x, y ∈ R^n, λ ∈ R:

L1:  A(x + y) = Ax + Ay,
L2:  A(λx) = λAx.

These are called linearity properties. Any map A: R^n → R^n satisfying L1 and L2 is called a linear map. Even more generally, a map A: R^n → R^m (perhaps different domain and range) that satisfies L1 and L2 is called linear. In the case where the domain and range are the same, A is also called an operator. The set of all operators on R^n is denoted by L(R^n).
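The properties L1 and L2 are easy to check numerically; the following sketch (illustrative only, with an arbitrarily chosen 2 × 2 matrix and vectors) applies a matrix to a vector by the coordinate formula above and verifies both linearity properties.

```python
# (Ax)_i = sum_j a_ij x_j, the coordinate formula above.
def mat_apply(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = [[5, 3], [-6, -4]]                 # an arbitrary 2 x 2 matrix
x, y, lam = [1, 2], [3, -1], 2.5

# L1: A(x + y) = Ax + Ay
lhs1 = mat_apply(A, [xi + yi for xi, yi in zip(x, y)])
rhs1 = [a + b for a, b in zip(mat_apply(A, x), mat_apply(A, y))]
# L2: A(lam * x) = lam * (Ax)
lhs2 = mat_apply(A, [lam * xi for xi in x])
rhs2 = [lam * v for v in mat_apply(A, x)]
print(lhs1 == rhs1, lhs2 == rhs2)      # True True
```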
Note that if e_k ∈ R^n is the vector

e_k = (0, . . . , 0, 1, 0, . . . , 0),

with 1 in the kth place, then Ae_k is the kth column of A. If T and S are operators with matrices B = [b_ik] and A = [a_kj] respectively, then

(TS)e_j = T(Se_j) = T(Σ_k a_kj e_k) = Σ_k a_kj (Te_k) = Σ_k a_kj (Σ_i b_ik e_i).

Therefore

(TS)e_j = Σ_i (Σ_k b_ik a_kj) e_i.
Thus I has ones on the diagonal (from upper left to lower right) and zeros elsewhere. It is clear that A + 0 = 0 + A = A, 0A = A0 = 0, and AI = IA = A, for both operators and matrices.
If T is an operator and λ any real number, a new operator λT is defined by

(λT)x = λ(Tx).

If A = [a_ij] is the matrix of T, then the matrix of λT is λA = [λa_ij], obtained by multiplying each entry in A by λ. It is clear that

0T = 0,
1T = T,

and similarly for matrices. Here 0 and 1 are real numbers.
The set L(R^n) of all operators on R^n, like the set M_n of all n × n matrices, satisfies the vector space axioms VS1, VS2, with 0 as the zero operator (or matrix) and x, y, z as operators (or matrices). If we consider an n × n matrix as a point in R^{n^2}, the Cartesian space of dimension n^2, then the vector space operations on L(R^n) and M_n are the usual ones.
Proposition 1  Every vector space F has a basis, and every basis of F has the same number of elements. If {e1, . . . , ek} ⊂ F is an independent subset that is not a basis, by adjoining to it suitable vectors e_{k+1}, . . . , e_m one can form a basis {e1, . . . , em}.

The number of elements in a basis of F is called the dimension of F, denoted by dim F. If {e1, . . . , em} is a basis of F, then every vector x ∈ F can be expressed

x = Σ_{i=1}^m t_i e_i,    t_i ∈ R,
since the e_i span F. Moreover, the numbers t1, . . . , tm are unique. To see this, suppose also that

x = Σ_{i=1}^m s_i e_i.

Then

0 = x - x = Σ_i (t_i - s_i) e_i;

by independence,

t_i - s_i = 0,    i = 1, . . . , m.

These numbers t1, . . . , tm are called the coordinates of x in the basis {e1, . . . , em}.
The standard basis e1, . . . , en of R^n is defined by

e_i = (0, . . . , 0, 1, 0, . . . , 0);    i = 1, . . . , n,

with 1 in the ith place and 0 elsewhere. This is in fact a basis; for Σ t_i e_i = (t1, . . . , tn), so {e1, . . . , en} spans R^n; independence is immediate.
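Finding the coordinates t_i of a vector in a given basis amounts to solving a linear system, and the uniqueness of the solution is exactly the independence argument above. A dimension-two sketch (the basis and vector here are arbitrary examples; Cramer's rule suffices for 2 × 2 systems):

```python
def coords_in_basis(f1, f2, x):
    # solve t1*f1 + t2*f2 = x by Cramer's rule; det != 0 iff f1, f2 independent
    det = f1[0] * f2[1] - f2[0] * f1[1]
    t1 = (x[0] * f2[1] - f2[0] * x[1]) / det
    t2 = (f1[0] * x[1] - x[0] * f1[1]) / det
    return t1, t2

f1, f2 = (1, -1), (-1, 2)
x = (3, -4)
t1, t2 = coords_in_basis(f1, f2, x)
# reconstructing x from its coordinates recovers the original vector
print(t1, t2, (t1 * f1[0] + t2 * f2[0], t1 * f1[1] + t2 * f2[1]))
```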
Proposition 2  Two vector spaces are isomorphic if and only if they have the same dimension. In particular, every n-dimensional vector space is isomorphic to R^n.

Proof.  Suppose E and F are isomorphic. If {e1, . . . , en} is a basis for E, it is easy to verify that Te1, . . . , Ten span F (since T is onto) and are independent (since T is one-to-one). Therefore E and F have the same dimension, n. Conversely, suppose {e1, . . . , en} and {f1, . . . , fn} are bases for E and F, respectively. Define T: E → F to be the unique linear map such that Te_i = f_i, i = 1, . . . , n.

(a) Ker T = 0,
(b) Im T = F,
(c) T is an isomorphism.
E, and clearly,

φ(Σ x_i e_i) = (x1, . . . , xn).

Each of the new coordinates y_i: E → R is a linear map, and so can be expressed in terms of the old coordinates (x1, . . . , xn). In this way another n × n matrix is defined:

(4)    y_k = Σ_j q_kj x_j.

Computing the coordinates of a vector in both bases and comparing, one finds

δ_ki = Σ_j (Σ_l q_kl p_il δ_lj)

by (5). Each term of the internal sum on the right is 0 unless l = j, in which case it is q_kj p_ij. Thus

δ_ki = Σ_j q_kj p_ij.

We have proved:
Proposition 4  The matrix expressing the new coordinates in terms of the old is the inverse transpose of the matrix expressing the new basis in terms of the old.

Here the new basis {f1, . . . , fn} is expressed in terms of the standard basis {e1, . . . , en} of R^n by

P = [p_ij],    f_i = Σ_j p_ij e_j.

Equivalently, the ith coordinate of Tx, x = (x1, . . . , xn), is Σ_j a_ij x_j, where A = [a_ij] is the matrix representing T.
For n = 1, the determinant of T: R^1 → R^1 is the factor by which T multiplies lengths, except possibly for sign. Similarly for R^2 and areas, R^3 and volumes.
If A is a triangular matrix (a_ij = 0 for i > j, or a_ij = 0 for i < j), then

Det A = a_11 · · · a_nn,

the product of the diagonal elements.
From D3 we deduce that similar matrices have the same trace. Therefore we can define the trace of an operator to be the trace of any matrix representing it. It is not easy to interpret the trace geometrically.
Note that

Tr(A + B) = Tr(A) + Tr(B).
The rank of an operator is defined to be the dimension of its image. Since every n × n matrix defines an operator on R^n, we can define the rank of a matrix A to be the rank of the corresponding operator T. Rank is invariant under similarity.
The vector space Im T is spanned by the images under T of the standard basis vectors e1, . . . , en. Since Te_j is the n-tuple that is the jth column of A, it follows that the rank of A equals the maximum number of independent columns of A.
This gives a practical method for computing the rank of an operator T. Let A be an n × n matrix representing T in some basis. Denote the jth column of A by c_j, thought of as an n-tuple of numbers, that is, an element of R^n. The rank of T equals the dimension of the subspace of R^n spanned by c1, . . . , cn. This subspace is also spanned by c1, . . . , c_{j-1}, c_j + λc_k, c_{j+1}, . . . , cn; λ ∈ R. Thus we may replace any column c_j of A by c_j + λc_k, for any λ ∈ R, k ≠ j. In addition, the order of the columns can be changed without altering the rank. By repeatedly transforming A in these two ways we can change A to the form

B = [ D  0 ]
    [ C  0 ],

where D is an r × r diagonal matrix whose diagonal entries are different from zero, C has n - r rows and r columns, and all other entries are 0. It is easy to see that the rank of B, and hence of A, is r.
From Proposition 3 (Part B) it follows that an operator on an n-dimensional
vector space is invertible if and only if it has rank n.
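The elimination procedure described above can be sketched in code. (Row operations are used below instead of the text's column operations; they preserve rank equally well, since the rank of a matrix equals the rank of its transpose. The function name and tolerance are our own choices.)

```python
def rank(A, eps=1e-12):
    M = [list(row) for row in A]          # work on a copy
    rows, cols = len(M), len(M[0])
    r = 0                                 # number of pivots found so far
    for col in range(cols):
        pivot = next((i for i in range(r, rows) if abs(M[i][col]) > eps), None)
        if pivot is None:
            continue                      # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]   # swap pivot row into place
        for i in range(rows):
            if i != r and abs(M[i][col]) > eps:
                f = M[i][col] / M[r][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

print(rank([[1, 2, 3], [2, 4, 6], [0, 1, 1]]))   # second row = 2 x first: rank 2
```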
the union of the basis elements of the E_j to obtain a basis for E, T has the matrix

diag(A1, . . . , An).

This means the matrices A_i are put together corner-to-corner diagonally as indicated, all other entries in A being zero. (We adopt the convention that the blank entries in a matrix are zeros.)
For direct sums of operators there is the useful formula

Det(T1 ⊕ · · · ⊕ Tn) = (Det T1) · · · (Det Tn),

and the equivalent matrix formula

Det diag(A1, . . . , An) = (Det A1) · · · (Det An).

Also,

Tr(T1 ⊕ · · · ⊕ Tn) = Tr(T1) + · · · + Tr(Tn),

and

Tr diag(A1, . . . , An) = Tr(A1) + · · · + Tr(An).
We identify the Cartesian product of R^m and R^n with R^{m+n} in the obvious way. If E ⊂ R^m and F ⊂ R^n are subspaces, then E × F is a subspace of R^{m+n} under this identification. Thus the Cartesian product of two vector spaces is a vector space.
Det [ 5-λ    3   ]
    [ -6   -4-λ ]  = (λ - 2)(λ + 1).

The eigenvalues are therefore 2 and -1. The eigenvectors belonging to 2 are solutions of the equation (T - 2I)x = 0, or, in coordinates,

3x1 + 3x2 = 0,
-6x1 - 6x2 = 0.
Note that any vector x = (x1, x2) in R^2 can be written in the form y1 f1 + y2 f2; then x = (y1 - y2, -y1 + 2y2), using the definition of the f_i. Therefore (y1, y2) are the coordinates of x in the new basis. Thus

x = By,    B = [ 1  -1 ]
               [ -1  2 ].

This is how the diagonalizing change of coordinates was found in Section 1 of Chapter 1.
Example. Let T have the matrix [i -:I. The characteristic polynomial is
a_21 x1 + (a_22 - λ) x2 + · · · + a_2n xn = 0,
= Σ_{j=1}^m t_j (Te_j - λ_n e_j)
We will often use the expression "A has real distinct eigenvalues" for the hypothesis of Theorems 1 and 2.
Another useful condition implying diagonalizability is that the operator have a symmetric matrix (a_ij = a_ji) in some basis; see Chapter 9.
Let us examine a general operator T on R^2 for diagonalizability. Let the matrix be

[ a  b ]
[ c  d ];

the characteristic polynomial p_T(λ) is

Det [ a-λ   b  ]
    [  c   d-λ ]  = (a - λ)(d - λ) - bc = λ^2 - (a + d)λ + (ad - bc).

Notice that a + d is the trace Tr and ad - bc is the determinant Det. The roots of p_T(λ), and hence the eigenvalues of T, are therefore

(1/2)[Tr ± (Tr^2 - 4 Det)^{1/2}].

The roots are real and distinct if Tr^2 - 4 Det > 0; they are nonreal complex conjugates if Tr^2 - 4 Det < 0; and there is only one root, necessarily real, if Tr^2 - 4 Det = 0. Therefore T is diagonalizable if Tr^2 - 4 Det > 0. The remaining case, Tr^2 - 4 Det = 0, is ambiguous. If T is diagonalizable, the diagonal elements are its eigenvalues. If p_T has only one root α, then T has a matrix

[ α  0 ]
[ 0  α ].

Hence T = αI. But this means any matrix for T is diagonal (not just diagonalizable)! Therefore when Tr^2 - 4 Det = 0, either every matrix for T, or no matrix for T, is diagonal. The operator represented by the matrix

[ 1  1 ]
[ 0  1 ],

for example, cannot be diagonalized.
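The trace-determinant test above is easy to mechanize. A sketch for 2 × 2 matrices [[a, b], [c, d]] (the function name and return convention are our own):

```python
import math

def classify(a, b, c, d):
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det             # Tr^2 - 4 Det
    if disc > 0:
        r = math.sqrt(disc)
        return "real distinct", ((tr + r) / 2, (tr - r) / 2)
    if disc < 0:
        return "complex conjugate", None
    return "repeated real", (tr / 2, tr / 2)

print(classify(5, 3, -6, -4))   # eigenvalues 2 and -1
print(classify(1, 1, 0, 1))     # the ambiguous case Tr^2 - 4 Det = 0
```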
§3. DIFFERENTIAL EQUATIONS WITH REAL, DISTINCT EIGENVALUES
= Q^{-1}(QAQ^{-1})y = AQ^{-1}y;

hence

x' = Ax.

Moreover,

x(0) = Q^{-1}y(0) = Q^{-1}Q x0 = x0.
or

x = P^t y.

Note the order of the subscripts in (4)! The ith column of P^t consists of the coordinates of f_i. The matrix Q in the proof is the inverse of P^t. However, for some purposes it is not necessary to compute Q.
In the new coordinates the original differential equation becomes

(5)    y_i' = λ_i y_i,    i = 1, . . . , n,

so the general solution is

y_i(t) = a_i exp(tλ_i);    i = 1, . . . , n,

where a1, . . . , an are arbitrary constants, a_i = y_i(0). The general solution to the original equation is found from (4):

(6)    x_j(t) = Σ_i p_ij a_i exp(tλ_i);    j = 1, . . . , n.
In vector notation,

y(t) = (a1 exp(tλ1), . . . , an exp(tλn)).
To find a solution x(t) with a specified initial value

x(0) = u = (u1, . . . , un),

one substitutes t = 0 in (6), equates the right-hand side to u, and solves the resulting system of linear algebraic equations for the unknowns (a1, . . . , an):

(7)    Σ_i p_ij a_i = u_j;    j = 1, . . . , n.
(8)    x1' = x1,
       x2' = x1 + 2x2,
       x3' = x1 - x3.

The corresponding matrix is

A = [ 1  0  0 ]
    [ 1  2  0 ]
    [ 1  0 -1 ].

Since A is triangular,

Det(A - λI) = (1 - λ)(2 - λ)(-1 - λ).

Hence the eigenvalues are 1, 2, -1. They are real and distinct, so the theorem applies.
The matrix B is diag(1, 2, -1), so the general solution of y' = By is

y1(t) = a e^t,
y2(t) = b e^{2t},
y3(t) = c e^{-t},    a, b, c arbitrary constants.
To relate the old and new coordinates we must find three eigenvectors f1, f2, f3 of A belonging respectively to the eigenvalues 1, 2, -1. The second column of A shows that we can take

f2 = (0, 1, 0),

and the third column shows that we may take

f3 = (0, 0, 1).

To find f1 = (u1, u2, u3) we must solve the vector equation

(A - I)f1 = 0,

or

[ 0  0  0 ] [ u1 ]
[ 1  1  0 ] [ u2 ]  = 0;
[ 1  0 -2 ] [ u3 ]

hence we may take

f1 = (2, -2, 1).

The general solution to (8) is then
(9)    x1(t) = 2a e^t,
       x2(t) = -2a e^t + b e^{2t},
       x3(t) = a e^t + c e^{-t},

where a, b, c are arbitrary constants.
The reader should verify that (9) is indeed a solution to (8).
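Here is one way to carry out that verification mechanically (a sketch; it takes the first equation of (8) to be x1' = x1, which is what the triangular determinant (1 - λ)(2 - λ)(-1 - λ) and the solution (9) require):

```python
import math

a, b, c = 0.7, -1.3, 2.0       # arbitrary constants in (9)

def x(t):                       # the claimed general solution (9)
    return (2 * a * math.exp(t),
            -2 * a * math.exp(t) + b * math.exp(2 * t),
            a * math.exp(t) + c * math.exp(-t))

def x_prime(t):                 # its derivative, term by term
    return (2 * a * math.exp(t),
            -2 * a * math.exp(t) + 2 * b * math.exp(2 * t),
            a * math.exp(t) - c * math.exp(-t))

for t in (0.0, 0.5, 1.0):
    x1, x2, x3 = x(t)
    rhs = (x1, x1 + 2 * x2, x1 - x3)          # right-hand side of (8)
    assert all(abs(p - q) < 1e-9 for p, q in zip(x_prime(t), rhs))
print("(9) solves (8) at the sampled times")
```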
To solve an initial value problem for (8), with

x_i(0) = u_i;    i = 1, 2, 3,

we must select a, b, c appropriately. From (9) we find

x1(0) = 2a,
x2(0) = -2a + b,
x3(0) = a + c.
Thus we must solve the linear system

(10)    2a = u1,
       -2a + b = u2,
        a + c = u3,

for the unknowns a, b, c. This amounts to inverting the matrix of coefficients of the left-hand side of (10), which is exactly the matrix P^t. For particular values of u1, u2, u3, it is easier to solve (10) directly.
This procedure can, of course, be used for more general initial values, x(t0) = u.
The following observation is an immediate consequence of the proof of Theorem 1.

Theorem 2  Let the n × n matrix A have n distinct real eigenvalues λ1, . . . , λn.
By using this theorem we get much information about the general character of the solutions directly from the knowledge of the eigenvalues, without explicitly solving the differential equation. For example, if all the eigenvalues are negative, evidently

lim_{t→∞} x(t) = 0

for every solution x(t), and conversely. This aspect of linear equations will be investigated in later chapters.
Theorem 2 leads to another method of solution of (1). Regard the coefficients c_ij as unknowns; set

x_i(t) = Σ_j c_ij exp(tλ_j);    i = 1, . . . , n,
x2' = x1 + 2x2,
x3' = x1 - x3,

with the initial condition

x(0) = (1, 0, 0).
The eigenvalues are λ1 = 1, λ2 = 2, λ3 = -1. Our solution must be of the form

x_i(t) = c_i1 e^t + c_i2 e^{2t} + c_i3 e^{-t};    i = 1, 2, 3.

Substituting this into x2' = x1 + 2x2 gives

c21 e^t + 2c22 e^{2t} - c23 e^{-t} = (c11 + 2c21) e^t + (c12 + 2c22) e^{2t} + (c13 + 2c23) e^{-t}.

Therefore

c21 = c11 + 2c21,
2c22 = c12 + 2c22,
-c23 = c13 + 2c23,

which reduces to

c21 = -c11,
c12 = 0,
c23 = -(1/3)c13.

From x3' = x1 - x3 we obtain

c31 e^t + 2c32 e^{2t} - c33 e^{-t} = (c11 - c31) e^t + (c12 - c32) e^{2t} + (c13 - c33) e^{-t}.

Therefore

c31 = c11 - c31,
2c32 = c12 - c32,
-c33 = c13 - c33,

which boils down to

c31 = (1/2)c11,
c32 = 0,
c13 = 0.

Without using the initial condition yet, we have found

x1(t) = c11 e^t,
PROBLEMS
Lo 2 -3J
x2' = x1 + x2;    x1(0) = a,  x2(0) = b,

is

x1(t) = a e^t,
x2(t) = e^t (b + at).

(Hint: If (y1(t), y2(t)) is another solution, consider the functions e^{-t} y1(t), e^{-t} y2(t).)
4. Let an operator A have real, distinct eigenvalues. What condition on the eigenvalues is equivalent to lim_{t→∞} |x(t)| = ∞ for every solution x(t) to x' = Ax?

5. Suppose the n × n matrix A has real, distinct eigenvalues. Let t → φ(t, x0) be the solution to x' = Ax with initial value φ(0, x0) = x0.
(a) Show that for each fixed t,

lim_{y0→x0} φ(t, y0) = φ(t, x0).
A class of operators that have no real eigenvalues is the planar operators T_{a,b}: R^2 → R^2 represented by matrices of the form

A_{a,b} = [ a  -b ]
          [ b   a ],    b ≠ 0.

The characteristic polynomial is

λ^2 - 2aλ + (a^2 + b^2),

whose roots are

a + ib,  a - ib;    i = √-1.

We interpret T_{a,b} geometrically as follows. Introduce the numbers r, θ by

r = (a^2 + b^2)^{1/2},    θ = arc cos(a/r),    cos θ = a/r.
Next, we consider the operator T on R^2 whose matrix is

[ 1  -1 ]
[ 1   1 ].

The characteristic polynomial is λ^2 - 2λ + 2, whose roots are

1 + i,  1 - i.
§4. COMPLEX EIGENVALUES
(1)    dx/dt = ax - by,
       dy/dt = bx + ay.

We use complex variables to formally find a solution, check that what we have found solves (1), and postpone the uniqueness proof (but see Problem 5).
Thus replace (x, y) by z = x + iy, and A_{a,b} by a + bi = μ. Then (1) becomes

(2)    z' = μz.
Following the lead from the beginning of Chapter 1, we write a solution for (2), z(t) = K e^{tμ}. Let us interpret this in terms of complex and real numbers. Write the complex number K as u + iv and set z(t) = x(t) + iy(t), e^{tμ} = e^{ta} e^{itb}. A standard formula from complex numbers (see Appendix I) says that e^{itb} = cos tb + i sin tb. Putting this information together and taking real and imaginary parts we
obtain

(3)    x(t) = u e^{ta} cos tb - v e^{ta} sin tb,
       y(t) = u e^{ta} sin tb + v e^{ta} cos tb.
The reader who is uneasy about the derivation of (3) can regard the preceding paragraph simply as motivation for the formulas (3); it is easy to verify directly by differentiation that (3) indeed provides a solution to (1). On the other hand, all the steps in the derivation of (3) are justifiable.
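That direct verification is also quick to do numerically. The sketch below (with arbitrarily chosen a, b, and K) compares the real and imaginary parts of z(t) = K e^{tμ} with the formulas (3), and checks z' = μz with a centered difference quotient:

```python
import cmath
import math

a, b = 0.3, 2.0                      # mu = a + ib
u, v = 1.5, -0.5                     # K = u + iv
mu, K = complex(a, b), complex(u, v)

for t in (0.0, 0.7, 1.4):
    zt = K * cmath.exp(t * mu)
    ea = math.exp(t * a)
    xt = u * ea * math.cos(t * b) - v * ea * math.sin(t * b)   # (3), first line
    yt = u * ea * math.sin(t * b) + v * ea * math.cos(t * b)   # (3), second line
    assert abs(zt.real - xt) < 1e-12 and abs(zt.imag - yt) < 1e-12
    dh = 1e-6                        # centered difference approximation to z'(t)
    zdot = (K * cmath.exp((t + dh) * mu) - K * cmath.exp((t - dh) * mu)) / (2 * dh)
    assert abs(zdot - mu * zt) < 1e-6
print("(3) agrees with z(t) = K exp(t mu), and z' = mu z")
```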
We have just seen how the introduction of complex variables can be an aid in solving differential equations. Admittedly, this use was in a very special case. However, many systems not in the form (1) can be brought to that form through a change of coordinates (see Problem 5). In Chapter 4 we shall pursue this idea systematically. At present we merely give an example, which was treated before in the Kepler problem of Chapter 2.
Consider the system

(4)    x' = y,
       y' = -b^2 x.

The corresponding matrix is

A = [  0    1 ]
    [ -b^2  0 ],

whose eigenvalues are ±bi. It is natural to ask whether A can be put in the form

[ 0  -b ]
[ b   0 ]

through a coordinate change. The answer is yes; without explaining how we discovered them (this will be done in Chapter 4), we introduce new coordinates (u, v) by setting x = v, y = bu. Then
u' = (1/b) y' = -bv,
v' = x' = bu.

We have already solved the system

u' = -bv,
v' = bu;
the solution with (u(0), v(0)) = (u0, v0) is

u(t) = u0 cos tb - v0 sin tb,
v(t) = u0 sin tb + v0 cos tb.

In terms of x and y this gives

x(t) = (y0/b) sin tb + x0 cos tb,
y(t) = y0 cos tb - b x0 sin tb,

as can be verified by differentiation.
We can put this solution in a more perspicuous form as follows. Let

C = [(y0/b)^2 + x0^2]^{1/2}

and write, assuming C ≠ 0,

u = y0/(bC),    v = x0/C.

Then u^2 + v^2 = 1, and

x(t) = C[v cos tb + u sin tb].

Let t0 = b^{-1} arc cos v, so that

cos bt0 = v,    sin bt0 = u.

Then x(t) = C(cos bt cos bt0 + sin bt sin bt0), or

(5)    x(t) = C cos b(t - t0);

and

(6)    y(t) = -bC sin b(t - t0),

as the reader can verify; C and t0 are arbitrary constants.
From (5) and (6) we see that

x^2/C^2 + y^2/(bC)^2 = 1.
Thus the solution curve (x(t), y(t)) goes round and round an ellipse.
Returning to the system (4), the reader has probably recognized that it is equivalent to the second order equation on R

(7)    x'' + b^2 x = 0,

obtained by differentiating the first equation of (4) and then substituting the second. This is the famous equation of "simple harmonic motion," whose general solution is (5).
PROBLEMS
y' = -x;    x(0) = (3, -9);
x(0) = 1,  y(0) = 1.
2. Sketch the phase portraits of each of the differential equations in Problem 1.
3. Let A = [ a  -b ; b  a ] and let x(t) be a solution to x' = Ax, not identically 0. The curve x(t) is of the following form:

(a) a circle if a = 0;
(b) a spiral inward toward (0, 0) if a < 0, b ≠ 0;
(c) a spiral outward away from (0, 0) if a > 0, b ≠ 0.

What effect has the sign of b on the spirals in (b) and (c)? What is the phase portrait if b = 0?
where I = [ 1  0 ; 0  1 ].

(b) Show that there exists a 2 × 2 matrix Q such that AQ = QB.
(Hint: Write out the above equation in the four entries of Q = [q_ij]. Show that the resulting system of four linear homogeneous equations in the four unknowns q_ij has the coefficient matrix of part (a).)
(c) Show that Q can be chosen invertible.
Therefore the system x' = Ax has unique solutions for given initial conditions.
y' = x ;
x(0) = 0, y(0) = -7.
Chapter 4

Linear Systems with Constant Coefficients and Complex Eigenvalues
As we saw in the last section of the preceding chapter, complex numbers enter naturally in the study and solution of real ordinary differential equations. In general the study of operators on complex vector spaces facilitates the solving of linear differential equations. The first part of this chapter is devoted to the linear algebra of complex vector spaces. Subsequently, methods are developed to study almost all first order linear ordinary differential equations with constant coefficients, including those whose associated operator has distinct, though perhaps nonreal, eigenvalues. The meaning of "almost all" will be made precise in Chapter 7.
Observe that the above theorem is stronger than the corresponding theorem in the real case. The latter demanded the further substantial condition that the roots of the characteristic polynomial be real.
Say that an operator T on a complex vector space is semisimple if it is diagonalizable.
The set of fixed points of σ, that is, the set of z such that σ(z) = z, is precisely the set of real numbers in C.
This operation σ, or conjugation, can be extended immediately to C^n by defining σ: C^n → C^n by conjugating each coordinate. Note that

x = (1/2)[(x + iy) + (x - iy)].
§1. COMPLEX VECTOR SPACES
PROBLEMS

2. Let E ⊂ R^n and F ⊂ C^n be subspaces. What relations, if any, exist between dim E and dim E_C? Between dim F and dim F_R?

3. If F ⊂ C^n is any subspace, what relation is there between F and F_RC?

4. Let E be a real vector space and T ∈ L(E). Show that (Ker T)_C = Ker(T_C), (Im T)_C = Im(T_C), and (T^{-1})_C = (T_C)^{-1} if T is invertible.
Proposition.  If T is an operator on a real vector space E, then the set of its eigenvalues is preserved under complex conjugation. Thus if λ is an eigenvalue, so is λ̄. Consequently, we may write the eigenvalues of T as

λ1, . . . , λr,  μ1, μ̄1, . . . , μs, μ̄s,

where the λ's are real and the μ's nonreal.
For the proof we pass to the complexification T_C and apply the theorem of the preceding section together with the above proposition. This yields a basis for E_C

(e1, . . . , er, f1, f̄1, . . . , fs, f̄s)

of eigenvectors of T_C corresponding to the eigenvalues

(λ1, . . . , λr, μ1, μ̄1, . . . , μs, μ̄s).
Now let F_a be the complex subspace of E_C spanned by (e1, . . . , er) and F_b be the subspace spanned by (f1, f̄1, . . . , fs, f̄s). Thus F_a and F_b are invariant subspaces for T_C on E_C and form a direct sum decomposition of E_C:

E_C = F_a ⊕ F_b.
For the proof of Theorem 2, simply let F_i be the complex subspace of E_C spanned by the eigenvectors f_i, f̄_i corresponding to the eigenvalues μ_i, μ̄_i. Then let E_i be F_i ∩ E. The rest follows.
Theorems 1 and 2 reduce in principle the study of an operator with distinct eigenvalues to the case of an operator on a real two-dimensional vector space with nonreal eigenvalues.
A = [ a  -b ]
    [ b   a ].

The study of such a matrix A and the corresponding differential equation on R^2, dx/dt = Ax, was the content of Chapter 3, Section 4.
We now give the proof of Theorem 3.
Let T_C: E_C → E_C be the complexification of T. Since T_C has the same eigenvalues as T, there are eigenvectors φ, φ̄ in E_C belonging to μ̄ = a - ib and μ = a + ib, respectively. Let φ = u + iv with u, v ∈ R^n. Then φ̄ = u - iv. Note that u and v are in E_C, for

u = (1/2)(φ + φ̄),    v = -(i/2)(φ - φ̄).

Hence u and v are in E_C ∩ R^n = E. Moreover, it is easy to see that u and v are independent (use the independence of φ, φ̄). Therefore (u, v) is a basis for E.
To compute the matrix of T in this basis we start from

T_C(u + iv) = (a - bi)(u + iv) = (au + bv) + i(av - bu).

Also

T_C(u + iv) = Tu + iTv.

Therefore

Tu = au + bv,
Tv = -bu + av.

This means that the matrix of T in the basis (u, v) is

[ a  -b ]
[ b   a ],

completing the proof.
In the course of the proof we have found the following interpretation of a complex eigenvalue of a real operator T ∈ L(E), E ⊂ R^n:

Corollary  Let φ ∈ E_C be an eigenvector of T_C belonging to a - ib, b ≠ 0. If φ = u + iv, u, v ∈ R^n, then (u, v) is a basis for E giving T the matrix

[ a  -b ]
[ b   a ].

Note that u and v can be obtained directly from φ and μ (without reference to φ̄) by the formulas in the proof of Theorem 3.
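The corollary can be illustrated concretely (a sketch; the matrix A and the eigenvector below were chosen by hand for this example and are not from the text). For A = [[0, -2], [1, 2]], with eigenvalues 1 ± i, the vector φ = (1 + i, -1) is an eigenvector belonging to 1 - i; its real and imaginary parts u, v give a basis in which A takes the form [[a, -b], [b, a]] with a = b = 1.

```python
A = [[0, -2], [1, 2]]                 # eigenvalues 1 + i and 1 - i
phi = (1 + 1j, -1 + 0j)               # eigenvector belonging to mu = 1 - 1j
mu = 1 - 1j

# confirm A phi = mu phi
Aphi = (A[0][0] * phi[0] + A[0][1] * phi[1],
        A[1][0] * phi[0] + A[1][1] * phi[1])
assert all(abs(p - mu * q) < 1e-12 for p, q in zip(Aphi, phi))

u = [phi[0].real, phi[1].real]        # u = (1, -1)
v = [phi[0].imag, phi[1].imag]        # v = (1, 0)

# change of basis: columns of P are u and v; B = P^{-1} A P
P = [[u[0], v[0]], [u[1], v[1]]]
det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
Pinv = [[P[1][1] / det, -P[0][1] / det],
        [-P[1][0] / det, P[0][0] / det]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

B = matmul(Pinv, matmul(A, P))
print(B)                              # [[1.0, -1.0], [1.0, 1.0]]
```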
§3. APPLICATION OF COMPLEX LINEAR ALGEBRA TO DIFFERENTIAL EQUATIONS
PROBLEM

(3)    dx/dt = Tx,

(4)    dy_i/dt = T_i y_i,    on two-dimensional E_i,
Example 1  Consider the system

x1' = -2x2,
x2' = x1 + 2x2,

whose matrix has eigenvalues 1 + i and 1 - i. Following the proof of Theorem 3 one finds the basis u = (1, -1), v = (1, 0). Thus

x1 = y1 + y2,
x2 = -y1;

the new coordinates are given by

x = Py,    P = [ 1  1 ]
               [ -1  0 ].

Then

P^{-1}AP = [ 1  -1 ]
           [ 1   1 ] = B,

or B = A_{1,1} in the notation of Section 4, Chapter 3.
Thus, as we saw in that section, our differential equation

dx/dt = Ax

on R^2, having the form

dy/dt = By

in the y-coordinates, can be solved as

y1(t) = u e^t cos t - v e^t sin t,
y2(t) = u e^t sin t + v e^t cos t.

The original equation has as its general solution

x1(t) = (u + v) e^t cos t + (u - v) e^t sin t,
x2(t) = -u e^t cos t + v e^t sin t.
Example 2  Consider on R^3 a differential equation x' = Ax, where A has one real eigenvalue and a conjugate pair of nonreal eigenvalues, all distinct. Eigenvectors are found by solving (A - λ)e = 0. Assembling the resulting real basis into the columns of a matrix P, we have transformed our original equation x' = Ax, following the outline given in the beginning of this section, to obtain

y' = By,    y = P^{-1}x,

where B consists of a 1 × 1 diagonal block for the real eigenvalue and a 2 × 2 block of the form A_{a,b} for the complex pair. This can be solved explicitly for y as in the previous example, and from this solution one obtains solutions in terms of the original x-coordinates by x = Py.
dz/dt = T_C z,    z ∈ C^n.

One can make sense of (1_C) as a differential equation either by making definitions directly for derivatives of curves R → C^n or by considering C^n as R^{2n} under the identification R^{2n} → C^n.

dw_i/dt = μ_i w_i,    i = 1, . . . , s.

z(t) = (c1 exp(λ1 t), . . . , cr exp(λr t), c_{r+1} exp(μ1 t), . . . , c_{r+2s} exp(μ̄s t)).

Now it can be checked that if z(0) ∈ R^n, then z(t) ∈ R^n for all t, using formal properties of complex exponentials. This can be a useful approach to the study of (1).
PROBLEM
Solve x' = Tx where T is the operator in (a) and (b) of Problem 1, Section 2.
Chapter 5
Linear Systems and Exponentials
of Operators
The object of this chapter is to solve the linear homogeneous system with constant coefficients

(1)    x' = Ax,

where A is an operator on R^n (or an n × n matrix). This is accomplished with exponentials of operators.
This method of solution is of great importance, although in this chapter we can compute solutions only for special cases. When combined with the operator theory of Chapter 6, the exponential method yields explicit solutions for every system (1).
For every operator A, another operator e^A, called the exponential of A, is defined in Section 4. The function A → e^A has formal properties similar to those of ordinary exponentials of real numbers; indeed, the latter is a special case of the former. Likewise the function t → e^{tA} (t ∈ R) resembles the familiar e^{ta}, where a ∈ R. In particular, it is shown that the solutions of (1) are exactly the maps x: R → R^n given by

x(t) = e^{tA} K    (K ∈ R^n).

Thus we establish existence and uniqueness of solutions of (1); "uniqueness" means that there is only one solution x(t) satisfying a given initial condition of the form x(t0) = K0.
Exponentials of operators are defined in Section 3 by means of an infinite series in the operator space L(R^n); the series is formally the same as the usual exponential series. Convergence is established by means of a special norm on L(R^n), the uniform norm. Norms in general are discussed in Section 2, while Section 1 briefly reviews some basic topology in R^n.
Sections 5 and 6 are devoted to two less-central types of differential equations. One is a simple inhomogeneous system and the other a higher order equation of one variable. We do not, however, follow the heavy emphasis on higher order equations
of some texts. In geometry, physics, and other kinds of applied mathematics, one seldom encounters naturally any differential equation of order higher than two. Often even the second order equations are studied with more insight after reducing to a first order system (for example, in Hamilton's approach to mechanics).
Be(z) c x.
A sequence x_1, x_2, ... in R^n converges to the limit y ∈ R^n if
lim_{k→∞} |x_k − y| = 0.
Equivalently, every neighborhood of y contains all but a finite number of the points of the sequence. We denote this by y = lim_{k→∞} x_k or x_k → y. If x_k = (x_{k1}, ..., x_{kn}) and y = (y_1, ..., y_n), then {x_k} converges to y if and only if lim_{k→∞} x_{kj} = y_j for j = 1, ..., n. A sequence that has a limit is called convergent.
A sequence {x_k} in R^n is a Cauchy sequence if for every ε > 0 there exists an integer k_0 such that
|x_j − x_k| < ε  if  k ≥ k_0 and j ≥ k_0.
The following basic property of Rn is called metric completeness:
A sequence converges to a limit if and only if it is a Cauchy sequence.
A subset Y ⊂ R^n is closed if every convergent sequence of points in Y has its limit in Y. It is easy to see that this is equivalent to: Y is closed if the complement R^n − Y is open.
Let X ⊂ R^n be any subset. A map f: X → R^m is continuous if it takes convergent sequences to convergent sequences. This means: for every sequence {x_k} in X with
lim_{k→∞} x_k = y ∈ X,
it is true that
lim_{k→∞} f(x_k) = f(y).

§2. New Norms for Old
Two frequently used norms on R^n are the max-norm
|x|_max = max{|x_1|, ..., |x_n|}
and the sum-norm
|x|_sum = |x_1| + ... + |x_n|.
Let B = {f_1, ..., f_n} be a basis for R^n and write x = y_1 f_1 + ... + y_n f_n. The Euclidean B-norm of x is
|x|_B = (y_1² + ... + y_n²)^{1/2},
and the B-max-norm of x is
|x|_{B,max} = max{|y_1|, ..., |y_n|}.
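The standard inequalities between these norms are easy to check numerically. The following Python sketch (illustrative only; the helper names are ours, not the text's) compares the max-norm, sum-norm, and Euclidean norm of a vector:

```python
import math

def norm_max(x): return max(abs(c) for c in x)
def norm_sum(x): return sum(abs(c) for c in x)
def norm_euc(x): return math.sqrt(sum(c * c for c in x))

x = [3.0, -4.0, 1.0]
# standard comparisons between the three norms on R^n:
assert norm_max(x) <= norm_euc(x) <= norm_sum(x) <= len(x) * norm_max(x)
```

Inequalities of this kind are exactly what the equivalence of norms, discussed next, makes precise.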
so lim N(x_k) = N(y) in R.
Since N is continuous, it attains a maximum value B and a minimum value A on the closed bounded set {x ∈ R^n : |x| = 1}.
Proof. Let A > 0, B > 0 be as in (4). Suppose (5) holds. Then the inequality
0 ≤ |x_k − y| ≤ A^{−1} N(x_k − y)
shows that lim_{k→∞} |x_k − y| = 0, hence x_k → y. The converse is proved similarly.
(6) for every ε > 0, there exists an integer n_0 > 0 such that if p > n ≥ n_0, then
N(x_p − x_n) < ε.
Proof. Suppose E ⊂ R^n, and consider {x_k} as a sequence in R^n. The condition (6) is equivalent to the Cauchy condition by the equivalence of norms. Therefore (6) is equivalent to convergence of the sequence to some y ∈ R^n. But y ∈ E because subspaces are closed sets.
Σ_{k=1}^{∞} x_k = y
and say the series Σ x_k converges to y. If all the x_k are in a subspace E ⊂ R^n, then also y ∈ E because E is a closed set.
A series Σ x_k in a normed vector space (E, N) is absolutely convergent if the series of real numbers Σ_{k=0}^{∞} N(x_k) is convergent. This condition implies that Σ x_k is convergent in E. Moreover, it is independent of the norm on E, as follows easily from equivalence of norms. Therefore it is meaningful to speak of absolute convergence of a series in a vector space E, without reference to a norm.
A useful criterion for absolute convergence is the comparison test: a series Σ x_k in a normed vector space (E, N) converges absolutely provided there is a convergent series Σ a_k of nonnegative real numbers a_k such that N(x_k) ≤ a_k for all k. For
0 ≤ Σ_{k=n+1}^{p} N(x_k) ≤ Σ_{k=n+1}^{p} a_k;
hence Σ_{k=0}^{∞} N(x_k) converges by applying the Cauchy criterion to the partial sum sequences of Σ N(x_k) and Σ a_k.
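For a concrete instance of the comparison test, take x_k = (2^{−k}, 3^{−k}) in R² with the sum-norm; then N(x_k) ≤ 2·2^{−k}, a convergent geometric series. A small Python check (ours, purely for illustration):

```python
def partial_sum(n):
    # partial sums of the series sum x_k, where x_k = (2^{-k}, 3^{-k})
    s = [0.0, 0.0]
    for k in range(n):
        s[0] += 2.0 ** -k
        s[1] += 3.0 ** -k
    return s

s = partial_sum(200)
# coordinatewise geometric limits: 1/(1 - 1/2) = 2 and 1/(1 - 1/3) = 3/2
assert abs(s[0] - 2.0) < 1e-12 and abs(s[1] - 1.5) < 1e-12
```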
PROBLEMS
1. Prove that the norms described in the beginning of Section 2 actually are
norms.
2. |x|_p is a norm on R^n, where
|x|_p = (|x_1|^p + ... + |x_n|^p)^{1/p},  p ≥ 1.
(d) Let (e_1, ..., e_m) be a basis for E. Show that there is a unique inner product on E such that
(e_i, e_j) = δ_ij for all i, j.
6. Which of the following formulas define norms on R²? (Let (x, y) be the coordinates in R².)
(a) (x² + xy + y²)^{1/2};  (b) (x² − 3xy + y²)^{1/2};
(c) (|x| + |y|)²;  (d) ½(|x| + |y|) + (x² + y²)^{1/2}.
7. Let U ⊂ R^n be a bounded open set containing 0. Suppose U is convex: if x ∈ U and y ∈ U, then the line segment {tx + (1 − t)y | 0 ≤ t ≤ 1} is in U. For each x ∈ R^n define
ρ(x) = least upper bound of {λ ≥ 0 | λx ∈ U}.
Then the function x ↦ ρ(x)^{−1} is a norm on R^n.
8. Let M_n be the vector space of n × n matrices. Denote the transpose of A ∈ M_n by A^t. Show that an inner product (see Problem 5) on M_n is defined by the formula
(A, B) = Tr(A^t B).
Express this inner product in terms of the entries of the matrices A and B.
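In terms of entries the formula reads (A, B) = Σ_{i,j} a_{ij} b_{ij}. A short Python check of this identity (the helper is ours, not from the text):

```python
def inner(A, B):
    # (A, B) = Tr(A^t B) = sum over i, j of a_ij * b_ij
    n = len(A)
    return sum(A[i][j] * B[i][j] for i in range(n) for j in range(n))

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
# Tr(A^t B): A^t B = [[26, 30], [38, 44]], whose trace is 70
assert inner(A, B) == 1*5 + 2*6 + 3*7 + 4*8  # = 70
```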
9. Find the orthogonal complement in M n (see Problem 8) of the subspace of
diagonal matrices.
10. Find a basis for the subspace of M_n of matrices of trace 0. What is the orthogonal complement of this subspace?
§3. Exponentials of Operators
The set L(R^n) of operators on R^n is identified with the set M_n of n × n matrices. This in turn is the same as R^{n²} since a matrix is nothing but a list of n² numbers.
(One chooses an ordering for these numbers.) Therefore L(Rn) is a vector space
under the usual addition and scalar multiplication of operators (or matrices). We
may thus speak of norms on L(Rn), convergence of series of operators, and so on.
A frequently used norm on L(R^n) is the uniform norm. This norm is defined in terms of a given norm on R^n = E, which we shall write as |x|. If T: E → E is an operator, the uniform norm of T is defined to be
||T|| = max{|Tx| : |x| ≤ 1}.
Hence
||T|| ≥ |Ty| = |Tx| / |x|,
where y = x/|x|; that is, |Tx| ≤ ||T|| |x| for all x ≠ 0.
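The uniform norm can be estimated by sampling unit vectors; with the Euclidean norm on R² it coincides with the largest singular value, which numpy computes directly. A sketch (the Euclidean norm and the names `est`, `exact` are our choices):

```python
import numpy as np

T = np.array([[1.0, 2.0], [0.0, 3.0]])
# sample |Tx| over unit vectors x on the Euclidean unit circle
thetas = np.linspace(0.0, 2.0 * np.pi, 1000)
vals = [np.linalg.norm(T @ np.array([np.cos(a), np.sin(a)])) for a in thetas]
est = max(vals)                    # approximates ||T|| = max{|Tx| : |x| <= 1}
exact = np.linalg.norm(T, 2)       # spectral norm = uniform norm for Euclidean |.|
assert est <= exact + 1e-9 and abs(est - exact) < 1e-3
```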
We now define an important series generalizing the usual exponential series. For any operator T: R^n → R^n define
exp(T) = e^T = Σ_{k=0}^{∞} T^k / k!.
(Here k! is k factorial, the product of the first k positive integers if k > 0, and 0! = 1 by definition.) This is a series in the vector space L(R^n).
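The defining series can be summed directly in floating point. The helper below (ours, a truncated sketch rather than a production algorithm) accumulates Σ T^k/k! term by term; for a diagonal matrix the result is the diagonal of ordinary exponentials, and for a nilpotent N with N² = 0 the series terminates at I + N:

```python
import math
import numpy as np

def expm_series(T, terms=30):
    # exp(T) = sum_{k >= 0} T^k / k!, truncated after `terms` terms
    acc = np.zeros_like(T, dtype=float)
    term = np.eye(T.shape[0])
    for k in range(terms):
        acc = acc + term
        term = term @ T / (k + 1)
    return acc

# diagonal case: exp(diag(a, b)) = diag(e^a, e^b)
D = np.diag([1.0, -2.0])
assert np.allclose(expm_series(D), np.diag([math.e, math.exp(-2)]))

# nilpotent case: N^2 = 0, so exp(N) = I + N exactly
N = np.array([[0.0, 1.0], [0.0, 0.0]])
assert np.allclose(expm_series(N), np.eye(2) + N)
```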
Proposition Let P, S, T be operators on R^n. Then:
(a) if Q = PTP^{−1}, then e^Q = P e^T P^{−1};
(b) if ST = TS, then e^{S+T} = e^S e^T;
(c) e^{−S} = (e^S)^{−1};
(d) if T = [a  -b]
           [b   a],
then e^T = e^a [cos b  -sin b]
               [sin b   cos b].
The proof of (a) follows from the identities P(A + B)P^{−1} = PAP^{−1} + PBP^{−1} and (PTP^{−1})^k = PT^kP^{−1}.
of Chapter 3, which preserves sums, products, and real multiples. It is easy to see that it also preserves limits. Therefore
e^T corresponds to e^a e^{ib},
where e^{ib} is the complex number Σ_{k=0}^{∞} (ib)^k / k!. Using i² = −1, we find the real part of e^{ib} to be the sum of the Taylor series (at 0) for cos b; similarly, the imaginary part is sin b. This proves (d).
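Property (d) can be checked numerically against the series. The sketch below (with our ad hoc truncated-series helper) compares the exponential of the block [a  −b; b  a] with e^a times the rotation matrix:

```python
import numpy as np

def expm_series(T, terms=40):
    acc, term = np.zeros_like(T, dtype=float), np.eye(T.shape[0])
    for k in range(terms):
        acc = acc + term
        term = term @ T / (k + 1)
    return acc

a, b = 0.5, 1.2
S = np.array([[a, -b], [b, a]])
R = np.exp(a) * np.array([[np.cos(b), -np.sin(b)], [np.sin(b), np.cos(b)]])
assert np.allclose(expm_series(S), R)
```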
Observe that (c) implies that e^S is invertible for every operator S. This is analogous to the fact that e^s ≠ 0 for every real number s.
As an example we compute the exponential of
T = [a  b]
    [0  a].
We write
T = aI + B,   B = [0  b]
                  [0  0].
Thus
We can now compute e^A for any 2 × 2 matrix A. We will see in Chapter 6 that one can find an invertible matrix P such that the matrix
B = PAP^{−1}
has one of the following forms:
For (2),
e^B = e^a [cos b  -sin b]
          [sin b   cos b],
as was shown in the proposition above. For (3),
e^B = e^λ [1  0]
          [1  1],
as we have just seen. Therefore e^A can be computed from the formula
e^A = P^{−1} e^B P.
There is a very simple relationship between the eigenvectors of T and those of e^T: if x is an eigenvector of T belonging to the eigenvalue λ, then x is also an eigenvector of e^T belonging to e^λ, for
e^T x = Σ_{k=0}^{∞} T^k x / k! = Σ_{k=0}^{∞} λ^k x / k! = e^λ x.
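A numerical illustration of the eigenvector relation (our truncated-series helper again; the matrix and eigenvector are arbitrary choices):

```python
import numpy as np

def expm_series(T, terms=40):
    acc, term = np.zeros_like(T, dtype=float), np.eye(T.shape[0])
    for k in range(terms):
        acc = acc + term
        term = term @ T / (k + 1)
    return acc

T = np.array([[2.0, 1.0], [0.0, 3.0]])
x = np.array([1.0, 1.0])            # T x = (3, 3) = 3 x, so x is an eigenvector
assert np.allclose(T @ x, 3 * x)
assert np.allclose(expm_series(T) @ x, np.exp(3) * x)   # e^T x = e^3 x
```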
We conclude this section with the observation that all that has been said for exponentials of operators on R^n also holds for operators on the complex vector space C^n. This is because C^n can be considered as the real vector space R^{2n} by simply ignoring nonreal scalars; every complex operator is a fortiori a real operator. In addition, the preceding statement about eigenvectors is equally valid when complex eigenvalues of an operator on C^n are considered; the proof is the same.
PROBLEMS
1. Let N be any norm on L(R^n). Prove that there is a constant K such that
N(ST) ≤ K N(S) N(T)
for all operators S, T. Why must K ≥ 1?
2. Let T: R^n → R^m be a linear transformation. Show that T is uniformly continuous: for all ε > 0 there exists δ > 0 such that if |x − y| < δ then |Tx − Ty| < ε.
3. Let T: R^n → R^n be an operator. Show that
||T|| = least upper bound of {|Tx| / |x| : x ≠ 0}.
4. Find the uniform norm of each of the following operators on R2:
5. Let
(b) Show that for every ε > 0 there is a basis B of R² for which
||T||_B < 3 + ε.
(e) 1A :I
0 0 0
(f) [:: :] [: : "1
0 1 3
(g)
0 1 x
(h) [i
0 -i
'1 (i) [' 2 l+i
1 0 0 0
§4. Homogeneous Linear Systems

Consider the map R → L(R^n), t ↦ e^{tA}, for a fixed operator A. Since L(R^n) is identified with R^{n²}, it makes sense to speak of the derivative of this map.
Proposition
(d/dt) e^{tA} = A e^{tA}.
Proof.
(d/dt) e^{tA} = lim_{h→0} (e^{(t+h)A} − e^{tA}) / h
             = e^{tA} lim_{h→0} (e^{hA} − I) / h
             = e^{tA} A;
that the last limit equals A follows from the series definition of e^{hA}. Note that A commutes with each term of the series for e^{tA}, hence with e^{tA}. This proves the proposition.
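The proposition can be sanity-checked with a centered difference quotient (a numerical sketch, not a proof; the matrix, step size, and tolerance are ours):

```python
import numpy as np

def expm_series(T, terms=40):
    acc, term = np.zeros_like(T, dtype=float), np.eye(T.shape[0])
    for k in range(terms):
        acc = acc + term
        term = term @ T / (k + 1)
    return acc

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
t, h = 0.7, 1e-6
# centered difference approximation of (d/dt) e^{tA}
deriv = (expm_series((t + h) * A) - expm_series((t - h) * A)) / (2 * h)
assert np.allclose(deriv, A @ expm_series(t * A), atol=1e-8)
```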
We can now solve equation (1). We recall from Chapter 1 that the general solution of the scalar equation
x' = ax  (a ∈ R)
is
x(t) = ke^{ta};  k = x(0).
The same is true where x, a, and k are allowed to be complex numbers (Chapter 3).
These results are special cases of the following, which can be considered as the
fundamental theorem of linear differential equations with constant coefficients.
Theorem Let A be an operator on R^n. Then the solution of the initial value problem
(1') x' = Ax,  x(0) = K ∈ R^n,
is
(2) x(t) = e^{tA} K,
and there are no other solutions.
Proof. The preceding lemma shows that
(d/dt) e^{tA} K = A e^{tA} K;
since e^{0A} K = K, it follows that (2) is a solution of (1'). To see that there are no other solutions, let x(t) be any solution of (1') and put
y(t) = e^{−tA} x(t).
Then
y'(t) = −A e^{−tA} x(t) + e^{−tA} A x(t)
      = e^{−tA} (−A + A) x(t)
      = 0.
Therefore y(t) is a constant. Setting t = 0 shows y(t) = K. This completes the proof of the theorem.
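The theorem can be illustrated by comparing e^{tA}K with a crude numerical integration of x' = Ax (Euler steps; the matrix, step size, and tolerance are our choices):

```python
import numpy as np

def expm_series(T, terms=60):
    acc, term = np.zeros_like(T, dtype=float), np.eye(T.shape[0])
    for k in range(terms):
        acc = acc + term
        term = term @ T / (k + 1)
    return acc

A = np.array([[0.0, 1.0], [-1.0, 0.0]])    # rotation vector field
K = np.array([1.0, 0.0])

# integrate x' = Ax with small Euler steps up to t = 1
x, dt = K.copy(), 1e-4
for _ in range(10000):
    x = x + dt * (A @ x)

exact = expm_series(1.0 * A) @ K            # the solution e^{tA} K at t = 1
assert np.allclose(x, exact, atol=1e-3)
```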
x_2' = bx_1 + ax_2.

In Section 3 we saw that
B = PAP^{−1} = [λ  0]
               [0  μ],   λ < 0 < μ.
In the (y_1, y_2) plane the phase portrait looks like Fig. A on p. 91.
Case II. All eigenvalues have negative real parts. This important case is called a sink. It has the characteristic property that
lim_{t→∞} x(t) = 0
for every solution x(t). If A is diagonal, this is obvious, for the solutions are
y(t) = (c_1 e^{λt}, c_2 e^{μt});  λ < 0, μ < 0.
FIG. B. Focus: B = [λ  0; 0  λ], λ < 0.
Solutions are of the form x(t) = P y(t) with y(t) as above and P ∈ L(R²); clearly, x(t) → 0 as t → ∞.
The phase portrait for these subcases looks like Fig. B if the eigenvalues are equal (a focus) and like Fig. C if they are unequal (a node).
If the eigenvalues are negative but A is not diagonalizable, there is a change of coordinates x = Py (see Chapter 6) giving the equivalent equation
y' = By,
where
B = [λ  0]
    [1  λ],   λ < 0.
Case III. All eigenvalues have positive real part. In this case, called a source, we have
lim_{t→∞} |x(t)| = ∞  and  lim_{t→−∞} |x(t)| = 0.
A proof similar to that of Case II can be given; the details are left to the reader.
The phase portraits are like Figs. B-E with the arrows reversed.
Case IV. The eigenvalues are pure imaginary. This is called a center. It is characterized by the property that all solutions are periodic with the same period. To see this, change coordinates to obtain the equivalent equation
y' = By,   B = [0  -b]
               [b   0].
We know that
e^{tB} = [cos tb  -sin tb]
         [sin tb   cos tb].
Therefore if y(t) is any solution, y(t + 2π/b) = y(t): every solution is periodic with period 2π/b.
FIG. G. The (Tr A, Det A) plane; saddles correspond to Det A < 0.
The phase portrait in the y-coordinates consists of concentric circles. In the original x-coordinates the orbits may be ellipses as in Fig. F. (If b < 0, the arrows point clockwise.)
Figure G summarizes the geometric information about the phase portrait of x' = Ax that can be deduced from the characteristic polynomial of A. We write this polynomial as
λ² − (Tr A)λ + Det A.
The discriminant Δ is defined to be
Δ = (Tr A)² − 4 Det A.
The eigenvalues are
½(Tr A ± √Δ).
Thus real eigenvalues correspond to the case Δ ≥ 0; the eigenvalues have negative real part when Tr A < 0 and Det A > 0; and so on.
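The classification by trace and determinant is mechanical enough to code. The following sketch handles only the nondegenerate cases and uses our own labels:

```python
def classify(A):
    # phase-portrait type of x' = Ax from trace and determinant
    # (2x2 real A; nondegenerate cases only)
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det          # discriminant (Tr A)^2 - 4 Det A
    if det < 0:
        return "saddle"               # real eigenvalues of opposite sign
    if tr == 0:
        return "center"               # pure imaginary eigenvalues (det > 0)
    kind = "spiral" if disc < 0 else "node/focus"
    return ("sink " if tr < 0 else "source ") + kind

assert classify([[1, 0], [0, -1]]) == "saddle"
assert classify([[0, -1], [1, 0]]) == "center"
assert classify([[-1, -1], [1, -1]]) == "sink spiral"
assert classify([[2, 0], [0, 1]]) == "source node/focus"
```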
The geometric interpretation of x' = Ax is as follows (compare Chapter 1). The map R^n → R^n which sends x into Ax is a vector field on R^n. Given a point K of R^n, there is a unique curve t ↦ e^{tA}K which starts at K at time zero, and is a solution of (1). (We interpret t as time.) The tangent vector to this curve at a time t_0 is the vector Ax(t_0) of the vector field at the point x(t_0) of the curve.
We may think of points of R^n flowing simultaneously along these solution curves. The position of a point x ∈ R^n at time t is denoted by φ_t(x); thus for each t ∈ R we have a map φ_t: R^n → R^n given by
φ_t(x) = e^{tA} x.
The collection of maps {φ_t}, t ∈ R, is called the flow corresponding to the differential equation (1). This flow has the basic property
φ_{s+t} = φ_s ∘ φ_t,
which is just another way of writing
e^{(s+t)A} = e^{sA} e^{tA};
this is proved in the proposition in Section 3. The flow is called linear because each map φ_t: R^n → R^n is a linear map. In Chapter 8 we shall define more general nonlinear flows.
The phase portraits discussed above give a good visualization of the correspond-
ing flows. Imagine points of the plane all moving a t once along the curves in the
direction of the arrows. (The origin stays put.)
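The flow property φ_{s+t} = φ_s ∘ φ_t amounts to e^{(s+t)A} = e^{sA} e^{tA}, which can be verified numerically (the series helper and test values are ours):

```python
import numpy as np

def expm_series(T, terms=60):
    acc, term = np.zeros_like(T, dtype=float), np.eye(T.shape[0])
    for k in range(terms):
        acc = acc + term
        term = term @ T / (k + 1)
    return acc

A = np.array([[1.0, 2.0], [0.0, -1.0]])
s, t = 0.3, 0.5
phi = lambda u: expm_series(u * A)                 # the flow map phi_u = e^{uA}
assert np.allclose(phi(s + t), phi(s) @ phi(t))    # phi_{s+t} = phi_s . phi_t
assert np.allclose(phi(0.0), np.eye(2))            # phi_0 = identity
```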
PROBLEMS

1. Find the general solution of each of the following systems:
(a) x' = 2x,        (b) x' = 2x − y,
    y' = 2y;            y' = x + 2y;
(c) x' = y − 2x,    (d) x' = −2x,
    y' = x;             y' = x − 2y;
(e) x' = y + z,
    y' = z,
    z' = 0.

2. In (a), (b), and (c) of Problem 1, find the solutions satisfying each of the following initial conditions:
(a) x(0) = 1, y(0) = −2;  (b) x(0) = 0, y(0) = −2;  (c) x(0) = 0, y(0) = 0.
Let A: R^n → R^n be an operator that leaves a subspace E ⊂ R^n invariant. Let x: R → R^n be a solution of x' = Ax. If x(t_0) ∈ E for some t_0 ∈ R, show that x(t) ∈ E for all t ∈ R.
Suppose A ∈ L(R^n) has a real eigenvalue λ < 0. Then the equation x' = Ax
8. Classify and sketch the phase portraits of planar differential equations x' =
Ax, A E L(R2), where A has zero as an eigenvalue.
9. For each of the following matrices A consider the corresponding differential
equation x' = Ax. Decide whether the origin is a sink, source, saddle, or none
of these. Identify in each case those vectors u such that lim_{t→∞} x(t) = 0, where x(t) is the solution with x(0) = u:
10. Which values (if any) of the parameter k in the following matrices make the origin a sink for the corresponding differential equation x' = Ax?
11. Let φ_t: R² → R² be the flow corresponding to the equation x' = Ax. (That is, t ↦ φ_t(x) is the solution passing through x at t = 0.) Fix τ > 0, and show that φ_τ is a linear map of R² → R². Then show that φ_τ preserves area if and only if Tr A = 0, and that in this case the origin is not a sink or a source. (Hint: An operator is area-preserving if and only if the determinant is ±1.)
12. Describe in words the phase portraits of x’ = Ax for
13. Suppose A is an n × n matrix with n distinct eigenvalues and the real part of every eigenvalue is less than some negative number α. Show that for every solution to x' = Ax, there exists t_0 > 0 such that
|x(t)| < e^{tα}  if  t ≥ t_0.
14. Let T be an invertible operator on R^n, n odd. Then x' = Tx has a nonperiodic solution.
15. Let A = [a  b]
            [c  d]
have nonreal eigenvalues. Then b ≠ 0. The nontrivial solution curves of x' = Ax are spirals or ellipses that are oriented clockwise if b > 0 and counterclockwise if b < 0. (Hint: Consider the sign of
(d/dt) arctan(x_2(t)/x_1(t)).)
§5. A Nonhomogeneous Equation

f(t) = ∫_0^t e^{−sA} B(s) ds + K,
so as a candidate for a solution of (1) we have
(3) x(t) = e^{tA} [ ∫_0^t e^{−sA} B(s) ds + K ].
Let us examine (3) to see that it indeed makes sense. The integrand in (3) and the previous equation is the vector-valued function s ↦ e^{−sA} B(s) mapping R into R^n. In fact, for any continuous map g of the reals into a vector space R^n, the integral can be defined as an element of R^n. Given a basis of R^n, this integral is a vector whose coordinates are the integrals of the coordinate functions of g.
The integral as a function of its upper limit t is a map from R into R^n. For each t the operator e^{tA} acts on the integral to give an element of R^n. So t ↦ x(t) is a well-defined map from R into R^n.
To check that (3) is a solution of (1), we differentiate x(t) in (3):
x'(t) = B(t) + A e^{tA} [ ∫_0^t e^{−sA} B(s) ds + K ]
      = B(t) + Ax(t).
x_2' = x_1 + t.
Here
A = [0  -1]      B(t) = [0]
    [1   0],            [t].
Hence
∫_0^t e^{−sA} B(s) ds = [sin t − t cos t      ]
                        [cos t + t sin t − 1].
To compute (3) we set
e^{tA} = [cos t  -sin t]
         [sin t   cos t];
hence the general solution is
x(t) = [cos t  -sin t] [sin t − t cos t + K_1     ]
       [sin t   cos t] [cos t + t sin t − 1 + K_2].
Performing the matrix multiplication and simplifying yields
x_1(t) = −t + K_1 cos t + (1 − K_2) sin t,
x_2(t) = 1 − (1 − K_2) cos t + K_1 sin t.
This is the solution whose value at t = 0 is
x_1(0) = K_1,  x_2(0) = K_2.
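The formulas just obtained can be checked against the system itself, reading the first equation x_1' = −x_2 off the printed solution (a numerical sketch; the constants, step size, and tolerances are ours):

```python
import math

def x1(t, K1, K2): return -t + K1 * math.cos(t) + (1 - K2) * math.sin(t)
def x2(t, K1, K2): return 1 - (1 - K2) * math.cos(t) + K1 * math.sin(t)

K1, K2, h = 0.7, -1.3, 1e-6
for t in [0.0, 0.5, 2.0]:
    d1 = (x1(t + h, K1, K2) - x1(t - h, K1, K2)) / (2 * h)
    d2 = (x2(t + h, K1, K2) - x2(t - h, K1, K2)) / (2 * h)
    assert abs(d1 - (-x2(t, K1, K2))) < 1e-6      # x1' = -x2
    assert abs(d2 - (x1(t, K1, K2) + t)) < 1e-6   # x2' = x1 + t
assert x1(0, K1, K2) == K1 and abs(x2(0, K1, K2) - K2) < 1e-12
```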
PROBLEMS

§6. Higher Order Systems

The second order equation
(1) s'' + as' + bs = 0
is equivalent to the first order system:
(2) x_1' = x_2,
    x_2' = −bx_1 − ax_2.
Thus if x(t) = (x_1(t), x_2(t)) is a solution of (2), then s(t) = x_1(t) is a solution of (1); if s(t) is a solution of (1), then x(t) = (s(t), s'(t)) is a solution of (2).
This procedure of introducing new variables works very generally to reduce higher order equations to first order ones. Thus consider
(3) s^{(n)} + a_1 s^{(n−1)} + ... + a_{n−1} s' + a_n s = 0.
Here s is a real function of t and s^{(n)} is the nth derivative of s, while a_1, ..., a_n are constants.
In this case the new variables are x_1 = s, x_2 = x_1', ..., x_n = x_{n−1}', and the equation (3) is equivalent to the system
(4) x_1' = x_2,
    x_2' = x_3,
    ...
    x_n' = −a_n x_1 − a_{n−1} x_2 − ... − a_1 x_n.
In vector notation (4) has the form x' = Ax, where A is the matrix
(4')
A = [  0      1      0    ...    0  ]
    [  0      0      1    ...    0  ]
    [  .                         .  ]
    [  0      0      0    ...    1  ]
    [-a_n  -a_{n-1}       ...  -a_1 ].
Proposition The characteristic polynomial of (4') is
p(λ) = λ^n + a_1 λ^{n−1} + ... + a_n.
Proof. One uses induction on n. For n = 2, this is easily checked. Assume the truth of the proposition for n − 1, and let A_{n−1} be the (n − 1) × (n − 1) submatrix of A consisting of the last (n − 1) rows and last (n − 1) columns. Then Det(λI − A) is easily computed to be λ Det(λI − A_{n−1}) + a_n by expanding along the first column. The induction hypothesis yields the desired characteristic polynomial.
The point of the proposition is that it gives the characteristic polynomial directly
from the equation for the higher order differential equation (3).
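To illustrate, one can build the companion matrix (4') for given coefficients and confirm that its eigenvalues are the roots of λ^n + a_1 λ^{n−1} + ... + a_n (the helper below is our own):

```python
import numpy as np

def companion(coeffs):
    # companion matrix (4') for s^(n) + a1 s^(n-1) + ... + an s = 0,
    # where coeffs = [a1, ..., an]
    n = len(coeffs)
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)                   # superdiagonal of 1's
    A[-1, :] = [-a for a in reversed(coeffs)]    # last row: -a_n, ..., -a_1
    return A

A = companion([7.0, 14.0, 8.0])   # p(x) = x^3 + 7x^2 + 14x + 8 = (x+1)(x+2)(x+4)
eigs = sorted(np.linalg.eigvals(A).real)
assert np.allclose(eigs, [-4.0, -2.0, -1.0])
```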
at first that these roots are real and distinct. Then (1) reduces to the first order equation (2); one can find a diagonalizing system of coordinates (y_1, y_2). Every solution of (2) in these coordinates is then y_1(t) = K_1 exp(λ_1 t), y_2(t) = K_2 exp(λ_2 t), with arbitrary constants K_1, K_2. Thus x_1(t) = s(t) is a certain linear combination s(t) = p_{11} K_1 exp(λ_1 t) + p_{12} K_2 exp(λ_2 t). We conclude that if λ_1, λ_2 are real and distinct then every solution of (1) is of the form
s(t) = C_1 exp(λ_1 t) + C_2 exp(λ_2 t)
for some (real) constants C_1, C_2. These constants can be found if initial values s(t_0), s'(t_0) are given.
Next, suppose that λ_1 = λ_2 = λ and that these eigenvalues are real. In this case the 2 × 2 matrix in (2) is similar to a matrix of the form
[λ  0]
[1  λ].
We find that
s'(t) = (−C_1 + C_2)e^{−t} − C_2 t e^{−t}.
From the initial conditions in (5) we get, setting t = 0 in the last two formulas,
C_1 = 1,
−C_1 + C_2 = 2.
Hence C_2 = 3 and the solution to (5) is
s(t) = e^{−t} + 3te^{−t}.
The reader may verify that this actually is a solution to (5)!
The final case to consider is when λ_1, λ_2 are nonreal complex conjugate numbers. Suppose λ_1 = u + iv, λ_2 = u − iv. Then we get a solution (as in Chapter 3):
y_1(t) = e^{ut}(K_1 cos vt − K_2 sin vt),
y_2(t) = e^{ut}(K_1 sin vt + K_2 cos vt).
Thus we obtain s(t) as a linear combination of y_1(t) and y_2(t), so that finally,
s(t) = e^{ut}(C_1 cos vt + C_2 sin vt)
for some constants C_1, C_2.
A special case of the last equation is the "harmonic oscillator":
s'' + b²s = 0;
the eigenvalues are ±ib, and the general solution is
C_1 cos bt + C_2 sin bt.
We summarize what we have found.
Theorem Let λ_1, λ_2 be the roots of the polynomial λ² + aλ + b. Then every solution of the differential equation
(1) s'' + as' + bs = 0
is of the following type:
Case (a). λ_1, λ_2 are real distinct: s(t) = C_1 exp(λ_1 t) + C_2 exp(λ_2 t);
Case (b). λ_1 = λ_2 = λ is real: s(t) = C_1 e^{λt} + C_2 t e^{λt};
Case (c). λ_1 = u + iv, λ_2 = u − iv, v ≠ 0: s(t) = e^{ut}(C_1 cos vt + C_2 sin vt).
In each case C_1, C_2 are (real) constants determined by initial conditions of the form
s(t_0) = α,  s'(t_0) = β.
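The three cases of the theorem translate directly into a solver for initial conditions at t_0 = 0. The worked example above appears to be s'' + 2s' + s = 0 with s(0) = 1, s'(0) = 2, and the sketch below (function name and case handling are ours) recovers its solution e^{−t} + 3te^{−t}:

```python
import math

def solve_2nd_order(a, b, s0, sp0):
    # s(t) for s'' + a s' + b s = 0 with s(0) = s0, s'(0) = sp0,
    # following the three cases of the theorem
    disc = a * a - 4 * b
    if disc > 0:                                  # case (a): real distinct roots
        l1 = (-a + math.sqrt(disc)) / 2
        l2 = (-a - math.sqrt(disc)) / 2
        C2 = (sp0 - l1 * s0) / (l2 - l1)
        C1 = s0 - C2
        return lambda t: C1 * math.exp(l1 * t) + C2 * math.exp(l2 * t)
    if disc == 0:                                 # case (b): double real root
        l = -a / 2
        C1, C2 = s0, sp0 - l * s0
        return lambda t: (C1 + C2 * t) * math.exp(l * t)
    u, v = -a / 2, math.sqrt(-disc) / 2           # case (c): roots u +- iv
    C1, C2 = s0, (sp0 - u * s0) / v
    return lambda t: math.exp(u * t) * (C1 * math.cos(v * t) + C2 * math.sin(v * t))

# s'' + 2s' + s = 0, s(0) = 1, s'(0) = 2  ->  s(t) = e^{-t} + 3te^{-t}
s = solve_2nd_order(2.0, 1.0, 1.0, 2.0)
assert abs(s(1.0) - (math.exp(-1) + 3 * math.exp(-1))) < 1e-12
# harmonic oscillator s'' + 4s = 0, s(0) = 0, s'(0) = 2  ->  s(t) = sin 2t
s = solve_2nd_order(0.0, 4.0, 0.0, 2.0)
assert abs(s(0.5) - math.sin(1.0)) < 1e-12
```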
The nth order linear equation (3) can also be solved by changing it to an equivalent first order system. First order systems that come from nth order equations have special properties which enable them to be solved quite easily. To understand the method of solution requires more linear algebra, however. We shall return to higher order equations in the next chapter.
We make a simple but important observation about the linear homogeneous equation (3):
If s(t) and q(t) are solutions to (3), so is the function s(t) + q(t); if k is any real number, then ks(t) is a solution.
In other words, the set of all solutions is a vector space. And since n initial conditions determine a solution uniquely (consider the corresponding first order system), the dimension of the vector space of solutions equals the order of the differential equation.
A higher order inhomogeneous linear equation
(6) s^{(n)} + a_1 s^{(n−1)} + ... + a_n s = b(t)
can be solved (in principle) by reducing it to a first order inhomogeneous linear system
x' = Ax + B(t)
and applying variation of constants (Section 5). Note that
B(t) = (0, ..., 0, b(t)).
As in the case of first order systems, the general solution to (6) can be expressed as the general solution to the corresponding homogeneous equation
s^{(n)} + a_1 s^{(n−1)} + ... + a_n s = 0,
plus a particular solution of (6).
PROBLEMS
5. State and prove a generalization of Problem 4 for nth order differential equations
s^{(n)} + a_1 s^{(n−1)} + ... + a_n s = 0,
where the polynomial
λ^n + a_1 λ^{n−1} + ... + a_n
has n distinct roots with negative real parts.
Chapter 6
Linear Systems and Canonical Forms of Operators

The aim of this chapter is to achieve deeper insight into the solutions of the differential equation
(1) x' = Ax,  A ∈ L(E),  E = R^n,
by decomposing the operator A into operators of particularly simple kinds. In Sections 1 and 2 we decompose the vector space E into a direct sum
E = E_1 ⊕ ... ⊕ E_r
and A into a direct sum
A = A_1 ⊕ ... ⊕ A_r;  A_k ∈ L(E_k).
Each A_k can be expressed as a sum
A_k = S_k + N_k;  S_k, N_k ∈ L(E_k),
with S_k semisimple (that is, its complexification is diagonalizable), and N_k nilpotent (that is, (N_k)^m = 0 for some m); moreover, S_k and N_k commute. This reduces the series for e^{tA} to a finite sum which is easily computed. Thus solutions to (1) can be found for any A.
Section 3 is devoted to nilpotent operators. The goal is a special, essentially unique matrix representation of a nilpotent operator. This special matrix is applied in Section 4 to the nilpotent part of any operator T to produce special matrices for T called the Jordan form; and for operators on real vector spaces, the real canonical form. These forms make the structure of the operator quite clear.
In Section 5 solutions of the differential equation x' = Ax are studied by means
of the real canonical form of A . It is found that all solutions are linear combinations
of certain simple functions. Important information about the nature of the solu-
tions can be obtained without explicitly solving the equation.
§1. The Primary Decomposition

In this section we state a basic decomposition theorem for operators; the proof is given in Appendix III. It is not necessary to know the proof in order to use the theorem, however.
In the rest of this section T denotes an operator on a vector space E, which may
be real or complex; but if E is real it is assumed that all eigenvalues of T are real.
Let the characteristic polynomial of T be given as the product
p(t) = Π_{k=1}^{m} (t − λ_k)^{n_k}.
Here λ_1, ..., λ_m are the distinct roots of p(t), and the integer n_k ≥ 1 is the multiplicity of λ_k; note that n_1 + ... + n_m = dim E.
Let us see what this decomposition means. Suppose first that there is only one eigenvalue λ, of multiplicity n = dim E. The theorem implies E = E(T, λ). Put
N = T − λI,  S = λI.
Then, clearly, T = N + S and SN = NS. Moreover, S is diagonal (in every basis) and N is nilpotent, for E = E(T, λ) = Ker N^n. We can therefore immediately compute
e^T = e^S e^N = e^λ Σ_{k=0}^{n−1} N^k / k!;
there is no difficulty in finding it.
Example 1 Let
T = [1  -1]
    [1   3].
The characteristic polynomial is
p(t) = t² − 4t + 4 = (t − 2)².
There is only one eigenvalue, 2, of multiplicity 2. Hence
S = [2  0]        N = T − S = [-1  -1]
    [0  2],                   [ 1   1].
More generally,
e^{tT} = e^{tS} e^{tN} = e^{2t}(I + tN)
       = e^{2t} [1 − t    −t  ]
                [  t     1 + t].
Thus the method applies directly to solving the differential equation x' = Tx (see the previous chapter).
For comparison, try to compute directly the limit of the partial sums of the series Σ_k (tT)^k / k!.
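With the matrices as reconstructed in Example 1, the identity e^{tT} = e^{2t}(I + tN) can be confirmed numerically (the series helper is ours):

```python
import numpy as np

def expm_series(T, terms=60):
    acc, term = np.zeros_like(T, dtype=float), np.eye(T.shape[0])
    for k in range(terms):
        acc = acc + term
        term = term @ T / (k + 1)
    return acc

T = np.array([[1.0, -1.0], [1.0, 3.0]])
S, N = 2 * np.eye(2), np.array([[-1.0, -1.0], [1.0, 1.0]])
assert np.allclose(S + N, T) and np.allclose(N @ N, 0) and np.allclose(S @ N, N @ S)

t = 0.8
closed_form = np.exp(2 * t) * (np.eye(2) + t * N)   # e^{tT} = e^{2t}(I + tN)
assert np.allclose(expm_series(t * T), closed_form)
```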
and S is diagonalized by a basis for E which is made up of bases for the generalized eigenspaces.
We have proved:
Theorem 2 Each operator T as above can be written T = S + N, where SN = NS, S is diagonalizable, and N is nilpotent.
Example 2 Let T be the operator on R³ whose matrix in standard coordinates is
T_0 = [-1   1  -2]
      [ 0  -1   4]
      [ 0   0   1].
one can verify that the vector
a_3 = (0, 2, 1)
is a basis for the eigenspace belonging to the eigenvalue 1.
Let B be the basis {a_1, a_2, a_3} of R³. Let T = S + N be as in Theorem 2. In B-coordinates, S has the matrix
S_1 = diag{−1, −1, 1};
this follows from the eigenvalues of T being −1, −1, 1. Let S_0 be the matrix of S in standard coordinates. Then
S_0 = P^{−1} S_1 P,
where P is the inverse of the transpose of the matrix whose rows are a_1, a_2, a_3. Hence
(P^{−1})^t = [1  0  0]
             [0  1  0]
             [0  2  1].
Therefore
P = [1  0   0]
    [0  1  -2]
    [0  0   1].
Matrix multiplication gives
S_0 = [-1   0   0]
      [ 0  -1   4]
      [ 0   0   1].
We can now find the matrix N_0 of N in the standard basis for R³:
N_0 = T_0 − S_0 = [0  1  -2]
                  [0  0   0]
                  [0  0   0].
We have now computed the matrices of S and N. The reader might verify that N_0² = 0 and S_0 N_0 = N_0 S_0.
We compute the matrix in standard coordinates of e^S not by computing the matrix e^{S_0} directly from the definition, which involves an infinite series, but as follows:
exp(S_0) = exp(P^{−1} S_1 P) = P^{−1} exp(S_1) P,
where exp(S_1) = diag{e^{−1}, e^{−1}, e}; this turns out to be
exp(S_0) = [e^{-1}    0          0        ]
           [  0     e^{-1}  -2e^{-1} + 2e ]
           [  0       0          e        ].
It is easy to compute exp(N_0):
exp(N_0) = I + N_0 = [1  1  -2]
                     [0  1   0]
                     [0  0   1].
Finally, we obtain
exp(T_0) = exp(S_0) exp(N_0) = [e^{-1}   e^{-1}    -2e^{-1}     ]
                               [  0      e^{-1}  -2e^{-1} + 2e  ]
                               [  0        0           e        ].
It is no more difficult to compute e^{tT_0}, t ∈ R. Replacing T_0 by tT_0 transforms S_0 to tS_0, N_0 to tN_0, and so on; the point is that the same matrix P is used for all values of t. One obtains
exp(tT_0) = exp(tS_0) exp(tN_0)
          = [e^{-t}    0           0         ] [1  t  -2t]
            [  0     e^{-t}  -2e^{-t} + 2e^t ] [0  1   0 ]
            [  0       0          e^t        ] [0  0   1 ].
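The computations of Example 2, with the matrices as reconstructed above, can be verified numerically: N_0 is nilpotent of order 2, S_0 and N_0 commute, and e^{tT_0} = e^{tS_0}(I + tN_0) (series helper ours):

```python
import numpy as np

def expm_series(T, terms=60):
    acc, term = np.zeros_like(T, dtype=float), np.eye(T.shape[0])
    for k in range(terms):
        acc = acc + term
        term = term @ T / (k + 1)
    return acc

T0 = np.array([[-1.0, 1.0, -2.0], [0.0, -1.0, 4.0], [0.0, 0.0, 1.0]])
S0 = np.array([[-1.0, 0.0, 0.0], [0.0, -1.0, 4.0], [0.0, 0.0, 1.0]])
N0 = T0 - S0
assert np.allclose(N0 @ N0, 0) and np.allclose(S0 @ N0, N0 @ S0)

t = 0.6
# e^{tT0} = e^{tS0}(I + tN0), computed two ways
assert np.allclose(expm_series(t * T0),
                   expm_series(t * S0) @ (np.eye(3) + t * N0))
```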
Theorem 3 Let A be any operator on a real or complex vector space E. Let its characteristic polynomial be
p(t) = Σ_{k=0}^{n} a_k t^k.
Then p(A) = 0; that is,
Σ_{k=0}^{n} a_k A^k(x) = 0
for all x ∈ E.
Proof. We may assume E = R^n or C^n; since an operator on R^n and its complexification have the same characteristic polynomial, there is no loss of generality in assuming E is a complex vector space.
It suffices to show that p(A)x = 0 for all x in an arbitrary generalized eigenspace
It is easy to see that S_1 is diagonalizable, N_1 is nilpotent, and S_1 N_1 = N_1 S_1. Therefore S_0 = S_1 and N_0 = N_1. This means that S_0 and N_0 commute with conjugation, as asserted.
There are unique operators S, N in L(R^n) such that
S_C = S_0,  N_C = N_0.
Since the map A → A_C is one-to-one, it follows that
SN = NS,
for
(SN − NS)_C = S_0 N_0 − N_0 S_0 = 0.
Let T = S + N as in Theorem 1. Since S_C is diagonalizable, it follows from Chapter 4 that in a suitable basis B of R^n, described below, S has a matrix of the form
S = diag{λ_1, ..., λ_r, D_1, ..., D_s},   D_k = [a_k  -b_k]
                                                [b_k   a_k].
Here λ_1, ..., λ_r are the real eigenvalues of T, with multiplicity; and the complex numbers
a_k + ib_k,  k = 1, ..., s,
are the complex eigenvalues with positive imaginary part, with multiplicity. Note that T, T_C, S_C, and S have the same eigenvalues.
The exponential of the matrix tL, t ∈ R, is easy to calculate, since for the 2 × 2 block L = D_k we have
exp(tL) = e^{ta_k} [cos tb_k  -sin tb_k]
                   [sin tb_k   cos tb_k].
These four vectors, in order, form a basis B of R⁴. This basis gives S the matrix
S_1 = [0  -1        ]
      [1   0        ]
      [        0  -1]
      [        1   0].
(We know this without further computation.)
The matrix of S in standard coordinates is
S_0 = P^{−1} S_1 P,
where P^{−1} is the transpose of the matrix of components of B. Matrix multiplication gives S_0, and the matrix of N in standard coordinates is then
N_0 = T_0 − S_0,
which indeed is nilpotent of order 2.
The matrix of e^{tT} in standard coordinates is
exp(tT_0) = exp(tN_0 + tS_0) = exp(tN_0) exp(tS_0)
          = (I + tN_0) P^{−1} exp(tS_1) P.
Here exp(tS_1) consists of the 2 × 2 diagonal blocks
[cos t  -sin t]
[sin t   cos t].
PROBLEMS
1. For each of the following operators T find bases for the generalized eigenspaces;
give the matrices (for the standard basis) of the semisimple and nilpotent
parts of T.
-2 0 0 0 0 2 2 2 2
2 0 6 0 1 -2 3 3 3 3
4 4 4 4
10. If A and B are commuting operators, find a formula for the semisimple and nilpotent parts of AB and A + B in terms of the corresponding parts of A and B. Show by example that the formula is not always valid if A and B do not commute.
11. Identify R^{n+1} with the set P_n of polynomials of degree ≤ n, via the correspondence
(a_n, ..., a_1, a_0) ↦ a_n t^n + ... + a_1 t + a_0.
16. Find necessary and sufficient conditions on a, b, c, d in order that the operator
[a  b]
[c  d]
be
(a) diagonalizable; (b) semisimple; (c) nilpotent.
17. Let F ⊂ E be invariant under T ∈ L(E). If T is nilpotent, or semisimple, or diagonalizable, so is T | F.
§3. Nilpotent Canonical Forms

In the previous section we saw that any operator T can be decomposed uniquely as
T = S + N
with S semisimple, N nilpotent, and SN = NS. We also found a canonical form for S, that is, a type of matrix representing S which is uniquely determined by T, except for the ordering of diagonal blocks. In the complex case, for example,
S = diag{λ_1, ..., λ_n},
where λ_1, ..., λ_n are the roots of the characteristic polynomial of T, listed with their proper multiplicities.
Although we showed how to find some matrix representation of N , we did not
give any special one. In this section we shall find for any nilpotent operator a matrix
that is uniquely determined by the operator (except for order of diagonal blocks).
From this we shall obtain a special matrix for any operator, called the Jordan
canonical form.
An elementary nilpotent block is a matrix of the form
[0            ]
[1  0         ]
[   1  0      ]
[      .  .   ]
[         1  0],
with 1's just below the diagonal and 0's elsewhere. We include the one-by-one matrix [0].
If N: E → E is an operator represented by such a matrix in a basis e_1, ..., e_n, then N behaves as follows on the basis elements:
N(e_1) = e_2,
N(e_2) = e_3,
...
N(e_{n−1}) = e_n,
N(e_n) = 0.
It is obvious that N^n(e_k) = 0, k = 1, ..., n; hence N^n = 0. Thus N is nilpotent of order n. Moreover, N^k ≠ 0 if 0 ≤ k < n, since N^k e_1 = e_{k+1} ≠ 0.
In Appendix III we shall prove:
The question arises: given a nilpotent operator, how is its canonical form found? To answer this let us examine a nilpotent matrix which is already in canonical form, say, the 10 × 10 matrix
N = diag{N_3, N_2, N_2, [0], [0], [0]},
consisting of one 3 × 3 elementary block N_3, two 2 × 2 elementary blocks N_2, and three 1 × 1 blocks [0]. Let ν_k denote the number of k × k blocks, and δ_k = dim Ker N^k. Each block contributes 1 to δ_1 = dim Ker N; hence
δ_1 = ν_1 + ν_2 + ν_3.
Next, consider δ_2 = dim Ker N². Each 1 × 1 block (that is, each block [0]) contributes one dimension to Ker N². Each 2 × 2 block contributes 2, while the 3 × 3 block also contributes 2. Thus
δ_2 = ν_1 + 2ν_2 + 2ν_3.
For δ_3 = dim Ker N³, we see that the 1 × 1 blocks each contribute 1; the 2 × 2 blocks each contribute 2; and the 3 × 3 block contributes 3. Hence
δ_3 = ν_1 + 2ν_2 + 3ν_3.
In this example N³ = 0, hence δ_k = δ_3 for k > 3.
For an arbitrary nilpotent operator T on a vector space of dimension n, let N be the canonical form; define the numbers δ_k and ν_k, k = 1, ..., n, as before. By the same reasoning we obtain the equations
δ_k = ν_1 + 2ν_2 + ... + (k−1)ν_{k−1} + k(ν_k + ... + ν_n),  k = 1, ..., n.
From these,
ν_1 = 2δ_1 − δ_2.
Subtracting the (k + 1)th equation from the kth gives
ν_k = −δ_{k−1} + 2δ_k − δ_{k+1},  1 < k < n;
and the last equation gives ν_n. Thus we have proved the following theorem, in which part (b) allows us to compute the canonical form of any nilpotent operator:
Note that the equations in (b) can be subsumed under the single equation
ν_k = −δ_{k−1} + 2δ_k − δ_{k+1},
valid for all integers k ≥ 1, if we note that δ_0 = 0 and δ_k = δ_n for k > n.
There is the more difficult problem of finding a basis that puts a given nilpotent operator in canonical form. An algorithm is implicit in Appendix III. Our point of view, however, is to obtain theoretical information from canonical forms. For example, the equations in the preceding theorem immediately prove that two nilpotent operators N, M on a vector space E are similar if and only if dim Ker N^k = dim Ker M^k for 1 ≤ k ≤ dim E.
For computational purposes, the S + N decomposition is usually adequate. On the other hand, the existence and uniqueness of the canonical forms is important for theory.
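Part (b) of the theorem is easy to mechanize: compute δ_k = dim Ker N^k = n − rank N^k and apply ν_k = −δ_{k−1} + 2δ_k − δ_{k+1}. A sketch (the helper is ours, using numpy's rank computation):

```python
import numpy as np

def block_counts(N):
    # nu_k = number of k x k elementary nilpotent blocks in the canonical
    # form of nilpotent N, from delta_k = dim Ker N^k = n - rank N^k
    n = N.shape[0]
    delta = [0]
    P = np.eye(n)
    for _ in range(n + 1):
        P = P @ N
        delta.append(n - np.linalg.matrix_rank(P))
    return [-delta[k-1] + 2*delta[k] - delta[k+1] for k in range(1, n+1)]

# canonical form in R^4 with one 3 x 3 elementary block and one block [0]
N = np.zeros((4, 4))
N[1, 0] = N[2, 1] = 1.0          # 1's below the diagonal of the 3 x 3 block
assert block_counts(N) == [1, 0, 1, 0]
```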
PROBLEMS
1. Verify that each of the following operators is nilpotent and find its canonical
form:
2. Let N be a matrix in nilpotent canonical form. Prove N is similar to
(a) kN for all nonzero k E R,
(b) the transpose of N .
3. Let N be an n X n nilpotent matrix of rank r. If N^k = 0, then k ≥ n/(n − r).
4. Classify the following operators on R4by similarity (missing entries are 0) :
(Some of the diagonal blocks may be 1 X 1 matrices [λ].) That is, λI + A has
λ's along the diagonal; below the diagonal are 1's and 0's; all other entries are 0.
The blocks making up λI + A are called elementary Jordan matrices, or elementary
λ-blocks. A matrix of the form (1) is called a Jordan matrix belonging to λ, or, briefly,
a Jordan λ-block.
Consider next an operator T: E → E whose distinct eigenvalues are λ1, . . . , λm;
as usual E is complex if some eigenvalue is nonreal. Then E = E1 ⊕ · · · ⊕ Em,
where Ek is the generalized λk-eigenspace, k = 1, . . . , m. We know that T | Ek =
λkI + Nk with Nk nilpotent. We give Ek a basis Bk, which gives T | Ek a Jordan
matrix belonging to λk. The basis B = B1 ∪ · · · ∪ Bm of E gives T a matrix of the
form

    C = diag{C1, . . . , Cm},

where each Ck is a Jordan matrix belonging to λk. Thus C is composed of diagonal
blocks, each of which is an elementary Jordan matrix. The matrix C is called the
Jordan form (or Jordan matrix) of T.
We have constructed a particular Jordan matrix for T, by decomposing E as a
direct sum of the generalized eigenspaces of T. But it is easy to see that given any
Jordan matrix M representing T , each Jordan A-block of M represents the restric-
Except for the order of these blocks, the matrix is uniquely determined by T.
Any operator similar to T has the same Jordan form. The Jordan form can be
written A + B, where B is a diagonal matrix representing the semisimple part of
T, while A is a canonical nilpotent matrix which represents the nilpotent part of
T; and AB = BA.
Note that each elementary λ-block contributes 1 to the dimension of Ker (T − λ).
Therefore,
are a basis for E. It is easy to see that in this basis, T | Ej has a matrix composed
§4. JORDAN AND REAL CANONICAL FORMS 129
or

(2)  the matrix with blocks D along the diagonal and blocks I2 just below them:

    D
    I2 D
       .  .
        I2 D

where D = [a −b; b a] and I2 is the 2 X 2 identity matrix. The diagonal
elements are the real eigenvalues, with multiplicity. Each block [a −b; b a], b > 0, appears
as many times as the multiplicity of the eigenvalue a + bi. Such a matrix is uniquely
determined by the similarity class of T, except for the order of the blocks.
Definition The matrix described in the theorem is called the real canonical form
of T . If T has only real eigenvalues, it is the same as the Jordan form. If T is nil-
potent, it is the same as the canonical form discussed earlier for nilpotent operators.
Proposition  In the real canonical form of an operator T on a real vector space, the
number of blocks of the form

    λ
    1 λ
      .  .
       1 λ

is dim Ker (T − λ). The number of blocks of the form (2) is dim Ker (TC − (a + ib)).
The real canonical form of an operator T exhibits the eigenvalues as part of a
matrix for T . This ties them to T much more directly than their definition as roots
of the characteristic polynomial. For example, it is easy to prove:
An operator T on Rn is determined up to similarity by the following data: the
number of k X k blocks of the form

    λ
    1 λ
      .  .
       1 λ

and the number of blocks of the form (2) built from k blocks D,
where λ runs through all real eigenvalues and a + bi through all complex eigenvalues with positive
imaginary part.
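The Jordan form itself can also be computed symbolically. The sketch below (SymPy assumed; not part of the book's toolkit) builds a matrix from a known Jordan matrix and recovers it. Note that SymPy's convention puts the 1's *above* the diagonal, the transpose of this book's convention; the two are conjugate by reversing the order of the basis.

```python
from sympy import Matrix

# A known Jordan matrix J (SymPy convention: 1's on the superdiagonal):
# a 2 x 2 block and a 1 x 1 block for eigenvalue 2, a 1 x 1 block for 3.
J = Matrix([[2, 1, 0, 0],
            [0, 2, 0, 0],
            [0, 0, 2, 0],
            [0, 0, 0, 3]])
P = Matrix([[1, 1, 0, 2],
            [0, 1, 3, 0],
            [0, 0, 1, 1],
            [0, 0, 0, 1]])      # unit upper triangular, hence invertible
M = P * J * P.inv()             # M is similar to J, so J is its Jordan form

P2, J2 = M.jordan_form()        # returns (P2, J2) with M = P2 * J2 * P2^(-1)
```

Up to the order of the blocks, J2 agrees with J, illustrating the uniqueness statement above.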
There remains the problem of finding a basis that puts an operator in real canon-
ical form. An algorithm can be derived from the procedure in Appendix III for
putting nilpotent operators in canonical form. We shall have no need for it, however.
PROBLEMS
2. Find the real canonical forms of the operators in Problem 1, Section 2.
3. Find the real canonical forms of operators in Problem 4, Section 2.
4. What are the possible real canonical forms of an operator on Rn for n ≤ 5?
5. Let A be a 3 X 3 real matrix which is not diagonal. If (A + I)² = 0, find the
real canonical form of A.
6. Let A be an operator. Suppose q(λ) is a polynomial (not identically 0) such
that q(A) = 0. Then the eigenvalues of A are roots of q.
7. Let A, B be commuting operators on Cn (respectively, Rn). There is a basis
putting both of them in Jordan (respectively, real) canonical form.
8. Every n X n matrix is similar to its transpose.
9. Let A be an operator on Rn. An operator B on Rn is called a real logarithm
of A if e^B = A. Show that A has a real logarithm if and only if A is an iso-
morphism and the number of Jordan λ-blocks is even for each negative eigen-
value λ.
§5. CANONICAL FORMS AND DIFFERENTIAL EQUATIONS 133
10. Show that the number of real logarithms of an operator on Rn is either 0, 1,
or countably infinite.
§5. Canonical Forms and Differential Equations

Consider the linear differential equation

(1)  x' = Ax,

where A is an n X n elementary Jordan block belonging to a real eigenvalue λ.
From the decomposition

    A = λI + N,

N being the elementary nilpotent matrix (1's just below the diagonal, 0's elsewhere),
we find by the exponential method (Chapter 5) that the solution to (1) with initial
value x(0) = C ∈ Rn is

    x(t) = e^{tA}C = e^{tλ} e^{tN} C
         = e^{tλ} (I + tN + (t²/2!)N² + · · · + (t^{n−1}/(n−1)!)N^{n−1}) C.

In matrix form, e^{tN} is lower triangular, with 1's on the diagonal, t on the first
subdiagonal, t²/2! on the next, and so on down to t^{n−1}/(n−1)! in the lower left
corner.
In coordinates, the solution just found reads

    xj(t) = e^{tλ} Σ (t^k/k!) c(j−k),  the sum over 0 ≤ k ≤ j − 1.

Consider next the case where A is a real canonical form block built from

    D = [a −b; b a],  I2 = [1 0; 0 1],

with D's on the diagonal and I2's just below. Let m be the number of blocks D, so that n = 2m. The solution to (1) can be com-
puted using exponentials. It is easiest to consider the equation

(3)  z' = Bz,

where z: R → Cm is an unknown map and B is the complex m X m matrix
μI + N, μ = a + ib, with 1's just below the diagonal. As before, the solution with
z(0) = C ∈ Cm is

(4)  zj(t) = e^{tμ} Σ (t^k/k!) C(j−k),  the sum over 0 ≤ k ≤ j − 1.

Put Ck = Lk + iMk, k = 1, . . . , m, and take real and imaginary parts of (4);
using the identity

    e^{(a+ib)t} = e^{at}(cos bt + i sin bt),

one obtains

(5)  xj(t) = e^{at} Σ (t^k/k!) [L(j−k) cos bt − M(j−k) sin bt],
Consider now Eq. (1) where A is any real n X n matrix. By a suitable change
of coordinates x = Py we transform A into real canonical form B = P⁻¹AP. The
equation

(8)  y' = By

is equivalent to (1): every solution x(t) to (1) has the form

    x(t) = Py(t),

where y(t) solves (8).
Equation (8) breaks up into a set of uncoupled equations, each of the form

    u' = Bj u,

where Bj is one of the blocks in the real canonical form B of A. Therefore the co-
ordinates of solutions to (8) are linear combinations of the functions described in (6)
and (7), where λ or a + bi is an eigenvalue of B (hence of A). The same therefore
is true of the original equation (1).
Notice that if A has real eigenvalues, then the functions displayed in Theorem 1
include those of the form t^k e^{tλ}.
This result does not tell us what the solutions of (1) are, but it tells us what form
the solutions take. The following is a typical and very important application of
Theorem 1.
PROBLEMS
1. (a) Suppose that every eigenvalue of A ∈ L(Rn) has real part less than
−α < 0. Prove that there exists a constant k > 0 such that if x(t) is a
2. Let A ∈ L(Rn). Suppose all solutions of x' = Ax are periodic with the same
period. Then A is semisimple and the characteristic polynomial is a power of
t² + a², a ∈ R.
3. Suppose at least one eigenvalue of A ∈ L(Rn) has positive real part. Prove
that for any a ∈ Rn, ε > 0 there is a solution x(t) to x' = Ax such that

    |x(0) − a| < ε  and  lim |x(t)| = ∞ as t → ∞.

5. For any solution to x' = Ax, A ∈ L(Rn), show that exactly one of the follow-
ing alternatives holds:
(a) lim x(t) = 0 as t → ∞ and |x(t)| → ∞ as t → −∞;
(b) |x(t)| → ∞ as t → ∞ and x(t) → 0 as t → −∞;
(c) there exist constants M, N > 0 such that

    M < |x(t)| < N

for all t ∈ R.
6. Let A ∈ L(R4) be semisimple and suppose the eigenvalues of A are ±ai, ±bi;
a > 0, b > 0.
(a) If a/b is a rational number, every solution to x' = Ax is periodic.
(b) If a/b is irrational, there is a nonperiodic solution x(t) such that

    M < |x(t)| < N

for suitable constants M, N > 0.
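Problem 6(a) can be seen concretely. With eigenvalues ±2i and ±3i the flow is a pair of plane rotations with the common period 2π (a sketch, Python with NumPy assumed; the numbers are an illustrative choice, not from the text):

```python
import numpy as np

def rot(c, t):
    """Flow of u' = [[0, -c], [c, 0]] u: rotation of the plane by angle c*t."""
    return np.array([[np.cos(c * t), -np.sin(c * t)],
                     [np.sin(c * t),  np.cos(c * t)]])

a, b = 2.0, 3.0                # a/b rational, so the two rotations share a period
x0 = np.array([1.0, 0.0, 0.5, -0.5])

def flow(t, x):
    """e^{tA} x for A = diag of the two rotation generators."""
    y = np.empty(4)
    y[:2] = rot(a, t) @ x[:2]
    y[2:] = rot(b, t) @ x[2:]
    return y

T = 2 * np.pi                  # a*T and b*T are both integer multiples of 2*pi
```

Every orbit returns to its starting point at time T; at intermediate times such as t = 1 it is away from it.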
Here s: R → R is an unknown function, a1, . . . , an are constants, and s^(k) means the
kth derivative of s.

    z1' = z2,
    . . .
    z(n−1)' = zn,
    zn' = −an z1 − a(n−1) z2 − · · · − a1 zn.

If s is a solution to (1), then

    z = (s, s', . . . , s^(n−1))

is a solution to (2). From Theorem 4, Section 1 we know that every solution to
(2) has derivatives of all orders.
The matrix of coefficients of the linear system (2) is the n X n matrix

(3)  the companion matrix, with 1's on the superdiagonal, −an, . . . , −a1 along
the bottom row, and 0's elsewhere,

§6. HIGHER ORDER LINEAR EQUATIONS 139

which has rank n − 1. Hence A − λ has rank n or n − 1, but rank n is ruled out
since λ is an eigenvalue. Hence A − λ has rank n − 1, so Ker (A − λ) has dimension
1. This proves Proposition 2.
Theorem  The following n functions form a basis for the solutions of (1):
(a) the functions t^k e^{λt}, where λ runs through the distinct real roots of the charac-
teristic polynomial (4), and k is a nonnegative integer in the range 0 ≤ k <
multiplicity of λ; together with
(b) the functions

    t^k e^{at} cos bt  and  t^k e^{at} sin bt,

where a + bi runs through the complex roots of (4) having b > 0 and k is a
nonnegative integer in the range 0 ≤ k < multiplicity of a + bi.
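As an instance of the theorem: if the characteristic polynomial is (t − λ)², the basic functions are e^{λt} and te^{λt}, and te^{λt} satisfies s'' − 2λs' + λ²s = 0. A small numerical check (Python assumed; the derivatives are computed analytically by hand):

```python
import math

lam = -0.7

def s(t):   return t * math.exp(lam * t)                       # basic function t e^{lam t}
def ds(t):  return math.exp(lam * t) * (1 + lam * t)           # s'
def dds(t): return math.exp(lam * t) * (2 * lam + lam ** 2 * t)  # s''

# s'' - 2*lam*s' + lam^2 * s should vanish identically
residual = max(abs(dds(t) - 2 * lam * ds(t) + lam ** 2 * s(t))
               for t in [0.0, 0.5, 1.0, 2.0, 5.0])
```

The residual is zero up to rounding error, since (2λ + λ²t) − 2λ(1 + λt) + λ²t = 0.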
Proof. We call the functions listed in the proposition basic functions. It follows
from Theorem 1 of the previous section that every solution is a linear combination
of basic functions.
The proof that each basic function is in fact a solution is given in the next section.
By Proposition 1 it follows that the solutions to (1) are exactly the linear combina-
tions of basic functions.
It remains to prove that each solution is a unique linear combination of basic
functions. For this we first note that there are precisely n functions listed in (a)
and (b): the number of functions listed equals the sum of the multiplicities of the
real roots of p(λ), plus twice the sum of the multiplicities of the complex roots with
positive imaginary parts. Since nonreal roots come in conjugate pairs, this total
is the sum of the multiplicities of all the roots, which is n.
Define a map Φ: Rn → Rn as follows. Let f1, . . . , fn be an ordering of the basic
functions. For each a = (a1, . . . , an) ∈ Rn let sa(t) be the solution

    sa = Σ aj fj,  the sum over j = 1, . . . , n.

Define

    Φ(a) = (sa(0), sa'(0), . . . , sa^(n−1)(0)) ∈ Rn.

It is easy to see that Φ is a linear map. Moreover, Φ is surjective since for each
(a0, . . . , a(n−1)) ∈ Rn there is some solution s such that

(5)  s(0) = a0, . . . , s^(n−1)(0) = a(n−1),
PROBLEMS
Moreover,

    Ker q(D) ⊂ Ker p(D),

since qr = rq. In addition, if f ∈ Ker q(D) and g ∈ Ker r(D), then f + g ∈
Ker p(D).
We can now give a proof that if (t − λ)^m divides p(t), then t^k e^{λt} ∈ Ker p(D),
0 ≤ k ≤ m − 1. It suffices to prove

(2)  (D − λ)^{k+1}(t^k e^{λt}) = 0,  k = 0, 1, . . . .

Note that D(e^{λt}) = λe^{λt}, or

    (D − λ)e^{λt} = 0.

Next, observe the following relation between operators:

    Dt − tD = 1

(this means D(tf) − tDf = f, which follows from the Leibniz formula). Hence also

    (D − λ)t − t(D − λ) = 1.
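The claim (2) is easy to confirm symbolically (SymPy assumed; the helper name is ours):

```python
from sympy import symbols, exp, diff, simplify

t, lam = symbols('t lam')

def D_minus_lam(f, times=1):
    """Apply the operator (D - lam) repeatedly, where D = d/dt."""
    for _ in range(times):
        f = diff(f, t) - lam * f
    return f

# (D - lam)^{k+1} (t^k e^{lam t}) = 0 for k = 0, 1, 2, 3
checks = [simplify(D_minus_lam(t ** k * exp(lam * t), k + 1)) for k in range(4)]
```

Each application of (D − λ) lowers the power of t by one, exactly as the operator identity above predicts, so after k + 1 applications nothing is left.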
In this chapter we study some important kinds of linear flows e^{tA}, particularly
contractions. A (linear) contraction is characterized by the property that every
trajectory tends to 0 as t → ∞. Equivalently, the eigenvalues of A have negative
real parts. Such flows form the basis for the study of asymptotic stability in Chapter
9. Contractions and their extreme opposites, expansions, are studied in Section 1.
Section 2 is devoted to hyperbolic flows e^{tA}, characterized by the condition that
the eigenvalues of A have nonzero real parts. Such a flow is the direct sum of a
contraction and an expansion. Thus their qualitative behavior is very simple.
In Section 3 we introduce the notion of a generic property of operators on Rn;
this means that the set of operators which have that property contains a dense
open subset of L(Rn). It is shown that "semisimple" is a generic property, and
also, "generating hyperbolic flows" is a generic property for operators.
The concept of a generic property of operators is a mathematical way of making
precise the idea of "almost all" operators, or of a "typical" operator. This point is
discussed in Section 4.
of (1). Of especial interest are equilibrium states. Such a state x̄ ∈ U is one that
does not change with time. Mathematically, this means that the constant map
t → x̄ is a solution to (1); equivalently, f(x̄) = 0. Hence we define an equilib-
rium of (1) to be a point x̄ ∈ U such that f(x̄) = 0.
From a physical point of view only equilibria that are “stable” are of interest.
A pendulum balanced upright is in equilibrium, but this is very unlikely to occur;
moreover, the slightest disturbance will completely alter the pendulum’s behavior.
Such an equilibrium is unstable. On the other hand, the downward rest position is
stable; if slightly perturbed from it, the pendulum will swing around it and (because
of friction) gradually approach it again.
Stability is studied in detail in Chapter 9. Here we restrict attention to linear
systems and concentrate on the simplest and most important type of stable
equilibrium.
Consider a linear equation

(2)  x' = Ax,  A ∈ L(Rn).

The origin 0 ∈ Rn is called a sink if all the eigenvalues of A have negative real
parts. We also say the linear flow e^{tA} is a contraction.
In Chapter 6, Theorems 2 and 3, Section 5, it was shown that 0 is a sink if and
only if every trajectory tends to 0 as t → ∞. (This is called asymptotic stability.)
From Problem 1, Section 5 of that chapter, it follows that trajectories approach
a sink exponentially. The following result makes this more precise.
for every eigenvalue λ of A. Then E has a basis such that in the corresponding inner
product and norm,

(4)  α|x|² ≤ ⟨Ax, x⟩ ≤ β|x|²

for all x ∈ E.
Assuming the truth of the lemma, we derive an estimate for solutions of x' = Ax.
Let (x1, . . . , xn) be coordinates on E corresponding to a basis B such that (4)
holds, and let

    x(t) = (x1(t), . . . , xn(t))

be a solution to x' = Ax. Then for the norm and inner product defined by B we have

    (d/dt)|x(t)|² = 2⟨x'(t), x(t)⟩ = 2⟨Ax(t), x(t)⟩.

Hence

    2α|x|² ≤ (d/dt)|x|² ≤ 2β|x|²,

or

    α ≤ (d/dt) log|x| ≤ β.

It follows by integration that

    αt ≤ log|x(t)| − log|x(0)| ≤ βt;

hence

    e^{αt}|x(0)| ≤ |x(t)| ≤ e^{βt}|x(0)|.

Theorem 1 is proved by letting β = −b < 0 where the eigenvalues of A have
real parts less than −b.
We now prove the lemma; for simplicity we prove only the second inequality
of (4).
§1. SINKS AND SOURCES 147
where the diagonal entries λj are the real eigenvalues and each 2 X 2 block

    [ak −bk; bk ak]

corresponds to an eigenvalue ak + ibk of A. By assumption

    λj < c,  ak < c.

Give Rn the inner product defined by

    ⟨ej, ej⟩ = ⟨fk, fk⟩ = ⟨gk, gk⟩ = 1,

all other inner products among the ej, fk, and gk being 0. Then a computation
shows

    ⟨Aej, ej⟩ = λj < c,  ⟨Afk, fk⟩ = ⟨Agk, gk⟩ = ak < c.

In the general case A is composed of diagonal blocks of the two forms

(5)  λj                    (6)  Dj
     1  λj                      I  Dj
        .  .                       .  .
         1  λj                      I  Dj

where

    Dj = [aj −bj; bj aj],  I = [1 0; 0 1].
148 7. CONTRACTIONS AND GENERIC PROPERTIES OF OPERATORS
For the first kind of block (5), we can write A = S + N, where S has the matrix
λj I and N has the matrix with 1's on the subdiagonal and 0's elsewhere.
Thus the basis vectors (e1, . . . , en) are eigenvectors of S, while

    Ne1 = e2, . . . , Ne(n−1) = en, Nen = 0.

Now, given ε > 0, pass to the rescaled basis Bε = (e1', . . . , en') with ek' = ε^{1−k} ek.
Then S keeps the matrix λj I, while

    Nek' = ε e(k+1)', k < n;  Nen' = 0,

so in the basis Bε the matrix of A = S + N is

(7)  λj
     ε  λj
        .  .
         ε  λj

Let ⟨x, y⟩ε denote the inner product corresponding to Bε. It is clear by considering
the matrix (7) that

    ⟨Ax, x⟩ε / ⟨x, x⟩ε → ⟨Sx, x⟩ε / ⟨x, x⟩ε  as ε → 0.
Therefore if ε is sufficiently small, the basis Bε satisfies the lemma for a block (5).
The case of a block (6) is similar and is left to the reader. This completes the proof
of the lemma.
The qualitative behavior of a flow near a sink has a simple geometrical inter-
pretation. Suppose 0 ∈ Rn is a sink for the linear differential equation x' = f(x).
Consider the spheres

    Sα = {x ∈ Rn | |x| = α},  α > 0,

where |x| is the norm derived from an inner product as in the theorem. Since |x(t)|
has negative derivative, the trajectories all point inside these spheres as in Fig. A.
FIG. A
We emphasize that this is true for the spheres in a special norm; it may be false
for some other norm.
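The exponential approach to a sink is easy to observe numerically (Python with NumPy assumed; the matrix is an illustrative choice with eigenvalues −1 and −2 ± i, and the exponential is computed by truncated series):

```python
import math
import numpy as np

def expm_series(A, terms=60):
    """Matrix exponential by truncated power series (adequate for small ||A||)."""
    return sum(np.linalg.matrix_power(A, k) / math.factorial(k)
               for k in range(terms))

# Eigenvalues -1 and -2 +/- i: all real parts negative, so 0 is a sink.
A = np.array([[-1.0,  0.0,  0.0],
              [ 0.0, -2.0, -1.0],
              [ 0.0,  1.0, -2.0]])
x0 = np.array([1.0, 1.0, 1.0])

# |x(t)| at t = 0, 1, 2, 3: monotonically shrinking toward 0
norms = [np.linalg.norm(expm_series(t * A) @ x0) for t in (0, 1, 2, 3)]
```

Here |x(t)|² = e^{−2t} + 2e^{−4t}, so the sampled norms decrease roughly like e^{−t}.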
The linear flow e^{tA} that has the extreme opposite character to a contraction is an
expansion, for which the origin is called a source: every eigenvalue of A has positive
real part. The following result is the analogue of Theorem 1 for expansions.
The proof is like that of Theorem 1, using the lemma and the first inequality of
(4).
PROBLEMS
A type of linear flow e^{tA} that is more general than contractions and expansions is
the hyperbolic flow: all eigenvalues of A have nonzero real part.
§2. HYPERBOLIC FLOWS 151
After contractions and expansions, hyperbolic linear flows have the simplest
types of phase portraits. Their importance stems from the fact that almost every
linear flow is hyperbolic. This will be made precise, and proved, in the next section.
The following theorem says that a hyperbolic flow is the direct sum of a contrac-
tion and an expansion.
Theorem  Let e^{tA} be a hyperbolic linear flow, A ∈ L(E). Then E has a direct sum
decomposition

    E = Es ⊕ Eu

invariant under A, such that the induced flow on Es is a contraction and the induced
flow on Eu is an expansion. This decomposition is unique.

Proof. We give E a basis putting A into real canonical form (Chapter 6). We
order this basis so that the canonical form matrix first has blocks corresponding to
eigenvalues with negative real parts, followed by blocks corresponding to eigenvalues
with positive real parts. The first set of blocks represents the restriction of A to a subspace
Es ⊂ E, while the remaining blocks represent the restriction of A to Eu ⊂ E.
Since Es is invariant under A, it is invariant under e^{tA}. Put A | Es = As and
A | Eu = Au. Then e^{tA} | Es = e^{tAs}. By Theorem 1, Section 1, e^{tA} | Es is a contraction.
Similarly, Theorem 2, Section 1 implies that e^{tA} | Eu is an expansion.
Thus A = As ⊕ Au is the desired decomposition.
To check uniqueness of the decomposition, suppose Fs ⊕ Fu is another decom-
position of E invariant under the flow such that e^{tA} | Fs is a contraction and e^{tA} | Fu
is an expansion. Let x ∈ Fs. We can write

    x = y + z,  y ∈ Es,  z ∈ Eu.

Since e^{tA}x → 0 as t → ∞, we have e^{tA}y → 0 and e^{tA}z → 0. But

    |e^{tA}z| ≥ e^{ta}|z|,  a > 0,

for all t ≥ 0. Hence |z| = 0. This shows that Fs ⊂ Es. The same argument shows
that Es ⊂ Fs; hence Es = Fs. Similar reasoning about e^{−tA} shows that Eu = Fu.
This completes the proof.
FIG. B

FIG. C
§3. GENERIC PROPERTIES OF OPERATORS 153
PROBLEMS
1. Let the eigenvalues of A ∈ L(R3) be λ, μ, ν. Notice that e^{tA} is a hyperbolic flow
and sketch its phase portrait if:
(a) λ < μ < ν < 0;
(b) λ < 0, μ = a + bi, a < 0, b > 0;
(c) λ < 0, μ = a + bi, a > 0, b > 0;
(d) λ < 0 < μ = ν and A is semisimple;
(e) λ < μ < 0 < ν.
An interesting kind of subset of F is a set X ⊂ F which is both open and dense. It
is characterized by the following properties: every point in the complement of X
can be approximated arbitrarily closely by points of X (because X is dense); but no
point in X can be approximated arbitrarily closely by points in the complement
(because X is open).
Here is a simple example of a dense open set in R²:

    X = {(x, y) ∈ R² | xy ≠ 1}.

This, of course, is the complement of the hyperbola defined by xy = 1. If x0y0 ≠ 1
and |x − x0|, |y − y0| are small enough, then xy ≠ 1; this proves X open. Given
any (x0, y0) ∈ R², we can find (x, y) as close as we like to (x0, y0) with xy ≠ 1; this
proves X dense.
More generally, one can show that the complement of any algebraic curve in
R² is dense and open.
A dense open set is a very fat set, as the following proposition shows:
Proof. It can easily be shown in general that the intersection of a finite number
of open sets is open, so X is open. To prove X dense, let U ⊂ F be a nonempty
open set. Then U ∩ X1 is nonempty since X1 is dense. Because U and X1 are open,
U ∩ X1 is open. Since U ∩ X1 is open and nonempty, (U ∩ X1) ∩ X2 is nonempty
because X2 is dense. Since X2 is open, U ∩ X1 ∩ X2 is open. Thus (U ∩ X1 ∩ X2) ∩ X3
is nonempty, and so on. So U ∩ X is nonempty, which proves that X is dense in F.
Now consider a subset X of the vector space L(Rn). It makes sense to call X
dense, or open. In trying to prove this for a given X we may use any convenient
norm on L(Rn). One such norm is the B-max norm, where B is a basis of Rn:

    ||T||max = max{|aij|},  [aij] the B-matrix of T.
T = S + N,

where

    S = diag{λ1, . . . , λr, D1, . . . , Ds},  Dk = [ak −bk; bk ak],

and N is nilpotent and in canonical form, with 1's (below the diagonal entries λk)
or 2 X 2 identity blocks I2 (below some of the blocks Dk) on the subdiagonal and
0's elsewhere; O2 = [0 0; 0 0] denotes the 2 X 2 zero block.
The eigenvalues of T (with multiplicities) are λ1, . . . , λr and a1 ± ib1, . . . ,
as ± ibs.
Given ε > 0, let

    λ1', . . . , λr', a1', . . . , as'
be distinct numbers with |λk' − λk| < ε and |ak' − ak| < ε. Let S' be obtained
from S by replacing each λk by λk' and each ak by ak' (in Dk), and put
T' = S' + N. Then the B-max norm of T − T' is less than ε, and the eigen-
values of T' are the n distinct numbers

    λ1', . . . , λr', a1' ± ib1, . . . , as' ± ibs.
The set of semisimple operators is not open. For example, every neighborhood
of the semisimple operator I ∈ L(R²) contains a nonsemisimple operator of the
form [1 0; ε 1].
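Both phenomena are visible numerically (Python with NumPy assumed; the matrices are illustrative choices): a small diagonal perturbation separates a repeated eigenvalue, while an arbitrarily small subdiagonal entry destroys semisimplicity of I.

```python
import numpy as np

# Density of distinct eigenvalues: perturb a repeated eigenvalue slightly.
T = np.array([[2.0, 0.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])        # eigenvalue 2 repeated
eps = 1e-6
Tp = T + np.diag([0.0, eps, 0.0])      # eigenvalues 2, 2 + eps, 3: all distinct

# "Semisimple" is not open: B below is within eps of I but has a single
# 2 x 2 Jordan block for the eigenvalue 1, so it is not semisimple.
B = np.eye(2)
B[1, 0] = eps
```

That B is not semisimple shows up as rank(B − I) = 1: the eigenvalue 1 has algebraic multiplicity 2 but only a one-dimensional eigenspace.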
We also have
PROBLEMS
1. Each of the following properties defines a set of real n X n matrices. Find out
which sets are dense, and which are open in the space L(Rn) of all linear opera-
tors on Rn:
(a) determinant # 0;
(b) trace is rational;
(c) entries are not integers;
(d) 3 ≤ determinant < 4;
(e) −1 < |λ| < 1 for every eigenvalue λ;
(f) no real eigenvalues;
(g) each real eigenvalue has multiplicity one.
2. Which of the following properties of operators on Rn are generic?
(a) |λ| ≠ 1 for every eigenvalue λ;
(b) n = 2; some eigenvalue is not real;
(c) n = 3; some eigenvalue is not real;
(d) no solution of x' = Ax is periodic (except the zero solution);
(e) there are n distinct eigenvalues, with distinct imaginary parts;
(f) Ax ≠ x and Ax ≠ −x for all x ≠ 0.
3. The set of operators on Rn that generate contractions is open, but not dense, in
L(Rn). Likewise for expansions.
This chapter is more difficult than the preceding ones; it is also central to the
study of ordinary differential equations. We suggest that the reader browse through
the chapter, omitting the proofs until the purpose of the theorems begins to fit
into place.
Note that the definition implies that the map φt: S → S is C¹ for each t and has a
C¹ inverse φ−t (take s = −t in (b)).
An example of a dynamical system is implicitly and approximately defined by
the differential equations in the Newton-Kepler chapter. However, we give a pre-
cise example as follows.
Let A be an operator on a vector space E; let E = S and φ: R X S → S be de-
fined by φ(t, x) = e^{tA}x. Thus φt: S → S can be represented by φt = e^{tA}. Clearly,
φ0 = e⁰ = the identity operator, and since e^{(t+s)A} = e^{tA}e^{sA}, we have defined a dy-
namical system on E (see Chapter 5).
This example of a dynamical system is related to the differential equation dx/dt =
Ax on E. A dynamical system on S in general gives rise to a differential equation
on S, that is, a vector field on S, f: S → E. Here S is supposed to be an open set in
the vector space E. Given φt, define f by

(1)  f(x) = (d/dt) φt(x) at t = 0;

thus for x in S, f(x) is a vector in E which we think of as the tangent vector to the
curve t → φt(x) at t = 0. Thus every dynamical system gives rise to a differential
equation.
We may rewrite this in more conventional terms. If φt: S → S is a dynamical
system and x ∈ S, let x(t) = φt(x), and f: S → E be defined as in (1). Then we
may rewrite (1) as

(1')  x' = f(x).

Thus x(t) or φt(x) is the solution curve of (1') satisfying the initial condition
x(0) = x. There is a converse process to the above; given a differential equation
one has associated to it an object that would be a dynamical system if it were
defined for all t. This process is the fundamental theory of differential equations
and the rest of this chapter is devoted to it.
The equation (1') we are talking about is called an autonomous equation. This
means that the function f does not depend on time. One can also consider a C¹
map f: I X W → E, where I is an interval and W is an open set in the vector space.
The equation in that case is

(2)  x' = f(t, x)

and is called nonautonomous. The existence and uniqueness theory for (1') will
§2. THE FUNDAMENTAL THEOREM 161
be developed in this chapter; (2) will be treated in Chapter 15. Our emphasis in
this book is completely on the autonomous case.
Throughout the rest of this chapter, E will denote a vector space with a norm;
W ⊂ E, an open set in E; and f: W → E a continuous map. By a solution of the
differential equation

(1)  x' = f(x)

we mean a differentiable function

    u: J → W

defined on some interval J ⊂ R such that for all t ∈ J

    u'(t) = f(u(t)).

Here J could be an interval of real numbers which is open, closed, or half open, half
closed. That is,

    (a, b) = {t ∈ R | a < t < b},
or
    [a, b] = {t ∈ R | a ≤ t ≤ b},
or
    (a, b] = {t ∈ R | a < t ≤ b},

and so on. Also, a or b could be ±∞, but intervals like (a, ∞] are not allowed.
Geometrically, u is a curve in E whose tangent vector u'(t) equals f(u(t)); we
think of this vector as based at u(t). The map f: W → E is a vector field on W. A
solution u may be thought of as the path of a particle that moves in E so that at
time t, its tangent vector or velocity is given by the value of the vector field at the
position of the particle. Imagine a dust particle in a steady wind, for example, or
an electron moving through a constant magnetic field. See also Fig. A, where u(t0) =
x, u'(t0) = f(x).
FIG. A
162 8. FUNDAMENTAL THEORY
FIG. B
solution

    x: (−a, a) → W

of the differential equation

    x' = f(x)

satisfying the initial condition

    x(0) = x0.
Before giving the proof we recall the meaning of the derivative Df(x) for x ∈ W.
This is a linear operator on E; it assigns to a vector u ∈ E the vector

    Df(x)u = lim (1/s)(f(x + su) − f(x)) as s → 0, s ∈ R,

which will exist if Df(x) is defined.
In coordinates (x1, . . . , xn) on E, let f(x) = (f1(x1, . . . , xn), . . . , fn(x1, . . . , xn));
then Df(x) is represented by the n X n matrix of partial derivatives

    (∂/∂xj)(fi(x1, . . . , xn)).

Conversely, if all the partial derivatives exist and are continuous, then f is C¹. For
each x ∈ W, there is defined the operator norm ||Df(x)|| of the linear operator
Df(x) ∈ L(E) (see Chapter 5). If u ∈ E, then

(1)  |Df(x)u| ≤ ||Df(x)|| |u|.

That f: W → E is C¹ implies that the map W → L(E) which sends x → Df(x) is a
continuous map (see, for example, the notes at end of this chapter).
Hence, by (1), f has Lipschitz constant K on W0. Choose a > 0 with
a < min{b/M, 1/K}, and define J = [−a, a]. Recall that b is the radius of the
ball W0. We shall define a sequence of functions u0, u1, . . . from J to W0. We shall
prove they converge uniformly to a function satisfying (4), and later that there
are no other solutions of (4). The lemma that is used to obtain the convergence
of the uk: J → W0 is the following:
This is called uniform convergence of the functions uk. This lemma is proved in
elementary analysis books and will not be proved here.
The sequence of functions uk is defined as follows:
Let

    u0(t) = x0.

Let

    u1(t) = x0 + ∫0^t f(u0(s)) ds.

Assuming that uk(t) has been defined and that

    |uk(t) − x0| ≤ b for all t ∈ J,

let

    uk+1(t) = x0 + ∫0^t f(uk(s)) ds.

This makes sense since uk(s) ∈ W0 so the integrand is defined. We show that

    |uk+1(t) − x0| ≤ b, or uk+1(t) ∈ W0 for t ∈ J;

this will imply that the sequence can be continued to uk+2, uk+3, and so on.
We have

    |uk+1(t) − x0| ≤ ∫0^t |f(uk(s))| ds
    ≤ Mt ≤ Ma ≤ b

for t ∈ J. Passing to the limit, the function x(t) = lim uk(t) satisfies

    x(t) = x0 + lim ∫0^t f(uk(s)) ds = x0 + ∫0^t [lim f(uk(s))] ds,

the limits being taken as k → ∞.
§3. EXISTENCE AND UNIQUENESS 167
z(t) = y(t).
Another proof of uniqueness follows from the lemma of the next section.
We have proved Theorem 1 of Section 2. Note that in the course of the proof
the following was shown: given any ball W0 ⊂ W of radius b about x0, with
max |f(x)| ≤ M on W0, where f on W0 has Lipschitz constant K and 0 < a <
min{b/M, 1/K}, then there is a unique solution x: (−a, a) → W of (3) such that
x(0) = x0.
Some remarks are in order.
Consider the situation in Theorem 1 with a C¹ map f: W → E, W open in E.
Two solution curves of x' = f(x) cannot cross. This is an immediate consequence of
uniqueness but is worth emphasizing geometrically. Suppose φ: J → W, ψ: J1 → W
are two solutions of x' = f(x) such that φ(t1) = ψ(t2). Then φ(t1) is not a crossing
because if we let ψ1(t) = ψ(t2 − t1 + t), then ψ1 is also a solution. Since ψ1(t1) =
ψ(t2) = φ(t1), it follows that ψ1 and φ agree near t1 by the uniqueness statement of
FIG. A FIG. B
FIG. C
Let us see how the "iteration scheme" used in the proof in this section applies
to a very simple differential equation. Consider W = R and f(x) = x, and search
for a solution of x' = x in R (we know already that the solution x(t) satisfying
x(0) = x0 is given by x(t) = x0 e^t).
Set

    u0(t) = x0,
§4. CONTINUITY OF SOLUTIONS IN INITIAL CONDITIONS 169
and so, inductively,

    uk(t) = x0 (1 + t + t²/2! + · · · + t^k/k!).

As k → ∞, uk(t) converges to x0 e^t, the solution.
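The iteration is easy to run exactly (SymPy assumed, so the integrals are symbolic; the setup mirrors the example f(x) = x, x0 = 1):

```python
from sympy import symbols, integrate, Rational

t, s = symbols('t s')
x0 = Rational(1)

u = x0                      # u_0(t) = x0
for _ in range(6):          # u_{k+1}(t) = x0 + integral_0^t f(u_k(s)) ds, with f(x) = x
    u = x0 + integrate(u.subs(t, s), (s, 0, t))

# u is now u_6(t) = x0 * (1 + t + t^2/2! + ... + t^6/6!),
# the 6th partial sum of x0 * e^t
```

Each pass through the loop appends the next Taylor term of e^t, exactly as in the text.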
For Theorem 1 of Section 2 to be at all interesting in any physical sense (or even
mathematically) it needs to be complemented by the property that the solution
x(t) depends continuously on the initial condition x(0). The next theorem gives a
precise statement of this property.
    u(t) ≤ C + ∫0^t K u(s) ds,

then

    u(t) ≤ Ce^{Kt}.

Proof. Let U(t) denote the right-hand side of the hypothesis; then u(t) ≤ U(t).
By differentiation of U we find

    U'(t) = Ku(t);

hence

    U'(t) ≤ KU(t),

so

    log U(t) ≤ log U(0) + Kt

by integration. Since U(0) = C, we have by exponentiation

    U(t) ≤ Ce^{Kt},

and so

    u(t) ≤ Ce^{Kt}.

If C = 0, then apply the above argument for a sequence of positive Ci that tend
to 0 as i → ∞. This proves the lemma.
we have

    v(t) ≤ v(t0) + ∫(t0)^t K v(s) ds.
hence
for all t between γ and β. Hence y is differentiable at β, and in fact y'(β) = f(y(β)).
Therefore y is a solution on [γ, β]. Since there is a solution on an interval [β, δ),
δ > β, we can extend y to the interval (α, δ). Hence (α, β) could not be a maximal
domain of a solution. This completes the proof of the theorem.
Proof. Let [0, β) be the maximal half-open interval on which there is a solution
y as above. Then y([0, β)) ⊂ A, and so β cannot be finite by the theorem.
Theorem  Let f(x) be C¹. Let y(t) be a solution to x' = f(x) defined on the closed
interval [t0, t1], with y(t0) = y0. There is a neighborhood U ⊂ E of y0 and a constant
K such that if z0 ∈ U, then there is a unique solution z(t) also defined on [t0, t1] with
z(t0) = z0; and z satisfies

    |y(t) − z(t)| ≤ K |y0 − z0| exp(K(t − t0))

for all t ∈ [t0, t1].
    φ(t, y),  with  φ(0, y) = y.
§7. THE FLOW OF A DIFFERENTIAL EQUATION 175
    Ω = {(t, y) ∈ R X W | t ∈ J(y)}.

The map (t, y) → φ(t, y) is then a function

    φ: Ω → W.

We call φ the flow of equation (1).
We shall often write

    φ(t, x) = φt(x).
in the sense that if one side of (2) is defined, so is the other, and they are equal.
Proof. First, suppose s and t are positive and φs(φt(x)) is defined. This means
t ∈ J(x) and s ∈ J(φt(x)). Suppose J(x) = (α, β). Then α < t < β; we shall
show β > s + t. Define

    y: (α, s + t] → W

by

    y(r) = φ(r, x)            if α < r ≤ t;
    y(r) = φ(r − t, φt(x))    if t ≤ r ≤ t + s.

Then y is a solution and y(0) = x. Hence s + t ∈ J(x). Moreover,

    φ(s+t)(x) = y(s + t) = φs(φt(x)).

The rest of the proof of Theorem 1 uses the same ideas and is left to the reader.
    |t1 − t0| < δ,  |x1 − x0| < δ.
Then

    |φ(t1, x1) − φ(t0, x0)| ≤ |φ(t1, x1) − φ(t1, x0)| + |φ(t1, x0) − φ(t0, x0)|.

The second term on the right goes to 0 with δ because the solution through x0 is
continuous (even differentiable) in t. The first term on the right, by the estimate
in Section 6, is bounded by δe^{Kd} (d being the length of the time interval), which
also goes to 0 with δ. This proves Theorem 2.
PROBLEMS
1. Write out the first few terms of the Picard iteration scheme (Section 3) for
each of the following initial value problems. Where possible, use any method
to find explicit solutions. Discuss the domain of the solution.
(a) x' = x + 2; x(0) = 2.
(b) x' = x^{2/3}; x(0) = 0.
(c) x' = x^{2/3}; x(0) = 1.
(d) x' = sin x; x(0) = 0.
(e) x' = 1/(2x); x(1) = 1.
2. Let A be an n X n matrix. Show that the Picard method for solving x' = Ax,
x(0) = u gives the solution e^{tA}u.
3. Derive the Taylor series for sin t by applying the Picard method to the first
order system corresponding to the second order initial value problem

    x'' = −x;  x(0) = 0, x'(0) = 1.
4. For each of the following functions, find a Lipschitz constant on the region
indicated, or prove there is none:
(a) f(x) = |x|,  −∞ < x < ∞.
(b) f(x) = x^{1/3},  −1 ≤ x ≤ 1.
(c) f(x) = 1/x,  1 ≤ x < ∞.
(d) f(x, y) = (x + 2y, −y),  (x, y) ∈ R².
5. (a) There are infinitely many solutions satisfying x(0) = 0 on every in-
terval [0, β].
(b) For what values of α are there infinitely many solutions on [0, α] satisfy-
ing x(0) = −1?
6. Let f: E → E be continuous; suppose |f(x)| ≤ M. For each n = 1, 2, . . . , let
xₙ: [0, 1] → E be a solution to x′ = f(x). If xₙ(0) converges, show that a
subsequence of {xₙ} converges uniformly to a solution. (Hint: Look up Ascoli's
theorem in a book on analysis.)
7. Use Problem 6 to show that continuity of solutions in initial conditions follows
from uniqueness and existence of solutions.
8. Prove the following general fact (see also Section 4): if C ≥ 0 and u, v: [0, β] → R
are continuous and nonnegative, and
u(t) ≤ C + ∫₀ᵗ u(s)v(s) ds,
then u(t) ≤ Ce^{V(t)}, where
V(t) = ∫₀ᵗ v(s) ds.
9. Define f: R → R by
f(x) = 1 if x ≤ 1;  f(x) = 2 if x > 1.
There is no solution to x′ = f(x) on any open interval around t = 1.
10. Let g: R → R be Lipschitz and f: R → R continuous. Show that the system
x′ = g(x),
y′ = f(x)y,
has at most one solution on any interval, for a given initial value. (Hint:
Use Gronwall's inequality.)
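The Picard scheme of Problems 1-3 can be carried out numerically. A minimal sketch for Problem 2, where each iterate adds the next term of the series e^{tA}u = Σₖ (tA)ᵏu/k!; the particular 2 × 2 matrix A is an assumed example, not from the text:

```python
import math

A = [[0.0, 1.0], [-1.0, 0.0]]   # assumed example; solutions of x' = Ax rotate
u = [1.0, 0.0]
t = 0.5

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

# Picard: x_{k+1}(t) = u + integral_0^t A x_k(s) ds.  Starting from x_0 = u,
# the k-th iterate is the k-th partial sum of the exponential series.
x = list(u)        # running partial sum
term = list(u)     # current term (tA)^k u / k!
for k in range(1, 30):
    term = [t / k * c for c in mat_vec(A, term)]
    x = [a + b for a, b in zip(x, term)]

# For this A and u, e^{tA}u = (cos t, -sin t).
assert abs(x[0] - math.cos(t)) < 1e-12
assert abs(x[1] + math.sin(t)) < 1e-12
```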
Notes
Our treatment of calculus tends to be from the modern point of view. The deriva-
tive is viewed as a linear transformation.
Suppose that U is an open set of a vector space E and that g: U → F is some map,
F a second vector space. What is the derivative of g at x₀ ∈ U? We say that this
derivative exists and is denoted by Dg(x₀) ∈ L(E, F) if
lim_{|h|→0} |g(x₀ + h) − g(x₀) − Dg(x₀)h| / |h| = 0.
Then, if, for each x ∈ U, the derivative Dg(x) exists, this derivative defines a
map
U → L(E, F),  x ↦ Dg(x).
If this map is continuous, then g is said to be C¹. If this map is C¹ itself, then g is
said to be C².
Now suppose F, G, H are three vector spaces and U, V are open sets of F, G, respectively.
some physical (or biological, economic, or the like) system described by (1), then
x̄ is an "equilibrium state": if the system is at x̄ it always will be (and always
was) at x̄.
Let φ: Ω → W be the flow associated with (1); Ω ⊂ R × W is an open set, and
for each x ∈ W the map t ↦ φ(t, x) = φ_t(x) is the solution passing through x when
t = 0; it is defined for t in some open interval. If x̄ is an equilibrium, then φ_t(x̄) = x̄
for all t ∈ R. For this reason, x̄ is also called a stationary point, or fixed point, of
the flow. Another name for x̄ is a zero or singular point of the vector field f.
Suppose f is linear: W = Rⁿ and f(x) = Ax, where A is a linear operator on Rⁿ.
Then the origin 0 ∈ Rⁿ is an equilibrium of (1). In Chapter 7 we saw that when
α < 0 is greater than the real parts of the eigenvalues of A, then solutions φ_t(x)
approach 0 exponentially:
|φ_t(x)| ≤ Ce^{αt}
for some C > 0.
Now suppose f is a C¹ vector field (not necessarily linear) with equilibrium
point 0 ∈ Rⁿ. We think of the derivative Df(0) = A of f at 0 as a linear vector
field which approximates f near 0. We call it the linear part of f at 0. If all eigen-
values of Df(0) have negative real parts, we call 0 a sink. More generally, an
equilibrium x̄ of (1) is a sink if all eigenvalues of Df(x̄) have negative real parts.
The following theorem says that a nonlinear sink x̄ behaves locally like a linear
sink: nearby solutions approach x̄ exponentially.
This shows, first, that |x(t)| is decreasing; hence x(t) ∈ U for all t ∈ [0, t₀].
Since U is compact, it follows from Section 5, Chapter 8 that the trajectory x(t)
is defined and in U for all t ≥ 0. Secondly, (2) implies that
|x(t)| ≤ e^{−tα}|x(0)|
for all t ≥ 0. Thus (a) and (b) are proved and (c) follows from equivalence of
norms.
The phase portrait at a nonlinear sink x̄ looks like that of the linear part of the
vector field: in a suitable norm the trajectories point inside all sufficiently small
spheres about x̄ (Fig. A).
Remember that the spheres are not necessarily “round” spheres; they are spheres
in a special norm. In standard coordinates they may be ellipsoids.
A simple physical example of a nonlinear sink is given by a pendulum moving in
a vertical plane (Fig. B) . We assume a constant downward gravitational force
equal to the mass m of the bob; we neglect the mass of the rod supporting the
bob. We assume there is a frictional (or viscous) force resisting the motion, pro-
portional to the speed of the bob.
Let l be the (constant) length of the rod. The bob of the pendulum moves along
a circle of radius l. If θ(t) is the counterclockwise angle from the vertical to the
rod at time t, then the angular velocity of the bob is dθ/dt and the velocity is
l dθ/dt. Therefore the frictional force is −kl dθ/dt, k a nonnegative constant; this
force is tangent to the circle.
force is tangent to the circle.
The downward gravitational force m has component −m sin θ(t) tangent to the
circle; this is the force on the bob that produces motion. Therefore the total force
tangent to the circle at time t is
−kl dθ/dt − m sin θ(t).
Since the acceleration of the bob along the circle is l d²θ/dt², Newton's law gives
ml d²θ/dt² = −kl dθ/dt − m sin θ(t),
or
d²θ/dt² + (k/m) dθ/dt + (1/l) sin θ = 0.
FIG. B. Pendulum.
184 9. STABILITY OF EQUILIBRIA
The real part −k/2m is negative as long as the coefficient of friction k is positive
and the mass is positive. Therefore the equilibrium 8 = w = 0 is a sink. We con-
clude: for all sufficiently small initial angles and velocities, the pendulum tends
toward the equilibrium position (0, 0).
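This local behavior is easy to see numerically. A minimal sketch, assuming illustrative values m = l = k = 1 (these parameter values, the step size, and the Euler scheme are assumptions, not from the text):

```python
import math

# Damped pendulum with m = l = k = 1 (assumed illustrative values):
#   theta' = omega,  omega' = -omega - sin(theta).
theta, omega, h = 0.3, 0.0, 1e-3    # small initial angle, zero initial velocity
for _ in range(30000):              # integrate to t = 30 by Euler steps
    theta, omega = theta + h * omega, omega + h * (-omega - math.sin(theta))

# The state has spiraled in toward the sink (0, 0).
assert abs(theta) < 1e-3 and abs(omega) < 1e-3
```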
This, of course, is not surprising. In fact, from experience it seems obvious that
from any initial position and velocity the pendulum will tend toward the down-
ward equilibrium state, except for a few starting states which tend toward the
vertically balanced position. To verify this physical conclusion mathematically
takes more work, however. We return to this question in Section 3.
Before leaving the pendulum we point out a paradox: the pendulum cannot come
to rest. That is, once it is in motion (not in equilibrium) it cannot reach an
equilibrium state, but only approach one arbitrarily closely. This follows from
uniqueness of solutions of differential equations! Of course, one knows that pendulums
actually do come to rest. One can argue that the pendulum is not “really” at rest,
but its motion is too small to observe. A better explanation is that the mathematical
model (3) of its motion is only an approximation to reality.
PROBLEMS
§2. Stability
FIG. A. Stability.
FIG. B. Asymptotic stability.
FIG. C. Instability.
equation
(2) x′ = Ax,
where A has pure imaginary eigenvalues. The orbits are all ellipses (Fig. D) .
x′ = f(x).
Then no eigenvalue of Df(x̄) has positive real part.
Similarly, for any b > 0 there exists a Euclidean norm on E₂ such that
(4) ⟨By, y⟩ < b|y|²  for all y ∈ E₂.
We choose b so that
0 < b < a.
We take the inner product on E = E₁ ⊕ E₂ to be the direct sum of these inner
products on E₁ and E₂; we also use the norms associated to these inner products
on E₁, E₂, E. If z = (x, y) ∈ E₁ ⊕ E₂, then |z| = (|x|² + |y|²)^{1/2}.
We shall use the Taylor expansion of f around 0:
f(x, y) = (Ax + R(x, y), By + S(x, y)) = (f₁(x, y), f₂(x, y)),
with
Q(z) = (R(x, y), S(x, y)),  lim_{z→0} |Q(z)|/|z| = 0.
Thus, given any ε > 0, there exists δ > 0 such that if U = B_δ(0) (the ball of
radius δ about 0),
(5) |Q(z)| ≤ ε|z| for z ∈ U.
We define the cone C = {(x, y) ∈ E₁ ⊕ E₂ | |x| ≥ |y|}.
This lemma yields our instability theorem as follows. We interpret first condi-
tion (a). Let g: E₁ × E₂ → R be defined by g(x, y) = ½(|x|² − |y|²). Then
g is C¹, g⁻¹[0, ∞) = C, and g⁻¹(0) is the boundary of C.
FIG. F
types of dynamical systems (the gradient systems of Section 4), almost every
state is in the basin of some sink; other states are “improbable” (they constitute
a set of measure 0). For such a system, the sinks represent the different types of
long term behavior.
It is often a matter of practical importance to determine the basin of a sink x̄.
For example, suppose x̄ represents some desired equilibrium state of a physical
system. The extent of the basin tells us how large a perturbation from equilibrium
we can allow and still be sure that the system will return to equilibrium.
We conclude this section by remarking that James Clerk Maxwell applied
stability theory to the study of the rings of the planet Saturn. He decided that
they must be composed of many small separate bodies, rather than being solid or
fluid, for only in the former case are there stable solutions of the equations of mo-
tion. He discovered that while solid or fluid rings were mathematically possible,
the slightest perturbation would destroy their configuration.
PROBLEMS
4. Show that the dynamical system in R², whose equations in polar coordinates
are
θ′ = 1,  r′ = r² sin(1/r) for r > 0;  r′ = 0 for r = 0,
has a stable equilibrium at the origin. (Hint: Every neighborhood of the
origin contains a solution curve encircling the origin.)
5. Let f: Rⁿ → Rⁿ be C¹ and suppose f(0) = 0. If some eigenvalue of Df(0) has
positive real part, there is a nonzero solution x(t), −∞ < t ≤ 0, to x′ = f(x),
such that lim_{t→−∞} x(t) = 0. (Hint: Use the instability theorem of Section 3 to
find a sequence of solutions xₙ(t), tₙ ≤ t ≤ 0, in B_δ(0) with |xₙ(0)| = δ and
lim_{n→∞} xₙ(tₙ) = 0.)
6. Let g: Rⁿ → Rⁿ be C¹ and suppose g(0) = 0. If some eigenvalue of Dg(0) has
negative real part, there is a solution x(t), 0 ≤ t < ∞, to x′ = g(x), such that
lim_{t→∞} x(t) = 0. (Hint: Compare previous problem.)
y′ = −x(z − 1),
z′ = −z³.
The z-axis (= {(x, y, z) | x = y = 0}) consists entirely of equilibrium points.
Let us investigate the origin for stability.
The linear part of the system at (0, 0, 0) is the matrix
There are two imaginary eigenvalues and one zero eigenvalue. All we can conclude
from this is that the origin is not a sink.
Let us look for a Liapunov function for (0, 0, 0) of the form V(x, y, z) = ax² +
by² + cz², with a, b, c > 0. For such a V,
V̇ = 2(axx′ + byy′ + czz′);
so
V̇ = 2(2axy(z − 1) − bxy(z − 1) − cz⁴).
du/dt = −grad Φ(x).
Let (x̄, ū) ∈ W₀ × Rⁿ be an equilibrium point. Then ū = 0 and grad Φ(x̄) = 0.
To investigate stability at (x̄, 0), we try to use the total energy
E(x, u) = ½m|u|² + mΦ(x)
to construct a Liapunov function. Since a Liapunov function must vanish at
(x̄, 0), we subtract from E(x, u) the energy of the state (x̄, 0), which is mΦ(x̄), and
define V: W₀ × Rⁿ → R by
V(x, u) = E(x, u) − E(x̄, 0)
= ½m|u|² + mΦ(x) − mΦ(x̄).
By conservation of energy, V̇ = 0. Since ½m|u|² ≥ 0, we assume Φ(x) > Φ(x̄) for
x near x̄, x ≠ x̄, in order to make V a Liapunov function. Therefore we have proved
the well-known theorem of Lagrange: an equilibrium (x̄, 0) of a conservative force
field is stable if the potential energy has a local absolute minimum at x̄.
Proof of Liapunov's theorem. Let δ > 0 be so small that the closed ball
B_δ(x̄) around x̄ of radius δ lies entirely in U. Let α be the minimum value of V
on the boundary of B_δ(x̄), that is, on the sphere S_δ(x̄) of radius δ and center x̄.
Then α > 0 by (a). Let U₁ = {x ∈ B_δ(x̄) | V(x) < α}. Then no solution starting
in U₁ can meet S_δ(x̄) since V is nonincreasing on solution curves. Hence every
solution starting in U₁ never leaves B_δ(x̄). This proves x̄ is stable. Now assume
(c) holds as well, so that V is strictly decreasing on orbits in U − x̄. Let x(t) be a
solution starting in U₁ − x̄ and suppose x(tₙ) → z₀ ∈ B_δ(x̄) for some sequence
tₙ → ∞; such a sequence exists by compactness of B_δ(x̄). We assert z₀ = x̄. To see
this, observe that V(x(t)) > V(z₀) for all t ≥ 0 since V(x(t)) decreases and
V(x(tₙ)) → V(z₀) by continuity of V. If z₀ ≠ x̄, let z(t) be the solution starting
at z₀. For any s > 0, we have V(z(s)) < V(z₀). Hence for any solution y(s) starting
§3. LIAPUNOV FUNCTIONS
E(θ(t₀), ω(t₀)) = E(±π, ω(t₀))
= ½ml²ω(t₀)² + 2ml
≥ 2ml.
But
E(θ(t₀), ω(t₀)) ≤ c < 2ml.
This contradiction shows that |θ(t)| < π for all t ≥ 0, and so P_c is positively invariant.
We assert that P_c fulfills the second condition of Theorem 2. For suppose E is
constant on a trajectory. Then, along that trajectory, Ė = 0 and so ω = 0. Hence,
from (3) of Section 1, θ′ = 0, so θ is constant on the orbit, and also sin θ = 0. Since
|θ| < π, it follows that θ = 0. Thus the only entire orbit in P_c on which E is con-
stant is the equilibrium orbit (0, 0).
Finally, P_c is a closed set. For if (θ₀, ω₀) is a limit point of P_c, then |θ₀| ≤ π,
and E(θ₀, ω₀) ≤ c by continuity of E. But |θ₀| = π implies E(θ₀, ω₀) > c. Hence
|θ₀| < π and so (θ₀, ω₀) ∈ P_c.
From Theorem 2 we conclude that each P_c ⊂ B(0, 0); hence the set
P = ∪{P_c | 0 < c < 2ml}
is contained in B(0, 0). Note that
P = {(θ, ω) | E(θ, ω) < 2ml and |θ| < π}.
This result is quite natural on physical grounds. For 2ml is the total energy of
the state (π, 0) where the bob of the pendulum is balanced above the pivot. Thus
if the pendulum is not pointing straight up, and the total energy is less than the
total energy of the balanced upward state, then the pendulum will gradually
approach the state (0,O).
There will be other states in the basin of (0, 0) that are not in the set P. Con-
sider a state (π, u), where u is very small but not zero. Then (π, u) ∉ P, but the
pendulum moves immediately into a state in P, and therefore approaches (0, 0).
Hence (π, u) ∈ B(0, 0). See Exercises 5 and 6 for other examples.
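The role of E as a Liapunov function on P can be checked numerically. A minimal sketch, assuming illustrative values m = l = k = 1 (not from the text), so that E(θ, ω) = ½ω² + (1 − cos θ) and dE/dt = −ω² along trajectories:

```python
import math

def energy(theta, omega):
    # E(theta, omega) with m = l = 1
    return 0.5 * omega ** 2 + (1.0 - math.cos(theta))

theta, omega, h = 2.5, 0.0, 1e-3     # |theta| < pi and E < 2ml = 2: a state in P
energies = []
for _ in range(40000):               # integrate to t = 40 by Euler steps
    energies.append(energy(theta, omega))
    theta, omega = theta + h * omega, omega + h * (-omega - math.sin(theta))

# E is nonincreasing (up to tiny integration error), and the state sinks to (0, 0).
assert all(b <= a + 1e-6 for a, b in zip(energies, energies[1:]))
assert energy(theta, omega) < 1e-3
```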
The set L defined above is called the set of ω-limit points, or the ω-limit set, of
the trajectory x(t) (or of any point on the trajectory). Similarly, we define the
set of α-limit points, or the α-limit set, of a trajectory y(t) to be the set of all points
b such that lim_{n→∞} y(tₙ) = b for some sequence tₙ → −∞. (The reason, such as
it is, for this terminology is that α is the first letter and ω the last letter of the
Greek alphabet.) We will make extensive use of these concepts in Chapter 11.
A set A in the domain W of a dynamical system is invariant if for every x ∈ A,
φ_t(x) is defined and in A for all t ∈ R. The following facts, essentially proved in
the proof of Theorem 2, will be used in Chapter 11.
Proposition The α-limit set and the ω-limit set of a trajectory which is defined for all
t ∈ R are closed invariant sets.
PROBLEMS
The total energy E is a Liapunov function for the corresponding first order
system
x′ = y,
y′ = −g(x);
E is kinetic energy plus potential energy, and the potential energy at x ∈ R
is the work required to move the mass from 0 to x.)
4. In Problem 3 suppose also that there is a frictional force opposing the motion,
of the form −f(x)v, f(x) ≥ 0, where v is the velocity, and x the position of the
particle. If f⁻¹(0) = {0}, then (0, 0) is asymptotically stable, and in fact every
trajectory tends toward (0, 0).
5. Sketch the phase portraits of
(a) the pendulum with friction (see also Problem 6) ;
(b) the pendulum without friction.
6. (a) For the frictional pendulum, show that for every integer n and every
angle θ₀, there is an initial state (θ₀, ω₀) whose trajectory tends toward
(0, 0), and which travels n times, but not n + 1 times, around the circle.
(b) Discuss the set of trajectories tending toward the equilibrium (π, 0).
7. Prove the following instability theorem: Let V be a C¹ real-valued function
defined on a neighborhood U of an equilibrium x̄ of a dynamical system.
Suppose V(x̄) = 0 and V̇ > 0 in U − x̄. If V(xₙ) > 0 for some sequence
xₙ → x̄, then x̄ is unstable.
8. Let V be a strict Liapunov function for an equilibrium x̄ of a dynamical system.
Let c > 0 be such that V⁻¹[0, c] is compact and contains no other equilibrium.
Then V⁻¹[0, c] ⊂ B(x̄).
grad V = (∂V/∂x₁, . . . , ∂V/∂xₙ)
is the gradient vector field
grad V: U → Rⁿ.
Note by (2) that the nonregular or critical points of V are precisely the equi-
librium points of the system ( 1 ) .
Since the trajectories of the gradient system ( 1 ) are tangent to -grad V ( x ) ,
we have the following geometric description of the flow of a gradient system:
Theorem 3 Let
x′ = −grad V(x)
be a gradient system. At regular points the trajectories cross level surfaces orthogonally.
Nonregular points are equilibria of the system. Isolated minima are asymptotically
stable.
or
dx/dt = −2x(x − 1)(2x − 1),
dy/dt = −2y.
The study of this differential equation starts with the equilibria. These are
found by setting the right-hand sides equal to 0, or
−2x(x − 1)(2x − 1) = 0,  −2y = 0.
;1
We obtain precisely three equilibria: ZI = (0, 0),ZII = ($, 0),ZIII = (1,O). To
check their stability properties, we compute the derivative Df(z)which in co-
ordinates is
r 1
(-24% - 1 ) ( 2 x - 1))
d
-
dY
(-2Y)
9. STABILITY OF EQUILIBRIA
-
,---
We conclude from the main result on nonlinear sinks that 21, ZIII are sinks while ZII
is a saddle. By the theorem of Section 2, ZII is not a stable equilibrium.
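These derivative computations can be verified directly; a minimal sketch:

```python
# x' = -2x(x - 1)(2x - 1), y' = -2y: Df is diagonal, with x-entry
# d/dx(-2x(x - 1)(2x - 1)) = -2(6x^2 - 6x + 1) and y-entry -2.
def dfx(x):
    return -2.0 * (6.0 * x * x - 6.0 * x + 1.0)

assert dfx(0.0) == -2.0      # z_I:   both entries negative, a sink
assert dfx(1.0) == -2.0      # z_III: both entries negative, a sink
assert dfx(0.5) == 1.0       # z_II:  entries of opposite sign, a saddle
```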
The graph of V looks like that in Fig. A. The curves on the graph represent
intersections with horizontal planes. The level "surfaces" (curves, in this case)
look like those in Fig. B. Level curves of V(x, y) = x²(x − 1)² + y² and the phase
portrait of (x′, y′) = −grad V(x, y), superimposed on Fig. B, look like Fig. C.
The level curve shaped like a reclining figure eight is V⁻¹(1/16).
FIG. C. Level curves of V(x, y) and gradient lines of (x′, y′) = −grad V(x, y).
More information about a gradient flow is given by:
PROBLEMS
1. For each of the following functions V(u), sketch the phase portrait of the gradi-
ent flow u′ = −grad V(u). Identify the equilibria and classify them as to
stability or instability. Sketch the level surfaces of V on the same diagram.
(a) x² + 2y²
(b) x² − y² − 2x + 4y + 5
(c) y sin x
(d) 2x² − 2xy + 5y² + 4x + 4y + 4
(e) x² + y² − x
(f) x²(x − 1) + y²(y − 2) + z²
2. Suppose a dynamical system is given. A trajectory x(t), 0 ≤ t < ∞, is called
recurrent if x(tₙ) → x(0) for some sequence tₙ → ∞. Prove that a gradient
dynamical system has no nonconstant recurrent trajectories.
3. Let V: E → R be C² and suppose V⁻¹(−∞, c] is compact for every c ∈ R.
Suppose also DV(x) ≠ 0 except for a finite number of points p₁, . . . , pₙ. Prove:
(a) Every solution x(t) of x′ = −grad V(x) is defined for all t ≥ 0;
(b) lim_{t→∞} x(t) exists and equals one of the equilibrium points p₁, . . . , pₙ,
for every solution x(t).
Since E and E* have the same dimension, say n, E* has a basis of n elements.
If {e₁, . . . , eₙ} = ℬ is a basis for E, they determine a basis {e₁*, . . . , eₙ*} = ℬ*
§5. GRADIENTS AND INNER PRODUCTS
for E* by defining
eᵢ*: E → R,
eᵢ*(Σⱼ λⱼeⱼ) = λᵢ.
We now prove some results of the preceding section concerning the differential
equation
(2) x′ = −grad V(x),
using our new definition of grad V.
The dual vector space is also used to study linear operators. We define the
adjoint of an operator
T: E → E
(where E has some fixed inner product) to be the operator
T*: E → E
defined by the equality
⟨Tx, y⟩ = ⟨x, T*y⟩
for all x, y in E. To make sense of this, first keep y fixed and note that the map
x ↦ ⟨Tx, y⟩ is a linear map E → R; hence it defines an element λ(y) ∈ E*. We
define
T*y = Φ⁻¹λ(y),
where
Φ: E → E*
is the isomorphism defined earlier. It is easy to see that T* is linear.
If ℬ is an orthonormal basis for E, that is, ℬ = {e₁, . . . , eₙ} and
⟨eᵢ, eⱼ⟩ = δᵢⱼ,
then the ℬ-matrix of T* turns out to be the transpose of the ℬ-matrix for T, as is
easily verified.
Theorem 3 Let E be a real vector space with an inner product and let T be a self-
adjoint operator on E. Then the eigenvalues of T are real.
Proof. Let E_C be the complexification of E. We extend ⟨ , ⟩ to a function
E_C × E_C → C as follows. If x + iy and u + iv are in E_C, define
⟨x + iy, u + iv⟩ = ⟨x, u⟩ + i(⟨y, u⟩ − ⟨x, v⟩) + ⟨y, v⟩.
It is easy to verify the following for all a, b ∈ E_C, λ ∈ C:
(3) ⟨a, a⟩ > 0 if a ≠ 0,
(4) λ⟨a, b⟩ = ⟨λa, b⟩ = ⟨a, λ̄b⟩,
where λ̄ denotes the complex conjugate.
Let T_C: E_C → E_C be the complexification of T; thus T_C(x + iy) = Tx + i(Ty).
Let (T*)_C be the complexification of T*. It is easy to verify that
(5) ⟨T_C a, b⟩ = ⟨a, (T*)_C b⟩.
(This is true even if T is not self-adjoint.)
Suppose λ ∈ C is an eigenvalue for T_C and a ∈ E_C an eigenvector for λ; then
T_C a = λa.
By (5),
⟨T_C a, a⟩ = ⟨a, (T*)_C a⟩ = ⟨a, T_C a⟩,
since T* = T. Hence
⟨λa, a⟩ = ⟨a, λa⟩.
But, by (4),
λ⟨a, a⟩ = ⟨λa, a⟩,
while
λ̄⟨a, a⟩ = ⟨a, λa⟩;
so, by (3), λ = λ̄ and λ is real.
For simplicity we assume the vector space is Rⁿ, equipped with the usual inner
product. Let x̄ be an equilibrium of the system
x′ = −grad V(x).
The operator
Df(x̄)
has the matrix
[−∂²V/∂xᵢ∂xⱼ (x̄)].
This theorem is also true for gradients defined by arbitrary inner products.
For example, a gradient system in the plane cannot have spirals or centers at
equilibria. In fact, neither can it have improper nodes because of:
Theorem 5 Let E be a real vector space with an inner product. Then any self-
adjoint operator on E can be diagonalized.
Proof. Let T: E → E be self-adjoint. Since the eigenvalues of T are real, there
is a nonzero vector e₁ ∈ E such that Te₁ = λ₁e₁, λ₁ ∈ R. Let
E₁ = {x ∈ E | ⟨x, e₁⟩ = 0}.
If x ∈ E₁, then ⟨Tx, e₁⟩ = ⟨x, Te₁⟩ = λ₁⟨x, e₁⟩ = 0.
Hence T leaves E₁ invariant. Give E₁ the same inner product as E; then the operator
T₁ = T | E₁ ∈ L(E₁)
is self-adjoint. In the same way we find a nonzero vector e₂ ∈ E₁ such that
Te₂ = λ₂e₂, λ₂ ∈ R.
Note that e₁ and e₂ are independent, since ⟨e₁, e₂⟩ = 0. Continuing in this way, we
find a maximal independent set ℬ = {e₁, . . . , eₙ} of eigenvectors of T. These must
span E, otherwise we could enlarge the set by looking at the restriction of T to
the subspace orthogonal to e₁, . . . , eₙ. In this basis ℬ, T is diagonal.
.
We have actually proved more. Note that er, . . , en are mutually orthogonal;
and we can take them to have norm 1. Therefore a 8eZj-adjoint operatm (or a 8ym-
metric matrix) can be diagonalized by an orthonormal barris.
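Numerically, this orthonormal diagonalization is what numpy.linalg.eigh computes for a symmetric matrix. A minimal sketch (the matrix is an arbitrary symmetric example, not from the text):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # symmetric, i.e. self-adjoint
eigvals, Q = np.linalg.eigh(A)           # columns of Q are the basis e_1, ..., e_n

assert np.allclose(Q.T @ Q, np.eye(3))              # the basis is orthonormal
assert np.allclose(Q.T @ A @ Q, np.diag(eigvals))   # T is diagonal in this basis
assert eigvals.dtype == np.float64                  # the eigenvalues are real
```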
For gradient systems we have proved:
Theorem 6 At an equilibrium of a gradient flow the linear part of the vector field
is diagonalizable by an orthonormal basis.
PROBLEMS
1. Find an orthonormal diagonalizing basis for each of the following operators:
Notes
First a simple but very basic circuit example is described and the differential
equations governing the circuit are derived. Our derivation is done in such a way
that the ideas extend to general circuit equations. That is why we are so careful
to make the maps explicit and to describe precisely the sets of states obeying
physical laws. This is in contrast to the more typical ad hoc approach to nonlinear
circuit theory.
The equations for this example are analyzed from the purely mathematical
point of view in the next three sections; these are the classical equations of Lienard
and Van der Pol. In particular Van der Pol’s equation could perhaps be regarded
as the fundamental example of a nonlinear ordinary differential equation. It
possesses an oscillation or periodic solution that is a periodic attractor. Every
nontrivial solution tends to this periodic solution; no linear flow can have this
property. On the other hand, for a periodic solution to be viable in applied mathe-
matics, this or some related stability property must be satisfied.
The construction of the phase portrait of Van der Pol in Section 3 involves
some nontrivial mathematical arguments and many readers may wish to skip or
postpone this part of the book. On the other hand, the methods have some wider
use in studying phase portraits.
Asymptotically stable equilibria connote death in a system, while attracting
oscillators connote life. We give an example in Section 4 of a continuous transition
from one to the other.
In Section 5 we give an introduction to the mathematical foundations of elec-
trical circuit theory, especially oriented toward the analysis of nonlinear circuits.
FIG. A
212 10. DIFFERENTIAL EQUATIONS FOR ELECTRICAL CIRCUITS
flowing into a node is equal to the total current flowing out of that node. (Think of
the water analogy to make this plausible.) For our circuit this is equivalent to
KCL: i_R = i_L = −i_C.
FIG. B
§1. AN RLC CIRCUIT
(Faraday’s law)
FIG. C
On the other hand, the capacitor (which may be thought of as two metal plates
separated by some insulator; in the water model it is a tank) imposes the condition
dy/dt = −x.
These equations are analyzed in the following section.
§2. ANALYSIS OF THE CIRCUIT EQUATIONS
PROBLEMS
1. Find the differential equations for the network in Fig. D, where the resistor is
voltage controlled, that is, the resistor characteristic is the graph of a C¹ func-
tion g: R → R, g(v_R) = i_R.
FIG. D
2. Show that the LC circuit consisting of one inductor and one capacitor wired
in a closed loop oscillates.
Here we begin a study of the phase portrait of the planar differential equation
derived from the circuit of the previous section, namely:
(1) dx/dt = y − f(x),
    dy/dt = −x.
This is one form of Lienard's equation. If f(x) = x³ − x, then (1) is a form of
Van der Pol's equation.
First consider the simplest case of linear f (an ordinary resistor as in Section 1).
Let f(x) = Kx, K > 0. Then (1) takes the form
dx/dt = −Kx + y,
dy/dt = −x.
The origin is the unique equilibrium, and it is
in fact a sink. Every state tends to zero; physically this is the dissipative effect of
the resistor. Furthermore, one can see that (0,0) will be a spiral sink precisely
when K < 2.
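The dividing value K = 2 comes from the characteristic equation of the linear system, λ² + Kλ + 1 = 0, whose roots are complex exactly when K < 2 and always have negative real parts for K > 0. A minimal numerical check:

```python
import cmath

def eigenvalues(K):
    # roots of lambda^2 + K*lambda + 1 = 0
    d = cmath.sqrt(K * K - 4.0)
    return (-K + d) / 2.0, (-K - d) / 2.0

for K in (0.5, 1.0, 1.9):                  # spiral sinks
    for lam in eigenvalues(K):
        assert lam.imag != 0 and lam.real < 0
for K in (2.0, 3.0):                       # sinks, but no spiraling
    for lam in eigenvalues(K):
        assert lam.imag == 0 and lam.real < 0
```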
Next we consider the equilibria of (1) for a general C’function f.
There is in fact a unique equilibrium z̄ of (1), obtained by setting
y − f(x) = 0,  −x = 0,
to obtain
z̄ = (0, f(0)).
§3. VAN DER POL'S EQUATION
FIG. A
The goal here is to continue the study of Lienard's equation for a certain func-
tion f:
(1) dx/dt = y − f(x),  f(x) = x³ − x,
    dy/dt = −x.
FIG. A
(2) dx/dt = y − (x³ − x),
    dy/dt = −x.
In this case we can give a fairly complete phase portrait analysis.
Theorem There is one nontrivial periodic solution of (1) and every nonequilibrium
solution tends to this periodic solution. "The system oscillates."
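The theorem can be illustrated numerically: trajectories started both inside and far outside the cycle settle onto the same oscillation. A minimal sketch using a Runge-Kutta integrator (the step size, time span, and starting points are assumptions, not from the text):

```python
def f(s):
    x, y = s
    return (y - (x ** 3 - x), -x)      # Van der Pol in Lienard form

def rk4_step(s, h):
    k1 = f(s)
    k2 = f(tuple(a + h / 2 * b for a, b in zip(s, k1)))
    k3 = f(tuple(a + h / 2 * b for a, b in zip(s, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(s, k3)))
    return tuple(a + h / 6 * (p + 2 * q + 2 * r + w)
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))

def late_amplitude(s, h=0.01, steps=20000, window=5000):
    """max |x| over the tail of the trajectory, after transients die out."""
    xs = []
    for i in range(steps):
        s = rk4_step(s, h)
        if i >= steps - window:
            xs.append(abs(s[0]))
    return max(xs)

a_inside = late_amplitude((0.01, 0.0))   # starts near the source at the origin
a_outside = late_amplitude((3.0, 3.0))   # starts far outside the cycle
assert abs(a_inside - a_outside) < 1e-3  # both settle on the same periodic orbit
```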
We know from the previous section that (2) has a unique equilibrium at (0, 0),
and it is a source. The next step is to show that every nonequilibrium solution
"rotates" in a certain sense around the equilibrium in a clockwise direction. To
this end we divide the (x, y) plane into four disjoint regions (open sets) A, B,
C, D in Fig. A. These regions make up the complement of the curves
(3) y − f(x) = 0,
    −x = 0.
These curves (3) thus form the boundaries of the four regions. Let us make this
more precise. Define four curves
v⁺ = {(x, y) | y > 0, x = 0},
v⁻ = {(x, y) | y < 0, x = 0},
g⁺ = {(x, y) | x > 0, y = x³ − x},
g⁻ = {(x, y) | x < 0, y = x³ − x}.
These curves are disjoint; together with the origin they form the boundaries of
the four regions.
Next we see how the vector field (x′, y′) of (1) behaves on the boundary curves.
It is clear that y′ = 0 at (0, 0) and on v⁺ ∪ v⁻, and nowhere else; and x′ = 0 exactly
on g⁺ ∪ g⁻ ∪ {(0, 0)}. Furthermore the vector (x′, y′) is horizontal on v⁺ ∪ v⁻ and
points right on v⁺, and left on v⁻ (Fig. B). And (x′, y′) is vertical on g⁺ ∪ g⁻, point-
ing downward on g⁺ and upward on g⁻. In each region A, B, C, D the signs of
x′ and y′ are constant. Thus in A, for example, we have x′ > 0, y′ < 0, and so the
vector field always points into the fourth quadrant.
The next part of our analysis concerns the nature of the flow in the interior of
the regions. Figure B suggests that trajectories spiral around the origin clockwise.
The next two propositions make this precise.
FIG. B
FIG. C
Proposition 2 Every trajectory is defined for (at least) all t ≥ 0. Except for (0, 0),
each trajectory repeatedly crosses the curves v⁺, g⁺, v⁻, g⁻, in clockwise order, passing
among the regions A, B, C, D in clockwise order.
To analyze further the flow of the Van der Pol oscillator we define a map
σ: v⁺ → v⁺
as follows. Let p ∈ v⁺; the solution curve t ↦ φ_t(p) through p is defined for all
t ≥ 0. There will be a smallest t₁(p) = t₁ > 0 such that φ_{t₁}(p) ∈ v⁺. We put σ(p) =
φ_{t₁}(p). Thus σ(p) is the first point after p on the trajectory of p (for t > 0) which
is again on v⁺ (Fig. E). The map p ↦ t₁(p) is continuous; while this should be
intuitively clear, it follows rigorously from Chapter 11. Hence σ is also continuous.
Note that σ is one to one by uniqueness of solutions.
The importance of this section map σ: v⁺ → v⁺ comes from its intimate relation-
ship to the phase portrait of the flow. For example:
σ(p) = φ_{t₁}(p).
FIG. F. The map α: v⁺ → v⁻.
See Fig. F. The map α is also one to one by uniqueness of solutions and thus mono-
tone.
Using the methods in the proof of Proposition 1 it can be shown that there is a
unique point p₀ ∈ v⁺ such that the solution curve
{φ_t(p₀) | 0 ≤ t ≤ t₁(p₀)}
intersects the curve g⁺ at the point (1, 0) where g⁺ meets the x-axis. Let r = |p₀|.
Define a continuous map
δ: v⁺ → R,
δ(p) = W(α(p)) − W(p).
FIG. G
Since σ has only one fixed point, q₁ = q₀. This shows that the trajectory of p spirals
toward γ as t → ∞. The same thing is true if p < q₀; the details are left to the
reader. Since every trajectory except (0, 0) meets v⁺, the proof of the main theorem
is complete.
It remains to prove Proposition 4.
We adopt the following notation. Let γ: [a, b] → R² be a C¹ curve in the plane,
written γ(t) = (x(t), y(t)). If F: R² → R is C¹, define
hence
Similarly if y′(t) ≠ 0.
Recall the function
W(x, y) = ½(x² + y²).
Let γ(t) = (x(t), y(t)), 0 ≤ t ≤ t₂ = t₂(p), be the solution curve joining p ∈ v⁺
to α(p) ∈ v⁻. By definition δ(p) = W(x(t₂), y(t₂)) − W(x(0), y(0)). Thus
δ(p) = ∫₀^{t₂} −x(t)(x(t)³ − x(t)) dt;
hence
δ(p) = ∫₀^{t₂} x(t)²(1 − x(t)²) dt.
This immediately proves (a) of Proposition 4 because the integrand is positive for
0 < x(t) < 1.
We may rewrite the last equality as
δ(p) = δ₁(p) + δ₂(p) + δ₃(p),  where  δᵢ(p) = ∫_{γᵢ} x²(1 − x²),  i = 1, 2, 3.
FIG. H
PROBLEMS
y′ = −x.
2. Give a proof of Proposition 2.
3. (Hartman [9, Chapter 7, Theorem 10.2]) Find the phase portrait of the
following differential equation and in particular show there is a unique non-
trivial periodic solution:
x′ = y − f(x),
y′ = −g(x),
where all of the following are assumed:
(i) f, g are C¹;
(ii) g(−x) = −g(x) and xg(x) > 0 for all x ≠ 0;
(iii) f(−x) = −f(x) and f(x) < 0 for 0 < x < a;
(iv) for x > a, f(x) is positive and increasing;
(v) f(x) → ∞ as x → ∞.
(Hint: Imitate the proof of the theorem in Section 3.)
4. (Hard!) Consider the equation
x′ = y − f(x),  f: R → R, C¹,
y′ = −x.
Given f, how many periodic solutions does this system have? This would be
interesting to know for many broad classes of functions f. Good results on this
would probably make an interesting research article.
FIG. I
§4. HOPF BIFURCATION
dy/dt = −x.
Consider as an example the special case where f_μ is described by
(2a) f_μ(x) = x³ − μx.
Then we apply the results of Sections 2 and 3 to see what happens as μ is varied
from −1 to 1.
For each μ, −1 ≤ μ ≤ 0, the resistor is passive and the proposition of Section 2
implies that all solutions tend asymptotically to zero as t → ∞. Physically the
circuit is dead, in that after a period of transition all the currents and voltages
FIG. A. Bifurcation: −1 ≤ μ ≤ 0 (left) and 0 < μ ≤ 1 (right).
stay at 0 (or as close to 0 as we want). But note that as μ crosses 0, the circuit
becomes alive. It will begin to oscillate. This follows from the fact that the analysis
of Section 3 applies to (2) when 0 < μ ≤ 1; in this case (2) will have a unique
periodic solution γ_μ and the origin becomes a source. In fact every nontrivial
solution tends to γ_μ as t → ∞. Further elaboration of the ideas in Section 3 can be
used to show that γ_μ → 0 as μ → 0, μ > 0.
For (2), μ = 0 is the bifurcation value of the parameter. The basic structure of
the phase portrait changes as μ passes through the value 0. See Fig. A.
The mathematician E. Hopf proved that for fairly general one-parameter families
of equations 2' = f,,(2), there must be a closed orbit for p > if the eigenvalue
character of an equilibrium changes suddenly at po from a sink to a source.
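The bifurcation in (2) can be observed numerically. The sketch below is not from the text: it integrates x' = y − (x³ − μx), y' = −x with a crude Euler scheme, and the step size, time span, initial state, and sample values μ = ±0.5 are arbitrary choices.

```python
# Numerical sketch of the Hopf bifurcation in equation (2):
#   x' = y - f_mu(x),  y' = -x,  with  f_mu(x) = x**3 - mu*x.
# For mu < 0 every solution should spiral into the origin; for mu > 0 a
# small initial state should settle onto the periodic solution gamma_mu.

def final_radius(mu, x=0.1, y=0.1, dt=1e-3, steps=100_000):
    """Euler-integrate to t = steps*dt and return the final distance to 0."""
    for _ in range(steps):
        # both updates use the old (x, y): the tuple RHS is evaluated first
        x, y = x + dt * (y - (x**3 - mu * x)), y - dt * x
    return (x * x + y * y) ** 0.5

# mu = -0.5: the "dead" circuit -- the state decays to the origin.
# mu = +0.5: the "live" circuit -- the state ends up on a cycle of
# definite amplitude, however small the start.
```

For μ = −0.5 the returned radius is essentially zero, while for μ = +0.5 it is bounded away from zero; repeating the experiment with smaller positive μ shows the cycle's amplitude shrinking, consistent with γ_μ → 0.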
PROBLEMS
1. Find all values of μ which are the bifurcation points for the linear differential
equation:
dx/dt = μx + y,
dy/dt = x − 2y.
2. Prove the statement in the text that γ_μ → 0 as μ → 0, μ > 0.
We give here a way of finding the ordinary differential equations for a class of
electrical networks or circuits. We consider networks made up of resistors, capaci-
tors, and inductors. Later we discuss briefly the nature of these objects, called the
branches of the circuit; at present it suffices to consider them as devices with two
§5. MORE GENERAL CIRCUIT EQUATIONS
∑_{β⁺=α} i_β − ∑_{β⁻=α} i_β = 0.
This last is just the expression of KCL at the node α. This proves the theorem.
a linear map P: 𝒟 → R by
P(x₁, . . . , x_n) = ∑_{i=1}^{n} x_i V(α_i).
Thus P ∈ 𝒟*.
To see that d*P = v, consider first the current state i_β ∈ 𝒥 defined above just
before Theorem 2. Then
(d*P)(i_β) = P(di_β)
= V(β⁺) − V(β⁻)
= v(β).
Since the states i_β, β ∈ B, form a basis for 𝒥, this shows that v = d*P. Hence v is in
the image of d*.
Conversely, assume that v = d*W, W ∈ 𝒟*. For the kth node α define V(α) =
W(f_α), where f_α ∈ 𝒟 has kth coordinate 1 and all other coordinates 0. Then V is
a voltage potential for v since the voltage which v assigns to the branch β is
v(β) = d*W(i_β)
= W(f_{β⁺}) − W(f_{β⁻})
= V(β⁺) − V(β⁻).
This completes the proof of Theorem 2.
The space of unrestricted states of the circuit is the Cartesian space 𝒥 × 𝒥*. Those
states which satisfy KCL and KVL constitute a linear subspace K ⊂ 𝒥 × 𝒥*. By
Theorems 1 and 2,
K = Ker d × Im d* ⊂ 𝒥 × 𝒥*.
An actual or physical state of the network must lie in K.
The power φ in a network is a real function defined on the big state space 𝒥 × 𝒥*
and in fact is just the natural pairing discussed earlier. Thus if (i, v) ∈ 𝒥 × 𝒥*,
the power φ(i, v) = v(i), or in terms of Cartesian coordinates
φ(i, v) = ∑_β i_β v_β,
Here L_λ is determined by the inductor and is called the inductance. It is assumed
to be a C¹ positive function of i_λ.
Similarly a capacitor in the γth branch defines a C¹ positive function v_γ ↦ C_γ(v_γ)
called the capacitance; the current and voltage in the γth branch satisfy
C_γ(v_γ) dv_γ/dt = i_γ.
We now examine the resistor conditions more carefully. These are conditions on
the states themselves and have an effect similar to Kirchhoff's laws in that they
place physical restrictions on the space of all states, 𝒥 × 𝒥*. We define Σ to be the
subset of 𝒥 × 𝒥* consisting of states that satisfy the two Kirchhoff laws and the
resistor conditions. This space Σ is called the space of physical states and is de-
scribed by
Σ = {(i, v) ∈ 𝒥 × 𝒥* | (i, v) ∈ K, f_ρ(i_ρ) = v_ρ, ρ = 1, . . . , r}.
Here (i_ρ, v_ρ) denotes the components of i, v in the ρth branch and ρ varies over
the resistor branches, r in number.
Under rather generic conditions, Σ will be a manifold, that is, the higher dimen-
sional analog of a surface. Differential equations can be defined on manifolds; the
capacitors and inductors in our circuit will determine differential equations on Σ
whose corresponding flow φ_t: Σ → Σ describes how a state changes with time.
Because we do not have at our disposal the notions of differentiable manifolds,
we will make a simplifying assumption before proceeding to the differential equa-
tions of the circuit. This is the assumption that the space of currents in the in-
ductors and voltages in the capacitors may be used to give coordinates to Σ. We
make this more precise.
Let ℒ be the space of all currents in the inductor branches, so that ℒ is naturally
isomorphic to R^l, where l is the number of inductors. A point i of ℒ will be denoted
by i = (i₁, . . . , i_l), where i_λ is the current in the λth branch. There is a natural
map (a projection) π_ℒ: 𝒥 → ℒ which just sends a current state into its components
in the inductors.
Similarly we let 𝒞* be the space of all voltages in the capacitor branches, so
that 𝒞* is isomorphic to R^c, where c is the number of capacitors. Also π_𝒞: 𝒥* → 𝒞*
will denote the corresponding projection.
Consider the map π_ℒ × π_𝒞: 𝒥 × 𝒥* → ℒ × 𝒞* restricted to Σ ⊂ 𝒥 × 𝒥*. Call
this map π: Σ → ℒ × 𝒞*. (It will help in following this rather abstract presentation
to follow it along with the example in Section 1.)
Under this hypothesis, we may identify the space of physical states of the net-
work with the space ℒ × 𝒞*. This is convenient because, as we shall see, the dif-
ferential equations of the circuit have a simple formulation on ℒ × 𝒞*. In words
the hypothesis may be stated: the currents in the inductors and the voltages in
the capacitors, via Kirchhoff's laws and the laws of the resistor characteristics,
determine the currents and voltages in all the branches.
Although this hypothesis is strong, it makes some sense when one realizes that
the "dimension" of Σ should be expected to be the same as the dimension of
ℒ × 𝒞*. This follows from the remark after the proposition on dim K, and the fact
that Σ is defined by r additional equations.
To state the equations in this case we define a function P: 𝒥 × 𝒥* → R called
the mixed potential. We will follow the convention that indices ρ refer to resistor
branches and sums over such ρ mean summation over the resistor branches.
Similarly λ is used for inductor branches and γ for capacitor branches. Then
P: 𝒥 × 𝒥* → R is defined by
P(i, v) = ∑_γ i_γ v_γ + ∑_ρ ∫ f_ρ(i_ρ) di_ρ.
Here the integral refers to the indefinite integral, so that P is defined only up to an
arbitrary constant. Now P by restriction may be considered as a map P: Σ → R
and finally by our hypothesis may even be considered as a map
P: ℒ × 𝒞* → R.
(By an "abuse of language" we use the same letter P for all three maps.)
Now assume we have a particular circuit of the type we have been considering.
At a given instant t₀ the circuit is in a particular current-voltage state. The states
will change as time goes on. In this way a curve in 𝒥 × 𝒥* is obtained, depending
on the initial state of the circuit.
C_γ(v_γ) dv_γ/dt = ∂P/∂v_γ,
where λ and γ run through all inductors and capacitors of the circuit, respectively.
Conversely, every solution curve to these equations is a physical trajectory.
∑_{β∈B} v_β(t) i_β(t) = 0.
We rewrite this as
∑_ρ v_ρ i_ρ + ∑_λ v_λ i_λ + ∑_γ v_γ i_γ = 0.
From Leibniz' rule we get
∑_γ v_γ i_γ' = (∑_γ v_γ i_γ)' − ∑_γ i_γ v_γ'.
Substituting this into the preceding equation gives
from the definition of P and the generalized Ohm's laws. By the chain rule
dP/dt = ∑_λ (∂P/∂i_λ) i_λ' + ∑_γ (∂P/∂v_γ) v_γ'.
From the last two equations we find
Some remarks on this theorem are in order. First, one can follow this develop-
ment for the example of Section 1 to bring the generality of the above down to
earth. Secondly, note that if there are either no inductors or no capacitors, the
Brayton-Moser equations have many features of gradient equations and much of
the material of Chapter 9 can be applied; see Problem 9. In the more general case
the equations have the character of a gradient with respect to an indefinite metric.
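As an illustration of this gradient-like character, here is a sketch, not an example from the text, for a hypothetical circuit with a single inductor, a single capacitor, and one nonlinear resistor whose characteristic is assumed to be f(i) = i³ − i. For that configuration the mixed potential reduces to P(i, v) = iv + ∫f(i) di, and the equations take the form L di/dt = −∂P/∂i, C dv/dt = ∂P/∂v.

```python
# Sketch of the Brayton-Moser equations for an assumed one-loop circuit:
# mixed potential P(i, v) = i*v + F(i), with F'(i) = f(i) = i**3 - i.
#   L di/dt = -dP/di = -(v + f(i))
#   C dv/dt =  dP/dv = i

L_IND = 1.0   # inductance (assumed constant)
CAP = 1.0     # capacitance (assumed constant)

def f(i):
    """Resistor characteristic; "active" near i = 0 (an assumption)."""
    return i**3 - i

def step(i, v, dt):
    """One Euler step of the Brayton-Moser equations above."""
    di = -(v + f(i)) / L_IND
    dv = i / CAP
    return i + dt * di, v + dt * dv

i, v = 0.05, 0.0          # start near the (unstable) equilibrium
for _ in range(100_000):  # integrate to t = 100
    i, v = step(i, v, 1e-3)
# the active resistor feeds energy in, so the state does not decay to 0
# but settles on an oscillation
```

Because the resistor is active near the origin, the state grows away from the equilibrium and ends up oscillating, much as in the Van der Pol analysis of the previous sections.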
We add some final remarks on an energy theorem. Suppose for simplicity that
all the L_λ and C_γ are constant and let
W: ℒ × 𝒞* → R
be the function W(i, v) = ½ ∑_λ L_λ i_λ² + ½ ∑_γ C_γ v_γ². Thus W has the form of a
norm squared and its level surfaces are generalized ellipsoids; W may be interpreted
as the energy in the inductor and capacitor branches. Define P_R: ℒ × 𝒞* → R
(power in the resistors) to be the composition
ℒ × 𝒞* → Σ ⊂ 𝒥 × 𝒥* → R,
where P_R(i, v) = ∑_ρ i_ρ v_ρ (summed over resistor branches). We state without proof:
Theorem 5 Let φ: R → ℒ × 𝒞* be any solution of the equations of the previous
theorem. Then, along φ,
dW/dt = −P_R.
PROBLEMS
4. Prove Theorem 5.
FIG. A
FIG. B
6. Show that the differential equations for the circuit in Fig. B are given by:
L di_λ/dt = −(v_γ + f(i_λ)),
C dv_γ/dt = i_λ.
FIG. C
linear resistor, and the box is a resistor with characteristic given by i = f(v).
Find the mixed potential and the phase portrait for some choice of f. See
Problem 7.
9. We refer to the Brayton-Moser equations. Suppose there are no capacitors.
(a) Show that the function P: ℒ → R decreases along nonequilibrium tra-
jectories of the Brayton-Moser equations.
(b) Let n be the number of inductors. If each function L_λ is a constant,
find an inner product on Rⁿ = ℒ which makes the vector
Notes
We have already seen how periodic solutions in planar dynamical systems play
an important role in electrical circuit theory. In fact the periodic solution in Van
der Pol’s equation, coming from the simple circuit equation in the previous chapter,
has features that go well beyond circuit theory. This periodic solution is a “limit
cycle,” a concept we make precise in this chapter.
The Poincaré-Bendixson theorem gives a criterion for the detection of limit
cycles in the plane; this criterion could have been used to find the Van der Pol oscilla-
tor. On the other hand, this approach would have missed the uniqueness.
Poincaré-Bendixson is a basic tool for understanding planar dynamical systems,
but for differential equations in higher dimensions it has no generalization or
counterpart. Thus after the first two rather basic sections, we restrict ourselves to
planar dynamical systems. The first section gives some properties of the limiting
behavior of orbits on the level of abstract topological dynamics, while in the next
section we analyze the flow near nonequilibrium points of a dynamical system.
Throughout this chapter we consider a dynamical system on an open set W in a
vector space E, that is, the flow φ_t defined by a C¹ vector field f: W → E.
FIG. A
FIG. B
equilibrium is its own α-limit set and ω-limit set. A closed orbit is the α-limit and
ω-limit set of every point on it. In the Van der Pol oscillator there is a unique closed
orbit γ; it is the ω-limit of every point except the origin (Fig. A). The origin is the
α-limit set of every point inside γ. If y is outside γ, then L_α(y) is empty.
There are examples of limit sets that are neither closed orbits nor equilibria, for
example the figure 8 in the flow suggested by Fig. B. There are three equilibria: two
sources and one saddle. The figure 8 is the ω-limit set of all points outside it. The
right half of the 8 is the ω-limit set of all points inside it except the equilibrium, and
similarly for the left half.
In three dimensions there are extremely complicated examples of limit sets,
although they are not easy to describe. In the plane, however, limit sets are fairly
simple. In fact Fig. B is typical, in that one can show that a limit set other than a
closed orbit or equilibrium is made up of equilibria and trajectories joining them.
The Poincaré-Bendixson theorem says that if a compact limit set in the plane
contains no equilibria it is a closed orbit.
We recall from Chapter 9 that a limit set is closed in W, and is invariant under
the flow. We shall also need the following result:
Proposition (a) If x and z are on the same trajectory, then L_ω(x) = L_ω(z); simi-
larly for α-limits.
(b) If D is a closed positively invariant set and z ∈ D, then L_ω(z) ⊂ D; similarly
for negatively invariant sets and α-limits.
(c) A closed invariant set, in particular a limit set, contains the α-limit and ω-limit
sets of every point in it.
§1. LIMIT SETS
PROBLEMS
1. Show that a compact limit set is connected (that is, not the union of two disjoint
nonempty closed sets).
2. Identify R⁴ with C² having two complex coordinates (w, z), and consider the
linear system
(*) w' = 2πiw,
z' = 2πiθz,
where θ is an irrational real number.
(a) Put α = e^{2πiθ} and show that the set {αⁿ | n = 1, 2, . . .} is dense in the
unit circle C = {z ∈ C | |z| = 1}.
(b) Let φ_t be the flow of (*). Show that for n an integer,
φ_n(w, z) = (w, αⁿz).
(c) Let (w₀, z₀) belong to the torus C × C ⊂ C². Use (a), (b) to show that
L_ω(w₀, z₀) = L_α(w₀, z₀) = C × C.
(d) Find L_ω and L_α of an arbitrary point of C².
3. Find a linear system on R^{2k} = C^k such that if a belongs to the k-torus
C × ⋯ × C ⊂ C^k, then
L_ω(a) = L_α(a) = C × ⋯ × C.
4. In Problem 2, suppose instead that θ is rational. Identify L_ω and L_α of every
point.
5. Let X be a nonempty compact invariant set for a C¹ dynamical system. Suppose
that X is minimal, that is, X contains no compact invariant nonempty proper
subset. Prove the following:
(a) Every trajectory in X is dense in X;
(b) L_α(x) = L_ω(x) = X for each x ∈ X;
(c) For any (relatively) open set U ⊂ X, there is a number P > 0 such that
for any x ∈ X, t₀ ∈ R, there exists t such that φ_t(x) ∈ U and |t − t₀| ≤ P;
11. THE POINCARÉ-BENDIXSON THEOREM
We consider again the flow φ_t of the C¹ vector field f: W → E. Suppose the origin
0 ∈ E belongs to W.
A local section at 0 of f is an open set S containing 0 in a hyperplane H ⊂ E which
is transverse to f. By a hyperplane we mean a linear subspace whose dimension
is one less than dim E. To say that S ⊂ H is transverse to f means that f(x) ∉ H
for all x ∈ S. In particular f(x) ≠ 0 for x ∈ S.
Our first use of a local section at 0 will be to construct a "flow box" in a neighbor-
hood of 0. A flow box gives a complete description of a flow in a neighborhood of
any nonequilibrium point of any flow, by means of special (nonlinear) coordinates.
The description is simple: points move in parallel straight lines at constant speed.
We make this precise as follows. A diffeomorphism Ψ: U → V is a differentiable
map from one open set of a vector space to another with a differentiable inverse.
A flow box is a diffeomorphism
Ψ: N → W,  N ⊂ R × H,
of a neighborhood N of (0, 0) onto a neighborhood of 0 in W, which transforms
the vector field f: W → E into the constant vector field (1, 0) on R × H. The flow
of f is thereby converted to a simple flow on R × H:
ψ_t(s, y) = (s + t, y).
The map Ψ is defined by
Ψ(t, y) = φ_t(y),
for (t, y) in a sufficiently small neighborhood of (0, 0) in R × H. One appeals to
Chapter 15 to see that Ψ is a C¹ map. The derivative of Ψ at (0, 0) is easily computed
to be the linear map which is the identity on 0 × H, and on R = R × 0 it sends
1 to f(0). Since f(0) is transverse to H, it follows that DΨ(0, 0) is an isomorphism.
Hence by the inverse function theorem Ψ maps an open neighborhood N of (0, 0)
diffeomorphically onto a neighborhood V of 0 in W. We take N of the form
§2. LOCAL SECTIONS AND FLOW BOXES
FIG. A
FIG. B
§3. MONOTONE SEQUENCES IN PLANAR DYNAMICAL SYSTEMS
FIG. B
Proposition 1 Let S be a local section of a C¹ planar dynamical system and y₀, y₁,
y₂, . . . a sequence of distinct points of S that are on the same solution curve C. If the
sequence is monotone along C, it is also monotone along S.
Proof. It suffices to consider three points y₀, y₁, y₂. Let Σ be the simple closed
curve made up of the part B of C between y₀ and y₁ and the segment T ⊂ S between
y₀ and y₁. Let D be the closed bounded region bounded by Σ. We suppose that the
trajectory of y₁ leaves D at y₁ (Fig. C); if it enters, the argument is similar.
We assert that at any point of T the trajectory leaves D. For it either leaves or
enters because, T being transverse to the flow, it crosses the boundary of D. The
set of points in T whose trajectory leaves D is a nonempty open subset T₋ ⊂ T, by
FIG. C
continuity of the flow; the set T₊ ⊂ T where trajectories enter D is also open in
T. Since T₋ and T₊ are disjoint and T = T₋ ∪ T₊, it follows from connectedness
of the interval that T₊ must be empty.
It follows that the complement of D is positively invariant. For no trajectory
can enter D at a point of T; nor can it cross B, by uniqueness of solutions.
Therefore φ_t(y₁) ∈ R² − D for all t > 0. In particular, y₂ ∈ S − T.
The set S − T is the union of two half-open intervals I₀ and I₁ with y₀ an end-
point of I₀ and y₁ an endpoint of I₁. One can draw an arc from a point φ_t(y₁) (with
t > 0 very small) to a point of I₁, without crossing Σ. Therefore I₁ is outside D.
Similarly I₀ is inside D. It follows that y₂ ∈ I₁ since it must be outside D. This
shows that y₁ is between y₀ and y₂ in S, proving Proposition 1.
FIG. D
PROBLEMS
FIG. E
4. Let x be a recurrent point of a C¹ planar dynamical system, that is, there is a
sequence t_n → ±∞ such that
φ_{t_n}(x) → x.
It follows that φ_r(y) = φ_s(y); hence φ_{r−s}(y) = y, r − s > 0. Since L_ω(x) contains
no equilibrium, y belongs to a closed orbit.
It remains to prove that if γ is a closed orbit in L_ω(x) then γ = L_ω(x). It is
enough to show that
lim_{t→∞} d(φ_t(x), γ) = 0,
where d(φ_t(x), γ) is the distance from φ_t(x) to the compact set γ (that is, the distance
from φ_t(x) to the nearest point of γ).
§4. THE POINCARÉ-BENDIXSON THEOREM
φ_{t_n}(x) ∈ S,
φ_{t_n}(x) → z.
Let β > 0. From Chapter 8, there exists δ > 0 such that if |z − u| < δ and
|t| ≤ λ + ε then |φ_t(z) − φ_t(u)| < β.
Let n₀ be so large that |z_n − z| < δ for all n ≥ n₀, where z_n = φ_{t_n}(x). Then
|φ_t(z_n) − φ_t(z)| < β
if |t| ≤ λ + ε and n ≥ n₀. Now let t ≥ t_{n₀}. Let n ≥ n₀ be such that
t_n ≤ t ≤ t_{n+1}.
Then
d(φ_t(x), γ) ≤ |φ_t(x) − φ_{t−t_n}(z)|
= |φ_{t−t_n}(z_n) − φ_{t−t_n}(z)|
< β
since |t − t_n| ≤ λ + ε. The proof of the Poincaré-Bendixson theorem is complete.
PROBLEMS
In the proof of the Poincaré-Bendixson theorem it was shown that limit cycles
enjoy a certain property not shared by other closed orbits: if γ is an ω-limit cycle,
there exists x ∉ γ such that
lim_{t→∞} d(φ_t(x), γ) = 0.
For an α-limit cycle replace ∞ by −∞. Geometrically this means that some tra-
jectory spirals toward γ as t → ∞ (for ω-limit cycles) or as t → −∞ (for α-limit
cycles). See Fig. A.
FIG. B
A = {y | γ = L_ω(y)} − γ
is open.
Proposition 2 Let γ be a closed orbit and suppose that the domain W of the dynamical
system includes the whole open region U enclosed by γ. Then U contains either an
equilibrium or a limit cycle.
Theorem 2 Let γ be a closed orbit enclosing an open set U contained in the domain
W of the dynamical system. Then U contains an equilibrium.
Proof. Suppose U contains no equilibrium. If x_n → x in U and each x_n lies
on a closed orbit, then x must lie on a closed orbit. For otherwise the trajectory of
x would spiral toward a limit cycle, and by Proposition 1 so would the trajectory
of some x_n.
Let A ≥ 0 be the greatest lower bound of the areas of regions enclosed by closed
orbits in U. Let {γ_n} be a sequence of closed orbits enclosing regions of areas A_n
such that lim_{n→∞} A_n = A. Let x_n ∈ γ_n. Since γ ∪ U is compact we may assume
x_n → x ∈ U. Then if U contains no equilibrium, x lies on a closed orbit β of area
A(β). The usual section argument shows that as n → ∞, γ_n gets arbitrarily close
to β and hence the area A_n − A(β) of the region between γ_n and β goes to 0. Thus
A(β) = A.
We have shown that if U contains no equilibrium, it contains a closed orbit β
enclosing a region of minimal area. Then the region enclosed by β contains neither
an equilibrium nor a closed orbit, contradicting Proposition 2.
The following result uses the spiraling properties of limit cycles in a subtle way.
Theorem 3 Let H be a first integral of a planar C¹ dynamical system (that is, H
is a real-valued function that is constant on trajectories). If H is not constant on any
open set, then there are no limit cycles.
Proof. Suppose there is a limit cycle γ; let c ∈ R be the constant value of H
on γ. If x(t) is a trajectory that spirals toward γ, then H(x(t)) ≡ c by continuity
of H. In Proposition 1 we found an open set whose trajectories spiral toward γ; thus
H is constant on an open set.
§5. APPLICATIONS OF THE POINCARÉ-BENDIXSON THEOREM
PROBLEMS
1. The celebrated Brouwer fixed point theorem states that any continuous map f
of the closed unit ball
Dⁿ = {x ∈ Rⁿ | |x| ≤ 1}
into itself has a fixed point (that is, f(x) = x for some x).
(a) Prove this for n = 2, assuming that f is C¹, by finding an equilibrium for
the vector field g(x) = f(x) − x.
(b) Prove Brouwer's theorem for n = 2 using the fact that any continuous
map is the uniform limit of C¹ maps.
2. Let f be a C¹ vector field on a neighborhood of the annulus
A = {x ∈ R² | 1 ≤ |x| ≤ 2}.
Suppose that f has no zeros and that f is transverse to the boundary, pointing
inward.
(a) Prove there is a closed orbit. (Notice that the hypothesis is weaker than
in Problem 1, Section 3.)
(b) If there are exactly seven closed orbits, show that one of them has orbits
spiraling toward it from both sides.
3. Let f: R² → R² be a C¹ vector field with no zeros. Suppose the flow φ_t generated
by f preserves area (that is, if S is any open set, the area of φ_t(S) is independent
of t). Show that every trajectory is a closed set.
4. Let f be a C¹ vector field on a neighborhood of the annulus A of Problem 2.
Suppose that for every boundary point x, f(x) is a nonzero vector tangent to
the boundary.
(a) Sketch the possible phase portraits in A under the further assumption
that there are no equilibria and no closed orbits besides the boundary
circles. Include the case where the boundary trajectories have opposite
orientations.
(b) Suppose the boundary trajectories are oppositely oriented and that the
flow preserves area. Show that A contains an equilibrium.
5. Let f and g be C¹ vector fields on R² such that ⟨f(x), g(x)⟩ = 0 for all x. If f
has a closed orbit, prove that g has a zero.
6. Let f be a C¹ vector field on an open set W ⊂ R² and H: W → R a C¹ function
such that
DH(x)f(x) = 0
for all x. Prove that:
(a) H is constant on solution curves of x' = f(x);
Notes
In this chapter we examine some nonlinear two dimensional systems that have
been used as mathematical models of the growth of two species sharing a common
environment. In the first section, which treats only a single species, various mathe-
matical assumptions on the growth rate are discussed. These are intended to capture
mathematically, in the simplest way, the dependence of the growth rate on food
supply and the negative effects of overcrowding.
In Section 2, the simplest types of equations that model a predator-prey ecology
are investigated: the object is to find out the long-run qualitative behavior of tra-
jectories. A more sophisticated approach is used in Section 3 to study two competing
species. Instead of explicit formulas for the equations, certain qualitative assump-
tions are made about the form of the equations. (A similar approach to predator
and prey is outlined in one of the problems.) Such assumptions are more plausible
than any set of particular equations can be; one has correspondingly more confidence
in the conclusions reached.
An interesting phenomenon observed in Section 3 is bifurcation of behavior.
Mathematically this means that a slight quantitative change in initial conditions
leads to a large qualitative difference in long-term behavior (because of a change of
w-limit sets). Such bifurcations, also called “catastrophes,” occur in many applica-
tions of nonlinear systems; several recent theories in mathematical biology have
been based on bifurcation theory.
The birth rate of a human population is usually given in terms of the number
of births per thousand in one year. The number one thousand is used merely to
avoid decimal places; instead of a birth rate of 17 per thousand one could just as
12. ECOLOGY
well speak of 0.017 per individual (although this is harder to visualize). Similarly,
the period of one year is also only a convention; the birth rate could just as well
be given in terms of a week, a second, or any other unit of time. Similar remarks
apply to the death rate and to the growth rate, or birth rate minus death rate. The
growth rate is thus the net change in population per unit of time divided by the
total population at the beginning of the time period.
Suppose the population y(t) at time t changes to y + Δy in the time interval
[t, t + Δt]. Then the (average) growth rate is
Δy / (y(t) Δt).
(1) dy/dt = a(σ − σ₀)y(t).
Here a and σ₀ are constants, dependent only on the species, and σ is a parameter,
§1. ONE SPECIES
dependent on the particular environment but constant for a given ecology. (In
the next section σ will be another species satisfying a second differential equation.)
The preceding equation is readily solved:
y(t) = y(0)e^{a(σ−σ₀)t}.
Thus the population must increase without limit, remain constant, or approach
0 as a limit, depending on whether σ > σ₀, σ = σ₀, or σ < σ₀. If we recall that actu-
ally fractional values of y(t) are meaningless, we see that for all practical purposes
"y(t) → 0" really means that the population dies out in a finite time.
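The three regimes can be confirmed directly from the solution formula; the constants a, σ₀, and y(0) below are arbitrary sample values, not from the text.

```python
# The solution y(t) = y(0)*exp(a*(sigma - sigma0)*t) of the growth
# equation, evaluated in its three regimes.  The constants are
# illustrative choices.
import math

def y(t, sigma, sigma0=1.0, a=0.5, y0=100.0):
    return y0 * math.exp(a * (sigma - sigma0) * t)

# sigma > sigma0: the population grows without limit
# sigma = sigma0: the population stays constant
# sigma < sigma0: the population tends to 0
```

Evaluating y at increasing times for σ above, equal to, and below σ₀ reproduces the trichotomy stated above.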
In reality, a population cannot increase without limit; at least, this has never
been observed! It is more realistic to assume that when the population level exceeds
a certain value η, the growth rate is negative. We call this value η the limiting
population. Note that η is not necessarily an upper bound for the population. Rea-
sons for the negative growth rate might be insanity, decreased food supply, over-
crowding, smog, and so on. We refer to these various unspecified causes as social
phenomena. (There may be positive social phenomena; for example, a medium size
population may be better organized to resist predators and obtain food than a
small one. But we ignore this for the moment.)
Again making the simplest mathematical assumptions, we suppose the growth
rate is proportional to η − y:
a = c(η − y),  c > 0 a constant.
Thus we obtain the equation of limited growth:
(2) dy/dt = c(η − y)y;  c > 0, η > 0.
Note that this suggests
Δy/Δt = cηy − cy².
This means that during the period Δt the population change is cy²Δt less than it
would be without social phenomena. We can interpret cy² as a number propor-
tional to the average number of encounters between y individuals. Hence cy² is a
kind of social friction.
The equilibria of (2) occur at y = 0 and y = η. The equilibrium at η is asymptot-
ically stable (if c > 0) since the derivative of c(η − y)y at η is −cη, which is
negative. The basin of η is {y | y > 0} since y(t) will increase to η as a limit if 0 <
y(0) < η, and decrease to η as a limit if η < y(0). (This can be seen by considering
the sign of dy/dt.)
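A quick numerical check, not from the text and with arbitrary sample constants, that initial populations on either side of η are attracted to it:

```python
# Euler integration of the limited growth equation (2),
#   y' = c*(eta - y)*y,
# from initial values below and above eta; both should tend to eta.
# c, eta, the step size, and the time span are illustrative choices.

C_SOCIAL = 1e-3
ETA = 1000.0

def limited_growth(y0, dt=1e-3, t_end=20.0):
    """Approximate y(t_end) for the initial value y(0) = y0."""
    y = y0
    for _ in range(int(t_end / dt)):
        y += dt * C_SOCIAL * (ETA - y) * y
    return y

# starting below eta the population climbs toward 1000;
# starting above eta it declines toward 1000;
# y = 0 stays at 0, as it is the other equilibrium
```

Note that the scheme preserves the equilibria y = 0 and y = η exactly, which is why the numerical trajectories settle on η rather than drifting past it.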
A more realistic model of a single species is
y' = M(y)y.
Here the variable growth rate M is assumed to depend only on the total population
y.
PROBLEMS
We consider a predator species y and its prey x. The prey population is the total
food supply for the predators at any given moment. The total food consumed by
the predators (in a unit of time) is proportional to the number of predator-prey
encounters, which we assume proportional to xy. Hence the per capita food supply
for the predators at time t is proportional to x(t). Ignoring social phenomena for
the moment, we obtain from equation (1) of the preceding section:
y' = a(x − σ₀)y.
lations, and is proportional to Δt; we write it as f(x, y)Δt. What should we postulate
about f(x, y)?
It is reasonable that f(x, y) be proportional to y: twice as many cats will eat
twice as many mice in a small time period. We also assume f(x, y) is proportional
to x: if the mouse population is doubled, a cat will come across a mouse twice as
often. Thus we put f(x, y) = βxy, β a positive constant. (This assumption is less
plausible if the ratio of prey to predators is very large. If a cat is placed among a
sufficiently large mouse population, after a while it will ignore the mice.)
The prey species is assumed to have a constant per capita food supply available,
sufficient to increase its population in the absence of predators. Therefore the prey
is subject to a differential equation of the form
x' = Ax − Bxy.
In this way we arrive at the predator-prey equations of Volterra and Lotka:
(1) x' = (A − By)x,
y' = (Cx − D)y;
A, B, C, D > 0.
This system has equilibria at (0, 0) and z = (D/C, A/B). It is easy to see that
(0, 0) is a saddle, hence unstable. The eigenvalues at (D/C, A/B) are pure imagi-
nary, however, which gives no information about stability.
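The pure imaginary eigenvalues at z can be verified by a short computation; the sketch below is not from the text, and the constants are illustrative choices.

```python
# Check, for illustrative constants, that the linear part of (1),
#   x' = (A - B*y)*x,  y' = (C*x - D)*y,
# at z = (D/C, A/B) has trace 0 and determinant A*D > 0, so its
# eigenvalues are +/- i*sqrt(A*D): pure imaginary.

A, B, C, D = 2.0, 1.0, 1.0, 1.0
xz, yz = D / C, A / B          # the equilibrium z

# Jacobian entries of the vector field at z
a11 = A - B * yz               # = 0 at z
a12 = -B * xz
a21 = C * yz
a22 = C * xz - D               # = 0 at z

tr = a11 + a22                 # trace = 0
det = a11 * a22 - a12 * a21    # determinant = A*D
omega = det ** 0.5             # rotation frequency sqrt(A*D)
```

Since the trace vanishes and the determinant is positive, the linearization at z is a center; as the text notes, this by itself decides nothing about the stability of z for the nonlinear system.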
We investigate the phase portrait of (1) by drawing the two lines
x' = 0: y = A/B;  y' = 0: x = D/C.
These divide the region x > 0, y > 0 into four quadrants (Fig. A). In each quadrant
the signs of x' and y' are constant as indicated.
The positive x-axis and the positive y-axis are each trajectories, as indicated in
Fig. A. The reader can make the appropriate conclusion about the behavior of the
populations.
Otherwise each solution curve (x(t), y(t)) moves counterclockwise around z
from one quadrant to the next. Consider for example a trajectory (x(t), y(t))
starting at a point
x(0) = u > D/C > 0,
y(0) = v > A/B > 0.
Therefore
(3) A/B ≤ y(t) ≤ ve^{(Cu−D)t}
for 0 ≤ t < T. From the second inequality of (2) we see that T is finite. From (2)
and (3) we see that for t ∈ J, (x(t), y(t)) is confined to the compact region
D/C ≤ x(t) ≤ u,
A/B ≤ y(t) ≤ ve^{(Cu−D)T}.
§2. PREDATOR AND PREY
Ḣ = (dF/dx)x' + (dG/dy)y'.
Hence
Ḣ(x, y) = x(dF/dx)(A − By) + y(dG/dy)(Cx − D).
We obtain Ḣ ≡ 0 provided
x(dF/dx)/(Cx − D) = y(dG/dy)/(By − A).
Since x and y are independent variables, this is possible if and only if
x(dF/dx)/(Cx − D) = y(dG/dy)/(By − A) = constant.
Putting the constant equal to 1 we get
dF/dx = C − D/x,
dG/dy = B − A/y;
integrating we find
F(x) = Cx − D log x,
G(y) = By − A log y.
Thus the function
H(x, y) = Cx − D log x + By − A log y,
defined for x > 0, y > 0, is constant on solution curves of (1).
By considering the signs of ∂H/∂x and ∂H/∂y it is easy to see that the equilibrium
z = (D/C, A/B) is an absolute minimum for H. It follows that H (more precisely,
H − H(z)) is a Liapunov function (Chapter 9). Therefore z is a stable equilibrium.
We note next that there are no limit cycles; this follows from Chapter 11 because
H is not constant on any open set.
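The constancy of H can also be checked numerically along an approximate solution of (1). The sketch below is not from the text; it uses a standard fourth-order Runge-Kutta step, and the constants, initial point, step size, and time span are arbitrary choices.

```python
# Numerical check that H(x, y) = C*x - D*log(x) + B*y - A*log(y) stays
# constant along trajectories of (1): x' = (A - B*y)*x, y' = (C*x - D)*y.
import math

A, B, C, D = 2.0, 1.0, 1.0, 1.0

def field(x, y):
    return (A - B * y) * x, (C * x - D) * y

def H(x, y):
    return C * x - D * math.log(x) + B * y - A * math.log(y)

def rk4_step(x, y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = field(x, y)
    k2 = field(x + dt / 2 * k1[0], y + dt / 2 * k1[1])
    k3 = field(x + dt / 2 * k2[0], y + dt / 2 * k2[1])
    k4 = field(x + dt * k3[0], y + dt * k3[1])
    x += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    y += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x, y

x, y = 3.0, 1.0            # initial populations away from z = (D/C, A/B)
h0 = H(x, y)
for _ in range(20_000):    # integrate to t = 20 with dt = 1e-3
    x, y = rk4_step(x, y, 1e-3)
# H drifts only by the integrator's truncation error
```

Over several revolutions around z the value of H changes only at the level of the integrator's error, which is consistent with H being a first integral, and hence, by Theorem 3 of Chapter 11, with the absence of limit cycles.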
We now prove that every trajectory of (1) in the region x > 0, y > 0, other than
the equilibrium z, is a closed orbit.
We now have the following (schematic) phase portrait (Fig. B). Therefore, for
any given initial populations (x(0), y(0)) with x(0) ≠ 0 and y(0) ≠ 0, other
than z, the populations of predator and prey will oscillate cyclically.
No matter what the numbers of prey and predator are, neither species will die
out, nor will it grow indefinitely. On the other hand, except for the state z, which
is improbable, the populations will not remain constant.
Let us introduce the social phenomena of Section 1 into the equations (1). We obtain
the following predator-prey equations of species with limited growth:
(5) x' = (A − By − λx)x,
y' = (Cx − D − μy)y.
The constants A, B, C, D, λ, μ are all positive.
We divide the upper-right quadrant Q (x > 0, y > 0) into sectors by the two
lines
L: A − By − λx = 0;
M: Cx − D − μy = 0.
Along these lines x' = 0 and y' = 0, respectively. There are two possibilities, ac-
cording to whether these lines intersect in Q or not. If not (Fig. C), the predators
die out and the prey population approaches its limiting value A/λ (where L meets
the x-axis).
FIG. C. Predators → 0; prey → A/λ.
This is because it is impossible for both prey and predators to increase at the
same time. If the prey is above its limiting population it must decrease, and after
a while the predator population also starts to decrease (when the trajectory crosses
M). After that point the prey can never increase past A/λ, and so the predators
continue to decrease. If the trajectory crosses L, the prey increases again (but not
past A/λ), while the predators continue to die off. In the limit the predators dis-
appear and the prey population stabilizes at A/λ.
12. ECOLOGY
FIG. D
Suppose now that L and M cross at a point z = (x̄, ȳ) in the quadrant Q (Fig.
D); of course z is an equilibrium. The linear part of the vector field (2) at z is

[ −λx̄   −Bx̄ ]
[  Cȳ   −μȳ ].

The characteristic polynomial has positive coefficients. Both roots of such a polynomial have negative real parts. Therefore z is asymptotically stable.
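As a numerical check (an added illustration, with hypothetical parameter values), one can solve for z from the two lines L and M and confirm that the positive coefficients of the characteristic polynomial force eigenvalues with negative real parts:

```python
# Linearization of the limited-growth predator-prey system (2)
#   x' = (A - By - lam*x)x,   y' = (Cx - D - mu*y)y
# at the interior equilibrium z = (xb, yb), and a check that both
# eigenvalues of the Jacobian have negative real part.
import math

A, B, C, D, lam, mu = 2.0, 1.0, 1.0, 1.0, 0.5, 0.5  # sample positive constants

# z solves  lam*xb + B*yb = A  and  C*xb - mu*yb = D  (Cramer's rule)
det_lin = lam * mu + B * C
xb = (A * mu + B * D) / det_lin
yb = (A * C - D * lam) / det_lin
assert xb > 0 and yb > 0            # L and M really cross inside Q

# the linear part of the vector field at z
J = [[-lam * xb, -B * xb],
     [C * yb,    -mu * yb]]
tr = J[0][0] + J[1][1]
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
# characteristic polynomial t^2 - tr*t + det has positive coefficients
assert tr < 0 and det > 0

disc = tr * tr - 4 * det
if disc < 0:
    real_parts = (tr / 2, tr / 2)                    # spiral sink
else:
    s = math.sqrt(disc)
    real_parts = ((tr - s) / 2, (tr + s) / 2)        # node
print(real_parts)    # both negative: z is asymptotically stable
```

With these sample constants the discriminant is negative, so z is a spiral sink; other choices give a node (compare the Problem below).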
Note that in addition to the equilibria at z and (0, 0), there is also an equilibrium,
a saddle, at the intersection of the line L with the x-axis.
It is not easy to determine the basin of z; nor do we know whether there are any
limit cycles. Nevertheless we can obtain some information.
Let L meet the x-axis at (p, 0) and the y-axis at (0, q). Let Γ be a rectangle
whose corners are

(0, 0), (P, 0), (0, Q), (P, Q)

with P > p, Q > q, and (P, Q) ∈ M (Fig. E). Every trajectory at a boundary point
of Γ either enters Γ or is part of the boundary. Therefore Γ is positively invariant.
Every point in Q is contained in such a rectangle.
By the Poincaré–Bendixson theorem the ω-limit set of any point (x, y) in Γ, with
x > 0, y > 0, must be a limit cycle or one of the three equilibria (0, 0), z, or (p, 0).
We rule out (0, 0) and (p, 0) by noting that x is increasing near (0, 0), and y is
increasing near (p, 0). Therefore the ω-limit set is either z or a limit cycle in Γ. By a consequence of the Poincaré–Bendixson theorem any limit cycle must surround z.
We observe further that any such rectangle Γ contains all limit cycles. For a
limit cycle (like any trajectory) must enter Γ, and Γ is positively invariant.
Fixing (P, Q) as above, it follows that for any initial values (x(0), y(0)), there
exists t₀ > 0 such that

x(t) < P,  y(t) < Q  if t ≥ t₀.

One can also find eventual lower bounds for x(t) and y(t).
We also see that in the long run, a trajectory either approaches z or else spirals
down to a limit cycle.
From a practical standpoint a trajectory that tends toward z is indistinguishable
from z after a certain time. Likewise a trajectory that approaches a limit cycle γ
can be identified with γ after it is sufficiently close.
The conclusion is that any ecology of predators and prey which obeys equations (2)
eventually settles down to either a constant or periodic population. There are absolute
upper bounds that no population can exceed in the long run, no matter what the initial
populations are.
PROBLEM
Show by examples that the equilibrium in Fig. D can be either a spiral sink or a
node. Draw diagrams.
§3. COMPETING SPECIES

We consider now two species x, y which compete for a common food supply.
Instead of analyzing specific equations we follow a different procedure: we consider
a large class of equations about which we assume only a few qualitative features. In
this way considerable generality is gained, and little is lost because specific
equations can be very difficult to analyze.
The equations of growth of the two species are written in the form

(1)  x′ = M(x, y)x,
     y′ = N(x, y)y,

where the growth rates M and N are C¹ functions of nonnegative variables x, y.
The following assumptions are made:
(a) If either species increases, the growth rate of the other goes down. Hence

∂M/∂y < 0  and  ∂N/∂x < 0.
(b) If either population is very large, neither species can multiply. Hence
there exists K > 0 such that

M(x, y) ≤ 0 and N(x, y) ≤ 0  if x ≥ K or y ≥ K.
(c) In the absence of either species, the other has a positive growth rate up to
a certain population and a negative growth rate beyond it. Therefore there are
constants a > 0, b > 0 such that

M(x, 0) > 0 for x < a  and  M(x, 0) < 0 for x > a,
N(0, y) > 0 for y < b  and  N(0, y) < 0 for y > b.
By (a) and (c) each vertical line {x} × R meets the set μ = M⁻¹(0) exactly once
if 0 ≤ x ≤ a and not at all if x > a. By (a) and the implicit function theorem μ
is the graph of a nonnegative C¹ map f: [0, a] → R such that f⁻¹(0) = a. Below
the curve μ, M > 0, and above it, M < 0 (Fig. A).
FIG. A
In the same way the set ν = N⁻¹(0) is a smooth curve of the form

{(x, y) | x = g(y)},
FIG. B

FIG. D. Bad and good vertices.
Consider first of all the region B₀ whose boundary contains (0, 0). This is of type
I (x′ > 0, y′ > 0). If q is an ordinary point of μ ∩ ∂B₀, we can connect q to a point
inside B₀ by a path which avoids ν. Along such a path y′ > 0. Hence (x′, y′) points
upward out of B₀ at q, since μ is the graph of a function. Similarly, at an ordinary
point r of ν ∩ ∂B₀, (x′, y′) points to the right, out of B₀ at r. Hence B₀ is good, and
so every vertex of B₀ is good.
Next we show that if B is a basic region and ∂B contains one good vertex p of
μ ∩ ν, then B is good. We assume that near p, the vector field along ∂B points into
B; we also assume that in B, x′ < 0 and y′ > 0. (The other cases are similar.) Let
μ₀ ⊂ μ, ν₀ ⊂ ν be arcs of ordinary boundary points of B adjacent to p (Fig. E). For
example let r be any ordinary point of ∂B ∩ μ and q any ordinary point of μ₀. Then
y′ > 0 at q. As we move along μ from q to r the sign of y′ changes each time we cross
ν. The number of such crossings is even because r and q are on the same side of ν.
Hence y′ > 0 at r. This means that (x′, y′) points up at r. Similarly, x′ < 0 at
every ordinary point of ν ∩ ∂B. Therefore along μ the vector (x′, y′) points up;
along ν it points left. Then B lies above μ and left of ν. Thus B is good.
This proves the lemma, for we can pass from any vertex to any other along μ,
starting from a good vertex. Since successive vertices belong to the boundary of a
common basic region, each vertex in turn is proved good. Hence all are good.
As a consequence of the lemma, each basic region, and its closure, is either positively or negatively invariant.
FIG. E
What are the possible ω-limit points of the flow (1)? There are no closed orbits.
For a closed orbit must be contained in a basic region, but this is impossible since
x(t) and y(t) are monotone along any solution curve in a basic region. Therefore
all ω-limit points are equilibria.
We note also that each trajectory is defined for all t ≥ 0, because any point
lies in a large rectangle Γ spanned by (0, 0), (x₀, 0), (0, y₀), (x₀, y₀) with x₀ > a,
y₀ > b; such a rectangle is compact and positively invariant (Fig. F). Thus we
have shown:
Theorem  The flow φₜ of (1) has the following property: for all p = (x, y), x ≥ 0,
y ≥ 0, the limit

lim_{t→∞} φₜ(p)

exists and is one of a finite number of equilibria.
We conclude that the populations of two competing species always tend to one of a
finite number of limiting populations.
FIG. F
Examining the equilibria for stability, one finds the following results. A vertex
where μ and ν each have negative slope, but μ is steeper, is asymptotically stable
(Fig. G). One sees this by drawing a small rectangle with sides parallel to the axes
around the equilibrium, putting one corner in each of the four adjacent regions.
Such a rectangle is positively invariant; since it can be arbitrarily small, the equilibrium is asymptotically stable. Analytically this is expressed by

slope of μ = −Mₓ/M_y  <  slope of ν = −Nₓ/N_y  <  0,

where Mₓ = ∂M/∂x, M_y = ∂M/∂y, and so on, at the equilibrium, from which a
computation yields eigenvalues with negative real parts. Hence we have a sink.
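The computation can be checked with sample numbers (an added illustration; the particular partial derivatives are hypothetical choices satisfying the slope condition):

```python
# At an interior equilibrium (M = N = 0) of x' = xM, y' = yN, the
# Jacobian is [[x*Mx, x*My], [y*Nx, y*Ny]].  With the slope condition
#   slope of mu = -Mx/My < slope of nu = -Nx/Ny < 0,
# the trace is negative and the determinant positive, giving a sink.
x, y = 1.0, 1.0                     # sample equilibrium populations
Mx, My = -2.0, -1.0                 # sample partials of M at the equilibrium
Nx, Ny = -1.0, -1.0                 # sample partials of N

slope_mu = -Mx / My                 # -2.0
slope_nu = -Nx / Ny                 # -1.0
assert slope_mu < slope_nu < 0      # the stability condition in the text

J = [[x * Mx, x * My],
     [y * Nx, y * Ny]]
tr = J[0][0] + J[1][1]
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
# tr < 0 always (Mx, Ny < 0); the slope condition is exactly det > 0,
# and tr < 0, det > 0 give eigenvalues with negative real parts.
assert tr < 0 and det > 0
print(tr, det)
```

The design point: since My < 0 and Ny < 0, the inequality −Mx/My < −Nx/Ny is equivalent to MxNy − MyNx > 0, i.e. to det J > 0, which together with tr J < 0 is the sink condition.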
FIG. G
A case by case study of the different ways μ and ν can cross shows that the only
other asymptotically stable equilibrium is (0, b) when (0, b) is above μ, or (a, 0)
when (a, 0) is to the right of ν. All other equilibria are unstable. For example, q
in Fig. H is unstable because arbitrarily near it, to the left, is a trajectory with x
decreasing; such a trajectory tends toward (0, b). Thus in Fig. H, (0, b) and p
are asymptotically stable, while q, r, s and (a, 0) are unstable. Note that r is a
source.
There must be at least one asymptotically stable equilibrium. If (0, b) is not one,
then it lies under μ; and if (a, 0) is not one, it lies to the left of ν. In that case μ and ν cross,
and the first crossing to the left of (a, 0) is asymptotically stable.
Every trajectory tends to an equilibrium; it is instructive to see how these
ω-limits change as the initial state changes. Let us suppose that q is a saddle. Then
it can be shown that exactly two trajectories α, α′ approach q, the so-called stable
manifolds of q, or sometimes separatrices of q. We concentrate on the one in the
unbounded basic region, labeled α in Fig. H.
All points of that region to the left of α end up at (0, b), while points to the right go to
p. As we move across α this limiting behavior changes radically. Let us consider
this bifurcation of behavior in biological terms.
Let v₀, v₁ be states in this region, very near each other but separated by α; suppose the
trajectory of v₀ goes to p while that of v₁ goes to (0, b). The point v₀ = (x₀, y₀)
represents an ecology of competing species which will eventually stabilize at p.
Note that both populations are positive at p. Suppose that some unusual event
occurs, not accounted for by our model, and the state of the ecology changes suddenly from v₀ to v₁. Such an event might be introduction of a new pesticide, importation of additional members of one of the species, a forest fire, or the like. Mathematically the event is a jump from the basin of p to that of (0, b).
Such a change, even though quite small, is an ecological catastrophe. For the
trajectory of v₁ has quite a different fate: it goes to (0, b) and the x species is wiped
out!
Of course in practical ecology one rarely has Fig. H to work with. Without it, the
change from v₀ to v₁ does not seem very different from the insignificant change from
v₀ to a nearby state v₂, which also goes to p. The moral is clear: in the absence of comprehensive knowledge, a deliberate change in the ecology, even an apparently minor
one, is a very risky proposition.
PROBLEMS
1. The equations

x′ = x(2 − x − y),
y′ = y(3 − 2x − y)

satisfy conditions (a) through (d) for competing species. Explain why these
equations make it mathematically possible, but extremely unlikely, for both
species to survive.
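A short computation (an added illustration, not part of the problem statement) shows what is going on: the interior equilibrium of these equations is (1, 1), and its Jacobian has negative determinant, so it is a saddle. Coexistence corresponds to starting exactly on the saddle's stable curve, which is why it is possible but "extremely unlikely."

```python
# Interior equilibrium of x' = x(2 - x - y), y' = y(3 - 2x - y):
# solving 2 - x - y = 0 and 3 - 2x - y = 0 gives (x, y) = (1, 1).
# At an interior equilibrium M = N = 0, so the Jacobian of (xM, yN) is
# [[x*Mx, x*My], [y*Nx, y*Ny]] with Mx = My = -1, Nx = -2, Ny = -1.
import math

J = [[-1.0, -1.0],
     [-2.0, -1.0]]
tr = J[0][0] + J[1][1]
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]   # = 1 - 2 = -1 < 0: a saddle
s = math.sqrt(tr * tr - 4 * det)
eigs = ((tr - s) / 2, (tr + s) / 2)
print(det, eigs)   # one negative and one positive eigenvalue
```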
2. Two species x, y are in symbiosis if an increase of either population leads to an
increase in the growth rate of the other. Thus we assume

x′ = M(x, y)x,
y′ = N(x, y)y

with

∂M/∂y > 0  and  ∂N/∂x > 0.

We also suppose that the total food supply is limited; hence for some A > 0,
B > 0 we have

M(x, y) < 0 if x > A,
N(x, y) < 0 if y > B.

If both populations are very small, they both increase; hence

M(0, 0) > 0  and  N(0, 0) > 0.
Assuming that the intersections of the curves M⁻¹(0), N⁻¹(0) are finite, and
all are transverse, show that:

(a) every trajectory tends to an equilibrium in the region 0 < x < A, 0 < y < B;
(b) there are no sources;
(c) there is at least one sink;
(d) if ∂M/∂x < 0 and ∂N/∂y < 0, there is a unique sink z, and z is the ω-limit of every (x, y) with x > 0, y > 0.
3. Prove that under plausible hypotheses, two mutually destructive species cannot coexist in the long run.
4. Let y and x denote predator and prey populations. Let

x′ = M(x, y)x,
Notes
Here we define asymptotic stability for closed orbits of a dynamical system, and
an especially important kind called a periodic attractor. Just as sinks are of major
importance among equilibria in models of "physical" systems, so periodic attractors
are the most important kind of oscillations in such models. As we shall show in
Chapter 16, such oscillations persist even if the vector field is perturbed.
The main result is that a certain eigenvalue condition on the derivative of the
flow implies asymptotic stability. This is proved by the same method of local sections used earlier in the Poincaré–Bendixson theorem. This leads to the study of
"discrete dynamical systems" in Section 2, a topic which is interesting by itself.
Let f: W → Rⁿ be a C¹ vector field on an open set W ⊂ Rⁿ; the flow of the differential equation

(1)  x′ = f(x)

is denoted by φₜ.
Let γ ⊂ W be a closed orbit of the flow, that is, a nontrivial periodic solution
curve. We call γ asymptotically stable if for every open set U₁ ⊂ W with γ ⊂ U₁,
there is an open set U₂, γ ⊂ U₂ ⊂ U₁, such that φₜ(U₂) ⊂ U₁ for all t > 0 and

lim_{t→∞} d(φₜ(x), γ) = 0

for all x ∈ U₂.
Theorem 2  Let γ be a closed orbit of period λ of the dynamical system (1). Let p ∈ γ.
Suppose that n − 1 of the eigenvalues of the linear map Dφ_λ(p): E → E are less than
1 in absolute value. Then γ is asymptotically stable.
Some remarks on this theorem are in order. First, it assumes that φ_λ is differentiable. In fact, φₜ(x) is a C¹ function of (t, x); this is proved in Chapter 16. Second,
the condition on Dφ_λ(p) is independent of p ∈ γ. For if q ∈ γ is a different point,
let r ∈ R be such that φ_r(p) = q. Then

Dφ_λ(q) = Dφ_r(p) ∘ Dφ_λ(p) ∘ Dφ_r(p)⁻¹,

which shows that Dφ_λ(p) is similar to Dφ_λ(q). Third, note that 1 is always an eigenvalue of Dφ_λ(p), since

Dφ_λ(p)f(p) = f(p).
13. PERIODIC ATTRACTORS
This means that any point sufficiently near to γ has the same fate as a definite
point of γ.
It can be proved (not easily) that the closed orbit in the Van der Pol oscillator
is a periodic attractor (see the Problems).
The proofs of Theorems 2 and 3 occupy the rest of this chapter. The proof of
Theorem 2 depends on a local section S to the flow at p, analogous to those in Chapter 10 for planar flows: S is an open subset of an (n − 1)-dimensional subspace
transverse to the vector field at p. Following trajectories from one point of S to
another defines a C¹ map h: S₀ → S, where S₀ is open in S and contains p. We call
h the Poincaré map. The following section studies the "discrete dynamical system"
h: S₀ → S. In particular p ∈ S₀ is shown to be an asymptotically stable fixed point
of h, and this easily implies Theorem 2.
PROBLEM
Using the fact that

det Dφ_λ(p) = exp ∫₀^λ Tr Df(φₜ(p)) dt,
show that the closed orbit in the Van der Pol oscillator is a periodic attractor.
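The Poincaré map itself can be computed numerically. The sketch below is an added illustration (the section {y = 0, x > 0}, step sizes, and tolerances are our choices): it builds a return map for the Van der Pol oscillator x′ = y, y′ = (1 − x²)y − x, iterates it to a fixed point near x ≈ 2, and estimates |h′| there by finite differences. A value well below 1 is consistent with the closed orbit being a periodic attractor.

```python
# Numerical Poincare (return) map for the Van der Pol oscillator
#   x' = y,   y' = (1 - x^2) y - x,
# on the section {y = 0, x > 0}.
import math

def field(x, y):
    return y, (1.0 - x * x) * y - x

def rk4_step(x, y, h):
    k1 = field(x, y)
    k2 = field(x + 0.5*h*k1[0], y + 0.5*h*k1[1])
    k3 = field(x + 0.5*h*k2[0], y + 0.5*h*k2[1])
    k4 = field(x + h*k3[0], y + h*k3[1])
    return (x + h*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])/6,
            y + h*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])/6)

def return_map(x0, dt=1e-3):
    """Follow the trajectory from (x0, 0) to its next downward crossing
    of y = 0 with x > 0 (one full loop), interpolating the crossing."""
    x, y, t = x0, 0.0, 0.0
    while True:
        xp, yp = x, y
        x, y = rk4_step(x, y, dt)
        t += dt
        if t > 1.0 and yp > 0.0 and y <= 0.0 and x > 0.0:
            a = yp / (yp - y)            # linear interpolation in y
            return xp + a * (x - xp)

x = 1.0
for _ in range(20):
    x = return_map(x)                    # converges to about 2.0

eps = 1e-3
deriv = (return_map(x + eps) - return_map(x - eps)) / (2 * eps)
print(x, deriv)
```

Note that on this section a crossing with x > 0 must be downward (at y = 0, y′ = −x < 0), which is why the loop tests for a sign change of y from positive to nonpositive.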
space" of some sort, then g(x) is the state of the system 1 unit of time after it is in
state x. After 2 units of time it will be in state g²(x) = g(g(x)); after n units, in
state gⁿ(x). Thus instead of a continuous family of states {φₜ(x) | t ∈ R} we have
the discrete family {gⁿ(x) | n ∈ Z}, where Z is the set of integers.
The diffeomorphism might be a linear operator T: E → E. Such systems are
studied in linear algebra. We get rather complete information about their structure
from the canonical form theorems of Chapter 6.
Suppose T = e^A, A ∈ L(E). Then T is the "time one map" of the linear flow
e^{tA}. If this continuous flow e^{tA} represents some natural dynamical process, the
discrete flow Tⁿ = e^{nA} is like a series of photographs of the process taken at regular
time intervals. If these intervals are very small, the discrete flow is a good approximation to the continuous one. A motion picture, for example, is a discrete flow
that is hard to distinguish from a continuous one.
The analogue of an equilibrium for a discrete system g: E → E is a fixed point
x̄ = g(x̄). For a linear operator T, the origin is a fixed point. If there are other
fixed points, they are eigenvectors belonging to the eigenvalue 1.
We shall be interested in stability properties of fixed points. The key example is a
linear contraction: an operator T ∈ L(E) such that

lim_{n→∞} Tⁿx = 0

for all x ∈ E. The time one map of a contracting flow is a linear contraction.
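A linear contraction need not shrink every vector on the first step; only in a suitably chosen norm is it a strict contraction. A small added illustration with a sample operator:

```python
# A linear contraction T (all eigenvalues inside the unit circle) can
# stretch some vectors at first, yet T^n x -> 0 for every x.
T = [[0.5, 100.0],
     [0.0, 0.5]]          # eigenvalues 0.5, 0.5; large off-diagonal entry

def apply(T, v):
    return (T[0][0]*v[0] + T[0][1]*v[1],
            T[1][0]*v[0] + T[1][1]*v[1])

def norm(v):
    return (v[0]**2 + v[1]**2) ** 0.5

v = (0.0, 1.0)
first = apply(T, v)
print(norm(first))         # about 100: stretched on the first step

for _ in range(200):
    v = apply(T, v)
print(norm(v))             # essentially 0: the iterates still converge
```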
For any ε > 0 there is another basis {e₁, . . . , eₘ} giving T_C the "ε-Jordan form";
this was proved in Chapter 7. Give E_C the max norm for this basis:

|Σ aⱼeⱼ| = max{|aⱼ|},

where a₁, . . . , aₘ are arbitrary complex numbers. Then if every eigenvalue λ satisfies
|λ| < 1 and ε is sufficiently small, (c) is satisfied. This completes the proof of Proposition 1.
Hence

|g(x)| ≤ μ|x| + ε|x|.

Putting ν = μ + ε < 1, we have |g(x)| ≤ ν|x| for x ∈ V. Given a neighborhood
U of 0, choose r > 0 so small that the ball U_r of radius r about 0 lies in U. Then
|gⁿx| ≤ νⁿ|x| for x ∈ U_r; hence gⁿx ∈ U_r, and gⁿx → 0 as n → ∞. This completes
the proof.
The preceding argument can be slightly modified to show that in the specified
norm,

|g(x) − g(y)| ≤ μ|x − y|,  μ < 1,
g: U → S,
g(x) = φ_{τ(x)}(x)

is obtained, U being a neighborhood of 0. In fact, by Section 2 of Chapter 11, there
is such a U and a unique C¹ map τ: U → R such that φ_{τ(x)}(x) ∈ S for all x in U
and τ(0) = λ.
Now let U, τ be as above and put S₀ = S ∩ U. Define a C¹ map

g: S₀ → S,
g(x) = φ_{τ(x)}(x).

Then g is a discrete dynamical system with a fixed point at 0. See Fig. A. We call
g a Poincaré map. Note that the Poincaré map may not be definable at all points
of S (Fig. B).
There is an intimate connection between the dynamical properties of the flow
near γ and those of the Poincaré map near 0. For example:
Proposition 1  Let g: S₀ → S be a Poincaré map for γ as above. Let x ∈ S₀ be such
that gⁿ(x) → 0. Then

lim_{t→∞} d(φₜ(x), γ) = 0.
Proof.  Let gⁿ(x) = xₙ ∈ S. Since gⁿ⁺¹(x) is defined, xₙ ∈ S₀. Put τ(xₙ) = λₙ.
Since xₙ → 0, λₙ → λ (the period of γ). Thus there is an upper bound τ₀ for
{|λₙ| | n ≥ 0}. By continuity of the flow, as n → ∞,

|φ_s(xₙ) − φ_s(0)| → 0

uniformly in s ∈ [0, τ₀]. For any t > 0, there exist s(t) ∈ [0, τ₀] and an integer
§3. STABILITY AND CLOSED ORBITS
lim_{t→∞} d(φₜ(x), γ) = 0.
The following result links the derivative of the Poincaré map to that of the flow.
We keep the same notation.

Dg(0) = Dφ_λ(0) | H.

It is easy to see that the derivatives of any two Poincaré maps, for different
sections at 0, are similar.
We now have all the ingredients for the proof of Theorem 2 of the first section.
Suppose γ is a closed orbit of period λ as in that theorem. We may assume 0 ∈ γ.
We choose an (n − 1)-dimensional subspace H of E as follows. H is like an
eigenspace corresponding to the eigenvalues of Dφ_λ(0) with absolute value less
than 1. Precisely, let B ⊂ E_C be the direct sum of the generalized eigenspaces
belonging to these eigenvalues for the complexification (Dφ_λ(0))_C: E_C → E_C, and
let H = B ∩ E. Then H is an (n − 1)-dimensional subspace of E invariant under
Dφ_λ(0), and the restriction Dφ_λ(0)|H is a linear contraction.
Let S ⊂ H be a section at 0 and g: S₀ → S a Poincaré map. The previous proposition implies that the fixed point 0 ∈ S₀ is a sink for g. By Proposition 2, γ is asymptotically stable.
To prove Theorem 3, it suffices to consider a point x ∈ S₀, where g: S₀ → S is
the Poincaré map of a local section at 0 ∈ γ (since every trajectory starting near
γ intersects S₀).
PROBLEMS
The goal of this very short chapter is to do two things: (1) to give a statement
of the famous n-body problem of celestial mechanics and (2) to give a brief introduction to Hamiltonian mechanics. We give a more abstract treatment of Hamiltonian theory than is given in physics texts; but our method exhibits invariant
notions more clearly and has the virtue of passing easily to the case where the
configuration space is a manifold.
Thus Euclidean three space, the configuration space of one body, is a three-dimensional vector space together with an inner product.
The configuration space M for the n-body problem is the Cartesian product of
E with itself n times; thus M = (E)ⁿ and x = (x₁, . . . , xₙ), where xᵢ ∈ E is the
position of the ith body. Note that xᵢ denotes a point in E, not a number.
One may deduce the space of states from the configuration space as the space
TM of all tangent vectors to all possible curves in M. One may think of TM as the
product M × M and represent a state as (x, v) ∈ M × M, where x is a configuration
as before and v = (v₁, . . . , vₙ), vᵢ ∈ E being the velocity of the ith body. A
state of the system gives complete information about the system at a given moment
and (at least in classical mechanics) determines the complete life history of the
state.
The determination of this life history goes via the ordinary differential equations
of motion, Newton’s equations in this instance. Good insights into these equations
can be obtained by introducing kinetic and potential energy.
The kinetic energy is a function K: M × M → R on the space of states which
is given by

K(x, v) = ½ Σᵢ₌₁ⁿ mᵢ |vᵢ|².
Here the norm of vᵢ is the Euclidean norm on E. One may also consider K to be
given directly by an inner product B on M by

B(v, w) = ½ Σᵢ₌₁ⁿ mᵢ ⟨vᵢ, wᵢ⟩,
K(x, v) = B(v, v).

It is clear that B defines an inner product on M, where ⟨vᵢ, wᵢ⟩ means the original
inner product on E.
The potential energy V is a function on M defined by

V(x) = − Σ_{i<j} mᵢmⱼ / |xᵢ − xⱼ|.

We suppose that the gravitational constant is 1 for simplicity. Note that this
function is not defined at any "collision" (where xᵢ = xⱼ). Let Δᵢⱼ be the subspace of
collisions of the ith and jth bodies, so that

Δᵢⱼ = {x ∈ M | xᵢ = xⱼ},  i < j.

Thus Δᵢⱼ is a linear subspace of the vector space M. Denote the space of all collisions
by Δ ⊂ M, so that Δ = ∪ Δᵢⱼ. Then, properly speaking, the domain of the potential
energy is M − Δ:

V: M − Δ → R.
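For concreteness, the kinetic and potential energies can be checked numerically. The sketch below is an added illustration (the masses, the circular orbit, and the integrator are sample choices): it integrates Newton's equations for two unit masses with gravitational constant 1 and verifies that the total energy e = K + V is conserved.

```python
# Energy conservation in the planar 2-body problem (G = 1):
#   K = (1/2) sum m_i |v_i|^2,   V = -m_1 m_2 / |x_1 - x_2|.
import math

m = (1.0, 1.0)

def accel(p):
    # Newton's equations: x_i'' = sum_{j != i} m_j (x_j - x_i)/|x_j - x_i|^3
    (x1, y1), (x2, y2) = p
    dx, dy = x2 - x1, y2 - y1
    r3 = (dx*dx + dy*dy) ** 1.5
    return ((m[1]*dx/r3, m[1]*dy/r3), (-m[0]*dx/r3, -m[0]*dy/r3))

def energy(p, v):
    K = 0.5 * sum(mi * (vx*vx + vy*vy) for mi, (vx, vy) in zip(m, v))
    (x1, y1), (x2, y2) = p
    V = -m[0] * m[1] / math.hypot(x2 - x1, y2 - y1)
    return K + V

# circular orbit: separation 1, speed sqrt(1/2) for each body
p = [(-0.5, 0.0), (0.5, 0.0)]
v = [(0.0, -math.sqrt(0.5)), (0.0, math.sqrt(0.5))]
e0 = energy(p, v)          # K + V = 0.5 - 1 = -0.5 for this orbit

dt = 1e-3
a = accel(p)
drift = 0.0
for _ in range(10000):     # velocity-Verlet (leapfrog) integration to t = 10
    v_half = [(vx + 0.5*dt*ax, vy + 0.5*dt*ay)
              for (vx, vy), (ax, ay) in zip(v, a)]
    p = [(x + dt*vx, y + dt*vy) for (x, y), (vx, vy) in zip(p, v_half)]
    a = accel(p)
    v = [(vx + 0.5*dt*ax, vy + 0.5*dt*ay)
         for (vx, vy), (ax, ay) in zip(v_half, a)]
    drift = max(drift, abs(energy(p, v) - e0))

print(e0, drift)           # drift stays tiny: e is a first integral
```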
§1. THE n-BODY PROBLEM
(1) Configuration space Q, an open set in a vector space E (in the above case
Q = M − Δ and E = M).
(2) A C² function K: Q × E → R, kinetic energy, such that K(x, v) has the
form K(x, v) = K_x(v, v), where K_x is an inner product on E (in the above
case K_x was independent of x, but in problems with constraints, K_x depends on x).
(3) A C² function V: Q → R, potential energy.
The triple (Q, K, V) is called a simple mechanical system, and Q × E the state
space of the system. Given a simple mechanical system (Q, K, V), the energy or
total energy is the function e: Q × E → R defined by e(x, v) = K(x, v) + V(x).
For a simple mechanical system, one can canonically define a vector field on
Q × E which gives the equations of motion for the states (points of Q × E). We
will see how this can be done in the next section.
14. CLASSICAL MECHANICS
It can be shown that every symplectic form is of this type for some representation of F as E × E*.
Now let U be an open subset of a vector space F provided with a symplectic
form Ω. There is a prescription for assigning to any C¹ function H: U → R a C¹
vector field X_H on U, called the Hamiltonian vector field of H. In this context H is
called a Hamiltonian or a Hamiltonian function. To obtain X_H, let DH: U → F*
be the derivative of H and simply write

(1)  X_H(x) = Ω̄⁻¹ DH(x),  x ∈ U,

where Ω̄⁻¹ is the inverse of the isomorphism Ω̄: F → F* defined by Ω above. (1) is
equivalent to saying Ω(X_H(x), y) = DH(x)(y) for all y ∈ F. Thus X_H: U → F is a
C¹ vector field on U; the differential equations defined by this vector field are
called Hamilton's equations. By using coordinates we can compare these with what
are called Hamilton's equations in physics books.
Let Ω₀ be the symplectic form on F = E × E* defined above, and let
x = (x₁, . . . , xₙ) represent points of E and y = (y₁, . . . , yₙ) points of E* for the
dual coordinate structures on E and E*. Let Ω̄₀: F → F* be the associated isomorphism.
For any C¹ function H: U → R, one has that Ω̄₀⁻¹DH(x, y) is the vector with components

(∂H/∂y₁, . . . , ∂H/∂yₙ, −∂H/∂x₁, . . . , −∂H/∂xₙ).

In this setting, H plays the role of energy, and the solution curves represent the
motions of states of the system.
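The prescription X_H = Ω̄₀⁻¹DH can be imitated numerically in coordinates. The sketch below is an added illustration with a sample harmonic-oscillator Hamiltonian; it assumes the usual sign convention x′ = ∂H/∂y, y′ = −∂H/∂x, builds X_H from finite-difference derivatives of H, and checks that H is constant along solutions.

```python
# Hamilton's equations in coordinates (x, y) on E x E*:
#   x' = dH/dy,   y' = -dH/dx.
# Sample Hamiltonian: H = y^2/2 + x^2/2 (a harmonic oscillator).
import math

def H(x, y):
    return 0.5 * y * y + 0.5 * x * x

def X_H(x, y, h=1e-6):
    # central-difference derivative DH, then the symplectic "rotation"
    dHdx = (H(x + h, y) - H(x - h, y)) / (2 * h)
    dHdy = (H(x, y + h) - H(x, y - h)) / (2 * h)
    return dHdy, -dHdx

x, y = 1.0, 0.0
h0 = H(x, y)
dt = 1e-3
drift = 0.0
for _ in range(10000):              # RK4 on the Hamiltonian vector field
    k1 = X_H(x, y)
    k2 = X_H(x + 0.5*dt*k1[0], y + 0.5*dt*k1[1])
    k3 = X_H(x + 0.5*dt*k2[0], y + 0.5*dt*k2[1])
    k4 = X_H(x + dt*k3[0], y + dt*k3[1])
    x += dt*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])/6
    y += dt*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])/6
    drift = max(drift, abs(H(x, y) - h0))

print(drift)   # H (the energy) stays constant along the solution
```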
Then set

λ(x, v) = (x, λ_x(v)).
Consider the example of a simple mechanical system of a particle with mass m
moving in Euclidean three space E under a conservative force field given by potential energy V. In this case the state space is E × E and K: E × E → R is given by
K(q, v) = ½m|v|². Then λ: E × E → E × E* is given by λ_q(v) = p ∈ E*, where
p(w) = 2K_q(v, w); or

p(w) = m⟨v, w⟩,

and ⟨ , ⟩ is the inner product on E. In a Cartesian coordinate system on E, p =
mv, so that the image p of v under λ is indeed the classical momentum, "conjugate"
to v.
Returning to our simple mechanical system in general, note that the Legendre
transformation has an inverse, so that λ is a diffeomorphism from the state space
to the phase space. This permits one to transfer the energy function e on state
space to a function H on phase space, called the Hamiltonian of a simple mechanical
system. Thus we have

H = e ∘ λ⁻¹.

In the coordinates (q, p), Hamilton's equations read

qᵢ′ = ∂H/∂pᵢ,  pᵢ′ = −∂H/∂qᵢ,  i = 1, . . . , n.
Since for a given mechanical system H (interpreted as total energy) is a known
function of pᵢ, qᵢ, these are ordinary differential equations. The basic assertion of
Hamiltonian mechanics is that they describe the motion of the system.
The justification for this assertion is twofold. On one hand, there are many
cases where Hamilton's equations are equivalent to Newton's; we discuss one
below. On the other hand, there are common physical systems to which Newton's
laws do not directly apply (such as a spinning top), but which fit into the framework
of "simple mechanical systems," especially if the configuration space is allowed
to be a surface or higher dimensional manifold. For many such systems, Hamilton's
equations have been verified experimentally.
These are the familiar Newton's equations, again. Conversely, Newton's equations
imply Hamilton's in this case.
PROBLEMS
1. Show that if the vector space F has a symplectic form Ω on it, then F has even
dimension. Hint: Give F an inner product ⟨ , ⟩ and let A: F → F be the operator
defined by ⟨Ax, y⟩ = Ω(x, y). Consider the eigenvectors of A.

2. (Lagrange) Let (Q, K, V) be a simple mechanical system and X_H the associated Hamiltonian vector field on phase space. Show that (q, 0) is an equilibrium for X_H if and only if DV(q) = 0; and (q, 0) is a stable equilibrium if
q is an isolated minimum of V. (Hint: Use conservation of energy.)
3. Consider the second order differential equation in one variable

x″ + f(x) = 0,

where f: R → R is C¹ and if f(x) = 0, then f′(x) ≠ 0. Describe the orbit structure of the associated system in the plane

x′ = v,
v′ = −f(x),

when f(x) = x − x³. Discuss this phase portrait in general. (Hint: Consider

H(x, v) = ½v² + ∫₀ˣ f(s) ds.)
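The point of the hint is that H is constant along orbits: dH/dt = v·v′ + f(x)·x′ = −v·f(x) + f(x)·v = 0, so orbits lie on level curves of H. A numerical check (an added illustration, taking f(x) = x − x³ as a sample):

```python
# Conserved quantity for x'' + f(x) = 0 with the sample f(x) = x - x^3:
#   H(x, v) = v^2/2 + F(x),   F(x) = x^2/2 - x^4/4,
# is constant along solutions of x' = v, v' = -f(x).
import math

def f(x):
    return x - x**3

def H(x, v):
    return 0.5 * v * v + 0.5 * x * x - 0.25 * x**4

def field(x, v):
    return v, -f(x)

x, v = 0.5, 0.0            # a periodic orbit around the center at the origin
h0 = H(x, v)
dt = 1e-3
drift = 0.0
for _ in range(20000):     # RK4 integration to t = 20
    k1 = field(x, v)
    k2 = field(x + 0.5*dt*k1[0], v + 0.5*dt*k1[1])
    k3 = field(x + 0.5*dt*k2[0], v + 0.5*dt*k2[1])
    k4 = field(x + dt*k3[0], v + dt*k3[1])
    x += dt*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])/6
    v += dt*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])/6
    drift = max(drift, abs(H(x, v) - h0))

print(drift)               # near 0: the orbit stays on a level curve of H
```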
Notes
This is a short technical chapter which takes care of some unfinished business
left over from Chapter 8 on fundamental theory. We develop existence, uniqueness,
and continuity of solutions of nonautonomous equations x′ = f(t, x). Even though
our main emphasis is on autonomous equations, the theory of nonautonomous
linear equations x′ = A(t)x is needed as a technical device in establishing differentiability of flows. The variational equation along a solution of an autonomous equation
is an equation of this type.
(1)  x′(t) = f(t, x),
     x(t₀) = u₀.
The proof is the same as that of the fundamental theorem for autonomous equations (Chapter 8), the extra variable t being inserted where appropriate.
The theorem applies in particular to functions f(t, x) that are C¹, or even continuously differentiable only in x; for such an f is locally Lipschitz in x (in the
obvious sense). In particular we can prove:
This means that for every ε > 0, there exists δ > 0 such that if |ξ| ≤ δ, then
Theorem 1  The flow φ(t, x) of (1) is C¹; that is, ∂φ/∂t and ∂φ/∂x exist and are
continuous in (t, x).
15. NONAUTONOMOUS EQUATIONS AND DIFFERENTIABILITY OF FLOWS
Dφ_{t₀} = I.
Here we regard t₀ as a parameter. An important special case is that of an equilibrium
x̄, so that φₜ(x̄) = x̄. Putting Df(x̄) = A ∈ L(E), we get

d/dt (Dφₜ(x̄)) = A Dφₜ(x̄),
Dφ₀(x̄) = I.

The solution to this is

Dφₜ(x̄) = e^{tA}.

This means that in a neighborhood of an equilibrium the flow is approximately linear.
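This approximate linearity can be seen numerically. The sketch below is an added illustration with a sample field f(x, y) = (−x + y², −2y), whose linear part at the equilibrium 0 is A = diag(−1, −2), so e^{tA}(x, y) = (e^{−t}x, e^{−2t}y); the relative error of the linear approximation shrinks in proportion to the size of the initial condition.

```python
# Compare the true flow of f(x, y) = (-x + y^2, -2y) near the
# equilibrium 0 with its linearization e^{tA}, A = diag(-1, -2).
import math

def field(x, y):
    return -x + y * y, -2.0 * y

def flow(x, y, t, dt=1e-4):
    for _ in range(int(round(t / dt))):   # RK4 integration
        k1 = field(x, y)
        k2 = field(x + 0.5*dt*k1[0], y + 0.5*dt*k1[1])
        k3 = field(x + 0.5*dt*k2[0], y + 0.5*dt*k2[1])
        k4 = field(x + dt*k3[0], y + dt*k3[1])
        x += dt*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])/6
        y += dt*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])/6
    return x, y

t = 1.0
ratios = []
for eps in (1e-1, 1e-2, 1e-3):
    x, y = flow(eps, eps, t)
    lx, ly = eps * math.exp(-t), eps * math.exp(-2.0 * t)   # e^{tA}(eps, eps)
    ratios.append(math.hypot(x - lx, y - ly) / eps)

print(ratios)   # relative error shrinks linearly in eps: a second-order gap
```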
We now prove the proposition. For simplicity we take t₀ = 0. The integral equations satisfied by y(t, ξ), y(t), and u(t, ξ) are
lim_{y→x} R(x, y − x)/|y − x| = 0
N = max{‖Df(y(s))‖ | s ∈ J₀}.
whence, for some constant C depending only on K and the length of J₀, applying Gronwall's
inequality we obtain

|g(t)| ≤ Cε e^{Ct} |ξ|

if t ∈ J₀ and |ξ| ≤ δ₁. (Recall that δ₁ depends on ε.) Since ε is any positive number,
this shows that g(t)/|ξ| → 0 uniformly in t ∈ J₀, which proves the proposition.
We show next that the flow enjoys the same degree of differentiability as does
the data.
A function f: W → E is called Cʳ, 1 ≤ r < ∞, if it has r continuous derivatives.
For r ≥ 2 this is equivalent to: f is C¹ and Df: W → L(E) is Cʳ⁻¹. If f is Cʳ for all
r ≥ 1, we say f is C^∞. We let C⁰ mean "continuous."
or equivalently,

(9)  x′ = f(x),  u′ = Df(x)u.
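The coupled system (9) can be integrated directly. The sketch below is an added illustration using a sample pendulum field; it integrates x′ = f(x), u′ = Df(x)u and checks that u(t) matches the finite-difference sensitivity of the flow to its initial condition.

```python
# Variational equation (9):  x' = f(x), u' = Df(x)u.  The solution u(t)
# tracks [phi_t(x0 + eps*u0) - phi_t(x0)] / eps for small eps.
import math

def f(x, y):
    return y, -math.sin(x)           # a sample pendulum vector field

def Df(x, y, u, v):
    return v, -math.cos(x) * u       # the derivative applied to (u, v)

def step(state, dt):
    # one RK4 step for the coupled system (x, y, u, v)
    def F(s):
        x, y, u, v = s
        fx, fy = f(x, y)
        gu, gv = Df(x, y, u, v)
        return (fx, fy, gu, gv)
    def add(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = F(state)
    k2 = F(add(state, k1, 0.5 * dt))
    k3 = F(add(state, k2, 0.5 * dt))
    k4 = F(add(state, k3, dt))
    return tuple(si + dt * (a + 2*b + 2*c2 + d) / 6
                 for si, a, b, c2, d in zip(state, k1, k2, k3, k4))

def flow(x, y, t, dt=1e-3):
    s = (x, y, 0.0, 0.0)
    for _ in range(int(round(t / dt))):
        s = step(s, dt)
    return s[0], s[1]

# integrate the variational equation along the solution through (1, 0)
s = (1.0, 0.0, 1.0, 0.0)             # u0 = (1, 0)
dt, T = 1e-3, 2.0
for _ in range(int(round(T / dt))):
    s = step(s, dt)
u, v = s[2], s[3]

eps = 1e-5
x1, y1 = flow(1.0 + eps, 0.0, T)
x0, y0 = flow(1.0, 0.0, T)
du, dv = (x1 - x0) / eps, (y1 - y0) / eps
print(abs(du - u) + abs(dv - v))     # small: finite difference matches u(t)
```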
PROBLEMS
This chapter is an introduction to the problem: What effect does changing the
differential equation itself have on the solutions? In particular, we find general conditions for equilibria to persist under small perturbations of the vector field. Similar
results are found for periodic orbits. Finally, we discuss briefly more global problems
of the same type. That is to say, we consider the question: When does the phase
portrait itself persist under perturbations of the vector field? This is the problem of
structural stability.
|h(x)| and ‖Dh(x)‖, for x ∈ W. We allow the possibility ‖h‖ = ∞ if these numbers are unbounded.
§1. PERSISTENCE OF EQUILIBRIA
PERTURBATION THEORY AND STRUCTURAL STABILITY
(3)  φ′(t) = Df(y + tu)u.

Hence

f(x) − f(y) = ∫₀¹ Df(y + tu)u dt.
Then

‖A − Df(x₀)‖ < ½ε.

Since the map x ↦ Df(x) is continuous, there is a neighborhood N₁ ⊂ W of x₀
such that if x ∈ N₁, then

‖Df(x) − Df(x₀)‖ < ½ε.

It follows that if g ∈ 𝒱(W) is such that

‖Dg(x) − Df(x)‖ < ½ε

for all x ∈ N₁, then Dg(x) is invertible for all x ∈ N₁. The set of such g is a neighborhood 𝒩₁ of f.
Let ν > ‖Df(x₀)⁻¹‖. The map A ↦ A⁻¹, from invertible operators to L(E), is
continuous (use the formula in Appendix I for the inverse of a matrix). It follows
that f has a neighborhood 𝒩₂ ⊂ 𝒩₁ and x₀ has a neighborhood N₂ ⊂ N₁ such that
if g ∈ 𝒩₂ and y ∈ N₂, then

‖Dg(y)⁻¹‖ < ν.

We can find still smaller neighborhoods, 𝒩₃ ⊂ 𝒩₂ of f and N₃ ⊂ N₂ of x₀, such that
if g ∈ 𝒩₃ and y, z ∈ N₃, then

‖Dg(y) − Dg(z)‖ < 1/ν.

It now follows from Lemma 1 that for any ball V ⊂ N₃ and any g ∈ 𝒩₃, g | V is one-to-one.
Fix a ball V ⊂ N₃ around x₀. Let B ⊂ V be a closed ball around x₀ and choose
δ > 0 as in Lemma 2. There is a neighborhood 𝒩 ⊂ 𝒩₃ of f such that if g ∈ 𝒩, then
PROBLEMS
|g(t, x) − f(x)| < δ for all (t, x), then every solution x(t) to x′ = g(t, x) with x(t₀) ∈ W
satisfies x(t) ∈ W for all t ≥ t₀ and |x(t)| < ε for all t greater than some t₁.
(Hint: If V is a strict Liapunov function for 0, then (d/dt)V(x(t)) is close
to (d/dt)V(y(t)), where y′ = f(y). Hence (d/dt)V(x(t)) < 0 if |x(t)| is
not too small. Imitate the proof of Liapunov's theorem.)
We may assume that the closure of S₀ is a compact subset of S. Let α > 0. There
exists δ₀ > 0 such that if g ∈ 𝒱(W) and |g(x) − f(x)| < δ₀ for all x ∈ S₀, then,
first, S will be a local section at 0 for g, and second, there is a C¹ map σ: S₀ → R
such that

|σ(x) − τ(x)| < α,

and
The question of the uniqueness of the closed orbit of the perturbation is interesting. It is not necessarily unique; in fact, it is possible that all points of U lie on
closed orbits of f! But it is true that closed orbits other than γ will have periods
much bigger than that of γ. In fact, given ε > 0, there exists δ > 0 so small that if 0 <
d(x, γ) < δ and φₜ(x) = x, t > 0, then t > 2λ − ε. The same will hold true for
sufficiently small perturbations of f: the fixed point of the perturbed Poincaré map that we found above
lies on a closed orbit β of g whose period is within ε of λ, while any other closed orbit
of g that meets S₀ will have to circle around β several times before it closes up. This
follows from the relation of closed orbits to the sections; see Fig. A.
There is one special case where the uniqueness of the closed orbit of the perturbation can be guaranteed: if γ is a periodic attractor and g is sufficiently close to f,
then the perturbed closed orbit β will also be a periodic attractor; hence every trajectory that comes near β
winds closer and closer to β as t → ∞ and so cannot be a closed orbit.
Similarly, if γ is a periodic repeller, so is β, and again uniqueness holds.
Consider next the case where y is a hyperbolic closed orbit. This means that the
derivative a t 0 E y of the Poincark map has no eigenvalues of absolute value 1. In
this case a weaker kind of uniqueness obtains: there is a neighborhood V C U of
y such that if 32 is small enough, every g E 3t will have a unique closed orbit that
is entirely contained i n V . It is possible, however, for every neighborhood of a hyper-
bolic closed orbit to intersect other closed orbits, although this is hard to picture.
We now state without proof an important approximation result. Let B C Rn
be a closed ball and aB its boundary sphere.
PROBLEMS
1. Show that the eigenvalue condition in the main theorem of this section
is necessary.
2. Let y be a periodic attractor of x' = f(z). Show there is a C' real-valued
function V ( z ) on a neighborhood of y such that V 2 0, V-l(O) = y, and
(d/dt)(V(z(t)< ) 0 if x ( t ) is a solution curve not in y. ( H i n t : Let z ( t ) be
the solution curve in y such that x ( t ) - z ( t ) + 0 as t --+ a0 ; see Chapter 13,
Section 1, Theorem 3. Consider JOT I ~ ( t - ) z ( t ) [ * dt for some large constant T . )
3. Let W C Rnbe open and let y be a periodic attractor for a C1vector field f : W --+
Rn.Show that y has a neighborhood U with the following property. For any
t > 0 there exists 6 > 0 such that if g: R X W .--) Rn is C* and I g ( t , z) -
f ( z ) I < 6, then every solution z ( t ) to x' = g ( t , z) with z(to) E U satisfies
x ( t ) E U for all 1 2 to and d ( x ( t ) ,y) < c for all t greater than some tl. ( H i n t :
Problem 2, and Problem 2 of Section 1.)
In the previous sections we saw that certain features of a flow may be preserved
under small perturbations. Thus if a flow has a sink or attractor, any nearby flow
will have a nearby sink; similarly, for periodic attractors.
It sometimes happens that any nearby flow is topologically the same as a given
flow, that is, for any sufficiently small perturbation of the flow, a homeomorphism
exists that carries each trajectory of the original flow onto a trajectory of the per-
turbation. ( A homeomorphism is simply a continuous map, having a continuous
inverse. ) Such a homeomorphism sets up a one-to-one correspondence between
equilibria of the two flows, closed orbits, and so on. I n this case the original flow
(or its vector field) is called structurally stable.
Here is the precise definition of structural stability, at least in the restricted setting of vector fields which point inward on the unit disk (or ball) in Rⁿ. Let
Dⁿ = {x ∈ Rⁿ | |x| ≤ 1}
§3. STRUCTURAL STABILITY
x′ = Ax,  A = [ 0  −1 ]
              [ 1   0 ].
By an arbitrarily slight perturbation, the matrix A can be changed to make the origin either a sink or a source. Since these have different dynamic behavior, the flows are not topologically the same. Hence f is not structurally stable. In contrast,
it is known that the Van der Pol oscillator is structurally stable.
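This instability of the center can be checked numerically. The following sketch is an editorial addition, not from the text: the helper eig2 and the perturbations A ± εI are ours. The eigenvalues of the unperturbed matrix lie on the imaginary axis; an arbitrarily small perturbation pushes them off it, changing the phase portrait.

```python
import cmath

def eig2(a, b, c, d):
    """Eigenvalues of the 2x2 matrix [[a, b], [c, d]] via the quadratic formula."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + disc) / 2, (tr - disc) / 2)

# x' = Ax with A = [[0, -1], [1, 0]] has eigenvalues +-i: a center.
print(eig2(0, -1, 1, 0))

# Perturbing A to A + eps*I shifts both eigenvalues to eps +- i:
eps = 1e-3
print([z.real for z in eig2(eps, -1, 1, eps)])    # real parts positive: a source
print([z.real for z in eig2(-eps, -1, 1, -eps)])  # real parts negative: a sink
```

Since a center, a sink, and a source are topologically distinct, no neighborhood of A in the space of linear fields consists of topologically equivalent flows.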
The following is the main result of this section. It gives an example of a class of
structurally stable systems. (See Fig. A.)
Before proving this we mention three other results on structural stability. These concern a C¹ vector field f: W → R², where W ⊂ R² is a neighborhood of D². The first is from the original paper on structural stability by Pontryagin and Andronov.
Theorem 2  Suppose f points inward on D². Then the following conditions taken together are equivalent to structural stability on D²:
FIG. B. (a) Flow near a saddle connection; (b) breaking a saddle connection.
Theorem 4  The set of structurally stable systems contained in grad(Dⁿ) is open and dense in grad(Dⁿ).
if 0 < |x| ≤ 2r. It follows that B_r is in the basin of 0, and that f(x) points inward along ∂B_r. It is clear that f has a neighborhood 𝔑₀ ⊂ 𝔙(W) such that if g ∈ 𝔑₀, then also g(x) points inward along ∂B_r.
Let 0 < ε < r and put s = r + ε. If |y| < ε, then the closed ball B_s(y) about y with radius s satisfies:
B_r ⊂ B_s(y) ⊂ ⋯
Let 0 < μ < ν. We assert that if ‖g − f‖₁ is sufficiently small, then the sink a of g will be in B_r, and moreover,
(1)  ⟨g(x), x − a⟩ ≤ −μ |x − a|²
if x ∈ B_s(a). To see this, write
⟨g(x), x − a⟩ = ⟨f(x − a), x − a⟩ + ⟨g(x) − f(x − a), x − a⟩
             ≤ −ν |x − a|² + ⟨g(x) − f(x − a), x − a⟩.
We now prove Theorem 1. Since Dⁿ is compact and f(x) points inward along the boundary, no solution curve can leave Dⁿ. Hence Dⁿ is positively invariant. Choose r > 0 and 𝔑 ⊂ 𝔙(W) as in the proposition. Let 𝔑₀ ⊂ 𝔑 be a neighborhood of f so small that if g ∈ 𝔑₀, then g(x) points inward along ∂Dⁿ. Let ψ_t be the flow of g ∈ 𝔑₀. Note that Dⁿ is also positively invariant for ψ_t.
For every x ∈ Dⁿ − int B_r, there is a neighborhood U_x ⊂ W of x and t_x > 0 such that if y ∈ U_x and t ≥ t_x, then
|φ_t(y)| < r.
By compactness of ∂Dⁿ a finite number U_{x₁}, …, U_{x_k} of the sets U_x cover ∂Dⁿ. Put
t₀ = max(t_{x₁}, …, t_{x_k}).
Then φ_t(Dⁿ − int B_r) ⊂ B_r if t ≥ t₀. It follows from continuity of the flow in f (Chapter 15) that f has a neighborhood 𝔑₁ ⊂ 𝔑 such that if g ∈ 𝔑₁, then
ψ_t(Dⁿ − int B_r) ⊂ B_r  if t ≥ t₀.
This implies that
lim_{t→∞} ψ_t(x) = a  for all x ∈ Dⁿ.
For let x ∈ Dⁿ; then y = ψ_{t₀}(x) ∈ B_r, and B_r ⊂ basin of a under ψ.
It also implies that every y ∈ Dⁿ − a is of the form ψ_t(x) for some x ∈ ∂Dⁿ and some t ≥ 0. Define h by
h(φ_t(x)) = ψ_t(x).
Another way of saying this is that h maps φ_t(x) to ψ_t(x) for x ∈ ∂Dⁿ, t ≥ 0, and h(0) = a; therefore h maps trajectories of φ to trajectories of ψ, preserving orientation. Clearly, h(Dⁿ) = Dⁿ. The continuity of h is verified from continuity of the flows, and by reversing the roles of the flow and its perturbation one obtains a continuous inverse.
PROBLEMS
FIG. C
FIG. A. A vector field tangent to S.
rigid body with one point fixed is a compact three-dimensional manifold, the set
of rotations of Euclidean three space.
The topology (global structure) of a manifold plays an important role in the
analysis of dynamical systems on the manifold. For example, a dynamical system
on the two-sphere S² must have an equilibrium; this can be proved using the Poincaré-Bendixson theorem.
The mathematical treatment of electrical circuit theory can be extended if mani-
folds are used. The very restrictive special hypothesis in Chapter 10 was made in
order to avoid manifolds. That hypothesis is that the physical states of a circuit
(obeying Kirchhoff’s and generalized Ohm’s laws) can be parametrized by the
inductor currents and capacitor voltages. This converts the flow on the space of
physical states into a flow on a vector space. Unfortunately this assumption ex-
cludes many circuits. The more general theory simply deals with the flow directly
on the space of physical states, which is a manifold under “generic” hypotheses
on the circuit.
Manifolds enter into differential equations in another way. The set of points whose trajectories tend to a given hyperbolic equilibrium forms a submanifold called
the stable manifold of the equilibrium. These submanifolds are a key to any deep
global understanding of dynamical systems.
Our analysis of the long-term behavior of trajectories has been limited to the
simplest kinds of limit sets, equilibria and closed orbits. For some types of systems
these are essentially all that can occur, for example gradient flows and planar systems. But to achieve any kind of general picture in dimensions higher than two, one must confront limit sets which can be extremely complicated, even for structurally stable systems. It can happen that a compact region contains infinitely many periodic solutions with periods approaching infinity. Poincaré was dismayed by
his discovery that this could happen even in the Newtonian three-body problem,
and expressed despair of comprehending such a phenomenon.
In spite of the prevalence of such systems it is not easy to prove their existence,
and we cannot go into details here. But to give some idea of how they arise in apparently simple situations, we indicate in Fig. B a discrete dynamical system in the plane. Here the rectangle ABCD is sent to its image A′B′C′D′ in the most obvious way by a diffeomorphism f of R²; thus f(A) = A′, and so on. It can be shown that f will have infinitely many periodic points, and that this property is preserved by perturbations. (A point p is periodic if fⁿ(p) = p for some n > 0.) Considering R² as embedded in R³, one can construct a flow in R³ transverse to R² whose time one map leaves R² invariant and is just the diffeomorphism f in R². Such a flow has closed orbits through the periodic points of f.
FIG. B
This appendix collects various elementary facts that most readers will have
seen before.
Thus the map f assigns to each element x ∈ X (that is, x belongs to X) an element f(x) = y of Y. In this case we often write x → y or x → f(x). The identity map i: X → X is defined by i(x) = x, and if Q is a subset of X, Q ⊂ X, the inclusion map α: Q → X is defined by α(q) = q. If f: X → Y and g: Y → Z are two maps, the composition g ∘ f (sometimes written gf) is defined by g ∘ f(x) = g(f(x)).
Im f = {y ∈ Y | y = f(x), some x ∈ X}.
Then f is onto if Im f = Y. An inverse g (or f⁻¹) of f is a map g: Y → X such that g ∘ f and f ∘ g are the respective identity maps.
∑_{i=1}^{n} x_i = x₁ + x₂ + ⋯ + xₙ,
where the x_i are elements of a vector space. If there is not much ambiguity, the limits are omitted:
∑ x_i = x₁ + ⋯ + xₙ.
2. Complex Numbers
We recall the elements of the complex numbers C. We are not interested in complex analysis in itself; but sometimes the use of complex numbers simplifies the study of real differential equations.
The set of complex numbers C is the Cartesian plane R² considered as a vector space, together with a product operation.
Let i be the complex number i = (0, 1) in coordinates on R². Then every complex number z can be written uniquely in the form z = x + iy, where x, y are real numbers. Complex numbers are added as elements of R², so if z = x + iy, z′ = x′ + iy′, then z + z′ = (x + x′) + i(y + y′): the rules of addition carry over from R² to C.
Multiplication of complex numbers is defined as follows: if z = x + iy and z′ = x′ + iy′, then zz′ = (xx′ − yy′) + i(xy′ + x′y). Note that i² = −1 (or "i = √−1") with this definition of product, and this fact is an aid to remembering the product definition. The reader may check the following properties of multiplication:
(a) zz′ = z′z.
(b) (zz′)z″ = z(z′z″).
(c) 1z = z (here 1 = 1 + i·0).
(d) If z = x + iy is not 0, then z has a multiplicative inverse; in fact z⁻¹ = (x − iy)/(x² + y²).
The conjugate z̄ = x − iy and the absolute value |z| = (x² + y²)^{1/2} satisfy
(z + z′)¯ = z̄ + z̄′,
(zz′)¯ = z̄ z̄′,
|zz′| = |z| |z′|,
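The product rule on pairs can be checked mechanically. This sketch is an editorial addition (the helper cmul and the sample values are ours); it implements the definition above on pairs (x, y) and compares it with Python's built-in complex arithmetic.

```python
def cmul(z, w):
    """Product of complex numbers represented as pairs (x, y) = x + iy:
    (x + iy)(x' + iy') = (xx' - yy') + i(xy' + x'y)."""
    (x, y), (xp, yp) = z, w
    return (x * xp - y * yp, x * yp + xp * y)

i = (0.0, 1.0)
print(cmul(i, i))            # (-1.0, 0.0), i.e. i^2 = -1

z, w = (3.0, 4.0), (1.0, -2.0)
# Commutativity (property (a)) and agreement with Python's complex type:
print(cmul(z, w) == cmul(w, z))
print(complex(*cmul(z, w)) == complex(*z) * complex(*w))
```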
This use of the exponential symbol can be justified by showing that it is consistent with a convergent power series representation of e^z. Here one takes the power series of e^{a+ib} as one does for ordinary real exponentials; thus
e^{a+ib} = ∑_{n=0}^{∞} (a + ib)ⁿ / n!
One can operate with complex exponentials by the same rules as for real exponentials.
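The series definition can be verified numerically. The following sketch is an editorial addition (the function exp_series and the sample points are ours): it sums partial sums of ∑ zⁿ/n! and checks them against the library exponential, including the rule e^{z+w} = e^z e^w.

```python
import cmath

def exp_series(z, terms=40):
    """Partial sum of the power series e^z = sum of z^n / n! for complex z."""
    total, term = 0 + 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)   # z^(n+1)/(n+1)! from z^n/n!
    return total

z, w = 1 + 2j, 0.5 - 1j
print(abs(exp_series(z) - cmath.exp(z)) < 1e-12)                       # True
# The real rule e^(z+w) = e^z * e^w carries over:
print(abs(exp_series(z + w) - exp_series(z) * exp_series(w)) < 1e-10)  # True
```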
3. Determinants
One may find a good account of determinants in Lang's Second Course in Calculus [12]. Here we just write down a couple of facts that are useful.
First we give a general expression for a determinant. Let A = [a_{ij}] be the n × n matrix whose entry in the ith row and jth column is a_{ij}. Denote by A_{ij} the (n − 1) × (n − 1) matrix obtained by deleting the ith row and jth column. Then if i is a fixed integer, 1 ≤ i ≤ n, the determinant satisfies
Det A = (−1)^{i+1} a_{i1} Det A_{i1} + ⋯ + (−1)^{i+n} a_{in} Det A_{in}.
Thus the expression on the right does not depend on i and furthermore gives a way of finding (or defining) Det A inductively. The determinant of the 2 × 2 matrix
[ a  b ]
[ c  d ]
is ad − bc. For a 3 × 3 matrix [a_{ij}] one obtains
Det A = a₁₁(a₂₂a₃₃ − a₂₃a₃₂) − a₁₂(a₂₁a₃₃ − a₂₃a₃₁) + a₁₃(a₂₁a₃₂ − a₂₂a₃₁).
Recall that if Det A ≠ 0, then A has an inverse. One way of finding this inverse is to solve explicitly the system of equations Ax = y for x, obtaining x = By; then B is an inverse A⁻¹ for A.
If Det A ≠ 0, one has the formula
A⁻¹ = (1 / Det A) · transpose of [(−1)^{i+j} Det A_{ij}].
It follows easily from the recursive definition that the determinant of a triangular matrix is the product of the diagonal entries.
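Both facts above, the recursive cofactor expansion and the adjugate formula for the inverse, can be sketched directly. This is an editorial addition (the functions det and inverse and the sample matrix are ours); it expands along the first row, and uses exact Fraction arithmetic for the inverse so the check A·A⁻¹ = I is exact.

```python
from fractions import Fraction

def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in A[1:]]   # delete row 0, column j
        total += (-1) ** j * A[0][j] * det(minor)
    return total

def inverse(A):
    """A^{-1} as (1/Det A) times the transpose of the cofactor matrix."""
    n, d = len(A), det(A)
    def cof(i, j):
        minor = [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]
        return (-1) ** (i + j) * det(minor)
    return [[Fraction(cof(j, i), d) for j in range(n)] for i in range(n)]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
print(det(A))    # 2*(6-1) - 1*(2-0) + 0 = 8
B = inverse(A)
prod = [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
print(prod == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])   # True: A * A^{-1} = I
```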
The purpose of this section is to prove Propositions 1 and 3 of Section 1B, Chapter 3.
Proposition 1  Every vector space F has a basis, and every basis of F has the same number of elements. If {e₁, …, e_k} ⊂ F is an independent subset that is not a basis, by adjoining to it suitable vectors e_{k+1}, …, e_m, one can form a basis e₁, …, e_m.
Lemma 2  Let {e₁, …, eₙ} be a basis for a vector space F. If u₁, …, u_m are linearly independent elements of F, then m ≤ n.
Proof. It is sufficient to show that m ≠ n + 1. Suppose otherwise. Then each u_i is a linear combination of the e_k,
u_i = ∑_{k=1}^{n} a_{ik} e_k,  i = 1, …, n + 1.
From Lemma 2 we obtain the part of Proposition 1 which says that two bases have the same number of elements. If {e₁, …, eₙ} and {u₁, …, u_m} are the two bases, then the lemma says m ≤ n. An interchange yields n ≤ m.
Say that a set S = {u₁, …, u_m} of linearly independent elements of F is maximal if for every u in F, u ∉ S, the set {u, u₁, …, u_m} is dependent.
Let
p(z) = aₙzⁿ + a_{n−1}z^{n−1} + ⋯ + a₁z + a₀,  aₙ ≠ 0,
be a polynomial of degree n ≥ 1 with complex coefficients a₀, …, aₙ. Then p(z) = 0 for at least one z ∈ C.
The proof is based on the following basic property of polynomials.
Proposition 1  lim_{|z|→∞} |p(z)| = ∞.
Therefore there exists L > 0 such that if |z| ≥ L, then the right-hand side of (1) is ≥ ½ |aₙ| > 0, and hence
is a polynomial taking the same values as p; hence it suffices to prove that q has a root. Clearly, |q(0)| is minimal. Hence we may assume that
(3) the minimum value of |p(z)| is |p(0)| = |a₀|.
We write
p(z) = a₀ + a_k z^k + z^{k+1} r(z),  a_k ≠ 0,  k ≥ 1,
where r is a polynomial of degree n − k − 1 if k < n and r = 0 otherwise.
We choose w so that
(4) a₀ + a_k w^k = 0.
In other words, w is a kth root of −a₀/a_k. Such a root exists, for if
−a₀/a_k = ρ(cos θ + i sin θ),
we may take
w = ρ^{1/k}(cos(θ/k) + i sin(θ/k)).
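The polar recipe for a kth root can be carried out numerically. This sketch is an editorial addition (the function kth_root and the sample z = −8 are ours): it reads off ρ and θ from the polar form and returns ρ^{1/k}(cos(θ/k) + i sin(θ/k)).

```python
import cmath, math

def kth_root(z, k):
    """One kth root of z, from the polar form z = rho*(cos t + i sin t):
    w = rho**(1/k) * (cos(t/k) + i sin(t/k))."""
    rho, theta = abs(z), cmath.phase(z)
    r = rho ** (1.0 / k)
    return complex(r * math.cos(theta / k), r * math.sin(theta / k))

z = -8 + 0j
w = kth_root(z, 3)
print(abs(w ** 3 - z) < 1e-9)   # True: w is a cube root of -8
```

Note that this produces only one of the k roots; the others come from adding multiples of 2π/k to θ/k.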
p(λ) = ∑_{k=0}^{n} a_k λ^k, and either
p(z) = (z − λ)q(z)
or
p(z) = (z − λ)q(z) + p(λ).
The goal of this appendix is to prove three results of Chapter 6: Theorem 1 and the uniqueness of the S + N decomposition, of Section 1; and Theorem 1 of Section 3.
1. A Decomposition Theorem
Then
Lemma  V = N ⊕ M.
Proof. Since T(M) = L_{m+1} = M, T | M is invertible; also, Tⁿ(M) = M and Tⁿx ≠ 0 for nonzero x in M. Since Tⁿ(N) = 0, we have N ∩ M = 0. If x ∈ V is any vector, let Tᵐx = y ∈ M. Since Tᵐ | M is invertible, Tᵐx = Tᵐz, z ∈ M. Put x = (x − z) + z. Since x − z ∈ N, z ∈ M, this proves the lemma.
Let α₁, …, α_r be the distinct eigenvalues of T. For each eigenvalue α_k, define subspaces
N_k = N(T − α_k I) = ⋃_{j≥0} Ker(T − α_k I)ʲ.
Proposition  V = N₁ ⊕ ⋯ ⊕ N_r.
Proof. We use induction on the dimension d of V, the cases d = 0 or 1 being trivial. Suppose d > 1 and assume the theorem for any space of smaller dimension. In particular, the theorem is assumed to hold for T | M₁: M₁ → M₁.
It therefore suffices to prove that the eigenvalues of T | M₁ are α₂, …, α_r, and that
We can now prove Theorem 1. Let n_k be the multiplicity of α_k as a root of the characteristic polynomial of T. Then T | N_k: N_k → N_k has the unique eigenvalue α_k (the proof is like that of (2) above), and in fact the lemma implies that α_k has multiplicity n_k as an eigenvalue of T | N_k. Thus the degree of the characteristic polynomial of T | N_k is n_k = dim N_k.
The generalized eigenspace of T: V → V belonging to α_k is defined by E_k = E(T, α_k) = Ker(T − α_k)^{n_k}. Then, clearly, E_k ⊂ N_k.
In fact, it follows that E_k = N_k from the definition of N_k and Lemma 2 of the next section (applied to T − α_k). This finishes the proof of the theorem if V is complex. But everything said above is valid for an operator on a real vector space provided its eigenvalues are real. The theorem is proved.
2. Uniqueness of S and N
= ∑_{k=j}^{n−1} a_k N^{n+k−j−1} x
= a_j N^{n−1} x + ∑_{k=j+1}^{n−1} a_k N^{n+k−j−1} x
= a_j N^{n−1} x,
since n + k − j − 1 ≥ n if k ≥ j + 1. Thus a_j N^{n−1} x = 0, so N^{n−1} x = 0 because a_j ≠ 0. But this contradicts n = nil(x, N).
This result proves that in the basis {x, Nx, …, N^{n−1}x}, n = nil(x), the nilpotent operator N has a matrix with ones below the diagonal and zeros elsewhere. This is where the ones below the diagonal in the canonical form come from.
An argument similar to the proof of Lemma 1 shows:
if ∑_k a_k N^k x = 0, then a_k = 0 for k < nil(x, N).
It is convenient to introduce the notation p(T) to denote the operator ∑_k a_k T^k if p is the polynomial
p(t) = aₙtⁿ + ⋯ + a₁t + a₀.
The proof goes by induction on dim V, the case dim V = 0 being trivial. If dim V > 0, then dim N(V) < dim V, since N has a nontrivial kernel. Therefore there are nonzero vectors y₁, …, y_r in N(V) such that
N(V) = Z(y₁) ⊕ ⋯ ⊕ Z(y_r).
In this appendix we prove the inverse function theorem and the implicit function
theorem.
Inverse function theorem  Let W be an open set in a vector space E and let f: W → E be a C¹ map. Suppose x₀ ∈ W is such that Df(x₀) is an invertible linear operator on E. Then x₀ has an open neighborhood V ⊂ W such that f | V is a diffeomorphism onto an open set.
Proof. By continuity of Df: W → L(E) there is an open ball V ⊂ W about x₀ and a number ν > 0 such that if y, z ∈ V, then Df(y) is invertible,
‖Df(y)⁻¹‖ < ν,
and
‖Df(y) − Df(z)‖ < ν⁻¹.
It follows from Lemma 1 of Chapter 16, Section 1, that f | V is one-to-one. Moreover, Lemma 2 of that section implies that f(V) is an open set.
The map f⁻¹: f(V) → V is continuous. This follows from local compactness of f(V). Alternatively, in the proof of Lemma 1 it is shown that if y and z are in V, then
|y − z| ≤ ν |f(y) − f(z)|;
hence, putting f(y) = a and f(z) = b, we have
|f⁻¹(a) − f⁻¹(b)| ≤ ν |a − b|,
which proves f⁻¹ continuous.
It remains to prove that f⁻¹ is C¹. The derivative of f⁻¹ at a = f(x) ∈ f(V) is Df(x)⁻¹. To see this, we write, for b = f(y) ∈ f(V):
f⁻¹(b) − f⁻¹(a) − Df(x)⁻¹(b − a) = y − x − Df(x)⁻¹(f(y) − f(x)).
Hence
|f⁻¹(b) − f⁻¹(a) − Df(x)⁻¹(b − a)| / |b − a| ≤ ν ‖Df(x)⁻¹‖ · |f(y) − f(x) − Df(x)(y − x)| / |y − x|.
This clearly goes to 0 as |f(y) − f(x)| goes to 0. Therefore D(f⁻¹)(a) = [Df(f⁻¹a)]⁻¹. Thus the map D(f⁻¹): f(V) → L(E) is the composition: f⁻¹, followed by Df, followed by the inversion of invertible operators. Since each of these maps is continuous, so is D(f⁻¹).
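The conclusion D(f⁻¹)(a) = Df(f⁻¹a)⁻¹ can be illustrated in one dimension. This sketch is an editorial addition (the map f(x) = x + x³ and the Newton-based inverter f_inv are ours, chosen because Df(x) = 1 + 3x² is invertible everywhere): it inverts f numerically and compares a difference quotient of f⁻¹ with 1/Df.

```python
def f(x):
    return x + x ** 3        # C^1, with Df(x) = 1 + 3x^2 never zero

def Df(x):
    return 1 + 3 * x ** 2

def f_inv(a, x=0.0, steps=50):
    """Invert f near 0 by Newton's method: x <- x - Df(x)^{-1} (f(x) - a)."""
    for _ in range(steps):
        x = x - (f(x) - a) / Df(x)
    return x

a = 0.7
x = f_inv(a)
print(abs(f(x) - a) < 1e-12)             # True: x = f^{-1}(a)

# D(f^{-1})(a) = Df(f^{-1}(a))^{-1}, checked by a central difference quotient:
h = 1e-6
numeric = (f_inv(a + h) - f_inv(a - h)) / (2 * h)
print(abs(numeric - 1 / Df(x)) < 1e-5)   # True
```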
is invertible. Put F(x₀, y₀) = c. Then there are open sets U ⊂ E₁, V ⊂ E₂ with
(x₀, y₀) ∈ U × V ⊂ W
and a unique C¹ map
g: U → V
such that
F(x, g(x)) = c
for all x ∈ U; and moreover, F(x, y) ≠ c if (x, y) ∈ U × V and y ≠ g(x).
Before beginning the proof we remark that the conclusion can be rephrased thus: the graph of g is the set
F⁻¹(c) ∩ (U × V).
Thus F⁻¹(c) is a “hypersurface” in a neighborhood of (x₀, y₀).
To prove the implicit function theorem we apply the inverse function theorem to the map
f: W → E₁ × E₂,
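The theorem can be made concrete in two variables. This sketch is an editorial addition, not the text's proof: for F(x, y) = x² + y³ with ∂F/∂y = 3y² invertible at (x₀, y₀) = (1, 1), we compute the implicit function g by solving F(x, y) = c in y with Newton's method, and check that the points (x, g(x)) stay on the level set F⁻¹(c).

```python
def F(x, y):
    return x ** 2 + y ** 3   # dF/dy = 3y^2, invertible at (x0, y0) = (1, 1)

c = F(1.0, 1.0)              # c = 2

def g(x, y=1.0, steps=60):
    """Solve F(x, y) = c for y near y0 = 1 by Newton's method in y alone."""
    for _ in range(steps):
        y = y - (F(x, y) - c) / (3 * y ** 2)
    return y

# The graph of g is the piece of the "hypersurface" F^{-1}(c) near (1, 1):
for x in (0.9, 1.0, 1.1):
    print(abs(F(x, g(x)) - c) < 1e-10)   # True for each x
```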
23. J. Synge and B. Griffiths, Principles of Mechanics (New York: McGraw-Hill, 1949).
24. R. Thom, Stabilité Structurelle et Morphogénèse: Essai d'une théorie générale des modèles (Reading, Massachusetts: Addison-Wesley, 1973).
25. A. Wintner, The Analytical Foundations of Celestial Mechanics (Princeton, New Jersey: Princeton Univ. Press, 1941).
26. E. Zeeman, Differential equations for heartbeat and nerve impulses, in Dynamical Systems (M. M. Peixoto, ed.), p. 683 (New York: Academic Press, 1973).
Answers to Selected Problems
Chapter 1
Section 2 , page 12
Chapter 2
Page 27
1. F(x) = −Kx; V(x) = (K/2) ‖x‖², x ∈ R²;
m d²x/dt² = −grad V = −Kx.
“Most” initial conditions means the set of (x, v) ∈ R² × R² such that v is not collinear with x.
2. (a) with V(x, y) = x³/3 − 2y³/3, and (c) with V(x, y) = x²/2.
7. Hint: Use (4) Section 6.
Chapter 3
Section 3, page 54
Section 4 , page 60
1. (d) x = 3eᵗ cos 2t + 9eᵗ sin 2t,
y = 3eᵗ sin 2t − 9eᵗ cos 2t.
Chapter 4
Section 1, page 65
Section 2, page 69
Section 3, page 73
Chapter 5
Section 2 , page 81
3. A = l , B = &
4. (a) fi ( b) t (c) 1 (d) 4
6 . (a) and (d)
Section 3, page 87
5
3. Hint: Note that |Tx| / |x| = |Ty| if y = x/|x|.
4. (a) The norm is 1.
7. Hint: Use geometric series:
∑_{i=0}^{∞} xⁱ = 1/(1 − x)  for 0 < x < 1, with x = ‖I − T‖.
13. Hint: Show that all the terms in the power series for e^A leave E invariant.
Section 4 , page 97
z ( t ) = -f9[4t + 1) + - + e4Ik.
e4
(b)
16
(c) x(t) = A cos t + B sin t,
y(t) = −A sin t + B cos t + 2t.
2. (a) x(t) = cos 2t.  (b) x(t) = −eᵗ + e^{2t−2}.
3. (a) cos √2 t, sin √2 t  (b) exp(√2 t), exp(−√2 t)
4. Hint: Check cases (a), (b), (c) of the theorem.
8. a = 0, b > 0; period is 2π/√b.
Chapter 6
1. (a) Generalized 1-eigenspace spanned by (1, 0), (0, 1);
S = [1 0; 0 1],  N = [0 0; 0 0].
(b) S = [1 0; 0 −1],  N = [0 0; 0 0].
2. If the rth power of the matrix is [b_{ij}], then b_{ij} = 0 for i < j + r (r = 1, 2, …).
3. The only eigenvalue is 0.
5. Consider the S + N decomposition.
6. A preserves each generalized eigenspace E_λ; hence it suffices to consider the restrictions of A and T to E_λ. If T = S + N, then S | E_λ = λI, which commutes with A. Thus S and T both commute with A; so therefore does N = T − S.
8. Use the Cayley-Hamilton theorem.
15. Consider bases of the kernel and the image.
1. Canonical forms:
3. Assume that N is in nilpotent canonical form. Let b denote the number of blocks and s the maximal number of rows in a block. Then bs ≥ n; also b = n − r and s ≤ k.
4. Similar pairs are (a), (d) and (b), (c).
1. (a) [Z 0 -i
O ]
(c) [ I0f i l + i 3
4. F o r n = 3:
o o c
6. If Ax = μx, x ≠ 0, then 0 = q(A)x = q(μ)x.
8. Show that A and A t have the same Jordan form if A is a complex matrix, and
the same real canonical form if A is real.
1. (a) Let every eigenvalue have real part < −b with b > a > 0. Let A = S + N with S semisimple and N nilpotent. In suitable coordinates ‖e^{tS}‖ ≤ e^{−tb}, ‖e^{tN}‖ ≤ Ctⁿ. Then ‖e^{tA}‖ ≤ Ctⁿe^{−tb}, and so e^{ta}‖e^{tA}‖ → 0 as t → ∞. Let s > 0 be so large that e^{ta}‖e^{tA}‖ < 1 for t ≥ s. Put k = min(‖e^{tA}‖⁻¹) for 0 ≤ t ≤ s.
2. If x is an eigenvector belonging to an eigenvalue with nonzero real part, then the solution e^{tA}x is not periodic. If ib, ic are pure imaginary eigenvalues, b ≠ ±c, and z, w ∈ Cⁿ are corresponding eigenvectors, then the real part of e^{tA}(z + w) is a nonperiodic solution.
1. x(t) = e⁻ᵗ.
2. (a) In (7), A = B = 0. Hence x(0) = C, x′(0) = D, x″(0) = −C, x‴(0) = −D.
Chapter 7
Chapter 8
Page 177
1. (a) f(x) = x + 2.
u₀(t) = 2,
u₁(t) = 2 + ∫₀ᵗ f(u₀(s)) ds = 2 + ∫₀ᵗ 4 ds = 2 + 4t,
⋯
Hence
x(t) = lim_{n→∞} uₙ(t) = 4eᵗ − 2.
(b) uₙ(t) = 0 for all n; hence x(t) = 0.
(c) x(t) = t³.
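The Picard iteration in answer 1(a) can be carried out numerically. This sketch is an editorial addition (the function picard, its trapezoid quadrature, and the step counts are ours): it computes the iterates u_{k+1}(t) = x₀ + ∫₀ᵗ f(u_k(s)) ds and checks that they approach the solution x(t) = 4eᵗ − 2.

```python
import math

def picard(f, x0, t, n):
    """n-th Picard iterate u_n(t) for x' = f(x), x(0) = x0, computed
    numerically: u_{k+1}(t) = x0 + integral from 0 to t of f(u_k(s)) ds."""
    ts = [t * i / 1000 for i in range(1001)]
    u = [x0 for _ in ts]                    # u_0 is the constant x0
    for _ in range(n):
        integral, acc = [0.0], 0.0
        for i in range(1, len(ts)):         # cumulative trapezoid rule
            acc += 0.5 * (f(u[i - 1]) + f(u[i])) * (ts[i] - ts[i - 1])
            integral.append(acc)
        u = [x0 + v for v in integral]
    return u[-1]

# Problem 1(a): x' = x + 2, x(0) = 2, whose solution is x(t) = 4e^t - 2.
approx = picard(lambda x: x + 2, 2.0, 1.0, 25)
print(abs(approx - (4 * math.e - 2)) < 1e-2)   # True
```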
4. (a) 1 ≤ c ≤ 1.58.
Chapter 9
1. Hint: Find eigenvectors.
2. Let Ax = λx, Ay = μy, λ ≠ μ, μ ≠ 0. Then ⟨x, y⟩ = μ⁻¹⟨x, Ay⟩ = μ⁻¹⟨Ax, y⟩ = λμ⁻¹⟨x, y⟩, and λμ⁻¹ ≠ 1.
5. Ax = grad ½⟨x, Ax⟩.
Chapter 10
x = i_L, y = v_C.
1. λ = −2, λ = −1 ± 2√7.
x = i_L, y = v_C.
Chapter 11
1. Hint: If the limit set L is not connected, find disjoint open sets U₁, U₂ containing L. Then find a bounded sequence of points xₙ on the trajectory with xₙ ∉ U₁ ∪ U₂.
4. Hint: Every solution is periodic.
Chapter 13
(a) Hint: Show that the given condition is equivalent to the existence of an eigenvalue α of Df(x) with |α| < 1. Apply Theorem 2.
Chapter 15
Chapter 16
SUBJECT INDEX

A
Absolute convergence, 80
Adjoint, 230
Adjoint of operator, 206
α-Limit point, 198, 239
Andronov, 314
Angular momentum, 21
Annulus, 247
Antisymmetric map, 290
Areal velocity, 22
Asymptotic period, 277
Asymptotic stability, 145, 180, 186
Asymptotically stable periodic solution, 276
Asymptotically stable sink, 280
Autonomous equation, 160

B
Bad vertices, 269
Based vector, 10
Basic functions, 140
Basic regions, 267
Basin, 190
Basis, 34
  of solutions, 130
Belongs to eigenvector, 42
Bifurcation, 227, 255
  of behavior, 272
Bifurcation point, 3
Bilinearity, 75
Boundary, 229
Branches, 229, 211
Brayton-Moser theorem, 234
Brouwer fixed point theorem, 253

C
C¹, C², 178
Canonical forms, 122, 123, 331
Capacitance, 232
Capacitors, 211, 232
Cartesian product of vector spaces, 42
Cartesian space, 10
Cauchy sequence, 76
Cauchy's inequality, 75
Cayley-Hamilton theorem, 115
Center, 95
Central force fields, 19
Chain rule, 17, 178
Change
  of bases, 36
  of coordinates, 6, 36
Characteristic, 213, 232
Characteristic polynomial, 43, 103
Closed orbit, 248
Closed subset, 76
Companion matrix, 139
Comparison test, 80
Competing species, 265
Complex Cartesian space, 62
Complex eigenvalues, 43, 55
Complex numbers, 323
Complex vector space, 62, 63
Complexification of operator, 65
Complexification of vector spaces, 64
Configuration space, 287
Conjugate of complex number, 323
Conjugate momentum, 293
Conjugation, 64
Conservation
  of angular momentum, 21
  of energy, 18, 292
Conservative force field, 17
Continuous map, 76
Continuously differentiable map, 16
Contracting map theorem, 286
Contraction, 145
Convergence, 76
Convex set, 164
Coordinate system, 36
Coordinates, 34
Cross product, 20
Current, 211
Current states, 212, 229
Curve, 3, 10
Cyclic subspace, 334
Cyclic vector, 334

D
Dense set, 154
Derivative, 11, 178
Determinants, 39, 324
Diagonal form, 7
Diagonal matrix, 45
Diagonalizability, 45
Diffeomorphism, 243