Partial Differential Equations Mathematical Techniques For Engineers
Marcelo Epstein
Mathematical Engineering
Series editors
Jörg Schröder, Essen, Germany
Bernhard Weigand, Stuttgart, Germany
Today, the development of high-tech systems is unthinkable without mathematical
modeling and analysis of system behavior. As such, many fields in the modern
engineering sciences (e.g. control engineering, communications engineering,
mechanical engineering, and robotics) call for sophisticated mathematical methods
in order to solve the tasks at hand.
The series Mathematical Engineering presents new or heretofore little-known
methods to support engineers in finding suitable answers to their questions,
presenting those methods in such manner as to make them ideally comprehensible
and applicable in practice.
Therefore, the primary focus is—without neglecting mathematical accuracy—on
comprehensibility and real-world applicability.
Marcelo Epstein
Department of Mechanical
and Manufacturing Engineering
University of Calgary
Calgary, AB
Canada
Contents
Part I Background
1 Vector Fields and Ordinary Differential Equations  3
1.1 Introduction  3
1.2 Curves and Surfaces in Rn  4
1.2.1 Cartesian Products, Affine Spaces  4
1.2.2 Curves in Rn  6
1.2.3 Surfaces in R3  8
1.3 The Divergence Theorem  9
1.3.1 The Divergence of a Vector Field  9
1.3.2 The Flux of a Vector Field over an Orientable Surface  10
1.3.3 Statement of the Theorem  11
1.3.4 A Particular Case  11
1.4 Ordinary Differential Equations  12
1.4.1 Vector Fields as Differential Equations  12
1.4.2 Geometry Versus Analysis  13
1.4.3 An Example  14
1.4.4 Autonomous and Non-autonomous Systems  16
1.4.5 Higher-Order Equations  17
1.4.6 First Integrals and Conserved Quantities  18
1.4.7 Existence and Uniqueness  21
1.4.8 Food for Thought  22
References  24
2 Partial Differential Equations in Engineering  25
2.1 Introduction  25
2.2 What is a Partial Differential Equation?  26
2.3 Balance Laws  27
2.3.1 The Generic Balance Equation  28
1 Vector Fields and Ordinary Differential Equations

Although the theory of partial differential equations (PDEs) is not a mere generalization of the theory of ordinary differential equations (ODEs), there are many points of
contact between both theories. An important example of this connection is provided
by the theory of the single first-order PDE, to be discussed in further chapters. For
this reason, the present chapter offers a brief review of some basic facts about systems
of ODEs, emphasizing the geometrical interpretation of solutions as integral curves
of a vector field.
1.1 Introduction
It is not an accident that one of the inventors of Calculus, Sir Isaac Newton (1642–
1727), was also the creator of modern science and, in particular, of Mechanics. When
we compare Kepler’s (1571–1630) laws of planetary motion with Newton’s f = ma,
we observe a clear transition from merely descriptive laws, which apply to a small
number of phenomena, to structural and explanatory laws encompassing almost
universal situations, as suggested in Fig. 1.1. This feat was achieved by Newton,
and later perfected by others, in formulating general physical laws in the small
(differentials) and obtaining the description of any particular global phenomenon by
means of a process of integration (quadrature).
In other words, Newton was the first to propose that a physical law could be
formulated in terms of a system of ordinary differential equations. Knowledge of the
initial conditions (position and velocity of each particle at a given time) is necessary
and sufficient to predict the behaviour of the system for at least some interval of time.
From this primordial example, scientists went on to look for differential equations that
unlock, as it were, the secrets of Nature. When the phenomena under study involve a
continuous extension in space and time one is in the presence of a field theory, such
as is the case of Solid and Fluid Mechanics, Heat Transfer and Electromagnetism.
© Springer International Publishing AG 2017
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_1
[Fig. 1.1, detail: Newton: "the force is central and proportional to the acceleration"]
These phenomena can be described in terms of equations involving the fields and
their partial derivatives with respect to the space and time variables, thus leading
to the formulation of systems of partial differential equations. As we shall see in
this course, and as you may know from having encountered them in applications, the
analysis of these systems is not a mere generalization of the analysis of their ordinary
counterparts. The theory of PDEs is a vast field of mathematics that uses the tools of
various mathematical disciplines. Some of the specialized treatises are beyond the
comprehension of non-specialists. Nevertheless, as with so many other mathematical
areas, it is possible for engineers like us to understand the fundamental ideas at a
reasonable level and to apply the results to practical situations. In fact, most of the
typical differential equations themselves have their origin in engineering problems.
We denote by R the set of real numbers. Recall the notion of Cartesian product of
two sets, A and B, namely, the set A × B consisting of all ordered pairs of the form
(a, b), where a belongs to A and b belongs to B. More formally,
A × B = {(a, b) : a ∈ A, b ∈ B}. (1.1)

[Fig. 1.2 The affine nature of R3: the vector q − p assigned to the ordered pair of points p, q]
Note that the Cartesian product is not commutative. Clearly, we can consider the
Cartesian product of more than two sets (assuming associativity). In this spirit we
can define
Rn = R × R × · · · × R (n times). (1.2)
Thus, Rn can be viewed as the set of all ordered n-tuples of real numbers. It has
a natural structure of an n-dimensional vector space (by defining the vector sum
and the multiplication by a scalar in the natural way).1 The space Rn (or, for that
matter, any vector space) can also be seen as an affine space. In an affine space, the
elements are not vectors but points. To every ordered pair of points, p and q, a unique
vector can be assigned in some predefined supporting vector space. This vector is
denoted as pq or, equivalently, as the “difference” q − p. If the space of departure
was already a vector space, we can identify this operation with the vector difference
and the supporting space with the vector space itself, which is what we are going to
do in the case of Rn (see Fig. 1.2). In this sense, we can talk about a vector at the
point p. More precisely, however, each point of Rn has to be seen as carrying its own
“copy” of Rn , containing all the vectors issuing from that point. This is an important
detail. For example, consider the surface of a sphere. This is clearly a 2-dimensional
entity. By means of lines of latitude and longitude, we can identify a portion of this
entity with R2 , as we do in geography when drawing a map (or, more technically,
1 The dot (or inner) product is not needed at this stage, although it is naturally available. Notice, incidentally, that the dot product is not always physically meaningful. For example, the 4-dimensional classical space-time has no natural inner product.
a chart) of a country or a continent. But the vectors tangent to the sphere at a point
p, do not really belong to the sphere. They belong, however, to a copy of the entire
R2 (the tangent plane to the sphere at that point). In the case in which the sphere is
replaced by a plane, matters get simplified (and, at the same time, confused).
1.2.2 Curves in Rn
Consider an interval J ⊆ R and a continuous map

γ : J → Rn , (1.3)

or, in components, xi = xi (t), i = 1, . . . , n, (1.4)

where xi is the running variable in the i-th copy of R. The map γ is called a parametrized curve in Rn . Since to each point t ∈ J we assign a particular point in Rn , we
can appreciate that the above definition corresponds to the intuitive idea of a one-
dimensional continuous entity in space, namely something with just one “degree of
freedom”. The graph of a parametrized curve (i.e., the collection of all the image
points) is a curve. Notice that the same curve corresponds to an infinite number of parametrizations.

[Fig. 1.3 A parametrized curve γ : J → R3 , with position vector r(tp ) and tangent vector v at the point p]
If each of the functions xi (t) is not just continuous but also differentiable (to some
order), we say that the curve is differentiable (of the same order). We say that a
function is of class C k if it has continuous derivatives up to and including the order
k. If the curve is of class C ∞ , we say that the curve is smooth.
It is often convenient to use a more compact vector notation2 by introducing the
so-called position vector r in Rn , namely, the vector with components x1 , x2 , . . . xn .
A curve is then given by the equation
r = r(t). (1.5)
2 A luxury that we cannot afford on something like the surface of a sphere, for obvious reasons.
1.2.3 Surfaces in R3
In compact vector notation, a parametrized surface is given by the equation

r = r(ξ1 , ξ2 ). (1.9)
The domain of definition of the parameters ξ1 and ξ2 need not be limited to a rectangle, but can be any (closed) connected region in R2 . Higher-order surfaces (or hypersurfaces) can be defined analogously in Rn by considering continuous functions xi = xi (ξ1 , . . . ξn−1 ). More generally, the main object of Differential Geometry
is a differentiable manifold of an arbitrary number of dimensions. An n-dimensional
manifold can be covered with coordinate patches, each of which looks like an open
set in Rn .
Keeping one of the coordinates (ξ1 , say) fixed and letting the other coordinate
vary, we obtain a coordinate curve (of the ξ2 kind, say) on the given surface. The
surface can, therefore, be viewed as a one-parameter family of coordinate curves of
one kind or the other. More graphically, the surface can be combed in two ways with
coordinate curves, as illustrated in Fig. 1.4.

[Fig. 1.4 A parametrized surface Σ = r(ξ1 , ξ2 ), with parameter intervals J1 , J2 and the two families of coordinate curves]

In the differentiable case, a tangent vector to each of the coordinate curves is provided by the corresponding partial derivative of the position vector, namely,

e1 = ∂r/∂ξ1 , e2 = ∂r/∂ξ2 . (1.10)
They constitute a basis for the tangent plane to the surface. They are known as the
natural base vectors associated with the given parameters ξ1 , ξ2 .
The cross product of the natural base vectors provides us, at each point, with a
vector m = e1 × e2 perpendicular to the surface. The equation of the tangent plane
at a point x10 , x20 , x30 of the surface, with position vector r0 , is, therefore,

m · (r − r0 ) = 0. (1.11)
Remark 1.4 For the particular case of a surface expressed as x3 = f (x1 , x2 ), the
natural base vectors (adopting x1 , x2 as parameters) have the Cartesian components
e1 = (1, 0, ∂ f /∂x1 ), e2 = (0, 1, ∂ f /∂x2 ), (1.12)
and the equation of the tangent plane at (x10 , x20 , x30 ) can be written as
x3 − x30 = (∂ f /∂x1 ) (x1 − x10 ) + (∂ f /∂x2 ) (x2 − x20 ). (1.14)
vi = vi (x1 , x2 , x3 ) i = 1, 2, 3. (1.15)
flux v,A = ∫A v · n dA. (1.17)
The (Riemann) integral on the right-hand side can be regarded as the limit of the sum
of the elementary fluxes as the partition is increasingly refined.
A proof of this theorem, known also as the theorem of Gauss, can be found in
classical calculus books, such as [4] or [5]. A more modern and more general, yet
quite accessible, formulation is presented in [6].
This fundamental result of vector calculus can be regarded as a generalization of
the fundamental theorem of calculus in one independent variable. Indeed, in the one-
dimensional case, if we identify the domain D with a segment [a, b] in R, a vector
field v has a single component v. Moreover, the boundary ∂D consists of the two-
element set {a, b}. The exterior unit normals at the points a and b are, respectively,
the vectors with components −1 and +1. Thus, the divergence theorem reduces to
∫ab (dv/dx) dx = v(b) − v(a), (1.19)
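This one-dimensional reduction is easy to verify numerically. The following sketch (our own Python illustration, not taken from the text, with a sample field v(x) = x³ chosen for convenience) compares a midpoint-rule approximation of the left-hand side of (1.19) with the boundary term on the right:

```python
def check_divergence_1d(v, dv, a, b, n=10_000):
    """Compare the integral of dv/dx over [a, b] (midpoint rule)
    with the boundary term v(b) - v(a), as in Eq. (1.19)."""
    h = (b - a) / n
    integral = sum(dv(a + (i + 0.5) * h) for i in range(n)) * h
    # The exterior unit normals contribute -1 at a and +1 at b
    boundary = v(b) - v(a)
    return integral, boundary

lhs, rhs = check_divergence_1d(lambda x: x**3, lambda x: 3 * x**2, 0.0, 2.0)
```

Here rhs is exactly v(2) − v(0) = 8, and lhs agrees with it to within the quadrature error.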
3 This notation is justified by the geometric theory of integration of differential forms, which lies beyond the scope of these notes.
(∇φ)i = ∂φ/∂xi . (1.20)
What is the meaning of the term ∇φ · n? Clearly, this linear combination of partial
derivatives is nothing but the directional derivative of the function φ in the direction
of the unit vector n. If we denote this derivative by dφ/dn, we can write the statement
of the divergence theorem for the gradient of a scalar field as
∫D ∇ 2 φ dV = ∫∂D (dφ/dn) dA. (1.23)
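As a quick sanity check of this identity (again our own Python illustration, not from the text), take φ(x, y) = x² + y² on the unit square: ∇²φ = 4 everywhere, so the left-hand side equals 4, and a midpoint-rule evaluation of the boundary integral of dφ/dn must match:

```python
# phi(x, y) = x^2 + y^2 on the unit square [0, 1] x [0, 1].
# The Laplacian is constant: lap(phi) = 4, so the volume integral is 4 x area.
lhs = 4.0 * 1.0

def dphi_dx(x, y):
    return 2 * x

def dphi_dy(x, y):
    return 2 * y

# Midpoint rule for the boundary integral of dphi/dn over the four sides
n = 1000
h = 1.0 / n
rhs = 0.0
for i in range(n):
    t = (i + 0.5) * h
    rhs += h * (dphi_dx(1.0, t)     # side x = 1, outward normal (+1, 0)
                - dphi_dx(0.0, t)   # side x = 0, outward normal (-1, 0)
                + dphi_dy(t, 1.0)   # side y = 1, outward normal (0, +1)
                - dphi_dy(t, 0.0))  # side y = 0, outward normal (0, -1)
```

Each of the sides x = 1 and y = 1 contributes 2, the other two contribute 0, and the boundary total of 4 matches the volume integral.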
As we have seen, given a differentiable curve in Rn , one can define the tangent vector
at each of its points. The theory of ODEs can be regarded geometrically as providing
the answer to the inverse question. Namely, given a vector field in Rn and a point p0
in Rn , can one find a curve γ through p0 whose tangent vector at every point p of γ
coincides with the value of the vector field at p?
Let us try to clarify this idea. A vector field in Rn is given by a map
v : Rn → Rn , (1.24)
or, in components,
vi = vi (x1 , . . . , xn ) i = 1, . . . , n. (1.25)
The answer consists of finding a parametrized curve

xi = xi (t), (1.26)

satisfying the system

dxi (t)/dt = vi (x1 (t), . . . , xn (t)), i = 1, . . . , n, (1.27)
and the initial conditions (of passing through a given point p0 with coordinates xi0 )
for some (initial) value t0 of the parameter t. We clearly see that the geometric
statement of tangency to the given vector field translates into the analytic statement
of Eq. (1.27), which is nothing but a system of ordinary differential equations (ODEs)
of the first order.
Remark 1.5 If the vector field vanishes at a point P, the solution of (1.27) through P
is constant, so that the entire curve collapses to a point. We say that P is an equilibrium
position of the system.
Using the notation of Eq. (1.5), the system (1.27) can be written as
dr/dt = v. (1.29)
The system is linear if it can be written as
dr/dt = A r, (1.30)
where A is a square constant matrix.
so that we are requiring that the vector tangent to the parametrized integral curve
must be exactly equal (and not just proportional) to the vector field. The parameter t
emerging from the solution process itself may or may not have an intrinsic physical
meaning in a given context. Moreover, it should be clear that this parameter is at
most determined up to an additive constant. This arbitrariness can be removed by
specifying the value t0 = 0 at the ‘initial’ point xi0 .
An important question is whether or not a system of ODEs with given initial con-
ditions always has a solution and, if so, whether the solution is unique. Translating
this question into geometrical terms, we ask whether, given a vector field, it is always
possible to find a (unique) integral curve through a given point P. Geometrical intuition tells us that, as long as the field is sufficiently regular, we can advance a small step in the direction of the local vector at P to reach a nearby point P′, and then repeat the process to a nearby point P′′, and so on, to obtain at least a small piece of a curve. This intuition, aided by the power of geometric visualization, turns out to be correct and is formalized in the existence and uniqueness theorem, which is briefly discussed in Sect. 1.4.7.
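The step-by-step construction just described is, in essence, the forward Euler method. A minimal Python sketch of it (ours, with an illustrative field; the field names and step sizes are made-up choices):

```python
def euler_integral_curve(v, p0, dt=1e-3, steps=1000):
    """Build a piece of integral curve by repeatedly advancing a small
    step in the direction of the local vector, as described above."""
    p = list(p0)
    curve = [tuple(p)]
    for _ in range(steps):
        w = v(p)
        p = [pi + dt * wi for pi, wi in zip(p, w)]
        curve.append(tuple(p))
    return curve

# Illustrative field v(x1, x2) = (x2, -x1); its integral curves are circles
curve = euler_integral_curve(lambda p: (p[1], -p[0]), (1.0, 0.0))
```

For this field the exact integral curve through (1, 0) is the unit circle, and the Euler polygon stays close to it for small dt, confirming the geometric picture.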
1.4.3 An Example
For illustrative purposes, let us work in the plane (n = 2) and let us propose the fol-
lowing vector field (which will be later related to a very specific physical application;
can you guess which?)
v1 = x2 , v2 = − sin x1 . (1.31)

The corresponding system of ODEs is

dx1 /dt = x2 , (1.32)

dx2 /dt = − sin x1 . (1.33)
The corresponding Mathematica code and plot are shown in Fig. 1.7.
Let us now consider the same example but with different initial conditions, closer
to the origin, such as x10 = −1.5, x20 = 1. The corresponding Mathematica code and
plot are shown in Fig. 1.8.
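The Mathematica listings of Figs. 1.7 and 1.8 are not reproduced here; the following Python sketch (our rough equivalent, using a standard fourth-order Runge–Kutta step, not the book's code) integrates the field (1.31) from the second set of initial conditions:

```python
import math

def field(x1, x2):
    # The vector field (1.31): v1 = x2, v2 = -sin x1
    return x2, -math.sin(x1)

def rk4(x1, x2, dt=0.01, steps=2000):
    """Classical RK4 integration of the system (1.32)-(1.33)."""
    traj = [(x1, x2)]
    for _ in range(steps):
        k1 = field(x1, x2)
        k2 = field(x1 + 0.5 * dt * k1[0], x2 + 0.5 * dt * k1[1])
        k3 = field(x1 + 0.5 * dt * k2[0], x2 + 0.5 * dt * k2[1])
        k4 = field(x1 + dt * k3[0], x2 + dt * k3[1])
        x1 += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        x2 += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        traj.append((x1, x2))
    return traj

traj = rk4(-1.5, 1.0)  # initial conditions x10 = -1.5, x20 = 1
```

Plotting x2 against x1 along traj produces a closed curve around the origin: the oscillatory regime associated with initial conditions close to the origin.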
1.4 Ordinary Differential Equations 15
[Fig. 1.6 Vector field associated with the system of ODEs (1.31)]
Near the equilibria x1 = kπ, x2 = 0, the linearized equations are d(Δx1 )/dt = Δx2 and d(Δx2 )/dt = (−1)k+1 Δx1 , where Δx1 , Δx2 are the incremental variables, so that the linearized system has to be studied in the vicinity of the origin.
1.4.4 Autonomous and Non-autonomous Systems

A system of ODEs such as (1.27) is called autonomous, a word meant to indicate the fact that the given vector field does not depend on the parameter. A more general, non-autonomous, system would have the form
dxi (t)/dt = vi (t, x1 (t), . . . , xn (t)), i = 1, . . . , n. (1.34)
If, as is often the case, the system of equations is intended to represent the evolution of
a dynamical system (whether in Mechanics or in Economics, etc.) and if the parameter
has the intrinsic meaning of time, the explicit appearance of the time variable in the
vector field seems to contradict the principle that the laws of nature do not vary
in time. As pointed out by Arnold,4 however, the process of artificially isolating a
system, or a part of a system, from its surroundings for the purpose of a simplified
description, may lead to the introduction of time-dependent fields.
An important property of the solutions of autonomous systems of ODEs is the
group property, also known as the time-shift property. It states that if r = q(t) is a
solution of a system of ODEs corresponding to the vector field v = v(r), namely if
dq(t)
= v(q(t)), (1.35)
dt
for all t, then the curve r = q(t + s), for any fixed s, is also a solution of the same
problem. Moreover, the two integral curves coincide. The proof is straightforward.
We start by defining the function q̂(t) = q(t + s) and proceed to calculate its deriv-
ative at some value t = τ of the parameter. We obtain
dq̂/dt |t=τ = dq(t + s)/dt |t=τ = dq(t)/dt |t=τ+s = v(q(τ + s)) = v(q̂(τ )). (1.36)
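The time-shift property can be seen concretely with an autonomous field whose solutions are known in closed form. In the Python sketch below (our illustration, with the arbitrary choice v(r) = −r) we verify numerically that the shifted curve q(t + s) again satisfies the equation:

```python
import math

def v(r):
    # Autonomous field v(r) = -r; its solutions are q(t) = q0 * exp(-t)
    return -r

def q(t, q0=2.0):
    return q0 * math.exp(-t)

s, t, h = 0.7, 1.3, 1e-6
q_shift = lambda u: q(u + s)          # candidate solution r = q(t + s)

# Central-difference derivative of the shifted curve at t
deriv = (q_shift(t + h) - q_shift(t - h)) / (2 * h)

# If q(t + s) solves the ODE, this residual must (numerically) vanish
residual = abs(deriv - v(q_shift(t)))
```

The residual is at the level of finite-difference round-off, confirming that shifting an integral curve of an autonomous field in time produces another integral curve.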
Consider a single ODE of order n,

d n x/dt n = F(t, x, dx/dt, . . . , d n−1 x/dt n−1 ), (1.37)

and introduce the new variables

x1 = x, x2 = dx/dt, x3 = d 2 x/dt 2 , . . . , xn = d n−1 x/dt n−1 , (1.38)
in terms of which the original differential equation can be written as the first-order
system
dx1 /dt = x2 , (1.39)

dx2 /dt = x3 , (1.40)

. . .

dxn−1 /dt = xn , (1.41)

dxn /dt = F (t, x1 , x2 , . . . , xn ) . (1.42)
Thus a system of second-order equations, such as one obtains in the formulation
of problems in dynamics of systems of particles (and rigid bodies), can be reduced
to a system of first-order equations with twice as many equations. The unknown
quantities become, according to the scheme just described, the positions and the
velocities of the particles. In this case, therefore, the space of interest is the so-called
phase space, which always has an even dimension. If the system is non-autonomous,
it is sometimes convenient to introduce the odd-dimensional extended phase space,
which consists of the Cartesian product of the phase space with the time line R. This
terminology is widely used even in non-mechanical applications. The vector field
corresponding to an autonomous dynamical system is called its phase portrait and
its integral curves are called the phase curves. A careful analysis of the phase portrait
of an autonomous dynamical system can often reveal many qualitative properties of
its solutions.
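The substitution (1.38) is easy to mechanize. The sketch below (our Python illustration) builds the first-order field (1.39)–(1.42) from a given F and integrates x″ = −x, whose exact solution through x(0) = 1, x′(0) = 0 is cos t:

```python
import math

def reduce_to_first_order(F, n):
    """First-order vector field for d^n x / dt^n = F(t, x1, ..., xn),
    with x1 = x, x2 = dx/dt, ..., xn = d^(n-1)x/dt^(n-1) as in (1.38)."""
    def field(t, x):
        # Equations (1.39)-(1.41) shift the variables; (1.42) applies F
        return [x[i + 1] for i in range(n - 1)] + [F(t, *x)]
    return field

# Example: x'' = -x, i.e. n = 2 and F(t, x1, x2) = -x1
field = reduce_to_first_order(lambda t, x1, x2: -x1, 2)

# Crude forward-Euler integration from x(0) = 1, x'(0) = 0 up to t ~ pi
t, x, dt = 0.0, [1.0, 0.0], 1e-4
for _ in range(int(math.pi / dt)):
    f = field(t, x)
    x = [xi + dt * fi for xi, fi in zip(x, f)]
    t += dt
```

At t ≈ π the first component (the position) is close to cos π = −1 and the second (the velocity) is close to −sin π = 0, as expected.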
Given an autonomous system of ODEs

dr(t)/dt = v(r(t)), (1.43)

a first integral of the system is a (scalar) function

φ : Rn → R (1.44)
that attains a constant value over every solution of the system (1.43). In other words,
the function φ(x1 , . . . xn ) is constant along every integral curve of the vector field.
Clearly, any constant function is, trivially, a first integral. We are, therefore, only
interested in non-constant first integrals, which are the exception rather than the
rule. Whenever a non-constant first integral exists, it is usually of great physical
interest, since it represents a conserved quantity. A mechanical system is said to
be conservative if the external forces can be derived from a scalar potential U :
Rn → R, in which case the total energy of the system (kinetic plus potential) is
conserved.
Let a mechanical system with n degrees of freedom (such as a collection of springs
and masses) be described by the matrix equation
M r̈ = f(r), (1.45)
where the constant mass matrix M is symmetric and f is the vector of external forces.
The position vector r is measured in an inertial frame of reference and superimposed
dots indicate time derivatives. In many instances (such as when forces are produced by
a gravitational or electrostatic field) the external forces derive from a scalar potential
U = U(r) according to the prescription
f = −∂U(r)/∂r, (1.46)
or, in components,
fi = −∂U/∂xi , i = 1, . . . , n. (1.47)
Defining the kinetic energy

T = T (ṙ) = (1/2) ṙT M ṙ, (1.48)
the total energy E can be defined as
E = T + U. (1.49)
Let r = r(t) be a solution of the system (1.45) for some initial conditions r(0) = r0
and ṙ(0) = u0 . Let us calculate the derivative of the total energy E with respect to t
along this solution. Using the chain rule of differentiation, we can write
dE/dt = ṙT M r̈ − ṙT f, (1.50)
where we have exploited the symmetry of the mass matrix and the potential relation
(1.46). Collecting terms and enforcing (1.45) (since we have assumed r(t) to be a
solution of this system), we obtain
dE/dt = ṙT (M r̈ − f) = 0, (1.51)
which proves that E is a constant along every trajectory of the system. The value of
this constant is uniquely determined by the initial conditions.
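The conservation statement can also be checked numerically. The sketch below (our Python illustration, with a made-up two-mass spring system; the masses and stiffnesses are arbitrary numbers, not from the text) integrates an instance of (1.45) with a velocity-Verlet scheme and monitors E = T + U:

```python
# Two masses on springs: M r'' = f(r), with diagonal M = diag(m1, m2) and
# potential U = (1/2) k1 x1^2 + (1/2) k2 (x2 - x1)^2  (illustrative values)
m = [2.0, 1.0]
k1, k2 = 3.0, 1.5

def force(x):
    # f = -dU/dr, as in (1.46)-(1.47)
    return [-k1 * x[0] + k2 * (x[1] - x[0]),
            -k2 * (x[1] - x[0])]

def energy(x, v):
    T = 0.5 * sum(mi * vi**2 for mi, vi in zip(m, v))  # kinetic, cf. (1.48)
    U = 0.5 * k1 * x[0]**2 + 0.5 * k2 * (x[1] - x[0])**2
    return T + U                                       # total, cf. (1.49)

x, v, dt = [1.0, 0.0], [0.0, 0.5], 1e-3
E0 = energy(x, v)
for _ in range(20_000):  # velocity-Verlet steps (good energy behaviour)
    a = [fi / mi for fi, mi in zip(force(x), m)]
    x = [xi + dt * vi + 0.5 * dt**2 * ai for xi, vi, ai in zip(x, v, a)]
    a2 = [fi / mi for fi, mi in zip(force(x), m)]
    v = [vi + 0.5 * dt * (ai + a2i) for vi, ai, a2i in zip(v, a, a2)]
```

Along the computed trajectory the total energy stays at its initial value to within the small, bounded oscillation typical of symplectic integrators.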
For conservative mechanical systems with a single degree of freedom (n = 1), the
integral curves in the phase space coincide with the level sets of the total energy, as
described in Box 1.1. This remark facilitates the qualitative analysis of such systems.
For a fuller treatment, consult Chap. 2 of [1] or Sect. 2.12 of [2], which are useful for
the solution of Exercises 1.3 and 1.5.
[Box 1.1, not reproduced in full here, analyzes the level curves of functions of the form z = f (x) + g(y)]
1.4.7 Existence and Uniqueness

When describing, in Sect. 1.4.2, an intuitive geometric way to visualize the construction of an integral curve of a vector field (by moving piecewise along the vectors of
the field), we mentioned that the field must be ‘sufficiently regular’. It may appear
that mere continuity of the field would be sufficient for this intuitive picture to make
sense. A more rigorous analysis of the problem, however, reveals that a somewhat
stronger condition is needed, namely, Lipschitz continuity. In the case of a real func-
tion of one real variable
f : [a, b] → R, (1.52)
Lipschitz continuity means that there exists a constant K > 0 such that

| f (x1 ) − f (x2 )| ≤ K |x1 − x2 |, (1.53)

for all x1 ≠ x2 . An example of a continuous function that is not Lipschitz continuous is the function

f (x) = +√|x|. (1.54)
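A quick Python illustration (ours, not from the text) of why √|x| fails the condition: the difference quotient relative to x = 0 grows without bound, and accordingly the initial value problem dx/dt = √|x|, x(0) = 0 admits more than one solution, since both x(t) ≡ 0 and x(t) = t²/4 satisfy it:

```python
import math

f = lambda x: math.sqrt(abs(x))

# Difference quotients |f(x) - f(0)| / |x - 0| for x -> 0+: they equal 1/sqrt(x),
# so no single Lipschitz constant K can bound them all
quotients = [f(10.0**-k) / 10.0**-k for k in range(1, 7)]

# Non-uniqueness: x(t) = t^2 / 4 also solves dx/dt = sqrt(|x|), x(0) = 0
x = lambda t: t**2 / 4
t, h = 1.7, 1e-6
residual = abs((x(t + h) - x(t - h)) / (2 * h) - f(x(t)))
```

The quotients grow like 1/√x as x approaches 0, while the residual of the second candidate solution is numerically zero.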
The existence and uniqueness theorem states that, if the vector field v is Lipschitz continuous in a domain D, then for every point r0 ∈ D, there exists an ε > 0 such that the initial value problem
dr/dt = v, r(t0 ) = r0 , (1.55)
has a unique solution in the interval [t0 − ε, t0 + ε].
In geometrical terms, given a sufficiently regular vector field, we can always find
at each point a small enough integral curve passing through that point. The theorem is
also applicable to non-autonomous systems, as long as the dependence of the vector
field on the parameter is continuous. For linear systems the theorem guarantees the
existence of the solution for all values of t.
Exercises
Exercise 1.1 Show that the expression (1.16) is preserved upon a change of Cartesian coordinates.
Exercise 1.2 Show that in a system of cylindrical coordinates ξ1 , ξ2 , ξ3 defined by
x1 = ξ1 cos ξ2
x2 = ξ1 sin ξ2
x3 = ξ3 ,
the divergence of a vector field v with cylindrical components v̂1 , v̂2 , v̂3 is given by

div v = (1/ξ1 ) ∂(ξ1 v̂1 )/∂ξ1 + (1/ξ1 ) ∂ v̂2 /∂ξ2 + ∂ v̂3 /∂ξ3 .
Exercise 1.3 (a) Show that the system of ODEs given by Eqs. (1.32) and (1.33)
can be used to represent the motion of a pendulum in a vertical plane. (b) Describe
qualitatively the behaviour of solutions of the two types discussed above. What kind
of solution is represented by closed curves in phase space? (c) By changing the initial
conditions one can control which of the two types of behaviour will result. Clearly,
there exists a locus of points in phase space corresponding to initial conditions that
lie precisely in the boundary between the two types of behaviour. This locus is called
a separatrix. From considerations of conservation of energy, determine the equation
of this separatrix. (d) Plot your separatrix and verify numerically (using, for example,
the Mathematica package) that indeed a small perturbation of the initial conditions
to one side leads to a different behaviour from that caused by a perturbation to the
other side of the separatrix.
Exercise 1.4 Draw (approximately) the phase portrait for a damped pendulum,
where the damping force is proportional to the angular velocity. Compare with the
results for the undamped pendulum and comment on the nature of the solutions in
both cases. Consider various values of the damping coefficient. Is there a critical
value? Compare your results qualitatively with the corresponding one for a linear
spring with and without damping.
Exercise 1.5 A particle moves along the x axis under the force field
F(x) = −1 + 3x 2 .
Draw and analyze the corresponding phase portrait, with particular attention to the
level curves of the total energy (which represent the trajectories of the system in
phase space). Do not use a computer package.
Exercise 1.6 Show that for a system of masses subjected only to central forces
(namely, forces passing through a common fixed point in an inertial frame), the
vector of angular momentum of the system with respect to that point is conserved.
Recall that the angular momentum is the moment of the linear momentum. For the
particular case of a single particle, prove that the trajectories are necessarily plane
and derive Kepler’s law of areas.
At each point (x1 , x2 , x3 ) of R3 these two vectors determine a plane. In other words,
we have defined a two-dimensional distribution in R3 . Attempt a drawing of this
distribution around the origin and explain intuitively why this distribution fails to
be involutive. Strengthen your argument by assuming that there exists an integral
surface with equation x3 = ψ(x1 , x2 ) and show that imposing the condition that its
tangent plane belongs to the distribution (at each point in a vicinity of the origin)
leads to a contradiction.
References
2 Partial Differential Equations in Engineering

Many of the PDEs used in Engineering and Physics are the result of applying physical
laws of conservation or balance to systems involving fields, that is, quantities defined
over a continuous background of two or more dimensions, such as space and time.
Under suitable continuity and differentiability conditions, a generic balance law in
both global (integral) and local (differential) forms can be derived and applied to
various contexts of practical significance, such as Traffic flow, Solid Mechanics,
Fluid Mechanics and Heat Conduction.
2.1 Introduction
Partial differential equations arise quite naturally when we apply the laws of nature
to systems of continuous extent. We speak then of field theories. Thus, whereas the
analysis of the vibrations of a chain of masses interconnected by springs gives rise
to a system of ODEs, the dynamic analysis of a bar, where the mass is smeared out
continuously over the length of the bar, gives rise to a PDE. From this simple example,
it would appear that PDEs are a mere generalization of their ordinary counterparts,
whereby a few details need to be taken care of. This false impression is exacerbated
these days by the fact that numerical procedures, that can be implemented as computer
codes with relative ease, do actually approximate the solutions of PDEs by means of
discrete systems of algebraic equations. This is clearly a legitimate thing to do, but
one must bear in mind that, unless one possesses a basic knowledge of the qualitative
aspects of the behaviour of the continuous system, the discrete approximation may
not be amenable to a correct interpretation.
One hardly needs to defend the study of PDEs on these grounds, since they stand
alone as one of the greatest intellectual achievements of the human race in its attempt
to understand the physical world. Need one say more than the fact that from solid
and fluid mechanics all the way to quantum mechanics and general relativity, the
language of nature has so far been transcribed into PDEs? There has recently been a trend to declare the emergence of a “new science”, in which the prevalent language will be that of cellular automata and other tools that represent the behaviour
of complex systems as the result of simple interactions between a very large (but
finite) number of discrete sites of events.1 These models are particularly powerful
in applications where the underlying phenomena are too intricate to capture in any
degree of detail by means of PDEs. Such is the case in multi-scale phenomena that
appear in many modern applications in a variety of fields (biology, environmental
engineering, nanomechanics, and so on). It is too early to predict the demise of
Calculus, however. As many times in the past (think of quantum mechanics, chaos,
economics), it appears that in one way or another the usefulness of mathematical
limits (differentiation, integration) is not entirely dependent on whether or not the
actual physical system can “in reality” attain those limits. Calculus and differential
equations are here to stay just as trigonometry and Euclidean geometry are not likely
to go away.
u ,i = ∂u/∂xi , u ,i jk = ∂ 3 u/∂xk ∂x j ∂xi . (2.2)
By abuse of notation, we have listed in Eq. (2.1) just a generic term for each order
of differentiation, understanding that all the values of the indices are to be considered.
For example, when we write the argument u ,i j we mean the n 2 entries in the square
matrix of second partial derivatives.2 The requirement n > 1 is essential, otherwise
(if n = 1) we would have an ordinary differential equation. The highest order of differentiation appearing in Eq. (2.1) is called the order of the PDE.
1 This point of view is advocated in [4] with particular force by Stephen Wolfram, a physicist and
the creator of the Mathematica code (which, ironically, is one of the best tools in the market for the
solution of differential equations).
2 On the other hand, recalling the equality of mixed partial derivatives (under assumptions that we
assume to be fulfilled), the number of independent entries of this matrix is actually only n(n + 1)/2.
One of the primary sources of PDEs in Engineering and Physics is the stipulation
of conservation laws. Conservation laws or, more generally, balance laws, are the
result of a complete accounting of the variation in time of the content of an extensive
physical quantity in a certain domain. A simple analogy is the following. Suppose
that you are looking at a big farming colony (the domain of interest) and you want
to focus attention on the produce (the physical quantity of interest). As time goes
on, there is a variation in the quantity of food contained in the domain. At any given
instant of time, you want to account for the rate of change of this food content. There
are some internal sources represented in this case by the rate at which the land yields
new produce (so and so many tons per week, say). There are also sinks (or negative
sources) represented by the internal consumption of food by workers and cattle,
damage caused by hail and pests, etcetera. We will call these sources and sinks the
production of the quantity in question. It is measured in units of the original quantity
divided by the unit of time. In addition to these internal factors, there is also another
type of factors that can cause a change in content. We are referring to exchanges of
food through the boundary of the colony. These include the buying and selling of
produce that takes place at the gates, the perhaps illegal activities of some members
or visitors that personally take some food away to other destinations, etcetera. At any
given instant of time, we can estimate the rate at which these exchanges take place at
the boundary. We will call these transactions the flux of the quantity in question. We
may have a flux arising also from the fact that the boundary of the domain of interest
is changing (encroached by an enemy or by natural causes, etcetera). Assuming that
we have accounted for every one of these causes and that we believe in the principles
of causality and determinism (at least as far as the material world is concerned), we
may write the generic equation of balance as
d(content)/dt = production + flux, (2.3)
where t is the time variable.
In physically meaningful examples (balance of energy, momentum, mass, electric
charge, and so on), it is often the case that the content, the production and the flux are
somehow distributed (smeared) over the volume (in the case of the content and the
production) or over the area of the boundary (in the case of the flux). In other words,
these magnitudes are given in terms of densities, which vary (continuously, say) from
point to point and from one instant to the next. It is precisely this property (whether
real or assumed) that is responsible for the fact that we can express the basic equation
of balance (2.3) in terms of differential equations. Indeed, the differential equations
are obtained by assuming that Eq. (2.3) applies to any sub-domain, no matter how
small.
Let U represent an extensive quantity for which we want to write the equation of
balance. We assume this quantity to be scalar, such as mass, charge or energy content.3
Consider a spatial region ω fixed in R3 and representing a subset of the region of
interest. Our four independent variables are the natural coordinates x1 , x2 , x3 of R3
and the time variable t.4 When we say that U is an extensive quantity, we mean
that we can assign a value of U (the content) to each such subset ω. On physical
grounds we further assume that this set function is additive. By this we understand
3 Vector quantities, such as linear and angular momentum, can be treated in a similar way by
identifying U alternatively with each of the components in a global Cartesian frame of reference.
4 Consequently, we will not strictly adhere to the notational convention (2.2).
that the total content in two disjoint subsets is equal to the sum of the contents in
each separate subset. Under suitable continuity conditions, it can be shown that the
content of an extensive quantity U is given by a density u = u(x1 , x2 , x3 , t) in terms
of an integral, namely,
U = ∫_ω u dω. (2.4)
It is clear that this expression satisfies the additivity condition. The units of u are the
units of U divided by a unit of volume.
Similarly, the production P is assumed to be an extensive quantity and to be
expressible in terms of a production density p = p(x1 , x2 , x3 , t) as
P = ∫_ω p dω. (2.5)
The units of p are the units of U divided by a unit of volume and by the time unit.
We adopt the sign convention that a positive p corresponds to creation (source) and
a negative p corresponds to annihilation (sink).
The flux F represents the change in content per unit time flowing through the
boundary ∂ω, separating the chosen subset ω from the rest of the region of interest.
In other words, the flux represents the contact interaction between adjacent subsets.
A remarkable theorem of Cauchy shows that under reasonable assumptions the flux
is governed by a vector field, known as the flux vector f. More precisely, the inflow
per unit area and per unit time is given by
dF = (f · n)da, (2.6)
where n is the exterior unit normal to da. The significance of this result can be
summarized as follows:
1. The ‘principle of action and reaction’ is automatically satisfied. Indeed at any
given point an element of area da can be considered with either of two possible
orientations, corresponding to opposite signs of the unit normal n. Physically,
these two opposite vectors represent the exterior unit normals of the sub-bodies
on either side of da. Thus, what comes out from one side must necessarily flow
into the other.
2. All boundaries ∂ω that happen to have the same common tangent plane at one
point transmit, at that point, exactly the same amount of flux. Higher order prop-
erties, such as the curvature, play no role whatsoever in this regard. In fact, this
is the main postulate needed to prove Cauchy’s theorem.
3. The fact that the amount of flux depends linearly on the normal vector (via the dot
product) conveys the intuitive idea that the intensity and the angle of incidence
of the flowing quantity are all that matter. If you are sun-tanning horizontally at
high noon in the same position as two hours later, you certainly are taking in more
radiation per unit area of skin in the first case.
We are now in a position of implementing all our hypotheses and conclusions into
the basic balance Eq. (2.3). The result is
d/dt ∫_ω u dω = ∫_ω p dω + ∫_∂ω f · n da. (2.7)
This equation represents the global balance equation for the volume ω. It should
be clear that this equation is valid under relatively mild conditions imposed on the
functions involved. Indeed, we only need the density u to be differentiable with
respect to time and otherwise we only require that the functions be integrable. This
remark will be of great physical significance when we study the propagation of
shocks. In the case of a content u and a flux vector f which are also space-wise
differentiable, we can obtain a local version of the generic balance equation. This
local (‘infinitesimal’) version is a partial differential equation. To derive it, we start
by observing that, due to the fact that the volume ω is fixed (that is, independent of
time), the order of differentiation and integration on the left-hand side of Eq. (2.7)
can be reversed, that is,
d/dt ∫_ω u dω = ∫_ω (∂u/∂t) dω. (2.8)
Moreover, the surface integral on the right-hand side of Eq. (2.7) is the flux of a
vector field on the boundary of a domain and is, therefore, amenable to be treated by
means of the divergence theorem according to Eq. (1.18), namely,
∫_∂ω f · n da = ∫_ω div f dω. (2.9)
Collecting all the terms under a single integral we obtain the global balance equation
in the form
∫_ω (∂u/∂t − p − div f) dω = 0. (2.10)
This equation is satisfied identically for any arbitrary sub-domain ω. If the integrand
is continuous, however, it must vanish identically. For suppose that the integrand is,
say, positive at one point within the domain of integration. By continuity, it will also
be positive on a small ball B around that point. Applying the identity (2.10) to this
sub-domain B, we arrive at a contradiction. We conclude, therefore, that a necessary
and sufficient condition for the global balance equation to be satisfied identically for
arbitrary sub-domains is the identical satisfaction of the partial differential equation
∂u/∂t − p − div f = 0. (2.11)
This is the generic equation of balance in its local (differential) form. It is a single
PDE for a function of 4 variables, x1 , x2 , x3 and x4 = t.
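The divergence-theorem step (2.9) used in this derivation lends itself to a quick numerical sanity check. In the following sketch, the vector field f = (x, y², z³) and all grid sizes are illustrative choices, not from the text; the flux of f through the unit sphere is compared with the volume integral of div f over the unit ball:

```python
import numpy as np

# Surface integral of f . n over the unit sphere (midpoint rule in the
# spherical angles). On the unit sphere the outward normal n coincides
# with the position vector (x, y, z).
nt, nph = 400, 400
th = (np.arange(nt) + 0.5) * np.pi / nt        # polar angle
ph = (np.arange(nph) + 0.5) * 2 * np.pi / nph  # azimuth
TH, PH = np.meshgrid(th, ph, indexing="ij")
x, y, z = np.sin(TH) * np.cos(PH), np.sin(TH) * np.sin(PH), np.cos(TH)
fdotn = x * x + y**2 * y + z**3 * z            # f = (x, y^2, z^3)
surf = np.sum(fdotn * np.sin(TH)) * (np.pi / nt) * (2 * np.pi / nph)

# Volume integral of div f = 1 + 2y + 3z^2 over the unit ball
# (midpoint rule on a Cartesian grid of cell centres).
n = 150
s = -1.0 + (np.arange(n) + 0.5) * (2.0 / n)
X, Y, Z = np.meshgrid(s, s, s, indexing="ij")
inside = X**2 + Y**2 + Z**2 <= 1.0
vol = np.sum((1 + 2 * Y + 3 * Z**2)[inside]) * (2.0 / n) ** 3

print(surf, vol)   # both approach 32*pi/15 = 6.702...
```

Both numbers approximate 32π/15, the exact common value of the two integrals for this field; refining the grids shrinks the discrepancy.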
There are several reasons to present an independent derivation of the generic law
of balance for the case of a single spatial dimension. The first reason is that in
the case of just one dominant spatial dimension (waves or heat flow in a long bar,
current in a wire, diffusion of pollutants in a tube, etcetera), the divergence theorem
mercifully reduces to the statement of the fundamental theorem of calculus of one
variable (roughly speaking: “differentiation is the inverse of integration”). Notice
that we still are left with two independent variables, one for the spatial domain (x)
and one for the time dependence (t). Another important reason has to do with the
peculiar nature of a domain in R as compared with domains in higher dimensions.
If the spatial domain is two-dimensional, such as a membrane, its boundary is the
perimeter curve, while the upper and lower faces of the membrane are identified
with the interior points. For a three-dimensional domain, the boundary is the whole
bounding surface. On the other hand, a closed connected domain in R is just a closed
interval [a, b], with a < b. Its boundary consists of just two distinct points, as shown
in Fig. 2.1. Moreover, the exterior normal to the boundary is defined at those points
only, as a unit vector at a pointing in the negative direction and a unit vector at b
pointing in the positive direction of the real line. The flux vector f and the velocity
vector v have each just one component and can be treated as scalars. Physically,
we may think of U as the content of some extensive quantity in a wire or a long
cylindrical bar. It is important to realize that the lateral surface of this wire does not
exist, in the sense that it is not part of the boundary. On the contrary, the points on this
(Fig. 2.1: the one-dimensional domain, the interval from a to b on the x axis, whose boundary consists of the two points a and b.)
(vanishingly small) lateral surface are identified precisely with the interior points of
the wire.
If we assume that the quantity U = U (t) is continuously distributed throughout
the domain, we can express it in terms of a density u = u(x, t) per unit length of the
bar as
U = ∫_a^b u dx. (2.12)

Similarly, the production is expressed in terms of a density p = p(x, t) per unit length and per unit time as

P = ∫_a^b p dx. (2.13)
At an interior cross section, the two parts of the bar have oppositely oriented exterior unit normals, so the scalar fluxes they receive there are of opposite sign (and of equal absolute value). Notice that in this simple
one-dimensional case, the flux vector has only one component, which we denote by
f = f (x, t). Nevertheless, the concept of flux vector is very important and can be
used directly in two- and three- dimensional spatial contexts. For the actual boundary
of the body, the flux vector may be specified as a boundary condition, depending on
the specific problem being solved.
Introducing our specific expressions for content, production and flux in the generic
balance equation (2.3), we obtain
d/dt ∫_a^b u(x, t) dx = ∫_a^b p(x, t) dx + f(b, t) − f(a, t). (2.14)
As far as sign convention for the flux is concerned, we have assumed that a positive
scalar flux is inwards, into the body. What this means is that if the flux vector points
outwards, the scalar flux is actually inwards. If this convention is found unnatural,
all one has to do is reverse the sign of the last two terms in Eq. (2.14).5
This is the equation of balance in its global form. It is not yet a partial differential
equation. It is at this point that, if we wish to make the passage to the local form of the
equation, we need to invoke the fundamental theorem of calculus (or the divergence
theorem in higher dimensional contexts). Indeed, we can write
f(b, t) − f(a, t) = ∫_a^b (∂f(x, t)/∂x) dx. (2.15)
We obtain, therefore,
d/dt ∫_a^b u(x, t) dx = ∫_a^b p(x, t) dx + ∫_a^b (∂f/∂x) dx. (2.16)
If we consider that the integral limits a and b, though arbitrary, are independent of
time, we can exchange in the first term of the equation the derivative with the integral,
namely,
∫_a^b (∂u/∂t) dx = ∫_a^b p(x, t) dx + ∫_a^b (∂f/∂x) dx, (2.17)
identically for all possible integration limits. We claim now that this identity is
possible only if the integrands themselves are balanced, namely, if
5 This common policy is adopted in [3]. This is an excellent introductory text, which is highly
recommended for its clarity and wealth of examples.
∂u/∂t = p(x, t) + ∂f/∂x . (2.18)
The truth of this claim can be verified by collecting all the integrands in Eq. (2.17)
under a single integral and then arriving at a combined integrand whose integral
must vanish no matter what limits of integration are used. Clearly, if the integrand is
continuous and does not vanish at some point in the domain of integration, it will also
not vanish at any point in a small interval containing that point (by continuity). It will,
therefore, be either strictly positive or strictly negative therein. Choosing, then, that
small interval as a new domain of integration, we would arrive at the conclusion that
the integral does not vanish, which contradicts the assumption that the integration
must vanish for all values of the limits. We conclude that Eq. (2.18) must hold true.
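The telescoping of interior contributions that drives this argument can be exhibited directly in a discrete setting. In the sketch below, all arrays are arbitrary illustrative data: every cell of a partitioned bar is updated according to the local law (2.18), and the content of an arbitrary sub-interval then changes exactly by its internal production plus the fluxes at its two end faces, which is the discrete counterpart of the global balance (2.14):

```python
import numpy as np

# A bar split into n cells of width dx; f holds the n+1 face fluxes
# (inward-positive, as in the text), p the cell production densities.
rng = np.random.default_rng(0)
n, dx, dt = 50, 0.1, 0.01
u = rng.random(n)                  # content density per cell
p = rng.random(n)                  # production density per cell
f = rng.random(n + 1)              # flux at the n+1 cell faces

# One step of the discrete local balance u_t = p + f_x (Eq. 2.18):
unew = u + dt * (p + (f[1:] - f[:-1]) / dx)

# Global balance (2.14) over the arbitrary sub-interval of cells a..b-1:
a, b = 7, 31
change = (unew[a:b] - u[a:b]).sum() * dx
budget = dt * (p[a:b].sum() * dx + f[b] - f[a])
print(change - budget)   # zero up to rounding: interior fluxes cancel
```

The identity holds for every choice of a and b, mirroring the arbitrariness of the integration limits in the argument above.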
When we look at the local form of the balance equation, (2.18), we realize that we
have a single equation containing partial derivatives of two unknown functions, u
and f . What this is telling us from the physical point of view is that the equations of
balance are in general not sufficient to solve a physical problem. What is missing? If
we think of the problem of heat transfer through a wire (which is an instance of the
law of balance of energy), we realize that the material properties have played no role
whatsoever in the formulation of the equation of balance. In other words, at some
point we must be able to distinguish (in these macroscopic phenomenological models
in which matter is considered as a continuum) between different materials. Copper
is a better heat conductor than wood, but the law of balance of energy is the same for
both materials! The missing element, namely the element representing the response
of a specific medium, must be supplied by means of an extra equation (or equations)
called the constitutive law of the medium. Moduli of elasticity, heat conductivities,
piezoelectric and viscosity constants are examples of the type of information that may
be encompassed by a constitutive law. And what is that the constitutive equation can
stipulate? Certainly not the production, since this is a matter of sources and sinks,
which can be controlled in principle regardless of the material at hand. Instead, it is
the flux vector within the body that will differ from material to material according
to the present state of the system. The state of a system is given in terms of some
local variables of state s1 , s2 , . . . , sk (positions, temperatures, velocity gradients, and
so on), so that both the flux f and the content density u may depend on them. The
constitutive law is then expressed by equations such as

u = û (s1(x, t), . . . , sk(x, t), x) (2.19)

and

f = fˆ (s1(x, t), . . . , sk(x, t), x) . (2.20)
The reason that we have included a possible explicit dependence on x is that the
properties of the medium may change from point to point (as is the case in the so-
called functionally graded bodies, for instance). In principle, these properties could
also change in time, as is the case in processes of aging (biological or otherwise). In
some cases, a single variable of state is enough to characterize the system, so that
ultimately Eq. (2.18) becomes a PDE for the determination of this variable of state
as a function of space and time. Sometimes, it is possible to adopt the density u itself
as a single variable of state, so that the constitutive law simply reads

f = fˆ (u(x, t), x) , (2.21)

and Eq. (2.18) becomes

∂u/∂t = p + (∂fˆ/∂u) (∂u/∂x) + ∂fˆ/∂x . (2.22)
It is often convenient to adopt a subscript notation for partial derivatives of the
unknown field variable u = u(x, t), such as

u_x = ∂u/∂x ,  u_t = ∂u/∂t ,  u_xx = ∂²u/∂x² ,  u_xt = ∂²u/∂t∂x ,  . . . (2.23)
Notice that, since there is no room for confusion, we don’t place a comma before
the subscripts indicating derivatives, as we did in (2.2). In this compact notation, Eq.
(2.22) reads
u_t − (∂fˆ/∂u) u_x = p + ∂fˆ/∂x . (2.24)
We have purposely left the partial derivatives of the constitutive function fˆ unaffected
by the subscript notation. The reason for this is that the constitutive function fˆ is not
an unknown of the problem. On the contrary, it is supposed to be known as that part
of the problem statement that identifies the material response. Its partial derivatives
are also known as some specific functions of u and x. Notice that in the case of a
homogeneous material, the last term in Eq. (2.24) vanishes.
In the terminology introduced in the previous section, Eq. (2.24) is a first order,
quasi-linear PDE. If the constitutive function fˆ happens to be a linear function of
u, the PDE becomes linear. The linearity of constitutive laws is still one of the most
common assumptions in many branches of engineering (for example: Hooke’s law,
Ohm’s law, Fourier’s law, Darcy’s law, etcetera, are not actual laws of nature but
constitutive assumptions that are useful linear approximations to the behaviour of
some materials within certain ranges of operation). Notice that in the examples just
mentioned, the constitutive laws are expressed in terms of space derivatives of state
variables (respectively, displacement, electric potential, temperature and pressure).
As a result, the equation of balance combined with the constitutive law yields a second
order PDE. The theory of a single first-order PDE is comparable in its precision and
implementation to the theory of systems of ODEs. This is not the case for higher
order PDEs or for systems of first order PDEs, as we shall see later. At this point,
however, we are only interested in illustrating the emergence of PDEs of any order
and type from well-defined engineering contexts, without much regard for their
possible solutions. Accordingly, in the next section, we will display several instances
of balance laws, which constitute a good (but by no means the only) source of PDEs
in applications.
2.4 Examples of PDEs in Engineering

2.4.1 Traffic Flow

A comprehensive review of models for traffic flow is beyond our present scope.
Instead, we present here a simplified version of the fundamental equation, based on
the assumptions that the road is of a single lane and that (within the portion of road
being analyzed) there are no entrances or exits. The quantity we want to balance
is the content of cars. We, therefore, interpret u = u(x, t) as the car density at the
point x along the road at time t. Since we have assumed no entrances or exits, the
production term p vanishes identically. The flux term f has the following physical
interpretation: At any given cross section of the road and at a given instant of time,
it measures the number of cars per unit time that pass through that cross section or,
more precisely, the number of cars per unit time that enter one of the portions of road
to the right or left of the cross section. With our (counter-intuitive) sign convention,
a positive value of f corresponds to an inflow of cars. We have seen that the flux is
actually governed by a flux vector f. Denoting by v = v(x, t) the car-velocity field,
we can write
f = −u v. (2.25)
In other words, if the velocity points in the direction of the exterior normal to the
boundary (so that the dot product is positive) the term u v measures the number
of cars that in a unit of time are coming out through that boundary. Since in our
case everything is one-dimensional, the velocity vector is completely defined by its
component v along the axis of the road, so that we can write
f = −u v. (2.26)
The local balance equation (2.24) for this traffic flow problem reads, therefore,
u_t + (∂(u v)/∂u) u_x = 0. (2.27)
∂u
The time has come now to adopt some constitutive law. Clearly, the velocity of
the cars may depend on a large number of factors, including the time of day, the
weather, the traffic density, etcetera. In the simplest model, the velocity will depend
only on the traffic density, with larger densities giving rise to smaller speeds. From
practical considerations, since cars have a finite length, there will be an upper bound
u max for the density, and it is sensible to assume that when this maximum is attained
the traffic comes to a stop. On the other hand, we may or may not wish to consider
an upper limit vmax for the speed, when the traffic density tends to zero. If we do, a
possible constitutive equation that we may adopt is
v = v_max (1 − u/u_max) . (2.28)
If, on the other hand, we do not want to impose a speed limit in our model, a possible
alternative constitutive law is

v = k ln (u_max/u) , (2.29)
where k is a positive constant (which perhaps varies from road to road, as it may take
into consideration the quality of the surface, the width of the lane, and so on).
Introducing the constitutive law (2.28) into our balance law (2.27), we obtain the
quasi-linear first-order PDE
u_t + (1 − 2u/u_max) v_max u_x = 0. (2.30)
In the extreme case when the speed is independent of the density and equal to a
constant, we obtain the advection equation
u_t + v_max u_x = 0. (2.31)
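As a first taste of how such equations behave, the advection equation (2.31) can be integrated numerically with a first-order upwind scheme. In this sketch the grid parameters and the initial bump are illustrative assumptions; the density profile translates rigidly with speed v_max:

```python
import numpy as np

vmax, dx, dt = 1.0, 0.01, 0.005        # Courant number vmax*dt/dx = 0.5
x = np.arange(0.0, 1.0, dx)
u = np.exp(-200.0 * (x - 0.3) ** 2)    # initial density bump centred at 0.3

for _ in range(40):                    # advance to t = 0.2
    u[1:] -= vmax * dt / dx * (u[1:] - u[:-1])   # upwind difference

# The bump is now centred near x = 0.3 + vmax * 0.2 = 0.5, slightly
# smeared by the numerical diffusion of the first-order scheme.
print(x[np.argmax(u)])
```

The scheme differences against the upwind (left) neighbour because information travels to the right with speed v_max > 0; this is what keeps the update stable.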
2.4.2 Diffusion
Diffusive processes are prevalent in everyday life. They occur, for example, whenever
a liquid or gaseous substance spreads within another (sneezing, pouring milk into a
cup of coffee, industrial pollution, etc.). The process of heat flow through a substance
subjected to a temperature gradient is also a diffusive process. All these processes
are characterized by thermodynamic irreversibility (the drop of milk poured into the
coffee will never collect again into a drop).
Consider a tube filled with water at rest in which another substance (the pollutant)
is present with a variable concentration u = u(x, t). Let p = p(x, t) be the produc-
tion of pollutant per unit length and per unit time. This production can be the result
of industrial exhaust into the tube, coming from its lateral surface at various points,
or of a similar process of partial clean-up of the tube. If there is any influx through
the ends of the tube, it will have to be considered as part of the boundary conditions
(which we have not discussed yet), rather than of the production term. The flux, just
as in the case of traffic flow, represents the amount of pollutant traversing a given
cross section per unit time. In the case of traffic flow, we introduced as a variable
of state the speed of the traffic, which we eventually related to the car density by
means of a constitutive law. In the case of diffusion of a pollutant, on the other hand,
it is possible to formulate a sensible, experimentally based, constitutive law directly
in terms of the pollutant concentration. The most commonly used law, called Fick’s
law, states that the flux vector is proportional to the gradient of the concentration,
namely,
f = D u_x , (2.32)
where the constant D is the diffusivity of the pollutant in water. A moment’s reflection
reveals that, with our sign convention, if we want the pollutant to flow in the direction
of smaller concentrations, the diffusivity must be positive. Introducing these results
into the general balance equation (2.18), we obtain
u_t − D u_xx = p. (2.33)
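Equation (2.33) with p = 0 is readily integrated by an explicit finite-difference sketch. The grid sizes below are illustrative; the well-known stability restriction D·dt/dx² ≤ 1/2 of this explicit scheme is respected:

```python
import numpy as np

D, dx, dt = 1.0, 0.02, 1e-4            # D*dt/dx**2 = 0.25
x = np.linspace(0.0, 1.0, 51)
u = np.sin(np.pi * x)                  # concentration held at 0 at both ends

for _ in range(1000):                  # advance to t = 0.1
    u[1:-1] += D * dt / dx**2 * (u[2:] - 2.0 * u[1:-1] + u[:-2])

# For this initial profile the exact solution is sin(pi x) exp(-pi^2 D t),
# so the peak should be close to exp(-pi^2 * 0.1) = 0.3727...
print(u.max())
```

The profile decays without changing shape, the signature of the irreversibility mentioned above: running the scheme with a negative time step would blow up rather than reassemble the initial data.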
2.4.3 Longitudinal Waves in an Elastic Bar

Assuming that the particles in a thin cylindrical bar are constrained to move in the
axial direction, the law of balance of momentum (Newton’s second law) can be seen
as a scalar equation. The momentum density (momentum per unit length) is given
by ρ A v, ρ being the mass density, A the cross-section area and v the component
of the velocity vector. The production term in this case consists of any applied force
per unit length (such as the weight, if the bar is held vertically). We will assume for
now that there are no applied external forces, so that the production term vanishes
identically. The flux associated with the momentum is what we call the stress tensor,
which in this case can be represented by a single component σ (perpendicular to the
normal cross sections). The balance of momentum7 reads
6 The adjective non-homogeneous, in this case, refers to the fact that there are sources or sinks, that
is, p does not vanish identically. Material inhomogeneity, on the other hand, would be reflected in
a variation of the value of the diffusivity D throughout the tube.
7 Neglecting convective terms.
∂(ρ A v)/∂t = ∂(σ A)/∂x . (2.34)
Assuming constant cross section and density, we obtain
ρ v_t = σ_x . (2.35)
This balance equation needs to be supplemented with a constitutive law. For a linearly
elastic material, the stress is proportional to the strain (ε), that is,
σ = E ε, (2.36)
where E is the modulus of elasticity, and the strain is related to the axial displacement u = u(x, t) by the kinematic relations of the infinitesimal theory of strain, namely ε = u_x , while v = u_t . Putting all these results
together we obtain the second-order linear PDE

u_tt = c² u_xx , (2.39)

where c² = E/ρ.
Equation (2.39) is known as the one-dimensional wave equation. The constant c will
be later interpreted as the speed of propagation of waves in the medium. A similar
equation can be derived for the problem of small transverse vibrations of a string
(such as that of a guitar) under tension. In this case, the constant c is given by the
square root of the ratio between the tension in the string and its mass per unit length.8
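The wave equation (2.39) can be integrated with the classical leapfrog (central-difference) scheme. In this sketch all numbers are illustrative; the Courant number c·dt/dx is set to 1, for which the scheme happens to reproduce the standing mode sin(πx) cos(πct) exactly on the grid:

```python
import numpy as np

c = 1.0
x = np.linspace(0.0, 1.0, 101)         # string/bar with fixed ends
dx = x[1] - x[0]
dt = dx / c                            # Courant number c*dt/dx = 1
u_prev = np.sin(np.pi * x)             # u(x, 0); initial velocity u_t(x, 0) = 0

# First step from the initial conditions (Taylor expansion in time):
u = u_prev.copy()
u[1:-1] += 0.5 * (u_prev[2:] - 2.0 * u_prev[1:-1] + u_prev[:-2])

# Leapfrog updates up to t = 2.0, one full period of this mode:
for _ in range(199):
    u_next = np.zeros_like(u)          # ends stay clamped at zero
    u_next[1:-1] = (2.0 * u[1:-1] - u_prev[1:-1]
                    + u[2:] - 2.0 * u[1:-1] + u[:-2])
    u_prev, u = u, u_next

print(np.abs(u - np.sin(np.pi * x)).max())   # agrees to machine precision
```

Unlike the diffusion example, the solution returns to its initial state after one period: the wave equation is time-reversible.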
2.4.4 Solitons
A celebrated example is the Korteweg–de Vries (KdV) equation,

u_t + u u_x + u_xxx = 0. (2.41)
Here, u = u(x, t) represents a measure of the height of the water in a long channel
of constant cross section. The KdV equation is a third-order quasi-linear PDE. It can
be brought to the form of a conservation law (2.18) by setting p = 0 and
f = − ( (1/2) u² + u_xx ) . (2.42)
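That the flux (2.42) indeed puts the KdV equation into the balance form u_t = p + f_x (with p = 0 and the inward-positive flux convention used throughout this chapter) can be verified symbolically; a quick check using the sympy library:

```python
import sympy as sp

x, t = sp.symbols("x t")
u = sp.Function("u")(x, t)

f = -(sp.Rational(1, 2) * u**2 + sp.diff(u, x, 2))   # the flux (2.42)
balance = sp.diff(u, t) - sp.diff(f, x)              # u_t - f_x
kdv = sp.diff(u, t) + u * sp.diff(u, x) + sp.diff(u, x, 3)

print(sp.simplify(balance - kdv))   # 0: the two forms coincide
```

Differentiating the flux produces −(u u_x + u_xxx), so u_t − f_x reproduces the KdV equation term by term.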
For the small transverse deflections w = w(x, y, t) of a stretched membrane under a tension T and a transverse load q per unit area, a similar balance of momentum yields

w_xx + w_yy = − q/T + (ρh/T) w_tt , (2.43)
where h is the thickness of the membrane. In the absence of the external loading
term q, this is the two-dimensional wave equation. If, on the other hand, we seek
an equilibrium position under the action of a time-independent load, we obtain the
second-order linear PDE

w_xx + w_yy = − q/T . (2.44)
This is the Poisson equation. If the right-hand side vanishes (no load, but perhaps a
slightly non-planar frame) we obtain the Laplace equation. These equations appear
in many other engineering applications, including fluid mechanics, acoustics, elec-
trostatics and gravitation.
In Continuum Mechanics [1] the field variables are always associated with a con-
tinuous material body as the carrier of contents, sources and fluxes. The material
body, made up of material points, manifests itself through its configurations in the
physical space R3 . In this brief presentation, we will adhere strictly to the Eulerian
formulation, which adopts as its theatre of operations the current configuration of
the body in space. The domains ω used in the formulation of the generic balance
equation must, accordingly, be subsets of the current configuration. In other words,
they must be made of spatial points occupied at the current time by material particles.
The generic equation of balance in its global form (2.7) or in its local form (2.11),
is still applicable. In Continuum Mechanics, however, it is convenient to identify
two distinct parts of the total flux F through the boundary ∂ω which we call the
convected flux Fc and the physical flux F p , that is,
F = Fc + F p . (2.45)
The convected flux F_c is the content carried through the (fixed) boundary by the motion of the particles themselves, namely

F_c = − ∫_∂ω u v · n da , (2.46)

where n is the exterior unit normal to da as part of ∂ω. The negative sign indicates
that we have assumed an influx as positive. Naturally, if the particle velocity happens
to be tangential to the boundary at a point, the convected flux at that point vanishes.
Remark 2.1 Had we assumed a material volume as the point of departure (that is,
a volume that follows the particles in their motion in space), the corresponding
convected flux would have automatically vanished. This simplification would have
resulted, however, in the need to use a compensatory transport theorem for the
calculation of the time variation of an integral on a moving domain. The convected
flux is, literally, in the eyes of the beholder.
The second part of the flux, F p has the clear physical meaning of the flow of
content through the boundary due to causes other than mere motion or rest of the
control volume. Thus, for instance, if the content in question is the internal energy
of a rigid body at rest, the flux through the boundary represents the conductive heat
flux. It is important to notice once more that the physical flux takes place at each
internal boundary (not just the external boundary of the body) separating a sub-body
from the rest. Cauchy’s theorem implies that the physical flux is governed by a flux
vector field f p , as before. Thus, we obtain the global form
d/dt ∫_ω u dω = ∫_ω p dω + ∫_∂ω (−u v + f_p) · n da. (2.47)
Applying the divergence theorem and following our previous localization argument
we obtain the generic law of balance of Continuum Mechanics in its local form as
∂u/∂t − p − div(−u v + f_p) = 0. (2.48)
In terms of the material derivative, discussed in Box 2.1, this equation can also be
written as
Du/Dt − p + u div v − div f_p = 0. (2.49)
For the conservation of mass, the content density is the mass density ρ, while the production and the physical flux vanish.9 Accordingly, we obtain
∂ρ/∂t + div(ρ v) = 0, (2.50)
or, using the material derivative,
Dρ/Dt + ρ div v = 0. (2.51)
This equation is known in Fluid Mechanics as the continuity equation.
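As a quick sanity check of (2.50), one can verify symbolically that a sample flow satisfies it; the uniform expansion below (velocity v = x/(1 + t) with spatially uniform density) is an illustrative choice, not from the text:

```python
import sympy as sp

x, y, z, t = sp.symbols("x y z t")
v = sp.Matrix([x, y, z]) / (1 + t)      # uniform expansion velocity field
rho = (1 + t) ** -3                     # density decays as the volume grows

lhs = sp.diff(rho, t) + sum(
    sp.diff(rho * v[i], c) for i, c in enumerate((x, y, z)))
print(sp.simplify(lhs))   # 0: mass is conserved by this flow
```

Each material line is stretched by a factor (1 + t), so the volume grows like (1 + t)³ and the density must decay by the same factor for ∂ρ/∂t + div(ρv) to vanish.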
For the balance of linear momentum, according to Newton's second law, the content density is the vector of
linear momentum, namely, ρ v. The production is the body force b per unit spatial
volume and the physical flux is the surface traction vector t. We will implement
the generic balance equation component by component in a global Cartesian inertial
frame. For each component ti of the traction vector, according to Cauchy’s theorem,
there exists a flux vector with components σi j ( j = 1, 2, 3). We have thus a matrix
representing, in the given coordinate system, the components of the Cauchy stress
tensor σ. The surface traction t = σ · n is best expressed in components as
ti = σi j n j , (2.52)
where the summation convention for repeated indices has been enforced, as explained
in Box 2.2. The equation of balance (2.48) for ρvi reads
∂(ρ v_i)/∂t − b_i − (−ρ v_i v_j + σ_ij),j = 0. (2.53)
On the other hand, the continuity equation (2.50) is written in Cartesian components
(always enforcing the summation convention) as
∂ρ/∂t + (ρ v_j),j = 0. (2.54)
9 This is the case of the conservation of mass in conventional Continuum Mechanics. In the context of
growing bodies (such as is the case in some biological materials) mass is not necessarily conserved.
Expanding the derivatives in Eq. (2.53) and enforcing the continuity equation (2.54), we obtain

ρ ∂v_i/∂t + ρ v_i,j v_j = b_i + σ_ij,j . (2.55)
Using the material derivative, we can write the local form of the balance of linear
momentum as
ρ Dv_i/Dt = b_i + σ_ij,j . (2.56)
The material derivative of the velocity is, not surprisingly, the acceleration. Having
thus enforced the conservation of mass, the form of Newton’s second law for a
continuum states that the mass density times the acceleration equals the body force
plus the net contact force over the boundary of an elementary volume element.
div v = v_i,i ,
where, as in Eq. (2.2), commas stand for partial derivatives with respect to the
coordinates designated by the indices following the comma. An equation such
as
B_i = A_ijk C_jk

stands for

B_i = Σ_{k=1}^n Σ_{j=1}^n A_ijk C_jk .
Expressions in which an index appears more than twice, such as A_ij B_j C_j , are
considered wrong, unless the summation convention has been explicitly
suspended.
For a continuous deformable medium, unlike the case of a rigid body, the balance
of angular momentum is an independent postulate. It establishes that the angular
momentum with respect to any point attached to an inertial reference frame is equal
to the sum of the moments of the external forces about that point. The angular momen-
tum density is the (pseudo-)vector r ×(ρv), where, without any loss of generality, we
have identified the fixed point as the origin of coordinates, so that r is the standard
position vector. Assuming the absence of applied couples, and enforcing both mass
conservation and balance of linear momentum, the final result is purely algebraic,
namely,
$$\sigma_{ij} = \sigma_{ji}. \qquad (2.57)$$
The total energy density per unit volume is
$$e = \frac{1}{2}\, \rho\, \mathbf{v} \cdot \mathbf{v} + \rho\, \varepsilon. \qquad (2.58)$$
In this expression, the first term on the right-hand side represents the kinetic energy density, while $\varepsilon$ is the internal energy per unit mass, postulated to exist as a function of
state by the first law of Thermodynamics. The mechanical source density is stipulated
by the same law as the power of the body force, that is b · v, while the thermal (that
is, non-mechanical) source is provided by sources of heat distributed with a density
r per unit volume. Similarly, the physical mechanical flux is given by the power of
the traction, that is, t · v while the physical heat flux is defined by means of a heat
flux vector q such that the non-mechanical influx per unit area and per unit time is
given by −q · n. The balance of energy equates the rate of change of the total energy
with the sum of the mechanical and thermal contributions. The final result can be
expressed as
$$\rho\, \frac{D \varepsilon}{D t} = r - q_{i,i} + \sigma_{ij} v_{i,j}, \qquad (2.59)$$
where the previous balance equations have been enforced.
Exercises
Exercise 2.1 Derive Eq. (2.18) by applying the equation of balance to a small (infin-
itesimal) slice of the bar, that is, for a slice contained between the cross sections at
x and at x + d x.
Exercise 2.2 Carry out all the steps leading to Eq. (2.57) that establishes the sym-
metry of the Cauchy stress tensor.
Exercise 2.3 Carry out all the steps leading to Eq. (2.59) that establishes the balance
of energy in a continuous medium.
Exercise 2.4 The rate of deformation tensor D is defined in Cartesian components by
$$D_{ij} = \frac{1}{2} \left( v_{i,j} + v_{j,i} \right).$$
The Newtonian compressible viscous fluid has the constitutive equation
$$\boldsymbol{\sigma} = -p(\rho)\, \mathbf{I} + \lambda\, (\operatorname{tr} \mathbf{D})\, \mathbf{I} + 2 \mu\, \mathbf{D}.$$
In this equation, p = p(ρ) is some increasing function of the density and λ and μ are constant viscosity coefficients. The symbol I stands for the spatial identity tensor and the operator tr is the trace, namely, tr D = D_{ii} is the sum of the diagonal entries in the matrix representing D in a Cartesian frame. Use this constitutive equation in the equation of balance of linear momentum to obtain the Navier–Stokes equations of Fluid Mechanics [2]. These are three PDEs for the density and the components of the velocity field. The continuity equation completes the system.
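As a small numerical illustration of the constitutive law (a Python sketch with arbitrary sample values, assuming the standard Newtonian form σ = −p I + λ(tr D) I + 2μ D discussed in the exercise), one can check that the resulting stress is symmetric, as required by Eq. (2.57):

```python
import numpy as np

lam, mu = 1.5, 0.8            # viscosity coefficients (illustrative values)
p = 2.0                       # pressure p(rho) at the point considered

# A sample velocity gradient L_ij = v_{i,j}
L = np.array([[0.1, 0.3, 0.0],
              [0.0, -0.2, 0.1],
              [0.2, 0.0, 0.1]])

D = 0.5 * (L + L.T)           # rate of deformation tensor D_ij
I = np.eye(3)
sigma = -p * I + lam * np.trace(D) * I + 2.0 * mu * D

# The stress is symmetric, as required by Eq. (2.57)
assert np.allclose(sigma, sigma.T)
print(np.trace(sigma))   # equals -3p + (3*lam + 2*mu) * tr(D)
```

Symmetry holds because D itself is the symmetric part of the velocity gradient; the antisymmetric (spin) part does not enter the Newtonian stress.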
References
1. Chadwick P (1999) Continuum mechanics: Concise theory and problems. Dover, New York
2. Chorin AJ, Marsden JE (1993) A mathematical introduction to fluid mechanics. Springer, Berlin
3. Knobel R (2000) An introduction to the mathematical theory of waves. American Mathematical
Society, Providence
4. Wolfram S (2002) A new kind of science. Wolfram Media, Champaign
Part II
The First-Order Equation
Chapter 3
The Single First-Order Quasi-linear PDE
Remarkably, the theory of linear and quasi-linear first-order PDEs can be entirely
reduced to finding the integral curves of a vector field associated with the coefficients
defining the PDE. This idea is the basis for a solution technique known as the method
of characteristics. It can be used for both theoretical and numerical considerations.
Quasi-linear equations are particularly interesting in that their solutions, even when starting from perfectly smooth initial conditions, may break down. The physical meaning
of this phenomenon can be often interpreted in terms of the emergence of shock
waves.
3.1 Introduction
It has been pointed out by many authors1 that there is no general theory that encom-
passes all partial differential equations or even some large classes of them. The
exception is provided by the case of a single first-order PDE, for which a general
theory does exist that, remarkably, reduces the problem to the solution of certain
systems of ordinary differential equations. The most general first-order PDE for a
function u = u(x₁, …, xₙ) is of the form
$$F\left( x_1, \dots, x_n, u, u_{,1}, \dots, u_{,n} \right) = 0, \qquad (3.1)$$
where a comma followed by an index i denotes the partial derivative with respect to xᵢ. Introducing the notation pᵢ = u_{,i} for the first partial derivatives, the equation can also be written as
$$F\left( x_1, \dots, x_n, u, p_1, \dots, p_n \right) = 0. \qquad (3.3)$$
1 This point is made most forcefully by Arnold in [1].
© Springer International Publishing AG 2017 51
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_3
Fig. 3.1 A differential equation (left) and the canonical lift of a curve (right)
The most general quasi-linear first-order PDE, namely one that is linear in the first derivatives of the unknown function, has the form
$$a_1(x_1, \dots, x_n, u)\, \frac{\partial u}{\partial x_1} + \dots + a_n(x_1, \dots, x_n, u)\, \frac{\partial u}{\partial x_n} = c(x_1, \dots, x_n, u). \qquad (3.4)$$
In the case of two independent variables, x and y, it is possible to visualize the solution
of a PDE for a function u = u(x, y) as a surface2 in the three-dimensional space with
coordinates x, y, u. We call this surface an integral surface of the PDE. As pointed
out by Courant and Hilbert,3 geometric intuition is of great help in understanding
the theory, so that it seems useful to limit ourselves to the case of two independent
variables, at least for now. The visualization of the elements of the theory becomes
particularly useful in the case of linear and quasi-linear equations. The general non-
linear case is more difficult to grasp and will be the subject of Chap. 5.
The general form of a quasi-linear first-order PDE in two independent variables
is
$$a(x, y, u)\, u_x + b(x, y, u)\, u_y = c(x, y, u). \qquad (3.5)$$
2 This visualization has nothing to do with the more abstract geometric interpretation given in Box
3.1, which we will not pursue.
3 In [3], p. 22. This classical treatise on PDEs, although not easy to read, is recommended as a basic
reference work in the field of PDEs. A few of the many standard works that deal with first-order
PDEs (not all books do) are: [4–7]. Don’t be fooled by the age of these books!
4 Later, however, we will allow certain types of discontinuities of the solution.
The assumption that the solution u = u(x, y) is differentiable4 means that the surface representing the solution has a well-defined tangent plane at each
point of its domain and, therefore, a well-defined normal direction.5 As we know
from Eq. (1.13), at any given point x, y a vector (not necessarily of unit length) in
this normal direction is given by the components
$$\begin{Bmatrix} u_x \\ u_y \\ -1 \end{Bmatrix}. \qquad (3.6)$$
we conclude that the statement of the differential equation can be translated into
the following geometric statement: The normal to a solution surface must be at each
point perpendicular to the characteristic vector w with components (a, b, c) evaluated
at that point in space. But this is the same as saying that this last vector must lie in
the tangent plane to the solution surface!
Remark 3.1 The use of the imagery of a solution of a first-order PDE in two inde-
pendent variables as a surface in R3 is hardly necessary. As discussed in Box
3.2, it carries spurious geometric ideas, such as the normal to the surface. These
ideas are certainly useful to visualize the properties of a solution, but may not be
directly extended to higher dimensions. That the characteristic vector is tangent
to a solution surface can, in fact, be easily proved by purely analytical means.
Indeed, let u = u(x, y) be a solution and let P = (x0 , y0 , u 0 = u(x0 , y0 )) be
a point of this solution. The characteristic vector at this point has components
a0 = a(x0 , y0 , u 0 ), b0 = b(x0 , y0 , u 0 ), c0 = c(x0 , y0 , u 0 ). For a nearby point
P + d P = (x0 + d x, y0 + dy, u 0 + du) to lie in the solution, the increments
must satisfy the algebraic relation
du = u x d x + u y dy,
u − u 0 = p (x − x0 ) + q(y − y0 ),
Let us consider the following problem in the theory of systems of ODEs: Find inte-
gral curves of the characteristic vector field w(x, y, u). From Chap. 1, we have some
experience in this type of problem, so we translate it into the system of characteristic
equations
$$\frac{dx}{ds} = a(x, y, u), \qquad (3.8)$$
$$\frac{dy}{ds} = b(x, y, u), \qquad (3.9)$$
$$\frac{du}{ds} = c(x, y, u). \qquad (3.10)$$
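The characteristic system (3.8)–(3.10) is an ordinary initial value problem and can be handed to any standard marcher. A minimal Python sketch (illustrative only; the functions a, b, c mirror the coefficients of the PDE, here chosen as those of the inviscid Burgers equation u_y + u u_x = 0, with y playing the role of time):

```python
# Coefficients of the quasi-linear PDE a u_x + b u_y = c; here those of the
# inviscid Burgers equation u_y + u u_x = 0 (y playing the role of time)
def a(x, y, u): return u
def b(x, y, u): return 1.0
def c(x, y, u): return 0.0

def characteristic(x0, y0, u0, ds=1e-3, n=2000):
    """Integrate the system (3.8)-(3.10) by forward Euler."""
    x, y, u = x0, y0, u0
    for _ in range(n):
        x, y, u = (x + a(x, y, u) * ds,
                   y + b(x, y, u) * ds,
                   u + c(x, y, u) * ds)
    return x, y, u

# One characteristic through (x, y, u) = (0, 0, 0.5), for s from 0 to 2:
# u stays constant and the projection is the straight line x = 0.5 * y
x, y, u = characteristic(0.0, 0.0, 0.5)
print(x, y, u)   # approximately (1.0, 2.0, 0.5)
```

Because c = 0 here, u is exactly conserved along the march and the Euler scheme reproduces the straight characteristic without error.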
As we know from the theory of ODEs, this system always has a solution (at least
locally). This solution can be visualized as a family of non-intersecting integral
curves in space. In the context of the theory of first-order quasi-linear PDEs these
curves are called the characteristic curves of the differential equation, or simply the characteristics.
Lemma 3.1 A characteristic curve having one point in common with an integral
surface of a quasi-linear first-order PDE necessarily lies on this surface entirely.
Proof: Let P be the common point and let γ, with parametric equations x̂(s), ŷ(s), û(s) and with s = 0 corresponding to P, be the characteristic through P. Consider the function U(s) = û(s) − u(x̂(s), ŷ(s)), where u = u(x, y) is the given integral surface. We claim that this function vanishes identically over the domain of existence of the characteristic. The geometrical meaning of the function U(s) is displayed in Fig. 3.2. We first project γ onto the (x, y) plane and obtain the projected characteristic β as the curve with equations x̂(s), ŷ(s), 0. Next, we lift β to the integral surface as the curve β⁺ with equations x̂(s), ŷ(s), u(x̂(s), ŷ(s)); the function U(s) is then the vertical gap between γ and β⁺. Since P lies on the integral surface, we have
$$U(0) = 0. \qquad (3.12)$$

Fig. 3.2 The geometrical meaning of the function U(s)
A careful calculation of the derivative of U (using the chain rule and enforcing the
original differential equation) reveals that
$$\frac{dU}{ds} = \frac{d\hat{u}}{ds} - \frac{\partial u}{\partial x} \frac{d\hat{x}}{ds} - \frac{\partial u}{\partial y} \frac{d\hat{y}}{ds} = \frac{d\hat{u}}{ds} - a \frac{\partial u}{\partial x} - b \frac{\partial u}{\partial y} = \frac{d\hat{u}}{ds} - c = 0. \qquad (3.13)$$
The (unique) solution to this trivial equation with the initial condition (3.12) yields
U (s) ≡ 0. (3.14)
This result means that γ must lie entirely on the solution surface.
Remark 3.2 We note that in the proof of this lemma we have not invoked the fact
that the characteristic vector at P is tangential to the integral (solution) surface.
Put differently, we could have proved the lemma directly from the notion of integral
curves of the characteristic field. The fact that the characteristic vectors are tangential
to integral surfaces can be obtained as a corollary to the lemma.
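Lemma 3.1 can be checked numerically on a concrete solution. The sketch below (Python, an illustration only) uses the traveling-wave solution u = 1/((x − 3t)² + 1) of the advection equation u_t + 3u_x = 0, whose characteristics are the straight lines (x, t, u) = (x₀ + 3s, t₀ + s, u₀):

```python
import math

# A solution of the advection equation u_t + 3 u_x = 0
def u(x, t):
    return 1.0 / ((x - 3.0 * t) ** 2 + 1.0)

# Characteristic through (x0, t0, u0): dx/ds = 3, dt/ds = 1, du/ds = 0
x0, t0 = 0.7, 0.2
u0 = u(x0, t0)

# Lemma 3.1: the whole characteristic lies on the integral surface
for k in range(1, 50):
    s = 0.1 * k
    assert math.isclose(u(x0 + 3.0 * s, t0 + s), u0, rel_tol=1e-12)
print("the characteristic lies on the integral surface")
```

The assertion holds for every s because x − 3t is constant along the characteristic, which is exactly the mechanism behind the proof of the lemma.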
Two important corollaries can be obtained almost immediately from the Fundamental
Lemma 3.1.
1. If two different solution surfaces have one point in common, they must intersect
along the whole characteristic passing through that point. Conversely, if two solu-
tions intersect transversely along a curve, this curve must be a characteristic of the
differential equation. By transversely we mean that along the common curve they
don’t have the same tangent plane. This means that at any point along this curve
we have a well-defined line of intersection of the local tangent planes of the two
surfaces. But recall that the tangent to the characteristic curve must belong to both
tangent planes, and therefore to their intersection. In other words, the intersection
between the two planes is everywhere tangent to the characteristic direction and,
therefore, the intersection curve is an integral curve of the characteristic field.
2. An integral surface is necessarily a collection of integral curves, since once it
contains a point it must contain the whole characteristic through that point. Con-
versely, any surface formed as a one-parameter collection of characteristic curves
is an integral surface of the PDE. What we mean by “one-parameter collection” is
that, since the characteristic curves are already one-dimensional entities, to form
a surface (which is two-dimensional) we have one degree of freedom left. For
example, we can take an arbitrary non-characteristic line and consider the surface
formed by all the characteristics emerging from this line. To show that a surface
so formed must necessarily be an integral surface, that is, a solution of the PDE,
we consider an arbitrary point P on the given surface. Since, by construction, this
surface contains the characteristic through P, the normal to the surface is also
perpendicular to the characteristic direction. But this is precisely the statement of
the PDE, which is, therefore, satisfied at each point of the given surface.
The general conclusion is that the solution of a single first-order quasi-linear PDE in two variables can be reduced to the solution of a system of ordinary differential
equations. This result remains true for more than two independent variables and also
for fully nonlinear equations (in which case the concept of characteristic curves must
be extended to the so-called characteristic strips).
The main problem in the theory of first-order PDEs is the following so-called Cauchy
problem or initial value problem6 : Given the values of u on a curve in the x, y plane,
find a solution that attains the prescribed values on the given curve. An equivalent
way to look at this problem is to regard the given (“initial”) data as a space curve
with parametric equations
x = x̄(r ), (3.15)
y = ȳ(r ), (3.16)
u = ū(r ). (3.17)
The Cauchy problem consists of finding an integral surface that contains this
curve. From the results of the previous section we know that this integral surface
must consist of a one-parameter family of characteristics. Let the characteristics be
obtained (by integration of the characteristic equations) as
6 Some authors reserve the name of initial value problem for the particular case in which the data
are specified on one of the coordinate axes (usually at t=0).
3.3 Building Solutions from Characteristics 59
Fig. 3.3 Characteristics emanating from the initial curve (identified with s = 0)

To read off r and s as functions of x and y, we need that the Jacobian determinant
$$J = \begin{vmatrix} \tilde{x}_r & \tilde{x}_s \\ \tilde{y}_r & \tilde{y}_s \end{vmatrix} \qquad (3.27)$$
does not vanish. Note that, by virtue of Eqs. (3.8) and (3.9), on the solution surface x̃_s = a and ỹ_s = b; the determinant can therefore be written in terms of the coefficients of the PDE evaluated along the initial curve.
The problem has thus been completely solved, provided J ≠ 0. The vanishing of this
determinant will be later interpreted as the occurrence of a mathematical catastrophe.
Physically, the solution ceases to be uniquely defined and a shock wave is generated.
This situation does not develop if the PDE is linear.
Suppose that the initial data curve (as a curve in the x, y, u space) happens to be
characteristic. In that case, when trying to build a one-parameter family of charac-
teristics, we find that we keep getting the same curve (namely, the initial data curve)
over and over again, so that a solution surface is not generated. This should not be
surprising. We already know, from Sect. 3.3.2, that characteristics are lines of inter-
section between solutions. Moreover, using the initial data curve (which is now a
characteristic) we can build an infinity of distinct one-parameter families of characteristics to which it belongs. In other words, there are infinitely many solutions that satisfy
the initial data. A different way to express this situation (called the characteristic
initial value problem) is by saying that the PDE in this case does not provide extra
information to allow us to come out uniquely from the initial curve. To drive this
point further, we note that by providing differentiable data by means of Eqs. (3.15),
(3.16) and (3.17), we are also automatically prescribing the derivative of the desired
solution in the direction of the curve, namely, d ū/dr = c. On the other hand, by the
chain rule at any point of the initial curve and enforcing the PDE, we know that
$$\frac{d\bar{u}}{dr} = u_x \frac{d\bar{x}}{dr} + u_y \frac{d\bar{y}}{dr} = u_x a + u_y b = c. \qquad (3.29)$$
We conclude that the PDE cannot provide us with information in directions other
than characteristic ones. The initial data must remedy this situation by giving us
information about the derivative of u in another direction.
If the initial data curve is not characteristic over its whole length but happens to
be tangential to a characteristic curve at one point, we have a situation that requires
special treatment. An extreme case is obtained when the initial curve is not charac-
teristic anywhere but is everywhere tangent to a characteristic curve. In this case, we
have an initial curve that is an envelope of characteristics. Again, this case requires
special treatment.
A more subtle situation arises when the initial data are self-contradictory. To see
the geometrical meaning of this situation, let P = (x0 , y0 , z 0 ) be a point on the initial
data curve ρ. From the theorem of existence and uniqueness of ODEs, we know that
there is a unique characteristic γ through P in some neighbourhood of the point.
Assume, moreover, that ρ and γ are not mutually tangent at P. We don’t seem to
have a problem. But suppose now that ρ and γ project on exactly the same curve in
the x, y plane. Since the tangent plane of the integral surface at P must contain both
the tangent to γ (because it is the local characteristic vector) and the tangent to ρ
(because the initial curve must belong to the solution), we obtain a vertical tangent
plane, which is not permitted if u is differentiable.
By homogeneous we mean that the right-hand side (the term without derivatives)
is zero. It follows immediately from the characteristic equations (3.8), (3.9) and
(3.10) that the first two equations can be integrated separately from the third. What
this means is that we can now talk about characteristic curves in the x, y plane.
From Eq. (3.10), we see that the value of u on these “projected” characteristic curves
must be constant. In other words, the original characteristic curves are contained in
horizontal planes and they project nicely onto non-intersecting curves in the x, y
plane.
Example 3.1 (Advection equation) Find the solution of the following advection
equation with constant coefficients
$$u_t + 3 u_x = 0, \qquad (3.31)$$
subject to the initial condition
$$u(x, 0) = \frac{1}{x^2 + 1}. \qquad (3.32)$$
Solution: The characteristic curves are given by the solutions of the system
$$\frac{dt}{ds} = 1, \qquad \frac{dx}{ds} = 3, \qquad \frac{du}{ds} = 0. \qquad (3.33)$$
This system is easily integrated to yield
$$t = s + A, \qquad x = 3s + B, \qquad u = C. \qquad (3.34)$$
The initial curve in this case lies on top of the x axis, so we can choose x itself as a
parameter. To preserve the notation of the general procedure, we write the equation
of the initial curve explicitly as
$$t = 0, \qquad x = r, \qquad u = \frac{1}{r^2 + 1}. \qquad (3.35)$$
Now is the time to enforce Eqs. (3.21), (3.22) and (3.23) to obtain
$$A = 0, \qquad B = r, \qquad C = \frac{1}{r^2 + 1}. \qquad (3.36)$$
The characteristics through the initial curve are therefore
$$t = s, \qquad x = 3s + r, \qquad u = \frac{1}{r^2 + 1}. \qquad (3.37)$$
Eliminating the parameters r and s in favour of x and t, we obtain
$$u = \frac{1}{(x - 3t)^2 + 1}. \qquad (3.38)$$
This is the desired solution. It consists of a traveling wave of the same shape as the
initially prescribed profile. This is precisely the physical meaning of the advection
equation with constant coefficients. The wave travels forward with a speed of 3.
Remark 3.3 When producing an exact solution of a PDE, it is a good idea at the
end of the whole process to verify by direct substitution that the proposed solution
satisfies the given equation and the initial conditions.
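Following Remark 3.3, the solution (3.38) can be verified by direct substitution; a short Python check (illustrative only) using central finite differences:

```python
# The advection solution (3.38) and a finite-difference check of the PDE
def u(x, t):
    return 1.0 / ((x - 3.0 * t) ** 2 + 1.0)

h = 1e-6
def u_t(x, t): return (u(x, t + h) - u(x, t - h)) / (2.0 * h)
def u_x(x, t): return (u(x + h, t) - u(x - h, t)) / (2.0 * h)

# u_t + 3 u_x = 0 at a sample of points ...
for x, t in [(0.0, 0.0), (1.3, 0.4), (-2.0, 1.7), (5.0, 2.2)]:
    assert abs(u_t(x, t) + 3.0 * u_x(x, t)) < 1e-6
# ... and the initial profile 1/(x^2 + 1) is recovered at t = 0
assert abs(u(2.0, 0.0) - 1.0 / 5.0) < 1e-15
print("solution (3.38) verified")
```

The tolerance absorbs both the truncation error of the difference quotients and floating-point roundoff.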
$$\frac{dx}{ds} = a(x, y), \qquad (3.40)$$
$$\frac{dy}{ds} = b(x, y), \qquad (3.41)$$
$$\frac{du}{ds} = c(x, y)\, u + d(x, y). \qquad (3.42)$$
Just as in the case of the homogeneous equation, we observe that the first two equa-
tions can be solved independently to yield a family of non-intersecting curves in the
x, y plane. The value of u on these lines, however, is no longer constant. Again, the
characteristic curves project nicely on the x, y plane, since the third equation can be
solved on the basis of the first two, curve by curve. Notice that from this point of
view, the linearity of the right-hand side doesn’t play a determining role. The method
of solution for the non-homogeneous equation follows in all respects the same lines
as the homogeneous one.
Consider, for example, the equation
$$u_t + 3 u_x = -2 u x, \qquad (3.43)$$
with initial conditions prescribed along the x axis in terms of the Heaviside step function H, that is,
$$t = 0, \qquad x = r, \qquad u = H[r]. \qquad (3.45)$$
The characteristic equations are
$$\frac{dt}{ds} = 1, \qquad \frac{dx}{ds} = 3, \qquad \frac{du}{ds} = -2 u x. \qquad (3.46)$$
The first two equations integrate as before; substituting x = 3s + B into the third and integrating yields
$$t = s + A, \qquad x = 3s + B, \qquad u = C\, e^{-3 s^2 - 2 B s}. \qquad (3.47)$$
Enforcing the initial conditions gives
$$A = 0, \qquad B = r, \qquad C = H[r]. \qquad (3.48)$$
Putting all these results together, just as in the previous example, we obtain the solution as
$$u = H[x - 3t]\; e^{3 t^2 - 2 x t}. \qquad (3.49)$$
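The same finite-difference verification can be applied to the solution (3.49); away from the moving discontinuity x = 3t the residual u_t + 3u_x + 2xu should vanish. A Python sketch (illustrative only):

```python
import math

def H(y):                          # Heaviside step function
    return 1.0 if y >= 0.0 else 0.0

def u(x, t):                       # the solution (3.49)
    return H(x - 3.0 * t) * math.exp(3.0 * t * t - 2.0 * x * t)

h = 1e-6
def residual(x, t):                # u_t + 3 u_x + 2 x u by central differences
    ut = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    ux = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    return ut + 3.0 * ux + 2.0 * x * u(x, t)

# The PDE (3.43) is satisfied away from the moving discontinuity x = 3t
for x, t in [(2.0, 0.1), (4.0, 1.0), (-1.0, 0.5), (0.5, 1.0)]:
    assert abs(residual(x, t)) < 1e-4
print("solution (3.49) verified off the discontinuity")
```

On the discontinuity itself the jump in H travels with speed 3, exactly along a characteristic, so the two smooth pieces are individually valid solutions.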
So far, everything seems to be working smoothly. The problems start once one crosses
the threshold into the non-linear realm. For the time being, nonlinear for us means
quasi-linear, since we have not dealt with the genuinely nonlinear case. Geometrically
speaking, the reason for the abnormal behaviour that we are going to observe is that in
the case of a quasi-linear equation the characteristics (that live in the 3-dimensional
space x, y, u) do not project nicely onto the plane of the independent variables x, y.
From the point of view of the characteristic system of ODEs, this is the result of
the fact that the first two equations are coupled with the third, unlike the case of the
linear equation, where u was not present in the first two equations. As a consequence
of this coupling, for the same values of x, y, but for a different value of u, we
obtain, in general, characteristics that do not project onto the same curve. In other
words, the projected characteristics intersect and the projected picture is a veritable
mess. Figures 3.4 and 3.5 may help to understand the above statements. Therefore, even smooth initial conditions prescribed on a curve may lead to intersections of the projected characteristics emerging from the initial curve. What this means is that at one and the
same point in the space of independent variables, we may end up having two different
solutions. When the independent variables are space and time, this situation is usually
described as the development of a shock after a finite lapse of time. This mathematical
catastrophe is accompanied by physical counterparts (sonic booms, for example).
3.4 Particular Cases and Examples 65
Figs. 3.4 and 3.5 Projected characteristics of a(x, y)u_x + b(x, y)u_y = c(x, y, u) and of a(u)u_x + b(u)u_y = 0, respectively
$$u_t + u\, u_x = 0. \qquad (3.50)$$
This equation is known as the inviscid Burgers equation. It has important applications
to gas dynamics.
The characteristic ODEs of Eq. (3.50) are
$$\frac{dt}{ds} = 1, \qquad \frac{dx}{ds} = u, \qquad \frac{du}{ds} = 0, \qquad (3.51)$$
which can be integrated to obtain the characteristic curves as t = s + A, x = Cs + B, u = C.
Let us regard pictorially the solutions u = u(x, t) as waves which at any given
time τ have the geometric profile given by the function of the single variable x defined
as u = u (τ ) (x) = u(x, τ ). This way of looking at the solution usually corresponds to
the physical interpretation and makes it easier to describe what is going on in words.
We will now consider initial data at time t = 0, that is, we will prescribe a certain
initial profile given by some function f(x), namely, u(x, 0) = f(x). Consider first the monotonically increasing profile f(x) = tanh(x). The initial curve for our Cauchy problem is then given parametrically by
$$t = 0, \qquad x = r, \qquad u = \tanh(r). \qquad (3.55)$$
Recall that r represents the running coordinate in the initial curve (along the x axis).
The projection onto the plane x, t of these characteristics is given parametrically by
the first two expressions above. In this case, it is easy to eliminate the characteristic
parameter s and to write the family of projected characteristics as
$$x = t \tanh(r) + r. \qquad (3.58)$$
For each r , this is the equation of a straight line. The situation is represented in Fig. 3.6
(produced by Mathematica ), which shows that the projected characteristics fan out,
as it were, and that they will never intersect for t > 0.
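The fanning-out can also be confirmed analytically: from (3.58), ∂x/∂r = t sech²(r) + 1 > 0, so for each fixed t > 0 the map r ↦ x is strictly increasing and no two projected characteristics can cross. A quick numerical confirmation in Python (illustrative only):

```python
import math

def x_of(r, t):                    # projected characteristic (3.58)
    return t * math.tanh(r) + r

# dx/dr = t / cosh(r)**2 + 1 > 0, so for fixed t > 0 the map r -> x is
# strictly increasing: distinct characteristics reach distinct points
for t in [0.5, 1.0, 5.0, 50.0]:
    xs = [x_of(-5.0 + 0.01 * k, t) for k in range(1001)]
    assert all(x1 < x2 for x1, x2 in zip(xs, xs[1:]))
print("projected characteristics never intersect for t > 0")
```

Strict monotonicity of r ↦ x is exactly the non-vanishing of the Jacobian, so no shock ever forms for this increasing initial profile.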
We now ask: What happens to the wave profile as time goes on? Since our solution
(if it exists) is given parametrically by Eq. (3.57), the profile for a fixed time τ > 0
is given parametrically by
Fig. 3.6 The projected characteristics fan out and never intersect for t > 0
Let us plot this profile separately for three different times τ = 0, 2, 4, as shown in
Fig. 3.7.
The wave profile tends to smear itself out over larger and larger spatial extents. We
can now plot the solution as a surface in a three-dimensional space with coordinates
x, t, u (for t ≥ 0), as shown in Fig. 3.8.
Consider now instead an initial profile with rising and falling parts, such as
$$f(x) = \frac{1}{x^2 + 1}. \qquad (3.60)$$
Fig. 3.7 Wave profiles at times τ = 0, 2, 4
Fig. 3.8 The solution surface in the (t, x, u) space
$$x = \frac{t}{r^2 + 1} + r, \qquad (3.62)$$
but now the slope is always positive, with a pattern that fans out or in for positive or
negative r , respectively. Figure 3.9 shows the state of affairs.
From Eq. (3.61) we can read off for any t = τ the following parametric relation
between u and x:
$$x = \frac{\tau}{r^2 + 1} + r, \qquad u = \frac{1}{r^2 + 1}. \qquad (3.63)$$
Figure 3.10 shows the wave profiles for τ = 0.0, 0.5, 1.0, 1.5, 2.0, 2.5.
Fig. 3.9 Projected characteristics for the initial profile f(x) = 1/(x² + 1)
to read off r and s as functions of x and t. For this to be possible, we need the Jacobian determinant
$$J = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial s} \\[2mm] \dfrac{\partial t}{\partial r} & \dfrac{\partial t}{\partial s} \end{vmatrix} = \begin{vmatrix} 1 - \dfrac{2 r s}{(r^2 + 1)^2} & \dfrac{1}{r^2 + 1} \\[2mm] 0 & 1 \end{vmatrix} = 1 - \frac{2 r s}{(r^2 + 1)^2} \qquad (3.65)$$
not to vanish. Setting J = 0 yields the condition
$$s = \frac{(r^2 + 1)^2}{2 r}. \qquad (3.66)$$
Fig. 3.10 Wave profiles at times t = 0, 0.7, 1.4, 2.1, 2.8, 3.5
Of all these combinations of r and s we are looking for that one that renders the
minimum value of t. In general, t is a function of both r and s, so that we have to
solve for the vanishing differential of this function under the constraint (3.66). In our
particular example, since t = s, this is a straightforward task. We obtain
$$dt = ds = d \left( \frac{(r^2 + 1)^2}{2 r} \right) = 0. \qquad (3.67)$$
Expanding this expression and choosing the smallest root we obtain the value of r at the breaking point as
$$r_b = \frac{\sqrt{3}}{3}, \qquad (3.68)$$
which yields the breaking time
$$t_b = s_b = \frac{\left( r_b^2 + 1 \right)^2}{2 r_b} = \frac{8 \sqrt{3}}{9} \approx 1.54. \qquad (3.69)$$
This value is corroborated by a look at the graph of the intersecting projected char-
acteristics. An equivalent reasoning to obtain the breaking time could have been the
following. Equation (3.63) provides us with the profile of the wave at time t = τ .
We are looking for the smallest value of τ that yields an infinite slope du/dx. This slope is calculated as
$$\frac{du}{dx} = \frac{du/dr}{dx/dr} = \frac{2 r}{2 r \tau - \left( r^2 + 1 \right)^2}. \qquad (3.70)$$
7 The classical reference work for this kind of problem is [2]. The title is suggestive of the importance
of the content.
function CharacteristicNumMethod
% The quasi-linear PDE aa*u_x + bb*u_t= cc is solved by the method
% of characteristics. Users must specify the functions aa, bb and cc.
% The solution is graphed in parametric form, so that shock formation
% can be discerned from the graph. The shock is not treated. Initial
% conditions are specified by function init at time t=0
N = 50; % Number of steps along the parameter s
M = 200; % Number of steps along the parameter r
ds = 0.1; % Parameter s step size
dr = 0.05; % Parameter r step size
solution = zeros(N,M,3); % x, t and u stored for each (s,r) grid point
for j=1:M % one characteristic per value of the parameter r
    r = (j-1-M/2)*dr;
    x0 = r; t0 = 0; u0 = init(x0); % initial point on the line t = 0
    for i=1:N % march along the parameter s
        solution(i,j,1) = x0;
        solution(i,j,2) = t0;
        solution(i,j,3) = u0;
        % Forward differences
        x00 = x0+aa(x0,t0,u0)*ds;
        t00 = t0+bb(x0,t0,u0)*ds;
        u00 = u0+cc(x0,t0,u0)*ds;
        x0=x00;
        t0=t00;
        u0=u00;
    end
end
% Plot solution
kc=0;
for i=1:N
for j=1:M
kc=kc+1;
XX(i,j)=solution(i,j,1);
TT(i,j)=solution(i,j,2);
UU(i,j)=solution(i,j,3);
end
end
figure(1);
mesh(XX, TT, UU); % plot the solution surface parametrically
xlabel('x'); ylabel('t'); zlabel('u');
function aa = aa(x,t,u)
% Coefficient of u_x
aa = u;
end
function bb = bb(x,t,u)
% Coefficient of u_t
bb = 1;
end
function cc = cc(x,t,u)
% Right-hand side
cc =0;
end
function init=init(x)
init=1/(x^2+1);
end
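For readers who prefer Python, the marching scheme of Box 3.3 can be condensed as follows (a sketch with the same coefficients, step sizes and initial profile; variable names mirror the MATLAB program):

```python
# Coefficients of aa*u_x + bb*u_t = cc (inviscid Burgers, as in Box 3.3)
aa = lambda x, t, u: u
bb = lambda x, t, u: 1.0
cc = lambda x, t, u: 0.0
init = lambda x: 1.0 / (x * x + 1.0)

N, M, ds, dr = 50, 200, 0.1, 0.05       # same step sizes as the MATLAB program
solution = []
for j in range(M):                      # one characteristic per r value
    r = (j - M // 2) * dr
    x, t, u = r, 0.0, init(r)
    curve = []
    for _ in range(N):                  # forward Euler along the parameter s
        curve.append((x, t, u))
        x, t, u = (x + aa(x, t, u) * ds,
                   t + bb(x, t, u) * ds,
                   u + cc(x, t, u) * ds)
    solution.append(curve)

# u is conserved along each characteristic, and by the final time some
# projected characteristics have crossed: the solution is multivalued
assert all(abs(c[0][2] - c[-1][2]) < 1e-12 for c in solution)
xs_end = [c[-1][0] for c in solution]
assert any(b < a for a, b in zip(xs_end, xs_end[1:]))
```

The final assertion detects exactly the shock formation visible in the plotted surface: adjacent characteristics arrive out of order in x.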
3.5 A Computer Program 73
Exercises
Exercise 3.1 What conditions must the functions a, b, c in (3.5) satisfy for this
equation to be linear?
Exercise 3.2 Are the characteristics of a linear equation necessarily straight lines?
Are the characteristic ODEs of a linear PDE necessarily linear? Are the characteristic
ODEs of a quasi-linear PDE necessarily non-linear?
Exercise 3.3 ([8], p. 96) Solve the following initial-value problem for v = v(x, t):
$$v_t + c\, v_x = x t, \qquad v(x, 0) = \sin x,$$
where c is a constant.
Exercise 3.4 Find the solution of the equation
(x − y)u x + (y − x − u)u y = u
Exercise 3.5 Solve the equation
$$y\, \frac{\partial u}{\partial x} + x\, \frac{\partial u}{\partial y} = u \qquad (3.73)$$
with initial conditions prescribed on the line y = 0.
Further questions: Is this a linear PDE? What are the projected characteristics of this
PDE? Is the origin an equilibrium position for the characteristic equations? Do the
initial conditions determine the solution uniquely for the whole of R2 ? What would
the domain of existence of the solution be if the initial conditions had been specified
on the line y = 1 instead of y = 0?
Exercise 3.6 Modify the MATLAB code of Box 3.3 to solve Exercise 3.5 numer-
ically. Apart from the obvious changes to the functions aa, bb, cc and init, you may
want to modify some of the numerical values of the parameters at the beginning
of the program controlling the range and precision of the solution. Comment on
quantitative and qualitative aspects of the comparison between the exact solution
obtained in Exercise 3.5 and the numerical solution provided by the program. Does
the numerical solution apply to the whole of R2 ?
Exercise 3.7 Find the breaking time of the solution of the inviscid Burgers equation
(3.50) when the initial condition is given by the sinusoidal wave u(x, 0) = sin x.
Where does the breaking occur?
Exercise 3.8 Consider the modified Burgers equation
$$u_t + u^2 u_x = 0.$$
(a) Show that the new function w = u 2 satisfies the usual Burgers equation. (b) On
the basis of this result find the solution of the initial value problem of the modified
Burgers equation with initial condition u(x, 0) = x. (c) What is the domain of validity
of this solution? (d) Is any simplification achieved by this change of variables?
The impasse exposed at the end of Chap. 3 can be summarized as follows: For some
quasi linear PDEs with given initial conditions the solution provided by the method of
characteristics ceases to be single-valued. It may very well be the case that a reinter-
pretation of the PDE, or a modification of the dependent and independent variables,
can allow us to live with the multi-valued nature of the result. On the other hand, in
most applications, the function u = u(x, y) is the carrier of an intrinsic physical vari-
able, whose single-valued nature is of the essence.1 In this event, the engineer may
wish to check whether or not the model being used has been obtained by means of
excessive simplifications (for example, by neglecting higher-order derivatives). It is
remarkable, however, that a way out of the impasse can be found, without discarding
the original model, by allowing a generalized form of the equation and its solutions.
Indeed, these generalized solutions are only piecewise smooth. In simpler words,
these solutions are differentiable everywhere, except at a sub-domain of measure
zero (such as a line in the plane), where either the function itself or its derivatives
1 For this and other points in the theory of shocks in one-dimensional conservation laws, references
We recall that, in practice, PDEs more often than not appear in applications as the
result of the statement of an equation of balance. In Chap. 2, Eq. (2.14), we learned
that the preliminary step towards obtaining the balance PDE for the case of one
spatial variable x had the integral form
$$\frac{d}{dt} \int_{x_1}^{x_2} u(x, t)\, dx = \int_{x_1}^{x_2} p(x, t)\, dx + f(x_2, t) - f(x_1, t), \qquad (4.1)$$
where x1 , x2 are arbitrary limits with x1 < x2 . There is no reason, therefore, to discard
this integral version in favour of the differential counterpart, since in order to obtain
the latter we needed to assume differentiability of u = u(x, t) over the whole domain
of interest. This domain will be divided into two parts,2 namely, the strip 0 ≤ t ≤ tb
and the half-plane t > tb . The upper part, moreover, will be assumed to be subdivided
into two regions, D− and D+ , in each of which the solution, u − and u + , is perfectly
smooth. These two regions are separated by a smooth curve with equation
x = xs (t) (4.2)
along which discontinuities in u and/or its derivatives may occur. This curve, illus-
trated in Fig. 4.1, is, therefore, the shock front (the carrier of the discontinuity). The
reason that we are willing to accept this non-parametric expression of the shock
curve is that, from the physical point of view, the derivative d xs /dt represents the
instantaneous speed of the shock, which we assume to be finite at all times. The
shock curve, clearly, passes through the original breaking point, with coordinates
(xb , tb ).
We evaluate the integral on the left-hand side of Eq. (4.1) for times beyond tb (and
for a spatial interval (a, c) containing the shock) as
$$\frac{d}{dt} \int_{a}^{c} u(x, t)\, dx = \frac{d}{dt} \int_{a}^{x_s(t)} u(x, t)\, dx + \frac{d}{dt} \int_{x_s(t)}^{c} u(x, t)\, dx. \qquad (4.3)$$
2 For the sake of simplicity, we are assuming that no other shocks develop after the breaking time
tb that we have calculated.
4.2 Generalized Solutions 77
Fig. 4.1 The shock curve x = x_s(t) separating the smooth regions D− and D+, emanating from the breaking point (x_b, t_b)
Because the limits of these integrals depend on the variable with respect to which we are taking derivatives (t), we can no longer simply exchange the derivative with the integral. Either by carrying out the derivation yourself or by consulting a calculus textbook, you can convince yourself of the following formula
$$\frac{d}{dt}\int_{f(t)}^{g(t)} u(x,t)\,dx = \int_{f(t)}^{g(t)} \frac{\partial u(x,t)}{\partial t}\,dx + u(g(t),t)\,\frac{dg(t)}{dt} - u(f(t),t)\,\frac{df(t)}{dt}. \tag{4.4}$$
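Formula (4.4) carries the weight of the argument that follows, so a quick numerical sanity check is worthwhile. The sketch below is in Python rather than the Mathematica used in this chapter's boxes, and the integrand u(x, t) = xt with limits f(t) = t, g(t) = 2t is a made-up test case, not taken from the text.

```python
# Check the Leibniz rule (4.4) on the test case u(x,t) = x*t,
# f(t) = t, g(t) = 2t, for which I(t) = integral_t^{2t} x*t dx = (3/2) t^3.

def I(t):
    return 1.5 * t**3

def rhs(t):
    # integral of du/dt = x over (t, 2t):
    integral = 1.5 * t**2
    # boundary terms u(g(t),t) g'(t) - u(f(t),t) f'(t):
    boundary = (2*t * t) * 2 - (t * t) * 1
    return integral + boundary

t0, h = 1.3, 1e-6
lhs = (I(t0 + h) - I(t0 - h)) / (2*h)   # centred difference for dI/dt
assert abs(lhs - rhs(t0)) < 1e-4
```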
Introducing this formula into Eq. (4.3) and the result into (4.1), we obtain
$$\int_a^{x_s(t)} \frac{\partial u}{\partial t}\,dx + u(x_s^-, t)\,\frac{dx_s}{dt} + \int_{x_s(t)}^c \frac{\partial u}{\partial t}\,dx - u(x_s^+, t)\,\frac{dx_s}{dt} = f(c, t) - f(a, t), \tag{4.5}$$
where the superscripts “−” and “+” indicate whether we are evaluating the (possibly
discontinuous) solution immediately to the left or to the right of the shock curve,
respectively. In Eq. (4.5) we have assumed that there is no production, since its presence would not affect the final result. We now let the limits of the original integral, a and c, approach $x_s^-$ and $x_s^+$, respectively, and obtain the speed of the shock as
$$\frac{dx_s}{dt} = \frac{f(x_s^+, t) - f(x_s^-, t)}{u(x_s^-, t) - u(x_s^+, t)}. \tag{4.6}$$
We have calculated the breaking time $t_b$ for this problem. We are now in a position to do more than this, namely, to determine the whole zone in which the solution is multi-valued. This information is obtained by implementing the zero-determinant condition (3.66) in Eq. (3.64), which yields the bounding line of this zone in parametric form. The plot of this boundary is shown in Fig. 4.2.
It should not be surprising that this line contains a cusp (at time tb ), since it is
obtained as an envelope of characteristic lines which, as shown in Fig. 3.9, turn first
one way and then the other. The shock curve starts at this cusp point and develops
within the zone enclosed by the two branches according to the Rankine–Hugoniot
condition. Notice that at each point within this domain, there are three values produced by the characteristic method. The middle value is irrelevant, since the solution must be smooth on either side of the shock curve, and the Rankine–Hugoniot condition must be enforced between the highest and the lowest value. What this means is
that, within the domain of multiplicity, we have a well-defined and smooth right-hand
side of the Rankine–Hugoniot condition, which can then be regarded as an ODE for
the determination of the shock curve. The only problem is the explicit calculation of
the highest and lowest values of the solution.
4 Naturally, because of the sign convention we used for the flux, the formula found in most books changes the sign of the right-hand side.
4.3 A Detailed Example
ParametricPlot[{(r^2+1)/(2*r)+r,(r^2+1)^2/(2*r)},{r,0,2}, AxesLabel->{x,t},
PlotRange->{{-1,5},{0,5}}, PlotStyle->Black]
In our particular case, it is not difficult to see, by looking at Eqs. (2.24) and (3.50),
that the flux function is given by
$$f(u) = -\frac{1}{2}\,u^2. \tag{4.8}$$
Introducing this result into Eq. (4.6), we obtain
$$\frac{dx_s}{dt} = \frac{(u^+)^2 - (u^-)^2}{2\,(u^+ - u^-)} = \frac{1}{2}\left(u^+ + u^-\right). \tag{4.9}$$
This means that the shock curve negotiates its trajectory in such a way as to have
a slope equal to the average of the solutions coming from either side. This can be
enforced either analytically (if this average is easily available) or numerically (if it
isn’t). Figure 4.3 illustrates what average we are talking about, by showing a typical
profile of the multi-valued solution for some t > tb .
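The reduction from (4.6) to (4.9) is a one-line algebraic identity, which can be confirmed numerically. A minimal sketch in Python (the sample states are arbitrary):

```python
# The Rankine-Hugoniot quotient (4.6) with the Burgers flux (4.8),
# f(u) = -u^2/2, reduces to the average (4.9); check on sample states.

def f(u):
    return -0.5 * u * u

for um, up in [(2.0, 0.5), (1.0, -1.5), (0.3, 0.1)]:
    rh = (f(up) - f(um)) / (um - up)   # right-hand side of (4.6)
    avg = 0.5 * (um + up)              # right-hand side of (4.9)
    assert abs(rh - avg) < 1e-12
```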
So, for each instant of time t > tb , moving the vertical line between its two extreme
points of tangency (which project onto the boundary of the multi-valued zone),
we obtain a smooth function for the right-hand side of Eq. (4.9). In our particular
example, to obtain the values u + and u − analytically we need to solve for the highest
and lowest roots of a cubic equation. Indeed, according to Eq. (3.63), the profile at
time τ is given parametrically by
$$x = \frac{\tau}{r^2 + 1} + r, \qquad u = \frac{1}{r^2 + 1}. \tag{4.10}$$
The first of these equations can be rearranged as the cubic
$$r^3 - x\,r^2 + r + (\tau - x) = 0. \tag{4.11}$$
In the multi-valued zone, this cubic equation will have three real roots, which, when substituted into the second of Eqs. (4.10), provide three values of u (i.e., the three intersections of the vertical line through x with the wave profile). The Rankine–Hugoniot condition requires the average of the highest and the lowest of these values to determine the slope of the shock curve dividing the two regions where the solution is smooth.
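For illustration only, here is a stdlib-Python way of extracting the three real roots of the cubic (4.11) and the corresponding extreme values of u (the book's Box uses Cardano's formula instead; the sample point x = 2.5, τ = 3 and the scanning window are assumptions):

```python
# Locate the three real roots of the cubic (4.11),
#     r^3 - x r^2 + r + (tau - x) = 0,
# by scanning for sign changes and refining with bisection, then
# recover the three values of u from the second of Eqs. (4.10).
# The point (x, tau) = (2.5, 3.0) and the scan window are assumptions.

def cubic(r, x, tau):
    return r**3 - x*r**2 + r + (tau - x)

def real_roots(x, tau, lo=-5.0, hi=5.0, n=2000):
    roots, step, a = [], (hi - lo) / n, lo
    while a < hi:
        b = a + step
        if cubic(a, x, tau) * cubic(b, x, tau) < 0:
            u, v = a, b
            for _ in range(60):              # bisection refinement
                m = 0.5 * (u + v)
                if cubic(u, x, tau) * cubic(m, x, tau) <= 0:
                    v = m
                else:
                    u = m
            roots.append(0.5 * (u + v))
        a = b
    return roots

x, tau = 2.5, 3.0                  # a point past the breaking time
rr = real_roots(x, tau)
assert len(rr) == 3                # inside the multi-valued zone
u_vals = sorted(1.0 / (r**2 + 1.0) for r in rr)
u_plus, u_minus = u_vals[0], u_vals[-1]   # lowest and highest branches
```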
It is interesting to remark that the foregoing picture is an example of a mathematical catastrophe, namely, a case in which a perfectly smooth manifold (the parametric surface provided by the method of characteristics, as given by Eq. (3.61)) has singularities in terms of its projection on a plane (because it turns around in such a way that
there are vertical tangent planes). The theory of catastrophes, as developed among
others by the French mathematician René Thom, was very popular some years ago
to explain almost everything under the sun, from the crash of the stock market to the
behaviour of a threatened dog. A plot of our equations
$$t = s, \qquad x = \frac{s}{r^2 + 1} + r, \qquad u = \frac{1}{r^2 + 1} \tag{4.12}$$
is shown in Fig. 4.4. Note that this graph is identical to the one in Fig. 3.11 produced
numerically by the code of Box 3.3.
For the sake of illustration, we solve our problem numerically. We remark, however, that the domain of the Rankine–Hugoniot differential equation has a boundary containing a cusp at the initial point. Thus, the theorem of uniqueness and its numerical implications may not be readily enforceable. Given a cubic equation such
as (4.11), one of the roots is always real (since complex roots appear in conjugate
pairs) and, when the other two roots happen to be real (in other words, in the multi-valued zone of interest), we need to select the two roots corresponding to the highest
and lowest values of u. Choosing these two roots, we solve and plot the solution, as
shown in Figs. 4.5 and 4.6.
Figure 4.7 shows the shock solution as a ‘chopped’ version of the multi-valued
solution obtained before. The curved vertical wall projects on the shock curve
depicted in Fig. 4.6.
(* Roots by Cardano-Tartaglia *)
rr1=x[t]/3-(2^(1/3) (3-x[t]^2))/(3 (-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-36 t x[t]+8 x[t]^2-4 t
x[t]^3+4 x[t]^4])^(1/3))+(-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-36 t x[t]+8 x[t]^2-4 t x[t]^3+4
x[t]^4])^(1/3)/(3 2^(1/3))
rr2=x[t]/3+((1+I Sqrt[3]) (3-x[t]^2))/(3 2^(2/3) (-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-36 t
x[t]+8 x[t]^2-4 t x[t]^3+4 x[t]^4])^(1/3))-((1-I Sqrt[3]) (-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-
36 t x[t]+8 x[t]^2-4 t x[t]^3+4 x[t]^4])^(1/3))/(6 2^(1/3))
rr3=x[t]/3+((1-I Sqrt[3]) (3-x[t]^2))/(3 2^(2/3) (-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-36 t
x[t]+8 x[t]^2-4 t x[t]^3+4 x[t]^4])^(1/3))-((1+I Sqrt[3]) (-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-
36 t x[t]+8 x[t]^2-4 t x[t]^3+4 x[t]^4])^(1/3))/(6 2^(1/3))
maxr=Max[Abs[rr1],Abs[rr2],Abs[rr3]]
minr=Min[Abs[rr1],Abs[rr2],Abs[rr3]]
u1=1/(maxr^2+1)
u2=1/(minr^2+1)
tb=8*Sqrt[3]/9
xbb=Sqrt[3]
NDSolve[{x'[t]==0.5*(u1+u2),x[tb]==xbb},x[t],{t,tb,6*tb}]
plot1=ParametricPlot[{{Evaluate[x[t]/.%],t},{(r^2+1)/(2*r)+r,(r^2+1)^2/(2*r)}},{t,tb,6*tb},{r,0.001,3},
PlotRange->{{0,5},{0,5}},AxesLabel->{x,t},AspectRatio->0.75]
4.4 Discontinuous Initial Conditions

The merit of the example solved in the previous section is that it clearly shows how a shock wave (an essentially discontinuous phenomenon) can develop in a finite time out of perfectly smooth initial conditions. If, on the other hand, the initial conditions are themselves discontinuous, we may obtain the immediate formation and subsequent propagation of a shock wave. A different phenomenon may also occur, as we will discuss in the next section. Discontinuous initial conditions occur as an idealized
Fig. 4.7 The shock solution (left) as a chopped version of the multi-valued solution (right)
limit of a steep initial profile, such as that corresponding to the sudden opening of a
valve.
Consider again the inviscid Burgers equation (3.50), but with the discontinuous initial conditions
$$u(x, 0) = \begin{cases} u^- & \text{for } x \le 0 \\ u^+ & \text{for } x > 0, \end{cases} \tag{4.13}$$
where $u^-$ and $u^+$ are constants with $u^- > u^+$. These are known as Riemann initial conditions. The projected characteristics for this quasi-linear problem are depicted
in Fig. 4.8.
We observe that for all t > 0 there is a region of intersection of two characteristics,
shown shaded in Fig. 4.8. Recalling that for this equation each characteristic carries
the constant value u − or u + , we conclude that a shock solution is called for in that
Fig. 4.8 Projected characteristics (right) for Riemann conditions (left) with $u^- > u^+$
region. Invoking the Rankine–Hugoniot condition (4.6) and the flux function (4.8)
for the inviscid Burgers equation, we obtain that the shock velocity is constant and
given by
$$\frac{dx_s}{dt} = \frac{1}{2}\left(u^- + u^+\right). \tag{4.14}$$
The complete solution is schematically represented in Fig. 4.9.
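That the constant speed (4.14) is the right one can be double-checked against the integral balance (4.1): with f(u) = −u²/2 and no production, d/dt of the total content of a cell containing the shock must equal the flux difference at the cell ends. A Python sketch (the Riemann states are arbitrary):

```python
# Check the travelling shock with speed (4.14) against the integral
# balance (4.1) with flux f(u) = -u^2/2 and no production, on a cell
# (a, c) containing the shock.  The Riemann states are arbitrary.
um, up = 1.5, 0.5                  # u- > u+
s = 0.5 * (um + up)                # shock speed from (4.14)
a, c = -2.0, 3.0

def total_u(t):                    # integral of u over (a, c), shock at x = s*t
    return um * (s*t - a) + up * (c - s*t)

h = 1e-6
dudt = (total_u(1.0 + h) - total_u(1.0 - h)) / (2*h)
flux = (-0.5*up**2) - (-0.5*um**2)      # f(c,t) - f(a,t)
assert abs(dudt - flux) < 1e-6
```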
Even in the case of linear PDEs it is possible to have a situation whereby the (projected) characteristics intersecting the initial manifold do not cover the whole plane. An example is the equation $y u_x + x u_y = u$, for which the characteristics are equilateral hyperbolas. In the case of quasi-linear equations, the projected characteristics
depend, of course, on the values of u on the initial manifold, so that the situation just
described may be determined by the initial values of the unknown function. Consider
again the inviscid Burgers equation with the following Riemann initial conditions:
$$u(x, 0) = \begin{cases} u^- & \text{for } x \le 0 \\ u^+ & \text{for } x > 0, \end{cases} \tag{4.15}$$
where $u^-$ and $u^+$ are constants with $u^- < u^+$. This is the mirror image of the problem just solved, where the fact that $u^-$ was larger than $u^+$ led to the immediate formation of a shock. The counterpart of Fig. 4.8 is shown in Fig. 4.10. The shaded region is
not covered by any characteristic emanating from the initial manifold t = 0. Since in
physical reality the function u does attain certain values in the shaded region, we need
to extend the solution. A possible way to do so is to postulate a constant value of the
solution in that region. This device, however, would introduce in general two shocks.
It can be verified that no value of the constant would satisfy the Rankine–Hugoniot
conditions.6 A clue as to what needs to be done can be gathered by imagining that the
jump in the initial conditions has been replaced by a smooth, albeit steep, transition.
Correspondingly, the projected characteristics would now cover the whole half-plane
t > 0 and would gradually join the existing characteristics in a fan-like manner.
Moreover, on each of the new characteristic lines the value of u is constant. In the
limit, as the steepness of the transition tends to infinity, the fan of characteristics
emanates from the origin, as suggested in Fig. 4.11. To determine the value of u on
each of these new characteristic lines, we start by noticing that the function u on
the shaded region would have to be of the form u = f (x/t), so as to preserve the
constancy on each characteristic in the fan. Introducing the function f into the PDE,
Fig. 4.10 Projected characteristics (right) for Riemann conditions (left) with $u^- < u^+$
6 For the possibility of extending the values $u^-$ and $u^+$ into the shaded region and producing a legitimate shock, see Box 4.1.
we obtain
$$0 = u_t + u\,u_x = -\frac{x}{t^2}\,f' + \frac{f\,f'}{t} = -\frac{f'}{t}\left(\frac{x}{t} - f\right), \tag{4.16}$$
where primes indicate derivatives of f with respect to its argument. Since we have already discarded the constant solutions ($f' = 0$), we are left with
$$u = \frac{x}{t}. \tag{4.17}$$
In other words, for each time t > 0 the new solution provides a linear interpolation
between the values u − and u + as shown in Fig. 4.12. This solution is continuous
and differentiable everywhere except on the two extreme characteristics. It is called
a rarefaction wave, since it corresponds to a softening of the initial conditions as
time goes on. In applications to gas dynamics, a rarefaction wave is a wave of
decompression.
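The complete solution just described is easy to test directly. The following Python sketch (with arbitrary states u⁻ < u⁺) evaluates the piecewise solution, checks continuity at the fan edges, and verifies the inviscid Burgers equation inside the fan by finite differences:

```python
# The full rarefaction solution for Riemann data with u- < u+:
# u- to the left of the fan, x/t inside it (Eq. (4.17)), u+ to the
# right.  Check continuity at the fan edges and the PDE inside the
# fan by finite differences.  The states are arbitrary.
um, up = -0.5, 1.0

def u(x, t):
    if x <= um * t:
        return um
    if x >= up * t:
        return up
    return x / t                   # the fan

t = 2.0
assert abs(u(um*t, t) - um) < 1e-12 and abs(u(up*t, t) - up) < 1e-12
x, h = 0.3, 1e-6                   # a point inside the fan
ut = (u(x, t+h) - u(x, t-h)) / (2*h)
ux = (u(x+h, t) - u(x-h, t)) / (2*h)
assert abs(ut + u(x, t) * ux) < 1e-6   # u_t + u u_x = 0
```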
Fig. 4.12 The rarefaction wave: for each $t > 0$, a linear interpolation between $u^-$ and $u^+$
Exercises
Exercise 4.1 ([2], p. 153) Write the Rankine–Hugoniot condition for the traffic
equation (2.30). Assume that some cars have already stopped at a red traffic light
and that they are packed at the maximum density. Assume, moreover, that the cars
approaching the end of the stopped line are traveling at a constant speed v0 < vmax .
Find the speed at which the tail of the stopped traffic backs up as more and more cars
join in.
Exercise 4.2 The Burgers equation for a dust: Imagine a one-dimensional flow
of non-interacting particles with no external forces. Show that this phenomenon is
described exactly by the inviscid Burgers equation (3.50). [Hint: interpret u(x, t) as
the velocity field expressed in a fixed inertial coordinate system and notice that the
particle velocities remain constant.] Notice that if all the particles are moving, say,
to the right and if the initial conditions are such that the velocity profile increases to
the right, then there is no danger that any particles will catch up with other particles.
On the other hand, if any part of the initial velocity profile has a decreasing pattern,
some particles will eventually catch up with those to their right and a ‘snowballing
effect’ (that is, a shock) will occur. The Rankine–Hugoniot condition is more difficult
to interpret intuitively, but it may be useful to try. [Hint: imagine identical particles
equally spaced moving to the right at constant speed and encountering identical
particles at rest. Assume perfectly plastic multiple collisions and use induction.]
Exercise 4.3 Show that for the initial conditions (4.15) a mathematical alternative
to the rarefaction wave is the following shock solution
$$u = \begin{cases} u^- & \text{for } x \le \dfrac{1}{2}\left(u^- + u^+\right)t \\[6pt] u^+ & \text{for } x > \dfrac{1}{2}\left(u^- + u^+\right)t. \end{cases}$$
Specifically, verify that this solution satisfies the Rankine–Hugoniot condition. Does
it satisfy the entropy condition?
5 The Genuinely Nonlinear First-Order Equation

5.1 Introduction
From the treatment of the previous chapters, it is quite clear that quasi-linear equations
can be characterized geometrically in a manner not very different from that of linear
equations. It is true that the behaviour of quasi-linear equations is richer in content
due to the fact that projected characteristics may intersect and thus give rise to the
appearance of shocks. Nevertheless, the basic interpretation of the first-order PDE
as a field of directions and the picture of a solution as a surface fitting this field are
the same, whether the equation happens to be linear or quasi linear. In a genuinely
nonlinear first-order PDE, on the other hand, these basic geometric ideas are lost
and must be replaced by somewhat more general counterparts. Remarkably, in spite
of the initially intimidating nature of the problem, the nature of the final result is
analogous to that of the linear and quasi linear cases. Namely, the construction of a
solution of a genuinely nonlinear PDE turns out to be based entirely upon the solution
of a system of ODEs.
Nonlinear first-order PDEs appear in a variety of applications, most notably in
the characterization of wave fronts arising from a system of linear second-order equations.
© Springer International Publishing AG 2017. M. Epstein, Partial Differential Equations, Mathematical Engineering, DOI 10.1007/978-3-319-55212-5_5
Consider, accordingly, the general first-order PDE
$$F(x, y, u, u_x, u_y) = 0. \tag{5.1}$$
The equation of the tangent plane to a surface $u = u(x, y)$ at a point $(x_0, y_0, u_0)$ is
$$u - u_0 = u_x\,(x - x_0) + u_y\,(y - y_0). \tag{5.2}$$
It is understood that in this equation the derivatives u x and u y are evaluated at the
point (x0 , y0 ) and that u 0 = u(x0 , y0 ), since the point of tangency must belong to the
surface. The equation of the tangent plane we have just written is a direct consequence
of the very definition of partial derivatives of a function of two variables. The vector
with components {u x , u y , −1} is perpendicular to the surface at the point of tangency.
What our PDE (5.1) tells us is that the two slopes of the tangent plane of any
putative solution surface are not independent of each other, but are rather interrelated
by the point-wise algebraic condition (5.1), a fact which, in principle, allows us to
obtain one slope when the other one is given.
To get a pictorial idea of what this linkage between the two slopes means, let us revisit the linear or quasi-linear case, namely,
$$a(x, y, u)\,u_x + b(x, y, u)\,u_y = c(x, y, u). \tag{5.3}$$
Since the normal vector to any solution surface at a point has components proportional
to {u x , u y , −1}, we conclude, from Eq. (5.3), that these normal vectors are necessarily
perpendicular to a fixed direction in space, namely the (characteristic) direction
{a, b, c}. What this means is that the tangent planes to all possible solution surfaces
(at a given point) intersect at the line defined by the characteristic direction, thus
forming a pencil of planes, as shown in Fig. 5.1. A pencil of planes resembles the
pages of a widely open book as they meet at the spine.
5.2 The Monge Cone Field
In the genuinely non-linear case, on the other hand, no preferred direction {a, b, c}
is prescribed by the PDE, resulting in the fact that the possible tangent planes (which
clearly constitute a one-parameter family of planes at each point of space) do not
necessarily share a common line. In general, therefore, we may say that, instead of
constituting a pencil of planes around a given line, they envelop a cone-like surface known as the Monge cone1 at the given point in space, as shown schematically in Fig. 5.2.2

Fig. 5.3 The solution is a surface tangent to the Monge cone field
The task of finding a solution to the PDE (5.1) can be regarded geometrically
as that of constructing a surface fitting the Monge cone field, in the sense that the
surface is everywhere tangent to the local cone, as shown schematically in Fig. 5.3.
The Monge cone can be seen at each point as defining not just one characteristic
direction (as was the case in the quasi-linear equation) but a one-parameter family of
characteristic directions, namely, the family of generators of the cone. To simplify
the notation, let us put
$$p = u_x, \qquad q = u_y. \tag{5.4}$$
Thus, the PDE (5.1) can be seen at a given point $(x_0, y_0, u_0)$ as imposing an algebraic relation between the possible values of the slopes p and q, viz.,
$$F(x_0, y_0, u_0, p, q) = 0. \tag{5.5}$$
1 In honour of Gaspard Monge (1746–1818), the great French mathematician and engineer, who made seminal contributions to many fields (descriptive geometry, differential geometry, partial differential equations).
2 There are some mathematical subtleties. For example, we are tacitly assuming that the partial
derivatives of the function F with respect to the arguments u x and u y do not vanish simultaneously.
Also, we are considering a small range of tangent planes, where one of the slopes is a single-valued
function of the other.
5.3 The Characteristic Directions
For each value of one of the slopes, this equation provides us with one3 value of the other slope. In other words, Eq. (5.5) can be regarded as providing a one-parameter family of slopes, namely,
$$p = p(\alpha), \qquad q = q(\alpha), \tag{5.6}$$
and, correspondingly, a one-parameter family of tangent planes through the point,
$$u - u_0 = p(\alpha)\,(x - x_0) + q(\alpha)\,(y - y_0). \tag{5.7}$$
To find the intersection between two planes, we need to take the cross product
of their normals.4 The intersection between two neighbouring tangent planes is,
therefore, aligned with the vector
$$\mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ p & q & -1 \\ p + p'\,d\alpha & q + q'\,d\alpha & -1 \end{vmatrix}. \tag{5.8}$$
We have indicated with primes the derivatives with respect to the parameter and we
have used otherwise standard notation for unit vectors along the coordinate axes. On
the other hand, taking the derivative of Eq. (5.5) with respect to the parameter, we
obtain
$$F_p\,p' + F_q\,q' = 0, \tag{5.9}$$
where we have used the index notation for partial derivatives of the function F with
respect to the subscripted variable. Combining the last two results, we conclude that
the direction of the cone generator located in the plane corresponding to the value α
of the parameter is given by the vector with components
$$\begin{Bmatrix} F_p \\ F_q \\ p\,F_p + q\,F_q \end{Bmatrix}. \tag{5.10}$$
3 See previous footnote.
4 The Monge cone is a particular case of an envelope of surfaces. In Box 5.1 we present a more general derivation.
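The derivation of (5.10) can be checked on a concrete case. The sketch below uses the eikonal-type equation F = p² + q² − 1 = 0, an assumed illustration not taken from the text, for which the family of slopes can be parametrized as p = cos α, q = sin α; it compares the cross product (5.8) of two neighbouring normals with the direction (5.10):

```python
# For the (assumed, illustrative) eikonal-type equation
# F = p^2 + q^2 - 1 = 0, parametrize the slopes as p = cos(a),
# q = sin(a), and compare the cone-generator direction (5.10),
# {F_p, F_q, p F_p + q F_q}, with the cross product (5.8) of two
# neighbouring tangent-plane normals {p, q, -1}.
import math

def pq(alpha):
    return math.cos(alpha), math.sin(alpha)

alpha, dalpha = 0.6, 1e-6
p, q = pq(alpha)
p2, q2 = pq(alpha + dalpha)
n1 = (p, q, -1.0)
n2 = (p2, q2, -1.0)
cross = (n1[1]*n2[2] - n1[2]*n2[1],
         n1[2]*n2[0] - n1[0]*n2[2],
         n1[0]*n2[1] - n1[1]*n2[0])
direction = (2*p, 2*q, 2*(p*p + q*q))     # (F_p, F_q, p F_p + q F_q)

def unit(v):
    n = math.sqrt(sum(c*c for c in v))
    return tuple(c / n for c in v)

cu, du_ = unit(cross), unit(direction)
assert all(abs(a - b) < 1e-5 for a, b in zip(cu, du_))
```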
Recall that in the linear or quasi-linear case the characteristic curves are the solutions of the system of ODEs
$$\frac{dx}{ds} = a(x, y, u), \qquad \frac{dy}{ds} = b(x, y, u), \qquad \frac{du}{ds} = c(x, y, u). \tag{5.11}$$
In the present, genuinely non-linear, case, the system of ODEs
$$\frac{dx}{ds} = F_p, \qquad \frac{dy}{ds} = F_q, \qquad \frac{du}{ds} = p\,F_p + q\,F_q, \tag{5.12}$$
involves the unknown slopes p and q, for which evolution equations are still needed. To obtain them, we differentiate the PDE (5.1) with respect to x and to y, respectively, obtaining
$$F_x + F_u\,p + F_p\,p_x + F_q\,q_x = 0, \tag{5.13}$$
$$F_y + F_u\,q + F_p\,p_y + F_q\,q_y = 0. \tag{5.14}$$
Along a solution surface, using the first two of Eqs. (5.12), these conditions can be written as
$$F_x + F_u\,p + \frac{dx}{ds}\,p_x + \frac{dy}{ds}\,q_x = 0, \tag{5.15}$$
$$F_y + F_u\,q + \frac{dx}{ds}\,p_y + \frac{dy}{ds}\,q_y = 0. \tag{5.16}$$
But, since ‘mixed partials are equal’, we have that $p_y = q_x$, so that ultimately Eqs. (5.15) and (5.16) can be written as
$$\frac{dp}{ds} = -F_x - F_u\,p, \tag{5.17}$$
$$\frac{dq}{ds} = -F_y - F_u\,q. \tag{5.18}$$
Equations (5.12), (5.17) and (5.18) constitute a system of five first-order ODEs
satisfied by the characteristic curves contained in a given solution surface. Suppose
now, vice versa, that these five equations had been given a priori, without any knowledge of any particular solution surface. This system ‘happens to have’ the function
F as a first integral. What this means is that this function attains a constant value on
every integral curve of the given system of ODEs. Indeed, we check that
$$\frac{dF}{ds} = F_x\,\frac{dx}{ds} + F_y\,\frac{dy}{ds} + F_u\,\frac{du}{ds} + F_p\,\frac{dp}{ds} + F_q\,\frac{dq}{ds} = 0, \tag{5.19}$$
where we have used Eqs. (5.12), (5.17) and (5.18). If we now single out of all the
possible solutions of this system of ODEs those for which F = 0, we obtain a special
(three-parameter) sub-family of solutions called characteristic strips of the PDE. The
reason for this terminology is that each such solution can be seen as an ordinary curve,
x(s), y(s), u(s), each point of which carries a plane element, that is, the two slopes
p(s), q(s). The image to have in mind is that of a tapeworm. If a characteristic strip
has an element x, y, u, p, q in common with a solution surface, the whole strip must
belong to this surface.
5.4 Recapitulation
It may not be a bad idea to review the basic geometric ideas we have been working with. A point is a triple $(x_0, y_0, u_0)$. A plane element is a quintuple $(x_0, y_0, u_0, p, q)$. Thus, a plane element consists of a point and two slopes defining a plane passing through that point, namely,
$$u - u_0 = p\,(x - x_0) + q\,(y - y_0), \tag{5.20}$$
where we are using the notation (5.4). A strip is a one-parameter family of plane elements $(x(r), y(r), u(r), p(r), q(r))$ supported by the curve $(x(r), y(r), u(r))$ and satisfying the strip condition
$$p\,\frac{dx}{dr} + q\,\frac{dy}{dr} = \frac{du}{dr}. \tag{5.21}$$
Given a first-order PDE
$$F(x, y, u, p, q) = 0, \tag{5.22}$$
we form the following system of five ODEs, which we call the characteristic system associated with the given PDE:
$$\frac{dx}{ds} = F_p, \quad \frac{dy}{ds} = F_q, \quad \frac{du}{ds} = p\,F_p + q\,F_q, \quad \frac{dp}{ds} = -F_x - F_u\,p, \quad \frac{dq}{ds} = -F_y - F_u\,q. \tag{5.23}$$
Any solution (integral curve) of this system can obviously be viewed as a one-parameter family of plane elements supported by a curve. We claim that this one-parameter family is necessarily a strip, which will be called a characteristic strip. The
proof of this assertion follows directly from substitution of the first three equations
of the system (5.23) into the strip condition (5.21).
To pin down a characteristic strip (it being the solution of a system of ODEs),
we only need to specify any (initial) plane element belonging to it. We have shown
that the function F defining our PDE is a first integral of its characteristic system.
Therefore, if any one plane element of a characteristic strip satisfies the equation
F = 0, so will the whole characteristic strip to which it belongs. This result can be
interpreted as follows. If a plane element is constructed out of a point on an integral
surface of the PDE and of the tangent plane to this surface at that point, then the
strip that this element uniquely determines has a support curve that belongs to the
integral surface (that is, a characteristic curve) and the plane elements of this strip are
made up of the tangent planes to the integral surface at the corresponding points. Two
characteristic strips with F = 0 whose support curves have a common point with a
common tangent, must coincide. Therefore, two integral surfaces having a common
point and a common tangent plane thereat, must have a whole characteristic strip in
common (that is, they share the whole support curve and are tangential to each other
along this curve).
5.5 The Cauchy Problem
The Cauchy (or initial) problem for a nonlinear PDE is essentially the same as for its linear or quasi-linear counterpart. Given an initial curve in the plane,
$$x = \hat{x}(r), \qquad y = \hat{y}(r), \tag{5.24}$$
and prescribed values
$$u = \hat{u}(r), \tag{5.25}$$
the Cauchy problem deals with the possibility of finding a solution of the PDE over a
neighbourhood of the initial curve in such a way that it attains the prescribed values
over this curve. Equations (5.24) and (5.25) constitute the parametric equations of a
curve in three-dimensional space. The Cauchy problem can, therefore, be rephrased
as follows: to find an integral surface of the PDE containing this space curve.
In the case of linear and quasi-linear equations, the solution to this problem was
based on the construction of the one-parameter family of characteristics issuing from
the various points of this space curve. The situation in a genuinely non-linear first-order PDE is more delicate, since what we have at our disposal is not a collection
of characteristic curves, but rather of characteristic strips. The task of constructing
the solution must start therefore by extending in a unique way the initial data to
a (non-characteristic) strip, and only then solving the differential equations of the
characteristic strips to generate a one-parameter family. We will need to show how this
extension is accomplished and to prove that the one-parameter family of characteristic
support curves is indeed an integral surface. These tasks are somewhat more difficult
than in the case of the quasi-linear equation, but the fundamental idea of reducing the
Cauchy problem of a first-order PDE to the integration of a system of ODEs remains
the same.
The (non-characteristic) initial strip supported by the given curve will have a
parametric representation consisting of the equations of the supporting curve (5.24),
(5.25) and two additional equations
$$p = \hat{p}(r), \qquad q = \hat{q}(r), \tag{5.26}$$
providing the slopes of the tangent plane as functions of the running parameter r .
To determine these two functions, we have at our disposal two equations. The first
equation is the strip condition (5.21), guaranteeing that each plane element contains
the local tangent to the curve, namely,
$$\hat{p}\,\frac{d\hat{x}}{dr} + \hat{q}\,\frac{d\hat{y}}{dr} = \frac{d\hat{u}}{dr}. \tag{5.27}$$
The second equation at our disposal is the PDE itself (which we clearly want to
see satisfied on this initial strip), that is,
$$F(\hat{x}(r), \hat{y}(r), \hat{u}(r), \hat{p}(r), \hat{q}(r)) = 0. \tag{5.28}$$
We note that these two equations constitute, at each point, merely algebraic relations between the two unknown quantities $\hat{p}(r), \hat{q}(r)$. To be able to read off these unknowns at a given point in terms of the remaining variables, we need the corresponding Jacobian determinant
$$J = \begin{vmatrix} \dfrac{d\hat{x}}{dr} & \dfrac{d\hat{y}}{dr} \\[6pt] F_p & F_q \end{vmatrix} \tag{5.29}$$
not to vanish at that point. By continuity, there will then exist a neighbourhood of this
point with the same property. In this neighbourhood (which we will assume to be the
whole curve) we can obtain the desired result by algebraic means. Using each plane
element thus found as an initial condition for the system of characteristic ODEs, and
setting the parameter s of the characteristic strips thus obtained to 0 at the point of
departure, we obtain a one-parameter family of characteristic strips, namely,
$$x = x(r, s), \quad y = y(r, s), \quad u = u(r, s), \quad p = p(r, s), \quad q = q(r, s). \tag{5.30}$$
We claim that the first three equations in (5.30) constitute an integral surface of
the PDE. It is clear that on the surface represented parametrically by these three
equations, the PDE is satisfied as an algebraic relation between the five variables
x, y, u, p, q. What remains to be shown is that we can read off the parameters r and
s in terms of x and y from the first two equations and that, upon entering these values
into the third equation and calculating the partial derivatives u x and u y , we recover,
respectively, the values of p and q given by the last two equations in (5.30). We will
omit the proof of these facts.6
5.6 An Example
To illustrate all the steps involved in the solution of a non-linear first-order PDE
by the method of characteristic strips, we will presently solve a relatively simple
example.7 The problem consists of finding a solution of the PDE
$$u = u_x^2 - u_y^2, \tag{5.31}$$
subject to the initial condition
$$u(x, 0) = -\frac{x^2}{4}. \tag{5.32}$$
6 See [3].
7 This example is suggested as an exercise in [4], p. 66.
In parametric form, the initial data read
$$x = r, \qquad y = 0, \qquad u = -\frac{1}{4}\,r^2. \tag{5.33}$$
The strip condition (5.27) yields
$$1 \cdot p + 0 \cdot q = -\frac{1}{2}\,r. \tag{5.34}$$
Moreover, the PDE itself, i.e. Eq. (5.31), yields
$$-\frac{1}{4}\,r^2 = p^2 - q^2. \tag{5.35}$$
Solving the system of equations (5.34) and (5.35), we obtain
$$p = -\frac{1}{2}\,r, \qquad q = \pm\frac{\sqrt{2}}{2}\,r. \tag{5.36}$$
This completes the strip over the support curve (5.33). It is important to notice that,
due to the non-linearity of the PDE, we happen to obtain two different possibilities
for the initial strip, each of which will give rise to a different solution of the equation.
Our next task is to obtain and solve the characteristic system of ODEs. Writing
the PDE (5.31) in the form
$$F(x, y, u, p, q) = u - p^2 + q^2 = 0, \tag{5.37}$$
the characteristic system (5.23) becomes
$$\frac{dx}{ds} = -2p, \quad \frac{dy}{ds} = 2q, \quad \frac{du}{ds} = -2p^2 + 2q^2, \quad \frac{dp}{ds} = -p, \quad \frac{dq}{ds} = -q. \tag{5.38}$$
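Before integrating this system in closed form, one can verify numerically that F is a first integral along its solutions. The following Python sketch (a classical RK4 integrator with an arbitrarily chosen step size) advances the system from a plane element of the initial strip and checks that F = u − p² + q² stays at zero:

```python
# Integrate the characteristic system (5.38) with a classical RK4
# step and verify that F = u - p^2 + q^2 is a first integral, i.e.
# that it stays (numerically) at zero along the characteristic strip.
# Step size and integration span are chosen arbitrarily.

def deriv(state):
    x, y, u, p, q = state
    return (-2*p, 2*q, -2*p**2 + 2*q**2, -p, -q)

def rk4_step(state, h):
    def shift(s, k, c):
        return tuple(si + c*ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shift(state, k1, h/2))
    k3 = deriv(shift(state, k2, h/2))
    k4 = deriv(shift(state, k3, h))
    return tuple(s + h/6*(a + 2*b + 2*c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

r = 1.0   # initial plane element from Eqs. (5.33) and (5.36), '+' branch
state = (r, 0.0, -r**2/4, -r/2, (2**0.5)/2 * r)
for _ in range(1000):
    state = rk4_step(state, 0.002)
x, y, u, p, q = state
assert abs(u - p**2 + q**2) < 1e-8     # F remains zero
```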
This system is easily integrated to
$$x = 2Ae^{-s} + C, \tag{5.39}$$
$$y = -2Be^{-s} + D, \tag{5.40}$$
$$u = (A^2 - B^2)\,e^{-2s} + E, \tag{5.41}$$
$$p = Ae^{-s}, \tag{5.42}$$
$$q = Be^{-s}, \tag{5.43}$$
where $A, \ldots, E$ are constants of integration. Imposing the initial strip (5.33), (5.36) at $s = 0$ determines these constants as
$$A = -\frac{r}{2}, \qquad B = \pm\frac{\sqrt{2}}{2}\,r, \qquad C = 2r, \qquad D = \pm\sqrt{2}\,r, \qquad E = 0, \tag{5.44}$$
so that
$$x = r\,(2 - e^{-s}), \tag{5.45}$$
$$y = \pm\sqrt{2}\,r\,(1 - e^{-s}), \tag{5.46}$$
$$u = -\frac{1}{4}\,r^2 e^{-2s}. \tag{5.47}$$
We now solve Eqs. (5.45) and (5.46) for r and s to obtain
$$r = x \mp \frac{\sqrt{2}}{2}\,y, \tag{5.48}$$
$$e^{-s} = 2 - \frac{x}{x \mp \sqrt{2}\,y/2}. \tag{5.49}$$
Substituting these values in Eq. (5.47), we obtain the desired solution as the integral surface
$$u = -\frac{1}{4}\left(\pm\sqrt{2}\,y - x\right)^2. \tag{5.50}$$
Geometrically, the solution is either one of two horizontal oblique parabolic cylinders.
Either cylinder contains the initial data.
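A finite-difference check of this result costs only a few lines. The Python sketch below (taking the '+' branch; the sample points are arbitrary) verifies that (5.50) satisfies the PDE (5.31) and reproduces the initial data (5.33):

```python
# Finite-difference check that the integral surface (5.50), with the
# '+' branch, u = -(sqrt(2) y - x)^2 / 4, satisfies the PDE (5.31),
# u = u_x^2 - u_y^2, and contains the initial data (5.33).
import math

def u(x, y):
    return -0.25 * (math.sqrt(2)*y - x)**2

x0, y0, h = 0.7, -1.2, 1e-6        # arbitrary sample point
ux = (u(x0+h, y0) - u(x0-h, y0)) / (2*h)
uy = (u(x0, y0+h) - u(x0, y0-h)) / (2*h)
assert abs(u(x0, y0) - (ux**2 - uy**2)) < 1e-6
r = 0.9
assert abs(u(r, 0.0) - (-0.25 * r**2)) < 1e-12   # initial curve
```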
5.7 More Than Two Independent Variables

Our treatment of first-order linear and non-linear PDEs has been constrained so far to
the case of two independent variables. The main reason for this restriction has been to
enable the visualization of the solutions as surfaces in R3 and thus to foster geometric
reasoning. The generalization to an arbitrary number of independent variables is quite
straightforward. Consider the quasi-linear equation
$$a_1(x_1, \ldots, x_m, u)\,\frac{\partial u}{\partial x_1} + \cdots + a_m(x_1, \ldots, x_m, u)\,\frac{\partial u}{\partial x_m} = c(x_1, \ldots, x_m, u), \tag{5.51}$$
or, more compactly, using the summation convention in the range $1, \ldots, m$ and the notation introduced in (3.2),
$$a_i\,u_{,i} = c. \tag{5.52}$$
The characteristic vector field associated with this equation is defined as the vector
field in Rm+1 with components a1 , . . . , am , c. The integral curves of this field, namely,
the solutions of the system of characteristic ODEs
$$\frac{dx_i}{ds} = a_i \quad (i = 1, \ldots, m), \qquad \frac{du}{ds} = c, \tag{5.53}$$
are the characteristic curves of the PDE. As before, we can prove that if a characteristic curve has one point in common with a solution $u = u(x_1, \ldots, x_m)$, then the whole characteristic curve belongs to this solution. In geometric terms, the graph of a function $u = u(x_1, \ldots, x_m)$ is a hyper-surface of dimension m in $\mathbb{R}^{m+1}$. The Cauchy problem consists of finding a solution when initial data have been given on a hyper-surface $\Gamma$ of dimension $m - 1$. In parametric form, such a hyper-surface can be represented as
$$x_i = \hat{x}_i(r_1, \ldots, r_{m-1}), \qquad i = 1, \ldots, m. \tag{5.54}$$
As an example, consider the problem
$$u_t + u\,u_x + u^2 u_y = 0, \qquad u(x, y, 0) = x + y, \tag{5.55}$$
whose characteristic system (5.53) reads
$$\frac{dx}{ds} = u, \qquad \frac{dy}{ds} = u^2, \qquad \frac{dt}{ds} = 1, \qquad \frac{du}{ds} = 0. \tag{5.56}$$
This system is integrated to
$$x = Ds + A, \qquad y = D^2 s + B, \qquad t = s + C, \qquad u = D. \tag{5.57}$$
The initial manifold $t = 0$ with the given data reads, in parametric form,
$$x = r_1, \qquad y = r_2, \qquad t = 0, \qquad u = r_1 + r_2, \tag{5.58}$$
so that, setting $s = 0$ there, the constants of integration are
$$A = r_1, \qquad B = r_2, \qquad C = 0, \qquad D = r_1 + r_2. \tag{5.59}$$
The solution in parametric form is, therefore,
$$x = (r_1 + r_2)\,s + r_1, \qquad y = (r_1 + r_2)^2\,s + r_2, \qquad t = s, \qquad u = r_1 + r_2. \tag{5.60}$$
We need to express the parameters in terms of the original independent variables x, y, t. Adding the first two equations of (5.60) and enforcing the third, we obtain
$$t\,(r_1 + r_2)^2 + (t + 1)(r_1 + r_2) - (x + y) = 0, \tag{5.61}$$
whence
$$r_1 + r_2 = \frac{-(t+1) \pm \sqrt{(t+1)^2 + 4t\,(x+y)}}{2t}. \tag{5.62}$$
Invoking the fourth parametric equation we can write the final result as
−(t + 1) + (t + 1)2 + 4t (x + y)
u= . (5.63)
2t
The choice of the positive sign has to do with the imposition of the initial condition.
Because of the vanishing denominator at t = 0, we verify by L'Hôpital's rule that

lim_{t→0} [−(t + 1) + √((t + 1)² + 4t(x + y))] / (2t)
  = lim_{t→0} (1/2) [−1 + (2(t + 1) + 4(x + y)) / (2√((t + 1)² + 4t(x + y)))] = x + y.  (5.64)
Our solution (5.63) is not defined when the radicand is negative, namely, in the
subspace of R³ defined as

(t + 1)² + 4t(x + y) < 0.  (5.65)
It is not difficult to verify that at the boundary of this domain the Jacobian determinant
∂(x, y, t)/∂(r1 , r2 , t) vanishes. Moreover, the t-derivative of the solution at the initial
manifold is infinite.
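The solution (5.63) can be checked numerically; the following sketch (our illustration, with arbitrarily chosen evaluation points) verifies the initial condition in the limit t → 0 and the PDE itself by centered finite differences.

```python
import numpy as np

def u(x, y, t):
    """Solution (5.63); the positive root is selected by the initial data."""
    if t == 0.0:
        return x + y  # limit value (5.64)
    return (-(t + 1) + np.sqrt((t + 1)**2 + 4*t*(x + y)))/(2*t)

# The initial condition u(x, y, 0) = x + y is recovered in the limit t -> 0
assert abs(u(0.3, 0.4, 1e-8) - 0.7) < 1e-6

# Check the PDE u_t + u u_x + u^2 u_y = 0 by centered finite differences
x0, y0, t0, h = 0.3, 0.4, 0.5, 1e-6
ut = (u(x0, y0, t0 + h) - u(x0, y0, t0 - h))/(2*h)
ux = (u(x0 + h, y0, t0) - u(x0 - h, y0, t0))/(2*h)
uy = (u(x0, y0 + h, t0) - u(x0, y0 - h, t0))/(2*h)
v = u(x0, y0, t0)
assert abs(ut + v*ux + v**2*uy) < 1e-5
```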
The most general non-linear PDE of the first order has the form
F(x_1, …, x_m, u, p_1, …, p_m) = 0,  (5.66)
where the notation of Sect. 3.1 is used, namely, pi = u ,i . We define the characteristic
system of ODEs associated with the PDE (5.66) as
dx_i/ds = F_{p_i},   du/ds = p_i F_{p_i},   dp_i/ds = −F_{x_i} − F_u p_i,  (5.67)
where the summation convention in the range i = 1, . . . , m is understood. These
equations are the m-dimensional analogues of Eqs. (5.12), (5.17) and (5.18).
The function F itself is a first integral of the characteristic system. Indeed, always
using the summation convention, on every solution of the characteristic system we
obtain
dF/ds = F_{x_i} dx_i/ds + F_u du/ds + F_{p_i} dp_i/ds
      = F_{x_i} F_{p_i} + F_u p_i F_{p_i} + F_{p_i} (−F_{x_i} − F_u p_i) = 0.  (5.68)
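This conservation is easy to check numerically for a sample F (our choice, not an example from the text): integrating the system (5.67) and evaluating F along the resulting curve.

```python
import numpy as np
from scipy.integrate import solve_ivp

# An illustrative F (our choice): F = x1*p1 + p2**2 - u, in two variables
def F(x1, x2, u, p1, p2):
    return x1*p1 + p2**2 - u

def char_system(s, z):
    """Characteristic ODEs (5.67) for the sample F above:
    dx_i/ds = F_{p_i}, du/ds = p_i F_{p_i}, dp_i/ds = -F_{x_i} - F_u p_i."""
    x1, x2, u, p1, p2 = z
    Fp1, Fp2 = x1, 2*p2           # partial derivatives of F
    Fx1, Fx2, Fu = p1, 0.0, -1.0
    return [Fp1, Fp2, p1*Fp1 + p2*Fp2, -Fx1 - Fu*p1, -Fx2 - Fu*p2]

z0 = [1.0, 0.0, 0.5, 0.2, 0.3]
sol = solve_ivp(char_system, (0.0, 2.0), z0, rtol=1e-10, atol=1e-12)
# F is a first integral: it keeps its initial value along the characteristic
assert abs(F(*sol.y[:, -1]) - F(*z0)) < 1e-6
```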
on which we specify u = û(r_1, …, r_{m−1}). By analogy with the two-dimensional case,
we extend the initial data as some functions p_i = p̂_i(r_1, …, r_{m−1}). To this end, we
impose the strip conditions
p̂_i ∂x̂_i/∂r_k = ∂û/∂r_k,   k = 1, …, m − 1,  (5.70)
and the PDE itself evaluated at the initial manifold, that is,

F(x̂_1, …, x̂_m, û, p̂_1, …, p̂_m) = 0.  (5.71)
Equations (5.70) and (5.71) constitute an algebraic system of m equations that can,
in principle, be solved for the values of p1 , . . . , pm for each point r1 , . . . , rm−1 on
the initial manifold. By the inverse function theorem, this is possible if the Jacobian
determinant

        | ∂x̂_1/∂r_1      …   ∂x̂_m/∂r_1      |
        |      ⋮               ⋮              |
J = det | ∂x̂_1/∂r_{m−1}  …   ∂x̂_m/∂r_{m−1}  |  (5.72)
        | F_{p_1}         …   F_{p_m}         |
does not vanish over the initial manifold. If this condition is satisfied, we can build
a solution of the PDE that contains the initial data by constructing the (m − 1)-
parameter family of characteristic strips issuing from each point of the initial man-
ifold. This is achieved by means of the by now familiar procedure of setting s = 0
in the general expression for the characteristic strips and equating with the corre-
sponding values at each point of the initial manifold (extended as explained above).
In this way, the ‘constants’ of integration are obtained in terms of the parameters
r1 , . . . , rm−1 . The solution is thus obtained in the parametric form
dq_i/dt = ∂H/∂p_i,   dp_i/dt = −∂H/∂q_i,   i = 1, …, n.  (5.74)
Note that these are ODEs. The partial derivatives on the right hand side are known
functions of q, p, t obtained by differentiating the Hamiltonian function. A solution
of this system constitutes a trajectory. A trajectory can be regarded as a curve in R2n ,
the space with coordinates q1 , . . . , qn , p1 , . . . , pn known as the phase space of the
system.
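For instance (a standard illustration, not an example from the text), the unit-mass harmonic oscillator with H(q, p) = (q² + p²)/2 produces circular trajectories in the phase plane, along which H is conserved:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hamilton's equations (5.74) for the unit-mass harmonic oscillator,
# H(q, p) = (q**2 + p**2)/2
def hamilton(t, z):
    q, p = z
    return [p, -q]    # dq/dt = dH/dp,  dp/dt = -dH/dq

sol = solve_ivp(hamilton, (0.0, 10.0), [1.0, 0.0], rtol=1e-10, atol=1e-12)
H = 0.5*(sol.y[0]**2 + sol.y[1]**2)
# The phase-space trajectory stays on the level curve H = const (a circle)
assert np.max(np.abs(H - H[0])) < 1e-6
```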
Although Hamilton’s equations were originally formulated by starting from
Lagrangian Mechanics and effecting a certain Legendre transformation relative to
the generalized velocities, Hamiltonian systems can arise independently in Mechan-
ics and in other branches of Physics such as Optics, General Relativity and Quantum
Mechanics.
The peculiarity of this PDE is that the unknown function u does not appear explicitly
in the function F. The characteristic strips for this equation are given, according to
(5.67), by
dx_i/ds = F_{p_i},   dp_i/ds = −F_{x_i},   du/ds = p_i F_{p_i},   i = 1, …, n,  (5.76)
with the summation convention implied in the range 1, . . . , n.
If we compare the first two expressions of (5.76) with their counterparts in (5.67)
we realize that, except for some minor differences in notation, they are identical.
An important detail is that, since F does not contain u, the first two expressions are
independent of the third. In other words, the characteristic system can be integrated
first by solving the 2n equations involving just xi and pi , and only later solving the
evolution equation for u. Notice also that, although in the PDE the symbols pi stand
for the partial derivatives u ,i , this fact is irrelevant as far as the characteristic equations
are concerned.
5.8 Application to Hamiltonian Systems 107
In conclusion, a first-order PDE in reduced form gives rise, via its characteristic
equations, to a Hamiltonian system!
Instead of looking for a solution in the form u = u(x_1, …, x_n), let us look for
a solution in the implicit form w(x_1, …, x_n, u) = 0, and let us investigate what
PDE the function w of n + 1 independent variables satisfies. Since

∂w/∂x_1 dx_1 + ··· + ∂w/∂x_n dx_n + ∂w/∂u du = 0,

we conclude that

p_i = ∂u/∂x_i = − (∂w/∂x_i) / (∂w/∂u).
5.8.4 An Example
u = f (x, y, α, β)
8 See Boxes 5.3 and 5.4. For a thorough understanding of these topics within the mathematical
context, see [1], p. 59, [2], p. 33, and [3], p. 29. For many interesting and challenging problems on
the general integral, [4] is highly recommended.
that satisfies the PDE for arbitrary values of the two parameters α, β. Since the
parameters are arbitrary and independent, we may decide to impose a restriction
to the two-parameter family by choosing a specific functional dependence
β = β(α). We have at our disposal an arbitrary function of a single variable
to control the solution. More specifically, referring to Box 5.1, we can obtain
the envelope of the new one-parameter family, and eliminate the parameter α,
by choosing a specific function β and solving the algebraic system
u = f(x, y, α, β(α)),   f_α(x, y, α, β(α)) + f_β(x, y, α, β(α)) dβ(α)/dα = 0.
But, since, at each point, the envelope of a family coincides with one of the
solutions and has the same tangent plane, we conclude that this envelope is
itself a solution! We have thus obtained a solution depending on an arbitrary
function β, that is, a general integral.
Finally, a singular integral can sometimes be found that is not comprised
within the solutions delivered by the general integral. This singular solution is
obtained as the envelope of the whole two-parameter family u = f (x, y, α, β).
It can be regarded as an envelope of envelopes. It is delivered by the system of
algebraic equations

u = f(x, y, α, β),   f_α(x, y, α, β) = 0,   f_β(x, y, α, β) = 0.
In order to read off α and β from the last two equations, according to the
inverse function theorem, the Jacobian determinant ∂( f α , f β )/∂(α, β) (which
happens to be the Hessian determinant) must not vanish.
For the sake of simplicity, we have only dealt with the case of two indepen-
dent variables, but a similar treatment can be justified for higher dimensions.
Since complete integrals are in general difficult to obtain, let us deal with a simple
example from Hamiltonian Mechanics, namely, the classical ballistic problem: A
particle of mass m moving under the action of constant gravity g in the x, y plane,
where y is the upright vertical direction. The Hamiltonian function in this case is the
total energy, expressed in terms of coordinates x, y and momenta p, q as
H(x, y, p, q) = (1/2m)(p² + q²) + mg y.  (5.78)
The Hamilton–Jacobi equation is, therefore,
(1/2m)[(∂S/∂x)² + (∂S/∂y)²] + mg y + ∂S/∂t = 0.  (5.79)
110 5 The Genuinely Nonlinear First-Order Equation
Since the only way that functions of different variables may be equated to each other
is if they are constant, we obtain the three conditions
(1/2m)(df_1/dx)² = A,   (1/2m)(df_2/dy)² + mg y = B,   df_3/dt = −(A + B),  (5.82)
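These separation conditions can be checked symbolically. In the sketch below, the closed form of f_2 is our own integration of f_2′ = √(2m(B − mgy)) (it is not quoted from the text), and the resulting complete integral is verified against the Hamilton–Jacobi equation (5.79):

```python
import sympy as sp

x, y, t, A, B, m, g = sp.symbols('x y t A B m g', positive=True)

# Complete integral S = f1(x) + f2(y) + f3(t) built from conditions (5.82);
# the explicit antiderivative for f2 is our integration, not from the text.
f1 = sp.sqrt(2*m*A)*x
f2 = -(2*m*(B - m*g*y))**sp.Rational(3, 2)/(3*m**2*g)
f3 = -(A + B)*t
S = f1 + f2 + f3

# Verify the Hamilton-Jacobi equation (5.79)
hj = (sp.diff(S, x)**2 + sp.diff(S, y)**2)/(2*m) + m*g*y + sp.diff(S, t)
assert sp.simplify(hj) == 0
```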
f_α(x, y, α) = γ,
where γ is a constant. We claim that, for any values of the three parameters
α, β, γ, the four equations
define a characteristic strip of the PDE. We start by noticing that each of these
equations eliminates, so to speak, a degree of freedom in the 5-dimensional
space of coordinates x, y, u, p, q, so that 4 independent equations, in general,
determine a curve in this space. We notice, moreover, that this line will lie
on the 4-dimensional sub-manifold F = 0. This conclusion follows from the
fact that the function f (x, y, α) is a solution of the PDE for arbitrary values
of α. Thirdly, we verify that we have a three-parameter family of such (one-
dimensional) curves sweeping the sub-manifold F = 0. Accordingly, to pin
down one of these curves we need to adjust the parameters α, β, γ to satisfy
any given initial conditions x0 , y0 , u 0 , p0 , q0 . Clearly, q0 is not independent
of p0 , since the condition F = 0 must be satisfied by the initial conditions
too. Finally, we verify that, for any fixed values of α, β, γ, the curve satisfies
the strip condition (5.21). This is an immediate consequence of the fact that
du = f_x dx + f_y dy = p dx + q dy.
All that is left is to ascertain that these strips are characteristic. Following
[2], consider the differential of our main condition f_α(x, y, α) = γ, that is,

f_{αx} dx + f_{αy} dy = 0.
This equation provides a specific ratio between d x and dy provided that the
two partial derivatives do not vanish simultaneously. Substituting our complete
integral into the PDE and differentiating with respect to α we obtain
F_p f_{xα} + F_q f_{yα} = 0.
dy/dx = F_q / F_p.
This result is equivalent to the first two conditions for a characteristic strip, as
given in Eq. (5.23). The condition du = pd x + qdy, which we have already
derived, implies that the third condition in (5.23) is fulfilled. The final two conditions
are obtained by comparing the differentials of p and q with the (vanishing)
total derivatives of F with respect to x and y, respectively. For n independent
variables, the parameter α is replaced by n − 1 parameters αi and the charac-
teristics are obtained by equating the derivatives of the complete integral with
respect to each of these parameters to a constant.
Exercises
Exercise 5.1 ([4], p. 66.) Find the characteristics of the equation u_x u_y = u and
determine the integral surface that passes through the parabola x = 0, y² = u.
Exercise 5.2 Modify the code of Box 3.3 to handle a general nonlinear first-order
PDE in two independent variables.
Exercise 5.3 Solve the PDE of Example 5.1 in parametric form with the initial
condition u(x, y, 0) = 1/(1 + (x + y)2 ). Plot the various profiles (as surfaces in R3 )
of the solution for several instants of time. Notice the formation of multiple-valued
profiles, indicating the emergence of shock waves.
Exercise 5.4 Show that

∂(x_1, …, x_m) / ∂(s, r_1, …, r_{m−1}) = J,

where J is the Jacobian determinant defined in Eq. (5.72).
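In the spirit of Exercise 5.2, a generic characteristic-strip integrator might be organized as follows (Box 3.3 is not reproduced in this chunk, so the structure below is our sketch, with illustrative names); it is applied here to the PDE of Exercise 5.1, u_x u_y = u, i.e. F = pq − u:

```python
import numpy as np
from scipy.integrate import solve_ivp

# A generic strip integrator: the user supplies the five partial
# derivatives of F(x, y, u, p, q), each as a function of (x, y, u, p, q).
def make_strip_rhs(Fx, Fy, Fu, Fp, Fq):
    def rhs(s, z):
        x, y, u, p, q = z
        return [Fp(*z), Fq(*z),
                p*Fp(*z) + q*Fq(*z),      # du/ds = p F_p + q F_q
                -Fx(*z) - p*Fu(*z),       # dp/ds = -F_x - p F_u
                -Fy(*z) - q*Fu(*z)]       # dq/ds = -F_y - q F_u
    return rhs

# Exercise 5.1: u_x u_y = u, that is, F = p*q - u
rhs = make_strip_rhs(Fx=lambda x, y, u, p, q: 0.0,
                     Fy=lambda x, y, u, p, q: 0.0,
                     Fu=lambda x, y, u, p, q: -1.0,
                     Fp=lambda x, y, u, p, q: q,
                     Fq=lambda x, y, u, p, q: p)

p0, q0 = 0.5, 2.0
z0 = [0.0, 1.0, p0*q0, p0, q0]      # initial data chosen so that F = 0
sol = solve_ivp(rhs, (0.0, 1.0), z0, rtol=1e-10, atol=1e-12)
x, y, u, p, q = sol.y[:, -1]
assert abs(p*q - u) < 1e-7          # F = 0 is preserved along the strip
```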
References
1. Duff GFD (1956) Partial differential equations. Toronto University Press, Toronto
2. Garabedian PR (1964) Partial differential equations. Wiley, London
3. John F (1982) Partial differential equations. Springer, Berlin
4. Sneddon IN (1957) Elements of partial differential equations. McGraw-Hill, New York (Republished by Dover (2006))
Part III
Classification of Equations
and Systems
Chapter 6
The Second-Order Quasi-linear Equation
A careful analysis of the single quasi-linear second-order equation is the gateway into
the world of higher-order partial differential equations and systems. One of the most
important aspects of this analysis is the distinction between hyperbolic, parabolic
and elliptic types. From the physical standpoint, the hyperbolic type corresponds to
physical systems that can transmit sharp signals over finite distances. The parabolic
type represents diffusive phenomena. The elliptic type is often associated with static
situations, where time is absent. Of these three types, the hyperbolic case turns out to
resemble the single first-order PDE the most. In particular, characteristic lines make
their appearance and play a role in the understanding of the propagation phenomena
and in the prediction of the speed, trajectory and variation in amplitude of the signals,
without having to solve the differential equation itself.
6.1 Introduction
a u_xx + 2b u_xy + c u_yy = d.  (6.1)

The amount of initial information needed to determine a solution depends on the order
and on the type of equation. For example, in the case of the dynamics of a system
of n particles moving in space, the equations of motion are 3n second-order (or,
equivalently, 6n first-order) ordinary differential equations. If, at some initial time,
we prescribe the instantaneous positions and velocities of each particle, a total of 6n
numbers, the differential equations allow us to calculate the accelerations, and thus
to come out of the original state and predict the evolution of the system. Notice that,
by successive differentiations of the equations of motion, if all the functions involved
are analytic (i.e., if they admit a convergent Taylor expansion), we can obtain any
number of derivatives at the initial time. In this way, at least for the analytic case, we
can extend the solution to a finite interval of time.1
In the case of a single first-order PDE we have seen that the Cauchy problem
consists of specifying the values of the unknown function on an initial curve. In
essence, unless the curve happens to be a characteristic of the PDE, the differential
equation allows us to come out of this curve using the information provided by
the differential equation concerning the first derivatives. Again we observe that the
information is given on a set with one dimension less than the space of independent
variables and it involves knowledge of the function and its derivatives up to and
including one degree less than the order of the PDE. In the case of a first-order
equation, one degree less means no derivatives at all. It is natural, therefore, to
expect that the Cauchy problem for a second-order PDE in two dimensions will
involve specifying, on a given curve, the unknown function and its first derivatives.
We expect then the second-order PDE to provide enough information about the
second (and higher) derivatives so that we can come out of the initial curve. In fact,
the classical theorem of Cauchy–Kowalewski2 proves that if the coefficients in the
PDE are analytic, this procedure can be formalized to demonstrate the existence
and uniqueness of the (analytic) solution in some neighbourhood of a point on the
initial manifold. That said, we don’t want to convey the impression that this initial-
value problem is prevalent in the treatment and application of all possible equations.
We will see later that boundary value problems and mixed initial-boundary-value
problems are prevalent in applications. But, from the conceptual point of view, the
understanding of the behaviour of the equation and its solution in the vicinity of an
initial manifold with known initial data is of paramount importance. In particular, we
will presently see how it can be used to classify the possible second-order quasi-linear
equations into three definite types.3
1 The theorem of existence and uniqueness, on the other hand, requires much less than analyticity.
2 For a detailed proof, see the classical treatise [1].
3 In some textbooks, the classification is based on the so-called normal forms of the equations. In
order to appreciate the meaning of these forms, however, it is necessary to have already seen an
example of each. We prefer to classify the equations in terms of their different behaviour vis-à-vis
the Cauchy problem. Our treatment is based on [4], whose clarity and conciseness are difficult to
match.
6.2 The First-Order PDE Revisited 117
in the space of independent variables on which the value of the solution is specified
as
u = û(r ), (6.4)
the Cauchy problem consists of finding an integral surface that contains the space
curve represented by Eqs. (6.3) and (6.4). Let us assume that we know nothing at all
about characteristics and their role in building integral surfaces. We ask the question:
Does the differential equation (6.2) provide us with enough information about the
first derivatives of the solution being sought so that we can come out, as it were,
of the initial curve? Intuitively, the answer will be positive if, and only if, the PDE
contains information about the directional derivative in a direction transversal to the
original curve in the plane. We may ask why, if we need the whole gradient, we only
demand the derivative in one transversal direction. The answer is: because the initial
data, as given by Eq. (6.4), already give us the necessary information in the direction
of the curve itself. Indeed, assuming that the function û is differentiable, we obtain
by the chain rule of differentiation
dû/dr = u_x dx̂/dr + u_y dŷ/dr,  (6.5)
where u x and u y are the partial derivatives of any putative solution of the PDE. This
equation must hold true along the whole initial curve, if indeed we want our solution
to satisfy the given initial conditions. Always moving along the initial curve, we see
that the determination of the two partial derivatives u x , u y at any given point along
this curve is a purely algebraic problem, consisting in solving (point-wise) the linear
system of equations
[ a    b  ] {u_x}   { c  }
[ x̂′   ŷ′ ] {u_y} = { û′ },  (6.6)
where we indicate by a prime the derivative with respect to the curve parameter.
The first equation of this linear system is provided by the PDE and the second
equation of the system is provided by the initial conditions via Eq. (6.5). No more
information is available. This linear system will have a unique solution if, and only
if, the determinant of the coefficient matrix does not vanish. If this is the case, we
118 6 The Second-Order Quasi-linear Equation
obtain in a unique fashion the whole gradient of the solution and we can proceed to
extend the initial data to a solution in the nearby region (for example, by a method
of finite differences). Otherwise, namely if the determinant vanishes, there are two
possibilities:
1. The rank of the augmented matrix

   [ a    b    c  ]
   [ x̂′   ŷ′   û′ ]  (6.7)

   is equal to 2, in which case the system is incompatible and there is no solution.
2. The rank of the augmented matrix is equal to 1, in which case the system is
   compatible, but the solution is not unique. If the determinant vanishes identically
   along the initial curve, the curve is a characteristic of the PDE for the given data.
Inspired by the analysis of the first-order case, we formulate the Cauchy problem
for the second-order quasi-linear PDE (6.1) as follows: Given a curve by Eq. (6.3),
and given, along this curve, the (initial) values of the unknown function and its first
partial derivatives
u = û(r),  (6.8)

u_x = û_1(r),  (6.9)

u_y = û_2(r),  (6.10)
where û(r ), û 1 (r ) and û 2 (r ) are differentiable functions of the single variable r , find
a solution of Eq. (6.1) compatible with these Cauchy data.
Before investigating whether or not this problem has a solution, it is worth remark-
ing that the functions û 1 (r ) and û 2 (r ) cannot be specified arbitrarily. Indeed, they
must be compatible with the derivative of the function û(r ), as we have seen above
in Eq. (6.5). Specifically,
û_1 dx̂/dr + û_2 dŷ/dr = dû/dr.  (6.11)
An equivalent way to prescribe the data needed for the Cauchy problem, and to avoid
a contradiction, is to stipulate the value of the unknown function u on the initial curve
and the value of the first derivative du/dn in any direction n transversal to the curve.
6.3 The Second-Order Case 119
By analogy with the first-order case, we want to investigate whether or not the
differential equation provides us with enough information about the second deriv-
atives so that we can come out of the initial curve with the given initial data. For
this to be possible, we need to be able to calculate at each point of the initial curve
the values of all three second partial derivatives. We start by remarking that the
Cauchy data already provide us with some information about the second derivatives
of any proposed solution, just like in the first-order case. Indeed, by the chain rule
of differentiation, we know that at any point of the initial curve
û_1′ = u_xx x̂′ + u_xy ŷ′  (6.12)

and

û_2′ = u_yx x̂′ + u_yy ŷ′.  (6.13)
Thus, the determination of the three second partial derivatives of the solution along
the initial curve is, at each point, a purely algebraic problem defined by the system
of linear equations
[ a    2b   c  ] {u_xx}   {  d   }
[ x̂′    ŷ′   0  ] {u_xy} = { û_1′ },  (6.14)
[ 0     x̂′   ŷ′ ] {u_yy}   { û_2′ }

whose coefficient matrix has the determinant

Δ = a ŷ′² − 2b x̂′ ŷ′ + c x̂′².  (6.15)
If this determinant does not vanish at any point along the initial curve (for the given
Cauchy data), there exists a point-wise unique solution for all three second partial
derivatives and, therefore, it is possible to come out of the initial curve by means of the
information gathered from the initial data and the differential equation. Otherwise,
that is, when the determinant vanishes, if the rank of the augmented matrix is equal to
3, the system is incompatible and there are no solutions. If this rank is less than 3, the
system is compatible, but the solution is not unique. In this case, if the determinant
vanishes identically along the initial curve, the curve is called a characteristic of
the differential equation for the given Cauchy data. Notice that in the second-order
case we will be representing only the projected curves on the x, y plane, since the
visualization of the entire Cauchy problem as a curve would involve a space of 5
dimensions. If the problem happens to be linear, the projected characteristics are
independent of the initial data.
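The point-wise algebra can be illustrated numerically. The data below are our illustration, not from the text: Laplace's equation (a = 1, b = 0, c = 1, d = 0) with the initial curve y = 0 parametrized by x, and Cauchy data borrowed from the harmonic function u = sin(x) cosh(y).

```python
import numpy as np

# Solving system (6.14) point-wise for the three second derivatives.
# Cauchy data on y = 0: u = sin(x), u_x = cos(x), u_y = 0.
a, b, c, d = 1.0, 0.0, 1.0, 0.0
xp, yp = 1.0, 0.0                      # x', y' along the initial curve
r = 0.3                                # a point on the curve
u1p = -np.sin(r)                       # (u_x)' along the curve
u2p = 0.0                              # (u_y)' along the curve

M = np.array([[a, 2*b, c],
              [xp, yp, 0.0],
              [0.0, xp, yp]])
Delta = a*yp**2 - 2*b*xp*yp + c*xp**2  # determinant (6.15)
assert abs(np.linalg.det(M) - Delta) < 1e-9

uxx, uxy, uyy = np.linalg.solve(M, np.array([d, u1p, u2p]))
# Compare with the exact second derivatives of u = sin(x) cosh(y) at y = 0
assert abs(uxx + np.sin(r)) < 1e-12
assert abs(uxy) < 1e-12
assert abs(uyy - np.sin(r)) < 1e-12
```

Since Δ = 1 ≠ 0 here, the three second derivatives are determined uniquely, as the text asserts.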
According to what we have just learned, in the linear case the characteristic curves
can be seen as solutions of the ODE
a ŷ′² − 2b x̂′ ŷ′ + c x̂′² = 0,  (6.16)
If, on the other hand, the PDE is quasi-linear, the coefficients of this equation may
be functions of u and its two (first) partial derivatives. What this means is that, in the
legitimate quasi-linear case, we are dealing with characteristics that depend on the
initial data, just as in the first-order case. At any rate, Eq. (6.17) reduces to
dy/dx = [b ± √(b² − ac)] / a.  (6.18)
We have assumed that a ≠ 0, for the sake of the argument. We see that in the case of
second order equations, in contradistinction with the first-order case, there may be
no characteristics at all! This occurs when the discriminant of the quadratic equation
happens to be negative, namely when
b² − ac < 0.  (6.19)
If this is the case, the equation is called elliptic at the point in question and for
the given initial data. At the other extreme, we have the case in which two distinct
characteristics exist. This happens when the discriminant is positive, that is,
b² − ac > 0.  (6.20)
In this case, the equation is called hyperbolic at the point in question and for the
given initial data. The intermediate case, when
b² − ac = 0,  (6.21)
is called parabolic. In this case, we have just one characteristic direction.4 The reason
for this terminology, as you may have guessed, is that quadratic forms in two variables
give rise to ellipses, hyperbolas or parabolas precisely according to the above criteria.
If the original PDE is not only linear but also with constant coefficients, then the type
(elliptic, hyperbolic or parabolic) is independent of position and of the solution. If
the equation is linear, but with variable coefficients, the type is independent of the
solution, but it may still vary from point to point. For the truly quasi-linear case, the
type may depend both on the position and on the solution. In light of the second-order
case, we can perhaps say that the single first-order PDE is automatically hyperbolic.
4 Although we have assumed that the first coefficient of the PDE does not vanish, in fact the con-
clusion that the number of characteristic directions is governed by the discriminant of the quadratic
equation is valid for any values of the coefficients, provided, of course, that not all three vanish
simultaneously.
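In code, the point-wise classification reads (a minimal sketch):

```python
def classify(a, b, c):
    """Type of a u_xx + 2b u_xy + c u_yy = d at a point, read off from the
    discriminant b**2 - a*c of the characteristic equation (6.18)."""
    disc = b*b - a*c
    if disc > 0:
        return 'hyperbolic'   # two distinct characteristic directions
    if disc < 0:
        return 'elliptic'     # no (real) characteristics
    return 'parabolic'        # one characteristic direction

assert classify(1.0, 0.0, -1.0) == 'hyperbolic'  # e.g. u_xx - u_yy = 0
assert classify(1.0, 0.0, 1.0) == 'elliptic'     # e.g. Laplace's equation
assert classify(1.0, 0.0, 0.0) == 'parabolic'    # e.g. u_xx = u_y (heat-like)
```

For variable coefficients, or in the genuinely quasi-linear case, the same test would be applied point by point along a given solution.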
Although this is not a precise statement, it is indeed the case that the treatment of
hyperbolic second-order PDEs is, among all three types, the one that most resembles
the first-order counterpart in terms of such important physical notions as the ability
to propagate discontinuities.
ψ : D → R  (6.22)

x_i = x_i(s)   (i = 1, …, n)  (6.23)

dψ̄/ds = Σ_{i=1}^{n} ψ̄,i dx_i/ds.  (6.24)
This result may not look spectacular, since it is what we would have done anyway,
without asking for Hadamard’s permission, but it does allow us to use the chain
rule even when the function is not defined over an open (tubular) neighbourhood
containing the curve. Why is this important at this point of our treatment? The
reason is as follows. Let us assume that we have a domain D ⊂ Rn subdivided
into two sub-domains, D+ and D− , as shown in Fig. 6.1, whose boundaries share a
common smooth part Λ, a manifold of dimension n − 1, also called a hyper-surface
or just a surface.
5 The treatment in this section draws from [5], pp. 491–529. It should be pointed out that Hadamard
proved several theorems and lemmas that carry his name. The lemma used in this section is a rather
elementary result in Calculus. Its proof was originally given by Hadamard in [3], p. 84.
[Fig. 6.1: the domain D divided by a singular surface Λ into sub-domains D− and D+]
Assume that the function ψ and each of its derivatives ψ,i are continuous in the
interior of each of the respective sub-domains, but that they attain possibly different
limits, (ψ⁻, ψ,i⁻) and (ψ⁺, ψ,i⁺), as we approach Λ, according to whether we come
from paths within D− or D+ , respectively. In other words, the given function and/or
its derivatives undergo a jump upon crossing Λ. We refer to Λ as a singular surface or
a surface of discontinuity. We will use the following convenient short-hand notation
to denote the jump of a quantity such as ψ across Λ:
⟦ψ⟧ = ψ⁺ − ψ⁻.  (6.25)
According to Hadamard’s lemma, we are in a position to apply the chain rule (6.24)
independently at each of the sub-domains to calculate the derivative of ψ along a
smooth curve lying on Λ, namely,
dψ⁺/ds = Σ_{i=1}^{n} ψ,i⁺ dx_i/ds   and   dψ⁻/ds = Σ_{i=1}^{n} ψ,i⁻ dx_i/ds.  (6.26)
Subtracting the second equation from the first and using the notation (6.25) we obtain
d⟦ψ⟧/ds = Σ_{i=1}^{n} ⟦ψ,i⟧ dx_i/ds.  (6.27)
In other words, the derivative of the jump of a function in a direction tangential to the
surface of discontinuity is given by the jump of the derivative in the same direction.
Thus, the jumps of a function and of its partial derivatives cannot be entirely arbitrary,
but must be related by Eq. (6.27). It is interesting to note that in the case for which
the function ψ happens to be continuous across Λ, the jumps of its derivatives must
satisfy the condition
Σ_{i=1}^{n} ⟦ψ,i⟧ dx_i/ds = 0.  (6.28)
6.4 Propagation of Weak Singularities 123
This condition can be stated as follows: If ψ is continuous across Λ, the jump of its
gradient is orthogonal to Λ.6
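As a minimal numerical illustration (the function and the surface are hypothetical): take ψ = x − t for x > t and ψ = 0 for x ≤ t in the (x, t) plane, so that ψ is continuous across the line Λ: x = t while its gradient jumps by (1, −1):

```python
import numpy as np

# A continuous function with a gradient jump across Λ: x = t
jump_grad = np.array([1.0, -1.0])   # ([[psi_x]], [[psi_t]]) across Λ
tangent = np.array([1.0, 1.0])      # a tangent direction to Λ
normal = np.array([1.0, -1.0])      # a normal direction to Λ

# Condition (6.28): the tangential combination of the jumps vanishes
assert abs(jump_grad @ tangent) < 1e-12
# Equivalently, the jump of the gradient is aligned with the normal
assert abs(np.cross(jump_grad, normal)) < 1e-12
```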
Returning to the general case, if we were to choose a local coordinate system with
all but one of its natural base vectors lying on Λ, the derivative in the direction of the
transverse coordinate would not be at all involved in condition (6.27), as one would
intuitively expect. Equation (6.27) embodies the so-called geometric compatibility
conditions. This terminology is meant to emphasize the fact that these conditions
emerge from a purely geometric analysis of the situation, without any reference to
equations of balance that may arise from the physical formulation of a particular
problem.
When one of the independent variables of the problem is identified with time and
the remaining ones with space, a singular surface can be regarded as the propagation
of a wave front. In the case of just two independent variables, the wave front consists
of a single point. The slope of the singular curve can, in this case, be seen as the
speed of propagation of the front. This idea can be generalized for the case of more
than two independent variables.
Given a PDE of order n, a singular surface is said to be weak if only the n-th
(or higher) derivatives are discontinuous across it, while the lower derivatives are
all continuous. For a second-order equation, for example, a weak singular surface
may only carry discontinuities of the derivatives of order 2 and higher. In Continuum
Mechanics applications, where the relevant equations are indeed of second order,
these singularities are known as acceleration waves. If the first derivatives are dis-
continuous, we are in the presence of strong singularities, or shocks. If the function
itself (the displacement, say) is discontinuous, we are in the presence of a dislocation.
This terminology may not apply in other contexts.
Consider the general quasi-linear second-order PDE (6.1). Taking the jump of
this equation across a weak singular curve with parametric equations

x = x̃(s),   y = ỹ(s),  (6.29)

we obtain

a ⟦u_xx⟧ + 2b ⟦u_xy⟧ + c ⟦u_yy⟧ = 0.  (6.30)
6 In using this terminology, we are implicitly assuming that we have defined the natural dot product
in R^n. A more delicate treatment would consider the gradient not as a vector but as a differential
form which would then be annihilated by vectors forming a basis on the singular surface. We have
already discussed a similar situation in Box 3.2.
We have assumed that the coefficients of the PDE (for example, the moduli of elas-
ticity) are smooth functions throughout the domain of interest. Since we are dealing
with a weak singularity, all the first derivatives are continuous across the singular
curve, implying that Eq. (6.28) may be applied, identifying the generic function ψ
successively with each of the two first-derivatives. As a result, we obtain the two
conditions
⟦u_xx⟧ dx̃/ds + ⟦u_xy⟧ dỹ/ds = 0,  (6.31)

⟦u_yx⟧ dx̃/ds + ⟦u_yy⟧ dỹ/ds = 0.  (6.32)
Equations (6.30), (6.31) and (6.32) can be written in matrix form as
[ a    2b   c  ] {⟦u_xx⟧}   { 0 }
[ x̃′    ỹ′   0  ] {⟦u_xy⟧} = { 0 },  (6.33)
[ 0     x̃′   ỹ′ ] {⟦u_yy⟧}   { 0 }
where primes indicate derivatives with respect to the curve parameter. For this homo-
geneous system of linear equations to have a non-trivial solution, its determinant must
vanish. In this way, we recover condition (6.16), implying that weak discontinuities
can only exist on characteristic curves! If one of the independent variables is time,
this conclusion can be expressed as the fact that weak signals propagate along char-
acteristic curves, and their (local) speed of propagation is measured by the slope
of these curves. Notice that, in particular, elliptic equations cannot sustain weak
discontinuities, since they have no characteristic curves.
Note that Eqs. (6.31) and (6.32) imply that the jumps of the second derivatives are
all interrelated. Denote, for example,
⟦u_xx⟧ = B.  (6.34)

Then

⟦u_xy⟧ = −B (dx̃/dỹ)  (6.35)

and

⟦u_yy⟧ = B (dx̃/dỹ)².  (6.36)
Suppose that a function ψ = ψ(x_1, …, x_n) and all its first partial derivatives
are continuous across Λ. By Eq. (6.28), at any given point on Λ

⟦ψ,ij⟧ = ⟦(ψ,i),j⟧ = a_i n_j ,

where n is the unit normal to Λ. By the symmetry of the mixed second derivatives,

a_i n_j = a_j n_i .

This equality is only possible if the vectors a and n are collinear. We conclude
that there exists a scalar μ such that a_i = μ n_i for all i = 1, …, n. We can,
therefore, write

⟦ψ,ij⟧ = μ n_i n_j .
It is, in fact, possible to squeeze out more information. For simplicity, we will confine
attention to the homogeneous linear case so that, in particular, the coefficients a, b, c
are independent of the unknown function and its derivatives, while the right-hand side
vanishes.7 Assume that the singularity curve is nowhere tangential to the x direction.
Differentiating the PDE with respect to x and then taking jumps we obtain
a ⟦u_xxx⟧ + 2b ⟦u_xxy⟧ + c ⟦u_xyy⟧ + a_x ⟦u_xx⟧ + 2b_x ⟦u_xy⟧ + c_x ⟦u_yy⟧ = 0.  (6.37)
The jumps of the third derivatives, however, are not independent of those of the second
derivatives, as we know from the geometric compatibility conditions (6.27). Indeed,
identifying the generic function ψ successively with each of the second derivatives,
we can write
d⟦u_xx⟧/ds = ⟦u_xxx⟧ x̃′ + ⟦u_xxy⟧ ỹ′,  (6.38)

d⟦u_xy⟧/ds = ⟦u_xyx⟧ x̃′ + ⟦u_xyy⟧ ỹ′,  (6.39)

d⟦u_yy⟧/ds = ⟦u_yyx⟧ x̃′ + ⟦u_yyy⟧ ỹ′.  (6.40)
Multiplying Eqs. (6.38) and (6.39), respectively, by a ỹ′ and c x̃′, and then adding
the results we obtain

a ỹ′ d⟦u_xx⟧/ds + c x̃′ d⟦u_xy⟧/ds = (a ⟦u_xxx⟧ + c ⟦u_xyy⟧) x̃′ ỹ′ + ⟦u_xxy⟧ (a ỹ′² + c x̃′²).  (6.41)
Since the curve in question is characteristic, we can apply Eq. (6.16) to the last term
of Eq. (6.41), thus yielding
a ỹ′ d⟦u_xx⟧/ds + c x̃′ d⟦u_xy⟧/ds = (a ⟦u_xxx⟧ + 2b ⟦u_xxy⟧ + c ⟦u_xyy⟧) x̃′ ỹ′.  (6.42)
Introducing this result into Eq. (6.37), we obtain
a ỹ′ d⟦u_xx⟧/ds + c x̃′ d⟦u_xy⟧/ds + (a_x ⟦u_xx⟧ + 2b_x ⟦u_xy⟧ + c_x ⟦u_yy⟧) x̃′ ỹ′ = 0.  (6.43)
We have succeeded in obtaining an equation relating exclusively the jumps of the
second derivatives and their derivatives with respect to the curve parameter. By virtue
of Eqs. (6.34), (6.35) and (6.36), we can write

$$a \tilde y' \frac{dB}{ds} - c \tilde x' \frac{d(B\, \tilde x'/\tilde y')}{ds} + \left(a_x - 2 b_x (\tilde x'/\tilde y') + c_x (\tilde x'/\tilde y')^2\right) B\, \tilde x' \tilde y' = 0. \qquad (6.44)$$
This is a first-order ODE for the evolution of the magnitude of the jump. It is some-
times called the transport equation or the decay-induction equation.
If the characteristic curve is parametrized by $y$, which can always be done locally
around a point at which $\tilde y' \neq 0$, the transport equation can be written more compactly
as

$$a \frac{dB}{dy} - c \lambda \frac{d(B\lambda)}{dy} + \left(a_x - 2 b_x \lambda + c_x \lambda^2\right) \lambda B = 0, \qquad (6.45)$$
where $\lambda$ denotes the characteristic slope $d\tilde x/d\tilde y$. Given that, except for $B = B(y)$, all
the quantities involved are known from the solution of the characteristic equation,
the integration of the transport equation is elementary.
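Since the coefficients and the slope $\lambda(y)$ are all known along a characteristic, Eq. (6.45) can also be integrated numerically. A minimal sketch (the helper names and the constant-coefficient test case are illustrative, not from the text):

```python
def transport_rhs(y, B, a, b, c, ax, bx, cx, lam, dlam):
    """dB/dy obtained by solving the transport equation (6.45) for B':
    a B' - c lam (B lam)' + (ax - 2 bx lam + cx lam^2) lam B = 0."""
    l, dl = lam(y), dlam(y)
    denom = a(y) - c(y) * l * l
    return (c(y) * l * dl - (ax(y) - 2.0 * bx(y) * l + cx(y) * l * l) * l) * B / denom

def rk4(f, y0, B0, y1, n, *args):
    """Classical fourth-order Runge-Kutta integration of B' = f(y, B)."""
    h, y, B = (y1 - y0) / n, y0, B0
    for _ in range(n):
        k1 = f(y, B, *args)
        k2 = f(y + h / 2, B + h * k1 / 2, *args)
        k3 = f(y + h / 2, B + h * k2 / 2, *args)
        k4 = f(y + h, B + h * k3, *args)
        B += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        y += h
    return B

# Constant coefficients a=1, b=0, c=-1 (wave-type equation) with slope lam=1:
# every x-derivative vanishes, so the jump amplitude must stay constant.
const = lambda v: (lambda y: v)
B_end = rk4(transport_rhs, 0.0, 2.0, 5.0, 200,
            const(1.0), const(0.0), const(-1.0),
            const(0.0), const(0.0), const(0.0), const(1.0), const(0.0))
print(B_end)  # 2.0: no decay or growth along the characteristic
```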
There are two immediate consequences of this formula. The first is that, in the
quasi-linear case, in which the coefficients depend on the solution, the transport
equation becomes a non-linear ODE. The second consequence, bearing an important
physical repercussion, is that, as the wave front advances and encounters preexisting
values $u^{+}_{xx}, u^{+}_{xy}, u^{+}_{yy}$ ahead of the wave, these values
affect the decay or growth of the amplitude of the propagating discontinuity.
It is a useful exercise, that we could have carried out also for first-order equations, to
ask ourselves how the form of a PDE is affected by an arbitrary change of coordinates.
A new system of coordinates in the plane is specified by means of two smooth
functions of two new variables $\xi$ and $\eta$, namely,

$$x = x(\xi, \eta), \qquad y = y(\xi, \eta), \qquad (6.46)$$

whose Jacobian determinant must not vanish at any point. Only when this is the case, which we assume from
here on, can Eq. (6.46) be inverted to yield $\xi, \eta$ as (smooth) functions of $x, y$. The
function $u = u(x, y)$ can be expressed in terms of the new variables by composition
of functions, namely,

$$\hat u(\xi, \eta) = u\big(x(\xi, \eta),\, y(\xi, \eta)\big).$$
We are trying to distinguish, by means of a hat over the symbol, between the function
as an operator and the result of applying this operator to the arguments. When there
is no room for confusion, however, this practice can be abandoned and let the context
indicate which is the function being considered. By a direct iterated use of the chain
rule of differentiation, we obtain the expressions
$$u_x = \hat u_\xi\, \xi_x + \hat u_\eta\, \eta_x, \qquad u_y = \hat u_\xi\, \xi_y + \hat u_\eta\, \eta_y,$$

$$\begin{aligned}
u_{xx} &= \hat u_{\xi\xi}\, \xi_x^2 + 2 \hat u_{\xi\eta}\, \xi_x \eta_x + \hat u_{\eta\eta}\, \eta_x^2, \\
u_{xy} &= \hat u_{\xi\xi}\, \xi_x \xi_y + \hat u_{\xi\eta} (\xi_x \eta_y + \xi_y \eta_x) + \hat u_{\eta\eta}\, \eta_x \eta_y, \\
u_{yy} &= \hat u_{\xi\xi}\, \xi_y^2 + 2 \hat u_{\xi\eta}\, \xi_y \eta_y + \hat u_{\eta\eta}\, \eta_y^2.
\end{aligned} \qquad (6.49)$$
In the new coordinate system, therefore, the original PDE (6.1) can be written as

$$\hat a\, \hat u_{\xi\xi} + 2 \hat b\, \hat u_{\xi\eta} + \hat c\, \hat u_{\eta\eta} + \ldots = 0. \qquad (6.50)$$
In this expression, we assume that the arguments of the coefficients have been
expressed in terms of the new variables. The new coefficients of the second-order
terms are given by the quadratic expressions
$$\begin{aligned}
\hat a &= a\, \xi_x^2 + 2b\, \xi_x \xi_y + c\, \xi_y^2, \\
\hat b &= a\, \xi_x \eta_x + b (\xi_x \eta_y + \xi_y \eta_x) + c\, \xi_y \eta_y, \\
\hat c &= a\, \eta_x^2 + 2b\, \eta_x \eta_y + c\, \eta_y^2.
\end{aligned} \qquad (6.51)$$
The second-order part of the original PDE can be regarded as a quadratic form
governed by the matrix
$$A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}. \qquad (6.52)$$
It is not difficult to verify that the counterpart for the transformed equation is governed
by the matrix
$$\hat A = \begin{bmatrix} \hat a & \hat b \\ \hat b & \hat c \end{bmatrix} = J^{-1} A\, J^{-T}, \qquad (6.53)$$
where J stands now for the Jacobian matrix. Notice that the determinants of both
matrices, $A$ and $\hat A$, will always have the same sign or vanish simultaneously. Each
type of equation can therefore be brought to a particularly simple form by a suitable
choice of coordinates. For an equation of the hyperbolic type, a coordinate
transformation can be found that brings it to the form

$$\hat u_{\xi\eta} + \ldots = 0, \qquad (6.54)$$
where we have indicated only the principal (i.e., second-order) part. An alternative
coordinate transformation can be found that brings the hyperbolic equation to the
form
û ξξ − û ηη + . . . = 0. (6.55)
These are the so-called normal forms of an equation of the hyperbolic type.
For a parabolic equation, the normal form is

$$\hat u_{\xi\xi} + \ldots = 0, \qquad (6.56)$$

while for an elliptic equation it is

$$\hat u_{\xi\xi} + \hat u_{\eta\eta} + \ldots = 0. \qquad (6.57)$$
Exercises
Exercise 6.1 Show that option 2 after Eq. (6.7), if applied to every point along the
initial curve, corresponds exactly to the fact that the space curve represented by
Eqs. (6.3) and (6.4) is a characteristic curve of the PDE (6.2).
Exercise 6.2 Show that the first-order quasi-linear PDE (6.2) can be regarded as the
specification of the directional derivative of the unknown function in the characteristic
direction.
Exercise 6.3 For each of the following second-order PDEs determine the type
(elliptic, parabolic, hyperbolic) and, if necessary, the regions over which each type
applies. Obtain and draw the characteristic curves wherever they exist.
2.5u x x + 5u x y + 1.5u yy + 5u = 0.
2u x x + 4u x y + 2u yy + 3(x 2 + y 2 )u x = e y .
u x x − 2u x y + 2u yy + 4u x u y = 0.
u x x + x u yy = 0. (Tricomi equation)
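The type test of Exercise 6.3 reduces, at each point, to the sign of the discriminant $b^2 - ac$, where $2b$ is the coefficient of $u_{xy}$. A small sketch with a hypothetical helper:

```python
def pde_type(a, b, c):
    """Type of a u_xx + 2b u_xy + c u_yy + ... = 0 from the discriminant b^2 - ac."""
    d = b * b - a * c
    if d > 0:
        return "hyperbolic"
    if d < 0:
        return "elliptic"
    return "parabolic"

print(pde_type(2.5, 2.5, 1.5))   # first equation: 6.25 - 3.75 > 0 -> hyperbolic
print(pde_type(2.0, 2.0, 2.0))   # second equation: 4 - 4 = 0    -> parabolic
print(pde_type(1.0, -1.0, 2.0))  # third equation: 1 - 2 < 0     -> elliptic
print(pde_type(1.0, 0.0, 2.0))   # Tricomi at x = 2: the type follows the sign of x
```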
Exercise 6.4 For a solid body without cracks in the small-deformation regime, find
which components of the strain tensor may be discontinuous across some plane.
[Hint: the displacement field is continuous].
Exercise 6.5 Generalize the transport equation (6.44) to the case of the non-homogeneous linear equation (6.1).
Apply your result to obtain and solve the transport equation for the modified one-
dimensional wave equation when a linear viscous term has been added (namely, a
term proportional to the velocity). Are the (projected) characteristics affected by this
addition?
References
1. Courant R, Hilbert D (1962) Methods of mathematical physics, vol 2. Interscience, Wiley, New
York
2. Garabedian PR (1964) Partial differential equations. Wiley, New York
3. Hadamard J (1903) Leçons sur la Propagation des Ondes et les Équations de l’Hydrodynamique.
Hermann, Paris. www.archive.org
4. John F (1982) Partial differential equations. Springer, Berlin
5. Truesdell C, Toupin RA (1960) The classical field theories. In: Flügge S (ed), Handbuch der
Physik. Springer, Berlin
Chapter 7
Systems of Equations
A quasi-linear system of $n$ first-order PDEs for $n$ unknown functions $u_1, \ldots, u_n$ of two independent variables $x, y$ can be written as

$$a_{ij}\, u_{j,x} + b_{ij}\, u_{j,y} = c_i, \qquad i = 1, \ldots, n,$$

where $a_{ij}, b_{ij}, c_i$ are differentiable functions of $x, y$ and $\mathbf{u}$. We will also use the block
matrix notation

$$A\, \mathbf{u}_{,x} + B\, \mathbf{u}_{,y} = \mathbf{c},$$

with $A = [a_{ij}]$, $B = [b_{ij}]$ and $\mathbf{c} = \{c_i\}$.
As before, we attempt to find whether or not, given the values u = û(r ) of the
vector u on a curve x = x̂(r ), y = ŷ(r ), we can calculate the derivatives u,x and u,y
throughout the curve. In complete analogy with Eq. (6.5), we can write
$$\frac{d\hat{\mathbf u}}{dr} = \mathbf{u}_{,x} \frac{d\hat x}{dr} + \mathbf{u}_{,y} \frac{d\hat y}{dr}. \qquad (7.3)$$
This vector equation represents, in fact, n scalar equations. Combining this informa-
tion with that provided by the system of PDEs itself, we obtain at each point of the
curve a system of 2n algebraic equations, namely,
$$\begin{bmatrix} \dfrac{d\hat x}{dr}\, I & \dfrac{d\hat y}{dr}\, I \\[2mm] A & B \end{bmatrix}
\begin{Bmatrix} \mathbf{u}_{,x} \\ \mathbf{u}_{,y} \end{Bmatrix} =
\begin{Bmatrix} \dfrac{d\hat{\mathbf u}}{dr} \\ \mathbf{c} \end{Bmatrix}, \qquad (7.4)$$
where I is the unit matrix of order n. If the determinant of this system of 2n linear
equations is not zero, we obtain (point-wise along the curve) a unique solution for
the local values of the partial derivatives. If the determinant vanishes, we obtain a
(projected) characteristic direction. According to a result from linear algebra [2], the
determinant of a partitioned matrix of the form
$$R = \begin{bmatrix} M & N \\ P & Q \end{bmatrix} \qquad (7.6)$$

can be evaluated block-wise. Applied to the matrix of Eq. (7.4), the vanishing of the determinant reduces to

$$\det\left(\frac{d\hat y}{dr}\, A - \frac{d\hat x}{dr}\, B\right) = 0,$$

which is the same as the generalized eigenvalue problem for the characteristic slopes
$dx/dy$, that is,

$$\det\left(A - \frac{dx}{dy}\, B\right) = 0. \qquad (7.9)$$
In this way, we arrive at the same conclusion as before, namely, that weak discon-
tinuities of the solution can only exist across characteristic curves. Totally elliptic
systems cannot sustain any discontinuities.
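Equation (7.9) can be solved numerically for the characteristic slopes. A sketch assuming $B$ is invertible; the reduction of the one-dimensional wave equation to a first-order system used here as a check is a standard choice, not taken from the text:

```python
import numpy as np

def characteristic_slopes(A, B):
    """Roots s of det(A - s B) = 0, Eq. (7.9), assuming B is invertible,
    so that the problem reduces to a standard eigenvalue problem."""
    return np.sort(np.linalg.eigvals(np.linalg.solve(B, A)).real)

# Wave equation u_tt = c^2 u_xx written for (p, w) = (u_t, u_x) with y = t:
#   p_x - w_t = 0  and  c^2 w_x - p_t = 0,  i.e.  A u_x + B u_t = 0.
c = 3.0
A = np.array([[1.0, 0.0], [0.0, c**2]])
B = np.array([[0.0, -1.0], [-1.0, 0.0]])
print(characteristic_slopes(A, B))  # slopes dx/dt = -c and +c
```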
Consider now a linear system of the form

$$A\, \mathbf{u}_{,x} + B\, \mathbf{u}_{,y} + C\, \mathbf{u} = \mathbf{d},$$

where the $n \times n$ matrices $A, B, C$ and the vector $\mathbf{d}$ are functions of $x$ and $y$ alone.
Assume that the eigenvalues $\lambda_1, \ldots, \lambda_n$ obtained from Eq. (7.9) are all real and
distinct.1 Let $M$ be the modal matrix, whose columns are (linearly independent)
eigenvectors corresponding to these eigenvalues. Then

$$A M = B M \Lambda, \qquad (7.14)$$

where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Substituting $\mathbf{u} = M \mathbf{v}$ and rearranging brings the system to the canonical form (7.16),
where the new matrix Ĉ and the new vector d̂ are still functions of x and y alone.
What we have achieved, at almost no cost, is a decoupling of the main part of each
equation of the system in the sense that each equation contains a derivation in only
one of the characteristic directions, as follows directly from the fact that λi = d x/dy
is the local slope of the i-th characteristic line, as suggested in Fig. 7.1. Thus, the
dependent variable vi can attain different values vi− , vi+ on either side of the i-th
characteristic line without violating the i-th differential equation. Moreover, it is
also possible to obtain the variation of the amplitude of the jump vi along the
characteristic line.
1 There may be multiple eigenvalues, as long as the dimension of the corresponding eigenspaces is
equal to the multiplicity.
[Fig. 7.1 Values $v_i^-$ and $v_i^+$ on either side of the $i$-th characteristic line]
The classical theory of beams, named after Daniel Bernoulli (1700–1782) and Leon-
hard Euler (1707–1783), is one of the cornerstones of structural engineering. It is
based on a number of simplifying assumptions, some of which are the following: (a)
the beam is symmetric with respect to a plane; (b) the beam axis is straight; (c) the
supports and the loading are symmetrical about the plane of symmetry of the beam;
(d) the loading is transversal, that is, perpendicular to the axis; (e) the deflections of
the axis are transversal and very small when compared with the beam dimensions;
(f) the material abides by Hooke's law of linear elasticity; (g) the plane normal cross
sections of the beam remain plane and perpendicular to the deformed axis; (h) the
rotary inertia of the cross sections is neglected in the dynamic equations. The last
two assumptions are crucial to the simplicity and usefulness of the theory while also
embodying some of its limitations.
On the basis of these assumptions, the equations of classical beam theory are not
difficult to derive. Introducing x, y Cartesian coordinates in the plane of symmetry
and aligning the x axis with the beam axis, as shown in Fig. 7.2, we denote by
q = q(x) the transverse load per unit length (positive downwards) and by w =
w(x, t) the transverse deflection (positive upwards). The time coordinate is denoted
by t. Assuming for specificity a constant cross section of area A and centroidal
moment of inertia I , the governing equations can be written as
$$\begin{cases}
V_x = -q - \rho A\, w_{tt}, \\
M_x = V, \\
E I\, \theta_x = M, \\
w_x = \theta,
\end{cases} \qquad (7.17)$$
[Fig. 7.2 Equilibrium of a beam element]
The assumption of perpendicularity between the cross sections and the deformed axis implies the
vanishing of the corresponding shear strains, but not of the shear stresses, an internal
contradiction of the theory. The first two equations are the result of enforcing vertical
and rotational dynamic equilibrium, while the third equation arises from Hooke’s law.
The fourth equation establishes the perpendicularity condition by equating the slope
of the axis to the rotation of the cross section.
Introducing the linear and angular velocities v = wt and ω = θt , respectively, the
system (7.17) can be rewritten as the first-order system
$$\begin{cases}
V_x = -q - \rho A\, v_t, \\
M_x = V, \\
E I\, \omega_x = M_t, \\
v_x = \omega,
\end{cases} \qquad (7.18)$$
or, in compact form,

$$\mathbf{u}_{,x} + A\, \mathbf{u}_{,t} = \mathbf{c}, \qquad (7.19)$$

where

$$A = \begin{bmatrix} 0 & 0 & 0 & \rho A \\ 0 & 0 & 0 & 0 \\ 0 & -\dfrac{1}{EI} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
\mathbf{u} = \begin{Bmatrix} V \\ M \\ \omega \\ v \end{Bmatrix}, \qquad
\mathbf{c} = \begin{Bmatrix} -q \\ V \\ 0 \\ \omega \end{Bmatrix}. \qquad (7.20)$$
$$\left.\frac{dt}{dx}\right|_{1,2} = \pm\sqrt{\frac{\rho}{E}}, \qquad
\left.\frac{dt}{dx}\right|_{3,4} = \pm\sqrt{\frac{\rho A}{G A_s}}. \qquad (7.24)$$
The first two roots are the inverse of the speeds of propagation of the ‘bending waves’,
while the last two roots are the inverse of the speeds of propagation of the ‘shear
waves’. The importance of these finite speeds in aircraft design was first recognized
in [5], where a clear exposition of the evolution of strong discontinuities in beams
is presented from basic principles and extended to the case of multiple eigenvalues.
Notice how, rather than solving specific numerical problems, a careful analysis of the
structure of the equations has allowed us to arrive at significant conclusions about
the phenomena at hand with relatively elementary computations.
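For a concrete feel for the magnitudes in Eq. (7.24), the two speeds can be evaluated directly; the material data below are hypothetical and not taken from the text:

```python
import math

# Hypothetical steel beam: E, G in Pa, rho in kg/m^3, areas in m^2.
E, G, rho = 210e9, 80e9, 7850.0
A, As = 1.0e-3, 0.85e-3

c_bend = math.sqrt(E / rho)              # speed of the 'bending' signals
c_shear = math.sqrt(G * As / (rho * A))  # speed of the 'shear' signals
print(round(c_bend), round(c_shear))     # about 5172 and 2943 m/s
```

As the text observes, the structure of the equations alone yields these finite propagation speeds, with no boundary-value problem being solved.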
Consider now a quasi-linear system of $n$ first-order PDEs in $K$ independent variables $x_1, \ldots, x_K$, written compactly, using the summation convention of Box 2.2 for repeated capital
indices in the range $1, \ldots, K$, as
$$A_I\, \mathbf{u}_{,x_I} = \mathbf{c}. \qquad (7.26)$$
The various matrices A I and the vector c are assumed to be differentiable functions
of the arguments x1 , . . . , x K , u 1 , . . . , u n . We want to search for characteristic mani-
folds. As usual, these are special (K − 1)-dimensional hyper-surfaces in the space of
independent variables. On these characteristic manifolds, the specification of initial
data is not sufficient to guarantee the existence of a unique solution in a neighbour-
hood of the manifold. Alternatively, these manifolds can be regarded as carriers of
weak singularities, that is, discontinuities in the first (or higher) derivatives of the
solutions. In Sect. 6.4.1 we obtained an important result to the effect that if a function
is continuous across a singular surface the jump of its gradient is necessarily collinear
with the normal n (in the usual Cartesian metric) to the surface. Thus, we may write
$$[\![\mathbf{u}_{,x_I}]\!] = \mathbf{a}\, n_I, \qquad I = 1, \ldots, K. \qquad (7.27)$$

Taking the jump of Eq. (7.26) then yields

$$(n_I A_I)\, \mathbf{a} = 0. \qquad (7.28)$$
For the vector a, containing the intensity of the jumps, to be non-zero the determinant
of the coefficient matrix must vanish, that is,
det(n I A I ) = 0. (7.29)
This is the equation defining the possible local normals to characteristic manifolds.
We ask ourselves: what kind of equation is this? It is obviously a homogeneous
polynomial of degree n in the K components n I . It may so happen that no vector n
(that is, no direction) satisfies this equation, in which case the system is said to be
totally elliptic at the point in question. At the other extreme, if fixing K − 1 entries
of n the polynomial in the remaining component has n distinct real roots, we have a
case of a totally hyperbolic system at the point. Let a putative singular hyper-surface
be given by an equation such as
φ(x1 , . . . , x K ) = 0. (7.30)
Then the vector n, not necessarily of unit length, can be identified with the gradient
of φ, namely,
n I = φ,x I (7.31)
so that we get
det(A I φ,x I ) = 0. (7.32)
This is a highly non-linear single first-order PDE for the function φ. We call it
the characteristic equation or the generalized eikonal equation associated with the
If the characteristic hyper-surface is expressed in the form

$$\psi(x_1, \ldots, x_{K-1}) - t = 0, \qquad (7.33)$$

the spatial wave front at any instant of time $t_0$ is the level curve $\psi = t_0$, as suggested
in Fig. 7.3 for the case K = 3. This observation leads to a somewhat friendlier
formulation of the characteristic equation, as described in Box 7.1.
$$\det\left(A_\alpha\, \psi_{,x_\alpha} - A_K\right) = 0,$$
where the summation convention for Greek indices is restricted to the range
1, . . . , K − 1. Every solution of this first-order non-linear PDE represents a
characteristic manifold. In terms of the normal speed of a moving surface,
introduced in Box 7.2, this equation can be written point-wise as the algebraic
condition
det(Aα m α − V A K ) = 0,
in which $m_\alpha$ are the components of the unit normal to the spatial wave front.
In terms of classification, the system is totally elliptic if there are no real
eigenvalues V . It is totally hyperbolic if it has K − 1 distinct real eigenvalues.
Repeated eigenvalues can also be included in the definition, since they appear
in applications.
[Fig. 7.3 Spatial wave fronts as level curves of $\psi$]
$$A_{IJ}\, \mathbf{u}_{,x_I x_J} = \mathbf{c}, \qquad (7.34)$$
where the matrices A I J and the vector c are assumed to be differentiable functions
of the independent variables x I and possibly also of the unknown functions u i and
their first derivatives u i,I . We proceed to evaluate the jump of this equation under
the assumption that neither the functions u i nor their first derivatives undergo any
discontinuities, since we are looking for weak discontinuities. The result is
$$A_{IJ}\, [\![\mathbf{u}_{,x_I x_J}]\!] = 0. \qquad (7.35)$$
Invoking the iterated compatibility condition derived in Box 6.1, we can write
$$A_{IJ}\, \mathbf{a}\, n_I n_J = 0, \qquad (7.36)$$
where a is a vector with n entries known as the wave amplitude vector. We conclude
that a hyper-surface element with normal n is characteristic if
det (A I J n I n J ) = 0. (7.37)
Let a moving surface be given by

$$f(x_1, x_2, x_3, t) = 0.$$

Following a point $P$ on the surface to a nearby point $P'$ on the surface at time $t + dt$, we must have $f_{,i}\, dx_i + f_t\, dt = 0$,
where the summation convention is used for the spatial indices. Dividing by
$dt$ and recognizing the vector with components $V_i = dx_i/dt$ as the velocity
associated with the pairing $P, P'$ yields

$$V_i f_{,i} = -f_t \qquad \text{or} \qquad \mathbf{V} \cdot \nabla f = -f_t.$$

The normal speed of the surface is, therefore,

$$U = V_n = \mathbf{V} \cdot \mathbf{n} = -\frac{f_t}{\sqrt{\nabla f \cdot \nabla f}}.$$
Introduce new variables by setting $\xi_I = x_I$ for $I = 1, \ldots, K-1$ and $\tau = \psi(x_1, \ldots, x_{K-1}) - t$.
It follows that for $\tau = 0$ the coordinate lines corresponding to the new variables $\xi_I$
lie on the singular surface. Let $g = g(x_1, \ldots, x_{K-1}, t)$ be a differentiable function
of the old variables. It can be readily converted into a function $\hat g$ of the new variables
by the composition

$$\hat g(\xi_1, \ldots, \xi_{K-1}, \tau) = g\big(\xi_1, \ldots, \xi_{K-1},\, \psi(\xi_1, \ldots, \xi_{K-1}) - \tau\big).$$
For the sake of compactness, let us abuse the notation and denote by commas the
partial derivatives with respect to either x I or ξ I . The distinction will be clear from
the name of the function (un-hatted or hatted, respectively). In the same vein, let
us denote by superimposed dots partial derivatives with respect to t or τ . With this
understanding, we obtain
$$\begin{cases}
\hat g_{,I} = g_{,I} + \dot g\, \psi_{,I} & \text{for } I = 1, \ldots, K-1, \\
\dot{\hat g} = -\dot g.
\end{cases} \qquad (7.40)$$
The beauty of the new variables is that the coordinates ξ I are interior coordinates,
as a result of which derivatives in those K − 1 directions do not experience any
jump! Consequently, ξ-derivatives commute with the jump operator. Since we are
dealing with weak waves, jumps of functions of the variables ξ I , τ will occur only in
quantities with two or more τ derivatives. Differentiating Eq. (7.40), we obtain the
following relations
$$[\![\ddot g]\!] = [\![\ddot{\hat g}]\!], \qquad (7.41)$$

$$[\![\dot g_{,I}]\!] = -[\![\ddot{\hat g}]\!]\, \psi_{,I}, \qquad (7.42)$$

$$[\![g_{,IJ}]\!] = [\![\ddot{\hat g}]\!]\, \psi_{,I} \psi_{,J}, \qquad (7.43)$$

$$[\![\dddot g]\!] = -[\![\dddot{\hat g}]\!], \qquad (7.44)$$

$$[\![\ddot g_{,I}]\!] = [\![\ddot{\hat g}_{,I}]\!] + [\![\dddot{\hat g}]\!]\, \psi_{,I}, \qquad (7.45)$$

$$[\![\dot g_{,IJ}]\!] = -[\![\ddot{\hat g}_{,I}]\!]\, \psi_{,J} - [\![\ddot{\hat g}_{,J}]\!]\, \psi_{,I} - [\![\ddot{\hat g}]\!]\, \psi_{,IJ} - [\![\dddot{\hat g}]\!]\, \psi_{,I} \psi_{,J}. \qquad (7.46)$$
Consider a system in the normal form

$$\ddot{\mathbf u} = A_{IJ}\, \mathbf{u}_{,x_I x_J}, \qquad (7.47)$$

where the matrices $A_{IJ}$ are assumed to be constant and symmetric. Taking jumps and
invoking Eqs. (7.41) and (7.43) yields

$$\left(A_{IJ}\, \psi_{,I} \psi_{,J} - I\right) [\![\ddot{\hat{\mathbf u}}]\!] = 0, \qquad (7.48)$$
144 7 Systems of Equations
in which I is the unit matrix of order n. We assume the system to be totally hyperbolic.
A solution $\psi = \psi(x_1, \ldots, x_{K-1})$ of the first-order eikonal equation

$$\det\left(A_{IJ}\, \psi_{,I} \psi_{,J} - I\right) = 0 \qquad (7.49)$$

defines a characteristic manifold.
Differentiating Eq. (7.47) once with respect to $t$, taking jumps, and using Eqs. (7.41)–(7.46), one obtains

$$A_{IJ}\left([\![\ddot{\hat{\mathbf u}}]\!]_{,I}\, \psi_{,J} + [\![\ddot{\hat{\mathbf u}}]\!]_{,J}\, \psi_{,I} + [\![\ddot{\hat{\mathbf u}}]\!]\, \psi_{,IJ}\right) + \left(A_{IJ}\, \psi_{,I} \psi_{,J} - I\right) [\![\dddot{\hat{\mathbf u}}]\!] = 0. \qquad (7.51)$$
Recall that the eigenvector is known in direction and that, therefore, this equation
becomes an ODE for the amplitude $a$ of $[\![\ddot{\hat{\mathbf u}}]\!]$. This is the desired transport equation.
It can be shown [1] that its characteristics coincide with the bi-characteristics of the
original equation.
7.2.3.1 Characteristics
We have derived in Sect. 7.1.4 the dynamic equations for a Timoshenko beam as a
system of 4 first-order PDEs. It is also possible to express these equations in terms
of 2 second-order PDEs for the displacement w and the rotation θ, respectively. The
result for a beam of homogeneous properties is

$$\begin{cases}
\rho A\, w_{tt} - G A_s (w_x - \theta)_x = -q, \\
\rho I\, \theta_{tt} - E I\, \theta_{xx} = G A_s (w_x - \theta).
\end{cases} \qquad (7.53)$$
These equations can be recast in the normal form (7.47), except for the addition of
a vector $\mathbf{c}$ on the right-hand side. Since $K = 2$ we obtain the single matrix

$$A = \begin{bmatrix} \dfrac{G A_s}{\rho A} & 0 \\[2mm] 0 & \dfrac{E}{\rho} \end{bmatrix}, \qquad (7.54)$$

whose eigenvalues yield the characteristic slopes $dx/dt = \pm\sqrt{G A_s/(\rho A)}$ and $dx/dt = \pm\sqrt{E/\rho}$,
which, except for a different numbering, are identical to those obtained earlier as
Eq. (7.24). The corresponding eigenvectors are, respectively, $(1, 0)^T$ and $(0, 1)^T$. Thus,
the characteristics are straight lines. We have four families of characteristics in the
x, t plane, each pair with slopes of different signs. For total hyperbolicity, these slopes
are all distinct. The interesting case of double eigenvalues is considered separately
in Box 7.3.
The general prescription of Eq. (7.52) has to be modified slightly to take into consid-
eration the additional vector c. Indeed, this term contributes first partial derivatives,
which have been assumed to be continuous and, accordingly, do not affect the char-
acteristic equations. On the other hand, when constructing the transport equation,
a further derivative is introduced which affects the formation of the jumps. The
additional term is

$$\mathbf{c}_t = \begin{Bmatrix} \dfrac{G A_s}{\rho A}\, \theta_{xt} \\[2mm] -\dfrac{G A_s}{\rho I}\, w_{xt} \end{Bmatrix}. \qquad (7.57)$$
The final result corresponding to Eq. (7.52) for the first eigenvalue is

$$\begin{pmatrix} 1 & 0 \end{pmatrix} \left\{ \begin{bmatrix} \dfrac{G A_s}{\rho A} & 0 \\[2mm] 0 & \dfrac{E}{\rho} \end{bmatrix} 2 a_x \sqrt{\dfrac{\rho A}{G A_s}} \begin{Bmatrix} 1 \\ 0 \end{Bmatrix} + a\, \dfrac{G A_s}{\rho A} \begin{Bmatrix} 0 \\ 1 \end{Bmatrix} \right\} = 0. \qquad (7.58)$$
This equation implies that ax = 0, that is, the amplitude of a jump in the second
derivative of the displacement remains constant as it propagates. A similar analysis for
the second eigenvalue shows that the amplitude of the second derivative of the rotation
also remains constant. The main reason for these facts, apart from the constancy of the
sectional properties, is the complete decoupling between the modes. This decoupling
stems from the particular form of the additional term c and from the assumption of
total hyperbolicity, namely, that the speeds of propagation of the bending and shear
signals are different. A surprising discrepancy is found when these speeds happen to
be equal to each other, as discussed in Box 7.3.
$$a = C \cos\frac{x}{2\kappa} + D \sin\frac{x}{2\kappa}, \qquad
b = -\frac{D}{\kappa} \cos\frac{x}{2\kappa} + \frac{C}{\kappa} \sin\frac{x}{2\kappa},$$
where $\kappa = \sqrt{I/A}$ is the radius of gyration and $C, D$ are constants to be
adjusted according to the initial conditions, assuming that this knowledge is
available at some point on one of the characteristic lines. The solution has
been parametrized by x, but could as well be parametrized by t or any other
parameter along the characteristic line. The main conclusion is that, unlike the
case of simple eigenvalues, whereby the amplitude does not decay or grow,
in the case of a double eigenvalue the amplitudes of both bending and shear
signals are coupled and evolve harmonically in time.
$$p_{xx} + p_{yy} + p_{zz} = \frac{1}{c^2}\, p_{tt}, \qquad (7.59)$$

$$\psi_x^2 + \psi_y^2 + \psi_z^2 - \frac{1}{c^2} = 0. \qquad (7.61)$$
$$x = r, \qquad y = r^2, \qquad \psi = 0. \qquad (7.62)$$
Physically, this initial wave front could have been produced by a distribution of
many speakers or other sound producing devices arranged on a parabolic surface and
activated simultaneously at time t = 0. Extending these data by the strip condition
and by the PDE itself yields
$$\psi_x + 2 r\, \psi_y = 0, \qquad \psi_x^2 + \psi_y^2 - \frac{1}{c^2} = 0. \qquad (7.63)$$
$$\frac{dx}{ds} = 2\psi_x, \qquad \frac{dy}{ds} = 2\psi_y, \qquad \frac{d\psi}{ds} = 2\psi_x^2 + 2\psi_y^2 = \frac{2}{c^2}, \qquad \frac{d\psi_x}{ds} = 0, \qquad \frac{d\psi_y}{ds} = 0. \qquad (7.64)$$
The solution of this system is given by

$$x = -\frac{4 r s}{c\sqrt{1 + 4 r^2}} + r, \qquad y = \frac{2 s}{c\sqrt{1 + 4 r^2}} + r^2, \qquad \psi = \frac{2 s}{c^2}. \qquad (7.65)$$
For any value of s the solutions projected on the x, y plane are straight lines. They
are the bi-characteristics of the original PDE. For this particular equation, the bi-
characteristics are perpendicular to the spatial wave fronts. On the concave side of the
initial parabola these bi-characteristics (or rays) tend to converge. This phenomenon,
observed also in optics, is known as focusing and the formation of a caustic. Figure 7.4
provides a plot of the bi-characteristics for our example.
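The rays can be generated directly from the solution of the characteristic system; a sketch (function name illustrative) that also verifies the perpendicularity of ray and initial front noted above:

```python
import math

def ray_point(r, s, c=1.0):
    """Point at parameter s on the bi-characteristic through (r, r^2):
    a straight ray along the inward normal (-2r, 1)/sqrt(1 + 4 r^2)."""
    n = math.sqrt(1.0 + 4.0 * r * r)
    return (r - 4.0 * r * s / (c * n), r * r + 2.0 * s / (c * n))

r = 0.5
x0, y0 = ray_point(r, 0.0)
x1, y1 = ray_point(r, 0.3)
d = (x1 - x0, y1 - y0)    # ray direction
tangent = (1.0, 2.0 * r)  # tangent to the parabola y = x^2 at (r, r^2)
print(x0, y0)                                              # (0.5, 0.25): on the front
print(abs(d[0] * tangent[0] + d[1] * tangent[1]) < 1e-12)  # True: ray normal to front
```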
For a circular (cylindrical) surface of radius R and centre at (0, R), all the rays
converge to the centre. The initial wave front can be written parametrically as
[Fig. 7.4 Bi-characteristics (rays) for the parabolic wave front]

$$x = R \sin\theta, \qquad y = R (1 - \cos\theta), \qquad \psi = 0. \qquad (7.66)$$
The characteristic ODEs are given again by (7.64). The solution in this case is given
by
$$x = \left(R - \frac{2s}{c}\right) \sin\theta, \qquad y = R(1 - \cos\theta) + \frac{2s}{c} \cos\theta, \qquad \psi = \frac{2s}{c^2}. \qquad (7.68)$$
The parameters s, θ are readily eliminated and we obtain the solution of the wave
fronts in the form
$$\psi = \frac{1}{c}\left[R - \sqrt{x^2 + (R - y)^2}\,\right]. \qquad (7.69)$$
We have a single equation, so that the matrices A I J become scalars. In our case,
moreover, they attain the simple form c2 δ I J , where δ I J is the Kronecker symbol.
Correspondingly, the amplitude $[\![\ddot{\hat u}]\!]$ is just a scalar $a$ and the decay equation (7.52)
becomes
2ax ψx + 2a y ψ y + a(ψx x + ψ yy ) = 0. (7.70)
Notice that at this stage the coefficients ψx , ψ y are known. The characteristics of this
first-order PDE for a are given by
$$\frac{dx}{dr} = 2\psi_x, \qquad \frac{dy}{dr} = 2\psi_y, \qquad \frac{da}{dr} = -a(\psi_{xx} + \psi_{yy}). \qquad (7.71)$$
As expected, the projected characteristics of the transport equation are precisely the
same as the bi-characteristics of the original equation. Thus, weak discontinuities
travel along rays (in this example perpendicular to the moving wave front). Let us
integrate the transport equation for the circular case, since the parabolic case leads
to more involved formulas. Our objective is to satisfy our intuition that as the signals
converge toward a focus the acoustic pressure increases without bound. Introducing
the solution ψ into (7.71), the system of equations is written as
$$\frac{dx}{dr} = -\frac{2x}{c\rho}, \qquad \frac{dy}{dr} = \frac{2(R - y)}{c\rho}, \qquad \frac{da}{dr} = \frac{a}{c\rho}, \qquad (7.72)$$

where $\rho = \sqrt{x^2 + (R - y)^2}$ is the distance to the centre of the circle. From the first
two equations we conclude that
$$\frac{dx}{dy} = -\frac{x}{R - y}, \qquad (7.73)$$

whose solutions are straight lines through the centre $(0, R)$: the rays are radii of the circle.
Notice that
$$d\rho = \rho_x\, dx + \rho_y\, dy = -\frac{2}{c}\, dr. \qquad (7.75)$$
It follows that
$$\frac{da}{d\rho} = \frac{da}{dr} \frac{dr}{d\rho} = -\frac{a}{2\rho}, \qquad (7.76)$$
which integrates to
$$a = a_0 \sqrt{\frac{R}{\rho}}, \qquad (7.77)$$
where a0 is the intensity of the excitation (produced, say, by a sudden pulse of the
speakers). Thus, the pressure grows without bound as ρ approaches the center of
the circle. It is important to realize that we have not solved the acoustic differential
equation (7.60). We have merely found the rays as carriers of weak discontinuities and
the variation of their amplitude. This information was gathered by solving first-order
differential equations only.
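The square-root growth law (7.77) can be checked against a direct numerical integration of the system (7.72); a sketch with a hand-rolled Runge–Kutta step (step size and starting angle are illustrative):

```python
import math

def rk4_step(state, h, c, R):
    """One fourth-order Runge-Kutta step of the ray/transport system (7.72)."""
    def f(s):
        x, y, a = s
        rho = math.hypot(x, R - y)  # distance to the centre (0, R)
        return (-2.0 * x / (c * rho), 2.0 * (R - y) / (c * rho), a / (c * rho))
    k1 = f(state)
    k2 = f([s + h * k / 2 for s, k in zip(state, k1)])
    k3 = f([s + h * k / 2 for s, k in zip(state, k2)])
    k4 = f([s + h * k for s, k in zip(state, k3)])
    return [s + h * (p + 2 * q + 2 * u + v) / 6
            for s, p, q, u, v in zip(state, k1, k2, k3, k4)]

c, R, a0, theta = 1.0, 1.0, 1.0, 0.7
state = [R * math.sin(theta), R * (1.0 - math.cos(theta)), a0]  # start on the front
for _ in range(2000):  # integrate up to r = 0.4, i.e. rho = 0.2
    state = rk4_step(state, 2e-4, c, R)
x, y, a = state
rho = math.hypot(x, R - y)
print(round(a, 3), round(a0 * math.sqrt(R / rho), 3))  # amplitude matches (7.77)
```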
$$\rho\, \frac{\partial v_i}{\partial t} + \rho\, v_{i,j} v_j = b_i + \sigma_{ij,j}. \qquad (7.78)$$
Here, σi j = σ ji are the (Cartesian) components of the stress tensor, ρ is the current
density, bi are the components of the spatial body force and vi are the components
of the velocity vector. This formulation is strictly Eulerian or spatial. For elastic
solids, it is convenient to introduce an alternative (Lagrangian) formulation based on
a fixed reference configuration. Nevertheless, for our current purpose, we will adopt a
linearized formulation around the present configuration, in which case the distinction
between the two formulations alluded to above can be disregarded. Moreover, the
inertia term appearing on the left-hand side of Eq. (7.78) will be approximated by
the product $\rho\, \partial v_i / \partial t$, where $\rho$ is identified with the density in the fixed (current) state,
so that the mass conservation (continuity equation) does not need to be enforced.
Finally, our main kinematic variable is a displacement vector field with components
u i = u i (x1 , x2 , x3 , t) in the global inertial frame of reference. In terms of these
variables, the velocity components are given by
7.2 Systems of Second-Order Equations 151
$$v_i = \frac{\partial u_i}{\partial t}. \qquad (7.79)$$
The material properties will be assumed to abide by Hooke’s law
$$\sigma_{ij} = C_{ijkl}\, \frac{1}{2}\left(u_{k,l} + u_{l,k}\right). \qquad (7.80)$$
The fourth-order tensor of elastic constants $C$ enjoys the symmetries

$$C_{ijkl} = C_{jikl} = C_{ijlk} = C_{klij}. \qquad (7.81)$$

In the important particular case of an isotropic material,

$$C_{ijkl} = \lambda\, \delta_{ij} \delta_{kl} + \mu\left(\delta_{ik} \delta_{jl} + \delta_{il} \delta_{jk}\right). \qquad (7.82)$$

In this equation $\lambda$ and $\mu$ are the Lamé coefficients and $\delta_{ij}$ is the Kronecker symbol,
equal to 1 when the two indices are equal and vanishing otherwise.
The whole apparatus can be condensed in a system of three coupled second-order
PDEs for the three displacement components, namely,
$$C_{ijkl}\, u_{k,lj} + b_i = \rho\, \frac{\partial^2 u_i}{\partial t^2}, \qquad i = 1, 2, 3. \qquad (7.83)$$
7.2.5.2 Hyperbolicity

Seeking weak discontinuities as in Eq. (7.37), the characteristic condition for system (7.83) takes the form of the eigenvalue problem

$$C_{ijkl}\, m_j m_l\, a_k = \rho\, U^2\, a_i, \qquad (7.84)$$

in which $m_i$ are the components of a unit vector normal to the wave front and $a_i$ are
the components of the wave amplitude vector. The tensor
Q ik = Ci jkl m j m l (7.85)
is called the acoustic tensor in the direction m.3 Total hyperbolicity corresponds to
the case in which, for every m, the tensor Q is (symmetric and) positive-definite with
three distinct eigenvalues proportional to the speeds of propagation. The associated
orthonormal eigenvectors are called the acoustical axes associated with the direction
m. If one of the acoustical axes coincides with m, the wave is said to be longitudinal.
3 For a fuller treatment of the general non-linear theory, see [6, 7].
If they are perpendicular, the wave is called transversal. These concepts make sense
because n = K − 1 and the displacement vector can be regarded as an element of the
physical space R3 . The case of one repeated eigenvalue occurs in isotropic materials,
where transversal waves propagate at identical speed.
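The acoustic tensor of Eq. (7.85) is easy to assemble for the isotropic law (7.82); a sketch with illustrative Lamé constants ($\lambda = 2$, $\mu = 1$, $\rho = 1$) recovering the longitudinal and transverse speeds asked for in Exercise 7.7:

```python
import numpy as np

def acoustic_tensor(lam, mu, m):
    """Q_ik = C_ijkl m_j m_l for the isotropic elasticity tensor
    C_ijkl = lam d_ij d_kl + mu (d_ik d_jl + d_il d_jk)."""
    d = np.eye(3)
    C = (lam * np.einsum('ij,kl->ijkl', d, d)
         + mu * (np.einsum('ik,jl->ijkl', d, d) + np.einsum('il,jk->ijkl', d, d)))
    return np.einsum('ijkl,j,l->ik', C, np.asarray(m), np.asarray(m))

lam, mu, rho = 2.0, 1.0, 1.0
Q = acoustic_tensor(lam, mu, [0.0, 0.0, 1.0])
speeds = np.sqrt(np.linalg.eigvalsh(Q) / rho)
# Two transverse waves at sqrt(mu/rho) = 1 (the repeated eigenvalue of the
# isotropic case) and one longitudinal wave at sqrt((lam + 2 mu)/rho) = 2.
print(speeds)
```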
Exercises
Exercise 7.1 Write the equations of an intermediate theory of beams for which the
normality of cross sections is preserved but the rotary inertia is included. Find the
speed of propagation of weak singularities for this theory.
Exercise 7.2 In the Timoshenko beam we found 4 distinct speeds of propagation and
identified the first two as pertaining to bending waves and the last two as pertaining to
shear waves. Justify this terminology by calculating the eigenvectors corresponding
to each of the 4 speeds.
Exercise 7.3 Carry out all the steps necessary to obtain Eq. (7.16) and provide
explicit expressions for Ĉ and d̂. Compare with the results in [4], p. 48.
Exercise 7.4 Apply the procedure explained in Sect. 7.1.3 to the Timoshenko beam
equations to obtain a system in canonical form. Consider the case of repeated eigen-
values and obtain the corresponding transport equations. Compare your results with
those obtained in [5] and with those obtained in Box 7.3.
Exercise 7.5 It is not always possible to obtain a single higher order PDE equivalent
to a given system of first-order PDEs. In the case of the Timoshenko beam, however,
this can be done by differentiation and elimination. Provide two alternative formula-
tions based, respectively, on a system of two second-order equations for the rotation
θ and the deflection w, and on a single fourth-order equation for the deflection w.
Obtain the characteristic speeds from each of these two alternative formulations.
Exercise 7.6 Prove that the speed of propagation of a surface moving on a material
background, as described in Box 7.2, is given by
$$U_p = -\frac{Df/Dt}{\sqrt{\nabla f \cdot \nabla f}},$$

where $D/Dt$ denotes the material time derivative.
Exercise 7.7 Show that in an isotropic elastic material (in infinitesimal elasticity)
all transverse waves travel at the same speed. Find the speeds of longitudinal and
transverse waves in terms of the Lamé constants. Determine restrictions on these
constants so that waves can actually exist.
Chapter 8
The One-Dimensional Wave Equation

The archetypal hyperbolic equation is the wave equation in one spatial dimension.
It governs phenomena such as the propagation of longitudinal waves in pipes and
the free transverse vibrations of a taut string. Its relative simplicity lends itself to
investigation in terms of exact solutions of initial and boundary-value problems. The
main result presented in this chapter is the so-called d’Alembert solution, expressed
within any convex domain as the superposition of two waves traveling in opposite
directions with the same speed. Some further applications are explored.
u tt = c2 u x x , (8.1)
[Fig. 8.1 Element of a taut string under tension T]
Since the angles are assumed to be very small, we can replace the sine and the tangent
of the arc by the arc itself. Moreover, at any fixed instant of time
dθ ≈ d(tan θ) = du x = u x x d x. (8.4)
u tt = c2 u x x + q. (8.5)
As before, this constant has the units of speed. In the absence of externally applied
transverse forces q, we recover the one-dimensional wave equation (8.1). We refer
to this case as the free vibrations of the string.
The characteristic equation (6.18) is given, in the case of the wave equation and with
the notation of the present chapter, by
$$\frac{dt}{dx} = \pm\frac{1}{c}. \qquad (8.7)$$
In other words, the equation is hyperbolic. Since the equation is linear, the projected
characteristics are, as expected, independent of the solution. Moreover, because the
equation has constant coefficients, they are straight lines. We have seen that weak
singularities can occur only across characteristics. It follows in this case that weak
signals propagate with the constant speed $c$. Introducing the new variables

$$\xi = \frac{x + c t}{2}, \qquad \eta = \frac{x - c t}{2}, \qquad (8.8)$$
the one-dimensional wave equation is reduced to the canonical (normal) form
u ξη = 0. (8.9)
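The reduction to (8.9) is a direct chain-rule computation, which we record here for completeness:

```latex
% Chain rule for xi = (x + ct)/2, eta = (x - ct)/2, so that
% xi_x = eta_x = 1/2,  xi_t = c/2,  eta_t = -c/2.
u_x    = \tfrac12\,(u_\xi + u_\eta), \qquad
u_t    = \tfrac{c}{2}\,(u_\xi - u_\eta),
\\[4pt]
u_{xx} = \tfrac14\,\bigl(u_{\xi\xi} + 2u_{\xi\eta} + u_{\eta\eta}\bigr), \qquad
u_{tt} = \tfrac{c^2}{4}\,\bigl(u_{\xi\xi} - 2u_{\xi\eta} + u_{\eta\eta}\bigr),
\\[4pt]
u_{tt} - c^2 u_{xx} \;=\; -\,c^2\, u_{\xi\eta} \;=\; 0 .
```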
we conclude that, in the domain of interest, there exists a function F(ξ) such that

∂u(ξ, η)/∂ξ = F(ξ). (8.11)

Integrating with respect to ξ while keeping η fixed, we obtain

u(ξ, η) = f (ξ) + g(η), (8.12)

where f is an antiderivative of F and g is an arbitrary function of η. Reverting to the original variables (and absorbing the factor 1/2 into the definitions of f and g), the general solution of the one-dimensional wave equation is

u(x, t) = f (x + c t) + g(x − c t). (8.13)
8.4 The Infinite String

The d’Alembert representation of the general solution applies to any convex domain.
Nevertheless, the solution of any real-life problem requires the specification of bound-
ary (fixed ends of a finite string) and initial (position and velocity at t = 0) conditions.
In particular, the boundary conditions limit the direct use of the d’Alembert represen-
tation over the whole domain of interest. For this reason, the best direct application
of the d’Alembert solution is to the case of a spatially unbounded domain (i.e.,
an infinite string). Assume that the initial conditions are given by some functions
u 0 (x), v0 (x), namely,
u(x, 0) = u 0 (x) (8.14)
and
u t (x, 0) = v0 (x). (8.15)
In other words, the functions u 0 (x), v0 (x) represent, respectively, the known initial
shape and velocity profile of the string. Using the d’Alembert representation (8.13),
we immediately obtain
u_0(x) = f (x) + g(x) (8.16)

and

v_0(x) = c f ′(x) − c g′(x), (8.17)
where primes are used to denote derivatives of a function of a single variable. Differ-
entiating equation (8.16) and combining the result with Eq. (8.17), we can read off
f ′(x) = (1/2) u_0′(x) + (1/2c) v_0(x), (8.18)

g′(x) = (1/2) u_0′(x) − (1/2c) v_0(x). (8.19)
On integrating, we obtain
f (x) = (1/2) u_0(x) + (1/2c) ∫_0^x v_0(z) dz + C, (8.20)

g(x) = (1/2) u_0(x) − (1/2c) ∫_0^x v_0(z) dz − C. (8.21)
Notice that the lower limit of integration is immaterial, since it would only affect the
value of the constant of integration C. The reason for having the same integration
constant (with opposite signs) in both expressions stems from the enforcement of
Eq. (8.16). Notice also that the dependence on x is enforced via the upper limit,
according to the fundamental theorem of calculus, while z is a dummy variable of
integration. For the functions f and g to be of class C 2 it is sufficient that u 0 be C 2
and that v0 be C 1 .
Now that we are in possession of the two component functions of the d’Alembert
representation, we are in a position to state the final result, namely,
u(x, t) = f (x + c t) + g(x − c t)
        = (1/2) u_0(x + c t) + (1/2c) ∫_0^{x+c t} v_0(z) dz + (1/2) u_0(x − c t) − (1/2c) ∫_0^{x−c t} v_0(z) dz. (8.22)
Fig. 8.3 The domain of dependence of the point (x, t): the interval [x − ct, x + ct] cut on the x axis by the two backward characteristics.
u(x, t) = (1/2) u_0(x + c t) + (1/2) u_0(x − c t) + (1/2c) ∫_{x−c t}^{x+c t} v_0(z) dz. (8.23)
From the point of view of the general theory of hyperbolic equations, it is important
to notice that the value of the solution at a given point (x, t) is completely determined
by the values of the initial conditions in the finite closed interval [x − c t, x + c t].
This interval, called the domain of dependence of the point (x, t), is determined
by drawing the two characteristics through this point backwards in time until they
intersect the x axis, as shown in Fig. 8.3.3 From the physical point of view, any
initial datum or signal outside the domain of dependence of a point has no influence
whatsoever on the response of the system at that point of space and time. In other
words, signals propagate at a definite finite speed and cannot, therefore, be felt at
a given position before a finite time of travel. This remark has implications on any
numerical scheme of solution (such as the method of finite differences) that one may
attempt to use in practice.
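As a concrete illustration, formula (8.23) is straightforward to evaluate numerically. The sketch below is ours (the function names and the trapezoidal quadrature are our choices, not the book's):

```python
import math

def dalembert(u0, v0, c, x, t, n=2000):
    """Evaluate the d'Alembert solution (8.23) at the point (x, t).

    The integral of v0 over the domain of dependence [x - ct, x + ct]
    is approximated by the composite trapezoidal rule with n panels.
    """
    a, b = x - c * t, x + c * t
    h = (b - a) / n
    integral = 0.5 * h * (v0(a) + v0(b)) + h * sum(v0(a + k * h) for k in range(1, n))
    return 0.5 * (u0(a) + u0(b)) + integral / (2.0 * c)

# A string released from rest with shape sin(x): the exact solution is
# the standing wave u(x, t) = sin(x) cos(ct).
u_value = dalembert(math.sin, lambda z: 0.0, c=2.0, x=0.7, t=0.3)
```

Only the initial data inside the interval [x − ct, x + ct] enter the computation, in accordance with the notion of domain of dependence.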
The natural counterpart of the notion of domain of dependence is that of range of
influence. Given a closed interval [a, b] (or, in particular, a point) in the x axis (or
anywhere else, for that matter), its range of influence consists of the collection of
points (in the future) whose domain of dependence intersects that interval (or point).
It is obtained by drawing the outbound characteristics from the extreme points of the
interval and considering the wedge shaped zone comprised between them, as shown
in Fig. 8.4. A point outside the range of influence will not be affected at all by the
initial data in the interval [a, b].
3 Further elaborations of these ideas can be found in most books on PDEs.
Fig. 8.4 The range of influence of the interval [a, b]: the wedge-shaped region between the outbound characteristics drawn from a and b.
8.5 The Semi-infinite String

The d’Alembert solution can also be applied to the problem of the dynamics of a
semi-infinite string fixed or otherwise supported at one end. Let the undeformed
string occupy the region 0 ≤ x < ∞ and let the initial conditions be given as in the
previous problem by
u(x, 0) = u 0 (x) 0≤x <∞ (8.24)
and
u t (x, 0) = v0 (x) 0 ≤ x < ∞. (8.25)
At the end x = 0 we must now impose a boundary condition. Consider the case in
which this end is fixed, namely,

u(0, t) = 0 t ≥ 0. (8.26)
Note that at time t = 0 the boundary condition may happen to be inconsistent with
the initial conditions at the left end of the string. We will for now explicitly assume
that this is not the case. In other words, we assume that

u_0(0) = v_0(0) = 0. (8.27)
u(x, t) = (1/2) u_0(x + c t) + (1/2) u_0(x − c t) + (1/2c) ∫_{x−c t}^{x+c t} v_0(z) dz 0 ≤ ct ≤ x. (8.28)
The solution in the remaining part of the domain (namely, 0 ≤ x ≤ ct) must also
be amenable to a d’Alembert decomposition of the form

u(x, t) = f_1(x + c t) + g_1(x − c t). (8.29)
Since we have a constant of integration at our disposal, we may set g1 (0) = g(0) and
obtain
f 1 (x) = f (x). (8.31)
The physical meaning of this result is that the backward moving wave is unaffected
by the presence of the support, which should not be surprising. To obtain the forward
moving wave, we impose the boundary condition (8.26) and obtain

f (c t) + g_1(−c t) = 0 t ≥ 0. (8.32)

In other words,

g_1(x − c t) = − f (−(x − c t)) x − c t ≤ 0. (8.33)
From the physical point of view this result means that the forward wave arriving at a
point with coordinates x0 , t0 situated in the upper domain is a reflected (and inverted)
version of the backward wave issuing at the initial time t = 0 from the point of the
string with coordinate ct0 − x0 . The time at which the reflection occurs is t0 − x0 /c,
as shown in Fig. 8.5.
Combining all the above results, we can obtain an explicit formula for the solution
in the upper domain as
Fig. 8.5 Reflection at the fixed end: the backward wave issuing from the point ct0 − x0 at t = 0 is reflected at time t0 − x0 /c and arrives, inverted, at the point (x0 , t0 ).
Fig. 8.6 A characteristic parallelogram with corners A, B, C, D; its sides run along the two characteristic directions ξ and η.
u(x, t) = (1/2) u_0(x + c t) − (1/2) u_0(−(x − c t)) + (1/2c) ∫_{−(x−c t)}^{x+c t} v_0(z) dz 0 ≤ x ≤ c t. (8.34)
The representation used in Fig. 8.5 suggests that the analytic expressions obtained by
means of d’Alembert’s solution can be also obtained geometrically by constructions
based on the characteristic lines alone. To this end, consider a parallelogram-shaped
domain enclosed by characteristics, such as the shaded domain shown in Fig. 8.6.
Denoting the corners of the parallelogram by A, B, C, D, as shown, it is not
difficult to conclude that any function u = u(x, t) of the d’Alembert form (8.13)
satisfies the condition
u A + uC = u B + u D , (8.35)
with an obvious notation. This result follows from Eq. (8.12) on observing that
ξ A = ξD ξ B = ξC η A = ηB ηC = η D . (8.36)
Every C 2 solution of the wave equation, therefore, satisfies the simple algebraic
identity (8.35). Conversely, every function that satisfies Eq. (8.35) for every charac-
teristic parallelogram within a given domain also satisfies the wave equation within
the domain. If we have a function that is not of class C 2 and yet satisfies Eq. (8.35),
we may say that it satisfies the wave equation in a weak sense or, equivalently, that
it is a generalized or weak solution of the one-dimensional wave equation.

Fig. 8.7 The characteristic parallelogram construction for the semi-infinite string: the point C(x, t) lies in the upper domain, the line x = ct separates the upper and lower domains, and the feet x − ct, −(x − ct) and x + ct are marked on the x axis.
Let us apply Eq. (8.35) to the solution of the semi-infinite string problem with
a fixed end. Consider a point C with coordinates (x, t) in the upper domain of the
problem, as represented in Fig. 8.7. We complete a characteristic parallelogram by
drawing the two characteristics from this point back to the lines x = 0 and x = c t,
thus obtaining the points D and B, respectively. The remaining point, A, is obtained
as the intersection of the characteristic line issuing from D and the line x = c t, as
shown in the figure.
By the boundary condition (8.26), we have
u D = 0. (8.37)
Equations (8.35) and (8.37) then deliver

u(x, t) = u_C = u_B − u_A . (8.38)
Using the solution (8.28), which holds on the line x = c t, we obtain

u_A = (1/2) u_0(−(x − c t)) + (1/2) u_0(0) + (1/2c) ∫_0^{−(x−c t)} v_0(z) dz (8.39)
and

u_B = (1/2) u_0(x + c t) + (1/2) u_0(0) + (1/2c) ∫_0^{x+c t} v_0(z) dz. (8.40)
Combining Eqs. (8.38), (8.39) and (8.40) we recover the solution (8.34).
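The parallelogram identity (8.35) is easy to check numerically for any function of the d’Alembert form (8.13). The corner construction below is a sketch of ours: it slides parameters r and s along the two characteristic families to close the parallelogram.

```python
import math

def d_alembert_form(f, g, c, x, t):
    """Any function of the form (8.13): f(x + ct) + g(x - ct)."""
    return f(x + c * t) + g(x - c * t)

def parallelogram_corners(x0, t0, r, s, c):
    """Corners A, B, C, D of a characteristic parallelogram.

    A -> B slides a parameter s along a characteristic x - ct = const,
    A -> D slides a parameter r along a characteristic x + ct = const,
    and C = B + D - A closes the parallelogram.
    """
    A = (x0, t0)
    B = (x0 + c * s, t0 + s)
    D = (x0 - c * r, t0 + r)
    C = (x0 + c * (s - r), t0 + s + r)
    return A, B, C, D

# Check u_A + u_C = u_B + u_D for an arbitrary pair of C^2 functions f, g.
c = 1.5
u = lambda p: d_alembert_form(math.sin, lambda z: z ** 3, c, *p)
A, B, C, D = parallelogram_corners(0.4, 0.2, r=0.3, s=0.7, c=c)
residual = (u(A) + u(C)) - (u(B) + u(D))
```

The residual vanishes (up to rounding) because ξ is constant along A–D and B–C while η is constant along A–B and D–C, exactly as in (8.36).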
There exists yet another way to obtain the solution for a semi-infinite (and, eventually,
for a finite) string. It consists of extending the initial data to the whole line in such
a way that the boundary conditions are satisfied automatically by the corresponding
solution of the infinite string. This procedure needs to be used with care and on
a case-by-case basis. Consider the case discussed in the previous section, namely,
the one with initial and boundary conditions described by Eqs. (8.24)–(8.27). We
now extend the initial conditions as odd functions over the whole domain. Such an
extension is shown pictorially in Fig. 8.8, where the initial function, u 0 (x) or v0 (x),
given in the original domain 0 ≤ x < ∞, has been augmented with a horizontally
and vertically flipped copy so as to obtain an odd function over the whole domain.
The extended functions, denoted here with a bar, coincide with the given data in
the original domain and, moreover, enjoy the property
ū 0 (x) = −ū 0 (−x) v̄0 (x) = −v̄0 (−x) − ∞ < x < ∞. (8.41)
The d’Alembert solution of the infinite-string problem with the extended initial data is

ū(x, t) = (1/2) ū_0(x + c t) + (1/2) ū_0(x − c t) + (1/2c) ∫_{x−c t}^{x+c t} v̄_0(z) dz. (8.42)
Let us evaluate this solution over the positive time axis. The result is
ū(0, t) = (1/2) ū_0(c t) + (1/2) ū_0(−c t) + (1/2c) ∫_{−c t}^{c t} v̄_0(z) dz. (8.43)
By the odd character (8.41) of the extended data, the first two terms cancel each other and the integral vanishes, so that

ū(0, t) = 0. (8.44)
We conclude that the extended solution automatically satisfies the desired boundary
condition of the semi-infinite string. It follows, therefore, that restricting the extended
solution to the original domain provides the solution5 to the original problem. From
the physical point of view, we may say that the forward and backward waves of
the extended solution interfere with each other destructively, in a way that has been
precisely calibrated to produce a zero value for all times at x = 0.
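A short computation makes this destructive interference visible. The helper names below are ours: we extend data released from rest (v_0 = 0) as an odd function, evaluate the infinite-string solution, and observe that the value at x = 0 vanishes for all times.

```python
import math

def odd_extension(func):
    """Extend data given on x >= 0 to an odd function on the whole line."""
    return lambda x: func(x) if x >= 0.0 else -func(-x)

def dalembert_no_velocity(ubar0, c, x, t):
    """The solution (8.42) for a string released from rest (v0 = 0)."""
    return 0.5 * ubar0(x + c * t) + 0.5 * ubar0(x - c * t)

# Original data on the half line, vanishing at the support as in (8.27).
u0 = lambda x: math.sin(x) * math.exp(-x)
ubar0 = odd_extension(u0)

c = 1.0
boundary_values = [dalembert_no_velocity(ubar0, c, 0.0, t) for t in (0.1, 0.5, 2.0)]
```

Restricting the same expression to a point of the upper domain (x < ct) reproduces the reflected-wave formula (8.34) with the velocity terms absent.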
It is interesting to note that the solution obtained will in general not be of class
C 2 since, even if the original initial conditions are C 2 and vanish at the origin, the
extended initial conditions are only guaranteed to be C 1 at the origin, unless the extra conditions

u_0′′(0) = v_0′′(0) = 0
happen to be satisfied. If these conditions are not satisfied, the corresponding weak
discontinuities will propagate along the characteristic emerging from the origin. This
feature is not due to the method of solution (since the solution is ultimately unique).
Remark 8.1 Although the method of extension of the initial conditions is a legit-
imate alternative to the method of characteristics, it should be clear that the latter
is more general than the former. Indeed, the method of extension of the initial data
is applicable only when the supported end is actually fixed, whereas the method of
characteristics is still viable when the support is subjected to a given motion.
8.6 The Finite String

8.6.1 Solution
The method of characteristics can be used in principle to solve for the vibrations of
a finite string of length L with appropriate boundary conditions. All one has to do
is to divide the domain of interest (which consists of a semi-infinite vertical strip
5 Clearly, we are tacitly invoking an argument of uniqueness, which we have not pursued.
u or ux known u or ux known
x
0 L
u and ut known
in space-time) into regions, as shown in Fig. 8.9. The lower triangular region at the
base of the strip is solved by the general formula (8.23). Assuming that the solution
is continuous over the whole strip, we can then use the values at the boundary of this
triangle, together with the known values at the vertical boundaries of the strip, to use
the parallelogram formula (8.35) for the next two triangular regions. The procedure is
carried out in a similar manner for increasingly higher (triangular or parallelogram)
regions. It is not difficult to write a computer code to handle this recursive algorithm
(as suggested in Exercise 8.10).
For the particular case of fixed ends, an equivalent procedure is obtained by
exploiting the extension idea, namely the ‘trick’ already used for the semi-infinite
string to reduce the problem to that of an infinite string. Consider the problem of a
finite string of length L whose ends are fixed, namely,

u(0, t) = u(L , t) = 0 t ≥ 0, (8.45)

with initial conditions

u(x, 0) = u_0(x) 0 ≤ x ≤ L (8.46)

and

u_t(x, 0) = v_0(x) 0 ≤ x ≤ L . (8.47)
To avoid any possible strong discontinuities, we confine our attention to the case in
which the initial and boundary conditions are consistent with each other in the sense
that
u 0 (0) = u 0 (L) = v0 (0) = v0 (L) = 0. (8.48)
We now extend the initial conditions (uniquely) to odd functions with period
2L. Although this idea is both intuitively appealing and, eventually, justified by the
results, it is worthwhile noting that the need for periodicity can actually be derived by
means of a rational argument.6 The odd periodic extension of a function is illustrated
in Fig. 8.10.
In addition to conditions (8.41), the extended initial data satisfy now the periodicity
conditions

ū_0(x + 2L) = ū_0(x) v̄_0(x + 2L) = v̄_0(x) − ∞ < x < ∞. (8.49)
By the same procedure as in the case of the infinite string, invoking just the odd
character of the extension, we can be assured that the fixity condition is satisfied at
the left end of the string. The periodicity takes care of the right end. Indeed
ū(L , t) = (1/2) ū_0(L + c t) + (1/2) ū_0(L − c t) + (1/2c) ∫_{L−c t}^{L+c t} v̄_0(z) dz. (8.50)
By the combined oddness and 2L-periodicity of the extended data, ū_0(L + c t) = −ū_0(L − c t) and the integral of v̄_0 over an interval centered at L vanishes, whence

ū(L , t) = 0. (8.52)
Again, from the physical point of view, we can interpret this result as the outcome of
a carefully calibrated mutually destructive interference of the backward and forward
traveling waves.
6 See [6], p. 50, a text written by one of the great Russian mathematicians of the 20th century.
We have found solutions to the wave equation with given initial and boundary condi-
tions by means of different procedures. Is the solution unique? Is it stable in the sense
that small changes in the initial and/or boundary conditions result in small changes
in the solution? These questions are of great importance for PDEs in general and
must be studied in detail for each case. A problem in PDEs is said to be well posed
if it can be shown that a solution exists, that it is unique and that it depends continu-
ously (in some sense) on the initial and boundary data. For the one-dimensional wave
equation on a finite spatial domain the answer to the question of uniqueness can be
found, somewhat unexpectedly, in the underlying physics of the problem by invoking
the principle of conservation of energy in non-dissipative systems, as explained in
Box 8.1.
We need to verify that, from the mathematical viewpoint, the total energy
W = K + U is indeed a conserved quantity. Invoking the results of Box 8.1, namely,
W = ∫_0^L (1/2) (u_t^2 + c^2 u_x^2) ρA dx, (8.53)
we calculate the rate of change of the energy on any C 2 solution u = u(x, t) of the
wave equation as
dW/dt = ∫_0^L (1/2) d/dt (u_t^2 + c^2 u_x^2) ρA dx

= ∫_0^L (u_t u_tt + c^2 u_x u_xt) ρA dx

= ∫_0^L u_t u_tt ρA dx + [c^2 u_x u_t ρA]_{x=0}^{x=L} − ∫_0^L u_t c^2 u_xx ρA dx

= ∫_0^L u_t (u_tt − c^2 u_xx) ρA dx + [c^2 u_x u_t ρA]_{x=0}^{x=L}

= [c^2 u_x u_t ρA]_{x=0}^{x=L} , (8.54)
where we have integrated by parts and assumed, for simplicity, a constant density ρ
and a constant cross section A. For the case of fixed ends, since u t = 0 at both ends,
we obtain the desired result, namely,
Box 8.1 The total kinetic energy of the vibrating bar is given by

K = ∫_0^L (1/2) ρ u_t^2 A dx,
where we are confining our attention to the case of a finite bar of length L. The total potential
energy is stored as elastic energy, just as in the case of a linear spring, given by
U = ∫_0^L (1/2) E u_x^2 A dx = ∫_0^L (1/2) ρ c^2 u_x^2 A dx.
For the application to the transverse vibrations of a taut string, the kinetic energy K is given by
the same expression as its counterpart for longitudinal waves, except that the variable u must be
interpreted as a transverse, rather than longitudinal, displacement. The potential energy U for the
vibrating string, on the other hand, must be examined more carefully. Indeed, we have assumed
that the tension T remains constant during the process of deformation.a It does, however, perform
work by virtue of the extension of the string. If ds represents the length of a deformed element,
whose original length was d x, as indicated in Fig. 8.1, we can write
ds = √(1 + u_x^2) dx = (1 + (1/2) u_x^2 + . . . ) dx.
a The energy is, of course, of elastic origin. This is a case of small deformations imposed upon
pre-existent larger ones. As an example, consider a linear spring with elongation e and internal
force F = ke, whose elastic energy is W = 0.5ke2 . Its increment due to a small de superimposed
on a background e0 is d W = ke0 de = Fde.
dW/dt = 0. (8.55)
The same result is, of course, obtained if an end is free to move but subjected to no
external force, so that u x = 0.
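The conservation statement (8.55) can be spot-checked numerically on an exact solution with fixed ends. The sketch below is ours (the standing wave, the data values and the trapezoidal quadrature are our choices): it evaluates the integral (8.53) for u = sin(πx/L) cos(πct/L) at several times.

```python
import math

def energy(L, c, rhoA, t, n=4000):
    """Trapezoidal approximation of W in (8.53) for the standing wave
    u(x, t) = sin(pi x / L) cos(pi c t / L), which has fixed ends."""
    w = math.pi / L
    def density(x):
        u_t = -c * w * math.sin(w * x) * math.sin(w * c * t)
        u_x = w * math.cos(w * x) * math.cos(w * c * t)
        return 0.5 * (u_t ** 2 + c ** 2 * u_x ** 2) * rhoA
    h = L / n
    return h * (0.5 * density(0.0) + 0.5 * density(L)
                + sum(density(k * h) for k in range(1, n)))

# W should be independent of t; here it equals rhoA * c^2 * pi^2 / (4 L).
values = [energy(L=1.0, c=2.0, rhoA=0.3, t=t) for t in (0.0, 0.17, 0.6, 1.3)]
```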
Let u 1 = u 1 (x, t) and u 2 = u 2 (x, t) be C 2 solutions of the wave equation with the
same boundary and initial conditions. Since the wave equation is linear, the differ-
ence u = u 1 − u 2 satisfies the wave equation with homogeneous (that is, vanishing)
boundary and initial conditions. Thus, the total energy at time t = 0 vanishes. Since,
as we have just demonstrated, the total energy is conserved, we have
∫_0^L (1/2) (u_t^2 + c^2 u_x^2) ρA dx = 0 (8.56)
for all times. Since the integrand is non-negative, this result is possible only if the
integrand vanishes identically. Hence, u_t = u_x = 0, so that u = u_1 − u_2 is constant
and, therefore, zero. This concludes the proof of uniqueness.
The issue of continuous dependence on initial data can be settled, for instance, by
using the technique of periodic extension and explicitly determining a norm of the
difference between the solutions corresponding to two sets of initial data. This issue
is clearly of great importance for numerical methods of solution of PDEs.
The free vibrations of a finite string are necessarily periodic in time. This fact can be
established in various ways, one of which we will pursue presently. We know that
the (unique) solution for the free vibrations of a simply supported string of length
L can be obtained by the technique of extending the initial conditions periodically
over R. The extension of the initial conditions u 0 and v0 yields, respectively, two
odd functions ū 0 (x) and v̄0 (x) with a period of 2L and the (d’Alembert) solution of
the problem is obtained as
u(x, t) = (1/2) ū_0(x + c t) + (1/2) ū_0(x − c t) + (1/2c) ∫_{x−c t}^{x+c t} v̄_0(z) dz. (8.57)
For each value of x, this function u(x, t) turns out to be periodic in time, with a
period of 2L/c, namely,
u(x, t + 2L/c) = u(x, t), (8.58)
as can be verified by direct substitution noting that, due to the assumed odd character
of the extension, the integral of the initial velocity over a whole period must vanish.
Notice that, although the displacement pattern is recovered exactly at regular
intervals equal to the period, its shape, in general, varies with time; nor does
the string become instantaneously un-deformed at any intermediate instant. We will
later show that there are special solutions that do preserve their shapes and vary only
in amplitude as time goes on. These special solutions can be interpreted physically
as standing waves. They provide a different avenue of approach to the solution of
vibration problems in Engineering and Physics.
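The period 2L/c can be verified directly. The helpers below (names ours) build the odd, 2L-periodic extension used above and sample the solution (8.57) for a string released from rest:

```python
import math

def periodic_odd_extension(func, L):
    """Extend data given on [0, L] to an odd, 2L-periodic function."""
    def ext(x):
        x = math.fmod(x, 2.0 * L)
        if x < 0.0:
            x += 2.0 * L            # reduce to the fundamental cell [0, 2L)
        return func(x) if x <= L else -func(2.0 * L - x)
    return ext

def u(ubar0, c, x, t):
    """Solution (8.57) for a string released from rest (v0 = 0)."""
    return 0.5 * ubar0(x + c * t) + 0.5 * ubar0(x - c * t)

L, c = 1.0, 3.0
ubar0 = periodic_odd_extension(lambda x: x * (L - x), L)
period = 2.0 * L / c
samples = [(0.25, 0.1), (0.5, 0.37), (0.8, 1.0)]
diffs = [abs(u(ubar0, c, x, t + period) - u(ubar0, c, x, t)) for x, t in samples]
```

The same extension also enforces the fixed ends: the value of u vanishes identically at x = 0 and x = L.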
8.7 Moving Boundaries and Growth

A violin player can change continuously the length of the string by sliding a fin-
ger over the fingerboard in a smooth motion known as glissando. This change of
length involves an increase in the material content of the string enclosed between
the two supports. Similarly, biological growth may arise as the result of additional
material being deposited at the boundaries of an organ. These two processes involve
time scales much larger than, say, the period of free vibrations of the original body.
Nevertheless, they are interesting pictures to bear in mind when dealing with prob-
lems of moving boundaries. In the case of the violin player, the effect of the moving
boundary can be perceived clearly by the ear as a variation of the fundamental pitch
of the sound. Remarkably, the method of characteristics can be applied without any
essential modification to problems involving moving boundaries.
Consider the idealized ‘violinist problem’ consisting of solving the wave equation
in the domain D (shaded in Fig. 8.11) with initial conditions
u(x, 0) = u_0(x) u_t(x, 0) = 0 0 ≤ x ≤ L .

Fig. 8.11 The domain D: the end x = 0 is fixed, while the other end starts at x = L and moves with constant speed α < c.

Fig. 8.12 The zigzag construction of characteristics leading from a point P back to the points A and B on the base segment.

The construction, illustrated in Fig. 8.12, proceeds as follows: (a) from the point P
of interest draw the downward-right characteristic until it meets one of the boundaries;
(b) there reverse the sign of the slope; (c) repeat as many times as needed to arrive at a point A in the base
segment; (d) keep track of the number of bounces n R ; (e) repeat the whole zigzag
procedure but starting with the downward-left characteristic to obtain a point B in
the base after a number n L of bounces (if any); the value of the displacement at P is
given by
u_P = (1/2) [ (−1)^{n_R} u_0(A) + (−1)^{n_L} u_0(B) ]. (8.61)
In the example of Fig. 8.12, we have n R = 1 and n L = 0.
8.8 Controlling the Slinky?

The Slinky is one of the most celebrated toys of the twentieth century. Essentially
a helical spring, its versatility is due in part to the interaction of its elasticity with
gravity and with lateral as well as axial deformations. To simplify matters, though,
let us consider it as a linear elastic bar with density ρ, modulus of elasticity E and
un-stretched length L undergoing axial displacements only in the small deformation
regime under no external forces.7 Under these conditions, it abides by the wave
equation. The question we pose is the following: Holding one end while the other
end is free and starting from a rest configuration, is it possible to obtain a prescribed
displacement history f (t) of the free end by imposing a displacement history g(t)
of the held end? From the mathematical standpoint, this is a problem in the theory
of boundary control of PDEs. Here, however, we will consider it as an independent
challenge for its own sake.
The key to solve this problem is that, for the case of the wave equation, the roles
of the space (x) and time (t) variables can be exchanged. What this means is that the
following is a perfectly well-defined Cauchy problem: Solve the wave equation
7 This simplified model is not realistic for the actual Slinky for many reasons. We are only using it
as a motivation for a well defined problem in linear one-dimensional elasticity.
with the Cauchy data u(0, t) = U_0(t) and u_x(0, t) = ε_0(t) prescribed along the line x = 0,
where U0 (t) and ε0 (t) are defined over the whole real line. Let us identify the free
end with x = 0 and the held end with x = L. Let the desired displacement history
of the free end be of the form
U_0(t) = 0 for t < 0 and U_0(t) = f (t) for t ≥ 0, (8.64)
and let ε0 (t) = 0, which corresponds to a zero strain (and stress) at an unsupported
end, as desired.
The shaded area in Fig. 8.13 indicates the range of influence R of the non-
vanishing Cauchy data. In particular, the line x = L is affected only for times
t ≥ −L/c. As expected on physical grounds, therefore, the non-vanishing history
g(t) of the displacement to be applied at the held end must start earlier than the
desired displacement at the free end. The explicit form of this history is, according
to the d’Alembert solution implemented in the exchanged time-space domain,
g(t) = u(L , t) = (1/2) U_0(t − L/c) + (1/2) U_0(t + L/c), (8.65)
which indeed vanishes identically for t < −L/c. By uniqueness, we conclude that if
we return to the original space-time picture and at time t = −L/c (or any earlier
time) we prescribe zero initial conditions of both displacement and velocity and as
boundary conditions we specify at x = 0 an identically zero strain and at x = L a
displacement equal to g(t), we should recover the same u(x, t) for the finite beam
as the one provided by the solution of the previous Cauchy problem. In particular,
along the line x = 0, we should recover the desired displacement f (t). The problem
has been thus completely solved. For more realistic applications, it is not difficult to
incorporate the effect of gravity.
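Formula (8.65) is trivial to implement. The names below are ours, and the target history is the one suggested in Exercise 8.12, with c² = EA/(ρA) = 5:

```python
import math

def control_history(U0, L, c):
    """Displacement g(t) to impose at the held end x = L so that the free
    end x = 0 follows the history U0(t); see (8.65)."""
    return lambda t: 0.5 * U0(t - L / c) + 0.5 * U0(t + L / c)

# Target: the free end starts moving at t = 0 with 0.05 (1 - cos(w t)).
w = 4.0
U0 = lambda t: 0.05 * (1.0 - math.cos(w * t)) if t >= 0.0 else 0.0
g = control_history(U0, L=1.0, c=math.sqrt(5.0))
```

Before t = −L/c both arguments of U_0 are negative, so the control g vanishes there, in agreement with the range-of-influence argument above.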
A similar procedure can be used to demonstrate the boundary control of a Timo-
shenko beam. A good way to imagine this problem is to think of holding a fishing
rod at one end and, by applying displacements and/or rotations at the held end, to try
to achieve a specified displacement at the free end. By reversing the roles of space
and time, it is possible to show [1] that, if both the displacement and the rotation
at the held end are amenable to independent prescription, then the problem has a
unique solution, just like the case of the slinky. When the displacement at the held
end vanishes and only the rotation is amenable to prescription, the problem can also
be solved, but requires the solution of a recursive functional equation.
8.9 Source Terms and Duhamel’s Principle

Source terms give rise to the inhomogeneous wave equation in the form

u_tt = c^2 u_xx + f (x, t), (8.66)

that we have already encountered in deriving the equation for the vibrating string (8.5)
subjected to an external force. It is enough to show how to solve this inhomogeneous
problem with vanishing initial conditions and (for the finite case) vanishing boundary
conditions, since any other conditions can be restored by superposition with the
solution of the source-free case with any given initial and boundary conditions.
There are various ways to deal with the inhomogeneous wave equation.8 The
appeal of Duhamel’s principle is that it can be motivated on physical, as well as math-
ematical, grounds. Mathematically, this method can be regarded as a generalization to
linear PDEs of the method of variation of constants used in solving inhomogeneous
linear ODEs. For a more intuitively physical motivation, see Box 8.2.
Assume that the solution of the homogeneous problem is available for arbitrary
initial conditions. Moreover, the initial conditions could be specified, rather than just
at t = 0, at any reference time τ > 0 with the corresponding solution valid for t ≥ τ
and vanishing for t < τ . In particular, we denote by ū(x, t; τ ) the solution to the
homogeneous problem with the initial conditions
ū(x, τ ; τ ) = 0 ū_t(x, τ ; τ ) = f (x, τ ). (8.67)

Box 8.2 Imagine the graph of the source f (x, t) sliced into thin slabs, each of duration h and starting at some time τ .
If one of these slices were acting alone starting at time τ for the duration h and then disappearing,
it would leave the system, initially at rest, with a certain velocity distribution. The effect of having
had the force acting for a short duration and then removed is, one claims, the same as not having
had the force at all but applying instead that velocity distribution as an ‘initial’ condition. A glance
at the PDE is convincing enough to conclude that the velocity in question is, to a first degree
of approximation, v(x, τ + h) = f (x, τ )h. A more detailed argument is suggested in Exercise
8.13. Since our system is linear, it abides by the principle of superposition. The combined effect
of the individual slices of force is, therefore, the sum of the effects produced by the corresponding
homogeneous problems with the equivalent initial conditions of velocity. In the limit, as h → 0,
this sum tends to the integral propounded by Duhamel’s principle.
Duhamel’s principle then asserts that the solution of the inhomogeneous problem is

u(x, t) = ∫_0^t ū(x, t; τ ) dτ . (8.68)

Notice the subtlety that the expression ū(x, t; t) implies that t denotes the reference
time and, therefore, tautologically,

ū(x, t; t) = 0. (8.69)
To verify that the expression (8.68) actually satisfies the PDE (8.66), we evaluate

u_xx(x, t) = ∫_0^t ū_xx(x, t; τ ) dτ . (8.71)
Moreover,
u_t(x, t) = ū(x, t; t) + ∫_0^t ū_t(x, t; τ ) dτ = ∫_0^t ū_t(x, t; τ ) dτ , (8.72)
u_tt(x, t) = ū_t(x, t; t) + ∫_0^t ū_tt(x, t; τ ) dτ = f (x, t) + ∫_0^t ū_tt(x, t; τ ) dτ . (8.73)
so that Eq. (8.66) is indeed satisfied. As an example, consider

u_tt − c^2 u_xx = e^x , t ≥ 0, (8.74)

with vanishing initial conditions. In this case,
ū(x, t; τ ) = (1/2c) ∫_{x−c(t−τ)}^{x+c(t−τ)} e^z dz = (e^x /c) sinh c(t − τ ) t ≥ τ . (8.77)
u(x, t) = ∫_0^t ū(x, t; τ ) dτ = ∫_0^t (e^x /c) sinh c(t − τ ) dτ

= −(e^x /c^2) [cosh c(t − τ )]_{τ=0}^{τ=t} = (e^x /c^2) (cosh c t − 1). (8.78)
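The closed form (8.78) can be confirmed by carrying out Duhamel’s superposition numerically. The quadrature choice below is ours:

```python
import math

def ubar(x, t, tau, c):
    """Homogeneous solution launched at time tau by the source e^x, as in (8.77)."""
    return math.exp(x) * math.sinh(c * (t - tau)) / c

def duhamel(x, t, c, n=4000):
    """Trapezoidal approximation of u(x, t) = integral of ubar over [0, t]."""
    h = t / n
    total = 0.5 * (ubar(x, t, 0.0, c) + ubar(x, t, t, c))
    total += sum(ubar(x, t, k * h, c) for k in range(1, n))
    return h * total

x, t, c = 0.3, 1.2, 2.0
closed_form = math.exp(x) * (math.cosh(c * t) - 1.0) / c ** 2
```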
Exercises
Exercise 8.1 Make sure that you can reproduce and reason through all the steps
leading to Eq. (8.13). In particular, discuss the interpretation of the two waves just
introduced. Which of the two functions f, g represents the advancing wave?
Exercise 8.2 Show in detail that the solution (8.43) coincides with (8.34) in the
original domain.
Exercise 8.3 (a) Use the extension technique to obtain and analyze the solution of
the semi-infinite string when the left end of the string is free to move but is forced to
preserve a zero slope at all times. Does the forward wave in the upper domain become
the reflection of a backward wave bouncing against the support? If so, does inversion
occur? (b) In terms of characteristics, place a characteristic diamond ABC D with
point A at the origin and point C directly above it. Show that, for the given boundary
condition, u C = 2u B − u A .
Exercise 8.4 A violin string of length L, supported at both ends, is released from
rest. The initial displacement is known within a maximum point-wise error ε. Show
that the same uncertainty is to be expected at any subsequent time. Hint: use the
triangle inequality.
Exercise 8.5 A piano string of length L, supported at both ends, is struck at its
straight configuration. The imposed initial velocity is known within a maximum
point-wise error ε. Show that the point-wise uncertainty in the ensuing displacement
is expected to grow linearly as time goes on. Comment on the implications of this
result on any numerical method.
Exercise 8.6 Show that Eq. (8.57) implies that at a time equal to half the period,
the displacements are the inverted spatial mirror image of the initial displacements,
regardless of the initial velocities.
Exercise 8.7 Obtain the time periodicity of the solution strictly from a geometric
analysis of the characteristics. By the same method, obtain the result of Exercise 8.6.
Exercise 8.8 Show that for the case of a free end that maintains a zero slope while
the other end is fixed, the resulting motion is periodic in time. What is the period?
What is the shape after half a period?
Exercise 8.9 A string of a grand piano is made of steel (density = 7800 kg/m3) and
has a diameter of 0.5 mm. The string is subjected to a tension of 600 N and placed
between two supports 0.3 m apart. The impact of a hammer at time t = 0 can be
approximated by assuming a zero displacement function and a velocity given by the
function

v_0(x) = 0 for 0 ≤ x < 0.18 m, v_0(x) = 4 m/s for 0.18 m ≤ x ≤ 0.20 m, and v_0(x) = 0 for 0.20 m < x < 0.30 m.
The origin of coordinates has been placed at one of the supports. Find the displace-
ment of the string after 0.15 ms at the point x = 0.10 m. At the same instant, indicate
those portions of the string (if any) experiencing a zero velocity. What is the period
of the motion?
Exercise 8.10 Write a computer code to handle the procedure described in Sect. 8.6.1
for any given boundary displacements and initial conditions. Apply it to the case of
fixed ends and verify the time periodicity of the solution.
Exercise 8.11 Write a computer code to carry out the algorithm described in
Sect. 8.7. Run the program with the initial condition u 0 (x) = sin πx/L. Check the
resulting shape as time goes on. Use various values of the ratio α/c < 1.
Exercise 8.12 A Slinky toy has been idealized (not very realistically) as an elastic
bar of length L = 1 m. This may happen when a child plastically deforms the slinky
to that lamentable state so that it can no longer be enjoyed. For small additional
deformations it still behaves elastically. The product E A of the modulus of elasticity
times the equivalent cross-sectional area is estimated at E A = 1N and the mass per
unit length is ρA = 0.2 kg/m. Placing the toy on a (frictionless) horizontal table
to exclude gravity effects, determine the motion to be applied on one end so that
the resulting displacements on the other end are given by the function 0.05(1 −
cos ωt)H (t), where H (t) is the Heaviside step function. Plot the solution for various
values of ω in the range 1 < ω < 15. What happens when ω = 0.5π √5 s−1 or any
of its odd multiples? Explain.
Exercise 8.13 (Duhamel’s principle unraveled) Each length element of the vibrating
string can ultimately be regarded as a mass attached to a spring of stiffness k somehow
representing the restoring (elastic) forces of the string. Consider this mass-spring
system at rest up to a time t = τ and then subjected to a force of intensity F(τ )
acting for a short interval of time h and then removed. Since the spring is at rest, in
the small interval of time h it will undergo a negligible displacement (of the order of
h 2 ) while the velocity, according to Newton’s second law, will undergo an increase of
F(τ )h/m, where m is the mass. Subject the system to an initial (at time τ ) velocity
F(τ )h/m and solve the homogeneous spring-mass equation m ẍ + kx = 0 for all
subsequent times. The result of this step should be
\bar{x}(t; \tau) = \sqrt{\frac{m}{k}}\,\frac{F(\tau)\,h}{m}\,\sin\left(\sqrt{\frac{k}{m}}\,(t - \tau)\right), \qquad t \ge \tau.

Regarding the force history F(t) as a succession of such elementary impulses and superposing their effects, obtain in the limit

x(t) = \lim_{h \to 0} \sum_{\tau} \bar{x}(t; \tau) = \int_0^t \sqrt{\frac{m}{k}}\,\frac{F(\tau)}{m}\,\sin\left(\sqrt{\frac{k}{m}}\,(t - \tau)\right) d\tau.
Verify that this expression satisfies the differential equation m ẍ + kx = F(t) with
zero initial conditions. Carefully distinguish between the variables t and τ and, when
differentiating with respect to t, observe that it appears both in the integrand and in
the upper limit of the integral. A simpler example is provided by the particular case
F(t) = constant.
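The last step of Exercise 8.13 can be checked numerically. A sketch for the particular case F(t) = const, where the exact solution of m\ddot{x} + kx = F with zero initial conditions is x(t) = (F/k)(1 - \cos\omega t), \omega = \sqrt{k/m}; the parameter values below are illustrative.

```python
import numpy as np

# Numerical check of the Duhamel integral for F(t) = const.
# Exact solution: x(t) = (F/k)(1 - cos(omega*t)), omega = sqrt(k/m).
m, k, F = 2.0, 50.0, 10.0
omega = np.sqrt(k / m)

def duhamel(t, n=4000):
    tau = np.linspace(0.0, t, n)
    # Integrand of the Duhamel formula; 1/(m*omega) = sqrt(m/k)/m.
    y = F * np.sin(omega * (t - tau)) / (m * omega)
    h = tau[1] - tau[0]
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))    # trapezoidal rule

t = 1.3
print(duhamel(t), (F / k) * (1.0 - np.cos(omega * t)))
```

The two printed values agree to quadrature accuracy, confirming that the integral satisfies the differential equation with zero initial conditions in this special case.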
In spite of being limited to the solution of only certain types of PDEs, the method of separation of variables often provides an avenue of approach to large classes of
problems and affords important physical insights. An example of this kind is provided
by the analysis of vibration problems in Engineering. The separation of variables in
this case results in the resolution of a hyperbolic problem into a series of elliptic
problems. The same idea will be applied in another chapter to resolve a parabolic
equation in a similar manner. One of the main by-products of the method is the
appearance of a usually discrete spectrum of natural properties acting as a natural
signature of the system. This feature is particularly manifest in diverse applications,
from musical acoustics to Quantum Mechanics.
9.1 Introduction
1 Historically,
in fact, the method was discovered by Jean Baptiste Joseph Fourier (1768–1830) in
his celebrated book Théorie Analytique de la Chaleur. Thus, its very first application lies within
the realm of parabolic equations.
© Springer International Publishing AG 2017 183
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_9
184 9 Standing Waves and Separation of Variables
PDEs. We will first present the method within the context of the wave equation and
then, after introducing the concepts of eigenvectors and eigenvalues of differential
operators, we will proceed to discuss some applications of these ideas.
It may be a good idea at this point to review briefly the theory of vibrations of struc-
tures with a finite number of degrees of freedom. From the mathematical point of
view, as we know, these structures are governed by systems of ordinary (rather than
partial) differential equations. Nevertheless, when it comes to the method of separa-
tion of variables and its consequences, there exist so many commonalities between
the discrete and the continuous case that to omit their mention would constitute a
callous disregard for our ability to draw intellectual parallels. Moreover, numerical
methods for the solution of continuous dynamical systems usually consist of some
technique of discretization, whereby inertia and stiffness properties are lumped in
a finite number of points, thus resulting in an approximation by means of a system
with a finite number of degrees of freedom.
A system of a finite number of masses interconnected by means of linear elastic
springs2 moves in space, in the absence of external forces, according to a system of
linear ODEs that can be written as
K u + M ü = 0. (9.1)
In this equation K and M denote, respectively, the stiffness and mass (square) matrices
of the system, while u is the vector of kinematic degrees of freedom (whose number n,
therefore, determines the order of the matrices involved). Superimposed dots denote
derivatives with respect to the time variable t. The stiffness matrix K is symmetric
and positive semi-definite and the mass matrix M is symmetric and positive definite.
We recall that a square matrix K is said to be positive semi-definite if for all non-
vanishing vectors m (of the appropriate order) the following inequality is satisfied
mT K m ≥ 0. (9.2)
A square matrix is positive definite, on the other hand, if the strict inequality applies
(that is >, rather than ≥) in Eq. (9.2). From a geometric point of view, one may
say, somewhat loosely, that the vector obtained by applying the matrix to any non-
zero vector never forms an obtuse angle with the original vector. In the case of
positive definiteness, the angle is always acute. In the case of the stiffness and mass
2 The linearity of the stiffness properties, which results in the linearity of the equations of motion,
is either an inherent property of the system or, alternatively, the consequence of assuming that the
displacements of the system are very small in some precise sense.
9.2 A Short Review of the Discrete Case 185
Dividing through by f , which is certainly not identically zero if we are looking for
a non-trivial solution, we obtain
K\,U = -\frac{\ddot{f}}{f}\,M\,U. \qquad (9.5)

Multiplying both sides on the left by the transpose of the shape vector (which certainly cannot be the zero vector, since we are only interested in non-trivial solutions), we obtain

\frac{\ddot{f}}{f} = -\frac{U^T K\,U}{U^T M\,U} \le 0. \qquad (9.6)
Let us investigate the case in which the equal sign holds. Notice that this is only
possible because of the positive semi-definiteness of the stiffness matrix. If this
matrix were positive definite, the ratio would certainly be negative. So, if the ratio
is actually zero, we obtain that the second derivative of f must vanish identically,
which means that the system is moving with a linearly growing amplitude. From
the physical point of view, this motion corresponds to the degrees of freedom of
the system as a rigid entity.3 In other words, for a system which has been properly
supported against rigid-body motions this situation cannot occur (and, in this case,
the stiffness matrix is positive definite). In any case, we can write
\frac{\ddot{f}}{f} = -\omega^2. \qquad (9.7)
also called the characteristic equation, requires the calculation of the roots of a
polynomial whose degree equals the number n of degrees of freedom of the system.
According to the fundamental theorem of algebra, this equation has exactly n roots
ω12 , . . . , ωn2 (some of which may be repeated). In general, some or all of these roots
may be complex, but for the case of symmetric real matrices they are guaranteed
to be all real. Moreover, the (semi-) positive definiteness of the matrices involved
guarantees that the roots are non-negative (as we have already determined). Finally,
if ωi2 = ω 2j and if Ui , U j are respectively corresponding normal modes, the following
(weighted) orthogonality condition must be satisfied:
U_i^T M\,U_j = 0. \qquad (9.10)

Normalizing the modes so that U_i^T M\,U_i = 1, we can write more compactly

U_i^T M\,U_j = \delta_{ij}, \qquad i, j = 1, \ldots, n, \qquad (9.11)
A shape-preserving motion is then of the form

u_i(t) = U_i\, f_i(t), \qquad (9.12)

where the function f_i(t) is a solution of Eq. (9.7) with ω = ω_i. There are clearly
two possibilities. Either ωi = 0, in which case we go back to the rigid-body motion
at constant speed, or ωi > 0, in which case we obtain (for some arbitrary constants
Ai , Bi )
f i = Ai cos(ωi t) + Bi sin(ωi t). (9.13)
The time dependence of the (non-rigid) normal modes is, therefore, harmonic.
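The discrete eigenvalue problem K U = \omega^2 M U described above can be solved numerically. A sketch for an illustrative 3-DOF spring-mass chain (the stiffness and mass values are hypothetical, not from the text); with a diagonal mass matrix, the generalized problem reduces to a symmetric standard one via M^{-1/2}:

```python
import numpy as np

# Natural frequencies and normal modes of a 3-DOF fixed-free spring-mass
# chain, governed by K u + M u'' = 0 (Eq. (9.1)). Shape-preserving motion
# u = U f(t) leads to the generalized eigenproblem K U = omega^2 M U.
# Stiffness and mass values below are illustrative.
ks = np.array([100.0, 80.0, 60.0])          # spring stiffnesses
K = np.array([[ks[0] + ks[1], -ks[1], 0.0],
              [-ks[1], ks[1] + ks[2], -ks[2]],
              [0.0, -ks[2], ks[2]]])
M = np.diag([2.0, 1.5, 1.0])                # lumped (diagonal) mass matrix

# Reduce to a symmetric standard problem via M^(-1/2); this preserves the
# spectrum and yields M-orthonormal modes U with U^T M U = I.
Minv_sqrt = np.diag(1.0 / np.sqrt(np.diag(M)))
w2, V = np.linalg.eigh(Minv_sqrt @ K @ Minv_sqrt)
U = Minv_sqrt @ V
omegas = np.sqrt(w2)
print(omegas)
print(np.round(U.T @ M @ U, 10))            # weighted orthonormality (9.11)
```

Since the chain is supported against rigid-body motion, all eigenvalues come out strictly positive, in agreement with the discussion of positive definiteness above.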
To complete our review of the treatment of discrete systems, we will show how,
when some external forces f are applied in correspondence with the degrees of
freedom, or when some specific initial conditions are prescribed, the solution can
be represented in terms of the normal modes of the system. These concepts will
reappear in a more general form in the case of continuous systems.
Let the discrete system, still in the absence of external forces, be subjected to the
initial conditions
Since our system is linear, the principle of superposition applies, as can be easily
verified. What we mean by this is that the sum of any two solutions of Eq. (9.1) is also
a solution. For this reason, we attempt to represent the solution u(t) corresponding
to the initial conditions (9.14) as a sum of independent shape-preserving motions of
the form (9.12). We are setting, therefore,
u(t) = \sum_{i=1}^{n} U_i\, f_i(t) = \sum_{i=1}^{n} U_i \left( A_i \cos(\omega_i t) + B_i \sin(\omega_i t) \right), \qquad (9.15)
where, for definiteness, we have assumed that there are no zero eigenvalues (i.e., all
rigid motions are prevented).5
The constants Ai , Bi will be adjusted, if possible, to satisfy the initial conditions.
Multiplying Eq. (9.15) on the left by U_k^T M and invoking the orthonormality condition (9.11) yields

U_k^T M\, u(t) = A_k \cos(\omega_k t) + B_k \sin(\omega_k t). \qquad (9.16)

Evaluating this expression and its time derivative at t = 0 gives

A_k = U_k^T M\, u_0, \qquad (9.17)

\omega_k B_k = U_k^T M\, v_0. \qquad (9.18)
When external forces f are applied in correspondence with the degrees of freedom, the equations of motion (9.1) become

K\,u + M\,\ddot{u} = f. \qquad (9.19)

We limit our analysis to the case in which these external forces are harmonic, for
example of the form
f = f0 sin(ωt). (9.20)
We look only for a particular solution u p (t) of Eq. (9.19), since the general solution
of the homogeneous equation is already available via the previous treatment. We try
a solution of the form
u_p(t) = U_p \sin(\omega t). \qquad (9.21)

Substitution into Eq. (9.19) yields

\left( K - \omega^2 M \right) U_p = f_0. \qquad (9.22)
5 The treatment for the general case is identical, except for the fact that normal modes of the form
Ai + Bi t must be included.
6 In Eq. (9.18) the summation convention does not apply.
We conclude that, except in the case in which the frequency of the external load
happens to coincide with one of the natural frequencies of the system, a particular
solution of the form (9.21) is uniquely determined. The exceptional case is called
resonance and it results in steadily increasing amplitudes of the response with the
consequent disastrous effects. A nice exercise is to express the vector of the external
forces in terms of the eigenvector basis and then determine the components of the
particular solution in the same basis one by one. The case of a general (not necessarily
harmonic) periodic force can also be handled by similar methods, but it is best treated
together with the continuous case.
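The harmonic forced response just discussed amounts to solving one linear system per forcing frequency. A sketch with illustrative (hypothetical) matrices; away from resonance the system K - \omega^2 M is invertible and the particular solution is unique:

```python
import numpy as np

# Particular solution for harmonic forcing f = f0 sin(omega t): substituting
# u_p = U_p sin(omega t) into K u + M u'' = f gives (K - omega^2 M) U_p = f0,
# solvable whenever omega is not a natural frequency. Values are illustrative.
K = np.array([[180.0, -80.0], [-80.0, 80.0]])
M = np.diag([2.0, 1.0])
f0 = np.array([0.0, 1.0])

omega = 3.0                                  # forcing frequency (non-resonant here)
Up = np.linalg.solve(K - omega**2 * M, f0)

# Check: with u_p'' = -omega^2 u_p, the amplitude equation must balance f0.
residual = K @ Up - omega**2 * (M @ Up) - f0
print(Up, np.max(np.abs(residual)))
```

As ω approaches a natural frequency of the pair (K, M), the matrix becomes singular and the computed amplitudes blow up, which is the algebraic face of resonance.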
Although we intend to deal with more general situations, in this section we will
devote our attention to the wave equation that has already occupied us in Chap. 8,
namely,
u_{tt} = c^2 u_{xx}, \qquad (9.23)

with the boundary conditions

u(0, t) = u(L, t) = 0. \qquad (9.24)

For now, we do not specify any initial conditions. As we know, Eqs. (9.23) and (9.24)
describe the small transverse deflections of a uniform string of length L, supported
at its ends, in the absence of external loading. A shape-preserving (or synchronous)
motion is a solution of this equation of the form

u(x, t) = U(x)\, f(t). \qquad (9.25)

In this equation, U(x) represents the shape that is preserved as time goes on. A
solution of this type is sometimes also called a standing wave. The method used
to find a standing wave solution is justifiably called separation of variables. To see
whether there are standing-wave solutions of the wave equation, we substitute the
assumption (9.25) in the wave equation (9.23) and obtain

U(x)\,\ddot{f}(t) = c^2\, U''(x)\, f(t). \qquad (9.26)

We are adopting the standard notation for time and space derivatives of functions of
one variable. Following the lead of the treatment of discrete systems, we now isolate
(separate) all the time-dependent functions to one side of the equation and get
\frac{\ddot{f}(t)}{f(t)} = c^2\, \frac{U''(x)}{U(x)}. \qquad (9.27)
We conclude that each of the sides of this equation must be a constant, since a function
of one variable cannot possibly be identical to a function of a different independent
variable. This constant may, in principle, be of any sign. Anticipating the final result,
we will presently assume that it is negative. Thus, we write
\frac{\ddot{f}(t)}{f(t)} = c^2\, \frac{U''(x)}{U(x)} = -\omega^2. \qquad (9.28)
The first of these equalities shows that the time dependence is harmonic,

f(t) = A \cos(\omega t) + B \sin(\omega t). \qquad (9.29)

On the other hand, Eq. (9.28) also implies that (since c is a constant) the shape itself
is harmonic, viz.,

U(x) = C \cos\frac{\omega x}{c} + D \sin\frac{\omega x}{c}. \qquad (9.30)
We still need to satisfy the boundary conditions. It follows from Eqs. (9.24) and
(9.25) that, regardless of the value of ω, the boundary condition at x = 0 implies
that
C = 0. \qquad (9.31)

The boundary condition at x = L then requires

D \sin\frac{\omega L}{c} = 0. \qquad (9.32)

Since we are looking for a non-trivial solution, we must discard the possibility D = 0.
We conclude that non-trivial solutions do exist and that they exist only for very par-
ticular values of ω, namely, those values that render the sine function zero in expres-
sion (9.32). These values (again excluding the one leading to the trivial solution) are
precisely
\omega_k = \frac{k \pi c}{L}, \qquad k = 1, 2, \ldots \qquad (9.33)
We have thus obtained the surprising result that there exists an infinite, but
discrete, spectrum of natural frequencies of the vibrating string corresponding to
shape-preserving vibrations. The corresponding shapes, or normal modes, are sinusoidal functions whose half-periods are exact integer divisors of the length of the
string. The fact that in this case the frequencies of the oscillations turned out to be
exact multiples of each other, is the physical basis of musical aesthetics (at least
until now …).
Notice that our assumption that the constant in Eq. (9.28) had to be negative is
now amply justified. Had we assumed a non-negative constant, we would have been
unable to satisfy both boundary conditions. In the case of the discrete system, there
are no boundary conditions and the selection of natural frequencies is entirely based
on the fact that the determinant (characteristic) equation, being polynomial, has a
finite number of roots. In the continuous case, the selection of frequencies is mediated
by the boundary conditions. By extension, the natural frequencies of the continuous
case are also called eigenvalues of the corresponding differential operator and the
normal modes of vibration are its eigenvectors.7 Putting back together the spatial
and temporal parts, we can express any shape-preserving solution in the form
u_k(x, t) = \sin\frac{\omega_k x}{c} \left( A_k \cos(\omega_k t) + B_k \sin(\omega_k t) \right). \qquad (9.34)
Just as in the discrete case, the normal modes of vibration satisfy an orthogonal-
ity condition. Indeed, consider two different natural frequencies ωi = ω j and the
corresponding normal modes
U_i = \sin\frac{\omega_i x}{c}, \qquad U_j = \sin\frac{\omega_j x}{c}. \qquad (9.35)
Using the trigonometric identity

\sin\alpha \sin\beta = \frac{1}{2}\left[\cos(\alpha - \beta) - \cos(\alpha + \beta)\right], \qquad (9.36)

it is straightforward to verify that

\int_0^L U_i\, U_j\, dx = \int_0^L \sin\frac{i\pi x}{L}\, \sin\frac{j\pi x}{L}\, dx = \frac{L}{2}\,\delta_{ij}. \qquad (9.37)
The integration of the product of two functions over the length of the domain plays,
therefore, the role of a dot product in the space of functions, as we shall see later
again.
7 We content ourselves with pointing out these similarities. In fact these similarities run even deeper,
particularly when we regard the underlying differential equations as linear operators on an infinite-
dimensional vector space of functions, just as a matrix is a linear operator on a finite-dimensional
space of vectors.
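The orthogonality relation (9.37) and its interpretation as a dot product of functions are easy to check numerically; a sketch with an illustrative length L:

```python
import numpy as np

# Numerical check of (9.37): the "dot product" of two sine modes of the
# uniform string over [0, L] equals (L/2) * delta_ij.
L = 2.0
x = np.linspace(0.0, L, 4001)

def mode(i):
    return np.sin(i * np.pi * x / L)

def inner(f, g):
    # trapezoidal rule for the integral of f*g over [0, L]
    h = x[1] - x[0]
    p = f * g
    return h * (np.sum(p) - 0.5 * (p[0] + p[-1]))

print(inner(mode(2), mode(5)))   # distinct modes: essentially zero
print(inner(mode(3), mode(3)))   # same mode: essentially L/2
```

The same `inner` function plays, for the discretized string, exactly the role that the dot product plays for vectors in the discrete case.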
F(x) = \sum_{n=1}^{\infty} D_n\, U_n(x), \qquad (9.39)
where the numbers Dn are the components of the representation. This bold statement
must be justified in precise mathematical terms. In this book, however, we will adopt
it as an act of faith.
For the case of the wave equation that we have just considered, the normal modes
are sine functions and the expansion (9.39) reduces to a particular case of the Fourier
series, which we will encounter in Chap. 10. The fully-fledged Fourier series includes
also terms involving the cosine function and it can be used to represent ‘any’ periodic
function (or ‘any’ function defined over a finite interval that has been extended to
a periodic function over the whole line). In more general cases, such as that of a
non-uniform string that we will study next, the normal modes (or eigenfunctions) are
no longer sines or cosines, but the validity of the representation (9.39) in terms of
normal modes is preserved. These topics are known as Sturm–Liouville eigenvalue
problems and are discussed in mathematical textbooks.8 The more general proofs of
convergence pertain to the field of functional analysis.
From the physical point of view, we may say that nature has endowed the vibrat-
ing string (and, in fact, all elastic systems) with a preferred set of shape-preserving
vibrations, each of which oscillates with a definite frequency. All these natural
frequencies constitute the characteristic spectrum of the system. The remarkable
fact is that arbitrary periodic vibrations of the system can be expressed in terms of
these natural modes of vibration. Thus, any periodic vibration of the system can be
analyzed in terms of its spectral components. An intuitive grasp of these facts was
perhaps somehow obtained by Pythagoras over 25 centuries ago. It is worth pointing
out that the contribution of Louis de Broglie to the understanding of the quantum
mechanical model of the atom runs along similar lines.
Let F(x) be a function (vanishing at the ends of the interval) for which we want
to find the representation (9.39). In other words, we are interested in obtaining the
value of the coefficient Dk for each and every k. Multiplying both sides of (9.39) by
Uk (x), integrating both sides of the resulting equation over the interval [0, L] and
invoking the orthogonality conditions (9.37), we obtain the following surprisingly
simple result
D_k = \frac{2}{L} \int_0^L F(x)\, U_k(x)\, dx. \qquad (9.40)
The orthogonality conditions played a major role in the de-coupling of the formulas
for the individual coefficients, just as they do in the discrete case (the components of a
vector in an orthonormal basis are simply the dot products by the base vectors). There
is here a subtlety, however, that we must mention. It has to do with the fact that we
have assumed that the series on the right-hand side of (9.39) can be integrated term by
term. Moreover, we have not specified in what precise sense the series converges to
the given function, if it does indeed converge. These and other similar issues (which
are not very difficult to understand) are outside the scope of these notes.
Let us assume that the vibrating string is subjected at time t = 0 to a displacement
u(x, 0) = F(x) and a velocity u t (x, 0) = G(x), both vanishing at the ends of the
string (which remain fixed during the interval of time under consideration). These
two functions can be expanded in terms of the normal modes as
F(x) = \sum_{n=1}^{\infty} D_n\, U_n(x), \qquad (9.41)

G(x) = \sum_{n=1}^{\infty} E_n\, U_n(x). \qquad (9.42)
We seek the solution as a superposition

u(x, t) = \sum_{n=1}^{\infty} U_n(x) \left( A_n \cos(\omega_n t) + B_n \sin(\omega_n t) \right). \qquad (9.43)

Put differently, we expand the solution in terms of the normal modes, each oscillating
at its characteristic frequency. Our task is to determine all the constants An , Bn . At
the initial time, Eq. (9.43) yields
u(x, 0) = \sum_{n=1}^{\infty} U_n(x)\, A_n \qquad (9.44)
and
u_t(x, 0) = \sum_{n=1}^{\infty} U_n(x)\, \omega_n B_n. \qquad (9.45)
In obtaining this last equation we have assumed that the series can be differentiated
term by term. Comparing these results with those of Eqs. (9.41) and (9.42), respectively, and recalling that the coefficients of the expansion in terms of normal modes
are unique, we conclude that
A_n = D_n \qquad (9.46)

and

B_n = \frac{E_n}{\omega_n}. \qquad (9.47)
The method just used is based on normal-mode superposition. It relies on the fact
that for a homogeneous linear equation with homogeneous boundary conditions, the
sum of two solutions is again a solution, and so is the product of any solution by a
constant. In engineering applications (for example in structural engineering) this fact
constitutes the so-called principle of superposition. It is not a principle of nature, but
rather a property of linear operators.
If we consider a non-homogeneous situation (either because there is an external
production, such as a force acting on a structure, or because the boundary conditions
are not homogeneous, such as a prescribed displacement or slope at the end of a
beam), it is easy to show (by direct substitution) that the difference between any
two solutions of the non-homogeneous case is necessarily a solution of the homoge-
neous equation with homogeneous boundary conditions. It follows that the solution
of a non-homogeneous problem, with prescribed initial conditions, can be obtained
by adding any one particular solution of the non-homogeneous problem (regard-
less of initial conditions) to the general solution of the homogeneous problem. The
adjustable constants of the homogeneous solution can then be determined so as to
satisfy the initial conditions for the sum thus obtained.
To illustrate these ideas, let us consider first the case in which the string has been loaded by means of a periodic load of the form

q(x, t) = Q(x) \sin(\omega t). \qquad (9.48)

The boundary conditions are still the vanishing of the displacements at the ends of the string. Recall that the PDE governing the situation is given by Eq. (8.5), viz.,

u_{tt} - c^2 u_{xx} = q. \qquad (9.49)

Seeking a particular solution of the form

u_p(x, t) = U_p(x) \sin(\omega t), \qquad (9.50)

and substituting into Eq. (9.49), we obtain

-\omega^2 U_p - c^2 U_p'' = Q. \qquad (9.51)
This equation should be compared with its discrete counterpart, Eq. (9.22). We have
obtained a non-homogeneous ODE, which can be solved by any means. From the
physical point of view, however, it is interesting to find a particular solution of this
ODE by making use of a normal mode superposition. We express the solution and
the loading, respectively, as
U_p(x) = \sum_{n=1}^{\infty} D_{pn}\, U_n, \qquad (9.52)
Q(x) = \sum_{n=1}^{\infty} H_n\, U_n. \qquad (9.53)
We now assume that the second derivative of Eq. (9.52) can be carried out term
by term, and use Eq. (9.38) for the normal modes to obtain
U_p''(x) = \sum_{n=1}^{\infty} D_{pn}\, U_n'' = -\sum_{n=1}^{\infty} \frac{\omega_n^2}{c^2}\, D_{pn}\, U_n. \qquad (9.54)
Substituting these expansions into Eq. (9.51) and matching the coefficients of each normal mode yields

D_{pn} = \frac{H_n}{\omega_n^2 - \omega^2}. \qquad (9.55)
Clearly, this solution is applicable only if the frequency of the applied harmonic
force does not happen to coincide with any of the natural frequencies of the system
(in which case we have the phenomenon of resonance, with the amplitude of the
response increasing steadily in time).
If the forcing function is not harmonic (or even periodic), we can still make use
of the normal mode decomposition to find a particular solution of (9.49). We assume
now that the coefficients of the expansions (9.52) and (9.53) are functions of time,
namely,
u_p(x, t) = \sum_{n=1}^{\infty} \hat{D}_n(t)\, U_n(x), \qquad (9.56)

q(x, t) = \sum_{n=1}^{\infty} \hat{H}_n(t)\, U_n(x). \qquad (9.57)
Note that the coefficients in Eq. (9.57) can be calculated, instant by instant, by the
formula
\hat{H}_k(t) = \frac{2}{L} \int_0^L q(x, t)\, U_k(x)\, dx. \qquad (9.58)
Similarly, the coefficients of the yet to be determined particular solution are given by
\hat{D}_k(t) = \frac{2}{L} \int_0^L u_p(x, t)\, U_k(x)\, dx. \qquad (9.59)
9.4 Solving Initial-Boundary Value Problems by Separation of Variables 197
Let us multiply Eq. (9.49) through by Uk (x) and integrate over the string to get
\int_0^L \left( \frac{\partial^2 u_p}{\partial t^2} - c^2\, \frac{\partial^2 u_p}{\partial x^2} \right) U_k(x)\, dx = \int_0^L q\, U_k(x)\, dx. \qquad (9.60)
Integrating by parts, and taking into consideration that the boundary terms vanish, we use Eq. (9.38) to write

c^2 \int_0^L \frac{\partial^2 u_p}{\partial x^2}\, U_k(x)\, dx = c^2 \int_0^L u_p\, U_k''(x)\, dx = -\omega_k^2 \int_0^L u_p\, U_k\, dx. \qquad (9.61)
Moreover,

\frac{\partial^2 u_p}{\partial t^2} = \sum_{n=1}^{\infty} \ddot{\hat{D}}_n(t)\, U_n(x). \qquad (9.62)

Combining Eqs. (9.60)–(9.62) and invoking the orthogonality conditions, we obtain

\ddot{\hat{D}}_k(t) + \omega_k^2\, \hat{D}_k(t) = \hat{H}_k(t). \qquad (9.63)
This is an ODE for the determination of the time-dependent coefficients of the par-
ticular solution of the PDE. Clearly, we only need a particular solution of this ODE.
Notice that in the case in which the time dependence of \hat{H}_k happens to be harmonic, we recover the solution given by (9.55). Otherwise, we can use, for example, the Duhamel integral formula, viz.,

\hat{D}_k(t) = \frac{1}{\omega_k} \int_0^t \hat{H}_k(\tau)\, \sin(\omega_k (t - \tau))\, d\tau. \qquad (9.64)
0
In the analysis of the vibration of general linearly elastic systems the method of
separation of variables leads to the separation of the (linear) problem into a trivial
temporal part and a spatial part that is governed by a PDE (or system thereof) of the
elliptic type. We start with the special case of a single spatial dimension.
Consider the case of a string, such as that of a musical instrument, with a smoothly
varying cross section with area A = A(x). In accordance with Eq. (8.5), the governing
equation is
u_{tt} - c(x)^2\, u_{xx} = q, \qquad (9.66)

where

c(x) = \sqrt{\frac{T}{\rho A(x)}}. \qquad (9.67)
In other words, the mass per unit length of the string is a function of position along
the string. We ask the same question as in the case of the uniform string: Are there
shape preserving solutions of the homogeneous equation? If so, what is the precise
shape of these solutions and how does their amplitude vary with time? We proceed
in exactly the same manner as before, namely, we substitute the shape-preservation
assumption

u(x, t) = U(x)\, f(t) \qquad (9.68)

into the homogeneous equation and separate variables to obtain

\frac{\ddot{f}(t)}{f(t)} = c(x)^2\, \frac{U''(x)}{U(x)}. \qquad (9.69)
Just as in the case of Eq. (9.27), we reason that both sides of this equation must
necessarily be constant. Assuming this constant to be negative (for the same reasons
as before, which are justified a posteriori) we obtain that the time variation of the
normal modes is again necessarily harmonic, that is,

f(t) = A \cos(\omega t) + B \sin(\omega t). \qquad (9.70)

For the shape of the normal modes, however, we obtain the ODE
U''(x) = -\frac{\omega^2}{c(x)^2}\, U(x). \qquad (9.71)
If we compare this equation with the discrete counterpart (9.8), we see that the
mass matrix corresponds to the variable mass density per unit length. We need now
to ascertain the existence of non-trivial solutions of Eq. (9.71) satisfying the given
homogeneous boundary conditions. Without entering into the subtleties of the Sturm-
Liouville theory, we can convince ourselves that such solutions exist by the following
intuitive argument.
The left-hand side of Eq. (9.71) is (roughly) a measure of the curvature of the
solution. In this sense, Eq. (9.71) tells us that the curvature of a solution (a normal
mode) is proportional to the function itself, and that the constant of proportionality
must be, point by point, negative. What this means is that if, starting from the left end
we assume the solution to move upwards, the curvature will be negative, and thus
it will bring us downwards, and vice-versa. So, let us assume that we choose some
candidate value for the natural frequency. We may be lucky and hit the other end of
the string. If we don’t, however, we can change the value of ω gradually until we do
hit the other end. That we will always be able to do so is guaranteed by the fact that
the x-axis attracts the solution towards it, according to our curvature interpretation.
Moreover, once we find a value of the frequency which satisfies the condition of
hitting the other boundary, we can increase it gradually until we hit the far end again.
Every time we do this, we add another half wave to the shape of the solution. This
hit-and-miss argument can, of course, be formalized into a proof and, perhaps more
importantly, into a numerical algorithm to find the normal modes.9
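The hit-and-miss argument translates directly into a shooting method: integrate Eq. (9.71) from the left end with U(0) = 0 and an arbitrary slope, and adjust ω until the solution hits U(L) = 0. A sketch, with an illustrative linear taper c(x) (not from the text); for this particular taper Eq. (9.71) is of Euler type and the exact fundamental frequency works out to \omega_1 = \tfrac{1}{4}\sqrt{4\nu^2 + 1} with \nu = \pi/\ln 1.5, roughly 3.882, which the iteration should reproduce:

```python
import numpy as np

# Shooting method for the non-uniform string: integrate U'' = -(omega/c(x))^2 U
# from U(0)=0, U'(0)=1 and bisect on omega until U(L) = 0.
# The taper c(x) is an illustrative choice.
L = 1.0
def c(x):
    return 1.0 + 0.5 * x              # wave speed varying along the string

def endpoint(omega, n=4000):
    h = L / n
    U, V = 0.0, 1.0                   # U(0) = 0, slope normalized to 1
    for i in range(n):
        xm = (i + 0.5) * h            # midpoint (RK2) integration step
        Um = U + 0.5 * h * V
        Vm = V - 0.5 * h * (omega / c(i * h)) ** 2 * U
        U += h * Vm
        V -= h * (omega / c(xm)) ** 2 * Um
    return U

# Bracket the first eigenvalue by a sign change of U(L), then bisect.
lo, hi = 1.0, 6.0
assert endpoint(lo) * endpoint(hi) < 0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if endpoint(lo) * endpoint(mid) <= 0:
        hi = mid
    else:
        lo = mid
omega1 = 0.5 * (lo + hi)
print(omega1)                         # fundamental frequency of the tapered string
```

Raising the upper end of the bracket past each successive sign change of U(L) yields the higher eigenvalues, each adding one more half wave to the mode shape, exactly as the curvature argument predicts.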
In conclusion, although for a general non-uniform string we no longer have sinu-
soidal normal modes, the normal modes have a wave-like appearance. The natural
frequencies, moreover, will no longer be integer multiples of each other. As a result,
our ear will perceive the various frequencies as dissonant with respect to each other.
That is why guitar and violin strings have a constant cross section. In the case of the
drum, however, because of the two-dimensionality of the membrane, the frequen-
cies are not integer multiples of each other even in the case of constant thickness.
Hence follows the typical dissonant sound of drums and cymbals. Be that as it may,
we will now prove that the normal modes (eigenfunctions) satisfy a generalized (or
weighted) orthogonality condition, just as in the case of the discrete system.
Let Um , Un be two normal modes corresponding, respectively, to two different
natural frequencies ωm , ωn . We have, therefore,
U_m''(x) = -\frac{\omega_m^2}{c(x)^2}\, U_m(x), \qquad (9.72)

U_n''(x) = -\frac{\omega_n^2}{c(x)^2}\, U_n(x). \qquad (9.73)
Multiplying Eq. (9.72) by Un and Eq. (9.73) by Um , subtracting the results and inte-
grating over the length of the string yields
\int_0^L \left( U_n\, U_m'' - U_m\, U_n'' \right) dx = -(\omega_m^2 - \omega_n^2) \int_0^L \frac{U_m\, U_n}{c(x)^2}\, dx. \qquad (9.74)
Integrating by parts the left-hand side of this equation, however, and implementing the
boundary conditions, we conclude that it must vanish. Since the natural frequencies
were assumed to be different, we conclude that
\int_0^L \frac{U_m\, U_n}{c(x)^2}\, dx = 0. \qquad (9.75)
This is the desired generalized orthogonality condition. Since the normal modes are
determined up to a multiplicative constant, we may choose to normalize them by
imposing, without any loss of generality, the extra condition
\int_0^L \frac{U_m\, U_n}{c(x)^2}\, dx = \delta_{mn}. \qquad (9.76)
Given a function F(x) that vanishes at the ends of the string, we can express it in
terms of the normal modes as
F(x) = \sum_{n=1}^{\infty} D_n\, U_n(x). \qquad (9.77)
From here on, the treatment of the non-uniform string is identical in all respects to
that of the uniform string, provided that one takes into consideration the new normal
modes and their generalized orthogonality condition.10
10 In particular, the Duhamel integral will have to be expressed differently as compared to the uniform
case.
9.5 Shape-Preserving Motions of More General Continuous Systems 201
(E I\, u_{xx})_{xx} = -q - \rho A\, u_{tt}, \qquad (9.79)

in which u = u(x, t) denotes the (small) transverse displacement. The free vibrations of a beam of constant properties are, therefore, governed by the homogeneous equation
c^4\, u_{xxxx} + u_{tt} = 0, \qquad (9.80)

with

c^4 = \frac{E I}{\rho A}. \qquad (9.81)
Setting
u(x, t) = U (x) f (t), (9.82)
yields

-c^4\, \frac{U''''}{U} = \frac{\ddot{f}}{f} = -\omega^2. \qquad (9.83)

The spatial part, therefore, satisfies the ODE

U'''' - \gamma^4\, U = 0, \qquad (9.84)

where

\gamma = \frac{\sqrt{\omega}}{c}, \qquad (9.85)
whose general solution can be expressed as

U(x) = A \sin(\gamma x) + B \cos(\gamma x) + C \sinh(\gamma x) + D \cosh(\gamma x). \qquad (9.86)
We conclude that the natural frequencies are obtained from the roots of the transcendental equation
\cos(\gamma L) = -\frac{1}{\cosh(\gamma L)}.
Written in this way, since the right-hand side decays extremely fast, the natural frequencies
beyond the first and the second are obtained with very small error from cos γ L = 0. The first and
second roots are, respectively, γ1 L = 0.597π and γ2 L = 1.494π. The lowest natural frequency
is, therefore,
0.597π 2 E I
ω1 = γ1 c =
2 2
.
L ρA
The dimensions can be easily calibrated to produce the usual orchestra pitch A440.
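The roots quoted above are easy to confirm by solving the transcendental frequency equation numerically; a sketch using simple bisection:

```python
import math

# Roots of the frequency equation cos(gL) = -1/cosh(gL), rewritten as
# g(x) = cos(x) + 1/cosh(x) = 0. The first two roots should match the
# values gamma_1 L = 0.597 pi and gamma_2 L = 1.494 pi quoted in the text.
def g(x):
    return math.cos(x) + 1.0 / math.cosh(x)

def bisect(a, b, tol=1e-12):
    fa = g(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        if fa * g(m) <= 0:
            b = m
        else:
            a, fa = m, g(m)
    return 0.5 * (a + b)

root1 = bisect(1.0, 3.0)
root2 = bisect(3.0, 6.0)
print(root1 / math.pi, root2 / math.pi)   # approx 0.597 and 1.494
```

Because 1/cosh decays so quickly, the higher roots are indeed indistinguishable in practice from those of cos(γL) = 0, as noted in the text.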
Let U_i(x) and U_j(x) be normal modes corresponding to two distinct eigenvalues γ_i^4 and γ_j^4, respectively, for some specific support conditions. We recall that these
eigenfunctions are obtained, up to a multiplicative constant, from nontrivial solutions
of the homogeneous system of equations for the constants of integration. Exploiting
Eq. (9.84) and integrating by parts, we obtain
(\gamma_j^4 - \gamma_i^4) \int_0^L U_i\, U_j\, dx = \int_0^L \left( U_i\, U_j'''' - U_j\, U_i'''' \right) dx

= \int_0^L \left( -U_i'\, U_j''' + U_j'\, U_i''' \right) dx + \left[ U_i\, U_j''' - U_j\, U_i''' \right]_0^L

= \left[ U_i\, U_j''' - U_j\, U_i''' - U_i'\, U_j'' + U_j'\, U_i'' \right]_0^L. \qquad (9.87)
The last expression vanishes by virtue of the boundary conditions, whence the desired
orthogonality.
An important feature of eigenfunctions beyond their orthogonality is their
completeness in the sense that every smooth enough function can be expressed as
an infinite linear combination of these basic functions.11 In particular, this feature
is helpful in solving non-homogeneous problems by expressing the forcing load in
terms of the eigenfunctions, as was done for the vibrating string.
u_{xx} + u_{yy} = \frac{1}{c^2}\, u_{tt}, \qquad (9.90)

or, in terms of the Laplacian operator,

\nabla^2 u = \frac{1}{c^2}\, u_{tt}. \qquad (9.91)

Substituting the shape-preserving ansatz

u(x, y, t) = U(x, y)\, f(t), \qquad (9.92)

and separating variables, we obtain

c^2\, \frac{\nabla^2 U}{U} = \frac{\ddot{f}}{f} = -\omega^2. \qquad (9.93)
As far as the shape itself is concerned, we are left with the elliptic PDE
\nabla^2 U + \frac{\omega^2}{c^2}\, U = 0. \qquad (9.95)
In other words, the natural frequencies ω, as expected, will be determined by solving
the eigenvalue problem for the linear differential operator ∇ 2 . The selection of the
frequency spectrum will depend on the shape and size of the membrane domain A
and on the type of boundary conditions imposed, just as in the previous problems.
Since the Laplacian operator is elliptic, we have yet to consider this kind of problem.
We can, however, advance a few useful considerations. We start by observing that
the divergence theorem (1.18) can be restated, with the appropriate interpretation,
in a two-dimensional (rather than three-dimensional) Cartesian domain A, in which
case the flux is evaluated over the boundary curve ∂A. Applying the theorem to a
vector field given by the product of a scalar field φ times the gradient of another scalar
field ψ, we obtain
\int_A \nabla \cdot (\phi\, \nabla\psi)\, dA = \oint_{\partial A} \phi\, \frac{d\psi}{dn}\, ds, \qquad (9.96)
where n denotes the normal to the boundary curve. On the other hand, it is easy to
check that
\nabla \cdot (\phi\, \nabla\psi) = \nabla\phi \cdot \nabla\psi + \phi\, \nabla^2 \psi. \qquad (9.97)

Interchanging the roles of φ and ψ and subtracting, Eqs. (9.96) and (9.97) deliver Green's second identity

\int_A \left( \phi\, \nabla^2\psi - \psi\, \nabla^2\phi \right) dA = \oint_{\partial A} \left( \phi\, \frac{d\psi}{dn} - \psi\, \frac{d\phi}{dn} \right) ds. \qquad (9.98)

This elegant result has important consequences. Let U_i(x, y) and U_j(x, y) be eigenfunctions (that is, natural modes of vibration) corresponding to two distinct natural frequencies ω_i, ω_j, respectively. Then, emulating Eq. (9.74) or (9.87), we can write
1 2
(ω − ωi2 ) Ui U j dA = − (Ui ∇ 2 U j − U j ∇ 2 Ui )dA
c2 j
A A
dU j dUi
=− Ui − Uj ds. (9.99)
dn dn
∂A
What this result entails is that the orthogonality condition between eigenfunctions
will be satisfied for at least two types of boundary conditions. The first type corre-
sponds to a simple support, namely, to the vanishing of the transverse displacement.
This kind of boundary condition (U = 0) for an elliptic operator is known as the
Dirichlet type. The other kind of boundary condition (dU/dn = 0), known as the
Neumann type, corresponds to the vanishing of the slope of the membrane.12 In more
general problems of Elasticity, the Neumann boundary condition corresponds to the
specification of a surface traction. At any rate, we conclude that the eigenfunctions
corresponding to different eigenvalues are orthogonal. It can also be shown that they
form a complete set and that any sufficiently smooth function defined over the domain
A can be expanded in terms of the set formed by all the eigenfunctions.
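Although the explicit modes are obtained only below for the rectangular case, the orthogonality statement can already be sampled numerically. A minimal sketch, assuming the product-of-sines modes of a simply supported rectangular membrane with illustrative sides a = 1, b = 1.5:

```python
import math

# Numerical sketch of orthogonality, assuming the modes
# U_mn(x, y) = sin(m*pi*x/a) * sin(n*pi*y/b) of a simply supported
# rectangular membrane (illustrative sides a = 1, b = 1.5).
a, b = 1.0, 1.5

def U(m, n, x, y):
    return math.sin(m * math.pi * x / a) * math.sin(n * math.pi * y / b)

def inner(m1, n1, m2, n2, N=60):
    # Midpoint-rule approximation of the integral of U_{m1 n1} U_{m2 n2} over A
    hx, hy = a / N, b / N
    s = 0.0
    for i in range(N):
        for j in range(N):
            x, y = (i + 0.5) * hx, (j + 0.5) * hy
            s += U(m1, n1, x, y) * U(m2, n2, x, y)
    return s * hx * hy

assert abs(inner(1, 1, 2, 1)) < 1e-9              # distinct modes: orthogonal
assert abs(inner(1, 1, 1, 1) - a * b / 4) < 1e-9  # normalization: (a/2)*(b/2)
```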
Although we will not pursue at this stage the actual solution of the elliptic equation
in the general case, it turns out that a further application of the method of separation
of variables will allow us to solve the problem for the case of a rectangular membrane.
Indeed, let the domain A be the Cartesian product [0, a] × [0, b], where a and b are
the lengths of the sides. Consider the case of a simply supported membrane along
the whole perimeter. We try a variable-separated solution of Eq. (9.95) in the form

$$U(x, y) = X(x)\,Y(y), \qquad (9.100)$$

which yields
$$\frac{1}{X}\,\frac{d^2X}{dx^2} + \frac{1}{Y}\,\frac{d^2Y}{dy^2} = -\frac{\omega^2}{c^2}. \qquad (9.101)$$
This identity is only possible if each of the summands is constant, namely, after
solving and imposing the boundary conditions at x = 0 and y = 0, if
$$X(x) = A\sin\lambda x, \qquad Y(y) = B\sin\mu y, \qquad \text{with } \lambda^2 + \mu^2 = \frac{\omega^2}{c^2}. \qquad (9.102)$$
Imposing the boundary conditions at x = a and y = b and discarding the trivial
solution, we obtain the condition
$$\pi^2\left(\frac{m^2}{a^2} + \frac{n^2}{b^2}\right) = \frac{\omega^2}{c^2}, \qquad m, n = 1, 2, 3, \ldots \qquad (9.103)$$
12 As a matter of historical interest, the Neumann boundary condition is named after the German
mathematician Carl Neumann (1832–1925), not to be confused with the Hungarian-American math-
ematician John von Neumann (1903–1957).
From the musical point of view, we can see that the successive natural frequencies
do not appear in ratios of small integers, which accounts for the blunt sound of drums
in an orchestra. Moreover, for a ratio b/a equal to a rational number, we have multiple
eigenvalues. For instance, in a square membrane, we have $\omega_{mn} = \omega_{nm}$ for any pair with
$m \neq n$. In that case, any linear combination of $u_{mn}$ and $u_{nm}$ is again an eigenfunction
corresponding to the same eigenvalue. The idea of an ocular tonometer based on a
careful measurement of the normal modes of the eye and their sensitivity to small
variations of the intra-ocular pressure is adversely affected by these considerations,
since it is difficult to ascertain which normal mode of vibration has been excited by
an external source.
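The spectrum (9.103) is easy to tabulate. A minimal sketch (illustrative values a = 1, b = 1.5, c = 1) confirming both the non-integer frequency ratios and the degeneracy of the square membrane discussed above:

```python
import math

def omega(m, n, a, b, c=1.0):
    # Natural frequency from Eq. (9.103): omega = c*pi*sqrt(m^2/a^2 + n^2/b^2)
    return c * math.pi * math.sqrt((m / a) ** 2 + (n / b) ** 2)

# Rectangular membrane (illustrative sides a = 1, b = 1.5): successive
# frequencies relative to the fundamental are not ratios of small integers.
a, b = 1.0, 1.5
ratios = [round(omega(m, n, a, b) / omega(1, 1, a, b), 4)
          for (m, n) in [(1, 1), (1, 2), (2, 1), (2, 2)]]

# Square membrane: omega_mn = omega_nm even though u_mn != u_nm for m != n,
# so these eigenvalues are multiple (degenerate).
assert math.isclose(omega(1, 2, 1.0, 1.0), omega(2, 1, 1.0, 1.0))
```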
Exercises
Exercise 9.1 Write the equations of motion of a system with 2 or 3 masses moving
in the plane (or in the line) and interconnected by springs. Are there any non-zero
vectors (that is, displacements) for which the equal sign in Eq. (9.2) applies? Why,
or why not? What would the violation of the inequality mean for the stiffness and/or
for the mass matrix in physical terms?
Exercise 9.2 For a given vector f = f0 sin(ωt) in Eq. (9.19), express the vector f0 in
terms of the eigenvector basis and then determine the components of the particular
solution U p in the same basis. What happens in this method when ω happens to
coincide with one of the natural frequencies?
Exercise 9.3 Compare the solution given by Eqs. (9.43)–(9.47) with that given (for
the same problem, with a slightly different notation) by the extension technique in
Chap. 8. If you prefer, consider just the case in which the initial velocity is identically
zero. Are these solutions the same, as they should be?
Exercise 9.4 Verify, by direct substitution, that the integral (9.64) satisfies the ODE
(9.63).
Exercise 9.5 A guitarist plucks a string of length L at exactly its midpoint. Let
W denote the magnitude of the imposed deflection under the finger just before it
is released from rest. Assuming the two halves of the string to be straight at that
instant, determine the coefficients of its expression in terms of the eigenfunctions
(i.e., perform a Fourier expansion). Plot the approximate shape obtained by using
just a few terms of the expansion. Using the method of separation of variables,
find the solution for the motion of the string. Plot it for various times within a
period, using just a few terms of the approximation. Solve the same problem by the
method of characteristics and compare the results for the same times. Comment on
the comparison. Is any of the two solutions exact? If so, which one?
Exercise 9.6 Consider a vibrating string of constant density but with a linearly
varying cross section

$$A(x) = A_0\left(1 + 4\,\frac{x}{L}\right),$$
where L is the string length and A0 is the area at the left end. Implement numerically
the shooting method described in Sect. 9.5.1 to find the first few natural frequencies
and the corresponding normal modes. You may use specific numerical values or
introduce a non-dimensional coordinate ξ = x/L and calculate the eigenvalues
relative to those of a string with constant cross section A0 . The shooting routine can
easily be handled with Mathematica by solving the ODE with zero displacement and,
say, unit slope at the left end. Varying the coefficient containing the eigenvalue until
the other end is hit and counting the number of oscillations, the solution is obtained
in just a few runs.
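The shooting routine can equally well be sketched in plain Python. Assuming constant tension T and density ρ, the standing-wave ODE in the non-dimensional coordinate ξ = x/L reads U″ + κ(1 + 4ξ)U = 0, with eigenvalue κ = ρA₀L²ω²/T; this reduction, and the numerical values, are assumptions of the sketch, not part of the exercise statement:

```python
def rhs(xi, u, v, kappa):
    # Standing-wave ODE U'' = -kappa*(1 + 4*xi)*U as a first-order system
    return v, -kappa * (1.0 + 4.0 * xi) * u

def shoot(kappa, n=2000):
    # RK4 march from xi = 0 with U = 0 and unit slope; return U at xi = 1
    h = 1.0 / n
    xi, u, v = 0.0, 0.0, 1.0
    for _ in range(n):
        k1u, k1v = rhs(xi, u, v, kappa)
        k2u, k2v = rhs(xi + h/2, u + h/2*k1u, v + h/2*k1v, kappa)
        k3u, k3v = rhs(xi + h/2, u + h/2*k2u, v + h/2*k2v, kappa)
        k4u, k4v = rhs(xi + h, u + h*k3u, v + h*k3v, kappa)
        u += h/6 * (k1u + 2*k2u + 2*k3u + k4u)
        v += h/6 * (k1v + 2*k2v + 2*k3v + k4v)
        xi += h
    return u

def eigenvalue(lo, hi, tol=1e-9):
    # Bisect on the sign change of U(1) to pin down the eigenvalue
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if shoot(lo) * shoot(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# First eigenvalue; a uniform string (A = A0) would give kappa_1 = pi^2.
kappa1 = eigenvalue(1.0, 9.0)
```

Counting the sign changes of U along the interval, as the exercise suggests, identifies which eigenvalue a given bracket contains.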
Exercise 9.7 Show that for a simply supported (i.e., pinned-pinned) Bernoulli beam,
the natural modes of vibration are the same as for the vibrating string, while the
successive natural frequencies are not in the same relation.
Exercise 9.9 For the vibrating membrane, show that the orthogonality between nor-
mal modes corresponding to different natural frequencies is also verified by a third
kind of boundary condition known as the Robin type. It corresponds physically to
an elastic support that keeps the proportionality between the displacement and the
normal slope of the membrane, namely, U = kdU/dn, where k is an elastic constant.
Exercise 9.10 A rectangular membrane with a/b = 1.5, simply supported around
its perimeter in the x, y plane, is subjected to a uniformly distributed normal load p
(per unit area of the membrane) and to a uniform tension T (load per unit length) in
all directions. Find an approximate solution of this static problem by means of the
technique of separation of variables. Give numerical values to the various constants
and estimate the maximum deflection of the membrane to 3 significant digits.
The archetypal parabolic equation is the diffusion equation, or heat equation, in one
spatial dimension. Because it involves a time derivative of odd order, it is essentially
irreversible in time, in sharp distinction with the wave equation. In physical terms
one may say that the diffusion equation entails an arrow of time, a concept related
to the Second Law of Thermodynamics. On the other hand, many of the solution
techniques already developed for hyperbolic equations are also applicable for the
parabolic case, and vice-versa, as will become clear in this chapter.
Many phenomena of everyday occurrence are by nature diffusive.1 They arise, for
example, as the result of sneezing, pouring milk into a cup of coffee, intravenous
injection and industrial pollution. These phenomena, consisting of the spread of one
substance within another, are characterized by thermodynamic irreversibility as the
system tends to equilibrium by trying to render the concentration of the invading
substance as uniform as possible. The flow of heat is also a diffusive process. A more
graphical way to describe this irreversibility is to say that diffusive phenomena are
characterized by an arrow of time. Thus, the drop of milk poured into the coffee will
never collect itself again into a drop.
Consider a tube of constant cross section filled with a liquid at rest (the substrate),
in which another substance (the pollutant) is present with a variable concentration
g = g(x, t), measured in terms of mass of pollutant per unit length of tube. If this
1 This section is largely a more detailed repetition of Sect. 2.4.2.
© Springer International Publishing AG 2017 209
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_10
tube is embedded in a hostile environment and, if the tube wall permits it, a certain
amount of pollutant, p = p(x, t), may perfuse through the lateral wall per unit length
and per unit time. The quantity p is usually called the production. We want to account
for the variation in time of the amount of pollutant contained in an infinitesimal slice
of width d x, as shown in Fig. 10.1.
If the pollutant were to remain at rest, this accounting would be trivial, as it would state that the change in pollutant content, namely $\frac{\partial g(x,t)}{\partial t}\,dx$, is entirely due to the perfusion through the lateral wall, that is, $p(x, t)\,dx$. In reality, however, the pollutant tends to move with a velocity $v(x, t)$ in the direction $x$ of the tube axis. This motion, which is the essence of the diffusive phenomenon, results in an inflow through the left face of the slice given by $g(x, t)\,v(x, t)$, measured in mass of pollutant per unit time. Analogously, the right face of the slice, located at the spatial position $x + dx$, will witness an outflow of pollutant in the amount $g(x + dx, t)\,v(x + dx, t)$. The net contribution due to flow through the faces is, therefore, given by $g(x, t)\,v(x, t) - g(x + dx, t)\,v(x + dx, t) = -\frac{\partial (gv)}{\partial x}\,dx + O(dx^2)$, where we have assumed the quantities involved to be differentiable. Adding up the various contributions, we obtain in the limit as $dx \to 0$ the balance equation
$$\frac{\partial g}{\partial t} + \frac{\partial (gv)}{\partial x} = p. \qquad (10.1)$$
To complete the physical description of the diffusion phenomenon, we need to
supply a constitutive equation that relates the two dependent field variables v and g.
In the case of diffusion of a pollutant (or, in general, a substance in small concen-
trations within another), it is possible to formulate a sensible, experimentally based,
constitutive law directly in terms of the pollutant concentration. The most commonly
used model, called Fick’s law, states that
gv = −D grad g, (10.2)
where grad denotes the spatial gradient and the positive constant D is a property that
depends on the substances involved. The minus sign in Eq. (10.2) agrees with the
fact that the pollutant tends to flow in the direction of smaller concentrations.
Combining the last two equations, we obtain the second-order linear PDE
$$\frac{\partial g}{\partial t} - D\,\frac{\partial^2 g}{\partial x^2} = p. \qquad (10.3)$$
In the absence of production we obtain the homogeneous equation
$$\frac{\partial g}{\partial t} - D\,\frac{\partial^2 g}{\partial x^2} = 0, \qquad (10.4)$$
known as the diffusion equation and also as the heat equation. A clever statistical
motivation for the diffusion equation is presented in Box 10.1.
$$x_i = i\,h, \qquad i = \ldots, -3, -2, -1, 0, 1, 2, 3, \ldots,$$

$$t_j = j\,k, \qquad j = 0, 1, 2, 3, \ldots$$

The link between the discrete model and the diffusion equation is obtained by formulating the latter as a finite-difference approximation on the assumed space-time grid. Setting $g(x_i, t_j) = N_i^j / h$ and using standard approximation formulae for first and second derivatives, we obtain

$$\frac{N_i^{j+1} - N_i^j}{k} \approx D\,\frac{N_{i-1}^j - 2N_i^j + N_{i+1}^j}{h^2}.$$
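This finite-difference form can be marched explicitly in time; a minimal sketch (illustrative values D = L = 1, with the classical stability restriction D k/h² ≤ 1/2, which the text does not discuss):

```python
import math

# Explicit finite-difference march for g_t = D g_xx on the grid x_i = i*h,
# t_j = j*k above (illustrative sketch; D = L = 1, zero end temperatures).
# The explicit scheme is stable only if D*k/h**2 <= 1/2.
D, L, nx = 1.0, 1.0, 50
h = L / nx
k = 0.4 * h * h / D            # time step inside the stability limit
r = D * k / (h * h)

def step(g):
    # g_i^{j+1} = g_i^j + r*(g_{i-1}^j - 2*g_i^j + g_{i+1}^j), ends held at zero
    new = g[:]
    for i in range(1, nx):
        new[i] = g[i] + r * (g[i - 1] - 2.0 * g[i] + g[i + 1])
    return new

g = [math.sin(math.pi * i * h / L) for i in range(nx + 1)]  # initial profile
for _ in range(200):
    g = step(g)

# Compare with the exact single-mode solution sin(pi x/L) e^{-D (pi/L)^2 t}
t = 200 * k
exact_mid = math.exp(-D * (math.pi / L) ** 2 * t)
assert abs(g[nx // 2] - exact_mid) < 1e-2
```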
The First Law of Thermodynamics asserts that for each substance there exists a
function of state, called the internal energy, whose rate of change is balanced by the
power of the external forces (or mechanical power) acting on the system plus the
heating input (or thermal power), namely,
$$\frac{d\,(\text{internal energy})}{dt} = \text{mechanical power} + \text{thermal power}. \qquad (10.5)$$
If we consider a fixed non-deforming substrate, such as a metal wire, the mechanical power vanishes and the internal energy is a function of the temperature alone. In close
analogy with the diffusion case shown in Fig. 10.1, the thermal power going into a
slice of width d x consists of two parts: (i) A power supply p = p(x, t), measured in
terms of energy per unit length and per unit time. This power is the result of sources
of heat distributed throughout the length of the wire (or its lateral surface). (ii) A
heat flux q = q(x, t) in the direction of the axial coordinate x. This flux, due to the
ability of the material to conduct heat, is measured in terms of energy per unit time.2
Denoting by g = g(x, t) the temperature field and by u = u(x, t) the internal energy
2 In the more general three-dimensional context, the production term p is measured per unit volume
(rather than length) and the flux term q is measured per unit area. Since the cross section has been
assumed to be constant, we did not bother to effect the formal passage to one dimension.
content per unit length, the statement of the balance of energy is expressed as
$$\frac{\partial u(x, t)}{\partial t} = p(x, t) - \frac{\partial q(x, t)}{\partial x}. \qquad (10.6)$$
For many materials, a good empirical constitutive law for increments Δu in inter-
nal energy due to corresponding increments Δg in temperature is given by the linear
relation
Δu = c Δg, (10.7)
where c is a constant known as the specific heat (capacity). The internal energy
depends in general also on the deformation, which in our case has been ignored
since the material was assumed to be rigid.
As far as the heat flux is concerned, Fourier’s Law of heat conduction is an
empirical relation valid for most materials within limited temperature ranges. It
establishes that the heat flux is proportional to the gradient of the temperature. In our
notation, this law is expressed as
$$q = -k\,\frac{\partial g}{\partial x}, \qquad (10.8)$$
where k is the thermal conductivity of the material, a positive constant. The minus sign
expresses the fact that heat flows spontaneously from higher to lower temperatures.
Introducing the constitutive equations (10.7) and (10.8) into the energy balance
equation (10.6), we obtain
$$c\,\frac{\partial g}{\partial t} - k\,\frac{\partial^2 g}{\partial x^2} = p. \qquad (10.9)$$
This equation is identical in form3 to Eq. (10.3) governing the diffusion of one sub-
stance into another, as we studied in Sect. 10.1.1. For this reason, this equation is
known both as the (non-homogeneous) diffusion equation and as the heat equation.
The adjective non-homogeneous refers here to the fact that there are body sources.
Thus, the equation would be called homogeneous if the right-hand side were zero. On
the other hand, the material itself may have properties, such as the specific heat or the
thermal conductivity, varying from point to point, in which case it is the body (rather
than the equation) which would be called inhomogeneous. In deriving Eq. (10.9),
in fact, it was assumed that the coefficient of thermal conductivity k was constant
throughout the domain of interest. If, instead, k and/or c are functions of position
(that is, if the material is inhomogeneous) Eq. (10.9) should be replaced by
$$c(x)\,\frac{\partial g}{\partial t} - \frac{\partial}{\partial x}\left(k(x)\,\frac{\partial g}{\partial x}\right) = p. \qquad (10.10)$$
3 With D = k/c.
The diffusion equation, as we already know, is of the parabolic type. At each point,
therefore, it has a single characteristic direction, whose slope is given by
$$\frac{dt}{dx} = 0. \qquad (10.11)$$
There are some similarities between the heat equation and the wave equation (and
between hyperbolic and parabolic equations in general), but there are also many
differences, both in interpretation and in the nature of their solutions, which we
would like to point out. An important difference from the mathematical and physical
points of view is that, unlike the wave equation, the heat equation is not invariant
with respect to time reversal. In other words, if we were to make the change of
independent variables
$$\hat{x} = x, \qquad \hat{t} = -t, \qquad (10.12)$$
we would not obtain the same equation (as would be the case with the wave equation).
From the physical point of view, this is a manifestation of the fact that the diffusion
equation describes thermodynamically irreversible processes. If we were to run a
video of a wave phenomenon backwards, we would not be able to tell whether or
not we are witnessing a real phenomenon. But if we were to run backwards a movie
of a diffusive phenomenon, we would immediately be able to tell that something
“unphysical” is taking place: the milk already dissolved in coffee spontaneously
becomes a small drop, a bar in thermal equilibrium spontaneously gets cooler at one
end and warmer at the other, and so on.
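This lack of time-reversal invariance can be made tangible numerically. The sketch below, which borrows an explicit finite-difference march as an illustrative device, advances the same initial profile both forward in time and with t replaced by −t:

```python
import math

# Illustration of the arrow of time: march the same initial profile with an
# explicit finite-difference scheme (an assumed illustrative device) forward
# in time, and then with the time direction reversed.
nx = 50
h = 1.0 / nx
k = 0.4 * h * h

def step(g, direction):
    # direction = +1 diffuses (g_t = g_xx); direction = -1 reverses time
    r = direction * k / (h * h)
    new = g[:]
    for i in range(1, nx):
        new[i] = g[i] + r * (g[i - 1] - 2.0 * g[i] + g[i + 1])
    return new

# Smooth profile plus a tiny high-frequency ripple
g0 = [math.sin(math.pi * i * h) + 1e-3 * math.sin(25 * math.pi * i * h)
      for i in range(nx + 1)]

forward, backward = g0[:], g0[:]
for _ in range(100):
    forward = step(forward, +1)
    backward = step(backward, -1)

# Forward diffusion damps the ripple; the time-reversed march amplifies it
# catastrophically, as no physical process would.
assert max(map(abs, backward)) > 1e6 * max(map(abs, forward))
```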
A feature that the diffusion equation shares with the wave equation is that in both
cases the initial value problem makes physical and mathematical sense. Thus, from
the state of the system at one particular instant of time, the differential equation allows
us to predict the evolution of the system for future times. But here, again, there is a
significant difference when we realize that, according to Eq. (10.11), the initial curve
(t = 0, say) is a characteristic of the PDE. What this means is that we will not be able
to consider the specification of initial data along a characteristic as the exceptional
case, but rather as the rule. Moreover, if we recall that characteristics are lines along
which weak singularities propagate, we find that according to the heat equation these
disturbances (if they could exist at all) must propagate at an infinite speed! In fact, it
can be shown that, in the interior of its domain of existence, any solution of the heat
equation must be of class C ∞ . In other words, contrary to the wave equation, any
singularity in the data at the boundary of the domain is immediately smeared out, as
befits a diffusive process.
When discussing the different types of second-order equations, we remarked that
if the initial data (the values of the function and of its first partial derivatives) are
specified on a characteristic line they will in general contravene the PDE. In the case
of hyperbolic equations, upon reconciling the initial data with the PDE we end up
losing uniqueness of the solution. In order to restore it, one has to deal with the
so-called characteristic initial value problem, whereby data have to be specified on
two intersecting characteristics. In the case of the diffusion equation, on the other
hand, it is clear that to reconcile the initial data on a characteristic line (t = constant)
we simply have to refrain from specifying the value of the time derivative $g_t = \partial g/\partial t$.
Indeed, by specifying the value of the function itself, and assuming that it is twice
differentiable, we can obtain its second spatial derivative, and the PDE automatically
delivers the value of the time derivative. We note, moreover, that the values of all
subsequent partial derivatives of all orders become thus available. This means that,
contrary to the case of the wave equation, the reconciliation of the initial data with
the PDE does not lead to a lack of uniqueness.
Without assuming any initial and/or boundary conditions, we try to see whether the
method of separation of variables can give us some indication of possible solutions
of the homogeneous diffusion equation. We set

$$g(x, t) = f(t)\,G(x), \qquad (10.13)$$

which, upon substitution into Eq. (10.4) and division by $fG$, yields

$$\frac{\dot f}{f} = D\,\frac{G''}{G} = -\lambda^2, \qquad (10.14)$$
with an obvious notation. Without any loss of generality, we may henceforth assume
that D = 1, since this can always be achieved by a suitable re-scaling of the spatial
variable. The choice of negative sign for the constant on the rightmost side of Eq. (10.14) is
dictated by the reasoning that follows. Integrating first the time-dependent part, we
obtain
$$f(t) = C\,e^{-\lambda^2 t}. \qquad (10.15)$$
Since, as already pointed out, the diffusion equation implies an arrow of time, it is
clear that, had we chosen a positive value in the right-hand side, the solution would
have rapidly diverged with (increasing) time. The spatial part leads to the solution

$$G(x) = A\cos(\lambda x) + B\sin(\lambda x), \qquad (10.16)$$

so that any variable-separated solution has the form

$$g(x, t) = \left(A\cos(\lambda x) + B\sin(\lambda x)\right)e^{-\lambda^2 t}. \qquad (10.17)$$
Note that the special choice λ = 0 yields the solution g(x, t) = A + Bx. This
time-independent solution of the diffusion equation corresponds to a case of steady
state (or equilibrium).
Since we are dealing with a linear homogeneous equation, any linear combination
of solutions is a solution. We may, for example, choose for each value of λ some
prescriptions A = A(λ) and B = B(λ), and form an integral (which is, after all, a
limit of sums) such as
$$g(x, t) = \int_{-\infty}^{\infty} \left(A\cos(\lambda x) + B\sin(\lambda x)\right)e^{-\lambda^2 t}\,d\lambda. \qquad (10.18)$$
Provided the integral converges, this expression is a new solution of the diffusion
equation. We will later exploit this fact to show how to construct a solution in this
way by judiciously choosing the spectral coefficients A(λ) and B(λ) so as to match
any given initial and boundary conditions.
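As a sanity check of this construction, one may pick a concrete spectral choice for which the integral (10.18) is also known in closed form. A sketch, assuming A(λ) = e^(−λ²), B(λ) = 0 and D = 1 (illustrative choices, not the matching procedure developed later):

```python
import math

# Evaluate (10.18) for the assumed spectral choice A(lam) = exp(-lam**2),
# B(lam) = 0, D = 1, using the trapezoidal rule on a truncated lambda range.
def g(x, t, n=4000, lam_max=12.0):
    dl = 2.0 * lam_max / n
    s = 0.0
    for i in range(n + 1):
        lam = -lam_max + i * dl
        w = 0.5 if i in (0, n) else 1.0
        s += w * math.exp(-lam * lam) * math.cos(lam * x) * math.exp(-lam * lam * t)
    return s * dl

def g_exact(x, t):
    # For this choice the integral is a Gaussian:
    # g(x, t) = sqrt(pi/(1 + t)) * exp(-x**2 / (4*(1 + t))),
    # which indeed satisfies g_t = g_xx.
    return math.sqrt(math.pi / (1.0 + t)) * math.exp(-x * x / (4.0 * (1.0 + t)))

assert abs(g(0.7, 0.5) - g_exact(0.7, 0.5)) < 1e-8
```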
Proof Let M and m denote, respectively, the maximum values6 of g in D and in the
union of the base and the vertical sides of D (that is, in ∂D minus the open top of
the rectangle, as indicated with a thick line in Fig. 10.2). Assume the statement of
the theorem not to be true. There exists, therefore, a point (x0 , t0 ) with 0 < x0 < L
and 0 < t0 ≤ T at which g attains the value M > m. We construct the augmented
function
$$h(x, t) = g(x, t) + \frac{M - m}{4L^2}\,(x - x_0)^2. \qquad (10.19)$$
The reason for this construction will become apparent soon. We only remark now
that the value of this function is at each point of D greater than or equal to the value
of the solution at that point. In particular, the restriction of this function to the union
of the vertical sides and the base of D satisfies the inequality
$$h(x, t) \le m + \frac{M - m}{4} = \frac{M}{4} + \frac{3}{4}\,m < M. \qquad (10.20)$$
This result tells us that the new function h(x, t) also attains its maximum value at
some point (x1 , t1 ) with 0 < x1 < L , 0 < t1 ≤ T . At this point we must have that
h t ≥ 0 and h x x ≤ 0. [Important question: why don’t we just say h t = 0?] Combining
these two conditions, we obtain
$$h_t - h_{xx} \ge 0. \qquad (10.21)$$

On the other hand, using the fact that $g$ satisfies the heat equation,

$$h_t - h_{xx} = g_t - g_{xx} - \frac{M - m}{2L^2} = -\frac{M - m}{2L^2} < 0. \qquad (10.22)$$
Thus, the assumption M > m has led us to a contradiction and the first part of the
theorem has been proved. The proof of the part dealing with the minimum value
follows directly by noting that if g is a solution so is −g.
Corollary 10.1 (Uniqueness) The initial-boundary value problem of the heat equa-
tion has a unique solution.
Proof The problem we are describing corresponds to the specification of the values of
g at the base and the vertical sides of the rectangle. The proof follows immediately
from the assumption that there exist two different solutions to this problem. The
difference between these two solutions would, therefore, vanish on this part of the
boundary. Since our equation is linear, this difference satisfies the heat equation in
the given domain. It follows that (unless the difference is identically zero) we have
found a solution of the heat equation that attains a maximum or minimum value at a
point not belonging to the part of the boundary stipulated by the maximum–minimum
theorem.
From both the physical and the computational points of view, it is important to
ascertain that the behaviour of the solutions of the heat equation is not chaotic. In other
words, a small change in the initial and/or boundary data results in a correspondingly
small change in the solution. This is the content of the following corollary of the
maximum–minimum theorem.
Corollary 10.2 (Continuous dependence on the data) The solution of the initial-
boundary value problem depends continuously on the data in the sense that a small
change in the data results in a correspondingly small change in the solution.
Proof More precisely, this corollary states that if the data (specified on the base and
the two vertical sides of the rectangular region [0, L] × [0, T ]) corresponding to two
solutions $g_1(x, t)$ and $g_2(x, t)$ satisfy the conditions

$$|g_1(x, 0) - g_2(x, 0)| < \epsilon, \qquad (10.23)$$

$$|g_1(0, t) - g_2(0, t)| < \epsilon, \qquad (10.24)$$

and

$$|g_1(L, t) - g_2(L, t)| < \epsilon, \qquad (10.25)$$

for some $\epsilon > 0$, then $|g_1(x, t) - g_2(x, t)| < \epsilon$ at every point of the domain.
To prove this corollary we start by noticing once again that the difference between two
solutions is itself a solution of the (homogeneous) heat equation. The corresponding
data are, clearly, given by the difference of the data of the individual solutions.
Applying the main theorem to the difference between the given solutions the corollary
follows.
We have extracted several important facts out of a relatively simple proof. More-
over, the statement of the main theorem corresponds to the physically intuitive fact
that if, for example, the maximum value of the temperature data occurs at the base of
the rectangle, then we do not expect at any time and at any point the temperature to
rise above this value. In its search for thermal equilibrium, the bar will seek to even
out the temperatures as much as permitted by the boundary data.
Consider a rod of finite length occupying the interval [0, L] and thermally insulated
on its lateral surface. The temperature distribution g(x, t) abides by the diffusion
equation (10.4), also called the heat equation. At the ends of the bar, the temperature
is kept equal to zero7 at all times, i.e.,

$$g(0, t) = g(L, t) = 0. \qquad (10.27)$$

Moreover, at the initial time, the temperature throughout the length of the rod is given as some continuous function

$$g(x, 0) = g_0(x), \qquad 0 \le x \le L. \qquad (10.28)$$
For consistency, we assume that the function g0 vanishes at the two ends of the
rod. As we already know (from the results of the previous section), if this problem
has a solution it must be unique. We try a solution by the method of separation
of variables. According to Eq. (10.17), except for the spatially linear solution, any
variable-separated solution must be of the form

$$g(x, t) = \left(A\cos(\lambda x) + B\sin(\lambda x)\right)e^{-\lambda^2 t}. \qquad (10.29)$$
Enforcing the boundary conditions (10.27), we obtain that the constant A must vanish
and that the parameter λ must belong to a discrete spectrum given by the formula
$$\lambda = \frac{n\pi}{L}, \qquad n = 1, 2, \ldots \qquad (10.30)$$
7 Note that the temperature appearing in the heat equation is not necessarily the absolute thermody-
namic temperature.
The general solution is then sought as the superposition

$$g(x, t) = \sum_{n=1}^{\infty} B_n \sin\frac{n\pi x}{L}\,e^{-\left(\frac{n\pi}{L}\right)^2 t}. \qquad (10.31)$$

Introducing this form of the solution into the initial condition (10.28), we obtain
$$g_0(x) = \sum_{n=1}^{\infty} B_n \sin\frac{n\pi x}{L}. \qquad (10.32)$$
The coefficients are then delivered by the orthogonality of the sine functions as

$$B_n = \frac{2}{L}\int_0^L g_0(x)\,\sin\frac{n\pi x}{L}\,dx. \qquad (10.33)$$
Notice that as time goes on, the solution tends to a state of thermal equilibrium, as
expected. From a detailed analysis of this solution, one can verify that if the initial
temperature distribution is continuous with piece-wise continuous derivatives, the
solution is of class C ∞ for t > 0. We have already alluded to this property earlier and
indicated that, from the physical point of view, its meaning is that any irregularities
in the initial data are immediately smoothed out by the diffusive process of heat
transfer. It is interesting to remark that one can use this property of the solution to
prove that the heat equation cannot in general be solved backward in time.
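The recipe (10.30)–(10.33) is easily exercised numerically. A sketch for the assumed illustrative initial profile g₀(x) = x(L − x), with D = 1 and L = 1:

```python
import math

# Sketch of the series solution for the illustrative (assumed) initial
# profile g0(x) = x*(L - x), with D = 1 and L = 1.
L = 1.0

def g0(x):
    return x * (L - x)

def B(n, m=2000):
    # Fourier coefficient (10.33) by the trapezoidal rule
    h = L / m
    s = 0.0
    for i in range(m + 1):
        x = i * h
        w = 0.5 if i in (0, m) else 1.0
        s += w * g0(x) * math.sin(n * math.pi * x / L)
    return 2.0 / L * s * h

def g(x, t, terms=25):
    # Truncated series solution: mode n decays like exp(-(n*pi/L)**2 * t),
    # so high-frequency content is wiped out almost instantly (smoothing).
    return sum(B(n) * math.sin(n * math.pi * x / L)
               * math.exp(-(n * math.pi / L) ** 2 * t)
               for n in range(1, terms + 1))

# For this profile B_n = 8/(n*pi)**3 for odd n and 0 for even n
assert abs(B(1) - 8.0 / math.pi ** 3) < 1e-6
assert g(0.5, 0.1) < g(0.5, 0.0)   # monotone approach to equilibrium
```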
The method of separation of variables has allowed us to solve the homogeneous
heat equation (no distributed heat sources or sinks) under a regime of homogeneous
boundary conditions (zero temperature at the ends of the rod). Other, more general,
homogeneous boundary conditions (such as insulated ends) can also be considered,
leading to Fourier series expansions involving cosine terms. The solution of cases
where the boundary conditions are arbitrary functions of time or where there exist
heat sources distributed over the length of the rod, however, requires the consideration
of other methods (such as Laplace transforms, Green functions, Duhamel integrals
and eigenfunction expansions). The general treatment of these methods is beyond
the scope of this book, but we will explore some of them to a limited extent.
We assume, moreover, that the following consistency conditions are satisfied, namely,

$$f_0(0) = g_0(0), \qquad f_L(0) = g_0(L). \qquad (10.38)$$
Our aim is to show that this problem (a homogeneous equation with inhomo-
geneous boundary conditions) can be generally converted into a problem of a non-
homogeneous equation with homogeneous boundary conditions. To this effect, we
decompose the solution into the sum of two terms as

$$g(x, t) = G(x, t) + S(x, t). \qquad (10.39)$$
The second term (which, somewhat imprecisely, will be referred to as the steady state
part of the solution) is given by a spatial linear interpolation of the given boundary
conditions, namely,
$$S(x, t) = f_0(t)\left(1 - \frac{x}{L}\right) + f_L(t)\,\frac{x}{L}. \qquad (10.40)$$
Introducing the proposed decomposition (10.39) into the original PDE (10.35), we
obtain
$$G_t - D\,G_{xx} = -S_t. \qquad (10.41)$$
Thus, the “transient” (non-steady) part, G(x, t), of the solution satisfies a non-
homogeneous version of the heat equation, whereby the sources are obtained by
a particular linear combination of the time-derivatives of the boundary conditions.
On the other hand, it is not difficult to verify that (in fact, by construction) the function
G(x, t) satisfies the homogeneous boundary conditions

$$G(0, t) = G(L, t) = 0, \qquad (10.42)$$

together with the initial condition

$$G(x, 0) = g_0(x) - S(x, 0). \qquad (10.43)$$

Accordingly, we seek the transient part as an expansion in the eigenfunctions of the homogeneous problem,

$$G(x, t) = \sum_{n=1}^{\infty} D_n(t)\,\sin\frac{n\pi x}{L}. \qquad (10.44)$$

This expression agrees with its counterpart for the treatment of the non-homogeneous
wave equation that we studied in a previous chapter. Similarly, we express the heat
source as
$$S_t(x, t) = \sum_{n=1}^{\infty} C_n(t)\,\sin\frac{n\pi x}{L}. \qquad (10.45)$$
The coefficients of this expansion can be calculated at each instant of time by the by
now familiar formula
$$C_n(t) = \frac{2}{L}\int_0^L S_t(x, t)\,\sin\frac{n\pi x}{L}\,dx. \qquad (10.46)$$
Introducing the expansions into Eq. (10.41) and matching coefficients term by term, we obtain for each n the ODE

$$\frac{dD_n}{dt} + D\left(\frac{n\pi}{L}\right)^2 D_n = -C_n. \qquad (10.47)$$
A particular solution of this equation is given by
$$D_n(t) = -\int_0^t C_n(\tau)\,e^{-\left(\frac{n\pi}{L}\right)^2 D\,(t - \tau)}\,d\tau. \qquad (10.48)$$
The constants can be adjusted to satisfy the initial condition (10.43). Finally, the
solution of the original problem is obtained from Eq. (10.39).
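For concreteness, the integral (10.48) can be checked against a closed-form evaluation for simple data. A sketch assuming f₀(t) = t, f_L(t) = 0 and D = L = 1 (illustrative choices), for which S(x, t) = t(1 − x), S_t = 1 − x, and Eq. (10.46) gives C_n = 2/(nπ), independent of time:

```python
import math

# Illustrative (assumed) data: f0(t) = t at x = 0, fL(t) = 0, D = L = 1.
# Then S(x, t) = t*(1 - x), S_t = 1 - x, and (10.46) gives C_n = 2/(n*pi).
def C(n):
    return 2.0 / (n * math.pi)

def Dn(n, t, m=2000):
    # The particular solution (10.48), evaluated by the trapezoidal rule
    dt = t / m
    s = 0.0
    for i in range(m + 1):
        tau = i * dt
        w = 0.5 if i in (0, m) else 1.0
        s += w * C(n) * math.exp(-(n * math.pi) ** 2 * (t - tau))
    return -s * dt

def Dn_closed(n, t):
    # Since C_n is constant here, the integral can be done exactly
    a = (n * math.pi) ** 2
    return -C(n) / a * (1.0 - math.exp(-a * t))

assert abs(Dn(1, 0.5) - Dn_closed(1, 0.5)) < 1e-6
```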
In the case of a rod of infinite spatial extent, we are confronted with the pure Cauchy
(or initial-value) problem

$$g_t - D\,g_{xx} = 0, \qquad -\infty < x < \infty, \quad t > 0, \qquad (10.50)$$

$$g(x, 0) = g_0(x), \qquad -\infty < x < \infty, \qquad (10.51)$$
without any boundary conditions. We will assume that the initial temperature distri-
bution g0 (x) is continuous and bounded over the real line. To show the uniqueness
of the solution of this problem, we would like to emulate the procedure we used in
the case of the finite rod, namely, to prove a maximum–minimum theorem. In order
to achieve this goal, however, it turns out that, unlike the finite case, we must now
make an extra assumption on the nature of the solution: we need to assume a-priori
that the sought after solution is continuous and bounded. Otherwise, it can be shown
explicitly that the maximum–minimum theorem doesn’t hold and the solution is, in
fact, not unique. A standard argument due to Tychonoff8 shows how to construct a
C ∞ solution that vanishes at t = 0. This solution, however, is unbounded. A solution
g(x, t) is said to be bounded if there exists a positive number M such that

$$|g(x, t)| \le M \qquad \text{for all } -\infty < x < \infty,\; t \ge 0. \qquad (10.52)$$
Let g1 (x, t) and g2 (x, t) be two bounded solutions of Eq. (10.50). The difference
g = g1 −g2 between these solutions is, therefore, also bounded. Instead of proceeding
to prove an independent maximum theorem, we can take advantage of the maximum
theorem for the finite case to produce a proof of uniqueness by showing that g(x, t)
must vanish identically over the half plane of interest. To this end, we attempt to
construct a family of solutions g L (x, t) of the heat equation, over the finite spatial
intervals −L ≤ x ≤ L, each of which enjoys the property of being non-negative
and greater than (or equal to) |g(x, t)| over the common domain of definition. Such
a family of solutions is given by the prescription9
$$g_L(x, t) = \frac{4M}{L^2}\left(Dt + \frac{x^2}{2}\right), \qquad (10.53)$$
as can be verified (indeed, $(g_L)_t = 4MD/L^2 = D\,(g_L)_{xx}$). In particular, we notice that for any given T > 0 the values taken
by this solution over the part of the boundary consisting of the base [−L , L] × {0}
and the sides {−L} × [0, T ] and {L} × [0, T ] are point by point larger than (or equal
to) the corresponding values of |g|. This must, therefore, be true for the interior
points as well. Fixing an arbitrary point (x, t), we conclude from Eq. (10.53) that for
sufficiently large L the absolute value of g(x, t) can be bounded by as small a positive
number as desired. This concludes the proof of uniqueness. As a corollary of this
theorem, one can (by the same procedure as in the finite case) prove the continuous
dependence of the solution on the initial data.
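The claim that the comparison family (10.53) solves the heat equation can be checked symbolically. The following sketch (ours, not part of the text) verifies with sympy that $g_L$ satisfies $g_t = D\,g_{xx}$:

```python
import sympy as sp

x, t, L, M, D = sp.symbols('x t L M D', positive=True)

# Comparison solution of Eq. (10.53): g_L(x, t) = (4M/L^2)(D t + x^2/2)
gL = 4*M/L**2 * (D*t + x**2/2)

# Residual of the heat equation g_t - D g_xx; it vanishes identically
residual = sp.diff(gL, t) - D*sp.diff(gL, x, 2)
print(sp.simplify(residual))  # -> 0
```

Both terms of the residual equal $4MD/L^2$, so the cancellation is exact.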
Having thus demonstrated the uniqueness of the solution of the Cauchy problem, we
need only construct a solution by any method whatsoever; it will then be the (unique) solution.
In particular, if two solutions are found which appear to be different (perhaps because
of the different methods used to derive them), they are automatically identical to each
other. We have already remarked that fairly general solutions of the heat equation
can be found by adjusting the coefficients in the expression
$$g(x, t) = \int_{-\infty}^{\infty}\left(A(\lambda)\cos(\lambda x) + B(\lambda)\sin(\lambda x)\right)e^{-D\lambda^2 t}\,d\lambda, \qquad (10.54)$$
which, upon setting t = 0, requires the initial condition to be representable as
$$g_0(x) = \int_{-\infty}^{\infty}\left(A(\lambda)\cos(\lambda x) + B(\lambda)\sin(\lambda x)\right)d\lambda. \qquad (10.55)$$
When we compare this expression with the familiar formula for the Fourier series,
we realize that it can be regarded as a generalized version of it. The generalization
consists in not demanding that the function represented be periodic, since the domain
of definition of the initial conditions is now unbounded. As a result, we no longer
obtain a discrete spectrum of possible values for the wavelength, but rather a continuous
spectrum, where every wavelength is represented. If we had at our disposal some
kind of orthogonality condition, as was the case in the finite domain, we would be
able to obtain these coefficients directly from Eq. (10.55). Instead, we will proceed
to introduce the concept of Fourier integral by a heuristic argument of passage to
the limit of the Fourier series as the period tends to infinity.
Recall that a function f (x) of period 2L can be represented by the Fourier series
$$f(x) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left(a_n\cos(\lambda_n x) + b_n\sin(\lambda_n x)\right). \qquad (10.56)$$
The equal sign in this equation has to be taken with a pinch of salt. Be that as it
may, the “frequencies” λn constitute a discrete spectrum dictated by the period of
the function being represented, specifically given by
$$\lambda_n = \frac{n\pi}{L}. \qquad (10.57)$$
The coefficients of the expansion (10.56), also called amplitudes, are given by the
integrals
$$a_n = \frac{1}{L}\int_{-L}^{L} f(\xi)\cos(\lambda_n \xi)\,d\xi, \qquad (10.58)$$
$$b_n = \frac{1}{L}\int_{-L}^{L} f(\xi)\sin(\lambda_n \xi)\,d\xi. \qquad (10.59)$$
The series (10.56) can be written equivalently in complex exponential form as
$$f(x) = \sum_{n=-\infty}^{\infty} c_n\, e^{i\lambda_n x}. \qquad (10.61)$$
The (complex) coefficients are related to the (real) coefficients by the formulas
$$c_n = \begin{cases} \tfrac{1}{2}\left(a_n - i b_n\right) & \text{for } n \ge 0 \\[4pt] \tfrac{1}{2}\left(a_{-n} + i b_{-n}\right) & \text{for } n < 0 \end{cases} \qquad (10.62)$$
or, combining these with Eqs. (10.58) and (10.59), by the single formula
$$c_n = \frac{1}{2L}\int_{-L}^{L} f(\xi)\,e^{-i\lambda_n \xi}\,d\xi. \qquad (10.63)$$
Notice that, although the original function may have been defined only in the interval
[−L , L], the Fourier representation is valid over the whole line. In other words, the
Fourier series represents a periodic extension of the given function, obtained by just
translating and copying the function ad infinitum. When performing this extension,
even if the original function is continuous, we may obtain points of discontinuity at
the extreme values of each period. In such cases, it can be shown that the Fourier
series converges to the average value. We will not discuss these or other phenomena
pertaining to the convergence of Fourier series. In particular, we will assume that
differentiation can be carried out term by term and that the series thus obtained is an
almost-everywhere faithful representation of the derivative of the original function.
Taking these liberties, we can easily understand why the Fourier series can be so
useful in the solution of differential equations. For example, the second derivative
of a function has Fourier coefficients which are, one by one, proportional to the
coefficients of the original function, the constant of proportionality being −λ2n . In
the transformed world of Fourier coefficients, therefore, taking a second derivative
is interpreted as a kind of multiplication.
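This proportionality can be illustrated numerically (an illustration of ours, not taken from the text): for a smooth function of period 2L, the cosine coefficient (10.58) of f'' equals $-\lambda_n^2$ times that of f.

```python
import numpy as np

def trap(y, x):
    """Composite trapezoid rule."""
    return float(np.sum(0.5*(y[1:] + y[:-1])*np.diff(x)))

L = np.pi                                  # half-period, so lambda_n = n
x = np.linspace(-L, L, 20001)
f = np.exp(np.sin(x))                      # smooth 2*pi-periodic test function
f2 = (np.cos(x)**2 - np.sin(x))*f          # its exact second derivative

n = 3
lam = n*np.pi/L
a_f = trap(f*np.cos(lam*x), x)/L           # coefficient (10.58) of f
a_f2 = trap(f2*np.cos(lam*x), x)/L         # same coefficient of f''

print(a_f2, -lam**2*a_f)                   # the two values agree
```

The agreement holds for every n, which is what makes the method effective for differential equations.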
We want to extend the above concepts to functions that are not necessarily periodic
and that are defined over the entire real line. We will make the assumption that the
absolute value of the given function f (x) is integrable and that the integral over the
real line is finite, namely, for some positive number M,
$$\int_{-\infty}^{\infty} |f(x)|\,dx \le M. \qquad (10.64)$$
We now write the complex Fourier series of f over the symmetric interval [−H, H],
$$f_H(x) = \frac{1}{2H}\sum_{n=-\infty}^{\infty}\ \int_{-H}^{H} f(\xi)\,e^{-\frac{in\pi(\xi-x)}{H}}\,d\xi, \qquad (10.65)$$
where we have combined Eqs. (10.61) and (10.63). We intend to let H go to infinity
and to replace the summation by an integral. To achieve this goal, it is convenient to
define
$$\Delta = \frac{\pi}{H}. \qquad (10.66)$$
$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(\xi)\,e^{i\lambda(x-\xi)}\,d\xi\,d\lambda. \qquad (10.68)$$
This formula is known as the Fourier integral. Notice that in obtaining this result
we have defined a new continuous variable λ, whose discrete values in the limiting
process were nπ/H = nΔ, precisely as required by the definition of an integral. The
Fourier integral can be regarded in two steps, just as we suggested for the Fourier
series. In the first step, called the Fourier transform, we produce a transformation
of the original function f (of the independent variable x) to another function F (of
the independent variable λ running within the “frequency domain”) by means of the
formula
$$F(\lambda) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\xi)\,e^{-i\lambda\xi}\,d\xi. \qquad (10.69)$$
In the second step, the original function is recovered by means of the inverse Fourier transform,
$$f(\xi) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} F(\lambda)\,e^{i\lambda\xi}\,d\lambda. \qquad (10.70)$$
The process of obtaining the Fourier transform of functions is, clearly, a linear
operation from the space of functions into itself. We indicate this linear operator by
F. Thus, we can write
F(λ) = F[ f (x)]. (10.71)
Again, just as in the case of the Fourier series, we obtain that in the frequency
domain differentiation is interpreted as multiplication by the frequency variable;
more precisely, if f vanishes at infinity, an integration by parts shows that
$\mathcal{F}[f'(x)] = i\lambda\,\mathcal{F}[f(x)]$.
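As a quick numerical probe of the transform pair (10.69)–(10.70) (an illustration of ours, not in the text): with this symmetric convention the unit Gaussian is its own Fourier transform.

```python
import numpy as np

# Check that F[exp(-x^2/2)](lam) = exp(-lam^2/2) under convention (10.69)
xi = np.linspace(-20.0, 20.0, 40001)
dxi = xi[1] - xi[0]
f = np.exp(-xi**2/2)

lam = 1.7   # an arbitrary frequency
F = np.sum(f*np.exp(-1j*lam*xi))*dxi / np.sqrt(2*np.pi)   # Eq. (10.69)

print(F.real, np.exp(-lam**2/2))   # these agree; imaginary part ~ 0
```

The quadrature converges essentially to machine precision because the integrand is smooth and rapidly decaying.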
Another important property concerns the so-called convolution. The convolution product
f ∗ g of two functions f (x) and g(x) is defined as
$$(f * g)(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x-\xi)\,g(\xi)\,d\xi. \qquad (10.74)$$
Its key property is that the Fourier transform maps convolution into an ordinary product,
$$\mathcal{F}[f * g] = \mathcal{F}[f]\;\mathcal{F}[g]. \qquad (10.75)$$
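A small numerical sketch (ours) of the convolution product (10.74): for two unit Gaussians a direct computation gives $(f * f)(x) = e^{-x^2/4}/\sqrt{2}$, which the quadrature reproduces.

```python
import numpy as np

xi = np.linspace(-20.0, 20.0, 40001)
dxi = xi[1] - xi[0]

def conv(fx, gx, x):
    """Convolution (f*g)(x) as defined in Eq. (10.74)."""
    return np.sum(fx(x - xi)*gx(xi))*dxi / np.sqrt(2*np.pi)

gauss = lambda s: np.exp(-s**2/2)
x0 = 0.8
print(conv(gauss, gauss, x0), np.exp(-x0**2/4)/np.sqrt(2))  # agree
```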
We are now in a position to solve the Cauchy problem as formulated in Eqs. (10.50)
and (10.51), provided we add the extra condition that both the initial temperature
distribution g0 (x) and its derivative g0′ (x) vanish at x = ±∞. Fourier-transforming
Eq. (10.50) with respect to the space variable (while the time variable remains as a
parameter), we can write
$$\mathcal{F}[g_t] = D\,\mathcal{F}[g_{xx}].$$
We note that the x-Fourier transform of g(x, t) is a function of λ and of the time vari-
able t, which has remained unaffected by the transformation. Denoting this transform
by G(λ, t) and using the derivative property of the Fourier transform, we write
$$\frac{\partial G(\lambda, t)}{\partial t} = -D\lambda^2\, G(\lambda, t). \qquad (10.77)$$
We note that the derivative with respect to the time parameter is directly reflected
as the derivative with respect to the same parameter of the Fourier transform, and it
is only the derivative with respect to the transformed variable that enjoys the special
property derived above. For each value of λ, Eq. (10.77) is a first-order ODE. The
initial condition is obtained by Fourier-transforming the initial condition (10.51). We
denote
G 0 (λ) = F[g0 (x)]. (10.78)
The solution of (10.77) with initial condition (10.78) is an elementary problem and
we obtain
$$G(\lambda, t) = G_0(\lambda)\,e^{-D\lambda^2 t}. \qquad (10.79)$$
The solution to the original problem is given by the inverse transform of this function.
The calculation of inverse transforms is usually not a straightforward task. In this
particular case, however, it can be accomplished by an application of the convolution
formula (10.75). Indeed, the inverse transform of the second factor in the right-hand
side of (10.79) is given by
$$\mathcal{F}^{-1}\left[e^{-D\lambda^2 t}\right] = \frac{1}{\sqrt{2Dt}}\,e^{-\frac{x^2}{4Dt}}. \qquad (10.80)$$
We have obtained this value from a table of Fourier transforms, although in this case
the direct evaluation of the inverse transform is relatively straightforward. Applying
now the convolution formula to (10.79), we obtain
$$g(x, t) = \mathcal{F}^{-1}\left[G_0(\lambda)\,e^{-D\lambda^2 t}\right] = g_0(x) * \frac{1}{\sqrt{2Dt}}\,e^{-\frac{x^2}{4Dt}}, \qquad (10.81)$$
or
$$g(x, t) = \frac{1}{2\sqrt{\pi D t}}\int_{-\infty}^{\infty} g_0(\xi)\,e^{-\frac{(x-\xi)^2}{4Dt}}\,d\xi. \qquad (10.82)$$
This is the solution of the Cauchy problem for the heat equation. A useful way
to interpret this result is obtained by the use of the concept of generalized functions,
as we will do in the next section. Figure 10.3 shows a plot of the solution (10.82)
for D = 1 and a bell-shaped initial temperature distribution given by the function
$g_0(x) = e^{-x^2/2}$. The integration has been carried out numerically with the use of
Mathematica. The initial time has been taken somewhat greater than zero, to avoid
a numerical singularity.
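The quadrature in (10.82) can equally well be sketched in Python (our illustration, not the book's Mathematica code). For the Gaussian datum $g_0(x) = e^{-x^2/2}$ the integral even has the closed form $g(x,t) = e^{-x^2/(2(1+2Dt))}/\sqrt{1+2Dt}$, which serves as a check:

```python
import numpy as np

D = 1.0
xi = np.linspace(-30.0, 30.0, 60001)
dxi = xi[1] - xi[0]
g0 = np.exp(-xi**2/2)                     # bell-shaped initial datum

def g(x, t):
    """Direct quadrature of the solution formula (10.82)."""
    kernel = np.exp(-(x - xi)**2/(4*D*t)) / (2*np.sqrt(np.pi*D*t))
    return np.sum(g0*kernel)*dxi

x0, t0 = 1.0, 0.5
exact = np.exp(-x0**2/(2*(1 + 2*D*t0))) / np.sqrt(1 + 2*D*t0)
print(g(x0, t0), exact)                   # agree
```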
With every (locally integrable) function f (x) one can associate a linear functional
φ f on a suitable space of functions F,11 defined by
$$\phi_f[g] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx \qquad \forall g \in \mathcal{F}. \qquad (10.84)$$
It is not difficult to prove (by an argument akin to the so-called fundamental lemma
of the calculus of variations) that the linear functional thus defined determines the
function f uniquely. If this were all that we have to say about generalized functions it
wouldn't be worth our effort. But consider, for example, the following functional on
the given space of functions F: it assigns to each function the value of the function
at the origin. We denote this functional by δ and call it Dirac's delta. More precisely,
$$\delta[g] = g(0) \qquad \forall g \in \mathcal{F}.$$
11 For technical reasons, the space of functions over which these functionals are defined consists of
the so-called space of test functions. Each test function is of class C ∞ and has compact support
(that is, it vanishes outside a closed and bounded subset of R). The graph of a test function can be
described as a smooth ‘bump’.
It is customary to express this functional formally as if it were generated by an
ordinary function δ(x) in the sense of Eq. (10.84), namely,
$$\int_{-\infty}^{\infty} g(x)\,\delta(x)\,dx = g(0) \qquad \forall g \in \mathcal{F}. \qquad (10.86)$$
The Dirac delta can also be viewed as a limit of ordinary functions. Consider, for
instance, the sequence
$$\delta_n(x) = \begin{cases} \dfrac{n}{2} & \text{for } |x| \le \dfrac{1}{n} \\[4pt] 0 & \text{otherwise.} \end{cases}$$
The area under the graph of each of these functions remains thus always equal to 1. As we
calculate the integral of the product of these functions with any given function g, we
obtain
$$\int_{-\infty}^{\infty} g(x)\,\delta_n(x)\,dx = \frac{n}{2}\int_{-1/n}^{1/n} g(x)\,dx = g(\bar{x}). \qquad (10.88)$$
We have used the mean value theorem to replace the integral of a function by the
value of the function at an interior point x̄ times the length 2/n of the interval of
integration. As n increases, the intermediate point gets more and more confined and
eventually becomes the origin, which is the only point common to all the nested
intervals. Thus we recover (10.86) as a limiting case.
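The convergence of (10.88) is easy to observe numerically (our sketch, not from the text): integrating a continuous g against the box functions $\delta_n$ reproduces g(0) in the limit.

```python
import numpy as np

# delta_n(x) = n/2 on [-1/n, 1/n] and zero elsewhere; its action on a
# continuous test function tends to g(0) as n grows.
g = lambda x: np.cos(x) + x**3            # any continuous test function; g(0) = 1

for n in (1, 10, 100, 1000):
    x = np.linspace(-1.0/n, 1.0/n, 10001)
    y = g(x)
    integral = np.sum(0.5*(y[1:] + y[:-1])*np.diff(x))   # trapezoid rule
    val = (n/2.0)*integral
    print(n, val)                         # approaches g(0) = 1
```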
An obvious feature of the Dirac delta function is its filtering or substitution property
$$\int_{-\infty}^{\infty} g(x)\,\delta(x-a)\,dx = g(a). \qquad (10.89)$$
[Figure: the members δ1 (x), . . . , δ5 (x) of a delta-approximating sequence, each of unit area, becoming taller and narrower as n increases.]
Consider now the functional associated, as per Eq. (10.84), with a differentiable
function f , namely,
$$\phi_f[g] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx \qquad \forall g \in \mathcal{F}. \qquad (10.90)$$
The functional associated with its derivative f ′ can be obtained by an integration by parts12 as
$$\phi_{f'}[g] = -\int_{-\infty}^{\infty} g'(x)\,f(x)\,dx = -\phi_f[g'] \qquad \forall g \in \mathcal{F}. \qquad (10.91)$$
The generalized derivative (which we don't bother to indicate with anything but
the conventional symbol for ordinary derivatives) of any distribution φ is accordingly defined by13
$$\phi'[g] = -\phi[g'] \qquad \forall g \in \mathcal{F}.$$
12 We must now use the fact that the function space consists of functions with compact support, so that the boundary terms in the integration by parts vanish.
Consider, as an example, the Heaviside step function H (x), equal to 0 for x < 0 and to 1 for x ≥ 0. Its generalized derivative acts on any test function g as
$$H'[g] = -H[g'] = -\int_{-\infty}^{\infty} g'(x)\,H(x)\,dx = -\int_{0}^{\infty} g'(x)\,dx = g(0). \qquad (10.94)$$
We see that the action of the derivative of the Heaviside function is identical to the
action of the Dirac delta function on each and every function of the original function
space. We conclude, therefore, that the derivative of the Heaviside function is the
Dirac delta function.
Let us go back to our solution of the heat equation as expressed in Eq. (10.82),
and let us assume that our initial condition g0 (x) is not a function but a distribution.
In particular, let us consider the case
$$g_0(x) = \delta(x - a).$$
The physical meaning of such an initial condition is that at the initial time we placed
a concentrated source of heat at the point x = a. This interpretation is clearly
contained in the conception of the Dirac function as a limit of a sequence of ordinary
functions, as we have demonstrated above. Indeed, the functions in this sequence
vanish everywhere except on an ever smaller interval around
that point. When we plug this initial condition in the general solution of the Cauchy
problem, we obtain:
$$g_a(x, t) = \frac{1}{2\sqrt{\pi D t}}\int_{-\infty}^{\infty} \delta(\xi - a)\,e^{-\frac{(x-\xi)^2}{4Dt}}\,d\xi = \frac{e^{-\frac{(x-a)^2}{4Dt}}}{2\sqrt{\pi D t}}. \qquad (10.96)$$
The meaning of the expression in the right-hand side of this equation is, therefore,
the temperature distribution, as a function of time and space, in an infinite rod which
has been subjected to a concentrated unit source of heat at the point x = a at time
t = 0. This is thus some kind of “influence function” (of the same type that used to
be studied in structural engineering in the old days for bridge design). In the context
of the theory of differential equations, these functions (representing the effect due to
a unit concentrated cause at an arbitrary position) are called Green’s functions. The
usefulness of Green’s functions is that, because the differential equation of departure
is linear, we can conceive of the solution as simply a superposition of the effects
of the concentrated unit sources. This interpretation is borne out by the following
equation
$$g(x, t) = \int_{-\infty}^{\infty} g_0(a)\,g_a(x, t)\,da, \qquad (10.97)$$
which is the same as Eq. (10.82). This calculation shows that, if we have any means
(exact or approximate) to calculate Green’s function for a particular differential equa-
tion (perhaps with some boundary conditions), then we have a recipe for constructing
solutions by superposition integrals.
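A simple numerical check (ours, not in the text) of the "unit source" interpretation: the Green's function (10.96) carries unit heat content for every t > 0, as befits a concentrated unit source at x = a.

```python
import numpy as np

D, a = 1.0, 2.0
x = np.linspace(a - 40.0, a + 40.0, 80001)
dx = x[1] - x[0]

for t in (0.01, 0.1, 1.0, 10.0):
    # Green's function (10.96): heat kernel centred at x = a
    ga = np.exp(-(x - a)**2/(4*D*t)) / (2*np.sqrt(np.pi*D*t))
    print(t, np.sum(ga)*dx)     # total heat content = 1 for every t
```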
Consider once more the Cauchy problem14 for the inhomogeneous heat equation,
$$g_t - D\,g_{xx} = f(x, t) \qquad -\infty < x < \infty,\ t > 0, \qquad (10.98)$$
with the initial condition
$$g(x, 0) = g_0(x) \qquad -\infty < x < \infty. \qquad (10.99)$$
As we have already discovered in Sect. 8.9 when dealing with the wave equation,
Duhamel’s principle constructs the solution of non-homogeneous problems out of a
clever continuous superposition of solutions of homogeneous problems, for which
the solution is assumed to be known (either exactly or approximately). We remark that
we only need to solve the stated problem (10.98) for homogeneous initial conditions,
i.e., for
$$g(x, 0) = 0 \qquad -\infty < x < \infty. \qquad (10.100)$$
Indeed, if we regard the homogeneous equation with the original initial conditions (10.99) as solved, and if we call its solution ḡ(x, t),
then the function g(x, t) − ḡ(x, t) satisfies Eq. (10.98) with the homogeneous initial
conditions (10.100). In other words, if we solve Eq. (10.98) with homogeneous initial
conditions, all we have to do to obtain the required solution is to add the solution of
the homogeneous equation with the original inhomogeneous initial conditions.
In view of our newly acquired familiarity with the Dirac distribution, we may
motivate Duhamel’s principle by viewing the right-hand side of Eq. (10.98), repre-
senting the forcing function, as an infinite superposition of pulses in the form
$$f(x, t) = \int_{-\infty}^{\infty} f(x, \tau)\,\delta(t - \tau)\,d\tau. \qquad (10.102)$$
For this integral to make sense, we are tacitly extending the given forcing function
as zero over the interval (−∞, 0). At any rate, if t > 0, the lower limit of the
integral can be changed to 0. The mathematical expression (10.102) can be seen as
the counterpart of the graphical representation given in Box 8.2.
Assume that we are able to solve, for each fixed τ > 0, the inhomogeneous problem
consisting of a single pulse at time τ , namely,
$$\frac{\partial g(x, t; \tau)}{\partial t} - D\,\frac{\partial^2 g(x, t; \tau)}{\partial x^2} = f(x, \tau)\,\delta(t - \tau), \qquad (10.103)$$
with identically vanishing initial condition, and let us superpose these solutions in the form
14 Although we are presenting the principle in the context of an infinite rod, the same idea can be applied in more general situations.
$$g(x, t) = \int_{0}^{t} g(x, t; \tau)\,d\tau. \qquad (10.104)$$
The remarkable fact is that we can actually obtain the solution of (10.103) by
means of an initial value problem of the homogeneous equation! To visualize how
this is possible,15 all we need to do is integrate Eq. (10.103) with respect to time
over a small interval (τ − ε, τ + ε). In so doing, and taking into consideration that
g(x, t; τ) vanishes for t < τ, we obtain, in the limit ε → 0,
$$g(x, \tau^{+}; \tau) = f(x, \tau).$$
The meaning of this equation is that the problem of Eq. (10.103) with homogeneous
initial conditions can be replaced with the homogeneous problem (no forcing term),
but with the initial condition
$$g(x, t; \tau)\big|_{t=\tau} = f(x, \tau).$$
Since we have assumed that the solution of the homogeneous problem with arbi-
trary initial conditions is available (for example, by means of Fourier transforms,
as per Eq. (10.82)), we obtain the solution of the original problem by means of
Eq. (10.104).
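Duhamel's principle can be sketched numerically for a single Fourier mode (our illustration, under the assumption of a forcing of the form f(x, t) = h(t) sin(λx)): the transformed problem is the amplitude ODE φ′ = −Dλ²φ + h(t), φ(0) = 0, whose Duhamel superposition solution is $\phi(t) = \int_0^t e^{-D\lambda^2 (t-\tau)}\,h(\tau)\,d\tau$.

```python
import numpy as np

D, lam = 1.0, 2.0
k = D*lam**2
h = np.cos                     # forcing amplitude h(t)

t = 1.5
tau = np.linspace(0.0, t, 200001)
dtau = tau[1] - tau[0]
integrand = np.exp(-k*(t - tau))*h(tau)
# Duhamel superposition integral, evaluated by the trapezoid rule
phi = np.sum(0.5*(integrand[1:] + integrand[:-1]))*dtau

# Exact solution of phi' = -k*phi + cos(t), phi(0) = 0
exact = (k*np.cos(t) + np.sin(t) - k*np.exp(-k*t)) / (1 + k**2)
print(phi, exact)              # agree
```

The continuous superposition of pulse responses thus reproduces the solution of the forced equation exactly.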
Exercises
Exercise 10.1 (Spinal drug delivery)16 A drug has been injected into the spine so
that at time t = 0 it is distributed according to the formula g(x, 0) = c + C sin(πx/L),
where the spine segment under study extends between x = 0 and x = L, and where
c and C are constants. The drug concentration at the ends of the spine segment is
artificially maintained at the value c for all subsequent times. If, in the absence of
any production, T is the time elapsed until the difference between the concentration
at the midpoint of the spine and c reaches one-half of its initial value, calculate the
longitudinal diffusion coefficient D of the drug through the spinal meninges. For
a rough order of magnitude, assume L = 10 mm and T = 3 h. [Hint: verify that
$g(x, t) = c + C\,\sin(\pi x/L)\,e^{-\pi^2 D t/L^2}$ satisfies all the conditions of the problem.]
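A short sketch of the order-of-magnitude estimate (our working, not the book's solution): assuming the separated solution $g(x,t) = c + C\sin(\pi x/L)\,e^{-\pi^2 D t/L^2}$, the midpoint excess over c halves when $e^{-\pi^2 D T/L^2} = 1/2$, giving $D = L^2 \ln 2/(\pi^2 T)$.

```python
import math

L = 10.0            # mm
T = 3*3600.0        # s
# Halving condition exp(-pi^2 D T / L^2) = 1/2 solved for D
D = L**2*math.log(2) / (math.pi**2*T)
print(D)            # roughly 6.5e-4 mm^2/s
```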
Exercise 10.2 (Modified discrete model) Show that if the probability β in the model
of Box 10.1 has a value 0 < β < 0.5, so that the particles have a positive probability
α = 1 − 2β of staying put, the diffusion equation is still recovered, but with a
different value for the diffusion coefficient D.
Exercise 10.3 (Biased discrete diffusion) Let β + and β − = 1−β + represent, respec-
tively, the generally different probabilities for a particle to move to the right or to the
left. Obtain a PDE whose approximation matches the corresponding discrete model.
Propose a physical interpretation.
Exercise 10.4 (Finite domain) Modify the original discrete model so that it can
accommodate a spatial domain of a finite extent. Consider two different kinds of
boundary conditions, as follows: (1) The number of particles at each end of the
domain remains constant. For this to be the case, new particles will have to be
supplied or removed at the ends. (2) The total number of particles is preserved, with
no flux of new particles through the end points of the domain. Implement the resulting
model in a computer code and observe the time behaviour. What is the limit state of
the system for large times under both kinds of boundary conditions?
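One possible implementation of the discrete model asked for in Exercise 10.4 (a minimal sketch of ours, taking β = 1/2 so that the expected counts satisfy $n_i \leftarrow (n_{i-1}+n_{i+1})/2$, with either fixed end values or reflecting, no-flux ends):

```python
import numpy as np

def step(n, bc):
    """One time step of the expected particle counts n_i."""
    new = n.copy()
    new[1:-1] = 0.5*(n[:-2] + n[2:])
    if bc == 'fixed':                     # (1) counts held constant at the ends
        new[0], new[-1] = n[0], n[-1]
    else:                                 # (2) no flux: outgoing moves bounce back
        new[0] = 0.5*n[0] + 0.5*n[1]
        new[-1] = 0.5*n[-1] + 0.5*n[-2]
    return new

n = np.zeros(21)
n[10] = 1000.0                            # all particles initially in the middle
for _ in range(20000):
    n = step(n, 'no-flux')
print(n.round(2))   # tends to the uniform state, 1000/21 per site; total preserved
```

Under fixed-value ends the limit state is instead the linear (steady-state) profile interpolating the end values.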
Exercise 10.5 Prove that if the data (over the three appropriate sides of a rectangular
region) of one solution of the heat equation are everywhere greater than the data of
another solution, then the same holds true for the corresponding solutions at all the
interior points of the region. Moreover, show that if the absolute value of the data at
each point is smaller than that of the data of an everywhere positive solution, then
so is the absolute value at each interior point smaller than the corresponding value
of the (positive) solution at that point.
Exercise 10.6 (Irreversibility) Show that the boundary-initial value problem of the
heat equation for a finite rod cannot in general be solved backward in time for general initial
conditions. [Hint: assume that it can and impose initial conditions that are not of
class C ∞ .]
$$g(x, 0) = x \qquad 0 \le x \le 1.$$
Exercise 10.8 Verify the equivalence between expressions (10.56) and (10.61).
Exercise 10.9 (Convolution) Prove Eq. (10.75). [Hint: apply the inverse Fourier
transform to the right-hand side.]
Exercise 10.10 Solve the Cauchy problem for the non-homogeneous heat equation
Exercise 10.11 (Fourier transform and the wave equation) Apply the Fourier trans-
form method to the solution of the Cauchy problem for the (homogeneous) one-
dimensional wave equation when the initial velocity is zero and the initial defor-
mation of the (infinite) string is given as some function f (x). Compare with the
d’Alembert solution.
Exercise 10.12 The free transverse vibrations of a beam are described by the fourth-
order differential equation (9.80), namely,
$$c^4 u_{xxxx} + u_{tt} = 0,$$
where c is a constant. Use Fourier transforms to solve for the vibrations of an infinite
beam knowing that at time t = 0 the beam is released from rest with a displacement
given by a function u(x, 0) = f (x). Express the result as an integral. [Hint: the
inverse Fourier transform of $\cos(a\lambda^2)$ is $\frac{1}{\sqrt{2a}}\cos\!\left(\frac{x^2}{4a} - \frac{\pi}{4}\right)$.]
Exercise 10.13 (Distributional derivative) Justify the conclusion that the distribu-
tional derivative of the Heaviside function is the Dirac distribution by approximating
the (discontinuous) Heaviside function by means of a sequence of continuous func-
tions including straight transitions with increasingly steeper slopes.
where k is a constant and L is the length of the beam. Determine the load applied
on the beam by calculating the second distributional derivative of the given bending
moment function. Show that, extending the diagram as zero beyond the beam domain,
we also recover the reactions at the supports.
Exercise 10.15 (Influence function in statics) Given a simply supported beam, iden-
tify the influence function ga (x) with the bending moment diagram due to a unit con-
centrated force acting at the point a ∈ [0, L]. Apply Eq. (10.97) carefully to obtain
the bending moment diagram for an arbitrary load g(x). Compare with the standard
answer. Check the case g(x) = constant.
Exercise 10.16 Express the solution of the inhomogeneous problem at the end of
Sect. 10.9 in terms of the notation (10.96).
Exercise 10.17 Construct the solution of the Cauchy problem (10.98) with homo-
geneous boundary conditions by means of Duhamel’s principle. Compare the result
with that of the exercise at the end of Sect. 10.10.
References
1. Courant R, Hilbert D (1962) Methods of mathematical physics, vol I. Interscience, Wiley, New
York
2. Epstein M (2012) The elements of continuum biomechanics. Wiley, London
3. Farlow SJ (1993) Partial differential equations for scientists and engineers. Dover, New York
4. John F (1982) Partial differential equations. Springer, Berlin
5. Petrovsky IG (1954) Lectures on partial differential equations. Interscience, New York
(Reprinted by Dover (1991))
6. Zauderer E (1998) Partial differential equations of applied mathematics, 2nd edn. Interscience,
Wiley, New York
Chapter 11
The Laplace Equation
The Laplace equation is the archetypal elliptic equation. It appears in many applica-
tions when studying the steady state of physical systems that are otherwise governed
by hyperbolic or parabolic operators. Correspondingly, elliptic equations require the
specification of boundary data only, and the Cauchy (initial-value) problem does not
arise. The boundary data of a second-order elliptic operator offer a choice between
two extremes: either the function or its transverse derivative must be specified at the
boundary, but not both independently. From the physical standpoint, this dichotomy
makes perfect sense. Thus, in a body in equilibrium, one expects to specify a support
displacement or the associated reaction force, but not both.
11.1 Introduction
In three dimensions, the Laplace and Poisson equations read, respectively,
$$u_{xx} + u_{yy} + u_{zz} = 0 \qquad (11.3)$$
and
$$u_{xx} + u_{yy} + u_{zz} = f(x, y, z). \qquad (11.4)$$
The flux of the vector v over an oriented surface element d A with (exterior) unit
normal n is obtained by projecting the vector on the normal and multiplying by the
area of the element. In terms of these definitions, therefore, the divergence theorem
establishes that
$$\int_{D} (\operatorname{div} \mathbf{v})\,dV = \int_{\partial D} \mathbf{v} \cdot \mathbf{n}\,dA. \qquad (11.7)$$
1 Notice that we have also used the term harmonic to designate a sinusoidal function of one variable.
We will use this theorem (whose proof can be found in any good textbook of Calculus)
to derive some useful expressions involving the Laplacian operator.
Consider a differentiable function u(x1 , ..., xn ). The gradient of this function is
the vector field ∇u with components
$$\{\nabla u\} = \begin{Bmatrix} \dfrac{\partial u}{\partial x_1} \\ \vdots \\ \dfrac{\partial u}{\partial x_n} \end{Bmatrix}. \qquad (11.8)$$
The divergence of the gradient is, therefore, precisely the Laplacian. We conclude
that the integral of the Laplacian of a scalar field over a domain is equal to the flux
of its gradient through the boundary. In this case, the dot product of the vector field
(i.e., the gradient) with the exterior unit normal is the directional derivative of the
scalar field in the exterior normal direction n. Thus, we can write
$$\int_{D} \nabla^2 u\,dV = \int_{\partial D} \frac{du}{dn}\,dA. \qquad (11.9)$$
Consider next two scalar fields u, v. Applying the divergence theorem to the
product of one field times the gradient of the other, we obtain
$$\int_{\partial D} v\,\frac{du}{dn}\,dA = \int_{D} \left(v\,\nabla^2 u + \nabla u \cdot \nabla v\right) dV. \qquad (11.10)$$
Setting, in particular, v = u, we obtain
$$\int_{\partial D} u\,\frac{du}{dn}\,dA = \int_{D} \left(u\,\nabla^2 u + \nabla u \cdot \nabla u\right) dV. \qquad (11.12)$$
Suppose now that the function u is harmonic over the domain D and that it
vanishes on its boundary ∂D. From Eq. (11.12) it will follow that the integral of
the square of the magnitude of the gradient must vanish. But for a continuous and
non-negative function this is possible only if the gradient vanishes identically within
the domain. In other words, all its partial derivatives vanish identically, so that the
function must, in fact, be a constant. But since the function has been assumed to
vanish over the boundary of the domain, we conclude that it must vanish over the
whole of D. This rather trivial observation implies the uniqueness of the solution of
the so-called Dirichlet problem defined as follows.
Dirichlet problem: Find a solution u of the Poisson equation ∇ 2 u = f over a
domain D with prescribed values of u on the boundary ∂D. The proof of uniqueness2
is straightforward. Indeed, assume that there exist two solutions to this problem. Since
the Poisson equation is linear, the difference between these two solutions must be
harmonic (i.e., must satisfy the Laplace equation) and attain a zero value over the
whole boundary. It follows from our previous reasoning that this difference must be
identically zero, so that both solutions coincide.
Consider now the case in which u is harmonic over the domain D and that its
normal derivative (rather than the function itself) vanishes on the boundary ∂D.
Again, by applying Eq. (11.12), we arrive at the conclusion that u must be a constant.
Nevertheless, in this case, we can no longer conclude that it must vanish. We thus
obtain the following statement about the solution of the so-called Neumann problem.3
Neumann problem: Find a solution of the Poisson equation ∇ 2 u = f over a
domain D with prescribed values of the normal derivative on ∂D. A solution of this
problem is determined uniquely to within an additive constant. Moreover, according
to Eq. (11.9), the solution can only exist if the boundary data satisfy the auxiliary
condition
$$\int_{\partial D} \frac{du}{dn}\,dA = \int_{D} f\,dV. \qquad (11.13)$$
Remark 11.1 Intuitively, we can imagine that the Dirichlet problem corresponds to
the specification of displacements, while the Neumann problem corresponds to the
specification of boundary tractions in an elastic structure. This explains why the
Neumann problem requires the satisfaction of an auxiliary condition: The tractions
must be in equilibrium with the applied body forces. If they are not, a solution cannot
exist within the realm of statics. The dynamic problem is, of course, governed by
hyperbolic equations, such as the wave equation, which necessitate the specification
of initial displacements and velocities and which include the forces of inertia.
Notice that, since the specification of the value of the solution on the boundary is
enough to determine a unique solution (if such a solution indeed exists), we cannot
2 Strictly speaking, this proof of uniqueness requires the solution to be twice differentiable not just
in the interior but also on the boundary of the domain. This requirement can be relaxed if the proof
is based on the maximum–minimum principle, which we shall study below.
3 Named after the German mathematician Carl Gottfried Neumann (1832–1925), not to be confused with the Hungarian-American mathematician John von Neumann.
simultaneously specify both the function and its normal derivative on the boundary. In
other words, the Cauchy problem for the Laplace equation has no solution in general.
This fact is in marked contrast with hyperbolic equations. It can be shown that the
Cauchy problem for the Laplace equation is in general unsolvable even locally.4 We
have already seen that a solution of the heat equation must be of class C ∞ . In the
case of Laplace's equation, the complete absence of characteristic directions leads
one to guess that perhaps this will also be the case, since no discontinuities can be
tolerated. It can be shown, in fact, that bounded solutions of Laplace’s equation must
be not just C ∞ , but also (real) analytic (i.e., they must have convergent Taylor-series
expansions in an open neighbourhood of every point).
Theorem (Maximum–minimum principle) A function u, harmonic in the interior of a bounded domain D and continuous on D ∪ ∂D, attains its maximum and minimum values on the boundary ∂D.
Proof The proof can be carried out along similar lines as in the case of the parabolic
equation. Since the boundary ∂D is a closed and bounded set, the restriction of u
to ∂D must attain a maximum value m at some point of ∂D. On the other hand,
since D ∪ ∂D is also closed and bounded, the function must attain its maximum
M at some point P of D ∪ ∂D. Let us assume that P is an interior point and that,
moreover, M > m. Without any loss of generality, we may assume that the origin of
coordinates is at P. Let us now construct the auxiliary function
$$v = u + \frac{M - m}{2d^2}\,r^2. \qquad (11.14)$$
In this expression, r denotes the length of the position vector and d is the diameter
of D.6 This function v is strictly larger than u, except at P, where they have the
same value, namely M. The restriction of v to ∂D, on the other hand, will satisfy the
inequality
$$v \le m + \frac{M - m}{2} < M. \qquad (11.15)$$
We conclude that v attains its maximum at an interior point. On the other hand,
4 See [3], p. 98. Notice that, correspondingly, the Dirichlet problem has in general no solution for
the hyperbolic and parabolic cases, since the specification of the solution over the whole boundary
of a domain will in general lead to a contradiction. For this point see [2], p. 236.
5 See [4], p. 169.
6 Recall that the diameter of a set (in a metric space) is the least upper bound of the distances between any two of its points.
$$\nabla^2 v = \nabla^2 u + 2\,\frac{M - m}{d^2} = 2\,\frac{M - m}{d^2} > 0. \qquad (11.16)$$
Since, however, at an interior point of a domain any maximum of a differentiable
function must be a relative maximum, none of the second partial derivatives can be
positive, making the satisfaction of Eq. (11.16) impossible. Having arrived at this
contradiction, we conclude that the maximum of u is attained at the boundary ∂D.
Changing u to −u, we can prove that the same is true for the minimum value.
Just as in the case of the parabolic heat equation, we can prove as corollaries of
the maximum-minimum theorem the uniqueness and continuous dependence on the
boundary data of the Dirichlet problem.
A nice intuitive visualization of the maximum-minimum principle can be gathered
from the case of a membrane (or a soap film) extended over a rigid closed frame. If we
give the initially plane frame a small transverse deformation (i.e., a warping), we do
not expect the membrane to bulge either upwards or downwards beyond the frame,
unless external forces are applied. A similar intuitive interpretation can be stated in
the realm of the thermal steady state over some plane region. The temperature attains
its maximum and minimum values at the boundaries of the region.
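The membrane picture can be probed numerically (our illustration, not from the text): solving the Laplace equation on a square grid by Jacobi iteration, where each interior value becomes the mean of its four neighbours, the interior extrema never exceed those of the boundary data.

```python
import numpy as np

N = 41
u = np.zeros((N, N))

# A "warped frame": arbitrary boundary data on the four sides
s = np.linspace(0.0, 1.0, N)
u[0, :], u[-1, :] = np.sin(3*s), s**2
u[:, 0], u[:, -1] = s*(1 - s), np.cos(5*s)

for _ in range(5000):   # Jacobi iteration for the discrete Laplace equation
    u[1:-1, 1:-1] = 0.25*(u[:-2, 1:-1] + u[2:, 1:-1]
                          + u[1:-1, :-2] + u[1:-1, 2:])

boundary = np.concatenate([u[0, :], u[-1, :], u[:, 0], u[:, -1]])
print(u[1:-1, 1:-1].max() <= boundary.max())   # True: no interior bulge upward
print(u[1:-1, 1:-1].min() >= boundary.min())   # True: no interior bulge downward
```

The discrete averaging property is precisely what forbids an interior extremum, mirroring the continuous proof.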
There are many ways to tackle the difficult problem of solving the Laplace and
Poisson equations in some degree of generality. A useful result towards this end is
obtained by investigating the possible existence of spherically symmetric solutions.
Let P ∈ Rn be a point with coordinates x̄1 , ..., x̄n and let r = r (x1 , ..., xn ) denote
the distance function to P, namely,
$$r = +\sqrt{\sum_{j=1}^{n} \left(x_j - \bar{x}_j\right)^2}. \qquad (11.17)$$
We investigate the possible existence of solutions that depend on position only through this distance, that is, of the form
$$u = g(r). \qquad (11.18)$$
If this is the case, it is not difficult to calculate its Laplacian. Indeed, denoting by
primes the derivatives of g with respect to its independent variable r , we obtain
$$\frac{\partial u}{\partial x_k} = g'\,\frac{\partial r}{\partial x_k} = g'\,\frac{x_k - \bar{x}_k}{r}. \qquad (11.19)$$
Differentiating once more and summing over k, we obtain
$$\nabla^2 u = g'' + \frac{n-1}{r}\,g'. \qquad (11.21)$$
If we wish to satisfy Laplace’s equation, therefore, we are led to the solution of a
simple linear ODE. Specifically,
$$g'' + \frac{n-1}{r}\,g' = 0. \qquad (11.22)$$
The solution, which exists on the whole of Rn , except at P (where it becomes
unbounded), is given by
$$g = \begin{cases} A + \dfrac{B}{r^{\,n-2}} & \text{if } n > 2 \\[6pt] A + B \ln r & \text{if } n = 2, \end{cases} \qquad (11.23)$$
where A and B are arbitrary constants. Notice that in the particularly important case
n = 3 the solution is a linear function of the reciprocal distance (from the physical
point of view, this corresponds to the electrostatic or gravitational potentials of a
concentrated charge or mass).
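A symbolic check (ours, not in the text) that the radial profiles of (11.23) are indeed harmonic away from the pole, for the cases n = 3 and n = 2:

```python
import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)

# n = 3: the reciprocal distance 1/r
r = sp.sqrt(x**2 + y**2 + z**2)
u = 1/r
lap = sp.diff(u, x, 2) + sp.diff(u, y, 2) + sp.diff(u, z, 2)
print(sp.simplify(lap))   # -> 0

# n = 2: the role of 1/r is played by ln r
v = sp.log(sp.sqrt(x**2 + y**2))
lap2 = sp.diff(v, x, 2) + sp.diff(v, y, 2)
print(sp.simplify(lap2))  # -> 0
```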
To reveal the meaning of the solution (11.23), let us consider the following (three-
dimensional) problem associated with a sphere of radius ε with centre at P. We look
for a bounded C 1 spherically-symmetric function u(x, y, z) vanishing at infinity and
satisfying the Poisson equation
$$\nabla^2 u = \frac{1}{\tfrac{4}{3}\pi\varepsilon^3} \qquad r \le \varepsilon, \qquad (11.24)$$
Just as before, the assumed spherical symmetry allows us to reduce this problem to
that of an ODE, namely, setting u = g(r ; ε),
$$g''(r; \varepsilon) + \frac{2}{r}\,g'(r; \varepsilon) = \begin{cases} \dfrac{3}{4\pi\varepsilon^3} & \text{if } r \le \varepsilon \\[6pt] 0 & \text{if } r > \varepsilon. \end{cases} \qquad (11.26)$$
A remarkable feature of the solution just found is that the solution outside the
sphere is independent of the size of the sphere.7 As the radius of the sphere tends
to zero, the right-hand side of Eqs. (11.24)–(11.25) approaches δ_P, the (three-dimensional)
Dirac delta function at P. This means that, with A = 0 and with the value of B
appropriately calibrated (for each dimension), Eq. (11.23) provides the solution of
the problem
∇2u = δP , (11.28)
with zero boundary condition at infinity. We call this a fundamental solution of the
Laplace equation with pole P. This solution depends both on the coordinates x j
of the variable point in space and the coordinates x̄ j of point P. It is sometimes
convenient to express this double dependence explicitly with the notation K (x j , x̄ j ).
The explicit formulas for the fundamental solution in an arbitrary dimension can be
obtained in a similar way.8
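The calibration of the constant B can be checked numerically. With B = −1/4π in three dimensions (footnote 8), the flux of the gradient of K through any sphere centred at the pole equals 1, the total strength of the concentrated source δ_P. A minimal Python sketch (illustrative):

```python
import math

B = -1.0 / (4.0 * math.pi)   # calibration constant for n = 3 (see footnote 8)
K = lambda r: B / r          # fundamental solution K = A + B/r with A = 0

def flux(R, h=1e-6):
    """Flux of grad K through the sphere r = R.

    By spherical symmetry the normal (radial) derivative is constant on the
    sphere, so the flux is simply 4*pi*R^2 * dK/dr evaluated at r = R.
    """
    dKdr = (K(R + h) - K(R - h)) / (2.0 * h)   # numerical radial derivative
    return 4.0 * math.pi * R * R * dKdr

# Unit flux for every radius: the enclosed 'charge' is 1, independently of
# the sphere chosen -- the property Newton was at pains to prove.
for R in (0.5, 1.0, 2.0, 10.0):
    assert abs(flux(R) - 1.0) < 1e-6
```

The radii are arbitrary; the point is precisely that the result does not depend on them.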
Any solution of Eq. (11.28) in some domain D ⊂ Rn containing P, regardless of
boundary conditions, will also be called a fundamental solution with pole P. Clearly,
if w is harmonic within this domain, the new function K(x_j, x̄_j) + w(x_j) is again
a fundamental solution with pole P.
7 This feature of the solution for the case of the gravitational field was extremely important to
Newton, who was at pains to prove it. It is this property that allowed him to conclude that the forces
exerted by a homogeneous sphere in the empty space surrounding it are unchanged if the total mass
is concentrated at its centre.
8 For the case n = 2, we have B = 1/2π. For n = 3, as we have seen, the value is B = −1/4π.
For higher dimensions, the value of the constant can be shown to be related to the ‘area’ of the
corresponding hyper-sphere, which involves the Gamma function. See [3], p. 96.
In particular, if G(x_j, ξ_j) denotes Green's function of the domain D (the fundamental solution with pole at ξ_j that vanishes on the boundary ∂D), the solution of the Poisson equation ∇²u = f with homogeneous Dirichlet boundary conditions is given by

u(x_j) = \int_D G(x_j, \xi_j) \, f(\xi_j) \, dV_\xi. \qquad (11.31)
The solution of the Dirichlet problem for the Laplace equation with inhomogeneous
boundary conditions can be obtained in a similar way. Indeed, let the boundary
values be given by
values be given by
u|∂D = h(x j ). (11.32)
Let us assume that this function h has enough smoothness that we can extend it
(non-uniquely, of course) to a C ∞ function ĥ defined over the whole domain D.
Then clearly the function

v = u - \hat{h} \qquad (11.33)

vanishes on ∂D and, since u itself is harmonic, satisfies the Poisson equation ∇²v = −∇²ĥ with homogeneous boundary conditions.
In terms of the Green function (assumed to be known for the domain under consid-
eration), the solution of this problem is given by Eq. (11.31) as
v(x_j) = -\int_D G(x_j, \xi_j) \, \nabla^2 \hat{h}(\xi_j) \, dV_\xi. \qquad (11.35)
From Eq. (11.33) we obtain the solution of the original Dirichlet problem as
u(x_j) = \hat{h}(x_j) - \int_D G(x_j, \xi_j) \, \nabla^2 \hat{h}(\xi_j) \, dV_\xi. \qquad (11.36)
This expression would seem to indicate that the solution depends on the particular
extension ĥ adopted. That this is not the case can be deduced by a straightforward
application of Eq. (11.11), appropriately called a Green identity, which yields
\int_D G(x_j, \xi_j) \, \nabla^2 \hat{h}(\xi_j) \, dV_\xi = \hat{h}(x_j) - \int_{\partial D} h(\xi_j) \, \frac{dG(x_j, \xi_j)}{dn} \, dA_\xi. \qquad (11.37)
In obtaining this result, the properties of each of the functions involved were
exploited. Combining Eqs. (11.36) and (11.37), we obtain the final result
u(x_j) = \int_{\partial D} h(\xi_j) \, \frac{dG(x_j, \xi_j)}{dn} \, dA_\xi. \qquad (11.38)
Thus, the solution involves only the values of the data at the boundary, rather than
any extension to the interior. Expression (11.41) is regular at every interior point of
the domain.
Although we have not provided rigorous proofs of any of the preceding theorems
(proofs that can be found in the specialized books),9 we have attempted to present
enough circumstantial evidence to make these results at least plausible. The main
conclusion so far is that to solve a Dirichlet problem over a given domain, whether for
the Laplace or the Poisson equation, can be considered equivalent to finding Green’s
function for that domain. It is important to realize, however, that finding Green’s
function is itself a Dirichlet problem.
Theorem 11.2 (Mean value theorem) The value of a harmonic function u at any
point is equal to the average of its values over the surface of any sphere with centre
at that point.
Proof The theorem is valid for the circle, the sphere or, in the general case, an
n-dimensional ball. For specificity, we will consider the (three-dimensional) case of
a sphere. Let P be a point with coordinates x̂_j in the domain under consideration
and let B denote the (solid) sphere with centre P and radius R. Always keeping
the centre fixed, we apply Green's formula (11.11) to the fundamental
solution K(x_j, x̂_j) and to the harmonic function u(x_j); since the term containing
∇²u vanishes (by hypothesis), we obtain
\int_B u \, \nabla^2 K \, dV = \int_{\partial B} \left( u \, \frac{dK}{dn} - K \, \frac{du}{dn} \right) dA. \qquad (11.39)
But, due to the spherical symmetry of K , both it and its derivative in the normal
direction (which is clearly radial) are constant over the boundary of the ball. Recalling
that, by a direct application of the divergence theorem, the flux of the gradient of a
harmonic function vanishes over the boundary of any domain, we conclude that the
last term under the right-hand side integral vanishes. The normal derivative of K at
the boundary is obtained directly from Eq. (11.23), or (11.27), as
\frac{dK}{dn} = \left. \frac{dK}{dr} \right|_{r=R} = \frac{1}{4\pi R^2}. \qquad (11.40)
Finally, invoking Eq. (11.28) for the fundamental solution K, the left-hand side of Eq. (11.39) equals u(x̂_j), by the sifting property of the Dirac delta. Combining this with Eq. (11.40), we conclude that

u(\hat{x}_j) = \frac{1}{4\pi R^2} \int_{\partial B} u \, dA,

which is precisely the average of u over the surface of the sphere.
Corollary 11.1 The value of a harmonic function at a point is also equal to the
volume average over any ball with centre at that point.
Proof Trivial.
Remark 11.2 It is remarkable that the converse of the mean value theorem also holds.
More specifically, if a continuous function over a given domain satisfies the mean
value formula for every ball contained in this domain, then this function is harmonic
in the domain. For a rigorous proof of this fact, see [1], p. 277.
11.7 Green’s Function for the Circle and the Sphere

It is, in fact, not difficult to construct explicit Green’s functions for a domain which
is a circle, a sphere or, in the general case, an n-dimensional ball of given radius R.
The construction is based on a simple geometric property of circles and spheres. We
digress briefly to show this property in the case of a circle (the extension to three
dimensions is rather obvious, by rotational symmetry).
Given a circle of centre P and radius R, as shown in Fig. 11.1, let Q ≠ P be
an arbitrary internal point. Extending the radius through Q, we place on this line
another point S outside the circle, called the reflected image of Q, according to the
proportion

\frac{\overline{PS}}{\overline{PQ}} = \frac{R^2}{\overline{PQ}^2}. \qquad (11.42)
[Fig. 11.1: a circle with centre P and radius R, an interior point Q, and a point C on the circumference]
It is clear that this point lies outside the circle. Let C be a point on the circumference.
The triangles QPC and CPS are similar, since they share the angle at P and the
sides adjacent to it are in the same ratio. Indeed, by Eq. (11.42),
\frac{\overline{PS}}{\overline{PC}} = \frac{\overline{PC}}{\overline{PQ}}. \qquad (11.43)
It follows that the ratio of the remaining sides must be the same, namely,
\frac{\overline{CS}}{\overline{CQ}} = \frac{\overline{PC}}{\overline{PQ}}. \qquad (11.44)
The right-hand side of this equation is independent of the particular point C chosen
on the circumference. It depends only on the radius of the circle and the radial distance
to the fixed point Q.
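The constancy of the ratio CS/CQ can be verified directly. The Python sketch below (illustrative; the circle and the interior point Q are arbitrary choices) constructs the reflected image S according to Eq. (11.42) and computes the ratio for several points C on the circumference:

```python
import math

R = 1.0                              # circle of radius R centred at P = origin
Q = (0.35, 0.2)                      # an arbitrary interior point, Q != P
pq = math.hypot(*Q)                  # distance PQ
# Reflected image S: on the ray PQ, at distance PS = R^2 / PQ from the centre
S = tuple(c * (R * R / pq**2) for c in Q)

dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])

# CS/CQ should equal PC/PQ = R/PQ for every point C on the circumference
ratios = []
for k in range(12):
    ang = 2.0 * math.pi * k / 12
    C = (R * math.cos(ang), R * math.sin(ang))
    ratios.append(dist(C, S) / dist(C, Q))

assert all(abs(r - R / pq) < 1e-12 for r in ratios)
```

The twelve sample points are arbitrary; the ratio depends only on R and on the radial distance of Q, exactly as stated.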
We use this property to construct Green’s function for the circle (or the sphere) B.
We start by noting that (always keeping Q and, therefore, also S fixed) the fundamen-
tal solution K (X, S), where X denotes an arbitrary point belonging to B, is smooth
in B and its boundary ∂B. This follows from the fact that S is an exterior point.
Moreover, if the point X happens to belong to the boundary, the value of K (X, S) is
given by
n−2
PQ
K (X, S)| X ∈∂B = K (X, Q)| X ∈∂B . (11.45)
R
This result is a direct consequence of Eq. (11.44) and the general formula (11.23) for
n > 2. For n = 2, a similar (logarithmic) formula applies. Therefore, the function

G(X, Q) = K(X, Q) - \left( \frac{\overline{PQ}}{R} \right)^{2-n} K(X, S) \qquad (11.46)

vanishes identically when X ∈ ∂B; being, moreover, harmonic in B except at the pole Q, it is precisely Green's function of the ball with pole Q.
Assume that a Dirichlet problem has been given by specifying the value of the
field on the boundary as a function h(Y ), Y ∈ ∂B. According to Eq. (11.41), the
solution of this problem is given by
u(X) = \int_{\partial B} h(Y) \, \frac{dG(X, Y)}{dn} \, dA_Y. \qquad (11.48)
We need, therefore, to calculate the normal (i.e., radial) derivative of Green’s func-
tions (just derived) at the boundary of the ball. When this is done carefully, the
result is
\left. \frac{dG(X, Y)}{dn} \right|_Y = H(X, Y) = \frac{1}{4\pi R} \, \frac{R^2 - \overline{PX}^{\,2}}{\overline{XY}^{\,3}}. \qquad (11.49)

This is the formula for the sphere. For the circle, the factor 4πR is replaced by 2πR
and the exponent of XY is 2 rather than 3. Equation (11.48), with the use of (11.49), is known as Poisson's formula. It
solves the general Dirichlet problem for a ball.
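Poisson's formula lends itself to direct numerical evaluation. The following Python sketch (illustrative) applies the two-dimensional version, Eq. (11.50), to the boundary data h(ψ) = cos ψ on the unit circle, whose exact harmonic extension is u(ρ, θ) = ρ cos θ:

```python
import math

R = 1.0
h = lambda psi: math.cos(psi)   # boundary data; exact solution u = (rho/R) cos(theta)

def poisson_disk(rho, theta, n=2000):
    """Evaluate Eq. (11.50), Poisson's formula for the disk, by the midpoint rule."""
    total = 0.0
    for k in range(n):
        psi = 2.0 * math.pi * (k + 0.5) / n
        kernel = (R * R - rho * rho) / (
            R * R + rho * rho - 2.0 * R * rho * math.cos(theta - psi))
        total += kernel * h(psi)
    return total / n   # the d(psi) = 2*pi/n weight and the 1/(2*pi) prefactor cancel

# Compare with the exact harmonic extension at a few interior points
for rho, theta in ((0.0, 0.0), (0.5, 1.0), (0.9, 2.5)):
    exact = (rho / R) * math.cos(theta)
    assert abs(poisson_disk(rho, theta) - exact) < 1e-6
```

Since the integrand is smooth and periodic, the equally spaced rule converges very rapidly; the agreement at ρ = 0 also recovers the mean value theorem, the kernel there being identically 1.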
Exercises
Exercise 11.1 Show that under an orthogonal change of coordinates, the Laplacian
retains the form given in Eq. (11.5). If you prefer to do so, work in a two-dimensional
setting.
Exercise 11.2 Express the Laplacian in cylindrical coordinates (or polar coordi-
nates, if you prefer to work in two dimensions).
Exercise 11.3 Obtain Eqs. (11.9), (11.10) and (11.11). Show, moreover, that the flux
of the gradient of a harmonic function over the boundary of any bounded domain
vanishes.
Exercise 11.4 A membrane is extended between two horizontal concentric rigid
rings of 500 and 50 mm radii. If the inner ring is displaced vertically upward by an
amount of 100 mm, find the resulting shape of the membrane. Hint: Use Eq. (11.23).
Exercise 11.5 Carry out the calculations leading to (11.27). Make sure to use each of
the assumptions made about the solution. Find the solution for the two-dimensional
case.
Exercise 11.6 Carry out and justify all the steps necessary to obtain Eq. (11.38).
Exercise 11.7 What is the value of G(X, P)? How is this value reconciled with
Eq. (11.46)?
Exercise 11.8 Adopting polar coordinates in the plane, with origin at the centre of
the circle, and denoting the polar coordinates of X by ρ, θ and those of Y (at the
boundary) by R, ψ, show that the solution to Dirichlet's problem is given by the
expression

u(\rho, \theta) = \frac{1}{2\pi} \int_0^{2\pi} \frac{R^2 - \rho^2}{R^2 + \rho^2 - 2R\rho \cos(\theta - \psi)} \, h(\psi) \, d\psi, \qquad (11.50)

where h(ψ) represents the boundary data. Use Eqs. (11.48) and (11.49) (in their
two-dimensional, circle versions).
References
1. Courant R, Hilbert D (1962) Methods of mathematical physics, vol II. Interscience, Wiley,
New York
2. Garabedian PR (1964) Partial differential equations. Wiley, New York
3. John F (1982) Partial differential equations. Springer, Berlin
4. Petrovsky IG (1991) Lectures on partial differential equations. Dover, New York
5. Sobolev SL (1989) Partial differential equations of mathematical physics. Dover, New York
Index

G
General integral, 109
Generalized function, 230
Geometric compatibility conditions, 123
Glissando, 174
Gradient, 11
Green’s function, 233, 246
Green’s identities, 241
Group property, 17
Growth, 174

H
Hadamard’s lemma, 121
Heat
  capacity, 213
  equation, 211, 213
  specific, 213
Heat equation, 38

M
Manifold, 8
Mass conservation, 44
Material derivative, 43
Maximum-minimum theorem, 216, 243
Mean value theorem, 248
Modal matrix, 134
Monge cone, 92

N
Natural base vectors, 9
Natural frequency, 186
Neumann boundary condition, 205
Neumann problem, 242
Normal forms, 129
Normal mode, 186

S
Scalar field, 11
Separatrix, 23
Shear waves in beams, 137
Slinky, 175
Smooth, 7
Solitons, 39
Spectral coefficients, 216
Spectrum, 190
Spherical symmetry, 244
Stability, 171

V
Variation of constants, 177
Vector field, 9, 12
Vibrating string, 157

W
Wave amplitude vector, 140
Wave breaking, 65
Wave equation, 39
Wave front, 123
Weak singularity, 123