
Mathematical Engineering

Marcelo Epstein

Partial
Differential
Equations
Mathematical Techniques for Engineers
Mathematical Engineering

Series editors
Jörg Schröder, Essen, Germany
Bernhard Weigand, Stuttgart, Germany
Today, the development of high-tech systems is unthinkable without mathematical
modeling and analysis of system behavior. As such, many fields in the modern
engineering sciences (e.g. control engineering, communications engineering,
mechanical engineering, and robotics) call for sophisticated mathematical methods
in order to solve the tasks at hand.
The series Mathematical Engineering presents new or heretofore little-known
methods to support engineers in finding suitable answers to their questions,
presenting those methods in such a manner as to make them ideally comprehensible
and applicable in practice.
Therefore, the primary focus is—without neglecting mathematical accuracy—on
comprehensibility and real-world applicability.
To submit a proposal or request further information, please use the PDF Proposal
Form or contact directly: Dr. Jan-Philip Schmidt, Publishing Editor (jan-philip.
[email protected]).

More information about this series at http://www.springer.com/series/8445


Marcelo Epstein

Partial Differential Equations


Mathematical Techniques for Engineers

Marcelo Epstein
Department of Mechanical
and Manufacturing Engineering
University of Calgary
Calgary, AB
Canada

ISSN 2192-4732 ISSN 2192-4740 (electronic)


Mathematical Engineering
ISBN 978-3-319-55211-8 ISBN 978-3-319-55212-5 (eBook)
DOI 10.1007/978-3-319-55212-5
Library of Congress Control Number: 2017934212

© Springer International Publishing AG 2017


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt
from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To Ruth, Tamar and David,
my children, my builders.
Preface

A course on Mathematical Techniques for Engineers does not have a well-defined


scope. What the so-called mathematical techniques needed by a well-trained
engineer might be is clearly a matter of controversy. There is a direct relation
between the mathematics one knows and the mathematics one is likely to use. In
other words, it is often the case that the knowledge leads to the usage, rather than
the other way around, as many believe. Why is it so? Because if you do not know a
mathematical concept (the notion of characteristic lines, for example) you are
unlikely to realize that you may need it (to describe shock waves or traffic flow,
say), no matter how long you witness the phenomena (sonic booms, traffic jams) or
how smart you are.
The question, therefore, is not so much what to include in a course of this nature,
but why one should leave out entire mathematical sub-disciplines (graph theory,
topology, functional analysis, and so on). It has become a tradition, however, in most
engineering schools to expect that engineering students be exposed to at least one
course on partial differential equations (PDEs), these being the backbone of various
fundamental disciplines (solid mechanics, fluid mechanics, thermodynamics, electromagnetism, control of systems with distributed parameters, gravitation, etc.).
There are many excellent, even outstanding, texts and treatises on PDEs at a
variety of levels. On the other hand, when a few years ago I was given the task of
lecturing a graduate course on Mathematical Techniques for Engineers, a course
that I am still in charge of, I found it both convenient and necessary to develop a set
of class notes that would serve as a common foundation while letting each student
find the book or books best suited to his or her style of learning and depth of
interest. This policy has been amply rewarded by comments from the students
themselves over the years. In publishing these notes, barely edited so as to preserve
some of the freshness of a class environment, I hope that engineering students in
other institutions may find in them some intellectual stimulus and enjoyment.

Calgary, AB, Canada Marcelo Epstein


2017

Contents

Part I Background
1 Vector Fields and Ordinary Differential Equations . . . . . . . . . . . . . 3
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Curves and Surfaces in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.1 Cartesian Products, Affine Spaces . . . . . . . . . . . . . . . . 4
1.2.2 Curves in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.3 Surfaces in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 The Divergence Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.1 The Divergence of a Vector Field . . . . . . . . . . . . . . . . 9
1.3.2 The Flux of a Vector Field over an Orientable
Surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.3 Statement of the Theorem . . . . . . . . . . . . . . . . . . . . . . 11
1.3.4 A Particular Case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4 Ordinary Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.4.1 Vector Fields as Differential Equations . . . . . . . . . . . . 12
1.4.2 Geometry Versus Analysis . . . . . . . . . . . . . . . . . . . . . 13
1.4.3 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.4.4 Autonomous and Non-autonomous Systems . . . . . . . . 16
1.4.5 Higher-Order Equations . . . . . . . . . . . . . . . . . . . . . . . . 17
1.4.6 First Integrals and Conserved Quantities . . . . . . . . . . . 18
1.4.7 Existence and Uniqueness . . . . . . . . . . . . . . . . . . . . . . 21
1.4.8 Food for Thought . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2 Partial Differential Equations in Engineering . . . . . . . . . . . . . . . . . . 25
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2 What is a Partial Differential Equation? . . . . . . . . . . . . . . . . . . . 26
2.3 Balance Laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3.1 The Generic Balance Equation . . . . . . . . . . . . . . . . . . 28


2.3.2 The Case of Only One Spatial Dimension . . . . . . . . . . 31


2.3.3 The Need for Constitutive Laws . . . . . . . . . . . . . . . . . 34
2.4 Examples of PDEs in Engineering . . . . . . . . . . . . . . . . . . . . . . . 36
2.4.1 Traffic Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.4.2 Diffusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.4.3 Longitudinal Waves in an Elastic Bar . . . . . . . . . . . . . 38
2.4.4 Solitons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.4.5 Time-Independent Phenomena . . . . . . . . . . . . . . . . . . . 40
2.4.6 Continuum Mechanics . . . . . . . . . . . . . . . . . . . . . . . . . 41
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

Part II The First-Order Equation


3 The Single First-Order Quasi-linear PDE . . . . . . . . . . . . . . . . . . . . . 51
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2 Quasi-linear Equation in Two Independent Variables . . . . . . . . . 53
3.3 Building Solutions from Characteristics . . . . . . . . . . . . . . . . . . . 56
3.3.1 A Fundamental Lemma . . . . . . . . . . . . . . . . . . . . . . . . 56
3.3.2 Corollaries of the Fundamental Lemma . . . . . . . . . . . . 57
3.3.3 The Cauchy Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.3.4 What Else Can Go Wrong? . . . . . . . . . . . . . . . . . . . . . 60
3.4 Particular Cases and Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.4.1 Homogeneous Linear Equation . . . . . . . . . . . . . . . . . . 61
3.4.2 Non-homogeneous Linear Equation . . . . . . . . . . . . . . . 62
3.4.3 Quasi-linear Equation . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.5 A Computer Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4 Shock Waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.1 The Way Out . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.2 Generalized Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.3 A Detailed Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.4 Discontinuous Initial Conditions . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.4.1 Shock Waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.4.2 Rarefaction Waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5 The Genuinely Nonlinear First-Order Equation . . . . . . . . . . . . . . . . 89
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.2 The Monge Cone Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3 The Characteristic Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.4 Recapitulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.5 The Cauchy Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.6 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.7 More Than Two Independent Variables . . . . . . . . . . . . . . . . . . . 101

5.7.1 Quasi-linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . 101


5.7.2 Non-linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.8 Application to Hamiltonian Systems . . . . . . . . . . . . . . . . . . . . . . 105
5.8.1 Hamiltonian Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.8.2 Reduced Form of a First-Order PDE . . . . . . . . . . . . . . 106
5.8.3 The Hamilton–Jacobi Equation . . . . . . . . . . . . . . . . . . 107
5.8.4 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

Part III Classification of Equations and Systems


6 The Second-Order Quasi-linear Equation . . . . . . . . . . . . . . . . . . . . . 115
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
6.2 The First-Order PDE Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.3 The Second-Order Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
6.4 Propagation of Weak Singularities . . . . . . . . . . . . . . . . . . . . . . . 121
6.4.1 Hadamard’s Lemma and Its Consequences . . . . . . . . . 121
6.4.2 Weak Singularities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
6.4.3 Growth and Decay. . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.5 Normal Forms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7 Systems of Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
7.1 Systems of First-Order Equations . . . . . . . . . . . . . . . . . . . . . . . . 131
7.1.1 Characteristic Directions . . . . . . . . . . . . . . . . . . . . . . . 131
7.1.2 Weak Singularities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
7.1.3 Strong Singularities in Linear Systems . . . . . . . . . . . . 134
7.1.4 An Application to the Theory of Beams . . . . . . . . . . . 135
7.1.5 Systems with Several Independent Variables . . . . . . . . 137
7.2 Systems of Second-Order Equations . . . . . . . . . . . . . . . . . . . . . . 140
7.2.1 Characteristic Manifolds . . . . . . . . . . . . . . . . . . . . . . . 140
7.2.2 Variation of the Wave Amplitude . . . . . . . . . . . . . . . . 142
7.2.3 The Timoshenko Beam Revisited . . . . . . . . . . . . . . . . 144
7.2.4 Air Acoustics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
7.2.5 Elastic Waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

Part IV Paradigmatic Equations


8 The One-Dimensional Wave Equation . . . . . . . . . . . . . . . . . . . . . . . . 157
8.1 The Vibrating String . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
8.2 Hyperbolicity and Characteristics . . . . . . . . . . . . . . . . . . . . . . . . 158
8.3 The d’Alembert Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
8.4 The Infinite String . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8.5 The Semi-infinite String . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

8.5.1 D’Alembert Solution . . . . . . . . . . . . . . . . . . . . . . . . . . 163


8.5.2 Interpretation in Terms of Characteristics . . . . . . . . . . 165
8.5.3 Extension of Initial Data . . . . . . . . . . . . . . . . . . . . . . . 167
8.6 The Finite String. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
8.6.1 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
8.6.2 Uniqueness and Stability . . . . . . . . . . . . . . . . . . . . . . . 171
8.6.3 Time Periodicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.7 Moving Boundaries and Growth . . . . . . . . . . . . . . . . . . . . . . . . . 174
8.8 Controlling the Slinky? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
8.9 Source Terms and Duhamel’s Principle . . . . . . . . . . . . . . . . . . . 177
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
9 Standing Waves and Separation of Variables . . . . . . . . . . . . . . . . . . 183
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
9.2 A Short Review of the Discrete Case . . . . . . . . . . . . . . . . . . . . . 184
9.3 Shape-Preserving Motions of the Vibrating String . . . . . . . . . . . 189
9.4 Solving Initial-Boundary Value Problems by Separation
of Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
9.5 Shape-Preserving Motions of More General Continuous
Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
9.5.1 String with Variable Properties . . . . . . . . . . . . . . . . . . 198
9.5.2 Beam Vibrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
9.5.3 The Vibrating Membrane. . . . . . . . . . . . . . . . . . . . . . . 203
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
10 The Diffusion Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
10.1 Physical Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
10.1.1 Diffusion of a Pollutant . . . . . . . . . . . . . . . . . . . . . . . . 209
10.1.2 Conduction of Heat . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
10.2 General Remarks on the Diffusion Equation . . . . . . . . . . . . . . . . 214
10.3 Separating Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
10.4 The Maximum–Minimum Theorem and Its Consequences . . . . . 216
10.5 The Finite Rod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
10.6 Non-homogeneous Problems. . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
10.7 The Infinite Rod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
10.8 The Fourier Series and the Fourier Integral . . . . . . . . . . . . . . . . 225
10.9 Solution of the Cauchy Problem . . . . . . . . . . . . . . . . . . . . . . . . . 228
10.10 Generalized Functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
10.11 Inhomogeneous Problems and Duhamel's Principle . . . . . . . . . . 234
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
11 The Laplace Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
11.2 Green’s Theorem and the Dirichlet and Neumann Problems. . . . 240
11.3 The Maximum-Minimum Principle . . . . . . . . . . . . . . . . . . . . . . . 243

11.4 The Fundamental Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244


11.5 Green’s Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
11.6 The Mean-Value Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
11.7 Green’s Function for the Circle and the Sphere . . . . . . . . . . . . . 249
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Part I
Background
Chapter 1
Vector Fields and Ordinary Differential
Equations

Although the theory of partial differential equations (PDEs) is not a mere generalization of the theory of ordinary differential equations (ODEs), there are many points of
contact between both theories. An important example of this connection is provided
by the theory of the single first-order PDE, to be discussed in further chapters. For
this reason, the present chapter offers a brief review of some basic facts about systems
of ODEs, emphasizing the geometrical interpretation of solutions as integral curves
of a vector field.

1.1 Introduction

It is not an accident that one of the inventors of Calculus, Sir Isaac Newton (1642–
1727), was also the creator of modern science and, in particular, of Mechanics. When
we compare Kepler’s (1571–1630) laws of planetary motion with Newton’s f = ma,
we observe a clear transition from merely descriptive laws, that apply to a small
number of phenomena, to structural and explanatory laws encompassing almost
universal situations, as suggested in Fig. 1.1. This feat was achieved by Newton,
and later perfected by others, in formulating general physical laws in the small
(differentials) and obtaining the description of any particular global phenomenon by
means of a process of integration (quadrature).
In other words, Newton was the first to propose that a physical law could be
formulated in terms of a system of ordinary differential equations. Knowledge of the
initial conditions (position and velocity of each particle at a given time) is necessary
and sufficient to predict the behaviour of the system for at least some interval of time.
From this primordial example, scientists went on to look for differential equations that
unlock, as it were, the secrets of Nature. When the phenomena under study involve a
continuous extension in space and time one is in the presence of a field theory, such
as is the case of Solid and Fluid Mechanics, Heat Transfer and Electromagnetism.
© Springer International Publishing AG 2017 3
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_1

Fig. 1.1 New science from old. Kepler: “equal areas are swept in equal times”; Newton: “the force is central and proportional to the acceleration”

These phenomena can be described in terms of equations involving the fields and
their partial derivatives with respect to the space and time variables, thus leading
to the formulation of systems of partial differential equations. As we shall see in
this course, and as you may know from having encountered them in applications, the
analysis of these systems is not a mere generalization of the analysis of their ordinary
counterparts. The theory of PDEs is a vast field of mathematics that uses the tools of
various mathematical disciplines. Some of the specialized treatises are beyond the
comprehension of non-specialists. Nevertheless, as with so many other mathematical
areas, it is possible for engineers like us to understand the fundamental ideas at a
reasonable level and to apply the results to practical situations. In fact, most of the
typical differential equations themselves have their origin in engineering problems.

1.2 Curves and Surfaces in Rn

1.2.1 Cartesian Products, Affine Spaces

We denote by R the set of real numbers. Recall the notion of Cartesian product of
two sets, A and B, namely, the set A × B consisting of all ordered pairs of the form
(a, b), where a belongs to A and b belongs to B. More formally,

A × B = {(a, b) | a ∈ A, b ∈ B}. (1.1)



Fig. 1.2 The affine nature of R3 . Properly speaking, the difference q − p is not an element of the original R3 , but we can assign to it a unique element of R3 , shown as a vector at the origin

Note that the Cartesian product is not commutative. Clearly, we can consider the
Cartesian product of more than two sets (assuming associativity). In this spirit we
can define
Rn = R × R × · · · × R (n times). (1.2)

Thus, Rn can be viewed as the set of all ordered n-tuples of real numbers. It has
a natural structure of an n-dimensional vector space (by defining the vector sum
and the multiplication by a scalar in the natural way).1 The space Rn (or, for that
matter, any vector space) can also be seen as an affine space. In an affine space, the
elements are not vectors but points. To every ordered pair of points, p and q, a unique
vector can be assigned in some predefined supporting vector space. This vector is
denoted as pq or, equivalently, as the “difference” q − p. If the space of departure
was already a vector space, we can identify this operation with the vector difference
and the supporting space with the vector space itself, which is what we are going to
do in the case of Rn (see Fig. 1.2). In this sense, we can talk about a vector at the
point p. More precisely, however, each point of Rn has to be seen as carrying its own
“copy” of Rn , containing all the vectors issuing from that point. This is an important
detail. For example, consider the surface of a sphere. This is clearly a 2-dimensional
entity. By means of lines of latitude and longitude, we can identify a portion of this
entity with R2 , as we do in geography when drawing a map (or, more technically,

1 The dot (or inner) product is not needed at this stage, although it is naturally available. Notice, incidentally, that the dot product is not always physically meaningful. For example, the 4-dimensional classical space-time has no natural inner product.

a chart) of a country or a continent. But the vectors tangent to the sphere at a point
p, do not really belong to the sphere. They belong, however, to a copy of the entire
R2 (the tangent plane to the sphere at that point). In the case in which the sphere is
replaced by a plane, matters get simplified (and, at the same time, confused).
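These set-theoretic constructions can be made concrete in a few lines of code; the finite sets and sample points below are assumptions chosen purely for illustration:

```python
from itertools import product

# Finite stand-ins for the sets A and B of Eq. (1.1).
A = ["a1", "a2"]
B = ["b1", "b2", "b3"]

AxB = list(product(A, B))   # all ordered pairs (a, b)
BxA = list(product(B, A))
print(len(AxB))             # 6 pairs
print(AxB != BxA)           # True: the Cartesian product is not commutative

# Points of R^3 as ordered triples, Eq. (1.2); the affine "difference" q - p
# is the vector assigned to the ordered pair of points (p, q).
p = (1.0, 2.0, 3.0)
q = (0.5, 0.0, -1.0)
v = tuple(qi - pi for qi, pi in zip(q, p))
print(v)                    # (-0.5, -2.0, -4.0)
```

The vector v should be pictured as issuing from the point p, i.e., as living in p's own copy of R3 .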

1.2.2 Curves in Rn

Consider now a continuous map (that is, a continuous function)

γ : J → Rn , (1.3)

where J = [t0 , t1 ] (with t1 > t0 ) is a (closed) interval of R, as shown in Fig. 1.3. If we


denote by t the running variable in R, this map can be represented by n continuous
functions
xi = xi (t) i = 1, . . . , n, (1.4)

where xi is the running variable in the i-th copy of R. The map γ is called a parametrized curve in Rn . Since to each point t ∈ J we assign a particular point in Rn , we
can appreciate that the above definition corresponds to the intuitive idea of a one-
dimensional continuous entity in space, namely something with just one “degree of
freedom”. The graph of a parametrized curve (i.e., the collection of all the image
points) is a curve. Notice that the same curve corresponds to an infinite number of

Fig. 1.3 A parametrized curve



parametrized curves. By abuse of terminology, we will usually say “curve” when


actually referring to a parametrized curve.

Remark 1.1 Given a parametrized curve γ : J → Rn , a change of parameter is


obtained by specifying any continuous strictly monotonic function σ : J → R. The
composition γ ◦ σ −1 is a different parametrized curve with the same graph as the
original curve γ. The simplest change of parameter is a shift or translation, given by
the function t → r = σ(t) = t − a, where a is a constant. Similarly, a change of
scale is given by the re-parametrization t → r = σ(t) = At, where A is a positive
constant. In many applications a particular choice of parameter has a clear physical
meaning. In mechanics, for example, the natural parameter is time. In strength of
materials, the natural parameter of (2D) Mohr’s circle is the angle between normals
to planes.
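A minimal numerical sketch of the shift change of parameter (the circle and the shift constant are assumptions for illustration): composing γ with the inverse of σ yields a different parametrized curve with the same graph.

```python
import math

# Sketch of Remark 1.1: a circle traced by two different parametrizations.
def gamma(t):                      # original parametrized curve
    return (math.cos(t), math.sin(t))

a = 1.5                            # shift constant in sigma(t) = t - a

def sigma(t):                      # change of parameter (a shift)
    return t - a

def gamma_new(r):                  # gamma composed with the inverse of sigma
    return gamma(r + a)

# The new curve evaluated at r = sigma(t) hits the same image point as gamma(t):
t = 0.7
p_old = gamma(t)
p_new = gamma_new(sigma(t))
print(max(abs(x - y) for x, y in zip(p_old, p_new)) < 1e-12)  # True
```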

Remark 1.2 A continuous real function y = f (x) of a single variable x can be


regarded as a particular case of a parametrized curve by identifying x with the parameter. The graph of such a function is, clearly, an ordinary curve. But, contrary to the
general case, this curve cannot have self-intersections or be cut by a “vertical” line
in more than one point. Thus, the usual identification of plane curves with functions
of one variable is incorrect.

If each of the functions xi (t) is not just continuous but also differentiable (to some
order), we say that the curve is differentiable (of the same order). We say that a
function is of class C k if it has continuous derivatives up to and including the order
k. If the curve is of class C ∞ , we say that the curve is smooth.
It is often convenient to use a more compact vector notation2 by introducing the
so-called position vector r in Rn , namely, the vector with components x1 , x2 , . . . xn .
A curve is then given by the equation

r = r(t). (1.5)

The tangent vector to a differentiable curve γ at a point p = γ(tp ) is, by definition,


the vector

v = dr(t)/dt |t=tp . (1.6)

The components of the tangent vector are, accordingly, given by



vi = dxi (t)/dt |t=tp . (1.7)
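As a numerical aside, the definition in Eqs. (1.6)–(1.7) can be approximated by central differences. The helix below is an assumed example, not one taken from the text:

```python
import math

# Tangent vector of the helix r(t) = (cos t, sin t, t) at t_p,
# with each component v_i = dx_i/dt taken by a central difference.
def r(t):
    return (math.cos(t), math.sin(t), t)

def tangent(curve, tp, h=1e-6):
    fwd, bwd = curve(tp + h), curve(tp - h)
    return tuple((f - b) / (2 * h) for f, b in zip(fwd, bwd))

tp = 1.0
v = tangent(r, tp)
v_exact = (-math.sin(tp), math.cos(tp), 1.0)   # analytic tangent of the helix
print(max(abs(a - b) for a, b in zip(v, v_exact)) < 1e-8)  # True
```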

2 A luxury that we cannot afford on something like the surface of a sphere, for obvious reasons.

1.2.3 Surfaces in R3

A parametrized surface in R3 is the two-dimensional analog of a parametrized curve,


namely a continuous map
Σ : J1 × J2 → R3 , (1.8)

where J1 and J2 are (closed) intervals of R. If we denote by ξ1 and ξ2 the running


variables in J1 and J2 , respectively, the surface can be represented by the equation

r = r(ξ1 , ξ2 ). (1.9)

The domain of definition of the parameters ξ1 and ξ2 need not be limited to a rectangle, but can be any (closed) connected region in R2 . Higher-order surfaces (or hypersurfaces) can be defined analogously in Rn by considering continuous functions xi = xi (ξ1 , . . . ξn−1 ). More generally, the main object of Differential Geometry
is a differentiable manifold of an arbitrary number of dimensions. An n-dimensional
manifold can be covered with coordinate patches, each of which looks like an open
set in Rn .

Remark 1.3 A continuous real function x3 = f (x1 , x2 ) of two real variables is a


particular case of a parametrized surface, namely, x1 = ξ1 , x2 = ξ2 , x3 = f (ξ1 , ξ2 ).
This surface necessarily has the property that every line parallel to the x3 axis cuts
the graph of the surface at most once.

Keeping one of the coordinates (ξ1 , say) fixed and letting the other coordinate
vary, we obtain a coordinate curve (of the ξ2 kind, say) on the given surface. The
surface can, therefore, be viewed as a one-parameter family of coordinate curves of
one kind or the other. More graphically, the surface can be combed in two ways with
coordinate curves, as illustrated in Fig. 1.4. In the differentiable case, a tangent vector

Fig. 1.4 A parametrized surface



to a coordinate curve is automatically tangent to the surface, namely, it belongs to


the local tangent plane to the surface. These two tangent vectors, therefore, are given
at each point of the surface by

e1 = ∂r/∂ξ1 , e2 = ∂r/∂ξ2 . (1.10)

They constitute a basis for the tangent plane to the surface. They are known as the
natural base vectors associated with the given parameters ξ1 , ξ2 .
The cross product of the natural base vectors provides us, at each point, with a
vector m = e1 × e2 perpendicular to the surface. The equation of the tangent plane
at a point x10 , x20 , x30 of the surface is, therefore,

m1 (x1 − x10 ) + m2 (x2 − x20 ) + m3 (x3 − x30 ) = 0, (1.11)

where m1 , m2 , m3 are the Cartesian components of m.

Remark 1.4 For the particular case of a surface expressed as x3 = f (x1 , x2 ), the
natural base vectors (adopting x1 , x2 as parameters) have the Cartesian components
   
e1 = (1, 0, ∂f /∂x1 ), e2 = (0, 1, ∂f /∂x2 ). (1.12)

A vector m normal to the surface is given by


 
m = e1 × e2 = (−∂f /∂x1 , −∂f /∂x2 , 1), (1.13)

and the equation of the tangent plane at (x10 , x20 , x30 ) can be written as

x3 − x30 = (∂f /∂x1 )(x1 − x10 ) + (∂f /∂x2 )(x2 − x20 ). (1.14)
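A short numerical sketch of Remark 1.4; the paraboloid f(x1 , x2 ) = x1² + x2² is an assumed example. The partial derivatives are approximated by central differences, the resulting normal m is checked against the analytic one, and the natural base vectors are verified to be orthogonal to m:

```python
# Normal vector m = (-df/dx1, -df/dx2, 1) of the surface x3 = f(x1, x2),
# for the example paraboloid f = x1**2 + x2**2.
def f(x1, x2):
    return x1**2 + x2**2

def normal(f, x1, x2, h=1e-6):
    # partial derivatives by central differences
    fx1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)
    fx2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)
    return (-fx1, -fx2, 1.0)

x1, x2 = 0.5, -0.3
m = normal(f, x1, x2)
exact = (-2 * x1, -2 * x2, 1.0)          # analytic partials are 2*x1 and 2*x2
print(all(abs(a - b) < 1e-8 for a, b in zip(m, exact)))  # True

# The natural base vectors of Eq. (1.12) lie in the tangent plane, so m is
# perpendicular to them:
e1 = (1.0, 0.0, -m[0])   # (1, 0, df/dx1)
e2 = (0.0, 1.0, -m[1])   # (0, 1, df/dx2)
dot = sum(a * b for a, b in zip(e1, m))
print(abs(dot) < 1e-12)  # True: e1 · m = 0
```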

1.3 The Divergence Theorem

1.3.1 The Divergence of a Vector Field

A vector field over a region D in R3 is an assignation of a vector v to each point of
the region. If x1 , x2 , x3 is a Cartesian coordinate system in D, the vector field v can
be given in terms of its components v1 , v2 , v3 by means of three functions

vi = vi (x1 , x2 , x3 ) i = 1, 2, 3. (1.15)

The vector field is said to be continuous (differentiable) if each of the functions vi is
continuous (differentiable).
Given a differentiable vector field v we can define its divergence (denoted as ∇ · v
or div v) as the scalar field given by

div v = ∂v1/∂x1 + ∂v2/∂x2 + ∂v3/∂x3 .   (1.16)

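Equation (1.16) is easy to probe numerically. In the Python sketch below (the field v = (x1², x1x2, x2x3) is an assumption chosen only for the example), the three partial derivatives are approximated by central differences and compared with the exact divergence 3x1 + x2.

```python
def v(x1, x2, x3):
    # Illustrative field: v = (x1^2, x1*x2, x2*x3), so div v = 2*x1 + x1 + x2 = 3*x1 + x2
    return (x1*x1, x1*x2, x2*x3)

def divergence(v, x, h=1e-5):
    # Central-difference approximation of Eq. (1.16)
    d = 0.0
    for i in range(3):
        xp = list(x); xm = list(x)
        xp[i] += h; xm[i] -= h
        d += (v(*xp)[i] - v(*xm)[i]) / (2*h)
    return d

x = (1.0, 2.0, -0.5)
assert abs(divergence(v, x) - (3*x[0] + x[1])) < 1e-6
```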
1.3.2 The Flux of a Vector Field over an Orientable Surface

Consider an infinitesimal element of area dA at a point P in R3 as shown in Fig. 1.5.


On the unique line orthogonal to this element we may choose either one of two unit
normal vectors, one for each of the two sides of dA. Each of the two possible choices
determines an orientation of dA. Let n denote one of these choices. If v is a vector
at P, we call the flux of v through dA with the chosen orientation the scalar quantity
v · n dA.
A smooth surface A in R3 is orientable if an orientation n can be chosen smoothly
on A. By this we mean that if we partition A into differential elements dA, we should
be able to choose one of the two possible orientations for each element in such a way
that, in the limit as the partition becomes infinitely fine, the resulting vector field
n defined on A is smooth. Not every surface in R3 is orientable, as shown by the
familiar example of the Moebius band. An orientable surface A is said to be oriented
if one of the two possible orientations of each dA has been chosen smoothly. Given
an oriented surface A and a vector field v defined on this surface, the flux of v over
A is given by

Fig. 1.5 Elementary flux v · n dA of a vector v through the oriented surface element dA at a point P (axes x1, x2, x3)

flux(v, A) = ∫_A v · n dA.   (1.17)

The (Riemann) integral on the right-hand side can be regarded as the limit of the sum
of the elementary fluxes as the partition is increasingly refined.

1.3.3 Statement of the Theorem

If D is a bounded domain in R3 we denote by ∂D its boundary.3 The boundary of a
domain is always orientable. We will systematically choose the orientation of ∂D as
the one determined by the exterior unit normals n.
Theorem 1.1 (Divergence theorem) The integral of the divergence of a vector field
on a bounded domain D is equal to the flux of this vector field over its boundary ∂D,
namely,
∫_D div v dV = ∫_∂D v · n dA.   (1.18)

A proof of this theorem, known also as the theorem of Gauss, can be found in
classical calculus books, such as [4] or [5]. A more modern and more general, yet
quite accessible, formulation is presented in [6].
This fundamental result of vector calculus can be regarded as a generalization of
the fundamental theorem of calculus in one independent variable. Indeed, in the one-
dimensional case, if we identify the domain D with a segment [a, b] in R, a vector
field v has a single component v. Moreover, the boundary ∂D consists of the two-
element set {a, b}. The exterior unit normals at the points a and b are, respectively,
the vectors with components −1 and +1. Thus, the divergence theorem reduces to

∫_a^b (dv/dx) dx = v(b) − v(a),   (1.19)

which reproduces the familiar result.
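The theorem can also be verified numerically on a simple three-dimensional case. The Python sketch below (the field v = (x1², x2², x3²) and the unit cube are assumptions for illustration) evaluates both sides of Eq. (1.18) with the midpoint rule; both sides come out equal to 3.

```python
def v(x, y, z):
    return (x*x, y*y, z*z)          # div v = 2(x + y + z)

def div_v(x, y, z):
    return 2.0*(x + y + z)

N = 20
h = 1.0 / N
pts = [(i + 0.5)*h for i in range(N)]   # midpoints of a uniform grid on [0, 1]

# Volume integral of div v over the unit cube (midpoint rule)
vol = sum(div_v(x, y, z) for x in pts for y in pts for z in pts) * h**3

# Flux through the six faces, with outward unit normals
flux = 0.0
for a in pts:
    for b in pts:
        flux += ( v(1.0, a, b)[0] - v(0.0, a, b)[0]    # faces x = 1 and x = 0
                + v(a, 1.0, b)[1] - v(a, 0.0, b)[1]    # faces y = 1 and y = 0
                + v(a, b, 1.0)[2] - v(a, b, 0.0)[2] ) * h**2

assert abs(vol - flux) < 1e-6       # Eq. (1.18): both sides equal 3 here
assert abs(vol - 3.0) < 1e-6
```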

1.3.4 A Particular Case

Let φ = φ(x1 , x2 , x3 ) be a scalar field. Its gradient, denoted by grad φ or ∇φ, is the vector field with Cartesian components

3 This notation is justified by the geometric theory of integration of differential forms, which lies beyond the scope of these notes.

(∇φ)i = ∂φ/∂xi .   (1.20)

The divergence of this vector field is the scalar field

div(grad φ) = ∇ · (∇φ) = ∇²φ = ∂²φ/∂x1² + ∂²φ/∂x2² + ∂²φ/∂x3² ,   (1.21)

where we have used various possible notations. In particular, the operator ∇² is known as the Laplacian.
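A minimal numerical sketch of the Laplacian (in Python; the sample function φ = sin(x1) e^{x2} + x3², for which ∇²φ = 2 everywhere, is an assumption chosen for the example) uses the standard seven-point central-difference approximation of Eq. (1.21):

```python
import math

def phi(x, y, z):
    # ∂²/∂x² gives -sin(x)e^y, ∂²/∂y² gives +sin(x)e^y, ∂²/∂z² gives 2: ∇²φ = 2
    return math.sin(x) * math.exp(y) + z*z

def laplacian(phi, x, y, z, h=1e-3):
    # Seven-point central-difference approximation of Eq. (1.21)
    c = phi(x, y, z)
    return ( (phi(x+h, y, z) - 2*c + phi(x-h, y, z))
           + (phi(x, y+h, z) - 2*c + phi(x, y-h, z))
           + (phi(x, y, z+h) - 2*c + phi(x, y, z-h)) ) / h**2

assert abs(laplacian(phi, 0.3, -0.2, 1.1) - 2.0) < 1e-5
```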
Naturally, the divergence theorem can be applied to the case of a vector field
obtained as the gradient of a scalar field. The result is

∫_D ∇²φ dV = ∫_∂D ∇φ · n dA.   (1.22)

What is the meaning of the term ∇φ · n? Clearly, this linear combination of partial
derivatives is nothing but the directional derivative of the function φ in the direction
of the unit vector n. If we denote this derivative by dφ/dn, we can write the statement
of the divergence theorem for the gradient of a scalar field as


∫_D ∇²φ dV = ∫_∂D (dφ/dn) dA.   (1.23)

1.4 Ordinary Differential Equations

1.4.1 Vector Fields as Differential Equations

As we have seen, given a differentiable curve in Rn , one can define the tangent vector
at each of its points. The theory of ODEs can be regarded geometrically as providing
the answer to the inverse question. Namely, given a vector field in Rn and a point p0
in Rn , can one find a curve γ through p0 whose tangent vector at every point p of γ
coincides with the value of the vector field at p?
Let us try to clarify this idea. A vector field in Rn is given by a map

v : Rn → Rn , (1.24)

or, in components,
vi = vi (x1 , . . . , xn ) i = 1, . . . , n. (1.25)

According to our statement above, we are looking for a curve

xi = xi (t), (1.26)

satisfying the conditions

dxi(t)/dt = vi (x1(t), . . . , xn(t)) ,   i = 1, . . . , n,   (1.27)

and the initial conditions (of passing through a given point p0 with coordinates xi0 )

xi (t0 ) = xi0 , (1.28)

for some (initial) value t0 of the parameter t. We clearly see that the geometric
statement of tangency to the given vector field translates into the analytic statement
of Eq. (1.27), which is nothing but a system of ordinary differential equations (ODEs)
of the first order.
Remark 1.5 If the vector field vanishes at a point P, the solution of (1.27) through P
is constant, so that the entire curve collapses to a point. We say that P is an equilibrium
position of the system.
Using the notation of Eq. (1.5), the system (1.27) can be written as

dr/dt = v.   (1.29)
The system is linear if it can be written as

dr/dt = A r,   (1.30)
where A is a square constant matrix.
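For the linear case (1.30), the geometric picture can be tested with a few lines of code. The Python sketch below (a classical RK4 integrator; the matrix A = [[0, 1], [−1, 0]] is an assumption chosen for the example) integrates dr/dt = A r over one full period of its exact solution (cos t, −sin t) and checks that the integral curve closes on the initial point.

```python
import math

def rk4_step(f, r, t, dt):
    # One classical fourth-order Runge-Kutta step for dr/dt = f(t, r)
    k1 = f(t, r)
    k2 = f(t + dt/2, [ri + dt/2*ki for ri, ki in zip(r, k1)])
    k3 = f(t + dt/2, [ri + dt/2*ki for ri, ki in zip(r, k2)])
    k4 = f(t + dt,   [ri + dt*ki   for ri, ki in zip(r, k3)])
    return [ri + dt/6*(a + 2*b + 2*c + d)
            for ri, a, b, c, d in zip(r, k1, k2, k3, k4)]

# Linear system (1.30) with A = [[0, 1], [-1, 0]]; exact solution (cos t, -sin t)
def f(t, r):
    return [r[1], -r[0]]

n = 1000
dt = 2*math.pi / n
r, t = [1.0, 0.0], 0.0
for _ in range(n):
    r = rk4_step(f, r, t, dt)
    t += dt

# After one full period the integral curve returns to the initial point
assert abs(r[0] - 1.0) < 1e-6 and abs(r[1]) < 1e-6
```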

1.4.2 Geometry Versus Analysis

Understanding of mathematical concepts is often enhanced by approaching a problem
from various angles, just as we can look at a sculpture from different points of
view to better imbibe its meaning. In the realm of mathematics, perhaps controlled
by different sides of the brain, we can roughly distinguish between two modes of
thought: geometrical and analytical. In the instance at hand, the geometric viewpoint
is represented by the notion of vector field, while its analytic counterpart is a system
of first-order ordinary differential equations. A solution of this system corresponds
to an integral curve of the vector field, namely, a curve that is everywhere tangent to
the background vector field. We note that we are talking about a parametrized curve,

so that we are requiring that the vector tangent to the parametrized integral curve
must be exactly equal (and not just proportional) to the vector field. The parameter t
emerging from the solution process itself may or may not have an intrinsic physical
meaning in a given context. Moreover, it should be clear that this parameter is at
most determined up to an additive constant. This arbitrariness can be removed by
specifying the value t0 = 0 at the ‘initial’ point xi0 .
An important question is whether or not a system of ODEs with given initial con-
ditions always has a solution and, if so, whether the solution is unique. Translating
this question into geometrical terms, we ask whether, given a vector field, it is always
possible to find a (unique) integral curve through a given point P. Geometrical intu-
ition tells us that, as long as the field is sufficiently regular, we can advance a small
step in the direction of the local vector at P to reach a nearby point P′ and then repeat the process to a nearby point P′′, and so on, to obtain at least a small piece of a
curve. This intuition, aided by the power of geometric visualization, turns out to be
correct and is formalized in the existence and uniqueness theorem, which is briefly
discussed in Sect. 1.4.7.

1.4.3 An Example

For illustrative purposes, let us work in the plane (n = 2) and let us propose the fol-
lowing vector field (which will be later related to a very specific physical application;
can you guess which?)

v1 = x2 ,   v2 = − sin x1 .   (1.31)

This vector field is illustrated in Fig. 1.6.


Let us now arbitrarily choose the initial point p0 with coordinates x10 = −9, x20 = 2.
We want to find a curve that, passing through this point, is always tangent to the
local value of the given vector field. In other words, we want to solve the following
nonlinear system of ODEs
dx1/dt = x2 ,   (1.32)

dx2/dt = − sin x1 .   (1.33)
The corresponding Mathematica code and plot are shown in Fig. 1.7.
Let us now consider the same example but with different initial conditions, closer
to the origin, such as x10 = −1.5, x20 = 1. The corresponding Mathematica code and
plot are shown in Fig. 1.8.

VectorPlot[{x2, -Sin[x1]}, {x1, -9, 9}, {x2, -3, 3}, AspectRatio → 0.4, FrameLabel → {X1, X2}, VectorStyle → Black, PlotRange → {{-10, 10}, {-4, 4}}]

Fig. 1.6 Vector field associated with the system of ODEs (1.31)

curve = NDSolve[{x1'[t] == x2[t], x2'[t] == -Sin[x1[t]], x1[0] == -9, x2[0] == 2}, {x1[t], x2[t]}, {t, 0, 7.5}]

ParametricPlot[Evaluate[{x1[t], x2[t]} /. curve], {t, 0, 7.5}, PlotRange → {{-10, 10}, {-4, 4}}, PlotStyle → Black]

Fig. 1.7 A solution of the system

We obtain a qualitatively different type of solution, represented by a closed curve.


To emphasize this fact, we show in Fig. 1.9 a plot of the two integral curves just
obtained hovering over the vector field in the background. We can clearly see that
the curves are indeed tangential to the vector field at each point!
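The qualitative difference between the two solutions can also be checked without Mathematica. The Python sketch below (RK4 integration; step size and duration are arbitrary choices for the example) integrates Eqs. (1.32)-(1.33) for both sets of initial conditions and verifies that the quantity E = x2²/2 − cos x1, which Sect. 1.4.6 will identify as a first integral, is constant along each curve; E > 1 corresponds to the open curve of Fig. 1.7 and E < 1 to the closed curve of Fig. 1.8.

```python
import math

def step(x1, x2, dt):
    # One RK4 step for dx1/dt = x2, dx2/dt = -sin x1  (Eqs. (1.32)-(1.33))
    def f(x1, x2):
        return (x2, -math.sin(x1))
    k1 = f(x1, x2)
    k2 = f(x1 + dt/2*k1[0], x2 + dt/2*k1[1])
    k3 = f(x1 + dt/2*k2[0], x2 + dt/2*k2[1])
    k4 = f(x1 + dt*k3[0],   x2 + dt*k3[1])
    return (x1 + dt/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            x2 + dt/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

def energy(x1, x2):
    # Constant along every integral curve of the system
    return 0.5*x2*x2 - math.cos(x1)

for x1, x2 in [(-9.0, 2.0), (-1.5, 1.0)]:
    E0 = energy(x1, x2)
    for _ in range(2000):
        x1, x2 = step(x1, x2, 0.005)
    assert abs(energy(x1, x2) - E0) < 1e-6

# energy > 1 gives the open (rotating) curve, energy < 1 the closed one
assert energy(-9.0, 2.0) > 1.0 and energy(-1.5, 1.0) < 1.0
```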
Remark 1.6 Note that the points with coordinates (kπ, 0), where k is an integer,
are equilibrium positions of the system. The behaviour around an equilibrium point
(stability, instability) can be determined by linearizing the system in a small neighbourhood of the point. In our case, the linearized version becomes d(Δx1)/dt = Δx2

curve1 = NDSolve[{x1'[t] == x2[t], x2'[t] == -Sin[x1[t]], x1[0] == -1.5, x2[0] == 1}, {x1[t], x2[t]}, {t, 0, 9}]

ParametricPlot[Evaluate[{x1[t], x2[t]} /. curve1], {t, 0, 9}, PlotRange → {{-10, 10}, {-4, 4}}, PlotStyle → Black]

Fig. 1.8 A solution of the system

Fig. 1.9 Solutions as integral curves of the vector field

and d(Δx2)/dt = (−1)^(k+1) Δx1, where Δx1, Δx2 are the incremental variables, so that the linearized system has to be studied in the vicinity of the origin.

1.4.4 Autonomous and Non-autonomous Systems

A system of ODEs such as (1.27) is called autonomous, a word meant to indicate the
fact that the given vector field does not depend on the parameter. A more general,
non-autonomous, system would have the form

dxi(t)/dt = vi (t, x1(t), . . . , xn(t)) ,   i = 1, . . . , n.   (1.34)
If, as is often the case, the system of equations is intended to represent the evolution of
a dynamical system (whether in Mechanics or in Economics, etc.) and if the parameter
has the intrinsic meaning of time, the explicit appearance of the time variable in the
vector field seems to contradict the principle that the laws of nature do not vary
in time. As pointed out by Arnold,4 however, the process of artificially isolating a
system, or a part of a system, from its surroundings for the purpose of a simplified
description, may lead to the introduction of time-dependent fields.
An important property of the solutions of autonomous systems of ODEs is the
group property, also known as the time-shift property. It states that if r = q(t) is a
solution of a system of ODEs corresponding to the vector field v = v(r), namely if

dq(t)/dt = v(q(t)),   (1.35)
for all t, then the curve r = q(t + s), for any fixed s, is also a solution of the same
problem. Moreover, the two integral curves coincide. The proof is straightforward.
We start by defining the function q̂(t) = q(t + s) and proceed to calculate its deriv-
ative at some value t = τ of the parameter. We obtain
  
dq̂/dt |_(t=τ) = dq(t + s)/dt |_(t=τ) = dq(t)/dt |_(t=τ+s) = v(q(τ + s)) = v(q̂(τ)).   (1.36)

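The time-shift property is easy to illustrate numerically. In the Python sketch below (the scalar field v(x) = x, whose solutions are q(t) = q0 eᵗ, and the shift s = 0.7 are assumptions chosen for the example), a central-difference derivative confirms that the shifted curve q̂(t) = q(t + s) satisfies the same equation, as in Eq. (1.36).

```python
import math

# Autonomous equation dx/dt = v(x) with v(x) = x; its solutions are q(t) = q0 * e^t
def v(x): return x
def q(t): return math.exp(t)

s, h = 0.7, 1e-6

def q_shift(t):
    # The time-shifted curve q̂(t) = q(t + s)
    return q(t + s)

for tau in (0.0, 0.5, 1.3):
    dq = (q_shift(tau + h) - q_shift(tau - h)) / (2*h)   # numerical derivative of q̂
    assert abs(dq - v(q_shift(tau))) < 1e-6              # q̂ is again a solution
```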
1.4.5 Higher-Order Equations

In mechanical applications, the use of Newton’s law leads in general to systems
of second-order (rather than first-order) ODEs. Nevertheless, it is not difficult to
show that an ordinary differential equation of order n is equivalent to a system of n
first-order ODEs. Indeed, let a differential equation of order n be given by
 
d^n x(t)/dt^n = F( t, x(t), dx(t)/dt, . . . , d^(n−1)x(t)/dt^(n−1) ),   (1.37)

where F is a differentiable function in each of its n + 1 variables. We define the
following new dependent variables

4 In his beautiful book [1]. Another excellent book by the same author, emphasizing applications to Newtonian and Analytical Mechanics, is [2].

x1 = x ,   x2 = dx/dt ,   x3 = d²x/dt² ,   . . . ,   xn = d^(n−1)x/dt^(n−1) ,   (1.38)
in terms of which the original differential equation can be written as the first-order
system
dx1/dt = x2 ,   (1.39)

dx2/dt = x3 ,   (1.40)

. . .

dx(n−1)/dt = xn ,   (1.41)

dxn/dt = F(t, x1, x2, . . . , xn).   (1.42)
Thus a system of second-order equations, such as one obtains in the formulation
of problems in dynamics of systems of particles (and rigid bodies), can be reduced
to a system of first-order equations with twice as many equations. The unknown
quantities become, according to the scheme just described, the positions and the
velocities of the particles. In this case, therefore, the space of interest is the so-called
phase space, which always has an even dimension. If the system is non-autonomous,
it is sometimes convenient to introduce the odd-dimensional extended phase space,
which consists of the Cartesian product of the phase space with the time line R. This
terminology is widely used even in non-mechanical applications. The vector field
corresponding to an autonomous dynamical system is called its phase portrait and
its integral curves are called the phase curves. A careful analysis of the phase portrait
of an autonomous dynamical system can often reveal many qualitative properties of
its solutions.
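The reduction scheme (1.38)-(1.42) can be written once and for all. The Python sketch below (the test equation ẍ = −4x, with exact solution cos 2t, and the RK4 integrator are illustrative assumptions) builds the first-order system from a given right-hand side F and checks the computed solution against the exact one.

```python
import math

def reduce_to_first_order(F, n):
    """Turn d^n x/dt^n = F(t, x, x', ..., x^(n-1)) into the system (1.39)-(1.42)."""
    def rhs(t, y):
        assert len(y) == n          # y = (x1, ..., xn) as in Eq. (1.38)
        return y[1:] + [F(t, *y)]
    return rhs

def rk4(rhs, y, t, dt):
    # One classical Runge-Kutta step for the reduced first-order system
    add = lambda y, k, c: [yi + c*ki for yi, ki in zip(y, k)]
    k1 = rhs(t, y)
    k2 = rhs(t + dt/2, add(y, k1, dt/2))
    k3 = rhs(t + dt/2, add(y, k2, dt/2))
    k4 = rhs(t + dt, add(y, k3, dt))
    return [yi + dt/6*(a + 2*b + 2*c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

# Second-order equation x'' = -4x, i.e. F(t, x1, x2) = -4*x1; exact solution cos(2t)
rhs = reduce_to_first_order(lambda t, x1, x2: -4.0*x1, 2)
y, t, dt = [1.0, 0.0], 0.0, 0.001
for _ in range(1000):
    y = rk4(rhs, y, t, dt)
    t += dt

assert abs(y[0] - math.cos(2.0)) < 1e-8   # position at t = 1
```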

1.4.6 First Integrals and Conserved Quantities

Given a vector field v in Rn and the corresponding autonomous system of ODEs

dr(t)/dt = v(r(t)),   (1.43)

a first integral is a differentiable function

φ : Rn → R (1.44)

that attains a constant value over every solution of the system (1.43). In other words,
the function φ(x1 , . . . xn ) is constant along every integral curve of the vector field.
Clearly, any constant function is, trivially, a first integral. We are, therefore, only
interested in non-constant first integrals, which are the exception rather than the
rule. Whenever a non-constant first integral exists, it is usually of great physical
interest, since it represents a conserved quantity. A mechanical system is said to
be conservative if the external forces can be derived from a scalar potential U :
Rn → R, in which case the total energy of the system (kinetic plus potential) is
conserved.
Let a mechanical system with n degrees of freedom (such as a collection of springs
and masses) be described by the matrix equation

M r̈ = f(r), (1.45)

where the constant mass matrix M is symmetric and f is the vector of external forces.
The position vector r is measured in an inertial frame of reference and superimposed
dots indicate time derivatives. In many instances (such as when forces are produced by
a gravitational or electrostatic field) the external forces derive from a scalar potential
U = U(r) according to the prescription

f = −∂U(r)/∂r ,   (1.46)
or, in components,
fi = −∂U/∂xi ,   i = 1, . . . , n.   (1.47)

If we consider the kinetic energy T as the scalar function

T = T(ṙ) = (1/2) ṙ^T M ṙ ,   (1.48)
the total energy E can be defined as

E = T + U. (1.49)

Let r = r(t) be a solution of the system (1.45) for some initial conditions r(0) = r0
and ṙ(0) = u0 . Let us calculate the derivative of the total energy E with respect to t
along this solution. Using the chain rule of differentiation, we can write

dE/dt = ṙ^T M r̈ − ṙ^T f ,   (1.50)

where we have exploited the symmetry of the mass matrix and the potential relation
(1.46). Collecting terms and enforcing (1.45) (since we have assumed r(t) to be a
solution of this system), we obtain

dE/dt = ṙ^T (M r̈ − f) = 0 ,   (1.51)
which proves that E is a constant along every trajectory of the system. The value of
this constant is uniquely determined by the initial conditions.
For conservative mechanical systems with a single degree of freedom (n = 1), the
integral curves in the phase space coincide with the level sets of the total energy, as
described in Box 1.1. This remark facilitates the qualitative analysis of such systems.
For a fuller treatment, consult Chap. 2 of [1] or Sect. 2.12 of [2], which are useful for
the solution of Exercises 1.3 and 1.5.

Box 1.1 Architecture and phase portraits. Roughly speaking, a shell is a surface with thickness. In industrial architecture, particularly when using reinforced concrete, it is not uncommon to find translational shells. These
structures are generated by translating a plane curve while riding upon another
plane curve acting as a guide and lying on a perpendicular plane. Translational
surfaces are a particular case of surfaces generated by rigid motions of a curve.
They were first defined and studied by the French mathematician Jean-Gaston
Darboux (1842–1917). In a Cartesian coordinate system x, y, z, the function

z = f (x) + g(y)

where f and g are functions of one variable, is a translational surface. Either
curve, f (x) or g(y), can be considered as the guide for the translation of the
other. After a rainfall, if the surface can contain water, the water level will be
bounded by a level curve of the surface. On the other hand, it is clear from
Eqs. (1.46), (1.48) and (1.49) that the total energy, in the case of a one-degree-
of-freedom system, can be represented as a translational surface in the space
of coordinates (x, v = ẋ, E). Since in a conservative system the total energy is
a constant of the motion, we conclude that the trajectories are the level curves
of the energy surface! In geometrical terms, this remark provides a way to
visualize the various types of trajectories for a one-degree-of-freedom system
by imagining the parabola representing the kinetic energy traveling over the
graph of the potential energy as the guide, and then visualizing the level curves.
The graphs below correspond to M = 8 and U = x²(x² − 4).
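The conservation claim can be checked numerically for the Box's data. The Python sketch below (the initial conditions x(0) = 0.5, ẋ(0) = 1 and the RK4 integrator are assumptions for illustration) integrates M ẍ = −dU/dx with M = 8 and U = x²(x² − 4), and verifies that E = T + U stays constant along the computed trajectory.

```python
M = 8.0

def U(x):
    return x*x*(x*x - 4.0)          # potential energy from Box 1.1

def f(x):
    return -(4.0*x**3 - 8.0*x)      # f = -dU/dx, Eq. (1.46)

def E(x, v):
    return 0.5*M*v*v + U(x)         # total energy, Eqs. (1.48)-(1.49)

# Integrate M x'' = f(x) with RK4 and watch E stay constant along the trajectory
x, v, dt = 0.5, 1.0, 0.001
E0 = E(x, v)
for _ in range(5000):
    def rhs(x, v):
        return (v, f(x) / M)
    k1 = rhs(x, v)
    k2 = rhs(x + dt/2*k1[0], v + dt/2*k1[1])
    k3 = rhs(x + dt/2*k2[0], v + dt/2*k2[1])
    k4 = rhs(x + dt*k3[0],   v + dt*k3[1])
    x += dt/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
    v += dt/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])

assert abs(E(x, v) - E0) < 1e-8
```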

1.4.7 Existence and Uniqueness

When describing, in Sect. 1.4.2, an intuitive geometric way to visualize the construc-
tion of an integral curve of a vector field (by moving piecewise along the vectors of
the field), we mentioned that the field must be ‘sufficiently regular’. It may appear
that mere continuity of the field would be sufficient for this intuitive picture to make
sense. A more rigorous analysis of the problem, however, reveals that a somewhat
stronger condition is needed, namely, Lipschitz continuity. In the case of a real func-
tion of one real variable
f : [a, b] → R, (1.52)

the function is said to be Lipschitz continuous if there exists a non-negative constant K such that

| f(x2) − f(x1) | / | x2 − x1 | ≤ K,   (1.53)

for all x1 ≠ x2. An example of a continuous function that is not Lipschitz continuous is the function

f(x) = +√|x|   (1.54)

in the interval [−1, 1].


A differentiable function on a closed interval is automatically Lipschitz continuous. The definition of Lipschitz continuity can be extended to functions of several variables in an obvious way.
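The failure of Lipschitz continuity for √|x| is easy to exhibit numerically: the difference quotient of (1.53) grows without bound as x2 → 0, whereas for a differentiable function such as x² it stays bounded. A Python sketch (the sample points are arbitrary choices):

```python
import math

def max_quotient(f, pairs):
    # Largest difference quotient |f(x2) - f(x1)| / |x2 - x1| over the given pairs
    return max(abs(f(x2) - f(x1)) / abs(x2 - x1) for x1, x2 in pairs)

pairs = [(0.0, 10.0**(-k)) for k in range(1, 12)]

# f(x) = sqrt(|x|): the quotient from x1 = 0 equals 1/sqrt(x2), which blows up near 0
assert max_quotient(lambda x: math.sqrt(abs(x)), pairs) > 1e5

# f(x) = x^2 on [-1, 1] is differentiable, hence Lipschitz (K = 2 works)
assert max_quotient(lambda x: x*x, pairs) <= 2.0
```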
Theorem 1.2 (Picard-Lindelöf) Let v be a vector field defined on a closed domain5
D of Rn and let the components of v be Lipschitz continuous. Then, for each interior

5 This domain may be the whole of Rn .



point r0 ∈ D, there exists an ε > 0 such that the initial value problem

dr/dt = v ,   r(t0) = r0   (1.55)
has a unique solution in the interval [t0 − ε, t0 + ε].
In geometrical terms, given a sufficiently regular vector field, we can always find
at each point a small enough integral curve passing through that point. The theorem is
also applicable to non-autonomous systems, as long as the dependence of the vector
field on the parameter is continuous. For linear systems the theorem guarantees the
existence of the solution for all values of t.

1.4.8 Food for Thought

A vector field on Rn gives rise to a so-called one-dimensional distribution on Rn .


Indeed, if we consider the line of action of each vector at each point of Rn , we obtain a
field of directions or, equivalently, we have at each point a one-dimensional subspace
of Rn . A two-dimensional distribution, accordingly, would consist of attaching at
each point a plane (that is, a two-dimensional subspace of Rn ). In this spirit, and
emboldened by the theorem of existence and uniqueness just presented, we may
intuitively foresee that we may be able to construct a small integral surface whose
tangent plane coincides at each point with the plane in the distribution. This predic-
tion, however, is incorrect, as Exercise 1.7 shows.
Some two-dimensional distributions do admit integral surfaces. These special
distributions are called integrable. The theorem of Frobenius provides a necessary and sufficient condition for a distribution to be integrable. This integrability condition is
known as involutivity. A fairly accessible treatment of this subject can be found in [3].

Exercises
Exercise 1.1 Show that the expression (1.16) is preserved upon a change of Cartesian coordinates.
Exercise 1.2 Show that in a system of cylindrical coordinates ξ1 , ξ2 , ξ3 defined by

x1 = ξ1 cos ξ2
x2 = ξ1 sin ξ2
x3 = ξ3 ,

the divergence of a vector field v with cylindrical components v̂1 , v̂2 , v̂3 is given by

div v = (1/ξ1) ∂(ξ1 v̂1)/∂ξ1 + (1/ξ1) ∂v̂2/∂ξ2 + ∂v̂3/∂ξ3 .

Exercise 1.3 (a) Show that the system of ODEs given by Eqs. (1.32) and (1.33)
can be used to represent the motion of a pendulum in a vertical plane. (b) Describe
qualitatively the behaviour of solutions of the two types discussed above. What kind
of solution is represented by closed curves in phase space? (c) By changing the initial
conditions one can control which of the two types of behaviour will result. Clearly,
there exists a locus of points in phase space corresponding to initial conditions that
lie precisely in the boundary between the two types of behaviour. This locus is called
a separatrix. From considerations of conservation of energy, determine the equation
of this separatrix. (d) Plot your separatrix and verify numerically (using, for example,
the Mathematica package) that indeed a small perturbation of the initial conditions
to one side leads to a different behaviour from that caused by a perturbation to the
other side of the separatrix.

Exercise 1.4 Draw (approximately) the phase portrait for a damped pendulum,
where the damping force is proportional to the angular velocity. Compare with the
results for the undamped pendulum and comment on the nature of the solutions in
both cases. Consider various values of the damping coefficient. Is there a critical
value? Compare your results qualitatively with the corresponding one for a linear
spring with and without damping.

Exercise 1.5 A particle moves along the x axis under the force field

F(x) = −1 + 3x² .

Draw and analyze the corresponding phase portrait, with particular attention to the
level curves of the total energy (which represent the trajectories of the system in
phase space). Do not use a computer package.

Exercise 1.6 Show that for a system of masses subjected only to central forces
(namely, forces passing through a common fixed point in an inertial frame), the
vector of angular momentum of the system with respect to that point is conserved.
Recall that the angular momentum is the moment of the linear momentum. For the
particular case of a single particle, prove that the trajectories are necessarily plane
and derive Kepler’s law of areas.

Exercise 1.7 Consider two vector fields, u and v, in R3 with components

u = (1, 0, 0) ,   v = (0, 1, x1 ).   (1.56)

At each point (x1 , x2 , x3 ) of R3 these two vectors determine a plane. In other words,
we have defined a two-dimensional distribution in R3 . Attempt a drawing of this
distribution around the origin and explain intuitively why this distribution fails to
be involutive. Strengthen your argument by assuming that there exists an integral
surface with equation x3 = ψ(x1 , x2 ) and show that imposing the condition that its
tangent plane belongs to the distribution (at each point in a vicinity of the origin)
leads to a contradiction.

References

1. Arnold VI (1973) Ordinary differential equations. MIT Press, Cambridge
2. Arnold VI (1978) Mathematical methods of classical mechanics. Springer, Berlin
3. Chern SS, Chen WH, Lam KS (2000) Lectures on differential geometry. World Scientific,
Singapore
4. Courant R (1949) Differential and integral calculus, vol II. Blackie and Son Ltd, Glasgow
5. Marsden JE, Tromba AJ (1981) Vector calculus, 2nd edn. W. H. Freeman, San Francisco
6. Spivak M (1971) Calculus on manifolds: a modern approach to classical theorems of advanced
calculus. Westview Press, Boulder
Chapter 2
Partial Differential Equations in Engineering

Many of the PDEs used in Engineering and Physics are the result of applying physical
laws of conservation or balance to systems involving fields, that is, quantities defined
over a continuous background of two or more dimensions, such as space and time.
Under suitable continuity and differentiability conditions, a generic balance law in
both global (integral) and local (differential) forms can be derived and applied to
various contexts of practical significance, such as Traffic Flow, Solid Mechanics,
Fluid Mechanics and Heat Conduction.

2.1 Introduction

Partial differential equations arise quite naturally when we apply the laws of nature
to systems of continuous extent. We speak then of field theories. Thus, whereas the
analysis of the vibrations of a chain of masses interconnected by springs gives rise
to a system of ODEs, the dynamic analysis of a bar, where the mass is smeared out
continuously over the length of the bar, gives rise to a PDE. From this simple example,
it would appear that PDEs are a mere generalization of their ordinary counterparts,
whereby a few details need to be taken care of. This false impression is exacerbated
these days by the fact that numerical procedures, that can be implemented as computer
codes with relative ease, do actually approximate the solutions of PDEs by means of
discrete systems of algebraic equations. This is clearly a legitimate thing to do, but
one must bear in mind that, unless one possesses a basic knowledge of the qualitative
aspects of the behaviour of the continuous system, the discrete approximation may
not be amenable to a correct interpretation.
One hardly needs to defend the study of PDEs on these grounds, since they stand
alone as one of the greatest intellectual achievements of the human race in its attempt
to understand the physical world. Need one say more than the fact that from solid
and fluid mechanics all the way to quantum mechanics and general relativity, the
© Springer International Publishing AG 2017
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_2

language of nature has so far been transcribed into PDEs? There has been recently
a trend to declare the emergence of a “new science”, in which the prevalent lan-
guage will be that of cellular automata and other tools that represent the behaviour
of complex systems as the result of simple interactions between a very large (but
finite) number of discrete sites of events.1 These models are particularly powerful
in applications where the underlying phenomena are too intricate to capture in any
degree of detail by means of PDEs. Such is the case in multi-scale phenomena that
appear in many modern applications in a variety of fields (biology, environmental
engineering, nanomechanics, and so on). It is too early to predict the demise of
Calculus, however. As many times in the past (think of quantum mechanics, chaos,
economics), it appears that in one way or another the usefulness of mathematical
limits (differentiation, integration) is not entirely dependent on whether or not the
actual physical system can “in reality” attain those limits. Calculus and differential
equations are here to stay just as trigonometry and Euclidean geometry are not likely
to go away.

2.2 What is a Partial Differential Equation?

A partial differential equation for a function u of the independent variables x1, x2, . . . , xn (n > 1) is a relation of the form

F( xi , u , u,i , u,ij , . . . , u,ijk... ) = 0 ,   i, j, k, . . . = 1, . . . , n,   (2.1)

(the last argument carrying m indices),

where F is a function and where we have introduced the following notation: a subscript preceded by a comma indicates a partial derivative with respect to the
corresponding independent variable. If more than one index follows a comma, it is
understood that successive derivatives have been taken. Thus, for instance,

u,i = ∂u/∂xi ,   u,ijk = ∂³u/∂xk ∂xj ∂xi .   (2.2)

By abuse of notation, we have listed in Eq. (2.1) just a generic term for each order
of differentiation, understanding that all the values of the indices are to be considered.
For example, when we write the argument u,ij we mean the n² entries in the square
matrix of second partial derivatives.2 The requirement n > 1 is essential, otherwise
(if n = 1) we would have an ordinary differential equation. The highest order of

1 This point of view is advocated in [4] with particular force by Stephen Wolfram, a physicist and
the creator of the Mathematica code (which, ironically, is one of the best tools in the market for the
solution of differential equations).
2 On the other hand, recalling the equality of mixed partial derivatives (under assumptions that we assume to be fulfilled), the number of independent entries of this matrix is actually only n(n + 1)/2.

differentiation appearing, which in our case we have indicated by m, is called the
order of the differential equation.
By a solution of the PDE (2.1) we mean a function u = u(xi ) = u(x1 , . . . , xn )
which, when substituted in Eq. (2.1), satisfies it identically within a given domain of
the independent variables x1 , . . . xn . Clearly, in this version of the theory, the proposed
solution must necessarily be differentiable at least m times (otherwise we wouldn’t
even be able to check that the equation is satisfied). The function F, which actually
characterizes the particular differential equation being studied and represents the
physical laws at hand, is also subject to conditions of continuity and differentiability
which we will not stipulate at this point, but we will assume that as many derivatives
of this function exist as we need. A PDE is said to be linear if F depends linearly on
the unknown function u and all its derivatives. The coefficients of a linear PDE may
still depend on the independent variables. If they don’t, we have a case of a linear
equation with constant coefficients. Even such a simple situation is not amenable to
the straightforward treatment available in the case of ODEs. If the function F is linear in
the highest derivatives only (namely, in all the derivatives of order m), the PDE is
said to be quasi-linear. Otherwise (if it depends non-linearly on at least one of the
highest derivatives) the equation is nonlinear.
A relatively simple case is that for which the number of independent variables is
equal to 2. In this case, a solution can be visualized as a surface in the 3-dimensional
space with coordinates x1 , x2 , u. It follows from this intuitive picture that the analog
of the notion of integral curve is that of an integral surface and perhaps that the
analog of the initial conditions at a point is the specification of initial conditions
along a whole curve through which the integral surface must pass. It is more difficult
to visualize at this point what might be the analog of the vector field which, as we
know, is associated with a system of ODEs. Leaving this issue aside for the moment,
let us remark that just as we have systems of ODEs we can also have systems of
PDEs. The question of the equivalence of a single PDE of higher order to a system
of PDEs of order 1 is somewhat more delicate than its ODE counterpart.

2.3 Balance Laws

One of the primary sources of PDEs in Engineering and Physics is the stipulation
of conservation laws. Conservation laws or, more generally, balance laws, are the
result of a complete accounting of the variation in time of the content of an extensive
physical quantity in a certain domain. A simple analogy is the following. Suppose
that you are looking at a big farming colony (the domain of interest) and you want
to focus attention on the produce (the physical quantity of interest). As time goes
on, there is a variation in the quantity of food contained in the domain. At any given
instant of time, you want to account for the rate of change of this food content. There
are some internal sources represented in this case by the rate at which the land yields
new produce (so and so many tons per week, say). There are also sinks (or negative
sources) represented by the internal consumption of food by workers and cattle,

damage caused by hail and pests, etcetera. We will call these sources and sinks the
production of the quantity in question. It is measured in units of the original quantity
divided by the unit of time. In addition to these internal factors, there is also another
type of factors that can cause a change in content. We are referring to exchanges of
food through the boundary of the colony. These include the buying and selling of
produce that takes place at the gates, the perhaps illegal activities of some members
or visitors that personally take some food away to other destinations, etcetera. At any
given instant of time, we can estimate the rate at which these exchanges take place at
the boundary. We will call these transactions the flux of the quantity in question. We
may have a flux arising also from the fact that the boundary of the domain of interest
is changing (encroached by an enemy or by natural causes, etcetera). Assuming that
we have accounted for every one of these causes and that we believe in the principles
of causality and determinism (at least as far as the material world is concerned), we
may write the generic equation of balance as

d(content)/dt = production + flux,    (2.3)
where t is the time variable.
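Before specializing to densities, the bookkeeping in Eq. (2.3) can be illustrated with a toy discrete computation for the farming analogy; all the numbers below are invented for the illustration.

```python
# A toy discrete version of the balance equation (2.3): one week of
# bookkeeping for the food content of the farming colony (made-up numbers).
content = 100.0            # tons of food currently stored in the colony
production = 12.0 - 5.0    # tons/week: new produce minus internal consumption
flux = 3.0 - 7.0           # tons/week: inflow at the gates minus outflow
dt = 1.0                   # time step: one week

content += (production + flux) * dt
print(content)  # 103.0
```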
In physically meaningful examples (balance of energy, momentum, mass, electric
charge, and so on), it is often the case that the content, the production and the flux are
somehow distributed (smeared) over the volume (in the case of the content and the
production) or over the area of the boundary (in the case of the flux). In other words,
these magnitudes are given in terms of densities, which vary (continuously, say) from
point to point and from one instant to the next. It is precisely this property (whether
real or assumed) that is responsible for the fact that we can express the basic equation
of balance (2.3) in terms of differential equations. Indeed, the differential equations
are obtained by assuming that Eq. (2.3) applies to any sub-domain, no matter how
small.

2.3.1 The Generic Balance Equation

Let U represent an extensive quantity for which we want to write the equation of
balance. We assume this quantity to be scalar, such as mass, charge or energy content.3
Consider a spatial region ω fixed in R3 and representing a subset of the region of
interest. Our four independent variables are the natural coordinates x1 , x2 , x3 of R3
and the time variable t.4 When we say that U is an extensive quantity, we mean
that we can assign a value of U (the content) to each such subset ω. On physical
grounds we further assume that this set function is additive. By this we understand

3 Vector quantities, such as linear and angular momentum, can be treated in a similar way by
identifying U alternatively with each of the components in a global Cartesian frame of reference.
4 Consequently, we will not strictly adhere to the notational convention (2.2).

that the total content in two disjoint subsets is equal to the sum of the contents in
each separate subset. Under suitable continuity conditions, it can be shown that the
content of an extensive quantity U is given by a density u = u(x1 , x2 , x3 , t) in terms
of an integral, namely,

U = ∫_ω u dω.    (2.4)

It is clear that this expression satisfies the additivity condition. The units of u are the
units of U divided by a unit of volume.
Similarly, the production P is assumed to be an extensive quantity and to be
expressible in terms of a production density p = p(x1 , x2 , x3 , t) as

P = ∫_ω p dω.    (2.5)

The units of p are the units of U divided by a unit of volume and by the time unit.
We adopt the sign convention that a positive p corresponds to creation (source) and
a negative p corresponds to annihilation (sink).
The flux F represents the change in content per unit time flowing through the
boundary ∂ω, separating the chosen subset ω from the rest of the region of interest.
In other words, the flux represents the contact interaction between adjacent subsets.
A remarkable theorem of Cauchy shows that under reasonable assumptions the flux
is governed by a vector field, known as the flux vector f. More precisely, the inflow
per unit area and per unit time is given by

dF = (f · n)da, (2.6)

where n is the exterior unit normal to da. The significance of this result can be
summarized as follows:
1. The ‘principle of action and reaction’ is automatically satisfied. Indeed at any
given point an element of area da can be considered with either of two possible
orientations, corresponding to opposite signs of the unit normal n. Physically,
these two opposite vectors represent the exterior unit normals of the sub-bodies
on either side of da. Thus, what comes out from one side must necessarily flow
into the other.
2. All boundaries ∂ω that happen to have the same common tangent plane at one
point transmit, at that point, exactly the same amount of flux. Higher order prop-
erties, such as the curvature, play no role whatsoever in this regard. In fact, this
is the main postulate needed to prove Cauchy’s theorem.
3. The fact that the amount of flux depends linearly on the normal vector (via the dot
product) conveys the intuitive idea that the intensity and the angle of incidence
of the flowing quantity are all that matter. If you are sun-tanning horizontally at

high noon in the same position as two hours later, you certainly are taking in more
radiation per unit area of skin in the first case.
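The cosine dependence described in item 3 can be checked with a few lines of code; the flux vector and the tilt angles below are our own illustrative choices.

```python
import math

# A uniform flux vector f of unit magnitude pointing straight down,
# e.g. solar radiation at high noon.
f = (0.0, -1.0)

def flux_per_unit_area(theta):
    """f·n for a surface element whose exterior unit normal n is tilted
    by an angle theta away from the vertical."""
    n = (math.sin(theta), math.cos(theta))
    return f[0] * n[0] + f[1] * n[1]

# The transmitted flux falls off as cos(theta) as the surface is tilted:
for deg in (0, 30, 60, 90):
    print(deg, round(abs(flux_per_unit_area(math.radians(deg))), 4))
```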
We are now in a position to incorporate all our hypotheses and conclusions into
the basic balance Eq. (2.3). The result is


d/dt ∫_ω u dω = ∫_ω p dω + ∫_∂ω f · n da.    (2.7)

This equation represents the global balance equation for the volume ω. It should
be clear that this equation is valid under relatively mild conditions imposed on the
functions involved. Indeed, we only need the density u to be differentiable with
respect to time and otherwise we only require that the functions be integrable. This
remark will be of great physical significance when we study the propagation of
shocks. In the case of a content u and a flux vector f which are also space-wise
differentiable, we can obtain a local version of the generic balance equation. This
local (‘infinitesimal’) version is a partial differential equation. To derive it, we start
by observing that, due to the fact that the volume ω is fixed (that is, independent of
time), the order of differentiation and integration on the left-hand side of Eq. (2.7)
can be reversed, that is,

d/dt ∫_ω u dω = ∫_ω ∂u/∂t dω.    (2.8)

Moreover, the surface integral on the right-hand side of Eq. (2.7) is the flux of a
vector field on the boundary of a domain and is, therefore, amenable to be treated by
means of the divergence theorem according to Eq. (1.18), namely,

∫_∂ω f · n da = ∫_ω div f dω.    (2.9)
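The divergence theorem invoked in Eq. (2.9) is easy to verify symbolically for a concrete region; the flux field below is an arbitrary smooth choice on the unit cube, checked with sympy.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# An arbitrary smooth flux field on the unit cube [0, 1]^3.
f = sp.Matrix([x**2 * y, y * z, sp.sin(x) + z**2])

# Volume integral of div f over the cube.
div_f = sp.diff(f[0], x) + sp.diff(f[1], y) + sp.diff(f[2], z)
volume = sp.integrate(div_f, (x, 0, 1), (y, 0, 1), (z, 0, 1))

# Surface integral of f · n over the six faces (exterior normals are ±e_i).
surface = (
      sp.integrate(f[0].subs(x, 1) - f[0].subs(x, 0), (y, 0, 1), (z, 0, 1))
    + sp.integrate(f[1].subs(y, 1) - f[1].subs(y, 0), (x, 0, 1), (z, 0, 1))
    + sp.integrate(f[2].subs(z, 1) - f[2].subs(z, 0), (x, 0, 1), (y, 0, 1))
)

print(volume, surface)  # 2 2
```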

Collecting all the terms under a single integral we obtain the global balance equation
in the form

∫_ω (∂u/∂t − p − div f) dω = 0.    (2.10)

This equation is satisfied identically for any arbitrary sub-domain ω. If the integrand
is continuous, however, it must vanish identically. For suppose that the integrand is,
say, positive at one point within the domain of integration. By continuity, it will also
be positive on a small ball B around that point. Applying the identity (2.10) to this
sub-domain B, we arrive at a contradiction. We conclude, therefore, that a necessary
and sufficient condition for the global balance equation to be satisfied identically for
arbitrary sub-domains is the identical satisfaction of the partial differential equation

∂u/∂t − p − div f = 0.    (2.11)
This is the generic equation of balance in its local (differential) form. It is a single
PDE for a function of 4 variables, x1 , x2 , x3 and x4 = t.
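Equation (2.11) can be exercised symbolically: given arbitrary smooth fields u and f (our own illustrative choices below, with no physical significance), it determines the production density p required for balance.

```python
import sympy as sp

x1, x2, x3, t = sp.symbols('x1 x2 x3 t')

# Arbitrary smooth fields, chosen purely for illustration.
u = x1**2 * sp.exp(-t)                        # content density
f = sp.Matrix([x1 * x2, sp.sin(x3) * t, x2])  # flux vector

div_f = sp.diff(f[0], x1) + sp.diff(f[1], x2) + sp.diff(f[2], x3)

# Solving Eq. (2.11) for the production density: p = u_t - div f.
p = sp.diff(u, t) - div_f
print(p)
```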

2.3.2 The Case of Only One Spatial Dimension

There are several reasons to present an independent derivation of the generic law
of balance for the case of a single spatial dimension. The first reason is that in
the case of just one dominant spatial dimension (waves or heat flow in a long bar,
current in a wire, diffusion of pollutants in a tube, etcetera), the divergence theorem
mercifully reduces to the statement of the fundamental theorem of calculus of one
variable (roughly speaking: “differentiation is the inverse of integration”). Notice
that we still are left with two independent variables, one for the spatial domain (x)
and one for the time dependence (t). Another important reason has to do with the
peculiar nature of a domain in R as compared with domains in higher dimensions.
If the spatial domain is two-dimensional, such as a membrane, its boundary is the
perimeter curve, while the upper and lower faces of the membrane are identified
with the interior points. For a three-dimensional domain, the boundary is the whole
bounding surface. On the other hand, a closed connected domain in R is just a closed
interval [a, b], with a < b. Its boundary consists of just two distinct points, as shown
in Fig. 2.1. Moreover, the exterior normal to the boundary is defined at those points
only, as a unit vector at a pointing in the negative direction and a unit vector at b
pointing in the positive direction of the real line. The flux vector f and the velocity
vector v have each just one component and can be treated as scalars. Physically,
we may think of U as the content of some extensive quantity in a wire or a long
cylindrical bar. It is important to realize that the lateral surface of this wire does not
exist, in the sense that it is not part of the boundary. On the contrary, the points on this

Fig. 2.1 The boundary of a domain in R consists of two points



(vanishingly small) lateral surface are identified precisely with the interior points of
the wire.
If we assume that the quantity U = U (t) is continuously distributed throughout
the domain, we can express it in terms of a density u = u(x, t) per unit length of the
bar as

U = ∫_a^b u dx.    (2.12)

Similarly, the production P = P(t) can be expressed in terms of a density p = p(x, t)
per unit length and per unit time as


P = ∫_a^b p dx.    (2.13)

As a sign convention, we assume that a positive p corresponds to creation (source)
and a negative sign to annihilation (sink).
The flux term requires some further discussion. Clearly, if we cut the bar into two
pieces and we focus attention on one of these pieces alone, as far as the quantities
U and P are concerned, all we have to do is change the limits of integration in Eqs.
(2.12) and (2.13). On the other hand, by the process of cutting, we have created a
new boundary point at each of the sub-bodies and a corresponding flux. If we assume
(following an idea similar to Newton’s law of action and reaction) that whatever flow
enters through the new boundary into one of the parts must necessarily be coming
out of the new boundary of the other part, we realize that the flux is best represented
by the dot product of a flux vector f = f(x, t) with the unit vector n at the boundary
pointing in the outward direction of the part under study. This situation is illustrated
in Fig. 2.2.
If the flux vector points to the right, the flux through the cross section, as shown
in the figure, will be positive (an inflow) for the left part of the bar and negative

Fig. 2.2 The flux vector: the cut creates a new boundary pair with oppositely oriented exterior unit normals



(and of equal absolute value) for the right part of the bar. Notice that in this simple
one-dimensional case, the flux vector has only one component, which we denote by
f = f (x, t). Nevertheless, the concept of flux vector is very important and can be
used directly in two- and three-dimensional spatial contexts. For the actual boundary
of the body, the flux vector may be specified as a boundary condition, depending on
the specific problem being solved.
Introducing our specific expressions for content, production and flux in the generic
balance equation (2.3), we obtain


d/dt ∫_a^b u(x, t) dx = ∫_a^b p(x, t) dx + f(b, t) − f(a, t).    (2.14)

As far as sign convention for the flux is concerned, we have assumed that a positive
scalar flux is inwards, into the body. What this means is that if the flux vector points
outwards, the scalar flux is actually inwards. If this convention is found unnatural,
all one has to do is reverse the sign of the last two terms in Eq. (2.14).5
This is the equation of balance in its global form. It is not yet a partial differential
equation. It is at this point that, if we wish to make the passage to the local form of the
equation, we need to invoke the fundamental theorem of calculus (or the divergence
theorem in higher dimensional contexts). Indeed, we can write


f(b, t) − f(a, t) = ∫_a^b ∂f(x, t)/∂x dx.    (2.15)

We obtain, therefore,


d/dt ∫_a^b u(x, t) dx = ∫_a^b p(x, t) dx + ∫_a^b ∂f/∂x dx.    (2.16)

If we consider that the integral limits a and b, though arbitrary, are independent of
time, we can exchange in the first term of the equation the derivative with the integral,
namely,

∫_a^b ∂u/∂t dx = ∫_a^b p(x, t) dx + ∫_a^b ∂f/∂x dx,    (2.17)

identically for all possible integration limits. We claim now that this identity is
possible only if the integrands themselves are balanced, namely, if

5 This common policy is adopted in [3]. This is an excellent introductory text, which is highly recommended for its clarity and wealth of examples.

∂u/∂t = p(x, t) + ∂f/∂x.    (2.18)
The truth of this claim can be verified by collecting all the integrands in Eq. (2.17)
under a single integral and then arriving at a combined integrand whose integral
must vanish no matter what limits of integration are used. Clearly, if the integrand is
continuous and does not vanish at some point in the domain of integration, it will also
not vanish at any point in a small interval containing that point (by continuity). It will,
therefore, be either strictly positive or strictly negative therein. Choosing, then, that
small interval as a new domain of integration, we would arrive at the conclusion that
the integral does not vanish, which contradicts the assumption that the integral
must vanish for all values of the limits. We conclude that Eq. (2.18) must hold true.

2.3.3 The Need for Constitutive Laws

When we look at the local form of the balance equation, (2.18), we realize that we
have a single equation containing partial derivatives of two unknown functions, u
and f . What this is telling us from the physical point of view is that the equations of
balance are in general not sufficient to solve a physical problem. What is missing? If
we think of the problem of heat transfer through a wire (which is an instance of the
law of balance of energy), we realize that the material properties have played no role
whatsoever in the formulation of the equation of balance. In other words, at some
point we must be able to distinguish (in these macroscopic phenomenological models
in which matter is considered as a continuum) between different materials. Copper
is a better heat conductor than wood, but the law of balance of energy is the same for
both materials! The missing element, namely the element representing the response
of a specific medium, must be supplied by means of an extra equation (or equations)
called the constitutive law of the medium. Moduli of elasticity, heat conductivities,
piezoelectric and viscosity constants are examples of the type of information that may
be encompassed by a constitutive law. And what is it that the constitutive equation can
stipulate? Certainly not the production, since this is a matter of sources and sinks,
which can be controlled in principle regardless of the material at hand. Instead, it is
the flux vector within the body that will differ from material to material according
to the present state of the system. The state of a system is given in terms of some
local variables of state s1 , s2 , . . . , sk (positions, temperatures, velocity gradients, and
so on), so that both the flux f and the content density u may depend on them. The
constitutive law is then expressed by equations such as

u = û (s1 (x, t), . . . , sk (x, t), x) (2.19)

and
f = fˆ (s1 (x, t), . . . , sk (x, t), x) . (2.20)

The reason that we have included a possible explicit dependence on x is that the
properties of the medium may change from point to point (as is the case in the so-
called functionally graded bodies, for instance). In principle, these properties could
also change in time, as is the case in processes of aging (biological or otherwise). In
some cases, a single variable of state is enough to characterize the system, so that
ultimately Eq. (2.18) becomes a PDE for the determination of this variable of state
as a function of space and time. Sometimes, it is possible to adopt the density u itself
as a single variable of state, so that the constitutive law simply reads

f = fˆ(u(x, t), x). (2.21)

In this case, substituting the constitutive law into (2.18), we obtain

∂u/∂t = p + (∂f̂/∂u)(∂u/∂x) + ∂f̂/∂x.    (2.22)
It is often convenient to adopt a subscript notation for partial derivatives of the
unknown field variable u = u(x, t), such as

u_x = ∂u/∂x,   u_t = ∂u/∂t,   u_xx = ∂²u/∂x²,   u_xt = ∂²u/∂t∂x,   …    (2.23)
Notice that, since there is no room for confusion, we don’t place a comma before
the subscripts indicating derivatives, as we did in (2.2). In this compact notation, Eq.
(2.22) reads
u_t − (∂f̂/∂u) u_x = p + ∂f̂/∂x.    (2.24)

We have purposely left the partial derivatives of the constitutive function fˆ unaffected
by the subscript notation. The reason for this is that the constitutive function fˆ is not
an unknown of the problem. On the contrary, it is supposed to be known as that part
of the problem statement that identifies the material response. Its partial derivatives
are also known as some specific functions of u and x. Notice that in the case of a
homogeneous material, the last term in Eq. (2.24) vanishes.
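The chain-rule expansion leading from (2.18) and (2.21) to (2.22) can be checked with a computer algebra system; the constitutive function f̂ below is an arbitrary illustrative choice, not a physical law.

```python
import sympy as sp

x, t = sp.symbols('x t')
u = sp.Function('u')(x, t)
w, xi = sp.symbols('w xi')      # placeholders for the two slots of fhat

# An arbitrary illustrative constitutive function fhat(u, x).
fhat = w**2 * sp.cos(xi)

# Total x-derivative of f = fhat(u(x, t), x), as it enters the balance law (2.18):
f = fhat.subs({w: u, xi: x})
lhs = sp.diff(f, x)

# Chain-rule expansion appearing in Eq. (2.22): (dfhat/du) u_x + dfhat/dx.
rhs = (sp.diff(fhat, w).subs({w: u, xi: x}) * sp.diff(u, x)
       + sp.diff(fhat, xi).subs({w: u, xi: x}))

print(sp.simplify(lhs - rhs))  # 0
```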
In the terminology introduced in the previous section, Eq. (2.24) is a first order,
quasi-linear PDE. If the constitutive function fˆ happens to be a linear function of
u, the PDE becomes linear. The linearity of constitutive laws is still one of the most
common assumptions in many branches of engineering (for example: Hooke’s law,
Ohm’s law, Fourier’s law, Darcy’s law, etcetera, are not actual laws of nature but
constitutive assumptions that are useful linear approximations to the behaviour of
some materials within certain ranges of operation). Notice that in the examples just
mentioned, the constitutive laws are expressed in terms of space derivatives of state
variables (respectively, displacement, electric potential, temperature and pressure).
As a result, the equation of balance combined with the constitutive law yields a second
order PDE. The theory of a single first-order PDE is comparable in its precision and

implementation to the theory of systems of ODEs. This is not the case for higher
order PDEs or for systems of first order PDEs, as we shall see later. At this point,
however, we are only interested in illustrating the emergence of PDEs of any order
and type from well-defined engineering contexts, without much regard for their
possible solutions. Accordingly, in the next section, we will display several instances
of balance laws, which constitute a good (but by no means the only) source of PDEs
in applications.

2.4 Examples of PDEs in Engineering

2.4.1 Traffic Flow

A comprehensive review of models for traffic flow is beyond our present scope.
Instead, we present here a simplified version of the fundamental equation, based on
the assumptions that the road is of a single lane and that (within the portion of road
being analyzed) there are no entrances or exits. The quantity we want to balance
is the content of cars. We, therefore, interpret u = u(x, t) as the car density at the
point x along the road at time t. Since we have assumed no entrances or exits, the
production term p vanishes identically. The flux term f has the following physical
interpretation: At any given cross section of the road and at a given instant of time,
it measures the number of cars per unit time that pass through that cross section or,
more precisely, the number of cars per unit time that enter one of the portions of road
to the right or left of the cross section. With our (counter-intuitive) sign convention,
a positive value of f corresponds to an inflow of cars. We have seen that the flux is
actually governed by a flux vector f. Denoting by v = v(x, t) the car-velocity field,
we can write
f = −u v. (2.25)

In other words, if the velocity points in the direction of the exterior normal to the
boundary (so that the dot product is positive) the term u v measures the number
of cars that in a unit of time are coming out through that boundary. Since in our
case everything is one-dimensional, the velocity vector is completely defined by its
component v along the axis of the road, so that we can write

f = −u v. (2.26)

The local balance equation (2.24) for this traffic flow problem reads, therefore,

u_t + [∂(uv)/∂u] u_x = 0.    (2.27)

The time has come now to adopt some constitutive law. Clearly, the velocity of
the cars may depend on a large number of factors, including the time of day, the
weather, the traffic density, etcetera. In the simplest model, the velocity will depend
only on the traffic density, with larger densities giving rise to smaller speeds. From
practical considerations, since cars have a finite length, there will be an upper bound
u max for the density, and it is sensible to assume that when this maximum is attained
the traffic comes to a stop. On the other hand, we may or may not wish to consider
an upper limit vmax for the speed, when the traffic density tends to zero. If we do, a
possible constitutive equation that we may adopt is

v = v_max (1 − u/u_max).    (2.28)

If, on the other hand, we do not want to impose a speed limit in our model, a possible
alternative constitutive law is

v = k ln(u_max/u),    (2.29)
where k is a positive constant (which perhaps varies from road to road, as it may take
into consideration the quality of the surface, the width of the lane, and so on).
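The two candidate laws can be compared numerically; the parameter values below are arbitrary. Both give v → 0 as u → u_max, but only (2.28) keeps the speed bounded as the road empties.

```python
import math

u_max = 1.0        # jam density (arbitrary units)
v_max = 100.0      # speed limit appearing in law (2.28)
k = 30.0           # road-quality constant appearing in law (2.29)

def v_linear(dens):   # Eq. (2.28)
    return v_max * (1.0 - dens / u_max)

def v_log(dens):      # Eq. (2.29)
    return k * math.log(u_max / dens)

for dens in (0.01, 0.5, 0.99):
    print(dens, round(v_linear(dens), 1), round(v_log(dens), 1))

# As dens -> u_max both speeds tend to 0; as dens -> 0 the
# logarithmic law (2.29) grows without bound.
```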
Introducing the constitutive law (2.28) into our balance law (2.27), we obtain the
quasi-linear first-order PDE

u_t + (1 − 2u/u_max) v_max u_x = 0.    (2.30)

In the extreme case when the speed is independent of the density and equal to a
constant, we obtain the advection equation

u_t + v_max u_x = 0.    (2.31)
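A quick numerical experiment confirms that Eq. (2.31) transports the density profile rigidly at speed v_max. The sketch below uses a first-order upwind scheme with arbitrarily chosen grid parameters (and a periodic wrap at the left end).

```python
import math

v = 1.0                  # constant car speed v_max (arbitrary units)
dx, dt = 0.01, 0.005     # CFL number v*dt/dx = 0.5 <= 1: stable upwind scheme
nx = 200
x = [i * dx for i in range(nx)]

# Initial density: a smooth bump centred at x = 0.5.
u = [math.exp(-((xi - 0.5) / 0.05) ** 2) for xi in x]

for _ in range(100):     # advance to time T = 0.5
    # u[i-1] wraps around at i = 0 (periodic boundary); the bump never gets there.
    u = [u[i] - v * dt / dx * (u[i] - u[i - 1]) for i in range(nx)]

# The peak has moved to x = 0.5 + v*T = 1.0, slightly smeared by the
# scheme's numerical diffusion but not displaced.
peak = x[u.index(max(u))]
print(round(peak, 2))  # 1.0
```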

2.4.2 Diffusion

Diffusive processes are prevalent in everyday life. They occur, for example, whenever
a liquid or gaseous substance spreads within another (sneezing, pouring milk into a
cup of coffee, industrial pollution, etc.). The process of heat flow through a substance
subjected to a temperature gradient is also a diffusive process. All these processes
are characterized by thermodynamic irreversibility (the drop of milk poured into the
coffee will never collect again into a drop).
Consider a tube filled with water at rest in which another substance (the pollutant)
is present with a variable concentration u = u(x, t). Let p = p(x, t) be the produc-
tion of pollutant per unit length and per unit time. This production can be the result
of industrial exhaust into the tube, coming from its lateral surface at various points,

or of a similar process of partial clean-up of the tube. If there is any influx through
the ends of the tube, it will have to be considered as part of the boundary conditions
(which we have not discussed yet), rather than of the production term. The flux, just
as in the case of traffic flow, represents the amount of pollutant traversing a given
cross section per unit time. In the case of traffic flow, we introduced as a variable
of state the speed of the traffic, which we eventually related to the car density by
means of a constitutive law. In the case of diffusion of a pollutant, on the other hand,
it is possible to formulate a sensible, experimentally based, constitutive law directly
in terms of the pollutant concentration. The most commonly used law, called Fick’s
law, states that the flux vector is proportional to the gradient of the concentration,
namely,
f = D u_x,    (2.32)

where the constant D is the diffusivity of the pollutant in water. A moment’s reflection
reveals that, with our sign convention, if we want the pollutant to flow in the direction
of smaller concentrations, the diffusivity must be positive. Introducing these results
into the general balance equation (2.18), we obtain

u_t − D u_xx = p.    (2.33)

This second-order linear PDE is known as the (non-homogeneous)6 diffusion equation. It is also known as the one-dimensional heat equation, in which case u stands
for the temperature and the constant D is a combination of the heat capacity and the
conductivity of the material.
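For readers who want to see Eq. (2.33) in action, here is a minimal explicit finite-difference (FTCS) sketch with p = 0; the diffusivity, grid, and initial spike are all invented for the illustration.

```python
# Explicit (FTCS) integration of u_t = D u_xx, i.e. Eq. (2.33) with p = 0.
D = 1.0
dx, dt = 0.05, 0.001     # stability requires D*dt/dx**2 <= 1/2 (here it is 0.4)
nx = 101
u = [0.0] * nx
u[nx // 2] = 1.0         # initial spike of pollutant in the middle of the tube

for _ in range(200):
    un = u[:]
    for i in range(1, nx - 1):
        u[i] = un[i] + D * dt / dx**2 * (un[i + 1] - 2 * un[i] + un[i - 1])
    # u[0] and u[-1] stay 0: pollutant-free ends (a boundary condition).

# The spike spreads symmetrically and its peak decays irreversibly.
print(round(max(u), 4))
```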

2.4.3 Longitudinal Waves in an Elastic Bar

Assuming that the particles in a thin cylindrical bar are constrained to move in the
axial direction, the law of balance of momentum (Newton’s second law) can be seen
as a scalar equation. The momentum density (momentum per unit length) is given
by ρ A v, ρ being the mass density, A the cross-section area and v the component
of the velocity vector. The production term in this case consists of any applied force
per unit length (such as the weight, if the bar is held vertically). We will assume for
now that there are no applied external forces, so that the production term vanishes
identically. The flux associated with the momentum is what we call the stress tensor,
which in this case can be represented by a single component σ (perpendicular to the
normal cross sections). The balance of momentum7 reads

6 The adjective non-homogeneous, in this case, refers to the fact that there are sources or sinks, that
is, p does not vanish identically. Material inhomogeneity, on the other hand, would be reflected in
a variation of the value of the diffusivity D throughout the tube.
7 Neglecting convective terms.

∂(ρAv)/∂t = ∂(σA)/∂x.    (2.34)
Assuming constant cross section and density, we obtain

ρ v_t = σ_x.    (2.35)

This balance equation needs to be supplemented with a constitutive law. For a linearly
elastic material, the stress is proportional to the strain (ε), that is,

σ = E ε, (2.36)

where E is Young’s modulus. Adopting the (axial) displacement u = u(x, t) as a
state variable, we have

v = u_t,    (2.37)

by definition of velocity, and

ε = u_x,    (2.38)

by the kinematic relations of the infinitesimal theory of strain. Putting all these results
together we obtain the second-order linear PDE

u_tt = c² u_xx,    (2.39)

where the constant c is given by


c = √(E/ρ).    (2.40)

Equation (2.39) is known as the one-dimensional wave equation. The constant c will
be later interpreted as the speed of propagation of waves in the medium. A similar
equation can be derived for the problem of small transverse vibrations of a string
(such as that of a guitar) under tension. In this case, the constant c is given by the
square root of the ratio between the tension in the string and its mass per unit length.8
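The interpretation of c as a propagation speed can be anticipated symbolically: any twice-differentiable profile translating rigidly at speed c satisfies Eq. (2.39). A sympy check:

```python
import sympy as sp

x, t, c = sp.symbols('x t c')
g = sp.Function('g')     # an arbitrary (twice differentiable) wave profile

# A profile translating rigidly to the right at speed c ...
u = g(x - c * t)

# ... satisfies the one-dimensional wave equation u_tt = c**2 u_xx:
residual = sp.diff(u, t, 2) - c**2 * sp.diff(u, x, 2)
print(sp.simplify(residual))  # 0
```

The same computation with g(x + c·t) shows that left-moving profiles work equally well.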

2.4.4 Solitons

In many phenomena not governed by the wave equation it is possible to observe
the propagation of highly concentrated pulses traveling undistorted at a constant
speed. These traveling waves are known as solitons. They were first observed in
shallow water, where gravity drives the propagation. The equation governing this
phenomenon was first derived by Korteweg and de Vries in 1895 and is now known

8 See Sect. 8.1.



as the KdV equation. We will not present its derivation. It reads

u_t + u u_x + u_xxx = 0.    (2.41)

Here, u = u(x, t) represents a measure of the height of the water in a long channel
of constant cross section. The KdV equation is a third-order quasi-linear PDE. It can
be brought to the form of a conservation law (2.18) by setting p = 0 and

f = −(u²/2 + u_xx).    (2.42)
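For this normalization of the KdV equation the standard one-soliton solution is u = 3c sech²(√c (x − ct)/2) — quoted here from the soliton literature, not derived in the text — and its residual can be checked symbolically:

```python
import sympy as sp

x, t = sp.symbols('x t')
c = sp.Rational(1)       # soliton speed; any positive value works

# Candidate soliton profile for u_t + u u_x + u_xxx = 0:
u = 3 * c * sp.sech(sp.sqrt(c) * (x - c * t) / 2) ** 2

residual = sp.diff(u, t) + u * sp.diff(u, x) + sp.diff(u, x, 3)
print(sp.simplify(residual.rewrite(sp.exp)))  # 0
```

Note that the amplitude 3c grows with the speed c: taller pulses travel faster, a hallmark of solitons.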

2.4.5 Time-Independent Phenomena

If in the examples just presented we eliminate the dependence of the variables on
time, namely if u = u(x), we obtain in each case an ODE representing a configuration
of steady state or equilibrium of the system. On the other hand, if we were to
extend the spatial domain to two dimensions (rather than just one, as we
have been doing so far), the steady-state equation would still be a PDE in two inde-
pendent variables. As an example, we will consider the equilibrium configuration of
a membrane which has been initially stretched by applying a high uniform tension
T (per unit length, say) in all directions and then attached to a rigid plane frame
along its perimeter. The membrane thus prepared is then subjected to a transverse
load (perpendicular to the plane of the frame) of magnitude q(x, y), where x, y is
a system of Cartesian coordinates in the plane of the unloaded membrane. We are
interested in calculating the transverse deflection w = w(x, y) corresponding to an
equilibrium configuration.
We have already remarked that the transverse vibrations of a tensed string are given
by the wave equation (2.39). A careful derivation of the analogous two-dimensional
membrane counterpart, as depicted in Fig. 2.3, would lead to the dynamical equation

w_xx + w_yy = −q/T + (ρh/T) w_tt,    (2.43)
where h is the thickness of the membrane. In the absence of the external loading
term q, this is the two-dimensional wave equation. If, on the other hand, we seek
an equilibrium position under the action of a time-independent load, we obtain the
second-order linear PDE

w_xx + w_yy = −q/T. (2.44)
This is the Poisson equation. If the right-hand side vanishes (no load, but perhaps a
slightly non-planar frame) we obtain the Laplace equation. These equations appear
in many other engineering applications, including fluid mechanics, acoustics, elec-
trostatics and gravitation.
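To illustrate (2.44) numerically (a sketch of mine, not from the text; grid size and iteration count are ad hoc choices), the clamped membrane over the unit square with uniform q/T = 1 can be solved by Jacobi iteration on the five-point finite-difference stencil. The classical series solution gives a centre deflection of about 0.0737 for a unit side.

```python
# Jacobi iteration for w_xx + w_yy = -q/T on the unit square, w = 0 on the frame.
N = 21                       # grid points per side, so h = 1/(N-1)
h = 1.0 / (N - 1)
qT = 1.0                     # uniform load q/T
w = [[0.0] * N for _ in range(N)]

for _ in range(2000):
    new = [row[:] for row in w]
    for i in range(1, N - 1):
        for j in range(1, N - 1):
            # Five-point stencil: w = (sum of neighbours + h^2 q/T) / 4.
            new[i][j] = 0.25 * (w[i + 1][j] + w[i - 1][j]
                                + w[i][j + 1] + w[i][j - 1] + h * h * qT)
    w = new

center = w[N // 2][N // 2]   # approx. 0.0737 for q/T = 1 on the unit square
```

The symmetry of the load and boundary is inherited exactly by the iterates, which is a useful consistency check on the implementation.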
2.4 Examples of PDEs in Engineering 41

Fig. 2.3 Balance of forces in a membrane

2.4.6 Continuum Mechanics

In Continuum Mechanics [1] the field variables are always associated with a con-
tinuous material body as the carrier of contents, sources and fluxes. The material
body, made up of material points, manifests itself through its configurations in the
physical space R3 . In this brief presentation, we will adhere strictly to the Eulerian
formulation, which adopts as its theatre of operations the current configuration of
the body in space. The domains ω used in the formulation of the generic balance
equation must, accordingly, be subsets of the current configuration. In other words,
they must be made of spatial points occupied at the current time by material particles.
The generic equation of balance in its global form (2.7) or in its local form (2.11),
is still applicable. In Continuum Mechanics, however, it is convenient to identify
two distinct parts of the total flux F through the boundary ∂ω which we call the
convected flux Fc and the physical flux F p , that is,

F = F_c + F_p. (2.45)

The convected flux appears as a natural consequence of having adopted a fixed
spatial volume ω for the analysis. Since the material particles are moving with a
velocity field v = v(x1 , x2 , x3 , t), they are in general entering into or exiting from ω.
In so doing, they import or export a certain amount of content per unit time. Figure 2.4
makes it clear what this amount dFc is at any given elemental area da lying on the
boundary of ω. We have
dF_c = −u (v · n) da, (2.46)

Fig. 2.4 The amount of content flowing out of the volume ω through the area element da during a time dt is proportional to the volume of the parallelepiped shown

where n is the exterior unit normal to da as part of ∂ω. The negative sign indicates
that we have assumed an influx as positive. Naturally, if the particle velocity happens
to be tangential to the boundary at a point, the convected flux at that point vanishes.

Remark 2.1 Had we assumed a material volume as the point of departure (that is,
a volume that follows the particles in their motion in space), the corresponding
convected flux would have automatically vanished. This simplification would have
resulted, however, in the need to use a compensatory transport theorem for the
calculation of the time variation of an integral on a moving domain. The convected
flux is, literally, in the eyes of the beholder.

The second part of the flux, F p has the clear physical meaning of the flow of
content through the boundary due to causes other than mere motion or rest of the
control volume. Thus, for instance, if the content in question is the internal energy
of a rigid body at rest, the flux through the boundary represents the conductive heat
flux. It is important to notice once more that the physical flux takes place at each
internal boundary (not just the external boundary of the body) separating a sub-body
from the rest. Cauchy’s theorem implies that the physical flux is governed by a flux
vector field f p , as before. Thus, we obtain the global form


d/dt ∫_ω u dω = ∫_ω p dω + ∫_∂ω (−u v + f_p) · n da. (2.47)

Applying the divergence theorem and following our previous localization argument
we obtain the generic law of balance of Continuum Mechanics in its local form as

∂u/∂t − p − div(−u v + f_p) = 0. (2.48)
In terms of the material derivative, discussed in Box 2.1, this equation can also be
written as
Du/Dt − p + u div v − div f_p = 0. (2.49)

Box 2.1 The material derivative


The partial time-derivative ∂u/∂t appearing in Eq. (2.48) describes the rate of
change of the density u at a fixed spatial position (x1 , x2 , x3 ). If we imagine
an observer sitting at that position and recording events as time goes on, this
partial derivative is the slope of the graph of u versus time as produced by
that observer. Thus, if the regime happens to be steady, this slope will vanish
identically. If we imagine, instead, a hypothetical observer riding with a fixed
material particle, this second observer will record a different result! The slope
of the graph recorded by this moving observer describes the rate of change
of u at that particle. This is called the material derivative of u, denoted by
Du/Dt. In the case of a steady state which is spatially non-constant, clearly
the material derivative will not vanish in general. What is the relation between
the partial derivative and the material derivative? The particle that at time t
passes through the spatial position (x1 , x2 , x3 ) will occupy at time t + dt the
position (x1 + v1 dt, x2 + v2 dt, x3 + v3 dt), where (v1 , v2 , v3 ) are the local
components of the velocity. At this point the value of u has, to a first degree of
approximation, the value

u(x1 + v1 dt, x2 + v2 dt, x3 + v3 dt, t + dt)
= u(x1, x2, x3, t) + ∂u/∂x1 v1 dt + ∂u/∂x2 v2 dt + ∂u/∂x3 v3 dt + ∂u/∂t dt.

Accordingly, we obtain

Du/Dt = lim_{dt→0} [u(x1 + v1 dt, x2 + v2 dt, x3 + v3 dt, t + dt) − u(x1, x2, x3, t)] / dt
= ∂u/∂t + v1 ∂u/∂x1 + v2 ∂u/∂x2 + v3 ∂u/∂x3
= ∂u/∂t + ∇u · v,
where ∇u is the gradient of u.
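The two observers of Box 2.1 are easy to simulate (an illustrative Python sketch; the particular field and velocity are my own choices). For a steady but spatially varying u, the observer at a fixed point records a zero slope, while the rate recorded riding with a particle matches ∂u/∂t + ∇u · v:

```python
def u_field(x, y, t):
    # Steady but spatially varying content density: du/dt at a fixed point is 0.
    return x * x + 3.0 * y

VX, VY = 2.0, -1.0                   # uniform velocity field (illustrative)

def path(t, x0=1.0, y0=0.5):
    # Trajectory of the particle located at (x0, y0) at t = 0.
    return x0 + VX * t, y0 + VY * t

def riding_observer_rate(t, dt=1e-4):
    # Finite-difference slope of u recorded while riding with the particle.
    xp, yp = path(t + dt)
    xm, ym = path(t - dt)
    return (u_field(xp, yp, t + dt) - u_field(xm, ym, t - dt)) / (2.0 * dt)

def material_derivative(t):
    # Du/Dt = du/dt + grad(u) . v, with du/dt = 0 for this steady field.
    x, y = path(t)
    return 0.0 + (2.0 * x) * VX + 3.0 * VY
```

Although the regime is steady, the riding observer sees u change, precisely because the particle is carried through a spatially non-constant field.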

2.4.6.1 Conservation of Mass: The Continuity Equation

A balance law in Continuum Mechanics is said to be a conservation law if the
production p and the physical flux f_p vanish identically.9 The content u in this case
is to be identified with the mass density ρ = ρ(x1 , x2 , x3 , t). Applying the generic
equation of balance (2.48) we obtain

∂ρ/∂t + div(ρv) = 0, (2.50)
or, using the material derivative,


Dρ/Dt + ρ div v = 0. (2.51)
Dt
This equation is known in Fluid Mechanics as the continuity equation.
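As a quick check (an illustration of mine, not from the text), the one-dimensional fields ρ = ρ0/(1 + t) and v = x/(1 + t), which describe a uniformly expanding medium, satisfy (2.50) identically; central differences confirm that the residual ∂ρ/∂t + ∂(ρv)/∂x vanishes to within truncation error:

```python
RHO0 = 2.0

def rho(x, t):
    # Spatially uniform, decaying density of the expanding medium.
    return RHO0 / (1.0 + t)

def v(x, t):
    # Linear expansion velocity field.
    return x / (1.0 + t)

def continuity_residual(x, t, h=1e-5):
    # d(rho)/dt + d(rho v)/dx evaluated with central differences.
    drho_dt = (rho(x, t + h) - rho(x, t - h)) / (2.0 * h)
    dflux_dx = (rho(x + h, t) * v(x + h, t)
                - rho(x - h, t) * v(x - h, t)) / (2.0 * h)
    return drho_dt + dflux_dx
```

Analytically, ∂ρ/∂t = −ρ0/(1 + t)^2 while ∂(ρv)/∂x = ρ0/(1 + t)^2, so the two contributions cancel exactly.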

2.4.6.2 Balance of Linear Momentum

In this case, according to Newton’s second law, the content density is the vector of
linear momentum, namely, ρ v. The production is the body force b per unit spatial
volume and the physical flux is the surface traction vector t. We will implement
the generic balance equation component by component in a global Cartesian inertial
frame. For each component ti of the traction vector, according to Cauchy’s theorem,
there exists a flux vector with components σi j ( j = 1, 2, 3). We have thus a matrix
representing, in the given coordinate system, the components of the Cauchy stress
tensor σ. The surface traction t = σ · n is best expressed in components as

t_i = σ_ij n_j, (2.52)

where the summation convention for repeated indices has been enforced, as explained
in Box 2.2. The equation of balance (2.48) for ρvi reads

∂(ρv_i)/∂t − b_i − (−ρ v_i v_j + σ_ij),j = 0. (2.53)
On the other hand, the continuity equation (2.50) is written in Cartesian components
(always enforcing the summation convention) as

∂ρ/∂t + (ρ v_j),j = 0. (2.54)

9 This is the case of the conservation of mass in conventional Continuum Mechanics. In the context of growing bodies (such as is the case in some biological materials) mass is not necessarily conserved.

Combining the last two results, we obtain

ρ ∂v_i/∂t + ρ v_i,j v_j = b_i + σ_ij,j. (2.55)
Using the material derivative, we can write the local form of the balance of linear
momentum as
ρ Dv_i/Dt = b_i + σ_ij,j. (2.56)
The material derivative of the velocity is, not surprisingly, the acceleration. Having
thus enforced the conservation of mass, the form of Newton’s second law for a
continuum states that the mass density times the acceleration equals the body force
plus the net contact force over the boundary of an elementary volume element.

Box 2.2 The summation convention in Cartesian coordinates


Attributed to Albert Einstein, the summation convention is a notational device
which, in addition to being compact and elegant, is often revealing of possible
mistakes in computations. In an expression or equation made up of sums of
monomials, where each monomial consists of products of indexed quantities,
the following rules apply:
1. In each monomial an index (subscript) can appear at most twice. If repeated
in a monomial, an index is said to be dummy. Otherwise, if it appears just
once, it is called free.
2. The free indices must be balanced in all monomials. In other words, every
monomial in an expression must have exactly the same free indices.
3. The dummy indices in every monomial indicate a summation over that
index in the range 1 to n, where n is the dimension of the Cartesian space
under consideration.
As an example, the divergence of a vector field v is given by

div v = v_i,i,

where, as in Eq. (2.2), commas stand for partial derivatives with respect to the
coordinates designated by the indices following the comma. An equation such
as
B_i = A_ijk C_jk

stands for

B_i = Σ_{k=1}^{n} Σ_{j=1}^{n} A_ijk C_jk.

Expressions such as

C = D_kkk,   C_i = D_jkk,   C_ij = D_ikk

are considered wrong, unless the summation convention has been explicitly
suspended.
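The rules above expand mechanically into nested loops. A small sketch (the array values are arbitrary choices of mine) for a contraction of the form B_i = A_ijk C_jk with n = 3:

```python
n = 3
# Arbitrary indexed quantities standing in for A_ijk and C_jk.
A = [[[i + 2 * j - k for k in range(n)] for j in range(n)] for i in range(n)]
C = [[j * k + 1 for k in range(n)] for j in range(n)]

# In B_i = A_ijk C_jk the indices j and k are dummy (summed over),
# while i is free and must appear on both sides.
B = [sum(A[i][j][k] * C[j][k] for j in range(n) for k in range(n))
     for i in range(n)]

# Similarly, div v = v_i,i expands to a single sum over the repeated index i.
```

Writing the sums out like this makes rule 1 visible in the code: each dummy index appears exactly twice inside the monomial being summed.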

2.4.6.3 Balance of Angular Momentum

For a continuous deformable medium, unlike the case of a rigid body, the balance
of angular momentum is an independent postulate. It establishes that the angular
momentum with respect to any point attached to an inertial reference frame is equal
to the sum of the moments of the external forces about that point. The angular momen-
tum density is the (pseudo-)vector r ×(ρv), where, without any loss of generality, we
have identified the fixed point as the origin of coordinates, so that r is the standard
position vector. Assuming the absence of applied couples, and enforcing both mass
conservation and balance of linear momentum, the final result is purely algebraic,
namely,
σi j = σ ji . (2.57)

Put in words, the Cauchy stress tensor is symmetric.

2.4.6.4 Balance of Energy: The First Law of Thermodynamics in a Continuous Medium

The total energy density e in a continuum is given by

e = (1/2) ρ v · v + ρ ε. (2.58)
In this expression, the first term on the right-hand side represents the kinetic energy
density while ε is the internal energy per unit mass, postulated to exist as a function of
state by the first law of Thermodynamics. The mechanical source density is stipulated
by the same law as the power of the body force, that is b · v, while the thermal (that
is, non-mechanical) source is provided by sources of heat distributed with a density
r per unit volume. Similarly, the physical mechanical flux is given by the power of
the traction, that is, t · v while the physical heat flux is defined by means of a heat
flux vector q such that the non-mechanical influx per unit area and per unit time is
given by −q · n. The balance of energy equates the rate of change of the total energy

with the sum of the mechanical and thermal contributions. The final result can be
expressed as
ρ Dε/Dt = r − q_i,i + σ_ij v_i,j, (2.59)
where the previous balance equations have been enforced.

Exercises

Exercise 2.1 Derive Eq. (2.18) by applying the equation of balance to a small (infin-
itesimal) slice of the bar, that is, for a slice contained between the cross sections at
x and at x + d x.

Exercise 2.2 Carry out all the steps leading to Eq. (2.57) that establishes the sym-
metry of the Cauchy stress tensor.

Exercise 2.3 Carry out all the steps leading to Eq. (2.59) that establishes the balance
of energy in a continuous medium.

Exercise 2.4 (The Navier-Stokes equations) The rate of deformation tensor D is
defined as the symmetric part of the velocity gradient. In components

D_ij = (1/2)(v_i,j + v_j,i).
The Newtonian compressible viscous fluid has the constitutive equation

σ = −p(ρ) I + λ (tr D) I + 2μ D.

In this equation, p = p(ρ) is some increasing function of the density and λ and μ
are constant viscosity coefficients. The symbol I stands for the spatial identity tensor
and the operator tr is the trace, namely, trD = Dii is the sum of the diagonal entries
in the matrix representing D in a Cartesian frame. Use this constitutive equation in
the equation of balance of linear momentum to obtain the Navier-Stokes equations
of Fluid Mechanics [2]. These are three PDEs for the density and the components of
the velocity field. The continuity equation completes the system.
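A single-point sketch of the constitutive computation in Exercise 2.4 (illustrative numbers of my own; this is not the full derivation of the Navier-Stokes equations): form D from a velocity gradient, assemble σ = −p(ρ)I + λ(tr D)I + 2μD, and observe that σ comes out symmetric, as (2.57) requires.

```python
# Velocity gradient L_ij = v_i,j at one spatial point (made-up values).
L = [[0.1, 0.4, 0.0],
     [0.2, -0.3, 0.5],
     [0.0, 0.1, 0.2]]
lam, mu = 1.5, 0.8           # viscosity coefficients (illustrative)
p = 1.0                      # value of p(rho) at this point (illustrative)

# Rate of deformation: symmetric part of the velocity gradient.
D = [[0.5 * (L[i][j] + L[j][i]) for j in range(3)] for i in range(3)]
trD = sum(D[i][i] for i in range(3))

# Cauchy stress of the Newtonian compressible viscous fluid.
sigma = [[(-p + lam * trD) * (1 if i == j else 0) + 2.0 * mu * D[i][j]
          for j in range(3)] for i in range(3)]
```

Substituting σ_ij,j into (2.56) and eliminating the stress is exactly the algebra the exercise asks for; the sketch only shows the pointwise constitutive step.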

References

1. Chadwick P (1999) Continuum mechanics: Concise theory and problems. Dover, New York
2. Chorin AJ, Marsden JE (1993) A mathematical introduction to fluid mechanics. Springer, Berlin
3. Knobel R (2000) An introduction to the mathematical theory of waves. American Mathematical
Society, Providence
4. Wolfram S (2002) A new kind of science. Wolfram Media, Champaign
Part II
The First-Order Equation
Chapter 3
The Single First-Order Quasi-linear PDE

Remarkably, the theory of linear and quasi-linear first-order PDEs can be entirely
reduced to finding the integral curves of a vector field associated with the coefficients
defining the PDE. This idea is the basis for a solution technique known as the method
of characteristics. It can be used for both theoretical and numerical considerations.
Quasi-linear equations are particularly interesting in that their solution, even when
starting from perfectly smooth initial conditions, may break up. The physical meaning
of this phenomenon can be often interpreted in terms of the emergence of shock
waves.

3.1 Introduction

It has been pointed out by many authors1 that there is no general theory that encom-
passes all partial differential equations or even some large classes of them. The
exception is provided by the case of a single first-order PDE, for which a general
theory does exist that, remarkably, reduces the problem to the solution of certain
systems of ordinary differential equations. The most general first-order PDE for a
function u = u(x1 , . . . , xn ) is of the form
 
F(x1, …, xn, u, u,1, …, u,n) = 0, (3.1)

where F is a function of 2n + 1 variables.


A commonly used notation, inspired perhaps by Hamiltonian Mechanics, is
obtained by defining the new (‘momentum’) variables
p_i = u,i,   i = 1, …, n. (3.2)

The differential equation (3.1) is written as

F (x1 , . . . , xn , u, p1 , . . . , pn ) = 0. (3.3)

1 This point is made most forcefully by Arnold in [1].
© Springer International Publishing AG 2017 51
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_3

In a (2n + 1)-dimensional space with coordinates x1, …, xn, u, p1, …, pn the differential
equation acquires a peculiar geometric meaning, briefly described in Box 3.1,
which will not be pursued in this book, but which may provide at least a glimmer of
a modern geometric way to understand the theory. This approach is pursued in [1].

Box 3.1 A curious geometric view of differential equations


Modern Differential Geometry goes beyond the analysis of curves and surfaces
in R3 and works, instead, on general spaces or manifolds of arbitrary dimension
that may not even have a metric structure. Although such flights of fancy are
well beyond the scope of this book, let us look briefly at the geometrical mean-
ing of a differential equation and its solutions. For simplicity, let us consider the
case n = 1, namely, the case of an ODE. A PDE is treated similarly in higher
dimensional spaces. Consider, therefore, a space with coordinates x, u, p, these
coordinates representing, respectively, the independent variable, the dependent
variable and its derivative, as shown in Fig. 3.1. Technically, these are coor-
dinates in the space of jets of dimension 1 over the manifold whose running
coordinate is x. In this jet space, the differential equation F(x, y, p) = 0 can be
regarded as a surface F! More generally, a differential equation is nothing more
than a 2n-dimensional (hyper-)surface embedded in the (2n + 1)-dimensional
space of 1-jets. This is the geometric meaning of a differential equation. What
is a solution? In the case n = 1, a solution is a differentiable curve in the
x, u space with the property that the derivative p = du/d x at each point x, u
determines a point (x, u, p) lying on the differential equation F = 0 (that is,
the surface F). This analytic statement can also be given a geometric meaning.
For any differentiable curve γ in the two-dimensional space x, u we can define
its canonical lift as a curve γ∗ in the three-dimensional jet space x, u, p by
simply adding the ‘slope’ coordinate p = du/d x. Thus, a straight line in the
space x, u is lifted canonically to a constant (‘horizontal’) straight line in the
jet space. In this view, a solution of the differential equation F(x, u, p) = 0 is
a curve γ whose canonical lift lives in the surface F.

Fig. 3.1 A differential equation (left) and the canonical lift of a curve (right)

The first-order PDE (3.3) is quasi-linear if it is linear in each of the derivative
variables p_i (i = 1, …, n). The general form of a quasi-linear first-order PDE is,
accordingly,

a1(x1, …, xn, u) ∂u/∂x1 + · · · + an(x1, …, xn, u) ∂u/∂xn = c(x1, …, xn, u). (3.4)
∂x1 ∂xn

If the right-hand side vanishes identically the equation is said to be homogeneous.
The equation is linear if it is linear in both the unknown function u and its derivatives.
Thus, a quasi-linear equation is a particular case of a non-linear equation.

3.2 Quasi-linear Equation in Two Independent Variables

In the case of two independent variables, x and y, it is possible to visualize the solution
of a PDE for a function u = u(x, y) as a surface2 in the three-dimensional space with
coordinates x, y, u. We call this surface an integral surface of the PDE. As pointed
out by Courant and Hilbert,3 geometric intuition is of great help in understanding
the theory, so that it seems useful to limit ourselves to the case of two independent
variables, at least for now. The visualization of the elements of the theory becomes
particularly useful in the case of linear and quasi-linear equations. The general non-
linear case is more difficult to grasp and will be the subject of Chap. 5.
The general form of a quasi-linear first-order PDE in two independent variables
is
a(x, y, u) u x + b(x, y, u) u y = c(x, y, u). (3.5)

Consider a possible solution u = u(x, y) of Eq. (3.5). To be considered as a
solution it must be at least once differentiable (otherwise, we wouldn’t even be able
to check that it is a solution).4 What this means from the geometrical point of view

2 This visualization has nothing to do with the more abstract geometric interpretation given in Box
3.1, which we will not pursue.
3 In [3], p. 22. This classical treatise on PDEs, although not easy to read, is recommended as a basic reference work in the field of PDEs. A few of the many standard works that deal with first-order PDEs (not all books do) are: [4–7]. Don’t be fooled by the age of these books!
4 Later, however, we will allow certain types of discontinuities of the solution.

is that the surface representing the solution has a well-defined tangent plane at each
point of its domain and, therefore, a well-defined normal direction.5 As we know
from Eq. (1.13), at any given point x, y a vector (not necessarily of unit length) in
this normal direction is given by the components
(u_x, u_y, −1). (3.6)

Since Eq. (3.5) can be written as

(a, b, c) · (u_x, u_y, −1) = 0, (3.7)

we conclude that the statement of the differential equation can be translated into
the following geometric statement: The normal to a solution surface must be at each
point perpendicular to the characteristic vector w with components (a, b, c) evaluated
at that point in space. But this is the same as saying that this last vector must lie in
the tangent plane to the solution surface!

Remark 3.1 The use of the imagery of a solution of a first-order PDE in two inde-
pendent variables as a surface in R3 is hardly necessary. As discussed in Box
3.2, it carries spurious geometric ideas, such as the normal to the surface. These
ideas are certainly useful to visualize the properties of a solution, but may not be
directly extended to higher dimensions. That the characteristic vector is tangent
to a solution surface can, in fact, be easily proved by purely analytical means.
Indeed, let u = u(x, y) be a solution and let P = (x0 , y0 , u 0 = u(x0 , y0 )) be
a point of this solution. The characteristic vector at this point has components
a0 = a(x0 , y0 , u 0 ), b0 = b(x0 , y0 , u 0 ), c0 = c(x0 , y0 , u 0 ). For a nearby point
P + d P = (x0 + d x, y0 + dy, u 0 + du) to lie in the solution, the increments
must satisfy the algebraic relation

du = u_x dx + u_y dy,

where the derivatives are evaluated at P. If we take dx = a0 and dy = b0 we obtain
du = u_x a0 + u_y b0 = c0, according to the statement of the PDE. We conclude,
therefore, that the characteristic vector lies in the tangent space of the solution at P.

5 But see Box 3.2.



Box 3.2 What does normal mean?


It is clear that a differentiable function u = u(x, y) has a well-defined tangent
plane at each point (x0 , y0 ). The equation of this plane is given by

u − u0 = p(x − x0) + q(y − y0),

where u0 = u(x0, y0) and where the derivatives p = u_x, q = u_y are evaluated
at (x0, y0). But what do we mean by a vector perpendicular to this plane?
Our coordinates x, y, u may have completely different meanings and units.
Thus, x may represent space, y may represent time and u may be a velocity!
In this context, what does it mean to measure lengths of vectors or angles
between them? In a rigorous theory, where the space of coordinates x, y, u is
not endowed with any intrinsic metric structure we cannot speak of lengths
and angles. Instead, we must distinguish between vectors and co-vectors and
the action of the latter over the former. In particular, the gradient operator
produces a co-vector, for which no metric structure is necessary. To avoid
these refinements, once we have chosen units for each of the coordinates, we
assign to each point a unique numerical triple (x, y, u) and measure lengths
and angles in the ordinary sense of R3 . We must, however, make sure that our
conclusions are independent of this choice. As a simple example, suppose that
we find two vectors each of which is perpendicular to the tangent plane to the
surface at some point. We conclude that these two vectors must be collinear.
This conclusion is independent of the metric that we have chosen on the basis
of the coordinate units.

Let us consider the following problem in the theory of systems of ODEs: Find inte-
gral curves of the characteristic vector field w(x, y, u). From Chap. 1, we have some
experience in this type of problem, so we translate it into the system of characteristic
equations
dx/ds = a(x, y, u), (3.8)

dy/ds = b(x, y, u), (3.9)

du/ds = c(x, y, u). (3.10)
As we know from the theory of ODEs, this system always has a solution (at least
locally). This solution can be visualized as a family of non-intersecting integral
curves in space. In the context of the theory of first-order quasi-linear PDEs these
curves are called the characteristic curves of the differential equation, or simply

characteristics. We have already called the vector field w with components (a, b, c)
the characteristic vector field. The characteristics are, thus, the integral curves of the
characteristic vector field.
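To make the characteristic system concrete (an example of my own devising), consider x u_x + y u_y = 2u, for which u = xy is easily checked to be a solution. Integrating (3.8)-(3.10) with a fourth-order Runge-Kutta step from a point on that surface, the computed characteristic never leaves the surface, a numerical preview of the lemma proved in the next section.

```python
def rhs(x, y, u):
    # Characteristic vector field (a, b, c) for x u_x + y u_y = 2u.
    return x, y, 2.0 * u

def rk4_step(x, y, u, ds):
    # One classical fourth-order Runge-Kutta step for the system (3.8)-(3.10).
    k1 = rhs(x, y, u)
    k2 = rhs(x + 0.5 * ds * k1[0], y + 0.5 * ds * k1[1], u + 0.5 * ds * k1[2])
    k3 = rhs(x + 0.5 * ds * k2[0], y + 0.5 * ds * k2[1], u + 0.5 * ds * k2[2])
    k4 = rhs(x + ds * k3[0], y + ds * k3[1], u + ds * k3[2])
    return (x + ds / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            y + ds / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
            u + ds / 6.0 * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]))

# Start on the integral surface u = x y and march along the characteristic.
x, y, u = 1.0, 2.0, 2.0
drift = 0.0
for _ in range(1000):
    x, y, u = rk4_step(x, y, u, 0.001)
    drift = max(drift, abs(u - x * y))
```

The exact characteristic is x = e^s, y = 2e^s, u = 2e^{2s}, so the curve reaches x = e at s = 1 while the drift off the surface remains at the level of the integration error.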

3.3 Building Solutions from Characteristics

3.3.1 A Fundamental Lemma

Lemma 3.1 A characteristic curve having one point in common with an integral
surface of a quasi-linear first-order PDE necessarily lies on this surface entirely.

Proof Let P = (x0, y0, u0 = u(x0, y0)) be a point lying on an integral surface u = u(x, y)
of the PDE and let γ be the unique characteristic curve through P. As a solution
of the system of ODEs (3.8), (3.9) and (3.10), γ is expressed by some functions
x̂(s), ŷ(s), û(s). Let s0 be the value of the parameter s at P. Consider the following
function of s:

U(s) = û(s) − u(x̂(s), ŷ(s)). (3.11)

We claim that this function vanishes identically over the domain of existence of the
characteristic. The geometrical meaning of the function U (s) is displayed in Fig. 3.2.
We first project γ onto the (x, y) plane and obtain the projected characteristic β as
the curve with equations x̂(s), ŷ(s), 0. Next, we lift β to the integral surface as the

Fig. 3.2 The fundamental lemma


curve β+ with equations x̂(s), ŷ(s), u(x̂(s), ŷ(s)). Our claim is that γ = β+. The
function U is precisely the difference in the third coordinate of these two curves.
Clearly, by definition,
U (s0 ) = 0. (3.12)

A careful calculation of the derivative of U (using the chain rule and enforcing the
original differential equation) reveals that

dU/ds = dû/ds − ∂u/∂x dx̂/ds − ∂u/∂y dŷ/ds = dû/ds − a ∂u/∂x − b ∂u/∂y = dû/ds − c = 0. (3.13)

The (unique) solution to this trivial equation with the initial condition (3.12) yields

U (s) ≡ 0. (3.14)

This result means that γ must lie entirely on the solution surface. □
Remark 3.2 We note that in the proof of this lemma we have not invoked the fact
that the characteristic vector at P is tangential to the integral (solution) surface.
Put differently, we could have proved the lemma directly from the notion of integral
curves of the characteristic field. The fact that the characteristic vectors are tangential
to integral surfaces can be obtained as a corollary to the lemma.

3.3.2 Corollaries of the Fundamental Lemma

Two important corollaries can be obtained almost immediately from the Fundamental
Lemma 3.1.
1. If two different solution surfaces have one point in common, they must intersect
along the whole characteristic passing through that point. Conversely, if two solu-
tions intersect transversely along a curve, this curve must be a characteristic of the
differential equation. By transversely we mean that along the common curve they
don’t have the same tangent plane. This means that at any point along this curve
we have a well-defined line of intersection of the local tangent planes of the two
surfaces. But recall that the tangent to the characteristic curve must belong to both
tangent planes, and therefore to their intersection. In other words, the intersection
between the two planes is everywhere tangent to the characteristic direction and,
therefore, the intersection curve is an integral curve of the characteristic field.
2. An integral surface is necessarily a collection of integral curves, since once it
contains a point it must contain the whole characteristic through that point. Con-
versely, any surface formed as a one-parameter collection of characteristic curves
is an integral surface of the PDE. What we mean by “one-parameter collection” is
that, since the characteristic curves are already one-dimensional entities, to form
a surface (which is two-dimensional) we have one degree of freedom left. For

example, we can take an arbitrary non-characteristic line and consider the surface
formed by all the characteristics emerging from this line. To show that a surface
so formed must necessarily be an integral surface, that is, a solution of the PDE,
we consider an arbitrary point P on the given surface. Since, by construction, this
surface contains the characteristic through P, the normal to the surface is also
perpendicular to the characteristic direction. But this is precisely the statement of
the PDE, which is, therefore, satisfied at each point of the given surface.
The general conclusion is that the solutions of a single first-order quasi-linear PDE
in two variables can be boiled down to the solution of a system of ordinary differential
equations. This result remains true for more than two independent variables and also
for fully nonlinear equations (in which case the concept of characteristic curves must
be extended to the so-called characteristic strips).

3.3.3 The Cauchy Problem

The main problem in the theory of first-order PDEs is the following so-called Cauchy
problem or initial value problem6 : Given the values of u on a curve in the x, y plane,
find a solution that attains the prescribed values on the given curve. An equivalent
way to look at this problem is to regard the given (“initial”) data as a space curve
with parametric equations
x = x̄(r ), (3.15)

y = ȳ(r ), (3.16)

u = ū(r ). (3.17)

The Cauchy problem consists of finding an integral surface that contains this
curve. From the results of the previous section we know that this integral surface
must consist of a one-parameter family of characteristics. Let the characteristics be
obtained (by integration of the characteristic equations) as

x = f (s, A, B, C), (3.18)

y = g(s, A, B, C), (3.19)

u = h(s, A, B, C), (3.20)

where s is a parameter and where A, B, C are constants of integration. We will adjust
these “constants” so that for some fixed value of the parameter (s = 0, say) and for

6 Some authors reserve the name of initial value problem for the particular case in which the data are specified on one of the coordinate axes (usually at t = 0).

Fig. 3.3 A one-parameter (r) family of characteristics issuing from the initial curve (identified with s = 0)

a given value of r the values of x, y, u given by both sets of equations coincide. In
this way we obtain the characteristic issuing from the point r of the initial curve and,
moreover, we have adjusted matters so that at the point of their intersection the value
of the parameter on the characteristic curve is 0, as suggested in Fig. 3.3. We are in
fact creating a coordinate system r, s on the surface being constructed.
This adjustment process leads to the algebraic equations

f (0, A, B, C) = x̄(r ), (3.21)

g(0, A, B, C) = ȳ(r ), (3.22)

h(0, A, B, C) = ū(r ). (3.23)

These equations are, in principle, solvable for A, B, C in terms of r. In other words,
the constants of integration are adjusted at each point of the initial curve. This is what
we meant by a “one-parameter family of characteristics”. We thus obtain (perhaps
implicitly) three functions A(r ), B(r ), C(r ). Introducing these functions into the
characteristic equations, we obtain finally

x = f (s, A(r ), B(r ), C(r )) = x̃(r, s), (3.24)

y = g (s, A(r ), B(r ), C(r )) = ỹ(r, s), (3.25)

u = h(s, A(r), B(r), C(r)) = ũ(r, s). (3.26)



These three equations constitute the parametric representation of a surface. Nevertheless,
we still have to be able to express it in the form u = u(x, y). In other words,
we have to be able to read off the pair r, s in terms of the pair x, y from the first
two equations and introduce the result into the third equation. This elimination is
possible in principle if, and only if, the Jacobian determinant


J = det( x̃_r  x̃_s ; ỹ_r  ỹ_s ) = x̃_r ỹ_s − x̃_s ỹ_r (3.27)

does not vanish. Note that, by virtue of Eqs. (3.8) and (3.9), on the solution surface
x̃_s = a and ỹ_s = b, so that we can write the determinant as

J = b x̃_r − a ỹ_r. (3.28)

The problem has thus been completely solved, provided J ≠ 0. The vanishing of this
determinant will be later interpreted as the occurrence of a mathematical catastrophe.
Physically, the solution ceases to be uniquely defined and a shock wave is generated.
This situation does not develop if the PDE is linear.
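The invertibility condition (3.27)–(3.28) can be checked numerically. The following Python sketch (ours, not part of the text; the characteristic family x̃ = 3s + r, ỹ = s, corresponding to constant coefficients a = 3, b = 1, is a hypothetical example) approximates the Jacobian by central differences and compares it with formula (3.28).

```python
# Hypothetical characteristic family for a = 3, b = 1 (our own example):
# x~(r, s) = 3s + r,  y~(r, s) = s
def x_tilde(r, s): return 3.0 * s + r
def y_tilde(r, s): return s

def jacobian(r, s, h=1e-6):
    # Central-difference approximation of J = x~_r y~_s - x~_s y~_r, as in (3.27)
    xr = (x_tilde(r + h, s) - x_tilde(r - h, s)) / (2 * h)
    xs = (x_tilde(r, s + h) - x_tilde(r, s - h)) / (2 * h)
    yr = (y_tilde(r + h, s) - y_tilde(r - h, s)) / (2 * h)
    ys = (y_tilde(r, s + h) - y_tilde(r, s - h)) / (2 * h)
    return xr * ys - xs * yr

# By (3.28), with a = 3, b = 1, x~_r = 1, y~_r = 0, we expect J = 1, which
# does not vanish, so r and s can be recovered from x and y everywhere
assert abs(jacobian(0.5, 2.0) - 1.0) < 1e-6
```

For a quasi-linear equation the same check, applied over a grid of (r, s), would reveal the locus where J changes sign.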

3.3.4 What Else Can Go Wrong?

Suppose that the initial data curve (as a curve in the x, y, u space) happens to be
characteristic. In that case, when trying to build a one-parameter family of charac-
teristics, we find that we keep getting the same curve (namely, the initial data curve)
over and over again, so that a solution surface is not generated. This should not be
surprising. We already know, from Sect. 3.3.2, that characteristics are lines of inter-
section between solutions. Moreover, using the initial data curve (which is now a
characteristic) we can build an infinity of distinct one-parameter families of charac-
teristics to which it belongs. In other words, there are infinitely many solutions that satisfy
the initial data. A different way to express this situation (called the characteristic
initial value problem) is by saying that the PDE in this case does not provide extra
information to allow us to come out uniquely from the initial curve. To drive this
point further, we note that by providing differentiable data by means of Eqs. (3.15),
(3.16) and (3.17), we are also automatically prescribing the derivative of the desired
solution in the direction of the curve, namely, d ū/dr = c. On the other hand, by the
chain rule at any point of the initial curve and enforcing the PDE, we know that

dū/dr = u_x dx̄/dr + u_y dȳ/dr = u_x a + u_y b = c.        (3.29)
We conclude that the PDE cannot provide us with information in directions other
than characteristic ones. The initial data must remedy this situation by giving us
information about the derivative of u in another direction.
3.3 Building Solutions from Characteristics 61

If the initial data curve is not characteristic over its whole length but happens to
be tangential to a characteristic curve at one point, we have a situation that requires
special treatment. An extreme case is obtained when the initial curve is not charac-
teristic anywhere but is everywhere tangent to a characteristic curve. In this case, we
have an initial curve that is an envelope of characteristics. Again, this case requires
special treatment.
A more subtle situation arises when the initial data are self-contradictory. To see
the geometrical meaning of this situation, let P = (x₀, y₀, u₀) be a point on the initial
data curve ρ. From the theorem of existence and uniqueness of ODEs, we know that
there is a unique characteristic γ through P in some neighbourhood of the point.
Assume, moreover, that ρ and γ are not mutually tangent at P. We don’t seem to
have a problem. But suppose now that ρ and γ project on exactly the same curve in
the x, y plane. Since the tangent plane of the integral surface at P must contain both
the tangent to γ (because it is the local characteristic vector) and the tangent to ρ
(because the initial curve must belong to the solution), we obtain a vertical tangent
plane, which is not permitted if u is differentiable.

3.4 Particular Cases and Examples

3.4.1 Homogeneous Linear Equation

Consider the homogeneous linear equation

a(x, y) u_x + b(x, y) u_y = 0.                             (3.30)

By homogeneous we mean that the right-hand side (the term without derivatives)
is zero. It follows immediately from the characteristic equations (3.8), (3.9) and
(3.10) that the first two equations can be integrated separately from the third. What
this means is that we can now talk about characteristic curves in the x, y plane.
From Eq. (3.10), we see that the value of u on these “projected” characteristic curves
must be constant. In other words, the original characteristic curves are contained in
horizontal planes and they project nicely onto non-intersecting curves in the x, y
plane.

Example 3.1 (Advection equation) Find the solution of the following advection
equation with constant coefficients

u_t + 3u_x = 0,                                            (3.31)

with the initial condition


u(x, 0) = 1/(x² + 1).                                      (3.32)

Solution: The characteristic curves are given by the solutions of the system

dt/ds = 1,   dx/ds = 3,   du/ds = 0.                       (3.33)
This system is easily integrated to yield

t = s + A,   x = 3s + B,   u = C.                          (3.34)

The initial curve in this case lies on top of the x axis, so we can choose x itself as a
parameter. To preserve the notation of the general procedure, we write the equation
of the initial curve explicitly as

t = 0,   x = r,   u = 1/(r² + 1).                          (3.35)

Now is the time to enforce Eqs. (3.21), (3.22) and (3.23) to obtain

A = 0,   B = r,   C = 1/(r² + 1).                          (3.36)

Introducing this result into Eq. (3.34), we get

t = s,   x = 3s + r,   u = 1/(r² + 1).                     (3.37)

We still need to eliminate the pair r, s in favour of x, t, an operation which in this


case is trivial. We introduce the result into the last expression in (3.37) and obtain

u = 1/((x − 3t)² + 1).                                     (3.38)

This is the desired solution. It consists of a traveling wave of the same shape as the
initially prescribed profile. This is precisely the physical meaning of the advection
equation with constant coefficients. The wave travels forward with a speed of 3.
Remark 3.3 When producing an exact solution of a PDE, it is a good idea at the
end of the whole process to verify by direct substitution that the proposed solution
satisfies the given equation and the initial conditions.
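In the spirit of this remark, the verification can be scripted. The short Python check below (an illustration of ours, not part of the book) confirms by central differences that (3.38) satisfies the advection equation (3.31) and the initial condition (3.32).

```python
def u(x, t):
    # Solution (3.38) of the advection equation u_t + 3 u_x = 0
    return 1.0 / ((x - 3.0 * t) ** 2 + 1.0)

def residual(x, t, h=1e-5):
    # Central-difference approximation of u_t + 3 u_x
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    return u_t + 3.0 * u_x

# The PDE residual is negligible at sample points, and the initial
# profile (3.32) is recovered at t = 0
for x in (-2.0, 0.0, 1.5):
    for t in (0.0, 0.7, 2.0):
        assert abs(residual(x, t)) < 1e-7
assert abs(u(2.0, 0.0) - 1.0 / 5.0) < 1e-12
```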

3.4.2 Non-homogeneous Linear Equation

The general form of the non-homogeneous linear equation is

a(x, y) u_x + b(x, y) u_y = c(x, y) u + d(x, y).           (3.39)



The characteristic equations are

dx/ds = a(x, y),                                           (3.40)
dy/ds = b(x, y),                                           (3.41)
du/ds = c(x, y) u + d(x, y).                               (3.42)
Just as in the case of the homogeneous equation, we observe that the first two equa-
tions can be solved independently to yield a family of non-intersecting curves in the
x, y plane. The value of u on these lines, however, is no longer constant. Again, the
characteristic curves project nicely on the x, y plane, since the third equation can be
solved on the basis of the first two, curve by curve. Notice that from this point of
view, the linearity of the right-hand side doesn’t play a determining role. The method
of solution for the non-homogeneous equation follows in all respects the same lines
as the homogeneous one.

Example 3.2 (Propagation of discontinuities) A first-order PDE admits discontin-


uous initial data. The reason for this somewhat surprising fact is precisely that the
construction of the solution shows that the initial data are propagated along char-
acteristics. In the linear case, the projected characteristics do not interact with their
neighbours. Recall also that, as remarked in Sect. 3.3.4, the PDE prescribes deriva-
tives in the characteristic directions only, so that the fact that there is a discontinuity
in the transverse direction does not impair the verification of the validity of the
discontinuous solution. As an example, consider the linear PDE

u_t + 3u_x = −2ux,                                         (3.43)

with the initial condition


u(x, 0) = H[x],                                            (3.44)

where H [x] is the Heaviside (unit step) function.


Solution:
The initial curve is given in parametric form by

t = 0,   x = r,   u = H[r].                                (3.45)

The characteristic differential equations are

dt/ds = 1,   dx/ds = 3,   du/ds = −2ux.                    (3.46)

This system is easily integrated to

t = s + A,   x = 3s + B,   u = C e^(−3s² − 2Bs).           (3.47)

Setting s = 0 and equating to the initial curve equations, we obtain:

A = 0,   B = r,   C = H[r].                                (3.48)

Putting all these results together, just as in the previous example, we obtain the
solution as
u = H[x − 3t] e^(3t² − 2xt).                               (3.49)

We observe that the discontinuity propagates along the corresponding characteristic.


Nevertheless, since the characteristics in this case are not horizontal as before, the
magnitude of the jump changes progressively. In this case, on one side of the projected
characteristic (where x < 3t) the solution vanishes identically, whereas on the other
side it has the value e^(−3t²), which is thus the value of the jump. In this case, the
discontinuity gets attenuated as time progresses.
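On the smooth side x > 3t, where H[x − 3t] = 1, the validity of (3.49) can again be checked by direct substitution into (3.43); the Python sketch below (ours, not part of the book) does so by central differences.

```python
import math

def u_smooth(x, t):
    # Smooth branch of (3.49) on the side x > 3t, where H[x - 3t] = 1
    return math.exp(3.0 * t ** 2 - 2.0 * x * t)

def residual(x, t, h=1e-6):
    # Central-difference approximation of u_t + 3 u_x + 2 x u, which
    # must vanish for a solution of (3.43)
    u_t = (u_smooth(x, t + h) - u_smooth(x, t - h)) / (2 * h)
    u_x = (u_smooth(x + h, t) - u_smooth(x - h, t)) / (2 * h)
    return u_t + 3.0 * u_x + 2.0 * x * u_smooth(x, t)

# Check well away from the discontinuity x = 3t
for x, t in ((2.0, 0.3), (4.0, 1.0), (1.0, 0.1)):
    assert x > 3.0 * t
    assert abs(residual(x, t)) < 1e-6 * u_smooth(x, t)

# The jump across x = 3t has magnitude e^(-3 t^2), as found above
t = 1.0
assert abs(u_smooth(3.0 * t, t) - math.exp(-3.0 * t ** 2)) < 1e-12
```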

3.4.3 Quasi-linear Equation

So far, everything seems to be working smoothly. The problems start once one crosses
the threshold into the non-linear realm. For the time being, nonlinear for us means
quasi-linear, since we have not dealt with the genuinely nonlinear case. Geometrically
speaking, the reason for the abnormal behaviour that we are going to observe is that in
the case of a quasi-linear equation the characteristics (that live in the 3-dimensional
space x, y, u) do not project nicely onto the plane of the independent variables x, y.
From the point of view of the characteristic system of ODEs, this is the result of
the fact that the first two equations are coupled with the third, unlike the case of the
linear equation, where u was not present in the first two equations. As a consequence
of this coupling, for the same values of x, y, but for a different value of u, we
obtain, in general, characteristics that do not project onto the same curve. In other
words, the projected characteristics intersect and the projected picture is a veritable
mess. Figures 3.4 and 3.5 may help to understand the above statements. Therefore,
even smooth initial conditions given on a curve may lead to intersections of the projected
characteristics emerging from the initial curve. What this means is that at one and the
same point in the space of independent variables, we may end up having two different
solutions. When the independent variables are space and time, this situation is usually
described as the development of a shock after a finite lapse of time. This mathematical
catastrophe is accompanied by physical counterparts (sonic booms, for example).

[Figure: two panels in the x, y, u space. In a linear PDE such as a(x, y)u_x + b(x, y)u_y = 0, the characteristics above a point in the x, y plane project onto the same curve in this plane; in a quasi-linear PDE such as a(u)u_x + b(u)u_y = 0, the characteristics above a point may project onto different curves.]

Fig. 3.4 Characteristics of linear and quasi-linear equations

[Figure: projected characteristics in the x, y plane for a(x, y)u_x + b(x, y)u_y = c(x, y, u) and for a(u)u_x + b(u)u_y = 0.]

Fig. 3.5 Projected characteristics

Example 3.3 (Breaking of waves) Consider the quasi-linear PDE

u_t + u u_x = 0.                                           (3.50)

This equation is known as the inviscid Burgers equation. It has important applications
to gas dynamics.
The characteristic ODEs of Eq. (3.50) are

dt/ds = 1,   dx/ds = u,   du/ds = 0,                       (3.51)
which can be integrated to obtain the characteristic curves as

t = s + A,   x = C s + B,   u = C.                         (3.52)



Let us regard pictorially the solutions u = u(x, t) as waves which at any given
time τ have the geometric profile given by the function of the single variable x defined
as u = u^(τ)(x) = u(x, τ). This way of looking at the solution usually corresponds to
the physical interpretation and makes it easier to describe what is going on in words.
We will now consider initial data at time t = 0, that is, we will prescribe a certain
initial profile given by some function f (x), namely,

u(x, 0) = u^(0)(x) = f(x),                                 (3.53)

where f (x) is assumed to be smooth. We will investigate what happens thereafter


(for t > 0).
Consider first the easy case in which f (x) is a monotonically increasing function
of x, such as
f(x) = tanh(x).                                            (3.54)

The initial curve for our Cauchy problem is then given parametrically by

t = 0,   x = r,   u = tanh(r).                             (3.55)

Following the general procedure (equating the characteristic expressions for s = 0


with the corresponding equations for the initial curve), we obtain

A = 0,   B = r,   C = tanh(r).                             (3.56)

We have thus obtained the following one-parameter family of characteristics, where


the parameter of the family is r :

t = s,   x = s tanh(r) + r,   u = tanh(r).                 (3.57)

Recall that r represents the running coordinate in the initial curve (along the x axis).
The projection onto the plane x, t of these characteristics is given parametrically by
the first two expressions above. In this case, it is easy to eliminate the characteristic
parameter s and to write the family of projected characteristics as

x = t tanh(r) + r.                                         (3.58)

For each r , this is the equation of a straight line. The situation is represented in Fig. 3.6
(produced by Mathematica), which shows that the projected characteristics fan out,
as it were, and that they will never intersect for t > 0.
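That the lines (3.58) never cross for t > 0 can also be confirmed directly: dx/dr = 1 + t(1 − tanh² r) ≥ 1 whenever t ≥ 0, so for each fixed t the map r ↦ x is strictly increasing. A small Python check of ours:

```python
import math

def x_of(r, t):
    # Projected characteristic (3.58): x = t tanh(r) + r
    return t * math.tanh(r) + r

def dx_dr(r, t):
    # dx/dr = 1 + t (1 - tanh(r)^2), which is >= 1 for all t >= 0
    return 1.0 + t * (1.0 - math.tanh(r) ** 2)

# For each fixed t >= 0 the map r -> x is strictly increasing, so two
# distinct projected characteristics can never meet at the same (x, t)
for t in (0.0, 1.0, 5.0, 20.0):
    rs = [0.1 * i - 12.0 for i in range(241)]
    xs = [x_of(r, t) for r in rs]
    assert all(a < b for a, b in zip(xs, xs[1:]))
    assert all(dx_dr(r, t) >= 1.0 for r in rs)
```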
We now ask: What happens to the wave profile as time goes on? Since our solution
(if it exists) is given parametrically by Eq. (3.57), the profile for a fixed time τ > 0
is given parametrically by

x = τ tanh(r) + r,   u = tanh(r).                          (3.59)



Show[Table[ParametricPlot[{s*Tanh[0.1*i]+0.1*i,s},{s,0,2}, PlotRange->{{-1,1},{0,2}},

FrameLabel->{x,t}, Frame->True, PlotStyle->Black], {i,-9,9}]]

Fig. 3.6 Characteristics fanning out towards the future

Let us plot this profile separately for three different times τ = 1, 2, 3, as shown in
Fig. 3.7.
The wave profile tends to smear itself out over larger and larger spatial extents. We
can now plot the solution as a surface in a three-dimensional space with coordinates
x, t, u (for t ≥ 0), as shown in Fig. 3.8.
Consider now instead an initial profile with rising and falling parts, such as

f(x) = 1/(x² + 1).                                         (3.60)

GraphicsRow[Table[ParametricPlot[{i*Tanh[r]+r,Tanh[r]},{r,-7.5,7.5}, PlotRange->{{-7,7},{-1.2,1.2}},

AxesLabel->{x,u}, PlotStyle->Black, AspectRatio->0.8],{i,1,3,1}]]


Fig. 3.7 Wave profiles at times t = 1, 2, 3



ParametricPlot3D[{s,s*Tanh[r]+r,Tanh[r]},{s,0,10},{r,-15,15}, AxesLabel->{t,x,u},

Mesh->25, BoxRatios->{1,2,1}, PlotRange->{{0,10},{-15,15}}]

Fig. 3.8 Graph of the solution

The characteristics are obtained as


t = s,   x = s/(r² + 1) + r,   u = 1/(r² + 1).             (3.61)

For each value of r , the projected characteristic is the straight line

x = t/(r² + 1) + r,                                        (3.62)

but now the slope is always positive, with a pattern that fans out or in for positive or
negative r , respectively. Figure 3.9 shows the state of affairs.
From Eq. (3.61) we can read off for any t = τ the following parametric relation
between u and x:
x = τ/(r² + 1) + r,   u = 1/(r² + 1).                      (3.63)

Figure 3.10 shows the wave profiles for τ = 0, 0.7, 1.4, 2.1, 2.8, 3.5.

Show[Table[ParametricPlot[{s/(i^2+1)+i,s},{s,0,4},PlotRange->{{-3,4},{0,4}}, FrameLabel->{x,t},

Frame->True, PlotStyle->Black],{i,-3,4,0.2}]]

Fig. 3.9 Characteristics intersect forming an envelope with a cusp

We see that as two projected characteristics converge, the corresponding points


of the profile become closer and closer in projection (that is, more and more on top
of each other), so that the slope of the profile tends to infinity. Although there is
no problem drawing the parametric profiles, we see that, in fact, the solution ceases
to exist at that point in time in which the slope of the profile becomes infinite. We
have a catastrophe, like the breaking of a water wave. The reason why we say that
the solution ceases to exist is that we can no longer express u as a (single-valued)
function of x and t. Indeed, in order to be able to do so, we should be able, from the
first two equations in (3.61), namely,
t = s,   x = s/(r² + 1) + r,                               (3.64)

to read off r and s as functions of x and t. For this to be possible, we need the Jacobian
determinant
J = | ∂x/∂r   ∂x/∂s | = | 1 − 2rs/(r² + 1)²   1/(r² + 1) |
    | ∂t/∂r   ∂t/∂s |   |         0                1      |          (3.65)

not to vanish. The vanishing of J occurs when


(r² + 1)² − 2rs = 0.                                       (3.66)

Show[Table[ParametricPlot[{r+i/(r^2+1),1/(r^2+1)},{r,-10,10}, PlotRange->{{-7,7},{-0,1.2}},

AxesLabel->{x,u}, PlotStyle->Black, AspectRatio->0.8, PlotPoints->20],{i,0,3.5,0.7}]]

Fig. 3.10 Wave profiles at times t = 0, 0.7, 1.4, 2.1, 2.8, 3.5

Of all these combinations of r and s we are looking for the one that renders the
minimum value of t. In general, t is a function of both r and s, so that we have to
solve for the vanishing differential of this function under the constraint (3.66). In our
particular example, since t = s, this is a straightforward task. We obtain
dt = ds = d[(r² + 1)²/(2r)] = 0.                           (3.67)

Expanding this expression and choosing the smallest root we obtain the value of r
at the breaking point as

r_b = √3/3,                                                (3.68)
which yields the breaking time
t_b = s_b = (r_b² + 1)²/(2r_b) = 8√3/9 ≈ 1.54.             (3.69)
This value is corroborated by a look at the graph of the intersecting projected char-
acteristics. An equivalent reasoning to obtain the breaking time could have been the
following. Equation (3.63) provides us with the profile of the wave at time t = τ .

We are looking for the smallest value of τ that yields an infinite slope du/d x. This
slope is calculated as

du/dx = (du/dr)/(dx/dr) = 2r/(2rτ − (r² + 1)²).            (3.70)

For it to go to infinity, the denominator must vanish, whence


τ = (r² + 1)²/(2r).                                        (3.71)
The minimum time is obtained by equating dτ /dr to zero, thus yielding the same
result as Eq. (3.68).
Once we have calculated the breaking time tb we can determine the spatial location
xb of the shock initiation. It is obtained from Eq. (3.64) as
x_b = s_b/(r_b² + 1) + r_b = √3,                           (3.72)

where we have used Eqs. (3.68) and (3.69).
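These closed-form values are easy to corroborate numerically. The Python sketch below (ours, not part of the book) minimizes τ(r) = (r² + 1)²/(2r) of Eq. (3.71) on a grid and recovers r_b, t_b and x_b:

```python
import math

def tau(r):
    # Breaking-time candidate (3.71) along the characteristic labelled by r
    return (r ** 2 + 1.0) ** 2 / (2.0 * r)

# Crude grid minimization of tau over r > 0
rs = [0.01 * i for i in range(1, 20000)]
r_b = min(rs, key=tau)
t_b = tau(r_b)
x_b = t_b / (r_b ** 2 + 1.0) + r_b  # from (3.64) with s_b = t_b

# Compare with the closed-form values (3.68), (3.69) and (3.72)
assert abs(r_b - math.sqrt(3.0) / 3.0) < 0.01
assert abs(t_b - 8.0 * math.sqrt(3.0) / 9.0) < 1e-3
assert abs(x_b - math.sqrt(3.0)) < 0.05
```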


We have learned from this example that, in general, a quasi-linear first-order PDE
may develop breaking catastrophes of the type just described. The solution obtained
by integrating the characteristic equations and exhibiting a one-parameter family
thereof is not valid beyond the breaking time (which can, however, be calculated
by that very method). In physical applications, such as gas dynamics, this event is
interpreted as the birth of a shock wave. It is important to notice that the shock
develops out of perfectly smooth initial conditions. Anyone doubting the power of
mathematics to predict, at least in principle, real world events should take a close
look at this example.7 Physics, however, does not come to a halt simply because a
certain mathematical singularity has been detected. The phenomena continue their
inexorable path. In the next chapter we will investigate how the mathematical problem
can be resolved.

3.5 A Computer Program


The method of characteristics can be used efficiently for the numerical solution of
first-order PDEs. Indeed, all one needs to do is to program the initial value problem for
a system of first-order ODEs. In the MATLAB code shown in Box 3.3 an elementary
(though not very accurate) forward difference method is used. The accuracy can
be improved by replacing this integrator by the Runge–Kutta method. The user
must input the functions describing the coefficients of the PDE. The case illustrated
corresponds to the inviscid Burgers equation solved above. Figure 3.11 shows the
graph of the numerical solution produced by the code.

7 The classical reference work for this kind of problem is [2]. The title is suggestive of the importance

of the content.

Box 3.3 A simple computer code

function CharacteristicNumMethod
% The quasi-linear PDE aa*u_x + bb*u_t= cc is solved by the method
% of characteristics. Users must specify the functions aa, bb and cc.
% The solution is graphed in parametric form, so that shock formation
% can be discerned from the graph. The shock is not treated. Initial
% conditions are specified by function init at time t=0
N = 50; % Number of steps along the parameter s
M = 200; % Number of steps along the parameter r
ds = 0.1; % Parameter s step size
dr = 0.05; % Parameter r step size

solution = zeros(N,M,5);

for j = 1:M % Iteration over initial condition parameter 'r'


x0 = (j-M/2)*dr;
t0 = 0;
u0=init(x0);

for i=1:N % Iteration over curve continuation parameter 's'


solution(i,j,1) = x0;
solution(i,j,2) = t0;
solution(i,j,3) = u0;

% Forward differences
x00 = x0+aa(x0,t0,u0)*ds;
t00 = t0+bb(x0,t0,u0)*ds;
u00 = u0+cc(x0,t0,u0)*ds;
x0=x00;
t0=t00;
u0=u00;
end
end

% Plot solution
kc=0;
for i=1:N
for j=1:M
kc=kc+1;
XX(i,j)=solution(i,j,1);
TT(i,j)=solution(i,j,2);
UU(i,j)=solution(i,j,3);

end
end

figure(1);

S = surf(XX, TT, UU);


xlabel('x'); ylabel('t'); zlabel('u');
end

function aa = aa(x,t,u)
% Coefficient of u_x
aa = u;
end

function bb = bb(x,t,u)
% Coefficient of u_t
bb = 1;
end

function cc = cc(x,t,u)
% Right-hand side
cc =0;
end

function init=init(x)
init=1/(x^2+1);
end


Fig. 3.11 Numerical solution of the inviscid Burgers equation
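For readers who prefer Python, here is a compact analogue (ours, not from the book) of the forward-difference scheme of Box 3.3, specialized to the inviscid Burgers equation with the same initial profile:

```python
# Forward-difference integration of the characteristic ODEs, mirroring the
# MATLAB code of Box 3.3 for u_t + u u_x = 0 with u(x, 0) = 1/(x^2 + 1)
def aa(x, t, u): return u      # coefficient of u_x
def bb(x, t, u): return 1.0    # coefficient of u_t
def cc(x, t, u): return 0.0    # right-hand side
def init(x): return 1.0 / (x ** 2 + 1.0)

def solve_characteristics(N=50, M=200, ds=0.1, dr=0.05):
    # Each characteristic is integrated by forward differences and stored
    # as a list of (x, t, u) triples, one list per initial point r
    curves = []
    for j in range(M):
        x, t = (j - M / 2) * dr, 0.0
        u = init(x)
        pts = []
        for _ in range(N):
            pts.append((x, t, u))
            x, t, u = (x + aa(x, t, u) * ds,
                       t + bb(x, t, u) * ds,
                       u + cc(x, t, u) * ds)
        curves.append(pts)
    return curves

curves = solve_characteristics()
# For Burgers, du/ds = 0, so u stays exactly constant along each characteristic
for pts in curves:
    assert all(abs(u - pts[0][2]) < 1e-12 for (_, _, u) in pts)
```

As in the MATLAB version, accuracy could be improved by replacing the forward-difference update with a Runge–Kutta step.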

Exercises
Exercise 3.1 What conditions must the functions a, b, c in (3.5) satisfy for this
equation to be linear?
Exercise 3.2 Are the characteristics of a linear equation necessarily straight lines?
Are the characteristic ODEs of a linear PDE necessarily linear? Are the characteristic
ODEs of a quasi-linear PDE necessarily non-linear?
Exercise 3.3 ([8], p. 96) Solve the following initial-value problem for v = v(x, t):

v_t + c v_x = x t,      v(x, 0) = sin x,

where c is a constant.
Exercise 3.4 Find the solution of the equation

(x − y)u_x + (y − x − u)u_y = u

for a function u = u(x, y) passing through the line y = x, u = 1.


Exercise 3.5 Solve the differential equation

y ∂u/∂x + x ∂u/∂y = u                                      (3.73)

with the initial condition


u(x, 0) = x³.                                              (3.74)

Further questions: Is this a linear PDE? What are the projected characteristics of this
PDE? Is the origin an equilibrium position for the characteristic equations? Do the
initial conditions determine the solution uniquely for the whole of R2 ? What would
the domain of existence of the solution be if the initial conditions had been specified
on the line y = 1 instead of y = 0?

Exercise 3.6 Modify the MATLAB code of Box 3.3 to solve Exercise 3.5 numer-
ically. Apart from the obvious changes to the functions aa, bb, cc and init, you may
want to modify some of the numerical values of the parameters at the beginning
of the program controlling the range and precision of the solution. Comment on
quantitative and qualitative aspects of the comparison between the exact solution
obtained in Exercise 3.5 and the numerical solution provided by the program. Does
the numerical solution apply to the whole of R2 ?

Exercise 3.7 Find the breaking time of the solution of the inviscid Burgers equation
(3.50) when the initial condition is given by the sinusoidal wave u(x, 0) = sin x.
Where does the breaking occur?

Exercise 3.8 Consider the modified Burgers equation

u_t + u² u_x = 0.

(a) Show that the new function w = u² satisfies the usual Burgers equation. (b) On
the basis of this result find the solution of the initial value problem of the modified
Burgers equation with initial condition u(x, 0) = x. (c) What is the domain of validity
of this solution? (d) Is any simplification achieved by this change of variables?

References

1. Arnold VI (2004) Lectures on partial differential equations. Springer, Berlin


2. Courant R, Friedrichs KO (1948) Supersonic flow and shock waves. Interscience, New York
3. Courant R, Hilbert D (1962) Methods of mathematical physics, vol II. Interscience, Wiley, New
York
4. Duff GFD (1956) Partial differential equations. University of Toronto Press, Toronto
5. Garabedian PR (1964) Partial differential equations. Wiley, New York
6. John F (1982) Partial differential equations. Springer, Berlin
7. Sneddon IN (1957) Elements of partial differential equations. McGraw-Hill, Maidenhead. Republished by Dover 2006
8. Zauderer E (1998) Partial differential equations of applied mathematics, 2nd edn. Interscience,
Wiley, New York
Chapter 4
Shock Waves

When the projected characteristics of a quasi-linear first-order PDE intersect within


the domain of interest, the solution ceases to exist as a well-defined function. In the
case of PDEs derived from an integral balance equation, however, it is possible to
relax the requirement of continuity and obtain a single-valued solution that is smooth
on either side of a shock front of discontinuity and that still satisfies the global balance
law. The speed of propagation of this front is obtained as the ratio between the jump
of the flux and the jump of the solution across the shock.

4.1 The Way Out

The impasse exposed at the end of Chap. 3 can be summarized as follows: For some
quasi-linear PDEs with given initial conditions the solution provided by the method of
characteristics ceases to be single-valued. It may very well be the case that a reinter-
pretation of the PDE, or a modification of the dependent and independent variables,
can allow us to live with the multi-valued nature of the result. On the other hand, in
most applications, the function u = u(x, y) is the carrier of an intrinsic physical vari-
able, whose single-valued nature is of the essence.1 In this event, the engineer may
wish to check whether or not the model being used has been obtained by means of
excessive simplifications (for example, by neglecting higher-order derivatives). It is
remarkable, however, that a way out of the impasse can be found, without discarding
the original model, by allowing a generalized form of the equation and its solutions.
Indeed, these generalized solutions are only piecewise smooth. In simpler words,
these solutions are differentiable everywhere, except at a sub-domain of measure
zero (such as a line in the plane), where either the function itself or its derivatives

1 For this and other points in the theory of shocks in one-dimensional conservation laws, references

[2, 5] are recommended.


© Springer International Publishing AG 2017 75
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_4

experience discrete jumps. Moreover, a weaker interpretation of the conservation


law itself may be allowed. By weak we mean an integral form of the balance laws
that admits solutions with a lower degree of differentiability than we do in the dif-
ferential version of the equations. These generalized and weak solutions, carrying
discontinuities born at the breaking time and developing into the future, are what we
call shocks.

4.2 Generalized Solutions

We recall that, in practice, PDEs more often than not appear in applications as the
result of the statement of an equation of balance. In Chap. 2, Eq. (2.14), we learned
that the preliminary step towards obtaining the balance PDE for the case of one
spatial variable x had the integral form
d/dt ∫_{x₁}^{x₂} u(x, t) dx = ∫_{x₁}^{x₂} p(x, t) dx + f(x₂, t) − f(x₁, t),      (4.1)

where x1 , x2 are arbitrary limits with x1 < x2 . There is no reason, therefore, to discard
this integral version in favour of the differential counterpart, since in order to obtain
the latter we needed to assume differentiability of u = u(x, t) over the whole domain
of interest. This domain will be divided into two parts,2 namely, the strip 0 ≤ t ≤ tb
and the half-plane t > tb . The upper part, moreover, will be assumed to be subdivided
into two regions, D− and D+ , in each of which the solution, u − and u + , is perfectly
smooth. These two regions are separated by a smooth curve with equation

x = xs (t) (4.2)

along which discontinuities in u and/or its derivatives may occur. This curve, illus-
trated in Fig. 4.1, is, therefore, the shock front (the carrier of the discontinuity). The
reason that we are willing to accept this non-parametric expression of the shock
curve is that, from the physical point of view, the derivative d xs /dt represents the
instantaneous speed of the shock, which we assume to be finite at all times. The
shock curve, clearly, passes through the original breaking point, with coordinates
(xb , tb ).
We evaluate the integral on the left-hand side of Eq. (4.1) for times beyond tb (and
for a spatial interval (a, c) containing the shock) as

d/dt ∫_a^c u(x, t) dx = d/dt ∫_a^{x_s(t)} u(x, t) dx + d/dt ∫_{x_s(t)}^c u(x, t) dx      (4.3)

2 For the sake of simplicity, we are assuming that no other shocks develop after the breaking time
tb that we have calculated.

[Figure: the x, t plane, with the shock curve x_s = x_s(t) issuing from the breaking point (x_b, t_b) and separating the regions D⁻ and D⁺; the interval (a, c) straddles the shock.]

Fig. 4.1 The shock front

Because the limits of these integrals depend on the variable with respect to which
we are taking derivatives (t), we can no longer simply exchange the derivative with
the integral. Either by doing the derivation yourself, or by consulting a Calculus
textbook, you can convince yourself of the following formula3

d/dt ∫_{f(t)}^{g(t)} u(x, t) dx = ∫_{f(t)}^{g(t)} ∂u(x, t)/∂t dx + u(g(t), t) dg(t)/dt − u(f(t), t) df(t)/dt.      (4.4)
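Formula (4.4) is easy to test numerically. The Python sketch below (with the hypothetical choices u(x, t) = xt, f(t) = t, g(t) = 2t, which are ours and not from the book) compares both sides using a midpoint quadrature and central differences; for this choice both sides equal 9t²/2.

```python
# Numerical sanity check of formula (4.4) with the hypothetical choices
# u(x, t) = x t, f(t) = t, g(t) = 2t
def u(x, t): return x * t
def f(t): return t
def g(t): return 2.0 * t

def integral(t, n=2000):
    # Midpoint rule for the integral of u(., t) from f(t) to g(t)
    a, b = f(t), g(t)
    h = (b - a) / n
    return sum(u(a + (i + 0.5) * h, t) for i in range(n)) * h

def lhs(t, h=1e-5):
    # Left-hand side of (4.4): d/dt of the moving-limit integral
    return (integral(t + h) - integral(t - h)) / (2 * h)

def rhs(t, h=1e-5, n=2000):
    # Right-hand side of (4.4): integral of u_t plus the boundary terms
    a, b = f(t), g(t)
    step = (b - a) / n
    mid = [a + (i + 0.5) * step for i in range(n)]
    du_dt = sum((u(x, t + h) - u(x, t - h)) / (2 * h) for x in mid) * step
    dg = (g(t + h) - g(t - h)) / (2 * h)
    df = (f(t + h) - f(t - h)) / (2 * h)
    return du_dt + u(g(t), t) * dg - u(f(t), t) * df

t0 = 1.3
assert abs(lhs(t0) - rhs(t0)) < 1e-4
# For this u, f and g both sides equal 9 t^2 / 2
assert abs(lhs(t0) - 4.5 * t0 ** 2) < 1e-4
```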

Introducing this formula into Eq. (4.3) and the result into (4.1), we obtain

∫_a^{x_s(t)} ∂u/∂t dx + u(x_s⁻, t) dx_s/dt + ∫_{x_s(t)}^c ∂u/∂t dx − u(x_s⁺, t) dx_s/dt = f(c, t) − f(a, t),      (4.5)
where the superscripts “−” and “+” indicate whether we are evaluating the (possibly
discontinuous) solution immediately to the left or to the right of the shock curve,
respectively. In Eq. (4.5) we have assumed that there is no production, since its
presence would not otherwise affect the final result. We now let the limits of the
original integral, a and c, approach x_s⁻ and x_s⁺, respectively, and obtain the speed of
the shock as

3 The generalization of this formula to three dimensions is known as Reynolds' transport theorem,


with which you may be familiar from a course in Continuum Mechanics.

dx_s/dt = [f(x_s⁺, t) − f(x_s⁻, t)] / [u(x_s⁻, t) − u(x_s⁺, t)].      (4.6)

This result is known as the Rankine–Hugoniot jump condition.4 It establishes that


for a discontinuous (shock) solution to exist, the shock must propagate at a speed
equal to the ratio between the jump of the flux and the jump of the solution across
the shock curve. This extra condition allows, in general, the solution to be extended
beyond the time of shock formation.
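A minimal Python sketch of condition (4.6) (the function names are our own): given the flux and the two limiting values of the solution, it returns the shock speed; with the Burgers flux f(u) = −u²/2 of Eq. (4.8) below, it reproduces the average of the two states.

```python
# Shock speed from the Rankine-Hugoniot jump condition (4.6); the function
# names and the sample Burgers flux are illustrative choices of ours
def shock_speed(flux, u_minus, u_plus):
    # dx_s/dt = (f(u+) - f(u-)) / (u- - u+)
    return (flux(u_plus) - flux(u_minus)) / (u_minus - u_plus)

def burgers_flux(u):
    # Flux of the inviscid Burgers equation with this book's sign
    # convention, f(u) = -u^2 / 2 (see Eq. (4.8))
    return -0.5 * u ** 2

# For Burgers the shock moves at the average of the two states
s = shock_speed(burgers_flux, 1.0, 0.0)
print(s)  # 0.5 = (1.0 + 0.0)/2
```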

4.3 A Detailed Example

Many textbooks present examples of propagation of shocks in which the discontinuity


is already imposed in the initial conditions. Moreover, the initial profile is very simple.
A typical example, discussed in Sect. 4.4 below, imposes the step function u(x, 0) =
1 − H [x], so that the characteristics consist of two sets of straight parallel lines. The
reason behind these idealized examples can be found in the inherent difficulty implied
by the integration of the Rankine–Hugoniot condition. Let us consider, however, the
case of the PDE (3.50) with the initial condition (3.60), with which we are familiar
already, namely
u_t + u u_x = 0,   u(x, 0) = 1/(x² + 1).                   (4.7)

We have calculated the breaking time tb for this problem. We are now in a position
to do more than this, namely, to determine the whole zone whereby the solution
is multi-valued. This information is obtained by implementing the zero-determinant
condition (3.66) in Eq. (3.64), which yields the bounding line of this zone in parametric
form. The plot of this boundary is shown in Fig. 4.2.
It should not be surprising that this line contains a cusp (at time tb ), since it is
obtained as an envelope of characteristic lines which, as shown in Fig. 3.9, turn first
one way and then the other. The shock curve starts at this cusp point and develops
within the zone enclosed by the two branches according to the Rankine–Hugoniot
condition. Notice that at each point within this domain, there are three values pro-
duced by the characteristic method. The middle value is irrelevant, since the solution
must be smooth on either side of the shock curve, and the Rankine–Hugoniot condi-
tion must be enforced between the highest and the lowest value. What this means is
that, within the domain of multiplicity, we have a well-defined and smooth right-hand
side of the Rankine–Hugoniot condition, which can then be regarded as an ODE for
the determination of the shock curve. The only problem is the explicit calculation of
the highest and lowest values of the solution.

⁴ Naturally, because of the sign convention we used for the flux, the formula found in most books changes the sign of the right-hand side.

ParametricPlot[{(r^2+1)/(2*r)+r,(r^2+1)^2/(2*r)},{r,0,2}, AxesLabel->{x,t}, PlotRange->{{-1,5},{0,5}}, PlotStyle->Black]

Fig. 4.2 Boundary of the domain of multiplicity
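This boundary can also be checked numerically. Writing the parametric form used in the plot above as x(r) = (r² + 1)/(2r) + r, t(r) = (r² + 1)²/(2r), the earliest point of the curve should reproduce the breaking point (x_b, t_b) = (√3, 8√3/9) computed in Chap. 3. A short Python sketch (ours, not part of the book):

```python
import math

def boundary(r):
    """Parametric boundary of the multivalued zone (same curve as the plot above)."""
    x = (r * r + 1.0) / (2.0 * r) + r
    t = (r * r + 1.0) ** 2 / (2.0 * r)
    return x, t

# Scan the parameter and locate the earliest time on the boundary (the cusp).
rs = [0.01 * k for k in range(1, 200)]            # r in (0, 2)
r_cusp = min(rs, key=lambda r: boundary(r)[1])
x_cusp, t_cusp = boundary(r_cusp)
```

The scan locates the cusp at r ≈ 1/√3 ≈ 0.577, with t_cusp ≈ 1.5396 (= 8√3/9) and x_cusp ≈ 1.7321 (= √3).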

In our particular case, it is not difficult to see, by looking at Eqs. (2.24) and (3.50),
that the flux function is given by

\[
f(u) = -\tfrac{1}{2}\,u^2. \tag{4.8}
\]
Introducing this result into Eq. (4.6), we obtain

\[
\frac{dx_s}{dt} = \frac{(u^+)^2 - (u^-)^2}{2\,(u^+ - u^-)} = \frac{1}{2}\left(u^+ + u^-\right). \tag{4.9}
\]

This means that the shock curve negotiates its trajectory in such a way as to have
a slope equal to the average of the solutions coming from either side. This can be
enforced either analytically (if this average is easily available) or numerically (if it
isn’t). Figure 4.3 illustrates what average we are talking about, by showing a typical
profile of the multi-valued solution for some t > tb .
So, for each instant of time t > tb , moving the vertical line between its two extreme
points of tangency (which project onto the boundary of the multi-valued zone),
we obtain a smooth function for the right-hand side of Eq. (4.9). In our particular
example, to obtain the values u + and u − analytically we need to solve for the highest
and lowest roots of a cubic equation. Indeed, according to Eq. (3.63), the profile at
time τ is given parametrically by

Fig. 4.3 The shock chooses the highest and lowest values at a location satisfying the R-H condition

\[
x = \frac{\tau}{r^2 + 1} + r, \qquad u = \frac{1}{r^2 + 1}. \tag{4.10}
\]

For any given value of x, we obtain

\[
r^3 - x\,r^2 + r + (\tau - x) = 0. \tag{4.11}
\]

In the multi-valued zone, this cubic equation will have three real roots, which, when
substituted into the second Eq. (4.10), provide three values of u (i.e., the three
intersections of the vertical line through x with the wave profile). The Rankine–
Hugoniot condition requires the average between the highest and the lowest of these
values to determine the slope of the shock curve dividing the two regions where the
solution is smooth.
It is interesting to remark that the foregoing picture is an example of a mathemati-
cal catastrophe, namely, a case in which a perfectly smooth manifold (the parametric
surface provided by the method of characteristics, as given by Eq. (3.61)) has singu-
larities in terms of its projection on a plane (because it turns around in such a way that
there are vertical tangent planes). The theory of catastrophes, as developed among
others by the French mathematician René Thom, was very popular some years ago
to explain almost everything under the sun, from the crash of the stock market to the
behaviour of a threatened dog.5 A plot of our equations

\[
t = s, \qquad x = \frac{s}{r^2 + 1} + r, \qquad u = \frac{1}{r^2 + 1} \tag{4.12}
\]

is shown in Fig. 4.4. Note that this graph is identical to the one in Fig. 3.11 produced
numerically by the code of Box 3.3.
For the sake of the illustration, we solve our problem numerically. We remark,
however, that the domain of the Rankine–Hugoniot differential equation has a bound-

⁵ See [1, 4].



Fig. 4.4 Plot of the multi-valued (parametric) solution surface

ary containing a cusp at the initial point. Thus, the theorem of uniqueness and its
numerical implications may not be readily enforceable. Given a cubic equation such
as (4.11), one of the roots is always real (since complex roots appear in conjugate
pairs) and, when the other two roots happen to be real (in other words, in the multi-
valued zone of interest), we need to select the two roots corresponding to the highest
and lowest values of u. Choosing these two roots, we solve and plot the solution, as
shown in Figs. 4.5 and 4.6.
Figure 4.7 shows the shock solution as a ‘chopped’ version of the multi-valued
solution obtained before. The curved vertical wall projects on the shock curve
depicted in Fig. 4.6.

(* Roots by Cardano-Tartaglia *)

rr1=x[t]/3-(2^(1/3) (3-x[t]^2))/(3 (-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-36 t x[t]+8 x[t]^2-4 t
x[t]^3+4 x[t]^4])^(1/3))+(-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-36 t x[t]+8 x[t]^2-4 t x[t]^3+4
x[t]^4])^(1/3)/(3 2^(1/3))

rr2=x[t]/3+((1+I Sqrt[3]) (3-x[t]^2))/(3 2^(2/3) (-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-36 t
x[t]+8 x[t]^2-4 t x[t]^3+4 x[t]^4])^(1/3))-((1-I Sqrt[3]) (-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-
36 t x[t]+8 x[t]^2-4 t x[t]^3+4 x[t]^4])^(1/3))/(6 2^(1/3))

rr3=x[t]/3+((1-I Sqrt[3]) (3-x[t]^2))/(3 2^(2/3) (-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-36 t
x[t]+8 x[t]^2-4 t x[t]^3+4 x[t]^4])^(1/3))-((1+I Sqrt[3]) (-27 t+18 x[t]+2 x[t]^3+3 Sqrt[3] Sqrt[4+27 t^2-
36 t x[t]+8 x[t]^2-4 t x[t]^3+4 x[t]^4])^(1/3))/(6 2^(1/3))

(* Choose highest and lowest *)

maxr=Max[Abs[rr1],Abs[rr2],Abs[rr3]]

minr=Min[Abs[rr1],Abs[rr2],Abs[rr3]]

u1=1/(maxr^2+1)

u2=1/(minr^2+1)

tb=8*Sqrt[3]/9

xbb=Sqrt[3]

(* Solve Rankine-Hugoniot differential equation *)

NDSolve[{x'[t]==0.5*(u1+u2),x[tb]==xbb},x[t],{t,tb,6*tb}]

plot1=ParametricPlot[{{Evaluate[x[t]/.%],t},{(r^2+1)/(2*r)+r,(r^2+1)^2/(2*r)}},{t,tb,6*tb},{r,0.001,3}, PlotRange->{{0,5},{0,5}},AxesLabel->{x,t}, AspectRatio->0.75]

Fig. 4.5 Mathematica code to solve the Rankine–Hugoniot ODE
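For readers without Mathematica, the same computation can be sketched in plain Python: the cubic (4.11) is solved in closed form by the trigonometric version of Cardano's method (valid inside the multivalued zone, where all three roots are real), the extreme branches are averaged as in Eq. (4.9), and the Rankine–Hugoniot ODE is integrated by forward Euler from the cusp (x_b, t_b) = (√3, 8√3/9). Function names, the step size, and the tolerances are our choices, not the book's.

```python
import math

def cubic_real_roots(a, b, c):
    """Real roots of r^3 + a*r^2 + b*r + c = 0 by the trigonometric method.
    Accurate when all three roots are real (the multivalued zone); at the
    cusp the roots coalesce into the triple root r = -a/3."""
    p = b - a * a / 3.0                            # depressed-cubic coefficients
    q = 2.0 * a ** 3 / 27.0 - a * b / 3.0 + c
    if p > -1e-12:                                 # (near-)triple root
        return [-a / 3.0] * 3
    m = 2.0 * math.sqrt(-p / 3.0)
    arg = max(-1.0, min(1.0, 3.0 * q / (p * m)))
    theta = math.acos(arg) / 3.0
    return [m * math.cos(theta - 2.0 * math.pi * k / 3.0) - a / 3.0
            for k in range(3)]

def shock_speed(x, t):
    """Right-hand side of Eq. (4.9): average of the extreme branches of u."""
    roots = cubic_real_roots(-x, 1.0, t - x)       # Eq. (4.11)
    us = [1.0 / (r * r + 1.0) for r in roots]      # Eq. (4.10)
    return 0.5 * (max(us) + min(us))

tb = 8.0 * math.sqrt(3.0) / 9.0    # breaking time
x = math.sqrt(3.0)                 # cusp abscissa x_b
t, dt = tb, 1.0e-3
shock = [(t, x)]
while t < 6.0 * tb:                # forward-Euler integration of dx_s/dt
    x += dt * shock_speed(x, t)
    t += dt
    shock.append((t, x))
```

At the cusp the three roots coalesce at r = 1/√3, so the shock starts with speed ½(¾ + ¾) = ¾, matching the initial condition of the NDSolve call above.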

4.4 Discontinuous Initial Conditions

4.4.1 Shock Waves

The merit of the example solved in the previous section is that it clearly shows how a
shock wave (an essentially discontinuous phenomenon) can develop in a finite time
out of perfectly smooth initial conditions. If, on the other hand, the initial conditions
are themselves discontinuous we may obtain the immediate formation and subsequent
propagation of a shock wave. A different phenomenon may also occur, as we will
discuss in the next section. Discontinuous initial conditions occur as an idealized

Fig. 4.6 The shock front as a solution of the Rankine–Hugoniot ODE

Fig. 4.7 The shock solution (left) as a chopped version of the multi-valued solution (right)

limit of a steep initial profile, such as that corresponding to the sudden opening of a
valve.
Consider again the inviscid Burgers equation (3.50), but with the discontinuous
initial conditions

\[
u(x, 0) = \begin{cases} u^- & \text{for } x \le 0 \\ u^+ & \text{for } x > 0 \end{cases} \tag{4.13}
\]

where u⁻ and u⁺ are constants with u⁻ > u⁺. These are known as Riemann initial
conditions. The projected characteristics for this quasi-linear problem are depicted
in Fig. 4.8.
We observe that for all t > 0 there is a region of intersection of two characteristics,
shown shaded in Fig. 4.8. Recalling that for this equation each characteristic carries
the constant value u − or u + , we conclude that a shock solution is called for in that


Fig. 4.8 Projected characteristics (right) for Riemann conditions (left) with u − > u +


Fig. 4.9 Solution of the Riemann problem exhibiting a shock

region. Invoking the Rankine–Hugoniot condition (4.6) and the flux function (4.8)
for the inviscid Burgers equation, we obtain that the shock velocity is constant and
given by

\[
\frac{dx_s}{dt} = \frac{1}{2}\left(u^- + u^+\right). \tag{4.14}
\]

The complete solution is schematically represented in Fig. 4.9.
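The constant propagation speed (4.14) can also be observed in a direct simulation. The following sketch (our illustration, not the book's code) applies a first-order Godunov scheme to the Riemann data u⁻ = 1, u⁺ = 0, using the standard flux convention f(u) = u²/2; the book's sign convention for the flux (footnote 4) flips f, but the resulting speed ½(u⁻ + u⁺) = ½ is the same.

```python
def godunov_flux(ul, ur):
    """Exact Riemann (Godunov) flux for u_t + (u^2/2)_x = 0."""
    if ul <= ur:                        # rarefaction-type data
        if ul > 0.0:
            return 0.5 * ul * ul
        if ur < 0.0:
            return 0.5 * ur * ur
        return 0.0                      # sonic point inside the fan
    s = 0.5 * (ul + ur)                 # shock: Rankine-Hugoniot speed
    return 0.5 * ul * ul if s > 0.0 else 0.5 * ur * ur

nx, xl, xr = 300, -1.0, 2.0
dx = (xr - xl) / nx
xc = [xl + (i + 0.5) * dx for i in range(nx)]       # cell centres
u = [1.0 if x <= 0.0 else 0.0 for x in xc]          # Riemann data, u- > u+
t_final, dt, t = 1.0, 0.4 * dx, 0.0                 # CFL number 0.4
while t < t_final:
    F = [godunov_flux(u[i], u[i + 1]) for i in range(nx - 1)]
    for i in range(1, nx - 1):                      # conservative update
        u[i] -= dt / dx * (F[i] - F[i - 1])
    t += dt

# locate the discontinuity: first cell where u drops below 1/2
x_shock = next(x for x, v in zip(xc, u) if v < 0.5)
```

After integrating to t = 1 the jump sits near x = 0.5, i.e. it has travelled at the Rankine–Hugoniot speed ½.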

4.4.2 Rarefaction Waves

Even in the case of linear PDEs it is possible to have a situation whereby the (pro-
jected) characteristics intersecting the initial manifold do not cover the whole plane.
An example is the equation yu x + xu y = u, for which the characteristics are equi-
lateral hyperbolas. In the case of quasi-linear equations, the projected characteristics
depend, of course, on the values of u on the initial manifold, so that the situation just
described may be determined by the initial values of the unknown function. Consider
again the inviscid Burgers equation with the following Riemann initial conditions

\[
u(x, 0) = \begin{cases} u^- & \text{for } x \le 0 \\ u^+ & \text{for } x > 0 \end{cases} \tag{4.15}
\]

where u⁻ and u⁺ are constants with u⁻ < u⁺. This is the mirror image of the problem just solved, where the fact that u⁻ was larger than u⁺ led to the immediate formation of a shock. The counterpart of Fig. 4.8 is shown in Fig. 4.10. The shaded region is
not covered by any characteristic emanating from the initial manifold t = 0. Since in
physical reality the function u does attain certain values in the shaded region, we need
to extend the solution. A possible way to do so is to postulate a constant value of the
solution in that region. This device, however, would introduce in general two shocks.
It can be verified that no value of the constant would satisfy the Rankine–Hugoniot
conditions.6 A clue as to what needs to be done can be gathered by imagining that the
jump in the initial conditions has been replaced by a smooth, albeit steep, transition.
Correspondingly, the projected characteristics would now cover the whole half-plane
t > 0 and would gradually join the existing characteristics in a fan-like manner.
Moreover, on each of the new characteristic lines the value of u is constant. In the
limit, as the steepness of the transition tends to infinity, the fan of characteristics
emanates from the origin, as suggested in Fig. 4.11. To determine the value of u on
each of these new characteristic lines, we start by noticing that the function u on
the shaded region would have to be of the form u = f (x/t), so as to preserve the
constancy on each characteristic in the fan. Introducing the function f into the PDE,


Fig. 4.10 Projected characteristics (right) for Riemann conditions (left) with u − < u +

6 For the possibility of extending the values u − and u + into the shaded region and producing a
legitimate shock, see Box 4.1.
86 4 Shock Waves

Fig. 4.11 The characteristic fan

we obtain

\[
0 = u_t + u\,u_x = -f'\,\frac{x}{t^2} + \frac{f f'}{t} = -\frac{f'}{t}\left(\frac{x}{t} - f\right), \tag{4.16}
\]
where primes indicate derivatives of f with respect to its argument. Since we have
already discarded the constant solutions (f′ = 0), we are left with
\[
u = \frac{x}{t}. \tag{4.17}
\]
In other words, for each time t > 0 the new solution provides a linear interpolation
between the values u − and u + as shown in Fig. 4.12. This solution is continuous
and differentiable everywhere except on the two extreme characteristics. It is called
a rarefaction wave, since it corresponds to a softening of the initial conditions as
time goes on. In applications to gas dynamics, a rarefaction wave is a wave of
decompression.
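The full entropy solution for u⁻ < u⁺ is thus a three-branch formula: u⁻ to the left of the fan, x/t inside it, u⁺ to the right. A minimal Python transcription (the function name is ours):

```python
def riemann_rarefaction(x, t, um, up):
    """Entropy solution of u_t + u u_x = 0 for Riemann data um < up, t > 0:
    constant states outside the fan, u = x/t inside it (Eq. (4.17))."""
    assert um < up and t > 0.0
    if x <= um * t:
        return um
    if x >= up * t:
        return up
    return x / t
```

The solution is continuous across the fan edges x = u⁻t and x = u⁺t, where the branches of the formula agree, and differentiable everywhere except on those two extreme characteristics.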


Fig. 4.12 Solution containing a rarefaction wave



Box 4.1 The entropy condition


We have discarded the imposition of a constant solution within the shaded area of Fig. 4.10 in favour
of the linearly interpolated condition on the grounds that the accompanying shocks would have vio-
lated the Rankine-Hugoniot condition. It is possible, however, to propose other solutions containing
a shock that do not violate the Rankine-Hugoniot condition (see Exercise 4.3). In particular, it is
possible to extend the solutions u⁻ and u⁺ into the shaded area and have them meet at a shock line that satisfies the Rankine–Hugoniot condition. The continuous solution is to be preferred over
the solutions of this kind because it satisfies the following, physically based, entropy condition [3]:
“The characteristics starting [from the initial manifold t = 0] on either side of the discontinuity
curve when continued in the direction of increasing t intersect the line of discontinuity." Or: “Every
point can be connected by a backward drawn characteristic to a point in the initial line."
[Sketch in the (x, t)-plane: a “Good” shock, into which the characteristics run, and a “Bad” shock, from which they emanate]

Exercises

Exercise 4.1 ([2], p. 153) Write the Rankine–Hugoniot condition for the traffic
equation (2.30). Assume that some cars have already stopped at a red traffic light
and that they are packed at the maximum density. Assume, moreover, that the cars
approaching the end of the stopped line are traveling at a constant speed v0 < vmax .
Find the speed at which the tail of the stopped traffic backs up as more and more cars
join in.

Exercise 4.2 The Burgers equation for a dust: Imagine a one-dimensional flow
of non-interacting particles with no external forces. Show that this phenomenon is
described exactly by the inviscid Burgers equation (3.50). [Hint: interpret u(x, t) as
the velocity field expressed in a fixed inertial coordinate system and notice that the
particle velocities remain constant.] Notice that if all the particles are moving, say,
to the right and if the initial conditions are such that the velocity profile increases to
the right, then there is no danger that any particles will catch up with other particles.
On the other hand, if any part of the initial velocity profile has a decreasing pattern,
some particles will eventually catch up with those to their right and a ‘snowballing
effect’ (that is, a shock) will occur. The Rankine–Hugoniot condition is more difficult
to interpret intuitively, but it may be useful to try. [Hint: imagine identical particles
equally spaced moving to the right at constant speed and encountering identical
particles at rest. Assume perfectly plastic multiple collisions and use induction.]

Exercise 4.3 Show that for the initial conditions (4.15) a mathematical alternative
to the rarefaction wave is the following shock solution
\[
u = \begin{cases} u^- & \text{for } x \le \frac{1}{2}\left(u^- + u^+\right)t \\[4pt] u^+ & \text{for } x > \frac{1}{2}\left(u^- + u^+\right)t \end{cases}
\]

Specifically, verify that this solution satisfies the Rankine–Hugoniot condition. Does
it satisfy the entropy condition?

References

1. Arnold VI (2003) Catastrophe theory. Springer, Heidelberg
2. Knobel R (2000) An introduction to the mathematical theory of waves. American Mathematical
Society, Providence
3. Lax PD (1973) Hyperbolic systems of conservation laws and the mathematical theory of shock
waves. Regional conference series in applied mathematics, vol 11. SIAM, Philadelphia
4. Poston T, Stewart I (2012) Catastrophe theory and its applications. Dover, New York
5. Zauderer E (1998) Partial differential equations of applied mathematics, 2nd edn. Wiley-
Interscience, New York
Chapter 5
The Genuinely Nonlinear First-Order
Equation

The general non-linear first-order partial differential equation requires a deeper analysis than its linear and quasi-linear counterparts. Instead of a field of characteristic directions, the non-linear equation delivers a one-parameter family of directions
at each point of the space of dependent and independent variables. These directions
subtend a local cone-like surface known as the Monge cone. Augmenting the under-
lying space by adding coordinates representing the first partial derivatives of the
unknown field, however, it is possible to recover most of the features of the quasi-
linear case so that, ultimately, even the solution of the general non-linear equation
can be reduced to the integration of a system of ordinary differential equations.

5.1 Introduction

From the treatment of the previous chapters, it is quite clear that quasi-linear equations
can be characterized geometrically in a manner not very different from that of linear
equations. It is true that the behaviour of quasi-linear equations is richer in content
due to the fact that projected characteristics may intersect and thus give rise to the
appearance of shocks. Nevertheless, the basic interpretation of the first-order PDE
as a field of directions and the picture of a solution as a surface fitting this field are
the same, whether the equation happens to be linear or quasi linear. In a genuinely
nonlinear first-order PDE, on the other hand, these basic geometric ideas are lost
and must be replaced by somewhat more general counterparts. Remarkably, in spite
of the initially intimidating nature of the problem, the nature of the final result is
analogous to that of the linear and quasi linear cases. Namely, the construction of a
solution of a genuinely nonlinear PDE turns out to be based entirely upon the solution
of a system of ODEs.
Nonlinear first-order PDEs appear in a variety of applications, most notably in
the characterization of wave fronts arising from a system of linear second-order
© Springer International Publishing AG 2017
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_5

PDEs describing wave phenomena. Another important application of the theory is encountered in the Hamilton–Jacobi formulation and its applications to Mechanics.
In this chapter, we will start with an exposition of the basic concepts pertaining to
the analysis of a single nonlinear first-order PDE in two independent variables and
then extend the analysis to more general situations.

5.2 The Monge Cone Field

The general first-order PDE in two variables is of the form

F(x, y, u, u x , u y ) = 0. (5.1)

Here, F is a differentiable function of its 5 arguments. We seek a solution, that is, a function u = u(x, y) satisfying Eq. (5.1). Such a solution can be regarded as a surface
in the three-dimensional space of (Cartesian) coordinates x, y, u, just as in the quasi
linear case. Assuming this solution to be differentiable (as we must, unless we are
talking about generalized solutions), this surface will have a well-defined tangent
plane at each point of its domain. The equation of the tangent plane at a generic point
(x0 , y0 , u 0 ) lying on this surface is given by

u − u 0 = u x (x − x0 ) + u y (y − y0 ). (5.2)

It is understood that in this equation the derivatives u x and u y are evaluated at the
point (x0 , y0 ) and that u 0 = u(x0 , y0 ), since the point of tangency must belong to the
surface. The equation of the tangent plane we have just written is a direct consequence
of the very definition of partial derivatives of a function of two variables. The vector
with components {u x , u y , −1} is perpendicular to the surface at the point of tangency.
What our PDE (5.1) tells us is that the two slopes of the tangent plane of any
putative solution surface are not independent of each other, but are rather interrelated
by the point-wise algebraic condition (5.1), a fact which, in principle, allows us to
obtain one slope when the other one is given.
To get a pictorial idea of what this linkage between the two slopes means, let us
revisit the linear or quasi linear case, namely,

a(x0 , y0 , u 0 ) u x + b(x0 , y0 , u 0 ) u y = c(x0 , y0 , u 0 ). (5.3)

Since the normal vector to any solution surface at a point has components proportional
to {u x , u y , −1}, we conclude, from Eq. (5.3), that these normal vectors are necessarily
perpendicular to a fixed direction in space, namely the (characteristic) direction
{a, b, c}. What this means is that the tangent planes to all possible solution surfaces
(at a given point) intersect at the line defined by the characteristic direction, thus
forming a pencil of planes, as shown in Fig. 5.1. A pencil of planes resembles the
pages of a widely open book as they meet at the spine.

Fig. 5.1 A pencil of planes

In the genuinely non-linear case, on the other hand, no preferred direction {a, b, c}
is prescribed by the PDE, resulting in the fact that the possible tangent planes (which
clearly constitute a one-parameter family of planes at each point of space) do not
necessarily share a common line. In general, therefore, we may say that, instead of
constituting a pencil of planes around a given line, they envelop a cone-like surface

Fig. 5.2 A Monge cone as the envelope of a family of planes at a point



Fig. 5.3 The solution is a surface tangent to the Monge cone field

known as the Monge cone¹ at the given point in space, as shown schematically in Fig. 5.2.²
The task of finding a solution to the PDE (5.1) can be regarded geometrically
as that of constructing a surface fitting the Monge cone field, in the sense that the
surface is everywhere tangent to the local cone, as shown schematically in Fig. 5.3.

5.3 The Characteristic Directions

The Monge cone can be seen at each point as defining not just one characteristic
direction (as was the case in the quasi-linear equation) but a one-parameter family of
characteristic directions, namely, the family of generators of the cone. To simplify
the notation, let us put
\[
p = u_x, \qquad q = u_y. \tag{5.4}
\]

Thus, the PDE (5.1) can be seen at a given point x0 , y0 , u 0 as imposing an algebraic
relation between the possible values of the slopes p and q, viz.,

F(x0 , y0 , u 0 , p, q) = 0. (5.5)

¹ In honour of Gaspard Monge (1746–1818), the great French mathematician and engineer, who made seminal contributions to many fields (descriptive geometry, differential geometry, partial differential equations).
² There are some mathematical subtleties. For example, we are tacitly assuming that the partial derivatives of the function F with respect to the arguments u_x and u_y do not vanish simultaneously. Also, we are considering a small range of tangent planes, where one of the slopes is a single-valued function of the other.

For each value of one of the slopes, this equation provides us with one³ value of the
other slope. In other words, Eq. (5.5) can be regarded as providing a one-parameter
family of slopes, namely,

p = p(α) q = q(α). (5.6)

Each value of the parameter α corresponds to a plane. As α varies, these planes envelop the Monge cone. A generator of this cone can loosely be regarded as the
intersection of two “neighbouring” planes. According to Eq. (5.2), the equation of
the one-parameter family of planes tangent to the cone is explicitly given by

u − u 0 = p(α) (x − x0 ) + q(α) (y − y0 ). (5.7)

To find the intersection between two planes, we need to take the cross product
of their normals.⁴ The intersection between two neighbouring tangent planes is,
therefore, aligned with the vector

\[
\mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ p & q & -1 \\ p + p'\,d\alpha & q + q'\,d\alpha & -1 \end{vmatrix} \tag{5.8}
\]


We have indicated with primes the derivatives with respect to the parameter and we
have used otherwise standard notation for unit vectors along the coordinate axes. On
the other hand, taking the derivative of Eq. (5.5) with respect to the parameter, we
obtain
\[
F_p\,p' + F_q\,q' = 0, \tag{5.9}
\]

where we have used the index notation for partial derivatives of the function F with
respect to the subscripted variable. Combining the last two results, we conclude that
the direction of the cone generator located in the plane corresponding to the value α
of the parameter is given by the vector with components

\[
\begin{Bmatrix} F_p \\ F_q \\ p F_p + q F_q \end{Bmatrix} \tag{5.10}
\]
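The generator formula (5.10) can be cross-checked against the cross-product construction (5.8). Take, as an illustration of ours, the eikonal-type equation F = p² + q² − 1, whose admissible slopes may be parametrized as p = cos α, q = sin α; then (5.10) gives the generator (2p, 2q, 2), a right circular cone. The sketch below compares this with the cross product of two neighbouring plane normals (p, q, −1):

```python
import math

def normal(alpha):
    """Normal (p, q, -1) of the tangent plane with p = cos(a), q = sin(a)."""
    return (math.cos(alpha), math.sin(alpha), -1.0)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

alpha, h = 0.7, 1e-5
# Generator via Eq. (5.8): intersection of two neighbouring tangent planes.
gen_cross = unit(cross(normal(alpha), normal(alpha + h)))
# Generator via Eq. (5.10): (F_p, F_q, p F_p + q F_q) = (2p, 2q, 2).
p, q = math.cos(alpha), math.sin(alpha)
gen_formula = unit((2.0 * p, 2.0 * q, 2.0 * (p * p + q * q)))
```

Both constructions give, up to normalization, the direction (cos α, sin α, 1), confirming that the envelope of tangent planes is a 45° circular Monge cone for this equation.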

³ See previous footnote.
⁴ The Monge cone is a particular case of an envelope of surfaces. In Box 5.1 we present a more general derivation.

Box 5.1 Envelopes of families of surfaces


A one-parameter family of surfaces is given by an equation

u = f (x, y, α),

where α is the parameter. If the dependence of f on α is smooth, there exists in general an enveloping surface u = U(x, y) with the property of being tangent
at each of its points to one of the members of the family. This envelope can be
obtained by solving the algebraic system

u = f (x, y, α) 0 = f α (x, y, α),

where we are using the subscript notation for partial derivatives. If f_αα ≠ 0 at a value α = α₀, the inverse function theorem ensures that we can read off from the second equation α = g(x, y) in an interval around α₀. Plugging this
function back in the first equation we obtain

u = f (x, y, g(x, y)) = U (x, y).

Since f_α = 0, we obtain U_x = f_x + f_α g_x = f_x, which is the desired property for the envelope.
Intuitively, if we start from the surface u = f (x, y, α0 ) and consider the
nearby surface u = f (x, y, α0 + dα) = f (x, y, α0 ) + f α (x, y, α0 )dα, these
two surfaces intersect along a curve obtained by equating f(x, y, α₀) = f(x, y, α₀) + f_α(x, y, α₀) dα. Hence, we obtain the condition f_α(x, y, α₀) = 0. So, the system of equations u = f(x, y, α) and f_α(x, y, α) = 0 can be
regarded as a one-parameter family of curves obtained as intersections between
neighbouring surfaces of the original family.
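A concrete instance of this construction (our example; the y-dependence is dropped for simplicity): the family of lines u = f(x, α) = 2αx − α² has f_α = 2x − 2α, so g(x) = x and the envelope is U(x) = x². Since f(x, α) = x² − (α − x)², the envelope is also the pointwise maximum of the family, which gives an easy numerical check:

```python
def f(x, alpha):
    """One member of the family u = 2*alpha*x - alpha**2 (tangent lines of u = x^2)."""
    return 2.0 * alpha * x - alpha * alpha

alphas = [-2.0 + 0.001 * k for k in range(4001)]    # alpha grid on [-2, 2]

def envelope(x):
    """Numerical envelope: f(x, a) = x^2 - (a - x)^2, so the envelope U(x) = x^2
    is recovered as the pointwise maximum of the family over alpha."""
    return max(f(x, a) for a in alphas)
```

Each line of the family touches the parabola u = x² at exactly one point (α = x), which is the tangency property stated in the box.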

At each point of the three-dimensional space with coordinates (x, y, u) we have thus defined a one-parameter family of characteristic directions. It is important to
observe that in the quasi-linear case, where F p = a(x, y, u) and Fq = b(x, y, u),
expression (5.10) provides us with an explicit system of equations for the character-
istic curves, namely, the familiar

\[
\frac{dx}{ds} = a(x, y, u), \qquad \frac{dy}{ds} = b(x, y, u), \qquad \frac{du}{ds} = c(x, y, u). \tag{5.11}
\]
In the present, genuinely non-linear, case, the system of ODEs

\[
\frac{dx}{ds} = F_p, \qquad \frac{dy}{ds} = F_q, \qquad \frac{du}{ds} = p F_p + q F_q, \tag{5.12}
\]

is under-determined, because the right-hand sides of these equations depend on the 5 variables x, y, u, p, q, which are interrelated by just one extra condition (the PDE
itself). But consider now an integral surface u = u(x, y) of our PDE. On this surface,
the functions F p and Fq can be calculated as explicit functions of x and y, and so
can p and q. We conclude that, given an integral surface, Eq. (5.12) give us a well-
defined vector field attached to this surface. It is not difficult to see that these vectors
are actually tangent to the surface. Indeed, the vector with components { p, q, −1}
is perpendicular to the surface and can be checked to be also perpendicular to the
vector defined by the right-hand-sides of Eq. (5.12). We conclude that on any given
integral surface of the PDE there exists a well-defined family of characteristics,
namely, of curves that have everywhere a characteristic direction (or, in other words,
the integral surface chooses at each point one of the local Monge cone generators).
To write explicitly the system of (five) ODEs satisfied by these characteristic curves
associated with a given solution surface, we need to calculate the derivatives of p and
q with respect to the curve parameter s. To this effect, we note that, since the PDE
(5.1) is satisfied identically, we must also have (by taking derivatives with respect to
x and y, respectively)

\[
F_x + F_u\,p + F_p\,p_x + F_q\,q_x = 0, \tag{5.13}
\]
\[
F_y + F_u\,q + F_p\,p_y + F_q\,q_y = 0. \tag{5.14}
\]

Combining these results with Eq. (5.12), we can write

\[
F_x + F_u\,p + \frac{dx}{ds}\,p_x + \frac{dy}{ds}\,q_x = 0, \tag{5.15}
\]
\[
F_y + F_u\,q + \frac{dx}{ds}\,p_y + \frac{dy}{ds}\,q_y = 0. \tag{5.16}
\]
But, since ‘mixed partials are equal’, we have that p y = qx , so that ultimately
Eqs. (5.15) and (5.16) can be written as

\[
\frac{dp}{ds} = -F_x - F_u\,p, \tag{5.17}
\]
\[
\frac{dq}{ds} = -F_y - F_u\,q. \tag{5.18}
\]
Equations (5.12), (5.17) and (5.18) constitute a system of five first-order ODEs
satisfied by the characteristic curves contained in a given solution surface. Suppose
now, vice versa, that these five equations had been given a priori, without any knowl-
edge of any particular solution surface. This system ‘happens to have’ the function
F as a first integral. What this means is that this function attains a constant value on
every integral curve of the given system of ODEs. Indeed, we check that

\[
\frac{dF}{ds} = F_x\,\frac{dx}{ds} + F_y\,\frac{dy}{ds} + F_u\,\frac{du}{ds} + F_p\,\frac{dp}{ds} + F_q\,\frac{dq}{ds} = 0, \tag{5.19}
\]
where we have used Eqs. (5.12), (5.17) and (5.18). If we now single out of all the
possible solutions of this system of ODEs those for which F = 0, we obtain a special
(three-parameter) sub-family of solutions called characteristic strips of the PDE. The
reason for this terminology is that each such solution can be seen as an ordinary curve,
x(s), y(s), u(s), each point of which carries a plane element, that is, the two slopes
p(s), q(s). The image to have in mind is that of a tapeworm. If a characteristic strip
has an element x, y, u, p, q in common with a solution surface, the whole strip must
belong to this surface.

5.4 Recapitulation

It may not be a bad idea to review the basic geometric ideas we have been working
with.5 A point is a triple (x0 , y0 , u 0 ). A plane element is a quintuple (x0 , y0 , u 0 , p, q).
Thus, a plane element consists of a point and two slopes defining a plane passing
through that point, namely,

u − u 0 = p (x − x0 ) + q (y − y0 ). (5.20)

The vector with components {p, q, −1} is perpendicular to this plane. A smooth one-parameter family of plane elements (x(s), y(s), u(s), p(s), q(s)) consists of a
smooth curve at each point of which there is a plane attached. These planes may
or may not be tangential to the underlying curve, also called the support curve
(x(s), y(s), u(s)). If they are, we say that (x(s), y(s), u(s), p(s), q(s)) is a strip.
Notice that so far all these concepts are completely independent of any PDE; they
are purely geometric definitions. We now ask the question: what is the condition for
a one-parameter quintuple to constitute a strip? The answer is quite straightforward
once we observe that for this to be the case the normal to the plane attached at a point
must be perpendicular to the support curve at that point. In other words, we need to
satisfy
\[
p\,\frac{dx}{ds} + q\,\frac{dy}{ds} - \frac{du}{ds} = 0. \tag{5.21}
\]
This equation is appropriately known as the strip condition. The above geometric
concepts are illustrated in Fig. 5.4.
Having established these basic definitions, let us introduce the first-order non-
linear PDE
F(x, y, u, p, q) = 0, (5.22)

5 We follow the terminology of [3].


5.4 Recapitulation 97

Fig. 5.4 Planes and strips

where we are using the notation (5.4). We form the following system of five ODEs,
which we call the characteristic system associated with the given PDE:

\[
\frac{dx}{ds} = F_p, \qquad \frac{dy}{ds} = F_q, \qquad \frac{du}{ds} = p F_p + q F_q,
\]
\[
\frac{dp}{ds} = -F_x - F_u\,p, \qquad \frac{dq}{ds} = -F_y - F_u\,q. \tag{5.23}
\]
ds ds
Any solution (integral curve) of this system can be obviously viewed as a one-
parameter family of plane elements supported by a curve. We claim that this one-
parameter family is necessarily a strip, which will be called a characteristic strip. The
proof of this assertion follows directly from substitution of the first three equations
of the system (5.23) into the strip condition (5.21).
To pin down a characteristic strip (it being the solution of a system of ODEs),
we only need to specify any (initial) plane element belonging to it. We have shown
that the function F defining our PDE is a first integral of its characteristic system.
Therefore, if any one plane element of a characteristic strip satisfies the equation
F = 0, so will the whole characteristic strip to which it belongs. This result can be
interpreted as follows. If a plane element is constructed out of a point on an integral
surface of the PDE and of the tangent plane to this surface at that point, then the
strip that this element uniquely determines has a support curve that belongs to the
integral surface (that is, a characteristic curve) and the plane elements of this strip are
made up of the tangent planes to the integral surface at the corresponding points. Two
characteristic strips with F = 0 whose support curves have a common point with a
common tangent, must coincide. Therefore, two integral surfaces having a common
point and a common tangent plane thereat, must have a whole characteristic strip in
common (that is, they share the whole support curve and are tangential to each other
along this curve).

5.5 The Cauchy Problem

The Cauchy (or initial) problem for a nonlinear PDE is essentially the same as for
its linear or quasi-linear counterpart. Given an initial curve in the plane

x = x̂(r ) y = ŷ(r ), (5.24)

on which the values of the unknown function have been prescribed as

u = û(r ), (5.25)

the Cauchy problem deals with the possibility of finding a solution of the PDE over a
neighbourhood of the initial curve in such a way that it attains the prescribed values
over this curve. Equations (5.24) and (5.25) constitute the parametric equations of a
curve in three-dimensional space. The Cauchy problem can, therefore, be rephrased
as follows: to find an integral surface of the PDE containing this space curve.
In the case of linear and quasi-linear equations, the solution to this problem was
based on the construction of the one-parameter family of characteristics issuing from
the various points of this space curve. The situation in a genuinely non-linear first-
order PDE is more delicate, since what we have at our disposal is not a collection
of characteristic curves, but rather of characteristic strips. The task of constructing
the solution must start therefore by extending in a unique way the initial data to
a (non-characteristic) strip, and only then solving the differential equations of the
characteristic strips to generate a one-parameter family. We will need to show how this
extension is accomplished and to prove that the one-parameter family of characteristic
support curves is indeed an integral surface. These tasks are somewhat more difficult
than in the case of the quasi-linear equation, but the fundamental idea of reducing the
Cauchy problem of a first-order PDE to the integration of a system of ODEs remains
the same.
The (non-characteristic) initial strip supported by the given curve will have a
parametric representation consisting of the equations of the supporting curve (5.24),
(5.25) and two additional equations

p = p̂(r ) q = q̂(r ), (5.26)

providing the slopes of the tangent plane as functions of the running parameter r .
To determine these two functions, we have at our disposal two equations. The first
equation is the strip condition (5.21), guaranteeing that each plane element contains
the local tangent to the curve, namely,

p̂ (dx̂/dr) + q̂ (dŷ/dr) = dû/dr. (5.27)

The second equation at our disposal is the PDE itself (which we clearly want to
see satisfied on this initial strip), that is,

F(x̂(r ), ŷ(r ), û(r ), p̂(r ), q̂(r )) = 0. (5.28)

We note that these two equations constitute, at each point, merely algebraic rela-
tions between the two unknown quantities p̂(r ), q̂(r ). To be able to read off these
unknowns at a given point in terms of the remaining variables, we need the corre-
sponding Jacobian determinant
 
J = | dx̂/dr   dŷ/dr |
    | F_p      F_q   |          (5.29)

not to vanish at that point. By continuity, there will then exist a neighbourhood of this
point with the same property. In this neighbourhood (which we will assume to be the
whole curve) we can obtain the desired result by algebraic means. Using each plane
element thus found as an initial condition for the system of characteristic ODEs, and
setting the parameter s of the characteristic strips thus obtained to 0 at the point of
departure, we obtain a one-parameter family of characteristic strips, namely,

x = x(r, s) y = y(r, s) u = u(r, s) p = p(r, s) q = q(r, s). (5.30)

We claim that the first three equations in (5.30) constitute an integral surface of
the PDE. It is clear that on the surface represented parametrically by these three
equations, the PDE is satisfied as an algebraic relation between the five variables
x, y, u, p, q. What remains to be shown is that we can read off the parameters r and
s in terms of x and y from the first two equations and that, upon entering these values
into the third equation and calculating the partial derivatives u x and u y , we recover,
respectively, the values of p and q given by the last two equations in (5.30). We will
omit the proof of these facts.6

5.6 An Example

To illustrate all the steps involved in the solution of a non-linear first-order PDE
by the method of characteristic strips, we will presently solve a relatively simple
example.7 The problem consists of finding a solution of the PDE

u = u 2x − u 2y , (5.31)

6 See [3].
7 This example is suggested as an exercise in [4], p. 66.

which on the x-axis attains the value


u = −(1/4) x². (5.32)
Solution: Our first task is to construct a (hopefully non-characteristic) strip sup-
ported by the initial 3-D curve, which can be parametrized as

x = r,   y = 0,   u = −(1/4) r². (5.33)
The strip condition (5.27) yields

1·p + 0·q = −(1/2) r. (5.34)
Moreover, the PDE itself, i.e. Eq. (5.31), yields

−(1/4) r² = p² − q². (5.35)
Solving the system of equations (5.34) and (5.35), we obtain

p = −(1/2) r,   q = ±(√2/2) r. (5.36)
This completes the strip over the support curve (5.33). It is important to notice that,
due to the non-linearity of the PDE, we happen to obtain two different possibilities
for the initial strip, each of which will give rise to a different solution of the equation.
Our next task is to obtain and solve the characteristic system of ODEs. Writing
the PDE (5.31) in the form

F(x, y, u, p, q) = u − p² + q² = 0, (5.37)

Equations (5.23) result in the system

dx/ds = −2p,   dy/ds = 2q,   du/ds = −2p² + 2q²,   dp/ds = −p,   dq/ds = −q. (5.38)
This system is easily integrated to

x = 2Ae^(−s) + C, (5.39)

y = −2Be^(−s) + D, (5.40)

u = (A² − B²)e^(−2s) + E, (5.41)



p = Ae^(−s), (5.42)

q = Be^(−s), (5.43)

where A, B, C, D, E are adjustable constants of integration. Setting s = 0, we make each of these expressions equal to the respective counterpart in Eq. (5.33) or (5.36) and thus obtain

A = −(1/2) r,   B = ±(√2/2) r,   C = 2r,   D = ±√2 r,   E = 0. (5.44)
Introducing these values into Eqs. (5.39)–(5.41), we obtain the following parametric
equation of the integral surface:

x = r(2 − e^(−s)), (5.45)

y = ±√2 r(1 − e^(−s)), (5.46)

u = −(1/4) r² e^(−2s). (5.47)
We now solve Eqs. (5.45) and (5.46) for r and s to obtain

r = x ∓ (√2/2) y, (5.48)

e^(−s) = 2 − x/(x ∓ (√2/2) y). (5.49)

Substituting these values in Eq. (5.47), we obtain the desired solution as the inte-
gral surface
u = −(1/4)(±√2 y − x)². (5.50)
Geometrically, the solution is either one of two horizontal oblique parabolic cylinders.
Either cylinder contains the initial data.
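The closed-form solution (5.50) can be checked numerically. The following Python sketch (a verification aid, not part of the original text) approximates u_x and u_y by central differences and confirms that the residual of the PDE (5.31) vanishes on both branches, together with the initial condition (5.32):

```python
import math

def u(x, y, sign=+1.0):
    # One branch of the integral surface (5.50): u = -(1/4)(±√2 y − x)²
    return -0.25 * (sign * math.sqrt(2.0) * y - x) ** 2

def check_pde(x, y, sign, h=1e-5):
    # Central finite differences for u_x and u_y
    ux = (u(x + h, y, sign) - u(x - h, y, sign)) / (2 * h)
    uy = (u(x, y + h, sign) - u(x, y - h, sign)) / (2 * h)
    # Residual of the PDE (5.31): u - u_x² + u_y²
    return u(x, y, sign) - ux**2 + uy**2

# The PDE residual vanishes (to discretization error) on both branches,
# and the initial condition u(x, 0) = -x²/4 holds.
for sign in (+1.0, -1.0):
    for (x, y) in [(0.3, -1.2), (1.0, 2.0), (-2.5, 0.7)]:
        assert abs(check_pde(x, y, sign)) < 1e-6
    assert abs(u(2.0, 0.0, sign) - (-1.0)) < 1e-12  # u(2, 0) = -2²/4 = -1
```

Since u is quadratic, the central differences are exact up to rounding, so the tolerance can be taken very tight.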

5.7 More Than Two Independent Variables

5.7.1 Quasi-linear Equations

Our treatment of first-order linear and non-linear PDEs has been constrained so far to
the case of two independent variables. The main reason for this restriction has been to

enable the visualization of the solutions as surfaces in R3 and thus to foster geometric
reasoning. The generalization to an arbitrary number of independent variables is quite
straightforward. Consider the quasi-linear equation

a₁(x₁, …, x_m, u) ∂u/∂x₁ + · · · + a_m(x₁, …, x_m, u) ∂u/∂x_m = c(x₁, …, x_m, u), (5.51)

for a function u of the m variables x₁, …, x_m. In more economical notation this equation can also be written as

a_i p_i = c, (5.52)

where we are using the summation convention in the range 1, . . . , m and the notation
introduced in (3.2).
The characteristic vector field associated with this equation is defined as the vector
field in Rm+1 with components a1 , . . . , am , c. The integral curves of this field, namely,
the solutions of the system of characteristic ODEs

dx_i/ds = a_i (i = 1, …, m),   du/ds = c, (5.53)
are the characteristic curves of the PDE. As before, we can prove that if a charac-
teristic curve has one point in common with a solution u = u(x1 , . . . , xm ), then the
whole characteristic curve belongs to this solution. In geometric terms, a function
u = u(x1 , . . . , xm ) is a hyper-surface of dimension m in Rm+1 . The Cauchy problem
consists of finding a solution when initial data have been given in a hyper-surface Γ
of dimension m − 1. In parametric form, such a hyper-surface can be represented as

xi = x̂i (r1 , . . . , rm−1 ) (i = 1, . . . , m) u = û(r1 , . . . , rm−1 ), (5.54)

where r₁, …, r_{m−1} are parameters. Assuming that this initial hyper-surface is nowhere tangent to a characteristic curve, the problem has a unique solution in a neighbourhood of Γ. This solution can be constructed as the (m − 1)-parameter family of characteristics issuing from the points of Γ.

Example 5.1 Solve the Burgers-like quasi-linear initial-value problem

u_t + u u_x + u² u_y = 0,   u(x, y, 0) = x + y, (5.55)

for a function u = u(x, y, t) in the half-space t ≥ 0.


Solution: The characteristic equations are

dx/ds = u,   dy/ds = u²,   dt/ds = 1,   du/ds = 0. (5.56)
5.7 More Than Two Independent Variables 103

The characteristics (integral curves) are obtained as

x = Ds + A,   y = D²s + B,   t = s + C,   u = D, (5.57)

where A, B, C, D are integration constants. The initial condition can be written in parametric form as

x = r₁,   y = r₂,   t = 0,   u = r₁ + r₂, (5.58)

where r₁, r₂ are parameters. Setting s = 0 at the initial manifold and enforcing the initial conditions, we obtain

A = r₁,   B = r₂,   C = 0,   D = r₁ + r₂. (5.59)

The solution hyper-surface in parametric form reads, therefore,

x = (r₁ + r₂)s + r₁,   y = (r₁ + r₂)² s + r₂,   t = s,   u = r₁ + r₂. (5.60)
We need to express the parameters in terms of the original independent variables x, y, t. Adding the first two equations and enforcing the third, we obtain

x + y = (r₁ + r₂)² t + (r₁ + r₂)(t + 1), (5.61)

whence

r₁ + r₂ = [−(t + 1) ± √((t + 1)² + 4t(x + y))] / (2t). (5.62)
Invoking the fourth parametric equation we can write the final result as

u = [−(t + 1) + √((t + 1)² + 4t(x + y))] / (2t). (5.63)
The choice of the positive sign has to do with the imposition of the initial condition.
Because of the vanishing denominator at t = 0 we verify, by L'Hôpital's rule, that

lim_{t→0} [−(t + 1) + √((t + 1)² + 4t(x + y))] / (2t)
    = lim_{t→0} (1/2) [−1 + (2(t + 1) + 4(x + y)) / (2√((t + 1)² + 4t(x + y)))] = x + y. (5.64)

Our solution (5.63) is not defined when the radicand is negative, namely, in the
subspace of R3 defined as

(t + 1)² + 4t(x + y) < 0. (5.65)



It is not difficult to verify that at the boundary of this domain the Jacobian determinant
∂(x, y, t)/∂(r1 , r2 , t) vanishes. Moreover, the t-derivative of the solution at the initial
manifold is infinite.
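As a check on the computation (a verification sketch, not part of the original text), the explicit solution (5.63) can be tested numerically against the PDE (5.55) and the limit (5.64):

```python
import math

def u(x, y, t):
    # Solution (5.63); valid where the radicand is non-negative and t > 0
    return (-(t + 1) + math.sqrt((t + 1)**2 + 4*t*(x + y))) / (2*t)

def residual(x, y, t, h=1e-6):
    # PDE residual u_t + u u_x + u² u_y, with derivatives taken by
    # central finite differences
    ut = (u(x, y, t + h) - u(x, y, t - h)) / (2*h)
    ux = (u(x + h, y, t) - u(x - h, y, t)) / (2*h)
    uy = (u(x, y + h, t) - u(x, y - h, t)) / (2*h)
    val = u(x, y, t)
    return ut + val*ux + val**2 * uy

# The residual vanishes at sample points inside the domain of definition
for (x, y, t) in [(0.5, 0.2, 0.3), (1.0, -0.4, 1.5), (2.0, 1.0, 0.8)]:
    assert abs(residual(x, y, t)) < 1e-4

# The limit (5.64): u(x, y, t) → x + y as t → 0⁺
assert abs(u(0.7, 0.4, 1e-8) - 1.1) < 1e-6
```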

5.7.2 Non-linear Equations

The most general non-linear PDE of the first order has the form

F(x₁, …, x_m, u, p₁, …, p_m) = 0, (5.66)

where the notation of Sect. 3.1 is used, namely, pi = u ,i . We define the characteristic
system of ODEs associated with the PDE (5.66) as

dx_i/ds = F_{p_i},   du/ds = p_i F_{p_i},   dp_i/ds = −F_{x_i} − F_u p_i, (5.67)
where the summation convention in the range i = 1, . . . , m is understood. These
equations are the m-dimensional analogues of Eqs. (5.12), (5.17) and (5.18).
The function F itself is a first integral of the characteristic system. Indeed, always
using the summation convention, on every solution of the characteristic system we
obtain
dF/ds = F_{x_i} dx_i/ds + F_u du/ds + F_{p_i} dp_i/ds
     = F_{x_i} F_{p_i} + F_u p_i F_{p_i} + F_{p_i} (−F_{x_i} − F_u p_i) = 0. (5.68)

A solution of the characteristic system (5.67) on which F = 0 is called a characteristic strip. Suppose that a characteristic strip has a point in common with a solution
u = u(x1 , . . . , xm ). By this we mean, of course, that the strip coincides with the
solution at that point and that they both have the same derivatives p1 , . . . , pm . Then,
since F = 0 on the strip, the characteristic strip lies entirely on the solution.
The Cauchy problem for the non-linear first-order PDE stipulates data on an initial
manifold of dimension m − 1. This initial manifold is given, generally, in terms of
some differentiable functions depending on m − 1 parameters ri (i = 1, . . . , m − 1),
that is,
x_i = x̂_i(r₁, …, r_{m−1}),   i = 1, …, m, (5.69)

on which we specify u = û(r1 , ..., rm−1 ). By analogy with the two-dimensional case,
we extend the initial data as some functions pi = p̂i (r1 , . . . rm−1 ). To this end, we
impose the strip conditions

p̂_i ∂x̂_i/∂r_k = û_{r_k},   k = 1, …, m − 1, (5.70)

and the PDE itself evaluated at the initial manifold, that is,

F(x̂1 , . . . , x̂m , û, p̂1 , . . . p̂m ) = 0. (5.71)

Equations (5.70) and (5.71) constitute an algebraic system of m equations that can,
in principle, be solved for the values of p1 , . . . , pm for each point r1 , . . . , rm−1 on
the initial manifold. By the inverse function theorem, this is possible if the Jacobian
determinant

J = | ∂x̂₁/∂r₁        …   ∂x̂_m/∂r₁       |
    |     ⋮                   ⋮           |          (5.72)
    | ∂x̂₁/∂r_{m−1}   …   ∂x̂_m/∂r_{m−1}  |
    | F_{p₁}          …   F_{p_m}         |

does not vanish over the initial manifold. If this condition is satisfied, we can build
a solution of the PDE that contains the initial data by constructing the (m − 1)-
parameter family of characteristic strips issuing from each point of the initial man-
ifold. This is achieved by means of the by now familiar procedure of setting s = 0
in the general expression for the characteristic strips and equating with the corre-
sponding values at each point of the initial manifold (extended as explained above).
In this way, the ‘constants’ of integration are obtained in terms of the parameters
r1 , . . . , rm−1 . The solution is thus obtained in the parametric form

xi = xi (s, r1 , . . . , rm−1 ) (i = 1, . . . , m) u = u(s, r1 , . . . , rm−1 ). (5.73)

To eliminate the parameters, we need to guarantee that the Jacobian determinant ∂(x₁, …, x_m)/∂(s, r₁, …, r_{m−1}) does not vanish. But this determinant J is precisely given by Eq. (5.72), as shown in Exercise 5.4. Since J does not vanish, by
assumption, on the initial manifold (where s = 0), by continuity it will not vanish
on a neighbourhood of the initial manifold. Therefore, in principle, we obtain a local
solution in the desired form u = u(x1 , . . . , xm ).
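The reduction of the Cauchy problem to ODEs can also be carried out numerically. The following Python sketch (an illustration, with the strip constants A, B chosen arbitrarily) integrates the characteristic-strip system for F = u − p² + q², i.e. the example of Sect. 5.6, with a standard fourth-order Runge–Kutta step, and compares the result with the exact strip (5.39)–(5.43):

```python
import math

# RK4 integration of the characteristic-strip system (5.67) for
# F = u − p² + q², compared against the closed-form strip (5.39)–(5.43)
# with A = 1, B = 2, C = D = E = 0 (arbitrary illustrative values).
def rhs(state):
    x, y, u, p, q = state
    # dx/ds = F_p = −2p,  dy/ds = F_q = 2q,
    # du/ds = pF_p + qF_q = −2p² + 2q²,
    # dp/ds = −F_x − pF_u = −p,  dq/ds = −F_y − qF_u = −q
    return (-2*p, 2*q, -2*p**2 + 2*q**2, -p, -q)

def rk4(state, ds, n):
    for _ in range(n):
        k1 = rhs(state)
        k2 = rhs(tuple(s + ds/2*k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + ds/2*k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + ds*k for s, k in zip(state, k3)))
        state = tuple(s + ds/6*(a + 2*b + 2*c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

A, B = 1.0, 2.0
s_final, n = 1.0, 1000
start = (2*A, -2*B, A**2 - B**2, A, B)   # the exact strip at s = 0 (F = 0 holds)
x, y, u, p, q = rk4(start, s_final/n, n)
e = math.exp(-s_final)
exact = (2*A*e, -2*B*e, (A**2 - B**2)*e**2, A*e, B*e)
for got, want in zip((x, y, u, p, q), exact):
    assert abs(got - want) < 1e-9
```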

5.8 Application to Hamiltonian Systems

5.8.1 Hamiltonian Systems

A Hamiltonian system consists of states described in terms of n generalized coordinates q₁, …, q_n and n generalized momenta p₁, …, p_n. The physical properties of the system are completely characterized by a single scalar function H =
H (q1 , . . . , qn , p1 , . . . , pn , t), where t is a time-like variable. This function, assumed

to be differentiable, is called the Hamiltonian of the system. Finally, the evolution of the system in time is governed by Hamilton's equations, viz.,

dq_i/dt = ∂H/∂p_i,   dp_i/dt = −∂H/∂q_i,   i = 1, …, n. (5.74)

Note that these are ODEs. The partial derivatives on the right hand side are known
functions of q, p, t obtained by differentiating the Hamiltonian function. A solution
of this system constitutes a trajectory. A trajectory can be regarded as a curve in R2n ,
the space with coordinates q1 , . . . , qn , p1 , . . . , pn known as the phase space of the
system.
Although Hamilton’s equations were originally formulated by starting from
Lagrangian Mechanics and effecting a certain Legendre transformation relative to
the generalized velocities, Hamiltonian systems can arise independently in Mechan-
ics and in other branches of Physics such as Optics, General Relativity and Quantum
Mechanics.
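For concreteness, Hamilton's equations (5.74) can be integrated numerically. The sketch below (not part of the original text) applies a standard fourth-order Runge–Kutta step to the ballistic Hamiltonian that reappears in Sect. 5.8.4 and checks that H is a constant of the motion:

```python
import math

# Hamilton's equations (5.74) for the ballistic Hamiltonian of Sect. 5.8.4:
# H = (p² + q²)/(2m) + m g y.  Here (x, y) are coordinates, (p, q) momenta;
# m, g and the initial state are chosen arbitrarily for illustration.
m, g = 1.0, 9.81

def rhs(state):
    x, y, p, q = state
    # dq_i/dt = ∂H/∂p_i,  dp_i/dt = −∂H/∂q_i
    return (p/m, q/m, 0.0, -m*g)

def rk4_step(state, dt):
    def add(s, k, c): return tuple(si + c*ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(add(state, k1, dt/2))
    k3 = rhs(add(state, k2, dt/2))
    k4 = rhs(add(state, k3, dt))
    return tuple(s + dt/6*(a + 2*b + 2*c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def H(state):
    x, y, p, q = state
    return (p**2 + q**2)/(2*m) + m*g*y

state = (0.0, 0.0, 3.0, 4.0)   # initial position and momenta
E0 = H(state)
for _ in range(1000):
    state = rk4_step(state, 0.001)   # integrate up to t = 1

# The Hamiltonian (total energy) is conserved along the trajectory,
# and x(t) = p₀t/m is linear in time.
assert abs(H(state) - E0) < 1e-9
assert abs(state[0] - 3.0) < 1e-9
```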

5.8.2 Reduced Form of a First-Order PDE

At first sight, there seems to be no relation whatsoever between Hamiltonian systems (governed by a system of ODEs) and first-order PDEs. It is true that, as we have
pointed out repeatedly, the Cauchy problem for a first-order PDE can be reduced to
the integration of the system of characteristic ODEs. But is the converse possible
or even worthwhile? To work towards an answer to these questions, let us start by
considering a special kind of first-order PDE, which we call reduced. This is a PDE of
the form
F(x₁, …, x_n, p₁, …, p_n) = 0. (5.75)

The peculiarity of this PDE is that the unknown function u does not appear explicitly
in the function F. The characteristic strips for this equation are given, according to
(5.67), by

dx_i/ds = F_{p_i},   dp_i/ds = −F_{x_i},   du/ds = p_i F_{p_i},   i = 1, …, n, (5.76)
with the summation convention implied in the range 1, . . . , n.
If we compare the first two expressions of (5.76) with their counterparts in (5.67)
we realize that, except for some minor differences in notation, they are identical.
An important detail is that, since F does not contain u, the first two expressions are
independent from the third. In other words, the characteristic system can be integrated
first by solving the 2n equations involving just xi and pi , and only later solving the
evolution equation for u. Notice also that, although in the PDE the symbols pi stand
for the partial derivatives u ,i , this fact is irrelevant as far as the characteristic equations
are concerned.

In conclusion, a first-order PDE in reduced form gives rise, via its characteristic
equations, to a Hamiltonian system!

Box 5.2 What is so special about a reduced first-order PDE?


We have called reduced a first-order PDE that does not explicitly include
the unknown function u. According to Eq. (5.76) the characteristic equations
of a reduced first-order PDE yield a Hamiltonian system, a remarkable fact
considering that no Physics has been invoked. Remarkable too is the fact that
every first order PDE can be brought into a reduced form, at a relatively small
price. To see how this is possible, let us start from the general non-reduced
PDE
F(x1 , . . . , xn , u, p1 , . . . , pn ) = 0.

Instead of looking for a solution in the form u = u(x1 , . . . , xn ) let us look for
a solution in the implicit form w(x₁, …, x_n, u) = 0, and let us investigate what PDE the function w of n + 1 independent variables satisfies. Since

(∂w/∂x₁) dx₁ + · · · + (∂w/∂x_n) dx_n + (∂w/∂u) du = 0,

we conclude that

p_i = ∂u/∂x_i = −(∂w/∂x_i)/(∂w/∂u).

Thus, the function w of the independent variables x₁, …, x_n, u satisfies the equation

F(x₁, …, x_n, u, −(∂w/∂x₁)/(∂w/∂u), …, −(∂w/∂x_n)/(∂w/∂u)) = G(x₁, …, x_n, u, ∂w/∂x₁, …, ∂w/∂x_n, ∂w/∂u) = 0.

This is a first-order PDE in reduced form for the function w of n + 1 variables. Having obtained a solution of this equation, we have also obtained the solution u to the original equation in implicit form.

5.8.3 The Hamilton–Jacobi Equation

We are now in a position to answer the question as to whether, given a Hamiltonian system, one can always find a reduced first-order PDE whose characteristics are
Hamilton’s equations of the system. The answer is surprisingly simple, positive
and constructive. Let H = H (q1 , . . . , qn , p1 , . . . , pn , t) be the Hamiltonian of the

system. The corresponding first-order PDE, known as the Hamilton–Jacobi equation, is

H(q₁, …, q_n, ∂S/∂q₁, …, ∂S/∂q_n, t) + ∂S/∂t = 0. (5.77)

The function S = S(q₁, …, q_n, t), obtained as a solution of this equation, is called an action function. The action function acts as some kind of scalar potential for the momenta.

5.8.4 An Example

The Hamilton–Jacobi equation establishes that any Hamiltonian system is ultimately governed by the solutions of a single scalar PDE of the first order. Apart from the
theoretical importance of this result, it also has practical applications. Suppose we
are given a Hamiltonian function H . The obvious way to analyze the evolution of
the system out of some initial conditions is to solve the system of ODEs provided
by Hamilton’s equations. If, instead, we construct the associated Hamilton–Jacobi
equation and if, by some technique, we manage to find a solution of this equation
involving some arbitrary constants, then we have solved the problem in a completely
different way. In particular, we can recover the characteristics corresponding to our
initial conditions.
A complete integral of a first order PDE F(x1 , . . . , xn , u, p1 , . . . , pn ) = 0 is a
function f (x1 , . . . , xn , a1 , . . . , an ) which satisfies the PDE for all values of the arbi-
trary parameters a1 , . . . , an . The characteristics of the PDE are obtained from a
complete integral by equating the derivatives of the complete integral with respect
to each of the parameters to a constant.8

Box 5.3 Complete, general and singular integrals


We have learned how to solve the Cauchy problem for any particular initial
data by the method of characteristics. Can we define, in some sense, a general
solution? One way to approach this question is to note that, for the case of two
independent variables x, y, the solution depends on the values of a function
on a parametrized curve lying on the x, y plane. In other words, the solution
depends on a function of a single variable. Keeping this observation in mind,
let us define a complete integral of the PDE F(x, y, u, p, q) = 0 as a function

u = f (x, y, α, β)

8 See Boxes 5.3 and 5.4. For a thorough understanding of these topics within the mathematical
context, see [1], p. 59, [2], p. 33, and [3], p. 29. For many interesting and challenging problems on
the general integral, [4] is highly recommended.

that satisfies the PDE for arbitrary values of the two parameters α, β. Since the
parameters are arbitrary and independent, we may decide to impose a restriction
to the two-parameter family by choosing a specific functional dependence
β = β(α). We have at our disposal an arbitrary function of a single variable
to control the solution. More specifically, referring to Box 5.1, we can obtain
the envelope of the new one-parameter family, and eliminate the parameter α,
by choosing a specific function β and solving the algebraic system

u = f(x, y, α, β(α)),   f_α(x, y, α, β(α)) + f_β(x, y, α, β(α)) dβ(α)/dα = 0.

But, since, at each point, the envelope of a family coincides with one of the
solutions and has the same tangent plane, we conclude that this envelope is
itself a solution! We have thus obtained a solution depending on an arbitrary
function β, that is, a general integral.
Finally, a singular integral can sometimes be found that is not comprised
within the solutions delivered by the general integral. This singular solution is
obtained as the envelope of the whole two-parameter family u = f (x, y, α, β).
It can be regarded as an envelope of envelopes. It is delivered by the system of
algebraic equations

u = f(x, y, α, β),   f_α(x, y, α, β) = 0,   f_β(x, y, α, β) = 0.

In order to read off α and β from the last two equations, according to the
inverse function theorem, the Jacobian determinant ∂( f α , f β )/∂(α, β) (which
happens to be the Hessian determinant) must not vanish.
For the sake of simplicity, we have only dealt with the case of two indepen-
dent variables, but a similar treatment can be justified for higher dimensions.
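A classical concrete instance of these notions, offered here as an illustration (it does not appear in the text), is the Clairaut-type equation u = x u_x + y u_y + u_x u_y, whose complete integral is the two-parameter family of planes u = αx + βy + αβ; the envelope over both parameters, obtained from f_α = x + β = 0 and f_β = y + α = 0, is the singular integral u = −xy. A quick numerical check in Python:

```python
# For F = x p + y q + p q − u = 0 (a Clairaut-type PDE, used here only as
# an illustration), the two-parameter family
#   u = f(x, y, α, β) = α x + β y + α β
# is a complete integral; its envelope over both parameters is u = −x y.
def f(x, y, a, b):
    return a*x + b*y + a*b

def F(x, y, u, p, q):
    return x*p + y*q + p*q - u

# 1) every member of the family solves the PDE (with p = α, q = β):
for (x, y, a, b) in [(1.0, 2.0, 0.3, -0.7), (-1.5, 0.4, 2.0, 1.1)]:
    assert abs(F(x, y, f(x, y, a, b), a, b)) < 1e-12

# 2) the envelope (singular integral) u = −x y also solves it,
#    with p = −y, q = −x:
for (x, y) in [(1.0, 2.0), (-0.5, 3.0)]:
    assert abs(F(x, y, -x*y, -y, -x)) < 1e-12
```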

Since complete integrals are in general difficult to obtain, let us deal with a simple
example from Hamiltonian Mechanics, namely, the classical ballistic problem: A
particle of mass m moving under the action of constant gravity g in the x, y plane,
where y is the upright vertical direction. The Hamiltonian function in this case is the
total energy, expressed in terms of coordinates x, y and momenta p, q as

H(x, y, p, q) = (1/(2m))(p² + q²) + mgy. (5.78)
The Hamilton–Jacobi equation is, therefore,

(1/(2m))[(∂S/∂x)² + (∂S/∂y)²] + mgy + ∂S/∂t = 0. (5.79)
110 5 The Genuinely Nonlinear First-Order Equation

To find a complete integral we try a solution of the form

S(x, y, t) = f₁(x) + f₂(y) + f₃(t). (5.80)

Substituting this assumption in Eq. (5.79) we obtain

(1/(2m))(df₁/dx)² + (1/(2m))(df₂/dy)² + mgy + df₃/dt = 0. (5.81)

Since the only way that functions of different variables may be equated to each other
is if they are constant, we obtain the three conditions

(1/(2m))(df₁/dx)² = A,   (1/(2m))(df₂/dy)² + mgy = B,   df₃/dt = −(A + B), (5.82)

where A, B are arbitrary constants. Upon integration, we obtain



f₁(x) = √(2mA) x + k₁,   f₂(y) = 2√2 (B − mgy)^(3/2) / (3g m^(1/2)) + k₂,   f₃(t) = −(A + B)t + k₃, (5.83)
where k1 , k2 , k3 are constants of integration. The complete integral found is, therefore,

S(x, y, t, A, B, C) = √(2mA) x + 2√2 (B − mgy)^(3/2) / (3g m^(1/2)) − (A + B)t + C. (5.84)

The constant C is irrelevant, since the function S can be determined only up to an additive constant as a result of the fact that the Hamilton–Jacobi equation is in reduced
form (that is, S itself does not appear explicitly in the PDE). The characteristics
are obtained by taking the derivatives of the complete integral with respect to the
parameters and equating to constants, as explained in Box 5.4. We obtain

−t + √(m/(2A)) x = a,   −t + √(2(B − mgy)/(g² m)) = b. (5.85)

As expected in the ballistic problem, the horizontal coordinate is a linear function of time, while the vertical coordinate varies quadratically. The four constants A, B, a, b
can be pinned down when the initial conditions of position and momentum are
specified.
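As a sanity check (not part of the original text), one can verify numerically that the complete integral (5.84) satisfies the Hamilton–Jacobi equation (5.79) for arbitrarily chosen values of m, g, A, B:

```python
import math

m, g = 2.0, 9.81
A, B = 1.5, 40.0   # arbitrary positive separation constants

def S(x, y, t):
    # Complete integral (5.84), with C = 0; requires B − m g y ≥ 0
    return (math.sqrt(2*m*A)*x
            + 2*math.sqrt(2)*(B - m*g*y)**1.5/(3*g*math.sqrt(m))
            - (A + B)*t)

def hj_residual(x, y, t, h=1e-6):
    # Left-hand side of the Hamilton–Jacobi equation (5.79),
    # with the partial derivatives of S taken by central differences
    Sx = (S(x + h, y, t) - S(x - h, y, t)) / (2*h)
    Sy = (S(x, y + h, t) - S(x, y - h, t)) / (2*h)
    St = (S(x, y, t + h) - S(x, y, t - h)) / (2*h)
    return (Sx**2 + Sy**2)/(2*m) + m*g*y + St

# The residual vanishes (to discretization error) at sample points
for (x, y, t) in [(0.0, 0.0, 0.0), (3.0, 1.0, 0.5), (-1.0, -2.0, 2.0)]:
    assert abs(hj_residual(x, y, t)) < 1e-3
```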

Box 5.4 Obtaining the characteristics from the complete integral


We restrict attention to the case of a first-order PDE in reduced form with
two independent variables, namely, F(x, y, p, q) = 0. Due to the assumed
reduced form of the PDE, if u = u(x, y) is a solution, so is u = u(x, y) + β,
where β is an arbitrary constant. The immediate consequence of this fact is
that, to form a complete integral, we need to find only a solution of the form
u = f (x, y, α). The complete integral is given by u = f (x, y, α) + β. Our
intention is to obtain a general characteristic strip that can be fitted to any
initial conditions. Consider the condition

f_α(x, y, α) = γ,

where γ is a constant. We claim that, for any values of the three parameters
α, β, γ, the four equations

f_α(x, y, α) = γ,   u = f(x, y, α) + β,   p = f_x(x, y, α),   q = f_y(x, y, α)

define a characteristic strip of the PDE. We start by noticing that each of these
equations eliminates, so to speak, a degree of freedom in the 5-dimensional
space of coordinates x, y, u, p, q, so that 4 independent equations, in general,
determine a curve in this space. We notice, moreover, that this line will lie
on the 4-dimensional sub-manifold F = 0. This conclusion follows from the
fact that the function f (x, y, α) is a solution of the PDE for arbitrary values
of α. Thirdly, we verify that we have a three-parameter family of such (one-
dimensional) curves sweeping the sub-manifold F = 0. Accordingly, to pin
down one of these curves we need to adjust the parameters α, β, γ to satisfy
any given initial conditions x0 , y0 , u 0 , p0 , q0 . Clearly, q0 is not independent
of p0 , since the condition F = 0 must be satisfied by the initial conditions
too. Finally, we verify that, for any fixed values of α, β, γ, the curve satisfies
the strip condition (5.21). This is an immediate consequence of the fact that
du = f_x dx + f_y dy = p dx + q dy.
All that is left is to ascertain that these strips are characteristic. Following
[2], consider the differential of our main condition f_α(x, y, α) = γ, that is,

f_{αx} dx + f_{αy} dy = 0.

This equation provides a specific ratio between d x and dy provided that the
two partial derivatives do not vanish simultaneously. Substituting our complete
integral into the PDE and differentiating with respect to α we obtain

F_p f_{xα} + F_q f_{yα} = 0.

Since mixed partial derivatives are symmetric, we obtain

dy/dx = F_q / F_p.

This result is equivalent to the first two conditions for a characteristic strip, as
given in Eq. (5.23). The condition du = pd x + qdy, which we have already
derived, implies that the third condition in (5.23) is fulfilled. The final two con-
ditions are obtained comparing the differentials of p and q with the (vanishing)
total derivatives of F with respect to x and y, respectively. For n independent
variables, the parameter α is replaced by n − 1 parameters αi and the charac-
teristics are obtained by equating the derivatives of the complete integral with
respect to each of these parameters to a constant.
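A standard illustration of this construction, added here as a hedged sketch (the eikonal example is not in the text), is the reduced PDE p² + q² = 1 with complete integral u = x cos α + y sin α + β. The condition f_α = γ then singles out a straight support line whose direction is proportional to (F_p, F_q) = (2 cos α, 2 sin α):

```python
import math

# For the reduced (eikonal-type) PDE F = p² + q² − 1 = 0, a complete
# integral is u = f(x, y, α) + β = x cos α + y sin α + β.  Fixing α, β, γ,
# the line f_α(x, y, α) = −x sin α + y cos α = γ, together with
# p = f_x = cos α and q = f_y = sin α, defines a characteristic strip.
def f(x, y, alpha):
    return x*math.cos(alpha) + y*math.sin(alpha)

def f_alpha(x, y, alpha):
    return -x*math.sin(alpha) + y*math.cos(alpha)

gamma = 0.5
for alpha in [0.0, 0.7, 2.1, -1.3]:
    p, q = math.cos(alpha), math.sin(alpha)
    # the family solves F = 0 identically (independently of x, y, β):
    assert abs(p**2 + q**2 - 1.0) < 1e-12
    # a point on the support line f_α = γ, then a step in direction (p, q);
    # the line is invariant under this step, so dy/dx = F_q/F_p = q/p holds:
    x0, y0 = -gamma*math.sin(alpha), gamma*math.cos(alpha)
    x1, y1 = x0 + p, y0 + q
    assert abs(f_alpha(x0, y0, alpha) - gamma) < 1e-12
    assert abs(f_alpha(x1, y1, alpha) - gamma) < 1e-12
```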

Exercises
Exercise 5.1 ([4], p. 66.) Find the characteristics of the equation u_x u_y = u and determine the integral surface that passes through the parabola x = 0, y² = u.
Exercise 5.2 Modify the code of Box 3.3 to handle a general nonlinear first-order
PDE in two independent variables.
Exercise 5.3 Solve the PDE of Example 5.1 in parametric form with the initial condition u(x, y, 0) = 1/(1 + (x + y)²). Plot the various profiles (as surfaces in R³)
of the solution for several instants of time. Notice the formation of multiple-valued
profiles, indicating the emergence of shock waves.
Exercise 5.4 Show that
∂(x₁, …, x_m)/∂(s, r₁, …, r_{m−1}) = J,

where J is given by Eq. (5.72). [Hint: use Eq. (5.67)].


Exercise 5.5 Show explicitly how the characteristics of the Hamilton–Jacobi
Eq. (5.77) reproduce Hamilton’s equations (5.74). What happens when the
Hamiltonian is independent of time?

References

1. Duff GFD (1956) Partial differential equations. Toronto University Press, Toronto
2. Garabedian PR (1964) Partial differential equations. Wiley, London
3. John F (1982) Partial differential equations. Springer, Berlin
4. Sneddon IN (1957) Elements of partial differential equations. McGraw-Hill, New York (Republished by Dover (2006))
Part III
Classification of Equations
and Systems
Chapter 6
The Second-Order Quasi-linear Equation

A careful analysis of the single quasi-linear second-order equation is the gateway into
the world of higher-order partial differential equations and systems. One of the most
important aspects of this analysis is the distinction between hyperbolic, parabolic
and elliptic types. From the physical standpoint, the hyperbolic type corresponds to
physical systems that can transmit sharp signals over finite distances. The parabolic
type represents diffusive phenomena. The elliptic type is often associated with statical
situations, where time is absent. Of these three types, the hyperbolic case turns out to
resemble the single first-order PDE the most. In particular, characteristic lines make
their appearance and play a role in the understanding of the propagation phenomena
and in the prediction of the speed, trajectory and variation in amplitude of the signals,
without having to solve the differential equation itself.

6.1 Introduction

The general form of a quasi-linear second-order PDE for a function u = u(x, y) of two independent variables is

a u_xx + 2b u_xy + c u_yy = d, (6.1)

where a, b, c, d are functions of the arguments x, y, u, u_x, u_y. Recall that by quasi-linearity we mean that we only require that the highest derivatives in the equation
appear linearly. Both in the case of ODEs and in the case of first-order PDEs, we
had a chance of appreciating the importance of interpreting a differential equation, at
least locally, in the following way: If we are given appropriate initial conditions on an
appropriate set (a point, a line), the differential equation gives us information on how
to come out, as it were, from this initial set. The nature of the initial set and the nature
of the initial information that needs to be given on it depend on the type of problem
© Springer International Publishing AG 2017
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_6

and on the type of equation. For example, in the case of the dynamics of a system
of n particles moving in space, the equations of motion are 3n second-order (or,
equivalently, 6n first-order) ordinary differential equations. If, at some initial time,
we prescribe the instantaneous positions and velocities of each particle, a total of 6n
numbers, the differential equations allow us to calculate the accelerations, and thus
to come out of the original state and predict the evolution of the system. Notice that,
by successive differentiations of the equations of motion, if all the functions involved
are analytic (i.e., if they admit a convergent Taylor expansion), we can obtain any
number of derivatives at the initial time. In this way, at least for the analytic case, we
can extend the solution to a finite interval of time.1
In the case of a single first-order PDE we have seen that the Cauchy problem
consists of specifying the values of the unknown function on an initial curve. In
essence, unless the curve happens to be a characteristic of the PDE, the differential
equation allows us to come out of this curve using the information provided by
the differential equation concerning the first derivatives. Again we observe that the
information is given on a set with one dimension less than the space of independent
variables and it involves knowledge of the function and its derivatives up to and
including one degree less than the order of the PDE. In the case of a first-order
equation, one degree less means no derivatives at all. It is natural, therefore, to
expect that the Cauchy problem for a second-order PDE in two dimensions will
involve specifying, on a given curve, the unknown function and its first derivatives.
We expect then the second-order PDE to provide enough information about the
second (and higher) derivatives so that we can come out of the initial curve. In fact,
the classical theorem of Cauchy–Kowalewski2 proves that if the coefficients in the
PDE are analytic, this procedure can be formalized to demonstrate the existence
and uniqueness of the (analytic) solution in some neighbourhood of a point on the
initial manifold. That said, we don’t want to convey the impression that this initial-
value problem is prevalent in the treatment and application of all possible equations.
We will see later that boundary value problems and mixed initial-boundary-value
problems are prevalent in applications. But, from the conceptual point of view, the
understanding of the behaviour of the equation and its solution in the vicinity of an
initial manifold with known initial data is of paramount importance. In particular, we
will presently see how it can be used to classify the possible second-order quasi-linear
equations into three definite types.3

1 The theorem of existence and uniqueness, on the other hand, requires much less than analyticity.
2 For a detailed proof, see the classical treatise [1].
3 In some textbooks, the classification is based on the so-called normal forms of the equations. In

order to appreciate the meaning of these forms, however, it is necessary to have already seen an
example of each. We prefer to classify the equations in terms of their different behaviour vis-à-vis
the Cauchy problem. Our treatment is based on [4], whose clarity and conciseness are difficult to
match.

6.2 The First-Order PDE Revisited

Before proceeding to the study of the second-order equation, it may be useful to


revisit the first-order case. Given the first-order quasi-linear PDE

a(x, y, u) u_x + b(x, y, u) u_y = c(x, y, u),        (6.2)

and a parametrized (initial) curve

x = x̂(r),   y = ŷ(r),        (6.3)

in the space of independent variables on which the value of the solution is specified
as
u = û(r),        (6.4)

the Cauchy problem consists of finding an integral surface that contains the space
curve represented by Eqs. (6.3) and (6.4). Let us assume that we know nothing at all
about characteristics and their role in building integral surfaces. We ask the question:
Does the differential equation (6.2) provide us with enough information about the
first derivatives of the solution being sought so that we can come out, as it were,
of the initial curve? Intuitively, the answer will be positive if, and only if, the PDE
contains information about the directional derivative in a direction transversal to the
original curve in the plane. We may ask why, if we need the whole gradient, we only
demand the derivative in one transversal direction. The answer is: because the initial
data, as given by Eq. (6.4), already give us the necessary information in the direction
of the curve itself. Indeed, assuming that the function û is differentiable, we obtain
by the chain rule of differentiation

dû/dr = u_x dx̂/dr + u_y dŷ/dr,        (6.5)

where u_x and u_y are the partial derivatives of any putative solution of the PDE. This
equation must hold true along the whole initial curve, if indeed we want our solution
to satisfy the given initial conditions. Always moving along the initial curve, we see
that the determination of the two partial derivatives u_x, u_y at any given point along
this curve is a purely algebraic problem, consisting in solving (point-wise) the linear
system of equations

⎡ a    b  ⎤ ⎧ u_x ⎫   ⎧ c  ⎫
⎣ x̂′   ŷ′ ⎦ ⎩ u_y ⎭ = ⎩ û′ ⎭ ,        (6.6)

where we indicate by a prime the derivative with respect to the curve parameter.
The first equation of this linear system is provided by the PDE and the second
equation of the system is provided by the initial conditions via Eq. (6.5). No more
information is available. This linear system will have a unique solution if, and only
if, the determinant of the coefficient matrix does not vanish. If this is the case, we

obtain in a unique fashion the whole gradient of the solution and we can proceed to
extend the initial data to a solution in the nearby region (for example, by a method
of finite differences). Otherwise, namely if the determinant vanishes, there are two
possibilities:
1. The rank of the augmented matrix
 
⎡ a    b    c  ⎤
⎣ x̂′   ŷ′   û′ ⎦        (6.7)

is 2, in which case there is no solution;


2. The rank of this augmented matrix is less than 2, in which case there is an infinite
number of solutions.
In this way, we discover the concept of characteristic direction in a natural way as
an answer to the question: When does the Cauchy problem not have a unique solution
in the neighbourhood of a point? The answer can be interpreted as follows: When
the PDE itself does not add any information to that provided by the initial data.
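The point-wise test embodied in the system (6.6) is easy to carry out numerically. The following Python sketch (an illustration of ours, not part of the original text; the function name is hypothetical) solves the system for the gradient and reports a characteristic direction when the determinant vanishes:

```python
import numpy as np

def cauchy_derivatives(a, b, c, xp, yp, up):
    """Solve the point-wise linear system (6.6) for the gradient (u_x, u_y)
    along the initial curve.  a, b, c are the PDE coefficients at the point;
    xp, yp, up are the parameter derivatives x', y', u' of the initial data."""
    M = np.array([[a, b], [xp, yp]], dtype=float)
    rhs = np.array([c, up], dtype=float)
    if abs(np.linalg.det(M)) < 1e-12:
        # a y' = b x': the initial curve runs along a characteristic direction
        raise ValueError("characteristic direction: system (6.6) is singular")
    return np.linalg.solve(M, rhs)

# PDE u_x + 2 u_y = 0 with data u = r on the line y = 0 (x = r):
ux, uy = cauchy_derivatives(a=1.0, b=2.0, c=0.0, xp=1.0, yp=0.0, up=1.0)
# ux = 1.0, uy = -0.5: the gradient is determined and we can leave the curve.
```

On the non-characteristic line y = 0 the gradient is uniquely determined; prescribing the same data on the line y = 2x (xp = 1, yp = 2) makes the coefficient matrix singular and the function raises, exactly as discussed above.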

6.3 The Second-Order Case

Inspired by the analysis of the first-order case, we formulate the Cauchy problem
for the second-order quasi-linear PDE (6.1) as follows: Given a curve by Eq. (6.3),
and given, along this curve, the (initial) values of the unknown function and its first
partial derivatives
u = û(r),        (6.8)

u_x = û_1(r),        (6.9)

u_y = û_2(r),        (6.10)

where û(r), û_1(r) and û_2(r) are differentiable functions of the single variable r, find
a solution of Eq. (6.1) compatible with these Cauchy data.
Before investigating whether or not this problem has a solution, it is worth remarking that the functions û_1(r) and û_2(r) cannot be specified arbitrarily. Indeed, they
must be compatible with the derivative of the function û(r), as we have seen above
in Eq. (6.5). Specifically,

û_1 dx̂/dr + û_2 dŷ/dr = dû/dr.        (6.11)
An equivalent way to prescribe the data needed for the Cauchy problem, and to avoid
a contradiction, is to stipulate the value of the unknown function u on the initial curve
and the value of the first derivative du/dn in any direction n transversal to the curve.

By analogy with the first-order case, we want to investigate whether or not the
differential equation provides us with enough information about the second deriv-
atives so that we can come out of the initial curve with the given initial data. For
this to be possible, we need to be able to calculate at each point of the initial curve
the values of all three second partial derivatives. We start by remarking that the
Cauchy data already provide us with some information about the second derivatives
of any proposed solution, just like in the first-order case. Indeed, by the chain rule
of differentiation, we know that at any point of the initial curve

û′_1 = u_xx x̂′ + u_xy ŷ′        (6.12)

and
û′_2 = u_yx x̂′ + u_yy ŷ′.        (6.13)

Thus, the determination of the three second partial derivatives of the solution along
the initial curve is, at each point, a purely algebraic problem defined by the system
of linear equations
⎡ a    2b   c  ⎤ ⎧ u_xx ⎫   ⎧  d   ⎫
⎢ x̂′   ŷ′   0  ⎥ ⎨ u_xy ⎬ = ⎨ û′_1 ⎬        (6.14)
⎣ 0    x̂′   ŷ′ ⎦ ⎩ u_yy ⎭   ⎩ û′_2 ⎭

The determinant of the coefficient matrix of this system is given by

Δ = a ŷ′² − 2b x̂′ ŷ′ + c x̂′².        (6.15)

If this determinant does not vanish at any point along the initial curve (for the given
Cauchy data), there exists a point-wise unique solution for all three second partial
derivatives and, therefore, it is possible to come out of the initial curve by means of the
information gathered from the initial data and the differential equation. Otherwise,
that is, when the determinant vanishes, if the rank of the augmented matrix is equal to
3, the system is incompatible and there are no solutions. If this rank is less than 3, the
system is compatible, but the solution is not unique. In this case, if the determinant
vanishes identically along the initial curve, the curve is called a characteristic of
the differential equation for the given Cauchy data. Notice that in the second-order
case we will be representing only the projected curves on the x, y plane, since the
visualization of the entire Cauchy problem as a curve would involve a space of 5
dimensions. If the problem happens to be linear, the projected characteristics are
independent of the initial data.
According to what we have just learned, in the linear case the characteristic curves
can be seen as solutions of the ODE

a ŷ′² − 2b x̂′ ŷ′ + c x̂′² = 0,        (6.16)

which (assuming, for example, that x̂′ ≠ 0) can be written as

a (dy/dx)² − 2b (dy/dx) + c = 0.        (6.17)

If, on the other hand, the PDE is quasi-linear, the coefficients of this equation may
be functions of u and its two (first) partial derivatives. What this means is that, in the
legitimate quasi-linear case, we are dealing with characteristics that depend on the
initial data, just as in the first-order case. At any rate, Eq. (6.17) reduces to

dy/dx = (b ± √(b² − ac)) / a.        (6.18)
We have assumed that a ≠ 0, for the sake of the argument. We see that in the case of
second order equations, in contradistinction with the first-order case, there may be
no characteristics at all! This occurs when the discriminant of the quadratic equation
happens to be negative, namely when

b² − ac < 0.        (6.19)

If this is the case, the equation is called elliptic at the point in question and for
the given initial data. At the other extreme, we have the case in which two distinct
characteristics exist. This happens when the discriminant is positive, that is,

b² − ac > 0.        (6.20)

In this case, the equation is called hyperbolic at the point in question and for the
given initial data. The intermediate case, when

b² − ac = 0,        (6.21)

is called parabolic. In this case, we have just one characteristic direction.4 The reason
for this terminology, as you may have guessed, is that quadratic forms in two variables
give rise to ellipses, hyperbolas or parabolas precisely according to the above criteria.
If the original PDE is not only linear but also with constant coefficients, then the type
(elliptic, hyperbolic or parabolic) is independent of position and of the solution. If
the equation is linear, but with variable coefficients, the type is independent of the
solution, but it may still vary from point to point. For the truly quasi-linear case, the
type may depend both on the position and on the solution. In light of the second-order
case, we can perhaps say that the single first-order PDE is automatically hyperbolic.

4 Although we have assumed that the first coefficient of the PDE does not vanish, in fact the con-
clusion that the number of characteristic directions is governed by the discriminant of the quadratic
equation is valid for any values of the coefficients, provided, of course, that not all three vanish
simultaneously.

Although this is not a precise statement, it is indeed the case that the treatment of
hyperbolic second-order PDEs is, among all three types, the one that most resembles
the first-order counterpart in terms of such important physical notions as the ability
to propagate discontinuities.
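Since the classification depends only on the sign of the discriminant b² − ac at the point in question, it can be automated. A minimal Python sketch (illustrative; the function names are ours), which also returns the characteristic slopes of Eq. (6.18) when they exist:

```python
import math

def pde_type(a, b, c):
    """Classify a u_xx + 2b u_xy + c u_yy = d at a point by the sign of the
    discriminant b**2 - a*c, following Eqs. (6.19)-(6.21)."""
    disc = b * b - a * c
    if disc > 0:
        return "hyperbolic"
    if disc < 0:
        return "elliptic"
    return "parabolic"

def characteristic_slopes(a, b, c):
    """Real roots dy/dx of a (dy/dx)**2 - 2b (dy/dx) + c = 0, Eq. (6.18).
    Assumes a != 0; returns an empty tuple in the elliptic case."""
    disc = b * b - a * c
    if disc < 0:
        return ()
    r = math.sqrt(disc)
    return ((b + r) / a, (b - r) / a)   # repeated slope when disc == 0

print(pde_type(1.0, 0.0, -1.0))               # hyperbolic (wave operator)
print(pde_type(1.0, 0.0, 1.0))                # elliptic   (Laplace operator)
print(pde_type(1.0, 0.0, 0.0))                # parabolic  (heat-like principal part)
print(characteristic_slopes(1.0, 0.0, -1.0))  # (1.0, -1.0)
```

In the truly quasi-linear case the coefficients, and hence the type returned, would also depend on the solution values supplied at the point.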

6.4 Propagation of Weak Singularities

6.4.1 Hadamard’s Lemma and Its Consequences

Hadamard’s lemma5 is a theorem of calculus that legitimizes the calculation of the


directional derivative of a function by means of the chain rule when the desired
direction lies on the boundary, rather than just in the interior, of the domain of
definition. To be more precise, let D be an open domain in Rn and let ψ be a function

ψ : D → R        (6.22)

of class C¹. Put differently, ψ is a real-valued function of n independent real variables
x_1, ..., x_n with continuous first derivatives ψ,i (i = 1, ..., n). Let us, moreover,
assume that the boundary ∂D of D is smooth and that, as we approach the boundary
along any interior path, the function ψ and each of its partial derivatives ψ,i (i =
1, ..., n) approach finite limits ψ̄ and ψ̄,i, respectively. Consider a smooth curve

x_i = x_i(s)   (i = 1, ..., n)        (6.23)

lying on the boundary ∂D. Then, Hadamard’s lemma asserts that

dψ̄/ds = Σ_{i=1}^{n} ψ̄,i dx_i/ds.        (6.24)

This result may not look spectacular, since it is what we would have done anyway,
without asking for Hadamard’s permission, but it does allow us to use the chain
rule even when the function is not defined over an open (tubular) neighbourhood
containing the curve. Why is this important at this point of our treatment? The
reason is as follows. Let us assume that we have a domain D ⊂ Rn subdivided
into two sub-domains, D+ and D− , as shown in Fig. 6.1, whose boundaries share a
common smooth part Λ, a manifold of dimension n − 1, also called a hyper-surface
or just a surface.

5 The treatment in this section draws from [5], pp. 491–529. It should be pointed out that Hadamard

proved several theorems and lemmas that carry his name. The lemma used in this section is a rather
elementary result in Calculus. Its proof was originally given by Hadamard in [3], p. 84.

[Fig. 6.1: A surface of discontinuity Λ separating the sub-domains D− and D+]

Assume that the function ψ and each of its derivatives ψ,i are continuous in the
interior of each of the respective sub-domains, but that they attain possibly different
limits, (ψ⁻, ψ,i⁻) and (ψ⁺, ψ,i⁺), as we approach Λ, according to whether we come
from paths within D− or D+ , respectively. In other words, the given function and/or
its derivatives undergo a jump upon crossing Λ. We refer to Λ as a singular surface or
a surface of discontinuity. We will use the following convenient short-hand notation
to denote the jump of a quantity such as ψ across Λ:

⟦ψ⟧ = ψ⁺ − ψ⁻.        (6.25)

According to Hadamard’s lemma, we are in a position to apply the chain rule (6.24)
independently at each of the sub-domains to calculate the derivative of ψ along a
smooth curve lying on Λ, namely,

dψ⁺/ds = Σ_{i=1}^{n} ψ,i⁺ dx_i/ds   and   dψ⁻/ds = Σ_{i=1}^{n} ψ,i⁻ dx_i/ds.        (6.26)

Subtracting the second equation from the first and using the notation (6.25) we obtain

d⟦ψ⟧/ds = Σ_{i=1}^{n} ⟦ψ,i⟧ dx_i/ds.        (6.27)

In other words, the derivative of the jump of a function in a direction tangential to the
surface of discontinuity is given by the jump of the derivative in the same direction.
Thus, the jumps of a function and of its partial derivatives cannot be entirely arbitrary,
but must be related by Eq. (6.27). It is interesting to note that in the case for which
the function ψ happens to be continuous across Λ, the jumps of its derivatives must
satisfy the condition
Σ_{i=1}^{n} ⟦ψ,i⟧ dx_i/ds = 0.        (6.28)

This condition can be stated as follows: If ψ is continuous across Λ, the jump of its
gradient is orthogonal to Λ.6
Returning to the general case, if we were to choose a local coordinate system with
all but one of its natural base vectors lying on Λ, the derivative in the direction of the
transverse coordinate would not be at all involved in condition (6.27), as one would
intuitively expect. Equation (6.27) embodies the so-called geometric compatibility
conditions. This terminology is meant to emphasize the fact that these conditions
emerge from a purely geometric analysis of the situation, without any reference to
equations of balance that may arise from the physical formulation of a particular
problem.
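Condition (6.28) can be illustrated on a small worked example (ours, with a hypothetical function, not from the text): take Λ to be the line x = 0, let ψ vanish for x < 0 and equal x + xy for x > 0, so that ψ is continuous across Λ while its first derivatives jump:

```python
import sympy as sp

x, y, s = sp.symbols('x y s', real=True)

psi_minus = sp.Integer(0)   # value of psi for x < 0
psi_plus = x + x * y        # value of psi for x > 0 (vanishes on x = 0)

# Jumps of the first derivatives evaluated on the curve (x, y) = (0, s):
jump_x = (sp.diff(psi_plus, x) - sp.diff(psi_minus, x)).subs({x: 0, y: s})
jump_y = (sp.diff(psi_plus, y) - sp.diff(psi_minus, y)).subs({x: 0, y: s})

# The tangent to the curve (0, s) is (0, 1); condition (6.28) then reads
# jump_x * 0 + jump_y * 1 = 0:
print(jump_x, jump_y)             # s + 1  0
print(jump_x * 0 + jump_y * 1)    # 0
```

The jump of the gradient, (s + 1, 0), is indeed normal to the surface x = 0, as the geometric compatibility condition demands.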
When one of the independent variables of the problem is identified with time and
the remaining ones with space, a singular surface can be regarded as the propagation
of a wave front. In the case of just two independent variables, the wave front consists
of a single point. The slope of the singular curve can, in this case, be seen as the
speed of propagation of the front. This idea can be generalized for the case of more
than two independent variables.

6.4.2 Weak Singularities

Given a PDE of order n, a singular surface is said to be weak if only the n-th
(or higher) derivatives are discontinuous across it, while the lower derivatives are
all continuous. For a second-order equation, for example, a weak singular surface
may only carry discontinuities of the derivatives of order 2 and higher. In Continuum
Mechanics applications, where the relevant equations are indeed of second order,
these singularities are known as acceleration waves. If the first derivatives are dis-
continuous, we are in the presence of strong singularities, or shocks. If the function
itself (the displacement, say) is discontinuous, we are in the presence of a dislocation.
This terminology may not apply in other contexts.
Consider the general quasi-linear second-order PDE (6.1). Taking the jump of
this equation across a weak singular curve with parametric equations

x = x̃(s),   y = ỹ(s),        (6.29)

we obtain along this curve the jump condition

a ⟦u_xx⟧ + 2b ⟦u_xy⟧ + c ⟦u_yy⟧ = 0.        (6.30)

6 In using this terminology, we are implicitly assuming that we have defined the natural dot product
in Rn. A more delicate treatment would consider the gradient not as a vector but as a differential
form which would then be annihilated by vectors forming a basis on the singular surface. We have
already discussed a similar situation in Box 3.2.

We have assumed that the coefficients of the PDE (for example, the moduli of elas-
ticity) are smooth functions throughout the domain of interest. Since we are dealing
with a weak singularity, all the first derivatives are continuous across the singular
curve, implying that Eq. (6.28) may be applied, identifying the generic function ψ
successively with each of the two first-derivatives. As a result, we obtain the two
conditions
⟦u_xx⟧ dx̃/ds + ⟦u_xy⟧ dỹ/ds = 0,        (6.31)

⟦u_yx⟧ dx̃/ds + ⟦u_yy⟧ dỹ/ds = 0.        (6.32)
Equations (6.30), (6.31) and (6.32) can be written in matrix form as
⎡ a    2b   c  ⎤ ⎧ ⟦u_xx⟧ ⎫   ⎧ 0 ⎫
⎢ x̃′   ỹ′   0  ⎥ ⎨ ⟦u_xy⟧ ⎬ = ⎨ 0 ⎬ ,        (6.33)
⎣ 0    x̃′   ỹ′ ⎦ ⎩ ⟦u_yy⟧ ⎭   ⎩ 0 ⎭

where primes indicate derivatives with respect to the curve parameter. For this homo-
geneous system of linear equations to have a non-trivial solution, its determinant must
vanish. In this way, we recover condition (6.16), implying that weak discontinuities
can only exist on characteristic curves! If one of the independent variables is time,
this conclusion can be expressed as the fact that weak signals propagate along char-
acteristic curves, and their (local) speed of propagation is measured by the slope
of these curves. Notice that, in particular, elliptic equations cannot sustain weak
discontinuities, since they have no characteristic curves.
Note that Eqs. (6.31) and (6.32) imply that the jumps of the second derivatives are
all interrelated. Denote, for example,

⟦u_xx⟧ = B.        (6.34)

Then, we obtain directly from (6.31) and (6.32)

⟦u_xy⟧ = −B dx̃/dỹ        (6.35)

and
⟦u_yy⟧ = B (dx̃/dỹ)².        (6.36)

In Continuum Mechanics applications this result implies that discontinuities in the


acceleration are accompanied by discontinuities in the strain rate of the deforming
medium. A more general version of Eqs. (6.34)–(6.36), which will be used later, is
presented in Box 6.1.

Box 6.1 Iterated compatibility conditions

Suppose that a function ψ = ψ(x_1, ..., x_n) and all its first partial derivatives
are continuous across Λ. By Eq. (6.28), at any given point on Λ

⟦ψ,ij⟧ = ⟦(ψ,i),j⟧ = a_i n_j,

where n_i are the (Cartesian) components of a vector perpendicular to the hyper-surface Λ and a_i is a coefficient of proportionality. These quantities constitute
a vector a. But we can also write

⟦ψ,ji⟧ = ⟦(ψ,j),i⟧ = a_j n_i.

By the equality of mixed partial derivatives we conclude that

a_i n_j = a_j n_i.

This equality is only possible if the vectors a and n are collinear. We conclude
that there exists a scalar μ such that a_i = μ n_i for all i = 1, ..., n. We can,
therefore, write

⟦ψ,ij⟧ = μ n_i n_j.

This elegant general result is known as the iterated geometric compatibility


condition. It can be generalized for higher derivatives.

6.4.3 Growth and Decay

It is, in fact, possible to squeeze out more information. For simplicity, we will confine
attention to the homogeneous linear case so that, in particular, the coefficients a, b, c
are independent of the unknown function and its derivatives, while the right-hand side
vanishes.7 Assume that the singularity curve is nowhere tangential to the x direction.
Differentiating the PDE with respect to x and then taking jumps we obtain

a ⟦u_xxx⟧ + 2b ⟦u_xxy⟧ + c ⟦u_xyy⟧ + a_x ⟦u_xx⟧ + 2b_x ⟦u_xy⟧ + c_x ⟦u_yy⟧ = 0.        (6.37)

The jumps of the third derivatives, however, are not independent of those of the second
derivatives, as we know from the geometric compatibility conditions (6.27). Indeed,

7 For the treatment of the quasi-linear case, see Box 6.2.



identifying the generic function ψ successively with each of the second derivatives,
we can write
d⟦u_xx⟧/ds = ⟦u_xxx⟧ x̃′ + ⟦u_xxy⟧ ỹ′,        (6.38)

d⟦u_xy⟧/ds = ⟦u_xyx⟧ x̃′ + ⟦u_xyy⟧ ỹ′,        (6.39)

d⟦u_yy⟧/ds = ⟦u_yyx⟧ x̃′ + ⟦u_yyy⟧ ỹ′.        (6.40)

Multiplying Eqs. (6.38) and (6.39), respectively, by a ỹ′ and c x̃′, and then adding
the results we obtain

a ỹ′ d⟦u_xx⟧/ds + c x̃′ d⟦u_xy⟧/ds = (a ⟦u_xxx⟧ + c ⟦u_xyy⟧) x̃′ ỹ′ + ⟦u_xxy⟧ (a ỹ′² + c x̃′²).        (6.41)

Since the curve in question is characteristic, we can apply Eq. (6.16) to the last term
of Eq. (6.41), thus yielding

a ỹ′ d⟦u_xx⟧/ds + c x̃′ d⟦u_xy⟧/ds = (a ⟦u_xxx⟧ + 2b ⟦u_xxy⟧ + c ⟦u_xyy⟧) x̃′ ỹ′.        (6.42)
Introducing this result into Eq. (6.37), we obtain

a ỹ′ d⟦u_xx⟧/ds + c x̃′ d⟦u_xy⟧/ds + (a_x ⟦u_xx⟧ + 2b_x ⟦u_xy⟧ + c_x ⟦u_yy⟧) x̃′ ỹ′ = 0.        (6.43)
We have succeeded in obtaining an equation relating exclusively the jumps of the
second derivatives and their derivatives with respect to the curve parameter. By virtue
of Eqs. (6.34), (6.35) and (6.36), we can write

a ỹ′ dB/ds − c x̃′ d(B x̃′/ỹ′)/ds + (a_x − 2b_x (x̃′/ỹ′) + c_x (x̃′/ỹ′)²) B x̃′ ỹ′ = 0.        (6.44)

This is a first-order ODE for the evolution of the magnitude of the jump. It is some-
times called the transport equation or the decay-induction equation.
If the characteristic curve is parametrized by y, which can always be done locally
around a point at which ỹ′ ≠ 0, the transport equation can be written more compactly
as
a dB/dy − c λ d(B λ)/dy + (a_x − 2b_x λ + c_x λ²) λ B = 0,        (6.45)

where λ denotes the characteristic slope x̃′/ỹ′. Given that, except for B = B(y), all
the quantities involved are known from the solution of the characteristic equation,
the integration of the transport equation is elementary.
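As a numerical illustration (ours, not the author's), the transport equation in the parametrization s = y can be handed to a standard ODE solver once the derivative d(Bλ)/dy is expanded so as to isolate B′. For the wave operator u_xx − u_yy = 0 the coefficients are constant, λ = ±1, and the right-hand side vanishes, so the jump amplitude is preserved along the characteristic:

```python
import numpy as np
from scipy.integrate import solve_ivp

def transport_rhs(y, B, a, c, lam, dlam, ax, bx, cx):
    """Transport equation with s = y, after expanding d(B*lam)/dy:
         B' (a - c lam**2) = c lam lam' B - (a_x - 2 b_x lam + c_x lam**2) lam B.
    lam is the characteristic slope dx/dy and dlam its y-derivative."""
    num = c * lam * dlam * B - (ax - 2.0 * bx * lam + cx * lam ** 2) * lam * B
    return num / (a - c * lam ** 2)

# Wave operator u_xx - u_yy = 0: a = 1, b = 0, c = -1, lam = 1 constant,
# and all x-derivatives of the coefficients vanish.
sol = solve_ivp(transport_rhs, (0.0, 5.0), [1.0],
                args=(1.0, -1.0, 1.0, 0.0, 0.0, 0.0, 0.0),
                rtol=1e-10, atol=1e-12)
print(sol.y[0][-1])   # stays at 1: no growth or decay of the discontinuity
```

With variable coefficients one would supply λ(y) and its derivative from the previously computed characteristic, and the same routine would deliver the decay or induction of the amplitude.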

Box 6.2 The transport equation for the quasi-linear case


If the PDE is quasi-linear, at least one of the coefficients a, b, c depends on at
least one of the first derivatives of u. For the sake of the argument, suppose that
the first term of the equation is of the form a(x, y, u, u_x, u_y) u_xx. As we reach
the step in the derivation of the transport equation corresponding to Eq. (6.37),
we need to calculate the derivative of this term with respect to x and then take
the jump of the result, which leads to

⟦(a u_xx)_x⟧ = ⟦(a_x + a_u u_x + a_{u_x} u_xx + a_{u_y} u_yx) u_xx + a u_xxx⟧
             = (a_x + a_u u_x) ⟦u_xx⟧ + a_{u_x} ⟦(u_xx)²⟧ + a_{u_y} ⟦u_yx u_xx⟧ + a ⟦u_xxx⟧.

Thus we encounter the jump of products of two quantities, say p and q. A


straightforward calculation delivers

⟦pq⟧ = −⟦p⟧⟦q⟧ + p⁺⟦q⟧ + q⁺⟦p⟧.

There are two immediate consequences of this formula. The first is that the
transport equation becomes a non-linear ODE. The second consequence, bear-
ing an important physical repercussion, is that, as the wave front advances and
encounters preexisting values u_xx⁺, u_xy⁺, u_yy⁺ ahead of the wave, these values
affect the decay or growth of the amplitude of the propagating discontinuity.
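The jump-of-a-product formula quoted in the box can be verified symbolically by treating the one-sided limits as independent symbols (a quick check of ours, not part of the text):

```python
import sympy as sp

p_plus, p_minus, q_plus, q_minus = sp.symbols('p^+ p^- q^+ q^-')

def jump(plus, minus):
    """Jump of a quantity across the singular surface, Eq. (6.25)."""
    return plus - minus

lhs = jump(p_plus * q_plus, p_minus * q_minus)          # [[pq]]
rhs = (-jump(p_plus, p_minus) * jump(q_plus, q_minus)   # -[[p]][[q]]
       + p_plus * jump(q_plus, q_minus)                 # + p+ [[q]]
       + q_plus * jump(p_plus, p_minus))                # + q+ [[p]]
print(sp.expand(lhs - rhs))   # 0
```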

6.5 Normal Forms

It is a useful exercise, that we could have carried out also for first-order equations, to
ask ourselves how the form of a PDE is affected by an arbitrary change of coordinates.
A new system of coordinates in the plane is specified by means of two smooth
functions of two new variables ξ and η, namely,

x = x(ξ, η),   y = y(ξ, η).        (6.46)

For this coordinate change to be legitimate within a certain region (a coordinate


patch), we must also require that within this region the Jacobian determinant
 
J = ∂(x, y)/∂(ξ, η) = | x_ξ   x_η |
                      | y_ξ   y_η |        (6.47)

must not vanish at any point. Only when this is the case, which we assume from
here on, can Eq. (6.46) be inverted to yield ξ, η as (smooth) functions of x, y. The
function u = u(x, y) can be expressed in terms of the new variables by composition
of functions, namely,

u = u(x, y) = u(x(ξ, η), y(ξ, η)) = û(ξ, η). (6.48)

We are trying to distinguish, by means of a hat over the symbol, between the function
as an operator and the result of applying this operator to the arguments. When there
is no room for confusion, however, this practice can be abandoned and let the context
indicate which is the function being considered. By a direct iterated use of the chain
rule of differentiation, we obtain the expressions

u_x  = û_ξ ξ_x + û_η η_x ,
u_y  = û_ξ ξ_y + û_η η_y ,
u_xx = û_ξξ ξ_x² + 2û_ξη ξ_x η_x + û_ηη η_x² + û_ξ ξ_xx + û_η η_xx ,        (6.49)
u_xy = û_ξξ ξ_x ξ_y + û_ξη (ξ_x η_y + ξ_y η_x) + û_ηη η_x η_y + û_ξ ξ_xy + û_η η_xy ,
u_yy = û_ξξ ξ_y² + 2û_ξη ξ_y η_y + û_ηη η_y² + û_ξ ξ_yy + û_η η_yy .

In the new coordinate system, therefore, the original PDE (6.1) can be written as

â û_ξξ + 2b̂ û_ξη + ĉ û_ηη = d̂.        (6.50)

In this expression, we assume that the arguments of the coefficients have been
expressed in terms of the new variables. The new coefficients of the second-order
terms are given by the quadratic expressions

â = a ξ_x² + 2b ξ_x ξ_y + c ξ_y²,
b̂ = a ξ_x η_x + b (ξ_x η_y + ξ_y η_x) + c ξ_y η_y,        (6.51)
ĉ = a η_x² + 2b η_x η_y + c η_y².

The second-order part of the original PDE can be regarded as a quadratic form
governed by the matrix  
A = ⎡ a   b ⎤
    ⎣ b   c ⎦ .        (6.52)

It is not difficult to verify that the counterpart for the transformed equation is governed
by the matrix  
Â = ⎡ â   b̂ ⎤ = J⁻¹ A J⁻ᵀ ,        (6.53)
    ⎣ b̂   ĉ ⎦

where J stands now for the Jacobian matrix. Notice that the determinants of both
matrices, A and Â, will always have the same sign or vanish simultaneously. Each

of these determinants is precisely the discriminant of the quadratic equation that we


used to distinguish between the three types of equations. As expected, therefore, if a
second-order quasi-linear PDE is hyperbolic (parabolic, elliptic) in one coordinate
system, it will remain hyperbolic (parabolic, elliptic) in any other. Thus, the equation
type is an invariant quality and carries a definite physical meaning. An argument
based, among other considerations, on the algebra of symmetric real matrices, can
be used to show8 that, with a suitable change of coordinates, a quasi-linear hyperbolic
equation can always be expressed, in some coordinate chart, as

û_ξη + ... = 0,        (6.54)

where we have indicated only the principal (i.e., second-order) part. An alternative
coordinate transformation can be found that brings the hyperbolic equation to the
form
û_ξξ − û_ηη + ... = 0.        (6.55)

These are the so-called normal forms of an equation of the hyperbolic type.
For a parabolic equation, the normal form is

û_ξξ + ... = 0.        (6.56)

Finally, for the elliptic equation we have the normal form

û_ξξ + û_ηη + ... = 0.        (6.57)

The existence of these normal forms prompts us to study, in separate chapters,


three paradigmatic equations, one of each type, corresponding to the simplest possible
forms of the omitted terms. These are, respectively, the wave equation and the heat
equation in one spatial dimension and the Laplace equation in two spatial dimensions.
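The invariance of the type asserted above can be confirmed symbolically from the transformation rules (6.51): the discriminant b̂² − âĉ equals the old discriminant multiplied by the square of the Jacobian determinant of (ξ, η) with respect to (x, y), so its sign cannot change. A sympy check (illustrative, not from the text):

```python
import sympy as sp

a, b, c = sp.symbols('a b c')
xix, xiy, etax, etay = sp.symbols('xi_x xi_y eta_x eta_y')

# New coefficients under the change of coordinates, Eq. (6.51):
ah = a * xix**2 + 2 * b * xix * xiy + c * xiy**2
bh = a * xix * etax + b * (xix * etay + xiy * etax) + c * xiy * etay
ch = a * etax**2 + 2 * b * etax * etay + c * etay**2

detK = xix * etay - xiy * etax   # Jacobian determinant of (xi, eta) in (x, y)

# The discriminant transforms with the square of the Jacobian determinant:
print(sp.expand(bh**2 - ah * ch - detK**2 * (b**2 - a * c)))   # 0
```

Since detK² > 0 on a legitimate coordinate patch, a hyperbolic (parabolic, elliptic) equation stays hyperbolic (parabolic, elliptic) in the new coordinates.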

Exercises

Exercise 6.1 Show that option 2 after Eq. (6.7), if applied to every point along the
initial curve, corresponds exactly to the fact that the space curve represented by
Eqs. (6.3) and (6.4) is a characteristic curve of the PDE (6.2).

Exercise 6.2 Show that the first-order quasi-linear PDE (6.2) can be regarded as the
specification of the directional derivative of the unknown function in the characteristic
direction.

Exercise 6.3 For each of the following second-order PDEs determine the type
(elliptic, parabolic, hyperbolic) and, if necessary, the regions over which each type
applies. Obtain and draw the characteristic curves wherever they exist.

8 The proof is not as straightforward as it may appear from the casual reading of some texts. A good

treatment of these normal or canonical forms can be found in [2].



2.5 u_xx + 5 u_xy + 1.5 u_yy + 5u = 0.

2 u_xx + 4 u_xy + 2 u_yy + 3(x² + y²) u_x = e^y.

u_xx − 2 u_xy + 2 u_yy + 4 u_x u_y = 0.

u_xx + x u_yy = 0.   (Tricomi equation)

Exercise 6.4 For a solid body without cracks in the small-deformation regime, find
which components of the strain tensor may be discontinuous across some plane.
[Hint: the displacement field is continuous].

Exercise 6.5 Generalize the transport equation (6.44) to the case of the non-
homogeneous linear equation (6.1), where

d = e(x, y) u_x + f(x, y) u_y + g(x, y) u.

Apply your result to obtain and solve the transport equation for the modified one-
dimensional wave equation when a linear viscous term has been added (namely, a
term proportional to the velocity). Are the (projected) characteristics affected by this
addition?

References

1. Courant R, Hilbert D (1962) Methods of mathematical physics, vol 2. Interscience, Wiley, New
York
2. Garabedian PR (1964) Partial differential equations. Wiley, New York
3. Hadamard J (1903) Leçons sur la Propagation des Ondes et les Équations de l’Hydrodynamique.
Hermann, Paris. www.archive.org
4. John F (1982) Partial differential equations. Springer, Berlin
5. Truesdell C, Toupin RA (1960) The classical field theories. In: Flügge S (ed), Handbuch der
Physik. Springer, Berlin
Chapter 7
Systems of Equations

Quasi-linear equations of order higher than 1 and systems of quasi-linear equations


of any order can be subjected to a classification akin to that of the single second-order
equation. Between the two extremes of totally hyperbolic and elliptic types a larger
variety of intermediate types can be found. The analysis of totally hyperbolic equa-
tions and systems is particularly fruitful as it leads to the concepts of characteristic
manifolds and bi-characteristic lines or rays. Physically, the former represent wave
fronts propagating through space and the latter are the lines along which signals
propagate. As in the case of a single second-order hyperbolic PDE, it is possible to
predict the trajectories and variation of amplitude of these signals without necessarily
solving the original equations themselves.

7.1 Systems of First-Order Equations

7.1.1 Characteristic Directions

A system of quasi-linear first-order PDEs for n functions u_1, ..., u_n of two independent variables x and y can be written as

⎡ a_11 · · · a_1n ⎤ ⎧ u_1 ⎫      ⎡ b_11 · · · b_1n ⎤ ⎧ u_1 ⎫      ⎧ c_1 ⎫
⎢  ·   · · ·  ·   ⎥ ⎨  ·  ⎬    + ⎢  ·   · · ·  ·   ⎥ ⎨  ·  ⎬    = ⎨  ·  ⎬ ,        (7.1)
⎣ a_n1 · · · a_nn ⎦ ⎩ u_n ⎭,x    ⎣ b_n1 · · · b_nn ⎦ ⎩ u_n ⎭,y    ⎩ c_n ⎭

where a_ij, b_ij, c_i are differentiable functions of x, y and u. We will also use the block
matrix notation

© Springer International Publishing AG 2017
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_7

Au,x + Bu,y = c. (7.2)

As before, we attempt to find whether or not, given the values u = û(r) of the vector u on a curve x = x̂(r), y = ŷ(r), we can calculate the derivatives u,x and u,y throughout the curve. In complete analogy with Eq. (6.5), we can write

dû/dr = u,x dx̂/dr + u,y dŷ/dr.        (7.3)
This vector equation represents, in fact, n scalar equations. Combining this informa-
tion with that provided by the system of PDEs itself, we obtain at each point of the
curve a system of 2n algebraic equations, namely,
⎡ dx̂/dr    0    ⋯    0     dŷ/dr    0    ⋯    0    ⎤ ⎧ u1,x ⎫   ⎧ dû1/dr ⎫
⎢   0    dx̂/dr  ⋯    0       0    dŷ/dr  ⋯    0    ⎥ ⎪ u2,x ⎪   ⎪ dû2/dr ⎪
⎢   ⋮      ⋮    ⋱    ⋮       ⋮      ⋮    ⋱    ⋮    ⎥ ⎪  ⋮   ⎪   ⎪   ⋮    ⎪
⎢   0      0    ⋯  dx̂/dr     0      0    ⋯  dŷ/dr  ⎥ ⎪ un,x ⎪   ⎪ dûn/dr ⎪
⎢  a11    a12   ⋯   a1n     b11    b12   ⋯   b1n   ⎥ ⎨ u1,y ⎬ = ⎨   c1   ⎬ ,        (7.4)
⎢  a21    a22   ⋯   a2n     b21    b22   ⋯   b2n   ⎥ ⎪ u2,y ⎪   ⎪   c2   ⎪
⎢   ⋮      ⋮    ⋱    ⋮       ⋮      ⋮    ⋱    ⋮    ⎥ ⎪  ⋮   ⎪   ⎪   ⋮    ⎪
⎣  an1    an2   ⋯   ann     bn1    bn2   ⋯   bnn   ⎦ ⎩ un,y ⎭   ⎩   cn   ⎭

which can be written more compactly in terms of partitioned matrices as

⎡ (dx̂/dr) I   (dŷ/dr) I ⎤ ⎧ u,x ⎫     ⎧ dû/dr ⎫
⎣     A            B     ⎦ ⎨ u,y ⎬  =  ⎨   c   ⎬ ,        (7.5)

where I is the unit matrix of order n. If the determinant of this system of 2n linear
equations is not zero, we obtain (point-wise along the curve) a unique solution for
the local values of the partial derivatives. If the determinant vanishes, we obtain a
(projected) characteristic direction. According to a result from linear algebra [2], the
determinant of a partitioned matrix of the form
       ⎡ M  N ⎤
R  =   ⎣ P  Q ⎦ ,        (7.6)

where M and Q are square sub-matrices, is obtained as

det R = det Q det( M − N Q⁻¹ P ),        (7.7)

assuming that Q is non-singular. Accordingly, if we assume that B in (7.5) is non-singular, we obtain the following condition for a direction to be characteristic

det( (dx̂/dr) I − (dŷ/dr) B⁻¹ A ) = 0,        (7.8)

which is the same as the generalized eigenvalue problem for the characteristic slopes dx/dy, that is,

det( A − (dx/dy) B ) = 0.        (7.9)

The same result is obtained if we assume that A, instead of B, is non-singular. If the eigenvalues λ = dx/dy of the matrix A weighted by B, as indicated in Eq. (7.9), are all real and distinct, the system is called totally hyperbolic. At the other extreme, if n is even and all the eigenvalues are complex, the system is totally elliptic. Intermediate cases are also possible. Notice that for quasi-linear systems the type depends also on the solution.
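To make the classification concrete, here is a small numerical sketch. The system and its coefficients are illustrative only (they are not taken from the text): the one-dimensional wave equation u_tt = u_xx, rewritten as a first-order system for the pair (u_t, u_x), produces two real and distinct characteristic slopes and is therefore totally hyperbolic.

```python
import numpy as np

# Illustrative only: the wave equation u_tt = u_xx written as the first-order
# system A u_,x + B u_,y = 0 for u = (u_t, u_x), with y the time-like variable.
A = np.array([[0.0, -1.0],
              [-1.0, 0.0]])
B = np.eye(2)

# Characteristic slopes dx/dy solve det(A - (dx/dy) B) = 0; with B invertible
# this reduces to the ordinary eigenvalue problem for B^{-1} A.
slopes = np.linalg.eigvals(np.linalg.solve(B, A))

real = np.allclose(np.imag(slopes), 0.0)
distinct = np.unique(np.round(np.real(slopes), 10)).size == slopes.size
print("slopes:", np.sort(np.real(slopes)))   # [-1.  1.]
print("totally hyperbolic:", real and distinct)
```

The same three-line test (real? distinct?) applies verbatim to any pair of coefficient matrices evaluated at a point, which for quasi-linear systems must include the local values of the solution.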

7.1.2 Weak Singularities

Assuming the solution u to be continuous, we investigate the possibility of propagation of weak signals, that is, those carrying a discontinuity in the first or higher derivatives. Taking jumps of the system (7.2) yields

A ⟦u,x⟧ + B ⟦u,y⟧ = 0.        (7.10)

On the other hand, by virtue of the geometric compatibility conditions (6.27), we obtain that, in the direction tangential to a possible line of discontinuity x = x̃(s), y = ỹ(s),

(dx̃/ds) ⟦u,x⟧ + (dỹ/ds) ⟦u,y⟧ = 0.        (7.11)

Combining these results we conclude that a weak discontinuity can only occur if

( A − (dx/dy) B ) ⟦u,x⟧ = 0.        (7.12)

In this way, we arrive at the same conclusion as before, namely, that weak discontinuities of the solution can only exist across characteristic curves. Totally elliptic systems cannot sustain any discontinuities.

7.1.3 Strong Singularities in Linear Systems

A strong singularity of a solution of a PDE is a discontinuity of order smaller than the


order of the PDE. In a system of first-order PDEs, therefore, a strong singularity is a
discontinuity of the solution itself. We have already encountered such discontinuities
in the formation of shocks out of perfectly smooth initial conditions. These shocks,
however, do not propagate along characteristics. On the other hand, for strictly lin-
ear equations, where the projected characteristics are independent of the solution,
discontinuities can only arise as a consequence of the specification of discontinuous
initial data, as we have already learned in Example 3.2. Strong discontinuities in
linear hyperbolic systems propagate along characteristics.
Consider the linear system

Au,x + Bu,y = Cu + d, (7.13)

where the n × n matrices A, B, C and the vector d are functions of x and y alone.
Assume that the eigenvalues λ1 , . . . , λn obtained from Eq. (7.9) are all real and
distinct.1 Let M be the modal matrix, whose columns are (linearly independent)
eigenvectors corresponding to these eigenvalues. Then

AM = BMΛ,        (7.14)

where Λ is the diagonal matrix of eigenvalues. Defining a new vector v of dependent


variables by
u = Mv, (7.15)

the system (7.13) can be written in canonical form as

Λ v,x + v,y = Ĉv + d̂,        (7.16)

where the new matrix Ĉ and the new vector d̂ are still functions of x and y alone.
What we have achieved, at almost no cost, is a decoupling of the main part of each
equation of the system in the sense that each equation contains a derivation in only
one of the characteristic directions, as follows directly from the fact that λi = d x/dy
is the local slope of the i-th characteristic line, as suggested in Fig. 7.1. Thus, the
dependent variable vi can attain different values vi− , vi+ on either side of the i-th
characteristic line without violating the i-th differential equation. Moreover, it is
also possible to obtain the variation of the amplitude of the jump vi  along the
characteristic line.

1 There may be multiple eigenvalues, as long as the dimension of the corresponding eigenspaces is
equal to the multiplicity.
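A minimal numerical sketch of this diagonalization, with made-up constant coefficients chosen purely for illustration, confirms the defining relation (7.14) and exhibits the decoupled characteristic slopes:

```python
import numpy as np

# Hypothetical constant-coefficient system A u_,x + B u_,y = 0, used only to
# illustrate the construction of the modal matrix M of Eq. (7.14).
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
B = np.eye(2)

# Generalized eigenproblem A m = lam B m; columns of M are eigenvectors.
lam, M = np.linalg.eig(np.linalg.solve(B, A))
Lam = np.diag(lam)

# Defining relation of the modal matrix: A M = B M Lam.
assert np.allclose(A @ M, B @ M @ Lam)

# In the characteristic variables v (u = M v), each equation of the system
# involves differentiation along a single slope, one diagonal entry of Lam.
print(np.sort(lam))   # [2. 4.]
```

Since the coefficients are constant here, M is constant as well; in the general linear case M varies with x and y and the extra terms are absorbed into the right-hand side Ĉ and d̂.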

[Fig. 7.1 Strong discontinuities in linear systems: the values vi−, vi+ of a decoupled variable on either side of the i-th characteristic line]

7.1.4 An Application to the Theory of Beams

7.1.4.1 The Bernoulli–Euler Model

The classical theory of beams, named after Daniel Bernoulli (1700–1782) and Leon-
hard Euler (1707–1783), is one of the cornerstones of structural engineering. It is
based on a number of simplifying assumptions, some of which are the following: (a) the beam is symmetric with respect to a plane; (b) the beam axis is straight; (c) the supports and the loading are symmetrical about the plane of symmetry of the beam; (d) the loading is transversal, that is, perpendicular to the axis; (e) the deflections of the axis are transversal and very small when compared with the beam dimensions; (f) the material abides by Hooke's law of linear elasticity; (g) the plane normal cross sections of the beam remain plane and perpendicular to the deformed axis; (h) the rotary inertia of the cross sections is neglected in the dynamic equations. The last two assumptions are crucial to the simplicity and usefulness of the theory while also embodying some of its limitations.
On the basis of these assumptions, the equations of classical beam theory are not
difficult to derive. Introducing x, y Cartesian coordinates in the plane of symmetry
and aligning the x axis with the beam axis, as shown in Fig. 7.2, we denote by
q = q(x) the transverse load per unit length (positive downwards) and by w =
w(x, t) the transverse deflection (positive upwards). The time coordinate is denoted
by t. Assuming for specificity a constant cross section of area A and centroidal moment of inertia I, the governing equations can be written as

Vx = −q − ρA wtt
Mx = V
EI θx = M        (7.17)
wx = θ
In these equations ρ denotes the constant density, E is the modulus of elasticity, θ = θ(x, t) is the rotation of the cross section, and V = V(x, t) and M = M(x, t) are, respectively, the internal shear force and bending moment resulting from the axial and transversal stresses in the beam cross section. Notice that the assumption

[Fig. 7.2 Beam theory derivation: a beam element of length dx with load q, shear forces V and V + dV, bending moments M and M + dM, axis slope wx and cross-section rotation θ]

of perpendicularity between the cross sections and the deformed axis implies the
vanishing of the corresponding shear strains, but not of the shear stresses, an internal
contradiction of the theory. The first two equations are the result of enforcing vertical
and rotational dynamic equilibrium, while the third equation arises from Hooke’s law.
The fourth equation establishes the perpendicularity condition by equating the slope
of the axis to the rotation of the cross section.
Introducing the linear and angular velocities v = wt and ω = θt, respectively, the system (7.17) can be rewritten as the first-order system

Vx = −q − ρA vt
Mx = V
EI ωx = Mt        (7.18)
vx = ω

This system can be written as

Au,t + Iu,x = c, (7.19)

where
       ⎡ 0     0     0  ρA ⎤        ⎧ V ⎫        ⎧ −q ⎫
A  =   ⎢ 0     0     0   0 ⎥ ,  u = ⎨ M ⎬ ,  c = ⎨  V ⎬ .        (7.20)
       ⎢ 0  −1/(EI)  0   0 ⎥        ⎪ ω ⎪        ⎪  0 ⎪
       ⎣ 0     0     0   0 ⎦        ⎩ v ⎭        ⎩  ω ⎭

Solving the eigenvalue problem

det( A − (dt/dx) I ) = 0,        (7.21)

we obtain the trivial condition

(dt/dx)⁴ = 0.        (7.22)

This means that any disturbance would travel at infinite speed.
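This degeneracy can be checked directly: the matrix A of Eq. (7.20) is nilpotent, so its characteristic polynomial collapses to a pure power and every characteristic slope dt/dx vanishes. A quick numerical confirmation (the values of ρA and 1/(EI) are placeholders chosen only for illustration):

```python
import numpy as np

# Bernoulli-Euler coefficient matrix of Eq. (7.20); rhoA and inv_EI are
# hypothetical values, used only to illustrate the structure of A.
rhoA, inv_EI = 1.0, 2.0
A = np.array([[0.0,  0.0,    0.0, rhoA],
              [0.0,  0.0,    0.0, 0.0 ],
              [0.0, -inv_EI, 0.0, 0.0 ],
              [0.0,  0.0,    0.0, 0.0 ]])

assert np.allclose(A @ A, 0.0)   # A is nilpotent: A @ A = 0
coeffs = np.poly(A)              # coefficients of the characteristic polynomial
print(coeffs)                    # [1. 0. 0. 0. 0.], i.e. det(s I - A) = s^4
```

Any choice of positive section properties gives the same conclusion, since the zero pattern of A, not the numerical values, forces A² = 0.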



7.1.4.2 The Timoshenko Beam

To remedy the deficiencies of the classical theory, Stephen Timoshenko (1887–1972)


proposed in the early 1920s a modified theory that bears his name. He made just two
modifications to the theory, as follows: (a) plane normal cross sections remain plane
but do not necessarily remain perpendicular to the deformed axis; (b) the rotational
inertia of the cross sections is included in the dynamic equilibrium equations. As
a result, the rigid (positive counterclockwise) rotation θ of the cross sections is no
longer equal to the slope of the axis and a shear strain equal to their difference is
related, via Hooke’s shear elastic modulus G, to the shear force. Since the shear strain
is constant at each section (rather than quadratic) an equivalent “shear area” As is
introduced to compensate for this deliberate error in the theory. The equations of
the Timoshenko beam theory result in the system (7.19) but with the new coefficient
matrix

       ⎡    0        0      0   ρA ⎤
A  =   ⎢    0        0    −ρI    0 ⎥ .        (7.23)
       ⎢    0    −1/(EI)    0    0 ⎥
       ⎣ 1/(GAs)    0       0    0 ⎦

The eigenvalue problem yields now four distinct roots

(dt/dx)₁,₂ = ±√(ρ/E),        (dt/dx)₃,₄ = ±√(ρA/(GAs)).        (7.24)

The first two roots are the inverse of the speeds of propagation of the ‘bending waves’,
while the last two roots are the inverse of the speeds of propagation of the ‘shear
waves’. The importance of these finite speeds in aircraft design was first recognized
in [5], where a clear exposition of the evolution of strong discontinuities in beams
is presented from basic principles and extended to the case of multiple eigenvalues.
Notice how, rather than solving specific numerical problems, a careful analysis of the
structure of the equations has allowed us to arrive at significant conclusions about
the phenomena at hand with relatively elementary computations.
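As a numerical illustration, with hypothetical steel-like section properties (the numbers below are assumptions, not data from the text), the eigenvalues of the matrix in Eq. (7.23) reproduce the two finite propagation speeds of Eq. (7.24):

```python
import numpy as np

# Hypothetical steel-like data (SI units), for illustration only.
E, G, rho = 200e9, 80e9, 7850.0            # moduli (Pa) and density (kg/m^3)
A_sec, I_sec, A_shear = 1e-2, 1e-5, 5e-3   # area, inertia, shear area

# Coefficient matrix of Eq. (7.23); its eigenvalues are the slopes dt/dx.
A = np.array([[0.0,                 0.0,          0.0, rho * A_sec],
              [0.0,                 0.0, -rho * I_sec,         0.0],
              [0.0, -1.0 / (E * I_sec),          0.0,         0.0],
              [1.0 / (G * A_shear), 0.0,          0.0,         0.0]])

# Invert the slopes dt/dx to obtain the wave speeds dx/dt.
speeds = np.sort(1.0 / np.abs(np.real(np.linalg.eigvals(A))))

c_shear = np.sqrt(G * A_shear / (rho * A_sec))   # 'shear' speed, ~2257 m/s
c_bend = np.sqrt(E / rho)                        # 'bending' speed, ~5048 m/s
assert np.allclose(speeds, [c_shear, c_shear, c_bend, c_bend])
print(c_shear, c_bend)
```

The computation also makes visible the pairwise structure of the matrix: the (V, v) and (M, ω) variables couple in separate 2 × 2 blocks, one per wave family.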

7.1.5 Systems with Several Independent Variables

In the case of n quasi-linear first-order equations for n functions u_i = u_i(x₁, …, x_K), with i = 1, …, n, of K independent variables, we write the system in matrix notation as

A₁ u,x₁ + A₂ u,x₂ + ⋯ + A_K u,x_K = c,        (7.25)

or, more compactly, using the summation convention of Box 2.2 for repeated capital
indices in the range 1, . . . , K ,

A I u,x I = c. (7.26)

The various matrices A I and the vector c are assumed to be differentiable functions
of the arguments x1 , . . . , x K , u 1 , . . . , u n . We want to search for characteristic mani-
folds. As usual, these are special (K − 1)-dimensional hyper-surfaces in the space of
independent variables. On these characteristic manifolds, the specification of initial
data is not sufficient to guarantee the existence of a unique solution in a neighbour-
hood of the manifold. Alternatively, these manifolds can be regarded as carriers of
weak singularities, that is, discontinuities in the first (or higher) derivatives of the
solutions. In Sect. 6.4.1 we obtained an important result to the effect that if a function
is continuous across a singular surface the jump of its gradient is necessarily collinear
with the normal n (in the usual Cartesian metric) to the surface. Thus, we may write

⟦u,x_I⟧ = a n_I ,        I = 1, …, K,        (7.27)

where a is a vector of n entries. Taking the jump of Eq. (7.26) and invoking the last result yields

(n_I A_I) a = 0.        (7.28)

For the vector a, containing the intensity of the jumps, to be non-zero the determinant
of the coefficient matrix must vanish, that is,

det(n I A I ) = 0. (7.29)

This is the equation defining the possible local normals to characteristic manifolds.
We ask ourselves: what kind of equation is this? It is obviously a homogeneous
polynomial of degree n in the K components n I . It may so happen that no vector n
(that is, no direction) satisfies this equation, in which case the system is said to be
totally elliptic at the point in question. At the other extreme, if fixing K − 1 entries
of n the polynomial in the remaining component has n distinct real roots, we have a
case of a totally hyperbolic system at the point. Let a putative singular hyper-surface
be given by an equation such as

φ(x1 , . . . , x K ) = 0. (7.30)

Then the vector n, not necessarily of unit length, can be identified with the gradient
of φ, namely,
n I = φ,x I (7.31)

so that we get
det(A I φ,x I ) = 0. (7.32)

This is a highly non-linear single first-order PDE for the function φ. We call it
the characteristic equation or the generalized eikonal equation associated with the

system. A characteristic manifold (hyper-surface) is the level set φ = 0 of a solution φ of this eikonal equation. An interesting idea consists of investigating the characteristics of the eikonal equation. They are called the bi-characteristics of the original system. These lines play a fundamental role in the theory of wave propagation and rays. In many applications, it is convenient to single out one of the independent variables as the time coordinate, say x_K = t. The physical meaning of a singular surface φ(x₁, …, x_{K−1}, t) = 0 in space-time is a propagating wave front. If we write the equation of the singular surface as
the equation of the singular surface as

ψ(x1 , . . . , x K −1 ) − t = 0, (7.33)

the spatial wave front at any instant of time t0 is the level curve ψ = t0 , as suggested
in Fig. 7.3 for the case K = 3. This observation leads to a somewhat friendlier
formulation of the characteristic equation, as described in Box 7.1.

Box 7.1 A friendlier formulation of the characteristic equation


We already had an opportunity to remark that the solutions φ of Eq. (7.32)
do not directly represent a characteristic hyper-surface. Indeed, we need to
impose the additional constraint φ = 0. This feature can be circumvented by
assuming, as is often the case in applications, that a time variable t = x K can
be singled out and that the derivative of φ with respect to t does not vanish.
Introducing the new representation φ(x₁, …, x_{K−1}, t) = ψ(x₁, …, x_{K−1}) − t, the characteristic equation (7.32) becomes

det( A_α ψ,x_α − A_K ) = 0,

where the summation convention for Greek indices is restricted to the range
1, . . . , K − 1. Every solution of this first-order non-linear PDE represents a
characteristic manifold. In terms of the normal speed of a moving surface,
introduced in Box 7.2, this equation can be written point-wise as the algebraic
condition

det( A_α m_α − V A_K ) = 0,

in which m_α are the components of the unit normal to the spatial wave front. In terms of classification, the system is totally elliptic if there are no real eigenvalues V. It is totally hyperbolic if it has n distinct real eigenvalues. Repeated eigenvalues can also be included in the definition, since they appear in applications.

[Fig. 7.3 Spatial wave fronts as level sets of a spatiotemporal wave front in (x₁, x₂, t)-space]

7.2 Systems of Second-Order Equations

7.2.1 Characteristic Manifolds

Although it is possible to express any higher-order equation or system thereof by


means of a system of first-order equations, there is often a loss of physical meaning
in the process of introducing artificial intermediate variables and equations. For this
reason, it is convenient in many cases to leave the equations in their original form,
such as obtained in applications in Solid and Fluid Mechanics and many other fields
in Engineering and Physics. A system of n second-order quasi-linear equations for
n functions of K variables can be written compactly as

A_IJ u,x_I x_J = c,        (7.34)

where the matrices A_IJ and the vector c are assumed to be differentiable functions of the independent variables x_I and possibly also of the unknown functions u_i and their first derivatives u_i,I. We proceed to evaluate the jump of this equation under the assumption that neither the functions u_i nor their first derivatives undergo any discontinuities, since we are looking for weak discontinuities. The result is

A_IJ ⟦u,x_I x_J⟧ = 0.        (7.35)

Invoking the iterated compatibility condition derived in Box 6.1, we can write

( A_IJ n_I n_J ) a = 0,        (7.36)

where a is a vector with n entries known as the wave amplitude vector. We conclude that a hyper-surface element with normal n is characteristic if

det( A_IJ n_I n_J ) = 0.        (7.37)

The classification of the system as totally hyperbolic, elliptic or otherwise follows


in the usual way. In applications to Elasticity, the purely spatial part of the tensor
A I J n I n J is known as the acoustic tensor associated with a given direction in space.
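As an illustration not worked out in the text, for a homogeneous isotropic elastic solid the acoustic tensor takes the standard form Q(n) = (λ + μ) n⊗n + μ I, and the eigenvalues of Q/ρ are the squared wave speeds: one longitudinal and two (equal) transverse speeds for every direction n. A short sketch, with made-up Lamé constants:

```python
import numpy as np

# Acoustic tensor of isotropic linear elasticity (standard textbook form,
# shown only to illustrate the construction; lam, mu, rho are assumed values).
lam, mu, rho = 100e9, 80e9, 7850.0

n = np.array([1.0, 2.0, 2.0])
n /= np.linalg.norm(n)                 # unit propagation direction

Q = (lam + mu) * np.outer(n, n) + mu * np.eye(3)
c = np.sqrt(np.sort(np.linalg.eigvalsh(Q)) / rho)   # wave speeds, ascending

assert np.allclose(c[:2], np.sqrt(mu / rho))             # two shear speeds
assert np.isclose(c[2], np.sqrt((lam + 2 * mu) / rho))   # longitudinal speed
print(c)
```

That the speeds come out independent of n reflects the isotropy of the material; for an anisotropic solid the same construction yields direction-dependent speeds.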

Box 7.2 Moving surfaces in R3


Consider a differentiable function f(x₁, x₂, x₃, t) of the spatial Cartesian coordinates x₁, x₂, x₃ and time t. The equation

f(x₁, x₂, x₃, t) = 0

represents a fixed three-dimensional hyper-surface in R⁴. There is an alternative, more physical, way to view this equation. Indeed, for each value of t
native, more physical, way to view this equation. Indeed, for each value of t
we obtain an ordinary two-dimensional surface in the space with coordinates
x1 , x2 , x3 . Accordingly, we can regard the equation f (x1 , x2 , x3 , t) = 0 as a
one-parameter family of surfaces St in ordinary space. Since the parameter is
time, we can regard this family as a moving surface in space.
An important question is: what is the velocity of this moving surface? Clearly,
the velocity is a point-wise notion. Each point of the surface has its own veloc-
ity, and this velocity, in general, varies with time. But, what is a ‘point’ of this
surface? Assume that we were to attach, at time t, an observer to a point P
moving with the surface. Where would this observer be located at time t + dt?
There is an infinite number of ways to establish a one-to-one correspondence
between points in St and St+dt! Choosing a particular point P′ in St+dt we
obtain a vector with components d x1 , d x2 , d x3 . These increments are con-
strained by the vanishing of f on all surfaces of the family by the equation
[figure: surfaces St and St+dt, a point P on St paired with P′ on St+dt, the unit normal, and the velocities U and V]

df = f,i dx_i + f_t dt = 0,

where the summation convention is used for the spatial indices. Dividing by
dt and recognizing the vector with components Vi = d xi /dt as the velocity
associated with the pairing P, P′ yields

V_i f,i = −f_t        or        V · ∇f = −f_t .

The gradient ∇ f is proportional to the unit normal n to the surface at P, the


constant of proportionality being the magnitude of ∇f. We conclude that

U = V_n = V · n = −f_t / √(∇f · ∇f) .

The right-hand side of this equation is clearly independent of the particular P′ chosen as a pair of P. We conclude that the normal speed U of the moving surface at P is a meaningful intrinsic geometric quantity. We call the vector U = U n the normal velocity of the moving surface at P.
If the surface is moving on a material background, the material particle instan-
taneously occupying the position P at time t has a velocity v. The difference
U p = U − v · n is the relative speed of the surface with respect to the particle,
also called the speed of propagation. If it vanishes identically, we say that the
moving surface is material since it is dragged by the motion and is always
occupied by the same material particles.
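A quick numerical check of the formula U = −f_t / |∇f| for a simple case (the speed, time and surface point below are arbitrary illustrative values): a sphere expanding at constant speed v, described by f = x₁² + x₂² + x₃² − (vt)², should have normal speed v everywhere.

```python
import numpy as np

# Sphere expanding at speed v: f(x, t) = |x|^2 - (v t)^2.  The values of
# v, t and the point x are arbitrary illustrations.
v, t = 3.0, 2.0
x = np.array([2.0, 4.0, 4.0])     # |x| = 6 = v*t, so x lies on S_t

grad_f = 2.0 * x                  # spatial gradient of f at (x, t)
f_t = -2.0 * v**2 * t             # partial derivative of f with respect to t

U = -f_t / np.linalg.norm(grad_f)
assert np.isclose(U, v)           # normal speed equals the expansion speed
print(U)                          # 3.0
```

Repeating the computation at any other point of S_t (or any other time) gives the same value, confirming that U is intrinsic to the moving surface and not to the pairing of points.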

7.2.2 Variation of the Wave Amplitude

7.2.2.1 Jump Relations

A peculiar feature of hyperbolic equations and systems is that, without necessarily


solving the field equations, it is possible to obtain a wealth of information about
the solution. In particular, when the characteristic manifolds are obtained and inter-
preted as wave fronts, it becomes possible to describe events taking place at those
wave fronts, such as the decay or growth of the amplitude of an initially imposed
discontinuity. The equation governing this growth or decay is an ordinary differential
equation known as the transport equation or the decay-induction equation. We have
already encountered this equation in Sect. 6.4.3 for the general quasi-linear second-
order equation in two independent variables. The situation is somewhat more involved
when the number of independent variables is larger than 2. The technique used in
deriving the transport equation consists of differentiating the PDE with respect to
one of the variables, t say, and then calculating its jump. This procedure seems to
complicate matters, since the jumps of third derivatives appear in the equation. It can
be shown, however, that these derivatives are affected by the same matrix coefficient
as the wave amplitude vector in Eq. (7.36). This circumstance enables the elimination
of this term by projecting the equation over the wave amplitude corresponding to
the particular wave front being dealt with. A general treatment of this issue is pre-
sented in very clear terms in [7] for quasi-linear systems of first-order equations. An
application to a particular system of second-order equations is given in [1]. Here we
present a more or less complete treatment for a particular linear system of second-
order equations. The procedure can be extended to the general quasi-linear system
by more or less obvious modifications.
An ingenious device is exploited in [7] which allows one to easily construct a new set
of independent variables, K − 1 of which lie in the space-time wave front. Assuming
this wave front to have been found and expressed in the form (7.33), a change of
variables is introduced as

ξ_I = x_I        for I = 1, …, K − 1
τ = ψ(x₁, …, x_{K−1}) − t        (7.38)

It follows that for τ = 0 the coordinate lines corresponding to the new variables ξ I
lie on the singular surface. Let g = g(x1 , . . . , x K −1 , t) be a differentiable function
of the old variables. It can be readily converted into a function ĝ of the new variables
by the composition

g(x1 , . . . , t) = g(ξ1 , . . . , ψ(ξ1 , . . .) − τ ) = ĝ(ξ1 , . . . , τ ). (7.39)

For the sake of compactness, let us abuse the notation and denote by commas the
partial derivatives with respect to either x I or ξ I . The distinction will be clear from
the name of the function (un-hatted or hatted, respectively). In the same vein, let
us denote by superimposed dots partial derivatives with respect to t or τ . With this
understanding, we obtain

ĝ,I = g,I + ġ ψ,I        for I = 1, …, K − 1
ĝ˙ = −ġ        (7.40)

The beauty of the new variables is that the coordinates ξ I are interior coordinates,
as a result of which derivatives in those K − 1 directions do not experience any
jump! Consequently, ξ-derivatives commute with the jump operator. Since we are
dealing with weak waves, jumps of functions of the variables ξ I , τ will occur only in
quantities with two or more τ derivatives. Differentiating Eq. (7.40), we obtain the
following relations

¨
g̈ = ĝ (7.41)
¨ ψ,I
ġ,I  = −ĝ (7.42)
¨ ψ,I ψ,J
g,I J  = ĝ (7.43)
... ...
 g  = − ĝ  (7.44)
...
¨ ψ,I +  ĝ ψ,I
g̈,I  = ĝ (7.45)
...
¨ ,I ψ,J − ĝ
ġ,I J  = −ĝ ¨ ,J ψ,I − ĝψ
¨ ,I J −  ĝ ψ,I ψ,J . (7.46)

7.2.2.2 The Transport Equation

For definiteness, we consider a linear system of n equations for n functions u₁, …, u_n of K variables x₁, …, x_{K−1}, t in the normal form

A_IJ u,IJ = ü.        (7.47)

The matrices A_IJ are assumed to be constant and symmetric. Taking jumps and invoking Eqs. (7.41) and (7.43) yields

( A_IJ ψ,I ψ,J − I ) ⟦ü̂⟧ = 0,        (7.48)

in which I is the unit matrix of order n. We assume the system to be totally hyperbolic. A solution ψ = ψ(x₁, …, x_{K−1}) of the first-order eikonal equation

det( A_IJ ψ,I ψ,J − I ) = 0        (7.49)

provides us with a spatiotemporal wave front ψ(x₁, …, x_{K−1}) − t = 0. At each point of this hyper-surface, we obtain a single eigen-direction satisfying Eq. (7.48). If we differentiate the original PDE with respect to time and then take the jumps while invoking Eqs. (7.41) and (7.46), we can write

A_IJ ( −⟦ü̂⟧,I ψ,J − ⟦ü̂⟧,J ψ,I − ⟦ü̂⟧ ψ,IJ − ⟦û⃛⟧ ψ,I ψ,J ) = −⟦û⃛⟧,        (7.50)

or

A_IJ ( ⟦ü̂⟧,I ψ,J + ⟦ü̂⟧,J ψ,I + ⟦ü̂⟧ ψ,IJ ) + ( A_IJ ψ,I ψ,J − I ) ⟦û⃛⟧ = 0.        (7.51)

Since the matrices A_IJ are symmetric, the right and left eigenvectors of A_IJ ψ,I ψ,J − I are the same. On multiplying to the left by the transpose of ⟦ü̂⟧ we cancel out the term containing the jump of the third derivative. The result is

⟦ü̂⟧ᵀ A_IJ ( ⟦ü̂⟧,I ψ,J + ⟦ü̂⟧,J ψ,I + ⟦ü̂⟧ ψ,IJ ) = 0.        (7.52)

Recall that the eigenvector is known in direction and that, therefore, this equation becomes an ODE for the amplitude a of ⟦ü̂⟧. This is the desired transport equation. It can be shown [1] that its characteristics coincide with the bi-characteristics of the original equation.

7.2.3 The Timoshenko Beam Revisited

7.2.3.1 Characteristics

We have derived in Sect. 7.1.4 the dynamic equations for a Timoshenko beam as a
system of 4 first-order PDEs. It is also possible to express these equations in terms
of 2 second-order PDEs for the displacement w and the rotation θ, respectively. The
result² for a beam of homogeneous properties is

ρA wtt − GAs (wx − θ)x = −q
ρI θtt − EI θxx = GAs (wx − θ)        (7.53)

2 See Exercise 7.5.



These equations can be recast in the normal form (7.47), except for the addition of a vector c on the right-hand side. Since K = 2 we obtain the single matrix

A  =  ⎡ GAs/(ρA)    0  ⎤        (7.54)
      ⎣    0       E/ρ ⎦

The additional term c is given by

c  =  ⎧ (q + GAs θx)/(ρA)   ⎫        (7.55)
      ⎩ −(GAs/(ρI))(wx − θ) ⎭

Equation (7.49) provides us in this case immediately with the eigenvalues

(dt/dx)²₁ = ρA/(GAs),        (dt/dx)²₂ = ρ/E,        (7.56)

which, except for a different numbering, are identical to those obtained earlier as Eq. (7.24). The corresponding eigenvectors are, respectively, (1, 0)ᵀ and (0, 1)ᵀ. Thus, the characteristics are straight lines. We have four families of characteristics in the x, t plane, each pair with slopes of different signs. For total hyperbolicity, these slopes are all distinct. The interesting case of double eigenvalues is considered separately in Box 7.3.

7.2.3.2 Variation of Amplitude

The general prescription of Eq. (7.52) has to be modified slightly to take into consid-
eration the additional vector c. Indeed, this term contributes first partial derivatives,
which have been assumed to be continuous and, accordingly, do not affect the char-
acteristic equations. On the other hand, when constructing the transport equation,
a further derivative is introduced which affects the formation of the jumps. The
additional term is

⟦ct⟧  =  ⎧  (GAs/(ρA)) ⟦θxt⟧ ⎫        (7.57)
         ⎩ −(GAs/(ρI)) ⟦wxt⟧ ⎭

The final result corresponding to Eq. (7.52) for the first eigenvalue is

(1  0) [ ⎡ GAs/(ρA)   0  ⎤ ⎧ 2ax ⎫ √(ρA/(GAs))  +  a √(ρA/(GAs)) ⎧     0     ⎫ ]  =  0.        (7.58)
         ⎣    0      E/ρ ⎦ ⎩  0  ⎭                               ⎩ GAs/(ρI)  ⎭

This equation implies that ax = 0, that is, the amplitude of a jump in the second
derivative of the displacement remains constant as it propagates. A similar analysis for
the second eigenvalue shows that the amplitude of the second derivative of the rotation
also remains constant. The main reason for these facts, apart from the constancy of the
sectional properties, is the complete decoupling between the modes. This decoupling
stems from the particular form of the additional term c and from the assumption of
total hyperbolicity, namely, that the speeds of propagation of the bending and shear
signals are different. A surprising discrepancy is found when these speeds happen to
be equal to each other, as discussed in Box 7.3.

Box 7.3 When two eigenvalues meet


The shear area As of the Timoshenko beam is a matter of some controversy.
At any rate, it is an adjustable quantity that may depend on the application
at hand [3]. In exceptional cases it may attain a value such that G As = E A,
which would lead to the equality of the eigenvalues of the matrix A. In that
case, any vector {a, b}ᵀ in the space of jumps with coordinates ⟦wtt⟧, ⟦θtt⟧ is an eigenvector. Equation (7.58) must then be replaced by

⎡ E/ρ   0  ⎤ ⎧ 2ax ⎫     E ⎧    b     ⎫
⎣  0   E/ρ ⎦ ⎩ 2bx ⎭  +  ρ ⎩ −(A/I) a ⎭  =  0,

which leads to the system

2ax + b = 0
2bx − (A/I) a = 0.

In obtaining these equations, a very important observation is that, unlike the case of simple eigenvalues, the projection on the corresponding eigenvector (that is, the multiplication to the left by ⟦ü̂⟧ᵀ in Eq. (7.52)) is not called for, since the term ( A_IJ ψ,I ψ,J − I ) ⟦û⃛⟧ in Eq. (7.51) cancels out identically in the case of a double eigenvalue. It is this fact that permits us to obtain two equations rather than just one, as was the case of simple eigenvalues. The solution of this coupled system is

a = C cos(x/(2κ)) + D sin(x/(2κ)),        b = −(D/κ) cos(x/(2κ)) + (C/κ) sin(x/(2κ)),


where κ = √(I/A) is the radius of gyration and C, D are constants to be
adjusted according to the initial conditions, assuming that this knowledge is
available at some point on one of the characteristic lines. The solution has
been parametrized by x, but could as well be parametrized by t or any other
parameter along the characteristic line. The main conclusion is that, unlike the
case of simple eigenvalues, whereby the amplitude does not decay or grow,
in the case of a double eigenvalue the amplitudes of both bending and shear
signals are coupled and evolve harmonically in time.
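The closed-form solution quoted in the box is easy to verify directly; with κ = √(I/A) the pair a, b satisfies both equations of the coupled system identically. A short check with arbitrary section values and integration constants:

```python
import numpy as np

# Check of the Box 7.3 solution: 2 a' + b = 0 and 2 b' - (A/I) a = 0.
# A_sec, I_sec, C, D are arbitrary illustrative numbers.
A_sec, I_sec = 1e-2, 1e-5
k = np.sqrt(I_sec / A_sec)        # radius of gyration kappa
C, D = 1.3, -0.7

x = np.linspace(0.0, 1.0, 201)
u = x / (2.0 * k)
a = C * np.cos(u) + D * np.sin(u)
b = -(D / k) * np.cos(u) + (C / k) * np.sin(u)

ax = (-C * np.sin(u) + D * np.cos(u)) / (2.0 * k)      # exact derivative a'(x)
bx = ( D * np.sin(u) + C * np.cos(u)) / (2.0 * k**2)   # exact derivative b'(x)

assert np.allclose(2.0 * ax + b, 0.0)                   # first equation
assert np.allclose(2.0 * bx - (A_sec / I_sec) * a, 0.0) # second equation
```

The check also makes the coupling visible: a and b exchange energy with spatial period 4πκ, so the jump amplitudes oscillate instead of remaining constant as in the simple-eigenvalue case.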

7.2.4 Air Acoustics

7.2.4.1 Wave Fronts and Rays

A relatively simple example is provided by air acoustics, which is governed by the three-dimensional wave equation

pxx + pyy + pzz = (1/c²) ptt,        (7.59)

in which p = p(x, y, z, t) is the acoustic pressure and c is the speed of sound. Writing the equation of the characteristic manifold in the form

φ(x, y, z, t) = ψ(x, y, z) − t = 0,        (7.60)

as suggested in Box 7.1, we obtain

ψx² + ψy² + ψz² − 1/c² = 0.        (7.61)

Assuming, for simplicity, a two-dimensional situation ψ = ψ(x, y), we can solve the characteristic equation by the method of characteristic strips developed in Chap. 5. We adopt as the initial condition a parabolic wave front given parametrically by

x = r,        y = r²,        ψ = 0.        (7.62)

Physically, this initial wave front could have been produced by a distribution of many speakers or other sound producing devices arranged on a parabolic surface and activated simultaneously at time t = 0. Extending these data by the strip condition and by the PDE itself yields

ψx + 2 ψy r = 0,        ψx² + ψy² − 1/c² = 0.        (7.63)

The characteristic strips are solutions of the system

dx/ds = 2ψx,    dy/ds = 2ψy,    dψ/ds = 2ψx² + 2ψy² = 2/c²,    dψx/ds = 0,    dψy/ds = 0.        (7.64)

The solution of this system is given by

x = −4rs/(c√(1 + 4r²)) + r,        y = 2s/(c√(1 + 4r²)) + r²,        ψ = 2s/c².        (7.65)

For any value of s the solutions projected on the x, y plane are straight lines. They
are the bi-characteristics of the original PDE. For this particular equation, the bi-
characteristics are perpendicular to the spatial wave fronts. On the concave side of the
initial parabola these bi-characteristics (or rays) tend to converge. This phenomenon,
observed also in optics, is known as focusing and the formation of a caustic. Figure 7.4
provides a plot of the bi-characteristics for our example.
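These geometric facts are easy to check numerically. The following sketch (with c = 1 and a few illustrative values of r, both choices mine) samples points along the rays of Eq. (7.65) and verifies that each ray is a straight line orthogonal to the tangent (1, 2r) of the initial parabola:

```python
import math

def ray_point(r, s, c=1.0):
    """Point on the bi-characteristic of Eq. (7.65) issuing from (r, r^2)."""
    den = c * math.sqrt(1.0 + 4.0 * r ** 2)
    x = -2.0 * r * s / den + r
    y = s / den + r ** 2
    return x, y

# Each projected characteristic is a straight line; its direction (-2r, 1)
# is orthogonal to the tangent (1, 2r) of the parabola y = x^2 at x = r.
for r in (-0.4, -0.1, 0.2, 0.5):
    x0, y0 = ray_point(r, 0.0)
    x1, y1 = ray_point(r, 0.3)
    dx, dy = x1 - x0, y1 - y0
    dot = dx * 1.0 + dy * 2.0 * r
    print(f"r = {r:+.1f}: ray direction ({dx:+.4f}, {dy:+.4f}), dot with tangent = {dot:.1e}")
```

The vanishing dot products confirm that, for this particular equation, the rays leave the initial wave front at right angles.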
For a circular (cylindrical) surface of radius R and centre at (0, R), all the rays
converge to the centre. The initial wave front can be written parametrically as

x = R sin θ,    y = R(1 − cos θ),    ψ = 0.    (7.66)

The strip condition (5.27) yields

ψ_x R cos θ + ψ_y R sin θ = 0.    (7.67)

Fig. 7.4 Acoustic focusing

The characteristic ODEs are given again by (7.64). The solution in this case is given
by

x = −(2s/c) sin θ + R sin θ,    y = (2s/c) cos θ + R(1 − cos θ),    ψ = 2s/c^2.    (7.68)

The parameters s, θ are readily eliminated and we obtain the solution of the wave
fronts in the form
ψ = (1/c)[R − √(x^2 + (R − y)^2)].    (7.69)
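As a quick sanity check, the wave-front function (7.69) can be verified to satisfy the two-dimensional form of the characteristic equation (7.61) by numerical differentiation; the values of c and R below are illustrative choices, not from the text:

```python
import math

c, R = 340.0, 1.0   # illustrative speed of sound (m/s) and radius

def psi(x, y):
    # Eq. (7.69): arrival time of the wave front converging on the centre (0, R)
    return (R - math.sqrt(x ** 2 + (R - y) ** 2)) / c

h = 1e-6
for (x, y) in [(0.3, 0.2), (-0.5, 0.4), (0.1, 0.8)]:
    px = (psi(x + h, y) - psi(x - h, y)) / (2 * h)  # central differences
    py = (psi(x, y + h) - psi(x, y - h)) / (2 * h)
    print(f"at ({x}, {y}): c^2 (psi_x^2 + psi_y^2) = {(px**2 + py**2) * c**2:.8f}")
```

The printed values are 1 to within the finite-difference error, as the eikonal equation requires.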

7.2.4.2 Transport Equation

We have a single equation, so that the matrices A_IJ become scalars. In our case, moreover, they attain the simple form c^2 δ_IJ, where δ_IJ is the Kronecker symbol. Correspondingly, the amplitude û is just a scalar a and the decay equation (7.52) becomes

2a_x ψ_x + 2a_y ψ_y + a(ψ_xx + ψ_yy) = 0.    (7.70)

Notice that at this stage the coefficients ψx , ψ y are known. The characteristics of this
first-order PDE for a are given by

dx/dr = 2ψ_x,    dy/dr = 2ψ_y,    da/dr = −a(ψ_xx + ψ_yy).    (7.71)
As expected, the projected characteristics of the transport equation are precisely the
same as the bi-characteristics of the original equation. Thus, weak discontinuities
travel along rays (in this example perpendicular to the moving wave front). Let us
integrate the transport equation for the circular case, since the parabolic case leads
to more involved formulas. Our objective is to satisfy our intuition that as the signals
converge toward a focus the acoustic pressure increases without bound. Introducing
the solution ψ into (7.71), the system of equations is written as

dx/dr = −2x/(cρ),    dy/dr = 2(R − y)/(cρ),    da/dr = a/(cρ),    (7.72)

where ρ = √(x^2 + (R − y)^2) is the distance to the centre of the circle. From the first
two equations we conclude that

dx/dy = −x/(R − y),    (7.73)

so that, as expected, the rays are the radii

x = k(R − y). (7.74)



Notice that
dρ = ρ_x dx + ρ_y dy = −(2/c) dr.    (7.75)
It follows that
da/dρ = (da/dr)(dr/dρ) = −a/(2ρ),    (7.76)

which integrates to

a = a_0 √(R/ρ),    (7.77)

where a_0 is the intensity of the excitation (produced, say, by a sudden pulse of the speakers). Thus, the pressure grows without bound as ρ approaches the centre of the circle. It is important to realize that we have not solved the acoustic differential equation (7.59). We have merely found the rays as carriers of weak discontinuities and
the variation of their amplitude. This information was gathered by solving first-order
differential equations only.
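The integration leading to Eq. (7.77) can be mimicked numerically. The sketch below (with illustrative values R = a_0 = 1) marches the reduced transport equation (7.76) inward from the wave front with an explicit Euler scheme and compares the result with the closed form a = a_0 √(R/ρ):

```python
import math

R, a0 = 1.0, 1.0     # illustrative radius and excitation intensity
rho, a = R, a0
drho = -1e-5         # march inward, from the wave front towards the centre
while rho > 0.1:
    a += drho * (-a / (2.0 * rho))   # explicit Euler step for (7.76)
    rho += drho

exact = a0 * math.sqrt(R / rho)      # closed form (7.77)
print(f"at rho = {rho:.3f}: numeric a = {a:.5f}, exact a = {exact:.5f}")
```

Both values grow like ρ^(−1/2), illustrating the unbounded amplification of the signal as the focus is approached.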

7.2.5 Elastic Waves

7.2.5.1 Basic Equations of Linear Infinitesimal Elasticity

In Chap. 2 we presented the basic balance equations of Continuum Mechanics. In


particular, the equation of balance of linear momentum (2.55) reads

ρ ∂v_i/∂t + ρ v_{i,j} v_j = b_i + σ_{ij,j}.    (7.78)
Here, σi j = σ ji are the (Cartesian) components of the stress tensor, ρ is the current
density, bi are the components of the spatial body force and vi are the components
of the velocity vector. This formulation is strictly Eulerian or spatial. For elastic
solids, it is convenient to introduce an alternative (Lagrangian) formulation based on
a fixed reference configuration. Nevertheless, for our current purpose, we will adopt a
linearized formulation around the present configuration, in which case the distinction
between the two formulations alluded to above can be disregarded. Moreover, the
inertia term appearing on the left-hand side of Eq. (7.78) will be approximated by
the product ρ ∂v_i/∂t, where ρ is identified with the density in the fixed (current) state,
so that the mass conservation (continuity equation) does not need to be enforced.
Finally, our main kinematic variable is a displacement vector field with components
u i = u i (x1 , x2 , x3 , t) in the global inertial frame of reference. In terms of these
variables, the velocity components are given by

v_i = ∂u_i/∂t.    (7.79)
The material properties will be assumed to abide by Hooke’s law

σ_{ij} = (1/2) C_{ijkl} (u_{k,l} + u_{l,k}).    (7.80)
The fourth-order tensor of elastic constants C enjoys the symmetries

C_{ijkl} = C_{klij} = C_{jikl}.    (7.81)

It has, therefore, a maximum of 21 independent components. Particular material models enjoy further symmetries, the most symmetric of which is the isotropic material,
completely characterized by two elastic constants. For the isotropic material we have

C_{ijkl} = λ δ_{ij} δ_{kl} + μ(δ_{ik} δ_{jl} + δ_{il} δ_{jk}).    (7.82)

In this equation λ and μ are the Lamé coefficients and δi j is the Kronecker symbol,
equal to 1 when the two indices are equal and vanishing otherwise.
The whole apparatus can be condensed in a system of three coupled second-order
PDEs for the three displacement components, namely,

C_{ijkl} u_{k,lj} + b_i = ρ ∂²u_i/∂t²,    i = 1, 2, 3.    (7.83)

7.2.5.2 Hyperbolicity

Introducing the wave front equation φ(x1 , x2 , x3 , t) = ψ(x1 , x2 , x3 ) − t = 0 and


implementing Eq. (7.36), we obtain the propagation condition
 
(C_{ijkl} m_j m_l − ρU² δ_{ik}) a_k = 0,    (7.84)

in which m i are the components of a unit vector normal to the wave front and ai are
the components of the wave amplitude vector. The tensor

Q_{ik} = C_{ijkl} m_j m_l    (7.85)

is called the acoustic tensor in the direction m.3 Total hyperbolicity corresponds to
the case in which, for every m, the tensor Q is (symmetric and) positive-definite with
three distinct eigenvalues proportional to the speeds of propagation. The associated
orthonormal eigenvectors are called the acoustical axes associated with the direction
m. If one of the acoustical axes coincides with m, the wave is said to be longitudinal.

3 For a fuller treatment of the general non-linear theory, see [6, 7].

If they are perpendicular, the wave is called transversal. These concepts make sense
because n = K − 1 and the displacement vector can be regarded as an element of the
physical space R3 . The case of one repeated eigenvalue occurs in isotropic materials,
where transversal waves propagate at identical speed.
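The structure of the acoustic tensor for an isotropic material is easily made concrete. The following sketch (with illustrative Lamé constants and density of my choosing) builds Q_{ik} from Eqs. (7.82) and (7.85) for an arbitrary unit direction m. It confirms that m itself is an acoustical axis with eigenvalue λ + 2μ (a longitudinal wave) and that every direction orthogonal to m is an eigenvector with the repeated eigenvalue μ (transversal waves):

```python
import math

lam, mu, rho = 2.0, 1.0, 1.0   # illustrative Lame constants and density

def delta(i, j):
    return 1.0 if i == j else 0.0

def C(i, j, k, l):
    # isotropic elasticity tensor, Eq. (7.82)
    return (lam * delta(i, j) * delta(k, l)
            + mu * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k)))

m = (1.0 / math.sqrt(3.0),) * 3   # an arbitrary unit propagation direction

# acoustic tensor Q_ik = C_ijkl m_j m_l, Eq. (7.85)
Q = [[sum(C(i, j, k, l) * m[j] * m[l] for j in range(3) for l in range(3))
      for k in range(3)] for i in range(3)]

Qm = [sum(Q[i][k] * m[k] for k in range(3)) for i in range(3)]
v = (1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0), 0.0)   # orthogonal to m
Qv = [sum(Q[i][k] * v[k] for k in range(3)) for i in range(3)]
print("Q m =", Qm, "-- equals (lam + 2 mu) m: longitudinal wave")
print("Q v =", Qv, "-- equals mu v: transversal wave (double eigenvalue)")
print("speeds:", math.sqrt((lam + 2 * mu) / rho), math.sqrt(mu / rho))
```

The two printed speeds are the longitudinal and transversal wave speeds whose general determination is the subject of Exercise 7.7.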
Exercises

Exercise 7.1 Write the equations of an intermediate theory of beams for which the
normality of cross sections is preserved but the rotary inertia is included. Find the
speed of propagation of weak singularities for this theory.

Exercise 7.2 In the Timoshenko beam we found 4 distinct speeds of propagation and
identified the first two as pertaining to bending waves and the last two as pertaining to
shear waves. Justify this terminology by calculating the eigenvectors corresponding
to each of the 4 speeds.

Exercise 7.3 Carry out all the steps necessary to obtain Eq. (7.16) and provide
explicit expressions for Ĉ and d̂. Compare with the results in [4], p. 48.

Exercise 7.4 Apply the procedure explained in Sect. 7.1.3 to the Timoshenko beam
equations to obtain a system in canonical form. Consider the case of repeated eigen-
values and obtain the corresponding transport equations. Compare your results with
those obtained in [5] and with those obtained in Box 7.3.

Exercise 7.5 It is not always possible to obtain a single higher order PDE equivalent
to a given system of first-order PDEs. In the case of the Timoshenko beam, however,
this can be done by differentiation and elimination. Provide two alternative formula-
tions based, respectively, on a system of two second-order equations for the rotation
θ and the deflection w, and on a single fourth-order equation for the deflection w.
Obtain the characteristic speeds from each of these two alternative formulations.

Exercise 7.6 Prove that the speed of propagation of a surface moving on a material
background, as described in Box 7.2, is given by

U_p = − (D f/Dt) / √(∇ f · ∇ f),

where D/Dt is the material derivative operator defined in Box 2.1.

Exercise 7.7 Show that in an isotropic elastic material (in infinitesimal elasticity)
all transverse waves travel at the same speed. Find the speeds of longitudinal and
transverse waves in terms of the Lamé constants. Determine restrictions on these
constants so that waves can actually exist.

References

1. Cohen H, Epstein M (1982) Wave fronts in elastic membranes. J Elast 12(3):249–262


2. Gentle JE (2007) Matrix algebra. Springer, Berlin
3. Hutchinson JR (2001) Shear coefficients for Timoshenko beam theory. ASME J Appl Mech
68:87–92
4. John F (1982) Partial differential equations. Springer, Berlin
5. Leonard RW, Budiansky B (1954) On traveling waves in beams, NACA report 1173. http://naca.central.cranfield.ac.uk/reports/1954/naca-report-1173.pdf
6. Truesdell C, Noll W (1965) The non-linear field theories of mechanics. In: Flügge S (ed) Hand-
buch der Physik. Springer, Berlin
7. Varley E, Cumberbatch E (1965) Non-linear theory of wave-front propagation. J Inst Maths
Applics 1:101–112
Part IV
Paradigmatic Equations
Chapter 8
The One-Dimensional Wave Equation

The archetypal hyperbolic equation is the wave equation in one spatial dimension.
It governs phenomena such as the propagation of longitudinal waves in pipes and
the free transverse vibrations of a taut string. Its relative simplicity lends itself to
investigation in terms of exact solutions of initial and boundary-value problems. The
main result presented in this chapter is the so-called d’Alembert solution, expressed
within any convex domain as the superposition of two waves traveling in opposite
directions with the same speed. Some further applications are explored.

8.1 The Vibrating String

We have already encountered the one-dimensional wave equation in Chap. 2 when


dealing with longitudinal deformations of an elastic bar of constant cross section.
There, we obtained the second-order PDE

u_tt = c^2 u_xx,    (8.1)

where the constant c is given by

c = √(E/ρ),    (8.2)

E being Young’s modulus and ρ the density of the material.


For purposes of visualization, it is sometimes convenient to have in mind another
phenomenon, also governed by Eq. (8.1), namely, the small transverse vibrations of
a string under tension, just like the strings of a musical instrument. To derive the
equation in this context, we will resort to an elementary technique. Denoting by A
the cross-sectional area and by T the value of the tension, and assuming that the
transverse deflections u are so small that the tension remains practically unaffected

© Springer International Publishing AG 2017 157


M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_8

Fig. 8.1 A vibrating string

by the motion, we can draw a free-body diagram of an infinitesimal portion of the


string of extent d x, as shown in Fig. 8.1.
Newton’s second law applied to the vertical components of the forces yields

− T sin θ + T sin(θ + dθ) + q dx = u_tt ρA dx.    (8.3)

Since the angles are assumed to be very small, we can replace the sine and the tangent
of the arc by the arc itself. Moreover, at any fixed instant of time

dθ ≈ d(tan θ) = du_x = u_xx dx.    (8.4)

Putting all these results together, we obtain

u_tt = c^2 u_xx + q.    (8.5)

The constant c is now given by

c = √(T/(ρA)).    (8.6)

As before, this constant has the units of speed. In the absence of externally applied
transverse forces q, we recover the one-dimensional wave equation (8.1). We refer
to this case as the free vibrations of the string.

8.2 Hyperbolicity and Characteristics

The characteristic equation (6.18) is given, in the case of the wave equation and with
the notation of the present chapter, by

dt/dx = ±1/c.    (8.7)

Fig. 8.2 Characteristics as coordinate lines ξ, η

In other words, the equation is hyperbolic. Since the equation is linear, the projected
characteristics are, as expected, independent of the solution. Moreover, because the
equation has constant coefficients, they are straight lines. We have seen that weak
singularities can occur only across characteristics. It follows in this case that weak
signals1 propagate with the constant speed c. Introducing the new variables

ξ = (x + c t)/2,    η = (x − c t)/2,    (8.8)
the one-dimensional wave equation is reduced to the canonical (normal) form

u ξη = 0. (8.9)

This change of variables is tantamount to using the characteristics as coordinate lines,


as suggested by Fig. 8.2.
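The effect of the change of variables (8.8) can also be checked numerically: for any function of the form u(x, t) = f(x + ct) + g(x − ct), the mixed derivative of U(ξ, η) = u(ξ + η, (ξ − η)/c) must vanish. A minimal sketch, with arbitrary choices of f, g and c:

```python
import math

c = 2.0
u = lambda x, t: math.sin(x + c * t) + math.exp(-(x - c * t) ** 2)
U = lambda xi, eta: u(xi + eta, (xi - eta) / c)   # x = xi + eta, t = (xi - eta)/c

# mixed second-order finite difference approximating U_xi_eta
h = 1e-4
xi, eta = 0.7, -0.3
mixed = (U(xi + h, eta + h) - U(xi + h, eta - h)
         - U(xi - h, eta + h) + U(xi - h, eta - h)) / (4.0 * h * h)
print(f"U_xi_eta at ({xi}, {eta}) = {mixed:.1e}")   # vanishes to rounding error
```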

8.3 The d’Alembert Solution

Suppose that we are given a convex2 domain in R2 and a C 2 function u = u(ξ, η)


satisfying the wave equation identically in that domain. Since, according to the wave
equation (8.9), we must have that
 
u_ξη = ∂/∂η (∂u(ξ, η)/∂ξ) = 0,    (8.10)

1 As discussed in Sect. 7.1.3, this applies to strong signals as well.


2A subset of Rn is convex if whenever two points belong to the set so does the entire line segment
comprised by them. The requirement of convexity is sufficient for the validity of the statement. If
the domain is not convex, we may have different functions in different parts of the domain, and still
satisfy the differential equation.

we conclude that, in the domain of interest, there exists a function F(ξ) such that

∂u(ξ, η)/∂ξ = F(ξ).    (8.11)

A further integration of this equation yields

u(ξ, η) = f (ξ) + g(η), (8.12)

where f and g are arbitrary C 2 functions of one real variable.


It is not difficult to verify that, for a given solution u = u(ξ, η), the functions f
and g are unique to within a constant (that, if added to f , must be subtracted from g).
Returning to the original variables x, t, we can write the general solution of Eq. (8.1)
as
u(x, t) = f (x + c t) + g(x − c t). (8.13)

This is the so-called d’Alembert solution of the one-dimensional wave equation. In


some sense, it states that we can interpret the wave equation as the second-order
counterpart of the advection equation discussed in Chap. 3. Indeed, it establishes
that the general solution of the one-dimensional wave equation is a superposition of
two waves propagating with equal speed c in opposite directions while keeping their
shapes unchanged in time.

8.4 The Infinite String

The d’Alembert representation of the general solution applies to any convex domain.
Nevertheless, the solution of any real-life problem requires the specification of bound-
ary (fixed ends of a finite string) and initial (position and velocity at t = 0) conditions.
In particular, the boundary conditions limit the direct use of the d’Alembert represen-
tation over the whole domain of interest. For this reason, the best direct application
of the d’Alembert solution is to the case of a spatially unbounded domain (i.e.,
an infinite string). Assume that the initial conditions are given by some functions
u 0 (x), v0 (x), namely,
u(x, 0) = u 0 (x) (8.14)

and
u t (x, 0) = v0 (x). (8.15)

In other words, the functions u 0 (x), v0 (x) represent, respectively, the known initial
shape and velocity profile of the string. Using the d’Alembert representation (8.13),

we immediately obtain
u 0 (x) = f (x) + g(x) (8.16)

and
v_0(x) = c f′(x) − c g′(x),    (8.17)

where primes are used to denote derivatives of a function of a single variable. Differ-
entiating equation (8.16) and combining the result with Eq. (8.17), we can read off

f′(x) = (1/2) u_0′(x) + (1/(2c)) v_0(x),    (8.18)

g′(x) = (1/2) u_0′(x) − (1/(2c)) v_0(x).    (8.19)
On integrating, we obtain

f(x) = (1/2) u_0(x) + (1/(2c)) ∫_0^x v_0(z) dz + C,    (8.20)

g(x) = (1/2) u_0(x) − (1/(2c)) ∫_0^x v_0(z) dz − C.    (8.21)

Notice that the lower limit of integration is immaterial, since it would only affect the
value of the constant of integration C. The reason for having the same integration
constant (with opposite signs) in both expressions stems from the enforcement of
Eq. (8.16). Notice also that the dependence on x is enforced via the upper limit,
according to the fundamental theorem of calculus, while z is a dummy variable of
integration. For the functions f and g to be of class C 2 it is sufficient that u 0 be C 2
and that v0 be C 1 .
Now that we are in possession of the two component functions of the d’Alembert
representation, we are in a position of stating the final result, namely,

u(x, t) = f(x + c t) + g(x − c t)
       = (1/2) u_0(x + c t) + (1/(2c)) ∫_0^{x+ct} v_0(z) dz + (1/2) u_0(x − c t) − (1/(2c)) ∫_0^{x−ct} v_0(z) dz,    (8.22)

or, more compactly,



Fig. 8.3 Domain of dependence

u(x, t) = (1/2) u_0(x + c t) + (1/2) u_0(x − c t) + (1/(2c)) ∫_{x−ct}^{x+ct} v_0(z) dz.    (8.23)

From the point of view of the general theory of hyperbolic equations, it is important
to notice that the value of the solution at a given point (x, t) is completely determined
by the values of the initial conditions in the finite closed interval [x − c t, x + c t].
This interval, called the domain of dependence of the point (x, t), is determined
by drawing the two characteristics through this point backwards in time until they
intersect the x axis, as shown in Fig. 8.3.3 From the physical point of view, any
initial datum or signal outside the domain of dependence of a point has no influence
whatsoever on the response of the system at that point of space and time. In other
words, signals propagate at a definite finite speed and cannot, therefore, be felt at
a given position before a finite time of travel. This remark has implications on any
numerical scheme of solution (such as the method of finite differences) that one may
attempt to use in practice.
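For concreteness, the d'Alembert formula (8.23) is straightforward to implement. The sketch below (with an arbitrary initial shape and velocity profile, and a simple midpoint rule for the integral, all choices mine) also illustrates the domain of dependence: perturbing v_0 outside [x − ct, x + ct] leaves u(x, t) unchanged.

```python
import math

def dalembert(u0, v0, x, t, c, n=2000):
    """d'Alembert formula (8.23); the integral of v0 uses a midpoint rule."""
    a, b = x - c * t, x + c * t
    dz = (b - a) / n
    integral = sum(v0(a + (k + 0.5) * dz) for k in range(n)) * dz
    return 0.5 * (u0(a) + u0(b)) + integral / (2.0 * c)

c = 1.5
u0 = lambda x: math.exp(-x * x)   # initial shape (illustrative)
v0 = lambda x: math.sin(x)        # initial velocity profile (illustrative)

# At t = 0 the initial shape is recovered exactly.
print(dalembert(u0, v0, 0.7, 0.0, c), "vs", u0(0.7))

# Domain of dependence: modifying v0 only outside [x - ct, x + ct] = [-2, 4]
# does not change the value of the solution at (x, t) = (1, 2).
v0_mod = lambda x: v0(x) + (5.0 if abs(x) > 10.0 else 0.0)
print(dalembert(u0, v0, 1.0, 2.0, c) - dalembert(u0, v0_mod, 1.0, 2.0, c))
```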
The natural counterpart of the notion of domain of dependence is that of range of
influence. Given a closed interval [a, b] (or, in particular, a point) in the x axis (or
anywhere else, for that matter), its range of influence consists of the collection of
points (in the future) whose domain of dependence intersects that interval (or point).
It is obtained by drawing the outbound characteristics from the extreme points of the
interval and considering the wedge shaped zone comprised between them, as shown
in Fig. 8.4. A point outside the range of influence will not be affected at all by the
initial data in the interval [a, b].

3 Further elaborations of these ideas can be found in most books on PDEs. For clarity and conciseness,

we again recommend [3, 4].



Fig. 8.4 Range of influence

8.5 The Semi-infinite String

8.5.1 D’Alembert Solution

The d’Alembert solution can also be applied to the problem of the dynamics of a
semi-infinite string fixed or otherwise supported at one end. Let the undeformed
string occupy the region 0 ≤ x < ∞ and let the initial conditions be given as in the
previous problem by
u(x, 0) = u 0 (x) 0≤x <∞ (8.24)

and
u t (x, 0) = v0 (x) 0 ≤ x < ∞. (8.25)

At the end x = 0 we must now impose a boundary condition. Consider the case in
which this end is fixed, namely,

u(0, t) = 0 0 ≤ t < ∞. (8.26)

Note that at time t = 0 the boundary condition may happen to be inconsistent with
the initial conditions at the left end of the string. We will for now explicitly assume
that this is not the case. In other words, we assume that

u 0 (0) = 0 v0 (0) = 0. (8.27)

According to our discussion on domain of dependence and region of influence, it


should be clear that the solution in the region under the right-pointing characteristic
through the origin coincides with the solution for an infinite string with the same
initial conditions, viz.,

u(x, t) = (1/2) u_0(x + c t) + (1/2) u_0(x − c t) + (1/(2c)) ∫_{x−ct}^{x+ct} v_0(z) dz,    0 ≤ ct ≤ x.    (8.28)

The solution in the remaining part of the domain (namely, 0 ≤ x ≤ ct) must also
be amenable to a d’Alembert decomposition of the form

u(x, t) = f 1 (x + c t) + g1 (x − c t). (8.29)

At x = c t we demand continuity, that is,

f 1 (2c t) + g1 (0) = f (2c t) + g(0). (8.30)

Since we have a constant of integration at our disposal, we may set g1 (0) = g(0) and
obtain
f 1 (x) = f (x). (8.31)

The physical meaning of this result is that the backward moving wave is unaffected
by the presence of the support, which should not be surprising. To obtain the forward
moving wave, we impose the boundary condition (8.26) and obtain

0 = u(0, t) = f 1 (c t) + g1 (−c t) ∀t > 0. (8.32)

In other words,

g1 (x − c t) = − f 1 (−(x − c t)) 0 ≤ x < c t. (8.33)

From the physical point of view this result means that the forward wave arriving at a
point with coordinates x0 , t0 situated in the upper domain is a reflected (and inverted)
version of the backward wave issuing at the initial time t = 0 from the point of the
string with coordinate ct_0 − x_0. The time at which the reflection occurs is t_0 − x_0/c,
as shown in Fig. 8.5.
Combining all the above results, we can obtain an explicit formula for the solution
in the upper domain as

Fig. 8.5 Reflection of backward waves in the semi-infinite string

Fig. 8.6 A characteristic parallelogram

u(x, t) = (1/2) u_0(x + c t) − (1/2) u_0(−(x − c t)) + (1/(2c)) ∫_{−(x−ct)}^{x+ct} v_0(z) dz,    0 ≤ x ≤ c t.    (8.34)

8.5.2 Interpretation in Terms of Characteristics

The representation used in Fig. 8.5 suggests that the analytic expressions obtained by
means of d’Alembert’s solution can be also obtained geometrically by constructions
based on the characteristic lines alone. To this end, consider a parallelogram-shaped
domain enclosed by characteristics, such as the shaded domain shown in Fig. 8.6.
Denoting the corners of the parallelogram by A, B, C, D, as shown, it is not
difficult to conclude that any function u = u(x, t) of the d’Alembert form (8.13)
satisfies the condition
u A + uC = u B + u D , (8.35)

with an obvious notation. This result follows from Eq. (8.12) on observing that

ξ_A = ξ_D,    ξ_B = ξ_C,    η_A = η_B,    η_C = η_D.    (8.36)

Every C 2 solution of the wave equation, therefore, satisfies the simple algebraic
identity (8.35). Conversely, every function that satisfies Eq. (8.35) for every charac-
teristic parallelogram within a given domain also satisfies the wave equation within
the domain.4 If we have a function that is not of class C 2 and yet satisfies Eq. (8.35),

4 See [3], p. 42.



Fig. 8.7 Geometrical derivation of the solution

we may say that it satisfies the wave equation in a weak sense or, equivalently, that
it is a generalized or weak solution of the one-dimensional wave equation.
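The identity (8.35) is easily verified numerically for any function of d'Alembert form. In the sketch below (with arbitrary choices of f, g, c, starting corner, and characteristic advances), the corners are generated by moving prescribed distances along the two characteristic families from a corner A:

```python
import math

c = 1.0
u = lambda x, t: math.cos(x + c * t) + (x - c * t) ** 3   # d'Alembert form

xA, tA = 0.2, 0.1
p, q = 0.7, 0.4   # arbitrary advances along the two characteristic families
A = u(xA, tA)
B = u(xA + c * p, tA + p)              # from A along x - ct = const
D = u(xA + c * q, tA - q)              # from A along x + ct = const
C = u(xA + c * (p + q), tA + p - q)    # opposite corner of the parallelogram
print(A + C, "=", B + D)
```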
Let us apply Eq. (8.35) to the solution of the semi-infinite string problem with
a fixed end. Consider a point C with coordinates (x, t) in the upper domain of the
problem, as represented in Fig. 8.7. We complete a characteristic parallelogram by
drawing the two characteristics from this point back to the lines x = 0 and x = c t,
thus obtaining the points D and B, respectively. The remaining point, A, is obtained
as the intersection of the characteristic line issuing from D and the line x = c t, as
shown in the figure.
By the boundary condition (8.26), we have

u D = 0. (8.37)

Therefore, by Eq. (8.35), we must have

u(x, t) = u C = u B − u A . (8.38)

If we demand continuity of the solution, however, the values of u A and u B can be


obtained by integrating over their respective domains of dependence according to
Eq. (8.23). Thus,

u_A = (1/2) u_0(−(x − c t)) + (1/2) u_0(0) + (1/(2c)) ∫_0^{−(x−ct)} v_0(z) dz    (8.39)

u_B = (1/2) u_0(x + c t) + (1/2) u_0(0) + (1/(2c)) ∫_0^{x+ct} v_0(z) dz.    (8.40)

Combining Eqs. (8.38), (8.39) and (8.40) we recover the solution (8.34).

8.5.3 Extension of Initial Data

There exists yet another way to obtain the solution for a semi-infinite (and, eventually,
for a finite) string. It consists of extending the initial data to the whole line in such
a way that the boundary conditions are satisfied automatically by the corresponding
solution of the infinite string. This procedure needs to be used with care and on
a case-by-case basis. Consider the case discussed in the previous section, namely,
the one with initial and boundary conditions described by Eqs. (8.24)–(8.27). We
now extend the initial conditions as odd functions over the whole domain. Such an
extension is shown pictorially in Fig. 8.8, where the initial function, u 0 (x) or v0 (x),
given in the original domain 0 ≤ x < ∞, has been augmented with a horizontally
and vertically flipped copy so as to obtain an odd function over the whole domain.
The extended functions, denoted here with a bar, coincide with the given data in
the original domain and, moreover, enjoy the property

ū 0 (x) = −ū 0 (−x) v̄0 (x) = −v̄0 (−x) − ∞ < x < ∞. (8.41)

In other words, they are odd functions of their argument.


Let ū(x, t) denote the solution to the problem of the infinite string under the initial
conditions ū 0 (x), v̄0 (x). In accordance with Eq. (8.23), this solution is given by

ū(x, t) = (1/2) ū_0(x + c t) + (1/2) ū_0(x − c t) + (1/(2c)) ∫_{x−ct}^{x+ct} v̄_0(z) dz.    (8.42)

Fig. 8.8 Extension of data for the semi-infinite string



Let us evaluate this solution over the positive time axis. The result is

ū(0, t) = (1/2) ū_0(c t) + (1/2) ū_0(−c t) + (1/(2c)) ∫_{−ct}^{ct} v̄_0(z) dz.    (8.43)

By virtue of the deliberately imposed condition (8.41), we immediately obtain

ū(0, t) = 0. (8.44)

We conclude that the extended solution automatically satisfies the desired boundary
condition of the semi-infinite string. It follows, therefore, that restricting the extended
solution to the original domain provides the solution5 to the original problem. From
the physical point of view, we may say that the forward and backward waves of
the extended solution interfere with each other destructively, in a way that has been
precisely calibrated to produce a zero value for all times at x = 0.
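This destructive interference can be observed directly. The sketch below builds the odd extensions of arbitrary initial data satisfying (8.27), applies the infinite-string formula (8.23) with a midpoint rule for the integral, and evaluates the result at x = 0 for several times:

```python
import math

c = 1.0
u0 = lambda x: x * x * math.exp(-x)   # u0(0) = 0, as required by (8.27)
v0 = lambda x: math.sin(x)            # v0(0) = 0

odd = lambda f: (lambda x: f(x) if x >= 0.0 else -f(-x))
u0b, v0b = odd(u0), odd(v0)           # odd extensions to the whole line

def dalembert(x, t, n=4000):
    # infinite-string formula (8.23) applied to the extended data
    a, b = x - c * t, x + c * t
    dz = (b - a) / n
    integral = sum(v0b(a + (k + 0.5) * dz) for k in range(n)) * dz
    return 0.5 * (u0b(a) + u0b(b)) + integral / (2.0 * c)

for t in (0.5, 1.0, 2.5):
    print(f"u(0, {t}) = {dalembert(0.0, t):.2e}")   # vanishes for every t
```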
It is interesting to note that the solution obtained will in general not be of class
C 2 since, even if the original initial conditions and the extended initial conditions
are C 2 and even if they vanish at the origin, the extended initial conditions are only
guaranteed to be C 1 at the origin, unless the extra conditions

u_0″(0) = v_0″(0) = 0    (8.45)

happen to be satisfied. If these conditions are not satisfied, the corresponding weak
discontinuities will propagate along the characteristic emerging from the origin. This
feature is not due to the method of solution (since the solution is ultimately unique).
Remark 8.1 Although the method of extension of the initial conditions is a legit-
imate alternative to the method of characteristics, it should be clear that the latter
is more general than the former. Indeed, the method of extension of the initial data
is applicable only when the supported end is actually fixed, whereas the method of
characteristics is still viable when the support is subjected to a given motion.

8.6 The Finite String

8.6.1 Solution

The method of characteristics can be used in principle to solve for the vibrations of
a finite string of length L with appropriate boundary conditions. All one has to do
is to divide the domain of interest (which consists of a semi-infinite vertical strip

5 Clearly, we are tacitly invoking an argument of uniqueness, which we have not pursued.

Fig. 8.9 Characteristic solution for the finite string (u or u_x known on the lateral boundaries; u and u_t known on the initial segment)

in space-time) into regions, as shown in Fig. 8.9. The lower triangular region at the
base of the strip is solved by the general formula (8.23). Assuming that the solution
is continuous over the whole strip, we can then use the values at the boundary of this
triangle, together with the known values at the vertical boundaries of the strip, to use
the parallelogram formula (8.35) for the next two triangular regions. The procedure is
carried out in a similar manner for increasingly higher (triangular or parallelogram)
regions. It is not difficult to write a computer code to handle this recursive algorithm
(as suggested in Exercise 8.10).
For the particular case of fixed ends, an equivalent procedure is obtained by
exploiting the extension idea, namely the ‘trick’ already used for the semi-infinite
string to reduce the problem to that of an infinite string. Consider the problem of a
finite string of length L whose ends are fixed, namely,

u(0, t) = 0 u(L , t) = 0 ∀t ≥ 0, (8.46)

with given initial conditions

u(x, 0) = u_0(x),    u_t(x, 0) = v_0(x),    0 ≤ x ≤ L.    (8.47)

To avoid any possible strong discontinuities, we confine our attention to the case in
which the initial and boundary conditions are consistent with each other in the sense

Fig. 8.10 Extension of data for the finite string

that
u 0 (0) = u 0 (L) = v0 (0) = v0 (L) = 0. (8.48)

We now extend the initial conditions (uniquely) to odd functions with period
2L. Although this idea is both intuitively appealing and, eventually, justified by the
results, it is worthwhile noting that the need for periodicity can actually be derived by
means of a rational argument.6 The odd periodic extension of a function is illustrated
in Fig. 8.10.
In addition to conditions (8.41), the extended initial data satisfy now the periodicity
conditions

ū 0 (x) = ū 0 (x + 2L) v̄0 (x) = v̄0 (x + 2L) − ∞ < x < ∞. (8.49)

By the same procedure as in the case of the infinite string, invoking just the odd
character of the extension, we can be assured that the fixity condition is satisfied at
the left end of the string. The periodicity takes care of the right end. Indeed

ū(L, t) = (1/2) ū_0(L + c t) + (1/2) ū_0(L − c t) + (1/(2c)) ∫_{L−ct}^{L+ct} v̄_0(z) dz.    (8.50)

But, for odd periodic functions, we have

ū 0 (L + x) = −ū 0 (L − x) v̄0 (L + x) = −v̄0 (L − x). (8.51)

Combining this fact with Eq. (8.50), we immediately obtain

ū(L , t) = 0. (8.52)

Again, from the physical point of view, we can interpret this result as the outcome of
a carefully calibrated mutually destructive interference of the backward and forward
traveling waves.
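A numerical version of the same argument, with illustrative initial data satisfying (8.48): the data are extended as odd 2L-periodic functions and the infinite-string formula (8.23) is evaluated at both supports.

```python
import math

L, c = 1.0, 1.0
u0 = lambda x: math.sin(math.pi * x) ** 3   # u0(0) = u0(L) = 0 (with L = 1)
v0 = lambda x: x * (L - x)                  # v0(0) = v0(L) = 0

def extend(f):
    # odd extension about x = 0, made periodic with period 2L
    def fbar(x):
        x = x - 2.0 * L * math.floor(x / (2.0 * L))  # reduce to [0, 2L)
        return f(x) if x <= L else -f(2.0 * L - x)
    return fbar

u0b, v0b = extend(u0), extend(v0)

def dalembert(x, t, n=4000):
    # infinite-string formula (8.23) with a midpoint rule for the integral
    a, b = x - c * t, x + c * t
    dz = (b - a) / n
    integral = sum(v0b(a + (k + 0.5) * dz) for k in range(n)) * dz
    return 0.5 * (u0b(a) + u0b(b)) + integral / (2.0 * c)

for t in (0.3, 1.7):
    print(f"u(0, {t}) = {dalembert(0.0, t):.1e}   u(L, {t}) = {dalembert(L, t):.1e}")
```

Both fixity conditions (8.46) hold to rounding and quadrature error, for times well beyond the first reflections.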

6 See [6], p. 50, a text written by one of the great Russian mathematicians of the 20th century.

8.6.2 Uniqueness and Stability

We have found solutions to the wave equation with given initial and boundary condi-
tions by means of different procedures. Is the solution unique? Is it stable in the sense
that small changes in the initial and/or boundary conditions result in small changes
in the solution? These questions are of great importance for PDEs in general and
must be studied in detail for each case. A problem in PDEs is said to be well posed
if it can be shown that a solution exists, that it is unique and that it depends continu-
ously (in some sense) on the initial and boundary data. For the one-dimensional wave
equation on a finite spatial domain the answer to the question of uniqueness can be
found, somewhat unexpectedly, in the underlying physics of the problem by invoking
the principle of conservation of energy in non-dissipative systems, as explained in
Box 8.1.
We need to verify that, from the mathematical viewpoint, the total energy
W = K + U is indeed a conserved quantity. Invoking the results of Box 8.1, namely,

W = (1/2) ∫_0^L (u_t² + c² u_x²) ρA dx,    (8.53)

we calculate the rate of change of the energy on any C 2 solution u = u(x, t) of the
wave equation as

dW/dt = (1/2) d/dt ∫_0^L (u_t² + c² u_x²) ρA dx
      = ∫_0^L (u_t u_tt + c² u_x u_xt) ρA dx
      = ∫_0^L (u_t u_tt) ρA dx + [c² u_x u_t ρA]_{x=0}^{x=L} − ∫_0^L u_t c² u_xx ρA dx
      = ∫_0^L u_t (u_tt − c² u_xx) ρA dx + [c² u_x u_t ρA]_{x=0}^{x=L}
      = [c² u_x u_t ρA]_{x=0}^{x=L},    (8.54)

where we have integrated by parts and assumed, for simplicity, a constant density ρ
and a constant cross section A. For the case of fixed ends, since u_t = 0 at both ends,
we obtain the desired result, namely,

Box 8.1 Energy


The total energy can be split into kinetic and potential components. When the wave equation is
derived in the context of longitudinal waves in elastic bars (as done in Sect. 2.4.3), the kinetic
energy due to the longitudinal translation of the cross sections is given by

K = ∫_0^L ½ ρ u_t² A dx,

where we are confining our attention to the case of a finite bar of length L. The total potential
energy is stored as elastic energy, just as in the case of a linear spring, given by

U = ∫_0^L ½ E u_x² A dx = ∫_0^L ½ ρ c² u_x² A dx.

For the application to the transverse vibrations of a taut string, the kinetic energy K is given by
the same expression as its counterpart for longitudinal waves, except that the variable u must be
interpreted as a transverse, rather than longitudinal, displacement. The potential energy U for the
vibrating string, on the other hand, must be examined more carefully. Indeed, we have assumed
that the tension T remains constant during the process of deformation.a It does, however, perform
work by virtue of the extension of the string. If ds represents the length of a deformed element,
whose original length was d x, as indicated in Fig. 8.1, we can write

 
ds = √(1 + u_x²) dx = (1 + ½ u_x² + · · ·) dx.

The infinitesimal work performed by the tension T is, therefore,


T(ds − dx) ≈ ½ T u_x² dx = ½ ρA c² u_x² dx.
Since this work is completely recovered upon return to the straight configuration, it can legit-
imately be called a potential energy and we obtain the same expression as in the case of the
longitudinal waves in a bar.

a The energy is, of course, of elastic origin. This is a case of small deformations imposed upon
pre-existent larger ones. As an example, consider a linear spring with elongation e and internal
force F = ke, whose elastic energy is W = 0.5 k e². Its increment due to a small de superimposed
on a background e_0 is dW = k e_0 de = F de.

dW/dt = 0. (8.55)
The same result is, of course, obtained if an end is free to move but subjected to no
external force, so that u x = 0.
Let u_1 = u_1(x, t) and u_2 = u_2(x, t) be C² solutions of the wave equation with the
same boundary and initial conditions. Since the wave equation is linear, the difference
u = u_1 − u_2 satisfies the wave equation with homogeneous (that is, vanishing)

boundary and initial conditions. Thus, the total energy at time t = 0 vanishes. Since,
as we have just demonstrated, the total energy is conserved, we have

∫_0^L ½ (u_t² + c² u_x²) ρA dx = 0 (8.56)

for all times. Since the integrand is non-negative, this result is possible only if it
vanishes identically. Hence, u = u_1 − u_2 is constant and, therefore, zero. This
concludes the proof of uniqueness.
The issue of continuous dependence on initial data can be settled, for instance, by
using the technique of periodic extension and explicitly determining a norm of the
difference between the solutions corresponding to two sets of initial data. This issue
is clearly of great importance for numerical methods of solution of PDEs.
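The energy argument lends itself to a direct numerical check. The following sketch (with illustrative unit values of c, ρA and L, not taken from the text) marches a standard centered-difference discretization of the wave equation for a string with fixed ends through one full period and compares the discrete counterpart of the total energy (8.53) at the first and last time levels.

```python
import numpy as np

# Illustrative parameters (unit values; an assumption, not from the text).
c, rhoA, L = 1.0, 1.0, 1.0
nx = 200
dx = L / nx
dt = 0.5 * dx / c                        # CFL-stable time step
x = np.linspace(0.0, L, nx + 1)

u_prev = np.sin(np.pi * x)               # u(x, 0); fixed ends, zero velocity
u_curr = u_prev.copy()
# Second-order accurate first step consistent with u_t(x, 0) = 0:
u_curr[1:-1] += 0.5 * (c * dt / dx) ** 2 * (
    u_prev[:-2] - 2.0 * u_prev[1:-1] + u_prev[2:])

def energy(u_old, u_new):
    """Discrete counterpart of Eq. (8.53), evaluated between two time levels."""
    ut = (u_new - u_old) / dt
    ux = np.diff(0.5 * (u_new + u_old)) / dx
    return 0.5 * rhoA * dx * (np.sum(ut ** 2) + c ** 2 * np.sum(ux ** 2))

W0 = energy(u_prev, u_curr)
for _ in range(800):                     # march one full period, t = 2L/c
    u_next = np.empty_like(u_curr)
    u_next[1:-1] = (2.0 * u_curr[1:-1] - u_prev[1:-1]
                    + (c * dt / dx) ** 2
                    * (u_curr[:-2] - 2.0 * u_curr[1:-1] + u_curr[2:]))
    u_next[0] = u_next[-1] = 0.0         # fixed ends: u = 0, hence u_t = 0
    u_prev, u_curr = u_curr, u_next
W1 = energy(u_prev, u_curr)
```

Up to the small O(Δt²) fluctuation inherent in the staggered discrete energy, W remains constant, in agreement with Eq. (8.55).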

8.6.3 Time Periodicity

The free vibrations of a finite string are necessarily periodic in time. This fact can be
established in various ways, one of which we will pursue presently. We know that
the (unique) solution for the free vibrations of a simply supported string of length
L can be obtained by the technique of extending the initial conditions periodically
over R. The extension of the initial conditions u 0 and v0 yields, respectively, two
odd functions ū 0 (x) and v̄0 (x) with a period of 2L and the (d’Alembert) solution of
the problem is obtained as

u(x, t) = ½ ū_0(x + ct) + ½ ū_0(x − ct) + (1/2c) ∫_{x−ct}^{x+ct} v̄_0(z) dz. (8.57)

For each value of x, this function u(x, t) turns out to be periodic in time, with a
period of 2L/c, namely,
u(x, t + 2L/c) = u(x, t), (8.58)

as can be verified by direct substitution noting that, due to the assumed odd character
of the extension, the integral of the initial velocity over a whole period must vanish.
Notice that, although the displacement pattern is recovered exactly at regular
intervals equal to the period, its shape in general varies with time; nor does
the string become instantaneously undeformed at any intermediate instant. We will
later show that there are special solutions that do preserve their shapes and vary only
in amplitude as time goes on. These special solutions can be interpreted physically
as standing waves. They provide a different avenue of approach to the solution of
vibration problems in Engineering and Physics.

8.7 Moving Boundaries and Growth

A violin player can continuously change the length of the string by sliding a finger
over the fingerboard in a smooth motion known as glissando. This change of
length involves an increase in the material content of the string enclosed between
the two supports. Similarly, biological growth may arise as the result of additional
material being deposited at the boundaries of an organ. These two processes involve
time scales much larger than, say, the period of free vibrations of the original body.
Nevertheless, they are interesting pictures to bear in mind when dealing with prob-
lems of moving boundaries. In the case of the violin player, the effect of the moving
boundary can be perceived clearly by the ear as a variation of the fundamental pitch
of the sound. Remarkably, the method of characteristics can be applied without any
essential modification to problems involving moving boundaries.
Consider the idealized ‘violinist problem’ consisting of solving the wave equation
in the domain D (shaded in Fig. 8.11) with initial conditions

u(x, 0) = u_0(x),  u_t(x, 0) = 0,  0 ≤ x ≤ L, (8.59)

and with boundary conditions

u(0, t) = 0 u(L + αt, t) = 0 ∀t ≥ 0. (8.60)

In this equation, α is assumed to be a positive constant, so that the vibrating length


of the string increases linearly with time. The constant α represents the speed of
translation of the right support. On intuitive grounds, to be verified by the mathemat-
ics, this speed should be smaller than the speed c for the problem to have a solution
different from the restriction to D of the solution of the semi-infinite string.
The algorithm to compute the value of the solution at any point P ∈ D can be
conceived as follows: (a) Starting at P draw the downward-right characteristic until
it intersects the boundary of D; (b) if this intersection falls at a point A within the
base segment 0 ≤ x ≤ L, stop; otherwise (c) reflect the line by changing the sign

Fig. 8.11 Domain of definition for the moving boundary problem

Fig. 8.12 Solution algorithm

of the slope and repeat as many times as needed to arrive at a point A in the base
segment; (d) keep track of the number of bounces n_R; (e) repeat the whole zigzag
procedure but starting with the downward-left characteristic to obtain a point B in
the base after a number n_L of bounces (if any); the value of the displacement at P is
given by
u_P = ½ [(−1)^{n_R} u_0(A) + (−1)^{n_L} u_0(B)]. (8.61)
In the example of Fig. 8.12, we have n_R = 1 and n_L = 0.
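The algorithm (a)–(e) translates almost line by line into code. The sketch below assumes zero initial velocity, as in Eq. (8.59); all function names and the numerical values of c, L and α are ours, chosen only for illustration.

```python
import math

def trace(x, t, direction, c, L, alpha):
    """Trace a characteristic from (x, t) down to the base t = 0, reflecting
    off the fixed end x = 0 and the moving support x = L + alpha*t.
    direction = +1 follows the downward-right branch, -1 the downward-left one.
    Returns the foot on the base segment and the number of bounces."""
    bounces = 0
    while True:
        if direction > 0:
            # x(s) = x + c*(t - s) meets the moving support x = L + alpha*s at:
            s = (x + c * t - L) / (c + alpha)
            if s <= 0.0:
                return x + c * t, bounces
            x, t = L + alpha * s, s
        else:
            # x(s) = x - c*(t - s) meets the fixed end x = 0 at:
            s = t - x / c
            if s <= 0.0:
                return x - c * t, bounces
            x, t = 0.0, s
        direction, bounces = -direction, bounces + 1

def u_point(x, t, u0, c=1.0, L=1.0, alpha=0.3):
    """Eq. (8.61), valid for zero initial velocity as in Eq. (8.59)."""
    A, nR = trace(x, t, +1, c, L, alpha)
    B, nL = trace(x, t, -1, c, L, alpha)
    return 0.5 * ((-1) ** nR * u0(A) + (-1) ** nL * u0(B))
```

For points whose characteristics reach the base without bouncing, u_point reduces to the plain d'Alembert average ½[u_0(x + ct) + u_0(x − ct)], which provides a convenient check; the displacement also vanishes on both supports, as it must.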

8.8 Controlling the Slinky?

The Slinky is one of the most celebrated toys of the twentieth century. Essentially
a helical spring, its versatility is due in part to the interaction of its elasticity with
gravity and with lateral as well as axial deformations. To simplify matters, though,
let us consider it as a linear elastic bar with density ρ, modulus of elasticity E and
un-stretched length L undergoing axial displacements only in the small deformation
regime under no external forces.7 Under these conditions, it abides by the wave
equation. The question we pose is the following: Holding one end while the other
end is free and starting from a rest configuration, is it possible to obtain a prescribed
displacement history f (t) of the free end by imposing a displacement history g(t)
of the held end? From the mathematical standpoint, this is a problem in the theory
of boundary control of PDEs. Here, however, we will consider it as an independent
challenge for its own sake.
The key to solve this problem is that, for the case of the wave equation, the roles
of the space (x) and time (t) variables can be exchanged. What this means is that the
following is a perfectly well-defined Cauchy problem: Solve the wave equation

7 This simplified model is not realistic for the actual Slinky for many reasons. We are only using it
as a motivation for a well-defined problem in linear one-dimensional elasticity.

Fig. 8.13 Boundary control and the Slinky problem

u_tt − c² u_xx = 0 for x > 0, (8.62)

with the ‘initial’ data

u(0, t) = U_0(t),  u_x(0, t) = ε_0(t), (8.63)

where U_0(t) and ε_0(t) are defined over the whole real line. Let us identify the free
end with x = 0 and the held end with x = L. Let the desired displacement history
of the free end be of the form

U_0(t) = { 0 for t < 0;  f(t) for t ≥ 0 } (8.64)

and let ε_0(t) = 0, which corresponds to a zero strain (and stress) at an unsupported
end, as desired.
The shaded area in Fig. 8.13 indicates the range of influence R of the non-
vanishing Cauchy data. In particular, the line x = L is affected only for times
t ≥ −L/c. As expected on physical grounds, therefore, the non-vanishing history
g(t) of the displacement to be applied at the held end must start earlier than the
desired displacement at the free end. The explicit form of this history is, according
to the d’Alembert solution implemented in the exchanged time-space domain,
   
g(t) = u(L, t) = ½ U_0(t − L/c) + ½ U_0(t + L/c), (8.65)

which indeed vanishes identically for t < −L/c. By uniqueness, we conclude that if
we return to the original space-time picture and at time t = −L/c (or any earlier
time) we prescribe zero initial conditions of both displacement and velocity and as
boundary conditions we specify at x = 0 an identically zero strain and at x = L a
displacement equal to g(t), we should recover the same u(x, t) for the finite bar
as the one provided by the solution of the previous Cauchy problem. In particular,
along the line x = 0, we should recover the desired displacement f (t). The problem
has been thus completely solved. For more realistic applications, it is not difficult to
incorporate the effect of gravity.
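As a concrete illustration of the control formula (8.65), the held-end history g(t) can be tabulated for any chosen target history. In the sketch below the particular f is an arbitrary smooth choice, not one prescribed by the text; the point to observe is that g vanishes for t < −L/c, so the control must indeed start ahead of the desired motion at the free end.

```python
import numpy as np

L, c = 1.0, 1.0

def f(t):
    # Illustrative target history for the free end (an assumption, not the text's).
    return 1.0 - np.cos(t)

def U0(t):
    """Desired free-end displacement, Eq. (8.64): zero before t = 0."""
    t = np.asarray(t, dtype=float)
    return np.where(t < 0.0, 0.0, f(t))

def g(t):
    """Displacement to impose at the held end x = L, Eq. (8.65)."""
    return 0.5 * U0(t - L / c) + 0.5 * U0(t + L / c)
```

For t < −L/c both terms vanish; for −L/c ≤ t < L/c only the advanced term U_0(t + L/c) contributes, and the held end begins to move exactly L/c ahead of the free end.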
A similar procedure can be used to demonstrate the boundary control of a Timo-
shenko beam. A good way to imagine this problem is to think of holding a fishing
rod at one end and, by applying displacements and/or rotations at the held end, to try
to achieve a specified displacement at the free end. By reversing the roles of space
and time, it is possible to show [1] that, if both the displacement and the rotation
at the held end are amenable to independent prescription, then the problem has a
unique solution, just like the case of the slinky. When the displacement at the held
end vanishes and only the rotation is amenable to prescription, the problem can also
be solved, but requires the solution of a recursive functional equation.

8.9 Source Terms and Duhamel’s Principle

Source terms give rise to the inhomogeneous wave equation in the form

u_tt − c² u_xx = f(x, t), (8.66)

that we have already encountered in deriving the equation for the vibrating string (8.5)
subjected to an external force. It is enough to show how to solve this inhomogeneous
problem with vanishing initial conditions and (for the finite case) vanishing boundary
conditions, since any other conditions can be restored by superposition with the
solution of the source-free case with any given initial and boundary conditions.
There are various ways to deal with the inhomogeneous wave equation.8 The
appeal of Duhamel’s principle is that it can be motivated on physical, as well as math-
ematical, grounds. Mathematically, this method can be regarded as a generalization to
linear PDEs of the method of variation of constants used in solving inhomogeneous
linear ODEs. For a more intuitively physical motivation, see Box 8.2.
Assume that the solution of the homogeneous problem is available for arbitrary
initial conditions. Moreover, the initial conditions could be specified, rather than just
at t = 0, at any reference time τ > 0 with the corresponding solution valid for t ≥ τ
and vanishing for t < τ . In particular, we denote by ū(x, t; τ ) the solution to the
homogeneous problem with the initial conditions

8 See, for example, [6] p. 51, [2] p. 103, [5] p. 221.



Box 8.2 The intuition behind Duhamel’s principle


Duhamel’s principle expresses the solution of an inhomogeneous PDE out of a clever continuous
superposition of solutions of homogeneous problems, for which the solution is assumed to be
known (either exactly or approximately). How is it possible to exchange a problem containing
sources with a collection of source-free problems? A non-rigorous way to motivate Duhamel’s
principle consists of conceiving the effect of the right-hand side sources as the sum of the effects
of sources with the same spatial distribution but which are constant in time over each one of a
large number of small time intervals of extent h = Δt. In other words, we view the graph of the
right-hand side of Eq. (8.66), representing the forcing function f (x, t), sliced, as a slab of cheese
divided into thin vertical slices, at regular time intervals of thickness h = Δt, as shown in the
figure.

[Figure: the forcing function f(x, t) shown whole and sliced into thin vertical slabs of width h = Δt.]

If one of these slices were acting alone starting at time τ for the duration h and then disappearing,
it would leave the system, initially at rest, with a certain velocity distribution. The effect of having
had the force acting for a short duration and then removed is, one claims, the same as not having
had the force at all but applying instead that velocity distribution as an ‘initial’ condition. A glance
at the PDE is convincing enough to conclude that the velocity in question is, to a first degree
of approximation, v(x, τ + h) = f (x, τ )h. A more particular argument is suggested in Exercise
8.13. Since our system is linear, it abides by the principle of superposition. The combined effect
of the individual slices of force is, therefore, the sum of the effects produced by the corresponding
homogeneous problems with the equivalent initial conditions of velocity. In the limit, as h → 0,
this sum tends to the integral propounded by Duhamel’s principle.

u(x, τ) = 0,  u_t(x, τ) = f(x, τ). (8.67)

According to Duhamel’s principle, the solution of the inhomogeneous equation (8.66)


is given by

u(x, t) = ∫_0^t ū(x, t; τ) dτ. (8.68)

Notice the subtlety that the expression ū(x, t; t) implies that t denotes the reference
time and, therefore, tautologically,

ū(x, t; t) = 0, (8.69)

identically for all t. Similarly,

ū_t(x, t; t) = f(x, t). (8.70)

To verify that the expression (8.68) actually satisfies the PDE (8.66), we evaluate

u_xx(x, t) = ∫_0^t ū_xx(x, t; τ) dτ. (8.71)

Moreover,

u_t(x, t) = ū(x, t; t) + ∫_0^t ū_t(x, t; τ) dτ = ∫_0^t ū_t(x, t; τ) dτ, (8.72)

where we have enforced (8.69). Finally, by virtue of (8.70),

t t
u tt (x, t) = ū t (x, t; t) + ū tt (x, t; τ )dτ = f (x, t) + ū tt (x, t; τ )dτ . (8.73)
0 0

The result follows directly from Eqs. (8.71) and (8.73).

Example 8.1 Solve the Cauchy problem

u_tt − c² u_xx = e^x,  t ≥ 0, (8.74)

with the initial conditions


u(x, 0) = u_t(x, 0) = 0. (8.75)

Solution: We need to find the solution ū(x, t; τ ) of the homogeneous problem


with ‘initial’ conditions

u(x, τ) = 0,  u_t(x, τ) = e^x. (8.76)

Using d’Alembert’s solution according to Fig. 8.14, we obtain it as

ū(x, t; τ) = (1/2c) ∫_{x−c(t−τ)}^{x+c(t−τ)} e^z dz = (e^x/c) sinh c(t − τ),  t ≥ τ. (8.77)

According to Duhamel’s principle the answer is



Fig. 8.14 Example of Duhamel's principle (the backward characteristics from (x, t) meet the line t = τ at x − c(t − τ) and x + c(t − τ))

u(x, t) = ∫_0^t ū(x, t; τ) dτ = (e^x/c) ∫_0^t sinh c(t − τ) dτ
        = −(e^x/c²) [cosh c(t − τ)]_{τ=0}^{τ=t} = (e^x/c²)(cosh ct − 1). (8.78)
Exercises

Exercise 8.1 Make sure that you can reproduce and reason through all the steps
leading to Eq. (8.13). In particular, discuss the interpretation of the two waves just
introduced. Which of the two functions f, g represents the advancing wave?

Exercise 8.2 Show in detail that the solution (8.43) coincides with (8.34) in the
original domain.

Exercise 8.3 (a) Use the extension technique to obtain and analyze the solution of
the semi-infinite string when the left end of the string is free to move but is forced to
preserve a zero slope at all times. Does the forward wave in the upper domain become
the reflection of a backward wave bouncing against the support? If so, does inversion
occur? (b) In terms of characteristics, place a characteristic diamond ABCD with
point A at the origin and point C directly above it. Show that, for the given boundary
condition, u_C = 2u_B − u_A.

Exercise 8.4 A violin string of length L, supported at both ends, is released from
rest. The initial displacement is known within a maximum point-wise error ε. Show
that the same uncertainty is to be expected at any subsequent time. Hint: use the
triangle inequality.

Exercise 8.5 A piano string of length L, supported at both ends, is struck at its
straight configuration. The imposed initial velocity is known within a maximum
point-wise error ε. Show that the point-wise uncertainty in the ensuing displacement
is expected to grow linearly as time goes on. Comment on the implications of this
result on any numerical method.

Exercise 8.6 Show that Eq. (8.57) implies that at a time equal to half the period,
the displacements are the inverted spatial mirror image of the initial displacements,
regardless of the initial velocities.

Exercise 8.7 Obtain the time periodicity of the solution strictly from a geometric
analysis of the characteristics. By the same method, obtain the result of Exercise 8.6.

Exercise 8.8 Show that for the case of a free end that maintains a zero slope while
the other end is fixed, the resulting motion is periodic in time. What is the period?
What is the shape after half a period?

Exercise 8.9 A string of a grand piano is made of steel (density = 7800 kg/m³) and
has a diameter of 0.5 mm. The string is subjected to a tension of 600 N and placed
between two supports 0.3 m apart. The impact of a hammer at time t = 0 can be
approximated by assuming a zero displacement function and a velocity given by the
function

v_0(x) = { 0 for 0 ≤ x < 0.18 m;  4 m/s for 0.18 m ≤ x ≤ 0.20 m;  0 for 0.20 m < x < 0.30 m }.

The origin of coordinates has been placed at one of the supports. Find the displace-
ment of the string after 0.15 ms at the point x = 0.10 m. At the same instant, indicate
those portions of the string (if any) experiencing a zero velocity. What is the period
of the motion?

Exercise 8.10 Write a computer code to handle the procedure described in Sect. 8.6.1
for any given boundary displacements and initial conditions. Apply it to the case of
fixed ends and verify the time periodicity of the solution.

Exercise 8.11 Write a computer code to carry out the algorithm described in
Sect. 8.7. Run the program with the initial condition u_0(x) = sin(πx/L). Check the
resulting shape as time goes on. Use various values of the ratio α/c < 1.

Exercise 8.12 A Slinky toy has been idealized (not very realistically) as an elastic
bar of length L = 1 m. This may happen when a child plastically deforms the slinky
to that lamentable state so that it can no longer be enjoyed. For small additional
deformations it still behaves elastically. The product E A of the modulus of elasticity
times the equivalent cross-sectional area is estimated at EA = 1 N and the mass per
unit length is ρA = 0.2 kg/m. Placing the toy on a (frictionless) horizontal table
to exclude gravity effects, determine the motion to be applied on one end so that
the resulting displacements on the other end are given by the function 0.05(1 −
cos ωt)H(t), where H(t) is the Heaviside step function. Plot the solution for various
values of ω in the range 1 < ω < 15. What happens when ω = 0.5π√5 s⁻¹ or any
of its odd multiples? Explain.

Exercise 8.13 (Duhamel’s principle unraveled) Each length element of the vibrating
string can ultimately be regarded as a mass attached to a spring of stiffness k somehow

representing the restoring (elastic) forces of the string. Consider this mass-spring
system at rest up to a time t = τ and then subjected to a force of intensity F(τ )
acting for a short interval of time h and then removed. Since the spring is at rest, in
the small interval of time h it will undergo a negligible displacement (of the order of
h 2 ) while the velocity, according to Newton’s second law, will undergo an increase of
F(τ )h/m, where m is the mass. Subject the system to an initial (at time τ ) velocity
F(τ )h/m and solve the homogeneous spring-mass equation m ẍ + kx = 0 for all
subsequent times. The result of this step should be
  
x̄(t; τ) = √(m/k) (F(τ)h/m) sin(√(k/m) (t − τ)),  t ≥ τ.

The superposition of all these solutions is, in the limit,

x(t) = lim_{h→0} Σ x̄(t; τ) = ∫_0^t √(m/k) (F(τ)/m) sin(√(k/m) (t − τ)) dτ.

Verify that this expression satisfies the differential equation m ẍ + kx = F(t) with
zero initial conditions. Carefully distinguish between the variables t and τ and, when
differentiating with respect to t, observe that it appears both in the integrand and in
the upper limit of the integral. A simpler example is provided by the particular case
F(t) = constant.
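For the particular case F(t) = constant mentioned at the end of the exercise, the Duhamel integral evaluates in closed form to (F/k)(1 − cos √(k/m) t), and the claim can be checked numerically by forming the residual of the equation of motion with a central-difference second derivative. The numerical values of m, k and F below are illustrative.

```python
import math

m, k, F = 2.0, 8.0, 3.0                  # illustrative values, not from the text
wn = math.sqrt(k / m)                    # natural frequency sqrt(k/m)

def x(t):
    """Duhamel integral for constant F, evaluated in closed form:
    x(t) = (F/k)(1 - cos(wn t))."""
    return F / k * (1.0 - math.cos(wn * t))

def residual(t, h=1e-4):
    """m x'' + k x - F, with x'' approximated by central differences."""
    xdd = (x(t + h) - 2.0 * x(t) + x(t - h)) / h ** 2
    return m * xdd + k * x(t) - F
```

The residual vanishes to within the finite-difference error, and x(0) = 0 with zero initial velocity, as required.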

References

1. Epstein M (2013), Notes on the flexible manipulator. arXiv:1312.2912


2. Garabedian PR (1964) Partial differential equations. Wiley, New York
3. John F (1982) Partial differential equations. Springer, Berlin
4. Knobel R (2000) An introduction to the mathematical theory of waves. American Mathematical
Society, Providence
5. Sneddon IN (1957) Elements of partial differential equations. McGraw-Hill (republished by Dover, 2006)
6. Sobolev SL (1989) Partial differential equations of mathematical physics. Dover, New York
Chapter 9
Standing Waves and Separation of Variables

In spite of being limited to the solution of only certain types of PDEs, the method
of separation of variables often provides an avenue of approach to large classes of
problems and affords important physical insights. An example of this kind is provided
by the analysis of vibration problems in Engineering. The separation of variables in
this case results in the resolution of a hyperbolic problem into a series of elliptic
problems. The same idea will be applied in another chapter to resolve a parabolic
equation in a similar manner. One of the main by-products of the method is the
appearance of a usually discrete spectrum of natural properties acting as a natural
signature of the system. This feature is particularly manifest in diverse applications,
from musical acoustics to Quantum Mechanics.

9.1 Introduction

We have witnessed, in Chap. 8, the ability of the d’Alembert decomposition to provide


solutions of the wave equation under a variety of boundary conditions. An alternative
procedure to handle these and other problems can be found in the method of sepa-
ration of variables. Among the attractive features of this method, we can mention:
(i) the physical meaning that, within a given context, can usually be attributed to var-
ious elements of the procedure; (ii) the possibility to attack problems in more than
one spatial dimension; and (iii) the applicability of the method beyond the realm of
hyperbolic equations to linear equations of other types.1 Nevertheless, it is important
to remark that the method of separation of variables is not universally applicable. It
is just one of the many techniques at our disposal to attempt to find solutions to linear

1 Historically,
in fact, the method was discovered by Jean Baptiste Joseph Fourier (1768–1830) in
his celebrated book Théorie Analytique de la Chaleur. Thus, its very first application lies within
the realm of parabolic equations.
© Springer International Publishing AG 2017 183
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_9

PDEs. We will first present the method within the context of the wave equation and
then, after introducing the concepts of eigenvectors and eigenvalues of differential
operators, we will proceed to discuss some applications of these ideas.

9.2 A Short Review of the Discrete Case

It may be a good idea at this point to review briefly the theory of vibrations of struc-
tures with a finite number of degrees of freedom. From the mathematical point of
view, as we know, these structures are governed by systems of ordinary (rather than
partial) differential equations. Nevertheless, when it comes to the method of separa-
tion of variables and its consequences, there exist so many commonalities between
the discrete and the continuous case that to omit their mention would constitute a
callous disregard for our ability to draw intellectual parallels. Moreover, numerical
methods for the solution of continuous dynamical systems usually consist of some
technique of discretization, whereby inertia and stiffness properties are lumped in
a finite number of points, thus resulting in an approximation by means of a system
with a finite number of degrees of freedom.
A system of a finite number of masses interconnected by means of linear elastic
springs2 moves in space, in the absence of external forces, according to a system of
linear ODEs that can be written as

K u + M ü = 0. (9.1)

In this equation K and M denote, respectively, the stiffness and mass (square) matrices
of the system, while u is the vector of kinematic degrees of freedom (whose number n,
therefore, determines the order of the matrices involved). Superimposed dots denote
derivatives with respect to the time variable t. The stiffness matrix K is symmetric
and positive semi-definite and the mass matrix M is symmetric and positive definite.
We recall that a square matrix K is said to be positive semi-definite if for all non-
vanishing vectors m (of the appropriate order) the following inequality is satisfied

m^T K m ≥ 0. (9.2)

A square matrix is positive definite, on the other hand, if the strict inequality applies
(that is >, rather than ≥) in Eq. (9.2). From a geometric point of view, one may
say, somewhat loosely, that the vector obtained by applying the matrix to any non-
zero vector never forms an obtuse angle with the original vector. In the case of
positive definiteness, the angle is always acute. In the case of the stiffness and mass

2 The linearity of the stiffness properties, which results in the linearity of the equations of motion,
is either an inherent property of the system or, alternatively, the consequence of assuming that the
displacements of the system are very small in some precise sense.

matrices we are discussing, these properties are consequences of reasonable physical


behaviour.
A solution of the equations of motion (9.1) is a vector u = u(t) which satisfies
Eq. (9.1) identically for a given time interval. To pin down a solution, we need to
specify initial conditions of position and velocity (at time t = 0, say). If we disregard
for the moment these initial conditions, we may ask ourselves the following question:
Are there any solutions that preserve their shape as time goes on? To explain this
property somewhat differently, assume that as time goes on we are taking snapshots of
the motion of the system. After each snapshot we record the ratios between the values
of the degrees of freedom (assuming that they are not all zero at that particular time).
We can call the collection of these ratios the shape of the present state of the system.
The motion will be shape-preserving if its shape (namely the very collection of those
ratios) is the same in each snapshot, no matter when it was taken. In other words,
there is a synchronicity of the motion: all degrees of freedom vanish simultaneously
and move, each in its own direction, in the same proportion until perhaps they all
simultaneously attain an extreme value at zero speed, whereby they begin to move
in the opposite direction. Now, we ask, what would the mathematical form of such
a solution (if it exists) be? The answer is quite simply the following

u(t) = U f (t). (9.3)

In this equation, the constant (i.e., time-independent) vector U represents precisely


the shape of the synchronous motion, whereas the function f (t) describes its evo-
lution in time. As time goes on, the shape is preserved and the only change is in the
value of the amplitude, as encapsulated in the function f (t).
It is a rather remarkable fact that, under such mild assumptions, the information
supplied is sufficient to calculate both the shape vector U and the time-evolution
function f (t) for all possible synchronous motions of the system. At the outset, it
is important to realize that these quantities are determined only up to an arbitrary
multiplicative factor, since the linearity and homogeneity of the system imply that a
multiple of any solution is itself a solution.
To obtain all possible shape-preserving motions, we introduce Eq. (9.3) into (9.1)
and immediately obtain
K U f + M U f̈ = 0. (9.4)

Dividing through by f , which is certainly not identically zero if we are looking for
a non-trivial solution, we obtain


K U = −(f̈/f) M U. (9.5)

By construction, the left-hand side of this equation is independent of time, and


so, therefore, must the right-hand side be. We conclude that the ratio f̈/f must be
constant! To determine the possible sign of this constant, we have at our disposal the
(semi-) positive definiteness of the matrices. Multiplying Eq. (9.5) to the left by the

transpose of the shape vector (which certainly cannot be the zero vector, since we
are only interested in non-trivial solutions), we obtain

f̈/f = − (U^T K U)/(U^T M U) ≤ 0. (9.6)

Let us investigate the case in which the equal sign holds. Notice that this is only
possible because of the semi-positive definiteness of the stiffness matrix. If this
matrix were positive definite, the ratio would certainly be negative. So, if the ratio
is actually zero, we obtain that the second derivative of f must vanish identically,
which means that the system is moving with a linearly growing amplitude. From
the physical point of view, this motion corresponds to the degrees of freedom of
the system as a rigid entity.3 In other words, for a system which has been properly
supported against rigid-body motions this situation cannot occur (and, in this case,
the stiffness matrix is positive definite). In any case, we can write


f̈/f = −ω². (9.7)

The scalar ω ≥ 0 is called a natural frequency of the system. Can it be arbitrary?


We will now discover (as, in some way, Pythagoras is said to have observed 25
centuries ago) that a linear dynamical system allows only a finite number of natural
frequencies, and that these frequencies are completely determined by the stiffness and
mass properties of the system. Moreover, to each of these frequencies a specific shape
(or shapes) can be associated, also determined by the stiffness and inertia properties.
Each of these shapes is called a normal mode of the system. These remarkable facts
(really, things cannot get much more beautiful than this, can they?) can be generalized
for continuous systems, as we shall see, thus forming the basis of musical acoustics
and quantum mechanics, among other disciplines.
Introducing Eq. (9.7) into (9.5) we get

(K − ω² M) U = 0. (9.8)
This is a homogeneous system of linear algebraic equations. If the coefficient matrix
K − ω² M were non-singular, the solution of this system would be unique, thus
implying that U = 0, a solution which we discard (the trivial solution). Therefore, a
non-trivial solution can only exist if the determinant of the coefficient matrix vanishes.
We are thus led naturally to a generalized eigenvalue problem for a symmetric real
matrix weighted by another matrix of the same kind. The solution of the (scalar)
determinant equation

det(K − ω² M) = 0, (9.9)
3 It is best here to think of the case of small rigid-body motions only.
also called the characteristic equation, requires the calculation of the roots of a
polynomial whose degree equals the number n of degrees of freedom of the system.
According to the fundamental theorem of algebra, this equation has exactly n roots
ω₁², …, ωₙ² (some of which may be repeated). In general, some or all of these roots
may be complex, but for the case of symmetric real matrices they are guaranteed
to be all real. Moreover, the (semi-) positive definiteness of the matrices involved
guarantees that the roots are non-negative (as we have already determined). Finally,
if ωᵢ² ≠ ωⱼ² and if Uᵢ, Uⱼ are respectively corresponding normal modes, the following
(weighted) orthogonality condition must be satisfied:

Uᵢᵀ M Uⱼ = 0. (9.10)
In the mathematical context, each root ωᵢ is a (generalized) eigenvalue of the
stiffness matrix weighted by the mass matrix, and the corresponding normal mode is a
(generalized) eigenvector. Each of the above-mentioned properties (and some more)
can be proved without too much mathematical effort.4 To summarize the main result,
we can establish that, given a linear dynamical system with n degrees of freedom,
characterized by Eq. (9.1), there exists a (not necessarily unique) set of n (weighted)
orthonormal eigenvectors (or normal modes of vibration) of the system. By
orthonormality we mean that

Uᵢᵀ M Uⱼ = δᵢⱼ, i, j = 1, …, n, (9.11)
where δᵢⱼ is the Kronecker symbol, equal to 1 if i = j and vanishing otherwise. Being
linearly independent, these vectors can be used as a basis of the vector space Rⁿ.
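As a concrete numerical aside (not part of the original text), the generalized eigenvalue problem (9.8) can be solved for a small made-up system with SciPy's `eigh`, which accepts the symmetric pair (K, M) directly and returns modes normalized exactly as in Eq. (9.11):

```python
import numpy as np
from scipy.linalg import eigh

# A made-up 2-degree-of-freedom system: two unit masses coupled to the
# supports and to each other by three unit springs.
K = np.array([[2.0, -1.0],
              [-1.0, 2.0]])   # stiffness matrix (symmetric, positive definite)
M = np.eye(2)                 # mass matrix

# eigh handles the generalized problem K U = omega^2 M U of Eq. (9.8)
omega2, U = eigh(K, M)        # eigenvalues in ascending order
omega = np.sqrt(omega2)       # natural frequencies, Eq. (9.7)

# The returned modes satisfy the orthonormality condition (9.11): U^T M U = I
gram = U.T @ M @ U
```

For this system the two natural frequencies come out as 1 and √3, and `gram` is the identity matrix to machine precision.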
Notice that a shape-preserving motion is necessarily of the form

u(t) = Uᵢ fᵢ(t), (9.12)

where the function fᵢ(t) is a solution of Eq. (9.7) with ω = ωᵢ. There are clearly
two possibilities. Either ωᵢ = 0, in which case we go back to the rigid-body motion
at constant speed, or ωᵢ > 0, in which case we obtain (for some arbitrary constants
Aᵢ, Bᵢ)

fᵢ = Aᵢ cos(ωᵢt) + Bᵢ sin(ωᵢt). (9.13)
The time dependence of the (non-rigid) normal modes is, therefore, harmonic.
To complete our review of the treatment of discrete systems, we will show how,
when some external forces f are applied in correspondence with the degrees of
freedom, or when some specific initial conditions are prescribed, the solution can
be represented in terms of the normal modes of the system. These concepts will
reappear in a more general form in the case of continuous systems.
Let the discrete system, still in the absence of external forces, be subjected to the
initial conditions
4 An excellent reference is [2].
u(0) = u₀, u̇(0) = v₀. (9.14)
Since our system is linear, the principle of superposition applies, as can be easily
verified. What we mean by this is that the sum of any two solutions of Eq. (9.1) is also
a solution. For this reason, we attempt to represent the solution u(t) corresponding
to the initial conditions (9.14) as a sum of independent shape-preserving motions of
the form (9.12). We are setting, therefore,

u(t) = Σᵢ₌₁ⁿ Uᵢ fᵢ(t) = Σᵢ₌₁ⁿ Uᵢ (Aᵢ cos(ωᵢt) + Bᵢ sin(ωᵢt)), (9.15)
where, for definiteness, we have assumed that there are no zero eigenvalues (i.e., all
rigid motions are prevented).5
The constants Aᵢ, Bᵢ will be adjusted, if possible, to satisfy the initial conditions.
Multiplying Eq. (9.15) to the left by Uₖᵀ M and invoking the orthonormality condition
(9.11) yields

Uₖᵀ M u(t) = Aₖ cos(ωₖt) + Bₖ sin(ωₖt). (9.16)
Enforcing the initial conditions (9.14), we immediately obtain

Aₖ = Uₖᵀ M u₀, (9.17)

ωₖ Bₖ = Uₖᵀ M v₀. (9.18)

This solves the initial-value problem completely.6
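The projection formulas (9.17) and (9.18) amount to a few matrix products. A hedged sketch, using a made-up two-mass system (the matrices and initial data are illustrative, not from the text), confirms that the modal sum (9.15) evaluated at t = 0 reproduces the prescribed initial data:

```python
import numpy as np
from scipy.linalg import eigh

# A made-up 2-degree-of-freedom system (unit masses, three unit springs)
K = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
M = np.eye(2)

omega2, U = eigh(K, M)      # modes normalized so that U^T M U = I, Eq. (9.11)
omega = np.sqrt(omega2)

u0 = np.array([1.0, 0.0])   # prescribed initial displacement u(0)
v0 = np.array([0.0, 1.0])   # prescribed initial velocity du/dt at t = 0

# Eqs. (9.17)-(9.18): project the initial data on each normal mode
A = U.T @ M @ u0
B = (U.T @ M @ v0) / omega

# Evaluate the modal expansion (9.15) and its time derivative at t = 0
u_check = U @ A             # sum over k of U_k A_k
v_check = U @ (omega * B)   # sum over k of U_k omega_k B_k
```

Because the modes form a basis of Rⁿ, `u_check` and `v_check` recover `u0` and `v0` exactly.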
Assume now that, in addition to the initial conditions, some external forces f are
specified in correspondence with the degrees of freedom of the system. The equations
of motion now become

K u + M ü = f. (9.19)
We limit our analysis to the case in which these external forces are harmonic, for
example of the form

f = f₀ sin(ωt). (9.20)
We look only for a particular solution uₚ(t) of Eq. (9.19), since the general solution
of the homogeneous equation is already available via the previous treatment. We try
a solution of the form

uₚ(t) = Uₚ sin(ωt). (9.21)
5 The treatment for the general case is identical, except for the fact that normal modes of the form
Ai + Bi t must be included.
6 In Eq. (9.18) the summation convention does not apply.
Substitution into Eq. (9.19) yields

(K − ω² M) Uₚ = f₀. (9.22)
We conclude that, except in the case in which the frequency of the external load
happens to coincide with one of the natural frequencies of the system, a particular
solution of the form (9.21) is uniquely determined. The exceptional case is called
resonance and it results in steadily increasing amplitudes of the response with the
consequent disastrous effects. A nice exercise is to express the vector of the external
forces in terms of the eigenvector basis and then determine the components of the
particular solution in the same basis one by one. The case of a general (not necessarily
harmonic) periodic force can also be handled by similar methods, but it is best treated
together with the continuous case.
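The behaviour near resonance predicted by Eq. (9.22) is easy to visualize numerically. In the sketch below (a hypothetical two-mass system; all numbers are illustrative only), the amplitude of the particular solution blows up as the forcing frequency approaches the lowest natural frequency:

```python
import numpy as np
from scipy.linalg import eigh, solve

# A hypothetical 2-degree-of-freedom system (unit masses, unit springs)
K = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
M = np.eye(2)
omega_n = np.sqrt(eigh(K, M)[0])    # natural frequencies: 1 and sqrt(3)

f0 = np.array([1.0, 0.0])           # amplitude vector of the harmonic load

def amplitude(omega):
    """Solve Eq. (9.22) for the amplitude U_p of the particular solution (9.21)."""
    return solve(K - omega**2 * M, f0)

norm_far = float(np.linalg.norm(amplitude(0.5)))     # forcing far below omega_1
norm_near = float(np.linalg.norm(amplitude(0.999)))  # forcing just below omega_1 = 1
growth = norm_near / norm_far   # response amplitude grows sharply near resonance
```

As ω approaches ω₁ the coefficient matrix K − ω²M becomes singular and `growth` becomes arbitrarily large, which is the discrete picture of resonance described above.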
9.3 Shape-Preserving Motions of the Vibrating String
Although we intend to deal with more general situations, in this section we will
devote our attention to the wave equation that has already occupied us in Chap. 8,
namely,

uₜₜ = c² uₓₓ, (9.23)

with the boundary conditions
u(0, t) = u(L, t) = 0, t ≥ 0. (9.24)
For now, we do not specify any initial conditions. As we know, Eqs. (9.23) and (9.24)
describe the small transverse deflections of a uniform string of length L, supported
at its ends, in the absence of external loading. A shape-preserving (or synchronous)
motion is a solution of this equation of the form

u(x, t) = U(x) f(t). (9.25)
In this equation, U (x) represents the shape that is preserved as time goes on. A
solution of this type is sometimes also called a standing wave. The method used
to find a standing wave solution is justifiably called separation of variables. To see
whether there are standing-wave solutions of the wave equation, we substitute the
assumption (9.25) in the wave equation (9.23) and obtain

U (x) f¨(t) = c2 U  (x) f (t). (9.26)


We are adopting the standard notation for time and space derivatives of functions of
one variable. Following the lead of the treatment of discrete systems, we now isolate
(separate) all the time-dependent functions to one side of the equation and get

f̈(t)/f(t) = c² U″(x)/U(x). (9.27)

We conclude that each of the sides of this equation must be a constant, since a function
of one variable cannot possibly be identical to a function of a different independent
variable. This constant may, in principle, be of any sign. Anticipating the final result,
we will presently assume that it is negative. Thus, we write

f̈(t)/f(t) = c² U″(x)/U(x) = −ω². (9.28)
We conclude, therefore, that the time-dependence of our shape-preserving candidate
is harmonic, just as in the discrete case, that is,

f(t) = A cos(ωt) + B sin(ωt). (9.29)
On the other hand, Eq. (9.28) also implies that (since c is a constant) the shape itself
is harmonic, viz.,

U(x) = C cos(ωx/c) + D sin(ωx/c). (9.30)

We still need to satisfy the boundary conditions. It follows from Eqs. (9.24) and
(9.25) that, regardless of the value of ω, the boundary condition at x = 0 implies
that

C = 0. (9.31)
The boundary condition at the other end results in

D sin(ωL/c) = 0. (9.32)
Since we are looking for a non-trivial solution, we must discard the possibility D = 0.
We conclude that non-trivial solutions do exist and that they exist only for very par-
ticular values of ω, namely, those values that render the sine function zero in expres-
sion (9.32). These values (again excluding the one leading to the trivial solution) are
precisely

ωₖ = kπc/L, k = 1, 2, … (9.33)
We have thus obtained the surprising result that there exists an infinite, but
discrete, spectrum of natural frequencies of the vibrating string corresponding to
shape-preserving vibrations. The corresponding shapes, or normal modes, are
sinusoidal functions whose half-periods are exact integer divisors of the length of the
string. The fact that in this case the frequencies of the oscillations turned out to be
exact multiples of each other, is the physical basis of musical aesthetics (at least
until now …).
Notice that our assumption that the constant in Eq. (9.28) had to be negative is
now amply justified. Had we assumed a non-negative constant, we would have been
unable to satisfy both boundary conditions. In the case of the discrete system, there
are no boundary conditions and the selection of natural frequencies is entirely based
on the fact that the determinant (characteristic) equation, being polynomial, has a
finite number of roots. In the continuous case, the selection of frequencies is mediated
by the boundary conditions. By extension, the natural frequencies of the continuous
case are also called eigenvalues of the corresponding differential operator and the
normal modes of vibration are its eigenvectors.7 Putting back together the spatial
and temporal parts, we can express any shape-preserving solution in the form

uₖ(x, t) = sin(ωₖx/c) (Aₖ cos(ωₖt) + Bₖ sin(ωₖt)). (9.34)
Just as in the discrete case, the normal modes of vibration satisfy an orthogonality
condition. Indeed, consider two different natural frequencies ωᵢ ≠ ωⱼ and the
corresponding normal modes

Uᵢ = sin(ωᵢx/c), Uⱼ = sin(ωⱼx/c). (9.35)
Using the trigonometric identity

2 sin α sin β = cos(α − β) − cos(α + β), (9.36)
it is not difficult to obtain the orthogonality condition

∫₀ᴸ Uᵢ Uⱼ dx = ∫₀ᴸ sin(iπx/L) sin(jπx/L) dx = (L/2) δᵢⱼ. (9.37)
The integration of the product of two functions over the length of the domain plays,
therefore, the role of a dot product in the space of functions, as we shall see later
again.
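The orthogonality relation (9.37) can be spot-checked by quadrature. The sketch below (the values of L and the grid size are illustrative choices, not from the text) applies the trapezoidal rule to products of sine modes:

```python
import numpy as np

L = 2.0                        # string length (illustrative value)
x = np.linspace(0.0, L, 4001)  # fine grid for the trapezoidal rule
dx = x[1] - x[0]

def mode(k):
    """Normal mode U_k(x) = sin(k pi x / L) of the uniform string."""
    return np.sin(k * np.pi * x / L)

def inner(i, j):
    """Trapezoidal approximation of the integral of U_i U_j over [0, L]."""
    f = mode(i) * mode(j)         # integrand vanishes at both ends,
    return float(np.sum(f) * dx)  # so sum * dx equals the trapezoidal rule

same = inner(3, 3)    # Eq. (9.37) predicts L/2 = 1.0
cross = inner(2, 5)   # Eq. (9.37) predicts 0
```

This "dot product of functions" behaves just like the weighted dot product (9.11) of the discrete case.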
7 We content ourselves with pointing out these similarities. In fact these similarities run even deeper,

particularly when we regard the underlying differential equations as linear operators on an infinite-
dimensional vector space of functions, just as a matrix is a linear operator on a finite-dimensional
space of vectors.
9.4 Solving Initial-Boundary Value Problems by Separation of Variables
We have found all the possible shape-preserving solutions of the one-dimensional
wave equation. The method of separation of variables (which is precisely the
mathematical expression of shape preservation) provided us with a harmonic time
dependence, while the space dependence was settled by finding all the non-trivial solutions
of the ODE

U″(x) = −(ω²/c²) U(x). (9.38)
It is interesting to compare this equation with its counterpart for the discrete
case, Eq. (9.8). We start by noticing that even if the density of the string had been a
function of position, the separation of variables would have resulted in a harmonic
time dependence, while Eq. (9.38) would have included the variable density. We will
study this case in some detail later, but we remark at this point that this variable
density would have been the counterpart of the mass matrix of the discrete case.
In the discrete case, we were in possession of a linear operator K which produced
vectors out of vectors (forces out of displacements, say). Now we have a linear oper-
ator on a space of functions. Functions defined over the same domain can be added
together (point-wise) and multiplied by scalars (also point-wise). These operations
endow a space of functions of a given class (say C ∞ ) with the structure of a vector
space. The operator on the left-hand side of Eq. (9.38) consists of taking the second
derivative of a function to produce another function. This operation is clearly linear
(the derivative of a sum of functions is equal to the sum of the derivatives, and so on).
The main difference between the discrete and the continuous case is that the space
of functions under consideration is an infinite-dimensional vector space.
The problem posed by Eq. (9.38) can be stated as follows: Find those functions
(read: ‘vectors’) which, when used as an input for our operator (the second derivative)
result in an output proportional to the input. But this is precisely the statement of
an eigenvalue problem! When we solved this problem for the discrete case, we
found that there was a finite number of eigenvalues and that one can construct an
orthogonal basis of eigenvectors. In the continuous case, we found that the number
of eigenvalues is (countably) infinite and that the corresponding eigenvectors (which
we called normal modes) are orthogonal, in the sense of Eq. (9.37).
Our claim now is that, in some precise sense, these normal modes can be under-
stood as a basis of the infinite-dimensional space of functions under consideration.
By this we mean that: (i) the normal modes are linearly independent; and (ii) every
smooth enough function that vanishes at the ends of the interval under consideration
can be expressed as an (infinite) linear combination of the normal modes, with coef-
ficients to be determined uniquely for each function. In other words, we claim that
‘any’ function F(x) defined in the interval [0, L] and vanishing at its ends can be
represented uniquely as a series of the form

F(x) = Σₙ₌₁^∞ Dₙ Uₙ(x), (9.39)
where the numbers Dn are the components of the representation. This bold statement
must be justified in precise mathematical terms. In this book, however, we will adopt
it as an act of faith.
For the case of the wave equation that we have just considered, the normal modes
are sine functions and the expansion (9.39) reduces to a particular case of the Fourier
series, which we will encounter in Chap. 10. The fully-fledged Fourier series includes
also terms involving the cosine function and it can be used to represent ‘any’ periodic
function (or ‘any’ function defined over a finite interval that has been extended to
a periodic function over the whole line). In more general cases, such as that of a
non-uniform string that we will study next, the normal modes (or eigenfunctions) are
no longer sines or cosines, but the validity of the representation (9.39) in terms of
normal modes is preserved. These topics are known as Sturm–Liouville eigenvalue
problems and are discussed in mathematical textbooks.8 The more general proofs of
convergence pertain to the field of functional analysis.
From the physical point of view, we may say that nature has endowed the vibrat-
ing string (and, in fact, all elastic systems) with a preferred set of shape-preserving
vibrations, each of which oscillates with a definite frequency. All these natural
frequencies constitute the characteristic spectrum of the system. The remarkable
fact is that arbitrary periodic vibrations of the system can be expressed in terms of
these natural modes of vibration. Thus, any periodic vibration of the system can be
analyzed in terms of its spectral components. An intuitive grasp of these facts was
perhaps somehow obtained by Pythagoras over 25 centuries ago. It is worth pointing
out that the contribution of Louis de Broglie to the understanding of the quantum
mechanical model of the atom runs along similar lines.
Let F(x) be a function (vanishing at the ends of the interval) for which we want
to find the representation (9.39). In other words, we are interested in obtaining the
value of the coefficient Dk for each and every k. Multiplying both sides of (9.39) by
Uk (x), integrating both sides of the resulting equation over the interval [0, L] and
invoking the orthogonality conditions (9.37), we obtain the following surprisingly
simple result

Dₖ = (2/L) ∫₀ᴸ F(x) Uₖ(x) dx. (9.40)
The orthogonality conditions played a major role in the de-coupling of the formulas
for the individual coefficients, just as they do in the discrete case (the components of a
vector in an orthonormal basis are simply the dot products by the base vectors). There
is here a subtlety, however, that we must mention. It has to do with the fact that we
have assumed that the series on the right-hand side of (9.39) can be integrated term by
term. Moreover, we have not specified in what precise sense the series converges to

8 A short treatment can be found in [1], p. 291.
the given function, if it does indeed converge. These and other similar issues (which
are not very difficult to understand) are outside the scope of these notes.
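To make the expansion (9.39) and the coefficient formula (9.40) concrete, the sketch below expands an assumed triangular "plucked string" profile (an illustrative shape, not one from the text) in the sine modes of the uniform string and watches the partial sums converge:

```python
import numpy as np

L = 1.0
x = np.linspace(0.0, L, 2001)
dx = x[1] - x[0]

# Assumed example shape: a triangular "pluck", zero at both ends
F = np.minimum(x, L - x)

def coeff(k):
    """Coefficient D_k of Eq. (9.40), by the trapezoidal rule."""
    g = F * np.sin(k * np.pi * x / L)   # integrand vanishes at the ends
    return float(2.0 / L * np.sum(g) * dx)

def partial_sum(n_terms):
    """Truncated expansion (9.39) with n_terms modes."""
    s = np.zeros_like(x)
    for k in range(1, n_terms + 1):
        s += coeff(k) * np.sin(k * np.pi * x / L)
    return s

err3 = float(np.max(np.abs(partial_sum(3) - F)))
err30 = float(np.max(np.abs(partial_sum(30) - F)))  # smaller with more modes
```

The maximum reconstruction error shrinks as modes are added, which is the act-of-faith statement (9.39) at work on a particular function.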
Let us assume that the vibrating string is subjected at time t = 0 to a displacement
u(x, 0) = F(x) and a velocity u t (x, 0) = G(x), both vanishing at the ends of the
string (which remain fixed during the interval of time under consideration). These
two functions can be expanded in terms of the normal modes as

F(x) = Σₙ₌₁^∞ Dₙ Uₙ(x), (9.41)

G(x) = Σₙ₌₁^∞ Eₙ Uₙ(x). (9.42)
The coefficients Dₙ, Eₙ can be obtained by the prescription (9.40) applied to the
respective functions. We now represent the solution to the wave equation under the
given initial conditions in the form

u(x, t) = Σₙ₌₁^∞ Uₙ(x) (Aₙ cos(ωₙt) + Bₙ sin(ωₙt)). (9.43)
Put differently, we expand the solution in terms of the normal modes, each oscillating
at its characteristic frequency. Our task is to determine all the constants An , Bn . At
the initial time, Eq. (9.43) yields

u(x, 0) = Σₙ₌₁^∞ Uₙ(x) Aₙ (9.44)

and

uₜ(x, 0) = Σₙ₌₁^∞ Uₙ(x) ωₙ Bₙ. (9.45)
In obtaining this last equation we have assumed that the series can be differentiated
term by term. Comparing these results with those of Eqs. (9.41) and (9.42),
respectively, and recalling that the coefficients of the expansion in terms of normal modes
are unique, we conclude that

Aₙ = Dₙ (9.46)

and

Bₙ = Eₙ/ωₙ. (9.47)
This completes the solution of the problem.
The method just used is based on normal-mode superposition. It relies on the fact
that for a homogeneous linear equation with homogeneous boundary conditions, the
sum of two solutions is again a solution, and so is the product of any solution by a
constant. In engineering applications (for example in structural engineering) this fact
constitutes the so-called principle of superposition. It is not a principle of nature, but
rather a property of linear operators.
If we consider a non-homogeneous situation (either because there is an external
production, such as a force acting on a structure, or because the boundary conditions
are not homogeneous, such as a prescribed displacement or slope at the end of a
beam), it is easy to show (by direct substitution) that the difference between any
two solutions of the non-homogeneous case is necessarily a solution of the homoge-
neous equation with homogeneous boundary conditions. It follows that the solution
of a non-homogeneous problem, with prescribed initial conditions, can be obtained
by adding any one particular solution of the non-homogeneous problem (regard-
less of initial conditions) to the general solution of the homogeneous problem. The
adjustable constants of the homogeneous solution can then be determined so as to
satisfy the initial conditions for the sum thus obtained.
To illustrate these ideas, let us consider first the case in which the string has been
loaded by means of a periodic load of the form

q = Q(x) sin(ωt). (9.48)
The boundary conditions are still the vanishing of the displacements at the ends of
the string. Recall that the PDE governing the situation is given by Eq. (8.5), viz.,

uₜₜ − c² uₓₓ = q, (9.49)
which is a non-homogeneous version of the one-dimensional wave equation. We
need any particular solution of this equation, so we try a solution of the form

uₚ(x, t) = Uₚ(x) sin(ωt). (9.50)
Introducing this assumption in (9.49) and using (9.48), we obtain

−ω² Uₚ − c² Uₚ″ = Q. (9.51)
This equation should be compared with its discrete counterpart, Eq. (9.22). We have
obtained a non-homogeneous ODE, which can be solved by any means. From the
physical point of view, however, it is interesting to find a particular solution of this
ODE by making use of a normal mode superposition. We express the solution and
the loading, respectively, as

Uₚ(x) = Σₙ₌₁^∞ Dₚₙ Uₙ, (9.52)

Q(x) = Σₙ₌₁^∞ Hₙ Uₙ. (9.53)
We now assume that the second derivative of Eq. (9.52) can be carried out term
by term, and use Eq. (9.38) for the normal modes to obtain

Uₚ″(x) = Σₙ₌₁^∞ Dₚₙ Uₙ″ = −Σₙ₌₁^∞ (ωₙ²/c²) Dₚₙ Uₙ. (9.54)
Combining Eqs. (9.51)–(9.54), we get

Dₚₙ = Hₙ/(ωₙ² − ω²). (9.55)
Clearly, this solution is applicable only if the frequency of the applied harmonic
force does not happen to coincide with any of the natural frequencies of the system
(in which case we have the phenomenon of resonance, with the amplitude of the
response increasing steadily in time).
If the forcing function is not harmonic (or even periodic), we can still make use
of the normal mode decomposition to find a particular solution of (9.49). We assume
now that the coefficients of the expansions (9.52) and (9.53) are functions of time,
namely,

uₚ(x, t) = Σₙ₌₁^∞ D̂ₙ(t) Uₙ(x), (9.56)

q(x, t) = Σₙ₌₁^∞ Ĥₙ(t) Uₙ(x). (9.57)
Note that the coefficients in Eq. (9.57) can be calculated, instant by instant, by the
formula

Ĥₖ(t) = (2/L) ∫₀ᴸ q(x, t) Uₖ(x) dx. (9.58)
Similarly, the coefficients of the yet to be determined particular solution are given by

D̂ₖ(t) = (2/L) ∫₀ᴸ uₚ(x, t) Uₖ(x) dx. (9.59)
Let us multiply Eq. (9.49) through by Uₖ(x) and integrate over the string to get

∫₀ᴸ (∂²uₚ/∂t² − c² ∂²uₚ/∂x²) Uₖ(x) dx = ∫₀ᴸ q Uₖ(x) dx. (9.60)
Integrating by parts, and taking into consideration that the boundary terms vanish, we
use Eq. (9.38) to write

c² ∫₀ᴸ (∂²uₚ/∂x²) Uₖ(x) dx = c² ∫₀ᴸ uₚ Uₖ″(x) dx = −ωₖ² ∫₀ᴸ uₚ Uₖ dx. (9.61)
Moreover, differentiating (9.56) term by term yields

∂²uₚ/∂t² = Σₙ₌₁^∞ (d²D̂ₙ/dt²) Uₙ(x). (9.62)
Putting all these results together, Eq. (9.60) implies that

d²D̂ₖ/dt² + ωₖ² D̂ₖ = Ĥₖ. (9.63)
This is an ODE for the determination of the time-dependent coefficients of the
particular solution of the PDE. Clearly, we only need a particular solution of this ODE.
Notice that in the case in which the time dependence of Ĥₖ happens to be harmonic,
we recover the solution given by (9.55). Otherwise, we can use, for example, the
Duhamel integral formula, viz.,

D̂ₖ(t) = (1/ωₖ) ∫₀ᵗ Ĥₖ(τ) sin(ωₖ(t − τ)) dτ. (9.64)
The complete solution of the non-homogeneous problem is thus given by

u(x, t) = Σₙ₌₁^∞ Uₙ(x) (D̂ₙ(t) + Aₙ cos(ωₙt) + Bₙ sin(ωₙt)). (9.65)
The constants An , Bn can be adjusted to fit the initial conditions.
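The Duhamel formula (9.64) can be validated against a case that integrates in closed form: for a step modal load Ĥₖ(t) = H₀ the integral gives D̂ₖ(t) = H₀(1 − cos ωₖt)/ωₖ². The sketch below (the frequency, amplitude and time are arbitrary test values) checks this numerically:

```python
import numpy as np

omega_k = 3.0    # an arbitrary natural frequency
H0 = 2.0         # amplitude of the step modal load H_k(t) = H0
t_end = 1.7      # evaluation time

# Duhamel integral of Eq. (9.64), by the trapezoidal rule
tau = np.linspace(0.0, t_end, 20001)
dtau = tau[1] - tau[0]
g = H0 * np.sin(omega_k * (t_end - tau))
D_num = float((np.sum(g) - 0.5 * (g[0] + g[-1])) * dtau / omega_k)

# Closed form for a step load: D_k(t) = H0 (1 - cos(omega_k t)) / omega_k^2
D_exact = float(H0 * (1.0 - np.cos(omega_k * t_end)) / omega_k**2)
err = abs(D_num - D_exact)
```

For a general load history one simply evaluates the quadrature at each time of interest, one mode at a time.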
9.5 Shape-Preserving Motions of More General Continuous Systems

In the analysis of the vibration of general linearly elastic systems the method of
separation of variables leads to the separation of the (linear) problem into a trivial
temporal part and a spatial part that is governed by a PDE (or system thereof) of the
elliptic type. We start with the special case of a single spatial dimension.
9.5.1 String with Variable Properties
Consider the case of a string, such as that of a musical instrument, with a smoothly
varying cross section with area A = A(x). In accordance with Eq. (8.5), the governing
equation is

uₜₜ − c(x)² uₓₓ = q, (9.66)

where

c(x) = √(T/(ρA(x))). (9.67)
In other words, the mass per unit length of the string is a function of position along
the string. We ask the same question as in the case of the uniform string: Are there
shape preserving solutions of the homogeneous equation? If so, what is the precise
shape of these solutions and how does their amplitude vary with time? We proceed
in exactly the same manner as before, namely, we substitute the shape-preservation
assumption

u(x, t) = U(x) f(t) (9.68)

into (the homogeneous version of) Eq. (9.66) and obtain

f̈(t)/f(t) = c(x)² U″(x)/U(x). (9.69)
Just as in the case of Eq. (9.27), we reason that both sides of this equation must
necessarily be constant. Assuming this constant to be negative (for the same reasons
as before, which are justified a posteriori) we obtain that the time variation of the
normal modes is again necessarily harmonic, that is,

f(t) = a cos(ωt) + b sin(ωt). (9.70)
For the shape of the normal modes, however, we obtain the ODE

U″(x) = −(ω²/c(x)²) U(x). (9.71)
If we compare this equation with the discrete counterpart (9.8), we see that the
mass matrix corresponds to the variable mass density per unit length. We need now
to ascertain the existence of non-trivial solutions of Eq. (9.71) satisfying the given
homogeneous boundary conditions. Without entering into the subtleties of the Sturm-
Liouville theory, we can convince ourselves that such solutions exist by the following
intuitive argument.
The left-hand side of Eq. (9.71) is (roughly) a measure of the curvature of the
solution. In this sense, Eq. (9.71) tells us that the curvature of a solution (a normal
mode) is proportional to the function itself, and that the constant of proportionality
must be, point by point, negative. What this means is that if, starting from the left end
we assume the solution to move upwards, the curvature will be negative, and thus
it will bring us downwards, and vice-versa. So, let us assume that we choose some
candidate value for the natural frequency. We may be lucky and hit the other end of
the string. If we don’t, however, we can change the value of ω gradually until we do
hit the other end. That we will always be able to do so is guaranteed by the fact that
the x-axis attracts the solution towards it, according to our curvature interpretation.
Moreover, once we find a value of the frequency which satisfies the condition of
hitting the other boundary, we can increase it gradually until we hit the far end again.
Every time we do this, we add another half wave to the shape of the solution. This
hit-and-miss argument can, of course, be formalized into a proof and, perhaps more
importantly, into a numerical algorithm to find the normal modes.9
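The hit-and-miss argument just described is, in numerical terms, a shooting method. The sketch below (an illustration, not code from the text) integrates Eq. (9.71) by a classical Runge–Kutta step with U(0) = 0, U′(0) = 1, and bisects on ω until the solution "hits the other end"; for a uniform string it recovers ω₁ = πc/L, and any smooth profile can be substituted for c(x):

```python
import math

L = 1.0   # string length (illustrative)

def c(x):
    """Wave speed profile; uniform here, but any smooth c(x) works."""
    return 1.0

def endpoint(omega, n=2000):
    """Shoot: integrate U'' = -(omega/c(x))**2 U from x = 0 with
    U(0) = 0, U'(0) = 1 by classical RK4, and return U(L)."""
    h = L / n
    x, U, V = 0.0, 0.0, 1.0

    def rhs(x, U, V):
        return V, -(omega / c(x)) ** 2 * U

    for _ in range(n):
        k1 = rhs(x, U, V)
        k2 = rhs(x + h / 2, U + h / 2 * k1[0], V + h / 2 * k1[1])
        k3 = rhs(x + h / 2, U + h / 2 * k2[0], V + h / 2 * k2[1])
        k4 = rhs(x + h, U + h * k3[0], V + h * k3[1])
        U += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        V += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        x += h
    return U

# Bisect on omega until the shot hits the far end, U(L) = 0
lo, hi = 2.0, 4.0          # endpoint() changes sign on this bracket
f_lo = endpoint(lo)
for _ in range(60):
    mid = 0.5 * (lo + hi)
    f_mid = endpoint(mid)
    if f_lo * f_mid <= 0.0:
        hi = mid
    else:
        lo, f_lo = mid, f_mid
omega1 = 0.5 * (lo + hi)            # fundamental frequency of the string
err = abs(omega1 - math.pi)         # exact value for this uniform case: pi*c/L
```

Raising the bracket to the next sign change of `endpoint` yields ω₂, ω₃, …, each adding one more half wave to the mode shape, exactly as the curvature argument predicts.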
In conclusion, although for a general non-uniform string we no longer have sinu-
soidal normal modes, the normal modes have a wave-like appearance. The natural
frequencies, moreover, will no longer be integer multiples of each other. As a result,
our ear will perceive the various frequencies as dissonant with respect to each other.
That is why guitar and violin strings have a constant cross section. In the case of the
drum, however, because of the two-dimensionality of the membrane, the frequen-
cies are not integer multiples of each other even in the case of constant thickness.
Hence follows the typical dissonant sound of drums and cymbals. Be that as it may,
we will now prove that the normal modes (eigenfunctions) satisfy a generalized (or
weighted) orthogonality condition, just as in the case of the discrete system.
Let Uₘ, Uₙ be two normal modes corresponding, respectively, to two different
natural frequencies ωₘ, ωₙ. We have, therefore,

Uₘ″(x) = −(ωₘ²/c(x)²) Uₘ(x), (9.72)

Uₙ″(x) = −(ωₙ²/c(x)²) Uₙ(x). (9.73)

9 See Exercise 9.6.
Multiplying Eq. (9.72) by Uₙ and Eq. (9.73) by Uₘ, subtracting the results and
integrating over the length of the string yields

∫₀ᴸ (Uₙ Uₘ″ − Uₘ Uₙ″) dx = −(ωₘ² − ωₙ²) ∫₀ᴸ (Uₘ Uₙ)/c(x)² dx. (9.74)
Integrating by parts the left-hand side of this equation, however, and implementing the
boundary conditions, we conclude that it must vanish. Since the natural frequencies
were assumed to be different, it follows that

∫₀ᴸ (Uₘ Uₙ)/c(x)² dx = 0. (9.75)
This is the desired generalized orthogonality condition. Since the normal modes are
determined up to a multiplicative constant, we may choose to normalize them by
imposing, without any loss of generality, the extra condition

∫₀ᴸ (Uₘ Uₙ)/c(x)² dx = δₘₙ. (9.76)
Given a function F(x) that vanishes at the ends of the string, we can express it in
terms of the normal modes as

F(x) = Σₙ₌₁^∞ Dₙ Uₙ(x). (9.77)
The coefficients of this expansion can be obtained by multiplying through by
Uₖ/c(x)² and integrating term by term. Invoking our orthogonality conditions, we
obtain

Dₖ = ∫₀ᴸ F(x) Uₖ(x)/c(x)² dx. (9.78)
From here on, the treatment of the non-uniform string is identical in all respects to
that of the uniform string, provided that one takes into consideration the new normal
modes and their generalized orthogonality condition.10
10 In particular, the Duhamel integral will have to be expressed differently as compared to the uniform case.
9.5.2 Beam Vibrations
The equations of motion of a Bernoulli beam, which we have already encountered
in Sect. 7.1.4.1, can be reduced to the single fourth-order PDE

(EI uₓₓ)ₓₓ = −q − ρA uₜₜ, (9.79)
in which u = u(x, t) denotes the (small) transverse displacement. The free vibration
of a beam of constant properties is, therefore, governed by the homogeneous equation

c⁴ uₓₓₓₓ + uₜₜ = 0, (9.80)

with

c⁴ = EI/(ρA). (9.81)
Setting

u(x, t) = U(x) f(t), (9.82)

yields

−c⁴ U⁗/U = f̈/f = −ω². (9.83)
As before, we conclude that the time dependence of any shape-preserving motion
is necessarily harmonic, with a frequency ω to be determined from the boundary
conditions of the spatial problem. Some of the cases of practical interest are obtained
by assuming each one of the ends x = 0 or x = L to be pinned (U = U″ = 0),
clamped (U = U′ = 0) or free (U″ = U‴ = 0). The general solution of the spatial
equation

U⁗ − γ⁴ U = 0, (9.84)

where

γ = √ω / c, (9.85)

can be expressed as

U(x) = A sin γx + B cos γx + C sinh γx + D cosh γx. (9.86)
The four integration constants A, B, C, D are interrelated linearly by the imposition
of the four boundary conditions corresponding to each type of support at the two ends of
the beam. The resulting linear system is homogeneous and, therefore, its nontrivial
solutions are found by equating the determinant of its coefficient matrix to zero.
This equation delivers the natural frequency spectrum, as shown in the example of
Box 9.1.
Box 9.1 The tuning fork


The tuning fork, in use since the early eighteenth century, is a U-shaped metal bar whose clearly
audible fundamental (lowest) natural frequency is used as a reference to tune musical instru-
ments. Each tine of the fork, of length L, can be considered as a Bernoulli beam with one end
clamped and the other end free. Imposition of the corresponding boundary conditions results in
the homogeneous system
⎡      0            1            0            1      ⎤ ⎧ A ⎫   ⎧ 0 ⎫
⎢      γ            0            γ            0      ⎥ ⎪ B ⎪   ⎪ 0 ⎪
⎢ −γ² sin γL   −γ² cos γL   γ² sinh γL   γ² cosh γL ⎥ ⎨ C ⎬ = ⎨ 0 ⎬
⎣ −γ³ cos γL    γ³ sin γL   γ³ cosh γL   γ³ sinh γL ⎦ ⎩ D ⎭   ⎩ 0 ⎭

The determinant of the coefficient matrix is

Δ = 2γ⁶ (1 + cos γL cosh γL).

We conclude that the natural frequencies are obtained from the roots of the transcendental equation
cos γL = −1 / cosh γL.
Written in this way, since the right-hand side decays extremely fast, the natural frequencies
beyond the first and the second are obtained with very small error from cos γ L = 0. The first and
second roots are, respectively, γ1 L = 0.597π and γ2 L = 1.494π. The lowest natural frequency
is, therefore,

ω_1 = γ_1² c² = (0.597π/L)² √(EI/ρA).
The dimensions can be easily calibrated to produce the usual orchestra pitch A440.
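The transcendental frequency equation of Box 9.1 is easy to check numerically. The following sketch (plain Python with bisection; not part of the text) locates the first two roots of 1 + cos γL cosh γL = 0 and recovers the values γ_1 L ≈ 0.597π and γ_2 L ≈ 1.494π quoted above:

```python
import math

def f(x):
    # Frequency equation of the clamped-free beam (Box 9.1), with x = gamma*L:
    # Delta = 0  <=>  1 + cos(x)*cosh(x) = 0
    return 1.0 + math.cos(x) * math.cosh(x)

def bisect(a, b, tol=1e-12):
    # Plain bisection; f(a) and f(b) must have opposite signs.
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        if fa * f(m) <= 0.0:
            b = m
        else:
            a, fa = m, f(m)
    return 0.5 * (a + b)

root1 = bisect(1.0, 3.0)  # first root, near 1.875
root2 = bisect(4.0, 6.0)  # second root, near 4.694
print(root1 / math.pi, root2 / math.pi)  # ~0.597, ~1.494
```

The remaining roots may be bracketed near the zeros of cos γL, in agreement with the remark about the fast decay of 1/cosh γL.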

9.5.2.1 Orthogonality of the Normal Modes

Let U_i(x) and U_j(x) be normal modes corresponding to two distinct eigenvalues γ_i⁴ and γ_j⁴, respectively, for some specific support conditions. We recall that these eigenfunctions are obtained, up to a multiplicative constant, from nontrivial solutions of the homogeneous system of equations for the constants of integration. Exploiting Eq. (9.84) and integrating by parts, we obtain

(γ_j⁴ − γ_i⁴) ∫_0^L U_i U_j dx = ∫_0^L (U_i U_j'''' − U_j U_i'''') dx
                              = ∫_0^L (−U_i' U_j''' + U_j' U_i''') dx + [U_i U_j''' − U_j U_i''']_0^L
                              = [U_i U_j''' − U_j U_i''' − U_i' U_j'' + U_j' U_i'']_0^L.    (9.87)

The last expression vanishes by virtue of the boundary conditions, whence the desired
orthogonality.
An important feature of eigenfunctions beyond their orthogonality is their
completeness in the sense that every smooth enough function can be expressed as
an infinite linear combination of these basic functions.11 In particular, this feature
is helpful in solving non-homogeneous problems by expressing the forcing load in
terms of the eigenfunctions, as was done for the vibrating string.
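The orthogonality just proved can be checked numerically. The sketch below (not from the text) takes the clamped-free beam of Box 9.1 with L = 1, uses the tabulated roots of the frequency equation (assumed standard values, accurate to nine digits), solves the homogeneous system for the integration constants (rows 1 and 2 give C = −A and D = −B; choosing A = 1, row 3 fixes B), and evaluates the overlap integral by the trapezoid rule:

```python
import math

# Tabulated roots of 1 + cos(gL) cosh(gL) = 0 for L = 1 (see Box 9.1)
g1, g2 = 1.875104069, 4.694091133

def mode(g, x):
    # Rows 1-2 of the homogeneous system in Box 9.1 give D = -B and C = -A;
    # choosing A = 1, the third row then fixes B.
    B = -(math.sin(g) + math.sinh(g)) / (math.cos(g) + math.cosh(g))
    return math.sin(g * x) - math.sinh(g * x) + B * (math.cos(g * x) - math.cosh(g * x))

n = 4000
h = 1.0 / n

def trap(f):
    # Composite trapezoid rule on [0, 1]
    s = 0.5 * (f(0.0) + f(1.0))
    s += sum(f(i * h) for i in range(1, n))
    return h * s

cross = trap(lambda x: mode(g1, x) * mode(g2, x))
norm1 = trap(lambda x: mode(g1, x) ** 2)
norm2 = trap(lambda x: mode(g2, x) ** 2)
print(abs(cross) / math.sqrt(norm1 * norm2))  # ~0: the modes are orthogonal
```

The normalized cross term is zero to within quadrature error, as Eq. (9.87) predicts for the clamped-free boundary conditions.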

9.5.3 The Vibrating Membrane

As a two-dimensional version of the vibrating string, the vibrating membrane is a


thin elastic sheet, with negligible bending stiffness, stretched uniformly in a plane
by means of a tension T , measured in units of force per unit length, and then placed
within a rigid frame, like the skin of a drum or a trampoline. We assume that the dis-
placements are only transversal and, whether due to the applied load or the vibrating
process or possibly small deformations of the frame, they are very small when com-
pared with the overall dimensions in the plane. We already introduced the governing
equation (2.43) as

u_{xx} + u_{yy} = −q/T + (ρh/T) u_{tt},    (9.88)
where h is the thickness of the membrane and ρ the density of the material. In the
absence of the external loading term q, we obtain the two-dimensional wave equation,
which we already encountered in Sect. 7.2.4. The elimination of the time dependence
leads to the elliptic equilibrium equation
u_{xx} + u_{yy} = −q/T.    (9.89)
This is the Poisson equation. If the right-hand side vanishes we obtain the Laplace
equation. These equations appear in many engineering applications, including fluid
mechanics, acoustics, electrostatics and gravitation.
The free vibrations of the membrane are described by the hyperbolic PDE

u_{xx} + u_{yy} = (1/c²) u_{tt},    (9.90)

with c² = T/ρh representing the speed of propagation of signals. In terms of the Laplacian operator introduced in Eq. (1.21), we can rewrite (9.90) more compactly as

∇²u = (1/c²) u_{tt}.    (9.91)

11 For a proof, see [1], p. 359.



A shape-preserving motion is of the form

u(x, y, t) = U (x, y) f (t), (9.92)

which leads to the relation

c² ∇²U/U = f̈/f = −ω².    (9.93)

As before, we obtain for the time dependence the familiar

f (t) = A sin ωt + B cos ωt. (9.94)

As far as the shape itself is concerned, we are left with the elliptic PDE

∇²U + (ω²/c²) U = 0.    (9.95)
In other words, the natural frequencies ω, as expected, will be determined by solving
the eigenvalue problem for the linear differential operator ∇ 2 . The selection of the
frequency spectrum will depend on the shape and size of the membrane domain A
and on the type of boundary conditions imposed, just as in the previous problems.
Since the Laplacian operator is elliptic, we have yet to consider this kind of problem.
We can, however, advance a few useful considerations. We start by observing that
the divergence theorem (1.18) can be restated, with the appropriate interpretation,
in a two-dimensional (rather than three-dimensional) Cartesian domain A, in which
case the flux is evaluated over the boundary curve ∂A. Applying the theorem to a
vector field given by the product of a scalar field φ times the gradient of another scalar field ψ, we obtain

∫_A ∇·(φ∇ψ) dA = ∫_{∂A} φ (dψ/dn) ds,    (9.96)

where n denotes the normal to the boundary curve. On the other hand, it is easy to
check that
∇·(φ∇ψ) = ∇φ·∇ψ + φ∇²ψ.    (9.97)

Combining these results, we obtain the identity


 
∫_A (φ∇²ψ − ψ∇²φ) dA = ∫_{∂A} (φ dψ/dn − ψ dφ/dn) ds.    (9.98)

This elegant result has important consequences. Let Ui (x, y) and U j (x, y) be eigen-
functions (that is, natural modes of vibration) corresponding to two distinct natural
frequencies ωi , ω j , respectively. Then, emulating Eq. (9.74) or (9.87), we can write

(1/c²)(ω_j² − ω_i²) ∫_A U_i U_j dA = −∫_A (U_i ∇²U_j − U_j ∇²U_i) dA
                                  = −∫_{∂A} (U_i dU_j/dn − U_j dU_i/dn) ds.    (9.99)

What this result entails is that the orthogonality condition between eigenfunctions
will be satisfied for at least two types of boundary conditions. The first type corre-
sponds to a simple support, namely, to the vanishing of the transverse displacement.
This kind of boundary condition (U = 0) for an elliptic operator is known as the
Dirichlet type. The other kind of boundary condition (dU/dn = 0), known as the
Neumann type, corresponds to the vanishing of the slope of the membrane.12 In more
general problems of Elasticity, the Neumann boundary condition corresponds to the
specification of a surface traction. At any rate, we conclude that the eigenfunctions
corresponding to different eigenvalues are orthogonal. It can also be shown that they
form a complete set and that any sufficiently smooth function defined over the domain
A can be expanded in terms of the set formed by all the eigenfunctions.
Although we will not pursue at this stage the actual solution of the elliptic equation
in the general case, it turns out that a further application of the method of separation
of variables will allow us to solve the problem for the case of a rectangular membrane.
Indeed, let the domain A be the Cartesian product [0, a] × [0, b], where a and b are
the lengths of the sides. Consider the case of a simply supported membrane along
the whole perimeter. We try a variable separated solution of Eq. (9.95) in the form

U (x, y) = X (x)Y (y), (9.100)

which yields
(1/X) d²X/dx² + (1/Y) d²Y/dy² = −ω²/c².    (9.101)

This identity is only possible if each of the summands is constant, namely, after
solving and imposing the boundary conditions at x = 0 and y = 0, if

X(x) = A sin λx,   Y(y) = B sin μy,   with λ² + μ² = ω²/c².    (9.102)
Imposing the boundary conditions at x = a and y = b and discarding the trivial
solution, we obtain the condition
 
π² (m²/a² + n²/b²) = ω²/c²,   m, n = 1, 2, 3, . . .    (9.103)

12 As a matter of historical interest, the Neumann boundary condition is named after the German mathematician Carl Neumann (1832–1925), not to be confused with the Hungarian-American mathematician John von Neumann (1903–1957).

For any given pair m, n, the corresponding shape-preserving vibration u_{mn} = u_{mn}(x, y, t) is

u_{mn} = sin(mπx/a) sin(nπy/b) [A sin(πc √(m²/a² + n²/b²) t) + B cos(πc √(m²/a² + n²/b²) t)].    (9.104)

From the musical point of view, we can see that the successive natural frequencies
do not appear in ratios of small integers, which accounts for the blunt sound of drums
in an orchestra. Moreover, for a ratio b/a equal to a rational number, we have multiple
eigenvalues. For instance, in a square membrane, we have ω_{mn} = ω_{nm} for any pair with m ≠ n. In that case, any linear combination of u_{mn} and u_{nm} is again an eigenfunction
corresponding to the same eigenvalue. The idea of an ocular tonometer based on a
careful measurement of the normal modes of the eye and their sensitivity to small
variations of the intra-ocular pressure is adversely affected by these considerations,
since it is difficult to ascertain which normal mode of vibration has been excited by
an external source.
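The spectrum of Eq. (9.103) can be tabulated directly; the short script below (illustrative values only, with the normalization c = 1) exhibits both the non-integer frequency ratios of a rectangle with a/b = 1.5 and the square-membrane degeneracy discussed above:

```python
import math

def omega(m, n, a, b, c=1.0):
    # Eq. (9.103): omega_mn = pi * c * sqrt(m^2/a^2 + n^2/b^2)
    return math.pi * c * math.sqrt((m / a) ** 2 + (n / b) ** 2)

# Rectangle with a/b = 1.5: successive frequency ratios are not small integers
a, b = 1.5, 1.0
freqs = sorted(omega(m, n, a, b) for m in range(1, 4) for n in range(1, 4))
print([round(f / freqs[0], 3) for f in freqs])

# Square membrane: u_12 and u_21 share one frequency (a multiple eigenvalue)
print(omega(1, 2, 1.0, 1.0) == omega(2, 1, 1.0, 1.0))  # True
```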

Exercises

Exercise 9.1 Write the equations of motion of a system with 2 or 3 masses moving
in the plane (or in the line) and interconnected by springs. Are there any non-zero
vectors (that is, displacements) for which the equal sign in Eq. (9.2) applies? Why,
or why not? What would the violation of the inequality mean for the stiffness and/or
for the mass matrix in physical terms?

Exercise 9.2 For a given vector f = f0 sin(ωt) in Eq. (9.19), express the vector f0 in
terms of the eigenvector basis and then determine the components of the particular
solution U_p in the same basis. What happens in this method when ω happens to
coincide with one of the natural frequencies?

Exercise 9.3 Compare the solution given by Eqs. (9.43)–(9.47) with that given (for
the same problem, with a slightly different notation) by the extension technique in
Chap. 8. If you prefer, consider just the case in which the initial velocity is identically
zero. Are these solutions the same, as they should be?

Exercise 9.4 Verify, by direct substitution, that the integral (9.64) satisfies the ODE
(9.63).

Exercise 9.5 A guitarist plucks a string of length L at exactly its midpoint. Let
W denote the magnitude of the imposed deflection under the finger just before it
is released from rest. Assuming the two halves of the string to be straight at that
instant, determine the coefficients of its expression in terms of the eigenfunctions
(i.e., perform a Fourier expansion). Plot the approximate shape obtained by using
just a few terms of the expansion. Using the method of separation of variables,
find the solution for the motion of the string. Plot it for various times within a

period, using just a few terms of the approximation. Solve the same problem by the
method of characteristics and compare the results for the same times. Comment on
the comparison. Is any of the two solutions exact? If so, which one?

Exercise 9.6 Consider a vibrating string of constant density but with a linearly
varying cross section

A(x) = A_0 (1 + 4x/L),
where L is the string length and A0 is the area at the left end. Implement numerically
the shooting method described in Sect. 9.5.1 to find the first few natural frequencies
and the corresponding normal modes. You may use specific numerical values or
introduce a non-dimensional coordinate ξ = x/L and calculate the eigenvalues
relative to those of a string with constant cross section A0 . The shooting routine can
easily be handled with Mathematica by solving the ODE with zero displacement and,
say, unit slope at the left end. Varying the coefficient containing the eigenvalue until
the other end is hit and counting the number of oscillations, the solution is obtained
in just a few runs.
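A hedged sketch of this shooting routine, in Python rather than Mathematica (Sect. 9.5.1 is not reproduced here, so the separated form of the string equation is an assumption: U'' + Λ(1 + 4ξ)U = 0 in the non-dimensional coordinate ξ = x/L, with Λ = ω²ρA_0 L²/T): integrate with zero displacement and unit slope at the left end, and bisect on Λ until the right end is hit.

```python
import math

def end_deflection(lam, steps=2000):
    # RK4 integration of U'' = -lam * (1 + 4*xi) * U on [0, 1],
    # with U(0) = 0 and U'(0) = 1; returns U(1).
    def acc(xi, u):
        return -lam * (1.0 + 4.0 * xi) * u
    h = 1.0 / steps
    u, v = 0.0, 1.0
    for i in range(steps):
        xi = i * h
        k1u, k1v = v, acc(xi, u)
        k2u, k2v = v + 0.5 * h * k1v, acc(xi + 0.5 * h, u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, acc(xi + 0.5 * h, u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, acc(xi + h, u + h * k3u)
        u += (h / 6.0) * (k1u + 2.0 * k2u + 2.0 * k3u + k4u)
        v += (h / 6.0) * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
    return u

# "Shoot" on Lambda: U(1) stays positive until Lambda crosses the first
# eigenvalue, so bracket the first sign change and bisect.
lo, hi = 0.5, 1.0
while end_deflection(hi) > 0.0:
    lo, hi = hi, hi + 0.5
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if end_deflection(mid) > 0.0:
        lo = mid
    else:
        hi = mid
lam1 = 0.5 * (lo + hi)
print(lam1)  # Rayleigh-quotient bounds give pi^2/5 < Lambda_1 < pi^2/3
```

Higher eigenvalues are found the same way, bracketing the successive sign changes of U(1) and counting the interior oscillations of U.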

Exercise 9.7 Show that for a simply supported (i.e., pinned-pinned) Bernoulli beam,
the natural modes of vibration are the same as for the vibrating string, while the
successive natural frequencies are not in the same relation.

Exercise 9.8 A uniform Bernoulli beam of length L is supported on a continuous


elastic foundation, just like a beam lying on the ground. The differential equation
for the static bending deflection u = u(x) of this beam is EI u'''' + ku = −q. In this
equation, k represents the stiffness of the foundation expressed in units of force per
unit length of the beam. The ends of the beam are supported by means of immovable
pins. Solve for the deflection of this beam under a constant distributed load q by
expanding in terms of a Fourier sine series. If the dynamic (inertia) contribution
were to be taken into account, what would the solution be?

Exercise 9.9 For the vibrating membrane, show that the orthogonality between nor-
mal modes corresponding to different natural frequencies is also verified by a third
kind of boundary condition known as the Robin type. It corresponds physically to
an elastic support that keeps the proportionality between the displacement and the
normal slope of the membrane, namely, U = kdU/dn, where k is an elastic constant.

Exercise 9.10 A rectangular membrane with a/b = 1.5, simply supported around
its perimeter in the x, y plane, is subjected to a uniformly distributed normal load p
(per unit area of the membrane) and to a uniform tension T (load per unit length) in
all directions. Find an approximate solution of this static problem by means of the
technique of separation of variables. Give numerical values to the various constants
and estimate the maximum deflection of the membrane to 3 significant digits.
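One way to check a hand solution of this exercise: expanding both the uniform load and the deflection in a double sine series turns Eq. (9.89) into one algebraic relation per term, which (as can be verified by substitution) gives c_{mn} = 16p / (π⁴ m n T (m²/a² + n²/b²)) for odd m, n and zero otherwise. The script below sums the series at the center of the membrane, with illustrative constants that the exercise leaves to the reader:

```python
import math

def center_deflection(a, b, p, T, terms=199):
    # u(x,y) = sum c_mn sin(m pi x / a) sin(n pi y / b), odd m and n only,
    # with c_mn = 16 p / (pi^4 m n T (m^2/a^2 + n^2/b^2)); evaluated at the
    # center x = a/2, y = b/2, where sin(m pi / 2) = +/-1.
    u = 0.0
    for m in range(1, terms + 1, 2):
        sm = math.sin(0.5 * m * math.pi)
        for n in range(1, terms + 1, 2):
            c_mn = 16.0 * p / (math.pi ** 4 * m * n * T * ((m / a) ** 2 + (n / b) ** 2))
            u += c_mn * sm * math.sin(0.5 * n * math.pi)
    return u

# Illustrative constants only: a/b = 1.5, unit load and tension
a, b, p, T = 1.5, 1.0, 1.0, 1.0
print(round(center_deflection(a, b, p, T), 6))
```

The coefficients decay fast enough that a modest number of terms already fixes the third significant digit requested in the exercise.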

References

1. Courant R, Hilbert D (1962) Methods of mathematical physics, vol I. Interscience, Wiley, New York
2. Hildebrand FB (1965) Methods of applied mathematics. Prentice Hall, Englewood Cliffs (Reprinted by Dover (1992))
Chapter 10
The Diffusion Equation

The archetypal parabolic equation is the diffusion equation, or heat equation, in one
spatial dimension. Because it involves a time derivative of odd order, it is essentially
irreversible in time, in sharp distinction with the wave equation. In physical terms
one may say that the diffusion equation entails an arrow of time, a concept related
to the Second Law of Thermodynamics. On the other hand, many of the solution
techniques already developed for hyperbolic equations are also applicable for the
parabolic case, and vice versa, as will become clear in this chapter.

10.1 Physical Considerations

Many phenomena of everyday occurrence are by nature diffusive.1 They arise, for
example, as the result of sneezing, pouring milk into a cup of coffee, intravenous
injection and industrial pollution. These phenomena, consisting of the spread of one
substance within another, are characterized by thermodynamic irreversibility as the
system tends to equilibrium by trying to render the concentration of the invading
substance as uniform as possible. The flow of heat is also a diffusive process. A more
graphical way to describe this irreversibility is to say that diffusive phenomena are
characterized by an arrow of time. Thus, the drop of milk poured into the coffee will
never collect itself again into a drop.

10.1.1 Diffusion of a Pollutant

Consider a tube of constant cross section filled with a liquid at rest (the substrate),
in which another substance (the pollutant) is present with a variable concentration
g = g(x, t), measured in terms of mass of pollutant per unit length of tube. If this

1 This section is largely a more detailed repetition of Sect. 2.4.2.
© Springer International Publishing AG 2017
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_10

Fig. 10.1 Diffusion of a pollutant in a buried pipe: a slice of width dx with content g(x, t), velocity v(x, t) and perfusion p(x, t)

tube is embedded in a hostile environment and, if the tube wall permits it, a certain
amount of pollutant, p = p(x, t), may perfuse through the lateral wall per unit length
and per unit time. The quantity p is usually called the production. We want to account
for the variation in time of the amount of pollutant contained in an infinitesimal slice
of width d x, as shown in Fig. 10.1.
If the pollutant were to remain at rest, this accounting would be trivial, as it
would state that the change in pollutant content, namely (∂g(x, t)/∂t) dx, is entirely due
d x, is entirely due
to the perfusion through the lateral wall, that is, p(x, t) d x. In reality, however,
the pollutant tends to move with a velocity v(x, t) in the direction x of the tube
axis. This motion, which is the essence of the diffusive phenomenon, results in
an inflow through the left face of the slice given by g(x, t) v(x, t), measured in
mass of pollutant per unit time. Analogously, the right face of the slice, located
at the spatial position x + d x, will witness an outflow of pollutant in the amount
g(x + d x, t) v(x + d x, t). The net contribution due to flow through the faces is,
therefore, given by g(x, t) v(x, t)−g(x +d x, t) v(x +d x, t) = − ∂(gv)
∂x
d x +O(d x 2 ),
where we have assumed the quantities involved to be differentiable. Adding up the
various contributions, we obtain in the limit as d x → 0 the balance equation

∂g/∂t + ∂(gv)/∂x = p.    (10.1)
To complete the physical description of the diffusion phenomenon, we need to
supply a constitutive equation that relates the two dependent field variables v and g.
In the case of diffusion of a pollutant (or, in general, a substance in small concen-
trations within another), it is possible to formulate a sensible, experimentally based,
constitutive law directly in terms of the pollutant concentration. The most commonly
used model, called Fick’s law, states that

gv = −D grad g, (10.2)

where grad denotes the spatial gradient and the positive constant D is a property that
depends on the substances involved. The minus sign in Eq. (10.2) agrees with the
fact that the pollutant tends to flow in the direction of smaller concentrations.

Combining the last two equations, we obtain the second-order linear PDE
∂g/∂t − D ∂²g/∂x² = p.    (10.3)
In the absence of production we obtain the homogeneous equation
∂g/∂t − D ∂²g/∂x² = 0,    (10.4)
known as the diffusion equation and also as the heat equation. A clever statistical
motivation for the diffusion equation is presented in Box 10.1.

Box 10.1 A discrete diffusion model


A plausible a priori justification of Fick’s law can be made by resorting to a
Statistical Mechanics argument. This kind of argument can be very fruitful
in many applications by providing a heuristic link between various levels of
analysis. Following a line of thought that can be regarded as a greatly simplified
version of Einstein’s celebrated 1905 explanation of Brownian motion, we
postulate a discrete model of space and time, namely, we assume that all
events take place at specific isolated sites and instants. The sites are assumed
to be equally spaced along the real line according to the formula:

x_i = i h,   i = . . . , −3, −2, −1, 0, 1, 2, 3, . . . ,

where Δx = h is the distance between neighbouring sites. Similarly, the


chosen instants of time are spaced at regular intervals according to the formula

t_j = j k,   j = 0, 1, 2, 3, . . . ,

with Δt = k being the time interval between consecutive events. We assume,


moreover, that at time t = 0 each site is occupied by a number N_i^0 of particles and we want to establish how this discrete mechanical system will evolve thereafter, namely, we want to predict the number N_i^j of particles at the site x_i at time t_j.
In the intended physical picture we imagine that there is an underlying
ground substance (coffee, say) in equilibrium and that the particles of interest
(a drop of milk, say) are being constantly bombarded by collisions with the
molecules making up the ground substance. Because these collisions are ran-
dom, we assume that each particle has an equal probability β = 0.5 to move
one space either to the right or to the left in the time interval Δt = k. Under

this basic assumption, the rule of evolution of this system^a is given by

N_i^{j+1} = 0.5 N_{i−1}^j + 0.5 N_{i+1}^j.

The link between the discrete model and the diffusion equation is obtained
by formulating the latter as a finite-difference approximation on the assumed
space-time grid. Setting g(x_i, t_j) = N_i^j / h and using standard approximation formulae for first and second derivatives, we obtain

(N_i^{j+1} − N_i^j) / k ≈ D (N_{i−1}^j − 2N_i^j + N_{i+1}^j) / h².

Setting D = h²/2k, we recover the discrete version. The diffusion coefficient


D is thus seen to be related directly to the average particle distance and the
mean time between collisions.
a A discrete system governed by an evolution rule that determines the next state on the basis

of the present state only is called a cellular automaton.
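The cellular automaton of Box 10.1 can be run directly. Starting all particles at one site, the deterministic evolution rule spreads the expected occupation numbers out, and their variance (in units of h²) grows by exactly one per time step, consistent with a variance of 2Dt for D = h²/2k. A minimal sketch:

```python
# Discrete model of Box 10.1: expected occupation numbers N_i evolve by
#   N_i(j+1) = 0.5*N_{i-1}(j) + 0.5*N_{i+1}(j).
# Starting from a single occupied site, the variance of the distribution
# grows by exactly h^2 per step, i.e. like 2*D*t with D = h^2/(2k).
nsites = 201
mid = nsites // 2
N = [0.0] * nsites
N[mid] = 1.0  # all the "milk" starts at the origin

def step(N):
    M = [0.0] * len(N)
    for i in range(1, len(N) - 1):
        M[i - 1] += 0.5 * N[i]
        M[i + 1] += 0.5 * N[i]
    return M

steps = 50  # few enough steps that nothing reaches the boundary
for _ in range(steps):
    N = step(N)

mean = sum((i - mid) * N[i] for i in range(nsites))
var = sum((i - mid) ** 2 * N[i] for i in range(nsites))
print(mean, var)  # mean 0, variance = number of steps (in units of h^2)
```

Running the rule backwards, by contrast, quickly produces negative "occupation numbers", a discrete echo of the irreversibility discussed below.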

10.1.2 Conduction of Heat

The First Law of Thermodynamics asserts that for each substance there exists a
function of state, called the internal energy, whose rate of change is balanced by the
power of the external forces (or mechanical power) acting on the system plus the
heating input (or thermal power), namely,

d(internal energy)/dt = mechanical power + thermal power.    (10.5)
If we consider a fixed non-deforming substrate, such as a metal wire, the mechanical power vanishes and the internal energy is a function of the temperature alone. In close
analogy with the diffusion case shown in Fig. 10.1, the thermal power going into a
slice of width d x consists of two parts: (i) A power supply p = p(x, t), measured in
terms of energy per unit length and per unit time. This power is the result of sources
of heat distributed throughout the length of the wire (or its lateral surface). (ii) A
heat flux q = q(x, t) in the direction of the axial coordinate x. This flux, due to the
ability of the material to conduct heat, is measured in terms of energy per unit time.2
Denoting by g = g(x, t) the temperature field and by u = u(x, t) the internal energy

2 In the more general three-dimensional context, the production term p is measured per unit volume
(rather than length) and the flux term q is measured per unit area. Since the cross section has been
assumed to be constant, we did not bother to effect the formal passage to one dimension.

content per unit length, the statement of the balance of energy is expressed as

∂u(x, t)/∂t = p(x, t) − ∂q(x, t)/∂x.    (10.6)
For many materials, a good empirical constitutive law for increments Δu in inter-
nal energy due to corresponding increments Δg in temperature is given by the linear
relation
Δu = c Δg, (10.7)

where c is a constant known as the specific heat (capacity). The internal energy
depends in general also on the deformation, which in our case has been ignored
since the material was assumed to be rigid.
As far as the heat flux is concerned, Fourier’s Law of heat conduction is an
empirical relation valid for most materials within limited temperature ranges. It
establishes that the heat flux is proportional to the gradient of the temperature. In our
notation, this law is expressed as

q = −k ∂g/∂x,    (10.8)
where k is the thermal conductivity of the material, a positive constant. The minus sign
expresses the fact that heat flows spontaneously from higher to lower temperatures.
Introducing the constitutive equations (10.7) and (10.8) into the energy balance equation (10.6), we obtain

c ∂g/∂t − k ∂²g/∂x² = p.    (10.9)

This equation is identical in form3 to Eq. (10.3) governing the diffusion of one sub-
stance into another, as we studied in Sect. 10.1.1. For this reason, this equation is
known both as the (non-homogeneous) diffusion equation and as the heat equation.
The adjective non-homogeneous refers here to the fact that there are body sources.
Thus, the equation would be called homogeneous if the right-hand side were zero. On
the other hand, the material itself may have properties, such as the specific heat or the
thermal conductivity, varying from point to point, in which case it is the body (rather
than the equation) which would be called inhomogeneous. In deriving Eq. (10.9),
in fact, it was assumed that the coefficient of thermal conductivity k was constant
throughout the domain of interest. If, instead, k and/or c are functions of position
(that is, if the material is inhomogeneous) Eq. (10.9) should be replaced by
 
c(x) ∂g/∂t − ∂/∂x (k(x) ∂g/∂x) = p.    (10.10)

3 With D = k/c.

10.2 General Remarks on the Diffusion Equation

The diffusion equation, as we already know, is of the parabolic type. At each point,
therefore, it has a single characteristic direction, whose slope is given by

dt/dx = 0.    (10.11)
There are some similarities between the heat equation and the wave equation (and
between hyperbolic and parabolic equations in general), but there are also many
differences, both in interpretation and in the nature of their solutions, which we
would like to point out. An important difference from the mathematical and physical
points of view is that, unlike the wave equation, the heat equation is not invariant
with respect to time reversal. In other words, if we were to make the change of
independent variables

x̂ = x,   t̂ = −t,    (10.12)

we would not obtain the same equation (as would be the case with the wave equation).
From the physical point of view, this is a manifestation of the fact that the diffusion
equation describes thermodynamically irreversible processes. If we were to run a
video of a wave phenomenon backwards, we would not be able to tell whether or
not we are witnessing a real phenomenon. But if we were to run backwards a movie
of a diffusive phenomenon, we would immediately be able to tell that something
“unphysical” is taking place: the milk already dissolved in coffee spontaneously
becomes a small drop, a bar in thermal equilibrium spontaneously gets cooler at one
end and warmer at the other, and so on.
A feature that the diffusion equation shares with the wave equation is that in both
cases the initial value problem makes physical and mathematical sense. Thus, from
the state of the system at one particular instant of time, the differential equation allows
us to predict the evolution of the system for future times. But here, again, there is a
significant difference when we realize that, according to Eq. (10.11), the initial curve
(t = 0, say) is a characteristic of the PDE. What this means is that we will not be able
to consider the specification of initial data along a characteristic as the exceptional
case, but rather as the rule. Moreover, if we recall that characteristics are lines along
which weak singularities propagate, we find that according to the heat equation these
disturbances (if they could exist at all) must propagate at an infinite speed! In fact, it
can be shown that, in the interior of its domain of existence, any solution of the heat
equation must be of class C ∞ . In other words, contrary to the wave equation, any
singularity in the data at the boundary of the domain is immediately smeared out, as
befits a diffusive process.
When discussing the different types of second-order equations, we remarked that
if the initial data (the values of the function and of its first partial derivatives) are
specified on a characteristic line they will in general contravene the PDE. In the case
of hyperbolic equations, upon reconciling the initial data with the PDE we end up

losing uniqueness of the solution. In order to restore it, one has to deal with the
so-called characteristic initial value problem, whereby data have to be specified on
two intersecting characteristics. In the case of the diffusion equation, on the other
hand, it is clear that to reconcile the initial data on a characteristic line (t = constant)
we simply have to refrain from specifying the value of the time derivative g_t = ∂g/∂t.
Indeed, by specifying the value of the function itself, and assuming that it is twice
differentiable, we can obtain its second spatial derivative, and the PDE automatically
delivers the value of the time derivative. We note, moreover, that the values of all
subsequent partial derivatives of all orders become thus available. This means that,
contrary to the case of the wave equation, the reconciliation of the initial data with
the PDE does not lead to a lack of uniqueness.

10.3 Separating Variables

Without assuming any initial and/or boundary conditions, we try to see whether the
method of separation of variables can give us some indication of possible solutions
of the homogeneous diffusion equation. We set

g(x, t) = G(x) f (t). (10.13)

Upon substitution in (10.4), we obtain the relation

ḟ/f = D G''/G = −λ²,    (10.14)

with an obvious notation. Without any loss of generality, we may henceforth assume
that D = 1, since this can always be achieved by a suitable re-scaling of the spatial
variable. The choice of negative sign in the constant rightmost side of Eq. (10.14) is
dictated by the reasoning that follows. Integrating first the time-dependent part, we
obtain
f(t) = C e^{−λ²t}.    (10.15)

Since, as already pointed out, the diffusion equation implies an arrow of time, it is
clear that, had we chosen a positive value in the right-hand side, the solution would
have rapidly diverged with (increasing) time. The spatial part leads to the solution

G(x) = A cos(λx) + B sin(λx). (10.16)

Absorbing the undetermined constant C within A and B, we obtain a solution of the heat equation for λ ≠ 0, namely,

g(x, t) = (A cos(λx) + B sin(λx)) e^{−λ²t}.    (10.17)

Note that the special choice λ = 0 yields the solution g(x, t) = A + Bx. This
time-independent solution of the diffusion equation corresponds to a case of steady
state (or equilibrium).
Since we are dealing with a linear homogeneous equation, any linear combination
of solutions is a solution. We may, for example, choose for each value of λ some
prescriptions A = A(λ) and B = B(λ), and form an integral (which is, after all, a
limit of sums) such as

∞
(A cos(λx) + B sin(λx)) e−λ t dλ.
2
g(x, t) = (10.18)
−∞

Provided the integral converges, this expression is a new solution of the diffusion
equation. We will later exploit this fact to show how to construct a solution in this
way by judiciously choosing the spectral coefficients A(λ) and B(λ) so as to match
any given initial and boundary conditions.
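As a sanity check of the separated solutions, the particular choice A = 0, B = 1 in Eq. (10.17) can be verified against the heat equation (with D = 1, as assumed above) by central finite differences at an interior point:

```python
import math

lam = 2.0

def g(x, t):
    # Separated solution (10.17) with A = 0, B = 1: g = sin(lam*x) e^(-lam^2 t)
    return math.sin(lam * x) * math.exp(-lam * lam * t)

# Central finite differences at an interior point (D = 1, as in the text)
x0, t0, eps = 0.7, 0.3, 1e-4
g_t = (g(x0, t0 + eps) - g(x0, t0 - eps)) / (2.0 * eps)
g_xx = (g(x0 + eps, t0) - 2.0 * g(x0, t0) + g(x0 - eps, t0)) / eps ** 2
print(g_t - g_xx)  # ~0: g_t = g_xx holds
```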

10.4 The Maximum–Minimum Theorem and Its Consequences

Before presenting the solutions of particular initial-boundary value problems, it is


instructive to dwell on some of the properties of these solutions. Are they unique?
Can they sustain discontinuities? Do they depend continuously on the initial data? A
particularly illuminating method to answer these questions in the case of parabolic
equations is based on the maximum–minimum theorem, whose statement can and
will be later interpreted in appealingly physical terms.4
Theorem 10.1 (Maximum–minimum theorem) Let D denote a closed rectangle
with sides parallel to the coordinate axes. Without loss of generality (since the heat
equation is invariant under translations in both directions), we may assume that this
rectangle is [0, L] × [0, T], as shown in Fig. 10.2. Any (continuous)5 solution of
the diffusion equation (10.4) in D attains its maximum and minimum values either
on the base (t = 0) of the rectangle or on one of the vertical sides (x = 0, L).

Proof Let M and m denote, respectively, the maximum values6 of g in D and in the
union of the base and the vertical sides of D (that is, in ∂D minus the open top of
the rectangle, as indicated with a thick line in Fig. 10.2). Assume the statement of

4 This section follows [5].


5 As already pointed out, any solution is already smooth in the interior of the domain. Continuity
refers, therefore, to the data specified on (part of) the boundary.
6 Recall that a continuous function defined over a compact (closed and bounded) subset of R^n attains its maximum and minimum values at one or more points of its domain.

Fig. 10.2 The maximum–minimum theorem

the theorem not to be true. There exists, therefore, a point (x0 , t0 ) with 0 < x0 < L
and 0 < t0 ≤ T at which g attains the value M > m. We construct the augmented
function

h(x, t) = g(x, t) + ((M − m)/(4L²)) (x − x_0)².    (10.19)
The reason for this construction will become apparent soon. We only remark now
that the value of this function is at each point of D greater than or equal to the value
of the solution at that point. In particular, the restriction of this function to the union
of the vertical sides and the base of D satisfies the inequality

h(x, t) ≤ m + (M − m)/4 = M/4 + (3/4)m < M.    (10.20)
This result tells us that the new function h(x, t) also attains its maximum value at
some point (x1 , t1 ) with 0 < x1 < L , 0 < t1 ≤ T . At this point we must have that
h_t ≥ 0 and h_xx ≤ 0. [Important question: why don’t we just say h_t = 0?] Combining
these two conditions, we obtain

h_t − h_xx ≥ 0.    (10.21)

On the other hand, recalling the definition (10.19), we have

h_t − h_xx = g_t − g_xx − (M − m)/(2L²) = −(M − m)/(2L²) < 0.    (10.22)
Thus, the assumption M > m has led us to a contradiction and the first part of the
theorem has been proved. The proof of the part dealing with the minimum value
follows directly by noting that if g is a solution so is −g. 

Corollary 10.1 (Uniqueness) The initial-boundary value problem of the heat equa-
tion has a unique solution.
218 10 The Diffusion Equation

Proof The problem we are describing corresponds to the specification of the values of
g at the base and the vertical sides of the rectangle. The proof follows immediately
from the assumption that there exist two different solutions to this problem. The
difference between these two solutions would, therefore, vanish on this part of the
boundary. Since our equation is linear, this difference satisfies the heat equation in
the given domain. It follows that (unless the difference is identically zero) we have
found a solution of the heat equation that attains a maximum or minimum value at a
point not belonging to the part of the boundary stipulated by the maximum–minimum
theorem. 

From both the physical and the computational points of view, it is important to
ascertain that the behaviour of the solutions of the heat equation is not chaotic. In other
words, a small change in the initial and/or boundary data results in a correspondingly
small change in the solution. This is the content of the following corollary of the
maximum–minimum theorem.

Corollary 10.2 (Continuous dependence on the data) The solution of the initial-
boundary value problem depends continuously on the data in the sense that a small
change in the data results in a correspondingly small change in the solution.

Proof More precisely, this corollary states that if the data (specified on the base and
the two vertical sides of the rectangular region [0, L] × [0, T ]) corresponding to two
solutions g1 (x, t) and g2 (x, t) satisfy the conditions

|g1 (0, t) − g2 (0, t)| < , (10.23)

|g1 (x, 0) − g2 (x, 0)| < , (10.24)

and
|g1 (L , t) − g2 (L , t)| < , (10.25)

then so do the solutions over the whole rectangular domain, viz.:

|g1 (x, t) − g2 (x, t)| < , ∀(x, t) ∈ D. (10.26)

To prove this corollary we start by noticing once again that the difference between two
solutions is itself a solution of the (homogeneous) heat equation. The corresponding
data are, clearly, given by the difference of the data of the individual solutions.
Applying the main theorem to the difference between the given solutions the corollary
follows. 

Remark 10.1 A cursory reading of the proof of the maximum–minimum theorem


may convey the impression that its main argument could have been applied also to
the time-reversed problem. Could one not, having specified the data at the top of the
rectangle (rather than at its base) and at the vertical sides, conclude that the solution
must attain its maximum and minimum values there? The argument appears to be

entirely reversible. Nevertheless, when establishing the conditions for a maximum to


possibly exist at the base (which step would be necessary in the proof of the theorem),
we would have to state that h_t ≤ 0 and h_xx ≤ 0, thus ruining the conclusion of the
theorem. Once again, this is a manifestation of the time-irreversibility of the heat
equation.

We have extracted several important facts out of a relatively simple proof. More-
over, the statement of the main theorem corresponds to the physically intuitive fact
that if, for example, the maximum value of the temperature data occurs at the base of
the rectangle, then we do not expect at any time and at any point the temperature to
rise above this value. In its search for thermal equilibrium, the bar will seek to even
out the temperatures as much as permitted by the boundary data.
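This behaviour is easy to observe numerically. The following Python sketch (an illustrative aside, not part of the text; the grid sizes and the initial datum are arbitrary choices) solves the heat equation with D = 1 on [0, 1] × [0, T ] by an explicit finite-difference scheme and checks that the extreme values of the computed solution indeed occur on the base or on the vertical sides of the rectangle:

```python
import numpy as np

# Illustration of the maximum-minimum theorem for D = 1 on [0, L] x [0, T]:
# solve g_t = g_xx with an explicit finite-difference scheme and verify that
# the extreme values occur on the base (t = 0) or on the sides (x = 0, L).
nx, nt = 51, 2000
L, T = 1.0, 0.2
dx = L / (nx - 1)
dt = T / nt
r = dt / dx ** 2            # stability requires r <= 1/2
assert r <= 0.5

x = np.linspace(0.0, L, nx)
g = np.zeros((nt + 1, nx))
g[0] = np.sin(np.pi * x)    # initial data; the boundary values stay at 0

for k in range(nt):
    g[k + 1, 1:-1] = g[k, 1:-1] + r * (g[k, 2:] - 2.0 * g[k, 1:-1] + g[k, :-2])

interior_max = g[1:, 1:-1].max()                              # interior and top
boundary_max = max(g[0].max(), g[:, 0].max(), g[:, -1].max())  # base and sides
print(interior_max <= boundary_max)   # True
```

The discrete scheme inherits the maximum principle precisely when the stability condition r ≤ 1/2 holds, which is why the sketch asserts it first.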

10.5 The Finite Rod

Consider a rod of finite length occupying the interval [0, L] and thermally insulated
on its lateral surface. The temperature distribution g(x, t) abides by the diffusion
equation (10.4), also called the heat equation. At the ends of the bar, the temperature
is kept equal to zero7 at all times, i.e.,

g(0, t) = g(L , t) = 0, ∀t > 0. (10.27)

Moreover, at the initial time, the temperature throughout the length of the rod is given
as some continuous function

g(x, 0) = g_0(x),    0 ≤ x ≤ L.    (10.28)

For consistency, we assume that the function g0 vanishes at the two ends of the
rod. As we already know (from the results of the previous section), if this problem
has a solution it must be unique. We try a solution by the method of separation
of variables. According to Eq. (10.17), except for the spatially linear solution, any
variable-separated solution must be of the form

g(x, t) = (A cos(λx) + B sin(λx)) e^{−λ²t}.    (10.29)

Enforcing the boundary conditions (10.27), we obtain that the constant A must vanish
and that the parameter λ must belong to a discrete spectrum given by the formula

λ = nπ/L,    n = 1, 2, . . .    (10.30)

7 Note that the temperature appearing in the heat equation is not necessarily the absolute thermody-

namic temperature.

To satisfy the initial condition, we propose an infinite superposition of variable-


separated solutions, namely,

g(x, t) = Σ_{n=1}^{∞} B_n sin(nπx/L) e^{−(nπ/L)² t}.    (10.31)

Introducing this form of the solution into the initial condition (10.28), we obtain

g_0(x) = Σ_{n=1}^{∞} B_n sin(nπx/L).    (10.32)

By the orthogonality property of the trigonometric functions involved, we have

B_n = (2/L) ∫_0^L g_0(x) sin(nπx/L) dx.    (10.33)

Recall that we have simplified the heat equation by assuming that D = 1. If we


now restore the original value of this constant, the only change to Eq. (10.31) would
consist of a rescaling of the time variable, that is,

g(x, t) = Σ_{n=1}^{∞} B_n sin(nπx/L) e^{−(nπ/L)² Dt}.    (10.34)

Notice that as time goes on, the solution tends to a state of thermal equilibrium, as
expected. From a detailed analysis of this solution, one can verify that if the initial
temperature distribution is continuous with piece-wise continuous derivatives, the
solution is of class C ∞ for t > 0. We have already alluded to this property earlier and
indicated that, from the physical point of view, its meaning is that any irregularities
in the initial data are immediately smoothed out by the diffusive process of heat
transfer. It is interesting to remark that one can use this property of the solution to
prove that the heat equation cannot in general be solved backward in time.
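The recipe (10.30)–(10.34) is easy to exercise numerically. The following Python sketch (an illustrative aside; the initial datum g_0(x) = x(1 − x) and the truncation order are arbitrary choices, not taken from the text) computes the coefficients (10.33) by quadrature and evaluates the truncated series (10.34):

```python
import numpy as np

# Truncated series solution (10.31)-(10.34) for D = 1, L = 1 and the
# (arbitrarily chosen) initial temperature g0(x) = x(1 - x).
L, D, N = 1.0, 1.0, 50
xq = np.linspace(0.0, L, 2001)      # quadrature grid for Eq. (10.33)
g0 = xq * (L - xq)

def trap(y, x):
    # composite trapezoid rule
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def B(n):
    # Eq. (10.33): B_n = (2/L) * integral_0^L g0(x) sin(n pi x / L) dx
    return (2.0 / L) * trap(g0 * np.sin(n * np.pi * xq / L), xq)

def g(x, t):
    # Eq. (10.34), truncated after N terms
    return sum(B(n) * np.sin(n * np.pi * x / L)
               * np.exp(-(n * np.pi / L) ** 2 * D * t) for n in range(1, N + 1))

print(abs(g(0.5, 0.0) - 0.25) < 1e-4)   # True: the series reproduces g0(1/2)
print(abs(g(0.0, 0.1)) < 1e-12)         # True: boundary condition (10.27)
```

Since every term of the series vanishes at x = 0 and x = L, the boundary conditions (10.27) are satisfied exactly, while the initial condition is matched only up to the truncation and quadrature errors.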
The method of separation of variables has allowed us to solve the homogeneous
heat equation (no distributed heat sources or sinks) under a regime of homogeneous
boundary conditions (zero temperature at the ends of the rod). Other, more general,
homogeneous boundary conditions (such as insulated ends) can also be considered,
leading to Fourier series expansions involving cosine terms. The solution of cases
where the boundary conditions are arbitrary functions of time or where there exist
heat sources distributed over the length of the rod, however, requires the consideration
of other methods (such as Laplace transforms, Green functions, Duhamel integrals
and eigenfunction expansions). The general treatment of these methods is beyond
the scope of this book, but we will explore some of them to a limited extent.

10.6 Non-homogeneous Problems

Although more general non-homogeneous boundary conditions can be considered,


we will limit our attention to boundary conditions of temperature. We consider,
therefore, the problem

g_t − D g_xx = 0,    0 < x < L,  t > 0,    (10.35)

g(x, 0) = g_0(x),    0 ≤ x ≤ L,    (10.36)

g(0, t) = f_0(t),  g(L, t) = f_L(t),    t ≥ 0.    (10.37)

We assume, moreover, that the following consistency conditions are satisfied, namely,

g_0(0) = f_0(0),  g_0(L) = f_L(0).    (10.38)

Our aim is to show that this problem (a homogeneous equation with inhomo-
geneous boundary conditions) can be generally converted into a problem of a non-
homogeneous equation with homogeneous boundary conditions. To this effect, we
decompose the solution into the sum of two terms as

g(x, t) = G(x, t) + S(x, t). (10.39)

The second term (which, somewhat imprecisely, will be referred to as the steady state
part of the solution) is given by a spatial linear interpolation of the given boundary
conditions, namely,
S(x, t) = f_0(t) (1 − x/L) + f_L(t) (x/L).    (10.40)
Introducing the proposed decomposition (10.39) into the original PDE (10.35), we
obtain
G t − DG x x = −St . (10.41)

Thus, the “transient” (non-steady) part, G(x, t), of the solution satisfies a non-
homogeneous version of the heat equation, whereby the sources are obtained by
a particular linear combination of the time-derivatives of the boundary conditions.
On the other hand, it is not difficult to verify that (in fact, by construction) the function
G(x, t) satisfies the homogeneous boundary conditions

G(0, t) = G(L , t) = 0 ∀t > 0, (10.42)

and the initial condition

G(x, 0) = g0 (x) − S(x, 0) 0 ≤ x ≤ L. (10.43)



We conclude that a heat conduction problem with non-homogeneous boundary


conditions can be transformed into a counterpart with homogeneous boundary con-
ditions at the price of introducing distributed and time-dependent heat sources and
modifying the initial conditions, both in an elementary manner. The solution of the
transformed problem can be achieved by the method of eigenfunction expansion
(which, as in the case of the wave equation, we also call normal-mode superpo-
sition). The eigenfunctions are precisely the spatial parts of the variable-separated
solutions of the homogeneous equation that we studied earlier. In the case of a bar
with uniform properties, these eigenfunctions are harmonic functions, thus leading
to the Fourier series expansion. In more general cases (which we will not consider)
the eigenfunctions are not harmonic, but (according to the Sturm–Liouville theory)
still constitute a complete set of orthogonal functions.
In order to solve the problem given by Eqs. (10.41)–(10.43), we express a partic-
ular solution as
G(x, t) = Σ_{n=1}^{∞} D_n(t) sin(nπx/L).    (10.44)

This expression agrees with its counterpart for the treatment of the non-homogeneous
wave equation that we studied in a previous chapter. Similarly, we express the heat
source as
S_t(x, t) = Σ_{n=1}^{∞} C_n(t) sin(nπx/L).    (10.45)

The coefficients of this expansion can be calculated at each instant of time by the by
now familiar formula

C_n(t) = (2/L) ∫_0^L S_t(x, t) sin(nπx/L) dx.    (10.46)

Introducing the representations (10.44) and (10.45) into (10.41), we obtain a


sequence of mutually independent ODEs, that is,

dD_n/dt + D (nπ/L)² D_n = −C_n.    (10.47)
A particular solution of this equation is given by

D_n(t) = − ∫_0^t C_n(τ) e^{−(nπ/L)² D(t−τ)} dτ.    (10.48)

The complete solution of the non-homogeneous problem is, therefore,


G(x, t) = Σ_{n=1}^{∞} [ B_n e^{−(nπ/L)² Dt} + D_n(t) ] sin(nπx/L).    (10.49)

The constants B_n can be adjusted to satisfy the initial condition (10.43). Finally, the
solution of the original problem is obtained from Eq. (10.39).
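As a sanity check of this construction (an illustrative aside with an arbitrary test source, not taken from the text), one can verify numerically that the quadrature formula (10.48) does satisfy the ODE (10.47):

```python
import numpy as np

# Check of Eq. (10.48): for a sample source coefficient C_n(t) the quadrature
# formula should satisfy the ODE (10.47),
#     dD_n/dt + D (n pi / L)^2 D_n = -C_n.
D, L, n = 1.0, 1.0, 1
k = D * (n * np.pi / L) ** 2

def C(t):
    return np.cos(t)               # arbitrary smooth test source

def Dn(t, m=4000):
    tau = np.linspace(0.0, t, m)   # Eq. (10.48) by the trapezoid rule
    y = -C(tau) * np.exp(-k * (t - tau))
    return float(np.sum((y[1:] + y[:-1]) * np.diff(tau)) / 2.0)

t, h = 0.7, 1e-5
lhs = (Dn(t + h) - Dn(t - h)) / (2.0 * h) + k * Dn(t)   # dD_n/dt + k D_n
print(abs(lhs - (-C(t))) < 1e-4)   # True: (10.48) solves (10.47)
```

Differentiating (10.48) under the integral sign gives dD_n/dt = −C_n(t) − (nπ/L)² D D_n, which is exactly what the finite-difference quotient above reproduces.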

10.7 The Infinite Rod

In the case of a rod of infinite spatial extent, we are confronted with the pure Cauchy
(or initial-value) problem

g_t − D g_xx = 0,    −∞ < x < ∞,  t > 0,    (10.50)

g(x, 0) = g_0(x),    −∞ < x < ∞,    (10.51)

without any boundary conditions. We will assume that the initial temperature distri-
bution g0 (x) is continuous and bounded over the real line. To show the uniqueness
of the solution of this problem, we would like to emulate the procedure we used in
the case of the finite rod, namely, to prove a maximum–minimum theorem. In order
to achieve this goal, however, it turns out that, unlike the finite case, we must now
make an extra assumption on the nature of the solution: we need to assume a-priori
that the sought after solution is continuous and bounded. Otherwise, it can be shown
explicitly that the maximum–minimum theorem doesn’t hold and the solution is, in
fact, not unique. A standard argument due to Tychonoff8 shows how to construct a
C ∞ solution that vanishes at t = 0. This solution, however, is unbounded. A solution
g(x, t) is said to be bounded if there exists a positive number M such that

|g(x, t)| ≤ M,    −∞ < x < ∞,  t > 0.    (10.52)

Let g1 (x, t) and g2 (x, t) be two bounded solutions of Eq. (10.50). The difference
g = g1 −g2 between these solutions is, therefore, also bounded. Instead of proceeding
to prove an independent maximum theorem, we can take advantage of the maximum
theorem for the finite case to produce a proof of uniqueness by showing that g(x, t)
must vanish identically over the half plane of interest. To this end, we attempt to
construct a family of solutions g L (x, t) of the heat equation, over the finite spatial
intervals −L ≤ x ≤ L, each of which enjoys the property of being non-negative

8 See [4], p. 211.



and greater than (or equal to) |g(x, t)| over the common domain of definition. Such
a family of solutions is given by the prescription9
 
g_L(x, t) = (4M/L²) (Dt + x²/2),    (10.53)

as can be verified. In particular, we notice that for any given T > 0 the values taken
by this solution over the part of the boundary consisting of the base [−L , L] × {0}
and the sides {−L} × [0, T ] and {L} × [0, T ] are point by point larger than (or equal
to) the corresponding values of |g|. This must, therefore, be true for the interior
points as well. Fixing an arbitrary point (x, t), we conclude from Eq. (10.53) that for
sufficiently large L the absolute value of g(x, t) can be bounded by as small a positive
number as desired. This concludes the proof of uniqueness. As a corollary of this
theorem, one can (by the same procedure as in the finite case) prove the continuous
dependence of the solution on the initial data.
Having thus demonstrated the uniqueness of the Cauchy problem, we need to
construct a solution by any method, which will then become the (unique) solution.
In particular, if two solutions are found which appear to be different (perhaps because
of the different methods used to derive them), they are automatically identical to each
other. We have already remarked that fairly general solutions of the heat equation
can be found by adjusting the coefficients in the expression

g(x, t) = ∫_{−∞}^{∞} (A(λ) cos(λx) + B(λ) sin(λx)) e^{−λ²t} dλ,    (10.54)

so as to match the initial condition g(x, 0) = g0 (x), namely,

g_0(x) = ∫_{−∞}^{∞} (A(λ) cos(λx) + B(λ) sin(λx)) dλ.    (10.55)

When we compare this expression with the familiar formula for the Fourier series,
we realize that it can be regarded as a generalized version of it. The generalization
consists in not demanding that the function represented be periodic, since the domain
of definition of the initial conditions is now unbounded. As a result, we no longer
obtain a discrete spectrum of possible values for the wavelength, but rather a continu-
ous spectrum, where every wavelength is represented. If we had at our disposal some
kind of orthogonality condition, as was the case in the finite domain, we would be
able to obtain these coefficients directly from Eq. (10.55). Instead, we will proceed
to introduce the concept of Fourier integral by a heuristic argument of passage to
the limit of the Fourier series as the period tends to infinity.

9 As suggested in [6], p. 605.



10.8 The Fourier Series and the Fourier Integral

A periodic function f (x) of period 2L can be represented by means of a Fourier


series in the form

f(x) = a_0/2 + Σ_{n=1}^{∞} (a_n cos(λ_n x) + b_n sin(λ_n x)).    (10.56)

The equal sign in this equation has to be taken with a pinch of salt. Be that as it
may, the “frequencies” λn constitute a discrete spectrum dictated by the period of
the function being represented, specifically given by

λ_n = nπ/L.    (10.57)
The coefficients of the expansion (10.56), also called amplitudes, are given by the
integrals
a_n = (1/L) ∫_{−L}^{L} f(ξ) cos(λ_n ξ) dξ,    (10.58)

b_n = (1/L) ∫_{−L}^{L} f(ξ) sin(λ_n ξ) dξ.    (10.59)

These formulas are obtained by a direct application of the orthogonality property of


trigonometric functions and by assuming that the Fourier series can be integrated
term by term. It is convenient to write the Fourier series (10.56) in a more compact
notation, using complex algebra. Recalling the identity

eiα = cos α + i sin α, (10.60)

(where i denotes the imaginary unit), we can write (10.56) as




f(x) = Σ_{n=−∞}^{∞} c_n e^{iλ_n x}.    (10.61)

The (complex) coefficients are related to the (real) coefficients by the formulas
c_n = (1/2)(a_n − i b_n)  for n ≥ 0,    c_n = (1/2)(a_{−n} + i b_{−n})  for n < 0.    (10.62)

More explicitly, the coefficients are given by

c_n = (1/(2L)) ∫_{−L}^{L} f(ξ) e^{−iλ_n ξ} dξ.    (10.63)

Notice that, although the original function may have been defined only in the interval
[−L , L], the Fourier representation is valid over the whole line. In other words, the
Fourier series represents a periodic extension of the given function, obtained by just
translating and copying the function ad infinitum. When performing this extension,
even if the original function is continuous, we may obtain points of discontinuity at
the extreme values of each period. In such cases, it can be shown that the Fourier
series converges to the average value. We will not discuss these or other phenomena
pertaining to the convergence of Fourier series. In particular, we will assume that
differentiation can be carried out term by term and that the series thus obtained is an
almost-everywhere faithful representation of the derivative of the original function.
Taking these liberties, we can easily understand why the Fourier series can be so
useful in the solution of differential equations. For example, the second derivative
of a function has Fourier coefficients which are, one by one, proportional to the
coefficients of the original function, the constant of proportionality being −λ2n . In
the transformed world of Fourier coefficients, therefore, taking a second derivative
is interpreted as a kind of multiplication.
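This "differentiation becomes multiplication" rule lends itself to a quick numerical check. In the sketch below (an illustrative aside; the 2π-periodic test function f(x) = e^{cos x}, for which L = π and λ_n = n, is an arbitrary choice), the cosine coefficients (10.58) of f″ come out as −n² times those of f:

```python
import numpy as np

# For the 2*pi-periodic test function f(x) = exp(cos x) (so L = pi and
# lambda_n = n), the Fourier cosine coefficients of f'' should equal
# -n^2 times those of f. Here f'' = (sin^2 x - cos x) exp(cos x).
x = np.linspace(-np.pi, np.pi, 4001)
f = np.exp(np.cos(x))
f2 = (np.sin(x) ** 2 - np.cos(x)) * np.exp(np.cos(x))

def a(n, y):
    # Eq. (10.58) with L = pi, by the trapezoid rule
    z = y * np.cos(n * x)
    return float(np.sum((z[1:] + z[:-1]) * np.diff(x)) / (2.0 * np.pi))

print(all(abs(a(n, f2) + n ** 2 * a(n, f)) < 1e-8 for n in range(1, 6)))  # True
```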
We want to extend the above concepts to functions that are not necessarily periodic
and that are defined over the entire real line. We will make the assumption that the
absolute value of the given function f (x) is integrable and that the integral over the
real line is finite, namely, for some positive number M,

∫_{−∞}^{∞} | f(x)| dx ≤ M.    (10.64)

Let us consider an arbitrary interval [−H, H ]. If we restrict the given function to


this interval and then extend it periodically, we can represent the resulting function
f H (x) by means of a Fourier series as

f_H(x) = (1/(2H)) Σ_{n=−∞}^{∞} ∫_{−H}^{H} f(ξ) e^{−inπ(ξ−x)/H} dξ,    (10.65)

where we have combined Eqs. (10.61) and (10.63). We intend to let H go to infinity
and to replace the summation by an integral. To achieve this goal, it is convenient to
define

Δ = π/H.    (10.66)

We rewrite Eq. (10.65) trivially as


f_H(x) = (1/(2π)) Σ_{n=−∞}^{∞} ( Δ ∫_{−H}^{H} f(ξ) e^{inπ(x−ξ)/H} dξ ).    (10.67)

But if we recall the definition of the (Riemann) integral of a function as a limit of


sums, we obtain (as Δ → 0 while H → ∞)

f(x) = (1/(2π)) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ) e^{iλ(x−ξ)} dξ dλ.    (10.68)

This formula is known as the Fourier integral. Notice that in obtaining this result
we have defined a new continuous variable λ, whose discrete values in the limiting
process were nπ/H = nΔ, precisely as required by the definition of an integral. The
Fourier integral can be regarded in two steps, just as we suggested for the Fourier
series. In the first step, called the Fourier transform, we produce a transformation
of the original function f (of the independent variable x) to another function F (of
the independent variable λ running within the “frequency domain”) by means of the
formula
F(λ) = (1/√(2π)) ∫_{−∞}^{∞} f(ξ) e^{−iλξ} dξ.    (10.69)

The second step consists of the inverse Fourier transform

f(ξ) = (1/√(2π)) ∫_{−∞}^{∞} F(λ) e^{iλξ} dλ.    (10.70)

The process of obtaining the Fourier transform of functions is, clearly, a linear
operation from the space of functions into itself. We indicate this linear operator by
F. Thus, we can write
F(λ) = F[ f (x)]. (10.71)
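A classical consequence of the symmetric convention (10.69)–(10.70) is that the Gaussian e^{−x²/2} is its own Fourier transform: F(λ) = e^{−λ²/2}. The following Python sketch (an illustrative aside; the truncation of the integration domain is an arbitrary numerical choice) confirms this by straightforward quadrature of (10.69):

```python
import numpy as np

# With the symmetric convention (10.69), the transform of f(x) = e^{-x^2/2}
# is F(lambda) = e^{-lambda^2/2}; the integral is approximated by the
# trapezoid rule on a (truncated) grid.
xi = np.linspace(-20.0, 20.0, 20001)

def F(lam):
    y = np.exp(-xi ** 2 / 2.0) * np.exp(-1j * lam * xi) / np.sqrt(2.0 * np.pi)
    return np.sum((y[1:] + y[:-1]) * np.diff(xi)) / 2.0

print(all(abs(F(lam) - np.exp(-lam ** 2 / 2.0)) < 1e-10
          for lam in (0.0, 0.5, 1.0, 2.0)))   # True
```

This self-reciprocity of the Gaussian is, of course, exactly what makes the pair (10.79)–(10.80) below work out so neatly for the heat equation.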

Given a differential equation, we can apply the Fourier transform to hopefully


obtain a tractable problem in the frequency domain. After solving this problem, we
may attempt to return to the real world by means of the inverse transform. Before
entering any such considerations, we should remember that the Fourier integral was
obtained by a heuristic process of passage to the limit, so that it behooves us to justify
this process by checking directly that the result is correct. We will, however, omit
this (not very difficult) proof.10

10 See [1], p. 78.



Consider the Fourier transform of the derivative of a function. We have:


F[f′(x)] = (1/√(2π)) ∫_{−∞}^{∞} f′(ξ) e^{−iλξ} dξ
         = (1/√(2π)) ( [f(ξ) e^{−iλξ}]_{−∞}^{∞} + iλ ∫_{−∞}^{∞} f(ξ) e^{−iλξ} dξ ).    (10.72)

If the original function vanishes at ±∞, we obtain the important relation

F[f′(x)] = iλ F[f(x)].    (10.73)

Again, just as in the case of the Fourier series, we obtain that in the frequency
domain differentiation is interpreted as multiplication by the frequency variable.
Another important property is the so-called convolution. The convolution product
f ∗ g of two functions f (x) and g(x) is defined as

(f ∗ g)(x) = (1/√(2π)) ∫_{−∞}^{∞} f(x − ξ) g(ξ) dξ.    (10.74)

The convolution product is commutative and associative. The Fourier transform of


the convolution product of two functions can be shown to be equal to the product of
the transforms, that is,
F[ f ∗ g] = F[ f ] F[g]. (10.75)
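The derivative rule (10.73) is easy to confirm numerically. The sketch below (an illustrative aside using the rapidly decaying test function f(x) = e^{−x²/2}, for which f′(x) = −x e^{−x²/2}) evaluates both sides by direct quadrature of (10.69):

```python
import numpy as np

# Check of F[f'] = i*lambda*F[f] (Eq. (10.73)) for f(x) = e^{-x^2/2},
# whose derivative is f'(x) = -x e^{-x^2/2}.
xi = np.linspace(-20.0, 20.0, 20001)

def transform(y, lam):
    # Eq. (10.69) by the trapezoid rule
    z = y * np.exp(-1j * lam * xi) / np.sqrt(2.0 * np.pi)
    return np.sum((z[1:] + z[:-1]) * np.diff(xi)) / 2.0

f = np.exp(-xi ** 2 / 2.0)
fp = -xi * np.exp(-xi ** 2 / 2.0)

print(all(abs(transform(fp, lam) - 1j * lam * transform(f, lam)) < 1e-10
          for lam in (0.5, 1.0, 3.0)))   # True
```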

10.9 Solution of the Cauchy Problem

We are now in a position of solving the Cauchy problem as formulated in Eqs. (10.50)
and (10.51), provided we add the extra condition that both the initial temperature
distribution g_0(x) and its derivative g_0′(x) vanish at x = ±∞. Fourier-transforming
Eq. (10.50) with respect to the space variable (while the time variable remains as a
parameter), we can write

F[g_t(x, t)] − D F[g_xx(x, t)] = 0.    (10.76)

We note that the x-Fourier transform of g(x, t) is a function of λ and of the time vari-
able t, which has remained unaffected by the transformation. Denoting this transform
by G(λ, t) and using the derivative property of the Fourier transform, we write

G_t(λ, t) + Dλ² G(λ, t) = 0.    (10.77)

We note that the derivative with respect to the time parameter is directly reflected
as the derivative with respect to the same parameter of the Fourier transform, and it
is only the derivative with respect to the transformed variable that enjoys the special

property derived above. For each value of λ, Eq. (10.77) is a first-order ODE. The
initial condition is obtained by Fourier-transforming the initial condition (10.51). We
denote
G_0(λ) = F[g_0(x)].    (10.78)

The solution of (10.77) with initial condition (10.78) is an elementary problem and
we obtain
G(λ, t) = G_0(λ) e^{−Dλ²t}.    (10.79)

The solution to the original problem is given by the inverse transform of this function.
The calculation of inverse transforms is usually not a straightforward task. In this
particular case, however, it can be accomplished by an application of the convolution
formula (10.75). Indeed, the inverse transform of the second factor in the right-hand
side of (10.79) is given by
F^{−1}[ e^{−Dλ²t} ] = (1/√(2Dt)) e^{−x²/(4Dt)}.    (10.80)

We have obtained this value from a table of Fourier transforms, although in this case
the direct evaluation of the inverse transform is relatively straightforward. Applying
now the convolution formula to (10.79), we obtain
g(x, t) = F^{−1}[ G_0(λ) e^{−Dλ²t} ] = g_0(x) ∗ (1/√(2Dt)) e^{−x²/(4Dt)},    (10.81)

Plot3D[(1/(2*Sqrt[Pi*t]))*NIntegrate[Exp[-ksi^2/2 - (x - ksi)^2/(4*t)], {ksi, -3, 3}],
  {x, -2.5, 2.5}, {t, 0.0001, 50}, AxesLabel -> {x, t, u}, PlotRange -> All]

Fig. 10.3 Plot of Eq. (10.82)



or
g(x, t) = (1/(2√(πDt))) ∫_{−∞}^{∞} g_0(ξ) e^{−(x−ξ)²/(4Dt)} dξ.    (10.82)

This is the solution of the Cauchy problem for the heat equation. A useful way
to interpret this result is obtained by the use of the concept of generalized functions,
as we will do in the next section. Figure 10.3 shows a plot of the solution (10.82)
for D = 1 and a bell-shaped initial temperature distribution given by the function
g_0(x) = e^{−x²/2}. The integration has been numerically achieved with the use of
Mathematica. The initial time has been taken somewhat greater than zero, to avoid
a numerical singularity.
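For this particular Gaussian datum the integral (10.82) can in fact be evaluated in closed form: a standard Gaussian-convolution computation with D = 1 and g_0(x) = e^{−x²/2} gives g(x, t) = (1 + 2t)^{−1/2} e^{−x²/(2(1+2t))}. The following Python sketch (an illustrative aside, not part of the text; the truncation of the integration domain is an arbitrary numerical choice) confirms that a direct quadrature of (10.82) reproduces this:

```python
import numpy as np

# Quadrature of the solution formula (10.82) with D = 1 and
# g0(x) = e^{-x^2/2}, compared with the closed-form result
# g(x, t) = (1 + 2t)^(-1/2) * e^{-x^2 / (2 (1 + 2t))}.
xi = np.linspace(-30.0, 30.0, 30001)

def g(x, t, D=1.0):
    y = np.exp(-xi ** 2 / 2.0) * np.exp(-(x - xi) ** 2 / (4.0 * D * t))
    integral = np.sum((y[1:] + y[:-1]) * np.diff(xi)) / 2.0
    return integral / (2.0 * np.sqrt(np.pi * D * t))

checks = []
for x, t in ((0.0, 0.5), (1.0, 2.0), (-2.0, 0.1)):
    exact = np.exp(-x ** 2 / (2.0 * (1.0 + 2.0 * t))) / np.sqrt(1.0 + 2.0 * t)
    checks.append(abs(g(x, t) - exact) < 1e-10)
print(all(checks))   # True
```

The closed form exhibits the expected behaviour directly: the amplitude decays like t^{−1/2} while the profile spreads, exactly the smoothing discussed above.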

10.10 Generalized Functions

We define a generalized function (or a distribution) as a real-valued linear functional


on a space F of functions.11 One of the physical motivations behind this notion is the
ability to describe concentrated entities (forces, masses, and so on) within the same
framework as smoothly distributed ones. Let us denote by φ one such generalized
function. By saying that it is a real-valued functional on a space of functions we
mean that, given a function f belonging to this space, the functional φ assigns to it
a real number, which we will denote as φ[ f ]. The fact that this functional is linear
means that, given any two functions, f and g, and any two real numbers α and β,
we must have
φ[α f + βg] = αφ[ f ] + βφ[g]. (10.83)

Given a continuous (or just integrable) function f , we can assign to it a unique


generalized function φ f by means of the prescription

φ_f[g] = ∫_{−∞}^{∞} g(x) f(x) dx    ∀g ∈ F.    (10.84)

It is not difficult to prove (by an argument akin to the so-called fundamental lemma
of the calculus of variations) that the linear functional thus defined determines the
function f uniquely. If this were all that we have to say about generalized functions it
wouldn’t be worth our effort. But consider, for example, the following functional on
the given space of functions F: it assigns to each function the value of the function
at the origin. We denote this functional by δ and call it Dirac’s delta. More precisely

11 For technical reasons, the space of functions over which these functionals are defined consists of

the so-called space of test functions. Each test function is of class C ∞ and has compact support
(that is, it vanishes outside a closed and bounded subset of R). The graph of a test function can be
described as a smooth ‘bump’.

δ[g] = g(0) ∀g ∈ F. (10.85)

This functional is clearly linear, but is not amenable to an integral representation


in the conventional sense. In other words, there is no ordinary function, f δ say,
which, when plugged into the integrand of Eq. (10.84) will produce the same result
as Dirac’s delta. Nevertheless, we can now symbolically imagine a “function” δ(x)
which somehow does the trick, namely,

∫_{−∞}^{∞} g(x) δ(x) dx = g(0)    ∀g ∈ F.    (10.86)

A useful way to look at this integral is to consider Dirac’s delta “function” as a


limit of a sequence of integrable functions δn (x) (n = 1, 2, . . .) that vanish outside
increasingly smaller intervals around the origin. This is, after all, the “physical” mean-
ing of a concentrated entity. Such a sequence of functions, illustrated in Fig. 10.4,
can be defined as

δ_n(x) = n/2  if |x| < 1/n,    δ_n(x) = 0  if |x| ≥ 1/n.    (10.87)

The area under the graph of each function remains thus always equal to 1. As we
calculate the integral of the product of these functions with any given function g, we
obtain
∫_{−∞}^{∞} g(x) δ_n(x) dx = (n/2) ∫_{−1/n}^{1/n} g(x) dx = g(x̄).    (10.88)

We have used the mean value theorem to replace the integral of a function by the
value of the function at an interior point x̄ times the length 2/n of the interval of
integration. As n increases, the intermediate point gets more and more confined and
eventually becomes the origin, which is the only point common to all the nested
intervals. Thus we recover (10.86) as a limiting case.
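This limiting behaviour is easy to watch numerically. For the smooth choice g(x) = cos x (with g(0) = 1) the integrals (10.88) equal n sin(1/n), which tends to 1 as n grows; the sketch below (an illustrative aside) evaluates them by quadrature:

```python
import numpy as np

# The sequence (10.87) acting on the test function g(x) = cos x: the
# integrals (10.88) equal n*sin(1/n) and tend to g(0) = 1 as n grows.
def pairing(n, g, m=200001):
    x = np.linspace(-1.0 / n, 1.0 / n, m)   # delta_n vanishes outside
    y = (n / 2.0) * g(x)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

vals = [pairing(n, np.cos) for n in (1, 10, 100, 1000)]
print([round(v, 6) for v in vals])   # [0.841471, 0.998334, 0.999983, 1.0]
print(abs(vals[-1] - 1.0) < 1e-6)    # True
```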
An obvious feature of the Dirac delta function is its filtering or substitution prop-
erty
∞
g(x)δ(x − a) d x = g(a). (10.89)
−∞

Generalized functions are in general not differentiable, but we are interested in


extending the notion of differentiability in such a way that derivatives may be defined
in a distributional sense. To obtain a meaningful definition, we can emulate the
properties of the derivatives of those distributions which do have ordinary derivatives,
namely, those given by Eq. (10.84). For these regular distributions, we clearly want
to have the property

Fig. 10.4 Intuiting the Dirac distribution: the graphs of δ_1(x), δ_2(x), . . . , δ_5(x) defined in Eq. (10.87)

φ_{f′}[g] = ∫_{−∞}^{∞} g(x) f′(x) dx    ∀g ∈ F.    (10.90)

At this point, we can integrate by parts12 and obtain

φ_{f′}[g] = − ∫_{−∞}^{∞} g′(x) f(x) dx = −φ_f[g′]    ∀g ∈ F.    (10.91)

Accordingly, we adopt the following definition for the derivative of a distribution

φ′[g] = −φ[g′]    ∀g ∈ F.    (10.92)

The distributional derivative of a distribution is itself a distribution. An important


example is given by the distributional derivative of the Heaviside (step) function

H(x) = 0  if x < 0,    H(x) = 1  if x ≥ 0.    (10.93)

The generalized derivative (which we don’t bother to indicate with anything but
the conventional symbol for ordinary derivatives) is given by13

12 We must now use the fact that the function space consisted of functions with compact support, so

that they, and all their derivatives, vanish at infinity.


13 Here again we are using the compact support property.

H′[g] = −H[g′] = − ∫_{−∞}^{∞} g′(x) H(x) dx = − ∫_0^{∞} g′(x) dx = g(0).    (10.94)

We see that the action of the derivative of the Heaviside function is identical to the
action of the Dirac delta function on each and every function of the original function
space. We conclude, therefore, that the derivative of the Heaviside function is the
Dirac delta function.
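The computation (10.94) can also be watched numerically (an illustrative aside: a rapidly decaying Gaussian stands in for a compactly supported test function, and its derivative is taken in closed form):

```python
import numpy as np

# Check of -integral g'(x) H(x) dx = g(0) for g(x) = e^{-x^2/2},
# whose derivative is g'(x) = -x e^{-x^2/2} and whose value at 0 is 1.
x = np.linspace(-10.0, 10.0, 200001)
gp = -x * np.exp(-x ** 2 / 2.0)      # g'
H = (x >= 0.0).astype(float)         # Heaviside function (10.93)

y = -gp * H
val = float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)
print(abs(val - 1.0) < 1e-6)         # True: g(0) = 1
```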
Let us go back to our solution of the heat equation as expressed in Eq. (10.82),
and let us assume that our initial condition g0 (x) is not a function but a distribution.
In particular, let us consider the case

g_0(x) = δ(x − a).    (10.95)

The physical meaning of such an initial condition is that at the initial time we placed
a concentrated source of heat at the point x = a. This interpretation is clearly
contained in the conception of the Dirac function as a limit of a sequence of ordinary
functions, as we have demonstrated above. Indeed, the functions in this sequence
vanish everywhere except for an increasingly smaller and smaller interval around
that point. When we plug this initial condition in the general solution of the Cauchy
problem, we obtain:

g_a(x, t) = (1/(2√(πDt))) ∫_{−∞}^{∞} δ(ξ − a) e^{−(x−ξ)²/(4Dt)} dξ = e^{−(x−a)²/(4Dt)} / (2√(πDt)).    (10.96)

The meaning of the expression in the right-hand side of this equation is, therefore,
the temperature distribution, as a function of time and space, in an infinite rod which
has been subjected to a concentrated unit source of heat at the point x = a at time
t = 0. This is thus some kind of “influence function” (of the same type that used to
be studied in structural engineering in the old days for bridge design). In the context
of the theory of differential equations, these functions (representing the effect due to
a unit concentrated cause at an arbitrary position) are called Green’s functions. The
usefulness of Green’s functions is that, because the differential equation of departure
is linear, we can conceive of the solution as simply a superposition of the effects
of the concentrated unit sources. This interpretation is borne out by the following
equation
$$g(x,t) = \int_{-\infty}^{\infty} g_0(a)\, g_a(x,t)\, da, \qquad (10.97)$$

which is the same as Eq. (10.82). This calculation shows that, if we have any means
(exact or approximate) to calculate Green’s function for a particular differential equa-
tion (perhaps with some boundary conditions), then we have a recipe for constructing
solutions by superposition integrals.
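A numerical sketch of such a superposition integral (the initial data, the value of D, and the quadrature parameters are our own choices, not from the text): the kernel of Eq. (10.96) is superposed as in Eq. (10.97) and compared against the closed-form heat evolution of a Gaussian initial profile.

```python
import numpy as np

D = 0.5   # an arbitrary diffusion coefficient for the experiment

def heat_kernel(x, xi, t):
    # g_a of Eq. (10.96): temperature response to a unit pulse placed at xi
    return np.exp(-(x - xi)**2 / (4.0 * D * t)) / (2.0 * np.sqrt(np.pi * D * t))

def superpose(g0, x, t, half_width=20.0, n=4001):
    # Eq. (10.97): continuous superposition of point-source responses,
    # approximated by a Riemann sum over a truncated axis
    xi = np.linspace(-half_width, half_width, n)
    return np.sum(g0(xi) * heat_kernel(x, xi, t)) * (xi[1] - xi[0])

g0 = lambda a: np.exp(-a**2)   # Gaussian initial temperature
# known closed form for this initial profile:
exact = lambda x, t: np.exp(-x**2 / (1.0 + 4.0 * D * t)) / np.sqrt(1.0 + 4.0 * D * t)

x, t = 0.7, 1.3
print(superpose(g0, x, t), exact(x, t))   # the two values agree closely
```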

10.11 Inhomogeneous Problems and Duhamel’s Principle

Consider once more the Cauchy problem14 for the inhomogeneous heat equation

$$g_t - D g_{xx} = f(x,t), \qquad -\infty < x < \infty, \; t > 0, \qquad (10.98)$$

with the initial conditions

$$g(x,0) = g_0(x), \qquad -\infty < x < \infty. \qquad (10.99)$$

As we have already discovered in Sect. 8.9 when dealing with the wave equation,
Duhamel’s principle constructs the solution of non-homogeneous problems out of a
clever continuous superposition of solutions of homogeneous problems, for which
the solution is assumed to be known (either exactly or approximately). We remark that
we only need to solve the stated problem (10.98) for homogeneous initial conditions,
i.e., for
$$g(x,0) = 0, \qquad -\infty < x < \infty. \qquad (10.100)$$

Indeed, if we consider first the homogeneous problem

$$g_t - D g_{xx} = 0, \qquad -\infty < x < \infty, \; t > 0, \qquad (10.101)$$

with the original initial conditions (10.99) as solved, and if we call its solution ḡ(x, t),
then the function g(x, t) − ḡ(x, t) satisfies Eq. (10.98) with the homogeneous initial
conditions (10.100). In other words, if we solve Eq. (10.98) with homogeneous initial
conditions, all we have to do to obtain the required solution is to add the solution of
the homogeneous equation with the original inhomogeneous initial conditions.
In view of our newly acquired familiarity with the Dirac distribution, we may
motivate Duhamel’s principle by viewing the right-hand side of Eq. (10.98), repre-
senting the forcing function, as an infinite superposition of pulses in the form

$$f(x,t) = \int_{-\infty}^{\infty} f(x,\tau)\, \delta(t - \tau)\, d\tau. \qquad (10.102)$$

For this integral to make sense, we are tacitly extending the given forcing function
as zero over the interval (−∞, 0). At any rate, if t > 0, the lower limit of the
integral can be changed to 0. The mathematical expression (10.102) can be seen as
the counterpart of the graphical representation given in Box 8.2.
Assume that we are able to solve the inhomogeneous problem

$$g_t - D g_{xx} = f(x,\tau)\, \delta(t - \tau), \qquad -\infty < x < \infty, \; t > \tau > 0, \qquad (10.103)$$

14 Although we are presenting the principle in the context of an infinite rod, the same idea can be

applied to the case of the finite rod.



with vanishing initial conditions at t = τ. We denote the solution of this problem, in which τ acts just as a parameter, by g(x, t; τ). Notice that this function vanishes identically for t < τ. Then, by superposition, we must have

$$g(x,t) = \int_{0}^{t} g(x,t;\tau)\, d\tau. \qquad (10.104)$$

The remarkable fact is that we can actually obtain the solution of (10.103) by
means of an initial value problem of the homogeneous equation! To visualize how
this is possible,15 all we need to do is integrate Eq. (10.103) with respect to time
over a small interval (τ − ε, τ + ε). In so doing, and taking into consideration that g(x, t; τ) vanishes for t < τ, we obtain

$$g(x, \tau + \varepsilon; \tau) - 2\varepsilon D\, \bar{g}_{xx} = f(x, \tau), \qquad (10.105)$$

where $\bar{g}_{xx}$ is an intermediate value within the given interval. As ε → 0, we conclude that

$$g(x, \tau; \tau) = f(x, \tau). \qquad (10.106)$$

The meaning of this equation is that the problem of Eq. (10.103) with homogeneous
initial conditions can be replaced with the homogeneous problem (no forcing term),
but with the initial condition

g(x, 0; τ ) = f (x, τ ). (10.107)

Since we have assumed that the solution of the homogeneous problem with arbi-
trary initial conditions is available (for example, by means of Fourier transforms,
as per Eq. (10.82)), we obtain the solution of the original problem by means of
Eq. (10.104).
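As a numerical illustration of Eq. (10.104) (the forcing f(x, t) = sin x, the value D = 0.5, and the closed-form homogeneous solver are our own choices for the experiment, not taken from the text): each mode sin(kx) decays under the homogeneous heat equation as exp(−Dk²s), so the homogeneous solutions launched at times τ can be written down exactly and superposed by quadrature.

```python
import numpy as np

D = 0.5   # arbitrary diffusion coefficient for the experiment

def evolved_pulse(x, s):
    # Homogeneous heat-equation solution after an elapsed time s, started
    # from the data f(x, tau) = sin(x): the k = 1 mode decays as exp(-D*s).
    return np.exp(-D * s) * np.sin(x)

def duhamel(x, t, n=2_000):
    # Eq. (10.104): superpose the homogeneous solutions launched at times tau
    taus = (np.arange(n) + 0.5) * t / n   # midpoint rule in tau
    return np.sum(evolved_pulse(x, t - taus)) * (t / n)

x, t = 1.0, 2.0
# closed-form solution of g_t - D g_xx = sin x with zero initial data:
exact = (1.0 - np.exp(-D * t)) / D * np.sin(x)
print(duhamel(x, t), exact)   # the quadrature reproduces the exact value
```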
Exercises

Exercise 10.1 (Spinal drug delivery)16 A drug has been injected into the spine so
that at time t = 0 it is distributed according to the formula g(x, 0) = c + C sin(πx/L),
where the spine segment under study extends between x = 0 and x = L, and where
c and C are constants. The drug concentration at the ends of the spine segment is
artificially maintained at the value c for all subsequent times. If, in the absence of
any production, T is the time elapsed until the difference between the concentration
at the midpoint of the spine and c reaches one-half of its initial value, calculate the
longitudinal diffusion coefficient D of the drug through the spinal meninges. For
a rough order of magnitude, assume L = 10 mm and T = 3 h. [Hint: verify that g(x, t) = c + C exp(−Dπ²t/L²) sin(πx/L) satisfies the PDE (10.4) and the initial and boundary conditions.]

15 This is a somewhat different interpretation from that of Sect. 8.9.

16 See [2], p. 40.
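If one follows the hint, the midpoint excess over c decays by the factor exp(−Dπ²t/L²); requiring this factor to equal 1/2 at t = T gives D = L² ln 2/(π²T). A two-line numerical check with the suggested order-of-magnitude values (this is our own sketch of the calculation the exercise asks for):

```python
import math

# Setting exp(-D * pi**2 * T / L**2) = 1/2 (per the hint's solution form)
# and solving for the diffusion coefficient:
L = 10.0           # mm
T = 3.0 * 3600.0   # s
D = L**2 * math.log(2.0) / (math.pi**2 * T)
print(D)           # about 6.5e-4 mm^2/s
```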

Exercise 10.2 (Modified discrete model) Show that if the probability β in the model
of Box 10.1 has a value 0 < β < 0.5, so that the particles have a positive probability
α = 1 − 2β of staying put, the diffusion equation is still recovered, but with a
different value for the diffusion coefficient D.

Exercise 10.3 (Biased discrete diffusion) Let β + and β − = 1−β + represent, respec-
tively, the generally different probabilities for a particle to move to the right or to the
left. Obtain a PDE whose approximation matches the corresponding discrete model.
Propose a physical interpretation.

Exercise 10.4 (Finite domain) Modify the original discrete model so that it can
accommodate a spatial domain of a finite extent. Consider two different kinds of
boundary conditions, as follows: (1) The number of particles at each end of the
domain remains constant. For this to be the case, new particles will have to be
supplied or removed at the ends. (2) The total number of particles is preserved, with
no flux of new particles through the end points of the domain. Implement the resulting
model in a computer code and observe the time behaviour. What is the limit state of
the system for large times under both kinds of boundary conditions?
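A possible starting point for the conservative (no-flux) variant of the model is sketched below; the domain size, particle count, and the convention that a mover who would leave the domain simply stays put are all our own choices. The total number of particles is conserved exactly, and the profile flattens toward the uniform limit state.

```python
import numpy as np

rng = np.random.default_rng(0)

def step_no_flux(counts):
    # One time step: every particle jumps left or right with probability
    # 1/2; a jump that would leave the domain is suppressed (the particle
    # stays put), so the total count is conserved.
    n = counts.size
    new = np.zeros_like(counts)
    for i, c in enumerate(counts):
        right = rng.binomial(c, 0.5)     # movers to the right
        left = c - right                 # the rest move to the left
        new[min(i + 1, n - 1)] += right
        new[max(i - 1, 0)] += left
    return new

counts = np.zeros(21, dtype=np.int64)
counts[10] = 10_000                      # all particles start at the centre
for _ in range(2_000):
    counts = step_no_flux(counts)
print(counts.sum())                      # 10000: particles are conserved
```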

Exercise 10.5 Prove that if the data (over the three appropriate sides of a rectangular
region) of one solution of the heat equation are everywhere greater than the data of
another solution, then the same holds true for the corresponding solutions at all the
interior points of the region. Moreover, show that if the absolute value of the data at
each point is smaller than that of the data of an everywhere positive solution, then
so is the absolute value at each interior point smaller than the corresponding value
of the (positive) solution at that point.

Exercise 10.6 (Irreversibility) Show that the boundary-initial value of the heat equa-
tion for a finite rod cannot in general be solved backward in time for general initial
conditions. [Hint: assume that it can and impose initial conditions that are not of
class C ∞ .]

Exercise 10.7 Solve the problem17

$$g_t = g_{xx}, \qquad 0 < x < 1, \; t > 0,$$
$$g(0,t) = 0, \quad g(1,t) = \cos t, \qquad t > 0,$$
$$g(x,0) = x, \qquad 0 \le x \le 1.$$

Use the method of eigenfunction expansion.

17 See [3], p. 71.



Exercise 10.8 Verify the equivalence between expressions (10.56) and (10.61).

Exercise 10.9 (Convolution) Prove Eq. (10.75). [Hint: apply the inverse Fourier
transform to the right-hand side.]

Exercise 10.10 Solve the Cauchy problem for the non-homogeneous heat equation

$$g_t - D g_{xx} = f(x,t), \qquad -\infty < x < \infty, \; t > 0, \qquad (10.108)$$

with homogeneous initial conditions by means of Fourier transforms. [Hint: to solve the ODE obtained in the frequency domain, use the method of variation of the constants.]

Exercise 10.11 (Fourier transform and the wave equation) Apply the Fourier trans-
form method to the solution of the Cauchy problem for the (homogeneous) one-
dimensional wave equation when the initial velocity is zero and the initial defor-
mation of the (infinite) string is given as some function f (x). Compare with the
d’Alembert solution.

Exercise 10.12 The free transverse vibrations of a beam are described by the fourth-
order differential equation (9.80), namely,

$$c^4 u_{xxxx} + u_{tt} = 0,$$

where c is a constant. Use Fourier transforms to solve for the vibrations of an infinite beam knowing that at time t = 0 the beam is released from rest with a displacement given by a function u(x, 0) = f(x). Express the result as an integral. Hint: the inverse Fourier transform of cos(aλ²) is (1/√(2a)) cos(x²/(4a) − π/4).

Exercise 10.13 (Distributional derivative) Justify the conclusion that the distribu-
tional derivative of the Heaviside function is the Dirac distribution by approximating
the (discontinuous) Heaviside function by means of a sequence of continuous func-
tions including straight transitions with increasingly steeper slopes.

Exercise 10.14 (Bending moment diagrams) If a bending moment diagram of a


beam is a C 2 function, considerations of equilibrium show that its second derivative
equals the applied distributed load. Imagine now a C 0 bending moment diagram
given by the formula

$$M(x) = \begin{cases} kx & 0 \le x \le L/2 \\ k(L-x) & L/2 < x \le L \end{cases}$$

where k is a constant and L is the length of the beam. Determine the load applied
on the beam by calculating the second distributional derivative of the given bending
moment function. Show that, extending the diagram as zero beyond the beam domain,
we also recover the reactions at the supports.

Exercise 10.15 (Influence function in statics) Given a simply supported beam, iden-
tify the influence function ga (x) with the bending moment diagram due to a unit con-
centrated force acting at the point a ∈ [0, L]. Apply Eq. (10.97) carefully to obtain
the bending moment diagram for an arbitrary load g(x). Compare with the standard
answer. Check the case g(x) = constant.

Exercise 10.16 Express the solution of the inhomogeneous problem at the end of
Sect. 10.9 in terms of the notation (10.96).

Exercise 10.17 Construct the solution of the Cauchy problem (10.98) with homo-
geneous boundary conditions by means of Duhamel’s principle. Compare the result
with that of the exercise at the end of Sect. 10.10.

Exercise 10.18 Apply Duhamel’s principle to the general inhomogeneous heat


equation over a finite domain, when both the boundary and the initial conditions are
zero. For the solution of the associated homogeneous problem, use the eigenfunction
expansion method. Show that the result is identical to that obtained by expanding the
forcing function in terms of eigenfunctions, rather than using Duhamel’s principle.

References

1. Courant R, Hilbert D (1962) Methods of mathematical physics, vol I. Interscience, Wiley, New
York
2. Epstein M (2012) The elements of continuum biomechanics. Wiley, London
3. Farlow SJ (1993) Partial differential equations for scientists and engineers. Dover, New York
4. John F (1982) Partial differential equations. Springer, Berlin
5. Petrovsky IG (1954) Lectures on partial differential equations. Interscience, New York
(Reprinted by Dover (1991))
6. Zauderer E (1998) Partial differential equations of applied mathematics, 2nd edn. Interscience,
Wiley, New York
Chapter 11
The Laplace Equation

The Laplace equation is the archetypal elliptic equation. It appears in many applica-
tions when studying the steady state of physical systems that are otherwise governed
by hyperbolic or parabolic operators. Correspondingly, elliptic equations require the
specification of boundary data only, and the Cauchy (initial-value) problem does not
arise. The boundary data of a second-order elliptic operator offer a choice between
two extremes: either the function or its transverse derivative must be specified at the
boundary, but not both independently. From the physical standpoint, this dichotomy
makes perfect sense. Thus, in a body in equilibrium, one expects to specify a support
displacement or the associated reaction force, but not both.

11.1 Introduction

The Laplace equation


$$u_{xx} + u_{yy} = 0, \qquad (11.1)$$

for a scalar function u = u(x, y), and its inhomogeneous version

$$u_{xx} + u_{yy} = f(x,y), \qquad (11.2)$$

also known as the Poisson equation, as well as their three-dimensional counterparts

$$u_{xx} + u_{yy} + u_{zz} = 0, \qquad (11.3)$$

and

$$u_{xx} + u_{yy} + u_{zz} = f(x, y, z), \qquad (11.4)$$

© Springer International Publishing AG 2017


M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5_11

are ubiquitous in Physics and Engineering applications. We have already mentioned


their appearance in the theory of static elasticity for the description of the small
transverse deflections of a tensed membrane. Another application in elasticity is
the torsion of a shaft of non-circular cross section. In fluid mechanics, the Laplace
equation appears in connection with the irrotational motion of a perfect fluid. A sim-
ilar application pertains to acoustics. In electrostatics, Poisson’s equation relates the
charge density with the electrostatic potential. In the classical theory of gravitation,
Poisson’s equation relates the mass density with the gravitational potential. In heat
conduction, the Laplace equation appears in connection with the steady state. In
general, many hyperbolic equations in two or three spatial dimensions have a spatial
part given by the Laplace operator, also called the Laplacian, defined (in Cartesian
coordinates) as
$$\nabla^2 u = u_{xx} + u_{yy} + u_{zz}. \qquad (11.5)$$

If it appears convenient (for reasons of symmetry, for example) to formulate a problem


in a curvilinear coordinate system while keeping the same physical meaning for the
field variable, then the Laplace operator must change accordingly.
A function satisfying Laplace’s equation is said to be harmonic.1 In an
n-dimensional setting, the Laplacian is given by the obvious extension of Eq. (11.5)
to n independent variables. Equations (11.3) and (11.4) are extended accordingly.
The equations of Laplace and Poisson, as we have already remarked in Chap. 6, are
of the elliptic type. They possess no characteristic directions and, correspondingly,
the nature of the domains of definition and of the boundary conditions that lead to
properly formulated problems with unique solutions is different from the case of
hyperbolic and parabolic equations.
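The Laplacian lends itself to a quick numerical experiment: on a uniform grid, the five-point stencil approximates u_xx + u_yy, and applying it to a known harmonic function should return (numerically) zero. A minimal sketch in two dimensions, with the grid, spacing, and test functions chosen by us:

```python
import numpy as np

def discrete_laplacian(u, h):
    # Five-point stencil: a finite-difference version of Eq. (11.5) in 2D,
    # evaluated at the interior points of the grid
    return (u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
            - 4.0 * u[1:-1, 1:-1]) / h**2

h = 0.01
x, y = np.meshgrid(np.arange(0.0, 1.0 + h, h),
                   np.arange(0.0, 1.0 + h, h), indexing="ij")

print(np.abs(discrete_laplacian(x**2 - y**2, h)).max())  # ~0: harmonic
print(np.abs(discrete_laplacian(x**2 + y**2, h)).max())  # ~4: not harmonic
```

(The stencil is exact for quadratics, so the first result is zero up to round-off.)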

11.2 Green's Theorem and the Dirichlet and Neumann Problems

One of the fundamental theorems of vector calculus is the divergence theorem. It


states that the integral of the divergence of a (sufficiently smooth) vector field over a
(sufficiently regular) domain D in Rn is equal to the flux of the vector field over the
boundary ∂D of this domain. We recall that, in Cartesian coordinates x1 , ..., xn , the
divergence of a vector field v with components v1 , ..., vn , is defined as

$$\operatorname{div} \mathbf{v} = \frac{\partial v_1}{\partial x_1} + \frac{\partial v_2}{\partial x_2} + \dots + \frac{\partial v_n}{\partial x_n}. \qquad (11.6)$$

The flux of the vector v over an oriented surface element d A with (exterior) unit
normal n, is obtained by projecting the vector on the normal and multiplying by the

1 Notice that we have also used the term harmonic to designate a sinusoidal function of one variable.

These two usages are unrelated.



area of the element. In terms of these definitions, therefore, the divergence theorem
establishes that

$$\int_D (\operatorname{div} \mathbf{v})\, dV = \int_{\partial D} \mathbf{v} \cdot \mathbf{n}\, dA. \qquad (11.7)$$

We will use this theorem (whose proof can be found in any good textbook of Calculus)
to derive some useful expressions involving the Laplacian operator.
Consider a differentiable function u(x1 , ..., xn ). The gradient of this function is
the vector field ∇u with components
$$\{\nabla u\} = \left( \frac{\partial u}{\partial x_1}, \ldots, \frac{\partial u}{\partial x_n} \right)^{T}. \qquad (11.8)$$

The divergence of the gradient is, therefore, precisely the Laplacian. We conclude
that the integral of the Laplacian of a scalar field over a domain is equal to the flux
of its gradient through the boundary. In this case, the dot product of the vector field
(i.e., the gradient) with the exterior unit normal is the directional derivative of the
scalar field in the exterior normal direction n. Thus, we can write

$$\int_D \nabla^2 u\, dV = \int_{\partial D} \frac{du}{dn}\, dA. \qquad (11.9)$$

Consider next two scalar fields u, v. Applying the divergence theorem to the product of one field times the gradient of the other, we obtain

$$\int_{\partial D} v\, \frac{du}{dn}\, dA = \int_D \left( v\, \nabla^2 u + \nabla u \cdot \nabla v \right) dV. \qquad (11.10)$$

Subtracting the expression obtained by interchanging the fields yields

$$\int_{\partial D} \left( v\, \frac{du}{dn} - u\, \frac{dv}{dn} \right) dA = \int_D \left( v\, \nabla^2 u - u\, \nabla^2 v \right) dV. \qquad (11.11)$$

Equations (11.10) and (11.11) are known as Green’s identities.
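The first identity (11.10) is easy to check symbolically on a concrete domain. In the sketch below (using SymPy, and with the unit square and the two polynomial fields chosen arbitrarily by us), the boundary term is assembled side by side with its outward normals and compared with the volume term:

```python
import sympy as sp

x, y = sp.symbols("x y")
u = x**2 * y          # two arbitrary smooth fields on the unit square
v = x + y

lap_u = sp.diff(u, x, 2) + sp.diff(u, y, 2)
grad_dot = sp.diff(u, x) * sp.diff(v, x) + sp.diff(u, y) * sp.diff(v, y)
volume = sp.integrate(v * lap_u + grad_dot, (x, 0, 1), (y, 0, 1))

# v * du/dn summed over the four sides, with outward unit normals
boundary = (sp.integrate((v * sp.diff(u, x)).subs(x, 1), (y, 0, 1))    # right:  n = (+1, 0)
            - sp.integrate((v * sp.diff(u, x)).subs(x, 0), (y, 0, 1))  # left:   n = (-1, 0)
            + sp.integrate((v * sp.diff(u, y)).subs(y, 1), (x, 0, 1))  # top:    n = (0, +1)
            - sp.integrate((v * sp.diff(u, y)).subs(y, 0), (x, 0, 1))) # bottom: n = (0, -1)

print(volume, boundary)   # both sides of Eq. (11.10) give the same number
```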


For the particular case in which both scalar fields are made to coincide, Eq. (11.10) yields the result

$$\int_{\partial D} u\, \frac{du}{dn}\, dA = \int_D \left( u\, \nabla^2 u + \nabla u \cdot \nabla u \right) dV. \qquad (11.12)$$

Suppose now that the function u is harmonic over the domain D and that it
vanishes on its boundary ∂D. From Eq. (11.12) it will follow that the integral of
the square of the magnitude of the gradient must vanish. But for a continuous and
non-negative function this is possible only if the gradient vanishes identically within
the domain. In other words, all its partial derivatives vanish identically, so that the
function must, in fact, be a constant. But since the function has been assumed to
vanish over the boundary of the domain, we conclude that it must vanish over the
whole of D. This rather trivial observation implies the uniqueness of the solution of
the so-called Dirichlet problem defined as follows.
Dirichlet problem: Find a solution u of the Poisson equation ∇ 2 u = f over a
domain D with prescribed values of u on the boundary ∂D. The proof of uniqueness2
is straightforward. Indeed, assume that there exist two solutions to this problem. Since
the Poisson equation is linear, the difference between these two solutions must be
harmonic (i.e., must satisfy the Laplace equation) and attain a zero value over the
whole boundary. It follows from our previous reasoning that this difference must be
identically zero, so that both solutions coincide.
Consider now the case in which u is harmonic over the domain D and that its
normal derivative (rather than the function itself) vanishes on the boundary ∂D.
Again, by applying Eq. (11.12), we arrive at the conclusion that u must be a constant.
Nevertheless, in this case, we can no longer conclude that it must vanish. We thus
obtain the following statement about the solution of the so-called Neumann problem.3
Neumann problem: Find a solution of the Poisson equation ∇ 2 u = f over a
domain D with prescribed values of the normal derivative on ∂D. A solution of this
problem is determined uniquely to within an additive constant. Moreover, according
to Eq. (11.9), the solution can only exist if the boundary data satisfy the auxiliary condition

$$\int_{\partial D} \frac{du}{dn}\, dA = \int_D f\, dV. \qquad (11.13)$$

Remark 11.1 Intuitively, we can imagine that the Dirichlet problem corresponds to
the specification of displacements, while the Neumann problem corresponds to the
specification of boundary tractions in an elastic structure. This explains why the
Neumann problem requires the satisfaction of an auxiliary condition: The tractions
must be in equilibrium with the applied body forces. If they are not, a solution cannot
exist within the realm of statics. The dynamic problem is, of course, governed by
hyperbolic equations, such as the wave equation, which necessitate the specification
of initial displacements and velocities and which include the forces of inertia.
Notice that, since the specification of the value of the solution on the boundary is
enough to determine a unique solution (if such a solution indeed exists), we cannot

2 Strictlyspeaking, this proof of uniqueness requires the solution to be twice differentiable not just
in the interior but also on the boundary of the domain. This requirement can be relaxed if the proof
is based on the maximum-minimum principle, that we shall study below.
3 Named after the German mathematician Carl Gottfried Neumann (1832–1925), not to be confused

with John von Neumann (1903–1957), the great Hungarian-American mathematician.



simultaneously specify both the function and its normal derivative on the boundary. In
other words, the Cauchy problem for the Laplace equation has no solution in general.
This fact is in marked contrast with hyperbolic equations. It can be shown that the
Cauchy problem for the Laplace equation is in general unsolvable even locally.4 We
have already seen that a solution of the heat equation must be of class C ∞ . In the
case of Laplace's equation, the complete absence of characteristic directions leads
one to guess that perhaps this will also be the case, since no discontinuities can be
tolerated. It can be shown, in fact, that bounded solutions of Laplace’s equation must
be not just C ∞ , but also (real) analytic (i.e., they must have convergent Taylor-series
expansions in an open neighbourhood of every point).

11.3 The Maximum-Minimum Principle

Theorem 11.1 (Maximum-minimum theorem)5 A harmonic function which is continuous in D ∪ ∂D (namely, in the union of the interior of a bounded domain and its boundary) attains its maximum and minimum values on the boundary ∂D.

Proof The proof can be carried out along similar lines as in the case of the parabolic
equation. Since the boundary ∂D is a closed and bounded set, the restriction of u
to ∂D must attain a maximum value m at some point of ∂D. On the other hand,
since D ∪ ∂D is also closed and bounded, the function must attain its maximum
M at some point P of D ∪ ∂D. Let us assume that P is an interior point and that,
moreover, M > m. Without any loss of generality, we may assume that the origin of
coordinates is at P. Let us now construct the auxiliary function

$$v = u + \frac{M-m}{2d^2}\, r^2. \qquad (11.14)$$
In this expression, r denotes the length of the position vector and d is the diameter
of D.6 This function v is strictly larger than u, except at P, where they have the
same value, namely M. The restriction of v to ∂D, on the other hand, will satisfy the
inequality
$$v \le m + \frac{M-m}{2} < M. \qquad (11.15)$$
We conclude that v attains its maximum at an interior point. On the other hand,

4 See [3], p. 98. Notice that, correspondingly, the Dirichlet problem has in general no solution for
the hyperbolic and parabolic cases, since the specification of the solution over the whole boundary
of a domain will in general lead to a contradiction. For this point see [2], p. 236.
5 See [4], p. 169.
6 Recall that the diameter of a set (in a metric space) is the least upper bound of the distances between

all pairs of points of the set.



$$\nabla^2 v = \nabla^2 u + \frac{M-m}{2d^2}\, \nabla^2 (r^2) = \frac{n\,(M-m)}{d^2} > 0, \qquad (11.16)$$

because ∇²u = 0 and ∇²(r²) = 2n.
Since, however, at an interior point of a domain any maximum of a differentiable
function must be a relative maximum, none of the second partial derivatives can be
positive, making the satisfaction of Eq. (11.16) impossible. Having arrived at this
contradiction, we conclude that the maximum of u is attained at the boundary ∂D.
Changing u to −u, we can prove that the same is true for the minimum value. □


Just as in the case of the parabolic heat equation, we can prove as corollaries of
the maximum-minimum theorem the uniqueness and continuous dependence on the
boundary data of the Dirichlet problem.
A nice intuitive visualization of the maximum-minimum principle can be gathered
from the case of a membrane (or a soap film) extended over a rigid closed frame. If we
give the initially plane frame a small transverse deformation (i.e., a warping), we do
not expect the membrane to bulge either upwards or downwards beyond the frame,
unless external forces are applied. A similar intuitive interpretation can be stated in
the realm of the thermal steady state over some plane region. The temperature attains
its maximum and minimum values at the boundaries of the region.
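The discrete counterpart of this picture can be observed numerically. In the sketch below (the boundary data, grid size, and use of Jacobi iteration, a standard relaxation scheme not discussed in the text, are all our own choices), every interior value is repeatedly replaced by the average of its four neighbours while the "warped frame" boundary values stay fixed; the interior extremes never exceed the boundary extremes:

```python
import numpy as np

# Jacobi iteration for the discrete Laplace equation on a square grid:
# each sweep averages the four neighbours, boundary values held fixed.
n = 41
u = np.zeros((n, n))
t = np.linspace(0.0, 2.0 * np.pi, n)
u[0, :] = np.sin(t)      # a "warped frame": prescribed boundary data
u[-1, :] = -np.sin(t)
u[:, 0] = 0.0
u[:, -1] = 0.0

for _ in range(10_000):
    u[1:-1, 1:-1] = 0.25 * (u[2:, 1:-1] + u[:-2, 1:-1]
                            + u[1:-1, 2:] + u[1:-1, :-2])

interior = u[1:-1, 1:-1]
print(interior.max() <= u[0, :].max())   # True: no bulge above the frame
print(interior.min() >= u[-1, :].min())  # True: no bulge below it either
```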

11.4 The Fundamental Solutions

There are many ways to tackle the difficult problem of solving the Laplace and
Poisson equations in some degree of generality. A useful result towards this end is
obtained by investigating the possible existence of spherically symmetric solutions.
Let P ∈ Rn be a point with coordinates x̄1 , ..., x̄n and let r = r (x1 , ..., xn ) denote
the distance function to P, namely,

$$r = +\sqrt{\sum_{j=1}^{n} \left( x_j - \bar{x}_j \right)^2}. \qquad (11.17)$$

A function is said to be spherically symmetric with respect to P if it can be expressed


as a function of the single variable r , viz.,

u = g(r ). (11.18)

If this is the case, it is not difficult to calculate its Laplacian. Indeed, denoting by
primes the derivatives of g with respect to its independent variable r , we obtain

$$\frac{\partial u}{\partial x_k} = g'\, \frac{\partial r}{\partial x_k} = g'\, \frac{x_k - \bar{x}_k}{r}. \qquad (11.19)$$

Similarly, for the second derivatives we get



$$\frac{\partial^2 u}{\partial x_m \partial x_k} = g''\, \frac{(x_m - \bar{x}_m)(x_k - \bar{x}_k)}{r^2} + g'\, \frac{r^2 \delta_{km} - (x_m - \bar{x}_m)(x_k - \bar{x}_k)}{r^3}. \qquad (11.20)$$

Making m = k and adding for k = 1, ..., n yields the desired result as

$$\nabla^2 u = g'' + \frac{n-1}{r}\, g'. \qquad (11.21)$$
If we wish to satisfy Laplace’s equation, therefore, we are led to the solution of a
simple linear ODE. Specifically,

$$g'' + \frac{n-1}{r}\, g' = 0. \qquad (11.22)$$
The solution, which exists on the whole of Rn , except at P (where it becomes
unbounded), is given by

$$g = \begin{cases} A + \dfrac{B}{r^{n-2}} & \text{if } n > 2 \\[2mm] A + B \ln r & \text{if } n = 2 \end{cases}, \qquad (11.23)$$

where A and B are arbitrary constants. Notice that in the particularly important case
n = 3 the solution is a linear function of the reciprocal distance (from the physical
point of view, this corresponds to the electrostatic or gravitational potentials of a
concentrated charge or mass).
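That these profiles are indeed harmonic away from the pole can be checked symbolically. A sketch using SymPy (with the pole placed at the origin, and the constants A, B dropped since they do not affect the Laplacian):

```python
import sympy as sp

x, y, z = sp.symbols("x y z", positive=True)

r3 = sp.sqrt(x**2 + y**2 + z**2)   # distance to the pole at the origin
r2 = sp.sqrt(x**2 + y**2)

# n = 3: 1/r (Eq. (11.23) with n - 2 = 1);  n = 2: ln r
lap3 = sum(sp.diff(1 / r3, s, 2) for s in (x, y, z))
lap2 = sp.diff(sp.log(r2), x, 2) + sp.diff(sp.log(r2), y, 2)

print(sp.simplify(lap3), sp.simplify(lap2))   # 0 0: harmonic away from the pole
```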
To reveal the meaning of the solution (11.23), let us consider the following (three-
dimensional) problem associated with a sphere of radius ε with centre at P. We look
for a bounded C 1 spherically-symmetric function u(x, y, z) vanishing at infinity and
satisfying the Poisson equation

$$\nabla^2 u = \frac{1}{\frac{4}{3}\pi \varepsilon^3}, \qquad r \le \varepsilon, \qquad (11.24)$$

and the Laplace equation


$$\nabla^2 u = 0, \qquad r > \varepsilon. \qquad (11.25)$$

Just as before, the assumed spherical symmetry allows us to reduce this problem to
that of an ODE, namely, setting u = g(r ; ε),
$$g''(r;\varepsilon) + \frac{2}{r}\, g'(r;\varepsilon) = \begin{cases} \dfrac{3}{4\pi \varepsilon^3} & \text{if } r \le \varepsilon \\[2mm] 0 & \text{if } r > \varepsilon \end{cases}. \qquad (11.26)$$

The solution of this problem is easily obtained as


$$g(r;\varepsilon) = \begin{cases} \dfrac{1}{8\pi \varepsilon} \left( \dfrac{r^2}{\varepsilon^2} - 3 \right) & \text{if } r \le \varepsilon \\[2mm] -\dfrac{1}{4\pi r} & \text{if } r > \varepsilon \end{cases}. \qquad (11.27)$$

A remarkable feature of the solution just found is that the solution outside the
sphere is independent of the size of the sphere.7 As the radius of the sphere tends
to zero, the right-hand side of Eq. (11.24) approaches δP, the (three-dimensional)
Dirac delta function at P. This means that, with A = 0 and with the value of B
appropriately calibrated (for each dimension), Eq. (11.23) provides the solution of
the problem
$$\nabla^2 u = \delta_P, \qquad (11.28)$$

with zero boundary condition at infinity. We call this a fundamental solution of the
Laplace equation with pole P. This solution depends both on the coordinates x j
of the variable point in space and the coordinates x̄ j of point P. It is sometimes
convenient to express this double dependence explicitly with the notation K (x j , x̄ j ).
The explicit formulas for the fundamental solution in an arbitrary dimension can be
obtained in a similar way.8
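The solution (11.27) can also be verified symbolically: the sketch below (using SymPy) checks the radial ODE (11.26) inside the sphere, harmonicity outside, and the C¹ matching of the two branches at r = ε.

```python
import sympy as sp

r, eps = sp.symbols("r varepsilon", positive=True)

g_in = (r**2 / eps**2 - 3) / (8 * sp.pi * eps)   # Eq. (11.27) for r <= eps
g_out = -1 / (4 * sp.pi * r)                     # Eq. (11.27) for r > eps

# left-hand side of the radial ODE (11.26), the spherically symmetric
# Laplacian for n = 3:
radial = lambda g: sp.diff(g, r, 2) + 2 * sp.diff(g, r) / r

print(sp.simplify(radial(g_in) - 3 / (4 * sp.pi * eps**3)))   # 0: source matched inside
print(sp.simplify(radial(g_out)))                             # 0: harmonic outside
print(sp.simplify((g_in - g_out).subs(r, eps)))               # 0: g continuous at eps
print(sp.simplify((sp.diff(g_in, r) - sp.diff(g_out, r)).subs(r, eps)))  # 0: g' too
```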
Any solution of Eq. (11.28) in some domain D ⊂ Rn containing P, regardless of
boundary conditions, will also be called a fundamental solution with pole P. Clearly,
if w is harmonic within this domain, the new function

$$G(x_j, \bar{x}_j) = K(x_j, \bar{x}_j) + w(x_j) \qquad (11.29)$$

is also a fundamental solution with pole P.

11.5 Green’s Functions

A fundamental solution that satisfies zero boundary conditions on the boundary of a


given domain is called Green’s function for that domain. The importance of Green’s
functions (also known as influence functions, for obvious reasons) is that the solution
of the Poisson equation
$$\nabla^2 u = f(x_j) \qquad (11.30)$$

on a domain D with u vanishing at the boundary is given by the superposition integral

7 This feature of the solution for the case of the gravitational field was extremely important to
Newton, who was at pains to prove it. It is this property that allowed him to conclude that the forces
exerted by a homogeneous sphere on empty space are unchanged if the total mass is concentrated
at its centre.
8 For the case n = 2, we have B = 1/2π. For n = 3, as we have seen, the value is B = −1/4π.

For higher dimensions, the value of the constant can be shown to be related to the ‘area’ of the
corresponding hyper-sphere, which involves the Gamma function. See [3], p. 96.
$$u(x_j) = \int_D G(x_j, \xi_j)\, f(\xi_j)\, dV_\xi. \qquad (11.31)$$

The solution of the Dirichlet problem for the Laplace equation with inhomoge-
neous boundary conditions, can be obtained in a similar way. Indeed, let the boundary
values be given by
$$u|_{\partial D} = h(x_j). \qquad (11.32)$$

Let us assume that this function h has enough smoothness that we can extend it
(non-uniquely, of course) to a C ∞ function ĥ defined over the whole domain D.
Then clearly the function
v = u − ĥ (11.33)

satisfies, by construction, homogeneous boundary conditions, but the inhomogeneous (Poisson) equation

$$\nabla^2 v = -\nabla^2 \hat{h}. \qquad (11.34)$$

In terms of the Green function (assumed to be known for the domain under consid-
eration), the solution of this problem is given by Eq. (11.31) as

$$v(x_j) = -\int_D G(x_j, \xi_j)\, \nabla^2 \hat{h}(\xi_j)\, dV_\xi. \qquad (11.35)$$

From Eq. (11.33) we obtain the solution of the original Dirichlet problem as

$$u(x_j) = \hat{h}(x_j) - \int_D G(x_j, \xi_j)\, \nabla^2 \hat{h}(\xi_j)\, dV_\xi. \qquad (11.36)$$

This expression would seem to indicate that the solution depends on the particular
extension ĥ adopted. That this is not the case can be deduced by a straightforward
application of Eq. (11.11), appropriately called a Green identity, which yields

$$\int_D G(x_j, \xi_j)\, \nabla^2 \hat{h}(\xi_j)\, dV_\xi = \hat{h}(x_j) - \int_{\partial D} h(\xi_j)\, \frac{dG(x_j, \xi_j)}{dn}\, dA_\xi. \qquad (11.37)$$

In obtaining this result, the properties of each of the functions involved were
exploited. Combining Eqs. (11.36) and (11.37), we obtain the final result

$$u(x_j) = \int_{\partial D} h(\xi_j)\, \frac{dG(x_j, \xi_j)}{dn}\, dA_\xi. \qquad (11.38)$$

Thus, the solution involves only the values of the data at the boundary, rather than
any extension to the interior. Expression (11.38) is regular at every interior point of
the domain.
Although we have not provided rigorous proofs of any of the preceding theorems
(proofs that can be found in the specialized books),9 we have attempted to present
enough circumstantial evidence to make these results at least plausible. The main
conclusion so far is that to solve a Dirichlet problem over a given domain, whether for
the Laplace or the Poisson equation, can be considered equivalent to finding Green’s
function for that domain. It is important to realize, however, that finding Green’s
function is itself a Dirichlet problem.
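A one-dimensional analogue makes the recipe concrete. For u″ = f on (0, 1) with u(0) = u(1) = 0, the Green's function is known in closed form (this example is our own, not taken from the text), and the superposition integral of the type (11.31) reproduces the exact solution:

```python
import numpy as np

def G(x, xi):
    # Green's function of u'' = delta(x - xi) on (0, 1), u(0) = u(1) = 0:
    # continuous, piecewise linear, with a unit jump in slope at x = xi.
    return np.where(x < xi, x * (xi - 1.0), xi * (x - 1.0))

def solve(f, x, n=200_000):
    # superposition integral over the source, evaluated by a midpoint sum
    xi = (np.arange(n) + 0.5) / n
    return np.sum(G(x, xi) * f(xi)) / n

f = lambda xi: np.ones_like(xi)   # a constant unit source term
x = 0.3
print(solve(f, x), x * (x - 1.0) / 2.0)   # numeric vs. exact solution of u'' = 1
```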

11.6 The Mean-Value Theorem

An important result that characterizes harmonic functions is the following.

Theorem 11.2 (Mean value theorem) The value of a harmonic function u at any
point is equal to the average of its values over the surface of any sphere with centre
at that point.

Proof The theorem is valid for the circle, the sphere or, in the general case, an
n-dimensional ball. For specificity, we will consider the (three-dimensional) case of
a sphere. Let P be a point with coordinates x̂ j in the domain under consideration
and let B denote the (solid) sphere with centre P and radius R. Always keeping
the centre fixed, we proceed to apply Green's formula (11.11) to the fundamental
solution K(x_j, x̂_j) and to the harmonic function u(x_j); noticing the vanishing
of the term ∇²u (by hypothesis), we obtain
 
$$\int_{B} u\, \nabla^2 K\, dV = \int_{\partial B} \left( u\, \frac{dK}{dn} - K\, \frac{du}{dn} \right) dA. \tag{11.39}$$

But, due to the spherical symmetry of K , both it and its derivative in the normal
direction (which is clearly radial) are constant over the boundary of the ball. Recalling
that, by a direct application of the divergence theorem, the flux of the gradient of a
harmonic function vanishes over the boundary of any domain, we conclude that the
last term under the right-hand side integral vanishes. The normal derivative of K at
the boundary is obtained directly from Eq. (11.23), or (11.27), as

$$\frac{dK}{dn} = \frac{dK}{dr} = \frac{1}{4\pi R^2}. \tag{11.40}$$
Finally, invoking Eq. (11.28) for the fundamental solution K , we obtain

9 Good sources are [1, 3, 5].



$$u(\hat{x}_j) = \frac{1}{4\pi R^2} \int_{\partial B} u\, dA, \tag{11.41}$$

which is the desired result. ∎

Corollary 11.1 The value of a harmonic function at a point is also equal to the
volume average over any ball with centre at that point.
Proof Trivial. ∎

Remark 11.2 It is remarkable that the converse of the mean value theorem also holds.
More specifically, if a continuous function over a given domain satisfies the mean
value formula for every ball contained in this domain, then this function is harmonic
in the domain. For a rigorous proof of this fact, see [1], p. 277.
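The mean value property lends itself to a quick numerical check. The following minimal Python sketch (not part of the original text; the harmonic function u = x³ − 3xy², i.e. the real part of (x + iy)³, and the sample circle are arbitrary choices) compares the value at a centre point with the average over a surrounding circle:

```python
import math

# A harmonic function in the plane: the real part of (x + i*y)**3.
def u(x, y):
    return x**3 - 3 * x * y**2

# Average of u over the circle of radius R centred at (x0, y0),
# computed with the trapezoidal rule on the parametrized boundary.
def circle_average(x0, y0, R, n=1000):
    total = 0.0
    for k in range(n):
        t = 2 * math.pi * k / n
        total += u(x0 + R * math.cos(t), y0 + R * math.sin(t))
    return total / n

# Both numbers agree (up to round-off), as the theorem predicts.
print(u(0.3, 0.4))                    # -0.117
print(circle_average(0.3, 0.4, 0.2))  # approximately -0.117
```

Replacing u by a non-harmonic function such as x² + y² destroys the agreement, in line with the converse statement of Remark 11.2.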

11.7 Green’s Function for the Circle and the Sphere

It is, in fact, not difficult to construct explicit Green’s functions for a domain which
is a circle, a sphere or, in the general case, an n-dimensional ball of given radius R.
The construction is based on a simple geometric property of circles and spheres. We
digress briefly to show this property in the case of a circle (the extension to three
dimensions is rather obvious, by rotational symmetry).
Given a circle of centre P and radius R, as shown in Fig. 11.1, let Q ≠ P be
an arbitrary internal point. Extending the radius through Q, we place on this line
another point S outside the circle, called the reflected image of Q, according to the
proportion

$$\frac{PS}{PQ} = \left( \frac{R}{PQ} \right)^2. \tag{11.42}$$

Fig. 11.1 Green’s function S


argument

Q
R
C
P

It is clear that this point lies outside the circle. Let C be a point on the circumference.
The triangles QPC and CPS are similar, since they share the angle at P and
the ratio of the adjacent sides. Indeed, by Eq. (11.42),

$$\frac{PS}{PC} = \frac{PC}{PQ}. \tag{11.43}$$

It follows that the ratio of the remaining sides must be the same, namely,

$$\frac{CS}{CQ} = \frac{PC}{PQ}. \tag{11.44}$$

The right-hand side of this equation is independent of the particular point C chosen
on the circumference. It depends only on the radius of the circle and the radial distance
to the fixed point Q.
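The constancy of the ratio CS/CQ along the circumference is easy to verify directly. In the brief Python sketch below (the radius and the position of Q are arbitrary test values), the ratio is computed for several boundary points C and always equals PC/PQ = R/PQ:

```python
import math

R = 1.0                 # circle radius, centre P at the origin
q = 0.6                 # distance PQ of the interior point Q
Q = (q, 0.0)
S = (R**2 / q, 0.0)     # reflected image of Q, at distance R**2/PQ from P

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

# The ratio CS/CQ is independent of the boundary point C chosen:
for t in (0.1, 1.0, 2.5, 4.0):
    C = (R * math.cos(t), R * math.sin(t))
    print(dist(C, S) / dist(C, Q))   # always R/q, about 1.6667 here
```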
We use this property to construct Green’s function for the circle (or the sphere) B.
We start by noting that (always keeping Q and, therefore, also S fixed) the fundamen-
tal solution K (X, S), where X denotes an arbitrary point belonging to B, is smooth
in B and its boundary ∂B. This follows from the fact that S is an exterior point.
Moreover, if the point X happens to belong to the boundary, the value of K (X, S) is
given by
$$K(X, S)\big|_{X \in \partial B} = \left( \frac{PQ}{R} \right)^{n-2} K(X, Q)\big|_{X \in \partial B}. \tag{11.45}$$

This result is a direct consequence of Eq. (11.44) and the general formula (11.23) for
n > 2. For n = 2, a similar (logarithmic) formula applies. Therefore, the function
$$G(X, Q) = K(X, Q) - \left( \frac{PQ}{R} \right)^{2-n} K(X, S), \tag{11.46}$$

is Green’s function for the ball.


For the circle (n = 2), the corresponding formula is

$$G(X, Q) = K(X, Q) - K(X, S) - \frac{1}{2\pi} \ln \frac{PQ}{R}. \tag{11.47}$$

Assume that a Dirichlet problem has been given by specifying the value of the
field on the boundary as a function h(Y), Y ∈ ∂B. According to Eq. (11.38), the
solution of this problem is given by

$$u(X) = \int_{\partial B} h(Y)\, \frac{dG(X, Y)}{dn}\, dA_Y. \tag{11.48}$$
11.7 Green’s Function for the Circle and the Sphere 251

We need, therefore, to calculate the normal (i.e., radial) derivative of the Green's
functions just derived at the boundary of the ball. When this is done carefully, the
result is

$$\left. \frac{dG(X, Y)}{dn} \right|_{Y} = H(X, Y) = \frac{1}{4\pi R}\, \frac{R^2 - PX^2}{XY^3}. \tag{11.49}$$
This is the formula for the sphere. For the circle, the denominator has a 2 rather
than a 4, and the distance XY appears squared rather than cubed. Equation (11.48),
with the use of (11.49), is known as Poisson's formula. It solves the general Dirichlet
problem for a ball.
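Poisson's formula can be tested numerically. The sketch below (not from the text; the boundary data h(ψ) = cos 2ψ, the trace of the harmonic function x² − y² on the unit circle, is an arbitrary test case) evaluates the two-dimensional version of the formula by the trapezoidal rule and recovers the interior values of that function:

```python
import math

R = 1.0   # radius of the disk

def h(psi):
    # boundary data: trace of u(x, y) = x**2 - y**2 on the circle of radius R
    return R**2 * math.cos(2 * psi)

def poisson(rho, theta, n=2000):
    # Poisson's formula for the disk, by the trapezoidal rule;
    # the factor 1/(2*pi) cancels against d(psi) = 2*pi/n.
    total = 0.0
    for k in range(n):
        psi = 2 * math.pi * k / n
        kernel = (R**2 - rho**2) / (R**2 + rho**2 - 2 * R * rho * math.cos(theta - psi))
        total += h(psi) * kernel
    return total / n

rho, theta = 0.5, 0.3
print(poisson(rho, theta))            # matches rho**2 * cos(2*theta)
print(rho**2 * math.cos(2 * theta))
```

Evaluating at ρ = 0 also reproduces the mean value theorem: the kernel then reduces to 1, and the integral returns the average of the boundary data.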

Exercises

Exercise 11.1 Show that under an orthogonal change of coordinates, the Laplacian
retains the form given in Eq. (11.5). If you prefer to do so, work in a two-dimensional
setting.
Exercise 11.2 Express the Laplacian in cylindrical coordinates (or polar coordi-
nates, if you prefer to work in two dimensions).
Exercise 11.3 Obtain Eqs. (11.9), (11.10) and (11.11). Show, moreover, that the flux
of the gradient of a harmonic function over the boundary of any bounded domain
vanishes.
Exercise 11.4 A membrane is extended between two horizontal concentric rigid
rings of 500 and 50 mm radii. If the inner ring is displaced vertically upward by an
amount of 100 mm, find the resulting shape of the membrane. Hint: Use Eq. (11.23).
Exercise 11.5 Carry out the calculations leading to (11.27). Make sure to use each of
the assumptions made about the solution. Find the solution for the two-dimensional
case.
Exercise 11.6 Carry out and justify all the steps necessary to obtain Eq. (11.38).
Exercise 11.7 What is the value of G(X, P)? How is this value reconciled with
Eq. (11.46)?
Exercise 11.8 Adopting polar coordinates in the plane, with origin at the centre of
the circle, and denoting the polar coordinates of X by ρ, θ and those of Y (at the
boundary) by R, ψ, show that the solution to Dirichlet’s problem is given by the
expression

$$u(\rho, \theta) = \frac{1}{2\pi} \int_0^{2\pi} h(\psi)\, \frac{R^2 - \rho^2}{R^2 + \rho^2 - 2R\rho\, \cos(\theta - \psi)}\, d\psi, \tag{11.50}$$

where h(ψ) represents the boundary data. Use Eqs. (11.48) and (11.49) (with a 2 in
the denominator).

References

1. Courant R, Hilbert D (1962) Methods of mathematical physics, vol II. Interscience, Wiley,
New York
2. Garabedian PR (1964) Partial differential equations. Wiley, New York
3. John F (1982) Partial differential equations. Springer, Berlin
4. Petrovsky IG (1991) Lectures on partial differential equations. Dover, New York
5. Sobolev SL (1989) Partial differential equations of mathematical physics. Dover, New York
Index

A
Acceleration wave, 123
Acoustical axes, 151
Acoustics, 147
Acoustic tensor, 141
Affine space, 5
Angular momentum, 23
Angular momentum balance, 46
Autonomous system of ODEs, 16

B
Balance laws, 27
Beam, 135
Beam on elastic foundation, 207
Bending moment diagram, 237
Bending waves in beams, 137
Bernoulli–Euler beam, 135
Bi-characteristics, 139
Boundary control, 175
Brownian motion, 211
Burgers equation, 65, 83, 84

C
Canonical form of a first-order system, 134
Cauchy problem, 58, 98, 175
Cauchy stress, 44
Caustic, 148
Cellular automata, 26
Cellular automaton, 212
Characteristic
  initial value problem, 215
Characteristic curves, 56
Characteristic directions, 94
Characteristic manifolds, 138
Characteristic strips, 96
Characteristic vector field, 56
Class C^k, 7
Compact support, 230
Compatibility conditions
  geometric, 123
  iterated, 125
Complete integral, 108
Completeness, 203
Conservation law, 44
Conservation of energy, 20
Conservative system, 19
Conserved quantity, 19
Constitutive law, 34
Continuity equation, 44
Continuous dependence on data, 180, 218, 244
Continuum mechanics, 41
Convolution, 228
Curve, 6

D
D'Alembert solution, 160
Decay-induction equation, 126, 142
Diffusion, 37
  equation, 211, 213
Dirac's delta, 230
Directional derivative, 12
Dirichlet boundary condition, 205
Dirichlet problem, 242
Discontinuous data, 63
Discrete diffusion, 211
Distribution, 230
Divergence, 10
Divergence theorem, 11, 240
Domain of dependence, 162
Drug delivery, 235
© Springer International Publishing AG 2017 253
M. Epstein, Partial Differential Equations, Mathematical Engineering,
DOI 10.1007/978-3-319-55212-5

Duhamel integral, 197
Duhamel's principle, 177, 234
Dust, 87
Dynamical system, 17

E
Eigenvalue, 187
Eigenvector, 187
Eikonal, 138
Einstein, Albert, 211
Elliptic PDE, 120
Energy, 171
Energy balance, 47
Entropy condition, 86
Equilibrium, 13

F
Fick's law, 210
First integral, 18
First law of thermodynamics, 47, 212
Fishing rod control, 177
Flux, 28
Flux of a vector field, 10
Flux vector, 32
Focusing, 148
Fourier integral, 224, 227
Fourier's law, 213
Fourier transform, 227
Free vibrations, 158
Frobenius' theorem, 22
Fundamental solution, 246

G
General integral, 109
Generalized function, 230
Geometric compatibility conditions, 123
Glissando, 174
Gradient, 11
Green's function, 233, 246
Green's identities, 241
Group property, 17
Growth, 174

H
Hadamard's lemma, 121
Heat
  capacity, 213
  equation, 211, 213
  specific, 213
Heat equation, 38
Hooke's law, 151
Hyperbolic PDE, 120

I
Influence function, 233, 246
Inhomogeneous wave equation, 177
Initial data, 58
Integral surface, 53
Interior coordinates, 143
Internal energy, 46, 212
Intra-ocular pressure, 206
Involutivity, 22
Irreversibility, 236
Irreversible process, 214
Isotropic material, 151
Iterated compatibility, 125

K
Kepler's second law, 23
Kinetic energy, 19, 46, 172
Kronecker symbol, 149, 151

L
Lamé constants, 151
Laplace equation, 40, 203
Laplacian operator, 12
Linear momentum balance, 45
Linear PDE, 27
Lipschitz continuous, 21
Longitudinal wave, 152

M
Manifold, 8
Mass conservation, 44
Material derivative, 43
Maximum-minimum theorem, 216, 243
Mean value theorem, 248
Modal matrix, 134
Monge cone, 92

N
Natural base vectors, 9
Natural frequency, 186
Neumann boundary condition, 205
Neumann problem, 242
Normal forms, 129
Normal mode, 186

O
Order of a PDE, 27
Orientable surface, 10
Orthogonality, 191, 200, 203
Orthogonality condition, 187
Orthonormality, 187

P
Parabolic PDE, 120
Parametrized curve, 6
Pencil of planes, 90
Phase space, 18
Poisson equation, 40, 203, 239
Poisson's formula, 251
Positive definite, 184
Position vector, 7
Potential energy, 19, 172
Propagation condition, 151

Q
Quasi-linear PDE, 27

R
Range of influence, 162
Rankine-Hugoniot condition, 78
Rarefaction wave, 86
Rays, 139
Reduced form, 106
Riemann initial conditions, 83
Robin boundary condition, 207

S
Scalar field, 11
Separatrix, 23
Shear waves in beams, 137
Slinky, 175
Smooth, 7
Solitons, 39
Spectral coefficients, 216
Spectrum, 190
Spherical symmetry, 244
Stability, 171
Standing wave, 189
Statistical Mechanics, 211
Steady state, 216
Stress tensor, 44
Strip, 96
Strip condition, 96, 98
Strong singularity, 134
Sturm-Liouville problem, 193
Summation convention, 45
Superposition, 188
Surface, 8
Synchronicity, 185
Systems of PDEs, 131

T
Tangent plane, 9
Test function, 230
Thermal conductivity, 213
Time reversal, 214
Timoshenko beam, 137, 144
  boundary control, 177
Tonometer, 206
Totally hyperbolic system, 133
Traffic flow, 36
Transport equation, 126, 142
Transversal wave, 152
Tuning fork, 202

U
Uniqueness, 171, 217

V
Variation of constants, 177
Vector field, 9, 12
Vibrating string, 157

W
Wave amplitude vector, 140
Wave breaking, 65
Wave equation, 39
Wave front, 123
Weak singularity, 123
