Computational Methods
Contact: [email protected]
Department of Materials Science and Engineering
Indian Institute of Technology, Kanpur
Disclaimer: The document (under preparation) is not a mathematics textbook.
The main focus is on enhancing your learning experience by using computer
programs. The document starts with a brief introduction to Python, which will
be used for writing small programs and plotting in the rest of the chapters. You
are advised to read standard mathematics textbooks to understand the topics
and use this document only to improve your knowledge by learning to use simple
computer programs.
Contents
2 Numerical Methods 10
2.1 Solution of non-linear equations . . . . . . . . . . . . . . . . . . . . . . 10
2.1.1 Relaxation method . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.2 Bisection method . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.3 Newton-Raphson method . . . . . . . . . . . . . . . . . . . . . . 13
2.1.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2 Solution of linear equations . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.1 Gaussian elimination and back-substitution . . . . . . . . . . . 14
2.2.2 Jacobi Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.3 Gauss-Seidel method . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.4 Systematic formulation of iterative solution of Ax=b . . . . . . 19
2.2.5 Convergence criteria for iterative methods . . . . . . . . . . . . 22
2.2.6 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3 Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3 Partial Differentiation 29
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.1.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 Total differential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.3 Some useful formulae . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.1 Chain rule 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.2 Chain rule 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.3 Chain rule 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3.4 Reciprocal relation . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3.5 Cyclic relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3.6 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.4 How to find the maximum and minimum . . . . . . . . . . . . . . . . 38
3.4.1 Second derivative test . . . . . . . . . . . . . . . . . . . . . . . . 38
3.4.2 Method of Lagrange multipliers . . . . . . . . . . . . . . . . . . 40
3.4.3 Geometrical interpretation . . . . . . . . . . . . . . . . . . . . . 42
3.4.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.5 Change of variables: Legendre transform . . . . . . . . . . . . . . . . . 43
3.5.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.6 Differentiation of integrals . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.6.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4 Multiple integrals 48
4.1 Double and triple integrals . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.1.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2 Change of variables: Jacobians . . . . . . . . . . . . . . . . . . . . . . 51
4.2.1 Cartesian coordinate system . . . . . . . . . . . . . . . . . . . . 51
4.2.2 Polar coordinate system . . . . . . . . . . . . . . . . . . . . . . . 51
4.2.3 Cylindrical coordinate system . . . . . . . . . . . . . . . . . . . 53
4.2.4 Spherical coordinate system . . . . . . . . . . . . . . . . . . . . 54
4.2.5 Jacobians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2.6 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.3 Octave files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5 Vector analysis 58
5.1 Triple products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.1.1 Scalar triple product . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.1.2 Vector triple product . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.1.3 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.2 First derivative of scalar and vector fields . . . . . . . . . . . . . . . . 59
5.2.1 Gradient and directional derivative . . . . . . . . . . . . . . . . 59
5.2.2 Divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.2.3 Curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.2.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.3 Second derivative of scalar and vector fields . . . . . . . . . . . . . . . 62
5.3.1 Divergence of gradient or Laplacian . . . . . . . . . . . . . . . . 62
5.3.2 Laplacian of a vector field . . . . . . . . . . . . . . . . . . . . . . 62
5.3.3 Curl of gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.3.4 Divergence of curl . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.3.5 Curl of curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.3.6 Gradient of divergence . . . . . . . . . . . . . . . . . . . . . . . . 62
5.3.7 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.4 Line integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.4.1 Exact Differential . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.4.2 Scalar potential for a conservative force field . . . . . . . . . . . 65
5.4.3 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.5 Green’s theorem in plane . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.5.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.6 Divergence and divergence theorem . . . . . . . . . . . . . . . . . . . . 66
5.6.1 Physical significance of divergence . . . . . . . . . . . . . . . . . 66
5.6.2 Equation of continuity . . . . . . . . . . . . . . . . . . . . . . . . 66
5.6.3 Divergence theorem: volume and surface integral . . . . . . . . 67
5.6.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.7 Curl and Stokes’ theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.7.1 Physical significance of curl . . . . . . . . . . . . . . . . . . . . 68
5.7.2 Stokes’ theorem: surface and line integral . . . . . . . . . . . . 68
5.7.3 Vector potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.7.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.1 A function of two variables z(x, y). Along the red curve, y is constant
and along the blue curve, x is constant. Note that, along the red
curve, we can equivalently define x(y, z) with y constant. Similarly,
along the blue curve, we can equivalently define y(x, z) with x con-
stant. Along the dotted curve, we can define either x(y, z) or y(x, z)
with z constant. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.2 Tangent or linear approximation of differential $\Delta y \approx dy = y'\,dx$. . . . . . 32
3.3 Curve with a (a) maximum point, (b) minimum point and (c) inflec-
tion point. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.4 Surface with a (a) maximum point, (b) minimum point and (c) saddle
point. Along the blue (red) curve, x (y) remains constant. The two
curves cross each other at the origin. . . . . . . . . . . . . . . . . . 39
3.5 Finding minimum distance $d = \sqrt{x^2 + y^2}$ from the origin, with the
constraint that the point lies on the curve. . . . . . . . . . . . . . . . . 40
3.6 Tangent drawn at (a) minimum d and (b) any other d. See Fig 3.5 for
the definition of d. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.1 Various different areas over which a double integral needs to be cal-
culated. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.2 Polar coordinate system. . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3 Cylindrical coordinate system. . . . . . . . . . . . . . . . . . . . . . . . 53
4.4 Spherical coordinate system. . . . . . . . . . . . . . . . . . . . . . . . . 54
10.1 Boundary conditions in one (left) and two (right) dimensions. Within
the domain, the unknown function f is determined by the differential
equation (Laplace’s equation in this case, $\Delta = \frac{d^2}{dx^2}$ for 1D and
$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ for 2D, respectively). Along the boundaries, values of f are
given by the boundary conditions. Obviously, the function f must
vary smoothly as we move from the interior to the boundary of a
domain. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
10.2 Harmonic functions obey mean-value property – average value of the
function at the boundary is equal to its value at the center. Examples
shown for (a) one-dimensional domain, (b) and (c) two-dimensional
domain. In case of (b), f = 10 and f = 0 along the top and bottom
part of the perimeter, respectively, such that f = 5 along the line
lying in the middle. What would be the value of the function at the
center in case of (c), where the function is zero everywhere, except at
a single point at the boundary? . . . . . . . . . . . . . . . . . . . . . . 124
10.3 Plot of two harmonic functions (a) $x^2 - y^2$ and (b) $2xy$. They do not
have any maximum or minimum, but only a saddle point at (0, 0).
The third one (c) $x^2 + y^2$ is not a harmonic function and it has a
minimum at (0, 0). We can easily verify that $x^2 + y^2$ does not satisfy
Laplace’s equation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
10.4 A bar, having finite width in the y-direction and semi-infinite in the
x-direction. One side (along the y-axis) is held at 200° and the long
sides (along the x-axis) are held at 0°. The far end is also held at 0°.
What would be the temperature distribution within the bar? . . . . . 127
10.5 Eq. 10.21 plotted for n ranging from (a) 1 − 3, (b) 1 − 29, and (c) 1 − 299. 129
10.6 Temperature distribution in a circular plate, with boundary condi-
tions shown in Figure 10.2(b). I have plotted Eq. 10.37 by including
99 terms in the sum. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
10.7 Finite difference method: the domain is divided into a square or rectangular
grid. Note that grid points (red) are also placed at the boundary.
However, values of f (x, y) remain fixed (boundary conditions) at
these grid points. On the other hand, f (x, y) changes with each iter-
ation inside the domain (black points), until convergence is achieved. 133
10.8 (a) Boundary conditions in a square plate, (b) analytically calculated
temperature distribution (Eq. 10.25) and (c) numerically calculated
temperature distribution. . . . . . . . . . . . . . . . . . . . . . . . . . . 135
10.9 (a) A bar is uniformly heated to 50° initially. Then, two of its faces
(red) are brought in contact with thermal reservoirs at 0° and the rest
of the faces (white) are insulated, such that heat flow is essentially
one-dimensional. (b) The temperature profile as a function of time,
as the bar cools down. . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
10.10 Let us imagine that a sound is created at the middle of a long tunnel.
At t = 0, the pulse has the form $e^{-x^2}$. Half of it travels to the right
and the rest travels to the left. I assume v = 1. . . . . . . . . . . . . . 138
10.11 Wave propagation for initial position $G(x) = e^{-x^2}$ and initial velocity
H(x) = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
10.12 Wave propagation for initial position G(x) = 0 and initial velocity
$H(x) = 2xe^{-x^2}$. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
A.1 Graph of f (x, y): two possible solutions of Eq. A.15, plotted assum-
ing a = b = 1. Contour lines or level curves are shown at the base of
the plots. Along the contour lines, f (x, y) is constant. Contour lines
are parallel to y = x − constant. . . . . . . . . . . . . . . . . . . . . . . . 157
A.2 Solution of Eq. A.15 (a = b = 1): f (x, y) = u(x − y) = constant and
initial condition is given along the x−axis as f (x, 0) = g(x). Values on
the x−axis will be “carried” or “transported” along the straight lines,
because f (x, y) is constant along the lines. Thus, the solution can be
written as, f (x, y) = g(x − y). . . . . . . . . . . . . . . . . . . . . . . . . 159
A.3 The data curve (solid line) is given along the x−axis: f (x, 0) = g(x).
The characteristic curves (dashed lines) originate from the data curve.161
A.4 (Left) Color map with contour lines for $f(x, y) = \frac{x}{y+1} = r$. (Right)
Contour lines are along x = r(y + 1) and f (x, y) = constant = r along
these lines. Clearly, there exists a singularity at the point (0, −1). . . . 162
List of Tables
2.1 Values of x obtained at different steps while solving $x = e^{-x^2}$ by the
relaxation method. We start with an initial guess of 1. . . . . . . . . . . 10
2.2 Values of x0 , x1 and x2 after successive iterations using the Jacobi
method, starting with the initial values of x0 = −2, x1 = 2 and x2 = 3. 18
2.3 Different iterative schemes for solving a set of linear equations Ax = b. 20
11.1 List of all possible outcomes (known as the sample space) if we throw
two dice simultaneously. Each outcome is termed as a sample point
and there are 36 sample points in this case. . . . . . . . . . . . . . . . 144
11.2 A sample space (sum of two dice) and probability derived from Ta-
ble 11.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Chapter 1
If you have ever done any introductory coding course, you must have started with
the “Hello world” code. First, create a file named “hello.py”, type the following and
save it.
print('Hello world')
In a Linux terminal, type the following and press enter, and you are on your way
to tame the python.
python hello.py
1.1.1 Exercise
1. Write a code to subtract y from x (Ans: x − y).
2. Write a code to multiply x with y (Ans: x * y).
3. Write a code to divide x by y (Ans: x/y).
4. Write a code for raising x to the power of y (Ans: x ** y).
5. Take two integers and find out what you get by doing x//y and x%y. The
second one is known as modulo.
6. Write a code to get the distance traveled $s = ut + \frac{1}{2}at^2$, where one has to input
the values of u, a, t to calculate the value of s.
1.2. MATHEMATICAL FUNCTIONS
The numpy package contains several functions, like log (both natural and base
10), exponential, trigonometric, hyperbolic, positive square root, and constants
like e and pi. One can import more than one function at a time.
The following command will import all the functions available in the math pack-
age, although it may not be a good practice.
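As a minimal sketch of the import styles mentioned above, assuming Python's standard math module (the variable names are only illustrative), one can import a single function or several functions at a time:

```python
# Import one function, or several functions and constants at once.
from math import sqrt
from math import log, log10, cos, pi, e
# "from math import *" would import every public name of the module at once;
# it is left as a comment here because it clutters the namespace.

root = sqrt(16.0)        # positive square root
natural = log(e)         # natural logarithm
decimal = log10(1000.0)  # base-10 logarithm
print(root, natural, decimal, cos(pi))
```

Importing only the names that are needed keeps it clear where each function comes from.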
1.2.1 Exercise
1. Write a code, that will take r and θ value (polar coordinate) as input and
convert them to Cartesian coordinate.
1.3.1 IF statement
Let us see an example where we want to check whether an integer is divisible by
13 or not.
m = int(input("Type any integer m: "))
if m%13==0 :
    print("m is divisible by 13")
else :
    print("m is not divisible by 13")
1 We have already used some built-in functions like print, input, float, which are not part of any
package.
1.3. CONTROL STATEMENTS
Note the indentation (leading spaces) at the beginning of the lines that follow the if and else
statements. Let us show another example where we need to check whether a
person is eligible to get a vaccine or not; the eligibility criterion being that the age of the
person should lie between 18 and 45.
m = int(input("Type the age in integer: "))
if m<=17 :
    print("not eligible, age should be at least 18")
elif m>=46:
    print("not eligible, age should be at most 45")
else:
    print("eligible")
Here elif stands for else if and we can have as many of them as we wish, each
one checking for a new condition. We could have combined the selection criteria
in a single line,
m = int(input("Type the age in integer: "))
if m<=17 or m>=46:
    print("not eligible, age should be at least 18 and at most 45")
else:
    print("eligible")
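The same eligibility test can also be written with a chained comparison. The following is only a sketch; the function name is_eligible is hypothetical and is not used elsewhere in this document.

```python
def is_eligible(age):
    """Return True when the age lies between 18 and 45 (both included)."""
    return 18 <= age <= 45   # chained comparison: 18 <= age and age <= 45


# Check the ages on both sides of each boundary.
for age in (17, 18, 45, 46):
    if is_eligible(age):
        print(age, "eligible")
    else:
        print(age, "not eligible")
```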
1.3.3 Exercise
1. Write a code to find out whether an integer is even or odd.
1.4. FOR LOOP
Note that, by default, the loop starts from zero. However, we can start from any
number, say from 90 up to 99 (the upper limit, 100, is not included).
for n in range(90,100):
    print(n)
Also, note that, by default, the interval between two numbers is one, which we
can change according to need, for example, to 5.
for n in range(10,100,5):
    print(n)
We can even move backward by taking some negative interval, for example, 100
to 15.
for n in range(100,10,-5):
    print(n)
Let us do some meaningful exercise, rather than just printing some numbers; for
example, calculate the sum of all integers from 1 to 100, i.e., $\sum_{n=1}^{100} n$.
s=0
for n in range(101):
    s += n
print(s)
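As a quick cross-check of the loop, the same sum can be obtained from the built-in function sum and from the closed form n(n+1)/2. This is a sketch; the variable names are only illustrative.

```python
n_max = 100

# Explicit loop, starting from 1 this time.
s_loop = 0
for n in range(1, n_max + 1):
    s_loop += n

s_builtin = sum(range(1, n_max + 1))   # built-in sum over the same range
s_formula = n_max * (n_max + 1) // 2   # closed form n(n+1)/2

print(s_loop, s_builtin, s_formula)
```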
1.4.1 Exercise
1. Calculate the sum $\sum_{n=1}^{m} \frac{1}{n}$ (n is an integer), till the result converges to the third
decimal place.
2. Calculate the sum $\sum_{n=1}^{m} \frac{1}{n^2}$ (n is an integer), till the result converges to the
third decimal place.
3. Calculate the sum $\sum_{n=1}^{m} \frac{1}{n^4}$ (n is an integer), till the result converges to the
third decimal place. Compare the values of m for various powers of n and
confirm that the sum converges faster for higher powers of n.
1.5. LISTS AND ARRAYS
Note that we have converted the values to another list by using another built-in
function list. We can add an element to the existing list of elements.
v = [1.0, 2.0, 3.0]
print(v)
v.append(4.0)
print(v)
We can also create an empty list by v = [ ] and keep adding elements by using
v.append(1.0) etc. Similarly, we can eliminate an element from a list by using
v.pop().
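A short sketch of building a list from scratch and removing elements again (the values are arbitrary):

```python
v = []                       # start from an empty list
for value in (1.0, 2.0, 3.0, 4.0):
    v.append(value)          # append adds the element at the end

last = v.pop()               # removes and returns the last element
first = v.pop(0)             # an index may be given; this removes the first element
print(v, last, first, len(v))
```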
Lists are one-dimensional, and thus, they have limited scope. Therefore, you
have to use arrays, which can handle vectors, as well as matrices. We will use
some built-in functions from the NumPy package. For example, let us create a
vector containing three integer elements:
import numpy as np
v = np.zeros(3,int)
print(v)
v[0] = 1
v[1] = 2
v[2] = 3
print(v)
Similarly, we can create a 2 × 2 matrix by doing,
import numpy as np
v = np.zeros([2,2],int)
print(v)
v[0,0] = 1
v[0,1] = 2
v[1,0] = 3
v[1,1] = 4
print(v)
Note how we are referring to the individual elements of an array, like v[0], v[1] or
v[0, 1], v[1, 0] etc. Instead of all zeros, we can also create arrays with all ones, by
calling the function ones from the NumPy package, like v = np.ones(3, int). Similarly,
we can create an uninitialized array (its entries hold arbitrary values until we assign
them) using the function empty from the NumPy package, like v = np.empty(3, int).
We can take a list and convert it to an array by using the
function array from the NumPy package. Check various ways you can create
arrays from lists.
import numpy as np
r = [1.0, 2.0, 3.0]
v = np.array(r, float)
print(v)
v = np.array([4.0, 5.0, 6.0], float)
print(v)
v = np.array([[1, 2],[3,4]], int)
print(v)
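A brief sketch of the other array constructors mentioned above, using only standard NumPy calls; the variable names are illustrative.

```python
import numpy as np

a = np.ones(3, int)          # array filled with ones
b = np.empty(3, float)       # uninitialized: contents are arbitrary until assigned
b[:] = [0.5, 1.5, 2.5]       # assign all the entries at once
m = np.array([[1, 2], [3, 4]], int)

print(a, b)
print(m.shape, m.dtype, m[1, 0])   # shape, element type, and one element
```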
1.5.1 Exercise
1. Define a list having five elements and find their mean value.
1.6. USER-DEFINED FUNCTION
1.6.1 Exercise
1. Write a code to find whether $f(x) = \frac{\sin x}{x}$ has any root between x = 2 and x = 4.
1.7.1 2D plots
We can create ordinary graphs in Python by using the function plot from the
matplotlib.pyplot package.
import matplotlib.pyplot as plt
x = [-5.0, -4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [25.0, 16.0, 9.0, 4.0, 1.0, 0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
plt.plot(x,y)
plt.show( )
Note that, values of the dependent and independent variable are inserted via two
lists. Instead of data input directly in the program file (say you have 10000 data
points, which makes it nearly impossible to do it this way), it is better to read the
data from some file.
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt("data1.dat", float)
x = data[:,0]
y = data[:,1]
plt.xlabel("x-axis")
plt.ylabel("y-axis")
plt.xlim(-5,5)
plt.ylim(0,25)
plt.plot(x,y)
plt.show( )
-5.0 25.0
-4.0 16.0
-3.0 9.0
-2.0 4.0
-1.0 1.0
0.0 0.0
1.0 1.0
2.0 4.0
3.0 9.0
4.0 16.0
5.0 25.0
We can also plot functions, for example a trigonometric function like cos x. In this
case, we have to create an array of x-values by using the linspace function from the
numpy package.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-6.28,6.28,100)
y = np.cos(x)
plt.plot(x,y)
plt.show( )
1.7.2 3D plots
The following code illustrates how to do 3D plotting (for example, a function
$z(x, y) = x^2 + y^2$) in Python.
from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt
import numpy as np
def f(x,y):
    value=x**2 + y**2
    return value
x = np.linspace(-2,2,100)
y = np.linspace(-2,2,100)
x, y = np.meshgrid(x,y)
z = f(x,y)
fig = plt.figure( )
ax = plt.axes(projection="3d")
ax.plot_surface(x,y,z)
plt.show( )
1.7.3 Exercise
1. Plot $y(x) = \frac{\sin x}{x}$.
2. Plot $z(x, y) = x^2 - y^2$.
1.8 3D graphics
Python also provides us tools for generating 3D graphics. We have to import
vpython package for this purpose. The following code shows how to generate a
1.8.1 Exercise
1. Write a code to generate a square lattice.
Chapter 2
Numerical Methods
While a problem’s analytical solution is the most accurate one, we often need to
find a numerical answer. In this chapter, I am going to describe the algorithms
of several numerical methods, as well as some example Python codes. Read the
Appendix section for a brief introduction to Python programming.
Iteration x
0 1
1 0.36787944
2 0.87342302
3 0.46632719
10 0.7086265
20 0.66416843
30 0.65520278
40 0.65252326
Table 2.1: Values of x obtained at different steps while solving $x = e^{-x^2}$ by the
relaxation method. We start with an initial guess of 1.
2.1. SOLUTION OF NON-LINEAR EQUATIONS
Figure 2.1: Bisection method: given that f(x) is continuous between $x_1$ and $x_2$,
if $f(x_1)$ and $f(x_2)$ have opposite signs, then there must be at least one root between $x_1$
and $x_2$. The interval is bisected ($x_m$ being the midpoint) to further narrow down
the search, and the sign of $f(x_m)$ tells whether the root is located in $[x_1, x_m]$ or
$[x_m, x_2]$.
import math as ma
x = float(input("Enter initial guess:"))
for n in range(50):
    x1 = x
    x = ma.exp(-x*x)
    err = abs(x1 -x)
    if err < 0.001:
        break
print("The solution is:")
print(x)
Starting with an initial value of $x_0 = 1$, we get $x_1 = e^{-x_0^2} = e^{-1} = 0.36787944$ in the
next step. We repeat the process and get $x_2 = e^{-x_1^2} = e^{-0.13533528} = 0.87342302$. We
have to keep doing this (see Table 2.1) till the values converge adequately; say, until the
difference between two successive x values becomes less than 0.001.
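The first rows of Table 2.1 can be reproduced with a few lines. This is a sketch that stores every iterate in a list; the helper name relax is hypothetical.

```python
import math


def relax(x0, steps):
    """Return [x0, x1, ..., x_steps] for the iteration x -> exp(-x**2)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(math.exp(-xs[-1] ** 2))
    return xs


xs = relax(1.0, 40)
# Print the same iteration numbers that are listed in Table 2.1.
for k in (0, 1, 2, 3, 10, 20, 30, 40):
    print(k, round(xs[k], 8))
```

Successive values alternate around the solution, lying above and below it in turn, which is why so many steps are needed.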
find it precisely? First, we bisect the interval $[x_1, x_2]$ into two halves $[x_1, x_m]$ and
$[x_m, x_2]$ (this being the origin of the name: bisection method). Now, the sign of
$f(x_m)$ is going to determine whether the root is in the interval $[x_1, x_m]$ or $[x_m, x_2]$.
We have to keep bisecting the interval in this manner until it becomes so small
that we can confidently say that the root lies, within the chosen tolerance, at the midpoint. Let us discuss
the algorithm and write a Python code based on that.
• Step 1: Select the interval and check whether $f(x_1) f(x_2) > 0$. If true, change
the initial guess of the interval, as a root is not guaranteed to exist within this interval.
• Step 2: Halve the interval at $x_m = \frac{x_1 + x_2}{2}$.
• Step 3: If $f(x_1) f(x_m) < 0$, then the root lies between $x_1$ and $x_m$. In this case,
keep the lower boundary at $x_1$, but reset the upper boundary to $x_2 = x_m$.
Otherwise, if $f(x_1) f(x_m) > 0$, then the root lies between $x_m$ and $x_2$. In this
case, reset the lower boundary to $x_1 = x_m$, but keep the upper boundary at
$x_2$.
• Step 4: Check whether the distance between the lower and upper boundaries is sufficiently
small (say less than 0.001). In that case, take the midpoint of the upper and
lower boundary to be the final answer. Otherwise, go back to step 2.
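The four steps above can be sketched as follows. The function f(x) = x³ − 15x − 4 is an assumption chosen only for illustration (its roots are −2 − √3, −2 + √3 and 4), and the name bisect and the default tolerance are likewise hypothetical.

```python
def f(x):
    # Illustrative function (an assumption, not necessarily the one used in the text).
    return x**3 - 15.0 * x - 4.0


def bisect(func, x1, x2, tol=0.001):
    """Bisection following Steps 1-4; returns an approximate root."""
    f1, f2 = func(x1), func(x2)
    if f1 == 0.0:                    # an end point is already a root
        return x1
    if f2 == 0.0:
        return x2
    if f1 * f2 > 0:                  # Step 1: no sign change in the interval
        raise ValueError("choose another interval: no sign change")
    while abs(x2 - x1) > tol:        # Step 4: stop when the interval is small
        xm = 0.5 * (x1 + x2)         # Step 2: midpoint
        fm = func(xm)
        if fm == 0.0:                # landed exactly on a root
            return xm
        if f1 * fm < 0:              # Step 3: root lies in [x1, xm]
            x2 = xm
        else:                        # otherwise the root lies in [xm, x2]
            x1, f1 = xm, fm
    return 0.5 * (x1 + x2)


root = bisect(f, 3.0, 4.5)
print(root)
```

The extra checks for an exact zero cover the case f(x1) f(xm) = 0, which Step 3 does not mention; without them the bracket could move past a root that happens to sit exactly at a midpoint.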
Figure 2.2: Newton-Raphson method: take some initial guess $x_n$ and draw a
tangent at that point. Find the location where the tangent crosses the x-axis
($x_{n+1}$) and draw a tangent there. Every time we do this, we move closer to the
root.
$$x_{n+1} = x_n - \Delta x = x_n - \frac{f(x_n)}{f'(x_n)}. \tag{2.1}$$
Next, we draw a tangent at xn+1 and get to the next point xn+2 and so on. Thus,
after each iteration, we get closer to the root and finally converge when two suc-
cessive points xn+1 and xn are very close to each other (less than some tolerance
limit). Finally, let us write our Python code for the Newton-Raphson method.
2.2. SOLUTION OF LINEAR EQUATIONS
            break
        x1 = x2
    print("Root at:",x2)
    print("Value of the function:",f(x2))
# Main code
x1=float(input("Enter the value of x1:"))
newton(x1)
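A self-contained sketch of the iteration in Eq. 2.1 is given below. The function f(x) = x³ − 15x − 4 and its derivative are assumptions chosen for illustration, and the name newton_sketch, the tolerance and the iteration limit are hypothetical.

```python
def f(x):
    return x**3 - 15.0 * x - 4.0     # illustrative function (an assumption)


def df(x):
    return 3.0 * x**2 - 15.0         # its derivative


def newton_sketch(x, tol=1e-6, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until two successive points agree."""
    for _ in range(max_iter):
        slope = df(x)
        if slope == 0.0:
            raise ZeroDivisionError("zero derivative; choose another initial guess")
        x_new = x - f(x) / slope
        if abs(x_new - x) < tol:     # successive points are close enough
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")


print(newton_sketch(5.0))
print(newton_sketch(-5.0))
```

Different initial guesses lead to different roots, so the guess has to be chosen near the root of interest.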
2.1.4 Exercise
1. Write Python code to solve $x = 2 - e^{-x^2}$ using the relaxation method.
Answer: 1.98015456
2. Test the given bisection method program with different initial guesses to find
all the roots.
Answer: $-2 - \sqrt{3}$, $-2 + \sqrt{3}$, 4.
We can quickly solve by back-substituting, from third to the second and finally,
to the first equation. Thus we should aim to convert any given set of three linear
equations in the form of Eq. 2.5. Equivalently, we need to convert the coefficient
matrix A to an upper triangular matrix. We will use the Gaussian elimination
method, which is based on the following principles.
• Multiplication of any row of the coefficient matrix A and the corresponding
row of vector b by any non-zero constant does not change the solution.
• Adding or subtracting any multiple of a row of the coefficient matrix A to or from
any other row, and doing the same to the vector b, does not change the solution.
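First, let us write the A matrix and b vector in a compact form (known as the
augmented matrix),
$$\left(\begin{array}{ccc|c} 3 & 3 & 1 & 12 \\ 2 & 1 & 2 & 10 \\ 1 & 2 & 3 & 14 \end{array}\right). \tag{2.6}$$
Keep in mind that our aim is to make the coefficient matrix A an upper triangular
matrix. First, we do: (second row) − 2/3 × (first row),
$$\left(\begin{array}{ccc|c} 3 & 3 & 1 & 12 \\ 0 & -1 & 4/3 & 2 \\ 1 & 2 & 3 & 14 \end{array}\right). \tag{2.7}$$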
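The row operations can also be carried out numerically. The following is a minimal sketch, assuming NumPy and assuming that no zero appears on the diagonal during elimination; it starts from the augmented matrix of Eq. 2.6.

```python
import numpy as np

# Augmented matrix of Eq. 2.6: three coefficient columns and the b vector.
aug = np.array([[3.0, 3.0, 1.0, 12.0],
                [2.0, 1.0, 2.0, 10.0],
                [1.0, 2.0, 3.0, 14.0]])
n = aug.shape[0]

# Forward elimination: subtract multiples of row i from the rows below it.
for i in range(n):
    for j in range(i + 1, n):
        r = aug[j, i] / aug[i, i]
        aug[j, :] -= r * aug[i, :]
print(aug)   # the coefficient part is now upper triangular

# Back-substitution, from the last equation to the first.
x = np.zeros(n)
for i in range(n - 1, -1, -1):
    x[i] = (aug[i, n] - aug[i, i + 1:n] @ x[i + 1:n]) / aug[i, i]
print(x)
```

The first pass of the inner loop performs exactly the operation (second row) − 2/3 × (first row) shown in Eq. 2.7.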
import numpy as np
import sys
# system size
n = int(input('Number of unknowns? '))
# initializing matrix and vector
a = np.zeros([n,n],float)
b = np.zeros(n,float)
d = np.zeros([n,n+1],float)
print('Enter the coefficient matrix row by row')
for i in range(n):
    for j in range(n):
        print('Enter row', i, 'column', j)
        a[i][j] = float(input())
print('Enter the b vector')
for i in range(n):
    b[i] = float(input())
# Constructing the augmented matrix
for i in range(n):
    for j in range(n+1):
        if j <= (n-1):
            d[i,j] = a[i,j]
        else:
            d[i,j] = b[i]
print("The augmented matrix is")
print(d)
# Forward elimination, checking for zeros at the diagonals
for i in range(n):
    if d[i][i] == 0.0:
        sys.exit('Error: division by zero!!')
    for j in range(i+1,n):
        r = d[j][i] / d[i][i]
        for k in range(n+1):
            d[j][k] = d[j][k] - r * d[i][k]
# back-substitution (the solution overwrites the b vector)
b[n-1] = d[n-1][n]/d[n-1][n-1]
for i in range(n-2,-1,-1):
    b[i] = d[i][n]
    for j in range(i+1,n):
        b[i] = b[i] - d[i][j] * b[j]
    b[i] = b[i]/d[i][i]
print("Solution is:")
print(b)
We use the first equation to solve x0 , the second equation to solve x1 , and so on.
$$x_0 = \frac{1}{a_{00}}\left[b_0 - a_{01}x_1 - a_{02}x_2 - \cdots - a_{0n}x_n\right], \tag{2.11}$$
$$x_1 = \frac{1}{a_{11}}\left[b_1 - a_{10}x_0 - a_{12}x_2 - \cdots - a_{1n}x_n\right],$$
$$\cdots$$
$$x_n = \frac{1}{a_{nn}}\left[b_n - a_{n0}x_0 - a_{n1}x_1 - \cdots - a_{n,n-1}x_{n-1}\right].$$
Note that, for the method to work, each of the diagonal entries of the coefficient
matrix must be non-zero. Otherwise, we must interchange rows or columns to
avoid any zero along the diagonal. For example, let us solve the following system
of linear equations,
$$4x_0 + 2x_1 + 3x_2 = 8, \tag{2.12}$$
$$3x_0 - 5x_1 + 2x_2 = -14,$$
$$-2x_0 + 3x_1 + 8x_2 = 27.$$
Since there is no zero along the diagonal of the coefficient matrix, it is
straightforward to write the solution as
$$x_0^{m+1} = \frac{1}{4}\left[8 - 2x_1^m - 3x_2^m\right], \tag{2.13}$$
$$x_1^{m+1} = \frac{1}{-5}\left[-14 - 3x_0^m - 2x_2^m\right],$$
$$x_2^{m+1} = \frac{1}{8}\left[27 + 2x_0^m - 3x_1^m\right].$$
The superscripts denote successive iterations. We start with an initial guess of
$x_0 = -2$, $x_1 = 2$ and $x_2 = 3$. After the first iteration, we get
$$x_0 = \frac{1}{4}\left[8 - 2(2) - 3(3)\right] = -1.25, \tag{2.14}$$
$$x_1 = \frac{1}{-5}\left[-14 - 3(-2) - 2(3)\right] = 2.8,$$
$$x_2 = \frac{1}{8}\left[27 + 2(-2) - 3(2)\right] = 2.125.$$
Iteration x0 x1 x2
1 -1.250 2.800 2.125
2 -0.994 2.900 2.013
3 -0.959 3.009 2.039
10 -0.986 2.993 2.004
11 -0.999 3.010 2.006
12 -1.009 3.003 1.996
Table 2.2: Values of x0 , x1 and x2 after successive iterations using the Jacobi
method, starting with the initial values of x0 = −2, x1 = 2 and x2 = 3.
N = 50
x0 = -2
x1 = 2
x2 = 3
for n in range(1,N):
    # keep the values of the previous iteration
    y0 = x0
    y1 = x1
    y2 = x2
    # Jacobi: only old values (y) appear on the right-hand side
    x0 = (8.0 - 2.0 * y1 - 3.0 * y2) / 4.0
    x1 = (-14.0 - 3.0 * y0 - 2.0 * y2) / (-5)
    x2 = (27.0 + 2.0 * y0 - 3.0 * y1) / 8.0
    er = (abs(x0 - y0) + abs(x1 - y1) + abs(x2 - y2))/3.0
    if er < 0.001:
        break
print("Convergence achieved in",n,"steps")
print(x0,x1,x2)
N = 50
x0 = -2
x1 = 2
x2 = 3
for n in range(1,N):
    # keep the values of the previous iteration (used only for the error)
    y0 = x0
    y1 = x1
    y2 = x2
    # Gauss-Seidel: updated values are used as soon as they are available
    x0 = (8.0 - 2.0 * x1 - 3.0 * x2) / 4.0
    x1 = (-14.0 - 3.0 * x0 - 2.0 * x2) / (-5)
    x2 = (27.0 + 2.0 * x0 - 3.0 * x1) / 8.0
    er = (abs(x0 - y0) + abs(x1 - y1) + abs(x2 - y2))/3.0
    if er < 0.001:
        break
print("Convergence achieved in",n,"steps")
print(x0,x1,x2)
Table 2.3: Different iterative schemes for solving a set of linear equations Ax = b.
$$a_{00}x_0^{m+1} = b_0 - a_{01}x_1^m - a_{02}x_2^m, \tag{2.18}$$
$$a_{11}x_1^{m+1} = b_1 - a_{10}x_0^m - a_{12}x_2^m,$$
$$a_{22}x_2^{m+1} = b_2 - a_{20}x_0^m - a_{21}x_1^m.$$
Note that we have a diagonal matrix (D) on the left-hand side, and the matrix
on the right-hand side is a sum of lower (L) and upper (U ) triangular matrices.
Finally, we can express the above equation in a compact form like,
$$x^{m+1} = P x^m + q, \tag{2.20}$$
such that,
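$$D = \begin{pmatrix} 4 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & 8 \end{pmatrix}, \quad L = \begin{pmatrix} 0 & 0 & 0 \\ 3 & 0 & 0 \\ -2 & 3 & 0 \end{pmatrix}, \quad U = \begin{pmatrix} 0 & 2 & 3 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}. \tag{2.22}$$
Thus,
$$P = -D^{-1}(L + U) = -\begin{pmatrix} 0.25 & 0 & 0 \\ 0 & -0.20 & 0 \\ 0 & 0 & 0.125 \end{pmatrix} \begin{pmatrix} 0 & 2 & 3 \\ 3 & 0 & 2 \\ -2 & 3 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -0.50 & -0.75 \\ 0.60 & 0 & 0.40 \\ 0.25 & -0.375 & 0 \end{pmatrix} \tag{2.23}$$
and
$$q = D^{-1} b = \begin{pmatrix} 0.25 & 0 & 0 \\ 0 & -0.20 & 0 \\ 0 & 0 & 0.125 \end{pmatrix} \begin{pmatrix} 8 \\ -14 \\ 27 \end{pmatrix} = \begin{pmatrix} 2 \\ 2.8 \\ 3.375 \end{pmatrix}. \tag{2.24}$$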
Finally, I leave it as an exercise to verify that P xm + q yields Eq. 2.13. Let us write
a Python code for the Jacobi method.
import numpy as np
import sys
# system size
n = int(input('Number of unknowns? '))
# initializing matrix and vector
a = np.zeros([n,n],float)
b = np.zeros(n,float)
x = np.zeros(n,float)
d = np.zeros([n,n],float)
p = np.zeros([n,n],float)
q = np.zeros(n,float)
print('Enter the coefficient matrix row by row')
for i in range(n):
    for j in range(n):
        print('Enter row', i, 'column', j)
        a[i][j] = float(input())
print('Enter the b vector')
for i in range(n):
    b[i] = float(input())
# Constructing the inverse of the diagonal matrix
for i in range(n):
    if a[i][i] == 0.0:
        sys.exit('Error: division by zero!!')
    else:
        d[i][i] = 1.0/a[i][i]
        # matrix a changed to L+U (diagonal=0)
        a[i][i] = 0.0
# Constructing the p matrix
p = -d.dot(a)
print("The p matrix is")
print(p)
# Constructing the q vector
q = d.dot(b)
print("The q vector is")
print(q)
# from here on, b stores the current guess of the solution
print("Enter the initial guess")
for i in range(n):
    b[i] = float(input())
# Jacobi iteration starts
for i in range(100):
    x = p.dot(b) + q
    er = 0.0
    for j in range(n):
        er += abs(b[j] - x[j])
        b[j] = x[j]
    if er < 0.001:
        print("Converged in ",i," steps")
        print(x)
        sys.exit("Well done!")
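For comparison, the same construction can be written more compactly with NumPy helpers. This is a sketch in which A and b are assembled from the entries of Eq. 2.22 and Eq. 2.24, and np.diag, np.tril and np.triu are used to split the matrix.

```python
import numpy as np

# Coefficient matrix A = D + L + U (Eq. 2.22) and right-hand side b (Eq. 2.24).
A = np.array([[4.0, 2.0, 3.0],
              [3.0, -5.0, 2.0],
              [-2.0, 3.0, 8.0]])
b = np.array([8.0, -14.0, 27.0])

D_inv = np.diag(1.0 / np.diag(A))   # inverse of the diagonal part
L = np.tril(A, -1)                  # strictly lower triangular part
U = np.triu(A, 1)                   # strictly upper triangular part

P = -D_inv @ (L + U)                # Eq. 2.23
q = D_inv @ b                       # Eq. 2.24

x = np.array([-2.0, 2.0, 3.0])      # initial guess used in the text
for m in range(3):
    x = P @ x + q                   # Eq. 2.20
    print(m + 1, np.round(x, 3))
```

The three printed rows can be compared with the first three rows of Table 2.2.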
s = P s + q. (2.25)
• All the eigenvalues of P satisfy $|\lambda| < 1$ (the largest $|\lambda|$ is known as the spectral radius
of P ).
Although norms are somewhat easy to calculate, the spectral radius is a better
indicator. Is there a way to directly say something about the convergence from the
coefficient Xmatrix A without calculating P ? The matrix A is diagonally dominant
if |aii | > |aij | (absolute value of the diagonal element is greater that the sum of
j,j6=i
the absolute values of other entries in a row). A diagonally dominant coefficient
matrix A ensures convergence (see exercise).
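The criteria above can be checked numerically. The following is a minimal sketch, assuming the coefficient matrix implied by $D^{-1}$ and $(L+U)$ in Eq. 2.23; the variable names are chosen here only for illustration. It computes $||P||_1$, $||P||_\infty$, the spectral radius and tests the diagonal dominance of A.

```python
import numpy as np

# Coefficient matrix implied by Eq. 2.23 (A = D + L + U); an assumption of this sketch
A = np.array([[4.0, 2.0, 3.0],
              [3.0, -5.0, 2.0],
              [-2.0, 3.0, 8.0]])

D = np.diag(np.diag(A))        # diagonal part of A
LU = A - D                     # L + U (zero diagonal)
P = -np.linalg.inv(D) @ LU     # Jacobi iteration matrix

norm_1 = np.linalg.norm(P, 1)            # maximum absolute column sum
norm_inf = np.linalg.norm(P, np.inf)     # maximum absolute row sum
rho = max(abs(np.linalg.eigvals(P)))     # spectral radius (largest |eigenvalue|)

# strict row-wise diagonal dominance of A
off_diag = np.sum(np.abs(LU), axis=1)
dominant = bool(np.all(np.abs(np.diag(A)) > off_diag))

print("||P||_1   =", norm_1)
print("||P||_inf =", norm_inf)
print("spectral radius =", rho)
print("A diagonally dominant?", dominant)
```

For this particular matrix both norms come out larger than 1 and A is not diagonally dominant, yet the spectral radius is below 1. This illustrates why the spectral radius is the better indicator: the norm and diagonal dominance tests are sufficient conditions only.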
2.2.6 Exercise
1. Solve Eq. 2.12 using the Gaussian elimination method (by hand).
2. What would happen if you try to solve the example problem (the Jacobi or
Gauss-Seidel method) with an initial guess, which is exactly equal to the
actual solution, i.e., x0 = −1, x1 = 3, and x2 = 2? Modify the given Python
code and check the answer.
(Footnote 3: There are different types of norms. $||P||_1$ is the maximum of the column-wise
sums of absolute values. $||P||_\infty$ is the maximum of the row-wise sums of absolute values.)
3. Write a Python code to solve Eq. 2.2 using the Jacobi and Gauss-Seidel
method.
8. Run three iterations using Eq. 2.20, Eq. 2.23 and Eq. 2.24 and match with
Table 2.2 (round off at third decimal place).
9. Calculate $||P||_1$ and $||P||_\infty$ for the following matrix,
\[ P = \begin{pmatrix} 4 & 2 & 3 \\ 3 & -5 & 2 \\ -1 & 3 & 7 \end{pmatrix}. \]
10. For the Jacobi method, prove that if the coefficient matrix is diagonally dom-
inant, then ||P ||∞ < 1.
11. Why do you think that the Jacobi method failed to converge for Eq. 2.2?
2.3 Interpolation
We know the value of some function at two points, x1 and x2 , but would like to
know its value at x, lying in-between. Linear interpolation would be the simplest
thing to do in this situation. The working principle is depicted in Figure 2.3. We
can easily find the slope of the straight line joining the points f (x1 ) and f (x2 ), and
then it is straightforward to get the value at x, lying along the line,
\[ f(x) \approx f(x_1) + d = f(x_1) + \frac{f(x_2) - f(x_1)}{x_2 - x_1}\,(x - x_1). \tag{2.28} \]
Note that the value obtained along the straight line will most likely differ from
the actual value of the function, f (x) (unless the actual function is linear itself).
The larger the distance between x1 and x2 , the more will be the error (difference
between the actual and interpolated value).
Figure 2.3: Linear interpolation: we know the values of f(x1) and f(x2), but would
like to know f(x). Linear interpolation would be the simplest method of doing
this. Since we know the slope of the line connecting the points f(x1) and f(x2),
we can get f(x) (red point) along this line. However, it may differ from the actual
value of the function at x. The smaller the distance between x1 and x2, the better
the match with the actual value.
Let us write a Python code for interpolation. In order to demonstrate the various
concepts, we assume a function $f(x) = \sqrt{x}$. We take two points x1 and x2 on this
curve and apply interpolation to get the value at the midpoint, i.e., $x = (x_1 + x_2)/2$.
Since we have started with a known function, we can accurately calculate the
difference between the actual and interpolated value.
import numpy as np
def interpolation(x1, x2, y1, y2, x):
    # straight line through (x1, y1) and (x2, y2), evaluated at x
    fx = y1 + (x - x1) * (y2 - y1) / (x2 - x1)
    return fx
# Driving program (assuming a function y=x**0.5)
x1 = float(input("Enter the value of x1:"))
y1 = np.sqrt(x1)
x2 = float(input("Enter the value of x2:"))
y2 = np.sqrt(x2)
x = (x1 + x2) / 2
nans = interpolation(x1, x2, y1, y2, x)
ans = np.sqrt(x)
err = abs(nans - ans)
print("Interpolated value at x=",x)
print(nans)
print("Absolute value of deviation from the actual value:",err)
Actual value of f (x) is 1.22474 at x = 1.5. If you start with x1 = 1 and x2 = 2, the
error turns out to be 0.01763. On the other hand, if you start with x1 = 1.25 and
x2 = 1.75, the error reduces to 0.00429.
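The two cases quoted above can be reproduced without typing the inputs by hand. The following is a small sketch (the helper name midpoint_error is introduced here only for illustration) that applies the same interpolation formula to $f(x) = \sqrt{x}$ for both intervals.

```python
import math

def interpolation(x1, x2, y1, y2, x):
    # straight line through (x1, y1) and (x2, y2), evaluated at x
    return y1 + (x - x1) * (y2 - y1) / (x2 - x1)

def midpoint_error(x1, x2):
    # error of linear interpolation for sqrt(x) at the midpoint of [x1, x2]
    x = 0.5 * (x1 + x2)
    approx = interpolation(x1, x2, math.sqrt(x1), math.sqrt(x2), x)
    return abs(approx - math.sqrt(x))

err_wide = midpoint_error(1.0, 2.0)
err_narrow = midpoint_error(1.25, 1.75)
print("x1=1.00, x2=2.00: error =", err_wide)
print("x1=1.25, x2=1.75: error =", err_narrow)
print("ratio =", err_wide / err_narrow)
```

Halving the interval reduces the error by roughly a factor of four, which is what one expects when the error of linear interpolation scales with the square of the interval width.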
Figure 2.4: Estimating the area under the curve (a) by dividing the total area
in small rectangular slices of width h and (b) by dividing the area in a set of
trapezoids. Clearly, the second approximation is more accurate.
2.3.1 Exercise
1. Let us assume that we want the value at x = 8.5 and start with x1 = 8 and
x2 = 9. Using the example code, find the error. Why is the error so small
compared to the case when we tried to calculate the value at x = 1.5, starting
with x1 = 1 and x2 = 2?
Now we can add all such slices to get the total area under the curve,
\[
I \approx \sum_{n=1}^{N} A_n = h\left[\frac{1}{2}f(a) + f(a+h) + f(a+2h) + \cdots + \frac{1}{2}f(b)\right]
= h\left[\frac{f(a) + f(b)}{2} + \sum_{n=1}^{N-1} f(a + nh)\right]. \tag{2.31}
\]
Using this algorithm, let us write a Python code to evaluate the following integral,
$I = \int_1^2 (5x^4 + 4x^3 + 3x^2 + 2x + 1)\,dx$, and match the answer with the analytical
result.
def f(x):
    return 5*x**4 + 4*x**3 + 3*x**2 + 2*x + 1

N = 20
a = 1.0
b = 2.0
h = (b-a)/N
# end points carry a weight of 1/2 (Eq. 2.31)
s = 0.5 * f(a) + 0.5 * f(b)
# interior points carry a weight of 1
for m in range(1,N):
    s += f(a + m * h)
print(h * s)
In this code, I have used a user-defined function and a for loop. Running the code,
you get an answer of 57.037915625, while the analytical result is 57. We can
improve the accuracy by reducing the width (h) of the trapezoids (see exercise).
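To see how the accuracy depends on h, the same algorithm can be wrapped in a function and called with different numbers of slices. This is only a sketch; the function name trapezoid and the chosen values of N are assumptions made here for illustration.

```python
def f(x):
    return 5*x**4 + 4*x**3 + 3*x**2 + 2*x + 1

def trapezoid(func, a, b, n):
    # composite trapezoidal rule with n slices of width h (Eq. 2.31)
    h = (b - a) / n
    s = 0.5 * func(a) + 0.5 * func(b)
    for m in range(1, n):
        s += func(a + m * h)
    return h * s

exact = 57.0   # analytical value of the integral
errors = {}
for n in (10, 20, 40, 80):
    errors[n] = abs(trapezoid(f, 1.0, 2.0, n) - exact)
    print(n, errors[n])
```

Each time the number of slices is doubled (h is halved), the error drops by a factor close to four, i.e., the error of the trapezoidal rule scales approximately as $h^2$.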
2.4.2 Exercise
1. In the given example code, reduce the width of the trapezoids (h) and check
the difference with the analytical result.
\[ f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}. \tag{2.32} \]
Practically it is impossible to take h → 0, but we can take h as small as possible.
The working principle is depicted in Figure 2.5. You will be given a set of values
(say from some experimental observation) at a regular interval h. You have to
calculate the difference between the adjacent values and divide by the width of
the interval (h). There are three ways of doing this, as discussed below.
Figure 2.5: Calculation of the first derivative using forward, backward and central
difference method. The filled circles are data points given at a regular interval h.
In case of the forward and backward difference, we can calculate the derivative
on the same set of points (filled circles). On the other hand, using the central
difference, we can get the derivative on a different set of points (open circles),
lying in the middle of the original set of data points.
\[ f'(x) \approx \frac{f(x+h) - f(x)}{h}. \tag{2.33} \]
\[ f'(x) \approx \frac{f(x) - f(x-h)}{h}. \tag{2.34} \]
\[ f'(x) \approx \frac{f(x+h/2) - f(x-h/2)}{h}. \tag{2.35} \]
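The three formulas, Eq. 2.33 (forward), Eq. 2.34 (backward) and Eq. 2.35 (central), can be compared directly on a function whose derivative is known. The following is a minimal sketch, assuming the test function $x^3/3$ (derivative $x^2$) and an arbitrarily chosen point and interval.

```python
def f(x):
    return x**3 / 3.0

def forward(func, x, h):
    # Eq. 2.33
    return (func(x + h) - func(x)) / h

def backward(func, x, h):
    # Eq. 2.34
    return (func(x) - func(x - h)) / h

def central(func, x, h):
    # Eq. 2.35
    return (func(x + h/2) - func(x - h/2)) / h

x, h = 1.0, 0.1          # illustrative choice of point and interval
exact = x**2             # analytical derivative of x**3/3
err_fwd = abs(forward(f, x, h) - exact)
err_bwd = abs(backward(f, x, h) - exact)
err_cen = abs(central(f, x, h) - exact)
print("forward :", err_fwd)
print("backward:", err_bwd)
print("central :", err_cen)
```

For the same h, the central difference is far more accurate here: its error is of order $h^2$, while the forward and backward differences have errors of order h.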
The only way to get the derivative on the original set of data points is to double
the interval to 2h, which may worsen the numerical error. To get the second
derivative, we have to apply the central difference twice,
\[
f''(x) \approx \frac{f'(x+h/2) - f'(x-h/2)}{h}
= \frac{[f(x+h) - f(x)]/h - [f(x) - f(x-h)]/h}{h}
= \frac{f(x+h) - 2f(x) + f(x-h)}{h^2}. \tag{2.36}
\]
Note the difference of outcome in the case of first and second derivative calcu-
lation. In the case of the first derivative, we get the values in the middle of the
given data set. On the other hand, in the case of the second derivative, we get the
values at the given data points.
Finally, let us write a code for calculating the first derivative using the central
difference method. We will use a known function $x^3/3$ to compare the
numerical and analytical results.
import numpy as np

def f(x):
    return x**3/3.0

n = 50
dx = 2.0/n
df = np.zeros(n, float)    # derivative at the midpoint of each interval
err = np.zeros(n, float)   # deviation from the analytical derivative x**2
for i in range(1, n+1):
    xl = -1.0 + (i-1) * dx
    fxl = f(xl)
    xr = -1.0 + i * dx
    fxr = f(xr)
    # central difference, evaluated at the midpoint xm of [xl, xr]
    df[i-1] = (fxr - fxl) / (xr - xl)
    xm = (xr + xl) / 2.0
    err[i-1] = abs(df[i-1] - xm*xm)
print("Maximum error with respect to the analytical result")
print(err.max())
2.5.4 Exercise
1. Vary the interval (try n = 10, 20, 30, 40, 60) in the given code and check how the
error is changing.
2. Write a code for calculating the second derivative using the central difference
method. You can use the same function, $y = x^3/3$.
Chapter 3
Partial Differentiation
3.1 Introduction
First, let us understand why we need partial differentiation. If I have a function
of a single variable, like y(x), then its derivative $dy/dx$ can be interpreted (a) geometrically
as the slope of y(x) or (b) physically as the rate of change of y with respect to x.
For example, the rate of change of position is velocity, the rate of change of velocity
is acceleration, etc. Derivatives are used in several problems, like differential
equations or finding maxima and minima. Partial derivatives are used when
we are dealing with a function of more than one variable, i.e., we will generalize
what we have learned in single variable calculus and do multi-variable calculus.
As shown in Fig. 3.1, z(x, y) is a function of two variables. If we draw a plane
parallel to the xz plane, then y is constant on that plane and the plane will intersect
the surface z(x, y) along the red curve. Now, the rate of change of z with respect
to x (when y is constant) can be calculated along the red curve by the partial
derivative $\partial z/\partial x$. Similarly, the rate of change of z with respect to y (when x is constant)
can be calculated along the blue curve by the partial derivative $\partial z/\partial y$. Sometimes,
the variable kept constant is specified as a subscript, like $(\partial z/\partial x)_y$, meaning the partial
derivative of z with respect to x, when y is held constant. We can also take second
(and higher) derivatives, like $\frac{\partial}{\partial x}\frac{\partial z}{\partial x} = \frac{\partial^2 z}{\partial x^2} \equiv z_{xx}$ and $\frac{\partial}{\partial y}\frac{\partial z}{\partial x} = \frac{\partial^2 z}{\partial y \partial x} \equiv z_{yx}$. For most
applied problems, $z_{xy} = z_{yx}$, and this is known as the reciprocity relation.
In simple words, it says that the order of differentiation does not matter.
Example: if we consider a “pure”, i.e., single component (C=1) and single phase
(P=1) system, which is subjected to only one form of work (say Newtonian or
“PV” work), then thermodynamic functions like internal energy (U(V, S)), enthalpy
(H(P, S)), Helmholtz free energy (A(V, T)) and Gibbs free energy (G(P, T)) are bivariate
functions. Thus, you should visualize U, H, A, G as surfaces.
Figure 3.1: A function of two variables z(x, y). Along the red curve, y is constant
and along the blue curve, x is constant. Note that, along the red curve, we can
equivalently define x(y, z) with y constant. Similarly, along the blue curve, we can
equivalently define y(x, z) with x constant. Along the dotted curve, we can define
either x(y, z) or y(x, z) with z constant.
Application: power series It is often useful to write the power series of a func-
tion, like
\[ y(x) = c_0 + c_1(x - x_0) + c_2(x - x_0)^2 + c_3(x - x_0)^3 + c_4(x - x_0)^4 + \cdots \tag{3.1} \]
\[ y(x) = y(x_0) + y'(x_0)(x - x_0) + \frac{1}{2!}y''(x_0)(x - x_0)^2 + \frac{1}{3!}y'''(x_0)(x - x_0)^3 + \cdots \tag{3.3} \]
and in a more compact notation as,
\[ y(x) = \sum_{n=0}^{\infty} \frac{1}{n!}\left[h\frac{d}{dx}\right]^n y(x_0), \tag{3.4} \]
where h = (x − x0 ) and k = (y − y0 ).
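The convergence of the single variable series (Eq. 3.3) can be explored numerically. The following is a minimal sketch, assuming the function $y = e^x$, chosen here because all its derivatives at $x_0$ are equal to $e^{x_0}$; the function name taylor_exp is introduced only for illustration.

```python
import math

def taylor_exp(x, x0, n_terms):
    # partial sum of Eq. 3.3 for y = exp(x); every derivative at x0 equals exp(x0)
    h = x - x0
    total = 0.0
    for n in range(n_terms):
        total += h**n / math.factorial(n) * math.exp(x0)
    return total

x0, x = 0.0, 0.5   # expansion point and evaluation point (illustrative values)
for n_terms in (1, 2, 3, 5, 8):
    approx = taylor_exp(x, x0, n_terms)
    print(n_terms, approx, abs(approx - math.exp(x)))
```

Keeping two terms gives the linear (tangent) approximation and three terms give the quadratic approximation; the error shrinks rapidly as more terms are added, as long as h is small.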
3.1.1 Exercise
1. Derive Eq. 3.5, following the method shown for Eq. 3.4.
2. Replace x = (x0 + h) in Eq. 3.4 and write an alternative form of the Taylor
series expansion. Repeat it for Eq. 3.5.
3. Write the linear and quadratic approximation of a function, using Eq. 3.4
and Eq. 3.5.
4. We can write the quadratic approximation for a two variable function (derived
in the last problem) in a more compact form, by considering two vectors
$\vec{x} = (x, y)$ and $\vec{x}_0 = (x_0, y_0)$ as,
\[ z(\vec{x}) = z(\vec{x}_0) + (\vec{x} - \vec{x}_0)^T \cdot \nabla z(\vec{x}_0) + \frac{1}{2}(\vec{x} - \vec{x}_0)^T \cdot Hz(\vec{x}_0)\,(\vec{x} - \vec{x}_0), \tag{3.6} \]
where $Hz(\vec{x}_0)$ is a 2×2 Hessian matrix. Derive the components of the Hessian
matrix.
5. What would be the form of the Hessian matrix if you are dealing with a
function of n-variables?
\[ \frac{\Delta y}{\Delta x} = \frac{dy}{dx} + \epsilon = y' + \epsilon. \tag{3.7} \]
Figure 3.2: Tangent or linear approximation of the differential, $\Delta y \approx dy = y'\,dx$ (with $\Delta x = dx$).
\[ \Delta z = z(x + \Delta x, y + \Delta y) - z(x, y) \approx dz = \frac{\partial z}{\partial x}dx + \frac{\partial z}{\partial y}dy. \tag{3.9} \]
A geometrical interpretation, very similar to the case of a one variable function, can
be given in this case as well; ∆z is the change along the surface, while dz is the change
along the tangent plane (consult the textbook by Mary L. Boas for more detail).
We can also generalize for an n-variable function $f(x_1, x_2, \cdots, x_n)$ and write the
total differential as:
\[ \Delta f \approx df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}dx_i. \tag{3.10} \]
We can still think of a geometrical interpretation, although we cannot draw it; ∆f
is the change along an n-dimensional surface, while df is the change along the n-dimensional
tangent plane.
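A quick numerical comparison of ∆z and dz (Eq. 3.9) shows how good the tangent plane approximation is. The following is only a sketch, assuming the two variable function $z = x^2 y$ and the point (1, 2), both picked here for illustration.

```python
def z(x, y):
    return x**2 * y

def dz(x, y, dx, dy):
    # total differential (Eq. 3.9) with analytic partial derivatives of z = x**2 * y
    return 2*x*y*dx + x**2*dy

x0, y0 = 1.0, 2.0
for step in (0.1, 0.01, 0.001):
    delta_z = z(x0 + step, y0 + step) - z(x0, y0)   # change along the surface
    lin = dz(x0, y0, step, step)                    # change along the tangent plane
    print(step, delta_z, lin, abs(delta_z - lin))
```

The difference between ∆z and dz shrinks roughly as the square of the step, since the neglected terms are of second (and higher) order in dx and dy.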
3.2.1 Exercise
1. Take a function $y(x) = x^3$. Using the linear/tangent approximation, estimate
y(x + dx) for x = 1 and ∆x = 1, 0.1, 0.01, 0.001 and 0.0001. How much do
the approximate values differ from the actual values? What do you conclude
from this?
2. To the linear/tangent approximation $y(x + dx) = y(x) + y'\,dx$, add higher order
terms. Compare with the power series obtained in the previous section.
4. Can you apply the linear/tangent approximation backward? How would you
modify the formula to do that?
5. Write the forward and backward expansions for y(x) up to the 3rd order term and
add them. What is the advantage of doing this? Something similar is done
in molecular dynamics simulations, where it is known as the Verlet algorithm.
Read about the Verlet algorithm.
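\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. \tag{3.11} \]
\[
\begin{pmatrix} \frac{\partial z}{\partial s} & \frac{\partial z}{\partial t} \end{pmatrix}
= \begin{pmatrix} \frac{\partial z}{\partial x} & \frac{\partial z}{\partial y} \end{pmatrix}
\begin{pmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t} \end{pmatrix}. \tag{3.13}
\]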
We can easily use the matrix form to further generalize for any number of variables
(see exercise).
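Caution! Be careful when you are dealing with too many variables. Let us
take the example of the Cartesian to polar coordinate transformation,
\[ x = r\cos\theta, \quad y = r\sin\theta, \quad r = \sqrt{x^2 + y^2}, \quad \theta = \arctan\frac{y}{x}. \tag{3.15} \]
Now, from the above equations, $\frac{\partial y}{\partial \theta} = r\cos\theta = x$ and $\frac{\partial \theta}{\partial y} = \frac{1/x}{1 + y^2/x^2} = \frac{x}{r^2}$. Why is one
not the reciprocal of the other? This is because we have actually calculated $\left(\frac{\partial y}{\partial \theta}\right)_r$
and $\left(\frac{\partial \theta}{\partial y}\right)_x$. If we take $y = x\tan\theta$ and then calculate $\left(\frac{\partial y}{\partial \theta}\right)_x = \frac{x}{\cos^2\theta} = \frac{r^2}{x}$, then it is
indeed the reciprocal of $\left(\frac{\partial \theta}{\partial y}\right)_x$. I present a simple general proof below.
\[ \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial x}\right)_z = 1. \tag{3.16} \]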
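The role of the variable held constant can be checked numerically with central differences. The following is a minimal sketch, assuming the point (x, y) = (1, 1) and a small step eps, both chosen here for illustration.

```python
import math

x, y = 1.0, 1.0
r = math.hypot(x, y)
theta = math.atan2(y, x)
eps = 1e-6

# (dy/dtheta) at constant r, from y = r sin(theta)
dy_dtheta_r = (r*math.sin(theta + eps) - r*math.sin(theta - eps)) / (2*eps)
# (dy/dtheta) at constant x, from y = x tan(theta)
dy_dtheta_x = (x*math.tan(theta + eps) - x*math.tan(theta - eps)) / (2*eps)
# (dtheta/dy) at constant x, from theta = arctan(y/x)
dtheta_dy_x = (math.atan2(y + eps, x) - math.atan2(y - eps, x)) / (2*eps)

print("(dy/dtheta)_r * (dtheta/dy)_x =", dy_dtheta_r * dtheta_dy_x)
print("(dy/dtheta)_x * (dtheta/dy)_x =", dy_dtheta_x * dtheta_dy_x)
```

Only the second product equals 1 (up to numerical error), because both derivatives in it are taken with the same variable (x) held constant, as required by Eq. 3.16.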
3.3.6 Exercise
1. Write chain rule 2 in matrix form for u(x, y, z) and x(s, t), y(s, t), z(s, t).
2. Jacobian matrix: consider Eq. 3.15, where x(r, θ) and y(r, θ). Write the total
differentials of x(r, θ) and y(r, θ) and show that they can be expressed in the form
of the following matrix equation,
\[ \begin{pmatrix} dx \\ dy \end{pmatrix} = A \begin{pmatrix} dr \\ d\theta \end{pmatrix}. \]
Find the form of A (Jacobian matrix). Repeat for r(x, y) and θ(x, y) and in this
case prove that the Jacobian matrix is $A^{-1}$. Note that, if we compare term
by term between A and $A^{-1}$, they are not reciprocals.
5. Use the “algebraic” method to solve the following problems. If there are only a
few variables, simple elimination/substitution works. Otherwise, you
have to apply Cramer’s rule in complicated cases.
(a) Given w = f(ax + by), find $\left[ b\left(\frac{\partial w}{\partial x}\right)_y - a\left(\frac{\partial w}{\partial y}\right)_x \right]$.
Answer: 0
(b) Given $z = xe^{-y}$, x = cosh t, y = cos s, find $\left(\frac{\partial z}{\partial s}\right)_t$ and $\left(\frac{\partial z}{\partial t}\right)_s$.
Answer: $z \sin s$ and $e^{-y}\sinh t$.
dx ∂x
(c) Given x = yz, y = sin(y + z), find dy . Is this different from ∂y z and
∂x
∂y ? Give proper justification.
y
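(d) Given m = pq, a sin p − p = q, b cos q + q = p, find $\left(\frac{\partial p}{\partial q}\right)_m$, $\left(\frac{\partial p}{\partial q}\right)_a$, $\left(\frac{\partial p}{\partial q}\right)_b$, $\left(\frac{\partial b}{\partial a}\right)_p$
and $\left(\frac{\partial a}{\partial q}\right)_m$. Note that, for the last two, you need to apply Cramer’s rule.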
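7. In the following problems, explicit forms of the functions are not given.
(a) Given s(v, T), v(p, T), $c_p = T\left(\frac{\partial s}{\partial T}\right)_p$, $c_v = T\left(\frac{\partial s}{\partial T}\right)_v$, find $(c_p - c_v)$.
Answer: $T\left(\frac{\partial s}{\partial v}\right)_T \left(\frac{\partial v}{\partial T}\right)_p$
(b) Given u(x, y), y(x, z), find $\left(\frac{\partial u}{\partial x}\right)_z$.
(c) Given x(y, z), y(x, z), z(x, y), derive the following:
\[ \left(\frac{\partial y}{\partial z}\right)_x = \left(\frac{\partial y}{\partial t}\right)_x \Big/ \left(\frac{\partial z}{\partial t}\right)_x, \]
\[ \left(\frac{\partial x}{\partial y}\right)_z = \left(\frac{\partial x}{\partial t}\right)_z \Big/ \left(\frac{\partial y}{\partial t}\right)_z. \]
(d) Derive the cyclic relation [Eq. 3.17] just by using the substitution/elimination
technique.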
(e) Write the cyclic relation between v (molar volume), P (pressure) and T
(temperature). Is there any physical significance of each term?
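8. Change of variables
(a) In the wave equation $\frac{\partial^2 F}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 F}{\partial t^2}$, substitute r = x + vt and s = x − vt
and express the equation in terms of s, r. Then, solve the equation.
(b) Express the Laplace equation $\frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} = 0$ in polar coordinates r, θ.
(c) Express the Laplace equation $\frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} + \frac{\partial^2 F}{\partial z^2} = 0$ in cylindrical polar
coordinates r, θ, z.
(d) Express the Laplace equation $\frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} + \frac{\partial^2 F}{\partial z^2} = 0$ in spherical polar
coordinates r, θ, φ.
(e) Solve the partial differential equation $\frac{\partial^2 z}{\partial x^2} - 5\frac{\partial^2 z}{\partial x \partial y} + 6\frac{\partial^2 z}{\partial y^2} = 0$ by substituting
s = y + 2x and t = y + 3x.
Answer: z = f(y + 3x) + g(y + 2x)
(f) Solve the partial differential equation $2\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial x \partial y} - 10\frac{\partial^2 z}{\partial y^2} = 0$ by substituting
s = 5x − 2y and t = 2x + y.
(g) Solve the partial differential equation $\frac{\partial^2 w}{\partial x^2} - \frac{\partial^2 w}{\partial y^2} = 1$ by substituting
x = s + t and y = s − t.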
Figure 3.3: Curve with a (a) maximum point, (b) minimum point and (c) inflection
point.
\[ y(x) = y(x_0) + \frac{(x - x_0)^2}{2!}\,y''(x_0), \tag{3.18} \]
since the first derivative $y'(x_0) = 0$. Now, if x0 is a point of maximum, at any
other nearby point y(x) should be less than y(x0), i.e., y(x) − y(x0) < 0 for any x close to x0. This will
be satisfied provided $y''(x_0) < 0$, which is the condition for a maximum point
and also means a concave down curvature. On the other hand, if x0 is a point
of minimum, y(x) − y(x0) > 0 for any x close to x0. This will be satisfied provided $y''(x_0) > 0$,
which is the condition for a minimum point and also means a concave up
curvature. Note that the first derivative changes sign as we move from the left to
the right of a maximum or minimum point, but the second derivative does not change
sign.
There can be a third case, if the curvature of a function changes from concave
down to concave up (or vice versa). In that case, the concavity (second derivative)
changes sign and must pass through a value of 0 at some point, which is
known as the inflection point. Thus, if x0 is an inflection point, it must satisfy
$y''(x_0) = 0$. Such an example is shown in Fig. 3.3 (c). While the second derivative
changes sign as we move from the left to the right of an inflection point, the first derivative
does not change sign. Also see the problem set for further details.
Figure 3.4: Surface with a (a) maximum point, (b) minimum point and (c) saddle
point. Along the blue (red) curve, x (y) remains constant. The two curves are
crossing each other at the origin.
apply the quadratic approximation, setting the first order term to zero at the point
of extremum,
\[ z(x, y) = z(x_0, y_0) + \frac{1}{2!}\left[h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right]^2 z(x_0, y_0), \tag{3.19} \]
where h = (x − x0) and k = (y − y0). We can expand and write,
\[ z(x, y) - z(x_0, y_0) = \frac{1}{2}\left(ah^2 + 2bhk + ck^2\right) = \frac{1}{2}\left[a\left(h + \frac{bk}{a}\right)^2 + \left(c - \frac{b^2}{a}\right)k^2\right], \tag{3.20} \]
where $a = \frac{\partial^2 z(x_0, y_0)}{\partial x^2}$, $b = \frac{\partial^2 z(x_0, y_0)}{\partial x \partial y}$, $c = \frac{\partial^2 z(x_0, y_0)}{\partial y^2}$. For a minimum point, $[z(x, y) - z(x_0, y_0)] > 0$
for all small h and k, which requires a > 0. Since a > 0, we also need $\frac{ac - b^2}{a} > 0$, i.e.,
$(ac - b^2) > 0$, which further implies that c > 0. Thus, the condition for a minimum
point is,
\[ z_{xx} > 0, \quad z_{yy} > 0, \quad z_{xx}z_{yy} > z_{xy}^2. \tag{3.21} \]
Similarly, the condition for a maximum point is,
\[ z_{xx} < 0, \quad z_{yy} < 0, \quad z_{xx}z_{yy} > z_{xy}^2. \tag{3.22} \]
Finally, if we have a > 0 and $\frac{ac - b^2}{a} < 0$, this implies that $ac < b^2$. Similarly, if
a < 0 and $\frac{ac - b^2}{a} > 0$, this also implies that $ac < b^2$. In such a case we do not have a
maximum or minimum, but a saddle point [Fig 3.4] and the condition is,
\[ z_{xx}z_{yy} < z_{xy}^2. \tag{3.23} \]
Such a condition is obviously satisfied if $z_{xx}$ and $z_{yy}$ have opposite signs, and such
an example is shown in Fig 3.4 (c).
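The conditions in Eq. 3.21, Eq. 3.22 and Eq. 3.23 can be collected into a small helper. The following is only a sketch: the function name classify and the test surfaces are chosen here for illustration (they are not necessarily the surfaces plotted in Fig 3.4), and the second derivatives are supplied by hand.

```python
def classify(zxx, zyy, zxy):
    # second derivative test at a point where both first derivatives vanish
    disc = zxx * zyy - zxy**2
    if disc < 0:
        return "saddle point"      # Eq. 3.23
    if disc > 0 and zxx > 0:
        return "minimum"           # Eq. 3.21 (zyy > 0 follows automatically)
    if disc > 0 and zxx < 0:
        return "maximum"           # Eq. 3.22 (zyy < 0 follows automatically)
    return "inconclusive"          # zxx*zyy == zxy**2: higher order terms decide

# second derivatives (zxx, zyy, zxy) at the origin for three simple surfaces
print("z =  x^2 + y^2 :", classify(2.0, 2.0, 0.0))
print("z = -x^2 - y^2 :", classify(-2.0, -2.0, 0.0))
print("z =  x^2 - y^2 :", classify(2.0, -2.0, 0.0))
```

Note that the boundary case $z_{xx}z_{yy} = z_{xy}^2$ is not covered by the three conditions above; the sketch reports it as inconclusive.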
Figure 3.5: Finding the minimum distance $d = \sqrt{x^2 + y^2}$ from the origin, with the
constraint that the point (x, y) lies on the curve.
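\[ df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy = 0, \]
\[ d\phi = \frac{\partial \phi}{\partial x}dx + \frac{\partial \phi}{\partial y}dy = 0. \tag{3.24} \]
Multiplying the second equation by λ and adding, we get
\[ \left(\frac{\partial f}{\partial x} + \lambda\frac{\partial \phi}{\partial x}\right)dx + \left(\frac{\partial f}{\partial y} + \lambda\frac{\partial \phi}{\partial y}\right)dy = 0. \tag{3.25} \]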
Normally, first we define a new function from f (x, y) and φ(x, y):
where λ is the Lagrange multiplier. Taking partial derivative of the above function
yields Eq. 3.26,
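\[ \frac{\partial F}{\partial x} = \frac{\partial f}{\partial x} + \lambda\frac{\partial \phi}{\partial x} = 0, \]
\[ \frac{\partial F}{\partial y} = \frac{\partial f}{\partial y} + \lambda\frac{\partial \phi}{\partial y} = 0, \]
\[ \frac{\partial F}{\partial \lambda} = \phi(x, y) - c = 0. \tag{3.28} \]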
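\[ \frac{\partial F}{\partial x} = \frac{x}{\sqrt{x^2 + y^2}} - 2\lambda x = 0, \]
\[ \frac{\partial F}{\partial y} = \frac{y}{\sqrt{x^2 + y^2}} + \lambda = 0, \]
\[ \frac{\partial F}{\partial \lambda} = y - x^2 + a = 0. \]
The first equation implies that either x = 0 or $\lambda = \frac{1}{2\sqrt{x^2 + y^2}}$, and from the second
equation we get $y = -\frac{1}{2}$ and from the third equation we get $x = \pm\sqrt{a - \frac{1}{2}}$ and the
minimum distance is given by $d = \sqrt{\frac{1}{4} + a - \frac{1}{2}}$. Compare the results with the
“direct” method, as suggested previously.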
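The result can also be cross-checked by brute force: walk along the constraint curve $y = x^2 - a$ and pick the point closest to the origin. The following is a minimal sketch, assuming a = 1 (any $a \geq 1/2$ keeps the square root real) and an arbitrarily chosen grid in x.

```python
import numpy as np

a = 1.0                                  # assumed value, chosen for illustration
x = np.linspace(-1.5, 1.5, 300001)       # fine grid of x values on the curve
y = x**2 - a                             # constraint: y - x^2 + a = 0
d = np.sqrt(x**2 + y**2)                 # distance of each point from the origin

i = np.argmin(d)                         # index of the closest point
d_formula = np.sqrt(0.25 + a - 0.5)      # result of the Lagrange multiplier method

print("closest point on grid: x =", x[i], " y =", y[i])
print("numerical minimum distance:", d[i])
print("Lagrange multiplier result:", d_formula)
```

The grid search returns y close to −1/2 and |x| close to $\sqrt{a - 1/2}$, in agreement with the equations above. Since the curve is symmetric in x, either of the two points ±x may be reported.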
Figure 3.6: Tangent drawn at (a) minimum d and (b) any other d. See Fig 3.5 for
the definition of d.
3.4.4 Exercise
1. Find the maximum and minimum points of the functions:
(a) $z(x, y) = x^2 + y^2 + 2x - 4y + 10$
Answer: Minimum point at (-1,2)
(b) $z(x, y) = x^2 - y^2 + 2x - 4y + 10$
Answer: Saddle point at (-1,-2)
2. A point can move along the curve xy = c. Find its minimum distance from
the origin. Plot and try to understand the geometrical interpretation of the
method of Lagrange multipliers.
3. You have to fit a box (rectangular parallelepiped) within an ellipsoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$,
such that the edges of the box are parallel to the axes. What would
be the maximum possible volume of such a box?
Answer: $\frac{8abc}{3\sqrt{3}}$
4. Find the minimum distance from the origin to the intersection of xy = 6 with
7x + 24z = 0.
5. Consider a rectangle, with isosceles triangles at two of the ends. For a fixed
perimeter, if I want to maximize the total area, what would be the value of θ
(angle between the sides of the triangle and sides of the rectangle)? Also try
to identify the shape.
6. A box has three of its faces in the coordinate planes and one vertex in the
plane ax + by + cz = d. Find the maximum volume of the box.
7. A point can move only along the line ax + by − c = 0. What would be the
location of the point, such that the sum of the squares of its distances from
(1,0) and (-1,0) is minimum.
where $p = \partial f/\partial x$ and $q = \partial f/\partial y$. Note that (p, x) and (q, y) form conjugate pairs
of variables. Now, instead of x and y as the independent variables, if I want x and q
to be the independent variables, what do I do? How do we transform f(x, y) ⇒ g(x, q)?
Subtract d(qy) from df, i.e., define
\[ g = f - qy, \tag{3.31} \]
such that
\[ dg = p\,dx - y\,dq, \tag{3.32} \]
\[ \frac{\partial g}{\partial x} = p, \quad \frac{\partial g}{\partial q} = -y. \tag{3.33} \]
Using the reciprocity relation $\frac{\partial^2 g}{\partial q \partial x} = \frac{\partial^2 g}{\partial x \partial q}$, we can further write that,
\[ \left(\frac{\partial p}{\partial q}\right)_x = -\left(\frac{\partial y}{\partial x}\right)_q. \tag{3.34} \]
3.5.1 Exercise
1. Combined form of first and second law is: du = T ds − pdv, which tells us
that internal energy is a function of entropy and volume, i.e., u(s, v). Find
a Legendre transformation to get the following functions (and also find the
name of each of the functions):
(a) a function of (T, v)
(b) a function of (s, p)
(c) a function of (T, p)
2. Using du and other three functions in the previous problem, find all possible
Maxwell relations.
Note that I have taken the lower limit to be a constant, while the upper limit is a
variable x. If we interchange the limits, we get a negative sign,
\[ \frac{d}{dx}\int_x^a f(t)\,dt = -f(x). \tag{3.36} \]
Since one of the limits is a variable, the definite integral yields a function, instead
of some constant value, i.e.,
\[ \int_a^x f(t)\,dt = g(x), \tag{3.37} \]
and its derivative is given by $g'(x) = f(x)$, according to Eq. 3.35. Starting from the
last statement, we can prove Eq. 3.35. Since $f(t) = \frac{dg}{dt}$, we can write the integral
of f(t) as,
\[ I = \int_a^x f(t)\,dt = [g(t)]_a^x = g(x) - g(a). \tag{3.38} \]
Now, we can easily calculate the derivative of the integral as,
\[ \frac{dI}{dx} = \frac{d}{dx}\int_a^x f(t)\,dt = \frac{dg}{dx} = f(x). \tag{3.39} \]
Now, instead of x, if the upper limit is some function like v(x), then we can write
$\frac{dI}{dv} = f(v) = \frac{dI}{dx}\frac{dx}{dv}$, which implies that,
\[ \frac{d}{dx}\int_a^{v(x)} f(t)\,dt = f(v)\frac{dv}{dx}. \tag{3.40} \]
We can further generalize and take both the upper and lower limits to be functions
of x, such that
\[ \frac{d}{dx}\int_{u(x)}^{v(x)} f(t)\,dt = f(v)\frac{dv}{dx} - f(u)\frac{du}{dx}. \tag{3.41} \]
Now, if we try to calculate the derivative of g(x) (or the derivative of the integral of
f(x, t)) as given in Eq. 3.42, we can differentiate within the integral sign,
\[ \frac{dg(x)}{dx} = \frac{d}{dx}\int_a^b f(x, t)\,dt = \int_a^b \frac{\partial f(x, t)}{\partial x}\,dt, \tag{3.44} \]
because both the limits of the integration are constants (a and b). On the other
hand, if the function is described by Eq. 3.43 (both the limits are functions of x),
then we can combine both Eq. 3.41 and Eq. 3.44 to write (Leibniz rule),
\[ \frac{dg(x)}{dx} = \frac{d}{dx}\int_{u(x)}^{v(x)} f(x, t)\,dt = \int_{u(x)}^{v(x)} \frac{\partial f(x, t)}{\partial x}\,dt + f(x, v)\frac{dv}{dx} - f(x, u)\frac{du}{dx}. \tag{3.45} \]
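The Leibniz rule (Eq. 3.45) can be verified numerically by comparing a finite difference derivative of the integral with the right hand side of the rule. The following is a minimal sketch, assuming the illustrative choices $f(x, t) = x t^2$, $u(x) = x$ and $v(x) = x^2$; the trapezoidal rule is used for the integrals.

```python
def trapezoid(func, lo, hi, n=2000):
    # composite trapezoidal rule with n slices
    h = (hi - lo) / n
    s = 0.5 * (func(lo) + func(hi))
    for m in range(1, n):
        s += func(lo + m * h)
    return h * s

def f(x, t):
    return x * t**2          # integrand (illustrative choice)

def df_dx(x, t):
    return t**2              # partial derivative of f with respect to x

def u(x):
    return x                 # lower limit, du/dx = 1

def v(x):
    return x**2              # upper limit, dv/dx = 2x

def g(x):
    # g(x) = integral of f(x, t) dt from u(x) to v(x)
    return trapezoid(lambda t: f(x, t), u(x), v(x))

x = 1.5
dx = 1e-4
lhs = (g(x + dx) - g(x - dx)) / (2 * dx)          # numerical dg/dx
rhs = (trapezoid(lambda t: df_dx(x, t), u(x), v(x))
       + f(x, v(x)) * 2 * x - f(x, u(x)) * 1.0)   # right hand side of Eq. 3.45
print("numerical derivative:", lhs)
print("Leibniz rule        :", rhs)
```

For this choice the integral can also be done by hand, $g(x) = (x^7 - x^4)/3$, so that $g'(x) = (7x^6 - 4x^3)/3$, which provides a third, independent check.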
3.6.1 Exercise
1. Find $\int_0^\infty t^n e^{-at^2}\,dt$, when n is odd and a > 0.
2. Find $\int_0^\infty t^n e^{-at^2}\,dt$, when n = 2, 4, 6, ...., 2m.
Bibliography
[1] Boas, Mary L., “Mathematical Methods in the Physical Sciences” (Third Edi-
tion), WILEY.
[2] Zia, R. K. P. and Redish, Edward F. and McKay, Susan R., “Making sense of
the Legendre transform”, American Journal of Physics, 77, 614-622 (2009).
Chapter 4
Multiple integrals
Integral over a rectangular area: The simplest case is when both the x and y
limits are constants and f(x, y) = g(x)h(y), such that
\[ \iint f(x, y)\,dx\,dy = \int_{x=a}^{b}\int_{y=c}^{d} g(x)h(y)\,dy\,dx = \left(\int_{x=a}^{b} g(x)\,dx\right)\left(\int_{y=c}^{d} h(y)\,dy\right). \tag{4.1} \]
For example, see the figure in the first row of column (i) in Fig. 4.1.
Repeated integral: y first Now let us consider the figure in the second row of column (i)
in Fig. 4.1, where the area of integration is not a rectangle, but a triangle.
In this case also, the minimum and maximum values of x and y are
(a, b) and (c, d), respectively. However, we cannot set the limits of x from a to b and the
limits of y from c to d, as we did for the rectangle. In this case we can solve in two
ways. First, let us do the dy integration first (you can think of this as a “column-wise”
process),
\[ \iint f(x, y)\,dx\,dy = \int_{x=a}^{b} \underbrace{\left(\int_{y=y_l(x)}^{y_h(x)} f(x, y)\,dy\right)}_{F(x)} dx. \tag{4.2} \]
Figure 4.1: Various different areas over which a double integral needs to be cal-
culated.
Repeated integral: x first Alternately, as shown in the third row of column (i)
in Fig. 4.1, we can do the dx integration first (you can think of this as a “row-wise”
process),
\[ \iint f(x, y)\,dx\,dy = \int_{y=c}^{d} \underbrace{\left(\int_{x=x_l(y)}^{x_h(y)} f(x, y)\,dx\right)}_{F(y)} dy. \tag{4.3} \]
Repeated integral: either one first From the above discussion, it is clear that
the outcome does not depend on which of the two integrals (dx or dy) is evaluated
first, i.e.,
\[ \iint f(x, y)\,dx\,dy = \int_{x=a}^{b} \underbrace{\left(\int_{y=y_l(x)}^{y_h(x)} f(x, y)\,dy\right)}_{F(x)} dx = \int_{y=c}^{d} \underbrace{\left(\int_{x=x_l(y)}^{x_h(y)} f(x, y)\,dx\right)}_{F(y)} dy. \tag{4.4} \]
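The equivalence of the two orders of integration can be checked numerically with nested one dimensional integrals. The following is a minimal sketch, assuming the integrand 3(x + y) over the triangle 0 ≤ x ≤ 1, 0 ≤ y ≤ 2x and a simple midpoint rule; the helper name midpoint is introduced here only for illustration.

```python
def f(x, y):
    return 3.0 * (x + y)

def midpoint(func, lo, hi, n=400):
    # composite midpoint rule with n slices
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))

# y first ("column-wise"): for each x in [0, 1], y runs from 0 to 2x
I_y_first = midpoint(lambda x: midpoint(lambda y: f(x, y), 0.0, 2.0 * x), 0.0, 1.0)

# x first ("row-wise"): for each y in [0, 2], x runs from y/2 to 1
I_x_first = midpoint(lambda y: midpoint(lambda x: f(x, y), 0.5 * y, 1.0), 0.0, 2.0)

print("dy first:", I_y_first)
print("dx first:", I_x_first)
```

Both orders describe the same triangle, only the limits are written differently, and both results agree with the analytical value 4 up to the discretization error.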
4.1.1 Exercise
1. Evaluate the following double integrals (it is always a good idea to draw
the area over which you are integrating). Also try to solve the problems by
interchanging the order of integration (whenever possible).
(a) $\int_{x=0}^{1}\int_{y=2}^{6} 3x\,dy\,dx$.
Answer: 6
(b) $\int_{y=0}^{2}\int_{x=2y}^{4} 2\,dx\,dy$.
Answer: 8
(c) $\int_{x=0}^{4}\int_{y=0}^{x/2} 3y\,dy\,dx$.
Answer: 8
(d) $\int_{x=0}^{1}\int_{y=x}^{e^x} 2y\,dy\,dx$.
Answer: $\frac{e^2}{2} - \frac{5}{6}$
(e) $\int_{y=0}^{3}\int_{x=1-y/3}^{1} dx\,dy$.
Answer: $\frac{3}{2}$
(f) $\int_{x=0}^{1}\int_{y=0}^{2x} 3(x + y)\,dy\,dx$
Answer: 4
(g) $\int_{y=0}^{2}\int_{x=y^2}^{4} 5y\sqrt{x}\,dx\,dy$
Answer: 32
(h) $\int_{x=0}^{1}\int_{y=0}^{\sqrt{1-x^2}} 3y\,dy\,dx$
Answer: 1
(i) $\int_{y=0}^{\pi}\int_{x=y}^{\pi} \frac{\sin x}{2x}\,dx\,dy$
Answer: 1
(j) $\int_{y=0}^{1}\int_{x=y^2}^{1} \frac{e^x}{\sqrt{x}}\,dx\,dy$
Z 2 Z 2 √
2
(k) 2e−y /2 dydx
x=0 y=x
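[Figure 4.2: polar coordinate system. Labels recovered from the figure: axes x and y, the point (x, y) = (r, θ), the angles θ and dθ, the lengths dr and r dθ, and the basis vectors î, ĵ, êr, êθ.]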
Note that, in the Cartesian coordinate system, the basis vectors (orthogonal) are
taken to be (î, ĵ). In the polar coordinate system, the basis vectors (orthogonal)
are (êr, êθ). How does one coordinate system transform into another?
Note that we have rotated the coordinate system (î, ĵ) by θ to get (êr, êθ) and the
rotation matrix is defined by,
\[ R(\theta) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}. \tag{4.10} \]
The rotation matrix is only mentioned here; we will learn about it in detail later.
Let us come back to the topic of interest in this chapter: what is the form
of the differential area element in polar coordinates? As shown in Fig. 4.2, the area
element is:
\[ dA = r\,dr\,d\theta. \tag{4.11} \]
The length element, equal to the line connecting the two points (r, θ) and (r+dr, θ+
dθ) is given by,
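[Figure 4.3: cylindrical coordinate system. Labels recovered from the figure: the points (r, θ, z) and (r+dr, θ+dθ, z+dz), the angles θ and dθ, the lengths dr, r dθ and dz, and the basis vectors î, ĵ, êr, êθ, êz.]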
Note that we have rotated the coordinate system (î, ĵ, k̂) by an angle θ about the
z axis to get (êr, êθ, êz) and the rotation matrix is defined by,
\[ R_z(\theta) = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.16} \]
Again, the rotation matrix is just mentioned here, as we will learn about the
properties of rotation matrices later.
Let us come back to the topic of interest in this chapter: what is the form of the
differential volume element in cylindrical coordinates? As shown in Fig. 4.3, the
volume element is:
\[ dV = r\,dr\,d\theta\,dz. \tag{4.17} \]
[Figure: spherical coordinate system. Labels recovered from the figure: the angles θ, φ, dθ and dφ, the lengths dr, r dθ, r sin θ and r sin θ dφ, and the basis vectors î, ĵ, k̂, êr, êθ, êφ.]
The area element (on the surface of the cylinder of radius r0 ) is:
dA = r0 dθdz. (4.18)
The length element, equal to the line connecting the two points (r, θ, z) and (r +
dr, θ + dθ, z + dz) is given by,
The length element, equal to the line connecting the two points (r, θ, φ) and (r +
dr, θ + dθ, φ + dφ) is given by,
4.2.5 Jacobians
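Say, I want to evaluate some area integral in polar coordinates, instead of Cartesian
coordinates (because of symmetry, which can make life simpler). We have to make
the substitution given in Eq. 4.8 and also write the area element dxdy in terms
of the variables in polar coordinates. We have already derived the area element in
polar coordinates. Is there a general method of getting it, which can be used for
any transformation (x, y, z) → (u, v, w)?
Let us start by writing the differential of Eq. 4.8,
\[ dx = (\cos\theta)\,dr + (-r\sin\theta)\,d\theta = \frac{\partial x}{\partial r}dr + \frac{\partial x}{\partial \theta}d\theta, \tag{4.25} \]
\[ dy = (\sin\theta)\,dr + (r\cos\theta)\,d\theta = \frac{\partial y}{\partial r}dr + \frac{\partial y}{\partial \theta}d\theta. \]
Rearranging the above equations:
\[
\begin{pmatrix} dx \\ dy \end{pmatrix}
= \begin{pmatrix} \frac{\partial x}{\partial r} & \frac{\partial x}{\partial \theta} \\ \frac{\partial y}{\partial r} & \frac{\partial y}{\partial \theta} \end{pmatrix}
\begin{pmatrix} dr \\ d\theta \end{pmatrix}
= J\left(\frac{x, y}{r, \theta}\right)\begin{pmatrix} dr \\ d\theta \end{pmatrix}
= \frac{\partial(x, y)}{\partial(r, \theta)}\begin{pmatrix} dr \\ d\theta \end{pmatrix}. \tag{4.26}
\]
Similarly, we can write the Jacobian for any transformation (x, y, z) to (u, v, w) as:
\[
J\left(\frac{x, y, z}{u, v, w}\right) = \frac{\partial(x, y, z)}{\partial(u, v, w)}
= \begin{pmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} & \frac{\partial x}{\partial w} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} & \frac{\partial y}{\partial w} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} & \frac{\partial z}{\partial w} \end{pmatrix}, \tag{4.27}
\]
\[ dV = |J|\,du\,dv\,dw. \tag{4.28} \]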
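The Jacobian determinant of the polar transformation can be estimated numerically, which is a handy check when the transformation is more complicated. The following is a minimal sketch, assuming central differences with a small step eps; the function names are introduced here only for illustration.

```python
import math

def to_cartesian(r, theta):
    # polar to Cartesian: x = r cos(theta), y = r sin(theta)
    return r * math.cos(theta), r * math.sin(theta)

def jacobian_det(r, theta, eps=1e-6):
    # central-difference estimates of the four partial derivatives
    x_rp, y_rp = to_cartesian(r + eps, theta)
    x_rm, y_rm = to_cartesian(r - eps, theta)
    x_tp, y_tp = to_cartesian(r, theta + eps)
    x_tm, y_tm = to_cartesian(r, theta - eps)
    dx_dr = (x_rp - x_rm) / (2 * eps)
    dy_dr = (y_rp - y_rm) / (2 * eps)
    dx_dt = (x_tp - x_tm) / (2 * eps)
    dy_dt = (y_tp - y_tm) / (2 * eps)
    # determinant of the 2x2 Jacobian matrix
    return dx_dr * dy_dt - dx_dt * dy_dr

r, theta = 2.0, 0.7   # illustrative point
print("numerical |J| =", jacobian_det(r, theta), " expected r =", r)
```

The determinant comes out equal to r, independent of θ, which reproduces the familiar area element dA = r dr dθ.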
4.2.6 Exercise
1. Derive the length element ds in polar coordinates, using the differential of
Eq. 4.8.
5. Starting with dA and dV in spherical coordinates, find the surface area and
volume of a sphere by selecting appropriate limits.
6. Solve the same problem in different coordinate systems: write a triple integral for
finding the volume inside the cone $z^2 = x^2 + y^2$ and between z = 1 and z = 2
using
• Cartesian coordinate system
• cylindrical coordinate system
• spherical coordinate system
Answer: $\frac{7\pi}{3}$
7. Find the volume of a cone of height h, which is equal to the radius of the
base r. (Hint: use cylindrical coordinates)
Answer: $\frac{\pi h^3}{3}$
8. Find the volume of the cone defined as: $\theta = \alpha < \frac{\pi}{2}$ and lying inside the
12. Find the inverse Jacobians for polar, cylindrical and spherical coordinate
systems.
13. Find the Jacobians for the following transformations:
(a) Parabolic cylindrical coordinates: $x = \frac{1}{2}(u^2 - v^2)$, $y = uv$
(b) Elliptic cylindrical coordinates: x = a cosh u cos v, y = a sinh u sin v
14. Evaluate the following integrals:
(a) $I = \int_{-\infty}^{\infty} e^{-ax^2}\,dx$
(b) $I = \int_0^1 dx \int_0^{\sqrt{1-x^2}} e^{-(x^2+y^2)}\,dy$
Answer: $\frac{\pi}{4}(1 - e^{-1})$
(c) $I = \int_0^\infty\int_0^\infty \frac{x^2 + y^2}{1 + (x^2 - y^2)^2}\,e^{-2xy}\,dx\,dy$ (Hint: parabolic cylindrical)
(d) $I = \int_{x=0}^{1/2}\int_{y=x}^{1-x} \left(\frac{x - y}{x + y}\right)^2 dy\,dx$ (Hint: substitute $x = \frac{1}{2}(r - s)$, $y = \frac{1}{2}(r + s)$)
Answer: $\frac{1}{12}$
Bibliography
[1] Boas, Mary L., “Mathematical Methods in the Physical Sciences” (Third Edi-
tion), WILEY.
Chapter 5
Vector analysis
5.1.3 Exercise
1. Derive the formula for vector triple product, assuming $\vec{B}$ to be along the x axis
and $\vec{C}$ in the xy plane.
5.2. FIRST DERIVATIVE OF SCALAR AND VECTOR FIELDS
2. Let us change from rectangular to some general coordinate system (any three
non-coplanar vectors, not perpendicular to each other). Derive the Jacobian,
used in multiple integrals for changing variables.
3. Using reciprocal lattice vectors $\vec{b}_1$, $\vec{b}_2$ and $\vec{b}_3$, find the direction perpendicular
to the plane with Miller index (hkl). Also find the inter-planar spacing
between (hkl) planes.
4. Differentiation of a vector: in Cartesian coordinates, a vector is represented
as $\vec{A} = A_x\hat{i} + A_y\hat{j} + A_z\hat{k}$. Evaluate the following time derivatives.
(a) $\frac{d\vec{A}}{dt} = ?$
(b) $\frac{d}{dt}(a\vec{A}) = ?$
(c) $\frac{d}{dt}(\vec{A}\cdot\vec{B}) = ?$
(d) $\frac{d}{dt}(\vec{A}\times\vec{B}) = ?$
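9. Prove the Jacobi identity: $\vec{A}\times(\vec{B}\times\vec{C}) + \vec{B}\times(\vec{C}\times\vec{A}) + \vec{C}\times(\vec{A}\times\vec{B}) = 0$.
10. Prove Lagrange’s identity: $(\vec{A}\times\vec{B})\cdot(\vec{C}\times\vec{D}) = (\vec{A}\cdot\vec{C})(\vec{B}\cdot\vec{D}) - (\vec{A}\cdot\vec{D})(\vec{B}\cdot\vec{C})$.
11. Evaluate the scalar triple product of $(\vec{A}\times\vec{B})$, $(\vec{B}\times\vec{C})$ and $(\vec{C}\times\vec{A})$.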
\[ \frac{d\phi}{du} = \vec{\nabla}\phi\cdot\hat{u}, \tag{5.4} \]
where the unit vector $\hat{u}$ points in the direction along which the derivative is
calculated.
5.2.2 Divergence
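\[ \mathrm{div}\,\vec{V} = \vec{\nabla}\cdot\vec{V} = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}. \tag{5.5} \]
\[ \vec{\nabla}\cdot(\phi\vec{V}) = (\vec{\nabla}\phi)\cdot\vec{V} + \phi(\vec{\nabla}\cdot\vec{V}). \tag{5.6} \]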
Physical significance of divergence will be discussed later.
5.2.3 Curl
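\[ \mathrm{curl}\,\vec{V} = \hat{i}\left(\frac{\partial V_z}{\partial y} - \frac{\partial V_y}{\partial z}\right) + \hat{j}\left(\frac{\partial V_x}{\partial z} - \frac{\partial V_z}{\partial x}\right) + \hat{k}\left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right). \tag{5.7} \]
\[ \vec{\nabla}\times(\phi\vec{V}) = (\vec{\nabla}\phi)\times\vec{V} + \phi(\vec{\nabla}\times\vec{V}). \tag{5.8} \]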
Physical significance of curl will be discussed later.
5.2.4 Exercise
1. Using a Lagrange multiplier, find the maximum value of the directional derivative
$d\phi/du$, subject to the constraint that $a^2 + b^2 + c^2 = 1$, where $\hat{u} = a\hat{i} + b\hat{j} + c\hat{k}$.
2. Write the gradient operator in (a) polar, (b) cylindrical and (c) spherical
coordinate systems.
3. For the following functions, calculate $\vec{\nabla}f$ in the polar, as well as the Cartesian
coordinate system. Compare the answers and check whether you get the same
answer or not.
• $f(r) = r$
• $f(r, \theta) = r\cos\theta$
• $f(r, \theta) = r\sin\theta$
• $f(r) = r^2$
5.3. SECOND DERIVATIVE OF SCALAR AND VECTOR FIELDS
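\[ \vec{\nabla}(\vec{\nabla}\cdot\vec{V}) = \hat{i}\left(\frac{\partial^2 V_x}{\partial x^2} + \frac{\partial^2 V_y}{\partial x\partial y} + \frac{\partial^2 V_z}{\partial x\partial z}\right) + \hat{j}\left(\frac{\partial^2 V_x}{\partial x\partial y} + \frac{\partial^2 V_y}{\partial y^2} + \frac{\partial^2 V_z}{\partial y\partial z}\right) + \hat{k}\left(\frac{\partial^2 V_x}{\partial x\partial z} + \frac{\partial^2 V_y}{\partial y\partial z} + \frac{\partial^2 V_z}{\partial z^2}\right). \tag{5.14} \]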
$^{1}$ Later, we will find that it is related to the Euler reciprocity relation and the definition of an exact
differential, which corresponds to a state function in thermodynamics. A related concept is a conservative
force field in classical mechanics and electrodynamics: the work done is independent of the path in a
conservative force field.
5.4. LINE INTEGRALS
5.3.7 Exercise
1. Write the Laplacian operator ($\nabla^2 = \vec{\nabla}\cdot\vec{\nabla}$) in (a) polar, (b) cylindrical and (c)
spherical coordinate systems.
2. Starting from the gradient, divergence and curl (first derivatives), derive
some second derivatives like: Laplacian, (curl grad), (div curl) and (grad div).
3. In order to memorize, we treated the $\vec{\nabla}$ operator as a “vector” and it worked fine!
Then, can we conclude that $(\vec{\nabla}\phi)\times(\vec{\nabla}\psi) = 0$?
4. What would be the expression for $\vec{\nabla}\cdot(\vec{\nabla}\phi\times\vec{\nabla}\psi)$?
thing to keep in mind while calculating a line integral is the fact that there is
only one independent variable along a curve. Thus, first we have to express
$\vec{F}(x, y, z)$ and $d\vec{s} = \hat{i}\,dx + \hat{j}\,dy + \hat{k}\,dz$ as functions of a single variable and then evaluate
the integral (of one variable) to find the total work done by the force to move an
object from one point to another along a path.
Now, work required to move an object from one point to another may depend
on the path (for example, because of energy dissipated due to friction). Such a
field is known as a non-conservative force field. On the other hand, if the work
required to move an object from one point to another is independent of the path
taken, we call it a conservative force field.
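The path dependence described above can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the two force fields and the two paths from (0, 0) to (1, 1) are hypothetical examples chosen here, not taken from the notes. Each path is expressed through a single parameter t, as the text requires, and the work is summed segment by segment.

```python
def line_integral(F, path, n=2000):
    """Approximate the integral of F . ds along path(t), 0 <= t <= 1 (midpoint rule)."""
    total = 0.0
    for k in range(n):
        x0, y0 = path(k / n)
        x1, y1 = path((k + 1) / n)
        Fx, Fy = F(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += Fx * (x1 - x0) + Fy * (y1 - y0)
    return total

def straight(t):
    return (t, t)            # the line y = x

def parabola(t):
    return (t, t * t)        # the curve y = x^2

def conservative(x, y):
    return (y, x)            # gradient of W = x*y (example field)

def non_conservative(x, y):
    return (-y, x)           # curl is 2 k-hat, not zero (example field)

w_cons_line = line_integral(conservative, straight)
w_cons_parab = line_integral(conservative, parabola)
w_non_line = line_integral(non_conservative, straight)
w_non_parab = line_integral(non_conservative, parabola)
print(w_cons_line, w_cons_parab)   # same value for both paths
print(w_non_line, w_non_parab)     # different values for the two paths
```

For the first field the work equals W(1, 1) − W(0, 0) = 1 on both paths, while for the second field the two paths give different answers.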
Clearly, we can evaluate W along different paths and find out whether the force
field is conservative or not. Can we do this without evaluating the integral? The
answer is yes and we have to think logically to recognize the following:
A vector field is conservative if $\vec{\nabla}\times\vec{F} = 0$.
The first two statements are correlated and can be stated as,
If $\vec{F} = \vec{\nabla}W$, then curl $\vec{F} = 0$.
This is not entirely new, as we already know the reverse statement. In order to
prove this, let us write the components of $\vec{F} = \vec{\nabla}W$: $F_x = \partial W/\partial x$, $F_y = \partial W/\partial y$
and $F_z = \partial W/\partial z$. Now, using the equality of second derivatives, i.e., $\partial^2 W/\partial x\partial y =
\partial^2 W/\partial y\partial x$, we find that
Is there a way to express the above set of equations in a compact form? We have
to use the definition of curl and we can write a compact equation like $\vec{\nabla}\times\vec{F} = 0$.
Finally, we want to prove that work done is independent of the path for a
conservative force field, i.e.,
\[ \int\vec{F}\cdot d\vec{s} \ \text{is independent of path if}\ \vec{\nabla}\times\vec{F} = 0 \ \text{or}\ \vec{F} = \vec{\nabla}W. \]
Now, since $\vec{F} = \vec{\nabla}W$, we can write $\vec{F}\cdot d\vec{s} = \frac{\partial W}{\partial x}dx + \frac{\partial W}{\partial y}dy + \frac{\partial W}{\partial z}dz = dW$, where
$d\vec{s} = \hat{i}\,dx + \hat{j}\,dy + \hat{k}\,dz$. Finally, the line integral
\[ \int_A^B \vec{F}\cdot d\vec{s} = \int_A^B dW = W(B) - W(A), \tag{5.16} \]
is found to depend only on the value of W at the end points and is independent
of the path along which the integration is carried out. It is obvious that for a
conservative force field, the integral over a closed path is
\[ \oint_C \vec{F}\cdot d\vec{s} = \oint_C dW = 0. \tag{5.17} \]
Next, we see two important applications of what we have learnt just now.
\[ dW = \frac{\partial W}{\partial x}dx + \frac{\partial W}{\partial y}dy + \frac{\partial W}{\partial z}dz = F_x(x, y, z)\,dx + F_y(x, y, z)\,dy + F_z(x, y, z)\,dz. \tag{5.18} \]
5.5. GREEN’S THEOREM IN PLANE
line integrals of exact differentials are path independent, such that Eq. 5.16
and Eq. 5.17 are valid. In thermodynamics, exact differentials are related to state
functions, while inexact differentials are related to path functions.
One should note the connection between an exact differential and a conservative
vector field. If I give you a conservative vector field $\vec{F}$, you can find an
exact differential $dW = \vec{F}\cdot d\vec{s}$. On the other hand, if I give you an exact
differential, $dW = X\,dx + Y\,dy + Z\,dz$, then you can define a conservative vector field
like $\vec{F} = X\hat{i} + Y\hat{j} + Z\hat{k}$. You will find problems related to differentials in
thermodynamics, while problems related to force fields are important in mechanics or
electrodynamics.
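For a differential in two variables, dW = X dx + Y dy, exactness requires ∂X/∂y = ∂Y/∂x (the z-component of the curl condition). The following is a rough numerical sketch of that test; the function name `is_exact`, the sample points and the two example differentials are assumptions made here for illustration. Passing the test at a few sample points is a necessary check only, not a proof.

```python
def is_exact(X, Y, points, h=1e-5, tol=1e-6):
    """Compare dX/dy with dY/dx at the given points using central differences."""
    for x, y in points:
        dX_dy = (X(x, y + h) - X(x, y - h)) / (2 * h)
        dY_dx = (Y(x + h, y) - Y(x - h, y)) / (2 * h)
        if abs(dX_dy - dY_dx) > tol:
            return False
    return True

samples = [(0.3, 0.8), (1.0, -2.0), (2.5, 1.5)]

# dW = y dx + x dy is the differential of W = x*y, so it should pass.
exact_case = is_exact(lambda x, y: y, lambda x, y: x, samples)

# dW = y dx - x dy has dX/dy = 1 but dY/dx = -1, so it should fail.
inexact_case = is_exact(lambda x, y: y, lambda x, y: -x, samples)

print(exact_case, inexact_case)
```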
5.4.3 Exercise
4. Test if $dz = (2x + y)\,dx + (x + y)\,dy$ is exact or not. If exact, then find z(x, y).
5. Given $dP = \frac{RT}{V-b}\,dT + \left[\frac{RT}{(V-b)^2} - \frac{a}{TV^2}\right]dV$, find out the function P(T, V).
5.6. DIVERGENCE AND DIVERGENCE THEOREM
5.5.1 Exercise
1. Using Green’s theorem, prove that:
\[ \iint_A \vec{\nabla}\cdot\vec{V}\,dx\,dy = \oint_{\partial A}\vec{V}\cdot\hat{n}\,ds \tag{5.21} \]
Divergence represents net outflow per unit volume. See class notes/textbook for
a simple proof for a cubic volume element.
\[ \vec{\nabla}\cdot\vec{J} + \frac{\partial\rho}{\partial t} = 0. \tag{5.24} \]
In steady state,
\[ \vec{\nabla}\cdot\vec{J} = 0. \tag{5.25} \]
Figure 5.2: Amount of water crossing through area $A'$ is the same as the amount of water
crossing through the area A.
Note that the above equations are correct if there exist no source and sink. Otherwise,
we have to add a term ψ to take into account the source minus sink part,
\[ \vec{\nabla}\cdot\vec{J} + \frac{\partial\rho}{\partial t} = \psi. \tag{5.26} \]
\[ (\rho)(vt)(A') = (\rho v)A't = (\rho v)(A\cos\theta)t = (\rho v\cos\theta)At = (\vec{J}\cdot\hat{n})At. \tag{5.27} \]
Note that θ is the angle between the direction of $\vec{v}$ and $\hat{n}$ (unit normal to the
surface A). Thus, the net amount of water crossing per unit area and per unit time is
given by $\vec{J}\cdot\hat{n}$. Now, we can take some area element da on any surface enclosing
some volume (for example, the surface of a sphere) and the unit normal $\hat{n}$ to the surface.
Thus, the mass of water flowing out of the area is given by $(\vec{J}\cdot\hat{n})\,da$ and the total
outflow from the volume enclosed by the surface is
\[ \iint(\vec{J}\cdot\hat{n})\,da = \iint\vec{J}\cdot d\vec{a}. \tag{5.28} \]
We already know that divergence is net outflow per unit volume. We can easily
argue that (consult textbook or class notes): net outflow from the volume enclosed
by the surface must be equal to the net outflow from a surface enclosing the
volume, which leads to the divergence theorem:
\[ \iiint(\vec{\nabla}\cdot\vec{J})\,dV = \iint(\vec{J}\cdot\hat{n})\,da. \tag{5.29} \]
5.7. CURL AND STOKES’ THEOREM
Figure 5.3: (left) Line integral over a closed path in xy plane, such that the nor-
mal is pointing towards the k̂ at the given point. (center) We generalize the area
element and take it to be on the surface of a hemisphere. (right) Flat view of the
hemisphere.
Note that, the L.H.S. is a triple integral over the entire volume enclosed by the
surface A and R.H.S. is a double integral over the entire surface enclosing the
volume V .
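The equality of the volume integral and the surface integral can be checked on a simple shape. The sketch below is a minimal illustration, assuming a unit cube 0 ≤ x, y, z ≤ 1 and a made-up field J = (x², yz, z); neither is taken from the notes. The divergence is estimated by central differences and both integrals use the midpoint rule.

```python
def J(x, y, z):
    """Example vector field (an assumption for this sketch)."""
    return (x * x, y * z, z)

def div_J(x, y, z, h=1e-5):
    """Divergence of J estimated by central differences."""
    return ((J(x + h, y, z)[0] - J(x - h, y, z)[0])
            + (J(x, y + h, z)[1] - J(x, y - h, z)[1])
            + (J(x, y, z + h)[2] - J(x, y, z - h)[2])) / (2 * h)

n = 20
mid = [(k + 0.5) / n for k in range(n)]   # midpoints of n equal sub-intervals of [0, 1]

# Left-hand side: triple integral of div J over the cube.
volume_integral = sum(div_J(x, y, z) for x in mid for y in mid for z in mid) / n**3

# Right-hand side: outward flux through the six faces.
# On the faces at 0 the outward normal points in the negative direction.
surface_integral = 0.0
for a in mid:
    for b in mid:
        surface_integral += (J(1, a, b)[0] - J(0, a, b)[0]) / n**2   # faces x = 1, x = 0
        surface_integral += (J(a, 1, b)[1] - J(a, 0, b)[1]) / n**2   # faces y = 1, y = 0
        surface_integral += (J(a, b, 1)[2] - J(a, b, 0)[2]) / n**2   # faces z = 1, z = 0

print(volume_integral, surface_integral)
```

For this field the divergence is 2x + z + 1, so both numbers should be close to 2.5.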
5.6.4 Exercise
1. Given that $\vec{B} = \vec{\nabla}\times\vec{A}$, using the divergence theorem, prove that $\oint\vec{B}\cdot\hat{n}\,da$ over
any closed surface is zero. Can you justify this in terms of simple arguments?
Figure 5.4: In a fishing net, the net forms the open surface and the rim (made of
metal or plastic) is the curve bounding the open surface.
Let us think of a small fishing net, as shown in Fig. 5.4. The net forms the
open surface, while the rim is the curve bounding the open surface. Note that, we
can deform the net easily, but the rim does not change. Now, let us think of a
net of the shape of a hemisphere. We can deform the net to any other shape,
keeping the rim unchanged. If we do this, whatever we argued to get Eq. 5.33
still remains valid. Let us further assume that the net is made of a stretchable
material, which looks like a fishing net when stretched, but converts to something
like a badminton racket when unstretched. According to our logic, the integral over
the surface should be the same for the stretched and unstretched net. Thus,
we conclude that what matters is the curve bounding the surface, not the surface
itself.
This further implies that, all we need is to calculate a surface integral over a
flat surface (similar to the badminton racket), instead of a curved surface (similar
to the fishing net). The result is going to be same as long as the rim (curve
bounding the net) remains same. For example, if we take the bounding curve to
be a circle, it does not matter whether we have a perfect hemisphere or deformed
hemisphere on top of the circle. We need not even try to calculate the surface
integral over a deformed (or perfect) hemisphere. All we need to calculate is a
surface integral over the circle (bounding the “hemisphere” of whatever shape).
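The argument above can be tested numerically. This is only a sketch under stated assumptions: the field V = (z − y) î + x ĵ is a hypothetical example (its curl, worked out by hand, is ĵ + 2k̂), and the surface is a unit hemisphere sitting on the unit circle in the xy plane. The flux of the curl through the curved surface is compared with the flux through the flat disc.

```python
import math

CURL_V = (0.0, 1.0, 2.0)   # curl of V = (z - y) i + x j, computed by hand (assumed example)

def flux_through_hemisphere(n=200):
    """Integrate curl V . n da over the unit hemisphere z >= 0 (midpoint rule)."""
    dtheta = (math.pi / 2) / n
    dphi = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        for j in range(n):
            phi = (j + 0.5) * dphi
            # Outward unit normal on a sphere is the radial direction.
            normal = (math.sin(theta) * math.cos(phi),
                      math.sin(theta) * math.sin(phi),
                      math.cos(theta))
            dot = sum(c * m for c, m in zip(CURL_V, normal))
            total += dot * math.sin(theta) * dtheta * dphi   # da = sin(theta) dtheta dphi
    return total

def flux_through_disc():
    """On the flat disc the normal is k-hat, so the flux is (curl V)_z times the area."""
    return CURL_V[2] * math.pi

hemisphere = flux_through_hemisphere()
disc = flux_through_disc()
print(hemisphere, disc)
```

Both values should be close to 2π, in line with the claim that only the bounding curve matters.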
5.7.4 Exercise
1. Let us verify the fact that the integral over a hemisphere is the same as the integral over
a circle bounding the hemisphere. Assume $\vec{V} = 4y\hat{i} + x\hat{j} + 2z\hat{k}$.
(a) Find $\iint(\vec{\nabla}\times\vec{V})\cdot\hat{n}\,da$ over the hemisphere $x^2 + y^2 + z^2 = a^2$, $z \ge 0$.
(b) Verify that the result will be the same if we evaluate the integral over the
circle bounding the hemisphere.
Chapter 6
We are taking a vector $\vec{r}(x, y)$, multiplying it with a matrix M and getting a new
vector $\vec{r}' = M\vec{r}$. Note that, by taking various different M, we can get any other
point on the plane, starting from a point (x, y). Thus, matrix M is an operator
that operates on the column matrices and maps the plane into itself. Why do we
call it a linear transformation? This is because of the following:
6.2. ORTHOGONAL TRANSFORMATION
Thus, we can also conclude that for an orthogonal transformation, the determi-
nant of the transformation matrix must satisfy,
detM = ±1 . (6.7)
Later, we will see that $\det M = +1$ for a rotation operation and $\det M = -1$ for a
reflection operation.
6.2.1 Rotation in 2D
As shown in Fig. 6.2, we can either rotate the vector keeping the reference axes
fixed or rotate the reference system keeping the vector fixed.
Active transformation
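If we keep the reference system fixed, we can write $x' = r\cos(\theta + \alpha) = x\cos\theta - y\sin\theta$
and $y' = r\sin(\theta + \alpha) = x\sin\theta + y\cos\theta$. This can be expressed in the matrix notation
as,
\[ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}. \tag{6.8} \]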
$^{1}$ In case of a (3 × 3) determinant, it represents a scalar triple product, i.e., volume bounded by
three vectors. (a) By transposing, we are just writing row vectors as column vectors, but the volume
does not change ⇒ $\det(M^T) = \det(M)$. (b) Interchanging two rows/columns does not change the
magnitude of the determinant, but it picks up a negative sign. (c) If two rows/columns are the same, all the vectors are
coplanar and the determinant is 0.
Figure 6.2: Anti-clockwise rotation by an angle θ: We can either rotate the vector
keeping the reference axes fixed (left) or rotate the reference system keeping the
vector fixed (right).
Passive transformation
On the other hand, if we keep the vector fixed and rotate the reference axes
(change of basis), then the components of the vector in the new reference system
can be written as $x' = r\cos(\alpha - \theta) = x\cos\theta + y\sin\theta$ and $y' = r\sin(\alpha - \theta) = -x\sin\theta +
y\cos\theta$. This can be expressed in the matrix notation as:
\[ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}. \tag{6.9} \]
We can easily verify that the rotation matrix is orthogonal, because $R^T(\theta) = R^{-1}(\theta)$.
We also find that $R^{-1}(\theta) = R(-\theta)$. This makes sense, because the inverse of an
anti-clockwise rotation is a clockwise rotation. We also see that the matrices in Eq. 6.8
and Eq. 6.9 are inverses of each other. This implies that rotation of a vector in the
anti-clockwise direction is equivalent to the rotation of the reference axes in the
opposite (clockwise) direction. We can also verify that $\det R = +1$. This is true for
any rotation matrix.
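These properties are easy to confirm with a few lines of code. The sketch below uses plain Python lists for 2 × 2 matrices; the helper names (`rotation`, `matmul`, `transpose`) and the angle of 30° are illustrative choices, not part of the notes.

```python
import math

def rotation(theta):
    """Active anti-clockwise rotation matrix, as in Eq. 6.8."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

theta = math.radians(30)
R = rotation(theta)            # active rotation, Eq. 6.8
R_passive = transpose(R)       # passive rotation, Eq. 6.9
product = matmul(R, R_passive) # should be the identity if R^T = R^(-1)
det_R = R[0][0] * R[1][1] - R[0][1] * R[1][0]
R_minus = rotation(-theta)     # should coincide with the transpose

print(product)
print(det_R)
```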
6.2.2 Rotation in 3D
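Matrix corresponding to an anti-clockwise rotation about the z-axis is:
\[ \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{6.10} \]
Again, we can easily verify that $\det R = +1$ and $R(\theta)^T = R(\theta)^{-1} = R(-\theta)$.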
6.2.3 Reflection
Reflection of a vector about a line making an angle θ with the x-axis is shown in
Fig. 6.3. We can write: $x' = r\cos(2\theta - \alpha) = x\cos 2\theta + y\sin 2\theta$ and $y' = r\sin(2\theta - \alpha) = x\sin 2\theta - y\cos 2\theta$.
Figure 6.3: Reflection of a vector about a line making an angle θ with the x-axis.
6.2.4 Exercise
3. A cube has three fold rotational symmetry along the body diagonals. Find
the orthogonal matrices, corresponding to each of the three fold axes.
6.3. MATRIX DIAGONALIZATION
Figure 6.4: (Left) When multiplied with some matrix A, an ordinary vector $\vec{v}$ transforms
to $\vec{v}'$. (Right) When multiplied with some matrix A, an eigenvector $\vec{v}$ transforms
to $\lambda\vec{v}$.
\[ (5 - \lambda)x - 2y = 0, \tag{6.13} \]
\[ -2x + (2 - \lambda)y = 0, \]
which gives us two eigenvalues λ = 1 and λ = 6. Putting these values in Eq. 6.13,
we get two lines:
\[ 2x - y = 0 \quad \& \quad x + 2y = 0. \tag{6.15} \]
Now, let us take two vectors along these two lines, $\vec{v}_1 = \hat{i} + 2\hat{j}$ and $\vec{v}_2 = -2\hat{i} + \hat{j}$;
these are the eigenvectors of A, with eigenvalues λ = 1 and λ = 6, respectively.
Note that, if we take an ordinary vector, say $\hat{i} + \hat{j}$, and multiply with the matrix A,
the vector is transformed to another vector $3\hat{i}$, which has a different length and
is also oriented in some other direction [see Fig. 6.4]. On the other hand, if we take
an eigenvector and multiply with the matrix A, we get a vector in the same (or
opposite) direction and the length of the new vector is stretched/compressed (by a
factor of the corresponding eigenvalue) with respect to the length of the eigenvector.
For example, in this case, $\vec{v}_1$ remains $\vec{v}_1$, while $\vec{v}_2$ becomes $6\vec{v}_2$ after transformation
[see Fig. 6.4].
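The statements above can be verified directly. In the sketch below, the matrix A is read off from the coefficients of Eq. 6.13 (an inference, since the matrix itself is not reproduced here), and `apply` is a hypothetical helper for a matrix-vector product.

```python
def apply(A, v):
    """Matrix-vector product for small matrices stored as lists of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Matrix inferred from the coefficients in Eq. 6.13.
A = [[5, -2],
     [-2, 2]]

v1 = [1, 2]       # along 2x - y = 0, eigenvalue 1
v2 = [-2, 1]      # along x + 2y = 0, eigenvalue 6
ordinary = [1, 1] # not an eigenvector

Av1 = apply(A, v1)
Av2 = apply(A, v2)
A_ordinary = apply(A, ordinary)

print(Av1)         # same as v1
print(Av2)         # six times v2
print(A_ordinary)  # points along the x axis, a different direction
```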
\[ D = B^{-1}AB. \tag{6.18} \]
Figure 6.5: In (x, y) coordinate system, vector $\vec{r}$ is transformed to vector $\vec{R}$ by some
transformation matrix A. If we rotate the coordinate system (rotation matrix B)
to go to a new coordinate system $(x', y')$, then $\vec{r}'$ is transformed to vector $\vec{R}'$ (same
transformation). In the new coordinate system, the transformation matrix from $\vec{r}'$
to $\vec{R}'$ is $B^{-1}AB$. This is known as the similarity transformation.
Now, we can decide to take $x'$ and $y'$ along $\hat{v}_1$ and $\hat{v}_2$, respectively. Thus, matrix B
(see Eq. 6.16) is nothing but a rotation matrix that orients (1, 0) and (0, 1) along the
eigenvectors $\hat{v}_1$ and $\hat{v}_2$, respectively.$^{3}$ This is equivalent to rotation of the coordinate
system by an angle θ (anti-clockwise).
We also know from Eq. 6.18 that $B^{-1}AB$ is a diagonal matrix. Let us try to
understand the meaning of this matrix more clearly. First we define a transformation
of $\vec{r}$ to $\vec{R}$ in the (x, y) system, given by $\vec{R} = A\vec{r}$. As shown in Fig. 6.5, the
same transformation in the $(x', y')$ system is given by $\vec{R}' = A'\vec{r}'$. Let us see how
A [transformation matrix in (x, y) system] is related to $A'$ [transformation matrix
in $(x', y')$ system]. Now, since (x, y) coordinate system is rotated clockwise with
$^{3}$ We can easily check that $B\hat{i} = \hat{v}_1$ and $B\hat{j} = \hat{v}_2$.
\[ \vec{R} = B\vec{R}' \quad \& \quad \vec{r} = B\vec{r}'. \tag{6.20} \]
Replacing the above in $\vec{R} = A\vec{r}$, we find that $B\vec{R}' = AB\vec{r}'$ and finally, $\vec{R}' = \underbrace{B^{-1}AB}_{A'}\,\vec{r}'$.
Thus, $B^{-1}AB$ describes the same transformation in the $(x', y')$ coordinate system,
which was described by A in the (x, y) coordinate system. Since we have discovered
that B is a rotation (orthogonal) matrix, we further write Eq. 6.18 as an
orthogonal similarity transformation,
\[ D = B^{-1}AB = B^{T}AB, \tag{6.21} \]
We will not try to find all the eigenvalues and eigenvectors. Rather, we will try to
find the eigenvector for the eigenvalue λ = 1.$^{5}$ The following equations need to be
$^{4}$ We can check that $\det A = +1$, which (for an orthogonal matrix) implies that this is a rotation matrix.
$^{5}$ Note that, there has to be one such eigenvalue and eigenvector if A is a rotation matrix.
solved for λ = 1,
\[ \left(\tfrac{1}{2} - \lambda\right)x + y/\sqrt{2} + z/2 = 0, \tag{6.23} \]
\[ -x/\sqrt{2} - \lambda y + z/\sqrt{2} = 0, \]
\[ x/2 - y/\sqrt{2} + \left(\tfrac{1}{2} - \lambda\right)z = 0. \]
Solving the first two equations, we get x = z and y = 0 and thus, the eigenvector
corresponding to the eigenvalue λ = 1 is [1, 0, 1]. Hence, the above mentioned
rotation matrix A describes a rotation about the [1, 0, 1] axis [i.e., the $\hat{i} + \hat{k}$ axis].
Instead of $(\hat{i}, \hat{j}, \hat{k})$, it would be interesting to define a coordinate system where
$\vec{w} = \hat{i} + \hat{k}$ is one of the reference axes.$^{6}$ Just by observation, we select a vector
perpendicular to $\vec{w} = [1, 0, 1]$, and this turns out to be $\vec{u} = [1, 0, -1]$. Now we get the
third axis $\vec{v} = \vec{w}\times\vec{u} = 2\hat{j}$. Thus, we get a new right handed coordinate system
$(\hat{u}, \hat{v}, \hat{w})$, where $\hat{u} = \frac{1}{\sqrt{2}}[1, 0, -1]$, $\hat{v} = [0, 1, 0]$ and $\hat{w} = \frac{1}{\sqrt{2}}[1, 0, 1]$.
Similar to the previous section, we construct a B matrix as,
\[ B = \begin{pmatrix} 1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 0 & 1 & 0 \\ -1/\sqrt{2} & 0 & 1/\sqrt{2} \end{pmatrix}. \tag{6.24} \]
Interestingly, this is the rotation matrix from the $(\hat{i}, \hat{j}, \hat{k})$ to the $(\hat{u}, \hat{v}, \hat{w})$ coordinate system.$^{7}$
Thus, the similarity transformation $B^{-1}AB$ describes the same transformation in the
$(\hat{u}, \hat{v}, \hat{w})$ coordinate system, as described by A in the $(\hat{i}, \hat{j}, \hat{k})$ coordinate system and we
get,$^{8}$
\[ B^{-1}AB = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{6.25} \]
If we take a closer look at matrix B, we realize that $\vec{w}$ (the rotation axis) is similar
to the z-axis in the new coordinate system. Comparing with Eq. 6.10, we can easily
recognize the rotation angle as −90°.
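The similarity transformation can be reproduced numerically. In the sketch below, the entries of A are inferred from the coefficients of Eq. 6.23 (the matrix itself is not reproduced in this part of the notes), B is taken from Eq. 6.24, and the inverse of B is replaced by its transpose because B is orthogonal. The helper names are illustrative.

```python
import math

s = 1 / math.sqrt(2)

# Rotation matrix A, inferred from the coefficients in Eq. 6.23.
A = [[0.5, s, 0.5],
     [-s, 0.0, s],
     [0.5, -s, 0.5]]

# Columns of B are the new axes u-hat, v-hat and w-hat (Eq. 6.24).
B = [[s, 0.0, s],
     [0.0, 1.0, 0.0],
     [-s, 0.0, s]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

# B is orthogonal, so its inverse is its transpose.
D = matmul(transpose(B), matmul(A, B))

for row in D:
    print([round(value, 10) for value in row])
```

Up to rounding, the result should reproduce the matrix in Eq. 6.25.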
Finally, can we find the rotation matrix representing a −90° rotation about the
axis $(\hat{i} + \hat{k})$? We first have to write the rotation matrix A in some known
form, for example, with respect to the z-axis using Eq. 6.10. Next, we have to define
a rotation matrix B that defines a rotation from $(\hat{i} + \hat{k})$ to $\hat{k}$. Then, the similarity
transform $B^{-1}AB$ will give the answer.
Figure 6.6: A quadratic equation $5x_1^2 - 4x_1x_2 + 5x_2^2 = 20$, when represented with
respect to the principal axes, simplifies to $3y_1^2 + 7y_2^2 = 20$. The principal axes $y_1$
and $y_2$ are oriented along $(\hat{i} + \hat{j})$ and $(-\hat{i} + \hat{j})$, respectively.
Let us solve a problem: express the quadratic equation $5x_1^2 - 4x_1x_2 + 5x_2^2 = 20$ in
terms of its principal axes.
Figure 6.7: Two vectors $\vec{u}$ and $\vec{v}$ are not orthogonal. We can define a set of vectors,
$\vec{u}' = \vec{u}$ and $\vec{v}' = \vec{v} - (\hat{u}\cdot\vec{v})\hat{u}$, which are perpendicular to each other. The blue vectors in
the diagram just show the direction of $\vec{u}'$ and $\vec{v}'$.
We need to solve the following set of linear equations to get the eigenvalues and
eigenvectors:
(1 − λ)x − 4y + 2z = 0, (6.29)
−4x + (1 − λ)y − 2z = 0,
2x − 2y − (2 + λ)z = 0.
write
\[ \vec{v}_3' = (-1, 0, 2) - \left(-\frac{1}{\sqrt{2}}\right)\frac{1}{\sqrt{2}}(1, 1, 0), \tag{6.31} \]
and thus, get a vector (−1/2, 1/2, 2) or (−1, 1, 4). Note that, this vector is perpendicular
to both (1, 1, 0) and (2, −2, 1). However, we need to check whether the vector
we derived is an eigenvector of matrix A. You can easily verify that,
\[ A\begin{pmatrix} -1 \\ 1 \\ 4 \end{pmatrix} = -3\begin{pmatrix} -1 \\ 1 \\ 4 \end{pmatrix}. \tag{6.32} \]
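The orthogonalization step and the eigenvector check can be done with exact fractions. In this sketch, the matrix A is inferred from the coefficients of Eq. 6.29, and `dot` and `apply` are hypothetical helper names; the projection is written as v − (u·v / u·u) u, which is the same as Eq. 6.31 with the unit vector û.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def apply(A, v):
    return [dot(row, v) for row in A]

# Matrix inferred from the coefficients in Eq. 6.29.
A = [[1, -4, 2],
     [-4, 1, -2],
     [2, -2, -2]]

u = [Fraction(1), Fraction(1), Fraction(0)]    # first eigenvector for the repeated eigenvalue
v = [Fraction(-1), Fraction(0), Fraction(2)]   # second one, not orthogonal to u

# Remove from v its component along u.
coeff = dot(u, v) / dot(u, u)
v_perp = [vi - coeff * ui for vi, ui in zip(v, u)]

print(v_perp)             # (-1/2, 1/2, 2)
print(apply(A, v_perp))   # should equal -3 times v_perp
print(dot(v_perp, u), dot(v_perp, [2, -2, 1]))   # both should vanish
```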
\[ D = B^{-1}AB = B^{\dagger}AB, \tag{6.33} \]
6.3.7 Exercise
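$\begin{pmatrix} 2 \\ 4 \end{pmatrix}$, with constants 6 and 8, respectively. Thus, we can write the second
column of the product matrix to be equal to $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} 6 \\ 8 \end{pmatrix}$.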
4. Take a matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ and multiply it with another matrix $B = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$.
(a) Check that the columns of $B = (b_1\ b_2)$ are made of eigenvectors of A, such
that $Ab_1 = 3b_1$ and $Ab_2 = b_2$. (b) Using the above, we can write $AB =
(Ab_1\ Ab_2) = (3b_1\ b_2)$. (c) Note that, we can further rewrite this as $(3b_1\ b_2) =
(b_1\ b_2)\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} = BD$, where D is a diagonal matrix, made of eigenvalues of
A. (d) Finally, confirm that AB = BD.
5. Repeat the previous problem with $A = \begin{pmatrix} 3 & 2 \\ 6 & -1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 1 \\ -3 & 1 \end{pmatrix}$.
Similarity transformation
6. Given $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$. Find the matrix B, satisfying the similarity transformation
$B^{-1}AB = D$, where D is a diagonal matrix. Find out whether B is a
rotation matrix or not and give proper justification to support your answer.
7. Given $A = \begin{pmatrix} 3 & 2 \\ 6 & -1 \end{pmatrix}$. Find the matrix B, satisfying the similarity transformation
$B^{-1}AB = D$, where D is a diagonal matrix. Find out whether B is a
rotation matrix or not and give proper justification to support your answer.
10. In the first two problems, check whether trace and determinant are conserved
after similarity transformation.
14. Find the rotation matrix corresponding to 90° rotation about $\hat{i} + \hat{k}$.
15. Find the rotation matrix corresponding to 180° rotation about $\hat{i} + \hat{j}$.
16. Find the rotation matrix corresponding to 120° rotation about $\hat{i} + \hat{j} + \hat{k}$.
17. Express the quadratic equation $5x_1^2 - 4x_1x_2 + 5x_2^2 = 5$ in terms of its principal
axes. Make a plot, clearly showing the principal axes. You have to specify
the vectors along the principal axes.
18. Express the quadratic equation $x_1^2 - x_1x_2 + x_2^2 = 1$ in terms of its principal
axes. Make a plot, clearly showing the principal axes. You have to specify
the vectors along the principal axes.
Unitary transformation
Chapter 7
\[ a_0y + a_1\frac{dy}{dx} + a_2\frac{d^2y}{dx^2} + a_3\frac{d^3y}{dx^3} + \cdots + a_n\frac{d^ny}{dx^n} = b, \tag{7.1} \]
where a’s and b are functions of x (or constants). Some examples of linear equations
are,
\[ xy' + x^2y = e^x, \quad \text{(order 1)} \tag{7.2} \]
\[ x^3y'' + e^xy' + (\ln x)\,y = \cos x, \quad \text{(order 2)} \]
\[ y''' - 2y'' + y' = 2\sin x \quad \text{(order 3)}. \]
Note that, in each of the above equations, the dependent variable y and all its derivatives
occur linearly. The order of the differential equation is decided according to the
order of the highest derivative included in the equation. The general solution of a
linear differential equation of order n has n independent arbitrary constants and
we can get a particular solution by assigning particular values to the constants,
based on a boundary condition or an initial condition.
Some examples of non-linear equations are,
\[ y' - \ln y = 0, \quad \text{(order 1)} \tag{7.3} \]
\[ x^3y'' + y' - y^3 = \sin x, \quad \text{(order 2)} \]
\[ y''' - 2y'' + y'^2 + x^2y = 2\sin y \quad \text{(order 3)}. \]
Note that, in each of the above equations, either the dependent variable y or some
of its derivatives does not occur linearly.
7.2. FIRST ORDER DIFFERENTIAL EQUATIONS
Note that, we can solve linear, as well as non-linear equations using this method.
where $c = \sqrt{3}$.
In order to integrate the right hand side, we multiply and divide by (csc x − cot x),
and substitute (csc x − cot x) = v, such that,
We know that, if the above expression is an exact differential, then we can define
a function F(x, y), such that $P = \partial F/\partial x$ and $Q = \partial F/\partial y$.$^{1}$ Thus, we can write
Thus, we have to find f(y) to get the solution. Using the other equation, we can
write,
\[ y + 3x^3y^2 = \frac{\partial F}{\partial y} = 3x^3y^2 + f'(y) \Rightarrow f(y) = \frac{y^2}{2} + c_1. \]
Thus, $F(x, y) = x^3y^3 - x^5 + \frac{y^2}{2} + c_1$ and the general solution of the given differential
equation is
\[ F(x, y) = c_2 \Rightarrow x^3y^3 - x^5 + \frac{y^2}{2} = c \]
where the constant c replaces both $c_1$ and $c_2$. You should differentiate the answer
and check whether you get the equation given in the question.
$^{1}$ Check for an exact differential: $\partial P/\partial y = \partial Q/\partial x$, because $\partial^2F/\partial y\partial x = \partial^2F/\partial x\partial y$.
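The suggested check can also be carried out numerically. In the sketch below, Q is the coefficient of dy quoted in the working above, while P = 3x²y³ − 5x⁴ is an assumption obtained here by differentiating F with respect to x (the original equation is not reproduced in this part of the notes). The helper `partial` and the test point are illustrative.

```python
def F(x, y):
    """Solution function found above (without the constant)."""
    return x**3 * y**3 - x**5 + y**2 / 2

def P(x, y):
    # Assumed coefficient of dx, obtained by differentiating F with respect to x.
    return 3 * x**2 * y**3 - 5 * x**4

def Q(x, y):
    # Coefficient of dy, as it appears in the working above.
    return y + 3 * x**3 * y**2

def partial(f, x, y, wrt, h=1e-5):
    """Central-difference partial derivative of f with respect to 'x' or 'y'."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

x0, y0 = 1.2, 0.7
err_P = abs(partial(F, x0, y0, "x") - P(x0, y0))   # dF/dx should match P
err_Q = abs(partial(F, x0, y0, "y") - Q(x0, y0))   # dF/dy should match Q
err_exact = abs(partial(P, x0, y0, "y") - partial(Q, x0, y0, "x"))   # exactness test

print(err_P, err_Q, err_exact)
```

All three printed numbers should be very small.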
Thus, we have to find f(y) to get the solution. Using the other equation, we can
write,
\[ 3x^2e^{3y} - y^2 = \frac{\partial F}{\partial y} = 3x^2e^{3y} + f'(y) \Rightarrow f(y) = -\frac{y^3}{3} + c_1 \]
Thus, $F(x, y) = x^2e^{3y} + e^x - y^3/3 + c_1$ and the general solution for the given differential
equation is
\[ F(x, y) = c_2 \Rightarrow x^2e^{3y} + e^x - y^3/3 = c \]
\[ F(x, y) = \int P(x, y)\,dx = \int(y + x + 1)\,dx = xy + \frac{x^2}{2} + x + f(y). \]
Thus, we have to find f(y) to get the solution. Using the other equation, we can
write,
\[ x - y = \frac{\partial F}{\partial y} = x + f'(y) \Rightarrow f(y) = -\frac{y^2}{2} + c_1 \]
\[ F(x, y) = c_2 \Rightarrow xy + \frac{x^2}{2} + x - \frac{y^2}{2} = c \]
Since the right hand side is zero, every coefficient must be equal to zero,
(n + 2) + (m + 1) = 0,
(n + 3) = 0,
m = 0.
\[ \underbrace{\frac{y^2 + 3xy^3}{y^3}}_{P_1(x, y)}dx + \underbrace{\frac{1 - xy}{y^3}}_{Q_1(x, y)}dy = 0. \]
We can easily verify that the above equation is exact, $\partial P_1/\partial y = \partial Q_1/\partial x = -1/y^2$.
Now, we have to find a function F(x, y), such that $P_1 = \partial F/\partial x$ and $Q_1 = \partial F/\partial y$. We
can write,
\[ F(x, y) = \int P_1(x, y)\,dx = \int\left(\frac{1}{y} + 3x\right)dx = \frac{x}{y} + \frac{3x^2}{2} + f(y). \]
Thus, we have to find f(y) to get the solution. Using the other equation, we can
write,
\[ \frac{1}{y^3} - \frac{x}{y^2} = \frac{\partial F}{\partial y} = -\frac{x}{y^2} + f'(y) \Rightarrow f(y) = -\frac{1}{2y^2} + c_1 \]
Thus, $F(x, y) = x/y + 3x^2/2 - 1/(2y^2) + c_1$ and the general solution for the given
differential equation is
\[ F(x, y) = c_2 \Rightarrow \frac{x}{y} + \frac{3x^2}{2} - \frac{1}{2y^2} = c \]
When such a condition (left hand side is a function of x only) is satisfied, I claim
that the integrating factor is $U(x, y) = U(x) = e^{\int f(x)dx} = e^{I}$ (see problem set). In
this particular case, the integrating factor is,
\[ U(x) = e^{\int dx/x} = e^{\ln x} = x. \tag{7.7} \]
We can verify that the above equation is exact, as $\partial P_1/\partial y = \partial Q_1/\partial x = (3x^2 - 2xy)$.
Now, we have to find a function F(x, y), such that $P_1 = \partial F/\partial x$ and $Q_1 = \partial F/\partial y$. We
can write,
\[ F(x, y) = \int P_1(x, y)\,dx = \int(3x^2y - y^2x)\,dx = x^3y - \frac{y^2x^2}{2} + f(y). \]
Thus, we have to find f(y) to get the solution. Using the other equation, we can
write,
\[ x^3 - x^2y = \frac{\partial F}{\partial y} = x^3 - x^2y + f'(y) \Rightarrow f(y) = c_1 \]
Thus, $F(x, y) = x^3y - \frac{y^2x^2}{2} + c_1$ and the general solution for the given differential
equation is
\[ F(x, y) = c_2 \Rightarrow x^3y - \frac{y^2x^2}{2} = c \]
$x^nf(y/x)$.$^{3}$ Since P and Q are homogeneous functions of the same degree, the factor
$x^n$ gets canceled and we can write,
\[ y' = \frac{dy}{dx} = -\frac{P(x, y)}{Q(x, y)} = f\!\left(\frac{y}{x}\right). \tag{7.10} \]
\[ \frac{dy}{dx} = -\frac{P(x, y)}{Q(x, y)} = \frac{3y^2}{x^2} + \frac{y}{x} = f\!\left(\frac{y}{x}\right). \]
\[ v + x\frac{dv}{dx} = 3v^2 + v \Rightarrow \frac{dv}{3v^2} = \frac{dx}{x}. \]
Integrating both sides, we get
\[ -\frac{1}{3v} = \ln|x| + \ln|c| \Rightarrow \frac{x}{y} = -3\ln|cx| \Rightarrow y = \frac{-x}{3\ln|cx|} \]
In order to verify, you can differentiate the last equation and check whether you
get the differential equation given in the question.
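The same verification can be done numerically instead of by hand. The sketch below takes the solution y = −x/(3 ln|cx|) and the equation dy/dx = 3y²/x² + y/x from the working above; the value c = 2 and the sample points are arbitrary choices made for this check, and the derivative is estimated by a central difference.

```python
import math

C = 2.0   # arbitrary value of the integration constant, chosen for this check

def y(x):
    """Solution obtained above: y = -x / (3 ln|cx|)."""
    return -x / (3 * math.log(abs(C * x)))

def residual(x, h=1e-6):
    """dy/dx minus the right hand side 3y^2/x^2 + y/x; should be close to zero."""
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    return dydx - (3 * y(x) ** 2 / x ** 2 + y(x) / x)

residuals = [residual(x) for x in (1.0, 1.5, 3.0)]
print(residuals)
```

The sample points avoid cx = 1, where ln|cx| vanishes and the solution is undefined.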
\[ \frac{dy}{dx} = -\frac{P(x, y)}{Q(x, y)} = \frac{y}{x} - \frac{y^2}{x^2} = f\!\left(\frac{y}{x}\right). \]
\[ v + x\frac{dv}{dx} = v - v^2 \Rightarrow \frac{-dv}{v^2} = \frac{dx}{x}. \]
$^{3}$ Example: a homogeneous function of degree 2, $x^2 + xy$, can be expressed as $x^2(1 + y/x) = x^2f(y/x)$.
$^{4}$ There is an alternate method of solving homogeneous differential equations. We can prove that
$1/(Px + Qy)$ is an integrating factor for Eq. 7.9.
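\[ \frac{1}{v} = \ln|x| + \ln|c| \Rightarrow \frac{x}{y} = \ln|cx| \Rightarrow y = \frac{x}{\ln|cx|} \tag{7.11} \]
\[ v + x\frac{dv}{dx} = \frac{v - v^2}{1 + v} \Rightarrow x\frac{dv}{dx} = \frac{-2v^2}{1 + v} \Rightarrow \left(-\frac{1}{v^2} - \frac{1}{v}\right)dv = 2\frac{dx}{x}. \]
$^{5}$ There is an alternate way to solve linear equations, to be shown in the examples.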
Note that, we have only one arbitrary constant, as expected for a linear first order
equation. Also, $y_c$ is the solution of Eq. 7.12 with Q = 0 and $y_p$ is known as the
particular solution.$^{6,7}$ Some examples are given below.
Example 1: Solve $y' - \frac{y}{x} = 1$.
Method 1:
This is a linear equation, with $P(x) = -1/x$ and $Q(x) = 1$. Thus,
\[ I = \int P(x)\,dx = -\int\frac{1}{x}\,dx = -\ln x \Rightarrow e^{I} = e^{-\ln x} = \frac{1}{x} \]
\[ ye^{I} = \int e^{I}Q(x)\,dx + \ln c \Rightarrow \frac{y}{x} = \int\frac{1}{x}\,dx + \ln c = \ln(cx) \Rightarrow y = x\ln(cx) \]
Method 2:
Let $y = uv$ and $\frac{dy}{dx} = u\frac{dv}{dx} + v\frac{du}{dx}$. Thus, the above equation is converted to
\[ u\frac{dv}{dx} + v\frac{du}{dx} - \frac{uv}{x} = 1 \Rightarrow u\frac{dv}{dx} + v\underbrace{\left(\frac{du}{dx} - \frac{u}{x}\right)}_{=0} = 1. \]
\[ \frac{du}{dx} = \frac{u}{x} \Rightarrow \ln u = \ln(c_1x) \Rightarrow u = c_1x. \]
Let us replace $u = c_1x$ in the equation above (term involving v is still equal to zero),
\[ c_1x\frac{dv}{dx} = 1 \Rightarrow c_1v = \ln(cx). \]
Finally, using $y = uv$, we get,
\[ y = c_1x\,\frac{1}{c_1}\ln(cx) \Rightarrow y = x\ln(cx). \]
Example 2: Solve $y' + y = e^x$.
Method 1:
This is a linear equation with $P(x) = 1$ and $Q(x) = e^x$. Thus,
\[ I = \int P\,dx = x \Rightarrow e^{I} = e^{x}. \]
\[ ye^{I} = \int e^{I}Q\,dx + c = \int e^{2x}\,dx + c = \frac{e^{2x}}{2} + c \Rightarrow y = \frac{e^{x}}{2} + ce^{-x}. \]
$^{6}$ From Eq. 7.16, $y_pe^{I} = \int e^{I}Q\,dx$ and $y_ce^{I} = c$, such that $(y_p + y_c)e^{I} = ye^{I} = \int e^{I}Q\,dx + c$.
$^{7}$ Note that, the general solution of Eq. 7.12, i.e., $y = y_p + y_c$ is not unique.
Method 2:
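Let $y = uv$ and $\frac{dy}{dx} = u\frac{dv}{dx} + v\frac{du}{dx}$. Thus, the above equation is converted to
\[ u\frac{dv}{dx} + v\frac{du}{dx} + uv = e^x \Rightarrow u\frac{dv}{dx} + v\underbrace{\left(\frac{du}{dx} + u\right)}_{=0} = e^x. \]
\[ \frac{du}{dx} = -u \Rightarrow \ln u = -x + c_1 \Rightarrow u = c_2e^{-x}. \]
Let us replace $u = c_2e^{-x}$ in the equation above (term involving v is still equal to
zero),
\[ c_2e^{-x}\frac{dv}{dx} = e^x \Rightarrow c_2v = \frac{e^{2x}}{2} + c_3. \]
Finally, using $y = uv$, we get,
\[ y = c_2e^{-x}\left(\frac{e^{2x}}{2c_2} + \frac{c_3}{c_2}\right) \Rightarrow y = \frac{e^{x}}{2} + ce^{-x}. \]
\[ ye^{I} = \int e^{I}Q\,dx + c = \int x\,dx + c = \frac{x^2}{2} + c \Rightarrow y = \frac{1}{2x} + cx^{-3}. \]
Method 2:
Let $y = uv$ and $\frac{dy}{dx} = u\frac{dv}{dx} + v\frac{du}{dx}$. Thus, the above equation is converted to
\[ u\frac{dv}{dx} + v\frac{du}{dx} + \frac{3uv}{x} = \frac{1}{x^2} \Rightarrow u\frac{dv}{dx} + v\underbrace{\left(\frac{du}{dx} + \frac{3u}{x}\right)}_{=0} = \frac{1}{x^2}. \]
\[ \frac{du}{dx} = -\frac{3u}{x} \Rightarrow \ln u = -3\ln x + \ln c_1 \Rightarrow u = c_1x^{-3}. \]
Let us replace $u = c_1x^{-3}$ in the equation above (term involving v is still equal to
zero),
\[ c_1x^{-3}\frac{dv}{dx} = \frac{1}{x^2} \Rightarrow c_1v = \frac{x^2}{2} + c_2. \]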
\[ y = c_1x^{-3}\left(\frac{x^2}{2c_1} + \frac{c_2}{c_1}\right) \Rightarrow y = \frac{1}{2x} + cx^{-3}. \]
\[ y' + Py = Qy^n, \tag{7.17} \]
where P and Q are functions of x (or can be constants). Clearly, it is not a linear
equation, but it can easily be converted to a linear equation by making a change of
variable,
\[ z = y^{1-n} \Rightarrow z' = (1 - n)y^{-n}y'. \tag{7.18} \]
Multiplying Eq. 7.17 with $(1 - n)y^{-n}$ and then making the above substitution, we
get
\[ z' + (1 - n)Pz = (1 - n)Q. \]
Thus, we have converted the non-linear equation 7.17 to a linear equation and
we already know how to solve this. Some examples are given below.
\[ \frac{1}{3}y^{-2/3}y' + \frac{1}{3}y^{1/3} = \frac{1}{3}x \Rightarrow z' + \frac{1}{3}z = \frac{1}{3}x. \]
Thus, we have converted the non-linear equation to a linear equation in x and z,
with $P(x) = 1/3$ and $Q(x) = x/3$. Thus,
\[ I = \int P\,dx = \frac{x}{3} \Rightarrow e^{I} = e^{x/3} \]
\[ ze^{I} = \int e^{I}Q\,dx + c = \int e^{x/3}\frac{x}{3}\,dx + c = xe^{x/3} - 3e^{x/3} + c \Rightarrow z = x - 3 + ce^{-x/3}. \]
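The result can be checked against the non-linear equation itself. The linearized form above, with z = y^{1/3}, corresponds to y′ + y = x y^{2/3}; this original equation is inferred here from the working and is not quoted from the notes. The sketch below uses y = z³ with an arbitrary constant c = 1 and sample points where z is positive, so that the fractional power is real.

```python
import math

C = 1.0   # arbitrary value of the integration constant, chosen for this check

def z(x):
    """Solution of the linearized equation: z = x - 3 + c e^(-x/3)."""
    return x - 3 + C * math.exp(-x / 3)

def y(x):
    """Back-substitution z = y^(1/3), i.e. y = z^3."""
    return z(x) ** 3

def residual(x, h=1e-6):
    """y' + y - x y^(2/3) for the inferred Bernoulli equation; should be close to zero."""
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    return dydx + y(x) - x * y(x) ** (2 / 3)

residuals = [residual(x) for x in (4.0, 5.0, 6.0)]
print(residuals)
```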
\[ \frac{1}{2}y^{-1/2}y' + \frac{1}{2x}y^{1/2} = x^{3/2} \Rightarrow z' + \frac{1}{2x}z = x^{3/2}. \]
Replacing $z = y^{1/2}$, the answer is $y^{1/2} = \frac{x^{5/2}}{3} + cx^{-1/2}$.
7.2.6 Exercise
Exact equations
8. $(1 + y^2)\,dx + xy\,dy = 0$.
Answer: $\frac{x^2}{2} + \frac{x^2y^2}{2} + c = 0$
15. Check whether the following functions are homogeneous and if yes, find the
degree.
(a) $4x^2 + y^2$
(b) $x^2 - 5xy + y^3/x$
(c) $xy\sin(x/y)$
(d) $(y^4 - x^3y)/x - xy^2\sin(x/y)$
(e) $x\sin(xy)$
(f) $x^2y^3 + x^5\ln(y/x) - y^6/\sqrt{x^2 + y^2}$
(g) $x^3 + x^2y + xy^2 + y^3$
(h) $x^2 + y$
(i) $x^2 + xy + y^3$
(j) $x + \cos y$
16. Solve $y\,dy = (-x + \sqrt{x^2 + y^2})\,dx$.
Answer: $y^2 = 2cx + c^2$
7.3. SECOND ORDER LINEAR DIFFERENTIAL EQUATIONS
21. Prove that, 1/(P x + Qy) is an integrating factor for Eq. 7.9. [Hint: you have
to prove that (P dx + Qdy)/(P x + Qy) is an exact differential, provided P and
Q are homogeneous functions of same degree.]
Linear equations
22. Prove that $e^{I}$ is the integrating factor for Eq. 7.12, i.e., $e^{I}(Py - Q)\,dx + e^{I}\,dy = 0$
is an exact equation. Following the technique of solving an exact equation,
prove that $ye^{I} = \int e^{I}Q\,dx + c$.
23. Solve $dy + (2xy - xe^{-x^2})\,dx = 0$.
Answer: $y = \frac{x^2}{2}e^{-x^2} + ce^{-x^2}$
26. Solve $y' + \frac{y}{\sqrt{x^2 + 1}} = \frac{1}{x + \sqrt{x^2 + 1}}$.
Answer: $y = \frac{x + c}{x + \sqrt{x^2 + 1}}$
Bernoulli equations
27. $3xy^2y' + 3y^3 = 1$.
Answer: $y^3 = \frac{1}{3} + cx^{-3}$
$$ \underbrace{(a_2D^2 + a_1D + a_0)}_{\text{auxiliary equation}}\,y = 0. \tag{7.21} $$
We could also have substituted $y = e^{cx}$ in Eq. 7.20 and obtained the same auxiliary
equation,
$$ a_2c^2 + a_1c + a_0 = 0. \tag{7.22} $$
Now, let us consider three possible cases.
$$ (D - c_1)(D - c_2)\,y = 0. \tag{7.23} $$
Thus, in order to solve Eq. 7.21, we need to solve two first order equations,
$$ (D - c_1)\,y = 0 \quad \& \quad (D - c_2)\,y = 0. \tag{7.24} $$
These are separable equations, with solutions $y_1 = e^{c_1x}$ and $y_2 = e^{c_2x}$, and the
general solution is a linear combination of the two.⁹ Thus, if $c_1$ and $c_2$ are two
roots of the auxiliary equation, the general solution is,
where $A$ and $B$ are arbitrary complex constants. Since $e^{\pm\iota\beta x} = \cos\beta x \pm \iota\sin\beta x$, we
can rewrite the above equation as $y = e^{\alpha x}(C_1\cos\beta x + C_2\sin\beta x)$, where $C_1 = (A + B)$
and $C_2 = \iota(A - B)$. Note that, by selecting appropriate constants, we can get
real, as well as imaginary solutions. For example, if we take $A = B = 1/2$, we
get a real solution $y = e^{\alpha x}\cos\beta x$. Similarly, if we take $A = 1/(2\iota)$ and $B = -1/(2\iota)$,
we get another real solution $y = e^{\alpha x}\sin\beta x$. Interestingly, since $\cos\beta x$ and $\sin\beta x$
are linearly independent functions, we can get a series of real solutions by taking
8
$c_1$ and $c_2$ are the roots of the auxiliary equation $a_2D^2 + a_1D + a_0 = 0$, given by $\frac{-a_1 \pm \sqrt{a_1^2 - 4a_2a_0}}{2a_2}$.
9
We can do this as the two solutions are linearly independent. Two functions $f_1(x)$ and $f_2(x)$ are
linearly independent if the Wronskian is not equal to zero. The Wronskian is given by $W = \begin{vmatrix} f_1(x) & f_2(x) \\ f_1'(x) & f_2'(x) \end{vmatrix}$.
where C1 and C2 are real arbitrary constants. We can further express this as,
$$ (D - c)\,\underbrace{(D - c)\,y}_{u} = 0. \tag{7.29} $$
This is a linear first order equation, having a non-zero right-hand side, with $P = -c$
and $Q = Ae^{cx}$. The solution is given by Eq. 7.16, where $I = \int P\,dx = -cx$. We can
write the solution as,
$$ ye^{I} = \int e^{I}Q\,dx = \int e^{-cx}Ae^{cx}\,dx = Ax + B \;\Rightarrow\; y = (Ax + B)e^{cx}. \tag{7.32} $$
$$ (D^2 + \omega^2)\,x = 0. $$
10
Potential energy of a spring is given by $U(x) = \frac{1}{2}kx^2 = \frac{1}{2}m\omega^2x^2$ and the force is $-\frac{dU}{dx}$. Such a
force is known as a conservative force, and we already know that the work done is independent of the
path in such a force field.
$$ D^2 + \omega^2 = 0, $$
and the roots are $D = \pm\iota\omega$. Thus, the general solution can be expressed in any of
the three forms given in Eq. 7.26, Eq. 7.27 or Eq. 7.28,
$$ m\frac{d^2x}{dt^2} = -kx - c\frac{dx}{dt} \;\Rightarrow\; m\frac{d^2x}{dt^2} + c\frac{dx}{dt} + kx = 0. $$
The auxiliary equation is $mD^2 + cD + k = 0$, having roots
$$ D = \frac{-c \pm \sqrt{c^2 - 4mk}}{2m} = -\frac{c}{2m} \pm \sqrt{\frac{c^2 - 4mk}{4m^2}} = -\gamma \pm \sqrt{\gamma^2 - \omega^2}. $$
Note that $\gamma = \frac{c}{2m}$ is known as the damping
coefficient, and the reason will become obvious when we discuss the solutions of
the equation of motion. Let us discuss three possible cases.
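Before going through the cases, the roots can be explored numerically. The sketch below is only an illustration: the function name `damping_case` and the sample values of $m$, $c$ and $k$ are assumptions, and $\omega^2 = k/m$ is taken from the spring potential energy quoted earlier. It computes the roots of $mD^2 + cD + k = 0$ and labels the motion by comparing $\gamma$ with $\omega$.

```python
import numpy as np

def damping_case(m, c, k):
    """Return the roots of m D^2 + c D + k = 0 and the type of damping."""
    gamma = c / (2.0 * m)          # damping coefficient
    omega = np.sqrt(k / m)         # natural frequency of the undamped oscillator
    roots = np.roots([m, c, k])    # roots of the auxiliary equation
    if np.isclose(gamma, omega):
        label = "critically damped"
    elif gamma > omega:
        label = "overdamped"
    else:
        label = "underdamped"
    return roots, label

for m, c, k in [(1.0, 3.0, 1.0), (1.0, 2.0, 1.0), (1.0, 0.5, 1.0)]:
    roots, label = damping_case(m, c, k)
    print(f"m={m}, c={c}, k={k}: roots={roots}, {label}")
```

Real and distinct roots correspond to $\gamma > \omega$, a repeated root to $\gamma = \omega$, and a complex conjugate pair to $\gamma < \omega$.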
[Plot of x(t) versus t; curves: critically damped, overdamped, underdamped.]
Figure 7.1: Displacement as a function of time for damped harmonic motion.
Figure 7.2: Different systems having similar differential equations: (a) spring-mass
and (b) RLC circuit connected in series. Images are taken from Wikipedia.
is an RLC circuit, where the components are connected in series (see Fig. 7.2). In
this case, the governing equation is¹²
$$ L\frac{d^2I}{dt^2} + R\frac{dI}{dt} + \frac{I}{C} = \frac{dV}{dt}, $$
and if we set the right-hand side equal to zero, we solve an equation similar to that of damped
harmonic motion. In this case, resistance plays a role similar to that of friction
in the spring-mass system.
7.3.2 Exercise
1. Re-derive Eq. 7.25: write the auxiliary equation as $(D - c_1)\underbrace{(D - c_2)\,y}_{u(x)} = 0$. $(D - c_2)y$
must be some function of $x$, say $u(x)$. Now, first solve for $u(x)$ from $(D - c_1)u = 0$.
Then, solve $(D - c_2)y = u$ and check whether you get the same answer
as Eq. 7.25.
3. $y'' + y' - 2y = 0$
Answer: $y = Ae^{x} + Be^{-2x}$
4. $y'' + 9y = 0$
Answer: $y = Ae^{3\iota x} + Be^{-3\iota x}$
5. $y'' - 2y' + y = 0$
Answer: $y = (Ax + B)e^{x}$
6. $y'' - 5y' + 6y = 0$
Answer: $y = Ae^{3x} + Be^{2x}$
7. $y'' - 4y' + 13y = 0$
Answer: $y = Ae^{2x}\sin(3x + \gamma)$
8. $4y'' + 12y' + 9y = 0$
Answer: $y = (A + Bx)e^{-3x/2}$
12
We know that $V = RI$, $V = Q/C$, $V = L(dI/dt)$. Combining, we get $L\frac{dI}{dt} + RI + \frac{Q}{C} = V$. Taking the
time derivative and noting that $I = \frac{dQ}{dt}$, we get $L\frac{d^2I}{dt^2} + R\frac{dI}{dt} + \frac{I}{C} = \frac{dV}{dt}$.
9. $e^{-x}$, $e^{-4x}$
13. $1$, $x$, $x^2$
$$ (D - 1)\,\underbrace{(D + 2)\,y}_{u} = e^{x}. $$
Now, let $(D + 2)y = u$, such that we get a first order linear differential equation,
$$ (D - 1)u = e^{x} \;\Rightarrow\; u' - u = e^{x}, $$
7.4. COUPLED FIRST ORDER DIFFERENTIAL EQUATIONS
I would like to draw attention to the fact that we have obtained $y_c$ from the
arbitrary constants at every step. If we omit the arbitrary constants, we can
quickly get the particular solution. Finally, we can beautify the final answer by
writing $\frac{1}{3}c - \frac{1}{9} = c_2$, such that the general solution is $y = \frac{1}{3}xe^{x} + c_2e^{x} + c_1e^{-2x}$.
7.3.4 Exercise
1. $y'' - 4y = 10$
Answer: $y = Ae^{2x} + Be^{-2x} - \frac{5}{2}$
2. $y'' + y' - 2y = e^{2x}$
Answer: $y = Ae^{x} + Be^{-2x} + \frac{1}{4}e^{2x}$
3. $y'' + y = 2e^{x}$
Answer: $y = Ae^{\iota x} + Be^{-\iota x} + e^{x}$
4. $y'' - y' - 2y = 3e^{2x}$
Answer: $y = Ae^{-x} + Be^{2x} + xe^{2x}$
5. $y'' + 2y' + y = 2e^{-x}$
Answer: $y = (Ax + B + x^2)e^{-x}$
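Note that we can express the above equation in matrix form as,
$$ \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}. \tag{7.36} $$
The two column vectors $\vec{y}'$ and $\vec{y}$ are related by the matrix $A$, such that $\vec{y}' = A\vec{y}$. Now,
let us assume that $\vec{y} = B\vec{x}$, such that $\vec{y}' = AB\vec{x}$ and we get,
Finally, we use $\vec{y} = B\vec{x}$ to get the solution for $y_1(t)$ and $y_2(t)$.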
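The procedure can be sketched with NumPy. The matrix `A` below is a hypothetical example (chosen so that its eigenvalues are $-3$ and $5$), and the constants in `c` are arbitrary; `np.linalg.eig` returns the eigenvalues together with a matrix whose columns are eigenvectors, which plays the role of $B$ (its columns are normalized, so they may differ from hand-derived eigenvectors by a scale factor).

```python
import numpy as np

# Hypothetical coefficient matrix of the coupled system y' = A y
A = np.array([[3.0, 2.0],
              [6.0, -1.0]])

lam, B = np.linalg.eig(A)        # eigenvalues and eigenvector matrix (columns)
c = np.array([1.0, -0.5])        # arbitrary integration constants

def y(t):
    # Decoupled solutions x_i = c_i exp(lam_i t), mapped back with y = B x
    return B @ (c * np.exp(lam * t))

def y_prime(t):
    # Exact time derivative of the expression above
    return B @ (lam * c * np.exp(lam * t))

t = 0.3
lhs = y_prime(t)
rhs = A @ y(t)
print("eigenvalues:", lam)
print("y'(t) :", lhs)
print("A y(t):", rhs)
```

The two printed vectors should agree, which confirms that the decoupled exponentials, transformed back with $B$, satisfy the original coupled equations.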
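Let us solve for,
$$ \begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} -3 & 0 \\ 0 & 5 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \tag{7.42} $$
and the solutions are $x_1 = c_1e^{-3t}$ and $x_2 = c_2e^{5t}$. Now, we get the final solution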
7.5. CONVERTING HIGHER ORDER TO 1ST ORDER EQUATIONS
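from $\vec{y} = B\vec{x} \;\Rightarrow\; \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ -3 & 1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$,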
7.4.1 Exercise
1. Solve $y_1' = y_1 + y_2$, $y_2' = 4y_1 + y_2$.
Answer: $y_1 = c_1e^{3t} + c_2e^{-t}$, $y_2 = 2c_1e^{3t} - 2c_2e^{-t}$
$$ x_1 = y \quad \& \quad x_2 = y', \tag{7.45} $$
$$ x_1' = y' = x_2 \quad \& \quad x_2' = y''. $$
Thus, we have converted a 2nd order equation to two coupled 1st order equations,
which we can solve following the method shown in the previous section. We can
do it for even higher order equations, like a 3rd order equation,
$$ x_1 = y, \quad x_2 = y', \quad x_3 = y'', \tag{7.48} $$
Let us see an example, where we solve the following 2nd order linear equation
using this method,
$$ y'' + 5y' - 6y = 0. \tag{7.50} $$
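$$ x_1' = 0 + x_2, \tag{7.51} $$
$$ x_2' = 6x_1 - 5x_2. $$
The eigenvalues and eigenvectors for the matrix $A = \begin{pmatrix} 0 & 1 \\ 6 & -5 \end{pmatrix}$ are,
$$ \lambda_1 = 1, \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \& \quad \lambda_2 = -6, \begin{pmatrix} 1 \\ -6 \end{pmatrix}. \tag{7.52} $$
The $D$ matrix is given by $D = \begin{pmatrix} 1 & 0 \\ 0 & -6 \end{pmatrix}$ and the $B$ matrix is given by $B = \begin{pmatrix} 1 & 1 \\ 1 & -6 \end{pmatrix}$.
Using Eq. 7.38 we can write
$$ \begin{pmatrix} z_1' \\ z_2' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -6 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}, \tag{7.53} $$
and the solutions are $z_1 = c_1e^{t}$ and $z_2 = c_2e^{-6t}$. Thus, the solutions for $x_1$ and $x_2$
are,
$$ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -6 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}. \tag{7.54} $$
Now, since $y = x_1$, we write the final solution as $y = c_1e^{t} + c_2e^{-6t}$.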
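The eigenvalues and eigenvectors used in this example can be reproduced with NumPy. This is a minimal sketch: rescaling each eigenvector so that its first component is 1 is an illustrative choice, made only so that the columns can be compared with the $B$ matrix above (NumPy may return the eigenvalues in a different order).

```python
import numpy as np

# Matrix of the first order system x1' = x2, x2' = 6 x1 - 5 x2
A = np.array([[0.0, 1.0],
              [6.0, -5.0]])

lam, B = np.linalg.eig(A)
B = B / B[0, :]          # rescale each column so that its first entry is 1

print("eigenvalues:", lam)
print("eigenvectors (columns):")
print(B)

# Each eigenvalue must satisfy the auxiliary equation of y'' + 5y' - 6y = 0
aux = lam**2 + 5.0 * lam - 6.0
print("auxiliary equation residuals:", aux)
```

After rescaling, each column has the form $(1, \lambda)$, which is why the solution $y = x_1$ is simply a sum of the two exponentials.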
7.5.1 Exercise
1. $y'' + y' - 2y = 0$
Answer: $y = c_1e^{t} + c_2e^{-2t}$
Chapter 8
What is so special about $\cos nx$ and $\sin nx$? Obviously, they also have a period of
$2\pi$, i.e., $\sin n(x + 2\pi) = \sin nx$ and $\cos n(x + 2\pi) = \cos nx$. You may think that $\sin nx$
and $\cos nx$ have a shorter period ($2\pi/n$), but they still repeat every $2\pi$, which makes
them suitable for the purpose of expanding $f(x)$.¹
The second advantage of expanding in terms of $\sin nx$ and $\cos nx$ is their
orthogonality (for integers $m, n \geq 1$),
$$ \frac{1}{2\pi}\int_{-\pi}^{\pi}\sin mx\,\cos nx\,dx = 0, \tag{8.2} $$
$$ \frac{1}{2\pi}\int_{-\pi}^{\pi}\sin mx\,\sin nx\,dx = \frac{1}{2}\delta_{mn}, $$
$$ \frac{1}{2\pi}\int_{-\pi}^{\pi}\cos mx\,\cos nx\,dx = \frac{1}{2}\delta_{mn}. $$
8.2. FOURIER SERIES OF FUNCTIONS OF PERIOD 2L
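coefficients,
$$ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi}f(x)\cos nx\,dx, \tag{8.3} $$
$$ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi}f(x)\sin nx\,dx. $$
To get the constant $a_0$, we just need to integrate each term of Eq. 8.1 from $-\pi$ to
$\pi$. Since $\int_{-\pi}^{\pi}\sin nx\,dx = \int_{-\pi}^{\pi}\cos nx\,dx = 0$,
$$ a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(x)\,dx. \tag{8.4} $$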
Continuity of f (x):
There are functions which are periodic, but only piecewise continuous in the interval
$[-\pi, \pi]$.² We can still write the Fourier series for such a function. Say $f(x)$ is
continuous everywhere, except at $x_0$. The Fourier series (Eq. 8.1) converges to
$f(x)$ everywhere except at $x_0$. At the point of discontinuity, the series converges to
the average of the left and right hand limits of $f$ at $x_0$, i.e., $\frac{f(x_0^+) + f(x_0^-)}{2}$.
8.1.1 Exercise
1. Can we use sin nx and cos nx in Fourier series to expand a function of period
2π if n is non-integer?
3. Give examples of some functions which are periodic, but not defined for all
x ∈ R.
$$ f(x) = a_0 + \sum_{n=1}^{\infty}\left[a_n\cos\frac{n\pi x}{L} + b_n\sin\frac{n\pi x}{L}\right], \tag{8.6} $$
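Example
Find the Fourier series expansion of,
$$ f(x) = \begin{cases} 0, & -2 < x < -1 \\ 2, & -1 < x < 1 \\ 0, & 1 < x < 2 \end{cases} \tag{8.8} $$
$$ a_0 = \frac{1}{4}\int_{-2}^{2}f(x)\,dx = \frac{1}{2}\int_{-1}^{1}dx = 1, \tag{8.9} $$
$$ a_n = \frac{1}{2}\int_{-2}^{2}f(x)\cos\frac{n\pi x}{2}\,dx = \int_{-1}^{1}\cos\frac{n\pi x}{2}\,dx = \frac{4}{n\pi}\sin\frac{n\pi}{2}, \tag{8.10} $$
$$ b_n = \frac{1}{2}\int_{-2}^{2}f(x)\sin\frac{n\pi x}{2}\,dx = \int_{-1}^{1}\sin\frac{n\pi x}{2}\,dx = 0. \tag{8.11} $$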
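These coefficients can be cross-checked by numerical integration. The sketch below is only an illustration: the helper `integrate` is a hand-written trapezoidal rule, and the grid size is an arbitrary choice. It evaluates $a_0$, $a_n$ and $b_n$ for the function of Eq. 8.8 with $L = 2$.

```python
import numpy as np

L = 2.0   # half period of the example

def f(x):
    # f(x) = 2 for -1 < x < 1 and 0 elsewhere in (-2, 2)
    return np.where(np.abs(x) < 1.0, 2.0, 0.0)

def integrate(y, x):
    # Simple trapezoidal rule on a given grid
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

x = np.linspace(-L, L, 400001)

a0 = integrate(f(x), x) / (2.0 * L)

def a(n):
    return integrate(f(x) * np.cos(n * np.pi * x / L), x) / L

def b(n):
    return integrate(f(x) * np.sin(n * np.pi * x / L), x) / L

print("a0 =", a0)
for n in range(1, 5):
    exact = 4.0 / (n * np.pi) * np.sin(n * np.pi / 2.0)
    print(f"n={n}: a_n={a(n):.5f} (formula {exact:.5f}), b_n={b(n):.5f}")
```

The numerical values should agree with Eqs. 8.9 to 8.11 up to the small error caused by the jumps at $x = \pm 1$.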
8.2.1 Exercise
1. If f (x) has a period of w, prove that f (x + nw) = f (x), where n is an integer.
8.3. FOURIER COSINE/SINE SERIES
8.3.3 Exercise
1. Derive the coefficients of Fourier cosine and sine series.
8.4. HALF RANGE EXPANSION
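where
$$ \tilde{a}_n = \frac{2}{L}\int_{0}^{L}\tilde{f}(x)\cos\frac{n\pi x}{L}\,dx = \frac{2}{L}\int_{0}^{L}f(x)\cos\frac{n\pi x}{L}\,dx, \tag{8.20} $$
$$ \tilde{a}_0 = \frac{1}{L}\int_{0}^{L}\tilde{f}(x)\,dx = \frac{1}{L}\int_{0}^{L}f(x)\,dx. $$
where
$$ \tilde{b}_n = \frac{2}{L}\int_{0}^{L}\tilde{f}(x)\sin\frac{n\pi x}{L}\,dx = \frac{2}{L}\int_{0}^{L}f(x)\sin\frac{n\pi x}{L}\,dx. \tag{8.23} $$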
8.4.1 Exercise
Chapter 9
$$ x^2y'' + xy' + (x^2 - n^2)\,y = 0. \tag{9.2} $$
Sometimes, it is preferred to have an additional parameter $k$ in the Bessel equation.
Substituting $x = kz$ in the above equation, $(dy/dx) = (dy/dz)(dz/dx) = (1/k)(dy/dz)$
and $(d^2y/dx^2) = (1/k^2)(d^2y/dz^2)$. Substituting in the above equation, we get the Bessel
equation with $z$ as the independent variable,
$$ z^2y'' + zy' + (k^2z^2 - n^2)\,y = 0. \tag{9.3} $$
where $p(x)$, $p'(x)$, $q(x)$, $r(x)$ are real-valued and continuous in the interval $a \leq x \leq b$,
and $r(x) > 0$ throughout the interval. The Sturm-Liouville boundary conditions
are given by,
9.3. STURM-LIOUVILLE BOUNDARY VALUE PROBLEMS (SL-BVP)
where k1 , k2 are constants, at least one of them non-zero, and so are l1 , l2 . One
can also have a periodic boundary condition given by,
The solutions
The solutions of Eq. 9.4 are called the eigenfunctions. We have to find the eigen-
function y(x) corresponding to an eigenvalue λ. Several eigenvalues and eigen-
functions are possible for a given problem. If all the conditions mentioned above
are satisfied, then all the eigenvalues are real.1
Multiplying the first and second equations by $y_n$ and $y_m$, respectively, and
subtracting,
$$ y_n[py_m']' - y_m[py_n']' = (\lambda_n - \lambda_m)\,r\,y_my_n. \tag{9.9} $$
Adding and subtracting $py_n'y_m'$ to the left-hand side of the above equation, we can
$$ \Big[p\,(y_m'y_n - y_n'y_m)\Big]_a^b = (\lambda_n - \lambda_m)\int_a^b r\,y_my_n\,dx. \tag{9.11} $$
Eigenfunctions are orthogonal if the left-hand side is equal to zero. Do you notice
that $q(x)$ got eliminated and $p(x)$, as well as boundary conditions [Eq. 9.5]
1
Eigenvalues generally correspond to physical quantities like energies and frequencies, which
are real.
$$ [(1 - x^2)y']' + \lambda y = 0, \tag{9.17} $$
$$ J_n(kR) = 0. \tag{9.20} $$
For a given value of $n$, $J_n$ has infinitely many zeros $a_{n,i}$ ($i = 1, 2, 3, \cdots$),
such that,
$$ k_{n,i} = \frac{a_{n,i}}{R}. \tag{9.21} $$
Thus, the Bessel functions $J_n(k_{n,1}x), J_n(k_{n,2}x), J_n(k_{n,3}x), \cdots$ form an orthogonal set on
the interval $0 \leq x \leq R$ with respect to the weight function $r(x) = x$,
$$ \int_{0}^{R}x\,J_n(k_{n,i}x)\,J_n(k_{n,j}x)\,dx = 0 \quad \text{for } i \neq j. \tag{9.22} $$
9.3.1 Exercise
1. How many boundary conditions do you need for Legendre equation? Specify
the boundary conditions.
2. How many boundary conditions do you need for Bessel equation? Specify
the boundary conditions.
$$ (1 - x^2)y'' - xy' + n^2y = 0, \tag{9.23} $$
Chapter 10
In this chapter, we will learn about some of the most important problems in science and
engineering, represented in the form of partial differential equations (PDEs).
Unlike ordinary differential equations (ODEs), PDEs involve multivariable functions.
For example, we have to deal with functions in higher spatial dimensions (two or
three dimensions, involving two or three spatial variables) or functions of both
space and time.
10.1. BOUNDARY AND INITIAL CONDITIONS
[Figure panels: a one-dimensional domain with f(0) = a and f(l) = b (left); a two-dimensional domain with f(0, y) = a, f(l, y) = b, f(x, 0) = c and f(x, l) = d (right).]
Figure 10.1: Boundary conditions in one (left) and two (right) dimensions. Within
the domain, the unknown function $f$ is determined by the differential equation
(Laplace’s equation in this case, $\Delta = \frac{d^2}{dx^2}$ for 1D and $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ for 2D,
respectively). Along the boundaries, values of $f$ are given by the boundary conditions.
Obviously, the function $f$ must vary smoothly as we move from the interior to the
boundary of a domain.
10.2. CLASSIFICATION OF PDES
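$$ \frac{\partial f}{\partial y} + \frac{\partial^2f}{\partial x^2} = 0. \tag{10.1} $$
$$ \frac{\partial f}{\partial y} + (1 + x^2)\frac{\partial^2f}{\partial x^2} = 0. \tag{10.2} $$
$$ \frac{\partial f}{\partial y} + k\frac{\partial^2f}{\partial x^2} = 0, \tag{10.3} $$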
because the classification is based on the coefficients of the second order terms
only. The PDE is classified as parabolic if $b^2 - 4ac = 0$, hyperbolic if $b^2 - 4ac > 0$
and elliptic if $b^2 - 4ac < 0$.
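The rule is easy to encode. The sketch below assumes that $a$, $b$ and $c$ are the constant coefficients of $\partial^2f/\partial x^2$, $\partial^2f/\partial x\partial y$ and $\partial^2f/\partial y^2$, respectively (this ordering is an assumption, since the general form is not repeated here); the function name `classify_pde` is hypothetical.

```python
def classify_pde(a, b, c):
    """Classify a second order PDE from the coefficients of its second order terms.

    a, b, c are assumed to multiply f_xx, f_xy and f_yy, respectively.
    """
    disc = b * b - 4.0 * a * c
    if disc == 0:
        return "parabolic"
    return "hyperbolic" if disc > 0 else "elliptic"

# Laplace's equation: f_xx + f_yy = 0
print(classify_pde(1, 0, 1))
# Diffusion-type equation: only one second order term
print(classify_pde(1, 0, 0))
# Wave-type equation: second order terms with opposite signs
print(classify_pde(1, 0, -1))
```

Lower order terms and the right-hand side do not enter the function at all, in line with the statement that only the second order coefficients matter.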
10.2.5 Exercise
1. Identify whether the following equations are parabolic, hyperbolic or elliptic.
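(a) $\frac{\partial^2f}{\partial x^2} + 4\frac{\partial^2f}{\partial x\partial y} + \frac{\partial^2f}{\partial y^2} + \frac{\partial f}{\partial x} = 0$.
(b) $\frac{\partial^2f}{\partial x^2} + 2\frac{\partial^2f}{\partial x\partial y} + \frac{\partial^2f}{\partial y^2} + \frac{\partial f}{\partial y} = 0$.
(c) $\frac{\partial^2f}{\partial x^2} + \frac{\partial^2f}{\partial x\partial y} + \frac{\partial^2f}{\partial y^2} = 2$.
(d) Prove that the class of Eq. 10.6 does not change by a change of variable
from $(x, y) \to (\chi(x, y), \eta(x, y))$.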
10.3. LAPLACE’S EQUATION
Figure 10.2: Harmonic functions obey the mean-value property: the average value of the
function at the boundary is equal to its value at the center. Examples are shown for
(a) a one-dimensional domain, (b) and (c) two-dimensional domains. In case of (b),
$f = 10$ and $f = 0$ along the top and bottom part of the perimeter, respectively,
such that $f = 5$ along the line lying in the middle. What would be the value of the
function at the center in case of (c), where the function is zero everywhere, except
at a single point at the boundary?
$$ \nabla^2f = 0. \tag{10.8} $$
Laplace’s equation arises in different contexts, like the study of heat flow, gravity,
electrostatics etc. A solution of Laplace’s equation is known as a harmonic
function. It has specific properties, which will become clear as we progress.
$$ \frac{d^2f}{dx^2} = 0 \;\Rightarrow\; f(x) = mx + c. \tag{10.9} $$
I would like to mention two interesting features of the solution of Laplace’s
equation.
Table 10.3: Real and imaginary parts of the function $f(x, y) = (x + \iota y)^n$.
Interestingly, all of them are harmonic functions.
Figure 10.3: Plot of two harmonic functions (a) $x^2 - y^2$ and (b) $2xy$. They do not
have any maximum or minimum, but only a saddle point at $(0, 0)$. The third one
(c) $x^2 + y^2$ is not a harmonic function and it has a minimum at $(0, 0)$. We can easily
verify that $x^2 + y^2$ does not satisfy Laplace’s equation.
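The verification can also be done numerically. The following sketch estimates $\nabla^2f$ with a five-point finite-difference formula; the function name `laplacian`, the step size and the sample point are illustrative choices.

```python
def laplacian(f, x, y, h=1e-3):
    # Five-point finite-difference estimate of f_xx + f_yy at (x, y)
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4.0 * f(x, y)) / h**2

functions = {
    "x^2 - y^2": lambda x, y: x**2 - y**2,
    "2xy":       lambda x, y: 2.0 * x * y,
    "x^2 + y^2": lambda x, y: x**2 + y**2,
}

# Evaluate the Laplacian of each function at an arbitrary point
for name, func in functions.items():
    print(name, "->", laplacian(func, 0.7, -0.4))
```

The first two values should be zero up to round-off, while the third one should be close to 4, confirming that $x^2 + y^2$ is not harmonic.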
$$ \frac{\partial^2f}{\partial x^2} + \frac{\partial^2f}{\partial y^2} = 0. \tag{10.11} $$
[Figure: bar of width w with T = 200 on the side along the y-axis and T = 0 on the remaining sides.]
Figure 10.4: A bar, having finite width in the y-direction and semi-infinite in the
x-direction. One side (along the y-axis) is held at 200° and the long sides (along
the x-axis) are held at 0°. The far end is also held at 0°. What would be the
temperature distribution within the bar?
$$ \frac{1}{X}\frac{d^2X}{dx^2} = -\frac{1}{Y}\frac{d^2Y}{dy^2}. \tag{10.14} $$
Now, the left-hand side is a function of $x$ only, and the right-hand side is a function
of $y$ only, so they can be equal only if both are equal to some constant. This is the
principle behind the method of separation of variables. Finally, we have to solve two
eigenvalue problems,
$$ \frac{d^2X}{dx^2} = -k^2X \quad \& \quad \frac{d^2Y}{dy^2} = k^2Y, \tag{10.15} $$
none of the solutions can satisfy the boundary conditions. For example, in the
$x$ direction $T \to 0$ as $x \to \infty$. However, neither $\sin kx$ nor $\cos kx$ can satisfy this
condition. Note that, in Equation 10.15, the negative sign was arbitrarily assigned
to one of the equations. If we reverse our choice, the solutions are,
$$ Y = \begin{cases} \sin ky \\ \cos ky \end{cases}, \quad X = \begin{cases} e^{kx} \\ e^{-kx} \end{cases}. \tag{10.16} $$
which is nothing but a Fourier sine series for $f(y) = 200$. We can find the coefficients
(say for $w = 10$),
$$ a_n = \frac{2}{w}\int_{0}^{w}f(y)\sin\frac{n\pi y}{w}\,dy = \begin{cases} \frac{800}{n\pi}, & \text{odd } n, \\ 0, & \text{even } n. \end{cases} \tag{10.20} $$
[Three surface plots of the temperature over 0 ≤ x ≤ 10 and 0 ≤ y ≤ 10, with the vertical axis running from 0 to 200.]
Figure 10.5: Eq. 10.21 plotted for n ranging from (a) 1 to 3, (b) 1 to 29, and (c) 1 to 299.
of the exponential term. However, in the y−direction, the boundary condition re-
quires a constant temperature along the edge. This requires many sine functions;
just a few of them are not sufficient. One can verify this with the following Python
code.
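The original listing is not reproduced in this copy. As a stand-in, here is a minimal sketch which assumes that Eq. 10.21 is the series $T(x, y) = \sum_{\text{odd } n}\frac{800}{n\pi}e^{-n\pi x/10}\sin\frac{n\pi y}{10}$, i.e., the coefficients of Eq. 10.20 combined with the decaying exponential; the function name `T_semi_infinite` and its arguments are hypothetical.

```python
import numpy as np

def T_semi_infinite(x, y, n_max, w=10.0, T0=200.0):
    # Partial sum over odd n of (4*T0/(n*pi)) * exp(-n*pi*x/w) * sin(n*pi*y/w);
    # 4*T0/(n*pi) equals 800/(n*pi) for T0 = 200 (assumed form of Eq. 10.21)
    total = 0.0
    for n in range(1, n_max + 1, 2):
        total = total + (4.0 * T0 / (n * np.pi)) \
            * np.exp(-n * np.pi * x / w) * np.sin(n * np.pi * y / w)
    return total

# Temperature at the middle of the hot edge (x = 0, y = 5) for increasing n_max
for n_max in (3, 29, 299):
    print(n_max, T_semi_infinite(0.0, 5.0, n_max))

# Far away from the hot edge the temperature decays to zero
print(T_semi_infinite(50.0, 5.0, 299))
```

With only a few terms the value at the hot edge is well below 200, and it approaches 200 slowly as more sine functions are included, while the exponential factor makes the sum converge rapidly away from the edge.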
like $ce^{kx} + de^{-kx}$ and we have to choose $c$ and $d$ to ensure that $T = 0$ at $x = 10$. I
leave it as an exercise to show that $\frac{1}{2}\left[e^{k(10-x)} - e^{-k(10-x)}\right] = \sinh k(10 - x)$ satisfies
what we are looking for. The rest of the problem is very similar to what we did for the
semi-infinite plate. Thus, we can write the solution as,
$$ T(x, y) = \sum_{n=1}^{\infty}a_n\sinh\frac{n\pi}{10}(10 - x)\,\sin\frac{n\pi y}{10}. \tag{10.22} $$
which is a Fourier sine series for $f(y) = 200$. After finding $A_n$ (same as before), we
can derive the values of $a_n$ as,
$$ a_n = \begin{cases} \frac{800}{n\pi\sinh n\pi}, & \text{odd } n, \\ 0, & \text{even } n. \end{cases} \tag{10.24} $$
Thus, the temperature distribution in the square plate can be written as,
$$ T(x, y) = \frac{800}{\pi}\left[\frac{1}{\sinh\pi}\sinh\frac{\pi}{10}(10 - x)\sin\frac{\pi y}{10} + \frac{1}{3\sinh 3\pi}\sinh\frac{3\pi}{10}(10 - x)\sin\frac{3\pi y}{10} + \cdots\right]. \tag{10.25} $$
I have shown a plot of the temperature distribution in Figure 10.8(b).
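Eq. 10.25 can be evaluated numerically with a few lines of code. This is a minimal sketch, not the code used for the figure: the function name `T_square` is hypothetical, and the cut-off `n_max = 199` is an assumption chosen so that $\sinh n\pi$ stays within the floating point range.

```python
import numpy as np

def T_square(x, y, n_max=199, side=10.0, T0=200.0):
    # Partial sum of Eq. 10.25 over odd n up to n_max
    total = 0.0
    for n in range(1, n_max + 1, 2):
        k = n * np.pi / side
        total = total + (4.0 * T0 / (n * np.pi)) \
            * np.sinh(k * (side - x)) / np.sinh(k * side) * np.sin(k * y)
    return total

print("hot edge, middle :", T_square(0.0, 5.0))    # close to 200
print("centre of plate  :", T_square(5.0, 5.0))    # close to 50
print("cold edge, middle:", T_square(10.0, 5.0))   # exactly 0
```

The value at the centre comes out as 50, one quarter of the hot-edge temperature, which is what one expects from superposing four such plates, each with a different edge held at 200.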
Circular domain
In polar coordinates, Laplace’s equation reads (see exercise for derivation),
$$ \frac{\partial^2f}{\partial r^2} + \frac{1}{r}\frac{\partial f}{\partial r} + \frac{1}{r^2}\frac{\partial^2f}{\partial\theta^2} = 0. \tag{10.26} $$
Writing $f(r, \theta) = R(r)\Theta(\theta)$, we get two ordinary differential equations (eigenvalue
problems) to solve,
$$ r^2\frac{d^2R}{dr^2} + r\frac{dR}{dr} = k^2R, \tag{10.27} $$
$$ \frac{d^2\Theta}{d\theta^2} = -k^2\Theta. $$
The solutions to the first and second equations give the radial and angular parts of the
function, respectively. The angular solution looks like $\Theta(\theta) = A\sin k\theta + B\cos k\theta$.
Convince yourself that $\Theta$ must satisfy $\Theta(\theta) = \Theta(\theta + 2\pi)$, which requires $k$ to be
some integer $n = 0, 1, 2, 3, \cdots$.¹ The eigenfunctions are
Since $r = 0$ lies inside the domain, we have to discard $\ln r$ and $r^{-n}$, as they are
undefined at the origin. Thus, we can write the general solution as,
$$ f(r, \theta) = \sum_{n=0}^{\infty}r^n(a_n\cos n\theta + b_n\sin n\theta) = a_0 + \sum_{n=1}^{\infty}r^n(a_n\cos n\theta + b_n\sin n\theta). \tag{10.31} $$
If the boundary condition is $f(a, \theta) = h(\theta)$, where $a$ is the radius of the circular
domain, we can get the coefficients using,
$$ a_0 = \frac{1}{2\pi}\int_{0}^{2\pi}h(\theta)\,d\theta \quad \text{for } n = 0, \tag{10.32} $$
$$ a_n = \frac{1}{a^n\pi}\int_{0}^{2\pi}h(\theta)\cos n\theta\,d\theta \quad \text{for } n \geq 1, $$
$$ b_n = \frac{1}{a^n\pi}\int_{0}^{2\pi}h(\theta)\sin n\theta\,d\theta \quad \text{for } n \geq 1. $$
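First, for $n = 0$,
$$ a_0 = \frac{1}{2\pi}\int_{0}^{2\pi}h(\theta)\,d\theta = \frac{1}{2\pi}\left[\int_{0}^{\pi}10\,d\theta + \int_{\pi}^{2\pi}(0)\,d\theta\right] = 5. \tag{10.34} $$
Next, for $n \geq 1$,
$$ a_n = \frac{1}{\pi}\int_{0}^{2\pi}h(\theta)\cos n\theta\,d\theta = \frac{1}{\pi}\left[\int_{0}^{\pi}10\cos n\theta\,d\theta + \int_{\pi}^{2\pi}(0)\cos n\theta\,d\theta\right] = 0, \tag{10.35} $$
Finally,
$$ b_n = \frac{1}{\pi}\int_{0}^{2\pi}h(\theta)\sin n\theta\,d\theta = \frac{1}{\pi}\left[\int_{0}^{\pi}10\sin n\theta\,d\theta + \int_{\pi}^{2\pi}(0)\sin n\theta\,d\theta\right] = \frac{10}{n\pi}[1 - (-1)^n]. \tag{10.36} $$
Thus, the temperature distribution in the circular plate is,
$$ T(r, \theta) = 5 + \sum_{n=1}^{\infty}\frac{10}{n\pi}[1 - (-1)^n]\,r^n\sin n\theta. \tag{10.37} $$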
I have plotted T (r, θ) using the following Python code (see Fig. 10.6).
import numpy as np
import matplotlib.pyplot as plt

def f(r, theta):
    # Partial sum of Eq. 10.37 (first 99 terms) for a plate of unit radius
    s = 0.0
    for n in range(1, 100):
        s += 10.0 * (1.0 - (-1.0)**n) * r**n * np.sin(n*theta) / (n*np.pi)
    return s + 5.0

# Polar grid covering the unit disc
radius = np.linspace(0, 1, 50)
angle = np.linspace(0, 2.0*np.pi, 50)
r, theta = np.meshgrid(radius, angle)
Z = f(r, theta)

# Filled contour plot in polar coordinates
fig, ax = plt.subplots(subplot_kw=dict(projection='polar'))
cf = ax.contourf(theta, r, Z, cmap='afmhot')
fig.colorbar(cf)
plt.show()
[Figure: grid point (x, y) with its four neighbours (x + h, y), (x − h, y), (x, y + h) and (x, y − h).]
Figure 10.7: Finite difference method: the domain is divided into a square or
rectangular grid. Note that grid points (red) are also placed at the boundary. However,
values of $f(x, y)$ remain fixed (boundary conditions) at these grid points. On the
other hand, $f(x, y)$ changes with each iteration inside the domain (black points),
until convergence is achieved.
$$ \frac{\partial^2f}{\partial x^2} = \frac{f(x + h, y) - 2f(x, y) + f(x - h, y)}{h^2}, \tag{10.38} $$
$$ \frac{\partial^2f}{\partial y^2} = \frac{f(x, y + h) - 2f(x, y) + f(x, y - h)}{h^2}. $$
where the superscript denotes the iteration sequence. We start with some guess
value at every grid point f 0 (x, y) (zeroth step) and hope to achieve convergence
quickly. Note that, grid points located at the boundary always have fixed val-
ues (because of boundary condition) and they do not change during the iteration
process. Finally, let us write a Python code to solve the problem of temperature
distribution in a square plate and compare with the analytical results.
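The original listing is not reproduced in this copy, so the following is only a minimal sketch under stated assumptions: setting the sum of the two finite differences in Eq. 10.38 to zero gives the update $f^{n+1}(x, y) = \frac{1}{4}\left[f^{n}(x+h, y) + f^{n}(x-h, y) + f^{n}(x, y+h) + f^{n}(x, y-h)\right]$, which is iterated until the largest change falls below a tolerance. The function name `solve_laplace_square`, the grid size and the tolerance are hypothetical choices.

```python
import numpy as np

def solve_laplace_square(n_points=21, T_hot=200.0, tol=1e-6, max_iter=100000):
    # T[i, j] is the temperature at x = i*h, y = j*h on a 10 x 10 plate
    T = np.zeros((n_points, n_points))   # initial guess: zero everywhere
    T[0, :] = T_hot                      # edge x = 0 is held at T_hot; other edges stay at 0
    for iteration in range(1, max_iter + 1):
        new = T.copy()
        # Each interior point becomes the average of its four neighbours
        new[1:-1, 1:-1] = 0.25 * (T[2:, 1:-1] + T[:-2, 1:-1]
                                  + T[1:-1, 2:] + T[1:-1, :-2])
        if np.max(np.abs(new - T)) < tol:
            return new, iteration
        T = new
    return T, max_iter

T, n_iter = solve_laplace_square()
center = T[10, 10]                       # grid point at x = 5, y = 5
print("iterations:", n_iter)
print("temperature at the centre:", center)
```

The value at the centre should come out close to 50, in agreement with the analytical series of Eq. 10.25 evaluated at $x = y = 5$.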
10.4. DIFFUSION OR HEAT EQUATION
[Panels: (a) square plate with T = 200 on the edge along the y-axis and T = 0 on the other three edges; (b) and (c) surface plots of T(x, y) over 0 ≤ x ≤ 10 and 0 ≤ y ≤ 10.]
Figure 10.8: (a) Boundary conditions in a square plate, (b) analytically calculated
temperature distribution (Eq. 10.25) and (c) numerically calculated temperature
distribution.
plt.show( )
10.3.5 Exercise
1. Starting with x = r cos θ and y = r sin θ, derive Eq. 10.26.
2. Analytically solve the problem shown in Figure 10.2(c), assuming the bound-
ary condition f (r, θ) = δ(θ) = 1.
$$ \nabla^2f = \frac{1}{\alpha}\frac{\partial f}{\partial t}, \tag{10.42} $$
where α is a positive coefficient. As the name suggests, this equation arises in
the context of diffusion or heat flow. In diffusion, f and α are concentration and
diffusivity (a material property), respectively. Similarly, in heat flow, f and α are
temperature and thermal diffusivity (a material property), respectively. We will
use α2 , instead of α, in the rest of the discussion. This minor adjustment helps
us with the notation when we write down the final solution.
$$ \frac{\partial^2f}{\partial x^2} = \frac{1}{\alpha^2}\frac{\partial f}{\partial t}. \tag{10.43} $$
[Panels: (a) bar of length l with two faces held at T = 0; (b) T plotted against position (0 to 10) for t = 0, 0.01, 0.1, 0.2, 0.3 and 0.4.]
Figure 10.9: (a) A bar is uniformly heated to 50° initially. Then, two of its faces
(red) are brought in contact with thermal reservoirs at 0° and the rest of the faces
(white) are insulated, such that heat flow is essentially one-dimensional. (b) The
temperature profile as a function of time, as the bar cools down.
Using separation of variables, we can split the time and space-dependent parts as $f(x, t) =
X(x)T(t)$. But we can avoid that, as the answer is not too difficult to guess,
$$ f(x, t) = X(x)\,e^{-\lambda t}, \tag{10.44} $$
where $\lambda$ is positive and real. If $\lambda$ is negative, the function becomes infinite with
$t \to \infty$. On the other hand, if $\lambda$ is imaginary, $f(x, t)$ would oscillate with time and
not decay. However, we are looking for a solution that decays with time, and thus,
we must choose a positive and real $\lambda$. Substituting Eq. 10.44 in Eq. 10.43 and
canceling $e^{-\lambda t}$ from both sides, we get,
$$ \frac{d^2X}{dx^2} = -\frac{\lambda}{\alpha^2}X = -k^2X, \tag{10.45} $$
such that $\lambda = k^2\alpha^2$. This is an eigenvalue problem, having a general solution
$X(x) = A\cos kx + B\sin kx$. Thus, we can write,
$$ f(x, t) = Ae^{-k^2\alpha^2t}\cos kx + Be^{-k^2\alpha^2t}\sin kx. \tag{10.46} $$
$$ b_n = \frac{2}{l}\int_{0}^{l}T(x, 0)\sin\frac{n\pi x}{l}\,dx = \begin{cases} \frac{200}{n\pi}, & \text{odd } n, \\ 0, & \text{even } n. \end{cases} \tag{10.49} $$
import numpy as np
import matplotlib.pyplot as plt

def f(x, t):
    # Partial sum of the sine series over odd n, with b_n = 200/(n*pi) and l = 10.
    # The decay factor exp(-n^2 pi^2 t) corresponds to alpha^2 / l^2 = 1.
    s = 0.0
    for n in range(1, 400, 2):
        s += np.exp(-n**2 * np.pi**2 * t) * np.sin(n * np.pi * x / 10) / n
    return 200 * s / np.pi

x = np.linspace(0, 10, 30)
t = float(input("Enter the value of time: "))
Temp = f(x, t)
plt.plot(x, Temp)
plt.xlabel('l')
plt.ylabel('T')
plt.show()
10.5. WAVE EQUATION
[Plot: the pulse $f(x, 0) = e^{-x^2}$ at t = 0 and the two halves $h(x + vt) = 0.5e^{-(x+t)^2}$ and $g(x - vt) = 0.5e^{-(x-t)^2}$ at t = 10, for x between −15 and 15.]
Figure 10.10: Let us imagine that a sound is created at the middle of a long
tunnel. At $t = 0$, the pulse has the form $e^{-x^2}$. Half of it travels to the right and the
rest travels to the left. I assume $v = 1$.
$$ G\nabla^2F = \frac{1}{\alpha^2}F\frac{dG}{dt}. \tag{10.52} $$
Dividing both sides by $FG$, we get,
$$ \frac{1}{F}\nabla^2F = \frac{1}{\alpha^2}\frac{1}{G}\frac{dG}{dt}. \tag{10.53} $$
Since the left side of the identity depends only on the space variables and the right
side only on time, both sides must be equal to some constant.
$$ \nabla^2F = -k^2F, \tag{10.54} $$
$$ \frac{dG}{dt} = -k^2\alpha^2G. $$
The space equation is the Helmholtz equation. As discussed before, $k$ is chosen
to be a real number, because we want the function to decay with time.
10.4.3 Exercise
$$ \frac{\partial^2f}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2f}{\partial t^2}, \tag{10.56} $$
where the constant $v$ is the wave velocity. I claim that the solution is,
$$ f(x, t) = g(x - vt) + h(x + vt). \tag{10.57} $$
Before attempting the derivation, let us try to understand what the solution
means. Imagine a long tunnel with someone stranded in the middle of it. The
person shouts for help (at $t = 0$). Whatever sound he produces, half of it travels
to the right and the rest travels to the left of the tunnel. As shown in Figure 10.10,
the initial form of the pulse is a Gaussian.² At time $t = 10$, the function is still
a Gaussian, but it has moved to some other place as the wave is traveling. Note
that Figure 10.10 is a special case and we can actually write the solution as,
$$ f(x, t) = \frac{1}{2}g(x - vt) + \frac{1}{2}g(x + vt). \tag{10.58} $$
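We can prove Eq. 10.57 by substituting $\xi = x - vt$, $\eta = x + vt$ and looking to
write Eq. 10.56 in terms of the new variables $\xi$ and $\eta$. Using the chain rule,
$$ \frac{\partial f}{\partial t} = \frac{\partial f}{\partial\xi}\frac{\partial\xi}{\partial t} + \frac{\partial f}{\partial\eta}\frac{\partial\eta}{\partial t} = -v\frac{\partial f}{\partial\xi} + v\frac{\partial f}{\partial\eta}, \tag{10.59} $$
$$ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial\xi}\frac{\partial\xi}{\partial x} + \frac{\partial f}{\partial\eta}\frac{\partial\eta}{\partial x} = \frac{\partial f}{\partial\xi} + \frac{\partial f}{\partial\eta}. $$
Similarly,
Finally, substituting in Eq. 10.56, we can write the one-dimensional wave
equation in terms of the new variables as,
$$ \frac{\partial^2f}{\partial\eta\,\partial\xi} = 0, \tag{10.61} $$
which has a solution,
$$ f(\xi, \eta) = g(\xi) + h(\eta). \tag{10.62} $$
Writing $\xi$ and $\eta$ in terms of the old variables $x$ and $t$, we get Eq. 10.57.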
2
Form of the function does not matter. I could have chosen any other form, like a δ-function or
Lorentzian. However, from our experience we know it should be a localized function. If someone
claps for a second, we hear that for a second only, not for an hour!
The first one is the initial position and the second one is the initial velocity of the
string. At $t = 0$, the initial velocity and position are,
Integrating the first of the two equations with respect to the position, we can
write,³
$$ \frac{1}{v}\int_{0}^{x}H(y)\,dy = -g(x) + h(x). \tag{10.65} $$
Solving for $g(x)$ and $h(x)$ from the above equations and substituting $x + vt$ and $x - vt$
in place of $x$, we get,
$$ h(x + vt) = \frac{1}{2}G(x + vt) + \frac{1}{2v}\int_{0}^{x+vt}H(y)\,dy, \tag{10.66} $$
$$ g(x - vt) = \frac{1}{2}G(x - vt) - \frac{1}{2v}\int_{0}^{x-vt}H(y)\,dy. $$
Let us try to understand the solution. First note that if both $G(x)$ and $H(x)$ are
zero, there is no wave! There is no surprise in it, because if you take a string and
do nothing to it, you can not set off any wave motion. Either you have to pull
the string ($G(x) \neq 0$) and release it, or you have to hold the string at one end and
shake your hand to give it some initial velocity ($H(x) \neq 0$).
Case 1: Assume the initial velocity $H(x) = 0$: We can write the solution as,
$$ f(x, t) = \frac{1}{2}\left[G(x - vt) + G(x + vt)\right]. \tag{10.68} $$
Do you notice that the right and left moving waves go symmetrically away from
their initial position as time progresses? Let us take $G(x) = 2e^{-x^2}$ and $v = 1$, such
that,
$$ f(x, t) = e^{-(x-t)^2} + e^{-(x+t)^2}. \tag{10.69} $$
I have plotted different snapshots of the wave in Figure 10.11.
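One can confirm numerically that Eq. 10.69 satisfies the wave equation. The sketch below is an illustration only: the function names, the step size and the chosen times are assumptions. It approximates both second derivatives by central differences and evaluates $\partial^2f/\partial x^2 - (1/v^2)\,\partial^2f/\partial t^2$ on a grid.

```python
import numpy as np

def f(x, t, v=1.0):
    # Eq. 10.69: two Gaussian pulses moving to the right and to the left
    return np.exp(-(x - v * t)**2) + np.exp(-(x + v * t)**2)

def wave_residual(x, t, v=1.0, h=1e-3):
    # Central-difference estimates of the second derivatives in x and t
    f_xx = (f(x + h, t, v) - 2.0 * f(x, t, v) + f(x - h, t, v)) / h**2
    f_tt = (f(x, t + h, v) - 2.0 * f(x, t, v) + f(x, t - h, v)) / h**2
    return f_xx - f_tt / v**2

x = np.linspace(-10.0, 10.0, 201)
res = max(np.max(np.abs(wave_residual(x, t))) for t in (0.0, 2.0, 5.0))
print("largest residual of the wave equation:", res)
print("pulse height at x = 5, t = 5:", f(5.0, 5.0))
```

The residual should vanish up to round-off, and at $t = 5$ each pulse has unit height at $x = \pm 5$, i.e., the initial pulse of height 2 has split into two equal halves.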
3
The integration yields a function, not a number.
Figure 10.11: Wave propagation for initial position $G(x) = e^{-x^2}$ and initial velocity
$H(x) = 0$.
Figure 10.12: Wave propagation for initial position $G(x) = 0$ and initial velocity
$H(x) = 2xe^{-x^2}$.
10.6. SCHRÖDINGER EQUATION
Case 2: Assume initial position $G(x) = 0$: We can write the solution as,
$$ f(x, t) = \frac{1}{2v}\int_{x-vt}^{x+vt}H(y)\,dy = \frac{1}{2v}\left[\tilde{H}(x + vt) - \tilde{H}(x - vt)\right]. \tag{10.70} $$
Let us take $H(x) = 2xe^{-x^2}$ and $v = 0.5$, such that,
$$ f(x, t) = e^{-(x-0.5t)^2} - e^{-(x+0.5t)^2}. \tag{10.71} $$
10.5.2 Exercise
Chapter 11
11.2. HOW TO COUNT?
       1     2     3     4     5     6
1     1,1   1,2   1,3   1,4   1,5   1,6
2     2,1   2,2   2,3   2,4   2,5   2,6
3     3,1   3,2   3,3   3,4   3,5   3,6
4     4,1   4,2   4,3   4,4   4,5   4,6
5     5,1   5,2   5,3   5,4   5,5   5,6
6     6,1   6,2   6,3   6,4   6,5   6,6
Table 11.1: List of all possible outcomes (known as the sample space) if we throw
two dice simultaneously. Each outcome is termed a sample point and there
are 36 sample points in this case.
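The sample space of Table 11.1 can be generated with a few lines of Python. This is a minimal sketch; the event "the two dice add up to 7" is a hypothetical example, chosen only to show how counting sample points leads to a probability.

```python
from itertools import product

# All ordered pairs (first die, second die)
sample_space = list(product(range(1, 7), repeat=2))
n_points = len(sample_space)

# Hypothetical event: the two dice add up to 7
sum7 = [point for point in sample_space if sum(point) == 7]

print("number of sample points:", n_points)
print("sample points with sum 7:", sum7)
print("probability of a sum of 7:", len(sum7) / n_points)
```

Listing the outcomes explicitly becomes impractical very quickly, which is why systematic counting rules are needed.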
certain outcome in some experiment, first we need to count all possible outcomes
in that experiment. Thus, we need to learn how to count in a systematic way.
Case 2: In the above problem, if we have 15 slots to arrange 15 balls, then there
are 15! = 1307674368000 possible arrangements. As shown in Fig. 11.1(b), let us
consider a case when we have only 3 slots to arrange the 15 balls; clearly, the
number of possible arrangements is much smaller in this case. As argued in
the previous section, the first slot can be occupied in 15 ways, the second slot
2
Some cases where we need to calculate permutation are, number of ways (a) n people can be
seated in n chairs, (b) n cards can be arranged on the table, (c) 5 single digit numbers (say 0-4) can
be arranged to give 5 digit numbers etc. Note that, order does matter in all the cases.
3
You have only one ball of each type in the reservoir, using which you have to fill the slots. So,
if you decide to put ‘ball 1’ in the first slot, you can not put it in any other slot and since there is
only one ‘ball 1’ available, it can not be repeated in any other slot.
can be occupied in 14 ways etc., and the total number of possible arrangements is
15 × 14 × 13 = 2730. However, it can be expressed in a smart way like,
$$ P(n, r) = n(n-1)\cdots(n-r+1) = n(n-1)\cdots(n-r+1)\,\frac{(n-r)!}{(n-r)!} = \frac{n!}{(n-r)!}. \tag{11.3} $$
Note that, in this case also, there is no repetition and order does matter.
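Eq. 11.3 can be checked against a brute-force count. In the sketch below, the function name `P` follows the notation of the text, while the small case $n = 5$, $r = 3$ used for the explicit listing is an arbitrary choice.

```python
import math
from itertools import permutations

def P(n, r):
    # Number of ordered arrangements of r objects chosen from n, without repetition
    return math.factorial(n) // math.factorial(n - r)

print("P(15, 3)  =", P(15, 3))      # 15 * 14 * 13
print("P(15, 15) =", P(15, 15))     # 15!

# Brute-force check for a small case: list every ordered arrangement explicitly
brute_force = len(list(permutations(range(5), 3)))
print("P(5, 3) =", P(5, 3), ", explicit count =", brute_force)
```

The explicit listing agrees with the formula, but it is feasible only for small $n$; for 15 balls in 15 slots the formula is the only practical route.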
Case 3: Let us consider a combination lock, as shown in Fig. 11.1(c). Let the
combination to open the lock be 0279. Imagine that you forget it and you must
open it without breaking the lock. Note that, the first slot can be occupied by
any one of the 10 digits, ranging from 0-9. Thus, there are 10 ways in which the
first slot can be filled. Same is true for rest of the slots as well. Since there are 4
slots in this particular lock, you may have to try $10 \times 10 \times 10 \times 10 = 10^4$ possible
combinations before you can open the combination lock.⁴ Thus, if there are $n$
things to choose from and we can choose $r$ at a time, then there are
$n^r$ ways of choosing them. Note that, unlike the previous two cases, repetition
is allowed, i.e., we can have combinations like 0000, 1112, 2234 etc. However,
similar to the previous two cases, order does matter in this case as well, i.e., 1234
is not same as 4321.
Before we move on to the next section, let us appreciate the fact that there
is one important factor that is common among all three cases discussed above:
order does matter. In the following section, we will learn to count when order does
not matter.
11.2.2 Combination
Case 1: First, let us discuss the case when repetition is not allowed. If you think
carefully, then you will realize that derivation of P (n, r) involves two steps. First,
we find out in how many ways one can select a set of $r$ balls out of $n$. We denote
it as $C(n, r)$,⁵ which is equal to all possible combinations of $r$ objects chosen from
a set of $n$. Second, we find out the number of ways $r$ balls can be arranged in $r$
boxes (one ball per box), and it is given by $P(r, r)$. Thus, we can write
$$ P(n, r) = C(n, r) \times P(r, r) \;\Rightarrow\; C(n, r) = \frac{n!}{(n-r)!\,r!}. \tag{11.4} $$
4 We should actually call it a permutation lock!
5 This is also represented by $\binom{n}{r}$.

The whole argument leading to the derivation of Eq. 11.4 can be rephrased in
the following manner. We know n = 15 distinguishable pool balls can be arranged
in r = 3 slots in P(15, 3) = 15 × 14 × 13 ways. If we consider a particular set of three
pool balls (say number 1,2,3), then there are 6 possible arrangements (if order
does matter)
$$\begin{matrix} 1 & 2 & 3 \\ 1 & 3 & 2 \\ 2 & 1 & 3 \\ 2 & 3 & 1 \\ 3 & 1 & 2 \\ 3 & 2 & 1 \end{matrix} \qquad (11.5)$$
However, if order does not matter (which we call a combination), then we have
only one possibility: (123). Thus, we have to adjust the permutation formula and
divide it by the number of ways in which three numbers can be ordered, i.e.,
C(15, 3) = P(15, 3)/3!. Generalizing in terms of n and r, we get C(n, r) = P(n, r)/r!,
which is the same as Eq. 11.4.
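The relation between permutations and combinations can be verified by direct enumeration. This is a small sketch using the pool-ball numbers from the text (n = 15, r = 3); the variable names are illustrative.

```python
from itertools import combinations, permutations
from math import factorial

balls = range(1, 16)   # n = 15 distinguishable pool balls
r = 3

n_perm = sum(1 for _ in permutations(balls, r))   # order matters
n_comb = sum(1 for _ in combinations(balls, r))   # order does not matter

print(n_perm)                                     # 15 * 14 * 13
print(n_comb)                                     # P(15, 3) / 3!
print(n_comb == n_perm // factorial(r))

# The 3! orderings of one particular set of balls, as in Eq. 11.5.
for arrangement in permutations((1, 2, 3)):
    print(arrangement)
```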
Interestingly, from Eq. 11.4, we find that C(n, r) = C(n, n − r). In words, the
number of ways of selecting r objects from a set of n is exactly equal to the
number of ways of selecting (n − r) objects from the same set. This is obvious,
because every time we select r balls out of n, (n − r) balls are left out.
Let us discuss an example to understand when we need to calculate a
combination. Say we have 5 people and we want to make a committee of 3 [see
Fig. 11.1(d)]. In how many ways can we select 3 out of 5 people? In this case, we
need to calculate C(n, r) with n = 5 and r = 3. Say one particular composition is a
committee consisting of Mr. X, Y and Z. Note that, in this case, order does not
matter, because (X,Y,Z) is the same committee as (X,Z,Y), (Z,X,Y) etc. This is the
main difference between permutation (order does matter) and combination
(order does not matter).
$$C(15, 7) \times C(15 - 7, 7) \times C(15 - 7 - 7, 1) = \frac{15!}{8!\,7!} \times \frac{8!}{1!\,7!} \times \frac{1!}{0!\,1!} = \frac{15!}{7!\,7!\,1!}. \qquad (11.6)$$
Thus, we can generalize by stating that: if there are total $n = n_1 + n_2 + \cdots + n_k$
objects, among which $n_1, n_2, \ldots, n_k$ are indistinguishable, then total number of
6 Some examples are (a) slots 1234567, (b) slots 123458, (c) slots 1346789 etc.
[Figure: panels (a)-(d), each showing five flavor bins labeled C, V, S, M, B]
Figure 11.2: Three (r) scoops of ice cream (blue dots) chosen from five (n) different
flavors: (a) 3 chocolate, (b) 1 vanilla, 1 strawberry, 1 butterscotch, (c) 2 chocolate,
1 mango and (d) 1 chocolate, 1 vanilla, 1 mango. In order to find all possible
combinations, we have to calculate the number of ways 7 objects can be arranged,
which can be divided in two types, 3 blue dots and 4 red vertical lines.
$$C(n, n_1) \times C(n - n_1, n_2) \times C(n - n_1 - n_2, n_3) \times \cdots = \frac{n!}{n_1! \times n_2! \times n_3! \cdots n_k!}. \qquad (11.7)$$
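Both sides of Eq. 11.7 can be compared numerically. The sketch below uses two helper functions, `sequential_choices` and `multinomial`, which are names introduced here for illustration; `math.comb` requires Python 3.8 or later.

```python
from math import comb, factorial

def sequential_choices(group_sizes):
    """Left-hand side of Eq. 11.7: choose n1 objects, then n2 from the rest, ..."""
    remaining = sum(group_sizes)
    ways = 1
    for size in group_sizes:
        ways *= comb(remaining, size)
        remaining -= size
    return ways

def multinomial(group_sizes):
    """Right-hand side of Eq. 11.7: n! / (n1! * n2! * ... * nk!)."""
    ways = factorial(sum(group_sizes))
    for size in group_sizes:
        ways //= factorial(size)   # every intermediate quotient is an integer
    return ways

groups = (7, 7, 1)                 # the group sizes appearing in Eq. 11.6
print(sequential_choices(groups))
print(multinomial(groups))
```

The two functions agree for any tuple of non-negative group sizes, which is the content of Eq. 11.7.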
Case 3: Finally, let us discuss the case when repetition is allowed. As shown
in Fig. 11.1(f), in an ice-cream parlor, you are allowed to take 3 scoops of ice
cream out of chocolate (C), vanilla (V), strawberry (S), mango (M) and butterscotch
(B). Repetition is allowed in this case, i.e., you can opt for all 3 chocolate or 1
butterscotch, 1 vanilla & 1 strawberry or 2 chocolate & 1 mango or any other
combination. Obviously, order does not matter, i.e., an arrangement of chocolate
on top, followed by vanilla and strawberry is same as vanilla on top, followed by
strawberry and chocolate. Let us find out in how many ways we can select 3
scoops out of 5 flavors.
Some examples are shown in Fig. 11.2. The scoops are shown by blue dots
(r = 3) and the partitions between different flavors (n = 5) are shown by red vertical
lines (there are n − 1 of them). Remember that we are looking for all possible
combinations of r scoops of ice cream (repetition allowed) from n different flavors.
From Fig. 11.2, it is clear that the answer is nothing but all possible arrangements
of (n − 1) red lines and r blue dots, which is given by,
$$\frac{(r + n - 1)!}{r! \times (n - 1)!} = C(r + n - 1, r). \qquad (11.8)$$
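Eq. 11.8 can be checked against a direct enumeration of the ice-cream selections. This is a minimal sketch; the flavor labels follow the text and the variable names are illustrative.

```python
from itertools import combinations_with_replacement
from math import comb

flavors = ["C", "V", "S", "M", "B"]   # n = 5 flavors
r = 3                                 # scoops

# Unordered selections of r scoops, repetition allowed.
selections = list(combinations_with_replacement(flavors, r))

n = len(flavors)
print(len(selections))                # direct count
print(comb(r + n - 1, r))             # Eq. 11.8
print(("C", "C", "C") in selections)  # all three scoops chocolate
```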
11.2.3 Summary
• If there are $N_1$ possible outcomes of event 1, $N_2$ possible outcomes of event
2, ..., $N_n$ possible outcomes of event n, and all the events occur simultaneously
(or in succession), then there are $N_1 \times N_2 \times \cdots \times N_n$ possible outcomes.
Example 2: Now, imagine that you all believe in equality and you plan to make
a committee, where every member is equal. In that case, in how many ways can a
4 member committee be chosen from a class of 30?
Note that, in this case, order does not matter, i.e., a committee comprising A,
B, C and D is the same as the committee comprising A, C, D and B. Thus, the answer is
$$C(30, 4) = \frac{30!}{26! \times 4!} = \frac{30 \times 29 \times 28 \times 27}{24}.$$
I would like to draw your attention to the fact that we can solve the above
problems without thinking too much about which formula to apply. For example,
if we want to make a committee of 4 from a class of 30, the first post can be filled
in 30 ways, the second post can be filled in 29 ways, and so on; thus, the number
of ordered selections is 30 × 29 × 28 × 27. Now, 4 members can be assigned to 4
posts in 4! = 24 ways. Thus, if order does not matter, then there are
(30 × 29 × 28 × 27)/24 ways of forming a 4 member committee.
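The post-by-post reasoning for the committee can be reproduced in a few lines. This sketch assumes Python 3.8 or later for `math.comb` and `math.perm`; the variable names are illustrative.

```python
from math import comb, factorial, perm

class_size = 30
committee_size = 4

ordered = perm(class_size, committee_size)    # 30 * 29 * 28 * 27
orderings = factorial(committee_size)         # 4! ways to order 4 members
unordered = ordered // orderings              # committees with equal members

print(ordered, orderings, unordered)
print(unordered == comb(class_size, committee_size))
```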
11.3. DISCRETE PROBABILITY FUNCTIONS
Sample space   2     3     4     5     6     7     8     9     10    11    12
Probability    1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
Table 11.2: A sample space (sum of two dice) and probability derived from
Table 11.1.
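The entries of Table 11.2 can be regenerated by listing all 36 equally likely outcomes of two dice. This is a minimal sketch with illustrative variable names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 ordered outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability function f(x) = (number of ways to get x) / 36.
f = {total: Fraction(ways, 36) for total, ways in sorted(counts.items())}

for total in sorted(counts):
    print(total, f"{counts[total]}/36")

print(sum(f.values()))   # a probability function must sum to 1
```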
For filling box 1, we can select 2 balls out of 14 in C(14, 2) ways (as order does
not matter). For box 2, we can select 3 balls out of the remaining 12 in C(12, 3) ways.
Similarly, we can select 4 balls for box 3 in C(9, 4) ways and 5 balls for box 4 in
C(5, 5) ways. Thus, the answer is
$$C(14, 2) \times C(12, 3) \times C(9, 4) \times C(5, 5) = \frac{14!}{2! \times 3! \times 4! \times 5!}.$$
11.2.5 Exercise
1. In a paramagnet, there are N atoms and each of them can have either ↑ or ↓
spin.
[Figure: panels (a) and (b) with axes x and f(x); panel (c) with axes x and F(x)]
Figure 11.3: Plot of the probability functions: (a) throwing a single die and (b)
throwing two dice simultaneously (see Table 11.2). (c) Plot of cumulative
probability distribution function in case of throwing two dice.
plot of the probability function is shown in Fig. 11.3(b). A plot of the probability
function, in case of throwing a single die is shown in Fig. 11.3(a). Since all the
outcomes are equally probable, f (x) is a constant (equal to 1/6) in this case.
How is the above discussion useful to a scientist or engineer? Say you are
doing some experiment, like measuring the electrical conductivity of some material.
You generally do the experiment several times and measure a value of $x_i$ for $N_i$
times out of total $N = \sum_i N_i$, and the probability of $x_i$ is calculated as
$p_i = f(x_i) = N_i/N$. We are generally interested in the average value of all
measurements and the spread of the data about the average value, as discussed in
the next section.
$$\bar{x} = \frac{1}{N}\sum_i x_i N_i = \sum_i x_i \frac{N_i}{N} = \sum_i x_i p_i = \sum_i x_i f(x_i). \qquad (11.9)$$
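The chain of equalities in Eq. 11.9 can be illustrated with a small data set. The measurements below are hypothetical values invented for this sketch (they are not from the text), and exact fractions are used so that both ways of computing the mean can be compared without rounding.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical repeated measurements (arbitrary units), for illustration only.
measurements = [3, 5, 5, 4, 3, 5, 4, 5]

N = len(measurements)
counts = Counter(measurements)                          # N_i for each distinct x_i
f = {x: Fraction(n_i, N) for x, n_i in counts.items()}  # p_i = f(x_i) = N_i / N

mean_direct = Fraction(sum(measurements), N)            # (1/N) * sum of all readings
mean_weighted = sum(x * p for x, p in f.items())        # sum_i x_i f(x_i)

print(mean_direct, mean_weighted)
```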
[Figure: panels (a), (b) and (c) with axes x and f(x); panel (c) marks the interval from a to b]
Figure 11.4: (a) Bar chart of the probability function shown in Fig. 11.3(b). Height
of each bar is proportional to the value of the data point and width of each bar
is equal to 1. (b) Width of each bar halved and additional points added. We can
keep halving the width, until the points start touching each other (in the limit
of bar width → 0) and we get a continuous line, which represents a continuous
probability distribution function. (c) In this case, probability is given by the area
under the curve, $\int_a^b f(x)\,dx$.
11.3.3 Exercise
1. Let the random variable x be the number of heads when three coins are
tossed. Make a table of x and the probability function f (x) = p.
10 Since $f(x_i) = p_i$ is a probability function, it satisfies $\sum_i f(x_i) = 1$.
11 This is also represented by a symbol of σ.
11.4. CONTINUOUS PROBABILITY FUNCTIONS
Compare with Eq. 11.10 and note that the sum is replaced by an integral.
Cumulative probability function is given by,
$$F(x) = \int_{-\infty}^{x} f(u)\,du. \qquad (11.15)$$
Compare with Eq. 11.12 and note that the sum is replaced by an integral. It is
obvious that F(∞) = 1.
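Eq. 11.15 can be approximated numerically by summing thin strips of area under f(u). The sketch below assumes a hypothetical density f(x) = 2x on [0, 1] (zero elsewhere), chosen only because its cumulative function F(x) = x² is known exactly; since the density vanishes below 0, the lower limit −∞ is replaced by 0.

```python
def pdf(x):
    """Hypothetical density used for illustration: f(x) = 2x on [0, 1]."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def cdf(x, lower=0.0, steps=1000):
    """Midpoint-rule approximation of F(x), the integral of pdf from lower to x."""
    if x <= lower:
        return 0.0
    width = (x - lower) / steps
    strips = (pdf(lower + (i + 0.5) * width) for i in range(steps))
    return sum(strips) * width

print(cdf(0.5))   # close to 0.5**2
print(cdf(1.0))   # close to 1, the total area
print(cdf(2.0))   # still close to 1: no probability beyond x = 1
```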
12 If we add all the areas, we should get 1.
13 Instead of a discrete random variable $x_i$, we now have a continuous random variable x.
14 Obviously, $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
11.5. BINOMIAL DISTRIBUTION
A related question would be the probability of not more than r perfect products
out of n.$^{20}$
15 There are 2 possible outcomes of the 1st toss, 2 possible outcomes of the 2nd toss and so on.
16 If it is one of the specific outcomes like HTTHH, then the answer is $1/2^5$.
17 Equivalently, T needs to be distributed in 2 different slots, like 45, 24, 12 etc.
18 Or equivalently, n − r out of n products are defective.
19 q = 1 − p.
20 Equivalently, less than or equal to r perfect products out of n.
Appendix A
Introduction to Partial Differential Equations
In this chapter, we shall learn about the basics of partial differential equations.
where u(x, y) is a known function. We can rewrite the above equation as,
A.1. CLASSIFICATION OF PDES
where all the coefficients $a_\alpha = 1$. In general, for a PDE to be linear, $a_\alpha$ can be any
function of the independent variables, $\mathbf{x} = (x, y, z, \cdots)$, and we can write a $k$th order
linear PDE as,
$$\sum_{\alpha=0}^{k} a_\alpha(\mathbf{x}) D^\alpha f + u(\mathbf{x}) = 0. \qquad (A.6)$$
• The derivatives are non-linear like $(\partial f/\partial x)^2$, $(\partial f/\partial x)(\partial^2 f/\partial y^2)$ etc.
Let us learn about several sub-classes of non-linear PDEs. Non-linear PDEs can
appear very complicated. Fortunately, their classification is based only on the
highest order terms, and the lower order terms can be ignored for classifying
non-linear PDEs.
Semi-linear PDE
If the highest order term is linear, we call it a semi-linear PDE. Let us first separate
the highest order term from the rest of the terms and write a $k$th order PDE as,
where a is a function which contains anything other than the highest order term.
Note that, we have to care only about the linearity of the highest order term and
the rest can contain non-linearity of any form. For example, the equation
Quasi-linear PDE
In this case, the highest order term is also non-linear. However, the coefficient of
the highest order term contains terms one order less than the highest order. For
A.2. METHOD OF CHARACTERISTICS
[Figure: two surface plots with panel titles $f(x,y) = (x-y)^2$ and $f(x,y) = \sin(x-y)$; axes x and y]
Figure A.1: Graph of f(x, y): two possible solutions of Eq. A.15, plotted
assuming a = b = 1. Contour lines or level curves are shown at the base of the
plots. Along the contour lines, f(x, y) is constant. Contour lines are parallel to
y = x − constant.
A.1.3 Exercise
Since f(x, y) is a solution, the graph$^2$ of f(x, y) is going to be a smooth surface S.$^3$ We
get a nice geometric interpretation, if we rewrite the equation as a “dot product”,
$$\underbrace{[a(x, y),\, b(x, y),\, c(x, y)]}_{\text{tangent}} \cdot \underbrace{[f_x(x, y),\, f_y(x, y),\, -1]}_{\text{normal}} = 0. \qquad (A.13)$$
Do you recognize that the second vector is normal to the surface S at (x, y, f(x, y))
or (x, y, z)?$^4$ Thus, the first vector lies on the plane tangent to the surface S at
(x, y, f(x, y)) or (x, y, z). Based on this information, if we can construct the surface
S, then we get the solution f(x, y).$^5$
Let C(s) = (x(s), y(s), z(s)) be a parameterized curve on the surface S. The tangent
to the curve, $C'(s) = (x'(s), y'(s), z'(s))$, should lie on the tangent plane, such that,
$$x'(s) = \frac{dx}{ds} = a(x(s), y(s)), \qquad (A.14)$$
$$y'(s) = \frac{dy}{ds} = b(x(s), y(s)),$$
$$z'(s) = \frac{dz}{ds} = c(x(s), y(s)).$$
The above system of ODEs is called the characteristic equations of the PDE
given in Eq. A.12 and C(s) is called an integral curve of the PDE. If we get all
such curves, then we can construct the surface S.
Example 1: Homogeneous equations are the simplest ones to solve. So, let us
start with,
$$a f_x + b f_y = 0, \qquad (A.15)$$
where a, b are constants. The characteristic equations and solutions are,
$$\frac{dx}{ds} = a \;\Rightarrow\; x = as + c_1, \qquad (A.16)$$
$$\frac{dy}{ds} = b \;\Rightarrow\; y = bs + c_2,$$
$$\frac{dz}{ds} = 0 \;\Rightarrow\; z = c_3.$$
Eliminating s, we can write the integral curves as,
$$bx - ay = c_4 \quad \& \quad z = c_3. \qquad (A.17)$$
Thus, z = f(x, y) is constant along the lines $bx - ay = c_4$. If we draw all such lines,
then we can construct the surface S. Thus, we can write the general solution of
Eq. A.15 as,
$$f(x, y) = u(bx - ay). \qquad (A.18)$$
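A quick numerical check can make Eq. A.18 more convincing. The sketch below picks two arbitrary choices of the function u and estimates the left-hand side of Eq. A.15 with central differences; the helper name `residual`, the step size and the sample point are assumptions made for illustration.

```python
import math

def residual(u, a, b, x, y, step=1e-5):
    """Estimate a*f_x + b*f_y at (x, y) for f(x, y) = u(b*x - a*y)."""
    def f(px, py):
        return u(b * px - a * py)
    f_x = (f(x + step, y) - f(x - step, y)) / (2 * step)   # central difference in x
    f_y = (f(x, y + step) - f(x, y - step)) / (2 * step)   # central difference in y
    return a * f_x + b * f_y

a, b = 2.0, 3.0
print(residual(math.sin, a, b, 0.7, -0.4))          # close to 0
print(residual(lambda w: w ** 2, a, b, 0.7, -0.4))  # close to 0
```

Both residuals vanish up to finite-difference error, whichever differentiable u is chosen.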
2 Graph of f(x, y) is the set of points (x, y, f(x, y)), where z = f(x, y) is the height of the graph at
the point (x, y). Try to plot the graph of $f(x, y) = x^2 + y^2$ and see what the surface looks like.
3 Such that we can draw a tangent plane at every point.
4 Let us write F(x, y, z) = f(x, y) − z = 0 and $\nabla F = (f_x, f_y, -1)$.
5 Because S is the graph of f(x, y).
[Figure: x-y plane with straight characteristic lines; values marked x, * and o on the x-axis are repeated along each line]
Figure A.2: Solution of Eq. A.15 (a = b = 1): f(x, y) = u(x − y) = constant and the initial
condition is given along the x-axis as f(x, 0) = g(x). Values on the x-axis will
be “carried” or “transported” along the straight lines, because f(x, y) is constant
along the lines. Thus, the solution can be written as f(x, y) = g(x − y).
You must keep in mind that, while f(x, y) is a function of two variables, u(bx − ay) is a function of the single variable bx − ay. We can verify,
and try to solve Eq. A.15. Without doing anything, we can predict that the value
of the function on the x axis will be “carried” or “transported” along the contour
lines, bx − ay = constant (see Figure A.2). Based on this fact, convince yourself
that, if we take the initial condition (assuming a = b = 1) to be,
then we get the particular solutions plotted in Figure A.1. Thus, a particular
solution to Eq. A.15 is obtained from the initial condition g(x) given in Eq. A.20
and try to solve it. We must find the surface S, which is the graph of f and
contains the data curve,
Γ(r) = (r, 0, g(r)). (A.24)
Note that, the data curve has been parameterized by r and constructed from the
initial condition. The integral curves C(r, s) = (x(r, s), y(r, s), z(r, s)), also lying on
S, originate from the data curve Γ(r) (see Figure A.3), and satisfy,
$$x'(r, s) = \frac{dx}{ds} = a(x(r, s), y(r, s)), \qquad (A.25)$$
$$y'(r, s) = \frac{dy}{ds} = b(x(r, s), y(r, s)),$$
$$z'(r, s) = \frac{dz}{ds} = c(x(r, s), y(r, s)).$$
This is the same as before, with the additional restriction that the integral curves must
originate from the data curve Γ(r). Note that the integral curves are parameterized
by r and s. Since s = 0 along the data curve Γ(r), we can write x(r, 0) = r, y(r, 0) = 0,
z(r, 0) = g(r), and we have to solve,
$$x'(r, s) = \frac{\partial x}{\partial s} = a(x(r, s), y(r, s)); \quad x(r, 0) = r, \qquad (A.26)$$
$$y'(r, s) = \frac{\partial y}{\partial s} = b(x(r, s), y(r, s)); \quad y(r, 0) = 0,$$
$$z'(r, s) = \frac{\partial z}{\partial s} = c(x(r, s), y(r, s)); \quad z(r, 0) = g(r).$$
Example 2: Let us solve the following linear, first order, non-homogeneous
PDE,
$$f_x + f_y = f, \qquad (A.27)$$
$$f(x, 0) = \cos x.$$
[Figure: data curve above the x-axis with the points (x0, 0, g(x0)) and (x1, 0, g(x1)) marked]
Figure A.3: The data curve (solid line) is given along the x-axis: f(x, 0) = g(x).
The characteristic curves (dashed lines) originate from the data curve.
The data curve is Γ(r) = (r, 0, cos r). The characteristic equations, with initial
conditions, are,
It is left as an exercise for you to verify that f (x, y) satisfies Eq. A.27.
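One way to carry out this check numerically is to integrate the characteristic equations step by step. The sketch below reads Eq. A.27 as the case a = b = 1 and c = z of Eq. A.26, so that dx/ds = 1, dy/ds = 1, dz/ds = z, starting from (r, 0, cos r). The closed form used for comparison, f(x, y) = e^y cos(x − y), is our own working from these equations and should be treated as an assumption; the function names are illustrative.

```python
import math

def integrate_characteristic(r, s_end, steps=1000):
    """Fourth-order Runge-Kutta along one characteristic starting at (r, 0, cos r)."""
    ds = s_end / steps
    x, y, z = r, 0.0, math.cos(r)
    for _ in range(steps):
        # dz/ds = z, so each RK4 stage only needs the current z estimate.
        k1 = z
        k2 = z + 0.5 * ds * k1
        k3 = z + 0.5 * ds * k2
        k4 = z + ds * k3
        z += ds * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += ds          # dx/ds = 1
        y += ds          # dy/ds = 1
    return x, y, z

def closed_form(x, y):
    """Candidate solution (assumption, derived here): f = exp(y) * cos(x - y)."""
    return math.exp(y) * math.cos(x - y)

x, y, z = integrate_characteristic(r=0.3, s_end=1.0)
print(z, closed_form(x, y))      # the two values should agree closely
print(closed_form(0.3, 0.0), math.cos(0.3))   # initial condition f(x, 0) = cos x
```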
Example 3: Let us now solve a quasi-linear equation, given by,
$$f f_x + f_y = 0, \qquad (A.31)$$
$$f(x, 0) = x.$$
[Figure: left panel, color map of f(x,y) = x/(y+1); right panel, lines for r = 0.2, 0.4, 0.6, 0.8, 1.0; axes x and y]
Figure A.4: (Left) Color map with contour lines for $f(x, y) = \frac{x}{y+1} = r$. (Right)
Contour lines are along x = r(y + 1) and f(x, y) = constant = r along these lines.
Clearly, there exists a singularity at the point (0,-1).
The data curve is given by Γ(r) = (r, 0, r). The characteristic equations, with initial
conditions, are,
Thus, f(x, y) = constant = r along the lines x = r(y + 1). Let us assume that r ∈ [0, 1].
Thus, x lies in a region bounded by the y-axis (when r = 0) and the line x = y + 1
(when r = 1). Along the y-axis, f(x, y) = 0 and along the line y = x − 1, f(x, y) = 1.
A few more contour lines or level curves are shown in Figure A.4. As shown in the
figure, along the contour lines f(x, y) = constant = r. However, all the lines emerge
from the point (0,-1). How can this be possible? Clearly, there exists a singularity
at the point (0,-1).
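The behaviour described above can be checked numerically for the solution f(x, y) = x/(y + 1) quoted in the caption of Figure A.4. The sketch below estimates f·f_x + f_y with central differences and confirms that f stays equal to r along a line x = r(y + 1); the function names, step size and sample points are assumptions made for illustration.

```python
def f(x, y):
    """Solution shown in Figure A.4; undefined on the line y = -1."""
    return x / (y + 1.0)

def residual(x, y, step=1e-6):
    """Central-difference estimate of f*f_x + f_y (left-hand side of Eq. A.31)."""
    f_x = (f(x + step, y) - f(x - step, y)) / (2 * step)
    f_y = (f(x, y + step) - f(x, y - step)) / (2 * step)
    return f(x, y) * f_x + f_y

print(residual(0.3, 0.5))        # close to 0
print(f(0.3, 0.0))               # initial condition f(x, 0) = x

r = 0.4
for y in (0.0, 0.5, 2.0):
    x = r * (y + 1.0)            # a point on the characteristic line
    print(y, f(x, y))            # stays equal to r along the line
```

Every line x = r(y + 1) passes through (0, -1), where the formula divides zero by zero, which is the singularity mentioned in the text.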
A.2.3 Exercise
1. Solve $y f_x - x f_y = 0$. Make a diagram like Figure A.2.
A.3. CANONICAL FORM
f (x, y) = constant along the straight line (or some curve). Thus, if we know the
value of f (x, 0) = g(x), we know the value of f (x, y), as shown in Figure A.2.
If the PDE is non-homogeneous, we need to solve for f (x, y), originating from
the data curve f (x, 0) = g(x), as shown in Figure A.3.
$$a(x, y) f_{xx} + b(x, y) f_{xy} + c(x, y) f_{yy} + d(x, y) f_x + e(x, y) f_y + g(x, y) f + h(x, y) = 0. \qquad (A.35)$$
• Parabolic, if ∆ = 0.
• Hyperbolic, if ∆ > 0.
• Elliptic, if ∆ < 0.
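The three cases above translate directly into a small classifier. The definition of ∆ is not reproduced in this excerpt, so the sketch below assumes the standard discriminant ∆ = b² − 4ac for the coefficients of f_xx, f_xy and f_yy in Eq. A.35, evaluated at one point (x, y); the function name `classify` is illustrative.

```python
def classify(a, b, c):
    """Classify a second order PDE at a point, assuming Delta = b**2 - 4*a*c."""
    delta = b * b - 4 * a * c
    if delta > 0:
        return "hyperbolic"
    if delta < 0:
        return "elliptic"
    return "parabolic"

# Standard textbook examples, with the second variable playing the role of y:
print(classify(1, 0, 1))     # Laplace equation  f_xx + f_yy = 0
print(classify(1, 0, -1))    # wave equation     f_xx - f_yy = 0
print(classify(1, 0, 0))     # heat equation     f_xx - f_y  = 0
```

When the coefficients depend on (x, y), the type can change from point to point, so the classifier should be applied to the coefficient values at the point of interest.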
Let us change the independent variables from (x, y) ⇒ (θ(x, y), η(x, y)). Let us
define a new function,
where,
A.4. BOUNDARY CONDITIONS
We shall see that any hyperbolic equation of the form Eq. A.35 can be transformed
to the canonical form (Eq. A.40) by suitable change of variable. We need to find
θ(x, y) and η(x, y) such that,
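Note that θ(x, y) and η(x, y) are roots of the same equation. We can rewrite,
$$\underbrace{\left( a\theta_x + \left[ \frac{b}{2} + \sqrt{\frac{b^2}{4} - ac} \right] \theta_y \right)}_{\text{I}} \underbrace{\left( a\theta_x + \left[ \frac{b}{2} - \sqrt{\frac{b^2}{4} - ac} \right] \theta_y \right)}_{\text{II}} = 0. \qquad (A.42)$$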
Now we have two first order linear PDEs and we know how to solve them using
method of characteristics. The characteristic equations are,
$$\frac{dx}{ds} = a, \qquad (A.43)$$
$$\frac{dy}{ds} = \frac{b}{2} + \sqrt{\frac{b^2}{4} - ac},$$
$$\frac{d\theta}{ds} = 0.$$
Thus, θ(x, y) is constant along the characteristic curves and the characteristic
curves are given by,
$$\frac{dy}{dx} = \frac{dy/ds}{dx/ds} = \frac{b/2 + \sqrt{b^2/4 - ac}}{a}. \qquad (A.44)$$
Similarly, η(x, y) is constant along the characteristic curves and the characteristic
curves are given by,
$$\frac{dy}{dx} = \frac{dy/ds}{dx/ds} = \frac{b/2 - \sqrt{b^2/4 - ac}}{a}. \qquad (A.45)$$
$$(f)_{\partial\Omega} = u, \qquad (A.46)$$
A.5. ELLIPTIC PDE: LAPLACE EQUATION
then,
$$\|u_1 - u_2\| < \delta \;\Rightarrow\; \|f_1 - f_2\| < \varepsilon. \qquad (A.51)$$
This implies that, if the boundary condition changes by a small amount, the
solution should also change by a small amount.$^9$
A.6. HYPERBOLIC PDE: WAVE EQUATION
$$\frac{dt}{dx} = \pm\frac{1}{c} \;\Rightarrow\; x \pm ct = \text{constant}. \qquad (A.55)$$
Let us define the following change of variable,
and assume that f (x, y) = u(θ(x, y), η(x, y)). Applying chain rule, it can be shown
that,
Substituting in Eq. A.53, we get the canonical form of 1D wave equation and its
solution as,
uθη = 0 ⇒ u(θ, η) = φ(θ) + ψ(η). (A.58)
Thus, the solution of 1D wave equation is,
It is left as an exercise to verify that the above equation satisfies Eq. A.53. Apply-
ing the initial conditions,
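$$\phi(x) - \psi(x) = \frac{1}{c}\int_{x_0}^{x} h(\tau)\,d\tau + \text{constant}. \qquad (A.61)$$
$$\phi(x) = \frac{1}{2} g(x) + \frac{1}{2c}\int_{x_0}^{x} h(\tau)\,d\tau + \frac{\text{constant}}{2}, \qquad (A.62)$$
$$\psi(x) = \frac{1}{2} g(x) - \frac{1}{2c}\int_{x_0}^{x} h(\tau)\,d\tau - \frac{\text{constant}}{2}.$$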
D’Alembert’s solution: Combining the above equations, we can write the
solution of the 1D wave equation as,
$$f(x, t) = \frac{1}{2}\left[ g(x + ct) + g(x - ct) \right] + \frac{1}{2c}\int_{x - ct}^{x + ct} h(\tau)\,d\tau. \qquad (A.64)$$
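D’Alembert’s formula is easy to evaluate numerically. The sketch below assumes that the 1D wave equation has the standard form f_tt = c² f_xx with initial conditions f(x, 0) = g(x) and f_t(x, 0) = h(x), which is consistent with Eq. A.64 but is not shown in this excerpt. The initial profiles, the wave speed and the helper names `simpson` and `dalembert` are hypothetical choices for illustration.

```python
import math

def simpson(func, lo, hi, n=200):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if hi == lo:
        return 0.0
    step = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * step)
    return total * step / 3

def dalembert(g, h, c, x, t):
    """Evaluate Eq. A.64 at the point (x, t)."""
    travelling = 0.5 * (g(x + c * t) + g(x - c * t))
    velocity_part = simpson(h, x - c * t, x + c * t) / (2 * c)
    return travelling + velocity_part

def g(x):
    return math.exp(-x * x)      # hypothetical initial displacement

def h(x):
    return math.cos(x)           # hypothetical initial velocity

c, x0, t0, d = 2.0, 0.3, 0.5, 1e-3

# Second derivatives by central differences, to test f_tt = c**2 * f_xx.
f_tt = (dalembert(g, h, c, x0, t0 + d) - 2 * dalembert(g, h, c, x0, t0)
        + dalembert(g, h, c, x0, t0 - d)) / d ** 2
f_xx = (dalembert(g, h, c, x0 + d, t0) - 2 * dalembert(g, h, c, x0, t0)
        + dalembert(g, h, c, x0 - d, t0)) / d ** 2

print(f_tt - c ** 2 * f_xx)                   # close to 0
print(dalembert(g, h, c, x0, 0.0), g(x0))     # initial displacement recovered
```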
Solution,
$$f(x, t) = \frac{1}{2}\left[ g(x + t) + g(x - t) \right] + \frac{1}{2}\int_{x - t}^{x + t} h(\tau)\,d\tau - \frac{1}{2}\iint p(x, y)\,dR. \qquad (A.67)$$