1718 Theories Notes

The document discusses various theories related to matter, space, and time, including least action, special relativity, relativistic electromagnetism, and quantum mechanics. It covers fundamental concepts such as conservation laws, Lorentz transformations, Maxwell's equations, and the Schrödinger equation. The content is structured into sections and subsections, providing detailed explanations and examples for each topic.


Theories of Matter, Space and Time

N Evans and SF King

March 6, 2018
Contents

1 Least Action
1.1 Optics
1.1.1 Snell’s Law
1.1.2 Complicated Problems
1.1.3 Light in Vacuum
1.1.4 Light in the Atmosphere
1.2 Newtonian Dynamics
1.2.1 Multiple Coordinates
1.2.2 Example: Projectile Motion
1.2.3 Example 2: Double Pendulum
1.3 Conservation Laws
1.3.1 Ignorable Coordinates
1.3.2 Energy Conservation
1.3.3 Example - Central Forces
1.3.4 Hamiltonian and Energy
1.4 Appendix - Calculus of Variation
1.5 Appendix - Mathematics of conservation laws

2 Special Relativity
2.1 The Postulates
2.2 Lorentz Transformations
2.2.1 Time Dilation
2.2.2 Lorentz Contraction
2.3 An Analogy to Rotations
2.4 Four Vectors
2.4.1 Index Convention
2.5 The Laws of Dynamics
2.5.1 Four-velocity
2.5.2 Four Acceleration
2.5.3 Four Momentum
2.5.4 Hypothesis for Dynamical Law
2.6 Physics with Four-Momentum
2.6.1 The Doppler Effect
2.6.2 The Compton Effect
2.6.3 Fixed Target Experiments
2.6.4 The GZK Bound
2.7 Tensors
2.8 Relativistic Action

3 Relativistic Electromagnetism
3.1 Integral Form of Maxwell’s Equations
3.1.1 Gauss’ Law
3.1.2 No Magnetic Charges
3.1.3 Faraday’s Law
3.1.4 Ampere’s Law
3.2 Differential Form of Maxwell’s Equations
3.2.1 Maxwell’s Equations in Differential Form
3.2.2 Conservation of Charge
3.2.3 The Displacement Current
3.3 Potentials
3.3.1 Electrostatic Potential
3.3.2 The Magnetic Vector Potential
3.3.3 A New Electric Potential
3.3.4 Gauge Transformations
3.3.5 Maxwell’s Equations in Lorenz Gauge
3.4 Relativistic Formulation Of Electromagnetism
3.4.1 Four-vector Current
3.4.2 Conservation of Charge
3.4.3 The Four Vector ∂µ
3.4.4 Four Vector Potential
3.4.5 A Moving Point Charge
3.4.6 The Electromagnetic Field Strength Tensor
3.4.7 Lorentz Transformations of Electric and Magnetic Fields
3.4.8 The Relativistic Force Law
3.5 The Lagrangian For a Charged Particle
3.6 Appendix - Gauss’ and Stokes’ Theorems
3.6.1 Gauss’ Theorem
3.6.2 Stokes’ Theorem
3.7 Appendix - Vector Identities

4 Quantum Mechanics
4.1 Non-relativistic Quantum Mechanics
4.1.1 One Dimensional, Time Dependent Schroedinger Equation
4.1.2 Time Independent Schroedinger Equation
4.1.3 Interpretation
4.1.4 Proof that Probability Is Conserved
4.1.5 Momentum Space Wave Functions
4.1.6 Heisenberg Uncertainty Principle
4.1.7 Square Well Example
4.1.8 Completeness
4.1.9 Orthogonality
4.1.10 The 3D Schroedinger Equation
4.1.11 Wave Function Collapse and All That
4.2 Path Integral Approach to Quantum Mechanics
4.2.1 Proposal for the Quantum Mechanical Amplitude
4.2.2 The Classical Limit
4.2.3 Wave Functions
4.2.4 Deriving the Schroedinger Equation
4.2.5 Path Integral for a Free Particle
4.2.6 Interpreting the Free Particle Kernel
4.2.7 Barrier Problems
4.2.8 The Kernel in Terms of Wave Functions
4.2.9 Appendix - Gaussian Integrals
4.3 Relativistic Quantum Mechanics - The Klein Gordon Equation
4.3.1 Problems in the Klein Gordon Equation
4.3.2 Feynman Stueckelberg Interpretation

Chapter 1

Least Action

Newton, through his three laws of dynamics, developed an extremely successful description
of the motion of objects. These laws for example can describe the elliptical orbits of planets
to remarkable precision. There is though an alternative presentation of these successes, the
Principle of Least Action, which we will explore here. It is a formalism that grew out of
optics and will allow us to study an area of mathematics called “calculus of variation”. Of
course it must turn out to be the same as Newton’s laws. This alternative formalism makes
some dynamics problems easier to solve but, more importantly, it will give us new insights
into conservation laws. It is important to master these methods since as one moves to the
forefront of modern quantum theories the Least Action Principle becomes the only way to
define theories such as that of the strong nuclear force.

1.1 Optics
Our starting point will be to think about the path that light travels by. In these enlightened
times we might start from Maxwell’s equations and derive a wave equation with light waves
as solutions to determine how the light propagates. Before this technology though Fermat
proposed

Fermat’s Principle of Least Time: Light propagates between two points so as to minimize
its travel time

Thus, for example, in a uniform medium where the speed of light c is a constant, the
minimum time of travel

\[ t = \frac{d}{c} \tag{1.1} \]

is given by the path of shortest distance d, i.e. a straight line. This is still a perfectly good (if
limited) description of light.
We can obtain more interesting results by thinking about media where the speed of light
changes.


1.1.1 Snell’s Law


Consider two neighbouring regions of space in which light travels at different speeds v1 , v2 -
for example a glass air interface. We will be interested in the light that travels from the point
(x1 , y1 ) in the first medium to the point (x2 , y2 ) in the second

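[Figure 1.1: Possible paths that light might follow transiting across the interface between
two materials. The ray leaves (x1, y1) in material 1 (speed v1), travels a distance d1 at angle θ1,
crosses the interface at (x, 0), and travels a distance d2 at angle θ2 to (x2, y2) in material 2
(speed v2).]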

In any one medium light travels in a straight line but in this case we have some choice
in where the light crosses between the media. Let's consider the arbitrary crossing point
(x, y = 0). The time of travel is

\[ T[x] = \frac{d_1}{v_1} + \frac{d_2}{v_2}
        = \frac{\sqrt{(x-x_1)^2 + y_1^2}}{v_1} + \frac{\sqrt{(x-x_2)^2 + y_2^2}}{v_2} \tag{1.2} \]
We now want to find the path (i.e. the value of x through which it passes) which minimizes
the time taken. Thus

\[ \frac{dT}{dx} = \frac{(x - x_1)}{v_1\sqrt{(x-x_1)^2 + y_1^2}}
                 - \frac{(x_2 - x)}{v_2\sqrt{(x_2-x)^2 + y_2^2}} = 0 \tag{1.3} \]
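This equation though is just

\[ \frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2} \tag{1.4} \]

where θ1 and θ2 are the angles the two ray segments make with the normal to the interface.
This is Snell’s law.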
In terms of the index of refraction, which is defined relative to the vacuum as

\[ n_1 = \frac{c}{v_1} \tag{1.5} \]

(and similarly for n2), this becomes

\[ n_1 \sin\theta_1 = n_2 \sin\theta_2 \tag{1.6} \]
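As a numerical cross-check of the argument above, the following sketch minimizes the travel time T[x] of Eq. (1.2) directly and compares the two sides of Eq. (1.4) at the minimum. The end points, the speeds and the helper names (`travel_time`, `minimise`) are illustrative assumptions, not part of the notes; the golden-section search is just one convenient way to locate the minimum.

```python
import math

def travel_time(x, p1, p2, v1, v2):
    """Travel time T[x] of Eq. (1.2) for a ray crossing the interface at (x, 0)."""
    x1, y1 = p1
    x2, y2 = p2
    d1 = math.hypot(x - x1, y1)
    d2 = math.hypot(x - x2, y2)
    return d1 / v1 + d2 / v2

def minimise(f, lo, hi, iterations=200):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iterations):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2

# Illustrative values (assumptions): units with c = 1, index 1.0 above and 1.5 below.
p1, p2 = (0.0, 1.0), (1.0, -1.0)
v1, v2 = 1.0, 1.0 / 1.5

x_min = minimise(lambda x: travel_time(x, p1, p2, v1, v2), p1[0], p2[0])

# Sines of the angles measured from the normal to the interface.
sin1 = (x_min - p1[0]) / math.hypot(x_min - p1[0], p1[1])
sin2 = (p2[0] - x_min) / math.hypot(p2[0] - x_min, p2[1])
snell_lhs = sin1 / v1
snell_rhs = sin2 / v2
print(x_min, snell_lhs, snell_rhs)
```

The two printed ratios should agree to within the accuracy of the search, which is the content of Snell's law.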



1.1.2 Complicated Problems


We can imagine more complicated problems than that above where the index of refraction
is an arbitrary function of position. For example consider light moving in a plane where the
speed of the light is v(x, y)

[Figure 1.2: Possible paths that light might travel in a plane with varying speed of light
v(x, y). The paths run between x = xa and x = xb.]

Different paths are described by different functions y(x). The time to travel along an
arbitrary little piece of path is

\[ dT = \frac{\text{distance}}{\text{velocity}} = \frac{\sqrt{dx^2 + dy^2}}{v(x, y)} \tag{1.7} \]
Summing such contributions up along a path gives the total time of travel

\[ T[y(x)] = \int_{x_a}^{x_b} \frac{1}{v(x, y)} \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\, dx \tag{1.8} \]
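Rewriting this in a more standard form, the time taken to traverse a path is

\[ T[y] = \int_{x_a}^{x_b} L(y, \dot y, x)\, dx \tag{1.9} \]

where the dot here denotes a derivative with respect to x, ẏ = dy/dx, and

\[ L(y, \dot y, x) = \frac{1}{v(x, y)} \sqrt{1 + \dot y^2} \tag{1.10} \]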
Now we want to find the path y(x) that gives the minimum time.
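To make the functional T[y] concrete, here is a minimal sketch that evaluates Eq. (1.8) numerically for a given path y(x) and speed v(x, y). The midpoint rule, the function name `travel_time` and the two sample paths are assumptions chosen for illustration only.

```python
import math

def travel_time(y, v, xa, xb, steps=2000):
    """Midpoint-rule estimate of T[y] = integral of sqrt(1 + y'^2) / v(x, y) dx, Eq. (1.8)."""
    dx = (xb - xa) / steps
    total = 0.0
    for i in range(steps):
        x0, x1 = xa + i * dx, xa + (i + 1) * dx
        slope = (y(x1) - y(x0)) / dx                    # finite-difference dy/dx
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y(x0) + y(x1))  # segment midpoint
        total += math.sqrt(1.0 + slope ** 2) / v(xm, ym) * dx
    return total

uniform = lambda x, y: 1.0                           # constant speed v = 1
straight = lambda x: x                               # from (0, 0) to (1, 1)
bowed = lambda x: x + 0.2 * math.sin(math.pi * x)    # same end points, curved

t_straight = travel_time(straight, uniform, 0.0, 1.0)
t_bowed = travel_time(bowed, uniform, 0.0, 1.0)
print(t_straight, t_bowed)
```

With a uniform speed the straight path gives the distance divided by the speed, and any path with the same end points that bows away from it takes longer, in line with Fermat's principle. A position-dependent `v` can be passed in the same way.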

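Exercise 1.1: To remind yourself about partial differentiation, suppose a function is defined by

\[ T = a(t)\, b(t)^3\, \dot b(t)\, t^{10} \]

where the dot indicates a derivative with respect to t. Give expressions for

\[ \frac{\partial T}{\partial t}, \quad \frac{\partial T}{\partial b}, \quad
   \frac{\partial T}{\partial \dot b}, \quad \frac{dT}{dt} \]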

This is the sort of problem that Calculus of Variation is designed to address, as discussed
in Appendix 1.4.
As we show in Appendix 1.4, the problem of finding the path that minimizes the time is
equivalent to solving a differential equation called the Euler Lagrange equation,

\[ \frac{d}{dx}\left(\frac{\partial L}{\partial \dot y}\right) - \frac{\partial L}{\partial y} = 0 \tag{1.11} \]

which corresponds to Eq.1.82 with q identified as y and s identified as x.


Let's look at a couple of examples.

1.1.3 Light in Vacuum


In vacuum the speed of light is a constant so v(x, y) = c.
The Euler Lagrange equation is

\[ \frac{d}{dx}\left(\frac{\dot y}{\sqrt{1 + \dot y^2}}\right) = 0 \tag{1.12} \]

Integrating this gives

\[ \frac{\dot y}{\sqrt{1 + \dot y^2}} = \text{constant} \tag{1.13} \]
The only solution of this is

ẏ = constant, m (1.14)
or integrating

y = mx + c (1.15)
ie a straight line. This is our first example of the solution of the Euler Lagrange equation
giving the path that minimizes T . m and c are determined by the initial and final position of
the light.

1.1.4 Light in the Atmosphere


In the atmosphere the air temperature and density change with height resulting in the speed
of light depending on height - v(h). Equivalently we can write the refractive index n(h) with

\[ v(h) = \frac{c}{n(h)} \tag{1.16} \]

Our result for the length of time light takes to travel some path h(x) can be written as an
optical path length

\[ cT[h] = \int_{x_1}^{x_2} dx\, L, \qquad L = n(h)\sqrt{1 + \dot h^2} \tag{1.17} \]

We can use the fact that L is independent of x to simplify the Euler Lagrange equation as
follows. Note that

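\[ \frac{dL}{dx} = \frac{\partial L}{\partial x} + \frac{\partial L}{\partial h}\dot h
                 + \frac{\partial L}{\partial \dot h}\ddot h \tag{1.18} \]

The first term on the right is zero. Now replace ∂L/∂h using the Euler Lagrange equation

\[ \frac{\partial L}{\partial h} = \frac{d}{dx}\left(\frac{\partial L}{\partial \dot h}\right) \tag{1.19} \]

and we find

\[ \frac{dL}{dx} = \dot h\, \frac{d}{dx}\left(\frac{\partial L}{\partial \dot h}\right)
                 + \ddot h\, \frac{\partial L}{\partial \dot h} \tag{1.20} \]

which is just

\[ \frac{d}{dx}\left(L - \dot h \frac{\partial L}{\partial \dot h}\right) = 0 \tag{1.21} \]

which gives us

\[ L - \dot h \frac{\partial L}{\partial \dot h} = \text{constant}, \; D \tag{1.22} \]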
Note that this is only a first order equation rather than the second order Euler Lagrange
equation so is simpler to solve.
In our problem, using the explicit form for L above we have

\[ n\sqrt{1 + \dot h^2} - \frac{\dot h^2 n}{\sqrt{1 + \dot h^2}} = D \tag{1.23} \]

which simplifies to

\[ \frac{n}{\sqrt{1 + \dot h^2}} = D \tag{1.24} \]
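The step from (1.23) to (1.24) is short but worth writing out once; it only uses a common denominator:

```latex
n\sqrt{1+\dot h^2} - \frac{\dot h^2\, n}{\sqrt{1+\dot h^2}}
  = \frac{n\,(1+\dot h^2) - n\,\dot h^2}{\sqrt{1+\dot h^2}}
  = \frac{n}{\sqrt{1+\dot h^2}}
  = D
```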
Note that the physical meaning of D is the value of the index of refraction at the point where
the light ray becomes horizontal so that ḣ = 0.
Squaring and rearranging we find

\[ \frac{dh}{dx} = \sqrt{\frac{n^2}{D^2} - 1} \tag{1.25} \]

Thus

\[ x - x_0 = \int_{h_0}^{h} \frac{dh}{\sqrt{\frac{n^2}{D^2} - 1}} \tag{1.26} \]

Explicit Example: Consider a ray of light that begins moving horizontally (ḣ = 0) at h = 0
in an atmosphere where
n(h) = n0 − λ h (1.27)
where λ is some constant. We must solve the integral

\[ x = \int \frac{dh}{\sqrt{\frac{(n_0 - \lambda h)^2}{D^2} - 1}} \tag{1.28} \]
This can be done by changing variables to

n0 − λ h = D cosh φ (1.29)
The integral becomes

\[ x = -\frac{D}{\lambda}\int d\phi = -\frac{D}{\lambda}\phi + c \tag{1.30} \]
Returning to the original coordinates and requiring the boundary conditions ḣ(x = 0) = 0
and h(x = 0) = 0 gives the result

\[ h = \frac{n_0}{\lambda}\left(1 - \cosh\frac{\lambda x}{n_0}\right) \tag{1.31} \]
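A quick numerical sanity check of this result: the sketch below evaluates the first integral n/√(1 + ḣ²) of Eq. (1.24) along the curve (1.31) and confirms it stays equal to D = n0, the value fixed by the ray being horizontal at h = 0. The numbers chosen for n0 and λ and the function names are purely illustrative assumptions.

```python
import math

n0, lam = 1.0003, 1.0e-4   # illustrative values only (assumptions)

def h(x):
    """Ray height from Eq. (1.31)."""
    return (n0 / lam) * (1.0 - math.cosh(lam * x / n0))

def h_dot(x):
    """dh/dx, obtained by differentiating Eq. (1.31) analytically."""
    return -math.sinh(lam * x / n0)

def first_integral(x):
    """n / sqrt(1 + h'^2) from Eq. (1.24); should equal D = n0 everywhere."""
    n = n0 - lam * h(x)          # refractive index profile, Eq. (1.27)
    return n / math.sqrt(1.0 + h_dot(x) ** 2)

values = [first_integral(x) for x in (0.0, 10.0, 1000.0, 5000.0)]
print(values)
```

Every entry should reproduce n0 up to rounding, because n = n0 cosh(λx/n0) along the ray while √(1 + ḣ²) = cosh(λx/n0).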

• When λ is positive n(h) decreases with altitude - this is what normally happens in the
atmosphere. Plotting the form of the solution we find

[Figure 1.3: The solution for the light path h(x) when n(h) decreases with height. Axes: x
(horizontal) and h (vertical), with the ray starting at (0,0); the sketch also marks the apparent
light path to the observer.]

Thus if we look up at the Empire State building it will appear taller than it actually is.
• If there is a temperature inversion then λ is negative so n(h) increases with altitude. Plot-
ting the form of the solution we find

[Figure 1.4: The solution for the light path h(x) when n(h) increases with height. The ray
starts at (0,0) and is plotted against x.]

We see “the sky on the ground” - a mirage.



Exercise 1.2:
(a) Consider a fibre optic cable lying in the z direction. The cable is made of glass with index
of refraction n(r), where r is the radial distance from the centre of the cable. Working in
cylindrical coordinates (r, θ, z) show that Fermat’s Principle implies light travels on the path
minimizing the quantity

\[ \int_{z_1}^{z_2} f\big(r(z), \theta(z), r'(z), \theta'(z)\big)\, dz
   = \int_{z_1}^{z_2} n(r)\sqrt{r'^2 + r^2\theta'^2 + 1}\; dz. \]


where a prime indicates differentiation with respect to z. z1 and z2 are the z-coordinates of
the end points of the path.

(b) If a light ray initially has θ ′ = 0 show, from the appropriate Euler Lagrange equation,
that the θ independence of f implies the path followed by the light is described by a constant
value of θ .

(c) Use the z independence of f to deduce that the first order differential equation for rays
travelling paths with constant θ is

\[ f - \frac{\partial f}{\partial r'}\, r' = \text{constant}. \]

1.2 Newtonian Dynamics


We have seen that the motion of light can be described by a “principle of least time”. Is there
an equivalent rule that would describe the motion of a particle in Newtonian dynamics?
There is and it is enshrined as

Hamilton’s Principle: A particle travels by the path between two points that minimizes the
Action.

We need to know what the “action” is. Let’s write it first for one dimensional motion.
The action is

\[ S[\text{path}] = \int_{t_a}^{t_b} L(x, \dot x, t)\, dt \tag{1.32} \]

where the dot indicates differentiation with respect to the time, t. L is known as the
Lagrangian and is given by

L = kinetic energy − potential energy = T −V (1.33)



From Appendix 1.4, we know that the path that minimizes the action satisfies the Euler
Lagrange equation, analogous to the case of optics Eq.1.11,

\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot x}\right) - \frac{\partial L}{\partial x} = 0 \tag{1.34} \]

which corresponds to Eq.1.82 with q identified as x and s identified as t.

We can now check to see if any of this makes sense (!). For a non-relativistic particle in
a one dimensional potential we have

\[ L = T - V = \frac{1}{2} m \dot x^2 - V(x) \tag{1.35} \]

The Euler Lagrange equation is therefore

\[ \frac{d}{dt}(m\dot x) + \frac{\partial V}{\partial x} = 0 \tag{1.36} \]

which is Newton’s second law since

\[ F = -\frac{\partial V}{\partial x} \tag{1.37} \]

Note that the momentum of the particle is given by

\[ p = m\dot x = \frac{\partial L}{\partial \dot x} \tag{1.38} \]

1.2.1 Multiple Coordinates


Suppose that we now have several coordinates

qi i = 1...n (1.39)

For example for one particle moving in three dimensions we might call x = q1, y = q2, z = q3.
As discussed in Appendix 1.4, Eq.1.85, for the n dimensional case we have to solve a set
of n Euler Lagrange equations - one associated with each coordinate,

\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = 0 \tag{1.40} \]

i.e. we need to write down n copies of the Euler Lagrange equation, for i = 1, 2, ..., n, and
try to solve them simultaneously.

Generalized Coordinates
The reason that we have written the coordinates so generally as qi rather than for example
using x, y, z is that in some problems these are not the appropriate coordinates because of a
constraint. A simple example to illustrate this is a ball on a wire hoop

[Figure 1.5: The coordinates describing a ball constrained to run around a hoop. The hoop
lies in the x, y plane and the ball's position is given by the angle θ.]

The hoop stops the ball moving in the radial direction so the ball cannot be at any arbitrary
(x, y). The sensible coordinate to use is the angle θ .
Such a reduced set of coordinates is called a set of generalized coordinates.

Generalized Momentum
A generalization of the idea of momentum can be defined in the spirit of (1.38). The
generalized momentum associated with a generalized coordinate is given by

\[ p_i = \frac{\partial L}{\partial \dot q_i} \tag{1.41} \]

1.2.2 Example: Projectile Motion


Consider the familiar problem of a projectile in a uniform gravitational field

[Figure 1.6: The motion of a projectile in the x, y plane subject to constant gravity in the
vertical direction y.]


We can obtain the normal Newtonian equations of motion from the Euler Lagrange
equations. We need expressions for the kinetic and potential energy of the system so we can
build the Lagrangian. The kinetic energy is just

\[ T = \frac{1}{2} m \dot x^2 + \frac{1}{2} m \dot y^2 \tag{1.42} \]

and the potential energy

\[ V = mgy \tag{1.43} \]

So the Lagrangian is just

\[ L = T - V = \frac{1}{2} m \dot x^2 + \frac{1}{2} m \dot y^2 - mgy \tag{1.44} \]

Now we find the two Euler Lagrange equations. The first associated with the x coordinate
is

\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot x}\right) - \frac{\partial L}{\partial x} = 0 \tag{1.45} \]

which gives

\[ \boxed{m\ddot x = 0} \tag{1.46} \]

The second equation associated with the y coordinate is

\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot y}\right) - \frac{\partial L}{\partial y} = 0 \tag{1.47} \]

which gives

\[ \boxed{\ddot y = -g} \tag{1.48} \]
The two boxed equations are the standard Newtonian equations of motion.
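Hamilton's principle itself can also be checked numerically for this example. The sketch below discretises the action for the Lagrangian (1.44), evaluates it on the Newtonian parabola, and compares it with nearby paths that share the same end points. The discretisation, the parameter values and the names `action` and `perturbed` are illustrative assumptions rather than anything prescribed in the notes.

```python
def action(path, m, g, dt):
    """Discretised S = sum of (T - V) dt along a list of (x, y) points."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
        kinetic = 0.5 * m * (vx ** 2 + vy ** 2)
        potential = m * g * 0.5 * (y0 + y1)      # V evaluated at the segment midpoint
        total += (kinetic - potential) * dt
    return total

m, g = 1.0, 9.81           # illustrative values (assumptions)
steps, t_end = 1000, 2.0
dt = t_end / steps
vx0, vy0 = 3.0, 9.81       # chosen so the projectile returns to y = 0 at t = t_end

times = [i * dt for i in range(steps + 1)]
# Solution of the boxed equations (1.46) and (1.48).
true_path = [(vx0 * t, vy0 * t - 0.5 * g * t ** 2) for t in times]

def perturbed(eps):
    """Add a bump to y(t) that vanishes at both end points."""
    return [(x, y + eps * t * (t_end - t)) for (x, y), t in zip(true_path, times)]

s_true = action(true_path, m, g, dt)
s_up = action(perturbed(0.5), m, g, dt)
s_down = action(perturbed(-0.5), m, g, dt)
print(s_true, s_up, s_down)
```

Both perturbed paths give a larger action than the parabola, and by the same amount, which reflects the fact that the first-order change in S vanishes on the true path.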

Hopefully you’re starting to see the power of this technique now - the kinetic and poten-
tial energies of a system are fairly easy to work out and then we just do some maths. There’s
not all that resolving forces business! The next problem is an example that would be very
hard by the standard methodology.

1.2.3 Example 2: Double Pendulum


Consider a double pendulum as shown in Figure 1.7,

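[Figure 1.7: The coordinates relevant for a double pendulum. The first mass m1 hangs on a
rod of length l1 at angle θ1 and moves with velocity v1; the second mass m2 hangs from the
first on a rod of length l2 at angle θ2 and moves with velocity v1 + v2.]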

It would be pretty hard work to determine all the forces in play here. However, the
Lagrangian technique means we only have to calculate the energies of the two masses to get
to the equations of motion.
The first mass has a velocity $\vec v_1$ with magnitude l1 θ̇1 (v = ωr). The second mass has
both this motion plus a second contribution from the swing of the second pendulum, $\vec v_2$,
with magnitude l2 θ̇2. The total velocity of the second mass is therefore

\[ \vec v_{tot} = \vec v_1 + \vec v_2 \tag{1.49} \]

so

\[ v_{tot}^2 = (\vec v_1 + \vec v_2)\cdot(\vec v_1 + \vec v_2)
             = (l_1\dot\theta_1)^2 + (l_2\dot\theta_2)^2
             + 2 l_1\dot\theta_1 l_2\dot\theta_2 \cos(\theta_2 - \theta_1) \tag{1.50} \]

where θ2 − θ1 is the angle between $\vec v_1$ and $\vec v_2$.


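Thus the total kinetic energy of the system is

\[ T = \frac{1}{2} m_1 l_1^2 \dot\theta_1^2
     + \frac{1}{2} m_2 \left[ (l_1\dot\theta_1)^2 + (l_2\dot\theta_2)^2
     + 2 l_1\dot\theta_1 l_2\dot\theta_2 \cos(\theta_2 - \theta_1) \right] \tag{1.51} \]
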
The potential energy is determined by the heights of the masses

V = −m1 gl1 cos θ1 − m2 g(l1 cos θ1 + l2 cos θ2 ) (1.52)


and the Lagrangian is

L = T −V (1.53)
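There are two Euler Lagrange equations - one associated with θ1

\[ \frac{d}{dt}\left[ m_1 l_1^2 \dot\theta_1 + m_2 l_1^2 \dot\theta_1
   + m_2 l_1 l_2 \dot\theta_2 \cos(\theta_2 - \theta_1) \right]
   - m_2 l_1 l_2 \dot\theta_1 \dot\theta_2 \sin(\theta_2 - \theta_1)
   + (m_1 + m_2) g l_1 \sin\theta_1 = 0 \tag{1.54} \]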

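and one with θ2

\[ \frac{d}{dt}\left[ m_2 l_2^2 \dot\theta_2 + m_2 l_1 l_2 \dot\theta_1 \cos(\theta_2 - \theta_1) \right]
   + m_2 l_1 l_2 \dot\theta_1 \dot\theta_2 \sin(\theta_2 - \theta_1)
   + m_2 g l_2 \sin\theta_2 = 0 \tag{1.55} \]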
These are pretty messy (but that was the point!). Things simplify a bit if we assume that
both θ1 and θ2 are small and expand to linear order. We then get

\[ \begin{aligned}
(m_1 + m_2) l_1^2 \ddot\theta_1 + m_2 l_1 l_2 \ddot\theta_2 &= -(m_1 + m_2) g l_1 \theta_1 \\
m_2 l_2^2 \ddot\theta_2 + m_2 l_1 l_2 \ddot\theta_1 &= -m_2 g l_2 \theta_2
\end{aligned} \tag{1.56} \]
These coupled equations in fact have normal mode solutions of the form

\[ \begin{aligned}
\ddot\theta_1 &= -\omega^2 \theta_1 \\
\ddot\theta_2 &= -\omega^2 \theta_2
\end{aligned} \tag{1.57} \]

i.e. the two pendulums oscillate with the same frequency.


To find ω you can try substituting in the form of the solution in (1.57) into (1.56). You’ll
find two simultaneous equations for θ1 and θ2 with two solutions. You’ll find in one case
θ1 /θ2 is positive and in the other it is negative. So in one case the pendulums swing together
and in the other case in opposite directions.
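The substitution described above can be carried out explicitly. Putting (1.57) into (1.56) gives two linear equations for θ1 and θ2; requiring a non-trivial solution leads to the quadratic m1 l1 l2 ω⁴ − (m1 + m2) g (l1 + l2) ω² + (m1 + m2) g² = 0. The sketch below solves it and reads the amplitude ratio off the second line of (1.56). The function name `normal_modes` and the sample parameters are assumptions for illustration.

```python
import math

def normal_modes(m1, m2, l1, l2, g=9.81):
    """Solve the linearised equations (1.56) with the ansatz (1.57).

    Returns a list of (omega, theta1/theta2) pairs, lowest frequency first.
    """
    # Vanishing determinant of the 2x2 system gives a quadratic in w = omega^2:
    #   a w^2 + b w + c = 0
    a = m1 * l1 * l2
    b = -(m1 + m2) * g * (l1 + l2)
    c = (m1 + m2) * g ** 2
    root = math.sqrt(b * b - 4 * a * c)
    modes = []
    for w in ((-b - root) / (2 * a), (-b + root) / (2 * a)):
        # Second line of (1.56): l2 w theta2 + l1 w theta1 = g theta2
        ratio = (g - l2 * w) / (l1 * w)
        modes.append((math.sqrt(w), ratio))
    return modes

modes = normal_modes(m1=1.0, m2=1.0, l1=1.0, l2=1.0)
for omega, ratio in modes:
    print(omega, ratio)
```

For equal masses and equal lengths l this gives ω² = (2 ∓ √2) g/l, with θ1/θ2 positive for the slower mode (the pendulums swing together) and negative for the faster one (they swing in opposite directions).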

Exercise 1.3: If a system with generalized coordinate q has the Lagrangian

\[ L = \frac{1}{2}\dot q^2 - q^3 \]

what is the Euler Lagrange equation describing the system?

Exercise 1.4: If a system with generalized coordinates ξ and ψ has the Lagrangian

\[ L = \frac{1}{2}\dot\xi^2 + \cos\xi\,\dot\psi - \xi e^{\psi} \]

what are the Euler Lagrange equations describing the system?

Exercise 1.5: Two blocks of equal mass M are connected by a flexible string of length ℓ.
One block is placed on a smooth horizontal table and the other block hangs over the edge.
Using the length z of string hanging over the edge as a generalized coordinate, write down
the Lagrangian and use the Euler–Lagrange equation to find the acceleration of the hanging
mass in the following cases:

(i) the mass of the string is negligible,

(ii) the string is heavy with mass m distributed uniformly along it.

Exercise 1.6:
(a) Show that for a non-relativistic, free particle of mass m travelling with constant velocity
v the action S describing its motion reduces to

S = mvd/2

where d is the distance travelled. This was a form for the action proposed by Maupertuis
who believed it reflected the simplicity and economy of the Creator-God....

(b) Consider such a particle rolling on a table in the x, y plane with speed v1 . Along the
y-axis there is a height discontinuity in the table which the particle can move over at the cost
of potential energy which reduces its velocity to v2 . If the particle starts at (x1 , y1 ) to the left
of the y axis and ends to the right at (x2 , y2 ) show that the action for it passing across the
y-axis at arbitrary y (assuming it travels in a straight line except when it crosses the y axis)
is given by

\[ S = m v_1 \sqrt{x_1^2 + (y - y_1)^2} + m v_2 \sqrt{x_2^2 + (y - y_2)^2} \]

By minimizing the action deduce the relation

v1 sin θ1 = v2 sin θ2

where the angles are the angles between the particle’s direction of motion and the x axis
before and after it crosses the y axis. Contrast this result with Snell’s Law for light.

1.3 Conservation Laws


Finally let's look at one of the most surprising pieces of insight to come out of the Lagrangian
formalism - that is a deeper understanding of conservation laws. The mathematics of this is
discussed in more detail in Appendix 1.5.

1.3.1 Ignorable Coordinates


If the Lagrangian does not depend on some coordinate qi it is called an ignorable coordinate.
Then ∂L/∂qi = 0 and its associated generalized momentum is conserved, as we can see from
the Euler Lagrange equation

\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i}
   = \frac{dp_i}{dt} = 0 \tag{1.58} \]
so
pi = constant (1.59)

This is clearly a mathematical fact but there is a deeper interpretation. If L only depends
on q̇i not qi itself then we can shift

qi → qi + const (1.60)
and leave the Lagrangian, L, (and hence the physics) invariant. This is a symmetry - transla-
tion invariance in the qi direction.
Thus we learn that the true relation is

symmetry → conserved momentum

This is a new insight we have not seen before in Newtonian mechanics.

1.3.2 Energy Conservation


Consider the case that L does not depend explicitly on t. This implies that a quantity known
as the Hamiltonian is conserved. The Hamiltonian is defined as

\[ H = \sum_i \frac{\partial L}{\partial \dot q_i}\dot q_i - L \tag{1.61} \]
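To prove that it is conserved we explicitly calculate

\[ \frac{dH}{dt} = \sum_i \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right)\dot q_i
   + \sum_i \frac{\partial L}{\partial \dot q_i}\ddot q_i
   - \frac{\partial L}{\partial t}
   - \sum_i \frac{\partial L}{\partial q_i}\dot q_i
   - \sum_i \frac{\partial L}{\partial \dot q_i}\ddot q_i = 0 \tag{1.62} \]

using the Euler Lagrange equations and ∂L/∂t = 0: the first and fourth terms cancel by the
Euler Lagrange equations, while the second and fifth terms cancel identically.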


In simple systems the Hamiltonian is just the total energy of the system as we can see for
example in one dimension where

\[ L = \frac{1}{2} m \dot x^2 - V(x) \tag{1.63} \]

so using the definition above

\[ H = \frac{1}{2} m \dot x^2 + V(x) \tag{1.64} \]
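For completeness, the intermediate step in applying the definition (1.61) to the one dimensional Lagrangian (1.63) is:

```latex
H = \frac{\partial L}{\partial \dot x}\,\dot x - L
  = (m\dot x)\,\dot x - \left(\tfrac{1}{2} m \dot x^2 - V(x)\right)
  = \tfrac{1}{2} m \dot x^2 + V(x)
```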
In conclusion here we have learnt that time translation invariance implies energy conser-
vation.

1.3.3 Example - Central Forces


Consider a particle moving subject to a central force ie in a potential V (r)

[Figure 1.8: The coordinates relevant for a particle moving in a central potential. The
particle sits at distance r and angle θ from the origin O and feels a force F.]
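The kinetic energy of the particle is

\[ T = \frac{1}{2} m (\dot x^2 + \dot y^2) = \frac{1}{2} m \dot r^2 + \frac{1}{2} m r^2 \dot\theta^2 \tag{1.65} \]

thus

\[ L = \frac{1}{2} m \dot r^2 + \frac{1}{2} m r^2 \dot\theta^2 - V(r) \tag{1.66} \]

There is a Euler Lagrange equation associated with the r coordinate

\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot r}\right) - \frac{\partial L}{\partial r} = 0 \tag{1.67} \]

giving

\[ m\ddot r = m r \dot\theta^2 - \frac{\partial V}{\partial r} \tag{1.68} \]

Plus a second equation for θ, which since L is independent of θ, is just

\[ \frac{d}{dt}(m r^2 \dot\theta) = 0 \tag{1.69} \]

which tells us that angular momentum is conserved.
The Hamiltonian is also conserved and is given here by

\[ H = \frac{1}{2} m \dot r^2 + \frac{1}{2} m r^2 \dot\theta^2 + V \tag{1.70} \]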
which is the total energy.
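Both conservation laws can be seen numerically. The sketch below integrates Eqs. (1.68) and (1.69) for an attractive inverse-square potential V(r) = −k/r and compares the angular momentum m r² θ̇ and the Hamiltonian (1.70) at the start and end of the run. The choice of potential, the Runge-Kutta integrator, the parameter values and all function names are assumptions made for the sake of the example.

```python
def derivatives(state, m, k):
    """Right-hand side of Eqs. (1.68) and (1.69) for V(r) = -k / r."""
    r, theta, r_dot, theta_dot = state
    r_ddot = r * theta_dot ** 2 - k / (m * r ** 2)     # from (1.68)
    theta_ddot = -2.0 * r_dot * theta_dot / r          # from (1.69)
    return (r_dot, theta_dot, r_ddot, theta_ddot)

def rk4_step(state, dt, m, k):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = derivatives(state, m, k)
    k2 = derivatives(shifted(state, k1, dt / 2), m, k)
    k3 = derivatives(shifted(state, k2, dt / 2), m, k)
    k4 = derivatives(shifted(state, k3, dt), m, k)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def angular_momentum(state, m):
    r, _, _, theta_dot = state
    return m * r ** 2 * theta_dot

def hamiltonian(state, m, k):
    """Eq. (1.70) with V = -k / r."""
    r, _, r_dot, theta_dot = state
    return 0.5 * m * r_dot ** 2 + 0.5 * m * r ** 2 * theta_dot ** 2 - k / r

m, k = 1.0, 1.0                     # illustrative values (assumptions)
state = (1.0, 0.0, 0.0, 1.2)        # r, theta, r_dot, theta_dot
p0, h0 = angular_momentum(state, m), hamiltonian(state, m, k)
for _ in range(20000):
    state = rk4_step(state, 1e-3, m, k)
p1, h1 = angular_momentum(state, m), hamiltonian(state, m, k)
print(p1 - p0, h1 - h0)
```

Over more than one full orbit both differences stay at the level of the integration error, even though r and θ̇ individually change substantially along the ellipse.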

Exercise 1.7: If a system with generalized coordinates x and y has the Action

\[ S = \int \left( \frac{1}{2}\dot x^2 + \frac{1}{2}\dot y^2 + \cos\dot y - x \right) dt \]

what quantities are conserved?



1.3.4 Hamiltonian and Energy


Finally it is worth stressing that the Hamiltonian is not always the energy of the system. As
an example consider a bead on a hoop that is being rotated at a fixed angular velocity ω , as
shown in the diagram,

[Figure 1.9: A bead of mass m on a hoop of radius a which is rotating at angular speed ω.
The bead's position on the hoop is given by the angle θ.]

To be explicit, the hoop is in a vertical plane near the surface of the Earth, where that
vertical plane is subject to a steady rotation about a fixed axis passing through the centre of
the hoop, due to an external turning force or torque. The fact that the energy is not conserved
is due to the fact that the turning force which is required to maintain the steady rate of rotation
is external to the system. However we shall show that, even in this case, the Hamiltonian is
conserved, even though the Hamiltonian cannot be identified with the energy.
The single coordinate θ (which is a function of time t) as shown in the diagram is
sufficient to describe the position of the bead so this is a good generalized coordinate. The
kinetic energy is given by

\[ T = \frac{1}{2} m (a^2 \dot\theta^2 + a^2 \sin^2\theta\, \omega^2) \tag{1.71} \]

and the potential energy by

\[ V = -mga\cos\theta \tag{1.72} \]

Thus

\[ L = \frac{1}{2} m (a^2 \dot\theta^2 + a^2 \sin^2\theta\, \omega^2) + mga\cos\theta \tag{1.73} \]

Since L does not depend on t the Hamiltonian is conserved. In particular

\[ H = \frac{1}{2} m a^2 \dot\theta^2 - \frac{1}{2} m a^2 \sin^2\theta\, \omega^2 - mga\cos\theta \tag{1.74} \]
Although H is conserved the total energy of the system is not since to keep the hoop rotating
a constant external torque must be applied, thereby doing work on the system.
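The distinction between H and the energy can be made concrete with a short simulation. The Euler Lagrange equation that follows from (1.73) is a² θ̈ = a² ω² sin θ cos θ − g a sin θ; the sketch below integrates it and tracks both H from (1.74) and the energy T + V from (1.71) and (1.72). The integrator, parameter values and function names are illustrative assumptions.

```python
import math

def simulate(theta, theta_dot, a, omega, g, dt, steps):
    """Integrate theta'' = omega^2 sin(theta) cos(theta) - (g/a) sin(theta) with RK4."""
    def accel(th):
        return omega ** 2 * math.sin(th) * math.cos(th) - (g / a) * math.sin(th)
    for _ in range(steps):
        k1x, k1v = theta_dot, accel(theta)
        k2x, k2v = theta_dot + 0.5 * dt * k1v, accel(theta + 0.5 * dt * k1x)
        k3x, k3v = theta_dot + 0.5 * dt * k2v, accel(theta + 0.5 * dt * k2x)
        k4x, k4v = theta_dot + dt * k3v, accel(theta + dt * k3x)
        theta += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        theta_dot += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return theta, theta_dot

def hamiltonian(theta, theta_dot, m, a, omega, g):
    """Eq. (1.74)."""
    return (0.5 * m * a ** 2 * theta_dot ** 2
            - 0.5 * m * a ** 2 * math.sin(theta) ** 2 * omega ** 2
            - m * g * a * math.cos(theta))

def energy(theta, theta_dot, m, a, omega, g):
    """T + V from Eqs. (1.71) and (1.72)."""
    return (0.5 * m * a ** 2 * (theta_dot ** 2 + math.sin(theta) ** 2 * omega ** 2)
            - m * g * a * math.cos(theta))

m, a, omega, g = 1.0, 1.0, 2.0, 9.81    # illustrative values (assumptions)
start = (0.5, 0.0)                      # initial theta and theta_dot
end = simulate(*start, a, omega, g, dt=1e-3, steps=500)
dH = hamiltonian(*end, m, a, omega, g) - hamiltonian(*start, m, a, omega, g)
dE = energy(*end, m, a, omega, g) - energy(*start, m, a, omega, g)
print(dH, dE)
```

The change in H stays at the level of the integration error, while the energy changes by m a² ω² (sin²θ_end − sin²θ_start), the difference between the two expressions.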

1.4 Appendix - Calculus of Variation


In this Appendix we derive the Euler-Lagrange equation from the calculus of variation, using
a general notation which is applicable both to optics and dynamics.
Consider a set of curves q(s) between two points (q1 , s1 ) and (q2 , s2 ) in the s, q plane (we
will only consider curves where the trajectory is single valued at each value of s). 1

[Figure 1.10: An arbitrary path in the q − s plane between fixed end points (s1, q1) and
(s2, q2).]

Imagine we are interested in one curve that minimizes the quantity

\[ S[q(s)] = \int_{s_1}^{s_2} L(q, \dot q, s)\, ds \tag{1.75} \]

L is just a number at each point on a given curve determined by the values of q and s at that
point and the gradient q̇ = dq/ds. The integral sums these numbers along the line.
If the curve that minimizes S is q̄(s) we can write the other curves as deviations from it

q(s) = q̄(s) + δ q(s) (1.76)


subject to the boundary conditions

δ q(s1 ) = δ q(s2 ) = 0 (1.77)


The value of S for these curves varies from the value for q̄(s) by

δ S = S[q̄ + δ q] − S[q̄] (1.78)

Since q̄(s) is the minimum though δ S = 0 to lowest order in δ q.


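Let’s calculate S[q̄ + δq] to order δq

\[ \begin{aligned}
S[\bar q + \delta q] &= \int_{s_1}^{s_2} L(\bar q + \delta q, \dot{\bar q} + \delta\dot q, s)\, ds \\
&\simeq \int_{s_1}^{s_2} \left[ L(\bar q, \dot{\bar q}, s)
   + \delta\dot q\, \frac{\partial L}{\partial \dot q}
   + \delta q\, \frac{\partial L}{\partial q} + \dots \right] ds \\
&\simeq S[\bar q] + \int_{s_1}^{s_2} \left( \delta\dot q\, \frac{\partial L}{\partial \dot q}
   + \delta q\, \frac{\partial L}{\partial q} \right) ds + O(\delta q^2)
\end{aligned} \tag{1.79} \]

Footnote 1: For example, in the case of light in two dimensions we identify q(s) → y(x), while
for a particle in one dimension we identify q(s) → x(t), or for a simple pendulum we identify
q(s) → θ(t).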

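Integrating the second term by parts (u = ∂L/∂q̇, dv/ds = δq̇ etc)

\[ \int_{s_1}^{s_2} \delta\dot q\, \frac{\partial L}{\partial \dot q}\, ds
   = \left[ \delta q\, \frac{\partial L}{\partial \dot q} \right]_{s_1}^{s_2}
   - \int_{s_1}^{s_2} \delta q\, \frac{d}{ds}\left(\frac{\partial L}{\partial \dot q}\right) ds \tag{1.80} \]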

The first term vanishes since δ q vanishes at the ends of the path.

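Thus

\[ S[\bar q + \delta q] - S[\bar q] = -\int_{s_1}^{s_2} \delta q
   \left[ \frac{d}{ds}\left(\frac{\partial L}{\partial \dot q}\right) - \frac{\partial L}{\partial q} \right] ds + \dots \tag{1.81} \]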
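This is only zero (at order δq) for every allowed deviation δq if

\[ \frac{d}{ds}\left(\frac{\partial L}{\partial \dot q}\right) - \frac{\partial L}{\partial q} = 0 \tag{1.82} \]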

This is the Euler Lagrange equation.

In general, we will want to solve problems in more than one dimension. For example,
there may be several such generalised coordinates, qi corresponding to three dimensions,
x, y, z or multiple angles θi . The above formalism is easily adapted for such cases. The
definition of the Action above in terms of the Lagrangian (L = T − V ) remains the same,
however we now have several coordinates

qi i = 1...n (1.83)

In the derivation above of the Euler Lagrange equation, it is straightforward to take into
account deviations in the path in all of these coordinates. We would find that the change in
the action of a path close to the minimizing path would have the form
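$$ \Delta S = -\int_{s_1}^{s_2} \sum_i \delta q_i \left[ \frac{d}{ds}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} \right] ds \tag{1.84} $$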

At the minimum the coefficients of each δ qi must vanish independently so we get a set
of Euler Lagrange equations - one associated with each coordinate

 
$$ \frac{d}{ds}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} = 0 \tag{1.85} $$

Exercise 1.8: Work through the above derivation in the case where L depends on two
coordinates q and p. What two equations must then be satisfied by the minimizing curve?

1.5 Appendix - Mathematics of conservation laws


Under certain circumstances the Euler Lagrange equation simplifies from a second order
equation to a first order equation. This has important applications in Newtonian dynamics,
where the physical interpretation is the connection between symmetry and conservation laws,
although here we just focus on the mathematics.
There are two particularly interesting special cases:

1) If L(q, q̇, s) is independent of the coordinate q


 
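$$ \frac{d}{ds}\left(\frac{\partial L}{\partial \dot{q}}\right) = 0 \tag{1.86} $$
So
$$ \frac{\partial L}{\partial \dot{q}} = \text{constant} \tag{1.87} $$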

2) If L(q, q̇, s) is independent of the coordinate s

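$$ \frac{dL}{ds} = \frac{\partial L}{\partial q}\,\dot{q} + \frac{\partial L}{\partial \dot{q}}\,\ddot{q} \tag{1.88} $$
Using the Euler Lagrange equation to replace $\partial L/\partial q$ gives
$$ \frac{dL}{ds} = \dot{q}\,\frac{d}{ds}\left(\frac{\partial L}{\partial \dot{q}}\right) + \frac{\partial L}{\partial \dot{q}}\,\ddot{q} \tag{1.89} $$
The right hand side is the total derivative of $\dot{q}\,\partial L/\partial \dot{q}$, so this is just
$$ \frac{d}{ds}\left( L - \dot{q}\,\frac{\partial L}{\partial \dot{q}} \right) = 0 \tag{1.90} $$
so that

$$ L - \dot{q}\,\frac{\partial L}{\partial \dot{q}} = \text{constant} \tag{1.91} $$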
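
As a numerical illustration of case 2, the sketch below assumes a particular Lagrangian with no explicit s dependence, L = ½q̇² − ½q² (an example chosen here, not taken from the notes), integrates its Euler Lagrange equation q̈ = −q with a velocity-Verlet step, and tracks the quantity L − q̇ ∂L/∂q̇ along the solution. The function names and step size are hypothetical.

```python
def first_integral(q, qdot):
    """L - qdot * dL/dqdot for the assumed L = qdot^2/2 - q^2/2."""
    L = 0.5 * qdot**2 - 0.5 * q**2
    dL_dqdot = qdot
    return L - qdot * dL_dqdot

def evolve(q, qdot, ds=1e-3, steps=10000):
    """Integrate q'' = -q and record the first integral at every step."""
    values = [first_integral(q, qdot)]
    for _ in range(steps):
        qdot_half = qdot - 0.5 * ds * q      # half kick using q'' = -q
        q = q + ds * qdot_half               # drift
        qdot = qdot_half - 0.5 * ds * q      # second half kick
        values.append(first_integral(q, qdot))
    return values

values = evolve(q=1.0, qdot=0.0)
# The spread between min and max measures how well the quantity is conserved.
print(min(values), max(values))
```

Up to the small error of the integrator, the recorded quantity stays at its initial value, as (1.91) requires.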



Exercise 1.9: This is an exercise in using the calculus of variations outside of optics or dynamics.
A smooth curved wire connects the origin to the lower point $(x_1, y_1)$. A bead on the wire
slides without friction from rest at the upper point to the lower point under the influence of gravity.
Its mechanical energy is conserved as it moves along the wire. Choose down to be the
positive y direction.

(a) Show that the time, T, required for the bead’s journey is

$$ T = \frac{1}{\sqrt{2g}} \int_0^{x_1} \sqrt{\frac{1 + y'^2}{y}}\, dx. $$

(b) Given that the integrand of the above integral is independent of x show that the curve y(x)
making T stationary satisfies the differential equation

$$ \frac{dy}{dx} = \sqrt{\frac{b - y}{y}} $$

(c) Change the dependent variable from y to φ where $y = b \sin^2(\phi/2)$ and show that the above
can be integrated to give the brachistochrone

$$ x = \frac{b}{2}(\phi - \sin\phi) $$
Chapter 2

Special Relativity

Light travels at the very high speed $c \simeq 3 \times 10^8$ m s$^{-1}$. In the late 1800s and early 1900s
physicists realized that the familiar Newtonian laws of motion break down when particles
travel near this speed, which turns out to be a maximum speed in our Universe. Einstein
reconciled these discoveries in his Special Theory of Relativity which he wrote down in
1905. Originally these ideas emerged in Maxwell’s theory of electromagnetism but it is
now standard to present the laws of dynamics first then move to the more complicated case
of electromagnetism. This is the ordering we will take in this and the next chapter. The
Special Theory of Relativity deals with observations of dynamics by an observer moving at
a constant speed. Here we will learn how to write the laws of dynamics in a form consistent
with Special Relativity’s postulates. These laws are needed to explain essentially any event in
a particle accelerator and many observations in astronomy, and they are also crucial to our everyday
lives. For example, the GPS satellite system that our mobile phones use continually is very
sensitive to relativistic corrections arising from the satellites’ motion.

2.1 The Postulates


The two fundamental postulates of special relativity are

• The speed of light, c, is the same when measured in any inertial frame. This was the
crucial result from the Michelson-Morley experiment.

• The laws of physics are the same for an observer in any inertial frame. This is the
statement that there is no observer (for example stationary relative to some “aether”)
for whom the laws are especially simple.

An observer moving at constant speed is said to be in an inertial frame that can be thought
of as a combination of

• A rigid, stationary (relative to the observer) lattice grid by which position coordinates
are specified

• A set of synchronized clocks at each lattice point so time can be recorded.


Figure 2.1: A spatial grid with a clock at each point that can be used to specify the coordinates of an event.

With these observational tools the observer can specify any event by the set of coordinates

(x, y, z), and t (2.1)

Note that moving from one inertial frame to another is often described as performing a
boost.

2.2 Lorentz Transformations


To see the bizarre implications of the first postulate, consider a light wave front emitted from
a stationary source at the origin

Figure 2.2: A spherical light front emitted from the origin (axes x and y shown).

Let us call this inertial frame (stationary relative to the light source) frame S. The light
wave moves away from the source at speed c as a spherical shell described by

$$ x^2 + y^2 + z^2 = (ct)^2 \tag{2.2} $$

Now consider an inertial frame S′ moving with speed v in the positive x direction. For
convenience let’s set the origins of both sets of coordinates at the same place at time t = 0.

Figure 2.3: Two sets of coordinate axes, S (x, y, z) and S′ (x′, y′, z′), separating at speed v in the x direction; the origins are a distance vt apart.


The origins of the two sets of coordinates separate by a distance vt in time t.

The first postulate says that the observer in S′ sees light travel at speed c too. Thus in this
frame too the light forms a spherical shell centred on the origin in S′ described now by
$$ x'^2 + y'^2 + z'^2 = (ct')^2 \tag{2.3} $$
This is very surprising - you would have guessed that the observer moving relative to the
light source would not be in the centre of the spherical light shell.
The only way to reconcile the two viewpoints is if the two observers disagree on the
values of times and positions. The two equations for the position of the shell (2.2), (2.3) in
the two frames moving at relative speed v are reconciled by the Lorentz transformations

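$$
\begin{aligned}
t' &= \gamma\left(t - \frac{v}{c^2}x\right) \\
x' &= \gamma (x - vt) \\
y' &= y \\
z' &= z
\end{aligned}
\tag{2.4}
$$
where
$$ \gamma = \sqrt{\frac{1}{1 - \frac{v^2}{c^2}}} \tag{2.5} $$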
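
A quick numerical sanity check of (2.4) and (2.5) is sketched below. It is a minimal illustration under stated assumptions: the function names (`gamma`, `boost_x`, `interval`) are hypothetical, and units with c = 1 are used by default purely for convenience. It boosts an event lying on the light shell (2.2) and evaluates the combination (ct)² − x² − y² − z² before and after.

```python
import math

def gamma(v, c=1.0):
    """Lorentz factor for relative speed v."""
    return 1.0 / math.sqrt(1.0 - (v / c)**2)

def boost_x(t, x, y, z, v, c=1.0):
    """Coordinates of the event (t, x, y, z) in a frame moving at speed v along +x."""
    g = gamma(v, c)
    return (g * (t - v * x / c**2), g * (x - v * t), y, z)

def interval(t, x, y, z, c=1.0):
    """(ct)^2 - x^2 - y^2 - z^2, which vanishes on the light shell."""
    return (c * t)**2 - x**2 - y**2 - z**2

event = (5.0, 3.0, 4.0, 0.0)            # satisfies x^2 + y^2 + z^2 = (ct)^2
boosted = boost_x(*event, v=0.6)
print(interval(*event), interval(*boosted))

# Slow-moving limit: the boost should reduce to t' ~ t, x' ~ x - v t.
print(boost_x(1.0, 2.0, 0.0, 0.0, v=1e-6))
```

An event on the spherical shell in S stays on the spherical shell in S′, which is the content of the first postulate.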

Exercise 2.1: Explicitly check that substituting the Lorentz transformations into (2.3) one
obtains (2.2).

Exercise 2.2: How would the Lorentz transformations differ if the boost was in the z
direction rather than the x direction?

An immediate check we should make on these transformations is that they make sense in
the slow moving world we live in. When v ≪ c, γ ≃ 1 and

t ′ ≃ t, x′ ≃ x − vt (2.6)

which is indeed what we would expect.


The Lorentz transformations imply that observers moving relative to each other will not
agree on the simultaneity of events. For example if a stationary observer sees an event happen
at t = 0 a distance of 10m away

(t = 0, x = 10) (2.7)
Then an observer moving in the x direction at speed v will record the event as occurring at a
time

$$ t' = -\gamma \frac{v}{c^2}(10) \tag{2.8} $$
i.e. earlier than when the two observers passed each other (t = t′ = 0). The implication of
this is that the observers do not agree on measurements of periods and lengths.

2.2.1 Time Dilation


Imagine an observer at the origin in the frame S marks a second by letting off two flashes of
light separated by 1 second on his watch. The flashes of light are two events with coordinates

(x = 0, t = 0) (x = 0, t = 1) (2.9)
The moving observer in the S′ frame sees the events as
$$ (x' = 0,\ t' = 0) \qquad (x' = -\gamma v t,\ t' = \gamma) \tag{2.10} $$
The S′ observer has recorded a time
$$ \gamma = \left(1 - \frac{v^2}{c^2}\right)^{-1/2} \geq 1 \tag{2.11} $$
longer than one second. The S′ observer therefore declares that the S observer’s watch (which
is moving relative to S′ ) is running slow.

A moving clock runs slow

2.2.2 Lorentz Contraction


Consider a ruler of length L at rest in the frame S. An observer in S might make measure-
ments of the position of the two ends to deduce its length. Those measurements can be
represented by the events

(t = 0, x = 0), (t = 0, x = L) (2.12)
A moving observer in the frame S′ watches this process and is somewhat bemused. He
sees the measurement events as
$$ (t' = 0,\ x' = 0) \qquad \left(t' = -\gamma \frac{v}{c^2} L,\ x' = \gamma L\right) \tag{2.13} $$
The measurements were taken according to S′ at different times. Remember that S′ sees the
ruler moving, so if you measure the end points at different times you’ll not correctly measure
the length.
S′ wants S to make the second measurement at t ′ = 0. In S the position of the ruler doesn’t
change but when should S make the measurement so that S′ says t ′ = 0?


$$ t' = \gamma\left(t - \frac{v}{c^2} L\right) = 0 \tag{2.14} $$

thus
$$ t = \frac{v}{c^2} L \tag{2.15} $$
(The S observer doesn’t see what is special about this time of course!)
Now where is the second end in the S′ coordinates when this new S measurement is
made?

$$ x' = \gamma (x - vt) = \gamma\left(L - \frac{v^2}{c^2} L\right) = \frac{L}{\gamma} \tag{2.16} $$
Thus S′ says the two correct simultaneous measurements of the end points are
$$ (t' = 0,\ x' = 0) \qquad \left(t' = 0,\ x' = \frac{L}{\gamma}\right) \tag{2.17} $$
S′ therefore sees the moving ruler to be shorter by a factor of γ relative to S.

Exercise 2.3: Repeat the computation of the length of the ruler in the frame S′ but assuming
that the ends of the ruler are at the points x = 1m, x = 2m in the S frame. Show that the
contracted length is again L/γ .

Moving objects contract in the direction of motion.
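
The steps of the contraction argument can be traced numerically. The following sketch is illustrative only: the helper names (`boost`, `second_end_measurement`) are hypothetical, c = 1 units are assumed by default, and the ruler is taken to lie between x = 0 and x = L as in the text.

```python
import math

def boost(t, x, v, c=1.0):
    """Lorentz transformation of the event (t, x) for a boost of speed v along x."""
    g = 1.0 / math.sqrt(1.0 - (v / c)**2)
    return g * (t - v * x / c**2), g * (x - v * t)

def second_end_measurement(L, v, c=1.0):
    """Event at the far end of the ruler, timed in S so that S' records t' = 0."""
    t_in_S = v * L / c**2                # the special time from (2.15)
    return boost(t_in_S, L, v, c)

v, L = 0.6, 1.0
t_prime, x_prime = second_end_measurement(L, v)
g = 1.0 / math.sqrt(1.0 - v**2)
# t_prime should vanish and x_prime should equal L / gamma.
print(t_prime, x_prime, L / g)
```

The same helper can be pointed at other end points to explore the set-up of Exercise 2.3 numerically, although that does not replace the algebra.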

2.3 An Analogy to Rotations


It’s helpful to think of the Lorentz transformations as a generalization of the idea of rotations
in the following sense.
Consider first rotations in two dimensions. We can set up two observers who are using
coordinates rotated by an angle θ relative to each other

Figure 2.4: The motion of coordinate axes under a rotation.
The coordinates transform between the two coordinate systems as

$$
\begin{aligned}
x' &= x\cos\theta + y\sin\theta \\
y' &= y\cos\theta - x\sin\theta
\end{aligned}
\tag{2.18}
$$

The different coordinate choices are in a sense a distraction from the physics involved
(of say a moving particle) which is really the same for the observer using either coordinates.
The elegant way to express this is to use vectors. The vector (eg from the origin to a particle)
is the same for both observers although its components may be different for the different
observers. We write

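$$ \vec{x} \quad \text{or} \quad \mathbf{x} = (x, y) \tag{2.19} $$
The coordinate transformation can then be written as a matrix multiplication on the vector
$$ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{2.20} $$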

There is something invariant about the position of a particle under rotations - its distance
from the origin, i.e.
$$ L^2 = x^2 + y^2 = x'^2 + y'^2 \tag{2.21} $$
We can extract this from the vector by the dot product of the vector with itself

$$ L^2 = \vec{x} \cdot \vec{x} \tag{2.22} $$

Now consider Lorentz transformations in the x and t directions where the coordinates are
mixed up by a boost. The Lorentz transformations, although not exactly like the mixing of
spatial coordinates under rotations, do have a similar form. Let’s try to draw a diagram with
the coordinate axes of two different inertial frame observers both shown.
We begin with one stationary observer’s coordinates in the x − (ct) plane. We use ct
rather than just t because it has the same dimensions as x.

Figure 2.5: The path light follows in the x − ct plane.

Note that light travels on the line at 45° to the axes since it reaches a distance x = ct in
time t.
We can use the Lorentz transformations to plot the position of the equivalent axes in a
frame moving relative to this frame. The coordinate axes are given when ct ′ = 0 and x′ = 0
so

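$$ ct' = \gamma\, ct - \gamma \frac{v}{c} x, \qquad ct' = 0 \;\to\; ct = \frac{v}{c} x \tag{2.23} $$
$$ x' = \gamma x - \gamma \frac{v}{c}\, ct, \qquad x' = 0 \;\to\; ct = \frac{c}{v} x \tag{2.24} $$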

Now we can plot these axes in the original frame’s coordinates

Figure 2.6: Coordinate axes before (x, ct) and after (x′, ct′) a boost superimposed, together with the light path.

The marked lines are the S′ coordinate axes - they agree with the original coordinates as
to the point (0,0). The plot also shows the grid x′ = 0, 1, 2.. ct ′ = 0, 1, 2.. etc. Note that in
the new coordinate system the path light takes is given by the same line - it goes through the
points (0,0) (1,1) (2,2) etc.
This is an equivalent plot to the one we drew for rotations. We can place an event on the
plot and then read off its coordinates in either the original frame using the square grid or in
the boosted frame using the skewed grid.

The grid can be used to see time dilation and length contraction
Figure 2.7: Coordinate axes before (x, ct) and after (x′, ct′) a boost, with the light path and with events marked relevant to measuring a time and a length.

The circles are events positioned at x′ = 0 every second in S′ . Reading the time of the
event on the original axes though shows that S sees more than 1s having passed between
events - a clock in a moving inertial frame measures time more slowly - time dilation.
The solid line represents a rod in S′ . In S if we measure distance at the same time for
each end we get a smaller length - lengths appear contracted in a moving inertial frame.

Although x and t change between S and S′ in this picture, just as the coordinates did for
rotations, we want a frame invariant way to discuss events. This will lead us to introduce vectors
in this plane which have space and time like components. In the rotation case the vector had
an invariant length that was the same for all observers. For Lorentz transformations we have
shown in Section 2.2 that the quantity

$$ (ct)^2 - |\vec{x}|^2 = \text{constant} \tag{2.25} $$


is left invariant. This will be the “length” of our new “4-vectors”.

Exercise 2.4: Sketch a space-time diagram showing a stationary and a moving coordinate
frame with relative speed v. Explain from the diagram how a ruler lying on the x-axis of the
moving frame between x′ = 1 and x′ = 2 is seen contracted in the stationary frame.

2.4 Four Vectors


Our previous discussion leads us to consider a four component vector with ct as the time like
component, and x y and z position components describing an event or object. We will write
this four-vector as

$$ x^\mu = (ct, x, y, z) = (x^0, x^1, x^2, x^3) \tag{2.26} $$


In this notation the index µ on $x^\mu$ takes the values 0, 1, 2, 3 corresponding to the components
as shown.
We have identified two properties of the four-vector already. Firstly under Lorentz boosts
in the positive x direction by speed v it transforms as

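$$
x^\mu \to x'^\mu =
\begin{pmatrix}
\gamma & -\frac{v}{c}\gamma & 0 & 0 \\
-\frac{v}{c}\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}
\tag{2.27}
$$
Secondly we know that it has a Lorentz invariant length

$$ (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 = (ct)^2 - |\vec{x}|^2 \tag{2.28} $$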

2.4.1 Index Convention


At this point we are going to adopt a rather compact notation for multiplying four-vectors.
It will take a little getting used to but is not intrinsically deep! There are two rules that will
apply to the “µ ” index on a four vector

• A given label for an index may occur at most twice in any term in an expression.

• A repeated index is said to be “contracted”. Typically people write a repeated index
once up and once down. Such a repeated index is “summed over”.

The best way to explain this is with an example. We can write the Lorentz transformation
of $x^\mu$ in the following form
$$ x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu \tag{2.29} $$
The new object $\Lambda^\mu{}_\nu$ has two indices each of which can take the values 0, 1, 2, 3 and so there
are 4 × 4 = 16 components. These 16 components are just the 16 components of the Lorentz
transformation matrix we’ve written above (for example let µ count the row and ν the
column).
In the expression the ν index occurs twice and this implies we must let ν take all possible
values and add up the answers we get in each case.

For example, consider the case where we set µ = 0 then



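$$
\begin{aligned}
x'^0 &= \Lambda^0{}_\nu x^\nu \\
&= \Lambda^0{}_0 x^0 + \Lambda^0{}_1 x^1 + \Lambda^0{}_2 x^2 + \Lambda^0{}_3 x^3 \\
&= \gamma x^0 - \gamma \frac{v}{c} x^1
\end{aligned}
\tag{2.30}
$$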
This has reproduced the Lorentz transformation for x0 = ct.
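
The summation convention translates directly into a loop over the repeated index. The sketch below is a minimal illustration: `boost_matrix` and `transform` are hypothetical names, β stands for v/c, and the matrix entries follow (2.27) with µ labelling the row and ν the column.

```python
import math

def boost_matrix(beta):
    """The 16 components Lambda[mu][nu] for a boost along x, with beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [g, -g * beta, 0.0, 0.0],
        [-g * beta, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(Lam, x):
    """x'^mu = Lambda^mu_nu x^nu: the repeated index nu is summed over."""
    return [sum(Lam[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]

x = [5.0, 3.0, 4.0, 0.0]                 # components (ct, x, y, z)
x_prime = transform(boost_matrix(0.6), x)
print(x_prime)
```

For each fixed µ the inner sum runs over the four values of ν, which is exactly the expansion written out in (2.30) for µ = 0.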

Exercise 2.5: Convince yourself that equations (2.27) and (2.29) both reproduce the four
equations (2.4).

We can also write the Lorentz invariant length in this way. Formally we do this as follows.
We define a two index object called the metric with the 16 components
 
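$$
g_{\mu\nu} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}
\tag{2.31}
$$
Now we can write

$$ x_\mu = g_{\mu\nu} x^\nu \tag{2.32} $$
This four vector with a lowered index has components

$$ x_\mu = (ct, -x, -y, -z) \tag{2.33} $$


So finally we can define the length of the four vector as

$$
\begin{aligned}
x^\mu x_\mu &= x^0 x_0 + x^1 x_1 + x^2 x_2 + x^3 x_3 \\
&= (ct)^2 - x^2 - y^2 - z^2
\end{aligned}
\tag{2.34}
$$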
This notation, which is common, is a little sloppy because the lowered index on a four
vector secretly contains the metric and its minus signs. In practice you may just want to
remember to insert the minus signs as they appear in the above expression when you contract
the indices on four vectors, as here, rather than always write the metric factors! BEWARE
though that these minus signs do not appear in the Lorentz transformation expression (2.29),
where $\Lambda^\mu{}_\nu$ is not a four vector like object!
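
To make the role of the metric concrete, the sketch below lowers an index explicitly and contracts two four vectors, then repeats the contraction after a boost. It is illustrative only: the names `lower`, `dot` and `boost_x` are hypothetical, β = v/c, and the sample vectors are arbitrary choices (deliberately different from those in the exercise that follows).

```python
import math

METRIC = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]

def lower(x):
    """x_mu = g_{mu nu} x^nu."""
    return [sum(METRIC[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]

def dot(x, y):
    """The contraction x^mu y_mu, with the minus signs supplied by the metric."""
    y_low = lower(y)
    return sum(x[mu] * y_low[mu] for mu in range(4))

def boost_x(x, beta):
    """Boost along x; no metric factors appear in this transformation."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [g * (x[0] - beta * x[1]), g * (x[1] - beta * x[0]), x[2], x[3]]

a = [5.0, 3.0, 4.0, 0.0]
b = [2.0, 1.0, 0.0, 1.0]
print(dot(a, a), dot(a, b))
print(dot(boost_x(a, 0.6), boost_x(a, 0.6)), dot(boost_x(a, 0.6), boost_x(b, 0.6)))
```

Both contractions come out the same before and after the boost, while the individual components change.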

Exercise 2.6: Calculate $x^\mu x_\mu$ and $x^\mu y_\mu$ for the four vectors

$$ x^\mu = (3, 1, 0, 2) \qquad y^\mu = (4, 5, 3, 0) $$

Show explicitly by performing a Lorentz boost by speed v in the x-direction that these
products are Lorentz invariant.

2.5 The Laws of Dynamics


We have seen the consequences of relativity for observations of lengths and periods. Now
we will turn to thinking about how to formulate the laws of dynamics. Simple Newtonian
formulae such as f = ma do not work because they contain time dependence and different
observers don’t agree on lengths of time.
Our guiding principle should be the second postulate which says that physical laws
should be the same for an observer in any inertial frame. We will cast the laws in a way
where this is manifestly true. Four-vectors will be the tool that allows this since they are a
frame invariant way of describing the properties of a particle. Our laws will only

• contain Lorentz invariant quantities such as $x^\mu x_\mu$

• or take the form $X^\mu = Y^\mu$

This latter form is explicitly Lorentz invariant because the two sides of the equation
transform in the same way under Lorentz transformations.
So far we only have a four-vector describing position. We will now construct four-vectors
describing the kinematic properties of a particle.

2.5.1 Four-velocity
It is not sensible to use

$$ v = \frac{dx^\mu}{dt} \tag{2.35} $$
as our definition of velocity because both $x^\mu$ and t transform under Lorentz boosts. The
resulting transformation is very messy.
Ideally we would like a measure of time that is Lorentz invariant so that v would
transform only through the transformation of $x^\mu$. It would then be a four-vector itself. Such a
Lorentz invariant measure of time is
Proper Time: the time elapsed on a clock in the rest frame of a moving object. Essentially
we imagine that everything has a watch and we time an event for the object by the time on its
watch not the observer’s. Observers in any reference frame will then get the same answer.
Finally we can make a sensible choice for our variable four-velocity

$$ u^\mu = \frac{dx^\mu}{d\tau} \tag{2.36} $$
Let’s stress again that this four-vector transforms just like $x^\mu$ under boosts, i.e.
$$ u'^\mu = \Lambda^\mu{}_\nu u^\nu \tag{2.37} $$
It is useful to know how four-velocity relates to the more standard velocity measured by
an observer using his own watch (we can call this coordinate velocity)

$$ u^\mu = \frac{dx^\mu}{d\tau} = \frac{dx^\mu}{dt}\frac{dt}{d\tau} \tag{2.38} $$

We can work out dt/dτ from the Lorentz transformations. τ is the time in the rest frame,
where the particle sits at the origin, so in a moving frame

$$ t = \gamma\tau \;\to\; \frac{dt}{d\tau} = \gamma \tag{2.39} $$

Thus the components of four-velocity are

$$ u^\mu = \gamma (c, v_x, v_y, v_z) \tag{2.40} $$
From this expression we can finally work out the invariant “length” of this four vector from
the product

$$ u^\mu u_\mu = \gamma^2 (c^2 - |\vec{v}|^2) = c^2 \tag{2.41} $$
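
The invariant length (2.41) is easy to confirm for any sub-luminal coordinate velocity. The sketch below is a minimal check with hypothetical helper names (`four_velocity`, `minkowski_square`) and, as an assumption for convenience, units with c = 1 by default.

```python
import math

def four_velocity(vx, vy, vz, c=1.0):
    """u^mu = gamma (c, vx, vy, vz) for coordinate velocity (vx, vy, vz)."""
    v_squared = vx**2 + vy**2 + vz**2
    g = 1.0 / math.sqrt(1.0 - v_squared / c**2)
    return (g * c, g * vx, g * vy, g * vz)

def minkowski_square(a):
    """a^mu a_mu with the (+, -, -, -) signs."""
    return a[0]**2 - a[1]**2 - a[2]**2 - a[3]**2

for velocity in [(0.0, 0.0, 0.0), (0.3, 0.4, 0.5), (0.99, 0.0, 0.0)]:
    u = four_velocity(*velocity)
    # Each line should show c^2 = 1 regardless of the velocity chosen.
    print(velocity, minkowski_square(u))
```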

2.5.2 Four Acceleration


The definition of acceleration is now straightforward

$$ a^\mu = \frac{du^\mu}{d\tau} \tag{2.42} $$

Again it’s worth stressing that this object is a four-vector which transforms in the same way
as $x^\mu$.

2.5.3 Four Momentum


The natural generalization of momentum is given by

$$ p^\mu = m u^\mu = m \frac{dx^\mu}{d\tau} \tag{2.43} $$

Here we have introduced the mass of the particle, m - it is a constant, intrinsic property of
the particle.
$p^\mu$ is again a four-vector that transforms as
$$ p'^\mu = \Lambda^\mu{}_\nu p^\nu \tag{2.44} $$

Interestingly though we have been led to introduce a time-like version of momentum.


What does this correspond to? To find out we should take the classical limit of the theory
(v ≪ c) and see what it corresponds to in Newtonian dynamics. Remember that the time like
component of four-velocity was u0 = γ c so

$$
\begin{aligned}
p^0 &= mc\gamma \\
&= mc\,(1 - v^2/c^2)^{-1/2} \\
&\simeq mc\left(1 + \frac{1}{2}\frac{v^2}{c^2} + \dots\right)
\end{aligned}
\tag{2.45}
$$

The first term is a constant. The second term, $\frac{1}{2}mv^2/c$, is recognizable since $\frac{1}{2}mv^2$ is kinetic
energy in the low v limit. This suggests we should interpret $p^0$ as the relativistic version of
energy (divided by c). Then we have a surprising interpretation of the first, constant, term -
a particle at rest has energy

$$ E_{rest} = mc^2 \tag{2.46} $$


We can write the components of $p^\mu$ in a number of ways now
$$ p^\mu = \left(\frac{E}{c}, \vec{p}\right) = m u^\mu = m\gamma (c, \vec{v}) \tag{2.47} $$
The relativistic expression for energy is therefore

$$ E = \gamma m c^2 \tag{2.48} $$
and the relativistic version of kinetic energy (the energy when moving minus the energy at
rest)
$$ T = (\gamma - 1) m c^2 \tag{2.49} $$
The invariant length of the four-vector follows from $u^\mu u_\mu = c^2$ so

$$ p^\mu p_\mu = \frac{E^2}{c^2} - |\vec{p}|^2 = m^2 c^2 \tag{2.50} $$
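
The relations (2.47) to (2.50) can be checked numerically. The sketch below is an illustration under assumptions not in the notes: it uses an electron mass of 0.511 MeV/c² as sample input, works in units with c = 1 by default, and the function names are hypothetical.

```python
import math

def four_momentum(m, v, c=1.0):
    """Return (E/c, p) for a particle of mass m moving along x at speed v."""
    g = 1.0 / math.sqrt(1.0 - (v / c)**2)
    return g * m * c, g * m * v

def kinetic_energy(m, v, c=1.0):
    """T = (gamma - 1) m c^2."""
    g = 1.0 / math.sqrt(1.0 - (v / c)**2)
    return (g - 1.0) * m * c**2

m = 0.511                                  # MeV/c^2, assumed sample mass
p0, p1 = four_momentum(m, 0.8)
# The invariant E^2/c^2 - |p|^2 should reproduce m^2 c^2.
print(p0**2 - p1**2, m**2)

# At low speed T should approach the Newtonian value m v^2 / 2.
print(kinetic_energy(m, 0.01), 0.5 * m * 0.01**2)
```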

Exercise 2.7: Calculate by explicitly performing a boost the relativistic energy and
momentum of a proton moving at speed v = 0.5c. The rest mass of a proton is approximately
1 GeV/c².

2.5.4 Hypothesis for Dynamical Law


Armed with these four-vector variables we can now have a guess as to the form of the rela-
tivistic version of Newton’s second law. The obvious equation to try is

$$ f^\mu = \frac{dp^\mu}{d\tau} \tag{2.51} $$

This is manifestly Lorentz invariant and has the correct non-relativistic limit if $f^\mu$ is a
relativistic extension of force. As yet though we haven’t mentioned forces and we won’t until
we discuss electro-magnetism! In fact this guess is the correct law.
The law tells us something interesting even when $f^\mu = 0$

$$ \frac{dp^\mu}{d\tau} = 0 \;\to\; p^\mu = \text{constant} \tag{2.52} $$

In other words, if no external force acts on a system four-momentum is conserved. This is
the relativistic analogue of conservation of energy ($p^0$) and conservation of the usual three
component momentum ($p^1, p^2, p^3$).

2.6 Physics with Four-Momentum


To gain experience with four-vectors we will now look at four physics problems where using
four-momentum makes the solutions much easier than without.
To make our life easier we will use a trick that is common. Instead of using the usual
units system we will work in a new system where

c=1 (2.53)
In other words we redefine the unit of length so that it is the distance light travels in 1 second!
This would not be sensible for everyday life but in problems where everything is travelling
at the speed of light a meter is an absurdly small distance. In practice we will be able to drop
all the factors of c from computations. It’s pretty easy to put them back into the final answer
using dimensional analysis as we will see.

2.6.1 The Doppler Effect


What frequency will an observer see a light wave have if he is moving relative to it?

Figure 2.8: A photon approaching an observer.

Consider first a static observer in the frame of the light source. The photons of light carry
four momentum

$$ p^\mu = (E, \vec{p}) = \left(hf, -\frac{h}{\lambda}\hat{x}\right) = (hf, -hf, 0, 0) \tag{2.54} $$
Note that the photon is moving in the negative x-direction towards the observer. We have
used the quantum mechanical relations between the energy and frequency of the photon and
between its momentum and wavelength. We have also used f λ = c = 1.
We can now ask what would happen to the frequency of the light if the observer was
moving in the positive x-direction at speed v. We just perform a boost on the four-vector
    
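$$
p'^\mu =
\begin{pmatrix}
\gamma & -v\gamma & 0 & 0 \\
-v\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} hf \\ -hf \\ 0 \\ 0 \end{pmatrix}
=
\begin{pmatrix} \gamma (1 + v) hf \\ -\gamma (1 + v) hf \\ 0 \\ 0 \end{pmatrix}
\tag{2.55}
$$
Now if we just concentrate on the time-like component we have
$$ p'^0 = E' = hf' = (1 + v)\sqrt{\frac{1}{1 - v^2}}\; hf = \sqrt{\frac{(1 + v)^2}{(1 + v)(1 - v)}}\; hf \tag{2.56} $$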

or
$$ f' = \sqrt{\frac{1 + v}{1 - v}}\; f \tag{2.57} $$
Finally we can reintroduce the factors of c since the factors of (1 + v) are not dimensionally
correct. We should have
$$ f' = \sqrt{\frac{1 + v/c}{1 - v/c}}\; f \tag{2.58} $$
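
The boost calculation and the closed form (2.58) can be compared directly. The sketch below is illustrative: the function names are hypothetical, and the boost version assumes units with h = c = 1 so that the photon four-momentum is (f, −f, 0, 0) as in (2.54).

```python
import math

def doppler_by_boost(f, v):
    """Time-like component of the boosted photon four-momentum (h = c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v**2)
    energy, px = f, -f                 # photon moving in the negative x direction
    return g * (energy - v * px)       # first row of the boost matrix

def doppler_formula(f, v, c=1.0):
    """Closed form with the factors of c restored."""
    return f * math.sqrt((1.0 + v / c) / (1.0 - v / c))

for v in (0.1, 0.6, 0.9):
    # The two columns should agree for every speed.
    print(v, doppler_by_boost(1.0, v), doppler_formula(1.0, v))
```

An observer moving towards the oncoming photon sees a higher frequency, and the two routes to the answer agree.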

2.6.2 The Compton Effect


The Compton Effect relates the angle of scattering of a photon off a static electron to its final
wavelength. The classic experiment is schematically

Figure 2.9: A schematic of the Compton experiment: a monochromatic x-ray source of wavelength λ illuminates static free electrons in a metal target, and the wavelength shift ∆λ(θ) is measured at scattering angle θ.

You’ve probably calculated this relationship for the change in the wavelength of the pho-
ton as a function of its scattering angle, ∆λ (θ ), previously. Using four-momentum will get
us to the answer much quicker.
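Set up the four momentum of the particles to be:

initial photon: $p^\mu_{\gamma i} = \left(\frac{h}{\lambda}, \frac{h}{\lambda}\hat{x}\right)$

initial electron: $p^\mu_{e i} = (m_e, 0)$

final photon: $p^\mu_{\gamma f} = \left(\frac{h}{\lambda'}, \frac{h}{\lambda'}\hat{f}\right)$

final electron: $p^\mu_{e f}$

Here $\hat{f}$ is a unit vector in the direction of the motion of the final photon, which is at an angle
θ to the x axis. The initial and final wavelengths λ and λ′ are written $\lambda_i$ and $\lambda_f$ in what follows.
Since no external force acts, four momentum is conserved in the collision so
$$ p^\mu_{\gamma i} + p^\mu_{e i} = p^\mu_{\gamma f} + p^\mu_{e f} \tag{2.59} $$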

It turns out to be helpful to rearrange this equation so that $p^\mu_{e f}$ is isolated - we know least
about $p^\mu_{e f}$ so will want to eliminate it
$$ p^\mu_{\gamma i} + p^\mu_{e i} - p^\mu_{\gamma f} = p^\mu_{e f} \tag{2.60} $$
Now we consider the Lorentz invariant product

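$$
\begin{aligned}
p^\mu_{e f}\, p_{e f \mu} &= m_e^2 \\
&= (p^\mu_{\gamma i} + p^\mu_{e i} - p^\mu_{\gamma f})(p_{\gamma i \mu} + p_{e i \mu} - p_{\gamma f \mu}) \\
&= p^\mu_{\gamma i} p_{\gamma i \mu} + p^\mu_{e i} p_{e i \mu} + p^\mu_{\gamma f} p_{\gamma f \mu} + 2\left(p^\mu_{\gamma i} p_{e i \mu} - p^\mu_{\gamma i} p_{\gamma f \mu} - p^\mu_{\gamma f} p_{e i \mu}\right) \\
&= 0 + m_e^2 + 0 + 2\left(\frac{h}{\lambda_i} m_e - 0\right) - 2\frac{h}{\lambda_i}\frac{h}{\lambda_f}(1 - \cos\theta) - 2\left(\frac{h}{\lambda_f} m_e - 0\right)
\end{aligned}
\tag{2.61}
$$

We have used two crucial facts here. Firstly when the four momentum of a particle is
contracted with itself we simply obtain the invariant mass squared ($m_e^2$ for the electron, zero
for the photon). Secondly we have used the contraction
law $p_1^\mu p_{2\mu} = (p_1^0 p_2^0 - \vec{p}_1 \cdot \vec{p}_2)$.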
Rearranging we find
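$$ \frac{h}{\lambda_i} m_e - \frac{h}{\lambda_f} m_e = \frac{h}{\lambda_i}\frac{h}{\lambda_f}(1 - \cos\theta) \tag{2.62} $$
Multiplying through by $\lambda_i \lambda_f/(h m_e)$ gives
$$ \lambda_f - \lambda_i = \frac{h}{m_e}(1 - \cos\theta) \tag{2.63} $$
which is the answer we want. Again we can insert c on dimensional grounds
$$ \lambda_f - \lambda_i = \frac{h}{m_e c}(1 - \cos\theta) \tag{2.64} $$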
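
To get a feel for the size of the effect, (2.64) can be evaluated in SI units. The sketch below is an illustration only: the function names are hypothetical, the constants are standard SI values supplied here rather than taken from the notes, and the incident wavelength is an arbitrary x-ray value.

```python
import math

H = 6.62607015e-34       # Planck constant in J s
M_E = 9.1093837015e-31   # electron mass in kg
C = 299_792_458.0        # speed of light in m/s

def compton_shift(theta):
    """lambda_f - lambda_i in metres for scattering angle theta (radians)."""
    return H / (M_E * C) * (1.0 - math.cos(theta))

def scattered_wavelength(lambda_i, theta):
    """Final photon wavelength for an initial wavelength lambda_i."""
    return lambda_i + compton_shift(theta)

lambda_i = 7.1e-11       # metres, assumed sample x-ray wavelength
for degrees in (0, 45, 90, 180):
    theta = math.radians(degrees)
    print(degrees, compton_shift(theta), scattered_wavelength(lambda_i, theta))
```

The shift is of order a few picometres and is independent of the incident wavelength, which is why x-rays rather than visible light are needed to see it.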

2.6.3 Fixed Target Experiments


A simple way in which to create fundamental particles is by colliding a high energy proton
or electron into a fixed target of, for example, lead.
Figure 2.10: A proton (p) collision with a fixed lead (Pb) target generating a pion (π) that then decays to a muon (µ) and a muon neutrino (ν_µ).

It’s not immediately obvious how much energy is available to make rest mass energy of
the new particle because momentum conservation requires the final state to be moving and
have kinetic energy. A sensible thing to do is to move to the Centre of Mass frame where
the particle and target (a particle in the wall) approach each other with equal and opposite
momentum.

Figure 2.11: The lab frame with one fixed target and the centre of mass (CoM) frame where both
particles have equal momentum. In the lab frame particle a has $p_a^\mu = (E_a, p_a)$ and the target b has $p_b^\mu = (m_b, 0)$.

In this frame the particle produced will be at rest and all the energy of the initial state
will become rest mass energy of the product.
We can work out the Lorentz boost needed to move from the original “lab” frame to the
centre of mass frame. We boost the four-momenta in the lab frame by an amount v
    
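$$ p'^\mu_a = \begin{pmatrix} \gamma & -\gamma v \\ -\gamma v & \gamma \end{pmatrix} \begin{pmatrix} E_a \\ p_a \end{pmatrix} = \begin{pmatrix} \gamma (E_a - v p_a) \\ \gamma (p_a - v E_a) \end{pmatrix} \tag{2.65} $$
$$ p'^\mu_b = \begin{pmatrix} \gamma & -\gamma v \\ -\gamma v & \gamma \end{pmatrix} \begin{pmatrix} m_b \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma m_b \\ -\gamma v m_b \end{pmatrix} \tag{2.66} $$
In the Centre of Mass frame the momenta must be equal and opposite so
$$
\begin{aligned}
p'_{a x} &= -p'_{b x} \\
\gamma v m_b &= \gamma (p_a - v E_a)
\end{aligned}
\tag{2.67}
$$
and the required boost is by

$$ \frac{v}{c} = \frac{p_a}{m_b c + E_a/c} \tag{2.68} $$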

If after this boost the particles are ultra-relativistic so that $E_a \simeq |p_a| = |p_b| \simeq E_b$ then the
total available energy is
$$ E_{CoM} = 2\gamma m_b c^2 = \sqrt{\frac{4 m_b^2 c^4}{1 - \left(\frac{p_a}{m_b c + E_a/c}\right)^2}} = \sqrt{\frac{4 m_b^2 c^4 (m_b c + E_a/c)^2}{(m_b c + E_a/c)^2 - p_a^2}} \tag{2.69} $$

If we now expand in the limit with $E_a \gg m_b c^2, m_a c^2$ we find (remember that in this high
energy limit $E_a/c = p_a$; the factors of c are dropped again here)
$$ E_{CoM} = \sqrt{\frac{4 m_b^2 E_a^2}{2 m_b E_a}} \tag{2.70} $$

$$ E_{CoM} = \sqrt{2 m_b E_a} \tag{2.71} $$

We could have obtained this result more quickly by calculating the invariant rest mass of
the whole system in the original coordinates
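$$
\begin{aligned}
p^\mu_{TOT}\, p_{TOT \mu} &= m_{TOT}^2 c^2 \\
&= (p_a^\mu + p_b^\mu)(p_{a \mu} + p_{b \mu}) \\
&= p_a^\mu p_{a \mu} + p_b^\mu p_{b \mu} + 2 p_a^\mu p_{b \mu}
\end{aligned}
\tag{2.72}
$$
which in the limit where $E_a$ is large compared to the rest masses gives

$$ m_{TOT}^2 c^2 = 2\,\frac{E_a}{c}\,\frac{m_b c^2}{c} = 2 E_a m_b \tag{2.73} $$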
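
The quality of the high energy approximation can be gauged by keeping the rest mass terms in (2.72). The sketch below works in units with c = 1 and energies in GeV; the function names are hypothetical, and the beam energy and the choice of a proton hitting a single nucleon at rest are sample inputs assumed for illustration.

```python
import math

def com_energy_exact(E_a, m_a, m_b):
    """Invariant mass of the a + b system for a target b at rest (c = 1)."""
    # p_a.p_a = m_a^2, p_b.p_b = m_b^2 and p_a.p_b = E_a m_b since p_b = (m_b, 0).
    return math.sqrt(m_a**2 + m_b**2 + 2.0 * E_a * m_b)

def com_energy_approx(E_a, m_b):
    """High energy limit E_CoM = sqrt(2 m_b E_a)."""
    return math.sqrt(2.0 * m_b * E_a)

m_proton = 0.938         # GeV, assumed sample mass
for E_a in (10.0, 100.0, 450.0):
    exact = com_energy_exact(E_a, m_proton, m_proton)
    approx = com_energy_approx(E_a, m_proton)
    print(E_a, exact, approx, (exact - approx) / exact)
```

The available energy grows only like the square root of the beam energy, which is the main drawback of fixed target experiments.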

2.6.4 The GZK Bound


Active galaxies accelerate protons to very high energies but there is a maximum energy we
should expect to see (first calculated by Greisen, Zatsepin and Kuzmin). The reason for the
maximum is that the Universe is full of photons left over from the Big Bang which higher
energy protons can interact with. These photons are responsible for the ambient background
temperature of the Universe T ∼ 3 K ($E_\gamma = k_B T = 8 \times 10^{-4}$ eV). The protons interact as
follows

$$ p\gamma \to \Delta\ (M_\Delta \sim 1.2\ \mathrm{GeV}/c^2) \to \pi + n \tag{2.74} $$


The ∆ is a short lived particle and the final decay is by far its most dominant decay process.
If there is sufficient energy in the collision to create a ∆ then the proton is converted to other
particles very efficiently. We can calculate the minimum energy the proton must have.
Let’s assign the proton and photon initial four-momenta
$$ p_p^\mu = (E_p, k, 0, 0), \qquad p_\gamma^\mu = (h\nu, -h\nu, 0, 0) \tag{2.75} $$
Note that we’ve set up the process so the photon and proton will collide head on. This
maximizes the energy available for new particle creation and will therefore give us the minimum
proton energy for the process.
Four momentum will be conserved in the interaction so
$$ p_\Delta^\mu = p_p^\mu + p_\gamma^\mu \tag{2.76} $$
Squaring gives
$$
\begin{aligned}
p_\Delta^\mu p_{\Delta \mu} = m_\Delta^2 &= (p_p^\mu + p_\gamma^\mu)(p_{p \mu} + p_{\gamma \mu}) \\
&= m_p^2 + m_\gamma^2 + 2 E_p h\nu - 2k(-h\nu)
\end{aligned}
\tag{2.77}
$$
The photon is massless ($m_\gamma = 0$) and for a relativistic proton $E_p \simeq k$, so

$$ E_p = \frac{m_\Delta^2 - m_p^2}{4 h\nu} \simeq 2 \times 10^{20}\ \mathrm{eV} \tag{2.78} $$
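
Plugging numbers into the threshold formula is a useful check on the units. The sketch below uses c = 1 and converts everything to eV; the function name is hypothetical, and the proton mass of 0.938 GeV is an assumed input (the Δ mass and photon energy are the values quoted in the text).

```python
def gzk_threshold_eV(m_delta_GeV=1.2, m_proton_GeV=0.938, photon_eV=8e-4):
    """Minimum proton energy (m_Delta^2 - m_p^2) / (4 h nu), returned in eV."""
    m_delta = m_delta_GeV * 1e9      # convert GeV to eV
    m_proton = m_proton_GeV * 1e9
    return (m_delta**2 - m_proton**2) / (4.0 * photon_eV)

threshold = gzk_threshold_eV()
# Print the threshold in eV and, for comparison, in GeV.
print(threshold, threshold / 1e9)
```

With these inputs the threshold comes out at the order of 10^20 eV, i.e. about 10^11 GeV.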

Protons with this energy or above will undergo this interaction. Factoring in the
density of photons it turns out that the mean free path for such protons is about 3 Mpc (our
galaxy group is about 20 Mpc across). We shouldn’t expect to see any protons of this energy
from active galaxies.
Surprisingly though experimenters have reported observations of cosmic ray protons with
higher energy than this bound (although they do see a decrease in the number of events above
the bound). If these events are real we might be doing something wrong! Could Special
Relativity break down at such high energies? Could there be a source of high energy protons
within 3 Mpc (either an astronomical source or very massive particles left over from the Big
Bang that decay to these protons)? At the moment this issue is an open question.

Exercise 2.8: A charged pion ($m_\pi$ = 140 MeV/c²) at rest decays to a charged muon ($m_\mu$ =
105 MeV/c²) and a massless muon neutrino. Calculate the energy and momenta of the
neutrino and the muon.

Exercise 2.9: In the original (Homestake) solar neutrino detection experiment neutrinos
from the sun interact with $^{37}$Cl atoms to form $^{37}$Ar and an electron. Assuming the Cl atoms
are at rest what boost is required to move to the centre of mass frame? Determine the
minimum energy the neutrino must have for this reaction to proceed.

Exercise 2.10: Speculative models of particle physics predict that at very high energies all
matter is unified into a single form. If this were true one would expect, very rarely, that
protons would decay to, for example, a positron and a photon. Derive an expression for the
wavelength of the emerging photon in the proton’s rest frame.

2.7 Tensors
We are now familiar with four-vectors. They are though just one part of a family of objects
called tensors which can have more than one index. We will need these later when we study
electromagnetism. To introduce them think about angular momentum:
Non-relativistically angular momentum is given by

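$$ \vec{l} = \vec{r} \times \vec{p} \tag{2.79} $$


with components
$$
\begin{aligned}
l^1 &= y p_z - z p_y \\
l^2 &= z p_x - x p_z \\
l^3 &= x p_y - y p_x
\end{aligned}
\tag{2.80}
$$

Relativistically these components are naturally part of the tensor

$$ L^{\mu\nu} = x^\mu p^\nu - p^\mu x^\nu \tag{2.81} $$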
40 CHAPTER 2. SPECIAL RELATIVITY

For example

$$L^{12} = x p_y - y p_x = l^3 \qquad (2.82)$$

Tensors have a number of properties which in this case we can deduce from its “composite”
nature. Thus

• Under Lorentz transformations: $L'^{\mu\nu} = \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta L^{\alpha\beta}$

• Lorentz invariant: $L^{\mu\nu} L_{\mu\nu} = (L^{00})^2 - (L^{01})^2 + (L^{11})^2 + \dots = \text{constant}$

There are 16 terms in the final sum here (in fact because of the anti-symmetry of $L^{\mu\nu}$ in (2.81)
the diagonal terms are zero and e.g. $L^{12} = -L^{21}$ so there are only 6 independent components).
Finally we note that the metric we introduced earlier is itself a tensor.
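
The two tensor properties listed above can be checked numerically. The following is only a sketch under stated assumptions: units with c = 1, a boost along x, and arbitrary illustrative values for $x^\mu$ and $p^\mu$; the function names are hypothetical and not part of the notes.

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the metric, signature (+,-,-,-); units c = 1 (assumption)

def boost_x(beta):
    """Boost matrix Lambda^mu_nu for speed beta (in units of c) along +x."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boost_vector(lam, a):
    """a'^mu = Lambda^mu_nu a^nu."""
    return [sum(lam[m][n] * a[n] for n in range(4)) for m in range(4)]

def angular_momentum_tensor(x, p):
    """L^{mu nu} = x^mu p^nu - p^mu x^nu, as in (2.81)."""
    return [[x[m] * p[n] - p[m] * x[n] for n in range(4)] for m in range(4)]

def boost_tensor(lam, t):
    """L'^{mu nu} = Lambda^mu_alpha Lambda^nu_beta L^{alpha beta}."""
    return [[sum(lam[m][a] * lam[n][b] * t[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

def invariant(t):
    """L^{mu nu} L_{mu nu}, lowering both indices with the diagonal metric."""
    return sum(ETA[m] * ETA[n] * t[m][n] ** 2 for m in range(4) for n in range(4))

x = [2.0, 1.0, 0.5, -0.3]   # (ct, x, y, z), arbitrary illustrative values
p = [3.0, 0.4, -1.2, 0.7]   # (E/c, px, py, pz), arbitrary illustrative values
lam = boost_x(0.6)

L = angular_momentum_tensor(x, p)
L_boosted = boost_tensor(lam, L)
# Building the tensor from the boosted vectors should give the same components.
L_from_boosted_vectors = angular_momentum_tensor(boost_vector(lam, x), boost_vector(lam, p))

print(invariant(L), invariant(L_boosted))
```

The two printed numbers should agree to rounding error, and `L[1][2]` is the ordinary component $l^3$ of (2.82).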

Exercise 2.11: Show that the metric tensor is invariant to a boost by speed v.

2.8 Relativistic Action


The action that reproduces the relativistic equation of motion for a free particle

$$\frac{dp^\mu}{d\tau} = 0 \qquad (2.83)$$

has an interesting form. It is given by

$$S = -m \int \sqrt{\frac{dx^\mu}{d\tau}\frac{dx_\mu}{d\tau}}\; d\tau \qquad (2.84)$$

Note that formally here τ need not be the proper time because we can parametrize the
path by any other τ′(τ). Since $\frac{dx^\mu}{d\tau} = \frac{dx^\mu}{d\tau'}\frac{d\tau'}{d\tau}$
the action transforms to the same form as (2.84) but with τ → τ′.
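The Euler Lagrange equations take the form

$$\frac{d}{d\tau}\left(\frac{\partial L}{\partial\left(\frac{dx^\mu}{d\tau}\right)}\right) - \frac{\partial L}{\partial x^\mu} = 0 \qquad (2.85)$$

or explicitly

$$\frac{1}{2}\, m\, \frac{d}{d\tau}\left[\frac{dx_\mu}{d\tau}\left(\frac{dx^\nu}{d\tau}\frac{dx_\nu}{d\tau}\right)^{-1/2}\right] = 0 \qquad (2.86)$$

from which we learn that

$$\frac{dx^\mu}{d\tau}\frac{dx_\mu}{d\tau} = u^\mu u_\mu = c^2 \qquad (2.87)$$

and

$$m\,\frac{d}{d\tau}\left(\frac{dx^\mu}{d\tau}\right) = \frac{dp^\mu}{d\tau} = 0 \qquad (2.88)$$

the correct equation of motion if we do identify the parameter with the proper time.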

If we stare at (2.84) though we realize that it has an interesting form. The proper time
is being used to parameterize the path of the particle but if we move d τ into the square root
we see it cancels and what we are actually doing is calculating the length of the path. This
is very elegant in that the length of the path is the only physical characteristic of the motion
- it’s nice that the action is so simple.
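
The statement that the action measures the length of the path can be illustrated with a piecewise-straight worldline. This is a minimal sketch, assuming units with c = 1 and one space dimension; the two example paths are arbitrary and the function name is hypothetical.

```python
import math

def action(events, m=1.0):
    """S = -m * (sum of invariant lengths of straight segments), units c = 1 (assumption).

    events is a list of (t, x) points along a time-like worldline.
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        total += math.sqrt((t1 - t0) ** 2 - (x1 - x0) ** 2)
    return -m * total

# Two paths between the same end events (t, x) = (0, 0) and (10, 0).
straight = [(0.0, 0.0), (10.0, 0.0)]
detour = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]

S_straight = action(straight)
S_detour = action(detour)
print(S_straight, S_detour)
```

The straight (free-particle) worldline has the longer proper time and hence the smaller action, which is the sense in which it is the path of least action.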
Chapter 3

Relativistic Electromagnetism

In this section of the course we will study electromagnetism. We begin by reviewing Maxwell’s
equations in integral and differential form. Our main task here though will be to understand
how these equations already encode relativity. To do this we will need to rewrite them in
terms of potentials to find a manifestly Lorentz invariant form. We will then understand
how electric and magnetic fields change under a boost.

3.1 Integral Form of Maxwell’s Equations


We begin by reviewing the physics of Maxwell’s equations in integral form.

3.1.1 Gauss’ Law


It is a remarkable fact that the net number of electric field lines exiting a closed surface S is
proportional to the sum of the electric charges q inside the volume enclosed by S. Since the
electric field $\vec{E}$ is proportional to the number of field lines per unit area, mathematically we
have,

$$\int_S \vec{E}\cdot d\vec{A} = \frac{q}{\varepsilon_0} \qquad (3.1)$$

• $\vec{E}$ is the (vector) force a unit charge experiences at a point on the closed surface S.

• The integral means a sum of $\vec{E}\cdot d\vec{A}$ for the infinitesimal surface elements that make up
the whole closed surface S. Remember that a little area element is described by a vector
normal to its surface, $d\vec{A} = |d\vec{A}|\,\hat{n}$.

• q is the net charge contained inside the surface.

Figure 3.1: An area element vector.

e.g. The electric field around a point charge is given by Gauss’ law using a spherical surface
S of radius r around the charge. The integral is then trivially performed and the result is
summarised in Figure 3.2.

Figure 3.2: A Gaussian surface around a point charge.

$$4\pi r^2 |\vec{E}| = \frac{q}{\varepsilon_0}, \qquad |\vec{E}| = \frac{q}{4\pi\varepsilon_0 r^2}$$
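
As a quick numerical illustration of this result, the flux of the Coulomb field through a sphere should come out as q/ε0 whatever radius is chosen. The sketch below assumes SI units and the standard value of ε0; the charge and radii are arbitrary and the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (standard value, assumed here)

def coulomb_field(q, r):
    """|E| at distance r from a point charge q, from Gauss' law on a sphere."""
    return q / (4.0 * math.pi * EPS0 * r ** 2)

def flux_through_sphere(q, r):
    """Surface integral of E.dA over a sphere of radius r centred on the charge."""
    return 4.0 * math.pi * r ** 2 * coulomb_field(q, r)

q = 1.6e-19  # an illustrative charge in coulombs
for radius in (0.1, 1.0, 5.0):
    print(radius, flux_through_sphere(q, radius), q / EPS0)
```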

3.1.2 No Magnetic Charges


The equivalent of Gauss’ law for magnetic fields is just

$$\int_S \vec{B}\cdot d\vec{A} = 0 \qquad (3.2)$$

since there are no magnetic charges, i.e. no magnetic monopoles.

3.1.3 Faraday’s Law


Figure 3.3: Magnetic flux penetrating a wire loop.

Faraday discovered that moving a loop of wire in a magnetic field induces a current in
the wire. The number of magnetic field lines passing through the loop is the magnetic flux
given by,

$$\Phi = \int_S \vec{B}\cdot d\vec{A} \qquad (3.3)$$

where the area S which is integrated over in this case is the open surface enclosed by the
loop. Now the induced voltage depends on the rate of change of the number of magnetic
field lines passing through the loop with respect to time t, as given by Faraday’s law,

$$\text{e.m.f.} = -\frac{\partial \Phi}{\partial t} \qquad (3.4)$$

The minus sign reflects Lenz’s Law which says the system resists change.

The voltage difference around the loop perimeter, s, is given by “V = Ed” but since different
bits of the wire point in different directions we must calculate this for each infinitesimal
bit of wire and sum the answers,

$$\text{e.m.f.} = \int_s \vec{E}\cdot d\vec{l} \qquad (3.5)$$

Finally combining the above equations we have

$$\int_s \vec{E}\cdot d\vec{l} = -\int_S \frac{\partial \vec{B}}{\partial t}\cdot d\vec{A} \qquad (3.6)$$

3.1.4 Ampere’s Law


The analogue of Faraday’s law for the case of the line integral of the magnetic field around a
loop s bounding an open surface S is given by,

$$\int_s \vec{B}\cdot d\vec{l} = \mu_0 \int_S \vec{J}\cdot d\vec{A} + \mu_0\varepsilon_0 \int_S \frac{\partial \vec{E}}{\partial t}\cdot d\vec{A} \qquad (3.7)$$

Reading just the first two terms in this equation we see the familiar physics that if a
current I (given by the current density $\vec{J}$ integrated over the open surface S bounded by
the loop) is flowing through some loop then there is a circulating magnetic field.

Figure 3.4: The magnetic field induced by a current carrying wire loop.

The final term was added for consistency by Maxwell (we will revisit this shortly) and
mirrors the term in Faraday’s law.

This integral form of Maxwell’s equations is a complete description of electromagnetism.
In what follows we shall simply recast the equations in several different ways in
order to display their physics content better.

3.2 Differential Form of Maxwell’s Equations


The first rewriting of Maxwell’s equations we shall do is to put the equations into a
differential equation form. The benefit of this form will be that the equations are true locally at
a point. In the integral form one has to pick “loops” and “areas” to define the integrals and
they are therefore telling you about global properties of a problem. We will need two bits of
mathematics (see Appendix 3.1 for a proof):

Gauss’ Theorem:

$$\int_S \vec{F}\cdot d\vec{A} = \int_V \vec{\nabla}\cdot\vec{F}\, dV \qquad (3.8)$$

Stokes’ Theorem:

$$\int_s \vec{F}\cdot d\vec{l} = \int_S (\vec{\nabla}\times\vec{F})\cdot d\vec{A} \qquad (3.9)$$

We can use these to find the differential form of Maxwell’s equations as the following
two examples show.

Differential Form of Gauss’ Law

We can now convert the integral form of Gauss’ Law in Eq.3.1

$$\int_S \vec{E}\cdot d\vec{A} = \frac{q}{\varepsilon_0} \qquad (3.10)$$

to the differential form using Gauss’ Theorem

$$\int_S \vec{E}\cdot d\vec{A} = \int_V \vec{\nabla}\cdot\vec{E}\, dV \qquad (3.11)$$

where V is the volume enclosed by the closed surface S. If we also write the charge in terms
of a charge density

$$\frac{q}{\varepsilon_0} = \int_V \frac{\rho}{\varepsilon_0}\, dV \qquad (3.12)$$

then comparing the above equations we find

$$\int_V \vec{\nabla}\cdot\vec{E}\, dV = \int_V \frac{\rho}{\varepsilon_0}\, dV \qquad (3.13)$$

Then shrinking the volume V to a point the integrands may be equated to yield Gauss’s law
in differential form,

$$\vec{\nabla}\cdot\vec{E} = \frac{\rho}{\varepsilon_0} \qquad (3.14)$$

Differential Form of Faraday’s Law

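We now convert the integral form of Faraday’s law in Eq.3.6

$$\int_s \vec{E}\cdot d\vec{l} = -\int_S \frac{\partial \vec{B}}{\partial t}\cdot d\vec{A} \qquad (3.15)$$

to the differential form using Stokes’ Theorem

$$\int_s \vec{E}\cdot d\vec{l} = \int_S (\vec{\nabla}\times\vec{E})\cdot d\vec{A} \qquad (3.16)$$

Equating the right-hand sides of the above equations

$$\int_S (\vec{\nabla}\times\vec{E})\cdot d\vec{A} = -\int_S \frac{\partial \vec{B}}{\partial t}\cdot d\vec{A} \qquad (3.17)$$

Then, shrinking the surface to a point, we may equate the integrands of the last equation and
hence arrive at the differential form of Faraday’s law

$$\vec{\nabla}\times\vec{E} = -\frac{\partial \vec{B}}{\partial t} \qquad (3.18)$$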

3.2.1 Maxwell’s Equations in Differential Form


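Using Gauss’ theorem and Stokes’ theorem we have rewritten the Maxwell equations for
Gauss’s law and Faraday’s law as the differential equations Eqs.3.14 and 3.18. It is
straightforward to do the same for the analogous equations for the magnetic field. Then all four
Maxwell equations in the vacuum may be expressed in differential form:

$$\begin{aligned}
\vec{\nabla}\cdot\vec{E} &= \frac{\rho}{\varepsilon_0} \\
\vec{\nabla}\cdot\vec{B} &= 0 \\
\vec{\nabla}\times\vec{E} &= -\frac{\partial \vec{B}}{\partial t} \\
\vec{\nabla}\times\vec{B} &= \mu_0 \vec{J} + \mu_0\varepsilon_0 \frac{\partial \vec{E}}{\partial t}
\end{aligned} \qquad (3.19)$$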

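Exercise 3.1: How many components do the following 9 objects have?

$\vec{\nabla}$, $\nabla^2\phi$, $\vec{\nabla}\cdot(\vec{\nabla}\times\vec{A})$,
$\vec{\nabla}\phi$, $\nabla^2\vec{A}$, $\partial_\mu F^{\mu\nu}$,
$\vec{\nabla}\cdot\vec{A}$, $\vec{\nabla}\times\vec{A}$, $\partial^\mu F^{\nu\lambda}$

Exercise 3.2: The vector $\vec{A} = x\,\hat{i} + xy\,\hat{j} + xz^3\,\hat{k}$. Evaluate $\vec{\nabla}\cdot\vec{A}$ and $\vec{\nabla}\times\vec{A}$.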

3.2.2 Conservation of Charge


Another equation it is useful to put into differential form is that describing charge
conservation. Since charge is conserved the current flowing out through the surface of some volume
must give the change in charge within the volume

$$\int_S \vec{J}\cdot d\vec{A} = -\int_V \frac{\partial \rho}{\partial t}\, dV \qquad (3.20)$$

Figure 3.5: Charge $q = \int \rho\, dV$ leaving a volume as a current I.

Applying Gauss’ Divergence theorem to the left hand side we have the differential form for
charge and current conservation,

$$\vec{\nabla}\cdot\vec{J} = -\frac{\partial \rho}{\partial t} \qquad (3.21)$$

Exercise 3.3: Prove the conservation of charge starting from Maxwell’s equations in
differential form.

Exercise 3.4: Consider flow within a gas or fluid of density ρ(r). Show that conservation of
mass within some volume implies

$$-\frac{d}{dt}\int \rho\, dV = \int \rho\,\vec{v}\cdot d\vec{A}$$

where $\vec{v}(r)$ is the flow velocity. By explicitly applying this equation to an infinitesimal
volume show that

$$\frac{\partial \rho}{\partial t} + \vec{\nabla}\cdot(\rho\vec{v}) = 0$$

If the velocity of the fluid is subject to the condition $\vec{\nabla}\times\vec{v} = 0$ show, using Stokes’
Theorem, that the fluid does not support circulation.

3.2.3 The Displacement Current


Prior to Maxwell’s involvement the fourth “Maxwell” equation was just

$$\vec{\nabla}\times\vec{B} = \mu_0 \vec{J} \qquad (3.22)$$

However, we can see quite simply in this formalism that this cannot be correct. This is
because, for any vector field $\vec{F}$,

$$\vec{\nabla}\cdot(\vec{\nabla}\times\vec{F}) \equiv 0 \qquad (3.23)$$

(the proof is given in Appendix 3.2 at the end of this chapter).

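Let’s see if this makes sense for our equation above by taking the divergence

$$\vec{\nabla}\cdot(\vec{\nabla}\times\vec{B}) \equiv 0 = \mu_0 \vec{\nabla}\cdot\vec{J} \qquad (3.24)$$

But this isn’t correct since we just saw that $\vec{\nabla}\cdot\vec{J} = -\frac{\partial \rho}{\partial t}$! Maxwell’s extra term corrects things
as we can see

$$\vec{\nabla}\cdot(\vec{\nabla}\times\vec{B}) \equiv 0 = \mu_0 \vec{\nabla}\cdot\vec{J} + \mu_0\varepsilon_0 \frac{\partial}{\partial t}\vec{\nabla}\cdot\vec{E} \qquad (3.25)$$

Using the first Maxwell equation ($\vec{\nabla}\cdot\vec{E} = \rho/\varepsilon_0$) we recover the correct formula for the
conservation of charge and current in Eq.3.21.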

Exercise 3.5: Prove the following vector identities

$$\vec{\nabla}\times(\vec{\nabla}\phi) = 0$$

$$\vec{\nabla}\times(\phi\vec{A}) = \phi\,(\vec{\nabla}\times\vec{A}) + (\vec{\nabla}\phi)\times\vec{A}$$

3.3 Potentials
Potentials are a mathematical trick for making Maxwell’s equations easier to solve. The
one you are already familiar with is the electrostatic potential, which we shall discuss first.

3.3.1 Electrostatic Potential


In electrostatic problems the Maxwell equations reduce to

$$\vec{\nabla}\cdot\vec{E} = \frac{\rho}{\varepsilon_0}, \qquad \vec{\nabla}\times\vec{E} = \vec{0} \qquad (3.26)$$

If we write

$$\vec{E} = -\vec{\nabla}\phi \qquad (3.27)$$

then, because of the identity (see Appendix 2)

$$\vec{\nabla}\times\vec{\nabla}\phi \equiv 0 \qquad (3.28)$$

the second of our two Maxwell equations is automatically satisfied. We are left with only
Poisson’s equation

$$-\nabla^2\phi = \frac{\rho}{\varepsilon_0} \qquad (3.29)$$

This simplifies things since the equation only involves one scalar function rather than the
three components of the electric field. The electric field can then readily be obtained from
the scalar potential using Eq.3.27.

By integrating Eq.3.27, we can write,

$$\phi = -\int_{\infty}^{\vec{x}} \vec{E}\cdot d\vec{l} \qquad (3.30)$$

which shows that φ can be interpreted as the “potential energy” for moving a unit charge
from infinity to the point $\vec{x}$. This energy is independent of the path the charge takes to arrive
at that point.
Note that φ is only defined up to an arbitrary constant (the energy of a charge at infinity)
since

$$\vec{E} = -\vec{\nabla}(\phi + C) = -\vec{\nabla}\phi \qquad (3.31)$$

Example 1: Infinite Parallel Plate Capacitor


Consider the capacitor with a potential difference of V across it.

Figure 3.6: A parallel plate capacitor, with plates at x = 0 (φ = 0) and x = d (φ = V).

Between the parallel planes of the plates there is no charge so Poisson’s equation reduces
to Laplace’s equation

$$\nabla^2\phi = 0 \qquad (3.32)$$

In this problem, by the symmetry of the assumed infinite capacitor plates, the only variation
in φ will be in the x direction defined to be perpendicular to the plates,

$$\nabla^2\phi = \frac{d^2\phi}{dx^2} = 0 \qquad (3.33)$$

Integrating twice we obtain

$$\phi = Ax + C \qquad (3.34)$$

with A, C constants. They can be fixed by imposing the boundary conditions
φ(x = 0) = 0, φ(x = d) = V. We obtain

$$\phi = \frac{V}{d}\,x \qquad (3.35)$$

Finally we can obtain the electric field from the potential

$$\vec{E} = -\vec{\nabla}\phi = \left(-\frac{V}{d},\, 0,\, 0\right) \qquad (3.36)$$

Example 2: Co-axial Cable


The case of a co-axial cable is a similar problem but with different symmetry properties.

Figure 3.7: A coaxial cable, with inner conductor of radius a at φ = V and outer conductor of
radius b at φ = 0.

Here the potential will only vary radially. In the Appendix $\nabla^2$ is calculated in cylindrical
polar coordinates (r, θ, z). Only allowing r variation in φ we find

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{d\phi}{dr}\right) = 0 \qquad (3.37)$$

Integrating twice we find

$$\phi = A \ln r + C \qquad (3.38)$$

Again we fix the integration constants from the boundary conditions shown in the figure, so

$$\phi(r) = -\frac{V}{\ln(b/a)}\,(\ln r - \ln b) \qquad (3.39)$$
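
A short numerical check of the coaxial solution: the sketch below evaluates (3.39) at the two conductors and applies a finite-difference version of the radial operator in (3.37). The radii, voltage and step size are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def phi(r, a, b, V):
    """Potential (3.39) between an inner conductor (radius a) and outer conductor (radius b)."""
    return -V * (math.log(r) - math.log(b)) / math.log(b / a)

def radial_laplacian(f, r, h=1e-3):
    """Central-difference estimate of (1/r) d/dr ( r df/dr ), the operator in (3.37)."""
    flux_out = (r + h / 2.0) * (f(r + h) - f(r)) / h
    flux_in = (r - h / 2.0) * (f(r) - f(r - h)) / h
    return (flux_out - flux_in) / (h * r)

a, b, V = 1.0, 3.0, 10.0  # illustrative values in arbitrary units

print(phi(a, a, b, V))   # potential on the inner conductor, expected to equal V
print(phi(b, a, b, V))   # potential on the outer conductor, expected to vanish
print(radial_laplacian(lambda r: phi(r, a, b, V), 2.0))  # close to zero between the conductors
```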
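Exercise 3.6: Solve Laplace’s equation for the potential generated by a charged point
particle. The operator $\nabla^2$ in spherical polar coordinates is given by

$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\cot\theta}{r^2}\frac{\partial}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$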

3.3.2 The Magnetic Vector Potential


Having introduced an electrostatic scalar potential we might try to introduce a magnetostatic
scalar potential in the same way. This does not work though because even in static magnetic
problems there must be a current to generate the magnetic field. Thus

$$\vec{\nabla}\times\vec{B} = \mu_0\vec{J} \qquad (3.40)$$

and we cannot use a scalar potential field since $\vec{\nabla}\times\vec{\nabla}\phi \equiv 0$.
On the other hand for all magnetic fields, static or otherwise,

$$\vec{\nabla}\cdot\vec{B} = 0 \qquad (3.41)$$

and so we can make use of the alternative identity $\vec{\nabla}\cdot(\vec{\nabla}\times\vec{F}) \equiv 0$.

Thus we can automatically solve the Maxwell equation in Eq.3.41 provided we write the
magnetic field in terms of a new vector field, the “vector potential” $\vec{A}$

$$\vec{B} = \vec{\nabla}\times\vec{A} \qquad (3.42)$$

Just as there was some freedom in the choice of the electrostatic potential so there is an
arbitrariness about the vector potential $\vec{A}$. This is because the magnetic field $\vec{B}$ is left invariant
if we transform

$$\vec{A} \to \vec{A} + \vec{\nabla}\psi(x) \qquad (3.43)$$

where ψ(x) is an arbitrary scalar field. The invariance of $\vec{B}$ follows from the identity
$\vec{\nabla}\times(\vec{\nabla}\psi) \equiv 0$. In other words the same magnetic field $\vec{B}$ results from using either $\vec{A}$ or $\vec{A} + \vec{\nabla}\psi(x)$.
Of course trading in one vector field $\vec{B}$ for another $\vec{A}$ does not bring any simplification.
However the winning card for this approach is that, unlike the electrostatic scalar potential
φ, the magnetic vector potential $\vec{A}$ defined in Eq.3.42 is valid for both static and time-varying
fields.

3.3.3 A New Electric Potential


The electrostatic potential we wrote before only worked when there were no time dependent
fields, and hence there were no magnetic fields appearing in Eqs.3.26. Allowing time
dependent (non-static) fields means the electric and magnetic fields are no longer decoupled
and in particular the electrostatic potential in Eq.3.27 is no longer consistent since it always
implies $\vec{\nabla}\times\vec{E} = \vec{0}$. Instead we seek a new potential that implies $\vec{\nabla}\times\vec{E} = -\frac{\partial\vec{B}}{\partial t}$.
Can we find simultaneous potentials for both $\vec{E}$ and $\vec{B}$ that work in all circumstances?
The desired potentials are:

$$\vec{E} = -\vec{\nabla}\phi - \frac{\partial\vec{A}}{\partial t}, \qquad \vec{B} = \vec{\nabla}\times\vec{A} \qquad (3.44)$$

The second of these equations defines the usual magnetic vector potential which is always
valid even in the non-static case since

$$\vec{\nabla}\cdot\vec{B} = \vec{\nabla}\cdot(\vec{\nabla}\times\vec{A}) = 0 \qquad (3.45)$$

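The first of these equations involves a new, second term on the right-hand side which is
designed to yield the correct Maxwell equation, $\vec{\nabla}\times\vec{E} = -\frac{\partial\vec{B}}{\partial t}$. This is easily seen by taking
the curl of the first equation,

$$\begin{aligned}
\vec{\nabla}\times\vec{E} &= \vec{\nabla}\times\left(-\vec{\nabla}\phi - \frac{\partial\vec{A}}{\partial t}\right) \\
&= -\vec{\nabla}\times(\vec{\nabla}\phi) - \frac{\partial(\vec{\nabla}\times\vec{A})}{\partial t} \\
&= -\frac{\partial\vec{B}}{\partial t}
\end{aligned} \qquad (3.46)$$

To summarise, potentials $\vec{A}$ and φ may be defined in Eq.3.44 which always automatically
satisfy the homogeneous Maxwell equations $\vec{\nabla}\cdot\vec{B} = 0$ and $\vec{\nabla}\times\vec{E} = -\frac{\partial\vec{B}}{\partial t}$. This should
simplify things greatly since now there are only the remaining two inhomogeneous Maxwell
equations to solve. Let’s write them out in terms of the potentials

$$\vec{\nabla}\cdot\vec{E} = -\nabla^2\phi - \frac{\partial(\vec{\nabla}\cdot\vec{A})}{\partial t} = \frac{\rho}{\varepsilon_0} \qquad (3.47)$$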

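For the $\vec{\nabla}\times\vec{B}$ equation we will again use the identity for this product in Appendix 2,
$\vec{\nabla}\times(\vec{\nabla}\times\vec{A}) = \vec{\nabla}(\vec{\nabla}\cdot\vec{A}) - \nabla^2\vec{A}$. Thus

$$\vec{\nabla}(\vec{\nabla}\cdot\vec{A}) - \nabla^2\vec{A} = \mu_0\vec{J} + \mu_0\varepsilon_0\frac{\partial}{\partial t}\left(-\frac{\partial\vec{A}}{\partial t} - \vec{\nabla}\phi\right) \qquad (3.48)$$

or rearranging

$$-\nabla^2\vec{A} + \mu_0\varepsilon_0\frac{\partial^2\vec{A}}{\partial t^2} = \mu_0\vec{J} - \vec{\nabla}\left(\vec{\nabla}\cdot\vec{A} + \mu_0\varepsilon_0\frac{\partial\phi}{\partial t}\right) \qquad (3.49)$$

Unfortunately these two equations we are left with are quite messy! To clean them up we
can make use of our ability to redefine the potentials whilst keeping the $\vec{E}$, $\vec{B}$ fields the same.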

3.3.4 Gauge Transformations


The transformations for these potentials that leave $\vec{E}$, $\vec{B}$ invariant are the following gauge
transformations

$$\vec{A} \to \vec{A} + \vec{\nabla}\psi, \qquad \phi \to \phi - \frac{\partial\psi}{\partial t} \qquad (3.50)$$

Exercise 3.7: Show explicitly that the $\vec{E}$, $\vec{B}$ fields are left invariant by these transformations.

We can make a choice of gauge that transforms $\vec{\nabla}\cdot\vec{A}$ as follows

$$\vec{\nabla}\cdot\vec{A} \to \vec{\nabla}\cdot(\vec{A} + \vec{\nabla}\psi) = \vec{\nabla}\cdot\vec{A} + \nabla^2\psi \qquad (3.51)$$

Note that $\vec{\nabla}\cdot\vec{A}$ is a number at each point in space. $\nabla^2\psi$ is also a number at each point but
here we get to choose it by choosing ψ. The upshot is that we can choose to transform $\vec{\nabla}\cdot\vec{A}$
to anything we want!
$\vec{\nabla}\cdot\vec{A} = 0$ is one sensible choice (known as Coulomb gauge). Another choice which we
shall focus on below is called Lorenz gauge.

3.3.5 Maxwell’s Equations in Lorenz Gauge


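Let’s choose to make a gauge transformation designed to cancel the second term on the
right-hand side of Eq.3.49,

$$\vec{\nabla}\cdot\vec{A} = -\mu_0\varepsilon_0\frac{\partial\phi}{\partial t} \qquad (3.52)$$

In this gauge the two inhomogeneous Maxwell equations in Eqs.3.47, 3.49 simplify to

$$-\nabla^2\phi + \mu_0\varepsilon_0\frac{\partial^2\phi}{\partial t^2} = \frac{\rho}{\varepsilon_0} \qquad (3.53)$$

$$-\nabla^2\vec{A} + \mu_0\varepsilon_0\frac{\partial^2\vec{A}}{\partial t^2} = \mu_0\vec{J} \qquad (3.54)$$

This form of the inhomogeneous Maxwell’s equations is much prettier!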


Observe the following two points:

Wave Equations in Free Space

In free space $\vec{J} = 0$ and ρ = 0 and these equations become wave equations

$$-\nabla^2\phi + \mu_0\varepsilon_0\frac{\partial^2\phi}{\partial t^2} = 0, \qquad -\nabla^2\vec{A} + \mu_0\varepsilon_0\frac{\partial^2\vec{A}}{\partial t^2} = 0 \qquad (3.55)$$

which have complex wave solutions of the form

$$\vec{A}(\vec{r},t) = \vec{A}_0\, e^{i(\omega t - \vec{k}\cdot\vec{r})} \qquad (3.56)$$

The physical solutions are obtained by taking the real part of the complex solution.
Substituting the complex (or real) solution into the wave equation we find the condition

$$\frac{\omega^2}{k^2} = c^2 = \frac{1}{\mu_0\varepsilon_0} \qquad (3.57)$$

In other words these waves move at a speed $c = 1/\sqrt{\mu_0\varepsilon_0}$ which is the speed of light.
Following a similar analysis for the electric and magnetic fields, this is how Maxwell
concluded that light is an electromagnetic wave.
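
The wave speed in (3.57) is easy to evaluate. The short sketch below assumes the standard SI values of μ0 and ε0, which are not quoted in these notes, and compares the result with the defined speed of light.

```python
import math

MU0 = 1.25663706212e-6    # vacuum permeability in N/A^2 (standard value, assumed here)
EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m (standard value, assumed here)

def wave_speed(mu0, eps0):
    """Phase speed omega/k implied by the dispersion relation (3.57)."""
    return 1.0 / math.sqrt(mu0 * eps0)

c = wave_speed(MU0, EPS0)
print(c)  # approximately 2.998e8 m/s
```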

Relativistic Form

Eqns. 3.53 and 3.54 also have a very suggestive form for Relativity - they are symmetric
in time and space. There’s also a symmetry between the components of ~A and φ - should we
promote them to the components of a four-vector? Similarly should the charge density and
current become a four-vector?

3.4 Relativistic Formulation Of Electromagnetism


Our goal now is to cast Maxwell’s equations in a manifestly Lorentz invariant form which is
compatible with the second postulate of Special Relativity. The equations in Lorenz gauge
in Eqs.3.53, 3.54 suggested a four-vector form which we will now explore.

3.4.1 Four-vector Current


Consider a uniform distribution of charge in a volume V at rest in some frame.

Figure 3.8: Charges distributed in a volume, q = ρV.

If the charge density is $\rho_0$ then the total charge is $\rho_0 V$.

Now consider boosting to a frame moving with speed v relative to the charge. The volume
changes because of Lorentz contraction

$$V' = \frac{V}{\gamma} \qquad (3.58)$$

The total number of charges in the box must be the same for each observer though so the
charge density must also change to keep the total charge fixed. Thus

$$\rho' = \gamma\rho_0 \qquad (3.59)$$

There will also now be a current density since the charges are moving in the new inertial
frame. These transformations are all consistent with ρ and $\vec{J}$ being a four vector.
Thus we define

$$J^\mu = (\rho c, \vec{J}) \qquad (3.60)$$

Classically the current density is just given in terms of the velocity of the particles as $\rho\vec{v}$.

The natural relativistic definition is therefore

$$J^\mu = \rho_0 u^\mu = \rho_0\frac{dx^\mu}{d\tau} \qquad (3.61)$$

The Lorentz invariant “length” of the four-vector then follows from $u^\mu u_\mu = c^2$

$$J^\mu J_\mu = \rho_0^2 c^2 \qquad (3.62)$$
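
The relations (3.59) and (3.62) can be illustrated by boosting the rest-frame current $(\rho_0 c, \vec{0})$. This is a sketch under stated assumptions: a boost along x, SI units, and an arbitrary rest density; the function names are hypothetical.

```python
import math

C = 299792458.0  # speed of light in m/s

def boost_x(four_vec, v):
    """Components of a four-vector in a frame moving at speed v along +x."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    a0, ax, ay, az = four_vec
    return (g * (a0 - (v / C) * ax), g * (ax - (v / C) * a0), ay, az)

def minkowski_norm(a):
    """a^mu a_mu with signature (+,-,-,-)."""
    return a[0] ** 2 - a[1] ** 2 - a[2] ** 2 - a[3] ** 2

rho0 = 1.0e-6                      # rest-frame charge density in C/m^3 (illustrative)
J_rest = (rho0 * C, 0.0, 0.0, 0.0)
v = 0.8 * C
J_moving = boost_x(J_rest, v)

rho_moving = J_moving[0] / C       # should equal gamma * rho0, as in (3.59)
print(rho_moving / rho0)           # gamma for v = 0.8c
print(J_moving[1], -rho_moving * v)  # the charges move in the -x direction in the new frame
print(minkowski_norm(J_moving), (rho0 * C) ** 2)
```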

Exercise 3.8: Write equations for how each of the four components of J µ transform under a
Lorentz boost by v in the x-direction.

3.4.2 Conservation of Charge


The conservation of charge equation

$$\vec{\nabla}\cdot\vec{J} + \frac{\partial\rho}{\partial t} = 0 \qquad (3.63)$$

can now be written in a Lorentz invariant form

$$\partial^\mu J_\mu = 0 \qquad (3.64)$$

where

$$\partial^\mu = \left(\frac{\partial}{\partial x^0},\, -\vec{\nabla}\right) = \left(\frac{1}{c}\frac{\partial}{\partial t},\, -\vec{\nabla}\right) \qquad (3.65)$$

Note the minus sign in the definition of the relativistic derivative four-vector $\partial^\mu$. It
looks a bit odd but is needed to get the signs correct here. In fact it is the only prescription
compatible with the usual definition of $x^\mu$ as we show in the next section.

3.4.3 The Four Vector ∂ µ


You might worry that defining

$$\partial^\mu = \left(\frac{1}{c}\frac{\partial}{\partial t},\, -\vec{\nabla}\right) \qquad (3.66)$$

with a minus sign contradicts the fact that

$$x^\mu = (ct, \vec{x}) \qquad (3.67)$$

For example under a Lorentz boost to a frame moving with speed v in the positive x
direction

$$x'^\mu = \Lambda^\mu{}_\nu x^\nu \qquad (3.68)$$

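i.e.

$$(ct') = \gamma(ct) - \gamma\frac{v}{c}x, \qquad x' = \gamma x - \gamma\frac{v}{c}(ct) \qquad (3.69)$$

or inverting the relations

$$(ct) = \gamma(ct') + \gamma\frac{v}{c}x', \qquad x = \gamma x' + \gamma\frac{v}{c}(ct') \qquad (3.70)$$

Similarly the definition in (3.66) would imply

$$\partial'^\mu = \Lambda^\mu{}_\nu \partial^\nu \qquad (3.71)$$

i.e.

$$\frac{1}{c}\frac{\partial}{\partial t'} = \gamma\frac{1}{c}\frac{\partial}{\partial t} + \gamma\frac{v}{c}\frac{\partial}{\partial x}, \qquad -\frac{\partial}{\partial x'} = -\gamma\frac{\partial}{\partial x} - \gamma\frac{v}{c}\frac{1}{c}\frac{\partial}{\partial t} \qquad (3.72)$$

Note the signs in the transformations.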
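To show this is consistent let’s work it out from first principles

$$\frac{\partial}{\partial t'} = \frac{\partial x}{\partial t'}\frac{\partial}{\partial x} + \frac{\partial t}{\partial t'}\frac{\partial}{\partial t} \qquad (3.73)$$

$$\frac{\partial}{\partial x'} = \frac{\partial x}{\partial x'}\frac{\partial}{\partial x} + \frac{\partial t}{\partial x'}\frac{\partial}{\partial t} \qquad (3.74)$$

from the transformations in (3.70) above

$$\frac{\partial x}{\partial t'} = v\gamma, \qquad \frac{\partial t}{\partial t'} = \gamma, \qquad \frac{\partial t}{\partial x'} = \frac{v}{c^2}\gamma, \qquad \frac{\partial x}{\partial x'} = \gamma \qquad (3.75)$$

Substituting these in (3.73) and (3.74) we find (3.72) - this shows that there is not an
inconsistency (and in fact that the minus sign in (3.66) is required).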
An alternative quicker statement of this is that one would like

$$\frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu \quad\text{or}\quad \partial_\mu x^\mu = 4 \qquad (3.76)$$

and again the minus sign in $\partial^\mu$ is required.

We can also define a four-vector version of $\nabla^2$ by

$$\Box = \partial^\mu\partial_\mu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \qquad (3.77)$$

3.4.4 Four Vector Potential


The final element we need to write Maxwell’s equations in a Lorentz invariant form is a
four-vector including the potentials. The appropriate four-vector is

$$A^\mu = \left(\frac{\phi}{c},\, \vec{A}\right) \qquad (3.78)$$

The Maxwell equations are then

$$\Box A^\mu = \frac{J^\mu}{\varepsilon_0 c^2} \qquad (3.79)$$

where

$$\Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \qquad (3.80)$$

The µ = 0 equation is the φ equation (3.53) and the µ = 1, 2, 3 equations give the
components of the equation (3.54) for $\vec{A}$ (using $\mu_0\varepsilon_0 = 1/c^2$).
The Maxwell equations in Lorenz gauge also required the gauge condition (3.52) which
becomes

$$\partial_\mu A^\mu = 0 \qquad (3.81)$$

Remember being able to write these equations in four-vector notation is a huge step in
itself. We now know that electromagnetism is relativistically invariant.

3.4.5 A Moving Point Charge


One of the advantages of the relativistic formulation is that we understand how electric and
magnetic fields behave under boosts. As an example let’s look at the fields around a moving
electric charge.
For an electric charge at rest we know that

$$A^\mu = \left(\frac{\phi}{c},\, \vec{A}\right) = \left(\frac{q}{4\pi\varepsilon_0 r c},\, \vec{0}\right) \qquad (3.82)$$

We can make the charge move by boosting to an inertial frame moving at speed v in the
positive x direction

$$A'^\mu = \Lambda^\mu{}_\nu A^\nu \qquad (3.83)$$

so for example

$$A'^0 = \gamma\left(A^0 - \frac{v}{c}A^x\right) \qquad (3.84)$$

which means

$$\phi' = \frac{\gamma q}{4\pi\varepsilon_0 r} \qquad (3.85)$$

where r is still the distance from the charge measured in its rest frame.

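We must remember also that $r^2 = x^2 + y^2 + z^2$ and x transforms too, so

$$\phi' = \frac{\gamma q}{4\pi\varepsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{1/2}} \qquad (3.86)$$

Turning to the spatial components we find

$$A'^x = -\gamma\frac{v}{c}A^0 = -\frac{\gamma v}{c^2}\,\frac{q}{4\pi\varepsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{1/2}} \qquad (3.87)$$

and $A'^y = A'^z = 0$.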

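The electric field is then given by

$$\vec{E}' = -\vec{\nabla}'\phi' - \frac{\partial\vec{A}'}{\partial t'} \qquad (3.88)$$

which works through to

$$\begin{aligned}
E'^x &= \frac{q\gamma\,(x' + vt')}{4\pi\varepsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{3/2}} \\
E'^y &= \frac{q\gamma\, y'}{4\pi\varepsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{3/2}} \\
E'^z &= \frac{q\gamma\, z'}{4\pi\varepsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{3/2}}
\end{aligned} \qquad (3.89)$$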

These results are particularly interesting when v ≃ c. Look first on the x-axis

$$E'^x \simeq \frac{q}{4\pi\varepsilon_0\gamma^2(x' + vt')^2} \qquad (3.90)$$

since γ is large this component of the field is reduced relative to that of the stationary charge.
On the other hand if we look at the field perpendicular to the motion (i.e. at x′ = −vt′) $E'^y$, $E'^z$
are both enlarged by a factor of γ. Thus the field of a relativistic moving charge is essentially
confined to a disc.

Figure 3.9: The electric field of a highly relativistic charge is confined to a plane
transverse to its motion.
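
The flattening of the field can be seen numerically by evaluating (3.89) along and across the direction of motion. The sketch below assumes SI units, the standard value of ε0, and an arbitrary charge, speed and distance; the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (standard value, assumed here)
C = 299792458.0          # speed of light in m/s

def field_moving_charge(q, v, x, y, z, t):
    """Electric field components from (3.89), in the frame where the charge sits at x = -v t."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    X = x + v * t
    denom = 4.0 * math.pi * EPS0 * (g * g * X * X + y * y + z * z) ** 1.5
    return (q * g * X / denom, q * g * y / denom, q * g * z / denom)

def coulomb(q, r):
    """Field magnitude of a stationary charge at distance r."""
    return q / (4.0 * math.pi * EPS0 * r * r)

q, v, d = 1.6e-19, 0.99 * C, 1.0   # illustrative values
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)

E_ahead = field_moving_charge(q, v, d, 0.0, 0.0, 0.0)[0]   # on the x-axis
E_side = field_moving_charge(q, v, 0.0, d, 0.0, 0.0)[1]    # transverse to the motion

print(E_ahead / coulomb(q, d))  # suppressed by 1/gamma^2
print(E_side / coulomb(q, d))   # enhanced by gamma
```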

3.4.6 The Electromagnetic Field Strength Tensor


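It is also possible to write Maxwell’s equations in a relativistic form involving $\vec{E}$ and $\vec{B}$ fields
rather than the potentials. Remember that

$$\vec{E} = -\vec{\nabla}\phi - \frac{\partial\vec{A}}{\partial t} \qquad (3.91)$$

so a component is given in terms of $A^\mu$, $\partial^\nu$ by

$$\frac{E^i}{c} = \partial^i A^0 - \partial^0 A^i \qquad (3.92)$$

Similarly

$$\vec{B} = \vec{\nabla}\times\vec{A} \qquad (3.93)$$

so, up to signs, we have the form

$$B^i = \partial^j A^k - \partial^k A^j \qquad (3.94)$$

Thus we conclude that the $\vec{E}$ and $\vec{B}$ fields are described by the EM field strength tensor

$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu \qquad (3.95)$$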

Explicitly the components are

$$F^{\mu\nu} = \begin{pmatrix}
0 & -E^1/c & -E^2/c & -E^3/c \\
E^1/c & 0 & -B^3 & B^2 \\
E^2/c & B^3 & 0 & -B^1 \\
E^3/c & -B^2 & B^1 & 0
\end{pmatrix} \qquad (3.96)$$

where µ counts the row and ν the column.
Maxwell’s equations in terms of $F^{\mu\nu}$ are a little involved and are given by

$$\partial_\mu F^{\mu\nu} = \mu_0 J^\nu \qquad (3.97)$$

and

$$\partial^\lambda F^{\mu\nu} + \partial^\mu F^{\nu\lambda} + \partial^\nu F^{\lambda\mu} = 0 \qquad (3.98)$$

For example the first equation contains (ν = 0) $\vec{\nabla}\cdot\vec{E} = \rho/\varepsilon_0$ and (ν = 1, 2, 3)
$\vec{\nabla}\times\vec{B} = \mu_0\vec{J} + \mu_0\varepsilon_0\frac{\partial\vec{E}}{\partial t}$.
The second equation is actually 64 equations so contains many repeats of the remaining
two Maxwell equations. For example if we set λ = 1, µ = 3, ν = 2 we obtain $\vec{\nabla}\cdot\vec{B} = 0$ and
so forth.

Exercise 3.9: Explicitly extract the differential form of Maxwell’s equations from (3.97), (3.98).

Exercise 3.10: Evaluate $F^{\mu\nu}F_{\mu\nu}$ in terms of $\vec{E}$ and $\vec{B}$ fields.

Exercise 3.11: There is a four component tensor $\varepsilon^{\mu\nu\rho\sigma}$ which is zero if any two indices
take the same value. $\varepsilon^{1234} = 1$. Other non-zero components are obtained by interchanging
indices of $\varepsilon^{1234}$ - interchanging any two indices changes the value by a minus sign. Thus
$\varepsilon^{3214} = -1$ whilst $\varepsilon^{2314} = 1$. Show explicitly that $\varepsilon^{1234}$ and $\varepsilon^{1134}$ are left invariant by a
Lorentz boost.
Show that $\varepsilon^{\mu\nu\rho\sigma}F_{\rho\sigma}$ takes the same form as $F^{\mu\nu}$ but with the electric and magnetic field
components interchanged.
Hence evaluate in terms of $\vec{E}$ and $\vec{B}$ fields the Lorentz invariant quantity $\varepsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}$.

3.4.7 Lorentz Transformations of Electric and Magnetic Fields


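We can calculate the Lorentz Transformation properties of the $\vec{E}$ and $\vec{B}$ fields using the fact
that $F^{\mu\nu}$ transforms as

$$F'^{\mu\nu} = \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta F^{\alpha\beta} \qquad (3.99)$$

For example for a boost by speed v in the positive z direction

$$\begin{aligned}
\frac{E'^1}{c} = F'^{10} &= \Lambda^1{}_\alpha \Lambda^0{}_\beta F^{\alpha\beta} \\
&= \Lambda^0{}_\beta\left(\Lambda^1{}_0 F^{0\beta} + \Lambda^1{}_1 F^{1\beta} + \Lambda^1{}_2 F^{2\beta} + \Lambda^1{}_3 F^{3\beta}\right) \\
&= \Lambda^0{}_\beta F^{1\beta} \\
&= \Lambda^0{}_0 F^{10} + \Lambda^0{}_1 F^{11} + \Lambda^0{}_2 F^{12} + \Lambda^0{}_3 F^{13} \\
&= \gamma\left(\frac{E^1}{c} - \frac{v}{c}B^2\right)
\end{aligned} \qquad (3.100)$$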

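The full set of transformations are given by

$$\begin{aligned}
\frac{E'^1}{c} &= \gamma\left(\frac{E^1}{c} - \frac{v}{c}B^2\right) \\
\frac{E'^2}{c} &= \gamma\left(\frac{E^2}{c} + \frac{v}{c}B^1\right) \\
\frac{E'^3}{c} &= \frac{E^3}{c} \\
B'^1 &= \gamma\left(B^1 + \frac{v}{c}\frac{E^2}{c}\right) \\
B'^2 &= \gamma\left(B^2 - \frac{v}{c}\frac{E^1}{c}\right) \\
B'^3 &= B^3
\end{aligned} \qquad (3.101)$$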
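
One way to gain confidence in (3.101) is to check that it preserves the two standard field invariants, $\vec{E}\cdot\vec{B}$ and $|\vec{E}|^2 - c^2|\vec{B}|^2$. The sketch below assumes SI units and arbitrary illustrative field values; the function names are hypothetical.

```python
import math

C = 299792458.0  # speed of light in m/s

def boost_fields_z(E, B, v):
    """Apply (3.101): fields seen from a frame moving at speed v along +z."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    E1, E2, E3 = E
    B1, B2, B3 = B
    E_new = (g * (E1 - v * B2), g * (E2 + v * B1), E3)
    B_new = (g * (B1 + v * E2 / C ** 2), g * (B2 - v * E1 / C ** 2), B3)
    return E_new, B_new

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

E = (1.0e3, -2.0e3, 0.5e3)      # V/m, arbitrary illustrative values
B = (3.0e-6, 1.0e-6, 2.0e-6)    # T, arbitrary illustrative values
E_new, B_new = boost_fields_z(E, B, 0.6 * C)

# Each pair holds (value in original frame, value in boosted frame).
e_dot_b = (dot(E, B), dot(E_new, B_new))
e2_minus_c2b2 = (dot(E, E) - C ** 2 * dot(B, B),
                 dot(E_new, E_new) - C ** 2 * dot(B_new, B_new))
print(e_dot_b)
print(e2_minus_c2b2)
```

The components along the boost direction are unchanged, while the transverse components of $\vec{E}$ and $\vec{B}$ mix.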

3.4.8 The Relativistic Force Law


When we were studying relativity in Chapter 2 we promised to return to the idea of relativistic
force when we had studied electromagnetism.
Classically the electromagnetic force is given by

$$\vec{F} = q(\vec{E} + \vec{v}\times\vec{B}) \qquad (3.102)$$

Thus for example the x component is given by

$$F^1 = q(E^1 + v^2 B^3 - v^3 B^2) = q(cF^{10} - v^2 F^{12} - v^3 F^{13}) \qquad (3.103)$$

(here $v^2$, $v^3$ denote the y and z components of the velocity). To make this more symmetric
we can add $-v^1 F^{11}$ since this is just zero!
Since $(c, v^1, v^2, v^3)$ are just the non-relativistic limit of $u^\mu$ we are led to

$$f^\mu = q u_\nu F^{\mu\nu} \qquad (3.104)$$
Now we can ask what the non-relativistic limit of the time-like component of force is?

$$\begin{aligned}
f^0 &= q\left(u^0 F^{00} - u^1 F^{01} - u^2 F^{02} - u^3 F^{03}\right) \\
&= q\gamma\,\frac{\vec{v}\cdot\vec{E}}{c} \\
&= \frac{q\gamma}{c}\,\vec{v}\cdot\left(\vec{E} + \vec{v}\times\vec{B}\right)
\end{aligned} \qquad (3.105)$$

where we have used that $\vec{v}\cdot(\vec{v}\times\vec{B}) \equiv 0$.

Taking v ≪ c we obtain $c f^0 = q\vec{v}\cdot\vec{E}$ which is just the work done per second. This indeed should
be the rate of change of energy and, since $p^0 = E/c$, it makes sense to equate $f^0$ to $\frac{dp^0}{d\tau}$ in the relativistic
generalization of Newton’s law.

3.5 The Lagrangian For a Charged Particle


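The equation of motion for a charged, moving particle is given by (for the moment we return
to the non-relativistic notation)

$$\frac{d\vec{p}}{dt} = q(\vec{E} + \vec{v}\times\vec{B}) \qquad (3.106)$$

The action that reproduces this equation is

$$S = \int L\, dt, \qquad L = \frac{1}{2}m|\dot{\vec{x}}|^2 + q(\dot{\vec{x}}\cdot\vec{A}) - q\phi \qquad (3.107)$$

The Euler Lagrange equation is

$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{\vec{x}}}\right) - \frac{\partial L}{\partial\vec{x}} = 0 \qquad (3.108)$$

or

$$\frac{d}{dt}\left(m\dot{\vec{x}} + q\vec{A}\right) - \vec{\nabla}\left(q\,\dot{\vec{x}}\cdot\vec{A} - q\phi\right) = 0 \qquad (3.109)$$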
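To see this is the equation we want we must first be careful about the time dependence
of $\vec{A}$. Of course it can explicitly depend on time, but even if it’s constant the particle, as it
moves, will see a time variation of the field. This is accounted for using the chain rule

$$\frac{d}{dt} = \frac{\partial}{\partial t} + \frac{dx}{dt}\frac{\partial}{\partial x} + \frac{dy}{dt}\frac{\partial}{\partial y} + \frac{dz}{dt}\frac{\partial}{\partial z} = \frac{\partial}{\partial t} + \dot{\vec{x}}\cdot\vec{\nabla} \qquad (3.110)$$

So our equation of motion is

$$\frac{d\vec{p}}{dt} + q\frac{\partial\vec{A}}{\partial t} + q(\dot{\vec{x}}\cdot\vec{\nabla})\vec{A} - q\vec{\nabla}(\dot{\vec{x}}\cdot\vec{A}) + q\vec{\nabla}\phi = 0 \qquad (3.111)$$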
Next we use the identity

$$\dot{\vec{x}}\times(\vec{\nabla}\times\vec{A}) = \vec{\nabla}(\dot{\vec{x}}\cdot\vec{A}) - (\dot{\vec{x}}\cdot\vec{\nabla})\vec{A} \qquad (3.112)$$

We have

$$\frac{d\vec{p}}{dt} = q\left(-\frac{\partial\vec{A}}{\partial t} - \vec{\nabla}\phi\right) + q\,\dot{\vec{x}}\times(\vec{\nabla}\times\vec{A}) \qquad (3.113)$$

Finally we remember the form for the electric and magnetic field in terms of the potentials
(3.44) and see that this is precisely the equation of motion (3.106) we wanted!

Note also the expressions for the generalized momenta

$$\vec{p}_{gen} = \frac{\partial L}{\partial\dot{\vec{x}}} = m\dot{\vec{x}} + q\vec{A} \qquad (3.114)$$

and for the Hamiltonian

$$H = \vec{p}_{gen}\cdot\dot{\vec{x}} - L = \frac{1}{2}m|\dot{\vec{x}}|^2 + q\phi \qquad (3.115)$$

These expressions combine to the generalized four-vector momentum

$$p^\mu_{gen} = m u^\mu + q A^\mu \qquad (3.116)$$

Replacing momenta in a problem by this generalized four-momentum is called “minimal
substitution”.

3.6 Appendix - Gauss’ and Stokes’ Theorems


Here are derivations of these two crucial theorems:

3.6.1 Gauss’ Theorem


We want to convert the surface integral

$$\int_S \vec{F}\cdot d\vec{A} \qquad (3.117)$$

to a form that is locally true. We do this by calculating the integral for an infinitesimal cubic
volume.

Figure 3.10: An infinitesimal cube with sides dx, dy, dz along the X, Y, Z axes.

We choose the surface in the integral as the surface of this cube.

As an example let’s take

$$\vec{F} = F\hat{z} \qquad (3.118)$$

(i.e. the field $\vec{F}$ points in the z direction.)

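Calculating the surface integral for this $\vec{F}$:

Flux at bottom surface $= -F(0)\,\delta x\,\delta y$

Flux at top surface $= \left(F(0) + \frac{\partial F}{\partial z}\delta z\right)\delta x\,\delta y$

Here we have Taylor expanded F to keep only the leading change in its behaviour as we
move in the z direction. Note that the top and bottom areas contribute opposite signs because
the area vectors point in opposite directions. The other surfaces contribute nothing for this
choice of $\vec{F}$. The total integral is therefore

$$\int_S \vec{F}\cdot d\vec{A} = \frac{\partial F}{\partial z}\,\delta x\,\delta y\,\delta z \qquad (3.119)$$

This result generalizes, when $\vec{F}$ has x and y components too, to:

$$\lim_{\delta V\to 0}\frac{\int_S \vec{F}\cdot d\vec{A}}{\delta V} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \qquad (3.120)$$

where δV is the volume of the cube.
Alternatively we may write this as:

$$\int_S \vec{F}\cdot d\vec{A} = \vec{\nabla}\cdot\vec{F}\,\delta V \qquad (3.121)$$

where

$$\vec{\nabla} = \frac{\partial}{\partial x}\hat{x} + \frac{\partial}{\partial y}\hat{y} + \frac{\partial}{\partial z}\hat{z} \qquad (3.122)$$

and dV is an integral over the whole volume.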

Gauss’ Theorem For Extended Volumes

It is easy to obtain the equivalent expression for an arbitrary volume - we just build it up
out of infinitesimal cubes: e.g. if we put two together

Figure 3.11: Two adjacent infinitesimal cubes.

It turns out that

$$\int_{\text{two cubes}} \vec{F}\cdot d\vec{A} = \int_{\text{cube one}} \vec{F}\cdot d\vec{A} + \int_{\text{cube two}} \vec{F}\cdot d\vec{A} \qquad (3.123)$$

since the side shared by the two cubes has an area vector with opposite sign in the case of
the two integrals - the side cancels! We can therefore build any shape in this way and the
surface integral is just the sum over the surface integrals of the component cubes so we arrive
at Gauss’ Theorem

$$\int_S \vec{F}\cdot d\vec{A} = \int_V \vec{\nabla}\cdot\vec{F}\, dV \qquad (3.124)$$
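
Gauss’ theorem can be checked numerically on a unit cube by comparing the outward flux with the volume integral of the divergence. This is a sketch only: the vector field below is an arbitrary illustrative choice, its divergence is entered by hand, and both integrals use a simple midpoint rule; the function names are hypothetical.

```python
def F(x, y, z):
    """An illustrative vector field (not taken from the notes)."""
    return (x * x, y * z, z * x)

def div_F(x, y, z):
    """Divergence of F, worked out by hand: d(x^2)/dx + d(yz)/dy + d(zx)/dz."""
    return 2.0 * x + z + x

def volume_integral(n=20):
    """Midpoint-rule integral of div F over the unit cube."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += div_F((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
    return total * h ** 3

def surface_flux(n=20):
    """Midpoint-rule outward flux of F through the six faces of the unit cube."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, w = (i + 0.5) * h, (j + 0.5) * h
            total += F(1.0, u, w)[0] - F(0.0, u, w)[0]  # faces x = 1 and x = 0
            total += F(u, 1.0, w)[1] - F(u, 0.0, w)[1]  # faces y = 1 and y = 0
            total += F(u, w, 1.0)[2] - F(u, w, 0.0)[2]  # faces z = 1 and z = 0
    return total * h * h

print(volume_integral(), surface_flux())
```

The minus signs in `surface_flux` come from the outward area vectors on opposite faces pointing in opposite directions, exactly as in the derivation above.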

3.6.2 Stokes’ Theorem


Next we want to convert the line integral

$$\int_s \vec{F}\cdot d\vec{l} \tag{3.125}$$

to a form that is locally true. We do this by calculating the integral for an infinitesimal rectangular loop.

Figure 3.12: An infinitesimal square in the x−z plane, with sides dx and dz and corners P and Q.

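We’ve chosen the loop to lie in the x-z plane.

If at the bottom corner of the rectangle (P)

$$\vec{F} = F_x\,\hat{x} + F_y\,\hat{y} + F_z\,\hat{z} \tag{3.126}$$

the line integral gets contributions from the top and bottom of the form “$F_x\,dx$” and from the sides of the form “$F_z\,dz$”. We must take into account the change in these components across the box though. Clockwise round the box we get contributions:

$$\int_s \vec{F}\cdot d\vec{l} = F_z\,\delta z + \left(F_x + \frac{\partial F_x}{\partial z}\,\delta z\right)\delta x - \left(F_z + \frac{\partial F_z}{\partial x}\,\delta x\right)\delta z - F_x\,\delta x \tag{3.127}$$

$$\int_s \vec{F}\cdot d\vec{l} = \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\delta x\,\delta z \tag{3.128}$$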

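or

$$\int_s \vec{F}\cdot d\vec{l} = c_y\,dA \tag{3.129}$$

Note that the area element is in the $\hat{y}$ direction.

In general $c_y$ is the y-component of a vector called the curl of $\vec{F}$. Its other components are

$$c_x = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) \tag{3.130}$$

$$c_z = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) \tag{3.131}$$

We can write the curl as

$$\vec{\nabla}\times\vec{F} = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix} \tag{3.132}$$

The calculation above then generalizes, for an area placed at random relative to the axes, to

$$\int_s \vec{F}\cdot d\vec{l} = (\vec{\nabla}\times\vec{F})\cdot d\vec{A} \tag{3.133}$$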

Stokes’ Theorem For Extended Areas

We can again make larger areas by placing infinitesimal squares next to each other - the common sides cancel from the sum.

Figure 3.13: Two adjacent infinitesimal squares, with the line element dl marked.

Thus

$$\int_{\text{two sq}} \vec{F}\cdot d\vec{l} = \int_{\text{sq one}} \vec{F}\cdot d\vec{l} + \int_{\text{sq two}} \vec{F}\cdot d\vec{l} \tag{3.134}$$

Using our above result we arrive at Stokes’ theorem

$$\int_s \vec{F}\cdot d\vec{l} = \int_S (\vec{\nabla}\times\vec{F})\cdot d\vec{A} \tag{3.135}$$
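The loop calculation of (3.127) and (3.128) can also be checked numerically. The sketch below is illustrative only: the test field `F`, the function `curl_y` and the helper `loop_integral_xz` are hypothetical names chosen for this example. It walks round a small rectangle in the x-z plane in the same order as the four terms of (3.127) and compares the result per unit area with $\partial F_x/\partial z - \partial F_z/\partial x$ at the centre of the loop.

```python
# Illustrative check of the infinitesimal-loop result (3.128).
# The field below is an arbitrary choice made for this example.

def F(x, y, z):
    """Hypothetical test field F = (z*z + y, x, x*z)."""
    return (z * z + y, x, x * z)

def curl_y(x, y, z):
    """dFx/dz - dFz/dx for the test field, worked out by hand: 2z - z."""
    return 2.0 * z - z

def loop_integral_xz(x0, y0, z0, dx, dz, n=200):
    """Line integral of F round a rectangle in the x-z plane starting at
    P = (x0, y0, z0), with the sides taken in the order of (3.127)."""
    hx, hz = dx / n, dz / n
    total = 0.0
    for i in range(n):
        s = i + 0.5
        total += F(x0, y0, z0 + s * hz)[2] * hz        # up the side at x0
        total += F(x0 + s * hx, y0, z0 + dz)[0] * hx   # along the top
        total -= F(x0 + dx, y0, z0 + s * hz)[2] * hz   # down the side at x0 + dx
        total -= F(x0 + s * hx, y0, z0)[0] * hx        # back along the bottom
    return total

dx, dz = 1e-2, 2e-2
circ = loop_integral_xz(0.4, 0.1, 0.9, dx, dz)
c_y = curl_y(0.4 + dx / 2, 0.1, 0.9 + dz / 2)
print(circ / (dx * dz), c_y)  # the two numbers should agree closely
```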

3.7 Appendix - Vector Identities


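Identity: $\vec{\nabla}\cdot(\vec{\nabla}\times\vec{F}) = 0$

Proof:

$$\vec{\nabla}\times\vec{F} = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix} = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\hat{x} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\hat{y} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\hat{z}$$

$$\vec{\nabla}\cdot(\vec{\nabla}\times\vec{F}) = \frac{\partial^2 F_z}{\partial x\,\partial y} - \frac{\partial^2 F_y}{\partial x\,\partial z} + \frac{\partial^2 F_x}{\partial y\,\partial z} - \frac{\partial^2 F_z}{\partial y\,\partial x} + \frac{\partial^2 F_y}{\partial z\,\partial x} - \frac{\partial^2 F_x}{\partial z\,\partial y} = 0$$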

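Identity: $\vec{\nabla}\times(\vec{\nabla}\phi) = 0$

Proof:

$$\vec{\nabla}\times(\vec{\nabla}\phi) = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ \frac{\partial\phi}{\partial x} & \frac{\partial\phi}{\partial y} & \frac{\partial\phi}{\partial z} \end{vmatrix} = \left(\frac{\partial^2\phi}{\partial y\,\partial z} - \frac{\partial^2\phi}{\partial z\,\partial y}\right)\hat{x} + \left(\frac{\partial^2\phi}{\partial z\,\partial x} - \frac{\partial^2\phi}{\partial x\,\partial z}\right)\hat{y} + \left(\frac{\partial^2\phi}{\partial x\,\partial y} - \frac{\partial^2\phi}{\partial y\,\partial x}\right)\hat{z} = 0$$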

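Identity: $\vec{\nabla}\times(\vec{\nabla}\times\vec{F}) = \vec{\nabla}(\vec{\nabla}\cdot\vec{F}) - \nabla^2\vec{F}$

Proof:

$$\vec{\nabla}\times\vec{F} = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix} = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\hat{x} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\hat{y} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\hat{z}$$

$$\begin{aligned}
\vec{\nabla}\times(\vec{\nabla}\times\vec{F}) ={}& \left(\frac{\partial^2 F_y}{\partial y\,\partial x} - \frac{\partial^2 F_x}{\partial y^2} - \frac{\partial^2 F_x}{\partial z^2} + \frac{\partial^2 F_z}{\partial z\,\partial x}\right)\hat{x} \\
&- \left(\frac{\partial^2 F_y}{\partial x^2} - \frac{\partial^2 F_x}{\partial x\,\partial y} - \frac{\partial^2 F_z}{\partial z\,\partial y} + \frac{\partial^2 F_y}{\partial z^2}\right)\hat{y} \\
&+ \left(\frac{\partial^2 F_x}{\partial x\,\partial z} - \frac{\partial^2 F_z}{\partial x^2} - \frac{\partial^2 F_z}{\partial y^2} + \frac{\partial^2 F_y}{\partial y\,\partial z}\right)\hat{z} \\
={}& \left[\frac{\partial}{\partial x}\left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right) - \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)F_x\right]\hat{x} \\
&+ \left[\frac{\partial}{\partial y}\left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right) - \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)F_y\right]\hat{y} \\
&+ \left[\frac{\partial}{\partial z}\left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right) - \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)F_z\right]\hat{z} \\
={}& \vec{\nabla}(\vec{\nabla}\cdot\vec{F}) - \nabla^2\vec{F}
\end{aligned}$$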

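Identity: In cylindrical polar coordinates (r, θ, z)

$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial z^2}$$

Proof:

$$\begin{aligned}
\nabla^2 &= \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \\
&= \frac{\partial}{\partial x}\left(\frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial x}\frac{\partial}{\partial\theta}\right) + \frac{\partial}{\partial y}\left(\frac{\partial r}{\partial y}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial y}\frac{\partial}{\partial\theta}\right) + \frac{\partial^2}{\partial z^2} \\
&= \left(\frac{\partial r}{\partial x}\right)^2\frac{\partial^2}{\partial r^2} + \frac{\partial^2 r}{\partial x^2}\frac{\partial}{\partial r} + \left(\frac{\partial\theta}{\partial x}\right)^2\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2\theta}{\partial x^2}\frac{\partial}{\partial\theta} \\
&\quad + \left(\frac{\partial r}{\partial y}\right)^2\frac{\partial^2}{\partial r^2} + \frac{\partial^2 r}{\partial y^2}\frac{\partial}{\partial r} + \left(\frac{\partial\theta}{\partial y}\right)^2\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2\theta}{\partial y^2}\frac{\partial}{\partial\theta} + \frac{\partial^2}{\partial z^2}
\end{aligned}$$

The mixed $\partial^2/\partial r\,\partial\theta$ terms are not written out because their coefficients cancel between the x and y contributions.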

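Now we use the relations between x, y and r, θ:

$$r = (x^2+y^2)^{1/2}, \qquad x = r\sin\theta$$

$$\tan\theta = x/y, \qquad y = r\cos\theta$$

$$\frac{\partial r}{\partial x} = \frac{x}{(x^2+y^2)^{1/2}} = \sin\theta$$

$$\frac{\partial^2 r}{\partial x^2} = \frac{1}{(x^2+y^2)^{1/2}} - \frac{x^2}{(x^2+y^2)^{3/2}} = \frac{1}{r}(1-\sin^2\theta) = \frac{\cos^2\theta}{r}$$

$$\frac{\partial r}{\partial y} = \frac{y}{(x^2+y^2)^{1/2}} = \cos\theta$$

$$\frac{\partial^2 r}{\partial y^2} = \frac{1}{(x^2+y^2)^{1/2}} - \frac{y^2}{(x^2+y^2)^{3/2}} = \frac{1}{r}(1-\cos^2\theta) = \frac{\sin^2\theta}{r}$$

$$\frac{\partial\theta}{\partial x} = \frac{\cos^2\theta}{y} = \frac{\cos\theta}{r}$$

$$\frac{\partial^2\theta}{\partial x^2} = -\frac{2\cos\theta\sin\theta}{y}\,\frac{\partial\theta}{\partial x} = -\frac{2\cos\theta\sin\theta}{r^2}$$

$$\frac{\partial\theta}{\partial y} = -\cos^2\theta\,\frac{x}{y^2} = -\frac{\sin\theta}{r}$$

$$\frac{\partial^2\theta}{\partial y^2} = 2\cos\theta\sin\theta\,\frac{x}{y^2}\,\frac{\partial\theta}{\partial y} + 2\cos^2\theta\,\frac{x}{y^3} = -\frac{2}{r^2}\frac{\sin\theta}{\cos\theta}(\sin^2\theta - 1) = \frac{2\sin\theta\cos\theta}{r^2}$$

The two second derivatives of θ are equal and opposite, so no first-derivative term in θ survives.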

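Substituting in above we find directly

$$\begin{aligned}
\nabla^2 &= \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial z^2} \\
&= \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial z^2}
\end{aligned}$$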
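As a sanity check on the cylindrical form, one can apply it by finite differences to a function whose Cartesian Laplacian is known. The sketch below is an illustration under stated assumptions: the test function `f`, the step size `h` and the helper names are hypothetical choices made here, and the code uses the convention of these notes, x = r sin θ and y = r cos θ.

```python
import math

# Illustrative finite-difference check of the cylindrical Laplacian.

def f(x, y, z):
    """Hypothetical test function f = x^2 y + z^3."""
    return x * x * y + z ** 3

def laplacian_cartesian(x, y, z):
    """Cartesian Laplacian of f, worked out by hand: 2y + 6z."""
    return 2.0 * y + 6.0 * z

def g(r, theta, z):
    """The same function written in terms of (r, theta, z)."""
    return f(r * math.sin(theta), r * math.cos(theta), z)

def laplacian_cylindrical(r, theta, z, h=1e-3):
    """d2/dr2 + (1/r) d/dr + (1/r^2) d2/dtheta2 + d2/dz2 by central differences."""
    g0 = g(r, theta, z)
    d2_r = (g(r + h, theta, z) - 2.0 * g0 + g(r - h, theta, z)) / h**2
    d1_r = (g(r + h, theta, z) - g(r - h, theta, z)) / (2.0 * h)
    d2_theta = (g(r, theta + h, z) - 2.0 * g0 + g(r, theta - h, z)) / h**2
    d2_z = (g(r, theta, z + h) - 2.0 * g0 + g(r, theta, z - h)) / h**2
    return d2_r + d1_r / r + d2_theta / r**2 + d2_z

r, theta, z = 1.3, 0.7, 0.4
cyl = laplacian_cylindrical(r, theta, z)
cart = laplacian_cartesian(r * math.sin(theta), r * math.cos(theta), z)
print(cyl, cart)  # should agree to within the finite-difference error
```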

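A similar procedure may be used in spherical polar coordinates (r, θ, φ) where

$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\cot\theta}{r^2}\frac{\partial}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$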
Chapter 4

Quantum Mechanics

4.1 Non-relativistic Quantum Mechanics


To set the scene for the work to come we begin here by reviewing the basics of non-
relativistic Quantum Mechanics. We will mostly work in one dimension. We will motivate
the form of the Schroedinger equation, discuss the information content and interpretation of
the wave function, and finally work through the simple example of the square well.

4.1.1 One Dimensional, Time Dependent Schroedinger Equation


In Quantum Mechanics the behaviour of a particle is controlled by a wave equation. A free particle is associated with a wave

$$\psi = e^{i(kx-\omega t)} \tag{4.1}$$

where the wave number k and angular frequency ω are related to the momentum and energy of the particle

$$p = \frac{h}{\lambda} \;\rightarrow\; k = \frac{p}{\hbar} \tag{4.2}$$

$$E = h\nu \;\rightarrow\; \omega = \frac{E}{\hbar} \tag{4.3}$$

Here h is Planck’s constant and $\hbar = h/2\pi$.
The properties of the particle can therefore be obtained from the wave by acting on it with operators (which we mark by a hat over the symbol)

$$\hat{E}\,\psi = i\hbar\,\frac{\partial}{\partial t}\psi \tag{4.4}$$

$$\hat{p}\,\psi = -i\hbar\,\frac{\partial}{\partial x}\psi \tag{4.5}$$

The free wave function (4.1) is an eigenfunction of these operators with the values of E and p being the eigenvalues.
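The eigenvalue statement can be illustrated numerically. The following sketch is not from the notes: it assumes natural units (ħ = 1), arbitrary sample values of k and ω, and hypothetical helper names. It applies the operators (4.4) and (4.5) to the plane wave (4.1) by central differences and divides by ψ, which should return ħω and ħk respectively.

```python
import cmath

HBAR = 1.0  # natural units, an assumption made for this example

def psi(x, t, k, omega):
    """Free-particle plane wave of (4.1)."""
    return cmath.exp(1j * (k * x - omega * t))

def momentum_eigenvalue(x, t, k, omega, h=1e-5):
    """(-i hbar d/dx psi) / psi, with the derivative taken by central differences."""
    dpsi_dx = (psi(x + h, t, k, omega) - psi(x - h, t, k, omega)) / (2.0 * h)
    return (-1j * HBAR * dpsi_dx) / psi(x, t, k, omega)

def energy_eigenvalue(x, t, k, omega, h=1e-5):
    """(i hbar d/dt psi) / psi, with the derivative taken by central differences."""
    dpsi_dt = (psi(x, t + h, k, omega) - psi(x, t - h, k, omega)) / (2.0 * h)
    return (1j * HBAR * dpsi_dt) / psi(x, t, k, omega)

k, omega = 2.0, 3.0
p = momentum_eigenvalue(0.37, 1.1, k, omega)
E = energy_eigenvalue(0.37, 1.1, k, omega)
print(p, E)  # close to hbar*k and hbar*omega, independent of x and t
```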

Figure 4.1: u∗(x)u(x)dx - the area under the curve shown - gives the probability of finding the particle in that region of x.

For a classical particle in a potential V we require that energy is conserved so

$$E = \frac{p^2}{2m} + V \tag{4.6}$$

which, using the operators, we can rewrite as a wave equation

$$\hat{H}\psi \equiv i\hbar\,\frac{\partial}{\partial t}\psi = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\psi + V\psi \tag{4.7}$$

where $\hat{H}$ is the Hamiltonian operator. This is the time dependent Schroedinger equation which is central to Quantum Mechanics.

4.1.2 Time Independent Schroedinger Equation


In problems where V is independent of time there are always solutions to the Schroedinger equation of the form

$$\psi(x,t) = u(x)\,e^{-iEt/\hbar} \tag{4.8}$$

where u(x) satisfies (simply substitute this solution into the full Schroedinger equation) the time independent Schroedinger equation

$$-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}u(x) + V(x)\,u(x) = E\,u(x) \tag{4.9}$$

4.1.3 Interpretation
The squared modulus of the wave function, ψ∗(x,t)ψ(x,t) (which in the time independent case is just u∗(x)u(x)), is associated with the probability of finding a particle at x. Remembering that x is continuous the precise statement is

$$u^*(x)\,u(x)\,dx = \text{prob of finding particle between } x \text{ and } x + dx \tag{4.10}$$


Figure 4.2: The change in a conserved quantity, q = ρ dV, in a volume matches to a current leaving the volume.

Graphically this is shown in Fig 4.1 which shows that the probability of finding the particle in the dx spatial slice is just the area under the curve u∗u in that slice.
Since the particle must be somewhere with probability one we must have

$$\int_{-\infty}^{\infty} u^*(x)\,u(x)\,dx = 1 \tag{4.11}$$

Note that a free particle plane wave cannot be normalized in this way; its normalization is instead interpreted as a number of particles per unit volume, or is fixed by placing the particle in a finite box.
Formally we find observable properties of the particles using the operators

$$\langle x\rangle = \int_{-\infty}^{\infty} u^*(x)\,\hat{x}\,u(x)\,dx = \int_{-\infty}^{\infty} u^*(x)\,x\,u(x)\,dx \tag{4.12}$$

$$\langle p\rangle = \int_{-\infty}^{\infty} u^*(x)\,\hat{p}\,u(x)\,dx = \int_{-\infty}^{\infty} u^*(x)\left(-i\hbar\,\frac{\partial}{\partial x}\right)u(x)\,dx \tag{4.13}$$

4.1.4 Proof that Probability Is Conserved


To back up this interpretation of the wave function we can show that probability is conserved
in the theory. This means that if the probability of the particle being in some area decreases
then the probability that it lies outside must increase. In other words there is a flow of
probability current density (see Fig 4.2) satisfying the usual conservation equation (cf electric
charge)

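$$\int_S \vec{J}\cdot d\vec{A} = -\int \frac{\partial\rho}{\partial t}\,dV \tag{4.14}$$

Using Gauss’ theorem ($\int \vec{A}\cdot d\vec{S} = \int \vec{\nabla}\cdot\vec{A}\,dV$) we have

$$\frac{\partial\rho}{\partial t} + \vec{\nabla}\cdot\vec{J} = 0 \tag{4.15}$$

or in one dimension

$$\frac{\partial\rho}{\partial t} + \frac{\partial J_x}{\partial x} = 0 \tag{4.16}$$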

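Now we can show using the Schroedinger equation that ρ = ψ∗ψ satisfies such a relation. We add two copies of the Schroedinger equation (SE) as follows

$$-i\psi^*\,(\text{SE}) + (\text{SE})^*\,i\psi \tag{4.17}$$

This gives

$$\hbar\psi^*\frac{\partial\psi}{\partial t} + \hbar\psi\frac{\partial\psi^*}{\partial t} = \frac{i\hbar^2}{2m}\psi^*\frac{\partial^2}{\partial x^2}\psi - i\psi^* V\psi - \frac{i\hbar^2}{2m}\psi\frac{\partial^2}{\partial x^2}\psi^* + i\psi^* V\psi \tag{4.18}$$

and hence

$$\frac{\partial}{\partial t}(\psi^*\psi) = \frac{i\hbar}{2m}\frac{\partial}{\partial x}\left(\psi^*\frac{\partial}{\partial x}\psi - \psi\frac{\partial}{\partial x}\psi^*\right) \tag{4.19}$$

which indeed has the form of a conservation equation with ρ = ψ∗ψ and, comparing with (4.16), $J_x = -\frac{i\hbar}{2m}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right)$.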

4.1.5 Momentum Space Wave Functions


In the above discussion we have described the particle by its wave function at a particular point in space and then shown how to calculate its momentum with an operator. Alternatively we could write a wave function that describes the probability of the particle having momentum in some dp interval directly; calculating the position then becomes more complicated.
In fact it is possible to set up this momentum space wave function such that

$$\phi^*(p)\,\phi(p)\,dp = \text{prob. of particle having momentum } p \text{ to } p + dp \tag{4.20}$$

$$\int_{-\infty}^{\infty} \phi^*(p)\,\phi(p)\,dp = 1 \tag{4.21}$$

with the properties of the particle being given by the operator relations

$$\int_{-\infty}^{\infty} \phi^*(p)\,p\,\phi(p)\,dp = \langle p\rangle \tag{4.22}$$

$$\int_{-\infty}^{\infty} \phi^*(p)\left(i\hbar\,\frac{\partial}{\partial p}\right)\phi(p)\,dp = \langle x\rangle \tag{4.23}$$

Note the difference in sign on $\hat{x}$ relative to the position space operator $\hat{p}$. The relationship between ψ(x) and φ(p) is given by a Fourier Transform

$$\phi(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} \psi(x)\,e^{-ipx/\hbar}\,dx \tag{4.24}$$

or inversely

$$\psi(x) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} \phi(p)\,e^{ipx/\hbar}\,dp \tag{4.25}$$

We can demonstrate that the Fourier Transform indeed has the correct properties by checking the consistency of the three operator equations above. Firstly consider

$$\begin{aligned}
\int \phi^*(p)\,\phi(p)\,dp &= \frac{1}{2\pi\hbar}\int dp \left(\int dx'\,e^{ipx'/\hbar}\,\psi^*(x')\right)\left(\int dx''\,e^{-ipx''/\hbar}\,\psi(x'')\right) \\
&= \int dx' \int dx''\,\frac{1}{2\pi\hbar}\,\psi^*(x')\,\psi(x'')\int dp\;e^{-ip(x''-x')/\hbar}
\end{aligned} \tag{4.26}$$

We recognise the dp integral as the Fourier expansion of a delta function

$$\delta(x - x_0) = \frac{1}{2\pi}\int e^{-ik(x-x_0)}\,dk \tag{4.27}$$

So with k = p/ħ and dk = dp/ħ

$$\begin{aligned}
\int \phi^*(p)\,\phi(p)\,dp &= \int dx' \int dx''\,\delta(x''-x')\,\psi^*(x')\,\psi(x'') \\
&= \int dx'\,\psi^*(x')\,\psi(x') \\
&= 1
\end{aligned} \tag{4.28}$$

The equations are consistent.

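Secondly we can check the relation for the expectation value of the particle’s position

$$\begin{aligned}
\int \phi^*(p)\left(i\hbar\,\frac{\partial}{\partial p}\right)\phi(p)\,dp
&= \frac{1}{2\pi\hbar}\int dp \left(\int dx'\,e^{ipx'/\hbar}\,\psi^*(x')\right) i\hbar \left(\int dx''\left(\frac{-ix''}{\hbar}\right)e^{-ipx''/\hbar}\,\psi(x'')\right) \\
&= \int dx' \int dx''\,\frac{1}{2\pi\hbar}\,\psi^*(x')\,x''\,\psi(x'')\int dp\;e^{-ip(x''-x')/\hbar} \\
&= \int dx' \int dx''\,\delta(x''-x')\,\psi^*(x')\,x''\,\psi(x'') \\
&= \int dx'\,\psi^*(x')\,x'\,\psi(x') \\
&= \langle x\rangle
\end{aligned} \tag{4.29}$$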

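Finally we check the expectation value for momentum

$$\int \phi^*(p)\,p\,\phi(p)\,dp = \frac{1}{2\pi\hbar}\int dp \left(\int dx'\,e^{ipx'/\hbar}\,\psi^*(x')\right)\left(\int dx''\left[i\hbar\,\frac{\partial}{\partial x''}e^{-ipx''/\hbar}\right]\psi(x'')\right) \tag{4.30}$$

The derivative has been inserted ad hoc simply to bring down a factor of p. Now we integrate by parts, throwing away surface terms at infinity

$$\begin{aligned}
\int \phi^*(p)\,p\,\phi(p)\,dp &= \frac{1}{2\pi\hbar}\int dp \left(\int dx'\,e^{ipx'/\hbar}\,\psi^*(x')\right)\left(\int dx''\,e^{-ipx''/\hbar}\left(-i\hbar\,\frac{\partial}{\partial x''}\right)\psi(x'')\right) \\
&= \int dx' \int dx''\,\delta(x''-x')\,\psi^*(x')\left(-i\hbar\,\frac{\partial}{\partial x''}\right)\psi(x'') \\
&= \int dx'\,\psi^*(x')\left(-i\hbar\,\frac{\partial}{\partial x'}\right)\psi(x') \\
&= \langle p\rangle
\end{aligned} \tag{4.31}$$

Everything is nicely consistent.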
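The same consistency checks can be carried out numerically for a concrete wave packet. The sketch below is illustrative only: it assumes ħ = 1 and a Gaussian packet of width `SIGMA` carrying mean momentum `P0` (both hypothetical choices), and uses simple midpoint sums in place of the integrals. It evaluates φ(p) from (4.24) and then checks the normalisation (4.21) and the mean momentum (4.22).

```python
import cmath
import math

HBAR = 1.0   # natural units (assumption)
SIGMA = 1.0  # position-space width of the example packet (assumption)
P0 = 1.5     # mean momentum of the example packet (assumption)

def psi(x):
    """Normalised Gaussian wave packet with mean momentum P0."""
    norm = (2.0 * math.pi * SIGMA**2) ** -0.25
    return norm * cmath.exp(-x * x / (4.0 * SIGMA**2) + 1j * P0 * x / HBAR)

def phi(p, x_max=10.0, n=400):
    """Momentum-space wave function from (4.24), by a midpoint sum over x."""
    dx = 2.0 * x_max / n
    total = 0j
    for i in range(n):
        x = -x_max + (i + 0.5) * dx
        total += psi(x) * cmath.exp(-1j * p * x / HBAR) * dx
    return total / math.sqrt(2.0 * math.pi * HBAR)

def momentum_moments(half_width=5.0, n=200):
    """Return (integral of |phi|^2 dp, integral of p |phi|^2 dp)."""
    dp = 2.0 * half_width / n
    norm = 0.0
    mean = 0.0
    for j in range(n):
        p = P0 - half_width + (j + 0.5) * dp
        weight = abs(phi(p)) ** 2 * dp
        norm += weight
        mean += p * weight
    return norm, mean

norm, mean_p = momentum_moments()
print(norm, mean_p)  # expected to be close to 1 and to P0
```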

4.1.6 Heisenberg Uncertainty Principle


In general the wavefunction of a particle ψ (x) will correspond to some localised wave packet
whose Fourier transform is the momentum space wavefunction φ (p), as in Eqs.4.24, 4.25.
From the theory of Fourier transforms, it is seen that any wave packet that is more strongly
peaked in position space will be less strongly peaked in momentum space, and vice versa.
For example, a wavefunction which is a plane wave in position space (and hence its position
is completely undetermined) will have a sharp value of momentum with no uncertainty. It is
possible to derive a relation between the spread or width of the wave packet in position space
∆x and in momentum space ∆p, namely,

∆x∆p ≥ h̄/2 (4.32)

The equality holds for idealised (Gaussian) wavepackets and follows directly from the theory of Fourier transforms. The inequality expresses the fact that, in real experiments which measure the position and momentum of a particle simultaneously, the product of uncertainties in the respective measurements can never fall below the above bound.
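The bound (4.32) can be illustrated with two real wave functions. The sketch below is an illustration under stated assumptions: ħ = 1, the function names are hypothetical, and ⟨p²⟩ is computed as ħ² ∫ |dψ/dx|² dx, which follows from (4.13) after one integration by parts for a wave function that vanishes at the ends of the range. A Gaussian should saturate the bound, while the ground state of an infinite square well of unit width should lie above it.

```python
import math

HBAR = 1.0  # natural units (assumption)

def uncertainty_product(psi, x_min, x_max, n=4000):
    """Delta x * Delta p for a real psi(x) that vanishes at x_min and x_max.
    The wave function does not need to be normalised beforehand."""
    dx = (x_max - x_min) / n
    norm = mean_x = mean_x2 = mean_p2 = 0.0
    for i in range(n):
        x = x_min + (i + 0.5) * dx
        u = psi(x)
        du = (psi(x + 0.5 * dx) - psi(x - 0.5 * dx)) / dx  # central difference
        norm += u * u * dx
        mean_x += x * u * u * dx
        mean_x2 += x * x * u * u * dx
        mean_p2 += HBAR**2 * du * du * dx
    mean_x /= norm
    mean_x2 /= norm
    mean_p2 /= norm
    delta_x = math.sqrt(mean_x2 - mean_x**2)
    delta_p = math.sqrt(mean_p2)  # <p> = 0 for a real wave function
    return delta_x * delta_p

def gaussian(x, sigma=0.7):
    """Unnormalised Gaussian wave packet."""
    return math.exp(-x * x / (4.0 * sigma**2))

def well_ground_state(x, a=1.0):
    """Unnormalised ground state of an infinite square well of width a."""
    return math.sin(math.pi * x / a)

gauss_product = uncertainty_product(gaussian, -8.0, 8.0)
well_product = uncertainty_product(well_ground_state, 0.0, 1.0)
print(gauss_product, well_product)  # roughly 0.5 and 0.568 in units of hbar
```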
There is also a similar uncertainty relation for energy and time of a quantum state,

∆E∆t ≥ h̄/2 (4.33)

For example, for an atomic transition, the shorter the transition time ∆t the greater the width
of the associated spectral line ∆E, and vice versa.
The above relations in Eqs. 4.32, 4.33 are collectively known as the Heisenberg Uncertainty Principle. They highlight the fact that the quantum world represents a major departure from classical physics, since, even in the most accurate idealised experiment, two quantities such as position and momentum cannot ever be known simultaneously to arbitrary precision. Even great physicists such as Albert Einstein never accepted this, and this led to a series of high profile debates with Niels Bohr. It is now generally accepted that Bohr was correct and Einstein was wrong. Quantum Mechanics, though completely counter to our intuition, has been thoroughly vindicated in all experiments to date involving atoms and subatomic particles.

Figure 4.3: The potential of an infinite square well (V = ∞ for x < 0 and x > a, V = 0 for 0 < x < a).

4.1.7 Square Well Example


A simple, interesting example of a Quantum Mechanics system is the square potential well as shown in Fig 4.3. We assume that the particle cannot penetrate the infinite barriers

$$\psi = 0, \quad \text{for } x \le 0,\; x \ge a \tag{4.34}$$

Since the potential is time independent the solution takes the form

$$\psi(x,t) = u(x)\,e^{-iEt/\hbar} \tag{4.35}$$

and we must solve the time independent Schroedinger equation

$$-\frac{\hbar^2}{2m}\frac{d^2}{dx^2}u(x) + V(x)\,u(x) = E\,u(x) \tag{4.36}$$

Of course in the region of interest the potential is just V = 0.

The solutions to this equation take the form

$$u(x) = A\sin kx + B\cos kx \tag{4.37}$$

The boundary conditions of ψ vanishing at x = 0, a fix B = 0 and k = nπ/a, so

$$u_n(x) = A\sin\frac{n\pi x}{a} \tag{4.38}$$

with n integers 1, 2, 3....
Substituting this solution into the Schroedinger equation we find

$$E_n = \frac{\hbar^2}{2m}\left(\frac{n\pi}{a}\right)^2 \tag{4.39}$$

Figure 4.4: The initial conditions for the square well problem considered in section 4.1.8.

Finally to find the constant A we can require ψ(x,t) is correctly normalized

$$\begin{aligned}
\int_{-\infty}^{\infty} \psi^*\psi\,dx &= 1 \\
&= \int_0^a A^2\sin^2\frac{n\pi x}{a}\,dx \\
&= A^2\,\frac{a}{2}
\end{aligned} \tag{4.40}$$

The full solution is therefore

$$\psi_n(x,t) = \sqrt{\frac{2}{a}}\,\sin\frac{n\pi x}{a}\;e^{-iE_n t/\hbar} \tag{4.41}$$
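To give a feel for the size of the energies in (4.39), the following sketch evaluates them for an electron in a well of width 1 nm and checks the normalisation (4.40) numerically. It is an illustration only: the choice of particle and width is an assumption made here, the constants are the SI (CODATA 2018) values, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34         # J s
M_ELECTRON = 9.1093837015e-31  # kg
EV = 1.602176634e-19           # J per electron-volt

def energy_level(n, a, m=M_ELECTRON):
    """E_n = (hbar^2 / 2m) (n pi / a)^2, as in (4.39), in joules."""
    return (HBAR**2 / (2.0 * m)) * (n * math.pi / a) ** 2

def normalisation_integral(n, a, steps=1000):
    """Midpoint estimate of the integral of (2/a) sin^2(n pi x / a) over [0, a]."""
    dx = a / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += (2.0 / a) * math.sin(n * math.pi * x / a) ** 2 * dx
    return total

a = 1e-9  # well width in metres (assumed for this example)
levels_eV = [energy_level(n, a) / EV for n in (1, 2, 3)]
print(levels_eV)                     # the ground state is about 0.376 eV
print(normalisation_integral(3, a))  # close to 1
```

The levels scale as n², so the second and third entries are 4 and 9 times the first.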

4.1.8 Completeness
The consideration of how a particular initial condition for the wave function in a square well evolves with time provides interesting insight into the uniqueness of the solutions we have found. In particular since the solutions are sine waves of period 2a there is a strong connection to problems one encounters when studying Fourier analysis such as wave forms on a string.
For example if we take an initial wave function, at t = 0, of the triangular form shown in Fig 4.4 then we can write

$$\psi(x,t=0) = \sum_{n=1}^{\infty} c_n\,u_n(x) \tag{4.42}$$

where the $c_n$ are the Fourier-like coefficients (we’ll explain how to derive them in the next section) which are given by

$$c_n = \frac{8k}{n^2\pi^2}\sqrt{\frac{a}{2}}\,\sin\frac{n\pi}{2} \tag{4.43}$$

We now know the time evolution since we know that each individual term evolves as

$$u_n(x,0) \rightarrow e^{-iE_n t/\hbar}\,u_n(x,0) \tag{4.44}$$

Resumming the series at time t gives the evolution of the initial condition (to a precision determined by how many terms you resum).
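The expansion and its time evolution can be sketched in a few lines of Python. This is an illustration under a stated assumption: the figure is not reproduced here, so the code assumes the triangle rises linearly from zero at x = 0 to a peak height k at x = a/2 and falls linearly back to zero at x = a, which is the shape for which a direct projection reproduces (4.43). The units (ħ = m = a = 1) and all function names are hypothetical choices for the example.

```python
import cmath
import math

A = 1.0                 # well width (assumption)
K = math.sqrt(3.0 / A)  # peak height chosen so the triangle is normalised
HBAR = 1.0
MASS = 1.0

def u(n, x):
    """Normalised square-well eigenfunction."""
    return math.sqrt(2.0 / A) * math.sin(n * math.pi * x / A)

def triangle(x):
    """Assumed initial condition: peak of height K at x = A/2."""
    return 2.0 * K * x / A if x <= A / 2.0 else 2.0 * K * (A - x) / A

def c_formula(n):
    """Coefficient from (4.43)."""
    return 8.0 * K / (n * n * math.pi**2) * math.sqrt(A / 2.0) * math.sin(n * math.pi / 2.0)

def c_numeric(n, steps=2000):
    """Coefficient from a direct midpoint-rule projection onto u_n."""
    dx = A / steps
    return sum(u(n, (i + 0.5) * dx) * triangle((i + 0.5) * dx) * dx for i in range(steps))

def energy(n):
    return (HBAR**2 / (2.0 * MASS)) * (n * math.pi / A) ** 2

def psi_series(x, t, n_max=199):
    """Resummed wave function at time t, truncated at n_max terms."""
    return sum(c_formula(n) * u(n, x) * cmath.exp(-1j * energy(n) * t / HBAR)
               for n in range(1, n_max + 1))

print(c_formula(1), c_numeric(1))
print(abs(psi_series(0.5, 0.0)), K)  # the series approaches the peak height
```

Evaluating `psi_series` at t > 0 gives the evolved wave function; each term only acquires a phase, so the sum of |c_n|² stays fixed.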

This is an example of a general rule in QM called completeness: any wave function may be expanded as a series of the eigenfunction solutions of the Schroedinger equation relevant to that problem. In other words in any problem we may write

$$\phi(x) = \sum_n c_n\,u_n(x) \tag{4.45}$$

for any function φ(x), where

$$H u_n = E_n u_n \tag{4.46}$$

We won’t prove this here but if it weren’t true it would be quite surprising! Imagine we had found all the solutions of the Schroedinger equation and then wrote down an initial condition that couldn’t be rewritten in terms of those solutions... we’d have missed the evolution of that initial condition and hence we can’t have had all the solutions! Completeness is needed for the theory to make sense, and it allows us to evolve all initial states with time.

4.1.9 Orthogonality
It is also important in these initial condition problems that there is a unique way of writing

$$\psi(x,0) = \sum_n c_n\,u_n(x) \tag{4.47}$$

If it were not unique then a given initial condition would have more than one expansion which would evolve differently. Again the theory would not make sense.
Each $u_n(x)$ therefore contains unique information. Orthogonality is a mathematical statement of this fact

$$\int_{-\infty}^{\infty} u_n^*(x)\,u_m(x)\,dx = \delta_{nm} \tag{4.48}$$

where $\delta_{nm} = 1$ if m = n and $\delta_{nm} = 0$ if m ≠ n.


You can think of this expression as similar to a dot product between the coordinate axes
vectors (î, ĵ, k̂) - the axes contain the separate information about the three directions in the
space and the dot product is zero between any two orthogonal directions.

Proof: The $u_n$ are eigenfunctions of the Hamiltonian H satisfying $Hu_n = E_n u_n$ so consider

$$\int u_i^*\,H\,u_j\,dx \tag{4.49}$$

We can act with H to either the left or right in which case we will find

$$E_j\int u_i^*\,u_j\,dx = E_i\int u_i^*\,u_j\,dx \tag{4.50}$$

which can only be true for i ≠ j (with $E_i \ne E_j$) if the wave functions are orthogonal and both sides are zero. When i = j the integral over the wave function squared is just the usual probability of finding the particle in all space and is set equal to one.
Now we know enough to derive the coefficients in (4.43). Given

$$\psi(x,t=0) = \sum_n c_n\,u_n(x) \tag{4.51}$$

we multiply by some $u_m^*$ and integrate over all space

$$\int u_m^*\,\psi(x,t=0)\,dx = \sum_n \int u_m^*\,c_n\,u_n(x)\,dx \tag{4.52}$$

and using orthogonality we find only one term of the sum on the right survives and hence

$$c_m = \int u_m^*\,\psi(x,t=0)\,dx \tag{4.53}$$

Using the initial conditions shown in Fig 4.4 and performing the integrals leads to (4.43).
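Orthonormality and the projection formula (4.53) are easy to verify numerically for the square well eigenfunctions. The sketch below is illustrative only: it assumes a well of unit width, and the helper names and the example state `psi0` are hypothetical choices made here.

```python
import math

def u(n, x, a=1.0):
    """Normalised square-well eigenfunction sqrt(2/a) sin(n pi x / a)."""
    return math.sqrt(2.0 / a) * math.sin(n * math.pi * x / a)

def overlap(n, m, a=1.0, steps=2000):
    """Midpoint estimate of the integral of u_n u_m over the well, cf. (4.48)."""
    dx = a / steps
    return sum(u(n, (i + 0.5) * dx, a) * u(m, (i + 0.5) * dx, a) * dx
               for i in range(steps))

def coefficient(m, psi0, a=1.0, steps=2000):
    """c_m = integral of u_m psi0 dx, as in (4.53), for a real psi0."""
    dx = a / steps
    return sum(u(m, (i + 0.5) * dx, a) * psi0((i + 0.5) * dx) * dx
               for i in range(steps))

def psi0(x):
    """Example initial state built from two eigenfunctions (assumption)."""
    return 0.6 * u(1, x) + 0.8 * u(3, x)

print(overlap(2, 3), overlap(3, 3))
print(coefficient(1, psi0), coefficient(2, psi0), coefficient(3, psi0))
```

The projection recovers exactly the coefficients that were used to build `psi0`, which is the uniqueness property discussed above.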

4.1.10 The 3D Schroedinger Equation


We have concentrated on one dimensional problems but the analysis is easily extended to three dimensions. The momentum operator is

$$\hat{\vec{p}} = -i\hbar\,\vec{\nabla} \tag{4.54}$$

The Schroedinger equation becomes

$$i\hbar\,\frac{\partial}{\partial t}\psi = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi \tag{4.55}$$

The probability to find a particle in some infinitesimal box of volume δV is

$$\text{Prob} = \psi^*\psi\,\delta V \tag{4.56}$$

where for example in spherical coordinates $\delta V = r^2\sin\theta\,d\theta\,d\phi\,dr$.

4.1.11 Wave Function Collapse and All That


The most mysterious feature of QM is that a particle is described by a probability wave
which “collapses” during a “measurement” to leave the particle at just one point. In some
sense one should think of a quantum of the particle’s energy as being smeared through the
wave. If we probe the wave at a point and it releases a quantum then it will look like the
particle was at that point. This idea has to allow the wave at a point to “know” what’s going
on in the rest of the wave instantaneously and this is a rather uncomfortable fact. A number
of unresolved ideas to understand things better are:

• Copenhagen Interpretation - don’t philosophise about it, use it!

• Hidden Variables - secretly there is a deterministic description of QM which the wave function is an “average” over.

• Many Worlds - all outcomes happen in parallel universes (this doesn’t explain why a measurement splits the universes though).

None of these are really satisfactory - not least because it is not precisely clear what
constitutes a measurement. Nevertheless QM is the most successful theory physics has and
so is clearly correct. The real impact of these issues is that it is hard to have an intuitive feel
for the subject. In the next chapter we will investigate an alternative formalism for QM in
which the idea of a trajectory for the particle is central, rather than a wave function, and it
allows some classical intuition to be used.

Exercise 1.1:
Make an odd continuation of the solutions to the infinite square well problem and calculate the momentum space wave functions φ(p). What is the physical significance of your result?

Figure 4.5: The classic double slit experiment showing the wave nature of particles.

4.2 Path Integral Approach to Quantum Mechanics


New insights into classical mechanics can be obtained from Hamilton’s principle in which
a classical particle is viewed as following the path which minimizes an Action (see Chapter
1). Feynman developed a Quantum Mechanics version of this idea which we will study here.
We’re going to start with his prescription and see that it is indeed the same theory as the
Schroedinger equation. Although it returns some classical intuition to the quantum world, it
is still a very strange place!

To motivate the form of the theory consider the usual double slit type experiment shown
in Fig 4.5. A classical description in which the particle goes through a single slit will clearly
not do. We will adopt a much more radical idea that the particle travels by ALL possible
paths!
The interference pattern suggests that there should be cancelling and reinforcing phases
in the description. We are therefore led to the proposal of the next section.

4.2.1 Proposal for the Quantum Mechanical Amplitude


Following Feynman, we propose that the probability amplitude for a particle to travel from point A to point B is given schematically by

$$K(B,A) = \text{constant} \sum_{\text{all paths}} e^{iS[\text{path}]/\hbar} \tag{4.57}$$

where S is the classical action of each particular path, and every possible path contributes in the sum.

The probability for a particle to travel from point A to point B is then given by

$$P(B,A) = |K(B,A)|^2 \tag{4.58}$$

where K(B, A) in Eq. 4.57 is called the QM kernel.


Our proposal looks nutty (!) - every possible path is contributing the same constant
amount up to a phase. Can this ever reproduce Hamilton’s principle as the classical limit of
the theory?
Figure 4.6: A collection of paths away from the minimum of the action (∆S ≫ ħ) have rapidly varying phase in the kernel and cancel.

4.2.2 The Classical Limit


If we consider a particle (with momentum p) incident on a hole (of radius r) then we will see large quantum effects only when the wavelength of the wave function associated with the particle is

$$\lambda \gtrsim r \tag{4.59}$$

Of course λ = h/p so it is because h is small in nature that we don’t see quantum effects when we throw cricket balls through doors (of course there might well be some serious classical effects, so don’t try this at home!)
From this discussion we can see that if we take

$$h \to 0 \tag{4.60}$$

then all wavelengths become very small and the theory becomes classical at all length scales.
Note that also in this limit the Uncertainty Principle (∆p∆x ≥ ħ/2) allows both p and x to be measured together which again corresponds to classical physics.

So what does our prescription give in this classical limit h → 0? In general for a set of paths close to each other (as shown in Fig 4.6), in this limit, we will find the difference in the classical action between neighbouring paths

$$\Delta S \gg \hbar \tag{4.61}$$

just because h̄ is so small. This means that these paths have very different phases in the
kernel above. The phase just points out a direction in the complex plane. The sum over these
paths will just average the phase... but if the phases are essentially random as in this case we
will get precisely zero.

The only time this won’t be true is if we find a cluster of paths for which ∆S < h̄. This
will only be true around a minimum of S where there is little change in S. A little cluster of
paths here will all have roughly the same phase and add in such a way as to dominate the
kernel. Thus in the classical limit our prescription does reproduce Hamilton’s Principle.

Incidentally this tells us that in a quantum theory a classical trajectory gets smeared since
it is equally likely to travel on a neighbouring path provided ∆S ≤ h̄.
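The cancellation argument can be illustrated with a one-parameter family of paths. The sketch below is a toy model and not the full sum over paths: it assumes a free particle and the hypothetical family x(t) = Xt/T + ε sin(πt/T), for which the action exceeds the classical value by mπ²ε²/(4T). It then adds up $e^{iS/\hbar}$ over two windows of ε of equal width, one centred on the classical path (ε = 0) and one well away from it. All names and parameter values are choices made for this example.

```python
import cmath
import math

MASS = 1.0
T_TOTAL = 1.0

def action_offset(eps):
    """S - S_classical for the assumed path x(t) = X t/T + eps sin(pi t/T)."""
    return MASS * math.pi**2 * eps**2 / (4.0 * T_TOTAL)

def window_sum(eps_lo, eps_hi, hbar, n=20000):
    """Midpoint sum of exp(i S / hbar) over deviations eps in [eps_lo, eps_hi].
    The common phase exp(i S_classical / hbar) is dropped."""
    d_eps = (eps_hi - eps_lo) / n
    total = 0j
    for i in range(n):
        eps = eps_lo + (i + 0.5) * d_eps
        total += cmath.exp(1j * action_offset(eps) / hbar) * d_eps
    return total

hbar = 0.01
near = abs(window_sum(-0.5, 0.5, hbar))  # window around the classical path
far = abs(window_sum(1.0, 2.0, hbar))    # window of equal width, far from it
print(near, far)
```

With this small value of ħ the window around the stationary point dominates by more than an order of magnitude, and the ratio grows as ħ is reduced further.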
Figure 4.7: Paths from A to B via C, drawn in the (t, x) plane.

4.2.3 Wave Functions


We won’t really believe that this new prescription is QM until we have seen that it gives identical physics to the Schroedinger equation. To move towards that proof let’s see how to relate the kernel to wave functions. We had that for motion from a point $A = (x_a, t_a)$ to a point $B = (x_b, t_b)$

$$\text{Probability}(A \to B) = |K(B,A)|^2 \tag{4.62}$$

If we imagine that the particle began at A at $t_a$ because its wave function was such that

$$|\psi(t_a)|^2 = \delta(x - x_a) \tag{4.63}$$

then we can identify the wave function at a later time $t_b > t_a$ with the kernel

$$\psi(x_b, t_b) = K(B,A) \tag{4.64}$$

where we allow $x_b$ to be any general point at time $t_b$.


Using this result it is possible to derive an expression for the evolution of any wavefunction at some time into the wavefunction at some later time in terms of an integral over the product of the initial wave function and the kernel. In order to do this, consider the set of paths shown in Fig. 4.7. For a path going through C the action divides

$$S_{\text{path}} = S_{AC} + S_{CB} = \int_{t_0}^{t_1} L\,dt + \int_{t_1}^{t_2} L\,dt \tag{4.65}$$

So the contribution to the kernel from all possible paths from A to B through C is given by

$$K(B,A,\text{via } C) = \sum_{A\to C} e^{iS_{AC}/\hbar}\;\cdot \sum_{C\to B} e^{iS_{CB}/\hbar} \tag{4.66}$$

Note that the cross terms in the multiplication of the sums give all combinations of routes A to C with all routes C to B. We therefore have

$$K(B,A,\text{via } C) = \text{constant}\;K(C,A)\,K(B,C) \tag{4.67}$$

These are not all the paths from A to B though because they all go through the special point C. To get all paths from A to B we must let C vary over all possible positions so that

$$K(B,A) = \text{constant}\int_{-\infty}^{\infty} K(C,A)\,K(B,C)\,dx_c \tag{4.68}$$
−∞
Figure 4.8: Paths a particle might take from the point x at time t to x′ at time t′ divided into many very short straight segments (positions x0, x1, x2, x3, ..., xn at times from t0 to tn, separated by ∆t).

We previously, in (4.64), identified K(B, A) as the wave function at time $t_b$ and similarly we can identify here $K(C,A) = \psi(x_c, t_c)$, the wave function at time $t_c$. In both cases the wavefunctions have evolved from the delta function form at time $t_a$ in (4.63) but they can be arbitrarily complicated depending on the evolution, for example, through some potential. Thus this expression tells us how one wave function evolves into another

$$\psi(x_b, t_b) = \text{constant}\int_{-\infty}^{\infty} \psi(x_c, t_c)\,K(B,C)\,dx_c \tag{4.69}$$

The evolution is controlled by the kernel.

4.2.4 Deriving the Schroedinger Equation


We want to show that the path integral expression for the evolution of a wave function is the same as the Schroedinger equation. The analysis below makes use of Gaussian integrals which are reviewed in the Appendix.
To derive the standard Schroedinger equation we must look at a particle with the Lagrangian

$$L = \frac{1}{2}m\dot{x}^2 - V(x) \tag{4.70}$$

The path integral expression for how the wave function evolves is

$$\psi(x',t') = A\int_{-\infty}^{\infty} K(x',t';x,t)\,\psi(x,t)\,dx \tag{4.71}$$

We need a way to keep track of all possible paths in order to work out the kernel. One way to do this is to divide time up into infinitesimal time slices and assume that the particle travels in a straight line at constant speed in any such time slice as shown in Fig 4.8.

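Now we can consider the time evolution of the wave function just across one ∆t time slice. We’ll assume that the particle doesn’t travel too far in any time slice (so x = x′ + ∆x) and that its velocity is constant along the way

$$\psi(x',t+\Delta t) = A\int_{-\infty}^{\infty} K(x',t+\Delta t;x,t)\,\psi(x,t)\,dx \tag{4.72}$$

We know the kernel here because the paths are always straight lines (it’s just $\exp(iS_{\text{path}}/\hbar)$)

$$\begin{aligned}
S_{x\to x'} &= \int_t^{t+\Delta t} L(x,\dot{x})\,dt \\
&= L\!\left(\frac{x+x'}{2},\,\frac{x'-x}{\Delta t}\right)\Delta t \\
&= \left[\frac{1}{2}m\left(\frac{x'-x}{\Delta t}\right)^2 - V\!\left(\frac{x+x'}{2}\right)\right]\Delta t
\end{aligned} \tag{4.73}$$

Thus our wave function evolves as

$$\psi(x',t+\Delta t) = A\int_{-\infty}^{\infty} \exp\left\{\frac{i\,\Delta t}{\hbar}\left[\frac{1}{2}m\left(\frac{x'-x}{\Delta t}\right)^2 - V\!\left(\frac{x+x'}{2}\right)\right]\right\}\psi(x,t)\,dx \tag{4.74}$$

There are lots of small terms in this expression so we can perform an expansion in them

$$x - x' = \Delta x$$

$$\psi(x',t+\Delta t) = \psi(x',t) + \Delta t\,\frac{\partial\psi(x',t)}{\partial t} + \dots$$

$$\psi(x,t) = \psi(x',t) + \Delta x\,\frac{\partial\psi(x',t)}{\partial x'} + \frac{(\Delta x)^2}{2}\frac{\partial^2\psi(x',t)}{\partial x'^2} + \dots$$

$$e^{-\frac{i\Delta t}{\hbar}V\left(\frac{x+x'}{2}\right)} = 1 - \frac{i\Delta t}{\hbar}V(x') + \dots$$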

• To zeroth order our expression is, keeping x′ constant in the integral,

$$\psi(x',t) = A\int_{-\infty}^{\infty} e^{\,i\frac{m\,\Delta x^2}{2\hbar\Delta t}}\,\psi(x',t)\,d(\Delta x) \tag{4.75}$$

Note we’ve changed from summing over all x to summing over all ∆x but these are equivalent!
The integral is just a Gaussian integral as in Eq. 4.110 and so

$$\psi(x',t) = A\left(\frac{2\pi i\hbar\,\Delta t}{m}\right)^{1/2}\psi(x',t) \tag{4.76}$$

which can only be true if

$$A = \left(\frac{2\pi i\hbar\,\Delta t}{m}\right)^{-1/2} \tag{4.77}$$

We have derived an expression for the constant in the wave function evolution equation.

• The Schroedinger equation emerges at the next leading order

$$\Delta t\,\frac{\partial\psi(x',t)}{\partial t} = A\int_{-\infty}^{\infty} e^{\,i\frac{m\,\Delta x^2}{2\hbar\Delta t}}\left[-i\frac{\Delta t}{\hbar}V(x')\,\psi(x',t) + \Delta x\,\frac{\partial\psi(x',t)}{\partial x'} + \frac{(\Delta x)^2}{2}\frac{\partial^2\psi(x',t)}{\partial x'^2}\right]d(\Delta x) \tag{4.78}$$

Each term on the right hand side is a Gaussian style integral again. The middle term has a single power of ∆x so is an odd integral and zero. Using Eqs. 4.110 and 4.112, the remaining terms give

$$\Delta t\,\frac{\partial\psi(x',t)}{\partial t} = -i\frac{\Delta t}{\hbar}V(x')\,\psi(x',t) + \frac{i\hbar\,\Delta t}{2m}\frac{\partial^2\psi(x',t)}{\partial x'^2} \tag{4.79}$$

Multiplying through by iħ/∆t reproduces (4.7), or in other words the Schroedinger equation.

4.2.5 Path Integral for a Free Particle


The path integral provides a nice way to think about quantum mechanics but in truth the Schroedinger equation is usually easier to solve. Let’s look at a very simple problem - a free particle - using the path integral approach though.
We will split the free particle’s trajectory up into ∆t time slices again (see Fig 4.8 but now with V = 0). We have already determined that the kernel for motion over one time slice is

$$K(x_1,x_0) = \sqrt{\frac{m}{2\pi i\hbar\,\Delta t}}\;\exp\left[\frac{im}{2\hbar}\frac{(x_1-x_0)^2}{\Delta t}\right] \tag{4.80}$$

To combine two time slices we multiply the kernels for the two separate motions and integrate over the position of the central point as in (4.68)

$$K(x_2,x_0) = \left(\frac{m}{2\pi i\hbar\,\Delta t}\right)\int \exp\left\{\frac{im}{2\hbar}\left[\frac{(x_1-x_0)^2}{\Delta t} + \frac{(x_2-x_1)^2}{\Delta t}\right]\right\}dx_1 \tag{4.81}$$

which we can do using the Gaussian integral result below Eq. 4.113.

$$K(x_2,x_0) = \sqrt{\frac{m}{2\pi i\hbar\,2\Delta t}}\;\exp\left[\frac{im}{2\hbar}\frac{(x_2-x_0)^2}{2\Delta t}\right] \tag{4.82}$$

Note that all that has happened is that we have recovered the result for one time slice but with the time doubled and the distance travelled lengthened. One can keep repeating the above calculation adding time slices and the final result for the whole motion is then just

$$\begin{aligned}
K(x_n,x_0) &= \sqrt{\frac{m}{2\pi i\hbar\,n\Delta t}}\;\exp\left[\frac{im}{2\hbar}\frac{(x_n-x_0)^2}{n\Delta t}\right] \\
&= \sqrt{\frac{m}{2\pi i\hbar\,(t_n-t_0)}}\;\exp\left[\frac{im}{2\hbar}\frac{(x_n-x_0)^2}{(t_n-t_0)}\right]
\end{aligned} \tag{4.83}$$

Note the form of the exponential is easy to remember because it’s just exp(i∆tKE/ħ) with KE the classical kinetic energy assuming constant velocity.
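A quick numerical check that this kernel is a sensible wave function is to confirm that it satisfies the free (V = 0) Schroedinger equation (4.7). The sketch below is illustrative only: it assumes ħ = m = 1, sets x0 = 0 and t0 = 0, and uses hypothetical helper names; the derivatives are taken by central differences.

```python
import cmath
import math

HBAR = 1.0  # natural units (assumption)
MASS = 1.0

def kernel(x, t):
    """Free-particle kernel of (4.83) with x0 = 0 and t0 = 0."""
    prefactor = cmath.sqrt(MASS / (2j * math.pi * HBAR * t))
    return prefactor * cmath.exp(1j * MASS * x * x / (2.0 * HBAR * t))

def schroedinger_residual(x, t, h=1e-4):
    """i hbar dK/dt + (hbar^2 / 2m) d2K/dx2, which should vanish for V = 0."""
    dK_dt = (kernel(x, t + h) - kernel(x, t - h)) / (2.0 * h)
    d2K_dx2 = (kernel(x + h, t) - 2.0 * kernel(x, t) + kernel(x - h, t)) / h**2
    return 1j * HBAR * dK_dt + (HBAR**2 / (2.0 * MASS)) * d2K_dx2

residual = schroedinger_residual(1.0, 1.0)
print(abs(residual), abs(kernel(1.0, 1.0)))
```

The residual is at the level of the finite-difference error, many orders of magnitude below the size of the kernel itself.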
Figure 4.9: The real part of the kernel, Re(K), for a free particle plotted against position x at some fixed time (it takes the form cos x²).

4.2.6 Interpreting the Free Particle Kernel


We can see that this answer encodes a number of QM results we already know. First set x0 = 0 and t0 = 0 for simplicity so

$$K(x,t) = \sqrt{\frac{m}{2\pi i\hbar t}}\;e^{\,i\frac{mx^2}{2\hbar t}} \tag{4.84}$$

From (4.64) we know that K(x,t) = ψ(x,t) is a free particle wavefunction if the particle started from a Dirac delta function at the origin. Now if we plot the real part of K(x,t) at some later t it looks like Fig 4.9.
It is a wave whose wavelength shortens as we go to larger x. Classically for a particle to have got to some x in time t it must have

$$p = m\,\frac{x}{t} \tag{4.85}$$

The QM version of this result is that the approximate wavelength of the kernel at some x is given by the condition

$$\begin{aligned}
\Delta\text{phase} &= 2\pi \\
&= \frac{m(x+\lambda)^2}{2\hbar t} - \frac{mx^2}{2\hbar t} \\
&\simeq \frac{mx\lambda}{\hbar t}
\end{aligned} \tag{4.86}$$

where we have expanded in λ/x. We find

$$\lambda = \frac{2\pi\hbar}{mx/t} = \frac{h}{p} \tag{4.87}$$

a familiar result. The interpretation is that the higher momentum (smaller wavelength) components of the wavepacket travel further out in a given time.
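The quality of the expansion in λ/x can be seen by solving the phase condition in (4.86) exactly and comparing with h/p. The sketch below is illustrative: it assumes ħ = m = 1 and uses hypothetical function names. The exact local wavelength is the positive root of the quadratic that results from setting the phase difference equal to 2π.

```python
import math

HBAR = 1.0  # natural units (assumption)
MASS = 1.0

def phase(x, t):
    """Phase of the free-particle kernel (4.84)."""
    return MASS * x * x / (2.0 * HBAR * t)

def local_wavelength(x, t):
    """Exact lambda solving phase(x + lambda) - phase(x) = 2 pi."""
    return -x + math.sqrt(x * x + 4.0 * math.pi * HBAR * t / MASS)

def de_broglie_wavelength(x, t):
    """h / p with the classical momentum p = m x / t of (4.85)."""
    p = MASS * x / t
    return 2.0 * math.pi * HBAR / p

x, t = 50.0, 1.0
lam_exact = local_wavelength(x, t)
lam_db = de_broglie_wavelength(x, t)
print(lam_exact, lam_db)  # agreement improves as lambda / x becomes smaller
```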
Similarly we can fix x in K(x,t) and plot the real part against t as shown in Fig 4.10.
Figure 4.10: The real part of the kernel, Re(K), for a free particle plotted against time t at some fixed position (it takes the form cos(1/t)).

We can work out the period of the wave at some t as we did the wavelength above

mx2 mx2
2π = 2h̄t − 2h̄(t+T )

mx2
1 − (1 + T /t)−1

= (4.88)
2h̄t

mx2
≃ 2h̄t 2
T
The angular frequency is

1 mx2
ω = 2π /T = (4.89)
h̄ 2t 2
which, up to the factor of h̄ is just the kinetic energy of the particle and hence

E = h̄ω (4.90)
The interpretation is that the higher energy (higher frequency) components of the wavepacket
pass by a fixed point earlier in time.
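
The same kind of check works for the period. This is a sketch under the same assumptions as before (ħ = 1, illustrative numbers, hypothetical function names): the exact period solves the first line of (4.88), and it is compared with 2πħ/E where E is the classical kinetic energy.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1


def exact_period(x, t, m=1.0):
    """Solve a/t - a/(t + T) = 2 pi for T, with a = m x^2 / (2 hbar).

    Only meaningful while a > 2 pi t, i.e. while the phase is still
    changing quickly at time t.
    """
    a = m * x * x / (2.0 * HBAR)
    return 2.0 * math.pi * t * t / (a - 2.0 * math.pi * t)


def period_from_energy(x, t, m=1.0):
    """T = 2 pi hbar / E with E the classical kinetic energy m (x/t)^2 / 2."""
    energy = 0.5 * m * (x / t) ** 2
    return 2.0 * math.pi * HBAR / energy


if __name__ == "__main__":
    print(exact_period(10.0, 0.01), period_from_energy(10.0, 0.01))
```

The agreement improves as T/t becomes small, which is the expansion parameter used in (4.88).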

4.2.7 Barrier Problems


Knowing the kernel for a free particle we can solve a number of problems involving particles starting from a point source, passing through a barrier and eventually ending up on a screen. To find the kernel associated with the particles' motion from the source to the screen we must sum $e^{iS/\hbar}$ for all the paths not blocked by the barrier. On these paths the particles are free so

$$K = C(t)\, e^{\frac{i m x^2}{2\hbar t}} \tag{4.91}$$

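Figure 4.11: A single slit barrier problem. [Diagram labels: source, barrier, screen, slit width d, point P on the screen at distance $L_0$ and angle θ, path difference $\frac{d}{2}\sin\theta$.]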

If we assume the source is at infinity then the distance from the source to any point on the barrier is the same. We can therefore treat each point on the barrier as an equal emitter of particles and just sum $e^{iS/\hbar}$ for the paths from the barrier to the screen. We find

$$K(\text{screen}) = A(t) \int_{\text{barrier}} e^{\frac{i m x_{\text{path}}^2}{2\hbar t}}\, f(s)\, ds \tag{4.92}$$

Here A(t) is a factor depending only on time, the exponential is the contribution from the action of each path, f(s) is either 1 or 0 depending upon whether that point on the barrier is a hole or blocks the particle, and finally ds sums over all points on the barrier. Compare this to (4.68).

Example - Single Slit

Let's look at a simple barrier with a single slit opening of width d as shown in Fig 4.11. We will work in the narrow width approximation where $d \ll L_0$. The distance from a point P on the screen to each element of the hole is

$$L_0 + x\sin\theta, \qquad -\frac{d}{2} < x < \frac{d}{2} \tag{4.93}$$
Our expression for the kernel is therefore

$$K(P,t) = A(t)\int_{-d/2}^{d/2} e^{\,i m (L_0 + x\sin\theta)^2/2\hbar t}\, dx \tag{4.94}$$
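Since $L_0 \gg d$ we can drop the $x^2\sin^2\theta$ term in the exponent, so

$$\begin{aligned}
K(P,t) &\simeq A\, e^{i m L_0^2/2\hbar t} \int_{-d/2}^{d/2} e^{\,i\,2 m L_0 x \sin\theta/2\hbar t}\, dx \\
&\simeq A\, e^{i m L_0^2/2\hbar t} \left[\frac{\hbar t}{i m L_0 \sin\theta}\, e^{\,i m L_0 x\sin\theta/\hbar t}\right]_{-d/2}^{d/2} \\
&\simeq \frac{-i A(t)\,\hbar t\, e^{i m L_0^2/2\hbar t}}{m L_0 \sin\theta}\; 2i \sin\!\left(\frac{m L_0 d \sin\theta}{2\hbar t}\right)
\end{aligned} \tag{4.95}$$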

Figure 4.12: The probability function for the end point of a particle passing through a single slit. [Plot of $\sin^2 x / x^2$ against $x = \sin\theta$.]

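The probability of finding a particle at P is

$$|K(P,t)|^2 = \frac{|A|^2\hbar^2 t^2}{m^2 L_0^2 \sin^2\theta}\; 4\sin^2\!\left(\frac{m L_0 d\sin\theta}{2\hbar t}\right) \simeq \text{constant}\times\frac{\sin^2(\alpha\sin\theta)}{\beta\sin^2\theta} \tag{4.96}$$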

where α and β are just constants. We can plot the rough form of this solution and find the
form in Fig 4.12.
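Note that the minima are when

$$\frac{m L_0 d}{2\hbar t}\sin\theta = n\pi \tag{4.97}$$

for nonzero integer n, i.e. when

$$d\sin\theta = n\pi\,\frac{2\hbar t}{m L_0} = n\,\frac{h}{p} = n\lambda \tag{4.98}$$

where $p = m L_0/t$ is the classical momentum. This is the usual result for destructive interference.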
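
As a rough numerical illustration of the single slit result, the sketch below evaluates the x integral of (4.95) both in closed form and by a simple midpoint sum, and confirms that the probability vanishes when d sin θ = nλ. It assumes the combination $mL_0/\hbar t$ has been rewritten as 2π/λ with λ = h/p; the function names and the numbers used are illustrative only.

```python
import cmath
import math


def slit_amplitude_closed(sin_theta, d, wavelength):
    """Closed form 2 sin(k d / 2) / k of the slit integral, k = 2 pi sin(theta) / wavelength."""
    k = 2.0 * math.pi * sin_theta / wavelength
    if abs(k) < 1e-12:
        return d  # limit k -> 0: the integrand is 1 across the slit
    return 2.0 * math.sin(0.5 * k * d) / k


def slit_amplitude_numeric(sin_theta, d, wavelength, steps=2000):
    """Midpoint-rule sum of exp(i k x) over -d/2 < x < d/2."""
    k = 2.0 * math.pi * sin_theta / wavelength
    dx = d / steps
    total = 0j
    for j in range(steps):
        x = -0.5 * d + (j + 0.5) * dx
        total += cmath.exp(1j * k * x)
    return total * dx


def relative_probability(sin_theta, d, wavelength):
    """|K|^2 up to the overall factor |A|^2, as in (4.96)."""
    return slit_amplitude_closed(sin_theta, d, wavelength) ** 2


if __name__ == "__main__":
    d, wavelength = 1.0, 0.1
    for n in (1, 2, 3):
        print(n, relative_probability(n * wavelength / d, d, wavelength))
```

The overall phase and the prefactor A(t) are dropped here because they do not affect the position of the minima.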

4.2.8 The Kernel in Terms of Wave Functions


In order to switch between the Schroedinger equation formalism and the path integral formalism it is helpful to have an expression for the kernel in terms of wave functions.
To find this form remember that

$$\psi(x,t_2) = \int_{-\infty}^{\infty} K(x,t_2;y,t_1)\,\psi(y,t_1)\,dy \tag{4.99}$$

Let's now try to get an equivalent statement starting from the time independent Schroedinger equation

$$H\phi_n = E_n\phi_n \tag{4.100}$$

If we start with some wavepacket at time t1 we can use completeness to write it as



$$\psi(x,t_1) = \sum_{n=1}^{\infty} c_n\,\phi_n(x) \tag{4.101}$$
Equally we can invert this expression to give

$$c_n = \int_{-\infty}^{\infty}\phi_n^*(y)\,\psi(y,t_1)\,dy \tag{4.102}$$

We've used the orthonormality of the wave functions here to pick out the coefficient of a particular $\phi_n$ by multiplying by $\phi_n^*$ and integrating over all space. Again I've switched x → y as a reminder that the answer doesn't depend on the name of the integration variable.

Furthermore we know how ψ (x,t1 ) evolves in time to time t2



$$\psi(x,t_2) = \sum_{n=1}^{\infty} c_n\,\phi_n(x)\,e^{-iE_n(t_2-t_1)/\hbar} \tag{4.103}$$
Substituting in our expression for the cn we find
$$\psi(x,t_2) = \int_{-\infty}^{\infty}\sum_{n=1}^{\infty}\phi_n(x)\,e^{-iE_n(t_2-t_1)/\hbar}\,\phi_n^*(y)\,\psi(y,t_1)\,dy \tag{4.104}$$

Comparing back to the path integral result (4.99) we see that

$$K(x,t_2;y,t_1) = \sum_{n=1}^{\infty}\phi_n(x)\,\phi_n^*(y)\,e^{-iE_n(t_2-t_1)/\hbar} \tag{4.105}$$
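
To see the eigenfunction form of the kernel at work, here is a small sketch using a system that is not discussed in this section: a particle in an infinite square well of width L, chosen purely because its $\phi_n$ and $E_n$ are simple. The units (ħ = m = L = 1), the truncation of the sum and all function names are assumptions made for illustration. The code propagates a two-state superposition once with the kernel integral and once by attaching the phases directly, and the two should agree.

```python
import cmath
import math

L = 1.0      # assumption: box width
HBAR = 1.0   # assumption: natural-style units
MASS = 1.0


def phi(n, x):
    """Real, normalised square-well eigenfunction."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)


def energy(n):
    return (n * math.pi * HBAR / L) ** 2 / (2.0 * MASS)


def kernel(x, y, dt, n_max):
    """Sum over eigenfunctions as in (4.105), truncated at n_max terms."""
    return sum(phi(n, x) * phi(n, y) * cmath.exp(-1j * energy(n) * dt / HBAR)
               for n in range(1, n_max + 1))


def psi_initial(y):
    """Equal superposition of the two lowest states."""
    return (phi(1, y) + phi(2, y)) / math.sqrt(2.0)


def evolve_with_kernel(x, dt, n_max=6, points=200):
    """psi(x, t2) from the integral of K times psi over y (midpoint rule on [0, L])."""
    dy = L / points
    total = 0j
    for j in range(points):
        y = (j + 0.5) * dy
        total += kernel(x, y, dt, n_max) * psi_initial(y)
    return total * dy


def evolve_directly(x, dt):
    """psi(x, t2) from the expansion with phases exp(-i E_n dt / hbar)."""
    return sum(phi(n, x) * cmath.exp(-1j * energy(n) * dt / HBAR)
               for n in (1, 2)) / math.sqrt(2.0)


if __name__ == "__main__":
    print(evolve_with_kernel(0.3, 0.7), evolve_directly(0.3, 0.7))
```

Truncating the sum is harmless here only because the initial state contains just the first two eigenfunctions; a general wavepacket would need more terms.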
Exercise 2.1:
Show that for a free particle travelling from $x_a$ at $t_a$ to $x_b$ at $t_b$ the classical action is given by

$$S_{\text{classical}} = \frac{1}{2}\,m\,\frac{(x_b - x_a)^2}{(t_b - t_a)}$$

Exercise 2.2:
Perform the Gaussian integral
$$\int_{-\infty}^{\infty} e^{-\alpha x^2 - \beta x}\,dx$$

Hint: Complete the square!

Exercise 2.3:
Consider a non-relativistic, free particle of mass m travelling in two dimensions between
two points A and B on the x axis equally spaced about the y axis. Consider paths where the
particle travels in a straight line at constant speed to an arbitrary point on the y axis and then
in a straight line at the same speed to B, taking total time T. Calculate the action for these
paths. Argue that classically the particle will travel in a straight line. Quantum mechanically
the path is smeared. Estimate the width of the path when the particle crosses the y axis.

Exercise 2.4:
A massive, non-relativistic particle emitted by a source at infinity encounters a sheet of
absorbing material with a circular hole of side a in it. Derive an expression for the quantum
probability for finding the particle at a distance d along the axis of the hole on the far side at
a time T .

4.2.9 Appendix - Gaussian Integrals


We will need to know the results of the following integrals (but we’ll need a few tricks in
order to calculate them)
$$I_n(\alpha) = \int_{-\infty}^{\infty} x^n e^{-\alpha x^2}\,dx \tag{4.106}$$

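• Firstly consider when n = 0. The trick is to calculate $I_0^2$

$$I_0^2(\alpha) = \int_{-\infty}^{\infty} e^{-\alpha x^2}\,dx \int_{-\infty}^{\infty} e^{-\alpha y^2}\,dy = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\alpha(x^2+y^2)}\,dx\,dy \tag{4.107}$$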
This is a two dimensional integral in the x, y plane and we can switch to polar coordinates r, θ

$$I_0^2(\alpha) = \int_0^{\infty}\int_0^{2\pi} e^{-\alpha r^2}\,(r\,dr\,d\theta) \tag{4.108}$$

since $r\,dr = d(r^2)/2$

$$I_0^2(\alpha) = \frac{1}{2}\,[\theta]_0^{2\pi}\,\frac{1}{(-\alpha)}\left[e^{-\alpha r^2}\right]_0^{\infty} = \frac{\pi}{\alpha} \tag{4.109}$$
and thus

$$I_0(\alpha) = \sqrt{\frac{\pi}{\alpha}} \tag{4.110}$$

• When n is an ODD number the integrand is ODD and therefore the integral is zero.

• To obtain the result for EVEN n note that

$$I_{2n}(\alpha) = (-1)^n \frac{d^n}{d\alpha^n} I_0(\alpha) \tag{4.111}$$

Thus for example



$$I_2 = \int_{-\infty}^{\infty} x^2 e^{-\alpha x^2}\,dx = -\frac{d}{d\alpha} I_0(\alpha) = \frac{1}{2\alpha}\sqrt{\frac{\pi}{\alpha}} \tag{4.112}$$
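
These closed forms are easy to sanity-check numerically. The sketch below is a minimal check, assuming an illustrative value of α and a plain midpoint rule on a finite interval (the function names are hypothetical). It compares the numerical moments with (4.110), (4.112) and with the n = 2 case of (4.111), which gives $I_4 = \frac{3}{4\alpha^2}\sqrt{\pi/\alpha}$.

```python
import math


def gaussian_moment_numeric(n, alpha, half_width=12.0, steps=20000):
    """Midpoint-rule estimate of I_n(alpha) on [-half_width, half_width]."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        x = -half_width + (j + 0.5) * dx
        total += x ** n * math.exp(-alpha * x * x)
    return total * dx


def i0(alpha):
    """Closed form (4.110)."""
    return math.sqrt(math.pi / alpha)


def i2(alpha):
    """Closed form (4.112)."""
    return math.sqrt(math.pi / alpha) / (2.0 * alpha)


def i4(alpha):
    """Second derivative of I_0 with respect to alpha, from (4.111) with n = 2."""
    return 3.0 * math.sqrt(math.pi / alpha) / (4.0 * alpha ** 2)


if __name__ == "__main__":
    alpha = 1.5
    for n, closed in ((0, i0(alpha)), (2, i2(alpha)), (4, i4(alpha))):
        print(n, gaussian_moment_numeric(n, alpha), closed)
```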

Finally we shall also need the related integral


$$J = \int_{-\infty}^{\infty} e^{-\alpha\left[(x_1 - x_0)^2 + (x_2 - x_1)^2\right]}\,dx_1 \tag{4.113}$$
which is simplified by noting that

$$(x_1 - x_0)^2 + (x_2 - x_1)^2 = 2\left[x_1 - \frac{(x_2 + x_0)}{2}\right]^2 + \frac{(x_2 - x_0)^2}{2} \tag{4.114}$$
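Now if we change the integration variable to $w = x_1 - (x_2 + x_0)/2$ ($dw = dx_1$) we find

$$J = \int_{-\infty}^{\infty} e^{-2\alpha w^2}\, e^{-\frac{\alpha}{2}(x_2 - x_0)^2}\,dw \tag{4.115}$$

$$J = \sqrt{\frac{\pi}{2\alpha}}\; e^{-\frac{\alpha}{2}(x_2 - x_0)^2} \tag{4.116}$$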
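
The result for J can be checked in the same spirit. This sketch assumes a real, positive α and arbitrary illustrative end points $x_0$, $x_2$ (in the path integral itself α is imaginary, which this simple check does not attempt to cover). It integrates (4.113) over the intermediate point $x_1$ numerically and compares with (4.116).

```python
import math


def j_numeric(alpha, x0, x2, half_width=15.0, steps=30000):
    """Midpoint-rule estimate of (4.113), integrating over the intermediate point x1."""
    centre = 0.5 * (x0 + x2)  # the integrand is a Gaussian centred here
    dx = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        x1 = centre - half_width + (j + 0.5) * dx
        total += math.exp(-alpha * ((x1 - x0) ** 2 + (x2 - x1) ** 2))
    return total * dx


def j_closed(alpha, x0, x2):
    """Closed form (4.116)."""
    return math.sqrt(math.pi / (2.0 * alpha)) * math.exp(-0.5 * alpha * (x2 - x0) ** 2)


if __name__ == "__main__":
    print(j_numeric(0.8, -0.3, 1.1), j_closed(0.8, -0.3, 1.1))
```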


4.3 Relativistic Quantum Mechanics - The Klein Gordon Equation

In this section we will study relativistic quantum mechanics. In a particle accelerator we
are interested in, for example, the interactions of highly energetic electrons so the need to
combine relativity and quantum mechanics is pressing.
We will use natural units henceforth. This firstly means redefining the unit of distance so that c = 1. Secondly we will redefine the unit of energy so that E = hν = 2πν, i.e. set $\hbar = 1$. So mass, energy, inverse length and inverse time all have the same dimensions. Generally think of energy E as the basic unit, e.g. mass m has units of GeV and distance x has units of GeV$^{-1}$.
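
To get a feel for these units, the following sketch converts lengths and times quoted in GeV$^{-1}$ back to femtometres and seconds. The conversion factors are the standard values of ħc and ħ; the function names are hypothetical.

```python
HBAR_C_GEV_FM = 0.1973269804   # hbar * c in GeV fm
HBAR_GEV_S = 6.582119569e-25   # hbar in GeV s


def inverse_gev_to_fm(length_in_inverse_gev):
    """Convert a length in GeV^-1 (with hbar = c = 1) to femtometres."""
    return length_in_inverse_gev * HBAR_C_GEV_FM


def inverse_gev_to_seconds(time_in_inverse_gev):
    """Convert a time in GeV^-1 (with hbar = c = 1) to seconds."""
    return time_in_inverse_gev * HBAR_GEV_S


if __name__ == "__main__":
    # Length scale 1/m for a particle of mass 0.938 GeV (roughly the proton mass)
    print(inverse_gev_to_fm(1.0 / 0.938), "fm")
    print(inverse_gev_to_seconds(1.0), "s per GeV^-1")
```

So a distance of 1 GeV$^{-1}$ is about a fifth of a femtometre, which is why GeV energies probe sub-nuclear distances.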
For a free relativistic particle the total energy E is no longer given by the equation we
used to derive the Schroedinger equation in section 4.1. Instead it is given by the Einstein
equation
$$E^2 = \vec{p}^{\,2} + m^2. \tag{4.117}$$

In position space we write the energy-momentum operator as

$$\hat{p}^\mu \to i\partial^\mu, \qquad (\hat{E}, \hat{\vec{p}}) = \left(i\frac{\partial}{\partial t},\, -i\vec{\nabla}\right) \tag{4.118}$$

Note that the minus sign in the spatial part of $\partial^\mu$ matches and explains the sign in the standard operator relations (4.4, 4.5).
Substituting these operators generates the Klein Gordon equation

$$(\Box + m^2)\,\phi(x) = 0 \tag{4.119}$$

where we have introduced the box notation,

$$\Box = \partial_\mu\partial^\mu = \partial^2/\partial t^2 - \nabla^2 \tag{4.120}$$

and x is the 4-vector $(t, \vec{x})$.

The Klein-Gordon equation has plane wave solutions:

$$\phi(x) = N e^{-i(Et - \vec{p}\cdot\vec{x})} \tag{4.121}$$

where N is a normalization constant and if we substitute the solution into the equation we recover

$$E = \pm\sqrt{|\vec{p}|^2 + m^2} \tag{4.122}$$
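
The substitution can also be checked numerically. The sketch below is restricted to one space dimension and uses central finite differences with an illustrative step size; the function names and the numbers are assumptions. It evaluates $(\partial_t^2 - \partial_x^2 + m^2)\phi$ for the plane wave and shows that the residual vanishes for both signs of E in (4.122) but not for an off-shell energy.

```python
import cmath
import math


def plane_wave(t, x, energy, momentum, norm=1.0):
    """phi = N exp(-i (E t - p x)) in one space dimension."""
    return norm * cmath.exp(-1j * (energy * t - momentum * x))


def klein_gordon_residual(t, x, energy, momentum, mass, h=1e-3):
    """(d^2/dt^2 - d^2/dx^2 + m^2) phi, using central differences of step h."""
    def f(tt, xx):
        return plane_wave(tt, xx, energy, momentum)

    d2t = (f(t + h, x) - 2.0 * f(t, x) + f(t - h, x)) / h ** 2
    d2x = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / h ** 2
    return d2t - d2x + mass ** 2 * f(t, x)


if __name__ == "__main__":
    mass, momentum = 1.0, 0.75
    on_shell = math.sqrt(momentum ** 2 + mass ** 2)
    for energy in (on_shell, -on_shell, 2.0):
        print(energy, abs(klein_gordon_residual(0.4, -0.2, energy, momentum, mass)))
```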

4.3.1 Problems in the Klein Gordon Equation


There are two problems with this equation though. Indeed historically Schroedinger originally began by writing down this relativistic equation but then retreated to his non-relativistic equation because of the issues we will discuss here.
Firstly there are both positive and negative energy solutions because of the square root
in (4.122). The negative energy solutions pose a severe problem if you try to interpret φ as
a wave function as we are trying to do. The spectrum is no longer bounded from below, and
you can extract arbitrarily large amounts of energy from the system by driving it into ever
more negative energy states. The system is completely unstable! Any external perturbation
capable of pushing a particle across the energy gap of 2m between the positive and negative
energy continuum of states can uncover this difficulty. Furthermore, we cannot just throw
away these solutions as unphysical since they appear as part of the complete set of states
(as discussed in section 1.8) for the Klein Gordon equation and so emerge in almost any
problem.

A second problem with the wave function interpretation arises when trying to find a probability density. In relativity a density transforms under boosts, since lengths contract, and forms part of a 4-vector with the current density. Here since φ is Lorentz invariant, $|\phi|^2$ does not transform like a density so we will not have a Lorentz covariant continuity equation

$$\partial_t\rho + \vec{\nabla}\cdot\vec{J} = 0, \qquad \partial_\mu J^\mu = 0 \tag{4.123}$$

We can derive a candidate for the probability density/current by finding something which does satisfy such a continuity equation, as we did in section 1.4 for the Schroedinger equation. As there, one starts with the Klein-Gordon equation multiplied by $\phi^*$ and subtracts the complex conjugate of the KG equation multiplied by φ. (4.123) emerges with $J^\mu = (\rho, \vec{J})$ and

$$\rho \equiv i\left(\phi^*\frac{\partial\phi}{\partial t} - \phi\frac{\partial\phi^*}{\partial t}\right), \tag{4.124}$$

$$\vec{J} \equiv -i\left(\phi^*\vec{\nabla}\phi - \phi\vec{\nabla}\phi^*\right) \tag{4.125}$$

It is thus natural to interpret ρ as a probability density and $\vec{J}$ as a probability current.


However, for a plane wave solution (4.121), $\rho = 2|N|^2 E$, so ρ is not positive definite since we've already found E can be negative. This clearly makes no sense!
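
The value of ρ for a plane wave follows from (4.124) in one line, but a quick numerical sketch makes the sign problem concrete. The code below works in one space dimension with central finite differences, and the normalisation N and the values of E and p are arbitrary illustrative choices (for this check the density depends only on the derivatives of the plane wave, so E and p need not be on shell).

```python
import cmath


def plane_wave(t, x, energy, momentum, norm):
    """phi = N exp(-i (E t - p x)) in one space dimension."""
    return norm * cmath.exp(-1j * (energy * t - momentum * x))


def kg_density(t, x, energy, momentum, norm, h=1e-5):
    """rho = i (phi* d_t phi - phi d_t phi*), from (4.124), via a central difference in t."""
    phi = plane_wave(t, x, energy, momentum, norm)
    dphi_dt = (plane_wave(t + h, x, energy, momentum, norm)
               - plane_wave(t - h, x, energy, momentum, norm)) / (2.0 * h)
    rho = 1j * (phi.conjugate() * dphi_dt - phi * dphi_dt.conjugate())
    return rho.real


def kg_current(t, x, energy, momentum, norm, h=1e-5):
    """J = -i (phi* d_x phi - phi d_x phi*), from (4.125), via a central difference in x."""
    phi = plane_wave(t, x, energy, momentum, norm)
    dphi_dx = (plane_wave(t, x + h, energy, momentum, norm)
               - plane_wave(t, x - h, energy, momentum, norm)) / (2.0 * h)
    current = -1j * (phi.conjugate() * dphi_dx - phi * dphi_dx.conjugate())
    return current.real


if __name__ == "__main__":
    norm = 0.5 + 0.5j  # |N|^2 = 0.5
    for energy in (1.25, -1.25):
        print(energy, kg_density(0.3, 0.1, energy, 0.75, norm))
```

With $|N|^2 = 0.5$ the density comes out close to E itself, so flipping the sign of E flips the sign of ρ.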
We should note that the equation is a candidate to describe spinless relativistic particles only since there is just a single probability density describing a particle state (as in the Schroedinger equation).

Exercise 3.1: Derive (4.124) and (4.125).

4.3.2 Feynman Stueckelberg Interpretation


The Klein-Gordon equation appears to have unacceptable negative energy states and negative
probabilities for those states if φ is interpreted as the single particle wave function. Many
years later Feynman and Stueckelberg came to the rescue and proposed a way forwards to
make sense of the equation. It is linked to Pauli's idea that one does not directly measure the number of particles. You can only detect them via their charges through an interaction. This means you can't observe the probability density but only the charge density/current ($qJ^\mu$) and that can be negative!
The Klein Gordon equation has a time reversal symmetry so in addition to states propagating forwards in time that look like $e^{-iEt}$ there are solutions that travel backwards in time like $e^{+iEt}$. Normally we would throw away these backwards propagating solutions for causality's sake (you don't want to be able to kill your grandfather!). However, if E can be negative these two sets of states become confused. Does $e^{-i(-E)t}$ propagate forwards in time with negative energy or backwards in time with positive energy?
Feynman and Stueckelberg proposed that it is possible to consistently keep just half of the solutions to the Klein Gordon equation, but not the ones you would immediately guess. They suggested keeping positive energy states propagating forwards in time, but only negative energy states that propagate backwards in time! We interpret these states as positive energy states moving forwards in time ($e^{+i(-E)t}$). In these solutions the charge density/current has the opposite sign though. These particles look like negative charge versions of the normal particle states propagating forwards in time. This is a prediction of anti-particles!
Now we find a theory that is consistent with the requirements of causality and that has none of the aforementioned problems. In fact, the negative energy states cause us problems only so long as we think of them as real physical states propagating forwards in time. Therefore, we should interpret the emission (absorption) of a negative energy particle with momentum $p^\mu$ as the absorption (emission) of a positive energy antiparticle with momentum $-p^\mu$.
In order to get more familiar with this picture, consider a process with a $\pi^+$ and a photon in the initial state and final state. In Fig 4.13a the $\pi^+$ starts from the point A and at a later time $t_1$ emits a photon at the point $\vec{x}_1$. If the energy of the $\pi^+$ is still positive, it travels on forwards in time and eventually will absorb the initial state photon at $t_2$ at the point $\vec{x}_2$. The final state is then again a photon and a (positive energy) $\pi^+$.
There is another process however, with the same initial and final state, shown in Fig 4.13b. Again, the $\pi^+$ starts from the point A and at a later time $t_2$ emits a photon at the point $\vec{x}_1$. But this time, the energy of the photon emitted is bigger than the energy of the initial $\pi^+$. Thus, the energy of the $\pi^+$ becomes negative and it is forced to travel backwards in time. Then at an earlier time $t_1$ it absorbs the initial state photon at the point $\vec{x}_2$, thereby rendering its energy positive again. From there, it travels forward in time and the final state is the same as in Fig 4.13a, namely a photon and a (positive energy) $\pi^+$.
In today's language, the process in Fig 4.13b would be described as follows: in the initial state we have a $\pi^+$ and a photon. At time $t_1$ and at the point $\vec{x}_2$ the photon creates a $\pi^+$-$\pi^-$ pair. Both propagate forwards in time. The $\pi^+$ ends up in the final state, whereas the $\pi^-$ is annihilated at (a later) time $t_2$ at the point $\vec{x}_1$ by the initial state $\pi^+$, thereby producing the final state photon. To someone observing in real time, the negative energy state moving backwards in time looks to all intents and purposes like a negatively charged pion with positive energy moving forwards in time.
We have discovered anti-matter! The Feynman Stueckelberg interpretation revives the Klein Gordon equation as a perfectly sensible theory of spinless particles and their anti-particles.

Figure 4.13: Pion-photon scatterings in which the intermediate pion (a) has positive energy and travels forwards in time and (b) has negative energy and travels backwards in time. [Two space-time diagrams, (a) and (b), each running in time from A to B, with vertices labelled $(t_1, \vec{x}_1)$ and $(t_2, \vec{x}_2)$.]
