1718 Theories Notes
March 6, 2018
Contents
1 Least Action
1.1 Optics
1.1.1 Snell’s Law
1.1.2 Complicated Problems
1.1.3 Light in Vacuum
1.1.4 Light in the Atmosphere
1.2 Newtonian Dynamics
1.2.1 Multiple Coordinates
1.2.2 Example: Projectile Motion
1.2.3 Example 2: Double Pendulum
1.3 Conservation Laws
1.3.1 Ignorable Coordinates
1.3.2 Energy Conservation
1.3.3 Example - Central Forces
1.3.4 Hamiltonian and Energy
1.4 Appendix - Calculus of Variation
1.5 Appendix - Mathematics of conservation laws
2 Special Relativity
2.1 The Postulates
2.2 Lorentz Transformations
2.2.1 Time Dilation
2.2.2 Lorentz Contraction
2.3 An Analogy to Rotations
2.4 Four Vectors
2.4.1 Index Convention
2.5 The Laws of Dynamics
2.5.1 Four-velocity
2.5.2 Four Acceleration
2.5.3 Four Momentum
2.5.4 Hypothesis for Dynamical Law
2.6 Physics with Four-Momentum
2.6.1 The Doppler Effect
2.6.2 The Compton Effect
2.6.3 Fixed Target Experiments
3 Relativistic Electromagnetism
3.1 Integral Form of Maxwell’s Equations
3.1.1 Gauss’ Law
3.1.2 No Magnetic Charges
3.1.3 Faraday’s Law
3.1.4 Ampere’s Law
3.2 Differential Form of Maxwell’s Equations
3.2.1 Maxwell’s Equations in Differential Form
3.2.2 Conservation of Charge
3.2.3 The Displacement Current
3.3 Potentials
3.3.1 Electrostatic Potential
3.3.2 The Magnetic Vector Potential
3.3.3 A New Electric Potential
3.3.4 Gauge Transformations
3.3.5 Maxwell’s Equations in Lorenz Gauge
3.4 Relativistic Formulation Of Electromagnetism
3.4.1 Four-vector Current
3.4.2 Conservation of Charge
3.4.3 The Four Vector ∂µ
3.4.4 Four Vector Potential
3.4.5 A Moving Point Charge
3.4.6 The Electromagnetic Field Strength Tensor
3.4.7 Lorentz Transformations of Electric and Magnetic Fields
3.4.8 The Relativistic Force Law
3.5 The Lagrangian For a Charged Particle
3.6 Appendix - Gauss’ and Stokes’ Theorems
3.6.1 Gauss’ Theorem
3.6.2 Stokes’ Theorem
3.7 Appendix - Vector Identities
4 Quantum Mechanics
4.1 Non-relativistic Quantum Mechanics
4.1.1 One Dimensional, Time Dependent Schroedinger Equation
4.1.2 Time Independent Schroedinger Equation
4.1.3 Interpretation
4.1.4 Proof that Probability Is Conserved
4.1.5 Momentum Space Wave Functions
4.1.6 Heisenberg Uncertainty Principle
4.1.7 Square Well Example
4.1.8 Completeness
4.1.9 Orthogonality
4.1.10 The 3D Schroedinger Equation
4.1.11 Wave Function Collapse and All That
4.2 Path Integral Approach to Quantum Mechanics
4.2.1 Proposal for the Quantum Mechanical Amplitude
4.2.2 The Classical Limit
4.2.3 Wave Functions
4.2.4 Deriving the Schroedinger Equation
4.2.5 Path Integral for a Free Particle
4.2.6 Interpreting the Free Particle Kernel
4.2.7 Barrier Problems
4.2.8 The Kernel in Terms of Wave Functions
4.2.9 Appendix - Gaussian Integrals
4.3 Relativistic Quantum Mechanics - The Klein Gordon Equation
4.3.1 Problems in the Klein Gordon Equation
4.3.2 Feynman Stueckelberg Interpretation
Chapter 1
Chapter 1
Least Action
Newton, through his three laws of dynamics, developed an extremely successful description
of the motion of objects. These laws can, for example, describe the elliptical orbits of planets
to remarkable precision. There is, though, an alternative presentation of these successes, the
Principle of Least Action, which we will explore here. It is a formalism that grew out of
optics and will allow us to study an area of mathematics called “calculus of variation”. Of
course, it must turn out to be equivalent to Newton’s laws. This alternative formalism makes
some dynamics problems easier to solve but, more importantly, it will give us new insights
into conservation laws. It is important to master these methods since, as one moves to the
forefront of modern quantum theories, the Least Action Principle becomes the only way to
define theories such as that of the strong nuclear force.
1.1 Optics
Our starting point will be to think about the path that light travels along. In these enlightened
times we might start from Maxwell’s equations and derive a wave equation, with light waves
as solutions, to determine how the light propagates. Before this technology was available,
though, Fermat proposed
Fermat’s Principle of Least Time: Light propagates between two points so as to minimize
its travel time.
Thus, for example, in a uniform medium where the speed of light c is a constant, the
minimum time of travel
$$ t = \frac{d}{c} \tag{1.1} $$
is given by the path of shortest distance d, i.e. a straight line. This is still a perfectly good (if
limited) description of light.
We can obtain more interesting results by thinking about media where the speed of light
changes.
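[Figure 1.1: Possible paths that light might follow transiting across the interface between two materials. The light travels from (x1, y1) in material 1 (speed v1) to (x2, y2) in material 2 (speed v2), crossing the interface at (x, 0); d1 and d2 are the path lengths in each material and θ1, θ2 are the angles the two segments make with the normal to the interface.]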
In any one medium light travels in a straight line, but in this case we have some choice
in where the light crosses between the media. Let’s consider the arbitrary crossing point
(x, y = 0). The time of travel is
$$ T[x] = \frac{d_1}{v_1} + \frac{d_2}{v_2} = \frac{\sqrt{(x-x_1)^2+y_1^2}}{v_1} + \frac{\sqrt{(x-x_2)^2+y_2^2}}{v_2} \tag{1.2} $$
We now want to find the path (ie the value of x through which it passes) which minimizes
the time taken. Thus
$$ \frac{dT}{dx} = \frac{(x-x_1)}{v_1\sqrt{(x-x_1)^2+y_1^2}} - \frac{(x_2-x)}{v_2\sqrt{(x_2-x)^2+y_2^2}} = 0 \tag{1.3} $$
This equation though is just
$$ \frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2} \tag{1.4} $$
which is Snell’s law.
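As a quick numerical check of this result, the sketch below minimizes the travel time T[x] of (1.2) directly and compares sin θ/v on the two sides of the interface. The end points, the speeds and the golden-section search are illustrative choices of ours, not part of the notes.

```python
import math

def travel_time(x, p1, p2, v1, v2):
    """Travel time T[x] of Eq. (1.2) for a ray crossing the interface y = 0 at (x, 0)."""
    x1, y1 = p1
    x2, y2 = p2
    return math.hypot(x - x1, y1) / v1 + math.hypot(x - x2, y2) / v2

def minimise(f, lo, hi, tol=1e-9):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if f(c) < f(d):
            b = d   # the minimum lies in [a, d]
        else:
            a = c   # the minimum lies in [c, b]
    return 0.5 * (a + b)

# Hypothetical end points and speeds, chosen only for illustration.
p1, p2 = (0.0, 1.0), (2.0, -1.0)
v1, v2 = 1.0, 0.75

x_min = minimise(lambda x: travel_time(x, p1, p2, v1, v2), p1[0], p2[0])

# Sines of the angles to the normal, read off from the two terms of Eq. (1.3).
sin1 = (x_min - p1[0]) / math.hypot(x_min - p1[0], p1[1])
sin2 = (p2[0] - x_min) / math.hypot(p2[0] - x_min, p2[1])
print(sin1 / v1, sin2 / v2)  # the two numbers should agree, as in Eq. (1.4)
```

The search works here because T[x] is a sum of two convex functions of x and so has a single minimum between the end points.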
In terms of the index of refraction, which is defined relative to the vacuum as
$$ n_1 = \frac{c}{v_1} \tag{1.5} $$
Snell’s law takes the familiar form n1 sin θ1 = n2 sin θ2.
[Figure 1.2: Possible paths that light might travel in a plane with varying speed of light v(x, y). The paths run between x = xa and x = xb.]
Different paths are described by different functions y(x). The time to travel along an
arbitrary little piece of path is
$$ dT = \frac{\text{distance}}{\text{velocity}} = \frac{\sqrt{dx^2+dy^2}}{v(x,y)} \tag{1.7} $$
Summing such contributions up along a path gives the total time of travel
$$ T[y(x)] = \int_{x_a}^{x_b} \frac{1}{v(x,y)} \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\; dx \tag{1.8} $$
Written in a more standard form, the time taken to traverse a path is
$$ T[y] = \int_{x_a}^{x_b} L(y, \dot y, x)\, dx \tag{1.9} $$
where
$$ L(y, \dot y, x) = \frac{1}{v(x,y)} \sqrt{1 + \dot y^2} \tag{1.10} $$
and, comparing with (1.8), ẏ stands for dy/dx. Now we want to find the path y(x) that gives
the minimum time.
where the dot indicates a derivative with respect to t. Give expressions for
$$ \frac{\partial T}{\partial t}, \quad \frac{\partial T}{\partial b}, \quad \frac{\partial T}{\partial \dot b}, \quad \frac{dT}{dt} $$
This is the sort of problem that Calculus of Variation is designed to address. As we show
in Appendix 1.4, the problem of finding the path that minimizes the time is equivalent to
solving a differential equation called the Euler Lagrange equation,
$$ \frac{d}{dx}\frac{\partial L}{\partial \dot y} - \frac{\partial L}{\partial y} = 0 \tag{1.11} $$
ẏ = constant, m (1.14)
or integrating
y = mx + c (1.15)
ie a straight line. This is our first example of the solution of the Euler Lagrange equation
giving the path that minimizes T . m and c are determined by the initial and final position of
the light.
We can use the fact that L is independent of x to simplify the Euler Lagrange equation as
follows. Note that
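$$ \frac{dL}{dx} = \frac{\partial L}{\partial x} + \dot h \frac{\partial L}{\partial h} + \ddot h \frac{\partial L}{\partial \dot h} \tag{1.18} $$
The first term on the right is zero. Now replace ∂L/∂h using the Euler Lagrange equation
$$ \frac{\partial L}{\partial h} = \frac{d}{dx}\frac{\partial L}{\partial \dot h} \tag{1.19} $$
and we find
$$ \frac{dL}{dx} = \dot h \frac{d}{dx}\frac{\partial L}{\partial \dot h} + \ddot h \frac{\partial L}{\partial \dot h} \tag{1.20} $$
which is just
$$ \frac{d}{dx}\left( L - \dot h \frac{\partial L}{\partial \dot h} \right) = 0 \tag{1.21} $$
which gives us
$$ L - \dot h \frac{\partial L}{\partial \dot h} = \text{constant},\ D \tag{1.22} $$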
Note that this is only a first order equation rather than the second order Euler Lagrange
equation so is simpler to solve.
In our problem, using the explicit form for L above we have
$$ n\sqrt{1+\dot h^2} - \frac{\dot h^2\, n}{\sqrt{1+\dot h^2}} = D \tag{1.23} $$
which simplifies to
$$ \frac{n}{\sqrt{1+\dot h^2}} = D \tag{1.24} $$
Note that the physical meaning of D is the value of the index of refraction at the point where
the light ray becomes horizontal so that ḣ = 0.
Squaring and rearranging we find
$$ \frac{dh}{dx} = \sqrt{\frac{n^2}{D^2} - 1} \tag{1.25} $$
Thus
$$ x - x_0 = \int_{h_0}^{h} \frac{dh}{\sqrt{\frac{n^2}{D^2} - 1}} \tag{1.26} $$
Explicit Example: Consider a ray of light that begins moving horizontally (ḣ = 0) at h = 0
in an atmosphere where
n(h) = n0 − λ h (1.27)
where λ is some constant. We must solve the integral
$$ x = \int \frac{dh}{\sqrt{\frac{(n_0-\lambda h)^2}{D^2} - 1}} \tag{1.28} $$
This can be done by changing variables to
n0 − λ h = D cosh φ (1.29)
The integral becomes
$$ x = -\frac{D}{\lambda}\int d\phi = -\frac{D}{\lambda}\,\phi + c \tag{1.30} $$
Returning to the original coordinates and requiring the boundary conditions ḣ(x = 0) = 0
and h(x = 0) = 0 gives the result
$$ h = \frac{n_0}{\lambda}\left(1 - \cosh\frac{\lambda x}{n_0}\right) \tag{1.31} $$
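One way to gain confidence in (1.31) is to check numerically that it satisfies the first integral (1.24) with D = n0, the value of the index where the ray is horizontal. The sketch below does this with a finite-difference slope; the values of n0 and λ are illustrative assumptions of ours and are not meant to be realistic for the atmosphere.

```python
import math

# Hypothetical parameters for n(h) = n0 - lam*h of Eq. (1.27).
n0, lam = 1.0, 0.1

def h(x):
    """Ray height from Eq. (1.31)."""
    return (n0 / lam) * (1.0 - math.cosh(lam * x / n0))

def n(height):
    """Index of refraction profile of Eq. (1.27)."""
    return n0 - lam * height

def slope(x, dx=1e-5):
    """Central finite-difference estimate of dh/dx."""
    return (h(x + dx) - h(x - dx)) / (2.0 * dx)

def first_integral(x):
    """Left-hand side of Eq. (1.24) evaluated along the ray."""
    return n(h(x)) / math.sqrt(1.0 + slope(x) ** 2)

values = [first_integral(x) for x in (0.0, 1.0, 2.0, 5.0)]
print(values)  # each entry should be close to D = n0
```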
• When λ is positive n(h) decreases with altitude - this is what normally happens in the
atmosphere. Plotting the form of the solution we find
[Figure 1.3: The solution for the light path h(x) when n(h) decreases with height. The sketch shows the curved ray through (0,0) and the straight apparent light path to the observer.]
Thus if we look up at the Empire State building it will appear taller than it actually is.
• If there is a temperature inversion then λ is negative so n(h) increases with altitude.
Plotting the form of the solution we find
Exercise 1.2:
(a) Consider a fibre optic cable lying in the z direction. The cable is made of glass with index
of refraction n(r), where r is the radial distance from the centre of the cable. Working in
cylindrical coordinates (r, θ , z) show that Fermat’s Principle implies light travels on the path
minimizing the quantity
$$ \int_{z_1}^{z_2} f\big(r(z), \theta(z), r'(z), \theta'(z)\big)\, dz = \int_{z_1}^{z_2} n(r)\sqrt{r'^2 + r^2\theta'^2 + 1}\; dz. $$
where a prime indicates differentiation with respect to z. z1 and z2 are the z-coordinates of
the end points of the path.
(b) If a light ray initially has θ ′ = 0 show, from the appropriate Euler Lagrange equation,
that the θ independence of f implies the path followed by the light is described by a constant
value of θ .
(c) Use the z independence of f to deduce that the first order differential equation for rays
travelling paths with constant θ is
$$ f - \frac{\partial f}{\partial r'}\, r' = \text{constant}. $$
Hamilton’s Principle: A particle travels by the path between two points that minimizes the
Action.
We need to know what the “action” is. Let’s write it first for one dimensional motion.
The action is
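$$ S[\text{path}] = \int_{t_a}^{t_b} L(x, \dot x, t)\, dt \tag{1.32} $$
where the dot indicates differentiation with respect to the time, t. L is known as the
Lagrangian and is given by the difference of the kinetic energy T and the potential energy V,
L = T − V.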
From Appendix 1.4, we know that the path that minimizes the action satisfies the Euler
Lagrange equation, analogous to the case of optics Eq.1.11,
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot x} - \frac{\partial L}{\partial x} = 0 \tag{1.34} $$
We can now check to see if any of this makes sense (!). For a non-relativistic particle in
a one dimensional potential we have
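$$ L = T - V = \frac{1}{2} m \dot x^2 - V(x) \tag{1.35} $$
The Euler Lagrange equation is therefore
$$ \frac{d}{dt}(m\dot x) + \frac{\partial V}{\partial x} = 0 \tag{1.36} $$
which is Newton’s second law since
$$ F = -\frac{\partial V}{\partial x} \tag{1.37} $$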
$$ p = m\dot x = \frac{\partial L}{\partial \dot x} \tag{1.38} $$
$$ q_i, \qquad i = 1 \ldots n \tag{1.39} $$
For example, for one particle moving in three dimensions we might call x = q1, y = q2, z = q3.
As discussed in Appendix 1.4, Eq.1.85, for the n dimensional case we have to solve a set
of n Euler Lagrange equations - one associated with each coordinate,
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0 \tag{1.40} $$
i.e. we need to write down n copies of the Euler Lagrange equation, for i = 1, 2, . . .n and
try to solve them simultaneously.
Generalized Coordinates
The reason that we have written the coordinates so generally as qi rather than for example
using x, y, z is that in some problems these are not the appropriate coordinates because of a
constraint. A simple example to illustrate this is a ball on a wire hoop
[Figure 1.5: The coordinates describing a ball constrained to run around a hoop. The x and y axes and the angle θ are marked.]
The hoop stops the ball moving in the radial direction so the ball cannot be at any arbitrary
(x, y). The sensible coordinate to use is the angle θ .
The coordinates in such a reduced set are called generalized coordinates.
Generalized Momentum
A generalization of the idea of momentum can be defined in the spirit of (1.38). The
generalized momentum associated with a generalized coordinate is given by
$$ p_i = \frac{\partial L}{\partial \dot q_i} \tag{1.41} $$
[Figure 1.6: The motion of a projectile in the x, y plane subject to constant gravity in the vertical direction y.]
We can obtain the normal Newtonian equations of motion from the Euler Lagrange equations.
We need expressions for the kinetic and potential energy of the system so we can build
the Lagrangian. The kinetic energy is just
$$ T = \frac{1}{2} m \dot x^2 + \frac{1}{2} m \dot y^2 \tag{1.42} $$
and the potential energy
$$ V = mgy \tag{1.43} $$
So the Lagrangian is just
$$ L = T - V = \frac{1}{2} m \dot x^2 + \frac{1}{2} m \dot y^2 - mgy \tag{1.44} $$
Now we find the two Euler Lagrange equations. The first, associated with the x coordinate,
is
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot x} - \frac{\partial L}{\partial x} = 0 \tag{1.45} $$
which gives
$$ m\ddot x = 0 \tag{1.46} $$
The second equation, associated with the y coordinate, is
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot y} - \frac{\partial L}{\partial y} = 0 \tag{1.47} $$
which gives
$$ \ddot y = -g \tag{1.48} $$
The two boxed equations are the standard Newtonian equations of motion.
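To see Hamilton's Principle at work in this example, the following sketch evaluates a discretized version of the action for the Newtonian trajectory and for nearby paths with the same end points. The launch velocity, the flight time, the sinusoidal form of the perturbation and the midpoint discretization are all assumptions made for illustration.

```python
import math

m, g = 1.0, 9.81          # mass and gravitational acceleration
u, w = 3.0, 5.0           # hypothetical initial velocity components
t_end, N = 1.0, 1000      # hypothetical flight time and number of time steps

def path(eps):
    """Newtonian trajectory with eps*sin(pi t / t_end) added to y (end points stay fixed)."""
    points = []
    for k in range(N + 1):
        t = t_end * k / N
        x = u * t
        y = w * t - 0.5 * g * t * t + eps * math.sin(math.pi * t / t_end)
        points.append((x, y))
    return points

def action(points):
    """Discretized action: sum of (T - V) dt, with L taken from Eq. (1.44)."""
    dt = t_end / N
    total = 0.0
    for (xa, ya), (xb, yb) in zip(points, points[1:]):
        vx, vy = (xb - xa) / dt, (yb - ya) / dt
        kinetic = 0.5 * m * (vx * vx + vy * vy)
        potential = m * g * 0.5 * (ya + yb)   # V evaluated at the segment midpoint
        total += (kinetic - potential) * dt
    return total

S_true = action(path(0.0))
for eps in (-0.2, -0.05, 0.05, 0.2):
    print(eps, action(path(eps)) - S_true)   # every difference is positive
```

The differences grow like the square of eps, as expected when the first-order change in the action vanishes on the true path.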
Hopefully you’re starting to see the power of this technique now - the kinetic and potential
energies of a system are fairly easy to work out and then we just do some maths. There’s
none of that resolving forces business! The next problem is an example that would be very
hard by the standard methodology.
It would be pretty hard work to determine all the forces in play here. However, the
Lagrangian technique means we only have to calculate the energies of the two masses to get
to the equations of motion.
The first mass has a velocity $\vec{v}_1$ with magnitude l1 θ̇1 (v = ωr). The second mass has
both this motion plus a second contribution from the swing of the second pendulum, $\vec{v}_2$, with
magnitude l2 θ̇2. The total velocity of the second mass is therefore
L = T −V (1.53)
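There are two Euler Lagrange equations - one associated with θ1
$$ \frac{d}{dt}\left[ m_1 l_1^2 \dot\theta_1 + m_2 l_1^2 \dot\theta_1 + m_2 l_1 l_2 \dot\theta_2 \cos(\theta_2-\theta_1) \right] - m_2 l_1 l_2 \dot\theta_1 \dot\theta_2 \sin(\theta_2-\theta_1) + (m_1+m_2) g l_1 \sin\theta_1 = 0 \tag{1.54} $$
and one associated with θ2
$$ \frac{d}{dt}\left[ m_2 l_2^2 \dot\theta_2 + m_2 l_1 l_2 \dot\theta_1 \cos(\theta_2-\theta_1) \right] + m_2 l_1 l_2 \dot\theta_1 \dot\theta_2 \sin(\theta_2-\theta_1) + m_2 g l_2 \sin\theta_2 = 0 \tag{1.55} $$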
These are pretty messy (but that was the point!). Things simplify a bit if we assume that
both θ1 and θ2 are small and expand to linear order. We then get
$$ \ddot\theta_1 = -\omega^2\theta_1, \qquad \ddot\theta_2 = -\omega^2\theta_2 \tag{1.57} $$
Exercise 1.4: If a system with generalized coordinates ξ and ψ has the Lagrangian
$$ L = \frac{1}{2}\dot\xi^2 + \cos\xi\,\dot\psi - \xi e^{\psi} $$
what are the Euler Lagrange equations describing the system?
Exercise 1.5: Two blocks of equal mass M are connected by a flexible string of length ℓ.
One block is placed on a smooth horizontal table and the other block hangs over the edge.
Using the length z of string hanging over the edge as a generalized coordinate, write down
the Lagrangian and use the Euler–Lagrange equation to find the acceleration of the hanging
mass in the following cases:
(ii) the string is heavy with mass m distributed uniformly along it.
Exercise 1.6:
(a) Show that for a non-relativistic, free particle of mass m travelling with constant velocity
v the action S describing its motion reduces to
S = mvd/2
where d is the distance travelled. This was a form for the action proposed by Maupertuis
who believed it reflected the simplicity and economy of the Creator-God....
(b) Consider such a particle rolling on a table in the x, y plane with speed v1 . Along the
y-axis there is a height discontinuity in the table which the particle can move over at the cost
of potential energy which reduces its velocity to v2 . If the particle starts at (x1 , y1 ) to the left
of the y axis and ends to the right at (x2 , y2 ) show that the action for it passing across the
y-axis at arbitrary y (assuming it travels in a straight line except when it crosses the y axis)
is given by
$$ S = \frac{m v_1}{2}\sqrt{x_1^2 + (y-y_1)^2} + \frac{m v_2}{2}\sqrt{x_2^2 + (y-y_2)^2} $$
v1 sin θ1 = v2 sin θ2
where the angles are the angles between the particle’s direction of motion and the x axis
before and after it crosses the y axis. Contrast this result with Snell’s Law for light.
This is clearly a mathematical fact but there is a deeper interpretation. If L only depends
on q̇i not qi itself then we can shift
qi → qi + const (1.60)
and leave the Lagrangian, L, (and hence the physics) invariant. This is a symmetry -
translation invariance in the qi direction.
Thus we learn that the true relation is
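$$ H = \sum_i \frac{\partial L}{\partial \dot q_i}\, \dot q_i - L \tag{1.61} $$
To prove that it is conserved we explicitly calculate
$$ \frac{dH}{dt} = \sum_i \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) \dot q_i + \sum_i \frac{\partial L}{\partial \dot q_i}\, \ddot q_i - \frac{\partial L}{\partial t} - \sum_i \frac{\partial L}{\partial q_i}\, \dot q_i - \sum_i \frac{\partial L}{\partial \dot q_i}\, \ddot q_i = 0 \tag{1.62} $$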
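$$ T = \frac{1}{2} m (\dot x^2 + \dot y^2) = \frac{1}{2} m \dot r^2 + \frac{1}{2} m r^2 \dot\theta^2 \tag{1.65} $$
thus
$$ L = \frac{1}{2} m \dot r^2 + \frac{1}{2} m r^2 \dot\theta^2 - V(r) \tag{1.66} $$
There is an Euler Lagrange equation associated with the r coordinate
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot r} - \frac{\partial L}{\partial r} = 0 \tag{1.67} $$
giving
$$ m\ddot r = m r \dot\theta^2 - \frac{\partial V}{\partial r} \tag{1.68} $$
plus a second equation for θ which, since L is independent of θ, is just
$$ \frac{d}{dt}\left(m r^2 \dot\theta\right) = 0 \tag{1.69} $$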
which tells us that angular momentum is conserved.
The Hamiltonian is also conserved and is given here by
$$ H = \frac{1}{2} m \dot r^2 + \frac{1}{2} m r^2 \dot\theta^2 + V \tag{1.70} $$
which is the total energy.
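Both conservation laws can be checked numerically. The sketch below integrates (1.68) and (1.69) with a fourth-order Runge-Kutta step and monitors the angular momentum and the Hamiltonian (1.70). The potential V(r) = −k/r, the parameter values and the initial conditions are hypothetical choices made for illustration.

```python
import math

m, k = 1.0, 1.0   # hypothetical mass and strength of the assumed potential V(r) = -k/r

def deriv(state):
    """Time derivatives of (r, r_dot, theta, theta_dot) from Eqs. (1.68) and (1.69)."""
    r, r_dot, theta, theta_dot = state
    r_ddot = r * theta_dot ** 2 - k / (m * r * r)   # Eq. (1.68) with dV/dr = k/r^2
    theta_ddot = -2.0 * r_dot * theta_dot / r       # Eq. (1.69) expanded
    return (r_dot, r_ddot, theta_dot, theta_ddot)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, d, factor):
        return tuple(si + factor * di for si, di in zip(s, d))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def angular_momentum(state):
    r, _, _, theta_dot = state
    return m * r * r * theta_dot

def hamiltonian(state):
    """Eq. (1.70) with V = -k/r."""
    r, r_dot, _, theta_dot = state
    return 0.5 * m * r_dot ** 2 + 0.5 * m * r * r * theta_dot ** 2 - k / r

state = (1.0, 0.0, 0.0, 1.2)   # r, r_dot, theta, theta_dot at t = 0
L0, H0 = angular_momentum(state), hamiltonian(state)
for _ in range(10000):
    state = rk4_step(state, 1e-3)
print(angular_momentum(state) - L0, hamiltonian(state) - H0)  # both differences are tiny
```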
Exercise 1.7: If a system with generalized coordinates x and y has the Action
$$ S = \int \left( \frac{1}{2}\dot x^2 + \frac{1}{2}\dot y^2 + \cos\dot y - x \right) dt $$
To be explicit, the hoop is in a vertical plane near the surface of the Earth, where that
vertical plane is subject to a steady rotation about a fixed axis passing through the centre of
the hoop, due to an external turning force or torque. The fact that the energy is not conserved
is due to the fact that the turning force which is required to maintain the steady rate of rotation
is external to the system. However we shall show that, even in this case, the Hamiltonian is
conserved, even though the Hamiltonian cannot be identified with the energy.
The single coordinate θ (which is a function of time t) as shown in the diagram is sufficient
to describe the position of the bead so this is a good generalized coordinate. The kinetic
energy is given by
$$ T = \frac{1}{2} m \left(a^2\dot\theta^2 + a^2\sin^2\theta\, \omega^2\right) \tag{1.71} $$
and the potential energy by V = −mga cos θ, so that the Lagrangian is
$$ L = \frac{1}{2} m \left(a^2\dot\theta^2 + a^2\sin^2\theta\, \omega^2\right) + mga\cos\theta \tag{1.73} $$
Since L does not depend on t the Hamiltonian is conserved. In particular
$$ H = \frac{1}{2} m a^2\dot\theta^2 - \frac{1}{2} m a^2\sin^2\theta\, \omega^2 - mga\cos\theta \tag{1.74} $$
Although H is conserved the total energy of the system is not since to keep the hoop rotating
a constant external torque must be applied, thereby doing work on the system.
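The distinction between H and the energy can be illustrated numerically. The sketch below integrates the Euler Lagrange equation that follows from (1.73), θ̈ = ω² sin θ cos θ − (g/a) sin θ, and tracks both the Hamiltonian (1.74) and the energy T + V, taking V = −mga cos θ as read off from (1.73). The parameter values and initial condition are hypothetical.

```python
import math

m, a, g, omega = 1.0, 1.0, 9.81, 2.0   # hypothetical bead mass, hoop radius, gravity, rotation rate

def theta_ddot(theta):
    """Euler Lagrange equation derived from the Lagrangian of Eq. (1.73)."""
    return omega ** 2 * math.sin(theta) * math.cos(theta) - (g / a) * math.sin(theta)

def rk4_step(theta, theta_dot, dt):
    """One classical fourth-order Runge-Kutta step for (theta, theta_dot)."""
    k1 = (theta_dot, theta_ddot(theta))
    k2 = (theta_dot + 0.5 * dt * k1[1], theta_ddot(theta + 0.5 * dt * k1[0]))
    k3 = (theta_dot + 0.5 * dt * k2[1], theta_ddot(theta + 0.5 * dt * k2[0]))
    k4 = (theta_dot + dt * k3[1], theta_ddot(theta + dt * k3[0]))
    new_theta = theta + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    new_theta_dot = theta_dot + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return new_theta, new_theta_dot

def hamiltonian(theta, theta_dot):
    """Eq. (1.74)."""
    return (0.5 * m * a * a * theta_dot ** 2
            - 0.5 * m * a * a * math.sin(theta) ** 2 * omega ** 2
            - m * g * a * math.cos(theta))

def energy(theta, theta_dot):
    """T + V with T from Eq. (1.71) and V = -m g a cos(theta)."""
    kinetic = 0.5 * m * (a * a * theta_dot ** 2 + a * a * math.sin(theta) ** 2 * omega ** 2)
    return kinetic - m * g * a * math.cos(theta)

theta, theta_dot = 1.0, 0.0
H0, E0 = hamiltonian(theta, theta_dot), energy(theta, theta_dot)
max_dH = max_dE = 0.0
for _ in range(3000):
    theta, theta_dot = rk4_step(theta, theta_dot, 1e-3)
    max_dH = max(max_dH, abs(hamiltonian(theta, theta_dot) - H0))
    max_dE = max(max_dE, abs(energy(theta, theta_dot) - E0))
print(max_dH, max_dE)   # H stays constant to integration accuracy, T + V does not
```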
1.4 Appendix - Calculus of Variation
[Figure 1.10: An arbitrary path in the q − s plane between fixed end points (s1, q1) and (s2, q2).]
Imagine we are interested in one curve that minimizes the quantity
$$ S[q(s)] = \int_{s_1}^{s_2} L(q, \dot q, s)\, ds \tag{1.75} $$
L is just a number at each point on a given curve determined by the values of q and s at that
point and the gradient q̇ = dq/ds. The integral sums these numbers along the line.
If the curve that minimizes S is q̄(s) we can write the other curves as deviations from it
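$$ S[\bar q + \delta q] \simeq \int_{s_1}^{s_2} \left[ L(\bar q, \dot{\bar q}, s) + \delta\dot q\, \frac{\partial L}{\partial \dot q} + \delta q\, \frac{\partial L}{\partial q} + \ldots \right] ds \simeq S[\bar q] + \int_{s_1}^{s_2} \left[ \delta\dot q\, \frac{\partial L}{\partial \dot q} + \delta q\, \frac{\partial L}{\partial q} \right] ds + O(\delta q^2) \tag{1.79} $$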
1 For example, in the case of light in two dimensions we identify q(s) → y(x), while for a particle in one
dimension we identify q(s) → x(t), or for a simple pendulum we identify q(s) → θ(t).
$$ \int_{s_1}^{s_2} \delta\dot q\, \frac{\partial L}{\partial \dot q}\, ds = \left[ \delta q\, \frac{\partial L}{\partial \dot q} \right]_{s_1}^{s_2} - \int_{s_1}^{s_2} \delta q\, \frac{d}{ds}\frac{\partial L}{\partial \dot q}\, ds \tag{1.80} $$
The first term vanishes since δ q vanishes at the ends of the path.
Thus
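$$ S[\bar q + \delta q] - S[\bar q] = -\int_{s_1}^{s_2} \delta q \left[ \frac{d}{ds}\frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} \right] ds + \ldots \tag{1.81} $$
This is only zero (at order δq) if
$$ \frac{d}{ds}\frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} = 0 \tag{1.82} $$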
In general, we will want to solve problems in more than one dimension. For example,
there may be several such generalised coordinates, qi corresponding to three dimensions,
x, y, z or multiple angles θi . The above formalism is easily adapted for such cases. The
definition of the Action above in terms of the Lagrangian (L = T − V ) remains the same,
however we now have several coordinates
$$ q_i, \qquad i = 1 \ldots n \tag{1.83} $$
In the derivation above of the Euler Lagrange equation, it is straightforward to take into
account deviations in the path in all of these coordinates. We would find that the change in
the action of a path close to the minimizing path would have the form
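$$ \Delta S = -\int_{s_1}^{s_2} \sum_i \delta q_i \left[ \frac{d}{ds}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} \right] ds \tag{1.84} $$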
At the minimum the coefficients of each δ qi must vanish independently so we get a set
of Euler Lagrange equations - one associated with each coordinate
$$ \frac{d}{ds}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0 \tag{1.85} $$
Exercise 1.8: Work through the above derivation in the case where L depends on two
coordinates q and p. What two equations must then be satisfied by the minimizing curve?
1.5 Appendix - Mathematics of conservation laws
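$$ \frac{dL}{ds} = \frac{\partial L}{\partial q}\, \dot q + \frac{\partial L}{\partial \dot q}\, \ddot q \tag{1.88} $$
using the Euler Lagrange equation gives
$$ \frac{dL}{ds} = \dot q\, \frac{d}{ds}\frac{\partial L}{\partial \dot q} + \ddot q\, \frac{\partial L}{\partial \dot q} \tag{1.89} $$
which is just
$$ \frac{d}{ds}\left( L - \dot q\, \frac{\partial L}{\partial \dot q} \right) = 0 \tag{1.90} $$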
so that
Exercise 1.9: This is an exercise in using calculus of variation outside of optics or dynamics.
A smooth curved wire connects the origin to the lower point (x1 , y1 ). A bead on the wire
slides without friction from rest at the upper to the lower point under the influence of gravity.
Its mechanical energy is conserved as it moves along the wire. Choose down to be the
positive y direction.
(a) Show that the time, T, required for the bead’s journey is
$$ T = \frac{1}{\sqrt{2g}} \int_0^{x_1} \sqrt{\frac{1+y'^2}{y}}\; dx. $$
(b) Given that the integrand of the above integral is independent of x show that the curve y(x)
making T stationary satisfies the differential equation
$$ \frac{dy}{dx} = \sqrt{\frac{b-y}{y}} $$
where b is a constant.
(c) Change the dependent variable from y to φ where y = b sin²(φ/2) and show that the above
can be integrated to give the brachistochrone
$$ x = \frac{b}{2}\left(\phi - \sin\phi\right) $$
Chapter 2
Special Relativity
Light travels at the very high speed c ≃ 3 × 10⁸ m s⁻¹. In the late 1800s and early 1900s
physicists realized that the familiar Newtonian laws of motion break down when particles
travel near this speed, which turns out to be a maximum speed in our Universe. Einstein
reconciled these discoveries in his Special Theory of Relativity, which he wrote down in
1905. Originally these ideas emerged in Maxwell’s theory of electromagnetism but it is
now standard to present the laws of dynamics first, then move to the more complicated case
of electromagnetism. This is the ordering we will take in this and the next chapter. The
Special Theory of Relativity deals with observations of dynamics by an observer moving at
a constant speed. Here we will learn how to write the laws of dynamics in a form consistent
with Special Relativity’s postulates. These laws are needed to explain essentially any event in
a particle accelerator and many observations in astronomy, but they are also crucial to our
everyday lives. For example, the GPS satellite system that our mobile phones continually use
is very sensitive to relativistic corrections from the satellites’ motions.
• The speed of light, c, is the same when measured in any inertial frame. This was the
crucial result from the Michelson-Morley experiment.
• The laws of physics are the same for an observer in any inertial frame. This is the
statement that there is no observer (for example stationary relative to some “aether”)
for whom the laws are especially simple.
An observer moving at constant speed is said to be in an inertial frame that can be thought
of as a combination of
• A rigid, stationary (relative to the observer) lattice grid by which position coordinates
are specified
With these observational tools the observer can specify any event by the set of coordinates
Note that moving from one inertial frame to another is often described as performing a
boost.
Let us call this inertial frame (stationary relative to the light source) frame S. The light
wave moves away from the source at speed c as a spherical shell described by
$$ x^2 + y^2 + z^2 = (ct)^2 \tag{2.2} $$
Now consider an inertial frame S′ moving with speed v in the positive x direction. For
convenience let’s set the origins of both sets of coordinates to coincide at time t = 0.
[Figure: the frames S and S′; S′ moves with speed v, so its origin is displaced by vt.]
The first postulate says that the observer in S′ sees light travel at speed c too. Thus in this
frame too the light forms a spherical shell centred on the origin in S′ described now by
$$ x'^2 + y'^2 + z'^2 = (ct')^2 \tag{2.3} $$
This is very surprising - you would have guessed that the observer moving relative to the
light source would not be in the centre of the spherical light shell.
The only way to reconcile the two viewpoints is if the two observers disagree on the
values of times and positions. The two equations for the position of the shell (2.2), (2.3) in
the two frames moving at relative speed v are reconciled by the Lorentz transformations
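$$ t' = \gamma\left(t - \frac{v}{c^2}\,x\right), \qquad x' = \gamma\,(x - vt), \qquad y' = y, \qquad z' = z \tag{2.4} $$
where
$$ \gamma = \sqrt{\frac{1}{1 - \frac{v^2}{c^2}}} \tag{2.5} $$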
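The transformations (2.4) are easy to experiment with numerically. The sketch below applies a boost along x to an arbitrary event and checks that the combination (ct)² − x² − y² − z², which vanishes on the light shell of (2.2) and (2.3), takes the same value in both frames. The function names, the boost speed and the sample event are hypothetical choices of ours.

```python
import math

C = 299_792_458.0   # speed of light in m/s

def gamma(v):
    """Lorentz factor of Eq. (2.5)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def boost_x(event, v):
    """Apply the Lorentz transformation of Eq. (2.4) to an event (t, x, y, z)."""
    t, x, y, z = event
    g = gamma(v)
    return (g * (t - v * x / C ** 2), g * (x - v * t), y, z)

def interval(event):
    """(ct)^2 - x^2 - y^2 - z^2, which is zero for events on the light shell."""
    t, x, y, z = event
    return (C * t) ** 2 - x ** 2 - y ** 2 - z ** 2

v = 0.6 * C
event = (1.0e-6, 150.0, 20.0, -5.0)   # an arbitrary event, in seconds and metres
boosted = boost_x(event, v)
print(interval(event), interval(boosted))   # the two values agree up to rounding
```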
Exercise 2.1: Explicitly check that substituting the Lorentz transformations into (2.3) one
obtains (2.2).
Exercise 2.2: How would the Lorentz transformations differ if the boost was in the z direc-
tion rather than the x direction?
An immediate check we should make on these transformations is that they make sense in
the slow moving world we live in. When v ≪ c, γ ≃ 1 and
t ′ ≃ t, x′ ≃ x − vt (2.6)
(t = 0, x = 10) (2.7)
Then an observer moving in the x direction at speed v will record the event as occurring at a
time
$$ t' = -\gamma\, \frac{v}{c^2}\,(10) \tag{2.8} $$
ie earlier than when the two observers passed each other (t = t ′ = 0). The implications of
this are that the observers do not agree on measurements of periods and lengths.
(x = 0, t = 0) (x = 0, t = 1) (2.9)
The moving observer in the S′ frame sees the events as
$$ (x' = 0,\ t' = 0) \qquad (x' = -\gamma v t,\ t' = \gamma) \tag{2.10} $$
The S′ observer has recorded a time
$$ \gamma = \left(1 - \frac{v^2}{c^2}\right)^{-1/2} \geq 1 \tag{2.11} $$
longer than one second. The S′ observer therefore declares that the S observer’s watch (which
is moving relative to S′ ) is running slow.
(t = 0, x = 0), (t = 0, x = L) (2.12)
A moving observer in the frame S′ watches this process and is somewhat bemused. He
sees the measurement events as
$$ (t' = 0,\ x' = 0) \qquad \left(t' = -\gamma\, \frac{v}{c^2}\, L,\ x' = \gamma L\right) \tag{2.13} $$
The measurements were taken according to S′ at different times. Remember that S′ sees the
ruler moving, so if you measure the end points at different times you’ll not correctly measure
the length.
S′ wants S to make the second measurement at t ′ = 0. In S the position of the ruler doesn’t
change but when should S make the measurement so that S′ says t ′ = 0?
$$ t' = \gamma\left(t - \frac{v}{c^2}\, L\right) = 0 \tag{2.14} $$
thus
$$ t = \frac{v}{c^2}\, L \tag{2.15} $$
(The S observer doesn’t see what is special about this time of course!)
Now where is the second end in the S′ coordinates when this new S measurement is
made?
$$ x' = \gamma\,(x - vt) = \gamma\left(L - \frac{v^2}{c^2}\, L\right) = \frac{L}{\gamma} \tag{2.16} $$
Thus S′ says the two correct simultaneous measurements of the end points are
$$ (t' = 0,\ x' = 0) \qquad \left(t' = 0,\ x' = \frac{L}{\gamma}\right) \tag{2.17} $$
S′ therefore sees the moving ruler to be shorter by a factor of γ relative to S.
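The argument above can be replayed numerically. In the sketch below we work in units with c = 1, so that the boost is written in terms of β = v/c; the values of β and L are hypothetical. The far end of the ruler is measured in S at the time t = vL/c² of (2.15), and the two events are then transformed to S′.

```python
import math

def gamma(beta):
    """Lorentz factor for beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def boost(t, x, beta):
    """Eq. (2.4) in units with c = 1: returns (t', x')."""
    g = gamma(beta)
    return g * (t - beta * x), g * (x - beta * t)

beta, L = 0.8, 1.0   # hypothetical boost speed and ruler length in S

# Near end measured at t = 0; far end measured at t = beta*L, as in Eq. (2.15).
t_near, x_near = boost(0.0, 0.0, beta)
t_far, x_far = boost(beta * L, L, beta)

print(t_near, t_far)                        # both measurements happen at t' = 0
print(x_far - x_near, L / gamma(beta))      # the length in S' equals L / gamma
```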
Exercise 2.3: Repeat the computation of the length of the ruler in the frame S′ but assuming
that the ends of the ruler are at the points x = 1m, x = 2m in the S frame. Show that the
contracted length is again L/γ .
[Figure 2.4: The motion of coordinate axes under a rotation. The projections x cos θ and y sin θ are marked.]
The coordinates transform between the two coordinate systems as x′ = x cos θ + y sin θ and
y′ = −x sin θ + y cos θ.
The different coordinate choices are in a sense a distraction from the physics involved
(of say a moving particle) which is really the same for the observer using either coordinates.
The elegant way to express this is to use vectors. The vector (eg from the origin to a particle)
is the same for both observers although its components may be different for the different
observers. We write
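$$ \vec{x} \quad \text{or} \quad \mathbf{x} = (x, y) \tag{2.19} $$
The coordinate transformation can then be written as a matrix multiplication on the vector
$$ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{2.20} $$
There is something invariant about the position of a particle under rotations - its distance
from the origin, i.e.
$$ L^2 = x^2 + y^2 = x'^2 + y'^2 \tag{2.21} $$
We can extract this from the vector by the dot product of the vector with itself
$$ L^2 = \vec{x}\cdot\vec{x} \tag{2.22} $$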
Now consider Lorentz transformations in the x and t directions where the coordinates are
mixed up by a boost. The Lorentz transformations, although not exactly like the mixing of
spatial coordinates under rotations, do have a similar form. Let’s try to draw a diagram with
the coordinate axes of two different inertial frame observers both shown.
We begin with one stationary observer’s coordinates in the x − (ct) plane. We use ct
rather than just t because it has the same dimensions as x.
[Figure 2.5: The path light follows in the x − ct plane.]
Note that light travels on the line at 45° to the axes since it covers a distance x = ct in a time t.
We can use the Lorentz transformations to plot the position of the equivalent axes in a
frame moving relative to this frame. The coordinate axes are given when ct ′ = 0 and x′ = 0
so
$$ct' = \gamma\, ct - \gamma\frac{v}{c}x$$
$$ct' = 0 \;\rightarrow\; ct = \frac{v}{c}x \qquad (2.23)$$
$$x' = \gamma x - \gamma\frac{v}{c}ct$$
$$x' = 0 \;\rightarrow\; ct = \frac{c}{v}x \qquad (2.24)$$
[Figure 2.6: Coordinate axes before and after a boost superimposed.]
The marked lines are the S′ coordinate axes - they agree with the original coordinates as
to the point (0,0). The plot also shows the grid x′ = 0, 1, 2.. ct ′ = 0, 1, 2.. etc. Note that in
the new coordinate system the path light takes is given by the same line - it goes through the
points (0,0) (1,1) (2,2) etc.
This is an equivalent plot to the one we drew for rotations. We can place an event on the
plot and then read off its coordinates in either the original frame using the square grid or in
the boosted frame using the skewed grid.
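Reading coordinates off the skewed grid is the same as applying the boost directly. As a rough sketch (the function name `boost` and the speed 0.6c are assumptions made for illustration), we can check that an event on the light path stays on the light path and that a point on the x′ axis has ct′ = 0:

```python
import math

def boost(ct, x, beta):
    """Boost the event (ct, x) to a frame moving at speed v = beta * c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)

beta = 0.6  # illustrative speed v/c

# An event on the light path x = ct.
light_ct, light_x = boost(2.0, 2.0, beta)
print(light_ct, light_x)

# An event on the line ct = (v/c) x of (2.23), i.e. on the x' axis.
axis_ct, axis_x = boost(beta * 1.0, 1.0, beta)
print(axis_ct, axis_x)
```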
The grid can be used to see time dilation and length contraction
[Figure 2.7: Coordinate axes before and after a boost with events marked relevant to measuring a time and a length.]
The circles are events positioned at x′ = 0 every second in S′ . Reading the time of the
event on the original axes though shows that S sees more than 1s having passed between
events - a clock in a moving inertial frame measures time more slowly - time dilation.
The solid line represents a rod in S′ . In S if we measure distance at the same time for
each end we get a smaller length - lengths appear contracted in a moving inertial frame.
Although x and t change between S and S′ in this picture, like the coordinates for the rotations, we want a frame invariant way to discuss events. This will lead us to introduce vectors
in this plane which have space and time like components. In the rotation case the vector had
an invariant length that was the same for all observers. For Lorentz transformations we have
shown in Section 2.2 that the quantity
Exercise 2.4: Sketch a space-time diagram showing a stationary and a moving coordinate
frame with relative speed v. Explain from the diagram how a ruler lying on the x-axis of the
moving frame between x′ = 1 and x′ = 2 is seen contracted in the stationary frame.
2.4. FOUR VECTORS 29
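$$x^\mu \rightarrow x'^\mu = \begin{pmatrix} \gamma & -\frac{v}{c}\gamma & 0 & 0 \\ -\frac{v}{c}\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} \qquad (2.27)$$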
Secondly we know that it has a Lorentz invariant length
• A given label for an index may occur at most twice in any term in an expression.
The best way to explain this is with an example. We can write the Lorentz transformation
of xµ in the following form
$$x^\mu \rightarrow x'^\mu = \Lambda^\mu{}_\nu x^\nu \qquad (2.29)$$
The new object $\Lambda^\mu{}_\nu$ has two indices each of which can take the values 0, 1, 2, 3 and so there are 4 × 4 = 16 components. These 16 components are just the 16 components of the Lorentz transformation matrix we’ve written above (for example let µ count the row and ν the column).
In the expression the ν index occurs twice and this implies we must let ν take all possible
values and add up the answers we get in each case.
$$= \gamma x^0 - \gamma\frac{v}{c}x^1$$
This has reproduced the Lorentz transformation for x0 = ct.
Exercise 2.5: Convince yourself that equations (2.27) and (2.29) both reproduce the four
equations (2.4).
We can also write the Lorentz invariant length in this way. Formally we do this as follows.
We define a two index object called the metric with the 16 components
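$$g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \qquad (2.31)$$
Now we can write
$$x_\mu = g_{\mu\nu}x^\nu \qquad (2.32)$$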
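This four vector with a lowered index has components
$$x_\mu = (ct, -x, -y, -z) \qquad (2.33)$$
so that contracting it with $x^\mu$ gives the invariant length
$$x^\mu x_\mu = x^0x_0 + x^1x_1 + x^2x_2 + x^3x_3 = (ct)^2 - x^2 - y^2 - z^2 \qquad (2.34)$$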
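The index gymnastics can be mimicked in a few lines of Python. This is only a sketch: the helper names `lower`, `contract` and `boost_x` and the sample event are our own, and the boost is the matrix of (2.27) written with β = v/c.

```python
import math

METRIC_DIAGONAL = [1.0, -1.0, -1.0, -1.0]  # diagonal entries of g_{mu nu} in (2.31)

def lower(x_up):
    """Lower the index, x_mu = g_{mu nu} x^nu, for the diagonal metric."""
    return [g * component for g, component in zip(METRIC_DIAGONAL, x_up)]

def contract(a_up, b_up):
    """Return the contraction a^mu b_mu (sum over the repeated index)."""
    return sum(a * b for a, b in zip(a_up, lower(b_up)))

def boost_x(x_up, beta):
    """Boost along x with speed v = beta * c, as in (2.27)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    ct, x, y, z = x_up
    return [gamma * (ct - beta * x), gamma * (x - beta * ct), y, z]

event = [5.0, 1.0, 2.0, 3.0]      # illustrative event (ct, x, y, z)
boosted = boost_x(event, 0.8)     # illustrative boost speed
print(contract(event, event), contract(boosted, boosted))
```

Note that the minus signs enter only through `lower`; the boost itself contains no metric factors.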
This notation, which is common, is a little sloppy because the lowered index on a four
vector secretly contains the metric and its minus signs. In practice you may just want to
remember to insert the minus signs as they appear in the above expression when you contract
the indices on four vectors, as here, rather than always write the metric factors! BEWARE
though that there are not these minus signs in the Lorentz transformation expression (2.29) where $\Lambda^\mu{}_\nu$ is not a four vector like object!
$$x^\mu = (3, 1, 0, 2) \qquad y^\mu = (4, 5, 3, 0)$$
Show explicitly by performing a Lorentz boost by speed v in the x-direction that these
products are Lorentz invariant.
2.5. THE LAWS OF DYNAMICS 31
This latter form is explicitly Lorentz invariant because the two sides of the equation
transform in the same way under Lorentz transformations.
So far we only have a four-vector describing position. We will now construct four-vectors
describing the kinematic properties of a particle.
2.5.1 Four-velocity
It is not sensible to use
$$v = \frac{dx^\mu}{dt} \qquad (2.35)$$
as our definition of velocity because both xµ and t transform under Lorentz boosts. The
resulting transformation is very messy.
Ideally we would like a measure of time that is Lorentz invariant so that v would trans-
form only through the transformation of xµ . It would then be a four-vector itself. Such a
Lorentz invariant measure of time is
Proper Time: the time elapsed on a clock in the rest frame of a moving object. Essentially
we imagine that everything has a watch and we time an event for the object by the time on its
watch not the observer’s. Observers in any reference frame will then get the same answer.
Finally we can make a sensible choice for our variable four-velocity
$$u^\mu = \frac{dx^\mu}{d\tau} \qquad (2.36)$$
Let’s stress again that this four-vector transforms just like $x^\mu$ under boosts ie
$$u'^\mu = \Lambda^\mu{}_\nu u^\nu \qquad (2.37)$$
It is useful to know how four-velocity relates to the more standard velocity measured by
an observer using his own watch (we can call this coordinate velocity)
$$u^\mu = \frac{dx^\mu}{d\tau} = \frac{dx^\mu}{dt}\frac{dt}{d\tau} \qquad (2.38)$$
We can work out $dt/d\tau$ from the Lorentz transformations. τ is the time in the rest frame, where the particle sits at the origin, so in a moving frame
$$t = \gamma\tau \;\rightarrow\; \frac{dt}{d\tau} = \gamma \qquad (2.39)$$
Thus the components of four-velocity are
$$u^\mu = \gamma(c, v_x, v_y, v_z) \qquad (2.40)$$
From this expression we can finally work out the invariant “length” of this four vector from
the product
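A numerical sketch of this product, using the components in (2.40), is given here as a check. The function names and the sample velocity are illustrative assumptions; the result it tests is the relation $u^\mu u_\mu = c^2$ used later in these notes.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def four_velocity(vx, vy, vz):
    """Components u^mu = gamma (c, vx, vy, vz) from (2.40)."""
    speed_squared = vx * vx + vy * vy + vz * vz
    gamma = 1.0 / math.sqrt(1.0 - speed_squared / C**2)
    return [gamma * C, gamma * vx, gamma * vy, gamma * vz]

def invariant_length(u):
    """Contraction u^mu u_mu with the metric signs (+, -, -, -)."""
    return u[0] ** 2 - u[1] ** 2 - u[2] ** 2 - u[3] ** 2

u = four_velocity(0.3 * C, 0.4 * C, 0.0)   # illustrative coordinate velocity
ratio = invariant_length(u) / C**2
print(ratio)
```

The ratio comes out as 1 to rounding error whatever coordinate velocity (below c) is chosen.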
$$a^\mu = \frac{du^\mu}{d\tau} \qquad (2.42)$$
Again it’s worth stressing that this object is a four-vector which transforms in the same way
as xµ .
$$p^\mu = mu^\mu = m\frac{dx^\mu}{d\tau} \qquad (2.43)$$
Here we have introduced the mass of the particle, m - it is a constant, intrinsic property of
the particle.
$p^\mu$ is again a four-vector that transforms as
$$p'^\mu = \Lambda^\mu{}_\nu p^\nu \qquad (2.44)$$
$$p^0 = mc\gamma$$
The first term is a constant. The second term though is recognizable since $\frac{1}{2}mv^2$ is kinetic energy in the low v limit. This suggests we should interpret $p^0$ as the relativistic version of energy (divided by c). Then we have a surprising interpretation of the first, constant, term - a particle at rest has energy $mc^2$. For a moving particle the energy is
$$E = \gamma mc^2 \qquad (2.48)$$
and the relativistic version of kinetic energy (the energy when moving minus the energy at
rest)
T = (γ − 1)mc2 (2.49)
The invariant length of the four-vector follows from $u^\mu u_\mu = c^2$ so
$$p^\mu p_\mu = \frac{E^2}{c^2} - |\vec{p}|^2 = m^2c^2 \qquad (2.50)$$
Exercise 2.7: Calculate by explicitly performing a boost the relativistic energy and mo-
mentum of a proton moving at speed v=0.5c. The rest mass of a proton is approximately 1
GeV/c2 .
$$f^\mu = \frac{dp^\mu}{d\tau} \qquad (2.51)$$
This is manifestly Lorentz invariant and has the correct non-relativistic limit if f µ is a rel-
ativistic extension of force. As yet though we haven’t mentioned forces and we won’t until
we discuss electro-magnetism! In fact this guess is the correct law.
The law tells us something interesting even when $f^\mu = 0$
$$\frac{dp^\mu}{d\tau} = 0 \;\rightarrow\; p^\mu = \text{constant} \qquad (2.52)$$
In other words, if no external force acts on a system four-momentum is conserved. This is the relativistic analogue of conservation of energy ($p^0$) and conservation of the usual three component momentum ($p^1, p^2, p^3$).
c=1 (2.53)
In other words we redefine the unit of length so that it is the distance light travels in 1 second!
This would not be sensible for everyday life but in problems where everything is travelling
at the speed of light a meter is an absurdly small distance. In practice we will be able to drop
all the factors of c from computations. It’s pretty easy to put them back into the final answer
using dimensional analysis as we will see.
Consider first a static observer in the frame of the light source. The photons of light carry
four momentum
$$p^\mu = (E, \vec{p}) = \left(hf, -\frac{h}{\lambda}\hat{x}\right) = (hf, -hf, 0, 0) \qquad (2.54)$$
Note that the photon is moving in the negative x-direction towards the observer. We have
used the quantum mechanical relations between the energy and frequency of the photon and
between its momentum and wavelength. We have also used f λ = c = 1.
We can now ask what would happen to the frequency of the light if the observer was
moving in the positive x-direction at speed v. We just perform a boost on the four-vector
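$$p'^\mu = \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} hf \\ -hf \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma(1+v)hf \\ -\gamma(1+v)hf \\ 0 \\ 0 \end{pmatrix} \qquad (2.55)$$
Now if we just concentrate on the time-like component we have
$$p'^0 = E' = hf' = \sqrt{\frac{1}{1-v^2}}\,(1+v)\,hf = \sqrt{\frac{(1+v)^2}{(1+v)(1-v)}}\,hf \qquad (2.56)$$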
2.6. PHYSICS WITH FOUR-MOMENTUM 35
or
$$f' = \sqrt{\frac{1+v}{1-v}}\,f \qquad (2.57)$$
Finally we can reintroduce the factors of c since the factors of (1 + v) are not dimension-
ally correct. We should have
$$f' = \sqrt{\frac{1+v/c}{1-v/c}}\,f \qquad (2.58)$$
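The boost and the closed formula can be compared numerically. In this sketch (function names and the value β = v/c = 0.5 are illustrative assumptions) the first function evaluates the time-like row of (2.55) in units with c = 1 and the second evaluates (2.58).

```python
import math

def boosted_photon_energy(energy, beta):
    """Time-like component of p'^mu for a photon moving in the -x direction (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    p_x = -energy                      # photon momentum along x, |p| = E when c = 1
    return gamma * energy - beta * gamma * p_x

def doppler_factor(beta):
    """Ratio f'/f from (2.58) with beta = v/c."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

beta = 0.5  # illustrative observer speed
from_boost = boosted_photon_energy(1.0, beta)
from_formula = doppler_factor(beta)
print(from_boost, from_formula)
```

Since E = hf, the ratio of energies is the ratio of frequencies, and the observer moving towards the source sees a higher frequency.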
You’ve probably calculated this relationship for the change in the wavelength of the pho-
ton as a function of its scattering angle, ∆λ (θ ), previously. Using four-momentum will get
us to the answer much quicker.
Set up the four momentum of the particles to be:
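initial photon: $p^\mu_{\gamma i} = \left(\frac{h}{\lambda}, \frac{h}{\lambda}\hat{x}\right)$
initial electron: $p^\mu_{ei} = (m_e, 0)$
final photon: $p^\mu_{\gamma f} = \left(\frac{h}{\lambda'}, \frac{h}{\lambda'}\hat{f}\right)$
final electron: $p^\mu_{ef}$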
Here f̂ is a unit vector in the direction of the motion of the final photon, which is at an angle
θ to the x axis.
Since no external force acts, four momentum is conserved in the collision so
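$$p^\mu_{\gamma i} + p^\mu_{ei} = p^\mu_{\gamma f} + p^\mu_{ef} \qquad (2.59)$$
It turns out to be helpful to rearrange this equation so that $p^\mu_{ef}$ is isolated - we know least about $p^\mu_{ef}$ so will want to eliminate it
$$p^\mu_{\gamma i} + p^\mu_{ei} - p^\mu_{\gamma f} = p^\mu_{ef} \qquad (2.60)$$
Now we consider the Lorentz invariant product
$$\begin{aligned} p^\mu_{ef}\,p_{ef\,\mu} &= m_e^2 \\ &= (p^\mu_{\gamma i} + p^\mu_{ei} - p^\mu_{\gamma f})(p_{\gamma i\,\mu} + p_{ei\,\mu} - p_{\gamma f\,\mu}) \\ &= p^\mu_{\gamma i}p_{\gamma i\,\mu} + p^\mu_{ei}p_{ei\,\mu} + p^\mu_{\gamma f}p_{\gamma f\,\mu} + 2(p^\mu_{\gamma i}p_{ei\,\mu} - p^\mu_{\gamma i}p_{\gamma f\,\mu} - p^\mu_{\gamma f}p_{ei\,\mu}) \end{aligned} \qquad (2.61)$$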
We have used two crucial facts here. Firstly when the four momentum of a particle is contracted with itself we simply obtain its invariant mass squared ($m_e^2$ for the electron, zero for the photon). Secondly we have used the contraction law $p_1^\mu p_{2\mu} = (p_1^0p_2^0 - \vec{p}_1\cdot\vec{p}_2)$.
Rearranging we find
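$$\frac{h}{\lambda_i}m_e - \frac{h}{\lambda_f}m_e = \frac{h}{\lambda_i}\frac{h}{\lambda_f}(1 - \cos\theta) \qquad (2.62)$$
Multiplying through by $\lambda_i\lambda_f/(hm_e)$ gives
$$\lambda_f - \lambda_i = \frac{h}{m_e}(1 - \cos\theta) \qquad (2.63)$$
which is the answer we want. Again we can insert c on dimensional grounds
$$\lambda_f - \lambda_i = \frac{h}{m_ec}(1 - \cos\theta) \qquad (2.64)$$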
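To get a feel for the size of the effect we can evaluate (2.64) numerically. The sketch below is illustrative only: the function name is our own and the constants are standard SI values quoted to the precision shown.

```python
import math

H = 6.62607015e-34       # Planck constant in J s
M_E = 9.1093837015e-31   # electron mass in kg
C = 299_792_458.0        # speed of light in m/s

def compton_shift(theta):
    """Wavelength shift lambda_f - lambda_i from (2.64), in metres."""
    return H / (M_E * C) * (1.0 - math.cos(theta))

for degrees in (0, 90, 180):
    print(degrees, compton_shift(math.radians(degrees)))
```

The shift vanishes for forward scattering, is about 2.4 × 10⁻¹² m at 90 degrees, and is twice that for back scattering, independent of the initial wavelength.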
It’s not immediately obvious how much energy is available to make rest mass energy of
the new particle because momentum conservation requires the final state to be moving and
have kinetic energy. A sensible thing to do is to move to the Centre of Mass frame where
the particle and target (a particle in the wall) approach each other with equal and opposite
momentum.
[Figure 2.11: The lab frame with one fixed target, where $p_a^\mu = (E_a, p_a)$ and $p_b^\mu = (m_b, 0)$, and the centre of mass frame where both particles have equal momentum.]
In this frame the particle produced will be at rest and all the energy of the initial state
will become rest mass energy of the product.
We can work out the Lorentz boost needed to move from the original “lab” frame to the
centre of mass frame. We boost the four-momenta in the lab frame by an amount v
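$$p_a'^\mu = \begin{pmatrix} \gamma & -\gamma v \\ -\gamma v & \gamma \end{pmatrix} \begin{pmatrix} E_a \\ p_a \end{pmatrix} = \begin{pmatrix} \gamma(E_a - vp_a) \\ \gamma(p_a - vE_a) \end{pmatrix} \qquad (2.65)$$
$$p_b'^\mu = \begin{pmatrix} \gamma & -\gamma v \\ -\gamma v & \gamma \end{pmatrix} \begin{pmatrix} m_b \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma m_b \\ -\gamma vm_b \end{pmatrix} \qquad (2.66)$$
In the Centre of Mass frame the momenta must be equal and opposite so
$$\begin{aligned} p'_{ax} &= -p'_{bx} \\ \gamma vm_b &= \gamma(p_a - vE_a) \end{aligned} \qquad (2.67)$$
and the required boost is by
$$\frac{v}{c} = \frac{p_a}{m_bc + E_a/c} \qquad (2.68)$$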
If after this boost the particles are ultra-relativistic so that $E_a \simeq |p_a| = |p_b| \simeq E_b$ then the total available energy is
$$E_{CoM} = 2\gamma m_bc^2 = \sqrt{\frac{4m_b^2c^4}{1 - \left(\frac{p_a}{m_bc + E_a/c}\right)^2}} = \sqrt{\frac{4m_b^2c^4\,(m_bc + E_a/c)^2}{(m_bc + E_a/c)^2 - p_a^2}} \qquad (2.69)$$
If we now expand in the limit with Ea ≫ mb c2 , ma c2 we find (remember that in this high
energy limit Ea /c = pa )
$$E_{CoM} = \sqrt{\frac{4m_b^2E_a^2}{2m_bE_a}} \qquad (2.70)$$
$$E_{CoM} = \sqrt{2m_bE_a} \qquad (2.71)$$
We could have obtained this result more quickly by calculating the invariant rest mass of
the whole system in the original coordinates
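$$\begin{aligned} p^\mu_{TOT}\,p_{TOT\,\mu} &= m_{TOT}^2c^2 \\ &= (p_a^\mu + p_b^\mu)(p_{a\mu} + p_{b\mu}) \\ &= p_a^\mu p_{a\mu} + p_b^\mu p_{b\mu} + 2p_a^\mu p_{b\mu} \end{aligned} \qquad (2.72)$$
which in the limit where $E_a$ is large compared to the rest masses gives
$$m_{TOT}^2c^2 = 2\frac{E_a}{c}\frac{m_bc^2}{c} = 2E_am_b \qquad (2.73)$$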
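The quality of the high energy approximation is easy to test numerically. In this sketch we work in units with c = 1 and energies in GeV; the function names and the 7000 GeV beam energy are illustrative assumptions, and the exact expression is just (2.72) with $p_a^\mu p_{a\mu} = m_a^2$, $p_b^\mu p_{b\mu} = m_b^2$ and $p_a^\mu p_{b\mu} = E_am_b$ for a target at rest.

```python
import math

def com_energy_exact(e_a, m_a, m_b):
    """Invariant mass of the system from (2.72) for beam energy e_a on a target at rest (c = 1)."""
    return math.sqrt(m_a**2 + m_b**2 + 2.0 * e_a * m_b)

def com_energy_high_energy(e_a, m_b):
    """High energy limit sqrt(2 m_b E_a) from (2.71)."""
    return math.sqrt(2.0 * m_b * e_a)

m_proton = 0.938   # proton mass in GeV
e_beam = 7000.0    # illustrative beam energy in GeV

exact = com_energy_exact(e_beam, m_proton, m_proton)
approx = com_energy_high_energy(e_beam, m_proton)
print(exact, approx)
```

Both numbers are close to 115 GeV: most of the beam energy in a fixed target experiment goes into motion of the final state rather than into new rest mass.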
$$E_p = \frac{m_\Delta^2 - m_p^2}{4h\nu} \simeq 2\times10^{20}\,\mathrm{eV} \qquad (2.78)$$
Protons with energy of this or above will undergo this interaction. Factoring in the
density of photons it turns out that the mean free path for such protons is about 3Mpc (our
galaxy group is about 20Mpc across). We shouldn’t expect to see any protons of this energy
from active galaxies.
Surprisingly though experimenters have reported observations of cosmic ray protons with
higher energy than this bound (although they do see a decrease in the number of events above
the bound limit). If these events are real we might be doing something wrong! Could Special
Relativity break down at such high energies? Could there be a source of high energy protons
within 3Mpc (either an astronomical source or very massive particles left over from the Big
bang that decay to these protons)? At the moment this issue is an open question.
Exercise 2.8: A charged pion ($m_\pi = 140\,\mathrm{MeV}/c^2$) at rest decays to a charged muon ($m_\mu = 105\,\mathrm{MeV}/c^2$) and a massless muon neutrino. Calculate the energy and momenta of the neutrino and the muon.
Exercise 2.9: In the original (Homestake) solar neutrino detection experiment neutrinos
from the sun interact with $^{37}$Cl atoms to form $^{37}$Ar and an electron. Assuming the Cl atoms
are at rest what boost is required to move to the centre of mass frame? Determine the mini-
mum energy the neutrino must have for this reaction to proceed.
Exercise 2.10: Speculative models of particle physics predict that at very high energies all
matter is unified into a single form. If this were true one would expect, very rarely, that
protons would decay to, for example, a positron and a photon. Derive an expression for the
wavelength of the emerging photon in the proton’s rest frame.
2.7 Tensors
We are now familiar with four-vectors. They are though just one part of a family of objects
called tensors which can have more than one index. We will need these later when we study
electromagnetism. To introduce them think about angular momentum:
Non-relativistically angular momentum is given by
$$l^3 = xp_y - yp_x$$
$$L^{\mu\nu} = x^\mu p^\nu - p^\mu x^\nu \qquad (2.81)$$
For example
$$L^{12} = xp_y - yp_x = l^3 \qquad (2.82)$$
Tensors have a number of properties which in this case we can deduce from its “compos-
ite” nature. Thus
• Under Lorentz transformations: $L'^{\mu\nu} = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta L^{\alpha\beta}$
There are 16 terms in the final sum here (in fact because of the anti-symmetry of Lµν in (2.81)
the diagonal terms are zero and eg L12 = −L21 so there are only 6 independent components).
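The counting of components can be made concrete with a small sketch. The function name and the sample four-vectors are purely illustrative; the tensor is built exactly as in (2.81).

```python
def angular_momentum_tensor(x_up, p_up):
    """L^{mu nu} = x^mu p^nu - p^mu x^nu as in (2.81)."""
    return [[x_up[m] * p_up[n] - p_up[m] * x_up[n] for n in range(4)]
            for m in range(4)]

x_up = [1.0, 2.0, 3.0, 4.0]   # illustrative position four-vector
p_up = [5.0, 6.0, 7.0, 8.0]   # illustrative momentum four-vector
L = angular_momentum_tensor(x_up, p_up)

# Components above the diagonal; the rest follow from antisymmetry.
independent = [(m, n) for m in range(4) for n in range(m + 1, 4)]
print(len(independent))
print(L[1][2], x_up[1] * p_up[2] - x_up[2] * p_up[1])
```

The three components with both indices spatial are the usual angular momentum, as in (2.82); the other three mix time and space.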
Finally we note that the metric we introduced earlier is itself a tensor.
Exercise 2.11: Show that the metric tensor is invariant to a boost by speed v.
$$\frac{dp^\mu}{d\tau} = 0 \qquad (2.83)$$
has an interesting form. It is given by
$$S = -m\int d\tau\,\sqrt{\frac{dx^\mu}{d\tau}\frac{dx_\mu}{d\tau}} \qquad (2.84)$$
Note that formally here τ need not be the proper time because we can parametrize the path by any other $\tau'(\tau)$. Since $\frac{dx^\mu}{d\tau} = \frac{dx^\mu}{d\tau'}\frac{d\tau'}{d\tau}$ the action transforms to the same form as (2.84) but with $\tau \to \tau'$.
The Euler Lagrange equations take the form
$$\frac{d}{d\tau}\left(\frac{\partial L}{\partial\frac{dx^\mu}{d\tau}}\right) - \frac{\partial L}{\partial x^\mu} = 0 \qquad (2.85)$$
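or explicitly
$$\frac{1}{2}\frac{d}{d\tau}\left[m\frac{dx^\mu}{d\tau}\left(\frac{dx^\nu}{d\tau}\frac{dx_\nu}{d\tau}\right)^{-1/2}\right] = 0 \qquad (2.86)$$
from which we learn that
$$\frac{dx^\mu}{d\tau}\frac{dx_\mu}{d\tau} = u^\mu u_\mu = c^2 \qquad (2.87)$$
and
$$\frac{d}{d\tau}\left(m\frac{dx^\mu}{d\tau}\right) = \frac{dp^\mu}{d\tau} = 0 \qquad (2.88)$$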
the correct equation of motion if we do identify the parameter with the proper time.
2.8. RELATIVISTIC ACTION 41
If we stare at (2.84) though we realize that it has an interesting form. The proper time
is being used to parameterize the path of the particle but if we move d τ into the square root
we see it cancels and what we are actually doing is calculating the length of the path. This
is very elegant in that the length of the path is the only physical characteristic of the motion
- it’s nice that the action is so simple.
Chapter 3
Relativistic Electromagnetism
In this section of the course we will study electromagnetism. We begin by reviewing Maxwell’s
equations in integral and differential form. Our main task here though will be to understand
how these equations already encode relativity. To do this we will need to rewrite them in
terms of potentials to find a manifestly Lorentz invariant form. We will then understand
how electric and magnetic fields change under a boost.
• $\vec{E}$ is the (vector) force a unit charge experiences at a point on the closed surface S.
• The integral means a sum of $\vec{E}\cdot d\vec{A}$ for the infinitesimal surface elements that make up a whole, closed surface S. Remember that a little area element is described by a vector normal to its surface.
3.1. INTEGRAL FORM OF MAXWELL’S EQUATIONS 43
e.g. The electric field around a point charge is given by Gauss’ law using a spherical surface
S of radius r around the charge. The integral is then trivially performed and the result is
summarised in Figure 3.2.
$$4\pi r^2|\vec{E}| = \frac{q}{\varepsilon_0}$$
$$|\vec{E}| = \frac{q}{4\pi\varepsilon_0r^2}$$
$$\int_S\vec{B}\cdot d\vec{A} = 0 \qquad (3.2)$$
since there are no magnetic charges, i.e. no magnetic monopoles.
Faraday discovered that moving a loop of wire in a magnetic field induces a current in
the wire. The number of magnetic field lines passing through the loop is the magnetic flux
given by
$$\Phi = \int_S\vec{B}\cdot d\vec{A} \qquad (3.3)$$
where the area S which is integrated over in this case is the open surface enclosed by the
loop. Now the induced voltage depends on the rate of change of the number of magnetic
field lines passing through the loop with respect to time t, as given by Faraday’s law,
$$\text{e.m.f.} = -\frac{\partial\Phi}{\partial t} \qquad (3.4)$$
The minus sign reflects Lenz’s Law which says the system resists change.
The voltage difference around the loop perimeter, s, is given by “V = Ed” but since different bits of the wire point in different directions we must calculate this for each infinitesimal bit of wire and sum the answers,
$$\text{e.m.f.} = \int_s\vec{E}\cdot d\vec{l} \qquad (3.5)$$
$$\int_s\vec{E}\cdot d\vec{l} = -\int_S\frac{\partial\vec{B}}{\partial t}\cdot d\vec{A} \qquad (3.6)$$
$$\int_s\vec{B}\cdot d\vec{l} = \mu_0\int_S\vec{J}\cdot d\vec{A} + \mu_0\varepsilon_0\int_S\frac{\partial\vec{E}}{\partial t}\cdot d\vec{A} \qquad (3.7)$$
Reading just the first two terms in this equation we see the familiar physics that if a current I (given by the current density $\vec{J}$ integrated over the area elements $d\vec{A}$ of the open surface S bounded by the loop) is flowing through some loop then there is a circulating magnetic field.
The final term was added for consistency by Maxwell (we will revisit this shortly) and
mirrors the term in Faraday’s law.
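Gauss’ Theorem:
$$\int_S\vec{F}\cdot d\vec{A} = \int_V\vec\nabla\cdot\vec{F}\,dV \qquad (3.8)$$
Stokes’ Theorem:
$$\int_s\vec{F}\cdot d\vec{l} = \int_S(\vec\nabla\times\vec{F})\cdot d\vec{A} \qquad (3.9)$$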
We can use these to find the differential form of Maxwell’s equations as the following
two examples show
$$\int_S\vec{E}\cdot d\vec{A} = \frac{q}{\varepsilon_0} \qquad (3.10)$$
to the differential form using Gauss’ Theorem
$$\int_S\vec{E}\cdot d\vec{A} = \int_V\vec\nabla\cdot\vec{E}\,dV \qquad (3.11)$$
where V is the volume enclosed by the closed surface S. If we also write the charge in terms
of a charge density
$$\frac{q}{\varepsilon_0} = \int_V\frac{\rho}{\varepsilon_0}\,dV \qquad (3.12)$$
then comparing the above equations we find
$$\int_V\vec\nabla\cdot\vec{E}\,dV = \int_V\frac{\rho}{\varepsilon_0}\,dV \qquad (3.13)$$
Then shrinking the volume V to a point the integrands may be equated to yield Gauss’s law
in differential form,
$$\vec\nabla\cdot\vec{E} = \frac{\rho}{\varepsilon_0} \qquad (3.14)$$
$$\int_s\vec{E}\cdot d\vec{l} = -\int_S\frac{\partial\vec{B}}{\partial t}\cdot d\vec{A} \qquad (3.15)$$
to the differential form using Stokes’ Theorem
$$\int_s\vec{E}\cdot d\vec{l} = \int_S(\vec\nabla\times\vec{E})\cdot d\vec{A} \qquad (3.16)$$
Equating the right-hand sides of the above equations
$$\int_S(\vec\nabla\times\vec{E})\cdot d\vec{A} = -\int_S\frac{\partial\vec{B}}{\partial t}\cdot d\vec{A} \qquad (3.17)$$
Then, shrinking the surface to a point, we may equate the integrands of the last equation and
hence arrive at the differential form of Faraday’s law
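$$\vec\nabla\times\vec{E} = -\frac{\partial\vec{B}}{\partial t} \qquad (3.18)$$
$$\begin{aligned} \vec\nabla\cdot\vec{E} &= \frac{\rho}{\varepsilon_0} \\ \vec\nabla\cdot\vec{B} &= 0 \\ \vec\nabla\times\vec{E} &= -\frac{\partial\vec{B}}{\partial t} \\ \vec\nabla\times\vec{B} &= \mu_0\vec{J} + \mu_0\varepsilon_0\frac{\partial\vec{E}}{\partial t} \end{aligned} \qquad (3.19)$$
Exercise 3.2: The vector $\vec{A} = x\hat{i} + xy\hat{j} + xz^3\hat{k}$. Evaluate $\vec\nabla\cdot\vec{A}$ and $\vec\nabla\times\vec{A}$.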
3.2. DIFFERENTIAL FORM OF MAXWELL’S EQUATIONS 47
$$\int_S\vec{J}\cdot d\vec{A} = -\int_V\frac{\partial\rho}{\partial t}\,dV \qquad (3.20)$$
$$q = \int\rho\,dV$$
Applying Gauss’ Divergence theorem to the left hand side we have the differential form for
charge and current conservation,
$$\vec\nabla\cdot\vec{J} = -\frac{\partial\rho}{\partial t} \qquad (3.21)$$
Exercise 3.3: Prove the conservation of charge starting from Maxwell’s equations in differ-
ential form.
Exercise 3.4: Consider flow within a gas or fluid of density ρ (r). Show that conservation of
mass within some volume implies
$$-\frac{d}{dt}\int\rho\,dV = \int\rho\vec{v}\cdot d\vec{A}$$
where v(r) is the flow velocity. By explicitly applying this equation to an infinitesimal volume show that
$$\frac{\partial}{\partial t}\rho + \vec\nabla\cdot(\rho\vec{v}) = 0$$
If the velocity of the fluid is subject to the condition $\vec\nabla\times\vec{v} = 0$ show, using Stokes’ Theorem, that the fluid does not support circulation.
$$\vec\nabla\times\vec{B} = \mu_0\vec{J} \qquad (3.22)$$
However, we can see quite simply in this formalism that this can not be correct. This is because $\vec\nabla\cdot(\vec\nabla\times\vec{F}) = 0$ for any vector field $\vec{F}$ (the proof is given in Appendix 3.2 at the end of this chapter).
Let’s see if this makes sense for our equation above by taking the divergence
3.3 Potentials
Potentials are a mathematical trick for making Maxwell’s equations easier to solve. The
one you are already familiar with is the electrostatic potential, which we shall discuss first.
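$$\vec\nabla\cdot\vec{E} = \frac{\rho}{\varepsilon_0} \qquad \vec\nabla\times\vec{E} = \vec{0} \qquad (3.26)$$
If we write
$$\vec{E} = -\vec\nabla\phi \qquad (3.27)$$
then, because of the identity (see Appendix 2)
$$\vec\nabla\times\vec\nabla\phi \equiv 0 \qquad (3.28)$$
the second of our two Maxwell equations is automatically satisfied. We are left with only Poisson’s equation
$$-\nabla^2\phi = \frac{\rho}{\varepsilon_0} \qquad (3.29)$$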
This simplifies things since the equation only involves one scalar function rather than the
three components of the electric field. The electric field can then readily be obtained from
the scalar potential using Eq.3.27.
which shows that φ can be interpreted as the “potential energy” for moving a unit charge
from infinity to the point ~x. This energy is independent of the path the charge takes to arrive
at that point.
Note that φ is only defined up to an arbitrary constant (the energy of a charge at infinity)
since
[Figure 3.6: A parallel plate capacitor, with φ = 0 on the plate at x = 0 and φ = V on the plate at x = d.]
Between the parallel planes of the plates there is no charge so Poisson’s equation reduces
to Laplace’s equation
$$\nabla^2\phi = 0 \qquad (3.32)$$
In this problem, by the symmetry of the assumed infinite capacitor plates, the only variation in φ will be in the x direction defined to be perpendicular to the plates,
$$\nabla^2\phi = \frac{d^2}{dx^2}\phi = 0 \qquad (3.33)$$
Integrating twice we obtain
$$\phi = Ax + C \qquad (3.34)$$
with A, C constants. They can be fixed by imposing the boundary conditions φ(x = 0) = 0, φ(x = d) = V. We obtain
$$\phi = \frac{V}{d}x \qquad (3.35)$$
Finally we can obtain the electric field from the potential
$$\vec{E} = -\vec\nabla\phi = \left(-\frac{V}{d}, 0, 0\right) \qquad (3.36)$$
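The same answer can be found numerically, which is how Laplace’s equation is usually tackled when the geometry is less symmetric. The sketch below uses a simple relaxation method (repeatedly replacing each grid value by the average of its neighbours), which is our own choice of technique rather than anything used in these notes; the function name, grid size, voltage and plate separation are all illustrative.

```python
def solve_laplace_1d(v_plate, n_points=21, sweeps=5000):
    """Relax d^2 phi / dx^2 = 0 on a uniform grid with phi = 0 at x = 0 and phi = v_plate at x = d."""
    phi = [0.0] * n_points
    phi[-1] = v_plate
    for _ in range(sweeps):
        for i in range(1, n_points - 1):
            # Discrete Laplace equation: each interior value is the mean of its neighbours.
            phi[i] = 0.5 * (phi[i - 1] + phi[i + 1])
    return phi

V, d = 10.0, 0.02                 # illustrative voltage (volts) and separation (metres)
phi = solve_laplace_1d(V)
n = len(phi)
exact = [V * i / (n - 1) for i in range(n)]      # phi = V x / d from (3.35)
max_error = max(abs(a - b) for a, b in zip(phi, exact))
field_x = -(phi[1] - phi[0]) / (d / (n - 1))     # finite-difference estimate of E_x
print(max_error, field_x, -V / d)
```

The relaxed solution reproduces the straight line (3.35) and the uniform field (3.36).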
Here the potential will only vary radially. In the Appendix ∇2 is calculated in cylindrical
polar coordinates (r, θ , z). Only allowing r variation in φ we find
$$\frac{1}{r}\frac{d}{dr}\left(r\frac{d\phi}{dr}\right) = 0 \qquad (3.37)$$
Integrating twice we find
$$\phi = A\ln r + C \qquad (3.38)$$
Again we fix the integration constants from the boundary conditions shown in the figure, so
$$\phi(r) = -\frac{V}{\ln(b/a)}(\ln r - \ln b) \qquad (3.39)$$
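As a check on (3.39), the sketch below evaluates the potential at the two conductors and tests that $r\,d\phi/dr$ is the same at different radii, which is what (3.37) requires. We assume, as (3.39) itself implies, that φ = V at the inner radius r = a and φ = 0 at the outer radius r = b; the function names and the numerical values are illustrative.

```python
import math

def coax_potential(r, a, b, v):
    """phi(r) from (3.39) between an inner conductor at r = a and an outer one at r = b."""
    return -v / math.log(b / a) * (math.log(r) - math.log(b))

def r_dphi_dr(r, a, b, v, h=1e-6):
    """Central-difference estimate of r * dphi/dr, which (3.37) says is independent of r."""
    slope = (coax_potential(r + h, a, b, v) - coax_potential(r - h, a, b, v)) / (2.0 * h)
    return r * slope

a, b, V = 1.0, 5.0, 10.0   # illustrative radii and voltage
print(coax_potential(a, a, b, V), coax_potential(b, a, b, V))
print(r_dphi_dr(2.0, a, b, V), r_dphi_dr(4.0, a, b, V), -V / math.log(b / a))
```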
Exercise 3.6: Solve Laplace’s equation for the potential generated by a charged point parti-
cle. The operator ∇2 in spherical polar coordinates is given by
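$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\cot\theta}{r^2}\frac{\partial}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$
$$\vec\nabla\times\vec{B} = \mu_0\vec{J} \qquad (3.40)$$
and we can not use a scalar potential field since $\vec\nabla\times\vec\nabla\phi \equiv 0$.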
On the other hand for all magnetic fields, static or otherwise,
$$\vec\nabla\cdot\vec{B} = 0 \qquad (3.41)$$
Thus we can automatically solve the Maxwell equation in Eq.3.41 provided we write the magnetic field in terms of a new vector field, the “vector potential” $\vec{A}$
$$\vec{B} = \vec\nabla\times\vec{A} \qquad (3.42)$$
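Just as there was some freedom in the choice of the electrostatic potential so there is an arbitrariness about the vector potential $\vec{A}$. This is because the magnetic field $\vec{B}$ is left invariant if we transform $\vec{A} \to \vec{A} + \vec\nabla\psi$ for any scalar function ψ, since $\vec\nabla\times\vec\nabla\psi \equiv 0$.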
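$$\begin{aligned} \vec{E} &= -\vec\nabla\phi - \frac{\partial\vec{A}}{\partial t} \\ \vec{B} &= \vec\nabla\times\vec{A} \end{aligned} \qquad (3.44)$$
The second of these equations defines the usual magnetic vector potential which is always valid even in the non-static case since
$$\vec\nabla\cdot\vec{B} = \vec\nabla\cdot(\vec\nabla\times\vec{A}) = 0 \qquad (3.45)$$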
The first of these equations involves a new, second term on the right-hand side which is designed to yield the correct Maxwell equation, $\vec\nabla\times\vec{E} = -\frac{\partial\vec{B}}{\partial t}$. This is easily seen by taking the curl of the first equation,
$$\begin{aligned} \vec\nabla\times\vec{E} &= \vec\nabla\times\left(-\vec\nabla\phi - \frac{\partial\vec{A}}{\partial t}\right) \\ &= -\vec\nabla\times(\vec\nabla\phi) - \frac{\partial(\vec\nabla\times\vec{A})}{\partial t} \\ &= -\frac{\partial\vec{B}}{\partial t} \end{aligned} \qquad (3.46)$$
To summarise, potentials $\vec{A}$ and φ may be defined in Eq.3.44 which always automatically satisfy the homogeneous Maxwell equations $\vec\nabla\cdot\vec{B} = 0$ and $\vec\nabla\times\vec{E} = -\frac{\partial\vec{B}}{\partial t}$. This should simplify things greatly since now there are only the remaining two inhomogeneous Maxwell equations to solve. Let’s write them out in terms of the potentials
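For the $\vec\nabla\times\vec{B}$ equation we will again use the identity for this product in Appendix 2. Thus
$$\vec\nabla(\vec\nabla\cdot\vec{A}) - \nabla^2\vec{A} = \mu_0\vec{J} + \mu_0\varepsilon_0\frac{\partial}{\partial t}\left(-\frac{\partial\vec{A}}{\partial t} - \vec\nabla\phi\right) \qquad (3.48)$$
or rearranging
$$-\nabla^2\vec{A} + \mu_0\varepsilon_0\frac{\partial^2\vec{A}}{\partial t^2} = \mu_0\vec{J} - \vec\nabla\left(\vec\nabla\cdot\vec{A} + \mu_0\varepsilon_0\frac{\partial\phi}{\partial t}\right) \qquad (3.49)$$
Unfortunately these two equations we are left with are quite messy! To clean them up we can make use of our ability to redefine the potentials whilst keeping the $\vec{E}$, $\vec{B}$ fields the same.
$$\begin{aligned} \vec{A} &\to \vec{A} + \vec\nabla\psi \\ \phi &\to \phi - \frac{\partial\psi}{\partial t} \end{aligned} \qquad (3.50)$$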
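Exercise 3.7: Show explicitly that the $\vec{E}$, $\vec{B}$ fields are left invariant by these transformations.
$$-\nabla^2\phi + \mu_0\varepsilon_0\frac{\partial^2\phi}{\partial t^2} = \frac{\rho}{\varepsilon_0} \qquad (3.53)$$
$$-\nabla^2\vec{A} + \mu_0\varepsilon_0\frac{\partial^2\vec{A}}{\partial t^2} = \mu_0\vec{J} \qquad (3.54)$$
$$-\nabla^2\phi + \mu_0\varepsilon_0\frac{\partial^2\phi}{\partial t^2} = 0, \qquad -\nabla^2\vec{A} + \mu_0\varepsilon_0\frac{\partial^2\vec{A}}{\partial t^2} = 0 \qquad (3.55)$$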
which have complex wave solutions of the form
$$\frac{\omega^2}{k^2} = c^2 = \frac{1}{\mu_0\varepsilon_0} \qquad (3.57)$$
In other words these waves move at a speed $c = 1/\sqrt{\mu_0\varepsilon_0}$ which is the speed of light.
Following a similar analysis for the electric and magnetic fields, this is how Maxwell
concluded that light is an electromagnetic wave.
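It is worth putting the numbers in. The short sketch below evaluates the wave speed from (3.57) using measured values of the two constants (quoted here to the precision shown; the function name is our own).

```python
import math

MU_0 = 1.25663706212e-6        # vacuum permeability in H/m
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity in F/m

def wave_speed(mu, epsilon):
    """Phase speed omega / k of the wave solutions, from (3.57)."""
    return 1.0 / math.sqrt(mu * epsilon)

speed = wave_speed(MU_0, EPSILON_0)
print(speed)  # very close to 299 792 458 m/s
```

Two constants measured with capacitors and coils combine to give the speed of light.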
Relativistic Form
Eqns. 3.53 and 3.54 also have a very suggestive form for Relativity - they are symmetric
in time and space. There’s also a symmetry between the components of ~A and φ - should we
promote them to the components of a four-vector? Similarly should the charge density and
current become a four-vector?
$$\rho' = \gamma\rho_0 \qquad (3.59)$$
There will also now be a current density since the charges are moving in the new inertial frame. These transformations are all consistent with ρ and $\vec{J}$ being a four vector.
Thus we define
$$J^\mu = (\rho c, \vec{J}) \qquad (3.60)$$
Classically the current density is just given in terms of the speed of the particles as $\rho\vec{v}$.
3.4. RELATIVISTIC FORMULATION OF ELECTROMAGNETISM 55
$$J^\mu = \rho_0u^\mu = \rho_0\frac{dx^\mu}{d\tau} \qquad (3.61)$$
The Lorentz invariant “length” of the four-vector then follows from $u^\mu u_\mu = c^2$
$$J^\mu J_\mu = \rho_0^2c^2 \qquad (3.62)$$
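Exercise 3.8: Write equations for how each of the four components of $J^\mu$ transform under a Lorentz boost by v in the x-direction.
$$\vec\nabla\cdot\vec{J} + \frac{\partial\rho}{\partial t} = 0 \qquad (3.63)$$
can now be written in a Lorentz invariant form
$$\partial^\mu J_\mu = 0 \qquad (3.64)$$
where
$$\partial^\mu = \left(\frac{\partial}{\partial x_0}, -\vec\nabla\right) = \left(\frac{1}{c}\frac{\partial}{\partial t}, -\vec\nabla\right) \qquad (3.65)$$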
Note the minus sign in the definition of the relativistic derivative four-vector $\partial^\mu$. It looks a bit odd but is needed to get the signs correct here. In fact it is the only prescription compatible with the usual definition of $x^\mu$ as we show in the next section.
$$x^\mu = (ct, \vec{x}) \qquad (3.67)$$
For example under a Lorentz boost to a frame moving with speed v in the positive x direction
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu \qquad (3.68)$$
ie
$$(ct') = \gamma(ct) - \gamma\frac{v}{c}x, \qquad x' = \gamma x - \gamma\frac{v}{c}(ct) \qquad (3.69)$$
or inverting the relations
$$(ct) = \gamma(ct') + \gamma\frac{v}{c}x', \qquad x = \gamma x' + \gamma\frac{v}{c}(ct') \qquad (3.70)$$
Similarly the definition in (3.66) would imply
$$\partial'^\mu = \Lambda^\mu{}_\nu\partial^\nu \qquad (3.71)$$
ie
$$\frac{1}{c}\frac{\partial}{\partial t'} = \gamma\frac{1}{c}\frac{\partial}{\partial t} + \gamma\frac{v}{c}\frac{\partial}{\partial x}, \qquad -\frac{\partial}{\partial x'} = -\gamma\frac{\partial}{\partial x} - \gamma\frac{v}{c}\frac{1}{c}\frac{\partial}{\partial t} \qquad (3.72)$$
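Note the signs in the transformations.
To show this is consistent let’s work it out from first principles
$$\frac{\partial}{\partial t'} = \frac{\partial x}{\partial t'}\frac{\partial}{\partial x} + \frac{\partial t}{\partial t'}\frac{\partial}{\partial t} \qquad (3.73)$$
$$\frac{\partial}{\partial x'} = \frac{\partial x}{\partial x'}\frac{\partial}{\partial x} + \frac{\partial t}{\partial x'}\frac{\partial}{\partial t} \qquad (3.74)$$
from the transformations in (3.70) above
$$\frac{\partial x}{\partial t'} = v\gamma, \qquad \frac{\partial t}{\partial t'} = \gamma, \qquad \frac{\partial t}{\partial x'} = \frac{v}{c^2}\gamma, \qquad \frac{\partial x}{\partial x'} = \gamma \qquad (3.75)$$
Substituting these in (3.73) and (3.74) we find (3.72) - this shows that there is not an inconsistency (and in fact that the minus sign in (3.66) is required).
An alternative quicker statement of this is that one would like
$$\frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu \qquad \text{or} \qquad \partial_\mu x^\mu = 4 \qquad (3.76)$$
and again the minus sign in $\partial^\mu$ is required.
$$\Box = \partial^\mu\partial_\mu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \qquad (3.77)$$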
$$A^\mu = \left(\frac{\phi}{c}, \vec{A}\right) \qquad (3.78)$$
The Maxwell equations are then
$$\Box A^\mu = \frac{J^\mu}{\varepsilon_0c^2} \qquad (3.79)$$
where
$$\Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \qquad (3.80)$$
The µ = 0 equation is the φ equation (3.53) and the µ = 1, 2, 3 equations give the components of the equation (3.54) for $\vec{A}$.
The Maxwell equations in Lorentz gauge also required the gauge condition (3.52) which becomes
$$\partial_\mu A^\mu = 0 \qquad (3.81)$$
Remember being able to write these equations in four-vector notation is a huge step in
itself. We now know that electromagnetism is relativistically invariant.
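$$\phi' = \frac{\gamma q}{4\pi\varepsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{1/2}} \qquad (3.86)$$
$$A'^x = -\gamma\frac{v}{c}A^0 = -\frac{\gamma v}{c^2}\frac{q}{4\pi\varepsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{1/2}} \qquad (3.87)$$
and $A'^y = A'^z = 0$.
$$\vec{E}' = -\vec\nabla'\phi' - \frac{\partial\vec{A}'}{\partial t'} \qquad (3.88)$$
which works through to
$$\begin{aligned} E'_x &= \frac{q\gamma(x' + vt')}{4\pi\varepsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{3/2}} \\ E'_y &= \frac{q\gamma y'}{4\pi\varepsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{3/2}} \\ E'_z &= \frac{q\gamma z'}{4\pi\varepsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{3/2}} \end{aligned} \qquad (3.89)$$
These results are particularly interesting when v ≃ c. Look first on the x-axis
$$E'_x \simeq \frac{q}{4\pi\varepsilon_0\gamma^2(x' + vt')^2} \qquad (3.90)$$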
since γ is large this component of the field is reduced relative to that of the stationary charge. On the other hand if we look at the field perpendicular to the motion (ie at x′ = −vt′) $E'_y$, $E'_z$ are both enlarged by a factor of γ. Thus the field of a relativistic moving charge is essentially confined to a disc.
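The squashing of the field can be seen by evaluating (3.89) at the same distance from the charge along and across the direction of motion. The sketch below does this at t′ = 0; the function names and the values of q, d and β = v/c are illustrative assumptions.

```python
import math

EPSILON_0 = 8.8541878128e-12   # vacuum permittivity in F/m
C = 299_792_458.0              # speed of light in m/s

def moving_charge_field(q, beta, x, y, z, t=0.0):
    """Components (E'_x, E'_y, E'_z) from (3.89) at position (x', y', z') and time t'."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    dx = x + beta * C * t
    denom = 4.0 * math.pi * EPSILON_0 * (gamma**2 * dx**2 + y**2 + z**2) ** 1.5
    return (q * gamma * dx / denom, q * gamma * y / denom, q * gamma * z / denom)

def coulomb_field(q, d):
    """Field strength of a stationary charge at distance d."""
    return q / (4.0 * math.pi * EPSILON_0 * d * d)

q, d, beta = 1.6e-19, 1.0e-3, 0.6   # illustrative charge (C), distance (m) and speed v/c
gamma = 1.0 / math.sqrt(1.0 - beta * beta)
along = moving_charge_field(q, beta, d, 0.0, 0.0)[0] / coulomb_field(q, d)
across = moving_charge_field(q, beta, 0.0, d, 0.0)[1] / coulomb_field(q, d)
print(along, 1.0 / gamma**2)
print(across, gamma)
```

Relative to a stationary charge the field is reduced by 1/γ² along the motion and enhanced by γ perpendicular to it.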
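$$\vec{E} = -\vec\nabla\phi - \frac{\partial\vec{A}}{\partial t} \qquad (3.91)$$
so a component is given in terms of $A^\mu$, $\partial^\nu$ by
$$\frac{E^i}{c} = \partial^iA^0 - \partial^0A^i \qquad (3.92)$$
Similarly
$$\vec{B} = \vec\nabla\times\vec{A} \qquad (3.93)$$
so, up to signs, we have the form
$$B^i = \partial^jA^k - \partial^kA^j \qquad (3.94)$$
Thus we conclude that the $\vec{E}$ and $\vec{B}$ fields are described by the EM field strength tensor
$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu \qquad (3.95)$$
$$\partial_\mu F^{\mu\nu} = \mu_0J^\nu \qquad (3.97)$$
and
$$\partial^\lambda F^{\mu\nu} + \partial^\mu F^{\nu\lambda} + \partial^\nu F^{\lambda\mu} = 0 \qquad (3.98)$$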
Exercise 3.9: Explicitly extract the differential form of Maxwell’s equations from (3.97),(3.98).
Exercise 3.11: There is a four component tensor $\varepsilon^{\mu\nu\rho\sigma}$ which is zero if any two indices take the same value. $\varepsilon^{1234} = 1$. Other non-zero components are obtained by interchanging indices of $\varepsilon^{1234}$ - interchanging any two indices changes the value by a minus sign. Thus $\varepsilon^{3214} = -1$ whilst $\varepsilon^{2314} = 1$. Show explicitly that $\varepsilon^{1234}$ and $\varepsilon^{1134}$ are left invariant by a Lorentz boost.
Show that $\varepsilon^{\mu\nu\rho\sigma}F_{\rho\sigma}$ takes the same form as $F^{\mu\nu}$ but with the electric and magnetic field components interchanged.
Hence evaluate in terms of $\vec{E}$ and $\vec{B}$ fields the Lorentz invariant quantity $\varepsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}$.
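$$= \Lambda^1{}_\alpha\Lambda^0{}_\beta F^{\alpha\beta}$$
$$\begin{aligned} \frac{E'^2}{c} &= \gamma\left(\frac{E^2}{c} + \frac{v}{c}B^1\right) \\ \frac{E'^3}{c} &= \frac{E^3}{c} \\ B'^1 &= \gamma\left(B^1 + \frac{v}{c}\frac{E^2}{c}\right) \\ B'^2 &= \gamma\left(B^2 - \frac{v}{c}\frac{E^1}{c}\right) \\ B'^3 &= B^3 \end{aligned} \qquad (3.101)$$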
f^\mu = q u_\nu F^{\mu\nu}    (3.104)
Now we can ask what the non-relativistic limit of the time-like component of the force is.
f^0 = q(u^0 F^{00} - u^1 F^{10} - u^2 F^{20} - u^3 F^{30})
    = -q\gamma \, \vec{v}\cdot\frac{\vec{E}}{c}    (3.105)
    = -\frac{q}{c}\gamma \, \vec{v}\cdot(\vec{E} + \vec{v}\times\vec{B})
Taking v ≪ c we obtain, up to the overall sign convention and a factor of 1/c, q v.E, which is just
the work done per second. This indeed should be the rate of change of energy and it makes sense
to equate it to dp^0/dτ in the relativistic generalization of Newton’s law.
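\frac{d\vec{p}}{dt} = q(\vec{E} + \vec{v}\times\vec{B})    (3.106)
The action that reproduces this equation is
S = \int L \, dt, \qquad L = \frac{1}{2} m |\dot{\vec{x}}|^2 + q(\dot{\vec{x}}\cdot\vec{A}) - q\phi    (3.107)
The Euler Lagrange equation is
\frac{d}{dt}\frac{\partial L}{\partial \dot{\vec{x}}} - \frac{\partial L}{\partial \vec{x}} = 0    (3.108)
or
\frac{d}{dt}(m\dot{\vec{x}} + q\vec{A}) - \vec{\nabla}(q\dot{\vec{x}}\cdot\vec{A} - q\phi) = 0    (3.109)
To see this is the equation we want we must first be careful about the time dependence
of A. Of course it can explicitly depend on time, but even if it’s constant the particle, as it
moves, will see a time variation of the field. This is accounted for using the chain rule
\frac{d}{dt} = \frac{\partial}{\partial t} + \frac{dx}{dt}\frac{\partial}{\partial x} + \frac{dy}{dt}\frac{\partial}{\partial y} + \frac{dz}{dt}\frac{\partial}{\partial z} = \frac{\partial}{\partial t} + \dot{\vec{x}}\cdot\vec{\nabla}    (3.110)
So our equation of motion is
\frac{d\vec{p}}{dt} + q\frac{\partial \vec{A}}{\partial t} + q(\dot{\vec{x}}\cdot\vec{\nabla})\vec{A} - q\vec{\nabla}(\dot{\vec{x}}\cdot\vec{A}) + q\vec{\nabla}\phi = 0    (3.111)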
Next we use the identity
\vec{p}_{gen} = \frac{\partial L}{\partial \dot{\vec{x}}} = m\dot{\vec{x}} + e\vec{A}    (3.114)
and for the Hamiltonian
H = \vec{p}_{gen}\cdot\dot{\vec{x}} - L = \frac{1}{2} m |\dot{\vec{x}}|^2 + e\phi    (3.115)
These expressions combine to the generalized four-vector momentum
p^\mu_{gen} = m u^\mu + e A^\mu    (3.116)
Replacing momenta in a problem by this generalized four-momentum is called “minimal
substitution”.
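To get a feel for the force law (3.106), the following sketch integrates it numerically for a charge in a uniform magnetic field. It is an illustration under stated assumptions rather than part of the derivation: the motion is non-relativistic with p = m v, the units and field values are arbitrary, and the helper names (`cross`, `acceleration`, `rk4_step`, `run_orbit`) are invented here. The charge should move on a circle of radius mv/(qB), return to its starting point after one period 2πm/(qB), and keep its speed.

```python
import math

def cross(a, b):
    """Vector product a x b of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def acceleration(q, m, v, E, B):
    """dv/dt from (3.106) with p = m v (non-relativistic assumption)."""
    v_cross_b = cross(v, B)
    return tuple(q * (E[i] + v_cross_b[i]) / m for i in range(3))

def rk4_step(q, m, x, v, E, B, dt):
    """One fourth-order Runge-Kutta step for uniform fields E and B."""
    def shifted(u, k, s):
        return tuple(u[i] + s * k[i] for i in range(3))
    a1 = acceleration(q, m, v, E, B)
    v2 = shifted(v, a1, dt / 2)
    a2 = acceleration(q, m, v2, E, B)
    v3 = shifted(v, a2, dt / 2)
    a3 = acceleration(q, m, v3, E, B)
    v4 = shifted(v, a3, dt)
    a4 = acceleration(q, m, v4, E, B)
    x_new = tuple(x[i] + dt * (v[i] + 2 * v2[i] + 2 * v3[i] + v4[i]) / 6 for i in range(3))
    v_new = tuple(v[i] + dt * (a1[i] + 2 * a2[i] + 2 * a3[i] + a4[i]) / 6 for i in range(3))
    return x_new, v_new

def run_orbit(steps=2000):
    """Follow one cyclotron period; return final position, velocity, farthest distance."""
    q, m = 1.0, 1.0
    E, B = (0.0, 0.0, 0.0), (0.0, 0.0, 2.0)
    x, v = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    period = 2.0 * math.pi * m / (q * B[2])
    dt = period / steps
    farthest = 0.0
    for _ in range(steps):
        x, v = rk4_step(q, m, x, v, E, B, dt)
        farthest = max(farthest, math.sqrt(sum(c * c for c in x)))
    return x, v, farthest

final_x, final_v, diameter = run_orbit()
print(final_x, final_v, diameter)
```

With these values the expected radius is mv/(qB) = 0.5, so the farthest distance from the start should be close to the diameter, 1.0.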
to a form that is locally true. We do this by calculating the integral for an infinitesimal cubic
volume
Figure 3.10: An infinitesimal cube, with sides dx, dy and dz along the X, Y and Z axes.
\vec{F} = F \hat{z}    (3.118)
(i.e. the field F points in the z direction.)
\int_S \vec{F}\cdot d\vec{A} = \frac{\partial F}{\partial z} \, \delta x \, \delta y \, \delta z    (3.119)
This result generalizes, when F has x and y components too, to:
\lim_{\delta V \to 0} \frac{\int_S \vec{F}\cdot d\vec{A}}{\delta V} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}    (3.120)
where δV is the volume of the cube.
Alternatively we may write this as:
\int_S \vec{F}\cdot d\vec{A} = \vec{\nabla}\cdot\vec{F} \, \delta V    (3.121)
where
\vec{\nabla} = \frac{\partial}{\partial x}\hat{x} + \frac{\partial}{\partial y}\hat{y} + \frac{\partial}{\partial z}\hat{z}    (3.122)
It is easy to obtain the equivalent expression for an arbitrary volume - we just build it up
out of infinitesimal cubes: e.g. if we put two together
\int_{\text{two cubes}} \vec{F}\cdot d\vec{A} = \int_{\text{cube one}} \vec{F}\cdot d\vec{A} + \int_{\text{cube two}} \vec{F}\cdot d\vec{A}    (3.123)
since the side shared by the two cubes has an area vector with opposite sign in the case of
the two integrals - the side cancels! We can therefore build any shape in this way and the
surface integral is just the sum over the surface integrals of the component cubes so we arrive
at Gauss’ Law
\int_S \vec{F}\cdot d\vec{A} = \int \vec{\nabla}\cdot\vec{F} \, dV    (3.124)
where dV denotes an integral over the whole volume.
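Gauss’ Law (3.124) is easy to test numerically on a single cube. The sketch below is an illustration with an arbitrarily chosen field F = (xy, yz, zx), whose divergence is x + y + z; the function names and the midpoint-rule grids are assumptions made for this example. Both sides should come out as 3/2 for the unit cube.

```python
def field(x, y, z):
    """Illustrative field F = (x*y, y*z, z*x); its divergence is x + y + z."""
    return (x * y, y * z, z * x)

def flux_out_of_cube(F, side, n):
    """Midpoint-rule surface integral of F.dA over the cube [0, side]^3."""
    h = side / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            a, b = (i + 0.5) * h, (j + 0.5) * h
            total += (F(side, a, b)[0] - F(0.0, a, b)[0]) * h * h  # faces x = side and x = 0
            total += (F(a, side, b)[1] - F(a, 0.0, b)[1]) * h * h  # faces y = side and y = 0
            total += (F(a, b, side)[2] - F(a, b, 0.0)[2]) * h * h  # faces z = side and z = 0
    return total

def divergence(F, x, y, z, eps=1e-5):
    """Central-difference estimate of div F at the point (x, y, z)."""
    return ((F(x + eps, y, z)[0] - F(x - eps, y, z)[0])
            + (F(x, y + eps, z)[1] - F(x, y - eps, z)[1])
            + (F(x, y, z + eps)[2] - F(x, y, z - eps)[2])) / (2.0 * eps)

def volume_integral_of_divergence(F, side, n):
    """Midpoint-rule volume integral of div F over the same cube."""
    h = side / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += divergence(F, (i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h) * h**3
    return total

print(flux_out_of_cube(field, 1.0, 10))
print(volume_integral_of_divergence(field, 1.0, 10))
```

The volume integral is literally a sum over small cubes, which mirrors the way the theorem is built up in the text.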
to a form that is locally true. We do this by calculating the integral for an infinitesimal
rectangular loop
\vec{F} = F_x \hat{x} + F_y \hat{y} + F_z \hat{z}    (3.126)
The line integral gets contributions from the top and bottom of the form “F_x dx” and from the
sides of the form “F_z dz”. We must take into account the change in these components across
the box though. Clockwise round the box we get contributions:
\int_s \vec{F}\cdot d\vec{l} = \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) \delta x \, \delta z    (3.128)
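or
\int_s \vec{F}\cdot d\vec{l} = c_y \, dA    (3.129)
c_x = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)    (3.130)
c_z = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)    (3.131)
\vec{\nabla}\times\vec{F} = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix}    (3.132)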
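The small-loop result (3.128) can be checked directly. The sketch below is an illustration: the field F = (z^2, 0, xz) is an arbitrary choice with (∇ × F)_y = z, the loop is a square in the x-z plane traversed in the sense whose area vector points along +y, and all function names are invented for the example. The circulation divided by the loop area should match the y component of the curl at the centre of the loop.

```python
def field(x, y, z):
    """Illustrative field F = (z**2, 0, x*z), for which (curl F)_y = 2z - z = z."""
    return (z * z, 0.0, x * z)

def circulation_xz(F, x0, y0, z0, d, n=200):
    """Line integral of F.dl round a square of side d in the x-z plane.

    The corner is (x0, y0, z0). The loop runs +z, then +x, then -z, then -x,
    which is the sense whose area vector points along +y.
    """
    h = d / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += F(x0, y0, z0 + s)[2] * h        # up the side at x = x0
        total += F(x0 + s, y0, z0 + d)[0] * h    # along the top at z = z0 + d
        total -= F(x0 + d, y0, z0 + s)[2] * h    # down the side at x = x0 + d
        total -= F(x0 + s, y0, z0)[0] * h        # back along the bottom at z = z0
    return total

def curl_y(F, x, y, z, eps=1e-5):
    """Central-difference estimate of dFx/dz - dFz/dx."""
    dFx_dz = (F(x, y, z + eps)[0] - F(x, y, z - eps)[0]) / (2.0 * eps)
    dFz_dx = (F(x + eps, y, z)[2] - F(x - eps, y, z)[2]) / (2.0 * eps)
    return dFx_dz - dFz_dx

side = 0.01
ratio = circulation_xz(field, 1.0, 0.0, 2.0, side) / side**2
print(ratio, curl_y(field, 1.0 + side / 2, 0.0, 2.0 + side / 2))
```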
The calculation above then generalizes, for an area placed at random relative to the axes,
to
\int_s \vec{F}\cdot d\vec{l} = (\vec{\nabla}\times\vec{F})\cdot d\vec{A}    (3.133)
We can again make larger areas by placing infinitesimal squares next to each other - the
common sides cancel from the sum.
Thus
\int_{\text{two sq}} \vec{F}\cdot d\vec{l} = \int_{\text{sq one}} \vec{F}\cdot d\vec{l} + \int_{\text{sq two}} \vec{F}\cdot d\vec{l}    (3.134)
\int_s \vec{F}\cdot d\vec{l} = \int_S (\vec{\nabla}\times\vec{F})\cdot d\vec{A}    (3.135)
3.7 Appendix - Vector Identities
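Identity: \vec{\nabla}\cdot(\vec{\nabla}\times\vec{F}) = 0
Proof:
\vec{\nabla}\times\vec{F} = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix}
  = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\hat{x} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\hat{y} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\hat{z}
\vec{\nabla}\cdot(\vec{\nabla}\times\vec{F}) = \frac{\partial^2 F_z}{\partial x \partial y} - \frac{\partial^2 F_y}{\partial x \partial z} + \frac{\partial^2 F_x}{\partial y \partial z} - \frac{\partial^2 F_z}{\partial y \partial x} + \frac{\partial^2 F_y}{\partial z \partial x} - \frac{\partial^2 F_x}{\partial z \partial y}
  = 0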
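Identity: \vec{\nabla}\times(\vec{\nabla}\phi) = 0
Proof:
\vec{\nabla}\times(\vec{\nabla}\phi) = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ \frac{\partial\phi}{\partial x} & \frac{\partial\phi}{\partial y} & \frac{\partial\phi}{\partial z} \end{vmatrix}
  = \left(\frac{\partial^2\phi}{\partial y \partial z} - \frac{\partial^2\phi}{\partial z \partial y}\right)\hat{x} + \left(\frac{\partial^2\phi}{\partial z \partial x} - \frac{\partial^2\phi}{\partial x \partial z}\right)\hat{y} + \left(\frac{\partial^2\phi}{\partial x \partial y} - \frac{\partial^2\phi}{\partial y \partial x}\right)\hat{z}
  = 0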
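Both identities above rest on the symmetry of mixed partial derivatives, and this can be seen numerically as well. The sketch below is an illustration using central differences; the test functions, the step size `H` and the helper names are arbitrary choices made for this example. Both printed results should be zero up to rounding error.

```python
import math

H = 1e-3  # finite-difference step (illustrative choice)

def partial(f, p, axis):
    """Central-difference derivative of the scalar function f at point p along one axis."""
    up, down = list(p), list(p)
    up[axis] += H
    down[axis] -= H
    return (f(tuple(up)) - f(tuple(down))) / (2.0 * H)

def grad(phi):
    """Return a function giving the gradient of the scalar function phi."""
    return lambda p: tuple(partial(phi, p, i) for i in range(3))

def curl(F):
    """Return a function giving the curl of the vector function F."""
    def component(p, i):
        j, k = (i + 1) % 3, (i + 2) % 3
        return partial(lambda q: F(q)[k], p, j) - partial(lambda q: F(q)[j], p, k)
    return lambda p: tuple(component(p, i) for i in range(3))

def div(F):
    """Return a function giving the divergence of the vector function F."""
    return lambda p: sum(partial(lambda q, i=i: F(q)[i], p, i) for i in range(3))

def phi(p):
    x, y, z = p
    return math.sin(x) * math.exp(y) * z * z

def F(p):
    x, y, z = p
    return (math.sin(y * z), x * x * math.cos(z), math.exp(x) * y)

P = (0.3, -0.7, 1.1)
print(div(curl(F))(P))
print(curl(grad(phi))(P))
```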
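Identity: \vec{\nabla}\times(\vec{\nabla}\times\vec{F}) = \vec{\nabla}(\vec{\nabla}\cdot\vec{F}) - \nabla^2\vec{F}
Proof:
\vec{\nabla}\times\vec{F} = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix}
  = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\hat{x} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\hat{y} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\hat{z}
\vec{\nabla}\times(\vec{\nabla}\times\vec{F}) = \left(\frac{\partial^2 F_y}{\partial y \partial x} - \frac{\partial^2 F_x}{\partial y^2} - \frac{\partial^2 F_x}{\partial z^2} + \frac{\partial^2 F_z}{\partial z \partial x}\right)\hat{x}
  - \left(\frac{\partial^2 F_y}{\partial x^2} - \frac{\partial^2 F_x}{\partial x \partial y} - \frac{\partial^2 F_z}{\partial z \partial y} + \frac{\partial^2 F_y}{\partial z^2}\right)\hat{y}
  + \left(\frac{\partial^2 F_x}{\partial x \partial z} - \frac{\partial^2 F_z}{\partial x^2} - \frac{\partial^2 F_z}{\partial y^2} + \frac{\partial^2 F_y}{\partial y \partial z}\right)\hat{z}
  = \left[\frac{\partial}{\partial x}\left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right) - \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)F_x\right]\hat{x}
  + \left[\frac{\partial}{\partial y}\left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right) - \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)F_y\right]\hat{y}
  + \left[\frac{\partial}{\partial z}\left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right) - \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)F_z\right]\hat{z}
  = \vec{\nabla}(\vec{\nabla}\cdot\vec{F}) - \nabla^2\vec{F}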
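Proof:
\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}
  = \frac{\partial}{\partial x}\left(\frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial x}\frac{\partial}{\partial\theta}\right) + \frac{\partial}{\partial y}\left(\frac{\partial r}{\partial y}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial y}\frac{\partial}{\partial\theta}\right) + \frac{\partial^2}{\partial z^2}
  = \left(\frac{\partial r}{\partial x}\right)^2\frac{\partial^2}{\partial r^2} + \frac{\partial^2 r}{\partial x^2}\frac{\partial}{\partial r} + \left(\frac{\partial\theta}{\partial x}\right)^2\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2\theta}{\partial x^2}\frac{\partial}{\partial\theta}
  + \left(\frac{\partial r}{\partial y}\right)^2\frac{\partial^2}{\partial r^2} + \frac{\partial^2 r}{\partial y^2}\frac{\partial}{\partial r} + \left(\frac{\partial\theta}{\partial y}\right)^2\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2\theta}{\partial y^2}\frac{\partial}{\partial\theta} + \frac{\partial^2}{\partial z^2}
(the mixed ∂²/∂r∂θ terms cancel between the x and y contributions and are not written out.)
\frac{\partial^2 r}{\partial x^2} = \frac{1}{(x^2+y^2)^{1/2}} - \frac{x^2}{(x^2+y^2)^{3/2}} = \frac{1}{r}(1 - \sin^2\theta) = \frac{\cos^2\theta}{r}
\frac{\partial r}{\partial y} = \frac{y}{(x^2+y^2)^{1/2}} = \cos\theta
\frac{\partial^2 r}{\partial y^2} = \frac{1}{(x^2+y^2)^{1/2}} - \frac{y^2}{(x^2+y^2)^{3/2}} = \frac{1}{r}(1 - \cos^2\theta) = \frac{\sin^2\theta}{r}
\frac{\partial\theta}{\partial x} = \frac{\cos^2\theta}{y} = \frac{\cos\theta}{r}
\frac{\partial\theta}{\partial y} = -\cos^2\theta \, \frac{x}{y^2} = -\frac{\sin\theta}{r}
\frac{\partial^2\theta}{\partial y^2} = 2\cos\theta\sin\theta \, \frac{x}{y^2}\frac{\partial\theta}{\partial y} + 2\cos^2\theta \, \frac{x}{y^3} = -\frac{2}{r^2}\frac{\sin\theta}{\cos\theta}(\sin^2\theta - 1) = \frac{2\sin\theta\cos\theta}{r^2}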
Quantum Mechanics
4.1. NON-RELATIVISTIC QUANTUM MECHANICS 71
Figure 4.1: u∗(x)u(x)dx - the area under the curve shown - gives the probability to find the
particle in that region of x.
E = \frac{p^2}{2m} + V    (4.6)
which, using the operators, we can rewrite as a wave equation
\hat{H}\psi \equiv i\hbar\frac{\partial}{\partial t}\psi = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\psi + V\psi    (4.7)
where Ĥ is the Hamiltonian operator. This is the time dependent Schroedinger equation
which is central to Quantum Mechanics.
-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}u(x) + V(x)u(x) = E u(x)    (4.9)
4.1.3 Interpretation
The modulus squared of the wave function, ψ∗(x,t)ψ(x,t) (which in the time independent case is
just u∗(x)u(x)), is associated with the probability of finding a particle at x. Remembering that
x is continuous the precise statement is that ψ∗(x,t)ψ(x,t)dx is the probability of finding the
particle between x and x + dx.
Figure 4.2: The change in a conserved quantity, q, in a volume matches a current leaving
the volume.
Graphically this is shown in Fig 4.1 which shows that the probability of finding the particle
in the dx spatial slice is just the area under the curve u∗ u in that slice.
Since the particle must be somewhere with probability one we must have
\int_{-\infty}^{\infty} u^*(x) u(x) \, dx = 1    (4.11)
Note that for a free particle wave function the normalization of the wavefunction is
interpreted as the flux of particles per unit volume or within a finite box.
Formally we find observable properties of the particles using the operators
\langle x \rangle = \int_{-\infty}^{\infty} u^*(x) \, \hat{x} \, u(x) \, dx = \int_{-\infty}^{\infty} u^*(x) \, x \, u(x) \, dx    (4.12)
\langle p \rangle = \int_{-\infty}^{\infty} u^*(x) \, \hat{p} \, u(x) \, dx = \int_{-\infty}^{\infty} u^*(x) \left(-i\hbar\frac{\partial}{\partial x}\right) u(x) \, dx    (4.13)
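\int_S \vec{J}\cdot d\vec{A} = -\int \frac{\partial\rho}{\partial t} \, dV    (4.14)
Using Gauss’ theorem (\int \vec{A}\cdot d\vec{S} = \int \vec{\nabla}\cdot\vec{A} \, dV) we have
\frac{\partial\rho}{\partial t} + \vec{\nabla}\cdot\vec{J} = 0    (4.15)
or in one dimension
\frac{\partial\rho}{\partial t} + \frac{\partial J_x}{\partial x} = 0    (4.16)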
Now we can show using the Schroedinger equation that ρ = ψ ∗ ψ satisfies such a relation.
We add two copies of the Schroedinger equation as follows
\phi(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \psi(x) e^{-ipx/\hbar} \, dx    (4.24)
or inversely
\psi(x) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \phi(p) e^{ipx/\hbar} \, dp    (4.25)
We can demonstrate that the Fourier Transform indeed has the correct properties by
checking the consistency of the three operator equations above. Firstly consider
\int \phi^*(p)\phi(p) \, dp = \frac{1}{2\pi\hbar} \int dp \int dx' \, e^{ipx'/\hbar} \psi^*(x') \int dx'' \, e^{-ipx''/\hbar} \psi(x'')
  = \frac{1}{2\pi\hbar} \int dx' \int dx'' \, \psi^*(x')\psi(x'') \int dp \, e^{-ip(x''-x')/\hbar}    (4.26)
\delta(x - x_0) = \frac{1}{2\pi} \int e^{-ik(x-x_0)} \, dk    (4.27)
\int \phi^*(p)\phi(p) \, dp = \int dx' \int dx'' \, \delta(x'' - x') \psi^*(x')\psi(x'')
  = \int dx' \, \psi^*(x')\psi(x')    (4.28)
  = 1
The equations are consistent.
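Secondly we can check the relation for the expectation value of the particle’s position
\int \phi^*(p) \left(i\hbar\frac{\partial}{\partial p}\right) \phi(p) \, dp
  = \frac{1}{2\pi\hbar} \int dp \int dx' \, e^{ipx'/\hbar} \psi^*(x') \, i\hbar \int dx'' \left(\frac{-ix''}{\hbar}\right) e^{-ipx''/\hbar} \psi(x'')
  = \frac{1}{2\pi\hbar} \int dx' \int dx'' \, \psi^*(x') \, x'' \, \psi(x'') \int dp \, e^{-ip(x''-x')/\hbar}
  = \int dx' \int dx'' \, \delta(x'' - x') \psi^*(x') \, x'' \, \psi(x'')    (4.29)
  = \int dx' \, \psi^*(x') \, x' \, \psi(x')
  = \langle x \rangle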
The differential has been inserted ad hoc simply to bring down a factor of p. Now we
integrate by parts, throwing away surface terms at infinity
\int \phi^*(p) \, p \, \phi(p) \, dp = \frac{1}{2\pi\hbar} \int dp \int dx' \, e^{ipx'/\hbar} \psi^*(x') \int dx'' \, e^{-ipx''/\hbar} \left(-i\hbar\frac{\partial}{\partial x''}\right) \psi(x'')
  = \int dx' \int dx'' \, \delta(x'' - x') \psi^*(x') \left(-i\hbar\frac{\partial}{\partial x''}\right) \psi(x'')
  = \int dx' \, \psi^*(x') \left(-i\hbar\frac{\partial}{\partial x'}\right) \psi(x')    (4.31)
  = \langle p \rangle
Everything is nicely consistent.
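These consistency checks can also be carried out numerically for a concrete wave function. The sketch below is an illustration only: it assumes a Gaussian wave packet with arbitrary width, mean position and mean momentum, sets ħ = 1, and replaces the integrals in (4.24) by simple Riemann sums on finite grids. The normalization should come out as one in both x and p space, and the expectation values should agree whichever representation is used.

```python
import cmath
import math

HBAR = 1.0
SIGMA, X0, P0 = 1.0, 0.5, 2.0   # width, mean position, mean momentum (illustrative)
DX, DP = 0.05, 0.05
XS = [X0 - 10.0 + DX * i for i in range(401)]
PS = [P0 - 5.0 + DP * i for i in range(201)]

def psi(x):
    """Normalised Gaussian wave packet in position space (assumed example)."""
    norm = (2.0 * math.pi * SIGMA**2) ** -0.25
    return norm * cmath.exp(-(x - X0) ** 2 / (4.0 * SIGMA**2) + 1j * P0 * x / HBAR)

PSI = [psi(x) for x in XS]

def phi(p):
    """Momentum-space wave function, a Riemann-sum version of (4.24)."""
    total = sum(a * cmath.exp(-1j * p * x / HBAR) for x, a in zip(XS, PSI))
    return total * DX / math.sqrt(2.0 * math.pi * HBAR)

PHI = [phi(p) for p in PS]

norm_x = sum(abs(a) ** 2 for a in PSI) * DX
norm_p = sum(abs(b) ** 2 for b in PHI) * DP
mean_x = sum(x * abs(a) ** 2 for x, a in zip(XS, PSI)) * DX
mean_p = sum(p * abs(b) ** 2 for p, b in zip(PS, PHI)) * DP
# <x> again, now from the momentum-space operator i*hbar*d/dp (central differences)
mean_x_from_p = sum(
    (PHI[k].conjugate() * 1j * HBAR * (PHI[k + 1] - PHI[k - 1]) / (2.0 * DP)).real
    for k in range(1, len(PHI) - 1)
) * DP

print(norm_x, norm_p, mean_x, mean_x_from_p, mean_p)
```

The finite-difference derivative in p is the crudest part of the sketch, so the value of ⟨x⟩ obtained that way is only accurate to a fraction of a percent.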
The equality follows directly from the theory of Fourier transforms for the idealised wavepack-
ets. The inequality expresses the fact that, in real experiments which measure the position
and momentum of a particle simultaneously, the product of uncertainties in the respective
measurements must always exceed the above bound.
There is also a similar uncertainty relation for energy and time of a quantum state,
For example, for an atomic transition, the shorter the transition time ∆t the greater the width
of the associated spectral line ∆E, and vice versa.
The above relations in Eqs. 4.32, 4.33 are collectively known as the Heisenberg Uncertainty
Principle. They highlight the fact that the quantum world represents a major departure
from classical physics, since, even in the most accurate idealised experiment, two quantities
such as position and momentum cannot ever be known simultaneously to arbitrary precision.
Even great physicists such as Albert Einstein never accepted this, and this led to a series
of high profile debates with Niels Bohr. It is now generally accepted that Bohr was correct
and Einstein was wrong. Quantum Mechanics, though completely counter to our intuition,
has been thoroughly vindicated in all experiments to date involving atoms and subatomic
particles.
[Figure 4.3: the infinite square well, with V = 0 for 0 < x < a and V = ∞ outside.]
ψ = 0, for x ≤ 0, x ≥ a (4.34)
Since the potential is time independent the solution takes the form
-\frac{\hbar^2}{2m}\frac{d^2}{dx^2}u(x) + V(x)u(x) = E u(x)    (4.36)
Of course in the region of interest the potential is just V = 0.
Figure 4.4: The initial conditions for the square well problem considered in section 4.1.8
E_n = \frac{\hbar^2}{2m}\left(\frac{n\pi}{a}\right)^2    (4.39)
Finally to find the constant A we can require ψ(x,t) is correctly normalized
\int_{-\infty}^{\infty} \psi^*\psi \, dx = 1
  = \int_0^a A^2 \sin^2\left(\frac{n\pi x}{a}\right) dx    (4.40)
  = A^2 \frac{a}{2}
The full solution is therefore
\psi_n(x,t) = \sqrt{\frac{2}{a}} \sin\left(\frac{n\pi x}{a}\right) e^{-iE_n t/\hbar}    (4.41)
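As a quick numerical check of (4.39) to (4.41), the sketch below integrates products of the spatial wave functions over the well and evaluates the energy levels. It is an illustration in units with ħ = m = 1 and an arbitrary well width; the function names are invented for the example.

```python
import math

def u(n, x, a):
    """Spatial part of (4.41): sqrt(2/a) sin(n pi x / a) inside the well."""
    return math.sqrt(2.0 / a) * math.sin(n * math.pi * x / a)

def energy(n, a, m=1.0, hbar=1.0):
    """Energy level from (4.39)."""
    return hbar**2 / (2.0 * m) * (n * math.pi / a) ** 2

def overlap(n1, n2, a, steps=2000):
    """Midpoint-rule integral of u_n1(x) u_n2(x) over 0 < x < a."""
    h = a / steps
    return sum(u(n1, (i + 0.5) * h, a) * u(n2, (i + 0.5) * h, a) for i in range(steps)) * h

WIDTH = 2.0
print(overlap(1, 1, WIDTH))   # normalization of the ground state
print(overlap(1, 2, WIDTH))   # overlap of two different states
print(energy(2, WIDTH) / energy(1, WIDTH))
```

The first integral should be one, the second should vanish, and the energies should scale as n squared.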
4.1.8 Completeness
The consideration of how a particular initial condition for the wave function in a square
well evolves with time provides interesting insight into the uniqueness of the solutions we
have found. In particular since the solutions are sine waves of period 2a there is a strong
connection to problems one encounters when studying Fourier analysis such as wave forms
on a string.
For example if we take an initial wave function, at t = 0, of the triangular form shown in
Fig 4.4 then we can write
\psi(x, t=0) = \sum_{n=1}^{\infty} c_n u_n(x)    (4.42)
where the c_n are the Fourier-like coefficients (we’ll explain how to derive them in the next
section) which are given by
c_n = \frac{8k}{n^2\pi^2} \sqrt{\frac{a}{2}} \sin\left(\frac{n\pi}{2}\right)    (4.43)
We now know the time evolution since we know that each individual term evolves with the
phase factor e^{-iE_n t/\hbar} of (4.41).
Resumming the series at time t gives the evolution of the initial condition (to a precision
determined by how many terms you resum).
This is an example of a general rule in QM called completeness: any wave function may
be expanded as a series of the eigenfunction solutions of the Schroedinger equation relevant
to that problem. In other words in any problem we may write
Hun = En un (4.46)
We won’t prove this here but if it weren’t true it would be quite surprising! Imagine
we had found all the solutions of the Schroedinger equation and then wrote down an initial
condition that couldn’t be rewritten in terms of those solutions... we’d have missed the
evolution of that initial condition and hence we can’t have had all the solutions! Completeness
is usually required for a theory to make sense, and it allows us to evolve all initial states with
time.
4.1.9 Orthogonality
It is also important in these initial condition problems that there is a unique way of writing
We can act with H to either the left or right in which case we will find
E_j \int u_i^* u_j \, dx = E_i \int u_i^* u_j \, dx    (4.50)
which can only be true for i ≠ j (with E_i ≠ E_j) if the wave functions are orthogonal and both sides are zero.
When i = j the integral over the wave function squared is just the usual probability of finding
the particle in all space and is set equal to one.
Now we know enough to derive the coefficients in (4.43). Given
and using orthogonality we find only one term of the sum on the right survives and hence
c_m = \int u_m^* \, \psi(x, t=0) \, dx    (4.53)
Using the initial conditions shown in Fig 4.4 and performing the integrals leads to (4.43).
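The projection integral (4.53) can be evaluated numerically and compared with the closed form (4.43). The sketch below is an illustration that rests on one assumption: the triangular initial condition is taken to rise linearly from zero at x = 0 to a height k at x = a/2 and fall back to zero at x = a, which is the shape that reproduces (4.43). The function names and the values of a and k are arbitrary.

```python
import math

def u(n, x, a):
    """Square well eigenfunction sqrt(2/a) sin(n pi x / a)."""
    return math.sqrt(2.0 / a) * math.sin(n * math.pi * x / a)

def triangle(x, a, k):
    """Assumed initial condition: linear rise to height k at x = a/2, then linear fall."""
    return 2.0 * k * x / a if x <= a / 2 else 2.0 * k * (a - x) / a

def c_numeric(n, a, k, steps=4000):
    """Midpoint-rule version of the projection integral (4.53)."""
    h = a / steps
    return sum(u(n, (i + 0.5) * h, a) * triangle((i + 0.5) * h, a, k) for i in range(steps)) * h

def c_formula(n, a, k):
    """Closed form (4.43)."""
    return 8.0 * k / (n**2 * math.pi**2) * math.sqrt(a / 2.0) * math.sin(n * math.pi / 2.0)

def partial_sum(x, a, k, n_max):
    """Truncated version of the expansion (4.42) at t = 0."""
    return sum(c_formula(n, a, k) * u(n, x, a) for n in range(1, n_max + 1))

for n in range(1, 6):
    print(n, c_numeric(n, 1.0, 1.0), c_formula(n, 1.0, 1.0))
print(partial_sum(0.5, 1.0, 1.0, 99))   # should approach the peak height k = 1
```

The even coefficients vanish because the triangle is symmetric about the middle of the well, and the partial sum converges slowly at the peak because of the kink there.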
i\hbar\frac{\partial}{\partial t}\psi = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi    (4.55)
The probability to find a particle in some infinitesimal box of volume δV is
Prob = \psi^*\psi \, \delta V    (4.56)
where for example in spherical coordinates δV = r^2 sin θ dθ dφ dr.
• Many Worlds - all outcomes happen in parallel universes (this doesn’t explain why a
measurement splits the universes though).
None of these are really satisfactory - not least because it is not precisely clear what
constitutes a measurement. Nevertheless QM is the most successful theory physics has and
so is clearly correct. The real impact of these issues is that it is hard to have an intuitive feel
for the subject. In the next chapter we will investigate an alternative formalism for QM in
which the idea of a trajectory for the particle is central, rather than a wave function, and it
allows some classical intuition to be used.
Exercise 1.1:
Make an odd continuation of the solutions to the infinite square well problem and cal-
culate the momentum space wave functions φ (p). What is the physical significance of your
result?
4.2. PATH INTEGRAL APPROACH TO QUANTUM MECHANICS 81
Figure 4.5: The classic double slit experiment showing the wave nature of particles.
To motivate the form of the theory consider the usual double slit type experiment shown
in Fig 4.5. A classical description in which the particle goes through a single slit will clearly
not do. We will adopt a much more radical idea that the particle travels by ALL possible
paths!
The interference pattern suggests that there should be cancelling and reinforcing phases
in the description. We are therefore led to the proposal of the next section.
where S is the classical action of each particular path, and every possible path contributes in
the sum.
The probability for a particle to travel from point A to point B is then given by
Figure 4.6: A collection of paths away from the minimum of the action (∆S ≫ ħ) have rapidly
varying phase in the kernel and cancel.
\lambda \gtrsim r    (4.59)
Of course λ = h/p so it is because h is small in nature that we don’t see quantum effects when
we throw cricket balls through doors (of course there might well be some serious classical
effects, so don’t try this at home!)
From this discussion we can see that if we take
h→0 (4.60)
then all wavelengths become very small and the theory becomes classical at all length scales.
Note that also in this limit the Uncertainty Principle (∆p∆x ≥ h̄) allows both p and x to
be measured together which again corresponds to classical physics.
So what does our prescription give in this classical limit h → 0? In general for a set of
paths close to each other (as shown in Fig 4.6), in this limit, we will find the difference in the
classical action between neighbouring paths
\Delta S \gg \hbar    (4.61)
just because h̄ is so small. This means that these paths have very different phases in the
kernel above. The phase just points out a direction in the complex plane. The sum over these
paths will just average the phase... but if the phases are essentially random as in this case we
will get precisely zero.
The only time this won’t be true is if we find a cluster of paths for which ∆S < h̄. This
will only be true around a minimum of S where there is little change in S. A little cluster of
paths here will all have roughly the same phase and add in such a way as to dominate the
kernel. Thus in the classical limit our prescription does reproduce Hamilton’s Principle.
Incidentally this tells us that in a quantum theory a classical trajectory gets smeared since
it is equally likely to travel on a neighbouring path provided ∆S ≤ h̄.
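The cancellation argument can be illustrated with a toy model. In the sketch below a single number x labels how far a path deviates from the classical one, the action is taken to be S = x^2 in arbitrary units (an assumption made purely for illustration), and ħ is given a small value. Summing the phase exp(iS/ħ) over a band of paths around the minimum gives a much larger result than summing over an equally wide band away from it.

```python
import cmath

def phase_sum(lo, hi, hbar, steps=20000):
    """Midpoint sum of exp(i S / hbar) with the toy action S = x**2, for lo < x < hi."""
    h = (hi - lo) / steps
    return sum(cmath.exp(1j * (lo + (i + 0.5) * h) ** 2 / hbar) for i in range(steps)) * h

HBAR = 0.01
near = phase_sum(-0.5, 0.5, HBAR)   # paths clustered round the minimum of S
far = phase_sum(1.0, 2.0, HBAR)     # equally wide set of paths away from it
print(abs(near), abs(far))
```

Making `HBAR` smaller still suppresses the contribution of the far band further, which is the classical limit described above.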
|ψ (ta)|2 = δ (x − xa ) (4.63)
then we can identify the wave function at a later time tb > ta with the kernel
So the contribution to the kernel from all possible paths from A to B through C is given by
Figure 4.8: Paths a particle might take from the point x at time t to x′ at time t′ divided into
many very short straight segments.
We previously, in (4.64), identified K(B, A) as the wave function at time tb and similarly
we can identify here K(C, A) = ψ (xc ,tc), the wave function at time tc . In both cases the
wavefunctions have evolved from the delta function form at time ta in (4.63) but they can
be arbitrarily complicated depending on the evolution, for example, through some potential.
Thus this expression tells us how one wave function evolves into another
\psi(x_b, t_b) = \text{constant} \int_{-\infty}^{\infty} \psi(x_c, t_c) \, K(B, C) \, dx_c    (4.69)
We need a way to keep track of all possible paths in order to work out the kernel. One
way to do this is to divide time up into infinitesimal time slices and assume that the particle
travels in a straight line at constant speed in any such time slice as shown in Fig 4.8.
Now we can consider the time evolution of the wave function just across one ∆t time
slice. We’ll assume that the particle doesn’t travel too far in any time slice (so x = x′ + ∆x)
and that its velocity is constant along the way
\psi(x', t + \Delta t) = A \int_{-\infty}^{\infty} K(x', t + \Delta t; x, t) \, \psi(x, t) \, dx    (4.72)
We know the kernel here because the paths are always straight lines (it’s just exp(iS_path/ħ))
S_{x \to x'} = \int_t^{t+\Delta t} L(x, \dot{x}) \, dt
  = L\left(\frac{x + x'}{2}, \frac{x' - x}{\Delta t}\right) \Delta t    (4.73)
  = \left[\frac{1}{2} m \left(\frac{x' - x}{\Delta t}\right)^2 - V\left(\frac{x + x'}{2}\right)\right] \Delta t
A = \left(\frac{2\pi i\hbar\Delta t}{m}\right)^{-1/2}    (4.77)
We have derived an expression for the constant in the wave function evolution equation.
Note the form of the exponential is easy to remember because it’s just exp(i∆tKE/h̄) with
KE the classical kinetic energy assuming constant velocity.
Figure 4.9: The real part of the kernel for a free particle plotted against position at some
fixed time (it takes the form cos x^2).
\Delta\text{phase} = 2\pi
  = \frac{m(x + \lambda)^2}{2\hbar t} - \frac{mx^2}{2\hbar t}    (4.86)
  \simeq \frac{mx\lambda}{\hbar t}
where we have expanded in λ/x. We find
\lambda = \frac{2\pi\hbar}{mx/t} = \frac{h}{p}    (4.87)
a familiar result. The interpretation is that the higher momentum (smaller wavelength) com-
ponents of the wavepacket travel further out in a given time.
Similarly we can fix x in K(x,t) and plot the real part against t as shown in Fig 4.10.
Figure 4.10: The real part of the kernel for a free particle plotted against time at some fixed
position (it takes the form cos(1/t)).
We can work out the period of the wave at some t as we did the wavelength above
2\pi = \frac{mx^2}{2\hbar t} - \frac{mx^2}{2\hbar(t + T)}
  = \frac{mx^2}{2\hbar t}\left(1 - (1 + T/t)^{-1}\right)    (4.88)
  \simeq \frac{mx^2}{2\hbar t^2} T
The angular frequency is
\omega = 2\pi/T = \frac{1}{\hbar}\frac{mx^2}{2t^2}    (4.89)
which, up to the factor of ħ, is just the kinetic energy of the particle and hence
E = \hbar\omega    (4.90)
The interpretation is that the higher energy (higher frequency) components of the wavepacket
pass by a fixed point earlier in time.
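The approximations in (4.86) and (4.88) can be tested by solving the phase conditions exactly. The sketch below is an illustration in units with ħ = m = 1 and arbitrary values of x and t; the function names are invented here. It solves for the exact wavelength and period at which the phase mx^2/(2ħt) changes by 2π and compares them with h/p and with the kinetic energy divided by ħ.

```python
import math

def exact_wavelength(m, x, t, hbar):
    """Solve m (x + lam)^2 / (2 hbar t) - m x^2 / (2 hbar t) = 2 pi for lam > 0."""
    return -x + math.sqrt(x * x + 4.0 * math.pi * hbar * t / m)

def exact_period(m, x, t, hbar):
    """Solve m x^2 / (2 hbar t) - m x^2 / (2 hbar (t + T)) = 2 pi for T > 0."""
    inverse = 1.0 / t - 4.0 * math.pi * hbar / (m * x * x)
    return 1.0 / inverse - t

def de_broglie_wavelength(m, x, t, hbar):
    """h / p with p = m x / t, as in (4.87)."""
    return 2.0 * math.pi * hbar / (m * x / t)

def kinetic_frequency(m, x, t, hbar):
    """Kinetic energy m (x/t)^2 / 2 divided by hbar, as in (4.89)."""
    return m * x * x / (2.0 * t * t * hbar)

M, X, T0, HB = 1.0, 100.0, 1.0, 1.0
print(exact_wavelength(M, X, T0, HB), de_broglie_wavelength(M, X, T0, HB))
print(2.0 * math.pi / exact_period(M, X, T0, HB), kinetic_frequency(M, X, T0, HB))
```

The agreement improves as x grows, that is, when the wavelength is much shorter than the distance travelled, which is the regime in which the expansion in λ/x is justified.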
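K = C(t) \, e^{imx^2/2\hbar t}    (4.91)
[Figure 4.11: single-slit geometry, showing the source, the barrier with a slit of width d, the screen, and a point P on the screen at distance L0 and angle θ, with path difference (d/2) sin θ.]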
If we assume the source is at infinity then the distance from the source to any point on the
barrier is the same. We can therefore treat each point on the barrier as an equal emitter of
particles and just sum eiS/h̄ for the paths from the barrier to the screen. We find
K(\text{screen}) = A(t) \int_{\text{barrier}} e^{i m x_{\text{path}}^2 / 2\hbar t} \, f(s) \, ds    (4.92)
Here A(t) is a constant depending only on time, the exponential is the contribution from the
action of each path, f (s) is either 1 or 0 depending upon whether that point on the barrier is
a hole or blocking the particle, and finally ds sums over all points on the barrier. Compare
this to (4.68).
Let’s look at a simple barrier with a single slit opening of width d as shown in Fig 4.11.
We will work in the narrow width approximation where d ≪ L0. The distance from a point
P on the screen to each element of the hole is
L_0 + x\sin\theta, \qquad -\frac{d}{2} < x < \frac{d}{2}    (4.93)
Our expression for the kernel is therefore
K(P, t) = A(t) \int_{-d/2}^{d/2} e^{im(L_0 + x\sin\theta)^2/2\hbar t} \, dx    (4.94)
Since L0 ≫ d then
K(P, t) \simeq A \, e^{imL_0^2/2\hbar t} \int_{-d/2}^{d/2} e^{i 2 m L_0 x \sin\theta/2\hbar t} \, dx
  \simeq A \, e^{imL_0^2/2\hbar t} \left[\frac{\hbar t}{i m L_0 \sin\theta} e^{i m L_0 x \sin\theta/\hbar t}\right]_{-d/2}^{d/2}    (4.95)
  \simeq \frac{-iA(t)\hbar t \, e^{imL_0^2/2\hbar t}}{m L_0 \sin\theta} \, 2i\sin\left(\frac{m L_0 d \sin\theta}{2\hbar t}\right)
Figure 4.12: The probability function for the end point of a particle passing through a single
slit (the curve sin^2 x / x^2, plotted against a variable x proportional to sin θ).
where α and β are just constants. We can plot the rough form of this solution and find the
form in Fig 4.12.
Note that the minima are when
\frac{m L_0 d}{2\hbar t}\sin\theta = n\pi    (4.97)
i.e. when
d\sin\theta = n\pi\frac{2\hbar t}{m L_0} = n\frac{h}{p} = n\lambda    (4.98)
The usual result for destructive interference.
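The single-slit pattern can be reproduced by summing the phases across the slit directly. The sketch below is an illustration: it writes the phase in (4.95) as k x sin θ with k = mL0/(ħt) = 2π/λ, drops the overall prefactor, and uses arbitrary values for the slit width and wavelength; the function names are invented here. The squared sum should follow the sin^2/x^2 form and vanish whenever d sin θ = nλ.

```python
import cmath
import math

def slit_amplitude(sin_theta, d, wavelength, steps=2000):
    """Midpoint sum of exp(i k x sin(theta)) over -d/2 < x < d/2, with k = 2 pi / wavelength."""
    k = 2.0 * math.pi / wavelength
    h = d / steps
    return sum(
        cmath.exp(1j * k * (-d / 2 + (i + 0.5) * h) * sin_theta) for i in range(steps)
    ) * h

def sinc_squared(sin_theta, d, wavelength):
    """Closed form d^2 (sin(beta)/beta)^2 with beta = pi d sin(theta) / wavelength."""
    beta = math.pi * d * sin_theta / wavelength
    return d * d if beta == 0.0 else (d * math.sin(beta) / beta) ** 2

D, LAM = 5.0, 1.0   # slit width and wavelength in the same arbitrary units
for s in (0.0, 0.07, 0.2, 0.3, 0.4):
    print(s, abs(slit_amplitude(s, D, LAM)) ** 2, sinc_squared(s, D, LAM))
```

With these values the minima fall at sin θ = 0.2 and 0.4, where d sin θ is a whole number of wavelengths.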
Let’s try now to get an equivalent statement starting from the time independent Schroedinger
equation
H\phi_n = E_n\phi_n    (4.100)
K(x, t_2; y, t_1) = \sum_{n=1}^{\infty} \phi_n(x)\phi_n^*(y) \, e^{-iE_n(t_2 - t_1)/\hbar}    (4.105)
Exercise 2.1:
Show that for a free particle travelling from x_a at t_a to x_b at t_b the classical action is given
by
S_{classical} = \frac{1}{2} m \frac{(x_b - x_a)^2}{(t_b - t_a)}
Exercise 2.2:
Perform the Gaussian integral
\int_{-\infty}^{\infty} e^{-\alpha x^2 - \beta x} \, dx
Exercise 2.3:
Consider a non-relativistic, free particle of mass m travelling in two dimensions between
two points A and B on the x axis equally spaced about the y axis. Consider paths where the
particle travels in a straight line at constant speed to an arbitrary point on the y axis and then
in a straight line at the same speed to B, taking total time T. Calculate the action for these
paths. Argue that classically the particle will travel in a straight line. Quantum mechanically
the path is smeared. Estimate the width of the path when the particle crosses the y axis.
Exercise 2.4:
A massive, non-relativistic particle emitted by a source at infinity encounters a sheet of
absorbing material with a circular hole of side a in it. Derive an expression for the quantum
probability for finding the particle at a distance d along the axis of the hole on the far side at
a time T .
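I_2 = \int_{-\infty}^{\infty} x^2 e^{-\alpha x^2} \, dx = -\frac{d}{d\alpha} I_0(\alpha) = \frac{1}{2\alpha}\sqrt{\frac{\pi}{\alpha}}    (4.112)
(x_1 - x_0)^2 + (x_2 - x_1)^2 = 2\left(x_1 - \frac{(x_2 + x_0)}{2}\right)^2 + \frac{(x_2 - x_0)^2}{2}    (4.114)
now if we change the integration variable to w = x1 − (x2 + x0)/2 (dw = dx1) we find
J = \int_{-\infty}^{\infty} e^{-2\alpha w^2} e^{-\frac{\alpha}{2}(x_2 - x_0)^2} \, dw    (4.115)
J = \sqrt{\frac{\pi}{2\alpha}} \, e^{-\frac{\alpha}{2}(x_2 - x_0)^2}    (4.116)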
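These Gaussian integrals are simple to confirm numerically. The sketch below is an illustration: it assumes, as (4.114) to (4.116) suggest, that J is the integral over x1 of exp(−α[(x1 − x0)^2 + (x2 − x1)^2]), and it uses arbitrary values for α, x0 and x2 together with a midpoint rule on a wide finite interval.

```python
import math

def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule integral of f over the interval lo < x < hi."""
    h = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * h) for i in range(steps)) * h

ALPHA, X0, X2 = 1.3, -0.4, 0.9   # illustrative values

I2_numeric = integrate(lambda x: x * x * math.exp(-ALPHA * x * x), -12.0, 12.0)
I2_formula = math.sqrt(math.pi / ALPHA) / (2.0 * ALPHA)              # (4.112)

J_numeric = integrate(
    lambda x1: math.exp(-ALPHA * ((x1 - X0) ** 2 + (X2 - x1) ** 2)), -12.0, 12.0
)
J_formula = math.sqrt(math.pi / (2.0 * ALPHA)) * math.exp(-ALPHA * (X2 - X0) ** 2 / 2.0)  # (4.116)

print(I2_numeric, I2_formula)
print(J_numeric, J_formula)
```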
\hat{p}^\mu \to i\partial^\mu, \qquad (\hat{E}, \hat{\vec{p}}) = \left(i\frac{\partial}{\partial t}, -i\vec{\nabla}\right)    (4.118)
Note that the minus sign in the spatial parts of ∂^µ matches and explains the sign in the standard
operator relations (4.4, 4.5).
Substituting these operators generates the Klein Gordon equation
(\Box + m^2) \, \phi(x) = 0    (4.119)
\Box = \partial_\mu\partial^\mu = \partial^2/\partial t^2 - \nabla^2    (4.120)
where N is a normalization constant and if we substitute the solution into the equation we
recover
E = \pm\sqrt{|\vec{p}|^2 + m^2}    (4.122)
4.3. RELATIVISTIC QUANTUM MECHANICS - THE KLEIN GORDON EQUATION95
A second problem with the wave function interpretation arises when trying to find a
probability density. In relativity a density transforms under boosts, since lengths contract,
and forms part of a 4-vector with the current density. Here since φ is Lorentz invariant, |φ |2
does not transform like a density so we will not have a Lorentz covariant continuity equation
\partial_t\rho + \vec{\nabla}\cdot\vec{J} = 0, \qquad \partial_\mu J^\mu = 0    (4.123)
We can derive a candidate for the probability density/current by finding something which
does satisfy such a continuity equation as we did in section 4.1.4 for the Schroedinger equation.
As there, one starts with the Klein-Gordon equation multiplied by φ∗ and subtracts the
complex conjugate of the KG equation multiplied by φ. (4.123) emerges with J^µ = (ρ, \vec{J}) and
\rho \equiv i\left(\phi^*\frac{\partial\phi}{\partial t} - \phi\frac{\partial\phi^*}{\partial t}\right),    (4.124)
\vec{J} \equiv -i\left(\phi^*\vec{\nabla}\phi - \phi\vec{\nabla}\phi^*\right)    (4.125)
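As a worked check of (4.124) and (4.125), we can evaluate them on a plane wave. The form of the wave below is an assumption made for illustration (a solution with normalization constant N, in units with ħ = c = 1), since the explicit solution is not shown in this section.

```latex
\begin{align*}
\phi &= N e^{-i(Et - \vec{p}\cdot\vec{x})}
  \quad\Rightarrow\quad
  \frac{\partial\phi}{\partial t} = -iE\,\phi, \qquad \vec{\nabla}\phi = i\vec{p}\,\phi \\
\rho &= i\left(\phi^*(-iE\,\phi) - \phi\,(iE\,\phi^*)\right) = 2E\,|N|^2 \\
\vec{J} &= -i\left(\phi^*(i\vec{p}\,\phi) - \phi\,(-i\vec{p}\,\phi^*)\right) = 2\vec{p}\,|N|^2
\end{align*}
```

So for this wave J^µ = 2|N|^2 p^µ, which does transform as a four-vector, but ρ takes the sign of E and is therefore negative for the negative root in (4.122).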
make sense of the equation. It is linked to Pauli’s idea that one does not directly measure the
number of particles. You can only detect them via their charges through an interaction. This
means you can’t observe the probability density but only the charge density/current (qJ µ )
and that can be negative!
The Klein Gordon equation has a time reversal symmetry so in addition to states prop-
agating forwards in time that look like e−iEt there are solutions that travel backwards in
time like e+iEt . Normally we would throw away these backwards propagating solutions for
causality’s sake (you don’t want to be able to kill your Grandfather!). However, if E can be
negative these two sets of states become confused. Does e−i(−E)t propagate forwards in time
with negative energy or backwards in time with positive energy?
Feynman and Stueckelberg proposed that it is possible to consistently keep just half of
the solutions to the Klein Gordon equations but not the ones you would immediately guess.
They suggested to keep positive energy states propagating forwards in time, but only neg-
ative energy states that propagate backwards in time! We interpret these states as positive
energy states moving forwards in time (e+i(−E)t ). In the solutions the charge density/current
is opposite sign though. These particles look like negative charge versions of the normal
particle states propagating forwards in time. This is a prediction of anti-particles!
Now we find a theory that is consistent with the requirements of causality and that has
none of the aforementioned problems. In fact, the negative energy states cause us prob-
lems only so long as we think of them as real physical states propagating forwards in time.
Therefore, we should interpret the emission (absorption) of a negative energy particle with
momentum pµ as the absorption (emission) of a positive energy antiparticle with momentum
−pµ .
In order to get more familiar with this picture, consider a process with a π+ and a photon
in the initial state and final state. In Fig 4.13a the π+ starts from the point A and at a later
time t1 emits a photon at the point \vec{x}_1. If the energy of the π+ is still positive, it travels on
forwards in time and eventually will absorb the initial state photon at t2 at the point \vec{x}_2. The
final state is then again a photon and a (positive energy) π+.
There is another process however, with the same initial and final state, shown in Fig 4.13b.
Again, the π+ starts from the point A and at a later time t2 emits a photon at the point \vec{x}_1. But
this time, the energy of the photon emitted is bigger than the energy of the initial π+. Thus,
the energy of the π+ becomes negative and it is forced to travel backwards in time. Then
at an earlier time t1 it absorbs the initial state photon at the point \vec{x}_2, thereby rendering its
energy positive again. From there, it travels forward in time and the final state is the same as
in Fig 4.13a, namely a photon and a (positive energy) π+.
In today’s language, the process in Fig 4.13b would be described as follows: in the initial
state we have a π+ and a photon. At time t1 and at the point \vec{x}_2 the photon creates a π+-π−
pair. Both propagate forwards in time. The π+ ends up in the final state, whereas the π−
is annihilated at (a later) time t2 at the point \vec{x}_1 by the initial state π+, thereby producing
the final state photon. To someone observing in real time, the negative energy state moving
backwards in time looks to all intents and purposes like a negatively charged pion with
positive energy moving forwards in time.
We have discovered anti-matter! The Feynman-Stueckelberg interpretation revives the
Klein Gordon equation as a perfectly sensible theory of spinless particles and their
anti-particles.
Figure 4.13: Pion-photon scatterings in which the intermediate pion has (a) positive energy
and travels forwards in time and (b) has negative energy and travels backwards in time.