Em Notes 20 Special Relativity
April 9, 2016
1 Galilean transformations
1.1 The invariance of Newton’s second law
Newton's second law,
F = ma
is a 3-vector equation and is therefore valid if we make any rotation of our frame of reference. Thus, if Oij
is a rotation matrix and we rotate the force and the acceleration vectors,
\begin{align*}
\tilde{F}_i &= O_{ij} F_j \\
\tilde{a}_i &= O_{ij} a_j
\end{align*}
then we have
F̃ = mã
and Newton’s second law is invariant under rotations. There are other invariances. Any change of the
coordinates x → x̃ that leaves the acceleration unchanged is also an invariance of Newton’s law. Equating
and integrating twice,
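\begin{align*}
\tilde{\mathbf{a}} &= \mathbf{a} \\
\frac{d^2 \tilde{\mathbf{x}}}{dt^2} &= \frac{d^2 \mathbf{x}}{dt^2}
\end{align*}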
gives
x̃ = x + x0 − v0 t
The addition of a constant, x0 , is called a translation and the change of velocity of the frame of reference is
called a boost. Finally, integrating the equivalence dt̃ = dt shows that we may reset the zero of time (a time
translation),
t̃ = t + t0
The complete set of transformations is
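\begin{align*}
x'_i &= \sum_j O_{ij} x_j && \text{Rotations} \\
\mathbf{x}' &= \mathbf{x} + \mathbf{a} && \text{Translations} \\
t' &= t + t_0 && \text{Origin of time} \\
\mathbf{x}' &= \mathbf{x} + \mathbf{v} t && \text{Boosts (change of velocity)}
\end{align*}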
There are three independent parameters describing rotations (for example, specify a direction in space by
giving two angles (θ, ϕ) then specify a third angle, ψ, of rotation around that direction). Translations can
be in the x, y or z directions, giving three more parameters. Three components for the velocity vector and
one more to specify the origin of time gives a total of 10 parameters. These 10 transformations comprise the
Galilean group. Newton’s second law is invariant under the Galilean transformations.
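The invariance of the acceleration can be checked numerically. The following is a minimal sketch, assuming a hypothetical one-dimensional trajectory `x(t)` and illustrative values `X0` and `V0` for the translation and boost; none of these names or numbers come from the notes.

```python
def second_difference(path, t, h=1e-3):
    """Central finite-difference estimate of the acceleration d^2x/dt^2 at time t."""
    return (path(t + h) - 2.0 * path(t) + path(t - h)) / h**2


# Hypothetical 1D trajectory with constant acceleration 3.0 (illustrative values).
def x(t):
    return 1.5 * t**2 + 2.0 * t + 1.0


X0 = 4.0  # translation (assumed value)
V0 = 7.0  # boost velocity (assumed value)


def x_tilde(t):
    """The same trajectory after the transformation x~ = x + x0 - v0 t."""
    return x(t) + X0 - V0 * t


a = second_difference(x, 2.0)
a_tilde = second_difference(x_tilde, 2.0)
print(a, a_tilde)  # the two accelerations agree up to rounding
```

Because the translation and boost terms are at most linear in t, they drop out of the second derivative, which is why F = ma keeps its form.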
Notice that all of the Galilean transformations are linear. This is crucial, because the position vectors x
form a vector space, and only linear transformations preserve the linear combinations we require of vectors.
1.2 Failure of the Galilean group for electrodynamics
The same is not true of electrodynamics. For example, in the absence of sources, we have seen that Maxwell’s
equations lead to the wave equation,
\[ -\frac{1}{c^2} \frac{\partial^2 \psi}{\partial t^2} + \nabla^2 \psi = 0 \]
for each component of the fields and the potentials. But if we perform a boost of the coordinates, this
equation is not invariant:
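With $\tilde{\mathbf{x}} = \mathbf{x} - \mathbf{v}_0 t$ and $\tilde{t} = t$, the chain rule gives $\frac{\partial}{\partial t} = \frac{\partial}{\partial \tilde{t}} - \mathbf{v}_0 \cdot \tilde{\nabla}$ and $\nabla = \tilde{\nabla}$, so
\begin{align*}
0 &= -\frac{1}{c^2} \frac{\partial^2 \psi}{\partial t^2} + \nabla^2 \psi \\
&= -\frac{1}{c^2} \left( \frac{\partial}{\partial \tilde{t}} - \mathbf{v}_0 \cdot \tilde{\nabla} \right) \left( \frac{\partial \psi}{\partial \tilde{t}} - \mathbf{v}_0 \cdot \tilde{\nabla} \psi \right) + \tilde{\nabla}^2 \psi \\
&= -\frac{1}{c^2} \left( \frac{\partial^2 \psi}{\partial \tilde{t}^2} - 2\, \mathbf{v}_0 \cdot \tilde{\nabla} \frac{\partial \psi}{\partial \tilde{t}} + \left( \mathbf{v}_0 \cdot \tilde{\nabla} \right)^2 \psi \right) + \tilde{\nabla}^2 \psi \\
&= -\frac{1}{c^2} \frac{\partial^2 \psi}{\partial \tilde{t}^2} + \tilde{\nabla}^2 \psi + \frac{2}{c^2}\, \mathbf{v}_0 \cdot \tilde{\nabla} \frac{\partial \psi}{\partial \tilde{t}} - \left( \frac{\mathbf{v}_0}{c} \cdot \tilde{\nabla} \right)^2 \psi
\end{align*}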
This means that there is an inherent conflict between the symmetry of Maxwell’s equations and the symmetry
of Newton’s second law. They do not change in a consistent way if we change to a moving frame of reference.
We must make a choice between modifying Maxwell's equations or modifying Newton's law. This is not as
drastic as it sounds, since both of the troublesome terms above are small when $v_0/c \ll 1$ (of order $v_0/c$ and $(v_0/c)^2$ respectively), but one must still
be modified.
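The failure of Galilean invariance can be seen numerically. The sketch below assumes a one-dimensional plane wave, units with c = 1, and illustrative values for `V0` and `K`; the function names are hypothetical. It applies the wave operator by finite differences, first in the original coordinates and then in the Galilean-boosted coordinates.

```python
import math

C, V0, K = 1.0, 0.1, 1.0  # assumed illustrative values (units with c = 1)


def psi(x, t):
    """Plane wave sin(k(x - ct)), a solution of the 1D wave equation."""
    return math.sin(K * (x - C * t))


def psi_boosted(x_tilde, t):
    """The same field written in Galilean-boosted coordinates x~ = x - v0 t."""
    return psi(x_tilde + V0 * t, t)


def wave_operator(f, x, t, h=1e-3):
    """Finite-difference estimate of -(1/c^2) d^2f/dt^2 + d^2f/dx^2."""
    d2t = (f(x, t + h) - 2.0 * f(x, t) + f(x, t - h)) / h**2
    d2x = (f(x + h, t) - 2.0 * f(x, t) + f(x - h, t)) / h**2
    return -d2t / C**2 + d2x


residual_original = wave_operator(psi, 0.3, 0.2)
residual_boosted = wave_operator(psi_boosted, 0.3, 0.2)

# Analytic value of the leftover terms for this plane wave:
# k^2 sin(phase) * (-2 v0/c + (v0/c)^2)
phase = K * (0.3 + V0 * 0.2 - C * 0.2)
expected = K**2 * math.sin(phase) * (-2.0 * V0 / C + (V0 / C) ** 2)
print(residual_original, residual_boosted, expected)
```

The residual vanishes in the original coordinates but not in the boosted ones, and the leftover matches the terms of first and second order in v0/c.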
Since we know that Maxwell’s equations actually predict the speed of light, it is not unreasonable to
suppose that they are valid at large velocities. On the other hand, in 1900, the laws of Newtonian mechanics
had been tested only for $v \ll c$. We therefore begin by considering what set of boost transformations does
leave the wave equation invariant.
\[ x^\alpha = (ct, x, y, z) \quad \text{for } \alpha = 0, 1, 2, 3 \]
so using the Einstein summation convention: any repeated index, one up and one down, is automatically
summed,
\[ x^\alpha \frac{\partial}{\partial x^\alpha} \equiv \sum_{\alpha=0}^{3} x^\alpha \frac{\partial}{\partial x^\alpha} \]
This will save us from writing large numbers of $\sum$s. With these definitions, the wave equation is simply
\[ \eta^{\alpha\beta} \frac{\partial^2 \psi}{\partial x^\alpha \partial x^\beta} = 0 \]
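The double sum involves 16 terms, but since all but four components of $\eta^{\alpha\beta}$ are zero (the diagonal ones are $\eta^{00} = -1$ and $\eta^{11} = \eta^{22} = \eta^{33} = 1$), only the four we need
survive. Further details of this notation are given below.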
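\[ x^\alpha = \bar{\Lambda}^\alpha{}_\mu \tilde{x}^\mu \]
\begin{align*}
\frac{\partial x^\alpha}{\partial \tilde{x}^\beta} &= \frac{\partial}{\partial \tilde{x}^\beta} \left( \bar{\Lambda}^\alpha{}_\mu \tilde{x}^\mu \right) \\
&= \bar{\Lambda}^\alpha{}_\mu \frac{\partial \tilde{x}^\mu}{\partial \tilde{x}^\beta} \\
&= \bar{\Lambda}^\alpha{}_\mu \delta^\mu_\beta \\
&= \bar{\Lambda}^\alpha{}_\beta
\end{align*}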
The transformation of the derivative operator is therefore
\[ \frac{\partial}{\partial \tilde{x}^\alpha} = \bar{\Lambda}^\beta{}_\alpha \frac{\partial}{\partial x^\beta} \]
and the d’Alembertian of ψ becomes
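\begin{align*}
\eta^{\alpha\beta} \frac{\partial^2 \psi}{\partial \tilde{x}^\alpha \partial \tilde{x}^\beta} &= \eta^{\alpha\beta} \bar{\Lambda}^\mu{}_\alpha \bar{\Lambda}^\nu{}_\beta \frac{\partial}{\partial x^\mu} \frac{\partial \psi}{\partial x^\nu} \\
&= \eta^{\alpha\beta} \bar{\Lambda}^\mu{}_\alpha \bar{\Lambda}^\nu{}_\beta \frac{\partial^2 \psi}{\partial x^\mu \partial x^\nu}
\end{align*}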
We want this to hold regardless of ψ, so the matrix of partial derivatives is arbitrary. Therefore, in order for
the wave equation to be invariant, we must have
\[ \eta^{\alpha\beta} \bar{\Lambda}^\mu{}_\alpha \bar{\Lambda}^\nu{}_\beta = \eta^{\mu\nu} \]
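Exercise: Prove that, for a transformation that mixes only $t$ and $x$, with the upper-left $2 \times 2$ block of $\bar{\Lambda}^\mu{}_\alpha$ written as $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, this is equivalent to
\begin{align*}
b^2 - a^2 &= -1 \\
-ac + bd &= 0 \\
d^2 - c^2 &= 1
\end{align*}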
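Solving the center equation, $d = \frac{ac}{b}$, so the third equation becomes
\begin{align*}
\frac{a^2}{b^2} c^2 - c^2 &= 1 \\
\left( a^2 - b^2 \right) c^2 &= b^2 \\
c^2 &= b^2
\end{align*}
where the last step uses the first equation, $a^2 - b^2 = 1$.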
Therefore,
c = ±b
d = ±a
To solve the first equation, we let b = sinh ζ, and immediately find a = cosh ζ.
Therefore,
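\[ \bar{\Lambda}^\mu{}_\alpha = \begin{pmatrix} \cosh\zeta & \sinh\zeta & 0 & 0 \\ \sinh\zeta & \cosh\zeta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]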
where the choice of the + sign preserves the direction of x and t. Inverting,
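\[ \Lambda^\mu{}_\alpha = \begin{pmatrix} \cosh\zeta & -\sinh\zeta & 0 & 0 \\ -\sinh\zeta & \cosh\zeta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]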
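Both the invariance condition and the inverse can be checked with a few lines of matrix arithmetic. This is a minimal sketch, assuming the metric $\eta^{\alpha\beta} = \mathrm{diag}(-1, 1, 1, 1)$ implied by the wave equation and an arbitrary illustrative rapidity; the helper names are hypothetical.

```python
import math

# Assumed metric, consistent with the wave equation and x^0 = ct.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def boost_bar(zeta):
    """The matrix Lambda-bar for rapidity zeta (boost mixing t and x)."""
    ch, sh = math.cosh(zeta), math.sinh(zeta)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def boost(zeta):
    """The inverse matrix Lambda: the same form with zeta -> -zeta."""
    return boost_bar(-zeta)


zeta = 0.7  # arbitrary illustrative rapidity
lam_bar = boost_bar(zeta)

# eta^{ab} Lbar^m_a Lbar^n_b written as the matrix product Lbar . eta . Lbar^T
lhs = matmul(matmul(lam_bar, ETA), transpose(lam_bar))
# Lambda . Lambda-bar should be the identity
product = matmul(boost(zeta), lam_bar)
```

Here `lhs` reproduces `ETA` and `product` is the identity, both up to rounding.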
Changing the parameterization puts $\Lambda^\mu{}_\alpha$ in a more familiar form. Let
\[ \tanh\zeta = \frac{v}{c} \]
Then
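\begin{align*}
\cosh\zeta &= \frac{1}{\sqrt{1 - \tanh^2\zeta}} \\
&= \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \\
\sinh\zeta &= \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \frac{v}{c}
\end{align*}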
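Defining
\[ \gamma \equiv \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \]
the transformation matrix becomes
\[ \Lambda^\mu{}_\alpha = \begin{pmatrix} \gamma & -\frac{\gamma v}{c} & 0 & 0 \\ -\frac{\gamma v}{c} & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]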
This gives
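\begin{align*}
c\tilde{t} &= \gamma \left( ct - \frac{vx}{c} \right) \\
\tilde{x} &= \gamma \left( x - vt \right) \\
\tilde{y} &= y \\
\tilde{z} &= z
\end{align*}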
which is the typical form for a relativistic boost. In the limit $c \gg v$, we have $\gamma \approx 1$ and this transformation
reduces to
t̃ = t
x̃ = x − vt
ỹ = y
z̃ = z
so we recover the Galilean boost and identify the parameter v with the relative velocity of the frames.
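A small numerical comparison makes the limit concrete. This is a sketch under stated assumptions: SI units, motion along x only, and hypothetical function names; the sample values are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v, c=C):
    """Lorentz factor 1 / sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz_boost(t, x, v, c=C):
    """Relativistic boost along x: returns (t~, x~)."""
    g = gamma(v, c)
    return g * (t - v * x / c**2), g * (x - v * t)


def galilean_boost(t, x, v):
    """Galilean boost along x: returns (t~, x~)."""
    return t, x - v * t


# Everyday speed (30 m/s): the two boosts are practically indistinguishable.
t_slow, x_slow = lorentz_boost(10.0, 1000.0, 30.0)
t_gal, x_gal = galilean_boost(10.0, 1000.0, 30.0)
print(t_slow - t_gal, x_slow - x_gal)

# At v = 0.6 c the Lorentz factor is 1.25 and the difference is large.
print(gamma(0.6 * C))
```

For the slow case the differences are far below any practical measurement precision, which is why Newtonian mechanics was not contradicted by experiments at $v \ll c$.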
and as we have seen this is unchanged by the 10 distinct Galilean transformations. If we know one frame
of reference in which Newton’s second law holds, then these transformations give us a 10 parameter family
of equivalent frames of reference. Einstein’s first postulate is that these inertial frames of reference are
indistinguishable. This means that there is no such thing as absolute rest. We can say that two frames move
with constant relative velocity, but it is incorrect to say that one is at rest and the other moves.
The constancy of the speed of light, or perhaps better, the existence of a limiting velocity, is demonstrated
by the Michelson-Morley experiment. By measuring the speed of light in two different directions, at different
times of the year so that the motion relative to the “fixed stars” is different for each, they found no effect of
the motion on the travel time of the light. Difficulties explaining this, and especially difficulties when models
were based on the disturbance travelling in a medium, lend support to the idea that light always travels in
empty space with the limiting velocity, c. Notice that this is in dramatic conflict with our normal idea of
addition of velocities. In Euclidean 3-space, if observer A moves with velocity v with respect to observer B,
and A throws a ball with velocity u, then the velocity of the ball with respect to B is u + v. But according
to this postulate of special relativity, if A shines a light beam it travels with speed c relative to both A and
B. We must combine $c\hat{\mathbf{n}}$ with $\mathbf{v}$ to get $c\hat{\mathbf{n}}'$, regardless of the directions of the unit vectors $\hat{\mathbf{n}}$, $\hat{\mathbf{n}}'$.
With some basic assumptions about the nature of space (specifically, spacetime is a vector space and
inertial observers move in straight lines), the two postulates are:
1. The laws of physics are the same in all inertial frames of reference
2. There is a limiting velocity to all physical phenomena, c, found experimentally to be the speed at which
light travels in vacuum (and theorized to be the speed at which gravitational waves travel in nearly
flat spacetime). This velocity is independent of inertial frame, so that if in one inertial frame an object
moves with speed c, then it moves with speed c in all inertial frames.
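The second postulate can be checked against the boost along x derived in Section 1. The sketch below assumes units with c = 1 and one spatial dimension, and the function names are hypothetical. It boosts two events on an object's world line and reads off the speed in the new frame.

```python
import math


def boost(ct, x, beta):
    """Boost along x with velocity beta = v/c: returns (ct', x')."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (ct - beta * x), g * (x - beta * ct)


def transformed_speed(u_over_c, beta):
    """Speed (in units of c) seen in the boosted frame for an object
    moving with speed u in the original frame."""
    # Two events on the object's world line: the origin and (ct, x) = (1, u/c).
    ct1, x1 = boost(0.0, 0.0, beta)
    ct2, x2 = boost(1.0, u_over_c, beta)
    return (x2 - x1) / (ct2 - ct1)


light = transformed_speed(1.0, 0.5)   # a light ray stays at speed c
ball = transformed_speed(0.5, -0.5)   # Galilean addition would give 1.0
print(light, ball)
```

A light ray keeps speed c in every frame, while a ball thrown at 0.5c from a platform moving at 0.5c comes out at 0.8c rather than c.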
There are some other basic ideas we will use. Since there is strong evidence for the conservation of momentum,
with momenta additively conserved, we still need the physical arena to be a vector space, with linear
combinations of vectors giving other vectors. Also, we need notions of straight lines and distance. We will
assume that un-forced particles travel in straight lines, along their initial direction. What constitutes the
length turns out to be the central difference between classical and relativistic models.
Another approach to these supplementary assumptions is given in problem 11.1. The assumption that
spacetime is homogeneous and isotropic places strong constraints on the allowed transformations, since the
transformation cannot depend on location or time. This approach also rules out position or time dependence
of a scale factor, Λ, (see below). However, just as the 2-dimensional surface of a sphere is isotropic and
homogeneous, there exist constant curvature 4-dimensional spaces which are homogeneous and isotropic, so
some further assumption is required.
\begin{align*}
x^0 &= ct \\
x^1 &= x \\
x^2 &= y \\
x^3 &= z
\end{align*}
where $c$ is the postulated universal physical constant with units of velocity. Then we may write the transformation as
\[ x'^a = \sum_{b=0}^{3} M^a{}_b\, x^b \]
Now consider two inertial frames, with origins coinciding at time $t = t' = 0$, in which a pulse of light is
emitted at time $t = 0$. Picture an expanding spherical wave, with radius $ct$ in one frame and $ct'$ in the other. Then we must have
\begin{align*}
ct &= \sqrt{x^2 + y^2 + z^2} \\
ct' &= \sqrt{x'^2 + y'^2 + z'^2}
\end{align*}
or, squaring,
\begin{align*}
x^2 + y^2 + z^2 - c^2 t^2 &= 0 \\
x'^2 + y'^2 + z'^2 - c^2 t'^2 &= 0
\end{align*}
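Each of these must hold if the other does, and since the primed and unprimed coordinates are linearly related
to one another, the two quadratic forms must be proportional,
\[ x'^2 + y'^2 + z'^2 - c^2 t'^2 = \Lambda \left( x^2 + y^2 + z^2 - c^2 t^2 \right) \]
where the scale factor $\Lambda = \Lambda(v)$ may depend on the relative velocity $v$ of the two frames.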
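To restrict $\Lambda$, suppose we relate $x'^a$ to a third frame, $x''^a$. If the relative velocity is $u$, then the interval picks up a further factor $\Lambda(u)$, for a total factor $\Lambda(u)\Lambda(v)$ relative to the original frame.
Now choose $u = -v$, so that we are back to the original frame, $x''^a = x^a$. Then we require
\[ \Lambda(v)\, \Lambda(-v) = 1 \]
and therefore
\[ \Lambda(v) = e^{f(v)} \]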
where f (−v) = −f (v). Conventionally, we take f (v) = 0, but there exist generalizations of relativity
involving nontrivial factors. Setting Λ = 1 is equivalent to assuming that clocks maintain the same rate as
they move from place to place. However, as long as the clock rates in different places are related by a single
multiplicative function, there is no measurable effect comparing magnitudes of times that could demonstrate
it. From here on, we will take f (v) = 0 and Λ = 1.
We therefore define
\[ s^2 \equiv x^2 + y^2 + z^2 - c^2 t^2 \]
and require $s'^2 = s^2$ between any two inertial frames. This equivalence defines the Lorentz transformations.
Any linear transformation preserving the quantity s is a Lorentz transformation.
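We check that for motion of $O'$ along the positive $x$-axis of $O$, we have
\begin{align*}
ct' &= \gamma \left( ct - \beta x \right) \\
x' &= \gamma \left( x - \beta ct \right) \\
y' &= y \\
z' &= z
\end{align*}
where
\begin{align*}
\gamma &\equiv \frac{1}{\sqrt{1 - \beta^2}} \\
\beta &\equiv \frac{v}{c}
\end{align*}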
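Substituting into $s'^2$ we have
\begin{align*}
s'^2 &= x'^2 + y'^2 + z'^2 - c^2 t'^2 \\
&= \gamma^2 \left( x - \beta ct \right)^2 - \gamma^2 \left( ct - \beta x \right)^2 + y^2 + z^2 \\
&= \left( \gamma^2 - \gamma^2 \beta^2 \right) x^2 + \left( 2\gamma^2\beta - 2\gamma^2\beta \right) xct - \left( \gamma^2 - \gamma^2\beta^2 \right) c^2 t^2 + y^2 + z^2
\end{align*}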
and since
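\begin{align*}
\gamma^2 - \gamma^2 \beta^2 &= \left( \frac{1}{\sqrt{1 - \beta^2}} \right)^2 \left( 1 - \beta^2 \right) \\
&= \frac{1}{1 - \beta^2} \left( 1 - \beta^2 \right) \\
&= 1
\end{align*}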
we have
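\begin{align*}
s'^2 &= \left( \gamma^2 - \gamma^2 \beta^2 \right) x^2 + \left( 2\gamma^2\beta - 2\gamma^2\beta \right) xct - \left( \gamma^2 - \gamma^2\beta^2 \right) c^2 t^2 + y^2 + z^2 \\
&= x^2 - c^2 t^2 + y^2 + z^2 \\
&= s^2
\end{align*}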
The similarity to a rotation,
\begin{align*}
x' &= x \cos\theta - y \sin\theta \\
y' &= x \sin\theta + y \cos\theta
\end{align*}
is not accidental, but will become clear when we find all Lorentz transformations.
Now suppose the velocity is in an arbitrary direction, β. We can project the position coordinate x parallel
and perpendicular to β,
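\begin{align*}
\mathbf{x}_\parallel &= \frac{1}{\beta^2} \left( \boldsymbol{\beta} \cdot \mathbf{x} \right) \boldsymbol{\beta} \\
\mathbf{x}_\perp &= \mathbf{x} - \frac{1}{\beta^2} \left( \boldsymbol{\beta} \cdot \mathbf{x} \right) \boldsymbol{\beta}
\end{align*}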
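with $\mathbf{x} = \mathbf{x}_\perp + \mathbf{x}_\parallel$. The component $\mathbf{x}_\parallel$ will behave just like the $x$-direction in the formula above, while the
perpendicular directions $\mathbf{x}_\perp$ will be unchanged. The time transforms as before, so we have
\begin{align*}
ct' &= \gamma \left( ct - \boldsymbol{\beta} \cdot \mathbf{x} \right) \\
\mathbf{x}'_\parallel &= \gamma \left( \mathbf{x}_\parallel - \boldsymbol{\beta} ct \right) \\
\mathbf{x}'_\perp &= \mathbf{x}_\perp \\
\mathbf{x}' &= \mathbf{x}'_\parallel + \mathbf{x}'_\perp \\
&= \gamma \left( \mathbf{x}_\parallel - \boldsymbol{\beta} ct \right) + \mathbf{x}_\perp \\
&= \gamma \left( \frac{1}{\beta^2} \left( \boldsymbol{\beta} \cdot \mathbf{x} \right) \boldsymbol{\beta} - \boldsymbol{\beta} ct \right) + \mathbf{x} - \frac{1}{\beta^2} \left( \boldsymbol{\beta} \cdot \mathbf{x} \right) \boldsymbol{\beta} \\
&= \mathbf{x} + \frac{\gamma - 1}{\beta^2} \left( \boldsymbol{\beta} \cdot \mathbf{x} \right) \boldsymbol{\beta} - \gamma \boldsymbol{\beta} ct
\end{align*}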
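The general-direction boost translates directly into code. This is a minimal sketch, assuming 3-vectors stored as tuples and the time coordinate passed as ct; the function names and sample values are hypothetical. It also checks that the interval $s^2$ is unchanged.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def general_boost(ct, x, beta):
    """Boost with velocity vector beta = v/c in an arbitrary direction.
    Returns (ct', x') using x' = x + (gamma - 1)/beta^2 (beta.x) beta - gamma beta ct."""
    b2 = dot(beta, beta)
    if b2 == 0.0:
        return ct, tuple(x)  # zero velocity: identity transformation
    g = 1.0 / math.sqrt(1.0 - b2)
    bx = dot(beta, x)
    ct_new = g * (ct - bx)
    x_new = tuple(xi + (g - 1.0) / b2 * bx * bi - g * bi * ct
                  for xi, bi in zip(x, beta))
    return ct_new, x_new


def interval(ct, x):
    """s^2 = |x|^2 - (ct)^2."""
    return dot(x, x) - ct**2


ct, x = 2.0, (1.0, -3.0, 0.5)   # illustrative event
beta = (0.3, -0.4, 0.2)         # illustrative velocity, |beta| < 1
ct_new, x_new = general_boost(ct, x, beta)
print(interval(ct, x), interval(ct_new, x_new))  # equal up to rounding
```

With `beta = (0.6, 0.0, 0.0)` the function reduces to the boost along x, leaving y and z untouched.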
3 Spacetime
We now consider properties of the 4-dimensional physical arena called spacetime. The defining properties are
that it is a 4-dimensional vector space in which the squared length of any vector (from $(x_1, y_1, z_1, ct_1)$ to
$(x_2, y_2, z_2, ct_2)$) is given by
\[ s^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 - c^2 (t_2 - t_1)^2 \]
We have seen that the value of $s^2$ is independent of the inertial frame of reference. If the squared time interval
$c^2 (t_2 - t_1)^2$ is larger than the squared spatial separation, so that $s^2 < 0$, we use the equivalent length
\[ c^2 \tau^2 = c^2 (t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2 \]
To distinguish from 3-dimensional names, s is called the proper length and τ is called the proper time.
\[ x^\alpha = \left( x^0, x^1, x^2, x^3 \right) \]
where x0 = ct and xi for Latin indices i = 1, 2, 3 are the usual spatial (x, y, z). This means that use of
the Greek or Latin alphabet tells us whether an object is four or three dimensional. We know how the
coordinates xα change when we change to a different frame of reference. We now define a (contravariant)
vector, or 4-vector, to be any set of four quantities,
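\begin{align*}
A^\alpha &= \left( A^0, A^1, A^2, A^3 \right) \\
&= \left( A^0, A^i \right) \\
&= \left( A^0, \mathbf{A} \right)
\end{align*}
that transform in the same way as the coordinates. For a boost with velocity $\boldsymbol{\beta}$,
\begin{align*}
A'^0 &= \gamma \left( A^0 - \boldsymbol{\beta} \cdot \mathbf{A} \right) \\
\mathbf{A}'_\parallel &= \gamma \left( \mathbf{A}_\parallel - \boldsymbol{\beta} A^0 \right) \\
\mathbf{A}'_\perp &= \mathbf{A}_\perp
\end{align*}
It follows that the quantity
\[ -\left( A^0 \right)^2 + \mathbf{A} \cdot \mathbf{A} \]
is the same in any inertial reference frame. This is the squared length of the 4-vector $A^\alpha$. We will give alternative
notation for this later.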
Now let Aα and B α be any two 4-vectors. Their scalar product or inner product is given by
\[ -A^0 B^0 + \mathbf{A} \cdot \mathbf{B} \]
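The invariance of the scalar product can be checked with two sample 4-vectors. This is a sketch assuming a boost along x and 4-vectors stored as tuples $(A^0, A^1, A^2, A^3)$; the names and component values are hypothetical.

```python
import math


def boost_x(a, beta):
    """Boost a 4-vector a = (A0, A1, A2, A3) along x with velocity beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    a0, a1, a2, a3 = a
    return (g * (a0 - beta * a1), g * (a1 - beta * a0), a2, a3)


def inner(a, b):
    """Scalar product -A0 B0 + A.B of two 4-vectors."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


A = (3.0, 1.0, -2.0, 0.5)   # illustrative components
B = (1.0, 4.0, 0.0, -1.0)

before = inner(A, B)
after = inner(boost_x(A, 0.8), boost_x(B, 0.8))
print(before, after)  # equal up to rounding
```

Taking `B = A` gives the invariant squared length of a single 4-vector as a special case.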
3.2 Causality
In graphing spacetime, time is generally taken as the vertical axis. Points in spacetime are called events and
denoted P (t, x). The invariant separation between two events P (t1 , x1 ) , P (t2 , x2 ) is given by the invariant
interval
\[ s^2 = \left| \mathbf{x}_1 - \mathbf{x}_2 \right|^2 - c^2 (t_1 - t_2)^2 \]
or
\[ c^2 \tau^2 = c^2 (t_1 - t_2)^2 - \left| \mathbf{x}_1 - \mathbf{x}_2 \right|^2 \]
whichever is positive. When $s^2 > 0$ the separation is called spacelike and when $\tau^2 > 0$ the separation is
called timelike. When $s^2 = c^2 \tau^2 = 0$, the separation of the two events is called lightlike or null.
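The three cases can be expressed as a small classifier. This is a minimal sketch, assuming events given as `(t, (x, y, z))` pairs and units with c = 1 by default; the function name is hypothetical. The exact comparison with zero is only reliable for inputs that are exactly representable, so a tolerance would be needed for general floating-point data.

```python
def classify_separation(event1, event2, c=1.0):
    """Return 'spacelike', 'timelike' or 'lightlike' for two events (t, (x, y, z))."""
    (t1, x1), (t2, x2) = event1, event2
    # s^2 = |x1 - x2|^2 - c^2 (t1 - t2)^2
    s2 = sum((a - b) ** 2 for a, b in zip(x1, x2)) - c**2 * (t1 - t2) ** 2
    if s2 > 0:
        return "spacelike"
    if s2 < 0:
        return "timelike"
    return "lightlike"


origin = (0.0, (0.0, 0.0, 0.0))
print(classify_separation(origin, (2.0, (1.0, 0.0, 0.0))))
print(classify_separation(origin, (1.0, (3.0, 0.0, 0.0))))
print(classify_separation(origin, (5.0, (3.0, 4.0, 0.0))))
```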
The minus sign between the time and space coordinates in the expression for the interval is responsible
for causal relations in spacetime. Consider the lightlike lines from any fixed spacetime event, P . This set of
null lines is called the light cone, and its position in spacetime is agreed on by all inertial observers. As a
result, the region above these lines is agreed by all inertial observers to occur at later times (larger values
of t) and constitutes the future of P . Events lying below the lowest set of lightlike lines are agreed to have
earlier values of t, and this region is therefore called the past of P . The remaining points of spacetime are
called elsewhere.
Any object travelling from P at the speed of light must follow a null curve; objects travelling slower than
the speed of light follow curves contained inside the future light cone. Moreover, for a particle
travelling slower than the speed of light, the remainder of its path lies in the future of every event on that path. Such a path is called
the world line of the particle and is said to be a timelike curve.
Suppose P1 and P2 are two events on the world line of a particle. Then there exists a frame of reference
in which P1 and P2 occur at the same spatial location. In this frame of reference,
\begin{align*}
c^2 \tau_{12}^2 &= c^2 (t_1 - t_2)^2 - \left| \mathbf{x}_1 - \mathbf{x}_2 \right|^2 \\
&= c^2 (t_1 - t_2)^2
\end{align*}
so that in this particular frame of reference, the proper time interval equals the difference in time coordinates,
τ12 = t1 − t2 .
Similarly, suppose two events P1 and P2 have spacelike separation
\[ s_{12}^2 = \left| \mathbf{x}_1 - \mathbf{x}_2 \right|^2 - c^2 (t_1 - t_2)^2 > 0 \]
Then there exists a frame of reference in which the two events occur at the same value of t, and the proper
interval becomes equal to the spatial separation of the events:
\[ s_{12}^2 = \left| \mathbf{x}_1 - \mathbf{x}_2 \right|^2 \]
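Both special frames can be constructed explicitly in one spatial dimension. This is a sketch under stated assumptions: the first event sits at the origin, the second at (c dt, dx), and the boost is along x. The function names are hypothetical. For a timelike pair the required frame velocity is beta = dx / (c dt); for a spacelike pair it is beta = c dt / dx.

```python
import math


def boost(ct, x, beta):
    """Boost along x with velocity beta = v/c: returns (ct', x')."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (ct - beta * x), g * (x - beta * ct)


def same_place_beta(dct, dx):
    """Velocity of the frame in which a timelike-separated pair shares a location."""
    return dx / dct


def same_time_beta(dct, dx):
    """Velocity of the frame in which a spacelike-separated pair is simultaneous."""
    return dct / dx


# Timelike pair: c dt = 5, dx = 3, so c tau = sqrt(25 - 9) = 4.
ct_rest, x_rest = boost(5.0, 3.0, same_place_beta(5.0, 3.0))

# Spacelike pair: c dt = 3, dx = 5, so s = sqrt(25 - 9) = 4.
ct_sim, x_sim = boost(3.0, 5.0, same_time_beta(3.0, 5.0))
print(ct_rest, x_rest, ct_sim, x_sim)
```

In the first frame the spatial separation vanishes and the time difference equals the proper time; in the second the time difference vanishes and the spatial separation equals the proper length.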
Now consider the world line of a particle. We know (and will demonstrate later) that such a particle
always moves with speed less than c. The proper time along its world line is the physical time for the particle.
Consider two infinitesimally separated points on the world line. Choose a frame of reference (any will do!)
and specify the position of the particle in that frame of reference by x (t), so that the infinitesimal change
in proper time is
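\begin{align*}
d\tau &= \sqrt{dt^2 - \frac{1}{c^2}\, d\mathbf{x}^2} \\
&= dt \sqrt{1 - \frac{1}{c^2} \left( \frac{d\mathbf{x}}{dt} \right)^2}
\end{align*}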
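We may now integrate along the world line between any two events $A$, $B$, to find the elapsed proper time for
the particle,
\begin{align*}
\tau_{AB} &= \int_{t_A}^{t_B} dt \sqrt{1 - \frac{1}{c^2} \left( \frac{d\mathbf{x}}{dt} \right)^2} \\
&= \int_{t_A}^{t_B} dt \sqrt{1 - \frac{v(t)^2}{c^2}}
\end{align*}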
This shows that the elapsed time for physical processes depends on the motion.
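The proper-time integral is easy to evaluate numerically for a given speed profile. This is a minimal sketch using the midpoint rule, assuming units with c = 1 by default and a caller-supplied function `speed(t)` with |v| < c; the function name, step count, and example profiles are hypothetical.

```python
import math


def proper_time(speed, t_a, t_b, c=1.0, steps=10_000):
    """Midpoint-rule estimate of tau_AB = integral of sqrt(1 - v(t)^2/c^2) dt."""
    dt = (t_b - t_a) / steps
    total = 0.0
    for i in range(steps):
        t_mid = t_a + (i + 0.5) * dt
        total += math.sqrt(1.0 - (speed(t_mid) / c) ** 2) * dt
    return total


# A clock at rest for 10 time units, and one moving at constant speed 0.6 c.
tau_rest = proper_time(lambda t: 0.0, 0.0, 10.0)
tau_moving = proper_time(lambda t: 0.6, 0.0, 10.0)

# An out-and-back trip: only the speed enters, so reversing direction changes nothing.
tau_round_trip = proper_time(lambda t: 0.6 if t < 5.0 else -0.6, 0.0, 10.0)
print(tau_rest, tau_moving, tau_round_trip)
```

For constant speed 0.6c the integrand is 0.8, so the moving clock records 8 units against 10 for the clock at rest.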