Em Notes 20 Special Relativity

1) The document discusses Galilean transformations and how they leave Newton's second law of motion invariant. Galilean transformations include rotations, translations, and boosts. 2) It then explains that Maxwell's equations, such as the wave equation, are not invariant under Galilean transformations and boosts. This represents a conflict between classical mechanics and electromagnetism. 3) To resolve this issue, the document considers what transformations would leave the wave equation invariant, as Maxwell's equations have been shown to accurately predict light speed. This leads to exploring transformations known as the Lorentz transformations.

Uploaded by

Harshit

Special Relativity

April 9, 2016

1 Galilean transformations
1.1 The invariance of Newton’s second law
Newton's second law,
$$\mathbf{F} = m\mathbf{a}$$
is a 3-vector equation and is therefore valid if we make any rotation of our frame of reference. Thus, if $O_{ij}$ is a rotation matrix and we rotate the force and the acceleration vectors,
$$\tilde{F}_i = O_{ij} F_j, \qquad \tilde{a}_i = O_{ij} a_j$$
then we have
$$\tilde{\mathbf{F}} = m\tilde{\mathbf{a}}$$
and Newton's second law is invariant under rotations. There are other invariances. Any change of the coordinates $\mathbf{x} \to \tilde{\mathbf{x}}$ that leaves the acceleration unchanged is also an invariance of Newton's law. Equating the accelerations and integrating twice,
$$\tilde{\mathbf{a}} = \mathbf{a}$$
$$\frac{d^2\tilde{\mathbf{x}}}{dt^2} = \frac{d^2\mathbf{x}}{dt^2}$$
gives
$$\tilde{\mathbf{x}} = \mathbf{x} + \mathbf{x}_0 - \mathbf{v}_0 t$$
The addition of a constant, $\mathbf{x}_0$, is called a translation and the change of velocity of the frame of reference is called a boost. Finally, integrating the equivalence $d\tilde{t} = dt$ shows that we may reset the zero of time (a time translation),
$$\tilde{t} = t + t_0$$
The complete set of transformations is
$$
\begin{aligned}
x'_i &= \sum_j O_{ij} x_j && \text{Rotations} \\
\mathbf{x}' &= \mathbf{x} + \mathbf{a} && \text{Translations} \\
t' &= t + t_0 && \text{Origin of time} \\
\mathbf{x}' &= \mathbf{x} + \mathbf{v} t && \text{Boosts (change of velocity)}
\end{aligned}
$$
There are three independent parameters describing rotations (for example, specify a direction in space by
giving two angles (θ, ϕ) then specify a third angle, ψ, of rotation around that direction). Translations can
be in the x, y or z directions, giving three more parameters. Three components for the velocity vector and
one more to specify the origin of time gives a total of 10 parameters. These 10 transformations comprise the
Galilean group. Newton’s second law is invariant under the Galilean transformations.
Notice that all of the Galilean transformations are linear. This is crucial, because the position vectors x
form a vector space, and only linear transformations preserve the linear combinations we require of vectors.
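
The invariance of the acceleration can be checked numerically. The following is a minimal sketch in one dimension, assuming a hypothetical trajectory with constant acceleration; the function names and the numerical values of the shift and boost are illustrative choices, not part of the notes.

```python
def galilean(x, t, x0=0.0, v0=0.0, t0=0.0):
    """1D Galilean transformation: x~ = x + x0 - v0*t, t~ = t + t0."""
    return x + x0 - v0 * t, t + t0

def acceleration(path, t, h=1e-3):
    """Central second difference approximating d^2 x / dt^2 for path(t)."""
    return (path(t + h) - 2.0 * path(t) + path(t - h)) / h**2

# Hypothetical trajectory with constant acceleration 3 (arbitrary units).
def x_of_t(t):
    return 0.5 * 3.0 * t**2 + 2.0 * t + 1.0

def x_tilde_of_t(t):
    # The same trajectory seen from a translated and boosted frame.
    x_new, _ = galilean(x_of_t(t), t, x0=5.0, v0=7.0)
    return x_new

a = acceleration(x_of_t, 1.2)
a_tilde = acceleration(x_tilde_of_t, 1.2)
print(a, a_tilde)   # the two values agree up to rounding
```

The translation and the boost change the position and the velocity, but the second derivative, and therefore $\mathbf{F} = m\mathbf{a}$, is untouched.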

1.2 Failure of the Galilean group for electrodynamics
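The same is not true of electrodynamics. For example, in the absence of sources, we have seen that Maxwell's equations lead to the wave equation,
$$-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} + \nabla^2\psi = 0$$
for each component of the fields and the potentials. But if we perform a boost of the coordinates, $\tilde{\mathbf{x}} = \mathbf{x} - \mathbf{v}_0 t$ with $\tilde{t} = t$, this equation is not invariant. The spatial gradient is unchanged, $\nabla = \tilde{\nabla}$, but the time derivative at fixed $\mathbf{x}$ picks up a term from the time dependence of $\tilde{\mathbf{x}}$. Writing $\psi$ as a function of $(\tilde{\mathbf{x}}, t)$, with $\partial/\partial t$ on the right taken at fixed $\tilde{\mathbf{x}}$,
$$
\begin{aligned}
0 &= -\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\psi(\mathbf{x}, t) + \nabla^2\psi(\mathbf{x}, t) \\
&= -\frac{1}{c^2}\left(\frac{\partial}{\partial t} + \frac{\partial\tilde{\mathbf{x}}}{\partial t}\cdot\tilde{\nabla}\right)\left(\frac{\partial}{\partial t}\psi(\tilde{\mathbf{x}}, t) + \frac{\partial\tilde{\mathbf{x}}}{\partial t}\cdot\tilde{\nabla}\psi(\tilde{\mathbf{x}}, t)\right) + \tilde{\nabla}^2\psi(\tilde{\mathbf{x}}, t) \\
&= -\frac{1}{c^2}\left(\frac{\partial}{\partial t} - \mathbf{v}_0\cdot\tilde{\nabla}\right)\left(\frac{\partial}{\partial t}\psi(\tilde{\mathbf{x}}, t) - \mathbf{v}_0\cdot\tilde{\nabla}\psi(\tilde{\mathbf{x}}, t)\right) + \tilde{\nabla}^2\psi(\tilde{\mathbf{x}}, t) \\
&= -\frac{1}{c^2}\left(\frac{\partial^2}{\partial t^2}\psi(\tilde{\mathbf{x}}, t) - 2\left(\mathbf{v}_0\cdot\tilde{\nabla}\right)\frac{\partial}{\partial t}\psi(\tilde{\mathbf{x}}, t) + \left(\mathbf{v}_0\cdot\tilde{\nabla}\right)^2\psi(\tilde{\mathbf{x}}, t)\right) + \tilde{\nabla}^2\psi(\tilde{\mathbf{x}}, t) \\
&= -\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\psi(\tilde{\mathbf{x}}, t) + \tilde{\nabla}^2\psi(\tilde{\mathbf{x}}, t) + \frac{2}{c^2}\left(\mathbf{v}_0\cdot\tilde{\nabla}\right)\frac{\partial}{\partial t}\psi(\tilde{\mathbf{x}}, t) - \frac{1}{c^2}\left(\mathbf{v}_0\cdot\tilde{\nabla}\right)^2\psi(\tilde{\mathbf{x}}, t)
\end{aligned}
$$
The first two terms have the form of the wave equation in the new coordinates, but the last two terms have no counterpart in the original equation.
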

This means that there is an inherent conflict between the symmetry of Maxwell’s equations and the symmetry
of Newton’s second law. They do not change in a consistent way if we change to a moving frame of reference.
We must make a choice between modifying Maxwell's equations or modifying Newton's law. This is not as drastic as it sounds, since both of the troublesome terms above are small when $v_0/c \ll 1$ (they are of order $v_0/c$ and $(v_0/c)^2$ relative to the others), but one must still be modified.
Since we know that Maxwell's equations actually predict the speed of light, it is not unreasonable to suppose that they are valid at large velocities. On the other hand, in 1900, Newton's laws had been tested experimentally only for $v \ll c$. We therefore begin by considering what set of boost transformations does leave the wave equation invariant.
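
The failure of Galilean invariance can be seen numerically in one space dimension. The sketch below is only an illustration under stated assumptions: units with $c = 1$, a hypothetical plane wave $\sin k(x - ct)$, a boost $\tilde{x} = x - v_0 t$, and finite differences in place of exact derivatives.

```python
import math

C = 1.0   # wave speed, in units where c = 1 (assumption for illustration)
K = 2.0   # wavenumber of the hypothetical test wave

def psi(x, t):
    """Right-moving plane wave, an exact solution of the wave equation."""
    return math.sin(K * (x - C * t))

def psi_boosted(x_tilde, t, v0):
    """The same wave written in Galilean-boosted coordinates x~ = x - v0 t."""
    return psi(x_tilde + v0 * t, t)

def wave_residual(f, x, t, h=1e-3):
    """Finite-difference value of -(1/c^2) f_tt + f_xx at the point (x, t)."""
    f_tt = (f(x, t + h) - 2.0 * f(x, t) + f(x, t - h)) / h**2
    f_xx = (f(x + h, t) - 2.0 * f(x, t) + f(x - h, t)) / h**2
    return -f_tt / C**2 + f_xx

v0 = 0.1
res_rest = wave_residual(psi, 0.3, 0.2)
res_boost = wave_residual(lambda x, t: psi_boosted(x, t, v0), 0.3, 0.2)

# Closed form of the boosted residual for this particular wave:
# K^2 sin(phase) * ((C - v0)^2 / C^2 - 1), which vanishes only when v0 = 0.
phase = K * (0.3 + v0 * 0.2 - C * 0.2)
expected = K**2 * math.sin(phase) * ((C - v0) ** 2 / C**2 - 1.0)
print(res_rest, res_boost, expected)
```

In the original coordinates the residual is zero up to discretization error, while in the boosted coordinates the wave operator of the same form leaves a nonzero remainder that grows with $v_0$.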

1.3 Some definitions


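Rewrite the wave equation,
$$-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} + \nabla^2\psi = 0$$
by introducing some systematic notation. Let
$$x^\alpha = (ct, x, y, z) \quad \text{for } \alpha = 0, 1, 2, 3$$
and define the $4 \times 4$ object,
$$\eta^{\alpha\beta} \equiv \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
We write derivatives with respect to these four coordinates as
$$\frac{\partial}{\partial x^\alpha}$$
and use the Einstein summation convention: any repeated index, one up and one down, is automatically summed,
$$x^\alpha \frac{\partial}{\partial x^\alpha} \equiv \sum_{\alpha=0}^{3} x^\alpha \frac{\partial}{\partial x^\alpha}$$
This will save us from writing large numbers of $\sum$s. With these definitions, the wave equation is simply
$$\eta^{\alpha\beta} \frac{\partial^2\psi}{\partial x^\alpha \partial x^\beta} = 0$$
The double sum involves 16 terms, but since all but four components of $\eta^{\alpha\beta}$ are zero, only the four we need survive. Further details of this notation are given below.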
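
The double sum can be written out explicitly. As a hedged illustration, take a plane wave $\psi = e^{i k_\alpha x^\alpha}$, for which each derivative $\partial/\partial x^\alpha$ brings down a factor $i k_\alpha$, so the wave equation reduces to $\eta^{\alpha\beta} k_\alpha k_\beta = 0$. The wave vector below is a hypothetical example, and `contract` is an illustrative helper name.

```python
# eta^{alpha beta} with signature (-, +, +, +), as defined in the text.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

def contract(eta, k):
    """Return eta^{ab} k_a k_b and the number of nonzero terms in the double sum."""
    total = 0.0
    nonzero_terms = 0
    for a in range(4):
        for b in range(4):
            if eta[a][b] != 0:
                nonzero_terms += 1
            total += eta[a][b] * k[a] * k[b]
    return total, nonzero_terms

# Hypothetical wave 4-vector k_alpha = (-omega/c, kx, ky, kz) with omega = c|k|.
kx, ky, kz = 1.0, 2.0, 2.0
omega_over_c = (kx**2 + ky**2 + kz**2) ** 0.5   # equals 3.0
k = [-omega_over_c, kx, ky, kz]
value, n_terms = contract(ETA, k)
print(value, n_terms)   # 0.0 4
```

Of the 16 terms visited by the two loops, only the four diagonal ones contribute, and they cancel exactly when $\omega = c|\mathbf{k}|$.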

1.4 Invariance of the d’Alembertian wave equation


As with the Galilean transformations, we require our transformations to be linear,
$$\tilde{x}^\alpha = \Lambda^\alpha{}_\beta x^\beta$$
(Notice what this expression means. The symbol $\Lambda^\alpha{}_\beta$ represents a $4 \times 4$ constant matrix which multiplies the components of the vector $x^\beta$ to give the new components. We write one index up and one down because we want to sum over the index $\beta$, while the raised position of the free index $\alpha$ must be the same in all terms. There is a rigorous meaning to the two index positions which would take us too far afield; for now, I will simply use the correct index positions. We can always sum one raised index and one lowered one, and the objects you are used to calling “vectors” have a raised index: the components of a 3-vector $\mathbf{v}$ are written as $v^i$.)
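To transform the wave equation, we use the chain rule to write the derivative with respect to the new coordinates,
$$\frac{\partial}{\partial \tilde{x}^\alpha} = \frac{\partial x^\beta}{\partial \tilde{x}^\alpha} \frac{\partial}{\partial x^\beta}$$
Letting $\bar{\Lambda}^\alpha{}_\beta$ be the inverse of $\Lambda^\alpha{}_\beta$, so that $\Lambda^\alpha{}_\mu \bar{\Lambda}^\mu{}_\beta = \delta^\alpha_\beta$, we have
$$
\begin{aligned}
x^\alpha &= \bar{\Lambda}^\alpha{}_\mu \tilde{x}^\mu \\
\frac{\partial x^\alpha}{\partial \tilde{x}^\beta} &= \frac{\partial}{\partial \tilde{x}^\beta}\left(\bar{\Lambda}^\alpha{}_\mu \tilde{x}^\mu\right) \\
&= \bar{\Lambda}^\alpha{}_\mu \frac{\partial \tilde{x}^\mu}{\partial \tilde{x}^\beta} \\
&= \bar{\Lambda}^\alpha{}_\mu \delta^\mu_\beta \\
&= \bar{\Lambda}^\alpha{}_\beta
\end{aligned}
$$
The transformation of the derivative operator is therefore
$$\frac{\partial}{\partial \tilde{x}^\alpha} = \bar{\Lambda}^\beta{}_\alpha \frac{\partial}{\partial x^\beta}$$
and the d'Alembertian of $\psi$ becomes
$$
\begin{aligned}
\eta^{\alpha\beta} \frac{\partial^2\psi}{\partial \tilde{x}^\alpha \partial \tilde{x}^\beta} &= \eta^{\alpha\beta} \bar{\Lambda}^\mu{}_\alpha \frac{\partial}{\partial x^\mu}\left(\bar{\Lambda}^\nu{}_\beta \frac{\partial\psi}{\partial x^\nu}\right) \\
&= \eta^{\alpha\beta} \bar{\Lambda}^\mu{}_\alpha \bar{\Lambda}^\nu{}_\beta \frac{\partial^2\psi}{\partial x^\mu \partial x^\nu}
\end{aligned}
$$
We want this to hold regardless of $\psi$, so the matrix of partial derivatives is arbitrary. Therefore, in order for the wave equation to be invariant, we must have
$$\eta^{\alpha\beta} \bar{\Lambda}^\mu{}_\alpha \bar{\Lambda}^\nu{}_\beta = \eta^{\mu\nu}$$

Exercise: Prove that this is equivalent to
$$\eta_{\mu\nu} \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta = \eta_{\alpha\beta}$$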

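We consider the special case of motion in the x-direction, so that only $ct$ and $x$ mix. Writing the unknown block of $\bar{\Lambda}^\mu{}_\alpha$ with entries $a, b, c, d$, the invariance condition reads
$$
\begin{pmatrix} a & b & & \\ c & d & & \\ & & 1 & \\ & & & 1 \end{pmatrix}
\begin{pmatrix} -1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{pmatrix}
\begin{pmatrix} a & c & & \\ b & d & & \\ & & 1 & \\ & & & 1 \end{pmatrix}
=
\begin{pmatrix} -1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{pmatrix}
$$
so we only need to solve
$$
\begin{aligned}
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} -1 & \\ & 1 \end{pmatrix}
\begin{pmatrix} a & c \\ b & d \end{pmatrix}
&= \begin{pmatrix} -1 & \\ & 1 \end{pmatrix} \\
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} -a & -c \\ b & d \end{pmatrix}
&= \begin{pmatrix} -1 & \\ & 1 \end{pmatrix} \\
\begin{pmatrix} b^2 - a^2 & -ac + bd \\ -ac + bd & d^2 - c^2 \end{pmatrix}
&= \begin{pmatrix} -1 & \\ & 1 \end{pmatrix}
\end{aligned}
$$
This gives us three equations,
$$
\begin{aligned}
b^2 - a^2 &= -1 \\
-ac + bd &= 0 \\
d^2 - c^2 &= 1
\end{aligned}
$$
Solving the center equation, $d = \frac{ac}{b}$, so the third equation becomes
$$
\begin{aligned}
\frac{a^2}{b^2} c^2 - c^2 &= 1 \\
\left(a^2 - b^2\right) c^2 &= b^2 \\
c^2 &= b^2
\end{aligned}
$$
where the last step uses $a^2 - b^2 = 1$ from the first equation. Therefore,
$$
\begin{aligned}
c &= \pm b \\
d &= \pm a
\end{aligned}
$$

To solve the first equation, we let $b = \sinh\zeta$, and immediately find $a = \cosh\zeta$. Therefore,
$$
\bar{\Lambda}^\mu{}_\alpha = \begin{pmatrix} \cosh\zeta & \sinh\zeta & & \\ \sinh\zeta & \cosh\zeta & & \\ & & 1 & \\ & & & 1 \end{pmatrix}
$$
where the choice of the + sign preserves the direction of $x$ and $t$. Inverting,
$$
\Lambda^\mu{}_\alpha = \begin{pmatrix} \cosh\zeta & -\sinh\zeta & & \\ -\sinh\zeta & \cosh\zeta & & \\ & & 1 & \\ & & & 1 \end{pmatrix}
$$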
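Changing the parameterization puts this in a more familiar form. Let
$$\tanh\zeta = \frac{v}{c}$$
Then
$$
\begin{aligned}
\cosh\zeta &= \frac{1}{\sqrt{1 - \tanh^2\zeta}} = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \\
\sinh\zeta &= \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}\,\frac{v}{c}
\end{aligned}
$$
Defining
$$\gamma \equiv \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$$
we find the transformation of coordinates $\tilde{x}^\alpha = \Lambda^\alpha{}_\beta x^\beta$ with
$$
\Lambda^\mu{}_\alpha = \begin{pmatrix} \gamma & -\frac{\gamma v}{c} & & \\ -\frac{\gamma v}{c} & \gamma & & \\ & & 1 & \\ & & & 1 \end{pmatrix}
$$
This gives
$$
\begin{aligned}
c\tilde{t} &= \gamma\left(ct - \frac{vx}{c}\right) \\
\tilde{x} &= \gamma\left(x - vt\right) \\
\tilde{y} &= y \\
\tilde{z} &= z
\end{aligned}
$$
which is the typical form for a relativistic boost. In the limit $c \gg v$, we have $\gamma \approx 1$ and $vx/c^2$ is negligible compared with $t$, so this transformation reduces to
$$
\begin{aligned}
\tilde{t} &= t \\
\tilde{x} &= x - vt \\
\tilde{y} &= y \\
\tilde{z} &= z
\end{aligned}
$$
so we recover the Galilean boost and identify the parameter $v$ with the relative velocity of the frames.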
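
Both claims, the invariance condition and the Galilean limit, can be checked with a few lines of arithmetic. This is a sketch only: the helper names `boost_x`, `lam_eta_lam` and `transform` are illustrative, and the sample speeds and coordinates are arbitrary assumptions.

```python
import math

ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

def boost_x(v, c=1.0):
    """Boost matrix Lambda^mu_alpha along x, acting on (ct, x, y, z)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    gb = gamma * v / c
    return [[gamma, -gb, 0, 0], [-gb, gamma, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

def lam_eta_lam(lam):
    """Compute eta_{mu nu} Lambda^mu_alpha Lambda^nu_beta as a 4x4 list."""
    return [[sum(ETA[m][n] * lam[m][a] * lam[n][b]
                 for m in range(4) for n in range(4))
             for b in range(4)] for a in range(4)]

def transform(lam, x):
    """Apply the matrix lam to the 4-component list x."""
    return [sum(lam[a][b] * x[b] for b in range(4)) for a in range(4)]

# Invariance condition at v = 0.6 c: the result should reproduce eta.
check = lam_eta_lam(boost_x(0.6))
max_err = max(abs(check[a][b] - ETA[a][b]) for a in range(4) for b in range(4))

# Galilean limit: an everyday speed (30 m/s) with c in m/s.
c = 299_792_458.0
v, t, x = 30.0, 2.0, 100.0
ct_new, x_new, _, _ = transform(boost_x(v, c), [c * t, x, 0.0, 0.0])
print(max_err, ct_new / c, x_new)
```

At $v = 0.6c$ the matrix satisfies the invariance condition to rounding error, and at 30 m/s the transformed coordinates are indistinguishable from $\tilde{t} = t$, $\tilde{x} = x - vt$.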

2 Special relativity from Einstein’s postulates


It is also possible to derive the relativistic transformations from postulates.

2.1 The postulates


Special relativity is a combination of two fundamental ideas: the equivalence of inertial frames, and the invariance of the speed of light. Inertial frames are the same in relativistic mechanics as they are in Newtonian mechanics, i.e., frames of reference (sets of orthonormal basis vectors) in which Newton's second law,
$$\mathbf{F} = m\mathbf{a}$$
holds. As we have seen, this law is unchanged by the 10-parameter family of Galilean transformations. If we know one frame of reference in which Newton's second law holds, then these transformations give us a 10-parameter family of equivalent frames of reference. Einstein's first postulate is that these inertial frames of reference are indistinguishable. This means that there is no such thing as absolute rest. We can say that two frames move with constant relative velocity, but it is incorrect to say that one is at rest and the other moves.
The constancy of the speed of light, or perhaps better, the existence of a limiting velocity, is demonstrated by the Michelson-Morley experiment. By measuring the speed of light in two different directions, at different times of the year so that the motion relative to the “fixed stars” is different for each, they found no effect of the motion on the travel time of the light. Difficulties explaining this, and especially difficulties when models were based on the disturbance travelling in a medium, lend support to the idea that light always travels in empty space with the limiting velocity, $c$. Notice that this is in dramatic conflict with our normal idea of addition of velocities. In Euclidean 3-space, if observer A moves with velocity $\mathbf{v}$ with respect to observer B, and A throws a ball with velocity $\mathbf{u}$, then the velocity of the ball with respect to B is $\mathbf{u} + \mathbf{v}$. But according to this postulate of special relativity, if A shines a light beam it travels with speed $c$ relative to both A and B. We must combine $c\hat{\mathbf{n}}$ with $\mathbf{v}$ to get $c\hat{\mathbf{n}}'$, regardless of the directions of the unit vectors $\hat{\mathbf{n}}$, $\hat{\mathbf{n}}'$.
With some basic assumptions about the nature of space (specifically, spacetime is a vector space and
inertial observers move in straight lines), the two postulates are:

1. The laws of physics are the same in all inertial frames of reference
2. There is a limiting velocity to all physical phenomena, c, found experimentally to be the speed at which
light travels in vacuum (and theorized to be the speed at which gravitational waves travel in nearly
flat spacetime). This velocity is independent of inertial frame, so that if in one inertial frame an object
moves with speed c, then it moves with speed c in all inertial frames.
There are some other basic ideas we will use. Since there is strong evidence for the conservation of momentum, with momenta additively conserved, we still need the physical arena to be a vector space, with linear combinations of vectors giving other vectors. Also, we need notions of straight lines and distance. We will assume that un-forced particles travel in straight lines, along their initial direction. What constitutes the length turns out to be the central difference between classical and relativistic models.
Another approach to these supplementary assumptions is given in problem 11.1. The assumption that
spacetime is homogeneous and isotropic places strong constraints on the allowed transformations, since the
transformation cannot depend on location or time. This approach also rules out position or time dependence
of a scale factor, Λ, (see below). However, just as the 2-dimensional surface of a sphere is isotropic and
homogeneous, there exist constant curvature 4-dimensional spaces which are homogeneous and isotropic, so
some further assumption is required.

2.2 Lorentz transformations


In order for a position, $\mathbf{x}$, and time, $t$, to describe a vector in every frame of reference, we need to restrict possible transformations to linear transformations. Only linear transformations preserve the additivity properties of vectors. This means that the position and time in any two frames of reference must be related by a matrix,
$$
\begin{pmatrix} c\tilde{t} \\ \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{pmatrix}
=
\begin{pmatrix}
M^0{}_0 & M^0{}_1 & M^0{}_2 & M^0{}_3 \\
M^1{}_0 & M^1{}_1 & M^1{}_2 & M^1{}_3 \\
M^2{}_0 & M^2{}_1 & M^2{}_2 & M^2{}_3 \\
M^3{}_0 & M^3{}_1 & M^3{}_2 & M^3{}_3
\end{pmatrix}
\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}
$$
or, from above, $\tilde{x}^\alpha = \Lambda^\alpha{}_\beta x^\beta$. It is convenient to write this as a matrix equation, and it simplifies the notation if we define
$$
\begin{aligned}
x^0 &= ct \\
x^1 &= x \\
x^2 &= y \\
x^3 &= z
\end{aligned}
$$
where $c$ is the postulated universal physical constant with units of velocity. Then we may write the transformation as
$$x'^a = \sum_{b=0}^{3} M^a{}_b\, x^b$$

Now consider two inertial frames, with origins coinciding at time $t = t' = 0$, in which a pulse of light is emitted at time $t = 0$. Picture an expanding spherical wave with radius $ct$ in the first frame and $ct'$ in the second. Then we must have
$$
\begin{aligned}
ct &= \sqrt{x^2 + y^2 + z^2} \\
ct' &= \sqrt{x'^2 + y'^2 + z'^2}
\end{aligned}
$$
Write these relations as
$$
\begin{aligned}
x^2 + y^2 + z^2 - c^2 t^2 &= 0 \\
x'^2 + y'^2 + z'^2 - c^2 t'^2 &= 0
\end{aligned}
$$
Each of these must hold if the other does, and since the primed and unprimed coordinates are linearly related to one another, they must be proportional,
$$x^2 + y^2 + z^2 - c^2 t^2 = \Lambda(v)\left(x'^2 + y'^2 + z'^2 - c^2 t'^2\right)$$

To restrict $\Lambda$, suppose we relate $x'^a$ to a third frame, $x''^a$. If the relative velocity is $u$, then we must have
$$
\begin{aligned}
x^2 + y^2 + z^2 - c^2 t^2 &= \Lambda(v)\left(x'^2 + y'^2 + z'^2 - c^2 t'^2\right) \\
&= \Lambda(v)\,\Lambda(u)\left(x''^2 + y''^2 + z''^2 - c^2 t''^2\right)
\end{aligned}
$$
Now choose $u = -v$, so that we are back to the original frame, $x''^a = x^a$. Then we require
$$\Lambda(v)\,\Lambda(-v) = 1$$
and therefore
$$\Lambda(v) = e^{f(v)}$$
where $f(-v) = -f(v)$. Conventionally, we take $f(v) = 0$, but there exist generalizations of relativity involving nontrivial factors. Setting $\Lambda = 1$ is equivalent to assuming that clocks maintain the same rate as they move from place to place. However, as long as the clock rates in different places are related by a single multiplicative function, there is no measurable effect comparing magnitudes of times that could demonstrate it. From here on, we will take $f(v) = 0$ and $\Lambda = 1$.
We therefore define
$$s^2 \equiv x^2 + y^2 + z^2 - c^2 t^2$$
and require $s'^2 = s^2$ between any two inertial frames. This equivalence defines the Lorentz transformations. Any linear transformation preserving the quantity $s^2$ is a Lorentz transformation.
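We check this for motion of $O'$ along the positive x-axis of $O$, where we have
$$
\begin{aligned}
ct' &= \gamma\left(ct - \beta x\right) \\
x' &= \gamma\left(x - \beta ct\right) \\
y' &= y \\
z' &= z
\end{aligned}
$$
where
$$
\begin{aligned}
\gamma &\equiv \frac{1}{\sqrt{1 - \beta^2}} \\
\beta &\equiv \frac{v}{c}
\end{aligned}
$$
Substituting into $s'^2$ we have
$$
\begin{aligned}
s'^2 &= x'^2 + y'^2 + z'^2 - c^2 t'^2 \\
&= \left[\gamma\left(x - \beta ct\right)\right]^2 + y^2 + z^2 - \left[\gamma\left(ct - \beta x\right)\right]^2 \\
&= \gamma^2\left(x^2 - 2\beta xct + \beta^2 c^2 t^2\right) + y^2 + z^2 - \gamma^2\left(c^2 t^2 - 2\beta xct + \beta^2 x^2\right) \\
&= \left(\gamma^2 - \gamma^2\beta^2\right) x^2 + \left(2\gamma^2\beta - 2\gamma^2\beta\right) xct - \left(\gamma^2 - \gamma^2\beta^2\right) c^2 t^2 + y^2 + z^2
\end{aligned}
$$
and since
$$
\begin{aligned}
\gamma^2 - \gamma^2\beta^2 &= \left(\frac{1}{\sqrt{1 - \beta^2}}\right)^2\left(1 - \beta^2\right) \\
&= \frac{1}{1 - \beta^2}\left(1 - \beta^2\right) \\
&= 1
\end{aligned}
$$
we have
$$
\begin{aligned}
s'^2 &= \left(\gamma^2 - \gamma^2\beta^2\right) x^2 + \left(2\gamma^2\beta - 2\gamma^2\beta\right) xct - \left(\gamma^2 - \gamma^2\beta^2\right) c^2 t^2 + y^2 + z^2 \\
&= x^2 - c^2 t^2 + y^2 + z^2 \\
&= s^2
\end{aligned}
$$
proving that the transformation is a Lorentz transformation.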
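
The same check can be done with exact arithmetic for a sample event. This is a minimal sketch, assuming the particular value $\beta = 3/5$ (chosen because it gives $\gamma = 5/4$ exactly) and an arbitrary event; the function names are illustrative.

```python
from fractions import Fraction

def boost(ct, x, y, z, beta, gamma):
    """Boost along +x using ct' = gamma (ct - beta x), x' = gamma (x - beta ct)."""
    return gamma * (ct - beta * x), gamma * (x - beta * ct), y, z

def interval(ct, x, y, z):
    """s^2 = x^2 + y^2 + z^2 - c^2 t^2."""
    return x * x + y * y + z * z - ct * ct

# beta = 3/5 gives gamma = 5/4 exactly, so the check involves no rounding.
beta, gamma = Fraction(3, 5), Fraction(5, 4)
event = (Fraction(7), Fraction(2), Fraction(-1), Fraction(4))   # (ct, x, y, z)
s2 = interval(*event)
s2_prime = interval(*boost(*event, beta, gamma))
print(s2, s2_prime)   # -28 -28
```

The individual coordinates change ($ct' = 29/4$, $x' = -11/4$), but the combination $s^2$ does not.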


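Notice that we can use a hyperbolic substitution to rewrite the Lorentz transformation. Define the rapidity, $\zeta$, by
$$\beta \equiv \tanh\zeta$$
Then
$$
\begin{aligned}
\gamma &= \frac{1}{\sqrt{1 - \beta^2}} \\
&= \frac{1}{\sqrt{1 - \tanh^2\zeta}} \\
&= \frac{1}{\sqrt{1 - \frac{\sinh^2\zeta}{\cosh^2\zeta}}} \\
&= \frac{\cosh\zeta}{\sqrt{\cosh^2\zeta - \sinh^2\zeta}} \\
&= \cosh\zeta
\end{aligned}
$$
and $\gamma\beta = \sinh\zeta$. Then, with $x^0 = ct$, we have
$$
\begin{aligned}
x'^0 &= x^0\cosh\zeta - x^1\sinh\zeta \\
x'^1 &= -x^0\sinh\zeta + x^1\cosh\zeta \\
x'^2 &= x^2 \\
x'^3 &= x^3
\end{aligned}
$$
The similarity to a rotation,
$$
\begin{aligned}
x' &= x\cos\theta - y\sin\theta \\
y' &= x\sin\theta + y\cos\theta
\end{aligned}
$$
is not accidental, but will become clear when we find all Lorentz transformations.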
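
One way to see the analogy with rotations is to compose two boosts along the same axis: just as rotation angles add, the hyperbolic addition formulas imply that rapidities add. The sketch below is an illustration of that property, not a step in the derivation of the notes; the values $\beta = 0.8$ and $\beta = 0.9$ and the helper names are arbitrary assumptions.

```python
import math

def boost_2x2(zeta):
    """The (x^0, x^1) block of a boost with rapidity zeta."""
    ch, sh = math.cosh(zeta), math.sinh(zeta)
    return [[ch, -sh], [-sh, ch]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

z1, z2 = math.atanh(0.8), math.atanh(0.9)   # rapidities for beta = 0.8 and 0.9
combined = matmul(boost_2x2(z2), boost_2x2(z1))   # boost by z1, then by z2
direct = boost_2x2(z1 + z2)                       # single boost by z1 + z2
beta_total = math.tanh(z1 + z2)
print(combined, direct, beta_total)
```

The two matrices agree to rounding error, and the combined $\beta = \tanh(\zeta_1 + \zeta_2)$ stays below 1 even though $0.8 + 0.9 > 1$.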
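Now suppose the velocity is in an arbitrary direction, $\boldsymbol{\beta}$. We can project the position coordinate $\mathbf{x}$ parallel and perpendicular to $\boldsymbol{\beta}$,
$$
\begin{aligned}
\mathbf{x}_\parallel &= \frac{1}{\beta^2}\left(\boldsymbol{\beta}\cdot\mathbf{x}\right)\boldsymbol{\beta} \\
\mathbf{x}_\perp &= \mathbf{x} - \frac{1}{\beta^2}\left(\boldsymbol{\beta}\cdot\mathbf{x}\right)\boldsymbol{\beta}
\end{aligned}
$$
with $\mathbf{x} = \mathbf{x}_\perp + \mathbf{x}_\parallel$. The component $\mathbf{x}_\parallel$ will behave just like the x-direction in the formula above, while the perpendicular directions $\mathbf{x}_\perp$ will be unchanged. The time transforms as before, so we have
$$
\begin{aligned}
ct' &= \gamma\left(ct - \boldsymbol{\beta}\cdot\mathbf{x}\right) \\
\mathbf{x}'_\parallel &= \gamma\left(\mathbf{x}_\parallel - \boldsymbol{\beta} ct\right) \\
\mathbf{x}'_\perp &= \mathbf{x}_\perp
\end{aligned}
$$
The last two may be combined as
$$
\begin{aligned}
\mathbf{x}' &= \mathbf{x}'_\parallel + \mathbf{x}'_\perp \\
&= \gamma\left(\mathbf{x}_\parallel - \boldsymbol{\beta} ct\right) + \mathbf{x}_\perp \\
&= \gamma\left(\frac{1}{\beta^2}\left(\boldsymbol{\beta}\cdot\mathbf{x}\right)\boldsymbol{\beta} - \boldsymbol{\beta} ct\right) + \mathbf{x} - \frac{1}{\beta^2}\left(\boldsymbol{\beta}\cdot\mathbf{x}\right)\boldsymbol{\beta} \\
&= \mathbf{x} + \frac{\gamma - 1}{\beta^2}\left(\boldsymbol{\beta}\cdot\mathbf{x}\right)\boldsymbol{\beta} - \gamma\boldsymbol{\beta} ct
\end{aligned}
$$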

3 Spacetime
We now consider properties of the 4-dimensional physical arena called spacetime. The defining properties are that it is a 4-dimensional vector space in which the squared length of any vector (from $(x_1, y_1, z_1, ct_1)$ to $(x_2, y_2, z_2, ct_2)$) is given by
$$s^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 - c^2 (t_2 - t_1)^2$$
We have seen that the value of $s^2$ is independent of the inertial frame of reference. If the time interval $c^2 (t_2 - t_1)^2$ is larger than the squared spatial separation, so that $s^2 < 0$, we use the equivalent length
$$c^2\tau^2 = c^2 (t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2$$
To distinguish from 3-dimensional names, $s$ is called the proper length and $\tau$ is called the proper time.

3.1 Contravariant Vectors


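We will discuss the reasons for this notation later, but from now on, the coordinate labels will be written raised. Thus, for Greek indices $\alpha, \beta, \ldots \in \{0, 1, 2, 3\}$, we write
$$x^\alpha = \left(x^0, x^1, x^2, x^3\right)$$
where $x^0 = ct$ and $x^i$ for Latin indices $i = 1, 2, 3$ are the usual spatial $(x, y, z)$. This means that use of the Greek or Latin alphabet tells us whether an object is four or three dimensional. We know how the coordinates $x^\alpha$ change when we change to a different frame of reference. We now define a (contravariant) vector, or 4-vector, to be any set of four quantities,
$$
\begin{aligned}
A^\alpha &= \left(A^0, A^1, A^2, A^3\right) \\
&= \left(A^0, A^i\right) \\
&= \left(A^0, \mathbf{A}\right)
\end{aligned}
$$
that transform in the same way, i.e.,
$$
\begin{aligned}
A'^0 &= \gamma\left(A^0 - \boldsymbol{\beta}\cdot\mathbf{A}\right) \\
\mathbf{A}'_\parallel &= \gamma\left(\mathbf{A}_\parallel - \boldsymbol{\beta} A^0\right) \\
\mathbf{A}'_\perp &= \mathbf{A}_\perp
\end{aligned}
$$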

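It follows immediately that the quantity
$$\left\|A^\alpha\right\|^2 = -\left(A^0\right)^2 + \mathbf{A}\cdot\mathbf{A}$$
is the same in any inertial reference frame. This is the squared length of the 4-vector $A^\alpha$. We will give alternative notation for this later.
Now let $A^\alpha$ and $B^\alpha$ be any two 4-vectors. Their scalar product or inner product is given by
$$-A^0 B^0 + \mathbf{A}\cdot\mathbf{B}$$
This is seen to be invariant by noting that $A^\alpha + B^\alpha$ is also a vector, and writing it as
$$
\begin{aligned}
-A^0 B^0 + \mathbf{A}\cdot\mathbf{B} &= \frac{1}{2}\left[-\left(A^0 + B^0\right)^2 + \left(\mathbf{A} + \mathbf{B}\right)\cdot\left(\mathbf{A} + \mathbf{B}\right) - \left(-\left(A^0\right)^2 + \mathbf{A}\cdot\mathbf{A}\right) - \left(-\left(B^0\right)^2 + \mathbf{B}\cdot\mathbf{B}\right)\right] \\
&= \frac{1}{2}\left(\left\|A^\alpha + B^\alpha\right\|^2 - \left\|A^\alpha\right\|^2 - \left\|B^\alpha\right\|^2\right)
\end{aligned}
$$
Since each of the three terms on the right is invariant (that is, unchanged by a change of reference frame), their combination is invariant too, and so is the inner product.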
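
Both the polarization identity and the invariance of the inner product can be verified exactly for sample vectors. This is a sketch under stated assumptions: a boost along x with $\beta = 3/5$, $\gamma = 5/4$, and two hypothetical 4-vectors `A` and `B` with rational components.

```python
from fractions import Fraction

def boost_x(a, beta, gamma):
    """Boost a 4-vector (A0, A1, A2, A3) along x."""
    a0, a1, a2, a3 = a
    return (gamma * (a0 - beta * a1), gamma * (a1 - beta * a0), a2, a3)

def inner(a, b):
    """Scalar product -A0 B0 + A.B."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

def norm2(a):
    """Squared length -(A0)^2 + A.A."""
    return inner(a, a)

beta, gamma = Fraction(3, 5), Fraction(5, 4)   # exact: gamma^2 (1 - beta^2) = 1
A = (Fraction(2), Fraction(1), Fraction(0), Fraction(3))
B = (Fraction(-1), Fraction(4), Fraction(2), Fraction(1))
A_plus_B = tuple(p + q for p, q in zip(A, B))

polarization = Fraction(1, 2) * (norm2(A_plus_B) - norm2(A) - norm2(B))
before = inner(A, B)
after = inner(boost_x(A, beta, gamma), boost_x(B, beta, gamma))
print(polarization, before, after)   # 9 9 9
```

The inner product equals the combination of three squared lengths, and it takes the same value before and after the boost.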
More generally, if we write the general form of a Lorentz transformation as
$$x'^\alpha = \sum_{\beta=0}^{3} M^\alpha{}_\beta\, x^\beta$$
then a 4-vector is any set of four functions $A^\alpha$ which transform as
$$A'^\alpha = \sum_{\beta=0}^{3} M^\alpha{}_\beta\, A^\beta$$

3.2 Causality
In graphing spacetime, time is generally taken as the vertical axis. Points in spacetime are called events and denoted $P(t, \mathbf{x})$. The invariant separation between two events $P(t_1, \mathbf{x}_1)$, $P(t_2, \mathbf{x}_2)$ is given by the invariant interval
$$s^2 = \left|\mathbf{x}_1 - \mathbf{x}_2\right|^2 - c^2 (t_1 - t_2)^2$$
or
$$c^2\tau^2 = c^2 (t_1 - t_2)^2 - \left|\mathbf{x}_1 - \mathbf{x}_2\right|^2$$
whichever is positive. When $s^2 > 0$ the separation is called spacelike and when $\tau^2 > 0$ the separation is called timelike. When $s^2 = c^2\tau^2 = 0$, the separation of the two events is called lightlike or null.
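
The three cases can be summarized in a small helper. This is a minimal sketch with units where $c = 1$ by default; `classify` is an illustrative name, and the exact test for zero is only reliable here because the sample coordinates are exactly representable numbers.

```python
def classify(event1, event2, c=1.0):
    """Classify the separation of two events given as (t, x, y, z)."""
    dt = event1[0] - event2[0]
    dx2 = sum((p - q) ** 2 for p, q in zip(event1[1:], event2[1:]))
    s2 = dx2 - (c * dt) ** 2   # s^2 = |x1 - x2|^2 - c^2 (t1 - t2)^2
    if s2 > 0:
        return "spacelike"
    if s2 < 0:
        return "timelike"
    return "lightlike"

origin = (0, 0, 0, 0)
print(classify(origin, (2, 1, 0, 0)))   # timelike
print(classify(origin, (1, 3, 0, 0)))   # spacelike
print(classify(origin, (5, 3, 4, 0)))   # lightlike
```

Because $s^2$ is invariant, every inertial observer assigns the same label to a given pair of events.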
The minus sign between the time and space coordinates in the expression for the interval is responsible for causal relations in spacetime. Consider the lightlike lines from any fixed spacetime event, $P$. This set of null lines is called the light cone, and its position in spacetime is agreed on by all inertial observers. As a result, the region above these lines is agreed by all inertial observers to occur at later times (larger values of $t$) and constitutes the future of $P$. Events lying below the lowest set of lightlike lines are agreed to have earlier values of $t$, and this region is therefore called the past of $P$. The remaining points of spacetime are called elsewhere.
Any object travelling from $P$ at the speed of light must follow a null curve; objects travelling slower than the speed of light follow curves contained inside the future light cone. Moreover, the path of a particle travelling slower than the speed of light lies in the future of every event on its path. Such a path is called the world line of the particle and is said to be a timelike curve.
Suppose $P_1$ and $P_2$ are two events on the world line of a particle. Then there exists a frame of reference in which $P_1$ and $P_2$ occur at the same spatial location. In this frame of reference,
$$
\begin{aligned}
c^2\tau_{12}^2 &= c^2 (t_1 - t_2)^2 - \left|\mathbf{x}_1 - \mathbf{x}_2\right|^2 \\
&= c^2 (t_1 - t_2)^2
\end{aligned}
$$
so that in this particular frame of reference, the proper time interval equals the difference in time coordinates, $\tau_{12} = t_1 - t_2$.
Similarly, suppose two events $P_1$ and $P_2$ have spacelike separation
$$s_{12}^2 = \left|\mathbf{x}_1 - \mathbf{x}_2\right|^2 - c^2 (t_1 - t_2)^2 > 0$$
Then there exists a frame of reference in which the two events occur at the same value of $t$, and the proper interval becomes equal to the spatial separation of the events:
$$s_{12}^2 = \left|\mathbf{x}_1 - \mathbf{x}_2\right|^2$$

Now consider the world line of a particle. We know (and will demonstrate later) that such a particle always moves with speed less than $c$. The proper time along its world line is the physical time for the particle. Consider two infinitesimally separated points on the world line. Choose a frame of reference (any will do!) and specify the position of the particle in that frame of reference by $\mathbf{x}(t)$, so that the infinitesimal change in proper time is
$$
\begin{aligned}
d\tau &= \sqrt{dt^2 - \frac{1}{c^2}\, d\mathbf{x}^2} \\
&= dt\,\sqrt{1 - \frac{1}{c^2}\left(\frac{d\mathbf{x}}{dt}\right)^2}
\end{aligned}
$$
We may now integrate along the world line between any two events $A$, $B$, to find the elapsed proper time for the particle,
$$
\begin{aligned}
\tau_{AB} &= \int_{t_A}^{t_B} dt\,\sqrt{1 - \frac{1}{c^2}\left(\frac{d\mathbf{x}}{dt}\right)^2} \\
&= \int_{t_A}^{t_B} dt\,\sqrt{1 - \frac{v(t)^2}{c^2}}
\end{aligned}
$$
This shows that the elapsed time for physical processes depends on the motion.
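
The integral is easy to evaluate numerically for a given speed profile. The sketch below uses a simple midpoint rule in units where $c = 1$; `proper_time` is an illustrative name, and both speed profiles are hypothetical examples rather than cases taken from the notes.

```python
import math

def proper_time(v_of_t, t_a, t_b, c=1.0, n=10_000):
    """Midpoint-rule estimate of tau_AB = integral of sqrt(1 - v(t)^2/c^2) dt."""
    dt = (t_b - t_a) / n
    total = 0.0
    for i in range(n):
        t_mid = t_a + (i + 0.5) * dt
        total += math.sqrt(1.0 - (v_of_t(t_mid) / c) ** 2) * dt
    return total

# Constant speed 0.6c for 10 units of coordinate time: tau = 10 * sqrt(1 - 0.36) = 8.
tau_const = proper_time(lambda t: 0.6, 0.0, 10.0)

# Hypothetical trip whose speed rises and falls as v(t) = 0.8 sin(pi t / 10).
tau_trip = proper_time(lambda t: 0.8 * math.sin(math.pi * t / 10.0), 0.0, 10.0)
print(tau_const, tau_trip)
```

In both cases the elapsed proper time is less than the 10 units of coordinate time, because the integrand never exceeds 1 and is smaller whenever the particle is moving.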
