General Relativity
Benjamin Crowell
www.lightandmatter.com
Fullerton, California
Copyright © 2009 Benjamin Crowell
Contents
1 A geometrical theory of spacetime 11
1.1 Time and causality . . . . . . . . . . . . . . . . 12
1.2 Experimental tests of the nature of time . . . . . . . . 14
The Hafele-Keating experiment, 15.—Muons, 16.—Gravitational
red-shifts, 16.
1.3 Non-simultaneity and the maximum speed of cause and
effect . . . . . . . . . . . . . . . . . . . . . . . . 17
1.4 Ordered geometry. . . . . . . . . . . . . . . . . 18
1.5 The equivalence principle. . . . . . . . . . . . . . 20
Proportionality of inertial and gravitational mass, 21.—Geometrical
treatment of gravity, 21.—Eötvös experiments, 22.—The equiv-
alence principle, 23.—Gravitational red-shifts, 32.—The Pound-
Rebka experiment, 34.
Problems . . . . . . . . . . . . . . . . . . . . . . 38
3 Differential geometry 79
3.1 The affine parameter revisited, and parallel transport . . 80
The affine parameter in curved spacetime, 80.—Parallel transport,
81.
3.2 Models . . . . . . . . . . . . . . . . . . . . . 82
3.3 Intrinsic quantities . . . . . . . . . . . . . . . . . 87
Coordinate independence, 88.
3.4 The metric . . . . . . . . . . . . . . . . . . . . 90
The Euclidean metric, 91.—The Lorentz metric, 95.—Isometry, in-
ner products, and the Erlangen program, 96.—Einstein’s carousel,
98.
3.5 The metric in general relativity. . . . . . . . . . . . 103
The hole argument, 104.—A Machian paradox, 104.
3.6 Interpretation of coordinate independence. . . . . . . 105
Is coordinate independence obvious?, 105.—Is coordinate indepen-
dence trivial?, 106.—Coordinate independence as a choice of gauge,
107.
Problems . . . . . . . . . . . . . . . . . . . . . . 109
4 Tensors 111
4.1 Lorentz scalars . . . . . . . . . . . . . . . . . . 111
4.2 Four-vectors . . . . . . . . . . . . . . . . . . . 112
The velocity and acceleration four-vectors, 112.—The momentum
four-vector, 114.—The frequency four-vector and the relativistic
Doppler shift, 119.—A non-example: electric and magnetic fields,
121.—The electromagnetic potential four-vector, 122.
4.3 The tensor transformation laws . . . . . . . . . . . 123
4.4 Experimental tests . . . . . . . . . . . . . . . . 127
Universality of tensor behavior, 127.—Speed of light differing from
c, 127.—Degenerate matter, 128.
4.5 Conservation laws. . . . . . . . . . . . . . . . . 133
No general conservation laws, 133.—Conservation of angular mo-
mentum and frame dragging, 134.
4.6 Things that aren’t quite tensors . . . . . . . . . . . 136
Area, volume, and tensor densities, 136.—The Levi-Civita symbol,
138.—Angular momentum, 139.
Problems . . . . . . . . . . . . . . . . . . . . . . 141
5 Curvature 143
5.1 Tidal curvature versus curvature caused by local sources 144
5.2 The stress-energy tensor . . . . . . . . . . . . . . 145
5.3 Curvature in two spacelike dimensions . . . . . . . . 146
5.4 Curvature tensors . . . . . . . . . . . . . . . . . 150
5.5 Some order-of-magnitude estimates . . . . . . . . . 153
The geodetic effect, 153.—Deflection of light rays, 154.
5.6 The covariant derivative . . . . . . . . . . . . . . 155
The covariant derivative in electromagnetism, 156.—The covariant
derivative in general relativity, 157.
5.7 The geodesic equation . . . . . . . . . . . . . . . 161
Characterization of the geodesic, 161.—Covariant derivative with
respect to a parameter, 161.—The geodesic equation, 162.—Uniqueness,
162.
5.8 Torsion . . . . . . . . . . . . . . . . . . . . . 163
Are scalars path-dependent?, 163.—The torsion tensor, 166.—Experimental
searches for torsion, 167.
5.9 From metric to curvature . . . . . . . . . . . . . . 169
Finding Γ given g, 169.—Numerical solution of the geodesic equation,
170.—The Riemann tensor in terms of the Christoffel symbols,
172.—Some general ideas about gauge, 172.
5.10 Manifolds . . . . . . . . . . . . . . . . . . . . 175
Why we need manifolds, 175.—Topological definition of a manifold,
176.—Local-coordinate definition of a manifold, 178.
Problems . . . . . . . . . . . . . . . . . . . . . . 180
paradox, 185.—Radiation from event horizons, 186.
6.2 The Schwarzschild metric . . . . . . . . . . . . . 187
The zero-mass case, 188.—Geometrized units, 190.—A large-r limit,
190.—The complete solution, 191.—Geodetic effect, 194.—Orbits,
197.—Deflection of light, 202.
6.3 Black holes. . . . . . . . . . . . . . . . . . . . 205
Singularities, 205.—Event horizon, 206.—Expected formation, 207.—Observational
evidence, 207.—Singularities and cosmic censorship, 209.—Hawking
radiation, 210.—Black holes in d dimensions, 212.
6.4 Degenerate solutions . . . . . . . . . . . . . . . 213
Problems . . . . . . . . . . . . . . . . . . . . . . 217
7 Symmetries 219
7.1 Killing vectors. . . . . . . . . . . . . . . . . . . 219
Conservation laws, 223.
7.2 Spherical symmetry . . . . . . . . . . . . . . . . 224
7.3 Static and stationary spacetimes. . . . . . . . . . . 226
Stationary spacetimes, 226.—Isolated systems, 227.—A station-
ary field with no other symmetries, 227.—A stationary field with
additional symmetries, 228.—Static spacetimes, 229.—Birkhoff’s
theorem, 229.—The gravitational potential, 231.
7.4 The uniform gravitational field revisited . . . . . . . . 233
Closed timelike curves, 236.
Problems . . . . . . . . . . . . . . . . . . . . . . 238
8 Sources 239
8.1 Sources in general relativity . . . . . . . . . . . . . 239
Point sources in a background-independent theory, 239.—The Ein-
stein field equation, 240.—Energy conditions, 247.—The cosmolog-
ical constant, 253.
8.2 Cosmological solutions. . . . . . . . . . . . . . . 256
Evidence for the finite age of the universe, 256.—Evidence for
expansion of the universe, 257.—Evidence for homogeneity and
isotropy, 258.—The FRW cosmologies, 259.—A singularity at the
Big Bang, 263.—Observability of expansion, 266.—The vacuum-
dominated solution, 274.—The matter-dominated solution, 278.—The
radiation-dominated solution, 281.—Observation, 281.
8.3 Mach’s principle revisited . . . . . . . . . . . . . . 283
The Brans-Dicke theory, 283.—Predictions of the Brans-Dicke theory,
287.—Hints of empirical support, 287.—Mach’s principle is false.,
288.
Problems . . . . . . . . . . . . . . . . . . . . . . 290
Problems . . . . . . . . . . . . . . . . . . . . . . 305
Chapter 1
A geometrical theory of spacetime
“I always get a slight brain-shiver, now [that] space and time appear
conglomerated together in a gray, miserable chaos.” – Sommerfeld
This is a book about general relativity, at a level that is meant
to be accessible to advanced undergraduates.
This is mainly a book about general relativity, not special rel-
ativity. I’ve heard the sentiment expressed that books on special
relativity generally do a lousy job on special relativity, compared to
books on general relativity. This is undoubtedly true, for someone
who has already learned special relativity but wants to
unlearn the parts that are completely wrong in the broader context
of general relativity. For someone who has not already learned spe-
cial relativity, I strongly recommend mastering it first, from a book
such as Taylor and Wheeler’s Spacetime Physics. Even an advanced
student may be able to learn a great deal from a masterfully writ-
ten, nonmathematical treatment at an even lower level, such as the
ones in Hewitt’s Conceptual Physics or the inexpensive paperback
by Gardner, Relativity Simply Explained.
In the back of this book I’ve included excerpts from three papers
by Einstein — two on special relativity and one on general relativity.
They can be read before, after, or along with this book. There are
footnotes in the papers and in the main text linking their content
with each other.
I should reveal at the outset that I am not a professional rela-
tivist. My field of research was nonrelativistic nuclear physics until
I became a community college physics instructor. I can only hope
that my pedagogical experience will compensate to some extent for
my shallow background, and that readers who find mistakes will be
kind enough to let me know about them using the contact information
provided at http://www.lightandmatter.com/area4author.html.
1.1 Time and causality
Updating Plato’s allegory of the cave, imagine two super-intelligent
twins, Alice and Betty. They’re raised entirely by a robotic tutor
on a sealed space station, with no access to the outside world. The
robot, in accord with the latest fad in education, is programmed to
encourage them to build up a picture of all the laws of physics based
on their own experiments, without a textbook to tell them the right
answers. Putting yourself in the twins’ shoes, imagine giving up
all your preconceived ideas about space and time, which may turn
out according to relativity to be completely wrong, or perhaps only
approximations that are valid under certain circumstances.
Causality is one thing the twins will notice. Certain events re-
sult in other events, forming a network of cause and effect. One
general rule they infer from their observations is that there is an
unambiguously defined notion of betweenness: if Alice observes that
event 1 causes event 2, and then 2 causes 3, Betty always agrees that
2 lies between 1 and 3 in the chain of causality. They find that this
agreement holds regardless of whether one twin is standing on her
head (i.e., it’s invariant under rotation), and regardless of whether
one twin is sitting on the couch while the other is zooming around
the living room in circles on her nuclear fusion scooter (i.e., it’s also
invariant with respect to different states of motion).
You may have heard that relativity is a theory that can be inter-
preted using non-Euclidean geometry. The invariance of between-
ness is a basic geometrical property that is shared by both Euclidean
and non-Euclidean geometry. We say that they are both ordered
geometries. With this geometrical interpretation in mind, it will
be useful to think of events not as actual notable occurrences but
merely as an ambient sprinkling of points at which things could hap-
pen. For example, if Alice and Betty are eating dinner, Alice could
choose to throw her mashed potatoes at Betty. Even if she refrains,
there was the potential for a causal linkage between her dinner and
Betty’s forehead.
Betweenness is very weak. Alice and Betty may also make a
number of conjectures that would say much more about causality.
For example: (i) that the universe’s entire network of causality is
connected, rather than being broken up into separate parts; (ii) that
the events are globally ordered, so that for any two events 1 and 2,
either 1 could cause 2 or 2 could cause 1, but not both; (iii) not only
are the events ordered, but the ordering can be modeled by sorting
the events out along a line, the time axis, and assigning a number t,
time, to each event. To see what these conjectures would entail, let’s
discuss a few examples that may draw on knowledge from outside
Alice and Betty’s experiences.
Example: According to the Big Bang theory, it seems likely that
the network is connected, since all events would presumably connect
1 The possibility of having time come back again to the same point is often
referred to by physicists as a closed timelike curve (CTC). Kip Thorne, in his
popularization Black Holes and Time Warps, recalls experiencing some anxiety
after publishing a paper with “Time Machines” in the title, and then being
embarrassed when a later paper on the topic was picked up by the National
Enquirer with the headline PHYSICISTS PROVE TIME MACHINES EXIST.
“CTC” is safer because nobody but physicists knows what it means.
2 This point is revisited in section 6.1.
3 Hafele and Keating, Science, 177 (1972), 168
4 These differences in velocity are not simply something that can be eliminated
by choosing a different frame of reference, because the clocks’ motion isn’t in
a straight line. The clocks back in Washington, for example, have a certain
acceleration toward the earth’s axis, which is different from the accelerations
experienced by the traveling clocks.
1.2.2 Muons
Although the Hafele-Keating experiment is impressively direct,
it was not the first verification of relativistic effects on time, it did
not completely separate the kinematic and gravitational effects, and
the effect was small. An early experiment demonstrating a large and
purely kinematic effect was performed in 1941 by Rossi and Hall,
who detected cosmic-ray muons at the summit and base of Mount
Washington in New Hampshire. The muon has a mean lifetime of
2.2 µs, and the time of flight between the top and bottom of the
mountain (about 2 km for muons arriving along a vertical path)
at nearly the speed of light was about 7 µs, so in the absence of
relativistic effects, the flux at the bottom of the mountain should
have been smaller than the flux at the top by about an order of
magnitude. The observed ratio was much smaller, indicating that
the “clock” constituted by nuclear decay processes was dramatically
slowed down by the motion of the muons.
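The order-of-magnitude claim above can be checked with a few lines of arithmetic. The following is only a rough sketch: it uses the 2.2 µs lifetime and 2 km height quoted in the text, assumes the muons travel at essentially the speed of light, and treats the decay as a simple exponential. The value gamma = 10 is a made-up illustrative number, not one taken from the Rossi-Hall data.

```python
import math

C = 299_792_458.0          # speed of light, m/s
MEAN_LIFETIME_S = 2.2e-6   # muon mean lifetime at rest, as quoted in the text
HEIGHT_M = 2000.0          # approximate height difference, as quoted in the text


def surviving_fraction(gamma=1.0, height_m=HEIGHT_M, lifetime_s=MEAN_LIFETIME_S):
    """Fraction of muons that survive the trip down the mountain.

    Assumption: the muons move at essentially c, so the flight time in the
    mountain's frame is height / c. The time that elapses on the muon's
    own "clock" is shorter by the factor gamma.
    """
    flight_time = height_m / C
    proper_time = flight_time / gamma
    return math.exp(-proper_time / lifetime_s)


no_dilation = surviving_fraction(gamma=1.0)
with_dilation = surviving_fraction(gamma=10.0)  # gamma = 10 is illustrative only

print(f"flight time: {HEIGHT_M / C * 1e6:.1f} microseconds")
print(f"surviving fraction without time dilation: {no_dilation:.3f}")
print(f"surviving fraction with gamma = 10:       {with_dilation:.3f}")
```

Without time dilation only a few percent of the muons should reach the bottom, which is the "about an order of magnitude" reduction described above. Even a modest gamma brings the surviving fraction much closer to one.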
gen maser clock which was used to control the frequency of a radio
signal. The radio signal was received on the ground, the nonrela-
tivistic Doppler shift was subtracted out, and the residual blueshift
was interpreted as the gravitational effect on time, matching
the relativistic prediction to an accuracy of 0.01%.
1.3 Non-simultaneity and the maximum speed of cause and effect
and that all inertial frames of reference are equally valid. The best
that they can do is to compare clocks once Betty returns, and verify
that the net result of the trip was to make Betty’s clock run more
slowly on the average.
Alice and Betty can never satisfy their curiosity about exactly
when during Betty’s voyage the discrepancies accumulated or at
what rate. This is information that they can never obtain, but
they could obtain it if they had a system for communicating in-
stantaneously. We conclude that instantaneous communication is
impossible. There must be some maximum speed at which signals
can propagate — or, more generally, a maximum speed at which
cause and effect can propagate — and this speed must for example
be greater than or equal to the speed at which radio waves propa-
gate. It is also evident from these considerations that simultaneity
itself cannot be a meaningful concept in relativity.
O1-O2 express the same ideas as Euclid’s E1-E2. Not all lines
in the system will correspond physically to chains of causality; we
could have a line segment that describes a snapshot of a steel chain,
and O3-O4 then say that the order of the links is well defined. But
O3 and O4 also have clear physical significance for lines describing
causality. O3 forbids time travel paradoxes, like going back in time
and killing our own grandmother as a child; figure a illustrates why a
violation of O3 is referred to as a closed timelike curve. O4 says that
events are guaranteed to have a well-defined cause-and-effect order
only if they lie on the same line. This is completely different from
the attitude expressed in Newton’s famous statement: “Absolute,
true and mathematical time, of itself, and from its own nature flows
equably without regard to anything external . . . ”
If you’re dismayed by the austerity of a system of geometry with-
out any notion of measurement, you may be more appalled to learn
that even a system as weak as ordered geometry makes some state-
ments that are too strong to be completely correct as a foundation
for relativity. For example, if an observer falls into a black hole, at
some point he will reach a central point of infinite density, called a
singularity. At this point, his chain of cause and effect terminates,
violating O2. It is also an open question whether O3’s prohibition
on time-loops actually holds in general relativity; this is Stephen
Hawking’s playfully named chronology protection conjecture. We’ll
also see that in general relativity O1 is almost always true, but there
are exceptions.
find a much more lengthy list of axioms than the ones presented here. The
axioms I’m omitting take care of details like making sure that there are more
than two points in the universe, and that curves can’t cut through one another
without intersecting. The classic, beautifully written book on these topics is
H.S.M. Coxeter’s Introduction to Geometry, which is “introductory” in the sense
that it’s the kind of book a college math major might use in a first upper-division
course in geometry.
13 V.B. Braginskii and V.I. Panov, Soviet Physics JETP 34, 463 (1972).
14 This statement of the equivalence principle is summarized, along with some
other forms of it to be encountered later, in the back of the book on page 347.
Lorentz frames
The conclusion is that we need to abandon the entire distinction
between Newton-style inertial and noninertial frames of reference.
The best that we can do is to single out certain frames of reference
defined by the motion of objects that are not subject to any non-
gravitational forces. A falling rock defines such a frame of reference.
In this frame, the rock is at rest, and the ground is accelerating. The
rock’s world-line is a straight line of constant x = 0 and varying t.
Such a free-falling frame of reference is called a Lorentz frame. The
frame of reference defined by a rock sitting on a table is an inertial
frame of reference according to the Newtonian view, but it is not a
Lorentz frame.
In Newtonian physics, inertial frames are preferable because they
make motion simple: objects with no forces acting on them move
along straight world-lines. Similarly, Lorentz frames occupy a privi-
leged position in general relativity because they make motion simple:
objects move along “straight” world-lines if they have no nongravi-
tational forces acting on them.
bullets along three Cartesian axes and tracing their paths, which
she defines to be linear.
We’ve gone to elaborate lengths to show that we can really de-
termine, without reference to any external reference frame, that
the chamber is not being acted on by any nongravitational forces,
so that we know it is free-falling. In addition, we also want the
observer to be able to tell whether the chamber is rotating. She
could look out through a porthole at the stars, but that would be
missing the whole point, which is to show that without reference to
any other object, we can determine whether a particular frame is a
Lorentz frame. One way to do this would be to watch for precession
of a gyroscope. Or, without having to resort to additional appara-
tus, the observer can check whether the paths traced by the bullets
change when she changes the muzzle velocity. If they do, then she
infers that there are velocity-dependent Coriolis forces, so she must
be rotating. She can then use flywheels to get rid of the rotation,
and redo the calibration.
After the initial calibration, she can always tell whether or not
she is in a Lorentz frame. She simply has to fire the bullets, and see
whether or not they follow the precalibrated paths. For example,
she can detect that the frame has become non-Lorentzian if the
chamber is rotated, allowed to rest on the ground, or accelerated by
a rocket engine.
It may seem that the detailed construction of this elaborate
thought-experiment does nothing more than confirm something ob-
vious. It is worth pointing out, then, that we don’t really know
whether it works or not. It works in general relativity, but there are
j / Two local Lorentz frames.
A second way of stating the equivalence principle is that it is
always possible to define a local Lorentz frame in a particular
neighborhood of spacetime.16 It is not possible to do so on a universal
basis.
The locality of Lorentz frames can be understood in the anal-
ogy of the string stretched across the globe. We don’t notice the
curvature of the Earth’s surface in everyday life because the radius
of curvature is thousands of kilometers. On a map of LA, we don’t
notice any curvature, nor do we detect it on a map of Mumbai, but
it is not possible to make a flat map that includes both LA and
Mumbai without seeing severe distortions.
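To get a feel for the numbers in the map analogy, here is a rough sketch that compares the distance between two points measured along the Earth's surface with the straight-line chord between them. This is only a crude proxy for map distortion, not a model of any real map projection. The Earth radius, the 50 km city scale, and the 14,000 km figure for the LA-Mumbai separation are round values assumed for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of the Earth


def flat_shortfall(arc_km, radius_km=EARTH_RADIUS_KM):
    """Fraction by which the straight chord falls short of the surface arc.

    For an arc of length s on a sphere of radius R, the chord joining the
    endpoints has length 2 R sin(s / (2 R)). The result is close to zero
    when the region is small compared to R.
    """
    chord_km = 2.0 * radius_km * math.sin(arc_km / (2.0 * radius_km))
    return 1.0 - chord_km / arc_km


city = flat_shortfall(50.0)                # a map of one city (assumed scale)
intercontinental = flat_shortfall(14000.0) # rough LA-Mumbai separation (assumed)

print(f"city-sized region:   {city:.2e}")
print(f"LA to Mumbai:        {intercontinental:.2f}")
```

On the scale of a city the discrepancy is a few parts per million, while over the intercontinental distance it is a sizable fraction of the whole, which is the sense in which flatness is only a local approximation.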
Terminology
The meanings of words evolve over time, and since relativity is
now a century old, there has been some confusing semantic drift
in its nomenclature. This applies both to “inertial frame” and to
“special relativity.”
Early formulations of general relativity never refer to “inertial
frames,” “Lorentz frames,” or anything else of that flavor. The very
first topic in Einstein’s original systematic presentation of the
theory17 is an example (figure k) involving two planets, the purpose
of which is to convince the reader that all frames of reference are
created equal, and that any attempt to make some of them into
second-class citizens is invidious. Other treatments of general
relativity from the same era follow Einstein’s lead.18 The trouble is
that this example is more a statement of Einstein’s aspirations for
his theory than an accurate depiction of the physics that it
actually implies. General relativity really does allow an unambiguous
distinction to be made between Lorentz frames and non-Lorentz
frames, as described on p. 26. Einstein’s statement should have
been weaker: the laws of physics (such as the Einstein field
equation, p. 240) are the same in all frames (Lorentz or non-Lorentz).
This is different from the situation in Newtonian mechanics and
special relativity, where the laws of physics take on their simplest form
only in Newton-inertial frames.
k / One planet rotates about its axis and the other does not. As discussed in more detail on p. 104, Einstein believed that general relativity was even more radically egalitarian about frames of reference than it really is. He thought that if the planets were alone in an otherwise empty universe, there would be no way to tell which planet was really rotating and which was not, so that B’s tidal bulge would have to disappear. There would be no way to tell which planet’s surface was a Lorentz frame.
16 This statement of the equivalence principle is summarized, along with some
other forms of it, in the back of the book on page 347.
17 Einstein, “The Foundation of the General Theory of Relativity,” 1916. An
excerpt is given on p. 321.
18 Two that I believe were relatively influential are Born’s 1920 Einstein’s The-
Because Einstein didn’t want to make distinctions between frames,
we ended up being saddled with inconvenient terminology for them.
The least verbally awkward choice is to hijack the term “inertial,”
redefining it from its Newtonian meaning. We then say that the
Earth’s surface is not an inertial frame, in the context of general
relativity, whereas in the Newtonian context it is an inertial frame
to a very good approximation. This usage is fairly standard,19 but
would have made Newton confused and Einstein unhappy. If we
follow this usage, then we may sometimes have to say “Newtonian-
inertial” or “Einstein-inertial.” A more awkward, but also more
precise, term is “Lorentz frame,” as used in this book; this seems to
be widely understood.20
The distinction between special and general relativity has under-
gone a similar shift over the decades. Einstein originally defined the
distinction in terms of the admissibility of accelerated frames of ref-
erence. This, however, puts us in the absurd position of saying that
special relativity, which is supposed to be a generalization of Newto-
nian mechanics, cannot handle accelerated frames of reference in the
same way that Newtonian mechanics can. In fact both Newtonian
mechanics and special relativity treat Newtonian-noninertial frames
of reference in the same way: by modifying the laws of physics so
that they do not take on their most simple form (e.g., violating New-
ton’s third law), while retaining the ability to change coordinates
back to a preferred frame in which the simpler laws apply. It was
realized fairly early on21 that the important distinction was between
special relativity as a theory of flat spacetime, and general relativity
as a theory that described gravity in terms of curved spacetime. All
relativists writing since about 1950 seem to be in agreement on this
more modern redefinition of the terms.22
Chiao’s paradox
The remainder of this subsection deals with the subtle ques-
tion of whether and how the equivalence principle can be applied to
charged particles. You may wish to skip it on a first reading. The
short answer is that using the equivalence principle to make con-
clusions about charged particles is like the attempts by slaveholders
and abolitionists in the 19th century U.S. to support their positions
based on the Bible: you can probably prove whichever conclusion
was the one you set out to prove.
l / Chiao’s paradox: a charged particle and a neutral particle are in orbit around the earth. Will the charged particle radiate, violating the equivalence principle?
The equivalence principle is not a single, simple, mathematically
well defined statement.24 As an example of an ambiguity that
is still somewhat controversial, 90 years after Einstein first proposed
the principle, consider the question of whether or not it applies to
charged particles. Raymond Chiao25 proposes the following thought
experiment, which I’ll refer to as Chiao’s paradox. Let a neutral
particle and a charged particle be set, side by side, in orbit around the
earth. Assume (unrealistically) that the space around the earth has
no electric or magnetic field. If the equivalence principle applies
regardless of charge, then these two particles must go on orbiting
amicably, side by side. But then we have a violation of conservation
of energy, since the charged particle, which is accelerating, will radi-
ate electromagnetic waves (with very low frequency and amplitude).
It seems as though the particle’s orbit must decay.
The resolution of the paradox, as demonstrated by hairy cal-
culations26 is interesting because it exemplifies the local nature of
very difficult to obtain now. A more recent treatment by Grøn and Næss is
accessible at arxiv.org/abs/0806.0464v1. A full exposition of the techniques
is given by Poisson, “The Motion of Point Particles in Curved Spacetime,”
www.livingreviews.org/lrr-2004-6.
27 Because relativity describes gravitational fields in terms of curvature of
spacetime, the Euclidean relationship between the radius and circumference of
a circle fails here. The r coordinate should be understood here not as the radius
measured from the center but as the circumference divided by 2π.
Chapter 2
Geometry of flat spacetime
2.1 Affine properties of Lorentz geometry
The geometrical treatment of space, time, and gravity only requires
as its basis the equivalence of inertial and gravitational mass. That
equivalence holds for Newtonian gravity, so it is indeed possible
to redo Newtonian gravity as a theory of curved spacetime. This
project was carried out by the French mathematician Cartan, as
summarized very readably in section 17.5 of The Road to Reality by
Roger Penrose. The geometry of the local reference frames is very
simple. The three space dimensions have an approximately Eu-
clidean geometry, and the time dimension is entirely separate from
them. This is referred to as a Euclidean spacetime with 3+1 dimen-
sions. Although the outlook is radically different from Newton’s, all
of the predictions of experimental results are the same.
The experiments in section 1.2 show, however, that there are
real, experimentally verifiable violations of Newton’s laws. In New-
tonian physics, time is supposed to flow at the same rate everywhere,
which we have found to be false. The flow of time is actually depen-
dent on the observer’s state of motion through space, which shows
that the space and time dimensions are intertwined somehow. The
geometry of the local frames in relativity therefore must not be as
simple as Euclidean 3+1. Their actual geometry was implicit in
Einstein’s 1905 paper on special relativity, and had already been
developed mathematically, without the full physical interpretation,
by Hendrik Lorentz. Lorentz’s and Einstein’s work were explicitly
connected by Minkowski in 1907, so a Lorentz frame is often referred
to as a Minkowski frame.
To describe this Lorentz geometry, we need to add more struc-
ture on top of the axioms O1-O4 of ordered geometry, but it will not
be the additional Euclidean structure of E3-E4, it will be something
different.
a / Hendrik Antoon Lorentz (1853-1928)
b / Objects are released at rest at spacetime events P and Q. They remain at rest, and their world-lines define a notion of parallelism.
To see how to proceed, let’s consider the bare minimum of geometrical
apparatus that would be necessary in order to set up frames
of reference. The following argument shows that the main missing
ingredient is merely a concept of parallelism. We only expect
Lorentz frames to be local, but we do need them to be big enough
to cover at least some amount of spacetime. If Betty does an Eötvös
experiment by releasing a pencil and a lead ball side by side, she
is essentially trying to release them at the same event A, so that
she can observe them later and determine whether their world-lines
stay right on top of one another at point B. That was all that was
required for the Eötvös experiment, but in order to set up a Lorentz
frame we need to start dealing with objects that are not right on top
of one another. Suppose we release two lead balls in two different
locations, at rest relative to one another. This could be the first step
toward adding measurement to our geometry, since the balls mark
two points in space that are separated by a certain distance, like
two marks on a ruler, or the goals at the ends of a soccer field.
Although the balls are separated by some finite distance, they are still
close enough together so that if there is a gravitational field in the
area, it is very nearly the same in both locations, and we expect the
distance defined by the gap between them to stay the same. Since
they are both subject only to gravitational forces, their world-lines
are by definition straight lines (geodesics). The goal here is to end
up with some kind of coordinate grid defining a (t, x) plane, and on
such a grid, the two balls’ world-lines are vertical lines. If we release
them at events P and Q, then observe them again later at R and
S, PQRS should form a rectangle on such a plot. In the figure, the
irregularly spaced tick marks along the edges of the rectangle are
meant to suggest that although ordered geometry provides us with a
well-defined ordering along these lines, we have not yet constructed
a complete system of measurement.
c / There is no well-defined angular measure in this geometry. In a different frame of reference, the angles are not right angles.
The depiction of PQSR as a rectangle, with right angles at its
vertices, might lead us to believe that our geometry would have
something like the concept of angular measure referred to in Euclid’s
E4, equality of right angles. But this is too naive even for the
Euclidean 3+1 spacetime of Newton and Galileo. Suppose we switch
to a frame that is moving relative to the first one, so that the balls
are not at rest. In the Euclidean spacetime, time is absolute, so
events P and Q would remain simultaneous, and so would R and
S; the top and bottom edges PQ and RS would remain horizontal
on the plot, but the balls’ world-lines PR and QS would become
slanted. The result would be a parallelogram. Since observers in
different states of motion do not agree on what constitutes a right
angle, the concept of angular measure is clearly not going to be
useful here. Similarly, if Euclid had observed that a right angle
drawn on a piece of paper no longer appeared to be a right angle
when the paper was turned around, he would never have decided
that angular measure was important enough to be enshrined in E4.
d / Simultaneity is not well defined. The constant-time lines PQ and
RS from figure b are not constant-time lines when observed in a
different frame of reference.
In the context of relativity, where time is not absolute, there is
not even any reason to believe that different observers must agree on
the simultaneity of PQ and RS. Our observation that time flows
differently depending on the observer’s state of motion tells us
specifically to expect this not to happen when we switch to a frame
moving relative to the first one. Thus in general we expect that PQRS will be
Define affine parameters t and x for time and position, and con-
struct a (t, x) plane. Although affine geometry treats all directions
symmetrically, we’re going beyond the affine aspects of the space,
and t does play a different role than x here, as shown, for example,
by L4 and L5.
In the (t, x) plane, consider a rectangle with one corner at the
origin O. We can imagine its right and left edges as representing the
3
These facts are summarized for convenience on page 346 in the back of the
book.
4
For the experimental evidence on isotropy, see
http://www.edu-observatory.org/physics-faq/Relativity/SR/experiments.html#Tests_of_isotropy_of_space.
I There is no flow through the top and bottom. This case cor-
responds to Galilean relativity, in which the rectangle shears
horizontally under a boost, and simultaneity is preserved, vi-
olating L5.
II Area flows downward at both the top and the bottom. The
flow is clockwise at both the positive t axis and the positive
x axis. This makes it plausible that the flow is clockwise ev-
erywhere in the (t, x) plane, and the proof is straightforward.5
As v increases, a particular element of area flows continually
clockwise. This violates L4, because two events with a cause
and effect relationship could be time-reversed by a Lorentz
boost.
III Area flows upward at both the top and the bottom.
Only case III is possible, and given case III, there must be at least
one point P in the first quadrant where area flows neither clockwise
nor counterclockwise.6 The boost simply increases P’s distance from
the origin by some factor. By the linearity of the transformation,
the entire line running through O and P is simply rescaled. This
special line’s inverse slope, which has units of velocity, apparently
has some special significance, so we give it a name, c. We’ll see later
that c is the maximum speed of cause and effect whose existence
we inferred in section 1.3. Any world-line with a velocity equal to
c retains the same velocity as judged by moving observers, and by
isotropy the same must be true for −c.
For convenience, let’s adopt time and space units in which c = 1,
and let the original rectangle be a unit square. The upper right
tip of the parallelogram must slide along the line through the origin
with slope +1, and similarly the parallelogram’s other diagonal must
have a slope of −1. Since these diagonals bisected one another on
5
Proof: By linearity of L, the flow is clockwise at the negative axes as well.
Also by linearity, the handedness of the flow is the same at all points on a ray
extending out from the origin in the direction θ. If the flow were counterclockwise
somewhere, then it would have to switch handedness twice in that quadrant, at
θ1 and θ2 . But by writing out the vector cross product r × dr, where dr is the
displacement caused by L(dv), we find that it depends on sin(2θ +δ), which does
not oscillate rapidly enough to have two zeroes in the same quadrant.
6
This follows from the fact that, as shown in the preceding footnote, the
handedness of the flow depends only on θ.
Example: 5
Let the intersection of the parallelogram’s two diagonals be T in
the original (rest) frame, and T′ in the Lorentz-boosted frame. An
observer at T in the original frame simultaneously detects the
passing by of the two flashes of light emitted at P and Q, and
since she is positioned at the midpoint of the diagram in space,
she infers that P and Q were simultaneous. Since the arrival of
both flashes of light at the same point in spacetime is a concrete
event, an observer in the Lorentz-boosted frame must agree on
their simultaneous arrival. (Simultaneity is well defined as long
as no spatial separation is involved.) But the distances traveled
by the two flashes in the boosted frame are unequal, and since
the speed of light is the same in all cases, the boosted observer
infers that they were not emitted simultaneously.
e / Example 5. Flashes of light travel along P′T′ and Q′T′. The
observer in this frame of reference judges them to have been emitted
at different times, and to have traveled different distances.
Example: 6
A different kind of symmetry is the symmetry between observers.
If observer A says observer B’s time is slow, shouldn’t B say that
A’s time is fast? This is what would happen if B took a pill that
slowed down all his thought processes: to him, the rest of the
world would seem faster than normal. But this can’t be correct
for Lorentz boosts, because it would introduce an asymmetry be-
tween observers. There is no preferred, “correct” frame corre-
sponding to the observer who didn’t take a pill; either observer
can correctly consider himself to be the one who is at rest. It may
seem paradoxical that each observer could think that the other
was the slow one, but the paradox evaporates when we consider
the methods available to A and B for resolving the controversy.
They can either (1) send signals back and forth, or (2) get to-
gether and compare clocks in person. Signaling doesn’t estab-
lish one observer as correct and one as incorrect, because as
we’ll see in the following section, there is a limit to the speed of
propagation of signals; either observer ends up being able to ex-
plain the other observer’s observations by taking into account the
finite and changing time required for signals to propagate. Meet-
ing in person requires one or both observers to accelerate, as in
the original story of Alice and Betty, and then we are no longer
dealing with pure Lorentz frames, which are described by non-
accelerating observers.
8
N. Ashby, “Relativity in the Global Positioning System,”
http://www.livingreviews.org/lrr-2003-1
Time dilation in the Pound-Rebka experiment Example: 10
In the description of the Pound-Rebka experiment on page 34, I
postponed the quantitative estimation of the frequency shift due
to temperature. Classically, one expects only a broadening of
the line, since the Doppler shift is proportional to $v_\parallel/c$, where
$v_\parallel$, the component of the emitting atom’s velocity along the line
of sight, averages to zero. But relativity tells us to expect that if
the emitting atom is moving, its time will flow more slowly, so the
frequency of the light it emits will also be systematically shifted
downward. This frequency shift should increase with temperature.
In other words, the Pound-Rebka experiment was designed
as a test of general relativity (the equivalence principle), but this
special-relativistic effect is just as strong as the general-relativistic
one, and needed to be accounted for carefully.
h / The change in the frequency of x-ray photons emitted by ⁵⁷Fe
as a function of temperature, drawn after Pound and Rebka
(1960). Dots are experimental measurements. The solid curve
is Pound and Rebka’s theoretical calculation using the Debye theory
of the lattice vibrations with a Debye temperature of 420 degrees C.
The dashed line is one with the slope calculated in the text using a
simplified treatment of the thermodynamics. There is an arbitrary
vertical offset in the experimental data, as well as the theoretical curves.
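As a rough numerical sketch of the size of this effect, we can assume classical equipartition, (1/2)m⟨v²⟩ = (3/2)kT, and expand the time-dilation factor to lowest order in v²/c², so that the fractional frequency shift is about −3kT/(2mc²). Equipartition is an assumption of this sketch (it ignores the Debye-theory corrections mentioned in the caption), and the function and constant names are illustrative only.

```python
# Sketch: slope of the thermal time-dilation shift for an emitting atom.
# Assumption: classical equipartition, (1/2) m <v^2> = (3/2) k T.
K_BOLTZMANN = 1.380649e-23         # J/K
ATOMIC_MASS_UNIT = 1.66053907e-27  # kg
SPEED_OF_LIGHT = 2.99792458e8      # m/s

def fractional_shift_per_kelvin(mass_number):
    """Return d(delta f / f)/dT, using gamma ~ 1 + v^2/(2 c^2)."""
    mass = mass_number * ATOMIC_MASS_UNIT  # approximate atomic mass
    return -3.0 * K_BOLTZMANN / (2.0 * mass * SPEED_OF_LIGHT**2)

# Hypothetical usage: iron-57, as in the Pound-Rebka experiment.
slope_fe57 = fractional_shift_per_kelvin(57)
```

Under these assumptions the slope comes out to a few parts in 10^15 per kelvin, so a temperature difference of one degree between source and absorber matters at the precision of the experiment.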
9
Bailey et al., Nucl. Phys. B150 (1979) 1
10
Phys. Rev. Lett. 4 (1960) 337
11
Phys. Rev. Lett. 4 (1960) 274
d / Example 11. Matter is lifted out of a Newtonian black hole with a
bucket. The dashed line represents the point at which the escape
velocity equals the speed of light.
The cause-and-effect interpretation of relativity tells us that this
Newtonian picture is incorrect. A physical object that approaches
to within a distance r of a concentration of mass M, with M/r
sufficiently large, has no causal future lying at larger values of r.
The conclusion is that there is a limit on the tensile strength of
any substance, imposed purely by general relativity, and we can
state this limit without having to know anything about the physical
nature of the interatomic forces. Cf. homework problem 4 and
section 3.4.4, as well as some references given in the remark
following problem 4.
2.3.2 Logic
The trichotomous classification of causal relationships has in-
teresting logical implications. In classical Aristotelian logic, every
proposition is either true or false, but not both, and given proposi-
2.4.2 Observer-independence of c
The constancy of the speed of light for observers in all frames of
reference was originally detected in 1887 when Michelson and Morley
set up a clever apparatus to measure any difference in the speed of
light beams traveling east-west and north-south. The motion of
the earth around the sun at 110,000 km/hour (about 0.01% of the
speed of light) is to our west during the day. Michelson and Morley
believed that light was a vibration of a physical medium, the ether,
so they expected that the speed of light would be a fixed value
relative to the ether. As the earth moved through the ether, they
thought they would observe an effect on the velocity of light along
an east-west line. For instance, if they released a beam of light in
a westward direction during the day, they expected that it would
move away from them at less than the normal speed because the
earth was chasing it through the ether. They were surprised when
they found that the expected 0.01% change in the speed of light did
not occur.
13
http://arxiv.org/abs/0908.1832
15
arxiv.org/abs/0905.1929
\begin{align*}
t' &= \gamma t + v\gamma x \\
x' &= v\gamma t + \gamma x \\
y' &= y \\
z' &= z
\end{align*}
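As a quick check on a transformation of this form, the following Python sketch applies the boost to the (t, x) part of an event, in units with c = 1, and evaluates t² − x² before and after. The function names are illustrative, and the sign convention simply copies the equations above.

```python
import math

def gamma(v):
    """Lorentz factor in units where c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost(t, x, v):
    """Apply t' = gamma*t + v*gamma*x, x' = v*gamma*t + gamma*x."""
    g = gamma(v)
    return g * t + v * g * x, v * g * t + g * x

def interval(t, x):
    """Squared interval t^2 - x^2."""
    return t * t - x * x

# Hypothetical usage: boost an event, then undo it with velocity -v.
t1, x1 = boost(3.0, 1.0, 0.6)
t2, x2 = boost(t1, x1, -0.6)
```

A boost with velocity −v undoes a boost with velocity v, and an event on the line x = t stays on that line, being rescaled rather than rotated.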
17
This is done, for example, in Misner, Thorne, and Wheeler, Gravitation, pp.
157-159.
where we assume that the square has negligible size, so that all four
Lorentz boosts act in a way that preserves the origin of the coordi-
nate systems. (We have no convenient way in our notation L(. . .) to
describe a transformation that does not preserve the origin.) The
first transformation, L(−vŷ), changes coordinates measured by the
original gyroscope-defined frame to new coordinates measured by
the new gyroscope-defined frame, after the box has been acceler-
ated in the positive y direction.
The calculation of T is messy, and to be honest, I made a series
of mistakes when I tried to crank it out by hand. Calculations in
relativity have a reputation for being like this. Figure d shows a page
from one of Einstein’s notebooks, written in fountain pen around
1913. At the bottom of the page, he wrote “zu umstaendlich,”
meaning “too involved.” Luckily we live in an era in which this sort
of thing can be handled by computers. Starting at this point in the
book, I will take appropriate opportunities to demonstrate how to
use the free and open-source computer algebra system Maxima to
            [    0 + . . .   ]
            [                ]
(%o9)/T/    [    1 + . . .   ]
            [                ]
            [     2          ]
            [  - v  + . . .  ]
In other words,
\[ T \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ -v^2 \end{pmatrix} + \ldots , \]
1 Suppose that we don’t yet know the exact form of the Lorentz
transformation, but we know based on the Michelson-Morley exper-
iment that the speed of light is the same in all inertial frames, and
we’ve already determined, e.g., by arguments like those on p. 65,
that there can be no length contraction in the direction perpendic-
ular to the motion. We construct a “light clock,” consisting simply
of two mirrors facing each other, with a light pulse bouncing back
and forth between them.
(a) Suppose this light clock is moving at a constant velocity v in the
direction perpendicular to its own optical arm, which is of length L.
Use the Pythagorean theorem to prove that the clock experiences a
time dilation given by $\gamma = 1/\sqrt{1 - v^2}$, thereby fixing the time-time
portion of the Lorentz transformation.
(b) Why is it significant for the interpretation of special relativity
that the result from part a is independent of L?
(c) Carry out a similar calculation in the case where the clock moves
with constant acceleration a as measured in some inertial frame. Al-
though the result depends on L, prove that in the limit of small L,
we recover the earlier constant-velocity result, with no explicit de-
pendence on a.
Remark: Some authors state a “clock postulate” for special relativity, which
says that for a clock that is sufficiently small, the rate at which it runs de-
pends only on v, not a (except in the trivial sense that v and a are related
by calculus). The result of part c shows that the clock “postulate” is really a
theorem, not a statement that is logically independent of the other postulates
of special relativity. Although this argument only applies to a particular fam-
ily of light clocks of various sizes, one can also make any small clock into an
acceleration-insensitive clock, by attaching an accelerometer to it and apply-
ing an appropriate correction to compensate for the clock’s observed sensitivity
to acceleration. (It’s still necessary for the clock to be small, since otherwise
the lack of simultaneity in relativity makes it impossible to describe the whole
clock as having a certain acceleration at a certain instant.) Farley et al. (footnote 20) have
verified the “clock postulate” to within 2% for the radioactive decay of muons
with γ ∼ 12 being accelerated by magnetic fields at $5 \times 10^{18}$ m/s$^2$. Some people
get confused by this acceleration-independent property of small clocks and
think that it contradicts the equivalence principle. For a good explanation, see
http://math.ucr.edu/home/baez/physics/Relativity/SR/clock.html.
. Solution, p. 328
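The geometry in part a can be checked numerically without using the closed-form answer. The sketch below is not a proof and is not the book's solution. It solves the Pythagorean relation t² = L² + (vt)² for the lab-frame crossing time by fixed-point iteration, in units with c = 1, and compares the result with the rest-frame crossing time L. The function names and the iteration scheme are assumptions of this sketch.

```python
import math

def lab_tick_time(arm_length, v, iterations=200):
    """Lab-frame time for light to cross the arm once (c = 1).

    Solves t**2 = arm_length**2 + (v*t)**2 by fixed-point iteration;
    the iteration contracts by at least a factor v per step for |v| < 1.
    """
    t = arm_length  # initial guess: crossing time in the clock's rest frame
    for _ in range(iterations):
        t = math.sqrt(arm_length**2 + (v * t)**2)
    return t

def dilation_factor(arm_length, v):
    """Ratio of lab-frame tick time to rest-frame tick time."""
    return lab_tick_time(arm_length, v) / arm_length
```

Calling dilation_factor with two different arm lengths and the same v gives the same ratio, which is the L-independence asked about in part b.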
20
Nuovo Cimento 45 (1966) 281
21
L. Briatore and S. Leschiutta, Evidence for the earth gravitational shift by
direct atomic-time-scale comparison, Il Nuovo Cimento B, 37B (2): 219 (1979).
S. Iijima and K. Fujiwara, An experiment for the potential blue shift at the
Norikura Corona Station, Annals of the Tokyo Astronomical Observatory, Sec-
ond Series, Vol. XVII, 2 (1978) 68.
axis with rapidities η1 and η2 , find the matrix representing the com-
bined Lorentz transformation, in a Taylor series up to the first non-
classical terms in each matrix element. A mixed Taylor series in
two variables can be obtained simply by nesting taylor functions.
The taylor function will happily work on matrices, not just scalars.
. Solution, p. 328
mals as neither rigorous nor necessary. In 1966, Abraham Robinson
demonstrated that concerns about rigor had been unfounded; we’ll
come back to this point in section 3.2. Although it is true that any
calculation written using infinitesimals can also be carried out using
limits, the following example shows how much better suited the
infinitesimal language is to differential geometry.
Areas on a sphere Example: 1
The area of a region S in the Cartesian plane can be calculated
as $\int_S dA$, where $dA = dx\,dy$ is the area of an infinitesimal rectangle
of width dx and height dy. A curved surface such as a sphere
does not admit a global Cartesian coordinate system in which the
constant coordinate curves are both uniformly spaced and perpendicular
to one another. For example, lines of longitude on the
earth’s surface grow closer together as one moves away from the
equator. Letting θ be the angle with respect to the pole, and φ the
azimuthal angle, the approximately rectangular patch bounded by
θ, θ + dθ, φ, and φ + dφ has width $r \sin\theta\, d\phi$ and height $r\, d\theta$,
giving $dA = r^2 \sin\theta\, d\theta\, d\phi$. If you look at the corresponding derivation
in an elementary calculus textbook that strictly eschews infinites-
imals, the technique is to start from scratch with Riemann sums.
This is extremely laborious, and moreover must be carried out
again for every new case. In differential geometry, the curvature
of the space varies from one point to the next, and clearly we
don’t want to reinvent the wheel with Riemann sums an infinite
number of times, once at each point in space.
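For comparison with the infinitesimal argument, here is a minimal Python sketch of the Riemann-sum approach. It adds up patches of area r² sin θ dθ dφ over a polar cap 0 ≤ θ ≤ θ_max. The function name and the choice of a midpoint rule are assumptions of this sketch.

```python
import math

def cap_area(radius, theta_max, n_theta=2000):
    """Midpoint Riemann sum of dA = r^2 sin(theta) dtheta dphi over a polar cap."""
    d_theta = theta_max / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        # The integrand does not depend on phi, so the phi sum is just 2*pi.
        total += radius**2 * math.sin(theta) * d_theta * 2.0 * math.pi
    return total
```

Taking theta_max = π covers the whole sphere, and the sum then approaches 4πr².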
3.2 Models
c / Bad things happen if we try to construct an affine parameter
along a curve that isn’t a geodesic. This curve is similar to path ABC
in figure b. Parallel transport doesn’t preserve the vectors’ angle
relative to the curve, as it would with a geodesic. The errors in the
construction blow up in a way that wouldn’t happen if the curve
had been a geodesic. The fourth dashed parallel flies off wildly
around the back of the sphere, wrapping around and meeting
the curve at a point, 4, that is essentially random.
A typical first reaction to the phrase “curved spacetime” (or even
“curved space,” for that matter) is that it sounds like nonsense.
How can featureless, empty space itself be curved or distorted? The
concept of a distortion would seem to imply taking all the points
and shoving them around in various directions as in a Picasso painting,
so that distances between points are altered. But if space has
no identifiable dents or scratches, it would seem impossible to determine
which old points had been sent to which new points, and the
distortion would have no observable effect at all. Why should we
expect to be able to build differential geometry on such a logically
dubious foundation? Indeed, historically, various mathematicians
have had strong doubts about the logical self-consistency of both
non-Euclidean geometry and infinitesimals. And even if an authoritative
source assures you that the resulting system is self-consistent,
its mysterious and abstract nature would seem to make it difficult
1
Heath, pp. 195-202
Frames moving at c?
A good application of these ideas is to the question of what the
world would look like in a frame of reference moving at the speed
of light. This question has a long and honorable history. As a
young student, Einstein tried to imagine what an electromagnetic
wave would look like from the point of view of a motorcyclist riding
alongside it. We now know, thanks to Einstein himself, that it really
doesn’t make sense to talk about such observers.
The most straightforward argument is based on the positivist
idea that concepts only mean something if you can define how to
measure them operationally. If we accept this philosophical stance
(which is by no means compatible with every concept we ever discuss
in physics), then we need to be able to physically realize this frame
The symbols $dx$, $dx^0$, and $dx_0$ are all synonyms, and likewise for
$dy$, $dx^1$, and $dx_1$.
In the non-Euclidean case, the Pythagorean theorem is false; $dx^\mu$
and $dx_\mu$ are no longer synonyms, so their product is no longer simply
the square of a distance. To see this more explicitly, let’s write the
expression so that only the contravariant quantities occur. By local
flatness, the relationship between the covariant and contravariant
vectors is linear, and the most general relationship of this kind is
given by making the metric a symmetric matrix $g_{\mu\nu}$. Substituting
$dx_\mu = g_{\mu\nu}\, dx^\nu$, we have
\[ ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu , \]
where there are now implied sums over both µ and ν. Notice how
implied sums occur only when the repeated index occurs once as
a superscript and once as a subscript; other combinations are un-
grammatical.
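The double implied sum can be spelled out as a pair of explicit loops. In the following Python sketch the metric is a nested list with entries g_{μν} and the displacement is a list of upper-index components dx^μ. The function name and the example metrics are illustrative.

```python
def squared_interval(metric, displacement):
    """Evaluate g_{mu nu} dx^mu dx^nu with explicit sums over both indices."""
    n = len(displacement)
    return sum(metric[mu][nu] * displacement[mu] * displacement[nu]
               for mu in range(n) for nu in range(n))

# Hypothetical usage: a Euclidean metric and an oblique one (axes at 60 degrees).
euclidean = [[1.0, 0.0], [0.0, 1.0]]
oblique = [[1.0, 0.5], [0.5, 1.0]]
pythagorean = squared_interval(euclidean, [3.0, 4.0])
skewed = squared_interval(oblique, [1.0, 1.0])
```

With the identity matrix as the metric this reduces to the Pythagorean theorem. The off-diagonal entries of the oblique metric contribute the cross term.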
Self-check: Why does it make sense to demand that the metric
be symmetric?
In an introductory course in Newtonian mechanics, one makes
a distinction between vectors, which have a direction in space, and
scalars, which do not. These are specific examples of tensors, which
can be expressed as objects with m superscripts and n subscripts.
A scalar has m = n = 0. A covariant vector has (m, n) = (0, 1),
a contravariant vector (1, 0), and the metric (0, 2). We refer to the
number of indices as the rank of the tensor. Tensors are discussed
in more detail, and defined more rigorously, in chapter 4. For our
present purposes, it is important to note that just because we write
a symbol with subscripts or superscripts, that doesn’t mean it de-
serves to be called a tensor. This point can be understood in the
more elementary context of Newtonian scalars and vectors. For ex-
ample, we can define a Newtonian “vector” u = (m, T , e), where m
is the mass of the moon, T is the temperature in Chicago, and e is
the charge of the electron. This creature u doesn’t deserve to be
called a vector, because it doesn’t behave as a vector under rota-
tion. Similarly, a tensor is required to behave in a certain way under
rotations and Lorentz boosts.
When discussing the symmetry of rank-2 tensors, it is convenient
to introduce the following notation:
\begin{align*}
T_{(ab)} &= \frac{1}{2}\left(T_{ab} + T_{ba}\right) \\
T_{[ab]} &= \frac{1}{2}\left(T_{ab} - T_{ba}\right)
\end{align*}
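As a small illustration of this notation, the Python sketch below computes the symmetric and antisymmetric parts of an array of components stored as a nested list. The helper names are hypothetical. The two parts add back up to the original array.

```python
def symmetric_part(T):
    """Return T_(ab) = (T_ab + T_ba) / 2 as a nested list."""
    n = len(T)
    return [[0.5 * (T[a][b] + T[b][a]) for b in range(n)] for a in range(n)]

def antisymmetric_part(T):
    """Return T_[ab] = (T_ab - T_ba) / 2 as a nested list."""
    n = len(T)
    return [[0.5 * (T[a][b] - T[b][a]) for b in range(n)] for a in range(n)]

# Hypothetical usage with arbitrary components.
T = [[1.0, 2.0], [4.0, 3.0]]
S = symmetric_part(T)
A = antisymmetric_part(T)
```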
Area Example: 9
In one dimension, g is a single number, and lengths are given
by $ds = \sqrt{g}\, dx$. The square root can also be understood through
example 6 on page 93, in which we saw that a uniform rescaling
$x \rightarrow \alpha x$ is reflected in $g_{\mu\nu} \rightarrow \alpha^{-2} g_{\mu\nu}$.
In two-dimensional Cartesian coordinates, multiplication of the
width and height of a rectangle gives the element of area
$dA = \sqrt{g_{11} g_{22}}\, dx^1 dx^2$. Because the coordinates are orthogonal, g is
diagonal, and the factor of $\sqrt{g_{11} g_{22}}$ is identified as the square root
of its determinant, so $dA = \sqrt{|g|}\, dx^1 dx^2$. Note that the scales on
the two axes are not necessarily the same, $g_{11} \neq g_{22}$.
The same expression for the element of area holds even if the
coordinates are not orthogonal. In example 8, for instance, we have
$\sqrt{|g|} = \sqrt{1 - \cos^2\phi} = \sin\phi$, which is the right correction factor
corresponding to the fact that $dx^1$ and $dx^2$ form a parallelogram
rather than a rectangle.
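The determinant rule is easy to try out numerically. The sketch below assumes that the oblique-coordinate metric of example 8 has unit diagonal entries and off-diagonal entries cos φ, which is consistent with the determinant 1 − cos² φ quoted above, although example 8 itself is not reproduced here. The function names are illustrative.

```python
import math

def area_element_factor(metric):
    """Return sqrt(|det g|) for a 2x2 metric given as a nested list."""
    det = metric[0][0] * metric[1][1] - metric[0][1] * metric[1][0]
    return math.sqrt(abs(det))

def oblique_metric(phi):
    """Assumed metric for unit-scaled axes that meet at angle phi."""
    c = math.cos(phi)
    return [[1.0, c], [c, 1.0]]

def sphere_metric(r, theta):
    """Metric on a sphere of radius r in (theta, phi) coordinates."""
    return [[r**2, 0.0], [0.0, (r * math.sin(theta))**2]]
```

For the oblique metric the factor is sin φ. For the sphere it is r² sin θ, the familiar factor in the element of area.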
Area of a sphere Example: 10
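For coordinates (θ, φ) on the surface of a sphere of radius r, we
have, by an argument similar to that of example 7 on page 93,
$g_{\theta\theta} = r^2$, $g_{\phi\phi} = r^2 \sin^2\theta$, $g_{\theta\phi} = 0$. The area of the sphere is
\begin{align*}
A &= \int dA \\
  &= \int\!\!\int \sqrt{|g|}\, d\theta\, d\phi \\
  &= r^2 \int\!\!\int \sin\theta\, d\theta\, d\phi \\
  &= 4\pi r^2
\end{align*}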
\[ s^2 = g_{tt} t^2 + g_{xx} x^2 = 0 , \]
and this remains true after the Lorentz boost (t, x) → (γt, γx). It
is a matter of convention which element of the metric to make positive
and which to make negative. In this book, I’ll use $g_{tt} = +1$
and $g_{xx} = -1$, so that g = diag(+1, −1). This has the advantage
that any line segment representing the timelike world-line of a
physical object has a positive squared magnitude; the forward flow
of time is represented as a positive number, in keeping with the
philosophy that relativity is basically a theory of how causal
relationships work. With this sign convention, timelike vectors have
positive squared magnitudes, spacelike ones negative. The same
convention is followed, for example, by Penrose. The opposite version,
with g = diag(−1, +1), is used by authors such as Wald and Misner,
Thorne, and Wheeler.
Our universe does not have just one spatial dimension, it has
three, so the full metric in a Lorentz frame is given by
g = diag(+1, −1, −1, −1).
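With this signature, the sign of a vector's squared magnitude tells us its causal character. Here is a minimal Python sketch, assuming components ordered (t, x, y, z) in a Lorentz frame with c = 1. The function names and the tolerance are choices made for the sketch.

```python
def squared_magnitude(vector):
    """Squared magnitude with g = diag(+1, -1, -1, -1)."""
    t, x, y, z = vector
    return t * t - x * x - y * y - z * z

def classify(vector, tolerance=1e-12):
    """Label a vector by the sign of its squared magnitude."""
    s2 = squared_magnitude(vector)
    if s2 > tolerance:
        return "timelike"
    if s2 < -tolerance:
        return "spacelike"
    return "lightlike"
```

For instance, classify((1.0, 0.0, 0.0, 0.0)) returns "timelike", and a displacement along a light ray such as (1.0, 1.0, 0.0, 0.0) returns "lightlike".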
7
Proof: Let b and c be parallel and timelike, and directed forward in time.
Adopt a frame of reference in which every spatial component of each vector
vanishes. This entails no loss of generality, since inner products are invariant
under such a transformation. Since the time-ordering is also preserved under
transformations in the Poincaré group, each is still directed forward in time, not
backward. Now let b and c be pulled away from parallelism, like opening a pair
of scissors in the x − t plane. This reduces bt ct , while causing bx cx to become
negative. Both effects increase the inner product.
Ehrenfest’s paradox
Ehrenfest (footnote 11) described the following paradox. Suppose that
observer B, in the lab frame, measures the radius of the disk to be r
when the disk is at rest, and r′ when the disk is spinning. B can
also measure the corresponding circumferences C and C′. Because
8
The example is described in Einstein’s paper “The Foundation of the General
Theory of Relativity.” An excerpt, which includes the example, is given on
p. 321.
9
Relativistic description of a rotating disk, Am. J. Phys. 43 (1975) 869
10
Space, Time, and Coordinates in a Rotating World,
http://www.phys.uu.nl/igg/dieks
11
P. Ehrenfest, Gleichförmige Rotation starrer Körper und Relativitätstheorie,
Z. Phys. 10 (1909) 918, available in English translation at en.wikisource.org.
the metric by doing local tests in which right triangles are formed
out of laser beams, and violations of the Pythagorean theorem are
detected. Will this work? . Solution, p. 330
5 In the early decades of relativity, many physicists were in the
habit of speaking as if the Lorentz transformation described what an
observer would actually “see” optically, e.g., with an eye or a camera.
This is not the case, because there is an additional effect due to opti-
cal aberration: observers in different states of motion disagree about
the direction from which a light ray originated. This is analogous
to the situation in which a person driving in a convertible observes
raindrops falling from the sky at an angle, even if an observer on the
sidewalk sees them as falling vertically. In 1959, Terrell and Penrose
independently provided correct analyses,17 showing that in reality
an object may appear contracted, expanded, or rotated, depending
on whether it is approaching the observer, passing by, or receding.
The case of a sphere is especially interesting. Consider the following
four cases:
but so is 2τ +7. Less trivially, a photon’s proper time is always zero,
but one can still define an affine parameter along its trajectory. We
will need such an affine parameter, for example, in section 6.2.7,
page 202, when we calculate the deflection of light rays by the sun,
one of the early classic experimental tests of general relativity.
Another example of a Lorentz scalar is the pressure of a perfect
fluid, which is often assumed as a description of matter in cosmo-
logical models.
Infinitesimals and the clock “postulate” Example: 1
At the beginning of chapter 3, I motivated the use of infinitesimals
as useful tools for doing differential geometry in curved space-
time. Even in the context of special relativity, however, infinitesi-
mals can be useful. One way of expressing the proper time accu-
mulated on a moving clock is
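\begin{align*}
s &= \int ds \\
  &= \int \sqrt{g_{ij}\, dx^i dx^j} \\
  &= \int \sqrt{1 - \left(\frac{dx}{dt}\right)^2 - \left(\frac{dy}{dt}\right)^2 - \left(\frac{dz}{dt}\right)^2}\; dt ,
\end{align*}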
which only contains an explicit dependence on the clock’s veloc-
ity, not its acceleration. This is an example of the clock “postulate”
referred to in the remark at the end of homework problem 1 on
page 76. Note that the clock postulate only applies in the limit of
a small clock. This is represented in the above equation by the
use of infinitesimal quantities like dx .
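As an illustrative sketch of this integral, the Python function below approximates the proper time for motion along the x axis by a midpoint sum over small time steps, in units with c = 1. Only the velocity at each instant enters the sum. The function name and the restriction to one spatial dimension are assumptions of the sketch.

```python
import math

def proper_time(velocity, t_start, t_end, steps=10000):
    """Midpoint-rule approximation to the integral of sqrt(1 - v(t)^2) dt."""
    dt = (t_end - t_start) / steps
    total = 0.0
    for i in range(steps):
        t = t_start + (i + 0.5) * dt
        v = velocity(t)
        total += math.sqrt(1.0 - v * v) * dt
    return total

# Hypothetical usage: steady motion, and a zig-zag at the same speed.
steady = proper_time(lambda t: 0.6, 0.0, 10.0)
zigzag = proper_time(lambda t: 0.6 if t < 5.0 else -0.6, 0.0, 10.0)
```

The two trajectories accumulate the same proper time, even though the second one reverses direction abruptly halfway through. This is what we expect if the clock's rate depends only on its speed.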
4.2 Four-vectors
4.2.1 The velocity and acceleration four-vectors
Our basic Lorentz vector is the spacetime displacement $dx^i$. Any
other quantity that has the same behavior as $dx^i$ under rotations
and boosts is also a valid Lorentz vector. Consider a particle moving
through space, as described in a Lorentz frame. Since the particle
may be subject to nongravitational forces, the Lorentz frame cannot
be made to coincide (except perhaps momentarily) with the
particle’s rest frame. Dividing the infinitesimal displacement by an
infinitesimal proper time interval, we have the four-velocity vector
$v^i = dx^i/d\tau$, whose components in a Lorentz coordinate system
are $(\gamma, \gamma v^1, \gamma v^2, \gamma v^3)$, where $v^\mu$, µ = 1, 2, 3, is the ordinary
three-component velocity vector as defined in classical mechanics. The
four-velocity’s squared magnitude $v^i v_i$ is always exactly 1, even if
the particle is not moving at the speed of light.
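The normalization of the four-velocity can be checked numerically. The sketch below builds the components (γ, γv¹, γv², γv³) from an ordinary three-velocity, in units with c = 1, and evaluates the squared magnitude with the signature diag(+1, −1, −1, −1). The function names are illustrative.

```python
import math

def four_velocity(v1, v2, v3):
    """Components (gamma, gamma*v1, gamma*v2, gamma*v3) for speed below 1."""
    gamma = 1.0 / math.sqrt(1.0 - (v1 * v1 + v2 * v2 + v3 * v3))
    return (gamma, gamma * v1, gamma * v2, gamma * v3)

def squared_magnitude(u):
    """Squared magnitude with g = diag(+1, -1, -1, -1)."""
    return u[0]**2 - u[1]**2 - u[2]**2 - u[3]**2
```

Whatever three-velocity is supplied, the result is 1 up to rounding error.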
When we hear something referred to as a “vector,” we usually
take this as a statement that it not only transforms as a vector, but
\[ \ddot{t}^2 - \ddot{x}^2 = -a^2 . \]
The solution of these differential equations is $t = \frac{1}{a} \sinh a\tau$,
$x = \frac{1}{a} \cosh a\tau$, and eliminating τ gives
\[ x = \frac{1}{a}\sqrt{1 + a^2 t^2} . \]
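As a numerical sanity check, the following Python sketch evaluates the world-line t = (1/a) sinh aτ, x = (1/a) cosh aτ, estimates the second derivatives with respect to τ by central differences, and forms the combination ẗ² − ẍ². The step size and function names are choices made for this sketch.

```python
import math

def worldline(a, tau):
    """Return (t, x) on the world-line with constant proper acceleration a."""
    return math.sinh(a * tau) / a, math.cosh(a * tau) / a

def second_derivative(f, tau, h=1e-4):
    """Central-difference estimate of the second derivative of f at tau."""
    return (f(tau + h) - 2.0 * f(tau) + f(tau - h)) / (h * h)

def acceleration_norm_squared(a, tau):
    """Estimate (d^2 t/d tau^2)^2 - (d^2 x/d tau^2)^2 numerically."""
    t_dd = second_derivative(lambda s: worldline(a, s)[0], tau)
    x_dd = second_derivative(lambda s: worldline(a, s)[1], tau)
    return t_dd**2 - x_dd**2
```

The result should be close to −a² at any τ, and the points returned by worldline should also satisfy the relation between x and t obtained by eliminating τ.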
2
cf. p. 32
3
http://arxiv.org/abs/hep-th/9508018
4
Luo et al., “New Experimental Limit on the Photon Rest Mass with a
Rotating Torsion Balance,” Phys. Rev. Lett. 90 (2003) 081801. The interpretation
of such experiments is difficult, and this paper attracted a series of comments. A
weaker but more universally accepted bound is $8 \times 10^{-52}$ kg, Davis, Goldhaber,
and Nieto, Phys. Rev. Lett. 35 (1975) 1402.
5
Goldhaber and Nieto, “Terrestrial and Extraterrestrial Limits on The
Photon Mass,” Rev. Mod. Phys. 43 (1971) 277
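\begin{align*}
[1] \quad & v'^\mu = v^\kappa \frac{\partial x'^\mu}{\partial x^\kappa} \\
[2] \quad & v'_\mu = v_\kappa \frac{\partial x^\kappa}{\partial x'^\mu} .
\end{align*}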
Note the inversion of the partial derivative in one equation compared
to the other. Because these equations describe a change from one
coordinate system to another, they clearly depend on the coordinate
system, so we use Greek indices rather than the Latin ones that
would indicate a coordinate-independent equation. Note that the
letter µ in these equations always appears as an index referring to
the new coordinates, κ to the old ones. For this reason, we can get
away with dropping the primes and writing, e.g., $v^\mu = v^\kappa\, \partial x'^\mu/\partial x^\kappa$
rather than $v'$, counting on context to show that $v^\mu$ is the vector
expressed in the new coordinates, $v^\kappa$ in the old ones. This becomes
especially natural if we start working in a specific coordinate system
where the coordinates have names. For example, if we transform
from coordinates (t, x, y, z) to (a, b, c, d), then it is clear that $v^t$ is
expressed in one system and $v^c$ in the other.
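To see the two transformation laws at work, here is a small Python sketch that uses the change from Cartesian coordinates (x, y) to polar coordinates (r, θ) in the Euclidean plane. This choice of coordinates is an assumption made for illustration and is not an example taken from the text. Upper-index components are transformed with ∂x′/∂x, and lower-index components with the inverted derivatives ∂x/∂x′.

```python
import math

def to_polar_upper(point, v):
    """v'^mu = v^kappa * dx'^mu/dx^kappa for (x, y) -> (r, theta)."""
    x, y = point
    r = math.hypot(x, y)
    vx, vy = v
    v_r = vx * (x / r) + vy * (y / r)              # uses dr/dx, dr/dy
    v_theta = vx * (-y / r**2) + vy * (x / r**2)   # uses dtheta/dx, dtheta/dy
    return v_r, v_theta

def to_polar_lower(point, w):
    """w'_mu = w_kappa * dx^kappa/dx'^mu for (x, y) -> (r, theta)."""
    x, y = point
    r = math.hypot(x, y)
    wx, wy = w
    w_r = wx * (x / r) + wy * (y / r)   # dx/dr = cos(theta), dy/dr = sin(theta)
    w_theta = wx * (-y) + wy * x        # dx/dtheta = -y, dy/dtheta = x
    return w_r, w_theta

def contract(upper, lower):
    """Sum over one upper and one lower index."""
    return sum(u * l for u, l in zip(upper, lower))
```

The contraction of an upper-index vector with a lower-index vector comes out the same in both coordinate systems. This is the property that the inversion of the partial derivatives is designed to guarantee.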
Self-check: Recall that the gauge transformations allowed in gen-
eral relativity are not just any coordinate transformations; they
must be (1) smooth and (2) one-to-one. Relate both of these re-
quirements to the features of the vector transformation laws above.
In equation [2], µ appears as a subscript on the left side of the
equation, but as a superscript on the right. This would appear
to violate our rules of notation, but the interpretation here is that
in expressions of the form $\partial/\partial x^i$ and $\partial/\partial x_i$, the superscripts and
subscripts should be understood as being turned upside-down. Sim-
ilarly, [1] appears to have the implied sum over κ written ungram-
matically, with both κ’s appearing as superscripts. Normally we
only have implied sums in which the index appears once as a super-
script and once as a subscript. With our new rule for interpreting
indices on the bottom of derivatives, the implied sum is seen to be
written correctly. This rule is similar to the one for analyzing the
\[ \frac{\partial x^\kappa}{\partial x'^\mu} = \frac{1}{\partial x'^\mu/\partial x^\kappa} \quad \text{(wrong!)} , \]
but this would give infinite results for the mixed terms! Only in the
case of functions of a single variable is it possible to flip deriva-
tives in this way; it doesn’t work for partial derivatives. To evalu-
ate these partial derivatives, we have to invert the transformation
(which in this example is trivial to accomplish) and then take the
partial derivatives.
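The failure of the naive rule can be seen in a simple case. The sketch below uses a hypothetical transformation t′ = t, x′ = x + vt, chosen only for illustration, and compares the true inverse derivatives, obtained by inverting the 2×2 matrix of partial derivatives, with the result of taking the reciprocal of each entry separately. The function names are illustrative.

```python
import math

def forward_jacobian(v):
    """Entries dx'^mu/dx^kappa for t' = t, x' = x + v*t; rows (t', x'), columns (t, x)."""
    return [[1.0, 0.0],
            [v, 1.0]]

def invert_2x2(m):
    """Matrix inverse, giving dx^kappa/dx'^mu; rows (t, x), columns (t', x')."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]

def naive_reciprocal(m):
    """The wrong rule: 1 / (dx'^mu/dx^kappa) for each pair of indices."""
    def flip(entry):
        return math.inf if entry == 0.0 else 1.0 / entry
    # Element [kappa][mu] is the reciprocal of forward element [mu][kappa].
    return [[flip(m[mu][kappa]) for mu in range(2)] for kappa in range(2)]
```

The diagonal entries agree, but the naive rule gives an infinite value for ∂x/∂t′ where the correct value is −v, and a nonzero value for ∂t/∂x′ where the correct value is zero.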
The metric is a rank-2 tensor, and transforms analogously:
\[ g_{\mu\nu} = g_{\kappa\lambda} \frac{\partial x^\kappa}{\partial x'^\mu} \frac{\partial x^\lambda}{\partial x'^\nu} \]
(writing g rather than g′ on the left, because context makes the
distinction clear).
Self-check: Write the similar expressions for g µν , gνµ , and gµν ,
which are entirely determined by the grammatical rules for writing
superscripts and subscripts. Interpret the case of a rank-0 tensor.
An accelerated coordinate system? Example: 14
Let’s see the effect on the Lorentzian metric g of the transformation
\[ t' = t \qquad x' = x + \frac{1}{2} a t^2 . \]
The inverse transformation is
\[ t = t' \qquad x = x' - \frac{1}{2} a t'^2 . \]
The tensor transformation law gives
\begin{align*}
g'_{t't'} &= 1 - (at')^2 \\
g'_{x'x'} &= -1 \\
g'_{x't'} &= at' .
\end{align*}
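The transformation law for the metric can be carried out mechanically. The Python sketch below starts from g = diag(+1, −1) in the (t, x) coordinates, uses the partial derivatives of the inverse transformation t = t′, x = x′ − at′²/2, and forms the double sum over κ and λ. The function names and the index ordering (t first, then x) are assumptions of the sketch.

```python
def inverse_jacobian(a, t_prime):
    """Entries dx^kappa/dx'^mu; rows (t, x), columns (t', x')."""
    return [[1.0, 0.0],             # dt/dt', dt/dx'
            [-a * t_prime, 1.0]]    # dx/dt', dx/dx'

def transformed_metric(a, t_prime):
    """g'_{mu nu} = g_{kappa lambda} (dx^kappa/dx'^mu) (dx^lambda/dx'^nu)."""
    g = [[1.0, 0.0], [0.0, -1.0]]
    J = inverse_jacobian(a, t_prime)
    return [[sum(g[k][l] * J[k][mu] * J[l][nu]
                 for k in range(2) for l in range(2))
             for nu in range(2)]
            for mu in range(2)]
```

For a = 2 and t′ = 0.25, for example, the time-time component is 1 − (at′)² = 0.75, the space-space component is −1, and the mixed component is at′ = 0.5.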
11
This statement of the equivalence principle, along with the others we have
encountered, is summarized in the back of the book on page 347.
12
arxiv.org/abs/hep-ph/9703240
n → p + e− + ν̄
p + e− → n + ν ,
which happen due to the weak nuclear force. The first of these
releases 0.8 MeV, and has a half-life of about 10 minutes. This explains
why free neutrons are not observed in significant numbers in our
universe, e.g., in cosmic rays. The second reaction requires an input
of 0.8 MeV of energy, so a free hydrogen atom is stable. The white
dwarf contains fairly heavy nuclei, not individual protons, but sim-
ilar considerations would seem to apply. A nucleus can absorb an
electron and convert a proton into a neutron, and in this context the
process is called electron capture. Ordinarily this process will only
occur if the nucleus is neutron-deficient; once it reaches a neutron-
to-proton ratio that optimizes its binding energy, electron capture
cannot proceed without a source of energy to make the reaction go.
In the environment of a white dwarf, however, there is such a source.
The annihilation of an electron opens up a hole in the “Fermi sea.”
There is now a state into which another electron is allowed to drop
without violating the exclusion principle, and the effect cascades
upward. In a star with a mass above the Chandrasekhar limit, this
process runs to completion, with every proton being converted into a
neutron. The result is a neutron star, which is essentially an atomic
nucleus (with Z = 0) with the mass of a star!
Observational evidence for the existence of neutron stars came
in 1967 with the detection by Bell and Hewish at Cambridge of a
mysterious radio signal with a period of 1.3373011 seconds. The sig-
nal’s observability was synchronized with the rotation of the earth
relative to the stars, rather than with legal clock time or the earth’s
rotation relative to the sun. This led to the conclusion that its origin
was in space rather than on earth, and Bell and Hewish originally
dubbed it LGM-1 for “little green men.” The discovery of a second
signal, from a different direction in the sky, convinced them that it
was not actually an artificial signal being generated by aliens. Bell
published the observation as an appendix to her PhD thesis, and
it was soon interpreted as a signal from a neutron star. Neutron
stars can be highly magnetized, and because of this magnetization
they may emit a directional beam of electromagnetic radiation that
sweeps across the sky once per rotational period — the “lighthouse
effect.” If the earth lies in the plane of the beam, a periodic signal
can be detected, and the star is referred to as a pulsar. It is fairly
easy to see that the short period of rotation makes it difficult to
explain a pulsar as any kind of less exotic rotating object. In the
approximation of Newtonian mechanics, a spherical body of density
ρ, rotating with a period $T = \sqrt{3\pi/G\rho}$, has zero apparent gravity
at its equator, since gravity is just strong enough to accelerate an
object so that it follows a circular trajectory above a fixed point on
We’ll see in chapter 6 that for a non-rotating black hole, the metric
is of the form
2. Similarly, it has terms that are odd under reversal of the dif-
ferential dφ of the azimuthal coordinate.
Tensorial
One is to let $\epsilon$ have the values 0 and ±1 at some arbitrarily
chosen point, in some arbitrarily chosen coordinate system, but to
let it transform like a tensor. Then $A_\mu = \epsilon_{\mu\kappa\lambda} u^\kappa v^\lambda$ needs to be
modified, since the right-hand side is a tensor, and that would make
A a tensor, but if A is an area we don’t want it to transform like
a 1-tensor. We therefore need to revise the definition of area to
be $A_\mu = g^{-1/2} \epsilon_{\mu\kappa\lambda} u^\kappa v^\lambda$, where g is the determinant of the lower-
16
Hermann Weyl, “Space-Time-Matter,” 1922, p. 109, available online at
archive.org/details/spacetimematter00weyluoft.
17
For a proof, see the Wikipedia article “Parity of a permutation.”
Tensor-density
The other option is to let $\epsilon$ have the same 0 and ±1 values at
all points. Then $\epsilon$ is clearly not a tensor, because it doesn’t scale
by a factor of $k^n$ when the coordinates are scaled by k; $\epsilon$ is a tensor
density with weight +1 for the upper-index version and −1 for the
lower-index one. The relation $A_\mu = \epsilon_{\mu\kappa\lambda} u^\kappa v^\lambda$ gives an area that is
a tensor density, not a tensor, because A is not written in terms of
purely tensorial quantities. Scaling the coordinates by k leaves $\epsilon_{\mu\kappa\lambda}$
unchanged, scales up $u^\kappa v^\lambda$ by $k^2$, and scales up the area by $k^2$, as
expected. Unfortunately, there is no consistency in the literature as
to whether $\epsilon$ should be a tensor or a tensor density. Some will define
both a tensor and a nontensor version, with notations like $\epsilon$ and $\tilde{\epsilon}$,
or¹⁸ $\epsilon_{0123}$ and $[0123]$. Others avoid writing the letter $\epsilon$ completely.¹⁹
The tensor-density version is convenient because we always know
that its value is 0 or ±1. The tensor version has the advantage that
it transforms as a tensor.
18
Misner, Thorne, and Wheeler
19
Hawking and Ellis
. Solution, p. 331
12 Estimate the energy contained in the electric field of an
electron, if the electron’s radius is r. Classically (i.e., assuming
relativity but no quantum mechanics), this energy contributes to the
electron’s rest mass, so it must be less than the rest mass. Estimate
the resulting lower limit on r, which is known as the classical electron
radius. . Solution, p. 331
13 For gamma-rays in the MeV range, the most frequent mode of
interaction with matter is Compton scattering, in which the photon
is scattered by an electron without being absorbed. Only part of
the gamma’s energy is deposited, and the amount is related to the
angle of scattering. Use conservation of four-momentum to show
that in the case of scattering at 180 degrees, the scattered photon
has energy $E' = E/(1+2E/m)$, where m is the mass of the electron.
14 Derive the equation $T = \sqrt{3\pi/G\rho}$ given on page 131 for the
period of a rotating, spherical object that results in zero apparent
gravity at its surface.
15 Section 4.4.3 presented an estimate of the upper limit on the
mass of a white dwarf. Check the self-consistency of the solution
in the following respects: (1) Why is it valid to ignore the contri-
bution of the nuclei to the degeneracy pressure? (2) Although the
electrons are ultrarelativistic, spacetime is approximated as being
flat. As suggested in example 11 on page 58, a reasonable order-of-
magnitude check on this result is that we should have $M/r \ll c^2/G$.
16 The laws of physics in our universe imply that for bodies with
a certain range of masses, a neutron star is the unique equilibrium
state. Suppose we knew of the existence of neutron stars, but didn’t
know the mass of the neutron. Infer upper and lower bounds on the
mass of the neutron.
intrinsic curvature. It arises only from the choice of the coordinates
$(t', x')$ defined by a frame tied to the accelerating rocket ship.
The fact that the above metric has nonvanishing derivatives, un-
like a constant Lorentz metric, does indicate the presence of a grav-
itational field. However, a gravitational field is not the same thing
as intrinsic curvature. The gravitational field seen by an observer
aboard the ship is, by the equivalence principle, indistinguishable
from an acceleration, and indeed the Lorentzian observer in the
earth’s frame does describe it as arising from the ship’s accelera-
tion, not from a gravitational field permeating all of space. Both
observers must agree that “I got plenty of nothin’ ” — that the
region of the universe to which they have access lacks any stars,
neutrinos, or clouds of dust. The observer aboard the ship must de-
scribe the gravitational field he detects as arising from some source
very far away, perhaps a hypothetical vast sheet of lead lying billions
of light-years aft of the ship’s deckplates. Such a hypothesis is fine,
but it is unrelated to the structure of our hoped-for field equation,
which is to be local in nature.
Not only does the metric tensor not represent the gravitational
field, but no tensor can represent it. By the equivalence princi-
ple, any gravitational field seen by observer A can be eliminated by
switching to the frame of a free-falling observer B who is instanta-
neously at rest with respect to A at a certain time. The structure of
the tensor transformation law guarantees that A and B will agree on
whether a given tensor is zero at the point in spacetime where they
pass by one another. Since they agree on all tensors, and disagree
on the gravitational field, the gravitational field cannot be a tensor.
We therefore conclude that a nonzero intrinsic curvature of the
type that is to be included in the Einstein field equations is not
encoded in any simple way in the metric or its first derivatives.
Since neither the metric nor its first derivatives indicate curvature,
we can reasonably conjecture that the curvature might be encoded
in its second derivatives.
\[ K = \frac{d^2\alpha}{dx\,dy} , \]
where $d^2\alpha = d\alpha' - d\alpha$.
h / 1. Gaussian curvature can be interpreted as the failure of parallelism represented by $d^2\alpha/dx\,dy$.
2. From a point P, emit a fan of rays at angles filling a certain
range θ of angles in Gaussian polar coordinates (figure i). Let the
arc length of this fan at r be L, which may not be equal to its
Euclidean value LE = rθ. Then2
\[ K = -3 \frac{d^2}{dr^2} \left( \frac{L}{L_E} \right) . \]
The above treatment may be somewhat misleading in that it may
lead you to believe that there is a single coordinate system in
which the Riemann tensor is always constant. This is not the
case, since the calculation of the Riemann tensor was only valid
near the origin O of the normal coordinates. The character of
these coordinates becomes quite complicated far from O; we end
up with all our constant-x lines converging at north and south
poles of the sphere, and all the constant-y lines at east and west
poles.
b / The change in the vector due to parallel transport around the octant equals the integral of the Riemann tensor over the interior.
Angular coordinates (φ, θ) are more suitable as a large-scale de-
scription of the sphere. We can use the tensor transformation law
to find the Riemann tensor in these coordinates. If O, the origin
of the (x, y) coordinates, is at coordinates (φ, θ), then dx/dφ =
ρ sin θ and dy/dθ = ρ. The result is $R^\phi{}_{\theta\phi\theta} = R^x{}_{yxy} (dy/d\theta)^2 = 1$
and $R^\theta{}_{\phi\theta\phi} = R^y{}_{xyx} (dx/d\phi)^2 = \sin^2\theta$. The variation in $R^\theta{}_{\phi\theta\phi}$ is
not due to any variation in the sphere’s intrinsic curvature; it represents
the behavior of the coordinate system.
The Riemann tensor only measures curvature within a particular
plane, the one defined by $dp^c$ and $dq^d$, so it is a kind of sectional curvature.
Since we’re currently working in two dimensions, however,
there is only one plane, and no real distinction between sectional
curvature and Ricci curvature, which is the average of the sectional
curvature over all planes that include $dq^d$: $R_{cd} = R^a{}_{cad}$. The Ricci
curvature in two spacelike dimensions, expressed in normal coordinates,
is simply the diagonal matrix diag(K, K).
Let’s estimate the size of the effect. The first derivative of the
metric is, roughly, the gravitational field, whereas the second derivative
has to do with curvature. The curvature of spacetime around
the earth should therefore vary as $GMr^{-3}$, where M is the earth’s
mass and G is the gravitational constant. The area enclosed by a
circular orbit is proportional to $r^2$, so we expect the geodetic effect
to vary as nGM/r, where n is the number of orbits. The angle of
precession is unitless, and the only way to make this result unitless
is to put in a factor of $1/c^2$. In units with c = 1, this factor is unnecessary.
In ordinary metric units, the $1/c^2$ makes sense, because
it causes the purely relativistic effect to come out to be small. The
result, up to unitless factors that we didn’t pretend to find, is
\[ \Delta\theta \sim \frac{nGM}{c^2 r} . \]
We might also expect a Thomas precession. Like the spacetime
curvature effect, it would be proportional to $nGM/c^2r$. Since we’re
not worrying about unitless factors, we can just lump the Thomas
precession together with the effect already calculated.
a / The geodetic effect as measured by Gravity Probe B.
4
This statement is itself only a rough estimate. Anyone who has taught
physics knows that students will often calculate an effect exactly while not un-
derstanding the underlying physics at all.
b / Precession angle as a function of time as measured by the four gyroscopes aboard Gravity Probe B.
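\[ \partial_b \Psi \rightarrow \partial_b e^{i\alpha} \Psi \]
\[ = \left( \partial_b + A'_b - A_b \right) \Psi' \]
\[ \nabla_b = \partial_b + ieA_b \]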
where $\Gamma^b{}_{ac}$, called the Christoffel symbol, does not transform like
a tensor, and involves derivatives of the metric. (“Christoffel” is
pronounced “Krist-AWful,” with the accent on the middle syllable.)
The explicit computation of the Christoffel symbols from the metric
is deferred until section 5.9, but the intervening sections 5.7 and 5.8
can be omitted on a first reading without loss of continuity.
Christoffel symbols on the globe Example: 8
As a qualitative example, consider the geodesic airplane trajec-
tory shown in figure d, from London to Mexico City. In physics
it is customary to work with the colatitude, θ, measured down
from the north pole, rather than the latitude, measured from the
equator. At P, over the North Atlantic, the plane’s colatitude has
a minimum. (We can see, without having to take it on faith from
the figure, that such a minimum must occur. The easiest way to
convince oneself of this is to consider a path that goes directly
over the pole, at θ = 0.)
At P, the plane’s velocity vector points directly west. At Q, over
New England, its velocity has a large component to the south.
d / Example 8.
or
With the partial derivative $\partial_a$, it does not make sense to use the
metric to raise the index and form $\partial^a$. It does make sense to do so
with covariant derivatives, so $\nabla^a = g^{ab} \nabla_b$ is a correct identity.
\[ \partial_a X_b = X_{b,a} \]
\[ \nabla_a X_b = X_{b;a} \]
\[ \nabla^a X_b = X_b{}^{;a} \]
6
“On the gravitational field of a point mass according to Einstein’s the-
ory,” Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften
1 (1916) 189, translated in arxiv.org/abs/physics/9905030v1.
5.7.4 Uniqueness
The geodesic equation is useful in establishing one of the neces-
sary theoretical foundations of relativity, which is the uniqueness of
geodesics for a given set of initial conditions. This is related to ax-
iom O1 of ordered geometry, that two points determine a line, and
is necessary physically for the reasons discussed on page 22; briefly,
if the geodesic were not uniquely determined, then particles would
have no way of deciding how to move. The form of the geodesic
equation guarantees uniqueness. To see this, consider the following
algorithm for determining a numerical approximation to a geodesic:
3. Add $(d^2x^i/d\lambda^2)\Delta\lambda$ to the currently stored value of $dx^i/d\lambda$.
5. Add ∆λ to λ.
6. Repeat steps 2-5 until the geodesic has been extended to
the desired affine distance.
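The algorithm can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the author's program: it uses Euclidean polar coordinates (r, φ), whose nonvanishing Christoffel symbols are Γ^r_φφ = −r and Γ^φ_rφ = Γ^φ_φr = 1/r, and the function name `geodesic_polar` is hypothetical. Besides the velocity update and the increment of λ listed above, the loop contains the two operations those steps presuppose: evaluating d²x^i/dλ² from the geodesic equation, and advancing x^i by (dx^i/dλ)∆λ.

```python
import math


def geodesic_polar(r, phi, dr, dphi, dl=1e-3, l_max=1.0):
    """Extend a geodesic of the Euclidean plane in polar coordinates.

    (r, phi) is the initial position and (dr, dphi) the initial value of
    dx^i/dlambda. Returns the position at affine parameter l_max.
    """
    l = 0.0
    for _ in range(int(round(l_max / dl))):
        # Geodesic equation: d2x^i/dl2 = -Gamma^i_jk (dx^j/dl)(dx^k/dl),
        # with Gamma^r_phiphi = -r and Gamma^phi_rphi = Gamma^phi_phir = 1/r.
        d2r = r * dphi ** 2
        d2phi = -2.0 * dr * dphi / r
        # Add (d2x^i/dl2) * dl to the stored value of dx^i/dl.
        dr += d2r * dl
        dphi += d2phi * dl
        # Advance the position using the updated velocity.
        r += dr * dl
        phi += dphi * dl
        # Add dl to the affine parameter.
        l += dl
    return r, phi


# Start at Cartesian (1, 0), moving in the +y direction with unit speed.
r, phi = geodesic_polar(r=1.0, phi=0.0, dr=0.0, dphi=1.0, l_max=1.0)
print(r * math.cos(phi), r * math.sin(phi))
```

Because the geodesics of the Euclidean plane are straight lines, the printed Cartesian coordinates should lie close to (1, 1); the small discrepancy reflects the first-order accuracy of the stepping scheme. Since each step is fully determined by the stored values, the same initial conditions always produce the same curve, which is the uniqueness property at issue.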
5.8 Torsion
This section describes the concept of gravitational torsion. It can
be skipped without loss of continuity, provided that you accept the
symmetry property $\Gamma^a{}_{[bc]} = 0$ without worrying about what it means
physically or what empirical evidence supports it.
Self-check: Interpret the mathematical meaning of the equation
$\Gamma^a{}_{[bc]} = 0$, which is expressed in the notation introduced on page 92.
7
This point was mentioned on page 151, in connection with the definition of
the Riemann tensor.
8
http://www.npl.washington.edu/eotwash/publications/pdf/lowfrontier2.pdf
9
Carroll and Field, http://arxiv.org/abs/gr-qc/9403058
\[ \Gamma^d{}_{ba} = \frac{1}{2} g^{cd} \left( \partial_? g_{??} \right) , \]
where inversion of the one-component matrix G has been replaced
by matrix inversion, and, more importantly, the question marks indicate
that there would be more than one way to place the subscripts
so that the result would be a grammatical tensor equation. The
most general form for the Christoffel symbol would be
\[ \Gamma^d{}_{ac} = \frac{1}{2} g^{db} \left( L\,\partial_c g_{ab} + M\,\partial_a g_{cb} + N\,\partial_b g_{ca} \right) , \]
where L, M, and N are constants. Consistency with the one-dimensional
expression requires L + M + N = 1, and vanishing
torsion gives L = M. The L and M terms have a different physical
significance than the N term.
Suppose an observer uses coordinates such that all objects are
described as lengthening over time, and the change of scale accumulated
over one day is a factor of k > 1. This is described by the
derivative $\partial_t g_{xx} < 1$, which affects the M term. Since the metric is
used to calculate squared distances, the $g_{xx}$ matrix element scales
down by $1/\sqrt{k}$. To compensate for $\partial_t v^x < 0$, we need to add a
positive correction term, M > 0, to the covariant derivative. When
the same observer measures the rate of change of a vector $v^t$ with
respect to space, the rate of change comes out to be too small, because
the variable she differentiates with respect to is too big. This
requires N < 0, and the correction is of the same size as the M
correction, so |M| = |N|. We find L = M = −N = 1.
Self-check: Does the above argument depend on the use of space
for one coordinate and time for the other?
The resulting general expression for the Christoffel symbol in
terms of the metric is
\[ \Gamma^c{}_{ab} = \frac{1}{2} g^{cd} \left( \partial_a g_{bd} + \partial_b g_{ad} - \partial_d g_{ab} \right) . \]
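As a concrete check of this expression, the sketch below evaluates it numerically for the Euclidean plane in polar coordinates, where the metric is diag(1, r²). It is an illustration only: the function names `christoffel`, `polar_metric`, and `polar_inverse` are hypothetical, and the partial derivatives of the metric are approximated by central differences rather than computed symbolically.

```python
def christoffel(metric, inverse, x, h=1e-5):
    """Return Gamma[c][a][b] = Gamma^c_ab at the point x.

    metric(x) and inverse(x) return g_ab and g^ab as nested lists.
    Partial derivatives of the metric are taken by central differences.
    """
    n = len(x)

    def dg(d, a, b):
        # Approximates partial_d g_ab.
        xp = list(x)
        xm = list(x)
        xp[d] += h
        xm[d] -= h
        return (metric(xp)[a][b] - metric(xm)[a][b]) / (2.0 * h)

    ginv = inverse(x)
    # Gamma^c_ab = (1/2) g^cd (partial_a g_bd + partial_b g_ad - partial_d g_ab)
    return [[[0.5 * sum(ginv[c][d] * (dg(a, b, d) + dg(b, a, d) - dg(d, a, b))
                        for d in range(n))
              for b in range(n)]
             for a in range(n)]
            for c in range(n)]


def polar_metric(x):
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]


def polar_inverse(x):
    r = x[0]
    return [[1.0, 0.0], [0.0, 1.0 / (r * r)]]


# Coordinates are ordered (r, phi); evaluate at r = 2.
gamma = christoffel(polar_metric, polar_inverse, [2.0, 0.3])
print(gamma[0][1][1], gamma[1][0][1])
```

With index 0 for r and 1 for φ, `gamma[0][1][1]` approximates Γ^r_φφ = −r and `gamma[1][0][1]` approximates Γ^φ_rφ = 1/r. A computer algebra system would give these results exactly; the finite-difference version is used here only to keep the sketch free of external dependencies.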
One can readily go back and check that this gives $\nabla_c g_{ab} = 0$. In fact,
the calculation is a bit tedious. For that matter, tensor calculations
in general can be infamously time-consuming and error-prone. Any
reasonable person living in the 21st century will therefore resort to
a computer algebra system. The most widely used computer alge-
bra system is Mathematica, but it’s expensive and proprietary, and
it doesn’t have extensive built-in facilities for handling tensors. It
1 import math
2
3 l = 0 # affine parameter lambda
4 dl = .001 # change in l with each iteration
5 l_max = 100.
6
7 # initial position:
10
http://arxiv.org/abs/0903.2085
11
http://arxiv.org/abs/0903.2085
. . . and differentiation of this gives the gauge field. . .          | $A_b$   | $\Gamma^c{}_{ab}$
A second differentiation gives the directly observable field(s) . . . | E and B | $R^c{}_{dab}$
13
For those with knowledge of topology, these can be formalized a little more:
we want a completely normal, second-countable, locally connected topological
space that has Lebesgue covering dimension n, is a homogeneous space under
its own homeomorphism group, and is a complete uniform space. I don’t know
whether this is sufficient to characterize a manifold completely, but it suffices to
rule out all the counterexamples of which I know.
Lines Example: 11
The set of all real numbers is a 1-manifold. Similarly, any line with
the properties specified in Euclid’s Elements is a 1-manifold. All
such lines are homeomorphic to one another, and we can there-
fore speak of “the line.”
A circle Example: 12
A circle (not including its interior) is a 1-manifold, and it is not
homeomorphic to the line. To see this, note that deleting a point
from a circle leaves it in one connected piece, but deleting a point
from a line makes two. Here we use the fact that a homeomor-
phism is guaranteed to preserve “rubber-sheet” properties like the
number of pieces.
No changes of dimension Example: 13
A “lollipop” formed by gluing an open 2-circle (i.e., a circle not
including its boundary) to an open line segment is not a manifold,
because there is no n for which it satisfies M1.
It also violates M2, because points in this set fall into three distinct
classes: classes that live in 2-dimensional neighborhoods, those
that live in 1-dimensional neighborhoods, and the point where the
line segment intersects the boundary of the circle.
No manifolds made from the rational numbers Example: 14
The rational numbers are not a manifold, because specifying an
arbitrarily small neighborhood around $\sqrt{2}$ excludes every rational
number, violating M3.
Similarly, the rational plane defined by rational-number coordinate
pairs (x , y ) is not a 2-manifold. It’s good that we’ve excluded
this space, because it has the unphysical property that curves
can cross without having a point in common. For example, the
curve $y = x^2$ crosses from one side of the line y = 2 to the other,
but never intersects it. This is physically undesirable because it
doesn’t match up with what we have in mind when we talk about
collisions between particles as intersections of their world-lines,
or when we say that electric field lines aren’t supposed to inter-
sect.
No boundary Example: 15
The open half-plane y > 0 in the Cartesian plane is a 2-manifold.
The closed half-plane y ≥ 0 is not, because it violates M2; the
Remark: The incompatibility between [1] and [2] can be interpreted as showing
that general relativity does not admit any spacetime that has all the global
Chapter 6
Vacuum solutions
In this chapter we investigate general relativity in regions of space
that have no matter to act as sources of the gravitational field.
We will not, however, limit ourselves to calculating spacetimes in
cases in which the entire universe has no matter. For example,
we will be able to calculate general-relativistic effects in the region
surrounding the earth, including a full calculation of the geodetic
effect, which was estimated in section 5.5.1 only to within an order
of magnitude. We can have sources, but we just won’t describe the
metric in the regions where the sources exist, e.g., inside the earth.
The advantage of accepting this limitation is that in regions of empty
space, we don’t have to worry about the details of the stress-energy
tensor or how it relates to curvature. As should be plausible based
on the physical motivation given in section 5.1, page 144, the field
equations in a vacuum are simply $R_{ab} = 0$.
a / A Swiss commemorative coin shows the vacuum field equation.
\[ x = \frac{1}{a} \left( \sqrt{1 + a^2 t^2} - 1 \right) , \]
\[ g'_{t't'} = (1 + ax')^2 \]
\[ g'_{x'x'} = -1 . \]
expect that this one also has zero Ricci curvature. This is straight-
forward to verify. The nonvanishing Christoffel symbols are
\[ \Gamma^{t'}{}_{x't'} = \frac{a}{1 + ax'} \quad \text{and} \quad \Gamma^{x'}{}_{t't'} = a(1 + ax') . \]
The only elements of the Riemann tensor that look like they might
be nonzero are $R^{t'}{}_{t'x'x'}$ and $R^{x'}{}_{t'x't'}$, but both of these in fact vanish.
Self-check: Verify these facts.
This seemingly routine exercise now leads us into some very in-
teresting territory. Way back on page 12, we conjectured that not all
events could be time-ordered: that is, that there might exist events
in spacetime 1 and 2 such that 1 cannot cause 2, but neither can 2
cause 1. We now have enough mathematical tools at our disposal
to see that this is indeed the case.
We observe that x(t) approaches the asymptote x = t−1/a. This
asymptote has a slope of 1, so it can be interpreted as the world-line
of a photon that chases the ship but never quite catches up to it.
Any event to the left of this line can never have a causal relationship
with any event on the ship’s world-line. Spacetime, as seen by an
observer on the ship, has been divided by a curtain into two causally
disconnected parts. This boundary is called an event horizon. Its
existence is relative to the world-line of a particular observer. An
observer who is not accelerating along with the ship does not consider
an event horizon to exist. Although this particular example of the
indefinitely accelerating spaceship has some physically implausible
features (e.g., the ship would have to run out of fuel someday), event
horizons are real things. In particular, we will see in section 6.3.2
that black holes have event horizons.
a / A spaceship (curved world-line) moves with an acceleration perceived as constant by its passengers. The photon (straight world-line) comes closer and closer to the ship, but will never quite catch up.
Interpreting everything in the $(t', x')$ coordinates tied to the ship,
the metric’s component $g'_{t't'}$ vanishes at $x' = -1/a$. An observer
aboard the ship reasons as follows. If I start out with a head-start
of 1/a relative to some event, then the timelike part of the metric at
that event vanishes. If the event marks the emission of a material
particle, then there is no possible way for that particle’s world-line
to have $ds^2 > 0$. If I were to detect a particle emitted at that event,
it would violate the laws of physics, since material particles must
have $ds^2 > 0$, so I conclude that I will never observe such a particle.
Since all of this applies to any material particle, regardless of its
mass m, it must also apply in the limit m → 0, i.e., to photons and
other massless particles. Therefore I can never receive a particle
emitted from this event, and in fact it appears that there is no
way for that event, or any other event behind the event horizon, to
have any effect on me. In my frame of reference, it appears that
light cones near the horizon are tipped over so far that their future
light-cones lie entirely in the direction away from me.
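The role of the asymptote x = t − 1/a can be made concrete with a short calculation in the earth's frame, in units with c = 1. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the photon is taken to be emitted at t = 0 from a position x0 ≤ 0 behind the ship, moving in the positive x direction. Setting x0 + t equal to the ship's position x(t) = (1/a)(√(1 + a²t²) − 1) and solving gives t = (1 − u²)/(2au) with u = 1 + ax0, which has a solution only when u > 0, i.e., when the emission point lies to the right of the horizon.

```python
import math


def ship_position(a, t):
    """World-line of the ship in the earth's frame (units with c = 1)."""
    return (math.sqrt(1.0 + (a * t) ** 2) - 1.0) / a


def catch_up_time(a, x0):
    """Time at which a photon emitted at t = 0 from x0 <= 0 reaches the ship.

    Returns None if the photon starts on or behind the horizon x = -1/a,
    in which case it never catches up.
    """
    u = 1.0 + a * x0
    if u <= 0.0:
        return None
    return (1.0 - u * u) / (2.0 * a * u)


a = 1.0
for x0 in (-0.5, -0.9, -1.0, -2.0):
    print(x0, catch_up_time(a, x0))
```

As x0 approaches −1/a from the right, the catch-up time grows without bound, which is the numerical counterpart of the photon that comes closer and closer to the ship without ever reaching it.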
We’ve already seen in example 11 on page 58 that a naive New-
tonian argument suggests the existence of black holes; if a body is
2
http://xxx.lanl.gov/abs/gr-qc/9605032
3
“On the gravitational field of a point mass according to Einstein’s theory,” Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften 1 (1916) 189. An English translation is available at http://arxiv.org/abs/physics/9905030v1.
a / The field equations of general relativity are nonlinear.
Use of ctensor
In fact, when I calculated the Christoffel symbols above by hand,
I got one of them wrong, and missed calculating one other because I
thought it was zero. I only found my mistake by comparing against
a result in a textbook. The computation of the Riemann tensor is
an even bigger mess. It’s clearly a good idea to resort to a com-
puter algebra system here. Cadabra, which was discussed earlier, is
specifically designed for coordinate-independent calculations, so it
won’t help us here. A good free and open-source choice is ctensor,
which is one of the standard packages distributed along with the
computer algebra system Maxima, introduced on page 68.
The following Maxima program calculates the Christoffel sym-
bols found in section 6.2.1.
1 load(ctensor);
2 ct_coords:[t,r,theta,phi];
3 lg:matrix([1,0,0,0],
4 [0,-1,0,0],
5 [0,0,-r^2,0],
6 [0,0,0,-r^2*sin(theta)^2]);
7 cmetric();
8 christof(mcs);
Line 1 loads the ctensor package. Line 2 sets up the names of the
coordinates. Line 3 defines the $g_{ab}$, with lg meaning “the version of
g with lower indices.” Line 7 tells Maxima to do some setup work
with $g_{ab}$, including the calculation of the inverse matrix $g^{ab}$, which
is stored in ug. Line 8 says to calculate the Christoffel symbols.
The notation mcs refers to the tensor $\Gamma'_{bc}{}^{a}$ with the indices swapped
around a little compared to the convention $\Gamma^a{}_{bc}$ followed in this
book. On a Linux system, we put the program in a file flat.mac
and run it using the command maxima -b flat.mac. The relevant
part of the output is:
                                 1
    (%t6)        mcs           = -
                    2, 3, 3      r

                                 1
    (%t7)        mcs           = -
                    2, 4, 4      r
\[ \nabla_t v^r = \partial_t v^r + \Gamma^r{}_{tc} v^c . \]
\[ h' \approx \frac{2m}{r^2} \quad \text{for } r \gg m . \]
The interpretation of this calculation is as follows. We assert the
equivalence principle, by which the acceleration of a free-falling par-
ticle can be said to be zero. After some calculations, we find that
the rate at which time flows (encoded in h) is not constant. It is
different for observers at different heights in a gravitational poten-
tial well. But this is something we had already deduced, without
the tensor gymnastics, in example 4 on page 115.
Integrating, we find that for large r, h = 1 − 2m/r.
1 load(ctensor);
2 ct_coords:[t,r,theta,phi];
3 lg:matrix([(1-2/r),0,0,0],
4 [0,-(1+b1/r),0,0],
5 [0,0,-r^2,0],
6 [0,0,0,-r^2*sin(theta)^2]);
7 cmetric();
8 ricci(true);
9 limit(r^4*ric[1,1],r,inf);
Time-reversal symmetry
The Schwarzschild metric is invariant under time reversal, since
time occurs only in the form of dt2 , which stays the same under
dt → −dt. This is the same time-reversal symmetry that occurs in
Newtonian gravity, where the field is described by the gravitational
acceleration g, and accelerations are time-reversal invariant.
Fundamentally, this is an example of general relativity’s coordi-
nate independence. The laws of physics provided by general rela-
tivity, such as the vacuum field equation, are invariant under any
smooth coordinate transformation, and t → −t is such a coordinate
transformation, so general relativity has time-reversal symmetry.
Since the Schwarzschild metric was found by imposing time-reversal-
symmetric boundary conditions on a time-reversal-symmetric differ-
ential equation, it is an equally valid solution when we time-reverse
it. Furthermore, we expect the metric to be invariant under time
reversal, unless spontaneous symmetry breaking occurs (see p. 281).
This suggests that we ask the more fundamental question of what
global symmetries general relativity has. Does it have symmetry
under parity inversion, for example? Or can we take any solution
such as the Schwarzschild spacetime and transform it into a frame
of reference in which the source of the field is moving uniformly in a
certain direction? Because general relativity is locally equivalent to
special relativity, we know that these symmetries are locally valid.
But it may not even be possible to define the corresponding global
symmetries. For example, there are some spacetimes on which it
is not even possible to define a global time coordinate. On such a
spacetime, which is described as not time-orientable, there does not
exist any smooth vector field that is everywhere timelike, so it is
not possible to define past versus future light-cones at all points in
space without having a discontinuous change in the definition occur
somewhere. This is similar to the way in which a Möbius strip does
not allow an orientation of its surface (an “up” direction as seen by
an ant) to be defined globally.
Suppose that our spacetime is time-orientable, and we are able
to define coordinates (p, q, r, s) such that p is always the timelike
coordinate. Because q → −q is a smooth coordinate transforma-
Flat space
As a first warmup, consider two spatial dimensions, represented
by Euclidean polar coordinates (r, φ). Parallel-transport of a gyro-
scope’s angular momentum around a circle of constant r gives
\[ \nabla_\phi L^\phi = 0 \]
\[ \nabla_\phi L^r = 0 . \]
Computing the covariant derivatives, we have
\[ 0 = \partial_\phi L^\phi + \Gamma^\phi{}_{\phi r} L^r \]
\[ 0 = \partial_\phi L^r + \Gamma^r{}_{\phi\phi} L^\phi . \]
The Christoffel symbols are $\Gamma^\phi{}_{\phi r} = 1/r$ and $\Gamma^r{}_{\phi\phi} = -r$. This is
all made to look needlessly complicated because $L^\phi$ and $L^r$ are expressed
in different units. Essentially the vector is staying the same,
but we’re expressing it in terms of basis vectors in the r and φ directions
that are rotating. To see this more transparently, let r = 1,
and write P for $L^\phi$ and Q for $L^r$, so that
\[ P' = -Q \]
\[ Q' = P , \]
which have solutions such as P = cos φ, Q = sin φ. For each orbit
(2π change in φ), the basis vectors rotate by 2π, so the angular
\[ P' = -Q \]
\[ Q' = (1 - \epsilon) P , \]
where $\epsilon = 2m$. The solutions rotate with frequency $\omega' = \sqrt{1 - \epsilon}$.
The result is that when the basis vectors rotate by 2π, the components
no longer return to their original values; they lag by a factor
of $\sqrt{1 - \epsilon} \approx 1 - m$. Putting the factors of r back in, this is 1 − m/r.
The deviation from unity shows that after one full revolution, the L
vector no longer has quite the same components expressed in terms
of the (r, φ) basis vectors.
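The lag can be seen directly by integrating the pair of equations around one orbit. The following sketch is illustrative rather than part of the original treatment: `eps` stands for the small parameter equal to 2m (with r = 1), the function names are hypothetical, and a standard fourth-order Runge-Kutta step is used as the integrator. Starting from P = 1, Q = 0, the exact solution is P = cos(ω′φ), Q = ω′ sin(ω′φ) with ω′ = √(1 − eps), so after φ = 2π the phase falls short of a full cycle by 2π(1 − √(1 − eps)) ≈ π·eps = 2πm.

```python
import math


def transport_one_orbit(eps, n_steps=2000):
    """Integrate P' = -Q, Q' = (1 - eps) P from phi = 0 to 2 pi using RK4."""

    def deriv(P, Q):
        return -Q, (1.0 - eps) * P

    P, Q = 1.0, 0.0
    h = 2.0 * math.pi / n_steps
    for _ in range(n_steps):
        k1 = deriv(P, Q)
        k2 = deriv(P + 0.5 * h * k1[0], Q + 0.5 * h * k1[1])
        k3 = deriv(P + 0.5 * h * k2[0], Q + 0.5 * h * k2[1])
        k4 = deriv(P + h * k3[0], Q + h * k3[1])
        P += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        Q += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return P, Q


def phase_lag(eps):
    """Angle by which the components fall short of a full cycle per orbit."""
    return 2.0 * math.pi * (1.0 - math.sqrt(1.0 - eps))


eps = 0.01  # illustrative value; for the earth, eps is many orders of magnitude smaller
print(transport_one_orbit(eps))
print(phase_lag(eps), math.pi * eps)
```

For eps = 0 the components return exactly to (1, 0), reproducing the flat-space result; for small nonzero eps the two printed lag values agree to leading order in eps.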
To understand the sign of the effect, let’s imagine a counter-
clockwise rotation. The (r, φ) rotate counterclockwise, so relative
to them, the L vector rotates clockwise. After one revolution, it has
not rotated clockwise by a full 2π, so its orientation is now slightly
counterclockwise compared to what it was. Thus the contribution
to the geodetic effect arising from spatial curvature is in the same
direction as the orbit.
Comparing with the actual results from Gravity Probe B, we see
that the direction of the effect is correct. The magnitude, however,
is off. The precession accumulated over n periods is 2πnm/r, or,
in SI units, $2\pi nGm/c^2r$. Using the data from section 2.5.4, we find
$\Delta\theta = 2 \times 10^{-5}$ radians, which is too small compared to the data
shown in figure b on page 154.
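The order of magnitude quoted above can be reproduced with a few lines of Python. The data from section 2.5.4 are not reproduced in this passage, so the earth's mass, the orbital radius (roughly 640 km altitude), and the one-year duration used below are assumed round values, and the orbital period is estimated from Newtonian circular motion.

```python
import math

G = 6.67e-11        # gravitational constant, SI units
c = 3.00e8          # speed of light, m/s
m_earth = 5.97e24   # mass of the earth, kg (assumed value)
r_orbit = 7.02e6    # orbital radius, m (assumed: about 640 km altitude)
duration = 3.16e7   # observation time, s (assumed: one year)

# Newtonian period of a circular orbit, used only to count the orbits
period = 2.0 * math.pi * math.sqrt(r_orbit**3 / (G * m_earth))
n = duration / period

# purely spatial contribution accumulated over n orbits: 2*pi*n*G*m/(c^2 r)
delta_theta = 2.0 * math.pi * n * G * m_earth / (c**2 * r_orbit)

print("number of orbits:", round(n))
print("accumulated angle (radians):", delta_theta)
```

With these inputs the result is about 2 × 10⁻⁵ radians, consistent with the figure in the text.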
2+1 Dimensions
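To reproduce the experimental results correctly, we need to
include the time dimension. The angular momentum vector now has
components $(L^\phi, L^r, L^t)$. The physical interpretation of the $L^t$
component is obscure at this point; we’ll return to this question later.
$$\begin{aligned}
\frac{dL^\phi}{d\phi} &= \partial_\phi L^\phi + \omega^{-1}\partial_t L^\phi \\
\frac{dL^r}{d\phi} &= \partial_\phi L^r + \omega^{-1}\partial_t L^r \\
\frac{dL^t}{d\phi} &= \partial_\phi L^t + \omega^{-1}\partial_t L^t
\end{aligned}$$
Setting the covariant derivatives equal to zero gives
$$\begin{aligned}
0 &= \partial_\phi L^\phi + \Gamma^\phi{}_{\phi r} L^r \\
0 &= \partial_\phi L^r + \Gamma^r{}_{\phi\phi} L^\phi \\
0 &= \partial_t L^r + \Gamma^r{}_{tt} L^t \\
0 &= \partial_t L^t + \Gamma^t{}_{tr} L^r .
\end{aligned}$$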
Self-check: There are not just four but six covariant derivatives
that could in principle have occurred, and in these six covariant
derivatives we could have had a total of 18 Christoffel symbols. Of
these 18, only four are nonvanishing. Explain based on symmetry
arguments why the following Christoffel symbols must vanish: $\Gamma^\phi{}_{\phi t}$,
$\Gamma^t{}_{tt}$.
Putting all this together in matrix form, we have $L' = ML$,
where
$$M = \begin{pmatrix}
0 & -1 & 0 \\
1-\epsilon & 0 & -\epsilon(1-\epsilon)/2\omega \\
0 & -\epsilon/2\omega(1-\epsilon) & 0
\end{pmatrix} .$$
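One way to read this matrix is through its eigenvalues, which are 0 and ±iΩ with Ω² = 1 − ε − ε²/4ω². The Python sketch below evaluates Ω and checks that iΩ is an eigenvalue. The eigenvalue analysis is not part of the passage above, and the relation ω² = m/r³ (Kepler's law, here with r = 1) is an additional assumption introduced for the illustration, as are the helper names.

```python
import math

def transport_matrix(eps, omega):
    """Matrix M of the system L' = M L, with r = 1 and eps = 2m."""
    return [
        [0.0, -1.0, 0.0],
        [1.0 - eps, 0.0, -eps * (1.0 - eps) / (2.0 * omega)],
        [0.0, -eps / (2.0 * omega * (1.0 - eps)), 0.0],
    ]

def det3(a):
    """Determinant of a 3x3 matrix; the entries may be complex."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def rotation_frequency(eps, omega):
    """Omega such that the eigenvalues of M are 0 and +/- i*Omega."""
    M = transport_matrix(eps, omega)
    return math.sqrt(-(M[0][1] * M[1][0] + M[1][2] * M[2][1]))

m = 1.0e-3
eps = 2.0 * m
omega = math.sqrt(m)   # assumption: Kepler's law omega^2 = m/r^3 with r = 1
Omega = rotation_frequency(eps, omega)

# check that i*Omega is an eigenvalue: det(M - lambda*I) should vanish
lam = 1j * Omega
M = transport_matrix(eps, omega)
shifted = [[M[i][j] - (lam if i == j else 0.0) for j in range(3)]
           for i in range(3)]
residual = abs(det3(shifted))

print("Omega =", Omega, " compare 1 - 3m/2 =", 1.0 - 1.5 * m)
print("|det(M - i*Omega*I)| =", residual)
```

Under the Kepler assumption, Ω² = 1 − 3ε/2, so the components fall short of a full turn by a fraction of about 3m/2 per orbit, compared with m in the purely spatial calculation.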
6.2.6 Orbits
The main event of Newton’s Principia Mathematica is his proof
of Kepler’s laws. Similarly, Einstein’s first important application in
general relativity, which he began before he even had the exact form
of the Schwarzschild metric in hand, was to find the non-Newtonian
behavior of the planet Mercury. The planets deviate from Keplerian
behavior for a variety of Newtonian reasons, and in particular there
is a long list of reasons why the major axis of a planet’s elliptical
orbit is expected to gradually rotate. When all of these were taken
into account, however, there was a remaining discrepancy of about
40 seconds of arc per century, or $6.6\times10^{-7}$ radians per orbit. The
direction of the effect was in the forward direction, in the sense that
if we view Mercury’s orbit from above the ecliptic, so that it orbits
6 Misner, Thorne, and Wheeler, Gravitation, p. 1118
7 Rindler, Essential Relativity, 1969, p. 141
Conserved quantities
If Einstein had had a computer on his desk, he probably would
simply have integrated the motion numerically using the geodesic
equation. But it is possible to simplify the problem enough to at-
tack it with pencil and paper, if we can find the relevant conserved
quantities of the motion. Nonrelativistically, these are energy and
angular momentum.
Consider a rock falling directly toward the sun. The Schwarzschild
metric is of the special form
$$ds^2 = h(r)\,dt^2 - k(r)\,dr^2 - \ldots .$$
The rock’s trajectory is a geodesic, so it extremizes the proper time
s between any two events fixed in spacetime, just as a piece of string
stretched across a curved surface extremizes its length. Let the rock
pass through distance r1 in coordinate time t1 , and then through r2
in t2 . (These should really be notated as ∆r1 , . . . or dr1 , . . . , but we
avoid the ∆’s or d’s for convenience.) Approximating the geodesic
using two line segments, the proper time is
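$$\begin{aligned}
s &= s_1 + s_2 \\
  &= \sqrt{h_1 t_1^2 - k_1 r_1^2} + \sqrt{h_2 t_2^2 - k_2 r_2^2} \\
  &= \sqrt{h_1 t_1^2 - k_1 r_1^2} + \sqrt{h_2 (T-t_1)^2 - k_2 r_2^2} ,
\end{aligned}$$
where $T = t_1 + t_2$ is fixed. If this is to be extremized with respect
to $t_1$, then $ds/dt_1 = 0$, which leads to
$$0 = \frac{h_1 t_1}{s_1} - \frac{h_2 t_2}{s_2} ,$$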
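The extremization can be checked numerically. The Python sketch below picks sample values for the metric coefficients and segment lengths (all illustrative, with k = 1/h as in the Schwarzschild case), maximizes s = s₁ + s₂ over t₁ by ternary search, and confirms that h t/s takes the same value on both segments. The function names and the choice of ternary search are assumptions of this sketch, not part of the text.

```python
import math

def segment_times(t1, T, h1, k1, r1, h2, k2, r2):
    """Proper times s1 and s2 of the two straight segments."""
    t2 = T - t1
    s1 = math.sqrt(h1 * t1**2 - k1 * r1**2)
    s2 = math.sqrt(h2 * t2**2 - k2 * r2**2)
    return s1, s2

def extremal_t1(T, h1, k1, r1, h2, k2, r2, iterations=200):
    """Ternary search for the t1 that maximizes s1 + s2.
    The sum is concave in t1 on the interval where both segments are timelike."""
    lo = r1 * math.sqrt(k1 / h1)        # below this, segment 1 is not timelike
    hi = T - r2 * math.sqrt(k2 / h2)    # above this, segment 2 is not timelike
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if sum(segment_times(a, T, h1, k1, r1, h2, k2, r2)) < \
           sum(segment_times(b, T, h1, k1, r1, h2, k2, r2)):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)

# illustrative values only
h1, k1, r1 = 0.8, 1.0 / 0.8, 1.0
h2, k2, r2 = 0.7, 1.0 / 0.7, 1.0
T = 10.0

t1 = extremal_t1(T, h1, k1, r1, h2, k2, r2)
s1, s2 = segment_times(t1, T, h1, k1, r1, h2, k2, r2)
e1 = h1 * t1 / s1            # h dt/ds on the first segment
e2 = h2 * (T - t1) / s2      # h dt/ds on the second segment

print("t1 =", t1)
print("h1 t1 / s1 =", e1, "  h2 t2 / s2 =", e2)
```

The two printed quantities agree to within the accuracy of the search, which is the statement that h dt/ds does not change from one segment to the next.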
and
$$L = r^2 \frac{d\phi}{ds}$$
for the conserved energy per unit mass and angular momentum per
unit mass.
In interpreting the energy per unit mass E, it is important to
understand that in the general-relativistic context, there is no use-
ful way of separating the rest mass, kinetic energy, and potential
energy into separate terms, as we could in Newtonian mechanics.
E includes contributions from all of these, and turns out to be less
than the contribution due to the rest mass (i.e., less than 1) for a
planet orbiting the sun. It turns out that E can be interpreted as a
measure of the additional gravitational mass that the solar system
possesses as measured by a distant observer, due to the presence of
the planet. It then makes sense that E is conserved; by analogy
with Newtonian mechanics, we would expect that any gravitational
effects that depended on the detailed arrangement of the masses
within the solar system would decrease as $1/r^4$, becoming negligible
at large distances and leaving a constant field varying as $1/r^2$.
One way of seeing that it doesn’t make sense to split E into parts
is that although the equation given above for E involves a specific set
Perihelion advance
For convenience, let the mass of the orbiting rock be 1, while m
stands for the mass of the gravitating body.
The unit mass of the rock is a third conserved quantity, and
since the squared magnitude of the momentum vector equals the square of
the mass, we have for an orbit in the plane θ = π/2,
or
$$\dot{r}^2 = E^2 - U^2$$
where
$$U^2 = (1 - 2m/r)(1 + L^2/r^2) .$$
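$$\begin{aligned}
r &= \frac{L^2}{2m}\left(1 + \sqrt{1 - 12m^2/L^2}\right) \\
  &\approx \frac{L^2}{m}(1-\epsilon) ,
\end{aligned}$$
where $\epsilon = 3(m/L)^2$. A planet in a nearly circular orbit oscillates
between perihelion and aphelion with a period that depends on the
curvature of $U^2$ at its minimum. We have
$$\begin{aligned}
k &= \frac{d^2(U^2)}{dr^2} \\
  &= \frac{d^2}{dr^2}\left(1 - \frac{2m}{r} + \frac{L^2}{r^2} - \frac{2mL^2}{r^3}\right) \\
  &= -\frac{4m}{r^3} + \frac{6L^2}{r^4} - \frac{24mL^2}{r^5} \\
  &= 2L^{-6}m^4(1+2\epsilon)
\end{aligned}$$
$$\begin{aligned}
\Delta s_{az} &= 2\pi r^2/L \\
  &= 2\pi L^3 m^{-2}(1-2\epsilon) .
\end{aligned}$$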
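The first-order expressions in ε can be compared with the exact ones numerically. The Python sketch below does this for sample values m = 1 and L = 100 in geometrized units; these values and the function names are illustrative choices, not data from the text.

```python
import math

def circular_orbit_radius(m, L):
    """Exact radius of the stable circular orbit (the minimum of U^2)."""
    return (L**2 / (2.0 * m)) * (1.0 + math.sqrt(1.0 - 12.0 * m**2 / L**2))

def curvature_of_U2(m, L, r):
    """Exact second derivative of U^2 = (1 - 2m/r)(1 + L^2/r^2)."""
    return -4.0 * m / r**3 + 6.0 * L**2 / r**4 - 24.0 * m * L**2 / r**5

m, L = 1.0, 100.0               # illustrative values, geometrized units
eps = 3.0 * (m / L)**2

r_exact = circular_orbit_radius(m, L)
r_approx = (L**2 / m) * (1.0 - eps)

k_exact = curvature_of_U2(m, L, r_exact)
k_approx = 2.0 * L**-6 * m**4 * (1.0 + 2.0 * eps)

s_az_exact = 2.0 * math.pi * r_exact**2 / L
s_az_approx = 2.0 * math.pi * L**3 * m**-2 * (1.0 - 2.0 * eps)

print("r:    ", r_exact, r_approx)
print("k:    ", k_exact, k_approx)
print("s_az: ", s_az_exact, s_az_approx)
```

Each pair agrees up to terms of order ε², which is the accuracy claimed by the approximations.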
1 import math
2
3 # constants, in SI units:
4 G = 6.67e-11 # gravitational constant
5 c = 3.00e8 # speed of light
6 m_kg = 1.99e30 # mass of sun
8 http://arxiv.org/abs/astro-ph/0407149
At line 14, we take the mass to be 1000 times greater than the
mass of the sun. This helps to make the deflection easier to calcu-
late accurately without running into problems with rounding errors.
Lines 17-25 set up the initial conditions to be at the point of closest
approach, as the photon is grazing the sun. This is easier to set
up than initial conditions in which the photon approaches from far
away. Because of this, the deflection angle calculated by the pro-
gram is cut in half. Combining the factors of 1000 and one half, the
final result from the program is to be interpreted as 500 times the
actual deflection angle.
The result is that the deflection angle is predicted to be 870
seconds of arc. As a check, we can run the program again with
m = 0; the result is a deflection of −8 seconds, which is a measure
of the accumulated error due to rounding and the finite increment
used for λ.
Dividing by 500, we find that the predicted deflection angle is
1.74 seconds, which, expressed in radians, is exactly $4Gm/c^2 r$. The
unitless factor of 4 is in fact the correct result in the case of small
deflections, i.e., for $m/r \ll 1$.
Although the numerical technique has the disadvantage that it
doesn’t let us directly prove a nice formula, it has some advantages
as well. For one thing, we can use it to investigate cases for which
the approximation $m/r \ll 1$ fails. For m/r = 0.3, the numerical
technique gives a deflection of 222 degrees, whereas the weak-field
approximation $4Gm/c^2 r$ gives only 69 degrees. What is happening
here is that we’re getting closer and closer to the event horizon of a
black hole. Black holes are the topic of section 6.3, but it should be
intuitively reasonable that something wildly nonlinear has to happen
as we get close to the point where the light wouldn’t even be able
to escape.
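The two weak-field numbers quoted in this discussion can be reproduced directly from the formula 4Gm/c²r. In the Python sketch below, the solar radius is an assumed value that is not given in this passage, and the function name is made up for the illustration.

```python
import math

G = 6.67e-11       # gravitational constant, SI units
c = 3.00e8         # speed of light, m/s
m_sun = 1.99e30    # mass of the sun, kg
r_sun = 6.96e8     # radius of the sun, m (assumed value, not from the text)

def weak_field_deflection(m_over_r):
    """Weak-field deflection angle in radians, for m/r in geometrized units."""
    return 4.0 * m_over_r

# light grazing the sun
sun_arcsec = math.degrees(weak_field_deflection(G * m_sun / (c**2 * r_sun))) * 3600.0

# the strong-field case m/r = 0.3, where the approximation is not trustworthy
strong_degrees = math.degrees(weak_field_deflection(0.3))

print("sun: %.2f seconds of arc" % sun_arcsec)
print("m/r = 0.3: %.0f degrees from the weak-field formula" % strong_degrees)
```

The first value comes out near 1.75 seconds of arc, and the second near 69 degrees, far below the 222 degrees found numerically.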
The precision of Eddington’s original test was only about ± 30%,
and has never been improved on significantly with visible-light as-
tronomy. A better technique is radio astronomy, which allows mea-
surements to be carried out without waiting for an eclipse. One
merely has to wait for the sun to pass in front of a strong, compact
radio source such as a quasar. These techniques have now verified
the deflection of light predicted by general relativity to a relative
precision of about $10^{-5}$.⁹
9 For a review article on this topic, see Clifford Will, “The Confrontation between General Relativity and Experiment,” http://relativity.livingreviews.org/Articles/lrr-2006-3/.
10 arxiv.org/abs/0903.1105
11 arxiv.org/abs/0906.4040
Particle physics
Hawking radiation has some intriguing properties from the point
of view of particle physics. In a particle accelerator, the list of
Black-hole complementarity
A very difficult question about the relationship between quan-
tum mechanics and general relativity occurs as follows. In our ex-
ample above, observer A detects an extremely red-shifted spectrum
of light from the black hole. A interprets this as evidence that
the space near the event horizon is actually an intense maelstrom
of radiation, with the temperature approaching infinity as one gets
closer and closer to the horizon. If B returns from the region near
the horizon, B will agree with this description. But suppose that
observer C simply drops straight through the horizon. C does not
feel any acceleration, so by the equivalence principle C does not
detect any radiation at all. Passing down through the event
horizon, C says, “A and B are liars! There’s no radiation at all.” A
and B, however, see C as having entered a region of infinitely
intense radiation. “Ah,” says A, “too bad. C should have turned
back before it got too hot, just as I did.” This is an example of a
principle we’ve encountered before, that when gravity and quantum
mechanics are combined, different observers disagree on the number
of quanta present in the vacuum. We are presented with a paradox,
because A and B believe in an entirely different version of reality
than C. A and B say C was fricasseed, but C knows that that didn’t
happen. One suggestion is that this contradiction shows that the
proper logic for describing quantum gravity is nonaristotelian, as
described on page 60. This idea, suggested by Susskind et al., goes
by the name of black-hole complementarity, by analogy with Niels
Bohr’s philosophical description of wave-particle duality as being
14 arxiv.org/abs/gr-qc/0503022v4
15 arxiv.org/abs/gr-qc/9506079v1
with solution
$$u = \pm\frac{2}{3} t^{3/2} , \qquad t<0 .$$
There is no solution for t > 0.
a / The change of coordinates is degenerate at t = 0.
If physicists living in this universe, at t < 0, for some reason
choose t as their time coordinate, there is in fact a way for them to
tell that the cataclysmic event at t = 0 is not a reliable prediction.
At t = 0, their metric’s time component vanishes, so its signature
changes from + − −− to 0 − −−. At that moment, the machin-
ery of the standard tensor formulation of general relativity breaks
down. For example, one can no longer raise indices, because $g^{ab}$ is
the matrix inverse of $g_{ab}$, but $g_{ab}$ is not invertible. Since the field
equations are ultimately expressed in terms of the metric using ma-
chinery that includes raising and lowering of indices, there is no way
to apply them at t = 0. They don’t make a false prediction of the
end of the world; they fail to make any prediction at all. Physicists
accustomed to working in terms of the t coordinate can simply throw
up their hands and say that they have no way to predict anything
at t > 0. But they already know that their spacetime is one whose
observables, such as curvature, are all constant with respect to time,
so they should ask why this perfect symmetry is broken by singling
out t = 0. There is physically nothing that should make one mo-
ment in time different than any other, so choosing a particular time
to call t = 0 should be interpreted merely as an arbitrary choice of
$$ds^2 = -dr^2 - \frac{r^2}{1-\omega^2 r^2}\, d\theta'^2 .$$
Identify the two values of r at which singularities occur, and classify
them as coordinate or non-coordinate singularities.
(c) Consider the following argument, which is intended to provide
an answer to part b without any computation. In two dimensions,
there is only one measure of curvature, which is equivalent (up to a
constant of proportionality) to the Gaussian curvature. The Gaus-
sian curvature is proportional to the angular deficit of a triangle.
Since the angular deficit ε of a triangle in a space with negative
curvature satisfies the inequality −π < ε < 0, we conclude that the
Gaussian curvature can never be infinite. Since there is only one
measure of curvature in a two-dimensional space, this means that
there is no non-coordinate singularity. Is this argument correct,
and is the claimed result consistent with your answers to part b?
▷ Solution, p. 332
4 The first experimental verification of gravitational redshifts
was a measurement in 1925 by W.S. Adams of the spectrum of light
emitted from the surface of the white dwarf star Sirius B. Sirius B
has a mass of $0.98M_\odot$ and a radius of $5.9\times10^6$ m. Find the redshift.
Problems 217
7 Verify by direct calculation, as asserted on p. 214, that the
Riemann tensor vanishes for the metric $ds^2 = -t\,dt^2 - d\ell^2$, where
$d\ell^2 = dx^2 + dy^2 + dz^2$. ▷ Solution, p. 333
8 Suppose someone proposes that the vacuum field equation
of general relativity isn’t $R_{ab} = 0$ but rather $R_{ab} = k$, where k is
some constant that describes an innate tendency of spacetime to
have tidal distortions. Explain why this is not a good proposal.
▷ Solution, p. 333
9 Prove, as claimed on p. 212, that in 2+1 dimensions, with a
vanishing cosmological constant, there is no nontrivial Schwarzschild
metric. ▷ Solution, p. 333
Example 1: The Euclidean plane
The Euclidean plane has two Killing vectors corresponding to
translation in two linearly independent directions, plus a third Killing
vector for rotation about some arbitrarily chosen origin O. In
Cartesian coordinates, one way of writing a complete set of these is
$$\begin{aligned}
\xi_1 &= (1, 0) \\
\xi_2 &= (0, 1) \\
\xi_3 &= (-y, x) .
\end{aligned}$$
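In Cartesian coordinates on the Euclidean plane the metric is constant, so the condition for ξ to be a Killing vector reduces to ∂ₐξ_b + ∂_bξₐ = 0. The Python sketch below tests this with central differences for the three fields listed above, and for a dilation field (x, y) that is included only as a contrasting non-example. The helper name `killing_residual` and the sample point are choices made for this illustration.

```python
def killing_residual(xi, x, y, h=1e-5):
    """Largest |d_a xi_b + d_b xi_a| at the point (x, y), using central
    differences. Valid as a Killing test only in Cartesian coordinates on
    the Euclidean plane, where covariant derivatives are partial derivatives."""
    def partial(a, b):
        # derivative of component b with respect to coordinate a
        if a == 0:
            return (xi(x + h, y)[b] - xi(x - h, y)[b]) / (2.0 * h)
        return (xi(x, y + h)[b] - xi(x, y - h)[b]) / (2.0 * h)

    return max(abs(partial(a, b) + partial(b, a))
               for a in range(2) for b in range(2))

def xi1(x, y):
    return (1.0, 0.0)      # translation along x

def xi2(x, y):
    return (0.0, 1.0)      # translation along y

def xi3(x, y):
    return (-y, x)         # rotation about the origin

def dilation(x, y):
    return (x, y)          # uniform scaling: not a symmetry of the metric

point = (0.3, -1.2)
residuals = [killing_residual(f, *point) for f in (xi1, xi2, xi3)]
dilation_residual = killing_residual(dilation, *point)

print("Killing fields:", residuals)
print("dilation:", dilation_residual)
```

The three residuals vanish to within rounding error, while the dilation gives a residual of 2, reflecting the fact that rescaling changes distances.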
2 Hawking and Ellis, The Large Scale Structure of Space-Time, p. 62, give
a succinct treatment that describes the flux densities and proves that Gauss’s
theorem, which ordinarily fails in curved spacetime for a non-scalar flux, holds in
the case where the appropriate Killing vectors exist. For an explicit description
of how one can integrate to find a scalar mass-energy, see Winitzki, Topics in
General Relativity, section 3.1.5, available for free online.
3 Johannsen and Psaltis, http://arxiv.org/abs/1008.3902v1
We have already solved the field equations for a metric of this form
and found as a solution the Schwarzschild spacetime.⁶ Since the
metric’s components are all independent of t, $\partial_t$ is a Killing
vector, and it is timelike for large r, so the Schwarzschild spacetime is
asymptotically static.
The no-hair theorems say that relativity only has a small reper-
toire of types of black-hole singularities, defined as singularities in-
side regions of space that are causally disconnected from the uni-
verse, in the sense that future light-cones of points in the region
do not extend to infinity.⁷ That is, a black hole is defined as a
singularity hidden behind an event horizon, and since the defini-
tion of an event horizon is dependent on the observer, we specify
an observer infinitely far away. The theorems cannot classify naked
singularities, i.e., those not hidden behind horizons, because the role
of naked singularities in relativity is the subject of the cosmic cen-
sorship hypothesis, which is an open problem. The theorems do
not rule out the Big Bang singularity, because we cannot define the
notion of an observer infinitely far from the Big Bang. We can also
see that Birkhoff’s theorem does not prohibit the Big Bang, because
cosmological models are not vacuum solutions with Λ = 0. Black
string solutions are not ruled out by Birkhoff’s theorem because they
would lack spherical symmetry, so we need the arguments given on
p. 212 to show that they don’t exist.
1 load(ctensor);
2 ct_coords:[t,x,y,z];
3 lg:matrix([exp(2*z),0,0,0],
4 [0,-exp(-2*j*z),0,0],
5 [0,0,-exp(-2*k*z),0],
6 [0,0,0,-1]
7 );
8 cmetric();
9 scurvature();
10 leinstein(true);
The output from line 9 shows that the scalar curvature is constant,
which is a necessary condition for any spacetime that we want to
think of as representing a uniform field. Inspecting the Einstein
tensor output by line 10, we find that in order to get $G_{xx}$ and $G_{yy}$
to vanish, we need j and k to be $(1 \pm \sqrt{3}i)/2$. By trial and error, we
find that assigning the complex-conjugate values to j and k makes
$G_{tt}$ and $G_{zz}$ vanish as well, so that we have a vacuum solution.
This solution is, unfortunately, complex, so it is not of any obvious
value as a physically meaningful result. Since the field equations
are nonlinear, we can’t use the usual trick of forming real-valued
superpositions of the complex solutions. We could try simply
taking the real part of the metric. This gives $g_{xx} = e^{-z}\cos\sqrt{3}z$ and
$g_{yy} = e^{-z}\sin\sqrt{3}z$, and is unsatisfactory because the metric becomes
degenerate (has a zero determinant) at $z = n\pi/2\sqrt{3}$, where n is an
integer.
It turns out, however, that there is a very similar solution, found
by Petrov in 1962,¹¹ that is real-valued. The Petrov metric, which
10 A metric of this general form is referred to as a Kasner metric. One usually
sees it written with a logarithmic change of variables, so that z appears in the
base rather than in the exponent.
11 Petrov, in Recent Developments in General Relativity, 1962, Pergamon, p.
383. For a presentation that is freely accessible online, see Gibbons and Gielen,
“The Petrov and Kaigorodov-Ozsváth Solutions: Spacetime as a Group Manifold,”
arxiv.org/abs/0802.4082.
12 http://golem.ph.utexas.edu/string/archives/000550.html
from page 233 has constant values of R = 1/2 and k = 1/4. Note
that Maxima’s ctensor package has built-in functions for these; you
have to call lriemann and uriemann before calling them.
(b) Similarly, show that the Petrov metric
$$ds^2 = -dr^2 - e^{-2r}dz^2 + e^{r}\left[2\sin\sqrt{3}r\,d\phi\,dt - \cos\sqrt{3}r\,(d\phi^2 - dt^2)\right]$$
Remark: Surprisingly, one can have a spacetime on which every possible curva-
ture invariant vanishes identically, and yet which is not flat. See Coley, Hervik,
and Pelavas, “Spacetimes characterized by their scalar curvature invariants,”
arxiv.org/abs/0901.0791v2.
netism, knowledge of such a solution leads directly to the ability to
write down the field equations with sources included. If Coulomb’s
law tells us the $1/r^2$ variation of the electric field of a point charge,
then we can infer Gauss’s law. The situation in general relativity
is not this simple. The field equations of general relativity, unlike
Gauss’s law, are nonlinear, so we can’t simply say that a planet
or a star is a solution to be found by adding up a large number of
point-source solutions. It’s also not clear how one could represent a
moving source, since the singularity is a point that isn’t even part
of the continuous structure of spacetime (and its location is also
hidden behind an event horizon, so it can’t be observed from the
outside).
Experimental tests
But how do we know that this prediction is even correct? Can
it be verified in the laboratory? The classic laboratory test of the
1 Kreuzer, Phys. Rev. 169 (1968) 1007
c / The Kreuzer experiment. 1. There are two passive masses, P, and an active mass A consisting of
a single 23-cm diameter teflon cylinder immersed in a fluid. The teflon cylinder is driven back and forth
with a period of 400 s. The resulting deflection of the torsion beam is monitored by an optical lever and
canceled actively by electrostatic forces from capacitor plates (not shown). The voltage required for this active
cancellation is a measure of the torque exerted by A on the torsion beam. 2. Active mass as a function of
temperature. 3. Passive mass as a function of temperature. In both 2 and 3, temperature is measured in units
of ohms, i.e., the uncalibrated units of a thermistor that was immersed in the liquid.
Singularity theorems
An important example of the use of the energy conditions is that
Hawking and Ellis have proved that under the assumption of the
strong energy condition, any body that becomes sufficiently com-
pact will end up forming a singularity. We might imagine that
the formation of a black hole would be a delicate thing, requiring
perfectly symmetric initial conditions in order to end up with the
perfectly symmetric Schwarzschild metric. Many early relativists
thought so, for good reasons. If we look around the universe at
various scales, we find that collisions between astronomical bodies
are extremely rare. This is partly because the distances are vast
compared to the sizes of the objects, but also because conservation
of angular momentum has a tendency to make objects swing past
one another rather than colliding head-on. Starting with a cloud of
objects, e.g., a globular cluster, Newton’s laws make it extremely
difficult, regardless of the attractive nature of gravity, to pick initial
conditions that will make them all collide in the future. For one thing,
they would have to have exactly zero total angular momentum.
Most relativists now believe that this is not the case. General
relativity describes gravity in terms of the tipping of light cones.
When the field is strong enough, there is a tendency for the light
cones to tip over so far that the entire future light-cone points at the
source of the field. If this occurs on an entire surface surrounding
the source, it is referred to as a trapped surface.
To make this notion of light cones “pointing at the source” more
rigorous, we need to define the volume expansion Θ. Let the set of
all points in a spacetime (or some open subset of it) be expressed as
the union of geodesics. This is referred to as a foliation in geodesics,
or a congruence. Let the velocity vector tangent to such a curve
be $u^a$. Then we define $\Theta = \nabla_a u^a$. This is exactly analogous to
the classical notion of the divergence of the velocity field of a fluid,
which is a measure of compression or expansion. Since Θ is a scalar,
it is coordinate-independent. Negative values of Θ indicate that the
geodesics are converging, so that volumes of space shrink. A trapped
surface is one on which Θ is negative when we foliate with lightlike
geodesics oriented outward along normals to the surface.
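The classical analogy can be made concrete with a small Python sketch. It treats only the flat-space case in Cartesian coordinates, where the divergence is a sum of ordinary partial derivatives, and evaluates it by central differences for three simple velocity fields. It is not a calculation of Θ for a congruence of lightlike geodesics, and the field and function names are invented for the illustration.

```python
def divergence(u, x, y, z, h=1e-5):
    """Flat-space divergence of the velocity field u at (x, y, z),
    estimated with central differences in Cartesian coordinates."""
    dux = (u(x + h, y, z)[0] - u(x - h, y, z)[0]) / (2.0 * h)
    duy = (u(x, y + h, z)[1] - u(x, y - h, z)[1]) / (2.0 * h)
    duz = (u(x, y, z + h)[2] - u(x, y, z - h)[2]) / (2.0 * h)
    return dux + duy + duz

def expanding(x, y, z):
    return (x, y, z)          # flow directed away from the origin

def converging(x, y, z):
    return (-x, -y, -z)       # flow directed toward the origin

def rotating(x, y, z):
    return (-y, x, 0.0)       # rigid rotation about the z axis

point = (0.5, -0.2, 1.0)
theta_expanding = divergence(expanding, *point)
theta_converging = divergence(converging, *point)
theta_rotating = divergence(rotating, *point)

print(theta_expanding, theta_converging, theta_rotating)
```

The converging flow has a negative divergence, the sign that marks shrinking volumes in the definition above, while the rotating flow has zero divergence because it neither compresses nor expands the fluid.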
When a trapped surface forms, any lumpiness or rotation in
the initial conditions becomes irrelevant, because every particle’s
entire future world-line lies inward rather than outward. A possi-
ble loophole in this argument is the question of whether the light
cones will really tip over far enough. We could imagine that un-
der extreme conditions of high density and temperature, matter
might demonstrate unusual behavior, perhaps including a negative
energy density, which would then give rise to a gravitational repul-
Current status
The current status of the energy conditions is shaky. Although
it is clear that all of them hold in a variety of situations, there are
strong reasons to believe that they are violated at both microscopic
and cosmological scales, for reasons both classical and quantum-
mechanical.⁹ We will see such a violation in the following section.
9 Barcelo and Visser, “Twilight for the energy conditions?,” http://arxiv.org/abs/gr-qc/0205066v1.
• It should be coordinate-independent.
11 For a detailed review of the evidence that rules out various variations on the
Hoyle theme, see http://www.astro.ucla.edu/~wright/stdystat.htm.
1 load(ctensor);
2 dim:3;
3 ct_coords:[r,theta,phi];
4 depends(f,t);
5 lg:matrix([f,0,0],
6 [0,r^2,0],
7 [0,0,r^2*sin(theta)^2]);
8 cmetric();
9 einstein(true);
Line 2 tells Maxima that we’re working in a space with three di-
mensions rather than its default of four. Line 4 tells it that f is a
$$ds^2 = dt^2 - a^2\left(\frac{dr^2}{1-kr^2} + r^2 d\theta^2 + r^2\sin^2\theta\, d\phi^2\right) .$$
load(ctensor);
ct_coords:[t,r,theta,phi];
depends(a,t);
lg:matrix([1,0,0,0],
  [0,-a^2/(1-k*r^2),0,0],
  [0,0,-a^2*r^2,0],
  [0,0,0,-a^2*r^2*sin(theta)^2]);
cmetric();
einstein(true);
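The result is
$$\begin{aligned}
G^t{}_t &= 3\left(\frac{\dot{a}}{a}\right)^2 + 3ka^{-2} \\
G^r{}_r = G^\theta{}_\theta = G^\phi{}_\phi &= 2\frac{\ddot{a}}{a} + \left(\frac{\dot{a}}{a}\right)^2 + ka^{-2} ,
\end{aligned}$$
$$\begin{aligned}
\frac{\ddot{a}}{a} &= \frac{1}{3}\Lambda - \frac{4\pi}{3}(\rho + 3P) \\
\left(\frac{\dot{a}}{a}\right)^2 &= \frac{1}{3}\Lambda + \frac{8\pi}{3}\rho - ka^{-2} .
\end{aligned}$$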
3. The size of the solar system increases at this rate as well (i.e.,
gravitationally bound systems get bigger, including the earth
and the Milky Way).
shows that k enters only via a factor of the form $(\ldots)e^{(\ldots)t} + (\ldots)k$. For large t, the
k term becomes negligible, and the Einstein tensor becomes $G^a{}_b = g^a{}_b\Lambda$. This is
consistent with the approximation we used in deriving the solution, which was
to ignore both the source terms and the k term in the Friedmann equations.
The exact solutions with Λ > 0 and k = −1, 0, and 1 turn out in fact to be
equivalent except for a change of coordinates.
The metric
$$ds^2 = \left(1 - \frac{2m}{r} - \frac{1}{3}\Lambda r^2\right)dt^2 - \frac{dr^2}{1 - \frac{2m}{r} - \frac{1}{3}\Lambda r^2} - r^2 d\theta^2 - r^2\sin^2\theta\, d\phi^2$$
is an exact solution to the Einstein field equations with cosmo-
logical constant Λ, and can be interpreted as a universe in which
$$\begin{aligned}
P &= 0 \\
\rho &\propto a^{-3} \\
\ddot{a} &= -ca^{-2} \\
\dot{a} &= \sqrt{2c}\,a^{-1/2} .
\end{aligned}$$
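These relations can be checked numerically. The Python sketch below integrates ȧ = √(2c) a^(−1/2) with a fourth-order Runge-Kutta scheme and compares the growth of a with the power law a ∝ t^(2/3) obtained by integrating the equation in closed form; it also confirms that the first-order equation implies ä = −c a⁻². The constant is named `c_const` in the code, its value is arbitrary, and the function names are invented for this illustration.

```python
import math

c_const = 1.0    # the constant called c in the equations; value is arbitrary

def a_dot(a):
    """First-order equation: da/dt = sqrt(2c) * a**(-1/2)."""
    return math.sqrt(2.0 * c_const / a)

def a_closed_form(t):
    """Solution with a(0) = 0: (2/3) a**(3/2) = sqrt(2c) * t."""
    return (1.5 * math.sqrt(2.0 * c_const) * t) ** (2.0 / 3.0)

def integrate(a_start, t_start, t_end, steps=10000):
    """Fourth-order Runge-Kutta integration of da/dt = a_dot(a)."""
    h = (t_end - t_start) / steps
    a = a_start
    for _ in range(steps):
        k1 = a_dot(a)
        k2 = a_dot(a + 0.5 * h * k1)
        k3 = a_dot(a + 0.5 * h * k2)
        k4 = a_dot(a + h * k3)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a

a1 = a_closed_form(1.0)
a8 = integrate(a1, 1.0, 8.0)
growth = a8 / a1                     # expected 8**(2/3) = 4

# second derivative from the chain rule: a'' = d(a')/da * a'
da = 1e-6
a_ddot = (a_dot(a1 + da) - a_dot(a1 - da)) / (2.0 * da) * a_dot(a1)

print("a(8)/a(1) =", growth)
print("a'' =", a_ddot, " compare -c/a^2 =", -c_const / a1**2)
```

An eightfold increase in t multiplies a by four, as expected for a ∝ t^(2/3).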
8.2.10 Observation
Historically, it was very difficult to determine the universe’s av-
erage density, even to within an order of magnitude. Most of the
matter in the universe probably doesn’t emit light, making it dif-
ficult to detect. Astronomical distance scales are also very poorly
calibrated against absolute units such as the SI.
A strong constraint on the models comes from accurate
measurements of the cosmic microwave background, especially by the
1989-1993 COBE probe, and its 2001-2009 successor, the Wilkinson
Microwave Anisotropy Probe, positioned at the L2 Lagrange point
d / The angular scale of fluctuations in the cosmic microwave
background can be used to infer the curvature of the universe.
21 Problem 5 on p. 290 shows that this symmetry is also exhibited by the
Friedmann equations.
22 Komatsu et al., 2010, arxiv.org/abs/1001.4538
23 Riess et al., 2007, arxiv.org/abs/astro-ph/0611572
24 See Carroll, “The Cosmological Constant,” http://www.livingreviews.
27 Jordan was a member of the Nazi Sturmabteilung or “brown shirts” who
nevertheless ran afoul of the Nazis for his close professional relationships with
Jews.
28 A limit of $5\times10^{-23}$ has been placed on the anisotropy of the inertial mass
of the proton: R.W.P. Drever, “A search for anisotropy of inertial mass using a
free precession technique,” Philosophical Magazine, 6:687 (1961) 683.
29 This leads to an exception to the statement above that all Brans-Dicke
spacetimes are expected to look like Big Bang cosmologies. Any solution of the
GR field equations containing nothing but vacuum and electromagnetic fields
(known as an “electrovac” solution) is also a valid Brans-Dicke spacetime. In
such a spacetime, a constant φ can be set arbitrarily. Such a spacetime is in
some sense not generic for Brans-Dicke gravity.
30 Another good technical reason for thinking of φ as relating to the
gravitational constant is that general relativity has a standard prescription for
describing fields on a background of curved spacetime. The vacuum field equations of
general relativity can be derived from the principle of least action, and although
the details are beyond the scope of this book (see, e.g., Wald, General
Relativity, appendix E), the general idea is that we define a Lagrangian density $L_G$
that depends on the Ricci scalar curvature, and then extremize its integral over
all possible histories of the evolution of the gravitational field. If we want to
describe some other field, such as matter, light, or φ, we simply take the
special-relativistic Lagrangian $L_M$ for that field, change all the derivatives to covariant
derivatives, and form the sum $(1/G)L_G + L_M$. In the Brans-Dicke theory, we
have three pieces, $(1/G)L_G + L_M + L_\phi$, where $L_M$ is for matter and $L_\phi$ for φ.
$$\phi = 8\pi\,\frac{4+3\omega}{6+4\omega}\,\rho_o t_o^2 \left(\frac{t}{t_o}\right)^{2/(4+3\omega)} ,$$
where $\rho_o$ is the density of matter in the universe at time $t = t_o$.
When the density of matter is small, G is large, which has the same
observational consequences as the disappearance of inertia; this is
exactly what one expects according to Mach’s principle. For ω → ∞,
the gravitational “constant” G = 1/φ really is constant.
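The behavior described here can be illustrated numerically from the expression for φ. The Python sketch below evaluates G = 1/φ at t = t_o and t = 2t_o for several values of ω, and shows how G responds to halving the matter density. The sample values of ω, ρ_o, and t_o and the function names are arbitrary choices for the illustration.

```python
import math

def phi(t, omega, rho_o, t_o):
    """Brans-Dicke scalar field from the expression above (units with c = 1)."""
    prefactor = 8.0 * math.pi * (4.0 + 3.0 * omega) / (6.0 + 4.0 * omega)
    exponent = 2.0 / (4.0 + 3.0 * omega)
    return prefactor * rho_o * t_o**2 * (t / t_o) ** exponent

def G_eff(t, omega, rho_o, t_o):
    """Gravitational 'constant' G = 1/phi."""
    return 1.0 / phi(t, omega, rho_o, t_o)

rho_o, t_o = 1.0, 1.0     # arbitrary sample values

# change in G while the age of the universe doubles, for several omega
ratios = {}
for omega in (1.0, 100.0, 1.0e6):
    ratios[omega] = G_eff(2.0 * t_o, omega, rho_o, t_o) / G_eff(t_o, omega, rho_o, t_o)
    print("omega = %g: G(2 t_o)/G(t_o) = %.7f" % (omega, ratios[omega]))

# a universe with half the matter density has twice the value of G
density_ratio = G_eff(t_o, 1.0, 0.5 * rho_o, t_o) / G_eff(t_o, 1.0, rho_o, t_o)
print("G(rho_o/2)/G(rho_o) =", density_ratio)
```

As ω grows, the ratio approaches 1, which is the sense in which G becomes a true constant in the limit ω → ∞.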
Returning to the thought experiment involving the 22-caliber ri-
fle fired out the window, we find that in this imaginary universe, with
a very small density of matter, G should be very large. This causes
a frame-dragging effect from the laboratory on the gyroscope, one
much stronger than we would see in our universe. Brans and Dicke
calculated this effect for a laboratory consisting of a spherical shell,
and although technical difficulties prevented the reliable extrapo-
lation of their result to ρo → 0, the trend was that as ρo became
small, the frame-dragging effect would get stronger and stronger,
presumably eventually forcing the gyroscope to precess in lock-step
with the laboratory. There would thus be no way to determine, once
the bullet was far away, that the laboratory was rotating at all —
in perfect agreement with Mach’s principle.
32 Bertotti, Iess, and Tortora, “A test of general relativity using radio links
with the Cassini spacecraft,” Nature 425 (2003) 374
T µν = diag(−ρ, P , P , P ) ,
292 Chapter 8 Sources
Chapter 9
Gravitational waves
9.1 The speed of gravity
In Newtonian gravity, gravitational effects are assumed to propagate
at infinite speed, so that for example the lunar tides correspond at
any time to the position of the moon at the same instant. This
clearly can’t be true in relativity, since simultaneity isn’t something
that different observers even agree on. Not only should the “speed
of gravity” be finite, but it seems implausible that it would be
greater than c; in section 2.2 (p. 46), we argued based on empirically
well established principles that there must be a maximum speed of
cause and effect. Although the argument was only applicable to
special relativity, i.e., to a flat spacetime, it seems likely to apply to
general relativity as well, at least for low-amplitude waves on a flat
background. As early as 1913, before Einstein had even developed
the full theory of general relativity, he had carried out calculations in
the weak-field limit showing that gravitational effects should prop-
agate at c. This seems eminently reasonable, since (a) it is likely to
be consistent with causality, and (b) G and c are the only constants
with units that appear in the field equations (obscured by our choice
of units, in which G = 1 and c = 1), and the only velocity-scale that
can be constructed from these two constants is c itself.¹
Although extremely well founded theoretically, this turns out to
be extremely difficult to test empirically. In a 2003 experiment,²
Fomalont and Kopeikin used a world-wide array of radio telescopes to
observe a conjunction in which Jupiter passed within 3.7′ of a quasar,
so that the quasar’s radio waves came within about 3 light-seconds
of the planet on their way to the earth. Since Jupiter moves with
1 High-amplitude waves need not propagate at c. For example, general
relativity predicts that a gravitational-wave pulse propagating on a background of
curved spacetime develops a trailing edge that propagates at less than c (Misner,
Thorne, and Wheeler, p. 957). This effect is weak when the amplitude is small
or the wavelength is short compared to the scale of the background curvature.
It makes sense that the effect vanishes when background curvature is absent,
since there is then no fixed scale. Dispersion requires that different wavelengths
propagate at different speeds, but without a scale there is no reason for any
wavelength to behave any differently from any other. At very high amplitudes,
one can even have such exotic phenomena as the formation of black holes when
enough wave energy is focused into a small region. None of these phenomena is
ever likely to be observed empirically, since all gravitational waves in our universe
have extremely small amplitudes.
2 http://arxiv.org/abs/astro-ph/0302294
$v = 4\times10^{-5}$, one expects naively that the radio waves passing by it
should be deflected by the field produced by Jupiter at the position it
had 3 seconds earlier. This position differs from its present position
by about $10^{-4}$ light-seconds, and the result should be a difference
in propagation time, which should be different when observed from
different locations on earth. Fomalont and Kopeikin measured these
phase differences with picosecond precision, and found them to be
in good agreement with the predictions of general relativity. The
real excitement started when they published their result with the
interpretation that they had measured, for the first time, the speed
of gravity, and found it to be within 20% error bars of c. Samuel3
and Will4 published refutations, arguing that Kopeikin’s calcula-
tions contained mistakes, and that what had really been measured
was the speed of light, not the speed of gravity.
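The size of the retardation effect quoted above can be checked with a line or two of arithmetic. The following is a minimal sketch in Python (not the Maxima used elsewhere in this book). The variable names are illustrative, and the inputs are just the rounded figures from the text.

```python
# Rough check of the Jupiter retardation estimate, in units with c = 1.
# Inputs are the rounded values quoted in the text.
v_jupiter = 4e-5   # Jupiter's speed as a fraction of c
delay_s = 3.0      # light-travel time between Jupiter and the passing ray, in seconds

# Distance Jupiter moves during the delay, in light-seconds.
displacement_ls = v_jupiter * delay_s

# The same displacement in kilometers, taking c to be about 3.0e5 km/s.
displacement_km = displacement_ls * 3.0e5

print(f"{displacement_ls:.1e} light-seconds, roughly {displacement_km:.0f} km")
```

The displacement comes out at the $10^{-4}$ light-second scale, which is why picosecond timing precision was needed.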
The reason that the interpretation of this type of experiment
is likely to be controversial is that although we do have theories of
gravity that are viable alternatives to general relativity (e.g., the
Brans-Dicke theory, in which the gravitational constant is a dynam-
ically changing variable), such theories have generally been carefully
designed to agree with general relativity in the weak-field limit, and
in particular every such theory (or at least every theory that remains
viable given current experimental data) predicts that gravitational
effects propagate at c in the weak-field limit. Without an alternative
theory to act as a framework — one that disagrees with relativity
about the speed of gravity — it is difficult to know whether an ob-
servation that agrees with relativity is a test of this specific aspect
of relativity.
3
http://arxiv.org/abs/astro-ph/0304006
4
http://arxiv.org/abs/astro-ph/0301145
5
Stairs, “Testing General Relativity with Pulsar Timing,” http://
relativity.livingreviews.org/Articles/lrr-2003-5/
does have nonvanishing curvature. In other words, it seems like
we should be looking for transverse waves rather than longitudinal
ones.10 On the other hand, this metric cannot be a solution
to the vacuum field equations, since it doesn’t preserve volume. It
also stands still, whereas we expect that solutions to the field equations
should propagate at the velocity of light, at least for small
amplitudes. These conclusions are self-consistent, because a wave’s
polarization can only be constrained if it propagates at c (see p. 115).
e / As the gravitational wave propagates in the z direction, the metric oscillates in the x and y directions, preserving volume.
Based on what we’ve found out, the following seems like a metric
that might have a fighting chance of representing a real gravitational
wave:
$$ ds^2 = dt^2 - (1 + A\sin(z - t))\,dx^2 - \frac{dy^2}{1 + A\sin(z - t)} - dz^2 $$
It is transverse, it propagates at $c$ (= 1), and the fact that $g_{xx}$ is the
reciprocal of $g_{yy}$ makes it volume-conserving. The following Maxima
program calculates its Einstein tensor:
load(ctensor);
ct_coords:[t,x,y,z];
10
A more careful treatment shows that longitudinal waves can always be in-
terpreted as physically unobservable coordinate waves, in the limit of large dis-
tances from the source. On the other hand, it is clear that no such prohibition
against longitudinal waves could apply universally, because such a constraint
can only be Lorentz-invariant if the wave propagates at c (see p. 115), whereas
high-amplitude waves need not propagate at c. Longitudinal waves near the
source are referred to as Type III solutions in a classification scheme due to
Petrov. Transverse waves, which are what we could actually observe in practical
experiments, are type N.
$$ G_{tt} = -\frac{A^2\cos^2(z - t)}{2 + 4A\sin(z - t) + 2A^2\sin^2(z - t)} $$
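Two properties of this trial metric can be checked numerically without a tensor package. The sketch below is in Python rather than Maxima, and the function names are made up for the illustration. It evaluates the determinant of the diagonal metric, which should be −1 for any amplitude if volume is conserved, and the ratio $G_{tt}/A^2$, which should approach $-\cos^2(z-t)/2$ for small $A$ because the denominator equals $2(1 + A\sin(z-t))^2$.

```python
import math

def g_tt_einstein(A, z, t):
    """G_tt for the trial wave metric, using the expression given in the text."""
    s = math.sin(z - t)
    c = math.cos(z - t)
    return -(A**2 * c**2) / (2 + 4 * A * s + 2 * A**2 * s**2)

def metric_determinant(A, z, t):
    """Determinant of the diagonal metric diag(1, -(1+f), -1/(1+f), -1)."""
    f = A * math.sin(z - t)
    return 1.0 * (-(1 + f)) * (-1 / (1 + f)) * (-1.0)

z, t = 0.3, 0.1
for A in (1e-1, 1e-2, 1e-3):
    ratio = g_tt_einstein(A, z, t) / A**2
    print(A, metric_determinant(A, z, t), ratio)
```

The ratio settling to a constant as $A$ shrinks indicates that $G_{tt}$ is of second order in the amplitude.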
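load(ctensor);
ct_coords:[t,x,y,z];
/* complex exponential wave with amplitude A and wavenumber k, moving in +z */
f : A*exp(%i*k*(z-t));
/* covariant metric; c and d are symbolic coefficients of the second-order terms */
lg:matrix([1,0,0,0],
          [0,-(1+f+c*f^2),0,0],
          [0,0,-(1-f+d*f^2),0],
          [0,0,0,-1]);
cmetric();
einstein(true);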
$$ ds^2 = dt^2 - p(z - t)^2\,dx^2 - q(z - t)^2\,dy^2 - dz^2, $$
load(ctensor);
ct_coords:[t,x,y,z];
depends(p,[z,t]);
depends(q,[z,t]);
lg:matrix([1,0,0,0],
          [0,-p^2,0,0],
          [0,0,-q^2,0],
          [0,0,0,-1]);
cmetric();
einstein(true);
$$ P = k\,\frac{G^4}{c^5}\left(\frac{m}{r}\right)^5, $$
where k is a unitless constant of order unity.
For the Hulse-Taylor pulsar,12 we have $m \sim 3 \times 10^{30}$ kg (about
one and a half solar masses) and $r \sim 10^9$ m. The binary pulsar is
made to order for our purposes, since m/r is extremely large compared
to what one sees in almost any other astronomical system. The
resulting estimate for the power is about $10^{24}$ watts.
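The estimate can be reproduced directly from the scaling formula. The sketch below is in Python. The values of G and c are standard SI constants, the mass and distance are the rounded figures from the text, and setting k = 1 is purely an assumption, since the text only says that k is of order unity.

```python
# Order-of-magnitude estimate P = k * G^4/c^5 * (m/r)^5 for the Hulse-Taylor pulsar.
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8     # speed of light, m/s
m = 3e30        # mass scale from the text, kg
r = 1e9         # orbital size scale from the text, m
k = 1.0         # unitless constant of order unity; 1 is an assumed placeholder

def radiated_power(G, c, m, r, k=1.0):
    """Gravitational-wave power from the dimensional-analysis formula in the text."""
    return k * G**4 / c**5 * (m / r)**5

P = radiated_power(G, c, m, r, k)
print(f"P ~ {P:.1e} W")
```

Because the result depends on the fifth power of m/r, a factor of two in either input shifts the estimate by more than an order of magnitude, so only the exponent should be taken seriously.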
The binary’s orbital period is observed to be steadily shortening at a
rate of $\alpha = 2.418 \times 10^{-12}$ seconds per second. To compare this with
our crude theoretical estimate, we take the Newtonian energy of the
system $Gm^2/r$ and multiply by $\omega\alpha$, giving $10^{25}$ W, which checks to
within an order of magnitude. A full general-relativistic calculation
reproduces the observed value of α to within the 0.1% error bars of
the data.
12
http://arxiv.org/abs/astro-ph/0407149
Problems 305
Appendix 1: Excerpts from three papers by Einstein
The following English translations of excerpts from three papers by Einstein were originally
published in “The Principle of Relativity,” Methuen and Co., 1923. The translation was by
W. Perrett and G.B. Jeffery, and notes were provided by A. Sommerfeld. John Walker (www.fourmilab.ch)
has provided machine-readable versions of the first two and placed them in the
public domain. Some notation has been modernized, British spelling has been Americanized,
etc. Footnotes by Sommerfeld, Walker, and B. Crowell are marked with initials. B. Crowell’s
modifications to the present version are also in the public domain.
The paper “On the electrodynamics of moving bodies” contains two parts, the first dealing
with kinematics and the second with electrodynamics. I’ve given only the first part here, since
the second one is lengthy, and painful to read because of the cumbersome old-fashioned notation.
The second section can be obtained from John Walker’s web site.
The paper “Does the inertia of a body depend upon its energy content?,” which begins on
page 319, is very short and readable. A shorter and less general version of its main argument is
given on p. 120.
“The foundation of the general theory of relativity” is a long review article in which Einstein
systematically laid out the general theory, which he had previously published in a series of
shorter papers. The first three sections of the paper give the general physical reasoning behind
coordinate independence, referred to as general covariance. It begins on page 321.
The reader who is interested in seeing these papers in their entirety can obtain them inex-
pensively in a Dover reprint of the original Methuen anthology.
I. KINEMATICAL PART
§1. Definition of Simultaneity
Let us take a system of coordinates in which the equations of Newtonian mechanics hold
good.19 In order to render our presentation more precise and to distinguish this system of
coordinates verbally from others which will be introduced hereafter, we call it the “stationary
14
Einstein knew about the Michelson-Morley experiment by 1905 (J. van Dongen, arxiv.org/abs/0908.1545),
but it isn’t cited specifically here. The 1881 and 1887 Michelson-Morley papers are available online at
en.wikisource.org. —BC
15
I.e., to first order in v/c. Experimenters as early as Fresnel (1788-1827) had shown that there were no effects
of order v/c due to the earth’s motion through the aether, but they were able to interpret this without jettisoning
the aether, by contriving models in which solid substances dragged the aether along with them. The negative
result of the Michelson-Morley experiment showed a lack of an effect of order $(v/c)^2$. —BC
16
The preceding memoir by Lorentz was not at this time known to the author. —AS
17
The second postulate is redundant if we take the “laws of electrodynamics and optics” to refer to Maxwell’s
equations. Maxwell’s equations require that light move at c in any frame of reference in which they are valid, and
the first postulate has already claimed that they are valid in all inertial frames of reference. Einstein probably
states constancy of c as a separate postulate because his audience is accustomed to thinking of Maxwell’s equations
as a partial mathematical representation of certain aspects of an underlying aether theory. Throughout part I of
the paper, Einstein is able to derive all his results without assuming anything from Maxwell’s equations other than
the constancy of c. The use of the term “postulate” suggests the construction of a formal axiomatic system like
Euclidean geometry, but Einstein’s real intention here is to lay out a set of philosophical criteria for evaluating
candidate theories; he freely brings in other, less central, assumptions later in the paper, as when he invokes
homogeneity of spacetime on page 311. —BC
18
Essentially what Einstein means here is that you can’t have Maxwell’s equations without establishing position
and time coordinates, and you can’t have position and time coordinates without clocks and rulers. Therefore even
the description of a purely electromagnetic phenomenon such as a light wave depends on the existence of material
objects. He doesn’t spell out exactly what he means by “rigid,” and we now know that relativity doesn’t actually
allow the existence of perfectly rigid solids (see p. 98). Essentially he wants to be able to talk about rulers that
behave like solids rather than liquids, in the sense that if they are accelerated sufficiently gently from rest and
later brought gently back to rest, their properties will be unchanged. When he derives the length contraction
later, he wants it to be clear that this isn’t a dynamical phenomenon caused by an effect such as the drag of the
aether.—BC
19
i.e., to the first approximation.—AS
system.”
If a material point is at rest relative to this system of coordinates, its position can be defined
relative thereto by the employment of rigid standards of measurement and the methods of
Euclidean geometry, and can be expressed in Cartesian coordinates.
If we wish to describe the motion of a material point, we give the values of its coordinates
as functions of the time. Now we must bear carefully in mind that a mathematical description
of this kind has no physical meaning unless we are quite clear as to what we understand by
“time.” We have to take into account that all our judgments in which time plays a part are
always judgments of simultaneous events. If, for instance, I say, “That train arrives here at 7
o’clock,” I mean something like this: “The pointing of the small hand of my watch to 7 and the
arrival of the train are simultaneous events.”20
It might appear possible to overcome all the difficulties attending the definition of “time”
by substituting “the position of the small hand of my watch” for “time.” And in fact such a
definition is satisfactory when we are concerned with defining a time exclusively for the place
where the watch is located; but it is no longer satisfactory when we have to connect in time
series of events occurring at different places, or—what comes to the same thing—to evaluate
the times of events occurring at places remote from the watch.
We might, of course, content ourselves with time values determined by an observer stationed
together with the watch at the origin of the coordinates, and coordinating the corresponding
positions of the hands with light signals, given out by every event to be timed, and reaching him
through empty space. But this coordination has the disadvantage that it is not independent of
the standpoint of the observer with the watch or clock, as we know from experience. We arrive
at a much more practical determination along the following line of thought.
If at the point A of space there is a clock, an observer at A can determine the time values
of events in the immediate proximity of A by finding the positions of the hands which are
simultaneous with these events. If there is at the point B of space another clock in all respects
resembling the one at A, it is possible for an observer at B to determine the time values of
events in the immediate neighbourhood of B. But it is not possible without further assumption
to compare, in respect of time, an event at A with an event at B. We have so far defined only
an “A time” and a “B time.” We have not defined a common “time” for A and B, for the latter
cannot be defined at all unless we establish by definition that the “time” required by light to
travel from A to B equals the “time” it requires to travel from B to A. Let a ray of light start
at the “A time” $t_A$ from A towards B, let it at the “B time” $t_B$ be reflected at B in the direction
of A, and arrive again at A at the “A time” $t'_A$.
In accordance with definition the two clocks synchronize21 if
$$ t_B - t_A = t'_A - t_B. $$
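The synchronization criterion is easy to state as a small test on three clock readings. The sketch below is in Python. The function names and the tolerance are illustrative choices, and `clock_offset` is an added helper (not part of Einstein's text) that reports how far the reading at B lies from the midpoint of the two readings at A.

```python
def are_synchronized(t_A, t_B, t_A_prime, tol=1e-12):
    """Einstein's criterion: the outbound and return light times must agree."""
    return abs((t_B - t_A) - (t_A_prime - t_B)) <= tol

def clock_offset(t_A, t_B, t_A_prime):
    """Reading of clock B minus the midpoint of the emission and return readings at A."""
    return t_B - 0.5 * (t_A + t_A_prime)

# Light leaves A at 0 s and returns to A at 2 s.
print(are_synchronized(0.0, 1.0, 2.0))   # clock B read 1.0 s at reflection
print(are_synchronized(0.0, 1.3, 2.0))   # clock B read 1.3 s at reflection
print(clock_offset(0.0, 1.3, 2.0))
```

A nonzero offset is the amount by which clock B would have to be reset in order to satisfy the definition.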
We assume that this definition of synchronism is free from contradictions, and possible for
any number of points; and that the following relations are universally valid:—
1. If the clock at B synchronizes with the clock at A, the clock at A synchronizes with the
clock at B.
20
We shall not here discuss the inexactitude which lurks in the concept of simultaneity of two events at approx-
imately the same place, which can only be removed by an abstraction.—AS
21
The procedure described here is known as Einstein synchronization.—BC
$$ \frac{2AB}{t'_A - t_A} = c, $$
$$ \text{velocity} = \frac{\text{light path}}{\text{time interval}} $$
where time interval is to be taken in the sense of the definition in § 1.
Let there be given a stationary rigid rod; and let its length be l as measured by a measuring-
rod which is also stationary. We now imagine the axis of the rod lying along the axis of x of
the stationary system of coordinates, and that a uniform motion of parallel translation with
velocity v along the axis of x in the direction of increasing x is then imparted to the rod. We
now inquire as to the length of the moving rod, and imagine its length to be ascertained by the
following two operations:—
(a) The observer moves together with the given measuring-rod and the rod to be measured,
and measures the length of the rod directly by superposing the measuring-rod, in just the same
way as if all three were at rest.
(b) By means of stationary clocks set up in the stationary system and synchronizing in
accordance with § 1, the observer ascertains at what points of the stationary system the two
22
This assumption fails in a rotating frame (see p. 100), but Einstein has restricted himself here to an approx-
imately inertial frame of reference.—BC
ends of the rod to be measured are located at a definite time. The distance between these two
points, measured by the measuring-rod already employed, which in this case is at rest, is also a
length which may be designated “the length of the rod.”
In accordance with the principle of relativity the length to be discovered by the operation
(a)—we will call it “the length of the rod in the moving system”—must be equal to the length
l of the stationary rod.
The length to be discovered by the operation (b) we will call “the length of the (moving)
rod in the stationary system.” This we shall determine on the basis of our two principles, and
we shall find that it differs from l.
Current kinematics tacitly assumes that the lengths determined by these two operations are
precisely equal, or in other words, that a moving rigid body at the epoch t may in geometrical
respects be perfectly represented by the same body at rest in a definite position.
We imagine further that at the two ends A and B of the rod, clocks are placed which syn-
chronize with the clocks of the stationary system, that is to say that their indications correspond
at any instant to the “time of the stationary system” at the places where they happen to be.
These clocks are therefore “synchronous in the stationary system.”
We imagine further that with each clock there is a moving observer, and that these observers
apply to both clocks the criterion established in § 1 for the synchronization of two clocks. Let
a ray of light depart from A at the time23 $t_A$, let it be reflected at B at the time $t_B$, and reach
A again at the time $t'_A$. Taking into consideration the principle of the constancy of the velocity
of light we find that
$$ t_B - t_A = \frac{r_{AB}}{c - v} \quad\text{and}\quad t'_A - t_B = \frac{r_{AB}}{c + v} $$
where $r_{AB}$ denotes the length of the moving rod—measured in the stationary system. Observers
moving with the moving rod would thus find that the two clocks were not synchronous, while
observers in the stationary system would declare the clocks to be synchronous.
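The failure of the synchronization criterion for the moving rod can be made concrete with numbers. This is a minimal Python sketch in units where c = 1. The rod length and speed are arbitrary illustrative values, not taken from the paper.

```python
def leg_times(r_AB, v, c=1.0):
    """Light travel times along a rod moving at speed v, judged in the stationary system."""
    outbound = r_AB / (c - v)   # t_B - t_A: the light chases the receding end B
    inbound = r_AB / (c + v)    # t'_A - t_B: end A moves toward the returning light
    return outbound, inbound

out, back = leg_times(r_AB=1.0, v=0.5)
print(out, back, out - back)

# With the rod at rest the two legs take equal times, as the definition requires.
print(leg_times(r_AB=1.0, v=0.0))
```

The two legs differ whenever v is nonzero, so clocks that read stationary-system time fail the criterion when it is applied by observers riding on the rod.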
So we see that we cannot attach any absolute signification to the concept of simultaneity, but
that two events which, viewed from a system of coordinates, are simultaneous, can no longer be
looked upon as simultaneous events when envisaged from a system which is in motion relative
to that system.
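$$ \frac{1}{2}\left[\tau(0,0,0,t) + \tau\left(0,0,0,\,t + \frac{x'}{c - v} + \frac{x'}{c + v}\right)\right] = \tau\left(x',0,0,\,t + \frac{x'}{c - v}\right). $$
$$ \frac{\partial\tau}{\partial x'} + \frac{v}{c^2 - v^2}\frac{\partial\tau}{\partial t} = 0. $$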
It is to be noted that instead of the origin of the coordinates we might have chosen any other
point for the point of origin of the ray, and the equation just obtained is therefore valid for all
values of $x'$, y, z.
An analogous consideration—applied to the axes of Y and Z—it being borne in mind that
light is always propagated along these axes, when viewed from the stationary system, with the
velocity $\sqrt{c^2 - v^2}$ gives us
$$ \frac{\partial\tau}{\partial y} = 0, \qquad \frac{\partial\tau}{\partial z} = 0. $$
$$ \tau = a\left(t - \frac{v}{c^2 - v^2}\,x'\right) $$
where a is a function φ(v) at present unknown, and where for brevity it is assumed that at the
origin of k, τ = 0, when t = 0.
With the help of this result we easily determine the quantities ξ, η, ζ by expressing in
equations that light (as required by the principle of the constancy of the velocity of light, in
combination with the principle of relativity) is also propagated with velocity c when measured
in the moving system. For a ray of light emitted at the time τ = 0 in the direction of the
increasing ξ
$$ \xi = c\tau \quad\text{or}\quad \xi = ac\left(t - \frac{v}{c^2 - v^2}\,x'\right). $$
But the ray moves relative to the initial point of k, when measured in the stationary system,
with the velocity c − v, so that
$$ \frac{x'}{c - v} = t. $$
$$ \xi = a\,\frac{c^2}{c^2 - v^2}\,x'. $$
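The last step is a substitution of $t = x'/(c - v)$ into the expression for ξ, and it can be spot-checked numerically. The Python sketch below uses hypothetical function names and arbitrary sample values for a, c, v, and $x'$.

```python
def xi_from_time(a, c, v, x_prime):
    """xi = a c (t - v x'/(c^2 - v^2)), evaluated at t = x'/(c - v)."""
    t = x_prime / (c - v)
    return a * c * (t - v * x_prime / (c**2 - v**2))

def xi_closed_form(a, c, v, x_prime):
    """xi = a c^2 x' / (c^2 - v^2), the form quoted in the text."""
    return a * c**2 * x_prime / (c**2 - v**2)

# Arbitrary sample values; any |v| < c should give agreement.
a, c, v, x_prime = 1.7, 1.0, 0.35, 2.4
print(xi_from_time(a, c, v, x_prime), xi_closed_form(a, c, v, x_prime))
```

The agreement reflects the identity $1/(c - v) - v/(c^2 - v^2) = c/(c^2 - v^2)$.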
In an analogous manner we find, by considering rays moving along the two other axes, that
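$$ \eta = c\tau = ac\left(t - \frac{v}{c^2 - v^2}\,x'\right) $$
when
$$ \frac{y}{\sqrt{c^2 - v^2}} = t, \qquad x' = 0. $$
Thus
$$ \eta = a\,\frac{c}{\sqrt{c^2 - v^2}}\,y \quad\text{and}\quad \zeta = a\,\frac{c}{\sqrt{c^2 - v^2}}\,z. $$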
where
$$ \beta = \frac{1}{\sqrt{1 - v^2/c^2}}, $$
$$ x^2 + y^2 + z^2 = c^2 t^2. $$
Transforming this equation with the aid of our equations of transformation we obtain after
a simple calculation
$$ \xi^2 + \eta^2 + \zeta^2 = c^2 \tau^2. $$
The wave under consideration is therefore no less a spherical wave with velocity of propaga-
tion c when viewed in the moving system. This shows that our two fundamental principles are
compatible.24
In the equations of transformation which have been developed there enters an unknown
function φ of v, which we will now determine.
For this purpose we introduce a third system of coordinates $K'$, which relative to the system
k is in a state of parallel translatory motion parallel to the axis of Ξ,25 such that the origin of
24
The equations of the Lorentz transformation may be more simply deduced directly from the condition that
in virtue of those equations the relation $x^2 + y^2 + z^2 = c^2 t^2$ shall have as its consequence the second relation
$\xi^2 + \eta^2 + \zeta^2 = c^2 \tau^2$.—AS
25
In Einstein’s original paper, the symbols (Ξ, H, Z) for the coordinates of the moving system k were introduced
without explicitly defining them. In the 1923 English translation, (X, Y, Z) were used, creating an ambiguity
between X coordinates in the fixed system K and the parallel axis in moving system k. Here and in subsequent
references we use Ξ when referring to the axis of system k along which the system is translating with respect to
K. In addition, the reference to system $K'$ later in this sentence was incorrectly given as “k” in the 1923 English
translation.—JW
coordinates of system $K'$ moves with velocity −v on the axis of Ξ. At the time t = 0 let all
three origins coincide, and when t = x = y = z = 0 let the time $t'$ of the system $K'$ be zero. We
call the coordinates, measured in the system $K'$, $x'$, $y'$, $z'$, and by a twofold application of our
equations of transformation we obtain
Since the relations between $x'$, $y'$, $z'$ and x, y, z do not contain the time t, the systems K
and $K'$ are at rest with respect to one another, and it is clear that the transformation from K
to $K'$ must be the identical transformation. Thus
φ(v)φ(−v) = 1.
We now inquire into the signification of φ(v). We give our attention to that part of the axis of
Y of system k which lies between ξ = 0, η = 0, ζ = 0 and ξ = 0, η = l, ζ = 0. This part of the
axis of Y is a rod moving perpendicularly to its axis with velocity v relative to system K. Its
ends possess in K the coordinates
$$ x_1 = vt, \qquad y_1 = \frac{l}{\phi(v)}, \qquad z_1 = 0 $$
and
$$ x_2 = vt, \qquad y_2 = 0, \qquad z_2 = 0. $$
The length of the rod measured in K is therefore l/φ(v); and this gives us the meaning of the
function φ(v). From reasons of symmetry it is now evident that the length of a given rod moving
perpendicularly to its axis, measured in the stationary system, must depend only on the velocity
and not on the direction and the sense of the motion. The length of the moving rod measured in
the stationary system does not change, therefore, if v and −v are interchanged. Hence follows
that l/φ(v) = l/φ(−v), or
φ(v) = φ(−v).
It follows from this relation and the one previously found that φ(v) = 1, so that the transfor-
mation equations which have been found become
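$$ \begin{aligned} \tau &= \beta(t - vx/c^2), \\ \xi &= \beta(x - vt), \\ \eta &= y, \\ \zeta &= z, \end{aligned} $$
where
$$ \beta = \frac{1}{\sqrt{1 - v^2/c^2}}. $$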
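The final transformation equations can be exercised numerically. The Python sketch below, in units with c = 1 and with an arbitrarily chosen event and velocity, checks two claims made in this section: a point on the light sphere $x^2 + y^2 + z^2 = c^2 t^2$ lands on $\xi^2 + \eta^2 + \zeta^2 = c^2 \tau^2$, and applying the transformation again with −v returns the original coordinates. The function name `lorentz` is an illustrative choice.

```python
import math

def lorentz(t, x, y, z, v, c=1.0):
    """Map (t, x, y, z) in K to (tau, xi, eta, zeta) in k, which moves at v along x."""
    beta = 1.0 / math.sqrt(1.0 - v**2 / c**2)  # Einstein's beta; modern texts call it gamma
    tau = beta * (t - v * x / c**2)
    xi = beta * (x - v * t)
    return tau, xi, y, z

# An event on the light sphere, with c = 1: 3^2 + 4^2 + 0^2 = 5^2.
t, x, y, z = 5.0, 3.0, 4.0, 0.0
tau, xi, eta, zeta = lorentz(t, x, y, z, v=0.6)
print(xi**2 + eta**2 + zeta**2 - tau**2)

# Transforming back with -v recovers the original event.
print(lorentz(tau, xi, eta, zeta, v=-0.6))
```

The round trip is the numerical counterpart of the statement that the transformation from K to $K'$ is the identity.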
§ 4. Physical Meaning of the Equations Obtained in Respect to Moving Rigid Bodies and Moving Clocks
We envisage a rigid sphere26 of radius R, at rest relative to the moving system k, and with
its centre at the origin of coordinates of k. The equation of the surface of this sphere moving
relative to the system K with velocity v is
$$ \xi^2 + \eta^2 + \zeta^2 = R^2. $$
$$ \frac{x^2}{\left(\sqrt{1 - v^2/c^2}\right)^2} + y^2 + z^2 = R^2. $$
A rigid body which, measured in a state of rest, has the form of a sphere, therefore has in a
state of motion—viewed from the stationary system—the form of an ellipsoid of revolution with
the axes
$$ R\sqrt{1 - v^2/c^2}, \quad R, \quad R. $$
Thus, whereas the Y and Z dimensions of the sphere (and therefore of every rigid body of no
matter what form) do not appear modified by the motion, the X dimension appears shortened
in the ratio $1 : \sqrt{1 - v^2/c^2}$, i.e., the greater the value of v, the greater the shortening. For v = c
all moving objects—viewed from the “stationary” system—shrivel up into plane figures.27 For
velocities greater than that of light our deliberations become meaningless; we shall, however,
find in what follows, that the velocity of light in our theory plays the part, physically, of an
infinitely great velocity.
It is clear that the same results hold good of bodies at rest in the “stationary” system,
viewed from a system in uniform motion.
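The flattening of the sphere can be illustrated numerically. In this Python sketch (units with c = 1, with arbitrary sample values for R, v, and the angle), a point chosen on the contracted ellipsoid at t = 0 is mapped into system k with ξ = β(x − vt) and η = y, and is found to lie on the sphere of radius R. The function names are hypothetical.

```python
import math

def ellipsoid_axes(R, v, c=1.0):
    """Semi-axes along x, y, z of the moving sphere, as judged in the stationary system."""
    return (R * math.sqrt(1.0 - v**2 / c**2), R, R)

def xi_at_t0(x, v, c=1.0):
    """xi = beta * (x - v t), evaluated at t = 0."""
    beta = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return beta * x

R, v = 1.0, 0.6
a_x, a_y, a_z = ellipsoid_axes(R, v)

# Pick a point on the flattened ellipsoid in the x-y plane and map it into k.
theta = 0.7
x, y = a_x * math.cos(theta), a_y * math.sin(theta)
xi, eta = xi_at_t0(x, v), y
print(a_x, xi**2 + eta**2)
```

For v = 0.6 the x semi-axis is shortened to 0.8 R, while the transverse axes are unchanged.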
Further, we imagine one of the clocks which are qualified to mark the time t when at rest
relative to the stationary system, and the time τ when at rest relative to the moving system,
to be located at the origin of the coordinates of k, and so adjusted that it marks the time τ .
What is the rate of this clock, when viewed from the stationary system?
Between the quantities x, t, and τ , which refer to the position of the clock, we have, evidently,
x = vt and
$$ \tau = \frac{1}{\sqrt{1 - v^2/c^2}}\,(t - vx/c^2). $$
Therefore,
26
That is, a body possessing spherical form when examined at rest.—AS
27
In the 1923 English translation, this phrase was erroneously translated as “plain figures”. I have used the
correct “plane figures” in this edition.—JW
$$ \tau = t\sqrt{1 - v^2/c^2} = t - \left(1 - \sqrt{1 - v^2/c^2}\right)t $$
whence it follows that the time marked by the clock (viewed in the stationary system) is slow by
$1 - \sqrt{1 - v^2/c^2}$ seconds per second, or—neglecting magnitudes of fourth and higher order—by
$\frac{1}{2}v^2/c^2$.
From this there ensues the following peculiar consequence. If at the points A and B of K
there are stationary clocks which, viewed in the stationary system, are synchronous; and if the
clock at A is moved with the velocity v along the line AB to B, then on its arrival at B the
two clocks no longer synchronize, but the clock moved from A to B lags behind the other which
has remained at B by $\frac{1}{2}tv^2/c^2$ (up to magnitudes of fourth and higher order), t being the time
occupied in the journey from A to B.
It is at once apparent that this result still holds good if the clock moves from A to B in any
polygonal line, and also when the points A and B coincide.
If we assume that the result proved for a polygonal line is also valid for a continuously curved
line, we arrive at this result: If one of two synchronous clocks at A is moved in a closed curve
with constant velocity until it returns to A, the journey lasting t seconds, then by the clock
which has remained at rest the travelled clock on its arrival at A will be $\frac{1}{2}tv^2/c^2$ second slow.
Thence we conclude that a spring-clock at the equator must go more slowly, by a very small
amount, than a precisely similar clock situated at one of the poles under otherwise identical
conditions.28
$$ \xi = w_\xi \tau, \qquad \eta = w_\eta \tau, \qquad \zeta = 0, $$
$$ x = \frac{w_\xi + v}{1 + v w_\xi/c^2}\,t, \qquad y = \frac{\sqrt{1 - v^2/c^2}}{1 + v w_\xi/c^2}\,w_\eta t, \qquad z = 0. $$
28
Einstein specifies a spring-clock (“unruhuhr”) because the effective gravitational field is weaker at the equator
than at the poles, so a pendulum clock at the equator would run more slowly by about two parts per thousand
than one at the north pole, for nonrelativistic reasons. This would completely mask any relativistic effect, which
he expected to be on the order of $v^2/c^2$, or about $10^{-13}$. In any case, it later turned out that Einstein was
mistaken about this example. There is also a gravitational time dilation that cancels the kinematic effect. See
example 7, p. 52. The two clocks would actually agree.—BC
a is then to be looked upon as the angle between the velocities v and w. After a simple calculation
we obtain
$$ V = \frac{\sqrt{(v^2 + w^2 + 2vw\cos a) - (vw\sin a/c)^2}}{1 + vw\cos a/c^2}. $$
It is worthy of remark that v and w enter into the expression for the resultant velocity in a
symmetrical manner. If w also has the direction of the axis of X, we get
$$ V = \frac{v + w}{1 + vw/c^2}. $$
It follows from this equation that from a composition of two velocities which are less than c,
there always results a velocity less than c. For if we set v = c − κ, w = c − λ, κ and λ being
positive and less than c, then
$$ V = c\,\frac{2c - \kappa - \lambda}{2c - \kappa - \lambda + \kappa\lambda/c} < c. $$
It follows, further, that the velocity of light c cannot be altered by composition with a
velocity less than that of light. For this case we obtain
$$ V = \frac{c + w}{1 + w/c} = c. $$
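The composition law and its two stated consequences can be tried out numerically. The Python sketch below implements the general formula for V with the angle a between the velocities, in units with c = 1. The function name `compose` and the sample speeds are illustrative choices.

```python
import math

def compose(v, w, a=0.0, c=1.0):
    """Magnitude of the resultant of velocities v and w separated by angle a (radians)."""
    numerator = math.sqrt((v**2 + w**2 + 2 * v * w * math.cos(a))
                          - (v * w * math.sin(a) / c)**2)
    return numerator / (1 + v * w * math.cos(a) / c**2)

print(compose(0.5, 0.5))              # parallel velocities, each below c
print(compose(0.9, 0.9))              # still below c
print(compose(1.0, 0.3))              # composing c with anything gives c
print(compose(0.6, 0.8, math.pi / 2)) # perpendicular velocities
print(compose(0.2, 0.7, 1.1), compose(0.7, 0.2, 1.1))  # symmetric in v and w
```

For a = 0 the general expression reduces to $(v + w)/(1 + vw/c^2)$, which is the parallel case treated in the text.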
We might also have obtained the formula for V, for the case when v and w have the same
direction, by compounding two transformations in accordance with § 3. If in addition to the
systems K and k figuring in § 3 we introduce still another system of coordinates $k'$ moving
parallel to k, its initial point moving on the axis of Ξ30 with the velocity w, we obtain equations
between the quantities x, y, z, t and the corresponding quantities of $k'$, which differ from the
equations found in § 3 only in that the place of “v” is taken by the quantity
$$ \frac{v + w}{1 + vw/c^2}; $$
We have now deduced the requisite laws of the theory of kinematics corresponding to our
two principles, and we proceed to show their application to electrodynamics.31
31
The remainder of the paper is not given here, but can be obtained from John Walker’s web site at www.fourmilab.ch.—BC
$$ l^* = l\,\frac{1 - \frac{v}{c}\cos\phi}{\sqrt{1 - v^2/c^2}} $$
where c denotes the velocity of light. We shall make use of this result in what follows.
Let there be a stationary body in the system (x, y, z), and let its energy—referred to the
system (x, y, z) be E0 . Let the energy of the body relative to the system (ξ, η, ζ) moving as
above with the velocity v, be H0 .
Let this body send out, in a direction making an angle φ with the axis of x, plane waves
of light, of energy $\frac{1}{2}L$ measured relative to (x, y, z), and simultaneously an equal quantity of
light in the opposite direction. Meanwhile the body remains at rest with respect to the system
(x, y, z). The principle of energy must apply to this process, and in fact (by the principle of
relativity) with respect to both systems of coordinates. If we call the energy of the body after
the emission of light E1 or H1 respectively, measured relative to the system (x, y, z) or (ξ, η, ζ)
respectively, then by employing the relation given above we obtain
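$$ E_0 = E_1 + \frac{1}{2}L + \frac{1}{2}L, $$
$$ \begin{aligned} H_0 &= H_1 + \frac{1}{2}L\,\frac{1 - \frac{v}{c}\cos\phi}{\sqrt{1 - v^2/c^2}} + \frac{1}{2}L\,\frac{1 + \frac{v}{c}\cos\phi}{\sqrt{1 - v^2/c^2}} \\ &= H_1 + \frac{L}{\sqrt{1 - v^2/c^2}}. \end{aligned} $$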
$$ H_0 - E_0 - (H_1 - E_1) = L\left\{\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right\}. $$
The two differences of the form H − E occurring in this expression have simple physical signi-
fications. H and E are energy values of the same body referred to two systems of coordinates
which are in motion relative to each other, the body being at rest in one of the two systems
(system (x, y, z)). Thus it is clear that the difference H − E can differ from the kinetic energy
K of the body, with respect to the other system (ξ, η, ζ), only by an additive constant C, which
depends on the choice of the arbitrary additive constants34 of the energies H and E. Thus we
may place
H0 − E0 = K0 + C,
H1 − E1 = K1 + C,
The kinetic energy of the body with respect to (ξ, η, ζ) diminishes as a result of the emission
of light, and the amount of diminution is independent of the properties of the body. Moreover,
the difference K0 − K1 , like the kinetic energy of the electron (§ 10), depends on the velocity.
Neglecting magnitudes of fourth and higher orders35 we may place
$$ K_0 - K_1 = \frac{1}{2}\frac{L}{c^2}v^2. $$
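Subtracting the two relations $H - E = K + C$ turns the earlier energy difference into $K_0 - K_1 = L\{1/\sqrt{1 - v^2/c^2} - 1\}$, and the approximation above is its leading term. The Python sketch below compares the two at a modest speed. The values of L, v, and c are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def kinetic_change_exact(L, v, c):
    """K0 - K1 = L * (1/sqrt(1 - v^2/c^2) - 1)."""
    return L * (1.0 / math.sqrt(1.0 - v**2 / c**2) - 1.0)

def kinetic_change_approx(L, v, c):
    """Leading-order form: (1/2) (L/c^2) v^2."""
    return 0.5 * (L / c**2) * v**2

L, c = 1.0, 1.0
v = 0.01 * c
exact = kinetic_change_exact(L, v, c)
approx = kinetic_change_approx(L, v, c)
print(exact, approx, (exact - approx) / exact)
```

The approximate form has the shape of a Newtonian kinetic energy $\frac{1}{2}mv^2$ with $m = L/c^2$, which is what lets the loss be read as a change in mass.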
From this equation it directly follows36 that:—
If a body gives off the energy L in the form of radiation, its mass diminishes by L/c2 . The
fact that the energy withdrawn from the body becomes energy of radiation evidently makes no
difference, so that we are led to the more general conclusion that
The mass of a body is a measure of its energy-content; if the energy changes by L, the mass
changes in the same sense by $L/(9 \times 10^{20})$, the energy being measured in ergs, and the mass in
grammes.
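The factor $9 \times 10^{20}$ is $c^2$ in CGS units, with c rounded to $3 \times 10^{10}$ cm/s. A minimal Python sketch of the conversion follows. The function name and the sample energy are illustrative choices.

```python
C_CGS = 3.0e10  # speed of light in cm/s, rounded as in the text

def mass_change_grams(energy_ergs):
    """Mass equivalent, in grams, of an energy change given in ergs."""
    return energy_ergs / C_CGS**2

# An energy change of 9e20 erg corresponds to one gram.
print(mass_change_grams(9.0e20))
```

The size of the divisor shows why Einstein looked to sources with very large energy release, such as radium salts, for a possible test.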
It is not impossible that with bodies whose energy-content is variable to a high degree (e.g.
with radium salts) the theory may be successfully put to the test.
If the theory corresponds to the facts, radiation conveys inertia between the emitting and
absorbing bodies.
34
A potential energy U is only defined up to an additive constant. If, for example, U depends on the distance
between particles, and the distance undergoes a Lorentz contraction, there is no reason to imagine that the
constant will stay the same.—BC
35
The purpose of making the approximation is to show that under realistic lab conditions, the effect exactly
mimics a change in Newtonian mass.
36
The object has the same velocity v before and after emission of the light, so this reduction in kinetic energy
has to be attributed to a change in mass.—BC
hover freely in space at so great a distance from each other and from all other masses that only
those gravitational forces need be taken into account which arise from the interaction of different
parts of the same body. Let the distance between the two bodies be invariable, and in neither
of the bodies let there be any relative movements of the parts with respect to one another.
But let either mass, as judged by an observer at rest relative to the other mass, rotate
with constant angular velocity about the line joining the masses. This is a verifiable relative
motion of the two bodies. Now let us imagine that each of the bodies has been surveyed by
means of measuring instruments at rest relative to itself, and let the surface of S1 prove to be a
sphere, and that of S2 an ellipsoid of revolution. Thereupon we put the question — What is the
reason for this difference in the two bodies? No answer can be admitted as epistemologically
satisfactory,40 unless the reason given is an observable fact of experience. The law of causality
has not the significance of a statement as to the world of experience, except when observable
facts ultimately appear as causes and effects.
Newtonian mechanics does not give a satisfactory answer to this question. It pronounces as
follows: — The laws of mechanics apply to the space R1 , in respect to which the body S1 is at
rest, but not to the space R2 , in respect to which the body S2 is at rest. But the privileged
space R1 of Galileo, thus introduced, is a merely factitious 41 cause, and not a thing that can be
observed. It is therefore clear that Newton’s mechanics does not really satisfy the requirement
of causality in the case under consideration but only apparently does so, since it makes the
factitious cause R1 responsible for the observable difference in the bodies S1 and S2 .
The only satisfactory answer must be that the physical system consisting of S1 and S2
reveals within itself no imaginable cause to which the differing behaviour of S1 and S2 can be
referred. The cause must therefore lie outside this system. We have to take it that the general
laws of motion, which in particular determine the shapes of S1 and S2 , must be such that the
mechanical behaviour of S1 and S2 is partly conditioned in quite essential respects, by distant
masses which we have not included in the system under consideration. These distant masses
and their motions relative to S1 and S2 must then be regarded as the seat of the causes (which
must be susceptible to observation) of the different behaviour of our two bodies S1 and S2 . They
take over the rôle of the factitious cause R1 . Of all imaginable spaces R1 , R2 , etc., in any kind
of motion relative to one another there is none which we may look upon as privileged a priori
without reviving the above-mentioned epistemological objection. The laws of physics must be
of such a nature that they apply to systems of reference in any kind of motion.42 Along this road
we arrive at an extension of the postulate of relativity.
In addition to this weighty argument from the theory of knowledge, there is a well-known
physical fact which favours an extension of the theory of relativity. Let K be a Galilean system
of reference, i.e., a system relative to which (at least in the four-dimensional region under
consideration) a mass, sufficiently distant from other masses, is moving with uniform motion
in a straight line. Let $K'$ be a second system of reference which is moving relative to K in
uniformly accelerated translation. Then, relative to $K'$, a mass sufficiently distant from other
40
Of course an answer may be satisfactory from the point of view of epistemology, and yet be unsound physically,
if it is in conflict with other experiences. —AS
41
i.e., artificial —BC
42
At this time, Einstein had high hopes that his theory would be fully Machian. He was already aware of the
Schwarzschild solution (he refers to it near the end of the paper), which offended his Machian sensibilities because
it imputed properties to spacetime in a universe containing only a single point-mass. In the present example of
the bodies S1 and S2 , general relativity actually turns out to give the non-Machian result which Einstein here
says would be unsatisfactory.—BC
43
Eötvös has proved experimentally that the gravitational field has this property in great accuracy.—AS
44
We assume the possibility of verifying “simultaneity” for events immediately proximate in space, or — to
speak more precisely — for immediate proximity or coincidence in space-time, without giving a definition of this
fundamental concept.—AS
$K(x, y, z, t)$, and also a system of coordinates $K'(x', y', z', t')$ in uniform rotation45 relative to
K. Let the origins of both systems, as well as their axes of Z, permanently coincide. We shall
show that for a space-time measurement in the system $K'$ the above definition of the physical
meaning of lengths and times cannot be maintained. For reasons of symmetry it is clear that a
circle around the origin in the X, Y plane of K may at the same time be regarded as a circle
in the $X'$, $Y'$ plane of $K'$. We suppose that the circumference and diameter of this circle have
been measured with a unit measure infinitely small compared with the radius, and that we have
the quotient of the two results. If this experiment were performed with a measuring-rod46 at
rest relative to the Galilean system K, the quotient would be π. With a measuring-rod at rest
relative to $K'$, the quotient would be greater than π. This is readily understood if we envisage
the whole process of measuring from the “stationary” system K, and take into consideration
that the measuring-rod applied to the periphery undergoes a Lorentzian contraction, while the
one applied along the radius does not.47 Hence Euclidean geometry does not apply to $K'$.
The notion of coordinates defined above, which presupposes the validity of Euclidean geometry,
therefore breaks down in relation to the system $K'$. So, too, we are unable to introduce a
time corresponding to physical requirements in $K'$, indicated by clocks at rest relative to $K'$.
To convince ourselves of this impossibility, let us imagine two clocks of identical constitution
placed, one at the origin of coordinates, and the other at the circumference of the circle, and
both envisaged from the “stationary” system K. By a familiar result of the special theory of
relativity, the clock at the circumference — judged from K — goes more slowly than the other,
because the former is in motion and the latter at rest. An observer at the common origin
of coordinates, capable of observing the clock at the circumference by means of light, would
therefore see it lagging behind the clock beside him. As he will not make up his mind to let the
velocity of light along the path in question depend explicitly on the time, he will interpret his
observations as showing that the clock at the circumference “really” goes more slowly than the
clock at the origin. So he will be obliged to define time in such a way that the rate of a clock
depends upon where the clock may be.
We therefore reach this result: — In the general theory of relativity, space and time cannot
be defined in such a way that differences of the spatial coordinates can be directly measured by
the unit measuring-rod, or differences in the time coordinate by a standard clock.
The method hitherto employed for laying coordinates into the space-time continuum in a
definite manner thus breaks down, and there seems to be no other way which would allow us
to adapt systems of coordinates to the four-dimensional universe so that we might expect from
their application a particularly simple formulation of the laws of nature. So there is nothing for
it but to regard all imaginable systems of coordinates, on principle, as equally suitable for the
description of nature.48 This comes to requiring that: —
45
This example of a rotating frame of reference was discussed on p. 98.—BC
46
Einstein implicitly assumes that the measuring rods are perfectly rigid, but it is not obvious that this is
possible. This issue is discussed on p. 102.—BC
47
As described on p. 98, Ehrenfest originally imagined that the circumference of the disk would be reduced
by its rotation. His argument was incorrect, because it assumed the ability to start the disk rotating when it
had originally been at rest. The present paper marks the first time that Einstein asserted the opposite, that the
circumference is increased.—BC
48
This is a conceptual leap, not a direct inference from the argument about the rotating frame. Einstein
started thinking about this argument in 1912, and concluded from it that he should base a theory of gravity on
non-Euclidean geometry. Influenced by Levi-Civita, he tried to carry out this project in a coordinate-independent
way, but he failed at first, and for a while explored a theory that was not coordinate-independent. Only later
did he return to coordinate-independence. It should be clear, then, that the link between the rotating-frame
argument and coordinate-independence was not as clearcut as Einstein makes out here, since he himself lost faith
in it for a while.—BC
49
In this book I’ve used the more transparent terminology “coordinate independence” rather than “general
covariance.”—BC
50
For more on this point, see p. 106.—BC
51
This is an extreme interpretation of general covariance, and one that Einstein himself didn’t hew closely to
later on. He presented an almost diametrically opposed interpretation in a philosophical paper, “On the aether,”
Schweizerische naturforschende Gesellschaft 105 (1924) 85.—BC
52
i.e., what this book refers to as incidence measurements (p. 88)—BC
Appendix 2: Hints and solutions
Hints
Hints for Chapter 1
Page 38, problem 5: Apply the equivalence principle.
Page 76, problem 1:
(a) Let t be the time taken in the lab frame for the light to go from one mirror to the other,
and $t'$ the corresponding interval in the clock’s frame. Then $t' = L$, and $(vt)^2 + L^2 = t^2$, where the
use of the same L in both equations makes use of our prior knowledge that there is no transverse
length contraction. Eliminating L, we find the expected expression for γ, which is independent
of L. (b) If the result of part a depended on L, then the relativistic time dilation would depend
on the details of the construction of the clock measuring the time dilation. We would be forced
to abandon the geometrical interpretation of special relativity. (c) The effect is to replace $vt$ with
$vt + at^2/2$ as the quantity inside the parentheses in the expression $(\ldots)^2 + L^2 = t^2$. The resulting
correction terms are of higher order in t than the ones appearing in the original expression, and
can therefore be made as small in relative size as desired by shortening the time t. But this is
exactly what happens when we make the clock sufficiently small.
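As a quick numerical illustration of part (a), the following Python sketch solves the Pythagorean relation for the lab-frame time and confirms that the resulting time-dilation factor does not depend on the mirror separation. The function name and the sample values of v and L are illustrative assumptions, and units with c = 1 are assumed as in the solution.

```python
import math

def gamma_from_light_clock(v, L=1.0):
    """Return gamma = t / t' for a transverse light clock of length L (c = 1).

    t' = L is the one-way light travel time in the clock's frame, and the
    lab-frame time t satisfies (v t)^2 + L^2 = t^2.
    """
    t_prime = L
    t = L / math.sqrt(1.0 - v * v)   # solution of (v t)^2 + L^2 = t^2
    # Confirm that t really satisfies the Pythagorean relation.
    assert math.isclose((v * t) ** 2 + L ** 2, t ** 2, rel_tol=1e-12)
    return t / t_prime

# The ratio is the same for every clock size L (hypothetical sample values).
for L in (0.1, 1.0, 10.0):
    print(L, gamma_from_light_clock(0.6, L))
```

Each line of output shows the same factor, which agrees with $1/\sqrt{1 - v^2}$ for the chosen v.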
Page 77, problem 2:
Since gravitational redshifts can be interpreted as gravitational time dilations, the gravitational
time dilation is given by the difference in gravitational potential $g\,dr$ (in units where
$c = 1$). The kinematic effect is given by $d\gamma = d(v^2)/2 = \omega^2 r\,dr$. The ratio of the two effects is
$\omega^2 R \cos\lambda / g$, where R is the radius of the Earth and λ is the latitude. Tokyo is at 36 degrees
latitude, and plugging this in gives the claimed result.
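The ratio can be evaluated with a few lines of Python. This is only a sketch: the rotation rate, radius, and gravitational acceleration below are standard textbook values supplied here as assumptions, not numbers taken from the problem statement. The ratio is dimensionless, so SI units can be used throughout.

```python
import math

OMEGA = 7.292e-5    # Earth's rotation rate in rad/s (assumed standard value)
R_EARTH = 6.371e6   # mean radius of the Earth in m (assumed standard value)
G_ACCEL = 9.8       # gravitational acceleration in m/s^2 (assumed standard value)

def kinematic_to_gravitational_ratio(latitude_deg):
    """Ratio omega^2 R cos(lambda) / g of the kinematic to the gravitational effect."""
    lam = math.radians(latitude_deg)
    return OMEGA ** 2 * R_EARTH * math.cos(lam) / G_ACCEL

ratio_tokyo = kinematic_to_gravitational_ratio(36.0)
print(f"{ratio_tokyo:.2e}")
```

With these inputs the kinematic effect comes out as a fraction of a percent of the gravitational one.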
Page 77, problem 3:
(a) Reinterpret figure i on p. 74 as a picture of a Sagnac ring interferometer. Let light waves
1 and 2 move around the loop in opposite senses. Wave 1 takes time $t_{1i}$ to move inward along
the crack, and time $t_{1o}$ to come back out. Wave 2 takes times $t_{2i}$ and $t_{2o}$. But $t_{1i} = t_{2i}$ (since
the two world-lines are identical), and similarly $t_{1o} = t_{2o}$. Therefore creating the crack has no
effect on the interference between 1 and 2, and splitting the big loop into two smaller loops
merely splits the total phase shift between them. (b) For a circular loop of radius r, the time
of flight of each wave is proportional to r, and in this time, each point on the circumference
of the rotating interferometer travels a distance $v(\text{time}) = (\omega r)(\text{time}) \propto r^2$. (c) The effect is
proportional to area, and the area is zero. (d) The light clock in c has its two ends synchronized
according to the Einstein prescription, and the success of this synchronization verifies Einstein’s
assumption of commutativity in this particular case. If we make a Sagnac interferometer in the
shape of a triangle, then the Sagnac effect measures the failure of Einstein’s assumption that all
three corners can be synchronized with one another.
Page 77, problem 5:
Here is the program:
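    /* Boost matrices for rapidities h1 and h2 */
    L1:matrix([cosh(h1),sinh(h1)],[sinh(h1),cosh(h1)]);
    L2:matrix([cosh(h2),sinh(h2)],[sinh(h2),cosh(h2)]);
    /* Compose the two boosts by matrix multiplication */
    T:L1.L2;
    /* Expand to second order in each rapidity */
    taylor(taylor(T,h1,0,2),h2,0,2);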
The diagonal components of the result are both $1 + \eta_1^2/2 + \eta_2^2/2 + \eta_1\eta_2 + \ldots$ Everything after
the 1 is nonclassical. The off-diagonal components are $\eta_1 + \eta_2 + \eta_1\eta_2^2/2 + \eta_2\eta_1^2/2 + \ldots$, with the
third-order terms being nonclassical.
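The same expansion can be checked numerically without a computer algebra system. The Python sketch below multiplies two boost matrices for small rapidities and compares the product with the truncated series quoted above. The helper names and the sample rapidities are illustrative assumptions.

```python
import math

def boost(eta):
    """2x2 Lorentz boost in one spatial dimension, parametrized by rapidity eta."""
    c, s = math.cosh(eta), math.sinh(eta)
    return [[c, s], [s, c]]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

h1, h2 = 0.01, 0.02          # small rapidities (arbitrary illustrative values)
T = matmul2(boost(h1), boost(h2))

# Truncated series quoted in the solution
diag_series = 1 + h1**2 / 2 + h2**2 / 2 + h1 * h2
offdiag_series = h1 + h2 + h1 * h2**2 / 2 + h2 * h1**2 / 2

print(T[0][0] - diag_series)      # remainder is fourth order in the rapidities
print(T[0][1] - offdiag_series)   # remainder is dominated by the dropped h1^3/6 + h2^3/6
```

The off-diagonal remainder is not zero because truncating at second order in each rapidity separately discards the pure cubic terms, which is consistent with the ellipsis in the quoted result.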
but it would be on the order of $c\Delta T$, which is ∼ 100 m. This is considerably worse than civilian
GPS’s 20-meter error bars.
Page 109, problem 3:
The process that led from the Euclidean metric of example 7 on page 93 to the non-Euclidean
one of equation [3] on page 100 was not just a series of coordinate transformations. At the final
step, we got rid of the variable t, reducing the number of dimensions by one. Similarly, we could
take a Euclidean three-dimensional space and eliminate all the points except for the ones on the
surface of the unit sphere; the geometry of the embedded sphere is non-Euclidean, because we’ve
redefined geodesics to be lines that are “as straight as they can be” (i.e., have minimum length)
while restricted to the sphere. In the example of the carousel, the final step effectively redefines
geodesics so that they have minimal length as determined by a chain of radar measurements.
Page 109, problem 4:
(a) No. The track is straight in the lab frame, but curved in the rotating frame. Since the
spatial metric in the rotating frame is symmetric with respect to clockwise and counterclockwise,
the metric can never result in geodesics with a specific handedness. (b) The $d\theta'^2$ term of the
metric blows up here. A geodesic connecting point A, at $r = 1/\omega$, with point B, at $r < 1/\omega$,
must have minimum length. This requires that the geodesic be directly radial at A, so that
$d\theta' = 0$; for if not, then we could vary the curve slightly so as to reduce $|d\theta'|$, and the resulting
increase in the $dr^2$ term would be negligible compared to the decrease in the $d\theta'^2$ term. (c) No.
As we found in part a, laser beams can’t be used to form geodesics.
Page 110, problem 5: A and B are equivalent under a Lorentz transformation, so the Penrose
result clearly includes B. The outline of the sphere is still spherical. C is also equivalent to
A and B, because there are only two effects (Lorentz contraction and optical aberration), and
both of them depend only on the observer’s instantaneous velocity, not on his history of motion.
D is not a well-defined question. When asking this question, we’re implicitly assuming that
the sphere has some “real” shape, which appears different because the sphere has been set into
motion. But you can’t impart an angular acceleration to a perfectly rigid body in relativity.
Page 110, problem 6: Applying the de Broglie relations to the relativistic identity $m^2 = E^2 - p^2$,
we find the dispersion relation to be $m^2 = \omega^2 - k^2$. The group velocity is
$d\omega/dk = \sqrt{1 - (m/\omega)^2}$. Applying the de Broglie relations to this, and associating the group velocity
with v, we have $v = \sqrt{1 - (m/E)^2}$, which is equivalent to $E = m\gamma$. Since $E = m\gamma$ has been
established, and $m^2 = E^2 - p^2$ was assumed, it follows immediately that $p = m\gamma v$ holds as well.
All hell breaks loose if we try to associate v with the phase velocity, which is $\omega/k = \sqrt{1 + (m/k)^2}$.
For example, the phase velocity is always greater than $c (= 1)$ for $m > 0$.
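The chain of identities above can be spot-checked numerically. The following Python sketch assumes units with ħ = c = 1, so that E = ω and p = k, and uses arbitrary sample wavenumbers. The function names are illustrative, not taken from the text.

```python
import math

def omega_of_k(k, m):
    """Dispersion relation omega^2 = k^2 + m^2, i.e. m^2 = omega^2 - k^2."""
    return math.sqrt(k * k + m * m)

def group_velocity(k, m):
    """d(omega)/dk = k/omega = sqrt(1 - (m/omega)^2)."""
    return k / omega_of_k(k, m)

def phase_velocity(k, m):
    """omega/k = sqrt(1 + (m/k)^2)."""
    return omega_of_k(k, m) / k

m = 1.0
for k in (0.1, 1.0, 10.0):          # arbitrary sample wavenumbers
    E, p = omega_of_k(k, m), k       # de Broglie relations with hbar = 1
    v = group_velocity(k, m)
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    assert math.isclose(E, m * gamma, rel_tol=1e-9)      # E = m gamma
    assert math.isclose(p, m * gamma * v, rel_tol=1e-9)  # p = m gamma v
    assert phase_velocity(k, m) > 1.0                    # phase velocity exceeds c
    print(k, v, phase_velocity(k, m))
```

In every case the group velocity stays below 1 while the phase velocity exceeds it, and their product is 1.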
the nonvanishing Christoffel symbols (ignoring permutations of the lower indices) are $\Gamma^t{}_{zt} = g$
and $\Gamma^z{}_{tt} = g e^{2gz}$. We can apply the geodesic equation with the affine parameter taken to be the
proper time, and this gives $\ddot{z} = -g e^{2gz} \dot{t}^2$, where dots represent differentiation with respect to
proper time. For a particle instantaneously at rest, $\dot{t} = 1/\sqrt{g_{tt}} = e^{-gz}$, so $\ddot{z} = -g$.
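A small numerical sketch can make the cancellation of the exponentials concrete. It assumes that metric [1] has $g_{tt} = e^{2gz}$ and $g_{zz} = -1$, which is what the Christoffel symbols quoted above imply; the metric itself is not reproduced in this excerpt, so this form, the function names, and the sample value of g are all assumptions.

```python
import math

def g_tt(z, g):
    """Assumed time-time metric component, g_tt = exp(2 g z)."""
    return math.exp(2.0 * g * z)

def christoffel_z_tt(z, g, h=1e-6):
    """Gamma^z_tt = -(1/2) g^{zz} d(g_tt)/dz with g^{zz} = -1, by central difference."""
    dgtt_dz = (g_tt(z + h, g) - g_tt(z - h, g)) / (2.0 * h)
    return 0.5 * dgtt_dz

def acceleration_at_rest(z, g):
    """Geodesic equation for a particle at rest: z'' = -Gamma^z_tt * (dt/ds)^2."""
    t_dot = 1.0 / math.sqrt(g_tt(z, g))
    return -christoffel_z_tt(z, g) * t_dot ** 2

g = 0.3                                  # arbitrary illustrative field strength
for z in (-2.0, 0.0, 3.0):
    print(z, acceleration_at_rest(z, g))  # each value is close to -g
```

The acceleration is the same at every height, which is the sense in which the field is uniform.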
(c) [2] was constructed by performing a change of coordinates on a flat-space metric, so it
is flat. The Riemann tensor of [1] has $R^t{}_{ztz} = -g^2$, so [1] isn’t flat. Therefore the two can’t be
the same under a change of coordinates.
(d) [2] is flat, so its curvature is constant. [1] has the property that under the transformation
z → z + c, where c is a constant, the only change is a rescaling of the time coordinate; by
coordinate invariance, such a rescaling is unobservable.
Page 181, problem 6: (a) 0 ≤ x ≤ 1
(b) 0 ≤ x < 1
(c) $x^2 \le 2$
Page 181, problem 7: The double cone fails to satisfy axiom M2, because the apex has
properties that differ topologically from those of other points: deleting it chops the space into
two disconnected pieces.
Page 181, problem 8: When we use a word like “torus,” there is some hidden ambiguity. We
could mean something strange like the following. Suppose we construct the three-dimensional
space of coordinates (x, y, z) in which all three coordinates are rational numbers. Then let a
torus be the set of all such points lying at a distance of 1/2 from the nearest point on a unit
circle. This is in some sense a torus, but it doesn’t have the topological properties one usually
assumes. For example, two continuous curves on its surface can cross without having a point of
intersection. We can’t get anywhere without assuming that the word “torus” refers to a surface
that has the usual topological properties.
Now let’s prove that it’s a manifold using both definitions.
Using the topological definition, M1 is satisfied with n = 2, because every point on the
surface lives in a two-dimensional neighborhood. M2 holds because the only differences between
points are those that are not topological, e.g., Gaussian curvature. M3 holds precisely
under the interpretation outlined in the first paragraph.
Alternatively, we can use the local-coordinate definition. We have already shown that a
circle is a 1-manifold, which can be coordinatized in two patches by an angle φ. The torus can
therefore be coordinatized by a pair of such angles, (φ1 , φ2 ), in four patches. Again we need to
assume the interpretation given above, since otherwise real-number pairs like (φ1 , φ2 ) wouldn’t
have the same topology as points on the rational-number torus.
Page 217, problem 3: (a) There are singularities at $r = 0$, where $g_{\theta'\theta'} = 0$, and $r = 1/\omega$,
where $g_{tt} = 0$. These are considered singularities because the inverse of the metric blows up.
They’re coordinate singularities, because they can be removed by a change of coordinates back
to the original non-rotating frame.
(b) This one has singularities in the same places. The one at r = 0 is a coordinate singularity,
because at small r the ω dependence is negligible, and the metric is simply that of ordinary
plane polar coordinates in flat space. The one at r = 1/ω is not a coordinate singularity. The
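    load(ctensor);                 /* component tensor package */
    dim:2;                         /* two spatial dimensions */
    ct_coords:[r,theta];
    /* spatial metric of the rotating frame; w stands for omega */
    lg:matrix([-1,0],
              [0,-r^2/(1-w^2*r^2)]);
    cmetric();                     /* compute the inverse metric */
    ricci(true);                   /* compute and display the Ricci tensor */
    scurvature();                  /* scalar curvature R */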
The result is $R = 6\omega^2/(1 - 2\omega^2 r^2 + \omega^4 r^4)$. This blows up at $r = 1/\omega$, which shows that this is
not a coordinate singularity. The fact that R does not blow up at r = 0 is consistent with our
earlier conclusion that r = 0 is a coordinate singularity, but would not have been sufficient to
prove that conclusion.
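The Maxima result can be cross-checked independently. For a two-dimensional metric of the form $dr^2 + G(r)^2 d\theta^2$, the Gaussian curvature is $K = -G''/G$, and the scalar curvature has magnitude $2|K|$. The Python sketch below evaluates this by finite differences with $G = r/\sqrt{1 - \omega^2 r^2}$ and compares magnitudes only, since the overall sign depends on the signature and curvature conventions. The function names, step size, and sample radii are illustrative assumptions.

```python
import math

def G(r, w):
    """Circumferential scale factor sqrt(|g_theta theta|) = r / sqrt(1 - w^2 r^2)."""
    return r / math.sqrt(1.0 - (w * r) ** 2)

def gaussian_curvature(r, w, h=1e-4):
    """K = -G''/G for the metric dr^2 + G(r)^2 dtheta^2, via a central second difference."""
    d2 = (G(r + h, w) - 2.0 * G(r, w) + G(r - h, w)) / h ** 2
    return -d2 / G(r, w)

def scalar_curvature_formula(r, w):
    """Closed form quoted in the solution: 6 w^2 / (1 - 2 w^2 r^2 + w^4 r^4)."""
    return 6.0 * w ** 2 / (1.0 - 2.0 * w ** 2 * r ** 2 + w ** 4 * r ** 4)

w = 1.0                                   # arbitrary illustrative angular velocity
for r in (0.1, 0.5, 0.8):                 # sample radii inside r = 1/w
    print(r, abs(2.0 * gaussian_curvature(r, w)), scalar_curvature_formula(r, w))
```

The two columns agree, and both grow without bound as r approaches 1/ω, while remaining finite at small r.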
(c) The argument is incorrect. The Gaussian curvature is not just proportional to the angular
deficit $\epsilon$, it is proportional to the limit of $\epsilon/A$, where A is the area of the triangle. The area of
the triangle can be small, so there is no upper bound on the ratio $\epsilon/A$. Debunking the argument
restores consistency with the answer to part b.
Page 217, problem 7: The only nonvanishing Christoffel symbol is $\Gamma^t{}_{tt} = -1/2t$. The antisymmetric
treatment of the indices in $R^a{}_{bcd} = \partial_c \Gamma^a{}_{db} - \partial_d \Gamma^a{}_{cb} + \Gamma^a{}_{ce}\Gamma^e{}_{db} - \Gamma^a{}_{de}\Gamma^e{}_{cb}$ guarantees
that the Riemann tensor must vanish when there is only one nonvanishing Christoffel symbol.
Page 218, problem 8: The first thing one notices is that the equation $R_{ab} = k$ isn’t written
according to the usual rules of grammar for tensor equations. The left-hand side has two lower
indices, but the right-hand side has none. In the language of freshman physics, this is like
setting a vector equal to a scalar. Suppose we interpret it as meaning that each of R’s 16
components should equal k in a vacuum. But this still isn’t satisfactory, because it violates
coordinate-independence. For example, suppose we are initially working with some coordinates
$x^\mu$, and we then rescale all four of them according to $x'^\mu = 2x^\mu$. Then the components of $R_{ab}$
all scale down by a factor of 4. But this would violate the proposed field equation.
Page 218, problem 9: The following Maxima code calculates the Ricci tensor for a metric
with $g_{tt} = h$ and $g_{rr} = k$.
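    load(ctensor);                 /* component tensor package */
    dim:3;                         /* 2+1 dimensions */
    ct_coords:[t,r,phi];
    depends(h,r);                  /* h and k are unknown functions of r */
    depends(k,r);
    lg:matrix([h,0,0],
              [0,-k,0],
              [0,0,-r^2]);
    cmetric();                     /* compute the inverse metric */
    ricci(true);                   /* compute and display the Ricci tensor */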
Inspecting the output (not reproduced here), we see that $R_{\phi\phi} = 0$ requires $k'/k = h'/h$. Since
the logarithmic derivatives of h and k are the same, the two functions can differ by at most a
constant factor c. So now we do a second iteration of the calculation:
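    load(ctensor);                 /* component tensor package */
    dim:3;                         /* 2+1 dimensions */
    ct_coords:[t,r,phi];
    depends(h,r);                  /* h is an unknown function of r */
    /* second iteration: k has been replaced by c*h */
    lg:matrix([h,0,0],
              [0,-c*h,0],
              [0,0,-r^2]);
    cmetric();                     /* compute the inverse metric */
    ricci(true);                   /* compute and display the Ricci tensor */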
The result for $R_{rr}$ is independent of c. Since h is essentially the gravitational potential, we have
the requirements $h' > 0$ (because gravity is attractive) and $h'' < 0$ (because gravity weakens
with distance). Therefore we find that $R_{rr}$ is positive, and we do not obtain a vacuum solution.
Page 238, problem 2: (a) If she makes herself stationary relative to the sun, she will still
experience local geometrical changes because of the planets. (b) If it was to be impossible
for her to prove the universe’s nonstationarity, then any world-line she picked would have to
experience constant local geometrical conditions. A counterexample is any world-line extending
back to the Big Bang, which is a singularity with drastically different conditions than any other
region of spacetime. (c) To maintain a constant local geometry, she would have to “surf” the
wave, but she can’t do that, because it propagates at the speed of light. (d) There are places
where the local mass-energy density is increasing, and the field equations link this to a change
in the local geometry.
Page 238, problem 4:
Under these special conditions, the geodesic equations become $\ddot{r} = \Gamma^r{}_{tt}\dot{t}^2$, $\ddot{\phi} = 0$, $\ddot{t} = 0$,
where the dots can in principle represent differentiation with respect to any affine parameter we
like, but we intend to use the proper time s. By symmetry, there will be no motion in the z
direction. The Christoffel symbol equals $-(1/2)e^r(\cos\sqrt{3}r - \sqrt{3}\sin\sqrt{3}r)$. At a location where
the cosine equals 1, this is simply $-e^r/2$. For $\dot{t}$, we have $dt/ds = 1/\sqrt{g_{tt}} = e^{-r/2}$. The result of
the calculation is simply $\ddot{r} = -1/2$, which is independent of r.
Page 238, problem 5:
The Petrov metric is one example. The metric has no singularities anywhere, so the r
coordinate can be extended from −∞ to +∞, and there is no point that can be considered the
center. The existence of a dφdt term in the metric shows that it is not static.
A simpler example is a spacetime made by taking a flat Lorentzian space and making it wrap
around topologically into a cylinder, as in problem 1 on p. 109. As discussed in the solution
to that problem, this spacetime has a preferred state of rest in the azimuthal direction. In
a frame that is moving azimuthally relative to this state of rest, the Lorentz transformation
requires that the phase of clocks be adjusted linearly as a function of the azimuthal coordinate
φ. As described in section 3.4.4, this will cause a discontinuity once we wrap around by 2π, and
therefore clock synchronization fails, and this frame is not static.
Page 290, problem 2: No. General relativity only allows coordinate transformations that are
smooth and one-to-one (see p. 89). This transformation is not smooth at t = 0.
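Page 290, problem 5: (a) The Friedmann equations are
$$\frac{\ddot{a}}{a} = \frac{1}{3}\Lambda - \frac{4\pi}{3}(\rho + 3P)$$
and
$$\left(\frac{\dot{a}}{a}\right)^2 = \frac{1}{3}\Lambda + \frac{8\pi}{3}\rho - k a^{-2} .$$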
The first equation is time-reversal invariant because the second derivative stays the same under
time reversal. The second equation is also time-reversal invariant, because although the first
derivative flips its sign under time reversal, it is squared.
(b) We typically do not think of a singularity as being a point belonging to a manifold at all. If
we want to create this type of connected, symmetric back-to-back solution, then we need the Big
Bang singularity to be a point in the manifold. But this violates the definition of a manifold,
because then the Big Bang point would have topological characteristics different from those of
other points: deleting it separates the spacetime into two pieces.
Page 290, problem 4: Example 4 on page 270, the cosmic girdle, showed that a rope that
stretches over cosmological distances does expand significantly, unlike Brooklyn, nuclei, and
solar systems. Since the Milne universe is nothing but a flat spacetime described in funny
coordinates, something about that argument must fail. The argument used in that example
relied on the use of a closed cosmology, but the Milne universe is not closed. This is not a
completely satisfying resolution, however, because we expect that a rope in an open universe
will also expand, except in the special case of the Milne universe.
In a nontrivial open universe, every galaxy is accelerating relative to every other galaxy.
By the equivalence principle, these accelerations can also be seen as gravitational fields, and
tidal forces are what stretch the rope. In the special case of the Milne universe, there is no
acceleration of test particles relative to other test particles, so the rope doesn’t stretch.
Example 6 on page 273, the cosmic whip, resulted in the conclusion that the velocity of the
rope-end passing by cannot be interpreted as a measure of the velocity of the distant galaxy
to which the rope’s other end is hitched, which makes sense because cosmological solutions are
nonstationary, so there is no uniquely defined notion of the relative velocity of distant objects.
The Milne universe, however, is stationary, so such velocities are well defined. The key here is
that nothing is accelerating, so the time delays in the propagation of information do not lead to
ambiguities in extrapolating to a distant object’s velocity “now.”
The Milne case also avoids the paradox in which we could imagine that if the rope is suf-
ficiently long, its end would be moving at more than the speed of light. Although there is no
limit to the length of a rope in the Milne universe (there being no tidal forces), the Hubble
law cannot be extrapolated arbitrarily, since the expanding cloud of test particles has an edge,
beyond which there is only vacuum.
Page 290, problem 6: The cosmological constant is a scalar, so it doesn’t change under
reflection. The metric is also invariant under reflection of any coordinate. This follows because
we have assumed that the coordinates are locally Lorentzian, so that the metric is diagonal.
It can therefore be written as a line element in which the differentials are all squared. This
establishes that the $\Lambda g_{ab}$ term is invariant under any spatial or temporal reflection.
The specialized form of the energy-momentum tensor $\mathrm{diag}(-\rho, P, P, P)$ is also clearly
invariant under any reflection, since both pressure and mass-energy density are scalars.
The form of the tensor transformation law for a rank-2 tensor guarantees that the diagonal
elements of such a tensor stay the same under a reflection. The off-diagonal elements will flip
sign, but since only the G and T terms in the field equation have off-diagonal terms, the field
equations remain valid under reflection.
In summary, the Einstein field equations retain the same form under reflection in any co-
ordinate. This important symmetry property, which is part of the Poincaré group in special
relativity, is retained when we make the transition to general relativity. It’s a discrete sym-
metry, so it wasn’t guaranteed to exist simply because of general covariance, which relates to
continuous coordinate transformations.
Page 290, problem 7: (a) The Einstein field equations are $G_{ab} = 8\pi T_{ab} + \Lambda g_{ab}$. That means
that in a vacuum, where $T = 0$, a cosmological constant is equivalent to $\rho = (1/8\pi)\Lambda$ and
$P = -(1/8\pi)\Lambda$. This gives $\rho + 3P = (1/8\pi)(-2\Lambda)$, which violates the SEC for $\Lambda > 0$, since part
of the SEC is $\rho + 3P \ge 0$.
(b) Since our universe appears to have a positive cosmological constant, and the paper by
Hawking and Ellis assumes the strong energy condition, doubts are raised about the conclusion of
the paper as applied to our universe. However, the theorem is being applied to the early universe,
which was not a vacuum. Both P and ρ were large and positive in the early, radiation-dominated
universe, and therefore the SEC was not violated.
Page 291, problem 8:
(a) The Ricci tensor is $R_{tt} = g^2 e^{2gz}$, $R_{zz} = -g^2$. The scalar curvature is $2g^2$, which is
constant, as expected.
(b) Both $G_{tt}$ and $G_{zz}$ vanish by a straightforward computation.
(c) The Einstein tensor is $G_{tt} = 0$, $G_{xx} = G_{yy} = g^2$, $G_{zz} = 0$. It is unphysical because it has
a zero mass-energy density, but a nonvanishing pressure.
Page 291, problem 9:
This proposal is an ingenious attempt to propose a concrete method for getting around the
fact that in relativity, there is no unique way of defining the relative velocities of objects that
lie at cosmological distances from one another.
Because the Milne universe is a flat spacetime, there is nothing to prevent us from laying
out a chain of arbitrary length. The chain will not, for example, be subject to the kind of tidal
forces that would inevitably break a chain that was lowered through the event horizon of a
black hole. But this only guarantees us that we can have a chain of a certain length as measured
in the chain’s frame. An observer at rest with respect to the chain describes all the links of
the chain as existing simultaneously at a certain set of locations. But this is a description in
(T , R) coordinates. To an observer who prefers the FRW coordinates, the links do not exist
simultaneously at these locations. This observer says that the supposed locations of distant
points on the chain occurred far in the past, and suspects that the chain has broken since then.
The paradox can also be resolved from the point of view of the (T , R) coordinates. The
$$\frac{\ddot{a}}{a} = -\frac{4\pi}{3}(1 + 3w)\rho$$
$$\left(\frac{\dot{a}}{a}\right)^2 = \frac{8\pi}{3}\rho .$$
Eliminating ρ, we find
$$\frac{a\ddot{a}}{\dot{a}^2} = -\beta ,$$
where $\beta = (1 + 3w)/2$. For a solution of the form $a \propto t^\delta$, calculation of the derivatives results in
$\delta = 1/(1 + \beta) = (2/3)/(1 + w)$. For dust, $\delta = 2/3$, which checks out against the result on p. 279.
For radiation, $\delta = 1/2$. For a cosmological constant, $w = -1$ gives $\delta = \infty$, so the solution has a
different form.
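The power-law exponents can be verified numerically. The Python sketch below builds $a(t) = t^\delta$, estimates its derivatives by finite differences, and checks that the dimensionless combination $a\ddot{a}/\dot{a}^2$ equals $-\beta$. The function names, the evaluation time, and the step size are illustrative assumptions.

```python
def power_law_exponent(w):
    """delta = 1/(1 + beta) with beta = (1 + 3w)/2, for a proportional to t^delta."""
    beta = (1.0 + 3.0 * w) / 2.0
    return 1.0 / (1.0 + beta)

def a_addot_over_adot_squared(delta, t=2.0, h=1e-4):
    """Finite-difference estimate of a * a'' / (a')^2 for a(t) = t^delta."""
    a = lambda x: x ** delta
    a_dot = (a(t + h) - a(t - h)) / (2.0 * h)
    a_ddot = (a(t + h) - 2.0 * a(t) + a(t - h)) / h ** 2
    return a(t) * a_ddot / a_dot ** 2

for label, w in (("dust", 0.0), ("radiation", 1.0 / 3.0)):
    delta = power_law_exponent(w)
    beta = (1.0 + 3.0 * w) / 2.0
    print(label, delta, a_addot_over_adot_squared(delta), -beta)
```

For each equation of state, the last two printed numbers agree to within the finite-difference error.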
Page 291, problem 12: The integral is exactly the same as the one in example 9 on p. 280
for the dust case, except that the exponent 2/3 is generalized to δ = (2/3)/(1 + w), as shown
in the solution to problem 11. The result is L/t = 1/(1 − δ) = (w + 1)/(w + 1/3). In the
radiation-dominated case, we have L/t = 2.
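The algebra relating the two forms of L/t can be checked exactly with rational arithmetic. This Python sketch uses the standard-library `fractions` module; the function name and the sample values of w are illustrative assumptions.

```python
from fractions import Fraction

def horizon_ratio(w):
    """L/t = 1/(1 - delta), with delta = (2/3)/(1 + w), in exact rational arithmetic."""
    delta = Fraction(2, 3) / (1 + w)
    return 1 / (1 - delta)

for w in (Fraction(0), Fraction(1, 3), Fraction(1, 2)):
    # Compare with the closed form (w + 1)/(w + 1/3) quoted in the solution.
    assert horizon_ratio(w) == (w + 1) / (w + Fraction(1, 3))
    print(w, horizon_ratio(w))
```

Because the comparison is exact, any algebra slip in the closed form would trip the assertion rather than hide inside a rounding tolerance.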
Page 305, problem 1: (a) The members of the Hulse-Taylor system are spiraling toward one
another as they lose energy to gravitational radiation. If one of them were replaced with a
low-mass test particle, there would be negligible radiation, and the motion would no longer be
a spiral. This is similar to the issues encountered on pp. 39ff because the neutron stars in the
Hulse-Taylor system suffer a back-reaction from their own gravitational radiation.
(b) If this occurred, then the particle’s world-line would be displaced in space relative to a
geodesic of the spacetime that would have existed without the presence of the particle. What
would determine the direction of that displacement? It can’t be determined by properties of
this preexisting, ambient spacetime, because the Riemann tensor is that spacetime’s only local,
intrinsic, observable property. At a fixed point in spacetime, the Riemann tensor is even under
spatial reflection, so there’s no way it can distinguish a certain direction in space from the
opposite direction.
What else could determine this mysterious displacement? By assumption, it’s not determined
by a preexisting, ambient electromagnetic field. If the particle had charge, the direction
could be one imposed by the back-reaction from the electromagnetic radiation it had emitted
in the past. If the particle had a lot of mass, then we could have something similar with
gravitational radiation, or some other nonlinear interaction of the particle’s gravitational field with
the ambient field. But these nonlinear or back-reaction effects are proportional to q² and m²,
so they vanish when q = 0 and m → 0.
The only remaining possibility is that the result violates the symmetry of space expressed by
L1 on p. 46; the Lorentzian geometry is the result of L1-L5, so violating L1 should be considered
a violation of Lorentz invariance.
rectly from the paper. I believe the use of these images in this book falls under the fair use
exception to copyright in the U.S.
246 Apollo 11 mirror: NASA, public domain.
257 Penzias-Wilson antenna: NASA, public domain.
263 Lemaître: Ca. 1933, public domain.
281 Cosmic microwave background image: NASA/WMAP Science Team, public domain.
288 Dicke’s apparatus: Dicke, 1967. Used under the US fair-use doctrine.
296 LIGO and LISA sensitivities: NASA, public domain.
295 Graph of pulsar’s period: Weisberg and Taylor, http://arxiv.org/abs/astro-ph/0211217.
342 Index
Goudsmit, 75
GPS
  frames of reference used in, 101
  timing signals, 126
gravitational constant, 190
gravitational field
  uniform, 180, 233, 291
gravitational mass, 21, 243
  active, 243
  passive, 243
gravitational potential, see potential
gravitational red-shift, see red-shift
gravitational shielding, 251
gravitational waves
  empirical evidence for, 294
  energy content, 296
  propagation at c, 293
  propagation at less than c, for high amplitudes, 293
  rate of radiation, 302
  transverse nature, 299
Gravity Probe A, 17
Gravity Probe B, 67, 127
  frame dragging, 136
  geodetic effect calculated, 194
  geodetic effect estimated, 153
group, 96
Hafele-Keating experiment, 15, 66
Hawking radiation, 210
Hawking, Stephen, 20
hole argument, 104
homeomorphism, 176
Hoyle, Fred, 258
Hubble constant, 253, 280
Hubble flow, 281
Hubble, Edwin, 257
Hulse, R.A., 202
Hulse-Taylor pulsar, 202, 304
hyperbolic geometry, 150
inertial frame, 24
  ambiguity in definition, 29
inertial mass, 21, 243
information paradox, 185, 209
inner product, 96
intrinsic quantity, 87
isometry, 96
Ives-Stilwell experiments, 119
Jacobian matrix, 137
Kasner metric, 234
Killing equation, 221
Killing vector, 219
  orbit, 219
Kretchmann invariant, 205, 226, 238
Kreuzer experiment, 243
large extra dimensions, 212
Lemaître, Georges, 264
length contraction, 50
Lense-Thirring effect, 136, 235
Levi-Civita symbol, 138, 166
Levi-Civita, Tullio, 83, 106, 138
light
  deflection by sun, 154, 202
light clock, 76
light cone, 57
lightlike, 57
logic
  Aristotelian, 60
loop quantum gravity, 62
Lorentz boost, 47
lune, 86
Mössbauer effect, 35
Mach’s principle, 105, 239, 283
manifold, 175
mass
  active gravitational, 243
  ADM, 298
  gravitational, 21, 243
  inertial, 21, 243
  passive gravitational, 243
mass-energy, 114
  ADM, 298
Maxima, 68, 189
Mercury
  orbit of, 192
metric, 91
Michelson-Morley experiment, 62
Milne universe, 265
Minkowski, 41
model
  mathematical, 84
momentum four-vector, 114
muon, 16
naked singularity, 210
neighborhood, 176
neutrino, 116
neutron star, 130, 202
no-cloning theorem, 185
no-hair theorems, 229
normal coordinates, 148
null energy condition, 248
observable universe, 280
  size and age, 280
open cosmology, 261
open set, 176
optical effects, 110
orbit
  Killing vector, 219
orientability, 137
orientable
  in time, 193
parallel postulate, 18
parallel transport, 81, 82
parity, 96
Pasch, Moritz, 19
patch, 179
Penrose, Roger, 93, 110, 210
Penrose-Hawking singularity theorems, 252, 264
Penzias, Arno, 257
Petrov classification, 299
Petrov metric, 234, 238
photon
  mass, 117
Pioneer anomaly, 271
Planck mass, 168
Planck scale, 167
Playfair’s axiom, 18
Poincaré group, 96, 336
polarization
  of gravitational waves, 299
  of light, 115
potential, 32
  Hansen’s, 232
  not defined in arbitrary spacetimes, 232
  relativistic vs. Newtonian, 232
Pound-Rebka experiment, 16, 34
principal group, 97
prior geometry, 107
projective geometry, 88
proper distance, 260
proper time, 111
pulsar, 130, 202
rank of a tensor, 92
rapidity, 59
red-shift
  cosmological
    kinematic versus gravitational, 232, 273
  gravitational, 16, 34
Ricci curvature, 145
  defined, 152
Ricci scalar, 205
Riemann curvature tensor, 151
Riemann tensor
  defined, 151
rigid-body rotation, 99
Rindler coordinates, 180
ring laser, 66
Robinson
  Abraham, 85
Robinson, Abraham, 80
rotating frame of reference, 98, 233
rotation
  rigid, 99
Sagittarius A*, 208, 229
Sagnac effect, 100, 228
  defined, 66
  in GPS, 53
  proportional to area, 77
scalar curvature, 205
Schwarzschild metric, 193
  in d dimensions, 212
Schwarzschild, Karl, 187
shielding
  gravitational, 251
signature
  change of, 213
  defined as a list of signs, 188
  defined as an integer, 213
singularity, 20, 209
  naked, 210
singularity theorems, 252
Sirius B, 16
spacelike, 57
spaceship paradox, 59, 181
special relativity
  defined, 29
spherical geometry, 84
spherical symmetry, 224
spontaneous symmetry breaking, 281
standard cosmological coordinates, 260
static spacetime, 229
stationary, 226
  asymptotically, 227
steady-state cosmology, 258
stress-energy tensor, 146, 241
string theory, 167
strong energy condition, 248
surface of last scattering, 258
Susskind, Leonard, 211
Sylvester’s law of inertia, 214
symmetrization, 92
symmetry
  spherical, 224
symmetry breaking
  spontaneous, 281
synchronization
  Einstein convention, 228, 308
tangent space, 220
Tarski, Alfred, 84
Taylor, J.H., 202
tensor, 92, 125
  antisymmetric, 92
  rank, 92
  symmetric, 92
  transformation law, 125
tensor density, 137
tensor transformation laws, 123
Terrell, James, 110
Thomas precession, 153, 195
Thomas, Llewellyn, 75
time dilation
  gravitational, 15, 33
    nonuniform field, 53
  kinematic, 15, 50
time reversal, 96
  of the Schwarzschild metric, 193
  symmetry of general relativity, 193
time-orientable, 193
timelike, 57
Tolman-Oppenheimer-Volkoff limit, 131
topology, 175
torsion, 163
  tensor, 166
trace energy condition, 248
transformation laws, 123
transition map, 179
transverse polarization
  of gravitational waves, 299
  of light, 115
trapped surface, 252
triangle inequality, 96
Type III solution, 299
Type N solution, 299
Uhlenbeck, 75
uniform gravitational field, 180, 233, 291
unitarity, 185, 209
units
  geometrized, 190
universe
  observable, 280
    size and age, 280
upsidasium, 26
vector
  contravariant, 91
  covariant, 91
velocity vector, 112
volume expansion, 252
Waage, Harold, 26
wavenumber, 119
waves
  gravitational, see gravitational waves
weak energy condition, 248
weight of a tensor density, 137
Wheeler, John, 26
white dwarf, 129
Wilson, Robert, 257
world-line, 21
Euclidean geometry (page 18):
E3 A unique circle can be constructed given any point as its center and any line segment as
its radius.
E5 Parallel postulate: Given a line and a point not on the line, exactly one line can be drawn
through the point and parallel to the given line. [53]
O2 Line segments can be extended: given A and B, there is at least one event such that [ABC]
is true.
O4 Betweenness: For any three distinct events A, B, and C lying on the same line, we can
determine whether or not B is between A and C (and by statement 3, this ordering is
unique except for a possible over-all reversal to form [CBA]).
A1 Constructibility of parallelograms: Given any P, Q, and R, there exists S such that [PQRS],
and if P, Q, and R are distinct then S is unique.
A3 Lines parallel to the same line are parallel to one another: If [ABCD] and [ABEF], then
[CDEF].
L1 Spacetime is homogeneous and isotropic. No point has special properties that make it
distinguishable from other points, nor is one direction distinguishable from another.
L2 Inertial frames of reference exist. These are frames in which particles move at constant
velocity if not subject to any forces. We can construct such a frame by using a particular
particle, which is not subject to any forces, as a reference point.
[53] This is a form known as Playfair’s axiom, rather than the version of the postulate originally given by Euclid.
L3 Equivalence of inertial frames: If a frame is in constant-velocity translational motion
relative to an inertial frame, then it is also an inertial frame. No experiment can distinguish
one inertial frame from another.
L5 No simultaneity: The experimental evidence in section 1.2 shows that observers in different
inertial frames do not agree on the simultaneity of events.
Accelerations and gravitational fields are equivalent. There is no experiment that can
distinguish one from the other (page 23).
There is no way to associate a preferred tensor field with spacetime (page 127).